How Can I Get Started Using R?

18 Jul 2015

I received an email this week that included the following:

Your posts, along with those of numerous others, have convinced me that I should give R a shot. I am hoping that you would be so kind as to suggest some essential packages for political science research.

I appreciate the email and I’m thrilled to see another person interested in jumping into R. Here’s what I’d suggest: don’t think in terms of specific packages. Instead, think in terms of problems you have that you want to solve.

As some autobiographical background for this advice, I came to R through my PhD coursework at Northwestern University. I had learned SPSS during undergrad and then learned Stata at Northwestern. I had a brief - actually, let’s say rough - introduction to R during the last course of the political science methods sequence, and then I forgot about it for about a year. But then, I had a problem I needed to solve. It was 2011 and I wanted to use Amazon Mechanical Turk for a research project where I contacted survey respondents at multiple points in time. MTurk allowed this, but there wasn’t an easy way to do it through the web interface.

I learned, however, that MTurk had a thing called an “API”, a term I had never heard before. I figured out that an API was an “Application Programming Interface” - a structured way of accessing a software application using programming rather than a graphical interface. And it turned out that I could access the MTurk API through R; I just had to write an R package to do it. So I came to R because I had a problem I needed to solve, and while the solution was feasible within R, it wasn’t feasible in the other tools I already knew (SPSS and Stata). Of course, one could access MTurk through other languages (Java, Python, JavaScript, Perl, Ruby, etc.), but I didn’t really know any of those. And aside from my ignorance of those languages, R was extremely well-suited to the rest of the science I wanted to do with the data I was retrieving from MTurk (namely, statistical analyses of the resulting data). It was natural to make the whole process work in R rather than use Python for accessing MTurk and then move to Stata to do the analysis.

For me, then, R was something that made sense because it was powerful, it solved a problem I had that didn’t have an easy alternative, and it helped structure my broader scientific workflow.

My advice is therefore that the best way to dive into R is to identify a problem you already have (or think you’ll soon have) and figure out how R can help you solve it. There are a handful of problems that scientists, including political scientists, regularly have that R can help them solve quite easily:

There are thousands of R packages on CRAN and hundreds more on GitHub. Recommending which ones you should try is pretty challenging unless you have a specific problem you want to solve. For scientists, some of the above scenarios are pretty standard problems, and R has great facilities for solving them. But maybe you have a problem that hasn’t been solved in R (like my need for a programmatic interface to MTurk). That’s a “problem” that’s often not really a problem: any time there’s a challenge that hasn’t been addressed yet, there’s a great opportunity to take a deep dive into learning R and using it to solve that problem.




Except where noted, this website is licensed under a Creative Commons Attribution 4.0 International License. Views expressed are solely my own, not those of any current, past, or future employer.