Here are a few of the R packages I’ve developed on my own or as part of the rOpenSci, rOpenGov, and cloudyr open source projects. My CV lists all of my software projects and you can find source code for all projects on GitHub. The projects below are a few of the more prominent examples.
cloudyr
The cloudyr project is an effort to connect R to cloud computing applications, starting with Amazon Web Services and continuous integration services for R package development. See the cloudyr website or GitHub page for more details.
Dataverse packages: dataverse (version >= 4.0) and dvn (version < 4.0)
These R packages provide access to The Dataverse Network APIs. dataverse is the current generation package, providing access to the complete functionality of current Dataverse installations. It is part of the rOpenSci project and is officially supported by Harvard-IQSS, the developers of Dataverse. dvn is a legacy package supporting earlier Dataverse installations.
dvn: CRAN, GitHub
ghit: Lightweight GitHub Package Installer
ghit is a lightweight, vectorized drop-in replacement for devtools::install_github() that uses native git and R methods to clone and install a package from GitHub. It provides a lighter-weight alternative to devtools with a very similar API, slightly different defaults, and completely rebuilt internals.
prediction, margins: An R port of Stata’s margins command
prediction and margins together form an R port of Stata’s margins command. prediction provides tidy, type-safe predictions from model objects and Stata-style predictive margins. margins can calculate (average) marginal effects and their variances from regression models. The latter is especially helpful for models with power terms, non-linear transformations, and interaction terms, and for generalized linear models.
prediction: CRAN, GitHub
margins: CRAN, GitHub
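As a rough illustration of how the two packages fit together, the sketch below fits a linear model with an interaction term and then asks for average marginal effects and fitted values. The model and the use of the built-in mtcars data are assumptions chosen for the example, not something taken from the packages' own documentation.

```r
library(margins)
library(prediction)

# Illustrative model (an assumption for this sketch): fuel economy as a
# function of weight, horsepower, and their interaction
fit <- lm(mpg ~ wt * hp, data = mtcars)

# Average marginal effects of wt and hp, accounting for the interaction
ame <- margins(fit)
ame_table <- summary(ame)
print(ame_table)

# Predictions returned as a data frame, with the fitted values in a
# column alongside the original data
pred <- prediction(fit)
head(pred)
```

With an interaction in the model, the coefficient on wt alone is not the effect of weight; the average marginal effect reported by margins() is usually the more interpretable quantity.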
rio: A Swiss-army knife for data I/O
The aim of rio is to make data file I/O in R as easy as possible. Its import and export functions rely on file extensions to make a (reasonable) assumption about how to read a file into a data.frame or, conversely, save a data.frame to disk. It greatly simplifies data import and export and offers a function for easily converting between file formats (possibly from the command line).
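A minimal sketch of the extension-driven workflow, assuming the built-in mtcars data and temporary file paths chosen purely for illustration:

```r
library(rio)

# Write a built-in dataset to a temporary file; the ".csv" extension
# tells export() which format to use
csv_path <- file.path(tempdir(), "mtcars.csv")
export(mtcars, csv_path)

# Read it back; again the extension selects the reader
cars <- import(csv_path)

# Convert directly between formats without an explicit import step
tsv_path <- file.path(tempdir(), "mtcars.tsv")
convert(csv_path, tsv_path)
cars_tsv <- import(tsv_path)
```

The same three calls work for other supported formats by changing only the file extension, which is the main design idea behind the package.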
rite: The Right Editor to Write R
rite is a simple, powerful, multi-platform script editor for R, built with Tcl/Tk. It provides features typically found in standalone editors and IDEs (e.g., syntax highlighting, command completion, shortcut keys, find and go-to-line commands, and one-click access to documentation) and a helpful color-coded output “sink”. It also creates an easy workflow for reproducible research through integration with the knitr package.
tabulizer: Bindings for Tabula PDF Table Extractor Library
tabulizer provides R bindings to the Tabula Java library, which extracts tables from PDF documents using a small set of powerful and accurate algorithms. tabulizer is a thin client around Tabula and includes a handy interactive mode for identifying tables in PDFs directly within an R graphics window.
UNF: Tools for Creating Universal Numeric Fingerprints for Data
UNF is an R package for generating variable- and dataset-level universal numeric fingerprint signatures. UNF signatures provide a way to uniquely and persistently identify (a version of) a dataset. The UNF algorithm was created by Micah Altman; the current package, which I maintain, has been updated to version 5 of the algorithm. The UNF package also provides UNF-based functions to identify discrepancies between data frames and works well with the dvn package, listed above, for comparing Dataverse-stored datasets against local copies.
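The idea of fingerprinting a dataset can be sketched as follows. This assumes the package's unf() function with its default settings and uses the built-in mtcars data; the altered copy is a made-up example of a discrepancy.

```r
library(UNF)

# Fingerprint a dataset and an unmodified copy of it
original <- mtcars
copy <- mtcars
u_original <- unf(original)
u_copy <- unf(copy)

# Change a single value and fingerprint again
changed <- mtcars
changed$mpg[1] <- changed$mpg[1] + 1
u_changed <- unf(changed)

# Identical data should give identical signatures; altered data should not
same_as_copy <- identical(u_original$unf, u_copy$unf)
same_as_changed <- identical(u_original$unf, u_changed$unf)
print(c(copy = same_as_copy, changed = same_as_changed))
```

Comparing signatures this way checks whether two copies of a dataset agree without comparing them cell by cell.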
Data visualization packages
ggparliament is a ggplot2-based tool for visualizing the composition of parliaments using an array of different plot styles, including the “parliament charts” commonly seen on Wikipedia and arc and bar diagrams used in The Economist.
sparktex is an R companion to the LaTeX sparklines package (by Andreas Loeffler and Dan Luecking), which produces Edward Tufte-inspired sparklines and sparkspikes (in-text histograms) natively in LaTeX.
Except where noted, this website is licensed under a Creative Commons Attribution 4.0 International License.