Tutorial Scripts
Below you will find links to a number of fully executable R scripts (written in roxygen comments) that walk through various aspects of R programming.
The R language
- Data structures:
- Object classes (Script and Tutorial), plus more information on:
- Looking at objects: Script and Tutorial
- Missing values: Script and Tutorial
Data as dataframes
- Loading/reading data into R: Script and Tutorial
  - Built-in data (data)
  - R data files (load)
  - Tabular data (read.csv, read.table, etc.)
  - Manual data entry and file scans (scan and readLines)
  - Reading foreign data (read.spss, read.dta, etc.)
- Saving R data: Script and Tutorial
  - save and load
  - dput and dget
  - dump and source
  - write.csv and write.table
- Dataframe rearrangement: Script and Tutorial
  - order
  - subset
  - split
  - sample
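As a rough illustration of the functions listed above, the following sketch writes a small data frame to disk in three formats, reads it back, and then rearranges it. The data frame and its column names are invented for this example, and temporary files are used so that nothing is left behind.

```r
# Hypothetical data, for illustration only
df <- data.frame(id = 1:6,
                 group = c("a", "b", "a", "b", "a", "b"),
                 score = c(3.2, 1.5, 4.8, 2.1, 0.9, 5.0),
                 stringsAsFactors = FALSE)

# Tabular text: write.csv and read.csv
csv_file <- tempfile(fileext = ".csv")
write.csv(df, csv_file, row.names = FALSE)
df_csv <- read.csv(csv_file, stringsAsFactors = FALSE)

# Native R format: save and load
# (load restores the object under its original name, here into a separate environment)
rda_file <- tempfile(fileext = ".RData")
save(df, file = rda_file)
env <- new.env()
load(rda_file, envir = env)

# Text representation of the object: dput and dget
txt_file <- tempfile(fileext = ".R")
dput(df, file = txt_file)
df_dget <- dget(txt_file)

# Rearrangement
sorted <- df[order(df$score, decreasing = TRUE), ]       # sort rows by score
high   <- subset(df, score > 2, select = c(id, score))   # filter rows, keep two columns
by_grp <- split(df, df$group)                            # list of data frames, one per group
set.seed(1)
shuffled <- df[sample(nrow(df)), ]                       # random row order
```

Loading into a separate environment is a design choice for the sketch; calling load(rda_file) directly would instead restore df into the current workspace, overwriting any object of the same name.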
Data processing
- Scale construction (additive, multiplicative, logical): Script and Tutorial
  - See also: Matrix algebra (Script and Tutorial)
  - Base R has many tools for scale construction, like factanal and princomp
  - Packages for advanced scaling include psych and irtoys. Details can be found on the Psychometrics Task View
Data summaries
- Univariate data summaries (including summary): Script and Tutorial
- Data tables and crosstables: Script and Tutorial
- Correlations and partial correlations: Script and Tutorial
- Printing numerics: Script and Tutorial
Plotting as data summary
- Data summary plots: Script and Tutorial
  - Rugs (Marginal distributions for scatterplots): Script and Tutorial
  - Local regression (LOWESS/LOESS) for scatterplots: Script and Tutorial
  - jitter for scatterplots of categorical data: Script and Tutorial
  - Add-ons: lines, segments (for error bars), polygon, points, abline, text, legend
- Plotting functions with curve: Script and Tutorial
- Graphical parameters: Script and Tutorial TODO
  - For full details on graphical parameters, see Graphical Parameters.
  - Saving plots:
    - In RGui, you can use point-and-click menus to save plots, but it is also possible to save plots using code. The appropriate function depends on the file format you want for the resulting plot. The main ones are: pdf, jpeg, png, tiff, bmp, and svg. PDF and PNG are good choices, though TIFF is often required for academic publishing.
    - If building a plot in stages (e.g., overlaying different model fits), it is also possible to save the plot at each stage. This can be useful for building plots to be used in slides (e.g., to control the display of the contents of a plot during the talk). Relevant functions here are: dev.print, dev.copy, dev2bitmap, and savePlot.
- Note: There are several other graphics packages (including ggplot2, lattice, and grid). My personal preference is to rely on the flexibility of base graphics, but these alternative approaches are preferred by some.
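The two approaches to saving plots with code described above can be sketched as follows. This is a minimal example under stated assumptions: the data are simulated, the file names are invented, and pdf(NULL) is used as a stand-in for an on-screen device so the sketch also runs in a non-interactive session.

```r
# Hypothetical data, for illustration only
set.seed(42)
x <- rnorm(50)
y <- 2 * x + rnorm(50)
fit <- lm(y ~ x)

# 1. Single plot: open a file device, draw, then close the device to write the file.
#    png(), jpeg(), tiff(), bmp(), and svg() follow the same open/draw/close pattern.
single_file <- file.path(tempdir(), "scatter.pdf")
pdf(single_file, width = 6, height = 4)
plot(x, y, main = "Scatterplot with OLS fit")
abline(fit, col = "red")
dev.off()

# 2. Staged plot: draw on one device and copy its contents to a file after each stage.
stage1_file <- file.path(tempdir(), "stage1.pdf")
stage2_file <- file.path(tempdir(), "stage2.pdf")

pdf(NULL)                            # stand-in for a screen device; writes no file
dev.control(displaylist = "enable")  # file devices do not record the plot by default
main_dev <- dev.cur()

plot(x, y, main = "Building the plot in stages")
dev.copy(pdf, file = stage1_file)    # the copy becomes the current device
dev.off()                            # closing the copy writes stage1_file
dev.set(main_dev)                    # return to the device being built up

abline(fit, col = "red")             # second stage: overlay the model fit
dev.copy(pdf, file = stage2_file)
dev.off()
dev.set(main_dev)
dev.off()                            # close the working device
```

In an interactive session with a plot window open, the pdf(NULL) and dev.control() lines are not needed, because screen devices record their contents by default.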
Statistics
- Basic parametric statistical tests: Script and Tutorial TODO
  - chisq.test
  - t.test
  - cor.test
  - prop.test
  - binom.test
- One-way ANOVA (aov, oneway.test, and kruskal.test): Script and Tutorial
- Nonparametric statistical tests (e.g., t.test versus wilcox_test)
- Variance tests: Script and Tutorial TODO
  - var.test
  - fligner.test
  - bartlett.test
  - ansari.test
- by and *apply
- Statistical distributions
Linear Regression (OLS)
- Model predicted values (fitted and predict)
- Goodness of fit and model comparison: Script and Tutorial
  - Regression tests (lmtest) TODO
- Regression as a curve of conditional mean outcomes: Script and Tutorial
- Two-stage least squares (instrumental variables) TODO
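To make the distinction between fitted and predict concrete, here is a small sketch using the built-in mtcars data. The choice of model (mpg regressed on wt) and the new weight values are assumptions made only for illustration.

```r
# Simple OLS model on a built-in dataset
fit <- lm(mpg ~ wt, data = mtcars)

# fitted() returns predicted values for the observations used to estimate the model
in_sample <- fitted(fit)

# predict() with no newdata returns the same in-sample values
same_as_fitted <- predict(fit)

# predict() with newdata returns predictions for new covariate values,
# optionally with confidence intervals
new_cars <- data.frame(wt = c(2.5, 3.5))   # hypothetical weights (in 1000 lbs)
out_sample <- predict(fit, newdata = new_cars, interval = "confidence")
```

The matrix returned in out_sample has one row per new observation and columns for the point prediction and the interval bounds.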
Regression plotting
- Plots for regression diagnostics: Script and Tutorial
  - Default plots from plot(lm)
  - Residual plots and qqplot
  - Scatterplots
- Plots for interaction effects
Generalized Linear Models
The tutorials below supply a basic introduction to many GLM techniques. A guide to all of the available packages and functions for GLMs can be found in the Econometrics Task View.
- Binary outcome models (and link functions): Script and Tutorial TODO -> bivariate and multivariate
- Ordered outcome models: Script and Tutorial
  - Estimation, predicted probabilities, and plots
- Multinomial outcome models: Script and Tutorial
  - Estimation, predicted probabilities, and plots
  - Multinomial logit is also available from the mlogit package
  - Multinomial probit is available in the MNP package
- Survival models from survival TODO
- Note: Gary King’s Zelig set of packages provides a slightly more unified interface for GLMs, but it is basically just a convenient wrapper for the functions described in the above tutorials.
Experiments
- ANOVA: Script and Tutorial
- Permutation tests: Script and Tutorial
- Power and minimum detectable effects: Script and Tutorial TODO
- Plotting means and effects
- Clustering
- Analysis of noncompliance/LATEs
Reproducible research
- Using source
- Using sink
- Integration with Microsoft Word
- knitr stitch
- Integration with LaTeX reports
  - knit
  - xtable (also Hmisc::latex, apsrtable, and stargazer)
- Presentations with beamer
- Web publishing with Rmarkdown
  - knitr
  - R2HTML
  - Slidify
Repeated tasks
- apply and *apply family
- loops (for, while)
- Split-Apply-Combine (by, split)
- Sampling/Bootstrapping/permutations (sample and replicate)
- Aggregation functions (ave, aggregate, etc.)
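The tools listed above often solve the same problem in different ways. The sketch below computes group means four ways and then bootstraps a standard error; the data frame is invented for the example, and the number of bootstrap replicates (1000) is an arbitrary choice.

```r
# Hypothetical data, for illustration only
df <- data.frame(group = rep(c("a", "b", "c"), each = 4),
                 value = c(1, 2, 3, 4, 10, 20, 30, 40, 5, 5, 5, 5),
                 stringsAsFactors = FALSE)

# Split-apply-combine: split the values by group, then apply mean to each piece
means_sapply <- sapply(split(df$value, df$group), mean)

# The same computation as an explicit for loop
means_loop <- numeric(0)
for (g in unique(df$group)) {
  means_loop[g] <- mean(df$value[df$group == g])
}

# aggregate returns the group means as a data frame
agg <- aggregate(value ~ group, data = df, FUN = mean)

# ave returns the group mean aligned with every original row
df$group_mean <- ave(df$value, df$group)

# Bootstrapping: resample with replacement and recompute the statistic many times
set.seed(123)
boot_means <- replicate(1000, mean(sample(df$value, replace = TRUE)))
boot_se <- sd(boot_means)   # bootstrap estimate of the standard error of the mean
```

The vectorized versions (sapply, aggregate, ave) are usually shorter than the loop, but the loop can be easier to read when each iteration involves several steps.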
User-Defined functions
- Variable scope and environments
- Return values (return and invisible)
- Custom classes
- Default arguments
- print and summary S3 methods
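Several of these topics fit into one short sketch: a function with a default argument that returns an object of a custom class, a print method for that class that returns its input invisibly, and a demonstration of local variable scope. The class name scale_summary and the function names are hypothetical, chosen only for this example.

```r
# Constructor with a default argument (na.rm = TRUE);
# "scale_summary" is a hypothetical class name
scale_summary <- function(x, na.rm = TRUE) {
  out <- list(mean = mean(x, na.rm = na.rm),
              sd   = sd(x, na.rm = na.rm),
              n    = sum(!is.na(x)))
  class(out) <- "scale_summary"
  out   # the value of the last expression is returned
}

# S3 print method: called automatically when an object of this class is printed
print.scale_summary <- function(x, ...) {
  cat("n =", x$n,
      " mean =", format(x$mean, digits = 3),
      " sd =", format(x$sd, digits = 3), "\n")
  invisible(x)   # return the object without printing it a second time
}

s <- scale_summary(c(1, 2, 3, NA))
print(s)

# Variable scope: assignment inside a function creates a local variable
counter <- 0
increment <- function() {
  counter <- counter + 1   # modifies a local copy, not the global counter
  return(counter)
}
result <- increment()
# result holds the incremented local value; the global counter is unchanged
```

Returning invisible(x) from a print method is a common convention because it lets the object be passed along after printing without being echoed again at the console.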
Over-time data
- Time-series (ts class)
- Panel data (plm)
- Mixed effects
- Multi-level models
Text processing
- String manipulation: Script and Tutorial
- Regular expressions: Script and Tutorial TODO
- Reading and writing to console, files, and connections
Other advanced topics
- File manipulation: list.files/dir, file.create, etc.
- System calls: shell, shell.exec, and system
- Bayes: MCMCpack, RJags, RBugs, RStan
- Big data: data.table, parallel computing
- Mapping
- Web services: twitteR, MTurkR