Working with objects in R becomes tedious if we don't give those objects names that we can refer to in subsequent analysis.
In R, we can “assign” an object a name and then reference it by that name later.
For example, rather than just seeing the result of the expression 2+2, we can store the result of this expression and look at it later:
a <- 2 + 2
To see the value of the result, we simply call our variable's name:
a
## [1] 4
Thus <- (a less-than sign and a minus sign together) means: assign the value of the right-hand side to the name on the left-hand side.
We can get the same result using = (an equal sign):
a = 2 + 2
a
## [1] 4
We can also produce the same result by reversing the order of the statement and using a different symbol, the right-pointing arrow ->:
2 + 2 -> a
a
## [1] 4
This is very uncommon, though. The <- is the preferred assignment operator.
When we assign an expression to a variable name, the result of the evaluated expression is saved.
Thus, when we call a again later, we don't see 2+2 but instead see 4.
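A minimal sketch of this point, using the hypothetical names base_value and total (they are not part of the examples above): because only the evaluated result is stored, changing an input afterward leaves the stored value untouched.

```r
# Hypothetical names, chosen only for illustration
base_value <- 2
total <- base_value + 2   # the expression is evaluated now, and 4 is stored

base_value <- 10          # changing the input later...
total                     # ...does not change the stored result
## [1] 4
```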
We can overwrite the value stored in a variable by simply assigning something new to that variable:
a <- 2 + 2
a <- 3
a
## [1] 3
We can also copy a variable into a different name:
b <- a
b
## [1] 3
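The copy behaves as an independent value, not as a link back to the original name. A small sketch, with the made-up names original and duplicate so that a and b above are left alone:

```r
# Hypothetical names, chosen only for illustration
original <- 3
duplicate <- original   # duplicate now holds the value 3
original <- 100         # overwriting the original...
duplicate               # ...leaves the copy as it was
## [1] 3
```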
We may decide we don't need a variable any more. We can remove that variable from the R environment using rm:
rm(a)
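One way to confirm that a removal worked is the base R function exists, which reports whether a name can currently be found. The sketch below uses a throwaway variable called scratch_value, a name assumed here only for illustration:

```r
scratch_value <- 42        # hypothetical throwaway variable
exists("scratch_value")    # the name is passed as a character string
## [1] TRUE
rm(scratch_value)
exists("scratch_value")
## [1] FALSE
```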
Sometimes we forget what we've done and want to see what variables we have floating around in our R environment. We can see them with ls:
ls()
## [1] "a1" "a2" "allout" "amat"
## [5] "b" "b1" "betas" "between"
## [9] "bin" "bmat" "bootcoefs" "c"
## [13] "c1" "c2" "c3" "change"
## [17] "ci67" "ci95" "ci99" "cmat"
## [21] "coef.mi" "coefs.amelia" "condmeans_x" "condmeans_x2"
## [25] "condmeans_y" "cumprobs" "d" "d1"
## [29] "d2" "d3" "d4" "d5"
## [33] "df1" "df2" "dist" "e"
## [37] "e1" "e2" "e3" "e4"
## [41] "e5" "englebert" "f" "fit1"
## [45] "fit2" "fit3" "FUN" "g"
## [49] "g1" "g2" "grandm" "grandse"
## [53] "grandvar" "h" "height" "i"
## [57] "imp" "imp.amelia" "imp.mi" "imp.mice"
## [61] "lm" "lm.amelia.out" "lm.mi.out" "lm.mice.out"
## [65] "lm1" "lm2" "lmfit" "lmp"
## [69] "localfit" "localp" "logodds" "logodds_lower"
## [73] "logodds_se" "logodds_upper" "m" "m1"
## [77] "m2" "m2a" "m2b" "m3a"
## [81] "m3b" "me" "me_se" "means"
## [85] "mmdemo" "model1" "model2" "myboot"
## [89] "mydf" "mydf2" "myformula" "myttest"
## [93] "myttest2" "myttest3" "n" "n1"
## [97] "n2" "n3" "new1" "newdata"
## [101] "newdata1" "newdata2" "newdf" "newvar"
## [105] "nx" "ologit" "ols" "ols1"
## [109] "ols2" "ols3" "ols3b" "ols4"
## [113] "ols5" "ols5a" "ols5b" "ols6"
## [117] "ols6a" "ols6b" "oprobit" "oprobprobs"
## [121] "out" "p" "p1" "p2"
## [125] "p2a" "p2b" "p3a" "p3b"
## [129] "p3b.fitted" "part1" "part2" "plogclass"
## [133] "plogprobs" "pool.mice" "ppcurve" "pred1"
## [137] "s" "s.amelia" "s.mi" "s.mice"
## [141] "s.orig" "s.real" "s1" "s2"
## [145] "s3" "search" "ses" "ses.amelia"
## [149] "sigma" "slope" "slopes" "sm1"
## [153] "sm2" "smydf" "sx" "sy"
## [157] "tmp1" "tmp2" "tmp3" "tmp4"
## [161] "tmpdata" "tmpdf" "tmpsplit" "tmpx"
## [165] "tmpz" "tr" "val" "valcol"
## [169] "w" "weight" "within" "x"
## [173] "X" "x1" "x1cut" "x1t"
## [177] "x2" "X2" "x2t" "x3"
## [181] "x4" "x5" "x6" "xseq"
## [185] "y" "y1" "y1s" "y2"
## [189] "y2s" "y3" "y3s" "y4"
## [193] "y5" "y6" "yt" "z"
## [197] "z1" "z2" "z5" "z6"
This returns a character vector containing all of the names for all named objects currently in our R environment. It is also possible to remove ALL variables in our current R session. You can do that with the following:
# rm(list=ls())
Note: This is usually also available as an option in the RGui dropdown menus, and it should only be done if you really want to remove everything.
Sometimes you will also see an expression like:
b <- NULL
This expression does not remove the object, but instead makes its value NULL. NULL is different from missing (NA) because R (generally) ignores a NULL value whenever it sees it. You can see this in the difference between the following two vectors:
c(1, 2, NULL)
## [1] 1 2
c(1, 2, NA)
## [1] 1 2 NA
The first has two elements and the second has three.
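The same difference can be checked with length. The sketch below also illustrates the earlier point that assigning NULL keeps the name in the environment. The variable placeholder is a hypothetical name used only here.

```r
length(c(1, 2, NULL))   # NULL contributes no element
## [1] 2
length(c(1, 2, NA))     # NA occupies a position
## [1] 3

# Assigning NULL does not remove the name; only rm() does that
placeholder <- NULL
exists("placeholder")
## [1] TRUE
is.null(placeholder)
## [1] TRUE
```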
It is also possible to use the assign function to assign a value to a name:
assign("x", 3)
x
## [1] 3
This is not common in interactive use of R but can be helpful at more advanced levels.
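As a rough illustration of where assign can help, the sketch below builds variable names from strings inside a loop. The score_ prefix and the values are assumptions made up for this example; get is the base R counterpart that looks a value up by its name.

```r
# Create score_1, score_2, score_3 with names built by paste0()
for (i in 1:3) {
  assign(paste0("score_", i), i * 10)
}
score_2
## [1] 20
get("score_3")   # look the object up by a name given as a string
## [1] 30
```

In many cases a single list or vector is easier to work with than a set of numbered variables, so this pattern is best kept for situations where the names really must be generated.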
R has some relatively simple rules governing how objects can be named:
(1) R object names are case sensitive, so a is not the same as A. This applies to objects and functions.
(2) R object names (generally) must start with a letter or a period. A name that starts with a period cannot have a digit as its second character.
(3) R object names can contain letters, numbers, periods (.), and underscores (_).
(4) The names of R objects can be just about any length, but anything over about 10 characters gets annoying to type.
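One way to check a candidate name against these rules is to compare it with the output of the base R function make.names, which converts any string into a syntactically valid name. The helper is_valid_name below is a hypothetical convenience written for this sketch, not a built-in function.

```r
# Hypothetical helper: TRUE when a name can be used without backticks,
# i.e. when make.names() would leave it unchanged
is_valid_name <- function(name) {
  name == make.names(name)
}

is_valid_name(c("a", "my_var", ".hidden", "1f", "_x", "my var"))
## [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE
```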
CAUTION: We can violate some of these restrictions by naming things with backticks, but this can be confusing:
f <- 2
f
## [1] 2
`f` <- 3
f
## [1] 3
That makes sense, since f and `f` refer to the same object. Backticks also allow us to create names that start with a number. To refer to objects with these noncompliant names, we need to use the backticks:
`1f` <- 3
# Then try typing `1f` (with the backticks)
If we just called 1f, we would get an error. But this also means we can name objects with just a number as a name:
`4` <- 5
4
## [1] 4
# Then try typing `4` (with the backticks)
Which is kind of weird. It is best avoided.
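To make the source of confusion concrete, the sketch below places the literal number and the backticked object side by side, then removes the object again. Passing the name to rm as a string through its list argument is one way to do that.

```r
`4` <- 5
4                # the literal number four
## [1] 4
`4`              # the object whose name is "4"
## [1] 5
`4` + 4          # easy to misread
## [1] 9
rm(list = "4")   # clean up the confusing name
```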