OLS in matrix form

The matrix representation of OLS is (X'X)-1(X'Y). Representing this in R is simple. Let's start with some made up data:

set.seed(1)
n <- 20
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
X <- cbind(x1, x2, x3)
y <- x1 + x2 + x3 + rnorm(n)

To transpose a matrix, we use the t function:

X
##             x1       x2      x3
##  [1,] -0.62645  0.91898 -0.1645
##  [2,]  0.18364  0.78214 -0.2534
##  [3,] -0.83563  0.07456  0.6970
##  [4,]  1.59528 -1.98935  0.5567
##  [5,]  0.32951  0.61983 -0.6888
##  [6,] -0.82047 -0.05613 -0.7075
##  [7,]  0.48743 -0.15580  0.3646
##  [8,]  0.73832 -1.47075  0.7685
##  [9,]  0.57578 -0.47815 -0.1123
## [10,] -0.30539  0.41794  0.8811
## [11,]  1.51178  1.35868  0.3981
## [12,]  0.38984 -0.10279 -0.6120
## [13,] -0.62124  0.38767  0.3411
## [14,] -2.21470 -0.05381 -1.1294
## [15,]  1.12493 -1.37706  1.4330
## [16,] -0.04493 -0.41499  1.9804
## [17,] -0.01619 -0.39429 -0.3672
## [18,]  0.94384 -0.05931 -1.0441
## [19,]  0.82122  1.10003  0.5697
## [20,]  0.59390  0.76318 -0.1351
t(X)
##       [,1]    [,2]     [,3]    [,4]    [,5]     [,6]    [,7]    [,8]
## x1 -0.6265  0.1836 -0.83563  1.5953  0.3295 -0.82047  0.4874  0.7383
## x2  0.9190  0.7821  0.07456 -1.9894  0.6198 -0.05613 -0.1558 -1.4708
## x3 -0.1645 -0.2534  0.69696  0.5567 -0.6888 -0.70750  0.3646  0.7685
##       [,9]   [,10]  [,11]   [,12]   [,13]    [,14]  [,15]    [,16]
## x1  0.5758 -0.3054 1.5118  0.3898 -0.6212 -2.21470  1.125 -0.04493
## x2 -0.4782  0.4179 1.3587 -0.1028  0.3877 -0.05381 -1.377 -0.41499
## x3 -0.1123  0.8811 0.3981 -0.6120  0.3411 -1.12936  1.433  1.98040
##       [,17]    [,18]  [,19]   [,20]
## x1 -0.01619  0.94384 0.8212  0.5939
## x2 -0.39429 -0.05931 1.1000  0.7632
## x3 -0.36722 -1.04413 0.5697 -0.1351

To multiply two matrices, we use the %*% matrix multiplication operator:

t(X) %*% X
##        x1     x2     x3
## x1 16.573 -3.314  4.711
## x2 -3.314 14.427 -3.825
## x3  4.711 -3.825 12.843

To invert a matrix, we use the solve function:

solve(t(X) %*% X)
##           x1       x2       x3
## x1  0.068634 0.009868 -0.02224
## x2  0.009868 0.076676  0.01922
## x3 -0.022236 0.019218  0.09175

Now let's put all of that together:

solve(t(X) %*% X) %*% t(X) %*% y
##      [,1]
## x1 0.7818
## x2 1.2857
## x3 1.4615

Now let's compare it to the lm function:

lm(y ~ x1 + x2 + x3)$coef
## (Intercept)          x1          x2          x3 
##     0.08633     0.76465     1.27869     1.44705

The numbers are close, but they're not quite right. The reason is that we forgot to include the intercept in our matrix calculation. If we use lm again but leave out the intercept, we'll see this is the case:

lm(y ~ 0 + x1 + x2 + x3)$coef
##     x1     x2     x3 
## 0.7818 1.2857 1.4615

To include the intercept in matrix form, we need to add a vector of 1's to the matrix:

X2 <- cbind(1, X)  #' this uses vector recycling

Now we redo our math:

solve(t(X2) %*% X2) %*% t(X2) %*% y
##       [,1]
##    0.08633
## x1 0.76465
## x2 1.27869
## x3 1.44705

And compare to our full model using lm:

lm(y ~ x1 + x2 + x3)$coef
## (Intercept)          x1          x2          x3 
##     0.08633     0.76465     1.27869     1.44705

The result is exactly what we would expect.