## 11.1 OLS estimation

In order to show how multiple linear regression is estimated, we need to present it in a more compact form. To do that, we introduce the following vectors:

$$$\mathbf{x}'_j = \begin{pmatrix}1 & x_{1,j} & \dots & x_{k-1,j} \end{pmatrix}, \quad \boldsymbol{\beta} = \begin{pmatrix}\beta_0 \\ \beta_{1} \\ \vdots \\ \beta_{k-1} \end{pmatrix} , \tag{11.4}$$$

where the $$'$$ symbol denotes transposition. These can then be substituted in (11.1) to get:

$$$y_j = \mathbf{x}'_j \boldsymbol{\beta} + \epsilon_j . \tag{11.5}$$$

We can make this even more compact if we pack all the values with index $$j$$ into vectors and matrices:

$$$\mathbf{X} = \begin{pmatrix} \mathbf{x}'_1 \\ \mathbf{x}'_2 \\ \vdots \\ \mathbf{x}'_n \end{pmatrix} = \begin{pmatrix} 1 & x_{1,1} & \dots & x_{k-1,1} \\ 1 & x_{1,2} & \dots & x_{k-1,2} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{1,n} & \dots & x_{k-1,n} \end{pmatrix}, \quad \mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad \boldsymbol{\epsilon} = \begin{pmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{pmatrix} , \tag{11.6}$$$

where $$n$$ is the sample size. This leads to the following compact form of multiple linear regression:

$$$\mathbf{y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\epsilon} . \tag{11.7}$$$

Now that we have this compact form, we can estimate the model using linear algebra. Many statistical textbooks explain how the following result is obtained (it involves taking the derivative of SSE (10.4) with respect to $$\boldsymbol{\beta}$$ and equating it to zero):

$$$\hat{\boldsymbol{\beta}} = \mathbf{b} = \left(\mathbf{X}' \mathbf{X}\right)^{-1} \mathbf{X}' \mathbf{y} . \tag{11.8}$$$

Formula (11.8) is what statistical software computes for OLS, including the lm() function from the stats package for R, although implementations typically rely on numerically more stable decompositions (lm() uses QR) rather than inverting $$\mathbf{X}' \mathbf{X}$$ directly. Here is an example with the same mtcars dataset:

mtcarsModel01 <- lm(mpg~cyl+disp+hp+drat+wt+qsec+gear+carb, mtcars)
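To make the link between (11.8) and lm() concrete, the estimate can be reproduced by hand in base R. The following is only a minimal sketch: the object names (`X`, `y`, `b`, `lmCoefficients`) are introduced here for illustration, and the code solves the normal equations $$\mathbf{X}' \mathbf{X} \mathbf{b} = \mathbf{X}' \mathbf{y}$$ instead of forming the inverse explicitly, which is algebraically equivalent to (11.8).

```r
# Response vector and design matrix; model.matrix() adds the column of ones
y <- mtcars$mpg
X <- model.matrix(mpg ~ cyl + disp + hp + drat + wt + qsec + gear + carb,
                  data = mtcars)

# Solve (X'X) b = X'y, which gives the same b as formula (11.8)
b <- solve(crossprod(X), crossprod(X, y))

# Coefficients estimated by lm() for the same formula
lmCoefficients <- coef(lm(mpg ~ cyl + disp + hp + drat + wt + qsec + gear + carb,
                          data = mtcars))

# Put the two sets of estimates side by side
cbind(manual = as.numeric(b), lm = as.numeric(lmCoefficients))

# Check that they agree up to a small numerical tolerance
all.equal(as.numeric(b), as.numeric(lmCoefficients), tolerance = 1e-6)
```

The two columns should agree up to numerical error. Small differences are possible because lm() uses a QR decomposition internally rather than the normal equations.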

The simplest plot that we can produce from this model is fitted values vs actuals, with $$\hat{y}_j$$ on the x-axis and $$y_j$$ on the y-axis:

library(greybox)  # provides actuals() and alm()
plot(fitted(mtcarsModel01), actuals(mtcarsModel01))

The same plot is produced by the plot() method if we use the alm() function from the greybox package instead:

mtcarsModel02 <- alm(mpg~cyl+disp+hp+drat+wt+qsec+gear+carb, mtcars, loss="MSE")
plot(mtcarsModel02,1)

We use loss="MSE" in this case to make sure that the model is estimated via OLS. The default estimation method in alm(), likelihood, is discussed in Section 16.

The plot in Figure 11.2 can be used for diagnostic purposes. In an ideal situation, the red line (the LOWESS line) should coincide with the grey one, which would mean that we have correctly captured the tendencies in the data, so that all the regression assumptions are satisfied (see Section 15).
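For readers without greybox installed, a similar diagnostic plot can be sketched in base R. This is an approximation under stated assumptions, not the code behind Figure 11.2: it assumes that the grey line is the 45-degree line on which fitted values equal actuals, and the object names (`mtcarsModel`, `fittedValues`, `actualValues`, `lowessLine`) are introduced here for illustration only.

```r
# Fit the same model with lm() so that only base R is needed
mtcarsModel <- lm(mpg ~ cyl + disp + hp + drat + wt + qsec + gear + carb,
                  data = mtcars)
fittedValues <- fitted(mtcarsModel)
actualValues <- mtcars$mpg

# Fitted values on the x-axis, actuals on the y-axis
plot(fittedValues, actualValues, xlab = "Fitted", ylab = "Actuals")

# Grey reference line (assumed here to be the 45-degree line)
abline(a = 0, b = 1, col = "grey")

# Red LOWESS line: a local smoother of actuals over fitted values
lowessLine <- lowess(fittedValues, actualValues)
lines(lowessLine, col = "red")
```

If the red line bends away from the grey one over some range of fitted values, this hints that the model systematically over- or under-predicts there.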