Regression

Table of contents
  1. Regression Function
  2. Linear Regression
  3. OLS
  4. R-Squared
  5. Prediction
    1. Variance of Prediction
    2. Prediction Interval

Regression Function

Let $Y$ be the response variable and $X$ be the predictor variable.

The regression function $r(x)$ is:

$$ r(x) = E(Y | X = x) = \int y f(y | x) dy $$

The goal of regression is to estimate $r(x)$ from the data $(X_1, Y_1), \dots, (X_n, Y_n)$.
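
As a concrete illustration (not part of the definition above), here is a minimal Python sketch that estimates $r(x)$ by local averaging on simulated data; the true function $\sin(x)$, the noise level, and the window half-width `h` are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: Y = r(X) + noise, with true r(x) = sin(x) (hypothetical)
n = 500
X = rng.uniform(0, 2 * np.pi, n)
Y = np.sin(X) + rng.normal(0, 0.3, n)

def r_hat(x, h=0.3):
    """Crude estimate of r(x) = E(Y | X = x): average the Y_i
    whose X_i fall within a window of half-width h around x."""
    mask = np.abs(X - x) < h
    return Y[mask].mean()

print(r_hat(np.pi / 2))  # should be close to sin(pi/2) = 1
```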


Linear Regression

See here


OLS

See here


R-Squared

See here


Prediction

Let $x^\ast$ be a new observation.

For a simple linear regression model, the prediction is:

$$ \hat{Y}^\ast = \hat{\beta}_0 + \hat{\beta}_1 x^\ast $$
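
A minimal sketch of this in Python, using the closed-form OLS estimates for simple linear regression; the simulated data and the choice of $x^\ast$ are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data from Y = 2 + 3 X + eps (parameters chosen for illustration)
n = 100
X = rng.uniform(0, 10, n)
Y = 2 + 3 * X + rng.normal(0, 1, n)

# Closed-form OLS estimates: beta1_hat = S_xy / S_xx, beta0_hat from the means
beta1_hat = np.cov(X, Y, ddof=1)[0, 1] / np.var(X, ddof=1)
beta0_hat = Y.mean() - beta1_hat * X.mean()

x_star = 5.0                                  # new observation
y_star_hat = beta0_hat + beta1_hat * x_star   # point prediction
print(y_star_hat)  # should be near 2 + 3 * 5 = 17
```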

Variance of Prediction

The variance of $\hat{Y}^\ast$ is the variance of a sum of two correlated estimators, $\hat{\beta}_0$ and $x^\ast \hat{\beta}_1$.

Using $\Var(U + V) = \Var(U) + \Var(V) + 2 \Cov(U, V)$ with $U = \hat{\beta}_0$ and $V = x^\ast \hat{\beta}_1$,

$$ \Var(\hat{Y}^\ast) = \Var(\hat{\beta}_0) + x^{\ast 2} \Var(\hat{\beta}_1) + 2 x^\ast \Cov(\hat{\beta}_0, \hat{\beta}_1) $$
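
A sketch of computing this, continuing the Python session above; it assumes the standard closed-form expressions for $\Var(\hat{\beta}_0)$, $\Var(\hat{\beta}_1)$, and $\Cov(\hat{\beta}_0, \hat{\beta}_1)$ in simple linear regression, with $\sigma^2$ replaced by the usual unbiased estimate (these formulas are not derived in these notes).

```python
# Continues the prediction sketch above (X, Y, beta0_hat, beta1_hat, x_star, n).
resid = Y - (beta0_hat + beta1_hat * X)
sigma2_hat = (resid ** 2).sum() / (n - 2)   # unbiased estimate of sigma^2

x_bar = X.mean()
S_xx = ((X - x_bar) ** 2).sum()

# Standard closed-form variances/covariance for simple linear regression
var_b1 = sigma2_hat / S_xx
var_b0 = sigma2_hat * (1 / n + x_bar ** 2 / S_xx)
cov_b0_b1 = -sigma2_hat * x_bar / S_xx

# Plug into the formula above
var_y_star = var_b0 + x_star ** 2 * var_b1 + 2 * x_star * cov_b0_b1
# Equivalent compact form: sigma2_hat * (1/n + (x_star - x_bar)**2 / S_xx)
print(var_y_star)
```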

Prediction Interval

In the point prediction above, the error term $\varepsilon$ is omitted.

However, when we construct the prediction interval, we consider the variance of the error term as well:

$$ \hat{\mathcal{E}}^2 = \Var(\hat{Y}^\ast) + \Var(\varepsilon) = \Var(\hat{Y}^\ast) + \sigma^2 $$

The prediction interval is:

$$ \hat{Y}^\ast \pm z_{\alpha/2}\, \hat{\mathcal{E}} $$
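
Continuing the same session, a sketch of the interval itself; $\sigma^2$ is replaced by its estimate `sigma2_hat`, and the 95% level ($\alpha = 0.05$) is a hypothetical choice.

```python
from scipy.stats import norm

# Total predictive variance: estimation error plus irreducible noise
eps2_hat = var_y_star + sigma2_hat

alpha = 0.05
z = norm.ppf(1 - alpha / 2)   # ~ 1.96 for a 95% interval
half_width = z * np.sqrt(eps2_hat)

print(y_star_hat - half_width, y_star_hat + half_width)
```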