Regression
Regression Function
Let $Y$ be the response variable and $X$ be the predictor variable.
The regression function $r(x)$ is:
$$ r(x) = E(Y | X = x) = \int y f(y | x) dy $$
The goal of regression is to estimate $r(x)$ from the data $(X_1, Y_1), \dots, (X_n, Y_n)$.
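As a quick illustration of what $r(x) = E(Y \mid X = x)$ means, here is a minimal sketch (on hypothetical simulated data, with a true $r(x) = 2x$) that estimates the conditional mean at a point by averaging the $Y_i$ whose $X_i$ fall near $x$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: Y = 2X + noise, so the true regression
# function is r(x) = E(Y | X = x) = 2x.
n = 100_000
X = rng.uniform(0, 1, n)
Y = 2 * X + rng.normal(0, 0.5, n)

def r_hat(x, h=0.02):
    """Crude local average: mean of Y_i with |X_i - x| < h,
    estimating E(Y | X = x)."""
    mask = np.abs(X - x) < h
    return Y[mask].mean()

# Should be close to the true value r(0.5) = 1.0.
print(r_hat(0.5))
```

This is only a local-averaging sketch of the definition; the sections below estimate $r(x)$ parametrically instead.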
Linear Regression
OLS
R-Squared
Prediction
Let $x^\ast$ be a new observation.
For a simple linear regression model, the prediction is:
$$ \hat{Y}^\ast = \hat{\beta}_0 + \hat{\beta}_1 x^\ast $$
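A minimal sketch of this prediction on hypothetical simulated data (true model $Y = 1 + 3X + \varepsilon$), using the closed-form OLS estimates for simple linear regression:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: Y = 1 + 3X + noise.
n = 200
X = rng.uniform(0, 2, n)
Y = 1.0 + 3.0 * X + rng.normal(0, 0.3, n)

# Closed-form OLS estimates for simple linear regression.
x_bar, y_bar = X.mean(), Y.mean()
beta1_hat = np.sum((X - x_bar) * (Y - y_bar)) / np.sum((X - x_bar) ** 2)
beta0_hat = y_bar - beta1_hat * x_bar

# Point prediction at a new observation x*.
x_star = 1.5
y_star_hat = beta0_hat + beta1_hat * x_star
print(beta0_hat, beta1_hat, y_star_hat)
```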
Variance of Prediction
The variance of $\hat{Y}^\ast$ is the variance of a linear combination of the two estimators $\hat{\beta}_0$ and $\hat{\beta}_1$, which are correlated.
Using the properties of variance,
$$ \Var(\hat{Y}^\ast) = \Var(\hat{\beta}_0) + x^{\ast 2} \Var(\hat{\beta}_1) + 2 x^\ast \Cov(\hat{\beta}_0, \hat{\beta}_1) $$
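Substituting the standard simple-regression variance formulas ($\Var(\hat{\beta}_1) = \sigma^2/S_{xx}$, $\Var(\hat{\beta}_0) = \sigma^2(1/n + \bar{x}^2/S_{xx})$, $\Cov(\hat{\beta}_0, \hat{\beta}_1) = -\sigma^2 \bar{x}/S_{xx}$, with $S_{xx} = \sum_i (x_i - \bar{x})^2$), the sum above collapses to the compact form $\sigma^2\big(1/n + (x^\ast - \bar{x})^2/S_{xx}\big)$. The sketch below checks this algebra numerically on hypothetical values:

```python
import numpy as np

# Standard OLS variance formulas for simple linear regression:
#   Var(b1)      = s2 / Sxx
#   Var(b0)      = s2 * (1/n + xbar^2 / Sxx)
#   Cov(b0, b1)  = -s2 * xbar / Sxx
rng = np.random.default_rng(2)
X = rng.uniform(0, 2, 50)
n, x_bar = len(X), X.mean()
Sxx = np.sum((X - x_bar) ** 2)
s2 = 0.25          # assumed error variance sigma^2 (hypothetical)
x_star = 1.5

var_b1 = s2 / Sxx
var_b0 = s2 * (1 / n + x_bar**2 / Sxx)
cov_b0_b1 = -s2 * x_bar / Sxx

# Var(Yhat*) expanded as in the display above ...
var_sum = var_b0 + x_star**2 * var_b1 + 2 * x_star * cov_b0_b1
# ... equals the compact form s2 * (1/n + (x* - xbar)^2 / Sxx).
var_compact = s2 * (1 / n + (x_star - x_bar) ** 2 / Sxx)

print(np.isclose(var_sum, var_compact))
```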
Prediction Interval
In the point prediction above, the error term is omitted.
However, when we construct the prediction interval, we consider the variance of the error term as well:
$$ \hat{\mathcal{E}}^2 = \Var(\hat{Y}^\ast) + \Var(\varepsilon) = \Var(\hat{Y}^\ast) + \sigma^2 $$
The prediction interval is:
$$ \hat{Y}^\ast \pm z_{\alpha/2}\, \hat{\mathcal{E}} $$
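Putting the pieces together, a sketch of the full 95% prediction interval on hypothetical simulated data (true model $Y = 1 + 3X + \varepsilon$, with $\sigma^2$ estimated from the residuals and $z_{\alpha/2} \approx 1.96$):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data: Y = 1 + 3X + noise with sigma = 0.5.
n = 100
X = rng.uniform(0, 2, n)
Y = 1.0 + 3.0 * X + rng.normal(0, 0.5, n)

# OLS fit.
x_bar = X.mean()
Sxx = np.sum((X - x_bar) ** 2)
b1 = np.sum((X - x_bar) * (Y - Y.mean())) / Sxx
b0 = Y.mean() - b1 * x_bar

# Estimate sigma^2 from the residuals (n - 2 degrees of freedom).
resid = Y - (b0 + b1 * X)
sigma2_hat = np.sum(resid**2) / (n - 2)

# Variance of the fitted value at x*, then inflate by sigma^2
# because a new observation carries its own error epsilon.
x_star = 1.0
y_hat = b0 + b1 * x_star
var_fit = sigma2_hat * (1 / n + (x_star - x_bar) ** 2 / Sxx)
E_hat = np.sqrt(var_fit + sigma2_hat)

z = 1.96  # z_{alpha/2} for alpha = 0.05
lo, hi = y_hat - z * E_hat, y_hat + z * E_hat
print(lo, hi)
```

Note that $\hat{\mathcal{E}}$ is strictly larger than the standard error of the fitted value alone, which is why prediction intervals are wider than confidence intervals for $r(x^\ast)$.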