Estimators / Bias / Consistency
Estimators
Let $X_1, \dots, X_n$ be IID samples from a population with unknown parameter $\theta$.
An estimator of $\theta$ is a random variable:
$$ \hat{\theta}_n = g(X_1, \dots, X_n) $$
where $g$ is a function of the samples.
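As a minimal sketch in Python (the normal population and the value $\theta = 2$ below are illustrative assumptions, not part of the definition), here the sample mean plays the role of $g$:

```python
import numpy as np

rng = np.random.default_rng(0)

# n IID samples from a population whose parameter theta is normally unknown;
# the normal population and theta = 2.0 are illustrative assumptions.
theta = 2.0
X = rng.normal(loc=theta, scale=1.0, size=100)

# An estimator theta_hat_n = g(X_1, ..., X_n); here g is the sample mean.
theta_hat = X.mean()
print(theta_hat)  # one realization of the random variable theta_hat_n
```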
Sampling Distribution
The distribution of the estimator $\hat{\theta}_n$ is called the sampling distribution.
Standard Error
The standard deviation of the sampling distribution is called the standard error.
$$ \text{SE}(\hat{\theta}_n) = \sqrt{\Var(\hat{\theta}_n)} $$
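A simulation sketch of both ideas, assuming a normal population with $\theta = 2$ and $\sigma = 1$, so the exact standard error of the sample mean is $\sigma / \sqrt{n} = 0.1$:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n, n_reps = 2.0, 100, 10_000

# Approximate the sampling distribution of the sample mean by
# recomputing theta_hat on many independent datasets of size n.
theta_hats = rng.normal(loc=theta, scale=1.0, size=(n_reps, n)).mean(axis=1)

# The standard deviation of the sampling distribution is the standard error;
# for the sample mean it is sigma / sqrt(n) = 0.1 here.
print(theta_hats.std(ddof=1))  # ~0.1
```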
Bias
Let $\hat{\theta}$ be an estimator of $\theta$.
The bias of an estimator is defined as:
$$ \text{bias}(\hat{\theta}_n) = \text{E}_\theta[\hat{\theta}_n] - \theta $$
Here $\text{E}_\theta$ denotes expectation with respect to the joint distribution $f(x_1, \dots, x_n; \theta)$ of the samples, not with respect to a distribution over $\theta$.
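A sketch of a biased estimator (the $\text{Uniform}(0, \theta)$ population is an illustrative assumption): for $X_i \sim \text{Uniform}(0, \theta)$, the maximum $\hat{\theta}_n = \max_i X_i$ has $\text{E}_\theta[\hat{\theta}_n] = \frac{n}{n+1}\theta$, so its bias is $-\theta/(n+1)$:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n, n_reps = 1.0, 20, 100_000

# theta_hat = max(X_i) for X_i ~ Uniform(0, theta) always undershoots theta:
# E[theta_hat] = n/(n+1) * theta, so bias = -theta/(n+1) = -1/21 here.
theta_hats = rng.uniform(0.0, theta, size=(n_reps, n)).max(axis=1)

print(theta_hats.mean() - theta)  # ≈ -0.0476
```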
Unbiased Estimator
We say that an estimator $\hat{\theta}_n$ is unbiased if:
$$ \text{bias}(\hat{\theta}_n) = 0 \iff \text{E}_\theta[\hat{\theta}_n] = \theta $$
The expected value of an unbiased estimator is equal to the true parameter value.
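Continuing the $\text{Uniform}(0, \theta)$ sketch above, rescaling the maximum by $(n+1)/n$ removes the bias:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n, n_reps = 1.0, 20, 100_000

# (n+1)/n * max(X_i) is unbiased for theta in the Uniform(0, theta) model:
# E[(n+1)/n * max(X_i)] = theta.
theta_hats = (n + 1) / n * rng.uniform(0.0, theta, size=(n_reps, n)).max(axis=1)

print(theta_hats.mean())  # ≈ 1.0 = theta
```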
Consistent Estimator
We say that an estimator $\hat{\theta}_n$ is consistent if:
$$ \hat{\theta}_n \xrightarrow{P} \theta \iff \plim_{n \to \infty} \hat{\theta}_n = \theta $$
The estimator converges in probability to the true parameter value.
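A simulation sketch of consistency for the sample mean, again assuming a normal population with $\theta = 2$ and $\sigma = 1$; the probability of missing $\theta$ by more than $\varepsilon$ shrinks toward zero as $n$ grows:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, eps, n_reps = 2.0, 0.05, 1_000

# P(|theta_hat_n - theta| > eps) should shrink toward 0 as n grows.
for n in (10, 100, 1_000, 10_000):
    theta_hats = rng.normal(theta, 1.0, size=(n_reps, n)).mean(axis=1)
    print(n, np.mean(np.abs(theta_hats - theta) > eps))
```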
Bias vs Consistency
Being an unbiased estimator does not imply consistency: for example, $\hat{\theta}_n = X_1$, which ignores all but the first sample, is an unbiased estimator of the population mean but never converges to it.
However, if both the bias and the standard error vanish as $n \to \infty$, the estimator is consistent:
$$ \text{bias} \rightarrow 0 \wedge \text{se} \rightarrow 0 \implies \hat{\theta}_n \xrightarrow{P} \theta $$
On the other hand, a biased estimator can be consistent.
The uncorrected sample variance
$$ S_n^2 = \frac{1}{n} \sum_{i=1}^n (X_i - \overline{X}_n)^2 $$
is one example of a biased estimator that is consistent.
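A simulation sketch (a normal population with $\sigma^2 = 4$ is assumed for illustration): the mean of $S_n^2$ sits below $\sigma^2$ by the factor $(n-1)/n$, yet both the bias and the spread vanish as $n$ grows:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2, n_reps = 4.0, 2_000

# S_n^2 (ddof=0 divides by n) has E[S_n^2] = (n-1)/n * sigma^2, so it is
# biased, but its mean -> sigma^2 and its spread -> 0: consistency.
for n in (5, 50, 500):
    X = rng.normal(0.0, np.sqrt(sigma2), size=(n_reps, n))
    s2 = X.var(axis=1)
    print(n, round(s2.mean(), 3), round(s2.std(), 3))
```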
Mean Squared Error (MSE) of an Estimator
When we build an estimator $\hat{\theta}_n$, we can choose it to minimize the mean of the squared difference from the true parameter $\theta$.
This mean squared error (MSE) is a standard measure of the performance of an estimator.
MSE of an estimator $\hat{\theta}_n$ is defined as:
$$ \begin{align} \text{MSE}(\hat{\theta}_n) &= \text{E}_\theta[(\hat{\theta}_n - \theta)^2] \\[1em] &= \text{bias}^2(\hat{\theta}_n) + \Var(\hat{\theta}_n) \end{align} $$
The second equality, the bias-variance decomposition of the MSE, follows by adding and subtracting $\text{E}_\theta[\hat{\theta}_n]$:
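$$ \begin{align} \text{E}_\theta[(\hat{\theta}_n - \theta)^2] &= \text{E}_\theta\left[\left((\hat{\theta}_n - \text{E}_\theta[\hat{\theta}_n]) + (\text{E}_\theta[\hat{\theta}_n] - \theta)\right)^2\right] \\[1em] &= \text{E}_\theta\left[(\hat{\theta}_n - \text{E}_\theta[\hat{\theta}_n])^2\right] + 2\,(\text{E}_\theta[\hat{\theta}_n] - \theta)\,\text{E}_\theta\left[\hat{\theta}_n - \text{E}_\theta[\hat{\theta}_n]\right] + (\text{E}_\theta[\hat{\theta}_n] - \theta)^2 \\[1em] &= \Var(\hat{\theta}_n) + 0 + \text{bias}^2(\hat{\theta}_n) \end{align} $$
The cross term vanishes because $\text{E}_\theta[\hat{\theta}_n] - \theta$ is a constant and $\text{E}_\theta\left[\hat{\theta}_n - \text{E}_\theta[\hat{\theta}_n]\right] = 0$.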
Asymptotically Normal Estimator
An estimator $\hat{\theta}_n$ is asymptotically normal if:
$$ \frac{\hat{\theta}_n - \theta}{\text{SE}(\hat{\theta}_n)} \leadsto N(0,1) $$
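A simulation sketch with a deliberately skewed population (an $\text{Exponential}(1)$ population is an illustrative assumption, giving $\theta = 1$ and exact $\text{SE} = 1/\sqrt{n}$): the standardized sample mean behaves like a standard normal.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_reps = 1_000, 10_000

# Sample mean of Exponential(1) draws: theta = 1, SE = 1/sqrt(n).
theta_hats = rng.exponential(scale=1.0, size=(n_reps, n)).mean(axis=1)
z = (theta_hats - 1.0) / (1.0 / np.sqrt(n))

# If z ~ N(0,1), then P(z <= 1.96) ≈ 0.975 and std(z) ≈ 1.
print(np.mean(z <= 1.96), z.std())
```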