Central Limit Theorem / Law of Large Numbers
The Law of Large Numbers
The weak law of large numbers states that, when $X_1, \dots, X_n$ are IID with mean $\mu$, the sample mean $\overline{X}_n$ converges in probability to the population mean $\mu$ as the number of samples $n$ increases.
$$ \overline{X}_n \xrightarrow{P} \mu $$
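As a rough illustration of this convergence, the following sketch tracks the running sample mean of simulated draws. The exponential distribution, the seed, and the sample sizes are assumptions chosen for the example, not part of the statement above.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible

mu = 2.0  # population mean of Exponential(scale=2); an illustrative choice
samples = rng.exponential(scale=mu, size=100_000)

# Running sample mean: entry n-1 holds the mean of the first n draws.
running_mean = np.cumsum(samples) / np.arange(1, samples.size + 1)

for n in (10, 1_000, 100_000):
    err = abs(running_mean[n - 1] - mu)
    print(f"n={n:>7}  sample mean={running_mean[n - 1]:.4f}  |error|={err:.4f}")
```

The error need not shrink monotonically along a single path, but for large $n$ it should be small with high probability.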
Central Limit Theorem (CLT)
The central limit theorem states that, when $X_1, \dots, X_n$ are IID with mean $\mu$ and finite variance $\sigma^2$, the distribution of the sample mean $\overline{X}_n$ is approximately normal for large $n$.
$$ \overline{X}_n \approx N\left(\mu, \frac{\sigma^2}{n}\right) $$
Here, $\mu$ and $\frac{\sigma^2}{n}$ are the mean and variance of the sample mean itself:
\[\begin{align*} \E[\overline{X}_n] &= \mu \\[0.5em] \Var(\overline{X}_n) &= \frac{\sigma^2}{n} \end{align*}\]
Standardizing the Sample Mean
An alternative form of the CLT standardizes the sample mean:
\[Z_n = \frac{\overline{X}_n - \mu}{\sigma / \sqrt{n}} = \frac{\sqrt{n}(\overline{X}_n - \mu)}{\sigma} \leadsto N(0, 1)\]
So, $Z_n$ converges in distribution to a standard normal distribution.
You might ask: doesn't $\overline{X}_n - \mu$ converge to $0$ by the law of large numbers? It does, but the factor $\sqrt{n}$ rescales the difference at exactly the right rate. The standard deviation of $\overline{X}_n - \mu$ is $\sigma / \sqrt{n}$, so $\sqrt{n}(\overline{X}_n - \mu)$ has constant variance $\sigma^2$: it neither collapses to $0$ nor blows up, and instead converges in distribution to a normal distribution.
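The effect of the $\sqrt{n}$ scaling can be checked numerically. This is a minimal sketch, assuming Exponential(1) data (so $\mu = \sigma = 1$) and arbitrary values of `n` and `reps`; it compares the spread of the raw difference with that of the standardized mean.

```python
import numpy as np

rng = np.random.default_rng(1)

n, reps = 500, 10_000
mu, sigma = 1.0, 1.0  # Exponential(scale=1) has mean 1 and standard deviation 1

# Each row is one experiment of n IID draws; take the mean of each row.
xbar = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)

raw_diff = xbar - mu                   # concentrates near 0 (law of large numbers)
z = np.sqrt(n) * (xbar - mu) / sigma   # rescaled by sqrt(n): spread stays near 1

print("std of xbar - mu:", raw_diff.std())
print("mean, std of Z_n:", z.mean(), z.std())
# For a standard normal, about 95% of the mass lies within +/- 1.96.
print("P(|Z_n| <= 1.96):", np.mean(np.abs(z) <= 1.96))
```

The standard deviation of `raw_diff` should be close to $\sigma / \sqrt{n}$, while `z` should have mean near $0$ and standard deviation near $1$.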
Central Limit Theorem with SEM
In practice, we often do not know $\sigma$.
So instead of using the standard deviation of the sample mean, $\frac{\sigma}{\sqrt{n}}$, we use the estimated standard error of the mean (SEM), $\frac{S_n}{\sqrt{n}}$, where $S_n$ is the sample standard deviation.
Then, the CLT becomes,
$$ \frac{\overline{X}_n - \mu}{S_n / \sqrt{n}} \leadsto N(0, 1) $$
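One practical consequence is that the interval $\overline{X}_n \pm 1.96 \cdot \text{SEM}$ should cover $\mu$ in roughly 95% of repeated experiments when $n$ is large. The sketch below checks this by simulation; the Exponential(1) data, the seed, and the sizes `n` and `reps` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

n, reps = 500, 10_000
mu = 1.0  # true mean of Exponential(scale=1); sigma is treated as unknown

x = rng.exponential(scale=1.0, size=(reps, n))
xbar = x.mean(axis=1)
s = x.std(axis=1, ddof=1)   # sample standard deviation S_n (n - 1 in the denominator)
sem = s / np.sqrt(n)        # estimated standard error of the mean

t = (xbar - mu) / sem       # the statistic from the formula above

# Fraction of experiments whose interval xbar +/- 1.96 * SEM covers mu.
coverage = np.mean(np.abs(t) <= 1.96)
print("mean, std of statistic:", t.mean(), t.std())
print("coverage of 95% interval:", coverage)
```

For small $n$ or heavily skewed data the coverage can fall noticeably below 95%, since the result is only asymptotic.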
How?
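The statistic based on the SEM can be factored as:
\[\frac{\overline{X}_n - \mu}{S_n / \sqrt{n}} = \frac{\overline{X}_n - \mu}{\sigma / \sqrt{n}} \cdot \frac{\sigma}{S_n}\]
We know that $\frac{\overline{X}_n - \mu}{\sigma / \sqrt{n}} \leadsto N(0, 1)$ from the CLT.
We also know that the sample standard deviation is a consistent estimator of $\sigma$, so
\[S_n \xrightarrow{P} \sigma \quad \implies \quad \frac{\sigma}{S_n} \xrightarrow{P} 1 \quad \implies \quad \frac{\sigma}{S_n} \leadsto 1\]
By Slutsky's theorem, the product of a sequence that converges in distribution and a sequence that converges in probability to a constant converges in distribution to the product of the limits:
\[\frac{\overline{X}_n - \mu}{\sigma / \sqrt{n}} \cdot \frac{\sigma}{S_n} \leadsto N(0, 1) \cdot 1 = N(0, 1)\]
Multivariate CLT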
Now suppose $\boldsymbol{X}_1, \dots, \boldsymbol{X}_n$ are IID, where each $\boldsymbol{X}_i \in \mathbb{R}^k$ is a random vector with mean vector $\boldsymbol{\mu} \in \mathbb{R}^k$ and covariance matrix $\boldsymbol{\Sigma} \in \mathbb{R}^{k \times k}$.
\[\boldsymbol{X}_i = \begin{bmatrix} X_{1i} \\ X_{2i} \\ \vdots \\ X_{ki} \end{bmatrix}\]
The sample mean vector $\boldsymbol{\overline{X}}_n$ is,
\[\boldsymbol{\overline{X}}_n = \frac{1}{n} \sum_{i=1}^n \boldsymbol{X}_i = \begin{bmatrix} \overline{X}_{1} \\ \overline{X}_{2} \\ \vdots \\ \overline{X}_{k} \end{bmatrix}\]
Then, the multivariate CLT states that,
$$ \sqrt{n}(\boldsymbol{\overline{X}}_n - \boldsymbol{\mu}) \leadsto N(\boldsymbol{0}, \boldsymbol{\Sigma}) $$
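A simulation can make the multivariate statement concrete. The sketch below is an illustration under assumed choices: vectors in $\mathbb{R}^2$ built as $(E_1, E_1 + E_2)$ from independent Exponential(1) draws, which gives $\boldsymbol{\mu} = (1, 2)$ and $\boldsymbol{\Sigma} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}$. It compares the empirical covariance of $\sqrt{n}(\boldsymbol{\overline{X}}_n - \boldsymbol{\mu})$ with $\boldsymbol{\Sigma}$ and checks how often the scaled mean falls inside the 95% ellipse of $N(\boldsymbol{0}, \boldsymbol{\Sigma})$.

```python
import numpy as np

rng = np.random.default_rng(3)

n, reps = 400, 5_000

# Correlated, non-normal vectors in R^2: X = (E1, E1 + E2) with E1, E2 ~ Exponential(1).
e = rng.exponential(scale=1.0, size=(reps, n, 2))
x = np.stack([e[..., 0], e[..., 0] + e[..., 1]], axis=-1)

mu = np.array([1.0, 2.0])
sigma_true = np.array([[1.0, 1.0], [1.0, 2.0]])

xbar = x.mean(axis=1)               # shape (reps, 2): one mean vector per experiment
scaled = np.sqrt(n) * (xbar - mu)   # sqrt(n) * (Xbar_n - mu)

# Empirical covariance across experiments; should be close to Sigma.
sigma_hat = np.cov(scaled, rowvar=False)

# Squared Mahalanobis distance of each scaled mean under N(0, Sigma).
d2 = np.einsum("ij,jk,ik->i", scaled, np.linalg.inv(sigma_true), scaled)

# Under N(0, Sigma) in 2 dimensions, d2 is chi-square with 2 degrees of freedom,
# whose CDF is 1 - exp(-x / 2), so the 95% quantile is -2 * log(0.05).
inside = np.mean(d2 <= -2.0 * np.log(0.05))

print(sigma_hat)
print("fraction inside the 95% ellipse:", inside)
```

Matching the covariance alone does not demonstrate normality, since $\sqrt{n}(\boldsymbol{\overline{X}}_n - \boldsymbol{\mu})$ has covariance $\boldsymbol{\Sigma}$ for every $n$; the ellipse check is what probes the shape of the limiting distribution.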