Central Limit Theorem / Law of Large Numbers

Table of contents
  1. The Law of Large Numbers
  2. Central Limit Theorem (CLT)
  3. Central Limit Theorem with SEM
  4. Multivariate CLT

The Law of Large Numbers

The weak law of large numbers states that, when $X_1, \dots, X_n$ are IID with finite mean $\mu$, the sample mean $\overline{X}_n$ converges in probability to the population mean $\mu$ as the number of samples $n$ increases.

$$ \overline{X}_n \xrightarrow{P} \mu $$
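
The statement can be checked numerically. The sketch below is only an illustration, not part of the theorem: it assumes Exponential(1) data (so $\mu = 1$), and the helper name `sample_means_by_size` and the seed are arbitrary choices. It prints the sample mean of the first $n$ draws for growing $n$; the error should tend to shrink, although not necessarily monotonically.

```python
import random
import statistics

def sample_means_by_size(sizes, seed=0):
    """Sample mean of the first n Exponential(1) draws, for each n in sizes."""
    rng = random.Random(seed)
    draws = [rng.expovariate(1.0) for _ in range(max(sizes))]
    return {n: statistics.fmean(draws[:n]) for n in sizes}

MU = 1.0  # population mean of Exponential(1); an assumption of this example

means = sample_means_by_size([10, 100, 1_000, 10_000, 100_000])
for n, xbar in means.items():
    print(f"n={n:>6}  sample mean={xbar:.4f}  |error|={abs(xbar - MU):.4f}")
```

Any other distribution with a finite mean could be swapped in; only `MU` and the sampling call would change.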


Central Limit Theorem (CLT)

The central limit theorem states that, when $X_1, \dots, X_n$ are IID with mean $\mu$ and finite variance $\sigma^2 > 0$, the sample mean $\overline{X}_n$ is approximately normally distributed for large $n$.

$$ \overline{X}_n \approx N\left(\mu, \frac{\sigma^2}{n}\right) $$

Recall that,

\[\begin{align*} \E[\overline{X}_n] &= \mu \\[0.5em] \Var(\overline{X}_n) &= \frac{\sigma^2}{n} \end{align*}\]
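
Both identities follow from linearity of expectation and, for the variance, from the independence of the $X_i$. A short derivation in the same notation:

\[\begin{align*} \E[\overline{X}_n] &= \frac{1}{n} \sum_{i=1}^n \E[X_i] = \frac{1}{n} \cdot n\mu = \mu \\[0.5em] \Var(\overline{X}_n) &= \frac{1}{n^2} \sum_{i=1}^n \Var(X_i) = \frac{1}{n^2} \cdot n\sigma^2 = \frac{\sigma^2}{n} \end{align*}\]
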

Standardizing the Sample Mean

An alternative form of the CLT standardizes the sample mean:

\[Z_n = \frac{\overline{X}_n - \mu}{\sigma / \sqrt{n}} = \frac{\sqrt{n}(\overline{X}_n - \mu)}{\sigma} \leadsto N(0, 1)\]

So, $Z_n$ converges in distribution to a standard normal distribution.

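You might ask: doesn't $\overline{X}_n - \mu$ converge to $0$ by the law of large numbers? It does, but multiplying by $\sqrt{n}$ rescales the error at exactly the rate that keeps its spread fixed: $\Var(\sqrt{n}(\overline{X}_n - \mu)) = n \cdot \frac{\sigma^2}{n} = \sigma^2$ for every $n$. A slower factor would let the product collapse to $0$, a faster one would make it blow up, and $\sqrt{n}$ is just right for a non-degenerate normal limit.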
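
A small simulation makes the role of $\sqrt{n}$ concrete. This is a sketch under stated assumptions: the data are Exponential(1) (so $\mu = \sigma = 1$), and the helper names `simulate_errors` and `standardize` are illustrative. For each $n$ it compares the spread of the raw error $\overline{X}_n - \mu$, which shrinks, with the spread of $Z_n$, which should stay near $1$, and estimates how often $|Z_n| \le 1.96$ (about $0.95$ under a standard normal).

```python
import math
import random
import statistics

MU, SIGMA = 1.0, 1.0  # mean and standard deviation of Exponential(1)

def simulate_errors(n, reps, seed=0):
    """Return `reps` simulated values of the raw error (sample mean - mu)."""
    rng = random.Random(seed)
    errors = []
    for _rep in range(reps):
        xbar = statistics.fmean(rng.expovariate(1.0) for _i in range(n))
        errors.append(xbar - MU)
    return errors

def standardize(errors, n):
    """Rescale raw errors to Z_n = sqrt(n) * (sample mean - mu) / sigma."""
    return [math.sqrt(n) * e / SIGMA for e in errors]

for n in (10, 100, 1_000):
    errors = simulate_errors(n, reps=1_000)
    z = standardize(errors, n)
    inside = sum(abs(v) <= 1.96 for v in z) / len(z)
    print(
        f"n={n:>5}  sd(error)={statistics.stdev(errors):.4f}  "
        f"sd(Z_n)={statistics.stdev(z):.3f}  P(|Z_n|<=1.96)~{inside:.3f}"
    )
```

The `sd(error)` column should fall roughly like $1/\sqrt{n}$, while `sd(Z_n)` hovers around $1$.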


Central Limit Theorem with SEM

In practice, we often do not know $\sigma$.

So instead of using the standard deviation of the sample mean, $\frac{\sigma}{\sqrt{n}}$, we use the estimated standard error of the mean (SEM), $\frac{S_n}{\sqrt{n}}$, where $S_n$ is the sample standard deviation.

The CLT then becomes,

$$ \frac{\overline{X}_n - \mu}{S_n / \sqrt{n}} \leadsto N(0, 1) $$

Why does this hold?

The SEM-standardized statistic can be factored as:

\[\frac{\overline{X}_n - \mu}{S_n / \sqrt{n}} = \frac{\overline{X}_n - \mu}{\sigma / \sqrt{n}} \cdot \frac{\sigma}{S_n}\]

We know that $\frac{\overline{X}_n - \mu}{\sigma / \sqrt{n}} \leadsto N(0, 1)$ from the CLT.

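We also know that $S_n$ is a consistent estimator of $\sigma$, so, by the continuous mapping theorem (with $\sigma > 0$),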

\[S_n \xrightarrow{P} \sigma \quad \implies \quad \frac{\sigma}{S_n} \xrightarrow{P} 1 \quad \implies \quad \frac{\sigma}{S_n} \leadsto 1\]

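By Slutsky's theorem, the product of a sequence that converges in distribution and a sequence that converges in probability to a constant converges in distribution to the product of the limits:

\[\frac{\overline{X}_n - \mu}{\sigma / \sqrt{n}} \cdot \frac{\sigma}{S_n} \leadsto N(0, 1) \cdot 1 = N(0, 1)\]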
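
As a rough numerical check of this result, the sketch below standardizes with $S_n$ instead of $\sigma$. It rests on assumptions not in the text above: the data are Exponential(1) (so $\mu = 1$), $S_n$ is computed with the $n - 1$ denominator via `statistics.stdev`, and the name `studentized_means` is illustrative. If the argument holds, about 95% of the simulated values should fall within $\pm 1.96$ for large $n$.

```python
import math
import random
import statistics

def studentized_means(n, reps, seed=0):
    """Simulate (sample mean - mu) / (S_n / sqrt(n)) for Exponential(1) data."""
    rng = random.Random(seed)
    mu = 1.0  # true mean, known here only because we chose the distribution
    values = []
    for _rep in range(reps):
        xs = [rng.expovariate(1.0) for _i in range(n)]
        sem = statistics.stdev(xs) / math.sqrt(n)  # S_n / sqrt(n); sigma is never used
        values.append((statistics.fmean(xs) - mu) / sem)
    return values

t = studentized_means(n=200, reps=1_000)
coverage = sum(abs(v) <= 1.96 for v in t) / len(t)
print(
    f"mean={statistics.fmean(t):.3f}  sd={statistics.stdev(t):.3f}  "
    f"P(|T_n|<=1.96)~{coverage:.3f}"
)
```

For small $n$ and skewed data the agreement is looser, since both the CLT and $S_n \xrightarrow{P} \sigma$ are large-sample statements.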

Multivariate CLT

Now suppose $\boldsymbol{X}_1, \dots, \boldsymbol{X}_n$ are IID random vectors, where each $\boldsymbol{X}_i \in \mathbb{R}^k$ has mean vector $\boldsymbol{\mu} \in \mathbb{R}^k$ and covariance matrix $\boldsymbol{\Sigma} \in \mathbb{R}^{k \times k}$.

\[\boldsymbol{X}_i = \begin{bmatrix} X_{1i} \\ X_{2i} \\ \vdots \\ X_{ki} \end{bmatrix}\]

The sample mean vector $\boldsymbol{\overline{X}}_n$ is,

\[\boldsymbol{\overline{X}}_n = \frac{1}{n} \sum_{i=1}^n \boldsymbol{X}_i = \begin{bmatrix} \overline{X}_{1} \\ \overline{X}_{2} \\ \vdots \\ \overline{X}_{k} \end{bmatrix}\]

Then, the multivariate CLT states that,

$$ \sqrt{n}(\boldsymbol{\overline{X}}_n - \boldsymbol{\mu}) \leadsto N(\boldsymbol{0}, \boldsymbol{\Sigma}) $$
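
The multivariate statement can be sketched numerically for $k = 2$. The construction is an assumption made for illustration only: $\boldsymbol{X}_i = (E_1, E_1 + E_2)$ with $E_1, E_2$ independent Exponential(1), which gives $\boldsymbol{\mu} = (1, 2)$ and $\boldsymbol{\Sigma} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}$. The helper names are hypothetical. The empirical covariance of $\sqrt{n}(\boldsymbol{\overline{X}}_n - \boldsymbol{\mu})$ across repetitions should be close to $\boldsymbol{\Sigma}$.

```python
import math
import random
import statistics

MU = (1.0, 2.0)                   # mean vector of (E1, E1 + E2)
SIGMA = ((1.0, 1.0), (1.0, 2.0))  # covariance matrix of (E1, E1 + E2)

def scaled_mean_vectors(n, reps, seed=0):
    """Simulate sqrt(n) * (sample mean vector - mu) for X = (E1, E1 + E2)."""
    rng = random.Random(seed)
    root_n = math.sqrt(n)
    out = []
    for _rep in range(reps):
        sum1 = sum2 = 0.0
        for _i in range(n):
            e1 = rng.expovariate(1.0)
            e2 = rng.expovariate(1.0)
            sum1 += e1
            sum2 += e1 + e2
        out.append((root_n * (sum1 / n - MU[0]), root_n * (sum2 / n - MU[1])))
    return out

def sample_covariance(pairs):
    """2x2 sample covariance matrix (n - 1 denominator) of a list of pairs."""
    m1 = statistics.fmean(p[0] for p in pairs)
    m2 = statistics.fmean(p[1] for p in pairs)
    d = len(pairs) - 1
    c11 = sum((a - m1) ** 2 for a, _ in pairs) / d
    c22 = sum((b - m2) ** 2 for _, b in pairs) / d
    c12 = sum((a - m1) * (b - m2) for a, b in pairs) / d
    return ((c11, c12), (c12, c22))

cov = sample_covariance(scaled_mean_vectors(n=100, reps=2_000))
for est_row, true_row in zip(cov, SIGMA):
    print([round(v, 3) for v in est_row], "vs", list(true_row))
```

Note that the off-diagonal entry is recovered as well: the limit captures the dependence between components, not just each component's marginal variance.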