Pooled Variance
Suppose we have several samples, each with its own sample variance.
If we have good reason to believe that they come from populations with the same variance (the means may differ), we can pool the sample variances to estimate that common variance.
Recap: Sample Variance
Let $X_1, \dots, X_m$ be random samples.
- $n_i$: number of observations in sample $i$.
- $\overline{X}_i$: sample mean of sample $i$.
The (unbiased) sample variance of sample $X_i$, where $x_{ij}$ denotes its $j$-th observation, is:
$$ S_{n_i}^2 = \frac{1}{n_i - 1} \sum_{j=1}^{n_i} (x_{ij} - \overline{X}_i)^2 $$
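As a minimal sketch of the formula above, the following computes the unbiased sample variance of a single sample using only the standard library. The function name `sample_variance` is an illustrative choice, not something defined in this page.

```python
from statistics import mean


def sample_variance(xs):
    """Unbiased sample variance: squared deviations from the sample mean,
    summed and divided by n - 1 (hypothetical helper for illustration)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = mean(xs)
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
s2 = sample_variance(sample)  # squared deviations sum to 32, so this is 32 / 7
```

Python's built-in `statistics.variance` computes the same quantity, so it can be used to cross-check the result.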
Pooling the Variances
Pooled variance is the weighted average of the sample variances.
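Each variance is weighted by the degrees of freedom of its sample ($n_i - 1$), divided by the total degrees of freedom, where $N = \sum_{i=1}^m n_i$ is the total number of observations:
$$ \frac{n_i - 1}{\sum_{k=1}^m (n_k - 1)} = \frac{n_i - 1}{N - m} $$
Multiplying each sample variance by its weight and summing gives the formula for pooled variance: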
$$ S_p^2 = \frac{1}{N-m} \sum_{i=1}^m (n_i - 1) S_{n_i}^2 $$
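The pooled variance formula can be sketched in a few lines of Python. This is an illustrative implementation under the assumption that each sample is a plain list with at least two observations. The name `pooled_variance` and the example data are made up for this sketch.

```python
from statistics import variance


def pooled_variance(samples):
    """Weighted average of the unbiased sample variances, where each
    sample is weighted by its degrees of freedom (n_i - 1)."""
    if any(len(s) < 2 for s in samples):
        raise ValueError("every sample needs at least two observations")
    m = len(samples)                   # number of samples
    N = sum(len(s) for s in samples)   # total number of observations
    weighted_sum = sum((len(s) - 1) * variance(s) for s in samples)
    return weighted_sum / (N - m)


# Hypothetical data: sample variances are 2.5 (n = 5) and 4.0 (n = 3).
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 4.0, 6.0]
sp2 = pooled_variance([a, b])  # (4 * 2.5 + 2 * 4.0) / (8 - 2) = 3.0
```

Note that the larger sample pulls the pooled estimate toward its own variance: the result, 3.0, is closer to 2.5 than the unweighted mean of 3.25 would be.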
Uniform Sample Sizes
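If $n_i = n$ for all $i$, then $N = mn$ and $N - m = m(n - 1)$, so every weight equals $\frac{1}{m}$ and the formula simplifies to the unweighted mean of the sample variances: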
$$ S_p^2 = \frac{1}{m} \sum_{i=1}^m S_{n_i}^2 $$
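A quick numerical check of the equal-size case, as a sketch with made-up data: with three samples of size 3, the degrees-of-freedom weighting and the plain average of the sample variances should agree. The helper `pooled_variance` is a hypothetical name and is redefined here so the snippet runs on its own.

```python
from statistics import mean, variance


def pooled_variance(samples):
    """Degrees-of-freedom weighted average of the sample variances."""
    m = len(samples)
    N = sum(len(s) for s in samples)
    return sum((len(s) - 1) * variance(s) for s in samples) / (N - m)


# Hypothetical samples, all of size n = 3; their variances are 1, 13 and 3.
samples = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 9.0],
    [5.0, 5.0, 8.0],
]

pooled = pooled_variance(samples)                    # 34 / 6
simple_mean = mean([variance(s) for s in samples])   # 17 / 3

# Both expressions give the same estimate when sample sizes are equal.
agree = abs(pooled - simple_mean) < 1e-12
```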