Pooled Variance

Suppose we have samples with different sample variances.

If we have a good reason to believe that they come from the same population, we can pool the variances together to estimate the common variance.

Recap: Sample Variance

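Let $X_1, \dots, X_m$ be random samples, where sample $X_i$ consists of $n_i$ observations $x_{i1}, \dots, x_{i n_i}$ with sample mean $\overline{X}_i$.

The (unbiased) sample variance of each $X_i$ is:

$$ S_{n_i}^2 = \frac{1}{n_i - 1} \sum_{j=1}^{n_i} (x_{ij} - \overline{X}_i)^2 $$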
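
As a minimal sketch of the formula above, the unbiased sample variance of a single sample can be computed as follows. The function name `sample_variance` is a hypothetical helper chosen for illustration; Python's standard library offers the equivalent `statistics.variance`.

```python
from statistics import mean


def sample_variance(xs):
    """Unbiased sample variance: squared deviations divided by n - 1."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    xbar = mean(xs)  # sample mean of this sample
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)


print(sample_variance([2.0, 4.0, 6.0]))  # squared deviations 4 + 0 + 4, divided by 2
```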

Pooled variance is the weighted average of the sample variances.

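Each variance is weighted by its sample's share of the total degrees of freedom. Sample $i$ has $n_i - 1$ degrees of freedom, and with $N = \sum_{k=1}^m n_k$ denoting the total number of observations, the weight is:

$$ \frac{n_i - 1}{\sum_{k=1}^m (n_k - 1)} = \frac{n_i - 1}{N - m} $$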

Summing the weighted sample variances gives the formula for pooled variance:

$$ S_p^2 = \frac{1}{N-m} \sum_{i=1}^m (n_i - 1) S_{n_i}^2 $$
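
The pooled variance formula can be sketched in a few lines of Python. This is an illustrative implementation, not a reference one: the function name `pooled_variance` and the list-of-lists input format are assumptions made for this example, and each sample is assumed to have at least two observations.

```python
from statistics import variance  # unbiased (n - 1) sample variance


def pooled_variance(samples):
    """Weighted average of sample variances, weighted by degrees of freedom."""
    dof = [len(s) - 1 for s in samples]  # n_i - 1 for each sample
    if min(dof) < 1:
        raise ValueError("each sample needs at least two observations")
    total_dof = sum(dof)  # equals N - m
    weighted_sum = sum(d * variance(s) for d, s in zip(dof, samples))
    return weighted_sum / total_dof


a = [1.0, 2.0, 3.0]       # variance 1, 2 degrees of freedom
b = [2.0, 4.0, 6.0, 8.0]  # variance 20/3, 3 degrees of freedom
print(pooled_variance([a, b]))  # (2 * 1 + 3 * 20/3) / 5
```

Larger samples pull the pooled estimate toward their own variance, which is why `b` dominates the result here.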

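If $n_i = n$ for all $i$, then $N = mn$ and $N - m = m(n - 1)$, so the factors of $n - 1$ cancel and the formula simplifies to the plain average of the sample variances: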

$$ S_p^2 = \frac{1}{m} \sum_{i=1}^m S_{n_i}^2 $$
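
For equal-sized samples, the simplified formula amounts to averaging the sample variances directly. The sketch below assumes a hypothetical helper name, `pooled_variance_equal_sizes`, and rejects input whose samples differ in size.

```python
from statistics import mean, variance


def pooled_variance_equal_sizes(samples):
    """Pooled variance when every sample has the same size n."""
    if len({len(s) for s in samples}) != 1:
        raise ValueError("all samples must have the same size")
    # With equal sizes, the degrees-of-freedom weights are all 1/m.
    return mean([variance(s) for s in samples])


a = [1.0, 2.0, 3.0]  # variance 1
b = [2.0, 4.0, 6.0]  # variance 4
print(pooled_variance_equal_sizes([a, b]))  # average of 1 and 4
```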