Sample Mean and Variance

Table of contents
  1. Sample Mean
    1. Converges in Probability to Population Mean
    2. Unbiased Estimator of Population Mean
    3. Consistent Estimator of Population Mean
    4. Variance of the Sample Mean
  2. Sample Variance
    1. Converges in Probability to Population Variance
    2. Unbiased Estimator of Population Variance
    3. Consistent Estimator of Population Variance
  3. Standard Error of the Sample Mean

Sample Mean

For IID random variables $X_1, \dots, X_n$, the sample mean is defined as:

$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$$

Converges in Probability to Population Mean

The weak law of large numbers states that the sample mean $\bar{X}_n$ converges in probability to the population mean $\mu$ as the number of samples $n$ increases.

$$\bar{X}_n \xrightarrow{P} \mu$$
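The convergence can be illustrated with a small simulation. This is only a sketch under assumed conditions: the draws come from Uniform(0, 1), chosen here purely for illustration because its population mean is known to be 0.5, and the helper name `sample_mean` is hypothetical.

```python
import random
import statistics

MU = 0.5  # population mean of Uniform(0, 1), the distribution assumed for this sketch

def sample_mean(n, rng):
    """Mean of n IID draws from Uniform(0, 1)."""
    return statistics.fmean(rng.random() for _ in range(n))

rng = random.Random(0)  # fixed seed so the run is reproducible

# Absolute error of the sample mean for increasing sample sizes.
errors = {n: abs(sample_mean(n, rng) - MU) for n in (10, 1_000, 100_000)}
for n, err in errors.items():
    print(f"n={n:>6}  |sample mean - mu| = {err:.4f}")
```

A single run is not guaranteed to shrink monotonically, since convergence in probability only says that large deviations become increasingly unlikely as $n$ grows.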

Unbiased Estimator of Population Mean

When $E[X_i] = \mu$, the expected value of the sample mean follows from linearity of expectation:

$$E[\bar{X}_n] = E\left[\frac{1}{n} \sum_{i=1}^{n} X_i\right] = \frac{1}{n} \sum_{i=1}^{n} E[X_i] = \mu$$

So $\bar{X}_n$ is an unbiased estimator of the population mean $\mu$.

Consistent Estimator of Population Mean

Since the sample mean converges in probability to the population mean, the sample mean is said to be consistent.

Variance of the Sample Mean

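When $\operatorname{Var}(X_i) = \sigma^2$, the variance of the sample mean follows from two properties of variance: $\operatorname{Var}(aX) = a^2 \operatorname{Var}(X)$ for a constant $a$, and the variance of a sum of independent random variables is the sum of their variances:

$$\operatorname{Var}(\bar{X}_n) = \operatorname{Var}\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right) = \frac{1}{n^2} \sum_{i=1}^{n} \operatorname{Var}(X_i) = \frac{\sigma^2}{n}$$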

The variance of the sample mean decreases as the sample size increases, which matches the intuition that the sample mean becomes more accurate.
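The $\sigma^2 / n$ relationship can be checked empirically. The sketch below assumes Uniform(0, 1) draws (population variance $1/12$), and both the replicate count and the helper name `variance_of_sample_mean` are illustrative choices, not part of the derivation.

```python
import random
import statistics

rng = random.Random(1)   # fixed seed for reproducibility
SIGMA2 = 1 / 12          # population variance of Uniform(0, 1)
REPLICATES = 5_000       # number of independent samples per sample size

def variance_of_sample_mean(n):
    """Empirical variance of the sample mean across REPLICATES samples of size n."""
    means = [
        statistics.fmean(rng.random() for _ in range(n))
        for _ in range(REPLICATES)
    ]
    return statistics.pvariance(means)

for n in (5, 20, 80):
    empirical = variance_of_sample_mean(n)
    print(f"n={n:>2}  empirical={empirical:.5f}  sigma^2/n={SIGMA2 / n:.5f}")
```

Each fourfold increase in $n$ should cut the variance of the sample mean to roughly a quarter, up to simulation noise.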


Sample Variance

Do not confuse sample variance with variance of the sample mean above.

For IID random variables $X_1, \dots, X_n$, the (unbiased) sample variance is defined as:

$$S_n^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X}_n)^2$$

Why $n-1$ in the denominator?

You may be wondering why we divide by $n-1$ instead of $n$.

Refer to this link for details. In short, dividing by $n-1$ makes the sample variance an unbiased estimator of the population variance.

Division by n is good enough when we’re only measuring the dispersion in descriptive statistics, but when we’re using the statistic to estimate the population parameter in inferential statistics, it results in an underestimation of the population variance/standard deviation.
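The underestimation can be seen in a quick simulation. As a sketch, assume standard normal data (population variance 1) and a deliberately small sample size so that the bias is visible. `statistics.pvariance` divides by $n$ and `statistics.variance` divides by $n-1$.

```python
import random
import statistics

rng = random.Random(2)  # fixed seed for reproducibility
N = 5                   # small sample size makes the bias easy to see
REPLICATES = 10_000
SIGMA2 = 1.0            # population variance of the standard normal

biased, unbiased = [], []
for _ in range(REPLICATES):
    xs = [rng.gauss(0.0, 1.0) for _ in range(N)]
    biased.append(statistics.pvariance(xs))   # divides by n
    unbiased.append(statistics.variance(xs))  # divides by n - 1

avg_biased = statistics.fmean(biased)
avg_unbiased = statistics.fmean(unbiased)
print(f"divide by n:     {avg_biased:.3f}  (theory: {(N - 1) / N * SIGMA2:.3f})")
print(f"divide by n - 1: {avg_unbiased:.3f}  (theory: {SIGMA2:.3f})")
```

On average the divide-by-$n$ version lands near $\frac{n-1}{n}\sigma^2$, which is why the gap matters most for small samples and fades as $n$ grows.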

Converges in Probability to Population Variance

The sample variance converges in probability to the population variance $\sigma^2$ as the number of samples $n$ increases.

$$S_n^2 \xrightarrow{P} \sigma^2$$

Since the square root is a continuous function, the continuous mapping theorem for convergence in probability gives

$$S_n \xrightarrow{P} \sigma$$

as well.

Unbiased Estimator of Population Variance

When $E[X_i] = \mu$ and $\operatorname{Var}(X_i) = \sigma^2$, the expected value of the sample variance is:

$$E[S_n^2] = \sigma^2$$

So $S_n^2$ is an unbiased estimator of the population variance $\sigma^2$.
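One common way to see this result, sketched here as a supplement, expands the sum of squares around $\mu$ and uses the variance of the sample mean, $\sigma^2 / n$, computed earlier:

```latex
\begin{aligned}
% Expand around mu; the cross term collapses because sum_i (X_i - mu) = n (Xbar_n - mu).
\sum_{i=1}^{n} (X_i - \bar{X}_n)^2
  &= \sum_{i=1}^{n} (X_i - \mu)^2 - n(\bar{X}_n - \mu)^2 \\
% Take expectations; E[(Xbar_n - mu)^2] = Var(Xbar_n) because E[Xbar_n] = mu.
E\left[\sum_{i=1}^{n} (X_i - \bar{X}_n)^2\right]
  &= \sum_{i=1}^{n} \operatorname{Var}(X_i) - n \operatorname{Var}(\bar{X}_n)
   = n\sigma^2 - n \cdot \frac{\sigma^2}{n}
   = (n-1)\sigma^2 \\
% Divide by n - 1.
E[S_n^2]
  &= \frac{1}{n-1} \cdot (n-1)\sigma^2 = \sigma^2
\end{aligned}
```

The same computation shows that dividing by $n$ instead would give an expected value of $\frac{n-1}{n}\sigma^2$, which is smaller than $\sigma^2$.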

Consistent Estimator of Population Variance

Since the sample variance converges in probability to the population variance, the sample variance is said to be consistent.


Standard Error of the Sample Mean

The standard error of the sample mean (SEM) is the standard deviation of the sample mean.

“Standard error” does not always relate to the sample mean. This term is used to describe the standard deviation of any statistic.

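We calculated above that:

$$\operatorname{Var}(\bar{X}_n) = \frac{\sigma^2}{n}$$

So the standard error is:

$$\text{SEM} = \sqrt{\operatorname{Var}(\bar{X}_n)} = \frac{\sigma}{\sqrt{n}}$$

However, in many cases the population standard deviation $\sigma$ is unknown, so we use the sample standard deviation $S_n$ from above to estimate the standard error:

$$\text{SEM} \approx \frac{S_n}{\sqrt{n}}$$

This substitution is justified because $S_n \xrightarrow{P} \sigma$, as mentioned above.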
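As a minimal sketch of the estimate in practice, the function below computes $S_n / \sqrt{n}$ from a list of observations. The function name `standard_error_of_mean` and the data are made up for illustration. `statistics.stdev` uses the $n-1$ denominator, matching $S_n$ as defined above.

```python
import math
import statistics

def standard_error_of_mean(xs):
    """Estimate the SEM as S_n / sqrt(n), where S_n uses the n - 1 denominator."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations to estimate the SEM")
    return statistics.stdev(xs) / math.sqrt(n)

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up data for illustration
sem = standard_error_of_mean(sample)
print(f"mean = {statistics.fmean(sample):.3f}, estimated SEM = {sem:.3f}")
```

Because the estimate depends on $S_n$, it is itself noisy for small samples and becomes more reliable as $n$ grows.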