Moment / Expectation / Variance
First Moment (Expectation)
Also known as the mean or expected value.
$$ \E[X] = \int x dF(x) $$
Linearity of Expectation
For random variables $X_i$ and constants $a_i$:
$$ \E\left[\sum_{i=1}^n a_i X_i\right] = \sum_{i=1}^n a_i \E[X_i] $$
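As a quick sanity check, here is a minimal Monte Carlo sketch in NumPy (the distributions, coefficients, and seed are arbitrary illustrative choices; linearity holds regardless):

```python
import numpy as np

# Monte Carlo check of linearity of expectation.
# X1 ~ Exponential(1) and X2 ~ Uniform(0, 1); they need not be
# independent or identically distributed for linearity to hold.
rng = np.random.default_rng(0)
x1 = rng.exponential(scale=1.0, size=1_000_000)  # E[X1] = 1
x2 = rng.uniform(0.0, 1.0, size=1_000_000)       # E[X2] = 0.5

a1, a2 = 3.0, -2.0
lhs = np.mean(a1 * x1 + a2 * x2)            # E[a1*X1 + a2*X2]
rhs = a1 * np.mean(x1) + a2 * np.mean(x2)   # a1*E[X1] + a2*E[X2]
print(lhs, rhs)  # both close to 3*1 + (-2)*0.5 = 2.0
```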
Expectation of Binomial with Bernoulli
The expected value of a binomial random variable $X \sim \text{Binomial}(n, p)$ can be derived from the expected value of a Bernoulli random variable $X_i \sim \text{Bernoulli}(p)$:
\[\E[X] = \E\left[\sum_{i=1}^n X_i\right] = \sum_{i=1}^n \E[X_i] = \sum_{i=1}^n p = np\]
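A quick NumPy simulation of this derivation ($n$, $p$, the sample size, and the seed are arbitrary choices):

```python
import numpy as np

# A sum of n independent Bernoulli(p) draws is Binomial(n, p);
# the sample mean of the sums should be close to n*p.
rng = np.random.default_rng(0)
n, p = 10, 0.3
bernoulli = rng.random((100_000, n)) < p  # each row: n Bernoulli(p) trials
x = bernoulli.sum(axis=1)                 # Binomial(n, p) samples
print(x.mean(), n * p)                    # both close to 3.0
```

Linearity of Expectation in Matrix Form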
Let $X$ be a random vector of $n$ random variables with mean vector $\mu$ and covariance matrix $\Sigma$.
Let $a$ be a constant vector of $n$ constants.
Then:
$$ \E[a^T X] = a^T \mu $$
If $A$ is a constant matrix:
$$ \E[AX] = A\mu $$
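A minimal NumPy sketch of the $\E[AX] = A\mu$ identity (the $a^T X$ case is the single-row special case); the mean vector and matrix here are arbitrary, and a unit-variance normal is assumed just to generate samples:

```python
import numpy as np

# Check E[AX] = A @ mu for a random vector X with known mean vector mu.
rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0, 0.5])
A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, -1.0]])

X = rng.normal(loc=mu, scale=1.0, size=(1_000_000, 3))  # rows are samples
print((X @ A.T).mean(axis=0))  # sample mean of AX
print(A @ mu)                  # both close to [-1.5, -2.5]
```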
Product of Independent Random Variables
When $X_i$ are independent:
$$ \E\left[\prod_{i=1}^n X_i\right] = \prod_{i=1}^n \E[X_i] $$
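A NumPy sketch (distribution choices arbitrary) that also shows why independence matters:

```python
import numpy as np

# For independent X and Y, E[XY] = E[X] * E[Y].
rng = np.random.default_rng(0)
x = rng.exponential(1.0, size=1_000_000)   # E[X] = 1
y = rng.uniform(0.0, 2.0, size=1_000_000)  # E[Y] = 1
print(np.mean(x * y), np.mean(x) * np.mean(y))  # both close to 1.0

# Independence matters: with Y = X the identity fails,
# since E[X^2] = 2 for Exponential(1) while E[X]^2 = 1.
print(np.mean(x * x), np.mean(x) ** 2)
```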
K-th Moment
The $k$-th moment of a random variable $X$ is:
$$ \E[X^k] = \int x^k dF(x) $$
as long as $\E[{|X|}^k] < \infty$.
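For example, the fourth raw moment of a standard normal is $3$; a sketch comparing SciPy's exact computation against a Monte Carlo estimate:

```python
import numpy as np
from scipy import stats

# Raw moments of a standard normal: E[X^k] is 0 for odd k
# and (k-1)!! for even k, e.g. E[X^4] = 3.
print(stats.norm.moment(4))  # 3.0

x = np.random.default_rng(0).normal(size=1_000_000)
print(np.mean(x ** 4))       # Monte Carlo estimate of E[X^4]
```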
Moment Generating Function
The moment generating function (MGF) is another way to fully specify a probability distribution, just like the PDF or CDF.
The MGF of a random variable $X$ is:
$$ \psi_X(t) = \E[e^{tX}] = \int e^{tx} f_X(x) dx $$
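For example, the MGF of $N(\mu, \sigma^2)$ has the closed form $e^{\mu t + \sigma^2 t^2 / 2}$. A Monte Carlo sketch (NumPy; the parameters and seed are arbitrary) comparing the empirical $\E[e^{tX}]$ against it:

```python
import numpy as np

# Compare a Monte Carlo estimate of E[e^{tX}] for X ~ N(mu, sigma^2)
# against the closed-form MGF exp(mu*t + sigma^2 * t^2 / 2).
rng = np.random.default_rng(0)
mu, sigma, t = 0.5, 2.0, 0.3
x = rng.normal(mu, sigma, size=1_000_000)
print(np.mean(np.exp(t * x)))                # empirical MGF at t
print(np.exp(mu * t + sigma**2 * t**2 / 2))  # closed form, ~1.391
```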
Laplace Transform
Applying LOTUS gives us:
\[\E[e^{tX}] = \int e^{tx} f_X(x) dx\]
The following is called the two-sided Laplace transform of $f_X(x)$:
\[\mathcal{L}\{f_X\}(s) = \int e^{-sx} f_X(x) dx\]
Since $\E[e^{tX}] = \mathcal{L}\{f_X\}(-t)$, the MGF is sometimes called the Laplace transform of $f_X(x)$.
It is called the moment generating function because the $k$-th derivative of $\psi_X(t)$ at $t=0$ gives the $k$-th moment of $X$:
$$ \E[X^k] = \psi_X^{(k)}(0) $$
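A symbolic sketch with SymPy, differentiating the $N(\mu, \sigma^2)$ MGF $e^{\mu t + \sigma^2 t^2 / 2}$ (a standard closed form) to recover the first three raw moments:

```python
import sympy as sp

# The k-th derivative of the MGF at t = 0 is the k-th raw moment E[X^k].
t = sp.symbols("t")
mu, sigma = sp.symbols("mu sigma", positive=True)
mgf = sp.exp(mu * t + sigma**2 * t**2 / 2)  # MGF of N(mu, sigma^2)

for k in range(1, 4):
    moment = sp.diff(mgf, t, k).subs(t, 0)
    print(k, sp.expand(moment))
# 1: mu
# 2: mu**2 + sigma**2
# 3: mu**3 + 3*mu*sigma**2
```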
First Moment from MGF
The first moment (expectation) of $X$ can be derived from the MGF:
\[\psi_X'(0) = \left[ \frac{d}{dt} \E[e^{tX}] \right]_{t=0} = \E\left[ \frac{d}{dt} e^{tX} \right]_{t=0} = \left[ \E[X e^{tX}] \right]_{t=0} = \E[X]\]
(Swapping the derivative and the expectation is justified when the MGF exists in a neighborhood of $0$.)
MGF under Random Variable Transformation
- When $\psi_X(t)$ is the MGF of $X$ and $Y = aX + b$:
$$ \psi_Y(t) = \E[e^{tY}] = \E[e^{t(aX + b)}] = e^{tb} \E[e^{taX}] = e^{tb} \psi_X(at) $$
- When $\psi_i(t)$ is the MGF of independent $X_i$ and $Y = \sum_{i=1}^n X_i$ (see the sketch after this list):
$$ \psi_Y(t) = \E[e^{tY}] = \E[e^{t\sum_{i=1}^n X_i}] = \prod_{i=1}^n \E[e^{tX_i}] = \prod_{i=1}^n \psi_i(t) $$
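The sketch below (SymPy; $n = 10$ is an arbitrary choice) multiplies Bernoulli MGFs, which gives the Binomial MGF, and recovers the Binomial mean from the product:

```python
import sympy as sp

# MGF of a sum of n i.i.d. Bernoulli(p) variables: the product of the
# individual MGFs (1 - p + p*e^t) is the Binomial(n, p) MGF.
t, p = sp.symbols("t p", positive=True)
n = 10
bernoulli_mgf = 1 - p + p * sp.exp(t)
binomial_mgf = bernoulli_mgf ** n

# First derivative at t = 0 recovers the mean n*p.
print(sp.simplify(sp.diff(binomial_mgf, t).subs(t, 0)))  # 10*p
```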
Second Central Moment (Variance)
Also known as variance.
$$ \Var(X) = \E[(X - \mu)^2] = \E[X^2] - \E[X]^2 $$
Expanding the definition:
\[\begin{align*} \Var(X) &= \E[(X - \E[X])^2] \tag{$\E[X] = \mu$} \\[0.5em] &= \E[X^2 - 2X\E[X] + \E[X]^2] \\[0.5em] &= \E[X^2] - 2\E[X]\E[X] + \E[X]^2 \tag{linearity of expectation} \\[0.5em] &= \E[X^2] - \E[X]^2 \end{align*}\]
Rearranging the terms gives us:
$$ \E[X^2] = \Var(X) + \E[X]^2 $$
This is useful when you have $\E[X]$ and $\Var(X)$ and need $\E[X^2]$.
Standard Deviation
Standard deviation is the square root of variance:
$$ \sigma = \sqrt{\Var(X)} $$
Variance of a Linear Combination
If $a$ and $b$ are constants:
$$ \begin{equation} \label{eq:linear-combination-of-variance} \Var(aX + b) = a^2 \Var(X) \end{equation} $$
Furthermore, for $X$ and $Y$:
$$ \Var(aX + bY) = a^2 \Var(X) + b^2 \Var(Y) + 2ab \Cov(X, Y) $$
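A NumPy sketch checking the two-variable identity on deliberately correlated samples (the construction of $Y$ and the coefficients are arbitrary choices):

```python
import numpy as np

# Check Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y).
rng = np.random.default_rng(0)
x = rng.normal(size=1_000_000)
y = 0.5 * x + rng.normal(size=1_000_000)  # correlated with x: Cov = 0.5

a, b = 2.0, -1.0
lhs = np.var(a * x + b * y)
cov = np.cov(x, y)[0, 1]
rhs = a**2 * np.var(x) + b**2 * np.var(y) + 2 * a * b * cov
print(lhs, rhs)  # both close to 4*1 + 1*1.25 - 2 = 3.25
```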
Variance of a Linear Combination in Matrix Form
Let $X$ be a random vector of $n$ random variables with mean vector $\mu$ and covariance matrix $\Sigma$.
Let $a$ be a constant vector of $n$ constants.
Then:
$$ \Var(a^T X) = a^T \Sigma a $$
If $A$ is a constant matrix:
$$ \Var(AX) = A\Sigma A^T $$
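A NumPy sketch of the matrix identity, assuming a multivariate normal just to generate samples with a known $\Sigma$ (the matrices are arbitrary):

```python
import numpy as np

# Check Var(AX) = A @ Sigma @ A.T using samples with known covariance Sigma.
rng = np.random.default_rng(0)
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
A = np.array([[1.0, 1.0],
              [2.0, -1.0]])

X = rng.multivariate_normal(mean=[0, 0], cov=Sigma, size=1_000_000)
sample_cov = np.cov((X @ A.T).T)  # covariance of the transformed samples
print(sample_cov)
print(A @ Sigma @ A.T)            # should agree
```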
Sum of Independent Random Variables
When $X_i$ are independent and $X = \sum_{i=1}^n X_i$:
$$ \Var(X) = \sum_{i=1}^n \Var(X_i) $$
Variance of Binomial with Bernoulli
The variance of a binomial random variable $X \sim \text{Binomial}(n, p)$ can be derived from the variance of a Bernoulli random variable $X_i \sim \text{Bernoulli}(p)$:
\[\Var(X) = \sum_{i=1}^n \Var(X_i) = \sum_{i=1}^n p(1-p) = np(1-p)\]
Together with the variance of a linear combination above, if the $X_i$ are independent and $X = \sum_{i=1}^n a_i X_i$:
$$ \Var(X) = \sum_{i=1}^n a_i^2 \Var(X_i) $$
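A quick NumPy check of the Binomial variance ($n$ and $p$ are arbitrary):

```python
import numpy as np

# Binomial(n, p) variance via Bernoulli sums: Var(X) = n*p*(1-p).
rng = np.random.default_rng(0)
n, p = 10, 0.3
x = (rng.random((100_000, n)) < p).sum(axis=1)
print(np.var(x), n * p * (1 - p))  # both close to 2.1
```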
K-th Central Moment
The $k$-th central moment of a random variable $X$ is:
$$ \E[(X - \mu)^k] = \int (x - \mu)^k dF(x) $$
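A sketch using scipy.stats.moment, which computes central moments of a sample; for a standard normal the third central moment is $0$ and the fourth is $3$:

```python
import numpy as np
from scipy import stats

# Sample k-th central moments: E[(X - mu)^k].
x = np.random.default_rng(0).normal(size=1_000_000)
print(stats.moment(x, 3))  # close to 0
print(stats.moment(x, 4))  # close to 3
```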