Stationarity

Although stationarity seems like an intuitive concept, identifying it in practice is not as easy as it appears.

Identifying stationarity is important because many statistical models assume stationarity.

Stationarity also implies that key statistics, such as the mean and variance, are constant over time, which makes it a desirable property for analysis.

Table of contents
  1. Understanding stationarity
  2. Quick rule of thumb to identify stationarity
  3. Unit root test
    1. Unit root
    2. Augmented Dickey-Fuller test
      1. Caveats of ADF
  4. Ways to correct non-stationarity
  5. Different types of stationarity

Understanding stationarity

Suppose a time series $y_t$ is generated by a stochastic process with some unknown joint probability distribution.

For any time lag $k$, if the joint probability distribution of $y_{t+k}$ remains the same as that of $y_t$, then the time series is stationary.
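
Written out, this (strict) stationarity condition says that for any set of time points $t_1, \dots, t_n$ and any lag $k$:

\[F_{y_{t_1}, \dots, y_{t_n}}(c_1, \dots, c_n) = F_{y_{t_1+k}, \dots, y_{t_n+k}}(c_1, \dots, c_n)\]

where $F$ denotes the joint cumulative distribution function.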


Quick rule of thumb to identify stationarity

  • Mean is constant over time
  • Variance is constant over time
  • There is no seasonality

You can usually spot these properties by visualizing the data, for example by plotting the rolling mean and variance.

White noise is a special type of stationary time series, where the mean is zero and the variance is constant over time.
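
As a rough illustration, here is a minimal sketch (assuming numpy, pandas, and matplotlib are available; the series are simulated, not real data) that compares white noise with a trended series by plotting their rolling mean and standard deviation. A roughly flat rolling mean and standard deviation is a first hint of stationarity.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)

# White noise: zero mean, constant variance -> stationary
white_noise = pd.Series(rng.normal(loc=0.0, scale=1.0, size=500))

# Linear upward trend -> the mean changes over time -> non-stationary
trended = pd.Series(np.linspace(0, 10, 500) + rng.normal(size=500))

fig, axes = plt.subplots(2, 1, figsize=(10, 6), sharex=True)
for ax, series, title in zip(axes, [white_noise, trended], ["White noise", "Trended series"]):
    ax.plot(series, alpha=0.5, label="series")
    ax.plot(series.rolling(window=50).mean(), label="rolling mean")
    ax.plot(series.rolling(window=50).std(), label="rolling std")
    ax.set_title(title)
    ax.legend()
plt.tight_layout()
plt.show()
```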


Unit root test

Most statistical tests for stationarity look for the presence of a unit root.

If a time series data has a unit root, it is non-stationary.

Unit root

If the characteristic equation of a time series process has a root equal to 1, the process is said to have a unit root.

Let’s define an autoregressive process of order 1 (AR(1)) as follows:

\[y_t = \phi y_{t-1} + \epsilon_t\]

The characteristic equation of this process is:

\[1 - \phi z = 0\]

If $\phi = 1$, the root of this equation is $z = 1$, which is a unit root.

In that case the process is a random walk, which is non-stationary: its variance grows over time.
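
To see why $\phi = 1$ matters, here is a small sketch (the parameter values are illustrative) that simulates an AR(1) process with $\phi = 0.5$ and with $\phi = 1$. The random walk’s variance keeps growing, which is exactly the non-stationary behaviour.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ar1(phi, n=1000, sigma=1.0):
    """Simulate y_t = phi * y_{t-1} + eps_t with Gaussian noise and y_0 = 0."""
    y = np.zeros(n)
    eps = rng.normal(scale=sigma, size=n)
    for t in range(1, n):
        y[t] = phi * y[t - 1] + eps[t]
    return y

stationary = simulate_ar1(phi=0.5)   # |phi| < 1: stationary
random_walk = simulate_ar1(phi=1.0)  # phi = 1: unit root (random walk)

# The stationary series keeps roughly the same variance in both halves,
# while the random walk's variance keeps growing over time.
print(np.var(stationary[:500]), np.var(stationary[500:]))
print(np.var(random_walk[:500]), np.var(random_walk[500:]))
```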

Augmented Dickey-Fuller test

The Augmented Dickey-Fuller (ADF) test is the most commonly used hypothesis test for stationarity.

The null hypothesis is that the time series data has a unit root.

If the test result is significant (the p-value is below the chosen significance level), we reject the null hypothesis and conclude that the time series is stationary.

As the name suggests, ADF is an extension of the Dickey-Fuller test. The main difference is that ADF can handle higher order autoregressive processes.
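
In practice the test is usually run with a library. A minimal sketch using statsmodels’ `adfuller` on two simulated series (a random walk and a stationary AR(1)) might look like this:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)

# A random walk (has a unit root) and a stationary AR(1) series for comparison
random_walk = np.cumsum(rng.normal(size=500))
ar1 = np.zeros(500)
for t in range(1, 500):
    ar1[t] = 0.5 * ar1[t - 1] + rng.normal()

for name, series in [("random walk", random_walk), ("AR(1), phi=0.5", ar1)]:
    adf_stat, p_value, *rest = adfuller(series)
    # A small p-value means we reject the unit-root null, i.e. evidence of stationarity
    print(f"{name}: ADF statistic = {adf_stat:.3f}, p-value = {p_value:.3f}")
```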

Caveats of ADF

  • Struggles to distinguish between near-unit root and unit root
  • Low statistical power when the sample size is small, so it may fail to reject the unit-root null even for a stationary series

Ways to correct non-stationarity

It is not always possible (or desirable) to correct non-stationarity, but there are some standard methods that can be applied; a short code sketch follows the list.

  • Differencing: removes a trend, fixes a non-constant mean
  • Log transformation: stabilises a non-constant (growing) variance
  • Square root transformation: also stabilises a non-constant variance
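
As an illustration (the series below is made up for the example), differencing and a log transform can be applied with pandas and numpy like this:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Hypothetical non-stationary series: exponential trend with multiplicative noise,
# so both the mean and the variance grow over time
y = pd.Series(np.exp(np.linspace(0, 3, 200) + rng.normal(scale=0.1, size=200)))

log_y = np.log(y)                    # log transform: stabilises the growing variance
diff_y = y.diff().dropna()           # first difference: removes the trend in the mean
log_diff_y = log_y.diff().dropna()   # combining both is a common choice

print(diff_y.head())
print(log_diff_y.head())
```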

Different types of stationarity


There are different types of stationarity depending on the constraints enforced on the joint probability distribution.

e.g. only the first moment (mean) and second moment (variance) are required to be constant, while higher-order moments are allowed to vary (this leads to weak stationarity; its conditions are written out after the list below),
e.g. the stationarity constraint has to hold only for certain sets of time points rather than for all of them.

  • Strict stationarity
  • Weak stationarity
  • N-th order stationarity
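
For example, weak (covariance) stationarity only requires the mean, variance, and autocovariance to be independent of time:

\[E[y_t] = \mu, \quad \text{Var}(y_t) = \sigma^2, \quad \text{Cov}(y_t, y_{t+k}) = \gamma_k \quad \text{for all } t \text{ and } k\]

Strict stationarity, in contrast, requires the full joint distribution to be invariant to time shifts, as in the definition given earlier.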