Part III Capstone — Dependence, Stationarity, and Unit Roots

In Part III, we studied the core ideas that make time series analysis different from ordinary statistics.

We introduced dependence, stationarity, and unit roots.

This capstone integrates these ideas using a practical workflow.

The goal is to move from visual inspection to formal diagnosis.


Learning Goals

By completing this capstone, you should be able to:


Dataset

We use the Thai SET Index as the main example: daily data downloaded from Yahoo Finance with yfinance (ticker ^SET.BK), starting in January 2018.


Exercise 1 — Download and Plot the Series

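import yfinance as yf
import matplotlib.pyplot as plt

# Download daily SET Index data from Yahoo Finance.
# auto_adjust=False keeps the separate "Adj Close" column.
set_index = yf.download(
    "^SET.BK",
    start="2018-01-01",
    auto_adjust=False
)

# squeeze() ensures a one-dimensional Series even if a
# one-column DataFrame is returned.
prices = set_index["Adj Close"].squeeze()

prices.plot(figsize=(10,4))

plt.title("Thai SET Index")
plt.xlabel("Date")
plt.ylabel("Index Level")

# The output folder must already exist.
plt.savefig("figs/ch10_/set.png", dpi=300, bbox_inches="tight")
plt.close()   # use plt.show() instead to display the figure interactively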
[Figure: Thai SET Index]

Questions

  1. Does the series appear to fluctuate around a stable mean?

  2. Does it appear persistent?

  3. Are there obvious periods of sharp decline or recovery?



Exercise 2 — Compute Returns

import numpy as np

returns = 100 * np.log(
    prices / prices.shift(1)
)

returns = returns.dropna()

returns.plot(figsize=(10,4))

plt.title("Thai SET Index Log Returns")
plt.xlabel("Date")
plt.ylabel("Log Return (%)")

plt.savefig("figs/ch10_/rtn.png", dpi=300, bbox_inches="tight")
plt.close()   # replace with plt.show()
[Figure: Thai SET Index Log Returns]
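One reason log returns are convenient is that they add up over time. The sketch below illustrates this on a small made-up price series (`toy_prices` is hypothetical, not SET Index data), using the same formula as the exercise. It also compares log returns with simple percentage changes, which are close when daily moves are small.

```python
import numpy as np
import pandas as pd

# Hypothetical toy prices for illustration; not SET Index data.
toy_prices = pd.Series([100.0, 102.0, 99.0, 101.0, 105.0])

# Same formula as in the exercise: percentage log returns.
toy_returns = (100 * np.log(toy_prices / toy_prices.shift(1))).dropna()

# Log returns are additive: their sum equals the log return
# over the whole period.
total_from_sum = toy_returns.sum()
total_direct = 100 * np.log(toy_prices.iloc[-1] / toy_prices.iloc[0])

# Simple percentage changes, for comparison.
simple_returns = (100 * toy_prices.pct_change()).dropna()
max_gap = (simple_returns - toy_returns).abs().max()

print("sum of log returns :", round(total_from_sum, 4))
print("whole-period return:", round(total_direct, 4))
print("largest gap between simple and log returns:", round(max_gap, 4))
```

The gap between the two return definitions grows with the size of the move, so it matters most on days of sharp decline or recovery.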

Questions

  1. How does the return series differ from the price series?

  2. Does the return series appear more stable?

  3. Are there periods of high volatility?



Exercise 3 — Autocorrelation of Prices

We now examine the autocorrelation function of the price level.

from statsmodels.graphics.tsaplots import plot_acf

plot_acf(
    prices.dropna(),
    lags=40
)

plt.title("ACF of SET Index Level")

plt.savefig("figs/ch10_/acf.png", dpi=300, bbox_inches="tight")
plt.close()   # replace with plt.show()
[Figure: ACF of SET Index Level]
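To see what the plot is computing, the following sketch calculates sample autocorrelations by hand with NumPy and applies them to simulated data. The function `sample_acf` and the simulated series are illustrative assumptions, not part of the original workflow. The estimator is the usual one that divides by the full-sample sum of squares.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    dev = x - x.mean()
    denom = np.sum(dev ** 2)
    return np.array([
        np.sum(dev[k:] * dev[:len(x) - k]) / denom
        for k in range(max_lag + 1)
    ])

rng = np.random.default_rng(0)
shocks = rng.normal(size=2000)   # simulated white noise
walk = np.cumsum(shocks)         # simulated random walk built from the shocks

acf_walk = sample_acf(walk, 40)
acf_shocks = sample_acf(shocks, 40)

print("random walk, lags 1, 10, 40:", acf_walk[[1, 10, 40]].round(3))
print("white noise, lags 1, 10, 40:", acf_shocks[[1, 10, 40]].round(3))
```

Comparing the two printed rows with the ACF plot of the index level is one way to judge which simulated process the level resembles more.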

Questions

  1. Do autocorrelations decline quickly or slowly?

  2. What does slow decay suggest?

  3. Why might persistent autocorrelation be a warning sign?



Exercise 4 — Autocorrelation of Returns

plot_acf(
    returns,
    lags=40
)

plt.title("ACF of SET Index Returns")

plt.savefig("figs/ch10_/acf_rtn.png", dpi=300, bbox_inches="tight")
plt.close()   # replace with plt.show()
[Figure: ACF of SET Index Returns]
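A rough way to judge significance is to compare each sample autocorrelation with the approximate white-noise band of plus or minus 1.96 divided by the square root of the sample size. The sketch below does this for simulated white noise, as an illustration rather than a result for SET returns. The band drawn by `plot_acf` may be computed differently (for example with Bartlett's formula), so treat this as an informal check.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
noise = pd.Series(rng.normal(size=2000))   # simulated white noise, not SET returns

max_lag = 40
band = 1.96 / np.sqrt(len(noise))          # approximate 95% band under white noise

# Series.autocorr(lag=k) is the correlation between the series and its k-th lag.
acf_values = pd.Series(
    [noise.autocorr(lag=k) for k in range(1, max_lag + 1)],
    index=range(1, max_lag + 1),
)

outside = acf_values[acf_values.abs() > band]

print(f"band: +/-{band:.4f}")
print(f"{len(outside)} of {max_lag} lags fall outside the band")
```

Even for true white noise, about 5% of lags (roughly 2 out of 40) are expected to fall outside a 95% band, so a few isolated spikes are weak evidence of dependence.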

Questions

  1. Are return autocorrelations smaller than price autocorrelations?

  2. Do returns appear closer to white noise?

  3. Are there any significant autocorrelations?



Exercise 5 — PACF of Returns

The partial autocorrelation function helps identify direct lag relationships.

from statsmodels.graphics.tsaplots import plot_pacf

plot_pacf(
    returns,
    lags=40,
    method="ywm"
)

plt.title("PACF of SET Index Returns")

plt.savefig("figs/ch10_/pacf_rtn.png", dpi=300, bbox_inches="tight")
plt.close()   # replace with plt.show()
[Figure: PACF of SET Index Returns]
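The phrase "direct lag relationship" can be made concrete: the partial autocorrelation at lag k is the coefficient on the k-th lag when the series is regressed on its first k lags. The sketch below implements that regression definition with NumPy on a simulated AR(1). The helper `pacf_by_regression` is hypothetical, and its values will differ slightly from the Yule-Walker based estimate that `method="ywm"` produces.

```python
import numpy as np

def pacf_by_regression(x, k):
    """PACF at lag k: coefficient on x[t-k] in an OLS regression of
    x[t] on a constant and x[t-1], ..., x[t-k] (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = x[k:]
    lags = [x[k - j:len(x) - j] for j in range(1, k + 1)]
    X = np.column_stack([np.ones(len(y))] + lags)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[-1]

# Simulated AR(1) with coefficient 0.6 (illustrative data only).
rng = np.random.default_rng(2)
T, phi = 5000, 0.6
e = rng.normal(size=T)
x = np.zeros(T)
for t in range(1, T):
    x[t] = phi * x[t - 1] + e[t]

pacf_values = [pacf_by_regression(x, k) for k in (1, 2, 3)]
print("PACF at lags 1-3:", np.round(pacf_values, 3))
```

For an AR(1), only the first lag has a direct effect, so the PACF should be close to the AR coefficient at lag 1 and close to zero afterwards.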

Questions

  1. Are there strong partial autocorrelations?

  2. Would an AR model likely be useful for the mean of returns?

  3. Why might returns be difficult to forecast?


Exercise 6 — Unit Root Test on Prices

We now perform the Augmented Dickey-Fuller test.

from statsmodels.tsa.stattools import adfuller

adf_price = adfuller(
    prices.dropna()
)

print("ADF Statistic:", adf_price[0])
print("p-value:", adf_price[1])
ADF Statistic: -2.477312904652193
p-value: 0.1210923683245807

To see the full output, including the critical values, call the function directly:

adfuller(prices.dropna())

Questions

  1. What is the null hypothesis of the ADF test?

  2. Is the p-value small or large?

  3. Do we reject the unit root null?


A large p-value means we fail to reject the null hypothesis of a unit root. This is not proof that a unit root is present, only that the data are consistent with one.
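To show where the test statistic comes from, here is a simplified Dickey-Fuller regression written with NumPy. It regresses the change in the series on a constant and the lagged level, then reports the t-statistic on the lagged level. This is a sketch under stated assumptions: it omits the lagged differences that the augmented test adds, the helper `dickey_fuller_t` is hypothetical, and -2.86 is the approximate large-sample 5% critical value for the constant-only case, which can differ slightly from the finite-sample values that `adfuller` reports.

```python
import numpy as np

def dickey_fuller_t(y):
    """t-statistic on gamma in  dy[t] = alpha + gamma * y[t-1] + e[t].
    Simplified: no lagged differences (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(4)
shocks = rng.normal(size=1000)
random_walk = np.cumsum(shocks)   # has a unit root
white_noise = shocks              # stationary

DF_CRIT_5PCT = -2.86   # approximate 5% critical value, constant only

t_walk = dickey_fuller_t(random_walk)
t_noise = dickey_fuller_t(white_noise)

print(f"random walk: t = {t_walk:.2f}")
print(f"white noise: t = {t_noise:.2f}")
print(f"reject the unit root only if t < {DF_CRIT_5PCT}")
```

The statistic does not follow the usual Student t distribution under the null, which is why it is compared with Dickey-Fuller critical values. The price-level statistic of about -2.48 reported above lies above -2.86, consistent with its p-value of about 0.12.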


Exercise 7 — Unit Root Test on Returns

adf_returns = adfuller(
    returns.dropna()
)

print("ADF Statistic:", adf_returns[0])
print("p-value:", adf_returns[1])
ADF Statistic: -10.62170495177763
p-value: 5.494107461809279e-19

Questions

  1. Does the return series appear stationary, unlike the price level?

  2. How does the p-value compare with the price-level test?

  3. Why does differencing often help with nonstationarity?



Exercise 8 — First Difference of Prices

As an alternative to log returns, we can examine first differences of the index level.

price_diff = prices.diff().dropna()

price_diff.plot(figsize=(10,4))

plt.title("First Difference of SET Index")
plt.xlabel("Date")
plt.ylabel("Change in Index Level")

plt.savefig("figs/ch10_/fd.png", dpi=300, bbox_inches="tight")
plt.close()   # replace with plt.show()
[Figure: First Difference of SET Index]
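One practical difference between the two transformations is scale. The sketch below uses made-up index levels (not SET data) to show that the same 1% rise produces different first differences at different index levels, while the log return is the same.

```python
import numpy as np
import pandas as pd

# Hypothetical index levels: the same 1% rise from two different starting levels.
examples = {
    "low level": pd.Series([1000.0, 1010.0]),
    "high level": pd.Series([1800.0, 1818.0]),
}

results = {}
for name, series in examples.items():
    first_diff = series.diff().iloc[-1]
    log_return = (100 * np.log(series / series.shift(1))).iloc[-1]
    results[name] = (first_diff, log_return)
    print(f"{name}: first difference = {first_diff:.1f}, "
          f"log return = {log_return:.4f}%")
```

Because first differences are measured in index points, their typical size depends on the level of the index during each period, whereas log returns are unit-free.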

Questions

  1. How does the first difference compare with log returns?

  2. Does differencing remove the long-run trend?

  3. Which transformation is easier to interpret financially?


Exercise 9 — Comparing Levels, Differences, and Returns

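import pandas as pd

# The series are aligned on the date index. The first date has no
# difference or return, so those columns have one fewer observation.
comparison = pd.DataFrame({
    "Level": prices,
    "First Difference": price_diff,
    "Log Return": returns
})

comparison.describe()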
|       | Level       | First Difference | Log Return  |
|-------|-------------|------------------|-------------|
| count | 2015.000000 | 2014.000000      | 2014.000000 |
| mean  | 1504.308084 | -0.141430        | -0.008666   |
| std   | 173.351414  | 13.984090        | 1.015393    |
| min   | 1024.459961 | -134.979980      | -11.428184  |
| 25%   | 1367.034973 | -7.194977        | -0.482559   |
| 50%   | 1549.010010 | 0.205078         | 0.012854    |
| 75%   | 1636.134949 | 7.457550         | 0.484997    |
| max   | 1838.959961 | 83.050049        | 7.653075    |

Questions

  1. Which series has the largest scale?

  2. Which series appears most stable?

  3. Why should we avoid comparing standard deviations across variables with different units?


Exercise 10 — Rolling Mean and Rolling Variance

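Weak stationarity requires more than a stable mean.

The variance must also be constant over time, and the autocovariances must depend only on the lag. Rolling statistics give an informal check of the mean and variance conditions.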

rolling_mean = returns.rolling(60).mean()
rolling_std = returns.rolling(60).std()

plt.figure(figsize=(10,4))

plt.plot(
    rolling_mean,
    label="Rolling Mean"
)

plt.plot(
    rolling_std,
    label="Rolling Standard Deviation"
)

plt.legend()

plt.title("Rolling Mean and Volatility of SET Returns")

plt.savefig("figs/ch10_/rolling.png", dpi=300, bbox_inches="tight")
plt.close()   # replace with plt.show()
[Figure: Rolling Mean and Volatility of SET Returns]
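To see what a change in variance looks like in rolling statistics, the sketch below simulates a series whose standard deviation jumps from 1 to 3 halfway through and applies the same 60-observation window. The data and the size of the jump are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
calm = rng.normal(scale=1.0, size=500)        # low-volatility regime
turbulent = rng.normal(scale=3.0, size=500)   # high-volatility regime
simulated = pd.Series(np.concatenate([calm, turbulent]))

window = 60
rolling_mean = simulated.rolling(window).mean()
rolling_std = simulated.rolling(window).std()

# Average rolling std over windows that lie entirely inside one regime.
std_calm = rolling_std.iloc[window - 1:500].mean()
std_turbulent = rolling_std.iloc[500 + window:].mean()

print(f"average rolling std, calm regime     : {std_calm:.2f}")
print(f"average rolling std, turbulent regime: {std_turbulent:.2f}")
```

The first `window - 1` values of each rolling series are missing because a full window is not yet available, which is why rolling plots start slightly later than the data.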

Questions

  1. Is the rolling mean relatively stable?

  2. Is the rolling standard deviation stable?

  3. What does changing volatility suggest?



Exercise 11 — Simulating a Stationary AR(1)

Now compare the financial data with a simulated stationary AR(1) process with coefficient phi = 0.6.

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(123)

T = 500
phi = 0.6

e = np.random.normal(size=T)

x = np.zeros(T)

for t in range(1, T):
    x[t] = phi * x[t-1] + e[t]

plt.figure(figsize=(10,4))

plt.plot(x)

plt.title("Simulated Stationary AR(1) Process")

plt.savefig("figs/ch10_/AR_1.png", dpi=300, bbox_inches="tight")
plt.close()   # replace with plt.show()
[Figure: Simulated Stationary AR(1) Process]
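For a stationary AR(1) with unit-variance shocks, theory gives a mean of zero, a variance of 1 / (1 - phi^2), and a lag-k autocorrelation of phi^k. The sketch below checks these against a longer simulation. The sample size of 20,000 and the seed are assumptions chosen so that sample moments are close to their theoretical values.

```python
import numpy as np

rng = np.random.default_rng(6)
T = 20000
phi = 0.6

e = rng.normal(size=T)
x = np.zeros(T)
for t in range(1, T):
    x[t] = phi * x[t - 1] + e[t]

theoretical_var = 1.0 / (1.0 - phi ** 2)   # shocks have variance 1
sample_var = x.var()

# Sample autocorrelations at lags 1 and 2.
lag1 = np.corrcoef(x[1:], x[:-1])[0, 1]
lag2 = np.corrcoef(x[2:], x[:-2])[0, 1]

print(f"variance: theory {theoretical_var:.4f}, sample {sample_var:.4f}")
print(f"lag-1 autocorrelation: theory {phi:.2f}, sample {lag1:.3f}")
print(f"lag-2 autocorrelation: theory {phi ** 2:.2f}, sample {lag2:.3f}")
```

The fixed variance and the geometrically decaying autocorrelations are what distinguish this process from a random walk, whose variance grows with time.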

Questions

  1. Does this series fluctuate around a stable mean?

  2. Does it look different from the SET price level?

  3. How does it compare with returns?


Exercise 12 — Simulating a Random Walk

np.random.seed(123)

T = 500

e = np.random.normal(size=T)

rw = np.cumsum(e)

plt.figure(figsize=(10,4))

plt.plot(rw)

plt.title("Simulated Random Walk")

plt.savefig("figs/ch10_/rw.png", dpi=300, bbox_inches="tight")
plt.close()   # replace with plt.show()
[Figure: Simulated Random Walk]
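Two properties of the random walk are easy to verify by simulation: the variance across many independent paths grows in proportion to time, and differencing a path recovers its shocks. The sketch below checks both. The number of paths and the chosen time points are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
n_paths, T = 5000, 400

shocks = rng.normal(size=(n_paths, T))
walks = np.cumsum(shocks, axis=1)   # each row is one random walk

# Variance across paths at two dates (column index t-1 holds the sum of t shocks).
var_100 = walks[:, 99].var()
var_400 = walks[:, 399].var()

# Differencing one path gives back its shocks (except the first one).
recovered = np.diff(walks[0])

print(f"variance at t=100: {var_100:.1f}")
print(f"variance at t=400: {var_400:.1f}")
print("differences match shocks:", np.allclose(recovered, shocks[0, 1:]))
```

With unit-variance shocks the variance after t steps is t, so uncertainty about the level keeps growing with the forecast horizon, while the differenced series is simply white noise.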

Questions

  1. Does the random walk return to a stable mean?

  2. Does it look more like prices or returns?

  3. Why is a random walk difficult to forecast?



Mini Project — Diagnosing Stationarity

Choose one time series.

Examples:

Complete the following tasks:

  1. Plot the level series.

  2. Compute first differences or log returns.

  3. Plot the transformed series.

  4. Plot ACF and PACF.

  5. Perform an ADF test on the level.

  6. Perform an ADF test on the transformed series.

  7. Compare the results.

  8. Explain whether the original series appears stationary.



GRETL Version

The same workflow can be performed in GRETL.


Plotting the Series

Menu:

Variable → Time series plot

First Difference

Menu:

Add → First differences of selected variables

or command:

diff x

ACF and PACF

Menu:

Variable → Correlogram

Unit Root Test

Menu:

Variable → Unit root tests → Augmented Dickey-Fuller test

[GRETL Screenshot Placeholder: ADF test output]

Common Mistakes


Looking Ahead

Part IV introduces linear time series models.

We will use the concepts developed in Part III to build:


Key Takeaways