Chapter 20 — Cointegration and Long-Run Relationships

In Chapter 17, we saw that regressions involving nonstationary variables can produce spurious results.

In Chapter 18, we introduced dynamic models such as ARDL, which capture short-run dependence and adjustment dynamics.

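An important question now arises: can a regression between nonstationary variables ever describe a meaningful relationship?

The answer is yes, provided the variables share a common trend, so that some linear combination of them is stationary.

This idea is called cointegration.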

Learning Objectives¶

By the end of this chapter, you should be able to:

explain the idea of cointegration
distinguish spurious regression from cointegrated relationships
understand long-run equilibrium
implement the Engle–Granger procedure in GRETL
interpret residual stationarity
understand the logic of the ARDL bounds test

20.1 Motivation: Spurious vs Meaningful Relationships¶

Recall from Chapter 17:

If we regress two unrelated nonstationary variables,

$$y_t = \alpha + \beta x_t + e_t$$

we may obtain:

high $R^2$
significant t-statistics
apparently strong relationships

even when the variables are unrelated.
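
A small simulation makes the danger concrete. The following is a minimal gretl sketch, not part of the chapter's dataset: it generates two independent random walks and regresses one on the other. The sample size of 200, the seed, and the series names x and y are arbitrary choices made for illustration.

```gretl
# illustrative simulation: two independent random walks
nulldata 200
setobs 1 1 --time-series
set seed 12345              # arbitrary seed, for reproducibility only
series x = cum(normal())    # x_t = x_{t-1} + shock
series y = cum(normal())    # y_t built from separate, unrelated shocks
ols y const x
```

Because x and y are unrelated by construction, any sizeable t-ratio on x is an artefact of the shared nonstationarity. Rerunning with different seeds gives a feel for how often this happens; a Durbin-Watson statistic close to zero is a typical warning sign.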

However, some nonstationary variables genuinely move together because they are linked by economic forces.

Examples include:

consumption and income
exchange rates and prices
interest rates and inflation
prices in related financial markets

20.2 What Is Cointegration?¶

Suppose:

$x_t$ is nonstationary
$y_t$ is nonstationary

but a particular linear combination is stationary.

Then the variables are cointegrated.

Formally:

$$x_t \sim I(1), \qquad y_t \sim I(1)$$

but:

$$e_t = y_t - \beta x_t \sim I(0)$$
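
The definition can be illustrated with artificial data. The sketch below is an assumption-laden example rather than anything from the chapter's dataset: it builds an $I(1)$ series x, a white-noise deviation e, and a series y that follows x with a slope of 0.5. The slope, intercept, seed, and sample size are all made up for illustration.

```gretl
# illustrative simulation: a cointegrated pair with known beta = 0.5
nulldata 300
setobs 1 1 --time-series
set seed 2468                   # arbitrary seed
series x = cum(normal())        # I(1) by construction
series e = normal()             # stationary deviation, I(0)
series y = 1 + 0.5*x + e        # y inherits the stochastic trend of x
series z = y - 0.5*x            # the linear combination y - beta*x
adf 1 x --c
adf 1 y --c
adf 1 z --c
```

The unit-root null should typically survive for x and y but be rejected for z. Because $\beta$ is known here rather than estimated, the ordinary ADF p-values apply to z.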

20.3 Intuition: Long-Run Equilibrium¶

Even if two variables individually behave like random walks, a suitable linear combination of them (in the simplest case, their difference) may remain stable.

Cointegration means that while variables may drift over time, they do not drift arbitrarily far apart.

This suggests that some equilibrium force ties them together.

Examples:

consumption cannot permanently diverge from income
exchange rates and relative prices remain linked in the long run
stock prices of related firms may move together over time

20.4 Spurious Regression vs Cointegration¶

This distinction is fundamental.

Case	Residuals	Interpretation
Spurious regression	Nonstationary	No meaningful relationship
Cointegration	Stationary	Long-run equilibrium exists

If residuals are stationary, the regression may be meaningful despite nonstationarity in the original variables.

20.5 The Engle–Granger Two-Step Procedure¶

We now describe the classic Engle–Granger procedure for testing cointegration.

We use quarterly GDP data from Gretl.

Step 1: Load the Data¶

File → Open data → Sample file...

Select:

gdp

from the POE 4th ed. database.

The dataset contains:

usa     real GDP of USA
aus     real GDP of Australia

Step 2: Estimate the Long-Run Relationship¶

Estimate:

$$\text{aus}_t = \alpha + \beta \, \text{usa}_t + e_t$$

Gretl Command¶

ols aus const usa

Output¶

Model 1: OLS, using observations 1970:1-2000:4 (T = 124)
Dependent variable: aus

             coefficient   std. error   t-ratio    p-value 
  ---------------------------------------------------------
  const       −1.07237     0.403225      −2.659   0.0089    ***
  usa          1.00099     0.00610028   164.1     5.85e-145 ***

Mean dependent var   62.72528
R-squared            0.995489
Durbin-Watson        0.272654

Step 3: Extract the Residuals¶

Save the residuals:

series uhat = $uhat

[GRETL Screenshot Placeholder: Residual series]
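
Before running a formal test, it can help to look at the residuals. A minimal sketch, assuming Model 1 has just been estimated and the residuals have been saved as uhat as shown above:

```gretl
# time-series plot of the saved residuals, shown on screen
gnuplot uhat --time-series --with-lines --output=display
```

Residuals that keep crossing zero are consistent with a stationary deviation; long excursions away from zero point toward a unit root. The plot is only suggestive and does not replace the test in Step 4.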

Step 4: Test Residual Stationarity¶

We now test whether the residuals are stationary.

This is the crucial step.

Select uhat.

Then:

Variable → Unit root tests → Augmented Dickey-Fuller

Gretl Command¶

adf 1 uhat

Example Output¶

Augmented Dickey-Fuller test for uhat
unit-root null hypothesis: a = 1

test statistic: tau_c(1) = -3.03875
asymptotic p-value 0.03145
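
gretl also has a built-in coint command for the Engle–Granger test. As a hedged sketch, assuming the same dataset is open: the first argument is the lag order, followed by the dependent variable and the regressor. By default the cointegrating regression includes a constant.

```gretl
# Engle-Granger test in one command: lag order 1, aus as dependent variable
coint 1 aus usa
```

The command first runs ADF tests on aus and usa individually, then estimates the cointegrating regression and tests its residuals. The p-value it reports for the residual test is meant for estimated residuals, so it can differ from the one produced by running adf on uhat directly.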

20.6 Hypotheses¶

We test:

$$H_0: \text{Residuals contain a unit root}$$

against:

$$H_1: \text{Residuals are stationary}$$
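
In regression form, these hypotheses can be written as follows. This is a sketch of the standard augmented Dickey-Fuller regression applied to the residuals; the symbols $p$, $\delta_j$, and $v_t$ are introduced here for illustration, while $a$ follows the notation in the gretl output above.

```latex
\Delta \hat{e}_t = (a - 1)\, \hat{e}_{t-1} + \sum_{j=1}^{p} \delta_j \, \Delta \hat{e}_{t-j} + v_t,
\qquad
H_0: a = 1 \quad \text{vs.} \quad H_1: a < 1
```

With the command adf 1 uhat, the lag order is $p = 1$. The reported statistic tau_c comes from the variant of this regression that also includes a constant.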

20.7 Interpretation¶

If residuals are stationary:

deviations from equilibrium are temporary
variables move together in the long run
the regression is not spurious

20.8 Why Residual Stationarity Matters¶

Suppose:

$$e_t = y_t - \beta x_t$$

is stationary.

Then although:

$x_t$ may drift
$y_t$ may drift

the deviation between them, $e_t$, is mean-reverting with finite variance, so the two series cannot drift arbitrarily far apart.
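
A simple special case shows the difference. Suppose, purely as an illustrative assumption, that the deviation follows a first-order autoregression with coefficient $a$ and shock variance $\sigma^2$:

```latex
e_t = a\, e_{t-1} + v_t, \quad v_t \sim \text{iid}(0, \sigma^2)
\;\Longrightarrow\;
\operatorname{Var}(e_t) =
\begin{cases}
\dfrac{\sigma^2}{1 - a^2} & \text{if } |a| < 1 \text{ (stationary)} \\[2ex]
t\,\sigma^2 & \text{if } a = 1 \text{ and } e_0 = 0 \text{ (random walk)}
\end{cases}
```

In the stationary case the variance settles at a fixed value, so deviations stay within a stable range. With a unit root the variance grows without limit as $t$ increases, which is exactly the spurious-regression situation.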

20.9 Important Caveats¶

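The Engle–Granger procedure has several limitations.

The residuals are estimated rather than observed, so the usual Dickey-Fuller critical values are too lenient. The residual-based test needs its own, more negative critical values. With two variables and a constant, the asymptotic 5% critical value is roughly −3.34, so the statistic of −3.04 reported above is weaker evidence of cointegration than its p-value of 0.03 suggests.
The result can depend on which variable is placed on the left-hand side of the first-step regression.
The procedure can identify at most one cointegrating relationship.

In multivariate systems, more advanced methods may be preferable.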

20.10 Cointegration and Dynamic Models¶

Cointegration naturally connects to the ARDL framework from Chapter 18.

Recall the ARDL model:

$$y_t = \alpha + \phi y_{t-1} + \beta_0 x_t + \beta_1 x_{t-1} + u_t$$

This model contains both:

short-run dynamics
long-run structure
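
The long-run structure can be made explicit with a short worked derivation. As a sketch, assume $|\phi| < 1$ and set the shock to zero; in a steady state $y_t = y_{t-1} = y^*$ and $x_t = x_{t-1} = x^*$ (the starred symbols are introduced here for illustration).

```latex
\begin{aligned}
y^* &= \alpha + \phi\, y^* + \beta_0 x^* + \beta_1 x^* \\
(1 - \phi)\, y^* &= \alpha + (\beta_0 + \beta_1)\, x^* \\
y^* &= \frac{\alpha}{1 - \phi} + \frac{\beta_0 + \beta_1}{1 - \phi}\, x^*
\end{aligned}
```

The ratio $(\beta_0 + \beta_1)/(1 - \phi)$ is the long-run effect of $x$ on $y$, while $\beta_0$ alone is the immediate (short-run) effect.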

20.11 Cointegration via ARDL (Bounds Testing)¶

The ARDL bounds approach provides an alternative to Engle–Granger.

An important advantage is flexibility: the regressors may be $I(0)$, $I(1)$, or a mixture of the two, as long as none are $I(2)$.

20.12 From ARDL to ECM¶

Consider:

$$y_t = \alpha + \phi y_{t-1} + \beta_0 x_t + \beta_1 x_{t-1} + u_t$$

This can be rewritten as:

$$\Delta y_t = \alpha + \gamma \Delta x_t + \lambda_1 y_{t-1} + \lambda_2 x_{t-1} + u_t$$
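
The rewriting takes two steps: subtract $y_{t-1}$ from both sides of the ARDL equation, then add and subtract $\beta_0 x_{t-1}$. Carrying the intercept $\alpha$ along, the algebra is:

```latex
\begin{aligned}
\Delta y_t &= \alpha + (\phi - 1)\, y_{t-1} + \beta_0 x_t + \beta_1 x_{t-1} + u_t \\
           &= \alpha + \beta_0 \Delta x_t + (\phi - 1)\, y_{t-1} + (\beta_0 + \beta_1)\, x_{t-1} + u_t
\end{aligned}
\qquad\Longrightarrow\qquad
\gamma = \beta_0, \quad \lambda_1 = \phi - 1, \quad \lambda_2 = \beta_0 + \beta_1
```

When $\lambda_1 \neq 0$, the level terms can be grouped as $\lambda_1 y_{t-1} + \lambda_2 x_{t-1} = \lambda_1 (y_{t-1} - \theta x_{t-1})$ with $\theta = -\lambda_2 / \lambda_1$ (a symbol introduced here), so $\lambda_1$ measures how strongly $y$ responds to last period's deviation from the long-run relationship.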

20.13 The Bounds Test¶

We test:

$$H_0: \lambda_1 = \lambda_2 = 0$$

against:

$$H_1: \text{At least one coefficient is nonzero}$$

Decision Rule¶

F-statistic	Conclusion
below lower bound	no cointegration
above upper bound	cointegration
between bounds	inconclusive

20.14 Implementing ARDL Bounds Testing in GRETL¶

Step 1: Estimate an ARDL Model¶

Model → Time series → ARDL

Choose:

dependent variable
regressors
lag lengths

[GRETL Screenshot Placeholder: ARDL specification window]

Step 2: Perform Bounds Test¶

From the ARDL output window:

select bounds test option

[GRETL Screenshot Placeholder: Bounds test output]

Example Command¶

ardl 2 2 y x

Then:

ecm
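
The menu entry and commands above may depend on what is installed in your copy of gretl. As a hedged alternative, the test regression from Section 20.12 can be estimated with core commands only. The sketch below assumes two series named y and x are already in the dataset and uses one lag of each; the names and lag length are placeholders.

```gretl
# hypothetical series y and x; one lag of each, as in the equation in 20.12
diff y x                       # creates d_y and d_x
lags 1 ; y x                   # creates y_1 and x_1
ols d_y const d_x y_1 x_1      # unrestricted error-correction regression
omit y_1 x_1 --test-only       # F-test of lambda_1 = lambda_2 = 0
```

Only the F-statistic from the last command should be used. The p-value printed alongside it comes from the standard F distribution, which does not apply here; the statistic has to be compared with the lower and upper bounds from the published bounds-test tables.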

20.15 Comparing Engle–Granger and ARDL¶

Feature	Engle–Granger	ARDL Bounds
Requires all variables $I(1)$	Yes	No
Residual-based	Yes	No
Dynamic model based	No	Yes
Allows mixed $I(0)/I(1)$ variables	No	Yes

20.16 Common Mistakes¶

20.17 Looking Ahead¶

Cointegration tells us that a long-run equilibrium relationship exists.

But how do variables adjust when they deviate from equilibrium?

This leads naturally to the Error Correction Model (ECM).

Key Takeaways¶

Concept Check¶

Basic¶

What is cointegration?
What does it mean for two variables to be $I(1)$?
What does it mean for a linear combination of variables to be $I(0)$?

Intuition¶

Why can two nonstationary variables still have a meaningful relationship?
What is meant by a long-run equilibrium?
Explain the “rubber band” analogy for cointegration.

Spurious vs Cointegration¶

What distinguishes a spurious regression from a cointegrated relationship?
Why is a high $R^2$ not sufficient evidence of cointegration?
What role do residuals play in diagnosing cointegration?

Engle–Granger Procedure¶

What are the two steps in the Engle–Granger method?
What is the null hypothesis in the residual-based test?
What does it mean to reject the null hypothesis?

ARDL and Bounds Testing¶

How does the ARDL bounds approach differ from Engle–Granger?
What is the key hypothesis tested in the bounds test?

Challenge¶

Can cointegration exist if one variable is $I(0)$ and the other is $I(1)$?

Interpretation & Practice¶

A regression between two variables produces a high $R^2$, significant coefficients, and nonstationary residuals. What does this imply?
Residuals from a regression are stationary. What does this suggest?
Two variables are both $I(1)$, but their difference is stationary. What does this imply?
An ADF test on the residuals gives a p-value of 0.02. What is your conclusion?
An ADF test on the residuals gives a p-value of 0.60. What is your conclusion?

ARDL Interpretation¶

In an ARDL model, the lagged level terms are jointly significant. What does this imply?
The bounds test F-statistic is above the upper bound. What is your conclusion?

Economic Interpretation¶

Consumption and income are cointegrated. What does this imply about their relationship?

Challenge¶

A regression is significant in levels but insignificant in differences. What might this suggest?

Numerical Practice¶

Residual-Based Logic¶

Suppose:

$x_t \sim I(1)$
$y_t \sim I(1)$
residuals $\hat{e}_t \sim I(0)$

What is your conclusion?

ADF Interpretation Table¶

Consider:

Series	ADF p-value
$x_t$	0.85
$y_t$	0.78
residuals	0.03

Are $x_t$ and $y_t$ stationary?
Are residuals stationary?
What does this imply?

Now consider:

Series	ADF p-value
$x_t$	0.90
$y_t$	0.88
residuals	0.72

What is your conclusion?

Engle–Granger¶

Explain why testing residuals is central to the Engle–Granger procedure.

Bounds Test¶

Suppose:

F-statistic = 6.5
upper bound = 5.0

What is your conclusion?

Interpretation¶

Suppose cointegration exists.

What does this imply about long-run behavior?

Challenge¶

Suppose two variables are cointegrated.

What happens if they deviate from equilibrium?
What concept does this lead to?

You regress:

exchange rate
price level

You find:

strong relationship
stationary residuals

What does this imply?
Why is this not spurious?

Appendix 20A — The ARDL Bounds Test (Conceptual Overview)¶

This appendix provides a simplified explanation of the ARDL bounds approach.

A.1 Starting Point¶

Consider:

$$\Delta y_t = \alpha + \gamma \Delta x_t + \lambda_1 y_{t-1} + \lambda_2 x_{t-1} + u_t$$

A.2 The Key Question¶

Do the lagged level terms matter?

That is:

$$H_0: \lambda_1 = \lambda_2 = 0$$

A.3 Interpretation¶

If both coefficients are zero:

no long-run relationship exists

If at least one coefficient is nonzero:

long-run equilibrium exists

A.4 Why Two Critical Values?¶

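The asymptotic distribution of the F-statistic depends on whether the regressors are:

stationary ($I(0)$)
nonstationary ($I(1)$)

Since the order of integration is usually not known with certainty, the bounds approach provides two sets of critical values:

lower critical values, computed as if all regressors were $I(0)$
upper critical values, computed as if all regressors were $I(1)$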

A.5 Decision Rule¶

Below lower bound → no cointegration
Above upper bound → cointegration
Between bounds → inconclusive
