Chapter 20 — Cointegration and Long-Run Relationships
In Chapter 17, we saw that regressions involving nonstationary variables can produce spurious results.
In Chapter 18, we introduced dynamic models such as ARDL, which capture short-run dependence and adjustment dynamics.
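An important question now arises: can a regression between nonstationary variables ever be meaningful?
The answer is yes, provided the variables share a stable long-run relationship.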
This idea is called cointegration.
Learning Objectives¶
By the end of this chapter, you should be able to:
explain the idea of cointegration
distinguish spurious regression from cointegrated relationships
understand long-run equilibrium
implement the Engle–Granger procedure in GRETL
interpret residual stationarity
understand the logic of the ARDL bounds test
20.1 Motivation: Spurious vs Meaningful Relationships¶
Recall from Chapter 17:
If we regress two unrelated nonstationary variables,
we may obtain:
high $R^2$
significant t-statistics
apparently strong relationships
even when the variables are unrelated.
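A small Monte Carlo experiment can make this concrete. The sketch below is an illustration outside Gretl, using Python with NumPy; the sample size, number of replications, seed, and the 1.96 cutoff are all assumptions chosen for the example. It regresses one simulated series on another, unrelated one and counts how often the slope looks "significant", first for independent random walks and then for independent white noise.

```python
import numpy as np


def slope_t_stat(y, x):
    """t-statistic of the slope in an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se_slope = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se_slope


def rejection_rate(random_walk, n_reps=500, n_obs=100, seed=0):
    """Share of replications in which |t| > 1.96 for two unrelated series."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_reps):
        e1 = rng.standard_normal(n_obs)
        e2 = rng.standard_normal(n_obs)
        # Cumulating white noise turns it into a random walk (nonstationary).
        y = np.cumsum(e1) if random_walk else e1
        x = np.cumsum(e2) if random_walk else e2
        if abs(slope_t_stat(y, x)) > 1.96:
            rejections += 1
    return rejections / n_reps


rw_rate = rejection_rate(random_walk=True)
wn_rate = rejection_rate(random_walk=False)
print(f"independent random walks: {rw_rate:.2f} of slopes look significant")
print(f"independent white noise:  {wn_rate:.2f} of slopes look significant")
```

With stationary series the rejection share should stay near the nominal 5%; with random walks it is typically far higher, even though the series are unrelated by construction.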
However, some nonstationary variables genuinely move together because they are linked by economic forces.
Examples include:
consumption and income
exchange rates and prices
interest rates and inflation
prices in related financial markets
20.2 What Is Cointegration?¶
Suppose:
$y_t$ is nonstationary
$x_t$ is nonstationary
but a particular linear combination is stationary.
Then the variables are cointegrated.
Formally:
$$y_t \sim I(1), \qquad x_t \sim I(1)$$
but:
$$y_t - \beta_0 - \beta_1 x_t \sim I(0)$$
20.3 Intuition: Long-Run Equilibrium¶
Even if two variables individually behave like random walks, their difference may remain stable.
Cointegration means that while variables may drift over time, they do not drift arbitrarily far apart.
This suggests that some equilibrium force ties them together.
Examples:
consumption cannot permanently diverge from income
exchange rates and relative prices remain linked in the long run
stock prices of related firms may move together over time
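The idea that two drifting series can stay close together is easy to see in simulated data. The following sketch is a Python/NumPy illustration outside Gretl under assumed settings (1000 observations, unit-variance shocks, an arbitrary seed). It builds a random walk `x`, a series `y` that equals `x` plus stationary noise, and an independent random walk `z`, then compares the spread of each pair.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs = 1000

x = np.cumsum(rng.standard_normal(n_obs))  # random walk
y = x + rng.standard_normal(n_obs)         # tied to x: differs only by stationary noise
z = np.cumsum(rng.standard_normal(n_obs))  # independent random walk, not tied to x

spread_linked = y - x      # stationary by construction
spread_unlinked = z - x    # itself a random walk

print(f"std of x:                {np.std(x):.2f}")
print(f"std of y - x (linked):   {np.std(spread_linked):.2f}")
print(f"std of z - x (unlinked): {np.std(spread_unlinked):.2f}")
```

The linked spread has a standard deviation close to that of the noise term, while the unlinked spread wanders as widely as the series themselves.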
20.4 Spurious Regression vs Cointegration¶
This distinction is fundamental.
| Case | Residuals | Interpretation |
|---|---|---|
| Spurious regression | Nonstationary | No meaningful relationship |
| Cointegration | Stationary | Long-run equilibrium exists |
If residuals are stationary, the regression may be meaningful despite nonstationarity in the original variables.
20.5 The Engle–Granger Two-Step Procedure¶
We now describe the classic Engle–Granger procedure for testing cointegration.
We use quarterly GDP data from Gretl.
Step 1: Load the Data¶
Menu¶
File → Open data → Sample file...
Select:
gdp from the POE 4th ed. database.
The dataset contains:
usa: real GDP of the USA
aus: real GDP of Australia
Step 2: Estimate the Long-Run Relationship¶
Estimate:
$$\text{aus}_t = \beta_0 + \beta_1\,\text{usa}_t + u_t$$
Gretl Command¶
ols aus const usa
Output¶
Model 1: OLS, using observations 1970:1-2000:4 (T = 124)
Dependent variable: aus
coefficient std. error t-ratio p-value
---------------------------------------------------------
const −1.07237 0.403225 −2.659 0.0089 ***
usa 1.00099 0.00610028 164.1 5.85e-145 ***
Mean dependent var 62.72528
R-squared 0.995489
Durbin-Watson 0.272654
Step 3: Extract the Residuals¶
Save the residuals:
series uhat = $uhat
[GRETL Screenshot Placeholder: Residual series]
Step 4: Test Residual Stationarity¶
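We now test whether the residuals are stationary.
This is the crucial step.
Because uhat is estimated rather than observed, the test statistic should be compared with Engle–Granger critical values, which are more negative than the standard Dickey–Fuller ones (roughly −3.34 rather than −2.86 at the 5% level for two variables and a constant). The p-value printed by a plain adf run on uhat is therefore only a rough guide; Gretl's coint command carries out both steps and applies the appropriate distribution.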
Menu¶
Select uhat.
Then:
Variable → Unit root tests → Augmented Dickey-Fuller
Gretl Command¶
adf 1 uhat
Example Output¶
Augmented Dickey-Fuller test for uhat
unit-root null hypothesis: a = 1
test statistic: tau_c(1) = -3.03875
asymptotic p-value 0.03145
20.6 Hypotheses¶
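We test:
$$H_0: \hat{u}_t \sim I(1) \quad \text{(residuals have a unit root: no cointegration)}$$
against:
$$H_1: \hat{u}_t \sim I(0) \quad \text{(residuals are stationary: cointegration)}$$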
20.7 Interpretation¶
If residuals are stationary:
deviations from equilibrium are temporary
variables move together in the long run
the regression is not spurious
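The two-step logic of section 20.5 can also be sketched outside Gretl. The Python/NumPy code below is a minimal illustration, not Gretl's implementation: the helper names (`ols`, `adf_tau`, `engle_granger_tau`), the simulated data, and the approximate 5% critical value of −3.34 (two variables, constant included) are assumptions made for the example. Step 1 estimates the levels regression; step 2 runs an augmented Dickey–Fuller regression on the residuals and returns the t-ratio on the lagged residual.

```python
import numpy as np


def ols(y, X):
    """Return OLS coefficients and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta


def adf_tau(u, lags=1):
    """t-ratio on u[t-1] in: du[t] = c + g*u[t-1] + sum_k a_k*du[t-k] + e[t]."""
    du = np.diff(u)
    dep = du[lags:]
    cols = [np.ones_like(dep), u[lags:-1]]
    for k in range(1, lags + 1):
        cols.append(du[lags - k:-k])  # k-th lagged difference
    X = np.column_stack(cols)
    beta, resid = ols(dep, X)
    sigma2 = resid @ resid / (len(dep) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])


def engle_granger_tau(y, x, lags=1):
    """Step 1: regress y on a constant and x. Step 2: ADF statistic of the residuals."""
    X = np.column_stack([np.ones_like(x), x])
    _, uhat = ols(y, X)
    return adf_tau(uhat, lags)


rng = np.random.default_rng(0)
n_obs = 300
x = np.cumsum(rng.standard_normal(n_obs))              # I(1) regressor
y_coint = 2.0 + 1.0 * x + rng.standard_normal(n_obs)   # cointegrated with x
y_indep = np.cumsum(rng.standard_normal(n_obs))        # unrelated random walk

EG_CRIT_5PCT = -3.34  # assumed approximate asymptotic Engle-Granger value

tau_coint = engle_granger_tau(y_coint, x)
tau_indep = engle_granger_tau(y_indep, x)
print(f"cointegrated pair: tau = {tau_coint:.2f}")
print(f"independent pair:  tau = {tau_indep:.2f}")
print(f"reject 'no cointegration' if tau < {EG_CRIT_5PCT}")
```

For the cointegrated pair the statistic should fall far below the critical value, so the unit-root null for the residuals is rejected; for the unrelated pair it usually does not.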
20.8 Why Residual Stationarity Matters¶
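Suppose:
$$u_t = y_t - \beta_0 - \beta_1 x_t$$
is stationary.
Then although:
$y_t$ may drift
$x_t$ may drift
their deviations from equilibrium are temporary: $u_t$ keeps returning to its mean instead of wandering without limit.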
20.9 Important Caveats¶
The Engle–Granger procedure has several limitations.
In multivariate systems, more advanced methods may be preferable.
20.10 Cointegration and Dynamic Models¶
Cointegration naturally connects to the ARDL framework from Chapter 18.
Recall the ARDL model:
$$y_t = \alpha + \sum_{i=1}^{p} \phi_i y_{t-i} + \sum_{j=0}^{q} \delta_j x_{t-j} + \varepsilon_t$$
This model contains both:
short-run dynamics
long-run structure
20.11 Cointegration via ARDL (Bounds Testing)¶
The ARDL bounds approach provides an alternative to Engle–Granger.
An important advantage is flexibility: the variables may be $I(0)$, $I(1)$, or a mixture of the two,
as long as none are $I(2)$.
20.12 From ARDL to ECM¶
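Consider the ARDL(1,1) model:
$$y_t = \alpha + \phi y_{t-1} + \delta_0 x_t + \delta_1 x_{t-1} + \varepsilon_t$$
Subtracting $y_{t-1}$ from both sides, then adding and subtracting $\delta_0 x_{t-1}$, this can be rewritten as:
$$\Delta y_t = \alpha + \delta_0 \Delta x_t + \lambda_1 y_{t-1} + \lambda_2 x_{t-1} + \varepsilon_t$$
where $\lambda_1 = \phi - 1$ and $\lambda_2 = \delta_0 + \delta_1$.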
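The rewriting is purely algebraic, so OLS estimates of the two forms must agree exactly. The sketch below checks this numerically in Python with NumPy, outside Gretl. It assumes an ARDL(1,1) model $y_t = \alpha + \phi y_{t-1} + \delta_0 x_t + \delta_1 x_{t-1} + \varepsilon_t$ and its error-correction form $\Delta y_t = \alpha + \delta_0 \Delta x_t + \lambda_1 y_{t-1} + \lambda_2 x_{t-1} + \varepsilon_t$; the symbols, parameter values, and seed are hypothetical choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n_obs = 400
alpha, phi, delta0, delta1 = 0.5, 0.7, 0.4, -0.1  # hypothetical true values

# Simulate an I(1) regressor and an ARDL(1,1) process for y.
x = np.cumsum(rng.standard_normal(n_obs))
eps = rng.standard_normal(n_obs)
y = np.zeros(n_obs)
for t in range(1, n_obs):
    y[t] = alpha + phi * y[t - 1] + delta0 * x[t] + delta1 * x[t - 1] + eps[t]

y_t, y_lag = y[1:], y[:-1]
x_t, x_lag = x[1:], x[:-1]
const = np.ones_like(y_t)

# ARDL form: y_t on const, y_{t-1}, x_t, x_{t-1}
X_ardl = np.column_stack([const, y_lag, x_t, x_lag])
b_ardl, *_ = np.linalg.lstsq(X_ardl, y_t, rcond=None)
phi_hat, d0_hat, d1_hat = b_ardl[1], b_ardl[2], b_ardl[3]

# ECM form: dy_t on const, dx_t, y_{t-1}, x_{t-1}
X_ecm = np.column_stack([const, x_t - x_lag, y_lag, x_lag])
b_ecm, *_ = np.linalg.lstsq(X_ecm, y_t - y_lag, rcond=None)
lam1_hat, lam2_hat = b_ecm[2], b_ecm[3]

# Implied long-run effect of x on y: -(lambda_2 / lambda_1)
theta_hat = -lam2_hat / lam1_hat

print(f"lambda_1 = {lam1_hat:.4f}  vs  phi - 1 = {phi_hat - 1:.4f}")
print(f"lambda_2 = {lam2_hat:.4f}  vs  delta_0 + delta_1 = {d0_hat + d1_hat:.4f}")
print(f"long-run coefficient = {theta_hat:.3f}")
```

With the values assumed here the true long-run coefficient is $(\delta_0 + \delta_1)/(1 - \phi) = 0.3/0.3 = 1$, and the estimate should be close to that.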
20.13 The Bounds Test¶
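We test whether the coefficients on the lagged levels $y_{t-1}$ and $x_{t-1}$, written here as $\lambda_1$ and $\lambda_2$, are jointly zero:
$$H_0: \lambda_1 = \lambda_2 = 0$$
against:
$$H_1: \lambda_1 \neq 0 \ \text{or} \ \lambda_2 \neq 0$$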
Decision Rule¶
| F-statistic | Conclusion |
|---|---|
| below lower bound | no cointegration |
| above upper bound | cointegration |
| between bounds | inconclusive |
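The decision rule in the table is a three-way comparison, which the short Python sketch below spells out. The function name `bounds_test_decision` and the numerical bounds in the usage lines are hypothetical; actual critical bounds depend on the number of regressors, the deterministic terms, and the significance level, and must be taken from published tables or software output.

```python
def bounds_test_decision(f_stat, lower_bound, upper_bound):
    """Apply the bounds-test decision rule to an F-statistic."""
    if lower_bound > upper_bound:
        raise ValueError("lower_bound must not exceed upper_bound")
    if f_stat < lower_bound:
        return "no cointegration"
    if f_stat > upper_bound:
        return "cointegration"
    return "inconclusive"


# Hypothetical bounds, for illustration only.
print(bounds_test_decision(6.5, lower_bound=4.0, upper_bound=5.0))
print(bounds_test_decision(4.5, lower_bound=4.0, upper_bound=5.0))
print(bounds_test_decision(2.0, lower_bound=4.0, upper_bound=5.0))
```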
20.14 Implementing ARDL Bounds Testing in GRETL¶
Step 1: Estimate an ARDL Model¶
Menu¶
Model → Time series → ARDL
Choose:
dependent variable
regressors
lag lengths
[GRETL Screenshot Placeholder: ARDL specification window]
Step 2: Perform Bounds Test¶
From the ARDL output window:
select bounds test option
[GRETL Screenshot Placeholder: Bounds test output]
Example Command¶
ardl 2 2 y x
Then:
ecm
20.15 Comparing Engle–Granger and ARDL¶
| Feature | Engle–Granger | ARDL Bounds |
|---|---|---|
| Requires all variables to be $I(1)$ | Yes | No |
| Residual-based | Yes | No |
| Dynamic model based | No | Yes |
| Allows mixed $I(0)$/$I(1)$ variables | No | Yes |
20.16 Common Mistakes¶
20.17 Looking Ahead¶
Cointegration tells us that a long-run equilibrium relationship exists.
But how do variables adjust when they deviate from equilibrium?
This leads naturally to the Error Correction Model (ECM).
Key Takeaways¶
Concept Check¶
Basic¶
What is cointegration?
What does it mean for two variables to be $I(1)$?
What does it mean for a linear combination of variables to be $I(0)$?
Intuition¶
Why can two nonstationary variables still have a meaningful relationship?
What is meant by a long-run equilibrium?
Explain the “rubber band” analogy for cointegration.
Spurious vs Cointegration¶
What distinguishes a spurious regression from a cointegrated relationship?
Why is a high $R^2$ not sufficient evidence of cointegration?
What role do residuals play in diagnosing cointegration?
Engle–Granger Procedure¶
What are the two steps in the Engle–Granger method?
What is the null hypothesis in the residual-based test?
What does it mean to reject the null hypothesis?
ARDL and Bounds Testing¶
How does the ARDL bounds approach differ from Engle–Granger?
What is the key hypothesis tested in the bounds test?
Challenge¶
Can cointegration exist if one variable is $I(0)$ and the other is $I(1)$?
Interpretation & Practice¶
A regression between two variables produces:
high $R^2$
significant coefficients
nonstationary residuals
What does this imply?
Residuals from a regression are stationary.
What does this suggest?
Two variables are both $I(1)$, but their difference is stationary.
What does this imply?
ADF test on residuals gives p-value = 0.02.
What is your conclusion?
ADF test on residuals gives p-value = 0.60.
What is your conclusion?
ARDL Interpretation¶
In an ARDL model, lagged level terms are jointly significant.
What does this imply?
Bounds test F-statistic is above the upper bound.
What is your conclusion?
Economic Interpretation¶
Consumption and income are cointegrated.
What does this imply about their relationship?
Challenge¶
A regression is significant in levels but insignificant in differences.
What might this suggest?
Numerical Practice¶
Residual-Based Logic¶
Suppose:
$y_t \sim I(1)$
$x_t \sim I(1)$
residuals $\sim I(0)$
What is your conclusion?
ADF Interpretation Table¶
Consider:
| Series | ADF p-value |
|---|---|
| $y_t$ | 0.85 |
| $x_t$ | 0.78 |
| residuals | 0.03 |
Are $y_t$ and $x_t$ stationary?
Are residuals stationary?
What does this imply?
Now consider:
| Series | ADF p-value |
|---|---|
| $y_t$ | 0.90 |
| $x_t$ | 0.88 |
| residuals | 0.72 |
What is your conclusion?
Engle–Granger¶
Explain why testing residuals is central to the Engle–Granger procedure.
Bounds Test¶
Suppose:
F-statistic = 6.5
upper bound = 5.0
What is your conclusion?
Interpretation¶
Suppose cointegration exists.
What does this imply about long-run behavior?
Challenge¶
Suppose two variables are cointegrated.
What happens if they deviate from equilibrium?
What concept does this lead to?
You regress:
exchange rate
price level
You find:
strong relationship
stationary residuals
What does this imply?
Why is this not spurious?
Appendix 20A — The ARDL Bounds Test (Conceptual Overview)¶
This appendix provides a simplified explanation of the ARDL bounds approach.
A.1 Starting Point¶
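Consider:
$$\Delta y_t = \alpha + \delta_0 \Delta x_t + \lambda_1 y_{t-1} + \lambda_2 x_{t-1} + \varepsilon_t$$
A.2 The Key Question¶
Do the lagged level terms matter?
That is:
$$H_0: \lambda_1 = \lambda_2 = 0$$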
A.3 Interpretation¶
If both coefficients are zero:
no long-run relationship exists
If at least one coefficient is nonzero:
long-run equilibrium exists
A.4 Why Two Critical Values?¶
The asymptotic distribution depends on whether variables are:
stationary ($I(0)$)
nonstationary ($I(1)$)
The bounds approach therefore provides:
lower critical values
upper critical values
A.5 Decision Rule¶
Below lower bound → no cointegration
Above upper bound → cointegration
Between bounds → inconclusive