Using PcGive to do econometrics work, due in 48 hours.

This presentation should focus on one company in a single sector.


Use a VAR model.

Use PcGive to produce at least two graphs.

It should not be too long, probably about three pages.

There is also feedback from the professor that you can check.


It should be done within 48 hours.

Ahumada-1992June-JPM-v14n3-a

The Demand for Currency in Argentina

Hildegart Ahumada, Banco Central de la República Argentina
and Instituto Torcuato Di Tella, Buenos Aires, Argentina

This work models money in the complex, highly inflationary environment of Argentina (1977-1988). First, cointegration techniques proposed by Engle and Granger (1987) and extended by Johansen (1988) are applied and show that real cash balances, income, and inflation are cointegrated. Second, data information is used to specify the dynamics of this relationship, following the "general-to-specific" methodology developed by Hendry et al. The model selected appears to be a satisfactory representation of money demand, including being empirically constant over 1985-1988, during which there were major policy changes. However, constant, well-specified inflation or interest rate equations cannot be obtained by inverting the money demand equation, given the results found when testing for super exogeneity.

1. INTRODUCTION

This work models real cash balances in the complex highly inflation-
ary environment of Argentina (1977-1988). First, the paper analyzes
the long-run determinants using an information set which includes inter-
est rates, domestic prices, and transactions volumes. Cointegration
techniques, proposed by Engle and Granger (1987) and extended by Jo-
hansen (1988), are applied to evaluate the long-run hypothesis that real
money, real income, and the inflation rate are cointegrated. Second,
data information is used to specify the dynamics of the model follow-
ing the “general-to-specific” methodology developed by Hendry et al.

The selected model incorporates a linear error correction term and
an asymmetric effect of inflation. It has a suitable economic interpre-
tation and satisfies a range of statistical criteria, so it is considered a
tentative approximation to the underlying data-generating process of

Address correspondence to Hildegart Ahumada, Banco Central de la República Argentina, Reconquista 264, (1003) Buenos Aires, Argentina.

The author wishes to acknowledge useful comments from an anonymous referee and is much indebted to Neil Ericsson for his encouragement and valuable help in improving an earlier version. Remaining errors and the views expressed in the paper are solely the responsibility of the author.

Received March 1991; final draft accepted August 1991.

Journal of Policy Modeling 14(3):335-361 (1992)
© Society for Policy Modeling, 1992

real cash balances. For policy purposes it should be noted that the model remains constant up to the outbreak of hyperinflation. Notwithstanding its constancy, the model is not useful to derive a model of inflation or the interest rate by inversion.

The next section summarizes the money-demand theory and the
Argentine institutions and data. Section 3 describes cointegration tech-
niques and their relation to error correction models. The long-run
hypothesis is evaluated via cointegration techniques in Section 4.
Section 5 reports results of modeling short-run dynamics, and section
6 discusses the model obtained. Section 7 concentrates on exogeneity
issues. Finally, conclusions are stated in Section 8.

2. THEORY, INSTITUTIONS, AND DATA

The basic model of the demand for money includes transactions (Y)
and the opportunity cost of holding real cash balances as explanatory
variables. Total final domestic expenditure (GDP plus imports minus
exports) in real terms is used for Y, since it has proven to be more
useful than other definitions. Both inflation (π) and interest rates (R)
may help measure the opportunity cost.¹

Although usual theories of the demand for money include interest
rates, several problems are found empirically for Argentina. For in-
stance, in periods of regulated interest rates, no records exist for the
differential paid in the black market, which probably has varied with
the levels of restrictions. Friedman (1956) and Cagan (1956) stress the
importance of the effect of inflation on real cash balances when inflation
is high. However, whether inflation dominates the interest rate at high
inflation rates is an empirical issue. In the long run, nevertheless,
inflation and interest rates are supposed to move together, in a rela-
tionship similar to that of the Fisher hypothesis. At high inflation rates,
deviations between them seem to be negligible in the long run, although
substantial deviations may occur in the short run.

In the last decade, the Argentine economic authorities tried different
strategies to moderate inflation. These attempts included price controls,
fixed interest rates, and fixed exchange rates, alternated with tough
monetary policies which derived mainly from high compulsory reserve
requirements, aimed at increasing the spread of domestic over foreign
asset returns. As a consequence, it is sometimes believed that data

¹See appendix for data definitions and sources. Unless otherwise indicated, capital letters denote the generic names while logs of the series are in lower case. The price index (P) is the CPI and π_t = (p_t − p_{t−1}). The interest rate is the rate on savings and enters the model as r = ln(1 + R).

Figure 1. Real cash balances (m − p)_t.

may not always reflect the underlying behavior of monetary variables
and that it is not possible to find constant econometric relationships
even over a few years. The model developed in Sections 4-7 provides
evidence against these views.

Before modeling the data, we should consider its basic statistical
properties. Figure 1 shows monthly real cash balances (or "money")
from 1977 to 1988. That spans the period from the monetary reform that
ended the system of nationalized deposits until the appearance of hyper-
inflation at the beginning of 1989. Real holdings of money, m − p, de-
clined during the early 1980s, but increased sharply after the reform of
June 1985, the Austral Plan. That reform included a combination of or-
thodox and heterodox policies and is associated with the main recuper-
ation of real cash money holdings in the sample period. It was followed
by a new demonetization period with partial recoveries in 1987 and 1988
until August 1988, when the "Primavera" plan was launched.

Figure 2 charts real money with inflation, in which the former has
been transformed to a pseudo-velocity measure [−(m − p − 0.5y)].
The figure suggests a close relation between them. Also, −(m − p
− 0.5y) looks very similar to the inversion of m − p in Figure 1,
implying that the transactions are not moving much relative to

Figure 2. The inflation rate π_t and the pseudo-velocity −(m − p − 0.5y)_t, matched by ranges.

m − p. The transactions' behavior can be observed in Figure 3. Both
transactions and real balances also show seasonal behavior; that of
money holdings is mainly associated with the two complementary
payments received in July and December by all wage earners.

3. DYNAMIC SPECIFICATION: COINTEGRATION AND
ERROR-CORRECTION MECHANISMS

This section discusses the relation of the dynamic error-correction
(EC) model to the “general-to-specific” modeling approach and to
cointegration.

A common practice in applied econometrics has been to estimate
regressions in differences of the variables to avoid the spurious cor-
relation problem of trending series discussed by Granger and Newbold
(1977). However, this kind of dynamic model is unable to display
long-run behavior, e.g., to solve for the levels of the variables when
these grow at constant rates. For that reason, among others, other
authors have attained stationarity through error-correction models,
which not only encompass models in differences, but also have a

suitable economic interpretation. For the first versions, see Phillips
(1957) and Sargan (1964), and for more recent versions, see Davidson
et al. (1978) and Hendry and Mizon (1978), among others.

Figure 3. The volume of transactions y_t.

For a linear single-equation model with two variables, x and y, and
one lag, the EC representation is

Δy_t = β_0 + β_1Δx_t − (1 − β_3)(y − λx)_{t−1} + e_t,   e_t ~ N(0, σ²_e).   (1)

In practice x and y often are logs of the underlying economic series,
denoted X and Y. The rate of growth of Y_t depends not only on that
of X_t (short-run impact), but also on the past disequilibrium (y −
λx)_{t−1}. If the long-run equilibrium is static (t = t − 1 and Δy = Δx
= 0), then (1) implies Y = K·X^λ, where ln K = β_0/(1 − β_3) (which
simplifies to proportionality for λ = 1).

For steady-state growth with Δy = Δx = g, Equation 1 implies Y
= K*·X^λ, where ln K* = (β_0 + (β_1 − 1)g)/(1 − β_3). The economic

relevance of a model like Equation 1 follows because many economic
theories suggest long-run proportionality, e.g., the Permanent Income
Hypothesis and the Quantity Theory of Money. Furthermore, Equation
1 can be derived from a certain kind of optimizing behavior, with
agents responding to their past disequilibrium. For instance, growth


in Y_t will be greater than β_1 times the growth in X_t if Y_{t−1} was less
than its long-run desired value. Thus, this representation is more gen-
eral than that of differences models while still obtaining stationarity
in the variables included. For a comparison with other dynamic spec-
ifications, see Hendry, Pagan, and Sargan (1984).

In the "general-to-specific" approach to modeling, Equation 1 is
equivalent to the unrestricted dynamic regression of y on x,

y_t = β_0 + β_1x_t + β_2x_{t−1} + β_3y_{t−1} + e_t,   (2)

where λ = (β_1 + β_2)/(1 − β_3). If X and Y are proportional in the
long run, then λ = 1 or, equivalently, β_1 + β_2 + β_3 = 1.
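To make the algebra concrete, the following minimal Python sketch (not part of the original article; the coefficient values are purely illustrative) iterates the regression in Equation 2 with x held fixed and checks that it converges to the static solution Y = K·X^λ with λ = (β₁ + β₂)/(1 − β₃) and ln K = β₀/(1 − β₃).

```python
# Illustrative check (not from the paper) that the ADL(1,1) of Equation 2
# settles on the long-run solution Y = K * X^lambda implied by Equation 1.
import numpy as np

b0, b1, b2, b3 = 0.1, 0.3, 0.2, 0.5   # assumed coefficients, with beta3 < 1
lam = (b1 + b2) / (1 - b3)            # long-run elasticity lambda
lnK = b0 / (1 - b3)                   # ln K for the static equilibrium

x = np.log(2.0)                       # hold x fixed at an arbitrary level
y = 0.0
for _ in range(200):                  # iterate y_t = b0 + b1*x + b2*x + b3*y_{t-1}
    y = b0 + b1 * x + b2 * x + b3 * y

print(np.isclose(y, lnK + lam * x))   # True: static solution y = lnK + lambda*x
```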

The study of long-run relationships between economic time series
variables has been developed further through the cointegration concept
proposed by Engle and Granger (1987).² The basic idea is that indi-
vidual economic time-series variables wander considerably, but certain
linear combinations of the series do not move too far apart from each
other. Economic forces tend to bring those series into line, e.g., as
hypothesized by economic theory. The series in such a relationship
are said to be cointegrated.

More formally, if y_t and x_t are both I(1), then it is possible that a
linear combination of them,

u_t = y_t − λx_t,   (3)

is I(0). If so, y and x are cointegrated with a cointegration vector
(1: −λ)′. The error u_t measures the disequilibrium present between y_t
and x_t. Since it is I(0), u_{t−1} can be used as a regressor in a representation
that includes Δy_t as dependent variable, which is also I(0) (Hendry,
1986).

Two properties of cointegration should be emphasized. First, Engle
and Granger (1987) show that cointegrated series have an EC repre-
sentation and that EC mechanisms imply cointegrated variables. Sec-
ond, Stock (1987) shows that when variables are cointegrated, ordinary
least squares (OLS) estimates of cointegration parameters converge to
their true values more rapidly than with stationary variables. Thus,
Engle and Granger (1987) suggest a two-step estimation approach for
dynamic specification, each step requiring only OLS. In the first step,
λ is estimated by regressing y_t on x_t. In the second step, u_{t−1} (using
the estimated λ) is included as a regressor to explain Δy_t as part of the
systematic dynamics. As a test of cointegration, they propose evaluating whether

²See also the special issue on cointegration of the Oxford Bulletin of Economics and Statistics (1986).

u_t is I(0) after testing that x and y are I(1). Several tests have been proposed,
the most common being the (possibly augmented) Dickey-Fuller
statistic and the Cointegrating Regression Durbin-Watson statistic
(CRDW) (Dickey and Fuller, 1979 and 1981; Sargan and Bhargava,
1983). However, the long-run parameter λ estimated from the static
regression can be severely biased in finite samples, as Banerjee et al.
(1986) demonstrate. In particular, when R² is small in the static
regression, static and dynamic estimates can differ substantially, so they
suggest full dynamic modeling as an alternative.
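As an illustration of the two-step procedure and the residual-based tests just described, here is a minimal Python/statsmodels sketch on simulated data (the article itself used PC-GIVE; the series, sample size, and lag choices below are assumptions, and the ADF statistic on the residuals should be compared with Engle-Yoo critical values rather than the standard Dickey-Fuller ones).

```python
# Sketch of the Engle-Granger two-step procedure on simulated I(1) data.
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
T = 200
x = np.cumsum(rng.normal(size=T))          # an I(1) regressor
y = 0.5 * x + rng.normal(size=T)           # cointegrated with x by construction

# Step 1: static regression y_t = a + lambda*x_t + u_t
step1 = sm.OLS(y, sm.add_constant(x)).fit()
u = step1.resid

# Residual-based tests: ADF on u_t (no deterministic terms) and the CRDW statistic
adf_stat, *_ = adfuller(u, regression="n")
crdw = np.sum(np.diff(u) ** 2) / np.sum((u - u.mean()) ** 2)
print(step1.params, adf_stat, crdw)

# Step 2: the lagged residual enters as the error-correction term for dy_t
dy, dx, ec = np.diff(y), np.diff(x), u[:-1]
step2 = sm.OLS(dy, sm.add_constant(np.column_stack([dx, ec]))).fit()
print(step2.params)
```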

An improved procedure for testing cointegration, allowing for more
than one cointegration vector, is that suggested in Johansen (1988) and
generalized in Johansen and Juselius (1990). Based on the unrestricted
estimation of a system, parameterized in terms of levels and differ-
ences, they propose Likelihood Ratio statistics for testing the number
of cointegration vectors. The coefficient matrix for levels contains
information about long-run relationships between variables in the
data vector. Since its rank is the number of nonzero eigenvalues in a
determinantal equation closely related to estimating the system, the
number of cointegrating vectors is determined by testing how many
of those eigenvalues are nonzero. The cointegrating vectors (denoted
β′) are shown to be a subset of the associated eigenvectors. Further-
more, the associated weighting coefficients (denoted α) can be useful
to evaluate weak exogeneity, because it is lost if a cointegration vector
appears in the conditional and marginal models (Johansen, 1990).
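A hedged sketch of how the Johansen procedure could be run today with statsmodels' coint_johansen on simulated data follows; the three series, the six lags, and the deterministic specification only loosely mirror the setup described in Section 4 and are assumptions, not the paper's data.

```python
# Johansen trace and maximal-eigenvalue statistics on simulated series.
import numpy as np
from statsmodels.tsa.vector_ar.vecm import coint_johansen

rng = np.random.default_rng(1)
T = 300
common = np.cumsum(rng.normal(size=T))               # one common stochastic trend
data = np.column_stack([
    common + rng.normal(size=T),                     # stand-in for "m - p"
    0.5 * common + rng.normal(size=T),               # stand-in for "y"
    -0.3 * common + rng.normal(size=T),              # stand-in for "pi"
])

res = coint_johansen(data, det_order=0, k_ar_diff=6)  # constant, 6 lagged differences
print(res.eig)            # eigenvalues, smallest to largest ordering depends on output
print(res.lr1, res.cvt)   # trace statistics and their critical values
print(res.lr2, res.cvm)   # maximal-eigenvalue statistics and their critical values
print(res.evec[:, 0])     # a cointegrating vector (beta); the weights alpha follow from the VECM
```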

In Section 4, both Engle-Granger and Johansen techniques are ap-
plied to testing a long-run hypothesis about the demand for money in
the Argentine case. Section 5 presents the results of modeling that
function by using the "general-to-specific" approach.³

4. LONG-RUN BEHAVIOR

Given the basic economic model of the demand for money and the
variables’ behavior, real money balances are hypothesized to be coin-
tegrated with the volume of real transactions and the opportunity cost
of holding money, defined as π. Cointegration has been tested with
two different measures of opportunity cost: r and π. The second is
preferred in terms of the stability of the long-run parameters, and so
is the only one presented below.

From Table 1, all variables appear to be I(1). Inflation may be an
exception, as its statistics are very sensitive to the sample chosen.
³Hendry's (1989) PC-GIVE was used for estimation.


Table 1: Tests of the Order of Integration of Individual Variables

               m − p     y       π       r
I(0)  DW        0.23    0.22    0.40    0.18
      ADF(3)   −2.7    −2.6    −3.3    −2.5
I(1)  DW        1.72    2.00    2.27    1.70
      ADF(3)   −7.9    −6.3    −7.9    −6.9

DW and ADF are the Durbin-Watson and augmented Dickey-Fuller statistics (Sargan and
Bhargava, 1983; Dickey and Fuller, 1979, 1981).

Even so, π is considered to be I(1), having the same order of
integration as that of r.⁴
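The DW and ADF(3) statistics of Table 1 could be reproduced for any series with a short helper like the following Python sketch (simulated data; the demeaning used for the DW statistic follows the Sargan-Bhargava convention and is an assumption about the computation).

```python
# Table 1-style order-of-integration checks on levels and first differences.
import numpy as np
from statsmodels.tsa.stattools import adfuller
from statsmodels.stats.stattools import durbin_watson

def integration_stats(series, lags=3):
    dw = durbin_watson(series - series.mean())        # DW of the demeaned series
    adf = adfuller(series, maxlag=lags, autolag=None, regression="c")[0]
    return dw, adf

rng = np.random.default_rng(2)
x = np.cumsum(rng.normal(size=150))                   # an I(1) series
print("levels:      DW=%.2f ADF(3)=%.2f" % integration_stats(x))
print("differences: DW=%.2f ADF(3)=%.2f" % integration_stats(np.diff(x)))
```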

The static Engle-Granger cointegration regression is

m − p = −3.1 + 0.53y_t − 2.3π_t;   (4)

T = 139[77(6) − 88(12)], CRDW = 0.68, DF = −5.3, ADF(13) = −4.0, R² = 0.62.

All the statistics reject the hypothesis of noncointegration at the
critical values provided by Engle and Yoo (1987) for the three-
variable case. The low R2 may imply bias in estimating the long-
run parameters.

To analyze cointegration of these series further, the system-based
procedure from Johansen (1988) and Johansen and Juselius (1990) is
applied to m − p, π, and y with six lags, a constant, and seasonal
dummies. Table 2 lists the eigenvalues from the smallest (closest to a

Table 2: Cointegration Results: Eigenvalues and Related Test Statistics

                 Maximal eigenvalue                Eigenvalue trace
    Eigenvalue   Statistic  95% critical value     Statistic  95% critical value
1     0.030       4.07       8.08                   4.07       8.08
2     0.051       6.96      14.60                  11.02      17.84
3     0.150      21.51*     21.28                  32.53*     31.26

The statistics are defined in Johansen (1988) and Johansen and Juselius (1990), and critical
values are taken from the latter's Table A2.

*Significant at the 95% critical value.

⁴ADF statistics for this set of variables are very sensitive to the sample size, the inclusion or
exclusion of a constant or trend, and the model of seasonality (deterministic or autoregressive).
However, the estimation of the unrestricted model suggests that the assumed order of integration
is appropriate.


Table 3: Cointegration Results: Normalized α and β′ Matrices

Variable    α (weighting matrix)              β′ (cointegrating vectors)
m − p      −0.234  −0.026  −0.034             1.000   2.776  −0.482
π          −0.001  −0.032   0.026            −0.129   1.000   1.048
y           0.069  −0.051  −0.007             0.112  −3.020   1.000

unit root) to the largest (most stationary) and the LR statistics based
on the maximal eigenvalue and on the eigenvalue trace. Table 3 pre-
sents the normalized weighting matrix, α, and the matrix of cointe-
grating vectors, β′.

Both statistics in Table 2 indicate that there is one cointegration
vector. Furthermore, the cointegration vector (row 1 of Table 3) has
a coefficient close to 0.5 on transactions and a coefficient on inflation
of – 2.78, similar to those obtained by the static regression, Equation
4. Note that this remarkable similarity may be due to the large variation
of the data involved. Also note that the cointegration vectors estimated
by both procedures are very close to the square-root law of the Baumol-
Tobin model.

From the weighting values in Table 3, the cointegrating vector enters
the money equation with a coefficient of -0.23 while the money-
demand cointegrating vector does not appear to enter the equations for
π or y. That is necessary for π and y to be weakly exogenous in a
conditional money-demand equation (Johansen, 1990).

5. SHORT-RUN DYNAMICS AND THE HYPOTHESIS
OF ASYMMETRY

Asymmetric effects of inflation on real cash balances have been
hypothesized for Argentina. In empirical studies, that hypothesis has
been tested using a ratchet effect based upon the maximum of past π.
This ratchet reinforces the asymmetric effect of π in a Cagan equation,
as in Piterman ( 1988) and Melnick (1989). Ahumada ( 1989) discusses
some problems of this empirical measure. In particular, the assumption
of irreversibility associated with it ensures the ratchet’s role in the
long-run relationship. From the above cointegration analysis, it seems
empirically possible to ignore this hysteresis effect in the long-run
relationship. This does not preclude asymmetric effects in the short
run, and therefore, separate effects for rising and falling inflation are
included as regressors in modeling short-run dynamics.

To model dynamics and the long run jointly, an unrestricted auto-
regressive distributed lag of m − p on y, π, and r is estimated with
seven successive lags, seasonal lags (12 and 13), and monthly dum-
mies. Furthermore, a dummy variable (denoted by WD, equal to 1 for
April and May 1982) is included in the unrestricted regression to take
into account the jump in the demand for currency during the war conflict
over the Malvinas. No significant loss of information is found in re-
stricting the model to three lags and four seasonal dummies (SD) for
July, August, November, and December. Also, long-run price hom-
ogeneity is not rejected when tested by including p_{t−1} in the lat-
ter regression; its coefficient is not significant. Table 4 presents the
autoregressive-distributed lag for m − p where Δπ_t⁺, Δπ_t⁻, and their
lags are included along with π_{t−1}. The terms Δπ_t⁺ and Δπ_t⁻ denote
max(0, Δπ_t) and min(0, Δπ_t). This parameterization is a generalization
of including π_t and its lags, with the latter imposing coefficient equality
on Δπ⁺_{t−j} and Δπ⁻_{t−j}.
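Constructing the asymmetric terms is mechanical; a minimal pandas sketch (illustrative series and column names, not the paper's data set) is:

```python
# Build dpi_plus = max(0, d pi_t) and dpi_minus = min(0, d pi_t).
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
df = pd.DataFrame({"pi": np.abs(rng.normal(0.1, 0.05, size=144))})  # monthly inflation (simulated)

df["dpi"] = df["pi"].diff()
df["dpi_plus"] = df["dpi"].clip(lower=0)      # max(0, d pi_t): rising inflation
df["dpi_minus"] = df["dpi"].clip(upper=0)     # min(0, d pi_t): falling inflation

# Including dpi_plus and dpi_minus (and their lags) generalizes including pi_t and
# its lags, which would force equal coefficients on the two components.
print(df[["dpi", "dpi_plus", "dpi_minus"]].head())
```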

The solved long-run solution to Table 4 is

m − p = −2.7 + 0.35y − 2.71π + 0.54r + 1.8WD + Σ_i δ_iSD_i.   (5)
        (1.1)  (0.45)  (1.9)   (2.2)   (0.8)

This confirms the results of the last section but with a wider information
set. The coefficient of π is similar to that of Equation 4, whereas that
of r is positive. That supports the view of a long-run relationship
between real balances, transactions, and π alone. The long-run coef-
ficient of y is similar to the Baumol-Tobin elasticity obtained in Equa-
tion 4, albeit with a large standard error.

Sequential reduction from Table 4 results in the following
restricted conditional model:

Δ(m − p)_t = 0.13Δ(m − p)_{t−1} + 0.32Δy_{t−1} − 0.78Δr_t − 1.20Δπ_t⁺ − 0.53Δπ_t⁻
             (0.04)              (0.07)         (0.12)     (0.19)       (0.15)

  − 0.14[m − p − 0.5y − (−3 − 2.3π)]_{t−1} + 0.20WD + Σ_i δ_iSD_i − 0.005;   (6)
    (0.03)                                   (0.02)                  (0.0055)

T = 136[77(9) − 88(12)],

R² = 0.87, σ̂ = 3.4%, BPχ²(13) = 15.9, AR 1-7 F(7,117) = 0.52, ARCH 1-7 F(7,110) = 1.46,
RESET F(1,123) = 0.76, NORM χ²(2) = 4.6, X²_i F(16,107) = 1.98, ENC F(19,104) = 0.96,

where ENC F(q, T − K − q) evaluates q-parameter restrictions relative
to Table 4 (Johnston, 1963). Equation 6 has much simpler dynamics
than Table 4. Equation 6 also imposes the Baumol-Tobin long-run
income elasticity of 1/2, and imposes the long-run inflation elasticity
of −2.3 from Equation 4. The next section discusses the properties
of Equation 6.
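For readers who wish to re-estimate a model of the form of Equation 6, the following Python sketch builds the regressors, including the error-correction term with the imposed long-run solution m − p = −3 + 0.5y − 2.3π, and fits it by OLS. The DataFrame df and its column names are hypothetical; PC-GIVE, not Python, was used in the paper.

```python
# Minimal sketch of an Equation-6-style restricted ECM (illustrative column names).
import pandas as pd
import statsmodels.formula.api as smf

def ecm_frame(df):
    out = pd.DataFrame(index=df.index)
    out["d_mp"] = df["m_p"].diff()
    out["d_mp_1"] = out["d_mp"].shift(1)
    out["d_y_1"] = df["y"].diff().shift(1)
    out["d_r"] = df["r"].diff()
    dpi = df["pi"].diff()
    out["dpi_plus"], out["dpi_minus"] = dpi.clip(lower=0), dpi.clip(upper=0)
    # error-correction term with the imposed long-run coefficients, lagged once
    out["ec_1"] = (df["m_p"] - 0.5 * df["y"] + 3 + 2.3 * df["pi"]).shift(1)
    return out.dropna()

# model = smf.ols("d_mp ~ d_mp_1 + d_y_1 + d_r + dpi_plus + dpi_minus + ec_1",
#                 data=ecm_frame(df)).fit()
# print(model.summary())
```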


Table 4: Autoregressive Distributed Lag Representation for (m − p)_t

(The table reports the coefficient estimates, with standard errors in parentheses, on lags 0, 1, 2, and 3 and their sum Σ_j, for the lags of (m − p), π (entering through Δπ⁺ and Δπ⁻), r and y, the seasonal dummies SD, the war dummy WD, and the constant.)

T = 135[77(10) − 88(12)],

R² = 0.975, σ̂ = 3.48%, BPχ²(16) = 17.8, AR 1-7 F(7,97) = 1.31, ARCH 1-7 F(7,90) = 1.88,
RESET F(1,103) = 0.17, NORM χ²(2) = 4.2, X²_i F(36,67) = 0.71, where σ̂ is the estimated
standard deviation of residuals, standard errors are in parentheses, BPχ²(q) is the Box-Pierce
statistic for qth-order residual autocorrelation, AR 1-q F(q, T−K−q) is the LM statistic for
qth-order residual autocorrelation, ARCH 1-q F is the statistic for qth-order ARCH (Engle, 1982),
NORM χ²(2) is the Jarque-Bera statistic (1980), X²_i F(q, T−K−q−1) is the statistic for
heteroskedasticity quadratic in regressors (White, 1980; Nicholls and Pagan, 1983), and
RESET F(q, T−K−q) is Ramsey's statistic.

6. EVALUATION⁵

Equation 6 has a clear economic interpretation. Real holdings of
currency are determined by transactions and inflation in the long run.

⁵For the different criteria and information sets for model design and evaluation, see Hendry
and Richard (1982), Hendry (1983), and Hendry (1989).


In the short run, agents increase (decrease) their money holdings by
14 percent of the past month's excess demand (supply).

Notwithstanding that disequilibrium effect, inflation has an asymmet-
ric effect on real money, the effect depending on whether inflation is ris-
ing or falling, since both Δπ⁺ and Δπ⁻ enter Equation 6. The
coefficient on Δπ is −1.20 when inflation increases and −0.53 when it falls.
The effect of Δπ⁺ and Δπ⁻ can be interpreted as a part of a for-
ward-looking model of inflation, in which these variables are data-based
predictors of π_{t+1} (Campos and Ericsson, 1988, and Hendry and Erics-
son, 1991b). Furthermore, money holdings depend on contempora-
neous changes in the interest rate and lagged changes in transactions.

From the diagnostic statistics, the residual of the estimated equation
appears to be white noise (BPχ², AR F), homoscedastic (ARCH F;
White's standard errors of the coefficients are also similar to those OLS
estimated), and an innovation (ENC F). The LM statistic based on
the squares of the regressors (X²_i F) indicates some misspecification, per-
haps the need for a nonlinear error correction. A cubic relationship
in û_{t−1} was tried and the derived regressor was included in
the model. This error-correction response is similar to that found for
the United Kingdom (Hendry and Ericsson, 1991a). Although this
representation reduces the LM statistic to F(16,107) = 1.32, it exhibits
parameter instability for the period of increasing monetization after
the economic plan known as "Primavera." The instability seems to
be derived from the real money responses to excess demand, which
are increasing in that cubic function (although it may be appropriate for
the cases of excess supply above a certain level). Even so, the linear
error correction model is preferred, given its much better predictive
performance in the last 3 years.

Parameter constancy is evaluated over three periods, which include
the main attempts at economic stabilization. First, for the last 40
months, 1985(9)-1988(12), the Chow and forecast χ² statistics are
CHOW F(40,84) and χ²(40)/40 = 0.95. These statistics
indicate that the null hypothesis of parameter constancy cannot be
rejected. Figure 4 charts the one-step predictions for this subsample.

An earlier, overlapping sample, 1985(6)-1988(9), includes the in-
troduction of the Austral Plan in June 1985. The corresponding sta-
tistics become CHOW F(40,81) = 1.11 and χ²(40)/40 = 1.39.

Figure 4. Equation 6: One-step-ahead forecasts of Δ(m − p)_t, with ±2 forecast
standard errors.

Although the overall statistics are insignificant, the forecasted change
in real cash balances for June 1985 is somewhat higher than two times
the forecast standard error, with a t ratio of 2.69. The reform took
place on June 15, when a price freeze and a compulsory (downwards)
renegotiation of interest rates were implemented. The reported con-
sumer price index and interest rate statistics registered an abrupt change
only a month later. Thus, this shock appears to reflect a measurement
(timing) error rather than any change in the behavior assumed by
Equation 6. Finally, for the "Primavera" plan in August 1988, these
statistics are CHOW F(5,119) = 1.32 and χ²(5)/5 = 1.72 over
1988(8)-1988(12).
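A Chow-type predictive-failure statistic of the kind reported above can be computed as follows; the arrays, the split point, and the simulated data are illustrative assumptions.

```python
# Chow 'forecast' F test: fit on the pre-break sample, compare with the full-sample fit.
import numpy as np
import statsmodels.api as sm

def chow_forecast_test(y, X, split):
    """Predictive-failure F statistic for the last len(y) - split observations."""
    full = sm.OLS(y, X).fit()
    sub = sm.OLS(y[:split], X[:split]).fit()
    n_fore = len(y) - split
    return ((full.ssr - sub.ssr) / n_fore) / (sub.ssr / sub.df_resid)

rng = np.random.default_rng(4)
X = sm.add_constant(rng.normal(size=(160, 2)))
y = X @ np.array([0.2, 0.5, -0.3]) + rng.normal(scale=0.1, size=160)
print(chow_forecast_test(y, X, split=120))
```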

Parameter constancy is also confirmed by recursive estimation. Fig-
ures 5-9 show recursive estimates of the main coefficients in Equation 6
for the period in which the mentioned reforms took place.⁷ These coef-
ficient estimates are well inside the ex ante standard errors, and the esti-
mates became more precise after the Austral plan, which introduced

⁷For computational reasons, the dependent variable for the recursive estimations is
Δ(m − p)_t − 0.2WD and the (insignificant) constant term is set equal to zero.

Figure 5. Equation 6: Recursive estimates of the coefficient on Δy_{t−1} for a model of
Δ(m − p)_t, with ±2 estimated standard errors.

new information. In addition, from Figure 10, none of the breakpoint
Chow statistics for the sequence [1985(1)-1988(12); 1985(2)-
1988(12); ... 1985(6)-1988(12); ... 1988(11)-1988(12); 1988(12)]
are significant at the 5 percent level. The evaluation just performed is
critical for testing exogeneity, as discussed in the next section.
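Recursive coefficient estimates of the kind plotted in Figures 5-10 can be obtained with statsmodels' RecursiveLS; the sketch below uses simulated data and generic regressors rather than the variables of Equation 6.

```python
# Recursive coefficient paths, from which +/- 2 standard-error bands can be drawn.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
X = sm.add_constant(rng.normal(size=(160, 2)))
y = X @ np.array([0.1, 0.4, -0.2]) + rng.normal(scale=0.1, size=160)

res = sm.RecursiveLS(y, X).fit()
coefs = res.recursive_coefficients.filtered        # (k, nobs) array of recursive estimates
print(coefs.shape)
# res.plot_recursive_coefficient(range(X.shape[1]))  # recursive paths with error bands
```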

7. EXOGENEITY

Weak exogeneity of the current-dated regressors in Equation 6 is
required for its analysis as a single equation to be efficient (Engle et
al., 1983). System estimates with Johansen’s techniques in Section 4
support that weak exogeneity, because the money-demand cointegrat-
ing vector does not appear to enter the equations for π or y.

Moreover, weak exogeneity can be tested as an implication of super
exogeneity, requiring constant parameters in the conditional money-
demand model across different economic regimes. The above statistics
provide a partial proof in that direction, although the nonconstancy of
the marginal processes for inflation and the interest rate (those variables
without lags) should also be tested.

Figure 6. Equation 6: Recursive estimates of the coefficient on Δr_t for a model of
Δ(m − p)_t, with ±2 estimated standard errors.

If the marginal processes of the current-dated variables change while the
conditional model remains constant, then super exogeneity holds. In this case, the Lucas
critique does not apply for the relevant class of interventions (Hendry,
1988). A very appealing aspect of testing super exogeneity is that only
a simple marginal model needs to be nonconstant. Here, univariate
autoregressive (AR) models for inflation and interest rates are estimated
to show their nonconstancy.

The following models are obtained by simplifying AR(13) models
with seasonal dummies:⁸

Δπ_t = −0.17π_{t−1} + 0.008 + Σ_i δ_iSD_i;   (7)
        (0.05)       (0.008)

T = 136[77(9) − 88(12)],
R² = 0.22, σ̂ = 3.3%, BPχ²(13) = 14.3, AR 1-7 F(7,123) = 0.79,
ARCH 1-7 F(7,116) = 0.24, RESET F(1,129) = 11.5,
NORM χ²(2) = 173.5, X²_i F(6,123) = 7.5.

⁸For inflation, only four seasonal dummies (for January, March, June, and August) were
retained.


Figure 7. Equation 6: Recursive estimates of the coefficient on Δπ_t⁺ for a model of
Δ(m − p)_t, with ±2 estimated standard errors.

Δr_t = −0.12r_{t−2} + 0.011;   (8)
        (0.03)       (0.004)

T = 136[77(9) − 88(12)],
R² = 0.076, σ̂ = 2.0%, BPχ²(13) = 8.88, AR 1-7 F(7,127) = 0.79,
ARCH 1-7 F(7,120) = 4.0, RESET F(1,133) = 15.3, NORM χ²(2) = 290.5,
X²_i F(2,131) = 61.7.

Figures 11 and 12 show the sequence of breakpoint Chow statistics
for Equations 7 and 8: Constancy of the marginal processes is rejected.

Engle and Hendry ( 1989) proposed how to use determinants of
nonconstancies in the marginal process to test super exogeneity. If
inflation and interest rates are super exogenous in the conditional model
(6), then the determinants of the marginal processes’ nonconstancies
should be statistically insignificant if added to Equation 6. To capture
the nonconstancies of these models, several zero-one dummies are
included in Equations 7 and 8. The expanded models are


Figure 8. Equation 6: Recursive estimates of the coefficient on Δπ_t⁻ for a model of
Δ(m − p)_t, with ±2 estimated standard errors.

Δπ_t = −0.07π_{t−1} + 0.003 + Σ_i δ_iSD_i − 0.19D85(7) + 0.07D87(10)
        (0.04)       (0.005)                (0.03)        (0.03)

  − 0.06D87(11-12) + 0.07D88(7) − 0.1D88(9);   (9)
    (0.02)           (0.03)       (0.03)

T = 136[77(9) − 88(12)],
R² = 0.57, σ̂ = 2.4%, BPχ²(13) = 21.5, AR 1-7 F(7,118) = 2.66,
ARCH 1-7 F(7,111) = 0.70, RESET F(1,124) = 1.93, NORM χ²(2) = 1.95,
X²_i F(11,113) = 1.32.

Δr_t = 0.003 − 0.11D85(6) − 0.11D85(7) − @.!B1 D88(8);   (10)
       (0.001)  (0.01)       (0.01)       (0.01)

T = 136[77(9) − 88(12)],
R² = 0.55, σ̂ = 1.4%, BPχ²(13) = 18.0, AR 1-7 F(7,125) = 0.95,
ARCH 1-7 F(7,118) = 5.33, RESET F(1,131) = 0.00, NORM χ²(2) = 27.5,
X²_i F(3,128) = 0.88.

The dummies are for the observations indicated by their names. Two
observations should be made: (i) most of the dummies are near or
during the two main economic reforms of 1985 and 1988, and (ii)
adding the dummies, the coefficients of π_{t−1} and r_{t−2} in Equations 7


Figure 9. Equation 6: Recursive estimates of the coefficient on the error-correction
term [(m − p − 0.5y) − (−3 − 2.3π)]_{t−1} for a model of Δ(m − p)_t, with ±2 estimated
standard errors.

and 8 become insignificant; even so, π_{t−1} is still in Equation 9 because
deleting it induces residual autocorrelation.
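The variable-addition step of the super exogeneity test described here (adding the marginal-model dummies to the conditional model and testing their joint significance) amounts to a standard F test; a minimal sketch with simulated data and hypothetical arrays is:

```python
# Joint F test for the added intervention dummies in the conditional model.
import numpy as np
import statsmodels.api as sm

def super_exogeneity_ftest(y, base_X, dummies):
    restricted = sm.OLS(y, base_X).fit()
    unrestricted = sm.OLS(y, np.column_stack([base_X, dummies])).fit()
    q = dummies.shape[1]
    return ((restricted.ssr - unrestricted.ssr) / q) / \
           (unrestricted.ssr / unrestricted.df_resid)

rng = np.random.default_rng(6)
y = rng.normal(size=150)
base_X = sm.add_constant(rng.normal(size=(150, 3)))
dummies = (rng.random(size=(150, 4)) < 0.05).astype(float)   # zero-one intervention dummies
print(super_exogeneity_ftest(y, base_X, dummies))
```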

Two tests of super exogeneity are conducted, one adding the dum-
mies of (9) and (10) to (6), and the other adding functions of their
residuals to Equation 6. First, adding the dummies, Equation 6 becomes

Δ(m − p)_t = 0.15Δ(m − p)_{t−1} + 0.31Δy_{t−1} − 0.64Δr_t − 1.34Δπ_t⁺ − 0.56Δπ_t⁻
             (0.04)              (0.07)         (0.16)     (0.22)       (0.21)

  − 0.12[m − p − 0.5y + 3 + 2.3π]_{t−1} + 0.20WD + Σ_i δ_iSD_i
    (0.03)                                (0.03)

  + 0.05D85(7) + 0.02D87(10) − 0.02D87(11-12) + 0.01D88(7)
    (0.07)       (0.04)        (0.03)           (0.04)

  − 0.04D88(9) + 0.12D85(6) − 0.03D88(8);   (11)
    (0.06)       (0.04)       (0.04)

T = 136[77(9) − 88(12)],
R² = 0.89.

Allowing for N > T may have seemed insurmountable in the past, but is
not now. Let zt denote the set of n 'external' variables from which
the factors ft = Hzt (say) are formed, then ft , . . . ft−s, zt , . . . zt−s
comprise the initial set of candidate variables. Automatic model
selection can use multi-path searches to eliminate irrelevant
variables with mixtures of expanding and contracting block
searches, so can handle settings with both perfect collinearity and
N > T ; see Hendry and Krolzig (2005) and Doornik (2009b). The
simulations in Castle et al. (2011) show the feasibility of such an
approach when N > T in linear dynamic models. Investigators are,
therefore, not forced to allow for only a small number of factors,
or just the factors and a few lags of the variable being forecast, as
candidates. Since model selection is unavoidable when N > T , we
consider that next.
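A sketch of how such a candidate set {ft, . . . , ft−s, zt, . . . , zt−s} might be assembled in Python is given below; the principal components are computed from the standardized variables, and the number of factors, the lag length, and the data are illustrative assumptions (the paper itself uses Autometrics within PcGive for the subsequent selection).

```python
# Stack variables, their principal components, and s lags of both as candidates.
import numpy as np
import pandas as pd

def candidate_set(Z, s=2, n_factors=4):
    Zs = (Z - Z.mean()) / Z.std()                     # standardize before forming PCs
    eigval, eigvec = np.linalg.eigh(np.cov(Zs.T))
    order = np.argsort(eigval)[::-1]                  # largest-variance components first
    F = Zs @ eigvec[:, order[:n_factors]]
    F.columns = [f"f{i + 1}" for i in range(F.shape[1])]
    lagged = [Z.shift(j).add_suffix(f"_l{j}") for j in range(1, s + 1)] + \
             [F.shift(j).add_suffix(f"_l{j}") for j in range(1, s + 1)]
    return pd.concat([Z, F] + lagged, axis=1)

rng = np.random.default_rng(7)
Z = pd.DataFrame(rng.normal(size=(120, 10))).add_prefix("z")
print(candidate_set(Z, s=2, n_factors=4).shape)       # N can exceed T once lags are added
```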

2.2. Model selection

The search algorithm in Autometrics within PcGive (see Doornik,
2009a; Doornik and Hendry, 2009) seeks the local DGP (denoted
LDGP), namely the DGP for the set of variables under consid-
eration (see e.g., Hendry, 2009), by formulating a general unre-
stricted model (GUM) that nests the LDGP, checking its congruence

when feasible (estimable once N ≪ T and perfect collinearities
are removed). Search thereafter ensures congruence, so all se-
lected models are valid restrictions of the GUM, and should
parsimoniously encompass the feasible GUM. Location shifts
are removed in-sample by impulse-indicator saturation (IIS, see
Hendry et al., 2008, Johansen and Nielsen, 2009, and the simula-
tion studies in Castle et al., 2012b), which also addresses possi-
ble outliers. Thus, if {1{j=t}, t = 1, . . . , T} denotes the complete
set of T impulse indicators, we allow for ft, . . . , ft−s, zt, . . . , zt−s and
{1{j=t}, t = 1, . . . , T} all being included in the initial set of candi-
date variables to which multi-path search is applied. Hence N > T
will always occur when IIS is used, but the in-sample feasibility of
this approach is shown in Castle et al. (2012a). Here we are con-
cerned with the application of models selected in this way to a
forecasting context when the DGP is non-stationary due to location
shifts. Since there are few analyses of how well a factor forecast-
ing approach would then perform (see however, Stock and Watson,
2009; Corradi and Swanson, 2011), we explore its behavior when
faced with location shifts at the forecast origin. Section 5 discusses
automatic model selection.
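The following sketch shows only the construction of the IIS candidate set (one impulse indicator per observation appended to the regressors), which is why N > T necessarily holds; it does not implement the Autometrics block searches, and all names are illustrative.

```python
# Impulse-indicator saturation: append one 0/1 indicator per observation.
import numpy as np
import pandas as pd

def add_impulse_indicators(X):
    T = len(X)
    impulses = pd.DataFrame(np.eye(T), index=X.index,
                            columns=[f"I_{t}" for t in range(T)])
    return pd.concat([X, impulses], axis=1)

X = pd.DataFrame(np.random.default_rng(8).normal(size=(100, 20))).add_prefix("x")
print(add_impulse_indicators(X).shape)    # (100, 120): N > T by construction
```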

2.3. Unanticipated location shifts

Third, ex ante forecasting is fundamentally different from ex
post modeling when unanticipated location shifts occur. Breaks can
always be modeled after the event (at worst by indicator variables),
but will cause forecast failure when not anticipated. Clements and
Hendry (1998) proposed a general theory of economic forecasting
using mis-specified models in a world of structural breaks, and
emphasized that it had radically different implications from a
forecasting theory based on stationarity and well-specified models
(as in Klein, 1971, say). Moreover, those authors also show that
breaks other than location shifts are less pernicious for forecasting
(though not for policy analyses). Pesaran and Timmermann (2005)
and Pesaran et al. (2006) consider forecasting time series subject to
multiple structural breaks, and Pesaran and Timmermann (2007)
examine the use of moving windows in that context. Castle et al.
(2011) investigate how breaks themselves might be forecast, and if
not, how to forecast during breaks, but draw somewhat pessimistic
conclusions due to the limited information that will be available
at the time any location shift occurs. Thus, we focus the analysis
on the impacts of unanticipated location shifts in factor-based
forecasting models.

2.4. Role of information in forecasting

Factor models can be interpreted as a particular form of ‘pool-
ing of information’, in contrast to the ‘pooling of forecasts’ lit-
erature discussed in (e.g.) Hendry and Clements (2004). Pooling
information ought to dominate pooling forecasts based on lim-
ited information, except when all variables are orthogonal (see
e.g., Granger, 1989). However, the taxonomy of forecast errors in
Clements and Hendry (2005b) suggests that incomplete informa-
tion by itself is unlikely to play a key role in forecast failure, so
using large datasets may not correct one of the main problems con-
fronting forecasters, namely location shifts, unless that additional
information is pertinent to forecasting breaks. Moreover, although
we use model selection from a very general initial candidate set,
combined with congruence as a basis for econometric modeling,
it cannot be proved that congruent modeling helps for forecasting
when facing location shifts (see e.g., Allen and Fildes, 2001). While
Makridakis and Hibon (2000) conclude that parsimonious models
do best in forecasting competitions, Clements and Hendry (2001)
argue that such findings may conflate parsimony and robustness
to location shifts: most of the parsimonious models were relatively


robust to location shifts compared to their non-parsimonious contenders.¹
Since more information cannot lower predictability, and
omitting crucial explanatory variables will both bias parameter es-
timates and lead to an inferior fit, the jury remains out on the ben-
efits of more versus less information when forecasting.

2.5. Equilibrium-correcting behavior

Factor models are often equilibrium correction in form, so
they suffer from the general non-robustness to location shifts of
that class of model; see, e.g., Clements and Hendry (1998, Ch. 8).
However, the principles of robust-model formulation discussed in
Castle et al. (2011) apply, and any EqCM, whether based on vari-
ables or factors (or both), could be differenced prior to forecast-
ing, thereby embedding the resulting model in a more robust
forecasting device. Castle et al. (2011) show that how a given model
is used in the forecast period matters, and explore various transfor-
mations that reduce systematic forecast failure after location shifts.
Section 4 provides a more extensive discussion.

2.6. Measurement errors

Many of the ‘solutions’ to systematic forecast failure induced by
location shifts exacerbate the adverse effects of data measurement
errors near the forecast origin; for example, differencing doubles
their impact. Conversely, averaging mitigates the effects of random
measurement errors, so as a method of averaging over variables,
factors might help mitigate data errors. Forecasting models which
explicitly account for data revisions offer an alternative solution.
These include modeling different vintage estimates of a given
time observation as a vector autoregression (see, e.g., Garratt
et al., 2008, 2009, and Hecq and Jacobs, 2009, following Patterson,
1995, 2003), as well as the approach of Kishor and Koenig (2012)
(building on earlier contributions by Howrey, 1978, 1984; Sargent,
1989), who estimate a VAR on post-revision data. This necessitates
stopping the estimation sample short of the forecast origin, so the
model's forecasts of the periods up to the origin are combined with
lightly-revised data via the Kalman filter to obtain post-revision
estimates. The forecast is then conditioned on these estimates of
what the revised latest data will be. Clements and Galvão (2012)
provide evidence on the efficacy of these strategies for forecasting
US output growth and inflation, albeit using information sets
consisting only of lags (and different vintage estimates) of the
variable being forecast.

The frequency of macroeconomic data can also affect its
accuracy, as can nowcasting (see e.g., Bánbura et al., 2011) and ‘real
time’ (versus ex post) forecasting (on the latter, see e.g., Croushore,
2006; Clements and Galvão, 2008). Empirical evidence suggests
that the magnitudes of data measurement errors are larger in the
most recent data, in other words, in the data on which the forecast
is being conditioned (hence the Kishor and Koenig, 2012, proposal
to predict ‘final’ estimates of the latest data), as well as during
turbulent periods (Swanson and vanDijk, 2006),whichmight favor
factor models over other approaches that do not explicitly attempt
to take data revisions into account.

2.7. Forecast evaluation

There is a vast literature on how to evaluate the ‘success or
failure’ of forecasts (see among many others, Leitch and Tanner,
1991; Pesaran and Timmermann, 1992; Clements and Hendry,
1993; Granger and Pesaran, 2000a,b; Pesaran and Skouras, 2002),
as well as using forecasts to evaluate models (see e.g., West,

¹Parsimonious models need not be robust: just consider using an estimate of the
unconditional historical mean of a process as its forecast. No model specification or
selection is required, and estimation is just the calculation of the sample mean, but
this parsimonious forecasting device is highly susceptible to location shifts.

1996; West and McCracken, 1998; Hansen and Timmermann,
2011), forecasting methods (Giacomini and White, 2006), and
economic theory (Clements and Hendry, 2005a). As a first exercise
in forecasting from models selected from both variables and
factors, we report the traditional MSFE measure, and evaluate
forecasts of the levels of (log) GDP and GDP growth. Both are of
interest to the policymaker: the growth rate is a headline statistic;
whereas the level of GDP is required for the calculation of output
gaps, see e.g., Watson (2007). To judge the accuracy of alternative
forecasting models, the choice of levels versus changes can matter,
as differences between the accuracy of multi-step forecasts from
correctly-specified models and models which impose ‘too many’
unit roots are typically diminished when forecasts are evaluated
in terms of growth rates rather than levels. Clements and Hendry
(1998, Ch. 6) show this analytically for a cointegrated VAR, using
the trace of the MSFE matrix as the measure of system-wide
forecast accuracy, but the results specialize to the equivalent
comparisons in terms of single equations. The impact of the mis-
specification for cointegrated systems of VARs in differences is
attenuated when forecasts of growth rates are evaluated. When
there are location shifts, the evaluation of forecasts of growth
rates may cloak the benefits of better forecasting approaches,
such as intercept corrections, since robust forecasting devices then
typically perform better on levels forecasts.

2.8. Nature of the DGP

Finally, the nature of the DGP itself matters greatly to the
success of a specific forecasting model or method. In particular,
the factor model would be expected to do well if the ‘basic’ driving
forces are primarily factors, in the sense that a few factors account
for a large part of the variance of the variables of interest. The ideal
case for factor model forecasting is where the DGP is:

xt = ϒ(L)ft + et
ft = Φ(L)ft−1 + ηt

where xt is n × 1, ft is m × 1, ϒ(L) and Φ(L) are n × m and m × m, and
n ≫ m, so that the low-dimensional ft drives the co-movements
of the high-dimensional xt. The latent factors are assumed here to
have a VAR representation. Suppose in addition that the mean-zero
'idiosyncratic' errors et satisfy E[ei,tej,t−k] = 0 for all k unless i = j
(allowing the individual errors to be serially correlated), and that
E[ηtet−k] = 0 for all k.
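A small simulation sketch of this ideal factor DGP (dimensions, loadings, and autoregressive parameters are arbitrary choices, not estimates) illustrates how a few factors dominate the covariance of the observed variables:

```python
# Simulate x_t = Upsilon f_t + e_t with VAR(1) factors f_t = Phi f_{t-1} + eta_t.
import numpy as np

rng = np.random.default_rng(9)
T, n, m = 300, 50, 3
Phi = np.diag([0.8, 0.6, 0.4])              # stable VAR(1) for the latent factors
Upsilon = rng.normal(size=(n, m))           # n x m loading matrix

f = np.zeros((T, m))
for t in range(1, T):
    f[t] = f[t - 1] @ Phi.T + rng.normal(size=m)
x = f @ Upsilon.T + rng.normal(scale=0.5, size=(T, n))

# the m largest eigenvalues of the covariance of x dominate, reflecting the factor structure
eigvals = np.sort(np.linalg.eigvalsh(np.cov(x.T)))[::-1]
print(eigvals[:5])
```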

Given the ft , each variable in xt , say xi,t , can be optimally
forecast using only the ft and lags of xi,t (xi,t−1, xi,t−2 etc.). Letting
λi(L)′ denote the ith row of ϒ(L), then:

Et[xi,t+1 | xt, ft, xt−1, ft−1, . . .]
  = Et[λi(L)′ft+1 + ei,t+1 | xt, ft, xt−1, ft−1, . . .]
  = Et[λi(L)′ft+1 | xt, ft, xt−1, ft−1, . . .] + Et[ei,t+1 | xt, ft, xt−1, ft−1, . . .]
  = Et[λi(L)′ft+1 | ft, ft−1, . . .] + Et[ei,t+1 | ei,t, ei,t−1, . . .]
  = ψ(L)′ft + δ(L)xi,t

under the assumptions we have made (see Stock and Watson,
2011, for a detailed discussion). Absent structural breaks, the
model with the appropriate factors and lags of xi would deliver
the best forecasts (in population, ignoring parameter estimation
uncertainty). The results of Faust andWright (2009), among others,
suggest that the factor structure may not be a particularly good
representation of the macroeconomy. Our empirical approach
allows for the 'basic' driving forces to be variables or factors, as well
as the many possible non-stationarities noted above. We assume
that the DGP originates in the space of variables, with factors


being potentially convenient approximations that parsimoniously
capture linear combinations of effects. Although non-linearity can
be tackled explicitly along with all the other complications (see
e.g., Castle and Hendry, 2011), we only analyze linear DGPs here.

The implications of these eight considerations combined are to
include both variables and factors based on a large information
set, with dynamics, and indicators for location shifts, selecting at a
stringent significance level, evaluating forecasts in both levels and
differences, and possibly adjusting EqCM forecasting models for
shifts near the forecast origin. We only consider forecasting from
linear models, selected in-sample from (a) a large set of variables;
(b) over those variables’ principal components (PCs); and (c) over
a candidate set including both, in each case with IIS, so the initial
model will necessarily have N > T , and in the third case will be
perfectly collinear, but we exploit the ability of automatic model
selection to operate successfully in such a setting.

3. Variables versus factors: the statistical framework

We begin by describing the relationship between the ‘external’
variables and the factors, and then the postulated in-sample DGP
that relates the variable of interest to the factors or ‘external’
variables.

3.1. Relating external variables to factors

Consider a vector of n stochastic variables {zt} that are weakly
stationary over t = 1, . . . , T . For specificity, we assume that
zt is generated by a first-order vector autoregression (VAR) with
intercept π:
zt = π + Πzt−1 + vt                                            (1)

where Π has all its eigenvalues inside the unit circle, and vt ∼ INn[0, Ωv], where n < T. From (1):

E[zt] = π + ΠE[zt−1] = π + Πµ = µ

where µ = (In − Π)−1π. The principal-component description of zt is:

zt = Ψft + et                                                  (2)

so when E[ft] = κ and E[et] = 0, under weak stationarity in-sample from (2):

E[zt] = ΨE[ft] + E[et] = Ψκ = µ                                (3)

where ft ∼ IDm[κ, P] is a latent vector of dimension m ≤ n, so Ψ is n × m, with et ∼ IDn[0, Ωe], E[fte′t] = 0 and E[ete′t] = Ωe. Then:

E[(zt − µ)(zt − µ)′] = ΨE[(ft − κ)(ft − κ)′]Ψ′ + E[ete′t] = ΨPΨ′ + Ωe = M   (4)

say, where P is an m × m diagonal matrix and hence zt ∼ Dn[µ, M]. Let M = HΛH′ where H′H = In, so H−1 = H′ and the eigenvalues are ordered from the largest downwards with:

H′ = [H′1; H′2]   and   Λ = diag(Λ11, Λ22),                    (5)

where Λ11 is m × m, with H′1MH1 = Λ11 and:

HΛH′ = H1Λ11H′1 + H2Λ22H′2.

Consequently, from (2) and (5):

H′(zt − µ) = H′(Ψ(ft − κ) + et) = ft − κ.                      (6)

If only m linear combinations actually matter, so n − m do not, the matrix H′1 weights the zt to produce the relevant principal components where:

H′1(zt − µ) = f1,t − κ1.                                       (7)

In (6), we allow for the possibility that n = m, so ft is the complete set of principal components entered in the candidate selection set, of which only f1,t are in fact relevant to explaining yt.

3.2. Variable-based and factor-based models

Suppose the in-sample DGP for yt is:

yt = β0 + β′zt−1 + ρyt−1 + ϵt                                  (8)

where |ρ| < 1 and ϵt ∼ IN[0, σ²ϵ]. Integrated-cointegrated systems can be reduced to this framework analytically, albeit posing greater difficulties empirically. Under weak stationarity in-sample:

E[yt] = β0 + β′E[zt−1] + ρE[yt−1] = β0 + β′µ + ρδ = δ          (9)

so δ = (β0 + β′µ)/(1 − ρ) and (8) can be expressed in terms of deviations from means as:

yt − δ = β′(zt−1 − µ) + ρ(yt−1 − δ) + ϵt                       (10)

or as an EqCM when that is a useful reparameterization. In general, only a subset of the zt−1 will matter substantively, and we denote that by za,t−1, so the remaining variables are not individually significant at relevant sample sizes, leading to the more parsimonious model:

yt − δ = β′a(za,t−1 − µa) + ρa(yt−1 − δ) + νt.                  (11)

However, that does not preclude that known linear combinations of the omitted variables might be significant, so νt need not be an innovation process. Alternatively, given the mapping between variables and factors in Section 3.1, if a factor structure holds, from (6) and (10) we can obtain an equivalent representation to (10) in factor space:

yt − δ = β′H(ft−1 − κ) + ρ(yt−1 − δ) + ϵt = τ′(ft−1 − κ) + ρ(yt−1 − δ) + ϵt.   (12)

Again, only a subset may matter, namely the f1,t−1 in (7), and the resulting parsimonious model in the space of relevant factors becomes:

yt − δ = τ′1(f1,t−1 − κ1) + ρ1(yt−1 − δ) + ηt                   (13)

where ηt need not be an innovation process against the omitted information: also, (11) and (13) are not equivalent representations in general even though (10) and (12) are.

We are agnostic about the nature of the DGP, and whether a forecasting model based on a variant of (11) or (13) is superior from a forecasting perspective. Our proposed selection strategy allows either a variant of (11) or (13) to be the forecasting model, when we select over variables or factors, respectively, or some hybrid of variables and factors, when we select over variables and factors. As discussed in Section 2.3, the DGP may not have constant parameters, but our approach allows for location shifts and outliers in-sample by implementing IIS simultaneously with model selection.
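A numerical sketch of the equivalence noted for (10) and (12) follows: regressing y on all the variables or on the full set of their principal components spans the same space and so gives identical fitted values, whereas the parsimonious forms (11) and (13) need not coincide. Data and dimensions are simulated and illustrative.

```python
# Fitted values from OLS on all variables equal those from OLS on all their PCs.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)
T, n = 200, 6
Z = rng.normal(size=(T, n))
y = 0.5 + Z[:, 0] - 0.3 * Z[:, 1] + rng.normal(scale=0.1, size=T)

Zc = Z - Z.mean(axis=0)
eigval, H = np.linalg.eigh(np.cov(Zc.T))
F = Zc @ H                                   # all n principal components

fit_z = sm.OLS(y, sm.add_constant(Zc)).fit()
fit_f = sm.OLS(y, sm.add_constant(F)).fit()
print(np.allclose(fit_z.fittedvalues, fit_f.fittedvalues))   # True: same column space
```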
4. Factor models and location shifts

Once in-sample principal component estimates of the factors {ft} are available, one-step forecasts can be generated from estimates of the selected equation (13):²

ŷT+1|T = δ̂ + τ̂′1(f̂1,T − κ̂1) + ρ̂(ŷT − δ̂)                        (14)

where ŷT is the 'flash' estimate of the forecast origin value. Multi-step estimation can be used to obtain the values of the coefficients for h-step forecasting (see e.g., Clements and Hendry, 1998, Ch. 11, Bhansali, 2002, and Chevillon and Hendry, 2005): here we focus on 1-step ahead forecasts.

²Estimates f̂1,t of ft using principal components H′1(zt − µ) depend on the scaling of the zt, so are often based on the correlation matrix.

Table 1: Factor-model taxonomy of forecast errors, ûT+1|T = . . .

  (1 − ρ)(δ∗ − δ)             [A] equilibrium-mean shift
  − τ′(κ∗ − κ)                [B] factor-mean shift
  + (1 − ρ)(δ − δ̂)            [C] equilibrium-mean estimation
  − τ′1(κ1 − κ̂1)              [D] factor-mean estimation
  + ρ(yT − ŷT)                [E] flash estimate error
  + τ′1(f1,T − f̂1,T)           [F] factor estimate error
  + τ′2(f2,T − κ2)             [G] factor approximation error
  + (τ1 − τ̂1)′(f̂1,T − κ̂1)      [H] factor estimation covariance
  + (ρ − ρ̂)(ŷT − δ̂)            [I] flash estimation covariance
  + (τ1 − τ̂1)′(κ̂1 − κ1)        [J] parameter estimation covariance
  + ϵT+1                      [K] innovation error

Suppose the DGP has a factor structure, as in (12), but that there is a location shift at T such that:

yT+1 = δ∗ + τ′(fT − κ∗) + ρ(yT − δ∗) + ϵT+1                     (15)

where for now τ and ρ remain at their in-sample values during the forecast period. Calculating the forecast error as (15) minus (14) gives rise to:

ûT+1|T = (δ∗ − δ̂) + τ′(fT − κ∗) − τ̂′1(f̂1,T − κ̂1) + ρ(yT − δ∗) − ρ̂(ŷT − δ̂) + ϵT+1.   (16)

Using τ′1(κ∗1 − κ1) + τ′2(κ∗2 − κ2) = τ′(κ∗ − κ), the forecast error can be written in terms of a number of distinct components as in Table 1, which can be compared to non-factor model taxonomies in e.g., Clements and Hendry (2006) and Hendry and Mizon (2012). Here we focus on the primary determinants of forecast bias, which are [A] and [B], the equilibrium-mean shift and the factor-mean shift, respectively. To see this, ignore terms of Op(T−1) (including finite-sample biases in parameter estimates), and take expectations through (16):

E[ûT+1|T] ≃ (1 − ρ)(δ∗ − δ) − τ′(κ∗ − κ) + ρ(yT − E[ŷT]) + τ′1(f1,T − E[f̂1,T]).   (17)

From (17), data mis-measurement and factor estimation errors ([E] and [F] in Table 1) can contribute to forecast bias. These last two, together with all remaining terms, also contribute to the forecast-error variance. The factor approximation error does not enter (17) as E[f2,T] = κ2.

Consider now the possibility that τ and ρ change value for the forecast period, so that in place of (15) the DGP is given by:

yT+1 = δ∗ + τ∗′(fT − κ∗) + ρ∗(yT − δ∗) + ϵT+1.                   (18)

Without constructing a detailed taxonomy, the key impacts can be deduced. Relative to the baseline case illustrated in Table 1, the change in τ induces an additional error term:

τ∗′(fT − κ∗) − τ′(fT − κ∗) = (τ∗′ − τ′)(fT − κ∗)

so that the slope change will interact with the location shift, but in its absence will be relatively benign: this additional term will not contribute to the bias when κ∗ = κ, suggesting the primacy of location shifts. In a similar fashion, the change in persistence of the process (the shift in ρ) only affects the forecast bias if the mean of yt also changes over the forecast period.
To see this, the additional term in the forecast error when ρ shifts is:

(ρ∗ − ρ)(yT − δ∗)

which has a zero expectation when the shift in ρ does not cause a shift in δ, so δ∗ = δ.

Finally, we consider the principal sources of forecast error for an AR(1) model, as this serves as the 'neither' benchmark against which the selected factor-and-variable models in Section 7 are to be compared. For brevity, we ignore influences of secondary importance, such as parameter estimation uncertainty and data mis-measurement, and construct the forecast error for the AR(1):

yt = δ + φ(yt−1 − δ) + vt                                       (19)

when the forecast period DGP is (15). The omission of the factors entails that φ need not equal ρ, but the long-run mean remains the in-sample value of δ. Denoting the forecast error from the AR(1) by ṽT+1|T:

ṽT+1|T = (1 − ρ)(δ∗ − δ) − τ′(κ∗ − fT) + (ρ − φ)(yT − δ)

with a forecast bias of:

E[ṽT+1|T] = (1 − ρ)(δ∗ − δ) − τ′(κ∗ − κ),

matching the two leading terms in (17) for the bias of the factor-forecasting model. Hence, whether the 'correct' set of factors, a subset of these, or none at all is included, there is no effect on the bias of the forecasts (at the level of abstraction here). This affirms the importance of location shifts and the relative unimportance of forecasting model mis-specification (as in e.g., Clements and Hendry, 2006).

We have assumed a single forecast origin, but forecasting is rarely a one-off venture, so the performance of the competing models as the origin moves through time is of interest. Although all models will fail when there is a location shift which is unknown when the forecast is made, the speed and extent to which forecast accuracy recovers as the origin moves forward in time from the break point are important. A feature of the 'equilibrium-correction' class of models, to which (14) belongs, is their lack of adaptability over time, as established in Clements and Hendry (1998, Ch. 8): see Castle et al. (2011) for some potential remedies.
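The bias results above can be checked by simulation; in the sketch below (illustrative parameter values, with φ set equal to ρ for simplicity) both the factor-based forecast and the AR(1) forecast have average error close to (1 − ρ)(δ∗ − δ) after an equilibrium-mean shift, with the mis-specified AR(1) differing mainly in forecast-error variance.

```python
# Monte Carlo check of the forecast-bias terms [A] and [B] after a shift in delta.
import numpy as np

rng = np.random.default_rng(11)
rho, tau, delta, delta_star, kappa = 0.7, 0.5, 1.0, 2.0, 0.0
bias_factor, bias_ar1 = [], []
for _ in range(20000):
    f_T = kappa + rng.normal()
    y_T = delta + tau * (f_T - kappa) + rng.normal(scale=0.1)        # pre-shift origin value
    y_T1 = delta_star + tau * (f_T - kappa) + rho * (y_T - delta_star) + rng.normal(scale=0.1)
    fc_factor = delta + tau * (f_T - kappa) + rho * (y_T - delta)     # in-sample means used
    fc_ar1 = delta + rho * (y_T - delta)                              # AR(1) with phi = rho
    bias_factor.append(y_T1 - fc_factor)
    bias_ar1.append(y_T1 - fc_ar1)
print(np.mean(bias_factor), np.mean(bias_ar1), (1 - rho) * (delta_star - delta))
```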
Various methods have been proposed in the literature, but most take the principal components that explain the maximum variation within the set of explanatory variables, not the most variation between the explanatory variables and the dependent variable, which would require the correlation structure between the regressors and the dependent variable to be similar to the correlation structure within the regressors (see e.g., Castle et al., 2012a). Instead, by selecting PCs based on their statistical significance in the forecasting model, we capture the latter correlation. In the empirical application, the retained PCs are not always the first few PCs, so the correlation structure may differ from that between the dependent variable and the disaggregates. A number of approaches 310 J.L. Castle et al. / Journal of Econometrics 177 (2013) 305–319 have been proposed to counter the potential deficiencies of PC factor forecasting, and these are briefly reviewed in Section 6. The model selection algorithm used is Autometrics, which un- dertakes a multi-path search using block expanding and con- tracting searches to eliminate insignificant variables, commencing from a general model defined by all potential regressors including variables, factors and lags of both, as well as impulse indicators. Once a feasibly estimable set is found, further reductions ensure pre-specified diagnostic and encompassing tests are satisfied. Vari- ables are eliminated if they are statistically insignificant at the chosen criterionwhilst ensuring the resultingmodel is still congru- ent. Various methods of joint testing can speed up the search pro- cedure. Autometrics enables perfectly-collinear sets of regressors to be included jointly. While the general model is not estimable initially, the search proceeds by excluding some of the perfectly- collinear variables, so selection is undertakenwithin a subset of the candidate variables, but allows excluded variables to be included in a different path search with other perfectly-singular variables be- ing dropped. This ‘sieve’ continues until N < T and there are no perfect singularities. The standard tree search selection can then be applied; see Doornik (2009a,b). 6. Principal components and related approaches The dataset we use consists of 109 variables. Some authors such as Boivin and Ng (2006) suggest that it may be better not to use all available data when constructing PCs, although Bernanke and Boivin (2003) find that the forecast performance of their factor models improves when the factors are calculated from a dataset consisting of 215 variables compared to 78 variables. Relatedly, the standard approach calculates the PCs of the same set of variables irrespective of the target variable. Bai and Ng (2008) calculate factors from a set of ‘targeted predictors’—variables which have been shown to have predictive power for the variable of interest, based on hard or soft thresholding. Our interest is in whether the results on selecting over variables and factors, where the factors are PCs based on the full set of 109 variables, are qualitatively unchanged if instead the PCs are based on smaller sets of targeted predictors. In the empirical section, we use the soft thresholding technique of LASSO (Tibshirani, 1996) to select the first 30 ‘most important’ variables, and calculate targeted factorsf∗ as the PCs of this set of targeted variables. We then apply selection as in our standard case, but replacing the 109 variables and their PCs by the targeted variables and targeted PCs. 
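A hedged sketch of the soft-thresholding step just described: loosen the LASSO penalty until roughly 30 variables have non-zero coefficients, then compute the targeted factors as principal components of that subset. The simulated data, the penalty grid and the rule for picking exactly 30 of the active set are our own choices, not the exact implementation used in the empirical section; scikit-learn's Lasso is assumed to be available.

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
T, n, n_target = 150, 109, 30

x = rng.normal(size=(T, n))
y = x[:, :5] @ rng.normal(size=5) + rng.normal(size=T)      # only a few variables matter

# Standardize, then loosen the L1 penalty until about n_target variables survive.
z = (x - x.mean(0)) / x.std(0)
selected = None
for alpha in np.geomspace(1.0, 1e-4, 200):
    coefs = Lasso(alpha=alpha, max_iter=20000).fit(z, y).coef_
    active = np.flatnonzero(coefs)
    if active.size >= n_target:
        selected = active[:n_target]                        # arbitrary tie-break for this sketch
        break

# Targeted factors: principal components of the retained subset only.
zs = z[:, selected]
eigval, eigvec = np.linalg.eigh(np.corrcoef(zs, rowvar=False))
targeted_factors = zs @ eigvec[:, np.argsort(eigval)[::-1][:4]]   # first four targeted PCs
print("retained variables:", selected.size, " targeted factor matrix:", targeted_factors.shape)

The targeted PCs then replace the full-panel PCs in the candidate set, exactly as described above.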
An alternative is a block-factor approach, motivated byMoench et al. (2009), where the data is divided up into a number of cate- gories, and PCs are calculated for each category. We choose four categories: GDP components and industrial production (19 disag- gregate variables); labormarket (33 disaggregate variables); prices (29 disaggregate variables); financial variables (28 disaggregate variables). The block-factor approach ensures the factors summa- rize information from disparate sets of variables and are more readily interpretable. The first PCwas computed for each block, and entered in the forecasting model: ∆yt+h = β0 + β1∆yt + 4 j=1 γjzj,t (20) for h = 1, 4, 8, where zj,t is the first PC from the j-th block. This approach is compared to simply using the first four PCs without blocking the variables. Other approaches have been espoused in the literature, such as De Mol et al. (2008), who find that Bayesian shrinkage tends to perform as well as PC factor-model forecasting. 7. Forecasting US GDP and GDP growth Our empirical forecasting exercise compares the forecast performance of regressionmodels based on principal components, variables, both or neither. We forecast quarterly GDP growth and the corresponding level over the period 2000–2011. Models are selected in-sample using Autometrics, with all variables and principal components included in the candidate set jointly. A number of authors have assessed the forecast performance of factor models over this period, and Stock and Watson (2011) review studies which explicitly consider the impact of breaks on factor-model forecasts. A key study is Stock and Watson (2009) who find (p. 197) ‘considerable evidence of instability in the factor model; the indirect evidence suggests instability in all elements (the factor loadings, the factor dynamics, and the idiosyncratic dynamics)’. They suggest estimating the factors on the full historical period across the break (there, the Great Moderation around 1984, see, e.g., McConnell and Perez-Quiros, 2000), but only estimating the forecasting models that include the factors as explanatory variables on the post-break period. As an alternative strategy to handle instability in the forecasting models, we use the full estimation sample, but with IIS. The ‘neither’ AR benchmark against which factor model forecasts are often compared has typically been difficult to beat systematically. In terms of forecasting (e.g.) inflation, Stock and Watson (2010) argue that simple univariate models, such as a random-walk model, or the time-varying unobserved components model of Stock and Watson (2007), are competitive with models including explanatory variables. Stock and Watson (2003) are relatively downbeat about the usefulness of leading indicators for predicting output growth; see Clements and Galvão (2009) for evidence using higher-frequency data. We begin by describing the data and forecasting models, and in Section 7.3 present the results when factors are calculated in the standard way from all the available data series. Section 7.4 reports results for selection over targeted factors and variables (as described in Section 6). 7.1. Data The dataset, based on Stock andWatson (2009), consists of 144 quarterly time series for the United States over 1959:1–2006:4, updated here to 2011:2. There are n = 109 disaggregate variables, used both as the candidate set of regressors and the set for the principal components. 
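To make the block-factor model (20) concrete, the sketch below computes the first principal component of each of four blocks and regresses the h-step-ahead growth rate on the current growth rate and the four block factors for h = 1. The block sizes follow the split described above, but the data are simulated and the helper function first_pc is our own; it is an illustration, not the authors' code.

import numpy as np

rng = np.random.default_rng(3)
T = 200
blocks = {"gdp_ip": 19, "labor": 33, "prices": 29, "financial": 28}   # block sizes as in the text

def first_pc(x):
    """First principal component of a standardized block (correlation-matrix based)."""
    z = (x - x.mean(0)) / x.std(0)
    eigval, eigvec = np.linalg.eigh(np.corrcoef(z, rowvar=False))
    return z @ eigvec[:, np.argmax(eigval)]

dy = rng.normal(size=T)                                               # stand-in for GDP growth
block_pcs = np.column_stack([first_pc(rng.normal(size=(T, k))) for k in blocks.values()])

# Equation (20) for h = 1: regress dy_{t+1} on a constant, dy_t and the four block factors z_{j,t}.
h = 1
X = np.column_stack([np.ones(T - h), dy[:-h], block_pcs[:-h]])
beta, *_ = np.linalg.lstsq(X, dy[h:], rcond=None)
print("estimated coefficients (const, dy_t, z1..z4):", np.round(beta, 3))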
All data are transformed to remove unit roots by taking first or second differences (usually in logs), as described in Stock and Watson (2009, Appendix Table A.1). The data available span T = 1962:3–2011:2, so there are 150 in-sample observations after transformations and lags, with the forecast horizon spanning 2000:1–2011:2, which is separated into two subsets, 2000:1–2006:4 and 2007:1–2011:2, to assess the performance of the forecasting models over the financial-crisis period.

Let h = 1, 4, 8 denote the step-ahead direct forecasts. Let P denote the out-of-sample forecast period, where P0 = 2000:1, P1 = 2006:4, and P2 = 2011:2. Forecasts are evaluated over P0:P2 (the full forecast sample of 46 observations); P0:P1 (forecast subsample 1 of 28 observations); and P1:P2 (forecast subsample 2 of 18 observations). N denotes the total number of regressors, which could include T impulse indicators, lags of variables or factors, and deterministic terms.

7.1.1. Principal components

Let x^d denote the (T + m) × n matrix of disaggregated variables after transforming to non-integrated series by appropriate differencing, and M the n × n sample correlation matrix. The eigenvalue decomposition is:

M = H \Lambda H'    (21)

where \Lambda is the diagonal matrix of ordered eigenvalues (\lambda_1 \geq \cdots \geq \lambda_n \geq 0) and H = (h_1, \ldots, h_n) is the corresponding matrix of eigenvectors, with H'H = I_n. The sample principal components are:

\hat{f} = H' \tilde{x}^d    (22)

where \tilde{x}^d = (\tilde{x}^d_1, \ldots, \tilde{x}^d_T)' is the standardized data, with \tilde{x}^d_{j,t} = (x^d_{j,t} - \bar{x}^d_j)/\hat{\sigma}_{x^d_j} for all j = 1, \ldots, n, where \bar{x}^d_j = T^{-1} \sum_{t=1}^{T} x^d_{j,t} and \hat{\sigma}^2_{x^d_j} = T^{-1} \sum_{t=1}^{T} (x^d_{j,t} - \bar{x}^d_j)^2. When the principal components are estimated in-sample, m = 0, whereas m = P0, ..., P2 for recursive estimation of the principal components.

Table 2
Significance levels used for model selection and expected null retention rates.

                           Variables       Factors         Both            PC1-4
                           no IIS   IIS    no IIS   IIS    no IIS   IIS    IIS
 N                         441      591    441      591    877      1027   156
 Conservative (%)          1        0.75   1        0.75   0.5      0.43   1
 Super-conservative (%)    0.1      0.075  0.1      0.075  0.05     0.043  –

Note: Intercepts are always retained in selection; PCs and the LDV are retained in the 'PC1-4' model.

7.2. Forecasting models

The forecasting models are obtained by selection using Autometrics on the GUM:

\Delta y_t = \gamma_0 + \sum_{j=J_a}^{J_b} \rho_j \Delta y_{t-j} + \sum_{i=1}^{n} \sum_{j=J_a}^{J_b} \beta_{i,j} \Delta x_{i,t-j} + \sum_{k=1}^{n} \sum_{j=J_a}^{J_b} \gamma_{k,j} \hat{f}_{k,t-j} + \sum_{l=1}^{T} \delta_l 1_{\{l=t\}} + \epsilon_t    (23)

where \Delta y_t is the first difference of log real gross domestic product. We set: (i) \gamma = 0, i.e., select over variables only; (ii) \beta = 0, i.e., select over factors only; (iii) \gamma \neq 0 and \beta \neq 0, i.e., jointly select over variables and factors; or (iv) \beta = 0, n = 4, J_a = J_b = h, i.e., the first four principal components only, with no selection. Intercepts and lags of \Delta y_t are included in all models, with intercepts always retained.

Three forecast horizons are recorded, for 1-step, 4-step and 8-step ahead direct forecasts. For the 1-step ahead forecasts, J_a = 1 and J_b = 4, allowing for 4 lags of the dependent and exogenous regressors. For 4-step ahead direct forecasts, J_a = 4 and J_b = 7, and 8-step ahead forecasts set J_a = 8 and J_b = 11. For the four forecasting specifications, either: (a) \delta = 0, no IIS; or (b) \delta \neq 0, with IIS, applied in-sample. The forecasting model is either selected and estimated over t = 1, ..., T, or recursively over the forecast horizon, t = 1, ..., T + m, including re-computing the principal components, so the model specification can then change with each new forecast.
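A minimal sketch of how a GUM in the spirit of (23) can be assembled for h-step direct forecasting, with lags J_a to J_b of the dependent variable, the disaggregates and their principal components, plus a full set of impulse indicators. It only builds the candidate matrix and reports its dimensions, to show that N exceeds T once IIS is included; the scale (n = 10 rather than 109) and the simulated data are our own assumptions, and the selection step itself is left to Autometrics.

import numpy as np

rng = np.random.default_rng(4)
T, n, h = 150, 10, 4                       # far fewer disaggregates than the paper's n = 109
Ja, Jb = h, h + 3                          # 4-step direct forecasts use lags 4,...,7 as in the text

dy = rng.normal(size=T)                    # stand-in for GDP growth
dx = rng.normal(size=(T, n))               # stand-in for the differenced disaggregates
z = (dx - dx.mean(0)) / dx.std(0)
eigval, eigvec = np.linalg.eigh(np.corrcoef(z, rowvar=False))
pcs = z @ eigvec[:, np.argsort(eigval)[::-1]]          # all n principal components

def lags(a, first, last):
    """Stack lags first..last of a 1-D or 2-D series (rows wrap via roll and are trimmed below)."""
    a = np.atleast_2d(a.T).T
    return np.column_stack([np.roll(a, j, axis=0) for j in range(first, last + 1)])

# GUM: intercept, lags of dy, of each variable and of each PC, plus impulse indicators (IIS),
# so the number of candidate regressors N exceeds T by construction.
X = np.column_stack([np.ones(T), lags(dy, Ja, Jb), lags(dx, Ja, Jb), lags(pcs, Ja, Jb), np.eye(T)])
X, y = X[Jb:], dy[Jb:]                     # drop rows contaminated by the wrap-around of roll
print("GUM dimensions: T =", len(y), ", N =", X.shape[1])   # selection must handle N > T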
Intercept-corrected forecasts are also computed, using the simplest formwhere the last in-sample residual is added to the forecast: ∆yICT+h+m = ∆yT+h+m|T+m +ϵT+m for m = 0, . . . , P2. Two selection strategies are considered, a conservative and a super-conservative strategy. The conservative strategy sets the significance level α so that Nα ≈ 4.4 regressors are retained on average under the null that none are relevant, whereas the super- conservative strategy has a null retention of approximately 0.4 Table 3 In-sample model fit for GDP growth forecasting models selected with IIS. 1-step 4-step 8-step cons super cons super cons super Variablesσ (%) 0.49 0.59 0.58 0.69 0.51 0.77 No. regressors 15 6 16 8 26 6 No. dummies 2 2 5 1 7 3 Factorsσ (%) 0.40 0.62 0.33 0.74 0.50 0.79 No. regressors 24 7 35 6 27 5 No. dummies 6 2 11 4 5 2 Bothσ (%) 0.46 0.64 0.60 0.74 0.49 0.69 No. regressors 17 5 15 6 20 7 No. factors 2 1 4 0 3 2 No. dummies 4 1 4 4 13 4 PC1-4σ (%) 0.59 0.69 0.74 No. dummies 4 6 5 Notes: σ is the equation standard error, No. regressors and No. dummies record the number of regressors (including the intercept) and, as a subset, the number of dummies retained, and ‘cons’ and ‘super’ are the conservative and super-conservative strategies respectively. regressors. By controlling α, overfitting is not a concern despite commencing with N ≫ T , with the cost of a loss of power for retaining regressors with retention test values close to the critical value. No selection is undertaken for the model PC1-4, other than IIS to which the conservative strategy significance level applies (Tα ≈ 1.5). Table 2 summarizes the selection significance levels. Three benchmark ‘neither’ forecasts are considered, the random walk (RW) and AR(1) forecasts computed directly and iteratively: ∆yRWT+h+m = ∆yT+m ∆yAR(D) T+h+m = β0 + β1∆yT+m ∆yAR(I)T+h+m = h−1 i=0 γ0ρ i 1 +ρh 1∆yT+m for m = P0, . . . , P2 and h = 1, 4, 8. As a result, there are 354 fore- cast models to compare. We evaluate the forecasts on root mean square errors (RMSFEs), over the full sample and two subsamples, and for both levels and growth rates. The implied level forecasts for GDP (in logs) are: yT+h+m|T+m = h i=1 ∆yT+i+m|T+m + yT+m for m = P0, . . . , P2 for h = 4, 8. Although 1-step ahead forecast errors are identi- cal for levels and differences, results are reported for comparison. 4-step forecasts are calculated from 2000:4, and 8-step forecasts from 2001:4, as the earlier difference forecasts are required for computation, noting that evaluation in levels and growth rates need not result in the same ranking (see Section 2.7). 7.3. Results using PCs extracted from the whole dataset Table 3 records the in-sample model fit and number of retained regressors for selection with IIS. At the looser significance level, 312 J.L. Castle et al. / Journal of Econometrics 177 (2013) 305–319 Table 4 Forecast-error outcomes. 
Variables Factors Both PC1-4 RW AR(D) AR(I) ∆yT+k Full sample 1.036 1.091 1.262 0.825 0.967 0.811 0.811 0.697 0.806 0.741 0.588 0.700 0.495 0.489 0.757 0.849 0.849 0.634 0.746 0.551 0.545 2000:1–2006:4 0.795 0.954 0.842 0.686 0.768 0.545 0.551 0.644 0.758 0.669 0.536 0.619 0.427 0.431 0.647 0.771 0.683 0.556 0.625 0.421 0.423 2007:1–2011:2 1.275 1.250 1.654 0.984 1.191 1.101 1.097 0.865 0.923 0.946 0.714 0.909 0.692 0.677 0.929 0.970 1.107 0.754 0.935 0.753 0.736yT+k Full sample 2.138 2.345 2.350 1.965 2.668 2.693 2.681 1.505 1.767 1.611 1.380 1.977 1.725 1.712 1.555 1.786 1.677 1.433 1.958 1.851 1.826 2000:1–2006:4 1.400 1.745 1.507 1.156 1.828 1.289 1.310 1.134 1.461 1.216 0.942 1.379 0.997 1.013 1.077 1.376 1.141 0.894 1.314 0.955 0.950 2007:1–2011:2 2.873 2.967 3.187 2.750 3.554 3.978 3.947 2.286 2.482 2.472 2.315 3.083 3.318 3.282 2.299 2.423 2.512 2.271 2.959 3.245 3.188 Notes: The three rows in each block correspond to (a) RMSFE; (b) trimmed RMSFEwith 10% trimming; and (c)MAE for GDP and quarterly GDP growth, with benchmark RandomWalk, direct AR(1) [AR(D)] and iterative AR(1) [AR(I)] forecasts. (×100). Fig. 1. Average RMSFE for GDP growth (∆yT+h). selection over factors results in a better in-sample fit than selection over variables, while the ranking is reversed using the tighter significance level. The factor models retain a relatively large number of PCs under the conservative strategy, suggesting some overfitting, particularly at h = 4. The fit of the non-selected PC1-4 model is close to the fit for selecting over variables with the super- conservative strategy. Few dummies are retained on average. Table 4 records the average RMSFE, trimmed RMSFE (trimming 10% in all samples), and average mean absolute error (MAE) for GDP and GDP growth for each of the forecasting models, averaged across: the forecast horizon, whether IIS is applied or not, the selection significance level, whether intercept correction is applied or not, and whether estimated in-sample or recursively. For GDP growth it is difficult to beat an AR(1) model, either iterative or direct, particularly in the earlier subsample (P0 : P1). In terms of selection over factors, variables, or both, the results generally favor selection over variables, although just using the first 4 PCs (PC1-4) dominates selection for both subperiods. The rankings change dramatically for the levels forecasts. The AR(1) forecasts perform much worse over the second subsample, and PC1-4 dominates selection of factors or variables. The RW benchmark is preferred to using the AR models, but is worse than any selected model. Models with variables are preferred to factor models in both subsamples. There are huge differences in the forecast accuracy across the two subsamples reflecting the crisis period and the difficulty in forecasting GDP over this volatile period, emphasizing the role of location shifts. The aggregate results using RMSFE are disentangled in Figs. 1 and 2 for GDP growth and log GDP respectively. Panel (a) averages across the variants for a given forecast horizon, panel (b) averages across all models with IIS and models without, panel (c) averages across the selection criterion, panel (d) averages across whether J.L. Castle et al. / Journal of Econometrics 177 (2013) 305–319 313 Fig. 2. Average RMSFE for log GDP (yT+h). intercept correction was applied or not and panel (e) compares in- sample estimation versus recursive selection and estimation. 
The box plots record the subsample 1 and 2 average RMSFEs (where subsample 2 is always above subsample 1), with the dash denoting the full sample average.3 For GDP growth, RMSFE does not increase substantially as the forecast horizon grows. At the shortest horizon, the models with variables perform poorly in the second subsample relative to the first, whereas the factor model is less affected by greater turbulence of the second period. Using just the first four factors (PC1-4) is the dominant strategy. IIS yieldsmore accurate forecasts when selection is over variables or ‘both’ for the recession period. A tighter selection strategy is preferred for all forecasting models, both in stable and volatile periods. The intercept- correction strategy yields more accurate factor model forecasts (both with and without selection) during the recession period, but little benefit during the earlier period. Recursive selection and estimation yield some gains. In terms of forecasting the levels, the performance of all the models now deteriorates as the forecast horizon increases, as expected, but the stand-out finding is the gain from intercept correction for all models, especially in the second, more volatile, subperiod, so RMSFE-based evaluations of growth rates can hide the benefits of using a robust forecasting device. For the volatile subperiod, RMSFEs are approximately halved. The reasons for the improvements in forecast accuracy from intercept correction are explored in Fig. 3, which records the h-step ahead forecasts of the levels of GDP for h = 1, . . . , 8. Panel (a) records the forecasts from the factor model without intercept correction, with recursive selection and estimation at the super-conservative significance level with IIS, panel (b) records the forecasts from the same model with intercept correction, and panels (c) and (d) record the forecasts from the corresponding variable models. The benefits of intercept correction can be seen around the 2008/9 downturn, where the forecasts are pulled back 3 Results for the 1-step ahead levels forecasts are recorded in Fig. 2 for comparison despite being identical to those in Fig. 1. on track. The simple correction is beneficial for both model’s forecasts, indicating that the ‘break’ induced by the recession is the dominant feature affecting the forecast performance of the levels forecasts, and the choice of selecting over variables or factors is of secondary importance. Figs. 4 and 5 record the distributions of forecast errors for variables (panel a), factors (panel b), both (panel c) and the first 4 PCs (panel d), for GDP growth and the level of GDP, respectively. Separate distributions are plotted for the uncorrected and intercept-corrected forecasts. In growth rates, the forecast errors for the factor models are downward biased, but intercept correction corrects the bias. The variables and ‘variable and factor’ models contain some outliers resulting in very long tails. A closer examination reveals that the outlying forecast errors are mainly due to the retention of the second difference of the log monetary base as an explanatory variable. There was a dramatic increase in the monetary base following the financial crisis, which jumped from $863bn in 2008:3 to $1724bn in 2009:3. In practice intervention by the forecaster would likely attenuate such effects. The levels forecasts demonstrate a substantial negative skew, and the benefits of intercept correction can be seen clearly in these distributions. 
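The intercept correction used above is just the addition of the last in-sample residual to each forecast, with level forecasts obtained by cumulating the corrected growth forecasts onto the last observed level. A minimal sketch for a 1-step forecast from a simple AR(1) growth model follows; the numbers are simulated and the model and scale parameters are our assumptions.

import numpy as np

rng = np.random.default_rng(5)
T = 120
dy = 0.005 + rng.normal(scale=0.006, size=T)       # stand-in for quarterly log-GDP growth
y = np.cumsum(dy)                                  # log level

# Fit a simple AR(1) forecasting model for growth and form the 1-step forecast.
X = np.column_stack([np.ones(T - 1), dy[:-1]])
b, *_ = np.linalg.lstsq(X, dy[1:], rcond=None)
resid = dy[1:] - X @ b
dy_fc = b[0] + b[1] * dy[-1]                       # uncorrected 1-step growth forecast
dy_fc_ic = dy_fc + resid[-1]                       # intercept-corrected: add the last residual

# Implied (log) level forecasts cumulate the growth forecasts onto the last observed level.
y_fc, y_fc_ic = y[-1] + dy_fc, y[-1] + dy_fc_ic
print(round(dy_fc, 4), round(dy_fc_ic, 4), round(y_fc, 3), round(y_fc_ic, 3))

After a downward location shift, the last residual is large and negative, so the corrected forecast is pulled back towards the data, which is the mechanism visible in Fig. 3.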
Other than the couple of outliers in the variables, and variable and factormodels, there are nomajor differences between the variable and factor model forecast errors. 7.4. Results using targeted factors and variables Of interest is whether using targeted variables and factors as suggested by e.g., Bai and Ng (2008) affects the relative forecast performance of selection over factors, variables (or both). We use LASSO to select the 30 most important disaggregate variables, as described in Section 6. Table 5 reports the significance levels used for selection after soft thresholding. Table 6 reports in-sample summary statistics. The selected models are quite highly parameterized in many cases. For the targeted factors, IIS reduces the number of factors retained suggesting that more parsimonious specifications can be obtained if breaks and outliers are accounted for. Table 7 reports the summary forecasting results for soft thresholding compared to standard selection, and Table 8 provides 314 J.L. Castle et al. / Journal of Econometrics 177 (2013) 305–319 Fig. 3. 1-step to 8-step ahead forecasts for GDP. Forecasts from factor models and variable models, both recursive super-conservative selection with IIS, with and without intercept correction. Fig. 4. Distribution of forecast errors for GDP growth. Table 5 Soft thresholding and selection significance levels. Targeted factors Targeted variables Targeted factors & targeted variables Targeted factors & all variables Soft threshold 30 vars 30 vars 30 vars 30 vars Selection with IIS (%) 1 1 0.5 0.5 Selection without IIS (%) 5 5 2.5 1 the breakdown by forecast horizon, where all the forecasts are based on fixed estimation schemes. Averaging across horizons (Table 7), there is little evidence that using targeted factors and variables leads to improvements across the board. For growth rates, selection over targeted factors improves on selection over (standard) factors, and selection over targeted factors and targeted J.L. Castle et al. / Journal of Econometrics 177 (2013) 305–319 315 Fig. 5. Distribution of forecast errors for log GDP. Table 6 In-sample summary statistics for soft thresholding using LASSO. σ (%) N F∗ n d f Targeted factors 1-step IIS 0.501 274 30 13 4 8 no IIS 0.465 124 30 25 – 24 4-step IIS 0.601 274 30 12 8 3 no IIS 0.642 124 30 13 – 12 8-step IIS 0.512 274 30 23 14 8 no IIS 0.593 124 30 28 – 25 Targeted variables 1-step IIS 0.526 274 30 14 2 – no IIS 0.463 124 30 34 – – 4-step IIS 0.605 274 30 15 8 – no IIS 0.641 124 30 18 – – 8-step IIS 0.371 274 30 41 28 – no IIS 0.710 124 30 19 – – Targeted factors & targeted variables 1-step IIS 0.492 394 30 14 2 4 no IIS 0.513 244 30 14 – 6 4-step IIS 0.638 394 30 9 4 2 no IIS 0.665 244 30 8 – 4 8-step IIS 0.500 394 30 23 14 6 no IIS 0.639 244 30 15 – 5 Targeted factors & all variables 1-step IIS 0.513 710 30 11 1 9 no IIS 0.577 560 30 8 – 5 4-step IIS 0.641 710 30 11 3 2 no IIS 0.670 560 30 11 – 3 8-step IIS 0.453 710 30 28 9 7 no IIS 0.595 560 30 18 – 5 Notes: N = number of regressors in GUM (excluding the retained intercept); F∗ = number of variables retained after soft thresholding, fixed at 30; n = number of retained regressors after selection with Autometrics; d = number of impulse indicators retained after IIS; f = number of principal components retained after selection, including lags. variables improves on selection over factors and variables, but choosing the first 4 factors is the dominant factor-forecasting strategy. 
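For reference, the three loss measures used in Tables 4 and 7–9 can be computed as in the sketch below. The forecast errors shown are invented, and the trimming convention (dropping the largest 10% of absolute errors before averaging) is our reading of '10% trimming', so it should be treated as an assumption.

import numpy as np

def rmsfe(e):
    return np.sqrt(np.mean(np.square(e)))

def trimmed_rmsfe(e, trim=0.10):
    # Assumption: discard the largest 10% of absolute forecast errors before averaging.
    e = np.asarray(e)
    keep = np.argsort(np.abs(e))[: int(np.floor(len(e) * (1 - trim)))]
    return rmsfe(e[keep])

def mae(e):
    return np.mean(np.abs(e))

errors = np.array([0.2, -0.5, 0.1, 1.8, -0.3, 0.4, -0.2, 0.6])   # illustrative forecast errors
print(round(rmsfe(errors), 3), round(trimmed_rmsfe(errors), 3), round(mae(errors), 3))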
Table 8 shows that selection over targeted factors and targeted variables dominates selection over factors and variables at all horizons during the second forecast period, but that using the first 4 (standard) factors is the dominant strategy at h = 4, 8 for forecasting levels and growth rates. 7.4.1. Block-factor approach In view of the good performance of the first 4 factors, we calculate forecasts using the block-factor approach of Section 6. Without IIS, there is no selection, and with IIS there is selection only over the impulse-indicators at 1%, with the lagged dependent variable, intercept and four block factors (computed as the first factor from each block) always retained. The results are reported in Table 9, which averages over results obtained for fixed estimation, denoted ‘Blocking 4 factors’, allowing a direct comparison to ‘First 4 factors’ (the first 4 factors computed from the full variance–covariance matrix with 109 regressors). We also include the simple benchmark models. The results in Table 9 generally are not supportive of the block- factor approach. Blocking does not lead to a forecast improvement 316 J.L. Castle et al. / Journal of Econometrics 177 (2013) 305–319 Table 7 Summary forecast results for soft thresholding compared to standard selection. Factors Target factors Variables Target variables Factors & variables Target factors & target variables Target factors & all variables First 4 factors RW AR(D) AR(I) ∆yt+h Full sample 1.220 1.064 1.125 1.002 1.562 1.018 0.982 0.807 0.967 0.811 0.811 0.916 0.823 0.759 0.770 0.804 0.759 0.758 0.582 0.700 0.495 0.489 0.957 0.851 0.829 0.794 0.970 0.789 0.779 0.626 0.746 0.551 0.545 2000:1–2006:4 1.064 0.903 0.840 0.894 0.917 0.840 0.870 0.688 0.768 0.545 0.551 0.846 0.735 0.684 0.724 0.729 0.673 0.711 0.538 0.619 0.427 0.431 0.862 0.746 0.689 0.726 0.746 0.673 0.711 0.560 0.625 0.421 0.423 2007:1–2011:2 1.404 1.256 1.390 1.127 2.120 1.225 1.120 0.938 1.191 1.101 1.097 1.065 1.005 0.999 0.879 1.056 0.981 0.864 0.690 0.909 0.692 0.677 1.104 1.014 1.046 0.900 1.317 0.968 0.885 0.728 0.935 0.753 0.736yt+h Full sample 2.641 2.615 2.398 2.300 2.635 2.376 2.431 1.883 2.668 2.693 2.681 2.033 2.150 1.725 1.779 1.764 1.847 1.945 1.371 1.977 1.725 1.712 2.057 2.268 1.800 1.870 1.890 1.960 2.056 1.422 1.958 1.851 1.826 2000:1–2006:4 2.019 2.245 1.611 1.622 1.616 1.779 2.030 1.182 1.828 1.289 1.310 1.733 1.967 1.355 1.383 1.352 1.492 1.750 0.973 1.379 0.997 1.013 1.637 2.034 1.285 1.422 1.279 1.548 1.817 0.927 1.314 0.955 0.950 2007:1–2011:2 3.294 3.001 3.170 2.870 3.604 2.946 2.845 2.597 3.554 3.978 3.947 2.780 2.607 2.583 2.482 2.729 2.538 2.429 2.191 3.083 3.318 3.282 2.709 2.572 2.600 2.416 2.841 2.472 2.368 2.192 2.959 3.245 3.188 Notes: The three rows in each block correspond to (a) RMSFE; (b) trimmed RMSFE with 10% trimming; and (c) MAE for GDP and quarterly GDP growth, with benchmark RandomWalk, direct AR(1) [AR(D)] and iterative AR(1) [AR(I)] forecasts. Averaged over fixed estimation. (×100). Table 8 RMSFE for soft thresholding and standard selection by forecast horizon (fixed estimation). 
Factors Target factors Variables Target variables Factors & variables Target factors & target variables Target factors & all variables First 4 factors RW AR(D) AR(I) ∆yt+h Full sample 1-step 1.012 0.996 1.363 0.999 1.403 0.991 0.864 0.737 0.753 0.729 0.729 4-step 1.214 0.965 0.963 0.897 1.019 0.918 0.942 0.819 1.045 0.834 0.851 8-step 1.436 1.232 1.049 1.109 2.263 1.145 1.140 0.865 1.104 0.869 0.852 2000:1-2006:4 1-step 0.908 0.843 0.754 0.969 0.713 0.815 0.767 0.665 0.766 0.560 0.560 4-step 0.974 0.743 0.842 0.798 0.874 0.818 0.781 0.685 0.681 0.536 0.546 8-step 1.310 1.123 0.923 0.915 1.164 0.887 1.060 0.713 0.858 0.537 0.547 2007:1-2011:2 1-step 1.141 1.185 1.863 1.039 1.956 1.201 0.991 0.825 0.731 0.934 0.934 4-step 1.497 1.216 1.118 1.025 1.197 1.052 1.137 0.967 1.439 1.153 1.177 8-step 1.573 1.368 1.189 1.319 3.207 1.424 1.231 1.023 1.404 1.217 1.179yt+h Full sample 1-step 1.012 0.996 1.363 0.999 1.403 0.991 0.864 0.737 0.753 0.729 0.729 4-step 2.764 1.699 1.863 1.697 1.983 1.743 1.974 1.935 2.922 2.615 2.719 8-step 4.146 5.149 3.968 4.202 4.518 4.395 4.455 2.977 4.330 4.735 4.593 2000:1-2006:4 1-step 0.908 0.843 0.754 0.969 0.713 0.815 0.767 0.665 0.766 0.560 0.560 4-step 1.843 1.311 1.412 1.230 1.437 1.420 1.508 1.182 1.534 1.250 1.290 8-step 3.306 4.580 2.668 2.667 2.697 3.101 3.815 1.699 3.184 2.057 2.080 2007:1-2011:2 1-step 1.141 1.185 1.863 1.039 1.956 1.201 0.991 0.825 0.731 0.934 0.934 4-step 3.737 2.113 2.365 2.183 2.538 2.104 2.470 2.713 4.262 3.878 4.038 8-step 5.003 5.706 5.281 5.387 6.318 5.533 5.075 4.252 5.669 7.121 6.869 Note: (×100). for the growth rates. For the levels the rankings depend on the forecast error measure. Fig. 6 plots the first four principal compo- nents from the full set of disaggregates and from blocking. There are some differences, particularly at the end of the sample for PC3, but the overall variation is similar. 8. Conclusion There have been many analyses of the forecast performance of either factor models or regressionmodels, but few examples of the joint consideration of factors and variables. Recent developments in automatic model selection now allow for more regressors than observations and perfect collinearities. This enables the set of regressors to be extended to include both factors, as measured by their static principal components, and variables, to be jointly included in regressionmodels. The natural extension is to consider which methods perform best in a forecasting context, which is the objective of this paper. One of the key explanations for forecast failure is that of location shifts. When the underlying data generating process shifts, but the forecasting model remains constant, forecast failure will often result. As both regressionmodels and factormodels are in the class of equilibrium-correction models, they both face the problem of non-robustness to location shifts. In our empirical example, we use impulse-indicator saturation to account for breaks in-sample, and IIS could also be used to implement intercept corrections if an indicator variable was retained for the last in-sample observation. We find there is some advantage to using IIS for forecasting, as the unconditional mean is better estimated in differences. As the data are differenced to stationarity in order to estimate the principal components, few impulse-indicators are retained. Backing out levels forecasts highlights the non-stationarity due to level shifts, J.L. Castle et al. 
/ Journal of Econometrics 177 (2013) 305–319 317 Table 9 Summary forecast results for blocking compared to including first four factors. Blocking 4 factors First 4 factors RW AR(D) AR(I) ∆yt+h Full sample 0.859 0.807 0.967 0.811 0.811 0.603 0.582 0.700 0.495 0.489 0.646 0.626 0.746 0.551 0.545 2000:1-2006:4 0.696 0.688 0.768 0.545 0.551 0.541 0.538 0.619 0.427 0.431 0.553 0.560 0.625 0.421 0.423 2007:1-2011:2 1.049 0.938 1.191 1.101 1.097 0.759 0.690 0.909 0.692 0.677 0.791 0.728 0.935 0.753 0.736yt+h Full sample 1.984 1.883 1.965 2.693 2.681 1.439 1.371 1.977 1.725 1.712 1.555 1.422 1.958 1.851 1.826 2000:1-2006:4 1.302 1.182 1.156 1.289 1.310 1.067 0.973 1.379 0.997 1.013 1.084 0.927 1.314 0.955 0.950 2007:1-2011:2 2.595 2.597 2.750 3.978 3.947 2.148 2.191 3.083 3.318 3.282 2.150 2.192 2.959 3.245 3.188 Notes: The three rows in each block correspond to (a) RMSFE; (b) trimmed RMSFE with 10% trimming; and (c) MAE for GDP and quarterly GDP growth, with benchmark RandomWalk, direct AR(1) [AR(D)] and iterative AR(1) [AR(I)] forecasts. Averaging over fixed estimation results. (×100). Fig. 6. First 4 principal components from full set of disaggregates and from blocking. most notable over the 2008/9 recession, and a further extension would be to consider selection of the variables in levels, augmented by stationary principal components to capture underlying latent variable dynamics. The empirical application considered GDP and GDP growth, computing forecasts using Autometrics to select forecastingmodels that include either principal components, individual variables, or both. When forecasting GDP growth, it is difficult to beat simple autoregressions, our ‘neither’ benchmarks. However, these naive benchmarks are poor at forecasting levels, when robust devices such as differencing (the randomwalk model) or intercept corrections are preferred. The empirical results are mixed, but suggest that selection over variables is preferable to selection over factors when breaks occur over the forecast horizon. Comparing Table 1 with that in Hendry and Mizon (2012) suggests this may be due to mean shifts in irrelevant variables that are given a non-zero weight in factors. There appears to be little empirical support for including both variables and factors jointly. The information set is identical between the two transformations of the data, but there is weak evidence to suggest that factor models are preferable for short horizons (nowcasting and 1-step ahead), but variable models are preferred at longer horizons during the second, volatile forecasting period. For directmulti-step forecasting, Autometrics selection over factors tends to forecast worse than imposing the first four factors, suggesting that there are no benefits to selecting the weights based on the correlation with yt+h. While circumventing the need for off-line selection of factors, the empirical results suggest that this is of less importance than dealing with location shifts. The block-factor approach did not offer clear improvements relative to simply using the first four 318 J.L. Castle et al. / Journal of Econometrics 177 (2013) 305–319 principal components, nor did selection over targeted factors and variables. Whether the data are generated by latent factors or observable variables will depend on the phenomena being analyzed, but can be determined from the data using model selection techniques. 
Regardless of whether factor models or variable models are used for forecasting, the theory and evidence presented demonstrate the importance of robustifying the forecasts to location shifts. Acknowledgments It is a great pleasure to contribute a paper on economic forecasting to a Festschrift in honor of Professor Hashem Pesaran, who hasmade somany substantive contributions to this important topic. Hashem has also published on virtually every conceivable topic in econometrics, both theory and applied, thereby acquiring almost 20,000 citations, as well as creating and editing the Journal of Applied Econometrics since its foundation in 1986. This research was supported in part by grants from theOpen Society Foundations and the Oxford Martin School. We would like to thank seminar participants at the Computational and Financial Econometrics Conference, London 2011, the OxMetrics Conference, Washington 2012 and Leicester University Departmental Seminar for helpful discussions. We also acknowledge helpful comments from two anonymous referees and the editors. References Allen, P.G., Fildes, R.A., 2001. Econometric forecasting strategies and techniques. In: Armstrong, J.S. (Ed.), Principles of Forecasting. Kluwer Academic Publishers, Boston, pp. 303–362. Anderson, T.W., 1958. An Introduction to Multivariate Statistical Analysis. John Wiley & Sons, New York. Bai, J., Ng, S., 2008. Forecasting economic time series using targeted predictors. Journal of Econometrics 146 (2), 304–317. Bánbura, M., Giannone, D., Reichlin, L., 2011. Nowcasting. In: Clements and Hendry (2011) (Chapter 7). Bartholomew, D.J., 1987. Latent Variable Models and Factor Analysis. Oxford University Press, New York. Bernanke, B.S., Boivin, J., 2003. Monetary policy in a data-rich environment. Journal of Monetary Economics 50, 525–546. Bhansali, R.J., 2002. Multi-step forecasting. In: Clements and Hendry (2002), pp. 206–221. Boivin, J., Ng, S., 2006. Are more data always better for factor analysis? Journal of Econometrics 132 (1), 169–194. Box, G.E.P., Jenkins, G.M., 1970. Time Series Analysis: Forecasting and Control. Holden-Day, San Francisco. Castle, J.L., Doornik, J.A., Hendry, D.F., 2011. Evaluating automatic model selection. Journal of Time Series Econometrics 3 (1). http://dx.doi.org/10.2202/1941- 1928.1097. Castle, J.L., Doornik, J.A., Hendry, D.F., 2012a. Model selection in equations with many ‘small’ effects. Oxford Bulletin of Economics and Statistics http://dx.doi.org/10.1111/j.1468–0084.2012.00727.x. Castle, J.L., Doornik, J.A., Hendry, D.F., 2012b. Model selection when there are multiple breaks. Journal of Econometrics 169, 239–246. Castle, J.L., Fawcett, N.W.P., Hendry, D.F., 2011. Forecasting Breaks and During Breaks. In: Clements and Hendry (2011), pp. 315–353. Castle, J.L., Hendry, D.F., 2011. Automatic selection of non-linear models. In: Wang, L., Garnier, H., Jackman, T. (Eds.), System Identification, Environmen- tal Modelling and Control. Springer, New York, pp. 229–250. Castle, J.L., Shephard, N. (Eds.), 2009. The Methodology and Practice of Economet- rics. Oxford University Press, Oxford. Cattell, R.B., 1952. Factor Analysis. Harper, New York. Chevillon, G., Hendry, D.F., 2005. Non-parametric direct multi-step estimation for forecasting economic processes. International Journal of Forecasting 21, 201–218. Clements, M.P., Galvão, A.B., 2008. Macroeconomic forecasting with mixed- frequency data: Forecasting output growth in the United States. Journal of Business and Economic Statistics 26, 546–554. 
Clements, M.P., Galvão, A.B., 2009. Forecasting US output growth using leading indicators: An appraisal using MIDAS models. Journal of Applied Econometrics 24, 1187–1206. Clements, M.P., Galvão, A.B., 2012. Forecasting with vector autoregressive models of data vintages: US output growth and inflation. International Journal of Forecasting http://dx.doi.org/10.1016/j.ijforecast.2011.09.003. Clements, M.P., Hendry, D.F., 1993. On the limitations of comparing mean squared forecast errors. Journal of Forecasting 12, 617–637 (with discussion). Clements, M.P., Hendry, D.F., 1998. Forecasting Economic Time Series. Cambridge University Press, Cambridge. Clements, M.P., Hendry, D.F., 2001. Explaining the results of the M3 forecasting competition. International Journal of Forecasting 17, 550–554. Clements, M.P., Hendry, D.F. (Eds.), 2002. A Companion to Economic Forecasting. Blackwells, Oxford. Clements, M.P., Hendry, D.F., 2005a. Evaluating a model by forecast performance. Oxford Bulletin of Economics and Statistics 67, 931–956. Clements, M.P., Hendry, D.F., 2005b. Information in economic forecasting. Oxford Bulletin of Economics and Statistics 67, 713–753. Clements, M.P., Hendry, D.F., 2006. Forecasting with breaks. In: Elliott et al. (2006), pp. 605–657. Clements, M.P., Hendry, D.F. (Eds.), 2011. Oxford Handbook of Economic Forecasting. Oxford University Press, Oxford. Corradi, V., Swanson, N.R., 2011. Testing for factor model forecast and structural stability. Working paper. Economics Department, Warwick University. Croushore, D., 2006. Forecastingwith real-timemacroeconomic data. In: Elliott et al. (2006), pp. 961–982. Dees, S., di Mauro, F., Pesaran, M.H., Smith, L.V., 2007. Exploring the international linkages in the EURO area: a global VAR analysis. Journal of Applied Econometrics 22, 1–38. De Mol, C., Giannone, D., Reichlin, L., 2008. Forecasting using a large number of predictors: is Bayesian shrinkage a valid alternative to principal components? Journal of Econometrics 146, 318–328. Diebold, F.X., Rudebusch, G.D., 1991. Forecasting output with the composite leading index: an ex ante analysis. Journal of the American Statistical Association 86, 603–610. Doornik, J.A., 2009a. Autometrics. In: Castle and Shephard (2009), pp. 88–121. Doornik, J.A., 2009b. Econometric model selection with more variables than observations. Working paper. Economics Department, University of Oxford. Doornik, J.A., Hendry, D.F., 2009. Empirical Econometric Modelling using PcGive: Volume I. Timberlake Consultants Press, London. Duesenberry, J.S., Fromm, G., Klein, L.R., Kuh, E. (Eds.), 1969. The Brookings Model: Some Further Results. North-Holland, Amsterdam. Elliott, G., Granger, C.W.J., Timmermann, A. (Eds.), 2006. Handbook of Econometrics on Forecasting. Elsevier, Amsterdam. Emerson, R.A., Hendry, D.F., 1996. An evaluation of forecasting using leading indicators. Journal of Forecasting 15, 271–291. Ericsson, N.R., 2010. Improving GVARs. Discussion paper. Federal Reserve Board of Governors. Washington, D.C. Fair, R.C., 1970. A Short-Run Forecasting Model of the United States Economy. D.C. Heath, Lexington. Faust, J., Wright, J.H., 2009. Comparing Greenbook and reduced form forecasts using a large realtime dataset. Journal of Business and Economic Statistics 27, 468–479. Fildes, R.A., 1992. The evaluation of extrapolative forecasting methods. Interna- tional Journal of Forecasting 8, 81–98. Forni, M., Hallin, M., Lippi, M., Reichlin, L., 2000. The generalized factor model: identification and estimation. 
Review of Economics and Statistics 82, 540–554. Garratt, A., Lee, K., Mise, E., Shields, K., 2008. Real time representations of the output gap. Review of Economics and Statistics 90, 792–804. Garratt, A., Lee, K., Mise, E., Shields, K., 2009. Real time representations of the UK output gap in the presence of model uncertainty. International Journal of Forecasting 25, 81–102. Giacomini, R., White, H., 2006. Tests of conditional predictive ability. Econometrica 74, 1545–1578. Gorman, W.M., 1956. Demand for related goods. Discussion paper. Agricultural Experimental Station, Iowa. Granger, C.W.J., 1989. Combining forecasts—twenty years later. Journal of Forecasting 8, 167–173. Granger, C.W.J., Pesaran, M.H., 2000a. A decision-theoretic approach to forecast evaluation. In: Chon, W.S., Li, W.K., Tong, H. (Eds.), Statistics and Finance: An Interface. Imperial College Press, London, pp. 261–278. Granger, C.W.J., Pesaran, M.H., 2000b. Economic and statistical measures of forecasting accuracy. Journal of Forecasting 19, 537–560. Hansen, P.R., Timmermann, A., 2011. Choice of sample split in forecast evaluation. Working paper. Economics Department, Stanford University. Hecq, A., Jacobs, J.P.A.M., 2009. On the VAR-VECM representation of real time data. Discussion paper. Mimeo. University of Maastricht, Department of Quantitative Economics. Hendry, D.F., 2009. The methodology of empirical econometric modeling: applied econometrics through the looking-glass. In: Mills, T.C., Patterson, K.D. (Eds.), Palgrave Handbook of Econometrics. Palgrave MacMillan, Basingstoke, pp. 3–67. Hendry, D.F., Clements, M.P., 2004. Pooling of forecasts. Econometrics Journal 7, 1–31. Hendry, D.F., Johansen, S., Santos, C., 2008. Automatic selection of indicators in a fully saturated regression. Computational Statistics 33, 317–335; Computa- tional Statistics 33, 337–339 (erratum). Hendry, D.F., Krolzig, H.-M., 2005. The properties of automatic Gets modelling. Economic Journal 115, C32–C61. Hendry, D.F., Mizon, G.E., 2012. Open-model forecast-error taxonomies. In: Chen, X., Swanson, N.R. (Eds.), Recent Advances and Future Directions in Causality, Prediction, and Specification Analysis. Springer, New York, pp. 219–240. Holt, C.C., 1957. Forecasting seasonals and trends by exponentially weighted moving averages. ONR Research Memorandum 52. Carnegie Institute of Technology, Pittsburgh. http://dx.doi.org/doi:10.2202/1941-1928.1097 http://dx.doi.org/doi:10.2202/1941-1928.1097 http://dx.doi.org/doi:10.2202/1941-1928.1097 http://dx.doi.org/doi:10.1111/j.1468{\T1\ndash }0084.2012.00727.x http://dx.doi.org/doi:10.1016/j.ijforecast.2011.09.003 J.L. Castle et al. / Journal of Econometrics 177 (2013) 305–319 319 Howrey, E.P., 1978. The use of preliminary data in economic forecasting. The Review of Economics and Statistics 60, 193–201. Howrey, E.P., 1984. Data revisions, reconstruction and prediction: an application to inventory investment. The Review of Economics and Statistics 66, 386–393. Johansen, S., Nielsen, B., 2009. An analysis of the indicator saturation estimator as a robust regression estimator. In: Castle and Shephard (2009), pp. 1–36. Joreskog, K.G., 1967. Some contributions to maximum likelihood factor analysis. Psychometrika 32. Kishor, N.K., Koenig, E.F., 2012. VAR estimation and forecasting when data are subject to revision. Journal of Business and Economic Statistics 30, 181–190. Klein, L.R., 1950. Economic Fluctuations in the United States, 1921–41. In: Cowles Commission Monograph, vol. 11. John Wiley, New York. 
Klein, L.R., 1971. An Essay on the Theory of Economic Prediction. Markham Publishing Company, Chicago. Klein, L.R., Ball, R.J., Hazlewood, A., Vandome, P., 1961. An Econometric Model of the UK. Oxford University Press, Oxford. Lawley, D.N., Maxwell, A.E., 1963. Factor Analysis as a Statistical Method. Butterworth and Co., London. Leitch, G., Tanner, J.E., 1991. Economic forecast evaluation: profits versus the conventional error measures. American Economic Review 81, 580–590. Makridakis, S., Andersen, A., Carbone, R., Fildes, R., et al., 1982. The accuracy of extrapolation (time series) methods: results of a forecasting competition. Journal of Forecasting 1, 111–153. Makridakis, S., Hibon, M., 2000. The M3-competition: results, conclusions and implications. International Journal of Forecasting 16, 451–476. McConnell, M.M., Perez-Quiros, G.P., 2000. Output fluctuations in the United States: what has changed since the early 1980s? American Economic Review 90, 1464–1476. Moench, E., Ng, S., Potter, S., 2009. Dynamic hierarchical factormodels. Staff reports, no. 412, Federal Reserve Bank of New York. Patterson, K.D., 1995. An integrated model of the data measurement and data generation processes with an application to consumers’ expenditure. Economic Journal 105, 54–76. Patterson, K.D., 2003. Exploiting information in vintages of time-series data. International Journal of Forecasting 19, 177–197. Peña, D., Poncela, P., 2004. Forecasting with nonstationary dynamic factor models. Journal of Econometrics 119, 291–321. Persons,W.M., 1924. The Problem of Business Forecasting. In: Pollak Foundation for Economic Research Publications, vol. 6. Pitman, London. Pesaran, M.H., Pettenuzzo, D., Timmermann, A., 2006. Forecasting time series subject to multiple structural breaks. Review of Economic Studies 73, 1057–1084. Pesaran, M.H., Schuerman, T., Smith, L.V., 2009. Forecasting economic and financial variables with global VARs. International Journal of Forecasting 25, 642–675. Pesaran,M.H., Skouras, S., 2002. Decision-basedmethods for forecast evaluation. In: Clements and Hendry (2002), pp. 241–267. Pesaran, M.H., Timmermann, A., 1992. A simple nonparametric test of predictive performance. Journal of Business and Economic Statistics 10, 461–465. Pesaran, M.H., Timmermann, A., 2005. Small sample properties of forecasts from autoregressive models under structural breaks. Journal of Econometrics 129, 183–217. Pesaran, M.H., Timmermann, A., 2007. Selection of estimation window in the presence of breaks. Journal of Econometrics 137, 134–161. Sargent, T.J., 1989. Two models of measurements and the investment accelerator. Journal of Political Economy 97, 251–287. Schumacher, C., Breitung, J., 2008. Real-time forecasting of German GDP based on a large factor model with monthly and quarterly data. International Journal of Forecasting 24, 386–398. Smets, F., Wouters, R., 2003. An estimated stochastic dynamic general equilibrium model of the Euro Area. Journal of the European Economic Association 1, 1123–1175. Smith, B.B., 1927. Forecasting the volume and value of the cotton crop. Journal of the American Statistical Association 22, 442–459. Smith, B.B., 1929. Judging the forecast for 1929. Journal of the American Statistical Association 24, 94–98. Spearman, C., 1927. The Abilities of Man. Macmillan, London. Stock, J.H., Watson, M.W., 1989. New indexes of coincident and leading economic indicators. NBER Macro-Economic Annual 351–409. Stock, J.H., Watson, M.W., 1999. 
A comparison of linear and nonlinear models for forecasting macroeconomic time series. In: Engle, R.F., White, H. (Eds.), Cointegration, Causality and Forecasting. Oxford University Press, Oxford, pp. 1–44. Stock, J.H., Watson, M.W., 2003. How did leading indicator forecasts perform during the 2001 recession? Federal Reserve Bank of Richmond, Economic Quarterly 89 (3), 71–90. Stock, J.H., Watson, M.W., 2007. Why has U.S. inflation become harder to forecast? Journal of Money, Credit and Banking 39 (Suppl.), 3–33. Stock, J.H., Watson, M.W., 2009. Forecasting in dynamic factor models subject to structural instability. In: Castle and Shephard (2009) (Chapter 7). Stock, J.H., Watson, M.W., 2010. Modelling Inflation After the Crisis. In: NBER Working Paper Series, vol. 16488. Stock, J.H., Watson, M.W., 2011. Dynamic factor models. In: Clements and Hendry (2011) (Chapter 2). Stone, J.R.N., 1947. On the interdependence of blocks of transactions. Journal of the Royal Statistical Society 8 (Supplement), 1–32. Swanson, N.R., van Dijk, D., 2006. Are statistical reporting agencies getting it right? Data rationality and business cycle asymmetry. Journal of Business and Economic Statistics 24, 24–42. Tibshirani, R., 1996. Regression shrinkage and selection via the LASSO. Journal of the Royal Statistical Society, B 58, 267–288. Tinbergen, J., 1930. Determination and interpretation of supply curves: an example (Bestimmung und Deutung von Angebotskurven: ein Beispiel). Zeitschrift für Nationalökonomie 1, 669–679. Tinbergen, J., 1951. Business Cycles in the United Kingdom 1870–1914. North-Holland, Amsterdam. Waelbroeck, J.K. (Ed.), 1976. The Models of Project LINK. North-Holland Publishing Company, Amsterdam. Wallis, K.F., 1989. Macroeconomic forecasting: a survey. Economic Journal 99, 28–61. Watson, M.W., 2007. How accurate are real-time estimates of output trends and gaps? Federal Reserve Bank of Richmond Economic Quarterly 93, 143–161. West, K.D., 1996. Asymptotic inference about predictive ability. Econometrica 64, 1067–1084. West, K.D., McCracken, M.W., 1998. Regression-based tests of predictive ability. International Economic Review 39, 817–840. Winters, P.R., 1960. Forecasting sales by exponentially weighted moving averages. Management Science 6, 324–342. Zarnowitz, V., Boschan, C., 1977. Cyclical indicators: an evaluation and new leading indexes. In: Handbook of Cyclical Indicators. U.S. Department of Commerce, Washington, USA, pp. 170–183.

CastleHendry-2010Oct-ALowDimensionPortmanteauTest-JEctrics-v158n2

Journal of Econometrics 158 (2010) 231–245

A low-dimension portmanteau test for non-linearity

Jennifer L.
Castle, David F. Hendry ∗ Department of Economics, Oxford University, United Kingdom a r t i c l e i n f o Article history: Available online 18 January 2010 JEL classification: C51 C52 Keywords: Functional form Portmanteau test Non-linearity Principal components Collinearity a b s t r a c t A new test for non-linearity in the conditional mean is proposed using functions of the principal components of regressors. The test extends the non-linearity tests based on Kolmogorov–Gabor polynomials (Thursby and Schmidt, 1977; Tsay, 1986; Teräsvirta et al., 1993), but circumvents problems of high dimensionality, is equivariant to collinearity, and includes exponential functions, so is a portmanteau test with power against a wide range of possible alternatives. A Monte Carlo analysis compares the performance of the test to the optimal infeasible test and to alternative tests. The relative performance of the test is encouraging: the test has the appropriate size and has high power in many situations. © 2010 Elsevier B.V. All rights reserved. 1. Introduction It is a great pleasure to contribute a paper in honor of Phoebus Dhrymes, whose many major publications have helped establish econometrics in its modern form. Our topic is the validity of the functional form of a model, which is an essential component of its correct specification: both are topics on which Phoebus has published (see e.g., Dhrymes (1966), and Dhrymes et al. (1972)). Nevertheless, we consider that a test for non-linearity is required that can evaluate the ‘goodness’ of a postulated model against general non-linear alternatives, particularly in the context where there would be more non-linear terms to include than available observations. Evidence of non-linearity implies that an alternative functional form should be utilized. Consequently, we propose a general portmanteau test for non-linearity, which is designed to accommodate large numbers of potential regressors, so is applicable prior to undertaking model estimation or selection. Our proposed test is based on a third-order polynomial with additional exponential functions, formed from the principal components of the original variables, which allows us to obtain a highly flexible non-linear approximation yet in a parsimonious formulation. If there are (say) n = 10 linear variables, there are 55 quadratic functions and 220 cubics, adding 275 variables to test for a general third-order polynomial, whereas our proposed test would require just 30 functions yet can check exponentials aswell; ∗ Corresponding address: Nuffield College, New Road, OX1 1NF Oxford, United Kingdom. Tel.: +44 0 1865 281162. E-mail address: david.hendry@nuffield.ox.ac.uk (D.F. Hendry). 0304-4076/$ – see front matter© 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.jeconom.2010.01.006 for 30 linear regressors, an unmanageable 5425 additional terms would be required—as against 90. Many tests of linearity have been proposed in the literature: see Granger and Teräsvirta (1993, Ch. 6) for an overview. Ramsey (1969) proposed tests for specification errors in regression, including unmodeled non-linearity, based on adding powers of the fitted values: Doornik (1995) provided a careful evaluation of both the numerical and statistical properties of the RESET test. 
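Since several of the tests reviewed next build on RESET, a brief sketch of the Ramsey (1969) idea may help fix notation: fit the linear model, add powers of its fitted values, and F-test their joint significance. The simulated data, the choice of squares and cubes only, and the use of scipy for the p-value are our own illustrative choices, not the paper's implementation.

import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
T, n = 200, 3
x = rng.normal(size=(T, n))
y = 1.0 + x @ np.array([0.5, -0.3, 0.2]) + 0.4 * x[:, 0] ** 2 + rng.normal(scale=0.5, size=T)

def rss(X, y):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    return e @ e

# Restricted model: linear only. Unrestricted: add squares and cubes of the fitted values.
X0 = np.column_stack([np.ones(T), x])
yhat = X0 @ np.linalg.lstsq(X0, y, rcond=None)[0]
X1 = np.column_stack([X0, yhat ** 2, yhat ** 3])
q, k1 = 2, X1.shape[1]
F = ((rss(X0, y) - rss(X1, y)) / q) / (rss(X1, y) / (T - k1))
print("RESET F-statistic:", round(F, 2), " p-value:", round(1 - stats.f.cdf(F, q, T - k1), 4))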
Thursby and Schmidt (1977) extended the RESET test by proposing the use of powers and cross products of explanatory variables, as well as another variant that used the principal components of the explanatory variables: simulation studies across a range of specifications suggested that the preferred test was one based on squared, cubic and quartic powers of the explanatory variables. Lee et al. (1993) utilized these generalized RESET tests, which took principal components of the polynomials, in their simulation study to control for collinearity. Keenan (1985) developed a univariate test for detecting non-linearity based on a Volterra expansion, an analog of the Tukey (1949) one degree-of-freedom test for non- additivity: the test checks whether the squared fit is correlated with the estimated residuals, so is a variant of the Ramsey (1969) test. Tsay (1986) improved on the power of the Keenan (1985) test by allowing for disaggregated non-linear variables (all n(n + 1)/2 cross products for n linear variables), based on a Kolmogorov–Gabor polynomial. Bierens (1990) developed a conditional moment test of functional form. White (1989) and Lee et al. (1993) proposed a neural network test for neglected non-linearity, in which they undertook a Lagrange multiplier test for hidden activation functions, typically logistic functions, after pre-filtering using an AR(p)model selected by BIC. As the hidden unit activations, denoted 9t , tended to http://www.elsevier.com/locate/jeconom http://www.elsevier.com/locate/jeconom mailto:david.hendry@nuffield.ox.ac.uk http://dx.doi.org/10.1016/j.jeconom.2010.01.006 232 J.L. Castle, D.F. Hendry / Journal of Econometrics 158 (2010) 231–245 be collinear with the regressors, they suggested using principal components of the9t , andwe compare a variant of this idea to our proposed test. Teräsvirta et al. (1993) showed that such a test had low power in many situations, and was not robust to the inclusion of an intercept. They proposed a related test based on a third-order polynomial that approximated the single hidden-layer artificial neural network model of Lee et al. (1993), but had a more general application. This test will suffer when there are many regressors, as the number of non-linear functions, Cn = n(n + 1)(n + 5)/6, increases rapidly with n. (White, 1982, 1992 ch. 9) proposed a range of dynamic informationmatrix tests based on the covariance of the conditional score functions. The tests have power against mis-specifications that induce autocorrelation in the conditional scores. Brock et al. (1987) developed a non-parametric test for serial independence based on the correlation integral of a scalar series, which emerged from the chaos theory literature, and McLeod and Li (1983) developed a test against ARCH which has power against non- linearity in the conditional mean, and which is asymptotically equivalent to the LM ARCH test developed by Engle (1982). Finally, the White (1980) test for heteroskedasticity can be considered as a test of linearity in which the alternative hypothesis is a doubly stochastic model given by: yt = α′txt + εt , εt ∼ IN [ 0, σ 2ε ] (1) with xt = (x1,t , . . . , xn,t)′, αt ∼ IIDn[α,6α], where 6α is positive definite and E[αtεs] = 0,∀s, t . The test adds all the squares of regressors, or squares and cross-products, to test for heteroskedasticity, implicitly testing for omitted non-linearity as well: see Spanos (1986, p. 466). 
This has been investigated by numerous authors, including a recent appraisal in Hendry and Krolzig (2003) in the context of model selection, and has direct parallels with the approach we propose. Although the literature on non-linearity testing is substantial, our test is designed for a setting that is not yet handled well, namely, high dimensional, possibly relatively collinear, specifications where the functional form of the non-linearity is not known a priori. Standard tests based on second-order polynomials such as White (1980) and Tsay (1986) face several practical drawbacks: first, their increasing dimensionality with n; secondly, the potentially high collinearity between powers of regressors that are slowly changing (see Phillips (2007), for a time series analog) and third, the possibility that the second derivative is not the source of the departure from linearity. To rectify these potential drawbacks, our test first forms the n standardized, mutually-orthogonal principal components zt of the original n linear regressors xt , with weights given by the eigenvectors of their estimated variance–covariance matrix. Then, for fixed regressors, adding the squares, z2i,t , cubes, z 3 i,t , and exponential functions of the zi,t , yields an F-test with 3n degrees of freedom, jointly resolving the problems of high dimensionality, collinearity, and restriction to second-order departures. The corresponding ‘unrestricted’ function of all these terms would involve adding Rn = Cn + n elements, which is often infeasible, and is usually low powered for large n. When the regressors are weakly exogenous (see Engle et al., 1983), an approximate F-test results in stationary models. The structure of the paper is as follows. Section 2 considers functional-form testing, and proposes a test for situations where many general tests are infeasible. Section 3 discusses the statistical power of the proposed test, computes the non-centrality, and examines its power approximations for scalar polynomials. Section 4 describes the range of alternative tests we will compare against, then Section 5 undertakes a set of Monte Carlo experiments to examine the comparative power of the test for various IID data generation process (DGP) designs. Testing for non-linearity in dynamic models is considered in Section 6, and Section 7 concludes. 2. Testing functional form Given the general functional relationship for t = 1, . . . , T : yt = f ( x1,t , . . . , xn,t ) + εt εt ∼ IN [ 0, σ 2ε ] (2) the linear approximation is: yt = β0 + β1x1,t + · · · + βnxn,t + et = β0 + β′xt + et (3) where xt needs to be weakly exogenous for β. To evaluate (3) requires testing the validity of the functional form approximation. We first consider the case where f (xt) is quadratic, which highlights the key considerations, then consider a more general approximation that avoids the three drawbacks of dimensionality, collinearity and only quadratic departures, by developing an Index test. 2.1. Testing against a quadratic First, consider the optimal test for linearity when f (xt) is the exact quadratic: f (xt) = β0 + β′xt + γ x′tAxt (4) andA is knownup to a constant factor and symmetric (without loss of generality), so x′tAxt = ut is known and γ 6= 0. Then when (4) holds: yt = β0 + β′xt + γ ut + εt (5) so a t-test of H0 : γ = 0 in (5) will be the most powerful test for non-linearity. 2.2. A quadratic approximation test However, as A is unknown in general, a ‘natural’ test is to consider the importance of adding all the quadratic terms to (3). 
Let: wt = vech ( xtx′t ) (6) where vech vectorizes and selects the non-redundant elements from the lower triangle including the diagonal of the outer product. Then, an exact test for fixed regressors would be an Fn(n+1)/2T−(n+2)(n+1)/2- test of the null δ1 = 0 in: yt = β0 + β′xt + δ′1wt + et . (7) When A is unknown, (7) provides one operational counterpart (analogous to the third-order polynomial test proposed by Spanos (1986, p. 460)). However, for a non-collinear test, let: xt ∼ Dn [µ,�] (8) where � is the symmetric, positive-definite variance–covariance matrix. Factorize� = H3H′, whereH is thematrix of eigenvectors of � and 3 the corresponding eigenvalues, so H′H = In. Since 3−1/2H′�H3−1/2 = In, let: z∗t = 3 −1/2H′ (xt − µ) ∼ Dn [0, I] . (9) Replacing xt by z∗t in (6) and (7) does not affect the test of δ1 = 0, merely transforming the data to its complete set of principal components, so it is equivariant to collinearity, a property that will prove useful below. 2.3. A low-dimensional quadratic test Instead of including all of the quadratic terms, let u∗,i,t = (z∗i,t) 2, then under the null that γ = 0 in (4), the test of κ1 = 0 in: yt = β0 + β′xt + κ′1u∗,t + vt (10) is an exact F-test with n degrees of freedom, for fixed regressors. There are only n elements in u∗,t , so relative to one in ut in (5), J.L. Castle, D.F. Hendry / Journal of Econometrics 158 (2010) 231–245 233 power will be lower. However, n is many fewer terms than n(n + 1)/2 in wt in (7), yet every element in u∗,t potentially depends on the squares and cross-products of every xi,t . Thus, the first and second objectives—effecting amajor dimensionality reduction and formulating a test in terms of non-collinear variables—have been achieved. The test based on (10) is equivariant to whether the linear terms are xt or z∗t , and if transformed to the latter, corresponds to adding their squares (see e.g., White, 1980). Since� is unknown, a symmetric, positive-definite estimate �̂ is used when calculating principal components, denoted zt so: zt = 3̂ −1/2Ĥ′ (xt − µ̂) ãppDn [0, I] . (11) Then the zi,t in (11) are the standardized, mutually-orthogonal combinations of the original xi,t to be used for the operational quadratic version of the test via u1,i,t = z2i,t . 2.4. Testing against more general alternatives To accommodate the third drawback, and generalize the test for higher-derivative departures from the null, we also include u2,i,t = z3i,t to capture skewness. Since polynomial approximations may be slow to converge when the approximated series involves exponentials, and tend not to be a parsimonious approximation for other functional forms, Abadir (1999) suggested confluent hypergeometric functions as a more general class. We examined simple exponential functions such as {ezi,t }, {ezi,t zi,t} and {e−|zi,t |} but these yielded almost no additional information over the polynomial approximation for a range of alternatives. However, a product like u3,i,t = e−|zi,t |zi,t does yield additional flexibility as well as help capture overall asymmetry, with the possible disadvantage that it is not differentiable. Since the zi,t are standardized as in (11), both signs occur equally often on average, so adding u1,i,t , u2,i,t , u3,i,t as in (12) yields the low-dimensional portmanteau F3nT−(4n+1)-test of κ1 = κ2 = κ3 = 0 in: yt = β0 + β′xt + κ′1u1,t + κ ′ 2u2,t + κ ′ 3u3,t + et . (12) Under the null, et = εt , so the test is distributed as F for fixed regressors. 
Under the alternative: yt = β0 + n∑ i=1 ( βizi,t + κ1,iz2i,t + κ2,iz 3 i,t + κ3,izi,te −|zi,t | ) + et ≈ β0 + n∑ i=1 ( βizi,t + γizi,t ( 1− ∣∣zi,t ∣∣) + κ1,iz2i,t + κ2,iz 3 i,t + θiz 3 i,t ( 1− 1 3 ∣∣zi,t ∣∣))+ et . (13) Thus, the approximation is quite flexible: the model is constant if β = κ1 = κ2 = κ3 = 0; linear if κ1 = κ2 = κ3 = 0; quadratic if κ2 = κ3 = 0; cubic if κ3 = 0; ‘bi-linear with bi- quadratic’ if β = κ2 = 0; and fairly general if all the coefficients are non-zero. The non-linear bases included in (13) are recorded in Fig. 1: a wide range of possible non-linearities can be detected by combinations of the three proposed here. e−φz 2 i zi was considered as well, but yielded almost no additional information over the exponential product included, as can be seen from replacing |zi,t | by z2i,t in (13). 1 We call this joint F-test the Index-test, as only 3n additional regressors are needed, against Rn if all the terms were entered unrestrictedly. Despite including terms for quadratic, cubic 1 As ezi,t = 1 + zi,t + 1 2 z 2 i,t + 1 6 z 3 i,t + 1 24 z 4 i,t + O(z 5 i,t ), and z 2 i,t and z 4 i,t tend to be highly correlated, exp(zi,t ) adds little to a cubic approximation. and exponential departures as shown, the Index-test remains computable provided 4n < T , whereas Rn < T would be needed unrestrictedly (e.g., allowing n ' 24 versus n ' 6 at T = 100). We next investigate its null distribution, then consider its power in several canonical settings. 2.5. Null distributions Since zt is just an orthogonal transformation of xt , the Index- test needs the same assumptions as almost all the other tests considered in Section 4 (other than RESET: see Caceres (2007)). To check that the Index-test is distributed as an F3nT−4n−1 in finite samples even when there would be more variables than observations in the unrestricted test, so that Rn > T , we
examined its QQ-plots under the null for the static experiments in
Section 5.1. The results closely corresponded to an F-distribution.
Collinearity between regressors did not affect the null distribution.
We also checked the null distributions for fixed regressors of these
alternative tests, including the RESET test, V23 (Teräsvirta et al.,
1993), principal components of V23, denoted PCV23, V2 (Tsay,
1986), and principal components of V2, PCV2, confirming that all of
these tests are distributed as F under the null with the appropriate
degrees of freedom.² In stationary dynamic processes with weakly
exogenous regressors, the Index-test is only an F^{3n}_{T−4n−1} in large
samples.
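
To fix ideas, the following sketch shows how the Index-test of (12) can be computed from data. It is our own illustration in Python with numpy/scipy (the paper's simulations use Ox), and all function and variable names are ours, not the authors'.

```python
import numpy as np
from scipy import stats

def index_test(y, X):
    """Portmanteau Index-test for non-linearity (a sketch of eq. (12)).

    y : (T,) response; X : (T, n) linear regressors (no constant column).
    Adds z^2, z^3 and z*exp(-|z|), where z are the standardized principal
    components of X, and returns the F-statistic and p-value.
    """
    T, n = X.shape
    Xc = X - X.mean(axis=0)
    # eigen-decomposition of the estimated variance matrix, as in eq. (11)
    lam, H = np.linalg.eigh(np.cov(Xc, rowvar=False))
    Z = (Xc @ H) / np.sqrt(lam)                 # standardized, mutually orthogonal
    U = np.column_stack([Z**2, Z**3, Z * np.exp(-np.abs(Z))])   # 3n auxiliary terms

    def rss(W):
        W1 = np.column_stack([np.ones(T), W])
        resid = y - W1 @ np.linalg.lstsq(W1, y, rcond=None)[0]
        return resid @ resid

    rss0 = rss(X)                               # linear model (3)
    rss1 = rss(np.column_stack([X, U]))         # augmented model (12)
    q, df2 = 3 * n, T - (4 * n + 1)
    F = ((rss0 - rss1) / q) / (rss1 / df2)
    return F, stats.f.sf(F, q, df2)
```

Passing only the squared components (dropping the cubes and exponential products) would give the n-degree-of-freedom quadratic variant discussed above.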

3. Test power

Thus, the key issue is the comparative powers of the various
tests to reject the null when it is false in a range of states of nature.
We first consider scalar polynomials, then a vector of quadratic
regressors, which also highlights when the quadratic Index-test
would have no power, and finally turn to the general portmanteau
test.

3.1. Power approximations for polynomials

The simplest polynomial DGPs are given by:

y_t = β_j x_t^j + ε_t   (14)

where ε_t ∼ IN[0, σ_ε^2] and x_t ∼ IN[µ, 1], for t = 1, ..., T, and
j = 1, ..., 4, such that the four cases are a linear, quadratic, cubic
and quartic function. The distribution of the t^2-test of H_0: β_j = 0
conditional on x_t is:

β̂_j^2 \sum_{t=1}^{T} (x_t^j)^2 / σ̂_ε^2 ≈ χ_1^2(ϕ_j^2)   (15)

with the non-centrality parameter:

ϕ_j^2 = β_j^2 \sum_{t=1}^{T} (x_t^j)^2 / σ_ε^2.   (16)

Rather than use the specific value of \sum_{t=1}^{T} (x_t^j)^2 in the simulations,
we have approximated it by T E[(x_t^j)^2] when calculating ϕ_j^2 where,
using normality, E[(x_t)^2] = 1 + µ^2, E[(x_t^2)^2] = 3 + 6µ^2 + µ^4,
E[(x_t^3)^2] = 15 + 45µ^2 + 15µ^4 + µ^6, and E[(x_t^4)^2] = 105 + 420µ^2 + 210µ^4 + 28µ^6 + µ^8.
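
As a worked illustration of (16) and the χ² approximation to the power function, consider the sketch below; it is our own code (Python/scipy rather than the authors' Ox), and the function names are assumptions for illustration only.

```python
import numpy as np
from scipy import stats

def second_moment(j, mu):
    """E[(x_t^j)^2] for x_t ~ N(mu, 1), j = 1,...,4 (formulas given above)."""
    return {1: 1 + mu**2,
            2: 3 + 6*mu**2 + mu**4,
            3: 15 + 45*mu**2 + 15*mu**4 + mu**6,
            4: 105 + 420*mu**2 + 210*mu**4 + 28*mu**6 + mu**8}[j]

def analytic_power(beta_j, j, T=100, mu=0.0, sigma2_eps=1.0, alpha=0.05):
    """Approximate power of the t^2-test of H0: beta_j = 0 in (14)."""
    phi2 = beta_j**2 * T * second_moment(j, mu) / sigma2_eps   # eq. (16)
    cv = stats.chi2.ppf(1 - alpha, df=1)                       # central chi^2 critical value
    return stats.ncx2.sf(cv, df=1, nc=phi2)                    # P(chi^2_1(phi2) > cv)

# Quadratic design of Table 1 (beta_3 = 0.1732 on x^2): phi = 3, giving a 5%
# power of about 0.85, in the region of the ~0.87 analytic value quoted later.
print(analytic_power(0.1732, 2))
```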

2 The labels follow the notation of Teräsvirta et al. (1993), where V refers to a
Volterra expansion.


Fig. 1. Quadratic, cubic, exponential and composite functions.

Simulated power functions when T = 100 and σ_ε^2 = 1 for
the four DGPs are recorded in Fig. 2, based on 10,000 replications,
for each value of ϕ_j = 1, ..., 6, along with the analytic power
functions, calculated by approximating F^k_{T−k} by a χ^2 with k degrees
of freedom, F^k_{T−k}(ϕ^2) ≈ χ^2_k(ϕ^2), and relating that non-central χ^2
distribution to a central χ^2 (as in e.g., Hendry, 1995, p. 475). When
µ = 0 (panels a and b), the divergence between the analytic
and Monte Carlo test powers is notable for the cubic and quartic,
particularly for intermediate non-centralities. This is due to the
high skewness and kurtosis of the distribution of x_t^j, which impacts
on the distribution of the t-statistic under the alternative, as the
sample mean of the (x_t^j)^2 can be far from E[(x_t^j)^2] in (16). Because
of this effect, at a given non-centrality, it is more difficult to detect
a higher-order term – even if its form is known – when means are
zero (or y_t depends on (x_t − µ)^j). Although the mean is arbitrary
in most economic data series, (14) is not equivariant to the mean,
unlike (3), as even for a quadratic:

y_t = β_2 x_t^2 + ε_t = β_2 µ^2 + 2β_2 µ(x_t − µ) + β_2(x_t − µ)^2 + ε_t.

Panels c and d show the much better match for µ ≠ 0. Also, while
(16) still holds, non-centralities diverge faster at higher powers as
µ increases. Combinations of polynomial functions like (14) lead
to F-tests, and in these scalar cases are identical to using z_t.

3.2. Vector of quadratic regressors

For a vector of regressors, let β0 = 0 and take µ = 0 in (8),
so that all linear terms have means of zero (or are deviations from
sample means). Under the alternative that γ ≠ 0 in (4), the test of
κ_1 = 0 in (10) will have power against quadratic departures that
are not orthogonal to u_t in (5) as follows. Since Ω = HΛH', from (9):

y_t = β'x_t + γ x_t'Ax_t + ε_t = β'x_t + γ (z*_t)'(Λ^{1/2} H'AH Λ^{1/2}) z*_t + ε_t.   (17)

Let Λ^{1/2} H'AH Λ^{1/2} = Υ* + D*, where Υ* is diagonal and D* is non-diagonal with a zero diagonal, so:

y_t = β'x_t + γ (z*_t)'(Υ* + D*) z*_t + ε_t   (18)

then:

y_t = β'x_t + γ (z*_t)'Υ* z*_t + γ (z*_t)'D* z*_t + ε_t = β'x_t + κ_1'u*_t + v_t   (19)

(say), yielding the test of κ_1 = 0 in (10). Such a test is not optimal
if D* ≠ 0, as then the cross-products of the z*_{i,t} matter, although
by construction only those components that are orthogonal to u*_t
will be omitted from (19). The 'closer' D* is to 0, the less the power
loss. Additional terms from the next sub and super diagonals could
be added as a check when n is small; or, going in the opposite
direction, a scalar test could be constructed using \sum_{i=1}^{n} (z*_{i,t})^2 as
a single regressor, as in Tukey (1949). The operational equivalent
replaces z*_t by z_t, so tests κ_1 = 0 in:

y_t = β'x_t + γ z_t'Υ z_t + e_t = β'x_t + κ_1'u_{1,t} + e_t.   (20)
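
A small numerical sketch (ours, not from the paper) of the decomposition behind (17)–(20): the quadratic form x_t'Ax_t is re-expressed exactly in the z*-basis, where only its diagonal part Υ* is spanned by the squares u*_t, and the off-diagonal remainder D* is what the quadratic Index-test can miss.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n)); A = (A + A.T) / 2          # arbitrary symmetric A
Omega = np.full((n, n), 0.9); np.fill_diagonal(Omega, 1.0)  # collinear design

lam, H = np.linalg.eigh(Omega)
M = np.diag(np.sqrt(lam)) @ H.T @ A @ H @ np.diag(np.sqrt(lam))  # Lambda^{1/2} H'AH Lambda^{1/2}
Upsilon = np.diag(np.diag(M))      # diagonal part: detected by the squares z_i^2
D = M - Upsilon                    # off-diagonal remainder, orthogonal to u*_t

x = rng.multivariate_normal(np.zeros(n), Omega)
z = np.diag(1 / np.sqrt(lam)) @ H.T @ x
print(x @ A @ x, z @ (Upsilon + D) @ z)   # identical quadratic forms, as in eq. (17)
```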

3.3. Powerless cases

The quadratic Index-test will have no power when the
departure from linearity is in the direction of u_{1,t,⊥}, which requires
Υ = 0_n when D ≠ 0 in (20). This may occur if the x_{i,t}s are
perfectly orthogonal, such that Ω = Λ, and the non-linearity
only enters in the form of a cross product, whereas the principal
components would not include cross-product terms. However, the
practical relevance of such a case seems limited. Note that the form
of any non-centrality due to the non-linear functions of x_t being
omitted from (3) changes with the mapping to z_t, maintaining the
test's equivariance to collinearity.
The quadratic Index-test would also have no power if the
second derivative of f(·) were zero, but the third was non-zero,
a case to which we now turn.

3.4. General Index-test power

The possibility that the first non-zero derivative is the third is
precisely the reason for including the additional terms u2,t and
u3,t in (12). Even if the first non-zero derivative is the fourth, the
general Index-test in (12)will have power, as the second derivative
will almost certainly be correlatedwith the fourth; similarly for the
third with the fifth.


Fig. 2. Analytic and Monte Carlo power functions for a single polynomial function.

When the non-linearity takes the form of a ‘squashing function’
like an ogive, the exponential component is likely to prove useful.
There is a trade-off between the degrees of freedom and the value
of the test’s non-centrality parameter. When the non-centralities
of the individual terms are small and n is large, a test with Rn
degrees of freedom would have very low power. For the general
Index-test, the overall power depends on combinations of these
non-centralities for each variable and each polynomial power, as
well as the exponential term, an issue we now explore by Monte
Carlo experiments.

4. Alternative tests

There is a wide range of alternative linearity tests, see Granger
and Teräsvirta (1993) for a summary. We consider a subset that
aims to test against a general non-linear alternative. Consider
a univariate process yt which is a function of a set of n
potential linear regressors xi,t , i = 1, . . . , n, and a set of non-
linear functions, denoted g(·). The non-linearity tests computed
include:

1. Optimal test, H_0: φ = 0 for:

y_t = β_0 + \sum_{i=1}^{n} β_i x_{i,t} + φ g(·) + ν_t.   (21)

2. A second-order Kolmogorov–Gabor polynomial test, which, applying the Frisch and Waugh (1933) theorem, is identical to the Tsay (1986) test and is analogous to the White (1980) test, H_0: δ = 0 for:

y_t = β_0 + \sum_{i=1}^{n} β_i x_{i,t} + \sum_{j=1}^{n} \sum_{k=j}^{n} δ_{jk} x_{j,t} x_{k,t} + η_t.   (22)

This test has n(n + 1)/2 degrees of freedom, and is denoted V2.

3. A third-order Kolmogorov–Gabor polynomial test, proposed by Spanos (1986, p. 460) and Teräsvirta et al. (1993), H_0: δ = ϑ = 0 for:

y_t = β_0 + \sum_{i=1}^{n} β_i x_{i,t} + \sum_{j=1}^{n} \sum_{k=j}^{n} δ_{jk} x_{j,t} x_{k,t} + \sum_{j=1}^{n} \sum_{k=j}^{n} \sum_{l=k}^{n} ϑ_{jkl} x_{j,t} x_{k,t} x_{l,t} + ζ_t.   (23)

This test has n(n + 1)(n + 5)/6 degrees of freedom, and is denoted V23.

4. Principal components of the second-order polynomial test, using the first k principal components of (6), denoted w̃_t, and test H_0: π = 0 for:

y_t = β_0 + \sum_{i=1}^{n} β_i x_{i,t} + \sum_{j=1}^{k} π_j w̃_{j,t} + ξ_t.   (24)

We set k = 3n to correspond to the Index-test and the test is denoted PCV2.

5. Principal components of the third-order polynomial test based on the first k principal components of s_t = vech[(x_t x_t') ⊗ x_t'], denoted s̃_t: H_0: τ = 0 for:

y_t = β_0 + \sum_{i=1}^{n} β_i x_{i,t} + \sum_{j=1}^{k} τ_j s̃_{j,t} + ς_t.   (25)

We set k = 3n to compare to the Index-test, and the test is denoted PCV23.

6. Index-test, H_0: γ = θ = ξ = 0 for:

y_t = β_0 + \sum_{i=1}^{n} β_i x_{i,t} + γ'u_{1,t} + θ'u_{2,t} + ξ'u_{3,t} + e_t,   (26)

with 3n degrees of freedom. Variants of the test with just the quadratic, cubic, and exponential functions are also considered, each with n degrees of freedom. In practice, these would require knowledge of the functional form of the DGP. The benefit of the Index-test is that it is a portmanteau test that has power over a range of possible non-linear DGPs.

7. RESET test: H_0: κ_1 = κ_2 = κ_3 = 0 for:

y_t = β_0 + \sum_{i=1}^{n} β_i x_{i,t} + κ_1 ŷ_t^2 + κ_2 ŷ_t^3 + κ_3 ŷ_t^4 + υ_t,   (27)

which has 3 degrees of freedom when ŷ_t is the fitted value from the linear model.
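
For comparison, a minimal sketch of the RESET statistic in (27); this is our Python illustration only (the reported results were generated in Ox), and the function name is an assumption.

```python
import numpy as np
from scipy import stats

def reset_test(y, X, max_power=4):
    """RESET F-test: add powers 2..max_power of the linear fitted values."""
    T = len(y)
    X1 = np.column_stack([np.ones(T), X])
    yhat = X1 @ np.linalg.lstsq(X1, y, rcond=None)[0]
    e0 = y - yhat                                       # residuals of the linear model
    P = np.column_stack([yhat**p for p in range(2, max_power + 1)])
    X2 = np.column_stack([X1, P])
    e1 = y - X2 @ np.linalg.lstsq(X2, y, rcond=None)[0]
    q = P.shape[1]                                      # 3 degrees of freedom when max_power = 4
    df2 = T - X2.shape[1]
    F = ((e0 @ e0 - e1 @ e1) / q) / ((e1 @ e1) / df2)
    return F, stats.f.sf(F, q, df2)
```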

4.1. Principal components of non-linear functions

The number of parameters for V2 and V23 quickly becomes
large as more linear regressors are included, such that the tests are
infeasible for large n. This implies that the principal components
of V2 and V23 must also be restricted to small n. For a sample size
of T = 100, a maximum of 5 linear regressors can be included for
the V23 test and amaximum of 13 for the V2 test. The Index-test is
feasible over a wider range of n, as only 3n degrees of freedom are
needed for the most general test.
Whilst taking principal components of the linear functions

and then computing the non-linear functions delivers similar
power to taking the principal components of the non-linear
functions directly for small n, there are drawbacks to the latter
approach. No dimension reduction is achieved as the number
of principal components will equal the number of non-linear
functions. Hence, an arbitrary k is selected, either using formal
tests such as the Scree test (Cattell, 1966) or the Kaiser (1960)
criterion, or based on a preference for the degrees of freedom.
If a batch of tests is computed in which k varies, the critical
values would need to be corrected for the joint procedure—
which would require a tighter significance level for the principal
components tests than the Index-test. Most variance is collected
by the first few principal components, so a small k is often
beneficial, but discarding the PCs with smaller eigenvalues
can be problematic if the relevant non-linear regressors are
highly correlated with yt but not with other regressors. By
undertaking the dimension reduction first and then computing
the non-linear transformations, no arbitrary reduction is needed.
Further, principal components of the non-linear functions can be
computationally demanding for large n compared to the proposed
Index-test.

5. Power simulations for static DGPs

5.1. Experimental design

A range of simulations was undertaken in Ox (see Doornik
(2007)) using M = 1000 replications to examine the powers of
the Index-test for varying degrees of collinearity and numbers of
regressors. The unmodeled variables’ DGP is:

x_t ∼ IN_n[µ, Ω]   (28)

where µ_i = 0 or 10, V[x_{i,t}] = 1, ∀i, and cov[x_{i,t}, x_{j,t}] = ρ, ∀i ≠ j.
Here, n is the number of linear regressors in eachmodel, increasing
from 2 to 20 in the general unrestricted model, of which only
two (x1,t and x2,t ) are relevant. We consider up to 18 irrelevant
variables, as happens in model selection experiments (see e.g.
Hoover and Perez (1999)), so some tests are infeasible. The {yt}
DGPs considered are listed in Table 1, where (see Section 5.3.7 for
the ‘Quadratic z’ case):

yt = f (·)+ εt , εt ∼ IN [0, 1] . (29)

We consider two magnitudes of correlation, ρ = 0, 0.9, and two
sample sizes, T = 100 and 300, with the parameters of the DGP
held constant to assess the impact on power of increasing the
sample size. Parameters are set for the non-linear functions such
that the non-centrality of an individual t-test for T = 100, ρ = 0
andµ = 0 is given byψ in Table 1. Results are mainly reported for
T = 100, a 5% significance and µ = 0, due to space constraints,
but are available on request.
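
A minimal sketch of one static experiment follows, assuming the quadratic row of Table 1 and re-using the index_test function from the earlier sketch; the code, the seeds and the 5% rejection rule are our illustrative choices (Python rather than the Ox actually used).

```python
import numpy as np

def simulate_static(T=100, n=2, rho=0.0, mu=0.0, seed=None):
    """One replication of (28)-(29): correlated regressors, quadratic DGP of Table 1."""
    rng = np.random.default_rng(seed)
    Omega = np.full((n, n), rho); np.fill_diagonal(Omega, 1.0)
    X = rng.multivariate_normal(np.full(n, mu), Omega, size=T)
    eps = rng.standard_normal(T)
    # only x1 and x2 are relevant; any further columns of X are irrelevant
    y = 0.3 * X[:, 0] + 0.3 * X[:, 1] + 0.1732 * X[:, 0]**2 + eps
    return y, X

# Empirical rejection frequency of the Index-test over M replications
M, rejections = 1000, 0
for m in range(M):
    y, X = simulate_static(rho=0.9, seed=m)
    _, p = index_test(y, X)          # defined in the earlier sketch
    rejections += (p < 0.05)
print(rejections / M)
```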

5.2. Results for n = 2

Although a scenario where the model coincides with a low-
dimensional DGP is unlikely, we first consider that case to compare
the set of alternative tests.

5.2.1. Quadratic DGP
For the quadratic DGP, the optimal test of H_0: β_3 = 0 when
n = 2 has a non-centrality of ϕ_{1,α}^2 = 3Tβ_3^2, which is independent
of ρ, the degree of collinearity.
The power of the V2 test in (22) for (29) will depend on:

δ_1 x_{1,t}^2 + δ_2 x_{2,t}^2 + δ_3 x_{1,t} x_{2,t}   (30)

and the non-centrality of the test is given by³:

ϕ_{V2}^2 = T(3δ_1^2 + 3δ_2^2 + (1 + 2ρ^2)[δ_3^2 + 2δ_1δ_2] + 6ρδ_3[δ_1 + δ_2]).

When δ_2 = δ_3 = 0, the test non-centrality collapses to the
optimal test non-centrality of 3Tδ_1^2. Hence, the difference between
the optimal and V2 test will be a function of the number of degrees
of freedom alone. This applies as well to V23, and to the principal
components versions PCV2 and PCV23 if k spans the DGP non-linearity.
The Index-test is based on γ_1 z_{1,t}^2 + γ_2 z_{2,t}^2, where:

z_{1,t} = x_{1,t} + ϱ_1 x_{2,t}
z_{2,t} = x_{2,t} + ϱ_2 x_{1,t}   (31)

where ϱ depends on the eigenvalues, Λ̂, of Ω̂. Under perfect
orthogonality, Ω = I_2, which implies Λ = I_2 and H = I_2, such
that z_{1,t}^2 = x_{1,t}^2 and z_{2,t}^2 = x_{2,t}^2. When Ω ≠ I_2, z_t comprises a linear
combination of the x_t's:

z_{1,t}^2 = x_{1,t}^2 + ϱ_1^2 x_{2,t}^2 + 2ϱ_1 x_{1,t} x_{2,t}   (32)

z_{2,t}^2 = x_{2,t}^2 + ϱ_2^2 x_{1,t}^2 + 2ϱ_2 x_{1,t} x_{2,t}   (33)

so the Index-test power will depend on:

γ_1(x_{1,t}^2 + ϱ_1^2 x_{2,t}^2 + 2ϱ_1 x_{1,t} x_{2,t}) + γ_2(x_{2,t}^2 + ϱ_2^2 x_{1,t}^2 + 2ϱ_2 x_{1,t} x_{2,t}).   (34)

As the optimal test will directly test for the significance of x_{1,t}^2, the
Index-test will have highest power when:

γ_1 + γ_2 ϱ_2^2 ≈ β_3,   γ_1 ϱ_1^2 + γ_2 ≈ 0,   2γ_1 ϱ_1 + 2γ_2 ϱ_2 ≈ 0.   (35)

This will give low weight to the x_{2,t}^2 and x_{1,t} x_{2,t} terms in the linear
combination.

³ Using the fact that the fourth cumulant of a normal is zero, Hannan (1970, p. 23)
shows that:

E[w_{1,t} w_{2,t} w_{3,t} w_{4,t}] = E[w_{1,t} w_{2,t}] E[w_{3,t} w_{4,t}] + E[w_{1,t} w_{3,t}] E[w_{2,t} w_{4,t}] + E[w_{1,t} w_{4,t}] E[w_{2,t} w_{3,t}]

where x_{1,t}^3 x_{2,t} comprises the four w_{i,t}.


Table 1
Simulation experiments for static DGPs. ψ refers to the non-centrality of an optimal individual t-test for T = 100, ρ = 0 and µ = 0, for both the linear and non-linear parameters.

Quadratic:      f(·) = β_1 x_{1,t} + β_2 x_{2,t} + β_3 x_{1,t}^2;   β_1 = β_2 = 0.3, β_3 = 0.1732; ψ = 3
Cubic:          f(·) = β_1 x_{1,t} + β_2 x_{2,t} + β_3 x_{1,t}^3;   β_1 = 0.4743, β_2 = 0.3, β_3 = 0.1225; ψ = 3
Quartic:        f(·) = β_1 x_{1,t} + β_2 x_{2,t} + β_3 x_{1,t}^4;   β_1 = β_2 = 0.3, β_3 = 0.0293; ψ = 3
Composite:      f(·) = β_1 x_{1,t}^2 + β_2 x_{2,t}^3 + β_3 x_{2,t} e^{−|x_{2,t}|};   β_1 = 3.6, β_2 = 0.285, β_3 = 0.195; ψ = 3
Cross-product:  f(·) = β_1 x_{1,t} + β_2 x_{2,t} + β_3 x_{1,t}^2 + β_4 x_{1,t} x_{2,t};   β_1 = β_2 = 0.2, β_3 = 0.1155, β_4 = 0.2; ψ = 2
Exponential:    f(·) = (a) β_1 x_{1,t} e^{−|x_{1,t}|} or (b) β_1 x_{1,t} e^{−|x_{2,t}|};   (a) β_1 = 2.04; ψ = 6 or (b) β_1 = 2.04; ψ = 6
LSTR:           f(·) = β_1 x_{1,t} + β_2 x_{2,t} + (δ_0 + δ_1 x_{1,t} + δ_2 x_{2,t})[1 + exp(−γ(x_{1,t} − c))]^{−1};   β_1 = β_2 = 0.3, δ_0 = 0.9, δ_1 = δ_2 = 0.6, γ = 2.5, c = 0.3
Quadratic z:    f(·) = (a) β_1 z_{1,t}^2 or (b) β_2 z_{2,t}^2;   (a) β_1 = 0.32; ψ = 3 or (b) β_2 = 3.802; ψ = 3

Fig. 3. Powers of non-linearity tests for n = 2.

Under orthogonality, the analytic non-centrality collapses to that of the optimal test, 3Tδ_1^2, but with fewer degrees
of freedom than the V2 test. Collinearity impacts on the non-centrality via the ϱ weighting.
The non-centrality of the RESET test will also be 3Tδ_1^2, with 3 degrees of freedom.
Fig. 3, panels a–d, records the powers of the tests for the first four
DGPs for n = 2, so there are no irrelevant regressors. Figures are
labeled a, b, c, d in rows from top left. The degree of correlation,
ρ, is reported by either :0 or :0.9 after the test name unless the
test is invariant to ρ. We consider panel a here, and panels b–d in
the following sub-sections. The optimal test power is around 0.60,
which compares to the analytic power of 0.87 (see Section 3.1). The
V2 test (and PCV2) has the highest power for the quadratic, as the
test is designed to detect that form of non-linearity.⁴ As the Index-test
includes additional cubic and exponential terms, this reduces
its power by approximately 15% compared to the Index-test
with just quadratic z_{i,t}s and n degrees of freedom, but it insures
against a broader range of non-linear DGPs. Also, the Index-test has
a higher power to detect non-linearity when there is collinearity
than under orthogonality, as the collinear variables will proxy the
relevant non-linear functions.

⁴ V2 and PCV2 are identical for n = 2, 4. PCV2 will include min{3n, n(n + 1)/2} principal components. For n = 2, there are 3 non-linear functions for V2,
and PCV2 will include all 3 principal components of V2. For n = 4 there are 10
non-linear functions, so all 10 principal components are included and the tests are
identical. PCV2 = V2 in Fig. 3.
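
A quick arithmetic check (ours) of the non-centralities above, using the Table 1 quadratic coefficients so that δ_1 = β_3 = 0.1732 and δ_2 = δ_3 = 0:

```python
T, rho = 100, 0.9
d1, d2, d3 = 0.1732, 0.0, 0.0            # quadratic DGP: delta1 = beta3, others zero

phi2_v2 = T * (3*d1**2 + 3*d2**2 + (1 + 2*rho**2)*(d3**2 + 2*d1*d2)
               + 6*rho*d3*(d1 + d2))
phi2_opt = 3 * T * d1**2                 # optimal-test non-centrality

print(phi2_v2, phi2_opt)                 # both approximately 9, i.e. phi = psi = 3 as in Table 1
```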

5.2.2. Cubic DGP
We next consider the cubic DGP in (29). Under orthogonality,

the non-centralities are:

ϕ_{r,α}^2(β_1) = β_1^2 E[x_1' Q̃ x_1] / σ_ε^2 = 0.4Tβ_1^2

ϕ_{r,α}^2(β_2) = Tβ_2^2

ϕ_{r,α}^2(β_3) = β_3^2 E[x_1^{(3)'} Q x_1^{(3)}] / σ_ε^2 = 6Tβ_3^2

where Q = I − x_1(x_1'x_1)^{−1}x_1' and Q̃ = I − x_1^{(3)}(x_1^{(3)'}x_1^{(3)})^{−1}x_1^{(3)'},
when x_1^{(3)} denotes the vector of {x_{1,t}^3}. Parameter values are chosen
such that ψ = 3 for T = 100 under orthogonality.
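
The factors 0.4T and 6T follow from the standard-normal moments E[x^4] = 3 and E[x^6] = 15; a short check (ours), both by the moments and by simulation:

```python
import numpy as np

# Moment-based residual variances behind the 0.4T and 6T factors (x ~ N(0, 1)):
print(1 - 3**2 / 15)   # variance of x left after projecting on x^3 -> 0.4
print(15 - 3**2 / 1)   # variance of x^3 left after projecting on x -> 6.0

# Quick simulation cross-check of the same quantities
x = np.random.default_rng(1).standard_normal(1_000_000)
for a, b in [(x, x**3), (x**3, x)]:
    resid = a - (a @ b) / (b @ b) * b
    print(resid @ resid / len(x))
```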


Fig. 3(b) records the results for n = 2. V2 and PCV2 have very
low power against a cubic DGP: the benefits of a portmanteau test
such as the Index-test are seen by comparing across experiments.
V23 has high power for parsimonious specifications. The RESET
test performs well for this DGP specification as the simple non-
linear function is easily picked up by the parsimonious fitted value.
The RESET test has a higher power to detect cubic non-linearity
than quadratic. The Index-test suffers as usual under orthogonality,
and a high degree of collinearity is again beneficial. The linear
combination of the xts for zt is:

z_{1,t}^3 = x_{1,t}^3 + 3κ_1 x_{1,t}^2 x_{2,t} + 3κ_1^2 x_{1,t} x_{2,t}^2 + κ_1^3 x_{2,t}^3

z_{2,t}^3 = x_{2,t}^3 + 3κ_2 x_{2,t}^2 x_{1,t} + 3κ_2^2 x_{2,t} x_{1,t}^2 + κ_2^3 x_{1,t}^3.   (36)

Therefore, if ρ = 0.9, the Index-test will gain power to detect x_{1,t}^3
via the linear combinations x_{1,t}^2 x_{2,t} and x_{1,t} x_{2,t}^2. The gap between
the analytic and optimal test is larger than for the quadratic DGP,
in keeping with our previous analysis.

5.2.3. Quartic DGP
While a quartic function is somewhat extreme, and the small-

sample distribution of even the ‘optimal’ t-statistic is poor under
zero means, we investigate whether the Index-test based on
quadratic functions has power against quartic functions due to the
collinearity between them. Again ψ = 3 for T = 100.
The results are recorded in Fig. 3(c) for n = 2. The substantial

gap between the analytic power (0.87) and optimal test power
(0.60) is evident, due to regressor kurtosis. The power is only
marginally lower than that for the quadratic function, and both
tests based on Volterra expansions and the Index-test do have
power against a quartic function. The patterns exhibited by the
power functions correspond to those for the quadratic function.
The RESET test requires a high degree of collinearity and all linear
functions of non-linear regressors to be included in the DGP for
reasonable power.

5.2.4. Composite DGP
As the Index-test is designed as a portmanteau test, it has

power against a wide range of alternatives. We next consider its
performance for a composite DGP in which the non-linearity enters
in several ways, such as quadratic, cubic and exponential jointly.
The results are recorded in Fig. 3, panel d, and confirm that this
composite DGP favors the Index-test, especially when there is
collinearity.

5.3. DGPs for n > 2

Generally, models will not coincide with DGPs, so we now
increase n by adding irrelevant regressors.

5.3.1. Quadratic DGP
Fig. 4 reports the results for the quadratic DGP as n increases

from 2 to 20. The number of linear regressors is recorded along
the horizontal axis with the empirical rejection frequency on the
vertical axis. The Index-test does not have the highest power, but
is robust to a range of non-linearities. Panel a compares the power
to a range of alternative tests. As the DGP is a simple form of non-
linearity, detecting the non-linearity is difficult and all tests have
low power for large n compared to the optimal test. The RESET test
has the highest power when ρ = 0.9, but panel b compares this
to the case where the linear functions do not enter the DGP (i.e.
β1 = β2 = 0), when the RESET test has a lower power than the
Index-test for all n, as powers of ŷt will not contribute much to
detecting non-linearity. The V23 (panel c) and V2 (panel d) tests,
and their PCs, have high power at n = 2, but the power declines

sharply as n increases due to degrees of freedom. These tests are
infeasible for many values of n here. Again, the Index-test has
higher power under collinearity, as a high ρ increases the power to
detect non-linearity via collinear squares and cross-products. As T
increases to 300, the powers of all tests increase, with a unit power
for the analytic calculation, and near unit power for the optimal
infeasible test. As before, V2 and PCV2 deliver the highest power
for the case they target.

5.3.2. Cross-product DGP
For the DGP with a cross-product term, all individual regressors
have non-centralities of 2 when T = 100 under orthogonality. A
conventional t-test of each null would, therefore, have power of
approximately 0.5 at 5% when the specification in (29) was known.
As the power of the optimal test depends on:

β_3 x_{1,t}^2 + β_4 x_{1,t} x_{2,t},

the non-centrality of the optimal F-test is:

ϕ_F^2 = T[3β_3^2 + 6β_3β_4ρ + β_4^2(1 + 2ρ^2)].

Hence, the non-centrality of the joint F-test is ϕ_F = 5.2 for ρ = 0.9,
delivering a high power for all tests under collinearity.
Under orthogonality, the Index-test must again have low power
against a single cross-product term in (29). For the test to have
power against (29), we require a low weight on the x_{2,t}^2 term
in the z_{1,t}^2 equation, but then there is no close approximation to
β_3 x_{1,t}^2 + β_4 x_{1,t} x_{2,t}.
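
Substituting the Table 1 cross-product coefficients into the expression above reproduces the quoted ϕ_F ≈ 5.2 at ρ = 0.9 (our arithmetic):

```python
import numpy as np

T, b3, b4 = 100, 0.1155, 0.2
for rho in (0.0, 0.9):
    phi2 = T * (3*b3**2 + 6*b3*b4*rho + b4**2 * (1 + 2*rho**2))
    print(rho, np.sqrt(phi2))   # about 2.8 at rho = 0, about 5.2 at rho = 0.9
```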
Results for the cross-product experiment are reported in Fig. 5,

panel a for ρ = 0.9 and panel b for ρ = 0 to a maximum of
n = 10, as the power functions are nearly horizontal for n > 10.
PCV2 has the highest power, but as n increases it declines sharply,
whereas the Index-test power declines more slowly, as does RESET.
The degrees of freedom are costly for V23 in terms of sharply
declining power. Under orthogonality, the Index-test again has low
power relative to the alternative tests. If the non-linear terms offset
each other, all tests have low power. This form of non-linearity
is particularly difficult to detect: increasing collinearity yields a
lower power as the high correlations cancel the relevant non-linear
terms. Thus, higher correlations are not always advantageous.

5.3.3. Cubic DGP
Fig. 5, panels c and d, record powers for the cubic DGP for up

to 10 regressors, and show similar patterns, although RESET is less
affected by n increasing.

5.3.4. Exponential DGPs
Many non-linear models include exponential functions such as

neural networks with logistic squashing functions, and logistic or
exponential smooth transition models. Tests based on polynomials
will have power against exponential DGPs due to the exponential
approximation in footnote 2, but tests with exponentials included,
such as the Index-test, should be able to capture this form of non-
linearity in a more parsimonious manner.
Fig. 6, panels a and b, record the power for an exponential

DGP in one linear variable, i.e. {x_{1,t} e^{−|x_{1,t}|}}, and panels c and d
record the power for a non-linear function of a combination of
linear regressors {x_{1,t} e^{−|x_{2,t}|}}. The divergence between the optimal
test and the non-linear tests is marked. When the regressors are
collinear, the Index-test has a higher power to detect this form
of non-linearity than the RESET test and the Volterra-expansion
based tests. V2 and PCV2 have no power to detect exponentials of


Fig. 4. Powers of non-linearity tests for a quadratic function as n increases.

Fig. 5. Powers of non-linearity tests for a cross-product function and a cubic function.

this form, but V23 andPCV23dohave power due to the exponential
approximation:

y_t = β_1 {x_{1,t} e^{−|x_{2,t}|}} + ε_t ≈ β_1 (x_{1,t} − x_{1,t}|x_{2,t}| + (1/2) x_{1,t} x_{2,t}^2 − (1/6) x_{1,t} |x_{2,t}^3|) + v_t.   (37)

The powers for V23 and PCV23 increase under orthogonality,
which is the opposite of the Index-test. The RESET test does not
have power for cross-exponentials, but does have a higher power
than the Index-test under orthogonality for {x_{1,t} e^{−|x_{1,t}|}}. Including
the linear terms in the DGP would favor the RESET test. The
Index-test with just quadratics (i.e., n degrees of freedom) has
a higher power than the general Index-test, suggesting that the


Fig. 6. Power of non-linearity tests for an exponential function.

quadratic functions can also ‘pick up’ exponentials, so the increase
in degrees of freedom from n to 3n is costly. However, more
complex exponential functions favor the Index-test.

5.3.5. LSTR DGPs
Having considered an exponential DGP, we generalize this to

an LSTR DGP as this nests aspects of threshold models, regime-
switching models, Markov-switching models and neural networks,
and is therefore representative of a general class of non-linear
models. Teräsvirta (1996) proposes a range of non-linear DGPs
based on Lee et al. (1993) for the time-series domain, but here we
focus on cross-section DGPs: Section 6 considers dynamic models.
The Monte Carlo is necessarily equation specific, but it is indicative
of the performance of non-linearity tests to detect this type of
departure from linearity.
While the optimal infeasible test based on the LSTR specifica-

tion is not computed, we do compute the power of one feasible test
based on a third-order Taylor approximation. Replacing the transi-
tion function by:

[1 + exp{−γ(x_{1,t} − c)}]^{−1} ≈ 1/2 + γ(x_{1,t} − c)/4 − [γ(x_{1,t} − c)]^3/48,   (38)

results in the approximation:

y_t ≈ θ_0 + θ_1 x_{1,t} + θ_2 x_{2,t} + θ_3 x_{1,t}^2 + θ_4 x_{1,t}^3 + θ_5 x_{1,t}^4 + θ_6 x_{1,t} x_{2,t} + θ_7 x_{1,t}^2 x_{2,t} + θ_8 x_{1,t}^3 x_{2,t} + ε_t.   (39)

Hence, the Taylor approximation test is highly parameterized, and
there may be degrees-of-freedom gains for the Index-test.
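
A quick numerical comparison (ours) of the logistic transition and its third-order expansion in (38), using the Table 1 LSTR parameters γ = 2.5 and c = 0.3 over a range near the transition center:

```python
import numpy as np

gamma, c = 2.5, 0.3
x1 = np.linspace(c - 0.5, c + 0.5, 5)
u = gamma * (x1 - c)
logistic = 1.0 / (1.0 + np.exp(-u))
taylor = 0.5 + u / 4 - u**3 / 48          # third-order expansion in (38)
print(np.column_stack([x1, logistic, taylor]))   # close agreement over this range
```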
The results are recorded in Fig. 7, panels a and b. The Index-test

and Volterra-expansion tests have power against an LSTR model
under collinearity, indicating that polynomial approximations do

capture non-linearities generated by a smooth-transition model.
For n = 2, most tests outperform the RESET test, but then degrees-
of-freedom favor the RESET test. An index of the Index-test
with comparable degrees of freedom (using {\sum_{i=1}^{n} z_{i,t}^2}, {\sum_{i=1}^{n} z_{i,t}^3}
and {\sum_{i=1}^{n} e^{−|z_{i,t}|} z_{i,t}}) delivers higher power than the RESET test,
suggesting that over-parameterized models suffer in an LSTR
DGP. Furthermore, including exponential functions in the Index-
test yields no improvements over the Taylor-approximation tests,
suggesting that polynomials capture this form of non-linearity
well. Orthogonality implies that combinations of the non-linear
regressors will have low weight, reducing the power of the Index-
test relative to PCV2. Alternative LSTR specifications are needed
to draw more precise conclusions, but to the extent that (39) is a
reasonable approximation, the Index-test will have power against
(29). Conversely, rejecting an initial specification does not entail
that the alternative must be a polynomial function.

5.3.6. Composite DGP
Fig. 7, panels c and d confirm that the composite DGP favors the

Index-test as its power is high for small nwhen there is collinearity.
As n increases, the power declines and the degrees-of-freedom
benefits of RESET yield a higher power for n > 7, where only 3
degrees of freedom are required compared to 21 for the Index-test.
Increasing the sample size favors the Index-test, and its power is
higher than RESET for all n at T = 300. The Index-test outperforms
the V2 and V23 tests, but under orthogonality, has lower power
than the Volterra-expansion tests and their principal components.

5.3.7. Quadratic z DGP
We finally consider a DGP that is specified in terms of the

orthogonal linear combinations. From (9) the DGP is:

y_t = β_1 z_{1,t}^2 + ε_t = β_1 (4.998 x_{1,t}^2 + 4.998 x_{2,t}^2 − 9.997 x_{1,t} x_{2,t}) + ε_t   (40)

where ρ = 0.9 and µ = 0. We set β_1 = 0.32 so individual
non-centralities are ≈ |3| for the three non-linear functions. Under
orthogonality, however, these weights will be incorrect.
The DGP has offsetting effects. When collinearity is large, x_{1,t} x_{2,t}
will be a proxy for x_{1,t}^2 and x_{2,t}^2, so the negative coefficient on
the cross-product will adversely affect the power. Thus, we also
consider the alternative DGP:

y_t = β_2 z_{2,t}^2 + ε_t = β_2 (0.263 x_{1,t}^2 + 0.263 x_{2,t}^2 + 0.526 x_{1,t} x_{2,t}) + ε_t   (41)

where β_2 = 3.802, so both the quadratic and cross-product
terms are 'in the same direction' and power will be higher under
collinearity.

Fig. 7. Power of non-linearity tests for an LSTR function and a combination of non-linear functions.
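
The weights in (40) and (41) can be recovered from the eigen-decomposition of Ω with ρ = 0.9; the population values are 5, 5, −10 and about 0.263, 0.263, 0.526, so the 4.998/−9.997 figures quoted in (40) presumably reflect the estimated Ω̂ of (11) rather than the population matrix. A sketch (ours):

```python
import numpy as np

rho = 0.9
Omega = np.array([[1.0, rho], [rho, 1.0]])
lam, H = np.linalg.eigh(Omega)            # eigenvalues 0.1 and 1.9

# z^2 = (h'(x))^2 / lambda expanded in x1, x2: coefficients on x1^2, x2^2, x1*x2
for i in range(2):
    h = H[:, i]
    coef = np.array([h[0]**2, h[1]**2, 2 * h[0] * h[1]]) / lam[i]
    print(f"component with lambda = {lam[i]:.1f}:", coef)
    # -> approximately [5, 5, -10] and [0.263, 0.263, 0.526]
```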
Fig. 8 records the results for (40) in panels a and b, and for

(41) in panels c and d. The Index-test and V2/PCV2 have the same
power for n = 2 as the tests are equivalent given the DGP design.
V23 has a higher power than its principal component analog for
n = 4 under collinearity, which suggests that PCs of the non-
linear functions can omit relevant non-linear combinations given
the arbitrary selection of the number of PCs. The Index-test, V2 and
PCV2 have comparable powers for ρ = 0.9. Under orthogonality,
all tests have higher power due to the off-setting effects of the
cross-product term. All tests based on polynomials outperform the
RESET test. When the second eigenvector is used to form the DGP
weights, all tests have unit power under orthogonality apart from
the RESET test, although the power of the Index-test does decline
as n increases since the weightings are incorrect for n > 2.
A further case included both orthogonalized regressors, and

delivered similar power shapes over n as the single regressor
case. A cubic z was also considered, and again the results were
comparable.

6. Power simulations for dynamic models

6.1. Experiment design

So far we have considered static equations with strongly ex-
ogenous regressors, essentially a cross-section context. There are

numerous well-known problems in generalizing to a non-linear
dynamic context, but some can be addressed. To assess the proper-
ties of the test in a dynamic context, we undertake simulation ex-
periments for a first-order autoregressive-distributed lag, ADL(1,
1), DGP:

y_t = β_0 + β_1 x_{1,t} + β_2 y_{t-1} + β_3 x_{1,t-1} + g(·) + ε_t   (42)

where |β_2| < 1 and x_t is now generated by:

x_t = Π x_{t-1} + α y_{t-1} + ε_t   (43)

where Π is the (n × n) matrix {π_ij} for i, j = 1, ..., n, with |π_ii| < 1 ∀i and |α| < 1, and:

(ε_t, ε_t')' ∼ IN_{1+n}[ (0, 0')', ( ω_{11}, 0' ; 0, ω_{22} ) ]   (44)

where ε_t in (42) is a scalar with E[ε_t^2] = ω_{11}, and the elements of the error vector in (43) satisfy E[ε_{i,t} ε_{j,t}] = ω_{22,ij}, ∀i, j. We generate n = 1, ..., 9 regressors, x_t, based on (43), of which one (x_1) enters the DGP (42). Commencing with 3 linear regressors (y_{t-1}, x_{1,t}, x_{1,t-1}), we include an additional regressor and its lag sequentially until all 19 regressors (y_{t-1}, x_t, x_{t-1}) are included in the general model. We discard the first 20 observations when generating the data. Table 2 records the experiments undertaken, and the tests listed in Section 4 are computed. We compute the principal components for the Index-tests based on the second moment matrix of (y_{t-1} : x_t : x_{t-1})'. Unless otherwise stated we set ω_{11} = 1; ω_{22,ii} = 1, ∀i; ω_{22,ij} = 0.5, ∀i ≠ j; and π_ij = 0, ∀i ≠ j.

Fig. 8. Power of the non-linearity test for a quadratic z_t function.

Table 2
Simulation experiments for the dynamic DGPs.
Baseline: ω_{11} = 1; ω_{12,i} = 0, ∀i; ω_{22,ii} = 1, ∀i; ω_{22,ij} = 0.5, ∀i ≠ j; π_ij = 0, ∀i ≠ j.

Strong exogeneity
  Size:   g(·) = –                                    β_0 = 5; β_1 = β_2 = β_3 = π_ii = 0.5; α = 0
  Power:  g(·) = β_4 x_{1,t-1}^2                      β_0 = 5; β_1 = β_2 = β_3 = π_ii = 0.5; α = 0; β_4 = 0.4
  Power:  g(·) = β_4 x_{1,t}^2 + β_5 x_{1,t-1}^3      β_0 = 5; β_1 = β_2 = β_3 = π_ii = 0.5; α = 0; β_4 = 0.15; β_5 = 0.1
  Power:  g(·) = β_4 exp(x_{1,t-1}^2)                 β_0 = 5; β_1 = β_2 = β_3 = π_ii = 0.5; α = 0; β_4 = 0.4
Increasing persistence
  Size:   g(·) = –                                    β_0 = 5; β_1 = β_3 = 0.5; β_2 = π_ii = 0.8; α = 0
  Power:  g(·) = β_4 x_{1,t-1}^2                      β_0 = 5; β_1 = β_3 = 0.5; β_2 = π_ii = 0.8; α = 0; β_4 = 0.4
Relaxing strong exogeneity
  Size:   g(·) = –                                    β_0 = 5; β_1 = β_2 = β_3 = π_ii = 0.5; α = −0.5
  Power:  g(·) = β_4 x_{1,t-1}^2                      β_0 = 5; β_1 = β_2 = β_3 = π_ii = 0.5; α = −0.5; β_4 = 0.15

6.2. Results

6.2.1. Strong exogeneity
Strong exogeneity requires the absence of feedback, such that α = 0. As the DGP is unknown to the econometrician, a test of non-linearity will require the inclusion of all non-linear functions of the information set, (y_{t-1} : x_t : x_{t-1})', although we assume the lag length is known at unity here. Fig. 9, panel a, records the test size (for nominal sizes of 1% and 5%); and panels b, c and d record the powers (at 5%) for the non-linear functions listed in Table 2.⁵ The results show that the Index-test has a large-sample actual size close to its nominal, at least at 5% and 1%, although all tests become slightly over-sized as the degree of persistence increases. The powers of the tests based on Volterra expansions are high for small n, but decline rapidly as n increases, and the Index-test power declines steadily. All tests outperform the RESET test for small n, although its 3 degrees of freedom help as n increases. A higher correlation between the regressors, or a more complex form of non-linearity, should favor the Index-test.

⁵ We only report the results for T = 100. As the sample size increases to T = 300, the size of the test is similar and the power increases, in keeping with asymptotic theory. We only report powers for a 5% nominal significance level, as the patterns of the power curves are similar at 1%, but sizes are reported for both 1% and 5%. Full results are available on request.

6.2.2. Increasing persistence
We next increased the degree of persistence in both the marginals and conditional from β_2 = π_ii = 0.5, ∀i to β_2 = π_ii = 0.8, ∀i. The size is reported in Fig. 10, panel a, and the power for a quadratic DGP in panel b.
Increasing the persistence yields a higher power for the Index-test due to the collinearity arguments noted for the static case. Hence, strongly exogenous dynamic DGPs have the appropriate size and a power close to that of their static counterparts. 6.2.3. Weak exogeneity If the regressors are only weakly exogenous, then non-linearity could induce unstable or chaotic behavior for some parameter configurations. Providing that stationarity is maintained, in large samples the null distribution is close to an F, checked here when α = −0.5. For power, we set β4 = 0.15 to ensure convergence, giving a non-centrality of approximately 3.5. The sizes and powers J.L. Castle, D.F. Hendry / Journal of Econometrics 158 (2010) 231–245 243 Fig. 9. Size and power for ADL(1, 1) DGP with strong exogeneity. Fig. 10. Size and power for ADL(1, 1) DGP with increasing persistence or weak exogeneity. of the tests are recorded in Fig. 10, panels c and d, showing similar power behavior to earlier. 6.2.4. Unit roots If the levels data are integrated, the null distributions of many of the non-linearity tests are non-standard: see Caceres (2007). When the conditional model contains a unit root, resulting in a differenced process estimated in levels, and the strongly exogenous regressors are stationary, the Index-test has a size close to the nominal. Non-linear functions of the lagged dependent variable will be non-stationary, but regressions of I(1) variables on I(0) processes do not lead to spurious regression: see 244 J.L. Castle, D.F. Hendry / Journal of Econometrics 158 (2010) 231–245 Hendry (1995, p. 129). If the original formulation is in non- stationary variables, so the non-linear functions have complicated behavior but are nevertheless strongly exogenous, then again the F-distribution is the relevant one based on conditioning, although the power may not be described well by a non-central F, as seen in Section 3.1. 7. Conclusion For scalar DGPs with relatively few regressors, the current tests in the literature perform well. However, if the model contains many potential regressors that are possibly highly correlated, and the form of non-linearity under the alternative is unknown, many general tests become infeasible: our proposed portmanteau Index- test performswell in such situations. Although it is only distributed exactly as a (non-central) F for fixed regressors, that distribution remains as a large-sample approximation in stationary processes with weakly exogenous regressors. Tests based on second-order Volterra expansions are ideal if the departure from non-linearity is in the direction of a quadratic and sample sizes are large, but they have little power to detect cubic or exponential non-linearities. Tests based on third-order Volterra expansions have power to detect departures from linearity inmore directions, but are only feasible for small numbers of regressors. Principal components of these two tests perform well, but do not uniformly have higher power than their original counterparts (as the quadratic z1,t DGP above demonstrates for V23). Further, they need not achieve a dimension reduction, unlike Index-test, which is computationally simpler. Tests based onVolterra expansions and our proposed Index-test perform comparatively for small n, but the power of Volterra expansion tests declines more sharply as n increases due to the rapidly increasing degrees of freedom. 
The RESET test has high power in situations where all the linear components of the non-linear functions enter the DGP, but has low power both when the non-linear functions enter independently of the linear functions, or when many of the linear functions do not enter non-linearly, as then powers of the fitted dependent variable from a linear regression are a poor approximation to the non-linear DGP. Also, non-zeromeans adversely affect the power of the RESET test—see Teräsvirta (1996) for the impact of the intercept on the RESET test—and outliers also reduce its power. The RESET test has a higher power to detect non-linearity in the form of a cubic than a quadratic, and the degrees of freedom benefits of the RESET test are more apparent for large n. The Index-test has the appropriate size, and is equivariant to collinearity. However, if the non-linear DGP is such that two highly-collinear non-linear functions ‘cancel each other’, all tests tend to have low power. More complex DGPs favor the Index- test, so when the functional form is unknown and there is a large set of candidate relevant variables, but the specification nests the DGP, the Index-test has power to reject a false null in a wide range of circumstances: pure quadratic, pure cubic, pure quartic, exponential, and these in combinations. Recent simulations show that its relative power is also higher when the regressors are non- normal. All tests (apart from RESET in some cases) decline in power as n increases. While parsimony delivers a higher power, such that selection of relevant variables prior to implementing the test might appear to be beneficial, this may be a hazardous strategy if the linear term is irrelevant, yet enters the DGP in a non-linear function. Thus, there is a trade-off between a higher power after selection and a risk of eliminating variables that are relevant only via a non-linear transformation, resulting in a lower power to detect non-linearity when such a variable is excluded. By using a portmanteau test, power cannot be uniformly higher than all the alternative tests considered, although the Index-test offers somepower against a range of possible non-linear functional forms and is feasible for quite large n. The Index-test outperforms Volterra-expansion tests and RESET in many such situations, and can even be close to the optimal test. For larger departures from linearity, where several non-linear terms occur for a number of variables, its power will dominate that illustrated here, where the experiments were deliberately chosen with many irrelevant variables to highlight the potential. Moreover, the simulation experiments for dynamic models suggest that the Index-test has the correct large sample size, and has reasonable power properties for stationary dynamic processes even with weakly exogenous regressors. Thus, it promises to be a useful mis-specification test for examining the functional forms of general models. Acknowledgements We thank the Editors and two anonymous referees of the Journal of Econometrics, Jurgen A. Doornik, participants of the 2006 Royal Economic Society Conference, Econometric Society Australasian Meeting, Econometric Society European Meeting and the 4th Oxmetrics Users Conference for helpful comments and suggestions on an earlier version. Financial support from the ESRC under grants RES-000-23-0539 and RES-062-23-0061 is gratefully acknowledged. References Abadir, K.M., 1999. An introduction to hypergeometric functions for economists. Econometric Reviews 18, 287–330. 
Bierens, H.J., 1990. A consistent conditional moment test of functional form. Econometrica 58, 1443–1458. Brock, W.A., Dechert, W.D., Scheinkman, J.A., 1987. A test for independence based on the correlation dimension. SSRI Working Paper No. 8702, Department of Economics, University of Wisconsin. Caceres, C., 2007. Asymptotic properties of tests for mis-specification. Unpublished Doctoral Thesis, Economics Department, Oxford University. Cattell, R.B., 1966. The scree test for the number of factors. Multivariate Behavioral Research 1, 245–276. Dhrymes, P.J., 1966. On the treatment of certain recurrent nonlinearities in regression analysis. Southern Economic Journal 33, 187–196. Dhrymes, P.J., Howrey, E.P., Hymans, S.H., Kmenta, J., et al., 1972. Criteria for evaluation of econometricmodels. Annals of Economic and SocialMeasurement 3, 291–324. Doornik, J.A., 1995. Econometric computing. Ph.D Thesis, University of Oxford, Oxford. Doornik, J.A., 2007. Object-Oriented Matrix Programming using Ox, 6th ed.. Timberlake Consultants Press, London. Engle, R.F., 1982. Autoregressive conditional heteroskedasticity with estimates of the variance of UK inflation. Econometrica 50, 987–1008. Engle, R.F., Hendry, D.F., Richard, J.-F., 1983. Exogeneity. Econometrica 51, 277–304. Frisch, R., Waugh, F.V., 1933. Partial time regression as compared with individual trends. Econometrica 1, 221–223. Granger, C.W.J., Teräsvirta, T., 1993. Modelling Nonlinear Economic Relationships. Oxford University Press, Oxford. Hannan, E.J. (Ed.), 1970. Multiple Time Series. John Wiley, New York. Hendry, D.F., 1995. Dynamic Econometrics. Oxford University Press, Oxford. Hendry, D.F., Krolzig, H.-M., 2003. New developments in automatic general-to- specific modelling. In: Stigum, B.P. (Ed.), Econometrics and the Philosophy of Economics. Princeton University Press, Princeton, pp. 379–419. Hoover, K.D., Perez, S.J., 1999. Data mining reconsidered: Encompassing and the general-to-specific approach to specification search. Econometrics Journal 2, 167–191. Kaiser, H.F., 1960. The application of electronic computers to factor analysis. Educational and Psychological Measurement 20, 141–151. Keenan, D.M., 1985. A Tukey non-additivity-type test for time series nonlinearity. Biometrika 72, 39–44. Lee, T.-H.,White, H., Granger, C.W.J., 1993. Testing for neglected nonlinearity in time series models: A comparison of neural network methods and alternative tests. Journal of Econometrics 56, 269–290. McLeod, A.I., Li, W.K., 1983. Diagnostic checking ARMA time series models using squared residual autocorrelations. Journal of Time Series Analysis 4, 269–273. Phillips, P.C.B., 2007. Regression with slowly varying regressors and nonlinear trends. Econometric Theory 23, 557–614. Ramsey, J.B., 1969. Tests for specification errors in classical linear least squares regression analysis. Journal of the Royal Statistical Society. Series B 31, 350–371. J.L. Castle, D.F. Hendry / Journal of Econometrics 158 (2010) 231–245 245 Spanos, A., 1986. Statistical Foundations of Econometric Modelling. Cambridge University Press, Cambridge. Teräsvirta, T., 1996. Power properties of linearity tests for time series. In: Studies in Nonlinear Dynamics & Econometrics, vol. 1. Berkeley Electronic Press, pp. 3–10. Teräsvirta, T., Lin, C.-F., Granger, C.W.J., 1993. Power of the neural network linearity test. Journal of Time Series Analysis 14, 309–322. Thursby, J.G., Schmidt, P., 1977. Some properties of tests for specification error in a linear regression model. 
Journal of the American Statistical Association 72, 635–641. Tsay, R.S., 1986. Non-linearity tests for time series. Biometrika 73, 461–466. Tukey, J.W., 1949. One degree of freedom for non-additivity. Biometrics 5, 232–242. White, H., 1980. A heteroskedastic-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48, 817–838. White, H., 1982. Maximum likelihood estimation of misspecified models. Econometrica 50, 1–26. White, H., 1989. An additional hidden unit test for neglected nonlinearity in multilayer feedforward networks. In: Proceedings of the International Joint Conference on Neural Networks II, Washington, DC, pp. 451–455. White, H., 1992. Artificial Neural Networks: Approximation and Learning Theory. Basil Blackwell, Oxford.


CastleHendry-2020-ClimateEconometrics-FTinEctrics-v10n3-4

Foundations and Trends® in Econometrics

Climate Econometrics: An
Overview

Suggested Citation: Jennifer L. Castle and David F. Hendry (2020), “Climate Econo-
metrics: An Overview”, Foundations and Trends® in Econometrics: Vol. 10, No. 3–4, pp
145–322. DOI: 10.1561/0800000037.

Jennifer L. Castle
Institute for New Economic Thinking at the

Oxford Martin School and
Climate Econometrics at Nuffield College

University of Oxford
UK

jennifer.castle@magd.ox.ac.uk

David F. Hendry
Institute for New Economic Thinking at the

Oxford Martin School and
Climate Econometrics at Nuffield College

University of Oxford
UK

david.hendry@nuffield.ox.ac.uk


Contents

1 Introduction
2 Econometric Methods for Empirical Climate Modeling
  2.1 Theory Models and Data Generation Processes
  2.2 Formulating Wide-Sense Non-Stationary Time Series Models
  2.3 Model Selection by Autometrics
  2.4 Model Selection Facing Wide-Sense Non-Stationarity
  2.5 Understanding Why Model Selection Can Work
  2.6 Selecting Models with Saturation Estimation
  2.7 Summary of Saturation Estimation
  2.8 Selecting Simultaneous Equations Models
  2.9 Forecasting in a Non-Stationary World
3 Some Hazards in Modeling Non-Stationary Time-Series Data
  3.1 Assessing the Constancy and Invariance of the Relationship
  3.2 An Encompassing Evaluation of the Relationship
4 A Brief Excursion into Climate Science
  4.1 Can Humanity Alter the Planet's Atmosphere and Oceans?
  4.2 Climate Change and the 'Great Extinctions'
5 The Industrial Revolution and Its Consequences
  5.1 Climate Does Not Change Uniformly Across the Planet
6 Identifying the Causal Role of CO2 in Ice Ages
  6.1 Data Series Over the Past 800,000 Years
  6.2 System Equation Modeling of the Ice-Age Data
  6.3 Long-Run Implications
  6.4 Looking Ahead
  6.5 Conclusions on Ice-Age Modeling
7 Econometric Modeling of UK Annual CO2 Emissions, 1860–2017
  7.1 Data Definitions and Sources
  7.2 UK CO2 Emissions and Its Determinants
  7.3 Model Formulation
  7.4 Evaluating a Model Without Saturation Estimation
  7.5 Four Stages of Single-Equation Model Selection
  7.6 Selecting Indicators in the General Model
  7.7 Selecting Regressors and Implementing Cointegration
  7.8 Estimating the Cointegrated Formulation
  7.9 Encompassing of Linear-Semilog versus Linear-Linear
  7.10 Conditional 1-Step 'Forecasts' and System Forecasts
  7.11 Policy Implications
  7.12 Can the UK Reach Its CO2 Emissions Targets for 2050?
  7.13 Climate-Environmental Kuznets Curve
8 Conclusions
Acknowledgements
References

Climate Econometrics: An Overview
Jennifer L. Castle¹ and David F. Hendry²

¹Institute for New Economic Thinking at the Oxford Martin School and
Climate Econometrics at Nuffield College, University of Oxford, UK;
jennifer.castle@magd.ox.ac.uk
²Institute for New Economic Thinking at the Oxford Martin School and
Climate Econometrics at Nuffield College, University of Oxford, UK;
david.hendry@nuffield.ox.ac.uk

ABSTRACT

Climate econometrics is a new sub-discipline that has grown
rapidly over the last few years. As greenhouse gas emissions
like carbon dioxide (CO2), nitrous oxide (N2O) and methane
(CH4) are a major cause of climate change, and are gener-
ated by human activity, it is not surprising that the tool
set designed to empirically investigate economic outcomes
should be applicable to studying many empirical aspects of
climate change.

Economic and climate time series exhibit many commonali-
ties. Both data are subject to non-stationarities in the form
of evolving stochastic trends and sudden distributional shifts.
Consequently, the well-developed machinery for modeling
economic time series can be fruitfully applied to climate
data. In both disciplines, we have imperfect and incomplete
knowledge of the processes actually generating the data.
As we don’t know that data generating process (DGP), we
must search for what we hope is a close approximation to it.


The data modeling approach adopted at Climate Economet-
rics (http://www.climateeconometrics.org/) is based on a
model selection methodology that has excellent properties
for locating an unknown DGP nested within a large set of
possible explanations, including dynamics, outliers, shifts,
and non-linearities. The software we use is a variant of ma-
chine learning which implements multi-path block searches
commencing from very general specifications to discover a
well-specified and undominated model of the processes under
analysis. To do so requires implementing indicator satura-
tion estimators designed to match the problem faced, such
as impulse indicators for outliers, step indicators for loca-
tion shifts, trend indicators for trend breaks, multiplicative
indicators for parameter changes, and indicators specifically
designed for more complex phenomena that have a com-
mon reaction ‘shape’ like the impacts of volcanic eruptions
on temperature reconstructions. We also use combinations
of these, inevitably entailing settings with more candidate
variables than observations.

Having described these econometric tools, we take a brief
excursion into climate science to provide the background
to the later applications. By noting the Earth’s available
atmosphere and water resources, we establish that humanity
really can alter the climate, and is doing so in myriad ways.
Then we relate past climate changes to the ‘great extinctions’
seen in the geological record. Following the Industrial Revo-
lution in the mid-18th century, building on earlier advances
in scientific, technological and medical knowledge, real in-
come levels per capita have risen dramatically globally, many
killer diseases have been tamed, and human longevity has ap-
proximately doubled. However, such beneficial developments
have led to a global explosion in anthropogenic emissions of
greenhouse gases. These are also subject to many relatively
sudden shifts from major wars, crises, resource discoveries,


technology and policy interventions. Consequently, stochas-
tic trends, large shifts and numerous outliers must all be
handled in practice to develop viable empirical models of cli-
mate phenomena. Additional advantages of our econometric
methods for doing so are detecting the impacts of impor-
tant policy interventions as well as improved forecasts. The
econometric approach we outline can handle all these jointly,
which is essential to accurately characterize non-stationary
observational data. Few approaches in either climate or eco-
nomic modeling consider all such effects jointly, but a failure
to do so leads to mis-specified models and hence incorrect
theory evaluation and policy analyses. We discuss the haz-
ards of modeling wide-sense non-stationary data (namely
data not just with stochastic trends but also distributional
shifts), which also serves to describe our notation.

The application of the methods is illustrated by two detailed
modeling exercises. The first investigates the causal role of
CO2 in Ice Ages, where a simultaneous-equations system is
developed to characterize land ice volume, temperature and
atmospheric CO2 levels as non-linear functions of measures
of the Earth’s orbital path round the Sun. The second turns
to analyze the United Kingdom’s highly non-stationary an-
nual CO2 emissions over the last 150 years, walking through
all the key modeling stages. As the first country into the
Industrial Revolution, the UK is one of the first countries
out, with per capita annual CO2 emissions now below 1860’s
levels when our data series begin, a reduction achieved with
little aggregate cost. However, very large decreases in all
greenhouse gas emissions are still required to meet the UK’s
2050 target set by its Climate Change Act in 2008 of an
80% reduction from 1970 levels, since reduced to a net zero
target by that date, as required globally to stabilize tem-
peratures. The rapidly decreasing costs of renewable energy


technologies offer hope of further rapid emission reductions
in that area, illustrated by a dynamic scenario analysis.

Keywords: climate econometrics; model selection; policy interventions;
outliers; saturation estimation; Autometrics; Ice Ages; CO2 emissions.

1 Introduction

Climate econometrics is a sub-discipline that has grown rapidly over
the last few years, having held four annual international conferences (at
Aarhus, Oxford, Rome and Milan) and with a global network.¹ A Spe-
cial Issue of the Journal of Econometrics (https://www.sciencedirect.
com/journal/journal-of-econometrics/vol/214/issue/1) has 14 contri-
butions across a wide range of climate issues, and a second in Econo-
metrics (https://www.mdpi.com/journal/econometrics/special_issues/
econometric_climate) is in preparation. Because greenhouse gas emis-
sions like carbon dioxide (CO2), nitrous oxide (N2O) and methane (CH4)
are the major cause of climate change, and are generated by human
activity, it is not surprising that the tool set originally designed to em-
pirically investigate economic outcomes should be applicable to studying
many empirical aspects of climate change. Most climate-change analysis
is based on physical process models embodying the many known laws
of conservation and energy balance at a global level. Such results under-
pin the various reports from the Intergovernmental Panel on Climate
Change (IPCC: https://www.ipcc.ch/). Climate theories can also be

¹See https://www.jiscmail.ac.uk/cgi-bin/webadmin?A0=climateeconometrics: its planned 5th Econometric Models of Climate Change Conference at the University of Victoria has had to be postponed till 2021 because of the SARS-CoV-2 pandemic.



embedded in models of the kind familiar from macroeconomics: for ex-
ample, Kaufmann et al. (2013) link physical models with statistical ones
having a stochastic trend, and Pretis (2019) establishes an equivalence
between two-component (i.e., atmosphere and oceans) energy-balance
models of the climate and a cointegrated vector autoregressive system
(CVAR). Even in such a well-understood science, knowledge is not
complete and immutable, and there are empirical aspects that need
attention. For example, CO2 and other greenhouse gas emissions depend
on changeable human behavior; volcanic eruptions vary greatly in their
climate impacts; the rate of loss of Arctic sea ice alters the Earth’s
albedo and such feedbacks affect warming.

Our approaches at Climate Econometrics (our research group, shown
capitalized to differentiate it from the general research area) are comple-
mentary to physical process models, and use a powerful set of modeling
tools developed to analyze empirical evidence on evolving processes
that are also subject to abrupt shifts, called wide-sense non-stationarity
to distinguish from the use of ‘non-stationarity’ purely for unit-root
processes that generate stochastic trends: see Castle and Hendry (2019).
A key reason is that differencing a wide-sense non-stationary time
series does not ensure stationarity as is often incorrectly assumed
in economics. Because the data are wide-sense non-stationary time
series observations, the data generating process (DGP) is inevitably
unknown and has to be discovered. The model selection methodol-
ogy described below has excellent properties for locating an unknown
DGP when it is embedded within a large set of potential explana-
tions. Thus, we advocate commencing from a general specification that
also includes variables to allow for dynamics, outliers, shifts, and non-
linearities. We use a variant of machine learning called Autometrics
that explores multi-path block searches to discover a well-specified and
undominated model of the processes under analysis (see Doornik, 2009).
Hendry and Doornik (2014) analyze the properties of Autometrics: also
see §2.3.² The approach is available in R by Pretis et al. (2018a) at
https://cran.r-project.org/web/packages/gets/index.html, and as the

²For summaries, see http://voxeu.org/article/data-mining-more-variables-observations and https://voxeu.org/article/improved-approach-empirical-modelling-0.


Excel Add-in XLModeler (see https://www.xlmodeler.com/). Other
model selection algorithms include the Lasso (see Tibshirani, 1996) and
its variants.

Our methods are designed to select models even when there are more
candidate variables, N , than the number of observations, T . Autometrics
employs a variety of saturation estimators that inevitably create N > T .
Each is designed to match the problem faced, namely impulse-indicator
saturation (denoted IIS) to tackle outliers, step-indicator saturation
(SIS) for location shifts, trend-indicator saturation (TIS) for trend
breaks, multiplicative-indicator saturation (MIS) for parameter changes,
and designed-indicator saturation for modeling phenomena with a regu-
lar pattern, applied below to detecting the impacts on temperature of
volcanic eruptions (VIS). Importantly, saturation estimators can be used
in combination, and can be applied when retaining without selection a
theory-model that is the objective of a study, while selecting from other
potentially substantive variables. Saturation estimators, and indeed our
general approaches, have seen applications across a range of disciplines
including dendrochronology, volcanology, geophysics, climatology, and
health management, as well as economics, other social sciences and fore-
casting. Although theory models are much better in many of these areas
than in economics and other social sciences, modeling observational
data faces most of the same problems, which is why an econometric
toolkit can help.

Below, we explain our econometric methods and illustrate some of
their applications to climate time series. The first illustration investi-
gates past climate variability over the Ice Ages, where a simultaneous-
equations system is developed to characterize land ice volume, Antarctic
temperature and atmospheric CO2 levels as non-linear functions of mea-
sures of the Earth’s evolving orbital path round the Sun. The focus
is on system modeling and how we implement that despite N > T , as
well as the difference in how saturation estimation is applied in systems.
Few economists will ever have the opportunity to consider multi-step
forecasts over 100,000 years as we do here! The second illustration
is a detailed study of the UK's CO2 emissions over 1860–2017 that
walks through the various stages of formulation, model specification,
selection while tackling outliers and location shifts, then investigating


cointegration, and on to model simplification for forecasting and policy
analyses. A key aim is establishing the possible impacts of past policy
interventions though we also discuss possible future developments.

As Pretis (2019) remarks

Econometric studies beyond IAMs (integrated assessment
models) are split into two strands: one side empirically
models the impact of climate on the economy, taking climate
variation as given . . . the other side models the impact of
anthropogenic (e.g., economic) activity onto the climate by
taking radiative forcing—the incoming energy from emitted
radiatively active gases such as CO2—as given . . . . This
split in the literature is a concern as each strand considers
conditional models, while feedback between the economy
and climate likely runs in both directions.

Examples of approaches conditioning on climate variables such as tem-
perature include Burke et al. (2015), Pretis et al. (2018b), Burke et al.
(2018), and Davis (2019). Hsiang (2016) reviews such approaches to
climate econometrics. Examples from many studies modeling climate
time series include Estrada et al. (2013), Kaufmann et al. (2011, 2013)
and Pretis and Hendry (2013). Pretis (2017) addresses the exogeneity
issue in more detail. Most of the research described in this monograph
concerns the second approach, although the methods are applicable
both to the first and to investigating exogeneity as shown in Section 6.
The resulting econometric tools also contrast with the methodology
predominantly used in the first approach of a quasi-experimental frame-
work using panel regressions under the assumption of strict exogeneity
of climate variables.

The structure of the monograph is as follows. First, Section 2 de-
scribes econometric methods for empirical climate modeling that can
account for wide-sense non-stationarity, namely both stochastic trends
and location shifts, with possibly large outliers, as well as dynamics and
non-linearities. Model selection is essential as the behavioral processes
determining greenhouse gas emissions are too complicated to be known
a priori. A basic question then concerns what model selection is trying
to find. This is answered in §2.1 on the roles therein of theory models


and DGPs by trying to find the latter, or at least a good approximation
to its substantive components. §2.2 first discusses the formulation of
models for wide-sense non-stationary time series, then §2.3 describes
model selection by Autometrics and §2.4 explains its block multi-path
selection algorithm. Next, §2.5 turns to understanding why automatic
model selection can work well despite N > T . Saturation estimators
are described in §2.6, commencing with impulse-indicator saturation
(IIS) to tackle outliers. IIS is illustrated in §2.6.1, and its properties
are described in §2.6.2. Then §2.6.3 considers step-indicator saturation
(SIS), §2.6.4 the extension to super saturation estimation combining IIS
and SIS, §2.6.5 explains a variant to handle trend saturation estimation
(TIS), followed in §2.6.6 by multiplicative-indicator saturation (MIS)
which interacts SIS with regressors for detecting parameter changes.
Then §2.6.7 illustrates designed-indicator saturation by formulating in-
dicators for modeling the impacts of volcanic eruptions on temperature
reconstructions (VIS). §2.7 summarizes the various saturation estima-
tors. §2.8 considers selection, estimation and evaluation of simultaneous
equations models, addressing identification in §2.8.1. Facing forecasting
in a wide-sense non-stationary world, §2.9 discusses the consequences
of not handling location shifts and describes forecasting devices that
are more robust after shifts than ‘conventional’ forecasting models.

Section 3 considers hazards confronting empirical modeling of non-
stationary time-series data using an example where a counter-intuitive
finding is hard to resolve. The framework has a clear subject-matter
theory, so is not mere ‘data mining’, yet the empirical result flatly
contradicts the well-based theory. §3.1 considers whether assessing the
constancy and invariance of the relationship can reveal the source of the
difficulty, but does not. An encompassing evaluation of the relationship
in §3.2 fortunately does.

Section 4 provides a brief excursion into climate science, mainly
concerned with the composition of the Earth’s atmosphere and the role
of CO2 as a greenhouse gas. §4.1 considers whether humanity can alter
the planet’s atmosphere and oceans, and demonstrates we can—and
are. §4.2 discusses the consequences of changes in the composition of
the atmosphere, focusing on the impacts of climate change on ‘great
extinctions’ over geological time.


Section 5 considers the consequences, both good and bad, of the
Industrial Revolution raising living standards beyond the wildest dreams
of those living in the 17th century, but leading to dangerous levels of
CO2 emissions from using fossil fuels.

Against that background, we consider applications of climate econo-
metrics. Section 6 illustrates the approach by modeling past climate
variability over the Ice Ages. §6.1 describes the data series over the past
800,000 years, then §6.2 models ice volume, CO2 and temperature as
jointly endogenous in a 3-variable system as a function of variations in
the Earth’s orbit, taking account of dynamics, non-linear interactions
and outliers using full information maximum likelihood. The general
model is formulated in §6.2.1, and the simultaneous system estimates
are discussed in §6.2.2. Their long-run implications are described in
§6.3 with one hundred 1000-year 1-step and dynamic forecasts in §6.3.1.
Then, §6.3.2 considers when humanity might have begun to influence
climate, and discusses the potential exogeneity of CO2 to identify its
role during Ice Ages. §6.4 looks 100,000 years into the future using the
fact that the eccentricity, obliquity and precession of Earth’s orbital
path are calculable far into the future, to explore the implications for the
planet’s temperature of atmospheric CO2 being determined by humans
at levels far above those experienced during Ice Ages. Finally, §6.5
summarizes the conclusions on Ice-Age modeling.

Section 7 models UK annual CO2 emissions over 1860–2017 to walk
through the stages of modeling empirical time series that manifest all the
problems of wide-sense non-stationarity. §7.1 provides data definitions
and sources, then §7.2 discusses the time-series data. §7.3 formulates
the econometric model, then §7.4 highlights the inadequacy of simple
model specifications. The four stages of model selection from an initial
general model are described in §7.5, then implemented in §7.6–§7.8. §7.9
conducts an encompassing test of the linear-semilog model against a
linear-linear one. §7.10 presents conditional 1-step ‘forecasts’ and multi-
step forecasts from a VAR. §7.11 addresses the policy implications of
the empirical analysis, then §7.12 considers whether the UK can reach
its 2008 Climate Change Act (CCA) CO2 emissions targets for 2050.
Finally, §7.13 estimates a ‘climate-environmental Kuznets curve’.


Section 8 concludes and summarizes a number of other empirical
applications.

To emphasize the different and interacting forms of non-stationarity,
Figure 1.1 records time series from climate and economic data. Panel (a)
shows the varying trends in global monthly atmospheric CO2 concentra-
tions in ppm measured at Mauna Loa over 1958(1)–2019(6); Panel (b)
records the dramatically non-stationary UK per capita CO2 emissions,
with up and down trends, outliers and shifts; Panel (c) reports the log
of UK GDP, again with changing trends and large shifts; and (d) plots
the log of the UK wage share, with large shifts and outliers.

The lockdowns around the world in response to SARS-CoV-2 will
doubtless cause a sharp drop in global CO2 emissions in early 2020
that will need to be modeled. The indicator saturation estimators described in Section 2
are designed to tackle such multiple shifts of unknown magnitudes
and directions at unknown dates as countries gradually bring their
pandemics under sufficient control to ‘restart’ their economies.

[Figure 1.1 (four time-series panels) appears here.]

Figure 1.1: (a) Global monthly atmospheric CO2 concentrations in parts per million (ppm) measured at Mauna Loa, 1958(1)–2019(6); (b) UK CO2 emissions in tons per capita per annum; (c) the log of UK GDP; (d) log of the UK wage share. (b)–(d) are all annual over 1860–2018.

2 Econometric Methods for Empirical Climate Modeling

In this section, we describe the econometric tools that are needed
for empirical climate modeling of wide-sense non-stationary data. §2.1
commences with a discussion of the objective of the study, usually a
theory-based formulation, as against what should be the target for
modeling. Often the objective is made the target, but that needs om-
niscience: instead the target should be the process that generated the
data while retaining the object of analysis. Such an approach allows for
the possibility of finding that the target and object coincide without
imposing that they must. §2.2 describes the formulation of models
for wide-sense non-stationary time series, then §2.3 discusses model
selection by Autometrics and §2.4 explains the block multi-path se-
lection algorithm. §2.5 analyzes why automatic model selection can
work well despite N > T , building on Hendry and Doornik (2014),
before §2.6 explains saturation estimators, summarizing the different
saturation approaches in §2.7. The selection, estimation and evaluation
of simultaneous-equations models are considered in §2.8 commencing
from a dynamic system, with the issue of identification addressed in
§2.8.1. §2.9 discusses forecasting in a wide-sense non-stationary world.



2.1 Theory Models and Data Generation Processes

The most basic question concerns ‘what is empirical model selection
trying to find’? Given the answer to that, then one can address how
best to find it. Many features of models of observational data cannot
be derived theoretically, particularly facing wide-sense non-stationarity.
While a theory-model provides the object of interest to a modeler, that
theory can only be the target of a study if it is complete, correct and
immutable, despite often being imposed as the target yet lacking those
characteristics. Facing non-stationary time series, ceteris paribus is
simply not applicable because what is excluded will not stay the same:
see Boumans and Morgan (2001). Viable models of non-stationarity
must include everything that matters empirically if estimated models
are to be constant.

To understand how any system actually functions, the appropriate target for model selection must be its data generation process (DGP). The DGP of the world is the joint density $\mathsf{D}_{W_T^1}(w_1,\ldots,w_T \mid \psi_T^1, W_0)$, where $W_T^1$ is the complete set of variables over a time period $1,\ldots,T$, conditional on the past, $W_0$. However, $\mathsf{D}_{W_T^1}(\cdot)$ and the 'parameters' $\psi_T^1 \in \Psi$ of the processes may be time varying. In practice, DGPs are too high dimensional and too non-stationary to develop complete theories about, or to precisely model empirically, so local DGPs (denoted LDGP) are usually the best that can be achieved. The LDGP is the DGP for the $n$ variables $\{x_t\}$ which an investigator has chosen to model, with entailed 'parameters' $\theta_T^1 \in \Theta$. The theory of reduction explains how the LDGP $\mathsf{D}_{X_T^1}(\cdot)$ is derived from $\mathsf{D}_{W_T^1}(\cdot)$, the resulting transformations of the 'parameters' that implies, and what the properties of $\mathsf{D}_{X_T^1}(\cdot)$ will be (see e.g., Hendry, 2009).

The LDGP $\mathsf{D}_{X_T^1}(\cdot)$ can always be written by sequential factorization with a martingale difference (innovation) error, that is unpredictable from the past of the process (see Doob, 1953):
\[
\mathsf{D}_{X_T^1}(X_T^1 \mid X_0, \theta_T^1) = \prod_{t=1}^{T} \mathsf{D}_{x_t}(x_t \mid X_{t-1}^1, X_0, \theta_t). \tag{2.1}
\]


Thus, the joint density can be expressed as the product of the sequential individual densities even when the 'parameters' are not constant. Let $\mathsf{E}_{X_{t-1}}$ denote the expectation over the distribution $\mathsf{D}_{X_{t-1}}(\cdot)$, and define $\varepsilon_t = x_t - \mathsf{E}_{X_{t-1}}[x_t \mid X_{t-1}^1]$; then $\mathsf{E}_{X_{t-1}}[\varepsilon_t \mid X_{t-1}^1] = 0$, so $\{\varepsilon_t\}$ is indeed a martingale difference error process, with $\mathsf{E}_{X_{t-1}}[\varepsilon_t \mid \mathcal{E}_{t-1}^1] = 0$ where $\mathcal{E}_{t-1}^1 = (\varepsilon_{t-1},\ldots,\varepsilon_1)$. This provides a viable basis for laws of large numbers, central limit theorems and congruent models which are models that match the evidence, so are 'well specified'. Note that the LDGP innovation error $\{\varepsilon_t\}$ is designed, or created, by the reductions entailed in moving from the DGP to the LDGP, so is not an 'autonomous' process, but rather a reflection of our ignorance. A larger choice of relevant variables than $\{x_t\}$ would make the LDGP a better approximation to the actual DGP, which should deliver smaller innovation errors, sustaining a progressive research strategy. Once whatever set of $\{x_t\}$ has been chosen, one cannot do better than know its LDGP $\mathsf{D}_{X_T^1}(\cdot)$, which encompasses all models thereof on the same data (i.e., can explain their findings: see Bontemps and Mizon, 2008). Consequently, the LDGP is the only appropriate target for model selection.
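To make the innovation-error property concrete, the base-R sketch below simulates a first-order autoregression as an illustrative stand-in LDGP (not an example from the monograph) and checks that the sequential one-step errors are mean zero and uncorrelated with the past, as a martingale difference sequence requires.

## Illustrative LDGP: an AR(1) with known conditional expectation
## E[x_t | x_{t-1}] = rho * x_{t-1}; everything here is simulated.
set.seed(1)
T <- 5000; rho <- 0.7
x <- numeric(T)
for (t in 2:T) x[t] <- rho * x[t - 1] + rnorm(1)

eps <- x[2:T] - rho * x[1:(T - 1)]        # errors from the sequential factorization in (2.1)
c(mean = mean(eps),                       # approximately zero
  cor_with_past = cor(eps, x[1:(T - 1)])) # approximately zero: unpredictable from the past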

However, LDGPs are almost always unknown in practice, so Hendry
and Doornik (2014) emphasize the need to discover the LDGP from
the available evidence while retaining theory information. Doing so
requires nesting the LDGP in a suitably general unrestricted model
(denoted GUM), while also embedding the theory model in that GUM,
then searching for the simplest acceptable representation, stringently
evaluating that selection for congruence and encompassing. Since the
variables {xt} chosen for analysis usually depend on available subject-
matter theory, institutional knowledge, and previous evidence, most
theory-model objects can be directly related to the target LDGP by
embedding them therein.

Unfortunately, (2.1) provides cold comfort for empirical modelers:
sequential factorization only delivers an innovation error when using
the correct sequential distributions $\mathsf{D}_{x_t}(\cdot)$. To discover that LDGP
therefore requires also finding all distributional shifts, as omitting key
variables and/or shifts will adversely affect selected representations.
§2.2–§2.8 describe modeling aimed at discovering the LDGP. Then


§2.9 considers forecasting in a non-stationary world, where surprisingly
different approaches may be useful.

2.2 Formulating Wide-Sense Non-Stationary Time Series Models

Empirical modeling of observational time series inevitably involves un-
certainty about which ‘explanatory’ variables are relevant and which
are irrelevant, their functional forms, lag lengths, unit roots and cointe-
gration, possible outliers, structural breaks and location shifts, as well
as the constancy and invariance of parameters. Moreover, it is essential
for valid inference to check the exogeneity status of contemporaneous
conditioning variables, whether the resulting residuals satisfy the error
properties assumed in deriving the distributions of parameter estimates,
so that selected models are congruent, and whether the selected model
can also encompass alternative explanations of the same dependent
variables. Thus, there is a set of tests for evaluating empirical modeling
outcomes, and consequently, satisfying these should also constrain the
selection process.

As it is rare not to have some theoretical basis to guide an empirical
analysis, algorithms should retain without selection all the variables in
a theory-model when selecting other features. Doing so enables much
tighter than conventional significance levels to be used during selec-
tion to reduce adventitious retention of irrelevant candidates without
jeopardizing the retention of any theory-relevant variables that are the
object of the analysis. Hendry and Johansen (2015) propose an approach
in which those other variables are orthogonalized with respect to the
theory variables, so the distributions of estimators of the parameters of
the object are unaffected by selection: for additional analyses and an
empirical illustration, see Hendry (2018).

To formalize our notation, denote the variables to be modeled by $y_t$, the theory-model contemporaneous 'explanatory' variables by the $n_1 \times 1$ vector $z_t$, and the other current-dated candidate variables by the $n_2 \times 1$ vector $v_t$ (possibly after orthogonalization with respect to $z_t$). Let $w_t' = (z_t', v_t')$, which is $n \times 1$ where $n = n_1 + n_2$ (although $w_t$ is the same symbol as in §2.1, here it denotes a tiny subset of all possible variables). §2.2.1 considers specifying lag length; §2.2.2 functional forms;


§2.2.3 the formulation of the resulting general unrestricted model; and
§2.2.4 model evaluation. Saturation estimators are not discussed in
detail till §2.6.

2.2.1 Lag Length Specification

Almost all empirical econometric models of time series are dynamic
as events take time to work through economies. The Earth’s climate
also adjusts relatively slowly to changes in greenhouse gas emissions
because of the absorption and temperature interactions between the
oceans and the atmosphere, so such models have to be dynamic as
well. The first step is to create $s$ lags, $w_t,\ldots,w_{t-s}$, to implement the
sequential factorization in Equation (2.1) for the GUM. This is to avoid
residuals from the general model being autocorrelated. The sequential
densities in (2.1) should include all lags of the data, but in practice,
lags are truncated at s, where a sufficient number of lags are retained
to ensure there is no loss of dynamic information, with residuals that
are white noise. Nevertheless, as noted above, distributional shifts will
also need to be modeled to achieve that, a topic left till §2.6, though
included in (2.2).
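As a small illustration of this first step, the base-R sketch below builds the lagged candidate set $w_t,\ldots,w_{t-s}$ for a GUM from simulated data; the series and the choice $s = 2$ are purely illustrative.

## Create s lags of each candidate variable for the GUM (simulated data).
set.seed(2)
T <- 100; s <- 2
w <- cbind(w1 = cumsum(rnorm(T)), w2 = rnorm(T))   # two illustrative candidate regressors

# embed() stacks current and lagged blocks: columns are (t), (t-1), ..., (t-s)
W_lags <- embed(w, s + 1)
colnames(W_lags) <- as.vector(outer(colnames(w), 0:s,
                                    function(v, j) paste0(v, "_lag", j)))
head(W_lags)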

2.2.2 Functional Forms

Many economic models are non-linear in that the variables are trans-
forms of the original measurements (not simply logs) as with the
quadratic relationship between real wages and the unemployment rate
in Castle and Hendry (2014b). Because of the potential for climate
tipping points, such as when an ice-free Arctic Sea led to large-scale
methane release from the permafrost melting in the tundra causing
rapid climate warming (see Vaks et al., 2020), non-linear relationships
cannot be neglected. More positively, sensitive intervention points in the
post-carbon transition could induce leverage in policy actions inducing
non-linearities in models (see Farmer et al., 2019).

A class of functional-form transformations for non-linearity tests (denoted $\mathsf{F}_{nl}$ below) was proposed by Castle and Hendry (2010) based on $(u_{i,t}^2;\ u_{i,t}^3;\ u_{i,t}e^{-|u_{i,t}|})$ obtained from the principal components of the $w_t$, given by $u_t = \hat{H}'(w_t - \bar{w})$, where $\hat{\Omega} = T^{-1}\sum_{t=1}^{T}(w_t - \bar{w})(w_t - \bar{w})' = \hat{H}\hat{\Lambda}\hat{H}'$. When $\hat{\Omega}$ is non-diagonal, each $u_{i,t}$ is a linear combination of many $w_{j,t}$, so (e.g.) $u_{i,t}^2$ involves squares and cross-products of almost every $w_{j,t}$, etc. This approach can also be used to automatically generate a set of non-linear transforms of the variables for inclusion in the GUM. The formulation is low dimensional compared to a general cubic, with no collinearity between the $\{u_{i,t}\}$, yet includes many departures from linearity (see Castle and Hendry, 2014b, for an illustration). However, such non-linear transforms should be restricted to I(0) variables, as Castle et al. (2020a) show that calculating principal components of I(1) variables with stochastic trends, where some may be irrelevant, can distort analyses of the cointegrating combinations. In §7.3.1 below, we consider log versus linear transformations in the UK CO2 model, as well as applying $\mathsf{F}_{nl}$.
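A minimal base-R sketch of these transformations is given below, using simulated stationary data: the principal components $u_t$ of the centred $w_t$ are formed from the eigendecomposition of $\hat{\Omega}$, and then the squares, cubes and exponential terms are generated as candidate non-linear regressors. Names and settings are illustrative only.

## Castle-Hendry (2010)-style non-linearity transforms from principal components.
set.seed(3)
T <- 120; n <- 4
w <- matrix(rnorm(T * n), T, n, dimnames = list(NULL, paste0("w", 1:n)))

wc  <- scale(w, center = TRUE, scale = FALSE)   # w_t - wbar
eig <- eigen(crossprod(wc) / T)                 # Omega-hat = H Lambda H'
u   <- wc %*% eig$vectors                       # principal components u_t

nonlin <- cbind(u^2, u^3, u * exp(-abs(u)))     # low-dimensional non-linear candidate set
colnames(nonlin) <- c(paste0("u", 1:n, "_sq"),
                      paste0("u", 1:n, "_cube"),
                      paste0("u", 1:n, "_exp"))
dim(nonlin)                                     # T x 3n candidate non-linear terms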

2.2.3 Formulation of Single-Equation General Unrestricted Model

Since climate time series are wide-sense non-stationary, models thereof must include features to tackle that. As well as many candidate explanatory variables, dynamics and non-linearities, we include $T$ indicators for impulses, denoted $1_{\{t\}}$, equal to zero except for unity at $t$, and $T-2$ step shifts $S_{\{i\le t\}} = \sum_{j=1}^{t} 1_{\{i=j\}}$ (which Ericsson, 2012, calls super-saturation: see §2.6.1 and §2.6.3). This results in $N = 4n(s+1) + s + 2(T-1)$ candidate regressors including indicators, so $N \gg T$. Denoting by $[\;]$ the set of variables that will be retained without selection (which may include lags but we avoid that notational complication), the resulting general unrestricted model is given by:
\[
y_t = \left[\sum_{i=1}^{n_1}\theta_i z_{i,t}\right] + \sum_{i=1}^{n_2}\phi_i v_{i,t} + \sum_{i=1}^{n}\sum_{j=1}^{s}\beta_{i,j}w_{i,t-j} + \sum_{i=1}^{n}\sum_{j=0}^{s}\lambda_{i,j}u^{2}_{i,t-j} + \sum_{i=1}^{n}\sum_{j=0}^{s}\gamma_{i,j}u^{3}_{i,t-j}
\]
\[
\qquad + \sum_{i=1}^{n}\sum_{j=0}^{s}\kappa_{i,j}u_{i,t-j}e^{-|u_{i,t-j}|} + \sum_{j=1}^{s}\rho_j y_{t-j} + \sum_{i=1}^{T}\delta_i 1_{\{i=t\}} + \sum_{i=2}^{T-1}\eta_i S_{\{i\le t\}} + \varepsilon_t. \tag{2.2}
\]


It is also assumed that $\varepsilon_t \sim \mathsf{IN}[0, \sigma^2_\varepsilon]$, denoting an independent Normal distribution with constant mean zero and constant variance $\sigma^2_\varepsilon$. Since outliers and shifts are modeled, Normality seems reasonable, and the sequential decomposition entailed by including $s \ge 1$ lags on all variables makes 'independence' (in the sense of a martingale difference sequence) a viable assumption. The three more critical assumptions that need to be addressed later are (a) the constancy and (b) the invariance of the parameters, and (c) the super exogeneity of the contemporaneous variables for the parameters in (2.2): see Engle et al. (1983). The first is discussed below in the context of multiplicative-indicator saturation after an initial selection; the second by testing for invariance of the parameters of the selected model to interventions affecting the subset of the $w_t$ retained (described in §3.1); and the third indirectly by that invariance test when the data are wide-sense non-stationary, and by an appropriate estimator allowing for potential endogeneity. Generalizations to systems are discussed in §2.8.

The general concept underlying (2.2) is that of designing the model
to characterize the evidence as discussed in §2.1. This notion applies to
the saturation estimators as well, in that indicators have formulations
to detect specific departures from the rest of the model: see §2.6. Thus,
impulse indicators detect outliers not accounted for by other variables,
step indicators are designed to detect location shifts etc., and below we
consider indicators with break shapes that are explicitly designed to
detect regular shift patterns.
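To show what the indicator blocks in (2.2) look like in practice, the base-R sketch below constructs the $T$ impulse indicators and $T-2$ step indicators (super-saturation) for a small sample; this is only an illustrative construction of the candidate set, not the selection step itself.

## Impulse and step indicator sets entering a GUM like (2.2).
T <- 50
IIS <- diag(T)                                   # impulse indicators: unity at t, zero elsewhere
colnames(IIS) <- paste0("iis", 1:T)

# Step indicator for a shift at observation i: equals 1 once i <= t
SIS <- sapply(2:(T - 1), function(i) as.numeric(seq_len(T) >= i))
colnames(SIS) <- paste0("sis", 2:(T - 1))

dim(cbind(IIS, SIS))                             # already 2T - 2 candidates before any regressors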

2.2.4 Model Evaluation

Reductions from the DGP via the LDGP to a general model have testable
implications discussed in Hendry (1995). There are seven testable null
hypotheses about the congruence of the initial feasible GUM, and also
the final model, albeit there are many alternatives to each null. First
are Normal, homoskedastic, innovation errors, $\{\varepsilon_t\}$. Below, these null hypotheses are tested by $\chi^2_{nd}(2)$ for non-Normality (see Doornik and Hansen, 2008), $\mathsf{F}_{AR}$ for residual autocorrelation (see Godfrey, 1978), $\mathsf{F}_{ARCH}$ tests for autoregressive conditional heteroskedasticity (see Engle, 1982), and $\mathsf{F}_{Het}$ for residual heteroskedasticity (see White, 1980).¹

Next, conditioning variables $w_t$ should be weakly exogenous for the complete set of parameters, denoted $\psi$, which should be constant and invariant. As just noted, we will test that joint hypothesis through super exogeneity (see Engle and Hendry, 1993), partly by saturation estimators applied to the conditioning variables (see Hendry and Santos, 2010) and partly by forecast evaluation discussed in §2.9. Also, the conditioning relationship should be linear in the transforms used to define the $w_t$, tested below by the RESET test $\mathsf{F}_{Reset}$ (see Ramsey, 1969), and $\mathsf{F}_{nl}$ described in §2.2.2.
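As a rough base-R stand-in for such an evaluation battery, the sketch below applies analogous checks to a simulated static regression. The exact statistics used in the monograph (the Doornik-Hansen, Godfrey, Engle, White, RESET and $\mathsf{F}_{nl}$ tests) are implemented in PcGive/Autometrics; here Box.test, shapiro.test and simple auxiliary regressions are substitutes chosen only because they are available in base R.

## Simulated data and a simple conditional model to evaluate.
set.seed(4)
T <- 200; x <- rnorm(T); y <- 0.5 * x + rnorm(T)
fit <- lm(y ~ x); e <- residuals(fit); f <- fitted(fit); e2 <- e^2

Box.test(e, lag = 4, type = "Ljung-Box")              # residual autocorrelation (cf. F_AR)
shapiro.test(e)                                       # non-Normality (cf. the chi^2_nd test)
summary(lm(e2[3:T] ~ e2[2:(T - 1)] + e2[1:(T - 2)]))  # ARCH-type auxiliary regression (cf. F_ARCH)
summary(lm(e2 ~ x + I(x^2)))                          # heteroskedasticity auxiliary regression (cf. F_Het)
anova(fit, update(fit, . ~ . + I(f^2) + I(f^3)))      # RESET-type functional-form check (cf. F_Reset)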

Given a formulation like (2.2) where $N \gg T$, how can a model of the underlying LDGP be selected from the GUM? A powerful machine-learning tool like automatic model selection is needed for such high-dimensional problems that humans cannot tackle: there are many available algorithms to do so; we use Autometrics (Doornik, 2009).

2.3 Model Selection by Autometrics

Autometrics is implemented in a likelihood framework, separating the
four key components of the selection approach: the model class; the
search algorithm; model evaluation; and the decision on when a final
selection has been found. Separating these roles allows considerable
flexibility in what models and data types can be analyzed, how they
are estimated, selected and evaluated, how theory information is incor-
porated and how the final model choice is made.

The model class includes simultaneous systems through to condi-
tional single equations, their associated data types from discrete, time
series, cross sections, panels etc., and their corresponding estimation

¹Most of these tests were developed for known models fitted to stationary data, and usually need simulation calibration outside that context. Nielsen (2006) extends the theory for residual autocorrelation tests to non-stationary autoregressive-distributed lag models (ADL) with polynomial trends. Caceres (2007) shows that residual-based tests for mis-specification in models of unit-root processes have asymptotic sizes that remain close to the nominal significance level under the null. Berenguer-Rico and Wilms (2020) develop analyses of heteroskedasticity testing after outlier removal.


criteria, from least squares, through instrumental variables to maximum
likelihood. The search algorithm is a multiple-path tree search that
‘learns’ what variables are relevant as it proceeds, with many potential
settings for the thoroughness of search, and the significance level needed
to retain variables. Next, evaluation checks are carried out at every stage
to try and ensure congruent selections, starting with mis-specification
testing of the first feasible GUM and applying the same diagnostic tests
to intermediate models. These are used in the selection process to limit
the loss of information from the attempted reduction, not to evaluate
the model. Thus, we call them mis-specification tests when they are
first applied, and diagnostic tools at later stages. As shown in Hendry
and Krolzig (2005), the distributions of the tests are not affected by
their role as diagnostic tools. Fourth, the termination decision is based
on the initial nominal significance level for false null retention set by the
user (called the gauge: see §2.5.1), and is also checked by parsimonious
encompassing of the GUM to ensure that the cumulation of model sim-
plifications did not lead to a poor model choice: see Doornik (2008). The
program records all terminal model selections that are undominated,
and if several are found, uses an information criterion like AIC (see
Akaike, 1973), BIC or SC (see Schwarz, 1978) or HQ (see Hannan and
Quinn, 1979).
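Because the monograph points to the gets package as an R implementation of this general-to-specific approach, a hedged sketch of a single run is given below. The arx()/getsm() calls and argument names follow our reading of that package's interface and may need checking against its documentation; the data are simulated, so this is illustrative rather than a verified recipe.

# install.packages("gets")   # assumed available from CRAN
library(gets)

set.seed(5)
T <- 150
X <- matrix(rnorm(T * 10), T, 10, dimnames = list(NULL, paste0("x", 1:10)))
y <- 0.6 * X[, 1] - 0.4 * X[, 2] + rnorm(T)

gum <- arx(y, mc = TRUE, ar = 1:2, mxreg = X)   # GUM: constant, two lags of y, ten candidate regressors
sel <- getsm(gum, t.pval = 0.01)                # multi-path search at a 1% significance level
print(sel)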

2.4 Model Selection Facing Wide-Sense Non-Stationarity

Once N > T , selection is essential as the GUM cannot be estimated
as it stands. More generally, every decision about (a) how a theory
is formulated, (b) how it is implemented, (c) its evidential base, (d)
its empirical specification, and (e) its evaluation, involves selection,
although these are not usually reported as such. Consider a simple
theory model that derives the equation y = f(x). Then the formulation
concerns the choice of f(·), and any transformations (such as logs) of
the variables. How it is then implemented depends on many decisions
about the dynamics linking y and x, hence on the decision time frame of
the agents involved (e.g., daily, weekly, monthly etc.), and whether the
components of x are given or need to be instrumented. The evidential
base may be at a different data frequency and aggregation level across


agents, space, commodities, etc., than the theory, and for a selected
time period, perhaps chosen at the maximum available, but sometimes
excluding turbulent periods like wars or crises. It will be interesting to
see how researchers handle the massive disruptions of the pandemic.
The empirical specification may lead to a GUM like (2.2), where there
are additional features not included in the theory-model, so may include
more candidate variables as ‘robustness’ checks, as well as data-based
indicators. This is often the stage to which ‘selection’ is confined in
reporting empirical results, though sometimes the reasons behind the
earlier choices are noted. Finally, there are many possible evaluation
tests, only a subset of which is often selected for reporting. When N > T

is very large, there is also a computational problem of being able to
analyze large numbers of candidate variables, an issue not addressed
here (but see e.g., Doornik and Hendry, 2015, and Castle et al., 2020a).
Castle et al. (2020b) describe the selection algorithm in detail.

In the absence of an omniscient researcher, selection is inevitable
when modeling observational data: the issue is not whether to select,
but how best to do so. Our viewpoint is that an investigator must
discover what actually matters empirically, in the context of retaining
theory insights and institutional knowledge, while encompassing pre-
vious findings and alternative explanations: see Hendry and Doornik
(2014). The software implementation of that approach is Autometrics
in Doornik and Hendry (2018), in essence a machine learning algorithm
within each study, building knowledge of the main determinants and
non-stationarities. To explain how and why model selection can be
successful, we first consider its application to linear regression equations
with orthogonal variables when $T \gg N$. Although this may not seem a
likely setting for investigating non-stationary data modeled by N > T ,
the key principles can be explained, then extended to more variables
than observations while retaining theory-models and tackling wide-sense
non-stationarity.

2.5 Understanding Why Model Selection Can Work

Consider a perfect setting: a well specified linear model with constant parameters $\beta_j$, an independent Normal error $\{\varepsilon_t\}$ that is also independent of the $N$ mutually orthogonal regressors $\{z_{j,t}\}$:
\[
y_t = \sum_{j=1}^{N} \beta_j z_{j,t} + \varepsilon_t, \quad \text{where } \varepsilon_t \sim \mathsf{IN}[0, \sigma^2_\varepsilon], \tag{2.3}
\]
with $\sum_{t=1}^{T} z_{i,t}z_{j,t} = 0\ \forall i \ne j$ and $\sum_{t=1}^{T} z^2_{i,t} > 0$, estimated by least squares from $T \gg N$ correctly measured observations. The first $n$ variables are relevant with $\beta_j \ne 0$, whereas the last $N-n$ are irrelevant with $\beta_j = 0$, but this is not known. Nevertheless, (2.3) nests the DGP.

Estimate (2.3), then order the $N$ sample $t^2$-statistics, denoted $\tau^2_j$, testing $\mathsf{H}_0$: $\beta_j = 0$ (using squares to obviate the need to consider signs), as:
\[
\tau^2_{(N)} \ge \cdots \ge \tau^2_{(m)} \ge c^2_\alpha > \tau^2_{(m-1)} \ge \cdots \ge \tau^2_{(1)}, \tag{2.4}
\]
where the cut-off $\tau^2_{(m)}$ between retaining and excluding variables uses the critical value $c^2_\alpha$ for significance level $\alpha$. Variables with the $r = N-m+1$ largest $\tau^2_j$ values greater than $c^2_\alpha$ are retained whereas the remaining $m-1$ others are eliminated. Only one decision is needed (which we call 1-cut) however large $N$ is: there is no 'repeated testing', and 'goodness of fit' is never explicitly considered, though will be implicitly determined by the magnitude of $c^2_\alpha$ and the fit of the DGP equation. Ordering all the test statistics as in (2.4) requires considerable computational power for large $N$, and 'looks like' repeated testing, but is not in fact necessary, as each $t^2$ can just be compared to $c^2_\alpha$ given orthogonality.
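The base-R sketch below implements this 1-cut rule on simulated data with orthogonal regressors: the full model is estimated once and every variable whose squared t-statistic exceeds $c^2_\alpha$ is retained. The sample size, numbers of relevant and irrelevant variables, and coefficient values are illustrative choices.

## 1-cut selection with mutually orthogonal regressors (simulated DGP).
set.seed(6)
T <- 250; N <- 40; n <- 5
Z <- qr.Q(qr(matrix(rnorm(T * N), T, N)))        # orthonormal columns => orthogonal regressors
beta <- c(rep(5, n), rep(0, N - n))              # non-centralities of about 5 for the relevant variables
y <- drop(Z %*% beta) + rnorm(T)

fit   <- lm(y ~ Z - 1)
tstat <- coef(summary(fit))[, "t value"]
alpha <- min(0.01, 1 / N)                        # tight significance level (cf. Section 2.5.1)
c_a   <- qt(1 - alpha / 2, df = T - N)
which(tstat^2 > c_a^2)                           # a single decision: no path search, no repeated testing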
The key issue is what determines how close m is to n, when some or

all of the n relevant variables are not retained without selection? There
are two main components: the probability of eliminating the N − n
irrelevant variables, and the probability of retaining the n relevant.
Setting α = 1 ensures that all relevant variables will be retained, but
so will all irrelevant; and setting α = 0, no irrelevant variables will be
retained, but no relevant either. Intermediate values of α will retain
differing combinations, so we first address the probability of eliminating
irrelevant variables when n = 0 in §2.5.1 (called the gauge), then
consider the probability of retaining relevant variables when n > 0 in
§2.5.2 (called the potency).


2.5.1 Probability of Eliminating Irrelevant Variables

The average number of false null retentions can be kept at a maximum
of k variables on average by setting α ≤ k/N . We call α the nominal
significance level, and the resulting outcome, g, the theoretical, or
expected, gauge. In the context of a 1-off similar test under the null,
size is used to describe the null rejection frequency. However, there is
no guarantee of selection distributions being independent of nuisance
effects, and tests here are being used for selection, hence the need for a
different term like gauge to describe false null retentions: see Johansen
and Nielsen (2016) for an analysis.

When $n = 0$, so all $N$ variables are irrelevant, $\alpha \approx g$ as shown in Table 2.1, which records the probabilities of all $2^N$ null rejection outcomes in $t$-testing at critical value $c_\alpha$ in (2.3). The first column records the events that can happen from all $t$-tests being insignificant through to all being significant. The second column shows the probability of each such event when the tests have independent $t$-distributions, as would be the case for (2.3), and the third reports the number of rejection outcomes that would result. The fourth column, denoted $p_{0.005}$, illustrates the numerical probabilities for column two when $\alpha = 0.005$ (i.e., 0.5%) and $N = 100$ when all variables are irrelevant.

The average number of null variables retained is given by:
\[
k = \sum_{i=0}^{N} i\,\frac{N!}{i!(N-i)!}\,\alpha^{i}(1-\alpha)^{N-i} = N\alpha. \tag{2.5}
\]

Table 2.1: Rejection probabilities under the null

Event | Probability | Reject | $p_{0.005}$
$\Pr(|t_i| < c_\alpha,\ \forall i = 1,\ldots,N)$ | $(1-\alpha)^N$ | 0 | 0.61
$\Pr(|t_i| \ge c_\alpha \mid |t_j| < c_\alpha,\ \forall j \ne i)$ | $N\alpha(1-\alpha)^{N-1}$ | 1 | 0.30
$\Pr(|t_i|, |t_k| \ge c_\alpha \mid |t_j| < c_\alpha,\ \forall j \ne i,k)$ | $\tfrac{1}{2}N(N-1)\alpha^{2}(1-\alpha)^{N-2}$ | 2 | 0.08
$\vdots$ | $\vdots$ | $\vdots$ | $\vdots$
$\Pr(|t_i| < c_\alpha \mid |t_j| \ge c_\alpha,\ \forall i \ne j)$ | $N\alpha^{N-1}(1-\alpha)$ | $N-1$ | 0
$\Pr(|t_i| \ge c_\alpha,\ \forall i = 1,\ldots,N)$ | $\alpha^{N}$ | $N$ | 0

When $N = 100$, there are more than $10^{30}$ events (given $2^{100}$ possible test outcomes for the events in column 1); nevertheless $k = 0.5$ at $\alpha = 0.005$, so with $g = k/N$, then $g = \alpha$ and one irrelevant variable will be significant by chance every second time such a decision rule is used. In practice, empirical values of $g$ have a sampling distribution with a variance that depends on $\alpha$, so is small when $\alpha$ is (see e.g., Johansen and Nielsen, 2009, 2016).

Formally, the empirical gauge is the null retention frequency of selection statistics across $M$ replications of a given DGP. Correspondingly, we define the empirical potency as the average non-null retention frequency, and will consider that shortly. For each variable $z_j$, denote the retention rate by $\tilde{r}_j$; then when $1_{(|t_{\beta_{j,i}}| \ge c_\alpha)}$ denotes an indicator equal to unity (zero) when its argument is true (false), then for (2.3):
\[
\text{retention rate: } \tilde{r}_j = \frac{1}{M}\sum_{i=1}^{M} 1_{(|t_{\beta_{j,i}}| \ge c_\alpha)}, \quad j = 1,\ldots,N,
\]
\[
\text{gauge: } g = \frac{1}{N-n}\sum_{j=n+1}^{N} \tilde{r}_j, \tag{2.6}
\]
\[
\text{potency: } p = \frac{1}{n}\sum_{j=1}^{n} \tilde{r}_j.
\]

Although there will be considerable apparent 'model uncertainty' in (say) a Monte Carlo experiment with $N = 100$, $n = 0$ and $\alpha = 0.005$, because with a large number of replications many hundreds of different models will be selected, such variation is essentially inconsequential since most will have just one or two irrelevant variables, and it does not matter which particular irrelevant variable(s) are adventitiously retained. Moreover, while it is not reflected in Table 2.1, as their null distributions will be Normal around zero, most chance significant selections will have $|t|$-values close to $c_\alpha$.

It may be thought that small values of $\alpha$ (so large values of $c_\alpha$) will seriously reduce the chances of retaining relevant variables. Table 2.2 addresses this issue for a Normal distribution when $N = 500$, under the null that no variables matter for $t$-testing, using the Normal as the relevant baseline after taking account of outliers.

Table 2.2: Significance levels, retained variables, and critical values under the null for $N = 500$

$\alpha$ | 0.05 | 0.01 | 0.005 | 0.0025 | 0.001 | 0.0001
$k = N\alpha$ | 25 | 5 | 2.5 | 1.25 | 0.50 | 0.05
$c_\alpha$ | 1.98 | 2.61 | 2.85 | 3.08 | 3.35 | 4.00
$c^2_\alpha$ | 3.92 | 6.81 | 8.12 | 9.56 | 11.2 | 16.00

As can be seen, critical values increase slowly as $\alpha$ decreases and are just 2.85 at $\alpha = 0.005$. Using a 'conventional' 5% would lead to 25 null variables being retained on average. However, doing so at such a loose significance level in any sample will lead to an underestimate of $\sigma^2_\varepsilon$ in (2.3), which can lead to more irrelevant variables being retained and hence serious overfitting. Consequently, we recommend setting $\alpha \le \min[0.01, 1/N]$, which will increase $c_\alpha$ to at least 2.61, and here for $N = 500$ to $\alpha = 0.002$, so $c_\alpha$ to just over 3.

The main application of very tight $\alpha$ is when selecting with indicator saturation, in which case we set $\alpha \le \min[0.01, 1/(N + qT)]$ for $q$ indicator sets. For example, with super-saturation, $q = 2$, so $qT = 200$ at $T = 100$ (see e.g., Kurle, 2019, and §2.6.4), and if $N = 500$ then $\alpha$ could be rounded to 0.001.
This formula for setting $\alpha$ also entails that it declines towards zero as $T \to \infty$, to ensure a consistent selection procedure for a GUM which nests a DGP that remains finite with constant parameters. In practice, with saturation estimation when $N \gg T$, we suggest retaining all $N$ variables without selection when selecting indicators at a very tight $\alpha$, and once the indicators are found, selecting over the $(N - n_1)$ variables plus those indicators at $\alpha = 1/(N - n_1)$ when $n_1$ theory variables are retained without selection. Section 6 illustrates this procedure. However, when selecting a forecasting device as discussed in §2.9, a much looser target $\alpha$ can be appropriate.

2.5.2 Probability of Retaining Relevant Variables

The trade-off of using a very tight $\alpha$ like 0.005 (tight relative to 0.05) is not retaining relevant variables with estimated $t$-values between 1.98 and 3, so we now turn to the second key component, the potency, which is the probability of retaining the $n$ relevant variables. Above we noted how to mitigate some of the costs of tight $\alpha$s by retaining theory-entailed variables without selection. As gauge is not test size, potency is not test power given the selection context, but can be evaluated against power to measure any costs from not knowing the DGP.

The potency in (2.3) depends on the non-centralities, denoted $\psi$, of their $t$-tests relative to $c_\alpha$. When $\psi_i = c_\alpha$ for a Normal distribution, there is a 50% chance that $t_i \ge c_\alpha$, in which case the $i$th variable will be retained. For example, when $c_\alpha \approx 2$ there is just a 50–50 chance that a variable with $\psi = 2$ will be retained, as its $t$-test will be approximately Normally distributed around a mean of 2. Similarly, at $c_\alpha \approx 2.6$, a value of $\psi = 2.6$ will lead to retention half the time. It must be stressed that this example is not a problem of selection potency, but of test power, albeit such a conflation is often forgotten. If a variable would not be significant at the chosen $c_\alpha$ on a single $t$-test when the DGP is known, it cannot be expected to be selected at that critical value when the DGP is unknown.

The approximate $t$-test powers when a false null hypothesis is tested once at different non-centralities are shown in Table 2.3, as well as the potency when five such tests are conducted at the same non-centrality (based on a standard Normal distribution).

Table 2.3: Approximate $t$-test powers at different non-centralities and critical values

$\psi$ | $\alpha$ | $\Pr(|t| \ge c_\alpha)$ | $(\Pr(|t| \ge c_\alpha))^5$
2 | 0.05 | 0.50 | 0.031
2 | 0.01 | 0.27 | 0.001
3 | 0.01 | 0.65 | 0.116
4 | 0.01 | 0.92 | 0.651
4 | 0.005 | 0.87 | 0.513
5 | 0.01 | 0.992 | 0.959
6 | 0.001 | 0.996 | 0.980

There is a 50–50 chance of retaining one variable with $\psi = 2$ for $c_\alpha = 2$, but only a 3% chance of finding five such variables significant in any trial. This is a probability statement, not a problem due to 'repeated
Combining these two key components, with g ≈ α and potency close to power, we can
see that, commencing selection at the same significance level from the DGP (i.e.,
with no irrelevant variables) or from a GUM that nests that DGP, a similar model
will be selected, with on average g(N − n) additional irrelevant variables in the
latter. Of course this is no guarantee that k will equal n, as non-centralities of
‘relevant variables’, defined by having βi ≠ 0, may be too small to lead to
‘significant’ outcomes even if the DGP is estimated.

The principles behind the approach just described apply to non-orthogonal models,
although 1-cut is not a sensible strategy in such a setting. Instead, the software
Autometrics uses multi-path block searches to order the underlying values of τ²j
taking account of their intercorrelations (see Doornik, 2009). Autometrics has two
additional features to maintain its effectiveness as noted earlier. First, given
that the feasible estimated GUM is congruent on the desired mis-specification
tests, the same tests are applied during path searches as diagnostic checks, so
that paths are not followed if deleting variables makes any of those tests
significant. This is a logical requirement since the LDGP must be a congruent
representation of itself, and consequently a non-congruent model cannot be the
LDGP. Similarly, when N ≫ T, Autometrics checks that the current selection
encompasses the GUM so no salient information is lost (see Bontemps and Mizon,
2008, and Doornik, 2008). These two checks on potential information losses play an
important role in helping Autometrics select appropriate models.

However, while all retained variables must be significant by design at cα in
1-cut, that will not necessarily occur with Autometrics. In Autometrics, variables
may also be retained because of the roles played by: (a) diagnostic checks, when a
variable is insignificant but its deletion makes a diagnostic test significant;
and (b) encompassing tests, when a variable is individually insignificant, but not
jointly with all variables deleted so far. Thus, the gauge may be larger than α if
such retained variables are irrelevant, but equally the potency can also exceed
average power. When N > T,
both expanding and contracting block searches are needed, and as this
inevitably arises for indicator saturation estimators, we now consider
that setting.

2.6 Selecting Models with Saturation Estimation

The approach to tackling outliers that we use, called impulse-indicator
saturation (denoted IIS), was accidentally discovered by Hendry (1999),
then developed by Hendry et al. (2008) for a special case, and analyzed
more generally by Johansen and Nielsen (2009, 2016). IIS is an integral
part of the general model selection algorithm that not only selects over
the relevance of explanatory variables, lag lengths and non-linearities,
but also over observations during selection, all jointly—even though
that creates more candidate variables in total (here denoted N) than
observations T , so N > T . By selecting over observations, IIS is a
robust-statistical device, as shown by Johansen and Nielsen (2016). One
aspect of commencing from (2.2) is to achieve robustness against a
range of potential mis-specifications, an issue discussed in Castle et al.
(2020b), since the costs of selection are small compared to those of
mis-specification. Variables deemed relevant from prior reasoning can
be retained without selection while selecting over other aspects at a
tight significance level as in Hendry and Johansen (2015).

The simplest explanation of how model selection works for IIS (and
hence in general when N > T ), is to consider an independent, identically-
distributed Normal random variable with constant mean µ and variance
σ² denoted yt ∼ IN[µ, σ²] for t = 1, . . . , T. There are no genuine outliers,
but random samples will still on average deliver observations that lie
outside ±2σ with the appropriate probabilities (e.g., 5% for that range,
and 1% for ±2.6σ). Consider an investigator who seeks to check for
potential outliers in the data process, as in Hendry et al. (2008). Create
a complete set of impulse indicators {1{t}}, t = 1, . . . , T , so there is one
for every observation, which is why the approach is called saturation.
Set a nominal significance level α, such that αT ≤ 1, and its associated
critical value cα. Then, under the null hypothesis of no outliers, αT ≤ 1
impulse indicators should be significant on average by chance, entailing
that number of observations will be removed adventitiously.

Of course, the complete set of impulse indicators cannot sensibly be
included because a perfect fit would always result, so nothing would be
learned. Although Autometrics considers many splits and does multiple
searches across different splits, consider implementing a simple ‘split-
half’ version of the IIS procedure, namely add half the indicators, select
then repeat for the other half of the data, which is what inadvertently
occurred in Hendry (1999). Taking T as even for convenience, regress yt
on a constant and the first half of the indicators, {1{t}}, t = 1, . . . , T/2.
Doing so ‘dummies out’ the first half of the data set, so is in fact just
a standard regression on an intercept using only the second half of
the sample, which delivers unbiased estimates of µ and σ2. However, it
also provides estimates of the impulse indicators in the first half, and
any with significant t-statistics are recorded. Each individual t-test is
a standard statistical test under the null hypothesis, with Pr(|t1{t} | ≥
cα) = α. Impulse indicators are mutually orthogonal, so their estimated
coefficients are unaffected by eliminating insignificant indicators.

Drop all the first half indicators, and now regress yt on the intercept
and the second half of the indicators. Again, this delivers unbiased esti-
mates of µ and σ2, now from the first half, and reveals any significant
indicators in the second half. Finally, regress yt on the intercept and
all the sub-block significant indicators using the full sample, and select
only those indicators that remain significant at cα. Overall the algo-
rithm solves a variant of a multiple testing problem for more candidate
regressors than observations, where we control the average number of
‘false outlier’ discoveries, or gauge as defined above and analyzed by
Johansen and Nielsen (2016).
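
A stylized rendering of the split-half procedure just described may help fix
ideas. The sketch below is a simplified Python illustration, not the multi-path
block-search code used by Autometrics; the function names, sample size and outlier
placement are our own. It regresses y on an intercept plus each half of the
impulse indicators in turn, then re-tests the union of surviving indicators on the
full sample.

```python
import numpy as np
from scipy import stats

def significant_indicators(y, X, alpha):
    """Indices of indicator columns (all columns after the intercept) with |t| >= c_alpha."""
    T, K = X.shape
    b, rss, *_ = np.linalg.lstsq(X, y, rcond=None)
    if len(rss) == 0:                        # perfect fit: nothing can be tested
        return []
    se = np.sqrt(rss[0] / (T - K) * np.diag(np.linalg.inv(X.T @ X)))
    c = stats.t.ppf(1 - alpha / 2, df=T - K)
    return [j for j in range(1, K) if abs(b[j] / se[j]) >= c]

def split_half_iis(y, alpha):
    """Stylized split-half IIS for y_t = mu + e_t (0-based observation labels)."""
    T = len(y)
    halves = [np.arange(T // 2), np.arange(T // 2, T)]
    candidates = []
    for idx in halves:
        X = np.zeros((T, 1 + len(idx)))      # intercept plus one impulse per observation in this half
        X[:, 0] = 1.0
        X[idx, 1:] = np.eye(len(idx))
        candidates += [idx[j - 1] for j in significant_indicators(y, X, alpha)]
    X = np.zeros((T, 1 + len(candidates)))   # final stage: union of sub-block survivors
    X[:, 0] = 1.0
    for k, t0 in enumerate(candidates):
        X[t0, 1 + k] = 1.0
    return [candidates[j - 1] for j in significant_indicators(y, X, alpha)]

rng = np.random.default_rng(1)
y = 5 + rng.standard_normal(100)
y[40] -= 4.0                                 # plant a single outlier at t = 40
print("indicators retained at t =", split_half_iis(y, alpha=0.01))
```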

This split-half algorithm is merely an expository device,
and as shown under the null by Hendry et al. (2008) the same outcome
will be found using more blocks which can be of different sizes. The main
reason for considering split-half analyses is that they can be undertaken
analytically to establish the feasibility of saturation estimation, and
as shown in Castle et al. (2019a), reveal the structure of different
specifications of indicators. No ‘repeated testing’ is involved by exploring
many choices of blocks, as an impulse indicator will only be significant
at the preset α if there is a large residual at that observation, with
the caveat we noted above that α ≤ 1/T . Loose significance levels can
lead to overfitting as selecting more irrelevant indicators than one or
two will result in underestimating σ2, leading to more indicators being
adventitiously selected, unless σ̂2 is bias corrected as in Johansen and
Nielsen (2009).

As described in Doornik (2009), the operational approach in Auto-
metrics uses a multi-path block search. The whole sample is divided
into blocks usually smaller than about T/4, selection at α is applied
within blocks, and any significant indicators found in each block are
added to the current selection until: (i) the resulting estimated equation
does not fail any of the diagnostic tests; and (ii) none of the as-yet-
unselected indicators are significant if added, which thereby creates
a terminal model. Then different block mixes are created and a new
multi-path block search is commenced from these different partitions
until there are no new directions to explore. In the more general case of
regressors, the same algorithm applies, and if several terminal models
are found, their variables are combined, and selection recommenced till
an undominated model is chosen. If several terminal models remain,
an information criterion can choose between terminal models that are
mutually undominated (i.e., mutually encompassing as in Bontemps
and Mizon, 2008).

Although creating a set of candidate variables that exceeds the
number of available observations with impulse indicators for every
observation may seem unlikely to be a successful approach to model
selection, we must stress that many well-known and widely-used sta-
tistical procedures are in fact variants of IIS: rather like we all speak
prose even if we don’t know it. Examples are:

(a) recursive estimation, which is equivalent to including impulse
indicators for the ‘future’ observations, then sequentially dropping
those indicators as the algorithm moves through the sample;

(b) also, moving windows essentially uses IIS on ‘pre’ and ‘post’ data;

(c) and ‘holding back data’ (e.g., to control size) is equivalent to IIS
applied to the excluded data points;

(d) as is the less acceptable process of prior sample selection by
excluding data like war years, or ending a sample early to avoid
including shifts, etc., which also implicitly uses IIS on the omitted
observations.

(a) and (b) can detect outliers or shifts when the model is otherwise
known, but even so, ignore the information on the unused data. Im-
portantly, IIS can be included as part of the general selection process
when the model specification has to be discovered from the evidence.
Moreover, as shown by Salkever (1976), the Chow (1960) test can be
implemented by including an appropriate set of impulse indicators, so is
essentially IIS restricted to a specific subset of data for a given model.
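
To make the Salkever (1976) equivalence concrete, the following sketch (with our
own illustrative data and hold-out length) shows that the Chow predictive-failure
F-statistic coincides with the joint F-test on impulse indicators added for the
hold-out observations.

```python
import numpy as np

rng = np.random.default_rng(2)
T, k = 60, 8                          # hold-out period of k observations
x = rng.standard_normal(T)
y = 1.0 + 0.5 * x + rng.standard_normal(T)
X = np.column_stack([np.ones(T), x])

def rss(Y, Z):
    """Residual sum of squares from regressing Y on Z."""
    return np.sum((Y - Z @ np.linalg.lstsq(Z, Y, rcond=None)[0]) ** 2)

# Chow predictive-failure test: fit on the first T - k observations, compare RSS.
rss_1 = rss(y[:T - k], X[:T - k])
rss_full = rss(y, X)
chow_F = ((rss_full - rss_1) / k) / (rss_1 / (T - k - X.shape[1]))

# Same test via impulse indicators: add a dummy for each hold-out observation
# and test their joint significance in the full-sample regression.
D = np.zeros((T, k))
D[T - k:, :] = np.eye(k)
rss_unres = rss(y, np.column_stack([X, D]))      # equals rss_1 numerically
dummy_F = ((rss_full - rss_unres) / k) / (rss_unres / (T - k - X.shape[1]))

print(f"Chow F = {chow_F:.6f}, impulse-indicator F = {dummy_F:.6f}")
```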

2.6.1 Illustrating Impulse-Indicator Saturation

We now illustrate the application of split-half selection to impulse-
indicator saturation when there are no outliers, then when there is one,
using 12 observations on the simple time series:

yt = 5 + εt, εt ∼ IN[0, σ²ε], (2.7)

where σ²ε = 1, with an outlier of magnitude δ = −2 at t = 8 in the
second setting. To implement split-half selection, we create all 12 impulse
indicators, then add the first six to a regression of yt on a constant to
see if any have |t|-statistics exceeding the critical value cα. If so, these
are recorded and the first six are replaced by the second six, and the
process repeated. Finally, all selected indicators are included and their
|t|-values checked, and only those still greater than cα are retained.

Figure 2.1 records the split-half sequence. We chose α = 0.05 as
the theoretical gauge so αT = 0.6, suggesting less than one impulse
will be retained on average by chance. However, the empirical gauge
has an approximate standard deviation of 0.063 at T = 12, as shown
in Johansen and Nielsen (2016), so could be a bit larger or somewhat
smaller. With T = 12, no null indicators are likely to be selected at
α = 0.01, so although that would still be a feasible choice, we felt it
might be thought to bias the procedure in favor of finding no irrelevant
indicators.

We actually compared four closely related approaches, namely the
split half, checking which indicators are significant; Autometrics applied
to each half with their indicators included; commencing with all 12
indicators in the candidate information set and selecting by Autometrics
general algorithm; and Autometrics pre-programmed IIS algorithm.

Figure 2.1: (a) Time series. (b) First six of the impulse indicators. None significant
at α = 0.05. (c) Second six indicators: Again none significant. (d) Final outcome,
retaining just the intercept.

Panel (a) shows the original (uncontaminated) data; (b) the first six
impulse indicators, none of which is significant at α = 0.05 when added,
and none is retained when applying Autometrics to that specification,
where the constant is not selected over; (c) the second six, where again
none is significant—and none retained by Autometrics; and (d) the
model fit, which is just the mean of y.

The third approach is to add all 12 indicators to make an (infeasible)
GUM and apply Autometrics to that set of candidate variables, which
here also delivers the null model, as does the fourth, using the pre-
programmed IIS algorithm in Autometrics, which commences from a
model like (2.7) and automatically creates the impulse indicators if IIS
is chosen as the estimator. All four ways are applied below to other
saturation estimators, with the health warning that the first two are just
a convenient form for analysis and need not work for more complicated
DGPs: the third and fourth deliver the same outcomes at the same α,
but with the benefit that the pre-programmed IIS saves creating the
indicators.

Figure 2.2(a) illustrates a DGP when there is an outlier denoted
δ = −2 at t = 8. Figure 2.2(b) shows the first six indicators, and as before
none is significant and none retained by Autometrics applied to that
specification. But when the second six are added as regressors, 1{8} is
significant, corresponding to the smallest observed data value, and is also
retained by Autometrics for that specification with t = −2.87, p = 0.016.
The intercept is then estimated as 5.13(SE = 0.30) as opposed to
4.89(SE = 0.37) if the outlier is ignored. When the IIS option in
Autometrics is used for a model of yt on a constant, the same indicator,
1{8}, is selected. As α = 0.05, the potency given by Pr(|tδ| ≥ cα) is
approximately 0.5, so 1{8} will be significant about half the time, even
if the correct indicator was added at that observation. At α = 0.01, an
outlier of δ = 2 would be found on average about a 1/4 of the time even
if its precise location was known.
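
The ‘about half the time’ and ‘about a 1/4 of the time’ figures follow from the
Normal approximation to the power of the t-test on the indicator; the fragment
below (illustrative, ignoring the small-sample t critical values relevant at
T = 12) confirms them.

```python
from scipy import stats

def approx_power(psi, alpha):
    """Normal approximation to Pr(|t| >= c_alpha) at non-centrality psi."""
    c = stats.norm.ppf(1 - alpha / 2)
    return stats.norm.cdf(psi - c) + stats.norm.cdf(-psi - c)

for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: Pr(detect outlier with psi = 2) = {approx_power(2, alpha):.2f}")
```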

Figure 2.2: (a) Time series with an outlier of δ = −2 at t = 8. (b) First six of the
impulse indicators: None significant at α = 0.05. (c) Second six indicators: Now one
is retained, shown as wide. (d) Outcome with and without that indicator.

So far there is just one outlier and it is relatively small. Having
two outliers in a sample of T = 12 would be fairly bad contamination
(>15%), and their detectability depends on their magnitudes, signs
and the particular values of the observations at which they happen.
Setting δ = 2 at t = 1, so the first observation is incorrectly recorded,
creates a doubly contaminated data set. Although it is not obvious
that there are two outliers from the data plot or the scaled residuals in
Figure 2.3(b), with one affecting the first observation (which would be
excluded by some approaches), IIS selects both 1{1} and 1{8}. However,
the split-half approach selects neither. This occurs because one of the
outliers lies in each half, so contaminates each baseline half in turn.
Such a problem applies even more forcibly to 1-step single path searches,
and emphasizes the benefits of multi-path block searches: see e.g., the
discussion between Gamber and Liebner (2017) and Ericsson (2017b).

IIS can also be beneficial when the underlying error distribution is
fat-tailed, as shown in Castle et al. (2012) where the errors have a t3
distribution. IIS ‘removes’ many of what would be classified as outliers
relative to a Normal and so can reduce the mean square error (MSE) of
the resulting model parameter estimates around their population values.
Also Hendry and Santos (2010) apply IIS to testing for parameter
invariance under changes in the distributions of the unmodeled (weakly
exogenous) variables, and show that impulse indicators may also detect
heteroskedasticity. Finally, IIS can be useful for detecting forecast biases
as in Ericsson (2017a).

Figure 2.3: (a) Time series with two outliers at t = 1 and t = 8 and the model fit
without IIS. (b) Residuals scaled by σ̂ε. (c) Time series and the model fit with IIS.
(d) Residuals scaled by σ̃ε (after IIS).

2.6.2 Properties of Impulse-Indicator Saturation

When any additional regressor variables are included in the model
of yt, and are retained without selection, Johansen and Nielsen (2009)
show that the usual rates of estimator convergence to their asymptotic
distribution (namely √T under stationarity) are unaffected by IIS
(despite selecting from more candidate variables than observations) with
a loss of estimation efficiency dependent on the error distribution and
the choice of α. When the error {εt} is distributed symmetrically with
no outliers, as with a Normal distribution, applying IIS to a regression
with n variables xt (retained without selection), constant parameter
vector β and data second moment matrix Σ, with E[x′tεt] = 0 and
t = 1, . . . , T for the model:

yt = β′xt + εt, where εt ∼ IN[0, σ²ε], (2.8)

the limiting distribution of the IIS estimator β̃ converges to β at the
usual rate of √T, where orthogonality is not required. That limiting
distribution is Normal, with a variance that is somewhat larger than
σ²εΣ−1, so:

√T (β̃ − β) →D Nn[0, σ²ε Σ−1 Ωα]. (2.9)

The efficiency of β̃ with respect to the usual (and here valid) ordinary
least-squares (OLS) estimator β̂, as measured by Ωα, depends on cα
and the error distribution, but is close to (1 − α)⁻¹In for small α: see
Johansen and Nielsen (2009). Consequently, despite adding T irrelevant
impulse indicators to the candidates for selection, there is a very small
cost: for α = 1/T , on average from (2.5) just one observation will be
‘lost’ by being dummied out.
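
To put a number on that cost under the stated approximations (taking T = 100 and
α = 1/T purely for illustration):

```python
T = 100
alpha = 1 / T                          # the recommended tight level for IIS here
lost_obs = alpha * T                   # indicators retained on average under the null
relative_variance = 1 / (1 - alpha)    # approximate efficiency cost, (1 - alpha)^(-1)
print(lost_obs, round(relative_variance, 4))   # 1.0 1.0101
```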

Conversely, there is the potential for major gains under the alter-
natives of data contamination and/or breaks, and as noted IIS can
be done jointly with all other selections. Under the alternative that
there are one or more outliers, IIS locates the ones with t-statistics
exceeding cα in absolute value. Estimates of impulse indicators cannot
be consistent for outliers as there is only ever a single observation from
which to estimate them, but if embedded in a process where the error
variance tended to zero (small sigma asymptotics), would be selected
with probability approaching unity. In practice, what matters is ensuring
that the estimates of the parameters of interest in the modeling exercise
are robust to the presence of a moderate number of relatively large
outliers, and simulations suggest this occurs except where the outliers
are ‘evenly’ spread through the data so most sub-samples ‘look alike’.

The simplest setting of an impulse indicator model can be useful
when teaching test power, relating back to §2.5.2. Let:

yt = δ1{τ} + εt where εt ∼ IN[0, σ²ε] (2.10)

where 1 ≤ τ ≤ T is known, and δ = 2σε. Then δ̂ = δ + ετ < 2σε whenever ετ < 0 which has a probability of 1 2 , so will not be significant at cα = 2. Moreover, even for δ = 3σε, then δ̂ will not be significant when ετ < −σε which has a probability of about 16%; and so on. 2.6. Selecting Models with Saturation Estimation 181 In the context of testing for parameter invariance, Hendry and Santos (2010) derive the powers of individual impulse-indicator tests, which depend on the magnitude of the outlier they detect relative to σε. Hendry and Mizon (2011) demonstrate the dangers of not tackling outliers in empirical models, as a failure to handle large outliers can lead to the rejection of even a sound theory—in that instance because a positive price elasticity for the demand for food results when estimating without IIS. Castle et al. (2020b) show that IIS can shed insight into the long-standing problem noted by Hettmansperger and Sheather (1992) when using ‘conventional’ robust-statistical estimators like least median squares (LMS: see Rousseeuw, 1984), and least trimmed squares (LTS: see Rousseeuw, 1984, and Víšek, 1999), which transpire not to be robust to a small accidental measurement error created when inputting the data. In the general context of selecting variables, lags and non-linear functions along with IIS, Hendry and Doornik (2014) show in simulations that Autometrics has an empirical gauge close to α for small α under the null hypothesis of no additional relevant variables, and also describe its ability to select relevant variables under the alternative. 2.6.3 Step-Indicator Saturation Step-indicator saturation (SIS) uses steps rather than impulses to cap- ture permanent shifts in the location of a relationship: see Castle et al. (2015b). Step indicators are cumulations of impulse indicators up to the given date, so terminate with the last observation for the date shown: e.g., S{1960} is unity till 1960 then zero thereafter. Consequently, changes in steps, such as ∆S{1960} = S{1960} − S{1961}, correspond to −1{1961}. Castle et al. (2017) apply SIS to invariance testing, discussed below. Much of the analysis of SIS is similar to that for IIS above, even though the steps are not orthogonal. Nielsen and Qian (2018) provide an analysis of the asymptotic properties of the empirical gauge of SIS. While showing that its distribution is more complicated than that of IIS in Johansen and Nielsen (2016) described above, it converges to the nominal significance level in a range of models with stationary, random walk and deterministically trending regressors. They conclude the gauge 182 Econometric Methods for Empirical Climate Modeling of SIS can be reliably used at a tight significance level in applications. While there is no formal analysis as yet, a result similar to that in Johansen and Nielsen (2009) seems likely to hold for the distribution of estimated retained regression parameters when SIS is applied at a suitably stringent level under the null of no step shifts. Under the null, SIS has a somewhat larger empirical gauge than IIS as it retains two successive roughly equal magnitude, opposite-signed step indicators to capture an outlier, although that could be adjusted manually. Like IIS, coefficients of step indicators will not be consistently estimated when new steps keep occurring, but would be in a scenario of a fixed number of steps that are all proportional to T →∞. 
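
The construction of the SIS candidate set from cumulated impulse indicators, and
the differencing identity noted above for ∆S{1960}, can be verified directly. The
fragment below is an illustrative sketch only (0-based indexing, T = 12 as in the
earlier example).

```python
import numpy as np

T = 12
impulses = np.eye(T)                           # impulse indicators 1{t}, one per observation
steps = np.triu(np.ones((T, T)))               # step S{j}: unity up to and including j, zero after
# Column j of `steps` is the cumulation of the impulses 1{0}, ..., 1{j}.

# Differencing adjacent steps reproduces (minus) an impulse indicator:
# S{j} - S{j+1} = -1{j+1}, as noted in the text for Delta S_{1960}.
j = 5
assert np.array_equal(steps[:, j] - steps[:, j + 1], -impulses[:, j + 1])

# Only T - 1 steps enter as candidates: the T-th step is the constant term.
candidate_steps = steps[:, :T - 1]
print(candidate_steps.shape)                   # (12, 11)
```
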
We now illustrate SIS for handling shifts of distributions when there is a location shift in (2.7) of δ = −2 starting at t = 8. Only T − 1 step indicators are needed as the T th step coincides with the constant term. Figure 2.4(a) shows the resulting data, together with the fit that would be obtained if SIS was not used; (b) the first six step indicators, none of which is retained; (c) the remaining five, where S{7} is significant on the split half, and also retained by Autometrics at α = 0.05 applied to that specification (as well as by SIS at α = 0.05); and the model’s fitted values. The equation standard error σ̂ε is 1.28 without SIS and 1.02 with. Figure 2.5 shows the resulting 1-step ahead forecasts with and without SIS: the former is both more accurate and more precise. Although this outcome certainly favors SIS as expected because the DGP does have a location shift, the illustration did not use knowledge of that, nor of the timing, sign and magnitude of the shift. 2.6.4 Super Saturation Estimation Combining IIS and SIS leads to super saturation, named by Ericsson (2012), which can be helpful when there is a mix of outliers and location shifts. Kurle (2019) undertook a detailed simulation study of super saturation and found that the gauge was proportional to the total number of indicators (roughly 2(T − 1)) so the target α needed to be tight and about half that for either alone to achieve an appropriate gauge and avoid ‘overfitting’ by selecting at too loose a significance level. For example, at α = 0.005 for T = 100, the gauge was around 2.6. Selecting Models with Saturation Estimation 183 Data with a location shift at t=8 Model fit without SIS 1 2 3 4 5 6 7 8 9 10 11 12 3 4 5 6 (a)Data with a location shift at t=8 Model fit without SIS 1 2 3 4 5 6 7 8 9 10 11 12 0.25 0.50 0.75 1.00 First 6 step indicators (b) 1 2 3 4 5 6 7 8 9 10 11 12 0.25 0.50 0.75 1.00 Last 5 step indicators (c) Data with a location shift at t=8 Model fit with SIS 1 2 3 4 5 6 7 8 9 10 11 12 3 4 5 6 (d)Data with a location shift at t=8 Model fit with SIS Figure 2.4: (a) Time series with a location shift. (b) Add the first six of the step indicators: None significant, nor selected by Autometrics at α = 0.05. (c) Add the other five steps: S{7} (thick solid line) significant and also retained by Autometrics. (d) Outcome with SIS. 1-step ahead forecast with SIS ±2 Data Fit with SIS 1-step forecast, no SIS ±2 ~σf Fit without SIS 0 5 10 15 2 3 4 5 6 7 1-step ahead forecast with SIS ±2σfData Fit with SIS 1-step forecast, no SIS ±2~σfFit without SIS ^ Figure 2.5: 1-step ahead forecast ±2σ̂f with SIS (solid error bar) and without ±2σ̃f (dotted). 184 Econometric Methods for Empirical Climate Modeling 0.01, both under the null of an IID process, and for retaining irrelevant indicators when relevant ones were found. We use super saturation when modeling the UK’s CO2 emissions since 1860 as both outliers (e.g., from the General Strike of 1926) and location shifts (e.g., from the introduction of natural gas in place of coal gas starting in 1969) occur. 2.6.5 Trend Saturation Estimation Deterministic linear trends are found in a number of empirical settings, but are unlikely to maintain the same growth rate over long periods. For example, annual UK productivity since 1860 (defined as output per worker per year at constant prices), shown in Figure 2.6(a) on a log scale, exemplifies such trend shifts. 
An overall trend line completely fails to characterize the evidence: early periods of below average growth are above the line; the boom during World War I (WWI) is below and the dramatic increase in the growth rate after World War II (WWII) is UK productivity 1900 1950 2000 1.00 1.25 1.50 1.75 2.00 2.25 2.50 2.75 3.00 overall trend → ↑ 3rd of 6 separate trends (a) UK productivity UK productivity TIS selected trends 1900 1950 2000 1.00 1.25 1.50 1.75 2.00 2.25 2.50 2.75 3.00 (b) UK productivity TIS selected trends Figure 2.6: (a) UK productivity since 1860 with overall and six trend lines at roughly 25-year subperiods; (b) UK productivity since 1860 with TIS selected trend lines at α = 0.0001. 2.6. Selecting Models with Saturation Estimation 185 far below. The figure also shows the fit of six separate trends to roughly 25-year subperiods: these obviously provide a far better description, but are both arbitrary and occasionally do not match changes in trend growth. This is most obvious at the end of the period, shown by the ellipse, where measured productivity has flat-lined since the 2008 ‘Great Recession’. Trend indicator saturation (TIS: see Castle et al., 2019b, for an analysis), was applied at α = 0.0001 to compensate for larger critical values in non-stationary processes and for non-IID errors. Figure 2.6(b) shows the outcome after manual deletion of trend indicators that were selected to avoid failures of congruence: because Autometrics imple- ments a range of mis-specification tests, when a model is under-specified, as here, indicators can be retained that do not satisfy the target sig- nificance level. These were deleted by the authors given the aim of the present illustration. Even so, 12 trend indicators remained (although an overall trend was not retained). As with super-saturation, TIS could be combined with IIS or SIS, appropriately adjusting α. We noted above that two successive step indicators can capture an outlier, and as can be seen in Figure 2.6(b), several trend indicators combined can act to correct outliers and location shifts rather than trend breaks as such. It is well-known that second differencing removes linear trends, changes breaks in trends to impulses and converts location shifts to ‘blips’. Combining the near equal magnitude, opposite-sign trend indicators found for 1917–1919, 1929–1930 and 1938–1940, left nine trend breaks, namely at the earlier of each of those three dates and at 1873, 1890, 1946, 1957, 1972, and 2006, noting that these times are when the trend indicators end. Many of these are salient dates: in historical order, 1873 was the start of the UK’s ‘Great Depression’ which ended gradually by 1896; 1919 was just after the end of WWI (where 1917 had been the peak output year), and also the start of the worldwide flu’ pandemic; 1930 followed the US stock market crash of 1929, and the onset of the Great Depression; 1940 was the start of mass production for WWII and 1946 the end of war production; 1972 saw the end of the ‘post-war recovery’ growth in productivity, as 1973 was hit by the first Oil Crisis leading to the UK’s high inflation and industrial troubles; and 2006 was the end of a long period of growth, 186 Econometric Methods for Empirical Climate Modeling when the flat-lining of productivity starts with the Great Recession of 2008. However, we are not aware of conspicuous events in 1890 or 1957, although these had the least significant estimated coefficients. 
To interpret what TIS finds, imagine you are in 1874 with data on productivity over 1860–1873 and fit a linear trend. Then that will be the same as the one selected by TIS from the whole sample if the trend rate of growth then changed in 1874. Now move forward to 1891 with data from 1860–1890 and fit that first trend line jointly with one over 1860–1890: the first is replicated as the sum of the trend coefficients and the second reveals a shift. Thus, TIS replicates what the historical record would have shown at the time on the available subsamples of data. That contrasts with fitting an overall trend to the full sample, which is not what would have been found at the time, and hence distorts the historical record—as we have just discussed. A perhaps surprising application of TIS is to health care, studying the rates of adoption of Desogesterol to replace Cerazette (a synthetic progestogen) at all 213 UK National Health Service (NHS) Clinical Commissioning Groups (CCGs). Walker et al. (2019) applied TIS to analyze the heterogeneity in the extent, time, and speed of diffusion of innovations as measured by the prescribing behavior of the CCGs, and the resulting additional costs to the healthcare system of slow adoption of the replacement pharmaceutical (note the favorable editorial). 2.6.6 Multiplicative-Indicator Saturation for Parameter Changes Multiplicative-indicator saturation (MIS) focuses on changes in the parameters of variables in models. In MIS, every variable is multiplied by a complete set of step indicators: see e.g., Kitov and Tabor (2015) and Castle et al. (2017) for simulations and applications, and Castle et al. (2019a) for an analysis of its properties. Thus, SIS can be interpreted as MIS for the constant term, but would have little potency facing changes in β when that was the coefficient of (say) (xt − x) where x denotes the mean of x. Indeed, Clements and Hendry (1998) show that zero-mean changes have little impact on forecasts, and that forecast failure is primarily caused by direct or induced shifts in the long-run mean (an issue discussed further in §2.9). Nevertheless, MIS could detect 2.6. Selecting Models with Saturation Estimation 187 such parameter changes. To see why, consider knowing that T1 was the shift date: fitting separate models up to and after T1 will naturally deliver estimates of the different subsample parameters. Using MIS to select that split should find the correct indicator, or one close to it, despite the large number of interactive indicators, albeit with added variability. Searching through many initial candidate variables efficiently is essential here as the number of regressors increases rapidly if testing for non-constancy in the parameters of many variables as each variable is multiplied by T − 1 step indicators. To illustrate MIS, we consider a generalization of (2.7) adding a regressor {xt} with a mean of zero, first when its coefficient is constant with β1 = 10, and then when β1 shifts to five at observation t, which date is revealed below, with β0 = 5 throughout, so under the null: yt = β0 + β1xt + εt, εt ∼ IN[0, σ2 ε ]. (2.11) To create the GUM to implement MIS, we add to 1 and xt the candidate set of 11 variables S{j≤t}×xt = d{j}, j = 1, . . . 
, T−1 noting that ST = 1: yt = β0 + β1,0xt + T−1∑ j=1 β1,jd{j} + et, et ∼ IN[0, σ2 e ], (2.12) and apply the usual split-half approach, where no d{j} are significant in either half, nor are any when applying the general search procedure in Autometrics to (2.12) at α = 0.01, in both cases retaining 1 and xt without selection. Next, Figure 2.7 records the situation when β1 shifts to (β1 − 5) using the ‘split-half’ approach. Panel (a) shows yt and xt where the plot scales the data such that both yt and xt have the same mean and range, which here is equivalent to their regression. Panel (b) reports the first six multiplicative indicators d{1}, . . . , d{6} of which d{5} is significant. An indicator close to the shift is the most likely to be retained when the shift lies in the other half. Panel (c) records the next five multiplicative indicators d{7}, . . . , d{11} where d{7} is significant. Including the two significant indicators, only d{7}, is selected leading to the improved outcome in Panel (d), which is precisely what Autometrics finds. Figure 2.8(a) shows the data with the model’s fitted values for constant parameters in (2.11) and Panel (b) records the residuals, which 188 Econometric Methods for Empirical Climate Modeling y β x 0 2 4 6 8 10 12 -5 0 5 10 15 (a) t with shift in 1 t 0 2 4 6 8 10 12 -1.0 -0.5 0.0 0.5 1.0 First six d{t}(b) ← d{5} selected 0 2 4 6 8 10 12 -2 -1 0 1 Second five d{t} (c) ↑ d{7} yt with d{7}selected 0 2 4 6 8 10 12 -5 0 5 10 15 (d) yt ^ Figure 2.7: (a) Time series with a change in parameter β1 of δ = −5 hitting at t = 7; (b) Six multiplicative indicators: d{5} is significant at α = 0.01; (c) Next five multiplicative indicators: d{7} is selected. (d) Add d{5} & d{7} with the outcome that only d{7} is selected. are almost the same as the DGP errors. The non-constant parameter graphs shown in Figures 2.8(c) and (d) are where β1 was halved to five at a point in the sample. It is not immediately obvious from ocular econometrics (looking at the data) that the poorer fit in (c) is due to a major shift in the parameter of the regressor, or precisely where that occurred: t = 5 looks suspect but is not the switch date. Finally, Figures 2.8(e) and (f) provide the corresponding graphs after MIS, where d{7} was correctly selected. Fitting separately to t = 1, . . . , 6 and t = 7, . . . , 12 delivers almost the same parameter estimates as MIS.2 Applying MIS delivers, noting that S{12}xt = xt: ŷt = 4.89 (0.58) S{6}xt + 4.89 (0.31) S{12}xt + 5.34 (0.31) (2.13) 2Here, recursive graphs do in fact reveal the problem of parameter non-constancy. 2.6. Selecting Models with Saturation Estimation 189 0 2 4 6 8 10 12 -20 -10 0 10 20 (a) yt ŷt 0 2 4 6 8 10 12 -5.0 -2.5 0.0 2.5 Residuals: constant model (b) 0 2 4 6 8 10 12 -5 0 5 10 15 (c) ~yt yt 0 2 4 6 8 10 12 -5.0 -2.5 0.0 2.5 Residuals: non-constant model (d) 0 2 4 6 8 10 12 -5 0 5 10 15 (e) 0 2 4 6 8 10 12 -5.0 -2.5 0.0 2.5 Residuals: non-constant model with MIS (f)y t ~t MIS y Figure 2.8: (a) Time series with model fit when constant parameters; (b) Residuals from (a); (c) Time series and model fit when non-constant parameters; (d) Residuals from (c); (e) Time series and model fit with MIS when non-constant parameters; (f) Residuals from (e): All residuals on same scale. revealing the shift was at t = 7 and was from β1 = 10 to β1 = 5. If MIS+SIS is undertaken for T = 12, despite having 24 variables, the same equation results, since the intercept is constant. 
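
The logic of MIS can be sketched without the full search algorithm. The fragment
below is a deliberately simplified single-interaction search, not the Autometrics
multi-path block search used in the text; the seed and parameter values are our
own, chosen to mimic the illustration above. It tries each multiplicative
indicator d{j} = S{j}xt in turn for a DGP in which β1 halves part-way through the
sample, and reports the split with the largest |t|-value on the interaction.

```python
import numpy as np

rng = np.random.default_rng(3)
T, b0, b1, shift_at, new_b1 = 12, 5.0, 10.0, 6, 5.0   # slope halves from t = 7 in 1-based terms
x = rng.standard_normal(T)
beta = np.where(np.arange(T) < shift_at, b1, new_b1)
y = b0 + beta * x + rng.standard_normal(T)

steps = np.triu(np.ones((T, T)))        # S{j}: unity up to and including j (0-based)

best = None
for j in range(T - 1):                  # try each multiplicative indicator in turn
    Z = np.column_stack([np.ones(T), x, steps[:, j] * x])
    b, rss, *_ = np.linalg.lstsq(Z, y, rcond=None)
    se = np.sqrt(rss[0] / (T - 3) * np.diag(np.linalg.inv(Z.T @ Z)))
    t_j = abs(b[2] / se[2])
    if best is None or t_j > best[0]:
        best = (t_j, j, b)

t_best, j_best, b_best = best
print(f"slope {b_best[1] + b_best[2]:.2f} up to observation {j_best + 1}, "
      f"then {b_best[1]:.2f} (|t| on the interaction = {t_best:.2f})")
```
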
The fit and residuals from (2.13) are essentially the same as Figures 2.8(a) and (b) despite the non-constancy, because the shift in the parameter is correctly ‘picked up’ by MIS. Since xt has a mean of zero and the intercept is constant, there is little difference between the 1-step ahead forecasts with and without MIS, but a large difference in the interval forecasts as Figure 2.9 records. 2.6.7 Designed-Indicator Saturation Indicators for saturation approaches can be designed to match known properties of a physical or social process under analysis, denoted VIS for volcano-indicator saturation as applied to volcanic impacts in Pretis et al. (2016). Their indicators have the form d′t = (0, . . . , 0, 1, 0.5, 190 Econometric Methods for Empirical Climate Modeling 1-step forecasts without MIS ±2 Data 1-step forecasts with MIS ±2 σ̂f 0 5 10 15 -15 -10 -5 0 5 10 15 1-step forecasts without MIS ±2σfData 1-step forecasts with MIS ±2σ̂f ~ Figure 2.9: 1-step ahead forecast ±2σ̂f with MIS (solid error bar) and ±2σ̃f without (dotted error bar). 0.25, 0.125, 0, . . . , 0) for t = 1, . . . , T , so the first commences with (1, 0.5, 0.25, 0.125, 0, . . .) and so on. A saturating set of such indica- tors were selected over to model the impacts of volcanic eruptions on dendrochronological temperature reconstructions as follows. Volcanoes erupt both gasses and material, and if sufficiently large amounts are ejected high into the atmosphere, their emissions can block solar radiation thereby reducing temperatures, sometimes on a global scale. Because their ejected material gets gradually removed from the atmosphere, the temperature ‘shape’ caused by an eruption is relatively similar across different volcanoes, as illustrated by Figure 2.10. The abrupt initial fall in temperature creates ‘outliers’ in temperature reconstructions, such as that based on dendrochronology used here. To locate statistically significant drops in temperature of a form likely to be from a volcanic eruption, Pretis et al. (2016) ‘designed’ a saturating set of indicators from the physical-theory shape of ν to match the temperature response using dt above, an illustrative subsample of which is shown in Figure 2.11. The principle of selecting significant indicators from the VIS sat- urating set is just like that for IIS. Although there is considerable 2.6. Selecting Models with Saturation Estimation 191 Figure 2.10: Solar radiation patterns of some recent volcanic eruptions. V1908 V1909 V1910 V1911 V1912 V1913 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 -0.6 -0.5 -0.4 -0.3 -0.2 -0.1 0.0 0.1 Volcanic Functions ↑ Katmai detected ← forecast trajectory V1908 V1909 V1910 V1911 V1912 V1913 Figure 2.11: Illustrative subsample of volcanic designed indicators. uncertainty about the timings and magnitudes of some eruptions, VIS saturation estimation can help correct dendrochronological temperature records (see Schneider et al., 2017, for a new archive of large volcanic events over the past millennium based on VIS). Figure 2.12 top panel shows a temperature reconstruction since 1200. The issue is whether the large drops coincide with volcanic eruptions. 
Having detected significant indicators, many of these can be checked against dates of known eruptions such as Tambora in 1816 (the year 192 Econometric Methods for Empirical Climate Modeling Northern Hemisphere Temperature Model Fit 1200 1250 1300 1350 1400 1450 1500 1550 1600 1650 1700 1750 1800 1850 1900 1950 -2 -1 0 1 C Northern Hemisphere Temperature Model Fit Detected Volcanoes 1200 1250 1300 1350 1400 1450 1500 1550 1600 1650 1700 1750 1800 1850 1900 1950 -1 0 1 K at m ai 1 91 2 Pe le e 19 02 K ra ka to a 18 83 Ta m bo ra 1 81 6 Et na 1 83 2 Sa ba nc ay a 16 95 La ki 1 78 3 Pa rk er 1 64 1 H ua yn ap ut in a 16 01 K uw ae a 14 53 Sa m al as 1 25 8 Po po ca te pe tl 1816 - "Year without Summer" 13 45 Detected Volcanoes Figure 2.12: Top panel: Temperature data and the fit of the model; bottom panel: Detected volcanoes, 1200–2000. without a summer when it snowed in New Haven in July, Frankenstein was written by Mary Shelley when stuck for months in a Swiss Chateau from almost ceaseless rain and J. M. Turner painted his remarkable skies) or Krakatoa in 1883. Figure 2.12 reports the outcome and the names of the volcanoes detected applying VIS in a first-order autoregressive model. 2.7 Summary of Saturation Estimation This subsection summarizes saturation estimators under the null as delivered by split half for the DGP yt ∼ IN[0, σ2 ε ], t = 1, . . . , T when the model fitted to the first-half is given by yt = ∑T/2 j=1 γjd{j} + εt so no intercept is included in the model. Here, d{j} denotes the appropriate indicator. For IIS, d{j} = 1{j=t}, for SIS, d{j} = 1{j≤t}, for TIS d{j} = 1{j≤t}t and for MIS, d{j} = 1{j≤t}xt. Finally, the indicator form for the 2.7. Summary of Saturation Estimation 193 one example of VIS is d{t} in the previous subsection. The formulae for the t-tests, tt, on the tth indicator under the null, as well as their non-centralities, ψ1, for a single alternative, for all these saturation estimators are derived in Castle et al. (2019a), who also record the corresponding non-centrality, ψk, when applying those t-tests for a known alternative. Their analysis reveals the basic structure of the various indicator saturation estimators and tt tests in a split-half analysis. All the sat- uration methods transpire to estimate a weighted combination of the current and next error (wj,tεt − wj,t+1εt+1). For example, the IIS es- timate of the coefficient γ̂t of 1{t} can be written as (εt − 0εt+1) so w1,t = 1 and w1,t+1 = 0 as each impulse exactly removes the corre- sponding error. For SIS, w2,t = 1 = w2,t+1; for TIS, w3,t = t−1 and w3,t+1 = (t + 1)−1; and for MIS with a regressor zt, w4,t = z−1 t and w4,t+1 = z−1 t+1 so the estimate can be erratic for near-zero zt; and for VIS, w5,t = 1 and w5,t+1 = 0.5 (ignoring smaller weights on εt+i). Thus SIS is MIS for the constant, and TIS is MIS for the trend, two important special cases. The resulting tt tests are then a scaled combination of the weighted current and next error, albeit with different weights. The ‘future’ only appears to be included because the steps are defined as cumulating the impulses up to the given time, so induce forward differ- encing: using the isomorphic reverse formulation would lead to backward differencing. Because impulses are mutually orthogonal, γ̂t for IIS will be the same under the null as when a single impulse indicator is included at t. 
However, for SIS, when only a single step is included at T1, γ̂T1 becomes (ε(1) − ε(2)) where ε(1) = T−1 1 ∑T1 t=1 εt and ε(2) = (T − T1)−1 ∑T t=T1+1 εt, so has the same form, but would have a much smaller variance. Similar changes occur for the other cases when a single indicator is used, although under the null such formulae are not overly insightful. More importantly are the non-centralities, ψ1 when split-half sat- uration estimation is applied and the break occurs in the first half of the sample at T1 ≤ T/2. The comparable non-centrality, ψk, is that of a t-test for the known form, timing and length of break, which would also apply when the saturation estimator found precisely the 194 Econometric Methods for Empirical Climate Modeling correct outcome. As Hendry and Santos (2005) note, for the IIS t-test, ψ1 = ψk so both are inconsistent. For the other saturation estimators, ψ1 ≤ ψk, so selection is essential to eliminate irrelevant but highly collinear indicators and improve potency over split half. For SIS, the non-centrality is increased by √ 2T1 from correct selection, with similar large improvements for the others. In practice, breaks are often detected at dates close to the actual occurrence rather than the precise date since errors with the opposite signs to a shift can delay detection or bring forward the apparent end, whereas with the same sign can lead to an earlier start or later end. The analysis around (2.10) explained why. Consequently, when simulating saturation estimators, we usually calculate the potency for the actual date ± a few periods. The resulting rejection frequencies depend on the choice of α and correctly retaining all other relevant variables. When only IIS is used, but there is a location shift, many similar magnitude, same-sign indicators may be retained, and these can be costlessly combined at their average value—this will be almost the same outcome as applying SIS: see Hendry and Santos (2005). Conversely, using SIS when there is a single outlier requires two indicators to characterize it. As noted above, such indicators can be combined manually to recreate an impulse indicator; alternatively, super saturation can help avoid that, but at the cost of a tighter α and hence potential loss of potency for smaller breaks. All these aspects are illustrated in Section 7. When IIS and SIS are applied in a system, as in Section 6, indicators are retained by a likelihood-ratio test, so depend on their significance in the system as a whole not in any individual equation therein. However, an unrestricted VAR can be estimated an equation at a time, so satura- tion estimation can also be applied an equation at a time and compared to the system selection to see if very different shifts or outliers occur in different equations. All selection decisions in both settings are based on likelihood-ratio tests as described in §2.3, albeit these often coincide with conventional t-tests, which allows a seamless transition between classes of models. 2.8. Selecting Simultaneous Equations Models 195 2.8 Selecting Simultaneous Equations Models A simultaneous equations representation is a model of a system of n variables, yt, that are to be modeled as endogenous by m other variables, zt, that are non-modeled. The properties of such systems were first analyzed by Haavelmo (1943), and are a potential representation when modeling the local data generation process (LDGP: see §2.1, and Hendry, 2009, 2018). 
To validly condition on zt requires that those variables are known to be at least weakly exogenous (see Koopmans, 1950b, and Engle et al., 1983). In essence, the weak exogeneity of zt requires that the DGP of zt does not depend on the parameters of the DGP of yt conditional on zt. When the status of zt is not certain, as with CO2 in Section 6, zt should be treated, at least initially, as a component of yt. The strong exogeneity of zt, as applies to the orbital drivers in Section 6, requires that their DGP also does not depend on lagged values of yt, in which case non-linear functions of the zt can also be included as ‘conventional’ conditioning determinants of yt. A dynamic representation of the system (yt, zt) can be formulated as a vector autoregression (VAR) conditional on the zt (often denoted by VARX), lags of all the variables, and deterministic terms such as intercepts and any indicator variables, denoted dt: yt = Ψ0zt + s∑ j=1 Ψjzt−j + s∑ i=1 Γiyt−i + Adt + ut (2.14) where ut ∼ INn[0,Ωu]. The assumptions on the error process require that s is sufficiently large to create a martingale difference process, and the variables included in dt remove any outliers, location shifts and parameter changes not captured by the other regressors so that homoskedasticity and constant parameters are viable. Then given that zt is weakly exogenous, the error process will also be uncorrelated with the regressors. To check the specification of (2.14), the system should be tested for congruence: once the initial system is congruent, all later reductions of it should be congruent as well to avoid relevant information being lost. Next, a parsimonious version of the system in (2.14) can be selected, while ensuring that congruence is maintained (denoted PVARX): Hendry 196 Econometric Methods for Empirical Climate Modeling and Mizon (1993) propose evaluating such dynamic models by their ability to encompass the VAR. Since the initial system is identified, all later non-simultaneous reductions from eliminating insignificant variables must be as well. At this selection stage, the system should also have been reduced to a non-integrated (I(0)) representation so that conventional critical values can be used: if the data are I(1), cointegration and differencing can do so: see e.g., Johansen (1995), and Doornik and Juselius (2018). A system like (2.14), also called the ‘reduced form’ in the economics literature, is always identified, so that multivariate least-squares esti- mators of its parameters are unique, and under these assumptions will deliver consistent estimates. A simultaneous equations representation is a model of the system derived by reduction from (2.14). However, in economics there is often a pre-specified theory of that representation from which the ‘reduced form’ is derived (hence the terminology), in- verting the correct order of the relationship between the system and a model thereof. Written in a concise notation, with the N × 1 vector wt denoting all the right-hand side variables, the system in (2.14) is: yt = Πwt + ut where ut ∼ INn[0,Ωu]. (2.15) Then a simultaneous-equations model of (2.15) is a reduction to: Byt = Cwt + εt where εt ∼ INn[0,Σε], (2.16) with: BΠ = C and Σε = BΩuB′. (2.17) A necessary condition for (2.17) to be solvable is that there are no more non-zero parameters in B and C than the n×N in Π, which is called the order condition. In addition, when the rank condition discussed in §2.8.1 is satisfied, B and C have a unique relation to Π and (2.16) is fully identified. 
We use ‘structure’ (in quotation marks) to denote an equation with more than one endogenous variable as in (2.16), without any claim that it is a structural equation in the sense of being invariant to extensions of the information set for new variables, over time, and across regimes. A simultaneous-equations model like (2.16) then needs 2.8. Selecting Simultaneous Equations Models 197 to be estimated appropriately, because including the ith endogenous variable in the equation for the jth will induce a correlation with its equation error. An infinite number of possible estimation methods exists, characterized by the estimator generating equation in Hendry (1976). Here we use full information maximum likelihood (FIML) first proposed in Koopmans (1950a). The general formulation and estimation procedures underlying FIML are described in Hendry et al. (1988). Since a simultaneous-equations model is a reduction from the system, automatic model selection is applicable as discussed in Hendry and Krolzig (2005): Doornik and Hendry (2017) propose an algorithm for doing so based on the multi-path search procedure of Autometrics, a variant of which is applied in Section 6. 2.8.1 Identification Identification, in the sense of uniqueness of B,C, in systems like (2.16) given Π has been extensively explored in the econometrics literature: see e.g., Fisher (1966), Koopmans (1949), Koopmans and Reiersøl (1950), and Rothenberg (1973) inter alia. The rank condition for identification determines the extent to which each equation is or is not identified. In that literature, identification is usually a synonym for uniqueness, although usage also entails connotations of ‘interpretable in the light of subject matter theory’ and ‘corresponding to reality’ (as in ‘identify a demand curve’, as opposed to a supply relation, or a mix). Whether or not B,C in (2.16) can be recovered uniquely from Π in (2.15) requires the exclusion of some different variables in every equation and the inclusion of some others, otherwise linear combinations of equations cannot be distinguished.3 Given the appropriate exclusions and inclu- sions corresponding to particular elements of B and C being zero, the rank condition is then satisfied so (2.16) is fully identified. Consequently, B and C are uniquely related to Π, which entails restrictions on the Π matrix in (2.15). The system for the three ice-age variables in Section 6 3Other forms of restriction than exclusions could identify ‘structural’ parameters that do not directly satisfy the rank condition, such as a diagonal error covariance matrix, or cross-equation links, but these are not considered here. 198 Econometric Methods for Empirical Climate Modeling is highly overidentified, so all the yi,t, i 6= j can be included in equations for yj,t, j 6= i. To ensure a unique relationship and hence avoid ‘spurious identifica- tion’ of a simultaneous representation, all the right-hand side variables need to be significant at a reasonable level both in the system and in their associated equations. Otherwise, claiming identification by exclud- ing insignificant regressors from any equation based on their apparent presence in other equations, when in fact they are also insignificant there, will be misleading when such variables are actually irrelevant to the system as a whole. 
Throughout selection of a simultaneous rep- resentation, the rank condition for identification can be imposed as a constraint, both to ensure that essentially the ‘same equation’, but with different normalizations, is not included twice, and that at every stage, the current form is identified (see Hendry et al., 1988). There are three possibilities of lack of identification, just identifica- tion, and over identification (subsets of parameters could be identified or not when others are the converse, in which case the following com- ments apply to the appropriate set). When B, C are not identified, then (2.15) is the least restricted but still fully identified, representation. Any just-identified simultaneous representation with a form like (2.16) will also be minimally identified, so there is an equivalence class of such specifications with equal likelihood (see e.g., Rothenberg, 1971), although in such a setting, reductions may be possible by eliminating irrelevant regressor variables from the entire system. When B,C are over identified by the rank condition, then (2.16) is a unique representation for the given restrictions. However, Hendry et al. (2009) show there may exist different sets of restrictions embodied in matrices B∗, C∗ which are not linear transforms of B, C (precluded by their identifiability), but under which (2.16) is equally over identi- fied. Thus, again an equivalence class of such specifications with equal likelihood can result: a given degree of over identification by itself does not ensure a unique model even when there is a unique DGP. The validity of any set of over-identified restrictions can be checked through parsimonious encompassing of the system by the ‘structure’. When L is the log-likelihood of the system (2.15), and L0 that of the ‘struc- tural’ form (2.16), in stationary DGPs, the test is 2(L − L0) ∼ χ2 OR(s) 2.9. Forecasting in a Non-Stationary World 199 for s over-identifying restrictions (see Hendry and Mizon, 1993, and Koopmans, 1950a).4 2.9 Forecasting in a Non-Stationary World We will undertake forecasts for both major illustrations below, so need to address two key aspects of wide-sense non-stationarity. First, it affects forecasting directly though the different approaches needed to select models for such data as discussed in the preceding subsections; and secondly, because the observations to be forecast will also be non- stationary, different forecasting devices may be required. Specifically, location shifts at or near the forecast origin can lead to forecast failure as emphasized by Clements and Hendry (1998), as of course can shifts that occur after forecasts are made. Systematic forecast failure, defined as when forecasts are significantly different from the later outcomes compared to their ex ante forecast intervals, is mainly caused by direct or induced shifts in the long-run means of the variables being forecast. We have suggested saturation estimation during model selection as a complement to cointegration to jointly handle non-stationarity in-sample. In this subsection, we describe both the consequences for forecasts of not handling location shifts near the forecast origin, and consider forecasting devices that are more robust than ‘conventional’ methods after such shifts. 
2.9 Forecasting in a Non-Stationary World

We will undertake forecasts for both major illustrations below, so need to address two key aspects of wide-sense non-stationarity. First, it affects forecasting directly through the different approaches needed to select models for such data, as discussed in the preceding subsections; and secondly, because the observations to be forecast will also be non-stationary, different forecasting devices may be required. Specifically, location shifts at or near the forecast origin can lead to forecast failure, as emphasized by Clements and Hendry (1998), as of course can shifts that occur after forecasts are made. Systematic forecast failure, defined as forecasts being significantly different from the later outcomes relative to their ex ante forecast intervals, is mainly caused by direct or induced shifts in the long-run means of the variables being forecast. We have suggested saturation estimation during model selection as a complement to cointegration to jointly handle non-stationarity in-sample. In this subsection, we describe both the consequences for forecasts of not handling location shifts near the forecast origin, and consider forecasting devices that are more robust than ‘conventional’ methods after such shifts.

Almost all econometric model formulations are implicitly or explicitly equilibrium correction: this huge class includes regressions, autoregressions, VARs, cointegrated systems, dynamic stochastic general equilibrium models (DSGEs), and autoregressive conditional heteroskedasticity error processes (ARCH) together with generalizations thereof like GARCH. For example, a stationary scalar first-order autoregression of the form:

  y_t = ρ_0 + ρ_1 y_{t-1} + ε_t = µ + ρ_1 (y_{t-1} − µ) + ε_t,   (2.18)

where ε_t ~ IN[0, σ²_ε] with |ρ_1| < 1, and µ = ρ_0/(1 − ρ_1) is the long-run mean, so E[y_t] = µ, can be rewritten as:

  ∆y_t = (ρ_1 − 1)(y_{t-1} − µ) + ε_t.   (2.19)

Since |ρ_1| < 1, when y_{t-1} > µ then ∆y_t < 0 and the process is brought back towards µ, and similarly when y_{t-1} < µ. In that way, (2.19) ‘error corrects’, as such mechanisms are often called. Unfortunately, if µ changes to µ*, say, (2.19) will still equilibrium correct towards µ, and will continue to do so until revised to replace µ by µ*, so it does not error correct to the new long-run mean. Should such a shift occur at the forecast origin y_T, where ρ_1 changes to ρ*_1 so that µ* = ρ_0/(1 − ρ*_1), then the next observation will actually be:

  ∆y_{T+1} = (ρ*_1 − 1)(y_T − µ*) + ε_{T+1},   (2.20)

whereas from (2.19), the 1-step ahead forecast will have been (ignoring parameter estimation uncertainty for simplicity, as second order compared to the shift):

  ∆ŷ_{T+1|T} = (ρ_1 − 1)(y_T − µ),   (2.21)

leading to the forecast error ε̂_{T+1|T} = ∆y_{T+1} − ∆ŷ_{T+1|T}:

  ε̂_{T+1|T} = (ρ*_1 − 1)(y_T − µ*) − (ρ_1 − 1)(y_T − µ) + ε_{T+1}
             = (1 − ρ*_1)(µ* − µ) + (ρ*_1 − ρ_1)(y_T − µ) + ε_{T+1},   (2.22)

and since E[y_T − µ] = 0:

  E[ε̂_{T+1|T}] = (1 − ρ*_1)(µ* − µ) ≠ 0 if µ* ≠ µ.   (2.23)

Thus, (2.21) fails to correct the error induced by the location shift. An error like (2.23) will persist in future periods if the in-sample model is used unchanged, as:

  ∆ŷ_{T+2|T+1} = (ρ_1 − 1)(y_{T+1} − µ),   (2.24)

whereas:

  ∆y_{T+2} = (ρ*_1 − 1)(y_{T+1} − µ*) + ε_{T+2},

so that letting ∇ρ_1 = ρ*_1 − ρ_1 and ∇µ = µ* − µ:

  ε̂_{T+2|T+1} = (1 − ρ*_1)∇µ + ∇ρ_1 (y_{T+1} − µ) + ε_{T+2},   (2.25)

with:

  E[ε̂_{T+2|T+1}] = (1 − ρ*_1)(1 + ∇ρ_1)∇µ,

because:

  E[y_{T+1} − µ] = E[(ρ*_1 − 1)(y_T − µ*)] = (1 − ρ*_1)∇µ.

Similar mistakes of not error correcting after a shift in the long-run mean will affect all members of the equilibrium-correction class.

A surprising feature of these second-period 1-step ahead forecasts in (2.24) is that if (say) ∇µ < 0, so there has been a downward shift in the long-run mean, then:

  E[ŷ_{T+2|T+1} − y_{T+1}] = −(1 − ρ_1)(1 − ρ*_1)∇µ ≥ 0,   (2.26)

so that on average ŷ_{T+2|T+1} ≥ y_{T+1}: the next forecast is usually above the previous outcome, and conversely when ∇µ > 0. This creates a ‘hedgehog’ effect in the graph of forecasts around outcomes, and is caused by (2.19) correcting to the old equilibrium µ and hence in the opposite direction to µ*.

Figure 2.13(a) illustrates with computer-generated data on a model matching (2.18) where ρ_0 = 10 and ρ_1 = 0.65, but here ρ_0 is changed to ρ*_0 = 6, which again shifts µ, now from 28.6 to 16.1 at observation t = 86, and back to its original value at t = 96, to create a ‘recession’-like pattern. The forecasts are based on an in-sample model matching the DGP with estimated values of ρ̂_0 = 10.6 and ρ̂_1 = 0.63, so they are close to the DGP parameter values. The forecast failure from ŷ_{T+h|T+h−1} is marked:
all the outcomes from t = 86 to 96 lie outside the 95% error bars, and the forecasts come back to the data only after the shift ends. To illustrate the ‘hedgehog’ effect, lines are drawn from the outcomes at t = 86 and t = 87 to the corresponding forecasts for the next periods, both of which lie well above the previous observed values.

Figure 2.13(a) also records the forecasts from a robust device defined for h = 2, …, H by (see Hendry, 2006):

  ỹ_{T+h|T+h−1} = y_{T+h−1} + ρ̂_1 ∆y_{T+h−1} = y_{T+h−1} + ρ̂_1 (y_{T+h−1} − y_{T+h−2}),   (2.27)

which are dramatically better over the shift period, with a 33% smaller root mean-square forecast error (RMSFE) overall.

Figure 2.13: (a) Successive 1-step ahead forecasts after a location shift at t = 86 and back at t = 96 for both ‘conventional’, ŷ_{T+h|T+h−1} (RMSFE = 3.4), and ‘robust’, ỹ_{T+h|T+h−1} (RMSFE = 2.3), forecasts; and (b) 10-steps ahead forecasts, ŷ_{T+h|T+h−10}, to highlight the hedgehog effect.

So why does a mis-specified ‘model’ like (2.27) forecast better than the estimated in-sample DGP? An explanation follows from comparing the second expression in (2.18) with the second expression in (2.27), which reveals that the first long-run mean µ in the former is replaced by the ‘instantaneous’ estimator y_{T+h−1} and the second by y_{T+h−2}. These are very noisy but unbiased estimators when there is no shift in µ, and highly adaptive estimators of µ* after a shift. Simplifying by ignoring parameter estimation, and taking expected values, when µ shifts because of a constant ρ_1 but a changed ρ_0, then forecasting from T+1 to T+2:

  E[ỹ_{T+2|T+1}] = E[y_{T+1}] + ρ_1 E[∆y_{T+1}] = µ* − ρ_1² ∇µ,   (2.28)

since:

  E[y_{T+1}] = µ* − ρ_1 ∇µ  and  E[∆y_{T+1}] = (1 − ρ_1)∇µ,

so for ε̃_{T+2|T+1} = y_{T+2} − ỹ_{T+2|T+1}, as E[y_{T+2}] = µ* − ρ_1² ∇µ:

  E[ε̃_{T+2|T+1}] = µ* − ρ_1² ∇µ − (µ* − ρ_1² ∇µ) = 0,

as against:

  E[ε̂_{T+2|T+1}] = (1 − ρ_1)∇µ.

In Section 7, we forecast by both the equivalent of (2.21) and (2.27). The robustness depends on forecasting later than the shift, and does not improve the forecast from T, as Figure 2.13(a) shows. Although the algebra does not simplify neatly when ρ_1 changes, the principle is the same and much of the error cancels unless the dynamics change greatly. However, the robust device in (2.27) is noisy and over-shoots when a shift ends, and an improved device is proposed by Martinez et al. (2019). Castle et al. (2015a) re-interpret (2.27) as:

  ỹ_{T+h|T+h−1} = µ̃_a + ρ̂_1 (y_{T+h−1} − µ̃_b),   (2.29)

where µ̃_a and µ̃_b can be estimated by averages of past data rather than just a single data point. Then (2.29) defines a class of forecasting devices, ranging from one where µ̂ is the full-sample average based on T observations through to (2.27), which is based on a single data point. Equally, combinations of several members of such a robust class could be used, as using longer averages in estimating µ entails slower adjustment but less volatility, as does forecast combination in general.

Figure 2.13(b) records 10-steps ahead forecasts to highlight the hedgehog effect: well above during the drop in the variable to be forecast, and well below during the rise (an upside-down hedgehog). The former is similar to the substantive over-forecasts of productivity by the UK Office for Budget Responsibility since 2008, not adjusting to the ‘flat-lining’ seen in §2.6.5 above, much improved by the device in Martinez et al. (2019).
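A small simulation along the lines of the illustration above can reproduce both the forecast failure of the equilibrium-correction forecasts and the gain from the robust device (2.27). The sketch below uses the same DGP parameter values as Figure 2.13 (ρ_0 = 10, ρ_1 = 0.65, with ρ_0 shifted to 6 over observations 86–95); the random seed, error variance and estimation window are assumptions made only for this example.

```python
# Minimal simulation sketch of the 'hedgehog' effect and the robust device (2.27).
# Parameter values mimic the text's illustration; seed, noise level and sample
# split are assumptions for the example only.
import numpy as np

rng = np.random.default_rng(12345)
T, sigma, rho1 = 105, 1.0, 0.65
rho0 = np.full(T, 10.0)
rho0[85:95] = 6.0                       # location shift: long-run mean drops, then recovers

y = np.empty(T)
y[0] = rho0[0] / (1 - rho1)             # start at the initial long-run mean
for t in range(1, T):
    y[t] = rho0[t] + rho1 * y[t - 1] + rng.normal(0.0, sigma)

# Estimate the in-sample AR(1) by OLS on observations before the shift.
T0 = 85
X = np.column_stack([np.ones(T0 - 1), y[:T0 - 1]])
b = np.linalg.lstsq(X, y[1:T0], rcond=None)[0]   # [rho0_hat, rho1_hat]

# Successive 1-step ahead forecasts over the shift period from the two devices.
h = np.arange(T0, T)
conventional = b[0] + b[1] * y[h - 1]                 # equilibrium-corrects to the old mean
robust = y[h - 1] + b[1] * (y[h - 1] - y[h - 2])      # device (2.27)

rmsfe = lambda f: np.sqrt(np.mean((y[h] - f) ** 2))
print(f"conventional RMSFE = {rmsfe(conventional):.2f}, robust RMSFE = {rmsfe(robust):.2f}")
```

On most draws the robust device delivers a clearly smaller RMSFE over the shift period, mirroring the roughly one-third reduction reported for Figure 2.13(a); the exact figures depend on the simulated errors.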
Another example of forecasting through breaks is provided in Figure 2.14 for the eruption in 1641 of volcano Parker in the Philippines. The model using VIS to detect volcanic eruptions was described in §2.6.7, and the Figure shows the forecasts from the first-order autoregression (denoted AR(1)) without saturation estimation, and that model with the detected indicator selected from the single observation at the eruption date. As can be seen, the AR(1) without the break indicator forecasts a rise, as derived from the theory above, whereas that with the VIS form estimated just from the initial fall does a reasonable job of tracking the temperature recovery, despite forecasting six periods ahead from an indicator fitted to a single data point. Because the break pattern is very different from a location shift, the robust device described in (2.27) also performs poorly, although its forecasts are not shown.

Figure 2.14: Forecasting through the example eruption in 1641 of volcano Parker in the Philippines: observed temperature with AR(1) forecasts excluding and including the break indicator.

3 Some Hazards in Empirical Modeling of Non-Stationary Time-Series Data

We describe some of the hazards that can be encountered when empirically modeling non-stationary time-series data, with potential implications for analyzing observations on climate variables. Most importantly, the degree of integration of time-series data need not be constant, as integrability is a derived, not an intrinsic, property of a time-series process, a key issue addressed in this section. Other hazards include unmodeled shifts in relationships, or more generally, incorrectly modeled relations omitting substantively important explanations; data measurement errors; and aggregation bias. Any of these can seriously distort empirical statistical studies, leading to mistaken inferences and hence fallacious conclusions, and possibly false causal attribution. We use an example from societal behavior related to climate change (vehicle distances traveled) to illustrate how false implications can arise, even when an analysis is undertaken in a substantive and well understood framework, not ‘mere data mining’, and to explore the possibility of revealing such problems. To highlight as many hazards as possible in a simple example, the section is written like a ‘detective’ story where a potential culprit is only revealed towards the end.

Figure 3.1: Change in the radiative forcing measure of CO2, with the start of Keeling’s flask measurements marked.

Our analysis of what can go wrong with an empirical statistical study builds on the critique by Pretis and Hendry (2013) of the claimed absence of links between temperature and radiative forcing of greenhouse gases in Beenstock et al. (2012). Those authors used the measure of the changes in the radiative forcing of CO2 shown in Figure 3.1, merging ice-core based data with Charles Keeling’s direct atmospheric readings (discussed below). The time-series properties of the subperiods are clearly different: the data up to 1958 seem to be I(0) around a level, and thereafter are trending up with a much larger variance.
Combining the two sub-samples suggests that the overall sample is I(2), and as the temperature time series is I(1), Beenstock et al. claim it cannot be caused by radiative forcing, despite the well-established theory to the contrary. Pretis and Hendry (2013) highlighted the hazards that can be encountered in statistical analyses by an example that implies the absurd conclusion that moving vehicles do not cause human road fatalities, to match the equally absurd ‘proof’ that greenhouse gases don’t cause climate change. The examples of Venus, boiled by an excess of greenhouse gases, and Mars, frozen by a lack thereof, are notable extremes. Here, we address the problems that can be encountered in empirical modeling using their road fatalities example as one where the link to moving vehicles is not, to our knowledge, disputed.

For millennia, from horse-drawn carriages to engine-driven vehicles, collisions with people have injured and killed them, vastly more so as cars initially proliferated. Figure 3.2 updates the UK data in Pretis and Hendry (2013) for total vehicle distances driven in billions of kilometers p.a. (denoted D_t) and road fatalities (F_t) to 2017, with six new observations since their earlier analysis for an out-of-sample evaluation. The four panels labeled (a), (b), (c), (d) respectively show F_t, ∆F_t = F_t − F_{t−1}, D_t and ∆D_t.

Figure 3.2: (a) Road fatalities p.a. in the UK (F_t); (b) its first differences (∆F_t); (c) vehicle kilometers driven p.a. in billions (D_t); and (d) its first differences (∆D_t).

Data on UK road fatalities are only available continuously from 1979 onwards and previously were interpolated from intermittent data, as the graph of ∆F_t reveals. The ‘blocks’ and large jumps suggest major data measurement errors, changing its time-series properties, although only observations from 1949 can be used in the regressions below given the shorter sample on D_t. It is also manifest from their graphs that F_t and D_t are highly non-stationary with changing means and variances, and have strong opposite trends.

If an empirical analysis is undertaken of these two variables, absent subject-matter knowledge, then it ‘proves’ that greater distances driven by vehicles each year lead to fewer road deaths. We establish that finding by a statistical analysis of these two variables that includes testing for cointegration, checking the constancy and invariance of the relationship, and conditional forecasting of the six new observations on fatalities given the traffic data. The empirical model passes all the required tests, providing a near congruent explanation that even satisfies a test for invariance to large shifts in D_t. This example also serves to establish our notation. The aim is to stress that however sophisticated a statistical analysis may appear to be, the implications have to be understood in a substantive context: the claims that road fatalities are not caused by moving vehicles, or could be reduced by vehicles driving greater distances, are both absurd.

Our illustration is a first-order autoregressive-distributed lag model (ADL: see Hendry, 1995, Ch. 7), relating F_t to its first lag, F_{t−1}, a constant, D_t and D_{t−1}, estimated by least squares over 1951–2011:
  F̂_t = 728 + 0.912 F_{t−1} + 12.3 D_t − 13.7 D_{t−1}   (3.1)
       (169)  (0.022)         (3.16)     (3.2)

  σ̂ = 163,  R² = 0.99,  F_AR(2, 55) = 3.53*,  χ²_nd(2) = 3.66,  F_Het(6, 54) = 1.17,
  t_ur = −4.08**,  F_ARCH(1, 59) = 0.02,  F_Reset(2, 55) = 2.83.

In (3.1), estimated coefficient standard errors are in parentheses below the estimated coefficients, σ̂ is the residual standard deviation, and R² is the coefficient of multiple correlation: see §2.2.4 for the model evaluation test statistics. All the estimated coefficients are highly significant in (3.1), there is a near perfect fit, only one mis-specification test is significant at even 5%, and the PcGive unit-root t-test, denoted t_ur, rejects the null hypothesis of no cointegration at 1% (see Ericsson and MacKinnon, 2002). Moreover, the equation is constant over the new observations, with F_Chow(6, 57) = 0.06, where F_Chow is a parameter-constancy forecast test over 2012–2017, with a RMSFE of 42.1, about a quarter of σ̂ in (3.1), so the model fits the data that have arrived since Pretis and Hendry (2013) far better than the previous sample. The long-run solution derived from (3.1) is:

  F̃ = 8257 − 15.7 D   (3.2)
      (636)   (2.4)

which has a negative effect from D. Thus, there is a long-run stationary relation between the non-stationary series of road fatalities and vehicle kilometers driven, such that road fatalities decrease with vehicle kilometers driven.

Figure 3.3 records the fitted and actual values and 1-step ahead conditional forecasts F̂_{T+h|T+h−1} | D_{T+h} for h = 1, …, 6, the residuals from (3.1) scaled by σ̂, with their histogram, density and correlogram. There is residual autocorrelation, as well as visual evidence of some residual non-normality and heteroskedasticity. Applying IIS and SIS at 1% (see §2.6), one outlier was selected for 1987, denoted 1{1987}, with step shifts that ended in 1955 and 1995, denoted S{1955} and S{1995} respectively, the first of these applying to a very short initial sample that may not have been accurate. The outcome was a larger negative impact of D on F, with no significant diagnostic tests.

To assess short-run effects, Pretis and Hendry (2013) estimated an equilibrium-correction model using the derived long-run (cointegrating) solution in (3.2), modeling the changes in road fatalities by changes in vehicle kilometers driven and deviations from the long-run equilibrium:

  ∆̂F_t = 12.3 ∆D_t − 0.088 Q̃_{t−1}   (3.3)
         (2.48)       (0.012)

where Q̃ = F − F̃, with a residual standard deviation of σ̂ = 160. Equation (3.3) shows a short-run increase in deaths as vehicle kilometers driven increase, with the long-run decrease embodied in Q̃.

Figure 3.3: (a) Fitted, actual values and conditional forecasts of road fatalities p.a. in the UK; (b) residuals scaled by σ̂; (c) scaled residual histogram and density with N[0, 1]; and (d) residual correlogram, all for (3.1).
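To make the link between (3.1) and (3.2) concrete, the sketch below estimates an ADL(1,1) by OLS and derives its static long-run solution by setting F_t = F_{t−1} = F* and D_t = D_{t−1} = D*. The series used here are synthetic placeholders rather than the actual UK fatalities and traffic data, so the printed numbers will not reproduce (3.1); only the computation is illustrated.

```python
# Minimal sketch: ADL(1,1) by OLS and its derived long-run solution.
# F and D below are synthetic placeholders; substitute the real series.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 60
D = 100 + np.cumsum(rng.normal(5, 2, n))        # placeholder 'distance driven'
F = 8000 - 14 * D + rng.normal(0, 150, n)       # placeholder 'fatalities'

y = F[1:]                                        # F_t
X = sm.add_constant(np.column_stack([F[:-1],     # F_{t-1}
                                     D[1:],      # D_t
                                     D[:-1]]))   # D_{t-1}
res = sm.OLS(y, X).fit()
b0, b1, b2, b3 = res.params

# Static long-run solution: F* = b0/(1-b1) + ((b2+b3)/(1-b1)) D*.
intercept_lr = b0 / (1 - b1)
slope_lr = (b2 + b3) / (1 - b1)
print(f"long-run solution: F* = {intercept_lr:.1f} + ({slope_lr:.2f}) D*")
```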
3.1 Assessing the Constancy and Invariance of the Relationship

The first check is on the constancy of the relationship in (3.1). In one interpretation, the step shifts S{1955} and S{1995} found above are evidence against that hypothesis, although the former only applies to four years. Conditional on their inclusion, the resulting recursive estimates of the other four coefficients are remarkably constant, as shown in Figure 3.4.

Figure 3.4: Recursive estimates of the coefficients of: (a) F_{t−1}; (b) constant term; (c) D_t; and (d) D_{t−1}, with (e) 1-step residuals and recursive estimates of σ; and (f) recursively computed Chow tests.

To examine the possible ‘causal’ nature of (3.1), we next assessed its invariance to large changes in D_t (see e.g., Engle and Hendry, 1993, and Castle et al., 2017). Applying SIS to a first-order autoregression in D_t over the whole period, four highly significant step indicators were retained, ending in 1985, 1989, 2007, and 2013 (p < 0.0002). Adding these to (3.1) yielded F_exclude(4, 57) = 1.79, which is insignificant. Thus, a powerful test for invariance of (3.1) does not reject that hypothesis. It would appear that F_t and D_t cointegrate in a congruent, constant relation that remains so outside the original data sample, and is invariant to the large shifts in D_t. How can such powerful statistical evidence fly in the face of the obvious falsity of the proposition that moving vehicles decrease road fatalities?

The third check is whether the assumed degree of integration is constant (here, I(1)). Merging data of ostensibly the same variable from different measurement systems can alter the apparent degree of integration, and that was a key problem with the Beenstock et al. (2012) study. We use an augmented Dickey–Fuller test (ADF: see Dickey and Fuller, 1981) with a constant but no trend, roughly splitting the sample pre and post 1975. That for F_t over 1933–1975 yielded t_adf = −2.74, which is close to the 5% significance level of −2.93 despite the small sample, where the first lagged difference was highly significant with t = 8.13. In the second period, t_adf = −1.31 and no lagged differences were significant, which is a marked change. Referring back to Figure 3.2, the data behavior certainly changes noticeably after interpolation ends. The converse happens with D_t, albeit the samples are even smaller. Over 1951–1983, t_adf = 0.06, whereas over 1984–2017, t_adf = −3.96** with no lagged differences included, or t_adf = −2.80 with a significant first lagged difference. At best, the assumption of a constant degree of integration is dubious. Including a trend in these tests alters the outcomes such that the only near-significant outcome is now for F in the second period, with t_adf = −3.25 where the critical value is −3.52. However, that check highlights a possible issue: by failing to include a trend in the cointegration analysis, the formulation did not ensure the test was similar (see Nielsen and Rahbek, 2000).
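Subsample ADF comparisons of this kind are straightforward to replicate. The sketch below runs an augmented Dickey–Fuller test with a constant but no trend on each half of a series; the simulated series, split point and AIC lag selection are assumptions for illustration only, to be replaced by F_t or D_t over the periods quoted in the text.

```python
# Minimal sketch of checking whether the apparent degree of integration is
# constant, via ADF tests (constant, no trend) on two subsamples.
# 'series' is a simulated placeholder; replace it with F_t or D_t.
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(1)
series = np.cumsum(rng.normal(size=120))        # placeholder I(1)-like series

split = 60                                      # assumed split point for the example
for label, sub in [("first subsample", series[:split]),
                   ("second subsample", series[split:])]:
    t_adf, p_val, used_lags, *_ = adfuller(sub, regression="c", autolag="AIC")
    print(f"{label}: t_adf = {t_adf:.2f}, p = {p_val:.2f}, lags used = {used_lags}")
```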
In the present conditional single-equation specification, adding a linear trend to an unrestricted version of (3.3) leads to:

  ∆̂F_t = 1834 − 0.154 F_{t−1} − 1.16 D_{t−1} − 9.55 t + 11.8 ∆D_t − 438 S{1955}   (3.4)
         (638)  (0.028)          (1.16)          (10.5)    (2.94)       (118)

  R² = 0.55,  σ̂ = 151.9,  F_AR(2, 54) = 1.97,  F_ARCH(1, 60) = 0.65,
  χ²_nd(2) = 2.82,  F_Het(9, 52) = 1.67,  F_Reset(2, 54) = 1.90,

with F_Chow(6, 56) = 0.09, where the remaining significant step indicator has also been retained. While both the trend and D_{t−1} are insignificant, if D_{t−1} is deleted, the result is:

  ∆̂F_t = 2337 − 0.155 F_{t−1} − 19.6 t + 11.6 ∆D_t − 463 S{1955}   (3.5)
         (389)  (0.028)          (2.94)   (2.94)       (115)

  R² = 0.54,  σ̂ = 151.6,  F_AR(2, 55) = 2.22,  F_ARCH(1, 60) = 1.44,
  χ²_nd(2) = 3.06,  F_Het(7, 54) = 1.66,  F_Reset(2, 55) = 2.18,

with F_Chow(6, 57) = 0.85. This is at last interpretable: increases in road traffic lead to increased fatalities, but the trend reduction in deaths is due to cumulative improvements in many aspects of road safety.

There are many potentially relevant explanatory variables omitted from a model simply relating road fatalities to distances driven. A partial list affecting deaths from vehicle accidents would include: improved driving standards after 1935 from more stringent driving tests, increasingly tough this century; safer cars with better brakes, such as discs from the mid 1950s, then anti-lock braking systems (ABS) in the 1980s, and improved crash impact designs; reduced fatalities from retractable front seat belts, compulsory from 1983 (see the analysis in Harvey and Durbin, 1986), and in the later 1980s from air bags; increasing separation of opposite-direction traffic flows on motorways from 1959 onwards; cameras at traffic lights as well as speed cameras; reductions in drunk driving from electronic breathalyzer tests after 1967 and lower acceptable alcohol limits; possibly heavier penalties for traffic violations; and so on.

In terms of reducing UK pedestrian fatalities, we note ‘zebra crossings with Belisha beacons’ dating back to 1934, and pedestrian-controlled traffic lights in 1969 (often called pelican crossings in the UK); lower speed limits in urban areas; better road safety training, especially for children, etc. Converse effects come from faster cars; driver overconfidence; driving after taking drugs; and recently driving, and even walking, while using mobile phones. Also, less has been done in the UK to protect cyclists: although fatalities have decreased slightly this century, the number of serious injuries has risen. Modeling total fatalities involves changing aggregation biases from different sub-populations being killed (pedestrians, pedal and motor cyclists, cars and other drivers) as a consequence of the differential effects of the above changes. Overall, the reduction from almost 8000 deaths p.a. in 1967 to under 1800 in 2017 reflects these many factors cumulatively, although a constant linear trend is obviously a crude approximate description (but see the analysis of trends in Hendry, 1995, Ch. 15), and the pandemic will create large location shifts in both ∆F_t and ∆D_t.

3.2 An Encompassing Evaluation of the Relationship

To discriminate between the two non-nested explanations by trend or D_{t−1}, we use an encompassing test between (3.5) and (3.3) over 1950–2011, both with the three indicators.
The formal analysis of encompassing originates with Mizon and Richard (1986), who also relate it to non-nested hypothesis tests: see Bontemps and Mizon (2008) for a recent overview. The two hypotheses are denoted M1, which relates ∆F_t to 1, F_{t−1}, t, ∆D_t, S{1955}, 1{1987}, S{1995} with σ̂[M1] = 136.2 from (3.5), and M2, which relates ∆F_t to ∆D_t, Q̃_{t−1}, S{1955}, 1{1987}, S{1995} with σ̂[M2] = 139.5 from (3.3) augmented by the indicators. The instruments are the regressors of the joint nesting model: 1, F_{t−1}, t, ∆D_t, S{1955}, 1{1987}, S{1995}, Q̃_{t−1}, with σ̂[Joint] = 134.0. The resulting test statistics are shown in Table 3.1. The results reject M2 in favor of M1, though not decisively. Nevertheless, the combined evidence is consistent with a long-run trend decrease that is an approximation to many safety improvements, despite increased distances driven causing more fatalities.

Table 3.1: Encompassing test statistics

  Test                 Form       M1 vs. M2    Form       M2 vs. M1
  Cox (1962)           N[0, 1]    −1.96*       N[0, 1]    −3.06**
  Ericsson (1983) IV   N[0, 1]    1.70         N[0, 1]    2.58**
  Sargan (1964)        χ²(1)      2.68         χ²(3)      7.13
  Joint model          F(1, 54)   2.77         F(3, 54)   2.57
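One of the simplest entries in Table 3.1 to reproduce is the ‘Joint model’ row: each candidate model is tested as a restriction of the model containing both sets of regressors, via a standard F-test on residual sums of squares. The sketch below implements that comparison with synthetic placeholder data; X1, X2 and the joint regressor matrix are stand-ins for the M1, M2 and nesting specifications above, not the actual road-fatality variables.

```python
# Minimal sketch of the 'joint nesting model' encompassing F-tests.
# The data and regressor matrices below are synthetic placeholders.
import numpy as np
from scipy import stats

def rss(y, X):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    return float(e @ e), X.shape[1]

def encompassing_F(y, X_model, X_joint):
    # F-test of the candidate model's restrictions against the nesting model.
    rss_m, k_m = rss(y, X_model)
    rss_j, k_j = rss(y, X_joint)
    df1, df2 = k_j - k_m, len(y) - k_j
    F = ((rss_m - rss_j) / df1) / (rss_j / df2)
    return F, stats.f.sf(F, df1, df2)

rng = np.random.default_rng(2)
n = 62
x1, x2, z = rng.normal(size=(3, n))
y = 1.0 + 0.8 * x1 + rng.normal(scale=0.5, size=n)
X1 = np.column_stack([np.ones(n), x1])              # placeholder for M1's regressors
X2 = np.column_stack([np.ones(n), x2])              # placeholder for M2's regressors
X_joint = np.column_stack([np.ones(n), x1, x2, z])  # joint nesting model

for name, Xm in [("M1", X1), ("M2", X2)]:
    F, p = encompassing_F(y, Xm, X_joint)
    print(f"{name} vs joint model: F = {F:.2f}, p = {p:.3f}")
```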
Overall, the illustration highlights some of the difficulties that can arise when not commencing from a sufficiently general model to embed the local data generation process, albeit that formalizing the effects of the many technological, social, educational and legal changes affecting road safety would be hard. In wide-sense non-stationary processes, even well-established tests like those for a unit root may implicitly make untenable assumptions, such as a non-changing degree of integration, leading to misinterpretations. We now turn to the reasons for seeking to apply our tools to climate modeling.

4 A Brief Excursion into Climate Science

The Earth’s climate depends on the balance between the sun’s incoming radiation and the heat loss back to space. Short-wave radiation from the Sun reaches the Earth and warms the planet’s land and sea surfaces; heat is then radiated back through the atmosphere to space. However, greenhouse gases like CO2 absorb some of that long-wave radiation en route, and it is then re-radiated, with some being directed back towards the planet’s surface. Consequently, higher concentrations of such greenhouse gases will increase the extent of re-radiation, raising temperatures.[1] The sun has itself warmed since its formation, increasing the radiation reaching the Earth over geological time, but has been relatively stable for the epoch of human (i.e., homo sapiens) existence.

[1] Myhre et al. (2001) review the radiative forcing of different greenhouse gases, and e.g., Kaufmann et al. (2013) show that the stochastic trend in global temperature is driven by the stochastic trends in anthropogenic forcing series.

The composition of the Earth’s atmosphere has also changed greatly over geological time, altering radiation balances. That atmosphere currently comprises the four major components of water vapor, nitrogen (almost 80% of dry air), oxygen (about 20%) and carbon dioxide. There are also smaller volumes of related greenhouse gases, including nitrous oxide, N2O, which is becoming an increasing component of greenhouse gases, methane, CH4, and various chlorofluorocarbons, CFCs, halons and other halocarbons.[2] These atmospheric components differ greatly in their roles in retaining heat in our planet. Water vapor is crucial, created by evaporation and providing rain, but also retaining heat: as the climate warms, more evaporation will lead to greater heat retention and more cloud cover, but that in turn will reflect back some of the incoming radiation.

[2] Despite being non-reactive, CFCs gained notoriety for destroying the ozone layer by breaking down from ultraviolet radiation in the upper atmosphere. Although the Montreal Protocol has led to a major reduction, they remain powerful greenhouse gases by absorbing infrared radiation, as are substitutes like HCFCs. Nitrous oxide also now poses a serious problem for the ozone layer: see Ravishankara et al. (2009).

The roles in atmospheric heat retention of water vapor, carbon dioxide, and dry air (mainly nitrogen and oxygen) were elegantly demonstrated by Eunice Foote in 1856, who filled separate glass jars with them to compare how they heated when placed in sunlight. She showed that the flask containing water vapor heated more than one with dry air, but that carbon dioxide heated considerably more and took far longer to cool. Her simple experiment could be demonstrated to school children to explain why CO2 emissions are causing climate change, leading to the worrying trend in global temperatures. Foote’s research predated the confirming and more exact experiments by John Tyndall.[3] The physics of greenhouse gases was established by Arrhenius (1896), who argued that the atmospheric change in temperature was proportional to the logarithmic change in CO2 concentrations: Weart (2010) provides a history of the discovery of global warming.

[3] http://www.climatechangenews.com/2016/09/02/the-woman-who-identified-the-greenhouse-effect-years-before-tyndall/.

Next, nitrous oxide emissions have doubled in the last 50 years (see e.g., U.S. Energy Information Administration, 2009) and are about 300 times more potent per molecule than carbon dioxide as a greenhouse gas. Catalytic converters for car exhaust emissions oxidize and reduce nitrogen oxides and carbon monoxide to CO2, nitrogen and water, but can produce nitrous oxide when the exhaust system is cold or malfunctioning. Excess fertilizer on fields that runs off into rivers and lakes also releases nitrous oxide. The Black Sea is an indication of how fast such problems can happen due to excess nitrogen and phosphates from run-off, with the anoxic layer reaching the surface and killing its fish (see Mee, 2006), fortunately now tackled.

Lüthi et al. (2008) establish that methane concentrations in the atmosphere are now double the levels seen over the past 800,000 years, and that ‘strong correlations of methane and CO2 with temperature reconstructions are consistent’ over that period. As noted in Hendry (2011), go to any lake in northern Siberia, drill a hole in the ice and hold a flame over it, but jump back quickly to avoid being burned by the methane explosively catching fire. Melting the permafrost in Siberia’s tundra could lead to a marked increase in global temperatures (see Vaks et al., 2020). Methane is about 20 times as powerful as CO2 as a greenhouse gas, with a half-life in the upper atmosphere of around 15 years, as it gradually gets converted to CO2, so it has a second effect on climate.
Current estimates of the world’s methane hydrates are over 6 trillion tonnes, which is roughly twice the carbon content of all fossil fuels.[4] The proportions of all these components of the atmosphere have been greatly altered over geological time by many natural processes, as well as more recently by humanity. These natural processes include the evolution of photosynthesis converting CO2 into energy and releasing oxygen, thereby cooling the planet once iron oxidation was completed; massive volcanism, releasing vast volumes of greenhouse gases and shorter-term cooling particulates; and tectonic plate movements altering the locations and depths of the oceans, which also play a key role through both heat and CO2 absorption and release to maintain a temperature balance with the Earth’s atmosphere. The consequences of these changes are the subject of §4.2. First we digress to consider the possibility that, despite the magnitude of the planet and its apparently vast oceans, human behavior has become a geological force, reflected in the suggestion of renaming the current epoch from Holocene to Anthropocene.

[4] Methane hydrates are crystalline solids in which a gas molecule is surrounded by a cage of water molecules that act as ‘cement’ at low temperatures and high pressure.

4.1 Can Humanity Alter the Planet’s Atmosphere and Oceans?

To answer the first part of the question in the title, we just need to look at a satellite photograph of the atmosphere, as in Figure 4.1. This shows that our atmosphere is but a thin blue line atop the Earth, relatively as thick as a sheet of paper round a soccer ball. Almost everyone knows that the peak of Mt Everest, at just over 29,000 feet (8,848 m), is above the height at which there is sufficient oxygen to sustain life, yet most people seem to act as if the atmosphere is almost unbounded. Given how little air there is, it should not come as a surprise that human economic activity can alter the composition of the atmosphere to influence the Earth’s climate, as our emissions of a variety of greenhouse gases like CO2, N2O, CH4, and various CFCs have done, and are increasingly doing. Indeed, sulphur hexafluoride (SF6), widely used in the electricity industry as an insulator to prevent short circuits and fires, is a very long-lived gas estimated to be more than 20,000 times worse over a century as a greenhouse gas than CO2, and is already leaking into the atmosphere at almost double the rate of 20 years ago: see Rigby (2010).

Figure 4.1: Satellite photograph of the Earth’s atmosphere, available from NASA.

4.1.1 Anthropogenic Increases in Greenhouse Gases

Adding greenhouse gases to the atmosphere increases global temperatures, a growing concern reflected in the Paris Accord agreement at COP21 to seek to limit temperature increases to less than 2 °C, and ‘to pursue efforts to limit it to 1.5 °C’. The recent Special Report by the Intergovernmental Panel on Climate Change (IPCC: https://www.ipcc.ch/sr15/) emphasizes that the latter is still just achievable, but that rapid action is required if it is to be achieved. A comparison of current with estimated atmospheric CO2 levels over the last 800,000 years of Ice Ages is informative, as shown in Figure 4.2. Over the long period shown, atmospheric CO2 varied over the range of roughly 175 parts per million (ppm) to 300 ppm, for reasons addressed in Section 6.
However, the recent CO2 records collected at Mauna Loa in Hawaii by Charles Keeling from 1958 (see Keeling et al., 1976, and Sundquist and Keeling, 2009) show a strong upward trend from more than 300 ppm to now exceed 400 ppm: see CO2 Program Scripps (2010).

Figure 4.2: Atmospheric CO2 levels pre-Industrialization and recent Mauna Loa recordings.

These increases in atmospheric levels of CO2 are clearly anthropogenic in origin, as shown by the different isotopic ratios of CO2 from using fossil fuels compared to its release from photosynthesis by plants. This matches the attribution of CO2 emissions to human activity in (e.g.) Hendry and Pretis (2013), as the trend dominates the marked seasonal variation. Plants absorb more CO2 in their growing seasons and release more as they die back in winter. The marked seasonality which that creates is due to the greater proportion of the planet’s land being in the northern hemisphere. The climate change resulting from higher CO2 concentrations has potentially dangerous implications, as highlighted by various IPCC reports,[5] and many authors including Stern (2006), leading to the agreement in Paris at COP21. Meinshausen et al. (2009) analyze the difficulties of even achieving 2 °C as annual changes are increasing, so Paris COP21 has not yet even slowed the growth of CO2 emissions.

[5] See e.g., https://www.ipcc.ch/report/ar5/wg2/.

How can we be sure human activity is responsible? Here is how. There have been trend increases in our use of fossil fuels and in deforestation. Since Suess (1953) it has been known that the radioactive isotope carbon-14 is created by cosmic rays in the upper atmosphere hitting CO2 molecules, after which the radioactivity gradually decays. Since coal and oil deposits were laid down hundreds of millions of years ago, their radioactivity has dissipated, so carbon dioxide released by their burning lacks this radioactive isotope. The changing ratio of the isotopes of carbon detected in the atmosphere would point directly at anthropogenic sources. Unfortunately, atmospheric nuclear explosions have radically altered that ratio, making it inapplicable as an indicator of human fossil fuel consumption. However, the ratio of another heavier isotope, carbon-13, relative to carbon-12 in atmospheric CO2 is also larger than its ratio in fossil fuels, and is not affected by nuclear tests. Consequently, if additional CO2 output is due to burning fossil fuels, the ratio of carbon-13 to carbon-12 should be decreasing, as is occurring. Moreover, oxygen is needed for combustion, and matching the increases in CO2, atmospheric levels of oxygen have been falling slowly, albeit from a substantial level.

4.1.2 The Earth’s Available Water

Although the oceans seem vast, and seen from space Earth is a blue sphere, actually collecting all the Earth’s water makes but a ‘puddle’, as Figure 4.3 shows. The spheres shown represent respectively:

(1) All water on Earth, in the largest sphere over western USA, a mere 860 miles in diameter.

(2) All fresh liquid water in the ground, lakes, swamps, and rivers, in the sphere over Kentucky, just 169.5 miles in diameter.
(3) Fresh water in lakes and rivers, in the tiny sphere over Georgia, only 34.9 miles in diameter.

Figure 4.3: Earth’s water resources. Credit: Howard Perlman, USGS; globe illustration by Jack Cook, Woods Hole Oceanographic Institution © Adam Nieman.

Imagine heating up these small spheres, or adding millions of tons of plastic or other polluting substances to them, or making them absorb gigatonnes of CO2. It is easy to see how we can affect oceans, lakes and rivers. These tiny spheres may seem to conflict with pictures of apparently endless oceans shown on programs like Blue Planet, and other films. But oceans are relatively shallow: the Atlantic is around 2.25 miles deep on average. The Pacific is wider and deeper at about 2.65 miles, at its deepest in the Challenger Deep of the Mariana Trench, roughly 6.8 miles down, and holds 170 million cubic miles of water in total, just over half the 330 million cubic miles of water in the largest sphere shown in Figure 4.3.

From the narrower viewpoint of climate change, ocean warming, sea-level rises and the creation of a weak carbonic acid are the three key issues. The warming climate leads to thermal expansion and to loss of glaciers and ice sheets over land. Based on tide gauges and, later, satellite altimetry, the global mean sea level has been rising since 1880, as seen in Figure 4.4. Although there is uncertainty around early measures based on tide gauges, these errors are dwarfed by the trend; and the later satellite altimetry matches the trend for the overlapping period. Not only has that rise accelerated this century to 3.4 mm per annum, future rises till about 2050 are inexorable from the oceans’ gradual response to atmospheric temperatures already achieved, though later rises depend on how much more greenhouse gas is emitted. To put that annual rise of 3.4 mm in context, dumping into the world’s oceans a volume of earth, pebbles and stones equivalent to the top five inches from the 9.1 million square miles of the land area of the United States would raise sea levels by about 3.3 mm, and only once.[6]

Figure 4.4: Global mean sea level (GMSL) has risen more than 20 cm since 1880 and is now rising at 3.4 mm p.a. versus 1.3 mm p.a. over 1850–1992. Source: CSIRO.

[6] Testimony by Philip Duffy, President of Woods Hole Research Center, before the USA House Committee on Science, Space, and Technology, as reported by The Washington Post, 17 May 2018.

Such rises have dangerous implications for populations living near coasts from increases in the heights and frequencies of extreme sea levels from combinations of high tides, wind-driven waves and storm surges (see inter alia, Vousdoukas et al., 2018). What were rare flooding events historically will occur more frequently after sea-level rises. Failing adequate advance preparations, the risks to natural and human systems from sea-level rise and extreme sea levels include flooded cities, beach and cliff erosion, biodiversity loss, territorial losses (e.g., small island nations), displacement of people, harm to physical and psychological health and well-being, and stress upon energy, transport and communication systems (see e.g., Jackson and Hendry, 2018).[7]

[7] Worse, it seems satellite measurements have overestimated coastal heights, so an extra 300 million people will be affected by sea-level rises by 2050: see https://sealevel.climatecentral.org/news/new-study-triples-global-estimates-of-population-threatened-by-sea-level-ri/.

Confirming that part of the reason for the rising sea level is that Earth’s oceans are warming, recent measurements reveal that the ocean heat content is higher than previously estimated and extends to a greater depth: see Figure 4.6, Zanna et al. (2019) and Cheng et al. (2019).
Not only will thermal expansion be greater, warmer oceans threaten sea life, from coral reefs (whose destruction is further exacerbated by increasing carbonic acid) through a rise in the chemocline between oxygenated water above and anoxic water below (see e.g., Riccardi et al., 2007).

4.2 Climate Change and the ‘Great Extinctions’

A number of major extinction events have occurred over geological time when many of the world’s life-forms ceased to exist, defined as their permanent disappearance from the fossil record. That record is incomplete, as the recent spate of discoveries of fossils of new dinosaur species in China attests; and disappearance is not final, as revealed by the curious tale of the rediscovery of a living coelacanth, previously thought extinct for about 70 million years (see Thomson, 1991, for that exciting story). Although dating is difficult for very distant events, and species go extinct intermittently in the absence of major events, the vast numbers of marine and land species vanishing over several relatively short geological time intervals is convincing evidence of ‘great extinctions’.

‘Mass extinctions’ occurred even before land life evolved, including in the pre-Cambrian era, before about 600 mya (million years ago). That was so severe that almost all micro-organisms vanished, possibly from large-scale cooling and global glaciation (called ‘snowball Earth’: see Hoffman and Schrag, 2000). The Cambrian period then appears to have suffered four more major marine extinctions, possibly also from global sea cooling. These early disasters were followed by five others over the next 500 million years, shown by the labeled peaks in Figure 4.5. Figure 4.5 shows one estimate of the percentage of species vanishing from the fossil record, a clear demonstration of the fragility of life forms to the major climate changes that occurred at the boundaries of the Ordovician, Devonian, Permian, Triassic and Cretaceous periods.

Figure 4.5: Fossil record disappearances showing an ‘extinctions timeline’ at period endings. Source: https://courses.lumenlearning.com/wm-biology2/chapter/mass-extinctions/.

The first of the five mass extinctions came at the end of the Ordovician period, approximately 440 mya (dates rounded for simplicity), again probably from global cooling, possibly followed by warming. The next, about 375 mya, occurred toward the close of the Devonian period, probably from the rapid spread of plant life on land reducing atmospheric CO2 by photosynthesis. The third mass extinction, at the Permian–Triassic (P/Tr) boundary around 250 mya, was the worst, leading to major losses of both ocean life and plants, animals, and insects on land (see inter alia, Erwin, 1996, 2006).[8]
[8] Rampino and Shen (2019) present evidence that there was a further mass extinction approximately 260 mya, associated with the eruption of the Emeishan flood basalts in China, which corresponds to the ‘bump’ before the end-Permian great extinction in Figure 4.5.

Explanations include the formation of massive flood basalts from extensive and prolonged volcanic eruptions, called large igneous provinces (LIPs).[9] The LIP in Siberia forming at that time covered in excess of 2 million square kilometers, with global temperatures about 6 °C higher than now. Methane hydrates released from relatively shallow continental shelves by the formation of the LIP are a possible cause (see e.g., Heydari et al., 2008). Magma pouring into seas may also have disturbed deep ocean levels (see Ward, 2006). In particular, underwater volcanism can induce oxygen deficiency when LIPs disrupt the ocean conveyor belts (see Bralower, 2008). Extinctions from oceanic warming can become drastic if the chemocline between oxygenated water above and anoxic water below reaches the surface (see e.g., Riccardi et al., 2007). Then archaea and anaerobic bacteria, such as green-sulfur bacteria, proliferate and can generate vast quantities of hydrogen sulfide (H2S), which is almost as toxic as hydrogen cyanide. As with CO2, hydrogen sulfide is heavier than air, so can accumulate on the surface.[10] There is evidence that the chemocline rose during the end-Permian extinction, with a large increase in phototrophic sulfur bacteria replacing algae and cyanobacteria, consistent with a huge loss of ocean life.[11] Ocean circulation may also have slowed, or even stopped, from a lack of ice at the poles. While that initially affects marine life, CO2 dissolves more readily in cold water, and is released when water warms (as from an open glass of sparkling water): a massive overturn of cold oceanic water can release large quantities of CO2, warming the atmosphere.

[9] These can form layered hills looking a bit like stairs, called Traps.

[10] As noted above, recent behavior of the Black Sea shows how quickly a switch in the chemocline can lead to the anoxic layer reaching the surface (see Mee, 2006). The eruption of sulfur bacteria round China’s southern coast just before the 2008 Olympics is a more worrying example, although that too might have had similar local causes.

[11] H2S also attacks the ozone layer if driven to the upper atmosphere (possibly by volcanism), reducing protection from solar radiation.

The fourth extinction, at the end of the Triassic period roughly 200 mya, helped the dinosaurs to emerge as a dominant life form in the Jurassic (see e.g., Brusatte, 2018). The cause is possibly the formation of a massive LIP called the Central Atlantic Magmatic Province, covering over eleven million square kilometers. The mass extinction could have been due to extensive CO2 emissions from the dissociation of gas hydrates inducing intense global warming, as there is some evidence of a rise in atmospheric CO2 near the Triassic–Jurassic boundary, or alternatively from sulfur dioxide emissions leading to cooling, but doubt remains about the cause and mechanism.

The fifth and perhaps best known major extinction occurred approximately 60 mya at the Cretaceous–Tertiary (K/T) boundary (now called Cretaceous–Paleogene, K–Pg), when the family of dinosaurs called saurischia went extinct. This extinction could plausibly be attributed to a meteoric impact, matching traces of iridium found between rocks separating dinosaur from mammalian epochs, and the discovery of the Chicxulub crater near the Yucatan peninsula.
Even so, volcanism may also have played a role, as that time saw the formation of another LIP in India (the Deccan Traps), where a prolonged magma extrusion covered approximately 100,000 square kilometers to a depth of about 160 meters, and global temperatures were about 4 °C higher than currently (see e.g., Prothero, 2008, for an evaluation).[12]

[12] The meteor may have struck an underwater oil deposit, ejecting huge quantities of smoke and soot into the atmosphere; also, the impact could have played a role in the volcanism in India.

Ward (2006) shows that the extinction at the end of the Triassic began when atmospheric CO2 was just above 1,000 ppm, and that at the K–Pg boundary when CO2 was just under 1,000 ppm, both far above the level of under 300 ppm at the end of the last ice age, and just over 400 ppm now. The fossil record over the past 520 million years as presently known shows that terrestrial and marine biodiversity was related to sea-surface temperature, with biodiversity being relatively low during warm periods (see Clarke, 1993, and Mayhew et al., 2009). Climate change, manifested by excessive global warming or cooling, has been a cause in all the above large-scale species extinctions. Indeed, it is difficult to imagine any other mechanism that would simultaneously exterminate both land and sea life other than large shifts in global cooling or warming. The key is change: though not anthropogenic, both directions have led to major losses of species from the fossil record, although since life is abundant today, some species have clearly always managed to survive and evolve. However, a climate-change induced mass extinction could threaten the lives of millions of humans if species crucial to modern food chains were to vanish.

Since all the great extinctions seem due to global climate change, albeit from possibly different causes, and since greenhouse gases lead to temperature changes, what is the evidence for the accumulation of CO2 equivalents in the atmosphere? As discussed above, the records collected at Mauna Loa in Hawaii (see Keeling et al., 1976, and Sundquist and Keeling, 2009) show an unequivocal upward trend, with large seasonal variations around it. Figure 4.2 showed the recent increases in CO2 levels from the low 300 parts per million (ppm) to near 400 ppm since 1958. As a consequence, global mean surface temperatures have been rising, as seen in Figure 4.6(a), especially in the Arctic, where a feedback from ice melting lowers albedo and accelerates warming. Global ocean heat content to a depth of 700 m, shown in Figure 4.6(b), has been rising rapidly over 1957–2012.
Figure 4.6: (a) Global and Arctic mean surface temperature deviations in degrees K since 1880; (b) global ocean heat content to a depth of 700 m over 1957–2012.

The oceans currently contain approximately sixty times more carbon than the atmosphere, and from a geological perspective, some of that CO2 can be exchanged quite rapidly between the atmosphere and oceans. Moreover, although oceans can probably absorb more CO2 at present, that may have adverse effects for marine life (see Stone, 2007): acidification slows the growth of plankton and invertebrates, which are basic to the ocean food chain. Lower pH levels could prevent diatoms and coral reefs from forming their calcium carbonate shells (e.g., just from lowering the current pH level of 8.1 to a pH of 7.9). Moreover, while oceans rapidly absorb CO2 initially, much is evaporated straight back into the atmosphere (again think of sparkling water left unsealed), and while later recycled, it takes a long time before much is stored in deep ocean layers. So how has humanity created this potentially dire situation of adverse climate change? The next section addresses that development.

5 The Industrial Revolution and Its Consequences

The ‘Industrial Revolution’ began in the UK in the mid-18th century for reasons well explained by Allen (2009, 2017). While its antecedents lay several centuries earlier in the many scientific, technological and medical knowledge revolutions, the UK was the first country to industrialize on a large scale. The startling consequences of that step can be seen everywhere 250 years later: real income levels are 7–10 fold higher per capita, as shown in Figure 5.1, many killer diseases have been tamed, and longevity has approximately doubled. The evidence recorded in https://ourworldindata.org/economic-growth shows great increases in living standards in many countries, albeit these are far from evenly shared. Nevertheless, the Industrial Revolution and its succeeding developments have been of vast benefit to humanity, raising standards of living for billions of humans far above levels dreamt of by earlier generations: see the excellent discussion in Crafts (2003), who demonstrates that the average individual living in the UK today would be unwise to swap their life for that of even one of the richest people several centuries ago.

Figure 5.1: Increases in average real GDP per capita across major regions from 1870, adjusted for inflation (at 2011 prices) and for price differences between regions. Source: Maddison Project Database (2018), via https://ourworldindata.org/ (CC BY-SA).

Unfortunately, an unintended consequence of the Industrial Revolution was an explosion in anthropogenic CO2 emissions. This occurred because the main source of non-human and non-animal power at the time came from steam engines fired by coal, following improvements
to the earlier engine of Thomas Newcomen by the separate condenser invented by James Watt, as well as his enhancing its versatility to generate rotary power. Moreover, at the time, coal was relatively available in the UK, and quite cheap to mine, so although transporting such a heavy substance by land was expensive, sea routes were widely used, and from the opening of the Bridgewater Canal in 1771 greatly reducing costs, a boom in canal building occurred.[1]

[1] An invaluable spin-off of this boom was the wonderful 1819 geological map by William Smith of the rock strata of the UK: see e.g., Winchester (2001).

However, the transport situation was even more radically altered at the ‘Rainhill trials’ in 1829, won by George and Robert Stephensons’ locomotive Rocket, demonstrating the speed and power of steam-driven trains, a development that soon went global.[2] With improvements in steam engines, rail transport came to dominate for more than a century, and of course produced volumes of CO2 in the process. Oil consumption added considerably to CO2 emissions from the late 19th century with the invention of gasoline and diesel powered internal combustion engines for cars, and by replacing coal in ships as the heat source for steam engines. In the 20th century, air travel has further increased the demand for oil products, as have various chemical industries.

[2] See e.g., Fullerton et al. (2002). As argued in Hendry (2011), prizes for methods to reduce, store or extract CO2 from the atmosphere deserve serious consideration.

Nevertheless, the victory of coal and later oil was not guaranteed. Although electricity was known from ancient times as a shock that electric fish could deliver, and from ‘static electricity’ created by rubbing objects, the first understanding (and the English name) only came after Gilbert (1600). Following many discoveries through its links to lightning by Benjamin Franklin, the creation of batteries by Alessandro Volta, and Hans Oersted’s finding that an electric current produces a magnetic field, the key breakthrough was Michael Faraday’s electric motor in the early 1830s, linking electricity with a moving magnet and allowing electricity to be generated as needed (see Blundell, 2012, for an excellent introduction). The first electricity generator in the UK in 1868 was hydro driven, but from 1882 till recently, coal-fired steam-driven power stations produced most of the UK’s electricity, adding to the already large use of coal in household fires, industry and rail transport.

Not only was hydro-electric power available in the 1860s, the first commercial photovoltaic solar panel was developed by Charles Fritts in 1881, building on the creation by Edmond Becquerel in 1839 of the first photovoltaic cell, a device that converted the energy of light directly to electricity. However, it took till the mid 1950s for really viable solar cells to be created by Bell Labs. Moreover, wind power has been used sporadically for more than 2000 years, growing in use in Iran from the 7th century with windmills, that idea reaching Europe about 400 years later and leading to their widespread use to generate power to grind grain and pump water. The first wind turbine to generate electricity was built by James Blyth in 1887, and by the 1930s wind-generated electricity was relatively common on US farms.
Finally, electric cars also date back before the 1880s, but became a serious mode of transport when Thomas Parker built a vehicle with a high-capacity rechargeable battery. As they were quiet, comfortable, could travel fast for the time, and did not need gears, they were popular in the early 20th century, until internal combustion engines displaced them, given the latter's much greater travel range and lower cost after Henry Ford's mass-produced vehicles, again adding to greenhouse gas emissions (and many more noxious substances, including tiny particulates, nitrogen oxides and carbon monoxide). Thus, not only are there long historical precedents for renewable energy generation and electric vehicles, they were first available in the second half of the 19th century: had these developments come a century earlier and seen the concomitant efficiency improvements achieved recently, coal-fired electric power and petrol cars need not have happened, an issue we return to in §7.13.

5.1 Climate Does Not Change Uniformly Across the Planet

Such large increases as 100 ppm in atmospheric CO2, seen in Figure 4.2, have warmed the planet as shown in Figure 4.6. In addition to the faster Arctic warming,3 Figure 5.2 shows that temperature changes have varied between regions. As the tropics receive much more heat from the sun than the poles, that heat is distributed away from the equatorial regions towards the poles. In his excellent video, David Battisti explains that tropical cloud cover plays a key role in that process, and that using the average cloud cover in all the major climate systems greatly reduces the differences between their simulations of future climate.4 The Arctic and other northern hemisphere regions have warmed the most over the period shown, but many parts of the planet have seen little change, whereas some ocean and Antarctic areas have cooled.

Figure 5.2: Changes in global temperature 2014–2018. Source: NASA.

Thus, the key is climate change induced by the rising temperatures that are fuelled by additional greenhouse gas emissions from human behavior, primarily burning fossil fuels, especially coal and oil, as well as reduced CO2 take-up from deforestation. To summarize, we showed above that humanity can easily affect Earth's oceans and its atmosphere, and is doing so. Given the Earth's limited atmosphere, the share of CO2 has risen by more than 30% to over 400 ppm since 1860.

3 See e.g., https://www.msn.com/en-ca/weather/topstories/the-unexpected-link-between-the-ozone-hole-and-arctic-warming/ar-BB1058WF.
4 https://www.oxfordmartin.ox.ac.uk/videos/from-global-to-local-the-relationship-between-global-climate-and-regional-warming-with-prof-david-battisti/.
Greenhouse gases are transparent to incoming short-wave radiation from the sun, but reflect back some outgoing long-wave radiation, warming the atmosphere, land and the oceans, which respond to balance temperatures, also leading to sea-level rises. Over geological time, climate change has been responsible for the great extinctions, as life cannot adapt to some losses of habitat. Thus, understanding climate change is crucial to tackling its likely consequences. The large climate systems model how the Earth responds to changes in greenhouse gases, but the emissions thereof are a function of economic, social, technological, and political behavior. Empirical modeling is an essential addition, which Section 2 described, but as we also discussed, it can be prone to important difficulties. To illustrate how Climate Econometrics tackles these, we describe in detail econometric modeling of Ice Ages and past climate variability over the last 800,000 years in the next section, then UK annual CO2 emissions 1860–2017 in Section 7.

6 Identifying the Causal Role of CO2 in Ice Ages

While many contributions led to the discovery of massive past glaciation on land, that by Agassiz (1840), based on the contemporaneous movements of glaciers in his native Switzerland and using those to explain a number of previously puzzling features of the landscape in Scotland, was a major step forward in understanding the variability of past climate. Agassiz conceived of a 'Great Ice Age', an intense, global winter lasting ages, rather than multiple Ice Ages as now, but Geikie (1863) discovered plant fragments between different layers of glacial deposits, implying that sustained warm periods separated cold glacial periods in prehistory. The calculations by Croll (1875), using just the variations in the Earth's orbit, then gave a theoretical mechanism for how ice ages could occur and a time line, where the changing albedo of ice coverage helped explain the relative rapidity with which glacial periods switched, although he predicted that the last Ice Age was older than observed. Recently, Pistone et al. (2019) have shown that the complete disappearance of Arctic sea ice would be (in temperature terms) 'equivalent to the effect of one trillion tons of CO2 emissions' (roughly 140 ppm), because an open ocean surface typically absorbs approximately six times more solar radiation than a high-albedo surface covered with sea ice. Such an effect reducing ocean ice as the climate gradually warmed after the peak of glacial extent would accelerate melting, and conversely for cooling. Croll's research was later amplified by Milankovitch (1969) (originally 1941), who calculated solar radiation at different latitudes from changes in obliquity and precession of the Earth as well as eccentricity. Milankovitch also corrected Croll's assumption that minimum winter temperatures mattered, to show that cooler summer maxima were more important in leading to glaciation.

Even a century after Agassiz, there was limited evidence to support such ideas and the timings of glacial episodes. However, these general explanations have since been corroborated by many empirical observations of past oceanic and atmospheric climate changes: see e.g., Imbrie (1992). As we show below, an important reason for analyzing what may seem like the distant past is its relevance today. The climate then was little affected by the activities of the various human species on the planet, partly as they were too sparse and partly as they did not have the technology.
Consequently, any links between, say, CO2 and temperature above the forces from the orbital drivers (which of course are still operating) must have been natural ones, so they can help us understand their present interactions now that CO2 emissions are anthropogenic.

There are three main interacting orbital changes over time affecting incoming solar radiation (insolation) that could drive Ice Ages and inter-glacial periods. These are:

(a) a 100,000 year periodicity deriving from the non-circularity of the Earth's orbit round the Sun due to the gravitational influences of other planets in the solar system, where zero denotes circularity (eccentricity: Ec below);
(b) a 41,000 year periodicity coming from changes in the tilt of the Earth's rotational axis relative to the ecliptic, measured in degrees (obliquity: Ob);
(c) about 23,000 and 19,000 year periodicities due to the precession of the equinox, also measured in degrees, which changes the season at which the Earth's orbit is nearest to the Sun, resulting in part from the Earth not being an exact sphere (Pr).

These three are shown measured at 1000-year intervals in Figures 6.1(a), (b) and (c), together with summer-time insolation at 65° south (St) in Panel (d) (see Paillard et al., 1996). The X-axes in such graphs are labeled by the time before the present in 1000-year intervals, starting 800,000 years ago.

Figure 6.1: Ice-age orbital drivers: (a) eccentricity (Ec); (b) obliquity (Ob); (c) precession (Pr); (d) summer-time insolation at 65° south (St).

Ec and St show two major long-swings pre and post about −325 and, within each, a number of shorter 'cycles' of varying amplitudes, levels and durations. Ob appears to have increased in amplitude since the start of the sample, whereas it is difficult to discern changes in the patterns of Pr. The orbital series are strongly exogenous, and most seem non-stationary from shifting distributions, not unit roots.

Orbital variations are not the only forces that affect glaciation. The Earth's energy balance is determined by incoming and outgoing radiation: for a cointegrated econometric model thereof, see Pretis (2019). The role of St is to summarize changes in the former, but an exogenous summary measure of outgoing radiation is not clear, as changes that also affect climate include:

(i) variations in the Sun's radiation output (radiative forcing);
(ii) atmospheric water vapor and greenhouse gases (e.g., CO2, N2O, CH4);
(iii) volcanic eruption particulates in the atmosphere;
(iv) albedo from alterations in ice cover, including from volcanic dust;
(v) iron in wind-blown dust enabling Southern Ocean storage of CO2;1
(vi) ocean temperatures (which lag behind land);
(vii) sea levels and induced ocean circulation patterns;
(viii) cloud cover and its distribution in location and season;
(ix) changes in the magnetic poles.

Of these, (i), (iii) and (ix) seem strongly exogenous, as do volcanic contributions to (iv), whereas (ii), the rest of (iv) and (v)–(viii) must be endogenously determined within the global climate system by the strongly exogenous drivers.

1 See Buchanan et al. (2019).
However, anthropogenic greenhouse gas emissions are now 'exogenously' changing atmospheric composition: see Richard (1980) for an analysis of modeling changes in a variable's status as endogenous or exogenous, which here would just affect the last few (1000 year) observations.

That the distance from the Sun matters seems rather natural, as such variations change radiative forcing and hence global temperatures. However, the variations due purely to the eccentricity of the orbit are small. Obliquity also must matter: if the Northern Hemisphere directly faced the Sun, ice would usually be absent there; and if it never faced the Sun, it would generally be frozen. Precession seems the smallest driving force of these, but interactions may be important: when the Earth is furthest from the Sun and tilts away in the Northern Hemisphere summer, that may cool faster: see Paillard (2010) for an excellent discussion of these interactions. Even so, a problem with the theory that 'purely orbital' variations drove ice ages over the last 800,000 years is that the known orbital variations should not result in sufficiently large changes in radiative forcing on the Earth to cause the rapid arrival, and especially the rapid ending, of glacial periods: see Paillard (2001). Although St could provide some additional explanation, and in particular seems to help capture changes at peaks and troughs, we decided to only use the strongly exogenous orbital drivers. An equation regressing St just on these and its first lag produced R2 = 0.988, so we leave to the reader the exercise of building a model with St included in the list of variables.2

There are several possible reasons for 'rapid' changes in the climate system, remembering that the observation frequency is 1000 years. The extent of Southern Ocean sea ice can substantively alter ocean ventilation by reducing the atmospheric exposure time of surface waters and by decreasing the vertical mixing of deep ocean waters, which Stein et al. (2020) show can lead to 40 ppm changes in atmospheric CO2. Another explanation is the presence of non-linear feedbacks or interactions between the drivers. Thus, Figure 6.2 shows their interactions in Panels (a) [Ec×Ob], (b) [Ec×Pr], (c) [Ob×Pr], (d) [Ec×St], (e) [Pr×St] and (f) [Ob×St], although the model developed here includes only the first three interactions, together with the squares, to capture non-linear influences.

Figure 6.2: Ice-age orbital driver interactions: (a) EcOb; (b) EcPr; (c) ObPr; (d) EcSt; (e) PrSt; (f) ObSt.

Explaining glaciation over the Ice Ages has garnered a huge literature, only a small fraction of which is cited here. The possibility of the Northern Hemisphere facing another Ice Age was still considered in the 1950s, as the following quote illustrates:

    We do not yet know whether the latest turn in our climatic fortunes, since the optimum years of the 1930s, marks the beginning of a serious downward trend or whether it is merely another wobble. . . (Lamb, 1959)

but by 1982, Lamb (1995) emphasized global warming as the more serious threat to climate stability.

The remainder of the section is as follows. §6.1 describes the data series over the past 800,000 years, then §6.2 models ice volume, CO2 and temperature as jointly endogenous in a 3-variable system as a function of variations in the Earth's orbit. The general model is formulated in §6.2.1, and the simultaneous system estimates are discussed in §6.2.2. Their long-run implications are described in §6.3, with one hundred 1000-year 1-step and dynamic forecasts in §6.3.1. Then §6.3.2 considers when humanity might have begun to influence climate, and discusses the potential exogeneity of CO2 to identify its role during Ice Ages. §6.4 looks 100,000 years into the future using the calculable eccentricity, obliquity and precession of Earth's orbital path, to explore the implications for the planet's temperature of atmospheric CO2 being determined by humans at levels far above those experienced during Ice Ages. Finally, §6.5 summarizes the conclusions on Ice-Age modeling.

2 For more comprehensive systems that endogenously model measures for all the variables in (iii)–(vii), see Kaufmann and Juselius (2013) and Pretis and Kaufmann (2018). We are also grateful to those authors for providing the data series analyzed here.
6.1 Data Series Over the Past 800,000 Years

A vast international effort over many decades has been devoted to measuring the behavior of a number of variables over the Ice Ages. Naturally, proxies, or indirect but closely associated observables that remain in the ground, ice, oceans and ocean floors, are used, based on well-established physical and chemical knowledge. Econometricians are essentially mere end users of this impressive research base.

Antarctic-based land surface temperature proxies (denoted Temp below) were taken from Jouzel et al. (2007). The paleo record from deep ice cores shows that atmospheric CO2 varied between 170 ppm and 300 ppm over the Ice Ages, where 1 ppm = 7.8 gigatonnes of CO2 (see Lüthi et al., 2008). Ice volume estimates (denoted Ice below) were from Lisiecki and Raymo (2005) (based on δ18O as a proxy measure). To capture orbital variations, Ec, Ob and Pr and their interactions are conditioned on. All observations had been adjusted to the common EDC3 time scale and linearly interpolated for missing observations to bring all observations onto a 1000 year time interval (EDC3 denotes the European Project for Ice Coring in Antarctica (EPICA) Dome C, where drilling in East Antarctica has been completed to a depth of 3260 meters, just a few meters above bedrock: see Parrenin et al., 2007). Synchronization between the EPICA Dome C and Vostok ice core measures over the period −145,000 to the present was based on matching residues from volcanic eruptions (see Parrenin et al., 2012). The total sample size in 1000 year intervals is T = 801, with the last 100 observations (i.e., 100,000 years, ending 1000 years before the present) used to evaluate the predictive ability of the estimated system. Figure 6.3 records a shorter sample of sea level data.3

Figure 6.3: Ice-age time series: (a) ice volume (Ice); (b) atmospheric CO2 in parts per million (CO2); (c) temperature (Temp); (d) shorter-sample sea level changes in meters.

3 Sea surface temperature data are available from Martinez-Garcia et al. (2009), which could help explain oceanic CO2 uptake and interactions with land surface temperature. Sea level data, based on sediments, can be obtained from Siddall et al. (2003), over a shorter sample, but are not analyzed here.

We focus on modeling Ice, CO2 and Temp as jointly endogenous functions of the orbital variables, which we take to be strongly exogenous, so feedbacks onto their values from Earth's climate are negligible. The patterns of these time series are remarkably similar, all rising (or falling) at roughly the same times.
Figure 6.4 emphasizes how close these movements are by plotting pairs of time series: (a) CO2 and the negative of ice volume (denoted IceNeg); (b) CO2 and Temp; (c) Temp and IceNeg; (d) IceNeg and sea level (only for the last 465,000 years).

Figure 6.4: (a) CO2 and the negative of ice volume (IceNeg); (b) CO2 and temperature; (c) temperature and IceNeg; (d) IceNeg and sea level (only for the last 465,000 years).

Atmospheric CO2 levels closely track the negative of ice volume, the temperature record and sea level, as do other pairs. If ice ages are due to orbital variations, why should atmospheric CO2 levels also correlate so closely with ice volume? Lea (2004) relates changes in tropical sea surface temperature to atmospheric CO2 levels over the last 360,000 years to suggest that CO2 was the main determinant of tropical climate. Conversely, in https://climateaudit.org/2005/12/18/gcms-and-ice-ages/, Stephen McIntyre argues that CO2 should not be treated as a forcing variable in statistical models of ice-age climate, as it is an endogenous response. So is the mechanism not orbital variations, but instead that changes in atmospheric CO2 levels alter global temperatures, which in turn drive changes in ice volume? The answer lies in the deep oceans, in particular the Southern Ocean, which acts as a carbon sink during cold periods and releases some of that CO2 as the planet warms, in turn enhancing cooling and warming: see e.g., Jaccard et al. (2016). Thus, the exogenous orbital variations drive temperature, which drives changes in ice volume and in turn CO2 levels. By modeling the 3-variable simultaneous-equations system estimated using full information maximum likelihood (FIML: see e.g., Hendry, 1976), treating all three as endogenous, the roles of Temp and CO2 as endogenous determinants of Ice can be investigated. The approach used here is described in §2.8.

In addition to the many dozens of climatology-based studies, there are several econometric analyses of ice-age data, examining issues of cointegration and the adequacy of using orbital variables as the exogenous explanatory regressors.
Kaufmann and Juselius (2010, 2013) analyze the late Quaternary 'Vostok' period of four 'glacial cycles', and Pretis and Kaufmann (2018) build and simulate a statistical climate model over the paleo-climate record of the 800,000 years of data investigated here. We now turn to system modeling of our three variables of interest.

6.2 System Equation Modeling of the Ice-Age Data

Our focus is on modeling Ice allowing for the endogeneity of Temp and CO2, with dynamic feedbacks, non-linear impacts of the orbital variables, and handling of outliers. Consequently, the initial GUM is a VARX(1) for y_t = (Ice, CO2, Temp)_t conditional on the nine orbital measures and non-linear functions thereof, where:

$$z_t' = (Ec \;\; Ob \;\; Pr \;\; EcOb \;\; EcPr \;\; PrOb \;\; Ec^2 \;\; Ob^2 \;\; Pr^2)_t \qquad (6.1)$$

with a one-period lag (i.e., 1000 years earlier) on all variables. The lagged values are to capture dynamic inertia: when the ice covers a vast area, that will influence the ice sheet in the next period, even when periods are 1000 years apart. Moreover, that observation length is just 1% of the eccentricity periodicity, so the Earth will still be close to its previous position.4 System IIS at 0.1% was implemented with all the continuous variables retained; then, after locating outliers, the regressor variables were selected at 1% to create a parsimonious VARX(1), denoted PVARX(1). Next, that system was transformed to a simultaneous-equations model of the PVARX(1), where only variables and outliers that were relevant in each equation were included, and finally contemporaneous links were investigated. Only retaining variables that are significant in the PVARX(1) avoids 'spurious identification' from using completely irrelevant variables that are then excluded differently in each equation to apparently achieve the order condition.

4 Residual autocorrelation suggests that a second lag or longer may also matter, despite such variables being at least 2000 years earlier.

6.2.1 The General Unrestricted Model (GUM)

The GUM in this setting is a dynamic system with strongly exogenous regressors, which can be written as:

$$y_t = \gamma_0 + \Gamma_1 y_{t-1} + \Gamma_2 z_t + \Gamma_3 z_{t-1} + \Psi d_t + \varepsilon_t, \qquad (6.2)$$

where d_t denotes a vector of impulse indicators selected by system IIS. The difference from single-equation IIS described above is that indicators have to be significant at the target nominal significance level in the system, not just in any one equation therein.

First, all the y_{t−1}, z_t and z_{t−1} in (6.2) are retained without selection when IIS is applied at α = 0.001 for T = 697, keeping the last hundred observations for out-of-sample forecast evaluation. This led to 35 impulse indicators being selected, the earliest of which was 1{−339}. However, many of these were retained to avoid a failure of encompassing the first feasible GUM, and were not significant at α = 0.001. Table 6.1 records the correlations between the actual observations and the fitted values taking impulse indicators into account, so each variable can be explained in large measure by a model of the form (6.2). Table 6.2 shows the correlations between the residuals of the three equations, with residual standard deviations on the diagonal. There remains a high correlation between the CO2 and Temp residuals even conditional on all the orbital variables, but not between those of Ice and either CO2 or Temp, although those correlations remain negative. Next, the other regressors were selected at 1%, resulting in a PVARX(1).
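To make the mechanics of (6.1)–(6.2) more concrete, the following sketch builds the GUM regressors and runs a crude split-half version of impulse-indicator saturation for a single equation. It is only an illustration under assumed column names (Ice, CO2, Temp, Ec, Ob, Pr, indexed by 1000-year periods): the analysis in the text uses system IIS and multi-path selection in Autometrics/PcGive, which this simple Python sketch does not replicate.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def build_gum(df):
    """Regressors of the VARX(1) GUM (6.2): lagged y's, the orbital terms of
    (6.1) including interactions and squares, and their one-period lags."""
    z = pd.DataFrame({
        "Ec": df["Ec"], "Ob": df["Ob"], "Pr": df["Pr"],
        "EcOb": df["Ec"] * df["Ob"], "EcPr": df["Ec"] * df["Pr"],
        "PrOb": df["Pr"] * df["Ob"],
        "Ec2": df["Ec"] ** 2, "Ob2": df["Ob"] ** 2, "Pr2": df["Pr"] ** 2,
    })
    X = pd.concat([df[["Ice", "CO2", "Temp"]].shift(1).add_suffix("_1"),
                   z, z.shift(1).add_suffix("_1")], axis=1)
    return sm.add_constant(X).dropna()

def split_half_iis(y, X, alpha=0.001):
    """Crude split-half IIS for ONE equation: saturate each half of the
    sample in turn, keep indicators significant at alpha, then re-test the
    union jointly (system IIS in Autometrics is far more sophisticated)."""
    T = len(y)
    D_all = pd.DataFrame(np.eye(T), index=X.index,
                         columns=[f"I_{i}" for i in X.index])
    retained = []
    for cols in (D_all.columns[: T // 2], D_all.columns[T // 2:]):
        fit = sm.OLS(y, pd.concat([X, D_all[cols]], axis=1)).fit()
        retained += [c for c in cols if fit.pvalues[c] < alpha]
    final = sm.OLS(y, pd.concat([X, D_all[retained]], axis=1)).fit()
    return [c for c in retained if final.pvalues[c] < alpha], final

# Hypothetical usage, given a DataFrame 'df' of the proxy and orbital series:
# X = build_gum(df); y = df.loc[X.index, "Ice"]
# indicators, fit = split_half_iis(y, X)
```

The point of the sketch is only the structure of the design matrix and the idea of saturating by blocks of indicators; selection of the remaining regressors at 1%, and the requirement that indicators be significant at the system level, are handled jointly by Autometrics in the analysis reported here.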
Again, note that selection decisions are at the level of the system rather than individual equations. Finally, to avoid the spurious identification issue from indicators that were insignificant in the system, any that were also insignificant in every equation were manually deleted from the system, still leaving 32.

Table 6.1: Correlations between actual and fitted values in the VARX(1)

            Ice      CO2      Temp
            0.981    0.981    0.972

Table 6.2: Correlations between VARX(1) residuals, with standard deviations on the diagonal

            Ice       CO2      Temp
    Ice     0.090     –        –
    CO2    −0.179     5.13     –
    Temp   −0.180     0.574    0.711

6.2.2 The Simultaneous System Estimates

Because many of the exogenous and lagged variables and impulse indicators were only significant in one equation, we reformulated the system as a simultaneous-equations model. This treats all three modeled variables as endogenous and was estimated by FIML. We then manually eliminated insignificant regressors in each equation in turn. The current-dated values of Temp and CO2 in the Ice equation and of Temp in the CO2 equation were insignificant, but that of CO2 was significant in the equation for Temp. This delivered the system model in (6.3)–(6.5):

$$\widehat{Ice}_t = \underset{(0.34)}{1.43} + \underset{(0.015)}{0.860}\,Ice_{t-1} - \underset{(0.002)}{0.020}\,Temp_{t-1} + \underset{(31)}{102}\,Ec_t - \underset{(32)}{101}\,Ec_{t-1} - \underset{(0.014)}{0.040}\,Ob_{t-1} - \underset{(1.30)}{5.07}\,EcOb_t + \underset{(1.36)}{5.05}\,EcOb_{t-1} - \underset{(1.03)}{4.97}\,EcPr_t \qquad (6.3)$$

$$\widehat{CO}_{2,t} = \underset{(32)}{218} + \underset{(0.018)}{0.853}\,CO_{2,t-1} + \underset{(0.18)}{1.34}\,Temp_{t-1} + \underset{(342)}{1400}\,Ec_t - \underset{(647)}{3070}\,Ec_{t-1} - \underset{(2.31)}{13.0}\,Ob_{t-1} + \underset{(23)}{70.7}\,EcOb_{t-1} + \underset{(0.047)}{0.232}\,Ob_t^2 \qquad (6.4)$$

$$\widehat{Temp}_t = -\underset{(0.69)}{2.49} + \underset{(0.023)}{0.879}\,Temp_{t-1} + \underset{(0.0026)}{0.0080}\,CO_{2,t} - \underset{(37)}{301}\,Ec_t + \underset{(2.45)}{22.6}\,EcOb_t - \underset{(1.94)}{9.80}\,EcOb_{t-1} + \underset{(7.1)}{25.5}\,EcPr_t \qquad (6.5)$$

The correlations between the actual and fitted values for the three variables in the SEM are virtually identical to those in Table 6.1, consistent with the likelihood-ratio test of the over-identifying restrictions against the PVARX(1) being χ²_OR(64) = 69.7, which is insignificant at even the 5% level. Although the inertial dynamics play a key role in the three equations, all the eigenvalues of the system dynamics are less than unity in absolute value, at (0.97, 0.86, 0.77). The test for excluding all the non-linear functions yields χ²(8) = 155**, and that for dropping all the impulse indicators χ²(44) = 439**, both of which reject at any viable significance level.5

5 The model recording retained impulse indicators is available on request from the authors.
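The stability claim can be checked directly from the point estimates reported in (6.3)–(6.5). Writing the structural form as B y_t = Γ1 y_{t−1} + (exogenous terms), where B collects the contemporaneous effects (only CO2,t enters the Temp equation) and Γ1 the lag coefficients, the reduced-form dynamics are B⁻¹Γ1, whose eigenvalues govern the inertia. A minimal numpy sketch, using only the coefficients above and ignoring the exogenous terms (which do not affect these eigenvalues):

```python
import numpy as np

# Ordering y_t = (Ice, CO2, Temp)'.
B = np.array([[1.0,  0.0,    0.0],
              [0.0,  1.0,    0.0],
              [0.0, -0.0080, 1.0]])    # Temp_t - 0.0080*CO2_t on the left
G1 = np.array([[0.860, 0.0,  -0.020],  # Ice equation lags
               [0.0,   0.853, 1.34],   # CO2 equation lags
               [0.0,   0.0,   0.879]]) # Temp equation lags
A = np.linalg.solve(B, G1)             # reduced-form dynamics B^{-1} G1
print(np.sort(np.abs(np.linalg.eigvals(A)))[::-1])
# -> approximately [0.97, 0.86, 0.77], matching the values quoted above
```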
Figure 6.5 records the actual and fitted values, and the residuals scaled by their standard deviations, for the three equations. The tracking is very close, including over the final 100 'out-of-sample' observations, although the residuals show the occasional outlier: remember that IIS selection was at 0.1% to avoid overfitting.

Figure 6.5: Actual, fitted and forecast values, with scaled residuals and forecast errors: (a) and (b) for Ice from (6.3); (c) and (d) for CO2 from (6.4); (e) and (f) for Temp from (6.5). The vertical bar at T = −100 marks the start of the forecast period.

Figure 6.6 reports residual densities with a Normal matched by mean and variance, and correlograms. The densities are relatively close to the Normal for Ice and Temp after IIS, but less so for CO2, probably because the restriction to one lag has left some residual autocorrelation.

Figure 6.6: Residual densities and correlograms: (a) and (b) for Ice from (6.3); (c) and (d) for CO2 from (6.4); (e) and (f) for Temp from (6.5).

Most of the formal mis-specification tests rejected, possibly also reflecting the many omitted influences noted above, although most of those seem to be endogenous responses as the climate changed, such as dust from wind storms and sea level changes both varying with temperature. In the present context, outliers as represented by impulse indicators could derive from measurement errors in the variables, or from super-volcanoes either dramatically lowering temperature by erupted particulates or raising it by emitting large volumes of CO2, or, like wind-blown dust, changing the albedo of ice sheets. Most indicators retained for Ice were negative, around −0.2, whereas for CO2 they were primarily positive and around +15, and for Temp around 2 but mixed in sign. As outliers are relative to the model being estimated, those found here could also represent variables omitted from the system in (6.3)–(6.5). Table 6.3 records the correlations between the residuals of the simultaneous-equations model, with the residual standard deviations on the diagonal: these are close to those in Table 6.2.

Table 6.3: Correlations of the simultaneous model residuals, with standard deviations on the diagonal

            Ice       CO2      Temp
    Ice     0.086     –        –
    CO2    −0.173     4.88     –
    Temp   −0.184     0.509    0.667

Considering the equations in more detail, the volume of ice in (6.3) depends on its previous level and on previous temperatures, as well as on eccentricity, past obliquity, current and lagged interactions of eccentricity with obliquity, and its current interaction with precession. Although Ec and EcOb appear to enter primarily as changes, the solved long-run outcome in Table 6.4 confirms they both also enter significantly as levels. CO2 has a similar coefficient on its lag, a positive feedback from past temperature, current and past levels of eccentricity, past obliquity and their interaction, and squared obliquity.
Temp responds to its previous value and positively to current CO2: its coefficient in (6.5) entails that a 100 ppm increase (as seen since 1958) would raise temperatures by 0.8°C ceteris paribus. Neither current CO2 nor Temp is significant in the equation for Ice, and current Temp is insignificant if added to that for CO2.

Table 6.4: Long-run solutions as a function of the relevant strongly exogenous orbital variables, where CO2 has been divided by 100 and Temp by 10 to align numerical coefficient values

              1        Ec       Ob       EcOb      EcPr      Ob²
    Ice      −17.3     1162     1.80     −49.3     −111      −0.037
    SE       (16.3)    (402)    (1.2)    (17)      (31)      (0.020)
    CO2       32.0     −837    −2.19      35.6      47.2      0.039
    SE       (11.9)    (276)   (0.87)    (11.7)    (19.1)    (0.016)
    Temp      19.0     −798    −1.44      34.0      52.0      0.026
    SE       (10.8)    (257)   (0.80)    (10.9)    (19.5)    (0.014)

6.3 Long-Run Implications

Table 6.4 solves out the dynamics and lags to express each endogenous variable as a function of the relevant orbital variables. The original coefficients are not easy to interpret as they depend on the units of measurement of the orbital variables, so CO2 has been divided by 100 and Temp by 10 to align numerical coefficient values. Figure 6.7 graphs the computed time series of Ice, CO2 and Temp from the long-run relationships in Table 6.4. These graphs include the last 100,000 years before the present, which are outside the estimation sample. Despite the different coefficients in the three long-run equations, the resulting time series in Figure 6.7 are relatively similar, and the correlations between them all exceed |0.977|. These graphs are just recombinations of the orbital drivers weighted by the coefficients in Table 6.4, so they reflect the relatively volatile and quiescent periods seen in Figure 6.1. The increase in volatility from about 250,000 years ago is marked, though the inertial dynamics from the lagged dependent variables smooths that over time, as seen in Figure 6.5.

Figure 6.7: Computed time series of Ice, CO2 and Temp from the relationships in Table 6.4.

6.3.1 1-Step and Long-Run Forecasts

Figure 6.8 records the hundred 1-step ahead forecasts with ±2SE error bands based on coefficient estimation variances as well as the residual variances. The second column shows the resulting forecast errors (unscaled). Table 6.5 reports their RMSFEs, which are close to the in-sample residual standard deviations after IIS, σ̂ (from Table 6.3), shown in the following row for comparison, together with those without IIS.

Figure 6.8: A hundred 1-step ahead forecasts at 1000-year measures with forecast intervals at ±2SE shown by error bands: (a) for Ice from (6.3); (c) for CO2 from (6.4); (e) for Temp from (6.5). (b), (d) and (f) report the associated forecast errors.
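For reference, the RMSFE entries in Table 6.5 are simply the square roots of the mean squared 1-step forecast errors over the 100 hold-back periods. A minimal sketch, with hypothetical arrays of outcomes and 1-step forecasts:

```python
import numpy as np

def rmsfe(actual, forecast):
    """Root mean square 1-step forecast error, as reported in Table 6.5."""
    e = np.asarray(actual, dtype=float) - np.asarray(forecast, dtype=float)
    return float(np.sqrt(np.mean(e ** 2)))

# Hypothetical usage: y_out holds the last 100 observations of one variable
# (say Ice) and f_out the corresponding 1-step forecasts from (6.3).
# Comparing rmsfe(y_out, f_out) with the in-sample residual standard
# deviation (0.086 for Ice in Table 6.3) reproduces the comparison in the text.
```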
The model for Ice provides a better description of the last 100 observations than of the earlier sample, even though the in-sample residual standard deviations were calculated after outliers were removed by IIS. The forecast intervals in Figure 6.8 could be adjusted for the likely presence of outliers in the future, at roughly their rate of occurrence in the past, by calculating the in-sample residual standard deviations excluding impulse indicators, as those after IIS understate the future uncertainty to some extent. The last row in Table 6.5 reports those 'no-IIS' σ̃ values.

Table 6.5: 1-step ahead root mean square forecast errors; in-sample model residual standard deviations after IIS; and model residual standard deviations without IIS

                    Ice      CO2      Temp
    RMSFE           0.084    5.35     0.958
    σ̂ (IIS)         0.086    4.88     0.667
    σ̃ (no IIS)      0.091    5.38     0.748

The removal of outliers has not greatly improved the in-sample fit, and omitting impulse indicators would only increase the reported forecast intervals by about 10%. The table confirms that, surprisingly, Ice is forecast better over the last 100 periods than it is fitted in-sample with IIS, whereas the CO2 and Temp forecasts are worse. Also, the RMSFE for Ice is smaller than the in-sample fitted σ̃ without IIS, that for CO2 is similar, whereas again the Temp forecasts are worse. Looking back at Figure 6.5, the forecast errors for Ice seem less variable than the in-sample residuals on 'ocular' econometrics, less so those for CO2, whereas those for Temp look somewhat more volatile.

Figure 6.9 shows the 100 multi-period ahead forecasts with ±2.2SE error bands to reflect the absence of indicators. These error bands assume that the coefficients in the model remain constant, and that no new forces intervene. With the Industrial Revolution, an additional driver of CO2, and hence of temperature, was human fossil fuel emissions, so extending forecasts to that era with an unchanged model is likely to reveal failure. The first 60 periods are tracked quite well, but the forecasts miss the changes around 20,000 years ago, and all three sets of forecasts either cross or are close to an error band by the end. Compared to earlier changes seen in Figure 6.5, the profiles of Ice and CO2 are similar over the last two cyclical periods, although the last cycle persisted for longer.

Figure 6.9: A hundred dynamic forecasts with ±2.2SE error bands: (a) for Ice from (6.3); (b) for CO2 from (6.4); (c) for Temp from (6.5).
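The widening of the dynamic-forecast bands in Figure 6.9 follows from accumulating the error variance through the dynamics: at horizon h the forecast-error variance of a first-order system is approximately Σ + AΣA′ + ... + A^{h−1}Σ(A^{h−1})′ when parameters are treated as known. A rough numerical sketch, reusing the reduced-form dynamics A from the earlier eigenvalue check and rebuilding an error covariance Σ from the standard deviations and correlations in Table 6.3 (so an approximation that ignores parameter-estimation uncertainty and the exogenous orbital terms):

```python
import numpy as np

# Residual standard deviations and correlations from Table 6.3 (Ice, CO2, Temp)
sd = np.array([0.086, 4.88, 0.667])
corr = np.array([[ 1.0,   -0.173, -0.184],
                 [-0.173,  1.0,    0.509],
                 [-0.184,  0.509,  1.0]])
Sigma = np.outer(sd, sd) * corr

# Reduced-form dynamics A = B^{-1} G1, as in the eigenvalue sketch above
B = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, -0.008, 1.0]])
G1 = np.array([[0.860, 0.0, -0.020], [0.0, 0.853, 1.34], [0.0, 0.0, 0.879]])
A = np.linalg.solve(B, G1)

def multi_step_se(A, Sigma, h):
    """Approximate h-step forecast-error standard errors for a VAR(1)."""
    V = np.zeros_like(Sigma)
    Apow = np.eye(A.shape[0])
    for _ in range(h):
        V += Apow @ Sigma @ Apow.T
        Apow = A @ Apow
    return np.sqrt(np.diag(V))

for h in (1, 10, 100):
    print(h, np.round(multi_step_se(A, Sigma, h), 3))
# The 1-step values reproduce the diagonal of Table 6.3; the bands then
# widen with h towards the unconditional variability, as in Figure 6.9.
```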
6.3.2 When Did Humanity First Influence the Climate?

Ruddiman (2005) suggested that humanity began to influence the climate around the time of domesticating animals and starting farming, so we 'zoom in' on the last 10,000 years and re-estimate the system up to −10. The estimates are not much changed, with χ²_OR(64) = 67.0, although the contemporaneous coefficient of CO2 on Temp has increased to unity.

Figure 6.10 records the multi-step forecasts over −10 to −1 for Ice, CO2 and Temp. All the forecasts lie within their ±2SE error bands, but they, and the fitted values for most of the previous 10,000 years, are systematically over for Ice and under for CO2 and Temp. This matches the dynamic forecasts in Figure 6.9, and could reflect model mis-specification, or the slowly growing divergence that might derive from the increasing influence of humanity envisaged by Ruddiman (2005). Using the presence of proto-weeds that needed ground disturbance to grow in new areas, Snir et al. (2015) provide evidence of the origins of cultivation long before Neolithic farming, dating such events to around 23,000 years ago.

Figure 6.10: Multi-step forecasts over −10 to −1 of: (a) Ice; (b) CO2; (c) Temp.

Having estimated the system up to 10,000 years ago, we changed the status of CO2 to unmodeled and re-estimated the two-equation model for Ice and Temp conditional on CO2. Neither fit was much improved, with σ̂_Ice = 0.085 and σ̂_Temp = 0.688, but now contemporaneous CO2 is highly significant in the equation for Ice, with t = −3.37**, and its coefficient in the equation for Temp has more than doubled to 0.024, which seems implausibly large with t = 10.4**. Also, χ²_OR(43) = 198** strongly rejects. Thus, the evidence here favors CO2 over the Ice Ages being an endogenous response to the orbital drivers jointly with Ice and Temp.

6.4 Looking Ahead

The eccentricity of Earth's orbital path is calculable far into the future, as are its obliquity and precession, so we extended the data set for 100,000 years into the future: Figure 6.11 records these, where the dark vertical lines denote the present (see https://biocycle.atmos.colostate.edu/shiny/Milankovitch).

Figure 6.11: (a) Eccentricity; (b) Obliquity; (c) Precession; (d) Summer-time insolation at 65° south, all over −800 to +100.

The recent and future eccentricity is relatively quiescent compared to past values. Extending the data allows us both to forecast over that horizon and to simulate the potential comparative climate should anthropogenically determined atmospheric CO2 levels settle at (say) 400 ppm.6

6 We are grateful to Bingchen Wang for his excellent research assistance in data collection and curation. Any sequence of CO2 values could be investigated: 400 ppm is used as close to current levels.

First, §6.3.2 suggested that humanity had been affecting climate since 10,000 years ago, so we commence the multi-step forecasts from that date. Consequently, the first 10 forecasts are almost the same as those in Figure 6.10, but are based on a system that does not include any impulse indicators. Figure 6.12 shows these forecasts over −10 to 100.
The figure records the plot of the earlier time series back to 400 kya (thousand years ago) to emphasize the comparison with the future period.

Figure 6.12: 110 dynamic forecasts with ±2SE error bands: (a) for Ice from (6.3); (b) for CO2 from (6.4); (c) for Temp from (6.5).

Given the relatively quiescent orbitals, these distant forecasts suggest a path well within the range of past data. Matching Pretis and Kaufmann (2020), we also find the next glacial maximum occurs in about 20,000 years. However, we know that the current value of CO2 (i.e., at time 0 on the graphs) is already greater than 400 ppm, which is a value dramatically outside the ice-age range, so the location shift in CO2 values has caused forecast failure in a model that treats CO2 as remaining endogenously determined by natural factors.

Handling the impacts on Ice and Temp of a permanent jump from the highest ice-age value of CO2 of around 300 ppm to 400 ppm or higher in recent times requires care. Arrhenius (1896) showed that the 'greenhouse' temperature response is proportional to the logarithm of CO2. Our model is linear in CO2, which does not matter over the ice-age period, where log(CO2) and CO2 are correlated 0.99, but does matter later. Panel (a) in Figure 6.13 plots log(CO2) and CO2, matched to have the same means and ranges, showing the very close match across all time periods once the future value representing 400 ppm is set at CO2 = 385 ppm against log(400); Panel (b) shows the equivalent match for CO2 = 440 ppm against log(560), which is a doubling in CO2 since the last in-sample observation. Thus, to establish the temperature and ice responses in the system, we set CO2 to 385 ppm and 440 ppm to mimic the climatic effects of the anthropogenically exogenous values of 400 ppm and 560 ppm given the log-linear relation. Of course, the usual orbital drivers still operate, so they will continue to influence all three variables in addition to human outputs, but that is switched off for CO2, so the constant values are the net outcome.

Figure 6.13: (a) and (b) CO2 and log(CO2) matched by means and ranges for scenarios of 400 ppm and 560 ppm respectively; 110 dynamic forecasts conditional on CO2 = 385 ppm and 440 ppm with ±2SE error bands (c) for Ice; (d) for Temp.
To validly exogenize CO2, as the long-run relations between the variables and the orbital drivers should be unaffected by humanity's intervention in CO2 production, we fix the values of the parameters at those in Equations (6.3) and (6.5), then assign an exogenous status to CO2. However, as the forecast period starts 90,000 years later than the estimated model, re-selection by IIS would be needed to include that sample, followed by re-estimation. For comparability with the trajectories in Figure 6.12, we omitted all impulse indicators, which should have a negligible effect on these dynamic forecasts.

The outcomes are shown in Figure 6.13, Panels (c) for Ice and (d) for Temp. Even at 400 ppm, ice volume in Panel (c) falls well below any previous values to a minimum around 75,000 years in the future, then increases somewhat from the increased eccentricity. The upper bound of the estimated uncertainty then lies below the lowest ice-age values. Increasing CO2 to mimic a doubling simply magnifies such effects, leading to an almost ice-free Antarctic. Panel (d) for Temp has a line at the peak ice-age temperature, showing that future values will be greater than that for 400 ppm, at a maximum of almost 3°C more. Because the observation frequency is 1000 years, and the dynamic reactions occur with those lags given the slowly evolving orbitals, the impacts of the jump in CO2 take a long time to work through. Nevertheless, for almost 100,000 years, the Antarctic temperature would be above zero (shown by the thin dashed line). Increasing CO2 to 560 ppm raises temperatures dramatically, rising to 13°C above the ice-age peak.

Table 6.6 records the resulting long-run relationships between Ice and Temp and their determinants: substituting the estimated equation for CO2 from (6.4), and taking account of the scaling there, would essentially deliver Table 6.4. As the model's parameters are fixed, there are no standard errors.7

Table 6.6: Long-run solutions as a function of the relevant strongly exogenous orbital variables and CO2

             1        Ec       Ob       EcOb     EcPr     CO2
    Ice     13.2      365     −0.29     −15.5    −66      −0.0095
    Temp   −20.5    −2476      0         106     210       0.066

7 If just the coefficient of CO2 is left free in this bivariate model with exogenous CO2, all others fixed, its long-run value is estimated as 0.065 with a standard error of 0.001.

Using the solved long-run coefficient for CO2 on Temp of 0.066, the simulated increase of 105 ppm from 280 ppm would eventually raise temperatures by about 6.9°C ceteris paribus, close to the forecast peak in Panel (d). For the doubling of CO2, equivalent to adding 160 ppm allowing for the log-transform as seen in Panel (b), the increase would be about 13.2°C, somewhat lower than the peak simulated value of 17.8°C. Kaufmann and Juselius (2013) note that: 'a permanent 180 ppm increase in atmospheric CO2 increases the long-run Antarctic temperature by about 11.1°C, which corresponds to a global value of about 5.6°C [Masson-Delmotte et al., 2006, 2010]', although Masson-Delmotte et al. (2010) also question the 2-to-1 relation with global temperatures. However, our estimates of the effects of adding 120 ppm are 7°C Antarctic warming, and for adding 280 ppm of between 13.2°C and 17.8°C, which straddle their estimate. Knutti et al. (2017) record a wide range of estimates of equilibrium climate sensitivity (ECS), with many larger than six.
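The long-run CO2 coefficient in Table 6.6 and the headline warming figure follow directly from (6.5): with CO2 exogenous and fixed, the equilibrium response of Temp to CO2 is 0.0080/(1 − 0.879). A small sketch of that arithmetic, using point estimates only and ignoring estimation uncertainty:

```python
# Equilibrium (long-run) response of Temp to exogenous CO2 implied by (6.5):
# Temp_t = ... + 0.879*Temp_{t-1} + 0.0080*CO2_t, so in a steady state
# dTemp*/dCO2 = 0.0080 / (1 - 0.879).
beta_co2, rho = 0.0080, 0.879
long_run = beta_co2 / (1 - rho)
print(round(long_run, 3))          # ~0.066, as in Table 6.6

# 400 ppm scenario: +105 ppm on the matched linear scale (385 - 280)
print(round(long_run * 105, 1))    # ~6.9 degrees C, the figure quoted above
```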
There are many caveats, from assuming that the parameters of the models stay constant as anthropogenic warming increases, despite the many implicit dynamic relationships between atmosphere and oceans, to how ice loss would impact those. Nevertheless, the simulated temperature responses to exogenous changes in atmospheric CO2 are similar to, but smaller than, those that actually occurred during the ice ages: for example, 10°C between 252 kya and 242 kya from an 80 ppm increase in CO2, or 13°C between 156 kya and 128 kya from a 90 ppm increase in CO2, remembering that much of these temperature changes was driven directly by the orbitals.

6.5 Conclusions on Ice-Age Modeling

The 3-equation model over the Ice Ages of ice volume, atmospheric CO2 and Antarctic temperature illustrates the approach to modeling a system. While much simpler than the larger cointegrated systems in Kaufmann and Juselius (2013) and Pretis and Kaufmann (2018), the resulting estimated system provided a useful description of the available time series and strongly supported the view that the role of CO2 was an endogenous response to the orbital drivers during the Ice Ages. The evidence also supported an impact of humanity on the Earth's climate starting at least 10,000 years ago. Extending the orbital data for 100,000 years ahead allowed multi-step forecasts both with the system continuing as before, so that CO2 remained endogenously determined, and with its status switched to being exogenously determined by anthropogenic emissions, although the orbital drivers would still be operating. The resulting inferred global temperature rises would be dangerous at more than 5°C, with Antarctic temperatures positive for thousands of years. Thus, the aims of the Paris Accord remain crucial, so the next section considers the UK's role in reducing its CO2 emissions.

7 Econometric Modeling of UK Annual CO2 Emissions, 1860–2017

As described in Section 4, CO2 and other greenhouse gas emissions influence the Earth's climate. Over the Ice Ages, such emissions were determined by natural forces, as highlighted by Figure 6.4, but the model developed in Section 6 suggested that CO2 then was primarily an intermediate determinant rather than an exogenous cause of climate variations. However, since the Industrial Revolution discussed in Section 5, although the same natural forces still operate, greenhouse gas emissions are now mainly by-products of energy production, manufacturing, and transport (each about a quarter of the UK's emissions), with agriculture, construction and waste making up most of the rest.

As the first country into the Industrial Revolution, the UK initially produced a large share of global anthropogenic CO2 emissions, albeit much of that was embodied in its exports of cloth, steam engines, ships, iron products, etc. Not only has its share of world CO2 emissions shrunk to a tiny proportion following global industrialization, there has been a dramatic drop in its domestic emissions of CO2, so that by 2017 they were back to 1890s levels: the country first into the Industrial Revolution is one of the first out. Indeed, on April 22, 2017, 'Britain has gone a full day without turning on its coal-fired power stations for the first time in more than 130 years',1 and on May 26, 2017 it generated
almost 25% of its electrical energy from solar,2 and now goes weeks without burning coal for electric energy production.

1 See https://www.ft.com/content/8f65f54a-26a7-11e7-8691-d5f7e0cd0a16.
2 See https://www.ft.com/content/c22669de-4203-11e7-9d56-25f963e998b2.

The data analyzed here are aggregate, but as the UK population has more than doubled since 1860, in 2013 the UK's CO2 emissions in per capita terms actually dropped below the level of 1860 (see Figure 7.1(c)), and are now just 55% of their level in 1894, despite per capita real incomes being around 7-fold higher. Thus, although the UK now 'imports' substantial embodied CO2, reversing the Industrial Revolution direction, major domestic emissions reductions have occurred, but have obviously not involved substantive sacrifice: see Brinkley (2014) for an empirical analysis of decoupling growth and CO2 emissions. Much remains to be done to reduce CO2 emissions towards the net zero level that will be required to stabilize temperatures, an issue we address in §7.12.

Figure 7.1: (a) UK CO2 emissions in millions of tonnes (Mt); (b) UK fuel sources: coal (Mt), oil (Mt), natural gas (millions of tonnes of oil equivalent, Mtoe) and wind+solar (Mtoe); (c) CO2 emissions per capita, in tons per annum; (d) ratio of CO2 emissions to the capital stock on a log scale, all series to 2017.

The aim of this section is to model the UK's CO2 emissions to establish the determinants of the UK's remarkable drop, accomplished with rising real incomes. We again use Autometrics to jointly select relevant variables, their lags, possible non-linearities, outliers and location shifts in putative relationships, and also rigorously test selected equations for being well-specified representations of the data.

The structure of this section is as follows. §7.1 defines the variables and records their sources, then §7.2 describes the UK time-series data under analysis, initially using only data over 1861–2011 for estimation and selection to allow an end-of-sample parameter-constancy test to 2017, updating estimation to 2013 in §7.7. §7.3 formulates the econometric model, where §7.3.1 considers the choice of functional forms of the regressors. Then §7.4 evaluates a simple model formulation, and highlights the inadequacy of such specifications facing wide-sense non-stationary data. The four stages of model selection from an initial general model are described in §7.5, then §7.6 addresses selecting indicators in the general model. §7.7 describes selecting relevant regressors given the retained indicators, and implementing a cointegration reduction, where the non-integrated formulation is estimated in §7.8. §7.9 conducts an encompassing test of the linear-semilog model versus a linear-linear one.
§7.10 presents conditional 1-step 'forecasts' and multi-step forecasts from a VAR, §7.11 addresses the policy implications of the empirical analysis, then §7.12 considers whether the UK can reach its 2008 Climate Change Act (CCA) CO2 emissions targets for 2050, and the more recent aim of net zero greenhouse gas (GHG) emissions. Finally, §7.13 estimates a 'climate-environmental Kuznets curve'.

7.1 Data Definitions and Sources

The variables used in the analysis of UK CO2 emissions are defined as follows:

Et = CO2 emissions in millions of tonnes (Mt) [1], [2].
Ot = net oil usage, millions of tonnes [3].
Ct = coal volumes in millions of tonnes [4].
Gt = real GDP, £10 billions, 1985 prices [5], [7], p. 836, [8]a,b.
Kt = total capital stock, £billions, 1985 prices [6], [7], p. 864, [8]b,c.
∆xt = (xt − xt−1) for any variable xt.
∆²xt = ∆xt − ∆xt−1.

Sources:
[1] World Resources Institute, http://www.wri.org/our-work/project/cait-climate-data-explorer and https://www.gov.uk/government/collections/final-uk-greenhouse-gas-emissions-national-statistics;
[2] Office for National Statistics (ONS), https://www.gov.uk/government/statistics/provisional-uk-greenhouse-gas-emissions-national-statistics-2015;
[3] Crude oil and petroleum products: production, imports and exports 1890 to 2017, Department for Business, Energy and Industrial Strategy (Beis);
[4] Beis and Carbon Brief, http://www.carbonbrief.org/analysis-uk-cuts-carbon-record-coal-drop;
[5] ONS, https://www.ons.gov.uk/economy/nationalaccounts/uksectoraccounts#timeseries;
[6] ONS, https://www.ons.gov.uk/economy/nationalaccounts/uksectoraccounts/bulletins/capitalstocksconsumptionoffixedcapital/2014-11-14#capital-stocks-and-consumption-of-fixed-capital-in-detail;
[7] Mitchell (1988) and Feinstein (1972);
[8] Charles Bean (from (a) Economic Trends Annual Supplements, (b) Annual Abstract of Statistics, (c) Department of Employment Gazette, and (d) National Income and Expenditure).

See Hendry (2001, 2015) and Hendry and Ericsson (1991) for discussions of Gt and Kt. There are undoubtedly important measurement errors in all these time series, but Duffy and Hendry (2017) show that strong trends and large location shifts of the form prevalent in the data analyzed here help offset potential biases in the long-run relation's estimated coefficients.
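The ∆ and ∆² transforms defined above are straightforward to construct; a minimal pandas sketch, assuming a DataFrame with yearly columns named E, O, C, G and K (hypothetical names matching the definitions above):

```python
import pandas as pd

def add_differences(df, cols=("E", "O", "C", "G", "K")):
    """Append first and second differences, Dx_t = x_t - x_{t-1} and
    D2x_t = Dx_t - Dx_{t-1}, for the named columns."""
    out = df.copy()
    for c in cols:
        out[f"D{c}"] = out[c].diff()
        out[f"D2{c}"] = out[c].diff().diff()
    return out
```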
7.2 UK CO2 Emissions and Its Determinants

As already noted, energy production, manufacturing, and transport each account for roughly 25% of UK CO2 emissions, the rest coming mainly from agriculture, construction and waste in approximately equal shares. While other greenhouse gas emissions matter, CO2 comprises about 80% of the UK total, with methane, nitrous oxide and hydrochlorofluorocarbons (HCFCs) making up almost all the rest in CO2 equivalents. However, the various fossil fuels have different CO2 emissions per unit of energy produced, and how efficiently fuels are burnt also matters, from coal on an open fire or in a furnace, through gasoline-powered vehicles with different engine efficiencies, to a gas-fired home boiler or a power station. A standard approach to estimating country fossil fuel emissions is to use the product of the volumes of fuels produced, the proportion of each fuel that is oxidized, and each fuel's carbon content (see Marland and Rotty, 1984). Table 7.1 records the average CO2 emissions per million British thermal units (Btu) of energy produced for the main fossil fuels.3

Table 7.1: Pounds of CO2 emitted per million British thermal units (Btu) of energy produced

    Coal (anthracite)            228.6
    Coal (bituminous)            205.7
    Coal (lignite)               215.4
    Coal (sub-bituminous)        214.3
    Diesel fuel & heating oil    161.3
    Gasoline                     157.2
    Propane                      139.0
    Natural gas                  117.0

Source: US Department of Energy and https://www.eia.gov/tools/faqs/faq.php?id=73&t=11.

3 Variations on such data are used in Erickson et al. (2008), Jones and Cox (2005), Nevison et al. (2008), and Randerson et al. (1997). Data using this methodology are available at an annual frequency in Marland et al. (2011). CO2 emissions from cement production are estimated to make up about 5% of global anthropogenic emissions (see Worrell et al., 2001).
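The 'relative weights' used in the next paragraph follow directly from Table 7.1. A small sketch of the implied reduction from switching coal-fired energy production to natural gas, per unit of energy and before any differences in how efficiently the fuels are burnt:

```python
# CO2 intensity in pounds per million Btu, taken from Table 7.1
intensity = {"coal_anthracite": 228.6, "coal_bituminous": 205.7,
             "diesel_heating_oil": 161.3, "gasoline": 157.2,
             "natural_gas": 117.0}

for coal in ("coal_anthracite", "coal_bituminous"):
    cut = 1 - intensity["natural_gas"] / intensity[coal]
    print(f"{coal} -> gas saves {cut:.0%} per unit of energy")
# roughly 43%-49%, consistent with the 'about 45%-50%' figure quoted below
```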
As rough approximations for interpreting CO2 reductions, coal has a relative weight of around 2.2, oil 1.6 and natural gas 1.1, depending on the units of measurement. Thus, switching energy production from coal to natural gas would reduce emissions by about 45%–50% for the same amount of energy. Of course, switching to renewable sources would effect a 100% reduction, and is an essential step to reach a net-zero emissions target.

The main data over 1860–2017 on UK CO2 emissions, energy volumes, and the relation of CO2 emissions to the capital stock are shown in Figure 7.1. Panel (a) shows that UK CO2 emissions rose strongly and quite steadily from 1860 till about 1916, oscillated relatively violently till about 1946, reflecting the sharp depression at the end of World War I, the General Strike, the Great Depression starting in 1930, and World War II, then resumed strong growth till 1970. Following another somewhat turbulent period till 1984, emissions began to fall slowly, accelerating after 2005 to the end of our time series in 2017, by which time they were below levels first reached in 1890. Panel (c) plots CO2 emissions per capita, revealing that by 2013 they had fallen below the level at the start of our data period in 1860.

Panel (b) records the time series for coal volumes and net oil usage (imports plus domestic production less exports), natural gas and renewables. Coal volumes behave similarly to CO2 emissions till 1956, at which point they turn down and continue falling from then onwards, dropping well below the volumes mined in 1860. The sharp dips from miners' strikes in 1921, 1926 and 1984 are clearly visible. Conversely, oil volumes are essentially zero at the start, but rise rapidly in the period of cheap oil after World War II, peak in 1973 with the first Oil Crisis, then stabilize from 1981 on, despite a doubling in vehicle travel to more than 500 billion kilometers p.a. Natural gas usage rises quickly from the late 1960s, but has recently fallen slightly, and renewables have been growing rapidly this century.

Finally, Panel (d) plots the log-ratio of CO2 emissions to the capital stock and shows that it started to decline in the 1880s, and has dropped by more than 92% over the hundred and thirty years since. As capital embodies the vintage of technology prevalent at the time of its construction, tends to be long lasting, and is a key input to production, the volumes of CO2 produced by production are likely to be strongly affected by the capital stock: see e.g., Pfeiffer et al. (2016). Hence, 'stranded assets' could be a potential problem if legislation imposed much lower CO2 emissions targets, as looks likely for the UK.

To highlight the massive changes that have occurred in the UK, Figure 7.2 reports a scatter plot of CO2 emissions against the quantity of coal, showing the dates of each pair of points, and a 3-dimensional plot of Et against Kt and Ct. As with Figure 7.1(a), there is strong growth in emissions as coal output expands until the mid 1950s when coal production peaks, but emissions continue to grow till the mid 1970s despite a substantial reduction in coal volumes, and only then start to decline, falling noticeably after 2008. Referring back to Figure 7.1(b), the rapid rise in oil use initially offsets the fall in coal, but after the two Oil Crises of the 1970s, the fall in coal is reflected in the decline in emissions.
Panel (b) shows the major role of the capital stock in changing the link between coal and CO2 emissions, reflecting the efficiency gains seen in Figure 7.1(d).

Figure 7.2: (a) Scatter plot of CO2 emissions against the quantity of coal by date; (b) 3-dimensional plot of Et against Kt and Ct.

Figure 7.3 shows the distributional shifts in CO2 emissions that have occurred historically, using approximately 40-year sub-periods.

Figure 7.3: Sub-period distributions of UK CO2 emissions (1860–1899, 1900–1939, 1940–1979, 1980–2017), Mt p.a.

All the above graphs show non-linear relationships at the bivariate level (i.e., between CO2 emissions and coal production, say), as well as shifts in relations. An immediate implication is that simple correlations between pairs of variables change over time, so will be poor guides to what matters in a multivariable relationship, as Table 7.2 shows. Coal volumes have the smallest correlation with CO2 emissions, yet were manifestly one of its main determinants.4 Figure 7.4 shows recursive estimates of the relation Et = β̂0 + β̂1Ct + ν̂t, confirming the dramatic non-constancy of that overly simple model and illustrating the problems of not modeling non-stationarity.

Footnote 4: Correlations are not well defined for non-stationary variables, as they are not constant over time.

Table 7.2: Whole-sample correlations

                  CO2 emissions    Coal      Oil     Real GDP   Capital
CO2 emissions         1.000        0.243    0.734     0.528      0.506
Coal                               1.000   −0.424    −0.598     −0.624
Oil                                         1.000     0.829      0.822
Real GDP                                               1.000      0.997

7.3 Model Formulation

Following the formulation in (2.1) above, the general model is the system characterizing the LDGP. Here, we are interested in modeling UK CO2 emissions given the volumes of coal and oil the UK used and the main representations of the scale of the economy and its productive capacity, namely GDP and the capital stock. Over most of our sample period there would not have been any contemporaneous or lagged feedbacks from CO2 emissions to the explanatory regressors; by the middle of the 20th century, with 'Clean Air' Acts of Parliament, that became a possibility, increasingly so by the first decade of the 21st century as climate-change concerns grew, but overall a conditional model seems a viable representation here.
Figure 7.4: (a) Recursive β̂1,t with ±2SE; (b) recursive β̂0,t with ±2SE; (c) 1-step recursive residuals ν̂t with ±2σ̂t; (d) break-point Chow tests scaled by their 0.1% critical values.

Combining all the above information, neither of the two 'polar' approaches to modeling the UK's CO2 emissions, namely as (a) decomposed into its sources (coal, oil, gas etc.), or (b) as a function of economic variables (capital and output) alone, seems likely to be best. On (a), not all sources have been recorded historically, especially their carbon compositions, which will have varied over time with the type of coal used and how oil was refined to achieve which products (inter alia). On (b), that changing mix will entail non-constancy in the relation between emissions and the capital stock and GDP. To capture the changing mix and its relation to the economic variables, we included the two main emitters, coal and oil, with the capital stock and GDP. The latter then explain the emissions not accounted for by the former: the solved long-run relationship in Equation (7.5) below finds a role for all four variables, and the coefficients for coal and oil are also consistent with that interpretation. In turn, the additive nature of emissions suggests a linear relation with coal and oil, although that leaves open how the economic variables might enter, considered in §7.3.1.

A further obvious feature of Figure 7.1(a) is the number of very large 'outliers' occurring during the inter-war and immediate post-war periods. Consequently, the general set of variables from which the model for CO2 emissions will be selected comprises its lagged value and current and first lagged values of coal and oil volumes, real GDP and the capital stock. These variables are all retained without selection while selecting over both impulse and step indicators at α = 0.1% significance. First, however, we address the functional forms for Gt and Kt.

7.3.1 Functional Forms of the Regressors

In §2.2.2 we considered a low-dimensional representation of non-linearity, but here a more specific issue is whether to transform the various regressors to logarithms or to leave them linear. CO2 emissions depend linearly on the volumes of fossil fuels consumed, with the weights shown in Table 7.1. Moreover, it is the volume of CO2 emitted that has to be reduced to net zero, so we use that as the dependent variable. In turn, it is natural to include coal and oil volumes linearly as well. Nevertheless, both linear and log-linear relations were investigated. As oil was used in negligible quantities in the 19th century, early volumes were increased by unity (to ensure positive values), but the log transform still seemed to distort rather than help. Equivalent linear and log-linear equations were formulated as:

Et = β0 + β1 Et−1 + β2 Ct + β3 Ct−1 + β4 Ot + β5 Ot−1 + β6 Gt + β7 Gt−1 + β8 Kt + β9 Kt−1 + ut    (7.1)

and the same form with all variables in logs, then estimated with IIS+SIS selecting at 0.001 while retaining all the regressors in (7.1).
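Indicator saturation (IIS+SIS) itself is implemented in Autometrics within PcGive, and in the R package gets; purely to illustrate the ADL(1,1) structure of (7.1), a minimal Python sketch without any saturation or selection (file and column names hypothetical) is as follows.

import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("uk_co2_annual.csv", index_col="year")   # hypothetical file

y = df["E"]
X = pd.DataFrame({
    "E_1": df["E"].shift(1),
    "C":   df["C"], "C_1": df["C"].shift(1),
    "O":   df["O"], "O_1": df["O"].shift(1),
    "G":   df["G"], "G_1": df["G"].shift(1),
    "K":   df["K"], "K_1": df["K"].shift(1),
})
X = sm.add_constant(X)

data = pd.concat([y, X], axis=1).dropna()
res = sm.OLS(data["E"], data.drop(columns="E")).fit()
print(res.summary())   # the log-linear variant replaces each series by its log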
The log-linear version had a residual standard deviation of 2.6%, whereas dividing the residual standard deviation of the linear form (reported in https://voxeu.org/article/driving-uks-capita-carbon-dioxide-emissions-below-1860-levels) by the mean value of Et yielded 2.0%, so the linear representation dominated on the criterion proposed by Sargan (1964). By way of comparison, even after IIS, the 'Kuznets curve' formulation in (7.7) below had a residual standard deviation of 5.5%.

However, that leaves open the choice of log or linear just for Gt and Kt. Figure 7.5 graphs those variables in linear and log transforms, matched by means and ranges to highlight any relative curvature. Given the large increases in both since 1860, a change of £100 billion corresponds to very different percentage changes at different dates, illustrated by the apparently small fall in G after World War I yet the largest drop in g, with the opposite after 2008. Consequently, we model with the logs, denoted g and k, scaled by 100 so that coefficients are between ±10 and ∆g and ∆k are percentage changes. The encompassing test in §7.9 checks how well the two possibilities of linear and semi-log compare. Outliers and location shifts detected by super saturation estimation may well differ between these specifications.

Figure 7.5: Graphs on a logarithmic scale matched by means and ranges of linear (capitals) and log (lower case) transforms of (a) GDP; (b) Capital stock.

7.4 Evaluating a Model Without Saturation Estimation

Thus, the baseline relationship between emissions and its main determinants was formulated as:

Et = β0 + β1 Et−1 + β2 Ct + β3 Ct−1 + β4 Ot + β5 Ot−1 + β6 gt + β7 gt−1 + β8 kt + β9 kt−1 + vt.    (7.2)

To demonstrate why a simple-to-general methodology is inadequate, we first estimate and evaluate the relation in (7.2) over 1861–2011, with six observations retained as an end-of-sample constancy test for 2012–2017, given in (7.3), where estimated coefficient standard errors (SEs) are shown in parentheses below the estimated coefficients, with heteroskedasticity and autocorrelation consistent standard errors (HACSEs) shown below those in brackets (see Andrews, 1991, and Newey and West, 1987).

Êt = 0.79 Et−1 + 2.58 Ct − 2.21 Ct−1 + 2.05 Ot − 1.53 Ot−1 + 0.81 gt − 0.99 gt−1 + 1.67 kt − 1.39 kt−1 + 61    (7.3)
    (0.054)      (0.14)    (0.18)      (0.43)    (0.43)      (0.53)    (0.53)      (2.67)    (2.62)      (133)
    [0.070]      [0.38]    [0.40]      [0.53]    [0.53]      [0.49]    [0.57]      [2.65]    [2.57]      [109]

σ̂ = 16.2   R² = 0.985   FAR(2, 139) = 8.44∗∗   χ²nd(2) = 64.4∗∗   FARCH(1, 149) = 18.9∗∗
FHet(18, 132) = 2.95∗∗   FReset(2, 139) = 14.3∗∗   FChow(6, 141) = 0.96   tur = −3.91.

Despite the high R² induced by the non-stationarities in the variables, the model is completely inadequate. Every mis-specification test rejects, the key economic variables g and k are insignificant, and tur does not reject the null hypothesis of no cointegration. The solved long-run equation for E in Table 7.3 also has the 'wrong' relative coefficients of coal and oil.
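The solved static long-run solution reported in Table 7.3 below follows from (7.3) by setting Et = Et−1 = E and each regressor to a fixed value, so that, for example, the long-run coefficient of coal is (β̂2 + β̂3)/(1 − β̂1). A small added sketch reproduces Table 7.3 from the rounded coefficients in (7.3); any remaining discrepancies are pure rounding.

# Solved static long-run solution of the ADL model (7.3):
# long-run coefficient of x = (coef on x_t + coef on x_{t-1}) / (1 - coef on E_{t-1}).
b_lag_E = 0.79
pairs = {            # (current, first lag) coefficients from (7.3), rounded
    "C": (2.58, -2.21),
    "O": (2.05, -1.53),
    "g": (0.81, -0.99),
    "k": (1.67, -1.39),
}
long_run = {v: (b0 + b1) / (1 - b_lag_E) for v, (b0, b1) in pairs.items()}
long_run["const"] = 61 / (1 - b_lag_E)
print(long_run)
# ≈ {'C': 1.76, 'O': 2.48, 'g': -0.86, 'k': 1.33, 'const': 290},
# matching Table 7.3 up to rounding.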
The HACSEs do not alter the significance or insignificance of the regressors and, given the substantive rejections on FAR and FHet, are surprisingly close to the conventional SEs (see the critiques of HACSEs in Castle and Hendry, 2014a, and Spanos and Reade, 2015), so they do not alert investigators who fail to compute mis-specification tests to the problems. Finally, the recursively estimated coefficients β̂i,t with ±2SEi,t, the residuals with ±2σ̂t, and the recursive FChow test are shown in Figure 7.6, revealing considerable non-constancy. The coefficient of Et−1 is converging towards unity, often signalling untreated location shifts (see Castle et al., 2010).

Table 7.3: Solved static long-run equation for E from (7.3)

Variable       1       C       O       g       k
Coefficient   289     1.77    2.45   −0.86    1.35
SE            635     0.17    0.64    1.05    0.97

Figure 7.6: Graphs of β̂i,t, i = 0, . . . , 9, with ±2SEi,t; ût with ±2σ̂t; and FChowt over 1875–2011.

The dilemma confronting any investigator after fitting (7.3), and facing so many test rejections, is how to proceed. Mis-specification tests can reject against a number of different alternatives to those for which they were originally derived, so implementing that particular alternative is a non-sequitur. For example, residual autocorrelation need not entail error autocorrelation, but may arise from incorrect dynamics, unmodeled location shifts or other parameter changes, data measurement errors and omitted variables, so adopting a recipe of the form often attributed to Orcutt and Cochrane (1949) can be counter-productive (see e.g., Mizon, 1995). Indeed, once there is residual heteroskedasticity and non-constancy, it is unclear what other rejections mean, except to confirm that something is wrong. The obvious alternative of general-to-specific is what we now explore for modeling UK CO2 emissions.

7.5 Four Stages of Single-Equation Model Selection

In this subsection, we consider the four stages of conditional model selection from (7.2) extended by using super saturation (namely IIS+SIS), fitting to data over 1861–2011 to allow an end-of-sample parameter-constancy test to 2017. First, in §7.6 we select both impulse and step indicators at a tight nominal significance level α, which is the theoretical gauge, retaining all of the other regressors in (7.2) without selection. The studies referenced in Section 2 have established that the theoretical and empirical gauges are generally close for IIS, and have derived the uncertainty around the latter, which is almost negligible for a very small α = 0.001. Less is known analytically about the gauge of SIS or super saturation, but the simulation studies noted earlier suggest the gauge should be set around 1/2T.
Since there are T = 151 observations, there will be M ≈ 300 indicators in the candidate set (T impulse indicators and T − 2 step indicators), so under the null hypothesis that no indicators are needed, αM = 0.001 × 300 = 0.3 of an indicator will be significant by chance. Even doubling that to 2αM = 0.6 can be interpreted as one indicator being retained adventitiously in roughly three out of every five applications of these choices to new data sets with the same configuration of T, so over-fitting seems unlikely. As shown above, estimating (7.2) without indicator variables is unsuccessful, as all mis-specification tests strongly reject. Diagnostic tests will be applied to check that the finally selected equation is well specified, with non-autocorrelated, homoskedastic and nearly Normal residuals, constant parameters, and no remaining non-linearity: (7.4) records that outcome.

Second, in §7.7 we select over the other nine regressors at α = 0.01 (indicators already selected are bound to be significant at this second stage). Almost none of the nine regressors will be retained by chance if in fact they are irrelevant.

Third, also in §7.7, we solve this selected model for the cointegrating, or long-run, relation implicit in it, and reparametrize the non-deterministic variables to differences. In doing this mapping to a non-integrated specification, step indicators are included in the cointegration relation, so that they do not cumulate to trends, leaving impulse indicators and differenced step indicators unrestricted. While this may seem somewhat complicated, the reasons for doing so are explained in the survey articles by Hendry and Juselius (2000, 2001) and in Hendry and Pretis (2016). Finally, we re-estimate that non-integrated formulation in §7.8.

7.6 Selecting Indicators in the General Model

Following this path, we find for T = 1862–2011, retaining all the regressors and selecting impulse and step indicators jointly at 0.1%, testing constancy over 2012–2017:

Êt = 0.52 Et−1 + 1.86 Ct − 0.88 Ct−1 + 1.71 Ot − 1.07 Ot−1 + 0.95 gt − 1.13 gt−1 + 7.64 kt − 7.02 kt−1 − 158
    (0.06)       (0.13)    (0.18)      (0.26)    (0.28)      (0.33)    (0.33)      (1.8)     (1.8)      (89)
   − 47 1{1921} − 163 1{1926} − 44 1{1946} + 56 1{1947} + 29 1{1996}
    (13)          (20)          (10)         (11)         (9.8)
   − 42 S{1925} + 72 S{1927} − 31 S{1969} + 47 S{2010}    (7.4)
    (14)          (13)          (7.5)        (10)

σ̂ = 9.58   R² = 0.995   FAR(2, 130) = 2.93   χ²nd(2) = 5.97   FARCH(1, 149) = 3.42
FHet(20, 123) = 0.82   FReset(2, 130) = 2.30   FChow(6, 132) = 1.40   Fnl(27, 105) = 1.04

where Fnl tests for non-linearity (see §2.2.2). All of these mis-specification tests are insignificant, including FReset and Fnl, so all of the non-linearity has been captured by (7.4); but the tests are applied to I(1) data, so correct critical values are not known: see Berenguer-Rico and Gonzalo (2014) for a test of non-linear cointegration applied in this context.

Five impulse and four step indicators have been selected despite the very tight significance level. Combining the indicators in (7.4) allows some simplification, by transforming 1{1926} and S{1927} to ∆1{1926}, and 1{1947} − 1{1946} = ∆1{1947}. This reduces the number of genuine location shifts to three, an intermediate modeling stage that was implemented before selecting over the nine regressors. The resulting σ̂ was unaffected by these transformations.
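A minimal added sketch of how the IIS and SIS candidate sets can be constructed, with the null-retention arithmetic above as a comment; the selection itself, with the economic regressors forced, is done by Autometrics in PcGive or by the R package gets.

import numpy as np
import pandas as pd

years = np.arange(1861, 2012)              # T = 151 estimation sample
T = len(years)

# T impulse dummies 1{t=s} and T-2 step dummies S{s} = 1{t<=s}
# (steps terminate at the dates shown, as described in the text).
impulses = pd.DataFrame(np.eye(T, dtype=int), index=years,
                        columns=[f"I{y}" for y in years])
steps = pd.DataFrame({f"S{y}": (years <= y).astype(int)
                      for y in years[1:-1]}, index=years)

M = impulses.shape[1] + steps.shape[1]     # ~300 candidate indicators
print(M, 0.001 * M)                        # expected false retentions ~0.3 at alpha = 0.1%

# The simplification used in the text combines 1{1926} with S{1927} into Δ1{1926},
# and 1{1947} − 1{1946} = Δ1{1947}, i.e. one-period 'blips'.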
The remaining step shifts capture major events with long-term impacts that are not otherwise captured by the variables in the model. These could reflect changes in the improving efficiency of fuel use, the effects of omitting other sources of emissions with key technological changes, or usage shifts not taken into account in calculating emissions. Since steps in the Autometrics implementation of SIS terminate at the dates shown, their reported signs reflect what happened earlier, so a positive coefficient for S{1925} entails a higher level prior to 1926. That is the date of the Act of Parliament that created the UK's first nationwide standardized electricity distribution grid, greatly enhancing the efficiency of electricity supply, but 1926 also witnessed the General Strike, probably captured by ∆1{1926}. Then 1969 saw the start of the major conversion of UK gas equipment from coal gas (about 50% hydrogen) to natural gas (mainly methane), with a considerable expansion in its use. The coefficients of both these location shifts have the appropriate signs of reducing and increasing emissions respectively. Although the UK's Clean Air Act of 1956 did not need a step indicator, probably because it was captured by the resulting fall in coal use, we interpret the step shift S{2010}, showing a higher level of emissions of 47 Mt before 2010, as the reaction to the Climate Change Act of 2008 (see https://www.legislation.gov.uk/ukpga/2008/27/contents) and the European Union's Renewables Directive of 2009, discussed in §7.11. Thus, we doubt the explanation is the Great Recession of 2008–2012, since the previous largest GDP fall, in 1921–22, did not need a step, just an impulse indicator for the large outlier in 1921. As coal volumes are included, indicators for miners' strikes should only be needed to capture changes in inventories, which might explain part of the large impulse indicator for 1926.

7.7 Selecting Regressors and Implementing Cointegration

Second, all nine regressors are retained when selecting at 1% significance.

Third, we solve for the long-run cointegrating relationship, justified by the Doornik and Hendry (2018) unit-root t-test value of tur = −8.99∗∗, which strongly rejects the null hypothesis of no cointegration (see Ericsson and MacKinnon, 2002, for the appropriate critical values, which are programmed into PcGive). The resulting cointegration relation defines the equilibrium-correction trajectory Q̃t = Et − ẼLR,t (adjusted to a mean of zero in-sample). Step indicators need to be led by one period as Q̃t−1 will be entered in the transformed model. However, because step indicators here terminate at the dates shown and the initial sample ended in 2011, 1 − S{2010} had only two observations in sample. Consequently, it was decided to extend the estimation sample by two observations to 2013, since the full sample now ended in 2017, to enable the cointegrating relation to include S{2010}. This led to estimates closely similar to (7.4), with tur = −9.34∗∗ and the solved long-run:

ẼLR = 2.0 C + 1.4 O + 1.18 k − 0.27 g + 63 S{1924} − 64.0 S{1968} + 70 S{2009} − 328    (7.5)
     (0.06)  (0.18)   (0.27)   (0.28)    (6)          (14)            (13)        (165)

All variables are significant at 1% other than g, which is 'wrong signed'.
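An added sketch of constructing the equilibrium-correction term Q̃t from the published coefficients in (7.5), then demeaning it in-sample as described above; the file and column names are hypothetical, and S1924, S1968, S2009 stand for the (led) step indicators in (7.5).

import pandas as pd

def e_long_run(df: pd.DataFrame) -> pd.Series:
    """Fitted long-run level from Equation (7.5)."""
    return (2.0 * df["C"] + 1.4 * df["O"] + 1.18 * df["k"] - 0.27 * df["g"]
            + 63 * df["S1924"] - 64.0 * df["S1968"] + 70 * df["S2009"] - 328)

df = pd.read_csv("uk_co2_annual.csv", index_col="year")     # hypothetical file
Q = df["E"] - e_long_run(df)
Q -= Q.loc[:2013].mean()          # centre on a zero in-sample mean (to 2013)
df["Q_1"] = Q.shift(1)            # Q_{t-1} enters the transformed model (7.6)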
The coefficient of coal is close to the current standard estimate of roughly 2.1–2.3, and that of oil is close to its standard estimate, though somewhat lower than the 1.6 in Table 7.1. Figure 7.7(a) shows how closely the derived long-run relation ẼLR,t in (7.5) tracks Et. Panel (b) records the resulting time series for Q̃t, centered on a mean of zero. While Q̃t is not stationary, because of a changing variance (unsurprising given the omission of the impulse indicators, matching the visible spikes), a unit root is rejected.

Figure 7.7: (a) Et and ẼLR,t; (b) Q̃t = Et − ẼLR,t centered on a mean of zero.

Because the units in which the different variables in (7.5) are measured are not directly comparable, their relative importance as determinants of the level of Et is hard to judge. However, Figure 7.2(b) provided a 3-dimensional plot of Et against Kt and Ct to show that, while the rise then fall of coal usage in Figure 7.2(a) explains much of the behavior of CO2 emissions, the increases in the capital stock track the shift in the mid 20th century to higher emissions for the same volumes of coal as in the 19th (a similar picture emerges when plotting Et against Ct and Ot, but with a more erratic spread). Moreover, in the relatively similar long-run solution of a log-linear formulation, where coefficients are elasticities, the two dominant influences were 0.42 from coal and 0.40 from the capital stock, with much smaller effects from GDP and oil. These effects match prior anticipations as discussed above. Indeed, in the linear and log-linear models, the long-run effect of GDP is negative, possibly reflecting the move from manufacturing to a service-based economy, although it is insignificant in the semi-log form (7.5).

7.8 Estimating the Cointegrated Formulation

Fourth, transforming to a model in differences and the lagged cointegration relation from (7.5), then re-estimating, revealed a couple of additional outliers (significant at 1% but not the original 0.1%), and adding those indicators yielded (7.6) for 1861–2013, testing constancy over 2014–2017.

∆̂Et = 1.88 ∆Ct + 1.71 ∆Ot + 7.15 ∆kt + 0.89 ∆gt − 0.50 Q̃t−1 − 15.2
      (0.10)      (0.21)      (1.09)      (0.28)     (0.05)      (2.4)
    − 79.4 ∆1{1926} + 50.2 ∆1{1947} − 45.8 1{1921} − 27.5 1{1912} + 26.8 1{1978} + 28.4 1{1996}    (7.6)
      (8.8)            (6.4)            (11.1)         (8.9)          (8.9)          (8.9)

σ̂ = 8.87   R² = 0.94   FAR(2, 139) = 0.49   χ²nd(2) = 1.67   FHet(14, 134) = 1.03
FARCH(1, 151) = 0.53   FReset(2, 139) = 1.50   Fnl(15, 126) = 1.35   FChow(4, 141) = 1.75.

Increases in oil, coal, k and g all lead to increases in emissions, which then equilibrate back to the long-run relation in (7.5). There are very large perturbations from this relationship, involving step shifts, impulses and blips. Archival research revealed that 1912 saw the first national strike by coal miners in Britain, causing considerable disruption to train and shipping schedules, although nothing obvious was noted for 1978. The turbulent periods create such large changes that it is difficult to ascertain from Figure 7.8 how well the model describes the data, so Figure 7.9 records the implied levels' fitted values and outcomes. The match is extremely close, although the sudden lurches are only 'modeled' by indicator variables, as are several of the step shifts.
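An added sketch of the equilibrium-correction regression (7.6) by OLS, given the differenced variables, Q̃t−1 and the retained indicators (constructed as in the earlier sketches; all names are hypothetical), together with the implied levels fit plotted in Figure 7.9.

import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("uk_co2_transformed.csv", index_col="year")   # hypothetical file

cols = ["DC", "DO", "Dk", "Dg", "Q_1",
        "DI1926", "DI1947", "I1921", "I1912", "I1978", "I1996"]
data = df.loc[1861:2013, ["DE"] + cols].dropna()
ecm = sm.OLS(data["DE"], sm.add_constant(data[cols])).fit()

# Implied fit for the level of emissions (as in Figure 7.9): E^_t = E_{t-1} + ΔE^_t.
levels_fit = df["E"].shift(1).loc[data.index] + ecm.fittedvalues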
Possible explanations for the need for impulse indicators, some discussed above, include the role of gas, changes in stocks of coal and oil leading to divergences from measured output (so having different effects on emissions), the changing efficiency of production and usage (e.g., replacing electric fires by central heating), and general changes such as better insulation. All of the diagnostic statistics remain insignificant.

Figure 7.8: (a) Actual and fitted values for ∆Et from (7.6); (b) residuals scaled by σ̂; (c) residual density and histogram with a Normal density for comparison; (d) residual autocorrelation.

7.9 Encompassing of Linear-Semilog versus Linear-Linear

Encompassing tests were applied in §3.2 to discriminate between the roles of trend and previous distances driven. In this subsection, we apply them to test the linear-semilog model in (7.6), denoted M1, against the earlier linear-linear model reported in https://voxeu.org/article/driving-uks-capita-carbon-dioxide-emissions-below-1860-levels, denoted M2. As in the application of encompassing in Table 3.1, the instruments are the combined regressors of the two models. Table 7.4 records the outcome. The instruments used were S{2010}, ∆gt, ∆kt, Constant, ∆1{1947}, Q̃t−1, ∆Ct, ∆Ot, 1{1912}, 1{1921}, 1{1978}, 1{1996}, ∆Kt, 1{1970}, ∆S{1983}, ∆1{1926}, and Q̃GK,t−1, where Q̃t−1 and Q̃GK,t−1 denote the equilibrium-correction terms of the versions with log and linear GDP and capital respectively.

Figure 7.9: Actual and fitted values for UK CO2 emissions with indicator dates.

Table 7.4: Encompassing test statistics, where M1 is (7.6) with σ̂M1 = 8.89, M2 is the linear model with σ̂M2 = 9.18, and σ̂Joint = 8.69

Test                  Form         M1 vs. M2    Form         M2 vs. M1
Cox (1962)            N[0, 1]      −3.74∗∗      N[0, 1]      −5.40∗∗
Ericsson (1983) IV    N[0, 1]       3.26∗∗      N[0, 1]       4.53∗∗
Sargan (1964)         χ²(4)        10.0∗        χ²(4)        17.9∗∗
Joint model           F(4, 134)     2.62∗       F(4, 134)     5.00∗∗

Although M1 is rejected against M2, the F(4, 134) parsimonious encompassing test against the joint model is equivalent to adding the four variables from the linear model, and is not significant at the 1% level used for selection, nor are any of those variables individually significant at 1%. Conversely, Q̃t−1, ∆gt, ∆kt, and 1{1921} are highly significant, at less than 0.1%, if added to M2.

7.10 Conditional 1-Step 'Forecasts' and System Forecasts

To check the constancy of the model after 2013, Figure 7.10(a) records the four 1-step ahead 'forecasts' ∆̂ET+h|T+h−1 for ∆ET+h from (7.6), from T = 2013 with h = 1, . . . , 4, conditional on the realized values for the regressors, where σ̂f denotes the forecast standard error.
We also report 'forecasts' from the robust device (2.27), denoted ∆̃ET+h|T+h−1 (see Hendry, 2006, and §2.9). The derived 'forecasts' ÊT+h|T+h−1 for the levels ET+h are also shown in Panel (b). The robust devices have slightly larger RMSFEs of 14.9, as against 13.7 for ∆̂ET+h|T+h−1, so the conditional 'forecasts' suggest no substantive shift in the relationship, despite describing the lowest levels of CO2 emissions seen since the 19th century. However, Panel (c) shows the important role of the step indicator for 2010, as the forecasts resulting when it is absent are systematically too high.

Figure 7.10: (a) Outcomes ∆Et, fitted values, and 1-step conditional 'forecasts' ∆̂ET+h|T+h−1 with ±2σ̂f shown as bars, and robust 'forecasts' ∆̃ET+h|T+h−1; (b) implied ÊT+h|T+h−1 from (a) with ±2σ̂f, and corresponding robust 'forecasts' ẼT+h|T+h−1, both from 2013; (c) ∆Et, fitted values, and 1-step conditional 'forecasts' ∆̂ET+h|T+h−1 with ±2σ̂f shown as bars, and robust 'forecasts' ∆̃ET+h|T+h−1, commencing in 2008.

Re-estimating the CO2 model up to 2017 showed almost no change in σ̂, to 8.99, consistent with constancy. However, dropping S{2010} then re-estimating to 2017 leads to a jump in σ̂ to 11.1 and rejection on almost all the diagnostic tests, as does commencing forecasts from 2008, at which point the effects of the Climate Change Act would not be known. Now the advantages of the robust device come into their own, as Panel (c) shows. The mis-specified model's 'forecasts' suffer systematic failure when S{2010} is not included (all other indicators were included), lying outside the ±2σ̂f error bars for the last four observations, with an RMSFE of 36, whereas despite that omission, the robust 'forecasts' track the downward trend in emissions with an RMSFE of 25.

To obtain unconditional forecasts and evaluate the role of IIS and SIS in model development and forecasting, a vector autoregression (VAR) with two lags was estimated for the five variables Et, Ct, Ot, gt, and kt over 1862–2011, with and without the indicators found for (7.4). In the former, those indicators were included in all equations. The VARs were estimated unrestrictedly, without any selection to eliminate insignificant variables, as that would lead to different specifications between the systems: Clements and Hendry (1995) demonstrate the validity of forecasting in this setting. Figure 7.11, Panel (a), reports the outcomes for 1-step ahead forecasts with and without the step indicators, and Panel (b) the multi-step forecasts going 1, 2, . . . , 6 steps ahead. In both cases, indicator-based forecast intervals are shown as fans, and those without step indicators by bars. In-sample, impulse indicators only have an impact on forecasts to the extent that they change estimated parameters, whereas step indicators can have lasting effects. As can be seen, including the step indicators greatly reduces ±2σ̂f for the 1-step forecasts in Panel (a), and leads to much smaller RMSFEs in both cases as compared to when no step indicators are included.
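The RMSFE comparisons reported in this section are simple to compute once the forecasts are in hand; a minimal added sketch follows. The exact form of the robust device (2.27) is not reproduced here: the stand-in below just forecasts the next change by the latest observed change, so treat it as illustrative only.

import numpy as np

def rmsfe(actual, forecast):
    """Root mean square forecast error."""
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    return float(np.sqrt(np.mean((a - f) ** 2)))

# 1-step conditional 'forecasts' from the fitted ECM, given the realized regressors:
#   dE_hat = ecm.predict(X_future)        # statsmodels OLSResults.predict
# An illustrative robust device (a stand-in for (2.27), not its exact form):
#   dE_tilde = dE_observed.shift(1)       # forecast ΔE_{T+h} by ΔE_{T+h-1}
# RMSFE comparison of the kind reported above:
#   rmsfe(dE_observed, dE_hat) versus rmsfe(dE_observed, dE_tilde)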
The outcomes do not always lie within the uncertainty intervals for forecasts without SIS, which under-estimate the uncertainty despite a larger σ̂.

Figure 7.11: (a) Outcomes, fitted values, and 1-step forecasts with and without step indicators, with ±2σ̂f respectively shown as bars and fans, plus RMSFEs (with SIS, σf = 16, RMSFE = 25; without SIS, σf = 32, RMSFE = 46); (b) outcomes, fitted values, and multi-step forecasts with and without step indicators, with ±2σ̂f respectively shown as bars and fans, plus RMSFEs (with SIS, RMSFE = 29; without SIS, RMSFE = 94).

7.11 Policy Implications

The most important implication of the above evidence is that substantial CO2 reductions have been feasible, so far with little apparent impact on GDP. The UK's 2008 Climate Change Act (CCA) established the world's first legally-binding climate-change target: to reduce the UK's greenhouse-gas (GHG) emissions by at least 80% by 2050 from the 1990 baseline (the UK carbon budget counts six GHG emissions, not just CO2). A range of policy initiatives was implemented, with an updated carbon plan in 2011 (again covering more than just CO2 emissions), and with carbon budgets to limit GHG emissions to 3018 Mt CO2-equivalent over the five years 2008–2012 and 2782 Mt over 2013–2017. Counting only the CO2 component, which is approximately 80% of the total, emissions cumulated to 2477 Mt over 2008–2012 and, to date, to 2039 Mt over 2013–2017, both below the sub-targets, allowing 20% for other GHG emissions while still hitting those overall targets.

To test the UK's achievement of its 2008 CCA targets for CO2, the above 5-year total targets were translated into annual magnitudes, starting 20 Mt above and ending 20 Mt below the average target for the period. However, our test does not depend greatly on the within-period allocation, which affects any apparent residual autocorrelation (not significant, but the sample is small). We then scaled these annual targets by 0.8 as the share of CO2 in the total greenhouse gases emitted by the UK, shown in Figure 7.12(a). As a decade has elapsed since the Act, there were 10 annual observations on CO2 emissions to compare to the targets, and we calculated a test of the difference between targets and outcomes being zero, but starting in 2009, as the Act could not have greatly influenced emissions in its year of implementation. A graph of those differences is shown in Figure 7.12(b). The null of 'emissions = targets' is strongly rejected on the negative side, with a mean of −18 and a zero-innovation error t-test value of −2.67 (p < 0.03; t = −1.99 correcting estimated standard errors for residual autocorrelation and heteroskedasticity), or, as in Panel (b), a downward step of −46.8 starting in 2013 with a t of −5.9. A similar approach could be used to evaluate the extent to which countries met their Paris Accord Nationally Determined Contributions (NDCs), given the relevant data.
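An added sketch of the targets-versus-outcomes test just described: spread each 5-year GHG budget into annual targets running from 20 Mt above to 20 Mt below the period average (a linear tilt is assumed here, which the original does not specify exactly), scale by 0.8 for the CO2 share, and test whether the mean difference from recorded CO2 emissions is zero. The emissions series itself must come from the sources in §7.1.

import numpy as np

def annual_targets(budget_mt, years=5, tilt=20.0):
    """Annual targets from a 5-year budget, from +tilt above to -tilt below the period mean."""
    mean = budget_mt / years
    return mean + np.linspace(tilt, -tilt, years)

targets_ghg = np.concatenate([annual_targets(3018), annual_targets(2782)])  # 2008-2017 budgets
targets_co2 = 0.8 * targets_ghg          # scale by the CO2 share of UK GHG emissions

# With the recorded CO2 outcomes for 2009-2017:
#   diff = co2_outcomes - targets_co2[1:]          # drop 2008, as in the text
#   import statsmodels.api as sm
#   res = sm.OLS(diff, np.ones_like(diff)).fit(cov_type="HAC", cov_kwds={"maxlags": 2})
#   print(res.params, res.tvalues)                 # mean difference and its HAC t-value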
Thus, the UK has reduced its emissions faster than the targets, and in 2017 was already below the implicit target for 2018. Indeed, the budget for 2018–2022 of 2544 Mt, roughly 410 Mt p.a. of CO2, is undemanding given the 2017 level of 368 Mt, but should not induce complacency, as the easiest reductions have been accomplished, with coal use now almost negligible. The NDCs agreed at COP21 in Paris are insufficient to keep temperatures below 2°C, so must be enhanced, and common time frames must be adopted to avoid a lack of transparency in existing NDCs: see Rowan (2019). Since the baseline dates from which NDCs are calculated are crucial, 5-year NDC reviews and evaluation intervals are needed.

Figure 7.12: (a) UK CO2 emissions and Climate Change Act 2008 CO2 targets; (b) deviations from targeted values with a step indicator; (c) scenario reductions required in coal and oil use for the original 2050 target; (d) resulting reductions in CO2 emissions from (7.6). In (c) and (d), the horizon is compressed to 5-year intervals after 2017.

7.12 Can the UK Reach Its CO2 Emissions Targets for 2050?

For CO2 emissions to meet their share by 2050 of the 80% drop from the 1990 baseline of 590 Mt, as required by the 2008 CCA, they would need to fall to about 120 Mt p.a. To illustrate, we simulate a scenario with no coal usage, quite a possibility given that coal is to be banned from electricity generation from 2025, and a 70% fall in oil use, to around 20 Mt p.a., from greatly increased use of non-gasoline vehicles sustained by expanded renewables and alternative engines. The outcome is shown in Figure 7.12, Panel (c). The horizon is compressed after 2017, as the timing of such dramatic reductions is highly uncertain. Implicitly, reduced dependence on natural gas to under 35 Mtoe p.a. (a 75% reduction) is required, potentially replaced by hydrogen, which the UK used to burn (though then derived from coal gas) before the switch starting in 1969 discussed above. With about a quarter of CO2 emissions coming from agriculture, construction and waste (currently about 100 Mt p.a.), a serious effort to reduce those by much more than half is also entailed.

Panel (d) records the resulting trajectory for CO2 emissions, falling from around the 2015 level of 400 Mt p.a. to about 120 Mt p.a., or around 1.8 tonnes per capita p.a., down from 12.4 tonnes per capita p.a. in 1970. The point and interval 'forecasts' are at constant K and G, and assume the parameters of (7.6) remain constant despite the major shift: increases in K and G would make the targets harder to achieve unless they were carbon neutral. However, given the key role of the capital stock in explaining the UK's CO2 emissions since 1860, as K embodies the vintage of the technology at the time of its construction and is long lived, the transition to zero carbon has to be gradual, and necessitates that new capital, and indeed new infrastructure in general, be zero-carbon producing.
As a 'policy' projection, together these measures would reach the UK's 2050 target announced in 2008, but only if such reductions, perhaps with offsetting increases, could actually be achieved. The rapidly falling costs of renewable-energy sources like solar cells and wind turbines (see e.g., Farmer and Lafond, 2016), combined with improved storage methods, should substantially reduce oil and gas use in electricity production.

Table 7.5 records recent estimates of electricity generating costs in £/MWh by different technologies. Onshore wind turbines have fallen in cost and increased in efficiency so rapidly over the past two decades that, for the UK at least, they offer the lowest cost alternative. Offshore wind turbines have also fallen greatly in cost and increased in efficiency, so now also offer a low-cost method of electricity generation (with the incidental benefit of creating marine reserves and saltwater fish sanctuaries), even below natural gas combined-cycle turbines before the costs of carbon capture and storage (CCS) are included.5 Solar photovoltaics come next (and this is the UK!) if CCS is enforced, though both require large backup electricity storage systems for (e.g.,) windless nights.

Table 7.5: Electricity generating technology costs in £/MWh (megawatt hour)

Power generating technology                          Low   Central   High
Nuclear      PWR (Pressurized Water Reactor) (a)      82      93      121
Solar        Large-scale PV (Photovoltaic)            71      80       94
Wind         Onshore                                  47      62       76
Wind         Offshore                                 90     102      115
Biomass                                               85      87       88
Natural Gas  Combined Cycle Gas Turbine               65      66       68
Natural Gas  CCGT with CCS                           102     110      123
Natural Gas  Open-Cycle Gas Turbine                  157     162      170
Coal         Advanced Supercritical                  124     134      153
Coal         Oxy-comb. CCS / IGCC with CCS (b)       137     148      171

Note: (a) New nuclear power has a guaranteed strike price of £92.50/MWh for Hinkley Point C in 2023; (b) IGCC = Integrated Gasification Combined Cycle. Source: Electricity Generation Costs, Department for Business, Energy and Industrial Strategy (BEIS), November 2016.

Footnote 5: https://institutions.newscientist.com/article/2217235-the-cost-of-subsidising-uk-wind-farms-has-dropped-to-an-all-time-low/ shows a more recent cost of £40/MWh, now highly competitive.

Renewables' share of overall electricity generation reached a peak of 60.5% at one stage in April 2020, according to National Grid data. Increased outputs of renewable electricity will reduce the volume of emissions for a given level of energy production, by also reducing usage of oil in transport through electric car use, but should not influence the constancy of the empirical models above, which are conditional on the volumes of coal and oil included.

A probable reason for the sharp fall in coal use in 2017 is a rise in its price relative to those of other energy sources, with the UK carbon tax doubling in 2015 to £18 per tonne of CO2. Conversely, natural gas use has increased 3.5-fold since the mid-1980s, so although it produces less than half the CO2 emissions of coal per Btu, it still contributes about 140 Mt p.a. to CO2 emissions. The use of oil in transport will take longer to reduce, but more efficient engines (with diesel being phased out completely given its toxic pollutants), most vehicles powered from renewable sources, and much higher taxes on gasoline offer a route to the next stage of CO2 emissions reductions.

In 2019, the UK Government amended the original CCA target to zero net emissions by 2050. All sources of CO2 emissions must then fall to a level such that carbon capture and sequestration (CCS), possibly combined with atmospheric CO2 extraction methods, would remove the rest.
Facing an almost certain irreducible non-zero minimum demand for oil and gas (e.g., for chemicals), achieving the Paris COP21 target of zero net emissions before 2050 requires really major technological change, almost certainly involving the development of current research avenues into removing or using existing CO2: see https://phys.org/news/2014-09-carbon.html. Net zero is an excellent target, but incredibly difficult to achieve, and as yet there is no sensible strategy to do so, although we now discuss some highly speculative routes.

To meet the net zero target, overall natural gas use would need to be reduced to near zero, like coal. Natural gas is mainly used for electricity production and household indoor and water heating. The former could be handled in part by increased renewable sources, but as potentially serious storage problems remain (more on this shortly), research effort should be devoted to developing safe small modular nuclear reactors (SMRs) based on the well-developed nuclear-powered engines in submarines.6 These SMRs might be able to use thorium or the 'spent' uranium fuel rods from older reactors, helping reduce the serious problem of existing nuclear-waste disposal. Household use could be reduced by increased taxes on natural gas and oil usage, encouraging the adoption of solar panels and (e.g.,) air heat pumps, as well as by switching the national gas system back to its pre-1969 hydrogen basis.7 Over the next 20–30 years, with ever improved technologies and consequent cost reductions in generating electricity by renewables, a zero target does not seem impossible for electricity and gas without requiring reductions in GDP growth, perhaps even increasing it with new opportunities.

Footnote 6: www.world-nuclear.org/info/Nuclear-Fuel-Cycle/Power-Reactors/Small-Nuclear-Power-Reactors/.
Footnote 7: Such taxes are to change behavior, not to raise revenue, so should be redistributed to families facing fuel poverty.

Next, road transport decarbonisation is progressing slowly using lithium-ion battery powered electric vehicles, which have relatively short journey ranges yet take a non-negligible time to recharge, discouraging the replacement of internal combustion engines. Huge advances have occurred in recent years both in understanding the properties of graphene and in its cost of production (see 'graphene in a flash' from plastic waste in https://phys.org/news/2020-01-lab-trash-valuable-graphene.html). Graphene nanotubes (GNTs) can act as electrode supercapacitors (see e.g., https://www.nature.com/articles/s41598-020-58162-9). Thus, one could imagine an electric vehicle that sandwiched an array of GNTs between two Faraday cages over the roof and above the inside of a
vehicle, or by a prefabricated modular unit fitted to the existing roof (perhaps even retrofitted on existing car roofs) to power an electric motor, so the vehicle becomes the battery. GNTs seem capable of rapid charging, and should be able to sustain viable distances on a single charge. Moreover, if successful, once most vehicles were like that, mandating that they be plugged in when not in use would make a vast electricity storage system available for no additional investment, so renewable sources of electricity could be widely adopted without worrying about security of supply. There are undoubtedly many key technical issues needing research as to how such a system would work in practice, some ongoing, such as developing 2-dimensional tri-layers of graphene as an insulator, superconductor and magnet.8

Footnote 8: See https://www.graphene-info.com/graphene-triples-superconducting-insulating-and-ferromagnetic.

The potential benefits of such a power source could be huge as a 'sensitive intervention point' (SIP: see https://science.sciencemag.org/content/364/6436/132). By not demonising road transport for its CO2 footprint and dangerous pollution, cars with internal combustion engines could be replaced at a rate matching the increased need for storage from the extension of renewables. The basics of electric engines are established, so employment can be maintained in vehicle manufacture and all its ancillary industries, as well as in new graphene-based ones. Two side benefits are a major reduction in both mining for lithium and the later disposal, or recycling, of the resulting toxic battery waste; and eliminating the need for expensive catalytic converters, cutting production costs markedly, eliminating a target for theft (which then exacerbates air pollution), and reducing palladium mining. Indirect consequences could solve the UK rail system's problem of a lack of electrification across much of the network, by replacing diesel-electric trains with GNT-supplied electric ones, although some progress is occurring with hydrogen-driven trains in Germany and the UK (see https://www.birmingham.ac.uk/research/spotlights/hydrogen-powered-train.aspx); and, even more speculatively, as GNTs are so light, possibly short-haul electric aircraft. As an historical aside, we noted above that electric cars date back before the 1880s, so an all-electric transport system is just going back to where society might have been 140 years ago.

However, agriculture, construction, the chemical industry and waste management look more problematic, although there is progress in efficiency improvements.
Inner-city vertical and underground farms economize on water, fertilizer and energy (especially from transport reductions) and are increasingly viable given the falls in costs of LED lighting (see e.g., www.scientificamerican.com/article.cfm?id=the-rise-of-vertical-farms). There is considerable research on altering farm mammal diets to reduce methane emissions, including adding dietary fumaric acid (from plants like lichen and Iceland moss), with which lambs showed reductions of up to 70% (e.g., https://phys.org/news/2008-03-scientists-cow-flatulence.html). Changes to human diets are also en route, and need encouraging: on a small scale, see http://www.climateeconometrics.org/2020/03/18/nuffield-colleges-decreasing-food-emissions/. Prefabrication of highly insulated dwellings must be a priority, as well as using less GHG-intensive building materials. Recycling more, using more waste for fuel, and landfilling less to reduce methane are all essential.

The UK's total 'consumption induced' CO2-equivalent emissions are higher than the domestic level through CO2 embodied in net imports,9 although the large reductions achieved to date have a major domestic component, and of course 'consumption induced' CO2 will fall as the CO2 intensity of imports falls with reductions in exporting countries. However, targeting consumption rather than production emissions has the unwanted consequence of removing any incentive for emitting industries or exporting countries to improve their performance, as these emissions would not be counted against them (e.g., if NDCs used a consumption basis). Border carbon taxes have a role to play in improving both exporters' and importers' performance. Similarly, allocating emissions from transport and packaging to (say) the food sector would again relieve those intermediate sectors of the responsibility to invest to reduce what are in fact their emissions, by attributing them to retail outlets or consumers. Conversely, the purchasing clout of large retail chains can pressure suppliers to improve, as (e.g.,) Walmart is doing.10

Footnote 9: See http://www.emissions.leeds.ac.uk/chart1.html and https://www.biogeosciences.net/9/3247/2012/bg-9-3247-2012.html.
Footnote 10: See https://corporate.walmart.com/newsroom/2016/11/04/walmart-offers-new-vision-for-the-companys-role-in-society.

The aggregate data provide little evidence of high costs to the reductions achieved in CO2 emissions, which have dropped by 186 Mt, from 554 Mt to 368 Mt (34%), so far this century, during which period real GDP has risen by 35%, despite the 'Great Recession' but before the pandemic. Historically, those in an industry that was being replaced (usually by machines) lost out and bore what should have been the social costs of change, from cottage spinners, weavers and artisans in the late 18th and early 19th centuries (inducing the 'Luddites'), to recent times (from a million coal miners in 1900 to almost none today).
There is a huge difference in the impacts of substitutes and complements for existing methods: motor vehicles were a huge advance, and created many new jobs directly and indirectly, mainly replacing horses but indirectly destroying their associated workforce. Although not a direct implication of the aggregate model here, greater attention needs to be focused on the local costs of lost jobs as new technologies are implemented: mitigating the inequality impacts of climate-induced changes ought to matter centrally in policy decisions.

Given the important role of the capital stock in the model above, 'stranded assets' in carbon-producing industries are potentially problematic as future legislation imposes ever lower CO2 emissions targets to achieve zero net emissions (see Pfeiffer et al., 2016). As argued by Farmer et al. (2019), exploiting sensitive intervention points in the post-carbon transition could be highly effective, and they cite the UK's Climate Change Act of 2008 as a timely example that had a large effect. An excellent 'role model' that offers hope for reductions in other energy uses is the dramatic increase in lumen-hours per capita consumed since 1300, of approximately 100,000-fold, yet at one twenty-thousandth of the price per lumen-hour (see Fouquet and Pearson, 2006).

7.13 Climate-Environmental Kuznets Curve

The 'environmental Kuznets curve' is assumed to be a ∩-shaped relationship between pollution and economic development: see Dasgupta et al. (2002) and Stern (2004). For a 'climate-environmental Kuznets curve', we estimated a regression of the log of CO2 emissions, denoted et (lower case denotes logs), on the log of real GDP, gt (unscaled here), and its square g²t, which delivered:

êt = −31.5 + 6.13 gt − 0.247 g²t    (7.7)
     (1.6)   (0.27)    (0.012)

σ̂ = 0.091   R² = 0.91   FAR(2, 145) = 37.3∗∗   FARCH(1, 148) = 0.26   FHet(3, 146) = 1.70
χ²nd(2) = 68.3∗∗   FReset(2, 145) = 10.61∗∗   FChow(5, 147) = 3.06∗   Fnl(6, 141) = 8.72∗∗.

Many of the diagnostic tests are significant, and both FReset and Fnl reveal that not all of the non-linearity has been captured by (7.7). Indeed, (7.7) has a borderline rejection on the parameter-constancy test, but the rejections on the other mis-specification tests make that difficult to interpret. Full-sample impulse-indicator saturation (IIS) selected 17 indicators at a significance level of 0.1%, but still led to Fnl(6, 128) = 11.5∗∗, with σ̂ = 0.055.

The relationship between log CO2 emissions and log real GDP is plotted in Figure 7.13. The large drop in CO2 emissions while GDP more than doubled is notable, and reflects improved technology in energy use as well as a changing mix of fuels. Although the non-linearity is marked, there are large and systematic deviations from the fitted curve, shown inside ellipses for the start and end of the sample, 1921 and 1926, and the 1930s and 1940s. Since the final model in (7.6) is linear in CO2 emissions and log-linear in GDP, a natural question is whether it can account for the non-linearity of the 'climate Kuznets curve' in Figure 7.13.
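Before turning to that question, note that the quadratic regression (7.7) is simple to reproduce; a minimal added sketch (hypothetical file and column names, with g here the unscaled log of GDP) also computes the turning point of the fitted ∩-shape implied by the published coefficients.

import numpy as np
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("uk_co2_annual.csv", index_col="year")     # hypothetical file
e = np.log(df["E"])
g = np.log(df["GDP"])
X = sm.add_constant(pd.DataFrame({"g": g, "g2": g**2}))
kuznets = sm.OLS(e, X).fit()

# With the published coefficients, the fitted curve peaks at
# g* = 6.13 / (2 * 0.247) ≈ 12.4, well inside the sample range of log GDP.
print(kuznets.params, 6.13 / (2 * 0.247))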
That question is answered in Figure 7.14, where the log of the fitted values from (7.6) is cross-plotted against log(GDP) together with the log(CO2) data, revealing the same non-linearity even though log(GDP) enters the equilibrium-correction mechanism in (7.5) negatively and is insignificant. The regression of log(CO2) on the log of the fitted values from (7.6) had σ̂ = 0.019. Of course that better explanation is greatly enhanced by using coal and oil, but conversely is after translation into logs.

[Figure 7.13: Scatter plot of the log of CO2 emissions against the log of GDP (observations labelled by date, 1860–2018), with the fitted values from Equation (7.7) shown by the line.]

Thus, the 'curvature' of an eventually declining relationship between log CO2 emissions and log real GDP is an artefact of both being correlated with technology. Had electricity been discovered in 1300, batteries several decades later, rather than waiting for Volta in 1800, and solar-cell technologies a few decades after that, and so on, all of which depended on knowledge and understanding rather than income levels per se, an electrical world economy might have circumvented the need for coal. Conversely, if neither electricity nor the internal combustion engine had been discovered, leaving only coal as a fuel source, efficiency improvements or lower usage would have been the only routes to reductions in CO2 emissions. Relative costs of energy provision matter, and Table 7.5 showed recent power generating costs, but the metaphor suggests a 'climate Kuznets curve' is mainly a technology-driven relation. Income levels may matter more for other environmental relations, such as clean air and less polluted rivers.

[Figure 7.14: Plot of UK log(CO2) emissions and log fitted values against log(GDP): re-creating a 'climate Kuznets curve'.]

8 Conclusions

Climate econometrics aims to apply econometric methods to augment our understanding of climate change and the interactions between human actions and climate responses. The field has evolved rapidly over the last few years from a pressing need to understand the science, economics, health and policy implications of climate change. Time-series econometrics is well placed to offer insights on these issues, as the methodology has been developed to model complex, evolving and shifting interactions over time due to human behavior.
Although originally applied to economic data, the methodology is applicable to climate-economic research, as anthropogenic forces play a key role in determining climate and vice versa. This review aimed to explain the tools developed at Climate Econometrics (http://www.climateeconometrics.org/) to disentangle complex relationships between human actions and climate responses and their associated economic effects, masked by stochastic trends and breaks. Empirical applications to climate problems demonstrate the benefits of applying data-based methods jointly with underlying theory to improve our understanding of climate phenomena, to assist in forecasting future outcomes and to provide policy guidance.

We described novel modeling approaches to climate-economic data, combining insights from climate theory with empirical evidence to discover new results. We embed theory models in far larger information sets, allowing new features, dynamics, outliers, breaks and non-linearities to be discovered, while retaining established theory. One of the fundamental aspects of the modeling approach is the use of multi-path selection, which enables more candidate variables than observations to be explored. This opens the door to a wide range of indicator saturation estimators to model outliers, breaks, distributional shifts and non-constancies. We discuss the costs of selection relative to mis-specification and show the remarkably small costs associated with searching over large numbers of candidate variables, thus enabling wide-sense non-stationary data to be modeled.

The monograph emphasizes the importance of taking account of the non-stationary nature of time series, both stochastic trends and distributional shifts. The example in Section 3, looking at distances traveled by car and human road fatalities, illustrates the hazards of not correctly dealing with non-stationary time series. Such an example highlights that mistaken inferences and possibly false causal attribution can occur if the data properties are not carefully taken into account and relationships modeled with rigorous testing.

Our short excursion into climate science in Section 4 discusses the Earth's atmosphere and oceans, demonstrates that humanity can easily alter these, and shows it is doing so. The composition of the atmosphere, the roles of CO2 and other greenhouse gases, and the consequences of changes in atmospheric composition are discussed, focusing on the impacts of climate change on the 'great extinctions' over geological time. Section 5 notes that the consequences of the Industrial Revolution in the UK during the 18th century have been both good and bad, greatly raising living standards worldwide, but leading to dangerous levels of CO2 emissions from using fossil fuels.

The econometric approach outlined in Section 2 was illustrated by two detailed applications. Section 6 applies the modeling approach to the last 800,000 years of Ice Ages to illustrate its practical details. The theoretical model of Ice Ages is based on variations in the Earth's orbit, which determine the solar radiation reaching the planet, where and when it is most concentrated, and hence the speed with which glacial periods occurred and later retreated. But the question then arises: if Ice Ages are due to orbital variations, why should CO2 levels correlate so highly negatively with land ice volume?
Ice, CO2 and temperature are modeled as jointly endogenous functions of the orbital variables in a 3-variable simultaneous equations system, applying saturation estimation on the system to model outliers, along with dynamics and non-linearities to capture interaction effects. The approach embeds the theory and allows for dynamics, non-linearities, non-stationarities and endogeneity in a system, rebutting concerns that the modeling approach is inherently single equation. The evidence suggests that CO2 was an endogenous response to orbital drivers over Ice Ages, jointly with ice volume and temperature, albeit now mainly determined by anthropogenic sources. Looking into the future with CO2 changing to an exogenously determined value set by anthropogenic emissions points to temperatures dangerously in excess of the peak values measured over the Ice Ages.

Section 7 developed an explanation of the United Kingdom's CO2 emissions data over 1860–2017 in terms of coal and oil usage, capital stock and GDP, taking account of their non-stationary nature, with many turbulent periods and major shifts over the 157 years. Having been first into the Industrial Revolution that has transformed the world's wealth at the cost of climate change, the UK is one of the first out in terms of its CO2 emissions; the UK's total CO2 emissions have dropped below the level first reached in 1894, and per capita UK CO2 emissions are now below their level in 1860, when the UK was the 'workshop of the world', and yet per capita real incomes are more than 7-fold higher.

The econometric approach to modeling such dramatic changes was explained in four steps. The key explanatory variables were coal and oil usage and capital stock, whereas GDP had an insignificant effect in levels given the other explanatory variables, possibly reflecting a move away from manufacturing to a service economy, notwithstanding which, the model implies a non-linear 'climate Kuznets curve' between emissions and GDP. Compared to directly fitting a 'climate Kuznets curve' as in (7.7), the resulting model highlights the benefits of the more general methodology. Improvements in multi-step forecasts also highlighted the advantages of taking account of in-sample outliers and shifts using impulse- and step-indicator saturation, despite those creating more candidate variables to select over than observations.

The UK's climate policy has been effective, and the resulting large emissions reductions have not yet involved major aggregate sacrifices. The UK's reductions (e.g., of 24 Mt in 2016) are the more impressive against recent recorded global annual increases of 3.3 ppm (remember that 1 ppm = 7.81 gigatonnes of CO2). However, local losses in incomes and employment from changes in fuel production have not been addressed, and 'stranded assets' could be a potential problem if future legislation imposes lower CO2 emissions targets. The UK's target of a 100% reduction from the 1990 baseline of 590 Mt of emissions is only achievable with complete reductions in oil and gas use and in other sources, net of increased re-absorption of CO2.

We conclude by briefly describing some of the other applications of methods developed at Climate Econometrics, including to panels and cross-sections, to demonstrate the diverse array of problems to which our approaches can be applied, including to health care as well as economic modeling. Their common feature is to emphasize the role of human behavior in climate change, additional to well-established physical processes.
To determine any anthropogenic signature in global CO2 concentrations, Hendry and Pretis (2013) modeled anthropogenic and natural contributions to atmospheric CO2 using a large number of potential explanatory variables: their general unrestricted model had 492 variables for 246 monthly observations. Despite there being approximately 10^148 possible models to select over (2^492 combinations of the 492 candidate variables), Autometrics estimated just 571 models at α = 0.001, and finally selected 14 variables. No deterministic terms were retained, so all the explanation of changes in global CO2 concentrations came from stochastic sources, demonstrating that the modeling approach does not impose any particular stance on likely causes. Increases in cumulated changes in vegetation reduced atmospheric levels of CO2 by about 2 ppm with large seasonal fluctuations, whereas the Southern Oscillation Index (SOI) increased them by about 3 ppm, both over 1981(7)–2002(12), revealing small net effects from natural sources. However, the cumulation of anthropogenic sources over that period produced a strong trend from 340 ppm to almost 380 ppm, closely matching the CO2 measured at Mauna Loa, and consistent with isotope-based measures.

Pretis et al. (2015a) applied SIS to objectively investigate a slowdown in the rise of global mean surface temperature (called the hiatus in warming). Their results indicated that, when temperature is modeled by anthropogenic forcings and natural variability such as solar insolation, the hiatus was not statistically different from the rest of the 1950–2012 temperature record. They also found no evidence that the slowdown in temperature increases was uniquely tied to episodes of La Niña-like cooling.

Pretis and Roser (2017a) compared socio-economic scenarios created in 1992 and 2000 against the observational record to investigate the possible decoupling of economic growth and fossil-fuel CO2 emissions. They showed that global emission intensity (fossil-fuel CO2 emissions relative to GDP) rose in the first part of the 21st century, primarily from the under-predicted rapid growth in Asia, counter to some climate projections foreseeing a decline. Nevertheless, the wide spread of temperature changes in climate projections did not predominantly originate from uncertainty across climate models, but from the broad range of different global socio-economic scenarios and their implied energy production. As discussed in §2.9 above, forecasting the future when human behavior is involved is always prone to unanticipated shifts.

Kaufmann et al. (2017) demonstrated that spatial heterogeneity of climate-change experiences mattered, since skepticism about whether the Earth was warming was greater in areas exhibiting cooling relative to areas that had warmed. Moreover, recent cooling could offset historical warming to enhance skepticism. While climate change is due to global warming, the former is a better epithet since not all areas warm uniformly as a consequence of changes.

Pretis et al. (2018b) applied a panel-data version of IIS to analyze the potential impacts on economic growth of stabilizing global temperatures at either 1.5 °C or 2 °C above pre-industrial levels. They estimated a non-linear response of changes in global annual per capita GDP growth to average annual temperatures, both without and with IIS. They found that stabilizing temperature increases at 1.5 °C did not lead to major costs in terms of reduced per capita GDP growth across countries, although some already poor countries in the tropics would suffer.
However, temperatures rising by 2 °C would inflict serious losses on most already-hot countries, lowering their projected GDP per capita growth by up to 2.5% per annum, consistent with other empirical estimates. At a global level, 1.5 °C is preferable to 2 °C above pre-industrial levels and is still (possibly) achievable.

Tropical cyclones, including hurricanes in the West Atlantic and typhoons in the Pacific, are often highly destructive: see Emanuel (2005). Four of the five costliest US natural disasters have been caused by relatively recent hurricanes, so Martinez (2020) applied the model selection methods and saturation estimators described above to examine whether improvements in forecasting could reduce their damages, as earlier warnings would allow more time to prepare or evacuate. A measure of forecast uncertainty was added to the main natural and human channels determining damages, the former including highest wind speeds, minimum pressure, maximum storm surge, maximum rainfall, seasonal cyclone energy, historical frequency of hurricanes, soil moisture content and air temperature, and the latter including income, population, and the number of housing units in the area at risk, together with other candidate variables such as the time of year and the location hit. Applying Autometrics with IIS to the 98 hurricane strikes on the Eastern USA since 1955, minimum pressure, maximum storm surge, maximum rainfall, income, housing units, and forecast uncertainty were selected.1

1 His web site https://sites.google.com/view/andrewbmartinez/current-research/damage-prediction-tool provides a tool based on his model for predicting the damages after a hurricane strike to the Atlantic coast of the USA, which has proved accurate for most of the hurricanes in recent years.

All of these studies had theory guidance in formulating their approaches, but the econometric techniques discussed above also allow for direct linking of climate models with empirical data, as in Pretis (2019), to further improve econometric research on human responses to climate variability. This monograph has emphasized the need for handling all modeling decisions jointly, allowing for key forms of wide-sense non-stationarity, facing possibly incorrect theories and mis-measured data. Few approaches in either climate or economic modeling as yet consider all such effects jointly, but a failure to do so can lead to mis-specified models and hence incorrect theory evaluation, misleading forecasts and poor policy analyses. The software to implement our approach is available as Autometrics in OxMetrics 8, in XLModeler for Excel, and as Gets in R, so can be readily accessed.

Acknowledgements

Financial support from the Robertson Foundation (award 9907422), the Institute for New Economic Thinking (grant 20029822) and Nuffield College is gratefully acknowledged, as are valuable contributions to the background research from Jurgen A. Doornik, Luke P. Jackson, Søren Johansen, Andrew B. Martinez, Bent Nielsen and Felix Pretis. All calculations and graphs use PcGive (Doornik and Hendry, 2018) and OxMetrics (Doornik, 2018).
We thank Rob Engle, Neil Ericsson, Vivien Hendry, Luke Jackson, Jonas Kurle, Andrew Martinez, Susana Martins, Bent Nielsen, Felix Pretis, Matthias Qian, Ryan Rafaty, Moritz Schwarz, Bingchen Wang and Angela Wenham for helpful comments, and Thomas Sterner for the invitation to deliver a plenary session at the 2018 World Conference in Environmental and Resource Economics in Gothenburg, on which this monograph is based.

References

Agassiz, L. (1840). Études sur les glaciers. Neuchâtel: Imprimerie de OL Petitpierre. Digital book on Wikisource, accessed on July 22, 2019: https://fr.wikisource.org/w/index.php?title=%C3%89tudes_sur_les_glaciers&oldid=297457. Akaike, H. (1973). "Information theory and an extension of the maximum likelihood principle". In: Second International Symposium on Information Theory. Ed. by B. N. Petrov and F. Csaki. Budapest: Akademia Kiado. 267–281. Allen, R. C. (2009). The British Industrial Revolution in Global Perspective. Cambridge: Cambridge University Press. Allen, R. C. (2017). The Industrial Revolution: A Very Short Introduction. Oxford: Oxford University Press. Andrews, D. W. K. (1991). "Heteroskedasticity and autocorrelation consistent covariance matrix estimation". Econometrica. 59: 817–858. Arrhenius, S. A. (1896). "On the influence of carbonic acid in the air upon the temperature of the ground". London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science (fifth series). 41: 237–275. Beenstock, M., Y. Reingewertz, and M. Paldor (2012). "Polynomial cointegration tests of anthropogenic impact on global warming". Earth Systems Dynamics. 3: 173–188. Berenguer-Rico, V. and J. Gonzalo (2014). "Co-summability: From linear to non-linear co-integration". Working Paper. Oxford University: Economics Department. Berenguer-Rico, V. and I. Wilms (2020). "Heteroscedasticity testing after outlier removal". Econometric Reviews. doi: 10.1080/07474938.2020.1735749. Blundell, S. (2012). Magnetism: A Very Short Introduction. Oxford: Oxford University Press. Bontemps, C. and G. E. Mizon (2008). "Encompassing: Concepts and implementation". Oxford Bulletin of Economics and Statistics. 70: 721–750. Boumans, M. A. and M. S. Morgan (2001). "Ceteris paribus conditions: Materiality and the applications of economic theories". Journal of Economic Methodology. 8: 11–26. Bralower, T. J. (2008). "Earth science: Volcanic cause of catastrophe". Nature. 454: 285–287. Brinkley, C. (2014). "Decoupled: Successful planning policies in countries that have reduced per capita greenhouse gas emissions with continued economic growth". Environment and Planning C: Government and Policy. 32: 1083–1099. Brusatte, S. (2018). "How the dinosaurs got lucky". Scientific American. 318(5): 18–25. Buchanan, P. J., Z. Chase, R. J. Matear, S. J. Phipps, and N. L. Bindoff (2019). "Marine nitrogen fixers mediate a low latitude pathway for atmospheric CO2 drawdown". Nature Communications. 10. https://doi.org/10.1038/s41467-019-12549-z. Burke, M., W. M. Davis, and N. S. Diffenbaugh (2018). "Large potential reduction in economic damages under UN mitigation targets". Nature. 557: 549–553. Burke, M., S. M. Hsiang, and E. Miguel (2015). "Global non-linear effect of temperature on economic production". Nature. 527: 235–239. Caceres, C. (2007). "Asymptotic properties of tests for mis-specification".
Unpublished Doctoral Thesis. Oxford University: Economics Department. https://doi.org/10.1080/07474938.2020.1735749 https://doi.org/10.1080/07474938.2020.1735749 https://doi.org/10.1038/s41467-019-12549-z https://doi.org/10.1038/s41467-019-12549-z 306 References Castle, J. L., M. P. Clements, and D. F. Hendry (2015a). “Robust approaches to forecasting”. International Journal of Forecasting. 31: 99–112. Castle, J. L., J. A. Doornik, and D. F. Hendry (2011). “Evaluating automatic model selection”. Journal of Time Series Econometrics. 3(1). doi: 10.2202/1941-1928.1097. Castle, J. L., J. A. Doornik, and D. F. Hendry (2012). “Model selection when there are multiple breaks”. Journal of Econometrics. 169: 239– 246. Castle, J. L., J. A. Doornik, and D. F. Hendry (2019a). “Multiplicative- indicator saturation”. Working Paper. Oxford University: Nuffield College. Castle, J. L., J. A. Doornik, and D. F. Hendry (2020a). “Modelling non- stationary ‘big data’”. Working Paper No. 905. Oxford University: Department of Economics. Castle, J. L., J. A. Doornik, and D. F. Hendry (2020b). “Robust dis- covery of regression models”. Working Paper 2020-W04, Oxford University: Nuffield College. Castle, J. L., J. A. Doornik, D. F. Hendry, and F. Pretis (2015b). “Detecting location shifts during model selection by step-indicator saturation”. Econometrics. 3(2): 240–264. Castle, J. L., J. A. Doornik, D. F. Hendry, and F. Pretis (2019b). “Trend-indicator saturation”. Working Paper. Oxford University: Nuffield College. Castle, J. L., N. W. P. Fawcett, and D. F. Hendry (2010). “Forecast- ing with equilibrium-correction models during structural breaks”. Journal of Econometrics. 158: 25–36. Castle, J. L. and D. F. Hendry (2010). “A low-dimension Portmanteau test for non-linearity”. Journal of Econometrics. 158: 231–245. Castle, J. L. and D. F. Hendry (2014a). “Model selection in under- specified equations with breaks”. Journal of Econometrics. 178: 286– 293. Castle, J. L. and D. F. Hendry (2014b). “Semi-automatic non-linear model selection”. In: Essays in Nonlinear Time Series Econometrics. Ed. by N. Haldrup, M. Meitz, and P. Saikkonen. Oxford: Oxford University Press. 163–197. https://doi.org/10.2202/1941-1928.1097 References 307 Castle, J. L. and D. F. Hendry (2019). Modelling Our Changing World. London: Palgrave McMillan. url: https://link.springer.com/book/ 10.1007%2F978-3-030-21432-6. Castle, J. L., D. F. Hendry, and A. B. Martinez (2017). “Evaluating fore- casts, narratives and policy using a test of invariance”. Econometrics. 5(39). doi: 10.3390/econometrics5030039. Cheng, L., J. Abraham, Z. Hausfather, and K. E. Trenberth (2019). “How fast are the oceans warming?” Science. 363(6423): 128–129. Chow, G. C. (1960). “Tests of equality between sets of coefficients in two linear regressions”. Econometrica. 28: 591–605. Clarke, A. (1993). “Temperature and extinction in the sea: A physiolo- gist’s view”. Paleobiology. 19: 499–518. Clements, M. P. and D. F. Hendry (1995). “Forecasting in cointegrated systems”. Journal of Applied Econometrics. 10: 127–146. Reprinted in T. C. Mills (ed.), Economic Forecasting. Edward Elgar, 1999. Clements, M. P. and D. F. Hendry (1998). Forecasting Economic Time Series. Cambridge: Cambridge University Press. CO2 Program Scripps (2010). The Keeling Curve. La Jolla, CA: Scripps Institution of Oceanography. url: http://scrippsco2.ucsd.edu/ history_legacy/keeling_curve_lessons. Cox, D. R. (1962). “Further results on tests of separate families of hypotheses”. 
Journal of the Royal Statistical Society. B, 24: 406– 424. Crafts, N. F. R. (2003). “Is economic growth good for us?” World Economics. 4(3): 35–49. Croll, J. (1875). Climate and Time in Their Geological Relations, A Theory of Secular Changes of the Earth’s Climate. New York: D. Ap- pleton. Dasgupta, S., B. Laplante, H. Wang, and D. Wheeler (2002). “Con- fronting the environmental Kuznets curve”. Journal of Economic Perspectives. 16: 147–168. Davis, W. M. (2019). “Dispersion of the temperature exposure and economic growth: Panel evidence with implications for global in- equality”. Thesis Submitted in Partial Fulfilment for the MPhil Degree. Oxford: Economics Department. https://link.springer.com/book/10.1007%2F978-3-030-21432-6 https://link.springer.com/book/10.1007%2F978-3-030-21432-6 https://doi.org/10.3390/econometrics5030039 http://scrippsco2.ucsd.edu/history_legacy/keeling_curve_lessons http://scrippsco2.ucsd.edu/history_legacy/keeling_curve_lessons 308 References Dickey, D. A. and W. A. Fuller (1981). “Likelihood ratio statistics for autoregressive time series with a unit root”. Econometrica. 49: 1057–1072. Doob, J. L. (1953). Stochastic Processes. 1990 edition. New York: John Wiley Classics Library. Doornik, J. A. (2008). “Encompassing and automatic model selection”. Oxford Bulletin of Economics and Statistics. 70: 915–925. Doornik, J. A. (2009). “Autometrics”. In: The Methodology and Practice of Econometrics. Ed. by J. L. Castle and N. Shephard. Oxford: Oxford University Press. 88–121. Doornik, J. A. (2018). OxMetrics: An Interface to Empirical Modelling (8th ed). London: Timberlake Consultants Press. Doornik, J. A. and H. Hansen (2008). “An Omnibus test for univariate and multivariate normality”. Oxford Bulletin of Economics and Statistics. 70: 927–939. Doornik, J. A. and D. F. Hendry (2015). “Statistical model selection with big data”. Cogent Economics and Finance. url: http://www.t andfonline.com/doi/full/10.1080/23322039.2015.1045216#.VYE5 bUYsAsQ. Doornik, J. A. and D. F. Hendry (2017). “Automatic selection of mul- tivariate dynamic econometric models”. Unpublished Typescript. University of Oxford: Nuffield College. Doornik, J. A. and D. F. Hendry (2018). Empirical Econometric Mod- elling Using PcGive: Volume I. 8th. London: Timberlake Consultants Press. Doornik, J. A. and K. Juselius (2018). CATS 3 for OxMetrics. London: Timberlake Consultants Press. Duffy, J. A. and D. F. Hendry (2017). “The impact of near-integrated measurement errors on modelling long-run macroeconomic time series”. Econometric Reviews. 36: 568–587. Emanuel, K. (2005). Divine Wind: The History and Science of Hurri- canes. Oxford: Oxford University Press. Engle, R. F. (1982). “Autoregressive conditional heteroscedasticity, with estimates of the variance of United Kingdom inflation”. Economet- rica. 50: 987–1007. http://www.tandfonline.com/doi/full/10.1080/23322039.2015.1045216#.VYE5bUYsAsQ http://www.tandfonline.com/doi/full/10.1080/23322039.2015.1045216#.VYE5bUYsAsQ http://www.tandfonline.com/doi/full/10.1080/23322039.2015.1045216#.VYE5bUYsAsQ References 309 Engle, R. F. and D. F. Hendry (1993). “Testing super exogeneity and invariance in regression models”. Journal of Econometrics. 56: 119– 139. Engle, R. F., D. F. Hendry, and J.-F. Richard (1983). “Exogeneity”. Econometrica. 51: 277–304. Erickson, D., R. Mills, J. Gregg, T. J. Blasing, F. Hoffmann, R. Andres, M. Devries, Z. Zhu, and S. Kawa (2008). 
“An estimate of monthly global emissions of anthropogenic CO2: Impact on the seasonal cycle of atmospheric CO2”. Journal of Geophysical Research. 113: G01023. Ericsson, N. R. (1983). “Asymptotic properties of instrumental variables statistics for testing non-nested hypotheses”. Review of Economic Studies. 50: 287–303. Ericsson, N. R. (2012). “Detecting crises, jumps, and changes in regime”. Working Paper. Federal Reserve Board of Governors, Washington, DC. Ericsson, N. R. (2017a). “How biased are U.S. Government Forecasts of the Federal Debt?” International Journal of Forecasting. 33: 543– 559. Ericsson, N. R. (2017b). “Interpreting estimates of forecast bias”. Inter- national Journal of Forecasting. 33: 563–568. Ericsson, N. R. and J. G. MacKinnon (2002). “Distributions of error correction tests for cointegration”. Econometrics Journal. 5: 285– 318. Erwin, D. H. (1996). “The mother of mass extinctions”. Scientific American. 275(1): 72–78. Erwin, D. H. (2006). Extinction: How Life on Earth Nearly Ended 250 Million Years Ago. Princeton: Princeton University Press. Estrada, F., P. Perron, and B. Martínez-López (2013). “Statistically de- rived contributions of diverse human influences to twentieth-century temperature changes”. Nature Geoscience. 6: 1050–1055. Farmer, J. D., C. Hepburn, M. C. Ives, T. Hale, T. Wetzer, P. Mealy, R. Rafaty, S. Srivastav, and R. Way (2019). “Sensitive intervention points in the post-carbon transition”. Science. 364(6436): 132–134. Farmer, J. D. and F. Lafond (2016). “How predictable is technological progress?” Research Policy. 45: 647–665. 310 References Feinstein, C. H. (1972). National Income, Expenditure and Output of the United Kingdom, 1855–1965. Cambridge: Cambridge University Press. Fisher, F. M. (1966). The Identification Problem in Econometrics. New York: McGraw Hill. Fouquet, R. and P. J. G. Pearson (2006). “Seven centuries of energy services: The price and use of light in the United Kingdom (1300– 2000)”. Energy Journal. 27: 139–178. Fullerton, R. L., B. G. Linster, M. McKee, and S. Slate (2002). “Using auctions to reward tournament winners: Theory and experimental investigations”. RAND Journal of Economics. 33: 62–84. Gamber, E. N. and J. P. Liebner (2017). “Comment on ‘How biased are US government forecasts of the federal debt?’” International Journal of Forecasting. 33: 560–562. Geikie, A. (1863). “On the phenomena of the glacial drift of Scotland”. Transactions of the Geological Society of Glasgow. 1: 1–190. Gilbert, W. (1600). De Magnete, Magnetisque Corporoibus, et de Magno Magnete Tellure: Physiologia noua, Plurimis & Argumentis, & Ex- perimentis Demonstrata. Translated by Mottelay, P. F. (1893). ‘On the Loadstone and Magnetic Bodies, and on That Great Magnet the Earth: A New Physiology, Demonstrated with Many Arguments and Experiments’. New York: John Wiley & Sons. London: Peter Short. Godfrey, L. G. (1978). “Testing for higher order serial correlation in regression equations when the regressors include lagged dependent variables”. Econometrica. 46: 1303–1313. Haavelmo, T. (1943). “The statistical implications of a system of simul- taneous equations”. Econometrica. 11: 1–12. Hannan, E. J. and B. G. Quinn (1979). “The determination of the order of an autoregression”. Journal of the Royal Statistical Society. B, 41: 190–195. Harvey, A. C. and J. Durbin (1986). “The effects of seat belt legislation on British road casualties: A case study in structural time series modelling”. Journal of the Royal Statistical Society, Series B. 149: 187–227. 
References 311 Hendry, D. F. (1976). “The structure of simultaneous equations estima- tors”. Journal of Econometrics. 4: 51–88. Hendry, D. F. (1995). Dynamic Econometrics. Oxford: Oxford University Press. Hendry, D. F. (1999). “An econometric analysis of US food expenditure, 1931–1989”. In: Methodology and Tacit Knowledge: Two Experiments in Econometrics. Ed. by J. R. Magnus and M. S. Morgan. Chichester: John Wiley and Sons. 341–361. Hendry, D. F. (2001). “Modelling UK inflation, 1875–1991”. Journal of Applied Econometrics. 16: 255–275. Hendry, D. F. (2006). “Robustifying forecasts from equilibrium- correction models”. Journal of Econometrics. 135: 399–426. Hendry, D. F. (2009). “The methodology of empirical econometric mod- eling: Applied econometrics through the looking-glass”. In: Palgrave Handbook of Econometrics. Ed. by T. C. Mills and K. D. Patterson. Basingstoke: Palgrave MacMillan. 3–67. Hendry, D. F. (2011). “Climate change: Possible lessons for our future from the distant past”. In: The Political Economy of the Environ- ment. Ed. by S. Dietz, J. Michie, and C. Oughton. London: Routledge. 19–43. Hendry, D. F. (2015). Introductory Macro-Econometrics: A New Ap- proach. http://www.timberlake.co.uk/macroeconometrics.html. London: Timberlake Consultants. Hendry, D. F. (2018). “Deciding between alternative approaches in macroeconomics”. International Journal of Forecasting. 34: 119–135, with ‘Response to the Discussants’, 142–146. Hendry, D. F. and J. A. Doornik (2014). Empirical Model Discovery and Theory Evaluation. Cambridge, MA: MIT Press. Hendry, D. F. and N. R. Ericsson (1991). “An econometric analysis of UK money demand in ‘monetary trends in the United States and the United Kingdom’ by Milton Friedman and Anna J. Schwartz”. American Economic Review. 81: 8–38. Hendry, D. F. and S. Johansen (2015). “Model discovery and Trygve Haavelmo’s legacy”. Econometric Theory. 31: 93–114. http://www.timberlake.co.uk/macroeconometrics.html 312 References Hendry, D., S. Johansen, and C. Santos (2008). “Automatic selection of indicators in a fully saturated regression.” Computational Statis- tics & Data Analysis. 33: 317–335. Hendry, D. F. and K. Juselius (2000). “Explaining cointegration analysis: Part I”. Energy Journal. 21: 1–42. Hendry, D. F. and K. Juselius (2001). “Explaining cointegration analysis: Part II”. Energy Journal. 22: 75–120. Hendry, D. F. and H.-M. Krolzig (2005). “The properties of automatic gets modelling”. Economic Journal. 115: C32–C61. Hendry, D. F., M. Lu, and G. E. Mizon (2009). “Model identification and non-unique structure”. In: The Methodology and Practice of Econometrics. Ed. by J. L. Castle and N. Shephard. Oxford: Oxford University Press. 343–364. Hendry, D. F. and G. E. Mizon (1993). “Evaluating dynamic economet- ric models by encompassing the VAR”. In: Models, Methods and Applications of Econometrics. Ed. by P. C. B. Phillips. Oxford: Basil Blackwell. 272–300. Hendry, D. F. and G. E. Mizon (2011). “Econometric modelling of time series with outlying observations”. Journal of Time Series Econometrics. 3(1). doi: 10.2202/1941-1928.1100. Hendry, D. F., A. J. Neale, and F. Srba (1988). “Econometric analysis of small linear systems using PC-FIML”. Journal of Econometrics. 38: 203–226. Hendry, D. F. and F. Pretis (2013). “Anthropogenic influences on atmospheric CO2”. In: Handbook on Energy and Climate Change. Ed. by R. Fouquet. Cheltenham: Edward Elgar. 287–326. Hendry, D. F. and F. Pretis (2016). All Change! 
The Implications of Non-Stationarity for Empirical Modelling, Forecasting and Policy. Oxford University: Oxford Martin School Policy Paper. Hendry, D. F. and C. Santos (2005). “Regression models with data-based indicator variables”. Oxford Bulletin of Economics and Statistics. 67: 571–595. Hendry, D. F. and C. Santos (2010). “An automatic test of super exo- geneity”. In: Volatility and Time Series Econometrics. Ed. by M. W. Watson, T. Bollerslev, and J. Russell. Oxford: Oxford University Press. 164–193. https://doi.org/10.2202/1941-1928.1100 References 313 Hettmansperger, T. P. and S. J. Sheather (1992). “A cautionary note on the method of least median squares”. The American Statistician. 46: 79–83. Heydari, E., N. Arzani, and J. Hassanzadeh (2008). “Mantle plume: The invisible serial killer–Application to the Permian-Triassic boundary mass extinction”. Palaeogeography, Palaeoclimatology, Palaeoecology. 264: 147–162. Hoffman, P. F. and D. P. Schrag (2000). “Snowball Earth”. Scientific American. 282: 68–75. Hsiang, S. (2016). “Climate econometrics”. Annual Review of Resource Economics. 8(1): 43–75. Imbrie, J. E. (1992). “On the structure and origin of major glaciation cy- cles, 1, Linear responses to Milankovitch forcing”. Paleoceanography. 7: 701–738. Jaccard, S. L., E. D. Galbraith, A. Martínez-García, and R. F. Ander- son (2016). “Covariation of deep Southern Ocean oxygenation and atmospheric CO2 through the last ice age”. Nature. 530: 207–210. Jackson, L. P. and D. F. Hendry (2018). “Risk and exposure of coastal cities to future sea-level rise”. Working Paper. Oxford University: INET Oxford. Johansen, S. (1995). Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. Oxford: Oxford University Press. Johansen, S. and B. Nielsen (2009). “An analysis of the indicator saturation estimator as a robust regression estimator”. In: The Methodology and Practice of Econometrics. Ed. by J. L. Castle and N. Shephard. Oxford: Oxford University Press. 1–36. Johansen, S. and B. Nielsen (2016). “Asymptotic theory of outlier detection algorithms for linear time series regression models”. Scan- dinavian Journal of Statistics. 43: 321–348. Jones, C. and P. Cox (2005). “On the significance of atmospheric CO2 growth rate anomalies in 2002–2003”. Journal of Geophysical Research. 32. Jouzel, J., V. Masson-Delmotte, O. Cattani, G. Dreyfus, S. Falourd, and G. E. Hoffmann (2007). “Orbital and millennial Antarctic climate variability over the past 800,000 years”. Science. 317: 793–797. 314 References Kaufmann, R. K. and K. Juselius (2010). “Glacial cycles and solar insolation: The role of orbital, seasonal, and spatial variations”. Climate of the Past Discussions. 6: 2557–2591. Kaufmann, R. K., H. Kauppi, M. L. Mann, and J. H. Stock (2011). “Reconciling anthropogenic climate change with observed tempera- ture 1998–2008”. Proceedings of the National Academy of Science. 108: 11790–11793. Kaufmann, R. K., H. Kauppi, M. L. Mann, and J. H. Stock (2013). “Does temperature contain a stochastic trend: Linking statistical results to physical mechanisms”. Climatic Change. 118: 729–743. Kaufmann, R. K., M. L. Mann, S. Gopala, J. A. Liederman, P. D. Howe, F. Pretis, X. Tanga, and M. Gilmore (2017). “Spatial heterogeneity of climate change as an experiential basis for skepticism”. Proceedings of the National Academy of Sciences. 114(1): 67–71. Kaufmann, R. and K. Juselius (2013). “Testing hypotheses about glacial cycles against the observational record”. Paleoceanography. 28: 175– 184. Keeling, C. D., R. B. 
Bacastow, A. E. Brainbridge, C. A. Ekdahl, P. R. Guenther, L. S. Waterman, and J. F. S. Chin (1976). “Atmospheric carbon dioxide variations at Mauna Loa Observatory, Hawaii”. Tellus. 6: 538–551. Kitov, O. I. and M. N. Tabor (2015). “Detecting structural changes in linear models: A variable selection approach using multiplicative indicator saturation”. Unpublished Paper. University of Oxford. Knutti, R., M. A. A. Rugenstein, and G. C. Hegerl (2017). “Beyond equilibrium climate sensitivity”. Nature Geoscience. 10: 727–736. Koopmans, T. C. (1949). “Identification problems in economic model construction”. Econometrica. 17: 125–144. Koopmans, T. C., ed. (1950a). Statistical Inference in Dynamic Eco- nomic Models. Cowles Commission Monograph. No. 10. New York: John Wiley & Sons. Koopmans, T. C. (1950b). “When is an equation system complete for statistical purposes?” In: Statistical Inference in Dynamic Economic Models. Ed. by T. C. Koopmans. Cowles Commission Monograph. No. 10. New York: John Wiley & Sons. Chap. 17. References 315 Koopmans, T. C. and O. Reiersøl (1950). “The identification of struc- tural characteristics”. The Annals of Mathematical Statistics. 21: 165–181. Kurle, J. K. (2019). “Essays in climate econometrics”. Unpublished M. Phil. Thesis. University of Oxford: Economics Department. Lamb, H. H. (1959). “Our changing climate, past and present”. Weather. 14: 299–317. Lamb, H. H. (1995). Climate, History and the Modern World. Second edition (First ed., 1982). London: Routledge. Lea, D. W. (2004). “The 100,000-yr cycle in tropical SST, greenhouse forcing and climate sensitivity”. Journal of Climate. 17: 2170–2179. Lisiecki, L. E. and M. E. Raymo (2005). “A pliocine-pleistocene stack of 57 globally distributed benthic δ18O records”. Paleoceanography. 20. doi: 10.1029/2004PA001071. Lüthil, D., M. Le Floch, B. Bereiter, T. Blunier, J.-M. Barnola, U. Siegenthaler, D. Raynaud, J. Jouzel, H. Fischer, K. Kawamura, and T. F. Stocker (2008). “High-resolution carbon dioxide concentration record 650,000–800,000 years before present”. Nature. 453. doi: 10.1038/nature06949. Marland, G., T. Boden, and R. Andres (2011). “Global, regional, and national fossil fuel CO2 emissions”. In: Trends: A Compendium of Data on Global Change. Oak Ridge, Tenn., U.S.A.: Carbon Dioxide Information Analysis Center, Oak Ridge National Laboratory, U.S. Department of Energy. url: http://cdiac.ornl.gov/trends/emis/ overview.html. Marland, G. and R. Rotty (1984). “Carbon dioxide emissions from fossil fuels: A procedure for estimation and results for 1950–1982”. Tellus. 36B: 232–261. Martinez, A. B. (2020). “Forecast accuracy matters for Hurricane dam- ages”. Econometrics. url: https://www.mdpi.com/2225-1146/8/2/ 18. Martinez, A. B., J. L. Castle, and D. F. Hendry (2019). “Smooth robust multi-step forecasting methods”. Unpublished Paper. Oxford University: Nuffield College. https://doi.org/10.1029/2004PA001071 https://doi.org/10.1038/nature06949 http://cdiac.ornl.gov/trends/emis/overview.html http://cdiac.ornl.gov/trends/emis/overview.html https://www.mdpi.com/2225-1146/8/2/18 https://www.mdpi.com/2225-1146/8/2/18 316 References Martinez-Garcia, A., A. Rosell-Melé, W. Geibert, R. Gersonde, P. Masqué, and V. E. Gaspari (2009). “Links between iron supply, marine productivity, sea surface temperature, and CO2 over the last 1.1 Ma”. Paleoceanography. 24. doi: 10.1029/2008PA001657. Masson-Delmotte, V., M. Kageyama, P. Braconnot, S. Charbit, G. Krinner, C. Ritz, E. Guilyardi, J. Jouzel, A. Abe-Ouchi, M. Crucifix, R. M. 
Gladstone, C. D. Hewitt, A. Kitoh, A. N. LeGrande, O. Marti, U. Merkel, T. Motoi, R. Ohgaito, B. Otto-Bliesner, W. R. Peltier, I. Ross, P. J. Valdes, G. Vettoretti, S. L. Weber, F. Wolk, and Y. Yu (2006). “Past and future polar amplification of climate change: Climate model intercomparisons and ice-core constraints”. Climate Dynamics. 26: 513–529. Masson-Delmotte, V., B. Stenni, K. Pol, P. Braconnot, O. Cattani, S. Falourd, M. Kageyama, J. Jouzel, A. Landais, B. Minster, J. M. Barnola, J. Chappellaz, G. Krinner, S. Johnsen, R. Röthlisberger, J. Hansen, U. Mikolajewicz, and B. Otto-Bliesner (2010). “EPICA Dome C record of glacial and interglacial intensities”. Quaternary Science Reviews. 29: 113–128. Mayhew, P. J., G. B. Jenkins, and T. G. Benton (2009). “A long-term association between global temperature and biodiversity, origina- tion and extinction in the fossil record”. Proceedings of the Royal Society, B. 275:1630: 47–53. Mee, L. (2006). “Reviving dead zones: How can we restore coastal seas ravaged by runaway plant and algae growth caused by human activities?” Scientific American. 295: 78–85. Meinshausen, M., N. Meinshausen, W. Hare, S. C. Raper, K. Frieler, R. Knutti, D. J. Frame, and M. R. Allen (2009). “Greenhouse-gas emission targets for limiting global warming to 2 ◦C”. Nature. 458: 1158–1162. Milankovitch, M. (1969). Canon of Insolation and the Ice-Age Problem. English translation by the Israel Program for Scientific Transla- tions of Kanon der Erdbestrahlung und seine Anwendung auf das Eiszeitenproblem, Textbook Publishing Company, Belgrade, 1941. Washington, DC: National Science Foundation. Mitchell, B. R. (1988). British Historical Statistics. Cambridge: Cam- bridge University Press. https://doi.org/10.1029/2008PA001657 References 317 Mizon, G. E. (1995). “A simple message for autocorrelation correctors: Don’t”. Journal of Econometrics. 69: 267–288. Mizon, G. and J. Richard (1986). “The encompassing principle and its application to non-nested hypothesis tests”. Econometrica. 54: 657–678. Myhre, G., A. Myhre, and F. Stordal (2001). “Historical evolution of radiative forcing of climate”. Atmospheric Environment. 35: 2361– 2373. Nevison, C., N. Mahowald, S. Doney, I. Lima, G. van der Werf, J. Randerson, D. Baker, P. Kasibhatla, and G. McKinley (2008). “Con- tribution of ocean, fossil fuel, land biosphere, and biomass burning carbon fluxes to seasonal and interannual variability in atmospheric CO2”. Journal of Geophysical Research. 113. Newey, W. K. and K. D. West (1987). “A simple positive semi-definite heteroskedasticity and autocorrelation-consistent covariance matrix”. Econometrica. 55: 703–708. Nielsen, B. (2006). “Order determination in general vector autoregres- sions”. In: Time Series and Related Topics: In Memory of Ching- Zong Wei. Ed. by H.-C. Ho, C.-K. Ing, and T. L. Lai. Vol. 52. Lecture Notes–Monograph Series. Beachwood, OH: Institute of Mathematical Statistics. 93–112. Nielsen, B. and M. Qian (2018). “Asymptotic properties of the gauge of step-indicator saturation”. Discussion Paper. University of Oxford: Economics Department. Nielsen, B. and A. Rahbek (2000). “Similarity issues in cointegration analysis”. Oxford Bulletin of Economics and Statistics. 62: 5–22. Orcutt, G. H. and D. Cochrane (1949). “A sampling study of the merits of autoregressive and reduced form transformations in regression analysis”. Journal of the American Statistical Association. 44: 356– 372. Paillard, D. (2001). “Glacial cycles: Towards a new paradigm”. Reviews of Geophysics. 39: 325–346. 
Paillard, D. (2010). “Climate and the orbital parameters of the Earth”. Compte Rendus Geoscience. 342: 273–285. Paillard, D., L. D. Labeyrie, and P. Yiou (1996). “Macintosh program performs time-series analysis”. Eos Transactions AGU. 77: 379. 318 References Parrenin, F., J.-R. Barnola, J. Beer, T. Blunier, and E. E. Castellano (2007). “The EDC3 chronology for the EPICA Dome C ice core”. Climate of the Past. 3: 485–497. Parrenin, F., J.-R. Petit, V. Masson-Delmotte, and E. E. Wolff (2012). “Volcanic synchronisation between the EPICA Dome C and Vostok ice cores (Antarctica) 0–145 kyr BP”. Climate of the Past. 8: 1031– 1045. Pfeiffer, A., R. Millar, C. Hepburn, and E. Beinhocker (2016). “The ‘2 ◦C capital stock’ for electricity generation: Committed cumulative carbon emissions from the electricity generation sector and the transition to a green economy”. Applied Energy. 179: 1395–1408. Pistone, K., I. Eisenman, and V. Ramanathan (2019). “Radiative heating of an ice-free Arctic Ocean”. Geophysical Research Letters. 46: 7474– 7480. Pretis, F. (2017). “Exogeneity in climate econometrics”. Working Paper. Oxford University: Economics Department. Pretis, F. (2019). “Econometric models of climate systems: The equiv- alence of two-component energy balance models and cointegrated VARs”. Journal of Econometrics. doi: 10.1016/j.jeconom.2019.05.0 13. Pretis, F. and D. F. Hendry (2013). “Comment on ‘Polynomial cointegra- tion tests of anthropogenic impact on global warming’ by Beenstock et al. (2012) – some hazards in econometric modelling of climate change”. Earth System Dynamics. 4: 375–384. Pretis, F. and R. K. Kaufmann (2018). “Out-of-sample Paleo-climate simulations: Testing hypotheses about the mid-Brunhes event, the stage 11 paradox, and orbital variations”. Discussion Paper. Canada: University of Victoria. Pretis, F. and R. K. Kaufmann (2020). “Managing carbon emissions to avoid the next ice age”. Discussion Paper. Canada: University of Victoria. Pretis, F., M. L. Mann, and R. K. Kaufmann (2015a). “Testing com- peting models of the temperature Hiatus: Assessing the effects of conditioning variables and temporal uncertainties through sample- wide break detection”. Climatic Change. 131: 705–718. https://doi.org/10.1016/j.jeconom.2019.05.013 https://doi.org/10.1016/j.jeconom.2019.05.013 References 319 Pretis, F., J. J. Reade, and G. Sucarrat (2018a). “Automated general- to-specific (GETS) regression modeling and indicator saturation for outliers and structural breaks”. Journal of Statistical Software. 68, 4. Pretis, F. and M. Roser (2017a). “Carbon dioxide emission-intensity in climate projections: Comparing the observational record to socio- economic scenarios”. Energy. 135: 718–725. Pretis, F., L. Schneider, J. E. Smerdon, and D. F. Hendry (2016). “Detecting volcanic eruptions in temperature reconstructions by designed break-indicator saturation”. Journal of Economic Surveys. 30: 403–429. Pretis, F., M. Schwarz, K. Tang, K. Haustein, and M. R. Allen (2018b). “Uncertain impacts on economic growth when stabilizing global temperatures at 1.5 ◦C or 2 ◦C warming”. Philosophical Transactions of the Royal Society. A376: 20160460. Prothero, D. R. (2008). “Do impacts really cause most mass extinctions?” In: From Fossils to Astrobiology. Ed. by J. Seckbach and M. Walsh. Netherlands: Springer. 409–423. Rampino, M. and S.-Z. Shen (2019). “The end-Guadalupian (259.8 Ma) biodiversity crisis: The sixth major mass extinction?” Historical Biology. doi: 10.1080/08912963.2019.1658096. Ramsey, J. B. (1969). 
“Tests for specification errors in classical linear least squares regression analysis”. Journal of the Royal Statistical Society. B, 31: 350–371. Randerson, T., M. Thompson, T. Conway, I. Fung, and C. Field (1997). “The contribution of terrestrial sources and sinks to trends in the seasonal cycle of atmospheric carbon dioxide”. Global Biogeochemical Cycles. 11:4: 535–560. Ravishankara, A. R., J. S. Daniel, and R. W. Portmann (2009). “Nitrous Oxide (N2O): The dominant ozone-depleting substance emitted in the 21st century”. Science. 326: 123–125. Riccardi, A., L. R. Kump, M. A. Arthur, and S. D’Hondt (2007). “Car- bon isotopic evidence for chemocline upward excursion during the end-Permian event”. Palaeogeography, Palaeoclimatology, Palaeoe- cology. 248: 263–291. https://doi.org/10.1080/08912963.2019.1658096 320 References Richard, J.-F. (1980). “Models with several regimes and changes in exogeneity”. Review of Economic Studies. 47: 1–20. Rigby, M. E. (2010). “History of atmospheric SF6 from 1973 to 2008”. Atmospheric Physics and Chemistry. 10: 10305–10320. Rothenberg, T. J. (1971). “Identification in parametric models”. Econo- metrica. 39: 577–592. Rothenberg, T. J. (1973). Efficient Estimation with a Priori Information. Cowles Foundation Monograph. No. 23. New Haven: Yale University Press. Rousseeuw, P. J. (1984). “Least median of squares regression”. Journal of the American Statistical Association. 79: 871–880. Rowan, S. S. (2019). “Pitfalls in comparing Paris pledges”. Climatic Change. url: https://link.springer.com/article/10.1007/s10584-019 -02494-7. Ruddiman, W., ed. (2005). Plows, Plagues and Petroleum: How Humans took Control of Climate. Princeton: Princeton University Press. Salkever, D. S. (1976). “The use of dummy variables to compute pre- dictions, prediction errors and confidence intervals”. Journal of Econometrics. 4: 393–397. Sargan, J. D. (1964). “Wages and prices in the United Kingdom: A study in econometric methodology (with discussion)”. In: Econometric Analysis for National Economic Planning. Ed. by P. E. Hart, G. Mills, and J. K. Whitaker. Vol. 16. Colston Papers. London: Butterworth Co. 25–63. Schneider, L., J. E. Smerdon, F. Pretis, C. Hartl-Meier, and J. Es- per (2017). “A new archive of large volcanic events over the past millennium derived from reconstructed summer temperatures”. En- vironmental Research Letters. 12, 9. Schwarz, G. (1978). “Estimating the dimension of a model”. Annals of Statistics. 6: 461–464. Siddall, M., E. J. Rohling, A. Almogi-Labin, C. Hemleben, D. Meischner, and I. E. Schmelzer (2003). “Sea-level fluctuations during the last glacial cycle”. Nature. 423: 853–858. Snir, A., D. Nadel, I. Groman-Yaroslavski, Y. Melamed, M. Sternberg, and O. E. Bar-Yosef (2015). “The origin of cultivation and proto- weeds, long before Neolithic farming”. PLoS ONE. 10(7): e0131422. https://link.springer.com/article/10.1007/s10584-019-02494-7 https://link.springer.com/article/10.1007/s10584-019-02494-7 References 321 Spanos, A. and J. J. Reade (2015). “Heteroskedasticity/autocorrelation consistent standard errors and the reliability of inference”. Unpub- lished paper. USA: Virginia Tech. Stein, K., A. Timmermann, E. Y. Kwon, and T. Friedrich (2020). “Timing and magnitude of Southern Ocean sea ice/carbon cycle feedbacks”. PNAS. 117(9): 4498–4504. Stern, D. I. (2004). “The rise and fall of the environmental Kuznets curve”. World Development. 32: 1419–1439. Stern, N. (2006). The Economics of Climate Change: The Stern Review. Cambridge: Cambridge University Press. 
Stone, R. (2007). "A world without corals?" Science. 316: 678–681. Suess, H. E. (1953). "Natural radiocarbon and the rate of exchange of carbon dioxide between the atmosphere and the sea". In: Nuclear Processes in Geologic Settings. Ed. by National Research Council Committee on Nuclear Science. Washington, DC: National Academy of Sciences. 52–56. Sundquist, E. T. and R. F. Keeling (2009). "The Mauna Loa carbon dioxide record: Lessons for long-term earth observations". Geophysical Monograph Series. 183: 27–35. Thomson, K. S. (1991). Living Fossil: The Story of the Coelacanth. London: Hutchinson Radius. Tibshirani, R. (1996). "Regression shrinkage and selection via the Lasso". Journal of the Royal Statistical Society. B, 58: 267–288. U.S. Energy Information Administration (2009). "Emissions of greenhouse gases in the U.S." Report DOE/EIA-0573(2008). https://www.eia.gov/environment/emissions/ghg_report/. Vaks, A., A. Mason, S. Breitenbach, A. Kononov, A. Osinzev, M. P. Rosensaft, A. Borshevsky, O. Gutareva, and G. Henderson (2020). "Palaeoclimate evidence of vulnerable permafrost during times of low sea ice". Nature. 577: 221–225. Víšek, J. A. (1999). "The least trimmed squares – random carriers". Bulletin of the Czech Econometric Society. 6: 1–30. Vousdoukas, M. I., L. Mentaschi, E. Voukouvalas, M. Verlaan, S. Jevrejeva, L. P. Jackson, and L. Feyen (2018). "Global probabilistic projections of extreme sea levels show intensification of coastal flood hazard". Nature Communications. 9: 2360. Walker, A., F. Pretis, A. Powell-Smith, and B. Goldacre (2019). "Variation in responsiveness to warranted behaviour change among NHS clinicians: A novel implementation of change-detection methods in longitudinal prescribing data". British Medical Journal. 367: l5205. Ward, P. D. (2006). "Impact from the deep". Scientific American. 295: 64–71. Weart, S. (2010). "The Discovery of Global Warming". url: http://www.aip.org/history/climate/co2.htm. White, H. (1980). "A heteroskedastic-consistent covariance matrix estimator and a direct test for heteroskedasticity". Econometrica. 48: 817–838. Winchester, S. (2001). The Map That Changed the World. London: Harper Collins. Worrell, E., L. Price, N. Martin, C. Hendriks, and L. Meida (2001). "Carbon dioxide emissions from the global cement industry". Annual Review of Energy and the Environment. 26: 303–329. Zanna, L., S. Khatiwala, J. M. Gregory, J. Ison, and P. Heimbach (2019). "Global reconstruction of historical ocean heat storage and transport". PNAS. doi: 10.1073/pnas.1808838115.
Edison-1987Aug-PurchasingPowerParity-JMCB-v19n3

Purchasing Power Parity in the Long Run: A Test of the Dollar/Pound Exchange Rate (1890–1978)

Hali J. Edison

Journal of Money, Credit and Banking, Vol. 19, No. 3 (August 1987), pp. 376–387. Published by Ohio State University Press. Stable URL: https://www.jstor.org/stable/1992083

THIS PAPER ADDRESSES THE QUESTION of whether purchasing power parity (PPP) is valid in the long run,1 even though it has failed to explain short-run movements in exchange rates.2 The economic rationale for the failure of PPP to hold in the short run is that the economy is never observed in equilibrium. Over short periods large shocks or structural changes occur that may disturb exchange rates from their long-run equilibrium positions. However, as noted by Cassel,3 self-correcting mechanisms built into the economy help to produce a proportionality between exchange rates and relative prices over time.

1 See, for example, the review articles of Officer (1976) and Katseli-Papaefstratiou (1979). Also see Frenkel (1976) and Hakkio (1984).
2 The current consensus among economists is that PPP no longer provides an explanation of short-run movements of the exchange rate. See, for example, Frenkel (1981), Branson (1981), Desai (1981), Dornbusch (1985), Daniel (1986), and Miller (1986). One of the major reasons for the significant deterioration in the explanatory power of the PPP doctrine in the 1970s is the greater importance of real shocks to the economy in that period and the resulting changes in the relative price structure.
3 Cassel (1922) outlined in his chapter "Deviations from PPP" factors that we associate with K.
Some of these factors are changes in relative prices due to structural change, expectations, changes in taste and in technology, and speculation.

The author thanks Meghnad Desai, Jan Tore Klovland, Jeffrey Frankel, David Howard, Jaime Marquez, an anonymous referee, and the editors of this journal for many helpful comments. This paper represents the views of the author and should not be interpreted as representing the views of the Board of Governors of the Federal Reserve System. Hali J. Edison is an economist, Board of Governors of the Federal Reserve System, Division of International Finance.

This paper focuses on amending the simple theoretical version of PPP in a way consistent with Cassel's original arguments in favor of the PPP hypothesis. Gailliot (1970), Lee (1976), and Friedman (1980) have used simple statistical methods to test the validity of PPP in the long run. More recently, a number of competing studies have emerged that test the behavior of exchange rates. Adler and Lehmann (1983), Rush and Husted (1985), Frankel (1986), and Mark (1986) have all employed more sophisticated statistical techniques to examine the validity of the PPP hypothesis over the long run. Adler and Lehmann and Mark provide evidence that PPP does not hold in the long run, while Rush and Husted and Frankel provide evidence that PPP does hold in the long run.

The present study incorporates an error-correction mechanism and uses annual data for the United Kingdom and the United States since 1890 to test the PPP proposition. (This methodology is similar to that used in Davidson et al. (1978) and Rose (1985) to study consumption and the demand for money, respectively.) The relatively long period of time that the data span and the alternative econometric methodology employed provide a comprehensive means of testing the validity of long-run PPP.

This paper is organized as follows: Section 1 restates the PPP theory and outlines the economic strategy. Section 2 presents a simple, naive model and a more elaborate monetary model of PPP. Section 3 discusses the empirical results, which generally fail to support long-run PPP (for the dollar/pound exchange rate). Section 4 contains some concluding remarks.

1. PURCHASING POWER PARITY THEORY

The basic notion behind purchasing power parity is that the ratio of domestic to foreign prices determines the "fundamental" or "equilibrium" exchange rate. The PPP hypothesis is stated as

E = K (P / P*),   (1)

where E denotes the equilibrium exchange rate (that is, the domestic price of one unit of foreign currency), P denotes an index of domestic prices, P* denotes an index of foreign prices, and K is a scalar.[4] Throughout the analysis the United States is identified as the home country and the United Kingdom is identified as the foreign country.

[4] As discussed in the review articles cited in note 2, a number of issues about PPP theory have been debated, ranging from which prices are appropriate in assessing the theory to whether the theory itself is a tautology. In recent years, the debate about PPP has concentrated on empirical issues rather than on the theory itself. This study treats PPP as a long-run relationship towards which the actual exchange rate may exhibit a tendency. This is consistent with the writings of Cassel and his contemporaries.
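Because the empirical models that follow are written in logarithms, it may help to restate (1) in log form. This is only a restatement of the standard step, using lowercase letters for logs as the paper does:

```latex
% Taking logarithms of the PPP relation E = K (P / P*):
\begin{align*}
  \log E &= \log K + \log P - \log P^{*} \\
  e &= k + p - p^{*} .
\end{align*}
% Proportionality then amounts to a unit long-run coefficient on (p - p*).
```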
Tests of the PPP hypothesis generally include tests of two distinct basic properties: (i) symmetry between the domestic and the foreign country, and (ii) proportionality between relative prices and the exchange rate (that is, that the long-run coefficient on relative prices is one). In the expanded model described in section 2, consideration of a third hypothesis is added. The property of exclusiveness holds if no variable other than relative prices affects the long-run exchange rate.

Testing of PPP is conducted in the context of a general autoregressive distributed-lag or AD(g, qi) model, where g and qi are the lengths of the lags on the dependent variable and on the independent variable i, respectively. The choice of g and qi depends on a priori information or data constraints. For the case in which g = qi = n, this model is given by

e_t = α_0 + Σ_{i=0}^{n} (α_{1i} p_{t-i} + α_{2i} p*_{t-i}) + Σ_{i=1}^{n} β_i e_{t-i} + u_t,   (2)

where lowercase letters indicate logarithms and u_t ~ N(0, σ²). The empirical methodology employed in this paper differs from that of previous attempts in several aspects. First, this functional form is compatible with an error-correction mechanism (ECM) model, which can be used to test the PPP proportionality property.[5] Second, starting from a general econometric specification, each restriction is tested to uncover a parsimonious model that is consistent with the data. This specification can then be used to test theoretical restrictions. This is the "general to the specific" method. In contrast, most studies assert PPP and then test it in a restrictive form, thus pursuing a "specific to the general" strategy.

[5] A thorough presentation of ECM models can be found in Hendry et al. (1984).

2. TWO MODELS FOR TESTING PPP

This section proposes two models which provide an analytical framework for the tests of the PPP hypothesis in Section 3. The first model is the standard, naive PPP model, which is given by

ē = k + p̄ - p̄*,   (3)

where lowercase variables denote logarithms and bars over the variables denote long-run values of the variables defined in (1). The second model allows for different speeds of adjustment in commodity prices and exchange rates (see Dornbusch 1976). This model captures differential influences of real and monetary factors on the exchange rate, and is simultaneously consistent with short-run deviations from PPP and the validity of long-run PPP.

This alternative model consists of a simple PPP relationship like equation (3) [but k is assumed to equal zero, which yields equation (4)], a conventional demand for money equation (5), a short-run price adjustment equation (6), a demand function for output (7), and an equation describing the short-run path of the exchange rate (8). In the equations below the superscript + denotes the difference between the home country and foreign country values of the relevant variable.
The variables are all in logarithms (except for interest rates), where m, y, r, and d are the nominal money stock, real output, the nominal interest rate, and real expenditure on domestic goods, respectively. Assuming that all coefficients are identical for the two countries and that all coefficients are positive, the model can be written as[6]

ē = p̄⁺,   (4)
m⁺ = p⁺ + θ_1 y⁺ - θ_2 r⁺,   (5)
Δp⁺ = φ_1 (p̄⁺ - p⁺) + φ_2 (d⁺ - y⁺),   (6)
d⁺ = δ_1 (e - p⁺) + δ_2 y⁺ - δ_3 r⁺,   (7)
e = ē - λ r⁺.   (8)

The price equation (6) assumes that the change in prices is proportional to the excess demand for output and to the deviation of actual from long-run prices, while equation (7) relates the expenditure on domestic goods to the real exchange rate, output, and the interest rate. Equation (8) is obtained by combining the uncovered interest parity condition (r - r* = expected depreciation) with regressive expectations (expected depreciation is proportional to ē - e). The simple five-equation model can be used to determine the five endogenous variables e, p⁺, ē, r⁺, and d⁺. The quasi-reduced form for e is

e = f(p⁺, m⁺, y⁺).   (9)

In the short run, the exchange rate may deviate from the level implied by PPP (equation 4) due to real and monetary factors. In the long run, these factors should not influence the exchange rate [as suggested by equation (4)]. Equation (9) is rewritten as an extension of equation (4):

e = p⁺ + x_t,   (10)

where x_t = f(m_t, m*_t, y_t, y*_t). In the empirical section, monetary and real factors are included in modeling the current exchange rate to help explain short-run movements in the exchange rate. However, in the long run, as long as PPP holds, the exchange rate should only be a function of relative prices. This hypothesis is referred to as the exclusiveness condition, and it is explicitly tested in section 3. The exclusiveness condition holds as long as there are no deviations from PPP in the long run.

[6] In estimating the final exchange rate equation derived from this model, the restriction that all coefficients are identical for the two countries (symmetry) is not imposed.

3. EMPIRICAL RESULTS

This study employs annual data for the period from 1890 to 1978.[7] Testing PPP over this long time horizon provides the opportunity to examine the long-run tendency of the economy. Figure 1 plots the annual average value of the real and nominal exchange rate. Figure 1 also serves as a background for a brief discussion of the history of exchange rate regimes during the period under consideration. Vast economic differences existed between the period prior to the First World War, the interwar period, and the period after World War II. Three outstanding features of the period before World War I are the unity of the industrialized economic world, the gold standard (the pound was fixed in gold, and was worth approximately $4.86), and the general economic strength of the U.K. economy. The interwar period includes a major depression and periods of high unemployment, the decline of the United Kingdom as a world power, and increases in protectionism and exchange rate variability. The period after World War II marks an era characterized by the domination of the U.S. economy and the adoption of the Bretton Woods agreement, which was in effect until 1971.

To evaluate the PPP doctrine, a test of the naive version of PPP is considered first.
Even though this version of PPP can be nested within the monetary model, it is tested explicitly and treated as the baseline result. The empirical analysis employs GDP price deflators as price proxies because they are continuously available and consistently calculated over the entire period. The coverage of these price indices is quite broad and often leads to rejection of the PPP hypothesis, but tests based on more narrowly defined traded goods prices tend to be less interesting.

The first econometric specification considered in this section is equation (2), which is the empirical counterpart to equation (3). A general specification that follows the "general to the specific" strategy outlined in section 1 was considered first but is not reported here.[8] Based on a series of tests of restrictions, the final results for the naive model using an error-correction mechanism are reported below:[9]

Δe_t = 0.1355 + 0.756 Δp⁺_t + 0.0866 (p⁺ - e)_{t-1},   (11)
       (0.080)   (0.173)       (0.049)
NOBS = 80,  R² = 0.2037,  s = 0.089.

[7] Data used in this study come from Desai (1981).
[8] The details are reported in Edison (1981), which is available from the author.
[9] All the equations were estimated using the computer program G.I.V.E. (see Hendry and Srba 1978).

FIG. 1. Log of Real and Nominal Dollar/Pound Exchange Rate.

The numbers reported in parentheses are standard errors. These results show that neither the condition of symmetry nor that of proportionality can be rejected when tested against the more general specification of the naive model (that is, the coefficient on Δp⁺ is not significantly different from 1, and the coefficient on the error-correction term (p⁺ - e)_{t-1} is significant at the 5 and 1 percent significance levels, respectively). Setting e = e_{t-n} and p⁺ = p⁺_{t-n} for all n and solving for e allows us to derive the long-run stationary state. The long-run stationary-state representation of (11) is exactly that of equation (1), implying that the exchange rate has a tendency toward PPP. The coefficient on the term (p⁺ - e) suggests that about 9 percent of the gap or deviation from PPP is amended each year. (Using a similar data set, Frankel finds the real exchange rate regresses to PPP at a rate of 14 percent per year.) This result indicates that it takes a long time for the exchange rate to adjust towards its equilibrium level. Thus, it may be said that PPP provides a fair, though rough, approximation of the long-run exchange rate.

Next, the alternative model that amends the naive PPP relationship by including real and monetary factors in the exchange rate equation is considered. A general specification was first estimated to permit evaluation of restrictions. For brevity, the results of most tests of restrictions are not reported here.[10] Table 1 reports three specifications of the monetary model, where the dependent variable is the change in the log of the exchange rate.[11] The lagged variables have been introduced by using equation (10) as the equilibrium model and by assuming an autoregressive distributed-lag model as shown in equation (2).
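For readers reproducing these results, the error-correction form in (11) can be estimated by ordinary least squares. The following is a minimal Python sketch (not the paper's G.I.V.E. code), assuming annual pandas Series e, p and pstar that hold the logs of the exchange rate and the two price indices; the variable names are illustrative.

```python
import pandas as pd
import statsmodels.api as sm

def naive_ppp_ecm(e: pd.Series, p: pd.Series, pstar: pd.Series):
    """OLS estimate of  de_t = a0 + a1*d(p - p*)_t + a2*((p - p*) - e)_{t-1} + u_t,
    the error-correction form of the naive PPP model."""
    rel_p = p - pstar                          # log relative price level, p+ in the paper
    data = pd.DataFrame({
        "de": e.diff(),                        # change in the log exchange rate
        "drel_p": rel_p.diff(),                # impact term d(p+)
        "ecm": (rel_p - e).shift(1),           # error-correction term (p+ - e)_{t-1}
    }).dropna()
    X = sm.add_constant(data[["drel_p", "ecm"]])
    return sm.OLS(data["de"], X).fit()

# res = naive_ppp_ecm(e, p, pstar)
# print(res.summary())   # proportionality: coefficient on drel_p near 1; ecm coefficient = adjustment speed
```

The same mechanics, with additional lagged regressors, underlie the monetary-model columns of Table 1 below.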
Regression 1A in Table 1 is fairly general, and the restrictions imposed on this specification cannot be rejected. However, the signs on the impact coefficients do not correspond to those suggested by theory. For example, the coefficient on the income differential is positive while theory suggests it should be negative.[12] The regression reported in column 1C represents the "final" specification for the monetary model. Several linear restrictions on coefficients are imposed and tested. For example, the term R3 in Table 1 represents a linear restriction. This restriction imposes the same coefficient, but of opposite sign, on relative prices and the exchange rate lagged one period. In testing the PPP proposition, the condition of symmetry is not rejected except for the impact coefficients on interest rates. (Note that in regressions 1B and 1C the variable Δr_{t-2}, the lagged change of U.S. interest rates, is dropped because it is not significant.) Furthermore, it appears that the condition of proportionality between prices and exchange rates cannot be rejected, except that relative supplies of cash balances also influence the long-run exchange rate. This result can be seen more clearly by examining the long-run stationary state that emerges from regression 1C, which is found by setting e = e_{t-n}, x = x_{t-n} for all n (where x represents p⁺, m⁺, y⁺). The long-run stationary state is

e = k + p⁺ - 0.301 m⁺.

This result shows that the condition of exclusiveness can be rejected.[13]

[10] The results are reported in Edison (1981).
[11] The various models were also estimated by instrumental variable methods, but the coefficient estimates were little affected; therefore only the ordinary least-squares estimates are reported here.
[12] The sign on income differs depending on whether one adopts a "Keynesian" or a "monetary" framework (see Kreinin and Officer 1978). In the text a "Keynesian" framework has been adopted.
[13] This result is consistent with Darby (1983), who finds that there is no fixed, long-run parity level of PPP.

In general, these results indicate that forces exist in the economy that drive the exchange rate towards the PPP equilibrium. However, the exchange rate never quite returns to the PPP equilibrium because of the wedge created by relative supplies of cash balances. There are many plausible explanations for these results. One reason is that prices of traded and nontraded goods do not move in unison. (It is assumed that internationally traded goods prices in all countries tend to be equal after taking into account tariffs and transportation costs, etc.) As Bloomfield (1947) suggests, this failure of all prices to adjust in unison may be due to capital movements, changes in international demand, or other structural changes. The homogeneity between money supplies and the exchange rate could break down even if PPP holds if the demand for money has shifted over time. Monetary innovations and the development of the banking system may have altered the long-run velocity of money. Other changes also may have occurred over the century that have altered the data generation process.

TABLE 1
TEST OF THE PPP HYPOTHESIS, DOLLAR/POUND: THE "MONETARY" MODEL (a)

Variable        1A               1B                 1C
Δe_{t-2}        -.245 (0.106)    -.295 (0.103)      -.294 (0.103)
Δy⁺             0.279 (0.120)    0.218 (0.110)      0.218 (0.109)
Δm⁺_{t-1}       -.193 (0.113)    -.139 R1 (0.073)   -.132 R1 (0.073)
Δm⁺_{t-2}       0.096 (0.099)    -.139 R1 (0.073)   -.132 R1 (0.073)
Δp_{t-1}        0.714 (0.23)     0.520 R2 (0.178)   0.489 R2 (0.173)
Δp*_{t-1}       -.41 (0.19)      0.520 R2 (0.178)   0.489 R2 (0.173)
Δr_t            -6.51 (2.5)      -6.36 (2.39)       -6.26 (2.38)
Δr*_t           -5.08 (2.34)     -4.51 (2.22)       -4.56 (2.22)
Δr_{t-2}        0.54 (2.9)       --                 --
Δr*_{t-2}       -6.73 (2.73)     -5.49 (2.27)       -5.05 (2.18)
e_{t-1}         -.329 (.088)     0.252 R3 (0.072)   0.257 R3 (0.072)
p_{t-1}         -.139 (0.15)     0.252 R3 (0.072)   0.257 R3 (0.072)
p*_{t-1}        -.103 (0.12)     0.252 R3 (0.072)   0.257 R3 (0.072)
m⁺_{t-1}        -.079 (0.07)     -.091 (0.034)      -.076 (0.027)
y⁺_{t-1}        0.219 (0.116)    --                 --
r_{t-1}         1.35 (1.81)      -.698 R4 (0.936)   --
r*_{t-1}        0.942 (1.60)     -.698 R4 (0.936)   --
CNST            1.73 (0.558)     0.606 (0.168)      0.583 (0.164)
NOBS (d)        80               80                 80
R²              0.534            0.434              0.429
s (e)           0.050            0.053              0.053
LM test (f)     7.65             7.55               6.93
Chow test (g)   2.02             2.08               2.04
DF (h)          (6,62)           (6,69)             (6,70)

TABLE 1 NOTES:
(a) Dependent variable: the rate of change (Δ log e) of the dollar/pound exchange rate.
(b) Numbers in parentheses are standard errors (unless noted otherwise).
(c) Restrictions: R1(Δm⁺_{t-1} - Δm⁺_{t-2}), R2(Δp_{t-1} - Δp*_{t-1}), R3(p_{t-1} - p*_{t-1} - e_{t-1}), R4(r_{t-1} - r*_{t-1}).
(d) Number of observations (estimation range: 1894-1972).
(e) Standard error of the regression.
(f) Lagrange multiplier test for lth-order residual autocorrelation, distributed as χ²(l) in large samples. Here l is set to 6; the critical values at the 5 and 1 percent significance levels are 12.5 and 16.8, respectively.
(g) Chow test for parameter stability between the estimated sample of T observations (1894-1972) and the subsequent m observations (1973-1978); asymptotically distributed as F(m, T-k). m is set equal to six.
(h) Degrees of freedom; the critical values at the 5 and 1 percent significance levels for columns 1A, 1B, and 1C are 2.25 and 3.12, 2.24 and 3.09, and 2.23 and 3.07, respectively.

Three distinct periods were identified when analyzing the residuals of the estimated model. In the table below, tests of parameter constancy are reported.

STABILITY TEST OF EQUATION 1C
Break-year    1918    1924    1932    1937    1940
Chow test     0.368   0.74    1.65    1.66    1.82

The Chow tests (distributed as F(11,64)), using equation 1C of Table 1, do not reveal any structural breaks (with critical values of 1.94 and 2.54 at the 5 and 1 percent significance levels, respectively). These results do not corroborate the results of Friedman (1980), which indicated that a major break occurred in 1932.

In an attempt to understand these results better, two subsamples, the Gold Standard (1894-1914) and Bretton Woods (1950-1972), are considered. In both of these subsamples the exchange rate is fixed, so the U.K. price is used as the dependent variable. The final equation, following the modeling strategy, for the Gold Standard era with U.K. prices as the dependent variable is

Δp*_t = 0.505 + 0.510 Δp_t + 0.3211 Δp*_{t-1} - 0.1302 y⁺_t + 0.3205 (p⁺ - e)_{t-1},   (12)
        (0.201)  (0.141)     (0.188)            (0.053)        (0.123)
NOBS = 22,  R² = 0.48,  D-W = 1.83,  s = 0.014.
Long run: e = k + p⁺ - 0.406 y⁺.

The estimates for the Bretton Woods period are

Δp*_t = 0.121 + 0.904 Δp_t + 0.56 Δp*_{t-1} - 0.0375 y⁺_t + 0.07 (p⁺ - e)_{t-1},   (13)
        (0.056)  (0.256)     (0.16)           (0.060)       (0.05)
NOBS = 23,  R² = 0.65,  D-W = 1.66,  s = 0.016.
Long run: e = k + p⁺ - 0.528 y⁺.
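The Chow statistics quoted in the table notes and in the stability table can be computed from two residual sums of squares. A minimal Python sketch of such a predictive-failure test follows; the data objects y, X and the split are placeholders rather than the paper's files.

```python
import numpy as np
import statsmodels.api as sm

def chow_predictive_failure(y, X, n_forecast: int) -> float:
    """Chow test of parameter constancy: estimate on the first T - m observations,
    then ask whether the last m observations conform to the same relationship.
    Approximately F(m, T - m - k) under the null of constancy."""
    y = np.asarray(y, dtype=float)
    X = sm.add_constant(np.asarray(X, dtype=float))
    T, k = X.shape
    m = n_forecast
    rss_sub = sm.OLS(y[: T - m], X[: T - m]).fit().ssr   # estimation sample, e.g. 1894-1972
    rss_full = sm.OLS(y, X).fit().ssr                    # full sample including the m hold-out years
    return ((rss_full - rss_sub) / m) / (rss_sub / (T - m - k))
```

With m = 6 this mirrors the test described in note (g); the break-year variants in the stability table simply condition on different split points.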
Several common features of the two sample periods can be gleaned from these results. First, the conditions of symmetry and of proportionality are not rejected by the data. Second, the condition of exclusiveness fails in both sample periods, because the long-run exchange rate is a function of relative income levels. These results do not support the naive version of PPP or the amended monetary version. However, it is possible to find a rationale for these results. Balassa (1964) argued that the distinction between traded and nontraded goods, together with the existence of differential rates of productivity growth, would lead to the failure of the simple PPP hypothesis. The more productive or faster growing economy will experience increases in general prices due to the rise in nontraded goods prices. A productivity bias emerges in PPP comparisons when using general price indices, that is, GNP deflators. For fast-growing countries a PPP comparison will show a real exchange rate appreciation. These findings are also consistent with those of Kravis and Lipsey (1983) and Genberg (1978).

4. SUMMARY AND CONCLUSION

The main conclusion that follows from this empirical study of the PPP doctrine is that a naive version of the PPP relationship does not adequately represent the dollar/pound exchange rate. Within the enlarged, monetary model the conditions of symmetry and of proportionality cannot be rejected. However, the condition of exclusiveness is rejected, so that permanent deviations from PPP cannot be ruled out. This result was reinforced when the sample was divided into two smaller homogeneous subsamples where PPP was tested using its fixed-rate counterpart, the equalization of prices across countries. The results of this paper support a qualified interpretation of the PPP doctrine: the proportionality between the exchange rate and the relative price level emerges in the long run, after taking into account the effects of changes in structural factors.

LITERATURE CITED

Adler, Michael, and Bruce Lehmann. "Deviations from Purchasing Power Parity in the Long Run." Journal of Finance 38 (December 1983), 1471-87.
Balassa, Bela. "The Purchasing Power Parity Doctrine: A Reappraisal." Journal of Political Economy 72 (December 1964), 584-96.
Bloomfield, Arthur I. "Foreign Exchange Rate Theory and Policy." In The New Economics: Keynes' Influence on Theory and Public Policy, edited by Seymour Harris, pp. 293-314. London: Dennis Dobson Ltd., 1947.
Branson, William H. "Comment: The Collapse of Purchasing Power Parity during the 1970's." European Economic Review 16 (May 1981), 167-71.
Cassel, Gustav. Money and Foreign Exchange after 1914. New York: Constable & Co., Ltd., 1922.
Daniel, Betty C. "Sticky Prices and Purchasing Power Parity Deviations: Empirical Implications." Economics Letters 20 (February 1986), 187-90.
Darby, Michael R. "Movements in Purchasing Power Parity: The Short and Long Runs." In The International Transmission of Inflation, edited by Michael R. Darby, James R. Lothian, Arthur E. Gandolfi, Anna J. Schwartz, and Alan C. Stockman. Chicago: The University of Chicago Press, 1983.
Davidson, James E. H., David F. Hendry, Frank Srba, and Stephen Yeo. "Econometric Modelling of the Aggregate Time-Series Relationship between Consumers' Expenditure and Income in the United Kingdom." Economic Journal 88 (December 1978), 661-92.
Desai, Meghnad. Testing Monetarism. London: Frances Pinter, 1981.
Dornbusch, Rudiger. "Expectations and Exchange Rate Dynamics." Journal of Political Economy 84 (December 1976), 1161-76.
Dornbusch, Rudiger. "Purchasing Power Parity." National Bureau of Economic Research, Working Paper No. 1591, March 1985.
Edison, Hali J. "Short-Run Dynamics and Long-Run Equilibrium Behaviour in Purchasing Power Parity: A Quantitative Reassessment." Ph.D. dissertation, London School of Economics, 1981.
Frankel, Jeffrey A. "International Capital Mobility and Crowding-Out in the U.S. Economy: Imperfect Integration of Financial Markets or of Goods Markets?" In How Open Is the U.S. Economy?, edited by R. W. Hafer, Federal Reserve Bank of St. Louis, pp. 33-67. Lexington, Mass.: Lexington Books, 1986.
Frenkel, Jacob A. "The Collapse of Purchasing Power Parities during the 1970s." European Economic Review 16 (May 1981), 146-65.
Frenkel, Jacob A. "A Monetary Approach to the Exchange Rate: Doctrinal Aspects and Empirical Evidence." Scandinavian Journal of Economics 78 (May 1976), 200-224.
Friedman, Milton. "Prices of Money and Goods across Frontiers: The £ and $ over a Century." The World Economy 2 (February 1980), 497-511.
Gailliot, Henry J. "Purchasing Power Parity as an Explanation of Long-Term Changes in Exchange Rates." Journal of Money, Credit, and Banking 2 (August 1970), 348-57.
Genberg, Hans. "Purchasing Power Parity under Fixed and Flexible Exchange Rates." Journal of International Economics 8 (May 1978), 247-76.
Hakkio, Craig S. "A Re-Examination of Purchasing Power Parity: A Multicountry and Multiperiod Study." Journal of International Economics 17 (November 1984), 265-77.
Hendry, David F., and Frank Srba. "Technical Manual of G.I.V.E.: General Instrumental Variable Estimation of Linear Equations with Lagged Dependent Variables and First Order Autoregressive Errors." Unpublished, London School of Economics, October 1978.
Hendry, David F., Adrian R. Pagan, and J. Denis Sargan. "Dynamic Specification." In Handbook of Econometrics, edited by Zvi Griliches and Michael Intriligator, pp. 1023-1100. Amsterdam: North-Holland, 1984.
Katseli-Papaefstratiou, Louka. "The Re-Emergence of the PPP Doctrine in the 1970s." Special Papers in International Economics, 13. International Finance Section, Princeton University, 1979.
Kravis, Irving B., and Robert E. Lipsey. Toward an Explanation of National Price Levels. Princeton Studies in International Finance, 52. International Finance Section, Princeton University, 1983.
Kreinin, Mordechai E., and Lawrence H. Officer. The Monetary Approach to Balance of Payments: A Survey. Princeton Studies in International Finance, 43. International Finance Section, Princeton University, 1978.
Lee, Moon. Purchasing Power Parity. New York: Marcel Dekker, 1976.
Mark, Nelson C. "Real and Nominal Exchange Rates in the Short Run and in the Long Run: An Empirical Investigation." Unpublished, 1986.
Miller, Stephen. "Purchasing Power Parity and Relative Price Variability: Evidence from the 1970s." European Economic Review 26 (December 1984), 353-67.
Officer, Lawrence H. "The Purchasing-Power Parity Theory of Exchange Rates: A Review Article." International Monetary Fund Staff Papers 23 (March 1976), 1-60.
Rose, Andrew K. "An Alternative Approach to the American Demand for Money." Journal of Money, Credit, and Banking 17 (November 1985), 439-55.
Rush, Mark, and Steven Husted. "Purchasing Power Parity in the Long Run." Canadian Journal of Economics 18 (February 1985), 137-45.

EdisonPauls-1993Apr-AReassessment-JME-v31n2

Journal of Monetary Economics 31 (1993) 165-187. North-Holland

A re-assessment of the relationship between real exchange rates and real interest rates: 1974-1990

Hali J. Edison and B. Dianne Pauls*
Board of Governors of the Federal Reserve System, Washington, DC 20551, USA
Received August 1991, final version received November 1992

This paper uses cointegration techniques and error-correction models to re-examine the link between real exchange rates and real interest rate differentials. The results show that real exchange rates and real interest rates are nonstationary; however, they are not cointegrated with each other. On the other hand, the dynamic models indicate that there might be a long-run relationship between these variables, but this cannot be verified. The final conclusion is that there is little empirical evidence in support of a systematic relationship, and this result is robust across exchange rates, time periods, and measures of expected inflation.

Keywords: Real exchange rates; Real uncovered interest rate parity; Cointegration

1. Introduction

The wide swings in the value of the U.S. dollar during the past two years have rekindled interest in the search for understanding exchange rate movements. This paper re-examines the link between real exchange rates and real interest rate differentials, often cited as 'the most robust relationship in empirical

Correspondence to: Hali J. Edison, Division of International Finance, Board of Governors of the Federal Reserve System, Washington, DC 20551, USA.

*We would like to thank Marianne Baxter, Neil Ericsson, Oyvind Eitrheim, Mike Gavin, Dale Henderson, Bill Helkie, David Howard, Karen Johnson, Bob King, Jaime Marquez, Will Melick, Peter Schmidt, Ted Truman, and participants in seminars at UC San Diego, Michigan State, the Division of International Finance at the Federal Reserve, and the FRB International Finance System Committee group for their helpful comments. David Eiler provided valuable research assistance.
The views expressed in this paper are solely the responsibility of the authors and should not be interpreted as those reflecting the Board of Governors of the Federal Reserve System or other members of its staff. 0304-3932/93/$06.00 c 1993-Elsevier Science Publishers B.V. All rights reserved 166 H.J. Edison and B.D. Path. Real exchange and interest rates exchange rate models’.’ These series are shown in fig. 1 for the trade-weighted value of the dollar against the foreign G-10 currencies. Based on casual inspec- tion, the two time series appear to move together. However, this appearance may be an apparition and may not reflect a true long-run stable relationship. This paper asks whether this relationship is indeed systematic and, if so, what empirical representation of it the data support. The general view of the economics profession as represented in Meese (1990) is that past research has been unsuccessful in explaining exchange rate move- ments. Many earlier papers, which model exchange rate movements as a func- tion of real interest rate differentials and other economic fundamentals, have obtained statistically significant coefficients on real interest differentials.’ How- ever, more recent work that uses more sophisticated empirical techniques generally has been unable to establish a long-run relationship between these variables. Two of the more well-known papers are those of Campbell and Clarida (1987) and Meese and Rogoff (1988). Campbell and Clarida examine whether real exchange rate movements can be explained by shifts in real interest rate differentials and find that expected real interest rate differentials have simply not been persistent enough, and their innovation variance not large enough, to account for much of the fluctuation in the dollar’s real exchange rate. Meese and Rogoff test for cointegration and find that they cannot reject the null hypothesis of noncointegration between real long-term interest rate differentials and real exchange rates. They suggest that this finding may indicate that a variable omitted from the relationship, possibly the expected value of some future real exchange rate, may have a large variance which, if included, would lead to finding cointegration. This conjecture of an important missing variable is also consistent with the Campbell-Clarida results. Two recent papers by Coughlin and Koedijk (1990) and Blundell-Wignall and Browne (1991), however, find that real exchange rates and real interest rates may be cointegrated. The ability of Blundell-Wignall and Browne to find cointegra- tion is due to the inclusion of the difference in the share of the cumulated current account relative to GNP in the relevant countries; the finding of cointegration by Coughlin and Koedijk is only for the mark/dollar exchange rate and results from extending the sample period by using more recent data. This paper also focuses on the long-run relationship between real exchange rates and real interest rate differentials. We begin by examining the statistical properties of the data, Using a unit root test, we cannot reject the null hypothe- sis of a unit root for real exchange rates, real interest rates, and most of our measures of expected inflation. We then test the long-run implications of the model for the cointegration of real exchange rates and real interest rates. Similar ‘Meese and Rogoff (1988). *Frankel(1979), Hooper and Morton (1982), Shafer and Loopesko (1983), and Boughton (1987). H.J. Edison and B.D. 
Pa&, Real exchange and interest rates 167 to the MeeseeRogoff result, we have not been able to detect any long-run relationship between real exchange rates and real interest rates using EngleGranger cointegration tests over the entire sample period. We have expanded these tests to allow for other variables, such as the cumulated current account balance, that may affect the expected long-run real exchange rate, but we still fail to find any evidence of cointegration. In addition to these tests, general dynamic specifications for the real trade- weighted dollar are examined in an attempt to find an error correction model. Error correction models provide information not only about the long-run relationship but also about short-run dynamics. The final models derived show that most of the short-run movements in real exchange rates are accounted for by their own past; over the longer run, however, changes in interest rates are important in explaining movements in exchange rates. However, we cannot impose a specific error correction term as indicated by each of the level variables entering with a statistically different coefficient. This result suggests the lack of a long-run relationship. Therefore, the findings from the dynamic models must be interpreted quite carefully - they do not corroborate the hypothesis that there is a long-run relationship between real exchange rates and real interest rate differentials. The rest of the paper is organized as follows. Section 2 examines the data, and section 3 gives the model framework. Section 4 presents the time series proper- ties of the data. Section 5 discusses the econometric results. Section 6 concludes. 2. The data The issues in this paper are fundamentally empirical. Before presenting a formal model, we consider the data by visually inspecting it. In particular, we want to know whether the results as depicted in fig. 1 are conditional on: (1) the time period selected, (2) the inflation measure used to construct the real interest rate, and (3) the choice of exchange rate. Some of the differences in the results in the existing literature appear to stem from aspects of the data selected. It is possible for graphs to portray the data misleadingly, nevertheless we think this method is useful to highlight the above issues.3 The data are quarterly observations for 1974-1990. Exchange rates are the Federal Reserve Board staff’s trade-weighted value of the U.S. dollar against the other G-10 currencies, and the Japanese yen, German mark, British pound sterling, and Canadian dollar against the U.S. dollar. Nominal interest rates are the ten-year constant maturity rate on Treasury bonds for the United States (i) and yields on bellwether government bonds for the foreign G-10 countries 3Danker and Hooper (1990) also present several graphs in their examin~tjon of this relationship. 168 H.J. Edison and B.D. Pa&. Real exchange and interest rates (i*).4 Prices are measured by CPIs. The weighted average value of the dollar in real terms is calculated by adjusting the nominal value by the ratio of the U.S. to the foreign CPI. For the analysis of the trade-weighted dollar, the foreign variables are similarly trade-weighted. The cumulated current account balances are created assuming the cumulated current accounts of the various countries were in balance as of 1972.Q4; the current accounts were then accumulated as of 1973.Ql.’ Three alternative measures of expected inflation are considered. 
The first alternative is a twelve-quarter centered moving average of CPI inflation rates, where forecasts are used when published data are not available. The other two measures are based on quarterly and four-quarter changes in the CPI index, respectively. The appendix gives details of the data and sources.

Fig. 1 presents the weighted average value of the dollar in real terms and a measure of the real long-term interest differential calculated using the twelve-quarter centered moving average measure of expected inflation.[6] The figure indicates that movements in the two series have been at least roughly correlated over most of the floating rate period. The decline in the dollar during the 1970s is consistent with a general downtrend in the interest differential. The relationship also holds up reasonably well during the dollar's appreciation in 1979-83, and again during its depreciation in 1985-86. The relationship breaks down, however, during 1984 to early 1985, when the dollar continued to rise strongly after the interest differential turned down. The same thing occurred to a lesser extent in the first part of 1989. The chart shows a tendency for movements of real interest rate differentials to precede movements in real exchange rates, but the strength of this relationship may vary over time. A very different story about the relationship between real interest rate differentials and real exchange rates emerges when using short-term real interest rates. Fig. 2 illustrates that the relationship between real exchange rates and real short-term interest rate differentials does not resemble its long-term counterpart over most of the floating rate period.[7]

[4] In most of the foreign G-10 countries, the liberalization of financial markets is a fairly recent phenomenon. Previously, ten-year bonds did not exist in many of these countries. For the early part of our sample, we used the best available proxy, often an average yield on a set of bonds of intermediate maturity.
[5] The assumption that the cumulated current accounts were in balance does not, of course, accord with the data. However, this assumption only affects our initial condition and does not alter the dynamic results.
[6] The history of the dollar since the collapse of the Bretton Woods system breaks up fairly neatly into six phases: 1973-75, when the dollar depreciated after the breakdown of Bretton Woods; 1975-76, when the dollar appreciated; 1977-80, when the dollar depreciated as market participants were concerned that U.S. authorities were not adequately fighting inflation; 1981-84, when the dollar appreciated sharply as monetary policy in the United States was firm and prospects for continued large U.S. fiscal deficits exerted upward pressure on real interest rates; 1985-86, when the dollar peaked and reversed its trend after U.S. monetary conditions had begun to ease; and 1987-90, when the dollar fluctuated within a range.

Fig. 1. The dollar and the real long-term interest rate differential.
Fig. 2. The dollar and the real short-term interest rate differential.
Fig. 3. The trade-weighted value of the dollar.
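To make the construction of these proxies concrete, here is a minimal pandas sketch. The series cpi, i and their foreign counterparts are assumed to be quarterly pandas Series; the exact window and annualisation conventions are illustrative rather than the authors' code, and the forecasts used at the sample edges are ignored.

```python
import numpy as np
import pandas as pd

def expected_inflation_proxies(cpi: pd.Series) -> pd.DataFrame:
    """Three expected-inflation proxies from a quarterly CPI index, in percent."""
    infl_1q = 400 * np.log(cpi / cpi.shift(1))      # one-quarter change, annualised
    infl_4q = 100 * np.log(cpi / cpi.shift(4))      # four-quarter change
    infl_cma = infl_4q.rolling(window=12, center=True).mean()   # twelve-quarter centered moving average
    return pd.DataFrame({"cma12": infl_cma, "chg4q": infl_4q, "chg1q": infl_1q})

def real_long_rate_differential(i, i_star, pi_e, pi_e_star):
    """Real long-term interest differential r - r* for a chosen inflation proxy."""
    return (i - pi_e) - (i_star - pi_e_star)

# proxies_us = expected_inflation_proxies(cpi_us)
# proxies_g10 = expected_inflation_proxies(cpi_g10)
# rdiff = real_long_rate_differential(i_us, i_g10, proxies_us["cma12"], proxies_g10["cma12"])
```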
Fig. 3 displays the nominal and real trade-weighted values of the dollar. As is well known, there is a close correspondence between the two series, and, as has been shown elsewhere in the literature, most of the movement in the real exchange rate reflects movements in the nominal exchange rate. Fig. 4 shows that there is little apparent relationship between the nominal trade-weighted dollar and the nominal long-term interest rate differential. One explanation for this seeming lack of correlation is that the expected future nominal value of the dollar, unlike its real counterpart, does not even approximate a stable anchor; it varies with changes in inflation expectations. On the other hand, this picture does raise the question of whether the relationship in real terms is dependent on the inflation measure we use. Fig. 5 presents three alternative real interest rate differentials based on three different expected inflation measures. As this figure illustrates, the generated real interest rate differentials do vary considerably with the different measures of inflation.[8]

Figs. 6-9 plot for the four bilateral rates (German mark, Japanese yen, British pound sterling, and Canadian dollar against the U.S. dollar) the relationship between real exchange rates and real interest rate differentials using a twelve-quarter centered moving average measure of expected inflation. A strong relationship between real long-term interest differentials and real exchange rates is seen for the mark/dollar over most of the period. In contrast to the mark/dollar rate, there appears to be little relationship between the other three bilateral real exchange rates and their real interest differentials. One reason why this relationship may not be evident for the United Kingdom during much of the 1970s is that capital controls were in place there until late 1979; however, the relationship does not work well since that date either. Although Japan also had capital controls until late 1980, much of the apparent breakdown in the relationship for the yen/dollar occurred since then.[9] All in all, these graphs seem to suggest that the strong relationship between real exchange rates and real interest rate differentials that was apparent in fig. 1 may be tenuous. The next few sections of this paper examine this issue statistically.

[7] The relationship does not hold up well, in general, because the expected value of the dollar over a short horizon tends to vary more than its expected long-run real value. However, since 1985, the CPI-adjusted value of the dollar and the real short-term interest differential, like its long-term counterpart, have tended to move together as relative yield curves have changed little.
[8] As Baxter (1992) points out, our twelve-quarter centered moving average inflation measure creates a real interest rate differential which is smoother than our alternative measures.
[9] Another reason why this relationship might not be evident for these two countries is that the consumer price index might not be the most appropriate index to use. The weight of raw commodities, especially oil prices, in the CPI for both Japan and the U.K. might bias the calculation.

Fig. 4. Nominal dollar and nominal long-term interest rate differential.
Fig. 5. Alternative measures for the real long-term interest rate differential.
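A chart in the spirit of fig. 1 and figs. 6-9 is straightforward to reproduce once the real exchange rate and the real interest differential are in hand. The following matplotlib sketch uses illustrative variable names and layout choices that are not the paper's.

```python
import matplotlib.pyplot as plt

def plot_q_vs_rdiff(q, rdiff, title="Real exchange rate and real long-term interest rate differential"):
    """Plot the real interest differential (left axis) against the real exchange rate (right axis)."""
    fig, ax_left = plt.subplots(figsize=(8, 4))
    ax_right = ax_left.twinx()                      # second y-axis, as in the paper's figures
    ax_left.plot(rdiff.index, rdiff.values, color="tab:blue")
    ax_right.plot(q.index, q.values, color="tab:red")
    ax_left.set_ylabel("Percentage points")
    ax_right.set_ylabel("Index, March 1973 = 100")
    ax_left.set_title(title)
    fig.tight_layout()
    return fig

# plot_q_vs_rdiff(q, rdiff).savefig("dollar_vs_real_rate_differential.png")
```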
Fig. 6. U.S./Germany: Real exchange rate and real long-term interest rate differential.
Fig. 7. U.S./Japan: Real exchange rate and real long-term interest rate differential.
Fig. 8. U.S./U.K.: Real exchange rate and real long-term interest rate differential.
Fig. 9. U.S./Canada: Real exchange rate and real long-term interest rate differential.

3. The model

As in Isard (1982) we begin with a set of useful definitions. The uncovered interest parity condition, allowing for a risk premium, is defined as follows:

s_t = E(s_T) + i_{t,T} - i*_{t,T} - ρ_t,   (1)

where
s = log of spot exchange rate (foreign currency per dollar),
E(x) = expected value of any future variable x based on information at time t,
i, i* = nominal own rates of interest on assets denominated in home and foreign currencies, as compounded over horizon T - t,
ρ = exchange risk premium.

Next, the real exchange rate is defined as

q = s + p - p*,   (2)

where
q = log of the real exchange rate,
p, p* = log of domestic and foreign price levels.

Combining (1) with an expression for E(s_T) derived from (2),

s_t = E(q_T) + E(p*_T) - E(p_T) + i_{t,T} - i*_{t,T} - ρ_t.   (3)

It is convenient to rewrite the expected future logarithmic price levels in terms of expected inflation, using the approximation

E(p_T) = p_t + E(π),   E(p*_T) = p*_t + E(π*).   (4)

Applying the Fisher equation to obtain an expression for expected real rates of interest,

E(r_{t,T}) = i_{t,T} - E(π),   E(r*_{t,T}) = i*_{t,T} - E(π*).   (5)

Substituting (4) and (5) into (3), and using the definition in (2):

q_t = E(r_{t,T}) - E(r*_{t,T}) + E(q_T) - ρ_t.   (6)

In order to obtain a relationship between the real exchange rate and the expected real interest rate differential, it is necessary to model the expected future real exchange rate and the risk premium. Traditional econometric work in this area has used a single-equation, semireduced form, often with no dynamics. The equation is derived by assuming that the risk premium is white noise and the expected long-run real exchange rate is equal to a constant plus possibly a function of some 'fundamental' factors; a typical example of a 'fundamental' factor is the cumulated current account. That is,

q_t = E(r_{t,T}) - E(r*_{t,T}) + k + g(ccbal_t) - ρ_t,   (7)

where k = a constant and ccbal = relative cumulated current accounts (domestic to foreign). We introduce dynamics into eq. (7) by modelling the risk premium as an autoregressive process, i.e., A(L)ρ_t = -ε_t.[10]

[10] We choose to model the risk premium with an autoregressive process because of the poor empirical performance of variables that have been used to explain the risk premium, such as relative asset supplies or the conditional covariance of the asset return with the intertemporal marginal rate of substitution.
This allows us to obtain a general dynamic specification, having dropped the constant, of the following form:

A(L)q_t = B(L)r⁺_{t,T} + C(L)(ccbal)_t + ε_t,   (8)

where r⁺_{t,T} = E(r_{t,T}) - E(r*_{t,T}). Eq. (8) represents a very general relationship and is empirically motivated. In section 5, we refer to this equation as the autoregressive distributed lag model. In implementing this equation empirically we attempt to fit a specific form, namely an error correction model. A simplified version of the general dynamic specification, truncating the lags at one, is given as eq. (9). If we have an error correction model, then we can restrict the coefficients to be β_1 = β_2 = β_3, which is the restriction implied by eq. (6). This is a testable hypothesis that is considered in the empirical section.

4. Time series properties of the data

Before modelling the relationship between exchange rates and interest rates, the statistical properties of the data are analyzed. In particular, each time series is examined to assess whether it contains a unit root. We need to establish the order of integration of the time series before we can proceed to our next step of testing for cointegration.

4.1. Unit root tests

For an arbitrary time series (x_t), consider the model

x_t = β_0 + β_1 t + β_2 x_{t-1} + ε_t.   (10)

Using this equation, we test the null hypothesis H_0: (β_0, β_1, β_2) = (β_0, 0, 1) against the general alternative, based on an 'F' statistic. This hypothesis tests whether there is a unit root and whether the trend term is important. The results for the augmented Dickey-Fuller test are reported in table 1 for both the trade-weighted dollar and the four bilateral exchange rates.[11] Expected inflation and real interest rates were constructed using three proxies: a twelve-quarter centered moving average of actual inflation, a four-quarter change in the CPI, and a one-quarter change in the CPI. We frequently reject the null hypothesis of a unit root for the one-quarter change in the CPI because the price level in many of these countries appears to be nonstationary. For most of the other variables tested, the null hypothesis that these series have a unit root cannot be rejected. Further exceptions are the difference in the share of the cumulated current account between the United States and Japan, the expected inflation variables for Japan, and the real interest rate differential between the United States and the United Kingdom when the four-quarter change in the CPI is used as a proxy for expected inflation. The results for the inflation variables and inflation differentials tend to be the most sensitive to the number of lags selected in the augmented test. Because the exceptions were few, we proceeded under the assumption that the time series relevant for our basic cointegration tests are all integrated of the same order, I(1), which is a necessary condition for these time series to be cointegrated.[12] The cointegration tests provide a means of evaluating the relationship between real exchange rates and real interest rate differentials (and the components of the differential) as described in eq. (7).

[11] In Edison and Pauls (1991) alternative tests for unit roots are presented, including standard Dickey-Fuller t and F tests, augmented Dickey-Fuller t tests, and Phillips-Perron tests. The conclusions of these alternative tests are consistent with those reported in table 1. Table 1 reports results using one lag for the augmented tests, based on the criterion that the errors are white noise using an LM test. For a great majority of the variables examined we were able to achieve white noise with one lag. For those variables that needed more lags, the inferences for the tests reported are almost always unchanged.
[12] We repeated our unit root tests on the first differences of each time series and found, in general, that we rejected the null that the first difference of these series had a unit root. In other words, we confirmed that most levels of our original time series are I(1).
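Equation (10) and the joint 'F' test can be estimated directly by OLS. A minimal Python sketch with one augmentation lag follows; the series name x is illustrative, and the resulting statistic must be compared with the Dickey-Fuller (1981) critical values quoted in table 1, not with standard F tables.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def adf_f_test(x: pd.Series, lags: int = 1):
    """Regress x_t on a constant, trend, x_{t-1} and lagged differences, then test
    the joint null (beta_trend, beta_lag) = (0, 1), i.e. a unit root with no trend."""
    df = pd.DataFrame({"x": x.astype(float)})
    df["trend"] = np.arange(len(df))
    df["x_lag"] = df["x"].shift(1)
    for j in range(1, lags + 1):                 # augmentation terms dx_{t-j}
        df[f"dx_lag{j}"] = df["x"].diff().shift(j)
    df = df.dropna()
    X = sm.add_constant(df.drop(columns="x"))
    res = sm.OLS(df["x"], X).fit()
    return res, res.f_test("trend = 0, x_lag = 1")

# res, ftest = adf_f_test(q)
# print(ftest)   # compare with the Dickey-Fuller critical values, e.g. 6.73 at 5%
```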
Table 1
Statistical properties of the data: augmented Dickey-Fuller F tests, 1974.Q3-1990.Q4 (a)

Variable (b)               U.S./G-10   U.S./Germany   U.S./Japan   U.S./U.K.   U.S./Canada
q                          1.09        1.52           3.09         3.62        2.35
i                          1.86        1.86           1.86         1.86        1.85
i*                         1.81        3.53           2.88         6.52        1.87
(i - i*)                   3.58        4.92           3.89         4.86        4.65
ccbal/gnp, dif(cbal/gnp)   3.90        4.39           7.16         3.24        3.79

Expected inflation proxy: twelve-quarter centered moving average
π                          2.46        2.46           2.46         2.46
π*                         2.84        3.09           1.62         2.02        3.53
(π - π*)                   2.16        13.36          2.03         7.17        4.87
(r - r*)                   2.7         1.98           2.46         2.73        3.58        1.87

Expected inflation proxy: four-quarter changes
π                          2.74        2.74           2.74         2.74        2.74
π*                         5.33        1.44           2.35         2.74        30.42
(π - π*)                   9.26        5.60           5.35         4.27        2.68
(r - r*)                   5.43        4.44           4.44         16.19       1.87

Expected inflation proxy: one-quarter changes
π                          4.44        4.44           4.44         4.44        4.44
π*                         5.02        5.49           16.41        11.48       11.44
(π - π*)                   5.22        13.17          8.97         4.48        6.79
(r - r*)                   5.42        9.85           5.85         22.02       1.87

(a) For the trade-weighted dollar the data end in 1990.Q3. Numbers are F statistics based on the augmented Dickey-Fuller unit root test using one lag. The critical values are given in Dickey and Fuller (1981, table VI); they are 5.61 at the 10% significance level and 6.73 at the 5% significance level.
(b) q = log of real exchange rate, i = U.S. long-term nominal interest rate, π = U.S. expected inflation, r = U.S. real interest rate, ccbal/gnp = cumulated current account/GNP, dif(cbal/gnp) = difference between U.S. and foreign cumulated current account/GNP; * denotes the foreign country (bilateral or G-10 weighted average).

4.2. Cointegration tests

Table 2 contains the results of cointegration tests. These tests are based upon the residuals from a regression of the following sort:

q = β_0 + β_1 i + β_2 i* + β_3 π + β_4 π* + β_5 X + u,   (11)

where X is a vector of unspecified additional variables.
(12), which reports the first stage of an Engle-Granger cointegration test using the simple bivariate case for the trade-weighted value of the dollar and the real interest rate differential: 4 = 4.56 + 0.062 (r - r*), (12) (0.012) (0.006) R2 = 0.62, D W = 0.35, SER = 0.090, Cointegration test: - 2.47. The results of this equation appear to show that there is a relationship between these variables as indicated by the strongly significant coefficient - the numbers in parentheses are standard errors - on the real interest rate differential. Note, however, when testing for cointegration we find that we cannot reject the null hypothesis of a unit root. This result implies that q and (r - I*) are not cointegrated and that the results of eq. (12) could be spurious. Furthermore, if we make the real interest rate differential the dependent variable our con- clusions are unchanged.’ 3 The specifications used to test for cointegration are shown at the bottom of the table 2. The tests were run first examining whether nominal interest rates, actual inflation, and real exchange rates can be cointegrated. These tests are valid under the assumption that, in the long run, actual and expected inflation move together. We then tested whether the real interest rate differential and real exchange rates can be cointegrated. The final set of specifications include the cumulated current account scaled by GNP.14 r3The results for this regression are as follows: (r - r*) = - 45.30 + 9.97q, (4.50) (0.98) R= = 0.62, D W = 0.39, SER = 1.14, Cointegration test: - 2.63. “‘In an earlier version of this paper we also reported results from decomposing the real exchange rate into the nominal exchange rate and the respective price levels. The results from these tests were similar to those reported in table 2. Furthermore, tests using current account balance variables not scaled by GNP gave similar results [see Edison and Pauls (1991)]. H.J. Edison and B.D. Pauls. Real exchange and interest rates 179 Table 2 Engle-Granger cointegration tests; l974.Q3-1990.44.” Model Exchange rate 1 2 3 4 5 6 Expected injation proxy: Twelve-quarter centered moving average U.S.-Foreign G-10 - 2.30 - 2.57 - 2.48 - 2.81 - 2.58 U.S.Germany - 0.81 0.41 - 0.18 - 2.24 - 1.28 U.S.-Japan 0.04 0.64 0.02 - 2.79 - 2.63 U.S.-U.K. - 3.11 - 1.78 - 2.72 - 3.12 - 2.08 US-Canada - 1.72 - 1.69 - 2.34 - 1.66 - 2.38 - 2.53 - 1.88 - 1.82 - 2.41 - 3.00 U.S.-Foreign G-10 U.S.-Germany U.S.-Japan U.S.-U.K. U.S.-Canada Expected injation proxy: Four-quarter changes - 3.02 - 3.34 - 2.68 - 3.01 - 1.40 0.27 - 0.55 - 2.27 - 0.23 0.53 0.07 - 2.85 - 3.43 - 2.19 - 3.09 - 3.52 - 1.90 - 1.53 - 2.19 - 2.04 - 3.37 - 2.79 - 1.99 - 2.31 - 3.02 - 1.92 - 2.20 - 2.21 - 2.21 - 2.76 ‘For the trade-weighted dollar the data ended in 1990.43. The null hypothesis is that the series are not cointegrated. The critical values are given in Engle and Yoo (1987, table 2) and are as follows: No. of vars. 5% 10% 2 3.67 3.28 3 4.11 3.13 4 4.35 4.02 5 4.76 4.42 The models tested for the trade-weighted dollar are: Model 1: q = a, + a,i + a2i* + C(~B + CX~~* + u, Model 2: q = a0 + a,(i - i*) + a2(n - x*) + u, Model 3: q=aO-taI(r-r*)+u, Model 4: q = a0 + a,i + a2i* + a3n + a.+R* + a,(ccbal/gnp) + u, Model 5: q = a0 + a,@ - i*) + a2(x - n*) + a,(ccbal/gnp) + u, Model 6: q = a0 + aI(r - r*) + a,(ccbal/gnp) + u. The models for the bilateral exchange rates use the differential between U.S. and foreign cumulated current account shares of GNP. 
For the various specifications using the twelve-quarter centered moving average measure of expected inflation and the four-quarter change measure of expected inflation, it is not possible to reject the null hypothesis of no cointegration, which is similar to the results in Meese and Rogoff (1988).15 This result suggests that there does not exist a linear combination of real exchange rates and real interest rate differentials that is itself stationary, implying that there is no simple long-run relationship between the two variables (and/or their components broken out). As Meese and Rogoff suggest, it is most likely that one or more highly variable factors have been omitted from the real exchange rate-real interest rate relationship. We investigate this possibility by including various measures of the cumulated current account. Even after including these data, we consistently cannot reject the null hypothesis of noncointegration. These findings conflict with those of Blundell-Wignall and Browne, who report that real exchange rates are cointegrated with real interest rate differentials and the differential between cumulated current account balances as shares of GNP, which is one of the measures we investigate. We do find, however, very weak evidence in support of Coughlin's and Koedijk's result that the cointegration tests are time-period-sensitive.16 In running recursive, or expanding, cointegration tests for the trade-weighted dollar including the cumulated current account, we find that we can reject the null of noncointegration for sample periods ending from 1980.Q1 to 1982.Q3. We examined this possibility for several other cointegration regressions for different bilateral exchange rates and found that the Dickey-Fuller test statistic varied over time, but the conclusion drawn from the statistics remained unchanged: we could not reject the null of noncointegration. In summary, the cointegration tests do not find any conclusive evidence linking the real exchange rate to the components of the real interest rate differential. As we said earlier, this may be due to the omission of an important factor or, alternatively, it may be due to our test procedures.

15 We do not examine the series that involved the one-quarter inflation measures, as they were found to be I(0).
16 In contrast to Coughlin and Koedijk, we do not find evidence of cointegration for the mark/dollar rate, even over the longer time period.

5. Empirical results17

Note that Engle and Granger show that if a set of variables are cointegrated, then there always exists an error correction formulation of the dynamic model, and vice versa. This result suggests that the two approaches are isomorphic. In addition, error correction models give information about short-run dynamics; it is this information that distinguishes the two approaches. Also, not only does the error correction approach offer an alternative test of the existence of the equilibrium imposed by theory, but these tests often tend to be more powerful than the simple cointegration tests presented above.18 The rest of this section attempts to obtain an error correction model to test the hypothesis that there exists a relationship between real exchange rates and real interest rate differentials.

17 In this section we limit our investigation to the trade-weighted value of the dollar.
18 Banerjee et al. (1986) show that testing for cointegration using an error correction model, under the null that cointegration is valid, has more power than a typical test suggested by Engle and Granger.
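The recursive, or expanding-sample, cointegration tests mentioned above can be mimicked as in the sketch below; the variable names, the data file and the starting window are placeholders, and only the sample end-dates echo the 1980-1982 discussion in the text.

```python
# Minimal sketch of expanding-window Engle-Granger cointegration tests (assumed data/names).
import pandas as pd
from statsmodels.tsa.stattools import coint

df = pd.read_csv("dollar_data.csv", index_col=0, parse_dates=True)  # hypothetical file
start = "1974-07-01"                                                 # assumed first observation

for end in pd.period_range("1980Q1", "1990Q4", freq="Q"):
    sub = df.loc[start:end.to_timestamp(how="end")]
    if len(sub) < 30:                                   # skip very short samples
        continue
    # q regressed on (r - r*) and the cumulated current account share, then residual ADF
    t_stat, p_value, _ = coint(sub["q"], sub[["rdiff", "ccbal_gnp"]])
    print(end, round(t_stat, 2), round(p_value, 3))
```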
We first describe the models using the twelve-quarter centered moving average measure of expected inflation; in the following subsection we then examine the models based on the two other measures of expected inflation. The starting point for the dynamic modelling is a single equation using an autoregressive distributed lag model similar to eq. (8) in section 3.19 The goal of the specification search is to derive an error correction model such as eq. (9). In estimating these equations we introduce an impulse dummy variable around the dramatic increase in the dollar from 1984.Q1 to 1985.Q1. The dummy represents the unexplained run-up in the dollar - the so-called 'bubble' - and takes on values from 1 in 1984.Q1 to 5 in 1985.Q1. Column 1 of table 3 lists the coefficient estimates for eq. (8). The residual standard error is slightly above 2.3 percent. We reparameterize the changes in nominal interest rates and expected inflation as changes in real interest rates. Several exclusion restrictions are also applied, including the lagged change in the real exchange rate. Column 2 of table 3 gives the final specification using the twelve-quarter centered moving average inflation proxy. The estimated equation standard error is roughly that of the general model, and the joint F statistic for the validity of all the restrictions on the model is insignificant at any reasonable level.20 These results show that in the short run most of the movement in the real exchange rate is accounted for by the level of its own past and changes in foreign real interest rates. The stationary state shown at the bottom of the table indicates that in the long run real interest differentials are the important determinant of the real exchange rate. The estimated long-run elasticity of the real exchange rate with respect to the real interest differential is approximately 7 percent. The implied stationary state of this dynamic equation, however, is at odds with the results of the cointegration tests, which suggested that there was no simple long-run relationship between real exchange rates and real interest rate differentials.

19 It is well known that an autoregressive distributed lag model can be reparameterized with variables in levels and differences. See, for example, Harvey (1990, ch. 8.5) and/or Hendry, Pagan and Sargan (1984).
20 The F statistic between the final specification and the general specification is 1.11 and is distributed as F(11,40).

Table 3. Trade-weighted value of the dollar; twelve-quarter centered moving average measure of expected inflation; 1975.Q1 to 1990.Q3
General specification (coefficient, std. error):
dq(t-1): -0.004 (0.119); di: -0.003 (0.007); di(t-1): 0.005 (0.005); di*: 0.057 (0.016); di*(t-1): -0.021 (0.015); dpi: -0.010 (0.012); dpi(t-1): 0.027 (0.015); dpi*: -0.044 (0.025); dpi*(t-1): 0.016 (0.023); q(t-1): -0.241 (0.052); i(t-1): 0.010 (0.007); i*(t-1): 0.006 (0.014); pi(t-1): -0.014 (0.005); pi*(t-1): 0.009 (0.006); dtr84851: 0.017 (0.004); constant: 0.980 (0.236).
Final specification (coefficient, std. error):
(di* - dpi*): 0.039 (0.008); (r - r*)(t-1): 0.017 (0.003); q(t-1): -0.245 (0.037); dtr84851: 0.017; constant: 1.105 (0.168).
Summary statistics (general, final): R2 0.78, 0.71; SER 0.0231, 0.0234; parameter constancy 2.27, 1.81; autocorrelation test 3.13, 0.37.
Long-run stationary state (final specification): q = 4.51 + 0.069 (r - r*).
Notes: The dependent variable is the change in the log real exchange rate, dq. The parameter constancy test is a Chow test distributed as F(7, T - K - 7). The autocorrelation test is a Lagrange multiplier test for fourth-order residual autocorrelation distributed as F(4, T - K - 4). T = number of observations, K = number of regressors, d = first difference, dtr84851 = dummy variable (1984.Q1-1985.Q1: 1 to 5).
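One step table 3 leaves implicit is how the reported stationary state follows from the final specification: setting all differences (and the dummy) to zero and solving the levels part for q gives

\[
q = \frac{1.105 + 0.017\,(r - r^{*})}{0.245} \approx 4.51 + 0.069\,(r - r^{*}),
\]

which reproduces the long-run solution quoted at the foot of the table.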
We know from Banerjee et al. (1986) that results from the error correction model are more powerful if the null of cointegration is valid, but what we do not know is whether the null is correct. Our final specification of the dynamic model shows that the level of the real interest rate differential is statistically significant under that null hypothesis. However, we cannot impose a specific error correction term, as indicated by the level variables entering with statistically different coefficients.21 This result suggests the lack of a long-run relationship. Therefore, the findings from the dynamic model must be interpreted quite carefully: they do not corroborate the hypothesis that there is a long-run relationship between real exchange rates and real interest rate differentials.

21 We tested an error correction term which scaled the real interest rate term by the appropriate constant, but we still rejected the implied restriction.

5.1. Alternative inflation measures

The two alternative expected inflation measures are also used to evaluate the real exchange rate-real interest rate relationship, using a similar modelling methodology. Table 4 reports the final model for each of these expected inflation measures.22 The standard errors for each equation are about 0.2 percent lower than those of the final model reported in table 3.

22 For the one-quarter change model the F statistic between the final specification and the general specification is 0.5 and is distributed as F(7,40), while the F statistic for the four-quarter change model is 0.78 and is distributed as F(9,40).

Table 4. Final specifications: trade-weighted value of the dollar; alternative measures of expected inflation; 1975.Q1 to 1990.Q3
One-quarter changes (coefficient, std. error):
di*: 0.037 (0.009); dpi(t-1): 0.005 (0.001); (dr - dr*): 0.003 (0.001); q(t-1): -0.225 (0.035); (pi - pi*)(t-1): -0.013 (0.002); i(t-1): 0.017 (0.006); i*(t-1): -0.010 (0.005); dtr84851: 0.019 (0.004); constant: 0.945 (0.164).
Four-quarter changes (coefficient, std. error):
di*: 0.035 (0.008); (dr - dr*)(t-1): 0.012 (0.005); q(t-1): -0.233 (0.040); (r - r*)(t-1): 0.012 (0.003); i(t-1): 0.007 (0.002); dtr84851: 0.020 (0.004); constant: 0.988 (0.179).
Summary statistics (one-quarter, four-quarter): R2 0.78, 0.77; SER 0.0214, 0.0214; parameter constancy 1.36, 2.11; autocorrelation test 1.47, 0.74.
Long-run stationary states: q = 4.21 - 0.058(pi - pi*) + 0.080 i - 0.044 i* (one-quarter); q = 4.24 + 0.052(r - r*) + 0.030 i (four-quarter).
Notes: See tables 1 and 3 for variable definitions and explanations.

The final models derived for the two measures share a number of common features. In both instances, short-run changes in the real exchange rate are explained not only by changes in the real interest rate and the past level of the exchange rate, but also by changes in foreign interest rates.
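Restricted error-correction specifications of the kind reported in tables 3 and 4 can be estimated by OLS once the differences and lags are constructed; the sketch below is a hedged reproduction of the final twelve-quarter-proxy specification, with hypothetical column names and the 1984.Q1-1985.Q1 'bubble' dummy built as described in the text.

```python
# Minimal sketch of the final ECM-style specification of table 3 (assumed data/names).
import numpy as np
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("dollar_quarterly.csv", index_col=0, parse_dates=True)  # hypothetical file
# Assumed columns: q (log real rate), istar, pistar (foreign rate and inflation), r, rstar

d = pd.DataFrame(index=df.index)
d["dq"] = df["q"].diff()
d["d_istar_minus_d_pistar"] = df["istar"].diff() - df["pistar"].diff()
d["rdiff_lag"] = (df["r"] - df["rstar"]).shift(1)
d["q_lag"] = df["q"].shift(1)
d["dtr84851"] = 0.0
# Dummy runs 1..5 over 1984.Q1-1985.Q1 (assumes quarter-start date stamps)
d.loc["1984-01-01":"1985-03-31", "dtr84851"] = np.arange(1, 6)

d = d.dropna()
model = sm.OLS(d["dq"], sm.add_constant(d.drop(columns="dq"))).fit()
print(model.summary())
# Implied long-run solution: divide the level coefficients by |coefficient on q_lag|.
```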
The long-run stationary states of the models are also very similar. The implied long run shows that the components of the real interest rate differential have different effects on the real exchange rate; this is in contrast to the results in table 3, which uses the twelve-quarter centered moving average measure of inflation. However, the results in table 4 are similar to those in table 3 insofar as we cannot impose the error correction term combining q with the real interest differential. Consequently, even though these results appear at first blush to support the hypothesis that there exists a relationship between real exchange rates and real interest rates, they are not, in fact, consistent with the hypothesis.

6. Conclusion

The fundamental question this paper asks has two parts: (1) is there a systematic relationship between real exchange rates and real interest rate differentials, and (2) if so, what empirical representation of it does the data support? The model we present, as one would expect, suggests that there is good reason to believe that there should be a systematic relationship between the two variables. However, like other researchers, we cannot find a good empirical representation that is supported by the data.23 The results presented here for the trade-weighted value of the U.S. dollar, and for the value of the U.S. dollar against the Japanese yen, German mark, British pound sterling and the Canadian dollar, suggest that the respective real exchange rates and real interest rates, and most of their constituent series, are nonstationary. Yet, like other researchers, we cannot find a series or a set of series that are cointegrated with real exchange rates. In particular, real interest differentials using a twelve-quarter centered moving average measure of expected inflation are not cointegrated with real exchange rates, nor are nominal interest differentials and inflation differentials cointegrated with real exchange rates. These results are duplicated for various alternative measures of expected inflation and are robust to the sample period selected. Furthermore, the inclusion of cumulated current account balances does not reverse these results. In the final section of this paper we investigate a general dynamic specification for the trade-weighted value of the dollar in an attempt to derive an error correction model. Our final specifications of the dynamic models show that the level of the real interest rate differential (or its components) is statistically significant under the null hypothesis of cointegration. However, the cointegration test results suggest a lack of cointegration, and we cannot impose a specific error correction term, as indicated by the level variables entering separately. This result suggests the lack of a bivariate long-run relationship between real exchange rates and real interest rate differentials, in contrast to what the dynamic models might seem to suggest. The final interpretation of the empirical work is that the apparent relationship depicted in fig. 1 is not confirmed using standard statistical methods.

23 Baxter (1992) uncovered a strong correlation between real exchange rates and real interest rates at medium to low frequencies, which helps explain in part the visual evidence we presented. Like other researchers, she also fails to find a statistically significant relationship.
One or more highly variable factors most likely have been omitted from the relationship, as the charts for some of the bilateral exchange rates seem to suggest. One extension for future research might be to employ more powerful tests of cointegration, which allow for more than one cointegrating vector.24

24 Edison and Melick (1992) study these data using system cointegration techniques proposed by Soren Johansen. In contrast to the results cited above, that paper almost always identifies at least one cointegrating vector among the variables. However, in further testing the authors cannot empirically verify the theoretical models that show how exchange rates and interest rates are linked.

Appendix

Exchange rate
Trade-weighted value of the dollar (FRB Bulletin).
German mark/U.S. dollar (FRB Bulletin).
Japanese yen/U.S. dollar (FRB Bulletin).
British pound sterling/U.S. dollar (FRB Bulletin).
Canadian dollar/U.S. dollar (FRB Bulletin).

Interest rate25
Ten-year constant maturity rate on Treasury bonds (FRB Bulletin).
Trade-weighted average of yields on bellwether government bonds for foreign G-10 countries (various publications).
German bellwether government bonds (Bundesbank Monthly Report).
Japanese bellwether government bonds (Tokyo Stock Exchange).
British bellwether government bonds (Bank of England Quarterly Report).
Canadian bellwether government bonds (Bank of Canada Review).

25 The interest rate data are also available from the FRB publication 'Selected Interest and Exchange Rates - Weekly Series of Charts'.

Prices (CPIs)
U.S. (FRB Bulletin).
Trade-weighted average for the foreign G-10 countries.
Germany (Bundesbank Monthly Report).
Japan (Bank of Japan, Economic Statistics).
U.K. (CSO, Employment Gazette).
Canada (Bank of Canada Review).

Current account
U.S. (FRB Bulletin).
Germany (Bundesbank Monthly Report).
Japan (Japanese Economic Indicators, EPA).
U.K. (CSO, Economic Trends).
Canada (Bank of Canada Review).
To obtain the cumulated current account we assume for each country that the cumulated current account was zero in 1972.Q4 and accumulate the current account thereafter.

GNP
U.S. (FRB Bulletin).
Germany (Wirtschaft und Statistik).
Japan (Bank of Japan, Economic Statistics).
U.K. (CSO, Monthly Digest).
Canada (Canadian Economic Observer).

Expected inflation (created from CPI price indices)
Twelve-quarter centered moving average of CPI inflation rates.
One-quarter change in the CPI index (at an annual rate).
Four-quarter change in the CPI index.

References
Banerjee, A., J.J. Dolado, D.F. Hendry, and G.W. Smith, 1986, Exploring equilibrium relationships in econometrics through static models: Some Monte Carlo evidence, Oxford Bulletin of Economics and Statistics 48, 253-277.
Baxter, M., 1992, Real exchange rates, real interest differentials, and government policy: Theory and evidence, Mimeo.
Boughton, J.M., 1987, Tests of the performance of reduced-form exchange rate models, Journal of International Economics 23, 41-56.
Blundell-Wignall, A. and F. Browne, 1991, Increasing financial market integration: Real exchange rates and macroeconomic adjustment, OECD working paper (OECD, Paris).
Campbell, J.Y. and R.H. Clarida, 1987, The dollar and real interest rates, Carnegie-Rochester Conference Series on Public Policy 27, 103-140.
Coughlin, C.C. and K. Koedijk, 1990, What do we know about the long-run real exchange rate?, St. Louis Federal Reserve Bank Review 72, 36-48.
Danker, D. and P. Hooper, 1990, International financial markets and the U.S. external imbalance, International finance discussion paper no. 372 (Board of Governors of the Federal Reserve, Washington, DC).
Dickey, D.A. and W.A. Fuller, 1981, Likelihood ratio statistics for autoregressive time series with a unit root, Econometrica 49, 1057-1072.
Edison, H.J. and B.D. Pauls, 1991, Re-assessment of the relationship between real exchange rates and real interest rates: 1974-1990, International finance discussion paper no. 408 (Board of Governors of the Federal Reserve, Washington, DC).
Edison, H.J. and W.R. Melick, 1992, Purchasing power parity and uncovered interest rate parity: The United States 1974-1990, International finance discussion paper no. 425 (Board of Governors of the Federal Reserve, Washington, DC).
Frankel, J., 1979, On the mark: A theory of floating exchange rates based on real interest differentials, American Economic Review 69, 610-622.
Engle, R. and C. Granger, 1987, Co-integration and error correction: Representation, estimation, and testing, Econometrica 55, 251-276.
Engle, R. and B.S. Yoo, 1987, Forecasting and testing in co-integrated systems, Journal of Econometrics 35, 143-160.
Harvey, A.C., 1990, The econometric analysis of time series (MIT Press, Cambridge, MA).
Hendry, D.F., A.R. Pagan, and J.D. Sargan, 1984, Dynamic specification, in: Z. Griliches and M. Intriligator, eds., Handbook of econometrics (North-Holland, Amsterdam) 1023-1100.
Hooper, P. and J. Morton, 1982, Fluctuations in the dollar: A model of nominal and real exchange rate determination, Journal of International Money and Finance 1, 39-56.
Isard, P., 1982, An accounting framework and some issues for modelling how exchange rates respond to the news, International finance discussion paper no. 200 (Board of Governors of the Federal Reserve, Washington, DC).
Meese, R., 1990, Currency fluctuations in the post-Bretton Woods era, Journal of Economic Perspectives 4, 117-134.
Meese, R.A. and K. Rogoff, 1988, Was it real? The exchange rate-interest rate relation, 1973-1984, Journal of Finance 43, 933-948.
Shafer, J. and B. Loopesko, 1983, Floating exchange rates after ten years, Brookings Papers on Economic Activity 1, 1-70.

EmersonHendry-1996-AnEvaluationOfForecasting-JFore-v15n4

Journal of Forecasting, Vol. 15, 271-291 (1996)

An Evaluation of Forecasting Using Leading Indicators

REBECCA A. EMERSON, Bank of England
DAVID F. HENDRY, Nuffield College, Oxford, UK

ABSTRACT
We consider the use of indices of leading indicators in forecasting and macro-economic modelling. The procedures used to select the components and construct the indices are examined, noting that the composition of indicator systems gets altered frequently. Cointegration within the indices, and between their components and macro-economic variables, is considered, as well as the role of co-breaking in mitigating regime shifts. Issues of model choice and data-based restrictions are investigated. A framework is proposed for index analysis and the selection of indices, and is applied to the UK longer-leading indicator. The effects of adding leading indicators to macro models are considered theoretically and for UK data.
KEY WORDS: forecasting; leading indicators; cointegration; co-breaking; macro-economic modelling

INTRODUCTION

There has been a revival of interest in using leading indicators of economic activity to forecast a variety of economic time series: see, inter alia, Artis et al. (1993), Diebold and Rudebusch (1989, 1991a, b), Koch and Rasche (1988), Lahiri and Moore (1991), Neftçi (1979), Parigi and Schlitzer (1995), Samuelson (1987), Stock and Watson (1989, 1992), Weller (1979) and Zarnowitz and Braun (1992). This seems to be partly a reaction to perceived forecasting failures by macro-econometric systems and partly due to developments in leading-indicator theory. Most work has been done on the US indicator system, perhaps due to the importance placed on leading indicators in the USA by the popular press. For earlier work, see e.g. Moore (1961) and Shiskin and Moore (1968). An indicator is any variable believed informative about another variable of interest; an index is a weighted average of a set of component indicators. The weights may be fixed, as in a Laspeyres index for (say) real gross national product (GNP), or changing, as in a Divisia index such as that for retail prices (RPI), and may be based on 'natural' choices such as value shares (like the RPI or FTSE) or be ad hoc (as when weighting GNP, inflation and unemployment in an index of 'economic activity'). A leading indicator is any variable whose known outcome occurs in advance of a related variable that it is desired to forecast; a composite leading index (CLI) is a combination of such variables. We will be concerned with the selection of indicators, methods of constructing indices from these, and the theoretical forecasting properties such indicators and indices may have, either in isolation or as a component time series incorporated in another forecasting procedure, such as a vector autoregression (VAR). Our analysis is prompted by the 'real-time' evaluations of the success of leading indicators in forecasting in Diebold and Rudebusch (1991b) and Stock and Watson (1992). These authors show a marked deterioration in practice of post-sample performance relative to the within-sample findings claimed for CLIs. We present an analysis of why such an outcome might be anticipated in general in non-stationary time series (both integrated and subject to regime shifts), and suggest that there are also dangers in attempting to include CLIs in other forecasting procedures. The Harvard A-B-C curves were the earliest construction meant to serve as a prediction system (see Persons, 1924), but fell into disrepute after claims that they failed to predict the 1929 crash and subsequent depression (see Samuelson, 1987). Indicators that occurred in early work included the output of pig-iron and ton-miles of rail shipments, neither of which would be selected today: the dangers in using the largest ex post lagged correlations to forecast are well illustrated by Coen, Gomme and Kendall (1969). The current system of business cycle indicators began with the seminal work of Mitchell and Burns (1938) and Burns and Mitchell (1946). The criticism in Koopmans (1947) that Burns and Mitchell's work on business cycles and their indicators lacked a micro-economic theory basis often recurs (see e.g.
Auerbach, 1982), as does his complaint that the notion of the reference cycle lacked a secure definition, although Vining (1949) offered a robust defence of their approach in terms of model discovery (see Hendry and Morgan, 1995). In practice, the composition of indicator systems gets altered rather frequently, suggesting that elements do not lead systematically for prolonged periods. The list of publications in our references by the UK Central Statistical Office (CSO) documents the many changes to their system since 1975. Our analysis of indicators and index summaries thereof considers five aspects (see Emerson, 1994, for background and bibliographic perspective). First, we consider the theoretical foundations for the indices. Second, we discuss some procedures used to select the components. Next, we examine methods of constructing the indices for possible information losses. Then issues of cointegration within the indices and between the indicators (especially for the current CLIs in the UK) are discussed, and it is shown that non-cointegration between indices and target variables leads to inefficient forecasts. Lastly, we consider the impact of regime shifts on the effectiveness of indicators, using the concept of co-breaking, analogous to cointegration. We then investigate the use of CLIs in macro-economic modelling and forecasting from four angles. First, model choice issues may become important when preselected combinations of variables are added to models. Next, data-based restrictions analogous to principal components are employed when the indices are used, but tests of these may be feasible. Third, the unstable composition of indices over time may not be correctly reflected by in-sample tests of their value added for ex ante forecasting, particularly when indices are specifically constructed to lead aggregate economic activity ex post. Finally, we consider the degree of integration of the indicators and of cointegration between the indices and the variables in the macro-economic model. However, the claim that CLIs specifically predict 'turning points' is not considered here: we are concerned with systematic 'leading' at all points of the cycle. The overall objective is to consider the theory of leading indicators and their potential role in general forecasting in the context of an integrated-cointegrated system subject to regime shifts. The paper proceeds as follows. The next section explains the construction of the UK indices of leading indicators. The third section considers some of the current analytic bases for CLIs, and notes some potential problems when using leading indices in forecasting. The fourth section develops a more formal framework for constant-parameter processes, while the fifth section considers generalizations, including to non-constant parameters. The sixth section undertakes an empirical analysis of the UK longer-leading indicator. The penultimate sections then discuss possible problems with adding leading indicators to econometric models for forecasting, and apply that analysis to a small monetary model of the UK. The last section presents conclusions.

CURRENT UK COMPOSITE LEADING INDICATORS

In this section we briefly describe the current CLIs in the UK, their components and their method of construction. The present (as of January 1994) cyclical indicators for the economy of the United Kingdom have been in place since July 1993.
The CSO has two leading indicators for the UK economy: one which is constructed to lead by a year or more (the longer-leading indicator), and one which is intended to have a median lead of six months (the shorter-leading indicator). They are calculated and published monthly in Economic Trends. Using gross domestic product as the measure of aggregate economic activity, the individual indicators are chosen using a computer program developed by the National Bureau of Economic Research (NBER) in the United States. The selection criteria are similar to those used in the USA.1 The current components of the longer-leading indicator I_Lt^UK are: the three-month rate of interest on prime bank bills (inverted), denoted R3_t; total dwelling starts for Great Britain, S_t; the inverted yield curve (R3_t - R_t, where R_t is the twenty-year bond rate); the financial surplus/deficit of industrial and commercial companies in 1990 prices (deflated by the GDP deflator), D_t; and the optimism balance from the Confederation of British Industry (CBI) survey, O_t. The components of the shorter-leading index, I_St^UK, are: new car registrations, N_t; the balance of new orders over the past four months from the CBI survey, B_t; the consumer confidence index from the EC/Gallup survey, G_t; the change in consumers' outstanding borrowing in 1990 prices (deflated by the GDP deflator), L_t; and The Financial Times Actuaries' 500-share index of common stock prices, F_t.

1 For the exact list of criteria, see Central Statistical Office (1975). In the US, the NBER computer program selects a set of indicators based on a scoring system giving points for (to quote) 'cyclical timing, economic significance, statistical adequacy, conformity to the business cycle, smoothness, prompt availability (currency), and revisions.' From the list of potential individual indicators, a smaller number is chosen, based mainly on timing, but also on the sector of the economy from which they originate, to construct the composite index.

[Figure 1. Time series of UK quarterly leading indicators]

The quarterly components of the longer-leading indicator are shown in Figure 1 (the monthly components of the indices are not reported as they are not modelled, but have similar 'cyclical' patterns, whereas those in Figure 1 are quite disparate). Figure 2 compares the time series of the UK and US composite leading indices to illustrate the considerable differences in their trajectories.

[Figure 2. Comparison of shorter (SLI) and longer (LLI) UK and US leading indicators]

The composite leading index is constructed from its components as follows.2 The observed or interpolated individual monthly series are detrended, smoothed by centred moving averages, scaled so that their amplitudes are approximately equal, then weighted, perhaps after 'inversion' so that all weights are positive, such that no single series dominates the index. The composite index takes the value of 100 in the base month.3

2 For an exact description, see Central Statistical Office (1983) or Moore (1993).
3 This is mid-1991 for the longer-leading indicator and mid-1992 for the shorter-leading indicator.

The index is extended from the base month by adding the average proportional change in the components' values, using formula (1),
where I_m^UK is the index in month m, A_rm is the value of the rth component in month m, and n is the total number of components. The longer-leading index is analysed below.

ANALYSIS OF INDICES OF LEADING INDICATORS

In this section we examine potential problems of composite indices of leading indicators in forecasting.

Foundations for leading indices

Those who construct leading indices justify them on several grounds. First, it is intended that in a properly constructed index much of the noise in individual indicators is eliminated by averaging. For example, when data errors are independent, using many series together can strengthen their signal. Alternatively, if one indicator 'fails' on one occasion, then averaging it with others need not cause the whole system to fail. If it keeps 'failing', an indicator will eventually be taken out of the composite index. A reason usually given for averaging is that the components originate from many sectors of the economy; thus, the CLI should work under different causes of cyclical fluctuations (see e.g. Zarnowitz and Boschan, 1977a, p. 173). Although such reasons for the current composite indices are sensible, they are mainly reasons for using indices as opposed to selecting specific indicators. One cannot be sure, when the current cycle originates from one sector of the economy and that particular sector is represented by one indicator in the index (say), that the movement of that sector will be reflected in the index, as opposed to averaged out.

Selection procedure of the components

The components of the UK CLIs are selected using the system of 'scoring' economic variables developed by Moore and Shiskin (1967) for the NBER. This entails giving each economic series points out of 100, based on certain characteristics. Scores are given for six broad categories. A possible maximum of 20 points is awarded for each of the categories of economic significance, statistical adequacy, conformity (to the business cycle), and timing. A possible maximum of 10 points is awarded for each of the categories of currency and smoothness. Although this system of scoring economic variables is implemented with the aim of using the highest-scoring variables, in practice human judgement may be needed to make the final selection of variables (see e.g. Zarnowitz and Boschan, 1977b). In general, potential components of CLIs appear to be chosen by having high bivariate correlations with the variable they are intended to lead. Given the objective of constructing a CLI, this method may have merit, since the usual concepts of (say) omitted variable bias become irrelevant. Later sections consider outcomes in terms of multivariate relations, as these are the relevant attribute when econometric modelling, but the robustness to structural change may differ between the two selection methods.

Constructing leading indices

The construction process for CLIs emphasizes their systematically leading, but does not draw upon the time-series properties of the components. For example, data dynamics, cointegration and non-constancy are all relevant features (and are considered below). Since aggregate economic activity is autoregressive, it is partly forecastable, although few CLIs seem to make use of this attribute. In addition to excluding lags of aggregate economic activity from leading indicators, there is little justification for only using contemporaneous values of the components. Thus, CLIs as currently constructed impose strong restrictions on dynamic models relative to other economic forecasting methods, so this issue is analysed in detail in the next section. The method used to construct the indices also has implications for issues other than their forecastability. For example, the lack of formalization currently precludes confidence intervals being constructed by statistical agencies.
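Equation (1) itself is not legible in this copy, so the following is only a hedged sketch of the average-proportional-change extension described above; the multiplicative form, the component names and the data source are assumptions, not the CSO's exact formula.

```python
# Minimal sketch of extending a composite index by the average proportional change
# of its components (assumed form of the CSO-style construction; not the exact formula (1)).
import pandas as pd

def extend_index(components: pd.DataFrame, base_value: float = 100.0) -> pd.Series:
    """components: monthly DataFrame, one column per detrended, smoothed, scaled and
    sign-adjusted indicator; returns a composite index equal to base_value in month 0."""
    avg_prop_change = components.pct_change().mean(axis=1)   # (1/n) * sum of proportional changes
    return (1.0 + avg_prop_change.fillna(0.0)).cumprod() * base_value

# Usage with hypothetical component names:
# cli = extend_index(df[["R3_inverted", "S", "yield_curve_inverted", "D", "O"]])
```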
Ex post and ex ante efficiency of the indices

Because of the method of component selection and index construction, the indices are ex post efficient, but need not be ex ante efficient for prediction. In terms of their selection criteria, the indices are optimal when they are first introduced, since they are based on realized data. Thus, they are ex post efficient, subject to possible mistakes in selection, compilation, etc. However, the composition of the indices is altered frequently due to prediction deterioration. For example, Diebold and Rudebusch (1991a, b) demonstrate that the current and past US CLIs are not ex ante efficient due to this difficulty. Also, Stock and Watson (1989, 1992) demonstrated that some alternative CLIs constructed for real-time forecasting by a more formal probabilistic approach were also ex post inefficient.

Issues of integration and cointegration

Methods of constructing CLIs rarely take formal account of the orders of integration of the components (e.g. whether aggregate economic activity is I(1) or I(0)), whether subsets of the components are cointegrated, or whether the components are cointegrated with aggregate economic activity.4 Our framework explicitly allows for integrated-cointegrated time series, since whether or not a variable has a unit root must influence how any model involving it is constructed, how inference is conducted, and how the target process is forecast. We deal with these in turn. The primary result (see Granger, 1986) is that an efficient forecast must be cointegrated with the outcome, and hence there must be Granger (1969) causality in at least one direction. Consequently, if a forecast is not cointegrated with the outcome, it cannot be efficient at any horizon. Thus, indicators that are not cointegrated with targets must be inefficient. Conversely, if they are cointegrated with targets, they must be related by Granger causality. These seem valuable attributes even if no unique mean-square forecast error ranking (say) is possible at short horizons between cointegrated and non-cointegrated indicators, due to estimator uncertainties. In particular, cointegrating vectors could be useful leading indicators by themselves, and their automatic omission potentially detracts from forecast accuracy. Next, the presence of mixtures of I(0) and I(1) variables can complicate inference (including the choice of appropriate critical values: see e.g. Sims, Stock and Watson, 1990), and remains prone to 'nonsense regressions' problems in the absence of cointegration (see Banerjee et al., 1993). This argues for reducing all components of indicators to a common degree of integration, selected to ensure cointegration with the target. Finally, if a CLI is cointegrated with the levels of a set of variables such as consumption and income (c and y), then it cannot be cointegrated with I(0) linear combinations of these (e.g. s = y - c) when c and y are I(1) and s is I(0).
Thus, once data are integrated, care is required in specifying precisely what variable the CLI is intended to lead if the forecasts are not to be inefficient - or even irrelevant - for a given choice.

4 See Nelson and Plosser (1982) and Campbell and Mankiw (1987) for discussions of the empirical frequency of unit roots and whether (e.g.) US GNP has a unit root. We define all stationary series to be mutually cointegrated.

The UK shorter- and longer-leading indicators are mainly constructed from differenced data, but contain series which might be integrated of order one. In the sixth section we consider whether the information contained in the UK components is efficiently used, and find that, empirically, both UK indices allow the unit-root hypothesis to be rejected. This is also true of the cointegrating vectors, but the latter seem to produce better forecasts for the data set used here.

Co-breaking

When regime shifts occur, there are liable to be structural breaks in relationships. Such breaks may or may not be related across variables, in an analogous way to cointegration. However, when breaks between series in a forecast relation and their putative target are unrelated, predictive failure will ensue. To avoid this significant problem, the breaks must be accounted for by the forecasting procedure. The concept of co-breaking in Hendry (1995) considers conditions under which regime shifts vanish for linear combinations of variables, which thereby do not depend on the breaks. Such an outcome entails either a coincidentally equal effect or a genuine relationship: across many breaks, the former seems unlikely, leading to poor forecasts. This highlights the key distinction between an indicator - which is non-causal of, and non-caused by, the target - and a causally related variable. The former is unlikely to experience systematically the same shifts as the target. Thus, although CLIs and econometric models face similar problems for ex ante forecasting in a world of regime shifts, the cards seem heavily stacked against CLIs when the mappings of the indicators to the outcomes are not causal relations. The work by Diebold and Rudebusch (1989, 1991a, b) and Stock and Watson (1989, 1992), and the dynamic nature of the changes made to the UK CLIs, point to the fact that the data generation process is not constant. That is, the relationships between aggregate economic activity and the constituent variables, and within the set of constituent variables, are not constant over time. For example, the latest US recession was not preceded by a large downturn in monetary variables, and thus CLIs which relied heavily on that relationship had problems picking up the downturn (see Stock and Watson, 1992). The recent behaviour of unemployment in the UK leading the recovery may be another example (usually it is a 'lagging indicator'). Of course, non-constant processes pose problems for all forecasting devices at times when parameters alter. The primary advantage of an econometric system is to explain why breaks occur, and progressively move towards more robust (deeper) parameterizations than those which directly characterize the phenomenological level. The converse difficulty for leading-indicator methods is that breaks can occur for reasons that would not affect an econometric model with co-breaking relations. For example, the demand for M2 depends on relative rates of return, and hence alters for portfolio reasons as well as transactions changes: the demand function could be stable for a regime shift in interest rate policy, yet the correlation with GNP be unstable.
For example, the demand for M2 depends on relative rates of return, and hence alters for portfolio reasons as well as transactions changes: the demand function could be stable for a regime shift in interest rate policy, yet the correlation with GNP be unstable. We suspect the absence of co-breaking is the main reason for predictive failure in indicators that lack a causal relation to the target. Indicators with a causal basis, that are both cointegrated with and co-break with the target, will maintain constant relationships with that target and hence provide a useful forecasting procedure, but one that is tantamount to an econometric model. A CONSTANT-PARAMETER FRAMEWORK FOR INDEX ANALYSIS The assembly procedure used for composite leading indicators (described above) makes them essentially a restriction on the vector of components. In this section, therefore, we consider an 218 Journal of Forecasting Vol. 15, Iss. No. 4 integrated dynamic system with constant parameters to investigate the consequences of constructing CLIs using only contemporaneous values of variables. The next section generalizes the analysis to parameter changes and longer lags. Let x, denote a vector of n observable economic magnitudes, i , an r x 1 vector of components of the index (the individual leading indicators), g, the leading index, and a, = A’x, the index of economic activity, where R is an n x 1 vector of coefficients. We assume that the index weights are fixed for simplicity, so: g, = #i, (2) where @ is an r x 1 vector of weights. Finally, i , is a subset of x, given by i ,= Sx, where S is r x n, and time units are determined by the highest frequency involved in the if. For analytical tractability, we take the data generation process (DGP) to be the first-order I (1) vector autoregression: x, = t + rx,- , + u, (3) where u,- IN[O, Q] and Y has no eigenvalues outside the unit circle: Ionger lags complicate, but do not alter the principles of, the analysis. Under the hypothesis that there are p cointegrating combinations of x,: r = I + a p (4) where a and /? are n x p matrices of rank p , such that B)x, is I(0) (see Johansen and Juselius, 1990, and Banerjee et al., 1993, for an exposition). Letting t = y - up, reformulate equation (3) in I(0) space as: Ax, = y + a(B’x,-, - p) + u, ( 5 ) where E[Ax,]= y ( n x 1) and E[/3’x,] = p ( p x 1) so that @ y = O (which imposes ( n - p ) restrictions). The vector equilibrium correction mechanism (VEqCM) in equation (5) is the focus of our analysis, but we also consider the issues that arise when (z,Y, B) are not constant over time. Activity index We are not concerned with the mechanics of how an activity index might be selected (discussed above), but with the properties of such indices, given A. Then a, is a reparameterization and reduction of x, defined from equation (3) by: a,= x X , = x t + x r x , - , + x u , =Po+ p,a,-, + 17, ~ , = A ’ ( ~ - ~ , I ~ ) X , - ~ + x U , = a y [ i -p,11,,+ u ~ ) x , - , + x u , (6) (7) where po = A’z, and p, is determined by minimizing the expected sum of squares of ( v,) in: Thus, the index has a first-order autoregressive component, but in general 7, will be autocorrelated as it contains linear combinations of x,- ,. More fundamentally, we must determine when ( a , } is I(0) or l(1). First, when the weights are a linear combination of (a subset of) the cointegrating vectors, so R = /3k where k is p x 1, then: (8) and hence [ q , ) will be I(0) irrespective of the value of p, . 
Second, when λ ≠ βk, {η_t} will be stationary only if ρ_1 = 1, in which case the activity index will be a random walk with drift (plus a potentially autocorrelated error). In words, either the items in the index cointegrate using the weights λ, so {a_t} is stationary, or the index will be I(1), so its autoregressive representation must have a unit root. This result is unaffected by continually resetting a_t = 100 in a base period. Such an outcome places tight restrictions on admissible combinations of levels of variables that can be used in any stationary index: otherwise only the change in the index will be I(0). Moreover, when stationarity is desired, this analysis suggests using recent multivariate cointegration approaches to determine the activity index. Detrending a_t (removing a linear deterministic trend) will not suffice to remove the unit root, which induces a stochastic - not a deterministic - trend: see Phillips and Durlauf (1986). However, differencing a_t is sufficient for it to be I(0) when the DGP is I(1), given the representation in equation (5), but will be unnecessary when λ = βk. The covariance of the activity index with any element x_it = j_i'x_t of x_t is given by (ignoring means):

E[a_t x_it] = E[λ'x_t x_t' j_i]

This is well defined only when λ = βk and x_t is stationary, or when a_t and x_it are both I(1) but cointegrate. When x_t is I(1), the index is I(0), and its covariance with Δx_it is of concern, then:

E[a_t Δx_it] = k'E[β'x_t Δx_t']j_i = k'(E[β'x_t x_{t-1}'β]α' + β'Ω)j_i = k'Rα'j_i + k'β'ω_i

where R = E[β'x_t x_{t-1}'β] is the first-order autocovariance matrix of the cointegrating vectors, and ω_i is the ith column of Ω.

Leading index

Next, from equation (3), since i_t = Sx_t:

i_t = Sx_t = Sτ + SΥx_{t-1} + Su_t = Sτ + Γi_{t-1} + w_t    (9)

w_t = (SΥ - ΓS)x_{t-1} + Su_t    (10)

where Γ derives from 'minimizing' the expected matrix sum of squares of {w_t}. The choice of the indicators could retain or lose cointegration, and could have a VAR or a VARMA representation accordingly. For simplicity, we assume that a good choice of S has been made (SΥ = ΓS), so that equation (9) is the VAR:

i_t = Sτ + Γi_{t-1} + w_t    (11)

where w_t ~ IN[0, Σ] with Σ = SΩS'. Further:

g_t = φ'i_t = φ'Sx_t = ϕ'x_t    (12)

and, as a reparameterization and reduction of x_t, a similar analysis to that for a_t applies, with the added constraint that g_t should lead either a_t or the relevant components of x_t. In particular, when a stationary CLI is desired, ϕ = βh is required. From the Granger representation theorem (see e.g. Engle and Granger, 1987), cointegration entails Granger (1969) causality in at least one direction, so some predictability must result from the use of cointegrating combinations. Indeed, when ϕ = βh, for Δx_t:

E[g_{t-1}Δx_it] = h'E[β'x_{t-1}Δx_t']j_i = h'E[β'x_{t-1}x_{t-1}'β]α'j_i = h'Vα'j_i

where V = E[β'x_{t-1}x_{t-1}'β] is the variance matrix of the cointegrating vectors. This is well defined, but the resulting correlation could take any value in the interval (-1, 1). Moreover, the covariance need not be the best that can be achieved. When i_t satisfies the VAR in equation (11), we first consider the conditions under which g_t is a sufficient summary of the component indices. Then we analyse its role relative to the original VAR (3), and consider its ability to predict future variables or indexes.
Let Q be an r x r non-singular matrix such that:

Q = [φ' ; Q*]    (13)    and so    Qi_t = [g_t ; i_t*]    (14)

where Q* is (r - 1) x r and i_t* = Q*i_t. From equations (11) and (12):

Qi_t = QSτ + QΓi_{t-1} + Qw_t = γ* + C Qi_{t-1} + Qw_t

where C = QΓQ^{-1}. Then, partitioning C conformably with (g_t, i_t*) into blocks c_11, c_12, c_21 and c_22, g_t is a sufficient summary of i_t for prediction if c_12 = 0 and c_22 = 0; otherwise, i*_{t-1} also contains relevant information for predicting i_t. These are testable conditions. Conversely, necessary conditions for g_t to predict elements of i_t are that c_11 ≠ 0 and/or c_21 ≠ 0. Generalizations of this approach apply below. Next, without loss of generality, x_t can be reparameterized in terms of the (non-overlapping) items for the putative CLI, the remaining indicators, the activity index and the rest of the (perhaps combinations of) x_t as:

Px_t = [g_t ; i_t* ; a_t ; Ψx_t]    (15)

where P is an n x n non-singular matrix such that |P| = 1, and Ψ is (n - r - 1) x n. From equation (3), x_t follows a cointegrated autoregressive process, so substituting equation (3) into (15) yields:

Px_t = Pτ + PΥx_{t-1} + Pu_t = Pτ + (PΥP^{-1})Px_{t-1} + Pu_t = τ* + BPx_{t-1} + v_t*    (16)

where B = PΥP^{-1} = I_n + Pαβ'P^{-1}, so that B - I_n = Pαβ'P^{-1} = α*β*' = Θ. Consequently, the system can be written in levels (17) and hence in an I(0) form (18), where β* = P'^{-1}β and Θ is partitioned conformably with (g_t, i_t*, a_t, Ψx_t) into blocks θ_ij. The forecasting objective may be to predict future values of levels or changes in a_t or elements of x_t (or functions thereof), including i_t. Any or all of x_{t-1}, or the linear combinations i_{t-1}, a_{t-1} or g_{t-1}, could be used for forecasting. We consider the various possibilities in turn, focusing on predicting changes, from which levels can be obtained by integration. For one-step forecasts the results are invariant, but it matters which functions are selected for longer horizons (see Clements and Hendry, 1993). From equation (18), the index g_t is potentially a CLI for predicting future values of elements derivable from ΨΔx_t when the relevant elements of θ_41 are non-zero. The CLI is a sufficient statistic for future ΨΔx_t when θ_42 = 0, θ_43 = 0 and θ_44 = 0 as well, which hypotheses are testable. However, when θ_42 ≠ 0, θ_43 ≠ 0 and/or θ_44 ≠ 0, the omitted variables may attenuate or enhance the marginal predictive value of g_{t-1} when it alone is used. Similar comments apply to forecasting Δa_t: g_t is potentially a CLI when θ_31 ≠ 0. Using only the index g_{t-1} instead of the entire vector x_{t-1} involves no loss of information for predicting future Δa_t when θ_32 = 0, θ_33 = 0 and θ_34 = 0, which again are testable assumptions given a specification of a_t and g_t.
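The testable 'sufficient summary' conditions above amount to exclusion restrictions, and an equation-by-equation analogue can be checked with a standard F test; the sketch below is a hedged single-equation illustration with hypothetical file and column names (it does not reproduce the full system test).

```python
# Minimal sketch of testing whether a composite index is a sufficient one-step predictor,
# i.e. whether lagged components add anything beyond the lagged CLI (assumed data/names).
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("indicators.csv", index_col=0, parse_dates=True)   # hypothetical file
target = df["activity"]                                             # variable to be led
g_lag = df["cli"].shift(1).rename("g_lag")
comp_lags = df[["R3", "S", "D", "O", "yield_curve"]].shift(1).add_suffix("_lag")

data = pd.concat([target, g_lag, comp_lags], axis=1).dropna()
restricted = sm.OLS(data["activity"], sm.add_constant(data[["g_lag"]])).fit()
unrestricted = sm.OLS(data["activity"], sm.add_constant(data.drop(columns="activity"))).fit()

f_stat, p_value, df_diff = unrestricted.compare_f_test(restricted)
print(f"F({int(df_diff)}, {int(unrestricted.df_resid)}) = {f_stat:.2f}, p = {p_value:.3f}")
```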
We assume that the modeller knows the value of d = (do : d l ) ' , and let .'fl; be the conditional expectation forecast of aTCl given by: .x a, = do + d,g,-, + e, However, the optimal conditional forecast of x T c l from equation (3) is: and thus of aT+] is: The forecast error from using equation (20) with known parameters is: and therefore the conditional expected forecast error is: which is non-zero unless considerable (chance) cancellation occurs. The resulting biases are due to the restrictions on how the component indices enter the CLI, the absence of lagged values of the activity index, and the omission of useful predictors YxT together with the consequentially biased coefficients in equation (19). 282 Journal of Forecasting Vol. 15, Iss. No. 4 Selecting a CLI An alternative application of this framework is to select the ‘optimal’ composite leading indicator for forecasting a,. Since a, = A‘x,, consider the linear transformation of x , - ~ from equation (3) that maximizes predictability. From equation (6), this is: (25) a,=po+q‘x,- I + e ,=pO+ 1,-1 + e, where q = Y ‘ A and l , - l = ~ ’ x , - l is defined to be the ‘best’ leading indicator as it leaves an innovation error I eT) : * * E [ e ~ ~ x , ~ l l = O w i t h E [ e , * 2 ~ x , ~ l l = A’nA (26) This formulation highlights that the best weights are 11, but leaves the usual selection problems of which variables to include and how to estimate the weights. Moreover, it reveals that howsoever the weights are selected, their sampling uncertainty should be included in measures of forecast uncertainty associated with any CLI method. We suspect that one difficulty is that different ‘indicators’ are relevant at different points in time, and which will be relevant is as hard to predict as the original problem of forecasting the target outcomes. To visualize one difficulty of using CLIs, as presently implemented, partition /3 into /I1 and p2, where fl’,x,-, - ,ul is non-zero over the selection period, but essentially zero over the forecast horizon, and B;x,-, - p,behaves in the opposite manner, so is not selected as part of the CLI within sample, although both PI and p2 are far from zero. Then the forecasts could be badly out-in the simplest case from equation (5): Interestingly, cointegration analysis is more likely to have detected the near zero in-sample combinations (B;x,- I) as cointegrated and used these in forecasting. GENERALIZATIONS Non-constant parameters The previous section established the framework, and most of its implications generalize to the empirically realistic case of a process with non-constant parameters. Consider the DGP as given instead by: x, = Z, + r,x,- l + W, (30) The algebra closely follows that already sketched with the original parameters indexed by time. If a modeller estimates equation (19), the forecast will be sub-optimal because of the earlier reasons, augmented by the parameter non-constancy. Parameter non-constancy within-sample is testable, but present methods of constructing CLIs do not appear to check this key aspect. We discussed the general issue of the impact of non-constancy on modelling and forecasting above, R. A. Emerson and D. F. Hendry Forecasting Using Leading Indicators 283 and reiterate our view that unless correlations have a more substantive basis than that they were observed over a past period, they offer little prospect of delivering a reliable forecasting method. 
A DGP such as that in equation (30) explains why the composition of CLIs may need regular revision, but by eschewing attempts to understand the evolution of the ( c r : r,) in the space of the [x,], will not induce a progressive accumulation of knowledge. Parameter non- constancy also induces predictive failure in econometric equations, which offer no more nor less protection against it at any point in time, but do provide a framework for consolidating and improving understanding to reduce likely future failures. Clements and Hendry (1994) consider the effects of structural breaks on econometric models in more detail. Longer lags Allowing more lags in the DGP complicates the algebra without adding greatly to the analysis. For example, extend the DGP to: (31) Ax, = 2 + a( /3 'x , - , ) + AL\x,-, + u, This points up the drawbacks of using only contemporaneous variables in a CLI, since lagged changes also have predictive power when A # 0. Indeed, if cointegration was weak, only lagged changes would matter. Conversely, as considered above, if lagged effects were small, cointegrated combinations offer the only predictive possibility. Overall, therefore, an analysis which does not a priori restrict the extent of cointegration, the degree or differencing, and the lag length seems preferable. Non-linearities Similarly, non-lineanties again complicate the analysis, and suggest that either clever functional-form transforms will be needed, or that linear weights will not be constant. Several analyses have considered mapping the formulation to one of probabilities of turning points (see e.g. Stock and Watson, 1989), but the underlying issues of constancy or predictive power are not altered by doing so, as stressed by Stock and Watson (1992). AN EMPIRICAL ANALYSIS OF THE UK LONGER-LEADING INDICATOR We first investigate the UK longer-leading indicator, as a prelude to studying its role in VAR modelling. Since the econometric model data set is only available quarterly, we converted the three component monthly indicators and the CLI to quarterly data by selecting the last monthly outcome of each quarter. The resulting time series closely resemble the corresponding monthly variables, and as shown in equation (32), 87 per cent of the variance of quarterly changes in the CLI is explained by the quarterly change in its components. More precisely, over the available quarterly sample 1975 (2)- 1993 (2), regressing AIY,' on its synthetic components yields: A]:,: = - 0.75 N ? 3 , + 0.64 AS, - 0.62 AfRb, - R3,) + 0.17 m, - 0.01 A o , R2 = 0.87, b = 0.84, DW = 1.58 This is far from an identity as defined by equation (1) for monthly data, but nevertheless explains much of the variance of the changes in the index. The main 'explanatory' variables all have the correct sign and are highly significant, other than AO,. Next, we apply the analysis in the fourth section of this paper. The VAR for the component indicators is a five-dimensional system for AR3r, ASr, A(Rb, - R3,) , ADr, and AOt. This was (0.14) (0.07) (0.18) (0.06) (0.01) (32) 284 Journal of Forecasting Vol. I S , lss. No. 4 Table I. Cointegration analysis r 1 2 3 4 5 P 0.36 0.32 0.14 0.08 0.04 Max 27.1 23.6" 9.4 5.3 2.6 Tr 68.0a 40.9a 17.3 7.9 2.6 "The unadjusted test is significant at 5%. Table II. Cointegration vectors P R3 S Rb-R3 D 0 1 1 - 1.01 0.02 -0.68 0.25 2 0.79 1 -0.08 0.18 0.05 Table III. Feedback coefficients 6r 1 2 R3 -0.02 0.11 S 0.03 -0.28 D -0.03 -0.14 0 -1.80 - 1.08 Rb-R3 -0.01 0.13 i9.a J**O 19! l*.. *o Figure 3. 
Table I. Cointegration analysis

r             1        2        3        4        5
Eigenvalue    0.36     0.32     0.14     0.08     0.04
Max          27.1     23.6ᵃ     9.4      5.3      2.6
Tr           68.0ᵃ    40.9ᵃ    17.3      7.9      2.6

ᵃ The unadjusted test is significant at 5%.

Table II. Cointegration vectors

β̂′        R3        S        Rb−R3      D        O
1         1        −1.01      0.02     −0.68     0.25
2         0.79      1        −0.08      0.18     0.05

Table III. Feedback coefficients

α̂          1         2
R3       −0.02      0.11
S         0.03     −0.28
D        −0.03     −0.14
O        −1.80     −1.08
Rb−R3    −0.01      0.13

Figure 3. Cointegrating combination of indicators and UK longer-leading index.

Table I shows the outcome of the cointegration test statistics over 1975(4)–1993(2). Strictly, no test is significant at the 5% level, but that may be due to the small sample size: two of the eigenvalues are quite large at over 0.3. Next, Table II shows the two eigenvectors corresponding to the two largest eigenvalues. The vector estimated in equation (32) does not lie in the cointegration space: assuming there are two cointegration vectors, χ²(3) = 16.9. The feedback coefficients for the first two eigenvectors are shown in Table III. Figure 3 shows the time series of the 'cointegrating combination' and the UK longer-leading CLI: they are not highly correlated, and often not in phase.

Finally, we consider the restrictions placed on the leading indicators by their combination in the CLI, I^UK_LL,t, as in equation (14). If the five-dimensional VAR is modelled parsimoniously, it becomes nearly diagonal, so no single vector with all of its components non-zero could capture all the available information for predicting the component indicators. Replacing x_t by I^UK_LL,t, the test of c_12 = 0 accepts, so no additional variables help predict the CLI. However, c_22 = 0 is strongly rejected, with seven coefficients having t-values in excess of 2.7 in absolute value, so the CLI is not fully informative. Conversely, both c_11 = 0 and c_21 = 0 are strongly rejected, so the CLI does capture some of the dynamic behaviour.

THE USE OF LEADING INDICATORS IN MACRO-ECONOMIC MODELS

We now consider the use of indices, or CLIs, in macro-economic modelling and forecasting (see e.g. Bladen-Hovell and Zhang, 1992; Marsland and Weale, 1992). When modellers use an index instead of all of its components separately, they are assuming that the way in which the index enters the model matches the way the components also enter the model. Testing the significance of an index involves testing the joint hypothesis that the components enter the model and have the same relationship in the model as in the index. Thus, the insignificance of the index does not preclude that some (or all) of the components of the index belong in the model, albeit with different coefficients from those in the index, as in the fourth section of this paper.

Consider the role of a CLI in a VAR in differences (DVAR) for the DGP:

Δx_t = γ + α(β′x_{t−1} − μ) + u_t        (33)

which is thereby approximated by:

Δx_t = a + ΓΔx_{t−1} + v_t        (34)

Such approximations can be robust to regime shifts, particularly in μ (see Clements and Hendry, 1995). First, consider adding g_{t−1} = φ′x_{t−1} to equation (34) when Γ = 0 (for simplicity), so that:

Δx_t = δ + ρg_{t−1} + e_t        (35)

and testing ρ = 0. From equation (33), therefore:

e_t = (γ − αμ − δ) + (αβ′ − ρφ′)x_{t−1} + u_t        (36)

When φ′ = h′β′ (where h is r × 1), so the CLI is a cointegrating combination, then the error {e_t} will be stationary, and ρ is selected to minimize the deviation of α from ρh′, generally leading to ρ ≠ 0 even though ρh′ is only of rank 1. Further, then E[g_t] = h′μ, so that δ = γ − ρh′μ. Otherwise, when φ does not cointegrate x_t, {e_t} will not be stationary unless ρ = 0.
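A minimal simulated sketch of the Γ = 0 case just analysed is given below: a bivariate equilibrium-correction DGP as in equation (33) is generated, a CLI g_{t−1} = φ′x_{t−1} with cointegrating weights is added to each equation of the difference approximation, and ρ = 0 is tested equation by equation. All parameter values are illustrative assumptions, not estimates from the paper.

```python
# Sketch of equations (33)-(36) under assumed parameter values.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
T = 300
beta = np.array([1.0, -1.0])     # cointegrating vector
alpha = np.array([-0.2, 0.1])    # feedback coefficients
gamma = np.array([0.1, 0.1])     # growth rates (beta'gamma = 0)

x = np.zeros((T, 2))
for t in range(1, T):            # DGP: equation (33) with mu = 0
    x[t] = x[t - 1] + gamma + alpha * (beta @ x[t - 1]) + 0.5 * rng.standard_normal(2)

phi = beta                        # CLI weights: here a cointegrating combination
g = x @ phi
dx = np.diff(x, axis=0)           # dx[k] = x[k+1] - x[k], aligned with g[k] = g_{t-1}

for i in range(2):                # test rho = 0 in each equation of the DVAR
    fit = sm.OLS(dx[:, i], sm.add_constant(g[:-1])).fit()
    print(f"equation {i}: rho_hat = {fit.params[1]:.3f}, t-value = {fit.tvalues[1]:.2f}")
```

Because φ is a cointegrating combination here, the error in equation (35) is stationary and the estimated ρ is non-zero and significant in each equation, as argued above.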
When Γ ≠ 0, both Δx_{t−1} and g_{t−1} will proxy the omitted cointegrating vectors, reducing the significance of the latter. However, if g_{t−1} does enter empirically, the intercept will depend on μ, so the robustness of equation (34) to shifts in the equilibrium mean will be lost.

Second, consider a CLI based on differenced data, replacing g_{t−1} in equation (35) by g*_{t−1} = φ′Δx_{t−1}. Now, in place of equation (36):

e*_t = ([I_n − ρφ′]γ − δ) + α(β′x_{t−1} − μ) − ρφ′(Δx_{t−1} − γ) + u_t        (37)

Again, the error {e*_t} will be stationary, and the minimizing value of ρ will usually be non-zero, owing to the autocorrelated nature of the cointegrating vectors. There is no additional role for g*_{t−1} when Δx_{t−1} is included unrestrictedly in equation (34). In practice, only a subset y_t of x_t will be analysed, so the analysis of the fourth section reapplies. Thus, the CLI will often proxy omitted effects.

There are interesting implications of this analysis under regime shifts in γ and μ. The CLI will improve forecasting performance relative to the DVAR only if co-breaking occurs using the CLI's weights. Since β is co-breaking for shifts in the growth rate γ when the cointegrating vectors do not trend (so β′γ = 0), whereas {e_t} depends on γ in both equations (36) and (37), a CLI will be a poor proxy in such a state of nature relative to the correct cointegrating vectors. Further, the DVAR is little affected by changes in μ (as the deviations of the cointegrating vectors from their means are omitted), whereas the CLI-based model in equation (36) depends on μ and will experience predictive failure similar to that of the VEqCM. Thus, robustness is lost in both leading cases.

THE ROLE OF THE UK LONGER-LEADING INDICATOR IN A MONETARY SYSTEM

To illustrate the role of a leading indicator in a linear dynamic system, we now consider the four-equation monetary model analysed by Hendry and Mizon (1993) and Hendry and Doornik (1994), and also investigated by Boswijk (1992), Ericsson, Campos and Tran (1990) and Johansen (1992), inter alia. The data set comprises M, Y, P, and R, which are, respectively, nominal M1, real total final expenditure (TFE) at 1985 prices, the TFE deflator, and the (learning-adjusted) differential between the three-month local authority interest rate and the M1 retail sight-deposit interest rate (the rate on interest-bearing checking accounts at commercial banks, introduced in 1984(3)); this differential, R_n, is a measure of the opportunity cost of holding M1 (see Hendry and Ericsson, 1991, for details). Money and expenditure are in £ million, the deflator is unity in 1985, and the interest rate is annual, in fractions. Lower-case letters denote logs of the corresponding capitals. The data are quarterly and seasonally adjusted and, after allowing for the leading-indicator sample period, estimation is usually over 1976(1)–1989(2).

We adopted the specification in Hendry and Doornik (1994) and so analysed m − p, y, Δp, and R_n, adding I^UK_LL,t to create a five-variable VAR. A lag length of two was selected given the small sample. Figure 4 shows the time-series graph of I^UK_LL,t and the two cointegration vectors of the econometric system (denoted by c_1t and c_2t), defined by Hendry and Doornik (1994) as:

c_1t = (m_t − p_t) − y_t + 7.0Δp_t + 7.0R_n,t   and   c_2t = y_t − 0.0063t − 3.4Δp_t + 1.8R_n,t        (38)

The first is the excess demand for money (the deviation from the long-run money-demand equation) and the second the excess demand for goods (the deviation of output from trend as a function of, essentially, the 'real' interest rate).
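For concreteness, the two equilibrium-correction terms in equation (38) can be computed directly from the data. The sketch below assumes a quarterly pandas DataFrame df with hypothetical columns m, p, y, Rn (in logs and fractions as described above) and a column cli holding the longer-leading index; no data are supplied here.

```python
# A small sketch of equation (38), under the assumed DataFrame layout noted above.
import numpy as np
import pandas as pd

def cointegration_vectors(df: pd.DataFrame) -> pd.DataFrame:
    dp = df["p"].diff()                       # quarterly inflation, Delta p_t
    trend = np.arange(len(df))                # linear trend for the goods-demand vector
    c1 = (df["m"] - df["p"]) - df["y"] + 7.0 * dp + 7.0 * df["Rn"]
    c2 = df["y"] - 0.0063 * trend - 3.4 * dp + 1.8 * df["Rn"]
    return pd.DataFrame({"c1": c1, "c2": c2})

# Example use: correlations of the CLI with the excess demands for money and goods.
# out = cointegration_vectors(df)
# print(out.join(df["cli"]).corr()["cli"])
```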
The overall pattern of behaviour of the three series is similar, the correlations of I^UK_LL,t with c_1t and c_2t being about 0.55. Following a similar reduction sequence to Hendry and Doornik (1994) leads to the parsimonious representation shown in Table IV, where D_out and D_oil are dummy variables for output and oil-price shocks. The likelihood-ratio test of the over-identifying restrictions on the original VAR yields χ²(39) = 39.9, which is insignificant. Thus, the reductions are valid, and they lead to a specification in which the leading indicator is not significant in the TFE equation, although 'leading' income is one of the main rationales for the CLI. However, the CLI is significant in the inflation equation, and it induces the additional significance of c_1,t−1, thereby violating weak exogeneity of Δ²p_t for a conditional money-demand model. The parameter estimates in the money-demand equation become less well determined, but the impact of c_2,t−1 on both Δy_t and Δ²p_t is now stronger.

Table IV. FIML estimates with I^UK_LL,t (estimated standard errors in parentheses)

Δ(m − p)_t = −1.18 Δ²p_t − 0.74 ΔR_n,t − 0.085 c_1,t−1 + 0.005
             (0.41)       (0.37)        (0.007)        (0.002)

Δy_t = 0.064 D_out,t − 0.17 c_2,t−1 + 0.0068
       (0.007)        (0.023)       (0.001)

Δ²p_t = 0.011 D_oil,t − 0.12 c_2,t−1 − 0.031 I^UK_LL,t−1 − 0.024 c_1,t−1
        (0.005)        (0.028)       (0.009)             (0.005)

ΔR_n,t = 0.27 ΔR_n,t−1 − 0.069 c_1,t−1
         (0.10)         (0.036)

I^UK_LL,t = 1.33 I^UK_LL,t−1 − 0.45 I^UK_LL,t−2 + 0.012
            (0.09)            (0.09)             (0.005)

Table V. Residual standard deviations

1.30%    0.78%    0.63%    0.016    0.0020

Table V records the residual standard deviations, which are generally smaller than in Hendry and Doornik (1994). Thus, there is some information in the longer-leading index that explains the previous error in the inflation equation, but it otherwise does not greatly influence the system. Unfortunately, the sample is too short to sustain unrestricted modelling of the component indicators as well as the monetary data, so we are unable to determine which individual indicator variables contribute most.

If the cointegrating-combination index is used in place of the longer-leading CLI, no reduction from a VEqCM to a model thereof is significant, and the final selection is close to that found using the CLI. The significance of the excess-demand vector in the inflation equation is further emphasized, but most other coefficients are about the same. If we allow both the cointegrating-combination index and the CLI to be elements of a VAR, only the former enters significantly as a level, forcing the CLI to insignificance. This supports the idea of using cointegration analysis to select a CLI when stationarity is desired.

A dramatically different answer ensues if the economic cointegration vectors (c_1t and c_2t) are deleted from the VEqCM, which then becomes a VAR in differences, but the two forms of CLI are retained. Now the lagged longer-leading CLI is significant in every equation, and the cointegrating-combination version is irrelevant. This may reflect the method of selecting the CLI to lead within sample. However, it might seem as if a DVAR can benefit from adding a CLI as a proxy for omitted cointegration vectors, even when a VEqCM cannot be much improved. Of course, the over-identification test for dropping c_1t and c_2t is highly significant against the original cointegrated VAR, and a valid reduction is hard to obtain. Such results on the use of CLIs in DVARs are in line with the analysis in the previous section.
Ex post, an apparent improvement is seen, but it appears to derive from proxying the cointegration vectors. Ex ante, however, regime shifts will cause the DVAR with only proxy CLIs to lose its relative robustness (see Clements and Hendry, 1995). Thus, the augmented DVAR may get the worst of both worlds, and it could perform worse than a pure DVAR against a shift in the long-run equilibrium or in the growth rate.

CONCLUSIONS

Indicator systems are altered sufficiently frequently to suggest that, historically, leading indicators do not in practice systematically lead for long. The theoretical analysis considered the selection of CLIs when the data generation process is a linear cointegrated vector autoregression, and it established the importance of cointegration, dynamics, co-breaking, and parametric restrictions both when selecting CLIs and when including them in econometric systems. The implication is that the present focus of indicator selection (high correlation with the target) is too narrow.

The empirical evidence confirmed the theoretical analysis: information was lost by restricting the way the component indicators entered the CLI, and the weights did not lie in the cointegration space. Nevertheless, the UK longer-leading indicator appeared to be stationary, and it helped explain some of the variance of inflation in a VEqCM. It was significant in every equation of a DVAR, acting as a proxy for omitted cointegration effects, but it was a poor proxy in as much as the test of that reduction was highly significant.

Although there is much work that can and will be done on the topic in the near future, four issues remain when trying to construct a leading index. First, the causes of business cycles change over time, and successful forecasting requires modelling such evolution. Second, the relationships between economic variables change over time: both the size and the timing of relationships alter, and co-breaking seems unlikely for non-causal relations. Third, methods of selecting components of CLIs by bivariate correlations with the variable they are intended to lead deserve closer analysis in terms of the resulting robustness (or lack thereof) relative to multivariate approaches. Fourth, the degree of integration and the extent of cointegration need formal consideration for efficient forecasting.

On the positive side, the selection of indicators could be made more formal without undue difficulty. This would involve testing for information losses due to reduction, including cointegration analyses, parametric restrictions, dynamics, parameter non-constancy, and co-breaking. Such a step would also induce a greater convergence between leading-indicator approaches and econometric models.

ACKNOWLEDGEMENTS

This research was supported in part by grant R000233447 from the UK Economic and Social Research Council. We are grateful to Mike Clements and two anonymous referees of this journal for helpful comments on an earlier draft. The views expressed in the paper are those of the authors and not necessarily those of the Bank of England.

REFERENCES

Artis, M. J., Bladen-Hovell, R. C., Osborn, D. R., Smith, J. P. and Zhang, W., 'Turning point prediction in the UK: Preliminary results using CSO leading indicators', presented to the Royal Economic Society Conference, York, 1993.

Auerbach, A. J., 'The index of leading indicators: Measurement without theory thirty-five years later', Review of Economics and Statistics, 64 (1982), 589-95.
Banerjee, A., Dolado, J. J., Galbraith, J. W. and Hendry, D. F., Co-integration, Error Correction and the Econometric Analysis of Non-Stationary Data, Oxford: Oxford University Press, 1993.

Bladen-Hovell, R. C. and Zhang, W., 'A BVAR model for the UK economy: A forecast comparison with LBS and NI models', mimeo, Economics Department, University of Manchester, 1992.

Boswijk, H. P., Cointegration, Identification and Exogeneity, Vol. 37 of Tinbergen Institute Research Series, Amsterdam: Thesis Publishers, 1992.

Box, G. E. P. and Jenkins, G. M., Time Series Analysis, Forecasting and Control, San Francisco: Holden-Day, 1976.

Burns, A. F. and Mitchell, W. C., Measuring Business Cycles, New York: NBER, 1946.

Campbell, J. Y. and Mankiw, N. G., 'Are output fluctuations transitory?', Quarterly Journal of Economics, 102 (1987), 857-880.

Central Statistical Office (UK), 'Cyclical indicators for the United Kingdom economy', Economic Trends, 257 (1975), 95-9.

Central Statistical Office (UK), 'Output measures: Calculation and interpretation of the cyclical indicators of the UK economy', Occasional paper no. 16 (revised), London: Central Statistical Office, 1983.

Clements, M. P. and Hendry, D. F., 'On the limitations of comparing mean squared forecast errors', Journal of Forecasting, 12 (1993), 617-37, with discussion.

Clements, M. P. and Hendry, D. F., 'Towards a theory of economic forecasting', in Hargreaves, C. (ed.), Non-stationary Time Series Analysis and Cointegration, Oxford: Oxford University Press, 1994, pp. 9-52.

Clements, M. P. and Hendry, D. F., 'Macro-economic forecasting and modelling', Economic Journal, 105 (1995), 1001-13.

Coen, P. G., Gomme, E. D. and Kendall, M. G., 'Lagged relationships in economic forecasting', Journal of the Royal Statistical Society A, 132 (1969), 133-63.

Department of Commerce (US), Handbook of Cyclical Indicators, Washington, DC: US Department of Commerce, 1977.

Diebold, F. X. and Rudebusch, G. D., 'Scoring the leading indicators', Journal of Business, 62 (1989), 369-91.

Diebold, F. X. and Rudebusch, G. D., 'Forecasting output with the composite leading index: An ex ante analysis', Journal of the American Statistical Association, 86 (1991a), 603-10.

Diebold, F. X. and Rudebusch, G. D., 'Turning point prediction with the composite leading index: An ex ante analysis', in Lahiri, K. and Moore, G. H. (eds), Leading Economic Indicators: New Approaches and Forecasting Records, Cambridge: Cambridge University Press, 1991b, pp. 231-56.

Doornik, J. A. and Hendry, D. F., PcFiml 8: An Interactive Program for Modelling Econometric Systems, London: International Thomson Publishing, 1994.

Emerson, R. A., Two Essays on Investment Trusts and One on Leading Indicators, unpublished doctoral thesis, Oxford University, 1994.

Engle, R. F. and Granger, C. W. J., 'Cointegration and error correction: Representation, estimation and testing', Econometrica, 55 (1987), 251-76.

Ericsson, N. R., Campos, J. and Tran, H.-A., 'PC-GIVE and David Hendry's econometric methodology', Revista de Econometria, 10 (1990), 7-117.

Granger, C. W. J., 'Investigating causal relations by econometric models and cross-spectral methods', Econometrica, 37 (1969), 424-38.

Granger, C. W. J., 'Developments in the study of cointegrated economic variables', Oxford Bulletin of Economics and Statistics, 48 (1986), 213-28.

Hendry, D. F., 'A theory of co-breaking', mimeo, Nuffield College, University of Oxford, 1995.

Hendry, D. F. and Clements, M. P.,
'On model selection when forecasting', mimeo, Institute of Economics, University of Oxford, 1993.

Hendry, D. F. and Doornik, J. A., 'Modelling linear dynamic econometric systems', Scottish Journal of Political Economy, 41 (1994), 1-33.

Hendry, D. F. and Ericsson, N. R., 'Modeling the demand for narrow money in the United Kingdom and the United States', European Economic Review, 35 (1991), 833-86.

Hendry, D. F. and Mizon, G. E., 'Evaluating dynamic econometric models by encompassing the VAR', in Phillips, P. C. B. (ed.), Models, Methods and Applications of Econometrics, Oxford: Basil Blackwell, 1993, pp. 272-300.

Hendry, D. F. and Morgan, M. S., The Foundations of Econometric Analysis, Cambridge: Cambridge University Press, 1995.

Johansen, S., 'Statistical analysis of cointegration vectors', Journal of Economic Dynamics and Control, 12 (1988), 231-54.

Johansen, S., 'Testing weak exogeneity and the order of cointegration in UK money demand', Journal of Policy Modeling, 14 (1992), 313-34.

Johansen, S. and Juselius, K., 'Maximum likelihood estimation and inference on cointegration, with application to the demand for money', Oxford Bulletin of Economics and Statistics, 52 (1990), 169-210.

Koch, P. and Rasche, R. H., 'An examination of the commerce department leading-indicator approach', Journal of Business & Economic Statistics, 6 (1988), 167-87.

Koopmans, T. C., 'Measurement without theory', Review of Economics and Statistics, 29 (1947), 161-79.

Lahiri, K. and Moore, G. H. (eds), Leading Economic Indicators: New Approaches and Forecasting Records, Cambridge: Cambridge University Press, 1991.

Marsland, J. and Weale, M., 'The leading indicator in a VAR model of the UK', unpublished paper, Downing College and Clare College, University of Cambridge, 1992.

Mitchell, W. and Burns, A. F., Statistical Indicators of Cyclical Revivals, New York: NBER, 1938.

Moore, B., 'A review of CSO cyclical indicators', Economic Trends, 477 (1993), 99-107.

Moore, G. (ed.), Business Cycle Indicators, New York: NBER, 1961.

Moore, G. and Shiskin, J., Indicators of Business Expansions and Contractions, New York: NBER, 1967.

Neftçi, S. N., 'Lead-lag relations, exogeneity and prediction of economic time series', Econometrica, 47 (1979), 101-13.

Nelson, C. R. and Plosser, C. I., 'Trends and random walks in macroeconomic time series: Some evidence and implications', Journal of Monetary Economics, 10 (1982), 139-62.

Parigi, G. and Schlitzer, G., 'Quarterly forecasts of the Italian business cycle by means of monthly economic indicators', Journal of Forecasting, 14 (1995), 117-41.

Persons, W. M., The Problem of Business Forecasting, No. 6 in Pollak Foundation for Economic Research Publications, London: Pitman, 1924.

Phillips, P. C. B. and Durlauf, S. N., 'Multiple time series regression with integrated processes', Review of Economic Studies, 53 (1986), 473-95.

Samuelson, P. A., 'Paradise lost and refound: The Harvard ABC barometers', Journal of Portfolio Management, 4 (1987), 4-9.

Shiskin, J. and Moore, G., Composite Indexes of Leading, Coinciding and Lagging Indicators, 1948-1967, New York: NBER, Supplement to National Bureau Report 1, 1968.

Sims, C. A., Stock, J. H. and Watson, M. W., 'Inference in linear time series models with some unit roots', Econometrica, 58 (1990), 113-44.

Stock, J. H. and Watson, M. W., 'New indexes of coincident and leading economic indicators', NBER Macroeconomics Annual (1989), 351-409.

Stock, J. H. and Watson, M. W., 'A procedure for predicting recessions with leading indicators: Econometric issues and recent experience', Working paper 4014, NBER, 1992.

Vining, R., 'Methodological issues in quantitative economics', Review of Economics and Statistics, 31 (1949), 77-86.

Weller, B. R., 'Usefulness of the newly revised composite index of leading indicators as a quantitative predictor', Journal of Macroeconomics, 1, 141-7.

Zarnowitz, V.
and Boschan, C., 'Cyclical indicators: An evaluation and new leading indexes', in Handbook of Cyclical Indicators, 1977a, pp. 170-183.

Zarnowitz, V. and Boschan, C., 'New composite indexes of coincident and lagging indicators', in Handbook of Cyclical Indicators, 1977b, pp. 185-198.

Zarnowitz, V. and Braun, P., 'Major macroeconomic variables and leading indicators: Some estimates of their interrelations, 1886-1982', Working paper 2812, National Bureau of Economic Research, New York, 1992.

Authors' biographies:

Rebecca A. Emerson is a financial economist in the Monetary Instruments and Markets Division at the Bank of England. Her research interests also include debt management, investment trusts, learning, and the market for art. She received a doctorate from the University of Oxford and was a Fellow at the London Business School.

David F. Hendry is Leverhulme Personal Research Professor and Fellow of Nuffield College, University of Oxford; past President and Honorary Vice-President, Royal Economic Society; Fellow and Council member, Econometric Society; Foreign Honorary Member, American Economic Association and American Academy of Arts and Sciences; and Fellow, British Academy. He received the Guy Medal in Bronze from the Royal Statistical Society. He was chairman of the Society for Economic Analysis, editor of the Review of Economic Studies, the Economic Journal, and the Oxford Bulletin of Economics and Statistics, and an associate editor of Econometrica. He has published extensively on econometric methods, theory, modelling, and history; numerical techniques and computing; empirical applications; and forecasting.

Authors' addresses:

Rebecca A. Emerson, Bank of England, Threadneedle Street, London, UK.
David F. Hendry, Nuffield College, Oxford OX1 1NF, UK.

Ericsson-2004-Hendry-ETInterview

THE ET INTERVIEW: PROFESSOR DAVID F. HENDRY

Interviewed by Neil R. Ericsson

Econometric Theory, 20, 2004, 743-804. © 2004 Cambridge University Press.

David Hendry was born of Scottish parents in Nottingham, England, on March 6, 1944. After an unpromising start in Glasgow schools, he obtained an M.A. in economics with first class honors from the University of Aberdeen in 1966. He then went to the London School of Economics and completed an M.Sc. (with distinction) in econometrics and mathematical economics in 1967 and a Ph.D. in economics in 1970 under Denis Sargan. His doctoral thesis (The Estimation of Economic Models with Autoregressive Errors)
provided intellectual seeds for his future research on the development of an integrated approach to modeling economic time series. David was appointed to a lectureship at the LSE while finishing his thesis and to a professorship at the LSE in 1977. In 1982, David moved to Oxford University as a professor of economics and a fellow of Nuffield College. At Oxford, he has also been a Leverhulme Personal Research Professor of Economics (1995-2000), and he is currently an ESRC Professorial Research Fellow and the head of the department of economics.

Much of David's research has focused on constructing a unified approach to empirical modeling of economic time series. His 1995 book, Dynamic Econometrics, is a milestone on that path. General-to-specific modeling is an important aspect of this empirical methodology, which has become commonly known as the "LSE" or "Hendry" approach. David is widely recognized as the most vocal advocate and ardent contributor to this methodology. His research also has aimed to make this methodology widely available and easy to implement, both through publicly available software packages that embed the methodology (notably, PcGive and PcGets) and by substantive empirical applications of the methodology. As highlighted in many of his papers, David's interest in methodology is driven by a passion for understanding how the economy works and, specifically, how best to carry out economic policy in practice.

David's research has many strands: deriving and analyzing methods of estimation and inference for nonstationary time series; developing Monte Carlo techniques for investigating the small-sample properties of econometric techniques; developing software for econometric analysis; exploring alternative modeling strategies and empirical methodologies; analyzing concepts and criteria for viable empirical modeling of time series, culminating in computer-automated procedures for model selection; and evaluating these developments in simulation studies and in empirical investigations of consumer expenditure, money demand, inflation, and the housing and mortgage markets. Over the last dozen years, and in tandem with many of these developments on model design, David has reassessed the empirical and theoretical literature on forecasting, leading to new paradigms for generating and interpreting economic forecasts. Alongside these endeavors, David has pursued a long-standing interest in the history of econometric thought because of the insights provided by earlier analyses that were written when technique qua technique was less dominant.

David's enthusiasm for econometrics and economics permeates his teaching and makes his seminars notable. Throughout his career, he has promoted innovative uses of computers in teaching, and, following the birth of the PC, he helped pioneer live empirical and Monte Carlo econometrics in the classroom and in seminars. To date, he has supervised over thirty Ph.D. theses.

David has held many prominent appointments in professional bodies. He has served as president of the Royal Economic Society; editor of the Review of Economic Studies, the Economic Journal, and the Oxford Bulletin of Economics and Statistics; associate editor of Econometrica and the International Journal of Forecasting; president (Section F)
of the British Association for the Advancement of Science; chairman of the UK's Research Assessment Exercise in economics; and special adviser to the House of Commons, both on monetary policy and on forecasting. He is a chartered statistician, a fellow of the British Academy and of the Royal Society of Edinburgh, and a fellow and council member of the Econometric Society. Among his many awards and honors, David has received the Guy Medal in Bronze from the Royal Statistical Society and honorary degrees from the Norwegian University of Science and Technology, Nottingham University, St. Andrews University, the University of Aberdeen, and the University of St. Gallen. In addition to his academic talents, David is an excellent chef and makes a great cup of cappuccino!

1. EDUCATIONAL BACKGROUND, CAREER, AND INTERESTS

Let's start with your educational background and interests. Tell me about your schooling; your original interest in economics and econometrics; and the principal people, events, and books that influenced you at the time.

I went to Glasgow High School but left at 17, when my parents migrated to the north of Scotland. I was delighted to have quit education.

What didn't you like about it?

The basics that we were taught paled into insignificance when compared to untaught issues such as nuclear warfare, independence of postcolonial countries, and so on. We had an informal group that discussed these issues in the playground. Even so, I left school with rather inadequate qualifications: Glasgow University simply returned my application.

That was not a promising start.

No, it wasn't. However, as barman at my parents' fishing hotel in Ross-shire, I met the local chief education officer, who told me that the University of Aberdeen admitted students from "educationally deprived areas" such as Ross-shire, and would ignore my Glasgow background. I was in fact accepted by Aberdeen for a 3-year general M.A. degree (which is a first degree in Scotland), a "civilizing" education that is the historical basis for a liberal arts education.

Why did you return to education when you had been so discouraged earlier?

Working from early in the morning till late at night in a hotel makes one consider alternatives! I had wanted to be an accountant, and an M.A. opened the door to doing so. At Aberdeen, I studied maths, French, history, psychology, economic history, philosophy, and economics, as these seemed useful for accountancy. I stayed on because they were taught in a completely different way from school, emphasizing understanding and relevance, not rote learning.

What swayed you off of accountancy?

My "moral tutor" was Peter Fisk...

Ah, I remember talking with Peter (author of Fisk, 1967) at Royal Statistical Society meetings in London, but I had not realized that connection.

Peter persuaded me to think about other subjects. Meeting him later, he claimed to have suggested economics, and even econometrics, but I did not recall that.

Were you enrolled in economics?

No, I was reading French, history, and maths. My squash partner, Ian Souter, suggested that I try political economy and psychology as "easy subjects," so I enrolled in them after scraping through my first year.

Were they easy?
I thought psychology was wonderful. Rex and Margaret Knight taught really interesting material. However, economics was taught by Professor Hamilton, who had retired some years before but continued part time because his post remained unfilled. I did not enjoy his course, and I stopped attending lectures.

Photo: David's first salmon, caught near his parents' hotel in Ross-shire (billiard room in background, hotel serving platter in foreground).

Shortly before the first term's exam, Ian suggested that I catch up by reading Paul Samuelson's (1961) textbook, which I did (fortunately, not Samuelson's [1947] Foundations!). From page one, I found it marvelous, learning how economics affected our lives. I discovered that I had been thinking economics without realizing it.

You had called it accountancy rather than economics?

Partly, but also, I was naive about the coverage of intellectual disciplines.

Why hadn't you encountered Samuelson's text before?

We were using a textbook by Sir Alec Cairncross, the government chief economic advisor at the time and a famous Scots economist. Ian was in second-year economics, where Samuelson was recommended. I read Samuelson from cover to cover before the term exam, which then seemed elementary. Decades later, that exam came back to haunt me when I presented the "Quincentennial Lecture in Economics" at Aberdeen in 1995. Bert Shaw, who had marked my exam paper, retold that I had written "Poly Con" at the top of the paper. The course was called "PolEcon," but I had never seen it written. He had drawn a huge red ring around "Poly Con" with the comment: "You don't even know what this course is called, so how do you know all about it?" That's when I decided to become an economist. My squash partner Ian, however, became an accountant.

Were you also taking psychology at the time?

Yes. I transferred to a 4-year program during my second year, reading joint honors in psychology and economics. The Scottish Education Department generously extended my funding to 5 years, which probably does not happen today for other "late developers." There remain few routes to university such as the one that Aberdeen offered or funding bodies willing to support such an education. Psychology was interesting, though immensely challenging: studying how people actually behaved and eschewing assumptions strong enough to sustain analytical deductions. I enjoyed the statistics, which focused on design and analysis of experiments, as well as conducting experiments, but I dropped psychology in my final year.

You published your first paper, [1], while an undergraduate. How did that come about?

I investigated student income and expenditure in Aberdeen over two years to evaluate changing living standards. To put this in perspective, only about 5% of each cohort went to university then, with most being government funded, whereas about 40% now undertake higher or further education. The real value of such funding was falling, so I analyzed its effects on expenditure patterns (books, clothes, food, lodging, travel, etc.): the paper later helped in planning social investment between student and holiday accommodation.

What happened after Aberdeen?

I applied to work with Dick Stone in Cambridge. Unfortunately he declined, so I did an M.Sc. in econometrics at LSE with Denis Sargan; the Aberdeen faculty thought highly of his work. My econometrics knowledge was woefully inadequate, but I only discovered that after starting the M.Sc.

Had you taken econometrics at Aberdeen?
Econometrics was not part of the usual undergraduate program, but my desk in Aberdeen's beautiful late-medieval library was by chance in a section that had books on econometrics. I tried to read Lawrence Klein's (1953) A Textbook of Econometrics and to use Jan Tinbergen's (1951) Business Cycles in the United Kingdom 1870-1914 in my economic history course. That led the economics department to arrange for Derek Pearce in the statistics department to help me: he and I worked through Jim Thomas's (1964) Notes on the Theory of Multiple Regression Analysis. Derek later said that he had been keeping just about a week ahead of me, having had no previous contact with problems in econometrics like simultaneous equations and residual autocorrelation.

Was teaching at LSE a shock relative to Aberdeen?

The first lecture was by Jim Durbin on periodograms and spectral analysis, and it was incomprehensible. Jim was proving that the periodogram was inconsistent, but that typical spectral estimators are well behaved. As we left the lecture, I asked the student next to me, "What is a likelihood?" and got the reply "You're in trouble!". But luck was on my side. Dennis Anderson was a physicist learning econometrics to forecast future electricity demand, so he and I helped each other through econometrics and economics, respectively. Dennis has been a friend ever since and is now a neighbor in Oxford after working at the World Bank.

Did Bill Phillips teach any of your courses?

Yes, although Bill was only at LSE in my first year. When we discussed my inadequate knowledge of statistical theory, he was reassuring, and I did eventually come to grips with the material. Bill, along with Meghnad Desai, Jan Tymes, and Denis Sargan, ran the quantitative economics seminar, which was half of the degree. They had erudite arguments about autoregressive and moving-average representations, matching Denis's and Bill's respective interests. They also debated whether a Phillips curve or a real-wage relation was the better model for the United Kingdom. That discussion was comprehensible, given my economics background.

What do you recall of your first encounters with Denis Sargan?

Denis was always charming and patient, but he never understood the knowledge gap between himself and his students. He answered questions about five levels above the target, and he knew the material so well that he rarely used lecture notes. I once saw him in the coffee bar scribbling down a few notes on the back of an envelope; they constituted his entire lecture. Also, while the material was brilliant, the notation changed several times in the course of the lecture: α became β, then γ, and back to α, while γ had become α and then β; and x and z got swapped as well. Sorting out one's notes proved invaluable, however, and eventually ensured comprehension of Denis's lectures. Our present teaching-quality assessment agency would no doubt regard his approach as disastrous, given their blinkered view of pedagogy.

That sort of lecturing could be discouraging to students, whereas it didn't bother Denis.

One got used to Denis's approach. For Denis, notation was just a vehicle, with the ideas standing above it.

My own recollection of Denis's lectures is that some were crystal clear, whereas others were confusing. For instance, his expositions of instrumental variables and LIML were superb. Who else taught the M.Sc.? Did Jim Durbin?
Yes, Jim taught the time-series course, which reflected his immense understanding of both time- and frequency-domain approaches to econometrics. He was a clear lecturer. I have no recollection of Jim ever inadvertently changing notation, in complete contrast to Denis, so years later Jim's lecture notes remain clear.

What led you to write a Ph.D. after the M.Sc.?

The academic world was expanding rapidly in the United Kingdom after the (Lionel) Robbins report. Previously, many bright scholars had received tenured posts after undergraduate degrees, and Denis was an example. However, as in the United States, a doctorate was becoming essential. I had a summer job in the Labour government's new Department of Economic Affairs, modeling the secondhand car market. That work revealed to me the gap between econometric theory and practice, and the difficulty of making economics operational, so I thought that a doctorate might improve my research skills. Having read George Katona's research, including Katona and Mueller (1968), I wanted to investigate economic psychology in order to integrate the psychologist's approach to human behavior with the economist's utility-optimization intertemporal models. Individuals play little role in the latter; agents' decisions could be made by computers. By contrast, Katona's models of human behavior incorporated anticipations, plans, and mistakes.

Had you read John Muth (1961) on expectations by then?

Yes, in the quantitative economics seminar, but his results seemed specific to the given time-series model, rather than being a general approach to expectations formation. Models with adaptive and other backward-looking expectations were being criticized at the time, although little was known about how individuals actually formed expectations. However, Denis guided me into modeling dynamic systems with vector autoregressive errors for my Ph.D.

What was your initial reaction to that topic?

I admired Sargan (1964), and I knew that misspecifying autocorrelation in a single equation induced modeling problems. Generalizing that result to systems with vector autoregressive errors appeared useful. Denis's approach entailed formulating the "solved-out" form with white-noise errors and then partitioning dynamics between observables and errors. Because any given polynomial matrix could be factorized in many ways, with all factorizations being observationally equivalent in a stationary world, a sufficient number of (strongly) exogenous variables were needed to identify the partition. The longer lag length induced by the autoregressive error generalized the model, but error autocorrelation per se imposed restrictions on dynamics, so the autoregressive-error representation was testable: see [4], [14], and [22], the last with Andy Tremayne.

Did you consider the relationship between the system and the conditional model as an issue of exogeneity?

No. I took it for granted that the variables called "exogenous" were independent of the errors, as in strict exogeneity. Bill Phillips (1956)
had considered whether the joint distribution of the endogenous and potentially exogenous variables factorized, such that the parameters of interest in the conditional distribution didn't enter the marginal distribution. On differentiating the joint distribution with respect to the parameters of interest, only the conditional distribution would contribute. Unfortunately, I didn't realize the importance of conditioning for model specification at the time.

What other issues arose in your thesis?

Computing and modeling. Econometric methods are pointless unless operational, but implementing the new procedures that I developed required considerable computer programming. The IBM 360/65 at University College London (UCL) facilitated calculations. I tried the methods on a small macro-model of the United Kingdom, investigating aggregate consumption, investment, and output; see [15].

At the time, Denis had several Ph.D. students working on specific sectors of the economy, whereas you were working on the economy as a whole. How much did you interact with the other students?

The student rebellion at the LSE was at its height in 1968-1969, and most of Denis's students worked on the computer at UCL, an ocean of calm. It was a wonderful group to be with. Grayham Mizon wrote code for optimization applied to investment equations, Pravin Trivedi for efficient Monte Carlo methods and modeling inventories, Mike Feiner for "ratchet" models for imports, and Ross Williams for nonlinear estimation of durables expenditure. Also, Cliff Wymer was working on continuous-time simultaneous systems, Ray Byron on systems of demand equations, and William Mikhail on finite-sample approximations. We shared ideas and code, and Denis met with us regularly in a workshop where each student presented his or her research. Most theses involved econometric theory, computing, an empirical application, and perhaps a simulation study.

1.1. The London School of Economics

After finishing your Ph.D. at the LSE, you stayed on as a lecturer, then as a reader, and eventually as a professor of econometrics. Was Denis Sargan the main influence on you at the LSE, as a mentor, as a colleague, as an econometrician, and as an economist?

Yes, he was. And not just for me, but for a whole generation of British econometricians. He was a wonderful colleague. For instance, after struggling with a problem for months, a chat with Denis often elicited a handwritten note later that afternoon, sketching the solution. I remember discussing Monte Carlo control variates with Denis over lunch after not getting far with them. He came to my office an hour later, suggesting a general computable asymptotic approximation for the control variate that guaranteed an efficiency gain as the sample size increased. That exchange resulted in [16] and [27]. Denis was inclined to suggest a solution and leave you to complete the analysis. Occasionally, our flailings stimulated him to publish, as with my attempt to extract nth-order autoregressive errors from (n + k)th-order dynamics. Denis requested me to repeat my presentation on it to the econometrics workshop, the kiss of death to an idea! Then he formulated the common-factor approach in Sargan (1980).

How did Jim Durbin and other people at LSE influence you?

In 1973, I was programming GIVE, the Generalized Instrumental Variable Estimator [33], including an algorithm for FIML. I used the FIML formula from Jim's 1963 paper, which was published much later as Durbin (1988)
in Econometric Theory. While explaining Jim's formula in a lecture, I noticed that it subsumed all known simultaneous equations estimators. The students later claimed that I stood silently looking at the blackboard for some time, then turned around and said "this covers everything." That insight led to [21] on estimator generating equations, from which all simultaneous equations estimators and their asymptotic properties could be derived with ease. When Ted Anderson was visiting LSE in the mid-1970s and writing Anderson (1976), he interested me in developing an analog for measurement-error models, leading to [20].

What were your teaching assignments at the LSE?

I taught the advanced econometrics option for the undergraduate degree, and the first year of the two-year M.Sc. It was an exciting time because LSE was then at the forefront of econometric theory and its applications. I also taught control theory based on Bill Phillips's course notes and the book by Peter Whittle (1963).

Interactions between teaching, research, and software have been important in your work.

Indeed. Writing operational programs was a major theme at LSE because Denis was keen to have computable econometric methods. The mainframe program GIVE was my response. Meghnad Desai called GIVE a "model destruction program" because at least one of its diagnostic tests usually rejected anyone's pet empirical specification.

1.2. Overseas Visits

During 1975-1976, you split a year-long sabbatical between Yale, where I first met you, and Berkeley. What experiences would you like to share from those visits?

There were three surprises. The first was that the developments at LSE following Denis's 1964 paper were almost unknown in the United States. Few econometricians therefore realized that autoregressive errors were a testable restriction and typically indicated misspecification, and Denis's equilibrium-correction (or "error-correction") model was unknown. The second surprise was the divergence appearing in the role attributed to economic theory in empirical modeling: from pure data-basing, through using theory as a guideline (which nevertheless attracted the accusation of "measurement without theory"), to the increasingly dominant fitting of theory models. Conversely, little attention was given to which theory to use, and to bridging the gap between abstract models and data by empirical modeling. The final surprise was how foreign the East Coast seemed, an impression enhanced by the apparently common language. The West Coast proved more familiar; we realized how much we had been conditioned by movies! I enjoyed the entire sabbatical. At Yale, the Koopmans, Tobins, and Klevoricks were very hospitable, and in Berkeley, colleagues were kind. I ended that year at Australian National University (ANU), where I first met Ted Hannan, Adrian Pagan, and Deane Terrell.

One of the academic highlights was the November 1975 conference in Minnesota held by Chris Sims.

Yes, it was, although Chris called my comments in [25] "acerbic." In [25], I concurred with Clive Granger and Paul Newbold's critique of poor econometrics, particularly that a high R² and a low Durbin-Watson statistic were diagnostic of an incorrect model. However, I thought that the common-factor interpretation of error autocorrelation, in combination with equilibrium-correction models, resolved the nonsense-regressions problem better than differencing, and it retained the economics. My invited paper [26] at the 1975
Yes, it was, although Chris called my comments in@25# “acerbic+” In @25# , I concurred with Clive Granger and Paul Newbold’s critique of poor econo- metrics, particularly that a highR2 and a low Durbin–Watson statistic were diagnostic of an incorrect model+ However, I thought that the common-factor interpretation of error autocorrelation, in combination with equilibrium- correction models, resolved the nonsense-regressions problem better than dif- ferencing, and it retained the economics+ My invited paper@26# at the 1975 752 ET INTERVIEW Toronto Econometric Society World Congress had discussed a system of equi- librium corrections that could offset nonstationarity+ George Box and Gwilym Jenkins’s book (initially published as Box and Jenkins, 1970) had appeared a few years earlier. What effect was that having on econometrics? The debate between the Box–Jenkins approach and the standard econometrics approach was at its height, yet the ideas just noted seemed unknown+ In the United States, criticisms by Phillip Cooper and Charles Nelson~1975! of macro- forecasters had stimulated debate about model forms—specifically, about simul- taneous systems versus ARIMA representations+ However,my Monte Carlo work with Pravin in@8# on estimating dynamic models with moving-average or auto- regressive errors had shown that matching the lag length was more important than choosing the correct form, and neither lag length nor model form was very accurately estimated from the sample sizes of 40–80 observations then avail- able+ Thus, to me, the only extra ingredients in the Box–Jenkins approach over Bill Phillips’s work on dynamic models with moving-average errors~Phillips, 2000! were differencing and data-based modeling+ Differencing threw away steady-state economics—the long-run information—so it was unhelpful+ I sus- pected that Box–Jenkins models were winning because of their modeling approach, not their model form, and if a similar approach was adopted in econometrics—ensuring white-noise errors in a good representation of the time series—econometric systems would do much better+ 1.3. Oxford University Why did you decide to move to Nuffield College in January 1982? Oxford provided a good research environment with many excellent econo- mists, it had bright students, and it was a lovely place to live+ Our daughter Vivien was about to start school, and Oxford schools were preferable to those in central London+ Amartya Sen, Terence Gorman, and John Muellbauer had all recently moved to Oxford, and Jim Mirrlees was already there+ In Oxford, I was initially also acting director of their Institute of Economics and Statistics because academic cutbacks under Margaret Thatcher meant that the university could not afford a paid director+ In 1999, the Institute transmogrified into the Oxford economics department+ That sounds strange—not to have had an economics department at a major UK university. No economics department and no undergraduate economics degree+ Econom- ics was college-based rather than university-based, it lacked a building, and it had little secretarial support+ PPE—short for “Politics, Philosophy, and Economics”—was the major vehicle through which Oxford undergraduates ET INTERVIEW 753 learnt economics+ The joke at the time was that LSE students knew everything but could do nothing with it, whereas Oxford students knew nothing and could do everything with it+ How did your teaching responsibilities differ between LSE and Nuffield? 
At Oxford, I taught the second-year optional econometrics course for the M.Phil. in economics: 36 hours of lectures per year. Oxford students didn't have a strong background in econometrics, mathematics, or statistics, but they were interested in empirical econometric modeling. With the creation of a department of economics, we have now integrated the teaching programs at both the graduate and the undergraduate levels.

1.4. Research Funding

Throughout your academic career, research funding has been important. You've received grants from the Economic and Social Research Council (ESRC, formerly the SSRC), defended the funding of economics generally, chaired the 1995-1996 economics national research evaluation panel for the Higher Education Funding Council for England (HEFCE), and just recently received a highly competitive ESRC-funded research professorship.

On the first, applied econometrics requires software, computers, research assistants, and data resources, so it needs funding. Fortunately, I have received substantial ESRC support over the years, enabling me to employ Frank Srba, Yock Chong, Adrian Neale, Mike Clements, Jurgen Doornik, Hans-Martin Krolzig, and yourself, who together revolutionized my productivity. That said, I have also been critical of current funding allocations, particularly the drift away from fundamental research towards "user-oriented" research. "Near-market" projects should really be funded by commercial companies, leaving the ESRC to focus on funding what the best researchers think is worthwhile, even if the payoff might be years later. The ESRC seems pushed by government to fund research on immediate problems such as poverty and inner-city squalor, which we would certainly love to solve, but the opportunity cost is reduced research on the tools required for a solution. My work on the fundamental concepts of forecasting would have been impossible without support from the Leverhulme Foundation. I still have more than half of my applications for funding rejected, and I regret that so many exciting projects die. In an odd way, these prolific rejections may reassure younger scholars suffering similar outcomes.

Nevertheless, you have also defended the funding of economics against outside challenges.

In the mid-1980s, the UK meteorologists wanted another supercomputer, which would have cost about as much as the ESRC's entire budget. There was an enquiry into the value of social science research, threatening the ESRC's existence. I testified in the ESRC's favor, applying PcGive live to modeling UK house prices to demonstrate how economists analyzed empirical evidence; see [52]. The scientists at the enquiry were fascinated by the predictability of such an important asset price, as well as the use of a cubic differential equation to describe its behavior. Fortunately, the enquiry established that economics wasn't merely assertion.

I remember that one of the deciding arguments in favor of ESRC funding was not by an economist but by a psychiatrist.

Yes. Griffiths Edwards worked in the addiction research unit at the Maudsley on a program for preventing smoking. An economist had asked him if lung-cancer operations were worthwhile. Checking, he found that many patients did not have a good life postoperation. This role of economics in making people think about what they were doing persuaded the committee of inquiry of our value. Thatcher clearly attached zero weight to insights like Keynes's (1936)
General Theory, whereas I suspect that the output saved thereby over the last half century could fund economics in perpetuity.

There also seems to be a difference in attitudes towards, say, a failure in forecasting by economists and a failure in forecasting by the weathermen.

The British press has often quoted my statement that, when weathermen get it wrong, they get a new computer, whereas when economists get it wrong, they get their budgets cut. That difference in attitude has serious consequences, and it ignores that one may learn from one's mistakes. Forecast failure is as informative for us as it is for meteorologists.

That difference in attitude may also reflect how some members of our profession ignore the failures of their own models.

Possibly. Sometimes they just start another research program.

Let's talk about your work on HEFCE.

Core research funding in UK universities is based on HEFCE's research assessment exercise. Peer-group panels evaluate research in each discipline. The panel for economics and econometrics has been chaired in the past by Jim Mirrlees, Tony Atkinson, and myself. It is a huge task. Every five years, more than a thousand economists from UK universities submit four publications each to the panel, which judges their quality. This assessment is the main determinant of future research funding, as few UK universities have adequate endowments. It also unfortunately facilitates excessive government "micro-management." Through the Royal Economic Society, I have tried to advise the funding council about designing such evaluation exercises, both to create appropriate incentives and to adopt a measurement structure that focuses on quality.

1.5. Professional Societies and Journals

Professional societies have several important roles for economists, and you have been particularly active in both the Econometric Society and the Royal Economic Society.

As a life member of the Econometric Society, and as a fellow since 1976, I know that the Econometric Society plays a valuable role in our profession, but I believe that it should be more democratic by allowing members, and not just fellows, to have a voice in the affairs of the Society. I was the first competitively elected president of the Royal Economic Society. After empowering its members, the Society became much more active, especially through financing scholarships and funding travel. I persuaded the RES to start up the Econometrics Journal, which is free to members and inexpensive for libraries. Neil Shephard has been a brilliant and energetic first managing editor, helping to rapidly establish a presence for the Econometrics Journal. I also helped found a committee on the role of women in economics, prompted by Karen Mumford and steered to a formal basis by Denise Osborn, with Carol Propper as its first chairperson. The committee has created a network and undertaken a series of useful studies, as well as examined issues such as potential biases in promotions. Some women had also felt that there was bias in journal rejections and were surprised that (e.g.) I still received referee reports that comprised just a couple of rude remarks.

Almost from the start of your professional career, you have been active in journal editing.

Yes. In 1971, Alan Walters (who had the office next door to mine at LSE)
nominated me as the econometrics editor for the Review of Economic Studies. Geoff Heal was the Review's economics editor, and we were both in our twenties at the time. I have no idea how Alan persuaded the Society for Economic Analysis to agree to my appointment, although the Review was previously known as the "Children's Newspaper" in some sections of our profession. Editing was invaluable for broadening my knowledge of econometrics. I read every submission, as I did later when editing for the Economic Journal and the Oxford Bulletin. An editor must judge each paper and evaluate the referee reports, not just act as a post box. All too often, editors' letters merely say that one of the referees didn't "like" the paper, and so reject it. If my referees didn't like a paper that I liked, I would accept the paper nonetheless, reporting the most serious criticisms from the referee reports for the author to rebut. Active editing also requires soliciting papers that one likes, which can be arduous when still handling 100–150 submissions a year.

I then edited the Economic Journal with John Flemming (who regrettably died last year) and covered a wider range of more applied papers. When I began editing the Oxford Bulletin, a shift to the mainstream was needed, and this was helped by commissioning two timely special issues on cointegration that attracted the profession's attention; see [63] and [97]. Some people then nicknamed it the Oxford Bulletin of Cointegration!

Let's move on to conferences. You organized the Oslo meeting of the Econometric Society, and you helped create the Econometrics Conferences of the European Community (EC2).

EC2 was conceived by Jan Kiviet and Herman van Dijk as a specialized forum, and I was delighted to help. Starting in Amsterdam in 1991, EC2 has been very successful, and it has definitely enhanced European econometrics. We attract about a hundred expert participants, with no parallel sessions, although EC2 does have poster sessions.

Poster sessions have been a success in the scientific community, but they generally have not worked well at American economics meetings. That has puzzled me, but I gather they succeeded at EC2?

We encouraged "big names" to present posters, we provided champagne to encourage attendance, and we gave prizes to the best posters. Some of the presentations have been a delight, showing how a paper can be communicated in four square meters of wall space, and allowing the presenter to meet the researchers they most want to talk to. At a conference the size of EC2, about twenty people present posters at once, so there are two to three audience members per presenter.

That said, in the natural sciences, poster sessions also work at large conferences, so perhaps the ratio is important, not the absolute numbers.

1.6. Long-Term Collaborations

Your extensive list of long-term collaborators includes Pravin Trivedi, Frank Srba, James Davidson, Grayham Mizon, Jean-François Richard, Rob Engle, Aris Spanos, Mary Morgan, myself, Julia Campos, John Muellbauer, Mike Clements, Jurgen Doornik, Anindya Banerjee, and, more recently, Katarina Juselius and Hans-Martin Krolzig. What were your reasons for collaboration, and what benefits did they bring?
The obvious ones were a shared viewpoint yet complementary skills; my co-authors' brilliance, energy, and creativity; and that the sum exceeded the parts. Beyond that, the reasons were different in every case. Any research involving economics, statistics, programming, history, and empirical analysis provides scope for complementarities. The benefits are clear to me, at least. Pravin was widely read and stimulated my interest in Monte Carlo. Frank greatly raised my productivity—our independently written computer code would work when combined, which must be a rarity. When I had tried this with Andy Tremayne, we initially defined Kronecker products differently, inducing chaos! James brought different insights into our work and insisted (like you) on clarity. Grayham and I have investigated a wide range of issues. Like yourself, Rob, Jean-François, Katarina, and Mike (and also Søren Johansen, although we have not yet published together), Grayham shares a willingness to discuss econometrics at any time, in any place. On the telephone or over dinner, we have started exchanging ideas about each other's research, usually to our spouses' dismay. I find such discussions very productive. Jean-François and Rob are both great at stimulating new developments and clarifying half-baked ideas, leading to important notions and formalizations. Aris has always been a kindred spirit in questioning conventional econometric approaches and having an interest in the history of econometrics.

Mary is an historian, as well as an econometrician, and so stops me from writing "Whig history" (i.e., history as written from the perspective of the victors). With yourself, we have long arguments ending in new ideas and then write the paper.

[Photo: David and Evelyn cooking at the Mizon residence in Florence.]

Julia rigorously checks all derivations and frequently corrects me. John has a clear understanding of economics, so keeps me right in that arena. Mike and I have pushed ahead on investigating a shared interest in the fundamentals of economic forecasting, despite a reluctance of funding agencies to believe that it is a worthwhile activity. In addition to his substantial econometrics skills, Jurgen is one of the world's great programmers, with an extraordinary ability to conjure code that is almost infallible. He ported PcGive across to C++ after persuading me that there was no future in Fortran. We interact on a host of issues, such as on how methodology impinges on the design and structure of programs. Anindya brings great mathematical skills, and Katarina has superb intuition about empirical modeling. Hans-Martin has revived my interest in methodology with automatic model-selection procedures, which he pursues in addition to his "regime-switching" research. Ken Wallis and I have regularly commented on each other's work, although we have rarely published together. And, of course, Denis Sargan was also a long-term collaborator, but he almost never needed co-authors, except for [55], which was written jointly with Adrian Pagan and myself. As the acknowledgments in my publications testify, many others have also helped at various stages, most recently Bent Nielsen and Neil Shephard, who are wonderful colleagues at Nuffield.

2. RESEARCH STRATEGY

I want to separate our discussion of research strategy into the role of economics in empirical modeling, the role of econometrics in economics, and the LSE approach to empirical econometric modeling.
2.1. The Role of Economics in Empirical Modeling

I studied economics because unemployment, living standards, and equity are important issues—as noted previously, Paul Samuelson was a catalyst in that—and I remain an economist. However, a scientific approach requires quantification, which led me to econometrics. Then I branched into methodology to understand what could be learnt from nonexperimental empirical evidence. If econometrics could develop good models of economic reality, economic policy decisions could be significantly improved. Since policy requires causal links, economic theory must play a central role in model formulation, but economic theory is not the sole basis of model formulation. Economic theory is too abstract and simplified, so data and their analysis are also crucial. I have long endorsed the views in Ragnar Frisch's (1933) editorial in the first issue of Econometrica, particularly his emphasis on unifying economic theory, economic statistics (data), and mathematics. That still leaves open the key question as to "which economic theory." "High-level" theory must be tested against data, contingent on "well-established" lower level theories. For example, despite the emphasis on agents' expectations by some economists, they devote negligible effort to collecting expectations data and checking their theories. Historically, much of the data variation is not due to economic factors but to "special events" such as wars and major changes in policy, institutions, and legislation. The findings in [205] and [208] are typical of my experience. A failure to account for these special events can elide the role of economic forces in an empirical model.

2.2. The Role of Econometrics in Economics

Is the role of econometrics in economics that of a tool, just as Monte Carlo is a tool within econometrics?

Econometrics is our instrument, as telescopes and microscopes are instruments in other disciplines. Econometric theory, and, within it, Monte Carlo, evaluates whether that instrument is functioning as expected. Econometric methodology studies how such methods work when applied.

Too often, a study in economics starts afresh, postulating and then fitting a theory-based model, failing to build on previous findings. Because investigators revise their models and rewrite a priori theories in light of the evidence, it is unclear how to interpret their results. That route of forcing theoretical models onto data is subject to the criticisms in Larry Summers (1991) about the "illusion of econometrics." I admire what Jan Tinbergen called "kitchen-sink econometrics," being explicit about every step of the process. It starts with what the data are; how they are collected, measured, and changed in the light of theory; what that theory is; why it takes the claimed form and is neither more general nor more explicit; and how one formulates the resulting empirical relationship and then fits it by a rule (an estimator) derived from the theoretical model. Next comes the modeling process, because the initial specification rarely works, given the many features of reality that are ignored by the theory. Finally, ex post evaluation checks the outcome.

That approach suggests a difference between being primarily interested in the economic theory—where data check that the theory makes sense—and trying to understand the data—where the theory helps interpret the evidence rather than act as a straitjacket.
Yes. To derive explicit results, economic theory usually abstracts from many complexities, including how the data are measured. There is a vast difference between such theory being invaluable and its being optimal. At best, the theory is a highly imperfect abstraction of reality, so one must take the data and the theory equally seriously in order to build useful empirical representations. The instrument of econometrics can be used in a coherent way to interpret the data, build models, and underpin a progressive research strategy, thereby providing the next investigator with a starting point.

2.3. The LSE Approach

What is meant by the LSE approach? It is often associated with you in particular, although many other individuals have contributed to it and not all of them have been at the LSE.

There are four basic stages, beginning with an economic analysis to delineate the most important factors. The next stage embeds those factors in a general model that also allows for other potential determinants and relevant special features. Then, the congruence of that model is tested. Finally, that model is simplified to a parsimonious undominated congruent final selection that encompasses the original model, thereby ensuring that all reductions are valid. When developing the approach, the first tractable cases were linear dynamic single equations, where the appropriate lag length was an open issue. However, the principle applies to all econometric modeling, albeit with greater difficulty in nonlinear settings; see Trivedi (1970) and Mizon (1977) for early empirical and theoretical contributions. Many other aspects followed, such as developing a taxonomy for model evaluation, orthogonalizing variables, and recommencing an analysis at the general model if a rejection occurs. Additional developments generalized this approach to system modeling, in which several (or even all) variables are treated as endogenous. Multiple cointegration is easily analyzed as a reduction in this framework, as is encompassing of the VAR and whether a conditional model entails a valid reduction. Mizon (1995) and [157] provide discussions.

Do you agree with Chris Gilbert (1986) that there is a marked contrast between the "North American approach" to modeling and the "European approach"?

Historically, American economists were the pragmatists, but Koopmans (1947) seems to mark a turning point. Many American economists now rely heavily on abstract economic reasoning, often ignoring institutional aspects and inter-agent heterogeneity, as well as inherent conflicts of interest between agents on different sides of the market. Some economists believe their theories to such an extent that they retain them, even when they are strongly rejected by the data. There are precedents in the history of science for maintaining research programs despite conflicts with empirical evidence, but only when there was no better theory. For economics, however, Werner Hildenbrand (1994), Jean-Pierre Benassy (1986), and many others highlight alternative theoretical approaches that seem to accord better with empirical evidence.

3. RESEARCH HIGHLIGHTS

We discussed estimator generation already. Let's now turn to some other highlights of your research program, including equilibrium correction, exogeneity, model evaluation and design, encompassing, Dynamic Econometrics, and Gets.
These issues have often arisen from empirical work, so let's consider them in their context, focusing on consumers' expenditure and money demand, including the Friedman–Schwartz debate.

We should also discuss Monte Carlo as a tool in econometrics; the history of econometrics; and your recent interest in ex ante forecasting, which has emphasized the difference between error correction and equilibrium correction.

3.1. Consumers' Expenditure

Your paper [28] with James Davidson, Frank Srba, and Stephen Yeo models UK consumers' expenditure. This paper is now commonly known by the acronym DHSY, which is derived from the authors' initials.

Some background is necessary. I first had access to computer graphics in the early 1970s, and I was astonished at the picture for real consumers' expenditure and income in the United Kingdom. Expenditure manifested vast seasonality, with double-digit percentage changes between quarters, whereas income had virtually no seasonality. Those seasonal patterns meant that consumption was much more volatile than income on a quarter-to-quarter basis. Two implications followed. First, it would not work to fit first-order lags (as I had done earlier) and hope that dummies plus the seasonality in income would explain the seasonality in consumption. Second, the general class of consumption-smoothing theories like the permanent-income and life-cycle hypotheses seemed misfocused. Consumers were inducing volatility into the economy by large inter-quarter shifts in their expenditure, so the business sector must be a stabilizing influence.

Moreover, the consumption equation in my macro-model [15] had dramatically misforecasted the first two quarters of 1968. In 1968Q1, the chancellor of the exchequer announced that he would greatly increase purchase (i.e., sales) taxes unless consumers' expenditure fell, the response to which was a jump in consumers' expenditure, followed in the next quarter by the chancellor's tax increase and a resulting fall in expenditure. I wrongly attributed my model's forecast failure to model misspecification. In retrospect, that failure signaled that forecasting problems with econometric models come from unanticipated changes.

At about this time, Gordon Anderson and I were modeling building societies, which are the British analogue of the U.S. savings and loans associations. In [26], we nested the long-run solutions of existing empirical equations, using a formulation related to Sargan (1964), although I did not see the link to Denis's work until much later; see [50]. I adopted a similar approach for modeling consumers' expenditure, seeking a consumption function that could interpret the equations from the major UK macro-models and explain why their proprietors had picked the wrong models. In DHSY [28], we adopted a "detective story" approach, using a nesting model for the different variables, valid for both seasonally adjusted and unadjusted data, with up to 5 lags in all the variables to capture the dynamics. Reformulation of that nesting model delivered an equation that [39] later related to Phillips (1957) and was called an error-correction model. Under error correction, if consumers made an error relative to their plan by overspending in a given quarter, they would later correct that error.
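To fix notation for the error-correction form just described, a stylized DHSY-type equation for quarterly data in logs is sketched below; this is an illustration in the spirit of [28], not a quotation of its published specification or coefficients:

Δ4 c_t = β0 + β1 Δ4 y_t + β2 Δ1 Δ4 y_t − β3 (c − y)_{t-4} + ε_t,    β3 > 0,

where c and y are the logs of real consumers' expenditure and real income, Δ4 denotes the four-quarter difference, and the lagged ratio (c − y)_{t-4} is the error-correction term: overspending relative to income in the same quarter of the previous year pulls expenditure growth back down.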
Even with DHSY, a significant change in model formulation occurred just before publication. Angus Deaton (1977) had just established a role for inflation if agents were uncertain as to whether relative or absolute prices were changing.

The first DHSY equation explained real consumers' expenditure given real income, and it significantly overpredicted expenditure through the 1973–1974 oil crisis. Angus's paper suggested including inflation and changes therein. Adding these variables to our equation explained the underspending. This result was the opposite of what the first-round economic theory suggested, namely, that high inflation should induce preemptive spending, given the opportunity costs of holding money. Inflation did not reflect money illusion. Rather, it implied the erosion of the real value of liquid assets. Consumers did not treat the nominal component of after-tax interest as income, whereas the Statistical Office did, so disposable income was being mismeasured. Adding inflation to our equation corrected that. As ever, theory did not have a unique prediction.

DHSY explained why other modelers selected their models, in addition to evaluating your model against theirs. Why haven't you applied that approach in your recent work?

It was difficult to do. Several ingredients were necessary to explain other modelers' model selections: their modeling approaches, data measurements, seasonal adjustment procedures, choice of estimators, maximum lag lengths, and misspecification tests. We first standardized on unadjusted data and replicated models on that. While seasonal filters leave a model invariant when the model is known, they can distort the lag patterns if the model is data-based. We then investigated both OLS and IV but found little difference. Few of the then reported evaluation statistics were valid for dynamic models, so such tests could mislead. Most extant models had a maximum lag of one and low short-run marginal propensities to consume, which seemed too small to reflect agent behavior. We tried many blind alleys (including measurement errors) to explain these low marginal propensities to consume. Then we found that equilibrium correction explained them by induced biases in partial-adjustment models. We designed a nesting model, which explained all the previous findings but with the paradox that it simplified to a differenced specification, with no long-run term in the levels of the variables. Resolving that conundrum led to the error-correction mechanism. While this "Sherlock Holmes" approach was extremely time-consuming, it did stimulate research into encompassing, i.e., trying to explain other models' results from a given model.

Were you aware of Phillips (1954) and Phillips (1957)?

Now the interview becomes embarrassing! I had taken over Bill Phillips's lecture course on control theory and forecasting, so I was teaching how proportional, integral, and derivative control rules can stabilize the economy. However, I did not think of such rules as an econometric modeling device in behavioral equations.

What other important issues did you miss at the time?

Cointegration! Gordon Anderson's and my work on building societies showed that combinations of levels variables could be stationary, as in the discussion by Klein (1953) of the "great ratios."
Granger (1981, 1986) later formalized that property as cointegration removing unit roots. Grayham Mizon and I were debating with Gene Savin whether unit roots changed the distributions of estimators and tests, but bad luck intervened. Grayham and I found no changes in several Monte Carlos, but, unknowingly, our data generation processes had strong growth rates.

Rather than unit root processes with a zero mean?

Yes. We found that estimators were nearly normally distributed, and we falsely concluded that unit roots did not matter; see West (1988). The next missed issue concerned seasonality and annual differences. In DHSY, the equilibrium correction was the four-quarter lag of the log of the ratio of consumption to income, and it was highly seasonal. However, seasonal dummy variables were insignificant if one used the Scheffé S procedure; see Savin (1980). About a week after DHSY's publication, Thomas von Ungern-Sternberg added seasonal dummies to our equation and, with conventional t-tests, found that they were highly significant, leading to the "HUS" paper [39]. Care is clearly required with multiple-testing procedures!

Those results on seasonality stimulated an industry on time-varying seasonal patterns, periodic seasonality, and periodic behavior, with many contributions by Denise Osborn (1988, 1991).

Indeed. The final mistake in DHSY was our treatment of liquid assets. HUS showed that, in an equilibrium-correction formulation, imposing a unit elasticity of consumption with respect to income leaves no room for liquid assets. Logically speaking, DHSY went from simple to general. On derestricting their equation, liquid assets were significant, which HUS interpreted as an integral correction mechanism. The combined effect of liquid assets and real income on expenditure added up to unity in the long run.

The DHSY and HUS models appeared at almost the same time as the Euler-equation approach in Bob Hall (1978). Bob emphasized consumption smoothing, where changes in consumption were due to the innovations in permanent income and so should be ex ante unpredictable. A large literature has tested if changes in consumers' expenditure are predictable in Hall's model. How did your models compare with his?

In [35], James Davidson and I found that lagged variables, as derived from HUS, were significant in explaining changes in UK consumers' expenditure. HUS's model thus encompassed Hall's model. "Excess volatility" and "excess smoothing" have been found in various models, but few authors using an Euler-equation framework test whether their model encompasses other models.

You produced a whole series of papers on consumers' expenditure.

After DHSY, HUS, and [35], there were four more papers. They were written in part to check the constancy of the models and in part to extend them. [46] modeled annual interwar UK consumers' expenditure, obtaining results similar to the postwar relation in DHSY and HUS, despite large changes in the correlation structure of the data. [88] followed up on DHSY, [101] developed a model of consumers' expenditure in France, and [119] revisited HUS with additional data.

The 1990 paper [88] with Anthony Murphy and John Muellbauer finds that additional variables matter.

We would expect that to happen. As the sample size grows, noncentral t-statistics become more significant, so models expand. That's another topic that Denis worked on; see Sargan (1975) and the interesting follow-up by Robinson (2003).

It also fits in with the work on m-testing by Hal White (1990).
Yes. Misspecification evidence against a given formulation accumulates, which unfortunately takes one down a simple-to-general path. That is one reason empirical work is difficult. (The other is that the economy changes.) A "reject" outcome on a test rejects the model, but it does not reveal why. Bernt Stigum (1990) has proposed a methodology to delineate the source of failure from each test, but when a test rejects, it still takes a creative discovery to improve a model. That insight may come from theory, institutional evidence, data knowledge, or inspiration. While general-to-specific methodology provides guidelines for building encompassing models, advances between studies are inevitably simple-to-general, putting a premium on creative thinking.

A good initial specification of the general model is a major source of value added, making the rest relatively easy, and incredibly difficult otherwise.

That's correct. Research can be wasted if a key variable is omitted.

3.2. Equilibrium-Correction Models and Cointegration

You already mentioned that you had presented an equilibrium-correction model at Sims's 1975 conference.

Yes, in [25], I presented an example that was derived from the long-run economic theory of consumers' expenditure, and I merely asserted that there were other ways to obtain stationarity than differencing. Nonsense regressions are only a problem for static models or for those patched up with autoregressive errors. If one begins with a general dynamic specification, it is relatively easy to detect that there is no relationship between two unrelated random walks, y_t and z_t (say). A significant drawback of being away from the LSE was the difficulty of transporting software, so I did not run a Monte Carlo simulation to check this. Now it is easy to do so, and [229, Figure 1] shows the distributions of the t-statistics for the coefficients in the regression

y_t = a_0 + a_1 y_{t-1} + a_2 z_t + a_3 z_{t-1} + u_t,    (1)

where a_1 = 1, a_2 = a_3 = 0, z_t = z_{t-1} + v_t, and u_t and v_t are each normal, serially independent, and independent of each other. This simulation confirms my earlier claim about detecting nonsense regressions, but the t-statistic for the coefficient on the lagged dependent variable is skewed (a sketch of such a simulation is given at the end of this exchange). While differencing the data imposes a common factor with a unit root, a model with differences and an equilibrium-correction term remains in levels because it allows for a long-run relation. To explain this, DHSY explicitly distinguished between differencing as an operator and differencing as a linear transformation.

What was the connection between [25] and Clive's first papers on cointegration—Granger (1981) and Granger and Weiss (1983)?

At Sims's conference, Clive was skeptical about relating differences to lagged levels and doubted that the correction in levels could be stationary: differences of the data did not have a unit root, whereas their lagged levels did. Investigating that issue helped Clive discover cointegration; see his discussion of [49], and see Phillips (1997).
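A minimal sketch of such a simulation follows; it is my own illustration of the design just described, not the code behind [229]. Under this DGP, the t-statistic on z_t behaves roughly like a standard normal, so the absence of any relation between the two random walks is detectable, whereas the t-statistic on the lagged dependent variable is noticeably non-normal.

import numpy as np

# Monte Carlo for equation (1): y_t = a_0 + a_1*y_{t-1} + a_2*z_t + a_3*z_{t-1} + u_t,
# with a_1 = 1 and a_2 = a_3 = 0, so y_t and z_t are independent random walks.
rng = np.random.default_rng(0)
T, R = 100, 5000
t_a2 = np.empty(R)                  # t-statistic for H0: a_2 = 0 (true)
t_a1 = np.empty(R)                  # t-statistic for H0: a_1 = 1 (true)

for r in range(R):
    y = np.cumsum(rng.standard_normal(T))        # random walk y_t
    z = np.cumsum(rng.standard_normal(T))        # independent random walk z_t
    Y = y[1:]
    X = np.column_stack([np.ones(T - 1), y[:-1], z[1:], z[:-1]])
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    e = Y - X @ b
    s2 = e @ e / (len(Y) - X.shape[1])
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    t_a2[r] = b[2] / se[2]
    t_a1[r] = (b[1] - 1.0) / se[1]

print("rejection rate for a_2 = 0 at |t| > 1.96:", np.mean(np.abs(t_a2) > 1.96))
print("skewness of t-statistic on y_{t-1}:",
      np.mean((t_a1 - t_a1.mean()) ** 3) / t_a1.std() ** 3)

The first printed line is typically not far from the nominal 5%, in line with the claim that a general dynamic specification detects the lack of any relationship, while the second is noticeably different from zero, reflecting the non-normality of that t-statistic.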
Your interest in cointegration led to two special issues of the Oxford Bulletin, your book [104], and a number of papers—[61], [64], [78], [95], [98], and [136]—the last two also addressing structural breaks.

The key insight was that fewer equilibrium corrections (r) than the number of decision variables (n) induced integrated-cointegrated data, which Søren Johansen (1988) formalized as reduced-rank feedbacks of combinations of levels onto growth rates. In the Granger representation theorem in Engle and Granger (1987), the data are I(1) because r < n, a situation that I had not thought about. So, although DHSY was close in some ways, it was far off in others. In fact, I missed cointegration for a second time in [32], where I showed that "nonsense regressions" could be created and detected, but I failed to formalize the latter. Cointegration explained many earlier results. For instance, in Denis's 1964 equilibrium relationship involving real wages relative to productivity, the measured disequilibrium fed back to determine future wage rates, given current inflation rates.

Peter Phillips (1986, 1987), Jim Stock (1987), and others (such as Chan and Wei, 1988) were also changing the mathematical technology by using Wiener integrals to represent the limiting distributions of unit-root processes. Anindya Banerjee, Juan Dolado, John Galbraith, and I thought that the power and generality of that new approach would dominate the future of econometrics, especially since some proofs became easier, as with the forecast-error distributions in [139]. Our interest in cointegration resulted in [104], following Benjamin Disraeli's reputed remark that "if you want to learn about a subject, write a book about it."

Or edit a special issue on it!
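In the notation that has since become standard (a sketch for reference, not a passage from the interview), the reduced-rank feedback of levels onto growth rates is the vector equilibrium-correction form

Δx_t = α β' x_{t-1} + Γ_1 Δx_{t-1} + … + Γ_{k-1} Δx_{t-k+1} + ε_t,

where x_t is an n-dimensional vector, α and β are n × r matrices with r < n, the r combinations β'x_t are the cointegrating (equilibrium-correction) relations, and α contains the feedback coefficients onto the growth rates; Johansen (1988) provides maximum-likelihood inference on the rank r in this framework.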
3.3. Exogeneity

Exogeneity takes us back to Vienna in August 1977 at the European Econometric Society Meeting.

Discussions of the concept of exogeneity abounded in the econometrics literature, but for me, the real insight came from the paper presented in Vienna by Jean-François Richard and published as Richard (1980). Although the concept of exogeneity needed clarifying, the audience at the Econometric Society meeting seemed bewildered, since few could relate to Jean-François's likelihood factorizations and sequential cuts. Rob Engle was also interested in exogeneity, so, when he visited LSE and CORE shortly after the Vienna meeting, the three of us analyzed the distinctions between various kinds of exogeneity and developed more precise definitions. We all attended a Warwick workshop, with Chris Sims and Ed Prescott among the other econometricians, and we argued endlessly. Reactions to our formalization of exogeneity suggested that fundamental methodological issues were in dispute, including how one should model, what the form of models should be, what modeling concepts were, and even what appropriate model concepts were. Since I was working with Jean-François and Rob, I visited their respective institutions (CORE and UCSD) during 1980–1981. My time at both locations was very stimulating. The coffee lounge at CORE saw many long discussions about the fundamentals of modeling with Knud Munk, Louis Phlips, Jean-Pierre Florens, Michel Mouchart, and Jacques Drèze (plus Angus Deaton during his visit). In San Diego, we argued more about technique.

Your paper [44] with Rob and Jean-François on exogeneity went through several revisions before being published, and many of the examples from the CORE discussion paper were dropped.

Regrettably so. Exogeneity is a difficult notion and is prone to ambiguities, whereas examples can help reduce the confusion. The CORE version was written in a cottage in Brittany, which the Hendrys and Richards shared that summer. Jean-François even worked on it while moving along the dining table as supper was being laid. The extension to unit-root processes in [130] shows that exogeneity has yet further interesting implications.

How did your paper [106] on super exogeneity with Rob Engle come about?

Parameter constancy is a fundamental attribute of a model, yet predictive failure was all too common empirically. The ideal condition was super exogeneity, which meant valid conditioning for parameters of interest that were invariant to changes in the distributions of the conditioning variables. Rob correctly argued that tests for super exogeneity and invariance were required, so we developed some tests and investigated whether conditioning variables were valid, or whether they were proxies for agents' expectations. Invalid conditioning should induce nonconstancy, and that suggested how to test whether agents were forward-looking or contingent planners, as in [76].

The idea is a powerful one logically, but there is no formal work on the class of paired parameter constancy tests in which we seek rejection for the forcing variables' model and nonrejection for the conditional model.

That has not been formalized. Following Trevor Breusch (1986), tests of super exogeneity reject if there is nonconstancy in the conditional model, ensuring refutability. The interpretation of nonrejection is less clear.

You reported simulation evidence in [100] with Carlo Favero.

That work was based on my realization in [76] that feedback and feedforward models are not observationally equivalent when structural breaks occur in marginal processes. Intercept shifts in the marginal distributions delivered high power, but changes in the parameters of mean-zero variables were barely detectable. At the time, I failed to realize two key implications: the Lucas (1976) critique could only matter if it induced location shifts, and predictive failure was rarely due to changed coefficients of zero-mean variables. More recently, I have developed these ideas in [183] and [188].

In your forecasting books with Mike Clements—[163] and [170]—you discuss how shifts in the equilibrium's mean are the driving force for empirically detectable nonconstancy.

Interestingly, such a shift was present in DHSY, since inflation was needed to model the falling consumption-income ratio, which was the equilibrium correction. When inflation was excluded from our model, predictive failure occurred because the equilibrium mean had shifted. However, we did not realize that logic at the time.

3.4. Model Development and Design

There are four aspects to model development. The first is model evaluation, as epitomized by GIVE (or what is now PcGive) in its role as a "model destruction program." The second aspect is model design. The third is encompassing, which is closely related to the theory of reduction and to the general-to-specific modeling strategy. The fourth concerns a practical difficulty that arises because we may model locally by general to specific, but over time we are forced to model specific to general as new variables are suggested, new data accrue, and so forth.
On the first issue, Denis Sargan taught us that "problems" with residuals usually revealed model misspecification, so tests were needed to detect residual autocorrelation, heteroskedasticity, nonnormality, and so on. Consequently, my mainframe econometrics program GIVE printed many model evaluation statistics. Initially, they were usually likelihood ratio statistics, but many were switched to their Lagrange multiplier form, following the implementation of Silvey (1959) in econometrics by Ray Byron, Adrian Pagan, Rob Engle, Andrew Harvey, and others; see Godfrey (1988).

Why doesn't repeated testing lead to too many false rejections?

Model evaluation statistics play two distinct roles. In the first, the statistics generate one-off misspecification tests on the general model. Because the general model usually has four or five relevant, nearly orthogonal, aspects to check, a 1% significance level for each test entails an overall size of about 5% under the null hypothesis that the general model is well-specified (see the calculation sketched after this exchange). Alternatively, a combined test could be used, and both approaches seem unproblematic. However, for any given nominal size for each test statistic, more tests must raise rejection frequencies under the null. This cost has to be balanced against the probability of detecting a problem that might seriously impugn inference, where repeated testing (i.e., more tests) raises the latter probability.

The second role of model evaluation statistics is to reveal invalid reductions from a congruent general model. Those invalid reductions are then not followed, so repeated testing here does not alter the rejection frequencies of the model evaluation tests.

The main difficulty with model evaluation in the first sense is that rejection merely reveals an inappropriate model. It does not show how to fix the problem. Generalizing a model in the rejected direction might work, but that inference is a non sequitur. Usually, creative insight is required, and reexamining the underlying economics may provide that. Still, the statistical properties of any new model must await new data for a Neyman–Pearson quality-control check.

The empirical econometrics literature of the 1960s manifested covert design. For instance, when journal editors required that Durbin–Watson statistics be close to two, residual autocorrelation was removed by fitting autoregressive errors. Such difficulties prompted the concept of explicit model design, leading us to consider what characteristics a model should have. In [43], Jean-François and I formalized model concepts and the information sets against which to evaluate models, and we also elucidated the design characteristics needed for congruence.
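The overall size quoted above follows from the usual calculation for independent tests, each run at significance level α; this is only an approximation here, since the checks are nearly rather than exactly orthogonal:

overall size ≈ 1 − (1 − α)^k,    so with α = 0.01 and k = 5, 1 − 0.99^5 ≈ 0.049, i.e., about 5%.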
If we knew the data generation process (DGP) and estimated its parameters appropriately, we would also obtain insignificant tests with the stated probabilities. So, as an alternative complementary interpretation, successful model design restricts the model class to congruent outcomes, of which the DGP is a member.

Right. Congruence (a name suggested by Chris Allsopp) denotes that a model matches the evidence in all the directions of evaluation, and so the DGP is congruent with itself. Surprisingly, the concept of the DGP once caused considerable dispute, even though (by analogy) all Monte Carlo studies needed a mechanism for generating their data. The concept's acceptance was helped by clarifying that constant parameters are not an intrinsic property of an economics DGP. Also, the theory of reduction explains how marginalization, sequential factorization, and conditioning in the enormous DGP for the entire economy entails the joint density of the subset of variables under analysis; see [69] and also [113] with Steven Cook.

That joint density of the subset of variables is what Christophe Bontemps and Mizon (2003) have since called the local DGP. The local DGP can be transformed to have homoskedastic innovation errors, so congruent models are the class to search; and Bontemps and Mizon prove that a model is congruent if it encompasses the local DGP. Changes at a higher level in the full DGP can induce nonconstant parameters in the local DGP, putting a premium on good selection of the variables.

One criticism of the model design approach, which is also applicable to pretesting, is that test statistics no longer have their usual distributions. How do you respond to that?

For evaluation tests, that view is clearly correct, whether the testing is within a given study or between different studies. When a test's rejection leads to model revision and only "insignificant" tests are reported, tests are clearly design criteria. However, their insignificance on the initial model is informative about that model's goodness.

So, in model design, insignificant test statistics are evidence of having successfully built the model. What role does encompassing play in such a strategy?

In experimental disciplines, most researchers work on the data generated by their own experiments. In macroeconomics, there is one data set with a proliferation of models thereof, which raises the question of congruence between any given model and the evidence provided by rival models. The concept of encompassing was present in DHSY and HUS, but primarily as a tool for reducing model proliferation. The concept became clearer in [43] and [45], but it was only formalized as a test procedure in Mizon and Richard (1986). Although the idea surfaced in David Cox (1962), David emphasized single degree-of-freedom tests for comparing nonnested models, as did Hashem Pesaran (1974), whose paper I had handled as editor for the Review of Economic Studies. I remain convinced of the central role of encompassing in model evaluation, as argued in [75], [83], [118], and [142]. Kevin Hoover and Stephen Perez (1999) suggested that encompassing be used to select a dominant final model from the set of terminal models obtained by general-to-specific simplifications along different paths. That insight sustains multipath searches and has been implemented in [175] and [206]. More generally, in a progressive research strategy, encompassing leads to a well-established body of empirical knowledge, so new studies need not start from scratch.

As new data accumulate, however, we may be forced to model specific to general. How do we reconcile that with a progressive research strategy?
As data accrue over time, we can uncover both spurious and relevant effects because spurious variables have central t-statistics, whereas relevant variables have noncentral t-statistics that drift in one direction. By letting the model expand appropriately and by letting the significance level go to zero at a suitable rate, the probability of retaining the spurious effects tends to zero asymptotically, whereas the probability of retaining the relevant variables tends to unity; see Hannan and Quinn (1979) and White (1990) for stationary processes. Thus, modeling from specific to general between studies is not problematic for a progressive research strategy, provided one returns to the general model each time. Otherwise, [172] showed that successively corroborating a sequence of results can imply the model's refutation. Still, we know little about how well a progressive research strategy performs when there are intermittent structural breaks.
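The asymptotic argument just given can be made slightly more explicit (a stylized summary for a stationary setting, in the spirit of the references cited above). For an irrelevant regressor the t-statistic is approximately central, so for any critical value c_T → ∞,

P(|t_spurious| > c_T) → 0,

whereas for a relevant regressor the t-statistic grows at rate √T, so

P(|t_relevant| > c_T) → 1 whenever c_T / √T → 0.

Letting the significance level shrink slowly with T (equivalently, letting c_T grow slowly, as with the ln ln T rates underlying Hannan and Quinn, 1979) therefore discards the spurious variables and retains the relevant ones asymptotically.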
3.5. Money Demand

You have analyzed UK broad money demand on both quarterly and annual data, and quarterly narrow money demand for both the United Kingdom and the United States. In your first money demand study [29], you and Grayham Mizon were responding to work by Graham Hacche (1974) at the Bank of England. How did that arise?

Tony Courakis (1978) had submitted a comment to the Economic Journal criticizing Hacche for differencing data in order to achieve stationarity. Grayham Mizon and I proposed testing the restrictions imposed by differencing as an example of Denis's new common-factor tests—later published as Sargan (1980)—and we developed an equilibrium-correction representation for money demand, using the Bank's data. The common factor restriction in Hacche (1974) was rejected, and the equilibrium-correction term in our model was significant.

So, you assumed that the data were stationary, even though differencing was needed.

We implicitly assumed that both the equilibrium-correction term and the differences would be stationary, despite no concept of cointegration; and we assumed that the significance of the equilibrium-correction term was equivalent to rejecting the common factor from differencing. Also, the Bank study was specific to general in its approach, whereas we argued for general-to-specific modeling, which was the natural way to test common-factor restrictions using Denis's determinantal conditions. Denis's COMFAC algorithm was already included in GIVE, although Grayham's and my Monte Carlo study of COMFAC only appeared two years later in [34].

Did Courakis (1978) and [29] change modeling strategies in the United Kingdom? What was the Bank of England's reaction?

The next Bank study—of M1 by Richard Coghlan (1978)—considered general dynamic specifications, but they still lacked an equilibrium-correction term. As I discussed in my follow-up [31], narrow money acts as a buffer for agents' expenditures, but with target ratios for money relative to expenditure, deviations from which prompt adjustment. That target ratio should depend on the opportunity costs of holding money relative to alternative financial assets and to goods, as measured by interest rates and inflation, respectively. Also, because some agents are taxed on interest earnings, and other agents are not, the Fisher equation cannot hold.

So your interest rate measure did not adjust for tax.

Right. [31] also highlighted the problems confronting a simple-to-general approach. Those problems include the misinterpretation of earlier results in the modeling sequence, the impossibility of constructively interpreting test rejections, the many expansion paths faced, the unknown stopping point, the collapse of the strategy if later misspecifications are detected, and the poor properties that result from stopping at the first nonrejection—a criticism dating back to Anderson (1962). A key difficulty with earlier UK money-demand equations had been parameter nonconstancy. However, my equilibrium-correction model was constant over a sample with considerable turbulence after Competition and Credit Control regulations in 1971.

[31] also served as the starting point for a sequence of papers on UK and US M1. You returned to modeling UK M1 again in [60] and [94].

That research resulted in a simple representation for UK M1 demand, despite a very general initial model, with only four variables representing opportunity costs against goods and other assets, adjustment costs, and equilibrium adjustment.

In 1982, Milton Friedman and Anna Schwartz published their book Monetary Trends in the United States and the United Kingdom, and it had many potential policy implications. Early the following year, the Bank asked you to evaluate the econometrics in Friedman and Schwartz (1982) for the Bank's panel of academic consultants, leading to Hendry and Ericsson (1983) and eventually to [93].

You were my research officer then. Friedman and Schwartz's approach was deliberately simple to general, commencing with bivariate regressions, generalizing to trivariate regressions, etc. By the early 1980s, most British econometricians had realized that such an approach was not a good modeling strategy. However, replicating their results revealed numerous other problems as well.

I recall that one of those was simply graphing velocity.

Yes. The graph in Friedman and Schwartz (1982, Chart 5.5, p. 178) made UK velocity look constant over their century of data. I initially questioned your plot of UK velocity—using Friedman and Schwartz's own annual data—because your graph showed considerable nonconstancy in velocity. We discovered that the discrepancy between the two graphs arose mainly because Friedman and Schwartz plotted velocity allowing for a range of 1 to 10, whereas UK velocity itself only varied between 1 and 2.4. Figure 1 reproduces the comparison.

[Figure 1. A comparison of Friedman and Schwartz's graph of UK velocity with Hendry and Ericsson's graph of UK velocity.]

Testing Friedman and Schwartz's equations revealed a considerable lack of congruence. Friedman and Schwartz phase-averaged their annual data in an attempt to remove the business cycle, but phase averaging still left highly autocorrelated, nonstationary processes. Because filtering (such as phase averaging) imposes dynamic restrictions, we analyzed the original annual data. Our paper for the Bank of England panel started a modeling sequence, with contributions from Sean Holly and Andrew Longbottom (1985) and Alvaro Escribano (1985).

Shortly after the meeting of the Bank's panel of academic consultants, there was considerable press coverage. Do you recall how that occurred? The Guardian newspaper started the debate.
As background, monetarism was at its peak. Margaret Thatcher—the prime minister—had instituted a regime of monetary control, as she believed that money caused inflation, precisely the view put forward by Friedman and Schwartz. From this perspective, a credible monetary tightening would rapidly reduce inflation because expectations were rational. In fact, inflation fell slowly, whereas unemployment leapt to levels not seen since the 1930s. The Treasury and Civil Service Committee on Monetary Policy (which I had advised in [36] and [37]) had found no evidence that monetary expansion was the cause of the post-oil-crisis inflation. If anything, inflation caused money, whereas money was almost an epiphenomenon. The structure of the British banking system made the Bank of England a "lender of the first resort," and so the Bank could only control the quantity of money by varying interest rates. At the time, Christopher Huhne was the economics editor at the Guardian. He had seen our critique, and he deemed our evidence central to the policy debate.

As I recall, when Huhne's article hit the press, your phone rang for hours on end.

That it did. There were actually two articles about Friedman and Schwartz (1982) in the Guardian on December 15, 1983. On page 19, Huhne had written an article that summarized—in layman's terms—our critique of Friedman and Schwartz (1982). Huhne and I had talked at length about this piece, and it provided an accurate statement of Hendry and Ericsson (1983) and its implications. In addition—and unknown to us—the Guardian decided to run a front-page editorial on Friedman and Schwartz with the headline "Monetarism's guru 'distorts his evidence'." That headline summarized Huhne's view that it was unacceptable for Friedman and Schwartz to use their data-based dummy variable for 1921–1955 and still claim parameter constancy of their money demand equation. Rather, that dummy variable actually implied nonconstancy because the regression results were substantively different in its absence. That nonconstancy undermined Friedman and Schwartz's policy conclusions.

Charles Goodhart (1982) had also questioned that dummy.

It is legitimate to question any data-based dummy selected for a period unrelated to historical events. Whether that dummy "distorted the evidence" is less obvious, since econometricians often use indicators to clarify evidence or to proxy for unobserved variables. In its place, we used a nonlinear equilibrium correction, which had two equilibria, one for normal times and one for disturbed times (although one could hardly call the First World War "normal"). Like Friedman and Schwartz, we did include a dummy for the two world wars that captured a 4% increase in demand, probably due to increased risks. Huhne later did a TV program about the debate, spending a day at my house filming.

Hendry and Ericsson (1983) was finally published nearly eight years later in [93], after a prolonged editorial process. Just when we thought the issue was laid to rest, Chris Attfield, David Demery, and Nigel Duck (1995) claimed that our equation had broken down on data extended to the early 1990s whereas the Friedman and Schwartz specification was constant.
To compile a coherent statistical series over a long run of history, Attfield, Demery, and Duck had spliced several different money measures together, but they had not adjusted the corresponding measures of the opportunity cost. With that combination, our model did indeed fail. However, as shown in [166], our model remained constant over the whole sample once we used an appropriate measure of opportunity cost, whereas the updated Friedman and Schwartz model failed. Escribano (2004) updates our equation through 2000 and confirms its continued constancy.

Your model of U.S. narrow money demand also generated controversy, as when you presented it at the Fed.

Yes, that research appeared as [96] with Yoshi Baba and Ross Starr. After the supposed breakdown in U.S. money demand recorded by Steve Goldfeld (1976), it was natural to implement similar models for the United States. Many new financial instruments had been introduced, including money market mutual funds, CDs, and NOW and SuperNOW accounts, so we hypothesized that these nonmodeled financial innovations were the cause of the instability in money demand. Ross also thought that long-term interest-rate volatility had changed the maturity structure of the bond market, especially when the Fed implemented its New Operating Procedures. A high long rate was no longer a signal to buy because high interest rates were associated with high variances, and interest rates might go higher still and induce capital losses. This situation suggested calculating a certainty-equivalent long-run interest rate—that is, the interest rate adjusted for risk. Otherwise, the basic approach and specifications were similar. We treated M1 as being determined by the private sector, conditional on interest rates set by the Fed, although the income elasticity was one-half, rather than unity, as in the United Kingdom. Seminars at the Fed indeed produced a number of challenges, including the claim that the Fed engineered a monetary expansion for Richard Nixon's reelection. Dummies for that period were insignificant, so agents were willing to hold that money at the interest rates set, confirming valid conditioning. Another criticism concerned the lag structure, which represented average adjustment speeds in a large and complex economy.

Some economists still regard the final formulation in [96] as too complicated. Sometimes, I think that they believe the world is inherently simple. Other times, I think that they are concerned about data mining. Have you had similar reactions?

Data mining could never spuriously produce the sizes of t-values we found, however many search paths were explored. The variables might proxy unmodeled effects, but their large t-statistics could not arise by chance.
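A rough order-of-magnitude check of that claim (my own back-of-the-envelope calculation, assuming approximately independent and roughly standard-normal t-ratios for irrelevant candidate variables under the null): the largest |t| produced by searching across K such candidates grows only like

max over i ≤ K of |t_i| ≈ √(2 ln K),    e.g., √(2 ln 1000) ≈ 3.7,

so even a very extensive search over irrelevant variables would rarely deliver |t| values much above 4, well below the magnitudes referred to above.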
3.6. Dynamic Econometrics

That takes us to your book Dynamic Econometrics [127], perhaps the largest single project of your professional career so far. This book had several false starts, dating back to just after you had finished your Ph.D.

In 1972, the Italian public company IRI invited Pravin Trivedi and myself to publish (in Italian) a set of lectures on dynamic modeling. In preparing those lectures, we became concerned that conventional econometric approaches camouflaged misspecification. Unfortunately, the required revisions took more than two decades!

Your lectures with Pravin set out a research agenda that included general misspecification analysis (as in [18]), the plethora of estimators (unified in [21]), and empirical model design (systematized in [43], [46], [49], and [69]).

Building on the success of [11] in explaining the simulation results in Goldfeld and Quandt (1972), [18] used a simple analytic framework to investigate the consequences of various misspecifications. As I mentioned earlier (in Section 1.1), I had discovered the estimator generating equation while teaching. To round off the book, I developed some substantive illustrations of empirical modeling, including consumers' expenditure, and housing and the construction sector (which appeared as [59] and [65]). However, new econometric issues continually appeared. For instance, how do we model capital rationing, or the demand for mortgages when only the supply is observed, or the stocks and flows of durables? I realized that I could not teach students how to do applied econometrics until I had sorted out at least some of these problems.

Did you see that as the challenge in writing the book?

Yes. The conventional approach to modeling was to write down the economic theory, collect variables with the same names (such as consumers' expenditure for consumption), develop mappings between the theory constructs and the observations, and then estimate the resulting equations. I had learned that that approach did not work. The straitjacket of the prevailing approach meant that one understood neither the data processes nor the behavior of the economy. I tried a more data-based approach, in which theory provided guidance rather than a complete structure, but that approach required developing concepts of model design and modeling strategy.

You again attempted to write the book when you were visiting Duke University annually in the mid- to late-1980s.

Yes, with Bob Marshall and Jean-François Richard. By that time, common factors, the theory of reduction, equilibrium correction and cointegration, encompassing, and exogeneity had clarified the empirical analysis of individual equations, and powerful software with recursive estimators implemented the ideas. However, modeling complete systems raised new issues, all of which had to be made operational. Writing the software package PcFiml enforced beginning from the unrestricted system, checking its congruence, reducing to a model thereof, testing overidentification, and encompassing the VAR; see [79], [110], and [114]. This work matched parallel developments on system cointegration by Søren, Katarina, and others in Copenhagen. Analyses were still needed of general-to-specific modeling and diagnostic testing in systems (which eventually came in [122]), judging model reliability (my still unpublished Walras–Bowley lecture), and clarifying the role of intertemporal optimization theory.

That was a daunting list!

Bob and Jean-François became more interested in auctions and experimental economics, so their co-authorship lapsed.

I remember receiving your first full draft of Dynamic Econometrics for comment in the late 1980s.

That draft would not have appeared without help from Duo Qin and Carlo Favero. Duo transcribed my lectures, based on draft chapters, and Carlo drafted answers for the solved exercises. The final manuscript still took years more to complete.

Dynamic Econometrics lacks an extensive discussion of cointegration.
That is a surprising omission, given your interest in cointegration and equilibrium correction. All the main omissions inDynamic Econometricswere deliberate, as they were addressed in other books+ Cointegration had been treated in@104#; Monte Carlo in @53# and @95#; numerical issues and software in@81# , @99# , and @115#; the history of econometrics in@132#; and forecasting was to come, presaged by @112# + That distribution of topics letDynamic Econometricsfocus on model- ing+ Because~co! integrated series can be reduced to stationarity, much of Dynamic Econometricsassumes stationarity+ Other forms of nonstationarity would be treated later in@163# and @170# + Even as it stood, Dynamic Econo- metricswas almost 1,000 pages long when published! ET INTERVIEW 777 You dedicated Dynamic Econometrics to your wife, Evelyn, and your daughter, Vivien. How have they contributed to your work on econometrics? I fear that we tread on thin ice here, whatever I say! Evelyn and Vivien have helped in numerous ways, both directly and indirectly, such as by facilitating time to work on ideas and time to visit collaborators+ They have also tolerated numerous discussions on econometrics; corrected my grammar; and, in Vivi- en’s case, questioned my analyses and helped debug the software+ As you know, Vivien is now a professional economist in her own right+ 3.7. Monte Carlo Methodology Let’s now turn to three of the omissions from Dynamic Econometrics: Monte Carlo, the history of econometrics, and forecasting. Pravin introduced me to the concepts of Monte Carlo analysis, based on Ham- mersley and Handscomb~1964!+ I implemented some of their procedures, par- ticularly antithetic variates~AVs! in @8# with Pravin, and later control variates in @16# with Robin Harrison+ I think that it is worth repeating your story about antithetic variates. Pravin and I were graduate students at the time+ We were investigating fore- casts from estimated dynamic models and were using AVs to reduce simulation uncertainty+ Approximating moving-average errors by autoregressive errors Evelyn, Vivien, and David at home in Oxford+ 778 ET INTERVIEW entailed inconsistent parameter estimates and hence, we thought, biased fore- casts+ To check, we printed the estimated AV bias for each Monte Carlo simu- lation of a static model with a moving-average error+We got page upon page of zeros and a scolding from the computing center for wasting paper and com- puter time+ In fact, we had inadvertently discovered that, when an estimator is invariant to the sign of the data but forecast errors change sign when the data do, then the average of AV pairs of forecast errors is precisely zero: see@8# + The idea works for symmetric distributions and hence for generalized least squares with estimated covariance matrices; see Kakwani~1967!+ I have since tried other approaches, as in@34# and@58# + Monte Carlo has been important for developing econometric methodology—by emphasizing the role of the DGP—and in your teach- ing, as reported in [73] and [92]. 
In Monte Carlo, knowledge of the DGP entails all subsequent results using data from that DGP+ The same logic applies to economic DGPs, providing an essen- tial step in the theory of reduction and clarifying misspecification analysis and encompassing+ Monte Carlo also convinced me that the key issue was specifi- cation, rather than estimation+ In Monte Carlo response surfaces, the relative efficiencies of estimators were dominated by variations between models, a view reinforced by my later forecasting research+ Moreover, deriving control vari- ates yielded insights into what determined the accuracy of asymptotic distribu- tion theory+ The software package PcNaive facilitates the live classroom use of Monte Carlo simulation to illustrate and test propositions from econometric theory; see@196# + A final major purpose of Monte Carlo was to check software accuracy by simulating econometric programs for cases where results were known+ Did you also use different software packages to check them against each other? Yes+ The Monte Carlo package itself had to be checked, of course, especially to ensure that its random number generator was i+i+d+ uniform+ 3.8. The History of Econometrics How did you become interested in the history of econometrics? Harry Johnson and Roy Allen sold me their old copies ofEconometrica, which went back to the first volume in 1933+ Reading early papers such as Haavelmo ~1944! showed that textbooks focused on a small subset of the interesting ideas and ignored the evolution of our discipline+ Dick Stone agreed, and he helped me to obtain funding from the ESRC+ By coincidence, Mary Morgan had lost her job at the Bank of England when Margaret Thatcher abolished exchange controls in 1979, so Mary and I commenced work together+ Mary was the opti- mal person to investigate the history objectively, undertaking extensive archi- ET INTERVIEW 779 val research and leading to her superb book,Morgan~1990!+We had the privilege of ~often jointly! interviewing many of our discipline’s founding fathers, includ- ing Tjalling Koopmans, Ted Anderson, Gerhard Tintner, Jack Johnston, Trygve Haavelmo, Herman Wold, and Jan Tinbergen+ The interviews with the latter three provided the basis for@84# , @123# , and@146# + Mary and I worked on@82# and also collated many of the most interesting papers for@132# + Shortly after- wards, Duo Qin~1993! studied the more recent history of econometrics through to about the mid-1970s+ Your interest must have also stimulated some of Chris Gilbert’s work. I held a series of seminars at Nuffield to discuss the history of econometrics with many who published on the topic, such as John Aldrich, Chris, Mary, and Duo+ It was fascinating to reexamine the debates about Frisch’s confluence analy- sis, between Keynes and Tinbergen, etc+ On the latter, I concluded that Keynes was wrong, rather than right, as many believe+ Keynes assumed that empirical econometrics was impossible without knowing the answer in advance+ If that were true generally, science could never have progressed, whereas in fact it has+ You also differ markedly with the profession’s view on another major debate—the one between Koopmans and Vining on “measurement with- out theory.” As @132# reveals, the profession has wrongly interpreted that debate’s implica- tions+ Perhaps this has occurred because the debate is a “classic”—something that nobody reads but everybody cites+ Koopmans~1947! 
assumed that eco- nomic theory was complete, correct, and unchanging, and hence formed an opti- mal basis for econometrics+ However, as Rutledge Vining~1949! noted, economic theory is actually incomplete, abstract, and evolving, so the opposite inference can be deduced+ Koopmans’s assumption is surprising because Koopmans him- self was changing economic theory radically through his own research+ Econ- omists today often use theories that differ from those that Koopmans alluded to, but still without concluding that Koopmans was wrong+ However, absent Koopmans’s assumption, one cannot justify forcing economic-theory specifica- tions on data+ 3.9. Economic Policy and Government Interactions London gave ready access to government organizations, and LSE fos- tered frequent interactions with government economists. There is no equiv- alent academic institution in Washington with such close government contacts. You have had long-standing relationships with both the Trea- sury and the Bank of England. The Treasury’s macroeconometric model had a central role in economic policy analysis and forecasting, so it was important to keep its quality as high as fea- 780 ET INTERVIEW sible with the resources available+ The Treasury created an academic panel to advise on their model, and that panel met regularly for many years, introducing developments in economics and econometrics and teaching modeling to their recently hired economists+ Also, DHSY attracted the Treasury’s attention+ The negative effect of infla- tion on consumers’ expenditure—approximating the erosion of wealth—entailed that if stimulatory fiscal policy increased inflation, the overall outcome was deflationary+ Upon replacing the Treasury’s previous consumption function with DHSY, many multipliers in the Treasury model changed sign, and debates fol- lowed about what were the correct and wrong signs for such multipliers+ Some economists rationalized these signs as being due to forward-looking agents pre- empting government policy, which then had the opposite effect from the previ- ous “Keynesian” predictions+ The Bank of England also had an advisory panel+ My housing model showed large effects on house prices from changes in outstanding mortgages because the mortgage market was credit-constrained, so ~in the mid-1980s! I served on the Bank’s panel, examining equity withdrawal from the housing market and the consequential effect of housing wealth on expenditure and inflation+ Civil servants and ministers interacted with LSE faculty on parliamentary select com- mittees as well+ Once, in a deputation with Denis Sargan and other LSE econ- omists, we visited Prime Minister Callaghan to explain the consequences of expansionary policies in a small open economy+ You participated in two select committees, one on monetary policy and one on economic forecasting. 
I suspect that my notoriety was established by@32# , my paper nicknamed “Alchemy,” which was even discussed in Parliament for deriding the role of money+ Shortly after@32# appeared, a Treasury and Civil Service Committee on monetary policy was initiated because many members of Parliament were unconvinced by Margaret Thatcher’s policy of monetary control, and they sought the evidential basis for that policy+ The committee heard from many of the world’s foremost economists+ Most of the evidence was not empirical but purely theoretical, being derived from simplified economic models from which their proprietor deduced what must happen+ As the committee’s econometric adviser, I collected what little empirical evidence there was, most of it from the Trea- sury+ The Treasury, despite arguing the government’s case, could not establish that money caused inflation+ Instead, it found evidence that devaluations, wage- price spirals, excess demands, and commodity-price shocks mattered; see@36# and@37# + Those testimonies emphasized theory relative to empirical evidence— a more North American approach. Many of those presenting evidence were North American, but several UK econ- omists also used pure theory+ Developing sustainable econometric evidence requires considerable time and effort, which is problematic for preparing mem- ET INTERVIEW 781 oranda to a parliamentary committee+ Most of my empirical studies have taken years+ Surprisingly, evidence dominated theory in the 1991 enquiry into official eco- nomic forecasting; see@91# + There was little relevant theory, but there was no shortage of actual forecasts or studies of them+ There were many papers on statistical forecasting but few explicitly on economic forecasting for large, com- plex, nonstationary systems in which agents could change their behavior+ Fore- casts from different models frequently conflicted, and the underlying models often suffered forecast failure+As Makridakis and Hibon~2000! and@191# argue, those realities could not be explained within the standard paradigm that fore- casts were the conditional expectations+ That enquiry triggered my interest in developing a viable theory of forecasting+ Even after numerous papers—starting with @124# , @125# , @137# , @138# , @139# , and @141#—that research program is still ongoing+ You have also interacted with government on the preparation and qual- ity of national statistics. In the mid-1960s, I worked on National Accounts at the Central Statistical Office with Jack Hibbert and David Flaxen+ Attributing components of output to sec- tors, calculating output in constant prices, and aggregating the components to measure GNP was an enlightening experience+ Most series were neither chained nor Divisia, but Laspeyres, and updated only intermittently, often inducing changes in estimated relationships+ More recently, in @179# and @190# with Andreas Beyer and Jurgen Doornik, I have helped create aggregate data series for a synthetic Euroland+ Data accuracy is obviously important to any approach that emphasizes empirical evidence, and I had learned that, although macro sta- tistics were imperfect, they were usable for statistical analysis+ For example, consumption and income were revised jointly, essentially maintaining cointe- gration between them+ Is that because the relationship is primarily between their nominal values—which alter less on updating—and involves prices only secondarily? Yes+ Ian Harnett~1984! 
showed that the price indices nearly cancel in the log ratio, which approximates the long-run outcome+ However, occasional large revi- sions can warp the evidence+ In the early 1990s, the Central Statistical Office revised savings rates by as much as 8 percentage points in some quarters~from 12% to 4%, say!, compared to equation standard errors of about 1%+ In unraveling why these revisions were made, we uncovered mistakes in how the data were constructed+ In particular, the doubling of the value-added tax ~VAT ! in the early 1980s changed the relation between the expenditure, output, and income measures of GNP+ Prior to the increase in VAT, some individuals had cheated on their income tax but could not do so on expenditure taxes, so the expenditure measure had been the larger+ That relationship reversed after VAT rose to 17+5%, but the statisticians wrongly assumed that they had mis- measured income earlier+ Such drastic revisions to the data led me to propose 782 ET INTERVIEW that the recently created Office of National Statistics form a panel on the qual- ity of economic statistics, and the ONS agreed+ The panel has since discussed such issues as data measurement, revision, seasonal adjustment, and national income accounting+ 3.10. The Theory of Economic Forecasting The forecast failure in 1968 motivated your research on methodology. What has led you back to investigate ex ante forecasting? That early failure dissuaded me from real-time forecasting, and it took 25 years to understand its message+ In the late 1970s, I investigated ex post predictive failure in @31# + Later, in @62# with Yock Chong and also in@67# , I looked at forecasting from dynamic systems, mainly to improve our power to test mod- els+ In retrospect, these two papers suggest much more insight than we had at the time—we failed to realize the implications of many of our ideas+ In an important sense, policy rekindled my interest in forecasting+ The Trea- sury missed the sharp downturn in 1989, having previously missed the boom from 1987, and the resulting policy mistakes combined to induce high inflation and high unemployment+ Mike Clements and I then sought analytical founda- tions for ex ante forecast failure when the economy is subject to structural breaks and forecasts are from misspecified and inconsistently estimated models that are based on incorrect economic theories and selected from inaccurate data+ Everything was allowed to be wrong, but the investigator did not know that+ Despite the generality of this framework, we derived some interesting theo- rems about economic forecasting, as shown in@105# , @120# , and @121# + The theory’s empirical content matched the historical record, and it suggested how to improve forecasting methods+ Surprisingly, estimation per se was not a key issue. The two important features were allowing for misspecified models and incorporating struc- tural change in the DGP. 
Yes+ Given that combination, we could disprove the theorem that causal vari- ables must beat noncausal variables at forecasting+ Hence, extrapolative meth- ods could win at forecasting, as shown in@171# + As @187# and@188# considered, that result suggests different roles for econometric models in forecasting and in economic policy, with causality clearly being essential in the latter+ The implications are fundamental+ Ex ante forecast failure should not be used to reject models, as happened after the first oil crisis; see@159# + An almost perfect model could both forecast badly and be worse than an extrapolative procedure, so the debate between Box–Jenkins models and econometric mod- els needs reinterpretation+ In @162# , we also came to realize a difference between equilibrium correction and error correction+ The first induces cointegration, whereas in the latter a model adjusts to eliminate forecast errors+ Devices like random walks and exponentially weighted moving averages embody error cor- ET INTERVIEW 783 rection, whereas cointegrated systems—which have equilibrium correction— will forecast systematically badly when an equilibrium mean shifts, since they continue to converge to the old equilibrium+ This explained why the Treasury’s cointegrated system had performed so badly in the mid-1980s, following the sharp reduction in UK credit rationing+ It also helped us demonstrate in@138# the properties of intercept corrections to offset such shifts+ Most recently, @204# offers an exposition and@210# a compendium+ Are you troubled that the best explanatory model need not be the best for forecasting and that the best policy model could conceivably be different from both, as suggested in [187]? Some structural breaks—such as shifts in equilibrium means—are inimical to forecasts from econometric models but not from robust devices, which do not explain behavior+ Such shifts might not affect the relevant policy derivatives+ For example, the effect of interest rates on consumers’ expenditure could be constant, despite a shift in the target level of savings due to~say! changed gov- ernment provisions for health in old age+ After the shift, changing the interest rate still will have the expected policy effect, even though the econometric model is misforecasting+ Because we could robustify econometric models against such forecast failures, it may prove possible to use the same baseline causal econo- metric model for forecasting and for policy+ If the econometric model alters after a policy experiment, then at least we learn that super exogeneity is lacking+ There was considerable initial reluctance to fund such research on forecast- ing, with referees deeming the ideas as unimplementable+ Unfortunately, such attitudes have returned, as the ESRC has recently declined to support our research on this topic+ One worries about their judgment, given the importance of forecasting in modern policy processes, and the lack of understanding of many aspects of the problem even after a decade of considerable advances+ 4. ECONOMETRIC SOFTWARE 4.1. The History and Roles of GIVE and PcGive In my M.Sc. course, you enumerated three reasons for having written the computer package GIVE. The first was to facilitate your own research, seeing as many techniques were not available in other packages. The second was to ensure that other researchers did not have the excuse of unavailability—more controversial! The third was for teaching. 
Nonoperational econometric methods are pointless, so computer software must be written+ Early versions of GIVE demonstrated the computability of FIML for systems with high-order vector autoregressive errors and latent-variable struc- tures, as in @33#: @174# and @218# provide a brief history+ In those days, code was on punched cards+ I once dropped my box off a bus and spent days sorting it out+ 784 ET INTERVIEW You dropped your box of cards off a bus? The IBM 360065 was at UCL, so I took buses to and from LSE+ Once, when rounding the Aldwych, the bus cornered faster than I anticipated, and my box of cards went flying+ The program could only be re-created because I had num- bered every one of the cards+ I trust that it wasn’t a rainy London day! That would have been a disaster+ After moving to Oxford, I ported GIVE to a menu-driven form~called PcGive! on an IBM PC 8088, using a rudimentary Fortran compiler; see@81# + That took about four years, with Adrian Neale writ- ing graphics in Assembler+ A Windows version appeared after Jurgen Doornik translated PcGive to C11, leading to@195# , @201# , @197# , and@194# + An attractive feature of PcGive has been its rapid incorporation of new tests and estimators—sometimes before they appeared in print, as with the Johansen (1988) reduced-rank cointegration procedure. Adding routines initially required control of the software, but Jurgen recently converted PcGive to his Ox language, so that developments could be added by anyone writing Ox packages accessible from GiveWin; see Doornik~2001!+ The two other important features of the software are its flexibility and its accu- racy, with the latter checked by standard examples and by Monte Carlo+ Earlier versions of PcGive were certainly less flexible: the menus defined everything that could be done, even while the program’s interactive nature was well-suited to empirical model design. The use of Ox and the devel- opment of a batch language have alleviated that. I was astounded by a feature that Jurgen recently introduced. At the end of an interactive ses- sion, PcGive can generate batch code for the entire session. I am not aware of any other program that has such a facility. Batch code helps replication+ Our latest Monte Carlo package~PcNaive! is just an experimental design front end that defines the DGP, the model specification, sample size, etc+, and then writes out an Ox program for that formulation+ If desired, that program can be edited independently; then it is run by Ox to calculate the Monte Carlo simulations+ While this approach is mainly menu-driven, it delivers complete flexibility in Monte Carlo+ For teaching, it is invaluable to have easy-to-use, uncrashable, menu-driven programs, whereas complicated batch code is a disaster waiting to happen+ In writing PcGive, you sought to design a program that was not only numerically accurate but also reasonably bug-proof. I wonder how many graduate students have misprogrammed GMM or some other estimator using GAUSS or RATS. Coding mistakes and inefficient programs can certainly produce inaccurate out- put+ Jurgen found that the RESETF-statistic can differ by a factor of a hun- ET INTERVIEW 785 dred, depending upon whether it is calculated by direct implementation in regression or by partitioned inversion using singular value decomposition+ Bruce McCullough has long been concerned about accurate output, and with good reason, as his comparison in McCullough~1998! 
shows+ The latest development is the software package PcGets, designed with Hans- Martin Krolzig+ “Gets” stands for “general-to-specific,” and PcGets now auto- matically selects an undominated congruent regression model from a general specification+ Its simulation properties confirm many of the earlier methodolog- ical claims about general-to-specific modeling, and PcGets is a great time- saver for large problems; see@175# , @206# , @209# , and@226# + PcGets still requires the economist’s value added in terms of the choice of variables and in terms of transformations of the unrestricted model. The algorithm indeed confirms the advantages of good economic analysis, both through excluding irrelevant effects and~especially! through including relevant ones+ Still, excessive simplification—as might be justified by some economic theory—will lead to a false general specification with no good model choice+ Fortunately, there seems little power loss from some overspecification with orthogonal regressors, and the empirical size remains close to the nominal+ 4.2. The Role of Computing Facilities More generally, computing has played a central role in the develop- ment of econometrics. Historically, it has been fundamental+ Estimators that were infeasible in the 1940s are now routine+ Excellent color graphics are also a major boon+ Computation can still be a limiting factor, though+ Simulation estimation and Monte Carlo studies of model selection strain today’s fastest PCs+ Parallel computation thus remains of interest, as discussed in@214# with Neil Shephard and Jurgen Doornik+ There is an additional close link between computing and econometrics: dif- ferent estimators are often different algorithms for approximating the same like- lihood, as with the estimator generating equation+ Also, inefficient numerical procedures can produce inefficient statistical estimates, as with Cochrane– Orcutt estimates for dynamic models with autoregressive errors+ In this exam- ple, stepwise optimization and the corresponding statistical method are both inefficient because the coefficient covariance matrix is nondiagonal+ Much can be learned about our statistical procedures from their numerical properties+ 4.3. The Role of Computing in Teaching Was it difficult to use computers in teaching when only batch jobs could be run? Indeed it was+ My first computer-based teaching was with Ken Wallis using the Wharton model for macroeconomic experiments; see McCarthy~1972!+ The 786 ET INTERVIEW students gave us their experimental inputs, which we ran, receiving the results several hours later+ Now such illustrations are live and virtually instantaneous and so can immediately resolve questions and check conjectures+ The absorp- tion of interactive computing into teaching has been slow, even though it has been feasible for nearly two decades+ I first did such presentations in the mid- 1980s, and my first interactive-teaching article was@68# , with updates in@70# and@131# + Even now, few people use PCs interactively in seminars, although some do in teaching. Perhaps interactive computer-based presentations require familiarity with the software, reliability of the software, and confidence in the model being presented. When I have made such presentations, they have often led to testing the model in ways that I hadn’t previously thought of. If the model fails on such tests, that is informative for me because it implies room for model improvement. If the model doesn’t fail, then that is additional evidence in favor of the model. 
Some conjectures involve unavailable data, but Internet access to data banks will improve that+Also,models that were once thought too complicated to model live—such as dynamic panels with awkward instrumental variable structures, allowing for heterogeneity, etc+—are now included in PcGive+ In live Monte Carlo simulations, students often gain important insights from experiments where theychoose the parameter values+ David teaching econometrics “live” in Argentina in 1993+ ET INTERVIEW 787 5. CONCLUDING REMARKS 5.1. Achievements and Failures What do you see as your most important achievements, and what were your biggest failures? Achievements are hard to pin down, even retrospectively, but the ones that have given me most pleasure were~a! consolidating estimation theory through the estimator generating equation; ~b! formalizing the methodology and model concepts to sustain general-to-specific modeling; ~c! producing a theory of eco- nomic forecasting that has substantive content; ~d! successfully designing com- puter automation of general-to-specific model selection in PcGets; ~e! developing efficient Monte Carlo methods; ~f ! building useful empirical models of hous- ing, consumers’ expenditure, and money demand; and ~g! stimulating a resur- gence of interest in the history of our discipline+ I now see automatic model selection as a new instrument for the social sci- ences, akin to the microscope in the biological sciences+ Already, PcGets has demonstrated remarkable performance across different~unknown! states of nature, with the empirical data generating process being found almost as often by commencing from a general model as from the DGP itself+ Retention of rel- evant variables is close to the theoretical maximum, and elimination of irrele- vant variables occurs at the rate set by the chosen significance level+ The selected estimates have the appropriate reported standard errors, and they can be bias- corrected if desired, which also down-weights adventitiously significant coeffi- cients+ These results essentially resuscitate traditional econometrics, despite data- based selection; see@226# and@231# + Peter Phillips~1996! has made great strides in the automation of model selection using a related approach; see also@221# + The biggest failure is not having persuaded more economists of the value of data-based econometrics in empirical economics, although that failure has stim- ulated improvements in modeling and model formulations+ This reaction is certainly not uniform+ Many empirical researchers in Europe adopt a general- to-specific modeling approach—which may be because they are regularly exposed to its applications—whereas elsewhere other views are dominant and are virtually enforced by some journals+ What role does failure play in econometrics and empirical modeling? As a psychology student, I learned that failure was the route to success+ Look- ing for positive instances of a concept is a slow way to acquire it when com- pared to seeking rejections+ Because macroeconomic data are nonexperimental, aren’t economists correctly hesitant about overemphasizing the role of data in empirical modeling? 
Such data are the outcome of governmental administrative processes, of which we can only observe one realization+We cannot rerun an economy under a dif- 788 ET INTERVIEW ferent state of nature+ The analysis of nonexperimental data raises many inter- esting issues, but lack of experimentation merely removes a tool, and its lack does not preclude a scientific approach or prevent progress+ It certainly hasn’t stopped astronomers, environmental biologists, or meteorologists from analyzing their data. Indeed+ Historically, there are many natural, albeit uncontrolled, experiments+ Governments experiment with policies; new legislation has unanticipated con- sequences; and physical and political turmoil through violent weather, earth- quakes, and war are ongoing+ It is not easy to persuade governments to conduct controlled, small-scale, regular experiments+ I once unsuccessfully suggested randomly perturbing the Treasury bill tender at a regular frequency to test its effects on the discount and money markets and on the banking system+ You have worked almost exclusively with macroeconomic time series, rather than with micro data in cross sections or in panels. Why did you make that choice? My first empirical study analyzed panel data, and it helped convince me to focus on macroeconomic time series instead+ I was consulting for British Petro- leum on bidding behavior, and I had about a million observations in total for oil products on about a thousand outlets for every canton in Switzerland,monthly, over a decade+ BP’s linear programming system took prices as parametric, and they wanted to endogenize price determination+ The Swiss study sought to esti- mate demand functions+ Even allowing for fixed effects, dynamics dominated, with near-unit roots, despite the~now known! downward biases+We built opti- mized models to determine bids, assuming that the winning margin had a Weibull distribution, estimated from information on the winning bid and our own bid, which might coincide+ I also wrote a panel-data analysis program with Chris Gilbert to study voting behavior in York+ The program tested for pooling the cross sections, the time series, and both+ It was difficult to get much out of such panels, as only a tiny percentage of the variation was explained+ It seemed unlikely that the remaining variation was random, so much of the explanation must be missing+ Because omitted variables would rarely be orthogonal to the included variables, the estimated coefficients would not correspond to the behav- ioral parameters+With macroeconomic data, the problem is the converse of fit- ting too well+ A difficulty with cross sections is their dependence on time, so the errors are not independent, due to common effects+ Quite early on, I thus decided to first understand time series and then come back to analyzing micro data, but I haven’t reached the end of the road on time series yet+ Your view on cross-section modeling differs from the conventional view that it reveals the long run. I have not seen a proof of that claim+ As a counterexample, suppose that a recent shock places all agents in disequilibrium during the measured cross section+ ET INTERVIEW 789 5.2. Directions for the Future What directions will your research explore? 
A gold mine of new results awaits discovery from extending the theory of eco- nomic forecasting in the face of rare events, and from delineating what aspects of models are most important in forecasting+ Also, much remains to be under- stood about modeling procedures+ Both are worthwhile topics, especially as new developments are likely to have practical value+ The econometrics of economic policy analysis also remains underdeveloped+ For instance, it would help to understand which structural changes affect forecasting but not policy in order to clarify the relationship between forecasting models and policy models+ Given the difficulties with impulse response analyses documented in@128# , @165# , and @188# , open models would repay a visit+ Policy analyses require congruent mod- els with constant parameters, so more powerful tests of changes in dynamic coefficients are needed+ Many further advances are already in progress for automatic model selection, such as dealing with cointegration, with systems, and with nonlinear models+ This new tool resolves a hitherto intractable problem, namely, estimating a regres- sion when there are more candidate variables than observations, as can occur when there are many potential interactions+ Provided that theDGPhas fewer vari- ables than observations, repeated application of the multipath search process to feasible blocks is likely to deliver a model with the appropriate properties+ That should keep you busy! NOTE 1+ The interviewer is a staff economist in the Division of International Finance, Board of Gov- ernors of the Federal Reserve System, Washington, D+C+ 20551 U+S+A+, and the interviewee is an ESRC Professorial Research Fellow and the head of the Economics Department at the Univer- sity of Oxford+ They may be reached on the Internet at ericsson@frb+gov and david+hendry@ economics+ox+ac+uk, respectively+ The views in this interview are solely the responsibility of the author and the interviewee and should not be interpreted as reflecting the views of the Board of Governors of the Federal Reserve System or of any other person associated with the Federal Reserve System+ We are grateful to Julia Campos, Jonathan Halket, Jaime Marquez, Kristian Rogers, and especially Peter Phillips for helpful comments and discussion, and to Margaret Gray and Hayden Smith for assistance in transcription+ Empirical results and graphics were obtained using PcGive Professional Version 10: see@195# and@201# + REFERENCES Anderson, T+W+ ~1962! The choice of the degree of a polynomial regression as a multiple decision problem+ Annals of Mathematical Statistics33, 255–265+ Anderson, T+W+ ~1976! Estimation of linear functional relationships: Approximate distributions and connections with simultaneous equations in econometrics+ Journal of the Royal Statistical Soci- ety, Series B38, 1–20~with discussion!+ Attfield, C+L+F+, D+ Demery, & N +W+ Duck ~1995! Estimating the UK Demand for Money Func- tion: A Test of Two Approaches+ Mimeo, Department of Economics, University of Bristol+ Benassy, J+-P+ ~1986! Macroeconomics: An Introduction to the Non-Walrasian Approach+ Academic Press+ 790 ET INTERVIEW Bontemps, C+ & G+E+ Mizon ~2003! Congruence and encompassing+ In B+P+ Stigum ~ed+!, Econo- metrics and the Philosophy of Economics: Theory-Data Confrontations in Economics, pp+ 354– 378+ Princeton University Press+ Box, G+E+P+ & G+M+ Jenkins~1970! Time Series Analysis: Forecasting and Control+ Holden-Day+ Breusch, T+S+ ~1986! 
Hypothesis testing in unidentified models+ Review of Economic Studies53, 635–651+ Chan, N+H+ & C+Z+ Wei ~1988! Limiting distributions of least squares estimates of unstable auto- regressive processes+ Annals of Statistics16, 367–401+ Coghlan, R+T+ ~1978! A transactions demand for money+ Bank of England Quarterly Bulletin18, 48–60+ Cooper, J+P+ & C+R+ Nelson~1975! The ex ante prediction performance of the St+ Louis and FRB- MIT-PENN econometric models and some results on composite predictors+ Journal of Money, Credit, and Banking7, 1–32+ Courakis, A+S+ ~1978! Serial correlation and a Bank of England study of the demand for money: An exercise in measurement without theory+ Economic Journal88, 537–548+ Cox, D+R+ ~1962! Further results on tests of separate families of hypotheses+ Journal of the Royal Statistical Society, Series B24, 406–424+ Deaton, A+S+ ~1977! Involuntary saving through unanticipated inflation+ American Economic Review 67, 899–910+ Doornik, J+A+ ~2001! Ox 3.0: An Object-Oriented Matrix Programing Language+ Timberlake Con- sultants Press+ Durbin, J+ ~1988! Maximum likelihood estimation of the parameters of a system of simultaneous regression equations+ Econometric Theory4, 159–170~paper presented to the European Meet- ings of the Econometric Society, Copenhagen, 1963!+ Engle, R+F+ & C+W+J+ Granger~1987! Co-integration and error correction: Representation, estima- tion, and testing+ Econometrica55, 251–276+ Escribano, A+ ~1985! Non-linear Error-correction: The Case of Money Demand in the U+K+ ~1878– 1970!+ Mimeo, University of California at San Diego+ Escribano, A+ ~2004! Nonlinear error correction: The case of money demand in the United King- dom ~1878–2000!+ Macroeconomic Dynamics8, 76–116+ Fisk, P+R+ ~1967! Stochastically Dependent Equations: An Introductory Text for Econometricians+ Griffin’s Statistical Monographs and Courses 21+ Charles Griffin+ Friedman, M+ & A +J+ Schwartz~1982! Monetary Trends in the United States and the United King- dom: Their Relation to Income, Prices, and Interest Rates, 1867–1975+ University of Chicago Press+ Frisch, R+ ~1933! Editorial+ Econometrica1, 1–4+ Gilbert, C+L+ ~1986! Professor Hendry’s econometric methodology+ Oxford Bulletin of Economics and Statistics48, 283–307+ Godfrey, L+G+ ~1988! Misspecification Tests in Econometrics+ Cambridge University Press+ Goldfeld, S+M+ ~1976! The case of the missing money+ Brookings Papers on Economic Activity3, 683–730~with discussion!+ Goldfeld, S+M+ & R+E+ Quandt~1972! Nonlinear Methods in Econometrics+ North-Holland+ Goodhart, C+A+E+ ~1982! Monetary Trends in the United States and the United Kingdom: A British Review+ Journal of Economic Literature20, 1540–1551+ Granger, C+W+J+ ~1981! Some properties of time series data and their use in econometric model specification+ Journal of Econometrics16, 121–130+ Granger, C+W+J+ ~1986! Developments in the study of cointegrated economic variables+ Oxford Bul- letin of Economics and Statistics48, 213–228+ Granger, C+W+J+ & A +A+Weiss~1983! Time series analysis of error-correction models+ In S+ Karlin, T+ Amemiya, and L+A+ Goodman~eds+!, Studies in Econometrics, Time Series, and Multivariate Statistics: In Honor of Theodore W. Anderson, pp+ 255–278+ Academic Press+ Haavelmo, T+ ~1944! The probability approach in econometrics+ Econometrica12, supplement, i–viii , 1–118+ ET INTERVIEW 791 Hacche, G+ ~1974! The demand for money in the United Kingdom: Experience since 1971+ Bank of England Quarterly Bulletin14, 284–305+ Hall, R+E+ ~1978! 
Stochastic implications of the life cycle-permanent income hypothesis: Theory and evidence+ Journal of Political Economy86, 971–987+ Hammersley, J+M+ & D +C+ Handscomb~1964! Monte Carlo Methods+ Chapman and Hall+ Hannan, E+J+ & B +G+ Quinn ~1979! The determination of the order of an autoregression+ Journal of the Royal Statistical Society, Series B41, 190–195+ Harnett, I+ ~1984! An Econometric Investigation into Recent Changes of UK Personal Sector Con- sumption Expenditure+ M+ Phil+ thesis, University of Oxford+ Hendry, D+F+ & N +R+ Ericsson~1983! Assertion without empirical basis: An econometric appraisal of “Monetary Trends in+ + + the United Kingdom” by Milton Friedman and Anna Schwartz+ In Monetary Trends in the United Kingdom, Bank of England Panel of Academic Consultants, Panel paper 22, Bank of England, pp+ 45–101+ Hildenbrand, W+ ~1994! Market Demand: Theory and Empirical Evidence+ Princeton University Press+ Hoover, K+D+ & S+J+ Perez~1999! Data mining reconsidered: Encompassing and the general-to- specific approach to specification search+ Econometrics Journal2, 167–191~with discussion!+ Johansen, S+ ~1988! Statistical analysis of cointegration vectors+ Journal of Economic Dynamics and Control12, 231–254+ Kakwani, N+C+ ~1967! The unbiasedness of Zellner’s seemingly unrelated regression equations esti- mators+ Journal of the American Statistical Association62, 141–142+ Katona, G+ & E+ Mueller ~1968! Consumer Response to Income Increases+ Brookings Institution+ Keynes, J+M+ ~1936! The General Theory of Employment, Interest and Money+ Harcourt, Brace+ Klein, L+R+ ~1953! A Textbook of Econometrics+ Row, Peterson and Company+ Koopmans, T+C+ ~1947! Measurement without theory+ Review of Economics and Statistics~for- merly theReview of Economic Statistics! 29, 161–172+ Longbottom, A+ & S+ Holly ~1985! Econometric methodology and monetarism: Professor Fried- man and Professor Hendry on the demand for money+ Discussion paper 131, London Business School+ Lucas, R+E+, Jr+ ~1976! Econometric policy evaluation: A critique+ In K+ Brunner and A+H+ Meltzer ~eds+!, The Phillips Curve and Labor Markets, Carnegie-Rochester Conference Series on Public Policy, vol+ 1+ Journal of Monetary Economics, supplement, 19–46~with discussion!+ Makridakis, S+ & M + Hibon ~2000! The M3-competition: Results, conclusions and implications+ International Journal of Forecasting16, 451–476+ McCarthy, M+D+ ~1972! The Wharton Quarterly Econometric Forecasting Model Mark III+ Studies in Quantitative Economics 6+ University of Pennsylvania+ McCullough, B+D+ ~1998! Assessing the reliability of statistical software: Part I+ American Statis- tician 52, 358–366+ Mizon, G+E+ ~1977! Inferential procedures in nonlinear models: An application in a UK industrial cross section study of factor substitution and returns to scale+ Econometrica45, 1221–1242+ Mizon, G+E+ ~1995! Progressive modeling of macroeconomic time series: The LSE methodology+ In K+D+ Hoover~ed+!, Macroeconometrics: Developments, Tensions, and Prospects, pp+ 107–170 ~with discussion!+ Kluwer Academic Publishers+ Mizon, G+E+ & J+-F+ Richard~1986! The encompassing principle and its application to testing non- nested hypotheses+ Econometrica54, 657–678+ Morgan, M+S+ ~1990! The History of Econometric Ideas+ Cambridge University Press+ Muth, J+F+ ~1961! Rational expectations and the theory of price movements+ Econometrica29, 315–335+ Osborn, D+R+ ~1988! 
Seasonality and habit persistence in a life cycle model of consumption+ Jour- nal of Applied Econometrics3, 255–266+ Osborn, D+R+ ~1991! The implications of periodically varying coefficients for seasonal time-series processes+ Journal of Econometrics48, 373–384+ 792 ET INTERVIEW Pesaran, M+H+ ~1974! On the general problem of model selection+ Review of Economic Studies41, 153–171+ Phillips, A+W+ ~1954! Stabilisation policy in a closed economy+ Economic Journal64, 290–323+ Phillips, A+W+ ~1956! Some notes on the estimation of time-forms of reactions in interdependent dynamic systems+ Economica23, 99–113+ Phillips, A+W+ ~1957! Stabilisation policy and the time-forms of lagged responses+ Economic Jour- nal 67, 265–277+ Phillips, A+W+ ~2000! Estimation of systems of difference equations with moving average distur- bances+ In R+ Leeson~ed+!, A.W.H. Phillips: Collected Works in Contemporary Perspective, pp+ 423–444+ Cambridge University Press+ ~Walras–Bowley Lecture, Econometric Society Meet- ing, San Francisco, December 1966+! Phillips, P+C+B+ ~1986! Understanding spurious regressions in econometrics+ Journal of Economet- rics 33, 311–340+ Phillips, P+C+B+ ~1987! Time series regression with a unit root+ Econometrica55, 277–301+ Phillips, P+C+B+ ~1996! Econometric model determination+ Econometrica64, 763–812+ Phillips, P+C+B+ ~1997! The ET interview: Professor Clive Granger+ Econometric Theory13, 253–303+ Qin, D+ ~1993! The Formation of Econometrics: A Historical Perspective+ Clarendon Press+ Richard, J+-F+ ~1980! Models with several regimes and changes in exogeneity+ Review of Economic Studies47, 1–20+ Robinson, P+M+ ~2003! Denis Sargan: Some perspectives+ Econometric Theory19, 481–494+ Samuelson, P+A+ ~1947! Foundations of Economic Analysis+ Harvard University Press+ Samuelson, P+A+ ~1961! Economics: An Introductory Analysis, 5th ed+ McGraw-Hill+ Sargan, J+D+ ~1964! Wages and prices in the United Kingdom: A study in econometric methodol- ogy+ In P+E+ Hart, G+ Mills , and J+K+ Whitaker ~eds+!, Econometric Analysis for National Eco- nomic Planning, Colston Papers, vol+ 16, pp+ 25–54~with discussion!+ Butterworths+ Sargan, J+D+ ~1975! Asymptotic theory and large models+ International Economic Review16, 75–91+ Sargan, J+D+ ~1980! Some tests of dynamic specification for a single equation+ Econometrica48, 879–897+ Savin, N+E+ ~1980! The Bonferroni and the Scheffé multiple comparison procedures+ Review of Economic Studies47, 255–273+ Silvey, S+D+ ~1959! The Lagrangian multiplier test+ Annals of Mathematical Statistics30, 389–407+ Stigum, B+P+ ~1990! Toward a Formal Science of Economics: The Axiomatic Method in Economics and Econometrics+ MIT Press+ Stock, J+H+ ~1987! Asymptotic properties of least squares estimators of cointegrating vectors+ Econ- ometrica55, 1035–1056+ Summers, L+H+ ~1991! The scientific illusion in empirical macroeconomics+ Scandinavian Journal of Economics93, 129–148+ Thomas, J+J+ ~1964! Notes on the Theory of Multiple Regression Analysis+ Center of Economic Research, Training Seminar Series, No+ 4+ Contos Press+ Tinbergen, J+ ~1951! Business Cycles in the United Kingdom, 1870–1914+ North-Holland+ Trivedi, P+K+ ~1970! The relation between the order-delivery lag and the rate of capacity utilization in the engineering industry in the United Kingdom, 1958–1967+ Economica37, 54–67+ Vining, R+ ~1949! Koopmans on the choice of variables to be studied and of methods of measure- ment+ Review of Economics and Statistics31, 77–86+ West, K+D+ ~1988! 
Asymptotic normality, when regressors have a unit root+ Econometrica56, 1397–1417+ White, H+ ~1990! A consistent model selection procedure based onm-testing+ In C+W+J+ Granger ~ed+!, Modelling Economic Series: Readings in Econometric Methodology, pp+ 369–383+ Oxford University Press+ Whittle, P+ ~1963! Prediction and Regulation by Linear Least-Square Methods+ D+ Van Nostrand+ ET INTERVIEW 793 THE PUBLICATIONS OF DAVID F. HENDRY 1966 1+ Survey of student income and expenditure at Aberdeen University, 1963–64 and 1964–65+ Scottish Journal of Political Economy13, 363–376+ 1970 2+ Book review ofIntroduction to Linear Algebra for Social Scientistsby Gordon Mills+ Econom- ica 37, 217–218+ 1971 3+ Discussion+ Journal of the Royal Statistical Society, Series A134, 315+ 4+ Maximum likelihood estimation of systems of simultaneous regression equations with errors generated by a vector autoregressive process+ International Economic Review12, 257–272+ 1972 5+ Book review ofElements of Econometricsby J+ Kmenta+ Economic Journal82, 221–222+ 6+ Book review of Regression and Econometric Methodsby David S+ Huang+ Economica39, 104–105+ 7+ Book review ofThe Analysis and Forecasting of the British Economyby M+J+C+ Surrey+ Eco- nomica39, 346+ 8+ With P+K+ Trivedi+ Maximum likelihood estimation of difference equations with moving aver- age errors: A simulation study+ Review of Economic Studies39, 117–145+ 1973 9+ Book review ofEconometric Models of Cyclical Behaviour, edited by Bert G+ Hickman+ Eco- nomic Journal83, 944–946+ 10+ Discussion+ Journal of the Royal Statistical Society, Series A136, 385–386+ 11+ On asymptotic theory and finite sample experiments+ Economica40, 210–217+ 1974 12+ Book review ofA Textbook of Econometricsby L+R+ Klein+ Economic Journal84, 688–689+ 13+ Book review ofOptimal Planning for Economic Stabilization: The Application of Control Theory to Stabilization Policyby Robert S+ Pindyck+ Economica41, 353+ 14+ Maximum likelihood estimation of systems of simultaneous regression equations with errors generated by a vector autoregressive process: A correction+ International Economic Review15, 260+ 15+ Stochastic specification in an aggregate demand model of the United Kingdom+ Econometrica 42, 559–578+ 16+ With R+W+ Harrison+ Monte Carlo methodology and the small sample behaviour of ordinary and two-stage least squares+ Journal of Econometrics2, 151–174+ 1975 17+ Book review ofForecasting the U.K. 
Economyby J+C+K+ Ash and D+J+ Smyth+ Economica42, 223–224+ 18+ The consequences of mis-specification of dynamic structure, autocorrelation, and simultaneity in a simple model with an application to the demand for imports+ In G+A+ Renton~ed+!, Mod- elling the Economy, pp+ 286–320~with discussion!+ Heinemann Educational Books+ 794 ET INTERVIEW 1976 19+ Discussion+ Journal of the Royal Statistical Society, Series A139, 494–495+ 20+ Discussion+ Journal of the Royal Statistical Society, Series B38, 24–25+ 21+ The structure of simultaneous equations estimators+ Journal of Econometrics4, 51–88+ 22+ With A+R+ Tremayne+ Estimating systems of dynamic reduced form equations with vector auto- regressive errors+ International Economic Review17, 463–471+ 1977 23+ Book review ofStudies in Nonlinear Estimation, edited by Stephen M+ Goldfeld and Richard E+ Quandt+ Economica44, 317–318+ 24+ Book review ofThe Models of Project LINK, edited by J+L+ Waelbroeck+ Journal of the Royal Statistical Society, Series A140, 561–562+ 25+ Comments on Granger-Newbold’s “Time series approach to econometric model building” and Sargent-Sims’ “Business cycle modeling without pretending to have too mucha priori eco- nomic theory+” In C+A+ Sims ~ed+!, New Methods in Business Cycle Research: Proceedings from a Conference, pp+ 183–202+ Federal Reserve Bank of Minneapolis+ 26+ With G+J+ Anderson+ Testing dynamic specification in small simultaneous systems: An applica- tion to a model of building society behavior in the United Kingdom+ In M+D+ Intriligator ~ed+!, Frontiers of Quantitative Economics, vol+ 3A, pp+ 361–383+ North-Holland+ 27+ With F+ Srba+ The properties of autoregressive instrumental variables estimators in dynamic systems+ Econometrica45, 969–990+ 1978 28+ With J+E+H+ Davidson, F+ Srba, & S+ Yeo+ Econometric modelling of the aggregate time-series relationship between consumers’ expenditure and income in the United Kingdom+ Economic Journal 88, 661–692+ 29+ With G+E+ Mizon+ Serial correlation as a convenient simplification, not a nuisance: A comment on a study of the demand for money by the Bank of England+ Economic Journal88, 549–563+ 1979 30+ The behaviour of inconsistent instrumental variables estimators in dynamic systems with auto- correlated errors+ Journal of Econometrics9, 295–314+ 31+ Predictive failure and econometric modelling in macroeconomics: The transactions demand for money+ In P+ Ormerod~ed+!, Economic Modelling: Current Issues and Problems in Macro- economic Modelling in the UK and the US, pp+ 217–242+ Heinemann Education Books+ 1980 32+ Econometrics—Alchemy or science?Economica47, 387–406+ 33+ With F+ Srba+ AUTOREG: A computer program library for dynamic econometric models with autoregressive errors+ Journal of Econometrics12, 85–102+ 34+ With G+E+ Mizon+ An empirical application and Monte Carlo analysis of tests of dynamic spec- ification+ Review of Economic Studies47, 21–45+ 1981 35+ With J+E+H+ Davidson+ Interpreting econometric evidence: The behaviour of consumers’ expen- diture in the UK+ European Economic Review16, 177–192~with discussion!+ 36+ Comment on HM Treasury’s memorandum, “Background to the Government’s economic pol- icy+” In House of Commons~ed+!, Third Report from the Treasury and Civil Service Commit- tee, Session 1980–81, Monetary Policy, vol+ 3, pp+ 94–96~Appendix 4!+ Her Majesty’s Stationery Office+ ET INTERVIEW 795 37+ Econometric evidence in the appraisal of monetary policy+ In House of Commons~ed+!, Third Report from the Treasury and Civil Service Committee, Session 1980–81, 
Monetary Policy, vol+ 3, pp+ 1–21~Appendix 1!+ Her Majesty’s Stationery Office+ 38+ With J+-F+ Richard+ Model formulation to simplify selection when specification is uncertain+ Journal of Econometrics16, 159+ 39+ With T+ von Ungern-Sternberg+ Liquidity and inflation effects on consumers’ expenditure+ In A+S+ Deaton~ed+!, Essays in the Theory and Measurement of Consumer Behaviour: In Honour of Sir Richard Stone, pp+ 237–260+ Cambridge University Press+ 1982 40+ Comment: Whither disequilibrium econometrics?Econometric Reviews1, 65–70+ 41+ A reply to Professors Maasoumi and Phillips+ Journal of Econometrics19, 203–213+ 42+ The role of econometrics in macro-economic analysis+ UK Economic Prospect1982, 26–38+ 43+ With J+-F+ Richard+ On the formulation of empirical models in dynamic econometrics+ Journal of Econometrics20, 3–33+ 1983 44+ With R+F+ Engle & J+-F+ Richard+ Exogeneity+ Econometrica51, 277–304+ 45+ Comment+ Econometric Reviews2, 111–114+ 46+ Econometric modelling: The “consumption function” in retrospect+ Scottish Journal of Politi- cal Economy30, 193–220+ 47+ On Keynesian model building and the rational expectations critique: A question of methodol- ogy+ Cambridge Journal of Economics7, 69–75+ 48+ With R+C+ Marshall+ On high and lowR2 contributions+ Oxford Bulletin of Economics and Statistics45, 313–316+ 49+ With J+-F+ Richard+ The econometric analysis of economic time series+ International Statistical Review51, 111–148~with discussion!+ 1984 50+ With G+J+ Anderson+ An econometric model of United Kingdom building societies+ Oxford Bul- letin of Economics and Statistics46, 185–210+ 51+ Book review ofAdvances in Econometrics: Invited Papers for the 4th World Congress of the Econometric Society, edited by Werner Hildenbrand+ Economic Journal94, 403–405+ 52+ Econometric modelling of house prices in the United Kingdom+ In D+F+ Hendry & K+F+ Wallis ~eds+!, Econometrics and Quantitative Economics, pp+ 211–252+ Basil Blackwell+ 53+ Monte Carlo experimentation in econometrics+ In Z+ Griliches & M+D+ Intriligator ~eds+!, Hand- book of Econometrics, vol+ 2, pp+ 937–976+ North-Holland+ 54+ Present position and potential developments: Some personal views@on# time-series economet- rics+ Journal of the Royal Statistical Society, Series A147, 327–338~with discussion!+ 55+ With A+ Pagan & J+D+ Sargan+ Dynamic specification+ In Z+ Griliches & M+D+ Intriligator ~eds+!, Handbook of Econometrics, vol+ 2, pp+ 1023–1100+ North-Holland+ 56+ With K+F+ Wallis ~eds+!+ Econometrics and Quantitative Economics+ Basil Blackwell+ 57+ With K+F+ Wallis+ Editors’ introduction+ In D+F+ Hendry & K+F+ Wallis ~eds+!, Econometrics and Quantitative Economics, pp+ 1–12+ Basil Blackwell+ 1985 58+ With R+F+ Engle & D+ Trumble+ Small-sample properties of ARCH estimators and tests+ Cana- dian Journal of Economics18, 66–93+ 59+ With N+R+ Ericsson+ Conditional econometric modeling: An application to new house prices in the United Kingdom+ In A+C+ Atkinson & S+E+ Fienberg~eds+!, A Celebration of Statistics: The ISI Centenary Volume, pp+ 251–285+ Springer-Verlag+ 60+ Monetary economic myth and econometric reality+ Oxford Review of Economic Policy1, 72–84+ 796 ET INTERVIEW 1986 61+ With A+ Banerjee, J+J+ Dolado, & G+W+ Smith+ Exploring equilibrium relationships in econo- metrics through static models: Some Monte Carlo evidence+ Oxford Bulletin of Economics and Statistics48, 253–277+ 62+ With Y+Y+ Chong+ Econometric evaluation of linear macro-economic models+ Review of Eco- nomic Studies53, 671–690+ 63+ Econometric Modelling with 
Cointegrated Variables+ Special Issue, Oxford Bulletin of Econom- ics and Statistics, 48 ~3!+ 64+ Econometric modelling with cointegrated variables: An overview+ Oxford Bulletin of Econom- ics and Statistics48, 201–212+ 65+ Empirical modeling in dynamic econometrics+ Applied Mathematics and Computation20, 201–236+ 66+ An excursion into conditional varianceland+ Econometric Reviews5, 63–69+ 67+ The role of prediction in evaluating econometric models+ Proceedings of the Royal Society, London, Series A407, 25–33+ 68+ Using PC-GIVE in econometrics teaching+ Oxford Bulletin of Economics and Statistics48, 87–98+ 1987 69+ Econometric methodology: A personal perspective+ In T+F+ Bewley ~ed+!, Advances in Econo- metrics: Fifth World Congress, vol+ 2, pp+ 29–48+ Cambridge University Press+ 70+ Econometrics in action+ Empirica ~Austrian Economic Papers! 14, 135–156+ 71+ PC-GIVE: An Interactive Menu-Driven Econometric Modelling Program for IBM-Compatible PC’s, Version 4+2+ Institute of Economics and Statistics and Nuffield College, University of Oxford+ 72+ PC-GIVE: An Interactive Menu-Driven Econometric Modelling Program for IBM-Compatible PC’s, Version 5+0+ Institute of Economics and Statistics and Nuffield College, University of Oxford+ 73+ With A+J+ Neale+ Monte Carlo experimentation using PC-NAIVE+ In T+B+ Fomby & G+F+ Rhodes, Jr+ ~eds+!, Advances in Econometrics: A Research Annual, vol+ 6, pp+ 91–125+ JAI Press+ 1988 74+ With J+ Campos & N+R+ Ericsson+ Comment on Telser+ Journal of the American Statistical Association83, 581+ 75+ Encompassing+ National Institute Economic Review3088, 88–92+ 76+ The encompassing implications of feedback versus feedforward mechanisms in econometrics+ Oxford Economic Papers40, 132–149+ 77+ Some foreign observations on macro-economic model evaluation activities at INSEE–DP+ In INSEE ~ed+!, Groupes d’Études Macroeconometriques Concertées: Document Complémen- taire de Synthèse, pp+ 71–106+ INSEE+ 78+ With A+J+ Neale+ Interpreting long-run equilibrium solutions in conventional macro models: A comment+ Economic Journal98, 808–817+ 79+ With A+J+ Neale & F+ Srba+ Econometric analysis of small linear systems using PC-FIML+ Jour- nal of Econometrics38, 203–226+ 1989 80+ Comment+ Econometric Reviews8, 111–121+ 81+ PC-GIVE: An Interactive Econometric Modelling System, Version 6+006+01+ Institute of Eco- nomics and Statistics and Nuffield College, University of Oxford+ 82+ With M+S+ Morgan+ A re-analysis of confluence analysis+ Oxford Economic Papers41, 35–52+ ET INTERVIEW 797 83+ With J+-F+ Richard+ Recent developments in the theory of encompassing+ In B+ Cornet & H+ Tulkens~eds+!, Contributions to Operations Research and Economics: The Twentieth Anni- versary of CORE, pp+ 393–440+ MIT Press+ 84+ With A+ Spanos & N+R+ Ericsson+ The contributions to econometrics in Trygve Haavelmo’s The Probability Approach in Econometrics+ Sosialøkonomen43, 12–17+ 1990 85+ With J+ Campos & N+R+ Ericsson+ An analogue model of phase-averaging procedures+ Jour- nal of Econometrics43, 275–292+ 86+ With E+E+ Leamer & D+J+ Poirier+ The ET dialogue: A conversation on econometric method- ology+ Econometric Theory6, 171–261+ 87+ With G+E+ Mizon+ Procrustean econometrics: Or stretching and squeezing data+ In C+W+J+ Granger~ed+!, Modelling Economic Series: Readings in Econometric Methodology, pp+ 121– 136+ Oxford University Press+ 88+ With J+N+J+ Muellbauer & A+ Murphy+ The econometrics of DHSY+ In J+D+ Hey & D+ Winch ~eds+!, A Century of Economics: 100 Years of the Royal Economic Society and the 
Economic Journal, pp+ 298–334+ Basil Blackwell+ 89+ With A+J+ Neale & N+R+ Ericsson+ PC-NAIVE: An Interactive Program for Monte Carlo Exper- imentation in Econometrics, Version 6+01+ Institute of Economics and Statistics and Nuffield College, University of Oxford+ 1991 90+ Comments: “The response of consumption to income: A cross-country investigation” by John Y+ Campbell and N+ Gregory Mankiw+ European Economic Review35, 764–767+ 91+ Economic forecasting+ In House of Commons~ed+!, Memoranda on Official Economic Fore- casting, Treasury and Civil Service Committee, Session 1990–91+ Her Majesty’s Stationery Office+ 92+ Using PC-NAIVE in teaching econometrics+ Oxford Bulletin of Economics and Statistics53, 199–223+ 93+ With N+R+ Ericsson+ An econometric analysis of U+K+ money demand inMonetary Trends in the United States and the United Kingdomby Milton Friedman and Anna J+ Schwartz+ Amer- ican Economic Review81, 8–38+ 94+ With N+R+ Ericsson+ Modeling the demand for narrow money in the United Kingdom and the United States+ European Economic Review35, 833–881~with discussion!+ 95+ With A+J+ Neale+ A Monte Carlo study of the effects of structural breaks on tests for unit roots+ In P+ Hackl & A+H+ Westlund~eds+!, Economic Structural Change: Analysis and Fore- casting, pp+ 95–119+ Springer-Verlag+ 1992 96+ With Y+ Baba & R+M+ Starr+ The demand for M1 in the U+S+A+, 1960–1988+ Review of Eco- nomic Studies59, 25–61+ 97+ With A+ Banerjee~eds+!+ Testing Integration and Cointegration+ Special Issue, Oxford Bulle- tin of Economics and Statistics, 54 ~3!+ 98+ With A+ Banerjee+ Testing integration and cointegration: An overview+ Oxford Bulletin of Eco- nomics and Statistics54, 225–255+ 99+ With J+A+ Doornik+ PcGive Version 7: An Interactive Econometric Modelling System+ Insti- tute of Economics and Statistics, University of Oxford+ 100+ With C+ Favero+ Testing the Lucas critique: A review+ Econometric Reviews11, 265–306~with discussion!+ 101+ Assessing empirical evidence in macroeconometrics with an application to consumers’ expen- diture in France+ In A+ Vercelli & N+ Dimitri ~eds+!, Macroeconomics: A Survey of Research Strategies, pp+ 363–392+ Oxford University Press+ 798 ET INTERVIEW 102+ An econometric analysis of TV advertising expenditure in the United Kingdom+ Journal of Policy Modeling14, 281–311+ 103+ With J+-F+ Richard+ Likelihood evaluation for dynamic latent variables models+ In H+M+Amman, D+A+ Belsley, & L +F+ Pau~eds+!, Computational Economics and Econometrics, pp+ 3–17+ Klu- wer Academic Publishers+ 1993 104+ With A+ Banerjee, J+J+ Dolado, & J+W+ Galbraith+ Co-integration, Error Correction, and the Econometric Analysis of Non-stationary Data+ Oxford University Press+ 105+ With M+P+ Clements+ On the limitations of comparing mean square forecast errors+ Journal of Forecasting12, 617–637~with discussion!+ 106+ With R+F+ Engle+ Testing super exogeneity and invariance in regression models+ Journal of Econometrics56, 119–139+ 107+ Econometrics: Alchemy or Science? Essays in Econometric Methodology+ Blackwell Publishers+ 108+ Introduction+ In D+F+ Hendry~ed+!, Econometrics: Alchemy or Science? Essays in Economet- ric Methodology, pp+ 1–7+ Blackwell Publishers+ 109+ Postscript: The econometrics of PC-GIVE+ In D+F+ Hendry ~ed+!, Econometrics: Alchemy or Science? 
Essays in Econometric Methodology, pp+ 444–466+ Blackwell Publishers+ 110+ With G+E+ Mizon+ Evaluating dynamic econometric models by encompassing the VAR+ In P+C+B+ Phillips ~ed+!, Models, Methods, and Applications of Econometrics: Essays in Honor of A.R. Bergstrom, pp+ 272–300+ Basil Blackwell+ 111+ With R+M+ Starr+ The demand for M1 in the USA: A reply to James M+ Boughton+ Economic Journal 103, 1158–1169+ 1994 112+ With M+P+ Clements+ Towards a theory of economic forecasting+ In C+P+ Hargreaves~ed+!, Nonstationary Time Series Analysis and Cointegration, pp+ 9–52+ Oxford University Press+ 113+ With S+ Cook+ The theory of reduction in econometrics+ In B+ Hamminga & N+B+ De Marchi ~eds+!, Idealization VI: Idealization in Economics, Poznan´ Studies in the Philosophy of the Sciences and the Humanities, vol+ 38, pp+ 71–100+ Rodopi+ 114+ With J+A+ Doornik+ PcFiml 8.0: Interactive Econometric Modelling of Dynamic Systems+ Inter- national Thomson Publishing+ 115+ With J+A+ Doornik+ PcGive 8.0: An Interactive Econometric Modelling System+ International Thomson Publishing+ 116+ With R+F+ Engle+ Appendix: The reverse regression~Appendix to “Testing super exogeneity and invariance in regression models”!+ In N+R+ Ericsson & J+S+ Irons ~eds+!, Testing Exogene- ity, pp+ 110–116+ Oxford University Press+ 117+ With N+R+ Ericsson & H+-A+ Tran+ Cointegration, seasonality, encompassing, and the demand for money in the United Kingdom+ In C+P+ Hargreaves~ed+!, Nonstationary Time Series Analy- sis and Cointegration, pp+ 179–224+ Oxford University Press+ 118+ With B+ Govaerts & J+-F+ Richard+ Encompassing in stationary linear dynamic models+ Jour- nal of Econometrics63, 245–270+ 119+ HUS revisited+ Oxford Review of Economic Policy10, 86–106+ 120+ With M+P+ Clements+ Can econometrics improve economic forecasting?Swiss Journal of Eco- nomics and Statistics130, 267–298+ 121+ With M+P+ Clements+ On a theory of intercept corrections in macroeconometric forecasting+ In S+ Holly ~ed+!, Money, Inflation and Employment: Essays in Honour of James Ball, pp+ 160– 182+ Edward Elgar+ 122+ With J+A+ Doornik+ Modelling linear dynamic econometric systems+ Scottish Journal of Polit- ical Economy41, 1–33+ 123+ With M+S+Morgan+ The ET interview: Professor H+O+A+Wold: 1908–1992+ Econometric Theory 10, 419–433+ ET INTERVIEW 799 1995 124+ With M+P+ Clements+ Forecasting in cointegrated systems+ Journal of Applied Econometrics 10, 127–146+ 125+ With M+P+ Clements+ Macro-economic forecasting and modelling+ Economic Journal105, 1001–1013+ 126+ With M+P+ Clements+ A reply to Armstrong and Fildes+ Journal of Forecasting14, 73–75+ 127+ Dynamic Econometrics+ Oxford University Press+ 128+ Econometrics and business cycle empirics+ Economic Journal105, 1622–1636+ 129+ Le rôle de l’économétrie dans l’économie scientifique+ In A+ d’Autume & J+ Cartelier~eds+!, L’Économie Devient-Elle Une Science Dure?pp+ 172–196+ Economica+ 130+ On the interactions of unit roots and exogeneity+ Econometric Reviews14, 383–419+ 131+ With J+A+ Doornik+ A window on econometrics+ Cyprus Journal of Economics8, 77–104+ 132+ With M+S+ Morgan ~eds+!+ The Foundations of Econometric Analysis+ Cambridge University Press+ 133+ With M+S+ Morgan+ Introduction+ In D+F+ Hendry & M+S+ Morgan~eds+!, The Foundations of Econometric Analysis, pp+ 1–82+ Cambridge University Press+ 1996 134+ With A+ Banerjee~eds+!+ The Econometrics of Economic Policy+ Special Issue, Oxford Bulle- tin of Economics and Statistics, 58 ~4!+ 135+ With A+ Banerjee & G+E+ Mizon+ The econometric analysis of 
economic policy+ Oxford Bul- letin of Economics and Statistics58, 573–600+ 136+ With J+ Campos & N+R+ Ericsson+ Cointegration tests in the presence of structural breaks+ Journal of Econometrics70, 187–220+ 137+ With M+P+ Clements+ Forecasting in macro-economics+ In D+R+ Cox, D+V+ Hinkley, & O+E+ Barndorff-Nielsen~eds+!, Time Series Models: In Econometrics, Finance and Other Fields, pp+ 101–141+ Chapman and Hall+ 138+ With M+P+ Clements+ Intercept corrections and structural change+ Journal of Applied Econo- metrics11, 475–494+ 139+ With M+P+ Clements+ Multi-step estimation for forecasting+ Oxford Bulletin of Economics and Statistics58, 657–684+ 140+ With J+A+ Doornik+ GiveWin: An Interface to Empirical Modelling, Version 1+0+ International Thomson Business Press+ 141+ With R+A+ Emerson+ An evaluation of forecasting using leading indicators+ Journal of Fore- casting15, 271–291+ 142+ With J+-P+ Florens & J+-F+ Richard+ Encompassing and specificity+ Econometric Theory12, 620–656+ 143+ On the constancy of time-series econometric equations+ Economic and Social Review27, 401–422+ 144+ Typologies of linear dynamic systems and models+ Journal of Statistical Planning and Infer- ence49, 177–201+ 145+ With J+A+ Doornik+ Empirical Econometric Modelling Using PcGive 9.0 for Windows+ Inter- national Thomson Business Press+ 146+ With M+S+ Morgan+ Obituary: Jan Tinbergen, 1903–94+ Journal of the Royal Statistical Soci- ety, Series A159, 614–616+ 1997 147+ With A+ Banerjee~eds+!+ The Econometrics of Economic Policy+ Blackwell Publishers+ 148+ With L+ Barrow, J+ Campos, N+R+ Ericsson, H+-A+ Tran, & W+ Veloce+ Cointegration+ In D+ Glasner~ed+!, Business Cycles and Depressions: An Encyclopedia, pp+ 101–106+ Garland Publishing+ 800 ET INTERVIEW 149+ With J+ Campos & N+R+ Ericsson+ Phase averaging+ In D+ Glasner~ed+!, Business Cycles and Depressions: An Encyclopedia, pp+ 525–527+ Garland Publishing+ 150+ With M+P+ Clements+ An empirical study of seasonal unit roots in forecasting+ International Journal of Forecasting13, 341–355+ 151+ With M+J+ Desai & G+E+ Mizon+ John Denis Sargan+ Economic Journal107, 1121–1125+ 152+ With J+A+ Doornik+ Modelling Dynamic Systems Using PcFiml 9.0 for Windows+ International Thomson Business Press+ 153+ With N+R+ Ericsson+ Lucas critique+ In D+ Glasner~ed+!, Business Cycles and Depressions: An Encyclopedia, pp+ 410–413+ Garland Publishing+ 154+ Book review ofDoing Economic Research: Essays on the Applied Methodology of Econom- ics by Thomas Mayer+ Economic Journal107, 845–847+ 155+ Cointegration analysis: An international enterprise+ In H+ Jeppesen & E+ Starup-Jensen~eds+!, University of Copenhagen: Centre of Excellence, pp+ 190–208+ University of Copenhagen+ 156+ The econometrics of macroeconomic forecasting+ Economic Journal107, 1330–1357+ 157+ On congruent econometric relations: A comment+ Carnegie-Rochester Conference Series on Public Policy47, 163–190+ 158+ The role of econometrics in scientific economics+ In A+ d’Autume & J+ Cartelier ~eds+!, Is Economics Becoming a Hard Science?pp+ 165–186+ Edward Elgar+ 159+ With J+A+ Doornik+ The implications for econometric modelling of forecast failure+ Scottish Journal of Political Economy44, 437–461+ 160+ With N+ Shephard~eds+!+ Cointegration and Dynamics in Economics+ Special Issue, Journal of Econometrics, 80 ~2!+ 161+ With N+ Shephard+ Editors’ introduction+ Journal of Econometrics80, 195–197+ 1998 162+ With M+P+ Clements+ Forecasting economic processes+ International Journal of Forecasting 14, 111–131~with discussion!+ 163+ With 
M+P+ Clements+ Forecasting Economic Time Series+ Cambridge University Press+ 164+ With J+A+ Doornik & B+ Nielsen+ Inference in cointegrating models: UK M1 revisited+ Jour- nal of Economic Surveys12, 533–572+ 165+ With N+R+ Ericsson & G+E+ Mizon+ Exogeneity, cointegration, and economic policy analysis+ Journal of Business and Economic Statistics16, 370–387+ 166+ With N+R+ Ericsson & K+M+ Prestwich+ The demand for broad money in the United Kingdom, 1878–1993+ Scandinavian Journal of Economics100, 289–324~with discussion!+ 167+ With N+R+ Ericsson & K+M+ Prestwich+ Friedman and Schwartz~1982! revisited: Assessing annual and phase-average models of money demand in the United Kingdom+ Empirical Eco- nomics23, 401–415+ 168+ With G+E+ Mizon+ Exogeneity, causality, and co-breaking in economic policy analysis of a small econometric model of money in the UK+ Empirical Economics23, 267–294+ 169+ With N+ Shephard+ The Econometrics Journal of the Royal Economic Society: Foreword+ Econometrics Journal1, i–ii + 1999 170+ With M+P+ Clements+ Forecasting Non-stationary Economic Time Series+ MIT Press+ 171+ With M+P+ Clements+ On winning forecasting competitions in economics+ Spanish Economic Review1, 123–160+ 172+ With N+R+ Ericsson+ Encompassing and rational expectations: How sequential corroboration can imply refutation+ Empirical Economics24, 1–21+ 173+ An econometric analysis of US food expenditure, 1931–1989+ In J+R+ Magnus & M+S+ Mor- gan ~eds+!, Methodology and Tacit Knowledge: Two Experiments in Econometrics, pp+ 341– 361+ Wiley+ ET INTERVIEW 801 174+ With J+A+ Doornik+ The impact of computational tools on time-series econometrics+ In T+ Cop- pock~ed+!, Information Technology and Scholarship: Applications in the Humanities and Social Sciences, pp+ 257–269+ Oxford University Press+ 175+ With H+-M+ Krolzig+ Improving on ‘Data mining reconsidered’ by K+D+ Hoover and S+J+ Perez+ Econometrics Journal2, 202–219+ 176+ With G+E+ Mizon+ The pervasiveness of Granger causality in econometrics+ In R+F+ Engle & H+ White ~eds+!, Cointegration, Causality, and Forecasting: A Festschrift in Honour of Clive W.J. Granger, pp+ 102–134+ Oxford University Press+ 2000 177+ With W+A+ Barnett, S+ Hylleberg, T+ Teräsvirta, D+ Tjøstheim, & A + Würtz+ Introduction and overview+ In W+A+ Barnett, D+F+ Hendry, S+ Hylleberg, T+ Teräsvirta, D+ Tjøstheim, & A +Würtz ~eds+!, Nonlinear Econometric Modeling in Time Series: Proceedings of the Eleventh Inter- national Symposium in Economic Theory, pp+ 1–8+ Cambridge University Press+ 178+ With W+A+ Barnett, S+ Hylleberg, T+ Teräsvirta, D+ Tjøstheim, & A + Würtz ~eds+!+ Nonlinear Econometric Modeling in Time Series: Proceedings of the Eleventh International Symposium in Economic Theory+ Cambridge University Press+ 179+ With A+ Beyer & J+A+ Doornik+ Reconstructing aggregate Euro-zone data+ Journal of Com- mon Market Studies38, 613–624+ 180+ Does money determine UK inflation over the long run? In R+E+ Backhouse & A+ Salanti~eds+!, Macroeconomics and the Real World, vol+ 1, pp+ 85–114+ Oxford University Press+ 181+ Econometrics: Alchemy or Science? Essays in Econometric Methodology, new ed+ Oxford University Press+ 182+ Epilogue: The success of general-to-specific model selection+ In D+F+ Hendry ~ed+!, Econo- metrics: Alchemy or Science? 
Essays in Econometric Methodology, new ed+, pp+ 467–490+ Oxford University Press+ 183+ On detectable and non-detectable structural change+ Structural Change and Economic Dynam- ics 11, 45–65+ 184+ With M+P+ Clements+ Economic forecasting in the face of structural breaks+ In S+ Holly & M + Weale~eds+!, Econometric Modelling: Techniques and Applications, pp+ 3–37+ Cambridge Uni- versity Press+ 185+ With K+ Juselius+ Explaining cointegration analysis: Part I+ Energy Journal21, 1–42+ 186+ With G+E+ Mizon+ The influence of A+W+ Phillips on econometrics+ In R+ Leeson~ed+!, A.W.H. Phillips: Collected Works in Contemporary Perspective, pp+ 353–364+ Cambridge University Press+ 187+ With G+E+ Mizon+ On selecting policy analysis models by forecast accuracy+ In A+B+ Atkin- son, H+ Glennerster, & N +H+ Stern~eds+!, Putting Economics to Work: Volume in Honour of Michio Morishima, pp+ 71–119+ STICERD, London School of Economics+ 188+ With G+E+ Mizon+ Reformulating empirical macroeconometric modelling+ Oxford Review of Economic Policy16, 138–159+ 189+ With R+ Williams+ Distinguished fellow of the Economic Society of Australia, 1999: Adrian R+ Pagan+ Economic Record76, 113–115+ 2001 190+ With A+ Beyer & J+A+ Doornik+ Constructing historical Euro-zone data+ Economic Journal 111, F102–F121+ 191+ With M+P+ Clements+ Explaining the results of the M3 forecasting competition+ International Journal of Forecasting17, 550–554+ 192+ With M+P+ Clements+ Forecasting with difference-stationary and trend-stationary models+ Econo- metrics Journal4, S1–S19+ 193+ With M+P+ Clements+ An historical perspective on forecast errors+ National Institute Eco- nomic Review2001, 100–112+ 802 ET INTERVIEW 194+ With J+A+ Doornik+ Econometric Modelling Using PcGive 10, vol+ 3+ Timberlake Consultants Press~with Manuel Arellano, Stephen Bond, H+ Peter Boswijk, & Marius Ooms!+ 195+ With J+A+ Doornik+ GiveWin Version 2: An Interface to Empirical Modelling+ Timberlake Con- sultants Press+ 196+ With J+A + Doornik+ Interactive Monte Carlo Experimentation in Econometrics Using PcNaive 2+ Timberlake Consultants Press+ 197+ With J+A+ Doornik+ Modelling Dynamic Systems Using PcGive 10, vol+ 2+ Timberlake Con- sultants Press+ 198+ Achievements and challenges in econometric methodology+ Journal of Econometrics100, 7–10+ 199+ How economists forecast+ In D+F+ Hendry & N+R+ Ericsson~eds+!, Understanding Economic Forecasts, pp+ 15–41+ MIT Press+ 200+ Modelling UK inflation, 1875–1991+ Journal of Applied Econometrics16, 255–275+ 201+ With J+A+ Doornik+ Empirical Econometric Modelling Using PcGive 10, vol+ 1+ Timberlake Consultants Press+ 202+ With N+R+ Ericsson+ Editors’ introduction+ In D+F+ Hendry & N+R+ Ericsson~eds+!, Under- standing Economic Forecasts, pp+ 1–14+ MIT Press+ 203+ With N+R+ Ericsson+ Epilogue+ In D+F+ Hendry & N+R+ Ericsson~eds+!, Understanding Eco- nomic Forecasts, pp+ 185–191+ MIT Press+ 204+ With N+R+ Ericsson~eds+!+ Understanding Economic Forecasts+ MIT Press+ 205+ With K+ Juselius+ Explaining cointegration analysis: Part II+ Energy Journal22, 75–120+ 206+ With H+-M+ Krolzig+ Automatic Econometric Model Selection Using PcGets 1.0+ Timberlake Consultants Press+ 207+ With M+H+ Pesaran+ Introduction: A special issue in memory of John Denis Sargan: Studies in empirical macroeconometrics+ Journal of Applied Econometrics16, 197–202+ 208+ With M+H+ Pesaran~eds+!+ Special Issue in Memory of John Denis Sargan 1924–1996: Stud- ies in Empirical Macroeconometrics+ Special Issue, Journal of Applied Econometrics, 16 ~3!+ 209+ With H+-M+ 
Krolzig+ Computer automation of general-to-specific model selection procedures+ Journal of Economic Dynamics and Control25, 831–866+ 2002 210+ With M+P+ Clements~eds+!+ A Companion to Economic Forecasting+ Blackwell Publishers+ 211+ With M+P+ Clements+ Explaining forecast failure in macroeconomics+ In M+P+ Clements & D+F+ Hendry~eds+!, A Companion to Economic Forecasting, pp+ 539–571+ Blackwell Publishers+ 212+ With M+P+ Clements+ Modelling methodology and forecast failure+ Econometrics Journal5, 319–344+ 213+ With M+P+ Clements+ An overview of economic forecasting+ In M+P+ Clements & D+F+ Hendry ~eds+!, A Companion to Economic Forecasting, pp+ 1–18+ Blackwell Publishers+ 214+ With J+A+ Doornik & N+ Shephard+ Computationally intensive econometrics using a distrib- uted matrix-programming language+ Philosophical Transactions of the Royal Society, Lon- don, Series A360, 1245–1266+ 215+ Applied econometrics without sinning+ Journal of Economic Surveys16, 591–604+ 216+ Forecast failure, expectations formation and theLucas Critique+ Annales D’Économie et de Statistique2002, 21–40+ 2003 217+ With J+ Campos & H+-M+ Krolzig+ Consistent model selection by an automaticGetsapproach+ Oxford Bulletin of Economics and Statistics65, 803–819+ 218+ With J+A+ Doornik+ PcGive+ In C+G+ Renfro ~ed+!, A Compendium of Existing Econometric Software Packages, Journal of Economic and Social Measurement, 26, forthcoming+ 219+ With J+A+ Doornik+ PcNaive+ In C+G+ Renfro ~ed+!, A Compendium of Existing Econometric Software Packages, Journal of Economic and Social Measurement, 26, forthcoming+ ET INTERVIEW 803 220+ With N+ Haldrup & H+K+ van Dijk+ Guest editors’ introduction: Model selection and evalua- tion in econometrics+ Oxford Bulletin of Economics and Statistics65, 681–688+ 221+ With N+ Haldrup & H+K+ van Dijk ~eds+!+ Model Selection and Evaluation+ Special Issue, Oxford Bulletin of Economics and Statistics65, supplement+ 222+ Book review ofCausality in Macroeconomicsby Kevin D+ Hoover+ Economica70, 375–377+ 223+ Forecasting pitfalls+ Bulletin of E.U. and U.S. Inflation and Macroeconomic Analysis2003, 65–82+ 224+ J+ Denis Sargan and the origins of LSE econometric methodology+ Econometric Theory19, 457–480+ 225+ With M+P+ Clements+ Economic forecasting: Some lessons from recent research+ Economic Modelling 20, 301–329+ 226+ With H+-M+ Krolzig+ New developments in automatic general-to-specific modeling+ In B+P+ Stigum ~ed+!, Econometrics and the Philosophy of Economics: Theory-Data Confrontations in Economics, pp+ 379–419+ Princeton University Press+ 227+ With H+-M+ Krolzig+ PcGets+ In C+G+ Renfro ~ed+!, A Compendium of Existing Econometric Software Packages+ Journal of Economic and Social Measurement26, forthcoming+ 2004 228+ With J+ Campos & N+R+ Ericsson~eds+!+ Readings on General-to-Specific Modeling+ Edward Elgar+ Forthcoming+ 229+ The Nobel memorial prize for Clive W+J+ Granger+ Scandinavian Journal of Economics106, forthcoming+ 230+ With M+P+ Clements+ Pooling of forecasts+ Econometrics Journal7, forthcoming+ 231+ With H+-M+ Krolzig+ Sub-sample model selection procedures in general-to-specific model- ling+ In R+ Becker & S+ Hurn ~eds+!, Contemporary Issues in Economics and Econometrics: Theory and Application, pp+ 53–75+ Edward Elgar+ 804 ET INTERVIEW EricssonFiallosSeymour-2015-JSMProceedings-BES-Paper315656_233944 Detecting Time-dependent Bias in the Fed’s Greenbook Forecasts of Foreign GDP Growth∗ Neil R. Ericsson† Emilio J. Fiallos‡ J E. 
Seymour§

Abstract

Building on Sinclair, Joutz, and Stekler (2010) and Ericsson, Hood, Joutz, Sinclair, and Stekler (2013), this paper examines publicly available Fed Greenbook forecasts of several foreign countries' GDP growth, focusing on potential biases in the forecasts. While standard tests typically fail to detect biases, recently developed indicator saturation techniques detect economically sizable and highly significant time-varying biases. Estimated biases differ not only over time, but by country and across the forecast horizon.

Key Words: Autometrics, bias, Federal Reserve, forecasts, foreign countries, GDP, Greenbook, impulse indicator saturation, Tealbook, United States

1. Introduction

The Fed's monetary policy has attracted considerable attention domestically and abroad; see Bernanke (2012) and Yellen (2012) inter alia for recent discussions. Monetary policy decisions at the Fed are based in part on the "Greenbook" forecasts, which are economic forecasts produced by the Fed's staff. The Greenbook forecasts of U.S. economic variables have been extensively analyzed, including by Romer and Romer (2008), Sinclair, Joutz, and Stekler (2010), Nunes (2013), and Ericsson, Hood, Joutz, Sinclair, and Stekler (2015). Surprisingly, Greenbook forecasts of foreign economic variables have not been examined, even though foreign economic activity is often a topic of discussion in the Fed's deliberations on monetary policy; see Yellen (2015) inter alia. This paper thus examines the properties of Greenbook forecasts of foreign GDP growth, a key measure of foreign economic activity.

A central focus in forecast evaluation is forecast bias, especially because forecast bias is systematic, and because ignored forecast biases may have substantive adverse consequences for policy. Building on Sinclair, Joutz, and Stekler (2010) and Ericsson, Hood, Joutz, Sinclair, and Stekler (2013), the current paper analyzes Greenbook forecasts of output growth in several foreign countries over 1998-2008. Standard tests typically fail to detect any important forecast biases. However, a recently developed technique, impulse indicator saturation, detects economically large and highly statistically significant time-varying biases. Biases differ across the country being forecast, the horizon, and the date of the forecast.

∗ The views in this paper are solely the responsibility of the authors and should not be interpreted as reflecting the views of the Board of Governors of the Federal Reserve System or of any other person associated with the Federal Reserve System. The authors are grateful to Shaghil Ahmed, David Hendry, Jun Ma, Ricardo Nunes, Andrea Raffo, John Rogers, and Herman Stekler for helpful discussions and comments. All numerical results were obtained using PcGive Version 14.0B3, Autometrics Version 1.5e, and Ox Professional Version 7.00 in 64-bit OxMetrics Version 7.00; see Doornik and Hendry (2013) and Doornik (2009).
† Division of International Finance, Board of Governors of the Federal Reserve System, Washington, DC 20551 USA (ericsson@frb.gov), and Research Program on Forecasting, Department of Economics, The George Washington University, Washington, DC 20052 USA (ericsson@gwu.edu)
‡ Department of Statistics, Rutgers, The State University of New Jersey, New Brunswick, NJ (emilio_f1@yahoo.com)
§ Division of International Finance, Board of Governors of the Federal Reserve System, Washington, DC 20551 USA (jedwardseymour@gmail.com)

Neil R. Ericsson, Emilio J. Fiallos, and J. E. Seymour (2015), "Detecting Time-dependent Bias in the Fed's Greenbook Forecasts of Foreign GDP Growth", in JSM Proceedings, Business and Economic Statistics Section, American Statistical Association, Alexandria, Virginia, pp. 858-872.
Seymour (2015) "Detecting Time-dependent Bias in the Fed's Greenbook Forecasts of Foreign GDP Growth", in JSM Proceedings, Business and Economic Statistics Section, American Statistical Association, Alexandria, Virginia, pp. 858-872. the country being forecast, the horizon, and the date of the forecast. For example, forecasts of Chinese real GDP growth are systematically biased, with a statistically significant and economically large bias of approximately two percent per annum. For all countries examined, there is little observed predictability beyond two quarters ahead. This paper is organized as follows. Section 2 describes the data and the forecasts being analyzed. Section 3 discusses different approaches to testing for potential fore- cast bias and proposes impulse indicator saturation as a generic test of forecast bias. Section 4 describes indicator saturation techniques, including impulse indicator sat- uration and several of its extensions. Section 5 presents evidence on forecast bias, using the methods detailed in Sections 3 and 4. Section 6 concludes. 2. The Data and the Forecasts As input to the decision-making process of the Federal Open Market Committee, the staff of the Federal Reserve Board (the “Fed”) produce a document called the Greenbook, which includes forecasts of U.S. and foreign economic activity. This section describes the foreign Greenbook forecasts analyzed in this paper and the data being forecast. See Ericsson, Fiallos, and Seymour (2014) for further details. The data being forecast are real GDP growth rates for nine countries: • Brazil (BZ), • Canada (CA), • China (CH), • Germany (GE), • Japan (JA), • South Korea (KO), • Mexico (MX), • the United Kingdom (UK), and • (for comparison) the United States (US). Country abbreviations are in parentheses. The sample period is determined by the presence of the forecasts in the publicly available Greenbooks: 1998Q1—2008Q4 for Canada, Germany, Japan, the United Kingdom, and the United States; and 1999Q4—2008Q4 for Brazil, China, Mexico, and South Korea. The Greenbook fore- casts are from the final Greenbook of each quarter so as to allow as much information to be available for the forecasts being made in a given quarter. The forecast hori- zon  (in quarters) is  = −1 0 1 2 3 4, where  = −1 denotes the one-quarter backcast,  = 0 denotes the nowcast, and  = 1 2 3 4 denote the one-, two-, three-, and four-quarter-ahead forecasts. Output growth is measured in quarterly rates expressed as percent changes at an annual rate. Measured actual values are the GDP growth rates as reported in the Greenbook with a two-quarter lag. The Greenbooks are publicly available from the Federal Reserve Bank of Philadel- phia: http://www.phil.frb.org/research-and-data/real-time-center/greenbook-data/pdf-data-set.cfm. These forecasts are made publicly available approximately five years after the fact. The assumptions underlying the Greenbook forecasts, the complex process involved in generating the forecasts, and the goals and objectives of that process are of considerable interest in their own right and merit detailed examination. However, in the spirit of Stekler (1972), Chong and Hendry (1986), and Fildes and Stekler (2002) inter alia, the current paper focuses on the properties of the forecasts themselves. JSM2015 - Business and Economic Statistics Section 859 Several properties of the data, the Greenbook forecasts, and the corresponding forecast errors are apparent upon graphing. 
Several properties of the data, the Greenbook forecasts, and the corresponding forecast errors are apparent upon graphing. The Chinese forecasts systematically underpredict actual growth, albeit by different amounts, varying over time. Forecasts for other countries' growth likewise under- or over-predict actual growth, and the degree of inaccuracy depends on the country, the horizon, and the date of the forecast. Forecast errors are often persistent, suggestive of systematic biases in the forecasts. See Ericsson, Fiallos, and Seymour (2014) for further details. For some previous analyses of Greenbook and other governmental and institutional forecasts, see Corder (2005), Engstrom and Kernell (1999), Frankel (2011), Joutz and Stekler (2000), Nunes (2013), Sinclair, Joutz, and Stekler (2010), Romer and Romer (2008), Tsuchiya (2013), and Ericsson, Hood, Joutz, Sinclair, and Stekler (2015).

3. Approaches for Detecting Forecast Bias

This section considers different approaches for assessing potential forecast bias, starting with the standard test of (time-invariant) forecast bias by Mincer and Zarnowitz (1969). This section then considers forms of time-dependent forecast bias, with impulse indicator saturation providing a generic test of potentially time-varying forecast bias. This section's exposition draws on Ericsson (2015) and Ericsson, Hood, Joutz, Sinclair, and Stekler (2015).

Mincer and Zarnowitz (1969, pp. 8-11) suggest testing for forecast bias by regressing the forecast error on an intercept and testing whether the intercept is statistically significant. That is, for a variable y_t at time t and its forecast ŷ_t, estimate the equation:

    (y_t − ŷ_t) = α + ε_t,   t = 1, …, T,   (1)

where α is the intercept, ε_t is the error term at time t, and T is the number of observations. A test of α = 0 is interpretable as a test that the forecast ŷ_t is unbiased for the variable y_t. For current-period and one-step-ahead forecasts, the error ε_t may be serially uncorrelated, in which case a t- or F-statistic may be appropriate. For multi-step-ahead forecasts, ε_t generally will be serially correlated; hence inference about the intercept α may require some accounting for that autocorrelation.

Holden and Peel (1990) and Stekler (2002) discuss a generalization of equation (1):

    (y_t − ŷ_t) = β_0 + β_1′z_t + ε_t,   t = 1, …, T,   (2)

in which the right-hand side variables z_t might be any variables; and they interpret a test of β_1 = 0 as a test of efficiency. See Holden and Peel (1990) and Stekler (2002) for expositions on these tests as tests of unbiasedness and efficiency, and Sinclair, Stekler, and Carnow (2012) for a recent discussion.

Many forecast tests are interpretable as being based on equation (2). For example, in Sinclair, Joutz, and Stekler (2010) and Ericsson, Hood, Joutz, Sinclair, and Stekler (2015), the regressor z_t includes a dummy variable that indicates the business cycle's phase, either contraction or expansion. Another choice of z_t is ŷ_t, proposed by Mincer and Zarnowitz (1969, p. 11). In Ericsson (2015) and Ericsson, Hood, Joutz, Sinclair, and Stekler (2015), "Mincer-Zarnowitz A" denotes the regression-based test of α = 0 in equation (1), whereas "Mincer-Zarnowitz B" denotes the regression-based test of {β_0 = 0, β_1 = 0} in equation (2) with z_t = ŷ_t.
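The two Mincer-Zarnowitz regressions are easy to run on any forecast-error series. The following is a minimal sketch (not the authors' code) using simulated data and statsmodels; the choice of HAC lag length equal to the horizon h is an illustrative assumption for handling the serial correlation of multi-step errors.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
T, h = 40, 2                                       # sample size, forecast horizon
actual = rng.normal(3.0, 2.0, T)                   # hypothetical GDP growth, percent p.a.
forecast = actual - 1.5 + rng.normal(0, 1.0, T)    # forecasts with a built-in bias of 1.5
error = actual - forecast                          # y_t - yhat_t

# Mincer-Zarnowitz A: regress the forecast error on an intercept only,
# with HAC (Newey-West) standard errors.
mz_a = sm.OLS(error, np.ones(T)).fit(cov_type="HAC", cov_kwds={"maxlags": h})
print("alpha:", mz_a.params[0].round(2), " t-stat:", mz_a.tvalues[0].round(2))

# Mincer-Zarnowitz B: add the forecast itself as a regressor (z_t = yhat_t)
# and test beta0 = 0 and beta1 = 0 jointly.
X = sm.add_constant(forecast)
mz_b = sm.OLS(error, X).fit(cov_type="HAC", cov_kwds={"maxlags": h})
print(mz_b.f_test(np.eye(2)))                      # joint test of both coefficients
```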
Other choices for z_t include an alternative forecast ỹ_t or the differential between the two forecasts (ỹ_t − ŷ_t), generating the forecast-encompassing tests in Chong and Hendry (1986). As Ericsson (1992) discusses, a necessary condition for forecast encompassing is having the smallest mean squared forecast error (MSFE); Granger (1989) and Diebold and Mariano (1995) propose tests of whether one model's MSFE is less than another model's MSFE. Also, the "alternative forecast" could be a forecast made in a different time period, in which case (ỹ_t − ŷ_t) is the revision of the forecast. Nordhaus (1987) proposes this test based on forecast revisions across multiple horizons as a test of efficiency. Tversky and Kahneman (1974) earlier described "anchoring" as a phenomenon in which β_1 > 0 for forecast revisions; see Campbell and Sharpe (2009) for empirical evidence on anchoring.

In equation (2), the term (β_0 + β_1′z_t) is also interpretable as a specific form of time-dependent forecast bias. That time dependence could be completely general, as follows:

    (y_t − ŷ_t) = α_t + ε_t = Σ_{i=1}^{T} δ_i I_it + ε_t,   t = 1, …, T,   (3)

where the impulse indicator I_it is a dummy variable that is unity for i = t and zero otherwise, and δ_i is the corresponding coefficient on I_it. Because the {δ_i} may have any values whatsoever, the intercept α_t in (3) may vary arbitrarily over time. In this context, a test that all coefficients δ_i are equal to zero is a generic test of forecast unbiasedness. Because equation (3) includes T coefficients, equation (3) cannot be estimated unrestrictedly. However, the question being asked can be answered using impulse indicator saturation, as summarized in Section 4.

4. Indicator Saturation Techniques

Impulse indicator saturation (IIS) uses the zero-one dummies {I_it} to analyze properties of a model. Unrestricted inclusion of all T dummies in the model (thereby "saturating" the sample) is infeasible. However, blocks of dummies can be included, and statistically significant dummies can be retained from those blocks. That insight provides the basis for IIS. See Ericsson and Reisman (2012) for an intuitive non-technical exposition of IIS, and Hendry and Doornik (2014) for extensive analysis in the context of automatic model selection. This section's exposition draws on Ericsson (2015) and Ericsson, Hood, Joutz, Sinclair, and Stekler (2015).

IIS provides a general procedure for robust estimation and for model evaluation, in particular for testing parameter constancy. IIS is a generic test for an unknown number of structural breaks, occurring at unknown times, with unknown duration and magnitude, anywhere in the sample. IIS is a powerful empirical tool for both evaluating and improving existing empirical models. Hendry (1999) proposes IIS as a procedure for testing parameter constancy. Further discussion, recent developments, and applications appear in Hendry, Johansen, and Santos (2008), Doornik (2009), Johansen and Nielsen (2009, 2013, 2015), Hendry and Santos (2010), Ericsson (2011a, 2011b, 2012), Ericsson and Reisman (2012), Bergamelli and Urga (2013), Hendry and Pretis (2013), Hendry and Doornik (2014), Pretis, Mann, and Kaufmann (2015), and Castle, Doornik, Hendry, and Pretis (2015). Ericsson (2015) proposes a new application for IIS: as a generic test for time-varying forecast bias. Section 5 applies IIS to test for potential bias in the Greenbook forecasts.
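To convey the block idea concretely, here is a stripped-down sketch of split-half IIS applied to equation (3): include the first half of the impulse dummies, keep the significant ones, repeat for the second half, and then re-select over the union. The critical value, two-block split, and simulated forecast errors are illustrative assumptions only; the paper's results use the much more elaborate multi-block search in Autometrics.

```python
import numpy as np
import statsmodels.api as sm

def iis_split_half(error, crit=2.56):
    """Return the dates of impulse dummies retained by a two-block IIS search."""
    T = len(error)
    dummies = np.eye(T)                                  # one impulse indicator per observation
    kept = []
    for block in (range(0, T // 2), range(T // 2, T)):   # saturate the sample in two halves
        res = sm.OLS(error, dummies[:, list(block)]).fit()
        kept += [block[i] for i, t in enumerate(res.tvalues) if abs(t) > crit]
    if not kept:
        return []
    # Re-estimate with the union of retained dummies and prune once more.
    res = sm.OLS(error, dummies[:, kept]).fit()
    return [kept[i] for i, t in enumerate(res.tvalues) if abs(t) > crit]

# Example: a forecast-error series whose bias shifts up half-way through the sample.
rng = np.random.default_rng(1)
e = rng.normal(0, 1, 40)
e[20:] += 3.0                                            # hypothetical time-varying bias
print(iis_split_half(e))                                 # dates flagged as biased
```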
Many existing procedures can be interpreted as "special cases" of IIS in that they represent particular algorithmic implementations of IIS. Such special cases include recursive estimation, rolling regression, the Chow (1960) predictive failure statistic (including the 1-step, breakpoint, and forecast versions implemented in OxMetrics), the Andrews (1993) unknown breakpoint test, the Bai and Perron (1998) multiple breakpoint test, tests of extended constancy in Ericsson, Hendry, and Prestwich (1998, pp. 305ff), tests of nonlinearity, intercept correction (in forecasting), and robust estimation. IIS thus provides a general and generic procedure for analyzing a model's constancy. Algorithmically, IIS also solves the problem of having more potential regressors than observations by testing and selecting over blocks of variables.

Table 1 summarizes IIS and an extension: super saturation. Throughout, T is the sample size, t is the index for time, i and j are the indexes for indicators, k is the index for economic variables (denoted x), and K is the total number of potential regressors considered.

Table 1: Impulse indicator saturation and super saturation, as characterized by the variables involved.

  Name                           Description        Variables      Definition of variables
  Impulse indicator saturation   Zero-one dummies   {I_it}         I_it = 1 for t = i, zero otherwise
  Super saturation               Step functions     {I_it, S_jt}   S_jt = 1 for t ≥ j, zero otherwise

A few remarks may be helpful for interpreting the entries in Table 1.

Impulse indicator saturation. This is the standard IIS procedure proposed by Hendry (1999), with selection among the T zero-one impulse indicators {I_it}.

Super saturation. Super saturation searches across all possible one-off step functions {S_jt}, in addition to {I_it}. Step functions are of economic interest because they may capture permanent or long-lasting changes that are not otherwise incorporated into a specific empirical model. A step function is a partial sum of impulse indicators; equivalently, it is a parsimonious representation of a sequential subset of impulse indicators that have equal coefficients. Castle, Doornik, Hendry, and Pretis (2015) investigate the statistical properties of a closely related saturation estimator, step indicator saturation (SIS), which searches among only the step indicator variables {S_jt}. Autometrics now includes IIS, SIS, super saturation (IIS+SIS), and zero-sum pairwise IIS (mentioned below); see Doornik and Hendry (2013).

Table 1 is by no means an exhaustive list of extensions to IIS. One direct extension is ultra saturation, with searches across the impulse and step indicators together with broken linear trends. Broken quadratic trends, broken cubic trends, and higher-order broken trends are also feasible. Other extensions include sequential (m = 1) and non-sequential (m > 1) pairwise impulse indicator saturation, with the pairwise indicator defined as I_it + I_(i+m)t; sequential multiplet indicator saturation, with the multiplet indicator defined as I_it + ··· + I_(i+m)t for m ≥ 1; zero-sum pairwise IIS, with the indicator defined as ΔI_it; many many variables for a set of K potential regressors {x_kt, k = 1, …, K} for K > T; factors; principal components; and multiplicative indicator saturation for products of indicators with the x_kt. See Castle, Clements, and Hendry (2013) and Ericsson (2011b, 2012) for details, discussion, and examples in the literature. Also, the IIS-type procedure chosen may itself be a combination of extensions; and that choice may affect the power of the procedure to detect specific alternatives. Notably, dummies for economic expansions and contractions are examples of sequential multiplets.
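To make the definitions in Table 1 concrete, the following small sketch (in Python rather than the Ox/PcGive environment used for the paper's results) constructs the impulse and step indicators and verifies that each step indicator is a partial sum of impulse indicators. Only the regressors are built here; selection over them would be done by an algorithm such as Autometrics.

```python
import numpy as np

T = 8
impulses = np.eye(T, dtype=int)                  # I_it = 1 if t == i, else 0 (columns indexed by i)
steps = np.tril(np.ones((T, T), dtype=int))      # S_jt = 1 if t >= j, else 0 (columns indexed by j)

# A step function is the partial (cumulative) sum of impulse indicators.
assert (steps == impulses.cumsum(axis=0)).all()

# Super saturation searches over both sets jointly: 2T candidate regressors.
super_saturation_set = np.hstack([impulses, steps])
print(super_saturation_set.shape)                # (8, 16)
```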
As a more general observation, different types of indicators are adept at characterizing different sorts of bias: impulse dummies {I_it} for date-specific anomalies, step dummies {S_jt} for level shifts, and broken trends for evolving developments. Transformations of the variable being forecast also may affect the interpretation of the retained indicators. For instance, an impulse dummy for a growth rate implies a level shift for the (log) level of the variable.

IIS-based tests of forecast bias can serve both as diagnostic tools to detect what is wrong with the forecasts, and as developmental tools to suggest how the forecasts can be improved. Clearly, "rejection of the null doesn't imply the alternative". However, for time series data, the date-specific nature of IIS-type procedures can aid in identifying important sources of forecast error. Use of these tests in forecast development is consistent with a progressive modeling approach; see White (1990).

As equation (3) emphasizes, IIS-based tests generalize the Mincer-Zarnowitz tests to allow for arbitrarily time-varying forecast bias. This observation and the observations above highlight the strength of the Mincer-Zarnowitz tests (that they focus on detecting a constant nonzero forecast bias) and also their weakness (that they assume that the forecast bias is constant over time). These characteristics of the Mincer-Zarnowitz tests bear directly on the empirical results in Section 5.

5. Evidence on Biases in the Greenbook Forecasts

This section examines the Greenbook forecasts of output growth for eight foreign countries and for the United States. Standard (Mincer-Zarnowitz) tests of forecast bias typically fail to detect economically and statistically important biases. By contrast, IIS-type tests detect large time-varying biases. Forecast biases differ numerically across the forecast horizon, country being forecast, and the date of the forecast, albeit with some qualitative similarities. Section 5.1 reports a standard summary statistic on forecast performance: root mean squared forecast errors. Section 5.2 reports standard Mincer-Zarnowitz tests of forecast bias. Section 5.3 employs IIS-type procedures to test for and estimate time-varying forecast bias.

5.1 Summary Statistics of Forecast Performance

Figure 1 plots the root mean squared forecast errors (RMSEs) for the nine countries as a function of the forecast horizon h. The RMSEs for four developed countries (Germany, Canada, the United Kingdom, and the United States) are considerably smaller at every horizon than the RMSEs for the remaining countries. Notably, the remaining countries include Japan, a developed country, and all of the emerging market economies analyzed (Brazil, Mexico, South Korea, and China). For all countries, the RMSEs generally increase with the forecast horizon; and the RMSEs increase little beyond a horizon of two quarters ahead.

[Figure 1: RMSEs of the Greenbook forecasts for GDP growth of nine countries (BZ, CA, CH, GE, JA, KO, MX, UK, US) at different forecast horizons h (h = -1, 0, 1, …, 4); axes are the forecast horizon h and the RMSE. Figure not reproduced here.]
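The RMSE summary statistic in Section 5.1 is straightforward to compute from an errors table organized by horizon; the following sketch uses made-up forecast errors purely to show the calculation, not the values plotted in Figure 1.

```python
import numpy as np
import pandas as pd

# Hypothetical forecast errors (percent per annum): rows are target quarters,
# columns are forecast horizons h.
errors = pd.DataFrame(
    {-1: [0.1, -0.2, 0.0, 0.3], 0: [0.8, -1.1, 0.5, 1.4], 4: [2.1, -1.9, 2.5, -2.2]},
    index=pd.period_range("2001Q1", periods=4, freq="Q"),
)

rmse_by_horizon = np.sqrt((errors ** 2).mean())   # RMSE for each horizon
print(rmse_by_horizon.round(2))                   # typically increases with the horizon
```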
5.2 Standard Tests of Forecast Bias

This subsection examines the Greenbook forecasts for bias using the standard (Mincer-Zarnowitz) test. With the exception of China, the Mincer-Zarnowitz test finds little evidence of economically and statistically important biases.

Table 2 reports estimated intercepts and estimated standard errors for the Mincer-Zarnowitz regression in equation (1). Here and in Table 3, HAC estimated standard errors appear under regression coefficients in square brackets [·]. The symbols +, *, and ** respectively denote significance at the 10%, 5%, and 1% levels. Statistically, there is little evidence of forecast bias, except for Chinese output growth, for which the estimated bias is 1.5%-2.0% per annum for the nowcast and the forecasts.

Table 2: Estimated intercepts and HAC standard errors for the Mincer-Zarnowitz test of bias in Greenbook forecasts for GDP growth, by forecast horizon h.

  Country           h = -1           h = 0            h = 1            h = 2            h = 3            h = 4
  Brazil            0.51 [0.27]      0.61 [0.57]      0.63 [0.56]      0.39 [0.60]      0.55 [0.77]      0.44 [0.77]
  Canada           -0.04 [0.10]      0.17 [0.22]      0.01 [0.32]     -0.09 [0.37]     -0.23 [0.38]     -0.35 [0.36]
  China             0.01 [0.14]      1.96** [0.34]    1.80** [0.42]    1.73** [0.45]    1.60** [0.56]    1.82** [0.47]
  Germany           0.10 [0.09]     -0.05 [0.17]     -0.16 [0.26]     -0.44 [0.37]     -0.62 [0.44]     -0.75 [0.48]
  Japan            -0.42+ [0.24]     0.47 [0.52]      0.18 [0.62]     -0.07 [0.65]     -0.18 [0.68]     -0.32 [0.67]
  Korea             0.66* [0.25]     0.70 [0.51]      0.49 [0.67]      0.20 [0.72]      0.11 [0.74]     -0.03 [0.73]
  Mexico           -0.03 [0.16]     -0.30 [0.48]     -0.64 [0.56]     -1.10 [0.67]     -1.21 [0.77]     -1.36+ [0.75]
  United Kingdom    0.10+ [0.05]     0.01 [0.11]     -0.16 [0.15]     -0.19 [0.23]     -0.23 [0.26]     -0.28 [0.25]
  United States    -0.06* [0.03]     0.56** [0.18]    0.39 [0.30]      0.14 [0.45]     -0.01 [0.50]     -0.06 [0.51]

5.3 Estimated Time-varying Bias Using Indicator Saturation

To assess possible time dependence of the forecast biases, this subsection estimates IIS-type equations in the form of equation (3). Time dependence is detected for all countries at some or all forecast horizons.

Table 3 reports estimated intercepts and estimated standard errors for the Mincer-Zarnowitz regression in equation (3); and Table 4 lists the impulse and step dummies retained from super saturation with a target size of 0.5%. Tables 3 and 4 highlight the dependence of bias on the country and on the forecast horizon h.

Table 3: IIS-estimated intercepts and HAC standard errors for the Mincer-Zarnowitz test of bias in the Greenbook forecasts for GDP growth, by forecast horizon h.

  Country           h = -1           h = 0            h = 1            h = 2            h = 3            h = 4
  Brazil            0.14 [0.11]      0.61 [0.57]      0.64 [0.56]      0.39 [0.60]      0.15 [0.61]      0.16 [0.68]
  Canada            0.05 [0.05]      0.00 [0.17]     -0.13 [0.29]     -0.34 [0.32]     -0.44 [0.28]     -0.26 [0.24]
  China             0 [0]            1.52** [0.31]    1.81** [0.46]    1.78** [0.49]    1.62** [0.59]    1.86** [0.49]
  Germany           0.06 [0.05]     -0.22 [0.18]     -0.16 [0.26]     -0.44 [0.37]     -0.62 [0.44]     -0.75 [0.48]
  Japan            -0.11 [0.16]      0.47 [0.52]     -0.08 [0.62]     -0.33 [0.67]     -0.46 [0.71]     -0.59 [0.71]
  Korea             0.02 [0.02]      0.70 [0.51]      0.49 [0.67]      0.96 [0.65]      0.11 [0.74]     -0.03 [0.73]
  Mexico            0.13 [0.13]     -0.54 [0.44]     -0.89+ [0.52]    -1.36* [0.62]    -1.53* [0.68]    -1.64* [0.68]
  United Kingdom    0.06 [0.05]      0.01 [0.12]     -0.16 [0.15]     -0.10 [0.20]     -0.20 [0.20]     -0.12 [0.20]
  United States    -0.00 [0.03]      0.40* [0.18]     0.39 [0.30]      0.14 [0.45]     -0.01 [0.50]     -0.06 [0.51]
Table 4: Dummy variables selected by super saturation at the 0.1% level at at least one forecast horizon, by country.

  Country           Impulse indicators selected    Step indicators selected
  Brazil            1999(4)                        2000(3), 2001(3)
  Canada            1998(2), 2002(1)               1998(3), 2000(1), 2000(3), 2001(3), 2001(4), 2002(1), 2002(2)
  China             2003(2), 2003(3)               2003(1), 2003(2), 2003(3)
  Germany           –                              2006(2), 2007(4), 2008(1)
  Japan             2001(2)                        1998(4), 1999(2), 1999(4), 2000(1), 2001(4), 2008(1)
  Korea             2000(4)                        2000(3), 2000(4), 2001(3), 2002(4), 2003(2)
  Mexico            2000(1)                        2000(3), 2001(3), 2001(4), 2002(1), 2003(1)
  United Kingdom    1999(3), 2000(4)               2002(4), 2003(1), 2003(4), 2007(4), 2008(1), 2008(2)
  United States     2003(3)                        2000(1), 2000(2), 2001(3), 2001(4), 2002(1), 2003(2), 2003(3)

The selected dummies in Table 4 indicate the pervasiveness of time-varying bias. Because forecast errors tend to be very small for backcasts (h = -1), only non-negative forecast horizons are considered in Table 4.

Graphs directly convey a sense of the magnitude and extent of the biases present. Figures 2-5 thus plot actual values and forecasts and the forecast errors for two of the countries analyzed: China and the United States. Each figure is a 2×3 panel for the forecast horizons h (h = -1, 0, 1, …, 4).

Figure 2 plots the actual and forecast values for Chinese growth; and Figure 3 plots the corresponding forecast errors and the bias as estimated from super saturation. Large biases are evident for all horizons except h = -1, and the biases appear somewhat different before and after 2003. Figure 4 plots the actual and forecast values for U.S. growth; and Figure 5 plots the corresponding forecast errors and the bias as estimated from super saturation. The biases are notably time-dependent and persistent at all non-negative horizons. Ericsson, Hood, Joutz, Sinclair, and Stekler (2015) show that those biases depend primarily on the phase of the business cycle.

Forecast biases vary markedly over time, being sometimes positive and other times negative. The Mincer-Zarnowitz tests have particular difficulty in detecting such biases because the Mincer-Zarnowitz tests average all biases (both negative and positive) over time, and because the Mincer-Zarnowitz tests assign any time variation in bias to the residual rather than to the bias itself. As an extreme example, the Mincer-Zarnowitz A test has no power to detect a forecast bias that is +10% for the first half of the sample and -10% for the second half of the sample, even though this bias would be obvious from (e.g.) graphing the data. Super saturation often detects time-varying bias, and for historically and economically consequential years. The dates of the retained dummies are important and informative, and those dummies often appear to reflect cyclical movements.

6. Conclusions

Building on Sinclair, Joutz, and Stekler (2010), the current paper analyzes Greenbook forecasts of foreign output growth for potential biases over 1999-2008. Standard tests typically fail to detect bias. However, super saturation detects economically large and highly statistically significant time-dependent biases across all countries being forecast. Biases depend on the country, the forecast horizon, and the date of the forecast. Saturation as a technique defines a generic procedure for examining forecast properties; it explains why standard tests fail to detect bias; and it provides a potential mechanism for improving forecasts. In particular, such biases imply an opportunity to robustify the forecasts, as with intercept correction; see Clements and Hendry (1999, 2002), Hendry (2006), and Castle, Fawcett, and Hendry (2010).
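The robustification mentioned in the conclusions, intercept correction, can be illustrated with a toy calculation: shift the next model-based forecast by an estimate of the recent bias. The four-quarter window and all numbers below are illustrative assumptions, not the procedure of the cited papers.

```python
import numpy as np

past_errors = np.array([1.8, 2.2, 1.5, 2.4])   # recent actual-minus-forecast errors
next_model_forecast = 6.0                      # hypothetical model-based forecast
correction = past_errors[-4:].mean()           # estimated recent bias (here 1.975)
corrected_forecast = next_model_forecast + correction
print(round(corrected_forecast, 2))            # 7.98
```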
[Figure 2: Chinese GDP growth and its Greenbook forecasts at different forecast horizons h (h = -1, 0, 1, …, 4); six panels, one per horizon. Figure not reproduced here.]

[Figure 3: Greenbook forecast errors for Chinese GDP growth at different forecast horizons h (h = -1, 0, 1, …, 4), and estimated forecast biases as calculated using super saturation; six panels, one per horizon. Figure not reproduced here.]

[Figure 4: US GDP growth and its Greenbook forecasts at different forecast horizons h (h = -1, 0, 1, …, 4); six panels, one per horizon. Figure not reproduced here.]

[Figure 5: Greenbook forecast errors for US GDP growth at different forecast horizons h (h = -1, 0, 1, …, 4), and estimated forecast biases as calculated using super saturation; six panels, one per horizon. Figure not reproduced here.]

REFERENCES

Andrews, D. W. K. (1993) "Tests for Parameter Instability and Structural Change with Unknown Change Point", Econometrica, 61, 4, 821-856. Bai, J., and P. Perron (1998) "Estimating and Testing Linear Models with Multiple Structural Changes", Econometrica, 66, 1, 47-78. Bergamelli, M., and G. Urga (2013) "Detecting Multiple Structural Breaks: A Monte Carlo Study and an Application to the Fisher Equation for the US", draft, Cass Business School, London, March. Bernanke, B. S. (2012) "U.S. Monetary Policy and International Implications", remarks at the seminar "Challenges of the Global Financial System: Risks and Governance under Evolving Globalization", Bank of Japan, Tokyo, Japan, October 14. Campbell, S. D., and S. A. Sharpe (2009) "Anchoring Bias in Consensus Forecasts and Its Effect on Market Prices", Journal of Financial and Quantitative Analysis, 44, 2, 369-390. Castle, J. L., M. P. Clements, and D. F. Hendry (2013) "Forecasting by Factors, by Variables, by Both or Neither?", Journal of Econometrics, 177, 2, 305-319. Castle, J. L., J. A. Doornik, D. F. Hendry, and F.
Pretis (2015) “Detecting Location Shifts During Model Selection by Step-indicator Saturation”, Econometrics, 3, 2, 240—264. Castle, J. L., N. W. P. Fawcett, and D. F. Hendry (2010) “Forecasting with Equilibrium- correction Models During Structural Breaks”, Journal of Econometrics, 158, 1, 25—36. Chong, Y. Y., and D. F. Hendry (1986) “Econometric Evaluation of Linear Macro-economic Models”, Review of Economic Studies, 53, 4, 671—690. Chow, G. C. (1960) “Tests of Equality Between Sets of Coefficients in Two Linear Regres- sions”, Econometrica, 28, 3, 591—605. Clements, M. P., and D. F. Hendry (1999) Forecasting Non-stationary Economic Time Series, MIT Press, Cambridge. Clements, M. P., and D. F. Hendry (2002) “Explaining Forecast Failure in Macroeconomics”, Chapter 23 in M. P. Clements and D. F. Hendry (eds.) A Companion to Economic Forecasting, Blackwell Publishers, Oxford, 539—571. Corder, J. K. (2005) “Managing Uncertainty: The Bias and Efficiency of Federal Macro- economic Forecasts”, Journal of Public Administration Research and Theory, 15, 1, 55—70. Diebold, F. X., and R. S. Mariano (1995) “Comparing Predictive Accuracy”, Journal of Business and Economic Statistics, 13, 3, 253—263. Doornik, J. A. (2009) “Autometrics”, Chapter 4 in J. L. Castle and N. Shephard (eds.) The Methodology and Practice of Econometrics: A Festschrift in Honour of David F. Hendry, Oxford University Press, Oxford, 88—121. Doornik, J. A., and D. F. Hendry (2013) PcGive 14, Timberlake Consultants Press, London (3 volumes). Engstrom, E. J., and S. Kernell (1999) “Serving Competing Principals: The Budget Es- timates of OMB and CBO in an Era of Divided Government”, Presidential Studies Quarterly, 29, 4, 820—829. Ericsson, N. R. (1992) “Parameter Constancy, Mean Square Forecast Errors, and Measuring Forecast Performance: An Exposition, Extensions, and Illustration”, Journal of Policy Modeling, 14, 4, 465—495. Ericsson, N. R. (2011a) “Improving Global Vector Autoregressions”, draft, Board of Gov- ernors of the Federal Reserve System, Washington, D.C., June. Ericsson, N. R. (2011b) “Justifying Empirical Macro-econometric Evidence in Practice”, invited presentation, online conference Communications with Economists: Current and Future Trends commemorating the 25th anniversary of the Journal of Economic Sur- veys, November. Ericsson, N. R. (2012) “Detecting Crises, Jumps, and Changes in Regime”, draft, Board of Governors of the Federal Reserve System, Washington, D.C., November. Ericsson, N. R. (2015) “How Biased Are U.S. Government Forecasts of the Federal Debt?”, International Journal of Forecasting, forthcoming. JSM2015 - Business and Economic Statistics Section 870 Ericsson, N. R., E. J. Fiallos, and J. E. Seymour (2014) “Assessing Greenbook Forecasts of Foreign GDP Growth”, draft, Board of Governors of the Federal Reserve System, Washington, D.C., August. Ericsson, N. R., D. F. Hendry, and K. M. Prestwich (1998) “The Demand for Broad Money in the United Kingdom, 1878—1993”, Scandinavian Journal of Economics, 100, 1, 289— 324 (with discussion). Ericsson, N. R., S. B. Hood, F. Joutz, T. M. Sinclair, and H. O. Stekler (2013) “Greenbook Forecasts and the Business Cycle”, draft, Board of Governors of the Federal Reserve System, Washington, D.C., December. Ericsson, N. R., S. B. Hood, F. Joutz, T. M. Sinclair, and H. O. Stekler (2015) “Time- dependent Bias in the Fed’s Greenbook Forecasts”, in JSM Proceedings, Business and Economic Statistics Section, American Statistical Association, Alexandria, VA, forth- coming. 
Ericsson, N. R., and E. L. Reisman (2012) “Evaluating a Global Vector Autoregression for Forecasting”, International Advances in Economic Research, 18, 3, 247—258. Fildes, R., and H. O. Stekler (2002) “The State of Macroeconomic Forecasting”, Journal of Macroeconomics, 24, 4, 435—468. Frankel, J. (2011) “Over-optimism in Forecasts by Official Budget Agencies and Its Impli- cations”, Oxford Review of Economic Policy, 27, 4, 536—562. Granger, C. W. J. (1989) Forecasting in Business and Economics, Academic Press, Boston, Massachusetts, Second Edition. Hendry, D. F. (1999) “An Econometric Analysis of US Food Expenditure, 1931—1989”, Chapter 17 in J. R. Magnus and M. S. Morgan (eds.)Methodology and Tacit Knowledge: Two Experiments in Econometrics, John Wiley and Sons, Chichester, 341—361. Hendry, D. F. (2006) “Robustifying Forecasts from Equilibrium-correction Systems”, Jour- nal of Econometrics, 135, 1—2, 399—426. Hendry, D. F., and J. A. Doornik (2014) Empirical Model Discovery and Theory Evaluation: Automatic Selection Methods in Econometrics, MIT Press, Cambridge, Massachusetts. Hendry, D. F., S. Johansen, and C. Santos (2008) “Automatic Selection of Indicators in a Fully Saturated Regression”, Computational Statistics, 23, 2, 317—335, 337—339. Hendry, D. F., and F. Pretis (2013) “Anthropogenic Influences on Atmospheric CO2”, Chapter 12 in R. Fouquet (ed.) Handbook on Energy and Climate Change, Edward Elgar, Cheltenham, 287—326. Hendry, D. F., and C. Santos (2010) “An Automatic Test of Super Exogeneity”, Chapter 12 in T. Bollerslev, J. R. Russell, and M. W. Watson (eds.) Volatility and Time Series Econometrics: Essays in Honor of Robert F. Engle, Oxford University Press, Oxford, 164—193. Holden, K., and D. A. Peel (1990) “On Testing for Unbiasedness and Efficiency of Forecasts”, The Manchester School, 58, 2, 120—127. Johansen, S., and B. Nielsen (2009) “An Analysis of the Indicator Saturation Estimator as a Robust Regression Estimator”, Chapter 1 in J. L. Castle and N. Shephard (eds.) The Methodology and Practice of Econometrics: A Festschrift in Honour of David F. Hendry, Oxford University Press, Oxford, 1—36. Johansen, S., and B. Nielsen (2013) “Outlier Detection in Regression Using an Iterated One-step Approximation to the Huber-skip Estimator”, Econometrics, 1, 1, 53—70. Johansen, S., and B. Nielsen (2015) “Asymptotic Theory of Outlier Detection Algorithms for Linear Time Series Regression Models”, Scandinavian Journal of Statistics, in press. Joutz, F., and H. O. Stekler (2000) “An Evaluation of the Predictions of the Federal Re- serve”, International Journal of Forecasting, 16, 1, 17—38. Mincer, J., and V. Zarnowitz (1969) “The Evaluation of Economic Forecasts”, Chapter 1 in J. Mincer (ed.) Economic Forecasts and Expectations: Analyes of Forecasting Behavior and Performance, National Bureau of Economic Research, New York, 3—46. Nordhaus, W. D. (1987) “Forecasting Efficiency: Concepts and Applications”, Review of Economics and Statistics, 69, 4, 667—674. JSM2015 - Business and Economic Statistics Section 871 Nunes, R. (2013) “Do Central Banks’ Forecasts Take Into Account Public Opinion and Views?”, International Finance Discussion Paper No. 1080, Board of Governors of the Federal Reserve System, Washington, D.C., May. Pretis, F., M. L. Mann, and R. K. Kaufmann (2015) “Testing Competing Models of the Temperature Hiatus: Assessing the Effects of Conditioning Variables and Temporal Uncertainties Through Sample-wide Break Detection”, Climatic Change, 131, 4, 705— 718. Romer, C. D., and D. 
EricssonHendryMizon-1998Oct-Exogeneity-JBES-v16n4

Exogeneity, Cointegration, and Economic Policy Analysis

Neil R. ERICSSON, Division of International Finance, Federal Reserve Board, Washington, DC 20551 (ericsson@frb.gov)
David F. HENDRY, Nuffield College, Oxford OX1 1NF, United Kingdom (david.hendry@nuffield.oxford.ac.uk)
Grayham E. MIZON, Department of Economics, European University Institute, Florence I-50016, Italy, and Department of Economics, Southampton University, Southampton SO17 1BJ, United Kingdom (mizon@datacomm.iue.it)

This overview examines conditions for reliable economic policy analysis based on econometric models, focusing on the econometric concepts of exogeneity, cointegration, causality, and invariance. Weak, strong, and super exogeneity are discussed in general, and these concepts are then applied to the use of econometric models in policy analysis when the variables are cointegrated. Implications follow for model constancy, the Lucas critique, equation inversion, and impulse response analysis. A small money-demand model for the United Kingdom illustrates the main analytical points. This article then summarizes the other articles in this issue's special section on exogeneity, cointegration, and economic policy analysis.

KEY WORDS: Causality; Equation inversion; Impulse response analysis; Invariance; Lucas critique; Money demand.

The assessment of alternative economic policies is one of the most challenging uses of econometric models. This article discusses how the concepts of exogeneity and cointegration influence and help interpret the uses of econometric models in economic policy analysis. The main contribution of the article is expositional-it unifies, supplements, and synthesizes previously disparate results on exogeneity, particularly those for cointegrated systems.
Discussion focuses on econometric conditions for reliable conditional policy analysis-specifically, analysis based on an econometric model that characterizes the distribution of policy targets conditional on policy instruments. Relatedly, we consider limitations of some commonly used policy tools, includ- ing equation inversion and impulse response analysis. The econometric concepts of exogeneity, cointegration, causal- ity, and invariance are vital in determining the usefulness of estimated models for economic policy. Throughout, this article lays out the concepts and the structure of the mod- eling approach adopted in the other empirical articles of this special section of the Journal of Business & Economic Statistics. Building on the work of Engle, Hendry, and Richard (1983), Section 1 reviews several econometric concepts, in- cluding the data-generation process (as distinct from the econometric model), weak exogeneity and parameters of interest, strong exogeneity and Granger causality, super ex- ogeneity and invariance, and parameter constancy. Section 2 discusses the cointegrated vector autoregression-a class of econometric models used throughout the rest of the article-and uses it to illustrate long-run weak exogene- ity and Granger causality. This section also considers some purposes of and conditions for conducting economic pol- icy with econometric models. The next three sections fo- cus on implications of the discussed econometric concepts for economic policy analysis. Particular issues include the Lucas critique (Sec. 3), inversion of econometric equations to determine policy effects (Sec. 4), and impulse response analysis (Sec. 5). Banerjee, Hendry, and Mizon (1996) and Hendry and Mizon (1998) discussed the additional, related policy issues of co-breaking, forecasting, policy credibility, expectations, and scenario studies. Section 6 illustrates the analytical discussion with a small econometric model using U.K. money-demand data. Finally, Section 7 summarizes the articles in this issue's special section and relates them to recent developments in cointegration and exogeneity in the context of economic policy analysis. 1. PRELIMINARIES AND NOTATION Section 1.1 distinguishes between the data-generation process and an econometric model thereof as background for discussing exogeneity in Sections 1.2-1.4. Whether or not a variable is exogenous depends on whether or not that variable can be taken as given without losing information for the purpose at hand. The distinct purposes of statistical inference (estimation and testing), forecasting, and policy analysis define the three concepts of weak, strong, and su- per exogeneity. Valid exogeneity assumptions permit sim- pler modeling strategies, reduce computational expense, and help isolate invariants of the economic mechanism, with the last being particularly important in policy analysis. Invalid exogeneity assumptions may lead to inefficient or inconsis- ? 1998 American Statistical Association Journal of Business & Economic Statistics October 1998, Vol. 16, No. 4 370 This content downloaded from 132.200.132.34 on Fri, 27 Mar 2020 16:45:26 UTC All use subject to https://about.jstor.org/terms Ericsson, Hendry, and Mizon: Exogeneity, Cointegration, and Economic Policy Analysis 371 tent inferences and result in misleading forecasts and policy simulations. 
Weak, strong, and super exogeneity are defined relative to parameters of interest, whereas predetermined- ness and strict exogeneity are not, making the latter two concepts of limited use for policy analysis; see Engle et al. (1983) for details. Sections 1.2, 1.3, and 1.4 discuss weak, strong, and su- per exogeneity, respectively. Section 2 reexamines these no- tions when cointegration holds. Readers familiar with weak, strong, and super exogeneity may wish to skip directly to Section 2. 1.1 The Data-Generation Process and the Econometric Model The data-generation process (DGP) is commonly defined in terms of the joint distribution of the data. Let the triple (Q, .F, P(.)) denote a probability space, where Q is the sam- ple space for a vector of N variables x at time t (denoted xt) that characterize an economy, .F is the event space of Q, and P(.) is the probability measure for the events in F. Denote the history of the stochastic process {xt } up to time (t - 1) by Xt_1, which is (Xo, xi, ..., xt-1) or (Xo, Xt1_), where Xo is the set of initial conditions and XJ = (xi,...., xj) for i j J. The DGP Dx (Xf Xo, ) can be sequentially factor- ized as T Dx(XIXo, I) = J D?(xtlXt-1,t), eE R', (1) t=-1 where Dx(X1 Xo, ,) is the joint density of X} given Xo, is an 1 x 1 vector of parameters, the sample is over [1, T], (t is the subset of the parameters in ( that enters the sequentially conditioned density Dx(xtlXt-l,Ct), and ( = ((1,..., ,T) (ignoring redundancies). This formulation allows for non- constant parameters, such as transients due to regime shifts and structural breaks, so the sequential conditioning in (1) may include very complicated effects. The econometric model for the full sample of xt may be factorized similarly: T fx(X)IXo,O) -= f0(xtXt-1,0), E0 e C Rn, (2) t=l1 where fx(X)IXo, 0) represents the econometric model; fx (xt Xtl, 8) is the postulated, sequentially conditioned joint density for xt; and 0 is an n x 1 vector of parameters lying in the parameter space O. Section 2 examines vector autoregressions as a particular form for the joint sequential density f (xt '). The econometric model is generally not the DGP; that is, fx(xtI.) f D,(xtI.). That lack of equal- ity has implications for inferences about 0. For example, if OT [= 0(c)] is the full-sample pseudo-true value assuming constancy, OT will minimize the Kullback-Leibler measure of distance between the two densities. A modeling strategy that aims to develop congruent and encompassing econo- metric models thus endeavors to keep such differences to a minimum; see Hendry (1995a), Mizon (1995), and Bon- temps and Mizon (1997) inter alia. Sometimes, joint modeling of all variables in xt is too dif- ficult, so a subset of m variables yt is modeled conditional on the remaining k variables zt-that is, with xt partitioned as xt = (y': z')' and N = m + k. Weak exogeneity of the conditioning variables zt is required for estimation of the conditional model for yt to be without loss of information relative to estimation of the joint model for yt and zt. Before turning to weak exogeneity itself, a few notational conventions are helpful to establish. The partitioning of xt into yt and zt implies similar partitionings of parameter vectors and matrices. Notationally, numerical subscripts 1 and 2 indicate the partitions associated with y and z, re- spectively. The partition may be according to the density in which the parameter enters, as with A = (A': A')', or corresponding to the variable multiplied. 
The context clar- ifies the precise usage. Pairs of numerical subscripts gen- erally correspond to such partitions applied by row and by column. For example, the dynamic weighting matrix F is { FiJ }, where {.} denotes the partitioned matrix consisting of (possibly matrix) elements Fij (i, j = 1, 2). For the coin- tegration weighting matrix a and the cointegrating vectors 0, the column partition is by choice. A subscript t on a parameter explicitly indicates potential time dependence. 1.2 Parameters of Interest and Weak Exogeneity One useful approach to developing a model of x may be to model some subset of x, conditional on the other variables in x. Weak exogeneity is the requirement for conditional estimation to be without loss of information from conditioning. Richard (1980) proposed the concept of weak exogeneity, building on Koopmans (1950); Engle et al. (1983) analyzed it in greater detail; and Ericsson (1992) provided an exposition. Florens and Mouchart (1985) and Boswijk (1994) discussed its relationship to the concept of S-ancillarity described by Barndorff-Nielsen (1978). To state the conditions for weak exogeneity, it is neces- sary to transform the parameters of the joint distribution into those of the conditional and the marginal distributions. Hence, the original model parameters 0 are transformed to the parameters A as A = g(O). The function g(') defines a one-to-one mapping of 0 into A for A E A, sustaining A'= (A': A') and corresponding to the factorization of the joint density into a conditional density and a marginal den- sity: fa (yt, ztlXt-l, 8) = flz(ytlzt, Xt-1, A1) . fz(ztlXt-, , 2). (3) Such a factorization always can be achieved if A1 and A2 are defined to support it, although the resulting parameters may then be linked. Whether or not conditional estimation will result in a loss of information depends crucially on what parameters are the focus of attention. Denote the q (q < n) identified parameters of interest by the vector b. The factorization (3), in combination with the parameters of interest, permits defining weak exogeneity. This content downloaded from 132.200.132.34 on Fri, 27 Mar 2020 16:45:26 UTC All use subject to https://about.jstor.org/terms 372 Journal of Business & Economic Statistics, October 1998 Definition 1. zt is weakly exogenous for the parameters of interest 0 if and only if 1. 0 = b(A1); that is, 0 is a function of A1 alone; and 2. A1 and A2 are variation free. (See Engle et al. 1983, definition 2.5.) Condition 1 ensures that 0 can be learned from A1. To- gether, conditions 1 and 2 exclude the possibility that b depends on A2, either directly (condition 1) or indirectly (condition 2): Hence, no information about the parameters of interest can be derived from the marginal model. Be- cause 0 can be learned uniquely and completely from the conditional model, weak exogeneity is a sufficient condition for efficient inference on 0 from the conditional model. The marginal distribution of policy variables may be diffi- cult to model empirically, due to changes in policy regime. In such a situation, valid conditioning on those variables can greatly assist empirical modeling. Although noncon- stancy in the marginal process argues strongly for condi- tional modeling, it also places a premium on ensuring that weak exogeneity holds, both in sample and out of sample. Equally, the "exogenous" control of policy variables by a policy agency does not in itself justify conditioning on those variables, as Section 2.2 and Section 3 show. 
Failure of either condition 1 or 2 precludes inference without loss of information when using the conditional model alone. The potential consequences of that informa- tion loss can be delineated into four distinct types. First, the parameters of interest cannot be obtained from the condi- tional model, as with errors-in-variables and simultaneity. Second, inference is distorted, even though b is obtained, as for unit-root processes in which $ (the cointegrating vector, say) depends on both A1 and A2; see Phillips and Loretan (1991) and Hendry (1995b). Third, knowledge of A2 is re- quired to identify 0. Fourth, efficiency may be lost because A2 contains useful information about AX, as with classical cross-equation restrictions. In this article, the first and sec- ond types are most relevant. 1.3 Granger Noncausality and Strong Exogeneity Granger noncausality is one of two conditions required for strong exogeneity, which bears on conditional impulse response analysis (Sec. 5) and conditional forecasting inter alia. Granger causality is defined as the presence of feed- back from one variable to another, with Granger noncausal- ity defined as the absence of such feedback. Definition 2. Suppose that the marginal density f (.) does not depend on Yt-; that is, fz(ztIXt-1,') = f (ztZt-_l, .). Then, y does not Granger-cause z. (See Granger 1969 and Engle et al. 1983, definition 2.1.) One set of variables (here, y) does not Granger-cause the remaining variables (here, z) if, given the information set available, deleting the history of the former set of variables does not alter the joint distribution of the remaining vari- ables. Unlike weak exogeneity, Granger causality does not in- volve parameters of interest and so is not related to their es- timation. Indeed, Granger noncausality is neither necessary nor sufficient for weak exogeneity. Granger noncausality in combination with weak exogeneity, however, defines strong exogeneity. Definition 3. zt is strongly exogenous for the parame- ters of interest b if zt is weakly exogenous for b and if fz(ztlXt-1, .) = f (ztlZt-1, .). (See Engle et al. 1983, defi- nition 2.6.) Strong exogeneity permits conditional forecasting from the conditional model without loss of information. That is, fore- casts of z over several periods may be constructed, and then forecasts for y are generated from the conditional model, conditional on that set of forecasts for z. If y did Granger- cause z, then forecasts for y and z would need to be con- structed together, one period at a time, or else risk losing valuable information from feedback. The preceding discussion of weak and strong exogeneity assumes parameter constancy, although abundant empirical evidence indicates the presence of regime shifts and struc- tural changes in the economy. Hence, the next subsection extends the concept of exogeneity to allow for the pres- ence of parameter nonconstancies such as might result from changes in economic policy rules. 1.4 Invariance and Super Exogeneity Even if the parameters of fylI(ytlzt,Xt-1,A1) and fz(ztlXt-1, A2) are variation free, A1 still may change as A2 alters. The explanation turns on the relationship between A1 and A2 and on their dependence on {(t } through interven- tions, at what is sometimes called the level of deep parame- ters. Specifically, the policy agency determines the marginal process Dz(zt Xt_1, C2t), even though it decides policy on the basis of its models rather than the DGP. Changes in (2t alter A2, and those changes may or may not alter A1. 
The presence or lack of invariance to a class of interventions is tied to a third concept of exogeneity, super exogeneity, as discussed later in this subsection. With super exogeneity, conditional policy simulations involve no loss of informa- tion relative to simulations of the variables' joint distribu- tion. A parameter intervention at time t affecting the DGP D. (xtXt_1, t) is defined as any action at by an agent, where that action alters last period's parameter value 5t-1 to become this period's parameter value qt [- w(at, 5t-1), say]. If no intervention occurs, then qt = t-1. Let Cc(t) [= {at: 5t = w(at, 5t-1)}] be the set of interventions at time t on D (xt Xt-1, .) that potentially affect Ct, and de- note the set of such interventions over the full sample as CQ [= {Qe(t), t = 1,..., T}]. Possible interventions include changes to monetary, fiscal, and exchange-rate policy rules, deregulation, financial and technological innovation, nation- alization, and war. Because the DGP is the economic mech- anism, its parameterization can be affected by intervention. Indeed, the aim of many economic policies is precisely to affect the DGP. Such interventions may consequentially af- fect an econometric model. Here, those of most relevance are changes in the marginal process that are enacted by a This content downloaded from 132.200.132.34 on Fri, 27 Mar 2020 16:45:26 UTC All use subject to https://about.jstor.org/terms Ericsson, Hendry, and Mizon: Exogeneity, Cointegration, and Economic Policy Analysis 373 policy agency controlling (2t, with the associated class of interventions denoted by C2. The conditional model's parameters A1 may alter as (2t varies within the class of interventions C2. Engle and Hendry (1993) considered and Section 3 will consider de- rived changes in (it such as would occur under the Lucas critique. (Policies involving direct changes in (it are more difficult to analyze and are not considered here.) Conversely, the model parameters A1 may be invariant to a class of in- terventions C2. Definition 4. AX is invariant to a class of interventions C2 if AX is constant over C2. (See Engle et al. 1983, definitions 2.7 and 2.8.) Because economies undergo numerous changes, it is important for econometric modeling that some of the economies' features be invariant to those changes. Without such invariance, econometric modeling and economic pol- icy analysis using econometric models would be of limited value. Moreover, the invariance of A1 is central to obtaining reliable policy simulations (or counterfactual experiments), in which the path of some policy instrument z is specified and the path of a target variable y is calculated from the con- ditional econometric model fylz(YtlZt, Xt-l, 1). Because A1 is unknown, the weak exogeneity of zt for the parame- ters of interest is typically required for efficient estimation from the conditional distribution alone. Reliable policy sim- ulation thus requires the combination of weak exogeneity and invariance, which Engle et al. (1983) defined as super exogeneity. Definition 5. zt is super exogenous for the parameters of interest 4 if zt is weakly exogenous for 4 and if A1 is invariant to C2. (See Engle et al. 1983, definition 2.9.) Under super exogeneity, policy actions affect (2t, thus al- tering the marginal model's parameters A2 and the path of the policy variable z. 
Those policy actions do not affect the conditional model's parameters A1, although outcomes for y depend on the hypothesized path for the policy variable z through the conditional model itself. Thus, under super exogeneity, policy can and (in general) does affect agent behavior: It does so through the variables entering the con- ditional model, albeit not through the parameters of that model. Finally, y may Granger-cause z even if zt is super exogenous. 2. COINTEGRATION AND EXOGENEITY This section elaborates on the concepts of weak exogene- ity, Granger noncausality, strong exogeneity, invariance, and super exogeneity for a commonly used class of econometric model-the finite-order vector autoregression (VAR). The cointegrated (or reduced-rank) VAR has been used success- fully in many areas of economics, and it appears capable of capturing some of the potential relationships among eco- nomic time series. Section 2.1 discusses the VAR itself. Sections 2.2 and 2.3 consider specific implications of weak exogeneity and Granger causality for that VAR, whereas Section 2.4 highlights more general aspects of economic policy analysis. Sections 3, 4, and 5 focus on implications of exogeneity for three central issues in policy analysis- the Lucas critique, model inversion, and impulse response analysis. 2.1 A Cointegrated VAR A VAR has several equivalent representations that are valuable for understanding the interactions between exo- geneity, cointegration, and economic policy analysis. To start, the levels form of the sth-order Gaussian VAR for S zc is Xt = Kqt + Ajxtj + Et, Et ? INN(O, i), (4) j=1 where K is an N x No matrix of coefficients of the No deterministic variables qt, the Aj are N x N matrices of autoregressive coefficients, and Et is a vector of N unob- served, jointly normal, sequentially independent errors with mean 0 and (constant) covariance matrix E. This subsection examines some properties of (4), first transforming it to I(0) space with mean-zero variables. For expositional con- venience, x is restricted to be (at most) integrated of order 1, denoted I(1), where an I(j) variable requires jth differ- encing to make it stationary. Again, for expositional con- venience, the system is often first-order (s = 1) or second- order (s = 2) with a (possibly zero) intercept 6 as the only deterministic variable (6 = Kqt). See Johansen (1992a,b,c) and Juselius (1998, this issue) for theoretical and empirical analyses of cointegrated 1(2) variables. Normality is also a convenient assumption, but one not required by the frame- work of Section 1. Suppose that the VAR in (4) is second-order, has an in- tercept as its only deterministic variable, and has r (r < N) cointegrating relations O'xt, defined as nonzero linear com- binations of xt that are I(0). That VAR can be written as the vector equilibrium-correction model, Axt = 6 + ap'xti- + FAxtz_ + Et. (5) The first difference operator A equals 1 - L, where L is the lag operator such that Lxt = xt-1, a and 3 are N x r matrices of rank r such that ap' = A1 + A2 - IN, and F = -A2. See Johansen (1988, 1991, 1992d, 1995), Jo- hansen and Juselius (1990), Banerjee, Dolado, Galbraith, and Hendry (1993), and Hendry (1995a) for key develop- ments of and discussions on this class of cointegrated sys- tem. Identification restrictions must be imposed to ensure uniqueness of a and p3. For the correct choice of r, the model in (5) is in I(0) space, so inference about the param- eters 6, a, F, and E can be conducted using conventional procedures. 
The VAR in (5) can be usefully rewritten with zero-mean variables. Suppose that, in steady state, the underlying variables grow at a rate γ, defined as E(Δxt), where E(·) is the expectations operator. Taking expectations across (5) and rearranging, the long-run solution of the system is

α E(β′xt) = (IN − Γ)γ − δ.   (6)

Defining the equilibrium mean E(β′xt) as μ implies

δ = (IN − Γ)γ − αμ.   (7)

Thus, adding and subtracting γ and μ in (5) yields an expression in which all terms have zero means:

(Δxt − γ) = α(β′xt-1 − μ) + Γ(Δxt-1 − γ) + εt.   (8)

If δ lies in the cointegration space (δ = −αμ), then γ = 0, so x has a growth rate of 0. The I(0) system in (8) can be reexpressed as a VAR in the N + r zero-mean variables (Δxt − γ) and (β′xt-1 − μ), similar to Hendry and Mizon [1993, eq. (5)]:

[ Δxt − γ    ]   [ Γ + αβ′   α  ] [ Δxt-1 − γ  ]   [ εt ]
[ β′xt-1 − μ ] = [ β′        Ir ] [ β′xt-2 − μ ] + [ 0  ],   (9)

noting that β′γ = 0. Although the errors in this (N + r)-dimensional system have a singular distribution, (9) is convenient for retaining relevant transformations of variables and for generating multistep-ahead forecasts of the transformed variables. Moreover, the empirical example in Section 6 uses it to calculate the impulse responses of the cointegrating combinations as well as of the growth rates. Many mappings of (9) into N dimensions yield a nonsingular distribution. Some are interpretable as (N − r) common trends and r cointegrating vectors, but they are not unique.

2.2 Long-Run Weak Exogeneity

Cointegrating vectors are often parameters of interest, and weak exogeneity of zt for them can aid in conducting inference. Johansen and Juselius (1990), Davidson and Hall (1991), Boswijk (1992, 1995), Dolado (1992), Johansen (1992c,e), Urbain (1992), Hendry and Mizon (1993), and Hendry (1995b) discussed testing weak exogeneity in cointegrated systems. From those works, this subsection extracts two key results, presented as Lemmas 1 and 2, and it shows how the choice of parameters of interest affects the conditions for weak exogeneity.

Partitioning of the variables and matrices in (5) facilitates discussing weak exogeneity, Granger causality, and their relation to policy analysis. With partitioning, (5) is

[ Δyt ]   [ δ1 ]   [ α11  α12 ] [ β11′  β12′ ] [ yt-1 ]   [ Γ11  Γ12 ] [ Δyt-1 ]   [ ε1t ]
[ Δzt ] = [ δ2 ] + [ α21  α22 ] [ β21′  β22′ ] [ zt-1 ] + [ Γ21  Γ22 ] [ Δzt-1 ] + [ ε2t ].   (10)

Parameters are partitioned by variable and (for α and β) by cointegrating vector. Specifically, β′ is divided by row into two subsets of cointegrating vectors, β1′ and β2′, which are themselves partitioned by variable as (β11′ : β12′) and (β21′ : β22′). The VAR in (10) may be reparameterized as the conditional and marginal distributions in (3),

Δyt = (δ1 − Dδ2) + (α1 − Dα2)β′xt-1 + DΔzt + (Γ1 − DΓ2)Δxt-1 + v1t
    = (δ1 − Dδ2) + [(α11β11′ + α12β21′) − D(α21β11′ + α22β21′)]yt-1
      + [(α11β12′ + α12β22′) − D(α21β12′ + α22β22′)]zt-1
      + DΔzt + (Γ11 − DΓ21)Δyt-1 + (Γ12 − DΓ22)Δzt-1 + v1t   (11)

and

Δzt = δ2 + α2β′xt-1 + Γ2Δxt-1 + ε2t
    = δ2 + (α21β11′ + α22β21′)yt-1 + (α21β12′ + α22β22′)zt-1 + Γ21Δyt-1 + Γ22Δzt-1 + ε2t,   (12)

where D = Σ12Σ22⁻¹, v1t = ε1t − Dε2t, so var(v1t) = Ω11 = Σ11 − Σ12Σ22⁻¹Σ21. We assume that β1 and β2 each contain at least one cointegrating vector (r1 > 0, r2 > 0), that β1 enters the first block (α11 ≠ 0), and that D ≠ 0 (implying contemporaneous correlation between y and z) because these assumptions generate situations of particular interest.
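In practice, systems such as (5) and (10) are usually estimated by maximum likelihood in the Johansen framework. A minimal sketch in Python's statsmodels is given below; the data file, variable set, lag length, and deterministic terms are illustrative placeholders, not anything specified in this article.

# A minimal sketch, assuming statsmodels and a pandas DataFrame `df` containing the
# N series in x_t; file name, lag choice, and deterministic terms are assumptions.
import pandas as pd
from statsmodels.tsa.vector_ar.vecm import select_coint_rank, VECM

df = pd.read_csv("macro_data.csv", index_col=0, parse_dates=True)  # hypothetical file
k_ar_diff = 1  # one lagged difference, i.e. a second-order VAR in levels as in (5)

# Johansen trace test for the cointegrating rank r (det_order=0: unrestricted intercept).
rank_test = select_coint_rank(df, det_order=0, k_ar_diff=k_ar_diff, method="trace")
print(rank_test.summary())

# Fit the vector equilibrium-correction model:
# Delta x_t = delta + alpha * beta' x_{t-1} + Gamma * Delta x_{t-1} + eps_t
res = VECM(df, k_ar_diff=k_ar_diff, coint_rank=rank_test.rank, deterministic="co").fit()
print(res.alpha)   # adjustment coefficients alpha
print(res.beta)    # cointegrating vectors beta
print(res.gamma)   # short-run coefficients Gamma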

Different conditions for the long-run weak exogeneity of
zt exist, depending on the parameters of interest.

Lemma 1. Suppose that the parameters of interest are β.
Then, zt is weakly exogenous for β if and only if α2 = 0
(i.e., α21 = 0 and α22 = 0). (See Johansen 1992e.)

The condition α2 = 0 ensures that β does not appear in the marginal distribution for zt. The parameters of interest, however, might be only a subset of the cointegrating vectors, in which case other conditions also obtain weak exogeneity.

Lemma 2. Suppose that the parameters of interest are β1. Then, zt is weakly exogenous for β1 if α21 = 0 and (α12 − Dα22) = 0. (See Hendry and Mizon 1993.)

These conditions ensure that β1 enters the conditional model but not the marginal one. These conditions are sufficient but not necessary, and Ericsson (1995, sec. 3.1) presented an alternative set of sufficient conditions for the weak exogeneity of zt for β1, noting that Σ12 (and so D) need not be 0 in those conditions. Harbo, Johansen, Nielsen, and Rahbek (1998, Sec. 5, this issue) illustrate empirically how the choice of parameters of interest can affect which
variables are weakly exogenous. Equations (11)-(12) also
indicate how the choice of policy rule can affect weak exogeneity. For instance, responses by the policy variable Δzt to past disequilibria (β1′xt-1 − μ1) induce a failure of weak exogeneity, which could imply inefficient or inconsistent inference on β1 if estimation is of the conditional model alone.

Empirically, weak exogeneity in cointegrated systems
arises with considerable regularity, as documented by Erics-
son and Irons’s (1995, pp. 298-301) literature search on su-
per exogeneity and by the empirical articles reprinted by
Ericsson and Irons (1994). Depending on how many vari-
ables are included in z, weak exogeneity may entail a con-
ditional subsystem or a conditional single-equation model.
The work of Juselius (1998, this issue) and that of Section 6

are of the first type, whereas that of de Brouwer and Erics-
son (1998, this issue) is of the second. Even without weak
exogeneity, single-equation modeling may be feasible by
treating the system estimates of the cointegrating vector(s)
as given; see Juselius (1992), Durevall (1998, this issue),
and Metin (1998, this issue).
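As a rough empirical companion to Lemma 1, the sketch below (again assuming statsmodels and an illustrative data set) inspects the estimated adjustment coefficients for the z-equation(s): near-zero, statistically insignificant rows of the estimated α are consistent with α2 = 0, although a formal likelihood-ratio test of that restriction (Johansen 1992e) would be preferable and is not shown.

# Illustrative check of the Lemma 1 condition alpha_2 = 0; the data file, variable
# ordering (z-variables last), lag length, and rank are assumptions.
import pandas as pd
from statsmodels.tsa.vector_ar.vecm import VECM

df = pd.read_csv("macro_data.csv", index_col=0, parse_dates=True)  # hypothetical file
n_z = 1  # number of conditioning (z) variables, assumed ordered last in df

res = VECM(df, k_ar_diff=1, coint_rank=1, deterministic="co").fit()
print("alpha rows for the z-equation(s):")
print(res.alpha[-n_z:, :])
print("corresponding t-values:")
print(res.tvalues_alpha[-n_z:, :])  # small t-values are consistent with alpha_2 = 0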

2.3 Granger Causality

The conditions for Granger noncausality in (10) can be
stated simply.

Lemma 3. In (10), y does not Granger-cause z if and only if (α21β11′ + α22β21′) = 0 and Γ21 = 0.

Suppose that the policy instruments are z and that the tar-
get variables of economic policy are y. These conditions for
Granger noncausality are unlikely to be satisfied in practice
because past values of target variables typically influence
the present choice of the policy instruments’ values. For ex-
ample, recent past inflation is likely to influence the tight-
ness of monetary and fiscal policy. Actual policy simula-
tions may or may not assume such feedback: All the policy
simulations in the papers collected by Bryant, Henderson,
Holtham, Hooper, and Symansky (1988) assume Granger
noncausality in the policy rule (see their p. 29), whereas
the papers collected by Bryant, Hooper, and Mann (1993)
focus on the presence of feedbacks in the policy rule (see
their pp. 13, 20).

Economic policy typically assumes that changes in the
instruments z affect the targets y. The converse has imme-
diate repercussions.

Lemma 4. In (10), z does not Granger-cause y if and only if (α11β12′ + α12β22′) = 0 and Γ12 = 0.

Without Granger causality from instruments to targets, pol-
icy is unlikely to be effective. The conditions for Granger
causality, which are directly relevant for assessing the feasi-
bility of economic policy, are features of the joint distribu-
tion of yt and zt, conditional on their past values. These con-
ditions are unrelated to those for weak exogeneity, which
pertains to statistical inference in a conditional model. For
further details on testing causality in I(1) systems, see
Mosconi and Giannini (1992) and Toda and Phillips (1993,
1994).
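For a quick empirical check in the spirit of Lemmas 3 and 4, the F-form Granger-causality tests built into statsmodels can be applied to an estimated VAR, as sketched below. The variable names ("y", "z") are placeholders, and in a cointegrated setting the procedures of Mosconi and Giannini (1992) or Toda and Phillips (1993, 1994) cited above would be more appropriate than this levels-VAR shortcut.

# A minimal sketch assuming statsmodels and illustrative series names "y" and "z".
import pandas as pd
from statsmodels.tsa.api import VAR

df = pd.read_csv("macro_data.csv", index_col=0, parse_dates=True)  # hypothetical file
res = VAR(df[["y", "z"]]).fit(maxlags=4, ic="aic")

# Does the target y feed back onto the instrument z (cf. Lemma 3)?
print(res.test_causality(caused="z", causing="y", kind="f").summary())
# Does the instrument z move the target y (cf. Lemma 4)?
print(res.test_causality(caused="y", causing="z", kind="f").summary())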

2.4 Policy, the DGP, and the Econometric Model

Often, the objective of economic policy is to shift the
mean of the target variables y to a desired value (or within
a desired range) by changing the instruments z. In such a
situation, one set of sufficient conditions for reliable pol-
icy analysis is that the econometric model coincides with
the DGP both before and after the policy intervention and
that the policy is feasible. The first of these conditions is
unlikely to be fulfilled in practice, and no criteria currently
exist for determining whether or not it is fulfilled. Conse-
quently, necessary conditions with testable implications are
typically examined. The hypotheses of exogeneity, causal-
ity, and invariance are testable in econometric models, so
important aspects of conditional econometric models may
be assessed prior to their use in policy analysis.

The hypotheses of exogeneity, causality, and invariance
have implications for policy analysis with a macroecono-
metric model, even if that model is treated as a determin-
istic, numerical accounting structure. Causal links from the
instruments to the targets are essential for policy to have
an effect, and unchanging parameters are needed for that
effect to be as anticipated. On the latter, a model might
well perform according to its established operational char-
acteristics when forecasting over a short horizon or analyz-
ing small policy adjustments. For active interventions such
as financial deregulation, new taxes, and monetary union,
more stringent criteria are likely to be required. Tests of
exogeneity help assess how well a model will predict the
actual outcome.

The following conditions are typically necessary for such
policy analysis to be of value:

1. The policy instruments and the targets have genuine
causal links; see Granger and Deutsch (1992).

2. The model represents the economy closely enough
in relevant attributes that its policy predictions reasonably
match outcomes. Empirically testable conditions include
congruency (the model is coherent with all available infor-
mation) and encompassing (the model is undominated by
alternative models); see Hendry (1995a) and Bontemps and
Mizon (1997).

3. The change in policy does not alter the econometric
model in a self-contradictory way; see Lucas (1976).

4. The policy experiment is feasible in reality and within
the econometric model.

5. The policy instruments are manipulable, in that the
policy agency can set the instruments to those values de-
sired for the policy experiment.

The econometric concepts of causality, congruence, encom-
passing, exogeneity, and invariance are associated with con-
ditions 1, 2, and 3. Under condition 1, moving y by changing
z may occur in three distinct ways-contemporaneously,
with a delay, and in the long run, as (11) clarifies. The ap-
plicability of policy, however, depends on the actual causal
links and not on the apparent links in the econometric for-
mulation. Although such a point seems obvious, models are
often applied to policy issues without explicit testing of
their policy relevance. Tests of super exogeneity may shed light on their relevance by focusing on the invariance of
the econometric relation under historical interventions and

on the significance of the claimed connections. Condition
3 leads to Section 3, immediately following. Feasibility (in
condition 4) concerns the relationship between the policy
experiment, the DGP, and the econometric model, and ma-
nipulability (in condition 5) is a feature of the policy instru-
ments. Manipulability differs from controllability, which
has a specific meaning in the engineering literature. Here,
manipulability corresponds to altering (2t in the marginal
distribution of zt, as might arise if (2t were the base rate
set by a central bank and zt were a very short-term interest
rate. The same policy might be infeasible for a long-term
interest rate if international arbitrage determined the latter.

3. THE LUCAS CRITIQUE

Lucas (1976) criticized using an econometric model for
policy analysis if implementing the policy under evaluation
would alter the structure that the model was attempting to
capture. Lucas considered examples in which agents’ expec-
tations of policy behavior enter into their optimization prob-
lem, so parameters relating to policy makers’ rules appear
in the agents’ first-order conditions. Specifically, if agents
form model-based expectations about z when planning y,
then A1 depends on A2, and A1 will change if policy alters
A2 through changes in (2t. Without super exogeneity, such
unmodeled changes in A1 are likely to confound conditional
policy analysis. More generally, behavioral parameters may
not be invariant to some policy interventions. Simply treat-
ing the econometric model’s coefficients as constant does
not ensure their constancy when a policy is implemented.
Even if weak exogeneity holds, parameters may change
when policy rules alter. In essence, the Lucas critique ques-
tions whether or not an econometric model isolates invari-

ants of the economic process. This section briefly reviews
recent research on the Lucas critique, drawing on Section
1.4. See Frisch (1938), Haavelmo (1944), and Marschak
(1953) inter alia for earlier discussions on invariance and
on what is now called the Lucas critique.

The Lucas critique concerns two related properties of the
conditional and marginal models’ parameters-constancy
and invariance. As suggested by Gordon (1976, pp. 48-49)
and developed by Hendry (1988) and Engle and Hendry
(1993), these properties provide two approaches for testing
the Lucas critique:

1. Test for the constancy of A1 and of A2. If A1 is constant
but A2 is not, then A1 is invariant to the interventions that
occurred, so the Lucas critique could not apply. (See Hendry
1988.)

2. Develop the marginal model until its parameters are
empirically constant. For instance, model the way in which
A2 varies over time by adding dummies or other variables
to the marginal model. Then, test for the significance of
those dummies or other variables in the conditional model.

Their insignificance in the conditional model demonstrates
the invariance of A1 to the modeled interventions, whereas

their significance shows the dependence of A1 on A2. (See
Engle and Hendry 1993.)

Provided that the parameters of interest can be retrieved
from A1 alone, the invariance of A1 implies the super exo-
geneity of zt.
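A stylized version of testing approach 2 might look as follows. It is only a sketch under strong simplifying assumptions (an AR(1) marginal model for the policy variable, crude impulse dummies for its largest residuals, and a static conditional model), not the procedure of Engle and Hendry (1993) itself; the series names and thresholds are illustrative.

# Sketch of approach 2: make the marginal model constant with impulse dummies, then
# test those dummies in the conditional model. All names and choices are assumptions.
import numpy as np
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("macro_data.csv", index_col=0, parse_dates=True)  # hypothetical file
dy = df["y"].diff().dropna()
dz = df["z"].diff().dropna()

# Step 1: marginal model for the policy variable (here an AR(1) in differences);
# dates with large residuals proxy for shifts in the marginal process.
marg = sm.OLS(dz.iloc[1:], sm.add_constant(dz.shift(1).iloc[1:])).fit()
outliers = marg.resid.index[np.abs(marg.resid) > 2.5 * marg.resid.std()]

# Impulse dummies for those dates.
dummies = pd.DataFrame(0.0, index=dz.index,
                       columns=[f"I_{i}" for i in range(len(outliers))])
for i, date in enumerate(outliers):
    dummies.loc[date, f"I_{i}"] = 1.0

# Step 2: conditional model for y given z, augmented with the dummies; their joint
# insignificance is evidence that the conditional parameters are invariant.
X = sm.add_constant(pd.concat([dz, dummies], axis=1))
cond = sm.OLS(dy, X).fit()
if len(outliers):
    print(cond.f_test([f"I_{i} = 0" for i in range(len(outliers))]))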

The empirical presence of super exogeneity refutes the
Lucas critique in practice. Under super exogeneity, expec-
tational models such as those proposed by Lucas could not
explain why A1 remained constant while A2 changed and so
could not adequately explain the data. The Lucas critique
as a possibility theorem is not empirically refutable, but,
through that theorem, its assumptions generate testable im-
plications. Whether the Lucas critique applies for a specific
economic relationship is thus an empirical issue.

A literature search by Ericsson and Irons (1995) of ar-
ticles citing Lucas (1976) found virtually no substantiating
evidence for the empirical relevance of the Lucas critique.
In an additional literature search of articles citing Engle
et al. (1983), Hendry (1988), and Engle and Hendry (1993),
Ericsson and Irons (1995) uncovered numerous models with
empirical super exogeneity, with those models spanning
many sectors of several countries’ economies. Economi-
cally, super exogeneity may arise if agents form expecta-
tions without using models-for example, because infor-
mation is costly or because the benefits from model-based
expectations are low. Hendry (1988; 1995a, chaps. 14 and
15), Favero and Hendry (1992), Engle and Hendry (1993),
and Ericsson and Irons (1995) provide further discussion.

Fundamental difficulties may arise if policy experiments
violate preexisting exogeneity conditions. Empirically, pol-
icy might alter the exogeneity of a variable, as perhaps in
the switch from fixed to floating exchange rates. A condi-
tional model is unlikely to be reliable in such a situation,
at least not without more information about its properties
than is usually provided. Moreover, policy analysis might
make a counterfactual assumption about exogeneity. Sec-
tion 4 considers one such situation-inversion of an econometric equation-and shows what problems arise.

4. INVERSION OF AN ECONOMETRIC EQUATION

An econometric equation is sometimes inverted to de-
termine the settings of policy variables z as a function of
target values for y. For example, money-demand equations
are often inverted to obtain the price level, transforming
M = kPI to P = M/(kI) in a standard notation. Invert-
ing money-demand equations to obtain prices is common
among monetarists and macroeconomists, whereas inver-
sion to obtain the interest rate (implicit in k) is common
among macromodelers. For examples, see Friedman and
Schwartz (1982, chap. 2), Hallman, Porter, and Small (1991,
pp. 842ff), and Barro (1997, pp. 184ff, 278ff) on the for-
mer, and Fair (1984, pp. 319-323) and Edison, Marquez,
and Tryon (1987, pp. 130-131) on the latter. Inversion may
occur prior to or after estimation: Depending on the DGP,
each may have unfortunate consequences for policy anal-
ysis, as Sections 4.1 and 4.2 discuss. Detailed expositions

were given by Hendry (1985), Ericsson (1992, sec. 2C), and
Engle and Hendry (1993; 1994, appendix).

4.1 Inversion Prior to Estimation

Inversion prior to estimation can imply parameter non-
constancy, the loss of weak exogeneity, alteration of the
inverted coefficient, and invalid exclusion restrictions. Such
inversion corresponds to an alternative factorization for the
joint density (3)-namely,

fx(yt, zt | Xt-1, ηt)

= fz|y(zt | yt, Xt-1, η1t) · fy(yt | Xt-1, η2t),   (13)

where fz|y(·) is the conditional density of zt given yt and Xt-1, fy(·) is the marginal density of yt given Xt-1, the corresponding parameters are ηt = (η1t′ : η2t′)′ = g*(θt), the function g*(·) is a one-to-one mapping from θ to η, and A2 (and so θ) is allowed to be nonconstant (A2 = A2t). The conditional model fy|z(·) in the original factorization (3) is assumed constant, although its affiliated marginal model fz(·) is nonconstant through A2t. Because (η1t : η2t) depends on all elements of (A1 : A2t) through θt, in general neither
the conditional model nor the marginal model in (13) has
constant parameters. This lack of invariance poses problems
for policy analysis. It also provides a modified approach
for testing super exogeneity and causality, as Hoover and
Sheffrin (1992) proposed.

A dynamic bivariate normal distribution highlights spe-
cific difficulties of inversion for policy analysis. Suppose
that the original factorization, corresponding to (3), obtains

yt = Dzt + v1t   (14)

and

zt = a2t + A2(L)xt-1 + ε2t,   (15)

where E(v1t ε2t) = 0 by construction, a2t is a variable that can be set by the policy agency, A2(·) is a k x N matrix polynomial of order s, A2(L)xt-1 captures the dynamics in
(15), and m = k = 1 [cf. (4)]. The conditional model (14)
thus omits a2t and dynamics, which are testable exclusions.

The parameterization (D, Ω11, A2(·), Σ22) in (14)-(15) sustains the weak exogeneity of zt for (D, Ω11) in (14). If (D, Ω11) is also invariant to interventions in (A2(·), Σ22), then zt is super exogenous for (D, Ω11). If A21(L) = 0 and zt is weakly exogenous for (D, Ω11), then zt is strongly exogenous for (D, Ω11).

Direct inversion of the conditional model (14) yields

zt = D⁻¹yt − D⁻¹v1t.   (16)

Although (16) is algebraically correct, (16) is not a conditional model for zt. That (reverse) conditioning, as in (13), actually obtains

zt = B0 a2t + B0 A2(L)xt-1 + B1 yt + v2t   (17)

and

yt = D a2t + D A2(L)xt-1 + ε1t,   (18)

where E(v2t ε1t) = 0 from conditioning, B0 = Ik − B1D, B1 = Σ22D′(DΣ22D′ + Ω11)⁻¹, and var(v2t) = Ω22 = Σ22 − Σ21Σ11⁻¹Σ12.

In (17), yt is not weakly exogenous for either (B0, B1) or the original parameters (D, Ω11) because the six sets of parameters (B0, B1, A2(·), Ω22, D, Σ11) in the reparameterized system (17)-(18) are linked by cross-equation restrictions. Additionally, B1 ≠ D⁻¹ unless Ω11 = 0; or, equivalently, the coefficient on yt in the conditional model (17) is not equal to the coefficient on yt in the inverted model (16) unless (16) is an identity. Finally, the conditional model (17) includes a2t and Xt-1, whereas the inverted equation (16) excludes those variables. This difference arises because the original conditional model (14) explicitly excludes a2t and Xt-1. If a2t and Xt-1 entered (14) unrestrictedly, either direction of conditioning could sustain weak exogeneity for
some choice of parameters of interest, and weak exogeneity
would not be testable directly. Weak exogeneity could be
tested indirectly through testing for super exogeneity (as in
Sec. 3), and an incorrect choice of factorization could still
induce poor policy recommendations through violation of
super exogeneity.
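The wedge between B1 and D⁻¹ is easy to reproduce numerically. The simulation below is a constructed illustration, not the authors' example: with D = 0.5 and equal error and regressor variances, regressing zt on yt delivers a slope near 0.4 rather than the value 2 asserted by inverting (16).

# Constructed illustration that the reverse regression recovers
# B1 = sigma_z^2 * D / (D^2 * sigma_z^2 + omega_11), not 1/D.
import numpy as np

rng = np.random.default_rng(0)
T, D, sigma_z, sigma_v = 100_000, 0.5, 1.0, 1.0
z = sigma_z * rng.standard_normal(T)           # marginal process for z (no dynamics)
y = D * z + sigma_v * rng.standard_normal(T)   # conditional model (14)

b_inverted = 1.0 / D                           # what inversion of (16) asserts
b_reverse = np.cov(z, y)[0, 1] / np.var(y)     # OLS slope of z on y, i.e. B1
b_theory = sigma_z**2 * D / (D**2 * sigma_z**2 + sigma_v**2)
print(b_inverted, b_reverse, b_theory)         # 2.0 versus approximately 0.4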

4.2 Inversion After Estimation

Inversion after estimation also affects model use, even if

such inversion does not alter the underlying equation pa-
rameters. Estimated models are sometimes treated as deter-

ministic accounting algorithms and their equations manip-
ulated accordingly, as when target variables in a model are
“exogenized.” The appropriateness of doing so depends on
whether or not the underlying stochastic structure is pre-
served. The unconditional expectation of (14) illustrates:

E(yt) = D E(zt).   (19)

If zt is controlled by the policy agency and D ≠ 0, inversion of (19) permits calculating the value z̃t needed to achieve the desired target ỹt in expectation:

z̃t = D⁻¹E(ỹt).   (20)

Choosing that z̃t delivers ỹt on average, assuming super exogeneity of zt for D under that intervention.
Although closely related to (20), direct inversion of (14)

could alter the covariance structure of the system. That in-
version is

zt = D⁻¹(yt − v1t),   (21)

where E(zt v1t) = 0 in fact and E(yt v1t) ≠ 0. Relatedly, for
nonlinear systems, unconditional expectations correspond-
ing to (19) are often analytically intractable and so are ap-
proximated by averaging stochastic simulations. If inversion
precedes simulation, then the effect is equivalent to (21). To

preserve the covariance structure, a procedure might fix zt
at a trial value and simulate E(yt), iterating on zt until the
desired (average) target value for yt results.
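A minimal sketch of that iterative procedure, using made-up parameter values for the linear conditional model (14), is given below.

# Iterate on the policy setting z while keeping the stochastic structure of (14) intact.
import numpy as np

rng = np.random.default_rng(1)
D, sigma_v, target = 0.5, 1.0, 2.0      # assumed coefficient, error s.d., desired E(y)

def simulate_y(z, n=50_000):
    # simulate the conditional model (14) at a trial policy setting z
    return D * z + sigma_v * rng.standard_normal(n)

z_trial = 0.0
for _ in range(20):                      # simple fixed-point iteration on z
    gap = target - simulate_y(z_trial).mean()
    if abs(gap) < 1e-2:
        break
    z_trial += gap / D                   # close the remaining gap, scaled by 1/D
print(z_trial)                           # approximately target / D = 4.0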

If y and z are I(1) and cointegrated, inversion of the
cointegrating relation might seem innocuous because the

particular normalization in a cointegrating regression need
not matter (at least asymptotically), even prior to estima-
tion; see Engle and Granger (1987). Inversion does not alter
the direction of causality, however, so the inverted relation
need not be economically interpretable. Consider a simple,
mainly static version of the conditional and marginal mod-
els in (11)-(12),

yt = b′zt + v1t   (22)

and

Δzt = ε2t,   (23)

where b is nonzero and is defined such that β′ = (1 : −b′).
This system's cointegrating relation implies E(yt − b′zt) =
0. A cointegrating relation, however, is simply that-a
relation-and it does not specify causal direction. Renor-
malizing the cointegrating relation on b does not necessarily
imply that yt determines zt. In (22)-(23), for example, yt is
generated conditional on zt, and zt is a pure random walk.
In a more fully dynamic DGP, such as (11)-(12), Granger
causality may exist in both directions, with no single direc-
tion of causation exclusively present.
The analytics of inversion have immediate implications

for policy analysis, as this section’s motivating example
of money demand highlights. Economic policy often cen-
ters on the determinants of inflation. Noninvertibility of
a money demand equation suggests modeling inflation di-
rectly rather than attempting to do so indirectly through
modeling money demand. In this issue of JBES, de Brouwer
and Ericsson (1998), Durevall (1998), Juselius (1998), and
Metin (1998) all model inflation directly. Doing so still al-
lows examining the role of money in determining prices,
but that examination is separate from modeling money de-
mand itself; see also Juselius (1992). Modeling both prices
and money also follows directly from the initial joint den-
sity (3), which implies distinct equations for money and
prices, whether formulated as a joint density or in a condi-
tional/marginal factorization.

5. IMPULSE RESPONSE ANALYSIS

Investigators often use impulse response analysis to
ascertain responses by one set of variables to changes
(“shocks”) in another set of variables, where the latter set
may include policy instruments. This section summarizes
the analytics of impulse response analysis (Sec. 5.1), com-
ments on its usefulness in policy analysis (Sec. 5.2), and ex-
amines the role of exogeneity in impulse response analysis
(Sec. 5.3). See Sims (1980), Runkle (1987), and Liitkepohl
(1991) inter alia for further details on the use of impulse
response analysis in economics.

5.1 A Summary

Empirical impulse responses are typically calculated
from a finite-lag VAR with possibly integrated variables
and with various deterministic series such as a trend, sea-
sonal dummies, and an intercept. The essential features of
impulse response analysis can be ascertained from a sim-

pler structure, such as (4) as a stationary first-order VAR
(s = 1) with its only deterministic series being an intercept
(Kqt = δ). The moving average (MA) representation of this
stationary system is

xt = (IN − AL)⁻¹(δ + εt) = Σ_{h=0}^{∞} Ch(δ + εt-h) = C(L)(δ + εt),   (24)

where A = A1, the Ch are N x N matrices of moving average coefficients, and C(·) is the corresponding matrix polynomial. Higher-order vector autoregressive systems can be accommodated through a companion-form representation, at the expense of complicating the algebra without yielding further insights. Unit roots in cointegrated systems can be accommodated by mapping from I(1) to I(0) variables, as in (9). We consider only invertible MA representations: Hannan (1960, 1970) and Whittle (1963) inter alia discussed the relative merits of invertible and noninvertible MA processes. The matrix of responses of xt+h to unit impulses in each of the elements of εt equals Ch:

∂xt+h/∂εt′ = Ch = (A)^h.   (25)

Plotting a typical element ∂xj,t+h/∂εit against h for h = 0, 1, 2, ..., H graphically presents the impulse response function of xj to εi. There are N² such graphs corresponding to i, j = 1, 2, ..., N. Although (25) is a convenient mathematical relation, the literature also often interprets ε as economic shocks (rather than as just residuals) and so interprets the impulse responses ∂xt+h/∂εt′ as the responses of x to those economic shocks. An alternative interpretation is that the impulse responses measure the adjustment of xt+h to a policy action at, where aτ = ι for τ = t (ι being a unit vector), aτ = 0 for τ ≠ t, and δ + at replaces δ in (24). Such policy actions shift the system's intercept and are (by assumption) autonomous. Although ∂xt+h/∂εt′ and ∂xt+h/∂at′ represent impulse responses, responses also can be calculated for persistent changes - for example, with ετ ≠ 0 or aτ ≠ 0 for τ > t.
Analytically, ∂xt+h/∂εt′ equals ∂xt+h/∂at′. Even so,

these two responses differ not only in their interpretations
but also in their caveats. The first response measures the
adjustment of x to changes in the errors e consistent with
their distribution [namely, IN_N(0, Σ)], in effect requiring
that the empirical residuals closely match the actual un-
derlying economic shocks. They need not. The second re-
sponse measures the adjustment of x to autonomous shifts a
in the system’s intercept, thus assuming that the VAR’s pa-
rameters are invariant to the class of interventions defined

by a. Such interventions may not have occurred in sam-
ple and, even if they have, the VAR may not be invariant
to them. Under each interpretation, violation of the under-
lying assumptions could seriously mislead policy analysis,
with (e.g.) incorrect signs for impulse responses resulting.
See Hendry and Mizon (1998) for further discussion.

The formulation in (25) assumes a unit impulse for each
shock εit. That ignores differences in variability and units of measurement across the corresponding variables, and, by perturbing each εit while holding constant all other εjt (j ≠ i), it ignores the (usual) nondiagonality of Σ, the covariance matrix for εt. The first issue is often addressed
by using impulses equal to the residuals’ standard devi-
ations. Orthogonalization and (more generally) identifica-
tion schemes aim to address the second issue; see Bernanke
(1986), Blanchard and Quah (1989), and King, Plosser,
Stock, and Watson (1991). These adjustments to (25) are
all particular cases of impulse response analysis applied to
the original VAR premultiplied by a nonsingular transfor-
mation matrix P’:

x*t = A*x*t-1 + ε*t,   ε*t ~ IN_N(0, Ω),   (26)

where δ = 0 (for simplicity), x*t = P′xt, A* = P′A(P′)⁻¹, ε*t = P′εt, and Ω = P′ΣP. The transformed impulse response matrix is

∂x*t+h/∂ε*t′ = (A*)^h = P′(A)^h(P′)⁻¹ = P′[∂xt+h/∂εt′](P′)⁻¹,   (27)

which differs from the raw response matrix in (25) for P′ ≠ IN. Orthogonalized impulse response analysis sets P such that Ω = IN. Structural VAR analysis sets P to correspond to an identified structure, which may have nonorthogonal errors. Premultiplication of (27) by (P′)⁻¹ yields (∂xt+h/∂εt′)(P′)⁻¹ [= ∂xt+h/∂ε*t′], which may be of more interest than either (25) or (27) because ∂xt+h/∂ε*t′ gives the response of the original variables x to the transformed impulses ε*.
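For reference, both the raw responses in (25) and a Cholesky-orthogonalized version of (27) can be computed directly from an estimated VAR, as in the statsmodels sketch below; the data file is a placeholder, and the triangular P implicit in the Cholesky factor is only one of the identification schemes mentioned above.

# Illustrative impulse-response computation, assuming statsmodels and a placeholder data set.
import pandas as pd
from statsmodels.tsa.api import VAR

df = pd.read_csv("macro_data.csv", index_col=0, parse_dates=True)  # hypothetical file
res = VAR(df).fit(maxlags=2)

irf = res.irf(12)           # horizons h = 0, ..., 12
print(irf.irfs[1])          # raw responses at h = 1, cf. (25)
print(irf.orth_irfs[1])     # orthogonalized responses, cf. (27) with a triangular P
irf.plot(orth=True)         # N^2 panels, one per (i, j) pair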

5.2 Comments

Prior to discussing the role of exogeneity in the analysis
of impulse response functions, several general comments
are germane. They pertain to dynamics, model specification,
model estimation, structure, and constancy.

First, the roots of the model’s companion matrix deter-
mine the model’s dynamic properties, so impulse response
analysis is an alternative way of presenting this informa-
tion. In this vein, de Brouwer and Ericsson (1998, fig. 3, b
and d, this issue) plot normalized lag distributions for their
conditional model of inflation, in effect calculating impulse
responses for a partial (conditional) system.

Second, impulse response functions describe the dynam-
ics of a model, not the dynamics of the variables in the
model. For example, suppose that the DGP is a multivari-
ate random walk. The impulse response functions calculated
from an estimated VAR will rarely reveal the pure persis-
tence of the shocks because the estimated roots will not be

exactly unity. Model misspecification can induce additional
discrepancies between the properties of the model and those
of the variables.

Third, impulse responses provide no additional informa-
tion for evaluating the model, beyond what is available from
the coefficient estimates of the model. See, however, Faust

(1998), who proposed assessing the robustness of impulse
responses to alternate identification schemes.

Fourth, model specification, as well as the data proper-
ties themselves, affects impulse response functions. In par-
ticular, specifying a variable as weakly or strongly exogenous can directly affect impulse responses, independently of whether that variable actually is weakly or strongly exogenous.

Fifth, empirical impulse response functions are deter-
mined by a model’s estimated parameters, regardless of the
corresponding estimator’s properties or the model specifi-
cation. Making policy inferences from an empirical impulse
response analysis thus places a premium on having a con-
gruent model that encompasses rival alternatives and is in-
variant to extensions of the information set used. Relatedly,
estimated impulse responses at long horizons may be incon-
sistent if obtained from an unrestricted VAR with cointe-

grated variables; see Phillips (1998). Ignoring cointegration
by differencing all the variables also generates problems by
confounding short-run and long-run properties, which are
central to impulse response analysis.

Sixth, if a model is congruent, encompassing, and invari-
ant, it may contain some structure. A model with structure,
however, does not imply that the corresponding residuals
are structural because they usually are not invariant to ex-
tensions of the information set unless (e.g.) the model co-
incides with the DGP. Increasing or reducing the dimen-
sion of x directly affects e, and e is also indirectly affected
by conditioning on or by partialing out putative exogenous
variables. See Hendry (1995c, p. 1632).

Seventh, economic data often appear to have structural
breaks, so a complete (or closed) VAR is unlikely to be
empirically constant. Impulse responses are inherently sen-
sitive to parameter change, both from in-sample noncon-
stancy and because impulse response analysis assumes su-
per exogeneity for the counterfactual shocks generating
the responses. Impulse responses from a nonconstant (or
noninvariant) VAR are clearly problematic to interpret for
policy analysis. Notwithstanding that difficulty, empirical
constancy of the estimated VAR is rarely checked in im-
pulse response analysis. A conditional model, in contrast
to a complete VAR, may achieve constancy, precisely by
conditioning on those variables generating the (shared)
breaks in the time series. That close relation between con-

stancy, invariance, and super exogeneity motivates the re-
mainder of this section, which examines how the assumed
and actual exogeneity of variables affects impulse response
analysis.

5.3 Exogeneity and Impulse Response Analysis

Exogeneity has several specific implications for impulse
response analysis. As in Section 2, consider the condi-
tional/marginal factorization. For a first-order stationary


VAR, that factorization generates

yt = D zt + (A11 - DA21) yt-1 + (A12 - DA22) zt-1 + v1t,
zt = A21 yt-1 + A22 zt-1 + ε2t,   (28)

where zt need not enter the conditional distribution (e.g.,
policy may act only at a lag). Irrespective of the actual
exogeneity status of zt, the equation for yt in (28) implies
that modeling the conditional distribution alone will result
in the following impulse response matrices:

∂yt+h/∂v1t' = (A11 - DA21)^h   (29)

and

∂yt+h/∂v*t' = (A11 - DA21)^h (P1')^-1,   (30)

where v*t = P1'v1t, and P1 is an m x m matrix such that P1'Ω11P1 = Im.

The impulse responses for y derived from the system
equal those derived from the conditional model under the
following (sufficient) conditions. From (25) and (29), the
raw impulse responses are equal if A21 = 0; that is, y does
not Granger-cause z. In effect, strong exogeneity ensures
that the parameters for multistep-ahead calculations are iso-
lated in the conditional model. From (27) and (30), the or-
thogonalized impulse responses are equal if A21 = 0 and if, in addition, Ω12 = 0 (i.e., yt and zt are contemporaneously uncorrelated) and P21 is set to 0 (allowing P11 = P1).
Although sufficient but not necessary, these conditions do
give a sense for the sorts of restrictions required for partial
and complete systems to yield the same impulse responses.
Conversely, weak exogeneity of zt for A1 is not sufficient
to ensure equivalence of impulse responses from the con-
ditional model and the system, whether they are raw or
orthogonalized responses.
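These sufficient conditions are easy to verify numerically. The sketch below uses illustrative parameter values (not estimates from this article), taking D to be the regression coefficient Omega_12 Omega_22^-1 from the conditional factorization.

```python
import numpy as np

# Numerical check of the sufficient condition linking (25) and (29): with
# A21 = 0 (y does not Granger-cause z), the y-block of the system responses
# A^h equals the conditional-model responses (A11 - D A21)^h.  The parameter
# values are illustrative, not estimates from this article.
A11, A12 = np.array([[0.6]]), np.array([[0.2]])
A21, A22 = np.zeros((1, 1)), np.array([[0.7]])
A = np.block([[A11, A12], [A21, A22]])

Omega = np.array([[1.0, 0.3],
                  [0.3, 0.5]])
D = Omega[:1, 1:] @ np.linalg.inv(Omega[1:, 1:])   # D = Omega_12 Omega_22^{-1}

for h in (1, 4, 8):
    system_block = np.linalg.matrix_power(A, h)[:1, :1]     # y-to-y block of A^h
    conditional = np.linalg.matrix_power(A11 - D @ A21, h)  # (A11 - D A21)^h
    print(h, system_block.item(), conditional.item())
# The two sequences agree here; equality of the orthogonalized responses would
# additionally require Omega_12 = 0, which fails for this Omega.
```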

At a more general level, many identification schemes
for a structural VAR (including orthogonal ones with tri-
angular P) correspond to conditional/marginal factoriza-
tions. If regime shifts occur in the policy variables, then
(essentially) at most one factorization isolates the invari-
ants of the process, as follows from Hendry (1988) and the
preceding discussion of super exogeneity. Tests for super
exogeneity are thus a promising direction for identifying
a VAR empirically rather than relying on arbitrary (and
untestable) identification assumptions. Put somewhat dif-
ferently, even though factorization orthogonalizes, the corresponding transformed variables need not be weakly exogenous for the parameters of interest, and the latter might
fail to be invariant to (possibly policy-induced) regime shifts

in the marginal process for zt.

6. A SMALL EMPIRICAL POLICY MODEL
FOR THE UNITED KINGDOM

This section illustrates the feasibility of and issues in
conditional economic policy analysis. The analysis begins
with a four-equation system of money, prices, output, and
interest rates in the United Kingdom, from which Sections
6.1 and 6.2 develop two conditional subsystems and test for
super exogeneity via parameter constancy tests. Section 6.3
compares impulse responses from the complete system and
the two subsystems. The data and results are available via
anonymous ftp at www.amstat.org\jbes.

The data are nominal narrow money M1 (M, in £ millions), real total final expenditure (TFE) at 1985 prices (I, in £ millions), the TFE deflator (P, 1985 = 1.00), and the (learning-adjusted) opportunity cost of holding M1 (R, per
cent per annum, expressed as a fraction). The data are quar-
terly and span 1963Q1-1989Q2. Allowing for lags, esti-
mation uses the 100 observations 1964Q3-1989Q2. Money,
TFE, and the deflator are seasonally adjusted, but R is not.
Lowercase letters denote logs of the corresponding upper-
case variables. Hendry and Ericsson (1991) and Ericsson,
Hendry, and Tran (1994) provided further details on the
data.
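For readers replicating this dataset in Python rather than PcGive, a minimal sketch of the variable construction follows, assuming a hypothetical quarterly file "ukm1.csv" with columns date, M, I, P, and R; the file and column names are placeholders, not part of the original article.

```python
import numpy as np
import pandas as pd

# Hypothetical construction of the modeled series; names are placeholders.
df = pd.read_csv("ukm1.csv", parse_dates=["date"], index_col="date")

m, i, p = np.log(df["M"]), np.log(df["I"]), np.log(df["P"])
data = pd.DataFrame({"m_p": m - p, "i": i, "Dp": p.diff(), "R": df["R"]})

# Estimation sample 1964Q3-1989Q2, allowing for lags.
data = data.loc["1964-07":"1989-06"].dropna()
```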

Model formulation follows Hendry and Mizon (1993) and
Hendry and Doornik (1994). The variables modeled are
m - p, i, Δp, and R, with an intercept, a linear trend (re-
stricted to the cointegration space), and two dummy vari-
ables, DOUT and DOIL. These dummies aim to capture
output shocks from U.K. government policy and the two
oil crises. The stochastic series appear to be I(1), although
R is probably I(0) after 1984, when it becomes a differen-
tial between competing interest rates. Hendry and Doornik
(1994) modeled the joint behavior of the four stochastic
variables, starting with an unrestricted fourth-order VAR
and simplifying that VAR to an unrestricted second-order
VAR and thence to an overidentified dynamic system, fol-
lowing the reduction approach of Hendry, Neale, and Srba
(1988), as extended for cointegration by Hendry and Mi-
zon (1993). Hendry and Doornik found two cointegrating
vectors, one for money demand and one for goods demand.
The former enters only the money equation, whereas the
latter enters the remaining three equations. Hendry and
Doornik’s final dynamic model appears to be congruent
with the sample evidence, excepting some possible changes
around the mid-1970s in the error variances for the infla-

tion and interest-rate equations. Additional analyses of this
dataset include those of Ericsson, Campos, and Tran (1990),
Boswijk (1992), Johansen (1992c), Ericsson et al. (1994),
and Harbo et al. (1998, this issue).

The current empirical analysis aims to implement the the-
oretical notions from the earlier sections, so two alterna-
tive conditional models are developed from the unrestricted
four-variable second-order VAR, and impulse responses are
constructed from those models and from the original VAR.
In the first conditional subsystem, the interest rate is treated
as weakly exogenous. The interest rate appears to be em-
pirically weakly exogenous for the money-demand cointe-
grating vector but not for the goods-demand cointegrating


vector. That said, feedback from goods demand onto the
interest rate is small numerically and only barely signif-
icant statistically, so weak exogeneity of the interest rate
for both cointegrating vectors appears to be approximately
satisfied empirically. In the second conditional subsystem,
money (rather than the interest rate) is treated as weakly ex-
ogenous, and a simplified model is again constructed. Small
multiples of graphs summarize the subsystems’ constancy
properties and the impulse responses.

6.1 A Money-Demand Formulation

This subsection tests for cointegration in the first con-
ditional subsystem, calculates various diagnostic statistics
for the resulting model, and demonstrates super exogeneity.
The first panel in Table 1 reports cointegration test statistics
for the three-equation, second-order VAR of m – p, i, and
Δp, conditional on R. The dummies (DOUT and DOIL),
ΔRt, and ΔRt-1 enter unrestrictedly, whereas the trend and
Rt-1 are restricted to lie in the cointegration space. The
results support two cointegrating vectors, similar to those
found by earlier researchers. The second and third panels in Table 1 report unrestricted standardized estimates of those cointegrating vectors and their associated feedback coefficients.

Table 1. A Conditional Cointegration Analysis of U.K. Money-Demand Data

Variables in the cointegrating relations: m - p, i, Δp, R, trend

Summary statistics and cointegration test statistics

  Null hypothesis       r = 0    r ≤ 1    r ≤ 2
  Eigenvalue            .55      .13      .08
  Maximal eigenvalue    74.5**   13.0     7.9
  95% critical value    25.5     19.0     12.3
  Trace eigenvalue      95.3**   20.8     7.9
  95% critical value    49.6     30.5     15.2

Unrestricted standardized eigenvectors β'  (columns: m - p, i, Δp, R, trend)

   1      -.99     7.39    7.64   -.000563
  -.07     1      -3.10     .68   -.006019

Unrestricted standardized adjustment coefficients α

  m - p   -.089   -.060
  i       -.022   -.099
  Δp      -.001    .102

Restricted standardized eigenvectors β'  (columns: m - p, i, Δp, R, trend)

   1      -1       7       7       0
   0        .25   -1        .25   -.001575

Restricted standardized adjustment coefficients α  (standard errors in parentheses)

  m - p   -.099 (.011)   -.229 (.206)
  i       -.024 (.008)   -.114 (.153)
  Δp      -.000 (.005)    .315 (.092)

NOTE: Johansen's maximal eigenvalue and trace eigenvalue statistics for testing cointegration are adjusted for degrees of freedom. The null hypothesis is in terms of the cointegration rank r and, for example, rejection of r = 0 is evidence in favor of at least one cointegrating vector. Critical values for the trace statistic are taken from Harbo et al. (1998, table 2, this issue). No critical values for the max statistic have been tabulated for conditional models, so the reported critical values are those from Osterwald-Lenum (1992, table 2*) for a complete system. Asterisks * and ** denote significance at the 5% and 1% levels, respectively, and estimated standard errors appear in parentheses (.).

The first cointegrating vector is recognizably the money-demand relation of Hendry and Ericsson (1991) and Hendry and Doornik (1994), with a near unit income elasticity, large negative long-run semi-elasticities for inflation and the interest rate, and a negligible trend. The second cointegrating vector is similar to Hendry and Doornik's relation for aggregate goods demand, with the deviation of output from trend being positively related to inflation and negatively related to the interest rate. As shown by Harbo et al. (1998, this issue), this cointegration analysis does not depend on nuisance parameters if weak exogeneity is valid.

The following restrictions were placed on the cointegrating vectors in Table 1. For money demand, income has a unit elasticity, Δp and R have equal coefficients (and equal to +7), and the trend has a zero coefficient. The unrestricted coefficients on Δp and R are both numerically close to +7, although imposing them as such has no direct economic interpretation, so this restriction serves mainly to obtain greater parsimony. For goods demand, the trend coefficient is imposed at .0063 (equal to the sample mean of Δi), real money has no effect, and Δp and R have coefficients of -4 and +1, interpretable (prior to 1985) as an effect from the "real interest rate" R - 4Δp. These restricted cointegrating vectors were then tested for lying in the cointegration space. The associated test statistic is asymptotically χ²(·): the hypothesis is linear, cointegration rank is preserved, and the system is in I(0) space. The value of the statistic is χ²(6) = 1.86, insignificant at the 5% level. The last two panels in Table 1 report the corresponding restricted values of the cointegrating vectors, their associated feedback coefficients, and the standard errors for the latter. The constructed equilibrium correction terms from β'xt are

c1t = mt - pt - it + 7Δpt + 7Rt - .207   (31)

and

c2t = .25(it - .0063t) - Δpt + .25Rt - 2.7516.   (32)

The dominant long-run feedback effects are of c1 in the money equation and of c2 in the inflation equation. This system determines the five variables (Δ(m - p)t, Δit, Δ²pt, c1t, c2t), similar to (9).
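As a concrete illustration, here is a minimal sketch of how the equilibrium-correction terms in (31) and (32) might be constructed. The inputs are assumed to be aligned numpy arrays with placeholder names, not objects from the original article.

```python
import numpy as np

# Sketch of the equilibrium-correction terms (31) and (32).  Inputs are
# assumed aligned numpy arrays: m, p, i in logs, Dp the first difference of p,
# R the interest rate, and trend a linear trend in quarters (names assumed).
def equilibrium_corrections(m, p, i, Dp, R, trend):
    c1 = m - p - i + 7.0 * Dp + 7.0 * R - 0.207                    # (31)
    c2 = 0.25 * (i - 0.0063 * trend) - Dp + 0.25 * R - 2.7516      # (32)
    return c1, c2
```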
The formulation in the differences of the original variables, together with the equilibrium corrections, permits a more useful impulse response analysis in Section 6.3. The I(0) conditioning variables comprise Δ(m - p)t-1, Δit-1, Δ²pt-1, ΔRt, ΔRt-1, c1t-1, c2t-1, DOUT, DOIL, and an intercept. The correlations between actual and fitted values for the stochastic variables are .85, .69, and .69, respectively. Diagnostic tests were calculated for fifth-order serial correlation, fourth-order autoregressive conditional heteroscedasticity, general heteroscedasticity, and nonnormality; see Godfrey (1978), Engle (1982), White (1980), and Doornik and Hansen (1994). The outcomes were satisfactory, other than some evidence of nonnormality and heteroscedasticity in the inflation equation.

Figure 1 plots recursive statistics for checking parameter constancy in this conditional subsystem. The statistics are one-step residuals with plus or minus twice the corresponding equation standard errors (0 ± 2σ̂t) for each equation, the scaled recursive log-likelihood for the subsystem, and breakpoint Chow (1960) statistics for each equation and for the subsystem, with the Chow statistics scaled by their one-off 1% significance levels. When viewed as a set, the Chow statistics can only be taken as informal diagnostics. That said, they provide greater detail about empirical constancy than (e.g.) an unknown breakpoint test statistic, which intentionally aggregates over outcomes for many possible sample splits. In particular, the Chow statistics and the one-step residuals provide evidence on the possible nature and timing of nonconstancy (if any), evidence that could be useful in further model development. Conversely, if no Chow statistic in a given sequence rejects (as for the money equation), that lack of rejection points to marked empirical constancy in the corresponding equation. The recursive plots suggest reasonable constancy for this conditional monetary system, albeit with the minor caveats for the inflation equation noted previously.

[Figure 1. Recursive Statistics for a Conditional Subsystem of Money, Income, and Inflation: One-Step Residuals and 0 ± 2σ̂t for Equations Explaining Δ(m - p)t, Δit, and Δ²pt; the Scaled Log-likelihood; and Breakpoint Chow Statistics for Equations Explaining Δ(m - p)t, Δit, and Δ²pt, and for the Conditional Subsystem, With the Statistics Rescaled by Their One-off 1% Critical Values.]

Section 3 sketches two procedures for testing super exogeneity: both point to empirical super exogeneity in this conditional subsystem. The first procedure uses evidence about parameter constancy for the conditional and marginal models. Figure 1 supports the constancy of the conditional model.
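The recursive diagnostics in Figure 1 can be approximated outside PcGive. The rough numpy sketch below computes one-step residuals and breakpoint Chow (1960) statistics for a single equation by brute-force re-estimation over expanding samples; PcGive's recursive estimator is more efficient, and the rescaling by one-off critical values is omitted. Function and argument names are illustrative.

```python
import numpy as np

# Rough single-equation analogue of the recursive diagnostics in Figure 1:
# one-step residuals and breakpoint Chow (1960) statistics for y = X b + u.
# M0 is the smallest estimation sample.
def recursive_constancy(y, X, M0=None):
    T, k = X.shape
    M0 = M0 or (k + 10)

    def rss(n):
        beta, *_ = np.linalg.lstsq(X[:n], y[:n], rcond=None)
        return ((y[:n] - X[:n] @ beta) ** 2).sum(), beta

    rss_T, _ = rss(T)
    one_step, chow = {}, {}
    for T1 in range(M0, T):
        rss_1, beta_1 = rss(T1)
        one_step[T1] = y[T1] - X[T1] @ beta_1                # one-step residual
        # Chow forecast test of constancy over T1+1, ..., T
        chow[T1] = ((rss_T - rss_1) / (T - T1)) / (rss_1 / (T1 - k))
    return one_step, chow
```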
In their figure 7, Hendry and Ericsson (1991) plotted breakpoint Chow statistics for a marginal model of Rt, demonstrating considerable nonconstancy in that model. The second procedure is the variable-addition test of Engle and Hendry (1993). To check the super exogeneity of (it, Δpt, Rt) for the parameters of their money-demand equation, Hendry and Doornik (1994) tested the significance of four dummy variables from Engle and Hendry (1993) and did not reject at the 5% level. Both findings on super exogeneity are consistent with the money stock being endogenously determined by the private sector. Even with super exogeneity, the determinants of money demand could and did induce large shifts in the real money stock, which increased by nearly 100% during the last decade of the sample.

6.2 An Interest-Rate Formulation

A similar analysis was undertaken for a reverse conditioning, in which Rt, it, and Δpt are treated as endogenous and (m - p)t as weakly exogenous. The cointegration space now includes (m - p)t-1, and Δ(m - p)t and Δ(m - p)t-1 enter the VAR unrestrictedly. This conditional analysis delivers one significant cointegrating relation and one nearly significant cointegrating relation, which are similar to those shown in Table 1. The first is recognizably the money-demand vector, but it enters all three equations, judging from the size and significance of the feedback coefficients. The second cointegrating vector appears to enter the equation for inflation. The cointegrating restrictions in (31)-(32) remain statistically acceptable.

Figure 2 reports the recursive sequences for this restricted, I(0), conditional VAR. Rejection occurs in the equations for both R and Δp during the early 1970s, and the system breakpoint statistic values are significant then as well. This reverse conditioning deleteriously affects parameter constancy in the model. Such results are consistent with earlier econometric analysis and with the institutional structure of the U.K. money market, in which the monetary authority determines the interest rate and the private sector their desired money balances given that interest rate.

[Figure 2. Recursive Statistics for a Conditional Subsystem of the Interest Rate, Income, and Inflation: One-Step Residuals and 0 ± 2σ̂t for Equations Explaining ΔRt, Δit, and Δ²pt; the Scaled Log-likelihood; and Breakpoint Chow Statistics for Equations Explaining ΔRt, Δit, and Δ²pt, and for the Conditional Subsystem, With the Statistics Rescaled by Their One-off 1% Critical Values.]
Indeed, because analysis of the four-variable system finds it, Δpt, and Rt to be long-run weakly exogenous for the money-demand relation in the equation for (m - p)t, conditioning on (m - p)t may account for the second conditional subsystem's relatively poor performance.

6.3 Impulse Response Analysis

This subsection derives impulse responses for the four-variable (complete) VAR and for each of the two conditional subsystems, in each case calculating the impulse responses of Δ(m - p)t, Δit, Δ²pt, c1t, and c2t to orthogonalized shocks in the income and inflation equations. Figure 3 plots the impulse responses for all three models in a panel representing selections from the derivative matrix ∂xt+h/∂εt'.

[Figure 3. Impulse Responses by Δ(m - p), Δi, Δ²p, c1, and c2 to Orthogonalized Income and Inflation Shocks in the Complete System (- ), the Subsystem Conditional on Rt (- - -), and the Subsystem Conditional on (m - p)t (- -).]

For calculating impulse responses, the first conditional model treats Rt as strongly exogenous. That model's orthogonal impulse responses differ substantially from those from the unconditional model because the residuals in the interest-rate equation are highly correlated with those of the money and inflation equations. The condition Ω12 = 0 is not satisfied, even though Granger noncausality (corresponding to the condition A21 = 0 in Sec. 5.3) holds empirically. The impulse responses from the second conditional model differ substantially from those of the other two models in numerous instances. Relatedly, i, Δp, and R empirically Granger-cause m - p, contrary to the assumption of Granger noncausality embedded in the impulse responses for the second conditional model.

Many orthogonalizations violate weak exogeneity, and different exogeneity specifications directly affect impulse responses. Thus, it seems preferable to develop a valid conditional representation and avoid the use of orthogonalized impulse response analysis.

6.4 Summary

This section developed two conditional cointegrated subsystems from different factorizations of the data's joint distribution and then drew inferences about constancy, invariance, super exogeneity, the Lucas critique, model inversion, and impulse response analysis for those two models. Tests of parameter constancy provided information about super exogeneity, which holds in the first subsystem but not in the second. The second subsystem is a (partial) inversion of the first, and, as Section 4 implies, only one factorization typically obtains a constant conditional model. The empirical super exogeneity of the first conditional model refutes the Lucas critique in practice and argues against inversion. Even with super (and strong) exogeneity in the first subsystem, orthogonal impulse response analysis is still problematic because of certain nonzero contemporaneous residual covariances. These results elucidate how the concepts of exogeneity and cointegration can help interpret and guide econometric model use in economic policy analysis.

At a more general level, econometric models are subject to stringent requirements if their use in economic policy
analysis is to deliver reliable inferences. Sufficient conditions are unrealistically demanding, so this article has delineated several operational, necessary conditions and considered how these conditions impinge on policy analysis. In particular, various forms of exogeneity and causality were highlighted and were related to the key issue of parameter invariance.

Many potential approaches to policy analysis typically fail some of the necessary conditions and hence may not be viable. Inversion of an equation to derive instrument values for achieving a given target value may violate weak exogeneity. Moreover, statistical inference in conditional models may be hazardous unless the conditioning variables are weakly exogenous, even though Granger noncausality is sufficient for the equivalence of standard-error-based impulse responses from systems and conditional models. Although such impulse response analysis is invariant to the ordering of variables, it ignores the correlation between equation residuals. Orthogonalizing the residuals addresses this problem, but orthogonalization is not recommended nonetheless. In general, it violates weak exogeneity, and it can induce a sequential conditioning of variables that depends on the initial order of the variables. That, in turn, may lose invariance of coefficients.

If economic policy analysis based on econometric models is to yield useful inferences, those models should be congruent with available information, embody valid weak exogeneity, and be invariant to policy changes. Tools such as inversion and impulse response analysis require careful handling if their implications are to match later realized outcomes.

7. THIS ISSUE'S SPECIAL SECTION ON EXOGENEITY, COINTEGRATION, AND ECONOMIC POLICY ANALYSIS

This first article in this JBES special section provides a framework for discussing econometric models in economic policy analysis, focusing on the implications of and interrelations between cointegration and exogeneity. The second article (by Harbo, Johansen, Nielsen, and Rahbek) develops the basis for inference about cointegration in conditional models, which arise naturally in a policy context. The remaining four articles (by Juselius, Metin, Durevall, and de Brouwer and Ericsson) are primarily empirical, developing (for the most part) conditional models of inflation. Inflation has been a longstanding concern of policy makers, as highlighted by the recent increased interest in inflation targeting. These four articles are not only substantive applied contributions in themselves for the countries considered (Denmark, Turkey, Brazil, and Australia): They also provide templates for similar analyses for other countries. The remainder of this section summarizes these articles.

Harbo, Johansen, Nielsen, and Rahbek analyze the likelihood ratio statistic for testing cointegration in conditional (or partial) models when weak exogeneity holds. Johansen's (1988, 1991) likelihood ratio statistic for testing cointegration in complete vector autoregressive models has been popular in practice.
Weak exogeneity often appears to be satisfied empirically, so cointegration tests that assume weak exogeneity have a strong appeal. The authors derive the statistic's asymptotic distribution, show its invariance to nuisance parameters, and tabulate critical values; and they discuss the roles of the various assumptions underlying the procedure, focusing on the implications of deterministic variables. To illustrate the approach, the authors test for cointegration in two conditional models involving real money (M1), inflation, income, and a net interest rate for the United Kingdom.

Juselius develops a multiple-equation model for money, prices, income, and interest rates for Denmark. Money and prices appear to be I(2), and they cointegrate to form real money, which is I(1). The presence of I(2) variables complicates the econometric implementation and the corresponding economic interpretation. Nominal money and prices appear to be I(2) for many other countries as well, however, so Juselius's approach may serve as a blueprint for modeling such I(2) variables. Juselius sequentially considers long-run I(2) properties, long-run I(1) properties, and short-run structure in the process of constructing a model for real money, inflation, and the domestic deposit interest rate, conditional on income and the domestic bond rate. The empirical weak exogeneity of income and the domestic bond rate helps simplify both the cointegration analysis and the modeling of short-run effects, and tests of the conditional model's constancy help demonstrate the super exogeneity of those variables. A policy regime shift in 1983 proves central to the design and evaluation of the conditional model. A marginal model for the domestic bond rate, conditional on the German bond rate, elucidates an important international aspect to the Danish economy, an aspect that may be similar for many other small open economies.

Metin models the relationship between inflation and the public sector deficit in Turkey. A vector autoregressive analysis finds three stationary relations involving output growth, inflation, the deficit, the monetary base, and a trend. Output growth itself is stationary and so constitutes a trivial cointegrating (stationary) combination. The second cointegrating vector involves inflation, the deficit, and the monetary base, and the third involves inflation, the deficit, and a trend. Weak exogeneity does not hold, so single-equation conditional modeling of inflation proceeds by taking the system estimates of the cointegrating vectors as given, following the approach of Juselius (1992). Even though major regime shifts occurred in sample, with inflation ranging from 0% to 100% per annum, the resulting equilibrium-correction model is empirically constant, thus demonstrating super exogeneity of the remaining (short-run) parameters of the conditional model. Economically, budget deficits, real income growth, and debt monetization all affect Turkish inflation.

Durevall evaluates contending theories for the determination of Brazilian inflation by allowing feedback effects from disequilibria in relationships based on purchasing power parity and money demand. Because of the number of variables and the shortness of the sample, the two corresponding cointegrating relationships are obtained from separate analyses, paralleling Juselius (1992). Inflation appears to be I(1), so the econometric methodology is closely related to that of Juselius (1998, this issue).
From the conditional model derived for inflation, its primary long-run determinant appears to be deviations from a generalized version of purchasing power parity. Although the growth rate of nominal money directly affects inflation, its level (through disequilibria in money demand) does not. Additional encompassing tests also rule out the direct influence of wages and the output gap on inflation. That said, indirect effects of such variables are possible through the determination of the exchange rate and the domestic interest rate.

Finally, de Brouwer and Ericsson develop an empirically constant equilibrium-correction model for Australian inflation, in which the level of consumer prices adjusts dynamically to relative aggregate demand and to a markup of prices over domestic and import costs. Domestic and import costs are weakly exogenous, thereby sustaining conditional modeling. The resulting model encompasses a range of economic models for prices and inflation, including a variant of the price-inflation Phillips curve, wage-price models, and purchasing power parity. Each of these economic theories, however, provides only a partial explanation of empirical price behavior in Australia: Several economic determinants are necessary to understand the behavior of the Australian consumer price index in practice.

ACKNOWLEDGMENTS

The first author is a staff economist at the Federal Reserve Board; the second author is Leverhulme Personal Research Professor of Economics at Nuffield College; and the third author is a Professor of Economics at the European University Institute, on leave from Southampton University. The views in this article are solely the responsibility of the authors and should not be interpreted as reflecting the views of the Board of Governors of the Federal Reserve System or of any other person associated with the Federal Reserve System. This article is a revised version of Hendry and Mizon (1992). All numerical results were obtained using PcGive Professional Version 9.1; see Doornik and Hendry (1996). Financial support from the U.K. Economic and Social Research Council under grants R000233447 and L116251015 and from the EUI Research Council is gratefully acknowledged. We are indebted to Mike Clements, Jurgen Doornik, Massimiliano Marcellino, Jaime Marquez, Anders Rahbek, Mark Salmon, George Tauchen, Pravin Trivedi, Ken Wallis, Mark Watson, and two anonymous referees for helpful comments.

[Received January 1995. Revised June 1998.]

REFERENCES

Banerjee, A., Dolado, J. J., Galbraith, J. W., and Hendry, D. F. (1993), Co-integration, Error Correction, and the Econometric Analysis of Non-stationary Data, Oxford, U.K.: Oxford University Press.
Banerjee, A., Hendry, D. F., and Mizon, G. E. (1996), "The Econometric Analysis of Economic Policy," Oxford Bulletin of Economics and Statistics, 58, 573-600.
Barndorff-Nielsen, O. E. (1978), Information and Exponential Families in Statistical Theory, Chichester, U.K.: Wiley.
Barro, R. J. (1997), Macroeconomics (5th ed.), Cambridge, MA: MIT Press.
Bernanke, B. S. (1986), "Alternative Explanations of the Money-Income Correlation" (with discussion), in Real Business Cycles, Real Exchange Rates and Actual Policies (Carnegie-Rochester Conference Series on Public Policy, Vol. 25), eds. K. Brunner and A. H. Meltzer, Amsterdam: North-Holland, pp. 49-99.
Blanchard, O. J., and Quah, D. (1989), "The Dynamic Effects of Aggregate Demand and Supply Disturbances," American Economic Review, 79, 655-673.
Bontemps, C., and Mizon, G. E. (1997), "Congruence and Encompassing," unpublished manuscript, European University Institute, Florence, Italy, Economics Dept.
Boswijk, H. P. (1992), Cointegration, Identification and Exogeneity (Vol. 37, Tinbergen Institute Research Series), Amsterdam: Thesis Publishers.
--- (1994), "S-Ancillarity and Strong Exogeneity," unpublished manuscript, University of Amsterdam, Dept. of Actuarial Science and Econometrics.
--- (1995), "Efficient Inference on Cointegration Parameters in Structural Error Correction Models," Journal of Econometrics, 69, 133-158.
Bryant, R. C., Henderson, D. W., Holtham, G., Hooper, P., and Symansky, S. A. (eds.) (1988), Empirical Macroeconomics for Interdependent Economies, Washington, DC: The Brookings Institution.
Bryant, R. C., Hooper, P., and Mann, C. L. (eds.) (1993), Evaluating Policy Regimes: New Research in Empirical Macroeconomics, Washington, DC: The Brookings Institution.
Chow, G. C. (1960), "Tests of Equality Between Sets of Coefficients in Two Linear Regressions," Econometrica, 28, 591-605.
Davidson, J. E. H., and Hall, S. (1991), "Cointegration in Recursive Systems," Economic Journal, 101, 239-251.
de Brouwer, G., and Ericsson, N. R. (1998), "Modeling Inflation in Australia," Journal of Business & Economic Statistics, 16, 433-449.
Dolado, J. J. (1992), "A Note on Weak Exogeneity in VAR Cointegrated Models," Economics Letters, 38, 139-143.
Doornik, J. A., and Hansen, H. (1994), "An Omnibus Test for Univariate and Multivariate Normality," Discussion Paper W4&91, Oxford University, Nuffield College.
Doornik, J. A., and Hendry, D. F. (1996), PcGive Professional 9.0 for Windows, London: International Thomson Business Press.
Durevall, D. (1998), "The Dynamics of Chronic Inflation: Brazil 1968-1985," Journal of Business & Economic Statistics, 16, 423-432.
Edison, H. J., Marquez, J. R., and Tryon, R. W. (1987), "The Structure and Properties of the Federal Reserve Board Multicountry Model," Economic Modelling, 4, 115-315.
Engle, R. F. (1982), "Autoregressive Conditional Heteroscedasticity With Estimates of the Variance of United Kingdom Inflation," Econometrica, 50, 987-1007.
Engle, R. F., and Granger, C. W. J. (1987), "Co-integration and Error Correction: Representation, Estimation, and Testing," Econometrica, 55, 251-276.
Engle, R. F., and Hendry, D. F. (1993), "Testing Super Exogeneity and Invariance in Regression Models," Journal of Econometrics, 56, 119-139.
--- (1994), "Testing Super Exogeneity and Invariance in Regression Models," in Testing Exogeneity, eds. N. R. Ericsson and J. S. Irons, Oxford, U.K.: Oxford University Press, pp. 93-119.
Engle, R. F., Hendry, D. F., and Richard, J.-F. (1983), "Exogeneity," Econometrica, 51, 277-304.
Ericsson, N. R. (1992), "Cointegration, Exogeneity, and Policy Analysis: An Overview," Journal of Policy Modeling, 14, 251-280.
--- (1995), "Conditional and Structural Error Correction Models," Journal of Econometrics, 69, 159-171.
Ericsson, N. R., Campos, J., and Tran, H.-A. (1990), "PC-GIVE and David Hendry's Econometric Methodology," Revista de Econometria, 10, 7-117.
Ericsson, N. R., Hendry, D. F., and Tran, H.-A. (1994), "Cointegration, Seasonality, Encompassing, and the Demand for Money in the United Kingdom," in Nonstationary Time Series Analysis and Cointegration, ed. C. P. Hargreaves, Oxford, U.K.: Oxford University Press, pp. 179-224.
Ericsson, N. R., and Irons, J. S. (eds.) (1994), Testing Exogeneity, Oxford, U.K.: Oxford University Press.
--- (1995), "The Lucas Critique in Practice: Theory Without Measurement" (with discussion), in Macroeconometrics: Developments, Tensions and Prospects, ed. K. D. Hoover, Boston, MA: Kluwer, pp. 263-312.
Fair, R. C. (1984), Specification, Estimation, and Analysis of Macroeconometric Models, Cambridge, MA: Harvard University Press.
Faust, J. (1998), "The Robustness of Identified VAR Conclusions About Money," International Finance Discussion Paper 610, Board of Governors of the Federal Reserve System, Washington, DC.
Favero, C., and Hendry, D. F. (1992), "Testing the Lucas Critique: A Review" (with discussion), Econometric Reviews, 11, 265-306.
Florens, J.-P., and Mouchart, M. (1985), "Conditioning in Dynamic Models," Journal of Time Series Analysis, 6, 15-34.
Friedman, M., and Schwartz, A. J. (1982), Monetary Trends in the United States and the United Kingdom: Their Relation to Income, Prices, and Interest Rates, 1867-1975, Chicago: University of Chicago Press.
Frisch, R. (1938), "Statistical Versus Theoretical Relations in Economic Macrodynamics," unpublished manuscript, League of Nations. (Reprinted as "Autonomy of Economic Relations," in The Foundations of Econometric Analysis (1995), eds. D. F. Hendry and M. S. Morgan, Cambridge, U.K.: Cambridge University Press, pp. 407-419.)
Godfrey, L. G. (1978), "Testing Against General Autoregressive and Moving Average Error Models When the Regressors Include Lagged Dependent Variables," Econometrica, 46, 1293-1301.
Gordon, R. J. (1976), "Can Econometric Policy Evaluations Be Salvaged?-A Comment," in The Phillips Curve and Labor Markets (Carnegie-Rochester Conference Series on Public Policy, Vol. 1), eds. K. Brunner and A. H. Meltzer, Amsterdam: North-Holland, pp. 47-61.
Granger, C. W. J. (1969), "Investigating Causal Relations by Econometric Models and Cross-spectral Methods," Econometrica, 37, 424-438.
Granger, C. W. J., and Deutsch, M. (1992), "Comments on the Evaluation of Policy Models," Journal of Policy Modeling, 14, 497-516.
Haavelmo, T. (1944), "The Probability Approach in Econometrics," Econometrica, 12, i-viii, 1-118.
Hallman, J. J., Porter, R. D., and Small, D. H. (1991), "Is the Price Level Tied to the M2 Monetary Aggregate in the Long Run?" American Economic Review, 81, 841-858.
Hannan, E. J. (1960), Time Series Analysis, London: Methuen.
--- (1970), Multiple Time Series, New York: Wiley.
Harbo, I., Johansen, S., Nielsen, B., and Rahbek, A. (1998), "Asymptotic Inference on Cointegrating Rank in Partial Systems," Journal of Business & Economic Statistics, 16, 388-399.
Hendry, D. F. (1985), "Monetary Economic Myth and Econometric Reality," Oxford Review of Economic Policy, 1, 72-84.
--- (1988), "The Encompassing Implications of Feedback Versus Feedforward Mechanisms in Econometrics," Oxford Economic Papers, 40, 132-149.
--- (1995a), Dynamic Econometrics, Oxford, U.K.: Oxford University Press.
--- (1995b), "On the Interactions of Unit Roots and Exogeneity," Econometric Reviews, 14, 383-419.
--- (1995c), "Econometrics and Business Cycle Empirics," Economic Journal, 105, 1622-1636.
Hendry, D. F., and Doornik, J. A. (1994), "Modelling Linear Dynamic Econometric Systems," Scottish Journal of Political Economy, 41, 1-33.
Hendry, D. F., and Ericsson, N. R. (1991), "Modeling the Demand for Narrow Money in the United Kingdom and the United States" (with discussion), European Economic Review, 35, 833-881.
Hendry, D. F., and Mizon, G. E. (1992), "The Role of Weak Exogeneity in Econometric Model Policy Analyses," unpublished manuscript, Institute of Economics and Statistics, Oxford University, Oxford, U.K.
--- (1993), "Evaluating Dynamic Econometric Models by Encompassing the VAR," in Models, Methods, and Applications of Econometrics: Essays in Honor of A. R. Bergstrom, ed. P. C. B. Phillips, Cambridge, MA: Blackwell, pp. 272-300.
--- (1998), "Exogeneity, Causality, and Co-breaking in Economic Policy Analysis of a Small Econometric Model of Money in the U.K.," Empirical Economics, 23, 267-294.
Hendry, D. F., Neale, A. J., and Srba, F. (1988), "Econometric Analysis of Small Linear Systems Using PC-FIML," Journal of Econometrics, 38, 203-226.
Hoover, K. D., and Sheffrin, S. M. (1992), "Causation, Spending, and Taxes: Sand in the Sandbox or Tax Collector for the Welfare State?" American Economic Review, 82, 225-248.
Johansen, S. (1988), "Statistical Analysis of Cointegration Vectors," Journal of Economic Dynamics and Control, 12, 231-254.
--- (1991), "Estimation and Hypothesis Testing of Cointegration Vectors in Gaussian Vector Autoregressive Models," Econometrica, 59, 1551-1580.
--- (1992a), "An I(2) Cointegration Analysis of the Purchasing Power Parity Between Australia and the United States," in Macroeconomic Modelling in the Long Run, ed. C. P. Hargreaves, Aldershot, Hants., U.K.: Edward Elgar, pp. 229-248.
--- (1992b), "A Representation of Vector Autoregressive Processes Integrated of Order 2," Econometric Theory, 8, 188-202.
--- (1992c), "Testing Weak Exogeneity and the Order of Cointegration in UK Money Demand Data," Journal of Policy Modeling, 14, 313-334.
--- (1992d), "Determination of Cointegration Rank in the Presence of a Linear Trend," Oxford Bulletin of Economics and Statistics, 54, 383-398.
--- (1992e), "Cointegration in Partial Systems and the Efficiency of Single-equation Analysis," Journal of Econometrics, 52, 389-402.
--- (1995), Likelihood-based Inference in Cointegrated Vector Autoregressive Models, Oxford, U.K.: Oxford University Press.
Johansen, S., and Juselius, K. (1990), "Maximum Likelihood Estimation and Inference on Cointegration-With Applications to the Demand for Money," Oxford Bulletin of Economics and Statistics, 52, 169-210.
Juselius, K. (1992), "Domestic and Foreign Effects on Prices in an Open Economy: The Case of Denmark," Journal of Policy Modeling, 14, 401-428.
--- (1998), "A Structured VAR for Denmark Under Changing Monetary Regimes," Journal of Business & Economic Statistics, 16, 400-411.
King, R. G., Plosser, C. I., Stock, J. H., and Watson, M. W. (1991), "Stochastic Trends and Economic Fluctuations," American Economic Review, 81, 819-840.
Koopmans, T. C. (1950), "When Is an Equation System Complete for Statistical Purposes?" in Statistical Inference in Dynamic Economic Models (Cowles Commission Monograph No. 10), ed. T. C. Koopmans, New York: Wiley, pp. 393-409.
Lucas, R. E., Jr. (1976), "Econometric Policy Evaluation: A Critique" (with discussion), in The Phillips Curve and Labor Markets (Carnegie-Rochester Conference Series on Public Policy, Vol. 1), eds. K. Brunner and A. H. Meltzer, Amsterdam: North-Holland, pp. 19-46.
Lütkepohl, H. (1991), Introduction to Multiple Time Series Analysis, New York: Springer-Verlag.
Marschak, J. (1953), "Economic Measurements for Policy and Prediction," in Studies in Econometric Method (Cowles Commission Monograph No. 14), eds. W. C. Hood and T. C. Koopmans, New York: Wiley, pp. 1-26.
Metin, K. (1998), "The Relationship Between Inflation and the Budget Deficit in Turkey," Journal of Business & Economic Statistics, 16, 412-422.
Mizon, G. E. (1995), "Progressive Modelling of Macroeconomic Time Series: The LSE Methodology" (with discussion), in Macroeconometrics: Developments, Tensions and Prospects, ed. K. D. Hoover, Boston, MA: Kluwer, pp. 107-170.
Mosconi, R., and Giannini, C. (1992), "Non-causality in Cointegrated Systems: Representation, Estimation and Testing," Oxford Bulletin of Economics and Statistics, 54, 399-417.
Osterwald-Lenum, M. (1992), "A Note With Quantiles of the Asymptotic Distribution of the Maximum Likelihood Cointegration Rank Test Statistics," Oxford Bulletin of Economics and Statistics, 54, 461-472.
Phillips, P. C. B. (1998), "Impulse Response and Forecast Error Variance Asymptotics in Nonstationary VARs," Journal of Econometrics, 83, 21-56.
Phillips, P. C. B., and Loretan, M. (1991), "Estimating Long-run Economic Equilibria," Review of Economic Studies, 58, 407-436.
Richard, J.-F. (1980), "Models With Several Regimes and Changes in Exogeneity," Review of Economic Studies, 47, 1-20.
Runkle, D. E. (1987), "Vector Autoregressions and Reality" (with discussion), Journal of Business & Economic Statistics, 5, 437-442.
Sims, C. A. (1980), "Macroeconomics and Reality," Econometrica, 48, 1-48.
Toda, H. Y., and Phillips, P. C. B. (1993), "Vector Autoregressions and Causality," Econometrica, 61, 1367-1393.
--- (1994), "Vector Autoregression and Causality: A Theoretical Overview and Simulation Study," Econometric Reviews, 13, 259-285.
Urbain, J.-P. (1992), "On Weak Exogeneity in Error Correction Models," Oxford Bulletin of Economics and Statistics, 54, 187-207.
White, H. (1980), "A Heteroskedasticity-consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity," Econometrica, 48, 817-838.
Whittle, P. (1963), Prediction and Regulation by Linear Least-Square Methods, Princeton, NJ: D. Van Nostrand.
EricssonMacKinnon-2002-DistributionsOfErrorCorrectionTests-EctricsJ-v5n2

Econometrics Journal (2002), volume 5, pp. 285-318.

Distributions of error correction tests for cointegration

Neil R. Ericsson^ and James G. MacKinnon*

^ Stop 20, Division of International Finance, Federal Reserve Board, 2000 C Street, NW, Washington, DC 20551, USA. E-mail: ericsson@frb.gov
* Department of Economics, Queen's University, Kingston, Ontario, Canada K7L 3N6. E-mail: jgm@qed.econ.queensu.ca; Homepage: www.econ.queensu.ca/faculty/mackinnon

Received: January 2000

Summary. This paper provides densities and finite sample critical values for the single equation error correction statistic for testing cointegration. Graphs and response surfaces summarize extensive Monte Carlo simulations and highlight simple dependencies of the statistic's quantiles on the number of variables in the error correction model, the choice of deterministic components, and the sample size. The response surfaces provide a convenient way for calculating finite sample critical values at standard levels; and a computer program, freely available over the Internet, can be used to calculate both critical values and p-values. Two empirical applications illustrate these tools.

Keywords: Cointegration, Critical value, Distribution function, Error correction, Monte Carlo, Response surface.

1. INTRODUCTION

Three general approaches are widely used for testing whether or not non-stationary economic time series are cointegrated: single-equation static regressions, due to Engle and Granger (1987); vector autoregressions, as formulated by Johansen (1988, 1995); and single-equation conditional error correction models, initially proposed by Phillips (1954) and further developed by Sargan (1964). While all three have their advantages and disadvantages, testing for cointegration with any of these approaches requires non-standard critical values, which are usually calculated by Monte Carlo simulation.

Engle and Granger (1987) tabulate a limited set of critical values for their procedure. MacKinnon (1991) derives a more extensive set with finite sample corrections based on response surfaces, and MacKinnon (1996) provides a computer program to calculate critical values for Engle and Granger's test at any desired level. Johansen (1988), Johansen and Juselius (1990), and Osterwald-Lenum (1992) include critical values for the Johansen procedure under typical assumptions about deterministic terms and the number of stochastic variables. Johansen (1995), Doornik (1998), and MacKinnon et al. (1999) provide more accurate estimates of these critical values, with the last of these papers also providing computer programs to calculate critical values and p-values.

By contrast, critical values for the single-equation error correction procedure are scant, perhaps because error correction models substantially predate the literature on cointegration. Banerjee et al.
(1993) tabulate critical values for an error correction model with two variables at three sample sizes; and Banerjee et al. (1998) list critical values for models with two through six variables at five sample sizes. Harbo et al. (1998), MacKinnon et al. (1999), and Pesaran et al. (2000) list asymptotic critical values for a related but distinct procedure for single- and multiple-equation error correction models.

The current paper addresses this dearth by providing an extensive set of cointegration critical values for the single-equation error correction model. These critical values include finite sample adjustments similar to those in MacKinnon (1991, 1996) for the Engle-Granger (EG) procedure, they are very accurate numerically and are easy to use in practice, and they encompass and supersede comparable results in Banerjee et al. (1993) and Banerjee et al. (1998). We also provide a freely available Excel spreadsheet and a Fortran program (the latter being similar to the one in MacKinnon (1996) for the EG procedure) that compute both critical values and p-values for the error correction statistic. As the articles in Banerjee and Hendry (1996), Ericsson (1998), and Lütkepohl and Wolters (1998) inter alia highlight, conditional error correction models are ubiquitous empirically, so these tools for calculating critical values and p-values should be of immediate and widespread use to the empirical modeler. Finally, general distributional properties are of considerable interest. Accurate numerical approximations to the entire distribution of the error correction statistic are calculated herein and offer insights into the nature of that statistic, particularly relative to the Dickey-Fuller and EG statistics. Graphs highlight the error correction statistic's properties and relationships, and show for the first time what many of its various distributions look like. Throughout, the focus is on testing for cointegration, rather than on the complementary task of estimating the cointegrating vectors, assuming a given cointegration rank.

This paper is organized as follows. Section 2 sets the backdrop by considering the three common procedures and their relationships to each other. Section 3 outlines the structure of the Monte Carlo analysis for calculating the distributional properties of the cointegration test statistic based on the single-equation error correction model. Section 4 presents the Monte Carlo results, which include densities and finite sample critical values. Section 5 applies the finite sample critical values derived in Section 4 and the computer program for calculating p-values to empirical error correction models of UK narrow money demand from Hendry and Ericsson (1991) and of US federal government debt from Hamilton and Flavin (1986). Section 6 concludes.

2. AN OVERVIEW OF THREE TEST PROCEDURES

This paper focuses on finite sample inference about cointegration in a single-equation conditional error correction model (ECM).¹ To motivate the use of conditional ECMs, this section describes the analytics of and inferential methods for the three common approaches for testing cointegration: the Johansen procedure (Section 2.1), the conditional ECM (Section 2.2), and the EG procedure (Section 2.3). Differences between the three approaches turn on their various assumptions about dynamics and exogeneity (Section 2.4).

¹ Strictly speaking, the models examined herein are equilibrium correction models; see Hendry and Doornik (2001, p. 144).
2.1. The Johansen procedure

Johansen (1988, 1995) derives maximum likelihood procedures for testing for cointegration in a finite-order Gaussian vector autoregression (VAR). That system is:

x_t = Σ_{i=1}^{l} π_i x_{t-i} + Φ D_t + ε_t,   ε_t ~ IN(0, Ω),   t = 1, ..., T,   (1)

where x_t is a vector of k variables at time t; π_i is a k x k matrix of coefficients on the ith lag of x_t; l is the maximal lag length; Φ is a k x d matrix of coefficients on D_t, a vector of d deterministic variables (such as a constant term and a trend); ε_t is a vector of k unobserved, sequentially independent, jointly normal errors with mean zero and (constant) covariance matrix Ω; and T is the number of observations. Throughout, x is restricted to be (at most) integrated of order one, denoted I(1), where an I(j) variable requires jth differencing to make it stationary. The VAR in (1) may be rewritten as a vector error correction model:

Δx_t = π x_{t-1} + Σ_{i=1}^{l-1} Γ_i Δx_{t-i} + Φ D_t + ε_t,   ε_t ~ IN(0, Ω),   (2)

where π and Γ_i are:

π = Σ_{i=1}^{l} π_i - I_k,   (3)
Γ_i = -(π_{i+1} + ... + π_l),   i = 1, ..., l - 1,   (4)

I_k is the identity matrix of dimension k, and Δ is the difference operator.² For any specified number of cointegrating vectors r (0 ≤ r ≤ k), the matrix π is of (potentially reduced) rank r and may be rewritten as αβ', where α and β are k x r matrices of full rank. By substitution, (2) is:

Δx_t = αβ'x_{t-1} + Σ_{i=1}^{l-1} Γ_i Δx_{t-i} + Φ D_t + ε_t,   ε_t ~ IN(0, Ω),   (5)

where β is the matrix of cointegrating vectors, and α is the matrix of adjustment coefficients (equivalently, the loading matrix).

Johansen (1988, 1995) derives two maximum likelihood statistics for testing the rank of π in (2) and hence for testing the number of cointegrating vectors in (2). Critical values appear in Johansen (1988, Table 1) for a VAR with no deterministic components, in Johansen and Juselius (1990, Tables A1-A3) for VARs with a constant term, and in Osterwald-Lenum (1992) and Johansen (1995, Ch. 15) for VARs with no deterministic components, with a constant term only, and with a constant term and a linear trend. Doornik (1998) derives a convenient approximation to the maximum likelihood statistics' distributions using the Gamma distribution, and MacKinnon et al. (1999) provide computer programs to calculate critical values and p-values for the Johansen procedure.

² The difference operator Δ is defined as (1 - L), where the lag operator L shifts a variable one period into the past. Hence, for x_t, Lx_t = x_{t-1} and so Δx_t = x_t - x_{t-1}. More generally, Δ_j^i x_t = (1 - L^j)^i x_t for positive integers i and j. If i or j is not explicit, it is taken to be unity.

2.2. Single-equation conditional error correction models

Without loss of generality, the VAR in (1) can be factorized into a pair of conditional and marginal models. If the marginal variables are weakly exogenous for the cointegrating vectors β, then inference about cointegration using the conditional model alone can be made without loss of information relative to inference using the full system (the VAR); see Johansen (1992a,b).
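For readers wanting to run a rank test like (1)-(5) in Python rather than PcGive, a minimal sketch follows, assuming statsmodels' coint_johansen function. det_order selects the deterministic terms Φ D_t, k_ar_diff corresponds to the l - 1 lagged differences, and the 95% column position in the critical-value array is an assumption about that function's layout.

```python
from statsmodels.tsa.vector_ar.vecm import coint_johansen

# Sketch of the Johansen rank tests implied by (1)-(5).  `levels` is a T x k
# array of the variables in levels; det_order = 0 requests a constant
# (1 adds a trend); k_ar_diff is the number of lagged differences, i.e. l - 1.
def johansen_rank_tests(levels, det_order=0, k_ar_diff=1):
    res = coint_johansen(levels, det_order, k_ar_diff)
    for r, (trace, cv95) in enumerate(zip(res.lr1, res.cvt[:, 1])):
        print(f"H0: rank <= {r}:  trace = {trace:.1f},  95% cv = {cv95:.1f}")
    return res
```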
This subsection derives a single-equation conditional model from the VAR and delineates two related approaches for conducting such inferences about cointegration from that conditional model. The second of those approaches is the focus of the Monte Carlo analysis in Sections 3 and 4 and of the empirical analysis in Section 5.

For expositional clarity, assume that (1) is a first-order VAR with no deterministic components. Its explicit representation as the vector error correction model (2) is:

Δy_t = π_(11) y_{t-1} + π_(12) z_{t-1} + ε_{1t},   (6)
Δz_t = π_(21) y_{t-1} + π_(22) z_{t-1} + ε_{2t},   (7)

where x′_t = (y_t, z′_t), y_t is a scalar endogenous variable, z_t is a (k − 1) × 1 vector of potentially weakly exogenous variables, π is partitioned conformably to x_t as {π_(ij)}, and ε′_t = (ε_{1t}, ε′_{2t}).
From (5), equations (6) and (7) may be written as:

Δy_t = α_1 β′x_{t-1} + ε_{1t},   (8)
Δz_t = α_2 β′x_{t-1} + ε_{2t},   (9)

where α′ = (α_1, α′_2). Equations (8) and (9) may always be factorized into the conditional distribution of y_t given z_t and lags on both variables, and the marginal distribution of z_t (also given lags on both variables):

Δy_t = γ′_0 Δz_t + γ_1 β′x_{t-1} + v_{1t},   (10)
Δz_t = α_2 β′x_{t-1} + ε_{2t},   (11)

where γ′_0 = Ω_12 Ω_22^{-1}, γ_1 = α_1 − Ω_12 Ω_22^{-1} α_2, v_{1t} = ε_{1t} − Ω_12 Ω_22^{-1} ε_{2t}, the expectation E(v_{1t} ε_{2t}) is zero (by construction), and the error covariance matrix Ω in (1) is {Ω_ij}. Equivalently, the error ε_{1t} in (8) may be partitioned into two uncorrelated components as ε_{1t} = v_{1t} + γ′_0 ε_{2t}, and then ε_{2t} is substituted out to obtain (10).

The variable z_t is weakly exogenous for β if and only if α_2 = 0 in (11), in which case (10) and (11) become:

Δy_t = γ′_0 Δz_t + γ_1 β′x_{t-1} + v_{1t},   (12)
Δz_t = ε_{2t},   (13)

where γ_1 = α_1. The test of z_t being weakly exogenous for β is thus a test of α_2 = 0; see Johansen (1992a).

If α_2 = 0, the conditional ECM (12) by itself is sufficient for inference about β that is without loss of information relative to inference from (10) and (11) together. Two distinct approaches have evolved for testing cointegration in the conditional ECM (12): one is due to Harbo et al. (1998), and the other originates from the literature on ECMs. The current paper analyzes the second approach, and clarifying the distinction between the two approaches is central to understanding their respective properties.
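As a stylized illustration of what weak exogeneity requires, the sketch below (Python/NumPy) checks whether a lagged equilibrium-correction term enters the marginal equation for a scalar z_t, treating a candidate cointegrating vector beta_hat as given. It is only an informal single-equation check under assumed inputs, not the Johansen (1992a) likelihood-ratio test of α_2 = 0.

```python
import numpy as np

def marginal_ec_tratio(y, z, beta_hat):
    """Regress dz_t on a constant and the lagged 'equilibrium error'
    beta_hat' x_{t-1}, where x_t = (y_t, z_t), and return the t-ratio on
    that error term.  A t-ratio near zero is consistent with alpha_2 = 0
    in (11), i.e. with weak exogeneity of z_t for beta.  Illustrative only."""
    x = np.column_stack([y, z])                  # x_t = (y_t, z_t)
    ec = x[:-1] @ np.asarray(beta_hat)           # beta' x_{t-1}
    dz = np.diff(z)
    X = np.column_stack([np.ones_like(ec), ec])
    b, *_ = np.linalg.lstsq(X, dz, rcond=None)
    u = dz - X @ b
    s2 = u @ u / (len(dz) - X.shape[1])
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    return b[1] / se[1]
```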


Harbo et al. (1998) derive the likelihood ratio statistic for testing cointegrating rank in a conditional subsystem obtained from a Gaussian VAR when the marginal variables are weakly exogenous for β. For a single-equation conditional model such as (12), the null hypothesis being tested is γ_1 β′ = 0, i.e. that the cointegrating rank for x is zero. The alternative hypothesis is that γ_1 β′ ≠ 0, implying that x has a cointegrating vector β with at least one non-zero element.

The second approach stems from the literature on error correction models and is based on transformations of (12), with an auxiliary assumption about the nature of x's cointegration. Specifically, the conditional ECM (12) can be motivated as a reparameterization of the conditional autoregressive distributed lag (ADL) model; see Davidson et al. (1978) and Hendry et al. (1984) inter alia. Data transformations imply reparameterizations, and two transformations are of particular interest:

differencing:  μ_1 x_t + μ_2 x_{t-1}  →  μ_1 Δx_t + (μ_1 + μ_2) x_{t-1}
differentials: μ_1 y_t + μ_2 z_t  →  μ_1 (y_t − z_t) + (μ_1 + μ_2) z_t,

for arbitrary coefficients μ_1 and μ_2. Repeatedly applying these two transformations re-arranges a conditional ADL into the conditional ECM (12):

y_t = λ′_0 z_t + λ′_1 z_{t-1} + λ_2 y_{t-1} + v_{1t}   (14)
y_t = λ′_0 Δz_t + λ′_3 z_{t-1} + λ_2 y_{t-1} + v_{1t}   (15)
Δy_t = γ′_0 Δz_t + λ′_3 z_{t-1} + γ_1 y_{t-1} + v_{1t}   (16)
Δy_t = γ′_0 Δz_t + γ_1 (y_{t-1} − δ′z_{t-1}) + v_{1t}   (17)
Δy_t = γ′_0 Δz_t + γ_1 β′x_{t-1} + v_{1t},   (18)

where λ_0, λ_1, λ_2, λ_3, and δ are various coefficients; and the cointegrating vector β has been normalized on its first coefficient (i.e. for y) such that β′ = (1, −δ′). In practice, significance testing of the error correction term typically has been based on the t-ratio for γ_1 in (16), not (17) or (18). This is the 'PcGive unit root test' in Hendry (1989, p. 149) and Hendry and Doornik (2001, p. 256), which here is denoted the ECM statistic.
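A minimal OLS sketch of the statistic just defined, the t-ratio on y_{t-1} in (16), follows (Python/NumPy). It assumes a first-order dynamic specification with a constant term and no additional lags of Δy or Δz; it is a sketch of the calculation only, not PcGive's implementation.

```python
import numpy as np

def ecm_statistic(y, Z):
    """t-ratio on y_{t-1} in the conditional ECM (16):
    regress dy_t on a constant, dz_t, z_{t-1} and y_{t-1} by OLS and
    return the t-ratio on the y_{t-1} coefficient (gamma_1)."""
    dy = np.diff(y)
    dZ = np.diff(Z, axis=0)
    X = np.column_stack([np.ones(len(dy)), dZ, Z[:-1], y[:-1]])
    b, *_ = np.linalg.lstsq(X, dy, rcond=None)
    u = dy - X @ b
    s2 = u @ u / (len(dy) - X.shape[1])
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    return b[-1] / se[-1]          # the ECM statistic
```

The resulting statistic would then be compared with the critical values or p-values discussed in Sections 4 and 5.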

When interpreted as a test for cointegration of x, this approach requires an additional assumption: namely, that the variables in z are not cointegrated among themselves. Thus, γ_1 = 0 in (16) implies (and is implied by) a lack of cointegration between y and z, whereas γ_1 < 0 implies cointegration. The t-ratio based upon the least squares estimator of γ_1 in (16) is the ECM statistic analyzed in Sections 3-5. That t-ratio is denoted κ_d(k), where d indicates the deterministic components included in the ECM, or the number of such deterministic components, depending upon the context; and k is the total number of variables in x (not to be confused with the number of regressors in the ECM). This t-ratio is used to test the null hypothesis that γ_1 = 0, i.e. that y and z are not cointegrated. If weak exogeneity does not hold, critical values generally are affected; see Hendry (1995).

Campos et al. (1996) and Banerjee et al. (1998) derive the asymptotic distribution of κ_d(k) under the null hypothesis of no cointegration:

κ_d(k) ⇒ (∫ B̄_v²)^{-1/2} ∫ B̄_v dB_v,   (19)

where B_v and B_ε are the standardized Wiener processes corresponding to v_{1t} and ε_{2t}, B̄_v is B_v − (∫ B_ε B_v)′ (∫ B_ε B′_ε)^{-1} B_ε, '⇒' denotes weak convergence of the associated probability


measures as T → ∞, strong exogeneity of z with respect to α and β is assumed, and the ECM has no deterministic terms. If the ECM includes deterministic terms, the asymptotic distribution of κ_d(k) is of the same form as in (19), but with the Wiener processes replaced by the corresponding Brownian bridges. Johansen (1995, Ch. 11.2) develops analogous algebra for the Johansen maximum likelihood statistic when the VAR has deterministic terms.

Kiviet and Phillips (1992) and Banerjee et al. (1998) discuss similarity for κ_d(k). Notably, the asymptotic distribution in (19) depends on k and d, but not on the short-run coefficients in the ECM. That is, κ_d(k) is asymptotically similar with respect to γ_0, and also with respect to coefficients on any lags of Δx in the ECM, provided that those parameters lie within the space satisfying the I(1) conditions for x. The statistic κ_d(k) is exactly similar with respect to the constant term if the estimated ECM includes a constant term and a linear trend, and with respect to the constant term and the linear trend's coefficient if the estimated ECM includes a constant term, a linear trend, and a quadratic trend. Following Johansen (1995, p. 84), seasonal dummies with a constant term may affect the finite sample (but not asymptotic) distribution. Likewise, the choice of a fixed lag length ℓ affects the finite sample (but not asymptotic) distribution, provided ℓ is large enough to avoid mis-specification; see Banerjee et al. (1998, Section 5).

To summarize, the ECM statistic κ_d(k) is designed to detect cointegration involving y in the conditional model (12). The procedure in Harbo et al. (1998) is designed to detect any cointegration in x in the conditional model (12), where that cointegration may include y or it may be restricted to z alone. While both statistics derive from conditional models, the two statistics are testing different hypotheses. They have different distributions, even asymptotically, and so require separate tabulation.

Harbo et al. (1998, Tables 2-4) present asymptotic critical values for their statistic for (typically) k = 2, ..., 7 with several choices of deterministic terms, allowing for conditional subsystems (i.e. with more than one endogenous variable) as well as conditional single equations. Pesaran et al. (2000, Tables 6(a)-6(e)) estimate the 5% and 10% critical values for up through five weakly exogenous variables and 12 endogenous variables. Using response surfaces, MacKinnon et al. (1999, Tables 2-6) extend and more precisely estimate the 5% critical values in Harbo et al. (1998) and Pesaran et al. (2000) for up through eight weakly exogenous variables and 12 endogenous variables. They also make available a program that calculates asymptotic critical values at any level and p-values. Doornik (1998, Section 9) approximates the distribution of Harbo et al.'s maximum likelihood trace statistic by a Gamma function. Boswijk and Franses (1992) and Boswijk (1994) analyze a Wald statistic for testing γ_1 β′ = 0. Boswijk (1994) also tabulates asymptotic critical values for this Wald statistic in the single-equation case, and they are numerically very similar to those in Harbo et al. (1998) for the comparable likelihood ratio statistic.

Critical values for the ECM statistic κ_d(k) appear in Banerjee et al. (1993, Table 7.6) for k = 2 with a constant term, and in Banerjee et al. (1998, Table I) for k = 2, ..., 6 with a constant term and with a constant term and a linear trend. In both studies, the maximum number of variables is too small for many empirical purposes, the estimates of the critical values are relatively imprecise, and finite sample adjustments are impractical from the reported critical values. The results in Section 4 address these limitations. In the next subsection, the derivation in (14)-(18) clarifies the relationship between the ECM and EG procedures.

2.3. The Engle-Granger procedure

Engle and Granger (1987) propose testing for cointegration by testing whether the residuals of a
static regression are stationary. The usual unit root test used is that of Dickey and Fuller (1981),


which is based on a finite-order autoregression. Engle and Granger's procedure imposes a common factor restriction on the dynamics of the relationship between the variables involved. If that restriction is invalid, a loss of power relative to the ECM and Johansen procedures may well result. This subsection highlights the role of the common factor restriction by expressing the model for Engle and Granger's procedure as a restricted ECM.

Reconsider the conditional ECM derived from a first-order VAR:

Δy_t = γ′_0 Δz_t + γ_1 (y − δ′z)_{t-1} + v_{1t},   (20)

where y_t − δ′z_t is the putative disequilibrium. Engle and Granger's cointegration test statistic can be formulated from (20), thus establishing the relationship between it and the ECM statistic. Specifically, subtract δ′Δz_t from both sides of (20) and re-arrange:

Δ(y − δ′z)_t = γ_1 (y − δ′z)_{t-1} + {(γ_0 − δ)′ Δz_t + v_{1t}}.   (21)

Defining the Engle-Granger residual y_t − δ′z_t as w_t, (21) may be rewritten as:

Δw_t = γ_1 w_{t-1} + e_t,   (22)

where, by construction, the disturbance e_t is (γ_0 − δ)′Δz_t + v_{1t}. The t-ratio on the least squares estimator of γ_1 in (22) is the EG cointegration test statistic. It is the Dickey-Fuller statistic for testing whether w has a unit root and hence whether y and z lack (or obtain) cointegration with cointegrating vector (1, −δ′). Below, that t-ratio is denoted τ_d(k), paralleling Dickey and Fuller's notation.
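For comparison, a minimal sketch of the EG two-step calculation implied by (22) follows, assuming a constant term in the static first-stage regression and no augmentation lags in the Dickey-Fuller regression.

```python
import numpy as np

def eg_statistic(y, Z):
    """Engle-Granger procedure: (i) estimate delta from the static levels
    regression of y_t on a constant and z_t; (ii) run the Dickey-Fuller
    regression (22) of dw_t on w_{t-1}, where w_t is the first-stage
    residual, and return the t-ratio on w_{t-1}, i.e. tau_d(k)."""
    X1 = np.column_stack([np.ones(len(y)), Z])
    delta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    w = y - X1 @ delta                 # EG residual, w_t = y_t - delta' z_t
    dw, w1 = np.diff(w), w[:-1]
    g = (w1 @ dw) / (w1 @ w1)          # OLS slope in (22), no intercept
    u = dw - g * w1
    s2 = u @ u / (len(dw) - 1)
    return g / np.sqrt(s2 / (w1 @ w1))
```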

From (21), τ_d(k) imposes γ_0 = δ, equating the short-run and long-run elasticities (the common factor restriction). Empirically, estimated short- and long-run elasticities often differ markedly, so imposing their equality is arbitrary and hazardous. Weak exogeneity is assumed in the presentation above but is not required for the EG procedure. See Kremers et al. (1992) for a general derivation of the common factor restriction in the EG procedure.

If the cointegrating coefficient δ is known, then the t-ratio on γ_1 in (22) has a Dickey-Fuller distribution (equivalent to assuming k = 1), as originally tabulated by Dickey in Fuller (1976, Table 8.5.2). If δ is estimated by least squares prior to testing that γ_1 = 0, then other critical values are required. Engle and Granger (1987, Table II) give such critical values for the bivariate model (k = 2) with a constant term. The response surfaces in MacKinnon (1991, Table 1) allow construction of critical values with finite sample adjustments for k = 1, ..., 6 with a constant term and with a constant term and a linear trend. MacKinnon (1996) provides a computer program to calculate numerically highly accurate critical values at any desired level for k = 1, ..., 12 with deterministic terms up to and including a quadratic trend.

2.4. Comparison

The Johansen, ECM, and EG procedures all focus on whether or not the feedback parameters for the cointegrating vector(s) are non-zero: α for the Johansen procedure, α_1 for the ECM procedure, and γ_1 (which is α_1 under weak exogeneity) for the EG procedure. The procedures differ in their assumptions about the data generation process (DGP), and those assumptions imply both advantages and disadvantages for empirical implementation. For all three procedures, numerical computations are easy and fast for both estimation and testing.


Table 1. A comparison of the Johansen, ECM, and Engle-Granger procedures for testing cointegration.

Aspect: Johansen | ECM (both types) | Engle-Granger

Statistic: Maximal eigenvalue and trace statistics. | κ_d(k); Harbo et al. (1998) statistic. | τ_d(k).

Assumptions: Well-specified full system. | Weak exogeneity of z_t for β. | Common factor restriction.

Advantages: Maximum likelihood of full system. Determines r (the number of cointegrating vectors), β, and α. | Starting point for ECM modeling; unrestrictive dynamics. Weak exogeneity is often valid empirically. Robust to particulars of the marginal process. | Intuitive. Super-consistent estimator of β.

Disadvantages: Full system should be well-specified. | Weak exogeneity is assumed. r ≤ 1 is imposed (usually). Inferences on β are messy. | Comfac is often invalid. Biases in estimating β. r ≤ 1 imposed (usually). Normalization affects estimation. Dynamics may be of interest.

Sources for critical values and p-values: Johansen (1988, 1995), Johansen and Juselius (1990), Osterwald-Lenum (1992), Doornik (1998), MacKinnon et al. (1999). | Banerjee et al. (1993), Banerjee et al. (1998), this paper; Harbo et al. (1998), MacKinnon et al. (1999), Pesaran et al. (2000). | Engle and Granger (1987), MacKinnon (1991, 1994, 1996).

Table 1 compares the assumptions of these procedures and their implied advantages and disadvantages. For the procedure using the conditional ECM, the advantages are severalfold. The conditional ECM (or, equivalently, the unrestricted ADL) is a common starting point for modeling general to specific in a single-equation context. Also, weak exogeneity is often valid empirically. And, the ECM procedure is robust to many particulars of the marginal process, e.g. specific lag lengths and dynamics involved. While the ECM procedure assumes weak exogeneity and often assumes at most a single cointegrating vector, the procedure's appeal has made it common in the literature, hence the need for a clear understanding of the procedure's distributional properties.3 The next two sections describe the structure of the Monte Carlo analysis used for calculating such properties (Section 3) and the results obtained (Section 4).

3 Testing for weak exogeneity in a VAR and then for cointegration in a conditional ECM need not suffer from classical pre-test problems, as the corresponding hypotheses are nested. See Hoover and Perez (1999).

3. THE STRUCTURE OF THE MONTE CARLO ANALYSIS

This paper's objective is to provide information on finite sample inference about cointegration in conditional error correction models. Section 2 motivated the interest in the ECM statistic by clarifying its relationships to the Johansen and EG procedures. The remaining sections examine the distributional properties of the ECM statistic.

Because no analytical solution is known for even the asymptotic distribution of the ECM test statistic, distributional properties are estimated by Monte Carlo simulation. This section outlines the structure of that Monte Carlo simulation. Section 3.1 describes the focus of this paper's simulation, the DGP, and the model estimated. Sections 3.2 and 3.3 sketch the design and simulation of the Monte Carlo experiments, and Section 3.4 discusses post-simulation analysis.

3.1. The focus, the data generation process, and the model

The general object of interest is the distribution of the ECM test statistic κ_d(k) under the null of no cointegration. Asymptotic properties are derived in Kiviet and Phillips (1992), Campos et al. (1996), and Banerjee et al. (1998), with certain invariance results appearing in Kiviet and Phillips (1992). Finite sample properties appear in Banerjee et al. (1993), Campos et al. (1996), and Banerjee et al. (1998), but all are very limited in their experimental design.4 In the current paper, two aspects are of primary concern: the distribution of κ_d(k), and critical values at common levels of significance.
To examine the properties of the ECM statistic under the null hypothesis of no cointegration, the DGP is a standardized multivariate random walk for x:

Δx_t ~ IN(0, I_k),   (23)

a common DGP for simulating the null distribution of cointegration test statistics. The estimated model is the conditional ECM resulting from a possibly cointegrated, ℓth order, k-variable VAR, assuming weak exogeneity of z_t for β and with y_t scalar. That is, the estimated model is:

Δy_t = γ′_0 Δz_t + b′x_{t-1} + Σ_{i=1}^{ℓ-1} Γ_{1i} Δx_{t-i} + φ′_1 D_t + v_{1t},   v_{1t} ~ IN(0, σ_11),   (24)

where b, Γ_{1i}, and φ_1 are coefficients in the conditional ECM; and σ_11 is the conditional ECM's error variance. Because b′ = (b_1, b_2, ..., b_k) = γ_1 β′ in the notation of the ECM (18), then b_1 is γ_1, which is the coefficient of interest in the ECM statistic κ_d(k). The deterministic component D_t may include a constant term, a constant term and a linear trend, or a constant term, a linear trend, and a quadratic trend. The corresponding ECM statistics are denoted κ_c(k), κ_ct(k), and κ_ctt(k), respectively. If no variables are included in D_t, then the ECM statistic is denoted κ_nc(k) (nc for no constant term).

4 The current paper, like much of the literature, focuses on cointegration tests when the cointegrating vectors are unknown a priori. This is a reasonable approach in many situations. Economic theory may not be fully informative about the cointegrating vector, or the researcher may wish to test the implied economic restrictions. Moreover, different economic theories may imply different cointegrating vectors, as with the quantity theory and the Baumol-Tobin framework. Notably, economic theory does not fully specify the cointegrating vectors for the empirical applications in Section 5. Kremers et al. (1992), Hansen (1995), Campos et al. (1996), and Zivot (2000) consider distributional properties for the ECM statistic when the cointegrating coefficients are known. In that case, the statistic's distribution contains nuisance parameters, even asymptotically, although those parameters can be estimated consistently. Hansen (1995) provides asymptotic critical values for such a procedure; response surfaces for finite sample properties could be developed along the lines of our paper. As Zivot (2000) shows, considerable power gains can be achieved by correctly prespecifying the cointegrating vector. Conversely, the test can be inconsistent if the cointegrating vector is incorrectly prespecified, as that prespecification induces an I(1) component in the error term. Horvath and Watson (1995) and Elliott (1995) analyze properties of cointegration tests from a VAR when the cointegrating vectors are prespecified.

3.2. Specifics of the experimental design

The analysis focuses on the finite sample properties of the ECM statistic. Three 'design parameters' are central to the statistic's distributional properties: the estimation sample size (T), the total number of variables in x_t (k), and the number of deterministic components in D_t (d).
To provide results for a wide range of situations common in empirical investigations, the simulations span a full factorial design of the following T, k, and D_t:

T = (20, 25, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, 100, 125, 150, 200, 400, 500, 600, 700, 1000)
k = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
D_t = (none; constant term; constant term, t; constant term, t, t²).   (25)

The range of the sample size aims to provide information on both the test statistic's asymptotic properties and its finite sample deviations therefrom. The design includes all positive integer values of k up through 12, sufficient for virtually all empirical applications. The choice of D_t implies four test statistics: κ_nc(k), κ_c(k), κ_ct(k), and κ_ctt(k). Deterministic terms may be included in the model because they are required for adequate model specification, i.e. because the deterministic terms enter the DGP. Also, a deterministic term of one order higher than 'required' may be included in the model in order to obtain similarity to the coefficients of the lower-order deterministic terms; see Kiviet and Phillips (1992), Johansen (1994), and Nielsen and Rahbek (2000). Throughout the simulations, the model's lag length is set to unity (ℓ = 1). However, the lag notation in (24) is useful, as ℓ > 1 for the empirical models in Section 5.

One minor modification exists for the experimental design in (25). Because 2k − 1 + d degrees of freedom are used in the estimation of (24), some smaller values of T are not considered for larger values of k that imply 2k − 1 + d close to or exceeding T. Specifically, T = 20 is dropped for k = 8; T = (20, 25) are dropped for k = (9, 10); and T = (20, 25, 30) are dropped for k = (11, 12).
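A single replication of the design just described can be sketched as follows (Python/NumPy): draw a standardized k-variable random walk as in (23), estimate the first-order conditional ECM (24) with the chosen deterministic terms, and record the t-ratio on the coefficient of y_{t-1}. This sketch is meant only to convey the structure of one Monte Carlo draw; it is not the Fortran code used for the reported simulations.

```python
import numpy as np

rng = np.random.default_rng(20020101)   # seed chosen arbitrarily for this sketch

def one_replication(T, k, constant=True):
    """One draw of the ECM statistic under the null DGP (23): x_t is a
    k-variable standardized random walk, and the estimated model is the
    first-order conditional ECM (24).  Returns the t-ratio on b_1, the
    coefficient of y_{t-1}."""
    x = np.cumsum(rng.standard_normal((T + 1, k)), axis=0)   # eq. (23)
    y, Z = x[:, 0], x[:, 1:]
    dy, dZ = np.diff(y), np.diff(Z, axis=0)
    cols = [dZ, y[:-1], Z[:-1]]              # gamma_0' dz_t and b' x_{t-1}
    if constant:
        cols.append(np.ones(T))              # D_t: constant term only
    X = np.column_stack(cols)
    b, *_ = np.linalg.lstsq(X, dy, rcond=None)
    u = dy - X @ b
    s2 = u @ u / (T - X.shape[1])
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    j = dZ.shape[1]                          # index of the y_{t-1} coefficient
    return b[j] / se[j]

draws = [one_replication(T=100, k=4) for _ in range(10_000)]   # small-scale example
```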

3.3. Monte Carlo simulation

This paper aims to provide numerically accurate estimates of the ECM statistic's distribution, particularly in its tails, where inference is commonly of concern. Thus, a large number of replications are simulated for each experiment in (25): specifically, 10 million replications for each pair of T and k. Such large numbers of replications do not pose difficulties for calculations of sample moments, but they are problematic for calculating quantiles (and hence densities) because the full set of replications must be stored and sorted. As a reasonably efficient second-best alternative, the adopted design divides each experiment into 50 sets of 200 000 replications apiece, determines the quantiles for each set, and then averages the estimated quantile values across the sets. Partitioning each experiment into several sets also provides an easy way to measure experimental randomness. To estimate accurately the complete densities of the ECM statistic, a large number of quantiles are calculated: 221 in total, corresponding to p = 0.0001, 0.0002, 0.0005, 0.001, 0.002, 0.003, ..., 0.008, 0.009, 0.010, 0.015, 0.020, 0.025, ..., 0.495, 0.500, 0.505, ..., 0.975, 0.980, 0.985, 0.990, 0.991, 0.992, ..., 0.997, 0.998, 0.999, 0.9995, 0.9998, 0.9999, where p denotes the quantile's percent level.
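The set-averaging device described above is simple to express in code. The sketch below (Python/NumPy) assumes that draws holds all simulated values of the ECM statistic for one (T, k, d) experiment and that the quantile levels of interest are supplied as p_levels.

```python
import numpy as np

def averaged_quantiles(draws, p_levels, n_sets=50):
    """Split the replications into n_sets equal sets, estimate each quantile
    within each set, and average the estimates across sets.  The standard
    deviation across sets (divided by sqrt(n_sets)) gives a simple measure
    of the experimental randomness of the averaged quantile."""
    sets = np.array_split(np.asarray(draws), n_sets)
    q = np.vstack([np.quantile(s, p_levels) for s in sets])
    return q.mean(axis=0), q.std(axis=0, ddof=1) / np.sqrt(n_sets)
```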


Because so many random numbers were generated, it was vital to use a pseudo-random number generator with a very long period. The generator used was that in MacKinnon (1994, 1996), which combines two different pseudo-random number generators recommended by L'Ecuyer (1988). The two generators were started with different seeds and allowed to run independently, so that two independent uniform pseudo-random numbers were generated at once. Each pair was then transformed into two N(0, 1) variates using the modified polar method of Marsaglia and Bray (1964, p. 260). See MacKinnon (1994, p. 170) for details.
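For concreteness, a minimal sketch of the Marsaglia-Bray polar method is given below. It shows only the uniform-to-normal transformation; the combined long-period uniform generator described above is replaced here by Python's own random.random, which the paper does not use.

```python
import math
import random

def polar_normal_pair(uniform=random.random):
    """Marsaglia-Bray polar method: draw points uniformly on the square
    [-1, 1] x [-1, 1], reject those outside the unit circle, and transform
    an accepted pair into two independent N(0, 1) variates."""
    while True:
        v1 = 2.0 * uniform() - 1.0
        v2 = 2.0 * uniform() - 1.0
        s = v1 * v1 + v2 * v2
        if 0.0 < s < 1.0:
            f = math.sqrt(-2.0 * math.log(s) / s)
            return v1 * f, v2 * f
```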

3.4. Post-simulation analysis

These Monte Carlo simulations generate a vast quantity of information: 221 estimated quantiles on 50 sets of replications for (typically) 21 sample sizes with 12 different values of k and four choices of D_t: over 10 million numbers. Graphs and regressions provide two succinct ways of conveying and summarizing such information. This paper uses both means: graphs of asymptotic and finite sample densities, and response surfaces for finite sample critical values. An explanation is helpful for interpreting both the response surfaces and the graphs.

Typically, authors have tabulated estimated critical values for several sample sizes or for one large ('close to asymptotic') sample size. Such tabulations recognize the dependence of the critical values on the estimation sample size. That dependence can be approximated by regression, regressing the Monte Carlo estimates of the critical value on functions of the sample size. Such regressions are response surfaces: see Hammersley and Handscomb (1964) and Hendry (1984) for general discussions.

Here, for each triplet defined by the quantile's percent level p, the number of variables k, and the choice of deterministic components D_t, a response surface was estimated:

q(T_i) = θ_∞ + θ_1 (T_i^a)^{-1} + θ_2 (T_i^a)^{-2} + θ_3 (T_i^a)^{-3} + u_i.   (26)

The dependent variable q(T_i) is the estimated finite sample pth quantile from the Monte Carlo simulation with the ith sample size T_i, which takes the values for T in the experimental design (25). The regressors are an intercept and three inverse powers of the adjusted sample size T_i^a (which equals T_i − (2k − 1) − d); θ_∞, θ_1, θ_2, and θ_3 are the corresponding coefficients; and u_i is an error that reflects both simulation uncertainty and the approximation of the quantile's true functional form by the cubic in (26).
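Once the coefficients of (26) are in hand, computing a finite sample critical value is a one-line evaluation at the relevant adjusted sample size. The sketch below shows that evaluation; the coefficient values passed in the example are placeholders for illustration, not entries from the paper's tables.

```python
def response_surface_quantile(T, k, d, theta):
    """Evaluate the response surface (26):
        q(T) = theta_inf + theta_1/Ta + theta_2/Ta**2 + theta_3/Ta**3,
    where Ta = T - (2k - 1) - d is the adjusted sample size and
    theta = (theta_inf, theta_1, theta_2, theta_3) comes from the
    estimated response surface for the chosen (p, k, d)."""
    Ta = T - (2 * k - 1) - d
    theta_inf, t1, t2, t3 = theta
    return theta_inf + t1 / Ta + t2 / Ta ** 2 + t3 / Ta ** 3

# Placeholder coefficients, for illustration only:
print(response_surface_quantile(T=100, k=4, d=1, theta=(-3.8, -6.0, -20.0, 0.0)))
```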

The benefits of these response surfaces are several. First, they reduce consumption costs to the user by summarizing numerous Monte Carlo experiments in a simple regression. Second, the coefficient θ_∞ is interpretable as the asymptotic (T = ∞) pth quantile for the choice of k and D_t concerned. Estimation of that asymptotic quantile does not necessarily require very large sample sizes in the experimental design. Third, response surfaces reduce the Monte Carlo uncertainty by averaging (through regression) across different experiments. Fourth, response surfaces reduce the specificity of the simulations by allowing easy calculation of quantiles for sample sizes not included in the experimental design (25). Fifth, p-values and critical values at any level can be calculated from the response surfaces, as by the computer program accompanying MacKinnon (1996) for the EG statistic τ_d(k) and by the one accompanying this paper for the ECM statistic κ_d(k). Finally, response surfaces for commonly used quantiles (e.g. p = 5%) are easily programmed into econometrics computer packages so as to provide empirical modelers with estimated finite sample critical values directly. For instance, PcGive and EViews have incorporated the response surfaces in MacKinnon (1991) for the Dickey-Fuller critical values,


and (more recently) PcGive has added the response surfaces in Tables 2-5 below for κ_d(k). Because cointegration implies γ_1 < 0, only the lower tail of the distribution of κ_d(k) is of interest for testing cointegration.

From (27), a crude approximation θ_crude to the lower 5% critical value for κ_d(k) is:

θ_crude = −3.0 − 0.2k − 0.3(d − 1).   (28)

The negative coefficients in (28) can be easily remembered as a '3/2/3' rule of thumb: an intercept of −3.0, a coefficient of −0.2 on the number of variables in x, and a coefficient of −0.3


on the number of deterministic terms over and above a constant term. For the ECM evaluated earlier (k = 4, d = 1, T = 47), θ_crude is −3.8, deviating by only 0.04 from the value of −3.84 calculated with Table 3. While deviations between θ_crude and q(T_i) may be larger or smaller than this for other k, d, and T, it is well worth keeping in mind that, with typical macroeconomic data, the ECM statistic itself can easily fluctuate by a few tenths, simply by adding or dropping a few observations from the sample.

Figure 10. Estimated asymptotic 1%, 5%, and 10% quantiles θ_∞ for the ECM statistic as a function of k and d.

Figure 10 highlights this near-linear dependence of the asymptotic quantile θ_∞ on k and d. Each 3D graph in Figure 10 plots θ_∞ against k and d, given the quantile's percent level p. The surfaces are virtually planar except for the Dickey-Fuller statistic (k = 1), which is excluded from (27) and (28).
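The '3/2/3' rule of thumb in (28) is easily written down as a function; the check below reproduces the worked example in the text (k = 4, d = 1 gives −3.8). The rule is only a rough guide to the lower 5% critical value and, as noted above, excludes the Dickey-Fuller case k = 1.

```python
def crude_5pct_critical_value(k, d):
    """Crude approximation (28) to the lower 5% critical value of the ECM
    statistic: an intercept of -3.0, -0.2 per variable in x, and -0.3 per
    deterministic term over and above the constant term."""
    return -3.0 - 0.2 * k - 0.3 * (d - 1)

print(round(crude_5pct_critical_value(k=4, d=1), 1))   # -3.8, as in the example above
```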

The asymptotic moments of the ECM statistic also show marked regularity in the distribution's behavior. Figure 11 plots its asymptotic mean, standard deviation, skewness, and excess kurtosis as a function of k and d.6 The asymptotic mean declines by approximately 0.2 and 0.4 respectively for unit increases in k and d, close to the estimated shifts for the critical values in (27). While the asymptotic standard deviation, skewness, and excess kurtosis also depend on k and d, those dependencies are numerically much smaller than that of the asymptotic mean. For all values of k and d examined, the asymptotic standard deviation is close to unity, and the asymptotic skewness and excess kurtosis are close to zero. These results reconfirm the visual characterization from Figures 1-8: the distribution of the ECM statistic κ_d(k) is relatively close to normality with unit variance. In light of these observations, parametric distributional approximations to the distribution of the ECM statistic may be promising, perhaps using the normal distribution, Student's t-distribution, or an expansion thereon.

Equations (27) and (28) quantify the straightforward dependencies of the ECM statistic’s
quantiles on k and d, they provide a mechanism for extrapolating critical values for values of k
and d outside the experimental design (25), and they offer a rough-and-ready way of assessing
empirical results when Tables 2-5 are not available. Preferably, though, Tables 2-5 or the related
computer program should be used.

6 The asymptotic moments were calculated by response surfaces from a separate set of Monte Carlo experiments, following an approach like that used for the quantiles. Monte Carlo estimation of the statistic's finite sample moments does assume the existence of those moments. However, even if those moments are infinite, their Monte Carlo estimates may be close to the (finite) moments of a Nagar approximation to the statistic; see Sargan (1982).


Figure 11. The asymptotic mean, standard deviation, skewness, and excess kurtosis for the ECM statistic as a function of k and d.

4.3. Encompassing previous Monte Carlo results

Two previous studies, Banerjee et al. (1993) and Banerjee et al. (1998), report estimated critical values for the ECM statistic. This subsection shows that these previous results for the 1%, 5%, and 10% levels are superseded by the response surfaces reported in Tables 2-5. Simulation uncertainty in these two studies appears to be the dominant factor explaining discrepancies relative to the response surfaces in Tables 2-5. In this encompassing approach, many pages of existing independent Monte Carlo simulations are subsumed by the current paper's results. That is both progressive research-wise and efficient space-wise.

Pre-existing Monte Carlo studies are encompassed by evaluating the response surfaces in Tables 2-5 over the experimental designs of the past studies and comparing the critical values derived from Tables 2-5 with those reported in the studies' simulations. Deviations between the two types of critical values typically are small relative to the estimated simulation uncertainty of the pre-existing Monte Carlo studies or are simply small numerically. Hence, Tables 2-5 encompass those studies. For this purpose, the simulation uncertainty associated with the response surfaces in Tables 2-5 is treated as negligible. That assumption seems reasonable. The largest value of σ in Tables 2-5 is under 0.02, and each (T, d, k, p) quadruplet includes 50 estimates of the quantile, implying an associated standard error of the response surface quantile of under 0.003. Frequently, that standard error is under 0.001. The remainder of this subsection briefly describes the Monte Carlo simulations in each study and the outcomes of the encompassing exercise.


Banerjee et al. (1993, Table 7.6, p. 233) report estimated critical values at the 1%, 5%, and 10% levels for κ_c(2) at T = (25, 50, 100), using 5000 replications per experiment. Deviations relative to the response surfaces from Table 3 are all under 0.1 in absolute value. Using the values of σ in Table 3 as a benchmark and rescaling by the square root of the ratio of simulations calculated, the estimated standard errors for the three quantiles in Banerjee et al. (1993) are approximately 0.063, 0.032, and 0.025. The observed discrepancies between the estimated quantiles in Banerjee et al. (1993) and those calculated from Table 3 appear as expected, given the simulation uncertainty of the former.
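The rescaling described above is a square-root-of-replications calculation. The sketch below reproduces the quoted magnitudes under hypothetical per-set sigma values of 0.010, 0.005 and 0.004 (the actual values of σ come from Table 3, which is not reproduced here); each response-surface set uses 200 000 replications, against 5000 replications in Banerjee et al. (1993).

```python
import math

def rescaled_standard_error(sigma_per_set, n_per_set=200_000, n_study=5_000):
    """Scale a per-set quantile standard deviation to a study that uses
    fewer replications, via the square root of the ratio of simulation sizes."""
    return sigma_per_set * math.sqrt(n_per_set / n_study)

# Hypothetical sigma values for the 1%, 5% and 10% quantiles:
print([round(rescaled_standard_error(s), 3) for s in (0.010, 0.005, 0.004)])
# approximately [0.063, 0.032, 0.025]
```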

Banerjee et al. (1998, Table I) report estimated critical values at the 1%, 5%, 10%, and 25% levels for κ_c(k) and κ_ct(k) (k = 2, ..., 6) at T = (25, 50, 100, 500, ∞), using 25 000 replications per experiment. Deviations relative to the response surfaces from Tables 3 and 4 are all under 0.2 in absolute value, and are typically 0.04 or smaller in magnitude. The estimated standard errors for the 1%, 5%, and 10% quantiles in Banerjee et al. (1998) are approximately 0.028, 0.014, and 0.011.

5. TWO EMPIRICAL APPLICATIONS

This section applies the finite sample critical values derived earlier and the computer program for
calculating p-values to two empirical ECMs. Section 5.1 considers a model of UK narrow money
demand from Hendry and Ericsson (1991), and Section 5.2 a model of US federal government
debt from Hamilton and Flavin (1986). (Ericsson and MacKinnon (1999) also assess the model
of UK consumers’ expenditure from Davidson et al. (1978).) The model in Hendry and Ericsson
(1991) has played a significant role in the literature on ECMs and cointegration, and Hamilton
and Flavin (1986) was one of the early papers to employ unit root statistics for testing economic
hypotheses. Each subsection briefly reviews the estimated equation and considers corresponding
conditional ECM tests. Tables summarize the results, reporting the empirical t-values for testing
cointegration, along with critical values and p-values. Use of the critical values from Tables 2-5
for the ECM statistic affects the economic inferences drawn.

Several issues arise in testing for cointegration in these models. First, the ECM for money demand was derived from an unrestricted ADL. Both the ADL and the ECM allow testing of cointegration, although the ECM requires slight modification to apply the critical values from Tables 2-5. Second, dynamic specification affects the degrees of freedom used in estimation. Hence, when computing critical values, the adjusted sample size T^a is calculated as T − h (rather than as T − (2k + d − 1)), where h is the total number of regressors, including deterministic variables. The calculation of p-values utilizes h similarly. Third, the choice of deterministic variables affects the t-values and the corresponding critical values and p-values, so potentially affecting inference. Finally, nonlinearity of the deterministic trend and lack of weak exogeneity are important in the model of government debt. Throughout this section, capital letters denote both the generic name and the level of a variable, logarithms are in lowercase, and OLS standard errors are in parentheses.

5.1. UK narrow money demand

Hendry and Ericsson (1991, equation (6)) model UK narrow money demand as a conditional


ECM, whose final parsimonious form is as follows:

Δ(m − p)_t = −0.687 Δp_t − 0.175 Δ(m − p − i)_{t-1} − 0.630 R^net_t − 0.0929 (m − p − i)_{t-1} + 0.0234   (29)
(standard errors: 0.125, 0.058, 0.060, 0.0085, 0.0040)

T = 100 (1964Q3-1989Q2), R² = 0.76, σ̂ = 1.313%.

The data are nominal narrow money M1 (M, in £ millions), real total final expenditure (TFE) at 1985 prices (I, in £ millions), the TFE deflator (P, 1985 = 1.00), and the net interest rate (R^net, in percent per annum expressed as a fraction). The last series is the difference between the three-month local authority interest rate and a learning-adjusted retail sight-deposit interest rate.

While the t-value on the error correction term (m − p − i)_{t-1} in (29) is very large and negative (−10.87), significance levels are not known, given the presence of nuisance parameters; see Kremers et al. (1992) and Kiviet and Phillips (1992). This difficulty arises because one of the coefficients in the cointegrating vector, the long-run income elasticity, is constrained. One solution is to estimate that coefficient unrestrictedly, as occurs when estimating (29) with i_{t-1} added:

Δ(m − p)_t = −0.702 Δp_t − 0.178 Δ(m − p − i)_{t-1} − 0.611 R^net_t − 0.0882 (m − p − i)_{t-1} + 0.0065 i_{t-1} − 0.049   (30)
(standard errors: 0.128, 0.058, 0.067, 0.0113, 0.0104, 0.117)

T = 100 (1964Q3-1989Q2), R² = 0.76, σ̂ = 1.317%.

The t-value on (m − p − i)_{t-1} in (30) is −7.78, which is significant at the 1% level for κ_c(4), with critical value of −4.45. In fact, the finite sample p-value for −7.78 is 0.0000. Equations (29) and (30) can be derived from an unrestricted fifth-order ADL model in m − p, Δp, i, and R^net. The ECM statistic for that ADL is −5.17, also significant at the 1% level for κ_c(4), with critical value of −4.47. Its finite sample p-value is 0.0014, suggesting a minor loss in power from estimating additional coefficients on dynamics relative to (30).

Both this fifth-order ADL and the ECM in (30) include one deterministic component: a constant term. Table 6 reports the statistic κ_d(4) for the four choices of deterministic components considered in the sections earlier; the value of h; the finite sample, asymptotic, and crude critical values at the 1%, 5%, and 10% levels; finite sample and asymptotic p-values; the estimated equation standard error σ; and an F-statistic for testing the significance of omitted deterministic components. The symbols +, *, and ** denote rejection at the 10%, 5%, and 1% levels, respectively. With a constant term, linear trend, and quadratic trend included, the statistic κ_ctt(4) is insignificant at the 10% level for both the ADL and the ECM: their p-values are 0.3859 and 0.4544. With fewer deterministic components, cointegration is detected at the 0.5% level or smaller in the ADL and the ECM, as the statistics κ_ct(4), κ_c(4), and κ_nc(4) show.

The final column in Table 6 lists the F-statistics for testing the significance of the omitted deterministic components in the corresponding regressions, relative to the regressions for obtaining κ_ctt(4): degrees of freedom for the F-statistics appear in parentheses as F(·, ·), and the statistics' p-values are in brackets [·]. These F-statistics indicate that the constant term, linear trend, and quadratic trend are statistically insignificant, so all the reported ECM statistics in Table 6 make statistically justifiable assumptions about these deterministic components. The statistics κ_nc(4), κ_c(4), and κ_ct(4) reject at standard levels, but κ_ctt(4) does not, pointing to the value of parsimony in deterministic components for obtaining increased power of the cointegration test, when parsimony is merited. The insignificance of a linear trend is particularly interesting. In a system analysis of this dataset, Hendry and Mizon (1993) find a second cointegrating vector, which includes a linear trend; but in their system model, that cointegrating vector does not enter the equation for money.

Table 6. Empirical t-values, critical values, and p-values for the ECM statistic: models of UK narrow money demand.

Statistic | Model or calculation | Empirical t-value | h | Critical value (1%, 5%, 10%) | p-value (finite sample, asymptotic) | σ (%) | F-statistic vs. the model for κ_ctt(4)
κ_ctt(4) | ADL | −3.29 | 26 | −5.21, −4.54, −4.19 | 0.3859, 0.4140 | 1.313 | —
κ_ctt(4) | ECM | −3.14 | 8 | −5.18, −4.52, −4.19 | 0.4544, 0.4819 | 1.326 | —
κ_ctt(4) | Asymptotic | — | — | −5.04, −4.47, −4.17 | — | — | —
κ_ctt(4) | Crude | — | — | −5.0, −4.4, −4.1 | — | — | —
κ_ct(4) | ADL | −5.14** | 25 | −4.87, −4.19, −3.85 | 0.0047, 0.0024 | 1.306 | F(1, 74) = 0.11 [0.74]
κ_ct(4) | ECM | −6.53** | 7 | −4.84, −4.18, −3.85 | 0.0000, 0.0000 | 1.320 | F(1, 92) = 0.16 [0.69]
κ_ct(4) | Asymptotic | — | — | −4.72, −4.14, −3.83 | — | — | —
κ_ct(4) | Crude | — | — | −4.7, −4.1, −3.8 | — | — | —
κ_c(4) | ADL | −5.17** | 24 | −4.47, −3.80, −3.45 | 0.0014, 0.0006 | 1.301 | F(2, 74) = 0.28 [0.76]
κ_c(4) | ECM | −7.78** | 6 | −4.45, −3.79, −3.45 | 0.0000, 0.0000 | 1.317 | F(2, 92) = 0.37 [0.69]
κ_c(4) | Asymptotic | — | — | −4.36, −3.76, −3.44 | — | — | —
κ_c(4) | Crude | — | — | −4.4, −3.8, −3.5 | — | — | —
κ_nc(4) | ADL | −6.10** | 23 | −4.04, −3.35, −3.00 | 0.0000, 0.0000 | 1.297 | F(3, 74) = 0.36 [0.78]
κ_nc(4) | ECM | −10.57** | 5 | −4.02, −3.35, −3.00 | 0.0000, 0.0000 | 1.311 | F(3, 92) = 0.31 [0.82]
κ_nc(4) | Asymptotic | — | — | −3.94, −3.33, −2.99 | — | — | —
κ_nc(4) | Crude | — | — | −4.1, −3.5, −3.2 | — | — | —
Table 6 lists the asymptotic and crude critical values at the 1%, 5%, and 10% levels, and
these differ by at most 0.21 from the calculated finite sample critical values. Likewise, the finite
sample and asymptotic p-values in the table differ by only modest amounts. These numerically
small discrepancies are not surprising because the sample size is relatively large (T = 100).

5.2. US federal government debt

The second model is an ADL from Hamilton and Flavin (1986, p. 816), relating real US federal government debt to a deterministic nonlinear trend or 'bubble' (1 + r)^t and the budget surplus:


B_t = 48.41 − 22.68 (1 + r)^t + 0.69 B_{t-1} + 0.20 B_{t-2} − 1.30 S_t − 0.63 S_{t-1}   (31)
(standard errors: 26.40, 21.29, 0.21, 0.24, 0.13, 0.31)

T = 23 (1962-1984), R² = 0.98, σ̂ = 7.405.

The data are the adjusted debt (B) for the end of the fiscal year and the adjusted surplus (S) for the fiscal year (both in $ millions, 1967 prices). The variable r is set to 0.0112, the average ex post real interest rate on US government bonds over 1960-84. The coefficient on (1 + r)^t is statistically insignificant, consistent with the absence of a speculative bubble. From this and related evidence, Hamilton and Flavin (1986, pp. 816-817) conclude that '... the data appear quite consistent with the assertion that the government has historically operated subject to the constraint that expenditures not exceed receipts in expected present-value terms'.

This interpretation of the evidence assumes a long-run solution to (31) relating debt and surplus. That is equivalent to assuming both cointegration between B and S, and the presence of the corresponding cointegrating vector in (31). Empirically, however, (31) does not support cointegration of B and S. Rewriting (31) as an unrestricted ECM yields the following equation:

ΔB_t = 48.41 − 22.68 (1 + r)^t − 0.104 B_{t-1} − 0.20 ΔB_{t-1} − 1.30 ΔS_t − 1.92 S_{t-1}   (32)
(standard errors: 26.40, 21.29, 0.076, 0.24, 0.13, 0.36)

T = 23 (1962-1984), R² = 0.94, σ̂ = 7.405.
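The passage from (31) to (32) uses only the differencing and differentials transformations of Section 2.2. The sketch below reproduces that reparameterization from the rounded coefficients reported in (31); because the published estimates are rounded, the implied values differ slightly from those shown in (32) (for example, 0.69 + 0.20 − 1 = −0.11 versus the reported −0.104).

```python
def adl_to_ecm(const, bubble, b1, b2, c0, c1):
    """Map the levels ADL (31),
        B_t = const + bubble*(1+r)**t + b1*B_{t-1} + b2*B_{t-2} + c0*S_t + c1*S_{t-1},
    into the unrestricted ECM form (32),
        dB_t = const + bubble*(1+r)**t + (b1+b2-1)*B_{t-1} - b2*dB_{t-1}
               + c0*dS_t + (c0+c1)*S_{t-1}.
    Only rounded point estimates are used here."""
    return {
        "constant": const,
        "(1+r)^t": bubble,
        "B_{t-1}": round(b1 + b2 - 1.0, 3),
        "dB_{t-1}": round(-b2, 3),
        "dS_t": round(c0, 3),
        "S_{t-1}": round(c0 + c1, 3),
    }

print(adl_to_ecm(48.41, -22.68, 0.69, 0.20, -1.30, -0.63))
# B_{t-1}: -0.11, dB_{t-1}: -0.2, dS_t: -1.3, S_{t-1}: -1.93 (cf. (32))
```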

The t-value on B_{t-1} is −1.36, which is insignificant at the 10% level for κ_ct(2), with critical value of −3.53. Using the critical value for κ_ct(2) assumes that (1 + r)^t is well approximated by a linear trend, which, visually, it is. Alternatively, the 10% critical value for κ_ctt(2) is −3.95, again with no rejection. The finite sample p-values under these two alternative assumptions are 0.8386 and 0.9247. Notably, estimating (32) (or (31)) with t and t² rather than with (1 + r)^t obtains a statistically significantly better fitting model, pointing to mis-specification in (32).

Table 7 reports the t-values and critical values for (32) with various choices of deterministic components. The bubble (1 + r)^t is statistically insignificant in (32), whereas a linear trend and quadratic trend in its stead are statistically significant. Even so, the resulting t-value for κ_ctt(2) is −2.96, which is insignificant at the 10% level, having a p-value of 0.3689. Cointegration does not appear to hold in this conditional model, undercutting the economic inferences drawn by Hamilton and Flavin (1986).

The sample size in (32) is small: T = 23. Correspondingly, the finite sample adjustments for critical values are typically larger numerically in Table 7 than in Table 6, with the largest adjustment being −0.72 at the 1% level, i.e. about two thirds of a standard error in the t-value. The p-values have small finite sample adjustments, which mainly reflect each reported t-value being far from the lower tail of the associated density; cf. Figure 7.

The single-equation results in Table 7 all assume that S is weakly exogenous, whereas S does not appear to be so empirically. Starting with a second-order VAR in B and S, a single cointegrating vector is apparent from the Johansen procedure when (1 + r)^t or a linear trend is restricted to lie in the cointegration space. Weak exogeneity of S is rejected, as is that of B, invalidating cointegration analysis in a conditional single equation such as (31). Without weak exogeneity, single-equation inference about cointegration is hazardous at best; and testing the implied exogeneity assumptions is clearly important. For example, in the Johansen procedure, the coefficient on the bubble (1 + r)^t or on the linear trend is statistically significant and negatively related to B, whichever type of trend is included. That contrasts with the statistical insignificance of the coefficient on (1 + r)^t in (31). Furthermore, the negative coefficient on the trend is economically surprising and puzzling, although it may be indicative of certain non-ergodic features of the data: see Kremers (1988) inter alia.

Table 7. Empirical t-values, critical values, and p-values for the ECM statistic: models of US federal government debt.

Statistic | Model or calculation | Empirical t-value | h | Critical value (1%, 5%, 10%) | p-value (finite sample, asymptotic) | σ | F-statistic vs. the model for κ_ctt(2)
κ_ctt(2) | ADL + bubble | −1.36 | 6 | −5.34, −4.39, −3.95 | 0.9247, 0.9651 | 7.40 | —
κ_ctt(2) | ADL | −2.96 | 7 | −5.38, −4.41, −3.96 | 0.3689, 0.4121 | 6.37 | —
κ_ctt(2) | Asymptotic | — | — | −4.62, −4.07, −3.78 | — | — | —
κ_ctt(2) | Crude | — | — | −4.6, −4.0, −3.7 | — | — | —
κ_ct(2) | ADL + bubble | −1.36 | 6 | −4.85, −3.95, −3.53 | 0.8386, 0.8947 | 7.40 |
κ_ct(2) | ADL | −1.38 | 6 | −4.85, −3.95, −3.53 | 0.8308, 0.8886 | 7.38 |
κ_ct(2) | Asymptotic | — | — | −4.25, −3.69, −3.39 | — | — | —
κ_ct(2) | Crude | — | — | −4.3, −3.7, −3.4 | — | — | —
κ_c(2) | ADL | −1.50 | 5 | −4.25, −3.40, −2.99 | 0.5944, 0.6458 | 7.43 | F(2, 16) = 4.26 [0.03]
κ_c(2) | Asymptotic | — | — | −3.79, −3.21, −2.91 | — | — | —
κ_c(2) | Crude | — | — | −4.0, −3.4, −3.1 | — | — | —
κ_nc(2) | ADL | +2.58 | | −3.48, −2.68, −2.29 | 0.9984, 0.9992 | 7.94 | F(3, 16) = 4.52 [0.02]
κ_nc(2) | Asymptotic | — | — | −3.21, −2.59, −2.26 | — | — | —
κ_nc(2) | Crude | — | — | −3.7, −3.1, −2.8 | — | — | —

In summary, the first empirical analysis illustrates the importance of parsimony, both in the choice of deterministic terms and in the reduction from an ADL to a simpler ECM. The second analysis shows that mis-specification can render inference hazardous, even when the mis-specification is indirect, as with a violation of weak exogeneity. Imposition of valid restrictions on the cointegrating vector may increase power, although asymptotically correct critical values for such ECM statistics have been derived only for the case when all cointegrating coefficients are known; see Hansen (1995, Table 1).


6. CONCLUSIONS

This paper has assessed the distributional properties of the ECM statistic for testing cointegration. Graphs and response surfaces provide complementary summaries of the vast array of results from the Monte Carlo study undertaken. Both the graphs and the response surfaces highlight some simple dependencies of the quantiles on the number of variables in the ECM, the choice of deterministic components, and the estimation sample size. The reported response surfaces provide a computationally convenient way for calculating finite sample critical values at the 1%, 5%, and 10% levels. The response surfaces also encompass and supersede much of the literature's previous estimates of critical values for the ECM statistic. A computer program, freely available over the Internet, can be used to calculate p-values and critical values at any level. Empirical conditional ECMs are ubiquitous in the cointegration literature, so these tools should be of immediate use to the empirical modeler. Two previous empirical studies illustrate how critical values and p-values for the ECM statistic can be employed in practice, and how their use may affect economic inferences.

Several limitations of the current study come to mind, thereby suggesting some possible extensions. First, the model's lag order is assumed to be (and is) unity throughout the Monte Carlo analysis. For longer lags, the adjusted sample size may be corrected for additional degrees of freedom lost in estimation and thence used to calculate critical values from a response surface, as in Section 5. This refinement may not be sufficient in itself, so an extended analysis, such as in Cheung and Lai (1995) for the Dickey-Fuller statistic, may be required. Second, all of the ECM statistics with deterministic components have those components fully unconstrained in estimation. In analyzing similar statistics, Harbo et al. (1998) and Doornik et al. (1998) argue strongly for constraining the highest-order deterministic component to lie in the cointegration space, so distributional properties for so constrained versions of κ_c(k), κ_ct(k), and κ_ctt(k) are of interest. That said, virtually all empirically calculated ECM statistics to date have been with unconstrained deterministic components. Finally, the current paper has considered the properties of the ECM statistic only under the null of no cointegration. While Banerjee et al. (1993), Campos et al. (1996), and Banerjee et al. (1998) present some calculations on the power of the ECM statistic, further analysis could be illuminating, particularly comparisons with the Johansen procedure and the EG procedure under various assumptions about weak exogeneity and common factor restrictions.

ACKNOWLEDGEMENTS

The views in this paper are solely the responsibility of the authors and should not be interpreted as reflecting the views of the Board of Governors of the Federal Reserve System or of any other person associated with the Federal Reserve System. JGM's research was supported in part by grants from the Social Sciences and Humanities Research Council of Canada. An early version of this paper appeared under the title 'Finite Sample Properties of Error Correction Tests for Cointegration', and an intermediate version was circulated as Ericsson and MacKinnon (1999). We are grateful to Shaghil Ahmed, David Bowman, Jon Faust, David Hendry, Søren Johansen, Fred Joutz, Andy Levin, Jaime Marquez, Bent Nielsen, John Rogers, Neil Shephard, an editor, and an anonymous referee for helpful comments and discussion; to Hayden Smith and Sebastian Thomas for research assistance; and to Jürgen Doornik and David Hendry for providing us with a beta-test version of GiveWin Version 2. Monte Carlo simulations and the graphs of the densities were obtained from modified versions of programs for MacKinnon (1994, 1996). Response


surfaces were obtained using PcGive Professional Version 9.2, and 3D graphics were generated from GiveWin Version 2.02: see Doornik and Hendry (2001). The paper's tables of response surface coefficients and the Excel spreadsheet and Fortran program for calculating critical values and p-values are available from JGM's home page.

REFERENCES

Abadir, K. M. (1995). The limiting distribution of the t ratio under a unit root. Econometric Theory 11, 775-93.

Banerjee, A., J. J. Dolado, J. W. Galbraith and D. F. Hendry (1993). Co-integration, Error Correction, and the Econometric Analysis of Non-stationary Data. Oxford: Oxford University Press.

Banerjee, A., J. J. Dolado and R. Mestre (1998). Error-correction mechanism tests for cointegration in a single-equation framework. Journal of Time Series Analysis 19, 267-83.

Banerjee, A. and D. F. Hendry (eds) (1996). The econometrics of economic policy. Special issue, Oxford Bulletin of Economics and Statistics 58, 567-819.

Boswijk, H. P. (1994). Testing for an unstable root in conditional and structural error correction models. Journal of Econometrics 63, 37-60.

Boswijk, H. P. and P. H. Franses (1992). Dynamic specification and cointegration. Oxford Bulletin of Economics and Statistics 54, 369-81.

Campos, J., N. R. Ericsson and D. F. Hendry (1996). Cointegration tests in the presence of structural breaks. Journal of Econometrics 70, 187-220.

Cheung, Y.-W. and K. S. Lai (1995). Lag order and critical values of the augmented Dickey-Fuller test. Journal of Business and Economic Statistics 13, 277-80.

Davidson, J. E. H., D. F. Hendry, F. Srba and S. Yeo (1978). Econometric modelling of the aggregate time-series relationship between consumers' expenditure and income in the United Kingdom. Economic Journal 88, 661-92.

Dickey, D. A. and W. A. Fuller (1981). Likelihood ratio statistics for autoregressive time series with a unit root. Econometrica 49, 1057-72.

Doornik, J. A. (1998). Approximations to the asymptotic distributions of cointegration tests. Journal of Economic Surveys 12, 573-93. Erratum in (1999) 13, i.

Doornik, J. A. and D. F. Hendry (2001). PcGive Version 10 for Windows. London: Timberlake Consultants Press.

Doornik, J. A., D. F. Hendry and B. Nielsen (1998). Inference in cointegrating models: UK M1 revisited. Journal of Economic Surveys 12, 533-72.

Elliott, G. (1995). Tests for the correct specification of cointegrating vectors and the error correction model. Mimeo. La Jolla, CA: Department of Economics, University of California at San Diego.

Engle, R. F. and C. W. J. Granger (1987). Co-integration and error correction: representation, estimation, and testing. Econometrica 55, 251-76.

Ericsson, N. R. (1991). Monte Carlo methodology and the finite sample properties of instrumental variables statistics for testing nested and non-nested hypotheses. Econometrica 59, 1249-77.

Ericsson, N. R. (ed.) (1998). Exogeneity, cointegration, and economic policy analysis. Special section, Journal of Business and Economic Statistics 16, 369-449.

Ericsson, N. R. and J. G. MacKinnon (1999). Distributions of error correction tests for cointegration. International finance discussion paper no. 655, Board of Governors of the Federal Reserve System, Washington, D.C., December; available on the World Wide Web at www.federalreserve.gov/pubs/ifdp/1999/655/default.htm.

Fuller, W. A. (1976). Introduction to Statistical Time Series. New York: John Wiley and Sons.

Hamilton, J. D. and M. A. Flavin (1986). On the limitations of government borrowing: a framework for empirical testing. American Economic Review 76, 808-19.

Hammersley, J. M. and D. C. Handscomb (1964). Monte Carlo Methods. London: Chapman and Hall.

Hansen, B. E. (1995). Rethinking the univariate approach to unit root testing: using covariates to increase power. Econometric Theory 11, 1148-71.

Harbo, I., S. Johansen, B. Nielsen and A. Rahbek (1998). Asymptotic inference on cointegrating rank in partial systems. Journal of Business and Economic Statistics 16, 388-99.

Hendry, D. F. (1984). Monte Carlo experimentation in econometrics. In Z. Griliches and M. D. Intriligator (eds), Handbook of Econometrics, Ch. 16, vol. 2, pp. 937-76. Amsterdam: North-Holland.

Hendry, D. F. (1989). PC-GIVE: An Interactive Econometric Modelling System, Version 6.0/6.01. Oxford: Institute of Economics and Statistics and Nuffield College, University of Oxford.

Hendry, D. F. (1995). On the interactions of unit roots and exogeneity. Econometric Reviews 14, 383-419.

Hendry, D. F. and J. A. Doornik (2001). Empirical Econometric Modelling Using PcGive 10, vol. 1. London: Timberlake Consultants Press.

Hendry, D. F. and N. R. Ericsson (1991). Modeling the demand for narrow money in the United Kingdom and the United States. European Economic Review 35, 833-81. With discussion.

Hendry, D. F. and G. E. Mizon (1993). Evaluating dynamic econometric models by encompassing the VAR. In P. C. B. Phillips (ed.), Models, Methods, and Applications of Econometrics: Essays in Honor of A. R. Bergstrom, Ch. 18, pp. 272-300. Cambridge, MA: Blackwell Publishers.

Hendry, D. F., A. Pagan and J. D. Sargan (1984). Dynamic specification. In Z. Griliches and M. D. Intriligator (eds), Handbook of Econometrics, Ch. 18, vol. 2, pp. 1023-1100. Amsterdam: North-Holland.

Hoover, K. D. and S. J. Perez (1999). Data mining reconsidered: encompassing and the general-to-specific approach to specification search. Econometrics Journal 2, 167-91. With discussion.

Horvath, M. T. K. and M. W. Watson (1995). Testing for cointegration when some of the cointegrating vectors are prespecified. Econometric Theory 11, 984-1014.

Johansen, S. (1988). Statistical analysis of cointegration vectors. Journal of Economic Dynamics and Control 12, 231-54.

Johansen, S. (1992a). Cointegration in partial systems and the efficiency of single-equation analysis. Journal of Econometrics 52, 389-402.

Johansen, S. (1992b). Testing weak exogeneity and the order of cointegration in UK money demand data. Journal of Policy Modeling 14, 313-34.

Johansen, S. (1994). The role of the constant and linear terms in cointegration analysis of nonstationary variables. Econometric Reviews 13, 205-29.

Johansen, S. (1995). Likelihood-based Inference in Cointegrated Vector Autoregressive Models. Oxford: Oxford University Press.

Johansen, S. and K. Juselius (1990). Maximum likelihood estimation and inference on cointegration, with applications to the demand for money. Oxford Bulletin of Economics and Statistics 52, 169-210.

Kiviet, J. F. and G. D. A. Phillips (1992). Exact similar tests for unit roots and cointegration. Oxford Bulletin of Economics and Statistics 54, 349-67.

Kremers, J. J. M. (1988). Long-run limits on the US federal debt. Economics Letters 28, 259-62.

Kremers, J. J. M., N. R. Ericsson and J. J. Dolado (1992). The power of cointegration tests. Oxford Bulletin of Economics and Statistics 54, 325-48.

L’Ecuyer, P. (1988). Efficient and portable combined random number generators. Communications of the ACM 31, 742-9, 774.

Lütkepohl, H. and J. Wolters (eds) (1998). Money demand in Europe. Empirical Economics 23, 263-524 (special issue).

MacKinnon, J. G. (1991). Critical values for cointegration tests. In R. F. Engle and C. W. J. Granger (eds), Long-run Economic Relationships: Readings in Cointegration, Ch. 13, pp. 267-76. Oxford: Oxford University Press.

MacKinnon, J. G. (1994). Approximate asymptotic distribution functions for unit-root and cointegration tests. Journal of Business and Economic Statistics 12, 167-76.

MacKinnon, J. G. (1996). Numerical distribution functions for unit root and cointegration tests. Journal of Applied Econometrics 11, 601-18.

MacKinnon, J. G., A. A. Haug and L. Michelis (1999). Numerical distribution functions of likelihood ratio tests for cointegration. Journal of Applied Econometrics 14, 563-77.

MacKinnon, J. G. and H. White (1985). Some heteroskedasticity-consistent covariance matrix estimators with improved finite sample properties. Journal of Econometrics 29, 305-25.

Marsaglia, G. and T. A. Bray (1964). A convenient method for generating normal variables. SIAM Review 6, 260-4.

Nielsen, B. and A. Rahbek (2000). Similarity issues in cointegration analysis. Oxford Bulletin of Economics and Statistics 61, 5-22.

Osterwald-Lenum, M. (1992). A note with quantiles of the asymptotic distribution of the maximum likelihood cointegration rank test statistics. Oxford Bulletin of Economics and Statistics 54, 461-72.

Pesaran, M. H., Y. Shin and R. J. Smith (2000). Structural analysis of vector error correction models with exogenous I(1) variables. Journal of Econometrics 97, 293-343.

Phillips, A. W. (1954). Stabilisation policy in a closed economy. Economic Journal 64, 290-323.

Sargan, J. D. (1964). Wages and prices in the United Kingdom: a study in econometric methodology. In P. E. Hart, G. Mills and J. K. Whitaker (eds), Econometric Analysis for National Economic Planning, vol. 16 of Colston Papers, pp. 25-54. London: Butterworths (with discussion).

Sargan, J. D. (1982). On Monte Carlo estimates of moments that are infinite. In R. L. Basmann and G. F. Rhodes, Jr (eds), Advances in Econometrics: A Research Annual, vol. 1, pp. 267-99. Greenwich, CT: JAI Press.

Zivot, E. (2000). The power of single equation tests for cointegration when the cointegrating vector is prespecified. Econometric Theory 16, 407-39.

Source: The Econometrics Journal, Vol. 5, No. 2 (2002), pp. 263-532; this article, ‘Distributions of error correction tests for cointegration’, appears on pp. 285-318.

Accompanying file: EricssonMacKinnon-2002-ecmtest.xls
