**ARIMA models for time series forecasting**

Notes on nonseasonal ARIMA models (pdf file)

Slides on seasonal and nonseasonal ARIMA models (pdf file)

Introduction to ARIMA: nonseasonal models

Identifying the order of differencing in an ARIMA model

Identifying the numbers of AR or MA terms in an ARIMA model

Estimation of ARIMA models

Seasonal differencing in ARIMA models

Seasonal random walk: ARIMA(0,0,0)x(0,1,0)

Seasonal random trend: ARIMA(0,1,0)x(0,1,0)

General seasonal models: ARIMA (0,1,1)x(0,1,1) etc.

Summary of rules for identifying ARIMA models

ARIMA models with regressors

The mathematical structure of ARIMA models (pdf file)

**Introduction to ARIMA: nonseasonal models**

ARIMA(p,d,q)

ARIMA(1,0,0) = first-order autoregressive model

ARIMA(0,1,0) = random walk

ARIMA(1,1,0) = differenced first-order autoregressive model

ARIMA(0,1,1) without constant = simple exponential smoothing

ARIMA(0,1,1) with constant = simple exponential smoothing with growth

ARIMA(0,2,1) or (0,2,2) without constant = linear exponential smoothing

A "mixed" model--ARIMA(1,1,1)

Spreadsheet implementation

**ARIMA(p,d,q):** ARIMA models are, in theory, the most general class of
models for forecasting a time series which can be made to be
“stationary” by differencing (if necessary), perhaps in conjunction
with nonlinear transformations such as logging or deflating (if necessary). A
random variable that is a time series is stationary if its statistical
properties are all constant over time. *A stationary series has no trend, its
variations around its mean have a constant amplitude, and it wiggles in a
consistent fashion*, i.e., its short-term random time patterns always look the
same in a statistical sense. The latter condition means that its
*autocorrelations* (correlations with its own prior deviations from the mean)
remain constant over time, or equivalently, that its *power spectrum* remains
constant over time. A random variable of this form can be viewed (as usual) as
a combination of signal and noise, and the signal (if one is apparent) could be
a pattern of fast or slow mean reversion, or sinusoidal oscillation, or rapid
alternation in sign, and it could also have a seasonal component. An ARIMA
model can be viewed as a “filter” that tries to separate the signal
from the noise, and the signal is then extrapolated into the future to obtain
forecasts.

The ARIMA forecasting equation for a stationary time series is a *linear*
(i.e., regression-type) equation in which the predictors consist of *lags of
the dependent variable* and/or *lags of the forecast errors*. That is:

**Predicted value of Y = a constant and/or a
weighted sum of one or more recent values of Y and/or a weighted sum of one or
more recent values of the errors.**

If the predictors consist only of lagged values of Y, it is a pure
autoregressive (“self-regressed”) model, which is just a special
case of a regression model and which could be fitted with standard regression
software. For example, a so-called *first-order autoregressive
(“AR(1)”) model* for Y is a simple regression model in which the
independent variable is just Y lagged by one period (LAG(Y,1) in Statgraphics
or Y_LAG1 in RegressIt). If some of the predictors are lags of the errors, an
ARIMA model is NOT a linear regression model, because there is no way to
specify “last period’s error” as an independent variable: the
errors must be computed on a period-to-period basis when the model is fitted to
the data. From a technical standpoint, the problem with using lagged errors as
predictors is that *the model’s predictions are not linear functions of the
coefficients*, even though they are linear functions of the past data. So,
coefficients in ARIMA models that include lagged errors must be estimated by
*nonlinear* optimization methods (“hill-climbing”) rather than by
just solving a system of equations.
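To make the estimation point concrete, here is a minimal sketch in Python (standard library only). It assumes the simplest model with a lagged error, in which the forecast is last period's value minus theta times last period's error, and it assumes the error before the first forecast is zero. The function names `sse_for_theta` and `fit_theta_by_grid_search` are hypothetical, and a plain grid search stands in for the hill-climbing optimizers that statistical packages actually use.

```python
import random

def sse_for_theta(y, theta):
    """Sum of squared one-step errors for forecast(t) = y[t-1] - theta * e[t-1]."""
    prev_error = 0.0  # assumption: the error before the first forecast is zero
    sse = 0.0
    for t in range(1, len(y)):
        forecast = y[t - 1] - theta * prev_error
        prev_error = y[t] - forecast  # each error depends on all earlier errors
        sse += prev_error ** 2
    return sse

def fit_theta_by_grid_search(y, steps=199):
    """Try theta values between -0.995 and 0.995 and keep the smallest SSE."""
    candidates = [-0.995 + 0.01 * i for i in range(steps + 1)]
    return min(candidates, key=lambda theta: sse_for_theta(y, theta))

# Simulated series whose first difference is e(t) - 0.6 * e(t-1).
random.seed(0)
shocks = [random.gauss(0.0, 1.0) for _ in range(2001)]
y = [0.0]
for t in range(1, 2001):
    y.append(y[-1] + shocks[t] - 0.6 * shocks[t - 1])

theta_hat = fit_theta_by_grid_search(y)
print("estimated theta:", round(theta_hat, 3))
```

Because every error is computed from the previous one, the sum of squared errors is a polynomial of high degree in theta, which is why no single system of linear equations yields the estimate.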

The acronym
ARIMA stands for **Auto-Regressive
Integrated Moving Average**. Lags of the stationarized series in the
forecasting equation are called "autoregressive" terms, lags of the
forecast errors are called "moving average" terms, and a time series
which needs to be differenced to be made stationary is said to be an
"integrated" version of a stationary series. **Random-walk and
random-trend models, autoregressive models, and exponential smoothing models
are all special cases of ARIMA models.**

A nonseasonal ARIMA model is classified as an "ARIMA(p,d,q)" model, where:

- **p** is the number of autoregressive terms,
- **d** is the number of nonseasonal differences, and
- **q** is the number of lagged forecast errors in the prediction equation.

To identify the appropriate ARIMA model for a time series, you begin by
identifying the order(s) of differencing needed to stationarize the series and
remove the gross features of seasonality, perhaps in conjunction with a
variance-stabilizing transformation such as logging or deflating. If you stop
at this point and predict that the differenced series is constant, you have
merely fitted a random walk or random trend model. (Recall that the random walk
model predicts the first difference of the series to be constant, and a random
trend model predicts that the first difference of the series is a random walk,
rather than a constant.) However, the best random walk or random trend model
may still have autocorrelated errors, suggesting that
additional factors of some kind are needed in the prediction equation.

The process of determining which ARIMA model is best for a given time series
will be discussed in later sections of the notes, but a preview of some of the
types of *nonseasonal* ARIMA models that are commonly encountered is given
below. For those who want to know a bit more about the underlying mathematics,
details are provided in the handout: The mathematical structure of ARIMA
models.

**ARIMA(1,0,0) = first-order autoregressive model:** If the series is
stationary and autocorrelated, perhaps it can be predicted as a multiple of its
own previous value, plus a constant. The forecasting equation in this case is

$$\hat{Y}(t) = \mu + \phi_1\,Y(t-1)$$

...which is Y regressed on itself lagged by one period. This is an
“ARIMA(1,0,0)+constant” model. The constant term is denoted by
"mu" and the autoregressive coefficient is denoted by "phi", in keeping with
the terminology for ARIMA models popularized by Box and Jenkins. (In the output
of the Forecasting procedure in Statgraphics, this coefficient is simply
denoted as the AR(1) coefficient.) If the mean of Y is zero, then the constant
term would not be included.

If the slope coefficient $\phi_1$ is positive and less than 1 in magnitude (it
*must* be less than 1 in magnitude if Y is stationary), the model describes
mean-reverting behavior in which next period’s value should be predicted to be
$\phi_1$ times as far away from the mean as this period’s value. If $\phi_1$ is
negative, it predicts mean-reverting behavior with alternation of signs, i.e.,
it also predicts that Y will be below the mean next period if it is above the
mean this period.
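The regression view of the AR(1) model can be sketched in a few lines of Python (standard library only). The helper name `fit_ar1` and the simulated data are assumptions made for illustration. The sketch regresses Y on its own lag by ordinary least squares, then checks the mean-reversion property described above: the forecast sits phi times as far from the long-run mean mu/(1 - phi) as the latest observation does.

```python
import random

def fit_ar1(y):
    """Ordinary least squares regression of y[t] on y[t-1]; returns (mu, phi)."""
    x = y[:-1]   # lagged values (the independent variable)
    z = y[1:]    # current values (the dependent variable)
    x_mean = sum(x) / len(x)
    z_mean = sum(z) / len(z)
    sxx = sum((a - x_mean) ** 2 for a in x)
    sxz = sum((a - x_mean) * (b - z_mean) for a, b in zip(x, z))
    phi = sxz / sxx
    mu = z_mean - phi * x_mean
    return mu, phi

# Simulated AR(1) data with constant 3.0 and coefficient 0.7 (long-run mean 10).
random.seed(1)
y = [10.0]
for _ in range(2000):
    y.append(3.0 + 0.7 * y[-1] + random.gauss(0.0, 1.0))

mu_hat, phi_hat = fit_ar1(y)
long_run_mean = mu_hat / (1.0 - phi_hat)
forecast = mu_hat + phi_hat * y[-1]   # predicted value of Y for the next period

print("phi:", round(phi_hat, 3), "mean:", round(long_run_mean, 3))
print("forecast deviation:", forecast - long_run_mean)
print("phi * last deviation:", phi_hat * (y[-1] - long_run_mean))
```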

In a *second-order*
autoregressive model (ARIMA(2,0,0)), there would be a Y(t-2) term on the right
as well, and so on. Depending on
the signs and magnitudes of the coefficients, an ARIMA(2,0,0) model could
describe a system whose mean reversion takes place in a *sinusoidally oscillating* fashion, like the motion of a mass on a
spring that is subjected to random shocks.
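As a rough illustration of the oscillating case, the sketch below iterates the AR(2) forecast recursion on deviations from the mean, with no new shocks. The coefficient values 1.2 and -0.6 and the function name `ar2_forecast_path` are assumptions chosen for the example. When phi1 squared plus 4 times phi2 is negative, the characteristic roots are complex and the forecast path swings above and below the mean while dying out.

```python
def ar2_forecast_path(dev_prev2, dev_prev1, phi1, phi2, steps):
    """Iterate deviation(t) = phi1*deviation(t-1) + phi2*deviation(t-2), no shocks."""
    path = [dev_prev2, dev_prev1]
    for _ in range(steps):
        path.append(phi1 * path[-1] + phi2 * path[-2])
    return path

phi1, phi2 = 1.2, -0.6
discriminant = phi1 ** 2 + 4.0 * phi2   # negative means complex roots
path = ar2_forecast_path(1.0, 1.0, phi1, phi2, steps=30)

# Count how often the forecast path crosses the mean (deviation changes sign).
sign_changes = sum(1 for a, b in zip(path, path[1:]) if a * b < 0)

print("discriminant:", discriminant)
print("sign changes:", sign_changes)
print("last deviation:", path[-1])
```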

**ARIMA(0,1,0) = random walk:** If the series Y is not stationary, the
simplest possible model for it is a random walk model, which can be considered
as a limiting case of an AR(1) model in which the autoregressive coefficient is
equal to 1, i.e., a series with infinitely slow mean reversion. The prediction
equation for this model can be written as:

$$\hat{Y}(t) - Y(t-1) = \mu$$

or equivalently

$$\hat{Y}(t) = \mu + Y(t-1)$$

...where the constant term is the average period-to-period change (i.e., the
long-term drift)
in Y. This model could be fitted as an *intercept-only regression model* (one
with a constant term but no independent variables) in which the first
difference of Y is the dependent variable. Since it includes (only) a
nonseasonal difference and a constant term, it is classified as an
"ARIMA(0,1,0) model with constant." The random-walk-*without*-drift model would
be an ARIMA(0,1,0) model *without* constant.
**ARIMA(1,1,0) = differenced first-order autoregressive model:** If the errors
of a random walk model are autocorrelated, perhaps the problem can be fixed by
adding one lag of the dependent variable to the prediction equation--i.e., by
regressing *the first difference of Y* on itself lagged by one period. This
would yield the following prediction equation:

$$\hat{Y}(t) - Y(t-1) = \mu + \phi_1\,(Y(t-1) - Y(t-2))$$

which can be rearranged to

$$\hat{Y}(t) = \mu + Y(t-1) + \phi_1\,(Y(t-1) - Y(t-2))$$

This is a first-order autoregressive model with one order of nonseasonal
differencing and a constant term--i.e., an "ARIMA(1,1,0) plus constant" model.
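The two forms of the ARIMA(1,1,0) prediction equation can be checked against each other with a short sketch. The coefficient values and the function name `arima110_forecast` are illustrative assumptions, not estimates from real data.

```python
def arima110_forecast(y_prev1, y_prev2, mu, phi):
    """Return (predicted change, predicted level) for the next period.

    y_prev1 is Y(t-1) and y_prev2 is Y(t-2).
    """
    predicted_change = mu + phi * (y_prev1 - y_prev2)   # differenced form
    predicted_level = y_prev1 + predicted_change        # rearranged form
    return predicted_change, predicted_level

change, level = arima110_forecast(y_prev1=12.0, y_prev2=10.0, mu=0.5, phi=0.4)
print(change, level)
```

With a positive coefficient, part of the most recent change is expected to persist into the next period, on top of the average change mu.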

**ARIMA(0,1,1)
without constant = simple exponential smoothing:** Another strategy for
correcting autocorrelated errors in a random walk model is suggested by the
simple exponential smoothing model. Recall that for some nonstationary time
series (e.g., ones that exhibit noisy fluctuations around a slowly-varying
mean), the random walk model does not perform as well as a moving average of
past values. In other words, rather than taking the most recent observation as
the forecast of the next observation, it is better to use an *average *of
the last few observations in order to filter out the noise and more accurately
estimate the local mean. The simple exponential smoothing model uses an *exponentially
weighted moving average* of past values to achieve this effect. The
prediction equation for the simple exponential smoothing model can be written
in a number of mathematically equivalent ways, one of which is:

$$\hat{Y}(t) = Y(t-1) - \theta_1\,e(t-1)$$

...where e(t-1) denotes the error at period t-1. Note that this resembles the
prediction
equation for the ARIMA(1,1,0) model, except that instead of a multiple of the
lagged difference it includes *a multiple of the lagged forecast error.*
(It also does not include a constant term--yet.) The coefficient of the lagged
forecast error is denoted by the Greek letter "theta" (again
following Box and Jenkins) and it is conventionally written with a *negative*
sign for reasons of mathematical symmetry. "Theta" in this equation
corresponds to the quantity "1-minus-alpha" in the exponential
smoothing formulas we studied earlier.

When a
lagged forecast error is included in the prediction equation as shown above, it
is referred to as a "moving average" (MA) term. The simple exponential
smoothing model is therefore a first-order moving average ("MA(1)")
model with one order of nonseasonal differencing and no constant term --i.e.,
an "ARIMA(0,1,1) model without constant." This means that in
Statgraphics (or any other statistical software that supports ARIMA models) you
can actually fit a simple exponential smoothing model by specifying it as an
ARIMA(0,1,1) model without constant, and the estimated MA(1) coefficient
corresponds to "1-minus-alpha" in the SES formula.
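The correspondence between theta and "1-minus-alpha" can be verified numerically. The sketch below computes one-step forecasts two ways: with the usual smoothing recursion, and with the error-correction form in which the forecast is last period's value minus theta times last period's error. Both start from the assumption that the first forecast equals the first observation; the data and function names are made up for illustration.

```python
def ses_forecasts(y, alpha):
    """forecast(t) = alpha*y[t-1] + (1-alpha)*forecast(t-1), first forecast = y[0]."""
    forecasts = [y[0]]   # forecast of y[1]
    for t in range(2, len(y)):
        forecasts.append(alpha * y[t - 1] + (1.0 - alpha) * forecasts[-1])
    return forecasts

def arima011_forecasts(y, theta):
    """forecast(t) = y[t-1] - theta*e[t-1], with the initial error taken as zero."""
    forecasts = []
    prev_error = 0.0
    for t in range(1, len(y)):
        forecast = y[t - 1] - theta * prev_error
        forecasts.append(forecast)
        prev_error = y[t] - forecast
    return forecasts

y = [3.0, 5.0, 4.0, 6.0, 7.0, 5.5]   # illustrative data
alpha = 0.3
ses = ses_forecasts(y, alpha)
arima = arima011_forecasts(y, theta=1.0 - alpha)
print(ses)
print(arima)
```

The two lists agree up to floating-point rounding, since substituting e(t-1) = Y(t-1) minus the previous forecast turns one recursion into the other.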

**What’s the best way to correct for autocorrelation: adding AR terms or
adding MA terms?** In the two models discussed above, the problem of
autocorrelated errors in a random walk model was fixed in two different ways:
by adding a lagged value of the differenced series to the equation or by adding
a lagged value of the forecast error. Which approach is best? A “rule of
thumb” for this situation, which will be discussed in more detail later
on, is that *positive* autocorrelation is usually best treated by adding an AR
term to the model and *negative* autocorrelation is usually best treated by
adding an MA term. In business and economic time series, *negative*
autocorrelation often arises as *an artifact of differencing*. (In general,
differencing reduces positive autocorrelation and may even cause a switch from
positive to negative autocorrelation.) So, the ARIMA(0,1,1) model, in which
differencing is accompanied by an MA term, is more often used than an
ARIMA(1,1,0) model.

**ARIMA(0,1,1)
with constant = simple exponential smoothing with growth:** By implementing the SES
model as an ARIMA model, you actually gain some flexibility. First of all, the
estimated MA(1) coefficient is allowed to be *negative*: this corresponds
to a smoothing factor larger than 1 in an SES model, which is usually not
allowed by the SES model-fitting procedure. Second, you have the option of
including a constant term in the ARIMA model if you wish, in order to estimate
an average non-zero trend. The ARIMA(0,1,1) model *with* constant has the
prediction equation:

$$\hat{Y}(t) = \mu + Y(t-1) - \theta_1\,e(t-1)$$

The one-period-ahead forecasts from this model are qualitatively similar to
those of the SES model, except that the trajectory of the long-term forecasts
is typically a sloping line (whose slope is equal to mu) rather than a
horizontal line.
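To see why the long-term forecasts slope at a rate of mu, the sketch below extends the ARIMA(0,1,1)-with-constant forecast several periods ahead. It assumes the standard convention that unknown future errors are replaced by their expected value of zero, so only the first step uses the last observed error. The input numbers and the function name are illustrative.

```python
def arima011_constant_forecasts(y_last, e_last, mu, theta, horizon):
    """Multi-step forecasts from the last observation and last one-step error."""
    forecasts = [mu + y_last - theta * e_last]   # one period ahead
    for _ in range(horizon - 1):
        # Future errors are unknown; their expected value (zero) is used,
        # so each further step simply adds the constant mu.
        forecasts.append(forecasts[-1] + mu)
    return forecasts

forecasts = arima011_constant_forecasts(
    y_last=50.0, e_last=2.0, mu=1.5, theta=0.6, horizon=4
)
steps = [b - a for a, b in zip(forecasts, forecasts[1:])]
print(forecasts)
print(steps)
```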

**ARIMA(0,2,1) or (0,2,2) without constant = linear exponential smoothing:**
Linear exponential smoothing models are ARIMA models which use *two*
nonseasonal differences in conjunction with MA terms. The second difference of
a series Y is not simply the difference between Y and itself lagged by two
periods, but rather it is the *first difference of the first difference*--i.e.,
the change-in-the-change of Y at period t. Thus, **the second difference of Y
at period t is equal to (Y(t)-Y(t-1)) - (Y(t-1)-Y(t-2)) = Y(t) - 2Y(t-1) +
Y(t-2)**. A second difference of a discrete function is analogous to a second
derivative of a continuous function: it measures the "acceleration"
or "curvature" in the function at a given point in time.
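A small sketch confirms the second-difference formula on made-up data: differencing twice gives the same numbers as Y(t) - 2Y(t-1) + Y(t-2), and for a quadratic series (constant "acceleration") the second difference is constant. The function names are hypothetical.

```python
def first_difference(y):
    """Change from one period to the next."""
    return [b - a for a, b in zip(y, y[1:])]

def second_difference(y):
    """Y(t) - 2*Y(t-1) + Y(t-2) for every t with two earlier values."""
    return [y[t] - 2 * y[t - 1] + y[t - 2] for t in range(2, len(y))]

y = [t * t for t in range(6)]   # 0, 1, 4, 9, 16, 25
twice_differenced = first_difference(first_difference(y))
direct = second_difference(y)
lag_two_difference = [y[t] - y[t - 2] for t in range(2, len(y))]

print(twice_differenced)    # the change in the change
print(direct)               # same values from the direct formula
print(lag_two_difference)   # Y(t) - Y(t-2) is a different quantity
```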

The ARIMA(0,2,2) model without constant predicts that the second difference of
the series equals a linear function of the last two forecast errors:

$$\hat{Y}(t) - 2Y(t-1) + Y(t-2) = -\theta_1\,e(t-1) - \theta_2\,e(t-2)$$

which can be rearranged as:

$$\hat{Y}(t) = 2Y(t-1) - Y(t-2) - \theta_1\,e(t-1) - \theta_2\,e(t-2)$$

where theta-1 and theta-2 are the MA(1) and MA(2) coefficients. This is a
general *linear exponential smoothing model*, essentially the same as
Holt’s model, and Brown’s model is a special case. It uses
exponentially weighted moving averages to estimate both a *local level* and a
*local trend* in the series. The long-term forecasts from this model converge
to a straight line whose slope depends on the average trend observed toward the
end of the series.
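The straight-line behavior of the forecasts can be illustrated by iterating the ARIMA(0,2,2) prediction equation forward: the forecast equals twice the previous value, minus the value before that, minus theta-1 and theta-2 times the last two errors. The sketch assumes that future errors are replaced by their expected value of zero and that forecasts stand in for future values of Y. All input numbers and the function name are illustrative.

```python
def arima022_forecasts(y_last, y_prev, e_last, e_prev, theta1, theta2, horizon):
    """Multi-step forecasts from Y(t) = 2Y(t-1) - Y(t-2) - theta1*e(t-1) - theta2*e(t-2)."""
    values = [y_prev, y_last]
    errors = [e_prev, e_last]
    for _ in range(horizon):
        forecast = (2.0 * values[-1] - values[-2]
                    - theta1 * errors[-1] - theta2 * errors[-2])
        values.append(forecast)   # forecasts stand in for future observations
        errors.append(0.0)        # expected value of a future error
    return values[2:]

forecasts = arima022_forecasts(
    y_last=20.0, y_prev=18.0, e_last=1.0, e_prev=-0.5,
    theta1=0.8, theta2=-0.2, horizon=5
)
steps = [b - a for a, b in zip(forecasts, forecasts[1:])]
print(forecasts)
print(steps)
```

The steps between successive forecasts are all equal, so the forecasts lie on a line whose level and slope were set by the most recent values and errors.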

**A "mixed" model--ARIMA(1,1,1):** The features of
autoregressive and moving average models can be "mixed" in the same
model. For example, an ARIMA(1,1,1) model with constant would have the
prediction equation:

$$\hat{Y}(t) = \mu + Y(t-1) + \phi_1\,(Y(t-1) - Y(t-2)) - \theta_1\,e(t-1)$$

Normally, though, we will try to stick to "unmixed" models with either only-AR
or only-MA terms, because including both kinds of terms in the same model
sometimes leads to overfitting of the data and non-uniqueness of the
coefficients.
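As a sketch of how the mixed model combines both kinds of terms, the function below computes one-step forecasts for an ARIMA(1,1,1) model with constant: mu, plus the previous value, plus phi times the previous change, minus theta times the previous error. It assumes that forecasting starts at the third observation and that the error before the first forecast is zero. The coefficients, data, and function name are illustrative.

```python
def arima111_forecasts(y, mu, phi, theta):
    """One-step forecasts of y[2], y[3], ... from the ARIMA(1,1,1) equation."""
    forecasts = []
    prev_error = 0.0   # assumption: no error is available before the first forecast
    for t in range(2, len(y)):
        forecast = (mu + y[t - 1]
                    + phi * (y[t - 1] - y[t - 2])   # AR term on the lagged change
                    - theta * prev_error)           # MA term on the lagged error
        forecasts.append(forecast)
        prev_error = y[t] - forecast
    return forecasts

y = [10.0, 12.0, 13.0, 15.0]
mixed = arima111_forecasts(y, mu=0.5, phi=0.4, theta=0.3)
ar_only = arima111_forecasts(y, mu=0.5, phi=0.4, theta=0.0)   # ARIMA(1,1,0) case
print(mixed)
print(ar_only)
```

Setting theta to zero recovers the ARIMA(1,1,0) forecasts, and setting phi to zero recovers those of the ARIMA(0,1,1) model with constant.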

**Spreadsheet implementation:** ARIMA models such as those described above are
easy to implement on a spreadsheet. The prediction equation is simply a linear
equation that refers to past values of the original time series and past values
of the errors. Thus, you can set up an
ARIMA forecasting spreadsheet by storing the data in column A, the forecasting
formula in column B, and the errors (data minus forecasts) in column C. The
forecasting formula in a typical cell in column B would simply be a linear
expression referring to values in preceding rows of columns A and C, multiplied
by the appropriate AR or MA coefficients stored in cells elsewhere on the
spreadsheet.
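The column layout described above can be mimicked in Python as a rough sketch. Each row holds the data (column A), the forecast (column B), and the error (column C). The forecasting formula used here is the ARIMA(0,1,1) model with constant, and the coefficient values, data, and function name `build_sheet` are assumptions made for illustration.

```python
def build_sheet(data, mu, theta):
    """Return rows with keys A (data), B (forecast), and C (error = A - B)."""
    rows = [{"A": data[0], "B": None, "C": None}]   # no forecast for the first period
    for t in range(1, len(data)):
        prev_error = rows[t - 1]["C"] or 0.0   # assumption: a missing error counts as zero
        # Column B formula: constant + previous A - theta * previous C
        forecast = mu + rows[t - 1]["A"] - theta * prev_error
        rows.append({"A": data[t], "B": forecast, "C": data[t] - forecast})
    return rows

rows = build_sheet([10.0, 12.0, 11.0, 13.0], mu=0.5, theta=0.5)
for row in rows:
    print(row["A"], row["B"], row["C"])
```

In an actual spreadsheet, mu and theta would sit in their own cells so that the column B formula can refer to them, which also makes it easy to try different coefficient values.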
