 ARIMA models for time series forecasting

Estimation of ARIMA models

Linear versus nonlinear least squares

ARIMA models which include only AR terms are special cases of linear regression models, hence they can be fitted by ordinary least squares.

• AR forecasts are a linear function of the coefficients as well as a linear function of past data.
• In principle, least-squares estimates of AR coefficients can be exactly calculated from autocorrelations in a single "iteration".
• In practice, you can fit an AR model in the Multiple Regression procedure--just regress DIFF(Y) (or whatever) on lags of itself. (But you would get slightly different results from the ARIMA procedure--see below!)
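For instance, the regression-on-lags approach in the last bullet can be sketched in a few lines. This is a hypothetical illustration on simulated data, not tied to any particular package:

```python
import numpy as np

# Fit an AR(2) model to the differenced series by ordinary least squares,
# exactly as a Multiple Regression procedure would: regress DIFF(Y) on
# lags of itself, plus a constant.
rng = np.random.default_rng(0)
Y = np.cumsum(rng.normal(0.5, 1.0, 200))    # simulated series with drift
y = np.diff(Y)                              # DIFF(Y): one nonseasonal difference

p = 2
# Design matrix: a constant column plus lags 1..p of y
X = np.column_stack([np.ones(len(y) - p)] +
                    [y[p - k:len(y) - k] for k in range(1, p + 1)])
target = y[p:]                              # y_t, aligned with its lags

coeffs, *_ = np.linalg.lstsq(X, target, rcond=None)
mu, phi = coeffs[0], coeffs[1:]
print("constant:", mu, "AR coefficients:", phi)
```

Note that the first p observations of y are dropped to build the lag columns; as discussed under "Backforecasting" below, this is one reason a regression fit will not exactly match an ARIMA-procedure fit.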

ARIMA models which include MA terms are similar to regression models, but can't be fitted by ordinary least squares:

• Forecasts are a linear function of past data, but they are nonlinear functions of coefficients--e.g., an ARIMA(0,1,1) model without constant is an exponentially weighted moving average:

Ŷt  =  (1 - θ1)[Yt-1 + θ1Yt-2 + θ1²Yt-3 + …]

...in which the forecasts are a nonlinear function of the MA(1) parameter ("theta").
• Another way to look at the problem: you can't fit MA models using ordinary multiple regression because there's no way to specify ERRORS as an independent variable--the errors are not known until the model is fitted! They need to be calculated sequentially, period by period, given the current parameter estimates.
• MA models therefore require a nonlinear estimation algorithm, similar to the "Solver" add-in in Excel.
• The algorithm uses a search process that typically requires 5 to 10 iterations and occasionally may not converge.
• You can adjust the tolerances for determining step sizes and stopping criteria for search (although default values are usually OK).
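The sequential, period-by-period error computation described above can be sketched as follows. This is a toy illustration on simulated data; a real package uses a smarter iterative search than a grid, but the inner loop is the same idea:

```python
import numpy as np

# Conditional least squares for an MA(1) model without constant,
# y_t = e_t - theta*e_{t-1}.  The errors must be recomputed sequentially
# for each candidate theta -- which is why the fit is nonlinear even
# though the forecasts are linear in past data.
rng = np.random.default_rng(1)
eps = rng.normal(size=300)
theta_true = 0.6
y = eps[1:] - theta_true * eps[:-1]          # simulated MA(1) data

def sse(theta):
    err_prev = 0.0                           # assume the prior error was zero
    total = 0.0
    for yt in y:
        forecast = -theta * err_prev         # one-step-ahead forecast
        err = yt - forecast                  # this period's error, needed next period
        total += err ** 2
        err_prev = err
    return total

# A crude grid search stands in for the iterative search a real package uses.
thetas = np.linspace(-0.95, 0.95, 381)
theta_hat = min(thetas, key=sse)
print("estimated theta:", theta_hat)         # should land near 0.6
```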

"Mean" versus "constant"

The "mean" and the "constant" in ARIMA model-fitting results are different numbers whenever the model includes AR terms. Suppose that you fit an ARIMA model to Y in which p is the number of autoregressive terms. (Assume for convenience that there are no MA terms.) Let y denote the differenced (stationarized) version of Y, e.g., yt = Yt - Yt-1 if one nonseasonal difference was used. Then the AR(p) forecasting equation for y is:

ŷt  =  μ + ϕ1yt-1 + ϕ2yt-2 + … + ϕpyt-p

This is just an ordinary multiple regression model in which μ is the constant term, ϕ1 is the coefficient of the first lag of y, and so on.

Now, internally, the software converts this slope-intercept form of the regression equation to an equivalent form in terms of deviations from the mean. Let m denote the mean of the stationarized series y. Then the p-order autoregressive equation can be written in terms of deviations from the mean as:

ŷt  =  m + ϕ1(yt-1 - m) + ϕ2(yt-2 - m) + … + ϕp(yt-p - m)

By collecting all the constant terms in this equation, we see it is equivalent to the original form of the equation if:

μ   =   m(1 - ϕ1 - ϕ2 - … - ϕp  )

or in words:

CONSTANT = MEAN x (1 - sum of AR coefficients)

The software actually estimates m (along with the other model parameters) and reports this as the MEAN in the model-fitting results, along with its standard error and t-statistic, etc. The CONSTANT (μ) is then calculated according to the formula above. If the model does not contain any AR terms, the MEAN and the CONSTANT are identical.
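A quick numeric check of this relationship, using made-up values for the MEAN and the AR coefficients:

```python
# Hypothetical AR(2) example: verify that the slope-intercept form and the
# deviations-from-the-mean form of the equation give identical forecasts
# when CONSTANT = MEAN x (1 - sum of AR coefficients).
m = 10.0                      # MEAN of the stationarized series
phi1, phi2 = 0.6, 0.2         # AR coefficients
mu = m * (1 - phi1 - phi2)    # CONSTANT = 10 x (1 - 0.8) = 2.0

y_lag1, y_lag2 = 12.0, 9.0    # last two values of the stationarized series

forecast_slope_intercept = mu + phi1 * y_lag1 + phi2 * y_lag2
forecast_deviations = m + phi1 * (y_lag1 - m) + phi2 * (y_lag2 - m)

print(forecast_slope_intercept)   # 2.0 + 7.2 + 1.8 = 11.0
print(forecast_deviations)        # 10 + 1.2 - 0.2 = 11.0
```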

In a model with one order of nonseasonal differencing (only), the MEAN is the trend factor (average period-to-period change). In a model with one order of seasonal differencing (only), the MEAN is the annual trend factor (average year-to-year change).

"Backforecasting"

• The basic problem: an ARIMA model (or other time series model) predicts future values of the time series from past values--but how should the forecasting equation be initialized to make a forecast for the very first observation? (Actually, AR models can be initialized by dropping the first few observations--although this is inefficient and wastes data--but MA models require an estimate of a prior error before they can make the first forecast.)
• Strange but true: a stationary time series looks the same going forward or backward in time, therefore...
• The same model that predicts the future of a series can also be used to predict its past.
• The solution: to squeeze the most information out of the available data, the best way to initialize an ARIMA model (or any time series forecasting model) is to use backward forecasting ("backforecasting") to obtain estimates of data values prior to period 1.
• When you use the backforecasting option in ARIMA estimation, the search algorithm actually makes two passes through the data on each iteration: first a backward pass is made to estimate prior data values using the current parameter estimates, then the estimated prior data values are used to initialize the forecasting equation for a forward pass through the data.
• If you DON'T use the backforecasting option, the forecasting equation is initialized by assuming that prior values of the stationarized series were equal to the mean.
• If you DO use the backforecasting option, then the backforecasts that are used to initialize the model are implicit parameters of the model, which must be estimated along with the AR and MA coefficients. The number of additional implicit parameters is roughly equal to the highest lag in the model--usually 2 or 3 for a nonseasonal model, and s+1 or 2s+1 for a seasonal model with seasonality=s. (If the model includes both a seasonal difference and a seasonal AR or MA term, it needs two seasons' worth of prior values to start up!)
• Note that with either setting of the backforecasting option, an AR model is estimated in a different way than it would be estimated in the Multiple Regression procedure (missing prior values are not merely ignored--they are replaced either with an estimate of the mean or with backforecasts), hence an AR model fitted in the ARIMA procedure will never yield exactly the same parameter estimates as an AR model fitted in the Multiple Regression procedure.
• Conventional wisdom: turn backforecasting OFF when you are unsure if the current model is valid, turn it ON to get final parameter estimates once you're reasonably sure the model is valid.
• If the model is mis-specified, backforecasting may lead to failures of the parameter estimates to converge and/or to unit-root problems.
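A rough sketch of a single backward-then-forward pass for an MA(1) model without constant, under the sign convention y_t = e_t - θ1*e_t-1 used above. The one-pass simplification and all variable names are illustrative only: real implementations backforecast more pre-sample values and repeat both passes on every iteration of the search.

```python
import numpy as np

# Minimal sketch of one backforecasting iteration for an MA(1) model
# without constant.  Assumes the time-reversibility property noted above:
# the reversed series follows the same MA(1) model.
rng = np.random.default_rng(2)
eps = rng.normal(size=200)
theta = 0.5                                  # current parameter estimate
y = eps[1:] - theta * eps[:-1]               # stationarized data, periods 1..n

# Backward pass: run the error recursion from the end of the data toward
# the start, initializing the beyond-sample backward error at its mean.
a_next = 0.0
for yt in y[::-1]:
    a_next = yt + theta * a_next             # backward error at period t
e0 = -theta * a_next                         # backforecast of the prior error e_0

# Forward pass: initialize the forecasting equation with the backforecast
# instead of assuming the prior error equals its mean of zero.
err_prev = e0
sse = 0.0
for yt in y:
    err = yt + theta * err_prev              # e_t = y_t + theta*e_{t-1}
    sse += err ** 2
    err_prev = err
print("conditional SSE with backforecast initialization:", sse)
```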