How to choose forecasting models
Automatic Forecasting Software
There are a number of software packages on the market that advertise
"automatic" forecasting capabilities: you put a time series in, and
the package automatically identifies and fits the "best" model from
among some class of models. Some of them bill themselves as "expert
systems" on relatively flimsy grounds, while others more-or-less live up
to their claims. With regard to such software, I would offer the same warning
as with stepwise regression: properly used, and guided by experience with
"manual" model-fitting, the best of such software can put more
data-analysis power at your fingertips and speed up routine forecasting
applications. Carelessly used, it merely allows you to foul things up in a
bigger way, obtaining results without insight while getting a false sense of
security that "the computer knows best." Automatic forecasting
software is a complement to, not a substitute for, your own forecasting
expertise. When evaluating such software, here are a few points to keep in
mind:
- What kind of exploratory data analysis does it enable you to do prior to
model-fitting? What kind of graphical and statistical reporting capabilities
does it have in general? What sort of data transformations (e.g.,
differencing, logging, deflating) does it make available?
- What class of models does it scan through in order to come up with a
best model? In particular, is it limited to smoothing and 1- or 2-variable
regression models, or does it also look at multiple regression and ARIMA,
or even more sophisticated models? What methods does it use to handle
seasonality? Is this class of models adequate for your purposes?
- What sort of diagnostic information (e.g., residual plots and
statistics) does it provide after fitting a model? Does it perform
goodness-of-fit tests and out-of-sample validation? Does it warn you if
the modeling assumptions are not satisfied or the model otherwise does not
fit well? Does it show you time series plots, probability plots, and ACF
plots of the residuals, so you can decide for yourself?
- What sort of control, if any, are you allowed to exercise over the model
selection process? Can you force it to try a model of your choice?
- What sort of audit trail does it leave to document the model-fitting
process? How much does it tell you about the criteria it used to determine
the best model? Is it open and forthright, or is it an inscrutable
"black box" using some mysterious proprietary algorithm?
- What capability does it have for importing and exporting data in forms
that can be read by other programs (e.g., ASCII files or spreadsheet
files)? Can you easily get data into it from your corporate database, if
necessary? Does it have its own programming/command/macro language for
automating routine analyses?
- Can it be used to build a system for forecasting large numbers of time
series? How large a data set can it handle?
- To what extent, if any, can models be customized to take into account the
unique features of your data?
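To make the checklist above concrete, here is a minimal sketch in Python of what an "automatic" selection loop does at its core: it scans a class of candidate models, scores each one, and picks a winner. It is not the algorithm of any particular package. The candidate class (simple exponential smoothing over a small grid of smoothing constants), the function names `ses_forecasts` and `auto_select`, the initialization of the level at the first observation, and the sample numbers are all assumptions made for illustration.

```python
import math


def ses_forecasts(series, alpha):
    """One-step-ahead simple exponential smoothing forecasts.

    forecasts[t] is the forecast of series[t] made from data up to t-1.
    The level is initialized at the first observation (an assumption).
    """
    level = series[0]
    forecasts = [level]
    for y in series[:-1]:
        level = alpha * y + (1 - alpha) * level
        forecasts.append(level)
    return forecasts


def rmse(actual, predicted):
    """Root-mean-squared error between two equal-length sequences."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def auto_select(series, alphas, holdout):
    """Pick the alpha with the lowest fit-period RMSE and keep an audit trail.

    The last `holdout` points are not used for selection; their error is
    reported separately as an out-of-sample check.
    """
    split = len(series) - holdout
    audit = []
    for alpha in alphas:
        f = ses_forecasts(series, alpha)
        audit.append({
            "alpha": alpha,
            # Index 0 is skipped: its "forecast" is the observation itself.
            "fit_rmse": rmse(series[1:split], f[1:split]),
            "holdout_rmse": rmse(series[split:], f[split:]),
        })
    best = min(audit, key=lambda row: row["fit_rmse"])
    return best, audit


if __name__ == "__main__":
    # Illustrative numbers only, not real data.
    sales = [20, 22, 21, 25, 24, 27, 26, 30, 29, 31, 33, 32]
    best, audit = auto_select(sales, alphas=[0.1, 0.3, 0.5, 0.7, 1.0], holdout=4)
    for row in audit:
        print(row)
    print("selected:", best)
```

Returning the full `audit` list, rather than only the winner, is the point of the sketch: it is the kind of audit trail you would want a package to expose. Note also that with alpha equal to 1.0 the model reduces to the naive random-walk forecast, so even this tiny candidate class contains a sensible benchmark. Whether such a narrow class is adequate is exactly the question raised in the checklist.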
Forecast Pro for Windows, developed
by Robert Goodrich and Eric Stellwagen (Business Forecast Systems Inc.,
Belmont, Massachusetts), appears to me to be one of the better of these programs.
It offers capabilities for fitting the most commonly used models--exponential
smoothing, multiple regression, and ARIMA--and it provides decent diagnostic
support while offering model-selection advice that is usually sound.
Comparison of features between Forecast Pro and Statgraphics
Common features of both programs:
- Can be used to automatically fit exponential smoothing models, ARIMA models,
and dynamic regression models (regression models with lags of dependent
and/or independent variables and/or forecast errors)
- Uses sophisticated nonlinear estimation and backforecasting
- Performs out-of-sample validation
- Automatically performs a battery of residual diagnostic tests
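As a rough illustration of what a battery of residual diagnostic tests involves, the sketch below computes the residual autocorrelations, flags lags that fall outside the approximate plus-or-minus 2/sqrt(n) band, and forms the Ljung-Box Q statistic. This is a generic textbook calculation, not the procedure used by Forecast Pro or Statgraphics (which tests they run is not stated here), and the function names are hypothetical.

```python
import math


def autocorrelations(residuals, max_lag):
    """Sample autocorrelations of the residuals at lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def significant_lags(residuals, max_lag):
    """Lags whose autocorrelation lies outside the rough +/- 2/sqrt(n) band."""
    band = 2.0 / math.sqrt(len(residuals))
    acf = autocorrelations(residuals, max_lag)
    return [k for k, r in enumerate(acf, start=1) if abs(r) > band]


def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic: Q = n (n + 2) * sum_k r_k^2 / (n - k)."""
    n = len(residuals)
    acf = autocorrelations(residuals, max_lag)
    return n * (n + 2) * sum(
        r * r / (n - k) for k, r in enumerate(acf, start=1)
    )
```

A large Q relative to a chi-square distribution (with degrees of freedom equal to the number of lags, less the number of estimated model parameters) suggests the residuals are not white noise. The numbers are no substitute for looking at the residual plots yourself.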
Advantages and/or nice features of Forecast Pro:
- Built-in "expert system" helps you select models and compare them
- Automatically tests for significance of next-higher lags of all variables
and errors in regression models
- "Rolling simulation" feature tests univariate models out-of-sample at a
number of different forecasting horizons (up to 12 months)
- "Multi-level" option allows you to generate forecasts at different levels
of aggregation so that they "add up"
- "Batch" mode allows automatic forecasting of many series in one step
- Includes the capability to add event (e.g. promotion) variables to
exponential smoothing models
- Includes specialized models for "intermittent" data
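The "rolling simulation" idea in the list above can be sketched as a rolling-origin evaluation: move the forecast origin forward one period at a time, forecast several steps ahead from each origin, and average the errors separately for each horizon. I am assuming this is the general idea behind the feature; the details of Forecast Pro's own implementation are not given here, and the names `rolling_origin_mae` and `naive_forecaster` are hypothetical.

```python
def rolling_origin_mae(series, forecaster, min_train, max_horizon):
    """Mean absolute error by forecast horizon, averaged over all origins.

    forecaster(history, horizon) must return a list of `horizon` forecasts
    made using only the values in `history`.
    """
    abs_errors = {h: [] for h in range(1, max_horizon + 1)}
    for origin in range(min_train, len(series)):
        history = series[:origin]
        # Near the end of the series fewer steps can be checked.
        steps = min(max_horizon, len(series) - origin)
        forecasts = forecaster(history, steps)
        for h in range(1, steps + 1):
            actual = series[origin + h - 1]
            abs_errors[h].append(abs(actual - forecasts[h - 1]))
    return {h: sum(e) / len(e) for h, e in abs_errors.items() if e}


def naive_forecaster(history, horizon):
    """Random-walk benchmark: repeat the last observed value."""
    return [history[-1]] * horizon


if __name__ == "__main__":
    # Illustrative numbers only, not real data.
    demand = [50, 52, 51, 55, 58, 57, 60, 63, 62, 66, 69, 68]
    print(rolling_origin_mae(demand, naive_forecaster, min_train=6, max_horizon=3))
```

Any model can be plugged in as `forecaster`, as long as it is refitted (or at least re-run) using only the history available at each origin. Comparing a candidate model's error at each horizon against the naive benchmark is a quick check on whether its apparent accuracy holds up beyond one step ahead.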
Disadvantages and/or caveats:
- Designed expressly for forecasting, not general-purpose statistical data
analysis and modeling
- Limited capabilities for manual data exploration (no scatterplots,
cross-correlation plots, horizontal residual plots, probability plots,
correlation matrices, etc.)
- Does not have some regression options (tests for influential observations
and multicollinearity, general nonlinear regression, automatic forward and
backward stepwise, all-possible regressions, etc.)
- Built-in "expert" does not understand your data, does not consider all
modeling possibilities (e.g., stationarizing transformations, deflation,
missing variables) and is not necessarily right. You still have to
look over its shoulder!
- Remember that you, not the built-in expert, must take responsibility for
the final results and explain the model to the client.
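Since the built-in expert may not consider stationarizing transformations or deflation, these are steps you may need to apply by hand before handing a series to the automatic fitter. The sketch below shows the three transformations mentioned on this page (deflating, logging, differencing) in plain Python. The function names and the base value of 100 for the price index are assumptions for illustration.

```python
import math


def deflate(nominal, price_index, base=100.0):
    """Convert nominal values to constant units using a price index.

    Assumes the index equals `base` in the reference period.
    """
    return [base * y / p for y, p in zip(nominal, price_index)]


def log_transform(series):
    """Natural log; requires strictly positive values."""
    return [math.log(y) for y in series]


def difference(series, lag=1):
    """Period-to-period change; use lag=12 for a seasonal difference
    of monthly data."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


if __name__ == "__main__":
    # Illustrative numbers only, not real data.
    nominal_sales = [100.0, 108.0, 118.0, 127.0, 139.0]
    cpi = [100.0, 103.0, 106.0, 109.0, 113.0]
    real_sales = deflate(nominal_sales, cpi)
    # Differenced logs approximate period-to-period percentage growth.
    print(difference(log_transform(real_sales)))
```

Whether any of these is appropriate depends on the series. Plot the data before and after transforming, and check whether the result looks more nearly stationary.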
Examples of automated analysis with Forecast Pro (v. 2):
Example #1: Department store data
Example #2: Time series models for auto sales series