How to choose forecasting models

Data transformations and forecasting models: what to use and when

Transformation Properties            When to use               Points to keep in mind                        
Deflation by   Converts data from    When data are measured    To generate a true forecast for the future    
CPI or         nominal dollars (or   in nominal dollars (or    in nominal terms, you will need to make an    
another price  other currency) to    other currency) and you   explicit forecast of the future value of the  
index          constant dollars;     want to explicitly show   price index--i.e., you will need to forecast  
               usually helps to      the effect of             the inflation rate (but this is easy if       
               stabilize variance    inflation--i.e., uncover  you're in a period of steady inflation)
                                     "real growth"                                                           
Deflation at   Merely applies a      When you only need to     When used with a zero-trend model like        
a fixed rate   constant discount     approximately model the   simple exponential smoothing or random walk   
               factor to past data   effect of past inflation  without growth, the assumed inflation rate    
                                     and/or you wish to        is precisely the percentage growth in the     
                                     impose an assumption      future forecasts.                             
                                     about the current and                                                   
                                     future inflation                                                        
                                     rate--you can twiddle                                                   
                                     the inflation rate to                                                   
                                     see what value does the                                                 
                                     best job of flattening                                                  
                                     out the trend and/or                                                    
                                     stabilizing the variance                                                
Logarithm      Converts              When compound growth is   Logging is not the same as deflating:  it     
               multiplicative        not due to inflation      linearizes growth but does not remove a       
               patterns to additive  (e.g. when data is not    general upward trend; if logged data still    
               patterns and/or       measured in currency);    have a consistent upward trend, then you      
               linearizes            when you do not need to   should use a model that includes a trend      
               exponential growth;   separate inflation from   factor (e.g., random walk with growth,        
               converts absolute     real growth; when data    ARIMA, linear exponential smoothing).         
               changes to            distribution is positive                                                
               percentage changes;   and highly skewed (e.g.,                                                
               often stabilizes the  exponential or
               variance of data      log-normal                                                              
               with compound         distribution); when                                                     
               growth, regardless    variables are                                                           
               of whether deflation  multiplicatively related  
               is also used                  
First          Converts "levels" to  When you need to          Differencing is an explicit option in ARIMA   
difference     "changes"             stationarize a series     modeling and it is implicitly a part of       
                                     with a strong trend       random walk and exponential smoothing         
                                     and/or random-walk        models; therefore you would not manually      
                                     behavior (often useful    difference the input variable (using the      
                                     when fitting regression   DIFF function) when specifying model type as  
                                     models to time series     "random walk" or "exponential smoothing" or   
                                     data)                     "ARIMA"; first difference of LOG(Y) is
                                                               approximately the percentage change in Y
Seasonal       Converts "levels" to  When you need to remove   Seasonal differencing is an explicit option   
difference     "seasonal changes"    the gross features of     in ARIMA modeling; you MUST include a         
                                     seasonality from a        seasonal difference (as a modeling option,    
                                     strongly seasonal series  not an SDIFF transformation of the input      
                                     without going to the      variable)  if the seasonal pattern is         
                                     trouble of estimating     consistent and you wish it to be maintained   
                                     seasonal indices          in long-term forecasts                        
Seasonal       Removes a constant    When you wish to          Adds a lot of parameters to the model--one  
adjustment     seasonal pattern      separate out the          for each season of the year.          
               from a series         seasonal component of a   (In Statgraphics, the seasonal indices        
               (either               series and then fit       are not explicitly shown in the output      
               multiplicative or     what's left with a        of the Forecasting procedure--you must
               additive)             nonseasonal model         separately run the Descriptive Methods        
                                     (regression, smoothing,   procedure to display the seasonal indices.)
                                     or trend line); normally
                                     use the multiplicative                                                  
                                     version unless data has                                                 
                                     been logged                                                             
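
The transformations in the table above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not Statgraphics code: the function names (deflate_fixed_rate, deflate_by_index, diff, log_diff) and the sample series are hypothetical, invented for this example. The diff helper plays the role of DIFF when lag=1 and of SDIFF when lag equals the number of seasons.

```python
import math

def deflate_fixed_rate(y, rate):
    """Deflation at a fixed rate: divide period t by (1 + rate)**t."""
    return [v / (1.0 + rate) ** t for t, v in enumerate(y)]

def deflate_by_index(y, price_index):
    """Deflation by a price index, expressed in dollars of the first period."""
    base = price_index[0]
    return [v * base / p for v, p in zip(y, price_index)]

def diff(y, lag=1):
    """First difference (lag=1) or seasonal difference (lag=number of seasons)."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]

def log_diff(y):
    """First difference of LOG(Y): approximately the fractional change in Y."""
    return diff([math.log(v) for v in y])

# Hypothetical series growing at exactly 5% per period.
y = [100.0 * 1.05 ** t for t in range(6)]

flat = deflate_fixed_rate(y, 0.05)   # trend flattened by the assumed rate
growth = log_diff(y)                 # constant, equal to log(1.05)
```

On this made-up series, deflating at the true growth rate leaves a flat series, and the differenced logs are constant, which is the sense in which logging "linearizes" compound growth without removing the trend.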

Model type     Properties            When to use               Points to keep in mind                        
Random walk    Predicts that "next   As a baseline against     Plot of forecasts looks exactly like a plot   
               period equals this    which to compare more     of the data, except lagged by one period      
               period" (perhaps      elaborate models; when    (and shifted slightly up or down if a growth  
               plus a constant);     applied to logged data,   term is included);  long term forecasts       
               a.k.a. ARIMA(0,1,0)   it is a "geometric"       follow a straight line (horizontal if no      
               model                 random walk--the default  growth term is included); confidence          
                                     model for stock market    intervals for long-term forecasts widen       
                                     data                      according to a square-root law (sideways-     
                                                               parabola shape); logically equivalent to      
                                                               MEAN model fitted to DIFF(Y)                  
Linear trend   Regression of Y on    Rarely the best model     Forecasts follow a straight line whose slope  
               the time index        for forecasting--use      equals the average slope over the whole       
                                     only when you have very   estimation period but whose intercept is      
                                     few data points and no    anchored in the distant past;  short-term     
                                     obvious pattern in data   forecasts therefore may miss badly and        
                                     other than a slight       confidence intervals for long-term forecasts  
                                     trend; can be used in     are usually not reliable; other models that   
                                     conjunction with          extrapolate a linear trend into the future    
                                     seasonal adjustment--but  (random walk with growth, linear exponential  
                                     if you have enough data   smoothing, ARIMA models with 1                
                                     to seasonally adjust,     difference w/constant or 2 differences w/o      
                                     you probably should use   constant) often do a better job by            
                                     another model             "reanchoring" the trend line on recent data   
Simple moving  Simple (equally       When data are in short    Primitive but relatively robust against       
average        weighted) average of  supply and/or highly      outliers and messy data; long-term forecasts  
               recent data           irregular                 are a horizontal line extrapolated from the   
                                                               most recent average; a long-term trend can    
                                                               be incorporated via fixed-rate deflation at   
                                                               an assumed interest rate                      
Simple         Exponentially         When data are nonseasonal Long-term forecasts are a horizontal line     
exponential    weighted average of   (or deseasonalized) and   extrapolated from the most recent smoothed    
smoothing      recent data;          display a time-varying    value;  same as a random walk model without
               "average age" of      mean without a            growth if alpha is near 1 (e.g., 0.9999); forecasts get
               data in forecast      consistent trend          smoother and slower to respond to turning     
               (amount by which                                points as alpha approaches zero; confidence   
               forecasts lag behind                            intervals widen less rapidly than in the      
               turning points) is                              random walk model; a long-term trend can be   
               1/alpha; same as an                             incorporated via fixed-rate deflation at an   
               ARIMA(0,1,1) model                              assumed interest rate or by fitting an     
               without constant                                ARIMA(0,1,1) model with constant              
Linear         Assumes a             When data are nonseasonal Long-term forecasts follow a straight line    
exponential    time-varying linear   (or deseasonalized) and   whose slope is the estimated local trend at   
smoothing      trend as well as a    display time-varying      the end of the series; confidence intervals   
(Brown's or    time-varying level    local trends (usually     for long-term forecasts widen rapidly--the    
Holt's)        (Brown's uses 1       applicable to data that   model assumes that the future is VERY         
               parameter, Holt's     are "smoother" in         uncertain because of time-varying trends;     
               uses separate         appearance--i.e., less    often does not outperform simple exponential  
               smoothing parameters  noisy--than what would    smoothing, even for data with trends,         
               for level and         be well fitted by simple  because extrapolation of time-varying trends  
               trend); essentially   exponential smoothing)    is risky                                      
               an ARIMA(0,2,2)                                                                               
               model without
               constant
Seasonal       Predicts that "next   As a baseline against     Long-term forecasts have same seasonal        
random walk    period equals same    which to compare fancier  pattern as last year; long-term trend is      
               period last year"     seasonal models; as       equal to the average trend over whole past    
               (plus constant); an   foundation for seasonal   history of series; confidence intervals       
               ARIMA(0,0,0)x(0,1,0)  ARIMA models (e.g.,       widen slowly; slow to respond to cyclical     
               model with constant   (1,0,0)x(0,1,1))          upturns and downturns; logically equivalent   
                                                               to MEAN model fitted to SDIFF(Y,s)              
Seasonal       Predicts that change  As a baseline against     Long-term forecasts have same seasonal        
random trend   from this period to   which to compare fancier  pattern as last year; long-term trend is      
               next period will be   seasonal models; as       equal to the most recently observed annual    
               the same as change    foundation for seasonal   trend; confidence intervals widen rapidly;    
               observed at this      ARIMA models (e.g.,       quick to respond to cyclical upturns and      
               time last year; an    (0,1,1)x(0,1,1) without   downturns; logically equivalent to MEAN       
               ARIMA(0,1,0)          constant)                 model fitted to DIFF(SDIFF(Y)) (with no       
               x(0,1,0) model                                  constant--i.e., mean is assumed to be zero)   
               without constant                                                                              
Winters'       Assumes time-varying  When data are trended and Initialization of seasonal indices and joint
seasonal       level, trend, and     seasonal and you wish to  estimation of three smoothing parameters is   
smoothing      seasonal indices      decompose it into local   sometimes tricky--watch to see that           
               (either               level/trend/seasonal      parameter estimates converge and that         
               multiplicative or     factors; normally you     forecasts and confidence intervals look       
               additive              use the multiplicative    reasonable; a popular choice for "automatic"  
               seasonality)          version unless data is    forecasting because it does a little of       
                                     logged                    everything, but has a lot of parameters and   
                                                               sometimes overfits the data or is unstable    
Multiple       A general linear      When data are             Forecasts cannot be extrapolated into the     
regression     forecasting equation  correlated with other     future unless and until values are available  
               involving other       explanatory or causal     for the independent variables; for this       
               variables             variables (e.g., price,   reason the independent variables must often   
                                     advertising, promotions,  be lagged by one or more periods--but when    
                                     interest rates,           only lagged variables are used, a regression  
                                     indicators of general     model may fail to outperform a time series    
                                     economic activity,        model which relies only on the history of     
                                     etc.);  the key is to     the original series; regressions of           
                                     choose the right          nonstationary variables often have high       
                                     variables and the right   "R-squared" but poor performance compared to  
                                     transformations of those  time series models; it often helps to         
                                     variables to justify the  stationarize the dependent variable and/or    
                                     assumption of a linear    add lags of the dependent and independent     
                                     model and to take into    variables to the model; "automatic" model     
                                     account the time          selection techniques such as stepwise         
                                     dimension in the data     regression and all-possible regressions are   
                                                               available, but beware of overfitting; it is   
                                                               important to validate the model by testing    
                                                               it on hold-out data and by computing its      
                                                               "effective R-squared" (percent of variance    
                                                               explained)  relative to a random walk model   
                                                               or other appropriate time series model        
ARIMA          A general class of    When data are relatively  ARIMA models are designed to squeeze all      
               models that includes  plentiful (4 seasons or   autocorrelation out of the original time      
               random walk, random   more) and can be          series; a systematic procedure exists for     
               trend, seasonal and   satisfactorily            identifying the best ARIMA model for any  
               non-seasonal          stationarized by          given time series; features of ARIMA   
               exponential           differencing and other    and multiple regression models can be         
               smoothing, and auto-  mathematical              combined in a natural way;  ARIMA models      
               regressive models;    transformations; when it  often provide a good fit to highly            
               forecasts for the     is not necessary to       aggregated, highly plentiful data; they may   
               stationarized         explicitly separate out   perform relatively less well on               
               dependent variable    the seasonal component    disaggregated, irregular, and/or sparse data  
               are a linear          (if any) in the form of                                                 
               function of lags of   seasonal indices                                                        
               the dependent                                                                                 
               variable and/or lags                                                                          
               of the errors
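
The use of the random walk model as a baseline, and its relation to simple exponential smoothing, can be sketched in plain Python. This is a minimal illustration under stated assumptions rather than a reference implementation: the function names, the choice to initialize the smoothed level at the first observation, and the sample data are all hypothetical.

```python
def random_walk_forecasts(y, drift=0.0):
    """One-step-ahead forecasts for y[1:]: next period = this period (+ drift)."""
    return [v + drift for v in y[:-1]]

def ses_forecasts(y, alpha):
    """One-step-ahead simple exponential smoothing forecasts for y[1:].

    Assumption: the smoothed level is initialized at the first observation.
    """
    level = y[0]
    forecasts = []
    for v in y[:-1]:
        # Update the level with the newest observation, then forecast from it.
        level = alpha * v + (1.0 - alpha) * level
        forecasts.append(level)
    return forecasts

def mse(actual, forecast):
    """Mean squared one-step-ahead forecast error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(forecast)

# Hypothetical noisy series without a strong trend.
y = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
actual = y[1:]

rw_mse = mse(actual, random_walk_forecasts(y))
ses_mse = mse(actual, ses_forecasts(y, alpha=0.5))

# With alpha = 1 the smoothing model reduces to the random walk without growth.
same_as_rw = ses_forecasts(y, alpha=1.0) == random_walk_forecasts(y)
```

Comparing ses_mse with rw_mse on data of your own is one simple way to check whether a more elaborate model actually improves on the baseline; a fairer comparison would use hold-out data not seen when choosing alpha.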