FORECAST PRO EXAMPLE #2: AUTOSALE revisited.

First, as a benchmark, here is a battery of models fitted in the Forecasting procedure in Statgraphics. Models A and B are the simple and linear exponential smoothing models with multiplicative seasonal adjustment. A fixed-rate inflation adjustment of 0.5% per period has been added to the simple exponential smoothing model to put some long-term trend into its forecasts and to reduce error heteroscedasticity. The Winters seasonal smoothing model has been specified as model C. Two ARIMA models fitted to the logged data are included as models D and E.
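
The mechanics of the inflation adjustment are simple: the series is deflated at the fixed rate before the model is fitted, and the forecasts are reflated at the same rate afterward. Here is a minimal sketch in Python (my own illustration of the idea, not Statgraphics' actual code; the exact timing convention behind "at end of period" may differ):

    import numpy as np

    RATE = 0.005  # 0.5% per period

    def deflate(y, rate=RATE):
        # Divide out fixed per-period inflation so the model sees a flatter,
        # more homoscedastic series.
        t = np.arange(len(y))
        return y / (1.0 + rate) ** t

    def reflate(forecasts, n_obs, rate=RATE):
        # Put the inflation trend back into forecasts made from the deflated series.
        h = np.arange(1, len(forecasts) + 1)
        return forecasts * (1.0 + rate) ** (n_obs - 1 + h)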

Notice that all the models perform quite similarly in the estimation period, and models A-D are very similar in the validation period, with model E lagging noticeably behind. Interestingly, the ARIMA models did relatively better in terms of percentage error because, being fitted in log units, they were effectively optimized for percentage error rather than absolute error.
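
The connection between log units and percentage error is the approximation ln(Y) - ln(F) = (Y - F)/F for small errors: minimizing squared error in logs is nearly the same as minimizing squared percentage error. A quick numerical check of the approximation (my own illustration):

    import numpy as np

    actual, forecast = 20.0, 19.0
    log_error = np.log(actual) - np.log(forecast)  # error in log units:  0.0513
    pct_error = (actual - forecast) / forecast     # percentage error:    0.0526
    print(log_error, pct_error)                    # nearly identical for small errors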

Model Comparison
----------------
Data variable: AUTOSALE
Number of observations = 281
Start index = 1/70            
Sampling interval = 1.0 month(s)
Length of seasonality = 12
Number of periods withheld for validation: 36

Models
------
(A) Simple exponential smoothing with alpha = 0.4021
    Inflation adjustment: 0.5 percent at end of period
    Seasonal adjustment: Multiplicative
(B) Brown's linear exp. smoothing with alpha = 0.1375
    Seasonal adjustment: Multiplicative
(C) Winter's exp. smoothing with alpha = 0.3117, beta = 0.0001, gamma = 0.2495
(D) ARIMA(0,1,1)x(0,1,1)12
    Math adjustment: Natural log
(E) ARIMA(1,0,1)x(0,1,1)12 with constant
    Math adjustment: Natural log

Estimation Period
Model  MSE          MAE          MAPE         ME           MPE
------------------------------------------------------------------------
(A)    1.40991      0.770647     4.70927      0.0662132    0.276835     
(B)    1.41955      0.791537     5.02856      0.0248242    0.282996     
(C)    1.53718      0.842734     5.12773      0.0154446    -0.454623    
(D)    1.58972      0.823505     4.77485      -0.0302983   -0.279636    
(E)    1.56714      0.819359     4.71859      0.0307884    0.143125     

Model  RMSE         RUNS  RUNM  AUTO  MEAN  VAR
-----------------------------------------------
(A)    1.1874        OK    OK    ***   OK   ***  
(B)    1.19145       OK    ***   ***   OK   ***  
(C)    1.23983       OK    *     OK    OK   ***  
(D)    1.26084       OK    OK    *     OK   **   
(E)    1.25185       OK    OK    *     *    **   

Validation Period
Model  MSE          MAE          MAPE         ME           MPE
------------------------------------------------------------------------
(A)    1.09317      0.947184     2.94464      -0.12697     -0.532003    
(B)    1.23223      0.951346     2.94155      0.234821     0.57216      
(C)    1.35863      0.959531     2.97817      0.00353942   -0.144202    
(D)    1.15307      0.912145     2.82051      -0.144336    -0.571749    
(E)    2.20643      1.19816      3.76333      -1.06931     -3.37841     

Here are the estimated coefficients of the ARIMA(0,1,1)x(0,1,1)12 model:

                            ARIMA Model Summary
Parameter           Estimate        Stnd. Error     t               P-value
----------------------------------------------------------------------------
MA(1)               0.50284         0.0570327       8.8167          0.000000
SMA(1)              0.896011        0.0243729       36.7626         0.000000
----------------------------------------------------------------------------
Backforecasting: yes
Estimated white noise variance = 0.0040629 with 266 degrees of freedom
Estimated white noise standard deviation = 0.0637409
Number of iterations: 7
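
For reference, in backshift notation the fitted equation for the logged series is

    (1 - B)(1 - B^12) ln(Y_t) = (1 - 0.503B)(1 - 0.896B^12) e_t

where B is the backshift (lag) operator and e_t is white noise with standard deviation of about 0.064--i.e., one nonseasonal and one seasonal difference on the left, and the estimated MA(1) and SMA(1) coefficients on the right.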


Now let's fit the same series using the built-in expert in Forecast Pro:

Forecast Pro for Windows Standard Edition Version 2.00
Sat Oct 05 17:12:33 1996

Expert data exploration of dependent variable AUTOSALE
---------------------------------------------------------------------
Length 245  Minimum 4.485  Maximum 37.068
Mean 16.781 Standard deviation 8.770

Classical decomposition (multiplicative)
    Trend-cycle: 96.53%  Seasonal: 2.55%  Irregular: 0.92%

Log transform recommended for Box-Jenkins.

There are no strongly significant regressors, so I will choose
a univariate method.

Exponential smoothing outperforms Box-Jenkins by 1.754 to 1.939
out-of-sample (MAD). I tried 78 forecasts up to a maximum horizon 12.
For Box-Jenkins, I used a log transform.

Out-of-sample forecast errors are used by Forecast Pro to compare models of different types during the automatic data exploration phase. The 78 forecasts are made on a rolling basis over a 12-month period, yielding 12 one-step-ahead forecasts, 11 two-step-ahead forecasts, etc., for a total of 78 forecasts at various horizons within the same year. This is generally a good way to compare models, but keep in mind that it focuses on only a single year: some models may get luckier than others in that year!
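
Here is a minimal sketch of such a rolling evaluation in Python (my own reconstruction of the general idea, not Forecast Pro's actual procedure; fit_and_forecast is a hypothetical stand-in for whatever model is being compared):

    import numpy as np

    def rolling_mad(y, fit_and_forecast, season=12):
        # Hold out the last `season` observations; refit at each forecast origin
        # and forecast all the way to the end of the holdout.
        n, errors = len(y), []
        for origin in range(n - season, n):  # 12 forecast origins
            forecasts = fit_and_forecast(y[:origin], horizon=n - origin)
            errors.extend(np.abs(y[origin:] - np.asarray(forecasts)))
        return np.mean(errors)  # MAD over 12 + 11 + ... + 1 = 78 forecasts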

Series is trended and seasonal.

Recommended model: Exponential smoothing

OK--let's use exponential smoothing with the "automatic" estimation option. Also, let's use 36 months for the validation period, as we did in Statgraphics.

Forecast Model for AUTOSALE
Automatic model selection
Multiplicative Winters: Linear trend, Multiplicative seasonality
Confidence limits proportional to indexes and level

                   Smoothing     Final
Component           Weight       Value
--------------------------------------
Level              0.29731      32.557
Trend              0.01889     0.12847
Seasonal           0.22900      1.1080

Seasonal Indexes
----------------------------------------------------------
January - March            0.87334     0.89532     1.06630
April - June               1.05658     1.10797     1.11531
July - September           1.06250     1.07803     1.01079
October - December         0.96825     0.90960     0.90090 
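
For reference, here is a sketch of the standard multiplicative Winters recursions that these smoothing weights plug into (textbook form; Forecast Pro's initialization and exact variant are not documented, so this is only an approximation of what it is doing):

    def winters_update(y_t, level, trend, seas_old,
                       alpha=0.29731, beta=0.01889, gamma=0.22900):
        # One step of multiplicative Holt-Winters; seas_old is the seasonal
        # index for this month from one year earlier.
        new_level = alpha * (y_t / seas_old) + (1 - alpha) * (level + trend)
        new_trend = beta * (new_level - level) + (1 - beta) * trend
        new_seas = gamma * (y_t / new_level) + (1 - gamma) * seas_old
        return new_level, new_trend, new_seas

    # h-step-ahead forecast (h <= 12): (level + h*trend) * (seasonal index for that month)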

Standard Diagnostics
-------------------------------------------------------------
Sample size 245                  Number of parameters 3
Mean 16.78                       Standard deviation 8.788
R-square 0.9816                  Adjusted R-square 0.9814
Durbin-Watson 1.91               Ljung-Box(18)=20.49 P=0.6942
Forecast error 1.198             BIC 1.232 (Best so far)
MAPE 0.04891                     RMSE 1.191
MAD 0.7953                      

The Ljung-Box statistic is a test of the joint significance of the first n residual autocorrelations--i.e., a test for the "total autocorrelation" in the residuals--where n=18 lags were used here. A "good" value of this statistic is a number not much larger than n, and the P-value as reported here (a cumulative probability) ought to be less than 0.95. This test shows that the overall amount of autocorrelation in the residuals is acceptably low.
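
For the record, here is how the statistic can be computed from the residuals (the standard Ljung-Box formula; note that the P printed by Forecast Pro is evidently the cumulative chi-squared probability, which is why the rule of thumb is "less than 0.95" rather than "greater than 0.05"):

    import numpy as np
    from scipy.stats import chi2

    def ljung_box(residuals, n_lags=18):
        e = np.asarray(residuals) - np.mean(residuals)
        N = len(e)
        lags = np.arange(1, n_lags + 1)
        # Residual autocorrelations r_1 ... r_18
        r = np.array([np.sum(e[k:] * e[:-k]) for k in lags]) / np.sum(e ** 2)
        Q = N * (N + 2) * np.sum(r ** 2 / (N - lags))
        # Strictly, df should be reduced by the number of fitted ARMA parameters.
        return Q, chi2.cdf(Q, df=n_lags)  # statistic and cumulative probability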

Rolling simulation results
                    Cumulative        Cumulative
  H   N       MAD      Average   MAPE    Average
---------------------------------------------------------------------
  1  36      0.974      0.974    0.030    0.030
  2  35      0.982      0.978    0.031    0.031
  3  34      1.074      1.009    0.034    0.032
  4  33      1.237      1.064    0.039    0.033
  5  32      1.309      1.110    0.041    0.035
  6  31      1.554      1.178    0.049    0.037
  7  30      1.697      1.246    0.053    0.039
  8  29      1.742      1.301    0.053    0.041
  9  28      1.843      1.354    0.055    0.042
 10  27      1.960      1.406    0.058    0.043
 11  26      2.020      1.453    0.060    0.045
 12  25      2.147      1.500    0.064    0.046
 13  24      2.226      1.545    0.067    0.047
 14  23      2.188      1.580    0.066    0.048
 15  22      2.308      1.617    0.069    0.049
 16  21      2.336      1.650    0.070    0.050
 17  20      2.419      1.683    0.072    0.051
 18  19      2.496      1.714    0.074    0.052
 19  18      2.518      1.742    0.074    0.053
 20  17      2.470      1.765    0.071    0.053
 21  16      2.569      1.789    0.073    0.054
 22  15      2.582      1.810    0.073    0.054
 23  14      2.413      1.825    0.068    0.055
 24  13      2.360      1.837    0.066    0.055
 25  12      2.168      1.843    0.060    0.055
 26  11      1.857      1.844    0.050    0.055
 27  10      1.766      1.842    0.048    0.055
 28   9      1.223      1.833    0.033    0.055
 29   8      0.844      1.821    0.023    0.054
 30   7      0.785      1.810    0.023    0.054
 31   6      0.832      1.801    0.023    0.054
 32   5      0.843      1.794    0.024    0.053
 33   4      0.970      1.789    0.027    0.053
 34   3      0.958      1.785    0.025    0.053
 35   2      0.663      1.781    0.017    0.053
 36   1      1.194      1.781    0.030    0.053


Notice that the estimated coefficients are similar but not identical to those found by Statgraphics--with nonlinear estimation, it's not guaranteed that you will get exactly the same results, and the two programs may perform backforecasting or otherwise initialize the models in different ways. (Fortunately, Forecast Pro prints out the final estimates of the seasonal indices, unlike Statgraphics. Neither program reveals the starting values it uses, which is one reason the Winters model is a bit of a black box.) The first row of the "rolling simulation results"--which shows the 1-step-ahead error statistics in the validation period--agrees reasonably closely with what we obtained in Statgraphics (MAE=0.96 in SG versus MAD=0.974 here).

This model gets very lucky with its long-term forecasts near the end of the validation period, as the forecast chart (not reproduced here) shows.


Now let's try Box-Jenkins with a log transform. (We need to return to the "tableau" to add the log transformation.) We'll start by using the "automatic" estimation option here too.

Forecast Model for AUTOSALE (Log transform)
Automatic model selection
ARIMA(0,1,1)*(2,0,3)

Term          Coefficient  Std. Error  t-Statistic  Significance
---------------------------------------------------------------------
b[1]             0.4733       0.0571       8.2876       1.0000
A[12]            0.4403       0.2779       1.5844       0.8869
A[24]            0.5578       0.2776       2.0092       0.9555
B[12]            0.1197       0.2725       0.4393       0.3396
B[24]            0.5286       0.2115       2.4995       0.9876
B[36]            0.2467       0.0693       3.5622       0.9996

Embedded insignificant AR terms -- consider dynamic regression. 

Yikes--what happened here? How did we end up with SAR=2 and SMA=3 with NO seasonal difference? At first glance this model appears VERY strange, but upon closer inspection some interesting facts emerge. Notice that the two SAR coefficients add up to 0.9981--i.e., almost exactly 1.0. This means there is a UNIT ROOT IN THE SAR PART OF THE MODEL!! Effectively the model is performing a seasonal difference after all--it's just buried in the SAR part of the model. What we ought to do (minimally) is add a seasonal difference while reducing SAR from 2 to 1--and I would try reducing the orders of the seasonal terms even further. (I suspect the "expert system" in this program is trying too hard to eliminate the stubborn autocorrelation we saw around the seasonal period when we fitted the series in Statgraphics.) But let's continue for now...
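
You can check the unit-root claim directly by finding the roots of the SAR polynomial 1 - 0.4403z - 0.5578z^2 (written in terms of z = B^12); one root sits essentially on the unit circle:

    import numpy as np

    # SAR polynomial in z = B^12, coefficients from the output above,
    # highest power first for np.roots
    roots = np.roots([-0.5578, -0.4403, 1.0])
    print(roots)  # approximately [-1.791, 1.001]: the root at ~1.0 is a seasonal unit root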

Standard Diagnostics
------------------------------------------------------------
Sample size 244                  Number of parameters 6
Mean 2.682                       Standard deviation 0.5428
R-square 0.9876                  Adjusted R-square 0.9873
Durbin-Watson 1.894              Ljung-Box(18)=28.8 P=0.9491
Forecast error 0.06107           BIC 0.9432 (Best so far)
MAPE 0.04579                     RMSE 1.165
MAD 0.7542                      

Rolling simulation results
                    Cumulative        Cumulative
  H   N       MAD      Average   MAPE    Average
---------------------------------------------------------------------
  1  36      0.893      0.893    0.028    0.028
  2  35      0.890      0.892    0.028    0.028
  3  34      0.922      0.902    0.030    0.028
  4  33      1.119      0.954    0.036    0.030
  5  32      1.080      0.977    0.035    0.031
  6  31      1.276      1.023    0.041    0.033
  7  30      1.389      1.071    0.044    0.034
  8  29      1.442      1.112    0.045    0.035
  9  28      1.531      1.153    0.046    0.036
 10  27      1.644      1.195    0.049    0.037
 11  26      1.728      1.236    0.051    0.038
 12  25      1.907      1.281    0.057    0.040
 13  24      1.961      1.323    0.059    0.041
 14  23      1.957      1.359    0.060    0.042
 15  22      2.109      1.397    0.064    0.043
 16  21      2.221      1.435    0.067    0.044
 17  20      2.387      1.475    0.072    0.045
 18  19      2.483      1.513    0.075    0.046
 19  18      2.477      1.547    0.073    0.047
 20  17      2.497      1.578    0.073    0.048
 21  16      2.676      1.610    0.077    0.049
 22  15      2.601      1.636    0.074    0.050
 23  14      2.412      1.655    0.068    0.050
 24  13      2.515      1.674    0.071    0.051
 25  12      2.370      1.688    0.066    0.051
 26  11      2.311      1.699    0.064    0.051
 27  10      2.380      1.710    0.068    0.051
 28   9      2.214      1.717    0.064    0.052
 29   8      1.991      1.721    0.059    0.052
 30   7      2.131      1.725    0.065    0.052
 31   6      2.267      1.730    0.066    0.052
 32   5      2.578      1.737    0.073    0.052
 33   4      3.080      1.745    0.084    0.052
 34   3      3.029      1.751    0.078    0.052
 35   2      2.699      1.754    0.068    0.053
 36   1      4.035      1.757    0.101    0.053

This model actually outperforms the Winters model in its 1-step-ahead forecasts in the validation period, but it doesn't get quite as lucky with its long-term forecasts. (Its Ljung-Box statistic is also a little worse.) The confidence intervals widen fairly rapidly because of the effect of the log transformation.
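
To see why, note that a symmetric interval of plus-or-minus 2 standard errors in log units becomes a multiplicative--hence asymmetric and widening--interval when exponentiated back to original units. A quick illustration using the 1-step log-scale forecast error of about 0.061 from the output above (the h-step standard error grows with h, and the point forecast of 25 is hypothetical):

    import numpy as np

    point_log, se_log = np.log(25.0), 0.061
    lower = np.exp(point_log - 2 * se_log)  # about 25 / 1.13 = 22.1
    upper = np.exp(point_log + 2 * se_log)  # about 25 * 1.13 = 28.2
    print(lower, upper)  # asymmetric around 25, and the factor grows with the horizon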



Now here's one of the more "standard" seasonal ARIMA models that we identified earlier in Statgraphics:

Forecast Model for AUTOSALE (Log transform)
ARIMA(0,1,1)*(0,1,1)

Term          Coefficient  Std. Error  t-Statistic  Significance
---------------------------------------------------------------------
b[1]             0.4981       0.0575       8.6632       1.0000
B[12]            0.8896       0.0287      30.9505       1.0000 

Standard Diagnostics
-------------------------------------------------------------
Sample size 232                  Number of parameters 2
Mean 2.733                       Standard deviation 0.5058
R-square 0.985                   Adjusted R-square 0.9849
Durbin-Watson 1.896              Ljung-Box(18)=24.39 P=0.8575
Forecast error 0.06218           BIC 0.9751
MAPE 0.04774                     RMSE 1.252
MAD 0.8215                      

Rolling simulation results
                    Cumulative        Cumulative
  H   N       MAD      Average   MAPE    Average
---------------------------------------------------------------------
  1  36      0.907      0.907    0.028    0.028
  2  35      0.947      0.927    0.030    0.029
  3  34      0.974      0.942    0.031    0.030
  4  33      1.207      1.005    0.039    0.032
  5  32      1.209      1.044    0.039    0.033
  6  31      1.379      1.095    0.044    0.035
  7  30      1.544      1.154    0.049    0.037
  8  29      1.551      1.198    0.048    0.038
  9  28      1.644      1.241    0.050    0.039
 10  27      1.814      1.290    0.054    0.040
 11  26      1.820      1.331    0.055    0.042
 12  25      1.931      1.372    0.058    0.043
 13  24      1.964      1.408    0.060    0.044
 14  23      1.819      1.431    0.057    0.044
 15  22      2.037      1.462    0.063    0.045
 16  21      2.151      1.493    0.066    0.046
 17  20      2.204      1.523    0.068    0.047
 18  19      2.304      1.553    0.070    0.048
 19  18      2.246      1.578    0.067    0.049
 20  17      2.332      1.602    0.068    0.049
 21  16      2.425      1.626    0.069    0.050
 22  15      2.518      1.650    0.071    0.051
 23  14      2.526      1.671    0.071    0.051
 24  13      2.510      1.690    0.071    0.052
 25  12      2.417      1.704    0.068    0.052
 26  11      2.566      1.720    0.074    0.052
 27  10      2.769      1.737    0.081    0.053
 28   9      2.596      1.749    0.077    0.053
 29   8      2.655      1.760    0.079    0.053
 30   7      3.202      1.776    0.095    0.054
 31   6      3.329      1.790    0.096    0.054
 32   5      3.646      1.804    0.103    0.055
 33   4      4.122      1.818    0.112    0.055
 34   3      4.173      1.829    0.107    0.055
 35   2      4.019      1.836    0.101    0.055
 36   1      5.210      1.841    0.130    0.055


Notice that its 1-period-ahead error statistics agree reasonably well with what was obtained in Statgraphics (MAD=0.907 here versus MAE=0.912 in SG). The estimated coefficients are close too (b[1]=0.498 and B[12]=0.890 here versus MA(1)=0.503 and SMA(1)=0.896 in SG). This model is a little less lucky in its long-term forecasts, but its average performance in the validation period is not much different from those of the previous models--and it is MUCH simpler!



Finally, here's the other ARIMA model we tried in Statgraphics, which substitutes an AR(1) term for the nonseasonal difference and thereby estimates the long-term trend in the series rather than the local trend:

Forecast Model for AUTOSALE (Log transform)
ARIMA(1,0,1)*(0,1,1)

Term          Coefficient  Std. Error  t-Statistic  Significance
---------------------------------------------------------------------
a[1]             0.9198       0.0326      28.2486       1.0000
b[1]             0.4221       0.0735       5.7396       1.0000
B[12]            0.8924       0.0271      32.9531       1.0000
_CONST           0.0072       0.0030       2.3964       0.9834 
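
Under the usual convention for an ARMA model with a constant term (assuming _CONST is the intercept of the equation for the seasonally differenced logged series--Forecast Pro's parameterization is not spelled out, so this is an inference), the implied long-run mean of the annual growth rate in logs is _CONST/(1 - a[1]):

    # Implied long-run annual log growth = constant / (1 - AR(1) coefficient)
    const, ar1 = 0.0072, 0.9198
    print(const / (1 - ar1))  # about 0.090, i.e., roughly 9% per year

In other words, the model anchors its forecasts to a long-term growth trend of roughly 9% per year.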

Standard Diagnostics
-------------------------------------------------------------
Sample size 233                  Number of parameters 3
Mean 2.729                       Standard deviation 0.5097
R-square 0.9857                  Adjusted R-square 0.9856
Durbin-Watson 1.938              Ljung-Box(18)=24.78 P=0.8688
Forecast error 0.06123           BIC 0.9647
MAPE 0.04699                     RMSE 1.237
MAD 0.8149                      

Rolling simulation results
                    Cumulative        Cumulative
  H   N       MAD      Average   MAPE    Average
---------------------------------------------------------------------
  1  36      1.207      1.207    0.038    0.038
  2  35      1.683      1.442    0.053    0.045
  3  34      2.160      1.674    0.068    0.053
  4  33      2.694      1.918    0.085    0.060
  5  32      3.143      2.149    0.098    0.067
  6  31      3.595      2.372    0.112    0.074
  7  30      4.033      2.587    0.125    0.081
  8  29      4.374      2.787    0.134    0.087
  9  28      4.699      2.973    0.143    0.092
 10  27      5.027      3.149    0.151    0.097
 11  26      5.257      3.309    0.158    0.102
 12  25      5.519      3.460    0.166    0.106
 13  24      5.931      3.612    0.178    0.111
 14  23      6.148      3.754    0.185    0.115
 15  22      6.443      3.890    0.194    0.119
 16  21      6.637      4.016    0.200    0.123
 17  20      6.861      4.136    0.206    0.126
 18  19      7.101      4.250    0.212    0.129
 19  18      7.310      4.357    0.216    0.132
 20  17      7.485      4.457    0.217    0.135
 21  16      7.771      4.554    0.223    0.138
 22  15      8.058      4.648    0.229    0.140
 23  14      8.090      4.732    0.230    0.142
 24  13      8.194      4.808    0.233    0.144
 25  12      8.296      4.878    0.236    0.146
 26  11      8.402      4.942    0.241    0.148
 27  10      8.633      5.001    0.249    0.150
 28   9      8.555      5.052    0.246    0.151
 29   8      8.759      5.098    0.252    0.152
 30   7      9.081      5.142    0.262    0.153
 31   6      9.190      5.179    0.260    0.154
 32   5      9.571      5.212    0.266    0.155
 33   4     10.061      5.242    0.271    0.156
 34   3     10.389      5.265    0.265    0.156
 35   2     10.192      5.280    0.256    0.157
 36   1     11.312      5.289    0.283    0.157

Despite its good intentions in trying to estimate the long-term trend in a more stable manner than the other models, this model gets embarrassed in the validation period because it fails to respond to the cyclical downturn which happens to occur right at the end of the estimation period (around the beginning of '91). The other models picked this up in one way or another and therefore were luckier in their long-term forecasts for the next three years. ("The race is not always to the swift nor the battle to the strong....") This model's 1-step-ahead performance is not too bad, though.