Statistical forecasting:
notes on regression and time series analysis

 

Robert Nau

Fuqua School of Business

Duke University

 


This web site contains notes and materials for an advanced elective course on statistical forecasting that is taught at the Fuqua School of Business, Duke University. It covers linear regression and time series forecasting models as well as general principles of thoughtful data analysis. The time series material is illustrated with output produced by Statgraphics, a statistical software package that is highly interactive and has good features for testing and comparing models, including a parallel-model forecasting procedure that I designed many years ago. The material on multivariate data analysis and linear regression is illustrated with output produced by RegressIt, a free Excel add-in which I also designed. However, these notes are platform-independent. Any statistical software package ought to provide the analytical capabilities needed for the various topics covered here.

If you use Excel in your work or in your teaching to any extent, you should check out the latest release of RegressIt, a free Excel add-in for linear and logistic regression. See it at regressit.com. The linear regression version runs on both PCs and Macs and has a richer and easier-to-use interface and much better-designed output than other add-ins for statistical analysis. It may make a good complement to, if not a substitute for, whatever regression software you are currently using, Excel-based or otherwise. RegressIt is an excellent tool for interactive presentations, online teaching of regression, and development of videos of examples of regression modeling. It includes extensive built-in documentation and pop-up teaching notes as well as some novel features to support systematic grading and auditing of student work on a large scale. There is a separate logistic regression version with highly interactive tables and charts that runs on PCs. RegressIt also now includes a two-way interface with R that allows you to run linear and logistic regression models in R without writing any code whatsoever.

If you have been using Excel's own Data Analysis add-in for regression (the Analysis ToolPak), now is the time to stop. It has not changed since it was first introduced in 1993, and it was a poor design even then. It's a toy (a clumsy one at that), not a tool for serious work. Visit this page for a discussion: What's wrong with Excel's Analysis Toolpak for regression


1.  Get to know your data

Principles and risks of forecasting (pdf)
Famous forecasting quotes
How to move data around
Get to know your data
Inflation adjustment (deflation)
Seasonal adjustment
Stationarity and differencing
The logarithm transformation


2.  Introduction to forecasting: the simplest models

Statistics review and the simplest forecasting model: the sample mean (pdf)
Notes on the random walk model (pdf)
Mean (constant) model
Linear trend model
Random walk model
Geometric random walk model
Three types of forecasts: estimation period, validation period, and the future


3.  Averaging and smoothing models

Notes on forecasting with moving averages (pdf)
Moving average and exponential smoothing models
Slides on inflation and seasonal adjustment and Winters seasonal exponential smoothing
Spreadsheet implementation of seasonal adjustment and exponential smoothing
Equations for the smoothing models (SAS web site)


4.  Linear regression models

Notes on linear regression analysis (pdf)
Introduction to linear regression analysis

Mathematics of simple regression
Regression examples

- Baseball batting averages

- Beer sales vs. price, part 1: descriptive analysis

- Beer sales vs. price, part 2: fitting a simple model

- Beer sales vs. price, part 3: transformations of variables

- Beer sales vs. price, part 4: additional predictors

- NC natural gas consumption vs. temperature

- More regression datasets at regressit.com

What to look for in regression output
What's a good value for R-squared?
What's the bottom line? How to compare models
Testing the assumptions of linear regression
Additional notes on regression analysis
Stepwise and all-possible-regressions
Excel file with simple regression formulas
Excel file with regression formulas in matrix form

Notes on logistic regression (new!)

RegressIt: free Excel add-in for linear and logistic regression and multivariate data analysis


5.  ARIMA models for time series forecasting

Notes on nonseasonal ARIMA models (pdf)
Slides on seasonal and nonseasonal ARIMA models (pdf)
Introduction to ARIMA: nonseasonal models
Identifying the order of differencing
Identifying the orders of AR or MA terms
Estimation of ARIMA models
Seasonal differencing
Seasonal random walk: ARIMA(0,0,0)x(0,1,0)
Seasonal random trend: ARIMA(0,1,0)x(0,1,0)

General seasonal ARIMA models: ARIMA(0,1,1)x(0,1,1) etc.
Summary of rules for identifying ARIMA models
ARIMA models with regressors
The mathematical structure of ARIMA models (pdf)


6.  Choosing the right forecasting model

Steps in choosing a forecasting model
Forecasting flow chart
Data transformations and forecasting models: what to use and when
Automatic forecasting software
Political and ethical issues in forecasting
How to avoid trouble: principles of good data analysis


7.  Statistics resources on the web

Forecasting Principles and Practice (R-based on-line textbook by Rob Hyndman and George Athanasopoulos)
OpenIntro Statistics (David Diez, Christopher Barr, Mine Cetinkaya-Rundel)
Stat 510: Applied Time Series (R-based on-line course at Penn State)
Online StatBook (David Lane)
International Institute of Forecasters links (sites, references, software)
Forecasting Principles web site (J. Scott Armstrong and Kesten Green)
HyperStat Online web site (David Lane)
Institute for Digital Research and Education at UCLA
Statsblogs (links to many blogger sites)
StatPages web site (John Pezzullo)
StatSci web site (Gordon Smyth)
Talk Stats forum
StackExchange Cross-Validated forum

 


(c) 2020, All Rights Reserved. This site receives over 1 million daily visitors per year. The traffic pattern is highly seasonal, with strong day-of-week and academic-calendar effects. Click here for a chart. More than 50% of visitors are from outside the U.S., and 19% of visitors spend more than an hour between the first and last pages viewed on the same day. Last updated on August 18, 2020.
