We will be interested in forecasting *R_t* as a function of lagged information
*Z_{t-1}*. It is logical to start with a linear regression model. Later we
discuss the generalization of this linear model using nonparametric density
estimation techniques.

The linear regression model with a single explanatory variable is:

R_{t} = d_{0}(Z_{0}) + d_{1}(Z_{1,t-1}) + residual_{t} [1]

where *Z_{0}* is a column of ones.

This is often presented as

R_{t} = d_{0} + d_{1}(Z_{1,t-1}) + residual_{t} [2]

Suppose we ran the following regression:

R_{t} = d_{0}(Z_{0}) + residual_{t} [3]

This is a regression on the column of ones. What is *d_{0}*?

d_{0} = INV(Z'Z)Z'R [4]

where

INV(Z'Z) = INV(#obs) = 1/#obs

Z'R = SUM(returns)

Hence, it is obvious that the estimate of *d_{0}* is the average return, SUM(returns)/#obs.
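As a quick numerical check of [3] and [4], the sketch below regresses a short return series on a column of ones and compares the coefficient with the sample average. The return values are made up for illustration, the helper name `ols_coefficients` is hypothetical, and the use of NumPy is an assumption rather than anything these notes prescribe.

```python
import numpy as np

def ols_coefficients(Z, R):
    """Least-squares coefficients INV(Z'Z) Z'R (hypothetical helper)."""
    Z = np.asarray(Z, dtype=float)
    R = np.asarray(R, dtype=float)
    # Solve (Z'Z) d = Z'R instead of forming the inverse explicitly.
    return np.linalg.solve(Z.T @ Z, Z.T @ R)

# Made-up monthly returns, for illustration only.
returns = np.array([0.02, -0.01, 0.03, 0.00, 0.01, -0.02])

# Z_0: a single column of ones, one row per observation.
Z0 = np.ones((len(returns), 1))

d0 = ols_coefficients(Z0, returns)[0]

# Z'Z equals the number of observations, so d_0 is SUM(returns)/#obs.
print(d0, returns.mean())
```

The two printed numbers should agree up to floating-point rounding, which is the sense in which running this regression is the same as pushing the average button.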

Why are we focussing on this trivial regression? Well, the traditional style
of asset management uses average returns (as well as variances and covariances) in
mean-variance optimization. Sometimes, moving-window averages (MA) are used, say
the last five years. In this case, *Z _{0}* would have zeros in the initial
rows and "1"s in the last 60 rows (assuming monthly data is used). Sometimes,
exponentially weighted moving averages (EWMA) are used. Again, we can set
the

What is the R-square of the regression in [3]? Remember, the definition of R-square is the variance of the regression fitted values divided by the variance of the dependent variable. An R-square of 1.0 or 100% implies that the fitted values exactly coincide with the realized returns.

R-square = Var(fitted)/Var(R) = Var(d_{0})/Var(R) = 0

The R-square is zero. Why? The variance of a constant, *d_{0}*, is zero.
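The R-square calculation for the constant-only regression can be sketched in a few lines. This is a minimal illustration using the definition given in the text (variance of fitted values over variance of the dependent variable); the return values are made up and the function name `r_square` is hypothetical.

```python
import numpy as np

def r_square(fitted, realized):
    """R-square as Var(fitted) / Var(realized) (hypothetical helper)."""
    return np.var(fitted) / np.var(realized)

# Made-up monthly returns, for illustration only.
returns = np.array([0.02, -0.01, 0.03, 0.00, 0.01, -0.02])

# In the regression on a column of ones, every fitted value is the average.
d0 = returns.mean()
fitted = np.full_like(returns, d0)

r2_constant = r_square(fitted, returns)
print(r2_constant)
```

Because every fitted value is the same number, the numerator is zero (up to floating-point rounding) whatever the return series looks like.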

Another way of looking at this exercise is to note that those using this
style of model are assuming that no other *Z* variable influences
future returns. In fact, in running this special regression (and, indeed,
you do not need to run a regression, you simply need to push the average button),
they are assuming that *d_{1}* and the other coefficients are zero.

Using the average as a forecast forces the asset manager to implement a strategy with a zero R-square. This is not necessarily a desirable strategy. Indeed, it implies that no other information affects expected returns. It implies that expected returns are constant (at least over the 60-month window of the MA).
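The 60-month moving-window case can be checked with the same regression logic. The sketch below assumes 120 months of simulated returns (random numbers, not real data) and builds *Z_{0}* with zeros in the initial rows and ones in the last 60 rows; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated monthly returns, for illustration only (not real data).
returns = rng.normal(loc=0.01, scale=0.04, size=120)

window = 60

# Z_0: zeros in the initial rows, ones in the last `window` rows.
Z0 = np.zeros(len(returns))
Z0[-window:] = 1.0

# With a single regressor, INV(Z'Z) Z'R reduces to (Z'R) / (Z'Z).
d0_window = (Z0 @ returns) / (Z0 @ Z0)

# The coefficient should match the plain average of the last 60 months.
print(np.isclose(d0_window, returns[-window:].mean()))
```

Here Z'Z counts only the rows with a one, so the coefficient is the sum of the last 60 returns divided by 60.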

Using a more general regression model, we can incorporate predictability. We can execute statistical tests to ensure that the predictability is genuine rather than an artifact of data snooping. The Research Protocol details procedures that avoid potential misspecification.
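To make the contrast with the constant-only case concrete, here is a minimal sketch of regression [2] with one lagged predictor. The data are simulated with predictability built in by construction (the intercept 0.005, slope 0.02 and noise level 0.04 are arbitrary assumptions), so the output says nothing about real returns; it only shows the mechanics of fitting *d_{0}* and *d_{1}* and computing the R-square.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 240  # number of simulated months (arbitrary choice)

# Simulated predictor Z_1; returns at t depend on its value at t-1.
z = rng.normal(size=n + 1)
z_lag = z[:-1]  # Z_{1,t-1}, aligned with R_t
returns = 0.005 + 0.02 * z_lag + rng.normal(scale=0.04, size=n)

# Regressors: a column of ones (Z_0) and the lagged predictor (Z_1).
X = np.column_stack([np.ones(n), z_lag])

# Least-squares estimates of d_0 and d_1.
d, *_ = np.linalg.lstsq(X, returns, rcond=None)

# R-square as Var(fitted) / Var(R), the definition used in the text.
fitted = X @ d
r2_general = np.var(fitted) / np.var(returns)

print(d, r2_general)
```

With a nonzero *d_{1}*, the fitted values vary over time, so the R-square is strictly positive. Whether such predictability is genuine in real data is a separate question for the statistical tests mentioned above.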