Tactical Global Asset Allocation
Research Protocol
Campbell R. Harvey
In order to maximize
the chance of successful model building, a strict research protocol should
be set in place. This discipline involves
seven steps.
1. Specifying the Problem
In this step, the problem is defined. In most cases, a forecasting problem is
considered. A list of candidate explanatory variables is formulated. It is important
that this list be developed in advance of the data gathering process (to minimize
the impact of data snooping). Candidate variables should have economic meaning.
For example, no forecasting model should contain variables that have no economic
foundation (e.g. sunspots, Yankee batting averages and hemlines). In addition,
a discipline should be developed in advance for the appropriate lag structure.
No model should be presented, say, with the 5th and 17th lag of an explanatory variable.
This smacks of data snooping. If a lag structure beyond 1 is needed, the justification
must be developed in advance. Detailed notes of this process must be taken. An appendix
to each research presentation must list the variables which were prespecified in
stage one.
2. Data Collection
In this step, the data are obtained. It is important that the data are examined
for potential errors. Data services like DATASTREAM are notorious for having
data incorrectly keyed in. I recommend graphing each series. The eyeball is
a powerful filter. In addition, when emerging markets are being considered
extra attention needs to be paid to how the stock return is constructed (does
it properly account for rights issues and dividends?). Even well-known data,
like those from the International Finance Corporation, are riddled with errors.
Importantly, I usually require a two-holdout-sample
methodology. For example, suppose we are forecasting U.K. equity
returns. We obtain data from, say, 1970-1993. Our model would be fit
over a shorter sample, say 1970-1990. Our model will be validated over the
1991-93 sample (36 observations). An alternative strategy which has advantages
when low-frequency data are used is to randomly hold out 60 months. That is,
use the data from 1970-1993 and randomly select 60 noncontiguous
months for a holdout sample.
After the validation process has been documented
and the final model selected, the 1994 and 1995 data are obtained. A second validation is executed
with this fresh data.
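
To make the scheme concrete, here is a minimal sketch in Python (the simulated
return series and the random seed are illustrative assumptions; the date ranges
follow the U.K. example above):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated monthly U.K. equity returns, 1970-1993 (288 months).
months = np.arange(288)
returns = rng.normal(0.01, 0.05, size=288)

# Contiguous split: fit on 1970-1990 (252 months), validate on 1991-93 (36 months).
fit_idx, holdout_idx = months[:252], months[252:]

# Alternative for low-frequency data: randomly hold out 60 noncontiguous months.
random_holdout = rng.choice(months, size=60, replace=False)
random_fit = np.setdiff1d(months, random_holdout)

print(len(fit_idx), len(holdout_idx), len(random_fit), len(random_holdout))
```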
3. Estimation
In this step, the initial models are estimated. Linear and nonlinear models
should be estimated with the most general econometric methodologies, such
as the Generalized Method of Moments (GMM).
[See, for example, Harvey P3 and Harvey and Kirby W15.]
The GMM is ideally suited to this kind of forecasting problem. The method handles
linear and nonlinear problems, and it provides test statistics which are robust
to departures from the traditional distributional assumptions. However, some
preliminary analysis could be done with ordinary least squares (OLS). It should
be noted that least squares is just a special case of the GMM.
The use of GMM allows for rigorous hypothesis testing of the model specifications.
All tests should be heteroskedasticity-consistent and robust to potential
moving-average errors induced in the data (this structure could be particularly
important if raw transactions, which bounce between bid and ask, are being
modeled, or in index situations which might suffer from infrequent trading).
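
As a rough illustration of such robust testing (a sketch on simulated data,
using Python's statsmodels as one possible tool; the protocol itself does not
prescribe software), heteroskedasticity- and autocorrelation-consistent
(Newey-West) standard errors can be requested directly. Full GMM estimation
generalizes this special case:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 252

# One prespecified explanatory variable plus a constant.
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n) * (1 + 0.5 * np.abs(x))  # heteroskedastic errors

X = sm.add_constant(x)
# HAC (Newey-West) covariance: robust to heteroskedasticity and to
# moving-average errors such as those induced by bid-ask bounce or
# infrequent trading in an index.
res = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 3})
print(res.summary())
```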
"Stepwise" regression or GMM is not desirable. A small number of models should be specified in
advance (step 1) and the final in-sample selection can be made using a variety of evaluation
methodologies. The final selection should contain about three models. These
models are chosen on the basis of traditional criteria such as adjusted
R-square, Akaike Information Criterion, and Schwarz Criterion.
All of these methods minimize squared errors and
penalize models which have extra variables (reward parsimony). Parsimony
is critical because it is an important determinant of out-of-sample performance.
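
For instance, the three criteria can be read off fitted models directly. In
this hypothetical comparison on simulated data, the model carrying an
irrelevant extra regressor is typically penalized by all three:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 252
x1, x2 = rng.normal(size=n), rng.normal(size=n)  # x2 is irrelevant noise
y = 0.5 * x1 + rng.normal(size=n)

for cols in ([x1], [x1, x2]):
    X = sm.add_constant(np.column_stack(cols))
    res = sm.OLS(y, X).fit()
    print(f"parameters={X.shape[1]}  adj-R2={res.rsquared_adj:.3f}  "
          f"AIC={res.aic:.1f}  SC={res.bic:.1f}")
```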
The data snooping problem should be recognized at all stages of the research.
With 20 randomly selected explanatory variables, one variable should enter the
regression by chance with a t-statistic greater than 2.0. The prespecification
of the variables helps minimize the snooping problem.
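
A quick Monte Carlo check, with pure noise on both sides of the regression,
illustrates the magnitude of the problem (the sample size and number of trials
are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, trials = 252, 20, 1000
spurious = 0

for _ in range(trials):
    X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
    y = rng.normal(size=n)                      # y is unrelated to every regressor
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (n - k - 1)
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    spurious += np.sum(np.abs(beta[1:] / se[1:]) > 2.0)

# About one of the 20 noise variables clears |t| > 2 in each regression.
print(spurious / trials)
```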
4. Validation
Each model must be validated on an out-of-sample basis. Using the holdout samples,
the model performance is assessed using the same metrics. This
helps identify the model which will be promoted.
The validation stage is clearly the most important. It potentially eliminates
the data snooping problem. That is, if variables have been snooped to maximize,
say, the R-square, it is unlikely that these variables will perform well on an
out-of-sample basis. The validation will also eliminate forecasting models
which contain too many variables. The overparameterized models will surely fail.
Finally, the validation stage is important for identifying potential instability
in the parameter estimates. One might have the right explanatory variables but
the wrong functional form. While some of this problem should have been resolved
in the estimation stage, any residual issues should be evident in the validation
stage.
With the final model or models, the final part of the data is obtained for the
second validation. This dataset is free of any possible snooping and provides a
clean out-of-sample test.
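
A minimal sketch of the holdout evaluation (the metrics shown and the simulated
forecasts are illustrative assumptions, not part of the protocol): each finalist
model is scored on the untouched holdout with the same criteria used in sample.

```python
import numpy as np

def oos_metrics(y_true, y_pred):
    """Out-of-sample mean squared error and directional hit rate."""
    mse = np.mean((y_true - y_pred) ** 2)
    hit_rate = np.mean(np.sign(y_true) == np.sign(y_pred))
    return mse, hit_rate

rng = np.random.default_rng(4)
y_holdout = rng.normal(0.01, 0.05, size=36)          # e.g. the 1991-93 holdout
forecasts = {"model_a": 0.5 * y_holdout + rng.normal(0, 0.04, 36),  # correlated, for illustration
             "model_b": rng.normal(0, 0.05, 36)}     # pure-noise competitor

for name, pred in forecasts.items():
    mse, hit = oos_metrics(y_holdout, pred)
    print(f"{name}: MSE={mse:.5f}  hit rate={hit:.2f}")
```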
5. Trading simulation
The statistical model building must next be transferred to a trading strategy.
This stage is implemented on the two holdout samples. It may also be implemented
on the in-sample data. There should be a close correspondence between
model statistical fit and trading performance. However, when we are faced
with three finalist models, it is unlikely that their predictive performances
are statistically different. The trading simulation can help isolate the
final model.
At this stage, levels of slippage and transaction costs need to be incorporated
into the analysis. In addition, if large trades are being considered, market
impact must also be assessed.
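
A minimal simulation sketch (the cost and slippage figures are placeholders,
not estimates): take a long (short) position when the forecast is positive
(negative), and charge a proportional cost plus slippage whenever the position
changes.

```python
import numpy as np

def simulate(forecasts, returns, cost=0.001, slippage=0.0005):
    """Net returns of a long/short strategy with per-trade costs."""
    positions = np.sign(forecasts)               # +1 long, -1 short, 0 neutral
    trades = np.abs(np.diff(positions, prepend=0.0))  # size of position changes
    net = positions * returns - trades * (cost + slippage)
    return net

rng = np.random.default_rng(5)
r = rng.normal(0.01, 0.05, size=36)
f = 0.5 * r + rng.normal(0, 0.04, size=36)       # imperfect forecasts
net = simulate(f, r)
print(f"gross mean={np.mean(np.sign(f) * r):.4f}  net mean={np.mean(net):.4f}")
```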
6. Trading simulation benchmarking
The significance of the selected strategy is assessed by comparing
the strategy's performance to two sets of benchmarks. First, a bootstrap distribution
of random trading strategies is formed. The strategy in question should
lie in the upper 10% tail of the bootstrapped distribution. Second, another
distribution is formed with a set of 100 moving-average rules denoted (x,y,z).
These moving-averages are applied to price (rather than return series). However,
all evaluation of signals is done on the basis of returns.
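
One way to construct the first benchmark (a sketch; the long/short design and
the number of bootstrap draws are assumptions) is to generate strategies with
random signals and locate the candidate strategy in that distribution:

```python
import numpy as np

rng = np.random.default_rng(6)
returns = rng.normal(0.01, 0.05, size=240)       # holdout-period returns
strategy_mean = 0.012                            # candidate strategy's mean return

# 1000 random strategies: positions drawn +1/-1 with equal probability.
random_means = np.array([
    np.mean(rng.choice([-1.0, 1.0], size=returns.size) * returns)
    for _ in range(1000)
])

percentile = np.mean(random_means < strategy_mean)
print(f"strategy beats {percentile:.1%} of random strategies")
# The protocol requires the candidate to lie in the upper 10% tail.
```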
The moving average cross-over rule executes a buy (sell) when the short-term
moving average, of length x, goes above (below) the long-term moving average, of length y.
The variable z denotes the band. For example, a band of 1% says that no
trade is executed (neutral) when the short-term moving average is within 1%
of the cross-over price. The selected strategy is evaluated within the
context of the moving average rules. This provides a tougher comparison than
the randomly generated strategies do.
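
A sketch of a single (x, y, z) rule (the pandas implementation and the
parameter values are illustrative choices): the moving averages are computed on
prices, the band z suppresses signals near the cross-over, and the resulting
signals are scored on returns.

```python
import numpy as np
import pandas as pd

def ma_rule(prices, x=5, y=50, z=0.01):
    """(x, y, z) moving-average cross-over signals: +1 buy, -1 sell, 0 neutral."""
    short = prices.rolling(x).mean()
    long_ = prices.rolling(y).mean()
    signal = pd.Series(0.0, index=prices.index)
    signal[short > long_ * (1 + z)] = 1.0        # buy: short MA above the band
    signal[short < long_ * (1 - z)] = -1.0       # sell: short MA below the band
    return signal                                # within the band: no trade

rng = np.random.default_rng(7)
prices = pd.Series(100 * np.exp(np.cumsum(rng.normal(0.0003, 0.01, 2000))))
returns = prices.pct_change()

sig = ma_rule(prices)
# Evaluation is done on returns: yesterday's signal applied to today's return.
strat = (sig.shift(1) * returns).dropna()
print(f"rule mean daily return: {strat.mean():.5f}")
```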
7. Reporting
The paper trail is extremely important. A detailed appendix needs to be
maintained which logs the models tested. Extensive model diagnostics
are also presented in the appendix along with the graphical analysis.
The diagnostics should include the standard battery of residual analysis
tests, specification tests, details about the explanatory variables and
their correlation structure, model comparisons, and model (parameter)
stability tests.
The main body of reporting should detail: the statistical performance
of the model of choice (in sample and out of sample), the next two
closest competing models, the trading performance (correct hit rates,
number of negative months in a row, maximum drawdown in any year, dollar
profits, dollar standard deviations, Sharpe ratios, Harvey-Graham measures,
and benchmark comparisons).
The report should explicitly detail the validation procedures employed.
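
The trading-performance statistics listed above can be produced with a short
summary routine. This sketch covers the hit rate, the longest run of negative
months, the maximum drawdown (computed here over the full sample rather than
within each year), and the Sharpe ratio; the Harvey-Graham measure is omitted
since it requires matching the strategy to a benchmark's volatility.

```python
import numpy as np

def performance_report(monthly_net):
    """Hit rate, worst losing streak, max drawdown, annualized Sharpe ratio."""
    hit_rate = np.mean(monthly_net > 0)
    streak, worst = 0, 0
    for r in monthly_net:                        # longest run of negative months
        streak = streak + 1 if r < 0 else 0
        worst = max(worst, streak)
    wealth = np.cumprod(1 + monthly_net)
    drawdown = 1 - wealth / np.maximum.accumulate(wealth)
    sharpe = np.sqrt(12) * np.mean(monthly_net) / np.std(monthly_net, ddof=1)
    return hit_rate, worst, drawdown.max(), sharpe

rng = np.random.default_rng(8)
net = rng.normal(0.008, 0.04, size=60)           # simulated net monthly returns
print(performance_report(net))
```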