Tactical Global Asset Allocation

Research Protocol

Campbell R. Harvey

To maximize the chance of successful model building, a strict research protocol should be put in place. This discipline involves seven steps.

1. Specifying the Problem

In this step, the problem is defined. In most cases, a forecasting problem is considered. A list of candidate explanatory variables is formulated. It is important that this list be developed in advance of the data gathering process (to minimize the impact of data snooping). Candidate variables should have economic meaning. For example, no forecasting model should contain variables that have no economic foundation (e.g., sunspots, Yankee batting averages, and hemlines). In addition, a discipline should be developed in advance for the appropriate lag structure. No model should be presented, say, with the 5th and 17th lag of an explanatory variable. This smacks of data snooping. If a lag structure beyond one lag is needed, the justification must be developed in advance. Detailed notes of this process must be taken. An appendix to each research presentation must list the variables which were prespecified in stage one.
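As an illustration, the prespecified list might be logged in code before any data are gathered. The sketch below (Python) uses hypothetical variable names, rationales, and lag limits; nothing in it is part of the protocol beyond the requirement that the list exist, with justifications, in advance.

    # Hypothetical prespecified candidate list, recorded before data collection.
    # Each entry logs the economic rationale and the maximum lag allowed,
    # both of which must be justified in advance (stage one).
    CANDIDATES = {
        "dividend_yield": {"rationale": "proxy for expected returns",  "max_lag": 1},
        "term_spread":    {"rationale": "business-cycle indicator",    "max_lag": 1},
        "default_spread": {"rationale": "credit/risk premium",         "max_lag": 1},
        "lagged_return":  {"rationale": "short-horizon persistence",   "max_lag": 2},
    }

    for name, spec in CANDIDATES.items():
        print(f"{name}: lags 1..{spec['max_lag']} ({spec['rationale']})")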

2. Data Collection

In this step, the data are obtained. It is important that the data are examined for potential errors. Data services like DATASTREAM are notorious for having data incorrectly keyed in. I recommend graphing each series; the eyeball is a powerful filter. In addition, when emerging markets are being considered, extra attention needs to be paid to how the stock return is constructed (does it properly account for rights issues and dividends?). Even well-known data, like those from the International Finance Corporation, are riddled with errors.
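A minimal sketch of this graphical screen, assuming monthly return series held in a pandas DataFrame (synthetic data here); the five-standard-deviation flag is an arbitrary screen for keying errors, not a formal test prescribed by the protocol.

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    # Synthetic stand-in for the collected monthly return series.
    returns = pd.DataFrame(
        np.random.default_rng(0).normal(0.01, 0.05, (288, 3)),
        index=pd.date_range("1970-01-01", periods=288, freq="MS"),
        columns=["UK", "US", "JP"],
    )

    for col in returns:
        returns[col].plot(title=col)   # the eyeball is a powerful filter
        plt.show()
        # Flag observations more than 5 standard deviations from the mean,
        # as candidates for keying errors.
        z = (returns[col] - returns[col].mean()) / returns[col].std()
        print(col, "suspect dates:", list(returns.index[z.abs() > 5]))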

Importantly, I usually require a two-holdout-sample methodology. For example, suppose we are forecasting U.K. equity returns. We obtain data from, say, 1970-1993. Our model would be fit over a shorter sample, say 1970-1990. Our model will be validated over the 1991-93 sample (36 observations). An alternative strategy, which has advantages when low frequency data are used, is to randomly hold out 60 months. That is, use the data from 1970-1993 and randomly select 60 noncontiguous months for a holdout sample. After the validation process has been documented and the final model selected, the 1994 and 1995 data are obtained. A second validation is executed with this fresh data.
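The two kinds of splits might be constructed as follows (a Python sketch on synthetic monthly data, with dates matching the example above):

    import numpy as np
    import pandas as pd

    # Synthetic monthly sample covering 1970-1993 (288 observations).
    idx = pd.date_range("1970-01-01", "1993-12-01", freq="MS")
    data = pd.DataFrame({"ret": np.random.default_rng(1).normal(size=len(idx))},
                        index=idx)

    # Contiguous split: fit on 1970-1990, validate on 1991-1993 (36 months).
    fit_sample = data.loc[:"1990-12-31"]
    holdout    = data.loc["1991-01-01":]

    # Alternative: randomly hold out 60 noncontiguous months.
    rng = np.random.default_rng(42)
    held = np.sort(rng.choice(len(data), size=60, replace=False))
    random_holdout = data.iloc[held]
    random_fit     = data.drop(random_holdout.index)

    print(len(fit_sample), len(holdout), len(random_fit), len(random_holdout))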

3. Estimation

In this step, the initial models are estimated. Linear and nonlinear models should be estimated with the most general econometric methodologies, such as the Generalized Method of Moments (GMM). [See, for example, Harvey P3 and Harvey and Kirby W15.] The GMM is ideally suited to estimating forecasting models. The method handles linear and nonlinear problems, and it provides test statistics which are robust to departures from the traditional distributional assumptions. However, it is possible that some preliminary analysis could be done with least squares (OLS) methodologies. It should be noted that least squares is just a special case of the GMM.

The use of GMM allows for rigorous hypothesis testing of the model specifications. All tests should be heteroskedasticity-consistent and robust to potential moving-average errors induced in the data. This structure could be particularly important if raw transaction prices, which bounce between bid and ask, are being modeled, or in index situations which might suffer from infrequent trading.
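As one concrete instance, ordinary least squares with Newey-West (HAC) standard errors, a special case of GMM, delivers exactly this kind of inference: heteroskedasticity-consistent and robust to moving-average errors. A sketch on synthetic data using statsmodels:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 252
    X = sm.add_constant(rng.normal(size=(n, 2)))   # two prespecified predictors
    y = X @ np.array([0.0, 0.5, -0.3]) + rng.normal(size=n)

    # HAC (Newey-West) covariance: robust to heteroskedasticity and to
    # moving-average errors such as those induced by bid-ask bounce or
    # infrequent trading. The lag length of 3 is an illustrative choice.
    res = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 3})
    print(res.summary())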

"Stepwise" regression or GMM is not desirable. A small number of models should be specified in advance (step 1) and the final in-sample selection can be made using a variety of evaluation methodologies. The final selection should contain about three models. These models are chosen on the basis on traditional criteria such as adjusted R-square, Akaike Information Criterion, and Schwarz Criterion. All of these methods minimize squared errors and penalize models which have extra variables (reward parsimony). Parsimony is critical because it is an important determinant of out of sample performance.

The data snooping problem should be recognized at all stages of the research. With 20 randomly selected explanatory variables, on average one variable will enter the regression by chance with a t-statistic greater than 2.0 (roughly the 5% significance level). The prespecification of the variables helps minimize the snooping problem.

4. Validation

Each model must be validated on an out-of-sample basis. Using the holdout samples, the model performance is assessed using the same metrics as in the estimation stage. This helps identify the model which will be promoted.

The validation stage is clearly the most important. It potentially eliminates the data snooping problem. That is, if variables have been snooped to maximize, say, the R-square, it is unlikely that these variables will perform on an out-of-sample basis. The validation will also eliminate forecasting models which contain too many variables; the overparameterized models will surely fail. Finally, the validation stage is important for identifying potential instability in the parameter estimates. One might have the right explanatory variables but the wrong functional form. While some of this problem should have been resolved in the estimation stage, any residual issues should be evident in the validation stage.

With the final model or models, the final part of the data is obtained for the second validation. This dataset is free of any possible snooping and provides a clean out-of-sample test.
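The holdout-sample assessment might be computed as follows. The sketch uses RMSE, the sign hit rate, and an out-of-sample R-square measured against a zero forecast; these are illustrative metric choices, not prescriptions of the protocol.

    import numpy as np

    def oos_report(actual, forecast):
        """Out-of-sample RMSE, hit rate (sign agreement), and R-square
        relative to a zero-forecast benchmark."""
        actual, forecast = np.asarray(actual), np.asarray(forecast)
        rmse = np.sqrt(np.mean((actual - forecast) ** 2))
        hits = np.mean(np.sign(actual) == np.sign(forecast))
        r2 = 1.0 - np.sum((actual - forecast) ** 2) / np.sum(actual ** 2)
        return {"rmse": rmse, "hit_rate": hits, "oos_r2": r2}

    rng = np.random.default_rng(3)
    actual = rng.normal(0.01, 0.05, 36)              # 36 holdout months
    forecast = 0.3 * actual + rng.normal(0, 0.04, 36)
    print(oos_report(actual, forecast))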

5. Trading simulation

The statistical model building must next be transferred to a trading strategy. This stage is implemented on the two holdout samples. It may also be implemented on the in-sample data. There should be a close correspondence between a model's statistical fit and its trading performance. However, when we are faced with three finalist models, it is unlikely that their predictive performances are statistically different. The trading simulation can help isolate the final model.

At this stage, levels of slippage and transaction costs need to be incorporated into the analysis. In addition, if large trades are being considered, market impact must also be assessed.
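A minimal sketch of such a simulation, assuming a long/short-one-unit rule on the sign of the forecast and a proportional cost of 10 basis points per position change (both assumed figures; slippage and market impact would be layered on in the same way):

    import numpy as np

    def simulate(returns, forecasts, cost=0.001):
        """Trade the sign of the forecast; deduct a proportional cost
        on each change in position."""
        pos = np.sign(forecasts)
        trades = np.abs(np.diff(pos, prepend=0.0))
        return pos * returns - cost * trades

    rng = np.random.default_rng(4)
    r = rng.normal(0.01, 0.05, 36)                  # holdout-sample returns
    f = 0.3 * r + rng.normal(0, 0.04, 36)           # holdout-sample forecasts
    pnl = simulate(r, f)
    print(f"mean={pnl.mean():.4f}  "
          f"annualized Sharpe={pnl.mean() / pnl.std() * np.sqrt(12):.2f}")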

6. Trading simulation benchmarking

A measure of the significance of the selected strategy is assessed by comparing the strategy's performance to two sets of benchmarks. First, a bootstrap distribution of random trading strategies is formed. The strategy in question should lie in the upper 10% tail of the bootstrapped distribution. Second, another distribution is formed with a set of 100 moving-average rules denoted (x,y,z). These moving averages are applied to the price series (rather than the return series). However, all evaluation of signals is done on the basis of returns. The moving-average crossover rule executes a buy (sell) when the short-term moving average, of length x, crosses above (below) the long-term moving average, of length y. The variable z denotes the band. For example, a band of 1% says that no trade is executed (the position is neutral) when the short-term moving average is within 1% of the long-term moving average. The selected strategy is evaluated within the context of the moving-average rules. This provides a tougher comparison than the randomly generated strategies.
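A sketch of the (x,y,z) rule and the bootstrap comparison, on synthetic prices; the (20, 100, 1%) parameters are illustrative, and the band is applied around the long-term moving average:

    import numpy as np
    import pandas as pd

    def ma_signal(prices, x, y, z):
        """(x,y,z) rule: +1 when the short MA (length x) exceeds the long MA
        (length y) by more than the band z, -1 when it is below by more
        than z, 0 (neutral) inside the band."""
        short_ma = prices.rolling(x).mean()
        long_ma = prices.rolling(y).mean()
        sig = pd.Series(0.0, index=prices.index)
        sig[short_ma > long_ma * (1 + z)] = 1.0
        sig[short_ma < long_ma * (1 - z)] = -1.0
        return sig

    rng = np.random.default_rng(5)
    ret = pd.Series(rng.normal(0.0005, 0.01, 1000))
    px = 100 * (1 + ret).cumprod()

    # Signals come from prices, but evaluation is on returns; the signal
    # is lagged one period so it trades on known information only.
    rule_pnl = (ma_signal(px, 20, 100, 0.01).shift(1) * ret).dropna()

    # Bootstrap benchmark: random long/short strategies of the same length.
    random_means = np.array([(rng.choice([-1.0, 1.0], size=len(ret)) * ret).mean()
                             for _ in range(1000)])
    pct = np.mean(rule_pnl.mean() > random_means)
    print(f"rule mean={rule_pnl.mean():.5f}, "
          f"beats {pct:.0%} of random strategies")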

7. Reporting

The paper trail is extremely important. A detailed appendix needs to be maintained which logs the models tested. Extensive model diagnostics are also presented in the appendix along with the graphical analysis. The diagnostics should include the standard battery of residual analysis tests, specification tests, details about the explanatory variables and their correlation structure, model comparisons, and model (parameter) stability tests.

The main body of the report should detail the statistical performance of the model of choice (in sample and out of sample), the performance of the next two closest competing models, and the trading performance (correct hit rates, number of consecutive negative months, maximum drawdown in any year, dollar profits, dollar standard deviations, Sharpe ratios, Harvey-Graham measures, and benchmark comparisons).

The report should explicitly detail the validation procedures employed.

