Positivity requirement and choice of base

First difference of LOG = percentage change

The poor man's deflator

Trend in logged units = percentage growth

Errors in logged units = percentage errors

**The poor man's deflator:** Logging
a series often has an effect very similar to deflating: it dampens
exponential growth patterns and reduces heteroscedasticity
(i.e., stabilizes variance). Logging is therefore a "poor
man's deflator" which does not require any external data (or any
head-scratching about which price index to use). Logging is not *exactly*
the same as deflating--it
does not *eliminate* an upward trend in the data--but it
can straighten the trend out so that it can be better fitted by
a linear model. (Compare the logged auto sales graph
with the deflated auto sales graph.)

If you're going to log the data
and then fit a model that implicitly
or explicitly uses *differencing* (e.g., a random walk,
exponential
smoothing, or ARIMA model), then it is usually redundant to deflate
by a price index, as long as the rate of inflation changes only
slowly: the percentage change measured in nominal dollars will
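be nearly the same as the percentage change in constant dollars.
Mathematically speaking, DIFF(LOG(Y/CPI)) is nearly identical to
DIFF(LOG(Y)): the two differ exactly by DIFF(LOG(CPI)), the inflation
rate per period, which is nearly constant when inflation changes slowly,
so the only difference in their patterns is a very faint
amount of noise due to fluctuations in the inflation rate. To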
demonstrate this point, here's a graph of the first difference of
logged auto sales, with and without deflation:
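The same relationship can also be checked numerically. The sketch below uses made-up growth and inflation figures (not the auto sales data) and a hypothetical helper `diff_log`. It shows that the gap between the nominal and deflated versions of DIFF(LOG) is exactly DIFF(LOG(CPI)), which varies far less than the sales changes themselves when inflation moves slowly.

```python
import math

def diff_log(series):
    """First difference of the natural log: log(y[t]) - log(y[t-1])."""
    return [math.log(b) - math.log(a) for a, b in zip(series, series[1:])]

# Hypothetical data: nominal sales growing roughly 10% per period,
# and a price index whose inflation rate drifts slowly from 3% to 4%.
growth = [0.10, 0.12, 0.07, 0.11, 0.09, 0.13, 0.08, 0.10]
inflation = [0.030, 0.031, 0.033, 0.034, 0.036, 0.037, 0.039, 0.040]

sales, cpi = [100.0], [1.0]
for g, i in zip(growth, inflation):
    sales.append(sales[-1] * (1 + g))
    cpi.append(cpi[-1] * (1 + i))

nominal = diff_log(sales)                               # DIFF(LOG(Y))
real = diff_log([y / p for y, p in zip(sales, cpi)])    # DIFF(LOG(Y/CPI))
infl = diff_log(cpi)                                    # DIFF(LOG(CPI))

# Identity: DIFF(LOG(Y)) - DIFF(LOG(Y/CPI)) = DIFF(LOG(CPI))
gap = [n - r for n, r in zip(nominal, real)]

for n, r, d in zip(nominal, real, gap):
    print(f"nominal {n:.4f}   real {r:.4f}   gap {d:.4f}")
```

In the printed table the gap column barely moves from row to row, so the two differenced series have almost the same shape and differ mainly by a near-constant offset that a drift or trend term would absorb.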

By logging *rather* than
deflating, you avoid the need to incorporate
an *explicit* forecast of future inflation into the model: you
merely lump inflation together with any other sources of
steady compound growth in the original data. Logging the data before
fitting a random walk model yields a so-called
**geometric random walk**--i.e., a random walk with geometric
rather than linear growth. A geometric random walk is the default
forecasting model that is commonly used for stock price data.
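As a rough illustration of a geometric random walk forecast, the sketch below fits a random walk with drift to the logged series and then exponentiates the result. The function name `geometric_random_walk_forecast`, the use of a drift term, and the price figures are all assumptions made for this example, not anything taken from a particular software package.

```python
import math

def geometric_random_walk_forecast(series, horizon):
    """Forecast by extrapolating the average log change (drift)
    from the last observed value, then undoing the log."""
    logs = [math.log(y) for y in series]
    steps = [b - a for a, b in zip(logs, logs[1:])]   # DIFF(LOG(Y))
    drift = sum(steps) / len(steps)                   # mean log change per period
    return [math.exp(logs[-1] + k * drift) for k in range(1, horizon + 1)]

# Made-up price history, for illustration only
prices = [100.0, 103.0, 101.0, 106.0, 110.0, 108.0, 115.0]
forecast = geometric_random_walk_forecast(prices, 3)
print([round(f, 2) for f in forecast])
```

Each forecast is the previous one multiplied by the same factor, so the forecast path grows geometrically (by a constant percentage) rather than by a constant amount.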

**Trend in logged units = percentage growth:** Because changes in the natural logarithm are (almost) equal to percentage changes in the original series, it follows that the slope of a trend line fitted to logged data is approximately equal to the average percentage growth in the original series. For example, in the graph of LOG(AUTOSALE) shown above, if you "eyeball" a trend line you will see that the magnitude of logged auto sales increases by about 2.5 (from 1.5 to 4.0) over 25 years, which is an average increase of about 0.1 per year, i.e., roughly 10% per year. It is much easier to estimate this trend from the logged graph than from the original unlogged one! The 10% figure obtained here is nominal growth, including inflation. If we had instead eyeballed a trend line on a plot of logged deflated sales, i.e., LOG(AUTOSALE/CPI), its slope would be the average real percentage growth.
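The eyeball estimate can be reproduced with a least-squares trend line. The sketch below builds a hypothetical series whose log rises linearly from 1.5 to 4.0 over 25 years (mirroring the numbers quoted above, not the actual auto sales data) and fits a line to its log with a small hand-written helper, `ols_slope`.

```python
import math

def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

years = list(range(26))  # year 0 through year 25
# Hypothetical series: log goes from 1.5 to 4.0, i.e. 0.1 per year
series = [math.exp(1.5 + 0.1 * t) for t in years]

slope = ols_slope(years, [math.log(v) for v in series])
compound_growth = math.exp(slope) - 1   # exact growth implied by the log slope

print(f"slope in log units: {slope:.4f}")
print(f"implied compound growth per year: {compound_growth:.4%}")
```

The slope of 0.1 in log units corresponds to compound growth of about 10.5% per year, which shows how close the "slope = percentage growth" reading is at this growth rate. The approximation becomes looser as the slope gets larger.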

Usually the trend is estimated more precisely by fitting a statistical model that explicitly includes a local or global trend parameter, such as a linear trend or random-walk-with-drift or linear exponential smoothing model. When a model of this kind is fitted in conjunction with a log transformation, its trend parameter can be interpreted as a percentage growth rate.

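**Errors in logged units = percentage errors:** An error in logged units is (almost) equal to a percentage error in the original units. Thus, if you use least-squares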
estimation to fit a linear forecasting
model to *logged* data, you are implicitly minimizing mean
squared *percentage* error, rather than mean squared error
in the original units--which is probably a good thing if the
log transformation was appropriate in the first place. And if
you look at the error statistics in logged units, you can interpret
them as percentages. For example, the standard deviation of the
errors in predicting a logged series is essentially the standard
deviation of the percentage errors in predicting the original
series, and the mean absolute error (MAE) in predicting a logged series
is essentially the mean absolute percentage error (MAPE) in predicting
the original series.
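A small numerical check makes the correspondence between log errors and percentage errors concrete. The actual and forecast values below are invented for illustration. The percentage error is measured relative to the actual value, which is one common convention and an assumption of this sketch.

```python
import math

# Made-up actuals and forecasts in original units
actual = [120.0, 135.0, 150.0, 160.0, 180.0]
forecast = [115.0, 140.0, 145.0, 168.0, 175.0]

# Errors in logged units: LOG(actual) - LOG(forecast)
log_errors = [math.log(a) - math.log(f) for a, f in zip(actual, forecast)]
# Percentage errors in original units, relative to the actual value
pct_errors = [(a - f) / a for a, f in zip(actual, forecast)]

mae_log = sum(abs(e) for e in log_errors) / len(log_errors)
mape = sum(abs(e) for e in pct_errors) / len(pct_errors)

for le, pe in zip(log_errors, pct_errors):
    print(f"log error {le:+.4f}   percentage error {pe:+.4f}")
print(f"MAE in logged units: {mae_log:.4f}   MAPE: {mape:.4f}")
```

With errors of a few percent, as here, the two columns agree closely and the MAE in logged units is nearly the same number as the MAPE. The agreement weakens as the errors grow larger.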

Statgraphics tip: In the Forecasting procedure in Statgraphics, the error statistics shown on the Model Comparison report are all in