Nonlinear regression


If the hypothesized relation between dependent and independent variables is nonlinear, one approach is to try to linearize the relationship by transforming the variables, and then fit a linear regression model to the transformed variables. For example, suppose that X is hypothesized to be multiplicatively related to Y and Z according to the equation:

X = a*(Y^b)*(Z^c)

Here, the parameters b and c represent the "elasticities" of X with respect to Y and Z. (We should call them parameters rather than coefficients in this case, since they are exponents rather than multipliers.) For example, if b=1.0, a 10% change in Y is expected to produce a 10% change in X, other things remaining the same. (This is the case of "unitary elasticity.") If b=0.5, then a 10% change in Y produces only a 5% change in X. (This is an example of an "inelastic" response of X to Y.) Multiplicative models of this form are often used in marketing research to represent the combined effects of variables like prices, promotions, and advertising, which are imagined to affect sales in percentage terms. If we take the logarithm of both sides of this equation, we obtain:
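A one-line numeric check of the inelastic case, in Python (the numbers are illustrative only):

# With b = 0.5, a 10% increase in Y scales X by 1.1**0.5 (Z held fixed):
print(1.1 ** 0.5)   # 1.0488..., i.e., roughly a 5% increase in X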

LOG(X) = LOG(a) + b*LOG(Y) + c*LOG(Z)

In other words, LOG(X) is a linear function of LOG(Y) and LOG(Z), in which the coefficients of LOG(Y) and LOG(Z) are simply the exponents b and c in the original equation, and the constant term is the log of a. One way to fit this model, then, is to use ordinary linear regression to regress LOG(X) on LOG(Y) and LOG(Z). Another possibility, however, is to use nonlinear regression to fit the original multiplicative model directly. Nonlinear regression uses a "search" algorithm to fit a model in which the predictions for the dependent variable are not necessarily linear functions of the parameters. The "solver" feature in your spreadsheet can do this, and so can Statgraphics' nonlinear regression procedure.
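For concreteness, here is a minimal Python sketch of both approaches (the data are made up, purely for illustration): linear regression on the logged variables, and nonlinear least squares on the original variables using scipy's curve_fit.

import numpy as np
from scipy.optimize import curve_fit

# Made-up data consistent with x = a * y^b * z^c times multiplicative noise.
rng = np.random.default_rng(0)
y = rng.uniform(1.0, 10.0, 100)
z = rng.uniform(1.0, 10.0, 100)
x = 2.0 * y**0.5 * z**1.5 * rng.lognormal(0.0, 0.1, 100)

# Approach 1: linear regression of LOG(X) on LOG(Y) and LOG(Z).
A = np.column_stack([np.ones_like(y), np.log(y), np.log(z)])
(log_a, b_lin, c_lin), *_ = np.linalg.lstsq(A, np.log(x), rcond=None)

# Approach 2: nonlinear least squares on the original equation.
def model(yz, a, b, c):
    y, z = yz
    return a * y**b * z**c

(a_nl, b_nl, c_nl), _ = curve_fit(model, (y, z), x, p0=[1.0, 1.0, 1.0])

print(np.exp(log_a), b_lin, c_lin)   # log-linear estimates of a, b, c
print(a_nl, b_nl, c_nl)              # nonlinear estimates of a, b, c

If the errors really are multiplicative (as the lognormal noise here assumes), the two sets of estimates should agree closely.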


In the Nonlinear Regression procedure in Statgraphics (under Special/Advanced Regression), you must enter (i) a name or expression specifying the dependent variable, (ii) a specification of the function to be fitted, written in terms of the parameters to be estimated, and (iii) starting values for the vector of parameter estimates. The function specification should be a Statgraphics mathematical expression in which the different parameters to be estimated are represented by letters or names which are not already in use as Statgraphics variable names. To fit the multiplicative model described above, you could enter X as the dependent variable and then enter the following expression as the function to be fitted:

a*(Y^b)*(Z^c)

(assuming that Y and Z are existing variable names, but a, b, and c are not). After hitting OK to continue, you will see a screen on which to enter starting values for the parameters. (The default values are all 0.1, and usually it is OK to leave them alone, at least on the first pass.)

The nonlinear regression procedure, like the linear regression procedure, tries to find those values of the parameter estimates which minimize the sum of squared errors. If you're not satisfied with where it ends up, or if it bombs, you can restart the procedure with a different set of initial values for the parameter estimates. For example, try 1.0 instead of 0.1 for a parameter that represents an exponent, and make sure that the starting values have signs that agree with your intuitions about the relationships among the variables. If you have already tried fitting a linear regression model to nonlinearly transformed variables, the result of the linear regression run may suggest parameter values that would be appropriate in the nonlinear regression model.
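To make that last suggestion concrete, here is a small Python sketch (hypothetical one-predictor data) in which the estimates from a preliminary log-linear fit are passed to curve_fit's p0 argument, which plays the same role as the starting-values screen in Statgraphics:

import numpy as np
from scipy.optimize import curve_fit

# Hypothetical data for a one-predictor multiplicative model x = a*y^b.
rng = np.random.default_rng(1)
y = rng.uniform(1.0, 10.0, 50)
x = 3.0 * y**0.7 * rng.lognormal(0.0, 0.1, 50)

# Log-linear fit first; its estimates make sensible starting values.
slope, intercept = np.polyfit(np.log(y), np.log(x), 1)

def f(y, a, b):
    return a * y**b

# p0 seeds the nonlinear search, like the starting-values screen.
params, _ = curve_fit(f, y, x, p0=[np.exp(intercept), slope])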

The Analysis Summary report for this procedure looks pretty much like standard regression or ARIMA output: you see the values of the estimated coefficients, their standard errors, and 95% confidence limits, plus the usual assortment of error statistics. The graphical options include "response surface plotting," which is fun if you have exactly two independent variables. (This is a 3-D surface plot that shows the predicted value of the dependent variable as a function of the first two independent variables appearing in the functional expression.) You can also get the usual residual plots.
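If you are working outside Statgraphics, similar summary quantities can be approximated from the covariance matrix that curve_fit returns. A sketch with hypothetical data, using 1.96 as a large-sample stand-in for the exact t critical value:

import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(2)
y = rng.uniform(1.0, 10.0, 80)
z = rng.uniform(1.0, 10.0, 80)
x = 2.0 * y**0.5 * z**1.5 * rng.lognormal(0.0, 0.1, 80)

def model(yz, a, b, c):
    y, z = yz
    return a * y**b * z**c

params, cov = curve_fit(model, (y, z), x, p0=[1.0, 1.0, 1.0])
se = np.sqrt(np.diag(cov))                         # standard errors
lower, upper = params - 1.96 * se, params + 1.96 * se   # rough 95% limits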


Which approach should you use: linear regression with transformed variables or nonlinear regression with the original variables? Since both procedures estimate parameters by "least squares," they are both implicitly assuming that the error distribution is approximately normal. One thing to think about, then, is whether the assumption of normally distributed errors is more plausible in original units or in transformed units. You can look at probability plots of the residuals from both types of model to see which assumption is more consistent with the data.
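A sketch of that comparison in Python, again with made-up data: fit both models, then draw a normal probability plot of each set of residuals with scipy's probplot.

import numpy as np
from scipy import stats
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
y = rng.uniform(1.0, 10.0, 100)
x = 2.0 * y**0.5 * rng.lognormal(0.0, 0.2, 100)

# Residuals from the linear regression on logged data (in log units).
slope, intercept = np.polyfit(np.log(y), np.log(x), 1)
resid_logged = np.log(x) - (intercept + slope * np.log(y))

# Residuals from the nonlinear regression on the original data.
def f(y, a, b):
    return a * y**b

(a_hat, b_hat), _ = curve_fit(f, y, x, p0=[1.0, 1.0])
resid_original = x - f(y, a_hat, b_hat)

fig, axes = plt.subplots(1, 2)
stats.probplot(resid_original, plot=axes[0])   # original units
stats.probplot(resid_logged, plot=axes[1])     # logged units
plt.show()

Whichever plot hugs its reference line more closely points to the more plausible error assumption.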

Another thing to think about is how you would wish to weight different magnitudes of error for decision-making purposes. Is the important criterion absolute error or percentage error? The nonlinear regression on original data is trying to minimize the RMS error in absolute terms, while the linear regression on logged data is trying to minimize the RMS error in percentage terms. If you're trying to minimize error in percentage terms, then you're more tolerant of large errors when they're associated with large predictions: making an error of size 10 when the prediction is 100 is no worse than making an error of size 1 when the prediction is 10. If you're trying to minimize error in absolute terms, then making an error of size 10 is 10 times as bad as making an error of size 1, no matter what the prediction is. (Actually, it's 100 times worse to the regression procedure, which looks at squared error!)


You can also use Excel's solver to fit nonlinear regression models (or linear regression models, for that matter). For example, suppose that data for the variables X, Y, and Z are stored in columns X, Y, and Z on the spreadsheet, beginning in row 1. Also, suppose that the values of the parameters a, b, and c are stored in cells A1, B1, and C1. Then you could enter a formula in another column (say, the first cell in column AA), containing the expression:

=$A$1*(Y1^$B$1)*(Z1^$C$1)

...and copy it into the remaining cells in the column. In the next column over, calculate the errors of the model (the actual values in column X minus the predicted values in column AA). For example, you could enter the following formula in cell AB1:

=X1-AA1

...and copy it into the remaining cells in the column. Then add a formula in a cell somewhere to compute the mean squared error, which is exactly equal to the population variance of the errors plus the square of their mean. For example, if the errors are stored in cells AB1:AB100, the formula for the mean squared error would be

=VARP(AB1:AB100)+AVERAGE(AB1:AB100)^2
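A quick Python check of that identity, with arbitrary numbers (np.var, like VARP, is the population variance):

import numpy as np

e = np.array([1.5, -2.0, 0.5, 3.0])   # arbitrary errors
print(np.mean(e**2))                  # mean squared error
print(np.var(e) + np.mean(e)**2)      # VARP plus squared mean: same value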

Then run the Solver, specify that the adjustable cells are A1, B1, and C1, and specify that the target cell to be minimized is the cell containing the mean squared error (and do NOT check the "assume linear model" option). Unfortunately, Excel will not compute standard errors and t-statistics for the parameters, nor will it automatically give you other residual diagnostics and plots.
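The same workflow can be sketched in Python with scipy.optimize.minimize (hypothetical data again): the parameter vector plays the role of the adjustable cells, and the mean-squared-error function plays the role of the target cell.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
y = rng.uniform(1.0, 10.0, 100)
z = rng.uniform(1.0, 10.0, 100)
x = 2.0 * y**0.5 * z**1.5 * rng.lognormal(0.0, 0.1, 100)

def mse(params):                   # the "target cell"
    a, b, c = params
    errors = x - a * y**b * z**c   # column AB in the spreadsheet analogy
    return np.mean(errors**2)

result = minimize(mse, x0=[1.0, 1.0, 1.0])   # x0 = the "adjustable cells"
print(result.x)                              # fitted a, b, c

One advantage of using curve_fit for this job instead is that its returned covariance matrix yields the standard errors that Excel's Solver does not provide.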