Author: J. M. McCormick

Last Update: August 12, 2008

It is possible to have Excel perform a non-linear least
square regression. One simple trick is to create columns each containing the
variable of interest to the requisite power. For example, if we wanted to fit a
set of data to a third order polynomial (i. e., *y* = *a*x + *b*x^{2} +
*c*x^{3} + *d*, where *a*,
*b*, *c* and *d* are constants that we need to find), then we would create columns
containing the independent variable to the desired powers, as shown in Fig. 1.
Note that in addition to *x*^{1}, *x*^{2} and *x*^{3} there must be a column containing
*x*^{0},
which contains only ones. Once the spreadsheet is set up as shown below, select
* Tools, Data Analysis* from the menu bar and scroll down to

**Figure 1.** Sample spreadsheet that is ready to be
fit to the cubic expression *y* = *ax* + *bx*^{2} +
*cx*^{3} + *d* using Excel's regression package.

The regression pop up will appear (Fig. 2). In the *Input
Y Range* box type in the cell addresses that contains your *y* values (or click
and drag to select them from the worksheet). In Fig. 1 this
would be the cells F2 to F28 (enter F2:F28 in the *Input Y Range* box).
While entering the *y* values is no different than for a linear regression,
inputting the *x* values is very different in that the *Input X Range* box
must contain all of the columns containing a power
of *x*. In Fig. 1 this would be columns B through E (cells B2 to E28). Once
the *x* and *y* ranges are set, you can set any other parameters as desired, and click OK.

**Figure 2.** Regression set-up window.

If you have
selected to have the results of the fit put in a new worksheet ply, then you
will see something like what is shown in Fig. 3 (note that the columns in this
spreadsheet have been modified to clearly show their contents). Each of the rows labeled
*X
Variable*, except *X Variable 1*, correspond to one of the constants in the
polynomial. The row labeled *Intercept* will always be the constant in the
polynomial that is not multiplied by *x*. In this case where the polynomial is
*y* = *ax* + *bx*^{2} + *cx*^{3} + *d*, the
*Intercept* is *d*, *X
Variable 2* is *a*, *X Variable 3* is *b* and *X Variable 3* is
*c*. *X Variable 1* will
always be meaningless. The values under the *Standard Error* and *Upper* and
*Lower
95%* for each variable have the same meaning as they do in a simple regression.

Note that the values of *R* and *R*^{2} are not going to be particularly
useful in defining how well this expression fit the data, or in comparing this
fit to a fit using another expression. For this one must use the standard
error about the regression (listed as the *
Standard Error* under the *Regression Statistics* heading). Also
note that we could set up a column of calculated *y* values based on this fit (for
example, in column G of the spreadsheet shown in Fig. 1) by entering the function
in the appropriate column and using absolute references to the values in the
output spreadsheet (Fig. 3).

**Figure 3.** Excel worksheet showing the results for
the fit of the data in Fig. 1 to a cubic function.

Another way to have Excel perform a non-linear least
squares fit is to use the *Solver* tool. Before getting started, you will need to
set up the spreadsheet, but in a different way than for the *Regression*
package. Again let us assume that we want to fit a set of y values to a
cubic expression in x (i. e., *y* = *ax* + *bx*^{2} + *cx*^{3}_{
}+ *d*). We only need a single column for the *x* values, but we will
need to have two columns for the *y* values (see Fig. 4). The first column of
*y*
values (column C in Fig. 4) will contain the *y* values that we actually measured.
It is labeled *y _{obs}* (for observed values) to avoid confusion with the
calculated

**Figure 4.** Excel spreadsheet set up to fit a data
set to any expression in the variable *x* using the *Solver* tool.

Once the spreadsheet is ready, select * Tools, Solver* from
the menu bar. The

**Figure 5.** Solver set-up window.

Once everything is set, click *Solve*. *Solver* will then
attempt change the cells that you told it to change, subject to any constraints
that you specified, to minimize the
difference between the observed and calculated y values. If it is successful it
will display a window stating that it found a solution and asking whether you
want to keep it (note that the best fit values of your variables and the
function that you were minimizing will be displayed in the spreadsheet behind
the pop-up). If you want to keep the fit, click *OK*, if not click *CANCEL*. If
a satisfactory fit was not found, a similar message will be displayed and
you can either keep or discard the result.

Sometimes you will need to restart
the *Solver* tool (that is reinitiate the procedure with the values from a
previous fit) before a satisfactory fit is obtained. In any case, it is good
practice to perform the fit with different values of the initial parameters to
be sure that the procedure converges to a unique solution. It may also be
necessary to adjust the fitting procedure itself (done by clicking the *Options*
button in the main *Solver* window) to get a good result. However, if this occurs
it may be advisable to find a more robust fitting algorithm or to find a simpler
model.

Unlike the *Regression* package, *Solver* does not give uncertainties in the
fit parameters. However, there is a way to estimate the uncertainties,^{1} but it is
somewhat time consuming.

- 1. Harris, D. C.
*J. Chem. Educ.***1998**,*75*, 119-121. Click here to view as a PDF file (Truman addresses and*J. Chem. Educ.*subscribers only).