

Functions for Easier Curve Fitting

An overview of empirical relations that can be used to fit your data

Francis X. McConville
Impact Technology Consultants

Engineers often need to fit experimental data to an empirical relationship for extrapolation or modeling without resorting to a full mathematical treatment based on physical principles or theory. The most widely used platform in science for doing this is MS Excel, and while its "off-the-shelf" Trendline tool is useful, it is limited to only 4 or 5 simple functions. Excel's Solver add-in, on the other hand, offers a simple, powerful means to fit data to user-defined functions.

There are numerous commercial packages for data fitting and statistical analysis [1], but Excel is ubiquitous, and as E. G. John [2] aptly puts it, the use of Solver for data fitting is "simplicity itself". More recently, in this publication, Du Plessis [3] further extols the virtues of Excel's Solver function and describes in detail how to use it for this purpose. For a synopsis, see the box "A Quick Look at Using Excel Solver" on p. 51.

Once one has identified an appropriate function and achieved a good fit, then more-advanced modeling tasks become easier; and using calculus one can obtain derivatives and integrals and thus rates of change, areas under curves, and so on.
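As a minimal sketch of this point, suppose a data set has already been fit to the two-parameter hyperbola y = ax/(b + x), one of the functions collected in this article. The parameter values below are hypothetical and chosen only for illustration. The snippet writes the rate of change and the area under the curve in closed form, then cross-checks both numerically.

```python
import math

def hyperbola(x, a, b):
    """Two-parameter hyperbola (saturation growth): y = a*x / (b + x)."""
    return a * x / (b + x)

def hyperbola_slope(x, a, b):
    """Analytical derivative: dy/dx = a*b / (b + x)**2."""
    return a * b / (b + x) ** 2

def hyperbola_area(x_end, a, b):
    """Analytical integral of y from 0 to x_end: a*(x_end - b*ln(1 + x_end/b))."""
    return a * (x_end - b * math.log(1.0 + x_end / b))

# Hypothetical fitted parameters (assumed, not taken from any real data set).
a, b = 2.0, 0.5

# Cross-check the slope at x = 1 with a central finite difference.
h = 1e-6
numeric_slope = (hyperbola(1.0 + h, a, b) - hyperbola(1.0 - h, a, b)) / (2 * h)

# Cross-check the area from 0 to 3 with the midpoint rule.
n = 10000
dx = 3.0 / n
numeric_area = sum(hyperbola((i + 0.5) * dx, a, b) for i in range(n)) * dx

print(hyperbola_slope(1.0, a, b), numeric_slope)
print(hyperbola_area(3.0, a, b), numeric_area)
```

Once the closed forms are written down, they can be evaluated at any x without refitting or resampling the data.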

But selecting the best function for the curve-fitting exercise is never trivial. To simplify the task, this article illustrates a collection of 52 common one-, two- and three-parameter binary functions that cover quite a wide range of behavior and should provide a good starting point. Even the one- and two-parameter relationships can be very versatile in spite of their simplicity. The accompanying figures illustrate the basic curve shapes that these equations generate. The names of well-known functions are included where possible.

FIGURE 1. Equations (1) to (12) are functions with one fitting parameter, a
[Curve plots not reproduced. Panel labels legible in the source include: shifted reciprocal, $y = 1/(a - x)$; simple exponential; exponential (asymptotic), $y = 1 - e^{-ax}$; exponential (root fit); modified power; Pareto; hyperbolic cosine, $y = a\cosh(x/a)$.]

48 CHEMICAL ENGINEERING WWW.CHE.COM DECEMBER 2008

FIGURE 2. Equations (13) to (32) are functions with two fitting parameters, a and b
[Curve plots not reproduced. Panel labels legible in the source include: linear, $y = ax + b$; reciprocal, $y = 1/(a + bx)$; exponential, $y = ae^{bx}$; logarithmic, $y = a + b\ln x$; rational, $y = a/(b + x)$; exponential, $y = ae^{b/x}$; power, $y = ax^{b}$; hyperbola (saturation growth), $y = ax/(b + x)$; exponential (asymptotic), $y = a(1 - e^{bx})$; shifted power, $y = a(1 + x)^{b}$; and $y = ax^{b/x}$.]

For the curve fitting exercise to be most effective, it is important that the data are in the correct form and that all units are consistent. For example, solubility data can often be fit to one of the simple logarithmic functions, but the best results are obtained if solubility is expressed as mole fraction and not weight percent. Thus, some understanding of the underlying principles proves valuable in selecting and properly applying the best empirical model. It is also best to use the minimum number of parameters that will give a good fit, or else the fit may become meaningless.

A common technique is linearizing a non-linear equation by rearranging it, thus simplifying the fit process. For example, Equation (9) can be linearized by plotting ln(1 - y) versus x. This results in a straight line with slope -a that passes through 0. Another example is using Lineweaver-Burk plots to linearize enzyme kinetic data. This technique can work well in many cases, but it tends to distort the experimental error and amplify the effect of outlying data points. Thus parameters derived in this way may not be as accurate as those obtained by fitting data to the native model. The use of a program like Excel obviates the need to utilize such methods.

FIGURE 3. Equations (33) to (52) are functions with three fitting parameters, a, b and c
[Curve plots not reproduced. Legible fragments in the source include the panel labels inverse hyperbola, logarithmic and shifted power, and the expression $y = a + b/x + c/x^{2}$.]
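The distortion described above can be seen in a small sketch. It assumes Equation (9) has the form y = 1 - exp(-a*x), which is what the stated linearization implies. The data are synthetic, with a hypothetical true value a = 0.8 and fixed errors of ±0.01. The native fit uses a golden-section search on the sum of squares as a stand-in for Solver; it is not Solver's own algorithm.

```python
import math

def fit_linearized(xs, ys):
    """Estimate a from ln(1 - y) = -a*x, a straight line through the origin.
    The least-squares slope of such a line is sum(x*z) / sum(x*x)."""
    zs = [math.log(1.0 - y) for y in ys]
    slope = sum(x * z for x, z in zip(xs, zs)) / sum(x * x for x in xs)
    return -slope

def sum_sq(a, xs, ys):
    """Sum of squared residuals against the native model y = 1 - exp(-a*x)."""
    return sum((y - (1.0 - math.exp(-a * x))) ** 2 for x, y in zip(xs, ys))

def fit_native(xs, ys, lo=0.01, hi=5.0, tol=1e-8):
    """Minimize the sum of squares over a by golden-section search."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c = hi - g * (hi - lo)
        d = lo + g * (hi - lo)
        if sum_sq(c, xs, ys) < sum_sq(d, xs, ys):
            hi = d   # minimum lies in [lo, d]
        else:
            lo = c   # minimum lies in [c, hi]
    return (lo + hi) / 2.0

# Synthetic data: hypothetical true parameter and small alternating errors.
a_true = 0.8
xs = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0]
errors = [0.01, -0.01, 0.01, -0.01, 0.01, -0.01, 0.01]
ys = [1.0 - math.exp(-a_true * x) + e for x, e in zip(xs, errors)]

a_lin = fit_linearized(xs, ys)
a_native = fit_native(xs, ys)
print("linearized:", a_lin, "native:", a_native)
```

Near the asymptote, 1 - y is small, so an error of 0.01 in y becomes a large error in ln(1 - y). Those points then dominate the straight-line fit, while the native fit weights every point equally.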

A QUICK LOOK AT USING EXCEL SOLVER
The Solver procedure is usually based on the "sum of least squares" approach. A typical spreadsheet setup for an x-y data set would look like this:
Column A: independent variable (x) data
Column B: experimental dependent variable (y) data
Column C: y values calculated using the curve fitting function of interest
Column D: the difference between the values in Column C and Column B (residuals)
Column E: the square of the values in Column D
The values in Column E are then summed to generate a "solver parameter". Solver is set up and run to automatically adjust the values of equation parameters a, b and c in such a way as to minimize the value of the "solver parameter" (click Tools>Solver [4]). This generates the values a, b, and c that provide the best possible fit of the data. A simple spreadsheet demonstrating the approach is available at the author's website www.pprbook.com under "Templates".

It is also possible to treat a data set as a bimodal distribution and fit the data to two different functions, applying one function above a certain value and another function below that value. For example, a data set may be very linear up to a certain value of x, after which it exhibits exponential behavior. This approach can help the user achieve a purely empirical fit more quickly when required.
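A two-function fit of this kind might be set up as in the sketch below. The breakpoint, the straight line and the exponential are all hypothetical choices made for illustration. The code only builds the combined model and its sum of squares, which is the quantity an optimizer such as Solver would then minimize.

```python
import math

def piecewise(x, x_break, lower, upper):
    """Apply `lower` at or below x_break and `upper` above it."""
    return lower(x) if x <= x_break else upper(x)

def sum_sq(xs, ys, x_break, lower, upper):
    """Sum of squared residuals for the combined two-function model."""
    return sum((y - piecewise(x, x_break, lower, upper)) ** 2
               for x, y in zip(xs, ys))

# Hypothetical segment models: linear up to x = 3, exponential afterwards.
def line(x):
    return 2.0 * x + 1.0

def expo(x):
    return 7.0 * math.exp(0.4 * (x - 3.0))

# Synthetic data generated from the same two functions.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [line(x) if x <= 3.0 else expo(x) for x in xs]

ssr_right_break = sum_sq(xs, ys, 3.0, line, expo)
ssr_wrong_break = sum_sq(xs, ys, 5.0, line, expo)
print(ssr_right_break, ssr_wrong_break)
```

Placing the breakpoint in the wrong spot forces the straight line onto points that follow the exponential, and the sum of squares rises accordingly. The breakpoint can therefore be treated as one more quantity to adjust.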

For the various asymptotic models, a quick look at the function should indicate the expected behavior. For example, Equations (7) and (27) can be expected to approach a value of unity as x increases. Equation (20) approaches the value a as x increases and when b is negative, but approaches 0 at diminishing values of x when b is positive. And functions such as the inverse hyperbola [Equation (35)] and the Chapman model [Equation (48)] asymptote to 0. Functions such as Box-Lucas [Equation (21)] fit data that reach a peak and then begin to decline.

Some of the models can exhibit unexpected behavior and unusual curve shapes when negative parameters are used. This raises the point that an important part of running the Solver is selecting the starting values of the fit parameters. Where a physical understanding does not provide a framework for selecting starting values, one must resort to trial and error. The values selected must make sense in the physical world. Checking how well the model fits will usually be accomplished first by eye, but can be quantified by using values such as the R² (goodness of fit) parameter, a unitless fraction that usually ranges in value from 1 for a perfect fit to 0 (or even negative values) for a poor fit.
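The spreadsheet columns from the Solver box and the R² check can be sketched in a few lines of Python. The data values are invented for illustration, and the model is the two-parameter hyperbola y = ax/(b + x). R² is computed with the common definition 1 - SSR/SST, which is an assumption here since the article does not spell out a formula. No optimizer is included; the sketch only evaluates the target for two trial parameter sets.

```python
def hyperbola(x, a, b):
    """Model used for Column C: y = a*x / (b + x)."""
    return a * x / (b + x)

def solver_columns(xs, ys, a, b):
    """Return Columns C, D and E of the spreadsheet layout."""
    col_c = [hyperbola(x, a, b) for x in xs]       # calculated y
    col_d = [c - y for c, y in zip(col_c, ys)]     # residuals (C minus B)
    col_e = [d * d for d in col_d]                 # squared residuals
    return col_c, col_d, col_e

def r_squared(ys, sum_sq):
    """R^2 = 1 - SSR/SST (assumed definition)."""
    mean_y = sum(ys) / len(ys)
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - sum_sq / total

# Hypothetical data set: Column A (x) and Column B (measured y).
xs = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
ys = [0.68, 0.98, 1.35, 1.58, 1.79, 1.87]

# The "solver parameter" is the sum of Column E for a given (a, b).
ssr_good = sum(solver_columns(xs, ys, 2.0, 0.5)[2])   # sensible trial values
ssr_poor = sum(solver_columns(xs, ys, 1.0, 5.0)[2])   # poor starting guess

r2_good = r_squared(ys, ssr_good)
r2_poor = r_squared(ys, ssr_poor)
print(ssr_good, r2_good)
print(ssr_poor, r2_poor)
```

With the poor starting guess the model fits worse than a horizontal line through the mean of the data, so R² drops below zero. This is the negative case mentioned above.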

Good places to look for more information on curve fitting in general are the websites listed in References [5] and [6]. For a more advanced discussion of proper curve fitting using non-linear regression, multivariate analysis, dealing with outlying data points, applying weighting to the data, and other issues, see References [7-9].

The functions

The functions included here represent a number of the most common curve types - power, exponential, logarithmic and trigonometric among others. Some are well known and historically important, and many correspond directly to well-known physical models. For example, the Arrhenius kinetic equation $k = Ae^{-E/RT}$ is a classic example of a basic exponential form [Equation (19)], where: x = T; y = rate constant, k; a = Arrhenius constant, A; and b = -E/R.
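As an illustration of this mapping, the sketch below fits y = a*exp(b/x) to synthetic rate constants. The values of A and E/R are hypothetical, and the temperatures are assumed to be absolute. For any trial b, the best a has a closed form, so only b is searched. This golden-section shortcut stands in for Solver and is not taken from the article.

```python
import math

def fit_exponential_reciprocal(temps, ks, b_lo=-20000.0, b_hi=0.0, tol=1e-6):
    """Least-squares fit of y = a*exp(b/x); returns (a, b)."""

    def best_a(b):
        # For fixed b the model is linear in a: a = sum(y*f) / sum(f*f).
        f = [math.exp(b / t) for t in temps]
        return sum(k * fi for k, fi in zip(ks, f)) / sum(fi * fi for fi in f)

    def sum_sq(b):
        a = best_a(b)
        return sum((k - a * math.exp(b / t)) ** 2 for k, t in zip(ks, temps))

    # Golden-section search on b within the assumed bracket [b_lo, b_hi].
    g = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = b_lo, b_hi
    while hi - lo > tol:
        c = hi - g * (hi - lo)
        d = lo + g * (hi - lo)
        if sum_sq(c) < sum_sq(d):
            hi = d
        else:
            lo = c
    b = (lo + hi) / 2.0
    return best_a(b), b

# Synthetic, noise-free data from hypothetical A = 1e10 and E/R = 6000 K.
A_TRUE, B_TRUE = 1.0e10, -6000.0
temps = [300.0, 310.0, 320.0, 330.0, 340.0, 350.0]   # absolute temperature, K
ks = [A_TRUE * math.exp(B_TRUE / t) for t in temps]

a_fit, b_fit = fit_exponential_reciprocal(temps, ks)
activation_energy = -b_fit * 8.314   # E = -b*R, in J/mol
print(a_fit, b_fit, activation_energy)
```

The fitted b gives the activation energy directly as E = -bR, without plotting ln k against 1/T.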

Radioactive decay is a commonly cited example of an exponential model [Equation (18)]. The basic form of the equation is $L(t) = L(0)e^{-\lambda t}$. In our context, y = L(t), the number of decays per unit time; x = t, elapsed time; a = initial decay rate; and b = -λ, where λ is the probability of a decay event during one time unit.

Similarly, the dilution of a species in a stirred tank being continuously fed fresh media ([A] = 0) is described by Equation (18). Here the relationship is $[A]/[A_0] = e^{-(q/V)t}$, where $[A_0]$ and $[A]$ are the concentrations of A at time = 0 and at time = t, q and V are the flowrate and system volume, and t is time.

The Michaelis-Menten enzyme kinetic model, $v = V_{\max} S/(S + K_m)$, characterized by classic non-competitive substrate inhibition (saturation), fits the form of a hyperbola [Equation (16)]. Here x = substrate concentration, S; y = reaction rate, v; a = $V_{\max}$; and b = Michaelis constant, $K_m$.
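Written out, this correspondence shows how the two parameters can be read from the curve shape, which helps when choosing starting values. The last line is the standard Lineweaver-Burk rearrangement mentioned earlier, included only for comparison. The symbols follow the definitions above.

```latex
% Michaelis-Menten model mapped onto the two-parameter hyperbola
v = \frac{V_{\max}\, S}{S + K_m}
\quad\Longleftrightarrow\quad
y = \frac{a x}{b + x},
\qquad a = V_{\max}, \quad b = K_m

% a is the plateau; b is the x value at which y reaches half the plateau
\lim_{x \to \infty} y = a,
\qquad
y\,\Big|_{x = b} = \frac{a b}{b + b} = \frac{a}{2}

% Lineweaver-Burk (double-reciprocal) form: linear in 1/S
\frac{1}{v} = \frac{K_m}{V_{\max}} \cdot \frac{1}{S} + \frac{1}{V_{\max}}
```

A reasonable starting guess is therefore the highest observed rate for a, and the substrate concentration at roughly half that rate for b.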

Other interesting examples are described in Bates and Watts [8]. Here the drop in biological oxygen demand (BOD) at a fixed rate constant was reported to follow an exponential decay of the form of Equation (20). And the change in intercellular concentration of ions due to membrane transport out of the cell is effectively fit to the form of Equation (47).

As with any other endeavor, the deeper the understanding of the mechanisms at work, the easier the selection of an appropriate model will be. Hopefully the equations collected here will simplify your data fitting tasks by highlighting typical behaviors and the models that describe them.

Edited by Gerald Ondrey

References
1. MathCAD, FindGraph, SigmaPlot, OriginLab, XlXtrFun. Easily found through an internet search.
2. E. G. John, Simplified Curve Fitting using Spreadsheet Add-ins, Int. J. Engineering Ed., 14 (5), pp. 375-380, 1998.
3. B. J. du Plessis, Using Spreadsheets as Curve Fitting Tools, Chem. Eng., 114 (5), pp. 66-69, May 2007.
4. If 'Solver' does not appear under the Tools menu in Excel, it may be necessary to activate the add-in by selecting the 'Solver add-in' checkbox under Tools>Add-ins.
5. http://www.aip.org/tip/INPHFA/vol-9/iss-2/p24.html (by Marko Ledvij at The Industrial Physicist)
6. http://www.curvefit.com/ (by GraphPad Software)
7. Draper, N. R. and Smith, H., "Applied Regression Analysis," 3rd edition, John Wiley & Sons, N.Y., 1998.
8. Bates, D. M. and Watts, D. G., "Nonlinear Regression Analysis and its Applications," John Wiley & Sons, N.Y., 1988.
9. Bevington, P. R., "Data Reduction and Error Analysis for the Physical Sciences," McGraw Hill, N.Y., 1969.

Author
Francis McConville is a senior consultant for Impact Technology Consultants in Lincoln, Mass. He is the author of "The Pilot Plant Real Book - A Unique Handbook for the Chemical Process Industry," and an instructor for the Scientific Update professional training course "Secrets of Batch Process Scale-Up". He has over 25 years experience in the process industries, including 14 years as a pharmaceutical process development engineer at Sepracor, Inc. McConville holds a B.S. degree in Chemistry and M.S. degrees in Chemical Engineering and in Biotechnology from Worcester Polytechnic Institute in Massachusetts. He is a member of the ACS, ISPE, and a lifetime member of the AIChE. He may be reached at [email protected].
