Mathematical Modeling

My Students

February, 2010

These notes are based mostly on the textbook by Frank R. Giordano, Maurice D. Weir, and William P. Fox, A First Course in Mathematical Modeling, 3rd ed.; the material has been reorganized and retyped by Jae Lee.

Mathematical Modeling Spring, 2010


CONTENTS

1 Modeling Change
    1.1 Modeling Change with Difference Equations
    1.2 Approximating Change with Difference Equations
        Topic I. Discrete Versus Continuous Change
        Topic II. Model Refinement: Modeling Births, Deaths, and Resources
    1.3 Solutions to Dynamical Systems
        Topic I. Method of Conjecture
        Topic II. Homogeneous Linear Dynamical System $a_{n+1} = ra_n$, $r$ constant
        Topic III. Long-Term Behavior of $a_{n+1} = ra_n$, $r$ constant
        Topic IV. Nonhomogeneous Linear Dynamical System $a_{n+1} = ra_n + b$, $r$ and $b$ constant
        Topic V. Finding and Classifying Equilibrium Values
        Topic VI. Nonlinear Systems
    1.4 Systems of Difference Equations

2 The Modeling Process
    2.1 Mathematical Models
    2.2 Modeling Using Proportionality
        Topic I. Introduction
        Topic II. Geometric Interpretation
        Topic III. Modeling Vehicular Stopping Distance
    2.3 Modeling Using Geometric Similarity
        Topic I. Introduction
        Topic II. Testing Geometric Similarity
    2.4 Automobile Gasoline Mileage
    2.5 Body Weight and Height, Strength and Agility

3 Model Fitting
        Topic I. Relationship Between Model Fitting and Interpolation
        Topic II. Sources of Error in the Modeling Process
    3.1 Fitting Models to Data Graphically
        Topic I. Visual Model Fitting with the Original Data
        Topic II. Transforming the Data
    3.2 Analytic Methods of Model Fitting
        Topic I. Chebyshev Approximation Criterion
        Topic II. Minimizing the Sum of the Absolute Deviations
        Topic III. Least-Squares Criterion
        Topic IV. Relating the Criteria
    3.3 Applying the Least-Squares Criterion
        Topic I. Fitting a Straight Line
        Topic II. Fitting a Power Curve
        Topic III. Transformed Least-Squares Fit
    3.4 Choosing a Best Model

4 Chapter 7 Discrete Optimization Modeling
    Section 7.4 Linear Programming III: Simplex Method

5 Chapter 8 Dimensional Analysis
    5.1 Introduction
    5.2 Dimensions as Product

6 Chapter 10 Modeling with a Differential Equation
    10.5 Numerical Approximation Method

7 Chapter 11 Modeling with Systems of Differential Equations
    11.1 Graphical Solutions of Autonomous Systems of First-Order Differential Equations


Chapter 1

Modeling Change

A mathematical model is an idealization of a real-world phenomenon and is never a completely accurate representation. In modeling our world, we are often interested in predicting the value of a variable at some time in the future, such as a population, a real estate value, or the number of people with a communicable disease.

[Figure: Real-World Data -> (Simplification) -> Model -> (Analysis) -> Mathematical Conclusions -> (Interpretation) -> Predictions/Explanations -> (Verification) -> Real-World Data]

Figure 1.1: A flow of the modeling process beginning with an examination of real-world data

One very powerful simplifying relationship is proportionality.

Definition 1.0.1. Two variables, $x$ and $y$, are proportional (to each other) if there is a nonzero constant $k$ such that $y = kx$. We write $y \propto x$.

When $x$ and $y$ are proportional, the graph of $y$ versus $x$ is a straight line passing through the origin, and one of our concerns is to find the constant of proportionality $k$. Moreover, $x$ and $y$ are proportional if and only if $y/x$, or $x/y$, or $(y/x)^p$, or $(x/y)^p$ is constant, where $p$ is any nonzero real number.

Example 1.0.2. Consider a spring-mass system. An experiment gives the following table.

Elongation (e)   1.000  1.875  2.750  3.250  4.375  4.875  5.675  6.500  7.250  8.000  8.750
Mass (m)         50     100    150    200    250    300    350    400    450    500    550

A simple computation shows that the ratio of $e$ to $m$ is roughly constant:

$$\frac{e}{m} \approx 0.0171,$$

where the number is the average of the ratios $e_i/m_i$, $i = 1, 2, \ldots, 11$. Since the ratio $e/m$ is roughly the constant 0.0171, we may say that $m$ and $e$ are proportional, with the relation $e = 0.0171m$.
When we plot the points $(m, e)$, we observe that a curve passing close to all the points looks like a straight line through the origin. See Figure 1.2.
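The averaging step in Example 1.0.2 can be reproduced with a few lines of Python. This is only a sketch: the variable names are illustrative, and the estimate is the plain average of the ratios, as in the text, not a least-squares fit.

```python
# Spring-mass data copied from the table in Example 1.0.2.
mass = [50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550]
elongation = [1.000, 1.875, 2.750, 3.250, 4.375, 4.875,
              5.675, 6.500, 7.250, 8.000, 8.750]

# Ratio e_i / m_i for each measurement.
ratios = [e / m for e, m in zip(elongation, mass)]

# Estimate the constant of proportionality as the average ratio.
k_estimate = sum(ratios) / len(ratios)

print(f"estimated k = {k_estimate:.4f}")
```

The individual ratios range from about 0.016 to 0.020, which is why the text describes the ratio as only roughly constant.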

A paradigm to use in modeling change is

future value = present value + change, i.e., change = future value - present value.

If the behavior being modeled takes place over discrete time periods, the preceding construction leads to a difference equation. If it takes place continuously with respect to time, it leads to a differential equation.

Figure 1.2: Data from the spring-mass system with proportionality line

§1.1 Modeling Change with Difference Equations.

Definition 1.1.1. For a sequence of numbers $A = \{a_0, a_1, a_2, \ldots\}$, the difference $a_{n+1} - a_n$ is called the $n$th first difference and is denoted by $\Delta a_n$, i.e.,

$$\Delta a_n = a_{n+1} - a_n, \quad n = 0, 1, 2, \ldots.$$

Geometrically, the first difference represents the vertical change in the graph of the sequence during one time period.

Example 1.1.2 (Savings Certificate). Consider the value of a savings certificate initially worth \$1000 that accumulates interest paid each month at 1% per month.
Let $a_n$ be the value of the certificate after $n$ months. Then $a_0 = 1000$ and, because of the interest,

$$\Delta a_0 = a_1 - a_0 = 0.01a_0,$$
$$\Delta a_1 = a_2 - a_1 = 0.01a_1,$$
$$\vdots$$
$$\Delta a_n = a_{n+1} - a_n = 0.01a_n.$$

The last equation can be rewritten as

$$a_{n+1} = a_n + 0.01a_n = 1.01a_n, \quad \text{i.e.,} \quad a_{n+1} = 1.01a_n \text{ with } a_0 = 1000,$$

which is called the dynamical system model; the equation itself is called the dynamical system.

Example 1.1.3. Consider again a savings certificate initially worth \$1000 that accumulates interest paid each month at 1% per month, but now suppose we withdraw \$50 from the account each month.
Let $a_n$ be the value of the certificate after $n$ months. Then $a_0 = 1000$ and, because of the interest and the withdrawal, we have

$$\Delta a_n = a_{n+1} - a_n = 0.01a_n - 50, \quad \text{i.e.,} \quad a_{n+1} = 1.01a_n - 50 \text{ with } a_0 = 1000.$$

How do we describe a change mathematically? Often it is necessary to plot the change, observe a pattern, and then describe the change in mathematical terms. In short, we try to find

$$\text{change} = \Delta a_n = \text{some function } f.$$



Example 1.1.4 (Mortgaging a Home). Six years ago your parents purchased a home by financing \$80,000 for 20 years, paying monthly payments of \$880.87 with a monthly interest rate of 1%. They have now made 72 payments. How much do they currently owe on the mortgage?

ANSWER. Let $b_n$ be the amount of money owed to the bank after $n$ months. Then we have

$$\Delta b_n = b_{n+1} - b_n = 0.01b_n - 880.87, \quad \text{i.e.,} \quad b_{n+1} = 1.01b_n - 880.87 \text{ with } b_0 = 80{,}000.$$

The answer to the question is $b_{72}$, which is easily computed by iterating the system.

Month (n)          0        1        2        3      ...   72
Owed Money (b_n)   80000    79919.1  79837.5  79755  ...   71532.1

Month (n)          237      238      239      240
Owed Money (b_n)   2589.58  1734.6   871.078  -1.0814

Based on the table of $n$ and $b_n$, Figure 1.3 below plots the amount owed from month 0 to month 240.
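The numerical solution in the mortgage table can be generated by iterating the dynamical system directly. The sketch below assumes a small helper, `iterate_linear`, which is a name introduced here for illustration and not something from the textbook.

```python
def iterate_linear(r, b, a0, steps):
    """Return [a_0, a_1, ..., a_steps] for the system a_{n+1} = r * a_n + b."""
    values = [a0]
    for _ in range(steps):
        values.append(r * values[-1] + b)
    return values


# Mortgage of Example 1.1.4: b_{n+1} = 1.01 * b_n - 880.87 with b_0 = 80000.
owed = iterate_linear(r=1.01, b=-880.87, a0=80_000.0, steps=240)

print(f"owed after 72 months:  {owed[72]:.1f}")
print(f"owed after 240 months: {owed[240]:.4f}")
```

The slightly negative value at month 240 means the final payment overshoots the remaining balance by about a dollar, because the monthly payment is rounded to the cent.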

Figure 1.3: Mortgaging a Home

Definition 1.1.5. A sequence is a function whose domain is the set of all nonnegative integers and whose range is a subset of the real numbers. A dynamical system is a relationship among terms in a sequence. A numerical solution is a table of values satisfying the dynamical system.

§1.2 Approximating Change with Difference Equations.
In this section, we approximate some observed change to complete the expression

$$\text{change} = \Delta a_n = \text{some function } f.$$

Topic I. Discrete Versus Continuous Change.
Some changes take place in discrete time intervals, such as the depositing of interest in an account. In this case, we use a difference equation. Other changes happen continuously, such as the change in the temperature of a cold can of soda on a warm day. In this case, a differential equation is appropriate.

Example 1.2.1 (Growth of a Yeast Culture). From an experiment measuring the growth of a yeast culture, we have the following table.

Mortgage: a legal agreement by which a bank, building society, etc. lends money at interest in exchange for taking title of the debtor's property, with the condition that the conveyance of title becomes void upon the payment of the debt (Concise Oxford English Dictionary)

Yeast: a microscopic single-celled fungus capable of converting sugar into alcohol and carbon dioxide


$n$    $p_n$    $\Delta p_n$
0      9.6      8.7
1      18.3     10.7
2      29.0     18.2
3      47.2     23.9
4      71.1     48.0
5      119.1    55.5
6      174.6    82.7
7      257.3

Here $n$ represents the time in hours, $p_n$ the observed yeast biomass, and $\Delta p_n = p_{n+1} - p_n$ the change in biomass. We observe that the ratio $\Delta p_n / p_n$ is roughly constant,

$$\frac{\Delta p_n}{p_n} \approx 0.6057,$$

which is the average of the ratios $\Delta p_i / p_i$, $i = 0, 1, 2, \ldots, 6$. This suggests that $p_n$ and $\Delta p_n$ are proportional with constant of proportionality 0.6057, so that

$$\Delta p_n = 0.6057p_n, \quad \text{i.e.,} \quad p_{n+1} = 1.6057p_n.$$

Since the ratio $p_{n+1}/p_n = 1.6057 > 1$, the model predicts that the population will increase forever. See Figure 1.4.
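The first differences and the averaged growth ratio of Example 1.2.1 can be checked as follows. This is a minimal sketch using only the eight biomass values from the table; the variable names are illustrative.

```python
# Observed yeast biomass p_0, ..., p_7 from the table in Example 1.2.1.
biomass = [9.6, 18.3, 29.0, 47.2, 71.1, 119.1, 174.6, 257.3]

# First differences: delta_p[n] = p_{n+1} - p_n for n = 0, ..., 6.
delta_p = [later - earlier for earlier, later in zip(biomass, biomass[1:])]

# Ratios delta_p[n] / p_n; zip stops after the seven available differences.
ratios = [d / p for d, p in zip(delta_p, biomass)]

# Average ratio, used in the text as the constant of proportionality.
growth_rate = sum(ratios) / len(ratios)

print(f"average ratio = {growth_rate:.4f}")
```

The individual ratios vary noticeably (roughly from 0.47 to 0.91), so the proportionality model is a coarse first approximation.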

Figure 1.4: Change in Biomass versus Biomass

Topic II. Model Refinement: Modeling Births, Deaths, and Resources.
Certain resources (e.g., food) can support only a maximum population level rather than one that increases indefinitely.

Example 1.2.2 (Growth of a Yeast Culture Revisited). Under this restriction, suppose we have the data in Table 1.8 on page 11 of the textbook. The table gives the plot (the one on the left-hand side).

From the graph of the population, the population appears to be approaching a limiting value, which seems to be 665. So we may propose

$$\Delta p_n \propto (665 - p_n)p_n.$$

By computing the ratio $\Delta p_n / ((665 - p_n)p_n)$, we can estimate the constant

$$\frac{\Delta p_n}{(665 - p_n)p_n} \approx 0.000802886,$$

which is the average of $\Delta p_i / ((665 - p_i)p_i)$, $i = 1, 2, \ldots, 18$. This gives the model

$$\Delta p_n = 0.000802886(665 - p_n)p_n, \quad \text{i.e.,} \quad p_{n+1} = p_n + 0.000802886(665 - p_n)p_n \text{ with } p_0 = 9.6.$$

(The textbook uses the constant of proportionality $k = 0.00082$.)
Let us solve our model $p_{n+1} = p_n + 0.000802886(665 - p_n)p_n$ with $p_0 = 9.6$ numerically. That is, for each $n$, we compute $p_n$, plot the data $(n, p_n)$, and compare with the values obtained by the experiment. See Figure 1.5.
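A numerical solution of the constrained-growth model can be sketched as below. The function name `constrained_growth` and the choice of 30 steps are assumptions made for illustration; the experimental data of Table 1.8 are not reproduced here, so only the model values are computed.

```python
def constrained_growth(p0, k, capacity, steps):
    """Iterate p_{n+1} = p_n + k * (capacity - p_n) * p_n and return all values."""
    values = [p0]
    for _ in range(steps):
        p = values[-1]
        values.append(p + k * (capacity - p) * p)
    return values


# Parameters taken from Example 1.2.2.
population = constrained_growth(p0=9.6, k=0.000802886, capacity=665.0, steps=30)

for n in (0, 5, 10, 15, 20, 30):
    print(f"p_{n} = {population[n]:.2f}")
```

With these parameters the computed values increase and level off near the limiting value 665, which matches the qualitative behavior described in the text.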

Figure 1.5: Red: Experimental Result, Blue: Model Result

§1.3 Solutions to Dynamical Systems.

Topic I. Method of Conjecture.
The method of conjecture is a powerful mathematical technique for hypothesizing the form of a solution to a dynamical system and then accepting or rejecting the hypothesis. The method has four steps: Look for Pattern, Conjecture, Test Conjecture, and Conclusion.

Example 1.3.1 (Savings Certificate (Revisited)). A savings certificate initially worth \$1000 accumulates interest paid each month at 1% of the balance. No deposits or withdrawals occur in the account. Letting $a_n$ be the amount in the account after $n$ months, we deduce the dynamical system

$$a_{n+1} = 1.01a_n \text{ with } a_0 = 1000. \tag{1.3.1}$$

Step 1 Look for Pattern:

$$a_1 = 1.01a_0,$$
$$a_2 = 1.01a_1 = 1.01(1.01a_0) = 1.01^2 a_0,$$
$$a_3 = 1.01a_2 = 1.01(1.01^2 a_0) = 1.01^3 a_0,$$
$$\vdots$$
$$a_n = 1.01^n a_0.$$

Step 2 Conjecture: From Step 1, we conjecture $a_n = 1.01^n a_0$.
Step 3 Test Conjecture: The conjecture implies

$$a_{n+1} = 1.01^{n+1} a_0 = 1.01(1.01^n a_0) = 1.01a_n,$$

which is the dynamical system (1.3.1). Thus, our conjecture is correct.
Step 4 Conclusion: The solution of the dynamical system (1.3.1) is $a_n = 1.01^n a_0 = 1.01^n \cdot 1000$.

Topic II. Homogeneous Linear Dynamical System $a_{n+1} = ra_n$, $r$ constant.

Theorem 1.3.2. The solution of the linear dynamical system $a_{n+1} = ra_n$ with constant $r \neq 0$ is $a_n = r^n a_0$, where $a_0$ is the given initial value.

Example 1.3.3 (Sewage Treatment). A sewage treatment plant processes raw sewage to produce usable fertilizer and clean water by removing all other contaminants. The process is such that each hour 12% of the remaining contaminants in a processing tank are removed.
Questions:

1. What percentage of the sewage would remain after 1 day?
2. How long would it take to lower the amount of sewage by half?
3. How long until the level of sewage is down to 10% of the original level?

ANSWER. Let $a_n$ be the amount of sewage contaminants after $n$ hours and $a_0$ the initial amount. Then we build the model

$$a_{n+1} = a_n - 0.12a_n = 0.88a_n, \quad \text{i.e.,} \quad a_{n+1} = 0.88a_n.$$

Using Theorem 1.3.2, we have the solution of the dynamical system,

$$a_n = 0.88^n a_0.$$

Answer to Question 1: Since 1 day is equivalent to 24 hours, the answer is

$$a_{24} = 0.88^{24} a_0 \approx 0.0465a_0.$$

That is, about 4.65% of the contaminants remain, so the level of contaminants in the sewage is reduced by more than 95% by the end of the first day.
Answer to Question 2: We seek the time $n$ satisfying $a_n = 0.5a_0$. So we solve

$$0.5a_0 = 0.88^n a_0 \implies 0.5 = 0.88^n \implies n = \frac{\ln 0.5}{\ln 0.88} \approx 5.42.$$

Hence, it takes about 5.42 hours to lower the contaminants to half their original level.
Answer to Question 3: We seek the time $n$ satisfying $a_n = 0.1a_0$. So we solve

$$0.1a_0 = 0.88^n a_0 \implies 0.1 = 0.88^n \implies n = \frac{\ln 0.1}{\ln 0.88} \approx 18.01.$$

Hence, it takes about 18 hours before the contaminants are reduced to 10% of their original level.
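The three answers of Example 1.3.3 follow from the closed form $a_n = 0.88^n a_0$ and can be checked numerically. The sketch below uses only the standard `math` module; the variable names are illustrative.

```python
import math

# Fraction of contaminants kept each hour (12% is removed).
hourly_factor = 0.88

# Question 1: fraction remaining after 24 hours.
remaining_after_day = hourly_factor ** 24

# Questions 2 and 3: solve hourly_factor ** n = target for n.
hours_to_half = math.log(0.5) / math.log(hourly_factor)
hours_to_tenth = math.log(0.1) / math.log(hourly_factor)

print(f"remaining after 1 day: {remaining_after_day:.4f}")
print(f"hours to reach 50%:    {hours_to_half:.2f}")
print(f"hours to reach 10%:    {hours_to_tenth:.2f}")
```

Because the model is stated in whole hours, a non-integer answer such as 5.42 should be read as "between the 5th and 6th hour" if only hourly measurements are meaningful.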

Sewage: refuse liquids or waste matter usually carried off by sewers



Topic III. Long-Term Behavior of $a_{n+1} = ra_n$, $r$ constant.
By Theorem 1.3.2, the solution of $a_{n+1} = ra_n$ is $a_n = r^n a_0$. We observe the long-term behavior of $a_n = r^n a_0$ when $n$ is sufficiently large.

Value of $r$    Behavior of $a_n$
$r = 0$         $a_n = 0$ for all $n \geq 1$, so the dynamical system has a constant solution and equilibrium value at 0.
$r = 1$         $a_n = a_0$ for all $n$, so all initial values are constant solutions.
$r < 0$         The sequence $a_n$ oscillates in sign.
$|r| < 1$       The sequence $a_n$ converges to 0, i.e., it decays to the limiting value 0.
$|r| > 1$       The sequence $a_n$ diverges (for $a_0 \neq 0$), i.e., it grows without bound in absolute value.

Topic IV. Nonhomogeneous Linear Dynamical System $a_{n+1} = ra_n + b$, $r$ and $b$ constant.

Definition 1.3.4. If a dynamical system $a_{n+1} = f(a_n)$ has a constant solution $a_n = c$ for all $n$, then the constant $c$ is called an equilibrium value or fixed point of the system.

Example 1.3.5. Consider $a_{n+1} = 0.5a_n + 0.1$.
(1) When $a_0 = 0.1$, the given dynamical system implies

$$a_1 = (0.5)(0.1) + 0.1 = 0.15, \quad a_2 = (0.5)(0.15) + 0.1 = 0.175, \quad \ldots, \quad a_{15} = 0.19999695.$$

So we may expect that $a_n \to 0.2$ as $n \to \infty$.
(2) When $a_0 = 0.2$, the given dynamical system implies

$$a_1 = (0.5)(0.2) + 0.1 = 0.2, \quad a_2 = (0.5)(0.2) + 0.1 = 0.2, \quad \ldots, \quad a_{15} = 0.2.$$

So we can deduce $a_n = 0.2 = a_0$ for any integer $n \geq 0$. That is, 0.2 is the equilibrium value.
(3) When $a_0 = 0.3$, the given dynamical system implies

$$a_1 = (0.5)(0.3) + 0.1 = 0.25, \quad a_2 = (0.5)(0.25) + 0.1 = 0.225, \quad \ldots, \quad a_{15} = 0.20000305.$$

So we may expect that $a_n \to 0.2$ as $n \to \infty$.
When we look at the graphs of the sequences (1), (2) and (3), we observe that whatever the initial value is, the sequence converges to the equilibrium value 0.2. We say this equilibrium value is stable.

Figure 1.6: Red (lower): $a_0 = 0.1$, Blue (middle): $a_0 = 0.2$, Black (upper): $a_0 = 0.3$

Example 1.3.6. Consider $b_{n+1} = 1.01b_n - 1000$.
(1) When $b_0 = 90000$, the given dynamical system implies

$$b_1 = (1.01)(90000) - 1000 = 89900, \quad b_2 = (1.01)(89900) - 1000 = 89799, \quad \ldots, \quad b_{15} = 88390.$$

So we may expect that $b_n \not\to 100000$ as $n \to \infty$.
(2) When $b_0 = 100000$, the given dynamical system implies

$$b_1 = (1.01)(100000) - 1000 = 100000, \quad b_2 = (1.01)(100000) - 1000 = 100000, \quad \ldots, \quad b_{15} = 100000.$$

So we can deduce $b_n = 100000 = b_0$ for any integer $n \geq 0$. That is, 100000 is the equilibrium value.
(3) When $b_0 = 110000$, the given dynamical system implies

$$b_1 = (1.01)(110000) - 1000 = 110100, \quad b_2 = (1.01)(110100) - 1000 = 110201, \quad \ldots, \quad b_{15} = 111610.$$

So we may expect that $b_n \not\to 100000$ as $n \to \infty$.
When we look at the graphs of the sequences (1), (2) and (3), we observe that unless the initial value is exactly the equilibrium value 100000, the sequence moves away from it. We say this equilibrium value is unstable.

Figure 1.7: Red (Lower): b0 = 90000, Blue (Middle): b0 = 100000, Black (Upper): b0 = 110000



Topic V. Finding and Classifying Equilibrium Values.
Suppose $a_{n+1} = ra_n + b$ has an equilibrium value $a \neq 0$. Then by definition, we get

$$a_{n+1} = a = a_n.$$

Putting this into the dynamical system, we have

$$a = ra + b \implies a = \frac{b}{1 - r} \quad (r \neq 1).$$

This gives the following theorem.

Theorem 1.3.7. If $a_{n+1} = ra_n + b$ has a nonzero equilibrium value $a$, then the equilibrium value is given by

$$a = \frac{b}{1 - r} \quad (r \neq 1).$$

(1) If $r = 1$ and $b = 0$, then the dynamical system becomes $a_{n+1} = a_n$, i.e., every number is an equilibrium value.
(2) If $r = 1$ but $b \neq 0$, then there is no equilibrium value.

Example 1.3.8. Let us use Theorem 1.3.7 to find the equilibrium values of the dynamical systems $a_{n+1} = 0.5a_n + 0.1$ in Example 1.3.5 and $b_{n+1} = 1.01b_n - 1000$ in Example 1.3.6.
Theorem 1.3.7 gives the equilibrium values $a$ and $b$ of the two systems:

$$a = \frac{0.1}{1 - 0.5} = 0.2, \quad b = \frac{-1000}{1 - 1.01} = 100000.$$

Through examples, we can observe the following long-term behavior for $a_{n+1} = ra_n + b$, $b \neq 0$.

Value of $r$    Long-Term Behavior
$|r| < 1$       Stable equilibrium value
$|r| > 1$       Unstable equilibrium value
$r = 1$         Graph is a straight line with no equilibrium value
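Theorem 1.3.7 and the classification table can be turned into two small helper functions. The names `equilibrium` and `classify` are hypothetical, and the label returned for $|r| = 1$ is a placeholder of this sketch, since the table above only covers $r = 1$ among the borderline cases.

```python
def equilibrium(r, b):
    """Equilibrium value b / (1 - r) of a_{n+1} = r * a_n + b.

    Returns None when r == 1, where Theorem 1.3.7 gives either no
    equilibrium (b != 0) or infinitely many (b == 0).
    """
    if r == 1:
        return None
    return b / (1 - r)


def classify(r):
    """Classify the equilibrium according to the size of |r|."""
    if abs(r) < 1:
        return "stable"
    if abs(r) > 1:
        return "unstable"
    return "borderline (|r| = 1)"


# The two systems of Examples 1.3.5 and 1.3.6.
for r, b in [(0.5, 0.1), (1.01, -1000.0)]:
    print(f"r = {r}, b = {b}: equilibrium = {equilibrium(r, b):.4f}, {classify(r)}")
```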

Theorem 1.3.9. The dynamical system $a_{n+1} = ra_n + b$ with $r \neq 1$ has the solution

$$a_n = r^n c + \frac{b}{1 - r}, \tag{1.3.2}$$

where $c$ is a constant depending on the initial condition; explicitly,

$$a_0 = r^0 c + \frac{b}{1 - r} = c + \frac{b}{1 - r} \implies c = a_0 - \frac{b}{1 - r}.$$

So the solution can be rewritten as

$$a_n = r^n \left( a_0 - \frac{b}{1 - r} \right) + \frac{b}{1 - r} = r^n a_0 + \frac{b(1 - r^n)}{1 - r}.$$

PROOF. Substituting (1.3.2) into the given system, we have

$$a_{n+1} = r^{n+1} c + \frac{b}{1 - r}, \quad \text{and} \quad ra_n + b = r \left( r^n c + \frac{b}{1 - r} \right) + b = r^{n+1} c + \frac{b}{1 - r}.$$

So $a_n$ given in (1.3.2) satisfies the given system $a_{n+1} = ra_n + b$. Therefore, (1.3.2) is the solution.

We observe that the second term in the solution (1.3.2) is the equilibrium value of the given system $a_{n+1} = ra_n + b$.

Example 1.3.10. Solve $a_{n+1} = 1.01a_n - 1000$.
ANSWER. By Example 1.3.6 or Example 1.3.8, the given system has the equilibrium value 100000. Theorem 1.3.9 gives the solution

$$a_n = 1.01^n c + 100000 \implies a_0 = c + 100000 \implies c = a_0 - 100000 \implies a_n = 1.01^n (a_0 - 100000) + 100000,$$

where $a_0$ is the initial value.
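As a sanity check on Theorem 1.3.9, the closed-form solution can be compared with direct iteration. This is a sketch with hypothetical function names; it assumes $r \neq 1$, as the formula requires.

```python
def closed_form(r, b, a0, n):
    """Solution a_n = r^n * (a0 - e) + e with e = b / (1 - r); requires r != 1."""
    e = b / (1 - r)
    return r ** n * (a0 - e) + e


def iterate(r, b, a0, n):
    """Compute a_n by applying a_{k+1} = r * a_k + b exactly n times."""
    a = a0
    for _ in range(n):
        a = r * a + b
    return a


# Example 1.3.6, case (1): b_0 = 90000, fifteen steps.
print(closed_form(1.01, -1000.0, 90000.0, 15))
print(iterate(1.01, -1000.0, 90000.0, 15))
```

The two printed values should agree up to floating-point rounding and should be close to the value $b_{15} = 88390$ quoted in Example 1.3.6.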

Topic VI. Nonlinear Systems.
We recall the model in the Example of Growth of a Yeast Culture (Revisited), with the textbook's constant 0.00082:

$$p_{n+1} = p_n + 0.00082(665 - p_n)p_n = 1.5453p_n - 0.00082p_n^2 = 1.5453(1 - 0.0005306p_n)p_n,$$
$$0.0005306p_{n+1} = 1.5453(1 - 0.0005306p_n)(0.0005306p_n),$$

which is a nonlinear dynamical system. Letting $a_n = 0.0005306p_n$ and $r = 1.5453$, the equation can be rewritten as

$$a_{n+1} = r(1 - a_n)a_n.$$

In this system, when we experiment with various values of $r$ ($r = 1.5453$, $r = 2.750$, $r = 3.250$, $r = 3.525$ and $r = 3.555$), we obtain plots showing various long-term phenomena. (See Figure 1.21 on page 32 in the textbook.)
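The long-term behavior for the listed values of $r$ can be explored numerically. The sketch below assumes the starting value $a_0 = 0.0005306 \cdot 9.6$ (the rescaled initial biomass) and 500 iterations; both choices are illustrative, not taken from the textbook.

```python
def logistic_orbit(r, a0, steps):
    """Return [a_0, ..., a_steps] for the system a_{n+1} = r * (1 - a_n) * a_n."""
    values = [a0]
    for _ in range(steps):
        a = values[-1]
        values.append(r * (1 - a) * a)
    return values


a0 = 0.0005306 * 9.6  # rescaled initial biomass (assumed starting point)

for r in (1.5453, 2.750, 3.250, 3.525, 3.555):
    tail = logistic_orbit(r, a0, 500)[-4:]
    print(r, [round(a, 4) for a in tail])
```

Printing the last few terms is a quick way to see whether the sequence settles to a single value or keeps cycling among several values. For the first two values of $r$ the tail should sit at the nonzero fixed point $1 - 1/r$, while for $r = 3.250$ it should alternate between two values.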

§1.4 Systems of Difference Equations.
In the previous section, we studied the equilibrium value of a single linear or nonlinear dynamical system. In this section, we study the equilibrium values of a pair of dynamical systems that interact with each other.

Example 1.4.1 (Car Rental Company). A car rental company has distributorships in Orlando and Tampa. In analyzing the historical records, it is determined that 60% of the cars rented in Orlando are returned to Orlando, whereas 40% end up in Tampa. Of the cars rented from the Tampa office, 70% are returned to Tampa, whereas 30% end up in Orlando.
Questions:

1. Will a sufficient number of cars end up in each city to satisfy the demand for cars in that city?
2. If not, how many cars must the company transport from Orlando to Tampa or from Tampa to Orlando?

ANSWER. Let $O_n$ and $T_n$ be the number of cars in Orlando and Tampa, respectively, at the end of day $n$. Then, we can build the following dynamical system model:

$$O_{n+1} = 0.6O_n + 0.3T_n \quad \text{and} \quad T_{n+1} = 0.7T_n + 0.4O_n,$$

which is a system of difference equations.
Suppose the system has the equilibrium values $O_n = O$ and $T_n = T$. Then putting them into the equations, we have

$$O = 0.6O + 0.3T \quad \text{and} \quad T = 0.7T + 0.4O \implies O = \frac{3}{4}T.$$

So if the Orlando and Tampa offices initially have $O_0 = 3000$ and $T_0 = 4000$ cars, respectively, then we observe

$$O_1 = 0.6(3000) + 0.3(4000) = 3000, \quad T_1 = 0.7(4000) + 0.4(3000) = 4000,$$
$$O_2 = 0.6(3000) + 0.3(4000) = 3000, \quad T_2 = 0.7(4000) + 0.4(3000) = 4000,$$
$$\vdots$$
$$O_n = 3000, \quad T_n = 4000.$$

That is, the system remains at the initial value $(O_n, T_n) = (3000, 4000) = (O_0, T_0)$.
In fact, even if we change the initial values, we observe that eventually $O_n \to 3000$ and $T_n \to 4000$ (assuming the total number of cars is 7000).

         Case 1   Case 2   Case 3   Case 4
$O_0$    7000     5000     2000     0
$T_0$    0        2000     5000     7000

For the various starting values in the table above, Figure 1.8 shows that $O_n$ and $T_n$ approach the equilibrium values, i.e., $O_n \to 3000$ and $T_n \to 4000$.
Answers to Questions 1 and 2: Even if an office starts with an insufficient number of cars, the numbers move toward the ideal values (i.e., the equilibrium values) within a few days without any intervention. So we do not have to transport any cars from one city to the other.
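The convergence claimed for the four cases can be checked by iterating the system. The function name `simulate_fleet` and the 7-day horizon below are assumptions for illustration.

```python
def simulate_fleet(orlando0, tampa0, days):
    """Iterate O_{n+1} = 0.6 O_n + 0.3 T_n, T_{n+1} = 0.4 O_n + 0.7 T_n."""
    history = [(orlando0, tampa0)]
    for _ in range(days):
        o, t = history[-1]
        # Both updates use the values from the same day n.
        history.append((0.6 * o + 0.3 * t, 0.4 * o + 0.7 * t))
    return history


# The four starting cases from the table (7000 cars in total).
for o0, t0 in [(7000, 0), (5000, 2000), (2000, 5000), (0, 7000)]:
    o, t = simulate_fleet(o0, t0, days=7)[-1]
    print(f"start ({o0}, {t0}) -> after 7 days ({o:.1f}, {t:.1f})")
```

The total $O_n + T_n$ is preserved at every step, which is why the limit depends only on the total number of cars and not on how they are initially split.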

Figure 1.8: Red: $O_n$, Blue: $T_n$

Example 1.4.2 (Competitive Hunter Model: Spotted Owls and Hawks). Suppose a species of spotted owls competes for survival in a habitat that also supports hawks. Suppose also that in the absence of the other species, each individual species exhibits unconstrained growth in which the change in the population during an interval of time (e.g., 1 day) is proportional to the population size at the beginning of the interval.

Let $O_n$ and $H_n$ denote the sizes of the spotted owl and hawk populations, respectively, at the end of day $n$. Then by the assumption on proportionality, we have

$$\Delta O_n \propto O_n \text{ and } \Delta H_n \propto H_n \implies \Delta O_n = k_1 O_n \text{ and } \Delta H_n = k_2 H_n,$$

where $k_1$ and $k_2$ are constants.
Assuming that the decrease of each population due to competition is proportional to the product of $O_n$ and $H_n$, the system becomes

$$\Delta O_n = k_1 O_n - k_3 O_n H_n \quad \text{and} \quad \Delta H_n = k_2 H_n - k_4 O_n H_n,$$
$$O_{n+1} = (1 + k_1)O_n - k_3 O_n H_n \quad \text{and} \quad H_{n+1} = (1 + k_2)H_n - k_4 O_n H_n,$$

where $k_1$, $k_2$, $k_3$ and $k_4$ are constants. We fix the constants $k_i$ and consider the system:

$$O_{n+1} = 1.2O_n - 0.001O_n H_n \quad \text{and} \quad H_{n+1} = 1.3H_n - 0.002O_n H_n.$$

Letting $O$ and $H$ be the equilibrium values, we have

$$O = 1.2O - 0.001OH, \quad H = 1.3H - 0.002OH \implies 0 = O(0.2 - 0.001H), \quad 0 = H(0.3 - 0.002O).$$

Besides the trivial solution $(O, H) = (0, 0)$, this gives the equilibrium values $(O, H) = (150, 200)$.
Plotting $O_n$ and $H_n$ with various initial values as given in the table,

         Case 1   Case 2   Case 3
$O_0$    151      149      10
$H_0$    199      201      10

we observe $O_n \not\to 150$ and $H_n \not\to 200$ in every case. In Case 1, the population of owls grows indefinitely while the population of hawks goes extinct. In Case 2, the opposite phenomenon occurs. That is, in either case, one of the two species drives the other to extinction. It is interesting to see that in Case 3 the population of hawks grows while the owls die out. See Figure 1.28 on pages 43-45 in the textbook.
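The sensitivity to initial values can be seen by iterating the system for the three cases. This is a sketch with a hypothetical function name and a short 20-day horizon; over long horizons the discrete model can produce negative populations (for example once $O_n$ exceeds 650 the hawk update changes sign), at which point the numbers no longer have a biological meaning.

```python
def competitive_hunter(owls0, hawks0, days):
    """Iterate O_{n+1} = 1.2 O_n - 0.001 O_n H_n, H_{n+1} = 1.3 H_n - 0.002 O_n H_n."""
    history = [(owls0, hawks0)]
    for _ in range(days):
        o, h = history[-1]
        # Both updates use the populations from the same day n.
        history.append((1.2 * o - 0.001 * o * h, 1.3 * h - 0.002 * o * h))
    return history


# Initial values from the table in Example 1.4.2.
for owls0, hawks0 in [(151, 199), (149, 201), (10, 10)]:
    owls, hawks = competitive_hunter(owls0, hawks0, days=20)[-1]
    print(f"start ({owls0}, {hawks0}) -> day 20: owls {owls:.1f}, hawks {hawks:.1f}")
```

Starting exactly at $(150, 200)$ keeps both populations fixed, while a deviation of a single animal in either direction already changes which species declines.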

Let us compare the results on equilibrium values in Examples 1.4.1 (Car Rental Company) and 1.4.2 (Owls and Hawks). The equilibrium values in Example 1.4.1 (Car Rental Company) are stable and insensitive to the initial conditions, while those in Example 1.4.2 (Owls and Hawks) are unstable and very sensitive to the initial conditions.

Chapter 2

The Modeling Process

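§2.1 Mathematical Models.
Read the textbook.

§2.2 Modeling Using Proportionality.

Topic I. Introduction.
From Chapter 1, we recall

$y \propto x$ (i.e., $x$ and $y$ are proportional to each other) if and only if $y = kx$ for some constant $k > 0$.

It is easy to see that

$$y \propto x \iff x \propto y.$$

Example 2.2.1.

1. $y \propto x^2$ if and only if $x \propto y^{1/2}$ (for $x > 0$).
2. $y \propto x^n$ for a fixed constant $n \neq 0$ if and only if $x \propto y^{1/n}$ (for $x > 0$).
3. $y \propto \ln x$ if and only if $\ln x \propto y$, i.e., $x = e^{y/k}$ for some constant $k$.
4. $y \propto e^x$ if and only if $e^x \propto y$, i.e., $x = \ln(y/k)$ for some constant $k$.

PROOF. Skip. But one should be able to prove all of them.

Property 2.2.2 (TRANSITIVITY). If $z \propto y$ and $y \propto x$, then $z \propto x$.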

Topic II. Geometric Interpretation.
Suppose $x$ and $y$ are proportional to each other. Then there is a constant $k$ such that $y = kx$. Clearly the graph of $y = kx$ is a line with slope $k$ passing through the origin. That is, for proportional $x$ and $y$, the graph must be a line and must pass through the origin. So $u$ and $v$ satisfying $v = mu + b$ with constants $m$ and $b \neq 0$ cannot be proportional.

Example 2.2.3. Suppose there is an open box floating in a water tank. As we add heavy objects to the box, water flows out of the tank. We recall the fact (Archimedes' principle) that the weight of the water displaced by the floating loaded box equals the weight of the loaded box; ignoring units, we identify the volume of displaced water with this weight. For example, if we add a ball to the box and the volume of the water that has flowed out of the tank is 5, then we can say the weight of the ball together with the box is 5 (ignoring all units).
Let $y$ be the volume of the water displaced by the loaded box and $x$ be the weight of the loaded box. Then we have $y = x$. However, letting $z$ be the weight of the added ball alone, we have $y = z + (\text{weight of the box})$, i.e., $y$ and $z$ cannot be proportional, because of the constant term. See Figure 2.1.

[Figure: displaced volume $y$ plotted against added weight $x$]

Figure 2.1: It is not a proportionality because the line fails to pass through the origin.

In a case like Example 2.2.3, are we prohibited from assuming proportionality? No, we are not. In fact, the answer depends on the problem, specifically on the slope. Let us consider two lines with the same slope, $L: y = mx$ and $M: y = mx + b$, where $b \neq 0$. Let $(x_0, y_L)$ and $(x_0, y_M)$ be points on the lines $L$ and $M$, respectively. That is,

$$y_L = mx_0 \quad \text{and} \quad y_M = mx_0 + b.$$

Then, we can compute

$$y_M - y_L = mx_0 + b - mx_0 = b, \quad \text{i.e.,} \quad y_M - y_L = b.$$

We divide both sides by $y_M$:

$$\frac{y_M - y_L}{y_M} = \frac{b}{y_M}. \tag{2.2.1}$$

Now we observe:

1. If the slope $m$ is relatively large (e.g., compared to 1), then $y_M$ should be relatively large too, which implies the ratio (2.2.1) is close to zero, i.e.,

$$\frac{y_M - y_L}{y_M} \approx 0, \quad \text{i.e.,} \quad y_M \approx y_L \text{ relative to the size of } y_M.$$

So in this case, we may assume proportionality for data fitting the model $y = mx + b$.
2. If the slope $m$ is relatively small (e.g., compared to 1), then $y_M$ should be relatively small too, which implies the ratio (2.2.1) is not close to zero, i.e.,

$$\frac{y_M - y_L}{y_M} \not\approx 0, \quad \text{i.e.,} \quad y_M \not\approx y_L.$$

So in this case, we cannot assume proportionality for data fitting the model $y = mx + b$.

In a nutshell, if the data fit the model $y = mx + b$ and $m$ is relatively large, then we can assume proportionality even though $b \neq 0$. See Figure 2.2.

[Figure: two panels, each showing the lines $y = mx + b$ and $y = mx$ plotted against $x$, one with a large slope and one with a small slope]

Figure 2.2: Assumable and Unassumable Proportionality

Example 2.2.4 (KEPLER'S THIRD LAW). What is the relationship between the orbital period and the mean distance between the sun and a planet in the solar system?

ANSWER. Method 1. Kepler's Third Law:
Kepler's Third Law: The square of the orbital period of a planet is directly proportional to the cube of the semi-major axis of its orbit.
Symbolically, it can be written as

$$T^2 \propto R^3,$$

where $T$ is the orbital period of the planet and $R$ is the semi-major axis of the orbit, i.e., the mean distance between the sun and the planet. The proportionality constant is the same for any planet around the Sun,

$$\frac{T_{\text{Earth}}^2}{R_{\text{Earth}}^3} = \frac{T_{\text{Mars}}^2}{R_{\text{Mars}}^3} = \frac{T_{\text{Planet}}^2}{R_{\text{Planet}}^3}.$$

Computing the ratio for the Earth, with $T_{\text{Earth}} = 365.25$ days and $R_{\text{Earth}} = 92.9$ million miles $= 149.508058$ million kilometers, we have the constant of proportionality,

$$\frac{365.25^2}{92.9^3} = 0.166392 \quad (\text{days}^2 / (\text{millions of miles})^3).$$

Thus, we deduce that for any planet, $T^2 = 0.166392R^3$, i.e., $T = 0.166392^{1/2} R^{3/2} = 0.407912R^{3/2}$.
Method 2. Modeling Method based on Data: The following table is from the 1993 World Almanac.

Planet     Period $T$ (days)   Mean Distance $R$ (millions of miles)
Mercury    88.0                36
Venus      224.7               67.25
Earth      365.3               93
Mars       687.0               141.75
Jupiter    4331.8              483.80
Saturn     10760.0             887.97
Uranus     30684.0             1764.50
Neptune    60188.3             2791.05
Pluto      90466.8             3653.90

When we plot the points $(R^{3/2}, T)$, they lie close to a straight line passing through the origin. The slope (constant of proportionality) can be estimated from any of the points. However, let us use the least-squares criterion of SECTION 3.3 APPLYING THE LEAST-SQUARES CRITERION. Then we deduce $T = 0.40948R^{3/2}$. See Figure 2.3.
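Both estimates of the constant in $T = kR^{3/2}$ can be reproduced with a short script. The least-squares slope below uses the standard one-parameter formula $k = \sum x_i T_i / \sum x_i^2$ with $x_i = R_i^{3/2}$, which is assumed here to be the criterion the notes refer to in Section 3.3; the variable names are illustrative.

```python
import math

# Method 1: constant from the Earth's period and distance alone.
k_squared_earth = 365.25 ** 2 / 92.9 ** 3
k_earth = math.sqrt(k_squared_earth)

# Method 2: (period in days, mean distance in millions of miles) from the table.
planets = {
    "Mercury": (88.0, 36.0),
    "Venus": (224.7, 67.25),
    "Earth": (365.3, 93.0),
    "Mars": (687.0, 141.75),
    "Jupiter": (4331.8, 483.80),
    "Saturn": (10760.0, 887.97),
    "Uranus": (30684.0, 1764.50),
    "Neptune": (60188.3, 2791.05),
    "Pluto": (90466.8, 3653.90),
}

# Least-squares slope through the origin for T = k * x with x = R ** 1.5.
numerator = sum(period * distance ** 1.5 for period, distance in planets.values())
denominator = sum(distance ** 3 for _, distance in planets.values())
k_least_squares = numerator / denominator

print(f"Method 1: k = {k_earth:.6f}")
print(f"Method 2: k = {k_least_squares:.5f}")
```

Because the sums are dominated by the outer planets, the least-squares slope is determined mostly by Uranus, Neptune and Pluto, which is one reason the two methods give slightly different constants.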

Figure 2.3: Kepler's Third Law as proportionality

Remark 2.2.5 (ASIDE). Kepler's First Law: Each planet moves along an ellipse with the sun at one focus.
Kepler's Second Law: For each planet, the line from the sun to the planet sweeps out equal areas in equal times.

Some famous formulas involving proportionality are as follows:

1. (Hooke's Law) $F = kS$, where $F$ is the restoring force in a spring stretched or compressed a distance $S$.
2. (Newton's Law) $F = ma$ or $a = F/m$, where $a$ is the acceleration of a mass $m$ subjected to a net external force $F$.
3. (Ohm's Law) $V = iR$, where $i$ is the current induced by a voltage $V$ across a resistance $R$.
4. (Boyle's Law) $V = k/p$, where, at constant temperature, the volume $V$ is inversely proportional to the pressure $p$ and $k$ is a constant.
5. (Einstein's Theory of Relativity) $E = c^2 M$, where the energy $E$ is proportional to the mass $M$ of the object and the constant $c^2$ is the speed of light squared.
6. (Kepler's Third Law) $T = cR^{3/2}$, where $T$ is the period (days) and $R$ is the mean distance to the sun.

Topic III. Modeling Vehicular Stopping Distance.
We start by recalling the one-car-length rule (OCL), which allows one car length for every 10 mph of speed. That rule has also been stated as the 2-second rule (2S), which allows 2 seconds between cars. In fact, these two rules are not the same and they are not compatible: if one is true, the other must be wrong. First we show this incompatibility and then develop a better model for the stopping distance.
If we stick to the rule 2S, then at 10 mph a simple computation shows

$$1 \text{ car length} = \text{distance} = \left( \text{speed in } \frac{\text{ft}}{\text{sec}} \right)(2 \text{ sec}) = \left( \frac{10 \text{ miles}}{\text{hr}} \right) \left( \frac{5280 \text{ ft}}{\text{mile}} \right) \left( \frac{1 \text{ hr}}{3600 \text{ sec}} \right)(2 \text{ sec}) \approx 29.33 \text{ ft}.$$

This says that if we follow the rule 2S, then a car should be about 29.33 ft long. However, since the average car length is statistically about 15 ft, the rule OCL must then be wrong.
When we use the correct information on the average car length, we have

$$15 \text{ ft} = 1 \text{ car length} = \left( \text{speed in } \frac{\text{ft}}{\text{sec}} \right)(x \text{ sec}) = \left( \frac{10 \text{ miles}}{\text{hr}} \right) \left( \frac{5280 \text{ ft}}{\text{mile}} \right) \left( \frac{1 \text{ hr}}{3600 \text{ sec}} \right)(x \text{ sec}) = \frac{44}{3}x \implies x = \frac{45}{44} \approx 1.02273.$$

This says that the time gap should be 1.02273 seconds rather than 2 seconds. So if we follow the rule OCL, then the rule 2S becomes wrong. Although the two rules are incompatible, it may be preferable to keep using both rules for road safety.
Now, we recall from Chapter 1,

total stopping distance = reaction distance + braking distance.
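The two unit conversions behind the incompatibility argument can be written out explicitly. This is a small sketch; the helper `mph_to_feet_per_second` is a name introduced here for illustration.

```python
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600
AVERAGE_CAR_LENGTH_FT = 15  # average car length quoted in the text


def mph_to_feet_per_second(speed_mph):
    """Convert a speed from miles per hour to feet per second."""
    return speed_mph * FEET_PER_MILE / SECONDS_PER_HOUR


# Rule 2S at 10 mph: distance covered in 2 seconds.
gap_from_2s_rule_ft = mph_to_feet_per_second(10) * 2

# Rule OCL at 10 mph: time needed to cover one 15 ft car length.
seconds_from_ocl_rule = AVERAGE_CAR_LENGTH_FT / mph_to_feet_per_second(10)

print(f"2S rule implies a gap of {gap_from_2s_rule_ft:.2f} ft at 10 mph")
print(f"OCL rule implies a gap of {seconds_from_ocl_rule:.5f} s")
```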

total stopping distance = reaction distance + braking distance.



Using the collected data on the reaction and braking distances, we observe the proportionalities
\[
d_r \propto v, \qquad d_b \propto v^2,
\]
where $v$ is the car speed and $d_r$ and $d_b$ are the reaction and braking distances, respectively. Explicitly, we deduce
\[
d_r = 1.1v, \qquad d_b = 0.054v^2, \qquad d = 1.1v + 0.054v^2,
\]
where $d$ is the total stopping distance. See Figure 2.4.

Figure 2.4: Red: given data, Blue +: prediction by model, Black: OCL rule

In the left panel of Figure 2.4, the black line is the prediction of the OCL rule, whose equation is $d = 1.5v$ because the rule says $d/v = 15/10$ (ft/mph). However, as the figure shows, it is a poor predictor, especially for speeds above 20 mph. So when we adopt the guideline given in the table,

Speed (mph)       0-10    10-40    40-60    60-75
Guideline (sec)   1       2        3        4

the modified OCL rule is better than the original one, as shown in the right panel of Figure 2.4.
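The unit conversions and the stopping-distance model of this topic can be checked numerically. The following is only a minimal sketch: the function names are hypothetical, the average car length of 15 ft and the coefficients 1.1 and 0.054 are taken from the text, and speeds are assumed to be in mph with distances in ft.

```python
FT_PER_MILE = 5280
SEC_PER_HOUR = 3600

def mph_to_ft_per_sec(v_mph):
    """Convert a speed from miles per hour to feet per second."""
    return v_mph * FT_PER_MILE / SEC_PER_HOUR

def gap_2s(v_mph):
    """Following distance (ft) allowed by the 2-seconds rule."""
    return mph_to_ft_per_sec(v_mph) * 2.0

def gap_ocl(v_mph, car_length_ft=15.0):
    """Following distance (ft) allowed by the one-car-length rule."""
    return car_length_ft * v_mph / 10.0

def seconds_for_gap(gap_ft, v_mph):
    """Time (sec) needed to travel gap_ft at speed v_mph."""
    return gap_ft / mph_to_ft_per_sec(v_mph)

def stopping_distance(v_mph):
    """Total stopping distance d = 1.1 v + 0.054 v^2 from the text."""
    return 1.1 * v_mph + 0.054 * v_mph ** 2

for v in (10, 20, 40, 60):
    print(v, round(gap_2s(v), 2), gap_ocl(v), round(stopping_distance(v), 2))
```

At 10 mph the 2S gap is about 29.33 ft while the OCL gap is 15 ft, and the quadratic model grows much faster than the linear OCL rule as the speed increases.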

§2.3 Modeling Using Geometric Similarity.
Geometric similarity is a concept related to proportionality and can be useful to simplify the mathematical modeling process.

Topic I. Introduction.

Definition 2.3.1. Two objects are said to be geometrically similar if there is a one-to-one correspondence between points of the objects such that the ratio of distances between corresponding points is constant for all possible pairs of points.

Example 2.3.2. Consider two boxes $X$ and $X'$ with lengths $l$, $l'$, widths $w$, $w'$, and heights $h$, $h'$, respectively. See Figure 2.5. Suppose $X$ and $X'$ are geometrically similar, so that there is a one-to-one correspondence between the points $A$, $B$, $C$ and $A'$, $B'$, $C'$ (and all other points), and the ratio of the distances between corresponding points is constant. Then it must be true that
\[
\frac{l}{l'} = \frac{w}{w'} = \frac{h}{h'} = k,
\]

Figure 2.5: Two geometrically similar objects $X$ and $X'$ (boxes with corners $A$, $B$, $C$, $D$ and $A'$, $B'$, $C'$, $D'$, and dimensions $l$, $w$, $h$ and $l'$, $w'$, $h'$)

for some constant $k > 0$.
1. For the two triangles $ABC$ and $A'B'C'$ in the boxes $X$ and $X'$, respectively, we observe that the angles are the same, i.e., $\angle BCA = \angle B'C'A'$ and $\angle BAC = \angle B'A'C'$. Since the boxes $X$ and $X'$ are geometrically similar, those triangles are also geometrically similar.
The shape is the same for two geometrically similar objects, and one object is simply an enlarged copy of the other. We can think of geometrically similar objects as scaled replicas of one another, as in an architectural drawing in which all the dimensions are simply scaled by some constant factor.
2. One of the advantages of geometric similarity is that it simplifies computations. For the boxes $X$ and $X'$ above, the volumes of $X$ and $X'$ are, respectively, $V_X = lwh$ and $V_{X'} = l'w'h'$. It is easy to see

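\[
\frac{V_X}{V_{X'}} = \frac{lwh}{l'w'h'} = k^3, \qquad V_X = k^3 V_{X'}, \qquad V_X \propto V_{X'}.
\]

Similarly, for the total surface areas $S_X$ and $S_{X'}$ of the boxes $X$ and $X'$, we have

\[
\frac{S_X}{S_{X'}} = \frac{2(lw + wh + hl)}{2(l'w' + w'h' + h'l')} = k^2, \qquad S_X = k^2 S_{X'}, \qquad S_X \propto S_{X'}.
\]

Moreover, we can find a relationship between the ratio of volumes and the ratio of surface areas:

\[
\frac{V_X/V_{X'}}{S_X/S_{X'}} = \frac{k^3}{k^2} = k, \qquad \frac{V_X}{V_{X'}} = k\,\frac{S_X}{S_{X'}}, \qquad \frac{V_X}{V_{X'}} \propto \frac{S_X}{S_{X'}}.
\]

Now let us choose $l$ and $l'$ among the dimensions of the boxes. Since $l/l' = k$ and $S_X/S_{X'} = k^2$, we have

\[
\frac{S_X}{S_{X'}} = k^2 = \frac{l^2}{l'^2}, \qquad \frac{S_X}{l^2} = \frac{S_{X'}}{l'^2} \overset{\text{Aside}}{=} \text{constant}.
\]

It implies
\[
S_X = (\text{constant})\, l^2, \qquad S_X \propto l^2; \qquad \text{similarly } S_{X'} \propto l'^2.
\]

By the same argument on $V_X$ and $V_{X'}$, it follows

\[
V_X \propto l^3, \qquad V_{X'} \propto l'^3.
\]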
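The scaling relations for the two boxes can be illustrated numerically. This is a small sketch under assumed values: the dimensions of the smaller box and the scale factor `k` are hypothetical and chosen only for illustration.

```python
def box_volume(l, w, h):
    """Volume of a box with length l, width w, height h."""
    return l * w * h

def box_surface(l, w, h):
    """Total surface area of the same box."""
    return 2 * (l * w + w * h + h * l)

k = 2.5                                  # hypothetical similarity ratio
small = (2.0, 3.0, 4.0)                  # hypothetical l', w', h'
large = tuple(k * d for d in small)      # l, w, h of the similar box

vol_ratio = box_volume(*large) / box_volume(*small)
surf_ratio = box_surface(*large) / box_surface(*small)

print(vol_ratio, k ** 3)    # volume ratio equals k^3
print(surf_ratio, k ** 2)   # surface ratio equals k^2
# S / l^2 is the same constant for both boxes
print(box_surface(*large) / large[0] ** 2, box_surface(*small) / small[0] ** 2)
```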

Remark 2.3.3 (ASIDE). From $S_X/S_{X'} = l^2/l'^2 = \text{constant}$, how can we deduce $S_X/l^2 = S_{X'}/l'^2 = \text{constant}$? First, it is easy to see
\[
\frac{S_X}{S_{X'}} = \frac{l^2}{l'^2} \implies \frac{S_X}{l^2} = \frac{S_{X'}}{l'^2},
\]
by multiplying both sides by $S_{X'}/l^2$. Let us consider two functions $f(x)$ and $g(y)$, where $x$ and $y$ are the only two independent variables. Suppose
\[
\frac{f(x)}{g(y)} = \frac{x}{y}.
\]
Then we have
\[
\frac{f(x)}{x} = \frac{g(y)}{y}.
\]
Letting $F(x) = f(x)/x$ and $G(y) = g(y)/y$, the equation says $F(x) = G(y)$. If a function of $x$ alone equals a function of $y$ alone for all values of the independent variables $x$ and $y$, then both of them must be constant, i.e., $F(x) = G(y) = \text{constant}$. (For instance, one may recall that a polynomial of degree 0 in any independent variable is a constant.) This kind of technique is typically used under the topic of separation of variables in PARTIAL DIFFERENTIAL EQUATIONS.

In Example 2.3.2 above, we argued on $S$ and $V$ with the length $l$. We can develop the same argument with the width $w$ and the height $h$, i.e.,

\[
S \propto w^2, \qquad S \propto h^2, \qquad V \propto w^3, \qquad V \propto h^3.
\]

Once we choose a dimension (in the Example above, it was the length $l$), it is called the characteristic dimension.
Suppose a function $f$ depends on the length $l$, the surface area $S$, and the volume $V$ of a box. Since we can express $S$ and $V$ in terms of the length $l$, the function $f$ can eventually be expressed in terms of $l$, $l^2$, and $l^3$. For instance, if $y = f(l, S, V) = 3l + S - V$, then there are some constants $k_1$ and $k_2$ such that $S = k_1 l^2$ and $V = k_2 l^3$. So we have

\[
y = 3l + S - V = 3l + k_1 l^2 - k_2 l^3,
\]
which is a function of $l$, $l^2$, and $l^3$.

Example 2.3.4 (RAINDROP FROM A MOTIONLESS CLOUD). Suppose we are interested in the terminal velocity (i.e., maximum velocity) of a raindrop falling from a motionless cloud. We assume only two forces act on the raindrop: $F_d$ due to air resistance and $F_g$ due to gravity. Then the net force $F$ (i.e., the sum of all forces) is $F = F_g - F_d$, so the raindrop falls. By Newton's Second Law, the net force $F$ equals $ma$, i.e.,
\[
F_g - F_d = ma,
\]
where $a$ is the acceleration and $m$ is the mass of the raindrop. Since the maximum velocity occurs when the acceleration vanishes (i.e., $a = 0$), the equation for the terminal velocity becomes
\[
F_g - F_d = 0, \qquad F_g = F_d.
\]
Question: What is the relationship between the terminal velocity and the mass of the raindrop?
Assumptions:

(A1) $F_d$ is proportional to the surface area $S$ times the square of its speed $v$, i.e., $F_d \propto Sv^2$.
(A2) $F_g$ is proportional to the weight $w$, i.e., $F_g \propto w$.
(A3) The mass $m$ is proportional to the weight $w$, i.e., $m \propto w$.
(A4) All the raindrops are geometrically similar.

ANSWER. Thanks to (A4), we can use proportionality. We recall that, in general, the surface area $S$ and the volume $V$ of an object are proportional to $l^2$ and $l^3$ for any characteristic dimension $l$, i.e.,
\[
S \propto l^2, \quad V \propto l^3 \implies S \propto V^{2/3}.
\]
Because weight $w$ and mass $m$ are proportional to volume, the transitive rule for proportionality gives
\[
S \propto V^{2/3} \ \text{and}\ V \propto m \implies S \propto m^{2/3}.
\]



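With this result and (A1), we have

\[
F_d \propto Sv^2 \propto m^{2/3}v^2, \quad \text{i.e.,} \quad F_d \propto m^{2/3}v^2.
\]

The assumptions (A2) and (A3) yield

\[
F_g \propto w \ \text{and}\ w \propto m \implies F_g \propto m.
\]

Since $F_d \propto m^{2/3}v^2$ and $F_g \propto m$, there are some positive constants $k_1$ and $k_2$ such that

\[
F_d = k_1 m^{2/3} v^2, \qquad F_g = k_2 m.
\]

The equation $F_g = F_d$ given in the problem implies

\[
k_2 m = k_1 m^{2/3} v^2, \qquad m^{1/3} = \frac{k_1}{k_2} v^2, \qquad m^{1/3} \propto v^2, \qquad m^{1/6} \propto v.
\]

Thus, the terminal velocity of the raindrop is proportional to its mass raised to the one-sixth power.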
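As a quick numerical illustration of the one-sixth power law, the sketch below solves $k_2 m = k_1 m^{2/3} v^2$ for $v$. The constants `k1` and `k2` are hypothetical placeholders (set to 1), since the text does not give their values; only the scaling in $m$ matters here.

```python
import math

def terminal_velocity(m, k1=1.0, k2=1.0):
    """Solve k2*m = k1*m**(2/3)*v**2 for v (k1, k2 are assumed constants)."""
    return math.sqrt(k2 / k1) * m ** (1.0 / 6.0)

v1 = terminal_velocity(1.0)
v64 = terminal_velocity(64.0)

# Multiplying the mass by 64 only doubles the terminal velocity,
# because 64**(1/6) = 2.
print(v64 / v1)
```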

Remark 2.3.5 (ASIDE: STOKES' LAW). Droplets falling in motionless air can be modeled by the differential equation
\[
\frac{d^2y}{dt^2} = 32.2 - \frac{c}{D^2}\frac{dy}{dt},
\]
where $32.2$ is the gravitational acceleration, $c$ is a fixed constant, $D$ is the diameter of the spherical raindrop, and $dy/dt$ is the velocity of the raindrop. So the terminal velocity can be obtained by setting the acceleration to zero in the differential equation,
\[
0 = \frac{d^2y}{dt^2} = 32.2 - \frac{c}{D^2}\frac{dy}{dt}, \qquad \frac{dy}{dt} = \frac{32.2}{c}D^2.
\]

The diameter $D$ is proportional to the radius $r$ of the spherical raindrop, and the volume $V$ of the raindrop is proportional to $r^3$, so we can deduce

\[
V \propto D^3, \qquad D^2 \propto V^{2/3}.
\]

Since the mass $m$ is proportional to the volume $V$, the result above becomes

\[
D^2 \propto V^{2/3} \ \text{and}\ V \propto m \implies D^2 \propto m^{2/3}.
\]

Hence, the differential equation implies

\[
\frac{dy}{dt} \propto D^2 \propto m^{2/3}, \qquad \frac{dy}{dt} \propto m^{2/3},
\]

i.e., the terminal velocity is proportional to the mass raised to the $2/3$ power. This is quite a different result from the one we deduced in Example 2.3.4 above. Why? What happened? The two models assume different laws of air resistance. In the differential equation, the resistance term $(c/D^2)\,dy/dt$ is linear in the velocity; multiplied by the mass $m \propto D^3$, it corresponds to a resistive force proportional to $D\,dy/dt$. In Example 2.3.4, by contrast, we assumed $F_d \propto Sv^2$, which is quadratic in the velocity. In both models the gravitational force is proportional to the mass $m$.
A droplet falling according to the differential equation above never quite reaches its terminal velocity, but gets closer and closer to it. Unless its fall is interrupted by hitting the ground, the velocity eventually becomes so close to the terminal velocity that, for practical purposes, we consider it equal to the terminal velocity.

From the book Concepts of Mathematical Modeling, by Walter J. Meyer.



Topic II. Testing Geometric Similarity.
Are a triangle and a rectangle geometrically similar? Clearly, they are not. Because the two polygons have different numbers of vertices, there is no one-to-one correspondence between their vertices. When we use the geometric similarity assumption, the objects involved must have the same shape.
Now let us fix a shape, for example, a circle. Then we can think of the radius, area, circumference, an arc, the angle of the arc, the length of the arc, and so on. We consider two circles of diameters $d_1$ and $d_2$, respectively. Letting $C_1$ and $C_2$ be the circumferences of the circles, it is straightforward to see

\[
\frac{C_1}{d_1} = \frac{C_2}{d_2} = \pi, \quad \text{and} \quad \frac{C_1}{C_2} = \frac{\pi d_1}{\pi d_2} = \frac{d_1}{d_2}. \tag{2.3.1}
\]

We recall that the length $l$ of an arc (part of a circle) subtending the angle $\theta$ in a circle of radius $r$ is given by $l = r\theta$. Letting $l_1$ and $l_2$ be the lengths of arcs in the circles above subtending the angles $\theta_1$ and $\theta_2$, respectively, we have

\[
l_1 = r_1\theta_1 \quad \text{and} \quad l_2 = r_2\theta_2.
\]

If $\theta_1 \neq \theta_2$, then there is no geometric similarity between those two arcs. So assuming $\theta_1 = \theta_2$, we deduce
\[
\frac{l_1}{r_1} = \frac{l_2}{r_2} = \theta_1 = \theta_2, \quad \text{and} \quad \frac{l_1}{l_2} = \frac{2r_1}{2r_2} = \frac{d_1}{d_2}. \tag{2.3.2}
\]

Combining the two results above yields

\[
\frac{C_1}{C_2} = \frac{d_1}{d_2} = \frac{l_1}{l_2}, \quad \text{i.e.,} \quad \frac{C_1}{C_2} = \frac{l_1}{l_2}.
\]

From these two equations, we deduce that the ratio of distances between corresponding points around any two circles is always the ratio of their diameters.

Example 2.3.6 (MODELING A BASS FISHING DERBY). A sport fishing club wishes to encourage its membership to release their fish immediately after catching them. The club also wishes to grant awards based on the total weight of fish caught. It is suggested that each individual carry a small portable scale. Question: How does someone fishing determine the weight of a fish he/she has caught?

ANSWER. Skip. Read the textbook.

§2.4 Automobile Gasoline Mileage.
Read the textbook.

§2.5 Body Weight and Height, Strength and Agility.
Read the textbook.


Chapter 3

Model Fitting

When analyzing a collection of data points, it is suggested to consider the following three tasks.

1. Fitting a selected model type or types to the data.
2. Choosing the most appropriate model from competing types that have been fitted. For example, we may need to determine whether the best-fitting exponential model is a better model than the best-fitting polynomial model.
3. Making predictions from the collected data.

In the first two tasks, we have a model or competing models explaining the observed behavior of the data; these tasks are discussed in this chapter under model fitting. In the third case, no model explains the observed behavior, so we try to construct an empirical model based on the collected data, which is studied in the following chapter.

Topic I. Relationship Between Model Fitting and Interpolation.
Consider the collected data in Figure 3.1.

Figure 3.1: Observations relating the variables y and x

There are mainly two ways to approximate the given data.

1. Based on the shape of the data, we assume a model type and find its best-fitting member. For example, for the data in Figure 3.1, we assume a quadratic model and find a best-fitting parabola $y = ax^2 + bx + c$, as in Figure 3.2. In this way, we may explain the situation from which the data arise. Usually this approach is theory driven.

2. We can find a curve passing through all those points. Finding such a curve is called spline interpolation, and it will be studied in the following chapter. In this way, we can capture the trend of the data in order to predict between the data points. Usually this approach is data driven. For the collected data, Figure 3.3 shows the curve obtained by spline interpolation.



Figure 3.2: Fitting a parabola $y = ax^2 + bx + c$ to the data points

Figure 3.3: Interpolating the data using a smooth polynomial

Topic II. Sources of Error in the Modeling Process.
For easy reference, we classify errors under the following scheme:

1. Formulation error: for instance, in the example of stopping distance, we ignored the road friction for the braking distance. Because of this omission, the model may be less effective.

2. Truncation error: for instance, when we compute the value of $\sin x$, we may use only $x - x^3/3! + x^5/5!$, and because the remaining terms are truncated, the computation cannot be exact.

3. Roundoff error: for instance, strictly speaking, $0.333333333 - 1/3 \neq 0$. So if we use $0.333333333$ in place of $1/3$, we introduce an error.

4. Measurement error: one can understand this error as human error. When we measure an object with the naked eye, the result may be less accurate than a measurement made by a machine.
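The truncation and roundoff errors in items 2 and 3 can be seen directly in a few lines of code. This is only an illustrative sketch; the function name `sin_taylor3` and the evaluation point `x = 1.0` are choices made here, not part of the notes.

```python
import math

def sin_taylor3(x):
    """Three-term Taylor approximation x - x^3/3! + x^5/5! of sin x."""
    return x - x ** 3 / math.factorial(3) + x ** 5 / math.factorial(5)

x = 1.0
truncation_error = abs(math.sin(x) - sin_taylor3(x))

# Using the nine-digit decimal 0.333333333 in place of 1/3.
roundoff_error = abs(0.333333333 - 1 / 3)

print(truncation_error)   # roughly 2e-4 at x = 1
print(roundoff_error)     # roughly 3.3e-10, small but not zero
```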

§3.1 Fitting Models to Data Graphically.

Topic I. Visual Model Fitting with the Original Data.

Figure 3.4: Minimizing the sum of the absolute deviation from the fitted line

Suppose we want to fit the model $y = ax + b$ to the data shown in Figure 3.4. The data points cannot all be expected to lie exactly along a single straight line, so there will be some vertical discrepancy between the data points and any particular line under consideration. The magnitudes of these vertical discrepancies are called absolute deviations.



Based on the deviations, we may think of two criteria.

1. Minimizing the sum of the absolute deviations from the fitted line, and
2. Minimizing the largest absolute deviation from the fitted line.

For the best-fitting line, we might try to achieve the first one, minimizing the sum of the deviations. For the second one, minimizing the largest deviation, see Figure 3.5.

Figure 3.5: Minimizing the largest absolute deviation from the fitted line

Topic II. Transforming the Data.
Suppose we have the following collected data.

Collected Data:
x       1      2      3      4
y       8.1    22.1   60.1   165

Transformed Data:
x       1      2      3      4
ln y    2.1    3.1    4.1    5.1

Since the data points are suspected to follow the form $y = ce^x$, taking the logarithm of each side gives

\[
\ln y = x + \ln c,
\]

which is a line in the $(x, \ln y)$-plane with slope $1$ and $(\ln y)$-intercept $(x, \ln y) = (0, \ln c)$.
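The transformation can be reproduced in a few lines. The sketch below (variable names are illustrative) takes the logarithm of the collected $y$ values, checks that consecutive values of $\ln y$ differ by about 1 (the slope, since the $x$ values are spaced by 1), and estimates $\ln c$ from each point.

```python
import math

xs = [1, 2, 3, 4]
ys = [8.1, 22.1, 60.1, 165]

ln_ys = [math.log(y) for y in ys]

# Successive differences of ln y; x increases by 1 each step,
# so these approximate the slope of the transformed line.
slopes = [ln_ys[k + 1] - ln_ys[k] for k in range(len(ln_ys) - 1)]

# Each point gives an estimate of the intercept ln c = ln y - x.
ln_c_estimates = [ly - x for x, ly in zip(xs, ln_ys)]

print([round(v, 1) for v in ln_ys])
print([round(s, 3) for s in slopes])
print([round(v, 3) for v in ln_c_estimates])
```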



§3.2 Analytic Methods of Model Fitting.

Topic I. Chebyshev Approximation Criterion.
Goal: For a collection of $m$ data points $(x_i, y_i)$, $i = 1, 2, \dots, m$, and a function $y = f(x)$ of a given type (as in the example below), we want to minimize the largest absolute deviation between the data and the function, i.e., minimize

\[
\text{Maximum of } |y_i - f(x_i)|, \quad i = 1, 2, \dots, m. \tag{3.2.1}
\]
In other words, if $f(x) = ax + b$, then we want to find $a$ and $b$ which minimize the maximum value in (3.2.1). This criterion is often called the Chebyshev approximation criterion.
Rewriting the Problem: Let the residuals be
\[
r_i = y_i - f(x_i), \quad i = 1, 2, \dots, m
\]
(taken with sign, so that the constraints below are linear in the parameters whenever $f$ is). Then we have
\[
r = \text{Maximum of } |y_i - f(x_i)| = \text{Maximum of } |r_i|, \quad i = 1, 2, \dots, m,
\]
and so $r$ is the largest absolute deviation, and we want to minimize this $r$. Since $r$ is the maximum value of all the $|r_i|$, $i = 1, 2, \dots, m$, it is easy to see
\[
|r_i| \le r \implies -r \le r_i \le r \implies 0 \le r - r_i \ \text{and}\ 0 \le r + r_i, \quad i = 1, 2, \dots, m.
\]
Thus, the whole problem can be rewritten as follows:

Minimize $r$ (i.e., find the minimum value of $r$)

subject to
\[
0 \le r - r_i \ \text{and}\ 0 \le r + r_i, \quad i = 1, 2, \dots, m.
\]

This kind of problem (minimizing or maximizing a linear function subject to linear inequality constraints) is called a linear program, a particular kind of optimization problem.
Strategy: Use a computer implementation of the algorithm known as the Simplex Method, which will be discussed later in Chapter 7. In the examples below, we will see how to minimize the largest deviation for a given function via Mathematica.

Example 3.2.1. For the following data set, formulate the mathematical model that minimizes the largest deviation between the data and the line $y = ax + b$. (We use Mathematica to estimate $a$ and $b$.)

x    1.0    2.3    3.7    4.2    6.1    7.0
y    3.6    3.0    3.2    5.1    5.3    6.8

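ANSWER. Let $r$ be the largest absolute deviation between the data and $f(x) = ax + b$. Then each absolute deviation $|y_i - f(x_i)|$ satisfies

\[
\begin{aligned}
|3.6 - f(1.0)| = |3.6 - 1.0a - b| \le r &\implies -r \le 3.6 - 1.0a - b \le r \implies 0 \le r - 1.0a - b + 3.6 \ \text{and}\ 0 \le r + 1.0a + b - 3.6,\\
|3.0 - f(2.3)| = |3.0 - 2.3a - b| \le r &\implies -r \le 3.0 - 2.3a - b \le r \implies 0 \le r - 2.3a - b + 3.0 \ \text{and}\ 0 \le r + 2.3a + b - 3.0,\\
|3.2 - f(3.7)| = |3.2 - 3.7a - b| \le r &\implies -r \le 3.2 - 3.7a - b \le r \implies 0 \le r - 3.7a - b + 3.2 \ \text{and}\ 0 \le r + 3.7a + b - 3.2,\\
|5.1 - f(4.2)| = |5.1 - 4.2a - b| \le r &\implies -r \le 5.1 - 4.2a - b \le r \implies 0 \le r - 4.2a - b + 5.1 \ \text{and}\ 0 \le r + 4.2a + b - 5.1,\\
|5.3 - f(6.1)| = |5.3 - 6.1a - b| \le r &\implies -r \le 5.3 - 6.1a - b \le r \implies 0 \le r - 6.1a - b + 5.3 \ \text{and}\ 0 \le r + 6.1a + b - 5.3,\\
|6.8 - f(7.0)| = |6.8 - 7.0a - b| \le r &\implies -r \le 6.8 - 7.0a - b \le r \implies 0 \le r - 7.0a - b + 6.8 \ \text{and}\ 0 \le r + 7.0a + b - 6.8.
\end{aligned}
\]
So the constraints are

\[
\begin{aligned}
0 &\le r - 1.0a - b + 3.6, & 0 &\le r + 1.0a + b - 3.6,\\
0 &\le r - 2.3a - b + 3.0, & 0 &\le r + 2.3a + b - 3.0,\\
0 &\le r - 3.7a - b + 3.2, & 0 &\le r + 3.7a + b - 3.2,\\
0 &\le r - 4.2a - b + 5.1, & 0 &\le r + 4.2a + b - 5.1,\\
0 &\le r - 6.1a - b + 5.3, & 0 &\le r + 6.1a + b - 5.3,\\
0 &\le r - 7.0a - b + 6.8, & 0 &\le r + 7.0a + b - 6.8.
\end{aligned}
\]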

We want to find $a$ and $b$, i.e., $f(x) = ax + b$, minimizing the largest absolute deviation $r$ subject to the 12 constraints above.
By computer, we obtain

\[
f(x) = 0.533333x + 2.14667,
\]

and the largest absolute deviation of this line is $r = 0.92$, which is attained at the three data points $(1.0, 3.6)$, $(3.7, 3.2)$, and $(7.0, 6.8)$.
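For a small data set, the Chebyshev line can also be found without a linear-programming solver. The sketch below is not the Simplex Method or the Mathematica computation used in the notes; it is a brute-force alternative that relies on the fact that, for distinct $x$ values, the best line has residuals of equal size and alternating sign at three of the data points. It tries every triple of points, builds the line that equalizes the residuals on that triple, and keeps the candidate with the smallest largest deviation over all the data. The function name is hypothetical.

```python
from itertools import combinations

def chebyshev_line(xs, ys):
    """Brute-force minimax line y = a*x + b (assumes distinct x values)."""
    pts = sorted(zip(xs, ys))
    best = None
    for (x1, y1), (x2, y2), (x3, y3) in combinations(pts, 3):
        # Residuals +h, -h, +h at x1 < x2 < x3 determine a and b.
        a = (y3 - y1) / (x3 - x1)
        b = (y1 + y2 - a * (x1 + x2)) / 2
        r = max(abs(y - a * x - b) for x, y in pts)
        if best is None or r < best[2]:
            best = (a, b, r)
    return best

xs = [1.0, 2.3, 3.7, 4.2, 6.1, 7.0]
ys = [3.6, 3.0, 3.2, 5.1, 5.3, 6.8]
a, b, r = chebyshev_line(xs, ys)
print(round(a, 6), round(b, 5), round(r, 5))   # 0.533333 2.14667 0.92
```

The search costs on the order of $m^4$ operations, so it is only meant as a check for small examples such as this one.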

Topic II. Minimizing the Sum of the Absolute Deviations.
Consider given data $(x_i, y_i)$, $i = 1, 2, \dots, m$, and the model $y = f(x)$. Let $r_i = |y_i - f(x_i)|$. Then the sum of the absolute deviations is $\sum_{i=1}^{m} r_i$. Let us consider the case $m = 2$, so that we have only two absolute deviations $r_1$ and $r_2$, with sum $r_1 + r_2$. If we plot the points $(r_1, 0)$ and $(r_1 + r_2, 0)$ on the line, we observe that minimizing the sum $r_1 + r_2$ can be interpreted as minimizing the length of the line segment formed by adding together the numbers $r_i$.
To solve this optimization problem using calculus, the sum of the absolute deviations must be differentiable with respect to the parameters so that its critical numbers can be found. However, since the absolute value function is not differentiable at its corner, calculus techniques may not apply to the sum of the absolute deviations. Because of this drawback, the following criterion is considered.

Topic III. Least-Squares Criterion.
Currently, the most frequently used curve-fitting criterion is the least-squares criterion. Consider given data $(x_i, y_i)$, $i = 1, 2, \dots, m$, and the model $y = f(x)$. Let $r_i = |y_i - f(x_i)|$. Then the sum of the squares of the absolute deviations is $\sum_{i=1}^{m} r_i^2$. Let us consider the case $m = 3$. If we introduce a vector in three-dimensional space, $\vec{r} = \langle r_1, r_2, r_3 \rangle$, we observe that the sum of the squares of the absolute deviations is in fact

\[
\sum_{i=1}^{3} r_i^2 = \|\langle r_1, r_2, r_3 \rangle\|^2 = \|\vec{r}\|^2.
\]

That is, we may interpret the least-squares criterion as minimizing the magnitude of the vector whose coordinates represent the absolute deviations between the observed and predicted values.

Topic IV. Relating the Criteria.
In the previous topics, we discussed the geometric interpretations. Now let us compare the criteria analytically.
Suppose $m$ data points $(x_i, y_i)$, $i = 1, 2, \dots, m$, are given and the Chebyshev and least-squares criteria give the models $y = f_C(x)$ and $y = f_L(x)$, respectively. Let

\[
c_i = |y_i - f_C(x_i)|, \qquad c_{\max} = \max\{c_i : i = 1, 2, \dots, m\},
\]
\[
d_i = |y_i - f_L(x_i)|, \qquad d_{\max} = \max\{d_i : i = 1, 2, \dots, m\}.
\]

Then we observe

1. $c_{\max} \le d_{\max}$.
Proof. Because the parameters of the function $y = f_C(x)$ are determined so as to minimize the value of $c_{\max}$, it is the smallest largest absolute deviation obtainable; in particular, $c_{\max} \le d_{\max}$.

2. Letting
\[
D = \sqrt{\frac{\sum_{i=1}^{m} d_i^2}{m}},
\]
we have $D \le c_{\max} \le d_{\max}$.
Proof. Since $y = f_L(x)$ gives the smallest sum of squares obtainable, we have
\[
\sum_{i=1}^{m} d_i^2 = d_1^2 + d_2^2 + \cdots + d_m^2 \le c_1^2 + c_2^2 + \cdots + c_m^2 \le c_{\max}^2 + c_{\max}^2 + \cdots + c_{\max}^2 = m\,c_{\max}^2,
\]
\[
\frac{\sum_{i=1}^{m} d_i^2}{m} \le c_{\max}^2 \implies D = \sqrt{\frac{\sum_{i=1}^{m} d_i^2}{m}} \le c_{\max}.
\]
With observation 1, we have $D \le c_{\max} \le d_{\max}$.
Through an example, one can apply the criteria and compare the values $D$, $c_{\max}$, and $d_{\max}$, which will be studied in SECTION 3.4 CHOOSING A BEST MODEL.
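The chain $D \le c_{\max} \le d_{\max}$ can be checked on the data of Example 3.2.1. The sketch below is an illustration only: the Chebyshev line is found by a brute-force search over triples of points (an assumption made here in place of the Simplex Method), the least-squares line uses the standard closed-form solution of the normal equations, and all function names are hypothetical.

```python
from itertools import combinations
from math import sqrt

def chebyshev_line(pts):
    """Brute-force minimax line (assumes distinct x values)."""
    pts = sorted(pts)
    best = None
    for (x1, y1), (x2, y2), (x3, y3) in combinations(pts, 3):
        a = (y3 - y1) / (x3 - x1)
        b = (y1 + y2 - a * (x1 + x2)) / 2
        r = max(abs(y - a * x - b) for x, y in pts)
        if best is None or r < best[2]:
            best = (a, b, r)
    return best[0], best[1]

def least_squares_line(pts):
    """Closed-form least-squares line from the normal equations."""
    m = len(pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts)
    sxy = sum(x * y for x, y in pts)
    denom = m * sxx - sx ** 2
    return (m * sxy - sx * sy) / denom, (sxx * sy - sxy * sx) / denom

pts = list(zip([1.0, 2.3, 3.7, 4.2, 6.1, 7.0],
               [3.6, 3.0, 3.2, 5.1, 5.3, 6.8]))

aC, bC = chebyshev_line(pts)
aL, bL = least_squares_line(pts)
c = [abs(y - aC * x - bC) for x, y in pts]
d = [abs(y - aL * x - bL) for x, y in pts]

c_max, d_max = max(c), max(d)
D = sqrt(sum(t * t for t in d) / len(d))
print(D, c_max, d_max)   # D <= c_max <= d_max
```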



§3.3 Applying the Least-Squares Criterion.
In this section we apply the least-squares criterion to estimate the parameters for several types of curves. We discuss the topics analytically rather than graphically.

Topic I. Fitting a Straight Line.
Suppose a model of the form $y = ax + b$ is expected and $m$ data points $(x_i, y_i)$, $i = 1, 2, \dots, m$, are given. The least-squares criterion minimizes the sum of the squares of the absolute deviations, i.e., it minimizes $S$ defined by

\[
S = \sum_{i=1}^{m} (y_i - ax_i - b)^2. \tag{3.3.1}
\]

Considering $S$ as a function of the two independent variables $a$ and $b$, finding the minimum value of $S$ is the problem of minimizing $S(a, b)$ in Calculus. From Calculus, we recall

1. A point $(a_0, b_0)$ is called a critical point of $S(a, b)$ if
(i) $(a_0, b_0)$ is in the domain of $S(a, b)$, and
(ii) either $S_a(a_0, b_0) = 0$ and $S_b(a_0, b_0) = 0$, or at least one of $S_a(a_0, b_0)$ and $S_b(a_0, b_0)$ does not exist.
(Here $S_a$ means the partial derivative of $S(a, b)$ with respect to $a$.)
2. If $S(a, b)$ has a local extremum at $(a, b) = (a_0, b_0)$, then $(a, b) = (a_0, b_0)$ must be a critical point of $S(a, b)$. (However, the converse is not generally true.)
3. (SECOND DERIVATIVE TEST) Suppose that $S(a, b)$ has continuous second-order partial derivatives in some open disk containing the point $(a_0, b_0)$ and that $S_a(a_0, b_0) = 0 = S_b(a_0, b_0)$. For the discriminant $D(a, b)$ at the point $(a, b)$ defined by
\[
D(a, b) = S_{aa}(a, b)\,S_{bb}(a, b) - [S_{ab}(a, b)]^2,
\]
(i) if $D(a_0, b_0) > 0$ and $S_{aa}(a_0, b_0) > 0$, then $S$ has a local minimum at $(a_0, b_0)$,
(ii) if $D(a_0, b_0) > 0$ and $S_{aa}(a_0, b_0) < 0$, then $S$ has a local maximum at $(a_0, b_0)$,
(iii) if $D(a_0, b_0) < 0$, then $S$ has a saddle point at $(a_0, b_0)$,
(iv) if $D(a_0, b_0) = 0$, then no conclusion can be drawn.

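Let us find the local minimum of $S(a, b)$ defined in (3.3.1). To find the critical point of $S$, we compute

\[
\begin{aligned}
0 = \frac{\partial S}{\partial a} &= -2\sum_{i=1}^{m} (y_i - ax_i - b)\,x_i = -2\sum_{i=1}^{m} \left(x_i y_i - a x_i^2 - b x_i\right) = -2\left[\sum_{i=1}^{m} (x_i y_i) - a\sum_{i=1}^{m} x_i^2 - b\sum_{i=1}^{m} x_i\right],\\
0 = \frac{\partial S}{\partial b} &= -2\sum_{i=1}^{m} (y_i - ax_i - b) = -2\left[\sum_{i=1}^{m} y_i - a\sum_{i=1}^{m} x_i - b\sum_{i=1}^{m} 1\right].
\end{aligned}
\]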

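To simplify the computation, we introduce the vectors $\mathbf{x} = \langle x_1, x_2, \dots, x_m \rangle$, $\mathbf{y} = \langle y_1, y_2, \dots, y_m \rangle$, and $\mathbf{i} = \langle 1, 1, \dots, 1 \rangle$. Then we observe

\[
\begin{aligned}
\sum_{i=1}^{m} x_i^2 &= x_1^2 + x_2^2 + \cdots + x_m^2 = \|\mathbf{x}\|^2 = \mathbf{x} \cdot \mathbf{x},\\
\sum_{i=1}^{m} x_i &= x_1(1) + x_2(1) + \cdots + x_m(1) = \mathbf{x} \cdot \mathbf{i},\\
\sum_{i=1}^{m} y_i &= y_1(1) + y_2(1) + \cdots + y_m(1) = \mathbf{y} \cdot \mathbf{i},\\
\sum_{i=1}^{m} (x_i y_i) &= x_1 y_1 + x_2 y_2 + \cdots + x_m y_m = \mathbf{x} \cdot \mathbf{y},\\
m &= 1 + 1 + \cdots + 1 = \mathbf{i} \cdot \mathbf{i},
\end{aligned}
\]
where $\mathbf{x} \cdot \mathbf{y}$ is the dot product of the vectors $\mathbf{x}$ and $\mathbf{y}$. So the equations on $S_a$ and $S_b$ for the critical points become

\[
0 = \mathbf{x} \cdot \mathbf{y} - a\,\mathbf{x} \cdot \mathbf{x} - b\,\mathbf{x} \cdot \mathbf{i} \implies a\,\mathbf{x} \cdot \mathbf{x} + b\,\mathbf{x} \cdot \mathbf{i} = \mathbf{x} \cdot \mathbf{y}, \tag{3.3.2}
\]
\[
0 = \mathbf{y} \cdot \mathbf{i} - a\,\mathbf{x} \cdot \mathbf{i} - b\,\mathbf{i} \cdot \mathbf{i} \implies a\,\mathbf{x} \cdot \mathbf{i} + b\,\mathbf{i} \cdot \mathbf{i} = \mathbf{y} \cdot \mathbf{i}. \tag{3.3.3}
\]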

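The resulting two equations (3.3.2) and (3.3.3) are called the normal equations. Simply,

\[
\mathbf{x} \cdot (\mathbf{y} - a\mathbf{x} - b\mathbf{i}) = 0 \quad \text{and} \quad \mathbf{i} \cdot (\mathbf{y} - a\mathbf{x} - b\mathbf{i}) = 0.
\]
(Be careful! In general, $\mathbf{u} \cdot \mathbf{v} = 0$ implies neither $\mathbf{u} = \mathbf{0}$ nor $\mathbf{v} = \mathbf{0}$. So we should not conclude $\mathbf{y} - a\mathbf{x} - b\mathbf{i} = \mathbf{0}$.)
To find the critical points of $S$, we solve the normal equations (3.3.2) and (3.3.3) for $a$ and $b$.
(1) $(3.3.2) \times (\mathbf{i} \cdot \mathbf{i}) - (3.3.3) \times (\mathbf{x} \cdot \mathbf{i})$ implies

\[
a\left[(\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})(\mathbf{x} \cdot \mathbf{i})\right] = (\mathbf{x} \cdot \mathbf{y})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{y} \cdot \mathbf{i})(\mathbf{x} \cdot \mathbf{i}), \quad \text{i.e.,} \quad a = \frac{(\mathbf{x} \cdot \mathbf{y})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})(\mathbf{y} \cdot \mathbf{i})}{(\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})^2}.
\]

(2) $(3.3.3) \times (\mathbf{x} \cdot \mathbf{x}) - (3.3.2) \times (\mathbf{x} \cdot \mathbf{i})$ implies

\[
b\left[(\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})(\mathbf{x} \cdot \mathbf{i})\right] = (\mathbf{x} \cdot \mathbf{x})(\mathbf{y} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{y})(\mathbf{x} \cdot \mathbf{i}), \quad \text{i.e.,} \quad b = \frac{(\mathbf{x} \cdot \mathbf{x})(\mathbf{y} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{y})(\mathbf{x} \cdot \mathbf{i})}{(\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})^2}.
\]

Thus, $S(a, b)$ defined in (3.3.1) has the critical point

\[
\begin{aligned}
(a_0, b_0) &= \left(\frac{(\mathbf{x} \cdot \mathbf{y})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})(\mathbf{y} \cdot \mathbf{i})}{(\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})^2},\ \frac{(\mathbf{x} \cdot \mathbf{x})(\mathbf{y} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{y})(\mathbf{x} \cdot \mathbf{i})}{(\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})^2}\right)\\
&= \left(\frac{(\mathbf{x} \cdot \mathbf{y})\|\mathbf{i}\|^2 - (\mathbf{x} \cdot \mathbf{i})(\mathbf{y} \cdot \mathbf{i})}{\|\mathbf{x}\|^2\|\mathbf{i}\|^2 - (\mathbf{x} \cdot \mathbf{i})^2},\ \frac{(\mathbf{y} \cdot \mathbf{i})\|\mathbf{x}\|^2 - (\mathbf{x} \cdot \mathbf{y})(\mathbf{x} \cdot \mathbf{i})}{\|\mathbf{x}\|^2\|\mathbf{i}\|^2 - (\mathbf{x} \cdot \mathbf{i})^2}\right),
\end{aligned} \tag{3.3.4}
\]

which is clearly in the domain of $S(a, b)$, i.e., $\mathbb{R} \times \mathbb{R}$, under the assumption $(\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})^2 \neq 0$. In fact, the denominator $(\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})^2$ cannot be zero. See (3.3.5) below.
Now we use the SECOND DERIVATIVE TEST to classify the local extremum at the critical point $(a_0, b_0)$ just found.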

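\[
S_a(a, b) = -2\sum_{i=1}^{m} (y_i - ax_i - b)\,x_i = -2\sum_{i=1}^{m} \left(x_i y_i - a x_i^2 - b x_i\right), \qquad S_{aa}(a, b) = 2\sum_{i=1}^{m} x_i^2 = 2\,\mathbf{x} \cdot \mathbf{x},
\]
\[
S_b(a, b) = -2\sum_{i=1}^{m} (y_i - ax_i - b), \qquad S_{bb}(a, b) = 2\sum_{i=1}^{m} 1 = 2\,\mathbf{i} \cdot \mathbf{i}, \qquad S_{ab}(a, b) = 2\sum_{i=1}^{m} x_i = 2\,\mathbf{x} \cdot \mathbf{i},
\]
\[
D(a, b) = S_{aa}(a, b)\,S_{bb}(a, b) - S_{ab}^2(a, b) = 4(\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - 4(\mathbf{x} \cdot \mathbf{i})^2 = 4\left[(\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})^2\right].
\]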

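We observe that $D(a, b)/4 = (\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})^2$ is the denominator in the critical point (3.3.4); it is nonzero, as (3.3.5) below shows.
As we can see, $S_{aa}$, $S_{bb}$, and $D(a, b)$ are all constants, because they do not involve the variables $a$ and $b$. Moreover, the CAUCHY-SCHWARZ INEQUALITY says

\[
|\mathbf{u} \cdot \mathbf{v}| \le \|\mathbf{u}\|\,\|\mathbf{v}\|,
\]
which implies

\[
|\mathbf{u} \cdot \mathbf{v}|^2 \le \|\mathbf{u}\|^2\|\mathbf{v}\|^2, \qquad 0 \le \|\mathbf{u}\|^2\|\mathbf{v}\|^2 - |\mathbf{u} \cdot \mathbf{v}|^2, \qquad 0 \le (\mathbf{u} \cdot \mathbf{u})(\mathbf{v} \cdot \mathbf{v}) - (\mathbf{u} \cdot \mathbf{v})^2, \tag{3.3.5}
\]
where equality holds in two cases: Case 1. $\mathbf{u} = \mathbf{0}$ or $\mathbf{v} = \mathbf{0}$, and Case 2. $\mathbf{u}$ and $\mathbf{v}$ are parallel, i.e., $\mathbf{u} = s\mathbf{v}$ for some scalar $s$. (Personally, I do believe that it is one of THE MOST IMPORTANT inequalities in MATHEMATICS.)
Since $\mathbf{x}$ and $\mathbf{i}$ are not parallel (that is, the $x_i$ are not all equal), we deduce $D(a, b) = 4\left[(\mathbf{x} \cdot \mathbf{x})(\mathbf{i} \cdot \mathbf{i}) - (\mathbf{x} \cdot \mathbf{i})^2\right] > 0$ for all $(a, b)$, and also $S_{aa}(a, b) = 2\,\mathbf{x} \cdot \mathbf{x} = 2\|\mathbf{x}\|^2 > 0$ for all $(a, b)$. Therefore, by the SECOND DERIVATIVE TEST, the function $S$ has a local minimum at the critical point $(a_0, b_0)$ found in (3.3.4).
Before we find the local minimum value of $S$, let us rewrite the function $S$ using the vectors: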

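\[
\begin{aligned}
S(a, b) &= \sum_{i=1}^{m} [y_i - ax_i - b]^2 = \sum_{i=1}^{m} \left[y_i^2 + a^2 x_i^2 + b^2 - 2(a x_i y_i - ab x_i + b y_i)\right]\\
&= \sum_{i=1}^{m} y_i^2 + a^2\sum_{i=1}^{m} x_i^2 + b^2\sum_{i=1}^{m} 1 - 2\left(a\sum_{i=1}^{m} (x_i y_i) - ab\sum_{i=1}^{m} x_i + b\sum_{i=1}^{m} y_i\right)\\
&= \mathbf{y} \cdot \mathbf{y} + a^2\,\mathbf{x} \cdot \mathbf{x} + b^2\,\mathbf{i} \cdot \mathbf{i} - 2(a\,\mathbf{x} \cdot \mathbf{y} - ab\,\mathbf{x} \cdot \mathbf{i} + b\,\mathbf{y} \cdot \mathbf{i})\\
&= (a\mathbf{x} + b\mathbf{i}) \cdot (a\mathbf{x} + b\mathbf{i}) - 2(a\mathbf{x} + b\mathbf{i}) \cdot \mathbf{y} + \mathbf{y} \cdot \mathbf{y}\\
&= (a\mathbf{x} + b\mathbf{i} - \mathbf{y}) \cdot (a\mathbf{x} + b\mathbf{i} - \mathbf{y}) = \|\mathbf{y} - a\mathbf{x} - b\mathbf{i}\|^2,
\end{aligned}
\]

which is amazingly nice, because the function $S$ defined by the sum is rewritten as the squared norm of a vector having the same form as the summand in the definition (3.3.1).
Therefore, the local minimum value of $S$ is obtained by putting $(a, b) = (a_0, b_0)$ into the result:

\[
S(a_0, b_0) = \|\mathbf{y} - a_0\mathbf{x} - b_0\mathbf{i}\|^2. \tag{3.3.6}
\]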

Example 3.3.1. For the given data,

x    1    5     8
y    1    10    6

estimate the parameters of the fitting model $y = ax + b$ by using the Least-Squares criterion.

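ANSWER. We use the formulas deduced above. Let $\mathbf{x} = \langle 1, 5, 8 \rangle$, $\mathbf{y} = \langle 1, 10, 6 \rangle$, and $\mathbf{i} = \langle 1, 1, 1 \rangle$. Then the objective function $S$ is

\[
S(a, b) = \|\mathbf{y} - a\mathbf{x} - b\mathbf{i}\|^2,
\]
and by the result (3.3.4), it has the following critical point $(a_0, b_0)$:

\[
\begin{aligned}
(a_0, b_0) &= \left(\frac{(\mathbf{x} \cdot \mathbf{y})\|\mathbf{i}\|^2 - (\mathbf{x} \cdot \mathbf{i})(\mathbf{y} \cdot \mathbf{i})}{\|\mathbf{x}\|^2\|\mathbf{i}\|^2 - (\mathbf{x} \cdot \mathbf{i})^2},\ \frac{(\mathbf{y} \cdot \mathbf{i})\|\mathbf{x}\|^2 - (\mathbf{x} \cdot \mathbf{y})(\mathbf{x} \cdot \mathbf{i})}{\|\mathbf{x}\|^2\|\mathbf{i}\|^2 - (\mathbf{x} \cdot \mathbf{i})^2}\right)\\
&= \left(\frac{99(3) - 14(17)}{90(3) - 14^2},\ \frac{17(90) - 99(14)}{90(3) - 14^2}\right) = \left(\frac{59}{74},\ \frac{72}{37}\right) = (0.797297,\ 1.94595).
\end{aligned}
\]

Putting it into the formula (3.3.6) for the minimum value, we have

\[
S\left(\frac{59}{74}, \frac{72}{37}\right) = \left\|\mathbf{y} - \frac{59}{74}\mathbf{x} - \frac{72}{37}\mathbf{i}\right\|^2 = \left\|\left\langle 1 - \frac{59}{74} - \frac{72}{37},\ 10 - \frac{5(59)}{74} - \frac{72}{37},\ 6 - \frac{8(59)}{74} - \frac{72}{37}\right\rangle\right\|^2 = \frac{1849}{74} = 24.9865.
\]

Thus, we conclude that the model $y = 0.797297x + 1.94595$ gives the minimum value of the sum of the squares of the absolute deviations, and the minimum value is $24.9865$.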

Remark 3.3.2. 1. The minimum value $24.9865$ is obtained analytically by the formula (3.3.6). When we use the model $y = 0.797297x + 1.94595$ and the data, we can make the following table.

x                                  1           5          8
y                                  1           10         6
y_i - 0.797297 x_i - 1.94595       -1.74324    4.06757    -2.32432

The sum of the squares of all deviations becomes

\[
\sum_{i=1}^{3} (y_i - 0.797297x_i - 1.94595)^2 = (-1.74324)^2 + (4.06757)^2 + (-2.32432)^2 = 24.9865,
\]

which is exactly the same as the value obtained by the formula (3.3.6).
2. When we use Mathematica, we obtain the result shown in Figure 3.6, which agrees with the computation above.

Figure 3.6: Mathematica Results
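The dot-product formulas (3.3.4) and (3.3.6) translate directly into code. The sketch below is an independent check written in Python (it is not the Mathematica computation of Figure 3.6); the helper names `dot` and `least_squares_line` are hypothetical, and exact rational arithmetic is used so that the fractions of Example 3.3.1 appear exactly.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(u, v))

def least_squares_line(xs, ys):
    """Return (a0, b0, S(a0, b0)) using formulas (3.3.4) and (3.3.6)."""
    ones = [1] * len(xs)                       # the vector i = <1, ..., 1>
    denom = dot(xs, xs) * dot(ones, ones) - dot(xs, ones) ** 2
    a = (dot(xs, ys) * dot(ones, ones) - dot(xs, ones) * dot(ys, ones)) / denom
    b = (dot(ys, ones) * dot(xs, xs) - dot(xs, ys) * dot(xs, ones)) / denom
    residuals = [y - a * x - b for x, y in zip(xs, ys)]
    return a, b, dot(residuals, residuals)

xs = [Fraction(1), Fraction(5), Fraction(8)]
ys = [Fraction(1), Fraction(10), Fraction(6)]
a0, b0, s_min = least_squares_line(xs, ys)

print(a0, b0, s_min)                        # 59/74 72/37 1849/74
print(float(a0), float(b0), float(s_min))
```

Passing floats instead of `Fraction` objects gives the decimal values directly, at the cost of ordinary roundoff.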

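Topic II. Fitting a Power Curve.
Given $m$ data points $(x_i, y_i)$, $i = 1, 2, \dots, m$, suppose we fit the model $y = ax^n$ by the least-squares criterion, where $n$ is fixed and the parameter $a$ is to be determined. In this case, the sum $S$ of the squares of the absolute deviations becomes

\[
S = \sum_{i=1}^{m} (y_i - ax_i^n)^2. \tag{3.3.7}
\]

By an argument similar to the one in Topic I above, we introduce the vectors $\mathbf{x} = \langle x_1^n, x_2^n, \dots, x_m^n \rangle$ and $\mathbf{y} = \langle y_1, y_2, \dots, y_m \rangle$. Then $S$ becomes a function of the one variable $a$, and its critical point is found by

\[
0 = \frac{dS}{da} = -2\sum_{i=1}^{m} (y_i - ax_i^n)\,x_i^n = -2\left[\sum_{i=1}^{m} (x_i^n y_i) - a\sum_{i=1}^{m} x_i^{2n}\right] = -2\left[\mathbf{x} \cdot \mathbf{y} - a\|\mathbf{x}\|^2\right] \implies a = \frac{\mathbf{x} \cdot \mathbf{y}}{\|\mathbf{x}\|^2}.
\]

That is, $S$ has the critical point
\[
a_0 = \frac{\mathbf{x} \cdot \mathbf{y}}{\|\mathbf{x}\|^2}.
\]

Using the vectors, the function $S$ in (3.3.7) becomes

\[
\begin{aligned}
S = \sum_{i=1}^{m} (y_i - ax_i^n)^2 &= \sum_{i=1}^{m} \left(y_i^2 + a^2 x_i^{2n} - 2a x_i^n y_i\right)\\
&= \sum_{i=1}^{m} y_i^2 + a^2\sum_{i=1}^{m} x_i^{2n} - 2a\sum_{i=1}^{m} (x_i^n y_i) = \mathbf{y} \cdot \mathbf{y} + a^2\,\mathbf{x} \cdot \mathbf{x} - 2a\,\mathbf{x} \cdot \mathbf{y} = \|\mathbf{y} - a\mathbf{x}\|^2.
\end{aligned}
\]

Putting the critical point $a = a_0$ into the resulting function $S(a)$, we have the minimum value

\[
S(a_0) = \|\mathbf{y} - a_0\mathbf{x}\|^2.
\]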



Example 3.3.3 (LEAST-SQUARES WITH FIXED POWER $n = 2$). For the given data,

x    0.5    1.0    1.5    2.0     2.5
y    0.7    3.4    7.2    12.4    20.1

estimate the parameter $a$ of the fitting model $y = ax^2$ by using the Least-Squares criterion.

ANSWER. Let $\mathbf{x} = \langle 0.5^2, 1.0^2, 1.5^2, 2.0^2, 2.5^2 \rangle$ and $\mathbf{y} = \langle 0.7, 3.4, 7.2, 12.4, 20.1 \rangle$. Then the objective function $S$ is
\[
S(a) = \|\mathbf{y} - a\mathbf{x}\|^2,
\]
and it has the following critical point $a = a_0$:

\[
a_0 = \frac{\mathbf{x} \cdot \mathbf{y}}{\|\mathbf{x}\|^2} = 3.18693.
\]

Putting it into the formula, we have the minimum value

\[
S(3.18693) = \|\mathbf{y} - 3.18693\,\mathbf{x}\|^2 = 0.20954.
\]
Thus, we conclude that the model $y = 3.18693x^2$ gives the minimum value of the sum of the squares of the absolute deviations, and the minimum value is $0.20954$.

Remark 3.3.4. Check with Mathematica: See the figure 3.7.

Figure 3.7: Mathematica Results
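The one-parameter formula $a_0 = \mathbf{x} \cdot \mathbf{y}/\|\mathbf{x}\|^2$ is equally short in code. The following sketch (a Python check, not the Mathematica session of Figure 3.7; the function names are hypothetical) reproduces the values of Example 3.3.3.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(u, v))

def fit_power_fixed_n(xs, ys, n):
    """Least-squares fit of y = a * x**n with the power n held fixed."""
    xn = [x ** n for x in xs]                 # the vector <x_1^n, ..., x_m^n>
    a = dot(xn, ys) / dot(xn, xn)             # a0 = (x . y) / ||x||^2
    residuals = [y - a * t for t, y in zip(xn, ys)]
    return a, dot(residuals, residuals)       # (a0, S(a0))

xs = [0.5, 1.0, 1.5, 2.0, 2.5]
ys = [0.7, 3.4, 7.2, 12.4, 20.1]
a0, s_min = fit_power_fixed_n(xs, ys, 2)
print(round(a0, 5), round(s_min, 5))
```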

Topic III. Transformed Least-Squares Fit.
When the data can be approximated by a linear function $y = ax + b$ (Topic I) or by a power curve $y = ax^n$ with $n$ fixed (Topic II), it is not difficult to estimate the parameters. In this topic, we consider the fitting model $y = ax^n$, where both $a$ and $n$ are parameters to be determined.

If we apply the least-squares criterion to $y = ax^n$ directly with the $m$ data points, we have to differentiate $\sum_{i=1}^{m} (y_i - ax_i^n)^2$ with respect to $a$ and $n$ to get the critical point. However, it is not easy to find the critical point of such a function. (One may try to find the critical point!)
The strategy for estimating the parameters of the model $y = ax^n$ is to substitute new variables for $x$ and $y$, that is, to transform the data by $Y = \ln y$ and $X = \ln x$ (which requires positive data). Taking the natural logarithm of the data $(x, y)$, we have the transformed data $(X, Y) = (\ln x, \ln y)$, and the fitting model becomes

\[
\ln y = \ln(ax^n) = \ln a + \ln x^n = \ln a + n\ln x \implies Y = nX + A, \quad (A = \ln a) \tag{3.3.8}
\]
which is linear in $X$ and $Y$ and whose parameters $n$ and $A = \ln a$ can be estimated by the technique discussed in Topic I above. The reason we transform the data and the model by the natural logarithm lies in the properties of the logarithm: explicitly, it pulls down the power, so that the power model $y = ax^n$ becomes the linear one $Y = nX + \ln a$. Moreover, since the logarithm is a one-to-one correspondence, the transformed data determine the original data uniquely. Note, however, that the transformation distorts distances, so the deviations minimized for the transformed data are not the deviations in the original data; compare Remark 3.3.7.
Now let us estimate $a$ and $n$ in the power model $y = ax^n$. We apply the least-squares criterion developed in Topic I to the transformed model (3.3.8). Then the objective function $S$ on the transformed data $(X, Y) = (\ln x, \ln y)$ has the independent variables $n$ and $A$, i.e.,

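\[
S(n, A) = \sum_{i=1}^{m} (\ln y_i - n\ln x_i - \ln a)^2 = \sum_{i=1}^{m} (Y_i - nX_i - A)^2,
\]

where $X_i = \ln x_i$ and $Y_i = \ln y_i$. As we did in Topic I, let us introduce the vectors

\[
\mathbf{X} = \langle X_1, X_2, \dots, X_m \rangle = \langle \ln x_1, \ln x_2, \dots, \ln x_m \rangle, \quad
\mathbf{Y} = \langle Y_1, Y_2, \dots, Y_m \rangle = \langle \ln y_1, \ln y_2, \dots, \ln y_m \rangle, \quad
\mathbf{i} = \langle 1, 1, \dots, 1 \rangle.
\]

Then the objective function $S(n, A)$ becomes

\[
S(n, A) = \|\mathbf{Y} - n\mathbf{X} - A\mathbf{i}\|^2.
\]

(See the argument preceding the result (3.3.6).) It has the normal equations

\[
n\,\mathbf{X} \cdot \mathbf{X} + A\,\mathbf{X} \cdot \mathbf{i} = \mathbf{X} \cdot \mathbf{Y} \quad \text{and} \quad n\,\mathbf{X} \cdot \mathbf{i} + A\,\mathbf{i} \cdot \mathbf{i} = \mathbf{Y} \cdot \mathbf{i}.
\]

(See the equations (3.3.2) and (3.3.3).) Solving the equations for $n$ and $A$, we find the critical point $(n, A) = (n_0, A_0)$,

\[
(n_0, A_0) = \left(\frac{(\mathbf{X} \cdot \mathbf{Y})\|\mathbf{i}\|^2 - (\mathbf{X} \cdot \mathbf{i})(\mathbf{Y} \cdot \mathbf{i})}{\|\mathbf{X}\|^2\|\mathbf{i}\|^2 - (\mathbf{X} \cdot \mathbf{i})^2},\ \frac{(\mathbf{Y} \cdot \mathbf{i})\|\mathbf{X}\|^2 - (\mathbf{X} \cdot \mathbf{Y})(\mathbf{X} \cdot \mathbf{i})}{\|\mathbf{X}\|^2\|\mathbf{i}\|^2 - (\mathbf{X} \cdot \mathbf{i})^2}\right). \tag{3.3.9}
\]

(See the result (3.3.4).) Thus $S(n, A)$ has the minimum value

\[
S(n_0, A_0) = \|\mathbf{Y} - n_0\mathbf{X} - A_0\mathbf{i}\|^2. \tag{3.3.10}
\]

Since we now know the critical point $(n_0, A_0)$, the parameters of the power model $y = ax^n$ are obtained by

\[
n = n_0 \quad \text{and} \quad a = e^{A_0}, \quad \text{i.e.,} \quad y = e^{A_0}x^{n_0}.
\]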

Example 3.3.5 (TRANSFORMED LEAST-SQUARES WITH UNFIXED POWER $n$ (SAME DATA AS IN EXAMPLE 3.3.3)). For the given data,

x    0.5    1.0    1.5    2.0     2.5
y    0.7    3.4    7.2    12.4    20.1

estimate the parameters of the fitting model $y = ax^n$ by using the Transformed Least-Squares criterion. (Here we have two parameters $a$ and $n$ to estimate.)

ANSWER. Since the model is an exponential function, we transform the data and the model by the logarith-mic function.

x 0.5 1.0 1.5 2.0 2.5

y 0.7 3.4 7.2 12.4 20.1

X = lnx 0:693147 0 0.405465 0.693147 0.916291Y = lny 0:356675 1.22378 1.97408 2.5177 3.00072

Page 38 of 57


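Then the model becomes

\[ \ln y = n \ln x + \ln a \implies Y = nX + A, \]

by setting \(X = \ln x\), \(Y = \ln y\), and \(A = \ln a\). Using the results deduced above, the objective function of the transformed data becomes

\[ S(n, A) = \| \mathbf{Y} - n\mathbf{X} - A\mathbf{i} \|^2, \]

where \(\mathbf{X}\), \(\mathbf{Y}\), and \(\mathbf{i}\) are vectors as in the argument above. By the formulas (3.3.9) and (3.3.10), the function \(S\) has the critical point \((n_0, A_0)\) and the minimum value \(S(n_0, A_0)\):

\[ (n_0, A_0) = \left( \frac{(\mathbf{X}\cdot\mathbf{Y})\|\mathbf{i}\|^2 - (\mathbf{X}\cdot\mathbf{i})(\mathbf{Y}\cdot\mathbf{i})}{\|\mathbf{X}\|^2\|\mathbf{i}\|^2 - (\mathbf{X}\cdot\mathbf{i})^2},\ \frac{(\mathbf{Y}\cdot\mathbf{i})\|\mathbf{X}\|^2 - (\mathbf{X}\cdot\mathbf{Y})(\mathbf{X}\cdot\mathbf{i})}{\|\mathbf{X}\|^2\|\mathbf{i}\|^2 - (\mathbf{X}\cdot\mathbf{i})^2} \right) = (2.06281,\ 1.12661), \]

\[ S(n_0, A_0) = \| \mathbf{Y} - n_0\mathbf{X} - A_0\mathbf{i} \|^2 = \| \mathbf{Y} - 2.06281\,\mathbf{X} - 1.12661\,\mathbf{i} \|^2 = 0.014179. \]

Therefore, we deduce the fitting model \(y = e^{A_0}x^{n_0} = e^{1.12661}x^{2.06281} = 3.08519\,x^{2.06281}\) with the minimum value 0.014179 of the sum of the squares of the absolute deviations (measured in the transformed variables).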
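The computation in this example can be checked numerically. The following is a minimal sketch, assuming plain Python with only the standard `math` module; the function name `transformed_least_squares` is our own choice for illustration and does not come from the textbook. It evaluates the formula (3.3.9) with the dot products written as sums.

```python
import math

def transformed_least_squares(xs, ys):
    """Fit y = a * x**n by least squares on (ln x, ln y).

    Returns (n0, A0, S0), where A0 = ln a and S0 is the minimum of S(n, A).
    """
    X = [math.log(x) for x in xs]
    Y = [math.log(y) for y in ys]
    m = len(X)                               # ||i||^2 = m
    X_dot_i = sum(X)                         # X . i
    Y_dot_i = sum(Y)                         # Y . i
    X_dot_X = sum(u * u for u in X)          # ||X||^2
    X_dot_Y = sum(u * v for u, v in zip(X, Y))
    denom = X_dot_X * m - X_dot_i ** 2
    n0 = (X_dot_Y * m - X_dot_i * Y_dot_i) / denom
    A0 = (Y_dot_i * X_dot_X - X_dot_Y * X_dot_i) / denom
    S0 = sum((v - n0 * u - A0) ** 2 for u, v in zip(X, Y))
    return n0, A0, S0

xs = [0.5, 1.0, 1.5, 2.0, 2.5]
ys = [0.7, 3.4, 7.2, 12.4, 20.1]
n0, A0, S0 = transformed_least_squares(xs, ys)
a = math.exp(A0)
print(f"n0 = {n0:.5f}, A0 = {A0:.5f}, a = {a:.5f}, S = {S0:.6f}")
```

The printed values should agree with the ones quoted in the example up to rounding.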

Example 3.3.6 (TRANSFORMED LEAST-SQUARES WITH FIXED POWER n = 2 (SAME DATA AS IN EXAMPLE 3.3.3)). For the given data,

x    0.5    1.0    1.5    2.0     2.5
y    0.7    3.4    7.2    12.4    20.1

estimate the parameter of the fitting model \(y = ax^2\) by using the Transformed Least-Squares criterion. (Here the power \(n = 2\) is fixed, so we have only one parameter \(a\) to estimate.)

ANSWER. We follow exactly the same argument as in the previous example. The model \(y = ax^2\) becomes

\[ \ln y = 2\ln x + \ln a \implies Y = 2X + A. \]

The objective function of the transformed data becomes

\[ S(A) = \| \mathbf{Y} - 2\mathbf{X} - A\mathbf{i} \|^2. \]

Setting \(S'(A) = 0\), we see that the function \(S(A)\) has the critical point \(A = A_0\) and the minimum value \(S(A_0)\):

\[ A_0 = \frac{\mathbf{Y}\cdot\mathbf{i} - 2(\mathbf{X}\cdot\mathbf{i})}{\|\mathbf{i}\|^2} = 1.14322, \]

\[ S(A_0) = \| \mathbf{Y} - 2\mathbf{X} - A_0\mathbf{i} \|^2 = \| \mathbf{Y} - 2\mathbf{X} - 1.14322\,\mathbf{i} \|^2 = 0.0205521. \]

Therefore, we deduce the fitting model \(y = e^{A_0}x^2 = e^{1.14322}x^2 = 3.13684\,x^2\) with the minimum value 0.0205521 of the sum of the squares of the absolute deviations (measured in the transformed variables).
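With the power fixed, the critical point is simply the mean of the values \(Y_i - 2X_i\). A short sketch of this one-parameter case, assuming plain Python; the helper name `fit_fixed_power` is hypothetical and only for illustration:

```python
import math

def fit_fixed_power(xs, ys, n):
    """Transformed least squares for y = a * x**n with n fixed.

    Returns (A0, S0), where A0 = ln a minimizes S(A) = sum (Y_i - n X_i - A)^2.
    """
    X = [math.log(x) for x in xs]
    Y = [math.log(y) for y in ys]
    m = len(X)                                   # ||i||^2 = m
    A0 = (sum(Y) - n * sum(X)) / m               # (Y.i - n X.i) / ||i||^2
    S0 = sum((v - n * u - A0) ** 2 for u, v in zip(X, Y))
    return A0, S0

xs = [0.5, 1.0, 1.5, 2.0, 2.5]
ys = [0.7, 3.4, 7.2, 12.4, 20.1]
A0, S0 = fit_fixed_power(xs, ys, n=2)
a = math.exp(A0)
print(f"A0 = {A0:.5f}, a = {a:.5f}, S = {S0:.7f}")
```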

Remark 3.3.7. For the data given in Example 3.3.3, we have tested three models with different criteria:
1. \(y = ax^2\) with Least-Squares (Example 3.3.3): We have deduced \(y = 3.18693x^2\) with the minimum value 0.20954.
2. \(y = ax^n\) with Transformed Least-Squares (Example 3.3.5): We have deduced \(y = 3.08519x^{2.06281}\) with the minimum value 0.014179.
3. \(y = ax^2\) with Transformed Least-Squares (Example 3.3.6): We have deduced \(y = 3.13684x^2\) with the minimum value 0.0205521.
Let us make a table of the deviations in the original variables and compare them.

x                                      0.5           1.0        1.5          2.0          2.5
y                                      0.7           3.4        7.2          12.4         20.1
1. \(y_i - 3.18693x_i^2\)              -0.0967325    0.21307    0.0294075    -0.34772     0.181688
2. \(y_i - 3.08519x_i^{2.06281}\)      -0.0384383    0.31481    0.0792666    -0.489902    -0.324743
3. \(y_i - 3.13684x_i^2\)              -0.08421      0.26316    0.14211      -0.14736     0.49475

Model                                  \(\sum_{i=1}^{m}(y_i - y(x_i))^2\)    \(\max |y_i - y(x_i)|\)
1. \(y_i - 3.18693x_i^2\)              0.20954                               0.34772
2. \(y_i - 3.08519x_i^{2.06281}\)      0.452326                              0.489902
3. \(y_i - 3.13684x_i^2\)              0.363032                              0.49475

In the case of the given data, we observe that the first model \(y = 3.18693x^2\) gives the smallest maximum absolute deviation and also the smallest sum of the squares of the deviations. So we may say that the best of the three models is the first one. We could have guessed this result before going through all the computations in those three examples. Simply speaking, when we use the transformation, the deviations are distorted: the transformed least-squares criterion minimizes the deviations of \(\ln y\), not those of \(y\). Because of this, the transformed model may not be better than the one obtained from the original data. However, in a case such as the exponential model \(y = ax^n\) with unfixed power \(n\), we need to transform the data and the model anyway.
We end this section by quoting one paragraph from the textbook. The preceding examples illustrate two facts.

1. If an equation can be transformed to yield an equation of a straight line (Topic I) in the transformed variable, equations (3.3.2) and (3.3.3) can be used directly to solve for the slope and intercept of the transformed graph.
2. The least-squares best fit to the transformed equations does not coincide with the least-squares best fit of the original equations. The reason for this discrepancy is that the resulting optimization problems are different. In the case of the original problem, we are finding the curve that minimizes the sum of the squares of the deviations using the original data, whereas in the case of the transformed problem we are minimizing the sum of the squares of the deviations using the transformed variables.
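The second fact can be seen directly on the data of this section. The sketch below assumes plain Python. It also assumes that the direct least-squares fit of \(y = ax^2\) is given by the closed form \(a = \sum x_i^2 y_i / \sum x_i^4\), which follows from setting \(dS/da = 0\) and reproduces the coefficient quoted from Example 3.3.3. It then compares this fit with the transformed fit, measuring both in the original variables.

```python
import math

xs = [0.5, 1.0, 1.5, 2.0, 2.5]
ys = [0.7, 3.4, 7.2, 12.4, 20.1]

# Direct least squares for y = a*x**2: dS/da = 0 gives a = sum(x^2 y) / sum(x^4).
a_direct = sum(x ** 2 * y for x, y in zip(xs, ys)) / sum(x ** 4 for x in xs)

# Transformed least squares with n = 2: A0 is the mean of ln y - 2 ln x.
A0 = sum(math.log(y) - 2 * math.log(x) for x, y in zip(xs, ys)) / len(xs)
a_transformed = math.exp(A0)

def sum_of_squares(a):
    """Sum of squared deviations y_i - a*x_i**2 in the ORIGINAL variables."""
    return sum((y - a * x ** 2) ** 2 for x, y in zip(xs, ys))

print(f"direct:      a = {a_direct:.5f}, sum of squares = {sum_of_squares(a_direct):.5f}")
print(f"transformed: a = {a_transformed:.5f}, sum of squares = {sum_of_squares(a_transformed):.5f}")
```

The two coefficients differ because the two optimization problems differ; by construction, the direct fit cannot have a larger sum of squares in the original variables.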



§3.4 Choosing a Best Model.
We start with the example from the previous section. The data are given and approximated by the model \(y = ax^2\).

x    0.5    1.0    1.5    2.0     2.5
y    0.7    3.4    7.2    12.4    20.1

Table 3.1: Collected Data

Least-Squares Criterion: We have \(y = 3.1869x^2\).
Transformed Least-Squares Fit: We have \(y = 3.1368x^2\).
Chebyshev Criterion: We have \(y = 3.17073x^2\).
We can summarize the deviations by each method as follows:

\(x_i\)    \(y_i\)    \(y_i - 3.1869x_i^2\)    \(y_i - 3.1368x_i^2\)    \(y_i - 3.17073x_i^2\)
0.5        0.7        -0.0967                  -0.0842                  -0.0927
1.0        3.4        0.2131                   0.2632                   0.2293
1.5        7.2        0.029475                 0.1422                   0.0659
2.0        12.4       -0.3476                  -0.1472                  -0.2829
2.5        20.1       0.181875                 0.4950                   0.28293

Table 3.2: Summary of deviations for each model \(y = ax^2\)

Criterion                    Model                  \(\sum [y_i - y(x_i)]^2\)    \(\max |y_i - y(x_i)|\)
Least-Squares                \(y = 3.1869x^2\)      0.2095                       0.3476
Transformed Least-Squares    \(y = 3.1368x^2\)      0.3633                       0.4950
Chebyshev                    \(y = 3.17073x^2\)     0.2256                       0.28293

Table 3.3: Summary of results for the three models

Which model is best? The model obtained by the least-squares criterion is better than the one obtained by the Chebyshev criterion in the sense that the former gives a smaller sum of the squares of the deviations. However, if the purpose is to minimize the largest absolute deviation, the model obtained by the Chebyshev criterion is better. Thus, the best criterion depends on the case, judged by the purpose of the modeling and so on.
Even if the sum of the squares of the deviations is small, we need to be careful before we jump to a conclusion about the trend of the data, because we may be misled into a wrong prediction. See Figure 3.15 on page 121 in the textbook. There the model \(y = x\) has the same sum of squared deviations in each case. However, as we can see, the cases give significantly different predictions of the trend of the data. In order to avoid being misled in this way, we need to plot the model and the data together and compare them.
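The comparison of the three criteria can be reproduced with a few lines of code. This is a sketch assuming plain Python; the coefficients are the ones quoted above, and the names `models` and `summarize` are ours. It computes the sum of squared deviations and the largest absolute deviation for each model and reports which model wins under each measure.

```python
xs = [0.5, 1.0, 1.5, 2.0, 2.5]
ys = [0.7, 3.4, 7.2, 12.4, 20.1]

# Coefficient a of y = a*x**2 as reported for each criterion.
models = {
    "Least-Squares": 3.1869,
    "Transformed Least-Squares": 3.1368,
    "Chebyshev": 3.17073,
}

def summarize(a):
    """Return (sum of squared deviations, largest absolute deviation)."""
    devs = [y - a * x ** 2 for x, y in zip(xs, ys)]
    return sum(d * d for d in devs), max(abs(d) for d in devs)

summary = {name: summarize(a) for name, a in models.items()}
best_by_squares = min(summary, key=lambda name: summary[name][0])
best_by_max = min(summary, key=lambda name: summary[name][1])

for name, (squares, largest) in summary.items():
    print(f"{name:26s} {squares:.4f}  {largest:.5f}")
print("smallest sum of squares:", best_by_squares)
print("smallest largest deviation:", best_by_max)
```

Which model counts as best therefore depends on which of the two columns we decide to minimize.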

Example 3.4.1.
1. Find a model using the least-squares criterion either on the data or on the transformed data (as appropriate).
2. Compare your results with the graphical fits obtained in the problem set 3.1 by computing the deviations, maximum absolute deviations, and the sum of the squared deviations for each model.
3. Find a bound on \(c_{\max}\) if the model was fit using the least-squares criterion.

Problem 3 in problem set 3.1. In the following data, x is the diameter of a ponderosa pine in inches measured at breast height and y is a measure of volume: the number of board feet divided by 10.

(1) Test the model \(y = ax^b\) by plotting the transformed data.
(2) If the model seems reasonable, estimate the parameters a and b of the model graphically.

Figure 3.8: Data and model \(y = 0.00282062\,x^{3.11139}\) by transforming data

ANSWER TO QUESTIONS (1) AND (2) IN SECTION 3.1. Taking the natural logarithm of both sides of \(y = ax^b\), we have

\[ \ln y = b \ln x + \ln a, \]

whose graph in the plane of \(\ln y\) versus \(\ln x\) is a line with slope \(b\) and \((\ln y)\)-intercept \((0, \ln a)\). A simple computation gives the transformed data \((\ln x, \ln y)\). Plotting them, we observe that they lie approximately on a line. Using the two points \((\ln 17, \ln 19)\) and \((\ln 41, \ln 294)\), we have the slope \(b = 3.11139\) and the \((\ln y)\)-intercept \((0, -5.8708)\), i.e., \(a = e^{-5.8708} = 0.00282062\), and so we deduce

\[ y = 0.00282062\,x^{3.11139}. \]

See Figure 3.8.
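The two-point estimate above amounts to computing the slope and intercept of the line through the first and last transformed data points. A minimal sketch, assuming plain Python; the variable names are ours:

```python
import math

# First and last data points of the ponderosa pine data: (diameter, volume).
x1, y1 = 17, 19
x2, y2 = 41, 294

# Slope of the line through (ln x1, ln y1) and (ln x2, ln y2).
b = (math.log(y2) - math.log(y1)) / (math.log(x2) - math.log(x1))
# The (ln y)-intercept is ln a, so a = exp(intercept).
ln_a = math.log(y1) - b * math.log(x1)
a = math.exp(ln_a)

print(f"b = {b:.5f}, ln a = {ln_a:.4f}, a = {a:.8f}")
```

By construction this curve passes through the two chosen data points, which is why the deviations at \(x = 17\) and \(x = 41\) are essentially zero for this model.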

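ANSWER TO QUESTIONS 1, 2, AND 3 IN SECTION 3.4. For the model \(y = ax^b\), we transform the data and apply the transformed least-squares fit:

\[ \ln y = b \ln x + \ln a, \]

whose normal equations, with \(m = 15\), have the solution

\[ b = \frac{m\sum_{i=1}^{m} \ln x_i \ln y_i - \sum_{i=1}^{m} \ln x_i \sum_{i=1}^{m} \ln y_i}{m\sum_{i=1}^{m} (\ln x_i)^2 - \left(\sum_{i=1}^{m} \ln x_i\right)^2}, \]

\[ \ln a = \frac{\sum_{i=1}^{m} (\ln x_i)^2 \sum_{i=1}^{m} \ln y_i - \sum_{i=1}^{m} \ln x_i \sum_{i=1}^{m} \ln x_i \ln y_i}{m\sum_{i=1}^{m} (\ln x_i)^2 - \left(\sum_{i=1}^{m} \ln x_i\right)^2}. \]

By these equations, we deduce \(b = 3.09187\) and \(a = 0.00320603\), and so

\[ y = 0.00320603\,x^{3.09187}. \]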
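The two formulas above are easy to evaluate by machine. The following sketch assumes plain Python and uses the ponderosa pine data of Problem 3; the names `sx`, `sy`, `sxx`, and `sxy` are our own shorthand for the four sums.

```python
import math

# Ponderosa pine data: diameter x (inches) and volume measure y.
xs = [17, 19, 20, 22, 23, 25, 28, 31, 32, 33, 36, 37, 38, 39, 41]
ys = [19, 25, 32, 51, 57, 71, 113, 141, 123, 187, 192, 205, 252, 259, 294]

X = [math.log(x) for x in xs]
Y = [math.log(y) for y in ys]
m = len(X)

sx = sum(X)                               # sum of ln x_i
sy = sum(Y)                               # sum of ln y_i
sxx = sum(u * u for u in X)               # sum of (ln x_i)^2
sxy = sum(u * v for u, v in zip(X, Y))    # sum of ln x_i * ln y_i

denom = m * sxx - sx ** 2
b = (m * sxy - sx * sy) / denom
ln_a = (sxx * sy - sx * sxy) / denom
a = math.exp(ln_a)

print(f"b = {b:.5f}, a = {a:.8f}")
```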

x 17 19 20 22 23 25 28 31 32 33 36 37 38 39 41

y 19 25 32 51 57 71 113 141 123 187 192 205 252 259 294



Figure 3.9: Data and model \(y = 0.00320603\,x^{3.09187}\) by transformed least-squares fit

See Figure 3.9. As we can observe from Table 3.5, in this problem the transformed least-squares model gives a better approximation than the one obtained graphically from the transformed data (transformed linearity): both its sum of squared deviations and its maximum absolute deviation are smaller.



\(x_i\)    \(y_i\)    \(y_i - 0.0028x_i^{3.1114}\)    \(y_i - 0.0032x_i^{3.0919}\)
17         19         0.0001                          -1.4341
19         25         -1.8563                         -3.8209
20         32         0.4967                          -1.7741
22         51         8.6215                          5.6514
23         57         8.3356                          4.9701
25         71         7.9215                          3.6688
28         113        23.2535                         17.4145
31         141        17.8165                         10.0625
32         123        -12.9732                        -21.4427
33         187        37.3648                         28.1397
36         192        -4.1592                         -15.8991
37         205        -8.6150                         -21.2786
38         252        19.9041                         6.2729
39         259        7.3673                          -7.2763
41         294        0.0022                          -16.8033

Table 3.4: Summary of the deviations for each model

Criterion                    Model                                  \(\sum [y_i - y(x_i)]^2\)    \(\max |y_i - y(x_i)|\)
Transformed Linearity        \(y = 0.00282062\,x^{3.11139}\)        3174.81                      37.3648
Transformed Least-Squares    \(y = 0.00320603\,x^{3.09187}\)        2826.26                      28.1397

Table 3.5: Summary of results for the two models
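The entries of Tables 3.4 and 3.5 can be recomputed from the two fitted models. This is a sketch assuming plain Python; the parameter values are the rounded ones quoted in the text, so the output may differ from the tables in the last digits. The dictionary keys are our own labels.

```python
xs = [17, 19, 20, 22, 23, 25, 28, 31, 32, 33, 36, 37, 38, 39, 41]
ys = [19, 25, 32, 51, 57, 71, 113, 141, 123, 187, 192, 205, 252, 259, 294]

# (a, b) of y = a * x**b for each fitting method, as quoted in the text.
models = {
    "transformed linearity": (0.00282062, 3.11139),
    "transformed least-squares": (0.00320603, 3.09187),
}

results = {}
for name, (a, b) in models.items():
    devs = [y - a * x ** b for x, y in zip(xs, ys)]
    # Index of the data point with the largest absolute deviation.
    worst = max(range(len(xs)), key=lambda i: abs(devs[i]))
    results[name] = {
        "sum_sq": sum(d * d for d in devs),
        "max_abs": abs(devs[worst]),
        "at_x": xs[worst],
    }

for name, r in results.items():
    print(f"{name:26s} sum of squares = {r['sum_sq']:.2f}, "
          f"max deviation = {r['max_abs']:.4f} at x = {r['at_x']}")
```

For both models the largest deviation occurs at the same data point, \(x = 33\).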


Chapter 4

Chapter 7 Discrete Optimization Modeling

Section 7.4 Linear Programming III: Simplex Method.

PROBLEM: Use the simplex method to solve the following optimization problem.

Maximize      \(3x_1 + x_2\)
subject to    \(2x_1 + x_2 \le 6\)                (Original Pr)
              \(x_1 + 3x_2 \le 9\)
              \(x_1, x_2 \ge 0\).

ANSWER. We introduce variables \(y_1\) and \(y_2\) and convert the problem as follows:

Maximize      \(3x_1 + x_2\)
subject to    \(2x_1 + x_2 + y_1 = 6\)
              \(x_1 + 3x_2 + y_2 = 9\)
              \(x_1, x_2, y_1, y_2 \ge 0\),

which is called canonical slack maximization, and the variables \(y_1\) and \(y_2\) are called slack variables. Since \(x_1 \ge 0\) and \(x_2 \ge 0\) are required by the condition, the objective function should satisfy

\[ -3x_1 - x_2 \le 0, \]

and we introduce another slack variable \(z\) so that

\[ -3x_1 - x_2 + z = 0 \quad\text{and}\quad z \ge 0. \]

Hence, we can rewrite the original problem (Original Pr) as follows:

\[ 2x_1 + x_2 + y_1 = 6 \tag{4.0.1} \]
\[ x_1 + 3x_2 + y_2 = 9 \tag{4.0.2} \]
\[ -3x_1 - x_2 + z = 0 \tag{4.0.3} \]
\[ x_1, x_2, y_1, y_2, z \ge 0, \]

and we will find the largest value of \(z\) and the pair \((x_1, x_2)\) which gives the largest value of \(z\).
By collecting the coefficients, we record the problem into the so-called Tucker tableau.

x1    x2    y1    y2    z    RHS
2     1     1     0     0    6
1     3     0     1     0    9
-3    -1    0     0     1    0

Table 4.1: Original Tucker Tableau

In the objective function constraint (4.0.3), we compare the absolute values of the coefficients. Since the coefficient of x1 has the largest absolute value, 3, we choose x1 as the entering variable.

x1    x2    y1    y2    z    RHS
2     1     1     0     0    6
1     3     0     1     0    9
-3    -1    0     0     1    0

Table 4.2: Entering Variable x1

Compute the ratio of the RHS divided by the column labeled x1 to determine the minimum positive ratio.

x1    x2    y1    y2    z    RHS    Ratio
2     1     1     0     0    6      3 (= 6/2)
1     3     0     1     0    9      9 (= 9/1)
-3    -1    0     0     1    0

Table 4.3: Entering Variable x1 and Exiting Variable y1

Choose y1, corresponding to the minimum positive ratio 3, as the exiting variable.

Pivot: Divide the row containing the exiting variable (the first row in this case) by the coefficient of the entering variable in that row (the coefficient of x1 in this case), giving a coefficient of 1 for the entering variable in this row. Then eliminate the entering variable x1 from the remaining rows (which do not contain the exiting variable y1 and have a zero coefficient for it).

x1    x2     y1     y2    z    RHS
1     1/2    1/2    0     0    3
1     3      0      1     0    9
-3    -1     0      0     1    0

x1      x2        y1       y2    z    RHS
1       1/2       1/2      0     0    3
1-1     3-1/2     0-1/2    1     0    9-3
-3+3    -1+3/2    0+3/2    0     1    0+9

Simply,

x1    x2     y1      y2    z    RHS
1     1/2    1/2     0     0    3    (= x1)
0     5/2    -1/2    1     0    6    (= y2)
0     1/2    3/2     0     1    9    (= z)

Table 4.4: Tableau giving Extreme Point and Optimal Value

Since there are no negative coefficients in the bottom row, x1 = 3 and y2 = 6 (i.e., x2 = 0 by (4.0.2)) give the extreme point (x1, x2) = (3, 0), at which the optimal objective function value is z = 9.

Notes.
Simplex: The simplex algorithm was developed in the 1940s by George B. Dantzig. We will employ certain refinements in Dantzig's original technique developed in the 1960s by A. W. Tucker.
Slack: wanting in activity; lacking in completeness, finish, or perfection.
Tableau: picture, painting, representation, illustration, image.
Pivot: a shaft or pin on which something turns.
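The steps of this example (choose the entering column, apply the ratio test, pivot, repeat) can be sketched in code. The following is a minimal illustration, assuming plain Python with exact arithmetic from the standard `fractions` module; the function names `pivot` and `simplex` are ours, not the textbook's. It assumes a maximization problem already written in the tableau form above, picks the most negative bottom-row entry as the entering column (which agrees with the largest-absolute-value rule in these examples), and takes the smallest ratio among rows with a positive entry in that column.

```python
from fractions import Fraction

def pivot(tableau, row, col):
    """Scale the pivot row so the pivot entry is 1, then clear the column elsewhere."""
    p = tableau[row][col]
    tableau[row] = [v / p for v in tableau[row]]
    for r in range(len(tableau)):
        if r != row and tableau[r][col] != 0:
            factor = tableau[r][col]
            tableau[r] = [v - factor * w for v, w in zip(tableau[r], tableau[row])]

def simplex(rows):
    """Pivot until the bottom row has no negative coefficient; return the final tableau."""
    tableau = [[Fraction(v) for v in row] for row in rows]
    while True:
        bottom = tableau[-1][:-1]                       # bottom row without the RHS
        col = min(range(len(bottom)), key=lambda j: bottom[j])
        if bottom[col] >= 0:                            # no negative coefficient left
            return tableau
        # Ratio test over constraint rows with a positive entry in the entering column.
        ratios = [(row[-1] / row[col], r)
                  for r, row in enumerate(tableau[:-1]) if row[col] > 0]
        if not ratios:
            raise ValueError("the objective function is unbounded")
        _, exiting_row = min(ratios)
        pivot(tableau, exiting_row, col)

# Columns: x1, x2, y1, y2, z, RHS (Table 4.1).
final = simplex([
    [2, 1, 1, 0, 0, 6],
    [1, 3, 0, 1, 0, 9],
    [-3, -1, 0, 0, 1, 0],
])
for row in final:
    print([str(v) for v in row])
```

The sketch makes no attempt to handle cycling or to report which variables are basic; it is only meant to mirror the hand computation.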

Example 4.0.2. Solve the Carpenter's problem.

Maximize      \(25x_1 + 30x_2\)
subject to    \(20x_1 + 30x_2 \le 690\)
              \(5x_1 + 4x_2 \le 120\)
              \(x_1, x_2 \ge 0\).

ANSWER. By introducing slack variables \(y_1\), \(y_2\), and \(z\), we convert the problem as follows:

\[ 20x_1 + 30x_2 + y_1 = 690 \tag{4.0.4} \]
\[ 5x_1 + 4x_2 + y_2 = 120 \tag{4.0.5} \]
\[ -25x_1 - 30x_2 + z = 0 \tag{4.0.6} \]
\[ x_1, x_2, y_1, y_2, z \ge 0. \]

By collecting the coefficients, we record the problem into the Tucker tableau:

x1     x2     y1    y2    z    RHS
20     30     1     0     0    690
5      4      0     1     0    120
-25    -30    0     0     1    0

Table 4.5: Original Tucker Tableau

Since the coefficient of x2 in the bottom row has the largest absolute value, 30, we choose x2 as the entering variable and compute the ratios of the RHS to the column labeled x2.

x1     x2     y1    y2    z    RHS    Ratio
20     30     1     0     0    690    23 (= 690/30)
5      4      0     1     0    120    30 (= 120/4)
-25    -30    0     0     1    0

Choose y1, corresponding to the minimum positive ratio 23, as the exiting variable.

Pivot: Divide the row containing the exiting variable (the first row in this case) by the coefficient of the entering variable in that row (the coefficient of x2 in this case), giving a coefficient of 1 for the entering variable in this row. Then eliminate the entering variable x2 from the remaining rows (which do not contain the exiting variable y1 and have a zero coefficient for it).

x1     x2     y1      y2    z    RHS
2/3    1      1/30    0     0    690/30
5      4      0       1     0    120
-25    -30    0       0     1    0

x1             x2        y1         y2    z    RHS
2/3            1         1/30       0     0    690/30
5-8/3          4-4       0-4/30     1     0    120-4(690/30)
-25+30(2/3)    -30+30    0+30/30    0     1    0+30(690/30)

Simply,

x1     x2    y1       y2    z    RHS
2/3    1     1/30     0     0    23
7/3    0     -2/15    1     0    28
-5     0     1        0     1    690

Since we have a negative coefficient in the bottom row, we repeat the work by comparing the absolute values of the negative coefficients in the bottom row. The only negative coefficient is -5, in the column labeled x1, so x1 is the entering variable. The ratios are 23/(2/3) = 69/2 for the first row and 28/(7/3) = 12 for the second row, so we choose y2, corresponding to the minimum positive ratio 12, as the exiting variable and pivot on the second row.

x1     x2    y1                      y2     z    RHS
2/3    1     1/30                    0      0    23
1      0     -2/35 = (-2/15)(3/7)    3/7    0    12 = 28(3/7)
-5     0     1                       0      1    690

x1         x2    y1                   y2            z    RHS
2/3-2/3    1     1/30-(2/3)(-2/35)    0-(2/3)(3/7)  0    23-(2/3)12
1          0     -2/35                3/7           0    12
-5+5       0     1+5(-2/35)           0+5(3/7)      1    690+5(12)

Simply,

x1    x2    y1       y2      z    RHS
0     1     1/14     -2/7    0    15     (= x2)
1     0     -2/35    3/7     0    12     (= x1)
0     0     5/7      15/7    1    750    (= z)

Table 4.9: Tableau giving Extreme Point and Optimal Value

Since there are no negative coefficients in the bottom row, by choosing y1 = 0 = y2 we get x1 = 12 and x2 = 15, which gives the extreme point (x1, x2) = (12, 15) and the optimal objective function value z = 750.
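As an independent cross-check of the extreme point found above (not the simplex method itself), we can enumerate the corner points of the feasible region and evaluate the objective function at each. This sketch assumes plain Python with exact arithmetic; each constraint is stored as a hypothetical triple `(c1, c2, rhs)` meaning `c1*x1 + c2*x2 <= rhs`, with the nonnegativity conditions written in the same form.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint (c1, c2, rhs) means c1*x1 + c2*x2 <= rhs.
constraints = [
    (20, 30, 690),
    (5, 4, 120),
    (-1, 0, 0),    # x1 >= 0
    (0, -1, 0),    # x2 >= 0
]

def intersection(c, d):
    """Point where both constraints hold with equality (Cramer's rule), or None if parallel."""
    det = c[0] * d[1] - c[1] * d[0]
    if det == 0:
        return None
    x1 = Fraction(c[2] * d[1] - c[1] * d[2], det)
    x2 = Fraction(c[0] * d[2] - c[2] * d[0], det)
    return x1, x2

def feasible(p):
    return all(c1 * p[0] + c2 * p[1] <= rhs for c1, c2, rhs in constraints)

def objective(p):
    return 25 * p[0] + 30 * p[1]

# Corner points are the feasible intersections of pairs of constraint boundaries.
corners = set()
for c, d in combinations(constraints, 2):
    p = intersection(c, d)
    if p is not None and feasible(p):
        corners.add(p)

best = max(corners, key=objective)
best_value = objective(best)
print([(str(x1), str(x2)) for x1, x2 in sorted(corners)], str(best_value))
```

Enumeration of this kind is practical only for very small problems; the simplex method visits corner points selectively instead of listing all of them.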

Example 4.0.3. An electrical firm manufactures circuit boards in two configurations, say configuration #1 and configuration #2. Each circuit board in configuration #1 requires 1 A component, 2 B components, and 2 C components; each circuit board in configuration #2 requires 2 A components, 2 B components, and 1 C component. The firm has 20 A components, 30 B components, and 25 C components available. If the profit realized upon sale is $200 per circuit board in configuration #1 and $150 per circuit board in configuration #2, how many circuit boards of each configuration should the electrical firm manufacture so as to maximize profits?

ANSWER. Let \(x_1\) and \(x_2\) be, respectively, the numbers of circuit boards in configuration #1 and #2. Then the mathematical formulation of the problem is

Maximize      \(200x_1 + 150x_2\)
subject to    \(x_1 + 2x_2 \le 20\)
              \(2x_1 + 2x_2 \le 30\)
              \(2x_1 + x_2 \le 25\)
              \(x_1, x_2 \ge 0\).

Using the slack variables \(y_1\), \(y_2\), \(y_3\), and \(z\), we can rewrite the original problem as follows:

\[ x_1 + 2x_2 + y_1 = 20 \tag{4.0.7} \]
\[ 2x_1 + 2x_2 + y_2 = 30 \tag{4.0.8} \]
\[ 2x_1 + x_2 + y_3 = 25 \tag{4.0.9} \]
\[ -200x_1 - 150x_2 + z = 0 \tag{4.0.10} \]
\[ x_1, x_2, y_1, y_2, y_3, z \ge 0, \]

and we will find the largest value of \(z\) and the pair \((x_1, x_2)\) which gives the largest value of \(z\).
By collecting the coefficients, we record the problem into the Tucker tableau.

x1      x2      y1    y2    y3    z    RHS
1       2       1     0     0     0    20
2       2       0     1     0     0    30
2       1       0     0     1     0    25
-200    -150    0     0     0     1    0

Table 4.10: Original Tucker Tableau

Since the coefficient of x1 in the bottom row has the largest absolute value, 200, we choose x1 as the entering variable and compute the ratios of the RHS to the column labeled x1.

x1      x2      y1    y2    y3    z    RHS    Ratio
1       2       1     0     0     0    20     20 (= 20/1)
2       2       0     1     0     0    30     15 (= 30/2)
2       1       0     0     1     0    25     12.5 (= 25/2)
-200    -150    0     0     0     1    0

Choose y3, corresponding to the minimum positive ratio 25/2 = 12.5, as the exiting variable.

Pivot: Divide the row containing the exiting variable (the third row in this case) by the coefficient of the entering variable in that row (the coefficient of x1 in this case), giving a coefficient of 1 for the entering variable in this row. Then eliminate the entering variable x1 from the remaining rows (which do not contain the exiting variable y3 and have a zero coefficient for it).

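x1      x2      y1    y2    y3     z    RHS
1       2       1     0     0      0    20
2       2       0     1     0      0    30
1       1/2     0     0     1/2    0    25/2
-200    -150    0     0     0      1    0

x1          x2          y1    y2    y3       z    RHS
1-1         2-1/2       1     0     0-1/2    0    20-25/2
2-2         2-1         0     1     0-1      0    30-25
1           1/2         0     0     1/2      0    25/2
-200+200    -150+100    0     0     100      1    0+(25/2)200

Simply,

x1    x2     y1    y2    y3      z    RHS
0     3/2    1     0     -1/2    0    15/2
0     1      0     1     -1      0    5
1     1/2    0     0     1/2     0    25/2
0     -50    0     0     100     1    2500

Since the bottom row still contains the negative coefficient -50 (in the column labeled x2), this tableau is not yet optimal, and we repeat the work with x2 as the entering variable. The ratios are (15/2)/(3/2) = 5, 5/1 = 5, and (25/2)/(1/2) = 25. The minimum positive ratio 5 is attained in both the first and the second rows; we choose y2 (the second row) as the exiting variable. Since the coefficient of x2 in that row is already 1, we only need to eliminate x2 from the remaining rows:

x1    x2    y1    y2      y3    z    RHS
0     0     1     -3/2    1     0    0       (= y1)
0     1     0     1       -1    0    5       (= x2)
1     0     0     -1/2    1     0    10      (= x1)
0     0     0     50      50    1    2750    (= z)

Table 4.13: Tableau giving Extreme Point and Optimal Value

Since there are no negative coefficients in the bottom row, by choosing y2 = 0 = y3 we get x1 = 10 and x2 = 5 (with y1 = 0), which gives the extreme point (x1, x2) = (10, 5) and the optimal objective function value z = 2750. That is, the firm should manufacture 10 circuit boards in configuration #1 and 5 circuit boards in configuration #2, for a profit of $2750.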
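The pivot arithmetic for the circuit board problem can be checked with exact fractions. The sketch below assumes plain Python; the function name `pivot` and the variable names are ours. It pivots first on the x1 column (y3 exits) and then, because the bottom row of the resulting tableau still contains a negative coefficient in the x2 column, pivots once more on the x2 column. The ratio test for the second pivot is a tie between the first and second rows; the choice of the second row here is an assumption, and either choice leads to the same extreme point.

```python
from fractions import Fraction

def pivot(tableau, row, col):
    """Return a new tableau after pivoting on the entry at (row, col)."""
    p = tableau[row][col]
    new_row = [v / p for v in tableau[row]]
    result = []
    for r, current in enumerate(tableau):
        if r == row:
            result.append(new_row)
        else:
            factor = current[col]
            result.append([v - factor * w for v, w in zip(current, new_row)])
    return result

# Columns: x1, x2, y1, y2, y3, z, RHS (Table 4.10).
start = [[Fraction(v) for v in row] for row in [
    [1, 2, 1, 0, 0, 0, 20],
    [2, 2, 0, 1, 0, 0, 30],
    [2, 1, 0, 0, 1, 0, 25],
    [-200, -150, 0, 0, 0, 1, 0],
]]

first = pivot(start, row=2, col=0)     # x1 enters, y3 exits
second = pivot(first, row=1, col=1)    # x2 enters, y2 exits

for name, tableau in (("after first pivot", first), ("after second pivot", second)):
    print(name)
    for row in tableau:
        print("  ", [str(v) for v in row])
```

After the second pivot the bottom row has no negative coefficient, and the RHS column gives x1 = 10, x2 = 5, and z = 2750.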


Chapter 5

Chapter 8 Dimensional Analysis

§5.1 Introduction.
Read the textbook. Studied in class, but the lecture note has not been typed.

§5.2 Dimensions as Product.
Read the textbook. Studied in class, but the lecture note has not been typed.

Chapter 6

Chapter 10 Modeling with a Differential Equation

10.5 Numerical Approximation Method.
Read the textbook.
