Optimum Exponential Regression with One Nonlinear Term

Kuldeep Kumar

Department of Economics and Statistics National University of Singapore 10 Kent Ridge Crescent Singapore 0511

ABSTRACT

A method is suggested for estimating the regression parameters of the model $E(Y) = a + bX + c e^{-\alpha t}$, where observations on Y, X, and t are given. This model has been extensively used in economics for relating the optimum output of a production process to the available labor and services. The method is optimum in the sense that it minimizes the error sum of squares for the linear regression of Y on X and $e^{-\alpha t}$ over the range $\alpha > 0$. The limits as $\alpha \to 0$ and $\alpha \to \infty$ are obtained, and the computational procedure is discussed.

1. INTRODUCTION

Optimizing techniques are commonly applied to solve routine problems in many areas of science, business, industry, and government. Topics in optimization have become important areas of research in subjects such as economics, engineering, business studies, and operations research. Optimization methods play a key role in many statistical concepts, and many statistical techniques turn out to be ordinary applications of optimization.

The most elementary numerical optimization technique is direct search, which requires evaluating the function to be optimized over the range of the variables in order to locate the optimizing value. Other commonly used numerical optimization techniques include the Gauss-Newton and Newton-Raphson methods; gradient methods, including the methods of steepest ascent and descent; and iterative methods. Williams [4] and Bard [1] have described these techniques in detail. Often the optimizing techniques reduce to the solution of nonlinear equations, and methods of numerical analysis can be used for solving them. Fletcher [2] has surveyed these methods in his book on unconstrained optimization.

APPLIED MATHEMATICS AND COMPUTATION 50:51-58 (1992)

© Elsevier Science Publishing Co., Inc., 1992, 655 Avenue of the Americas, New York, NY 10010

0096-3003/92/$5.00

In this paper we suggest a new method for estimating the regression parameters of the model

$$E(Y) = a + bX + c e^{-\alpha t}, \qquad (1)$$

where observations on Y, X, and t are given. The preceding model has been extensively used in economics for relating the optimum output of a production process to the available labor and services; see, e.g., Goldfeld and Quandt [3]. The method is optimum in the sense that it maximizes the regression sum of squares (RSS), or equivalently minimizes the error sum of squares, for the linear regression of Y on X and $e^{-\alpha t}$ over the range $\alpha > 0$. Only one nonlinear parameter is involved in this model, and the method discussed in this paper is suitable for such models.

Section 2 outlines the procedure, and Section 3 derives the limits as $\alpha \to 0$ and $\alpha \to \infty$. Section 4 discusses the initial grid search, Section 5 gives an example demonstrating the computational procedure, and Section 6 draws the conclusions.

2. PROCEDURE

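We are interested in fitting the model

$$Y_i = a + bX_i + c e^{-\alpha t_i} + \varepsilon_i \qquad (2)$$

to a set of observations $(Y_i, X_i, t_i)$ for $i = 1, 2, \ldots, n$. The $\varepsilon_i$ are disturbances, and the parameters to be estimated are a, b, c, and the nonlinear parameter $\alpha$. Sometimes X is absent, and model (2) reduces to

$$Y_i = a + c e^{-\alpha t_i} + \varepsilon_i. \qquad (3)$$

In model (3), if we put $Z = e^{-\alpha t}$, then for a given $\alpha$ we can estimate c and a by least squares as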

$$\hat{c} = \frac{\sum (Y_i - \bar{Y})(Z_i - \bar{Z})}{\sum (Z_i - \bar{Z})^2} \qquad (4)$$

and

$$\hat{a} = \bar{Y} - \hat{c}\,\bar{Z}. \qquad (5)$$

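In other words, $\hat{c}$ can be expressed as the ratio of the estimated covariance between Y and Z to the estimated variance of Z. After obtaining the estimates of a and c in model (3), we have an estimated relationship between Y and Z. We can then obtain the predicted value of Y as

$$\hat{Y}_i = \hat{a} + \hat{c} Z_i. \qquad (6)$$

Hence we define the RSS as

$$\mathrm{RSS} = \sum (\hat{Y}_i - \bar{Y})^2 = S_{YZ}^2 / S_{ZZ}, \qquad (7)$$

where $S_{YZ} = \sum (Y_i - \bar{Y})(Z_i - \bar{Z})$ and $S_{ZZ} = \sum (Z_i - \bar{Z})^2$.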
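As a minimal sketch of (4), (5), and (7) for a fixed value of $\alpha$, the following Python function computes the two estimates and the RSS for model (3). The function name `fit_model3` and the use of plain lists are assumptions made for illustration only.

```python
import math

def fit_model3(y, t, alpha):
    """Least-squares fit of Y = a + c*exp(-alpha*t) for a fixed alpha.

    Returns (a_hat, c_hat, rss), following equations (4), (5), and (7).
    """
    n = len(y)
    z = [math.exp(-alpha * ti) for ti in t]      # Z_i = exp(-alpha * t_i)
    y_bar = sum(y) / n
    z_bar = sum(z) / n
    s_yz = sum((yi - y_bar) * (zi - z_bar) for yi, zi in zip(y, z))
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    c_hat = s_yz / s_zz                          # equation (4)
    a_hat = y_bar - c_hat * z_bar                # equation (5)
    rss = s_yz ** 2 / s_zz                       # equation (7)
    return a_hat, c_hat, rss
```

When the data follow model (3) exactly and the true $\alpha$ is supplied, the RSS returned equals the total sum of squares $\sum (Y_i - \bar{Y})^2$ up to rounding.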

A simplified description of the procedure adopted to fit model (2) is as follows:

(i) Define a grid of values of $\alpha$;

(ii) For each value of $\alpha$ on this grid, obtain the RSS by regressing Y on X and $e^{-\alpha t}$;

(iii) Starting with the three values of $\alpha$ surrounding the maximum of the RSS versus $\alpha$ curve, approximate the optimum $\alpha$ by successively fitting a quadratic and finding its maximum;

(iv) Follow the convergence algorithm given in Section 4.

The limits of the RSS as $\alpha \to 0$ and as $\alpha \to \infty$ are not the same, as shown in the next section, so one cannot simply drop the t term altogether at these limits.

3. PROPERTIES OF THE RSS

In this section we study the limiting behavior of the RSS as $\alpha \to 0$ and as $\alpha \to \infty$. Although the properties discussed here hold for the general model (2), for the sake of simplicity we consider model (3). The theoretical results thus obtained can easily be generalized to model (2).

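The RSS for model (3), as given in Section 2, is

$$\mathrm{RSS} = S_{YZ}^2 / S_{ZZ},$$

where $Z_i = e^{-\alpha t_i}$ and $\bar{Z} = \overline{e^{-\alpha t}} = n^{-1} \sum_j e^{-\alpha t_j}$, so

$$\mathrm{RSS} = \frac{\left\{ \sum (Y_i - \bar{Y}) \left( e^{-\alpha t_i} - \overline{e^{-\alpha t}} \right) \right\}^2}{\sum \left( e^{-\alpha t_i} - \overline{e^{-\alpha t}} \right)^2}. \qquad (8)$$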

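As $\alpha \to \infty$, $e^{-\alpha t_i} \to 0$ and $\overline{e^{-\alpha t}} \to 0$, so both the numerator and the denominator converge to zero. Suppose the observations $t_i$ ($i = 1, 2, \ldots, n$) are in ascending order. Then multiplying both the numerator and the denominator by $e^{2\alpha t_1}$ gives

$$\mathrm{RSS} = \frac{\left\{ \sum (Y_i - \bar{Y}) \left( e^{-\alpha (t_i - t_1)} - \overline{e^{-\alpha (t - t_1)}} \right) \right\}^2}{\sum \left( e^{-\alpha (t_i - t_1)} - \overline{e^{-\alpha (t - t_1)}} \right)^2}.$$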

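Now as $\alpha \to \infty$, $e^{-\alpha (t_i - t_1)} \to D(i, 1)$, where

$$D(i, 1) = \begin{cases} 1 & \text{if } i = 1, \\ 0 & \text{otherwise,} \end{cases}$$

and $\overline{e^{-\alpha (t - t_1)}} \to n^{-1}$. Hence

$$\mathrm{RSS} \to \frac{\left\{ \sum (Y_i - \bar{Y}) \left( D(i, 1) - n^{-1} \right) \right\}^2}{\sum \left( D(i, 1) - n^{-1} \right)^2}, \qquad (9)$$

which is the RSS for $Z = Z^{(\infty)}$, where $Z^{(\infty)} = (1, 0, 0, \ldots, 0)$. In other words, to obtain the limit of the RSS as $\alpha \to \infty$, we regress Y on $Z^{(\infty)}$. In general, for model (2), the limit of the RSS as $\alpha \to \infty$ is obtained by regressing Y on X and $Z^{(\infty)} = (1, 0, 0, \ldots, 0)$.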

The following results follow immediately:

(i) If some $t_i < 0$, the limits can be obtained in the same way provided $t_1 < \min(t_2, \ldots, t_n)$.

(ii) If the $t_i$ are not in ascending order, $Z^{(\infty)}$ is obtained by placing a 1 in the position where the minimum value of $t_i$ occurs and zeros elsewhere.

(iii) If the minimum of the $t_i$ occurs more than once, then $Z^{(\infty)}$ contains 1's at each of the minima.

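Now let us consider the limit as $\alpha \to 0$. As $\alpha \to 0$, $e^{-\alpha t_i} \to 1$ and $\overline{e^{-\alpha t}} \to 1$, so both the numerator and the denominator of the RSS [see (8)] tend to zero. Using L'Hôpital's rule we have

$$\lim_{\alpha \to 0} \frac{e^{-\alpha t_i} - \overline{e^{-\alpha t}}}{\alpha} = \lim_{\alpha \to 0} \frac{1}{n\alpha} \sum_j \left( e^{-\alpha t_i} - e^{-\alpha t_j} \right) = \frac{1}{n} \sum_j \left. \frac{\partial}{\partial \alpha} \left( e^{-\alpha t_i} - e^{-\alpha t_j} \right) \right|_{\alpha = 0} = \bar{t} - t_i.$$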

Dividing the numerator and the denominator in (8) by $\alpha^2$, we get

$$\mathrm{RSS} \to \frac{\left\{ \sum (Y_i - \bar{Y})(\bar{t} - t_i) \right\}^2}{\sum (\bar{t} - t_i)^2}. \qquad (10)$$

So to get the limit as $\alpha \to 0$, we simply regress Y on t. This result can be generalized to model (2): to get the limit as $\alpha \to 0$ in this case, regress Y on X and t.
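The two limits of this section can be checked numerically for model (3). The sketch below is illustrative only: the function names are assumptions, and the exponent is shifted by the minimum of the $t_i$, which rescales Z by a constant (leaving the RSS unchanged) and avoids floating-point underflow for large $\alpha$.

```python
import math

def rss_simple(y, z):
    """RSS = S_YZ^2 / S_ZZ for the regression of y on z with an intercept."""
    n = len(y)
    y_bar = sum(y) / n
    z_bar = sum(z) / n
    s_yz = sum((yi - y_bar) * (zi - z_bar) for yi, zi in zip(y, z))
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    return s_yz ** 2 / s_zz

def rss_exp(y, t, alpha):
    """RSS for Z_i = exp(-alpha * (t_i - min t)); same RSS as exp(-alpha * t_i)."""
    t_min = min(t)
    return rss_simple(y, [math.exp(-alpha * (ti - t_min)) for ti in t])

def rss_limit_infinity(y, t):
    """Limit (9): regress y on an indicator of the minimum of t."""
    t_min = min(t)
    return rss_simple(y, [1.0 if ti == t_min else 0.0 for ti in t])

def rss_limit_zero(y, t):
    """Limit (10): regress y on t itself."""
    return rss_simple(y, t)
```

For a large value of $\alpha$, `rss_exp` should agree closely with `rss_limit_infinity`, and for a small value of $\alpha$ it should approach `rss_limit_zero`. The indicator in `rss_limit_infinity` also covers results (ii) and (iii), since it places a 1 wherever the minimum of the $t_i$ occurs.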

4. INITIAL GRID SEARCH

It is necessary to do a grid search before embarking on an iterative method for converging on the optimum value of $\alpha$. To do this, model (2) is fitted for different values of $\alpha$ ($\alpha > 0$) and the RSS is plotted against $\alpha$; that is, for each $\alpha > 0$ the coefficients a, b, and c are fitted by least squares. If there is a clear single maximum, the value of $\alpha$ and the RSS close to the maximum can then be used to start an iterative convergence procedure. We have applied the preceding procedure to some real data sets, and in some instances we found two maxima. Usually one of these maxima was either at $\alpha \to 0$ or at $\alpha \to \infty$. Under these circumstances a second search can be carried out when the initial grid search suggests that this may be worthwhile.
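The grid search can be sketched in Python as follows. This is a minimal illustration, not the author's program: the function names are assumptions, the grid is assumed to be in ascending order, and the exponent is shifted by the minimum of the $t_i$ to avoid underflow for large $\alpha$ (this rescales the exponential regressor but does not change the RSS).

```python
import math

def rss_model2(y, x, t, alpha):
    """Regression sum of squares for Y on X and Z = exp(-alpha*(t - min t))."""
    n = len(y)
    t_min = min(t)
    z = [math.exp(-alpha * (ti - t_min)) for ti in t]
    y_bar, x_bar, z_bar = sum(y) / n, sum(x) / n, sum(z) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    s_xz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_zy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    # Solve the centered 2x2 normal equations for the slopes on X and Z.
    det = s_xx * s_zz - s_xz ** 2
    b_hat = (s_xy * s_zz - s_xz * s_zy) / det
    c_hat = (s_xx * s_zy - s_xz * s_xy) / det
    return b_hat * s_xy + c_hat * s_zy

def grid_bracket(y, x, t, grid):
    """Return the three grid values around the largest RSS, or None when the
    largest RSS sits at either end of the grid."""
    values = [rss_model2(y, x, t, alpha) for alpha in grid]
    k = max(range(len(grid)), key=lambda i: values[i])
    if k == 0 or k == len(grid) - 1:
        return None
    return grid[k - 1], grid[k], grid[k + 1]
```

A return value of `None` signals that the maximum lies at an end of the grid, in which case the limits of Section 3 would need to be compared with the grid values. Dividing the RSS by the total sum of squares $\sum (Y_i - \bar{Y})^2$ does not change the location of the maximum.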


The convergence algorithm can be outlined in the following steps:

(i) Choose three values of $\alpha$ from the initial grid so that the RSS is largest for the middle one.

(ii) Fit a quadratic

$$\mathrm{RSS} = a_0 \alpha^2 + b_0 \alpha + c_0$$

through these three points; its maximum is at $\alpha_4 = -b_0 / (2 a_0)$.

(iii) Include this new value of $\alpha$ ($= \alpha_4$), together with its RSS obtained by another regression, with the other three.

(iv) Select the next three values of $\alpha$ from these four and repeat Steps (ii) and (iii) until the desired convergence is achieved. At the start of each stage the middle value of $\alpha$ always gives the largest RSS, which determines which one of the four can be dropped.

This technique was tried on a large number of data sets and in most cases it was found to be successful. Sometimes, however, two of the values of $\alpha$ stayed very close to each other, resulting in slow convergence. For example, $\alpha_1$ might stay fixed while $\alpha_2$ and $\alpha_3$ become close together. This was overcome by forcing such large gaps to be bisected to obtain the new $\alpha$. In these cases the new value would be placed halfway between $\alpha_1$ and $\alpha_3$ (instead of being obtained by the quadratic fit). We then drop $\alpha_2$, and the new $\alpha$ and the values of the RSS are relabeled.
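The quadratic-fit iteration with a bisection safeguard might be sketched as below. Several details are assumptions made for illustration and are not specified in the text: the stopping rule (the new value lies within `tol` of the current middle value), the trigger for bisection (one gap exceeds `gap_ratio` times the other), and all function and parameter names. The argument `rss` stands for any function returning the RSS for a given $\alpha$.

```python
def quadratic_vertex(p1, p2, p3):
    """Stationary point of the quadratic through three (alpha, rss) points.

    Returns None when the points do not define a concave quadratic.
    """
    (x1, f1), (x2, f2), (x3, f3) = p1, p2, p3
    d1 = (f2 - f1) / (x2 - x1)              # first divided differences
    d2 = (f3 - f2) / (x3 - x2)
    a_q = (d2 - d1) / (x3 - x1)             # coefficient of alpha^2
    if a_q >= 0:
        return None
    b_q = d1 - a_q * (x1 + x2)              # coefficient of alpha
    return -b_q / (2 * a_q)

def maximize_rss(rss, lo, mid, hi, tol=1e-6, gap_ratio=10.0, max_iter=100):
    """Refine a bracket lo < mid < hi in which rss(mid) is the largest value.

    Returns (alpha, rss_value) for the best point found.
    """
    pts = [(a, rss(a)) for a in (lo, mid, hi)]
    for _ in range(max_iter):
        (xa, _), (xb, fb), (xc, _) = pts
        left, right = xb - xa, xc - xb
        new = None
        if max(left, right) <= gap_ratio * min(left, right):
            new = quadratic_vertex(*pts)    # Step (ii): quadratic fit
        if new is None:
            # Safeguard: bisect the larger gap instead of the quadratic fit.
            new = xb - left / 2 if left > right else xb + right / 2
        f_new = rss(new)                    # Step (iii): one more regression
        if abs(new - xb) < tol:
            return (new, f_new) if f_new > fb else (xb, fb)
        # Step (iv): keep the three neighbors around the best interior point.
        four = sorted(pts + [(new, f_new)])
        k = 1 if four[1][1] >= four[2][1] else 2
        pts = four[k - 1:k + 2]
    return pts[1]
```

Because the new point always falls strictly inside the current bracket, the middle point of the retained triple keeps the largest RSS, matching the condition stated in Step (iv).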

5. EXAMPLE

Suppose we are interested in fitting a model of the form

$$Y = a + bX + c e^{-\alpha t},$$

where the values of t, X, and Y are as follows:

 t     X       Y
 1    42   -18.2
 2     7   -29.3
 3    16    -6.2
 4    46    32.8
 5    88    80.0
 6    33    28.1
 7    55    52.0
 8    28    26.5
 9     9     8.2
10    22    21.3


In the initial grid search the values of $\alpha$ and the corresponding RSS were as follows:

α            RSS
0            0.9207
0.0222       0.9275
0.0444       0.9341
0.1111       0.9523
0.2222       0.9768
0.4444       0.9992
1.1111       0.9644
2.2222       0.9144
4.4444       0.8914
11.1111      0.8886
22.2222      0.8886
44.4444      0.8886
111.1111     0.8886
222.2222     0.8886
444.4444     0.8886
1111.1111    0.8886
∞            0.8886

The values corresponding to $\alpha = 0$ and $\alpha = \infty$ refer to the two limits described in Section 3.

The initial grid search gives a maximum of the RSS at $\alpha = 0.4444$. The three values selected were therefore $\alpha = 0.2222$, $\alpha = 0.4444$, and $\alpha = 1.1111$. To obtain convergence, Steps (i) to (iv) of Section 4 were repeated. The following convergence was achieved after three iterations:

α          RSS
0.4444     0.99921
0.51365    0.99995
0.50542    0.99998
0.50542    0.99998

Hence the value of $\alpha$ was chosen to be 0.50542. The estimated values of a, b, and c were as follows:

a = 0.10677, b = 0.99806, c = -60.388

The RSS for regressing Y on X alone is 0.6443, so the fit improves a great deal with the inclusion of the term $e^{-\alpha t}$.

6. CONCLUSIONS

This approach was found to give satisfactory results even with awkward data sets. In the curve of the RSS versus $\alpha$, two maxima occurred for some data sets, but none of the data sets we tried gave rise to more than two maxima. Also, in these instances one of the maxima was at one end (i.e., either at $\alpha \to 0$ or at $\alpha \to \infty$). It would be interesting to prove theoretically that these are the only possible cases. To improve efficiency it may be important to know more about the possible shapes of the RSS versus $\alpha$ curve.

REFERENCES

1 Y. Bard, Nonlinear Parameter Estimation, Academic Press, New York, 1974.

2 R. Fletcher, Practical Methods of Optimization, Vol. 1: Unconstrained Optimization, Wiley, New York, 1980.

3 S. M. Goldfeld and R. E. Quandt, Nonlinear Methods in Econometrics, North Holland, Amsterdam, 1972.

4 E. J. Williams, Regression Analysis, Wiley, New York, 1989.
