LaboratoryWe develop)and assess a number 4;.quivalent mathematical formulations of the general equi-librium problem in economics. V%.begin~with the traditional representation as a

'u-rip

S ystems

O ptimization

Laboratory

Ct,

0"

FORMULATION AND SOLUTION OF

IJohn C. STONE

TECHNICAL REPORT SOL 88-7

April 1988

uDTIC

JUL 1 21988

DISTRIBUTION STATEM A

Approved for public relea e;D istibution U nlim't edl'

.

Department of Operations Research ,Stanford University I h%%Stanford, CA 94305 oI

88 /12 067

SYSTEMS OPTIMIZATION LABORATORYDEPARTMENT OF OPERATIONS RESEARCH

STANFORD UNIVERSITYSTANFORD, CALIFORNIA 94305-4022

FORMULATION AND SOLUTION OFECONOMIC EQUILIBRIUM PROBLEMS

John C. STONE

TECHNICAL REPORT SOL 88-7

April 1988

DTICJUL 1 21988

Research and reproduction of this report were partially supported by National ScienceFoundation Grants DMS-8420623, SES-8518662, and ECS-8617905; U.S. Department ofEnergy Grant DE-FG03-87-ER25028; Office of Naval Research Contract N00014-85-K-0343;and the Center for Economic Policy Research at Stanford University.

Any opinions, findings, conclusions or recommendations expressed in this publication arethose of the author(s) and do NOT necessarily reflect the views of the above sponsors.

Reproduction in whole or in part is permitted for any purposes of the United StatesGovernment. This documco, -1' b'en pprovcd for public release and sale; its distributionis unlimited.

K. _KT. mZKVr.. IT T- VTRTT~k. -.. -.T_

ACKNOWLEDGEMENTS

It is more than just proper form to say that an effort of this magnitude could not have beencompleted without the help and support of friends, colleagues, and family. It is absolutelytrue. I would like for a number of people to know how much I appreciate their support. ge

Thank you, George Dantzig, for supporting me these many years, both in terms of employ-ment and in giving me enough rope to do things my own way - even when I wrapped thatrope around my own neck.

Thank you, George, Richard Cottle, and Curtis Eaves, for your commentary on the firstdraft and for helping me over some of the rough technical spots along the way.

Thank you, Alan Manne, for helping to get me (and others in this Department) involvedwith general equilibrium modeling in the first place. Thank you, Alan and Jackie Manne,for caring enough to start nagging me to get finished.

Thank you, Patrick McAllister, for taking the time to read through the first draft so carefullyand for the many discussions we have had about the subtleties of equilibrium problems.

Thank you, Michael Saunders; your patience knows no bounds. I honestly do not knowwhere I would be without your many hours of help concerning both my using (and abusing)MINOS for the computations and my specializing IATEX for preparing this document.

Thank you, Thomas Rutherford, Olvi Mangasarian, Philip Gill, and Margaret Wright, fornumerous fruitful discussions about this research.

Thank you, Gail Stein and Audrey Stevenin, for your friendly support and your help witha million little details along the way.

Thank you, Charles and Mary Bess, my parents, for always making sure that it was possiblefor me to spend my years at Harvard and here at Stanford. Thank you also, I think, fordoing whatever it was you did to make me the perfectionist I am today. Maybe now I cangive it a rest.

Finally, thank you, Lupita, John, Elizabeth, and Christine, my Pinter family, for puttingup with this cranky procrastinator these too many years that it has taken to finish thiseffort.

For

John Charles Stone F1Stanford, California 1 7 .

March 1988

General equilibrium starts at home. L tIA)1/

Lupita Arce Pinter I Av " 1 -it 11l / ( , C c, 9

TICA 1 ,-

(NSPrC~tD

iii 6

ABSTRACT

We develop)and assess a number 4;.quivalent mathematical formulations of the general equi-librium problem in economics. V%.begin~with the traditional representation as a nonlinearcoraplemetaxity prob'ei and dc.vciop alternative representations as nonlinear optimizationproblems. All of our formulations depart from previous approaches by including an explicitlinear equality for price normalization and a matched artificial variable which must be zeroat any equilibrium solution. This structure has the theoretical and computational advantagethat any linearization of the equilibrium problem has a feasible complementary solution.Moreover, under a reasonable rank condition, a basic complementary solution exists which

can be successfully computed by Lemke's almost-complementary pivoting method.

We describd five general-purpose methods which can be applied to solving the equilibrium

problei/ The common feature of these methods is solving a sequence of linearized prob-lems. We establislf~a number of equivalences between the methods, when applied to theequilibrium problem, and assess their local and global convergence properties in that con-text. An important new tool in this analysis is another problem formulation based on,a differentiable exact penalty function. This formulation provides, perhaps for the firsttime, a rigorous means of evaluating the progress of a sequential method for computing anequilibrium solution. Our analysis reveals a basic theoretical dilemma in solving generalequilibrium problems by these sequential methods. One group of methods produces iterates 4.

that converge from any starting point, but the sequence may converge to a non-equilibriumpoint.(Another group produces iteratesthat may fail to converge, but successfully converg-ing sequences do attain an equilibrium.) ( . .

We perform a number of computational experiments on two small problems from tlic 1itora-ture. The results show considerable variation in the solution times for the various methods,

but all methods succeed in locating an equilibrium, even from poor starting points. Thissuccessful performance (in addition to that reported by other researchers) suggests thatthe kinds of general equilibrium models formulated in practice possess certain favorablecomputational properties that theoretical analysis has yet to discover.

iv

CONTENTS

ACKNOWLEDGEMENTS iii

ABSTRACT iv

LIST OF FIGURES viii

1. INTRODUCTION 1

1.1 Scope of this research .............................. 1

1.2 Tools of the trade ................................. 4

1.3 Outline of presentation .............................. 5

1.4 A word on acronyms, notation, and internal references ..... ....... 6

1.4.1 Acronym s ... ... ...... ... ... ... ... ... . .. .. . 7

1.4.2 Notation . . . . .. . . . . . . . . . . . . . .. . . . . . . . . . . . . . 8

2. THE COMPUTABLE GENERAL EQUILIBRIUM PROBLEM (CGE) 9

2.1 Properties of demand . ............................... 9

2.2 Properties of production ............................. 11

2.3 Price normalization ................................ 11

2.4 General equilibrium conditions ......................... 12

2.5 Existence of equilibrium solutions ........................ 13

3. TWO EQUIVALENT FORMULATIONS OF CGE 15

3.1 The nonlinear complementarity problem .................... 15

3.2 CGE as a nonlinear complementarity problem (Eq-NLCP) ............. 16

3.3 CGE as a nonlinear program (Eq-NLP) .................... 17

3.3.1 Stationary points of Eq-NLP ...................... 18

4. SOLUTION METHODS FOR Eq-NLCP AND Eq-NLP 21

4.1 A sequence of linear complementarity problems (SLCP) .............. 21

4.1.1 Basic solutions, uniqueness, and regularity ............... 22

v

V M "An M . -- -

vi Contents

4.2 A sequence of quadratic programs (SQP) ....................... 25

4.3 SLCP implements SQP ....... ............................. 26

4.3.1 A more general equivalence result ......................... 27

4.4 A sequence of linear programs (SLP) ........................... 29

4.5 A projected (augmented) Lagrangian method ..................... 30

4.6 Other linearization methods ................................. 30

5. CONVERGENCE PROPERTIES OF SOLUTION METHODS 33

5.1 Local convergence ....... ................................ 33

5.1.1 SLCP and Wilson's SQP method ......................... 33

5.1.2 Other optimization methods ............................ 35

5.1.3 Other linearization methods ............................ 35

5.2 Global convergence ....................................... 36

5.2.1 A differentiable exact penalty function (Eq-DEPF) ............. 37

5.2.2 Stationary points of Eq-DEPF .......................... 39

5.2.3 Judging descent using Eq-DEPF ......................... 40

5.2.4 Solving Eq-DEPF directly .............................. 43

5.3 Convergence to what? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . 44

5.3.1 Descent from nonoptimal stationary points .................. 45

6. SOLVABILITY OF LINEARIZED SUBPROBLEMS 47

6.1 Relevant results for Lemke's method ...... ...................... 47

6.2 Application to a linearized subproblem .......................... 48

7. COMPUTATIONAL COMPARISONS 53

7.1 Two test problems ....... ................................ 53

7.2 Solution methods implemented ...... ......................... 54

7.2.1 Departures from standard SLCP method ................... 54

7.2.2 A symmetric linearization .............................. 55

7.2.3 Using MINOS to solve Eq-NLP and Eq-DEPF ................ 56

7.2.4 Specification of starting points .......................... 57

0 -

Contents vii

7.2.5 Convergence criteria ................................. 58

7.2.6 Hardware, software and timing measures .................... 58

7.3 Results for the Scarf problem ................................ 58

7.3.1 Solving Eq-DEPF with MINOS .......................... 59

7.3.2 Solving Eq-NLP with MINOS ........................... 59

7.3.3 Three sequential methods .............................. 60

7.4 Results for the Kehoe problem ...... ......................... 63

7.4.1 Solving Eq-DEPF with MINOS ......................... 63

7.4.2 Solving Eq-NLP with MINOS ........................... 63

7.4.3 Three sequential methods .............................. 64

7.5 Commentary ........ ................................... 65

8. SUMMARY AND PERSPECTIVE 69

8.1 Extension to nonlinear constant-returns production .................. 70

8.2 Future research ........ .................................. 72

BIBLIOGRAPHY 75

L

''

LIST OF FIGURES

1 SLP vs. SLCP - Scarf problem. .. .. .. .. ... ... ... ... ... ... 62

2 SLTZ vs. SLCP - Scarf problem .. .. .. .. ... ... ... ... ... ... 62

3 SLP vs. SLCP -Kehoe problem .. .. .. .. ... ... ... ... ... ... 66

4 SLTZ vs. SLCP - Kehoe problem. .. .. .. ... ... ... ... ... ... 66

viii

1INTRODUCTION

An important and powerful paradigm in economic theory is the notion of general equilib-rium, that is, a set of prices (and corresponding quantities) which balance the supply anddemand for all commodities. The term general equilibrium is used equally to refer to themarket-clearing prices and quantities, themselves, as well as to the "invisible hand" mech-anism by which markets are cleared. The concept of equilibrium is an old one, but it isalso one for which current interest is both diverse and intense. The advent of the com-puter has given rise to a class of so-called computable general equilibrium models, whichmay be distinguished by the explicit intent to mathematically represent the equilibriumsystem, compute a solution of the system, and derive policy or other conclusions by com-paring solutions across different model specifications. As well exemplified by the collectionof papers in a recent Mathematical Programming Study [Mann 85], ever more sophisticatedapplications of computable general equilibrium models both stimulate and take advantageof improvements in computational procedures.

Against this backdrop, the present research is concerned with various aspects of the formu-I ation and solution of general equilibrium problems. The key idea will be to define, compare,and exploit the properties of a number of different but equivalent mathematical formula-tions of the problem. While most of the results presented are theoretical in nature, ourunderlying interests and motivations are very much computational. These interests, cou-pled with some basic pragmatism, serve to establish some limits on the kinds of problemsto be considered.

1.1 Scope of this research

First and foremost, our attention is limited to general equilibrium problems which requiresimultaneous determination of all prices (and quantities) in the economy. In contrast,partial equilibrium problems accept certain prices as given exogenously. This importantdifference has implications for the mathematical properties of the problems which, on thewhole, make partial equilibrium models considerably easier to solve than general equilib-rium models of comparable size and functional complexity. Partial equilibrium problemstypically satisfy some contraction or monotonicity property which guarantees the conver-gence of an appropriate iterative method, examples of which are legion. Many successfulapproaches are specializations of the general iterative procedures for variational inequalitiessurveyed by Pang and Chan [PC 82]. An historically important example is the so-calledPIES method [AH 82]. Also in this group are a variety of methods originally applied tonetwork/traffic equilibria and more recently extended, in the work of Dafermos and Nagur-ney, to spatial equilibria and oligopolistic markets; see, for instance, [DN 84] and [Nag 87].Viable approaches for specific large-scale models of energy markets have also been based onNewton-like methods [Phi 85] and on Gauss-Seidel iterations [Mur 83].

15

2 1.1 Scope of this research

In sharp contrast, general equilibrium problem-, can almost never be shown to satisfy knownsufficient conditions for the convergence of established iterative methods. (An importantexception is noted below.) Consequently, procedures which perform efficiently on partialequilibrium models need not and typically do not perform especially well in a general equi-librium context. Conversely, while most of the findings and methods addressed in thisresearch are also applicable to the partial equilibrium case, we would not expect the meth-ods to be competitive with procedures that take explicit advantage of partial equilibriumproperties.

In a general equilibrium context, there are three distinguishing features of a problem whichdetermine not only how the real-world economy is stylized but also how difficult the problemproves to be in terms of computation:

1. departures from the pure competition paradigm2. the representation of final consumer demand3. the representation of production.

Simplifying assumptions in each of these areas define the class of general equilibrium prob-lems to be considered in this research.Departures from a purely competitive equilibrium market structure can take many forms.

Taxes and subsidies on consumption or production activities effectively cause consumers and

producers to see different prices for the same commodity. Price rigidities such as a mini-mum wage or regulated energy prices restrict the domain over which equilibrium prices canbe sought, possibly preventing market-clearing altogether. Balance of payments constraintscan also be included (somewhat loosely) under this heading. The work of Shoven and Whal-ley (e.g., [SW 73]) has spawned an entire generation of applied general equilibrium modelsspecifically addressing taxes and other market interventions and imperfections. While suchdepart.ures from pure competition are certainly characteristic of real-world economies, theirrepresentation in a model can present serios problems of bth a theoretical and compu-tational nature. Since the kind of computational analysis presented here has proved to bedifficult enough even in the pure competition case, similar consideration of non-competitivestructures must be left a subject for future research.

With respect to the representation of final consumer demand, we restrict our attention totwice continuously differentiable aggregate excess demand functions. As a theoretical mat-ter, this assumes away situations in which a utility-maximizing choice occurs at a boundaryof the space of consumption quantities (most typically, at a point where the consumptionof some commodity is zero). The work of Mas-Colll [MasC 85] demonstrates that such sit-uations are "rare" in a rigorous sense that need not be defined here. As a practical matter,virtually all demand functions which are actually used in applied equilibrium modeling sat-isfy this condition. What is ruled out are demand correspondences and piecewise-specified

functions such av might he derived from corresponding linear or piecewise-specified utilityfunctions for individual consumers. Two of the few examples of applications with nondiffer-entiable demand may be found in [DEG 79] and [MCW 80], wherein a number of conskiierclasses are each modeled explicitly by linear programs.

-w -' ~ ~ -

1.1 Scope of this research 3

We also exclude from consideration the class of integrable demand functions, though in thiscase the reason is that models with such functions are relatively easy to solve. An aggregatedemand function is said to be integrable if it can be derived from the optimality conditionsfor maximizing a corresponding utility function. In this case, it is well known that a generalequilibrium solution can be computed (readily) by maximizing the utility function over theset of consumption levels which can be feasibly produced. As a theoretical matter, theconditions (on individual demands) required for the existence of an integrable aggregatedemand function are quite restrictive. As a practical matter, integrable demand systemsare often employed anyway, precisely because of the associated computational advantages.An example from our own work is the PILOT energy-economic model [SMD 871, in whichthe dynamic representation of production activity is so large that computation even withintegrable demand functions is an expensive process. Indeed, it was precisely the desireto understand the computational consequences of abandoning the integrability assumptionthat initiated this research.

With respect to the representation of production, our focus is on models with activityanalysis (or linear) production. This is a restrictive assumption, to be sure, althoughin principle any convex production structure can be approximated by a sufficiently largeactivity-analysis representation. Linear production figures prominently in early works onequilibrium and, in particular, in the pioneering computable formulations of Scarf [Sca 73].It is also a principal feature of PILOT.

Happily, a model structure with (twice) continuously differentiable demand and linear pro-duction will also permit the incorporation of nonlinear production structures represented by

(three times) continuously differentiable production (or profit) functions that reflect strictlydecreasing returns to scale. This is accomplished by defining a composite demand functionas the difference between consurrcr demand and the (nonlinear) net output from produc-tion, which in this case is uniquely determined as a function of prices. (The consumerdemand component must be carefully defined so as to include the distribution of the pos-itive net profits of production.) The composite demand function so defined possesses thesame mathematical properties as the consumer demand functions employed throughout thisresearch. The only restriction is the differentiability assumption, which in fact is satisfiedby most of the nonlinear production or profit functions that are actually used in applied

* equilibrium modeling.

The production structures that are not immediately covered by our computational analysisare either nonconvex or involve nonlinear technologies with constant returns to scale. Thelatter omission is particularly unfortunate, given the theoretical importance and widespreaduse of such structures in equilibrium models. In the final chapter, we outline some veryrecent observations which suggest that most of the results obtained in this research canindeed be extended to the case of nonlinear production with constant returns.

After consideration of the above limitations of focus, we are left with an important classof competitive general equilibrium problems with continuously differentiable, nonintegrabledemand and linear (or decreasing returns) production. Our theoretical analysis of soluticaalgorithms for such problems will not explicitly address the issue of model size; nonetheless,our principal interest is with procedures which could be applied to medium- and large-scale

- AM - .-

4 1.2 Ibols of the trade

problems containing hundreds or thousands of commodities and activities. For this reasonwe will not address computational procedures based on homotopy or fixed-point methods.

Such methods, as originated by Scarf [Sca 73] and more recently implemented by Blroadie[Bro 85], are both general and robust, but their application to large-scale models is notcurrently feasible.

We also will not specifically analyze three related families of methods designed for the case inwhich nonintegrability of the final demand functions arises directly from aggregating knownintegrable demand functions for a manageable number of consumer classes. Indeed, the basicspecification of consumer behavior is through the underlying direct utility functions. Themethods of I)antzig, Eaves, and Gale [DEG 79] and of Ginsburgh and Waelbroeck [GW 81]seek a competitive general equilibrium by solving a sequence of optimization problems inwhich the objective function is a weighted sum of individual utilities. The method of Manne,Chao, and Wilson [MCW 80] solves utility-maximization problems for individual consumersso as to generate consumption bundles for use in a column generation procedure that seeksa combination of bundles consistent with aggregate supply-demand balances and the pricesdual to those balances. While it may prove possible to analyze the convergence of these

W,ethods using the machinery of Chapter 5, such an analysis has not been performed aspart of this research.

1.2 Tools of the trade

In order to fully characterize the relevant mathematical characteristics of a general equi-librium problem, Chapter 2 will provide a brief but reasonably self-contained discussion ofthe problem components and equilibrium conditions. Much of this material is likely to befamiliar to the reader, but the repetition seems necessary in order to define the relevantcontext for the computational analysis that follows.

In contrast, it would be unrealistic to attempt to provide here a review of the many basicaspects of mathematical programming and complementarity theory that will be used exten-sively throughout the later chapters. We must in fact presume that the reader is more than

familiar with these matters. Concepts to be employed include the standard formulationsof mathematical programs (linear, quadratic, and general nonlinear) and of linear and non-linear complementarity problems. 'We also employ the first- and second-order optimnalityconditions (both necessary and sufficient) and the associated fundamental concept of the

Lagrangian function and Lagrange multipliers. Notions of basic solutions, uniqueness, andregularity prove to be important in several respects. In light of the breadth of the theoriesemployed, we are also in no position to provide well-balanced citations for the many path-breaking contributions to this body of theory by Dantzig, Kuhn and Tucker, Cottle, andLemke. A rigorous treatment of the fundamental concepts of mathematical programming,with appropriate references to the original contributions, can be found in th text of Avriel

[Avr 76], for instance. A brief review of the basic concepts and origins of complementaritytheory can be found in [Cot 76].

We also will be concerned with a variety of algorithms for solving mathematical programsand comnplementarity problems. The features of some of these algorithims will be reviewed as

1.3 Outline of presentation 5

they have direct bearing on our results. Reference will be made to Newton and quasi-Newtonmethods without elaboration. Penalty function concepts will be used freely. Aspects ofconvergence theory will be relevant, but there will be no detailed convergence proofs. Acomprehensive survey of the state of the art in optimization algorithms (as of about 1980)is available in the text of Gill, Murray and Wright [GMW 811. A brief review of more

recent developments will soon appear in a forthcoming handbook of operations research[GMSW 87].

1.3 Outline of presentation

The balance of this document is organized as follows. First, we define some acronyms andnotation that will be used recurrently in the later chapters. In Chapter 2 we build up thecomputable general equilibrium problem from its component parts, which entails describingthe relevant mathematical properties of the demand and production structures, discussingthe convention for normalizing prices, and delineating the mathematical conditions thatdefine an equilibrium solution. In Chapter 3 we develop two alternative computationalformulations of the general equilibrium problem. The first follows closely the statement ofthe problem in Chapter 2 and comprises a specially structured nonlinear complementarityproblem. The second is a new development comprising a nonlinear program constructedto be equivalent to the complementarity problem. Both formulations depart from moretraditional formulations by including both an explicit equality for price normalization anda matching artificial variable in the supply/demand balances.

In Chapter 4 we define five classes of methods for solving these equivalent formulations.The common feature of these methods is solving a sequence of subproblems defined by lin-earizing the aggregate demand functions. The methods may be distinguished by the typeof linearization employed and the kind of solution sought relative to the linearized con-straints. All of these methods are applicable to more general problems, but we demonstratesome interesting equivalences which arise when the methods are specifically applied to theequilibrium problem.

The local and global convergence properties of these methods are the subject of Chapter 5.We briefly survey known results and then examine the applicability of those results in thecontext of a general equilibrium problem. We develop an important new tool in under-standing global convergence properties in the form of another equivalent formulation of theproblem, which replaces the nonlinear constraints of the nonlinear programming formula- k.tion with a differentiable exact penalty function. This formulation provides, perhaps for the N.

first time, a rigorous means of evaluating the progress of a sequential method for computingan equilibrium.

In Chapter 6 we demonstrate the value and power of our formulation of the problem withboth the price normalization and the matching artificial variable. In particular, we provethat the normalization can readily be specified in such a way as to guarantee that anylinearization of the nonlinear equilibrium problem has a feasible complementary solution.Moreover, under a reasonable rank condition (discussed in Chapter 4), a basic complemen-tary solution exists which can be successfully computed by Lemke's almost-complementary

6 1.4 A word on acronyms, notation, and internal references

pivoting method.

In Chapter 7 we report the results of a number of computational experiments on twosmall problems from the literature. One has a unique equilibrium, while the other hasthree distinct equilibria. The results show considerable variation in the solution times forthe various methods, but all methods succeed in locating an equilibrium, even from poorstarting points.

Finally, in Chapter 8 we attempt to provide some perspective on the collection of resultsobtained and suggest some possible directions for future research. Overall, our analysisreveals a basic theoretical dilemma in solving general equilibrium problems by the sequen-tial methods studied. One group of methods produces iterates that converge from anystarting point, but the sequence may converge to a non-equilibrium point. Another group

produces iterates that may fail to converge, but successfully converging sequences do attainan equilibrium. Nonetheless, the successful performance of these methods on actual models(both in our experiments and those of others) suggests that the kinds of general equilib-rium models formulated in practice possess certain favorable computational properties thattheoretical analysis has yet to discover.

1.4 A word on acronyms, notation, and internal references

This section provides a convenient listing of important acronyms and notation to be used inthe following chapters. Both here and throughout, references to other parts of this documentwill denote chapter, section, and subsection. For instance, the reference "Section 3.2.1"denotes Subsection 1 of Section 2 of Chapter 3.

Throughout this document, vectors should be taken as column vectors unless explicitlytransposed, as in T. The one exception to this is gradient vectors, for which we use therow convention. In particular, the Jacobian matrix of a vector-valued function f(z) will be

denoted by Vf(z), for which the (ij)-th element is a , where fi(z) denotes the i-th

component of f(z).

We also will make use of index set notation for identifying sub-vectors and submatrices.For instance, if J is a set of indices, z, refers to the sub-vector of components of a vector zwhose indices are in the set J. For matrices, MJK identifies the submatrix of a matrix M

determined by the rows with indices in set J and the columns with indices in set K. Wewill use the notation M.. to indicate a submatrix containing all of the columns of M and

the rows with indices in set J. Similarly, M.K selects all of the rows but only the columns

in set K. The cardinality of a finite set J is indicated by jJ. Set subtraction (finite orinfinite) will be indicated by J \ K, meaning the set of all elements of set J that are notalso in set K.

V. V~ V.~ *, .~.; ~ '*,*%*~ ~ -~ .~

1.4.1 Acronyms

1.4.1 Acronyms

CGE Computable General Equilibrium problem (Section 2.4)

NLCP NonLinear Complementarity Problem

NLP NonLinear Program

QP quadratic Program

LP Linear Program

CNLP equivalent NLP formulation of an NLCP (Section 3.1)

Eq-NLCP NLCP formulation of CGE (Section 3.2)

Eq-NLP NLP formulation of CGE (Section 3.3)

DEPF Differentiable Exact Penalty Function (Section 5.2.1)

Eq-DEPF DEPF formulation of CGE (Section 5.2.1)

SLCP Sequence of Linear Complementarity Problems (Section 4.1)

SQP Sequence of quadratic Programs (Section 4.2)

SLP Sequence of Linear Programs (Section 4.4)

SLTZ linearization method based on Slutzky matrices (Section 7.2.2)

Sp P O "' 'l.l ' ." l'' ' "'"P".'" ", . p .mm, . . . - . - .d

8 1.4 A word on acronyms, notation, and internal references

1.4.2 Notation

RY! nonnegative orthant of r-dimensional Euclidean space

I symbol denoting a complementarity condition

zi elements of vector z with indices in set J

MJ K submatrix of matrix M; row indices in set J, column indices in set K

IJI cardinality of a finite set J

J \ K all elements of set J not also in set K

e a vector of ones

ej the i-th unit vector

m number of commodities/prices

n number of production activities

p m-vector of prices

Y n-vector of production activity levels

v artificial variable

d(p) m-vector of demands net of endowments (and nonlinear supplies)

X m-vector of approximate demands

A m x n activity analysis matrix

h price normalization vector (hTp = 1)

p Lagrange multipliers for supply/demand balances

9 Lagrange multipliers for nonprofitability conditions

i3 Lagrange multiplier for price normalization

.o,

,

:U.

- -

I2

THE COMPUTABLE GENERALEQUILIBRIUM PROBLEM (CGE)

In this chapter we build up the computable general equilibrium problem (CGE) from itscomponent parts. This entails describing the relevant mathematical properties of the de-

mand and production structures, discussing the convention for normalizing prices, and

finally delineating the mathematical conditions that define an equilibrium solution. Much

of this material is based on the excellent presentations by Kehoe [Keh 82] and Mas-Colell[MasC 85].

2.1 Properties of demand

Let m denote the number of commodities and associated prices represented in the equilib-rium problem. Let p denote an m-vector of prices and d(p) an m-vector of final demandsas a function of prices. For generality and notational convenience, d(p) represents finaldemand net of any initial endowments or price-sensitive nonlinear supplies. Typically, d(p)is undefined at the origin but defined for all strictly positive prices. If d(p) is not definedon all of WR' \ {0}, then some further assumption about boundary behavior is necessary inorder to ensure the existence of equilibrium prices in the domain of d(p). To this end, definea set U contained in the boundary of R!' over which d(p) is undefined. (Note: 0 E U.) Thedomain of definition of d(p) is denoted by V = RW \ U.

We consider demand functions, d(p), that satisfy the following properties on :

(D1) homogeneity of degree zero (d(Ap) = d(p), VA > 0)

(D2) Waras' law (pTd(p) = 0)

(D3) boundedness from below (3,r > 0 : d(p) -T)

(D4) twice- continuous differentiability

(D5) boundary behavior (for any P E U \ {0} and any sequence

p' _. f, p' E D it must be that jId(p')jj - oo).

Assumption (D5) is a technical point important to issues of existence and setting up ma-chinery to investigate regularity and uniqueness of equilibria. It is perhaps most easilyunderstood by example. If the demand functions are derived from a Cobb-Douglas utilityfunction, the demand for any commodity becomes infinite as its price approaches zero. In

this case, the set U is the entire boundary of R'.

As to the other properties, homogeneity of degree zero (D1) follows from the general equi-librium nature of the problem (i.e., all prices are endogenous) and an implicit assumptionthat money is neutral and has no direct effect on economic decisions. Walras' law (D2) isa condition that all income is spent. This property is inherited by the aggregate demand

.9- . V VV .(

10 2.1 Properties of demand

function when it is true for the demand functions of individuals. The implicit assumptionis utility-maximizing behavior with non-satiation. Boundedness from below (D3) is theeconomically meaningful condition that initial endowments (and price-sensitive nonlinearsupplies) axe finite. Property (D4) is a restrictive assumption in theory, but not so restric-tive in terms of practical computations. The motivations for this limitation were discussedin Section 1.1.

The above domain definitions and properties require special interpretation in the case ofeconomies with primary and/or intermediate commodities used only in the production sec-tor. A primary commodity is one for which aggregate demand is constant and equal tothe negative of the aggregate endowment. (There is a potential ambiguity in defining thedemand for a primary commodity when the price is zero. Any quantity in the range be-tween the total endowment and the amount actually used by the production sector is anacceptable definition. It is convenient to treat the demand as constant, and, in the presenceof a free disposal assumption to be made below, this convention can be used without loss ofgenerality.) An intermediate commodity is one for which aggregate demand is identicallyzero. Aggregate demand for final commodities is unaffected by the prices of intermediatecommodities. In the presence of primary commodities, it is essential that the set U includeall price vectors in which only primary commodities have positive prices. This rules out therather anomalous situation in which consumers have income but anything they might wantto buy is free. If intermediate commodities exist, all definitions and properties above are tobe interpreted in terms of demand functions defined for primary and final commodities only.Prices of intermediate commodities do not enter the demand functions and consequentlyare irrelevant to the definition of the domain D.

The special model structures that arise in the case of intermediate and primary commoditieswill be explicitly relevant only in Chapter 6. The rest of our results and discussions applyindependently of the commodity classifications. As explicitly carrying the partitioning ofcommodity balances and prices would only make the notation more cumbersome, we willuse the more general and simple notation.

The homogeneity property and Walras' law imply some strong relationships between thefirst- and second-order derivatives of the demand functions. Homogeneity of degree zero(Dl) implies via Euler's law that:

(H1) Vd(p)p = O.

Further differentiating this identity implies:

(112) Vdi(p) = -pTvd(p).

Differentiating Walras' law (D2) implies:

(Wi) VTd(p)p = -d(p).

Further differentiating this identity implies:

(W2) E, p,V~d,(p) = - [Vd(p) + vTd(p)].

An important implication of properties (H1) and (Wl) is that Vd(p) can be symmetric only!I-'., V - -!' i U S S S ~ * -

2.2 Properties of production 11

if d(p) is identically zero. We axe not aware of any previous use being made of properties

(H2) and (W2). For our purposes, these relationships prove instrumental in establishing

the equivalence of two solution methods in Section 4.3.

The properties itemized in this section are the only restrictions placed by economic theory

on differentiable aggregate excess demand functions. This generality of form is the principal

source of difficulty in solving general equilibrium problems.

2.2 Properties of production

Let n denote the number of linear production activities in the equilibrium problem (exclud-

ing disposal activities). Let y denote an n-vector of production activity levels and A the

associated m x n activity analysis matrix. In this case, the net output from production is

defined by the polyhedral cone {Ay : y _ 0). As a function of prices, the vector of excess

profits obtained per unit of operation of the various production activities is given by ATp.

Unless a CGE model is specifically designed to address issues of indivisibility and/or in-

creasing returns to scale in production activity, it is generally assumed that the set of

aggregate production possibilities is closed and convex and admits inactivity. These condi-

tions are clearly met in the case of linear production considered in this research. The othercharacteristic assumptions are the following:

(Al) free disposal of excess supply

(A2) output requires input (Ay t> 0, y >_ 0 =::o Ay = 0).

Any departures from free disposal property (Al) axe readily represented by defining ap-propriate resource-consuming activities. Property (A2) is the specialization to linear pro-

duction of the economically meaningful condition that no combination of activities canresult in a net output (of anything) without a net input (of something). This property ismathematically equivalent to the following statement in terms of price conditions:

(A2') prices yielding no profits (3p > 0: ATp < 0).

Economically, this means that positive prices always exist which eliminate all excess profits.

2.3 Price normalization

Because of the neutrality of money in the class of general equilibrium problems underconsideration, an equilibrium solution determines only relative prices. This invariance to

price scale is reflected in the homogeneity of degree zero of the final demand functions,property (Dl), and in the fact that the relative profitabilities of linear production activities

(ATp) are invariant to a positive scaling of all prices. A price normalization is a mathematical

construct employed to remove this degree of freedom and determine a price scale (howeverarbitrary).

In principle, any nonnegative function of prices h(p) which is homogeneous of degree onecan be used to define a price normalization: h(p) = constant. It is somewhat traditional in

12 2.4 General equilibrium conditions

economic literature to employ a numiraire good, that is, a selected commodity whose priceis arbitrarily fixed (at unity, say). Mas-Colell [MasC 85] makes general use of the unit ball:JpJ = 1. Many others, including Kehoe [Keh 821, employ the unit simplex: eTp = 1, wheree is a vector of ones.

Given the inherent arbitrariness of the mode of price normalization, it makes sense to selecta normalization which imparts desirable theoretical and/or computational properties to theequilibrium system. For present purposes we employ a general linear normalization: hTp -

1, where 0 $ h > 0. Note that this includes the unit simplex and specifying a num6rairegood as special cases, simply by setting h = e or h = ei, where ei is the i-th unit vectorand i is the index of the num6raire commodity. In Chapter 6 we build upon earlier work ofthe author [Sto 851 and of Eaves [Eav 87] in using the price normalization to computationaladvantage. In particular, we demonstrate that normalizing on a simplex of prices for finaland primary commodities (in conjunction with including a mnatched artificial column to bediscussed in the next chapter) is sufficient to guarantee the feasibility and solvability ofsubproblems derived by linearizing the final demand functions in the equilibrium problem.

2.4 General equilibrium conditions

Given the definition of demand and production components provided above, we can nowdelineate the mathematical conditions which characterize a general equilibrium solution.A general equilibrium is a set of nonnegative prices p and nonnegative production levels ywhich satisfy:

(El) d(p) < Ay (demand cannot exceed supply)

(E2) ATp < 0 (no positive excess profits)

(E3) hTp = 1 (price normalization)

(E4) prd(p) - pay (excess supply has a zero price and apositive price means the market clears)

(E5) yTATp = 0 (activities used have zero profit andno unprofitable activities are used).

Note that free disposal is reflected in (E4) and in the use of an inequality in (El). Condition(E5) is in fact implied by (E4) and Walras' law (D2). Changing the price normalization in(E3) does not affect the essential characteristics of an equilibrium solution.

Conditions (E4) and (E5) are called complementarity conditions. Each inequality in (El)and (E2) is associated with a matching price or activity variable, respectively. Complemen-tarity of a given pair means that if the inequality is strict, the matching variable must bezero, and, conversely, if the variable is positive, the inequality must be satisfied as an equal-ity. (This is equivalent to multiplying through each inequality by its matching variable andrequiring the result to be an equality.) Conditions (E4) and (E5) are precisely the summingof these pairwise complementarity conditions fox inequalities (El) and (E2), respectively.

2.5 Existence of equilibrium solutions 13

Since all inequalities have the same sense and all variables are nonnegative, equality holdsfor the sum if and only if all matching pairs are individually complementary.

While complementarity is an attribute serving to define an equilibrium solution, an inter-esting aspect of the general equilibrium problem is that any nonnegative feasible solutionof (E1)-(E2) automatically satisfies the complementarity conditions (E4)-(E5). Because ofWalras' law, the left-hand-side of (E4) is identically zero, making (E4) in effect the sameas (E5). For the same reason, multiplying through inequality (El) by any p 2! 0 impliesprAy > 0. In turn, multiplying th ough inequality (E2) by any y _ 0 implies pTAy <_ 0.Hence, PTAy = 0 and any nonnegative feasible solution of (E1)-(E2) satisfies (E4)-(E5).

The upshot of this result is that the general equ: ibrium problem is, in essence, a (nonlinear,nonconvex) feasibility problem. This characteristic will resurface a number of times in thedircussions of later chapters. Nonetheless, the complementarity aspect remains central toone powerful and influential approach to formulating and solving the computable generalequilibrium problem. It also distinguishes this formulation from specifications made purelyin terms of simultaneous equations with no nonnegativity restrictions. Such specificationsdo not admit linear production and must presume that any solution will obtain positiveprices. The complementarity formulation could be reduced to simultaneous equations givenprior information as to the binding inequalities, which may not be limited to those for whichthe complementary prices and activities are positive at the equilibrium solution.

2.5 Existence of equilibrium solutions

The properties of demand and production delineated above are sufficient to ensure theexistence of equilibrium solutions in the domain D. Proofs are traditionally based on a fixed-point argument using either the Brouwer or the Kakutani fixed-point theorem (dependingon the continuity of a map constructed on a compact, convex subset of the price space). Themost appropriate proof for the present context is probably that of Kehoe [Keh 82], whichexplicitly deals with boundary behavior and assumption (D5), as well as with primary andintermediate commodities.

Note that the polyhedral region defined by conditions (E2) and (E3) is certainly closedand convex. To satisfy the assumptions for an existence proof, the price normalizationis used to bound the region, as is easily done, for instance, by using the unit simplex(h = e). (It is also possible to normalize only the prices of final and primary commodities;see [Keh 82].) Once existence has been established as a theoretical matter, however, it is nolonger necessary to use the same price normalization for computational purposes. Indeed,a normalization applied only to a subset of the prices may well prove advantageous, as wewill show in Chapter 6. The important proviso here is that positive scalings of (at leastone of) the equilibria -xisting on the unit simplex (say) be admitted by the alternativeprice normalization employed for computation. The only situation in which this would notbe true is if all prices associated with positive components of h % 0 are in fact zero in allequilibrium solutions. The most likely example of this unfortunate circumstance would bechoosing as the num~raire good a commodity which must have zero price at equilibrium.

t , . "" /¢ .:,., ,N ' ''q' ' ., ,, ,, , ' ' -- Zx .- --- ,., - : - , - , _ ,' N%

14 2.5 Existence of equilibrium solutions

Throughout the following discussions, it will be assumed that h has been chosen so as notto rule out the equilibrium solutions. This is not a restrictive assumption, since h =ecan always be used in the absence of enough insight into model structure to allow a betterchoice. Consequently, existence will not be an issue in any of the succeeding discussions ofalternative mathematical formulations of the general equilibrium problem.

I..

* ~ I I 9 1

TWO EQUIVALENT FORMULATIONS OF CGE

In this chapter we develop two equivalent formulations of the computable general equi-librium problem (CGE). The first follows closely the statement of CGE in the previouschapter and comprises a specially structured nonlinear complementarity problem (NLCP).The second is a new development comprising a nonlinear program (NLP) derived fromthe NLCP by minimizing the nonnegative sum of the complementarity conditions over thesame inequalities. Both formulations depart from more traditional formulations by includ-ing an artificial variable in the inequalities representing the supply/demand balances. Thisartificial variable proves to be helpful in two respects: (1) providing a simple characteriza-tion of an equilibrium solution, and (2) ensuring the solvability of subproblems generatedby iterative methods for solving either formulation. Discussion of point (2) is deferred toChapter 6.

First we briefly review the general form of an NLCP. This is partly to establish notationand partly to demonstrate some useful properties which later will be specialized to theformulation of CGE.

3.1 The nonlinear complementarity problem

The general nonlinear complementarity problem is defined as follows. Given a vector-valuedfunction f(z) defined for z > 0, find a solution that satisfies:

(NLCP) z > O, f(z)> O and zTf(z) = O.

Any z satisfying the two nonnegativity conditions is called a feasible solution. For any suchfeasible solution it must be the case that zifi(z) > 0 for all i and hence that zTf(z) > 0. Acomplementary solution is a feasible solution that satisfies the complementarity conditionzifi(z) = 0 for all i. Because of the nonnegativity of each term, this condition is equivalentto zTf(z) = o.

It is well known that the general nonlinear complementarity problem may be equivalentlystated as a nonlinear program:

(CNLP) minimize zTf(z) subject to z > 0, f(z) > 0.

Note that the objective function of (CNLP) is bounded below by zero on the feasible region.Given the existence of a solution of (NLCP), it follows that global minimizers of (CNLP)are complementary solutions of (NLCP) and vice versa.

The special case of an affine f(z) = Mz + q produces a linear complementarity problem,for which the corresponding CNLP form is a quadratic program with Hessian M + MT.

Mathematical programs of the CNLP form have been called composite programs by Cottle(e.g., in [Cot 641), whose original studies of the form pertained to the optimality conditions

15

-QV

16 3.2 CGE as a nonlinear complementarity problem (Eq-NLCP)

for convex quadratic programs. More generally, the (first-order) optimality conditions for

any NLP can be represented by an NLCP, and this establishes an important connectionbetween nonlinear complementarity and nonlinear programming. It is essential, however,not to confuse this connection for a special class of NLCPs with the constructed equivalencebetween problems (NLCP) and (CNLP) above, which applies regardless of the origin andspecific form of (NLCP).

3.2 CGE as a nonlinear complementarity problem (Eq-NLCP)

We now transform the statement of equilibrium conditions for CGE in Section 2.4 into

a highly structured NLCP. We begin by introducing an artificial variable v into the sup-ply/demand balance inequalities (El). The associated artificial column has the appearance

of an extra linear supply activity in which, for reasons which will become apparent, theoutput coefficients are defined by the price normalization vector h. After some stylisticreformatting, the CGE problem can be equivalently formulated as the following nonlinearcomplementarity problem:

IEq-NLCP[

Find a complementary solution of:

(NLCP1) -d(p) + Ay + hv > 0 1 pO

(NLCP2) -ATp > 0 2- y 2 0

(NLCP3) - hTp = -1 _ v±

(NLCP4) p , y > 0

Here we place the shorthand notation "I p > 0" to the right of an inequality in order toindicate that each component of the nonnegative vector p is to be pairwise complementarywith the (slack variable of the) associated inequality.

We have innocuously changed the sign of the price normalization constraint so as to producea skew-symmetric structure in the two linear blocks of the system of inequalities. Theaddition of variable v makes the system square, and its associated inequality is actually theequality price normalization constraint. Consequently, the variable v is not sign-constrained,and the complementarity condition is always satisfied. This deviates somewhat from thestandard format of a complementarity problem, but it seems best to avoid the notationaltedium of splitting the price normalization into two inequalities and v into its positive andnegative parts.

The complementarity conditions for Eq-NLCP reduce to a very simple form. To obtainthe appropriate specialization of "zJf(z), " take any feasible solution and multiply through

C.

C.

CV

3.3 CGE as a nonlinear program (Eq-NLP) 17

inequalities (NLCP1) by p > 0 and inequalities (NLCP2) by y > 0. Add the two resultingscalar inequalities to obtain:

-pTd(p) + pTA + pThv YTP 0.

Now note that the first term vanishes because of Walras' law, (D2) of Section 2.1. Thesecond and fourth terms cancel, and pTh = 1 by (NLCP3). What is left is simply v > 0.Hence, any feasible solution of Eq-NLCP necessarily has v > 0, and the complementarysolutions of Eq-NLCP are those feasible solutions with v = 0.

The correspondence of Eq-NLCP to the equilibrium conditions for CGE in Section 2.4 isimmediate. Inequalities (NLCP2) are precisely CGE conditions (E2), and complementaritywith y is exactly (E5). (NLCP3) is just (E3) with a sign reversal. Inequalities (NLCP1)correspond to (El), with the addition of an artificial term hv. Complementarity with pcorresponds logically to (E4). Despite initial appearances, the presence of hv does notdisturb the equivalence at equilibrium. Since any feasible solution of Eq-NLCP has v > 0,this solution may not satisfy (El). However, a complementary solution has v = 0 and thusdirectly satisfies both (El) and (E4).

It follows then that equilibrium solutions of CGE can be computed by solving problemEq-NLCP. It is important to keep in mind the implication of properties (H1) and (W1)of Section 2.1 that d(p) can never be the gradient of any scalar-valued function. Conse-quently, even though the linear parts of the problem conform to a skew-symmetric structure,Eq-NLCP does not represent the first-order conditions for optimality of any nonlinear pro-gram.

3.3 CGE as a nonlinear program (Eq-NLP)

Constructing the CNLP nonlinear program corresponding to problem Eq-NLCP is partic-ularly easy since the complementarity conditions reduce to v = 0. Using this fact, wemay define the following nonlinear programming formulation of the computable generalequilibrium problem:

minirmzesubject to

(NLP1) -d(p) + Ay + hv > 0 1 P > 0

(NLP2) -ATp > 0 _ > 0

(NLP3)- h~p - .i

(NLP4) p , y > 0

Note that the inequalities of Eq-NLP are identical to those of Eq-NLCP. At a Karush-Kuhn-Tucker point of Eq-NLP, each inequality is complementary with an associated Lagrange

18 3.3 CGE as a nonlinear program (Eq-NLP)

multiplier, and the collection of such multipliers is suggestively denoted by (75, ,0). Bythe same argument as made above, v > 0 for any feasible solution of Eq-NLP, and global

minima are those feasible solutions that attain v = 0.

Since global minimizers of Eq-NLP are equivalent to complementary solutions of Eq-NLCP,the correspondence to equilibrium solutions of CGE is the same as shown above. Theform of problem Eq-NLP makes explicit the observation of Section 2.4 that the equilibriumconditions of CGE in essence define a feasibility problem. Eq-NLP is precisely seeking afeasible solution of conditions (E1)-(E3) by minimizing the value of an artificial variableintroduced into (El).

3.3.1 Stationary points of Eq-NLP

Aggregate demand functions axe not in general convex, and finding a global minimum ofa nonconvex program can be an extremely difficult undertaking in practice. In the case ofproblem Eq-NLP, we at least have the prior knowledge that global minima have v = 0, sostationary points found by any applied solution algorithm can be readily and definitivelychecked for optimality. In addition, the special structure of Eq-NLP and the properties ofd(p) impart some interesting properties to stationary points of Eq-NLP which suggest thatnonoptimal stationary points may be rare or at least avoidable. While we have not been

able to obtain any conclusive results in this regard, it is worthwhile discussing some partialresults as a starting point (and hopefully motivation) for future research into this issue.

Let (p, Y, b) be a Karush-Kuhn-Tucker (KKT) point of Eq-NLP, with (p3, , b) the associatedLagrange multipliers. For the present we consider only the first-order conditions, whichentail the following parallel sets of inequalities and complementarity relationships:

(P1) -d(p)+ AV+ h> 0 1P 5_ 0 Vd(p)p + A + h0 > 01 f)_0 (M1)

(P2) -ATji 0 1 > 0 ->AT 0 _1 > 0 (M2)

(P3) -hTp -- 1 1 -hT5 =-1 V (M3)

(P4) 73, P 0 P3, i) >0 (M4)

Equivalent statements of (P1):

(P1W) VTd(p)p + Aq + hi > 0 1P3>0

(Pill) -Vd(p)p +Ap+ h > d(l) I P > 0

The equivalence of conditions (P1W) and (Pl1) to (Pl) follows from properties (W1) and(ill 1) of Section 2.1. We will refer to the conditions on the left as the primal conditions (hencethe labeling with "P") and to those on the right as the multiplier conditions (hence the"M"). Because of the nonconvexity of Eq-NLP, the above conditions need not be sufficientfor optimality of the solution.

We now detail a number of observations about the KKT system for Eq-NLP.

W,

3.3.1 Stationary points of Eq-NLP 19

Observation 1. Putting (P1W) in place of (P1) demonstrates that the primal inequalitiesare identical in form to the multiplier inequalities at a KKT point. This implies in particularthat the primal feasible solution (p, #, ,V) is also feasible with respect to the multiplier in-equalities (M1)-(M4). It may not be self-complementary, however, in which case V > 0. Aswe have already shown, a primal feasible solution is globally minimal (i.e., complementary)if and only if V = 0.

Observation 2. Putting (P1H) in place of (P1) gives primal and multiplier conditions whichare precisely the complementary slackness conditions for the linear program of minimizingv subject to "generic" versions of (P1H) and (P2)-(P4). This linear program will resurfacein Section 4.4.

Observation 3. We may intuitively expect that, at any KKT point, 13 = 0. The reasoningis that 0 is the Lagrange multiplier for price normalization (NLP3) in Eq-NLP, and neither(NLP1) nor (NLP2) is affected by changing the price scale. This intuition can in fact bevalidated as follows. By complementarity of p with (MI), we have

0 = PTvTd(p)P + PT + pTh = 0 + 0 +.

Here, the first term vanishes because of (H1) of Section 2.1, the second term vanishesbecause is complementary with (P2), and the third term reduces to 13 because of (P3).Observation 4. By a similar argument using complementarity of f5 with (P1), we obtain3= pTd(p) = -Tvd(p), where the second equality results from transposition and (Wl)

of Section 2.1. Since V > 0 at any non-optimal (non-equilibrium) solution, we may derivethe economic interpretation that the consumption bundle d(p) would not be affordable atprices p.

Observation 5. By multiplying through (Ml) and (M2) by 0 and 0, respectively, andthen adding the result, we obtain PTVd(p) 3> 0. Adding this inequality to the expressionobtained for V in Observation 4, and again making use of (H1) from Section 2.1, we canconclude that V < (pl - p)rVd(p)(p - p). In pa. cular this implies that, if by some chanceVd(p) is negative semidefinite on the linear subspace orthogonal to h, then V must be zero.

We will again have use for these KKT conditions in Chapter 5, wherein we demonstratethat the multipliers from a nonoptimal stationary point of Eq-NLP can be used to construct

4 a descent direction in a penalty function problem that is equivalent to Eq-NLP.

-, S

r .ta

1S.

SOLUTION METHODS FOREq-NLCP AND Eq-NLP

In this chapter we examine the application of five general-purpose methods to bolving thenonlinear complementarity problem (Eq-NLCP) or nonlinear program (Eq-NLP) formula-tions of the computable general equilibrium problem. The common feature of these methodsis solving a sequence of subproblems defined by linearizing nonlinear constraints, i.e., theaggregate demand functions in the equilibrium problem. The methods may be distinguishedby the type of linearization employed and the kind of solution sought relative to the lin-earized constraints. The first four methods employ a first-order Taylor's expansion and thensolve an appropriately defined linear complementarity problem (LCP), quadratic program(QP), linear program (LP), or projected Lagrangian problem, respectively. The fifth classof methods employs any of a number of alternative linearizations and then solves either anLCP or a QP. We defer the discussion of convergence of these methods to the next chapter.

Our list of methods by no means includes all possible optimization methods for solvingproblem Eq-NLP. In particular we omit sequential unconstrained methods based on penaltyor barrier functions (although such functions may appear as merit functions guiding one ofthe other methods considered). For the most part, this reflects a bias towards exploiting asmuch as possible the largely linear nature of the CGE problem under study.

4.1 A sequence of linear complementarity problems (SLCP)

The notion of solving a nonlinear complementarity problem by generating and solving asequence of linear complementarity problems is a natural analog of well-studied methodsfor solving systems of nonlinear equations by sequences of linear equations. In particular,the analog of Newton's method is based on employing a first-order Taylor's expansion as themode of linearization. Important contributions to the extension of Newton's method to NL-CPs are found in the algorithm of Eaves [Eav 78] and in the work of Josephy and Robinsonon generalized equations, which is comprehensively surveyed in [Rob 82]. The NLCP is animportant special case of the generalized equation, and Josephy in [Jo-N 79] demonstratesthat the application of Newton's method to the generalized equations corresponding to anNLCP results in solving a sequence of LCPs.

Mathiesen independently elaborated the SLCP method as a means of solving economicequilibrium problems. The best presentation of his work from a theoretical perspectiveis in [Mat 87]. Particularly in collaboration with Rutherford, Mathiesen has conducted

extensive empirical investigations into the behavior of SLCP as applied to both partial andgeneral equilibrium problems. Almost all of the computational results attest to the generalrobustness and efficiency of the SLCP method. The most informative discussions of theseexperiments are unfortunately in unpublished form, [MR 83] and [Rut 86].

21

-. ~~~~~~& -7 ICI -If V'*K~ 4 ~*

22 4.1 A sequence of linear complementarity problems (SLCP)

SLCP is applied to problem Eq-NLCP as follows. Let an initial linearization point be givenas p0 , which must be in the domain of d(p) but need not be feasible with respect to theprofitability conditions (NLCP2) in Section 3.2. Thereafter, any iteration k + 1 begins witha first-order Taylor's approximation of d(p) at the point pk:

d(p) ; dk(p) E d(pk ) + Vd(pk)(p - pk) = d(pk) + Vd(pk)p,

where the simplification on the right results from (H1) of Section 2.1. We then define andsolve the linearized subproblem denoted by LCP(pk):

ILCP(pk)


(LCP1) -Vd(pk)p + Ay + hv > d(pk) I p _ 0

(LCP2) -ATp > 0 1 y > 0

(LCP3) -hTp -1

(LCP4) p , y > 0

Let ( VYi) solve LCP(pk). If p is sufficiently close to pk, then we have an approximatecomplementary solution of Eq-NLCP. If not, we define a new linearization point by:

pk+1 =pk+A(p _pk), where O<A< 1.

Set k = k + 1 and repeat.

As in most iterative algorithms, the choice of step length A is a matter of art. At the least,A must be chosen so as to ensure that pk+1 remains in the domain of d(p). We will haveonly a few more remarks about the issue of step length in the next chapter on convergence.

An important consideration in the definition and execution of SLCP is the feasibility andsolvability of the LCPs encountered. Mathiesen uses the num6raire method of price nor-malization and Lemke's almost-complementary pivoting method [Lem 68] for solving eachLCP. He resorts to choosing a different num~raire commodity whenever a subproblem is notsolved. As we shall demonstrate in Chapter 6, however, the normalization vector h can bechosen in such a way as to guarantee that subproblems constructed as above have a feasiblecomplementary solution which moreover can be successfully computed by Lemke's method.

4.1.1 Basic solutions, uniqueness, and regularity

Since Lemke's method proceeds by examining only basic solutions of an LCP, it is importantto ascertain whether an equilibrium solution can in fact occur at an extreme point (orvertex) of the linearized contraints. Mathiesen correctly recognized that, in the absence

of a price normalization, the equilibrium conditions do not admit a basic solution of the

!L%:: 'L~t 1, X , %I

4.1.1 Basic solutions, uniqueness, and regularity 23

inequalities. (We will review this observation below.) In practice Mathiesen avoids the

singularity by deleting the price column and the supply/demand inequality correspondingto the numiraire commodity. We will show that adding the explicit price normalization

equality (LCP3) and the artificial variable v accomplishes the same purpose. In fact, thetwo approaches are equivalent in this respect for h = ei. There is, however, a minimum

rank condition required, at least at an equilibrium price.

Since any equilibrium solution corresponds to a complementary solution of Eq-NLCP, theonly question is whether it can be reduced to a basic complementary solution. That is,given an equilibrium price vector P, can we construct a basic complementary solution of

LCP(3)? To this end, let I3 be an index set for all positive components of p and also forthe corresponding rows of the matrix A. Let a index all those columns of A such that

P r+.A+ = 0. Accordingly, in any complementary solution, qj = 0 for all j 0 a. We furtherpartition the set a into sets a + and a0 = a \ a + . This partition can always be constructed

so as to satisfy the property that a + is a maximal linearly independent set of columns suchthat there exists a ( +, y satisfying:

AO+ a+9+ = dp+(p)

, Ao_,+, ,,+ > d,_(- )

So+ > 0

a0 =0.

Here, the index set 03+ is given, while the index sets 3i0 and 03- are implicitly defined tocorrespond to weakly binding (po = 0) and strictly slack inequalities in the above solution.

We now construct a partitioned matrix which will be the focus of the ensuing discussion.For brevity we let V = Vd(p).

-Va.+,g+ A6+ + h9 +o A+.o

-(AO+. T-Ao+) T

-hT -h - j MK J_+ _0

-V.+ A$,,,+ ho -Voo A 0 [MJ MKK,

( T00 -(A00o) T

The abbreviated notation in terms of "M" corresponds to that of Mangasarian [Mang 80],wherein J indexes binding inequalities with complementary variables that are positive and

K indexes binding inequalities with complementary variables that are zero. Since the pricenormalization is an equality and variable v is unconstrained in sign (and therefore necessarilybasic), it is appropriate to include them in set J - even though vi = 0 at an equilibriumsolution.

In building a basis for this solution, the J columns must always be included while the Kcolumns need not be considered at all. Indeed, a satisfactory and sparser set of basis columns

]1p

- - *

V.. w

24 4.1 A sequence of linear complementariy problems (SLCP)

corresponding to the K rows would be the columns for the (degenerate) slack variables ofthe K rows, specifically the #0 rows of (LCP1) and the a ° rows of (LCP2). In addition,nondegenerate slacks are in the basis for all inequalities that are strict. Consequently, theability to construct a basic solution turns on the nonsingularity of M,,, We must thenaddress the conditions under which this is true.

Let M+. denote the leading principal submatrix of M,, obtained by deleting the rowand column containing h,,+. The submatrix M4+ is singular because -= 0 and

(A 0 r)T+,+, = 0 at an equilibrium solution. Also, by virtue of property (Wi) from Sec-

tion 2.1, (Vr++)Tpr + Ap+04 V0+ = 0. Hence, (-TV+ )M++ = 0. Thus, since hT+y,+ = 1,we can conclude:

M 4 4Z + (hp+) W=0==; 0 +W=0,

that is, that the artificial column is linearly independent of the columns of M4+. Nonsin-gularity of M, then depends on the rank of Mr . If the rank of M++ is I3+1 + ja - 1(where I - denotes cardinality), then the null space of M+ is spanned by the single vector

(p, , 0). But since hr+j = 1, we can conclude from this and the preceeding implicationthat bordering M++ with the price normalization and artificial column serves to make M,nonsingular. If M+ has more than one degree of rank deficiency, then naturally we can-not expect the addition of one row and column to produce a nonsingular matrix. Thus,the necessary and sufficient condition for existence of a basic equilibrium solution is thatthe associated submatrix M++ have rank 10+1 + a+1 - 1. This argument was inspired by asimilar argument of Mas-Colell [MasC 85] for a pure exchange economy.

The above conclusion holds for any normalization such that 0 6 h'9+ _ 0. In particular,it applies for h = ei, for any i E /3+. By a simple expansion of determinants argument, itis clear that M., is nonsingular in this case if and only if the submatrix of M++ obtainedby deleting the row and column corresponding to index i is nonsingular. Hence, Math-iesen's procedure of deleting the column and row corresponding to the num6raire price andcommodity balance is equivalent to adding an explicit price normalization and matchingartificial column based on h = ei, where i is the index of the numdraire commodity.

We are thus left with the reassurance that a solution procedure which examines only basicsolutions can find any equilibrium solution that satisfies the indicated rank condition onthe submatrix M++. Note that this condition depends on both the Jacobian of the demandfunctions and the activity analysis matrix.

We can also use the above partitioned matrix to examine a solution for local uniquenessand regularity. Mangasarian in [Mang 80] derives a collection of necessary/sufficient andsufficient conditions for local uniqueness of a solution of an LCP. These conditions are in turnsufficient for local uniqueness of the solution of an NLCP for which the LCP is a linearizationat the solution point. Mangasarian employs the CNLP form of the complementarity problemand assumes twice-continuous differentiability so as to utilize the second-order sufficientconditions for optimality. Kyparisis in [Kyp 86] obtains the same results assuming onlyonce-continuous differentiability by using the theory of generalized equations. Robinsonin [Rob 801 establishes related conditions (and definitions) for the regularity of solutions

4.2 A sequence of quadratic programs (SQP) 25

to complementarity problems as important special cases of regular solutions of generalizedequations.

Mangasarian's necessary and sufficient conditions for local uniqueness of a solution of an

LCP solution encompass both the general case and the particular case of M., nonsingu-lax. The nonsingular case is the most interesting in our context, both because we exam-ine solution algorithms that find basic solutions and because there is then an immediateconnection to Robinson's conditions for regularity. Both issues turn on the propertiesof the Schur complement of M, 1 in the above partitioned matrix, which we denote by

MKK - M MJJ-MJK. A solution is locally unique if and only if z = 0 is the onlysolution of the LCP:

find z>0 such that Sz>0 and zTSz=O.

A solution is regular if and only if all of the principal minors of S are positive. Clearly,regularity then implies local uniqueness.

In the further special case of nondegenerate (strict complementary slack) solutions, we haveK = 0. In this case regularity is equivalent to nonsingularity of M., Such solutions arewhat Kehoe [Keh 82] and Mas-Colell [MasC 85] refer to as regular and properly regularequilibria, respectively. Their focus on the nondegenerate case is necessitated by the exten-sive use of differential topology and genericity analysis based on arbitrary perturbations ofproblem data. There is some reassurance in their findings that regularity is a generic resultin the space of production economies, meaning that almost all economies have only regularequilibria. Regularity is significant for computational purposes not only in guaranteeing theexistence of basic equilibrium solutions but also in providing conditions for convergence ofsolution methods to be discussed in the next chapter.

4.2 A sequence of quadratic programs (SQP)

An important class of methods for solving general nonlinear programs proceeds by solvinga sequence of quadratic programs, derived by linearizing the nonlinear constraints (via Tay-lor's expansion) and formulating a second-order approximation of the associated Lagrangianfunction. A reasonably self-contained discussion of various manifestations of the basic SQPmethod can be found in [GMW 81]. Different versions may be characterized by the natureof the Lagrangian approximation, whether or not the linearized constraints are perturbedin any way, and by the nature of the merit function and line search employed to promoteconvergence from any starting point. Typically, the QP subproblem is expressed in termsof a search direction relative to the linearization point.

We will not deal with constraint perturbations and defer the convergence issues to laterchapters. With respect to problem Eq-NLP, the SQP process is very similar to applyingSLCP to Eq-NLCP. Specifically, p0 must be given and iteration k + 1 begins by linearizingd(p) at pk. A second-order approximation of the Lagrangian function must be defined,noting that for problem Eq-NLP the only source of curvature is the demand functionsd(p) in inequalities (NLPI). We leave this approximation non-specific for the moment anddenote the estimated Hessian matrix by H(pk,,i'), thus indicating its dependence on both

5'.

,'.5

.,, . - 2 - - --------, , ],-,o-. , ,.. -V V- -, .- ',' . .- . - ,5. '-...%.-. --- ,.-., .. :-,....,,,-,-..:,

26 4.3 SLCP implements SQ

the linearization point and an estimate of the Lagrange multipliers, /i. We then define andsolve the linearized subproblem denoted by Qp(pk):

minimize (p-pkFH(pk, pk)(p-pk) + Vsubject to

(QP1) -Vd(pk)p + Ay + hv >_ d(pk) & ; > 0

(QP2) AT > 0 0 0

(QP3) hTp = - 1

(QP4) p , y > 0

Let (p, 9, V) solve QP(pk*), wiith (P, , ii) the associated Lagrange multipliers. If p and 7) are

sufficiently close to pk and 1k, then we have an approximate local minimum for Eq-NLP. Ifnot, we define a new linearization point by:

pk+1 =p k+A(p-pk), where 0< A< 1.

Typically, the Lagrange multiplier estimate is updated by ip'+l = i3. Set k = k + 1, andrepeat. A

Note that problem QP(pk) has the same inequalities as LCP(pk), so it inherits the sameproperties of feasibility and non-singularity as discussed above. In many practical imple-mentations, the Hessian estimate H(pk, # 1) is constructed so as to be positive semidefinite.If this is not enforced, as in the next subsection, it is desirable at least that the subproblemrbe constructed so that the objective function is bounded below on the feasible region.

4.3 SLCP implements SQP

In the original SQP method of Wilson [Wil 63], the QP objective function is derived fromthe second-order Taylor's expansion of the Lagrangian function at the linearization pointp'. The Lagrangian is constructed from the exact Hessians of the objective and constraint

functions using the QP multipliers Pjk as Lagrange multiplier estimates. The method wasintended for use on convex problems, for which the resulting QPs would also be convex.

Even in the absence of convexity, we can specialize Wilson's formulation to problem Eq-N LP.yielding the following objective function for QP(pk):

PkT fkV(,()] P)+ 1!.minimize 1(p ( 7))T 1)'V2di(pk (I -

If by design or chance pk pk, the Hessians can be eliminated using (W2) of Section 2.1to produce:

minimize -(p _ pk)T [V(.(Ipk) + VTd(pk)] (p _ Pk) + v.

minimizeS- ~*'~~% S~

" 4 -N -

4.3.1 A more general equivalence result 27

We can then apply (111) and (W1) from Section 2.1 to obtain:

minimize _ pT [Vd(pk) + vTd(pk)] p - pTd(pk) + v,

which in non-symmetric form is:

minimize - pTVd(pk)p - pTd(pk) + v.

The resulting QP is thus precisely the composite quadratic program equivalent to LCP(pk).

The objective function is a statement of the complementarity conditions and consequently isbounded below by zero on the feasible region. Moreover, the global minima of the compositeQP are equivalent to the complementary solutions of the LCP.

We can now provide an SQP interpretation of SLCP iterations. Suppose that at iteration 1we take P5 = p0 . (Given the underlying complementarity nature of the equilibrium problem,this would seem to be a better estimate than most.) With this initialization, QP(p 0 ) andLCP(p ° ) are identical problems. Given this equivalence, a complementary solution (P, y, f)of LCP(p0 ) is indeed a global minimizer of QP(p0 ). Hence, if a unit step length is applied(as in Wilson's algorithm), for the next iteration we can take pl 1 = . So again, QP(pl)and LCP(p 1 ) are identical. By the obvious induction, this equivalence continues so long asa unit step length is taken. Thus, SLCP (with unit step length) is precisely implementingWilson's SQP method on a nonconvex problem. The saving feature is that solving thesubproblem as a complementarity problem guarantees finding a global minimum for the(composite) QP.

If a unit step length is not taken, the equivalence can be maintained by applying thesame update formulas to the primal iterates pk and the Lagrange multipliers jjk. Suchsimultaneous updating is a feature of a different version of SQP implemented by Gill,Murray, Saunders and Wright [GMSW 86].

4.3.1 A more general equivalence result

We derived the equivalence result just discussed using the special properties of the demandfuinctions in a general equilibrium problem. As it turns out, however, this result is actuallya special case of an equivalence applying to any NLCP. Our discovery of the more generalresult w'-s kindled by Mangasarian's utilization in [NIang SO] of the second-order sufficientoptinality conditions for the equivalent CNLP statement of the general NLCP (defined inSection 3.1).

Consider applying Wilson's SQP method to problem CNLP. Let Z be a linearization pointleading to the following linearized constraints:

z > 0 and Vf(.)z > -f(z) + Vf(z)z.

The Q1P objective function is formed from the gradient of lie CNLI1 objective and aiiestiniated Hlessian of the Lagrangian function based on estimated Lagrange iiiltipliers.

C-*r h* *J, ~ ~%- ~ - ' ~ W ~ C

C- .,., .- 17 VMP-. ,, 77

28 4.3 SLCP implements SQP

The gradient of the CNLP objective is:

[f (i + f(zz

The estimated Hessian of the Lagrangian is:

vf() + Vf(5) + DZ- (AP - )i

Again, if by design or chance i = z, the summation term drops from the estimated Hessian.We can then compose and simplify the QP objective function. The contribution from thegradient term is:

[() + v:(.)](z - :(_) + Vf (:)z + constant.

[he contribution from the Hessian term is:

(Z- .)T VT - zTVf()(z - 2) - rTVf(_)z + constant.

Collecting non-constant terms, we are left with

minimize zT[Vf(i)z + f(2) -Vf(i)Z].

Relating this to the linearized constraints, we see once again the composite quadratic pro-gram corresponding to the LCP derived from linearizing the NLCP at 2. The inductiveargument given above then applies to the general case as well.

XVe summarize the equivalence result just obtained. Applying the SLCP method (withunit steps) to the general NLCP problem precisely implements Wilson's SQP method onthe equivalent CNLP problem. Applying Wilson's SQP method to the CNLP form willproduce the same iterates as SLCP (with unit steps) applied to the NLCP provided thatglobal minima are found for indefinite QP subproblems. If a unit step cannot be taken, theequivalence is maintained simply by updating the multiplier estimates in the same manneras the primal iterates.

To avoid potential confusion, it is perhaps important to distinguish this result from aseemingly similar observation of Josephy in [Jo-N 79], which is summarized in Robinson'ssurvey [Rob 82]. His observation concerns applying the generalized equation form of New-ton's method to the special NLCP that represents the optimality conditions for a generalNLP. lie demonstrates that the method produces a sequence of (bisymmetric) LCPs, eachof which represents the optimality conditions for a corresponding member of the sequenceof QPs that is generated by applying Wilson's SQP method to the given NLP. This is aresult which we will use in the next chapter, but it is altogether different from the equiva-lence result we obtained above, which pertains to a general NLCP and the special CNLPconstructed to be equivalent to it.

.1

4.4 A sequence of linear programs (SLP) 29

4.4 A sequence of linear programs (SLP)

Perhaps the earliest practical method for solving sizeable nonlinear programs is based on asequence of linear programs. Origination of the method is attributed to Griffith and Stewart[GS 61], and a recent robust implementation is reported in [ZKL 85]. The LPs are definedby using first-order Taylor's expansions of the objective and constraint functions. Additionalbounds are usually added to the subproblems so as to prevent large deviations from thelinearization point. These bounds create a so-called trust region over which the linearizationis presumed to be an acceptably accurate approximation of the nonlinear functions. Thebounds further serve to move the subproblem solutions away from basic solutions of thelinearized constraints. For the general NLP, we would not expect an optimal solution tobe such a basic solution. For a CNLP problem like Eq-NLP, however, we know that basicoptimal (complementary) solutions do in fact exist (given the rank condition discujssed in

Section 4.1.1). In such a case, we can reasonably consider defining LP subproblems withoutany additional bounds (and worry about convergence later).

For application to problem Eq-NLP, we define the following subproblem at linearizationpoint pk:

LP(pk)

minimize Vsubject to

(LP1) -Vd(pk)p + Ay + hv > d(pk) _. p> 0

(LP2) ATp > 0 3- > 0'

(LP3) -hTp = 1 ±

(LP4) p, y > 0

LP(pk) is identical to QP(pk*) except for the use of a linear objective function. In this sense,SLP may be considered a special case of SQP that uses a vacuous (and therefore positivesemidefinite) approximation of the Hessian of the Lagrangian. The iterative process is alsothe same.

If h is chosen (as described in Chapter 6) to guarantee the feasibility and solvability ofany linearized subproblem, then both the primal and dual of LP(pk) must be feasible.

Consequently, by weak duality, an optimal solution must exist.

If P is an equilibrium price vector, subproblem LP(p) is the linear program alluded toin Observation 2 of Section 3.3.1 for which the complementary slackness conditions arvequivalent to the optimality conditions for problem Eq-NLP.

30 4.6 Other linearization methods

4.5 A projected (augmented) Lagrangian method

The projected Lagrangian algorithm has proved to be an effective means for solving large,mostly linear optimization problems. It is thus a natural candidate for solving problemEq-NLP. The method was originated by Robinson [Rob 72] and Rosen and Kreuser [RK 72],and has been implemented in the optimization package MINOS by Murtagh and Saunders[MS 82]. This algorithm also solves a sequence of linearly constrained subproblems, based onTaylor's expansions of the nonlinear functions. (As these constraints are identical to thoseof the previous three methods, we will not delineate them again here.) The distinctivefeature of the projected Lagrangian method is the objective function, which incorporatesthe original problem objective (as is) and a Lagrangian term based on an estimate of theoptimal multipliers and the difference between the linearized and actual constraint functionvalues. As implemented in MINOS, the objective optionally includes an additional quadraticpenalty term (also based on the linearization errors) designed to stabilize the iterativeprocess and promote convergence from any starting point. This addition corresponds to thequadratic term of an augmented Lagrangian merit or penalty function.

As applied to Eq-NLP, the objective takes the following form at iteration k + 1. Let p be apenalty parameter, lik be an estimate of the optimal Lagrange multipliers for the nonlinearinequalities (NLP1), and recall dk(p) as a notation for the first-order Taylor's expansion ofd(p) at pk. The objective is then:

minimize v - (,a k)T [dk (P) -d(p)] + 1 p 11dk (p) -d(p) 2

Given a solution of the subproblem, pk and j k are updated as for SQP with the step lengthdetermined by a linesearch procedure.

Typically, the multiplier estimate yk is taken to be 1 k, the multipliers for the linearized con-straints in the subproblem solution. We have found that two deviations from this procedureyield interesting connections to the solution methods previously described. In both cases wemust take p = 0, which is generally safe for a well-behaved problem. If Uk is always takento be zero, the resulting subproblem is precisely SLP(pk). This is a general result for anyNLP with a linear objective function. As an alternative, intuition suggests the possibilityof using the prior knowledge that the prices, p, solved for in the subproblem are also goodestimates of the optimal multipliers. If we then replace the constant ik with the variablep, the now "endogenous" Lagrangian term transforms the objective function to:

v T - d (p)J = v pT )0 = - -p v(pk)pp -

where again we apply Walras' law, (D2) of Section 2.1. The rightmost expression may berecognized as the non-symmetric form of the objective function for Wilson's SQP(pk), whichis in turn the complementarity conditions for SLCP(pk).

4.6 Other linearization methods

Another diverse class of solution methods for general nonlinear complementarity )roblemsuses linearizations other than first-order Taylor's expansions. For concreteness, denote by

4.6 Other linarization methods 31

B k a matrix depending on the iterate pk such that:

d(p) ; d(pk) + Bk(p - pk).

By substituting this linearization into (LCP1) above, an LCP for iteration k + 1 is ob-tained, and an iterative procedure otherwise identical to SLCP can be pursued. This classof methods includes the quasi-Newton method for generalized equations as studied by Jose-phy [Jo-Q 79] and all of the linear approximation methods for complementarity problemssurveyed by Pang and Chan [PC 82].

There are two possible reasons in general for considering an alternative linearization scheme.The first is to avoid explicit computation of the Jacobian matrix if the derivatives areexpensive to evaluate. In this case, Bk must be substantially easier to calculate thanVd(pk). In a CGE model, particularly one with a sizeable linear production component,it is not in general likely that the evaluation of a Jacobian once each iteration will loomlarge relative to the computational effort involved in processing the linearized subproblem.

There are always exceptions, of course, but we will not consider situations in which expensivederivatives effectively rule out the application of first-order methods.

The other reason for using an alternative linearization is choosing a form of Bk which pro-duces a subproblem that is easier to solve than LCP(pk). This is a much more relevant issuein the context of problem Eq-NLCP, for which Vd(pk) is singular and has no demonstrablydesirable computational properties. While each subproblem can be successfully processedby Lenike's method regardless of the linearization employed (see Chapter 6), other lin-earizations yield subproblems amenable to solution by other, possibly faster, methods. Forinstance, using B Vd(pk) - VTd( pk) has the interesting property that the linearizationsatisfies WqIras' law, (D2) of Section 2.1. The skew-symmetry further implies that eachLCP subproblem, expressed as its associated composite quadratic program, is actually alinear program because the Hessian vanishes. Using a general positive semidefinite Bk yieldsa suiproblein whose composite QP can be effectively solved by a, convex QP code as wellas by any number of iterative (non-pivoting) methods. Finally, using a. symmetric positivesemnidefinite Bk )roduces a bisymmetric LCP subproblem that comprises the optimalityconditions for a convex QP in the price space alone. This lower dimensional QP can inge1neral be solved much faster than LCP()pk).

As is to be expected. the advantages of easier subproblenis have to be weighed againstinferior convergence l)roportie.s of the overall process. These convergence issues are thesiu hject of thoe next chapter.

%%

, ?,

5

CONVERGENCE PROPERTIESOF SOLUTION METHODS

In this chapter we present a number of observations about the convergence properties ofthe methods discussed in the previous chapter for solving problems Eq-NLCP and Eq-NLP.Three related but separate convergence issues are relevant. All involve some notion of atarget point. In the case of a complementarity problem, target points are unambiguouslythe complementary solutions. In an optimization context, target points are generally taken(for purposes of convergence analysis) to be local optimizers, which in the absence of (somegeneralization of) convexity need not be global optimizers.

The first issue is generally referred to as local convergence. The question here is whetherthere is a neighborhood around a target point such that, if the iterates generated by alalgorithm enter that neighborhood, all subsequent iterates remain in the neighborhood andmoreover converge to the target point. An important associated issue is the rate at whichthe iterates converge to the target point once in this neighborhood. The next issee is widelyreferred to as global convergence. The question is whether, from any starting point, theiterates generated by the algorithm eventually converge to some target point. Herein arisesan unfortunate ambiguity in the established terminology, since global convergence does notmean convergence to a global optimum. Locating a global optimum is precisely our thirdconvergence issue, which is relevant only in applying optimization methods to Eq-NLIP. I1is important that converging iterates in fact approach a global minimum, since it is globalminima that correspond to complementary (equilibrium) solutions.

5.1 Local convergence

In light of the equivalence between SLCP and Wilson's SQP method in an equilib'ium con-text, we provide below an analysis of the relationship between the known local convergenceproperties of the two methods. For comIpleteness, we will also briefly survey known resultsfor the other sohltion methods we have identified in Chapter .1.

5.1.1 SLCP and Wilson's SQP method

Recall the equivalence discussed ii Section 1.3 between SLCP and NVilson's SQI met hod asal)plied to th, onlinear complementarity problem and its CN1, I nonlinear prograinil gcon uterpart. ht iglit of this equivalence, it seems sensiblo to invoke the mosl powerfil of li('('011vergenco resuilts available for the two metlods. Using somie prec iiusors of tIe t leorv ofgeieralized equatiolis, Robiinson ill [Rob 7,1] establishes the local <ia(Iratic comv-ergeice o4;I rail i ofoptimization lethiods that includes \Vilson 's SQP) algoril hIn. Iik result eiio'Hire conditions for a local minin izer to possess a doiiiai i of alt ractioll:

I. second-order sufficient optimality conditions

33

34 5.1 Local convergence

2. linear independence of the gradients of the binding constraints

(including simple nonnegativity constraints)3. strict complementary slackness.

These are strong conditions, but little exists to this day in the way of convergence resultsfor optimization methods without similar conditions.

Josephy in [Jo-N 79) establishes the local quadratic convergence of Newton's method forgeneralized equations. In particular, any regular solution posseses a domain of attraction.Robinson develops conditions for the regularity of a solution of an NLCP in [Rob 80].(These conditions were delineated in Section 4.1.1.) He also demonstrates regularity of asolution of an NLP (without strict complementarity) under a strengthening of the second-order sufficiency condition and linear independence of the binding constraint gradients. Inlater work [Rob 82], he observes that these latter conditions are sufficient for local quadraticconvergence of Wilson's method. This follows from Josephy's demonstration of the equiv-alence between Wilson's method applied to a given NLP and Newton's method applied tothe NLCP formed from the optimality conditions for the NLP. (Recall our discussion inSection 4.3 that this is not the same equivalence as the one we develop.)

Unfortunately, this strengthening of the convergence result for Wilson's method on generalproblems is of no help in the context of applying Wilson's method to the CNLP form ofa given NLCP (these forms were defined in Section 3.1). This is because, as somewhatcasually observed by Kyparisis in [Kyp 86], the linear independence condition cannot besatisfied in the absence of strict complementarity (or nondegeneracy in the NLCP context).By way of a quick demonstration, we can use the general form of an NLCP since the resultis general. Define the usual index sets for a complementary point z:

J = {i: fj(z) = 0,zi > 0}, K = {i: fj(z) = O,zi = 0}, L = {i: fj(z) > 0,zi = 0}.

Also, let M = Vf(z) and let I be a comformably dimensioned identity matrix. The matrixof binding constraints including the nonnegativity conditions is:

-' MK MJL

ILL

Clearly, the rows of this matrix are linearly independent if and only if the northwest sub-matrix has full row rank. But this requires that M., be nonsingular and that A' = 0.

In light of this situation, the equivalence we establish between SLCP and Wilson's SQPmethod is of no particular help in investigating local convergence. In our complementaritycontext, the convergence conditions for Wilson's method turn out to be a special case ofthe regularity condition needed to establish the local quadratic convergence of Newton'smethod. Recall Robinson's result (discussed in Section 4.1.1) that a solution of an NLCPis regular if and only if Al, is nonsingular and either It' 0 or all l)rincipal minors of

• -

1'.E IVAE V -7 w_ v 2 2 l. VII III' 1 -I

5.1.2 Other optimization methods 35

the Schur complement, MKK - MKjM;.MJK, are positive. Of course, any regular solutionmust be locally unique, so any regular solution of an NLCP with K = 0 must satisfy thethree strong conditions given above for the equivalent CNLP problem.

An interesting by-product of this otherwise disappointing result is that it identifies a class ofproblems for which Wilson's method will exhibit local quadratic convergence under weakerconditions than linear independence and strict complementarity. Thus, Robinson's sufficientconditions for Wilson's method cannot be necessary conditions as well.

Returning to our central theme, the useable results from these observations are that we canindeed expect local quadratic convergence from SLCP (Wilson's SQP) when the model hasregular equilibria and the global path of the algorithm brings the iterates into some domainof attraction.

5.1.2 Other optimization methods

Practical implementations of SQP typically combine a positive definite approximation of theHessian of the Lagrangian with a merit function of some kind to guage descent. The issueof local convergence " en becomes inseparable from the mechanism used to promote globalconvergence. Local superlinear convergence can be achieved if the Hessian approximationapproaches the true Hessian in a certain way and the line search permits unit steps inthe neighborhood of a local minimum. The precise conditions are unimportant for presentpurposes, but a summary with references can be found in [GMSW 87].

The SLP method is in essence choosing a steepest-descent search direction (subject to sometrust region bounds) with respect to an absolute value (or f4) penalty function representa-tion of the given NLP. Consequently, we cannot expect better than linear convergence ona general problem. If the problem at hand has a solution at a vertex of the linearized con-straints, however, it is an intuitive result that, once the correct basis has been determined,the method is actually performing Newton's method on the square system of basic variablesand binding constraints. We could then expect quadratic convergence in the neighborhoodof a regular solution. Zhang, Kim and Lasdon assert this result (without proof) in [ZKL 85].

The projected Lagrangian method was shown by Robinson [Rob 74] to be locally quadrat-ically convergent under the same conditions as given above for Wilson's SQP method. Thepractical implementation in MINOS inherits the same property, since the penalty parameterp is decreased to zero as a local minimum is approached.

5.1.3 Other linearization methods

Pang and Chan in [PC 82] unify an extensive collection of convergence results for lin-ear approximation methods for solving variational inequalities, organized around the basicthemes of norm-contraction, vector-contraction, and monotonicity. Further elaboration onthese results is not justified in our context, since the general equilibrium problem does notdemonstrably satisfy any of the known conditions for local (or global) convergence.

* U * ' , .U w ,1'.'*.' -.- ,-. -- "- ". --. - ,--.-, - - , ---.---.. --.- -% .3 ,.-.% -- -- ,.'- -"-- '.t--- .-.-.- .--. --. -... 11 . -. I .. ' -- 'a q' . " ' - " = - ' . * % - . --

36 5.2 Global convergence

Josephy in [Jo-Q 79] extends the known results for solving nonlinear equations by quasi-Newton methods to the case of generalized equations. In particular he shows that, underconditions satisfied by the usual quasi-Newton update formulas, the sequence of iterates islinearly convergent in the neighborhood of a regular solution. Of incidental interest is thecondition for superlinear convergence of a (known-to-be) linearly convergent sequence ofquasi-Newton iterates. In our context, this specializes to:

lir JU[Bk - Vd(pk)](pk -- pk-,) 0,k oo I1pk - pkl I 1

where Bk is the approximation matrix at iteration k.

5.2 Global convergence

We first briefly survey the known global convergence properties of the solution methodswe have identified. We then develop some new machinery for analyzing convergence in thecontext of an equilibrium or complementarity problem.

In the absence of one of the contraction or monotonicity properties discussed by Pang andChan, very little is known about the global convergence properties of Newton, quasi-Newton,or other linearization methods for solving the NLCP. The same can be said about Wilson'sSQP method in the absence of convexity. As an empirical matter, however, Mathiesen'scomputational experience has shown SLCP to be remarkably robust.

Global convergence has been proved for well-designed SQP methods, given the usual as-sumptions about a strict local minimizer with strict complementarity and linear indepen-dence of the gradients of the binding constraints. (Recall that linear independence and 0strict complementarity are inseparable for a problem of type CNLP.) "Globalization" has--1been based on line-search procedures relative to an absolute value (t4) merit function orto an augmented Lagrangian merit function. In either case, the convergence argument de-pends critically on the positive definiteness of the matrix used to represent the Hessian ofthe Lagrangian. Again see [GMSW 87] for a summary of these results and references to thedetailed demonstrations.

Zhang, Kim and Lasdon in [ZKL 85] demonstrate the global convergence of their SLPmethod by use of an 11 exact penalty function combined with trust region bounds. Since SLPis a special case of SQP with vacuous Hessian, it is intuitive to think of the tightening of trustregion bounds as a substitute (in the convergence argument) for positive definiteness of theestimated Hessian. How best to determine satisfactory penalty weights is still unresolved. %

Despite the close connection between the objective function used in MINOS for the linearizedsubproblems and the augmented Lagrangian merit function used in some SQP procedures,there has yet to be a rigorous theoretical demonstration of the global convergence of thealgorithm. The method has strong intuitive appeal, however, and a wide range of successfulapplications to nonlinear problems has been reported, both in the literature and informally.

10

5.2.1 A differentiable exact penalty function (Eq-DEPF) 37

5.2.1 A differentiable exact penalty function (Eq-DEPF)

In the course of examining the tx and augmented Lagrangian penalty functions (particularlywith respect to their use as merit functions for global convergence analysis), we discovereda potentially fruitful specialization of the approach to problem Eq-NLP. This specializationprovides, perhaps for the first time, an unambiguous means of evaluating the progress of asequential method towards an equilibrium solution. While both penalty functions specializein an identical fashion, we prefer to deal with the differentiable augmented Lagrangianfunction.

The specialization begins by replacing the d(p) term in inequality (NLP1) of Section 3.3with a proxy variable x (which is not constrained in sign). Instead of adding a constraint,x = d(p), we add to the previously linear objective function a Lagrangian term and aquadratic penalty term, i.e.,

-1[x - d(p)] + p I 1k- d(p)II •

In the typical case, p would be an iteratively updated estimate of the Lagrange multipliersfor the omitted constraint, x = d(p). Instead, as we illustrated in Section 4.5, we can useour prior knowledge of the desired complementarity relationships and replace the updatedconstant p with the variable p. The result is a linearly constrained problem which we willshow to be equivalent to Eq-NLP (and thus to Eq-NLCP as well).

Eq-DEPF]

minimize -px + +pJx-d(p)IJ + v

subject to(DEPF1) -x + Ay + hv > 0 ± P 0-

(DEPF2) -ATp > 0 _L 9 0

(DEPF3) -hTp =-1 ±_

(DEPF4) p , y > 0

Observe first that v- pTx > 0 for any feasible solution. By the usual device, multiplythrough inequalities (DEPF1) by p > 0 and (DEPF2) by y >_ 0. Add the resulting two

inequalities and v-p x > 0 follows. Since the penalty term is nonnegative, we can concludethat the objective function is bounded below by zero on the feasible region.

Equivalence between Eq-DEPF and Eq-NLP is then easily established. Given any globalminimum solution of Eq-NLP (i.e., (p, V, i) feasible with f) = 0), we immediately constructa global minimum of Eq-DEPF as (p,.,,) with x = d(p). Note that the penalty termvanishes and pTx = prd(p) = 0 by virtue of Walras' law, (D2) of Section 2.1. We thus knowthat the globally minimal objective value for Eq-DEPF is in fact zero. Given this, we canargue the converse. Any global minimum (p,x, 9, V) for Eq-DEPF must have x = d(p) or

4. ,


else the penalty term would be positive. In this case, we again have p=T pTd(p) = 0 byvirtue of Walras' law. It then follows that V = 0 since the global minimum objective value iszero. We then immediately have (p, 9, 0) feasible with V = 0 as a global minimum solutionfor Eq-NLP.

Note that the above equivalence holds for any value of p > 0. This is because v - Tx > 0on the feasible region of Eq-DEPF and zero is in fact an attainable global minimum. Thisinvariance to p is in happy contrast to the usual situation with penalty functions, in whichthe correspondence between optimal solutions of the penalty problem and optimal solutionsof the original constrained problem obtains only for sufficiently large values of the penaltyparameter. We are thus free in problem Eq-DEPF to vary the penalty parameter in anyway conducive to an analysis of global convergence.

Any number of further transformations of problem Eq-DEPF are possible that still maintainthe equivalence to problem Eq-NLP. Other types of penalty terms could be used, differ-entiable or otherwise. Since feasibility of subproblems is not an issue, variable v could bedispensed with altogether. Having omitted v, the term -pTx could be dropped from theobjective function, leaving only the penalty term. (This clearly highlights the underlyingcharacter of the equilibrium problem as a feasibility problem.) The disadvantage of thistransformation from a computational perspective is that the resulting variant of Eq-DEPFwould have zero Lagrange multipliers at the global minimum. In contrast, a global mini-mum of Eq-DEPF as stated has the desirable property that the "primal" variables are alsocomplementary Lagrange multipliers. Leaving v in Eq-DEPF also allows a more direct andconvenient correspondence to the solutions of linearized subproblems in which the presenceof v is used to guarantee the existence of feasible and complementary solutions. (We discussthis further in Chapter 6.)

The significance of problem Eq-DEPF is twofold. First, the value of the objective functioncan be used as a merit function for judging the progress of any sequential method thatproduces a (p,x,y,v) which is feasible for Eq-DEPF. All of the solution methods thatwe discuss indeed have this property; x is directly obtained as the value of the linearizeddemand function in the solution of the subproblem. Second, it may be viable to applyan optimization method directly to problem Eq-DEPF instead of problem Eq-NLP. Bothof these uses of Eq-DEPF will be discussed in turn below, but first a brief aside on theimmediate generalization of this penalty function formulation to problem CNLP.

A more general result

Similar to what we discovered in Section 4.3, a more general result lurks behind the de-velopments for problems Eq-NLCP and Eq-NLP. A differentiable exact penalty functionrepresentation is also readily available for the equivalent CNLP form of the general nonlin- 4

ear complementarity problem (see Section 3.1). It has the following simple form:

(DEPF) minimize wTz + 1PJw- f(z)I' subject to w > 0, z > 0.

Again, we have a linearly constrained problem with an objective function that is boundedbelow by zero on the feasible region (the nonnegative orthant in this case). Assuming

per Y' e r or r

5.2.2 Stationary points of Eq-DEPF 39

that the original NLCP has a solution, any such solution is a global minimum of DEPF andconversely. As in the case of Eq-DEPF, any linear parts of f(z) can optionally be maintained

as general linear constraints as opposed to being included in the penalty term.

Analogous to the situation with the specific Eq-DEPF problem, formulation DEPF may

prove to have independent computational potential as well as usefulness in evaluating theconvergence of any sequential method for solving an NLCP.

5.2.2 Stationary points of Eq-DEPF

Problem Eq-DEPF inherits the nonconvexity of problem Eq-NLP, but it also inherits the

interesting properties of the first-order optimality conditions which suggest that nonoptimal

stationary points may be rare or avoidable. These conditions are thus worth delineatinghere. Let (p, , ), ) be a Karush-Kuhn-Tucker point of Eq-DEPF with associated Lagrangemultipliers (3, , )). The primal conditions are simply (DEPF1)-(DEPF4) evaluated at(p, 2, , t). The multiplier conditions are as follows:

(DMx) p[x-d()] - p + 0 = 1 x

(DMO) -pvTd(f)[2-d(p)] - i + A + ht > 0 _! p> 0

(DM1) -[±-d(p)] + vTd(P)p + A9 + h 3 > 0 1 p 0

(DM2) -ATi > 0 _L Y 0

(DM3) = -1 1 )

(DM4) p, y > 0

Here, we have obtained (DM1) from the "original" condition (DM0) by applying the iden-tity (DMx). Conditions (DM1)-(DM4) then differ from the Eq-NLP multiplier conditions(M1)-(M4) (see Section 3.3.1) only in the presence of the term [±-d(fi)]. Equation (DMx)also adds a direct connection between p and P that is not present in the KKT conditionsfor problem Eq-NLP.

We can now detail a number of interesting observations about these conditions.

Observation 1. By the usual manipulations of the complementarity conditions, we derivethat V = pT2 and = X..

Observation 2. If i = d(p), then p = p by (DMx). In turn we must have ib = 0 byWalras' law, property (D2) of Section 2.1. In short, we have an equilibrium solution.

Observation 3. Since v - >x >_ 0 for any feasible solution of Eq-DEPF, we can useObservation 1 and (DMx) to conclude:

0 < v pT- = _ ,v= (0-b p)T2 = _p[. -d(p)]T±.Id.

.", ' " ''t '' j '¢u g",. 2' a, .' ,'" ', ,'. 2 '''.£'.4 " '€ :' ".,"" ' "" "" "'. °" " " ,"' , ,," "-. A


If i X d(P), then fIt-d()I' > 0 requires:

p[- d(z)]Td(P) < 0 (P - }Td(p) < 0 * Td(p) > 0,

where the first implication follows from (DMx) and the second from Walras' law. In Ob-servation 4 of Section 3.3.1, we obtained the same conclusion for a nonoptimal stationarypoint of Eq-NLP.

Observation 4. Given the price normalizations (DEPF3) and (DM3), equation (DMx)implies that hT[t-d(p)] = 0. Since the normalization bounds individual prices as well, italso follows that phiI, - d <()l < 1. Thus, if h > 0 the deviations i - d(p) obtained at aKKT point of Eq-DEPF can be made arbitrarily small by increasing p. Intuitively, it is hardto believe that a non-contrived problem could have such near-equilibrium stationary pointssufficiently isolated from true equilibria that a sequence of iterates would be attracted to anonoptimal point rather than to an equilibrium point.

We will have further use for some of these conditions and observations at the end of thischapter.

5.2.3 Judging descent using Eq-DEPF

For brevity, let 4)(p,x,v) denote the objective function of Eq-DEPF. We will need itsgradient:

V+(pkXk vk) = [- _ pVTd(pk Xk-d(pk)J , k+p[Xk-d(pk)] , jT.

Recall that the general sequential procedure for all the methods we discuss begins at iter-ation k + 1 by linearizing the demand functions d(p) at the point pk. It will help to havetwo shorthand notations for the possible linearizations:

dk(p) d(pk) + Vd(pk)(p pk) = d(pk)+ Vd(pk)p

bk(p) d(pk) + B(p_ pk).

Note that bk(p) includes the Taylor's expansion dk(p) as a special case. We will use the non-specific bk(p) when an expression or remark applies equally well to any linearization, andwill explicitly call notice to any reference that expressly excludes the Taylor's expansion.

We will also use SP(pk) to denote the subproblem defined at iteration k + 1 when it is notimportant to distinguish between specific methods.

If (p, ,if) solves SP(pk), we take i = bk(p) and immediately obtain a feasible solution ofEq-DEPF. Its objective value can be readily compared to those of previous iterations. Ifthe solution obtained for SP(pk) is a complementary solution, then y - FTbk(p) = 0, sincethis expression is precisely the complementarity conditions for the subproblem. In thiscase, *(p,±, V) = 1p I1i- d()ll'. The interesting implication here is that, in comparingtwo successive iterates of the SLCP procedure (for instance), all that really matters is

5.2.3 Judging descent using Eq-DEPF 41

the accuracy of the linear approximation to the demand functions. Counter to prevailingintuition, the combinatorial aspects of identifying basic production activities and bindinginequality constraints are important only insofar as they affect the prices at which thelinearized demand functions are evaluated.

We can also use the subproblem solution to determine a search direction, i.e., (ip - ti

xk, 9_ yf, _ vk). This direction is always locally feasible and will furthermore be a descentdirection relative to Eq-DEPF if:

VIP(pk X, Vk)(p - ,jXk,i0-vk) < o.

Then, in principle at least, a line search could be performed with respect to 4)(p, x, v) andan appropriate step length chosen (0 < A < 1). Note that xk+ - bk(pk+1) only if A = 1.

We can now examine the search direction for descent.

41' = V(pkXk,vk)(ppk, .- xk,V-vk) - (upon some reordering)

(i kv)_-(pk)T(.Xk)_-(XkT(p~pk)_ k p~kdpk)]T {-2 + Xk + Vd(pk)(p~pk)}

(f (p'k T;- + (Vk_ T~k) - V (kTk

-p [Xk - d(pk)]{ [Xk - d(pk)] - -dk(p)]}

Here, the transformation of the first three terms is by reordering and by adding and sub-tracting vk. The transformation of the sub-term in braces is by adding and subtracting d(pk)and then substituting in the shorthand d k(p). Note that we will use )' as an abbreviationfor this descent formula.

Since both (pk,±,9, i) and (p,xk,yk,vk) are feasible solutions for Eq-DEPF, it follows thatthe first two terms of )' are nonnegative. Similarly, the third term is nonpositive. Ifx k = bk-(pk) (that is a unit step was taken at the previous iteration) and a complemen-tary solution was obtained to subproblem SP(pk-l), then we can conclude that the thirdterm vanishes because (as mentioned above) it is the statement of the complementarityconditions for SP(pk-1). Other than this, we are unable to state anything definitive aboutthe magnitudes of these first three terms.

The fourth term indicates the leverage that can be exercised by the choice of the penaltyparameter p. For the four classes of methods that linearize by means of the Taylor's ex-

pansion, i = dk(p). In this case the fourth term reduces to -p 1xk - d(pk ) 2. Clearly, this

quantity is strictly negative so long as the iterations have not converged. So the searchdirection obtained by SP(pk) can be considered a descent direction with respect to anymanifestation of Eq-DEPF that has a sufficiently large value of p. This result depends onlyon the use of Taylor's expansions; the objective function or complementarity condition forthe subproblem does not matter. The nature of the solution obtained by the subproblemaffects only the magnitudes of the first three terms, the sum of which conditions how largep must be in order to make )' negative.

-' ' ' .,r ' '. ', ' ,', '.. ' ,-' 7:•r,-',,, ',,, '. "-.'.'? '.'.'. ".--,",.',' ," .,,' , ,,',','J , ','.",,,,' ",,;.,.'.-," .,,.. ",", ,,,".",',,,",',. ,'',, ", ,,'.' * .1,i


Unfortunately, this somewhat encouraging result does not necessarily apply for lineariza-tions other than the Taylor's expansion. In this case we have:

- dk(p) = bkQ3) - dk(p) [Bkvd(pk)] (P p).

(The expression on the right may be recognized as the vector in the numerator of the limitcondition for establishing superlinear convergence of quasi-Newton iterates.) The influenceof the deviation t - dk(p) on the sign of the fourth term of V9 is ambiguous. Should thefourth term be positive, it may or may not be possible to make p small enough to obtain alocal descent. In particular, this will not be possible if the third term vanishes, as indeedit will if a unit step was taken on the previous iteration. (Recall that the non-Taylorlinearization methods we discuss all obtain complementary solutions of the subproblems.)

These last two results establish an important connection between local and global conver-gence conditions. They also deliver a rather serious second blow to the prospects of usingother linearizations on a general equilibrium problem. The use of Taylor's expansions (andthe corresponding relationship to Newton's method) is central to the local convergence ar-guments for SQP methods, the projected Lagrangian method, and SLP (given a vertexsolution). Their use allows also for the superlinear and quadratic local convergence ratesof these methods. We now see also that use of Taylor's expansions means that any searchdirection obtained can be interpreted as a descent direction in Eq-DEPF for large enoughvalues of p. So their use is aJso conducive to global convergence of the iterates. In sharpcontrast, the use of othei linearizations typically dooms local convergence to a linear rategiven that the iterates converge at all. At the same time departures from Taylor's expan-sions endanger the assumption of convergence by producing search directions that could infact be uphill. It is further interesting in this respect that the condition for superlinear localconvergence of quasi-Newton iterates is mutually reinforcing with the condition promoting

global convergence in the sense that "small" values of [Bk - Vd(pk)] (-pk) are more likely

to produce a negative fourth term in 4V. ,p'p

We have been careful not to assert that any of the above proves the global convergence ofmethods using Taylor's expansions. It merely provides a partial theoretical explanation forthe empirical record, particularly as regards SLCP. Two issues continue to elude resolution:the boundedness of p and the choice of step length.

Boundedness of the values of p required to indicate descent is not strictly necessary forglobal convergence. These values can grow indefinitely provided a number of other stringentconditions are met, including for example:

<' , -y V$(pk,xk, vk)J 1j(jp kJ _xk,i_ vk)JJ

where 7 > 0. In the literature on SQP and SLP, verification that the iterates satisfy suchconditions is critically dependent on either thc usc of positive definite estimated Hessiansor sufficiently small trust regions. We have not been able to demonstrate that the special

properties of an equilibrium problem in some way obviate the necessity of such mechanisms.

Given a downhill search direction, it also remains unclear whether a step length chosenby any means simpler than a line search would necessarily yield a net descent. We are

l-

5.2.4 Solving Eq-DEPF directly 43

particularly concerned here about the unit steps taken whenever possible by SLCP (or theother linearization methods) and the deviations from this if the unit step leads outside thedomain of d(p).

One final, perhaps obvious, remark should be made. To the extent that the merit functionsroutinely employed in optimization algorithms are good approximations of the exact meritfunction available from Eq-DEPF, it is reasonable to expect that descent directions and steplengths determined by those algorithms will also be valid with respect to Eq-DEPF.

5.2.4 Solving Eq-DEPF directly

The relative merits of solving Eq-DEPF directly (as opposed to Eq-NLP) depend both onthe size and structure of the specific problem and the optimization method to be employed.A decided advantage of solving Eq-DEPF is avoiding the solution of a difficult subproblemwhich is only approximate. It also avoids the necessity (in solving Eq-NLP) of having toupdate or recalculate the factorization of the linearized constraints at each new linearizationpoint. Search directions calculated with respect to the Eq-DEPF objective function havemore accurate information about curvature, and the demand estimate x is allowed to movemore flexibly than in the linear path bk(p) available to subproblem SP(pk) of Eq-NLP.

To be weighed against these advantages is the increase in problem size, both in terms ofthe matrix of linear constraints and (in a second-order method) the approximation of theprojected Hessian that must be maintained (to be denoted by W). In current SQP imple-mentations, the constraint matrix is stored in dense form, while in MINOS the constraintsare stored in sparse form. W is in general dense and is therefore maintained in dense formin both SQP routines and MINOS.

In both areas the effects on problem size of adding x are critically dependent on the numberof final commodities relative to the number of primary and intermediate commodities.Since demand is constant for primary and intermediate commodities, the vector x needonly correspond to the final commodities.

Regardless of the dimension of x, adding x to the problem would not significantly increasethe effort of maintaining and factorizing the constraints in MINOS. Indeed, there would befewer nonzero coefficients than in SP(pk), since presumably the identity matrix associatedwith x is much sparser than Vd(pk). Moreover, since the identity columns associated withx must be basic (because x is unconstrained in sign), the basis will be sparser since thesecolumns will displace some columns of A. In contrast, adding columns to the dense repre-sentation of constraints in an SQP routine might be a significant burden, growing with thedimensionality of x.

The most serious penalty, however, is with respect to the dimensionaJity of the (approx-imate) projected Hessian, which equals the dimension of the null space of the bindingconstraints. (Problem Eq-DEPF does not inherit the vertex solution property of Eq-NLP.)Each component of x adds a dimension to this null space unless the associated commoditybalance happens to be slack - which is a rare occurrence for final commodities. In the %context of MINOS, this has the nice interpretation that the dimensionality of W is increased

44 5.3 Convergence to what?

by each production activity that is displaced from the basis by a component of x. SinceW is dense and its factorization must be updated repeatedly, it is unlikely that solvingEq-DEPF directl) by a second-order method will be viable for a problem with "many" finalcommodities. The definition of "many" naturally depends on hardware, computer budget,and user patience.

Maintenance of a projected Hessian approximation can of course be avoided by use of a first-

order optimization method, such as steepest feasible descent implemented via SLP (withthe trust region bounds in this case). The cost of doing so is an overall !inear convergencerate. It is significant to note that, once the objective function of Eq-DEPF is linearized, the

problem completely separates into a price problem over constraints (DEPF2)-(DEPF3) anda supply/demand balance problem over (DEPF1). The price problem can then be dualized,

leaving two very similar prublemn on the supply/demand balances.

5.3 Convergence to what?

The methods we have surveyed may be classified into two groups with respect to the kind

of solutions they determine. SLCP, the equivalent Wilson's SQP method (given globalsolutions to the indefinite subproblems), and the other linearization methods compute com-

plementary solutions. In an optimization framework, this amounts to seeking a Karush-Kuhn-Tucker point with the additional qualitative constraint that the Lagrange multipliersequal the "primal" variables. This screening process ensures that if the iterates converge,the point located is a complementary (equilibrium) solution.

In contrast, the other SQP variants, SLP, and the projected Lagrangian algorithms are gen-eral optimization procedures that can be applied in principle to either Eq-NLP or Eq-DEPF.They do not presume any special structure, nor can they directly utilize the prior knowl-edge that the globally optimal objective value is zero. Iterations terminate upon finding anyKKT point of the problem. Given the lack of convexity, however, there is no demonstrableguarantee that the stopping point will be a global minimum, i.e., an equilibrium solution.

We discover then the Scylla and Charybdis of solving general equilibrium problems. On oneside we have a group of methods which insist on complementary solutions of the linearizedsubproblems. If the iterates converge for one of these methods, they do in fact converge toan equilibrium solution. However, we still have very little in the way of a rigorous reasonwhy they should converge globally from any starting point. On the other side, we have a

battery of optimization methods with strong theoretical and/or practical properties whichguarantee (or at least make likely) the global convergence of iterates from any startingpoint. Unfortunately, they may converge to a local minimum rather than to an equilibriumsolution. We can readily identify this unsatisfactory termination, but what to do in that

event is not at all clear. We make some suggestions relative to this matter in the nextsubsection.

II

5.3.1 Descent from nonoptimal stationary points 45

5.3.1 Descent from nonoptimal stationary points

We have established that global optima of problem Eq-DEPF are in one-to-one correspon-dence with those of Eq-NLP. There is no particular reason to expect that a similar correspon-dence would apply to nonoptimal stationary points. This naturally induces the question:if an algorithm applied to one problem terminates at a nonoptimal Karush-Kuhn-Tuckerpoint, is there any perspective available from the other formulation that would suggest adirection of escape? Since any stationary point of Eq-NLP provides a feasible solution ofEq-DEPF, we are in a good position to examine this point from the perspective of Eq-DEPF.We do so below, and exhibit two sensible possibilities for escape-one that yields a localdescent and one that does not. In contrast, a stationary point of Eq-DEPF need not providea feasible solution of Eq-NLP, and meaningful additional perspective can be obtained onlyfrom a feasible point. In the case of a jointly feasible point, we can obtain some partialresults which we present first.

Stationary points of Eq-DEPF

Suppose that (p,.t, 9, -0) is a KKT point of Eq-DEPF with associated Lagrange multipliers(P, , ). In light of equation (DMx) from Section 5.2.2, we can conclude that 5i = 0 ==>ii - d,(p) >_ 0. If Pi > 0, then the associated inequality of (DEPF1) in Section 5.2.1 mustbe binding. Consequently, to obtain a feasible solution of Eq-NLP (Section 3.3), we mustagain have Yi > d(p). Thus, the necessary and (obviously) sufficient condition for derivinga feasible solution of Eq-NLP from a stationary point of Eq-DEPF is that x > d(P).

Recall from Observation 4 of Section 5.2.2 that hT [i"- d(p)] = 0. This general notationobscures the fact that in practice we would only define variables xi for final commodities, i.e.,those for which di(p) is not a known constant. If hi > 0 for all final commodities i, then theabove equality and nonnegativity of xi - di(p) clearly requires i = di(p). Hence, by virtueof Observation 2 of Section 5.2.2, the price normalization can ensure that any stationarypoint of Eq-DEPF which is feasible for Eq-NLP must be an optimal (equilibrium) solution.This particular price normalization is precisely the one shown in Chapter 6 to guaranteethe feasibility and solvability of all lin2arized subproblems of Eq-NLP or Eq-NLCP.

Stationary points of Eq-NLP

Suppose that (pk, yk, vk) is a KKT point of Eq-NLP with associated Lagrange multipliers(P, , ). (We use the superscript notation because we will be using the descent formulasof Section 5.2.3.) If the point is nonoptimal, it must be the case that vk > 0 and P pk %(recall the discussion of Section 3.3.1). This means that (pk, yk, vk) solves LP(pk) but notSLCP(pk), as defined in Sections 4.4 and 4.1, respectively. A feasible solution of Eq-DEPFis immediately obtained by setting xk = d(pk). We again investigate search directions of theform (p-p , _ -Xg-yk, f -V), where (p, , ,i5) must be a feasible solution of Eq-DEPF.In this special context, the descent formula simplifies to:

-(P) T Y) + - !iTxk) - 27k.

J- R ' ~ ~

46 5.3 Convergence to what?

One might intuitively suspect that solving SLCP(pk) would provide a descent directionwhen viewed from the perspective of problem Eq-DEPF. Unfortunately, this is not the case.Let (pl, 9, VJ) solve SLCP(pk) and set t = dk(p). Then, using Walras' law (D2) and property(W1) of Section 2.1:

V k)T t = (pk)T [d(pk) + Vd(pk)p] = p 'd(pk).

Since xk = d(pk), we obtain l' = f; - vk. This looks promising at first, but since (pk, yk, vk)

solves LP(p k ) and ( V, , O) is feasible for LP(p), it must be the case that V > vk. Hence, thelocal information indicates that relinearizing at some step in the direction pi-pk would notproduce a descent in problem Eq-DEPF. This is discouraging but not necessarily terminal.We certainly are not satisfied with the current KKT point, and even if a movement toward Pdoes not yield a local descent, it might at least initiate a new set of iterates converging to adifferent KKT point. In particular, following a path of complementary solutions thereafterwould ensure that the nonoptimal point would not be revisited.

Another potential search direction is available without further calculation: the Lagrangemultipliers from the nonoptimal stationary point of Eq-NLP. If we define i = -vTd(pk)pand examine the multiplier conditions (M1)-(M4) from Section 3.3.1 (with P replaced by pk,etc.), we observe that (P, i, , ) is indeed a feasible solution of Eq-DEPF. Now recall fromObservations 3 and 4 Af Section 3.3.1 that - = 0 and vk = Td(pk). In the present context,the latter means vk = ,Txk. Also, (pk)T, = 0 because of property (H1) of Section 2.1. Thedescent formula then reduces to V" = - 2 vk, which is strictly negative if the stationary pointis nonoptimal. In fact, this shows the direction to be a steepest descent direction since itobtains zero for the two nonnegative terms in the formula for '.

Observe that the estimated descent of -2vk could not be obtained by a unit step in thedirection of (i, , ), ). This is because xk = d(pk) at the current point, which meansthat the value of the objective is vk in both Eq-NLP and Eq-DEPF. Since the objectivw isbounded below by zero, the maximum attainable descent is -vk. A unit step will producedescent if:

-pi >+ pIli-d()II ,- iTVd(pk)p + lp 2[di d~k]3I

where we have inserted the definition of i and applied property (W1) of Section 2.1. Wedemonstrated in Observation 5 of Section 3.3.1 that the first term of this expression isnonnegative. Hence, it is impossible to say anything a priori as to whether a unit step willnecessarily yield a descent.

What we have established, then, is that the Lagrange multipliers for a nonoptimal KKTpoint of problem Eq-NLP can be used to define a steepest local descent direction from thatpoint when viewed in the context of problem Eq-DEPF. Some step in the direction P - pk

may then be used to define a new linearization point. Iterates from that sensible point maythen lead to a different KKT point, provided once again that something is done to preventa return to the nonoptimal point.

€, € S.

.~ f ~ - '~.'.'p%~ P. P0

6

SOLVABILITY OF LINEARIZED SUBPROBLEMS

In this chapter we verify earlier assertions that the normalization vector h can be chosenin such a way as to guarantee the feasibility and solvability of all linearized subproblemsencountered in the solution of Eq-NLCP or Eq-NLP. By solvability we mean that all sub-problem constraint matrices belong to a class that can be successfully processed by Lemke'salmost-complementary pivoting method. In particular, for such matrices Lemke's methodcan terminate on a secondary ray only if the problem is infeasible. However, the requisitechoice of h also guarantees feasibility of the linearized subproblem as long as the originalequilibrium problem is feasible. In light of the properties of computable general equilib-rium models outlined in Chapter 2, we may safely presume feasibility (and existence of anequilibrium solution) for any properly formulated model. Consequently, the combination ofsolvability and feasibility guarantees that each linearized subproblem has a complementarysolution and that this solution can be computed by at least one method, i.e., Lemke's.

The demonstration of solvability is a direct application of some results of Garcia [Gar 73] onclasses of matrices amenable to solution by Lemke's method. His results extend somewhatthe earlier results of Eaves [Eav 71] in a way that is remarkably well suited to our purposes.Feasibility will follow immediately from problem structure and the same choice of h asrequired for solvability.

In previous work, the author [Sto 85] and Eaves [Eav 87] have recognized that choosingh = e and incorporating a matching artificial column guarantees the solvability of the LCP

subproblems. Mathiesen in [Mat 87] applies Eaves' earlier results [Eav 71] to a specificsmall problem in order to investigate the effects of the linearization point and the choiceof num6raire on the solvability and existence of solutions to the LCP subproblems. Theresults obtained below are both more general and justify stronger conclusions.

'a'

6.1 Relevant results for Lemke's method

Here we briefly summarize the definitions and results of Garcia [Gar 73] which we will applyin the next section. We have made some minor modifications to notation and wording soas to better fit our context. Use of the vector d and of index sets J and K should not beconfused with previous uses in this document.

Denote the standard linear complementarity problem by q/M and the augmented form withan artificial covering vector d by d/q/M. That is, we seek a solution of the system:

z > O, z, 0 , q + dzo + M z > O

with the property:qTz+zTMz=O and z0 =0.

47

% NU ~ Ui.~ . .

48 6.2 Application to a linearized subproblem

Let C(q, M) and C(d, q, M) denote the solution sets of problems q/M ,nd d/q/M, respec-tively. Of particular interest in the following will be the special LCP diM and its solution

set C(d, M).

We will use the following definitions and results, which require no elaboration.

Definitions of two matrix classes as those matrices satisfying:

E(d): z E C(d,M), z $ 0 =: 3 > O, 0#O such that:(i) -mT_; > 0

(ii) z> and d+Mz+MTi > 0 .

E*(d): z E C(d, M) =>- z = O.

Lemma 3.1. E(d) = E*(d) for any d > 0 or d < 0.

Observation. E(O) = L 2 of Eaves [Eav 71].

Corollary 5.2. Let dlq/M be partitioned as:

di qj Mjj MjK b

d / qK / MKJ MK 00 r -aT 0 0.

where M., and MKK are square matrices, a and b are columns, dj, d, r, a and b are all

positive, M E E(0), and MK,. E E(dK).If C(d, q, M) has a secondary ray, then q/M is infeasible.

Commentary

If the partition K is vacuous, then the structure and results of the above corollary correspondto those of Theorem (11.5) of Eaves [Eav 71].

It is important to note that the covering vector d is specially constructed to be strictlypositive except for the zero in the last position. Since this is not a standard specification,in practice it would be necessary to modify the default specification of the covering vectorincorporated in available software for implementing Lemke's algorithm.

6.2 Application to a linearized subproblem

We are concerned with the linearized subproblem encountered at any iteration of any ofthe sequential methods we study. For simplicity, we will suppress the superscripts referringto iteration number. The results also do not depend on the nature of the linearizationemployed, so we will non-specifically refer to the linearized demand functions as b + Bp.Where we depart from previous modes of presentation is that now we take explicit accountof the partitioning of commodities into final commodities and primary or intermediatecommodities.

-9t

6.2 Application to a linearized subproblem 49

Accordingly, partition the vector of prices as p = (r,a), where 7r corresponds to finalcommodities and a to primary or intermediate commodities. We also, with a suggestiveabuse of notation, use 7r and a to denote index sets for partitioning the rows of the activityanalysis matrix A, the elements of the vector b, and the rows and columns of the matrixB. Since the a partition corresponds to commodities for which demand is constant, itfollows that B,, would be zero in any sensible linearization. B, may also be zero, as in

the case B = Vd(7r, a). BW would not be zero, for instance, if B = [Vd(7r, a) + VTd(r, a)].If B,, = 0, b, is constant across all iterations and represents the negative of aggregateendowments (which by definition are zero for intermediate commodities).

With this notation, we may state the linearized subproblem as:


(SP1) -B,7r - B,,a + A,.y + hv > b, I r >0

(SP2) -B,,ir + A,,.y > b, I a> 0

(SP3) -(A,.)Tir - (A,.)-ra > 0 1 y > 0

(SP4) -hT 7r - 1 i v

(SP5) 7r, a, y > 0

In order to correspond exactly to the structure of Garcia, we must now write the normal-ization equality (SP4) as two inequalities and divide the artificial variable v into its positiveand negative parts, denoted by v. and v-, respectively. Given these notational conventions,we can partition the data of the linearized subproblem so as to correspond directly to thepartitioning of Garcia's Corollary 5.2 (above):

dddb ,____ B - B,, , A,. -h +h

dy0 (A.)T (A )T

1 -1 _ hT

0 1 L L -hT

Before verifying that this data satisfies the hypotheses of the corollary, it is worth mentioningwhy it is advantageous to restrict the price normalization to the final commodity pricesalone. As a theoretical issue, it does not matter. As a computational issue, excluding the

bprices a from the normalization leaves a problem structure that requires each subproblem tobe feasible with respect to the commodity balances (SP2), which are identical to the original

problem constraints and do not chiange from iteration to iteration. If the artificial coluinextended into these constraints, then feasibility would be attained only at equilibrium wherev = 0. We may thus expect a more reasonable and constrained path to the solution if theartificial column is limited to the final commodity constraints. If B,, 0, we also may be

' '/ p/ *. °1 1~~-** ~ %- --- -U ~-P~ .

50 6.2 Application to a linearized subproblem

able to take computational advantage of any simple bounds that appear in the constraints(SP2). This also could not be done if the artificial column extended into these constraints.

We now verify that the problem data for any linearized subproblem will satisfy the conditionsfor Garcia's corollary if h > 0. Note that h corresponds to both a and b of the matrix inthe corollary, which are required to be positive. We may presume that all components ofthe covering vector (except the final zero) are positive as required.

Claim: MKK E E*(dz:).Proof. M.K is clearly skew-symmetric. Hence, zT'MKKZK = 0 for all ZK. Since dK > 0, ifT=z, _ 0 and zTdK + z1MKKZ = 0, it must be that z, = 0. That is, ZK E C(d, MKK) =Z= 0.

Claim: If h > 0, then M E E(0).Proof. Let z = (r, a, Y, v-, v+) E C(d, M) and 2 = (fr, o, , ,+) be the two vectors relevantto the definition of class E(d) above. Here, we are temporarily concerned with d = 0. Giventhat h > 0, it is clear that z > 0 and Mz > 0 requires 7r = 0. Similarly, 2 > 0 and -MTi > 0requires * = 0. (As an interesting aside, note that the submatrix of M obtained by deletingthe rows and columns corresponding to 7r is skew-symmetric. Thus, for any z satisfyingz > 0 and Mz > 0 we have zTMz 0, i.e., z E C(0,M).) Given that r = r= 0, we canwrite:

-B,a + nr. (y - )-h(v_ - _) + h(v, - )

A,.( y- )Mz + MTi = -(A,. )T(a &)

0

0

We now consider two cases. First, if a = 0, then we simply take 2 = z = (0, 0, y, v-, v+).

Then - MTi = Mz > 0 and Mz+ MT= 0. Second, ifa 0 we take 2 = (0, a, ,0, 0) < z.

Then

0 -Boo + A,.y - hv + hv+

0 A,.y

-MTi= -(Ar.)Ta >0 and Mz+MT,= 0 >0.0 0

0 0

So both cases satisfy the conditions (i) and (ii) of the definition of E(0). Hence, Al E E(0)provided h > 0.

6.2 Application to a linearized subproblem 51

Since the problem data satisfy both of the conditions for Garcia's Corollary 5.2, we canconclude that each linearized subproblem is in a class that can be solved by Lemke's methodprovided that the problem is feasible. But if h > 0 the only way that the subproblem canbe infeasible is for the inequalities (SP2)-(SP5) to be inconsistent. These inequalities donot change from subproblem to subproblem, however, and their inconsistency would implythat no equilibrium solution exists. Since any properly formulated general equilibriummodel does have an equilibrium solution, we can conclude (as a theoretical matter) thateach linearizcd subproblem has a complementary solution which can be found by Lemke'smethod.

.

.N

111

7

COMPUTATIONAL COMPARISONS

In this chapter we present the results of some computational experiments applying fivedifferent solution methods to each of two small test problems. The two problems are rea-sonably well known in the equilibrium literature and are small enough to permit hundredsof model solutions from different starting points. Despite the small size, they possess prop-erties which serve to illustrate much of what can go right and wrong in solving a generalequilibrium model. The solution methods implemented are as follows:

1. SLCP (Section 4.1) applied to problem Eq-NLCP (Section 3.2)2. SLP (Section 4.4) applied to problem Eq-NLP (Section 3.3)3. symmetric positive semidefinite linearization (Section 4.6)

applied to Eq-NLCP4. projected (augmented) Lagrangian method (Section 4.5)

applied to Eq-NLP5. direct solution of problem Eq-DEPF (Section 5.2.1).

Another linearization strategy discussed in Section 4.6 was abandoned after disappointingintitial results. In practice it turns out that the skew-symmetric approximation matrix

[Vd(p) - VTd(p)] is prone to be so rank-deficient as to prevent the attainment of a basicsolution of the linearized subproblem that has all positive prices. As all equilibrium pricevectors for the two test problems are strictly positive, a method based on pivotal solutionsof the subproblems could never find an equilibrium using this linearization.

If there is a single word to describe the results obtained, it is diversity. A few generalizationsare supported, but there are numerous exceptions to every conclusion. Before presentingthe results, we briefly describe the test problems and discuss the specific implementationsof the solution methods.

7.1 Two test problems

The first problem was devised by Scarf [Sca 73, pp. 113-119] and has become a standardproblem for testing and comparing solution methods for general equilibrium problems. Themodel contains 14 commodities and 26 linear production activities. The commodities maybe subdivided into 7 final consumption commodities (for which there are no initial endow-ments), 3 primary commodities, and 4 intermediate commodities (these terms were definedin Section 2.1). Aggregate final demand is the sum of the demands of 4 consumers, eachwith a Cobb-Douglas utility function. This means that demand for final commodity i be-comes unbounded as p, - 0. If none of the primary commodities has a positive price, thenincome is zero and so is demand. The Scarf model has a unique equilibrium.

In [Keh 85] Kehoe describes a tiny equilibrium problem specifically constructed so as to havemultiple equilibria. The model has 4 final commodities and 2 linear production activities.

N 53- .Iqw, .W 1(W W~M W~U S

54 7.2 Solution methods implemented

Final demand is the sum of the demands of 4 Cobb-Douglas consumers, each with an initialendowment of only a single commodity. The problem was devised so as to have an unstableequilibrium at unit prices, which implies the existence of at least two other equilibria. (Proofof this relies upon the global index theorem described in both [Keh 82] and [Keh 85].) Kehoeestablishes that the model has exactly 3 equilibria. When prices are normalized to sum to4, these are:

(0.6377, 1.0000, 0.1546, 2.2077) (Eql)

E e(1.0000, 1.0000, 1.0000, 1.0000) (Eq2)

1(1.1005, 1.0000, 1.2346, 0.6649) (Eq3).

Both production acLtvities are strictly positive at all equilibria, which results in P2 havingthe same value for all equilibria. This is because a binding excess profit constraint for thesecond production activity requires that p, + P3 + P4 = 3 p2. The intersection of this andthe price normalization fixes p2 = 1, which in turns restricts the values of the other threeprices to lie on the subsimplex pi + P3 + P4 = 3. The other binding profit constraint thenforces all equilibria to lie on a line in this subsimplex. Indeed, any iterates obtained insolving the model must lie on this line if both production activities are positive. We foundthese observations in unpublished work of Mathiesen and Rutherford [MR 83], who in turnattribute them to unpublished work of Kehoe (which later appeared in [Keh 84]). Thesespecial features of the model prove to have significant consequences for the computational

performance of the methods studied.

7.2 Solution methods implemented

All five methods are implemented by means of specialized user subroutines in MINOS.In particular, SLCP, SLP, and the symmetric linearization method are all implemented %

by means of the same program to sequentially revise the linearized constraints and define ,

the appropriate objective function for the subproblem. MINOS is used solely to manageand solve the individual subproblems, not to "supervise" the overall solution process. Incontrast, the other two methods directly apply MINOS as an optimization algorithm forsolving problems Eq-NLP and Eq-DEPF. Use of a common computational vehicle makes theexperiments more manageable and also enhances the comparability of the solution timesfor the various methods.

7.2.1 Departures from standard SLCP method

Using MINOS to solve the subproblems means that SLCP is actually implemented as (Wil-son's) SQP method (Section 4.2) applied to problem Eq-NLP. Thus, the LCP subproblemsare not processed by Lemke's method, but rather are solved as indefinite quadratic pro-grams. With this approach, it is possible that some subproblems will not be solved to globaloptimality (i.e., complementarity). This in fact happens from many starting points at oneor two of the early iterations of the method. Nonetheless, failure to achieve complementarity

7.2.2 A symmetric linearization 55

does not prevent convergence for either problem from any starting point. The consequenceof the MINOS implementation is that the "simulated" SLCP iterations follow a somewhatdifferent path than would a "pure" implementation.

Another difference in implementation arises from the choice of step length and its relationto avoiding zero prices at linearization points. The most elegant approach would be toimplement a rigorous line-search procedure, applying a merit function (such as that ofEq-DEPF) that becomes infinite at points outside the domain of the demand functions. Weregret that time did not permit undertaking the programming effort required to accomplishsuch a solution. (We do, however, evaluate the merit function at each point obtained bya sequential method and therefore can assess progress after the fact.) Any other appoachto determining a step length and avoiding boundaries is necessarily heuristic. Mathiesen'sapproach is to allow zero prices in the LCP solutions but then to take less than a unit stepif one or more of the prices is zero. This means that the merit function cannot be evaluatedat the LCP solution if there is a zero price; it must be evaluated at the linearization pointdetermined by the step length. A more attractive and convenient alternative in the contextof our implementation with MINOS is simply to impose reasonable nonzero lower bounds onthe prices. This means that the merit function can be evaluated at all subproblem solutionsand that a unit step is always feasible. With this approach, however, a binding lower boundin the subproblem can and generally does prohibit the attainment of a complementarysolution. For the most part, these bounds are the true cause for the lack of complementarityin certain early iterations that we alluded to above.

Intuitively, we know that obtaining a zero price in a linearized subproblem means that thelinear approximation did not properly reflect the true boundary behavior of the demandfunctions. It is thus arguable that obtaining such a solution is of no more value thanobtaining a non-complementary solution with more reasonable prices. In some sense, wemight expect that preventing the early iterations from obtaining extreme prices wouldenhance overall convergence. As a crude test of this hypothesis, we applied the first threemethods to each problem using lower bounds of 0.01 and 0.0001. On balance, use of the0.01 bounds proved superior, so the rest of the results pertain to computations using thehigher bounds.

7.2.2 A symmetric linearization

For the third method we define a sequential linearization scheme that utilizes a symmetric,negative semidefinite approximation matrix (referred to as Bk in Section 4.6). Since thedemand functions enter as -d(p) in the commodity balances, it is -Bk that appears inthe linearized constraints. The resulting constraint matrix is then bisymmetric and positivesemidefinite, implying that the subproblem can be solved as a convex QP on the price spacealone. The production activity levels are obtained from the dual, i.e., from the Lagrangemultipliers on the excess profit constraints. We derive this approximation matrix from thesums of the so-called Slutzky matrices for the demand functions of the individual consumers.For a consumer with income 0 and demand function d(p,0), the Slutzky matrix S(p, 0) is

56 7.2 Solution methods implemented

defined as:S(p, 0) E Vd(p, 6) + Ved(p, 6) [d(p, 0)]T.

(The derivation may be found in any intermediate textbook of microeconomics.) The matrixS(p, 0) is symmetric and negative semidefinite; in particular, p is in its null space. Theseproperties are clearly preserved under summation across a finite collection of consumers. Inboth of our test problems, we know the demand of each consumer as a function of prices andincome, as well as the definition of individual income in terms of endowments and prices.Hence, at any price vector p we can construct the individual Slutzky matrices and obtaintheir sum. We thus obtain a meaningful symmetric, negative semidefinite linearization ofthe aggregate demand function d(p).

Note that the Slutzky matrix is properly defined only for final consumption items. It is

thus significant that, in a problem with primary and/or intermediate commodities (such asthe Scarf problem), an approximation matrix derived from Slutzky matrices is nonzero onlyin the submatrix corresponding to prices of the final commodities. Such an approximationnot only has B,. = 0 in the rows designated (SP2) in Section 6.2, it further has B,, =0 in inequalities (SPI). Other symmetric approximations, such as any scheme involving

[Vd(7r, a) + VTd(7r, a)], produce a nonzero B,,, in inequalities (SP2). These terms not onlyincrease the density of the matrix, they also produce a distortion in the commodity balancesfor the primary and intermediate commodities.

We are not aware of any previous attempts to use Slutzky matrices as a basis for lineariz-ing final demand functions. It is unfortunate that such a construction is not possible foran arbitrary nonintegrable aggregate demand function. It is applicable only to problemsinvolving sums of known individual demand functions for which the Slutzky matrices canbe constructed. In the following discussions we will use the abbreviation SLTZ to refer tothis sequential linearization method based on Slutzky matrices.

7.2.3 Using MINOS to solve Eq-NLP and Eq-DEPF

As described in Section 4.5, the features that characterize the projected augmented La-grangian method (as implemented in MINOS to solve nonlinearly constrained problems)are the Lagrangian term based on multiplier estimates and the penalty parameter weight-ing the quadratic penalty term. If both terms are suppressed, the resulting procedureautomatically implements SLP (without trust regions) on any problem with a linear objec-

tive function (such as Eq-NLP). In our context, this simple, low-overhead implementationof SLP naturally proves to be somewhat faster than the specialized user program we devisedto implement all three of methods SLCP, SLP, and SLTZ. For maximum comparability, wewill report the results for these three methods based on common use of the special user

program.

The interesting question with respect to using MINOS on Eq-NLP is whether the additionalincorporation of the Lagrangian and/or penalty terms proves to be a help or a hindrancerelative to SLP. The terms impart to each subproblem some of the underlying nonlinearnature of the original problem, and the penalty term in particular is intended to provide

7.2.4 Specification of starting points 57

a stabilizing influence that promotes global convergence. This benefit comes at the costof solving subproblems with a general nonlinear objective, which is furthermore nonconvexin the case ot Eq-NLP. To address this question, we solved each test problem from manydifferent starting points, both with and without the Lagrangian term and for different valuesof the penalty parameter.

An important aspect of solving problem Eq-DEPF directly is that no initial values for theprices need to be supplied. A standard Phase I procedure will automatically generate afeasible solution of the linear constraints; see again Section 5.2.1. For this solution, x, yand v may very well be zero, but the prices determined will be normalized and feasible withrespect to the excess profit constraints. Thus, the first evaluation of the demand functionsoccurs at feasible prices. This is in sharp contrast to all the other solution methods whichrequire the user to divine a set of starting prices. These may or may not be feasibledepending on the method used to obtain them. As it is meaningless to attempt to solveEq-DEPF from numerous starting points, the only relevant comparison is between a singlesolution time for Eq-DEPF and the distribution of solution times for the other methods.The interesting issue in solving Eq-DEPF is the effect on solution time of differing values ofthe penalty parameter in the objective function. This is readily investigated, and we reportbelow the effects of this parameter on solution times for each test problem.

7.2.4 Specification of starting points

Since our purpose is to investigate both the robustness and speed of convergence of thevarious methods from arbitrary starting points, we deem it preferable to avoid randomiza-tion and instead construct a uniform grid of prices covering the (interior of the) relevantprice simplex. For the Scarf problem, we construct a grid over the 10 prices for final andprimary commodities. To obtain a manageable number of points, the range for each priceis subdivided into 3 (equal) intervals, the endpoints of which define a grid of 220 points onthe simplex. For the Kehoe problem, we can afford a finer grid based on 10 intervals, whichresults in 286 starting points. For each problem, the minimum allowable starting price is0.01. This is relative to price normalizations requiring the 4 prices in the Kehoe model tosum to 4 and the 7 final commodity prices in the Scarf model to sum to 7.

It is important to bear in mind that the purpose of this grid of prices is to examine how wellthe methods perform even when given a starting point near the boundary of the simplex- and hence near the boundary of the domain of definition of the demand functions d(p).Many of the points have one or more prices at the minimum of 0.01. Indeed, for the Scarfproblem each starting point has at least 7 of the 10 prices at the minimum value. Forthe Kehoe problem, 202 of the 286 points have at least 1 of the 4 prices at the minimumvalue. Such points naturally tend to produce rather unstable initial iterations. This allowsfor some worst-case comparisons, but such results are not necessarily representative of theperformance of the method in a realistic setting.

In practice, model users may well have reasonable "priors" as to relative prices and, at thevery least, can readily avoid specifying initial prices near the boundary of the domain of the

demand functions. To assess the performance of the methods when given more reasonable

58 7.3 Results for the Scarf problem

starting points, we do the following. For the Scarf problem, we generate another 220 startingpoints by taking a convex combination of the near-boundary points and the center of thesimplex (i.e, a vector of ones). We use a weight of 0.7 on the center point. (For reference,all but one of the equilibrium prices are near unity; the "outlier" is approximately 0.5.) Incontrast, for the Kehoe problem, we simply examine the results for the subset of 84 startingpoints that are not near the boundary.

7.2.5 Convergence criteria

The significant feature of imposing lower bounds on the prices is that any subproblem solu-tion for methods SLCP, SLP, and SLTZ produces a feasible solution for problem Eq-DEPF.We can then utilize the value of the Eq-DEPF objective function as a rigorous measure ofprogress for the method and also as a viable stopping criterion. For the convergence crite-rion. we established the extremely strict requirement that both the value of the Eq-DEPFobi '-tive and the absolute value of artificial variable v must be less than 10-'. This was cho-Sen, ifter early experiments indicated inadequately precise convergence of the SLTZ methodwith a criterion of 10- ', which is the tolerance for judging feasibility and optimality inMINOS. Those MINOS tolerances were also tightened to 10- ' for the SLTZ method. Thestoppping criterion utilized is much more demanding than the kinds of criteria typicallyemployed in solving equilibrium models.

In using MINOS to solve problem Eq-DEPF, we also found it worthwhile to use the tighter10- ' feasibility and optimality tolerances. This did not appreciably increase solution time,but did improve the accuracy of the final solution. For solving problem Eq-NLP withMINOS, the default MINOS criteria for convergence proved more than adequate to ensureconvergence within the tolerances defined above for the other methods.

7.2.6 Hardware, software and timing measures

All of the computational experiments were performed on an IBM 3090 Series 200 computer,kindly made available to us by the IBM Palo Alto Scientific Center. The operating systemis VM/CMS running as a "guest host" under VM/XA. Both MINOS and the required usersubprograms were compiled and executed using Release 2 of Version 2 of VS FORTRAN. Noattempt was made to utilize the available vectorization and parallel processor capabilities.Even without using such features, 3090 computing power is something of an "overkill" forthe problems solved. As a result, we will be comparing solution times measured in fractionsof a second. For the purposes of our comparison of methods, it is the relative solution timesthat matter. All reported solution times exclude program linking, MINOS initialization,and printing of the final solution.

7.3 Results for the Scarf problem

The existence of a unique equilibrium solution for the Scarf problem allows for a straightfor-ward assessment of computation times. It is meaningful to examine the range and average

~* W,

7.3.1 Solving Eq-DEPF with MINOS 59

of solution times across all starting points as well as ratios of solution times for differentmethods from the same starting points. To begin with the simpler discussions, we first ad-dress using MINOS to solve problems Eq-DEPF and Eq-NLP. We then examine at greaterlength the comparisons of methods SLCP, SLP, and SLTZ.

7.3.1 Solving Eq-DEPF with MINOS

As we indicated above, solving problem Eq-DEPF does not require the specification of aninitial starting point. What we examine is the effect of the choice of penalty parameter onsolution time. We solved the Scarf model a small number of times for each of 5 differentsettings of the penalty parameter. After averaging out some small random fluctuations inthe individual trials, representative solution times are as follows:

parameter 0.01 0.1 10 10 100

CPU seconds 0.145 0.132 0.145 0.140 0.149

The relative variation is not particularly significant, and we have no explanation for thebimodal pattern. A solution time on the order of 0.14 seconds for Eq-DEPF should be keptin mind when reviewing the results for the other methods below.

7.3.2 Solving Eq-NLP with MINOS

We solve problem Eq-NLP using four different strategies:

Lagrangian term penalty parameter

(1) no 0

(2) yes 0

(3) no 10

(4) yes 10

Strategy (1) is effectively SLP because of the linearity of the objective function of Eq-NLP.Since no Lagrange multiplier estimates are available for the first linearized subproblem,strategies (1) and (2) solve the same initial subproblem, and so do strategies (3) and (4).

The most significant result of these comparisons is that all four strategies succeed in locatingthe equilibrium solution from all 220 near-boundary starting points. Solution times rangefrom 0.1 to 0.5 seconds across all starting points. We can also compare solution times acrossmethods initiated from the same starting points. We do so by computing the ratios of thesolution times for strategies (2)-(4) to that of strategy (1). With only two exceptions, thesolution times for strategies (3) and (4) range between 95% and 102% of that for strategy

-* (1); the average is about 99%. Curiously, the number of linearizations required from eachstarting point is the same for all three of these strategies. Hence, the presence of a penalty

Em . CE \. J* .

A A A %

I -T 1 JK XM -f' .k k ' r, - . -


term did not affect the convergence rate obtained, and, contrary to prior expectations, themore complicated objective function for the subproblem did not increase overall solutiontime. On balance, we do not consider the variation in solution times for these three strategiesto be at all significant.

In contrast, the solution times for strategy (2) range from 62% to 153% of that for strategy(1); the average is about 105%. In most, but not all, cases the number of linearizationswas reduced by the presence of the Lagrangian term. Nonetheless, solution times were

frequently longer despite the improved convergence rate. There is no apparent pattern toexplain why solution times increased for some starting points and decreased for others. Inparticular, the distance of the starting point from the equilibrium point is not in any wayrelated to the solution time. We conjecture that the wide variation in relative solution rimesis attributable to differing qualities of the Lagrange multiplier estimates obtained from thefirst two or three subproblems. In the absence of a stabilizing penalty term, the multiplier

estimates from the first subproblem are likely to be quite poor, given a linearization pointnear the boindary of the simplex. Poor multiplier estimates can in turn lead to distorted

solutions for the next subproblem, which may well take longer to compute if the solution isdistant from that of the previous subproblem solution.

Since the primary purpose of these comparisons is to assess the necessity of the Lagrangianand penalty terms for achieving global convergence from near-boundary starting points, wedo not think it worthwhile to repeat the experiment from the interior starting points.

7.3.3 Three sequential methods

We first discuss the results for the set of starting points with at least 7 of the 10 prices setat the minimum value of 0.01. It is significant that all three of the methods SLCP, SLP, andSLTZ succeeded in finding the equilibrium solution from all 220 of these deliberately poorstarting points. The character of the results partition the starting points into two groups.The most varied results are obtained for a group of 84 points at which all 3 of the prices forprimary commodities are set at 0.01. (This means that consumer income is near zero, sincethe Scarf model has no endowments of the final consumption goods.) We use the solutiontimes for SLCP as a basis for comparison. These range from 0.19 to 0.46 seconds across all84 points. For the most part, the number of linearizations required is in the range of 4-6,but 10 troublesome points required 10-12 linearizations. The variation is not related in anyvisible way to the distance of the starting point from the equilibrium.

We now look at solution times (from the same starting point) for the SLP and SLTZmethods expressed relative to the solution time for SLCP. For SLP these time ratios rangebetween 0.37 and 1.54, with an average of 0.87. The most extreme of these deviations are

attributable to significantly different numbers of linearizations required. For the most part,buit not universally, solution times for the same number of linearizations are markedly lowerfor SLP. For SLTZ, the range of relative solution times is front 0.14 to 0.78, with an average

of 0.48. In a small number of cases, SLTZ required fewer linearizations than SLCP, but itgenerally required 2 or 3 more - and occasionally 2 or 3 times as many. These excursionsreflect an underlying phenomenon that SLTZ iterations occasionally produce an increase in

,4

~ ~ V ~ -. *.- .- C*4' ~ ~ ~ s-. ' % ~ ~ $ -* * .~*

7.3.3 Three sequential methods 61

the value of the Eq-DEPF merit function. These uphill segments of the path may endurefor a few linearizations, but descent is eventually regained. Lower overall solution times inthe face of the higher linearization counts clearly point to substantially lower solution timesfor solving the smaller SLTZ subproblems.

The results are quite different and less diverse for the other 136 points. The solution timesfor SLCP range from 0.41 to 0.88 seconds, which is almost wholly above the range of solutiontimes for the other group of points. Here, the number of linearizations required is generallyin the range 9-12. The relative solution times for SLP are more tightly confined in therange 0.32-0.71, with an average of 0.45. For the most part, the number of linearizationsis somewhat higher than that for SLCP, so these time savings reflect lower solution timesfor the individual subproblems. Quite surprisingly, on this group of points SLTZ generallyrequired fewer linearizations than either SLCP or SLP. We do not know how to accountfor this result, but it naturally means that solution times are substantially lower for SLTZ.Solution times relative to SLCP are in the range 0.14 to 0.35, with an average of 0.21.

The above results certainly attest to the robustness of all three sequential methods whenapplied to a well-behaved problem such as the Scarf model. We now look at more "normal"performance as measured by the other set of 220 points, for which no starting price is lowerthan 0.7. Here, we tabulate the range and average solution times for the three methods soas to permit a somewhat meaningful comparison with the time required to solve problemEq-DEPF directly.

CPU seconds SLCP SLP SLTZ

minimum 0.157 0.125 0.018

maximum 0.393 0.285 0.075

average 0.219 0.170 0.034

An Eq-DEPF solution time of 0.14 seconds is basically at the lower end of the range ofsolution times for SLCP and SLP, but it is almost twice the value of the upper end of therange for SLTZ.

Turning to the relative solution times from identical starting points, the results show a strong

concentration near the average with a significant number of outliers above the average. SLPsolution times relative to SLCP range between 0.59 and 1.68, with a mean of 0.79. ForSLTZ the range is 0.1-0.36, with an average of 0.16. Underlying causes for the deviationsare essentially the same as those discussed above for the collection of points with near-zeroendowment prices. No pattern appears with respect to the distance of the starting pointfrom the equilibrium, but there is an interesting pattern with respect to the solution time forSLCP. This pattern is discernible in Figures 1 and 2, which plot the relative solution timesof SLP and SLTZ versus the solution time for SLCP. (The highest outlier was excludedfrom each figure so as to obtain better resolution for the rest of the points.) The generaldownward trend in the clusters of points indicates that, on groups of starting points forwhich SLCP encountered increasing difficulty, the other two methods suffered less thanproportional increases in solution time. The scatter also indicates that SLP and SLTZ

I


1 .4 0 . .............. ................................ ............................... .....................................SLP 1 .3 0 . .............. ......... . ........... ................................................................................

T1 1 .20 . .... ............................... ...... ............................................................................

ME

1 .10 . ..................................................... ................. .......................................REL 1 .0 0 . .. ............................. I........... ................ ............................................L1AT 0 .9 0 .. ........................ : ........ V, ................ ........ .- ......................................................V i

I i

V 0 .E ..... "'; .2 , , , ...... ...... :...... ................................ ; ........ I......................

0 ......... .... ..... .... .......

S a • •

L 0 .60 . ............................................ ...................... ........................ ................CP 0.50 # I ! I #

0.15 0.20 0.25 0.30 0.35 0.40

CPU SECONDS FOR SLCP

Figure 1: SLP vs. SLCP - Scarf problem

s 0.35LTz

0.30 . .............................................................................................. .......................

M . '-E 0 .25 .. . . . . . ................................................................................... . ...................

RE

T "

.I. . . .. . .......

E 0...- .................. .. ...... ... ............ .. . ...........T

0 .10 .. .............. . .... ................................................. .................. . . . . ... . .....S ',"a

LCP 0 .0 5 1II

0.15 0.20 0.25 0.30 0.35 0.40


Figure 2: SLTZ vs. SLCP -- Scarf problem

7.4 Results for the Kehoe problem 63

frequently, but not universally, encountered difficulties with the same starting points. Theconclusions which state themselves, however, are the overwhelming superiority of the SLTZmethod on the Scarf problem and the generally superior performance of SLP relative toSLCP.

7.4 Results for the Kehoe problem

The existence of three distinct equilibrium solutions for the Kehoe problem rather compli-cates the assessment of computational results. Examining the range and average of solutiontimes for a given method across all starting points can provide an indication of how long ittakes the method to find some equilibrium point, but both range and average may dependupon the proportion of times each equilibrium point is found. Also, it does not seem partic-ularly meaningful to compare solution times for different methods from the same stArtingpoint unless the methods converge to the same equilibrium point. It is interesting to notewhen different methods converge to different points, but it is not clear what else can besaid abouL such cases. To begin again with the simpler discussions, we first address usingMINOS to solve problems Eq-DEPF and Eq-NLP. We then examine at greater length thecomparisons of methods SLCP, SLP, and SLTZ.

7.4.1 Solving Eq-DEPF with MINOS

As for the problem of Scarf, we solve the Eq-DEPF formulation of the Kehoe problemfor 5 different settings of the penalty parameter. For all parameter values, the solutionfinds the (Eql) equilibrium point. Again, the variation is not at all significant: 0.018seconds is required for a parameter value of 0.1, with the other values requiring 0.02 seconds.Convergence to (Eql) with a solution time on the order of 0.02 seconds for Eq-DEPF shouldbe kept in mind when reviewing the results for the other methods below.

7.4.2 Solving Eq-NLP with MINOS

We solve the Eq-NLP formulation of the Kehoe problem using the same four strategies asidentified above for the Scarf problem. Again, the most significant result is that, from all286 starting points, all four strategies succeed in finding one of the equilibrium solutions,including the unstable equilibrium on several occasions. It is interesting that, from eachstarting poin', strategies (2)-(4) converge as a group to the same equilibrium point, whichalmost half the time is a different point than that found by strategy (1). Moreover, withonly one exceptional point, the solution times and number of linearizations for strategies(2)-(4) are virtually identical.

For all strategies, solution times range from about 0.02-0.11 seconds across starting points.Great variation is apparent in the relative solution times of similarly behaving strategies(2)-(4) as compared to strategy (1). For the 115 points from which all strategies convergeto (Eqi), relative solution times range from 51% to 265%, with an average of 138%. Only 9points result in all strategies converging to the unstable equilibrium; here the range is much

,-;", , - , . - ,.',%

64 7.4 Results for the Kehoe problem

tighter (88% to 107%) and the mean is 103%. Finally, all strategies converge to (Eq3) from37 points. The range in relative solution times is 61% to 187%, with an average of 117%.Again, there is no apparent pattern to explain these variations in relative performance.Moreover, the results for the subset of 84 interior starting points very much resemble theresults for the entire set. With such a diversity of results, we are in no position to draw anyviable conclusions or generalizations.

7.4.3 Three sequential methods

As we indicated earlier, of the 286 starting points generated for the Kehoe problem, 202points (intentionally) have at least one price at the minimum of 0.01. While we mightintuitively expect to see markedly different behaviors of the methods in the subset of near-boundary points as compared to the subset of 84 interior points, this did not turn out to bethe case for the Kehoe problem. In light of this result, we will discuss all 286 cases togetherin order to utilize the larger sample size.

Both SLCP and SLP successfully converged to an equilibrium from all starting points.

SLCP never reached the unstable equilibrium (Eq2), while SLP converged to (Eq2) on 14occasions. SLTZ also never found the unstable equilibrium and converged to (Eql) from 181

starting points. From the other 105 points, SLTZ clearly targeted (Eq3) as a limit point,but the convergence criterion was met before the iterates obtained a precision comparableto that of the other methods. For instance, the terminal values for p, were either 1.1003or 1.1008 (depending upon the direction of approach). This is representative of a generalpattern for SLTZ on the Kehoe model that iterations terminate upon narrowly satisfying theconvergence criterion. In contrast, the final iteration of SLCP or SLP almost always satisfiesconditions several orders of magnitude stronger than the specified convergence criterion.

This precise "jump" at the last iteration is typical of Newton's method. This differencein convergence patterns is very important, but we do not believe it justifies declaring thatthe SLTZ method failed to converge; it just fails to converge at a satisfactory rate (on theKehoe problem).

It is significant that, out of 286 starting points, all three methods ccnverged to (Eql) in

only 71 cases and to (Eq3) on only 29 occasions. Pairwise intersections are also importantin terms of defining subsets of starting points over which to evaluate relative solution times.SLCP and SLP jointly converge to (Eql) from 83 starting points and to (Eq3) from 35points. SLCP and SLTZ jointly converge to (Eql) from 146 starting points and to (Eq3)from 71 points.

Before presenting results on relative solution times, we again tabulate the range and averagesolution times for the three methods across all starting points. Given different patterns asto which equilibrium is located, these figures are not as meaningful as those for the Scarfproblem, but they do allow a loose comparison with the time required to solve problemEq-DEPF directly.

* .. .

7.5 Commentary 65

CPU seconds SLCP SLP SLTZ

minimum 0.025 0.016 0.086

maximum 0.111 0.097 1.382

average 0.077 0.057 0.475

An Eq-DEPF solution time of 0.02 seconds is basically at the lower end of the range ofsolution times for SLCP and SLP. The times for SLTZ lie almost wholly above the upperend of the ranges for SLCP and SLP. This is in strong contrast to the results for the Scarfproblem.

We examine relative solution times from identical starting points only for points from whichthe methods compared converged to the same equilibrium. As for the Scarf problem, theresults show strong concentrations near the averages with significant numbers of outliersabove the averages. SLP solution times relative to SLCP range between 0.38 and 1.89,with a mean of 0.75. This is quite similar to the results for the Scarf problem. For SLTZ,the story is completely different: the range is 1.4-32, with an average of 6.1. No patternappears with respect to the distance of the starting point from an equilibrium. Indeed, ascan be seen from Figure 3, there is no particular pattern at all to the relative performanceof SLP. In sharp contrast, Figure 4 illustrates that the generally poor performance of SLTZis markedly worse when the process converges to (Eq3). Note that the two different symbolsfor points in the figures correspond to the two different equilibrium points obtained. Weagain have excluded the highest outlier from each figure.

The decidedly poor performance of SLTZ on the Kehoe problem is attributable to thefollowing general pattern of convergence. The worst cases arise when the first subproblemsolution that obtains positive levels for both production activities also obtains prices thatare near the unstable equilibrium (Eq2). The iterates never converge to this equilibrium- no matter how close it may be - but proceed toward another equilibrium at a linear

rate. Along this path, the Eq-DEPF merit function increases for many iterations, often byseveral orders of magnitude. A peak is eventually reached, and descent is resumed at avery poor linear rate. It is remarkable that convergence is achieved at all. Intuitively, thispattern suggests that the unstable equilibrium exerts a perverse kind of attraction whichdoes not draw the iterates toward (Eq2) but seriously retards the rate at which the iteratesare drawn to one of the other equilibria. The proximity of (Eq3) to (Eq2) causes this effectto be particularly pronounced for iterates that target (Eq3).

7.5 Commentary

The great diversity of the above results makes it difficult to derive any meaningful general-ities and conclusions. It is important to recognize, however, that all of the methods studieddid in fact converge to an equilibrium, regardless of the quality of the starting point. Eventhe extremely disappointing performance of the SLTZ method on the Kehoe problem doesnot reflect aimless wandering of the iterates through the simplex. An equilibrium was def-initely targeted; the problem was with the rate of convergence, not an actual absence of

66 7.5 Commentary

1.3 ............................................ ..........................SM +

T 1.1 . -. ........................ ............... ...... E q3 .....

M

E 1.0 .............................................................................• + +

R 0 .9 . .. . . . . . . . . . . . . . .. . . . . . . . . . . . ........................................... ....... .............E +LA 0 .8 . ..................................... .....................................................T + + ++ # +V 0.7 ........................................... ......

E0 .6 . .. ............................................................ ....... ....... .........................

T0 0 .5 . .. . . . . . . . . . . . . . . .. . . . . . . . . . . . . ....................................... .....................

L 0 .4 . .. . . ............................................. -;......................................... .....................C 0.3

0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12


Figure 3: SLP vs. SLCP - Kehoe problem

S 30 ...................................................................... ............S +

T Eql25 ...... +........................................... .. ..............T + Eq3I +

ME 2 0 . . ...................................................................... ........................... *.......... . . . . .

REL 15 . ..............30................. ........... ....... . ...........................AT +

I

25........... ..... .... . . . . . .. .... +...

T+

O

0 5 ............................... .... ............ . .......................................S

L

CP Of i I I II

0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0,10 0.11 0.12CPU SECONDS FOR SLCP

Figure 4: SLTZ vs. SLCP - Kehoe problem

7.5 Commentary 67

convergence.

For the most part, sequential methods SLCP and SLP demonstrate comparable performanceon both test problems. Solution times for SLP are on average somewhat better than those forSLCP, but there are a number of outliers for which SLP times are significantly higher. Theresults do not permit any conclusion as to whether supplementing SLP with a Lagrangianterm or a penalty term improves overall performance. With such a mixture of results, it isprobably not worth the effort to add the extra terms in practice. What may well be worthimplementing is direct solution of problem Eq-DEPF. Solution times for this method rivalthe best of the results for SLCP and SLP, and use of Eq-DEPF does not require the user tospecify a set of initial prices.

The risky venture is clearly the SLTZ method. The evidence indicates that, for a problemwith a unique equilibrium, this method can compute an accurate equilibrium in as littleas 10-20% of the time required by any of the other methods. On the other hand, in amultiple equilibrium context, the SLTZ method can perform so poorly as to border on beingnon-convergent for all practical purposes. Considerable further experimentation with othermodels would be required to ascertain whether it is exclusively the multiplicity characteristicthat accounts for the extreme differences in the performance of SLTZ.

It is tempting to suggest the development and use of some form of hybrid method, but suchhybrids are much easier to contemplate than to implement in practical software. Indeed,if a user must be prepared to "switch" to SLCP, say, when another method appears to beconverging too slowly, as a practical matter it would be easier to employ SLCP throughout.If it is feasible to solve an SLCP subproblem once or a few times, it is probably feasible tosolve it several times. It is often the case that the actual reason for wanting to use a lowerdimensional method such as SLTZ is that the problem is too big to solve in an SLCP/SLPor Eq-DEPF context. In this case, it is of no practical value to know that SLCP or SLPwould quadratically converge to the solution once we had used SLTZ to bring the iterates"close enough" to the equilibrium.

p

*u * fl. ~**~r- - "- V. .4%~'w

8

SUMMARY AND PERSPECTIVE

There is no question but that general equilibrium problems can be very difficult to solve.

Nonetheless, the results of our research establish a number of characteristic features of

CGE problems which make them in some sense more benign than the general nonlinearcomplementarity problem or the general (nonconvex) nonlinear program.

First, under mathematical assumptions that have reasonable economic content, theEq-NLCP problem is known to have at least one feasible and complementary solution. In im

an optimization context, this mears that the equivalent problem Eq-NLP is feasible and hasa global minimum objective value of zero, which can in fact be attained.

Second, we have shown that, under a seemingly reasonable rank condition, a complementary

(optimal) solution exists which is a vertex of the linearized constraints at that point.

Third, we have derived an easily implemented rule for constructing a price normalizationwhich, in conjunction with a matching artificial column, guarantees that any linearizationof the nonlinear problem has a feasible, complementary solution. Moreover, given the rankcondition for existence of a basic complementary solution, this same construction guaranteesthat a complementary solution can be successfully computed by Lemke's method.

Fourth, we have seen that any regular equilibrium solution exerts a domain of attractionfor Newton and quasi-Newton iterates. Within this domain, we may expect quadraticconvergence of Newton iterates, and superlinear convergence is possible for quasi-Newtoniterates. SLCP, SLP, and projected Lagrangian methods all produce Newton iterates oncethe complementary (optimal) basis has been determined. We have some assurance fromgenericity analysis that regular solutions are in fact typical in equilibrium models. .0

Fifth, we have derived an equivalent linearly constrained formulation of the CGE problem(Eq-DEPF) that has two important uses. One, its objective function can be used as adifferentiable exact merit function for judging and perhaps guiding the iterates of any ofthe sequential solution methods studied. We have used this function to demonstrate that

the search directions generated by any method which uses Taylor's expansions to linearizethe demand functions can be considered a descent direction for some value of an arbitrarypositive penalty parameter. The second contribution of this development is that optimizingproblem Eq-DEPF directly can be an effective means of computing an equilibrium solutionfor models of an appropriate size and structure.

These are helpful and encouraging results, both in terms of improving our understandingof the computational properties of equilibrium problems and in suggesting directions forthe next generation of solution procedures. There remains, nonetheless, a basic theoreticaldilemma which cannot be resolved at our current level of understanding. On one hand,we have a collection of optimization methods which have been designed to produce iteratesthat converge from almost any starting point. Unfortunately, for the equilibrium problem, Vwe cannot rule out the possibility that these iterates may converge to a non-equilibritil

I69

, .4.

70 8.1 Extension to nonlinear constant-returns production

point. On the other hand, we have Newton and other methods that obtain complementarysolutions of the subproblems. These methods produce iterates that do in fact converge toan equilibrium solution, if they converge. Unfortunately, we can only suggest reasons whyiterates of such methods might always be globally convergent on an equilibrium problem;nothing definitive has been established. In the face of this theoretical dilemma, however, thel)reponderence of the computational evidence (both our own and that of other researchers)shows that: (a) Newton-like methods do tend to produce iterates that converge from virtu-ally any starting point, and (b) optimization approaches do tend to find the global optimumof the nonconvex equilibrium problem. This situation certainly suggests that the kinds ofequilibrium models that are formulated in practice possess certain favorable computationalproperties that theoretical analysis has yet to discover.

Before closing this document, we first present a brief section on extending our results toequilibrium models that contain nonlinear production technologies with constant returns toscale. We then suggest what we believe to be some promising and important directions forfuture research.

8.1 Extension to nonlinear constant-returns production

Many computable general equilibrium models represent the production side of the economyin terms of differentiable nonlinear production or profit functions that reflect constant re-turns to scale. Given the theoretical and practical importance of such structures, we desireto ascertain whether or not the results of this research can be extended to apply to suchmodels. We outline here some very recent and preliminary observations on this matter delwhich indeed suggest that our findings have more general applicability.

Since most (and possibly all) applied general equilibrium models with nonlinear productionspecify an explicit profit function (or, if not, a production function for which the profitfunction is known), we focus our attention on models with nonlinear profit functions. Inthe case of constant returns to scale, a unit profit function is specified which does not lor

determine the level of operation. Profit functions that represent closed, convex, constant-returns production possibilities are known to be continuous and homogeneous of degreeone and convex in prices. For our computational purposes, it will be necessary to assumethat the unit profit function is at least twice continuously differentiable. To apply thecorrespondence between Wilson's SQP method and SLCP, continuous third derivatives arealso required. As a practical matter, most of the functions employed in applied models areinfinitely differentiable wherever they are once differentiable.

We generalize the notion of n linear production activities to n independent (constant-returns) production sectors whose operations are rel)resented by unit profit functions rj(p).Let r(p) be the n-vector of these individual unit profit functions. Since r(p) is homogeneousof degree one, Euler's law implies that r(p) = Vr(p)p. It follows that the excess profitconditions, r(p) 5 0, may be equivalently stated as Vr(p)p < 0. The gradient functionsare in turn homogeneous of degree zero, which implies that V2 r3 (p)p = 0 for all j. TheHessians V'r,(p) are, of course, symmetric and are furthermore positive semidefinite by

1'14,,0-- ., .-- M, - , -+* . ,- , .

8.1 Extension to nonlinear constant-returns production 71

virtue of the convexity of the ri(p). By the lemma of Hotelling, the profit-maximizing net-

output combination (per unit of operation) is given by [Vrj(p)]T. A positive component ofthis vector represents a net output of the commodity; a negative component indicates a netinput.

We again follow the presentations of Kehoe [Keh 82] and Mas-Colell [MasC 85]. To mimicour earlier notation for the linear production case, we define a matrix function A(p)VTr(p). The levels of operation for the individual production sectors are represented by then-vector y. For the appropriate analogies to properties (Al) and (A2) of Section 2.2, wereasonably assume the availability of free disposal and that no prices exist for which A(p)permits positive output without input. Existence of equilibrium then follows from the usualfixed-point arguments; see either of the above references.

Given existence, we may now specify the equilibrium problem as the following speciallystructured NLCP:


(CR1) -d(p) + A(p)y + hv > 0 1 p 0

(CR2) -AT(p)p > 0 I y_ 0 -

(CR3) - hTp =-1 ± v

(CR4) p , y > 0

Note that this formulation has the same basic structure as problem Eq-NLCP. The onlydifference is that the previously constant matrix A has been replaced by the matrix functionA(p). This generalization in no way affects the analysis used in Section 3.3 to demonstratethat the problem can be equivalently stated as a nonlinear program which minimizes v overthe same inequalities. Similarly, a formulation analogous to Eq-DEPF can be constructed,although, in light of the additional nonlinearity of A(p), this formulation will more resemblethe general case (DEPF) than the special case Eq-DEPF (see Section 5.2.1). Thus, wemay also apply the theory, algorithms, and convergence analysis of optimization methodsto the equilibrium problem with nonlinear, constant-returns production. In particular,given adequate differentiability of the unit profit functions, the basic equivalence betweenSLCP and Wilson's SQP method applies to this problem as well. What we have not yetinvestigated is the effect of the nonlinear A(p) on the descent results we derived in Chapter 5.

In applying Newton's method to the above NLCP, the linearized subproblem which is con-structed at a linearization point (pk, yk) is the following:

72 8.2 Future research


(LCR1) [ yV2j(pk)-Vd(pk)]p + A(pk)y + hv > d(pk) j p>O

(LCR2) -AT(pk)p > 0 y> 0

(LCR3) -hTp = -1 1 v

(LCR4) p , > 0

The above relations are also the appropriate linearized constraints for applying any of theoptimization methods we have discussed in this research. Two observations are important.

First, the constraint matrix indeed has the form we studied in Chapter 6. This meansthat, if the relations (LCR1)-(LCR4) are consistent, the subproblem has a complementarysolution which can be successfully computed by Lemke's method. Given the assumptionthat A(p) satisfies property (A2) of Section 2.2 for all p, we can in fact conclude that eachsubproblem is feasible, provided that the price normalization is constructed as prescribedin Chapter 6.

Second, the symmetry and positive semidefiniteness of the Hessian terms appearing inthe above linearization may in fact cause the subproblem to be better behaved than thesubproblems encountered with linear production. In particular, solving the subproblemas a quadratic program may be more reliable in this case. Moreover, it is intuitive Losuspect that the submatrix [Ej y4V~rj(pk)-Vd(pk)] is more amenable to approximationby a symmetric positive semidefinite matrix, such as we studied with the SLTZ method inthe previous chapter.

It would appear, then, that most of the results we have obtained in this research will alsoapply to the important case of nonlinear production with constant returns to scale, giventhe existence of sufficiently differentiable unit profit functions. Further analysis is requiredto ensure that we have not overlooked any subtleties arising in the nonlinear situation. Ifnot, this extension will greatly broaden the applicability of our research.

8.2 Future research

We believe that it is extremely important to better understand the empirical observationsthat Newton methods (and other linearization methods) tend to be globally convergent".,ben aP-r;,, t- problem Eq-NLCP and that optimization algorithms successfully find theglobal minimum of nonconvex problems Eq-NLP and Eq-DEPF. We think there is moreto be learned in this regard from study of the optimality conditions for the optimizationformulations of the equilibrium problem. In particular, given properties (H2) and (W2) ofSection 2.1, further investigation of the second-order conditions may be fruitful, even thoughour analysis to date has not revealed anything definitive. Since many of the models solvedin practice employ sums of demand functions derived from Cobb-Douglas or more generalconstant-elasticity-of-substitution utility functions, it may well be that special properties

8.2 Future research 73

of these functions are the source of good computational performance. This suggests alsostudying the optimality conditions as specialized to these popular functional forms.

We think it worthwhile to develop and evaluate computational procedures which combineany (and all) of the sequential methods studied with a rigorous line search and choice ofstep length based on the merit function available through problem Eq-DEPF. The prin-cipal intent here is both to avoid uphill steps (such as have been seen to occur with theSLTZ method, for example) and to stabilize the initial iterations of the algorithms. It iseasy to focus on asymptotic behavior of the iterates, but we have observed repeatedly inour computational experiments that the first few linearizations can be critical in terms ofestablishing the path that is to be followed to the equilibrium solution. It is here that aline search accounting for the divergence of the linearization from the true functions couldprevent wasted iterations on extreme and unrealistic subproblem solutions.

A major disappointment of the current research concerns the equivocal results obtainedfor symmetric linearization approaches (such as SLTZ) that allow for the solution of asubproblem in a lower dimension. Not only does the theoretical analysis of Chapter 5demonstrate that such methods can produce uphill seaich directions, the computationalresults of Chapter 7 confirm that such behavior does in fact occur on actual models. Onthe other hand, in some contexts the symmetric methods could perform many times betterthan the other methods studied. It is a true irony that we set out in this research toestablish the viability of symmetric approximation methods, but we seem to have donemore to demonstrate their inherent riskiness.

We see two possible approaches to investigate for models of a size that makes directlysolving Eq-DEPF or repeatedly solving SLCP/SLP subproblems a costly undertaking. The

first avoids the solution of such subproblems by using symmetric linearizations. It is thennecessary to devise (somehow) an overall sequential scheme that is both globally convergentand convergent at an acceptable (linear) rate. The second formulates subproblems based onTaylor's expansions (thereby promoting overall convergence at a quadratic rate) but seeksalternative procedures for solving the (large) subproblems that may be less costly thanprocedures known to date.

For the first approach, we may consider using a rigorous Eq-DEPF line search to supervisea symmetric linearization procedure (whether it be SLTZ, a Jacobi method, or some formof projection method). This certainly would prevent the kind of uphill excursion that weobserved on Kehoe's multiple equilibrium problem. Unfortunately, it is not at all clear what

to do if the subproblem solution provides a search direction that is uphill for all positivestep lengths. This is where further research may suggest a superior formulation of thesubproblem. Given the availability now of the Eq-DEPF merit function, we are in a muchbetter position to pursue such research than previously.

The other direction is seeking improved methods for solving the specially structured sub-problems which arise in the course of SLCP or SLP. Current solution procedures basedon Lemke's method or MINOS, for instance, can take advantage of the general sparsity ofthe matrix of linearized constraints, but they cannot make any use of the skew-symmetric

74 8.2 Future research

structure corresponding to the production activities. This suggests the study and develop-ment of special factorization routines that recognize the special structure of complementaryand almost-complementary bases for these subproblems. The block-angular nature of thesubproblem also makes it amenable to solution by a decomposition procedure, particularlyin the linear case of an SLP subproblem or the convex case of an SQP subproblem (whichutilizes the Jacobian of the demand functions in the constraints but uses a positive semidef-inite approximation in the objective function). Earlier thoughts on applying decompositionto linearized equilibrium problems are presented at some length in [Sto 85]. An effectivedecomposition strategy would lead to solving subproblems that axe roughly the same sizeas those encountered in using a symmetric linearization method.

On balance, we believe that we have made some valuable incremental progress in thisresearch, but much remains to be studied and understood before the solution of large-scale(and structurally general) general equilibrium models becomes a routine undertaking.

V V

BIBLIOGRAPHY

[AH 821 B.-H. Ahn and W.W. Hogan, On convergence of the PIES algorithm for computingequilibria, Operations Research 30 (1982) 281-300.

[Avr 76] M. Avriel, Nonlinear Programming: Analysis and Methods (Prentice-Hall, Engle-wood Cliffs, New Jersey, 1976).

[Bro 851 M.N. Broadie, An introduction to the octahedral algorithm for the computation ofeconomic equilibria, Mathematical Programming Study 23 (1985) 121-143.

[Cot 64] R.W Cottle, Note on a fundamental theorem in quadratic programming, Journal

of the Society for Industrial and Applied Mathematics 12 (1964) 663-665.

[Cot 76] R.W. Cottle, Complementarity and variational problems, Symposia Mathemat-ica XIX (1976) 177-208. 1

[DEG 79] G.B. Dantzig, B.C. Eaves and D. Gale, An algorithm for a piecewise linear modelof trade and production with negative prices and bankruptcy, Mathematical Pro- Vr

gramming 16 (1979) 190-209.

[DN 84] S. Dafermos and A. Nagurney, A network formulation of market equilibrium prob-lems and variational inequalities, Operations Research Letters 3 (1984) 247-250.

[Eav 71] B.C. Eaves, The linear complementarity problem, Management Science 17 (1971)612-634. 4

[Eav 78] B.C. Eaves, A locally quadratically convergent algorithm for computing stationarypoints, Technical Report SOL 78-13, Systems Optimization Laboratory, Depart-ment of Operations Research, Stanford University (May 1978).

[Eav 87] B.C. Eaves, Thoughts on computing market equilibrium with SLCP, in A.J.J. Tal-man and G. van der Laan (eds.), The Computation and Modeling of Economic

Equilibria (North-Holland, Amsterdam, 1987) 1-17.

[Gar 73] C.B. Garcia, Some classes of matrices in linear complementarity theory, Mathemat-ical Programming 5 (1973) 299-310.

[GMSW 86] P.E. Gill, W. Murray, M.A. Saunders and M.lI. Wright, Some theoretical propertiesof an augmented Lagrangian merit function, Technical Report SOL 86-6, SystemsOptimization Laboratory, Department of Operations Research, Stanford University(September 1986).

[GMSW 87] P.E. Gill, W. Murray, M.A. Saunders and M.H. Wright, Constrained nonlinear pro-gramming, Technical Report SOL 87-13, Systems Optimization Laboratory, De-partment of Operations Research, Stanford University (December 1987), to appearin G.L. Nemhauser, A.II.G. Rinnooy Kan and M.J. Todd (eds.), Handbooks in Oper-ations Research and Management Science-Volume 1, Optimization (North-Holland,Amsterdam, 1988).

[GMW 81] P.E. Gill, W. Murray and M.I. Wright, Practical optimization (Academic Press,London, 1981).

[GS 61] R.E. Griffith and R.A. Stewart, A nonlinear programming technique for the opti-mization of continuous processing systems, Management Science 7 (1961) 379-392.

75

76 Bibliography

[GXW 81] V.A. Ginsburgh and J.L. Waelbroeck, Activity analysis and general equilibriummodeling (North-IHolland, Amsterdam, 1981).

[Jo-N 79] N.I. Josephy, Newton's method for generalized equations, Technical SummaryReport 1965, Mathematics Research Center, University of Wisconsin-Madison(1979), available from National Technical Information Service, accession numberAD A077 096.

[Jo-Q 79] N.H. Josephy, Quasi-Newton methods for generalized equations, Technical Sum-mary Report 1966, Mathematics Research Center, University of Wisconsin-Madison(1979), available from National Technical Information Service, accession numberAD A077 097.

[Keh 82] T.J. Kehoe, Regular production economies, Journal of Mathematical Economics 10(1982) 147-176.

[Keh 84] T.J. Kehoe, Computing all of the equilibria in econoniic5 with two factors of pro-duction, Journal of Mathematical Economics 13 (1984) 207-223.

[Keh 85] T.J. Kehoe, A numerical investigation of multiplicity of equilibria, MathematicalProgramming Study 23 (1985) 240-258.

[Kyp 86] J. Kyparisis, Uniqueness and differentiability of solutions of parametric nonlinearcomplementarity problems, Mathematical Programming 36 (1986) 105-113. .A

[Lem 68] C.E. Lemke, Bimatrix equilibrium points and mathematical programming, Man-agement Science 11 (1965) 681-689. S

[Mang 80] O.L. Mangasarian, Locally unique solutions of quadratic programs, linear and non-linear complementarity problems, Mathematical Programming 19 (1980) 200-212.

[Mann 85] A.S. Manne (ed.), Economic equilibrium: model formulation and solution, Mathe-matical Programming Study 23 (1985).

[MasC 85] A. Mas-Colell, The theory of general economic equilibrium: a diffcrentiable approach(Cambridge University Press, Cambridge, England, 1985).

[Mat 87] L. Mathiesen, An algorithm based on a sequence of linear complementarity prob-lems applied to a Walrasian equilil)rium model: an example, Mathematical Pro-gramming 37 (1987) 1-18.

[NTCW 80] A.S. Manne, l.-P. Chao and R. Wilson, Computation of competitive equilibria bya sequence of linear programs, Econometrica 48 (1980) 1595-1615.

[M R 83] L. Mathiesen and T.F. Rutherford, Testing the robustness of an iterative LUP algo- 1,rithm for solving Walrasian equilibrium models, Discussion Paper 0883, NorwegianSchool of Economics and Business Administration (1983).

0[MS 82] B.A. Murtagh and M.A. Saunders, A projected Lagrangian algorithm and its imi-

plementation for sparse nonlinear constraints, Mathematical Programming Study 16(1982) 84-117.

[Mur 83] F.I. Murphy, Design strategies for energy market models, in 1B. Lev (ed.), Energymodels and studies (North- Holland, Amsterdam, 1983) 45-64.

[Nag 87] A. Nagurney, Competitive equilibrium problems, variational inequalities and re-gional science, Journal of Regional Science 27 (1987) 503 517. %

""''x'' 'q=e :-w '--• - ' '"€ - "'€ Yj , i- -w , .¢, -w-w- -,=-,',...i. .. .... ; ' """ -

' ' ' ':*." w €' -. . €,

Bibliography 77

[PC 82] J.S. Pang and D. Chan, Iterative methods for variational and complementarityproblems, Mathematical Programming 24 (1982) 284-313.

[Phi 85] R.L. Phillips, Computing solutions to generalized equilibrium models by successiveunder-relaxation, Mathematical Programming Study 23 (1985) 192-209.

[RK 721 J.B. Rosen and J. Kreuser, A gradient projection algorithm for nonlinear con-straints, in F.A. Lootsma (ed.), Numerical Methods for Non-Linear Optimization(Academic Press, London and New York, 1972) 297-300.

[Rob 72] S.M. Robinson, A quadratically convergent algorithm for general nonlinear pro-gramming problems, Mathematical Programming 3 (1972) 145-156.

[Rob 74] S.M. Robinson, Perturbed Kuhn-Tucker points and rates of convergence for a classof nonlinear programming algorithms, Mathematical Programming 7 (1974) 1-16.

[Rob 80] S.M. Robinson, Stroiigy regular generalized equatioiis, Mathemaiics of OQcraticn-Research 5 (1980) 43-62.

[Rob 82] S.M. Robinson, Generalized equations, in A. Bachem, M. Grotschel and B. Korte(eds.), Mathematical Programming: The State of the Art (Springer-Verlag, Berlin,1982) 346-367.

[Rut 86] T.F. Rutherford, Applied general equilibrium modeling, Doctoral Dissertation, De-partment of Operations Research, Stanford University (1986).

[Sca 73] H.E. Scarf, The computation of economic equilibria (Yale University Press, NewHaven, 1973).

[SMD 87] J .C. Stone, P.H. McAllister and G.B. Dantzig, Using the PILOT model to study theeffects of technological change, in B. Lev, J.A. Bloom, A.S. Gleit, F.H. Murphy andC. Shoemaker (eds.), Strategic Planning in Energy and Natural Resources (North-Holland, Amsterdam, 1987) 31-42.

[Sto 85] J.C. Stone, Sequential optimization and complementarity techniques for computingeconomic equilibria, Technical Report SOL 84-9, Systems Optimization Labora-tory, Department of Operations Research, Stanford University (November 1984),reproduced in part in Mathematical Programming Study 23 (1985) 173-191"

[SW 73] J.B. Shoven and J. Whalley, General equilibrium with taxes: a computational pro-cedure and an existence procf, Review of Economic Studies 60 (1973) 475-490.

[Wil 63] R.B. Wilson, A simplicial algorithm for concave programming, Doctoral Disserta-tion, Graduate School of Business Administration, Harvard University (1963).

[ZKL 851 J. Zhang, N.-I. Kim and L. Lasdon, An improved successive linear programmingalgorithm, Management Science 31 (1985) 1312-1331.

I

' 'S

UNCLASSIFIEDSSCURITY CLASSIFICATION OF THIS PAGE (Whio Da, l-4e.

REPORT DOCUMENTATION PAGE RED MMUCTINOs

- Gow ACCESSIOu NO RECIPItorr'S CATALOG NUMBER

4. TITLE (ind SiIbmtIJ S. TYPE OF REPORT & PEmOO COVIERW O

Formulation and Solution of Economic Technical ReportEquilibii um Problems

a. PERFORIG Oo. REPORT NUNDER

7. AUTHOR(I,1 ill. COMRACT 0N GRANT NUNUlER(s)

Johi G. Stone N00014-85-k\-0343

S. PERFORMING ORGANIZATION NAME AND ADDRESS Is. PROGRAM ELEMENT. PROJECT, TASK

AREA & WORK UNIT NUMBERS

Department ot Operations Research - SOL

Stanford University 1111MAStauford, CA 94305-4022

I. CONTROLLING OFFICE NAME AND ADDRES 1. REPORT DATS

Office of Naval Research - Dept. of the Navy April 1988

800 N. Quincy Street 1I. NUMBEROFPAGES

Arlington, VA 22217 D, . 7714. MONITORING AGENCY NAME & ADORESS(('I dIjent m Covilmkflk1 o0riee) IS. SECURITY CLASS. (e t#do tep"o)

UNCLASSI FI ED .

IS&. DECI. ASSIFICATION/OWNGRADING ,SON DULE

1S. OiSTRIOUT1ON STATEMENT (of Mdde Report)

This document has been approved for public release and sale;its distribution is unlimited.

17. DISTRIBUTION STATEMENT (of the Shrebt' 4stored i Wlee 0L 0 lk1' m600 Repor)

Il. SUPPLEMENTARY NOTES

I. KEY WORDS (Coffimae o rever3 se~ IlI Menolw am Inod d f& hr eeko wmAer)

economic equilibliumcomplementarit problems

20. ABSTRACT (CUmwe en everesi 81neo*d eoUeem and Ids..ir by leek =mibe

Please see back of form.

DO ro , n 1473 EDIT ON oF I NOV IIs OsOLe.TE

SECURITY CLASIIFICATION OF THIS PAGE (USe. Da tnrewdO

W ~,r.~aaaW~" a~ ~ . p.~'~a 'W%~'. ' ~ .. l

SECURITY CLASSIICATION OP THIS PAOE'When Date Bras.r .

TR SOL 88-7: Formulation and Solution of Economic Equilibrium Problems, by John S. Stone

We develop and assess a number of equivalent mathematical formulations of the general equilibriumproblem in economics. We begin with the traditional representation as a nonlinear complementarityproblem and develop alternative representations as nonlinear optimization problems. All of ourformulations depart from previous approaches by including an explicit linear equality for pricenormalization and a matched artificial variable which must be zero at any equilibrium solution. Thisstructure has the theoretical and computational advantage that any linearization of the equilibrium ,

problem has a feasible complementary solution. Moreover, under a reasonable rank condition, a basiccomplementary solution exists which can be successfully computed by Lemke's almost-complementarypivoting method.

We describe five general-purpose methods which can be applied to solving the equilibrium problem.The common feature of these methods is solving a sequence of linearized problems. We establish anumber of equivalences between the methods, when applied to the equilibrium problem, and assesstheir local and global convergence properties in that context. An important new tool in this analysis isanother problem formulation based on a differentiable exact penalty function. This formulationprovides, perhaps for the first time, a rigorous means of evaluating the progress of a sequentialmethod for computing an equilibrium solution. Our analysis reveals a basic theoretical dilemma insolving general equilibrium problems by these sequential methods. One group of methods producesiterates that converge from any starting point, but the sequence may converge to a non-equilibriumpoint. Another group produces iterates that may fail to converge, but successfully convergingsequences do attain an equilibrium.

We perform a number of computational experiments on two small problems from the Jilerature. Theresults show considerable variation in the solution times for the various methods, but all methodssucceed in locating an equilibrium, even from poor starting points. This successful performance (inaddition to that reported by other researchers) suggests that the kinds of general equilibrium modelsformulated in practice possess certain favorable computational properties that theoretical analysis hasyet to discover.

I________________________________________________________ S*1

~ V ''SV'~%~% ~.~JW% 'S V %.V .. '~.V.%.V .~%

Documents

LaboratoryWe develop)and assess a number 4;.quivalent mathematical formulations of the general equi-librium problem in economics. V%.begin~with the traditional representation as a