
Semi-infinite Programming: An Introduction ("preliminary version")

June 29, 2004

Georg Still

Abstract

A semi-infinite programming problem is an optimization problem in which finitely many variables appear in infinitely many constraints. This model naturally arises in an abundant number of applications in different fields of mathematics, economics and engineering. The present paper, which intends to be a compromise between an introduction and a survey, treats the theoretical basis and numerical methods of the field.

1 Introduction

1.1 Problem formulation

A semi-infinite program (SIP) is an optimization problem in finitely many variables $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$ on a feasible set described by infinitely many constraints:

$$P: \quad \min_x f(x) \quad \text{s.t.} \quad g(x, y) \ge 0 \quad \forall y \in Y, \tag{1}$$

where $Y$ is an infinite index set. $\mathcal{F}$ will denote the feasible set, $v = \inf\{f(x) \mid x \in \mathcal{F}\}$ the optimal value and $S = \{x \in \mathcal{F} \mid f(x) = v\}$ the set of minimizers of the problem. We assume that $Y \subset \mathbb{R}^m$ is compact, $g(x, y)$ is continuous on $\mathbb{R}^n \times Y$ and, for any fixed $y \in Y$, the gradient $\nabla_x g(x, y)$ (w.r.t. $x$) exists and is continuous on $\mathbb{R}^n \times Y$.

We will also consider generalizations of SIP where the index set $Y = Y(x)$ is allowed to depend on $x$ (GSIP for short),

$$\min_x f(x) \quad \text{s.t.} \quad g(x, y) \ge 0 \quad \forall y \in Y(x). \tag{2}$$


During the last five decades the field of semi-infinite programming has seen a tremendous development. More than 1000 articles have been published on the theory, numerical methods and applications of SIP. Originally the research on SIP was strongly related to (Chebyshev) approximation, see e.g. [12], [8]. As excellent review articles on the field we refer to Polak [25] and Hettich/Kortanek [9]. Since a first contribution [13] the generalized SIP problem (2) has become a topic of intensive research (see e.g. [22] and [32]). For an extensive treatment of linear semi-infinite problems we refer to the book by Goberna/Lopez [10].

REMARK 1 For brevity and a clearer presentation we omit additional equality constraints $h_i(x) = 0$ in the formulation of SIP. It is not difficult to generalize (under appropriate assumptions on the $h_i$) all results in this paper to this more general situation.

The paper is organized as follows. The next section presents some illustrative applications of SIP and GSIP. Section 3 discusses the structure of the feasible set of finite and semi-infinite problems. Section 4 derives first order necessary and sufficient optimality conditions. Linear semi-infinite and linear semidefinite programming are treated in Section 5. The following Section 6 presents second order optimality conditions based on the so-called reduction approach. In Section 7 different numerical methods for solving SIP are surveyed. Finally, the last section discusses the relation between SIP and bilevel programming and mathematical programs with equilibrium constraints. The appendix gives some additional definitions and facts from optimization.

2 Applications

In [25], [9] and [10] many applications of SIP are mentioned in different fields such as (Chebyshev) approximation, robotics, boundary- and eigenvalue problems from Mathematical Physics, engineering design, optimal control, transportation problems, fuzzy sets, cooperative games, robust optimization and statistics. From this large list of applications we have chosen some illustrative examples.

Chebyshev Approximation: Let there be given a function $f(y) \in C(\mathbb{R}^m, \mathbb{R})$ and a space of approximating functions $p(x, y)$, $p \in C(\mathbb{R}^n \times \mathbb{R}^m, \mathbb{R})$, parameterized by $x \in \mathbb{R}^n$. We want to approximate $f$ by functions $p(x, \cdot)$ in the max-norm (Chebyshev norm) $\|f\|_\infty = \max_{y \in Y} |f(y)|$ on a compact set $Y \subset \mathbb{R}^m$. Minimizing the approximation error $\varepsilon = \|f - p\|_\infty$ leads to the problem:

$$\min_{x, \varepsilon} \; \varepsilon \quad \text{s.t.} \quad g_\pm(x, y) := \pm\big(f(y) - p(x, y)\big) \le \varepsilon \quad \text{for all } y \in Y. \tag{3}$$

This is a semi-infinite problem.
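Numerically, (3) is typically attacked by replacing $Y$ with a fine finite grid, which turns it into an ordinary linear program (cf. the discretization methods in Section 7.3). The following minimal Python sketch illustrates this; the instance (approximating $f(y) = e^y$ on $Y = [0, 1]$ by a polynomial), the grid size and the basis are our own illustrative assumptions:

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative instance: approximate f(y) = exp(y) on Y = [0, 1] by a
# polynomial p(x, y) = x_0 + x_1*y + ... + x_{n-1}*y^(n-1).
# On a finite grid, problem (3) becomes an LP in the variables (x, eps).
n = 4
grid = np.linspace(0.0, 1.0, 201)        # finite grid Y' approximating Y
f_vals = np.exp(grid)
V = np.vander(grid, n, increasing=True)  # V[i, j] = y_i^j, so p(x, y_i) = V[i] @ x

c = np.zeros(n + 1); c[-1] = 1.0         # variables z = (x, eps); minimize eps
# f(y) - p(x, y) <= eps   <=>  -V x - eps <= -f
# p(x, y) - f(y) <= eps   <=>   V x - eps <=  f
ones = np.ones((len(grid), 1))
A_ub = np.block([[-V, -ones], [V, -ones]])
b_ub = np.concatenate([-f_vals, f_vals])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * (n + 1))
print("approximation error eps =", res.x[-1])
print("coefficients x =", res.x[:-1])
```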

The so-called reverse Chebyshev problem consists of fixing the approximation error $\varepsilon$ and making the region $Y$ as large as possible (see [16] for such problems). Suppose the set $Y = Y(d)$ is parameterized by $d \in \mathbb{R}^k$ and $v(d)$ denotes the volume of $Y(d)$ (e.g. $Y(d) = \prod_{i=1}^{k} [-d_i, d_i]$). The reverse Chebyshev problem then leads to the GSIP ($\varepsilon$ fixed)

$$\max_{d, x} \; v(d) \quad \text{s.t.} \quad g_\pm(x, y) := \pm\big(f(y) - p(x, y)\big) \le \varepsilon \quad \text{for all } y \in Y(d), \tag{4}$$

where the index set $Y(d)$ depends on the variable $d$. We refer the reader again to [16].

Mathematical Physics. The so-called defect minimization approach leads to semi-infinite programming models for solving problems from Mathematical Physics; this approach differs from the common finite element and finite difference approaches. We give an example.

Shape optimization problem: Consider the following boundary-value problem.

BVP: Given $G_0 \subset \mathbb{R}^m$ ($G_0$ a simply connected open region with smooth boundary $\partial G_0$ and closure $\overline{G}_0$) and $k > 0$, find a function $u \in C^2(\overline{G}_0, \mathbb{R})$ such that, with the Laplacian $\Delta u = u_{y_1 y_1} + \ldots + u_{y_m y_m}$:

$$\Delta u(y) = k, \quad \forall y \in G_0$$
$$u(y) = 0, \quad \forall y \in \partial G_0$$

By choosing a linear space of appropriate trial functions $u_j \in C^2(\mathbb{R}^m, \mathbb{R})$,

$$S = \Big\{ u(x, y) = \sum_{i=1}^{n-1} x_i u_i(y) \Big\},$$

this BVP can be solved approximately via the following SIP:

$$\min_{\varepsilon, x} \; \varepsilon \quad \text{s.t.} \quad \pm\big(\Delta u(x, y) - k\big) \le \varepsilon, \; y \in G_0; \qquad \pm u(x, y) \le \varepsilon, \; y \in \partial G_0.$$

In [6] the related but more complicated so-called Shape Optimization Problem has been considered theoretically.

SOP: Find a (simply connected) region $G \subset \mathbb{R}^m$ with normalized volume $\mu(G) = 1$ and a function $u \in C^2(G, \mathbb{R})$ which, with a given objective function $F(G, u)$, solves the problem

$$\min_{\mu(G)=1,\, u} F(G, u) \quad \text{s.t.} \quad \Delta u(y) = k, \; \forall y \in G; \qquad u(y) = 0, \; \forall y \in \partial G.$$

This is a problem with a variable region $G$ and can be solved approximately via the following GSIP problem:

Choose some appropriate set of regions $G(z)$ depending on a parameter $z \in \mathbb{R}^p$ and satisfying $\mu(G(z)) = 1$ for all $z$. Fix some small error bound $\varepsilon > 0$. Then we solve, with trial functions $u(x, y) \in S$, the program

$$\min_{z, x} \; F(G(z), u(x, \cdot)) \quad \text{s.t.} \quad \pm\big(\Delta u(x, y) - k\big) \le \varepsilon, \; \forall y \in G(z); \qquad \pm u(x, y) \le \varepsilon, \; \forall y \in \partial G(z).$$

For further contributions to the theory and numerical results of this approach we refer e.g. to [31], [9].

Robotics. Control problems in robotics can often be modeled as a semi-infinite problem. We refer to Ex. 2.1 in [9] and the many references therein (cf. also [11]). As another example we discuss the so-called maneuverability problem:

Let $\theta = \theta(\tau) \in \mathbb{R}^m$ denote the position of the so-called tool center point of the robot (in robot coordinates) at time $\tau$. Let $\dot{\theta}, \ddot{\theta}$ be the corresponding velocities and accelerations (derivatives w.r.t. $\tau$). The dynamical equation has (often) the form

$$g(\theta, \dot{\theta}, \ddot{\theta}) := A(\theta)\ddot{\theta} + F(\theta, \dot{\theta}) = K,$$

with (external) forces $K \in \mathbb{R}^m$. Here $A(\theta)$ is the inertia matrix and $F$ describes the friction, gravity, centrifugal forces, etc. The forces $K$ are bounded by

$$K^- \le K \le K^+.$$

For fixed $\theta, \dot{\theta}$, the set of feasible (possible) accelerations is given by

$$Z(\theta, \dot{\theta}) = \{\ddot{\theta} \mid K^- \le g(\theta, \dot{\theta}, \ddot{\theta}) \le K^+\}.$$

Note that, since $g$ is linear in $\ddot{\theta}$, for fixed $(\theta, \dot{\theta})$ the set $Z(\theta, \dot{\theta})$ is convex (an intersection of half-spaces). Now let an 'operating region' $Q$ be given, e.g.

$$Q = \{(\theta, \dot{\theta}) \in \mathbb{R}^{2m} \mid (\theta^-, \dot{\theta}^-) \le (\theta, \dot{\theta}) \le (\theta^+, \dot{\theta}^+)\}$$

with bounds $(\theta^-, \dot{\theta}^-)$ and $(\theta^+, \dot{\theta}^+)$. Then the set of feasible accelerations $\ddot{\theta}$ (accelerations which can be realized in every point $(\theta, \dot{\theta}) \in Q$) becomes

$$Z_0 = \bigcap_{(\theta, \dot{\theta}) \in Q} Z(\theta, \dot{\theta}) = \{\ddot{\theta} \mid K^- \le g(\theta, \dot{\theta}, \ddot{\theta}) \le K^+ \text{ for all } (\theta, \dot{\theta}) \in Q\}.$$

The set $Z_0$ is convex (as an intersection of the convex sets $Z$). For the steering of the robot one has to check whether a desired acceleration $\ddot{\theta}$ is possible, i.e., whether $\ddot{\theta} \in Z_0$. Often this check takes too much time due to the complicated description of $Z_0$. Then one is interested in a simple body $Y$ (e.g. a ball), as large as possible, which is contained in $Z_0$. Instead of the test $\ddot{\theta} \in Z_0$ one performs the (quicker) check $\ddot{\theta} \in Y$. Suppose the body $Y(d)$ depends on the parameter $d \in \mathbb{R}^q$ and $v(d)$ is the volume of $Y(d)$. Then maximizing the volume of the body gives the following GSIP, called the maneuverability problem:

$$\max_d \; v(d) \quad \text{s.t.} \quad K^- \le g(\theta, \dot{\theta}, \ddot{\theta}) \le K^+ \quad \text{for all } (\theta, \dot{\theta}) \in Q, \; \ddot{\theta} \in Y(d). \tag{5}$$


Geometry. Semi-infinite problems can be interpreted in a geometrical setting. Given a family of sets $S(x) \subset \mathbb{R}^p$ depending on $x \in \mathbb{R}^n$, we wish to find a 'body' $Y$ of a certain class and a value $x$ such that $Y$ is contained in $S(x)$ and $Y$ is as large as possible.

The mathematical formulation is as follows. Suppose $S(x)$ is defined by

$$S(x) = \{y \in \mathbb{R}^p \mid g(x, y, t) \ge 0 \text{ for all } t \in Q\},$$

where $Q$ is a given compact set in $\mathbb{R}^s$ and $g \in C^2(\mathbb{R}^n \times \mathbb{R}^p \times \mathbb{R}^s, \mathbb{R})$. Let the body $Y(d) \subset \mathbb{R}^p$ be parameterized by $d \in \mathbb{R}^q$, with $v(d)$ a measure for the size of $Y(d)$ (e.g. the volume). Maximizing $v(d)$ subject to $Y(d) \subset S(x)$ then becomes:

$$\max_{d, x} \; v(d) \quad \text{s.t.} \quad g(x, y, t) \ge 0 \quad \text{for all } y \in Y(d), \; t \in Q. \tag{6}$$

For the case that the set $S$ is fixed, this problem is known as the design centering problem.

Optimization under uncertainty (robust optimization). As a concrete situation we consider a linear program

$$\min_x c^T x \quad \text{s.t.} \quad a_j^T x - b_j \ge 0, \; \forall j \in J,$$

$J$ a finite index set. Often the data $a_j$ and $b_j$ in the model are not known exactly. It is only known that the vectors $(a_j, b_j)$ may vary in a set $Y_j \subset \mathbb{R}^{n+1}$. In a "pessimistic" way we can now restrict the problem to those $x$ which are feasible for all possible data vectors, leading to a SIP

$$\min_x c^T x \quad \text{s.t.} \quad a^T x - b \ge 0, \; \forall (a, b) \in Y := \cup_{j \in J} Y_j.$$

In the next example we discuss such a "robust optimization" model in economics. For more details we refer to [29], [35], [3], [23] and [1].

Economics. We consider the following portfolio problem. We wish to invest $K$ euros in $n$ shares, say for a period of one year. We invest $x_i$ euros in share $i$ and expect at the end of the period a return of $y_i$ euros per euro invested in share $i$. Our goal is to maximize the portfolio value $v = y^T x$ after a year, where $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$. The problem is that the value $y$ is not known in advance. However, models from economics often suggest that the vector $y$ will vary in some compact subset $Y$ of $\mathbb{R}^n$. In this case we are led to solve the linear SIP:

$$\max_{v, x} \; v \quad \text{s.t.} \quad y^T x - v \ge 0 \quad \forall y \in Y, \qquad \sum_i x_i = K, \; x \ge 0.$$

We refer to [30] for details and numerical experiments.
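Replacing the compact set $Y$ by a finite set of return scenarios turns this LSIP into an ordinary LP, which conveys the worst-case character of the model. A minimal Python sketch with made-up scenario data (purely illustrative, not taken from [30]):

```python
import numpy as np
from scipy.optimize import linprog

# Worst-case portfolio: max v  s.t.  y @ x >= v for all scenarios y,
# sum(x) = K, x >= 0.  The scenario matrix Y below is made up.
Y = np.array([[1.05, 1.10, 0.90],
              [1.02, 0.95, 1.20],
              [1.00, 1.00, 1.00]])
n, K = Y.shape[1], 1.0
c = np.zeros(n + 1); c[-1] = -1.0              # variables (x, v); minimize -v
A_ub = np.hstack([-Y, np.ones((len(Y), 1))])   # v - y @ x <= 0 per scenario
b_ub = np.zeros(len(Y))
A_eq = np.array([[1.0] * n + [0.0]])           # sum_i x_i = K
b_eq = np.array([K])
bounds = [(0, None)] * n + [(None, None)]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print("worst-case value v =", res.x[-1], "portfolio x =", res.x[:-1])
```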

Other applications. There are further interesting applications in game theory (see [27]), in data envelopment analysis (see [18]) and in probability theory (see [5]).


3 Constraints and constraint qualifications

3.1 Constraints

In this section we discuss the structure of the feasible sets $\mathcal{F}$ of finite and semi-infinite optimization problems.

$\mathcal{F}$ in linear programming (LP): The feasible set is defined by finitely many linear constraints

$$\mathcal{F} = \{x \mid a_j^T x \ge b_j, \; j \in J\}$$

with a finite index set $J = \{1, \ldots, m\}$ and $a_j \in \mathbb{R}^n$, $b_j \in \mathbb{R}$.

$\mathcal{F}$ in semi-infinite linear programming (LSIP): The feasible set is defined by infinitely many linear constraints

$$\mathcal{F} = \{x \mid a_y^T x \ge b_y, \; y \in Y\} \tag{7}$$

with an infinite index set $Y \subset \mathbb{R}^m$ and functions $y \to a_y \in \mathbb{R}^n$, $y \to b_y \in \mathbb{R}$.

Any such set $\mathcal{F}$ is closed and convex; conversely, every closed convex set can be represented in this form, as can be proven with an appropriate separation theorem.

LEMMA 1 A subset $\mathcal{F} \subset \mathbb{R}^n$ is convex and closed iff it can be defined in the form (7).

Proof. To prove that a closed convex set $\mathcal{F}$ can be written in the form (7), choose a point $y \notin \mathcal{F}$ arbitrarily. In view of the Separation Theorem (e.g., [7, Th. 10.1]) the point $y \notin \mathcal{F}$ can be separated from the closed convex set $\mathcal{F}$ by a separating halfspace $H_y^{\ge} = \{x \mid a_y^T x \ge b_y\}$, i.e.,

$$a_y^T x \ge b_y > a_y^T y, \quad \forall x \in \mathcal{F}.$$

So the set $\mathcal{F}$ can be written as

$$\mathcal{F} = \bigcap_{y \notin \mathcal{F}} H_y^{\ge} = \{x \mid a_y^T x \ge b_y, \; \forall y \notin \mathcal{F}\}. \quad \Box$$

$\mathcal{F}$ in finite programming (FP): The feasible set is defined by finitely many (nonlinear) constraints

$$\mathcal{F} = \{x \mid g_j(x) \ge 0, \; j \in J\} \tag{8}$$

where $J = \{1, \ldots, m\}$ is finite and the $g_j: \mathbb{R}^n \to \mathbb{R}$ are continuous.

$\mathcal{F}$ in semi-infinite programming (SIP): The feasible set is defined by infinitely many (nonlinear) constraints

$$\mathcal{F} = \{x \mid g(x, y) \ge 0, \; y \in Y\} \tag{9}$$

with an infinite index set $Y \subset \mathbb{R}^m$ and continuous $g: \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$.

$\mathcal{F}$ in generalized semi-infinite programming (GSIP):

$$\mathcal{F} = \{x \mid g(x, y) \ge 0, \; y \in Y(x)\}$$

with variable index set $Y(x)$ defined by a set-valued mapping $Y: \mathbb{R}^n \rightrightarrows \mathbb{R}^m$ and continuous $g: \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$.

The topological structure of the feasible sets $\mathcal{F}$ in FP and SIP is the same.

EX. 1 The feasible sets $\mathcal{F}$ in FP and SIP are closed.

The feasible set in GSIP need not be closed.

EX. 2 Consider the problem

$$\min_{x \in \mathbb{R}} x \quad \text{s.t.} \quad x \ge 1 - y, \; \forall y \in Y(x) = \{y \in \mathbb{R} \mid 0 \le y, \; y \le -x\}.$$

Obviously the feasible set consists of the open interval $(0, \infty)$ ($Y(x) = \emptyset$ means that $x$ is feasible). A minimizer of the program does not exist. The problem here is that the index set, $Y(x) = [0, -x]$ for $x \le 0$ and $Y(x) = \emptyset$ for $x > 0$, is not continuous at $x = 0$.

If $Y(x)$ is continuous, then non-closedness of $\mathcal{F}$ is excluded.

LEMMA 2 Suppose the mapping $Y: K \rightrightarrows \mathbb{R}^m$ is continuous on a compact set $K \subset \mathbb{R}^n$. Then the feasible set $\mathcal{F} \cap K$ of GSIP is compact (in particular closed).

3.2 Constraint qualifications (CQ)

Throughout this subsection we assume that the $g_j$ and $g$ are $C^1$-functions.

CQ in FP: For $x \in \mathcal{F}$ (see (8)) we introduce the active index set $J_0(x) = \{j \in J \mid g_j(x) = 0\}$. Locally near $x$ the feasible set is defined by the constraints $g_j \ge 0$, $j \in J_0(x)$. The Linear Independence Constraint Qualification (LICQ) is said to hold at $\bar{x}$ if the vectors

$$\nabla g_j(\bar{x}), \; j \in J_0(\bar{x}), \quad \text{are linearly independent.}$$

Under LICQ at $\bar{x} \in \mathcal{F}$, locally (up to a diffeomorphism) the feasible set has a simple geometrical form.

LEMMA 3 Let the condition LICQ hold at $\bar{x} \in \mathcal{F}$ with, say, $J_0(\bar{x}) = \{1, \ldots, p\}$ ($p \le n$). Then in a (sufficiently small) neighborhood $U$ of $\bar{x}$, "up to a local diffeomorphism", the set $\mathcal{F} \cap U$ is given by

$$x \in U \quad \text{s.t.} \quad x_j \ge 0, \; j = 1, \ldots, p.$$

Proof. By LICQ the gradients $\nabla g_j(\bar{x})$, $j = 1, \ldots, p$, are linearly independent and we can complete these vectors to a basis of $\mathbb{R}^n$ by adding vectors $v_j$, $j = p+1, \ldots, n$. Now we define the transformation $y = T(x)$ by

$$y_j = g_j(x), \; j = 1, \ldots, p; \qquad y_j = v_j^T(x - \bar{x}), \; j > p.$$

By construction, the Jacobian $\nabla T(\bar{x})$ is regular. So $T$ locally defines a diffeomorphism. This means that there exist $\varepsilon > 0$ and neighborhoods $B_\varepsilon(\bar{x})$ of $\bar{x}$ and $U_\varepsilon(\bar{y}) := T(B_\varepsilon(\bar{x}))$ of $\bar{y} = 0$ such that $T: B_\varepsilon(\bar{x}) \to U_\varepsilon(\bar{y})$ is a bijective mapping with $T, T^{-1} \in C^1$, $T(\bar{x}) = \bar{y}$, and for $y = T(x)$ it follows:

$$x \in \mathcal{F} \cap B_\varepsilon(\bar{x}) \;\Leftrightarrow\; y_j \ge 0, \; j = 1, \ldots, p, \quad y \in U_\varepsilon(\bar{y}).$$

In particular, since $T$ is a diffeomorphism, the distance between two points remains 'equivalent' in the sense that with constants $0 < \kappa_- < \kappa_+$:

$$\kappa_- \|y_1 - y_2\| \le \|x_1 - x_2\| \le \kappa_+ \|y_1 - y_2\| \quad \forall x_1, x_2 \in B_\varepsilon(\bar{x}), \; y_1 = T(x_1), \; y_2 = T(x_2). \quad \Box$$

LICQ even assures the following "analytic" stability result (presented in an "informal" manner; for details we refer to [21, Th. 6.3.1]).

Let the feasible set $\mathcal{F}$ of a FP be compact and suppose LICQ holds at every point $x \in \mathcal{F}$. Then for any (small) $C^1$-perturbation $\tilde{g}_j$ of the functions $g_j$ the perturbed feasible set $\tilde{\mathcal{F}} = \{x \mid \tilde{g}_j(x) \ge 0, \; j \in J\}$ is diffeomorphic to $\mathcal{F}$.

We now present a "bad example": $\mathcal{F}$ is described by the inequalities in $\mathbb{R}^3$

$$g_1(x) := x_3 \ge 0, \qquad g_2(x) := x_3 + x_1^2 + x_2^2 \le 0.$$

Obviously $\mathcal{F} = \{0\}$ has (only) dimension 0, and any small perturbation $\tilde{g}_1(x) := x_3 - \varepsilon \ge 0$ ($\varepsilon > 0$) yields an empty feasible set.

The following constraint qualification (which is weaker than LICQ, see Ex. 3) excludes such bad behavior. The Mangasarian-Fromovitz Constraint Qualification (MFCQ) is said to hold at $\bar{x}$ if there exists a vector $d \in \mathbb{R}^n$ such that

$$\nabla g_j(\bar{x}) d > 0, \quad j \in J_0(\bar{x}).$$

EX. 3 If LICQ holds at $\bar{x} \in \mathcal{F}$, then MFCQ is satisfied as well.

EX. 4 Let MFCQ hold at $\bar{x} \in \mathcal{F}$ with the vector $d$. Then for any $\tau > 0$ small enough the points $\bar{x} + \tau d$ are interior points of $\mathcal{F}$.

Structural stability is equivalent to MFCQ, as is shown in [17]. (We again present the result informally.)

Let the feasible set $\mathcal{F}$ of a FP be compact. Then for any (small) $C^1$-perturbation $\tilde{g}_j$ of the functions $g_j$ the perturbed feasible set $\tilde{\mathcal{F}} = \{x \mid \tilde{g}_j(x) \ge 0, \; j \in J\}$ is (Lipschitz-) homeomorphic to $\mathcal{F}$ if and only if MFCQ is satisfied at every point $x \in \mathcal{F}$.


CQ in SIP: For $x \in \mathcal{F}$ (see (9)) we define the active index set $Y_0(x) = \{y \in Y \mid g(x, y) = 0\}$. The Linear Independence Constraint Qualification (LICQ) is said to hold at $\bar{x} \in \mathcal{F}$ if the vectors

$$\nabla_x g(\bar{x}, y), \; y \in Y_0(\bar{x}), \quad \text{form a linearly independent set.}$$

Note that if LICQ holds at $\bar{x}$, the active set $Y_0(\bar{x})$ cannot contain more than $n$ points. The Mangasarian-Fromovitz Constraint Qualification (MFCQ) is said to hold at $\bar{x}$ if there exists a vector $d \in \mathbb{R}^n$ such that

$$\nabla_x g(\bar{x}, y) d > 0, \quad \forall y \in Y_0(\bar{x}). \tag{10}$$

The statements of Ex. 3 and Ex. 4 remain true in the SIP case, as does the structural stability result (see [20] for a precise formulation and a proof):

Let the feasible set $\mathcal{F}$ of a SIP be compact. Then for any (small) $C^1$-perturbation $\tilde{g}$ of the function $g$ the perturbed feasible set $\tilde{\mathcal{F}} = \{x \mid \tilde{g}(x, y) \ge 0, \; y \in Y\}$ is (Lipschitz-) homeomorphic to $\mathcal{F}$ if and only if MFCQ is satisfied at every point $x \in \mathcal{F}$.

4 First order optimality conditions

In this section, first order optimality conditions are derived for the SIP problem $P$ in (1). We assume $f, g \in C^1$. A feasible point $\bar{x} \in \mathcal{F}$ is called a local minimizer of SIP if there is some $\varepsilon > 0$ such that

$$f(x) - f(\bar{x}) \ge 0 \quad \text{for all } x \in \mathcal{F} \text{ with } \|x - \bar{x}\| < \varepsilon. \tag{11}$$

The minimizer $\bar{x}$ is said to be global if this relation holds for any $\varepsilon > 0$. We call $\bar{x} \in \mathcal{F}$ a strict local minimizer of order $p > 0$ if there exist some $q > 0$ and $\varepsilon > 0$ such that

$$f(x) - f(\bar{x}) \ge q \|x - \bar{x}\|^p \quad \text{for all } x \in \mathcal{F} \text{ with } \|x - \bar{x}\| < \varepsilon. \tag{12}$$

It is not difficult to see that near a point $\bar{x} \in \mathcal{F}$ where the active index set $Y_0(\bar{x}) = \{y \in Y \mid g(\bar{x}, y) = 0\}$ is empty, the SIP problem represents a common unconstrained minimization problem. So throughout the paper we assume that for any (candidate) minimizer $\bar{x}$ of $P$ the condition $Y_0(\bar{x}) \ne \emptyset$ is satisfied.

LEMMA 4 [Primal necessary optimality condition] Let $\bar{x} \in \mathcal{F}$ be a local minimizer of $P$. Then there cannot exist a strictly feasible descent direction $d$, i.e., a vector $d \in \mathbb{R}^n$ satisfying the relations

$$\nabla f(\bar{x}) d < 0, \qquad \nabla_x g(\bar{x}, y) d > 0, \; \forall y \in Y_0(\bar{x}).$$


Proof. Let $d$ be a strictly feasible descent direction. Then $f(\bar{x} + \tau d) < f(\bar{x})$ holds for small $\tau > 0$ (recall $f \in C^1$). We claim that there exists some $\tau_0 > 0$ with the property

$$g(\bar{x} + \tau d, y) > 0 \quad \text{for all } 0 < \tau \le \tau_0 \text{ and } y \in Y,$$

showing that $\bar{x}$ cannot be a local minimizer. Suppose that the claim is false. Then to each $k \ge 1$ there exist some $0 < \tau_k < 1/k$ and some $y_k \in Y$ such that $g(\bar{x} + \tau_k d, y_k) \le 0$. Since $Y$ is compact, there must exist some convergent subsequence $(\tau_{k_s}, y_{k_s})$ such that $\tau_{k_s} \to 0$ and $y_{k_s} \to y^* \in Y$. The continuity of $g$ then yields $g(\bar{x} + \tau_{k_s} d, y_{k_s}) \to g(\bar{x}, y^*)$, which implies $g(\bar{x}, y^*) = 0$ and $y^* \in Y_0(\bar{x})$. On the other hand, the Mean Value Theorem provides us with numbers $0 < \eta_{k_s} < \tau_{k_s}$ such that

$$0 \ge g(\bar{x} + \tau_{k_s} d, y_{k_s}) - g(\bar{x}, y_{k_s}) = \tau_{k_s} \nabla_x g(\bar{x} + \eta_{k_s} d, y_{k_s}) d$$

and hence $\nabla_x g(\bar{x} + \eta_{k_s} d, y_{k_s}) d \le 0$. So the continuity of $\nabla_x g$ entails

$$\nabla_x g(\bar{x}, y^*) d = \lim_{s \to \infty} \nabla_x g(\bar{x} + \eta_{k_s} d, y_{k_s}) d \le 0,$$

which contradicts the hypothesis on $d$. $\Box$

THEOREM 1 [First order sufficient condition] Let $\bar{x}$ be feasible for $P$. Suppose that there does not exist a vector $0 \ne d \in \mathbb{R}^n$ satisfying

$$\nabla f(\bar{x}) d \le 0, \qquad \nabla_x g(\bar{x}, y) d \ge 0, \; \forall y \in Y_0(\bar{x}).$$

Then $\bar{x}$ is a strict local minimizer of SIP of order $p = 1$.

Proof. If $\bar{x}$ is an isolated feasible point then the result is trivially fulfilled. If not, then it is not difficult to see that the set $K(\bar{x}) := \{d \in \mathbb{R}^n \mid \|d\| = 1, \; \nabla_x g(\bar{x}, y) d \ge 0, \; \forall y \in Y_0(\bar{x})\}$ is nonempty. (Consider, with a sequence of $x_k \in \mathcal{F}$, $x_k \to \bar{x}$, an accumulation point $d$ of $(x_k - \bar{x})/\|x_k - \bar{x}\|$.) Since $K(\bar{x})$ is also compact, the value

$$c_1 = \min_{d \in K(\bar{x})} \nabla f(\bar{x}) d$$

is attained by some $d' \in K(\bar{x})$. By assumption, $c_1 = \nabla f(\bar{x}) d' > 0$ holds. Fix any $0 < c < c_1$. We claim that there is some $\varepsilon > 0$ with the property

$$f(x) - f(\bar{x}) \ge c \|x - \bar{x}\| \quad \text{for all } x \in \mathcal{F}, \; \|x - \bar{x}\| < \varepsilon. \tag{13}$$

Suppose to the contrary that this claim is false and there is an infinite sequence of feasible points $x_k \to \bar{x}$ with the property $f(x_k) - f(\bar{x}) < c \|x_k - \bar{x}\|$. We write $x_k = \bar{x} + \tau_k d_k$ with $\tau_k > 0$ and $d_k \in S = \{d \in \mathbb{R}^n \mid \|d\| = 1\}$. Then $x_k \to \bar{x}$ implies $\tau_k \to 0$. Moreover, the compactness of $S$ ensures that some subsequence of $(d_k)$ converges. Without loss of generality, we thus assume $d_k \to d$, $d \in S$.

The differentiability assumption together with the feasibility condition $g(x_k, y) \ge 0$ and the property $g(\bar{x}, y) = 0$ implies

$$c\,|\tau_k| > f(x_k) - f(\bar{x}) = \tau_k \nabla f(\bar{x}) d_k + o(|\tau_k|),$$
$$0 \le g(x_k, y) - g(\bar{x}, y) = \tau_k \nabla_x g(\bar{x}, y) d_k + o(|\tau_k|), \quad y \in Y_0(\bar{x}).$$

Divide all these relations by $\tau_k$. Letting $k \to \infty$ ($\tau_k \to 0$), we conclude that $\nabla_x g(\bar{x}, y) d \ge 0$, i.e., $d \in K(\bar{x})$, and $\nabla f(\bar{x}) d \le c$. This contradicts our choice of $c < c_1$. So (13) must be true. $\Box$

REMARK. The assumptions of Theorem 1 are rather strong and can be expected to hold only in special cases (in Chebyshev approximation, cf. e.g. [12]). It is not difficult to see that the assumptions imply that the set of gradients $\{\nabla_x g(\bar{x}, y) \mid y \in Y_0(\bar{x})\}$ contains a basis of $\mathbb{R}^n$; so in particular $|Y_0(\bar{x})| \ge n$. More general sufficient optimality conditions need second order information (cf. Section 6 below).

We now derive the famous Fritz John (FJ) and Karush-Kuhn-Tucker (KKT) optimality conditions.

THEOREM 2 [Dual Necessary Optimality Conditions] Let $\bar{x}$ be a local minimizer of $P$. Then the following holds.

(a) There exist multipliers $\mu_0, \mu_1, \ldots, \mu_{k+1} \ge 0$ and indices $y_1, \ldots, y_{k+1} \in Y_0(\bar{x})$, $k \le n$, such that $\sum_{j=0}^{k+1} \mu_j = 1$ and

$$\mu_0 \nabla f(\bar{x}) - \sum_{j=1}^{k+1} \mu_j \nabla_x g(\bar{x}, y_j) = 0. \quad \text{(FJ condition)} \tag{14}$$

(b) If MFCQ holds at $\bar{x}$, then there exist multipliers $\mu_1, \ldots, \mu_k \ge 0$ and indices $y_1, \ldots, y_k \in Y_0(\bar{x})$, $k \le n$, such that

$$\nabla f(\bar{x}) - \sum_{j=1}^{k} \mu_j \nabla_x g(\bar{x}, y_j) = 0. \quad \text{(KKT condition)} \tag{15}$$

Proof. Consider the set $S = \{\nabla f(\bar{x})\} \cup \{-\nabla_x g(\bar{x}, y) \mid y \in Y_0(\bar{x})\} \subseteq \mathbb{R}^n$. Since $\bar{x}$ is a local minimizer of $P$, there is no strictly feasible descent direction $d$ at $\bar{x}$ (cf. Lemma 4). This means that

there is no $d \in \mathbb{R}^n$ with $d^T s < 0$ for all $s \in S$.

Now $Y_0(\bar{x})$ is compact. Therefore (by continuity of $\nabla_x g(\bar{x}, \cdot)$) $S$ is compact as well. By Lemma 9 it follows that $0 \in \text{conv}\, S$. In view of Caratheodory's Lemma 7, $0$ is a convex combination of at most $n + 1$ elements of $S$, i.e.,

$$\sum_{j=0}^{k} \mu_j s_j = 0, \quad s_j \in S, \; \mu_j \ge 0, \; \sum_{j=0}^{k} \mu_j = 1, \; \text{with } k \le n, \tag{16}$$

which implies (a).

Now assume that $d \in \mathbb{R}^n$ is a strictly feasible direction at $\bar{x}$ (i.e., MFCQ holds). For statement (b) it suffices to show that $\mu_0 \ne 0$ in the representation (a) (as division by $\mu_0 > 0$ in (14) yields a representation of type (15)). Suppose to the contrary that $\mu_0 = 0$, and multiply (14) with $d$. Since $\mu_j > 0$ holds for at least one $j \ge 1$, we obtain the contradiction

$$0 > -\sum_{j=1}^{k+1} \mu_j \nabla_x g(\bar{x}, y_j) d = 0^T d = 0. \quad \Box$$

REMARK 2 Note that the results of this and the next section in particular remain true for a finite program. In fact, given a finite program (see (8)), we only have to choose a finite set $Y = \{y_1, \ldots, y_m\}$ and to identify $g(x, y_j) := g_j(x)$, $j = 1, \ldots, m$.

EX. 5 Assume that $\bar{x} \in \mathcal{F}$ is a KKT point such that (15) holds with $k = n$ linearly independent vectors $\nabla_x g(\bar{x}, y_j)$, $j = 1, \ldots, n$ (LICQ), and $\mu_j > 0$. Show that then the sufficient conditions of Theorem 1 are satisfied.

5 Convex and linear semi-infinite programs

5.1 Convex SIP

The semi-infinite program is called convex if the objective function $f(x)$ is convex and, for every index $y \in Y$, the constraint function $g_y(x) = g(x, y)$ is concave (i.e., $-g_y(x)$ is convex).

EX. 6 It is not difficult to show that for a convex SIP the feasible set is convex and that any local minimizer $\bar{x}$ is a global minimizer. Moreover (if $g \in C^1$), the existence of an MFCQ vector $d$ at a feasible point $\bar{x}$ is equivalent to the Slater condition: there exists a feasible point $\hat{x}$ such that $g(\hat{x}, y) > 0$ for all $y \in Y$.

Recall that a local minimizer of a convex program is actually a global one (see Ex. 6). As in finite programming, the KKT conditions are sufficient for optimality.

THEOREM 3 Assume that the SIP problem $P$ is convex. If $\bar{x}$ is a feasible point that satisfies the KKT condition (15), then $\bar{x}$ is a (global) minimizer of $P$.

Proof. By the convexity assumption, we have for every feasible $x$ and $y \in Y_0(\bar{x})$ (cf. Lemma 8)

$$f(x) - f(\bar{x}) \ge \nabla f(\bar{x})(x - \bar{x}),$$
$$0 \le g(x, y) = g(x, y) - g(\bar{x}, y) \le \nabla_x g(\bar{x}, y)(x - \bar{x}).$$

Hence, if there are multipliers $\mu_j \ge 0$ and index points $y_j \in Y_0(\bar{x})$ such that $\nabla f(\bar{x}) = \sum_{j=1}^{k} \mu_j \nabla_x g(\bar{x}, y_j)$, we conclude

$$f(x) - f(\bar{x}) \ge \nabla f(\bar{x})(x - \bar{x}) = \sum_{j=1}^{k} \mu_j \nabla_x g(\bar{x}, y_j)(x - \bar{x}) \ge 0. \quad \Box$$

5.2 Linear SIP

An important special case of (convex) SIP is given by the linear semi-infinite problem (LSIP), where the objective function $f$ and the function $g$ are linear in $x$:

$$P: \quad \min_x c^T x \quad \text{s.t.} \quad a_y^T x \ge b_y \quad \forall y \in Y. \tag{17}$$

For an intensive treatment of LSIP we refer to the monograph [10]. Here we only derive strong duality results.

Recall that any convex optimization problem

$$\min_x c^T x \quad \text{s.t.} \quad x \in \mathcal{F}, \quad \mathcal{F} \subset \mathbb{R}^n \text{ a closed convex set},$$

can be written as an LSIP. LSIP is said to be continuous if $Y$ is compact and the functions $y \to a_y = a(y)$, $y \to b_y = b(y)$ are continuous on $Y$.

EX. 7 Let $P$ be a continuous LSIP. Show that the function $\varphi(x) := \min_{y \in Y} \{a(y)^T x - b(y)\}$ is concave. So LSIP can be written as a convex program: $\min c^T x$ s.t. $\varphi(x) \ge 0$.

Assume from now on that LSIP is continuous. We also write (17) formally in the familiar form

$$(P) \quad \min_x c^T x \quad \text{s.t.} \quad Ax \ge b,$$

where $A$ is a matrix with infinitely many rows $a_y^T$, $y \in Y$, and $b$ is a vector with infinitely many components $b_y$. We now define the (so-called Haar) dual of (P) as

$$(D) \quad \max_u b^T u \quad \text{s.t.} \quad A^T u = c, \; u \ge 0,$$

with the following understanding: $u = (u_y)$ is dual feasible if $u_y \ge 0$, $y \in Y$, and $u_y > 0$ for only finitely many $y \in Y$. So $b^T u = \sum_y b_y u_y$ is actually a finite sum and so is $A^T u = \sum_y a_y u_y$. Note that, by definition,

$$(D) \text{ is feasible} \iff c \in \text{cone}\{a_y \mid y \in Y\}. \tag{18}$$

The optimal objective function values of (P) and (D) will be denoted by $v_P$ and $v_D$.


REMARK. Recall from Caratheodory's Lemma 7 that the sum $c = \sum_y a_y u_y$ can be expressed as a sum with at most $n$ non-zero coefficients $u_y$.

As in LP, if $x \in \mathbb{R}^n$ and $u = (u_y)$ are primal resp. dual feasible, then

$$c^T x - u^T b = u^T (Ax - b) = \sum_y u_y (a_y^T x - b_y) \ge 0.$$

THEOREM 4 (Weak duality, complementary slackness) If $x$ and $u$ are primal resp. dual feasible, then $c^T x \ge b^T u$. If $c^T x = b^T u$, then $x$ resp. $u$ are optimal with $v_P = v_D$.

We emphasize that, in contrast to ordinary (finite) linear programs, linear semi-infinite problems do not necessarily have the strong duality property unless additional constraint qualifications hold (cf. Ex. 8). A further notable difference to finite linear programming is the fact that the existence of primal and dual feasible solutions need not imply the existence of optimal solutions (cf. Ex. 9). To ensure, e.g., the existence of an optimal solution of (P) we may assume strict feasibility of the dual program.

The Slater constraint qualification for (P) assumes the existence of a strictly feasible primal solution (Slater point, see also Ex. 6):

(SCP) There exists some $\hat{x} \in \mathbb{R}^n$ with $A\hat{x} > b$.

We also strengthen the dual feasibility condition (18) to the dual Slater condition:

(SCD) $c \in \text{int}\, \text{cone}\{a_y \mid y \in Y\}$.

One can show that a feasible LSIP has a minimal solution if the condition (SCD) is fulfilled. This leads to the following strong duality result (see e.g. [7] for a proof).

THEOREM 5 (Strong duality) Assume that the Slater conditions (SCP) and (SCD) are satisfied, i.e., (P) and (D) are strictly feasible. Then optimal primal and dual solutions $\bar{x}$ and $\bar{u}$ exist, and $v_P = v_D$ holds. Moreover, in this case $\bar{x}$ is an optimal solution of (P) if and only if $\bar{x}$ is a KKT point, i.e., there exists a dually feasible $\bar{u}$ satisfying $\bar{u}_{y_j} \ge 0$ for $y_j \in Y_0(\bar{x})$, $j = 1, \ldots, k$, $k \le n$, and $\bar{u}_y = 0$ otherwise, such that

$$c = \sum_{j=1}^{k} \bar{u}_{y_j} a_{y_j}.$$

In particular $\bar{u}^T (A\bar{x} - b) = 0$.

EX. 8 Let $\gamma > 0$ be fixed. Show $v_P = 0$ and $v_D = -\gamma$, and that neither the primal nor the dual Slater condition is satisfied for the linear SIP

$$\min -x_2 \quad \text{s.t.} \quad -y^2 x_1 - y x_2 \ge 0, \; y \in Y = [0, 1], \quad \text{and} \quad x_2 \le \gamma.$$


EX. 9 Show that $\mathcal{F} = \{(x_1, x_2) \mid x_2 \ge e^{x_1} \text{ for } x_1 \le 0 \text{ and } x_2 \ge x_1 + 1 \text{ for } x_1 \ge 0\}$ is the feasible set of the LSIP

$$(P) \quad \min x_2 \quad \text{s.t.} \quad -y x_1 + x_2 \ge y(1 - \ln y), \; y \in Y = [0, 1].$$

Show furthermore: (D) is feasible and (P) satisfies the primal Slater condition but does not have an optimal solution. (Consequently, (SCD) cannot be satisfied.)

5.3 Linear semidefinite programs

A semidefinite program (SDP) is of the type

$$(P): \quad \min_{x \in \mathbb{R}^n} c^T x \quad \text{s.t.} \quad \sum_{i=1}^{n} A_i x_i - B \succeq 0 \quad \text{(positive semidefinite)}, \tag{19}$$

where $B, A_1, \ldots, A_n \in \mathcal{S}^{k \times k}$ are given $k \times k$ matrices; $\mathcal{S}^{k \times k}$ denotes the set of (real) symmetric $k \times k$ matrices. By the following trick a semidefinite program can be transformed into an LSIP. Recall that for $S = (s_{ij}) \in \mathcal{S}^{k \times k}$ the relation

$$S \succeq 0 \;\Leftrightarrow\; y^T S y = \sum_{i,j} s_{ij} y_i y_j \ge 0 \quad \forall y \in Y := \{y \in \mathbb{R}^k \mid \|y\| = 1\} \tag{20}$$

holds. So we may state (19) equivalently as

$$\min_x c^T x \quad \text{s.t.} \quad a(y)^T x - b(y) \ge 0 \quad \forall y \in Y \tag{21}$$

with

$$a(y)^T = \big(y^T A_1 y, \ldots, y^T A_n y\big) \quad \text{and} \quad b(y) = y^T B y.$$

Let us denote the inner product of $(k \times k)$-matrices $C = (c_{ij})$ and $D = (d_{ij})$ by $C \circ D = \sum_{i,j=1}^{k} c_{ij} d_{ij}$. Then it is not difficult to see that the LSIP dual of (21) can be written in the form (see [7])

$$(D): \quad \max_U B \circ U \quad \text{s.t.} \quad A_i \circ U = c_i, \; i = 1, \ldots, n, \quad U \succeq 0. \tag{22}$$

The primal Slater condition takes the form:

(SCP) There exists $\hat{x} \in \mathbb{R}^n$ s.t. $\sum_i A_i \hat{x}_i - B \succ 0$,

where $S \succ 0$ indicates that $S$ is a positive definite matrix. Moreover, it is not difficult to show (cf. [7]) that the dual Slater condition can be written as:

(SCD) There exists $U \succ 0$ such that $A_i \circ U = c_i$, $i = 1, \ldots, n$.

So in principle, optimality and duality results for SDP can directly be deduced from the corresponding results in LSIP. We obtain in this way (see [7] for a proof):


COROLLARY 1 (Strong duality) Assume that the Slater conditions (SCP) and (SCD) are satisfied, i.e., (P) and (D) are strictly feasible. Then optimal primal and dual solutions $\bar{x}$ and $\bar{U}$ exist, and $v_P = v_D$ holds. Moreover, in this case $\bar{x}$ is an optimal solution of (P) if and only if there exists a dually feasible $\bar{U}$ satisfying $\bar{U} \circ \big(\sum_i A_i \bar{x}_i - B\big) = 0$.

Numerical methods for SDP: From the numerical viewpoint, semidefinite programming plays a special role in LSIP. In general we cannot expect that LSIP can be solved efficiently (i.e., in polynomial time with respect to the input data). However, like LP problems, semidefinite programs can be solved efficiently, e.g. by the interior point method or the ellipsoid method. We refer to [37] for an extensive treatment of this issue. Here, however, we will try to motivate the fact that SDP can be solved efficiently from the viewpoint of the ellipsoid method. Firstly, we emphasize that it is not difficult to see that a convex program can be solved efficiently if the corresponding feasibility problem can be solved efficiently. To find a feasible point $x \in \mathcal{F}$, the ellipsoid method proceeds as follows.

Ellipsoid Method

INIT: Start with a (sufficiently large) ellipsoid $E_0$ so that $\mathcal{F} \cap E_0 \ne \emptyset$.

ITER: Given the ellipsoid $E_j = \{x \mid (x - x_j)^T A_j (x - x_j) \le 1\}$, $A_j \succ 0$, such that $\mathcal{F} \cap E_j = \mathcal{F} \cap E_0$:

(1) if $x_j \in \mathcal{F}$, STOP;

(2) otherwise, compute a separating halfspace $H_j^{\ge}$ such that $\mathcal{F} \subseteq H_j^{\ge}$, $x_j \notin H_j^{\ge}$, and find the (smallest) ellipsoid $E_{j+1}$ with $\text{vol}\, E_{j+1} \le q\, \text{vol}\, E_j$ ($0 < q < 1$) containing $E_j \cap H_j^{\ge}$ and thus $\mathcal{F} \cap E_0$.

So the Ellipsoid Method generates ellipsoids $E_0, E_1, \ldots$ with the property

$$\text{vol}(\mathcal{F} \cap E_0) \le \text{vol}\, E_j \le q^j\, \text{vol}\, E_0 \to 0.$$

Hence, if $0 < \text{vol}(\mathcal{F} \cap E_0)$, it is clear that the algorithm will find a point $x_j \in \mathcal{F}$ after a finite number of iterations. The overall ellipsoid method can be shown to be polynomial if the feasibility test $x \in \mathcal{F}$ in step (1) and the construction of a separating halfspace in step (2) can be done in polynomial time. Let $G(x) := \sum_{i=1}^{n} A_i x_i - B \in \mathcal{S}^{k \times k}$.

LEMMA 5 The feasibility test $G(x) \succeq 0$ and the construction of a separating halfspace can be done in time $O(k^3)$.


Proof. It is well known that by applying a modification of the Gauss algorithm to $G = G(x)$ we can compute, in time $O(k^3)$, a decomposition

$$Q G Q^T = D \quad \text{with } Q \text{ regular and } D = \text{diag}(d_1, \ldots, d_k),$$

such that $G \succeq 0$ if and only if $d_i \ge 0$ for all $i$. So if $d_i \ge 0$ for all $i$, the point $x$ is feasible. If not, say $d_1 < 0$, then by defining $a_1 = Q^T e_1$ ($e_1$ the first unit vector) it follows for all feasible $x$, i.e., $G(x) \succeq 0$:

$$a_1^T G(x) a_1 \ge 0 > d_1 = e_1^T D e_1 = e_1^T Q G Q^T e_1 = a_1^T G a_1.$$

So the linear inequality $a_1^T G(x) a_1 = \sum_{i=1}^{n} (a_1^T A_i a_1) x_i - a_1^T B a_1 \ge 0$ defines a separating halfspace. $\Box$
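For illustration, the following Python sketch implements the feasibility test and the separating halfspace of Lemma 5, with an eigendecomposition standing in for the Gauss-type factorization (both cost $O(k^3)$); the random data are purely illustrative:

```python
import numpy as np

def sdp_oracle(A, B, x):
    """Feasibility test G(x) = sum_i x[i]*A[i] - B >= 0 (cf. Lemma 5).

    Returns (True, None) if x is feasible, otherwise (False, (a, beta))
    with a linear inequality a @ x >= beta valid for every feasible x
    but violated at the given x (a separating halfspace).  An
    eigendecomposition replaces the Gauss-type factorization here."""
    G = np.tensordot(x, A, axes=1) - B        # the symmetric matrix G(x)
    w, V = np.linalg.eigh(G)                  # eigenvalues in ascending order
    if w[0] >= 0:                             # all 'd_i' >= 0: G(x) is psd
        return True, None
    a1 = V[:, 0]                              # eigenvector with w[0] < 0
    # Every feasible x satisfies a1 @ G(x) @ a1 >= 0, while here it is w[0] < 0.
    a = np.array([a1 @ Ai @ a1 for Ai in A])  # coefficients a1^T A_i a1
    beta = a1 @ B @ a1                        # right-hand side a1^T B a1
    return False, (a, beta)

# Illustrative random instance with n variables and k x k matrices.
rng = np.random.default_rng(0)
n, k = 3, 5
A = np.array([(M + M.T) / 2 for M in rng.standard_normal((n, k, k))])
M = rng.standard_normal((k, k)); B = (M + M.T) / 2
print(sdp_oracle(A, B, rng.standard_normal(n)))
```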

6 Second order optimality conditions

We now derive second order optimality conditions for the semi-infinite problems (1) and (2) by applying the so-called reduction approach, which goes back to Wetterling [36]. Let $\bar{x} \in \mathcal{F}$ be a feasible point of SIP or GSIP. We assume that $f, g \in C^2$ and that the index set $Y(x)$ is defined as the solution set of inequalities with functions $v_l \in C^2(\mathbb{R}^n \times \mathbb{R}^m, \mathbb{R})$:

$$Y(x) = \{y \in \mathbb{R}^m \mid v_l(x, y) \ge 0, \; l \in L\}, \quad L := \{1, \ldots, q\}.$$

The following observation is crucial for this approach. By definition, any active point $y$ from $Y_0(\bar{x})$ is a (global) minimizer of the following parametric optimization problem

$$Q(x): \quad \min_y g(x, y) \quad \text{s.t.} \quad v_l(x, y) \ge 0, \; l \in L, \tag{23}$$

the so-called lower level problem, depending on $x$ as a parameter. So (under some CQ) any point $y \in Y_0(\bar{x})$ must satisfy the KKT conditions for the finite program $Q(\bar{x})$,

$$\nabla_y \mathcal{L}^{(\bar{x}, y)}(\bar{x}, y, \gamma) = 0, \tag{24}$$

with a multiplier vector $0 \le \gamma \in \mathbb{R}^{|L_0(\bar{x}, y)|}$, where $L_0(\bar{x}, y) = \{l \in L \mid v_l(\bar{x}, y) = 0\}$ is the active index set and $\mathcal{L}^{(\bar{x}, y)}$ the Lagrange function of $Q(\bar{x})$ at $\bar{x} \in \mathcal{F}$, $y \in Y_0(\bar{x})$,

$$\mathcal{L}^{(\bar{x}, y)}(x, y, \gamma) = g(x, y) - \sum_{l \in L_0(\bar{x}, y)} \gamma_l\, v_l(x, y). \tag{25}$$

We will assume that the following strong regularity conditions (of finite optimization) are satisfied for all points in $Y_0(\bar{x})$.

$A_{\text{red}}$: For any $y \in Y_0(\bar{x})$ the following holds (implying that $y$ is a strict (local) minimizer of $Q(\bar{x})$, see e.g. [7, Th. 12.6]):

1. LICQ: $\nabla_y v_l(\bar{x}, y)$, $l \in L_0(\bar{x}, y)$, are linearly independent.

2. (Under LICQ there are unique multipliers $0 \le \gamma \in \mathbb{R}^{|L_0(\bar{x}, y)|}$ satisfying (24).) We assume: $\gamma_l > 0$, $l \in L_0(\bar{x}, y)$ (strict complementary slackness).

3. The second order condition (SOC): with $\gamma$ as in 2.,

$$\zeta^T \nabla_y^2 \mathcal{L}^{(\bar{x}, y)}(\bar{x}, y, \gamma)\, \zeta > 0 \quad \text{for all } \zeta \in T(\bar{x}, y) \setminus \{0\},$$

where $T(\bar{x}, y) = \{\zeta \in \mathbb{R}^m \mid \nabla_y v_l(\bar{x}, y)\zeta = 0, \; l \in L_0(\bar{x}, y)\}$.

By using techniques from parametric optimization, the following can be proven under this assumption.

THEOREM 6 Let the condition $A_{\text{red}}$ be satisfied at the feasible point $\bar{x}$ of GSIP. Then:

(a) The active index set is finite, $Y_0(\bar{x}) = \{y_1, \ldots, y_r\}$, and there exist neighborhoods $U_{\bar{x}}$ of $\bar{x}$ and $V_{y_j}$ of $y_j$ and continuous mappings

$$y_j: U_{\bar{x}} \to V_{y_j}, \; y_j(\bar{x}) = y_j \quad \text{and} \quad \gamma^j: U_{\bar{x}} \to \mathbb{R}^{|L_0(\bar{x}, y_j)|}, \; \gamma^j(\bar{x}) = \gamma^j,$$

such that for every $x \in U_{\bar{x}}$ the value $y_j(x)$ is the unique local minimizer of $Q(x)$ in $V_{y_j}$ with corresponding Lagrange multiplier vector $\gamma^j(x)$.

(b) With the functions in (a) the following finite reduction holds: $x \in U_{\bar{x}} \cap \mathcal{F}$ is a local solution of GSIP if and only if $x$ is a local solution of the so-called reduced problem

$$P_{\text{red}}(\bar{x}): \quad \min_{x \in U_{\bar{x}}} f(x) \quad \text{s.t.} \quad G_j(x) := g(x, y_j(x)) \ge 0, \; j = 1, \ldots, r. \tag{26}$$

(c) The functions Gj.x/ = g.x; y j.x// are C2-functions in Ux with derivatives

∇G j.x/ = ∇xL .x;y j /.x; y j.x/; j.x//

∇2G j.x/ = ∇

2xL .x;y j /.x; y j.x/; j.x//−∇

T y j.x/∇2yL .x;y j /.x; y j.x/; j.x//∇y j.x/

∑l∈L0.x;y j /

∇T[ j ] l .x/∇xvl .x; y j.x//+∇

Txvl .x; y j.x//∇[ j ] l .x/:

Proof. (See [32] for details.) Choose $j \in \{1, \ldots, r\}$ and put for brevity $(\bar{y}, \bar{\gamma}) = (y_j, \gamma^j)$. Assume $L_0(\bar{x}, \bar{y}) = \{1, \ldots, p\}$ and define $v := (v_1, \ldots, v_p)$. Consider the following system of Karush-Kuhn-Tucker equations for $Q(x)$, near $(\bar{x}, \bar{y}, \bar{\gamma})$, with Lagrange function $\mathcal{L} = \mathcal{L}^{(\bar{x}, y_j)}$:

$$F(x, y, \gamma) := \begin{pmatrix} \nabla_y \mathcal{L}(x, y, \gamma) \\ v(x, y) \end{pmatrix} = 0. \tag{27}$$

Under the assumptions LICQ and SOC of $A_{\text{red}}$ the Jacobian $\nabla_{(y, \gamma)} F(\bar{x}, \bar{y}, \bar{\gamma})$ can be shown to be regular. The Implicit Function Theorem applied to (27) then yields the results (a) and (b). The derivative of $y(x)$ ($= y_j(x)$) can be calculated by differentiating the identity $F(x, y(x), \gamma(x)) = 0$. From the second relation we obtain (the arguments $x, y(x), \gamma(x)$ omitted)

$$\nabla_x v + \nabla_y v\, \nabla y = 0. \tag{28}$$

For $G(x) = g(x, y(x))$ we find, using $\nabla_y g\, \nabla y = \gamma^T \nabla_y v\, \nabla y = -\gamma^T \nabla_x v$ (cf. (27), (28)),

$$\nabla G(x) = \nabla_x g(x, y(x)) + \nabla_y g(x, y(x))\, \nabla y(x) = \nabla_x g(x, y(x)) - \gamma^T(x) \nabla_x v(x) = \nabla_x \mathcal{L}(x, y(x), \gamma(x)), \tag{29}$$

and in a similar way the formula for $\nabla^2 G$. $\Box$

By this theorem, locally near $\bar{x}$ the problem GSIP is equivalent to $P_{\text{red}}(\bar{x})$ in (26), which is a common finite optimization problem. Thus the standard optimality conditions of finite optimization can be applied to obtain optimality conditions for the semi-infinite problem $P$. To that end we define the cone of critical directions

$$C(\bar{x}) = \{\zeta \in \mathbb{R}^n \mid \nabla f(\bar{x})\zeta \le 0, \; \nabla G_j(\bar{x})\zeta \ge 0, \; j = 1, \ldots, r\}$$

and the Lagrange function (of the upper level)

$$\mathcal{L}(x, \mu) = \mu_0 f(x) - \sum_{j=1}^{r} \mu_j G_j(x).$$

We now state the optimality conditions of FJ type. For more details, and for necessary and sufficient conditions under weaker assumptions, see [15] and [22].

THEOREM 7 Suppose the assumption $A_{\text{red}}$ is satisfied for $\bar{x} \in \mathcal{F}$, so that by Theorem 6, in a neighborhood $U$ of $\bar{x}$, the problem GSIP can be locally reduced to $P_{\text{red}}(\bar{x})$ according to (26). Then the following holds.

a. Suppose $\bar{x}$ is a local minimizer of GSIP. Then to any $\zeta \in C(\bar{x})$ there exists a multiplier $\mu \ge 0$ such that (with $\nabla_x \mathcal{L}$, $\nabla_x^2 \mathcal{L}$ given below)

$$\nabla_x \mathcal{L}(\bar{x}, \mu) = 0 \quad \text{and} \quad \zeta^T \nabla_x^2 \mathcal{L}(\bar{x}, \mu)\, \zeta \ge 0.$$

b. Suppose that for any $\zeta \in C(\bar{x}) \setminus \{0\}$ there exists a multiplier $\mu \ge 0$ such that

$$\nabla_x \mathcal{L}(\bar{x}, \mu) = 0 \quad \text{and} \quad \zeta^T \nabla_x^2 \mathcal{L}(\bar{x}, \mu)\, \zeta > 0.$$

Then $\bar{x}$ is a (strict) local minimizer of GSIP.

The expressions for $\nabla_x \mathcal{L}(\bar{x}, \mu)$ and $\nabla_x^2 \mathcal{L}(\bar{x}, \mu)$ read (with $y_j = y_j(\bar{x})$, $\gamma^j = \gamma^j(\bar{x})$):

$$\nabla_x \mathcal{L}(\bar{x}, \mu) = \mu_0 \nabla f(\bar{x}) - \sum_{j=1}^{r} \mu_j \nabla_x g(\bar{x}, y_j) + \sum_{j=1}^{r} \mu_j \Big( \sum_{l \in L_0(\bar{x}, y_j)} [\gamma^j]_l\, \nabla_x v_l(\bar{x}, y_j) \Big)$$

$$\begin{aligned} \nabla_x^2 \mathcal{L}(\bar{x}, \mu) = \; & \mu_0 \nabla^2 f(\bar{x}) - \sum_{j=1}^{r} \mu_j \nabla_x^2 g(\bar{x}, y_j) + \sum_{j=1}^{r} \mu_j \nabla^T y_j(\bar{x})\, \nabla_y^2 \mathcal{L}^{(\bar{x}, y_j)}(\bar{x}, y_j, \gamma^j)\, \nabla y_j(\bar{x}) \\ & + \sum_{j=1}^{r} \mu_j \sum_{l \in L_0(\bar{x}, y_j)} \Big( [\gamma^j]_l\, \nabla_x^2 v_l(\bar{x}, y_j) + \nabla^T [\gamma^j]_l(\bar{x})\, \nabla_x v_l(\bar{x}, y_j) + \nabla_x^T v_l(\bar{x}, y_j)\, \nabla [\gamma^j]_l(\bar{x}) \Big) \end{aligned}$$


(The first and second terms in $\nabla_x \mathcal{L}(\bar{x}, \mu)$ and $\nabla_x^2 \mathcal{L}(\bar{x}, \mu)$ are present in finite optimization, the third term in $\nabla_x^2 \mathcal{L}(\bar{x}, \mu)$ is the additional term for SIP, and the third term in $\nabla_x \mathcal{L}(\bar{x}, \mu)$ together with the fourth term in $\nabla_x^2 \mathcal{L}(\bar{x}, \mu)$ are typical for GSIP, containing the dependence of the $v_l$ (and $Y$) on $x$.)

Proof. The formulas for $\nabla_x \mathcal{L}$ and $\nabla_x^2 \mathcal{L}$ follow immediately by using the formulas for $\nabla G_j$, $\nabla^2 G_j$ in Theorem 6. $\Box$

REMARK 3 We say that at $\bar{x} \in \mathcal{F}$ the condition LICQ is satisfied if, for the functions in Theorem 6,

(LICQ) $\quad \nabla G_j(\bar{x}) = \nabla_x \mathcal{L}^{(\bar{x}, y_j)}(\bar{x}, y_j, \gamma^j)$, $y_j \in Y_0(\bar{x})$, are linearly independent.

Note that under a constraint qualification at $\bar{x} \in \mathcal{F}$ the multiplier $\mu_0$ in the optimality conditions of Theorem 7 can be set to $\mu_0 = 1$.

7 Numerical methods

7.1 Primal methods

We discuss the so-called method of feasible directions for the SIP problem (1). The idea is to move from a current (non-optimal) feasible point $x_k$ to the next point $x_{k+1} = x_k + t_k d_k$ in such a way that $x_{k+1}$ remains feasible and has a smaller objective value. The simplest choice would be to move along a (strictly) feasible descent direction $d$ defined by:

$$\nabla f(x) d < 0, \qquad \nabla_x g(x, y) d > 0, \; \forall y \in Y_0(x). \tag{30}$$

Feasible Direction Method (for FP, Zoutendijk)

INIT: Choose a starting point $x_0 \in \mathcal{F}$.

ITER: WHILE $x_k$ is not a Fritz John point DO
BEGIN
Choose a strictly feasible descent direction $d_k$.
Determine a solution $t_k$ of the problem
$(*) \quad \min\{ f(x_k + t d_k) \mid t > 0, \; x_k + t d_k \in \mathcal{F} \}$
Set $x_{k+1} = x_k + t_k d_k$.
END


Let us consider the following "stable" way to obtain a strictly feasible descent direction $d$ that takes all constraints into account. For FP this is the method suggested by Topkis and Veinott. We solve the linear SIP:

$$\min_{d, z} \; z \quad \text{s.t.} \quad \nabla f(x_k) d \le z, \qquad \nabla_x g(x_k, y) d + g(x_k, y) \ge -z, \; y \in Y, \qquad \pm d_i \le 1, \; i = 1, \ldots, n. \tag{31}$$

REMARK. Note that $(d, z) = (0, 0)$ is always feasible for (31), and if (31) has the optimal value $z = 0$ then (according to the proof of the next theorem) no strictly feasible descent direction $d$ (cf. (30)) exists, i.e. (by Gordan's Lemma 9), $x_k$ satisfies the FJ necessary optimality conditions (14).
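On a finite grid approximating $Y$, (31) is a small LP in the variables $(d, z)$. A minimal Python sketch (the data for the iterate $x_k$ are made up; a practical code would restrict the grid to the $\varepsilon$-active points of Remark 4 below):

```python
import numpy as np
from scipy.optimize import linprog

def tv_direction(grad_f, grad_g, g_vals):
    """Topkis-Veinott direction problem (31) on a finite index grid:
       min z  s.t.  grad_f @ d <= z,
                    grad_g[i] @ d + g_vals[i] >= -z  (one row per grid point),
                    |d_i| <= 1.
    Variables (d, z); returns the pair (d, z)."""
    m, n = grad_g.shape
    c = np.zeros(n + 1); c[-1] = 1.0                  # minimize z
    rows = [np.append(grad_f, -1.0)]                  # grad_f @ d - z <= 0
    rhs = [0.0]
    for i in range(m):                                # -grad_g[i] @ d - z <= g_vals[i]
        rows.append(np.append(-grad_g[i], -1.0))
        rhs.append(g_vals[i])
    res = linprog(c, A_ub=np.array(rows), b_ub=np.array(rhs),
                  bounds=[(-1, 1)] * n + [(None, None)])
    return res.x[:-1], res.x[-1]

# Made-up data at an iterate x_k: nabla f(x_k), the rows nabla_x g(x_k, y_i)
# and the values g(x_k, y_i) for grid points y_1, ..., y_m.
d, z = tv_direction(np.array([1.0, 0.0]),
                    np.array([[1.0, 1.0], [0.5, -1.0]]),
                    np.array([0.2, 0.0]))
print("direction d =", d, " optimal z =", z)
```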

THEOREM 8 Assume that the Feasible Direction Method generates the points $x_k$ and computes the $d_k$ as solutions of (31). Suppose further that a subsequence $(x_s)_{s \in S}$, $S \subseteq \mathbb{N}$ an infinite set of indices, converges to $\bar{x}$. Then $\bar{x}$ is a Fritz John point.

Proof. Since $\mathcal{F}$ is closed, the limit $\bar{x}$ of the points $x_s \in \mathcal{F}$ also belongs to $\mathcal{F}$, i.e., $\bar{x}$ is feasible. Suppose that the theorem is false and $\bar{x}$ is not a Fritz John point. Then by Gordan's Lemma 9 there exists a (strictly feasible descent) direction $\bar{d}$ (see (30)). With the same arguments as in the proof of Lemma 4 we find that there is some $\tau_0 > 0$ such that

$$g(\bar{x}, y) + \tau \nabla_x g(\bar{x}, y)\bar{d} > 0, \quad \forall\, 0 < \tau < \tau_0, \; y \in Y.$$

So by choosing $\tau > 0$ such that $\tau |\bar{d}_i| \le 1$ for all $i$, we have shown that there exist $z < 0$ and a vector $d$ ($= \tau \bar{d}$) with $|d_i| \le 1$ satisfying

$$\nabla f(\bar{x}) d < 2z \quad \text{and} \quad g(\bar{x}, y) + \nabla_x g(\bar{x}, y) d > -2z \quad \forall y \in Y.$$

The continuity of the gradients $\nabla f(x)$ and $\nabla_x g(x, y)$ and the compactness of $Y$ now imply that also

$$\nabla f(x_s) d < z \quad \text{and} \quad g(x_s, y) + \nabla_x g(x_s, y) d > -z, \quad \forall y \in Y,$$

must hold if $s$ is sufficiently large. So $(d, z)$ is feasible for (31), which in turn implies $z \ge z_s$ for the optimal solution $(d_s, z_s)$ of (31) at $x_s$. In particular, we note

$$\nabla f(x_s) d_s \le z \quad \text{and} \quad g(x_s, y) + \nabla_x g(x_s, y) d_s \ge -z \quad \forall y \in Y.$$

Let $\delta > 0$ be such that $\|\nabla f(x) - \nabla f(\bar{x})\| < |z|/3n$ holds whenever $\|x - \bar{x}\| < \delta$. Then $\|d_s\| \le n$ together with the Cauchy-Schwarz inequality yields

$$|(\nabla f(x) - \nabla f(\bar{x})) d_s| \le \|\nabla f(x) - \nabla f(\bar{x})\| \cdot \|d_s\| < |z|/3.$$

So, if $\|x - \bar{x}\| < \delta$ and $s$ is large,

$$\nabla f(x) d_s \le \nabla f(\bar{x}) d_s + |z|/3 \le \nabla f(x_s) d_s + 2|z|/3 \le z/3.$$

Applying the same reasoning to $\nabla_x g(x, y)$, we therefore conclude

$$g(x_s, y) + \nabla_x g(x, y) d_s \ge -z/3 \quad \forall y \in Y.$$

We are interested in points $x$ of the form $x = x_s + t d_s$ ($0 < t < \delta/2n$). If $s$ is so large that $\|x_s - \bar{x}\| < \delta/2$, then $\|x - \bar{x}\| < \delta$. Moreover, the Mean Value Theorem guarantees some $0 < \eta < t$ that exhibits

$$f(x_s + t d_s) = f(x_s) + t \nabla f(x_s + \eta d_s) d_s < f(x_s) + t z/3.$$

Similarly, we find for any $y \in Y$

$$g(x_s + t d_s, y) = g(x_s, y) + t \nabla_x g(x_s + \eta_y d_s, y) d_s = (1-t) g(x_s, y) + t\big[g(x_s, y) + \nabla_x g(x_s + \eta_y d_s, y) d_s\big] \ge (1-t) g(x_s, y) - t z/3 > 0$$

(use the feasibility $g(x_s, y) \ge 0$), which tells us $x_s + t d_s \in \mathcal{F}$. So the minimization step $(*)$ in the algorithm produces some $t_s$ ensuring a decrement of at least

$$f(x_{s+1}) - f(x_s) = f(x_s + t_s d_s) - f(x_s) \le -\frac{\delta}{2n} \cdot |z|/3 = -\delta|z|/6n$$

for all sufficiently large $s$. Hence $f(x_s) \to -\infty$ as $s \to \infty$, contradicting our assumption that $x_s \to \bar{x}$ and thus $f(x_s) \to f(\bar{x})$. $\Box$

REMARK 4 The following variant of (31) is more efficient (and the results of Theorem 8 remain true): choose some (small) $\varepsilon > 0$ and replace the constraints $\nabla_x g(x_k, y) d + g(x_k, y) \ge -z$, $\forall y \in Y$, by

$$\nabla_x g(x_k, y) d + g(x_k, y) \ge -z \quad \forall y \in Y_\varepsilon(x_k) := \{y \in Y \mid g(x_k, y) \le \min_{\tilde{y} \in Y} g(x_k, \tilde{y}) + \varepsilon\}.$$

Another possibility is to use the reduction approach of Section 6 and to solve the inequalities

$$G_j(x_k) + \nabla G_j(x_k) d \ge -z, \; \forall j,$$

where $G_j(x) = g(x, y_j(x))$ (see (26)).

A foundation of descent methods based on nonsmooth analysis is to be found in the review article of Polak [25].


7.2 KKT-approach, genericity and the SQP-method

In this subsection we discuss the so-called dual or KKT approach for solving SIP and GSIP. These methods are directly based on the system of Karush-Kuhn-Tucker necessary optimality conditions.

FP case: Consider a finite program (FP)

$$\min_x f(x) \quad \text{s.t.} \quad g_j(x) \ge 0, \; j \in J = \{1, \ldots, m\}.$$

Recall that under the constraint qualification LICQ (see Section 3.2), at a local minimizer $\bar{x}$ the KKT equations

$$\nabla_x \mathcal{L}(x, \mu) := \nabla f(x) - \sum_{j \in J_0(\bar{x})} \mu_j \nabla g_j(x) = 0, \qquad g_j(x) = 0, \; j \in J_0(\bar{x}), \tag{32}$$

must necessarily hold with (unique) multipliers $\mu_j \ge 0$, $j \in J_0(\bar{x})$, where $\mathcal{L}$ denotes the Lagrange function at $\bar{x}$, $\mathcal{L}(x, \mu) = f(x) - \sum_{j \in J_0(\bar{x})} \mu_j g_j(x)$. Putting $G = [g_j, \; j \in J_0(\bar{x})]$, the Jacobian of the system (32) at a solution $(\bar{x}, \bar{\mu})$ reads:

$$J(\bar{x}, \bar{\mu}) = \begin{pmatrix} \nabla_x^2 \mathcal{L}(\bar{x}, \bar{\mu}) & -\nabla^T G(\bar{x}) \\ \nabla G(\bar{x}) & 0 \end{pmatrix}.$$

The system (32) represents a system of $n + |J_0(\bar{x})|$ equations in equally many unknowns $x_i$, $\mu_j$. So we may apply the Newton method to solve this system (32). It is well known that the Newton method (locally) has quadratic convergence if the Jacobian $J(\bar{x}, \bar{\mu})$ is regular at a solution $(\bar{x}, \bar{\mu})$ (cf. e.g. [7]). We now discuss the question whether in the "general" (generic) case this regularity condition is satisfied. We introduce the problem set

$$\mathcal{P} = \{P = (f, g_1, \ldots, g_m)\} \equiv C^\infty(\mathbb{R}^n, \mathbb{R})^{1+m}.$$

The function space $C^\infty(\mathbb{R}^n, \mathbb{R})$ is assumed to be endowed with the so-called strong Whitney topology (see [21]). By a generic subset of a set $\mathcal{P}$ we mean a subset which is dense and open in $\mathcal{P}$. The following result states that the Newton method is generically applicable.

THEOREM 9 (Jongen/Jonker/Twilt [21]) There is an open and dense subset $\mathcal{P}_0 \subset \mathcal{P}$ such that for all finite programs $P \in \mathcal{P}_0$, LICQ holds at each feasible point $x$, and at each solution $(\bar{x}, \bar{\mu})$ of (32) the Jacobian $J(\bar{x}, \bar{\mu})$ is regular.

REMARK 5 It is interesting to note that the common second order sufficient optimality conditions for FP at a local minimizer $\bar{x}$ imply that the Jacobian $J(\bar{x}, \bar{\mu})$ is regular (see e.g. [7, Ex. 12.30]).
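As an illustration of the KKT approach, the sketch below applies the plain Newton method to the system (32) for a toy FP with a single constraint that is assumed active (the instance is our own; identifying the active set is of course part of the problem in a real implementation):

```python
import numpy as np

# Toy FP: min (x1-2)^2 + x2^2  s.t.  g(x) = 1 - x1^2 - x2^2 >= 0 (assumed active).
# KKT system (32):  F(x, mu) = ( nabla f(x) - mu*nabla g(x), g(x) ) = 0.
def F(z):
    x1, x2, mu = z
    return np.array([2 * (x1 - 2) + 2 * mu * x1,
                     2 * x2 + 2 * mu * x2,
                     1 - x1 ** 2 - x2 ** 2])

def J(z):                      # Jacobian of F, cf. J(x, mu) above
    x1, x2, mu = z
    return np.array([[2 + 2 * mu, 0.0,        2 * x1],
                     [0.0,        2 + 2 * mu, 2 * x2],
                     [-2 * x1,    -2 * x2,    0.0]])

z = np.array([0.5, 0.5, 0.5])  # starting guess (x, mu)
for _ in range(20):            # plain Newton iteration
    z = z - np.linalg.solve(J(z), F(z))
print("solution (x, mu) =", z) # should approach (1, 0, 1) from this start
```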

SQP-method: The KKT approach, solving (32) by Newton-type iterations, is a conceptual idea for computing a solution of a FP numerically. The SQP-method represents one of the most efficient practical variants of this approach. In this method, in each iteration a "descent direction" $d_k$ is computed by solving a quadratic program (see e.g. [7] for details and convergence results). Under appropriate assumptions the SQP-method is superlinearly convergent. A step of this method proceeds as follows:

SQP-method for FP

Step $k$: given $x_k$, $B_k \succ 0$ and $r > 0$ large enough.

(1) Compute a solution $s_k$ of the quadratic program:

$$\min_{s \in \mathbb{R}^n} \nabla f(x_k) s + \frac{1}{2} s^T B_k s \quad \text{s.t.} \quad g_j(x_k) + \nabla g_j(x_k) s \ge 0, \; j \in J.$$

(2) Perform a line search with the merit function $M_r(x) := f(x) - r \sum_j \min\{0, g_j(x)\}$:

$$\min_{t > 0} M_r(x_k + t s_k) \quad \text{with a solution } t_k, \quad \text{and put } x_{k+1} = x_k + t_k s_k.$$

SIP and GSIP case: Let $\bar{x}$ be a local minimizer of a SIP or GSIP. Recall from Section 6 that under LICQ (see Remark 3) and $A_{\text{red}}$ the minimizer $\bar{x}$ necessarily satisfies the following complete system of optimality conditions with the active index set $Y_0(\bar{x}) = \{y_1, \ldots, y_r\}$:

$$\nabla f(\bar{x}) - \sum_{j=1}^{r} \mu_j \nabla_x \Big( g(\bar{x}, y_j) - \sum_{l \in L_0(\bar{x}, y_j)} [\gamma^j]_l\, v_l(\bar{x}, y_j) \Big) = 0, \qquad g(\bar{x}, y_j) = 0 \;\; \forall j, \tag{33}$$

and for all $y_j$, $j = 1, \ldots, r$:

$$\nabla_y g(\bar{x}, y_j) - \sum_{l \in L_0(\bar{x}, y_j)} [\gamma^j]_l \nabla_y v_l(\bar{x}, y_j) = 0, \qquad v_l(\bar{x}, y_j) = 0 \;\; \forall l \in L_0(\bar{x}, y_j). \tag{34}$$

As in the FP case, this system has $K = n + r(m+1) + \sum_{j=1}^{r} |L_0(\bar{x}, y_j)|$ equations and equally many unknowns $x, \mu_j, y_j, \gamma^j$. So again, Newton-type methods may be applied to solve GSIP numerically (see [33]). As 'natural' regularity conditions we assume:

RC: For each $\bar{x} \in \mathcal{F}$, with any choice of solutions $\gamma^j$ of (34) for each $y_j \in Y_0(\bar{x})$, the following is true: if there is a solution $\mu$ of the relations (33), then

1. LICQ holds: $\nabla_x \mathcal{L}^{(\bar{x}, y_j)}(\bar{x}, y_j, \gamma^j)$, $y_j \in Y_0(\bar{x})$, are linearly independent.

2. For any $y_j \in Y_0(\bar{x})$ the assumption $A_{\text{red}}$ holds (see Section 6).

REMARK 6 Note that for common SIP problems the system simplifies. Since in this case the functions $v_l$ do not depend on $x$, the sum over $[\gamma^j]_l \nabla_x v_l(\bar{x}, y_j)$ in the first equations vanishes.


For common SIP the following genericity result has been proven, which gives the theoretical basis for the Kuhn-Tucker methods for solving SIP.

THEOREM 10 (Jongen/Zwier [19]) The set $\mathcal{P} = \{(f, g, v_l, l \in L)\} \equiv C^\infty(\mathbb{R}^n, \mathbb{R}) \times C^\infty(\mathbb{R}^{n+m}, \mathbb{R})^{1+|L|}$ of all $C^\infty$-SIP problems contains an open and dense subset $\mathcal{P}_0 \subset \mathcal{P}$ such that for all programs $P \in \mathcal{P}_0$ the regularity condition RC is satisfied.

Unfortunately the situation is more complicated for GSIP, where a genericity result as in Theorem 10 does not hold. A counterexample is provided by the following example.

EX. 10

$$\min_{x \in \mathbb{R}^2} x_1^2 + (x_2 - 1)^2 \quad \text{s.t.} \quad y - x_2 \ge 0, \; \forall y \in Y(x)$$

with index set $Y(x) = \{y \in \mathbb{R} \mid y - x_1 \ge 0, \; y + x_1 \ge 0\}$. The set $Y(x)$ behaves Lipschitz continuously but is not smooth at points $(0, x_2)$. The feasible set reads $\mathcal{F} = \{x \in \mathbb{R}^2 \mid |x_1| \ge x_2\}$, with a re-entrant corner point at $x = (0, 0)$. The solutions of the problem are the points $\bar{x} = (\pm 1/2, 1/2)$. Obviously the problem is equivalent to the program in disjunctive form (not an intersection but a union of half-spaces):

$$\min x_1^2 + (x_2 - 1)^2 \quad \text{s.t.} \quad x \in \{x \mid x_2 \le x_1\} \cup \{x \mid x_2 \le -x_1\}.$$

Near the re-entrant corner point $\bar{x} = (0, 0)$ of $\mathcal{F}$, with corresponding active index point $\bar{y} = 0$, the system of optimality conditions can be written as:

$$2\begin{pmatrix} x_1 \\ x_2 - 1 \end{pmatrix} + \mu\left[\begin{pmatrix} 0 \\ 1 \end{pmatrix} - \gamma_1 \begin{pmatrix} -1 \\ 0 \end{pmatrix} - \gamma_2 \begin{pmatrix} 1 \\ 0 \end{pmatrix}\right] = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$
$$y - x_2 = 0$$
$$1 - (\gamma_1 + \gamma_2) = 0$$
$$y - x_1 = 0$$
$$y + x_1 = 0$$

The vector $(x, y, \mu, \gamma) = (0, 0, 0, 2, 1/2, 1/2)$ is a solution of this system. However, the condition LICQ of $A_{\text{red}}$ is not fulfilled. On the other hand, the Jacobian of the system can be shown to be regular at this solution $(x, y, \mu, \gamma)$, i.e., this point is an attraction point for the Newton method. But $\bar{x}$ is not a critical point in the common sense, since there is a feasible direction $d = (\pm 1, 1)$ of descent: $\nabla f(\bar{x}) d = -2 < 0$ (so $\bar{x}$ is not a candidate for a solution).

In [26] it is conjectured that the genericity result will at least hold at all local solutions of GSIP (a proof is given for the linear case).

REMARK. Conceptually there are two strategies for solving the above system (33)-(34):


• Compute an approximate solution $\bar{x}$ of GSIP (for example by a discretization method; see the next section). Use these data as starting values for a (quasi-) Newton iteration for solving (33).

This strategy has the disadvantage that often the approximate solution does not yet produce data within the region of attraction of the Newton method. Therefore the next approach is preferable, especially for problems of medium/large size.

• Use a quasi-Newton method (such as the SQP-method) in step 2 of the conceptual "method based on the reduction" below.

For many implementation details and references we refer to [9].

Method for GSIP based on reduction (SQP-method): To obtain a practical method for solving GSIP problems we may try to solve the locally reduced problem $P_{\text{red}}(\bar{x})$ described in Section 6. We give a conceptual description (see [9, Section 7.3] for more details).

Method based on the reduction:

Step $k$: Given $x_k$ (not necessarily feasible):

1. Determine the local minima $y_1, \ldots, y_{r_k}$ of $Q(x_k)$.

2. Apply $N_k$ steps (of a finite programming algorithm) to the locally reduced problem (cf. (26))

$$P_{\text{red}}(x_k): \quad \min_x f(x) \quad \text{s.t.} \quad G_j(x) := g(x, y_j(x)) \ge 0, \; j = 1, \ldots, r_k,$$

leading to iterates $x_{k,i}$, $i = 1, \ldots, N_k$.

3. Put $x_{k+1} = x_{k, N_k}$ and $k = k + 1$.

The iteration in sub-step 2 can be done by performing SQP steps applied to the Karush-Kuhn-Tucker system of $P_{\text{red}}(x_k)$. For a discussion of such a method combining global convergence and locally superlinear convergence we refer to [9].

7.3 Discretization methods

In a discretization method we choose finite subsets $Y'$ of $Y$, and instead of $P \equiv P(Y)$ we solve the finite programs

$$P(Y'): \quad \min f(x) \quad \text{s.t.} \quad g(x, y) \ge 0, \; \forall y \in Y'.$$

Let $v(Y')$, $\mathcal{F}(Y')$ and $S(Y')$ denote the minimal value, the feasible set and the set of (global) minimizers of $P(Y')$, respectively.
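The discretization idea leads directly to a simple exchange-type loop: solve $P(Y_k)$, use the lower level problem to find a most violated index point, add it to the grid and repeat. A Python sketch for a toy linear SIP (our own illustrative instance):

```python
import numpy as np
from scipy.optimize import linprog, minimize_scalar

# Toy linear SIP: min x1 + 0.5*x2  s.t.  g(x, y) = x1 + x2*y - sin(pi*y) >= 0
# for all y in Y = [0, 1].
c = np.array([1.0, 0.5])
grid = [0.0, 1.0]                                  # initial discretization Y_0
for it in range(20):
    # solve the finite program P(Y_k):  -x1 - x2*y <= -sin(pi*y) on the grid
    A_ub = np.array([[-1.0, -y] for y in grid])
    b_ub = np.array([-np.sin(np.pi * y) for y in grid])
    x = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * 2).x
    # lower level problem: most violated index point for the current x
    low = minimize_scalar(lambda y: x[0] + x[1] * y - np.sin(np.pi * y),
                          bounds=(0.0, 1.0), method="bounded")
    if low.fun >= -1e-8:                           # x is feasible for P(Y) (numerically)
        break
    grid.append(low.x)                             # exchange step: enlarge the grid
print("iterations:", it + 1, " x =", x, " value =", c @ x)
```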


We introduce the Hausdorff distance (meshsize) $\delta(Y')$ between $Y'$ and $Y$ by

$$\delta(Y') := \max_{y \in Y} \text{dist}(y, Y') \quad \text{where} \quad \text{dist}(y, Y') = \min_{y' \in Y'} \|y - y'\|.$$

The following relation is trivial but important:

$$Y_2 \subset Y_1 \;\Rightarrow\; \mathcal{F}(Y_1) \subset \mathcal{F}(Y_2) \;\text{ and }\; v(Y_2) \le v(Y_1). \tag{35}$$

We consider different discretization concepts. $P = P(Y)$ is called finitely reducible if there is a finite set $Y' \subset Y$ such that $v(Y') = v(Y)$.

The SIP is called weakly discretizable if there exists a sequence of discretizations $Y_k$ such that $v(Y_k) \to v(Y)$.

$P(Y)$ is said to be discretizable if for each sequence of finite grids $Y_k \subset Y$ satisfying $\delta(Y_k) \to 0$ there exist solutions $x_k$ of $P(Y_k)$ (for $k$ large enough), and for each sequence of solutions the relation

$$\text{dist}(x_k, S(Y)) \to 0 \quad \text{and} \quad v(Y_k) \to v(Y)$$

holds. We also introduce a local concept. Given a local minimizer $\bar{x}$ of $P(Y)$, the SIP is called locally discretizable at $\bar{x}$ if the discretizability relation holds locally, i.e., if the problem $P^l(Y)$,

$$P^l(Y): \quad \min_{x \in U_{\bar{x}}} f(x) \quad \text{s.t.} \quad g(x, y) \ge 0 \quad \forall y \in Y,$$

obtained as the restriction of $P(Y)$ to an open neighborhood $U_{\bar{x}}$ of $\bar{x}$, is discretizable. From a numerical viewpoint only the concept of discretizability is useful.

Finite reducibility and weak discretizability. From the definitions we immediately deduce

P is finitely reducible ⇒ P is weakly discretizable.

We begin with two negative examples.

EX. 11 Let us consider the nonconvex SIP problem

P:  min x_2  s.t.  (x_1 − y)² + x_2 ≥ 0,  y ∈ Y = [0, 1],  0 ≤ x_1 ≤ 1.

Obviously F = {(x_1, x_2) | 0 ≤ x_1 ≤ 1 and x_2 ≥ 0}, v(Y) = 0 and S = {(x_1, x_2) | 0 ≤ x_1 ≤ 1 and x_2 = 0}. On the other hand, whichever grid Y′ we take, it is evident that v(Y′) < 0 and S(Y′) is a (nonempty) finite set. So P is not finitely reducible, but it is discretizable (as is not difficult to see).
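This behaviour is easy to check numerically (assuming NumPy; the sample sizes are arbitrary): eliminating x_2 from P(Y′) gives v(Y′) = −max_{0≤x_1≤1} min_{y∈Y′} (x_1 − y)², which is negative for every finite grid but tends to v(Y) = 0 under refinement:

    import numpy as np

    # v(Y') for Ex. 11 on uniform grids Y', evaluated on a fine x1-sample.
    x1 = np.linspace(0.0, 1.0, 20001)
    for n in (2, 3, 5, 9, 17):
        Yp = np.linspace(0.0, 1.0, n)
        v = -((x1[:, None] - Yp[None, :]) ** 2).min(axis=1).max()
        print(n, v)   # v(Y') = -(h/2)^2 < 0 with h = 1/(n-1), but v(Y') -> 0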

EX. 12 The problem of Ex. 9 is not weakly discretizable. In this example v(Y′) = −∞ for every finite grid Y′ ⊂ Y, whereas v(Y) = 0.



We now consider LSIP problems. Define for Y′ ⊂ Y the pair of primal-dual problems:

P(Y′):  min c^T x  s.t.  a_y^T x − b_y ≥ 0,  ∀y ∈ Y′,

D(Y′):  max Σ_{y∈Y′} u_y b_y  s.t.  Σ_{y∈Y′} u_y a_y = c,  u_y ≥ 0.

Let v(Y′), v_D(Y′) and F(Y′), F_D(Y′) denote the minimal values and feasible sets of the problems P(Y′), D(Y′), respectively. Instead of Σ_{y∈Y′} u_y b_y for u ∈ F_D(Y′), as in Section 5.2, we will write b^T u. From LP theory it follows (see [7]) that for any finite Y′ (with F(Y′) ≠ ∅ or F_D(Y′) ≠ ∅) the relation

v_D(Y′) = v(Y′) ≤ v_D(Y) ≤ v(Y) (36)

holds. Recall that a finitely reducible LSIP is in particular weakly discretizable. For the case v(Y) = −∞ the problem P is trivially finitely reducible.

THEOREM 11 Let P(Y) be feasible and bounded. Then

P(Y) is finitely reducible ⇔ D(Y) is solvable and P(Y) is weakly discretizable.

Proof. "⇒": Let Y′ ⊂ Y be finite with v(Y) = v(Y′), −∞ < v(Y) < ∞. LP theory entails that the problem D(Y′) has an optimal solution u′ ∈ F_D(Y′) ⊂ F_D(Y). Using v(Y) = v(Y′), we find from (36) that v_D(Y′) = v_D(Y) must hold, i.e., u′ is an optimal solution of D(Y). Recall that a finitely reducible problem is always weakly discretizable.
"⇐": Assume that we are given a sequence of grids Y_k in Y such that v(Y_k) → v(Y) (−∞ < v(Y) < ∞) and that u′ is a maximizer of D(Y). With the finite set Y′ = supp u′ (the support of u′) the solution u′ is also a maximizer of D(Y′), and it follows (see (36))

v(Y_k) = v_D(Y_k) ≤ v_D(Y′) = v(Y′) = v_D(Y) ≤ v(Y).

In view of v(Y_k) → v(Y) we find v(Y) = v(Y′). □

THEOREM 12 Let P(Y) be feasible. Then

P(Y) is weakly discretizable ⇔ v(Y) = v_D(Y) (no duality gap).

Proof. "⇒": Assume that Y_k is a sequence of grids in Y such that v(Y_k) → v(Y) (< ∞). By (36) we find

v(Y_k) = v_D(Y_k) ≤ v_D(Y) ≤ v(Y),

and by taking the limit k → ∞ also v(Y) = v_D(Y).
"⇐": If v(Y) = v_D(Y) = −∞, i.e., F_D(Y) = ∅, then P(Y) is finitely reducible and thus



weakly discretizable. In the other case there exist sequences of feasible solutions x_k ∈ F(Y) and u_k ∈ F_D(Y) such that

b^T u_k → v_D(Y) = v(Y) ← c^T x_k. (37)

With the finite sets Y_k = supp u_k we deduce from (36)

b^T u_k ≤ v_D(Y_k) = v(Y_k) ≤ v_D(Y) ≤ v(Y) ≤ c^T x_k,

and (37) implies v(Y_k) → v(Y). □

Note that by the last theorems weak discretizability is directly related to duality, and finite reducibility to the solvability of the dual. A sufficient condition for the existence of an optimal solution of the dual is (for example) that the Slater condition holds.

Discretizability. The next example illustrates the difference between weak discretizability and discretizability.

EX. 13 Consider the linear SIP (with some fixed ε > 0):

min x_1  s.t.  x_1 cos y + x_2 sin y ≤ 1,  y ∈ Y :=  [π, 3π/2]  (case A),  [π − ε, 3π/2]  (case B).

A minimizer of P(Y) is x̄ = (−1, 0).

Case A: The problem is weakly discretizable but not discretizable. For a grid Y′ containing y = π we have v(Y′) = v(Y). On the other hand, for any Y′ not containing π the value is unbounded, v(Y′) = −∞.
Case B: The problem is discretizable, as is easily shown. Note that in case B the condition c ∈ int cone{a(y) | y ∈ Y} is satisfied, but not in case A (cf. also Theorem 14).
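This behaviour is easily reproduced numerically; the following check (assuming SciPy's linprog; the grids and the offset are arbitrary choices) solves P(Y′) for case A grids with and without y = π:

    import numpy as np
    from scipy.optimize import linprog

    def v(grid):
        # P(Y'): min x1  s.t.  x1*cos(y) + x2*sin(y) <= 1  for y in the grid
        A_ub = np.column_stack([np.cos(grid), np.sin(grid)])
        res = linprog(c=[1.0, 0.0], A_ub=A_ub, b_ub=np.ones(grid.size),
                      bounds=[(None, None), (None, None)])
        return res.fun if res.status == 0 else '-infinity (unbounded)'

    grid = np.linspace(np.pi + 1e-3, 1.5 * np.pi, 50)   # grid missing y = pi
    print(v(grid))                                # v(Y') = -infinity
    print(v(np.concatenate([[np.pi], grid])))     # v(Y') = -1 = v(Y)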

The following algorithm is based on the concept of discretizability.

Algorithm 1 (Conceptual discretization method)

Step k: Given a discretization Y_k ⊂ Y:

i. Compute a solution x_k of P(Y_k).

ii. Stop, if x_k is feasible within a fixed accuracy ε > 0, i.e., g(x_k, y) ≥ −ε ∀y ∈ Y. Otherwise, select a finer discretization Y_{k+1} ⊂ Y.
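A minimal realization of Algorithm 1 on the problem of Ex. 11 may look as follows (Python/SciPy; the solver SLSQP, the dense feasibility check replacing the exact test over Y, and all tolerances are illustrative assumptions):

    import numpy as np
    from scipy.optimize import minimize

    f = lambda x: x[1]
    g = lambda x, y: (x[0] - y) ** 2 + x[1]      # Ex. 11 with Y = [0, 1]

    def solve_P(grid):                           # step i: solve P(Y_k)
        cons = [{'type': 'ineq', 'fun': lambda x, y=y: g(x, y)} for y in grid]
        return minimize(f, x0=[0.5, -1.0], method='SLSQP', constraints=cons,
                        bounds=[(0.0, 1.0), (None, None)]).x

    eps, Y_dense = 1e-4, np.linspace(0.0, 1.0, 5001)
    n = 2
    while True:
        xk = solve_P(np.linspace(0.0, 1.0, n))
        if min(g(xk, y) for y in Y_dense) >= -eps:   # step ii: eps-feasible?
            break
        n = 2 * n - 1           # refine so that Y_k is contained in Y_{k+1}
    print(n, xk)                # stops with g(xk, .) >= -eps, x2 close to 0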

Under a compactness assumption on the feasible sets we obtain a general convergence result for this method.

THEOREM 13 Let the sequence of discretizations Y_k satisfy

Y_0 ⊂ Y_k ∀k ≥ 1  and  δ(Y_k) → 0 for k → ∞.

Suppose F(Y_0) is compact. Then P(Y) is discretizable, i.e., the problems P(Y_k) have solutions x_k and each such sequence of solutions satisfies dist(x_k, S(Y)) → 0.



Proof. By assumption and using Y_0 ⊂ Y_k ⊂ Y, the feasible sets F(Y), F(Y_k) of P(Y), P(Y_k), respectively, are compact and satisfy F(Y) ⊂ F(Y_k) ⊂ F(Y_0), k ∈ N. Consequently, solutions x_k of P(Y_k) exist. Suppose now that a sequence of such solutions does not satisfy dist(x_k, S(Y)) → 0. Then there exist ε > 0 and a subsequence x_{k_ν} such that

dist(x_{k_ν}, S(Y)) ≥ ε > 0  ∀ν.

Since x_{k_ν} ∈ F(Y_0), we can select a convergent subsequence. Without restriction we can assume x_{k_ν} → x̄, ν → ∞. In view of F(Y) ⊂ F(Y_{k_ν}), the relation f(x_{k_ν}) ≤ v(Y) holds and thus by continuity of f we find

f(x̄) ≤ v(Y).

We now show that x̄ ∈ S(Y), in contradiction to our assumption. To do so it suffices to prove that x̄ ∈ F(Y). Let y ∈ Y be given arbitrarily. Since δ(Y_{k_ν}) → 0 for ν → ∞, we can choose y_{k_ν} ∈ Y_{k_ν} such that y_{k_ν} → y. In view of g(x_{k_ν}, y_{k_ν}) ≥ 0, taking the limit ν → ∞ yields g(x̄, y) ≥ 0, i.e., x̄ ∈ F(Y). □

The following result on the discretizability of LSIP is contained in [12, pp. 70-75].

THEOREM 14 Let the LSIP problem P(Y) be feasible and assume that the condition c ∈ int cone{a(y) | y ∈ Y} holds. Then S(Y) is non-empty and bounded, so a solution of P(Y) exists. Moreover, P(Y) is discretizable.

We now consider discretizability for nonlinear semi-infinite problems. Here we have to make use of the local concept. The following lemma can easily be proven (see e.g. [34]).

LEMMA 6 Let a sequence of grids Y_k ⊂ Y with δ_k := δ(Y_k) → 0 be given.

(a) Let x_k be points in F(Y_k) ∩ K, where K is a compact subset of R^n. Then there exists c > 0 such that for all δ_k > 0 small enough

g(x_k, y) ≥ −c δ_k  ∀y ∈ Y.

(b) Let MFCQ be satisfied at x̄ with the vector d (cf. (10)). Then there exist numbers γ > 0, ε_1 > 0 such that for small δ_k the points x_k + γ δ_k d are feasible for P(Y) for all x_k ∈ F(Y_k) with ‖x_k − x̄‖ < ε_1.

THEOREM 15 Let x̄ be a local minimizer of P(Y) of order p. Suppose MFCQ holds at x̄. Then P is locally discretizable at x̄. More precisely, there is some C > 0 such that for any sequence of grids Y_k ⊂ Y with δ(Y_k) → 0 and any sequence of solutions x_k of the locally restricted problem P_l(Y_k) (see the definition of discretizability) the following relation holds:

‖x_k − x̄‖ ≤ C δ(Y_k)^{1/p}. (38)



Proof. Consider the SIP restricted to the closed ball cl B_ρ(x̄), with small ρ chosen such that ρ < ε, ε_1 (with ε from (12) and ε_1 from Lemma 6):

P_l(Y_k):  min f(x)  s.t.  x ∈ F(Y_k) ∩ cl B_ρ(x̄).

Obviously, since x̄ ∈ F(Y_k) and F(Y_k) ∩ cl B_ρ(x̄) is compact (and nonempty), a solution x^l_k exists. Note that x̄ is the unique (global) minimizer of P_l(Y). Put δ_k := δ(Y_k) and consider any sequence of solutions x^l_k of P_l(Y_k). In view of F(Y) ⊂ F(Y_k) and x^l_k + γ δ_k d ∈ F(Y) ∩ cl B_ε(x̄) (for large k, see Lemma 6(b)) we find

f(x^l_k) ≤ f(x̄) ≤ f(x^l_k + γ δ_k d).

Since x̄ is a minimizer of order p (see (12)) it follows that

‖x^l_k + γ δ_k d − x̄‖^p ≤ (1/q) ( f(x^l_k + γ δ_k d) − f(x̄) ) ≤ (1/q) ( f(x^l_k + γ δ_k d) − f(x^l_k) ) = O(δ_k).

Finally, the triangle inequality yields

‖x^l_k − x̄‖ ≤ ‖x^l_k + γ δ_k d − x̄‖ + ‖γ δ_k d‖ = O(δ_k^{1/p}). (39)

In particular ‖x^l_k − x̄‖ < ρ (for large k), so that the x^l_k are (global) minimizers of the problem P_l(Y_k) restricted to the open neighborhood B_ρ(x̄). □

REMARK 7 The result of Theorem 15 remains true for the global minimization problem P if the sets F(Y_k) and F(Y) are restricted to a compact subset K ⊂ R^n.

Under additional assumptions on the quality of the discretizations Y_k one can prove a faster convergence rate than in (38) (see [34]).

7.4 Exchange method

We also outline the exchange method, which is often more efficient than a pure discretization method. This method can be seen as a compromise between the discretization method of Section 7.3 and the continuous reduction approach of Section 7.2.

Algorithm 2 (Conceptual exchange method)

Step k: Given a discretization Y_k ⊂ Y and a fixed, small value ε > 0:

i. Compute a solution x_k of P(Y_k).

ii. Compute local solutions y^i_k, i = 1, ..., i_k (i_k ≥ 1) of Q(x_k) (cf. (23)) such that one of them, say y^1_k, is a global solution, i.e., g(x_k, y^1_k) = min_{y ∈ Y} g(x_k, y).

iii. Stop, if g(x_k, y^1_k) ≥ −ε, with solution x̄ ≈ x_k. Otherwise, update

Y_{k+1} = Y_k ∪ {y^i_k, i = 1, ..., i_k}. (40)
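A sketch of the exchange method on the toy problem of Ex. 11 follows (Python/SciPy, again with illustrative choices); here Q(x) = min_{y ∈ [0,1]} g(x, y) has the closed-form global minimizer y = max(0, min(1, x_1)), so step ii is exact:

    import numpy as np
    from scipy.optimize import minimize

    f = lambda x: x[1]
    g = lambda x, y: (x[0] - y) ** 2 + x[1]

    def solve_P(grid):                           # step i: solve P(Y_k)
        cons = [{'type': 'ineq', 'fun': lambda x, y=y: g(x, y)} for y in grid]
        return minimize(f, x0=[0.3, -1.0], method='SLSQP', constraints=cons,
                        bounds=[(0.0, 1.0), (None, None)]).x

    Yk, eps = [0.0, 1.0], 1e-3
    for k in range(50):
        xk = solve_P(Yk)
        y1 = min(max(xk[0], 0.0), 1.0)       # step ii: global solution of Q(x_k)
        if g(xk, y1) >= -eps:                # step iii: stop if eps-feasible
            break
        Yk.append(y1)                        # exchange update (40)
    print(k, len(Yk), xk)

Compared with Algorithm 1, only the (few) points delivered by Q(x_k) are added, so the discretizations typically stay much smaller.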

THEOREM 16 Suppose that the (starting) feasible set F(Y_0) is compact. Then the exchange method (with ε = 0) either stops with a solution x̄ = x_{k_0} of P(Y), or the sequence {x_k} of solutions of P(Y_k) satisfies dist(x_k, S(Y)) → 0.

Proof. We consider the case that the algorithm does not stop with a minimizer of P(Y). As in the proof of Theorem 13, by our assumptions a solution x_k of P(Y_k) exists, x_k ∈ F(Y_0), and for a subsequence x_{k_ν} → x̄ we find

f(x̄) ≤ v(Y).

Again we have to show x̄ ∈ F, or equivalently φ(x̄) ≥ 0 for the value function φ(x) of Q(x). In view of φ(x_k) = g(x_k, y^1_k) (see Algorithm 2, step ii) we can write

φ(x̄) = φ(x_k) + φ(x̄) − φ(x_k) = g(x_k, y^1_k) + φ(x̄) − φ(x_k).

Since y^1_k ∈ Y_{k+1} we have g(x_{k+1}, y^1_k) ≥ 0, and by continuity of g and φ we find

φ(x̄) ≥ (g(x_k, y^1_k) − g(x_{k+1}, y^1_k)) + (φ(x̄) − φ(x_k)) → 0  for k → ∞. □

We refer to the review paper [9] for more details on this approach.

8 Bilevel problems and Mathematical programs with equilibrium constraints

Recall the GSIP problem (2). Bilevel problems are of the form

BL:  min_{x,y} f(x, y)  s.t.  h(x, y) ≥ 0
     and y is a solution of  Q(x): min_t F(x, t)  s.t.  t ∈ Y(x). (41)

We also consider mathematical programs with equilibrium constraints:

MPEC:  min_{x,y} f(x, y)  s.t.  h(x, y) ≥ 0,
       y ∈ Y(x),
       g(x, y, t) ≥ 0  ∀t ∈ Y(x). (42)

BL and MPEC problems both represent important classes of optimization problems which have been investigated in a large number of papers and books (see e.g. [2], [24] and the references therein).



8.1 Comparison between the structure of the problems

In this subsection we discuss the relations between the three problem classes GSIP, BL and MPEC. Obviously, GSIP is a special case of MPEC. But BL can also be written in MPEC form as follows.

BL → MPEC: If we write the second constraint in BL equivalently as

y ∈ Y(x)  and  F(x, t) − F(x, y) ≥ 0  ∀t ∈ Y(x),

the BL problem becomes

BL:  min_{x,y} f(x, y)  s.t.  h(x, y) ≥ 0,
     y ∈ Y(x),
     F(x, t) − F(x, y) ≥ 0  ∀t ∈ Y(x). (43)

Remark. Note however that there is a subtle difference in the interpretation of the constraints

g(x, y) ≥ 0  ∀y ∈ Y(x)  or  g(x, y, t) ≥ 0  ∀t ∈ Y(x). (44)

In the case that Y(x) is empty, no feasible point (x, y) exists (for this x) for BL and MPEC, because of the additional condition y ∈ Y(x). For the GSIP problem, however, an empty index set Y(x) means that there are no constraints, and such points x are feasible.

GSIP and MPEC can also be transformed into a problem of bilevel structure.

GSIP → BL: We assume Y(x) ≠ ∅ for all x, and recall the (lower level) problem (see (23))

Q(x):  min_t g(x, t)  s.t.  t ∈ Y(x), (45)

depending on the parameter x. Then (assuming that Q(x) is solvable) we can write

g(x, t) ≥ 0  ∀t ∈ Y(x)  ⇔  g(x, y) ≥ 0 and y solves Q(x).

So GSIP takes the BL form:

GSIP:  min_{x,y} f(x)  s.t.  g(x, y) ≥ 0
       and y is a solution of  Q(x): min_t g(x, t)  s.t.  t ∈ Y(x). (46)

This problem is a BL program with the special property that the objective function f does not depend on y and that the constraint function g of the first level and the lower level objective function coincide.

MPEC → BL: By applying the same trick as before we consider the programs

Q(x, y):  min_t g(x, y, t)  s.t.  t ∈ Y(x), (47)

depending on the parameter (x, y), and find

g(x, y, t) ≥ 0  ∀t ∈ Y(x)  ⇔  g(x, y, t) ≥ 0 and t solves Q(x, y).

Consequently MPEC turns into a problem of BL type (with x replaced by (x, y)):

MPEC:  min_{x,y,t} f(x, y)  s.t.  h(x, y) ≥ 0,
       y ∈ Y(x),  g(x, y, t) ≥ 0,
       and t is a solution of  Q(x, y): min_u g(x, y, u)  s.t.  u ∈ Y(x). (48)

Special case g(x, y, y) = 0: Under the extra condition

g(x, y, y) = 0  ∀y (49)

in MPEC we can 'eliminate' the variable t as follows. In view of (49) we find

y ∈ Y(x) and g(x, y, u) ≥ 0 ∀u ∈ Y(x)  ⇔  y solves Q(x, y),

and MPEC is equivalent to

MPEC:  min_{x,y} f(x, y)  s.t.  h(x, y) ≥ 0,
       y ∈ Y(x),
       and y is a solution of  Q(x, y): min_u g(x, y, u)  s.t.  u ∈ Y(x). (50)

Remark. Note that the problem MPEC in (50) now has a more complicated structure than BL, since in MPEC the lower level problem Q(x, y) depends on the variable y, which should at the same time solve Q(x, y).

For a comparison between BL and GSIP from the structural and generic viewpoint we refer to [26], and between BL and MPEC to [4].

To solve these problems numerically it is convenient to reformulate them as nonlinear programs. For brevity we only consider (the most complicated case) MPEC. To do so we assume that the sets Y(x) are given explicitly in the form

Y(x) = {y ∈ R^m | v(x, y) ≥ 0}  with a C¹-function v: R^{n+m} → R^q.

Let also g be a C¹-function. Then, if t is a solution of Q(x, y) which satisfies some constraint qualification (CQ), then t must necessarily satisfy the Kuhn-Tucker conditions

∇_t g(x, y, t) − λ^T ∇_t v(x, t) = 0,
λ^T v(x, t) = 0 (51)



with some multiplier 0 ≤ λ ∈ R^q. So we can consider the following relaxation of MPEC:

min_{x,y,t} f(x, y)  s.t.  g(x, y, t) ≥ 0,
                           ∇_t g(x, y, t) − λ^T ∇_t v(x, t) = 0,
                           λ^T v(x, t) = 0,
                           λ, h(x, y), v(x, y), v(x, t) ≥ 0. (52)

The program (52) is a relaxation of the MPEC (48) in the sense that (under CQ) the feasible set of the MPEC is contained in the feasible set of (52). In particular, any solution (x, y, t) of (52) with the property that t is a minimizer of Q(x, y) must also be a solution of the original MPEC.

8.2 Convexity conditions for Q(x, y)

Let us now consider the special case where Q(x, y) represents a convex problem, i.e., for any fixed x and y the function g(x, y, t) is convex in t, and for any fixed x the functions −v_l(x, t) are convex in t. Then it is well-known that the Kuhn-Tucker conditions (51) are sufficient for t to be a solution of Q(x, y). So in this case any solution of (52) provides a solution of the MPEC. If moreover CQ is satisfied for Q(x, y), then (52) is equivalent to the original MPEC. In the form (52), an MPEC is transformed into a nonlinear program with complementarity constraints (see e.g. [28]). In this form the problems can be solved numerically, for instance by an interior point approach. For GSIP problems this has been done successfully in [30].
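As an illustration of the reformulation (52), the following sketch builds the KKT relaxation for a small artificial MPEC and hands it to an SQP-type NLP solver (Python/SciPy; the instance and all identifiers are assumptions made for this example only). Complementarity constraints violate the standard constraint qualifications, so a plain NLP solver may need a good starting point; regularization or interior point approaches as in [30] are more robust.

    import numpy as np
    from scipy.optimize import minimize

    # Toy MPEC (illustrative data):
    #   f(x,y) = x^2 + (y-1)^2,  h(x,y) = y,  g(x,y,t) = t + y,
    #   Y(x) = {t : v(x,t) >= 0},  v(x,t) = (t + 1, x - t),  i.e. Y(x) = [-1, x].
    # g is linear in t and Y(x) is convex, so by Section 8.2 the KKT
    # conditions (51) are sufficient and the relaxation (52) is exact.
    # Variables z = (x, y, t, lam1, lam2).
    f = lambda z: z[0] ** 2 + (z[1] - 1.0) ** 2
    cons = [
        {'type': 'ineq', 'fun': lambda z: z[2] + z[1]},          # g(x,y,t) >= 0
        {'type': 'eq',   'fun': lambda z: 1.0 - z[3] + z[4]},    # stationarity
        {'type': 'eq',   'fun': lambda z: z[3] * (z[2] + 1.0)
                                        + z[4] * (z[0] - z[2])}, # lam^T v(x,t) = 0
        {'type': 'ineq', 'fun': lambda z: z[3]},                 # lam >= 0
        {'type': 'ineq', 'fun': lambda z: z[4]},
        {'type': 'ineq', 'fun': lambda z: z[1]},                 # h(x,y) >= 0
        {'type': 'ineq', 'fun': lambda z: z[1] + 1.0},           # v(x,y) >= 0
        {'type': 'ineq', 'fun': lambda z: z[0] - z[1]},
        {'type': 'ineq', 'fun': lambda z: z[2] + 1.0},           # v(x,t) >= 0
        {'type': 'ineq', 'fun': lambda z: z[0] - z[2]},
    ]
    res = minimize(f, x0=[2.0, 0.5, -0.5, 1.0, 0.0], method='SLSQP',
                   constraints=cons)
    print(res.x)   # expected approx (1, 1, -1, 1, 0), optimal value 1

Here the semi-infinite constraint t + y ≥ 0 for all t ∈ [−1, x] forces y ≥ 1, so the solution of this toy MPEC is (x, y) = (1, 1) with lower level solution t = −1.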

9 Appendix

This section contains some definitions and auxiliary results. A set C is called convex if C contains with any x, y ∈ C also the whole line segment [x, y] = {(1 − λ)x + λy | 0 ≤ λ ≤ 1}. For an arbitrary set A ⊂ R^n we define its convex cone by

cone A = { a = Σ_{j=1}^k λ_j a_j | k ≥ 1, a_j ∈ A, λ_j ≥ 0 }

and its convex hull by

conv A = { a = Σ_{j=1}^k λ_j a_j | k ≥ 1, a_j ∈ A, λ_j ≥ 0, Σ_{j=1}^k λ_j = 1 }.



LEMMA 7 (Carathéodory) For A ⊂ R^n, each a ∈ conv A can be represented as a convex combination of (at most) n + 1 vectors, a = Σ_{j=1}^{n+1} λ_j a_j, and each element a ∈ cone A as a conic combination of (at most) n vectors, a = Σ_{j=1}^n λ_j a_j.

A function f: C → R defined on a convex set C ⊂ R^n is called convex if for all x, y ∈ C and 0 ≤ λ ≤ 1,

f((1 − λ)x + λy) ≤ (1 − λ) f(x) + λ f(y).

LEMMA 8 A differentiable function f: C → R (C ⊂ R^n open and convex) is convex on C if and only if for all x, y ∈ C: f(x) ≥ f(y) + ∇f(y)(x − y).

LEMMA 9 (Generalized Gordan Lemma) Let A ⊂ R^n be a compact set. Then exactly one of the following alternatives is true:

(i) 0 ∈ conv A;

(ii) there exists some d ∈ R^n such that a^T d < 0 for all a ∈ A.

References

[1] Amaya J., Gomez J.A., Strong duality for inexact linear programming problems, Optimization 49, No. 3, 243-269, (2001).

[2] Bard J.F., Practical Bilevel Optimization. Algorithms and Applications, Nonconvex Optimization and its Applications 30, Kluwer Academic Publishers, Dordrecht, (1998).

[3] Ben-Tal A., Nemirovski A., Robust solutions of uncertain linear programs, Operations Research Letters 25, 1-13, (1999).

[4] Birbil S., Bouza G., Frenk J.B.G., Still G., Equilibrium constrained optimization problems, submitted to EJOR.

[5] Dall'Aglio M., On some applications of LSIP to probability and statistics, in Semi-infinite Programming, Eds. Goberna/Lopez (Alicante, 1999), 237-254, Nonconvex Optim. Appl. 57, Kluwer Acad. Publ., Dordrecht, (2001).

[6] Dambrine M., Pierre M., About stability of equilibrium shapes, Math. Modelling and Num. Analysis 34, No. 4, 811-834, (2000).

[7] Faigle U., Kern W., Still G., Algorithmic Principles of Mathematical Programming, Kluwer, Dordrecht, (2002).

[8] Glashoff K., Gustafson S.-A., Linear Optimization and Approximation, Springer Verlag, Berlin, (1983).

[9] Hettich R., Kortanek K., Semi-infinite programming: Theory, methods and applications, SIAM Review 35, No. 3, 380-429, (1993).

[10] Goberna M.A., Lopez M.A., Linear Semi-Infinite Optimization, John Wiley & Sons, Chichester, (1998).

[11] Haaren-Retagne E., A semi-infinite programming algorithm for robot trajectory planning, Dissertation, University of Trier, (1992).

[12] Hettich R., Zencke P., Numerische Methoden der Approximation und der semi-infiniten Optimierung, Teubner, Stuttgart, (1982).

[13] Hettich R., Still G., On generalized semi-infinite programming problems, Working paper, University of Trier, (1986).

[14] Hettich R., Still G., Semi-infinite programming models in robotics, in 'Parametric Optimization and Related Topics II', Guddat et al. (eds.), Akademie Verlag, Berlin, (1991).

[15] Hettich R., Still G., Second order optimality conditions for generalized semi-infinite programming problems, Optimization 34, 195-211, (1995).

[16] Hoffmann A., Reinhardt R., On reverse Chebyshev approximation problems, Technical University of Ilmenau, Preprint No. M08/94, (1994).

[17] Guddat J., Jongen H.Th., Ruckmann J., On stability and stationary points in nonlinear optimization, J. Australian Math. Soc., Series B 28, 36-56, (1986).

[18] Jess A., Jongen H.Th., Neralic L., Stein O., A semi-infinite programming model in data envelopment analysis, Optimization 49, 369-385, (2001).

[19] Jongen H.Th., Zwier G., On the local structure of the feasible set in semi-infinite optimization, in: Brosowski, Deutsch (eds.), Int. Ser. Num. Math. 72, Birkhauser Verlag, Basel, 185-202, (1984).

[20] Jongen H.Th., Twilt F., Weber G.-W., Semi-infinite optimization: structure and stability of the feasible set, JOTA 72, 529-552, (1992).

[21] Jongen H.Th., Jonker P., Twilt F., Nonlinear Optimization in Finite Dimensions, Kluwer, Dordrecht, (2000).

[22] Jongen H.Th., Ruckmann J.-J., Stein O., Generalized semi-infinite optimization: A first order optimality condition and examples, Mathematical Programming 83, 145-158, (1998).

[23] León T., Vercher E., Optimization under uncertainty and linear semi-infinite programming: a survey, in Semi-infinite Programming, Eds. Goberna/Lopez (Alicante, 1999), 327-348, Nonconvex Optim. Appl. 57, Kluwer Acad. Publ., Dordrecht, (2001).

[24] Luo Z.-Q., Pang J.-S., Ralph D., Mathematical Programs with Equilibrium Constraints, Cambridge University Press, Cambridge, (1997).

[25] Polak E., On the mathematical foundation of nondifferentiable optimization in engineering design, SIAM Review 29, 21-89, (1987).

[26] Stein O., Still G., On generalized semi-infinite optimization and bilevel optimization, Europ. J. of Operational Research 142, Issue 3, 444-462, (2002).

[27] Sanchez-Soriano J., Llorca N., Tijs S., Timmer J., Semi-infinite assignment and transportation games, in Semi-infinite Programming, Eds. Goberna/Lopez (Alicante, 1999), 349-363, Nonconvex Optim. Appl. 57, Kluwer Acad. Publ., Dordrecht, (2001).

[28] Scheel H., Scholtes S., Mathematical programs with complementarity constraints: Stationarity, optimality, and sensitivity, Mathematics of Operations Research 25, 1-22, (2000).

[29] Soyster A.L., Convex programming with set-inclusive constraints and applications to inexact linear programming, Operations Research 21, 1154-1157, (1973).

[30] Stein O., Still G., Solving semi-infinite optimization problems with interior point techniques, SIAM J. Control Optim. 42, No. 3, 769-788, (2003).

[31] Still G., Haaren-Retagne E., Hettich R., A numerical comparison of two approaches to compute membrane eigenvalues by defect-minimization, in 'Intern. Series of Numer. Math.' 96, Birkhauser Verlag, 209-224, (1991).

[32] Still G., Generalized semi-infinite programming: Theory and methods, European Journal of Operational Research 119, 301-313, (1999).

[33] Still G., Generalized semi-infinite programming: Numerical aspects, Optimization 49, No. 3, 223-242, (2001).

[34] Still G., Discretization in semi-infinite programming: the rate of convergence, Math. Program. 91, No. 1, Ser. A, 53-69, (2001).

[35] Tichatschke R., Hettich R., Still G., Connections between generalized inexact and semi-infinite linear programming, ZOR-Methods and Models of OR 33, 367-382, (1989).

[36] Wetterling W., Definitheitsbedingungen für relative Extrema bei Optimierungs- und Approximationsaufgaben, Numer. Math. 15, 122-136, (1970).

[37] Wolkowicz H., Saigal R., Vandenberghe L. (Eds.), Handbook of Semidefinite Programming. Theory, Algorithms, and Applications, International Series in Operations Research and Management Science 27, Kluwer Academic Publishers, Boston, (2000).