17
MATHEMATICS OF OPERATIONS RESEARCH Vol. 32, No. 4, November 2007, pp. 840–856 issn 0364-765X eissn 1526-5471 07 3204 0840 inf orms ® doi 10.1287/moor.1070.0274 © 2007 INFORMS Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control Boris S. Mordukhovich Department of Mathematics, Wayne State University, Detroit, Michigan 48202, [email protected] This paper is devoted to applications of modern methods of variational analysis to constrained optimization and control problems generally formulated in infinite-dimensional spaces. The main focus is on the study of problems with nonsmooth structures, which require the usage of advanced tools of generalized differentiation. In this way we derive new necessary optimality conditions in optimization problems with functional and operator constraints and then apply them to optimal control problems governed by discrete-time inclusions in infinite dimensions. The principal difference between finite-dimensional and infinite-dimensional frameworks of optimization and control consists of the “lack of compactness” in infinite dimensions, which leads to imposing certain “normal compactness” properties and developing their comprehensive calculus, together with appropriate calculus rules of generalized differentiation. On the other hand, one of the most important achievements of the paper consists of relaxing the latter assumptions for certain classes of optimization and control problems. In particular, we fully avoid the requirements of this type imposed on target endpoint sets in infinite-dimensional optimal control for discrete-time inclusions. Key words : variational analysis; nonsmooth optimization and optimal control; discrete-time inclusions; generalized differentiation; infinite dimensions; necessary optimality conditions MSC2000 subject classification : Primary: 49J53, 90C30 OR/MS subject classification : Primary: nondifferentiable programming, optimal control History : Received March 19, 2006; revised December 7, 2006. 1. Introduction. Variational analysis has been recognized as a rapidly growing and fruitful area in applied mathematics concerning mainly the study of optimization and equilibrium problems and also applying pertur- bation ideas and variational principles to a broad class of problems and situations, which may not be of a variational nature. The reader can find more information about basic principles and results of variational analysis in the now classic book by Rockafellar and Wets [20] devoted to variational analysis in finite dimensions and in the recent mutually complementary books by Borwein and Zhu [3] and by Mordukhovich [16, 17] dealing with variational analysis and its applications in both finite-dimensional and infinite-dimensional spaces. It is important to emphasize that basic techniques in variational analysis (particularly those related to perturbations and the usage of modern variational principles) unavoidably lead to nonsmooth structures of functions, sets, and set-valued mappings, even for problems with initially smooth data. Thus, appropriate tools of generalized differentiation lie at the very heart of modern variational analysis and its applications. This paper concerns applications of advanced techniques of variational analysis and generalized differentiation to deriving necessary optimality conditions in three important classes of constrained optimization problems formulated generally in infinite-dimensional spaces. The first class consists of problems in nondifferentiable programming with finitely many functional constraints described by inequalities and inequalities with real-valued functions, along with geometric constraints given by closed subsets of Asplund spaces, which is a large subclass of Banach spaces that particularly includes all reflexive spaces. The main results are obtained via several versions of the extended Lagrange principle expressed in terms of our basic normals, subgradients, and coderivatives that satisfy comprehensive calculus rules—the so-called full calculus. The second class of optimization problems under consideration contains the so-called operator constraints given by mappings with values in infinite-dimensional spaces. Such constraints are typical in many infinite- dimensional optimization and equilibrium problems, particularly those arising in dynamic optimization and optimal control. Problems with operator constraints are significantly different from those with finitely many functional constraints; it is well known that a counterpart of the Lagrange multiplier rule does not hold even in smooth settings with no additional assumptions. We derive general results that ensure the fulfillment of first- order necessary optimality conditions in nonsmooth problems with operator constraints by imposing fairly weak assumptions of a “sequential normal compactness” (SNC) type, which are automatic in finite dimensions while being among the most essential ingredients of infinite-dimensional variational analysis. Furthermore, we show that the imposed assumptions always hold for a large class of generalized Fredholm mappings, which exhibit a kind of “finite codimension” behavior. 840

Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control

  • Upload
    boris-s

  • View
    219

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control

MATHEMATICS OF OPERATIONS RESEARCH

Vol. 32, No. 4, November 2007, pp. 840–856issn 0364-765X �eissn 1526-5471 �07 �3204 �0840

informs ®

doi 10.1287/moor.1070.0274©2007 INFORMS

Variational Analysis in Nonsmooth Optimization andDiscrete Optimal Control

Boris S. MordukhovichDepartment of Mathematics, Wayne State University, Detroit, Michigan 48202,

[email protected]

This paper is devoted to applications of modern methods of variational analysis to constrained optimization and controlproblems generally formulated in infinite-dimensional spaces. The main focus is on the study of problems with nonsmoothstructures, which require the usage of advanced tools of generalized differentiation. In this way we derive new necessaryoptimality conditions in optimization problems with functional and operator constraints and then apply them to optimal controlproblems governed by discrete-time inclusions in infinite dimensions. The principal difference between finite-dimensionaland infinite-dimensional frameworks of optimization and control consists of the “lack of compactness” in infinite dimensions,which leads to imposing certain “normal compactness” properties and developing their comprehensive calculus, togetherwith appropriate calculus rules of generalized differentiation. On the other hand, one of the most important achievements ofthe paper consists of relaxing the latter assumptions for certain classes of optimization and control problems. In particular,we fully avoid the requirements of this type imposed on target endpoint sets in infinite-dimensional optimal control fordiscrete-time inclusions.

Key words : variational analysis; nonsmooth optimization and optimal control; discrete-time inclusions; generalizeddifferentiation; infinite dimensions; necessary optimality conditions

MSC2000 subject classification : Primary: 49J53, 90C30OR/MS subject classification : Primary: nondifferentiable programming, optimal controlHistory : Received March 19, 2006; revised December 7, 2006.

1. Introduction. Variational analysis has been recognized as a rapidly growing and fruitful area in appliedmathematics concerning mainly the study of optimization and equilibrium problems and also applying pertur-bation ideas and variational principles to a broad class of problems and situations, which may not be of avariational nature. The reader can find more information about basic principles and results of variational analysisin the now classic book by Rockafellar and Wets [20] devoted to variational analysis in finite dimensions andin the recent mutually complementary books by Borwein and Zhu [3] and by Mordukhovich [16, 17] dealingwith variational analysis and its applications in both finite-dimensional and infinite-dimensional spaces. It isimportant to emphasize that basic techniques in variational analysis (particularly those related to perturbationsand the usage of modern variational principles) unavoidably lead to nonsmooth structures of functions, sets,and set-valued mappings, even for problems with initially smooth data. Thus, appropriate tools of generalizeddifferentiation lie at the very heart of modern variational analysis and its applications.This paper concerns applications of advanced techniques of variational analysis and generalized differentiation

to deriving necessary optimality conditions in three important classes of constrained optimization problemsformulated generally in infinite-dimensional spaces. The first class consists of problems in nondifferentiableprogramming with finitely many functional constraints described by inequalities and inequalities with real-valuedfunctions, along with geometric constraints given by closed subsets of Asplund spaces, which is a large subclassof Banach spaces that particularly includes all reflexive spaces. The main results are obtained via several versionsof the extended Lagrange principle expressed in terms of our basic normals, subgradients, and coderivatives thatsatisfy comprehensive calculus rules—the so-called full calculus.The second class of optimization problems under consideration contains the so-called operator constraints

given by mappings with values in infinite-dimensional spaces. Such constraints are typical in many infinite-dimensional optimization and equilibrium problems, particularly those arising in dynamic optimization andoptimal control. Problems with operator constraints are significantly different from those with finitely manyfunctional constraints; it is well known that a counterpart of the Lagrange multiplier rule does not hold evenin smooth settings with no additional assumptions. We derive general results that ensure the fulfillment of first-order necessary optimality conditions in nonsmooth problems with operator constraints by imposing fairly weakassumptions of a “sequential normal compactness” (SNC) type, which are automatic in finite dimensions whilebeing among the most essential ingredients of infinite-dimensional variational analysis. Furthermore, we showthat the imposed assumptions always hold for a large class of generalized Fredholm mappings, which exhibit akind of “finite codimension” behavior.

840

Page 2: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control

Mordukhovich: Variational Analysis in Nonsmooth Optimization and Discrete Optimal ControlMathematics of Operations Research 32(4), pp. 840–856, © 2007 INFORMS 841

The third class of problems under consideration in this paper concerns dynamic optimization and dealswith discrete optimal control in infinite dimensions. More precisely, we study problems of minimizing Mayer-type functionals on trajectories of discrete-time inclusions, as well as those arising from discrete approxima-tions of optimal control systems governed by constrained differential/evolution inclusions in Banach spaces.Problems of these types can be reduced to mathematical programs in infinite-dimensional spaces with manygeometric constraints and operator constraints of a special Fredholm kind. The results obtained in this direc-tion are heavily based on comprehensive generalized differential and SNC calculi developed in the author’sbook (Mordukhovich [16]). At the same time, we are able to significantly relax SNC requirements in infinite-dimensional optimal control problems for discrete-time systems. In particular, this allows us to cover optimalcontrol problems with no such assumptions imposed on target/endpoint constraint sets, e.g., the two-point con-straint case that has always been an obstacle in infinite-dimensional optimal control.The rest of the paper is organized as follows. In §2 we give a brief overview of the basic constructions of

generalized differentiation and some of their properties widely used in formulations and proofs of the mainresults. Section 3 is devoted to the afore-mentioned sequential normal compactness properties and related issues.In particular, in this section we consider generalized Fredholm mappings and prove that they enjoy, in Banachand Asplund space frameworks, a certain underlying version of “partial sequential normal compactness” neededfor deriving necessary optimality conditions in problems with operator constraints.Section 4 concerns optimization problems with functional and geometric constraints in infinite dimensions,

while in §5 we deal with optimization problems involving operator constraints. Finally, §6 is devoted to the studyof optimal control problems governed by discrete-time and finite-difference inclusions in infinite-dimensionalspaces.Our notation is basically standard; see Mordukhovich [16, 17]. Unless otherwise stated, all the spaces consid-

ered are Banach with the norm �·� and the canonical dual pairing �·� ·� between the space in question, say X,and its topological dual X∗ whose weak∗ topology is denoted by w∗. We use the symbols � and �∗ to signifythe closed unit balls of the space under consideration and its dual, respectively. Given a set-valued mappingF � X ⇒X∗, its sequential Painlevé-Kuratowski upper/outer limit at x is defined by

Lim supx→x

F �x� �= x∗ ∈X∗ � ∃ sequences xk→ x� x∗kw∗−→ x∗ with x∗k ∈ F �xk� as k ∈���

where � �= 1�2� � � � �. The symbols x −→ x and x �−→ x signify, respectively, that x→ x with x ∈ andthat x→ x with ��x�→ ��x� for sets ⊂ X and extended-real-valued functions �� X→ �� �= �−����. Wedistinguish between single-valued and set-valued mappings by using the symbols f � X→ Y and F � X ⇒ Y ,respectively. Recall that

gphF �= �x� y� ∈X× Y � y ∈ F �x��

stands for the graph of the mapping F � X ⇒ Y .

2. Generalized differentiation. We present here a brief overview of the basic generalized differential con-structions in variational analysis and some of their properties widely used in what follows. The material istaken from the author’s book (Mordukhovich [16]), where the reader can find a comprehensive theory for theseconstructions with extensive discussions, references, and commentaries.Given a nonempty set ⊂X and a point x ∈ , the (basic, limiting) normal cone to at x is defined by

N�x� � �= Lim supx→x�↓0

�N��x� �� (2.1)

where �N��x� � stands for the set of �-normals (�≥ 0) to at x ∈X given by

�N��x� � �={x∗ ∈X∗

∣∣∣∣ lim supu −→x

�x∗� u− x��u− x� ≤ �

}� x ∈ � (2.2)

with �N��x� � �=� if x � . If the space X is Asplund (i.e., each of its separable subspaces has a separabledual) and if the set is locally closed around x, then we can equivalently put � = 0 in (2.1) and replace�N� by the generally smaller prenormal (or Fréchet normal� cone �N�x� � �= �N0�x� � from (2.2) with �= 0.Observe that the class of Asplund spaces is sufficiently large including every Banach space with a Fréchetdifferentiable renorm away from the origin (in particular, any reflexive space) and every space with a separable

Page 3: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control

Mordukhovich: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control842 Mathematics of Operations Research 32(4), pp. 840–856, © 2007 INFORMS

dual; see, e.g., the book by Phelps [19] and the references therein for the well-developed theory of Asplundspaces and some of its applications.Given a set-valued mapping F � X ⇒ Y and a point �x� y� ∈ gphF , consider two kinds of limiting coderivatives

of F at �x� y�: the normal coderivative

D∗NF �x� y��y

∗� �= x∗ ∈X∗ � �x∗�−y∗� ∈N��x� y��gphF ��� y∗ ∈ Y ∗� (2.3)

and the mixed coderivative

D∗MF �x� y��y∗� �= Lim sup

�x�y�→�x�y�y∗→y∗�↓0

�D∗�F �x� y��y

∗�� y∗ ∈ Y ∗� (2.4)

where the �-coderivative �D∗�F �x� y� is defined by

�D∗�F �x� y��y

∗� �= x∗ ∈X∗ � �x∗�−y∗� ∈ �N���x� y��gphF ��� y∗ ∈ Y ∗� �≥ 0� (2.5)

where the subindex “�” is omitted if �= 0. Note that we can equivalently replace �D∗� by �D∗ in (2.4) and in the

corresponding representation of D∗N due to (2.1) if both spaces X and Y are Asplund and if the graph of F is

locally closed around �x� y�.Following (2.4) and (2.1), the only difference between the normal and mixed coderivatives is that the norm

convergence of y∗ → y∗ mixed with the weak∗ sequential convergence of x∗ w∗−→ x∗ are used in (2.4) instead ofboth weak∗ sequential convergences y∗ w∗−→ y∗ and x∗ w∗−→ x∗ in the limiting representation of D∗

N . ObviouslyD∗

MF �x� y��y∗�⊂D∗NF �x� y��y

∗�, where the equality holds if dim Y <�, while this inclusion may be strict evenfor single-valued Lipschitzian mappings from � into Hilbert spaces with certain differentiability properties asin Mordukhovich [16, Example 1.35]. In general, the equality

D∗MF �x� y��y∗�=D∗

NF �x� y��y∗�� y∗ ∈ Y ∗�

is postulated in Mordukhovich [16] as the strong coderivative normality of F at �x� y�. This property holdsfor important classes of set-valued and single-valued mappings between infinite-dimensional spaces includingconvex-graph mappings, the so-called “strictly Lipschitzian” mappings (see below), etc., and it is preservedunder various operations; see cases (a)–(i) summarized in Mordukhovich [16, Proposition 4.9].If F = f � X→ Y is single-valued and strictly differentiable at x (which is automatic when f is C1 around

this point), thenD∗

Mf �x��y∗�=D∗N f �x��y

∗�= �f �x�∗y∗�� y∗ ∈ Y ∗� (2.6)

via the adjoint derivative operator �f �x�∗� Y ∗ → X∗ in (2.6). In Mordukhovich [16, 17], the reader can findequivalent analytic representations of both normal and mixed coderivatives and their efficient calculations forvarious classes of nonsmooth single-valued and set-valued mappings.Let �� X→ �� be an extended-real-valued function finite at x. Then

���x� �= Lim supx �−→x�↓0

����x� (2.7)

is the (basic, limiting) subdifferential of � at x, where

����x� �={x∗ ∈X∗

∣∣∣∣ ��u�−��x�−�x∗� u− x��u− x� ≥−�

}(2.8)

is the �-subdifferential of � at x, for each �≥ 0. When �= 0, the set ���x� �= �0��x� in (2.8) is known alsoas the presubdifferential, or the Fréchet (regular, viscosity) subdifferential of � at x. If X is Asplund and if thefunction � is lower semicontinuous (l.s.c.) around x, the sets ����x� can be equivalently replaced by ���x� in(2.7). Furthermore, the subdifferential (2.7) admits the geometric description

���x�= x∗ ∈X∗ � �x∗�−1� ∈N��x���x��� epi���� (2.9)

via the normal cone (2.1) to the epigraph epi� �= �x��� ∈X×� ��≥ ��x�� of �.

Page 4: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control

Mordukhovich: Variational Analysis in Nonsmooth Optimization and Discrete Optimal ControlMathematics of Operations Research 32(4), pp. 840–856, © 2007 INFORMS 843

On the other hand, the geometrically defined coderivatives (2.3) and (2.4) admit, in the case of single-valuedlocally Lipschitzian mappings f � X→ Y , the following representations

D∗Mf �x�= ��y∗� f ��x�� D∗

N f �x��y∗�= ��y∗� f ��x� as y∗ ∈ Y ∗ (2.10)

via the basic subdifferential (2.7) of the scalarized function �y∗� f ��x� �= �y∗� f �x��. The first representation in(2.10) holds in arbitrary Banach spaces X and Y , while the second one requires in addition that X is Asplund(Y is still arbitrary Banach) and that f is strictly Lipschitzian at x in the sense that the sequence

{f �xk+ tkv�− f �xk�

tk

}� k ∈��

contains a norm convergent subsequence whenever xk→ x and v belongs to some neighborhood of the origin;see Mordukhovich [16, §3.1.3] for characterizations, verifiable sufficient conditions, and applications. Note thatthe latter property is in fact equivalent to the “compactly Lipschitzian” property introduced by Thibault [21].

3. Sequential normal compactness and Fredholm mappings. We start this section with recalling certain“sequential normal compactness” properties of sets and mappings that are automatic in finite dimensions whileplaying a crucial role in infinite-dimensional variational analysis and its applications. We refer the reader to theauthor’s books (Mordukhovich [16, 17]) for more details and historical comments and also to the papers byIoffe [10, 11] and the books by Fattorini [6] and by Li and Yong [14] for related (somewhat different from eachother) “finite codimension” properties and their significance in infinite-dimensional optimization and optimalcontrol.A subset ⊂ X of a Banach space is sequentially normally compact (SNC) at x ∈ if for any sequences

�k ↓ 0, xk −→ x, and x∗k ∈ �N�k

�xk� � one has

x∗kw∗−→ 0 =⇒ �x∗k�→ 0 as k→�!

In Asplund spaces, we can equivalently put �k = 0 in the above definition provided that is locally closedaround x. Besides finite dimensions, the SNC property of at x is automatic in any Banach space if the set is “compactly epi-Lipschitzian” (CEL) around x in the sense of Borwein and Strójwas [2]. Note that in generalthe implication CEL⇒SNC is strict even for convex cones in nonseparable Asplund spaces; see Fabian andMordukhovich [5] for a comprehensive study of the relationships between the SNC and CEL properties.A set-valued mapping F � X ⇒ Y between Banach spaces is SNC at �x� y� ∈ gphF if its graph enjoys this

property at �x� y�, which is in fact equivalent to requiring that for any sequences ��k� xk� yk� x∗k� y

∗k� ∈ �+ ×

�gphF �×X∗ × Y ∗ satisfying

�k ↓ 0� �xk� yk�→ �x� y�� x∗k ∈ �D∗�kF �xk� yk��y

∗k�� and �x∗k� y

∗k�

w∗−→ �0�0�� (3.1)

we have ��x∗k� y∗k��→ 0 as k→�, where one can equivalently put �k = 0 if both spaces X and Y are Asplundand if F is closed-graph around �x� y�. A more subtle partial SNC (PSNC) property of F at �x� y� ∈ gphFmeans that

�x∗k�→ 0 provided that �y∗k�→ 0 as k→�for any sequences ��k� xk� yk� x

∗k� y

∗k� ∈�+ × �gphF �×X∗ × Y ∗ satisfying (3.1), with the similar simplification

in the Asplund space setting. The PSNC property is significantly less restrictive than the SNC one and alwaysholds, in particular, for mappings F having Aubin’s Lipschitz-like (“pseudo-Lipschitz”) property around �x� y�,in the sense (Aubin [1]) that there are neighborhoods U of x and V of y and a number $≥ 0 such that

F �x�∩V ⊂ F �u�+ $�x− u�� whenever x�u ∈U! (3.2)

When V = Y , the latter property reduces to the classical (Hausdorff) Lipschitz continuity of F around x.Moreover, the simultaneous fulfillment of the PSNC property of F at �x� y� and the mixed coderivative condition

D∗MF �x� y��0�= 0� (3.3)

is necessary and sufficient for F to be Lipschitz-like around �x� y�; see Mordukhovich [16, Theorem 4.10].We refer the reader to Mordukhovich [16, 17] for other efficient conditions implying the SNC/PSNC properties

for specific classes of set-valued and single-valued mappings and to the well-developed SNC calculus ensuring

Page 5: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control

Mordukhovich: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control844 Mathematics of Operations Research 32(4), pp. 840–856, © 2007 INFORMS

the preservation of such properties under various operations; this seems to be the most important for applications.Note that the proofs of the major rules of SNC calculus in Mordukhovich [16] are based on the extremalprinciple, which can be viewed as a local variational counterpart of the classical convex separation in nonconvexsettings.Next let us describe a general class of the mappings particularly important in applications to infinite-

dimensional optimization and control problems, for which the PSNC property is satisfied. We first recall thefollowing significant modification of the strictly Lipschitzian mappings (see §2) that is due to Ngai et al. [18]:a mapping f � X→ Y locally Lipschitzian around x is called compactly strictly Lipschitzian at x if, for eachsequence xk→ x and hk→ 0 ∈X with hk �= 0, the sequence{

f �xk+hk�− f �xk�

�hk�}� k ∈��

has a norm convergent subsequence.It is obvious that a compactly strictly Lipschitzian mapping is strictly Lipschitzian at the reference point.

Moreover, for dim Y <� the above strict Lipschitzian notions agree and reduce to the standard local Lipschitzcontinuity. However, it is not the case when the space Y is infinite-dimensional being Asplund, in particular.Indeed, consider the mapping f � c0→ c0 given by

f �x� �= sin xk� for x �= xk��

where c0 stands for the Asplund space of sequences xk� with xk→ 0 as k→� endowed with the supremumnorm. It is not hard to check that the mapping f is strictly Lipschitzian but not compactly strictly Lipschitzianat the origin. Observe that f is compactly strictly Lipschitzian at x if it is strictly Fréchet differentiable at x withthe compact derivative operator, or more generally: if f is a composition, f = g �f0, where g is strictly differen-tiable with the compact derivative while f0 is locally Lipschitzian. Furthermore, the class of compactly strictlyLipschitzian mappings contains every mapping f � X→ Y that is uniformly directionally compact around x asdefined by Ioffe [11] in primal space terms: there is a norm compact set Q⊂ Y for which

f �x+ th� ∈ f �x�+ t�h�Q+ t)��x− x�� t��whenever h ∈X with �h� ≤ 1 and x close to x, with )��� t�→ 0 as � ↓ 0 and t ↓ 0.Definition 3.1 (Generalized Fredholm Mappings). A mapping f � X→ Y is generalized Fredholm at x

if there is a g� X→ Y compactly strictly Lipschitzian at x and such that the difference f −g is a linear boundedoperator whose image is a closed subspace of finite codimension in Y .The generalized Fredholm notion extends the “semi-Fredholm” one from Ioffe [11], where g is assumed

to be uniformly directionally compact; see also Ginsburg and Ioffe [7] for the “Fredholm” predecessor of thelatter definition with more discussions on the previous developments and applications to optimal control. Thefollowing major result establishes the partial sequential normal compactness of generalized Fredholm mappingsand their restrictions to closed SNC sets.

Theorem 3.2 (Partial Sequential Compactness of Generalized Fredholm Mappings). Let f � X→ Ybe a generalized Fredholm at x. The following assertions hold:

(i) The inverse set-valued mapping f −1� Y ⇒X is PSNC at �f �x�� x� for arbitrary Banach spaces X and Y .(ii) Assume that both X and Y are Asplund, and let

f �x� �=f �x� if x ∈ �

� if x �

be the restriction of f to a set ⊂X. Then f −1 is PSNC at �f �x�� x� provided that is SNC at x and locallyclosed around this point.

Proof. To prove (i), take sequences �k ↓ 0, xk −→ x, x∗k → 0, and y∗k

w∗−→ 0 such that

x∗k ∈ �D∗�kf �xk��y

∗k� for all k ∈�!

To justify the PSNC property of f −1 at �f �x�� x�, we need to show that �y∗k�→ 0 as k→�.Denote by A �= f −g the linear bounded operator from X to Y whose image Y0 �=AX is a closed subspace of

finite codimension by Definition 3.1. The latter means that there is a closed subspace Y1 ⊂ Y with Y = Y0⊕Y1 and

Page 6: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control

Mordukhovich: Variational Analysis in Nonsmooth Optimization and Discrete Optimal ControlMathematics of Operations Research 32(4), pp. 840–856, © 2007 INFORMS 845

codimY1 <�. Due to the elementary sum rule for �-coderivatives in the setting f =A+ g under consideration,the desired PSNC property reduces to justifying the implication

[�k ↓ 0� �x∗k�→ 0� y∗k

w∗−→ 0] =⇒ �y∗k�→ 0 as k→� (3.4)

provided the fulfillment of the inclusions

x∗k −A∗y∗k ∈ �D∗�kg�xk��y

∗k� for all k ∈�! (3.5)

Since f is compactly strictly Lipschitzian at x, inclusions (3.5) ensure, by the �-coderivative modification of theproof of Proposition 3.4 from Ngai et al. [18], that y∗k

w∗−→ 0 implies the strong convergence

�x∗k −A∗y∗k�→ 0 and hence �A∗y∗k�→ 0 as k→�!

By Y = Y0⊕ Y1 with Y0 =AX, for each k ∈� we find y∗0k ∈ Y0 and y∗1k ∈ Y ∗1 such that

y∗k = y∗0k+ y∗1k and A∗y∗k =A∗y∗0k!

Since dim Y ∗1 <� and since A maps X onto Y0, we get �y∗1k�→ 0 and

�A∗y∗0k� ≥��y∗0k� with some �> 0

by the classical open mapping theorem. Thus �y∗0k� → 0 and �y∗k� → 0 as k→�, which justifies (3.4) andcompletes the proof of assertion (i) of the theorem.To justify assertion (ii) under the Asplund space and SNC assumptions made, we use the simplified description

of the PSNC property of f at x, with �k = 0 in (3.1), and take sequences �xk� x∗k� y

∗k� satisfying

x∗k ∈ �D∗�A+ g+,�·� ���xk��y∗k� for all k ∈N�

where ,�·� � is the indicator mapping of the set relative to the image space Y , which equals 0 ∈ Y of x ∈

and � otherwise. Our aim is to show that for all such �xk� x∗k� y

∗k� the implication (3.4) holds with �k = 0.

Employing the fuzzy sum rule for coderivatives in Asplund spaces as in Mordukhovich [16, Theorem 3.10]and taking into account that

�D∗,�x� �= �N�x� � whenever x ∈ �

we find xk→ x, uk→ x, x∗k → 0, y∗kw∗−→ 0, y∗k

w∗−→ 0, and x∗k ∈ �D∗g�xk��y∗k� such that

x∗k −A∗y∗k − x∗k ∈ �N�uk� � and �y∗k − y∗k�→ 0 as k→�!

As in the proof of assertion (i) above, conclude from x∗k ∈ �D∗g�xk��y∗k� and y∗k

w∗−→ 0 that, by the compact strictLipschitzian property of g at x, we have �x∗k�→ 0 as k→�. Furthermore, the strong convergence

�x∗k −A∗y∗k − x∗k�→ 0 as k→��

follows from the SNC property of at x. We get therefore that �A∗y∗k�→ 0 and conclude, by the open mappingarguments as in the proof of (a), that

�y∗k�→ 0 as k→�!

This implies the strong convergence �y∗k�→ 0 and completes the proof of the theorem. �A “codirectional compactness” counterpart of Theorem 3.2(ii) in general Banach spaces was established by

Ioffe [11, Theorem 6], with a different (significantly more involved) proof, for semi-Fredholm mappings f andCEL sets .

Page 7: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control

Mordukhovich: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control846 Mathematics of Operations Research 32(4), pp. 840–856, © 2007 INFORMS

4. Extended Lagrange principle. In this section we study the following problem of nondifferentiable pro-gramming in infinite-dimensional spaces with finitely many functional constraints of the inequality and equalitytypes, along with general geometric constraints:

minimize �0�x�

subject to �i�x�≤ 0� i= 1� � � � �m�

�i�x�= 0� i=m+ 1� � � � �m+ r�

x ∈ ⊂X�

(4.1)

where �i� X→ � for i = 0� � � � �m+ r . Note that the single geometric constraint x ∈ in (4.1) is consideredjust for convenience and simplicity; the results obtained below can be extended to the case of finitely manygeometric constraints x ∈ j , j = 1� � � � � l, by reducing them to x ∈ �= 1 ∩ · · · ∩ l and employing theintersection formula for basic normals in Asplund spaces used in what follows; cf. Mordukhovich [17, §5.1].Define the classical Lagrangian

L�x�30� � � � � 3m+r � �= 30�0�x�+ · · ·+3m+r�m+r �x� (4.2)

built upon the cost function and the functional (but not geometric) constraints.It has been observed by Vladimir Tikhomirov in the 1960s (see, in particular, his book with Ioffe [12]

and the recent one with Brinkhuis [4]) that necessary optimality conditions for various classes of extremalproblems (arising, e.g., in nonlinear programming, approximation theory, the calculus of variations, optimalcontrol, classical inequalities, mechanics, etc.) can be obtained via the so-called Lagrange principle: by derivingthem as necessary conditions for minimizing the Lagrangian (4.2) involving the cost and functional constraintdata, subject only to the remaining geometric constraints.Of course, the validity of the Lagrange principle must be justified for each class of optimization prob-

lems under consideration. It has been done in the book by Ioffe and Tikhomirov [12] and related publica-tions for some extremal and control problems with smooth, convex, and mixed “smooth-convex” structurestypical in optimal control problems with smooth dynamics. More general nonsmooth developments of theLagrange principle were later provided by many researchers; see particularly Hiriart-Urruty [8], Ioffe [10],Kruger [13], and Mordukhovich [15] for early results in this direction. We refer the reader to the author’sbook (Mordukhovich [17]) for various results, discussions, and commentaries on first-order necessary optimalityconditions on nonsmooth optimization.In this section we present extended results of the Lagrange principle type for the general class of nondifferen-

tiable programming problems (4.1) in infinite dimensions. Let us first establish in the following theorem certaincalculus rules of independent interest that give efficient representations of basic normals (2.1) to the so-calledgeneralized epigraphs

��f �4� � �= �x� y� ∈X× Y � f �x�− y ∈4� x ∈ �� (4.3)

of the mapping f � X→ Y with respect to the “ordering set” 4 ⊂ Y and the constraint set ⊂ X; both thesesets may be arbitrary nonempty subsets of the spaces X and Y , respectively. As usual, f �x� �= f �x�+,�x� �signifies the restriction of f to the set .

Theorem 4.1 (Relationships for Basic Normals to Generalized Epigraphs). Let f � X → Y be amapping between Banach spaces, and let ⊂ X and 4 ⊂ Y be nonempty sets with x ∈ and f �x�− y ∈4.The following assertions hold:

(i) One always has the implication

�x∗� y∗� ∈N��x� y����f � �4�� =⇒ −y∗ ∈N�f �x�− y�4�!

(ii) Assume that X and Y are Asplund, that f is continuous around x relative to , and that and 4 arelocally closed around x and f �x�− y, respectively. Then

N��x� y����f � �4��⊂ �x∗� y∗� ∈X∗ × Y ∗ � x∗ ∈D∗N f �x��y

∗��−y∗ ∈N�f �x�− y�4��� (4.4)

where the equality holds if f is locally Lipschitzian around x relative to and if f is strongly coderivativelynormal at this point.

Page 8: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control

Mordukhovich: Variational Analysis in Nonsmooth Optimization and Discrete Optimal ControlMathematics of Operations Research 32(4), pp. 840–856, © 2007 INFORMS 847

(iii) Assume that f is locally Lipschitzian around x relative to in the Banach space framework. Then

N��x� y����f � �4��⊃ �x∗� y∗� ∈X∗ × Y ∗ � x∗ ∈ ��y∗� f ��x��−y∗ ∈N�f �x�− y�4��� (4.5)

where the equality holds if, in addition, X and Y are Asplund, if and 4 are locally closed around x andf �x�− y, respectively, and if f is strongly coderivatively normal at x.

Proof. Assertion (i) follows directly from Definition (2.1) of basic normals in Banach spaces due to thestructure of the set ��f � �4� in (4.3). The proof of (ii) is more involved. First observe that ��f � �4� admitsthe inverse image representation

��f � �4�= g−1�4� with g�x� y� �= f �x�− y! (4.6)

Thus we can apply Mordukhovich [16, Theorem 3.10] ensuring an upper estimate of N��x� y�� g−1�4�� underthe Asplund and closedness/continuity assumptions imposed in (ii) provided that

ker D∗Mg�x� y�= 0� and g−1 is PSNC at �f �x�− y� x� y�� (4.7)

where D∗Mg�x� y� stands for the reversed mixed coderivative of g at �x� y� defined in this case by the relation

D∗Mg�x� y��z∗� �= �x∗� y∗� ∈X∗ × Y ∗ � z∗ ∈ −D∗

Mg−1�f �x�− y� x� y��−z∗�� (4.8)

via the mixed coderivative (2.4) of the inverse mapping; see Mordukhovich [16] for more details. Observe thatthe upper estimate of Mordukhovich [16, Theorem 3.10] directly implies (4.4) due to the equivalence

�x∗� y∗� ∈D∗N g�x� y��z

∗� ⇐⇒ x∗ ∈D∗N f �x��z

∗�� y∗ =−z∗�which easily follows from the normal coderivative sum rule of Mordukhovich [16, Theorem 1.62].Let us now show that both conditions in (4.7) are automatically satisfied due to the special structure of

g in (4.6). First check the kernel condition in (4.7). To proceed, pick z∗ with 0 ∈ D∗Mg�x� y��z∗� and find

�xk� yk�→ �x� y� and �u∗k� v∗k� ∈ �D∗g�xk� yk��z

∗k� such that

xk ∈ � ��u∗k� v∗k��→ 0� and z∗kw∗−→ z∗ as k→�!

Using the elementary sum rule for the Fréchet coderivative of g in (4.6), we get

�D∗g�xk� yk��z∗k�= � �D∗f �xk��z

∗k��0�+ �0�−z∗k��

which implies the relationships

u∗k ∈ �D∗f �xk��z∗k�� v∗k =−z∗k� and hence �z∗k�→ 0= z∗!

This means that ker D∗Mg�x� y� = 0�. To check the PSNC condition in (4.7) further, we take xk ∈ , yk ∈ Y ,

and �u∗k� v∗k� z

∗k� ∈ �N��xk� yk� f �xk�− yk��gphg� with

�xk� yk� f �xk�− yk�→ �x� y� f �x�− y�� ��u∗k� v∗k��→ 0� and z∗kw∗−→ 0

and, arguing as above, we get �z∗k�→ 0 as k→�. This precisely means, in the Asplund space setting, that g−1is PSNC at �f �x�− y� x� y�, which therefore justifies (4.4).Next we prove inclusion (4.5) in (iii) and then finally establish the equalities in both assertions (ii) and

(iii) under the additional assumptions made. To proceed with (4.5), pick any y∗ ∈ −N�f �x�− y�4� and x∗ ∈��y∗� f ��x�. By Definitions (2.1) and (2.7) of basic normals and subgradients in Banach spaces, find sequences�1k ↓ 0, �2k ↓ 0, xk

−→ x, yk→ y, x∗kw∗−→ x∗, and y∗k

w∗−→ y∗ as k→� such that

x∗k ∈ ��1k�y∗k� f ��xk� and − y∗k ∈ �N�2k�f �xk�− yk�4� for all k ∈�!

It is easy to observe from Definitions (2.2) and (2.8) of the corresponding �-elements and from the assumedLipschitz continuity of f around x with constant $ that

�x∗k� y∗k� ∈ �N�k

��xk� yk����f � �4�� with �k �= �1k+ �$+ 1��2k ↓ 0 as k→�!

Page 9: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control

Mordukhovich: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control848 Mathematics of Operations Research 32(4), pp. 840–856, © 2007 INFORMS

This implies, by passing to the limit as k→�, that �x∗� y∗� ∈N��x� y����f � �4��, which justifies inclusion(4.5) in the general Banach space setting.To establish the opposite inclusion in (4.5), and hence the equality statement in (iii), we invoke the scalariza-

tion formulaD∗

Mf �x��y∗�= ��y∗� f ��x� whenever y∗ ∈ Y ∗ (4.9)

for the mixed coderivative, which is proved in Mordukhovich [16, Theorem 1.90] for the general class of single-valued and locally Lipschitzian mappings on Banach spaces. Strictly speaking, the result of this theorem isformulated for locally Lipschitzian mappings relative to the whole space, i.e., with = X, but its proof holdswith no change in the case of arbitrary closed sets considered in (4.9). Furthermore, we have

D∗N f �x��y

∗�=D∗Mf �x��y

∗�� y∗ ∈ Y ∗�

under the coderivative normality requirement on f at x. Thus the opposite inclusion in (4.5) follows from (4.4)under all the assumptions made in (iii). Moreover, in this case the right-hand sides in (4.5) and (4.4) are thesame, and we simultaneously get the equality statement in (ii) and thus complete the proof of the theorem. �Observe that for our optimization problem (4.1) under consideration, the set

���0� � � � ��m+r � �

�= �x�60� � � � �6m+r � ∈X×�m+r+1 � x ∈ ��i�x�≤ 6i� i= 0� � � � �m��i�x�= 6i� i=m+ 1� � � � �m+ r�

corresponds to (4.3) with an arbitrary constraint set ⊂X and the other data given by f = ��0� � � � ��m+r �� X→�m+r+1 and 4=�m+1

− × 0�⊂�m+r+1. We use the calculus results of Theorem 4.1 for the following necessaryoptimality conditions in (4.1) given in certain extended forms of the Lagrange principle, the proof of whichis based on the extremal principle of variational analysis; see the arguments below. Along with the classicalLagrangian (4.2), consider the extended Lagrangian

L �x�30� � � � � 3m+r � �= 30�0�x�+ · · ·+3m+r�m+r �x�+ 7�x� � (4.10)

also involving geometric constraints via the set indicator function 7�x� � �= 0 for x ∈ and 7�x� � �=�otherwise.

Theorem 4.2 (Lagrange Principle for Nondifferentiable Programming). Let x be a local optimalsolution to problem (4.1), where the space X is Asplund and where the set is locally closed around x. Thefollowing assertions hold:

(i) Assume that the functions �i are lower semicontinuous around x relative to for i= 0� � � � �m and con-tinuous around x relative to for i=m+1� � � � �m+ r . Then there are Lagrange multipliers �30� � � � � 3m+r � ∈�m+r+1, not all zero, such that

�0�−30� � � � �−3m+r � ∈N��x�0�����0� � � � ��m+r � ��� (4.11)

which implies the sign and complementary slackness conditions

3i ≥ 0 for i= 0� � � � �m� 3i�i�x�= 0 for i= 1� � � � �m! (4.12)

If, in addition, the functions �i, i= 0� � � � �m, are also continuous around x relative to , then (4.11) yields thecoderivative inclusion

0 ∈D∗N ���0� � � � ��m+r �+,�·� ���x��30� � � � � 3m+r �! (4.13)

(ii) Assume that all the functions �i, i = 0� � � � �m + r , are Lipschitz continuous around x relative to theset . Then there are Lagrange multipliers �30� � � � � 3m+r � �= 0 satisfying (4.12) and the inclusion

0 ∈ �L �·�30� � � � � 3m+r ��x� (4.14)

in terms of the basic subdifferential of the extended Lagrangian (4.10), which implies

0 ∈ �L�·�30� � � � � 3m+r ��x�+N�x� � (4.15)

via the classical Lagrangian (4.2). Moreover, in this case conditions (4.13) and (4.14) are equivalent, while(4.11) is equivalent to the simultaneous fulfillment of (4.13) and (4.14).

Page 10: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control

Mordukhovich: Variational Analysis in Nonsmooth Optimization and Discrete Optimal ControlMathematics of Operations Research 32(4), pp. 840–856, © 2007 INFORMS 849

Proof. Suppose for simplicity that �0�x�= 0. Since x is a local optimal solution to (4.1), there is a neigh-borhood U of x such that x provides the minimum to �0 over x ∈U subject to the constraints in (4.1). Considerthe sets

1 �=���0� � � � ��m+r � � and 2 �= clU × 0�

in the Asplund space X × �m+r+1 that are locally closed around �x�0� due to the assumptions made on �i

and . Let us check that �x�0� is an extremal point for the set system 1� 2� in the sense of Mordukhovich[16, Definition 2.1], i.e., �x�0� ∈ 1 ∩ 2 (which is obvious) and there is a sequence ak�⊂ X ×�m+r+1 withak→ 0 as k→� such that

� 1− ak�∩ 2 =� for all k ∈�!

Indeed, the latter holds by the local optimality of x in (4.1) for ak �= �0� 9k�0� � � � �0�, where 9k ↑ 0 = �0�x�as k→�. Note that 2 is SNC at �x�0�, since x ∈ ıU and the second set in the product therein is finite-dimensional. Applying now the exact extremal principle from Mordukhovich [16, Theorem 2.22] to the system 1� 2� and taking into account the structure of 2, we directly arrive at (4.11) with �30� � � � � 3m+r � �= 0.The sign and complementary slackness conditions in (4.12) follow from (4.11) by assertion (i) of Theorem 4.1above due to the structure of 1 = ���0� � � � ��m+r � � corresponding to 4 = �m+1

− × 0� ⊂ �m+r+1 in (4.3).Furthermore, the coderivative condition (4.13) follows from inclusion (4.4) in assertion (ii) of Theorem 4.1provided that the functions �i, i=m+ 1� � � � �m+ r , are continuous around x. This proves assertion (i) of thetheorem.To justify (ii), observe that the mapping

f �x� �= ��0�x�� � � � ��m+r �x��+,�x� �

is strongly coderivatively normal at x, since its image space is finite-dimensional. Thus the extended Lagrangianinclusion (4.14) and the equivalence statements in (ii) follow from assertions (ii) and (iii) of Theorem 4.1, respec-tively. Finally, inclusion (4.15) follows from (4.14) by the semi-Lipschitzian sum rule of Mordukhovich [16,Theorem 2.33(c)] applied to the extended Lagrangian (4.10) due to �7�x� �=N�x� �. �The following consequence of the Lagrange principle from Theorem 4.2 provides separated necessary con-

ditions for local minimizers to problem (4.1) that are expressed, in contrast to the “condensed” Lagrangianforms (4.14) and (4.15), in terms of the corresponding subdifferentials of each function �i, i = 0� � � � �m+ r ,separately.

Corollary 4.3 (Separated Necessary Optimality Conditions in Nondifferentiable Programming).Let all the assumptions from assertion (ii) of Theorem 4.2 be fulfilled. Then there are nonnegative multipliers�30� � � � � 3m+r � �= 0 satisfying the complementary slackness conditions in (4.12) and such that

0 ∈m∑i=0

3i��i�x�+m+r∑

i=m+13i���i�x�∪ ��−�i��x��+N�x� �! (4.16)

Proof. Inclusion (4.16) follows from (4.15) by the afore-mentioned sum rule from Mordukhovich [16,Theorem 2.33(c)] due to the relationships

��3���x�= 3���x� for 3≥ 0 and ��3���x�⊂ �3�����x�∪ ��−���x�� for any 3 ∈�!

Thus the results of Corollary 4.3 follow from assertion (ii) of Theorem 4.2. �Note that all the multipliers 3i in Corollary 4.3, including those corresponding to the equality constraints,

are nonnegative which is different from conventional versions of the Lagrange multiplier rule, even for smoothfunctions. On the other hand, the even subgradient set ���x�∪��−���x� does not reduce to the standard gradientwhen � is smooth. Observe also that the even subgradient sets 3i���i�x�∪ ��−�i��x�� with 3i ≥ 0 in (4.16) canbe replaced by the generally larger sets

3i�0�i�x� with 3i ∈� as i=m+ 1� � � � �m+ r

via the symmetric subdifferential of � at x defined by

�0��x� �= ���x�∪ �−��−���x���which is a two-sided extension of the classical gradient to nonsmooth functions.The next consequence of Theorem 4.2 gives a result of the abstract minimum/maximum principle type,

which directly follows from the extended Lagrange principle under the convexity requirement on the geometricconstraint set.

Page 11: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control

Mordukhovich: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control850 Mathematics of Operations Research 32(4), pp. 840–856, © 2007 INFORMS

Corollary 4.4 (Abstract Minimum Principle in Nondifferentiable Programming). Let x be a localminimizer for problem (4.1). Suppose in addition to the assumptions of Theorem 4.2(ii) that the set is convex.Then there are multipliers �30� � � � � 3m+r � �= 0, satisfying the sign and complementary slackness conditions in(4.12), and a basic subgradient of the classical Lagrangian

x∗ ∈ �L�·�30� � � � � 3m+r ��x�

such that the following minimum condition holds:

�x∗� x� =min�x∗� x� � x ∈ �!

Proof. This follows from condition (4.15) of Theorem 4.2(ii), since the basic normal cone (4.1) reduces, inthe case of convex sets, to the classical normal cone of convex analysis

N�x� �= x∗ ∈X∗ � �x∗� x− x� ≤ 0 for all x ∈ �

intrinsically having an extremal structure. �

5. Optimization problems with operator constraints. In this section we study optimization problemsthat, in contrast to the nondifferentiable programming problem (4.1) from §4, have infinitely many equalityconstraints given by mappings/operators with values in infinite-dimensional spaces. Such constraints are knownas operator constraints; they are typical, e.g., in optimal control problems governed by ordinary, or delay, orpartial differential equations. The problem under consideration in this section is described as follows:

minimize�0�x� subject to �i�x�≤ 0� i= 1� � � � �m� f �x�= 0� x ∈ � (5.1)

where �� X→�, ⊂X, and f � X→ Y is a mapping between Banach spaces. This problem is formally moregeneral than (4.1), while the results obtained and the methods employed in this section are different from thosein §4.Given a point x feasible to problem (5.1), define the index set for active inequality constraints at x by

I�x� �= i ∈ 1� � � � �m� � �i�x�= 0�!

Theorem 5.1 (Necessary Optimality Conditions in Problems with Operator Constraints). Let xbe a local minimizer for (5.1), where the space X is Asplund, where the set is locally closed around x,and where the functions �i are locally Lipschitzian around x for i ∈ I�x� ∪ 0� and upper semicontinuous atx for i ∈ 1� � � � �m�\I�x�. Assume that the mapping f � X → Y with values in Asplund spaces is Lipschitzcontinuous around x and the restriction f −1 is PSNC at �f �x�� x�. Then there are Lagrange multipliers 3i ≥ 0for i ∈ I�x�∪ 0� and a linear functional y∗ ∈ Y ∗, not equal to zero simultaneously, such that

0 ∈ �

( ∑i∈I�x�∪0�

3i�i

)�x�+D∗

N f �x��y∗�+N�x� �

⊂ ∑i∈I�x�∪0�

3i��i�x�+D∗N f �x��y

∗�+N�x� �! (5.2)

If, in addition, f is strictly Lipschitzian at x, then

0 ∈ �

( ∑i∈I�x�∪0�

3i�i

)�x�+ ��y∗� f ��x�+N�x� �

⊂ ∑i∈I�x�∪0�

3i��i�x�+ ��y∗� f ��x�+N�x� �! (5.3)

Proof. Suppose first that the mapping f is metrically regular at x relative to in the sense that there is a�> 0 and a neighborhood U of x such that

dist�x�S�≤��f �x�− f �x�� for all x ∈U ∩ �

Page 12: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control

Mordukhovich: Variational Analysis in Nonsmooth Optimization and Discrete Optimal ControlMathematics of Operations Research 32(4), pp. 840–856, © 2007 INFORMS 851

where S �= x ∈ � f �x� = f �x��. Then, by Ioffe’s exact penalization theorem from Ioffe [9] (seeMordukhovich [17, Theorem 5.16]), x is a local optimal solution to the unconstrained minimization problem forall �> 0 sufficiently large:

minimize max{�0�x�−�0�x��max

i∈I�x��i�x�

}+���f �x��+ dist�x� ��! (5.4)

Thus, by the generalized Fermat rule in (5.4), we have

0 ∈ �(max

{�0�·�−�0�x��max

i∈I�x��i�·�

}+���f �·��+ dist�·� ��

)�x�!

To transform the latter “condensed” subdifferential condition into the efficient form of the theorem under theassumptions made, we employ the sum, maximum, and chain rules for basic subgradients of functions in Asplundspaces taken from Mordukhovich [16, §3.2.1] and the representation of basic subgradients of the distancefunction ⋃

3≥0�dist�x� �=N�x� �� x ∈ �

valid in arbitrary Banach spaces; see Mordukhovich [16, Theorem 1.97]. Thus we arrive at (5.2) with

3i ≥ 0 and �3i� �= 0 as i ∈ I�x�∪ 0�!

The inclusions in (5.3) follow from (5.2) due to the scalarization formula for the normal coderivative of strictlyLipschitzian mappings given in (2.10); see Mordukhovich [16, Theorem 3.28].Suppose now that f is not metrically regular at x relative to . One can easily check that in this case the

mappingf �x�= f �x�+,�x� �� x ∈ �

is not metrically regular around x in the conventional sense; see Mordukhovich [16, Definition 1.47]. Thus, bythe characterization of the latter notion from Mordukhovich [16, Theorem 4.18], we have that

either ker D∗Mf �x� �= 0� or f −1 is not PSNC at �f �x�� x��

where the reversed mixed coderivative is defined via the mixed one (2.4) in the way of (4.8). Since the latter isexcluded by the assumption of the theorem, we find y∗ �= 0 satisfying 0 ∈ D∗

Mf �x��y∗� and thus

0 ∈D∗N f �x��y

∗�=D∗N �f +,�·� ���x��y∗�⊂D∗

N f �x��y∗�+N�x� �� (5.5)

the set inclusion in (5.5) is by the coderivative sum rule from Mordukhovich [16, Proposition 3.12], which holdsdue to the local Lipschitz continuity of f around x. By (5.5), we conclude that (5.2) is fulfilled in this case withy∗ �= 0. As above, this implies (5.3) provided that f is strictly Lipschitzian at x and thus completes the proof ofthe theorem. �It is obvious that the PSNC assumption on f in the theorem automatically holds when Y is finite-dimensional.

In this case Theorem 5.1 actually reduces to assertion (iii) of Theorem 4.2. The next corollary, ensuring thelatter property in infinite dimensions, provides efficient conditions for the fulfillment of the optimality results ofTheorem 5.1.

Corollary 5.2 (Optimality Conditions for Problems with Operator Constraints of the General-ized Fredholm Type). Let x be a local solution to problem (5.1). Suppose that, in addition to the assumptionsimposed on the functions �i, i= 0� � � � �m+ r , and on the spaces X and Y in Theorem 5.1, the set is locallyclosed around x and SNC at this point, and the mapping f � X→ Y is generalized Fredholm at x. Then there areLagrange multipliers 3i ≥ 0 for i ∈ I�x�∪ 0� and a linear functional y∗ ∈ Y ∗, not all zero, such that conditions(5.3) are satisfied.

Proof. First observe from Definition 3.1 of generalized Fredholm mappings that f is strictly Lipschitzianat x. Furthermore, Theorem 3.2(ii) ensures that f −1 is PSNC at �f �x�� x�. Thus we arrive at all the requirementsand conclusions of Theorem 5.1 in the case of operator constraints given by strictly Lipschitzian mappings. �

Page 13: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control

Mordukhovich: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control852 Mathematics of Operations Research 32(4), pp. 840–856, © 2007 INFORMS

6. Discrete optimal control. In the concluding section of this paper, we study dynamic optimization prob-lems governed by discrete-time inclusions and also by those arising from discrete approximations of continuous-time problems of optimal control.Let us first consider the general Mayer problem of discrete optimal control with endpoint constraints of

inequality, equality, and geometric types in infinite-dimensional spaces:

minimize �0�x0� xK�

subject to xj+1 ∈ Fj�xj� for j = 0� � � � �K− 1�

�i�x0� xK�≤ 0 for i= 1� � � � �m�

�i�x0� xK�= 0� for i=m+ 1� � � � �m+ r�

�x0� xK� ∈ ⊂X2�

(6.1)

where Fj � X ⇒ X, �i� X2 → �, and K ∈ �. Observe that the discrete inclusion model in (6.1) covers more

conventional discrete control systems of the parameterized type

xj+1 = fj�xj� uj�� uj ∈Uj as j = 0� � � � �K− 1

with explicit control variables uj taking values in some admissible control regions Uj .The next theorem provides necessary optimality conditions of the discrete Euler-Lagrange type for problem

(6.1), where we impose the Lipschitzian assumptions on �i for simplicity.

Theorem 6.1 (Extended Euler-Lagrange Conditions for Discrete Optimal Control). Let xj � j =0� � � � �K� be a local optimal solution to the optimal control problem (6.1). Assume that the space X is Asplund,that the functions �i are locally Lipschitzian around �x0� xK� for all i = 0� � � � �m + r while the set islocally closed around this point, and that the graphs of Fj are locally closed around �xj � xj+1� for every j =0� � � � �K− 1. Suppose also that all but one of the sets and gphFj , j = 0� � � � �K− 1, are SNC at the points�x0� xK� and �xj � xj+1�, respectively. Then there are multipliers �30� � � � � 3m+r � and an adjoint discrete trajectorypj ∈X∗ � j = 0� � � � �K� satisfying the relationships:—the Euler-Lagrange inclusion

−pj ∈D∗NFj�xj � xj+1��−pj+1� for j = 0� � � � �K− 1� (6.2)

—the transversality inclusion

�p0�−pK� ∈ �

(m+r∑i=0

3i�i

)�x0� xK�+N��x0� xK�� �� (6.3)

—the sign and complementary slackness conditions

3i ≥ 0 for i= 0� � � � �m� 3i�i�x0� xK�= 0 for i= 1� � � � �m� (6.4)

—and the nontriviality condition

�30� � � � � 3m+r � p0� � � � � pK� �= 0! (6.5)

If, in addition, the set-valued mappings Fj are Lipschitz-like around �xj � xj+1� and strongly coderivatively normalat these points for all j = 0� � � � �K− 1, then we have the enhanced nontriviality condition

�30� � � � � 3m+r � pK� �= 0! (6.6)

Proof. Introduce the new variable

z �= �x0� � � � � xK� ∈Z �=XK+1

and consider the nondifferentiable programming problem of type (4.1):

minimize ?0�z� �= �0�x0� xK�

subject to ?i�z� �= �i�x0� xK�≤ 0 for i= 1� � � � �m�

?i�z� �= �i�x0� xK�= 0 for i=m+ 1� � � � �m+ r�

z ∈@ �=⋂Kj=0@j ⊂Z�

(6.7)

Page 14: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control

Mordukhovich: Variational Analysis in Nonsmooth Optimization and Discrete Optimal ControlMathematics of Operations Research 32(4), pp. 840–856, © 2007 INFORMS 853

where the latter geometric constraints are defined by

@j �= z ∈Z � �xj� xj+1� ∈ gphFj�� j = 0� � � � �K− 1�

@K �= z ∈Z � �x0� xK� ∈ �!(6.8)

Applying assertion (ii) of Theorem 4.2 to the optimal solution z �= �x0� � � � � xK� for problem (6.7), we findmultipliers �30� � � � � 3m+r � �= 0 satisfying (6.4) and such that

0 ∈ �

(m+r∑i=0

3i?i

)�z�+N

(z�

K⋂j=0

@j

)! (6.9)

By the intersection rule for basic normals from Mordukhovich [16, Corollary 3.37], we have

N

(z�

K⋂j=0

@j

)⊂N�z�@0�+ · · ·+N�z�@K� (6.10)

provided that all but one of the sets @j are SNC at z and that the qualification condition

�z∗0+ · · ·+ z∗K = 0� z∗j ∈N�z�@j�� =⇒ �z∗j = 0� j = 0� � � � �K� (6.11)

is satisfied. It is easy to observe, by the structures of the sets @j , that the above SNC conditions for @j reduceto the SNC requirements on and Fj imposed in the theorem. Furthermore, these structures yield that thegeneralized normals

z∗j = �x∗0j � � � � � x∗Kj� ∈N�zj�@j�� j = 0� � � � �K� (6.12)

admit the representations

x∗jj ∈D∗NFj�xj � xj+1��−x∗�j+1�j � and x∗ij = 0 as i �= j� j + 1 for j = 0� � � � �K− 1�

�x∗0K� x∗KK� ∈N��x0� xK�� � and x∗iK = 0 as i �= 0�K!

(6.13)

It follows from (6.9) and (6.10) that, under the SNC and qualification conditions imposed, there are generalizednormals z∗j , j = 0� � � � �K, from (6.12) such that

−K∑j=0

z∗j ∈ �

(m+r∑i=0

3i?i

)�z�! (6.14)

Taking into account the structures of ?i in (6.7) and representations of z∗j in (6.13), we deduce from (6.14) that

�−x∗00�−x∗K�K−1�� ∈ �

(m+r∑i=0

3i�i

)�x0� xK�+N��x0� xK�� ��

−x∗jj = x∗�j+1�j for j = 0� � � � �K− 1!

(6.15)

Introducing now the adjoint trajectory p �= �p0� � � � � pK� ∈ �X∗�K+1 by

pj �= x∗j�j−1� for j = 1� � � � �K and p0 �=−x∗00� (6.16)

we get from (6.13) and (6.15) both Euler-Lagrange and transversality inclusions of the theorem, where�30� � � � � 3m+r � �= 0 in the nontriviality condition (6.5).Consider next the case when the qualification requirement (6.11) is not fulfilled. This means that there are

�z∗0� � � � � z∗K� �= 0 such that

z∗j ∈N�x�@j� as j = 0� � � � �K with z∗0+ · · ·+ z∗K = 0! (6.17)

Equality (6.17) obviously implies that �z∗0� � � � � z∗K−1� �= 0. Then defining �p0� � � � � pK� by (6.16), we deduce from

(6.13) and (6.17) that there is an adjoint discrete trajectory p= �p0� � � � � pK� �= 0 satisfying all the conclusionsof the theorem with �p0� � � � � pK� �= 0 in the nontriviality condition (6.5).

Page 15: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control

Mordukhovich: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control854 Mathematics of Operations Research 32(4), pp. 840–856, © 2007 INFORMS

Finally, let us show that the relationships (6.2)–(6.4) hold with the enhanced transversality condition (6.6)under the additional assumptions imposed in the theorem. It will follow from the implication

pK =⇒ �pj = 0 for all j = 0� � � � �K− 1�� (6.18)

that this condition is fulfilled along the trajectory p= �p0� � � � � pK� of the Euler-Lagrange inclusion (6.2) pro-vided that the mappings Fj are Lipschitz-like around �xj � xj+1� and strongly coderivatively normal at these pointsfor all j = 0� � � � �K − 1. It is sufficient to justify that pK−1 = 0 whenever pK = 0 under the above assumptionsimposed at the point �xK−1� xK�; the other relationships in (6.18) are proved similarly.To proceed, we get −pK−1 ∈D∗

NFK−1�xK−1� xK��0� from the Euler-Lagrange inclusion (6.2) as j =K− 1 andpK = 0. Since FK−1 is strongly coderivatively normal at �xK−1� xK�,

D∗NFK−1�xK−1� xK��0�=D∗

MFK−1�xK−1� xK��0��

while D∗MFK−1�xK−1� xK��0� = 0� by the coderivative characterization (3.3) of Lipschitz-like mappings. Thus

pK−1 = 0, which completes the proof of the theorem. �As a consequence of Theorem 6.1, we next derive necessary optimality conditions for the discrete optimal

control problem (6.1) in the form of the discrete maximum principle. It is well known that maximum-typeconditions require certain convexity assumptions, which are generally unavoidable even for systems with smoothdynamics; cf. the abstract framework of constrained optimization in Corollary 4.4. The following corollary liststhe precise requirements that ensure the fulfillment of the discrete maximum principle in the constrained Mayerproblem (6.1) for discrete-time evolution inclusions.Recall (Rockafellar and Wets [20]) that a set-valued mapping F � X ⇒ Y between Banach spaces is inner

semicontinuous at x if for every y ∈ F �x� and every sequence xk → x with F �xk� �= �, there is a sequence ofyk ∈ F �xk� converging to y as k→�.Corollary 6.2 (Discrete Maximum Principle for Constrained Evolution Inclusions). In addition

to the assumptions imposed in the first part of Theorem 6.1, let the mappings Fj be inner semicontinuousat �xj � xj+1� and convex-valued around these points for all j = 0� � � � �K − 1. Then there are multipliers�30� � � � � 3m+r � and an adjoint discrete trajectory �p0� � � � � pK� satisfying relationships (6.2)–(6.5) and the max-imum condition

�pj+1� xj+1� = maxy∈F �xj �

�pj+1� y� for all j = 0� � � � �K− 1! (6.19)

Proof. It follows from Mordukhovich [16, Theorem 1.34] that, under the convexity and inner semicontinuityassumptions imposed on Fj , one has

D∗NFj�xj � xj+1��p� �= � =⇒

[�p� xj+1� = min

y∈Fj �xj ��p�y�

]� j = 0� � � � �K− 1!

Thus the discrete maximum condition (6.19) readily follows from the Euler-Lagrange inclusion (6.2) under theadditional assumptions of Corollary 6.2. �Finally, let us consider the following Mayer-type problem for parametric discrete-time evolution inclusions:

minimize �0�x0� xK�

subject to xj+1 ∈ xj +hFj�xj� for j = 0� � � � �K− 1�

�i�x0� xK�≤ 0 for i= 1� � � � �m�

�i�x0� xK�= 0 for i=m+ 1� � � � �m+ r�

�x0� xK� ∈ ⊂X2

(6.20)

with general endpoint constraints, where Fj and �i are as in (6.1), while h> 0 is a parameter. Problems of type(6.20) particularly arise from discrete/finite-difference approximations of constrained optimal control systemsgoverned by differential inclusions:

minimize��x�a�� x�b�� subject to x�t� ∈ F �x�t�� t� a.e. t ∈ �a� b�

under the endpoint constraints; see Mordukhovich [17, Chapter 6] for more details and references.The next result, which is a consequence of Theorem 6.1 and certain rules of generalized differential and

SNC calculi, provides necessary optimality conditions of the extended Euler-Lagrange type for problem (6.20)in Asplund spaces under minimal assumptions on the initial data; see the discussions below.

Page 16: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control

Mordukhovich: Variational Analysis in Nonsmooth Optimization and Discrete Optimal ControlMathematics of Operations Research 32(4), pp. 840–856, © 2007 INFORMS 855

Theorem 6.3 (Extended Euler-Lagrange Conditions for Discrete Approximations). Let xj � j =0� � � � �K� be a local optimal solution to problem (6.20) with the Asplund state space X. Assume that andgphFj are locally closed while �i are locally Lipschitzian around �x0� xK� and �xj � �xj+1− xj �/h�, respectively,for every i = 0� � � � �m+ r and j = 0� � � � �K − 1. Assume also that all but one of the sets and gphFj areSNC at the corresponding points �x0� xK� and �xj � �xj+1− xj �/h�, j = 0� � � � �K−1. Then there exist multipliers�30� � � � � 3m+r � ∈ �m+r+1 and an adjoint discrete trajectory pj ∈ X∗ � j = 0� � � � �K�, not simultaneously zero,satisfying the extended Euler-Lagrange inclusion

pj+1−pj

h∈D∗

NFj

(xj �

xj+1− xj

h

)�−pj+1�� j = 0� � � � �K− 1� (6.21)

accompanied by the transversality inclusion (6.3) together with the sign and complementary slacknessconditions (6.4).

Proof. Problem (6.20) can be considered as a particular case of the general discrete-time Mayer problem(6.1) governed by the discrete inclusions

xj+1 ∈Gj�xj� with Gj�x� �= x+hFj�x�� j = 0� � � � �K− 1� (6.22)

subject to the above endpoint constraints. Applying Theorem 6.1 to this form of (6.20), we need to present theassumptions and necessary optimality conditions therein in terms of the initial data of (6.20). First check that, foreach j = 0� � � � �K−1, the SNC property of Gj at �xj � xj+1� is equivalent to this property of Fj�xj � �xj+1− xj �/h�.In fact, this equivalence follows directly from the SNC calculus result of Mordukhovich [16, Theorem 1.70]applied to the mapping addition in (6.22).Furthermore, applying the Euler-Lagrange inclusion (6.2) to Gj and using the coderivative sum rule from

Mordukhovich [16, Theorem 1.62(ii)] taking into account the obvious relationship D∗N �hF ��·�= hD∗

NF �·� heldfor any mapping F and number h> 0, we readily deduce (6.21) from (6.2) and thus complete the proof of thetheorem. �The results and methods developed in this section can be similarly applied to the so-called Bolza-type problems

for discrete-time inclusions and discrete approximations. In the latter case the cost functional is given by

minimize�0�x0� xK�+hK−1∑j=0

Dj

(xj�

xj+1− xj

h

)� (6.23)

which comes, in particular, from the classical Euler finite-difference replacement of the time-derivative in theBolza functional

minimize�0�x�a�� x�b��+∫ b

aD�x�t�� x��t�� t�dt

for differential inclusions; see Mordukhovich [17, Chapter 6] for more details.In Mordukhovich [17, §6.1.4], we studied the Bolza problem for discrete approximations of differential

inclusions in Asplund spaces with geometric endpoint constraints by using its reduction to infinite-dimensionalproblems of mathematical programming with operator constraints of Fredholm type. Such an approach allowedus to establish extended Euler-Lagrange conditions in the form of Theorem 6.3 but under more restrictiveassumptions requiring that all the sets and gphFj for j = 0� � � � �K−1 are SNC at the corresponding points. Inthis paper we achieve, using another approach, significant improvements of the previous results requiring the SNCproperty of all but one of the above sets. This particularly allows us to fully avoid any SNC/finite codimensiontype assumptions on the constraint/target set providing that all the mappings Fj , j = 0� � � � �K− 1, are SNCat the corresponding points. It covers, e.g., the case of two-point constraints with fixed endpoints x0 and xK ,which has always been troublesome in infinite-dimensional optimal control; cf. the books by Fattorini [6], Liand Yong [14], and Mordukhovich [17].

Acknowledgments. This research was partially supported by the USA National Science Foundation underGrants DMS-0304989 and DMS-0603846 and by the Australian Research Council under Grant DP-0451168.

Page 17: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control

Mordukhovich: Variational Analysis in Nonsmooth Optimization and Discrete Optimal Control856 Mathematics of Operations Research 32(4), pp. 840–856, © 2007 INFORMS

References

[1] Aubin, J.-P. 1984. Lipschitz behavior of solutions to convex minimization problems. Math. Oper. Res. 9 87–111.[2] Borwein, J. M., H. M. Strójwas. 1985. Tangential approximations. Nonlinear Anal. 9 1347–1366.[3] Borwein, J. M., Q. J. Zhu. 2005. Techniques of Variational Analysis. CMS Books in Mathematics, Springer, New York.[4] Brinkhuis, J., V. M. Tikhomirov. 2005. Optimization: Insights and Applications. Princeton University Press, Princeton, NJ.[5] Fabian, M., B. S. Mordukhovich. 2003. Sequential normal compactness versus topological normal compactness in variational analysis.

Nonlinear Anal. 54 1057–1067.[6] Fattorini, H. O. 1999. Infinite Dimensional Optimization and Control Theory. Cambridge University Press. Cambridge, UK.[7] Ginsburg, B., A. D. Ioffe. 1996. The maximum principle in optimal control of systems governed by semilinear equations. B. S. Mor-

dukhovich, H. J. Sussmann, eds. Nonsmooth Analysis and Geometric Methods in Deterministic Optimal Control. Springer, New York,81–110.

[8] Hiriart-Urruty, J.-B. 1979. Refinements of necessary optimality conditions in nondifferentiable programming I. Appl. Math. Optim. 563–82.

[9] Ioffe, A. D. 1979. Necessary and sufficient conditions for a local minimum, I: A reduction theorem and first order conditions. SIAMJ. Control Optim. 17 245–250.

[10] Ioffe, A. D. 1984. Necessary conditions in nonsmooth optimization. Math. Oper. Res. 9 159–188.[11] Ioffe, A. D. 1997. Directional compactness and nonsmooth semi-Fredholm mappings. Nonlinear Anal. 29 201–219.[12] Ioffe, A. D., V. M. Tikhomirov. 1979. Theory of Extremal Problems. North-Holland, Amsterdam, The Netherlands.[13] Kruger, A. Y. 1985. Generalized differentials of nonsmooth functions and necessary conditions for an extremum. Siberian Math. J. 26

370–379.[14] Li, X., J. Yong. 1995. Optimal Control Theory for Infinite-Dimensional Systems. Birkhäuser, Boston, MA.[15] Mordukhovich, B. S. 1985. On necessary conditions for an extremum in nonsmooth optimization. Soviet Math. Dokl. 32 215–220.[16] Mordukhovich, B. S. 2006. Variational Analysis and Generalized Differentiation, I: Basic Theory, Vol. 330. Grundlehren Series

(Fundamental Principles of Mathematical Sciences), Springer, Berlin, Germany.[17] Mordukhovich, B. S. 2006. Variational Analysis and Generalized Differentiation, II: Applications, Vol. 331. Grundlehren Series

(Fundamental Principles of Mathematical Sciences), Springer, Berlin, Germany.[18] Ngai, H. V., D. T. Luc, M. Théra. 2002. Extensions of Fréchet �-subdifferential calculus and applications. J. Math. Anal. Appl. 268

266–290.[19] Phelps, R. R. 1993. Convex Functions, Monotone Operators and Differentiability, 2nd ed. Lecture Notes in Mathematics, Vol. 1364.

Springer, Berlin, Germany.[20] Rockafellar, R. T., R. J.-B. Wets. 1998. Variational Analysis, Vol. 317. Grundlehren Series (Fundamental Principles of Mathematical

Sciences), Springer, Berlin, Germany.[21] Thibault, L. 1980. Subdifferentials of compactly Lipschitzian vector functions. Ann. Mat. Pura Appl. 125 157–192.