Applied Functional Analysis Lecture Notes Fall, 2010 · Applied Functional Analysis Lecture Notes Fall, 2010 Dr. H.T.Banks CenterforResearchinScientiﬁcComputation DepartmentofMathematics

Applied Functional AnalysisLecture NotesFall, 2010

Dr. H. T. BanksCenter for Research in Scientific Computation

Department of MathematicsN. C. State University

August 18, 2010

1 Preface

1.1 Features of These Notes

In these lectures, we shall present functional analysis for partial differentialequations (PDE’s) or distributed parameter systems (DPS) as the basis ofmodern PDE techniques. This is in contrast to classical PDE techniquessuch as separation of variables, Fourier transforms, Sturm-Louiville prob-lems, etc. It is also somewhat different from the emphasis in usual functionalanalysis courses where one learns functional analysis as a subdiscipline inits own right. Here we treat functional analysis as a tool to be used inunderstanding and treating distributed parameter systems. We shall alsomotivate our discussions with numerous application examples from biology,electromagnetics and materials/mechanics.

∙ Sometimes we give proofs, sometimes not!!–Our choice–if proof shedslight or is not readily found in literature, we might give some or all ofarguments behind a result-

∙ examples are quite diverse and details given varying widely -for ex-ample, heat,, transport, beam examples are widely found and welldocumented in literature-don’t always give much in formulation—onother hand, static two-player min-max example not widely found, cel-lular HIV leading to delay systems, population models with aggregatedata, relaxed or chattering controls vs mixing distributions vs Preisachhysteresis and relation to Prohorov metric, some of applications of FAto IP and control not so widely found–

1.2 Functional Analysis for Infinite Dimensional Systems

As we shall see, functional analysis techniques can often provide powerfultools for for treatment of PDEs and FDEs in areas of:

∙ Modeling

∙ Qualitative analysis

∙ Inverse problems

∙ Feedback Control

∙ Engineering analysis

1

∙ Approximation and Computation (such as finite element and spectralmethods)

2

2 Introduction to Functional Analysis

2.1 Goals of this Course

In these lectures, we shall present functional analysis for partial differentialequations (PDE’s) or distributed parameter systems (DPS) as the basis ofmodern PDE techniques. This is in contrast to classical PDE techniquessuch as separation of variables, Fourier transforms, Sturm-Louiville prob-lems, etc. It is also somewhat different from the emphasis in usual functionalanalysis courses where one learns functional analysis as a subdiscipline inits own right. Here we treat functional analysis as a tool to be used inunderstanding and treating distributed parameter systems. We shall alsomotivate our discussions with numerous application examples from biology,electromagnetics and materials/mechanics.

2.2 Uses of Functional Analysis for PDEs

As we shall see, functional analysis techniques can often provide powerfultools for insight into a number of areas including:

∙ Modeling

∙ Qualitative analysis

∙ Inverse problems

∙ Control

∙ Engineering analysis

∙ Computation (such as finite element and spectral methods)

2.3 Example 1: Heat Equation

∂𝑦

∂𝑡(𝑡, 𝜉) =

∂

∂𝜉(𝐷(𝜉)

∂𝑦

∂𝜉(𝑡, 𝜉)) + 𝑓(𝑡, 𝜉) 0 < 𝜉 < 𝑙, 𝑡 > 0 (1)

where 𝑦(𝑡, 𝜉) denotes the temperature in the rod at time 𝑡 and position 𝜉and 𝑓(𝑡, 𝜉) is the input from a source, e.g. heat lamp, laser, etc.B.C.

𝜉 = 0 : 𝑦(𝑡, 0) = 0 Dirichlet B.C. This indicates temperatureis held constant.

3

𝜉 = 𝑙 : 𝐷(𝑙)∂𝑦∂𝜉 (𝑡, 𝑙) = 0 Neumann B.C. This indicates an insulated

end and results in heat flux being zero.

Here 𝑗(𝑡, 𝜉) = 𝐷(𝜉)∂𝑦∂𝜉 (𝑡, 𝜉) is called the heat flux.

I.C.

𝑦(0, 𝜉) = Φ(𝜉) 0 < 𝜉 < 𝑙

Here Φ denotes the initial temperature distribution in the rod.

2.4 Some Preliminary Operator Theory

Let 𝑋,𝑌 be normed linear spaces. Then

ℒ(𝑋,𝑌 ) = {𝑇 : 𝑋 → 𝑌 ∣bounded, linear}

is also a normed linear space, and

∣𝑇 ∣ℒ(𝑋,𝑌 ) = sup∣𝑥∣𝑋=1

∣𝑇𝑥∣𝑌 .

Recall that linear spaces are closed under addition and scalar multiplication.A normed linear space 𝑋 is complete if for any Cauchy sequence {𝑥𝑛} thereexists 𝑥 ∈ 𝑋 such that lim𝑛→∞ 𝑥𝑛 = 𝑥. Note: ℒ(𝑋,𝑌 ) is complete if 𝑌 iscomplete.

If 𝑋 is a Hilbert or Banach space (complex), then 𝑇 is a bounded linearoperator from 𝑋1 → 𝑋2, i.e., 𝑇 ∈ ℒ(𝑋1,𝑋2) ⇐⇒ 𝑇 is continuous.

Differentiation in Normed Linear Spaces [HP]

Suppose 𝑓 : 𝑋 → 𝑌 is a (nonlinear) transformation (mapping).

Definition 1 If lim𝜖→0+𝑓(𝑥0+𝜖𝑧)−𝑓(𝑥0)

𝜖 exists, we say 𝑓 has a directionalderivative at 𝑥0 in direction 𝑧. This is denoted 𝛿𝑓(𝑥0; 𝑧), and is called theGateaux differential at 𝑥0 in direction 𝑧.

If the limit exists for any direction 𝑧, we say 𝑓 is Gateaux differentiable at𝑥0 and 𝑧 → 𝛿𝑓(𝑥0; 𝑧) is Gateaux derivative. Note that 𝑧 → 𝛿𝑓(𝑥0; 𝑧) is notnecessarily linear, however it is homogeneous of degree one (i.e., 𝛿𝑓(𝑥0;𝜆𝑧) =𝜆𝛿𝑓(𝑥0; 𝑧) for scalars 𝜆). Moreover, it need not be continuous in 𝑧.

The following definition of 𝑜(∣𝑧∣) will be useful in defining the Frechetderivative:

4

Definition 2 𝑔(𝑧) is 𝑜(∣𝑧∣) if ∣𝑔(𝑧)∣∣𝑧∣ → 0 as 𝑧 → 0.

Definition 3 If there exists 𝑑𝑓(𝑥0; ⋅) ∈ ℒ(𝑋,𝑌 ) such that ∣𝑓(𝑥0 + 𝑧) −𝑓(𝑥0) − 𝑑𝑓(𝑥0; 𝑧)∣𝑌 = 𝑜(∣𝑧∣𝑋 ), then 𝑑𝑓(𝑥0; ⋅) is the Frechet derivative of 𝑓

at 𝑥0. Equivalently, lim𝜖→0𝑓(𝑥0+𝜖𝑧)−𝑓(𝑥0)

𝜖 = 𝑑𝑓(𝑥0; 𝑧) for every 𝑧 ∈ 𝑋 with𝑧 → 𝑑𝑓(𝑥0; 𝑧) ∈ ℒ(𝑋,𝑌 ). We write 𝑑𝑓(𝑥0; 𝑧) = 𝑓 ′(𝑥0)𝑧.

Results

1. If 𝑓 has a Frechet derivative, it is unique.

2. If 𝑓 has a Frechet derivative at 𝑥0, then it also has a Gateaux derivativeat 𝑥0 and they are the same.

3. If 𝑓 : 𝐷𝑜𝑝𝑒𝑛 ⊂ 𝑋 → 𝑌 has a Frechet derivative at 𝑥0, then 𝑓 iscontinuous at 𝑥0.

Examples

1. 𝑋 = ℝ𝑛, 𝑌 = ℝ𝑚, and 𝑓 : 𝑋 → 𝑌 such that each component of 𝑓 has

partials at 𝑥0 in the form ∂𝑓 𝑖

∂𝑥𝑗 (𝑥0). Then

𝑓 ′(𝑥0) =

(∂𝑓 𝑖

∂𝑥𝑗(𝑥0)

)

and hence

𝑑𝑓(𝑥0; 𝑧) =

(∂𝑓 𝑖

∂𝑥𝑗(𝑥0)

)𝑧.

2. Now we consider the case where 𝑋 = ℝ𝑛 and 𝑌 = ℝ1. Then

𝑑𝑓(𝑥0; 𝑧) = ▽𝑓(𝑥0) ⋅ 𝑧.

3. We also look at the case where 𝑋 = 𝑌 = ℝ1. Here,

𝑑𝑓(𝑥0; 𝑧) = 𝑓 ′(𝑥0)𝑧.

Normally 𝑧 = 1 because that is the only direction in ℝ1. Then 𝑓 ′(𝑥0)is called the derivative, but in actuality 𝑧 → 𝑓 ′(𝑥0)𝑧 is the derivative.Note that 𝑓 ′(𝑥0) ∈ ℒ(ℝ1,ℝ1), but the elements of ℒ(ℝ1,ℝ1) are justnumbers.

5

Homework Exercises

∙ Ex. 1 : Consider 𝑓 : ℝ2 → ℝ1.

𝑓(𝑥1, 𝑥2) =

{𝑥1𝑥2

2

𝑥21+𝑥4

2, if 𝑥 ∕= 0

0, if 𝑥 = 0

Show 𝑓 is Gateaux differentiable at 𝑥 = (𝑥1, 𝑥2) = 0, but not contin-uous at 𝑥 = 0, (and hence cannot be Frechet differentiable).

2.5 Transforming the Initial Boundary Value Problem

If we can transform the initial boundary value problem (IBVP) for equation(1) into something of the form{

��(𝑡) = 𝐴𝑥(𝑡) + 𝐹 (𝑡)𝑥(0) = 𝑥0

(2)

then conceptually it might be an easier problem with which to work. Thatis, formally it looks like an ordinary differential equation problem for which𝑒𝐴𝑡 is a solution operator.

To rigorously make this transformation and develop a corresponding con-ceptual framework, we need to undertake several tasks:

1. Find a space 𝑋 of functions and an operator 𝐴 : 𝒟(𝐴) ⊂ 𝑋 → 𝑋 suchthat the IBVP can be written in the form of equation (2). We mayfind 𝑋 = 𝐿2(0, 𝑙) or 𝐶(0, 𝑙).

We want a solution 𝑥(𝑡) = 𝑦(𝑡, ⋅) to (2), but in what sense - mild,weak, strong, classical? This will answer questions about regularity(i.e., smoothness) of solutions.

2. We also want solution operators or “semigroups” 𝑇 (𝑡) which play therole of 𝑒𝐴𝑡. But what does “𝑒𝐴𝑡” mean in this case? If 𝐴 ∈ 𝑅𝑛 × 𝑅𝑛

is a constant matrix, then

𝑒𝐴𝑡 ≡∞∑𝑛=0

(𝐴𝑡)𝑛

𝑛!= 𝐼 + 𝐴𝑡 +

𝐴2𝑡2

2!+ . . .

where the series has nice convergent properties. In this case, by the“variation of constants” or “variation of parameters” representation,we can then write

𝑥(𝑡) = 𝑒𝐴𝑡𝑥0 +

∫ 𝑡

0𝑒𝐴(𝑡−𝑠)𝐹 (𝑠)𝑑𝑠.

6

We want a similar variation of constants representation in 𝑋 withoperators 𝑇 (𝑡) ∈ ℒ(𝑋) = ℒ(𝑋,𝑋) such that 𝑇 (𝑡) ∼ 𝑒𝐴𝑡, so that

𝑥(𝑡) = 𝑇 (𝑡)𝑥0 +

∫ 𝑡

0𝑇 (𝑡− 𝑠)𝐹 (𝑠)𝑑𝑠.

holds and represents the PDE solutions in some appropriate sense andcan be used for qualitative (stability, asymptotic behavior, control)and quantitative (approximation and numerics) analyses.

7

3 Semigroups and Infinitesimal Generators

3.1 Basic Principles of Semigroups [HP, Pa, Sh, T]

Definition 4 A semigroup is a one parameter set of operators {𝑇 (𝑡) :𝑇 (𝑡) ∈ ℒ(𝑋)}, where 𝑋 is a Banach or Hilbert space, such that 𝑇 (𝑡) satisfies

1. 𝑇 (𝑡 + 𝑠) = 𝑇 (𝑡)𝑇 (𝑠) (semigroup or Markov or translation property)

2. 𝑇 (0) = 𝐼. (identity property)

Classification of semigroups by continuity

∙ 𝑇 (𝑡) is uniformly continuous if lim𝑡→0+

∣𝑇 (𝑡) − 𝐼∣ = 0.

This is not of interest to us, because 𝑇 (𝑡) is uniformly continuous ifand only if 𝑇 (𝑡) = 𝑒𝐴𝑡 where 𝐴 is a bounded linear operator.

∙ 𝑇 (𝑡) is strongly continuous, denoted 𝐶0, if for each 𝑥 ∈ 𝑋, 𝑡 → 𝑇 (𝑡)𝑥is continuous on [0, 𝛿) for some positive 𝛿.

Note 1: All continuity statements are in terms of continuity from theright at zero. For fixed 𝑡

𝑇 (𝑡 + ℎ) − 𝑇 (𝑡) = 𝑇 (𝑡)[𝑇 (ℎ) − 𝑇 (0)]

= 𝑇 (𝑡)[𝑇 (ℎ) − 𝐼]

and𝑇 (𝑡)− 𝑇 (𝑡− 𝜖) = 𝑇 (𝑡− 𝜖)[𝑇 (𝜖) − 𝐼]

so that continuity from the right at zero is equivalent to continuity at any 𝑡for operators that are uniformly bounded on compact intervals.

Note 2: 𝑇 (𝑡) uniformly continuous implies 𝑇 (𝑡) strongly continuous(∣𝑇 (𝑡)𝑥− 𝑥∣ ≤ ∣𝑇 (𝑡)− 𝐼∣ ∣𝑥∣), but not conversely.

3.2 Return to Example 1 : Heat Equation

To begin the process of writing the system of Example 1 in the form ofequation (2), take 𝑋 = 𝐿2(0, 𝑙) and define

𝒟(𝐴) = {𝜑 ∈ 𝐻2(0, 𝑙)∣𝜑(0) = 0, 𝜑′(𝑙) = 0}

8

to be the domain of 𝐴. (Assume 𝐷 is smooth for now, e.g., 𝐷 is at least𝐻1.) Then we can define 𝐴 : 𝒟(𝐴) ⊂ 𝑋 → 𝑋 by

𝐴[𝜑](𝜉) = (𝐷(𝜉)𝜑′(𝜉))′. (3)

Notation: In general, 𝑊 𝑘,𝑝(𝑎, 𝑏) = {𝜑 ∈ 𝐿𝑝∣𝜑′, 𝜑′′, . . . , 𝜑(𝑘) ∈ 𝐿𝑝}.If 𝑝 = 2, we write 𝑊 𝑘,2(𝑎, 𝑏) = 𝐻𝑘(𝑎, 𝑏). 𝐴𝐶(𝑎, 𝑏) = {𝜑 ∈ 𝐿2(𝑎, 𝑏)∣𝜑 isabsolutely continuous (𝐴𝐶) on [𝑎, 𝑏]}.

Homework Exercises

∙ Ex. 2 : Show that 𝐴 : 𝒟(𝐴) ⊂ 𝑋 → 𝑋 of the heat example (Example1) is not bounded 𝑋 → 𝑋.

∙ Ex. 3 : Give an example when 𝑋 is a Hilbert space, but ℒ(𝑋) is nota Hilbert space.

∙ Ex. 4 : Consider 𝐻1(𝑎, 𝑏) = 𝑊 1,2(𝑎, 𝑏),𝑊 1,1(𝑎, 𝑏) and 𝐴𝐶(𝑎, 𝑏). Findthe relationships for each pair of spaces in terms of ⊂. That is, estab-lish 𝑋 ⊂ 𝑌 or 𝑋 ⊊ 𝑌 , etc.

3.3 Example 2 : General Transport Equation

Problems in which the transport equation is used

∙ Insect dispersion

∙ Growth and decline of population-a special case-see Example 4 below

∙ Flow problems (convective/diffusive transport)

Specific applications as described in [BKa, BK] include the “cat brainproblem” (which involves in vivo labeled transport) and population dispersalproblems.

Description of various fluxes

The quantity 𝑦(𝑡, 𝜉) is the population or amount of a substance at location𝜉 in the habitat.

∙ 𝑗1(𝑡, 𝜉) is the flux due to dispersion (random foraging or moments;random molecular collisions).

𝑗1(𝑡, 𝜉) = 𝐷(𝜉)∂𝑦

∂𝜉(𝑡, 𝜉)

9

∙ 𝑗2(𝑡, 𝜉) is the advective/convective/directed bulk movement.

𝑗2(𝑡, 𝜉) = 𝜈(𝜉)𝑦(𝑡, 𝜉)

∙ 𝑗3(𝑡, 𝜉) is the net loss due to “birth/death”; immigration/emigration,label decay.

𝑗3(𝑡, 𝜉) = 𝜇(𝜉)𝑦(𝑡, 𝜉)

General transport equation for a 1-dimensional habitat

∂𝑦

∂𝑡(𝑡, 𝜉) +

∂

∂𝜉(𝜈(𝜉)𝑦(𝑡, 𝜉)) =

∂

∂𝜉(𝐷(𝜉)

∂𝑦

∂𝜉(𝑡, 𝜉)) − 𝜇(𝜉)𝑦(𝑡, 𝜉) (4)

0 < 𝜉 < 𝑙, 𝑡 > 0

B.C.

𝜉 = 0 : 𝑦(𝑡, 0) = 0 (essential)

𝜉 = 𝑙 :[𝐷(𝜉)∂𝑦∂𝜉 (𝑡, 𝜉) − 𝜈(𝜉)𝑦(𝑡, 𝜉)

]𝜉=𝑙

= 0 (natural)

I.C. 𝑦(0, 𝜉) = Φ(𝜉)

To write this example in operator form (2), we first let 𝑋 = 𝐿2(0, 𝑙)be the usual complex Hilbert space of square integrable functions, and thedomain of 𝐴 be defined by 𝒟(𝐴) = {𝜑 ∈ 𝐻2(0, 𝑙)∣𝜑(0) = 0, [𝐷𝜑′ − 𝜈𝜑] (𝑙) =0} where

𝐴𝜑(𝜉) = (𝐷(𝜉)𝜑′(𝜉))′ − (𝜈(𝜉)𝜑(𝜉))′ − 𝜇𝜑(𝜉). (5)

Note: We will assume 𝐷 and 𝜈 are smooth for now.

3.4 Example 3 : Delay Systems–Insect/Insecticide Models

Delay systems have been of interest for the past 70 or more years, arisingin applications ranging from aerospace engineering to biology (biochemicalpathways, etc.), population models, ecology, HIV and other disease pro-gression models to viscoelastic and smart hysteretic materials, as well asnetwork models [BBH, BBJ, BBPP, BKurW1, BKurW2, BRS, Hutch, KP,Ma, Vis, Warga, Wright]. We describe here one arising in insect/insecrticideinvestigations [BBJ].

10

Mathematical models that are suitable for field data with mixed pop-ulations should consider reproductive effects and should also account formultiple generations, containing neonates (juveniles) and adults and theirinterconnectedness. This suggests the need at the minimum for a coupledsystem of equations describing two separate age classes. Additionally, dueto individual differences within the insect population, it is biologically un-realistic to assume that all neonate aphids born on the same day reach theadult age class at the same time. In fact, the age at which the insects reachadulthood varies from as few as five to as many as seven days. Hence onemust include a term in any model to account for this variability, leading oneto develop a coupled delay differential equation model for the insect pop-ulation dynamics. We consider the delay between birth and adulthood forneonate pea aphids and present a first mathematical model that treats thisdelay as a random variable. For a careful derivation of models with similarstructure in HIV progression at the cellular level, see [BBH].

Let 𝑎(𝑡) and 𝑛(𝑡) denote the number of adults and neonates, respectively,in the population at time 𝑡. We lump the mortality due to insecticide intoone parameter 𝑝𝑎 for the adults, 𝑝𝑛 for the neonates, and denote by 𝑑𝑎 and 𝑑𝑛the background or natural mortalities for adults and neonates, respectively.We let 𝑏 be the rate at which neonates are born into the population.

We suppose that there is a time delay for maturation of a neonate toadult life stage. We further assume that this time delay varies across the in-sect population according to a probability distribution 𝑃 (𝜏) for 𝜏 ∈ [−𝑇𝑛, 0]

with corresponding density 𝑘(𝜏) = 𝑑𝑃 (𝜏)𝑑𝜏 . Here we tacitly assume an up-

per bound on 𝑇𝑛 for the maturation period of neonates into adults. Thus,we have that 𝑘(𝜏), 𝜏 < 0, is the probability per unit time that a neonatewho has been in the population −𝜏 time units becomes an adult. Thenthe rate at which such neonates become adults is 𝑛(𝑡 + 𝜏)𝑘(𝜏). Summingover all such 𝜏 ’s, we obtain that the rate at which neonates become adults is∫ 0−𝑇𝑛

𝑛(𝑡+𝜏)𝑘(𝜏)𝑑𝜏 . Using the biological knowledge that the maturation pro-cess varies between five and seven days (i.e., 𝑘 vanishes outside [−7,−5]), weobtain the functional differential equation (FDE) (see [JKH1, JKH2, JKH3]for the widespread interest and use of such systems) system

𝑑𝑎𝑑𝑡 (𝑡) =

∫ −5

−7𝑛(𝑡 + 𝜏)𝑘(𝜏)𝑑𝜏 − (𝑑𝑎 + 𝑝𝑎) 𝑎(𝑡)

𝑑𝑛𝑑𝑡 (𝑡) = 𝑏𝑎(𝑡) − (𝑑𝑛 + 𝑝𝑛)𝑛(𝑡)−

∫ −5

−7𝑛(𝑡 + 𝜏)𝑘(𝜏)𝑑𝜏

𝑎(𝜃) = Φ(𝜃), 𝑛(𝜃) = Ψ(𝜃), 𝜃 ∈ [−7, 0)𝑎(0) = 𝑎0, 𝑛(0) = 𝑛0,

(6)

11

where 𝑘 is now a probability density kernel which we have assumed has theproperty 𝑘(𝜏) ≥ 0 for 𝜏 ∈ [−7,−5] and 𝑘(𝜏) = 0 for 𝜏 ∈ (−∞,−7)∪ (−5, 0].

The system of functional differential equations described in (6) can bewritten in terms of semigroups [BBu1, BBu2, BKap1].

Let𝑧(𝑡) = (𝑎(𝑡), 𝑛(𝑡))𝑇

and𝑧𝑡(𝜏) = 𝑧(𝑡 + 𝜏),−7 ≤ 𝜏 ≤ 0. (7)

We define the Hilbert space 𝑋 ≡ ℝ2 × 𝐿2(−7, 0;ℝ2) with inner product

∣(𝜂, 𝜑)∣𝑋 =

(∣𝜂∣2 +

∫ 0

−7∣𝜑(𝜃)∣2𝑑𝜃

)1/2

, (𝜂, 𝜑) ∈ 𝑋,

and let 𝑥(𝑡) = (𝑧(𝑡), 𝑧𝑡) ∈ 𝑋. Then our system (6) can be written as

𝑑𝑧𝑑𝑡 (𝑡) = 𝐿 (𝑧𝑡) for 0 ≤ 𝑡 ≤ 𝑇,(𝑧(0), 𝑧0) = (Φ(0),Φ) ∈ 𝑋, Φ ∈ 𝒞(−7, 0;ℝ2),

(8)

where 𝑇 < ∞ and for 𝜑 = (𝜓, 𝜁)𝑇 ∈ 𝒞(−7, 0;ℝ2)

𝐿(𝜑) =

[−𝑑𝑎 − 𝑝𝑎 0𝑏 −𝑑𝑛 − 𝑝𝑛

]𝜑(0) +

[0 10 −1

] ∫ −5

−7𝜑(𝜏)𝑘(𝜏)𝑑𝜏. (9)

We now define a linear operator 𝒜 : 𝒟(𝒜) ⊂ 𝑋 → 𝑋 with domain

𝒟(𝒜) ={

(𝜑(0), 𝜑) ∈ 𝑋 ∣𝜑 ∈ 𝐻1(−7, 0;ℝ2)}

(10)

by𝒜(𝜑(0), 𝜑) = (𝐿(𝜑),𝐷𝜑) . (11)

Then the delay system (6) can be formulated as

��(𝑡) = 𝒜𝑥(𝑡)𝑥(0) = 𝑥0,

(12)

where 𝑥0 =((𝑎0, 𝑛0)𝑇 , (Φ,Ψ)𝑇

).

12

3.5 Example 4 : Probability Measure Dependent Systems -Maxwell’s Equations

We consider Maxwell’s equations in a complex, heterogenerous material (see[BBL] for details):

▽× 𝐸 =−∂𝐵

∂𝑡

▽×𝐻 =∂𝐷

∂𝑡+ 𝐽

▽ ⋅𝐷 = 𝜌

▽ ⋅𝐵 = 0

where 𝐸 is the electric field (force), 𝐻 is the magnetic field (force), 𝐷 is theelectric flux density (also called the electric displacement), 𝐵 is the magneticflux density, and 𝜌 is the density of charges in the medium.

To complete this system, we need constituitive (material dependent)relations:

𝐷 = 𝜖0𝐸 + 𝑃

𝐵 = 𝜇0𝐻 + 𝑀

𝐽 = 𝐽𝑐 + 𝐽𝑠

where 𝑃 is electric polarization, 𝑀 is magnetization, 𝐽𝑐 is the conductioncurrent density, 𝐽𝑠 is the source current density, 𝜖0 is the dielectric permit-tivity, and 𝜇0 is the magnetic permeability.

General Polarization:

𝑃 (𝑡, 𝑥) = [𝑔 ∗𝐸](𝑡, 𝑥) =

∫ 𝑡

0𝑔(𝑡− 𝑠, ��)𝐸(𝑠, ��)𝑑𝑠

Here, 𝑔 is the polarization susceptibility kernel, or dielectric response func-tion (DRF).

Several examples of polarization are widely used:

∙ Debye Model for PolarizationThis describes reasonably well a polar material. This is also calleddipolar or orientational polarization. The DRF is defined by

𝑔(𝑡− 𝑠, ��, 𝜏) = 𝑒−(𝑡−𝑠)

𝜏

(𝜖0(𝜖𝑠 − 𝜖∞)

𝜏

)

13

and corresponds to

�� +1

𝜏𝑃 = 𝜖0(𝜖𝑠 − 𝜖∞)𝐸.

It is important to note that 𝑃 represents macroscopic polarization,and when we refer to microscopic polarization we will instead use 𝑝.

∙ Lorentz PolarizationThis is also called electronic polarization or the electronic cloud model.The DRF is given by

𝑔(𝑡− 𝑠, ��, 𝜏) =𝜖0𝜔

2𝑝

𝜈0𝑒

−(𝑡−𝑠)2𝜏 sin(𝜈0(𝑡− 𝑠)),

and corresponds to

𝑃 +1

𝜏�� + 𝜔2

0𝑃 = 𝜖0𝜔2𝑝𝐸

where 𝜔𝑝 = 𝜔0√𝜖𝑠 − 𝜖∞ and 𝜈0 =

√𝜔20 − 1

4𝜏2.

Note: When we allow for instantaneous polarization, we find that 𝐷 =𝜖0𝜖𝑟𝐸 + 𝑃 where 𝜖𝑟 = 1 + 𝜒 ≥ 1 is a relative permittivity.

For complex composite materials, the standard Debye or Lorentz polar-ization model is not adequate, e.g., we need multiple relaxation times 𝜏 ’sin some kind of distribution [BBo, BG1, BG2]. Then the multiple Debye

model becomes

𝑃 (𝑡, 𝑥;𝐹 ) =

∫𝒯𝑝(𝑡, ��; 𝜏)𝑑𝐹 (𝜏)

where 𝒯 is a set of possible relaxation parameters 𝜏 and

𝐹 ∈ ℱ(𝒯 ) = {𝐹 : 𝒯 → ℝ1∣𝐹 is a probability distribution 𝒯 }.

Note: Here the microscopic polarization is given by

𝑝(𝑡, ��; 𝜏) =

∫ 𝑡

0𝑔(𝑡− 𝑠, ��; 𝜏)𝐸(𝑠, 𝑥)𝑑𝑠

where 𝑔(𝑡− 𝑠, ��; 𝜏) = 𝜖0(𝜖𝑠−𝜖∞)𝜏 𝑒

−(𝑡−𝑠)𝜏 which corresponds to

�� +1

𝜏𝑝 = 𝜖0(𝜖𝑠 − 𝜖∞)𝐸.

14

Here, 𝐸 is the total electric field. Thus,

𝑃 (𝑡, ��;𝐹 ) =

∫𝒯

∫ 𝑡

0𝑔(𝑡− 𝑠, ��; 𝜏)𝐸(𝑠, 𝑥)𝑑𝑠𝑑𝐹 (𝜏)

=

∫ 𝑡

0

[∫𝒯𝑔(𝑡− 𝑠, ��; 𝜏)𝑑𝐹 (𝜏)

]𝐸(𝑠, ��)𝑑𝑠

=

∫ 𝑡

0𝐺(𝑡− 𝑠, ��;𝐹 )𝐸(𝑠, 𝑥)𝑑𝑠

Assuming 𝑀 = 0 (nonmagnetic materials), we find this system becomes

▽× 𝐸 = − ∂

∂𝑡(𝜇0𝐻)

▽×𝐻 =∂

∂𝑡

[𝜖0𝜖𝑟𝐸 +

∫ 𝑡

0𝐺(𝑡− 𝑠, ��;𝐹 )𝐸(𝑠, 𝑥)𝑑𝑠

]+ 𝐽

▽ ⋅𝐷 = 0 (13)

▽ ⋅𝐻 = 0

where 𝐹 ∈ ℱ(𝒯 ) and 𝐽 = 𝐽𝑐 + 𝐽𝑠. Note: 𝐽𝑐 is usually also a convolutionon 𝐸 although Ohm’s law uses 𝐽𝑐 = 𝜎𝐸 where 𝜎 is the conductivity of thematerial. In general, one should treat 𝐽𝑐 as a convolution, e.g.,

𝐽𝑐 = 𝜎𝑐 ∗ 𝐸 =

∫ 𝑡

0𝜎𝑐(𝑡− 𝑠, ��)𝐸(𝑠, 𝑥)𝑑𝑠.

For the measure dependent system (13), existence and uniqueness via asemigroup formulation have not been established. Comparison of solutionsvia semigroups versus weak solution have not been done. Nothing has yetbeen done in two or three dimensions. For the one-dimensional case only,existence, uniqueness, and continuous dependence have been established viaa weak formulation (see [BG1]). For continuous dependence of solutions on𝐹 , a metric is needed on ℱ(𝒯 ). More generally, we may need to treat othermaterial parameters 𝑞 = (𝜏, 𝜖𝑠, 𝜖∞, 𝜎), where 𝑞 ∈ 𝑄 ⊂ ℝ4 and we look for𝐹 ∈ ℱ(𝑄).

∙ Special Case

We can consider a physically meaningful special case of the Maxwellsystem in a dielectric material which has a general polarization convolutionrelationship. Detailed derivations given in Section 2.3 of [BBL] lead to a

15

one-dimensional version we next present and will use for an example in oursubsequent discussions. Among the assumptions are some homogeneity inthe medium (in planes parallel to that of an interrogating polarized planarwave) and a polarized sheet antenna source 𝐽𝑠. The resulting model leadsto only nontrivial 𝐸 fields in the 𝑥 direction, and 𝐻 fields in the 𝑦 direction,each depending only on 𝑡 and 𝑧. Assuming Ohm’s law for 𝐽𝑐, we find thesystem reduces to

∂𝐸

∂𝑧= −𝜇0

∂𝐻

∂𝑡∂𝐻

∂𝑧=

∂𝐷

∂𝑡+ 𝜎𝐸 + 𝐽𝑠 (14)

𝐷 = 𝜖𝐸 + 𝑃

or in second order form for 𝐸(𝑡, 𝑧):

𝜇0𝜖∂2𝐸

∂𝑡2+ 𝜇0

∂2𝑃

∂𝑡2+ 𝜇0𝜎

∂𝐸

∂𝑡− ∂2𝐸

∂𝑧2= −𝜇0

∂𝐽𝑠∂𝑡

or

𝜖𝑟∂2𝐸

∂𝑡2+

1

𝜖0

∂2𝑃

∂𝑡2+

𝜎

𝜖0

∂𝐸

∂𝑡− 𝑐2

∂2𝐸

∂𝑧2=−1

𝜖0

∂𝐽𝑠∂𝑡

,

where 𝑐2 = 1𝜇0𝜖0

and 𝜖𝑟 is a relative permittivity that is material and geom-etry dependent. Typical boundary conditions (say on Ω = [0, 1]) are[

1

𝑐

∂𝐸

∂𝑡− ∂𝐸

∂𝑧

]𝑧=0

= 0 (absorbing B.C. at 𝑧 = 0)

and 𝐸∣𝑧=1 = 0 (supraconductive B.C. at 𝑧 = 1).

If we assume a general polarization relationship

𝑃 (𝑡, 𝑧) =

∫ 𝑡

0𝑔(𝑡− 𝑠, 𝑧)𝐸(𝑠, 𝑧)𝑑𝑠

and initial conditions

𝐸(0, 𝑧) = Φ(𝑧),∂𝐸(0, 𝑧)

∂𝑡= Ψ(𝑧),

then it can be argued that the system becomes

∂2𝐸

∂𝑡2+ 𝛾

∂𝐸

∂𝑡+ 𝛽𝐸 +

∫ 0

−∞𝑔(𝑠)𝐸(𝑡 + 𝑠)𝑑𝑠− 𝑐2

∂2𝐸

∂𝑧2= 𝒥 (𝑡),

16

(without loss of generality we may take 𝜖𝑟 = 1 for theoretical conditions)where we have tacitly assumed 𝐸(𝑡, 𝑧) = 0 for 𝑡 < 0. If we approximatethe “memory” term by assuming only a finite past history is significant, weobtain the integro-partial differential equation

∂2𝐸

∂𝑡2+ 𝛾

∂𝐸

∂𝑡+ 𝛽𝐸 +

∫ 0

−𝑟𝑔(𝑠)𝐸(𝑡 + 𝑠)𝑑𝑠 − 𝑐2

∂2𝐸

∂𝑧2= 𝒥 (𝑡). (15)

As with the first two examples, this example can also be written as𝑑𝑥𝑑𝑡 = 𝐴𝑥 + 𝐹 in an appropriate function space setting. In this directionwe define an operator 𝐴 in appropriate state spaces. In this example onemight choose the electric field 𝐸 and the magnetic field (𝐻 in (14)) as“natural” state spaces along with some type of hysteresis state to accountfor the memory in (15). However, in second order (in time) systems, it isalso sometimes natural to choose the state 𝐸 and velocity ∂𝐸

∂𝑡 as states.We do that in this example. We first define an auxiliary variable 𝑤(𝑡) in𝑊 ≡ 𝐿2

𝑔(−𝑟, 0;𝐿2(Ω)) by

𝑤(𝑡)(𝜃) = 𝐸(𝑡) − 𝐸(𝑡 + 𝜃), −𝑟 ≤ 𝜃 ≤ 0.

For the inner product in 𝑊 , we choose the weighted inner product

⟨𝜂,𝑤⟩𝑊 ≡∫ 0

−𝑟𝑔(𝜃)⟨𝜂(𝜃), 𝑤(𝜃)⟩𝐿2(Ω)𝑑𝜃

and then (15) can be written as

∂2𝐸

∂𝑡2+ 𝛾

∂𝐸

∂𝑡+ (𝛽 + 𝑔11)𝐸 −

∫ 0

−𝑟𝑔(𝑠)𝑤(𝑡)(𝑠)𝑑𝑠 − 𝑐2

∂2𝐸

∂𝑧2= 𝒥 (𝑡),

where 𝑔11 ≡∫ 0−𝑟 𝑔(𝑠)𝑑𝑠 and 𝑤(𝑡)(𝑠) = 𝐸(𝑡) − 𝐸(𝑡 + 𝑠), −𝑟 ≤ 𝑠 ≤ 0. For a

semigroup formulation we consider the state space

𝑍 = 𝑉 ×𝐻 ×𝑊 = 𝐻1𝑅(Ω)× 𝐿2(Ω) × 𝐿2

𝑔(−𝑟, 0;𝐻)

(here 𝐻1𝑅(Ω) = {𝜙 ∈ 𝐻1∣𝜙(1) = 0}) with states

(𝜙,𝜓, 𝜂) = (𝐸(𝑡),∂𝐸(𝑡)

∂𝑡, 𝑤(𝑡)) = (𝐸(𝑡),

∂𝐸(𝑡)

∂𝑡,𝐸(𝑡) − 𝐸(𝑡 + ⋅)).

To define what we will later see is an infinitesimal generator of a 𝐶0

semigroup, we first define several component operators. Let 𝐴 ∈ ℒ(𝑉, 𝑉 ★)be defined by

𝐴𝜙 ≡ 𝑐2𝜙′′ − (𝛽 + 𝑔11)𝜙 + 𝑐2𝜙′(0)𝛿0

17

where 𝛿0 is the Dirac operator 𝛿0𝜓 = 𝜓(0). Here 𝑉 ★ is the dual space of𝑉 (e.g. the space of continuous linear functionals on 𝑉 , to be discussed indetail later). We also define 𝐵 ∈ ℒ(𝑉, 𝑉 ★) and �� ∈ ℒ(𝑊,𝐻) by

𝐵𝜙 = −𝛾𝜙− 𝑐𝜙(0)𝛿0

and, for 𝜂 ∈ 𝑊 ,

��𝜂(𝑧) =

∫ 0

−𝑟𝑔(𝑠)𝜂(𝑠, 𝑧)𝑑𝑠, 𝑧 ∈ Ω.

Moreover, we introduce the operator 𝐶 : 𝒟(𝐶) ⊂ 𝑊 → 𝑊 defined on

𝒟(𝐶) = {𝜂 ∈ 𝐻1(−𝑟, 0;𝐿2(Ω)) ∣ 𝜂(0) = 0}by

𝐶𝜂(𝜃) =∂

∂𝜃𝜂(𝜃).

We then define the operator 𝒜 on

𝒟(𝒜) = {(𝜙,𝜓, 𝜂) ∈ 𝑍∣𝜓 ∈ 𝑉, 𝜂 ∈ 𝒟(𝐶), 𝐴𝜙 + 𝐵𝜓 ∈ 𝐻}by

𝒜 =

⎛⎝ 0 𝐼 0

𝐴 𝐵 ��0 𝐼 𝐶

⎞⎠ .

That is,𝒜Φ = (𝜓,𝐴𝜙 + 𝐵𝜓 + ��𝜂, 𝜓 + 𝐶𝜂)

for Φ = (𝜙,𝜓, 𝜂) ∈ 𝒟(𝒜).

3.6 Example 5: Structured Population Models

We consider the special case of “transport” models or Example 2 with𝐷 = 0 but with so-called renewal or recruitment boundary conditions.The “spatial” variable 𝜉 is actually “size” in place of spatial location (see[BT]) and such models have been effectively used to model marine popula-tions such as mosquitofish [BBKW, BF, BFPZ] and, more recently, shrimp[GRD-FP, BDEHAD, GRD-FP2]. Such models have also been the basisof labeled cell proliferation models in which 𝜉 represents label intensity[BSTBRSM]. The early versions of these size structured population mod-els were first proposed by Sinko and Streifer [SS] in 1967. Cell populationversions were proposed by Bell and Anderson [BA] almost simultaneously.

18

Other versions of these models called “age structured models”, where agecan be “physiological age” as well as chronical age, are discussed in [MetzD].

The Sinko-Streifer model (SS) for size-structured mosquitofish popula-tions is given by

∂𝑣

∂𝑡+

∂

∂𝜉(𝑔𝑣) = −𝜇𝑣, 𝜉0 < 𝜉 < 𝜉1, 𝑡 > 0 (16)

𝑣(0, 𝜉) = Φ(𝜉)

𝑔(𝑡, 𝜉0)𝑣(𝑡, 𝜉0) =

∫ 𝜉1

𝜉0

𝐾(𝑡, 𝑠)𝑣(𝑡, 𝑠)𝑑𝑠

𝑔(𝑡, 𝜉1) = 0.

Here 𝑣(𝑡, 𝜉) represents number density (given in numbers per unit length)or population density, where 𝑡 represents time and 𝜉 represents the lengthof the mosquitofish. The growth rate of individual mosquitofish is assumedgiven by 𝑔(𝑡, 𝜉), where

𝑑𝜉

𝑑𝑡= 𝑔(𝑡, 𝜉) (17)

for each individual (all mosquitofish of a given size have the same growthrate).

In the SS 𝜇(𝑡, 𝜉) represents the mortality rate of mosquitofish, and thefunction Φ(𝜉) represents initial size density of the population, while 𝐾 repre-sents the fecundity kernel. The boundary condition at 𝜉 = 𝜉0 is recruitment,or birth rate, while the boundary condition at 𝜉 = 𝜉1 = 𝜉𝑚𝑎𝑥 ensures themaximum size of the mosquitofish is 𝜉1. The SS model cannot be used asformulated above to model the mosquitofish population because it does notpredict dispersion or bifurcation of the population in time under biologi-cally reasonable assumptions [BBKW, BF, BFPZ]. As we shall see, we willsubsequently replace the growth rate 𝑔 by a family 𝒢 of growth rates andreconsider the model with a probability distribution 𝑃 on this family. Thepopulation density is then given by summing “cohorts” of subpopulationswhere individuals belong to the same subpopulation if they have the samegrowth rate [BD, BDTR, GRD-FP, BDEHAD, GRD-FP2]. Thus, in theso-called Growth Rate Distribution (GRD) model, the population density𝑢(𝑡, 𝜉;𝑃 ), first discussed in [BBKW] and developed in [BF], is actually givenby

𝑢(𝑡, 𝜉;𝑃 ) =

∫𝒢𝑣(𝑡, 𝜉; 𝑔)𝑑𝑃 (𝑔), (18)

19

where 𝒢 is a collection of admissible growth rates, 𝑃 is a probability mea-sure on 𝒢, and 𝑣(𝑡, 𝜉; 𝑔) is the solution of the (SS) with growth rate 𝑔. Thismodel assumes the population is made up of collections of subpopulationswith individuals in the same subpopulation having the same size dependentgrowth rate. This example can also be formulated in terms of semigroups[BKa, BKW1, BKW2], but the details are somewhat more difficult thanthose for the first three examples. In some cases it is advantageous to use aweak formulation (to be developed below) instead of a semigroup formula-tion.

3.7 Infinitesimal Generators

Definition 5 Assume {𝑇 (𝑡) ∈ ℒ(𝑋); 0 ≤ 𝑡 < ∞} is a 𝐶0 semigroup in 𝑋or on 𝑋, then the infinitesimal generator 𝐴 of the semigroup 𝑇 (𝑡) is definedby

𝒟(𝐴) = {𝑥 ∈ 𝑋 : lim𝑡→0+

𝑇 (𝑡)𝑥− 𝑥

𝑡exists in X}

and

𝐴𝑥 = lim𝑡→0+

𝑇 (𝑡)𝑥− 𝑥

𝑡

with 𝐴 : 𝒟(𝐴) ⊂ 𝑋 → 𝑋.

Theorem 1 Let 𝑇 (𝑡) be a 𝐶0 semigroup in X. Then there exist constants𝜔 ≥ 0 and 𝑀 ≥ 1 such that

∣𝑇 (𝑡)∣ ≤ 𝑀𝑒𝜔𝑡 𝑡 ≥ 0.

See [Pa]: Theorem 2.2.

Outline of proof: We want to show there exists 𝜂 > 0 and 𝑀 > 0 suchthat ∣𝑇 (𝑡)∣ ≤ 𝑀 on [0, 𝜂]. If not, then there exists 𝑡𝑛 → 0+ such that∣𝑇 (𝑡𝑛)∣ ≥ 𝑛, which implies ∣𝑇 (𝑡𝑛)𝑥∣ is unbounded for some 𝑥 ∈ 𝑋 (by theuniform boundedness principle). This contradicts the fact that 𝑡 → 𝑇 (𝑡)𝑥continuously on [0, 𝛿] for each 𝑥 ∈ 𝑋. Then define 𝜔 ≡ 1

𝜂 𝑙𝑛(𝑀). Then

𝑒𝜔𝑡 = 𝑀 𝑡/𝜂 for any 𝑡 > 0.Let 𝑡 = 𝛿 + 𝑘𝜂 for some integer 𝑘 and 𝛿 ∈ [0, 𝜂]. Then ∣𝑇 (𝑡)∣ =

∣𝑇 (𝛿)𝑇 (𝜂)𝑘 ∣ ≤ 𝑀𝑀𝑘 ≤ 𝑀𝑀 𝑡/𝜂 = 𝑀𝑒𝜔𝑡.

Notation: Write 𝐴 ∈ 𝐺(𝑀,𝜔). For 𝑀 = 1 and 𝜔 = 0, 𝐴 ∈ 𝐺(1, 0) isthe infinitesimal generator of the contraction semigroup: ∣𝑇 (𝑡)𝑥− 𝑇 (𝑡)𝑦∣ ≤∣𝑥− 𝑦∣.

20

Theorem 2 Let 𝑇 (𝑡) be a 𝐶0 semigroup and let A be its infinitesimal gen-erator. Then

1. For 𝑥 ∈ 𝑋,

limℎ→0

1

ℎ

∫ 𝑡+ℎ

𝑡𝑇 (𝑠)𝑥𝑑𝑠 = 𝑇 (𝑡)𝑥.

2. For 𝑥 ∈ 𝑋,∫ 𝑡

0𝑇 (𝑠)𝑥𝑑𝑠 ∈ 𝒟(𝐴) and 𝐴

(∫ 𝑡

0𝑇 (𝑠)𝑥𝑑𝑠

)= 𝑇 (𝑡)𝑥− 𝑥.

3. If 𝑥 ∈ 𝒟(𝐴), then 𝑇 (𝑡)𝑥 ∈ 𝒟(𝐴) and

𝑑

𝑑𝑡𝑇 (𝑡)𝑥 = 𝐴𝑇 (𝑡)𝑥 = 𝑇 (𝑡)𝐴𝑥.

In other words, 𝒟(𝐴) is invariant under 𝑇 (𝑡), and on 𝒟(𝐴) at least,

𝑇 (𝑡)𝑥0 is a solution of

{��(𝑡) = 𝐴𝑥(𝑡)𝑥(0) = 𝑥0.

4. For 𝑥 ∈ 𝒟(𝐴),

𝑇 (𝑡)𝑥− 𝑇 (𝑠)𝑥 =

∫ 𝑡

𝑠𝑇 (𝜏)𝐴𝑥𝑑𝜏 =

∫ 𝑡

𝑠𝐴𝑇 (𝜏)𝑥𝑑𝜏.


Corollary 1 𝐴 ∈ 𝐺(𝑀,𝜔) ⇒ 𝒟(𝐴) is dense in 𝑋 and 𝐴 is a closed linearoperator.

See [Pa]: Corollary 2.5.

Recall: By definition, a linear operator 𝐴 is closed is equivalent to theproperty that 𝐴 has a closed graph in 𝑋 ×𝑋.

That is, 𝐺𝑟(𝐴) = {(𝑥, 𝑦) ∈ 𝑋 × 𝑋∣𝑥 ∈ 𝒟(𝐴), 𝑦 = 𝐴𝑥} is closed or forany (𝑥𝑛, 𝑦𝑛) ∈ 𝐺𝑟(𝐴), if 𝑥𝑛 → 𝑥 and 𝑦𝑛 → 𝑦, then 𝑥 ∈ 𝒟(𝐴) and 𝑦 = 𝐴𝑥.

Question: Does there exist a one-to-one relationship between a semigrouupand its infinitesimal generator?

Theorem 3 Let 𝑇 (𝑡) and 𝑆(𝑡) be 𝐶0 semigroups on 𝑋 with infinitesimalgenerators 𝐴 and 𝐵 respectively. If 𝐴 = 𝐵, then 𝑇 (𝑡) = 𝑆(𝑡), i.e., theinfinitesimal generator uniquely determines the semigroup on all of X.


21

4 Generators

4.1 Introduction to Generation Theorems

How do we tell when 𝐴, derived from a PDE (recall Example 1: the heatequation), is actually a generator of a 𝐶0 semigroup? This is important,because it leads to the idea of well-posedness and continuous dependence ofsolutions for an IBVPDE.

Well-posedness of a PDE is equivalent to saying that a unique solutionexists in some sense and is continuous with respect to data. In other words,{

��(𝑡) = 𝐴𝑥(𝑡) + 𝐹 (𝑡)𝑥(0) = 𝑥0,

is satisfied in some sense and the corresponding semigroup generated solution𝑥(𝑡) = 𝑇 (𝑡)𝑥0 +

∫ 𝑡0 𝑇 (𝑡−𝑠)𝐹 (𝑠)𝑑𝑠, yields the map (𝑥0, 𝐹 ) → 𝑥(⋅;𝑥0, 𝐹 ), that

is then continuous in some sense (depending on the spaces used).Probably the best known generation theorem is the Hille-Yosida theo-

rem. First, however, we review resolvents. In the study of 𝐶0-semigroups,one frequently encounters the operators 𝜆𝐼 − 𝐴, for 𝜆 ∈ ℂ (also denotedby 𝜆 − 𝐴), and their inverse 𝑅𝜆(𝐴) = (𝜆 − 𝐴)−1. Here ℂ is the field ofcomplex scalars. For a linear operator 𝐴 in 𝑋, we denote the resolvent setby 𝜌(𝐴) = {𝜆 ∈ ℂ∣𝜆 − 𝐴 has range ℛ(𝜆 − 𝐴) dense in 𝑋 and 𝜆 − 𝐴 hasa continuous inverse on ℛ(𝜆 − 𝐴)}. We recall that any continuous denselydefined linear operator in 𝑋 can be extended continuously to all of 𝑋. For𝜆 ∈ 𝜌(𝐴), we denote the resolvent operator in ℒ(𝑋) by 𝑅𝜆(𝐴) = (𝜆−𝐴)−1.The spectrum 𝜎(𝐴) of a linear operator 𝐴 is the complement in ℂ of theresolvent set 𝜌(𝐴).

4.2 Hille-Yosida Theorems [HP, Pa, Sh, T]

Theorem 4 (Hille - Yosida) For 𝑀 ≥ 1, 𝜔 ∈ ℝ, we have 𝐴 ∈ 𝐺(𝑀,𝜔) ifand only if

1. A is closed and densely defined (i.e., 𝐴 closed and 𝐷(𝐴) = 𝑋).

2. For real 𝜆 > 𝜔, we have 𝜆 ∈ 𝜌(𝐴) and 𝑅𝜆(𝐴) satisfies

∣𝑅𝜆(𝐴)𝑛∣ ≤ 𝑀

(𝜆− 𝜔)𝑛, 𝑛 = 1, 2, . . . .

22

Note: The equivalent version of Hille-Yosida in [Pa] is stated differently:

Theorem 5 (Hille - Yosida) 𝐴 ∈ 𝐺(1, 0) ⇐⇒1. A is closed and densely defined.

2. 𝑅+ ⊂ 𝜌(𝐴) and for every 𝜆 > 0, ∣𝑅𝜆(𝐴)∣ ≤ 1𝜆 .


Homework Exercises

∙ Ex. 5 :

a) Study the proof of Theorem 3.1 in [Pa] and read Section 1.3carefully.

b) Show that the version of Hille-Yosida in [Pa] is completely equiv-alent to the version stated previously. (Hint: This is not anexercise in proving Hille-Yosida. It is an exercise in comparing𝐴 ∈ 𝐺(𝑀,𝜔) and 𝐴 ∈ 𝐺(1, 0) with regard to exponential ratesand norms.)

4.3 Results from the Hille-Yosida proof

Results of the necessity portion of the proof

Out of the necessity portion of the proof of Hille-Yosida we find the repre-sentation:

𝑅𝜆(𝐴)𝑥 = 𝑅(𝜆,𝐴)𝑥 ≡∫ ∞

0𝑒−𝜆𝑡𝑇 (𝑡)𝑥𝑑𝑡 for 𝑥 ∈ 𝑋, 𝜆 > 𝜔

This says that the Laplace transform of the semigroup is the resolvent op-erator.

We now ask the question: can we recover the semigroup 𝑇 (𝑡) from theresolvent 𝑅𝜆(𝐴) through some kind of inverse Laplace transform? Yes -later!!

23

Homework Exercises

∙ Ex. 6 : Show that the heat equation operator of Example 1 generatesa 𝐶0 semigroup in 𝑋 = 𝐿2(0, 𝑙).

(That is, show that 𝐴𝜑 = (𝐷(𝜉)𝜑′(𝜉))′ on 𝒟(𝐴) = {𝜑 ∈ 𝐻2(0, 𝑙)∣𝜑(0) =0, 𝜑′(𝑙) = 0} is the infinitesimal generator of a 𝐶0 semigroup in 𝑋.)

Note: If you want to, you can take 𝐷(𝜉) = 𝐷 constant for this exercise.

Results of the sufficiency portion of the proof

In the sufficiency proof of Hille-Yosida, we encounter the Yosida approxi-mation:

For 𝜆 > 𝜔,𝐴𝜆 ≡ 𝜆𝐴𝑅𝜆(𝐴) = 𝜆2𝑅𝜆(𝐴) − 𝜆𝐼.

From this definition, we can see that 𝐴𝜆 is bounded and defined on all of𝑋. To see the above relationship, note that

𝜆𝐴𝑅𝜆(𝐴) = 𝜆[(𝐴− 𝜆𝐼)𝑅𝜆(𝐴) + 𝜆𝑅𝜆(𝐴)].

We also found that for 𝐴 satisfying the hypothesis of Hille-Yosida, we have

lim𝜆→∞

𝐴𝜆𝑥 = 𝐴𝑥 for all 𝑥 ∈ 𝒟(𝐴).

This says that for large 𝜆, 𝐴𝜆 acts like 𝐴. Moreover, under this hypothesis,𝐴𝜆 is the infinitesimal generator of the uniformly continuous semigroup ofcontraction operators : 𝑒𝑡𝐴𝜆 . Then we can argue that 𝑇 (𝑡)𝑥 ≡ lim

𝜆→∞𝑒𝑡𝐴𝜆𝑥,

for all 𝑥 ∈ 𝑋, is the desired semigroup that 𝐴 generates. (The proof isconstructive.) See Corollary 3.5 in [Pa].

4.4 Corollaries to Hille-Yosida

Corollary 2 Let 𝐴 be the infinitesimal generator of a 𝐶0 semigroup in 𝑋,and 𝐴𝜆 be the Yosida approximation, then

𝑇 (𝑡)𝑥 = lim𝜆→∞

𝑒𝑡𝐴𝜆𝑥 for all 𝑥 ∈ 𝑋.

Corollary 3 If 𝐴 is the infinitesimal generator of a 𝐶0 semigroup in 𝑋,then we actually have {𝜆∣𝑅𝑒(𝜆) > 𝜔} ⊂ 𝜌(𝐴) and for such 𝜆

∣𝑅𝜆(𝐴)𝑛∣ ≤ 𝑀

(𝑅𝑒𝜆− 𝜔)𝑛.

24

Corollary 4 𝐴 ∈ 𝐺(𝑀,𝜔) implies 𝒟(𝐴) is dense in 𝑋, and moreover, we

find∞∩𝑛=1

𝒟(𝐴𝑛) is also dense in 𝑋.

25

5 Dissipative Operators

Definition 6 Let 𝑋 be a Hilbert space, then a linear operator 𝐴 : 𝒟(𝐴) ⊂𝑋 → 𝑋 is dissipative if

Re ⟨𝐴𝑥, 𝑥⟩ ≤ 0 for all 𝑥 ∈ 𝒟(𝐴).

Theorem 6 A linear operator 𝐴 is dissipative if and only if ∣(𝜆𝐼 −𝐴)𝑥∣ ≥𝜆∣𝑥∣ for all 𝑥 ∈ 𝒟(𝐴) and 𝜆 > 0.

This result yields that for dissipative operators we have 𝜆𝐼 − 𝐴 is con-tinuously invertible on ℛ(𝜆−𝐴) when 𝜆 > 0. Thus, 𝜆 > 0 implies 𝜆 ∈ 𝜌(𝐴)whenever ℛ(𝜆−𝐴) is dense in 𝑋. Note: Proof will come later.

Theorem 7 (Lumer-Phillips) Suppose 𝐴 is a linear operator in a Hilbertspace 𝑋.

1. If 𝐴 is densely defined, 𝐴 − 𝜔𝐼 is dissipative for some real 𝜔, andℛ(𝜆0 −𝐴) = 𝑋 for some 𝜆0 with Re (𝜆0) > 𝜔, then 𝐴 ∈ 𝐺(1, 𝜔).

2. If 𝐴 ∈ 𝐺(1, 𝜔), then 𝐴 is densely defined, 𝐴 − 𝜔𝐼 is dissipative andℛ(𝜆0 −𝐴) = 𝑋 for all 𝜆0 with Re (𝜆0) > 𝜔.

Proof of Theorem 6

𝐴 is dissipative means Re ⟨𝐴𝑥, 𝑥⟩ ≤ 0 for all 𝑥 ∈ 𝒟(𝐴). This impliesRe⟨−𝐴𝑥, 𝑥⟩ ≥ 0. So we have:

∣(𝜆−𝐴)𝑥∣∣𝑥∣ ≥ ∣⟨(𝜆 −𝐴)𝑥, 𝑥⟩∣≥ Re⟨(𝜆−𝐴)𝑥, 𝑥⟩≥ 𝜆⟨𝑥, 𝑥⟩= 𝜆∣𝑥∣2.

Conversely, suppose ∣(𝜆𝐼 −𝐴)𝑥∣ ≥ 𝜆∣𝑥∣ for all 𝑥 ∈ 𝒟(𝐴) and 𝜆 > 0.Let 𝑥 ∈ 𝒟(𝐴).

Define 𝑦𝜆 = (𝜆−𝐴)𝑥 and 𝑧𝜆 = 𝑦𝜆∣𝑦𝜆∣ . Then we have ∣𝑧𝜆∣ = 1 and

𝜆∣𝑥∣ ≤ ∣(𝜆−𝐴)𝑥∣= ⟨(𝜆−𝐴)𝑥, 𝑧𝜆⟩= 𝜆Re⟨𝑥, 𝑧𝜆⟩ − Re⟨𝐴𝑥, 𝑧𝜆⟩ (19)

≤ 𝜆∣𝑥∣∣𝑧𝜆∣ − Re⟨𝐴𝑥, 𝑧𝜆⟩= 𝜆∣𝑥∣ − Re⟨𝐴𝑥, 𝑧𝜆⟩.

26

This implies

Re⟨𝐴𝑥, 𝑧𝜆⟩ ≤ 0. (20)

We always have the relationship

−Re⟨𝐴𝑥, 𝑧𝜆⟩ ≤ ∣𝐴𝑥∣. (21)

Combining equations (19) and (21), we obtain

𝜆∣𝑥∣ ≤ 𝜆Re⟨𝑥, 𝑧𝜆⟩ − Re⟨𝐴𝑥, 𝑧𝜆⟩≤ 𝜆Re⟨𝑥, 𝑧𝜆⟩ + ∣𝐴𝑥∣. (22)

Rearranging equation (22), we have

𝜆Re⟨𝑥, 𝑧𝜆⟩ ≥ 𝜆∣𝑥∣ − ∣𝐴𝑥∣or

Re⟨𝑥, 𝑧𝜆⟩ ≥ ∣𝑥∣ − 1

𝜆∣𝐴𝑥∣. (23)

Observe that we have ∣𝑧𝜆∣ = 1 for 𝜆 > 0. This implies that as 𝜆𝑘 →∞, 𝑧𝜆𝑘

⇀ 𝑧 with ∣𝑧∣ ≤ 1. (Note: The norm is weakly lower semi-continuous;therefore, 𝑧𝜆𝑘

⇀ 𝑧 implies ∣𝑧∣ = ∣w lim𝑘

𝑧𝜆𝑘∣ ≤ lim∣𝑧𝜆𝑘

∣ = 1. This is discussed

in detail in Section 5.2.)Taking the limit of equation (50), we obtain

Re⟨𝐴𝑥, 𝑧⟩ ≤ 0. (24)

Similarly, taking the limit of equation (23), we have

Re⟨𝑥, 𝑧⟩ ≥ ∣𝑥∣. (25)

Thus because ∣𝑧∣ ≤ 1 we can make the arguments

∣𝑥∣ ≤ Re⟨𝑥, 𝑧⟩≤ ∣⟨𝑥, 𝑧⟩∣≤ ∣𝑥∣∣𝑧∣≤ ∣𝑥∣.

Therefore, we actually have the equality

Re⟨𝑥, 𝑧⟩ = ∣⟨𝑥, 𝑧⟩∣ = ⟨𝑥, 𝑧⟩ = ∣𝑥∣. (26)

27

Define �� = 𝑧∣𝑥∣; then ⟨𝑥, 𝑥⟩ = ∣𝑥∣2 by (26) as ⟨𝑥, 𝑧⟩ = ∣𝑥∣ implies ⟨𝑥, 𝑧∣𝑥∣⟩ =∣𝑥∣⟨𝑥, 𝑧⟩ = ∣𝑥∣2. From equation (24),

Re⟨𝐴𝑥, 𝑧⟩ ≤ 0implies Re⟨𝐴𝑥, 𝑧∣𝑥∣⟩ ≤ 0or Re⟨𝐴𝑥, 𝑥⟩ ≤ 0.

We also have ∣��∣ ≤ ∣𝑥∣ because ∣��∣ = ∣𝑧∣∣𝑥∣ ≤ ∣𝑥∣.

We claim that �� = 𝑥. If this is true, then (27) would imply Re⟨𝐴𝑥, 𝑥⟩ ≤0 for all 𝑥 ∈ 𝒟(𝐴) and we are finished. We need the following lemma toprove this.

Lemma 1 (Duality Set Lemma) Let 𝑥 ∈ 𝑋 be arbitrary in a Hilbert space𝑋 and �� ∈ 𝑋 such that ⟨��, 𝑥⟩ = ∣𝑥∣2 and ∣��∣ ≤ ∣𝑥∣; then �� = 𝑥.

Proof0 ≤ ⟨𝑥− ��, 𝑥− ��⟩

= ∣𝑥∣2 − ⟨��, 𝑥⟩ − ⟨��, 𝑥⟩ + ∣��∣2≤ ∣𝑥∣2 − ∣𝑥∣2 − ∣𝑥∣2 + ∣𝑥∣2= 0

Therefore, ∣𝑥− ��∣ = 0 implies 𝑥 = ��; i.e., in a Hilbert space, 𝐹 (𝑥) ≡ dualityset = {𝑥}. For 𝑥 ∈ 𝑋,𝐹 (𝑥) = {𝑥∗ ∈ 𝑋∣⟨𝑥, 𝑥∗⟩ = ∣𝑥∣2 = ∣𝑥∗∣2}.

As we shall see below, for a general Banach space 𝑋, the duality set isa subset of the dual space 𝑋∗. Because Hilbert spaces are self-dual, in aHilbert space 𝑋 the concept reduces to one involving subsets of the Hilbertspace 𝑋 itself.

Definition 7 Let 𝑋 be a Hilbert space with 𝐴 : 𝒟(𝐴) ⊂ 𝑋 → 𝑋 with𝒟(𝐴) dense in 𝑋. . Then the adjoint of 𝐴, denoted 𝐴∗, is defined by

𝒟(𝐴∗) = {𝑦 ∈ 𝑋∣ there exists 𝑤 ∈ 𝑋 satisfying ⟨𝐴𝑥, 𝑦⟩ = ⟨𝑥,𝑤⟩ for all 𝑥 ∈ 𝒟(𝐴)}and

𝐴∗𝑦 = 𝑤.

Corollary 5 (to Lumer-Phillips) Let 𝑋 be a Hilbert space and 𝐴 : 𝒟(𝐴) ⊂𝑋 → 𝑋 with 𝒟(𝐴) dense and 𝐴 closed in 𝑋. If both 𝐴 and 𝐴∗ are dissipa-tive, then 𝐴 is an infinitesimal generator of a 𝐶0 semigroup of contractionson 𝑋.

28

Proof By Lumer-Phillips, it suffices to argue that ℛ(𝜆 − 𝐴) = 𝑋 for some𝜆 > 0.Take 𝜆 = 1:

Recall that in a Hilbert space we have

⟨𝐴𝑥, 𝑦⟩ = ⟨𝑥,𝐴∗𝑦⟩ for 𝑥 ∈ 𝒟(𝐴), 𝑦 ∈ 𝒟(𝐴∗).

But by assumption 𝐴 is dissipative and closed, which means

{(𝑥,𝐴𝑥)∣𝑥 ∈ 𝒟(𝐴)} is closed in 𝑋 ×𝑋.

Therefore, ℛ(𝐼 −𝐴) is a closed subspace of 𝑋.

Claim: ℛ(𝐼 −𝐴) = 𝑋.Suppose ℛ(𝐼−𝐴) ∕= 𝑋. Then by the Hahn-Banach theorem in a Hilbert

space, there exists �� ∈ 𝑋,𝑥 ∕= 0, such that

⟨��, (𝐼 −𝐴)𝑥⟩ = 0.

In other words

⟨��, 𝑥⟩ = ⟨��, 𝐴𝑥⟩ for all 𝑥 ∈ 𝒟(𝐴).

Therefore

�� ∈ 𝒟(𝐴∗) and ⟨𝐴∗��, 𝑥⟩ = ⟨��, 𝑥⟩,or

⟨��−𝐴∗��, 𝑥⟩ = 0 for all 𝑥 ∈ 𝒟(𝐴).

However, 𝒟(𝐴) is dense, which implies

��−𝐴∗�� = 0 or �� = 𝐴∗��. (27)

By assumption, 𝐴∗ is also dissipative. Therefore

∣(𝐼 −𝐴∗)��∣ ≥ ∣��∣ for all �� ∈ 𝒟(𝐴∗). (28)

Combining (27) and (28) we have

∣��∣ ≤ ∣��−𝐴∗��∣∣��∣ ≤ ∣��− ��∣∣��∣ ≤ 0.

This is a contradiction since �� ∕= 0. Therefore, ℛ(𝐼 −𝐴) = 𝑋.

29

Theorem 8 In a Hilbert space 𝑋, if 𝐴 is dissipative and ℛ(𝜆 − 𝐴) = 𝑋for some 𝜆 > 0, then 𝐴 is densely defined.

See [Pa]: Theorem 4.6.Note: This also holds in a reflective Banach space (more on this later).

Important practical results:

1. In a Hilbert space, we only need 𝐴 to be dissipative and ℛ(𝜆𝐼−𝐴) = 𝑋for some 𝜆 > 0 in order to use Lumer-Phillips theorem.

2. In a real Hilbert space (such as real 𝐿2(0, 𝑙)), we can get by with⟨𝐴𝑥, 𝑥⟩ ≤ 0.

Lemma 2 (A Practical Lemma) Let 𝑋 be a complex Hilbert space and sup-pose that for 𝑢 ∈ 𝒟(𝐴), 𝐴𝑢 is real whenever 𝑢 is real (for example, anoperator 𝐴 coming from a PDE defined by real coefficients), then 𝐴 is dis-sipative, i.e.,

Re⟨𝐴𝑥, 𝑥⟩ ≤ 0 for all 𝑥 ∈ 𝒟(𝐴)

if and only if

⟨𝐴𝑢, 𝑢⟩ ≤ 0 for all real valued 𝑢 ∈ 𝒟(𝐴).

Proof:We have for 𝑥 = 𝑢 + 𝑖𝑣, with 𝑢, 𝑣, real:

⟨𝐴𝑥, 𝑥⟩ = ⟨𝐴(𝑢 + 𝑖𝑣), 𝑢 + 𝑖𝑣⟩= ⟨𝐴𝑢, 𝑢⟩ + ⟨𝐴𝑣, 𝑣⟩ + 𝑖[⟨𝐴𝑣, 𝑢⟩ − ⟨𝐴𝑢, 𝑣⟩].

We want to argue that ⟨𝐴𝑣, 𝑢⟩ − ⟨𝐴𝑢, 𝑣⟩ is real; however, for any complexHilbert space we have the polarization identity [GP, p. 168]

⟨𝑢, 𝑣⟩ =1

4{∣𝑢 + 𝑣∣2 + 𝑖∣𝑢 + 𝑖𝑣∣2 − 𝑖∣𝑢− 𝑖𝑣∣2 − ∣𝑢− 𝑣∣2}.

But if 𝑢 and 𝑣 are real-valued we have ∣𝑢 + 𝑖𝑣∣ = ∣𝑢− 𝑖𝑣∣ and hence

⟨𝑢, 𝑣⟩ =1

4{∣𝑢 + 𝑣∣2 − ∣𝑢− 𝑣∣2}

(note that this is the polarization identity in a real Hilbert space) which isreal valued. Since 𝑢 and 𝑣 are real, then by assumption 𝐴𝑢 and 𝐴𝑣 are real.Therefore, ⟨𝐴𝑣, 𝑢⟩ − ⟨𝐴𝑢, 𝑣⟩ is real-valued and hence

30

Re ⟨𝐴𝑥, 𝑥⟩ = ⟨𝐴𝑢, 𝑢⟩ + ⟨𝐴𝑣, 𝑣⟩.Thus to have Re ⟨𝐴𝑥, 𝑥⟩ ≤ 0, it suffices to argue

⟨𝐴𝑢, 𝑢⟩ ≤ 0 for all real valued 𝑢 in 𝒟(𝐴).

Remark: Any Banach space is a Hilbert space if and only if the paral-lelogram law

∣𝑥 + 𝑦∣2 + ∣𝑥− 𝑦∣2 = 2∣𝑥∣2 + 2∣𝑦∣2

holds. If this holds in a Banach space, the polarization identity can be usedto define a norm compatible inner product.

To illustrate the use of the Lumer-Phillips theorem, we return to someof the previous examples.

Examples using Lumer-Phillips Theorem

Return to Example 1: The Heat Equation

We return to the heat diffusion example where we found the PDE

∂𝑦

∂𝑡(𝑡, 𝜉) =

∂

∂𝜉

(𝐷(𝜉)

∂𝑦

∂𝜉

)+ 𝑓(𝑡, 𝜉)

𝑦(𝑡, 0) = 0, 𝐷(𝑙)∂𝑦

∂𝜉(𝑡, 𝑙) = 0

gave rise to an operator 𝐴 in 𝑋 = 𝐿2(0, 𝑙) given by

𝒟(𝐴) = {𝜑 ∈ 𝐻2(0, 𝑙)∣(𝐷𝜑′)′ ∈ 𝐿2(0, 𝑙), 𝜑(0) = 0, 𝜑′(𝑙) = 0},

and𝐴𝜑 = (𝐷𝜑′)′.

Remark: Notice that if 𝐷 is smooth, we have

𝒟(𝐴) = {𝜑 ∈ 𝐻2(0, 𝑙)∣𝜑(0) = 0, 𝜑′(𝑙) = 0}.

In view of the remarks following the Lumer-Phillips theorem, to showthat 𝐴 generates a 𝐶0 semigroup in 𝑋, it suffices to show that 𝐴 is dissipativeand that ℛ(𝜆−𝐴) = 𝑋; i.e. (𝜆−𝐴)𝒟(𝐴) = 𝑋 for some 𝜆 > 0.

31

We have for 𝜑 ∈ 𝒟(𝐴), 𝜑 real-valued:

⟨𝐴𝜑,𝜑⟩ = ⟨(𝐷𝜑′)′, 𝜑⟩=

∫ 𝑙0(𝐷𝜑′)′𝜑

= −⟨𝐷𝜑′, 𝜑′⟩+ 𝐷𝜑′𝜑∣𝑙0= − ∫ 1

0 𝐷(𝜑′)2

≤ 0

for 𝐷 ≥ 0 (a natural assumption since 𝐷 is the coefficient of thermal diffu-sion).

To argue that ℛ(𝜆 − 𝐴) = 𝐿2(0, 𝑙) for some 𝜆 > 0, we note that this isequivalent to arguing that for some 𝜆 > 0, the boundary value problem

𝜆𝜑− (𝐷𝜑′)′ = 𝜓, 0 < 𝑥 < 𝑙,

𝜑(0) = 0, 𝜑′(𝑙) = 0,

has a solution 𝜑 ∈ 𝐻2(0, 𝑙) for any 𝜓 ∈ 𝐿2(0, 𝑙). But one can answer thislatter question in the affirmative (assuming 𝐷 > 0 is sufficiently regular,e.g., 𝐷 ∈ 𝑊 1∞) using classical Sturm-Liouville theory. (See [BR, CH, D, S].)Thus, we readily see that 𝐴 is a densely-defined generator of a 𝐶0 solutionsemigroup for the system above.

Homework Exercises

∙ Ex. 7 : Consider the general transport example, Example 2, and showthat it generates a 𝐶0 semigroup.

Return to Example 2 : The General Transport Equation

We next reconsider the population model given by

∂𝑦

∂𝑡(𝑡, 𝜉) +

∂

∂𝜉(𝜈(𝜉)𝑦(𝑡, 𝜉)) =

∂

∂𝜉

(𝐷(𝜉)

∂𝑦

∂𝜉(𝑡, 𝜉)

)− 𝜇(𝜉)𝑦(𝑡, 𝜉)

𝑦(𝑡, 0) = 0,

[𝐷(𝜉)

∂𝑦

∂𝜉(𝑡, 𝜉) − 𝜈(𝜉)𝑦(𝑡, 𝜉)

]𝜉=𝑙

= 0.

The corresponding operator equation in 𝑋 = 𝐿2(0, 𝑙) is in terms of theoperator 𝐴 given by

𝒟(𝐴) =

{𝜑 ∈ 𝐻2(0, 𝑙) ∣(𝐷𝜑′)′ − (𝜈𝜑)′ ∈ 𝐿2(0, 𝑙), 𝜑(0) = 0,[𝐷𝜑′ − 𝜈𝜑

](𝑙) = 0}

32

𝐴𝜑 = (𝐷𝜑′)′ − (𝜈𝜑)′ − 𝜇𝜑.

Again we show that 𝐴 is dissipative in 𝐿2(0, 𝑙) (actually we shall argue thata translation of 𝐴 is dissipative). For real valued 𝜑 ∈ 𝒟(𝐴), we have

⟨𝐴𝜑,𝜑⟩ = ⟨(𝐷𝜑′ − 𝜈𝜑)′, 𝜑⟩ − ⟨𝜇𝜑,𝜑⟩,

= −⟨(𝐷𝜑′ − 𝜈𝜑), 𝜑′⟩+ (𝐷𝜑′ − 𝜈𝜑)𝜑∣𝑙0 − ⟨𝜇𝜑,𝜑⟩,

= −⟨𝐷𝜑′, 𝜑′⟩ − ⟨𝜇𝜑,𝜑⟩ + ⟨𝜈𝜑, 𝜑′⟩.

Hence, using 𝑎𝑏 = 1√2𝜖𝑎√

2𝜖𝑏 ≤ 14𝜖𝑎

2 + 𝜖𝑏2, we find

⟨𝐴𝜑,𝜑⟩ ≤ −⟨𝐷𝜑′, 𝜑′⟩ − ⟨𝜇𝜑,𝜑⟩ + 14𝜖 ∣𝜈𝜑∣2 + 𝜖∣𝜑′∣2

= ⟨(−𝐷 + 𝜖)𝜑′, 𝜑′⟩ + ⟨(−𝜇 + 𝜈2

4𝜖 )𝜑,𝜑⟩

≤ 𝑘4𝜖 ∣𝜑∣2

for 𝜖 sufficiently small so that 𝐷(𝜉) ≥ 𝜖 and 𝑘 chosen so that −4𝜖𝜇+𝜈2(𝜉) ≤𝑘. Thus, we have that 𝐴− ( 𝑘

4𝜖 )𝐼 is dissipative in 𝐿2(0, 𝑙).

The range statement ℛ(𝜆+ 𝑘4𝜖 −𝐴) = 𝑋 again reduces to a question for

Sturm-Liouville problems. For any 𝜓 ∈ 𝐿2(0, 𝑙), we seek an 𝐿2(0, 𝑙) solutionof

(𝜆 +𝑘

4𝜖+ 𝜇)𝜑 + (𝜈𝜑)′ − (𝐷𝜑′)′ = 𝜓, 0 < 𝜉 < 𝑙,

𝜑(0) = 0, [𝐷𝜑′ − 𝜈𝜑](𝑙) = 0.

For 𝜆 sufficiently large, the Sturm-Liouville theory guarantees existence ofsolutions. Thus 𝐴 − 𝑘

4𝜖 is the generator of a 𝐶0-semigroup 𝑆(𝑡) so that

𝑇 (𝑡) = 𝑒𝑘4𝜖

𝑡𝑆(𝑡) is the solution semigroup generated by 𝐴.

Return to Example 3 : Delay Systems

We return to the delay system equation of Example 3 which is a special caseof more general vector delay systems. We consider the Cauchy problem

��(𝑡) = 𝐿(𝑧𝑡), (𝑧(0), 𝑧0) = (𝜂, 𝜙), (29)

where 𝑧𝑡(𝜉) = 𝑧(𝑡 + 𝜉), 𝜉 ∈ [−𝑟, 0] and

33

𝐿(𝑧𝑡) =𝑚∑𝑖=0

𝐴𝑖𝑧(𝑡− 𝜏𝑖) +

∫ 0

−𝑟𝐾(𝜉)𝑧(𝑡 + 𝜉)𝑑𝜉,

where 0 = 𝜏0 < 𝜏1 < . . . < 𝜏𝑚 = 𝑟, and 𝐴𝑖 and 𝐾(𝜉) are 𝑛×𝑛 matrices. Forgeneral functions 𝜙 ∈ 𝐶(−𝑟, 0;ℝ𝑛), the mapping 𝐿(⋅) : 𝐶(−𝑟, 0;ℝ𝑛) → ℝ𝑛

has the form

𝐿(𝜙) =𝑚∑𝑖=0

𝐴𝑖𝜙(−𝜏𝑖) +

∫ 0

−𝑟𝐾(𝜉)𝜙(𝜉)𝑑𝜉. (30)

Let 𝑥(𝑡) = (𝑧(𝑡), 𝑧𝑡) ∈ 𝑋, 𝑋 ≡ ℝ𝑛 × 𝐿2(−𝑟, 0;ℝ𝑛), where the Hilbertspace 𝑋 has the inner product

< (𝜂, 𝜙), (𝜁, 𝜓) >𝑋=< 𝜂, 𝜁 >ℝ𝑛 +

∫ 0

−𝑟𝜙(𝜉)𝜓(𝜉)𝑑𝜉. (31)

Define 𝒜 : 𝒟(𝒜) ∈ 𝑋 → 𝑋, where 𝒟(𝒜) = {𝜙 = (𝜙(0), 𝜙) ∈ 𝑋∣𝜙 ∈𝐻1(−𝑟, 0;ℝ𝑛)}, which is dense in 𝑋. Then 𝒜𝜙 = 𝒜(𝜙(0), 𝜙) = (𝐿(𝜙),𝐷𝜙)for 𝜙 = (𝜙(0), 𝜙) ∈ 𝒟(𝒜). We use Lumer-Phillips to show that 𝒜 is aninfinitesimal generator of the 𝐶0 semigroup 𝑆(𝑡) where 𝑆(𝑡)(𝜂, 𝜙) = (𝑧(𝑡), 𝑧𝑡)for solutions of (29). We do this in a space 𝑋𝑔 that is topologically equivalentto 𝑋, so that we have the semigroup on 𝑋 as well as on 𝑋𝑔.

First, we show that 𝒜 is dissipative in a space topologically equivalentto 𝑋. Renorm 𝑋 by the weighting function 𝑔 defined on [−𝑟, 0], where𝑔(𝜉) = 𝑗 for 𝜉 ∈ [−𝜏𝑚−𝑗+1,−𝜏𝑚−𝑗), 𝑗 = 1, 2, . . . ,𝑚. Define the Hilbertspace 𝑋𝑔 ≡ ℝ𝑛 × 𝐿2(−𝑟, 0; 𝑔;ℝ𝑛) to be the elements of 𝑋 with this newinner product

< (𝜂, 𝜙), (𝜁, 𝜓) >𝑋𝑔=< 𝜂, 𝜁 >ℝ𝑛 +

∫ 0

−𝑟𝜙(𝜉)𝜓(𝜉)𝑔(𝜉)𝑑𝜉. (32)

This gives rise to an equivalent topology to that of 𝑋 as long as 𝑔(𝜉) >0 for all 𝜉 ∈ [−𝑟, 0] (see [BBu2, p. 186], [BKap1, Webb] for more details).Then 𝒜 is dissipative if

< 𝒜𝑥, 𝑥 >𝑋𝑔≤ 𝜔∣𝑥∣2𝑋𝑔(33)

for all 𝑥 ∈ 𝒟(𝒜) where 𝑥 = (𝜙(0), 𝜙). To show this, we argue

34

< 𝒜𝑥, 𝑥 > = < (𝐿(𝜙),𝐷𝜙), (𝜙(0), 𝜙) >

= < 𝐿(𝜙), 𝜙(0) >ℝ𝑛 + < 𝐷𝜙, 𝜙 >𝐿2(−𝑟,0;𝑔;ℝ𝑛)

= <

𝑚∑𝑖=0


∫ 0

−𝑟𝐾(𝜉)𝜙(𝜉)𝑑𝜉, 𝜙(0) >ℝ𝑛 (34)

+

∫ 0

−𝑟𝐷𝜙(𝜉)𝜙(𝜉)𝑔(𝜉)𝑑𝜉

= <

𝑚∑𝑖=0


∫ 0

−𝑟𝐾(𝜉)𝜙(𝜉)𝑑𝜉, 𝜙(0) >ℝ𝑛 (35)

+

𝑚∑𝑗=1

∫ −𝜏𝑚−𝑗

−𝜏𝑚−𝑗+1

𝐷𝜙(𝜉)𝜙(𝜉)𝑔(𝜉)𝑑𝜉. (36)

Consider the last term and denote 𝜙𝑚−𝑗 = 𝜙(−𝜏𝑚−𝑗)

35

𝑚∑𝑗=1

∫ −𝜏𝑚−𝑗

−𝜏𝑚−𝑗+1

𝐷𝜙(𝜉)𝜙(𝜉)𝑔(𝜉)𝑑𝜉

=𝑚∑𝑗=1

1

2𝑗𝜙(𝜉)2∣𝜉=−𝜏𝑚−𝑗

𝜉=−𝜏𝑚−𝑗+1

=

𝑚∑𝑗=1

(𝑗

2∣𝜙𝑚−𝑗 ∣2 − 𝑗

2∣𝜙𝑚−𝑗+1∣2

)

=1

2

𝑚∑𝑗=1

𝑗∣𝜙𝑚−𝑗 ∣2 − 1

2

𝑚−1∑𝑘=0

(𝑘 + 1)∣𝜙𝑚−𝑘∣2 ( for 𝑘 = 𝑗 − 1)

=1

2𝑚∣𝜙0∣2 +

1

2

𝑚−1∑𝑗=1

𝑗∣𝜙𝑚−𝑗 ∣2

−1

2

𝑚−1∑𝑘=1

(𝑘 + 1)∣𝜙𝑚−𝑘∣2 − 1

2∣𝜙𝑚∣2

=1

2𝑚∣𝜙0∣2 − 1

2

𝑚−1∑𝑗=1

∣𝜙𝑚−𝑗 ∣2 − 1

2∣𝜙𝑚∣2

=1

2𝑚∣𝜙0∣2 − 1

2

𝑚−1∑𝑗=0

∣𝜙𝑚−𝑗 ∣2

=𝑚 + 1

2∣𝜙0∣2 − 1

2

𝑚∑𝑗=0

∣𝜙𝑚−𝑗 ∣2

Returning to (36), we have for 𝑥 = (𝜙(0), 𝜙),

36

< 𝒜𝑥, 𝑥 >𝑋𝑔 ≤𝑚∑𝑖=0

∣𝐴𝑖∣∣𝜙𝑖∣∣𝜙0∣+ <

∫ 0

−𝑟𝐾(𝜉)𝜙(𝜉)𝑑𝜉, 𝜙(0) >ℝ𝑛

+𝑚 + 1

2∣𝜙0∣2 − 1

2

𝑚∑𝑗=0

∣𝜙𝑚−𝑗 ∣2

≤𝑚∑𝑖=0

(1

2∣𝐴𝑖∣2∣𝜙0∣2 +

1

2∣𝜙𝑖∣2

)+ ∣∫ 0

−𝑟𝐾(𝜉)𝜙(𝜉)𝑑𝜉∣∣𝜙(0)∣

+𝑚 + 1

2∣𝜙0∣2 − 1

2

𝑚∑𝑗=0

∣𝜙𝑚−𝑗 ∣2

=1

2

𝑚∑𝑖=0

∣𝐴𝑖∣2∣𝜙0∣2 +1

2

𝑚∑𝑖=0

∣𝜙𝑖∣2 + ∣∫ 0

−𝑟𝐾(𝜉)𝜙(𝜉)𝑑𝜉∣∣𝜙(0)∣

+𝑚 + 1

2∣𝜙0∣2 − 1

2

𝑚∑𝑗=0

∣𝜙𝑚−𝑗 ∣2

≤(

1

2

𝑚∑𝑖=0

∣𝐴𝑖∣2 +𝑚 + 1

2

)∣𝜙0∣2 +

1

2∣𝐾∣2𝐿2

∣𝜙∣2𝐿2+

1

2∣𝜙(0)∣2

≤(

1

2

𝑚∑𝑖=0

∣𝐴𝑖∣2 +𝑚 + 1

2+

1

2+

1

2∣𝐾∣2𝐿2

)∣𝑥∣2𝑋

≤ 𝜔∣𝑥∣2𝑋≤ 𝜔∣𝑥∣2𝑋𝑔

,

using the definition of the inner product and the parallelogram identity;therefore choosing

𝜔 =1

2

𝑚∑𝑖=0

∣𝐴𝑖∣2 +𝑚

2+ 1 +

1

2∣𝐾∣2𝐿2

,

we have that 𝒜 is dissipative in 𝑋𝑔.To use Lumer-Phillips, it only remains to be shown that for some 𝜆 > 0,

given any 𝜓 = (𝜓(0), 𝜓) ∈ 𝑋, (𝜆𝐼 − 𝐴)𝜙 = 𝜓, can be solved for some 𝜙 =(𝜙(0), 𝜙) ∈ 𝒟(𝒜). This is equivalent to solving 𝜆𝜙(0)−𝐿(𝜙) = 𝜓(0) and 𝜆𝜙−𝐷𝜙 = 𝜓 on [−𝑟, 0). But this follows immediately from [BBu2, Lemma 3.3,p. 178].

37

6 Adjoint Operators and Dual Spaces

6.1 Adjoint Operators

Recall that 𝐴∗ : 𝒟(𝐴∗) ⊂ 𝑋 → 𝑋 is defined in Definition 7 and we requiredthat 𝐴 be densely defined in 𝑋. We argue that this defines 𝐴∗ uniquely if𝒟(𝐴) is dense in 𝑋. To show this, suppose there exist 𝑤1 and 𝑤2 such that

⟨𝐴𝑥, 𝑦⟩ = ⟨𝑥,𝑤1⟩ for all 𝑥 ∈ 𝒟(𝐴)

and⟨𝐴𝑥, 𝑦⟩ = ⟨𝑥,𝑤2⟩ for all 𝑥 ∈ 𝒟(𝐴).

Then,⟨𝑥,𝑤1 − 𝑤2⟩ = 0 for all 𝑥 ∈ 𝒟(𝐴).

If 𝒟(𝐴) is dense in 𝑋, then 𝑤1 − 𝑤2 must equal zero. Therefore, 𝑤1 = 𝑤2,and the adjoint of A is uniquely defined.

6.1.1 Computation of 𝐴∗: an Example from the Heat Equation

The computation of 𝐴∗ can be easy or impossible. Let 𝑋 = 𝐿2(0, 𝑙), thenthe operator 𝐴 from the heat equation is defined by

𝐴𝜑 = (𝐷𝜑′)′

on𝒟(𝐴) = {𝜑 ∈ 𝐻2(0, 𝑙) ∣ (𝐷𝜑′)′ ∈ 𝐿2(0, 𝑙), 𝜑(0) = 𝜑′(𝑙) = 0}.

For 𝜑 ∈ 𝒟(𝐴), 𝐷 real, and 𝜓 ∈ 𝐿2(0, 𝑙) = 𝑋, we have

⟨𝐴𝜑,𝜓⟩ =∫ 𝑙0(𝐷𝜑′)′𝜓

= − ∫ 𝑙0 𝐷𝜑′𝜓′ + 𝐷𝜑′𝜓∣𝑙0

=∫ 𝑙0 𝜑(𝐷𝜓′)′ − 𝜑𝐷𝜓′∣𝑙0 + 𝐷𝜑′𝜓∣𝑙0

= ⟨𝜑, (𝐷𝜓′)′⟩ − 𝜑𝐷𝜓′∣𝑙0 + 𝐷𝜑′𝜓∣𝑙0= ⟨𝜑,𝑤⟩ for all 𝜑 ∈ 𝒟(𝐴)

if and only if 𝜓′(𝑙) = 0 and 𝜓(0) = 0. (We must, of course, have 𝜓 ∈ 𝐻2(0, 𝑙)and (𝐷𝜓′)′ ∈ 𝐿2(0, 𝑙) to carry out the operations.) Hence

𝒟(𝐴∗) = {𝜓 ∈ 𝐻2(0, 𝑙) ∣ (𝐷𝜓′)′ ∈ 𝐿2(0, 𝑙), 𝜓(0) = 𝜓′(𝑙) = 0}.and

𝐴∗𝜓 = (𝐷𝜓′)′.

In this case, ⟨𝐴𝜑,𝜓⟩ = ⟨𝜑,𝐴∗𝜓⟩; therefore 𝐴 = 𝐴∗, 𝒟(𝐴) = 𝒟(𝐴∗), andhence 𝐴 is a self adjoint operator.

38

6.1.2 Self Adjoint Operators and Dissipativeness

For self adjoint linear operators, if 𝜑 ∈ 𝒟(𝐴) = 𝒟(𝐴∗), we have

Re ⟨𝐴𝜑,𝜑⟩ = Re ⟨𝜑,𝐴∗𝜑⟩ = Re ⟨𝐴∗𝜑,𝜑⟩.Therefore, whenever 𝐴 is self adjoint, 𝐴 is dissipative if and only if 𝐴∗ isdissipative. Thus, for any closed, densely-defined, dissipative self adjointoperator in 𝑋, both 𝐴 and 𝐴∗ are infinitesimal generators (see Corollary5 to the Lumer-Phillips theorem) of 𝐶0 semigroups 𝑇 (𝑡) ∼ 𝑒𝐴𝑡 and 𝑆(𝑡) ∼𝑒𝐴

∗𝑡. From a fundamental Hilbert space result, given below, we know that𝑆(𝑡) = 𝑇 (𝑡)∗.

Theorem 9 If 𝐴 is an infinitesimal generator of a 𝐶0 semigroup 𝑇 (𝑡) on aHilbert space 𝑋, then 𝑇 (𝑡)∗ = 𝑆(𝑡) is a 𝐶0 semigroup on 𝑋 with infinitesimalgenerator 𝐴∗.

Homework Exercises

∙ Ex. 8 (Project 1) : Consider 𝒟(𝐴∗) in a Hilbert space 𝑋 where 𝐴 isdensely defined in 𝑋.

– What can you say about 𝒟(𝐴∗) with respect to closed, dense,nonempty, all of 𝑋, relationship with 𝐴−1 and (𝐴∗)−1 when theyexist, and symmetric versus self adjoint?

– Now consider all the same questions in a Banach space 𝑋. (Note:In a Banach space 𝒟(𝐴∗) ⊂ 𝑋∗.)

Give complete examples and counter examples. Also, give completereferences.

For relevant material, see [DS, HP, K, RN, Y].

6.2 Dual Spaces and Strong, Weak, and Weak* Topologies

Before discussing dissipative operators in a Banach space 𝑋, we must dis-cuss topological dual spaces 𝑋∗. We talk about strong, weak, and weak*topologies in a Banach space 𝑋. For relevant material, see [DS, K, Y].

Let 𝑋 be a Banach space. Then 𝑋∗, called the dual space or “topologi-cal” dual space, is a Banach space whenever 𝑋 is a normed linear space andthe scalar field is complete. It is defined as the Banach space of continuous(conjugate or anti-) linear functionals on 𝑋, where a continuous linear func-tional satisfies ∣𝑓(𝑥)∣ ≤ 𝑘∣𝑥∣ for all 𝑥 ∈ 𝑋 and a conjugate linear functional

39

satisfies 𝑓(𝛼𝑥 + 𝛽𝑦) = ��𝑓(𝑥) + 𝛽𝑓(𝑦) for all 𝑥, 𝑦 ∈ 𝑋. The elements of 𝑋∗

are denoted by 𝑥∗ where ∣𝑥∗∣∗ = sup∣𝑥∣≤1

∣𝑥∗(𝑥)∣, where ∣ ⋅ ∣∗ denotes the norm

in 𝑋∗. This of course gives rise to a strong or normed topology of 𝑋∗.If 𝑋 is a complex Hilbert space, then 𝑋 ≈ 𝑋∗, and we have the Riesz

theorem.

Theorem 10 (Riesz theorem) If 𝑋 is a Hilbert space, then for every 𝑓 ∈𝑋∗, there is a corresponding 𝑓 ∈ 𝑋 such that

𝑓(𝑥) = ⟨𝑓 , 𝑥⟩and

𝑓(𝛼𝑥 + 𝛽𝑦) = ⟨𝑓 , 𝛼𝑥 + 𝛽𝑦⟩ = ��⟨𝑓 , 𝑥⟩ + 𝛽⟨𝑓, 𝑦⟩for all 𝑥 ∈ 𝑋.

Definition 8 Weak convergence of a sequence {𝑥𝑛} ⊂ 𝑋, 𝑥𝑛 ⇀ 𝑥, is de-fined by

𝑥𝑛 ⇀ 𝑥 if and only if 𝑥∗(𝑥𝑛) → 𝑥∗(𝑥) for all 𝑥∗ ∈ 𝑋∗.

We write 𝑥 = wlim 𝑥𝑛.

Definition 9 𝑋 is said to be weakly sequentially complete if and only ifevery weakly Cauchy sequence in 𝑋 converges weakly to an element 𝑥 ∈ 𝑋.

Result Section 1

There are a number of useful results related to weak convergence.

1. Weak limits are unique.

To see this, suppose 𝑥𝑛 ⇀ 𝑥 and 𝑥𝑛 ⇀ 𝑦, then 𝑥∗(𝑥) = 𝑥∗(𝑦), i.e,𝑥∗(𝑥 − 𝑦) = 0 for all 𝑥∗ ∈ 𝑋∗. However, the Hahn-Banach theoremsays if 𝑧 ∕= 0, then there exists an 𝑥∗ ∈ 𝑋∗ such that 𝑥∗(𝑧) ∕= 0.Therefore, 𝑥 = 𝑦.

2. (a) 𝑥𝑛 → 𝑥 implies 𝑥𝑛 ⇀ 𝑥, but the converse is not true.

(b) 𝑥𝑛 ⇀ 𝑥 implies {𝑥𝑛} is bounded in 𝑋 and ∣wlim 𝑥𝑛∣ = ∣𝑥∣ ≤ liminf ∣𝑥𝑛∣.Note: This says that the norm is weakly lower semi-continuous.Recall if 𝑓 is continuous, then ∣𝑓(𝑥) − 𝑓(𝑥0)∣ < 𝜖 or 𝑓(𝑥0) − 𝜖 <

40

𝑓(𝑥) < 𝑓(𝑥0) + 𝜖 for 𝑥 near 𝑥0 . Lower semi-continuous meanswe just have 𝑓(𝑥0)− 𝜖 < 𝑓(𝑥) for 𝑥 sufficiently near 𝑥0. This hasimportant applications in optimization (see the remark below).

(c) 𝑥𝑛 → 𝑥 if and only if 𝑥∗(𝑥𝑛) converges uniformly for ∣𝑥∗∣∗ ≤ 1.

Remark: Application of Semi-continuity in Optimization.Weak lower semi-continuity plays a fundamental role in many optimiza-

tion settings. Suppose we’re in a Hilbert space 𝑋 and we want to minimize afunction 𝐽(𝑥) over 𝐾 ⊂ 𝑋. Further suppose we have a minimizing sequence{𝑥𝑛} such that 𝐽(𝑥𝑛) ↘ inf

𝑥∈𝐾𝐽(𝑥) and {𝑥𝑛} is bounded, i.e., ∣𝑥𝑛∣ ≤ 𝑀 .

As we shall see below, bounded sequences possess weakly convergent sub-sequences. Therefore we can choose a subsequence converging weakly, i.e,𝑥𝑛𝑘

⇀ ��. If 𝐽 is weakly lower semi-continuous, then we have 𝐽(wlim 𝑥𝑛𝑘) ≤

lim inf 𝐽(𝑥𝑛𝑘) or 𝐽(��) ≤ inf

𝑥∈𝐾𝐽(𝑥). Whenever 𝐾 is weakly closed, this

yields existence of a minimum (not just an infimum) when minimizing aweakly lower semi-continuous 𝐽 . These facts are often used in constructivearguments to develop computational algorithms.

Definition 10 Weak convergence generates a topology in 𝑋 (use gener-alized sequences or nets) called the weak topology of 𝑋 or 𝑋∗ topology of𝑋.

Note: Weakly closed implies strongly closed, but not conversely in general.However, we do have an important case for which weak and strong closureare equivalent.

Theorem 11 (Mazur - 1933) A convex set 𝐸 ⊂ 𝑋 is 𝑋∗ closed if and onlyif 𝑋 is closed.

Thus, for convex sets, a set is weakly closed if and only if it is stronglyclosed. See [DS] for relevant information.

Definition 11 Since 𝑋∗ is a Banach space, we can define its dual, 𝑋∗∗.There is a natural embedding 𝑋 ↪→ 𝑋∗∗ defined by 𝐽 : 𝑋 → 𝑋∗∗ where

𝐽(𝑥)(𝑥∗) = 𝑥∗(𝑥)

for all 𝑥∗ ∈ 𝑋∗.

In general, 𝐽(𝑋) ⊂ 𝑋∗∗, but 𝐽(𝑋) ∕= 𝑋∗∗.

Definition 12 If 𝐽(𝑋) = 𝑋∗∗, then 𝑋 is called a reflexive Banach space.

41

Result Section 2

The following elementary results can be argued.

1. Hilbert spaces are reflexive.

2. 𝑋 is a reflexive Banach space implies 𝑋 is weakly sequentially com-plete.

3. 𝑋 is a reflexive Banach space if and only if {𝑥 ∈ 𝑋∣ ∣𝑥∣ ≤ 1} is compactin the weak topology. Therefore, in a Banach space, bounded sets areweakly, sequentially pre-compact.

4. Bounded sets in a Hilbert space are weakly, sequentially pre-compact.(We have this from Results 1. and 3.)

5. Let 𝑋 be a Hilbert space. Then

𝑥𝑛 ⇀ 𝑥 and ∣𝑥𝑛∣ → ∣𝑥∣ implies 𝑥𝑛 → 𝑥.

Conversely,

𝑥𝑛 ⇀ 𝑥 and 𝑥𝑛 → 𝑥 implies ∣𝑥𝑛∣ → ∣𝑥∣.Therefore,

𝑥𝑛 ⇀ 𝑥 implies 𝑥𝑛 → 𝑥 if and only if ∣𝑥𝑛∣ → ∣𝑥∣.

This is easy to see as

∣𝑥𝑛 − 𝑥∣2 = ⟨𝑥𝑛 − 𝑥, 𝑥𝑛 − 𝑥⟩ = ∣𝑥𝑛∣2 − ⟨𝑥𝑛, 𝑥⟩ − ⟨𝑥, 𝑥𝑛⟩ + ∣𝑥∣2.

Definition 13 For a Banach space 𝑋, 𝑋∗ is itself a Banach space; therefore,the weak topology of 𝑋∗ can be defined and is called the 𝑋∗∗ topology.

We can now consider weak convergence in 𝑋∗ which can be characterizedby

𝑥∗𝑛 ⇀ 𝑥∗ ∈ 𝑋∗ if and only if 𝑥∗∗(𝑥∗𝑛) → 𝑥∗∗(𝑥∗) for all 𝑥∗∗ ∈ 𝑋∗∗.

Definition 14 We can put an even weaker topology on 𝑋∗. We can define

a weak* convergence of {𝑥∗𝑛} ⊂ 𝑋∗, denoted as 𝑥∗𝑛𝑤∗⇀ 𝑥∗, by

𝑥∗𝑛𝑤∗⇀ 𝑥∗ ≡ 𝑥∗𝑛(𝑥) → 𝑥∗(𝑥) for all x ∈ 𝑋.

(i.e., the pointwise convergence of (anti) linear functionals.) Note that thisis in general a weaker convergence than weak convergence since 𝐽(𝑋) ⊂ 𝑋∗∗

with 𝐽(𝑋) ∕= 𝑋∗∗ in general.

42

Result Section 3

We have the following results.

1. 𝑥∗𝑛𝑤∗⇀ 𝑥∗ implies ∣𝑥∗𝑛∣∗ ≤ 𝑀 .

2. 𝑥∗𝑛 → 𝑥∗ in 𝑋∗ implies 𝑥∗𝑛𝑤∗⇀ 𝑥∗ in 𝑋∗ (i.e., strong convergence implies

weak* convergence).

The converse does not hold.

3. If 𝑋 is a Banach space, then 𝑋∗ is weak* sequentially complete andthe norm ∣ ⋅ ∣∗ is weak* lower semi-continuous. In other words,

∣w*lim inf 𝑥∗𝑛∣∗ ≤ lim inf ∣𝑥∗𝑛∣∗.

4. 𝑥∗𝑛 ⇀ 𝑥∗ in 𝑋∗ implies 𝑥∗𝑛𝑤∗⇀ 𝑥∗ in 𝑋∗ (i.e., weak convergence implies

weak* convergence).

The weak* convergence in 𝑋∗ can be used to define a topology for 𝑋∗,called the 𝑋 topology of 𝑋∗. In general, the 𝑋 topology of 𝑋∗ is weakerthan the 𝑋∗∗ topology of 𝑋∗. In other words, in 𝑋∗

strong convergence implies weak convergence implies weak* convergence;

however, the converses do not hold.

6.2.1 Summary of topologies on 𝑋 and 𝑋∗

𝑋 𝑋∗

strong strong

weak weak( 𝑋∗ topology of 𝑋) ( 𝑋∗∗ topology of 𝑋∗)

– weak *( 𝑋 topology of 𝑋∗)

Note:For any Banach space 𝑌 , we can talk about weak convergence, but we canonly talk about weak* convergence in the case that 𝑌 = 𝑋∗ = the dual ofsome space 𝑋 !!

Theorem 12 (Alaoglu’s Theorem) Let 𝑋 be a Banach space. Then the unitball (closed unit sphere) in 𝑋∗ is compact in the weak* topology (i.e., the 𝑋topology of 𝑋∗).

Corollary 6 Bounded sets in 𝑋∗ are weak* sequentially compact.

43

Result Section 4

We can summarize additional findings that are consequences of the aboveresults.

1. Let 𝐽 : 𝑋 → 𝑋∗∗ be the natural embedding so that for a reflex-ive Banach space, 𝐽(𝑋) = 𝑋∗∗. Hence we have that the weak andweak* topologies of 𝑋∗ are the same. To see this we argue 𝑥∗∗(𝑥∗𝑛) =𝐽(𝑥)(𝑥∗𝑛) = 𝑥∗𝑛(𝑥) and hence

𝑥∗∗(𝑥∗𝑛) → 𝑥∗∗(𝑥∗)

is weak sense convergence with

𝑥∗∗(𝑥∗) = 𝐽(𝑥)(𝑥∗) = 𝑥∗(𝑥).

Since 𝑥∗𝑛(𝑥) → 𝑥∗(𝑥) is weak* convergence, these are thus equivalent.

Conversely, if 𝑋 is not reflexive, then 𝐽(𝑋) ∕= 𝑋∗∗ and the 𝑋 topologyof 𝑋∗ is weaker than the 𝑋∗∗ topology of 𝑋∗. Thus in a reflexiveBanach space (which includes Hilbert spaces), bounded sets are weaklysequentially compact. (This is a special case of Alaoglu’s theorem.)

2. A Banach space is reflexive if and only if the unit ball is compact inweak topology.

6.3 Examples of Spaces and Their Duals

∙ We first consider 𝐿𝑝(0, 1) and recall

𝐿∗𝑝(0, 1) = 𝐿𝑞(0, 1), 1 ≤ 𝑝 < ∞,

1

𝑝+

1

𝑞= 1.

In other words, 𝐿∞ is the dual of a space, 𝐿∗1 = 𝐿∞; however, we do

not have a satisfactory representation for the dual 𝐿∗∞ of 𝐿∞. In otherwords, we cannot write 𝐿1 as the dual of 𝐿∞ or any other known space.As a result, we can consider the weak* topology in 𝐿𝑝 for 1 < 𝑝 ≤ ∞,but we cannot talk about the weak* topology in 𝐿1.

∙ 𝐿𝑝(0, 𝑇 ;𝑋) are important spaces because they are very useful in study-ing the system ��(𝑡) = 𝐴𝑥(𝑡) +𝐹 (𝑡), 0 < 𝑡 < 𝑇, in 𝑋. We have (to bediscussed later)

𝐿𝑝(0, 𝑇 ;𝑋)∗ ∼= 𝐿𝑞(0, 𝑇 ;𝑋∗), 1 ≤ 𝑝 < ∞.

See [E] for relevant details on Banach space valued functions of timeon intervals (0, 𝑇 ).

44

∙ The Banach space of (bounded) continuous functions on [𝑎, 𝑏] is de-noted by 𝐶[𝑎, 𝑏]. It has a norm defined by ∣𝜑∣ = sup

𝜉∈[𝑎,𝑏]∣𝜑(𝜉)∣ for

𝜑 ∈ 𝐶[𝑎, 𝑏]. Then we have the following theorem from [DS, Chap-ter IV].

Theorem 13 The following equivalences hold:

𝐶∗[𝑎, 𝑏] ∼= rba[𝑎, 𝑏] ∼= NBV[𝑎, 𝑏]

where NBV[𝑎, 𝑏] stands for normalized bounded variation functions andrba[𝑎, 𝑏] stands for regular bounded additive set functions.

Therefore we can consider the weak* topology for NBV[𝑎, 𝑏]; however,𝐶[𝑎, 𝑏] is not the dual of another space, and specifically NBV[𝑎, 𝑏]∗ ∕=𝐶[𝑎, 𝑏] (i.e., we do not have satisfactory characterizations or representations–there is no known space 𝑋 such 𝐶[𝑎, 𝑏] = 𝑋∗). Thus we cannot talkabout the weak* topology in 𝐶[𝑎, 𝑏].

Discussion of rba[𝑎, 𝑏]: Suppose 𝜇 is a regular set function on Borelsubsets of [𝑎, 𝑏]. Then by definition of regular, given a set 𝐹 ⊂ [𝑎, 𝑏]and 𝜖 > 0, there exists a closed set 𝐸 and an open set 𝑂 such that𝐸 ⊂ 𝐹 ⊂ �� and 𝜇(�� − 𝐸) < 𝜖. Additive means that for any twoBorel subsets 𝐴 and 𝐵 contained in [𝑎, 𝑏] such that 𝐴∩𝐵 = ∅, we have𝜇(𝐴 ∪𝐵) = 𝜇(𝐴) + 𝜇(𝐵).

Discussion of NBV[𝑎, 𝑏]: BV[𝑎, 𝑏] has a weak norm ∣𝑓 ∣ = ∣𝑓(𝑎+)∣+var(𝑓) where var(𝑓) is the total variation

var(𝑓) = sup𝜋[𝑎,𝑏]

𝑁∑𝑖=1

∣𝑓(𝑥𝑖) − 𝑓(𝑥𝑖−1)∣,

and 𝜋[𝑎, 𝑏] denotes the partitions of [𝑎, 𝑏]. Note that var(𝑓) provides asemi-norm on BV and BV can be written as BV𝑜 ⊕ 𝑁 where BV𝑜 ={𝑓 ∈ BV∣𝑓(𝑎+) = 0 and N is the 1-D space of constant functions}.

NBV[𝑎, 𝑏] is normalized by making 𝑓 right-continuous at the interiorpoints and 𝑓(𝑎+) = 0 with ∣𝑓 ∣ = var(𝑓).

Suppose 𝑓 ∈ NBV[𝑎, 𝑏]; then we can write [Ru1, p. 101], [Ru2, p. 163]

𝑓(𝑥) = 𝑓(𝑎) + 𝑝𝑓 (𝑥) − 𝑛𝑓 (𝑥)

𝑣(𝑥) = 𝑝𝑓 (𝑥) + 𝑛𝑓 (𝑥),

45

where 𝑝𝑓 , 𝑛𝑓 and 𝑣 are nondecreasing monotone functions defined by

𝑝𝑓 (𝑥) = sup𝜋[𝑎,𝑥]

𝑁∑𝑖=1

(𝑓(𝑥𝑖)− 𝑓(𝑥𝑖−1))+,

𝑛𝑓 (𝑥) = sup𝜋[𝑎,𝑥]

𝑁∑𝑖=1

∣(𝑓(𝑥𝑖) − 𝑓(𝑥𝑖−1))−∣, and

𝑣(𝑥) = sup𝜋[𝑎,𝑥]

𝑁∑𝑖=1

∣𝑓(𝑥𝑖)− 𝑓(𝑥𝑖−1∣,

where (⋅)+ and (⋅)− are the positive and negative parts of (⋅), respec-tively. Using the monotone functions 𝑝𝑓 and 𝑛𝑓 , one can generate(positive) Lebesgue-Stieltjes measures 𝜇𝑝𝑓 and 𝜇𝑛𝑓

[DS, Hal, HeSt,McS, Ru1, Ru2, Smir] and subsequently a signed measure

𝜇𝑓 = 𝜇𝑝𝑓 − 𝜇𝑛𝑓,

where 𝜇𝑓 is regular. From the above theorem we have 𝐶∗[𝑎, 𝑏] ∼=rba(𝑎, 𝑏), which means there is a one-to-one correspondence

𝑥∗ ↔ 𝜇,

given by

𝑥∗(𝜑) =

∫𝜑𝑑𝜇.

We also have 𝐶∗[𝑎, 𝑏] ∼= NBV[𝑎, 𝑏], which means we have

𝑥∗ ↔ 𝜇𝑓 ,

given by

𝑥∗(𝜑) =

∫𝜑𝑑𝜇𝑓

(=

∫𝜑𝑑𝑓

)where the integrals are Lebesque-Stieltjes integrals.

∙ We also need to consider Sobolev spaces, 𝑊 𝑘,𝑝(Ω). Recall 𝑊 𝑘,𝑝(Ω) isreflexive for 1 < 𝑝 < ∞ and, moreover, 𝑊 𝑘,2(Ω) is a Hilbert space. We

can define 𝑊 𝑘,𝑝0 (Ω) as Sobolev spaces with compact support (“func-

tions vanish at the boundary”), or we can think of 𝑊 𝑘,𝑝0 (Ω) as the

closure of 𝐶∞0 (Ω) in 𝑊 𝑘,𝑝. We can consider their dual spaces and find

𝑊 𝑘,𝑝0 (Ω)∗ ∼= 𝑊−𝑘,𝑝(Ω) 1 < 𝑝 < ∞,

1

𝑝+

1

𝑞= 1.

For relevant material, see [A, DS].

46

6.4 Return to Dissipativeness for General Banach Spaces

Definition 15 Let 𝑋 be a Banach space and 𝑋∗ be its conjugate dual.Then

𝑥∗(𝑥) = ⟨𝑥∗, 𝑥⟩is called the duality product.

Definition 16 For each 𝑥 in a Banach space 𝑋, we define the duality set𝐹 (𝑥) ⊂ 𝑋∗ by

𝐹 (𝑥) = {𝑥∗ ∈ 𝑋∗∣ ⟨𝑥∗, 𝑥⟩ = ∣𝑥∣2𝑋 = ∣𝑥∗∣2𝑋∗}.We know 𝐹 (𝑥) ∕= ∅ for each 𝑥 ∈ 𝑋 by the Hahn-Banach theorem.

Result

If 𝑋 is a Hilbert space, one can show 𝐹 (𝑥) = {𝑥} under the usual isomor-phism 𝑋 ≃ 𝑋∗. (This is the Duality Set Lemma, Section 4)

Definition 17 Let 𝐴 : 𝒟(𝐴) ⊂ 𝑋 → 𝑋 for a Banach space 𝑋. Then 𝐴 iscalled dissipative if for every 𝑥 ∈ 𝒟(𝐴) there exists some 𝑥∗ ∈ 𝐹 (𝑥) suchthat

Re⟨𝑥∗, 𝐴𝑥⟩ ≤ 0.

Result

The theorems stated previously for Hilbert spaces are still true in a Banachspace with the definition of dissipativeness given above. (Lumer-Phillips,characterizations, etc.)See [Pa, p. 13-16].

6.5 More on Adjoint Operators

Let 𝑋 and 𝑌 be normed linear spaces with 𝑋∗ and 𝑌 ∗ as their conjugateduals, respectively. For 𝒟(𝐴) dense in 𝑋, suppose we have the operator𝐴 : 𝒟(𝐴) ⊂ 𝑋 → 𝑌 . Then we can define the adjoint operator 𝐴∗ : 𝒟(𝐴∗) ⊂𝑌 ∗ → 𝑋∗ with

𝒟(𝐴∗) = {𝑦∗ ∈ 𝑌 ∗∣ 𝑥 → 𝑦∗(𝐴𝑥) is continuous on 𝒟(𝐴)}.Given 𝑦∗ ∈ 𝑌 ∗, we see that 𝑦∗ ∈ 𝒟(𝐴∗) if and only if there exists 𝑥∗ ∈ 𝑋∗

such that 𝑦∗(𝐴𝑥) = 𝑥∗(𝑥) for all 𝑥 ∈ 𝒟(𝐴). Therefore, we define 𝐴∗ by

𝐴∗𝑦∗ = 𝑥∗.

47

In other words,(𝐴∗𝑦∗)(𝑥) = 𝑥∗(𝑥).

6.5.1 Adjoint Operator in a Hilbert Space

In a Hilbert space, 𝑋∗ ∼ 𝑋 and 𝑌 ∗ ∼ 𝑌 ; therefore,

⟨𝐴𝑥, 𝑦⟩𝑌 = ⟨𝑥,𝐴∗𝑦⟩𝑋 .

6.5.2 Special Case

In general, we do not need to have 𝑇 ∈ ℒ(𝑋,𝑌 ) for 𝑇 ∗ to be defined;however, if we do have 𝑇 ∈ ℒ(𝑋,𝑌 ), then 𝑇 ∗ ∈ ℒ(𝑌 ∗,𝑋∗).

Theorem 14 Let 𝑋 and 𝑌 be reflexive Banach spaces. If 𝑇 is a closedlinear operator 𝑇 : 𝑋 → 𝑌 with 𝒟(𝑇 ) dense, then 𝒟(𝑇 ∗) is dense in 𝑌 ∗(∼𝑌 ) and 𝑇 ∗∗ = 𝑇 .

Note: This does not say anything about the relationship between 𝒟(𝑇 ) and𝒟(𝑇 ∗).

6.6 Examples of Computing Adjoints

In this series of examples, we illustrate how the domain, and in particular,boundary conditions, can affect the definition of an operator. In this casewe study operators which are all essentially the operations of second orderdifferentiation, i.e., 𝐴𝜑 = 𝜑′′, arising in the heat equation operator of Ex-ample 1. As we shall see, the boundary conditions affect dramatically thenature of the adjoint 𝐴∗.

We define 𝐻20 (0, 1) = {𝜑 ∈ 𝐻2(0, 1)∣𝜑(0) = 𝜑(1) = 𝜑′(0) = 𝜑′(1) = 0}.

1. Let 𝑉 = 𝐻20 (0, 1) and 𝐻 = 𝐿2(0, 1). We define the operator 𝐴 by

𝐴 : 𝒟(𝐴) = 𝑉 ⊂ 𝐻 → 𝐻

by𝐴𝜑 = 𝜑′′.

In order to define the adjoint operator 𝐴∗, we need to find, for each𝜓, an element 𝑤 such that

⟨𝐴𝜑,𝜓⟩ = ⟨𝜑,𝑤⟩ for all 𝜑 ∈ 𝐻20 (0, 1).

48

Therefore, we may carry out the calculations

⟨𝐴𝜑,𝜓⟩ =∫ 10 𝜑′′𝜓

= − ∫ 10 𝜑′𝜓′ + 𝜑′𝜓∣10

=∫ 10 𝜑𝜓′′ − 𝜑𝜓′∣10 + 𝜑′𝜓∣10

=∫ 10 𝜑𝜓′′ + 0 + 0

= ⟨𝜑,𝜓′′⟩

which hold if and only if 𝜓 ∈ 𝐻2.

Therefore𝒟(𝐴∗) = 𝐻2(0, 1)

with𝐴∗𝜓 = 𝜓′′.

Observe that 𝒟(𝐴∗) ∕= 𝒟(𝐴); therefore, 𝐴 is not self-adjoint. However,𝐴∗ = 𝐴 on their common domain 𝐻2

0 . Thus we see that there areboundary conditions which immediately render the heat operator non-self-adjoint.

2. Let 𝐻 = 𝐿2(0, 1), 𝑉 = 𝐻20 (0, 1) and recall that 𝑉 ∗ = 𝐻−2(0, 1) ≡

{𝜓∣𝑥 → ∫ 𝑥0

∫ 𝑠0 𝜓(𝑡)𝑑𝑡𝑑𝑠 is in 𝐿2(0, 1)} (see [A]). We define the opera-

tor 𝐴 by𝐴 : 𝐻 → 𝑉 ∗

by(𝐴𝜑)𝜓 = ⟨𝜑,𝜓′′⟩𝐻

for 𝜑 ∈ 𝐻 and 𝜓 ∈ 𝑉 . We claim that 𝐴 ∈ ℒ(𝐻,𝑉 ∗) and this shows thatdifferentiation can be bounded (continuous) if the spaces are chosenin a certain way. To verify our claim show ∣𝐴𝜑∣𝑉 ∗ ≤ 𝑘∣𝜑∣𝐻 . We have

∣(𝐴𝜑)(𝜓)∣ = ∣⟨𝜑,𝜓′′⟩∣≤ ∣𝜑∣𝐿2 ∣𝜓′′∣𝐿2

≤ ∣𝜑∣𝐻 ∣𝜓∣𝑉 .Therefore

∣(𝐴𝜑)(𝜓)∣∣𝜓∣𝑉 ≤ ∣𝜑∣𝐻 .

However

∣𝐴𝜑∣𝑉 ∗ = sup𝜓 ∕=0

∣(𝐴𝜑)(𝜓)∣∣𝜓∣𝑉 .

49

Thus∣𝐴𝜑∣𝑉 ∗ ≤ ∣𝜑∣𝐻

which implies that 𝐴 ∈ ℒ(𝐻,𝑉 ∗). From this we know 𝐴∗ ∈ ℒ(𝑉,𝐻).Next we seek to find 𝐴∗ by using

(𝐴∗𝜓)(𝜑) = (𝐴𝜑)(𝜓) = ⟨𝜑,𝜓′′⟩

for 𝜑 ∈ 𝐻. Thus 𝐴∗ is defined on all of 𝑉 by

𝐴∗𝜓 = 𝜓′′.

As in the previous example, 𝐴 is not self adjoint.

3. Let 𝐻 = 𝐿2(0, 1) and define the operator 𝐴1 by

𝐴1 : 𝒟(𝐴1) ⊂ 𝐻 → 𝐻

where𝒟(𝐴1) = 𝐻2(0, 1) ∩𝐻1

0 (0, 1)

with𝐴1𝜑 = 𝜑′′.

We know 𝐴1 /∈ ℒ(𝐻) from a previous assignment. In order to find theadjoint operator 𝐴∗

1, we want for each 𝜓 to find a 𝑤 such that

⟨𝐴1𝜑,𝜓⟩ = ⟨𝜑,𝑤⟩ for all 𝜑 ∈ 𝒟(𝐴1).

However⟨𝐴1𝜑,𝜓⟩ =

∫ 10 𝜑′′𝜓

=∫ 10 𝜑𝜓′′ − 𝜑𝜓′∣10 + 𝜑′𝜓∣10

=∫ 10 𝜑𝜓′′ + 0 + 𝜑′𝜓∣10

=∫ 10 𝜑𝜓′′

if and only if 𝜓 ∈ 𝐻2 ∩𝐻10 . Hence

𝐴∗1𝜓 = 𝜓′′

with𝒟(𝐴∗

1) = 𝐻2(0, 1) ∩𝐻10 (0, 1) = 𝒟(𝐴1).

Thus 𝐴1 is self-adjoint; however, it’s not a bounded linear operator on𝐻.

50

4. Let 𝑉 = 𝐻10 (0, 1). Then 𝑉 ∗ = 𝐻−1(0, 1). We can define the operator

𝐴1 by𝐴1 : 𝑉 → 𝑉 ∗

with(𝐴1𝜑)(𝜓) = −⟨𝜑′, 𝜓′⟩𝐻 .

We argue 𝐴1 ∈ ℒ(𝑉, 𝑉 ∗). To see this we must establish ∣𝐴1𝜑∣𝑉 ∗ ≤𝑘∣𝜑∣𝑉 for some 𝑘. We find

∣(𝐴1𝜑)(𝜓)∣ = ∣⟨𝜑′, 𝜓′⟩∣≤ ∣𝜑′∣𝐿2 ∣𝜓′∣𝐿2

≤ ∣𝜑∣𝑉 ∣𝜓∣𝑉 ,

and hence∣(𝐴1𝜑)(𝜓)∣

∣𝜓∣𝑉 ≤ ∣𝜑∣𝑉 .

However

∣𝐴1𝜑∣𝑉 ∗ = sup𝜓 ∕=0

∣(𝐴1𝜑)(𝜓)∣∣𝜓∣𝑉 .

Therefore∣𝐴1𝜑∣𝑉 ∗ ≤ ∣𝜑∣𝑉

which yields that 𝐴1 ∈ ℒ(𝑉, 𝑉 ∗). We know that 𝐴1∗ ∈ ℒ(𝑉, 𝑉 ∗). Next

we would like to find 𝐴∗1 which must satisfy

(𝐴1∗𝜓)(𝜑) = (𝐴1𝜑)(𝜓) = −⟨𝜑′, 𝜓′⟩

for 𝜑 ∈ 𝒟(𝐴1). Hence 𝐴1∗

is defined on all of V by

(𝐴1∗𝜓)𝜑 = −⟨𝜑′, 𝜓′⟩.

Thus we find that 𝐴1 is a bounded self adjoint linear operator.

51

7 Weak* Convergence, the Prohorov Metric, In-verse Problems and Optimization

7.1 Weak* Convergence and the Prohorov Metric

We turn next to discuss functional analysis concepts that are integral to sev-eral types of inverse problems. These problems entail parameter estimationusing output observations where the parameters to be estimated are proba-bility distributions. Such problems can be readily found in certain areas ofapplications [BBi, BBH, BBKW, BBPP, BD, BDTR, GRD-FP, BDEHAD,GRD-FP2, BF, BFPZ, BG1, BG2, BGIT]. Here we consider two types ofmeasure dependent problems: individual dynamics and aggregate dynamics.

7.1.1 Type I: Individual Dynamics/Aggregate Data

Recall, from Example 5 (the size-structured population model for mosquitofishwhere there is a family 𝒢 of individual growth rates 𝑔) we have the expectedsize density 𝑢(𝑡, 𝜉;𝑃 ) = ℰ [𝑣(𝑡, 𝜉; ⋅)∣𝑃 ] described for a general probabilitydistribution 𝑃 , we have

𝑢(𝑡, 𝜉;𝑃 ) =

∫𝒢𝑣(𝑡, 𝜉; 𝑔)𝑑𝑃 (𝑔),

as the expected value of the density 𝑣(𝑡, 𝜉; 𝑔) which satisfies for a given 𝑔the Sinko-Streiffer.

Since in typical inverse problems, we must use aggregate population data𝑑𝑖𝑗 to estimate 𝑃 itself, we are therefore required to understand the qualita-tive properties (continuity, sensitivity, etc.) of 𝑢(𝑡, 𝜉;𝑃 ) as a function of 𝑃 .Thus, we see that the data for parametric estimation or inverse problems is𝑑𝑖𝑗 ∼ 𝑢(𝑡, 𝜉;𝑃 ).

7.1.2 Type II: Aggregate Dynamics

We return to the electromagnetics example of Example 4 which entails po-larization of inhomogeneous dielectric materials. In this case, individual(particle) dynamics are not available. Instead, the dynamics (13) them-selves depend on a probability measure 𝐹 , e.g.,

∂𝑢

∂𝑡+

∂2

∂𝑥2𝑢 = 𝑓(𝑢, 𝐹 ).

To describe the behavior of the electric polarization 𝑃 , we begin withthe general formulation of Chapter 2 of [BBL] by employing a polarization

52

kernel 𝑔 in the convolution expression

𝑃 (𝑡, 𝑧) =

∫ 𝑡

0𝑔(𝑡− 𝑠, 𝑧; 𝜏)𝐸(𝑠, 𝑧)𝑑𝑠. (37)

As explained in Section 2.5 and [BBL], this general formulation includes asspecial cases the well known orientational or Debye polarization model, theelectronic or Lorentz polarization model, and linear combinations thereof,as well as other higher order models. In the Debye case the kernel is givenby

𝑔(𝑡; 𝜏) = 𝜖0(𝜖𝑠 − 𝜖∞)/𝜏 𝑒−𝑡/𝜏 ,

while in the Lorentz model (again see [BBL]), it takes the form

𝑔(𝑡; 𝜏) = 𝜖0𝜔2𝑝/𝜈0 𝑒−𝑡/2𝜏 sin(𝜈0𝑡).

However, use of these kernels presupposes that the material may be suf-ficiently defined by a single relaxation parameter 𝜏 , which is generally notthe case. In order to account for multiple relaxation parameters in the po-larization mechanisms, we allow for a distribution of relaxation parameterswhich is conveniently described in terms of a probability measure 𝐹 . Thus,we define our polarization model in terms of a convolution operator

𝑃 (𝑡, 𝑧) =

∫ 𝑡

0𝐺(𝑡− 𝑠, 𝑧;𝐹 )𝐸(𝑠, 𝑧)𝑑𝑠

where 𝐺 is determined by various polarization mechanisms each describedby a different parameter 𝜏 , and therefore is given by

𝐺(𝑡, 𝑧;𝐹 ) =

∫𝒯𝑔(𝑡, 𝑧; 𝜏)𝑑𝐹 (𝜏)

where 𝒯 ⊂ [𝜏1, 𝜏2].To obtain the macroscopic polarization, we sum over all the parameters.

We cannot separate dynamics to obtain individual dynamics, and thereforewe have (13) in Section 2.5 as an example where the dynamics for the 𝐸and 𝐻 fields depend explicitly on the probability measure 𝐹 .

For inverse problems, data 𝑑𝑖𝑗 ≈ 𝑢(𝑡𝑖, 𝜉𝑗) = 𝑢(𝑡𝑖, 𝜉𝑗;𝑃 ) in Type I, whereas𝑑𝑖𝑗 ≈ 𝐸(𝑡𝑖, ��𝑗 ;𝐹 ) in Type II. Observe that 𝑃 or 𝐹 ∈ 𝒫(𝑄), which is thespace of probability measures, where 𝑄 consists of “parameters”, and 𝒫(𝑄)is a set of probability distributions over 𝑄.

For example, in ordinary least square problems, we have

𝐽(𝑃 ) =∑𝑖𝑗

∣𝑑𝑖𝑗 − 𝑢(𝑡𝑖, 𝜉𝑗;𝑃 )∣2 or 𝐽(𝐹 ) =∑𝑖𝑗

∣𝑑𝑖𝑗 −𝐸(𝑡𝑖, ��𝑗 ;𝐹 )∣2,

53

to be minimized over 𝑃 or 𝐹 ∈ 𝒫(𝑄) or over some subset of 𝒫(𝑄). We need(for computational and theoretical considerations) a sense of closeness of twoprobability distributions 𝑃1 and 𝑃2. The ideas of local minima, continuity,“gradient”, etc., depend on a metric (or topology) for 𝑃1 and 𝑃2.

More fundamentally for Type II systems, the “well-posedness of sys-tems”, including existence and continuous dependence of solutions on “pa-rameters”, depends on the concept of 𝐹1 and 𝐹2 being close, i.e., 𝐹 →𝐸(𝑡, 𝑥;𝐹 ) is continuous.

54

7.2 Metrics on Probability Spaces

We consider 𝒫(𝑄) as the space of probability measures or distributions on𝑄. Let 𝑃1 and 𝑃2 be probability distributions. If working on the real line(−∞,∞) or (0,∞) we will sometimes not distinguish between the probabil-ity measure 𝑃 and its cumulative distribution function 𝐹 .

We need to understand the concept of a metric or distance 𝜌(𝑃1, 𝑃2)between 𝑃1 and 𝑃2. There are several metrics that metrize (i.e.,generate anequivalent topology for) the weak* topology on 𝒫 including

i) Levy −− denoted here by 𝜌𝐿(𝑃1, 𝑃2),

ii) Prohorov −−denoted by 𝜌𝑃𝑅(𝑃1, 𝑃2),

iii) Bounded Lipschitz −−denoted by 𝜌𝐵𝐿(𝑃1, 𝑃2),

and also some that do not metrize the weak* topology on 𝒫 including

iv) Total variation −− denoted by 𝜌𝑇𝑉 (𝑃1, 𝑃2),

v) Kolmogorov −− denoted by 𝜌𝐾(𝑃1, 𝑃2).

For relevant material, see [B, H, P].Previous developments in probability theory provide helpful results in

the pursuit of a possible complete computational methodology. One of themost important tools in probability theory is the Prohorov metric, which wewill now formally define. Let 𝑄 be a complete metric space with metric 𝑑.Given a closed subset 𝐴 of 𝑄, define the 𝜖-neighborhood of 𝐴 as

𝐴𝜖 = {𝑞 ∈ 𝑄 : 𝑑(𝑞, 𝑞) ≤ 𝜖 for some 𝑞 ∈ 𝐴}= {𝑞 ∈ 𝑄 : inf

𝑞∈𝐴𝑑(𝑞, 𝑞) ≤ 𝜖}.

We define the Prohorov metric 𝜌 : 𝒫(𝑄) × 𝒫(𝑄) → 𝑅+ by

𝜌(𝑃1, 𝑃2) ≡ inf{𝜖 > 0 : 𝑃1(𝐴) ≤ 𝑃2(𝐴𝜖) + 𝜖, 𝐴 closed ⊂ 𝑄}.

This can be shown to be a metric on 𝒫(𝑄) and has many properties including

(𝑎.) If 𝑄 is complete, then (𝒫(𝑄), 𝜌) is a complete metric space;

(𝑏.) If 𝑄 is compact, then (𝒫(𝑄), 𝜌) is a compact metric space.

Note that the definition of 𝜌 is not intuitive. For example, we do not nec-essarily know what 𝑃𝑘 → 𝑃 in 𝜌 means. We have the following importantcharacterizations [B]. Given 𝑃𝑘, 𝑃 ∈ 𝒫(𝑄), the following convergence state-ments are equivalent:

55

1. 𝜌(𝑃𝑘, 𝑃 ) → 0;

2.∫𝑄 𝑓𝑑𝑃𝑘(𝑞) → ∫

𝑄 𝑓𝑑𝑃 (𝑞) for all bounded, continuous 𝑓 : 𝑄 → 𝑅1;

3. 𝑃𝑘[𝐴] → 𝑃 [𝐴] for all Borel sets 𝐴 ⊂ 𝑄 with 𝑃 [∂𝐴] = 0.

Thus, we immediately obtain the following results:

∙ Convergence in the 𝜌 metric is equivalent to convergence in distri-bution or so-called “weak” convergence of measures encountered inprobability theory.

∙ Let 𝐶∗𝐵(𝑄) denote the topological dual of 𝐶𝐵(𝑄), where 𝐶𝐵(𝑄) is the

usual space of bounded continuous functions on 𝑄 with the supremumnorm. If we view 𝒫(𝑄) ⊂ 𝐶∗

𝐵(𝑄), convergence in the 𝜌 topology isequivalent to weak* convergence in 𝒫(𝑄). Note the misnomer “weakconvergence of measures” used by probabilists which is actually weak*convergence in the functional analysis sense defined here and in almostall standard texts.

More importantly,

𝜌(𝑃𝑘, 𝑃 ) → 0 is equivalent to

∫𝑄𝑥(𝑡𝑖; 𝑞)𝑑𝑃𝑘(𝑞) →

∫𝑄𝑥(𝑡𝑖; 𝑞)𝑑𝑃 (𝑞),

for any continuous functions 𝑞 → 𝑥(𝑡𝑖; 𝑞). Thus 𝑃𝑘 → 𝑃 in 𝜌 metric isequivalent to

ℰ [𝑥(𝑡𝑖; 𝑞)∣𝑃𝑘] → ℰ [𝑥(𝑡𝑖; 𝑞)∣𝑃 ]

or “convergence in expectation”. This yields that

𝑃 → 𝐽(𝑃 ) =

𝑛∑𝑖=1

∣ℰ [𝑥(𝑡𝑖; 𝑞)∣𝑃 ] − 𝑑𝑖∣2

is continuous in the 𝜌 topology. Continuity of 𝑃 → 𝐽(𝑃 ) allows us to assertthe existence of a solution to min 𝐽(𝑃 ) over 𝑃 ∈ 𝒫(𝑄).

These results give us the following theorem.

Theorem 15 The Prohorov metric metrizes the weak* topology of 𝒫(𝑄).

Now we may consider whether there are other (equivalent) metrics thatmetrize the weak* topology on 𝒫. But first we point out an application instatistics where the need for metrics on distributions arises.

56

Robust Statistics

Prohorov (also sometimes found as Prokhorov in translations) was interestedin finding a useful metric for the frequently used “convergence in distribu-tion” or “convergence in measure” in probability theory. Later theoreti-cal statisticians became interested in (and concerned with) the underlyingfoundations for statistical testing and in particular inference procedures [H].Specifically, attention was given to the “stability” of assumptions underwhich statistical asymptotic theories may be valid. This led to “robust-ness” (insensitivity to small deviations in underlying assumptions) or RobustStatistics. These ideas focus on “distributional robustness” as embodied ininsensitivity to “small” derivations in distributions (often from the normalor Gaussian distribution assumptions found in many formulations) and alsohow lack of this robustness might affect the validity of statistical tests. Thisled to a renewed interest in the use of distributional metrics such as that ofProhorov. As formulated by Prohorov this led to attempts to find a metricfor the “weak” topology on 𝒫 defined as the weakest topology on 𝒫 suchthat for every bounded continuous 𝜓 (𝜓 ∈ 𝒞𝐵(𝑄)) the map

𝑃 →∫𝑄𝜓𝑑𝑃

is continuous, i.e. consider 𝒫 ⊂ 𝐶∗𝐵(𝑄).

To describe the various metrics that have found use by statisticians andprobabilitists, it is desirable to recall the concept of a Polish space which isdefined as a complete, separable, metrizable topological space. We turn toa brief summary of other metrics of interest.

The Levy Metric

The Levy metric is defined in the special case for 𝑄 = 𝑅1, where one usuallydoes not distinguish between the probability measure or distribution 𝑃1 andits cumulative distribution function (cdf) 𝐹1. The metric is given by

𝜌𝐿(𝑃1, 𝑃2) = inf{𝜖∣ for all 𝑥, 𝑃1(𝑥− 𝜖) − 𝜖 ≤ 𝑃2(𝑥) ≤ 𝑃1(𝑥 + 𝜖) + 𝜖}.

One can argue that this is a symmetric function and indeed defines a metric.Remarks

∙ √2𝜌𝐿(𝑃1, 𝑃2) is maximum distance between graphs of 𝑃1, 𝑃2 measured

along 45𝑜 direction. It is a theorem that the Levy metric metrizes theweak* topology of 𝒫.

57

∙ The Prohorov metric is more difficult to visualize but is applicablewhen 𝑄 is any complete separable metric space (Polish space), notjust the real line (Levy case). For example, when 𝑄 is a function spacesuch as growth rates or mortality rates in Sinko-Streiffer (mosquito fishproblem), the Prohorov metric is applicable whereas the Levy metricis not.

We have already noted that both the Levy and Prohorov metrics metrizethe weak* topology on 𝒫, the former only when 𝑄 = 𝑅1. Moreover, asnoted above, we can argue that for 𝑄 a complete separable metric space,then (𝒫(𝑄), 𝜌𝑃𝑅) is a complete separable metric space.

Recall separability implies the existence of a subset 𝑄0 that is a countabledense subset of 𝑄. One can then obtain a countable dense subset of 𝒫 bydefining (see also Theorem 17 below)

𝒫0 = {measures with finite support in 𝑄0 with rational masses}= {𝑃 =

∑finite

𝑝𝑖𝛿𝑞𝑖 ∣𝑝𝑖 rational, {𝑞𝑖} ⊂ 𝑄0}

The Bounded Lipschitz Metric:

Assume without loss of generality that distance function 𝑑 on 𝑄 is boundedby 1. (If necessary, one can replace the metric by an equivalent metric

𝑑(𝑞1, 𝑞2) = 𝑑(𝑞1,𝑞2)1+𝑑(𝑞1,𝑞2)

.) Then define

𝜌𝐵𝐿(𝑃1, 𝑃2) = sup𝜓∈Ψ

∣∣∣∣∫𝑄𝜓𝑑𝑃1 −

∫𝑄𝜓𝑑𝑃2

∣∣∣∣where Ψ = {𝜓 ∈ 𝒞(𝑄) : ∣𝜓(𝑞1) − 𝜓(𝑞2)∣ ≤ 𝑑(𝑞1, 𝑞2)}

A fundamental result of practical importance is

Theorem 16 For all 𝑃1, 𝑃2 ∈ 𝒫(𝑄),

𝜌𝑃𝑅(𝑃1, 𝑃2)2 ≤ 𝜌𝐵𝐿(𝑃1, 𝑃2) ≤ 2𝜌𝑃𝑅(𝑃1, 𝑃2).

Thus, 𝜌𝑃𝑅 and 𝜌𝐵𝐿 define the same topology and hence 𝜌𝐵𝐿 also metrizesthe weak topology on 𝒫(𝑄).

Other metrics

Other metrics of interest can be found in the literature. These include

58

∙ Total variation:

𝜌𝑇𝑉 (𝑃1, 𝑃2) = sup𝐴∈ℬ

∣𝑃1(𝐴) − 𝑃2(𝐴)∣,

and

∙ Kolmogorov: (𝑄 = 𝑅1)

𝜌𝐾(𝑃1, 𝑃2) = sup𝑥∈𝑅′

∣𝑃1(𝑥) − 𝑃2(𝑥)∣.

These metrics do not metrize the weak topology, but do satisfy

𝜌𝐿 ≤ 𝜌𝑃𝑅 ≤ 𝜌𝑇𝑉

𝜌𝐿 ≤ 𝜌𝐾 ≤ 𝜌𝑇𝑉 .

Why is the knowledge about 𝜌𝐵𝐿 versus 𝜌𝑃𝑅 useful? We have that

Δ𝑘 ≡∣∣∣∣∫

𝐺(𝑔)𝑑𝑃𝑘(𝑔) −∫

𝐺(𝑔)𝑑𝑃 (𝑔)

∣∣∣∣=

∣∣∣∣∫

𝐺(𝑔)𝑓𝑘(𝑔)𝑑𝑔 −∫

𝐺(𝑔)𝑓(𝑔)𝑑𝑔

∣∣∣∣=

∣∣∣∣∫

𝐺(𝑔)[𝑓𝑘(𝑔) − 𝑓(𝑔)]𝑑𝑔

∣∣∣∣≤ 𝜌𝐵𝐿(𝑃𝑘, 𝑃 )

≤ 2𝜌𝑃𝑅(𝑃𝑘, 𝑃 ).

Therefore, if we know 𝑃𝑘 → 𝑃 in 𝜌𝑃𝑅, it may be useful in direct esti-mates. But Δ𝑘 → 0 is Prohorov metric convergence itself if 𝐺 ∈ 𝒞𝐵(𝑄).

59

7.3 Populations with Aggregate Data and Uncertainty

We consider approximation methods in estimation or inverse problems–quantity of interest is a probability distribution assume we have parameter(𝑞 ∈ 𝑄) dependent system with model responses 𝑥(𝑡, 𝑞) describing popula-tion of interest For data or observations, we are given a set of values {𝑦𝑙}for the expected values

ℰ [𝑥𝑙(𝑞)∣𝑃 ] =

∫𝑄𝑥𝑙(𝑞)𝑑𝑃 (𝑞)

for model 𝑥𝑙(𝑞) = 𝑥(𝑡𝑙, 𝑞) wrt unknown probability distribution 𝑃 describingdistribution of parameters 𝑞 over population

Use data to choose from a given family 𝒫(𝑄) the distribution 𝑃 ∗ thatgives best fit of underlying model to data

Formulate ordinary least squares (OLS) problem–not essential–couldequally well use a WLS, MLE, etc., approach

Seek to minimize

𝐽(𝑃 ) =∑𝑙

∣ℰ [𝑥𝑙(𝑞)∣𝑃 ] − 𝑦𝑙∣2

over 𝑃 ∈ 𝒫(𝑄)Even for simple dynamics for 𝑥𝑙 is an infinite dimensional optimiza-

tion problem–need approximations that lead to computationally tractableschemes

That is, it is useful to formulate methods to yield finite dimensional sets𝒫𝑀 (𝑄) over which to minimize 𝐽(𝑃 ) Of course, we wish to choose thesemethods so that “𝒫𝑀 (𝑄) → 𝒫(𝑄)” in some sense

In this case we shall use Prohorov metric [BBPP, BBi] of weak* conver-gence of measures to assure the desired approximation results

A general theoretical framework is given in [BBPP] with specific resultson the approximations we use here given in [BBi, BPi]. Briefly, ideas for theunderlying theory are as follows:

1. One argues continuity of 𝑃 → 𝐽(𝑃 ) on 𝒫(𝑄) with the Prohorov metric

2. If 𝑄 is compact then 𝒫(𝑄) is a complete metric space–also compact–when taken with Prohorov metric

3. Approximation families 𝒫𝑀 (𝑄) are chosen so that elements 𝑃𝑀 ∈𝒫𝑀 (𝑄) can be found to approximate elements 𝑃 ∈ 𝒫(𝑄) in Prohorovmetric

60

4. Well-posedness (existence, continuous dependence of estimates on data,etc.) obtained along with feasible computational methods

The data {𝑦𝑙} available (which, in general, will involve longitudinal ortime evolution data) determines the nature of the problem.

∙ Type I: In what we have referred to above as Type I problems one hasonly aggregate or population level longitudinal data available. This iscommon in marine, insect, etc., catch and release experiments [BK]where one samples at different times from the same population butcannot be guaranteed of observing the same set of individuals at eachsample time. This type of data is also typical in experiments wherethe organism or population member being studied is sacrificed in theprocess of making a single observation (e.g., certain physiologicallybased pharmacokinetic (PBPK) modeling [BPo, Po] and whole organ-ism transport models [BK]). In this case one may still have dynamic(i.e., time course) models for individuals, but no individual data isavailable.

∙ Type II: The second class of problems which above we called Type IIproblems involves dynamics which depend explicitly on the probabilitydistribution 𝑃 itself. In this case one only has dynamics (aggregatedynamics) for the expected value

��(𝑡) =

∫𝑄𝑥(𝑡, 𝑞)𝑑𝑃 (𝑞)

of the state variable. No dynamics are available for individual trajecto-ries 𝑥(𝑡, 𝑞) for a given 𝑞 ∈ 𝑄. The electromagnetic example of Example4 is precisely this situation. Such problems also arise in viscoelasticityas well as biology (the HIV cellular models of Banks, Bortz and Holte[BBH]) – see also [BBPP, BG1, BG2, BPi].

While the approximations we discuss below are applicable to both typesof problems, we shall illustrate the computational results in the context ofsize-structured marine populations (mosquitofish, shrimp) and PBPK prob-lems (TCE) where the inverse problems are of Type I. Finally, we notethat in the problems considered here, one can not sample directly from theprobability distribution being estimated and this again is somewhat differentfrom the usual case treated in some of the statistical literature, e.g., see[Wahba1, Wahba2] and the references cited therein.

61

Example 5: The Growth Rate Distribution Model and InverseProblem in Marine Populations

Our motivating application entails estimation of growth rate distributionsfor size-structured mosquitofish and shrimp populations. Mosquitofish areused in place of pesticides to control mosquito populations in rice fields. Ma-rine biologists desire to correctly predict growth and decline of mosquitofishpopulation in order to determine the optimal densities of mosquitofish to useto control mosquito populations. A mathematical model that accurately de-scribes the mosquitofish population would be beneficial in this application,as well as in other problems involving population dynamics and age/size-structured data.

Based on data collected from rice fields, a reasonable mathematicalmodel would have to predict two key features that are exhibited in thedata: dispersion and bifurcation (i.e., a unimodal density becomes a bi-modal density) of the population density over time [BBKW, BF, BFPZ].As mentioned above in Example 5, the Sinko-Streifer model does exhibit ei-ther of these under reasonable biological assumptions. However, the Growthrate distribution (GRD) model, developed in [BBKW] and [BF], capturesboth of these features in its solutions.

The model is a modification of the Sinko-Streifer model (used for mod-eling age/size-structured populations) which allows individuals to have dif-ferent characteristics or behaviors.

Figure 1: Mosquitofish data.

Recall that the Sinko-Streifer model (SS) for size-structured mosquitofishpopulations is given by

∂𝑣

∂𝑡+

∂

∂𝜉(𝑔𝑣) = −𝜇𝑣, 𝜉0 < 𝜉 < 𝜉1, 𝑡 > 0 (38)

𝑣(0, 𝜉) = Φ(𝜉)

𝑔(𝑡, 𝜉0)𝑣(𝑡, 𝜉0) =

∫ 𝜉1

𝜉0

𝐾(𝑡, 𝜁)𝑣(𝑡, 𝜁)∂𝜁

𝑔(𝑡, 𝜉1) = 0.

Here 𝑣(𝑡, 𝜉) represents size (given in numbers per unit length) or populationdensity, where 𝑡 represents time and 𝑥 represents length of mosquitofish–

62

growth rate of individual mosquitofish given by 𝑔(𝑡, 𝜉), where

𝑑𝜉

𝑑𝑡= 𝑔(𝑡, 𝜉) (39)

for each individual (all mosquitofish of given size have same growth rate)As we have noted, the SS model cannot be used as is to model the

mosquitofish population because it does not predict dispersion or bifurcationof the population in time under biologically reasonable assumptions [BBKW,BF]. But by modifying the SS model so that the individual growth rates ofthe mosquitofish vary across the population (instead of being the same forall individuals in the population), one obtains a model, known as the growthrate distribution (GRD) model which does in fact exhibit both dispersal intime and development of a bimodal density from a unimodal density (see[BF, BFPZ]).

As noted in Example 5, in the (GRD) model, the population density𝑢(𝑡, 𝜉;𝑃 ) is actually given by

𝑢(𝑡, 𝜉;𝑃 ) =

∫𝒢𝑣(𝑡, 𝜉; 𝑔)𝑑𝑃 (𝑔). (40)

where 𝒢 is collection of admissible growth rates, 𝑃 is probability measureon 𝒢, and 𝑣(𝑡, 𝜉; 𝑔) is solution of (SS) with 𝑔. This model assumes that thepopulation is made up of collections of subpopulations –individuals in samesubpopulation have same growth rate

As demonstrated in [BF], solutions to GRD model exhibit both disper-sion and bifurcation of the population density in time. Here we assume thatthe admissible growth rates 𝑔 have the form

𝑔(𝜉; 𝑏, 𝛾) = 𝑏(𝛾 − 𝜉)

for 𝜉0 ≤ 𝜉 ≤ 𝛾 and zero otherwise, where 𝑏 is the intrinsic growth rate of themosquitofish and 𝛾 = 𝜉1 is the maximum size. This choice based on work in[BBKW], where idea of other properties related to the growth rates varyingamong the mosquitofish is discussed.

Under assumption of varying intrinsic growth rates and maximum sizes,assume that 𝑏 and 𝛾 are random variables taking values in the compact sets𝐵 and Γ, respectively. A reasonable assumption is that both are boundedclosed intervals. Thus we take

𝐺 = {𝑔(⋅; 𝑏, 𝛾)∣𝑏 ∈ 𝐵, 𝛾 ∈ Γ}

63

so that 𝐺 is also compact in, for example, 𝐶[𝜉0, 𝜉1] where 𝜉1 = max(Γ). Then𝒫(𝐺) is compact in the Prohorov metric and we are in framework outlinedabove. In illustrative examples, one may choose a growth rate parameterizedby the intrinsic growth rate 𝑏 with 𝛾 = 1 , leading to a one parameter familyof varying growth rates 𝑔 among the individuals in the population. We alsoassume here that 𝜇 = 0 and 𝐾 = 0 in order to focus on only the distributionof growth rates; however, distributions could just as well be placed on 𝜇 and𝐾.

Next, introduce two different approaches that can be used in inverseproblem for estimation of distribution of growth rates of the mosquitofish

The first approach, which has been discussed and used in [BF] and[BFPZ], involves the use of delta distributions. We assume that probabil-ity distributions 𝒫𝑀 placed on growth rates are discrete corresponding toa collection 𝐺𝑀 with the form 𝐺𝑀 = {𝑔𝑘}𝑀𝑘=1 where 𝑔𝑘(𝑥) = 𝑏𝑘(1 − 𝜉), for𝑘 = 1, . . . ,𝑀 . Here the {𝑏𝑘} are a discretization of 𝐵. For each subpopu-lation with growth rate 𝑔𝑘, there is a corresponding probability 𝑝𝑘 that anindividual is in subpopulation 𝑘. The population density 𝑢(𝑡, 𝑥;𝑃 ) in (40)is then approximated by

𝑢(𝑡, 𝜉; {𝑝𝑘}) =∑𝑘

𝑣(𝑡, 𝜉; 𝑔𝑘)𝑝𝑘,

where 𝑣(𝑡, 𝜉; 𝑔𝑘) is the subpopulation density from (SS) with growth rate 𝑔𝑘.We denote this delta function approximation method as DEL(M), where Mis number of elements used in this approximation.

While it has been shown that DEL(M) provides a reasonable approxima-tion to (40), a better approach might involve techniques that will provide asmoother approximation of (40) in the case of continuous probability distri-butions on the growth rates. Thus, as a second approach, we chose to use anapproximation scheme based on piecewise linear splines. Here we have as-sumed that 𝑃 is a continuous probability distribution on the intrinsic growthrates. We approximate the density 𝑃 ′ = 𝑑𝑃

𝑑𝑏 = 𝑝(𝑏) using piecewise linearsplines, which leads to the following approximation for 𝑢(𝑡, 𝜉;𝑃 ) in (40):

𝑢(𝑡, 𝜉; {𝑎𝑘}) ≈∑𝑘

𝑎𝑘

∫𝐵𝑣(𝑡, 𝜉; 𝑔)𝑙𝑘(𝑏)𝑑𝑏,

where 𝑔(𝜉; 𝑏) = 𝑏(1−𝜉), 𝑝𝑘(𝑏) = 𝑎𝑘𝑙𝑘(𝑏) is the probability density for an indi-vidual in subpopulation 𝑘 and 𝑙𝑘 represents the piecewise linear spline func-tions. This spline based approximation method is denoted by SPL(M,N),where M is the number of basis elements used to approximate the growth

64

rate probability distribution and N is the number of quadrature nodes usedto approximate the integral in the formula above.

One can use the approximation methods DEL(M) and SPL(M,N) in theinverse problem for the estimation of the growth rate distributions. Theleast squares inverse problem to be solved is

min𝑃∈𝒫𝑀 (𝐺)

𝐽(𝑃 ) =∑𝑖,𝑗

∣𝑢(𝑡𝑖, 𝜉𝑗 ;𝑃 ) − ��𝑖𝑗 ∣2 (41)

=∑𝑖,𝑗

(𝑢(𝑡𝑖, 𝜉𝑗 ;𝑃 ))2 − 2𝑢(𝑡𝑖, 𝜉𝑗;𝑃 )��𝑖𝑗 + (��𝑖𝑗)2,

where {��𝑖𝑗} is the data and 𝒫𝑀 (𝐺) is the finite dimensional approximationto 𝒫(𝐺). When using DEL(M), the finite dimensional approximation 𝒫𝑀 (𝐺)to the probability measure space 𝒫(𝐺) is given by

𝒫𝑀 (𝐺) =

{𝑃 ∈ 𝒫(𝐺)∣ 𝑃 ′ =

∑𝑘

𝑝𝑘𝛿𝑔𝑘 ,∑𝑘

𝑝𝑘 = 1

},

where 𝛿𝑔𝑘 is the delta function with an atom at 𝑔𝑘. However, when usingSPL(M,N), the finite dimensional approximation 𝒫𝑀 (𝐺) is given by

𝒫𝑀 (𝐺) =

{𝑃 ∈ 𝒫(𝐺)∣ 𝑃 ′ =

∑𝑘

𝑎𝑘𝑙𝑘(𝑏),∑𝑘

𝑎𝑘

∫𝐵𝑙𝑘(𝑏)𝑑𝑏 = 1

}.

Furthermore, we note that this least squares inverse problem (41) be-comes a quadratic programming problem [BF, BFPZ]. Letting p be thevector that contains 𝑝𝑘, 1 ≤ 𝑘 ≤ 𝑀, when using DEL(M) or 𝑎𝑘, 1 ≤ 𝑘 ≤ 𝑀,when using SPL(M,N), we let A be the matrix with entries given by

𝐴𝑘𝑚 =∑𝑖,𝑗

𝑣(𝑡𝑖, 𝜉𝑗 ; 𝑔𝑘)𝑣(𝑡𝑖, 𝜉𝑗 ; 𝑔𝑚),

let b the vector with entries given by

𝑏𝑘 = −∑𝑖,𝑗

��𝑖𝑗𝑣(𝑡𝑖, 𝜉𝑗; 𝑔𝑘),

and𝑐 =

∑𝑖,𝑗

(��𝑖𝑗)2,

where 1 ≤ 𝑘,𝑚 ≤ 𝑀. In the place of (41), we now minimize

𝐹 (p) ≡ p𝑇Ap+ 2p𝑇b+ 𝑐 (42)

65

over 𝒫𝑀 (𝐺). We note when using DEL(M) we also had to include the con-straint ∑

𝑘

𝑝𝑘 = 1,

while when using SPL(M,N) we had to include the constraint

∑𝑘

𝑎𝑘

∫𝐵𝑙𝑘(𝑏)𝑑𝑏 = 1.

However, in both cases, we were able to include these constraints along withnon-negativity constraints on the {𝑝𝑘} and {𝑎𝑘} in the programming of thesetwo inverse problems.

Other Size-Structured Population Models

The Sinko-Streifer (SS) model [SS] and its variations have been widely usedto describe numerous age and size-structured populations (see [BBDS1,BBDS2, BBKW, BF, BFPZ, BA, Kot, MetzD] for example).

One recent example involves the shrimp growth models of [Shrimp, BDEHAD].Dispersion in size observed in experimental data for early growth of shrimphas been observed in data from different raceways at Shrimp MaricultureResearch Facility, Texas Agricultural Experiment Station in Corpus Christi,TX. Initial sizes were very similar but variability is observed in aggregatetype longitudinal data. A reasonable model for these populations must ac-count for variability in size distribution data which perhaps is a result ofvariability in individual growth rates across population.

In summary, the GRD model (40) represents one approach to account-ing for variability in growth rates by imposing a probability distribution onthe growth rates in the SS model (38). Individuals in the population growaccording to a deterministic growth model (39), but different individuals inthe population may have different parameter dependent growth rates in theGRD model. The population is assumed to consist of subpopulations withindividuals in the same subpopulation having the same growth rate. Thegrowth uncertainty of individuals in the population is the result of variabilityin growth rates among the subpopulations. This modeling approach, whichentails a stationary probabilistic structure on a family of deterministic dy-namic systems, may be most applicable when the growth of individuals isassumed to be the result of genetic variability.

66

7.4 Two Player Min-Max Games with Uncertainty

We consider electromagnetic evasion-interrogation games wherein the evadercan use ferroelectric material coatings to attempt to avoid detection whilethe interrogator can manipulate the interrogating frequencies (wave num-bers) and angles of incidence of the interrogating inputs to enhance detec-tion and identification. The resulting problems are formulated as two playergames in which one player wishes to minimize the reflected signal while theother wishes to maximize it. Simple deterministic strategies are easily de-feated and hence the players must introduce uncertainty to disguise theirintentions and confuse their opponent. Mathematically, the resulting gameis carried out over spaces of probability measures which in many cases areappropriately metrized using the Prohorov metric. Details of the resultsdiscussed in this section can be found in [BGIT].

In [BIKT] the authors demonstrated that it is possible to design fer-roelectric materials with appropriate dielectric permittivity and magneticpermeability to significantly attenuate reflections of electromagnetic inter-rogation signals from highly conductive targets such as airfoils and missiles.This was done under assumptions that the interrogating input signal is uni-formly likely to come from a sector of interrogating angles 𝛼 ∈ [𝛼0, 𝛼1](𝛼 = 𝜋

2 − 𝜙 where 𝜙 is the angle of incidence of the input signal) but thatthe evader has knowledge of the interrogator’s input frequency or frequencies(denoted in the presentations below as the interrogator design frequencies𝐼𝐷). These results were further sharpened and illustrated in [BIT] wherea series of different material designs were considered to minimize over agiven set of input design frequencies 𝐼𝐷 the maximum reflected field frominput signals. In addition, a second critical finding was obtained in thatit was shown that if the evader employed a simple counter interrogationdesign based on a fixed set (assumed known) of interrogating frequencies𝐼𝐷, then by a rather simple counter-counter interrogation strategy (use aninterrogating frequency little more than 10% different from the assumed de-sign frequencies), the interrogator can easily defeat the evader’s materialcoatings counter interrogation strategy to obtain strong reflected signals.

From the combined results of [BIKT, BIT] it is thus rather easily con-cluded that the evader and the interrogator must each try to confuse theother by introducing significant uncertainty in their design and interro-gating strategies, respectively. This concept, which we refer to as mixedstrategies in recognition of previous contributions to the literature on games(von Neumann’s finite mixed strategies to be explained below), leads totwo player non-cooperative games with probabilistic strategy formulations.

67

These can be mathematically formulated as two sided optimization prob-lems over spaces of probability measures, i.e., minmax games over sets ofprobability measures.

7.4.1 Problem Formulation

We consider electromagnetic interrogation of objects in the context of min-max evader-interrogator games where each player has uncertain informationabout the adversary’s capabilities. The minmax cost functional is basedon reflected fields from an object such as an airfoil or missile and can becomputed in one of several ways [BIKT, BIT].

Dielectric Coating

z

y

z=0

z=d

Ambient

Perfectly conducting half plane

φ

Figure 2: Interrogating high frequency wave impinging (angle of incidence𝜙) on coated (thickness 𝑑) perfectly conducting surface

The simplest is the reflection coefficient based on a simple planar geome-try (e.g., see Figure 7) using Fresnel’s formula for a perfectly conducting halfplane which has a coating layer of thickness 𝑑 with dielectric permittivity 𝜖and magnetic permeability 𝜇. A normally incident (𝜙 = 0) electromagneticwave with the frequency 𝑓 is assumed to impinge the half plane. Then thecorresponding wavelength 𝜆 in air is 𝜆 = 𝑐/𝑓 , where the speed of light is𝑐 = 0.3 × 109. The reflection coefficient 𝑅 for the wave is given by

𝑅 =𝑎 + 𝑏

1 + 𝑎𝑏, (43)

where

𝑎 =𝜖−√

𝜖𝜇

𝜖 +√𝜖𝜇

and 𝑏 = 𝑒4𝑖𝜋√𝜖𝜇𝑓𝑑/𝑐. (44)

68

This expression can be derived directly from Maxwell’s equation by con-sidering the ratio of reflected to incident wave for example in the case ofparallel polarized (𝑇𝐸𝑥) incident wave (see [BIKT, Jackson]).

An alternative and much more computationally intensive approach (whichmay be necessitated by some target geometries) employs the far field pat-tern for reflected waves computed directly using Maxwell’s equations (seeExample 4). In two dimensions, for a reflecting body Ω with coating layerΩ1 and computational domain Π with an interrogating plane wave 𝐸(𝑖), thescattered field 𝐸(𝑠) satisfies the Helmholtz equation [CK1, CK2]

∇ ⋅(

1

𝜇∇𝐸(𝑠)

)+ 𝜖𝜔2𝐸(𝑠) = −∇ ⋅

(1

𝜇∇𝐸(𝑖)

)− 𝜖𝜔2𝐸(𝑖) in Π ∖ Ω

𝐸(𝑠) = −𝐸(𝑖) on ∂Ω[1

𝜇

∂𝐸

∂𝑛

]= [𝐸] = 0 on ∂Ω1 ∖ ∂Ω

∂𝐸(𝑠)

∂𝑛− 𝑖𝑘𝐸(𝑠) − 𝑖

2𝑘

∂2𝐸(𝑠)

∂𝑠2= 0 on ∂Π

∂𝐸(𝑠)

∂𝑠− 𝑖𝑘

3

2𝐸(𝑠) = 0 at 𝐶,

(45)

where the Silver-Muller radiation condition has been approximated by asecond-order absorbing boundary condition on ∂Π as described in [BJR,HKNT, HRT]. The vectors 𝑛 and 𝑠 denote the normal and tangential di-rections on the boundary ∂Π, respectively, and 𝐶 is the set of the cornerpoints of Π. Here [ ⋅ ] denotes the jumps at interfaces. The incident field inair is given by

𝐸(𝑖) = 𝑒𝑖𝑘(𝑥1 cos𝛼+𝑥2 sin𝛼),

where 𝛼 is the interrogation angle (𝛼 = 𝜋2 − 𝜙) and 𝑘 = 2𝜋𝑓/𝑐 = 𝜔/𝑐 is the

interrogation wave number corresponding to the interrogating frequency 𝑓 .The corresponding far field pattern is given by [BIKT, CK1, CK2]

𝐹𝛼(𝛼 + 𝜋; 𝜖, 𝜇, 𝛼, 𝑓) =

lim𝑟→∞

(√8𝜋𝑘𝑟 𝑒−𝑖(𝑘𝑟+𝜋/4) 𝐸(𝑠)(𝑟 cos(𝛼 + 𝜋), 𝑟 sin(𝛼 + 𝜋); 𝜖, 𝜇, 𝛼, 𝑓)

),

(46)

where 𝐸(𝑠)(𝑥1, 𝑥2; 𝜖, 𝜇, 𝛼, 𝑓) is the scattered electromagnetic field. This canbe used as a measure of the reflected field intensity instead of the reflectioncoefficient 𝑅 of (43)–(44).

69

The evader and the interrogator are each subject to uncertainties as tothe actions of the other. The evader wants to choose a best coating design(i.e., best 𝜖’s and 𝜇’s ) while the interrogator wants to choose best anglesof interrogation 𝛼 and interrogating frequencies 𝑓 . Each player must actin the presence of incomplete information about the other’s action. Partialinformation regarding capabilities and tendencies of the adversary can beembodied in probability distributions for the choices to be made. That is,we may formalize this by assuming the evader may choose (with an as yetto be determined set of probabilities) dielectric permittivity and magneticpermeability parameters (𝜖, 𝜇) from admissible sets ℰ ×ℳ while the inter-rogator chooses angles of interrogation and interrogating frequencies (𝛼, 𝑓)from sets 𝒜×ℱ . The formulation here is based on the mixed strategies pro-posals of von Neumann [Aubin, VN, VNM] and the ideas can be summarizedas follows. The evader does not choose a single coating, but rather has a setof possibilities available for choice. He only chooses the probabilities withwhich he will employ the materials on a target. This, in effect, disguises hisintentions from his adversary. By choosing his coatings randomly (accordingto a best strategy to be determined in, for example, a minmax game), heprevents adversaries from discovering which coating he will use–indeed, evenhe does not know which coating will be chosen for a given target. The inter-rogator, in a similar approach, determines best probabilities for choices offrequency and angle in the interrogating signals. Note that such a formula-tion tacitly assumes that the adversarial relationship persists with multipleattempts at evasion and detection.

The associated minmax problem consists of the evader choosing distribu-tions 𝑃𝑒(𝜖, 𝜇) over ℰ×ℳ to minimize the reflected field while the interrogatorchooses distributions 𝑃𝑖(𝛼, 𝑓) over 𝒜×ℱ to maximize the reflected field. If𝐵𝑅(𝜖, 𝜇, 𝛼, 𝑓) is the chosen measure of reflected field and 𝒫𝑒 = 𝒫(ℰ ×ℳ)and 𝒫𝑖 = 𝒫(𝒜 × ℱ) are the corresponding sets of probability distributionsor measures over ℰ ×ℳ and 𝒜× ℱ , respectively, then the cost functionalfor the minmax problem can be defined by

𝐽(𝑃𝑒, 𝑃𝑖) =

∫ℰ×ℳ

∫𝒜×ℱ

∣𝐵𝑅(𝜖, 𝜇, 𝛼, 𝑓)∣2𝑑𝑃𝑒(𝜖, 𝜇)𝑑𝑃𝑖(𝛼, 𝑓). (47)

The problems thus formulated are special cases of classical static zero–sum two player non-cooperative games [Aubin, Basar] where the evader min-imizes over 𝑃𝑒 ∈ 𝒫𝑒 and the interrogator maximizes over 𝑃𝑖 ∈ 𝒫𝑖. In suchgames one defines upper and lower values for the game by

𝐽 = inf𝑃𝑒∈𝒫𝑒

sup𝑃𝑖∈𝒫𝑖

𝐽(𝑃𝑒, 𝑃𝑖)

70

and𝐽 = sup

𝑃𝑖∈𝒫𝑖

inf𝑃𝑒∈𝒫𝑒

𝐽(𝑃𝑒, 𝑃𝑖).

The first represents a security level (worst case scenario) for the evader whilethe latter is a security level for the interrogator. It is readily argued that𝐽 ≤ 𝐽 and if the equality 𝐽∗ = 𝐽 = 𝐽 holds, then 𝐽∗ is called the value ofthe game. Moreover, if there exist 𝑃 ∗

𝑒 ∈ 𝒫𝑒 and 𝑃 ∗𝑖 ∈ 𝒫𝑖 such that

𝐽∗ = 𝐽(𝑃 ∗𝑒 , 𝑃

∗𝑖 ) = min

𝑃𝑒∈𝒫𝑒

𝐽(𝑃𝑒, 𝑃∗𝑖 ) = max

𝑃𝑖∈𝒫𝑖

𝐽(𝑃 ∗𝑒 , 𝑃𝑖),

then (𝑃 ∗𝑒 , 𝑃

∗𝑖 ) is a saddle point solution or non-cooperative equilibrium of the

game.To investigate theoretical, computational and approximation issues for

these problems, it is necessary to put a topology on the space of probabilitymeasures: a natural choice for 𝒫(𝑄) is the Prohorov metric topology Usingits properties and arguments similar to those in [BBi, BG1, BG2, BK, BPi],one can develop well-posedness and approximation results for the minmaxproblems defined above. Efficient computational methods that correspondto von Neumann’s finite mixed strategies [Aubin] can readily be presentedin this context. These can be based on several approximation theories thathave been recently developed and used. The first, developed in [BBi] andbased on Dirac delta measures, is summarized in the following theorem.

Theorem 17 Let 𝑄 be a complete, separable metric space, ℬ the class of allBorel subsets of 𝑄 and 𝒫(𝑄) the space of probability measures on (𝑄,ℬ). Let𝑄0 = {𝑞𝑗}∞𝑗=1 be a countable, dense subset of 𝑄. Then the set of 𝑃 ∈ 𝒫(𝑄)such that 𝑃 has finite support in 𝑄0 and rational masses is dense in 𝒫(𝑄)in the 𝜌 metric. That is,

𝒫0(𝑄) ≡ {𝑃 ∈ 𝒫(𝑄) : 𝑃 =𝑘∑

𝑗=1

𝑝𝑗Δ𝑞𝑗 , 𝑘 ∈ 𝒩+, 𝑞𝑗 ∈ 𝑄0, 𝑝𝑗 rational,𝑘∑

𝑗=1

𝑝𝑗 = 1}

is dense in 𝒫(𝑄) taken with the 𝜌 metric, where Δ𝑞𝑗 is the Dirac measurewith atom at 𝑞𝑗.

It is rather easy to use the ideas and results associated with this theoremto develop computationally efficient schemes. Given 𝑄𝑑 =

∪∞𝑀=1 𝑄𝑀 with

𝑄𝑀 = {𝑞𝑀𝑗 }𝑀𝑗=1 (a “partition” of 𝑄) chosen so that 𝑄𝑑 is dense in 𝑄, define

𝒫𝑀 (𝑄) = {𝑃 ∈ 𝒫(𝑄) : 𝑃 =

𝑀∑𝑗=1

𝑝𝑗Δ𝑞𝑀𝑗, 𝑞𝑀𝑗 ∈ 𝑄𝑀 , 𝑝𝑗 rational,

𝑀∑𝑗=1

𝑝𝑗 = 1}.

71

Then we find

(i) 𝒫𝑀 (𝑄) is a compact subset of 𝒫(𝑄) in the 𝜌 metric,

(ii)𝒫𝑀 (𝑄) ⊂ 𝒫𝑀+1(𝑄) whenever 𝑄𝑀+1 is a refinement of 𝑄𝑀 ,

(iii)“𝒫𝑀 (𝑄) → 𝒫(𝑄)” in the 𝜌 topology; that is, for 𝑀 sufficiently large,elements in 𝒫(𝑄) can be approximated in the 𝜌 metric by elements of𝒫𝑀 .

A second class of approximations, based on linear splines, was devel-oped and used in [BPi] for problems where one assumes that the probabilitydistributions to be approximated possess densities in 𝐿2.

Theorem 18 Let ℱ be a weakly compact subset of 𝐿2(𝑄), 𝑄 compact andlet 𝒫ℱ (𝑄) ≡ {𝑃 ∈ 𝒫(𝑄) : 𝑃 ′ = 𝑝, 𝑝 ∈ ℱ}. Then 𝒫ℱ (𝑄) is compact in 𝒫(𝑄)in the 𝜌 metric. Moreover, if we define {ℓ𝑀𝑗 } to be the linear splines on 𝑄corresponding to the partition 𝑄𝑀 , where

∪𝑀 𝑄𝑀 is dense 𝑄, define

𝒫𝑀 ≡ {𝑝𝑀 : 𝑝𝑀 =∑𝑗

𝑏𝑀𝑗 ℓ𝑀𝑗 , 𝑏𝑀𝑗 rational }

and if

𝒫ℱ𝑀 ≡ {𝑃𝑀 ∈ 𝒫(𝑄) : 𝑃𝑀 =

∫𝑝𝑀 , 𝑝𝑀 ∈ 𝒫𝑀},

we have∪

𝑀 𝒫ℱ𝑀 is dense in 𝒫ℱ (𝑄) taken with the 𝜌 metric.

A study comparing the relative strengths and weaknesses of these twoclasses of approximation schemes in the context of inverse problems is givenin [BD].

7.4.2 Theoretical Results

To establish existence of a saddle point solution for the evasion-interrogationproblems formulated above, one can employ a fundamental result of vonNeumann [VN, VNM] as stated by Aubin (see [Aubin, p. 126]).

Theorem 19 (von Neumann) Suppose 𝑋0, 𝑌0 are compact, convex subsetsof metric linear spaces 𝑋,𝑌 respectively. Further suppose that

(i) for all 𝑦 ∈ 𝑌0, 𝑥 → 𝑓(𝑥, 𝑦) is convex and lower semi-continuous;

(ii) for all 𝑥 ∈ 𝑋0, 𝑦 → 𝑓(𝑥, 𝑦) is concave and upper semi-continuous.

72

Then there exists a saddle point (𝑥∗, 𝑦∗) such that

𝑓(𝑥∗, 𝑦∗) = min𝑋0

max𝑌0

𝑓(𝑥, 𝑦) = max𝑌0

min𝑋0

𝑓(𝑥, 𝑦).

A straight forward application of these results to the problems underdiscussion lead immediately to desired well-posedness results.

Theorem 20 Suppose ℰ ,ℳ,Φ,ℱ are compact and the spaces 𝑋0 = 𝒫(ℰ ×ℳ), 𝑌0 = 𝒫(Φ × ℱ) are taken with the Prohorov metric. Then 𝑋0, 𝑌0

are compact, convex subsets of 𝑋 = 𝐶∗𝐵(ℰ × ℳ) and 𝑌 = 𝐶∗

𝐵(Φ × ℱ),respectively. Moreover, there exists (𝑃 ∗

𝑒 , 𝑃∗𝑖 ) ∈ 𝒫(ℰ ×ℳ)× 𝒫(Φ ×ℱ) such

that

𝐽(𝑃 ∗𝑒 , 𝑃

∗𝑖 ) = min

𝒫(ℰ×ℳ)max

𝒫(Φ×ℱ)𝐽(𝑃𝑒, 𝑃𝑖) = max

𝒫(Φ×ℱ)min

𝒫(ℰ×ℳ)𝐽(𝑃𝑒, 𝑃𝑖).

For computations, we may use the “delta” approximations of Theorem17 and obtain

𝑑𝑃𝑒(𝜖, 𝜇) ≈∑

𝑝𝑗𝑒𝛿(𝜖𝑗 ,𝜇𝑗)(𝜖, 𝜇)𝑑𝜖𝑑𝜇,

𝑑𝑃𝑖(𝜙, 𝑓) ≈∑

𝑝𝑗𝑖𝛿(𝜙𝑗 ,𝑓𝑗)(𝜙, 𝑓)𝑑𝜙𝑑𝑓.

As noted above, a convergence theory can be found in [BBi]. This cor-responds precisely to von Neumann’s finite mixed strategies framework forprotection by disguising intentions from opponents, i.e., by introducing un-certainty in the players’ choices.

To illustrate the computational framework based on the delta measureapproximations, we take

𝑑𝑃𝑀𝑒 (𝜖, 𝜇) =

𝑀∑𝑗=1

𝑝𝑀𝑗 𝛿(𝜖𝑀𝑗 ,𝜇𝑀𝑗 )𝑑𝜖𝑑𝜇 and 𝑑𝑃𝑁

𝑖 (𝜙, 𝑓) =𝑁∑

𝑘=1

𝑞𝑁𝑘 𝛿(𝜙𝑁𝑘 ,𝑓𝑁

𝑘 )𝑑𝜙𝑑𝑓,

which can be represented respectively by

𝑝𝑀 ={𝑝𝑀𝑗}𝑀𝑗=1

∈ 𝑃𝑀 ≡ {𝑝 ∈ ℝ𝑀 ∣ 𝑝𝑗 ≥ 0,

𝑀∑𝑗=1

𝑝𝑗 = 1} (48)

and

𝑞𝑁 ={𝑞𝑁𝑘}𝑁𝑘=1

∈ 𝑄𝑁 ≡ {𝑞 ∈ ℝ𝑁 ∣ 𝑞𝑘 ≥ 0,

𝑁∑𝑘=1

𝑞𝑘 = 1}. (49)

73

We note that in this case the 𝑝𝑀𝑗 , 𝑞𝑁𝑘 are the probabilities associated with

the use of the material parameters (𝜖𝑀𝑗 , 𝜇𝑀𝑗 ) and interrogating parameters

(𝜙𝑁𝑘 , 𝑓𝑁

𝑘 ) in the mixed strategies of the evader and interrogator, respectively.Then 𝐽(𝑃𝑀

𝑒 , 𝑃𝑁𝑖 ) reduces to

𝒥 (𝑝𝑀 , 𝑞𝑁 ) =

𝑁∑𝑘=1

𝑀∑𝑗=1

𝑝𝑀𝑗∣∣𝐵𝑅(𝜖𝑀𝑗 , 𝜇𝑀

𝑗 , 𝜙𝑁𝑘 , 𝑓𝑁

𝑘 )∣∣2 𝑞𝑁𝑘

where 𝐵𝑅 is a measure of the reflected field (either the reflection coefficientor the far field scattering intensity). Since 𝑃𝑀 , 𝑄𝑁 are compact, convexsubsets of ℝ𝑀 ,ℝ𝑁 , respectively, we have the following theorem.

Theorem 21 For fixed 𝑀,𝑁 there exists (𝑝𝑀∗ , 𝑞𝑁∗ ) in 𝑃𝑀 ×𝑄𝑁 such that

𝒥 ∗ = 𝒥 (𝑝𝑀∗ , 𝑞𝑁∗ ) = min𝑝𝑀∈𝑃𝑀

max𝑞𝑁∈𝑄𝑁

𝒥 (𝑝𝑀 , 𝑞𝑁 ) = max𝑞𝑁∈𝑄𝑁

min𝑝𝑀∈𝑃𝑀

𝒥 (𝑝𝑀 , 𝑞𝑁 ).

Assume further that (𝜖, 𝜇, 𝜙, 𝑓) → 𝐵𝑅(𝜖, 𝜇, 𝜙, 𝑓) is continuous on ℰ×ℳ×Φ × ℱ which is assumed compact. Then there exists a sequence (𝑝𝑀∗ , 𝑞𝑁∗ )of elements from 𝑃𝑀 × 𝑄𝑁 respectively with corresponding (𝑃𝑀

𝑒 , 𝑃𝑁𝑖 ) in

𝒫(ℰ ×ℳ)×𝒫(Φ×ℱ) converging in the Prohorov metric to (𝑃 ∗𝑒 , 𝑃

∗𝑖 ) which

is a saddle point for the original minmax problem.

We give the arguments for these results since they are in some sense rep-resentative of the use of functional analysis in such problems. The hypothesisthat the map (𝜖, 𝜇, 𝜙, 𝑓) → 𝐵𝑅(𝜖, 𝜇, 𝜙, 𝑓) be continuous on ℰ ×ℳ× Φ ×ℱimplies continuity of the map (𝑃𝑒, 𝑃𝑖) → 𝐽(𝑃𝑒, 𝑃𝑖) on 𝑋0×𝑌0 → ℝ1

+, where𝑋0 = 𝒫(ℰ ×ℳ) and 𝑌0 = 𝒫(Φ×ℱ) are compact convex in 𝑋 = 𝐶∗

𝐵(ℰ×ℳ)and 𝑌 = 𝐶∗

𝐵(Φ×ℱ), respectively. This is due to the compactness of ℰ ×ℳand Φ ×ℱ , and properties of the Prohorov metric.

Let 𝑃𝑀 , 𝑄𝑁 be defined as in (48),(49), and observe that these are com-pact convex subsets of ℝ𝑀 ,ℝ𝑁 , respectively. Now let 𝑃𝑒 ∈ 𝑋0 and 𝑃𝑖 ∈ 𝑌0

be arbitrary. Then by Theorem 17 and (iii) above (see also [BBi]), thereexists a sequence 𝑝𝑀 , 𝑞𝑁 ∈ 𝑃𝑀 , 𝑄𝑁 with associated measures 𝑃𝑀

𝑒 , 𝑃𝑁𝑖 , re-

spectively, such that

𝑃𝑀𝑒 → 𝑃𝑒 in 𝑋0 as 𝑀 →∞ and 𝑃𝑁

𝑖 → 𝑃𝑖 in 𝑌0 as 𝑁 →∞.

Let 𝑃𝑀∗𝑒 , 𝑃𝑁∗

𝑖 (guaranteed to exist by continuity, compactness, and thevon Neumann theorem) with coordinates 𝑝𝑀∗ , 𝑞𝑁∗ , respectively, satisfy

𝐽(𝑃𝑀∗𝑒 , 𝑃𝑁

𝑖 ) ≤ 𝐽(𝑃𝑀∗𝑒 , 𝑃𝑁∗

𝑖 ) ≤ 𝐽(𝑃𝑀𝑒 , 𝑃𝑁∗

𝑖 )

74

for all (𝑃𝑀𝑒 , 𝑃𝑁

𝑖 ) in 𝑋0 × 𝑌0 with coordinate representations in 𝑃𝑀 , 𝑄𝑁 ,respectively. Now 𝑋0 × 𝑌0 compact implies that there exists a subsequence(𝑃𝑀𝑘∗

𝑒 , 𝑃𝑁𝑘∗𝑖 ) and (𝑃𝑒, 𝑃𝑖) in 𝑋0 × 𝑌0 such that (𝑃𝑀𝑘∗

𝑒 , 𝑃𝑁𝑘∗𝑖 ) → (𝑃𝑒, 𝑃𝑖) in

𝑋0 × 𝑌0.Consider (50) for indices 𝑀𝑘, 𝑁𝑘, so that

𝐽(𝑃𝑀𝑘∗𝑒 , 𝑃𝑁𝑘

𝑖 ) ≤ 𝐽(𝑃𝑀𝑘∗𝑒 , 𝑃𝑁𝑘∗

𝑖 ) ≤ 𝐽(𝑃𝑀𝑘𝑒 , 𝑃𝑁𝑘∗

𝑖 ).

Then given the continuity of 𝐽 and taking the limit we find

𝐽(𝑃𝑒, 𝑃𝑖) ≤ 𝐽(𝑃𝑒, 𝑃𝑖) ≤ 𝐽(𝑃𝑒, 𝑃𝑖).

Since 𝑃𝑒, 𝑃𝑖 are arbitrary in 𝑋0, 𝑌0 respectively, we have (𝑃𝑒, 𝑃𝑖) is a saddlepoint for the original minmax problem.

Further note that since these arguments hold for any subsequence of(𝑃𝑀∗

𝑒 , 𝑃𝑁∗𝑖 ) , then

𝐽(𝑃𝑀∗𝑒 , 𝑃𝑁

𝑖 ) ≤ 𝐽(𝑃𝑀∗𝑒 , 𝑃𝑁∗

𝑖 ) ≤ 𝐽(𝑃𝑀𝑒 , 𝑃𝑁∗

𝑖 )

and 𝑃𝑀∗𝑒 → 𝑃𝑒 as well as 𝑃𝑁∗

𝑖 → 𝑃𝑖. That is, any subsequence of 𝑃𝑀∗𝑒 , 𝑃𝑁∗

𝑖

has a convergent subsequence to 𝑃𝑒, 𝑃𝑖 so that the sequence itself mustconverge to 𝑃𝑒, 𝑃𝑖.

75

8 Weak* Convergence, the Prohorov Metric, Re-laxed Controls, Preisach Hysteresis, and Mixing

Distributions in Statistics

UNDER CONSTRUCTION!!!

8.1 Mixing Distributions

8.1.1 Type I: Individual Dynamics/Aggregate Data

Recall, from Example 5 (the size-structured population model for mosquitofishwhere there is a family 𝒢 of individual growth rates 𝑔) we have the expectedsize density 𝑢(𝑡, 𝜉;𝑃 ) = ℰ [𝑣(𝑡, 𝜉; ⋅)∣𝑃 ] described for a general probabilitydistribution 𝑃 , we have

𝑢(𝑡, 𝜉;𝑃 ) =

∫𝒢𝑣(𝑡, 𝜉; 𝑔)𝑑𝑃 (𝑔),

as the expected value of the density 𝑣(𝑡, 𝜉; 𝑔) which satisfies for a given 𝑔the Sinko-Streiffer.

Since in typical inverse problems, we must use aggregate population data𝑑𝑖𝑗 to estimate 𝑃 itself, we are therefore required to understand the qualita-tive properties (continuity, sensitivity, etc.) of 𝑢(𝑡, 𝜉;𝑃 ) as a function of 𝑃 .Thus, we see that the data for parametric estimation or inverse problems is𝑑𝑖𝑗 ∼ 𝑢(𝑡, 𝜉;𝑃 ).

8.1.2 Type II: Aggregate Dynamics

We return to the electromagnetics example of Example 4 which entails po-larization of inhomogeneous dielectric materials. In this case, individual(particle) dynamics are not available. Instead, the dynamics (13) them-selves depend on a probability measure 𝐹 , e.g.,

∂𝑢

∂𝑡+

∂2

∂𝑥2𝑢 = 𝑓(𝑢, 𝐹 ).

To describe the behavior of the electric polarization 𝑃 , we begin withthe general formulation of Chapter 2 of [BBL] by employing a polarizationkernel 𝑔 in the convolution expression

𝑃 (𝑡, 𝑧) =

∫ 𝑡

0𝑔(𝑡− 𝑠, 𝑧; 𝜏)𝐸(𝑠, 𝑧)𝑑𝑠. (50)

76

As explained in Section 2.5 and [BBL], this general formulation includes asspecial cases the well known orientational or Debye polarization model, theelectronic or Lorentz polarization model, and linear combinations thereof,as well as other higher order models. In the Debye case the kernel is givenby

𝑔(𝑡; 𝜏) = 𝜖0(𝜖𝑠 − 𝜖∞)/𝜏 𝑒−𝑡/𝜏 ,

while in the Lorentz model (again see [BBL]), it takes the form

𝑔(𝑡; 𝜏) = 𝜖0𝜔2𝑝/𝜈0 𝑒−𝑡/2𝜏 sin(𝜈0𝑡).

However, use of these kernels presupposes that the material may be suf-ficiently defined by a single relaxation parameter 𝜏 , which is generally notthe case. In order to account for multiple relaxation parameters in the po-larization mechanisms, we allow for a distribution of relaxation parameterswhich is conveniently described in terms of a probability measure 𝐹 . Thus,we define our polarization model in terms of a convolution operator

𝑃 (𝑡, 𝑧) =

∫ 𝑡

0𝐺(𝑡− 𝑠, 𝑧;𝐹 )𝐸(𝑠, 𝑧)𝑑𝑠

where 𝐺 is determined by various polarization mechanisms each describedby a different parameter 𝜏 , and therefore is given by

𝐺(𝑡, 𝑧;𝐹 ) =

∫𝒯𝑔(𝑡, 𝑧; 𝜏)𝑑𝐹 (𝜏)

where 𝒯 ⊂ [𝜏1, 𝜏2].To obtain the macroscopic polarization, we sum over all the parameters.

We cannot separate dynamics to obtain individual dynamics, and thereforewe have (13) in Section 2.5 as an example where the dynamics for the 𝐸and 𝐻 fields depend explicitly on the probability measure 𝐹 .

For inverse problems, data 𝑑𝑖𝑗 ≈ 𝑢(𝑡𝑖, 𝜉𝑗) = 𝑢(𝑡𝑖, 𝜉𝑗;𝑃 ) in Type I, whereas𝑑𝑖𝑗 ≈ 𝐸(𝑡𝑖, ��𝑗 ;𝐹 ) in Type II. Observe that 𝑃 or 𝐹 ∈ 𝒫(𝑄), which is thespace of probability measures, where 𝑄 consists of “parameters”, and 𝒫(𝑄)is a set of probability distributions over 𝑄.

For example, in ordinary least square problems, we have

𝐽(𝑃 ) =∑𝑖𝑗

∣𝑑𝑖𝑗 − 𝑢(𝑡𝑖, 𝜉𝑗;𝑃 )∣2 or 𝐽(𝐹 ) =∑𝑖𝑗

∣𝑑𝑖𝑗 −𝐸(𝑡𝑖, ��𝑗 ;𝐹 )∣2,

to be minimized over 𝑃 or 𝐹 ∈ 𝒫(𝑄) or over some subset of 𝒫(𝑄). We need(for computational and theoretical considerations) a sense of closeness of two

77

probability distributions 𝑃1 and 𝑃2. The ideas of local minima, continuity,“gradient”, etc., depend on a metric (or topology) for 𝑃1 and 𝑃2.

More fundamentally for Type II systems, the “well-posedness of sys-tems”, including existence and continuous dependence of solutions on “pa-rameters”, depends on the concept of 𝐹1 and 𝐹2 being close, i.e., 𝐹 →𝐸(𝑡, 𝑥;𝐹 ) is continuous.

8.2 Relaxed Controls

Relaxed Controls (sliding regimes, chattering controls)L.C.Young (1937,1938)–generalized curves in calculus of variations, E.J

McShane (1940,1967)–relaxed controls in variational problems, control, A.F.Filippov(1962)–sliding regimes in optimal control J.Warga, (1962,1967,..)–relaxedcurves and controls- approximation to original controls

Idea: Problems lack closure in ordinary function spaces (Re: Sobolevspaces and weak solutions, distributions of Schwartz for PDE)

Chattering controls:

Figure 3: Relaxed or generalized curves

In the limit, derivatives approach “ limits of combinations of delta func-tions”

Convexifies and generalizes problem—obtain closure, and hence exis-tence! Corresponding trajectories are called relaxed trajectories or general-ized trajectories (in which family of ordinary trajectories are embedded) Ex-istence (inf = min) of optimal controls in a wider class of functions McShane:Under not too restrictive assumptions, inf actually attained by generalizedcontrol that happens to be ordinary! Warga: Knowledge of minimizinggeneralized curve permits approximation by nearby ordinary curve!

8.3 Preisach Hysteresis

Smart materials 1) SMA, piezoelectric, magnetostrictive 2) Viscoelastic ma-terials Rubber, polymers, living tissue 3) Electromagnetics Polarization,

78

conductivityExample: Stress-strain in shape memory alloys, rubber, tissue

Figure 4: Simple ideal relay

Preisach Ideal RelayPreisach-1935-(E&M Hysteresis) Krasnoselskii&Pokrovskii-1983 Mayergoyz-

1991 Visintin-1994 Brokate&Sprekels-1996

Figure 5: Super imposed relays

These ideas are the basis of SMA control efforts in:H.T.Banks and A.J.Kurdila, Hysteretic control influence operators rep-

resenting smart material actuators: Identification and approximation, Proc.Conf. Decision and Control, Kobe, Japan, Dec, 1996, p.3711-3716. H.T.Banks,A.J.Kurdila and G.Webb, Math. Prob. in Engr.,3(1997),p.287-328. H.T.Banks,A.J.Kurdila and G.Webb, J. Intell. Mat. Sys.&Structures,8(1997),p.536-550.

Efforts on inverse problems involving estimation of measures in controlinput operator: well- posedness and approximation, computational methods–Involves generalization of Krasnoselskii&Pokrovskii operators and kernels–

79

Figure 6: Smoothed relay

Figure 7: Preisach domain

8.4 Mixing distributions

This class of problems involves a mathematical model (the system dynam-ics for the parameter dependent state x(t;q)), a statistical model (in thiscase,assumptions about the data– i.i.d. normal with constant variance lead-ing to the ordinary least squares criterion MLE) and mixing distributionsor random effects to account for variability in individuals

L.B.Sheiner,et.al. -Comp. Biomed. Res.5(1972),p.441-459. B.G.Lindsay-Ann. Statist.11(1983),p.86-94. A.Mallet-Biometrika73(1986),p.657-669. M.Davidianand A.R.Gallant-J.Pharm.&Biopharm.20(1992), p.529-556.

Texts summarizing some of the statistical literature on this includeB.G.Lindsay, Mixture Models: Theory, Geometry, and Applications,

NSF-CBMS Regional Conf. Series in Prob. and Statistics, Inst. Math.Stat, Haywood, CA, Vol. 5, 1995.

M.Davidian and D.Giltinan, Nonlinear Models for Repeated Measure-ment Data, Chapman&Hall, London, 1998.

80

References

[A] R. A. Adams. Sobolev Spaces. Academic Press, 1975.

[Aubin] J.-P. Aubin, Optima and Equilbria: An Introduction to NonlinearAnalysis, Spring-Verlag, Berlin Heidelberg, 1993.

[BJR] A. Bamberger, P. Joly and J. E. Roberts, Second-order absorbingboundary conditions for the wave equation: a solution for the cornerproblem, SIAM J. Numer. Anal., 27 (1990), 323–352.

[BBDS1] H.T. Banks, J.E. Banks, L.K. Dick, and J.D. Stark, Estimationof dynamic rate parameters in insect populations undergoing sublethalexposure to pesticides, CRSC-TR05-22, NCSU, May, 2005; Bulletin ofMathematical Biology, 69 (2007), 2139–2180.

[BBDS2] H.T. Banks, J.E. Banks, L.K. Dick, and J.D. Stark, Time-varyingvital rates in ecotoxicology: selective pesticides and aphid populationdynamics, Ecological Modelling, 210 (2008), 155–160.

[BBJ] H. T. Banks, J. E. Banks and S. L. Joyner, Estimation in time-delay modeling of insecticide-induced mortality, CRSC-TR08-15, Octo-ber, 2008; J. Inverse and Ill-posed Problems, 17 (2009), 101–125.

[BBi] H. T. Banks and K. L. Bihari, Modeling and estimating uncertaintyin parameter estimation, Inverse Problems, 17 (2001) 1–17.

[BBo] H.T. Banks and V.A. Bokil, Parameter identification for dispersivedielectrics using pulsed microwave interrogating signals and acousticwave induced reflections in two and three dimensions, CRSC-TR04-27,July, 2004; Revised version appeared as A computational and statisticalframework for multidimensional domain acoustoptic material interroga-tion, Quart. Applied Math., 63 (2005), 156–200.

[Shrimp] H.T.Banks, V.A. Bokil, S. Hu, F.C.T. Allnutt, R. Bullis, A.K.Dhar and C.L. Browdy, Shrimp biomass and viral infection for produc-tion of biological countermeasures, CRSC-TR05-45, NCSU, December,2005; Mathematical Biosciences and Engineering, 3 (2006), 635–660.

[BBH] H. T. Banks, D. M. Bortz and S. E. Holte, Incorporation of vari-ability into the mathematical modeling of viral delays in HIV infectiondynamics, Math. Biosciences, 183 (2003), 63–91.

81

[BBPP] H.T. Banks, D.M. Bortz, G.A. Pinter and L.K. Potter, Modelingand imaging techniques with potential for application in bioterrorism,CRSC-TR03-02, January, 2003; Chapter 6 in Bioterrorism: Mathe-matical Modeling Applications in Homeland Security, (H.T. Banks andC. Castillo-Chavez, eds.), Frontiers in Applied Math, FR28, SIAM,Philadelphia, PA, 2003, 129–154.

[BBKW] H.T. Banks, L.W. Botsford, F. Kappel, and C. Wang, Modelingand estimation in size structured population models, LCDS-CCS Re-port 87-13, Brown University; Proceedings 2nd Course on MathematicalEcology, (Trieste, December 8-12, 1986) World Press (1988), Singapore,521–541.

[BBL] H.T. Banks, M.W. Buksas, T. Lin. Electromagnetic Material Inter-rogation Using Conductive Interfaces and Acoustic Wavefronts. SIAMFR 21, Philadelphia, 2002.

[BBu1] H.T. Banks and J.A. Burns, An abstract framework for approx-imate solutions to optimal control problems governed by hereditarysystems, International Conference on Differential Equations (H. An-tosiewicz, Ed.), Academic Press (1975), 10–25.

[BBu2] H.T. Banks and J.A. Burns, Hereditary control problems: numericalmethods based on averaging approximations, SIAM J. Control & Opt.,16 (1978), 169–208.

[BDSS] H.T. Banks, M. Davidian, J.R. Samuels Jr., and K. L. Sutton, An in-verse problem statistical methodology summary, Center for Research inScientific Computation Report, CRSC-TR08-01, January, 2008, NCSU,Raleigh; Chapter XX in Statistical Estimation Approaches in Epidemi-ology, (edited by Gerardo Chowell, Mac Hyman, Nick Hengartner, LuisM.A Bettencourt and Carlos Castillo-Chavez), Springer, Berlin Heidel-berg New York, to appear.

[BD] H.T. Banks and J.L. Davis, A comparison of approximation methodsfor the estimation of probability distributions on parameters, CRSC-TR05-38, NCSU, October, 2005; Applied Numerical Mathematics, 57(2007), 753–777.

[BDTR] H.T. Banks and J.L. Davis, Quantifying uncertainty in the estima-tion of probability distributions with confidence bands, CRSC-TR07-21, NCSU, December, 2007; Mathematical Biosciences and Engineer-ing, 5 (2008), 647–667.

82

[GRD-FP] H.T. Banks, J.L. Davis, S.L. Ernstberger, S. Hu, E. Artimovich,A.K. Dhar and C.L. Browdy, A comparison of probabilistic and stochas-tic formulations in modeling growth uncertainty and variability, CRSC-TR08-03, NCSU, February, 2008; Journal of Biological Dynamics, 3(2009) 130–148.

[BDEHAD] H.T. Banks, J.L. Davis, S.L. Ernstberger, S. Hu, E. Artimovichand A.K. Dhar, Experimental design and estimation of growth ratedistributions in size-structured shrimp populations, CRSC-TR08-20,NCSU, November, 2008; Inverse Problems, to appear.

[GRD-FP2] H.T. Banks, J.L. Davis, and S. Hu, A computational compar-ison of alternatives to including uncertainty in structured populationmodels, CRSC-TR09-14, June, 2009; in Three Decades of Progress inSystems and Control, Springer, 2009, to appear.

[BDEK] H.T. Banks, S. Dediu, S. L. Ernstberger and F. Kappel, Gener-alized sensitivities and optimal experimental design, CRSC-TR08-12,September, 2008, Revised Nov, 2009; J Inverse and Ill-Posed Problems,submitted.

[BF] H.T. Banks and B.G. Fitzpatrick, Estimation of growth rate distri-butions in size-structured population models, CAMS Tech. Rep. 90-2,University of Southern California, January, 1990; Quarterly of AppliedMathematics, 49 (1991), 215–235.

[BFPZ] H.T. Banks, B.G. Fitzpatrick, L.K. Potter, and Y. Zhang, Es-timation of probability distributions for individual parameters usingaggregate population data, CRSC-TR98-6, NCSU, January, 1998; InStochastic Analysis, Control, Optimization and Applications, (Editedby W. McEneaney, G. Yin and Q. Zhang), Birkhauser, Boston, 1989.

[BG1] H.T. Banks and N.L. Gibson. Well-posedness in Maxwell systemswith distributions of polarization relaxation parameters. CRSC, TechReport TR04-01, North Carolina State University, January, 2004; Ap-plied Math. Letters, 18 (2005), 423–430.

[BG2] H.T. Banks and N.L. Gibson, Electromagnetic inverse problems in-volving distributions of dielectric mechanisms and parameters, CRSC-TR05-29, August, 2005; Quarterly of Applied Mathematics, 64 (2006),749–795.

83

[BGIT] H.T. Banks, S.L. Grove, K. Ito and J.A. Toivanen, Static two-playerevasion-interrogation games with uncertainty, CRSC-TR06-16, June,2006; Comp. and Applied Math, 25 (2006), 289–306.

[BI86] H.T. Banks and D.W. Iles, On compactness of admissible parame-ter sets: convergence and stability in inverse problems for distributedparameter systems, ICASE Report #86-38, NASA Langley Res. Ctr.,Hampton VA 1986; Proc. Conf. on Control Systems Governed byPDE’s, February, 1986, Gainesville, FL, Springer Lecture Notes in Con-trol & Inf. Science, 97 (1987), 130-142.

[BI] H.T. Banks and K. Ito, A unified framework for approximation andinverse problems for distributed parameter systems. Control: Theoryand Adv. Tech. 4 (1988), 73-90.

[BI2] H.T. Banks and K. Ito, Approximation in LQR problems for infinitedimensional systems with unbounded input operators, CRSC-TR94-22,N.C. State University, November, 1994; in J. Math. Systems, Estima-tion and Control, 7 (1997), 119–122.

[BIKT] H. T. Banks, K. Ito, G. M. Kepler and J. A. Toivanen, Materialsurface design to counter electromagnetic interrogation of targets, Tech.Report CRSC-TR04-37, Center for Research in Scientific Computation,N.C. State University, Raleigh, NC, November, 2004; SIAM J. Appl.Math., 66 (2006), 1027–1049.

[BIT] H. T. Banks, K. Ito and J. A. Toivanen, Determination of interrogat-ing frequencies to maximize electromagnetic backscatter from objectswith material coatings, Tech. Report CRSC-TR05-30, Center for Re-search in Scientific Computation, N.C. State University, Raleigh, NC,August, 2005; Communications in Comp. Physics, 1 (2006), 357–377.

[BKap1] H. T. Banks and F. Kappel, Spline approximations for functionaldifferential equations, J. Differential Equations, 34 (1979), 496–522.

[BKap] H.T. Banks and F. Kappel, Transformation semigroups and 𝐿1 ap-proximation for size structured population models (with ), LCDS/CCSRep. 88-20, July, 1988; Semigroup Forum, 38 (1989), 141–155.

[BKW1] H.T. Banks, F. Kappel and C. Wang, Weak solutions and differ-entiability for size structured population models (with ), CAMS Tech.Rep. 90-11, August, 1990, University of Southern California; in DPS

84

Control and Applications, Birkhauser, Intl. Ser. Num. Math., Vol.100(1991), 35–50.

[BKW2] H.T. Banks, F. Kappel and C. Wang, A semigroup formulationof nonlinear size-structured distributed rate population model, CRSC-TR94-4, March 1994; in Control and Estimation of Distributed Param-eter Systems: Nonlinear Phenomena, Birkhauser ISNM, 118 (1994),1–19.

[BKa] H.T. Banks and P. Kareiva, Parameter estimation techniques fortransport equations with application to population dispersal and tis-sue bulk flow models, LCDS Report #82-13, Brown University, July1982; J. Math. Biol., 17 (1983), 253–273.

[BK] H.T. Banks and K. Kunisch. Estimation Techniques for DistributedParameter Systems. Birkhausen, Bosten, 1989.

[BKcontrol] H.T. Banks and K. Kunisch, An approximation theory for non-linear partial differential equations with applications to identificationand control, LCDS Technical Report #81-7, Brown University, April1981; SIAM J. Control & Opt., 20 (1982), 815–849.

[BKurW1] H. T. Banks, A. J. Kurdila and G. Webb, Identification of hys-teretic control influence operators representing smart actuators, Part I:Formulation, Mathematical Problems in Engineering 3 (1997), 287–328.

[BKurW2] H. T. Banks, A. J. Kurdila and G. Webb, Identification of hys-teretic control influence operators representing smart actuators, PartII. Convergent approximations, J. of Intelligent Material Systems andStructures 8 (1997, 536–550.

[BaPed] H.T. Banks and M. Pedersen, Well-posedness of inverse problemsfor systems with time dependent parameters, CRSC-TR08-10, August,2008; Arabian Journal of Science and Engineering: Mathematics, 1(2009), 39–58.

[BPi] H.T. Banks and G.A. Pinter, A probabilistic multiscale approach tohysteresis in shear wave propagation in biotissue, CRSC-TR04-03, Jan-uary, 2004; SIAM J. Multiscale Modeling and Simulation, 3 (2005),395–412.

[BPo] H.T. Banks and L.K. Potter, Probabilistic methods for addressinguncertainty and variability in biological models: Application to a tox-

85

icokinetic model, CRSC-TR02-27, N.C. State University, September,2002; Math. Biosci., 192 (2004), 193–225.

[BRS] H.T. Banks, K.L. Rehm and K.L. Sutton, Dynamic social networkmodels incorporating stochasticity and delays, CRSC-TR09-11, May,2009; Quarterly Applied Math., to appear.

[BSW] H.T. Banks, R.C. Smith and Y. Wang, Smart Material Struc-tures: Modeling, Estimation and Control, Masson/John Wiley,Paris/Chichester, 1996.

[BSTBRSM] H.T. Banks, Karyn L. Sutton, W. Clayton Thompson, Gen-nady Bocharov, Dirk Roose, Tim Schenkel, and Andreas Meyerhans,Estimation of cell proliferation dynamics using CFSE data, CRSC-TR09-17, August, 2009; Bull. Math. Biology, submitted.

[BT] H.T. Banks and H.T. Tran, Mathematical and Experimental Modelingof Physical and Biological Processes, CRC Press, Boca Raton, FL, 2009.

[Basar] T. Basar and G. J. Olsder, Dynamic Noncooperative Game Theory,SIAM Publ., Philadelphia, PA, 2𝑛𝑑 edition, 1999.

[BA] G. Bell and E. Anderson, Cell growth and division I. A mathemati-cal model with applications to cell volume distributions in mammaliansuspension cultures, Biophysical Journal, 7 (1967), 329–351.

[B] P. Billingsley. Convergence of Probability Measures. Wiley, New York,1968.

[BR] G. Birkhoff and G.C. Rota. Ordinary Differential Equations, 1962.

[ButBer] P.L. Butzer and H. Berens, Semi-Groups of Opeerators and Ap-proximation, Springerr-Verlag, New York, 1967.

[CK1] D. Colton and R. Kress. Integral Equation Methods in ScatteringTheory. Wiley, New York, 1983.

[CK2] D. Colton and R. Kress. Inverse Acoustic and Electromagnetic Scat-tering Theory. Springer-Verlag, New York, 1992.

[CH] R. Courant and D. Hilbert. Methods of Mathematical Physics, Vol.1,p. 291.

[D] Dettman. Mathematical Methods in Physics and Engineering, p. 201.

86

[DS] N. Dunford and J.T. Schwartz. Linear Operators, Vol. I, 1966.

[E] R. E. Edwards. Functional Analysis: Theory and Applications. Holt,Rinehart and Winston. New York, 1965.

[Ev] L.C. Evans, Partial Differential Equations, Graduate Studies in Math-ematics, Vol. 19, Amer. Math. Soc., Providence, 2002.

[GP] C. Goffman and G. Pedrick, First Course in Functional Analysis,Prentice-Hall, Englewood Cliffs, NJ, 1965.

[JKH1] J. K. Hale, Functional Differential Equations, Springer-Verlag, NewYork, NY, 1971.

[JKH2] J. K. Hale, Theory of Functional Differential Equations, Springer-Verlag, New York, NY, 1977.

[JKH3] J. K. Hale and S. M. Verduyn Lunel, Introduction to FunctionalDifferential Equations, Springer-Verlag, New York, NY, 1993.

[Hal] P. R. Halmos, Measure Theory Springer-Verlag, Berlin-New York, 1974

[HKNT] E. Heikkola, Y. A. Kuznetsov, P. Neittaanmaki and J. Toiva-nen, Fictitious domain methods for the numerical solution of two-dimensional scattering problems, J. Comput. Phys., 145 (1998), 89–109.

[HRT] E. Heikkola, T. Rossi and J. Toivanen, Fast direct solution of theHelmholtz equation with a perfectly matched layer or an absorbingboundary condition, Internat. J. Numer. Methods Engr., 57 (2003),2007–2025.

[HeSt] E. Hewitt and K.R. Stromberg, Real and Abstract Analysis, Springer-Verlag, New York, 1965.

[HP] E. Hille and R. Phillips. Functional Analysis and Semigroups. AMSColloquium Publications, Vol. 31, 1957.

[Hutch] G.E. Hutchinson, Circular causal systems in ecology, Ann. N.Y.Acad. Sci., 50 (1948), 221–246.

[H] P. Huber, Robust Statistics. Wiley, New York, (1981).

[Jackson] J. D. Jackson, Classical Electrodynamics, Wiley & Sons, NewYork, 1975.

87

[K] T. Kato, Perturbations Theory for Linear Operators, 1966.

[Kot] M. Kot, Elements of Mathematical Ecology, Cambridge UniversityPress, Cambridge, 2001.

[KP] M. A. Krasnoselskii and A. V. Pokrovskii, Systems withHysteresis,Springer-Verlag, New York, NY, 1989.

[KS] C. Kravis and J.H. Seinfield, Identification of parameters in distributedparameter systems by regularization, SIAM J. Control and Optimiza-tion, 23 (1985), 217–241.

[Kr] R. Kress. Linear Integral Equations. Springer-Verlag, New York, 1989.

[Li] J.L. Lions, Optimal Control of Systems Governed by Partial DifferentialEquations, Springer-Verlag, New York, 1971.

[LiMag] J.L. Lions and E. Magenes, Non-Homogeneous Boundary ValueProblems and Applications, Vol.I-III, Springer-Verlag, Berlin, Heidel-berg, New York, 1972.

[Ma] I. D. Mayergoyz, Mathematical Models of Hysteresis, Springer-Verlag,New York, NY, 1991.

[McS] E.J. McShane, Integration, Princeton Unoversity Press, Princeton,NJ, 1944.

[MetzD] J. A. J. Metz and E.O. Diekmann, The Dynamics of Physiolog-ically Structured Populations, Lecture Notes in Biomathematics. Vol.68, Springer, Heidelberg, 1986.

[NS] A.W. Naylor and G.R. Sell, Linear Operator Theory in Engineeringand Science, Springer Verlag, New York, 1982.

[Pa] A. Pazy. Semigroups of Linear Operators and Applications to PartialDifferential Equations. Springer-Verlag, New York, 1983.

[P] Yu. V. Prohorov, Convergence of random processes and limit theoremsin probability theory, Theor. Prob. Appl., 1 (1956), 157–214.

[Po] L.K. Potter, Physiologically Based Pharmacokinetic Models for the Sys-temic Transport of Trichloroethylene, Ph.D. Thesis, N.C. State Univer-sity, August, 2001; (http://www.lib.ncsu.edu).

88

[RM] R.D. Richtmyer and K.W. Morton. Finite Differences Methods forInitial Value Problems, 2d. Edt. Intersciences Publishers, New Tork,1967.

[RN] F. Riesz and B.S. Nagy. Functional Analysis. Ungar Press, 1955.

[Ru1] W. Rudin, Principles of Mathematical Analysis, McGraw-Hill, NewYork, 1953.

[Ru2] W. Rudin, Real and Complex Analysis, McGraw-Hill, New York,1966.

[DLR] D.L. Russel, On mathematical models for the elastic beam with fre-quency proportional damping, in Control and Estimation of DistributedParameter Systems, SIAM, Philadelphia, 1992, 125–169.

[SS] J. Sinko and W. Streifer, A new model for age-size structure of a pop-ulation, Ecology, 48 (1967), 910–918.

[Sh] R. E. Showalter. Hilbert Space Methods for Partial Differential Equa-tions. Pitman, London, 1977.

[Smir] V.I. Smirnov, Integration and Functional Analysis, Vol 5 of A Courseof Higher Mathematics,Adison-Wesley, Reading, MA, 1964.

[RCS] R.C. Smith, Smart Material Systems: Model Development, Frontiersin Applied Math, FR32, SIAM, Philadelphia, 2005.

[S] I. Stakgold. Boundary Value Problems of Mathematical Physics, Vol. 1,1967.

[T] H. Tanabe. Equations of Evolution. Pitman, London, 1979.

[Vis] A. Visintin, Differential Models of Hystersis, Springer-Verlag, NewYork, NY, 1994.

[Vog] C.R. Vogel, Computational Methods for Inverse Problems, Frontiersin Applied Math, FR23, SIAM, Philadelphia, 2002.

[VN] J. von Neumann, Zur theorie der gesellschaftsspiele, Math. Ann., 100(1928), 295–320.

[VNM] J. von Neumann and O. Morgenstern, Theory of Games and Eco-nomic Behavior, Princeton University Press, Princeton, NJ, 1944.

89

[Wahba1] G. Wahba, Bayesian “confidence intervals” for the cross-validatedsmoothing spline, J. Roy. Statist. Soc. B, 45 (1983), 133–150.

[Wahba2] G. Wahba, (Smoothing) splines in nonparametric regression, Deptof Statistics-TR 1024, Univ. Wisconsin, September, 2000; In Encyclope-dia of Environmetrics, (A. El-Shaarawi and W. Piegorsch, eds.), Wiley,Vol. 4, pp. 2099–2112, 2001.

[Warga] J. Warga, Optimal Control of Differential and Functional Equa-tions, Academic Press, New York, NY, 1972.

[Webb] G. F. Webb, Functional differential equations and nonlinear semi-groups in 𝐿𝑝 spaces, J. Differentail Equations, 20, (1976), 71–89.

[W] J. Wloka. Partial Differential Equations. Cambridge University Press,Cambridge, 1987.

[Wright] E. M. Wright, A non-linear difference-differential equation, J.Reine Angew. Math., 494 (1955), 66–87.

[YY] Y.S. Yoon and W.W.-G. Yeh, Parameter identification in an inhomo-geneous medium with the finite element method, Soc. Petr. Engr. J.,16 (1976), 217–226.

[Y] K. Yosida. Functional Analysis, 1971.

90

Documents

Applied Functional Analysis Lecture Notes Fall, 2010 · Applied Functional Analysis Lecture Notes Fall, 2010 Dr. H.T.Banks CenterforResearchinScientiﬁcComputation DepartmentofMathematics