Stochastic Approximation Algorithms with Set-Valued Maps

Shalabh Bhatnagar

Computer Science and Automation

Indian Institute of Science

Bangalore

November 10, 2019

Shalabh Bhatnagar (IISc) SAA with Set-Valued Maps November 10, 2019 1 / 24


Outline

1 Optimization under noise

2 Stochastic approximation algorithms

3 Random Directions Stochastic Approximation (RDSA)

4 Gradient algorithms with set-valued maps and their analysis

5 Ongoing and future work


Example: Parameter Optimization¹

Consider a repeated experiment that gives i.i.d. input-output pairs $(X_n, Y_n)$, $n \ge 0$, in real time

Goal: Find a best parameterized fit

$$Y_n = f_w(X_n) + \epsilon_n,$$

i.e., one with the least $g(w) = \frac{1}{2} E[\|\epsilon_n\|^2]$

$f_w$ could correspond to polynomials, neural networks, splines, wavelets, etc.

Note $\nabla g(w) = -E[\langle Y_n - f_w(X_n), \nabla f_w(X_n) \rangle]$

¹ V. S. Borkar, Stochastic Approximation: A Dynamical Systems Viewpoint, Cambridge Univ. Press, 2008


Example: Parameter Optimization

Problem: Cannot find the expectation

Solution: Drop the expectation!

Gradient Scheme with Noise:

$$w_{n+1} = w_n + a(n) \langle Y_n - f_{w_n}(X_n), \nabla f_{w_n}(X_n) \rangle = w_n + a(n)(-\nabla g(w_n) + M_{n+1}),$$

where $M_{n+1}$ is the noise term

Algorithms of this type are called Stochastic Approximation Algorithms
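A minimal sketch of the noisy gradient scheme, under assumptions that are not in the slides: a scalar model $f_w(x) = w x$, inputs uniform on $[-1, 1]$, Gaussian output noise, and step sizes $a(n) = (n+1)^{-0.7}$. The function name and all parameter values are hypothetical.

```python
import random

def fit_linear(num_samples=20000, true_w=2.0, noise_std=0.1, seed=0):
    """Noisy gradient scheme w_{n+1} = w_n + a(n) (Y_n - f_w(X_n)) grad f_w(X_n)
    for the assumed model f_w(x) = w * x."""
    rng = random.Random(seed)
    w = 0.0
    for n in range(num_samples):
        x = rng.uniform(-1.0, 1.0)                   # input X_n
        y = true_w * x + rng.gauss(0.0, noise_std)   # output Y_n
        a_n = 1.0 / (n + 1) ** 0.7                   # sum a(n) = inf, sum a(n)^2 < inf
        residual = y - w * x                         # Y_n - f_{w_n}(X_n)
        grad_f = x                                   # d f_w(x) / dw for f_w(x) = w x
        w += a_n * residual * grad_f                 # expectation dropped: one sample per step
    return w
```

Calling `fit_linear()` should return a value close to the assumed true parameter 2.0, even though no expectation is ever computed.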


Stochastic Approximation²

Objective: Solve the equation $J(x) = 0$ when the analytical form of $J$ is not known but 'noisy' measurements $J(x) + M_{n+1}$ can be obtained

[Figure: the unknown map $J(\cdot)$ takes input $x$ and returns the noisy measurement $J(x) + M_{n+1}$]

The Robbins-Monro Algorithm:

$$x_{n+1} = x_n + a(n)(J(x_n) + M_{n+1}) \qquad (1)$$
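Recursion (1) can be sketched as follows. The map $J(x) = 2 - x$ (root $x^* = 2$), the Gaussian noise, and the step size $a(n) = 1/(n+1)$ are illustrative assumptions, and the function names are hypothetical.

```python
import random

def robbins_monro(noisy_J, x0, num_iters=20000, seed=0):
    """Iterate x_{n+1} = x_n + a(n) * (J(x_n) + M_{n+1}) with a(n) = 1 / (n + 1)."""
    rng = random.Random(seed)
    x = x0
    for n in range(num_iters):
        a_n = 1.0 / (n + 1)
        x += a_n * noisy_J(x, rng)   # only a noisy measurement is used
    return x

def noisy_measurement(x, rng):
    """Assumed black box: J(x) = 2 - x observed with unit-variance Gaussian noise."""
    return (2.0 - x) + rng.gauss(0.0, 1.0)
```

With this particular $J$ and step size, the iterate after $n$ steps is exactly the sample mean of $2 + M_k$, $k \le n$, which is why `robbins_monro(noisy_measurement, x0=-5.0)` ends up near 2 regardless of the starting point.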

² H. Robbins and S. Monro, Annals of Mathematical Statistics, 22:400–407, 1951


A Convergence Result³

(C1) $J : \mathbb{R}^N \to \mathbb{R}^N$ is Lipschitz continuous

(C2) $\sum_n a(n) = \infty$, $\sum_n a(n)^2 < \infty$

(C3) $M_{n+1}$, $n \ge 0$, is a martingale difference sequence w.r.t. $\{\mathcal{F}_n \triangleq \sigma(x_m, M_m, m \le n)\}$. Further, for some $K > 0$, $E[\|M_{n+1}\|^2 \mid \mathcal{F}_n] \le K(1 + \|x_n\|^2)$

(C4) $\sup_n \|x_n\| < \infty$ almost surely

Let $x^*$ be the unique globally asymptotically stable attractor for the ODE $\dot{x}(t) = J(x(t))$. Then

Theorem (Convergence of SAA): Under (C1)–(C4), $\{x_n\}$ converges almost surely to $x^*$.
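As a worked illustration of (C2), not taken from the slides: polynomial step sizes $a(n) = 1/n^{\alpha}$ satisfy both requirements exactly when $1/2 < \alpha \le 1$.

```latex
a(n) = \frac{1}{n^{\alpha}}: \qquad
\sum_{n \ge 1} \frac{1}{n^{\alpha}} = \infty \iff \alpha \le 1,
\qquad
\sum_{n \ge 1} \frac{1}{n^{2\alpha}} < \infty \iff \alpha > \tfrac{1}{2}.
```

So $a(n) = 1/n$ qualifies, while a constant step size or $a(n) = 1/\sqrt{n}$ violates the square-summability half of (C2).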

³ V. S. Borkar, Stochastic Approximation: A Dynamical Systems Viewpoint, Cambridge Univ. Press, 2008


Optimization under Noise

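Let $J : \mathbb{R}^N \to \mathbb{R}$ be a given objective function having the form $J(x) = E_\mu[h(x, \mu)]$, where $\mu$ denotes 'noise' and $E_\mu[\cdot]$ is the expectation under that noise

[Figure: $J(x)$ plotted against $x$ together with the noisy sample curves $h(x, \mu_1)$, $h(x, \mu_2)$, $h(x, \mu_3)$]

Goal: Find $x^*$ s.t. $J(x^*) = \min_{x \in \mathbb{R}^N} J(x)$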


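Gradient Estimation using RDSA⁴

Run two simulations with parameters

$$x + \delta d = \begin{pmatrix} x_1 + \delta d_1 \\ x_2 + \delta d_2 \\ \vdots \\ x_N + \delta d_N \end{pmatrix}, \qquad x - \delta d = \begin{pmatrix} x_1 - \delta d_1 \\ x_2 - \delta d_2 \\ \vdots \\ x_N - \delta d_N \end{pmatrix},$$

where $d_1, \ldots, d_N$ are independent random variables with distribution $U[-\eta, \eta]$

Gradient Estimator:

$$\hat{\nabla} J(x) = \frac{3}{\eta^2}\, d \left[\frac{J(x + \delta d) - J(x - \delta d)}{2\delta}\right]$$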
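The estimator can be sketched in a few lines. The quadratic test objective, the evaluation point, and the values of $\delta$ and $\eta$ below are assumptions chosen for illustration; for a quadratic the central difference is exact, so averaging many one-sample estimates should recover the true gradient.

```python
import random

def rdsa_gradient(J, x, delta, eta, rng):
    """One-sample estimate (3 / eta^2) * d * (J(x + delta d) - J(x - delta d)) / (2 delta)."""
    d = [rng.uniform(-eta, eta) for _ in x]              # d_i ~ U[-eta, eta], independent
    x_plus = [xi + delta * di for xi, di in zip(x, d)]
    x_minus = [xi - delta * di for xi, di in zip(x, d)]
    scale = 3.0 / eta ** 2 * (J(x_plus) - J(x_minus)) / (2.0 * delta)
    return [scale * di for di in d]

def J_quadratic(x):
    """Assumed objective x1^2 + 2 x2^2 + x1 x2, with gradient (2 x1 + x2, x1 + 4 x2)."""
    return x[0] ** 2 + 2.0 * x[1] ** 2 + x[0] * x[1]

def averaged_gradient(x, num_samples=200000, seed=0):
    """Average many independent one-sample estimates at the point x."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(num_samples):
        g = rdsa_gradient(J_quadratic, x, delta=0.1, eta=1.0, rng=rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / num_samples for t in total]
```

At $x = (1, -1)$ the true gradient of the assumed objective is $(1, -3)$, and `averaged_gradient([1.0, -1.0])` should land close to it. The factor $3/\eta^2$ compensates for $E[d\,d^T] = (\eta^2/3) I$.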

⁴ L. A. Prashanth, S. Bhatnagar, M. Fu and S. Marcus, IEEE Transactions on Automatic Control, 62(5):2223–2238, 2017


Hessian Estimator for RDSA

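Hessian Estimator:

$$\hat{\nabla}^2 J(x) = \frac{9}{2\eta^4}\, R \left(\frac{J(x + \delta d) + J(x - \delta d) - 2J(x)}{\delta^2}\right),$$

where

$$R = \begin{bmatrix} \frac{5}{2}\left(d_1^2 - \frac{\eta^2}{3}\right) & \cdots & d_1 d_N \\ d_2 d_1 & \cdots & d_2 d_N \\ \vdots & \ddots & \vdots \\ d_N d_1 & \cdots & \frac{5}{2}\left(d_N^2 - \frac{\eta^2}{3}\right) \end{bmatrix}.$$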
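A sketch of the Hessian estimator, again on an assumed quadratic objective for which the second difference equals $d^T \nabla^2 J\, d$ exactly. The objective, the point, and the values of $\delta$ and $\eta$ are illustrative choices, not part of the slides.

```python
import random

def rdsa_hessian(J, x, delta, eta, rng):
    """One-sample estimate (9 / (2 eta^4)) * R * (J(x+delta d) + J(x-delta d) - 2 J(x)) / delta^2."""
    n = len(x)
    d = [rng.uniform(-eta, eta) for _ in range(n)]
    x_plus = [xi + delta * di for xi, di in zip(x, d)]
    x_minus = [xi - delta * di for xi, di in zip(x, d)]
    second_diff = (J(x_plus) + J(x_minus) - 2.0 * J(x)) / delta ** 2
    scale = 9.0 / (2.0 * eta ** 4) * second_diff
    # R: diagonal (5/2)(d_i^2 - eta^2/3), off-diagonal d_i d_j
    return [[scale * (2.5 * (d[i] ** 2 - eta ** 2 / 3.0) if i == j else d[i] * d[j])
             for j in range(n)] for i in range(n)]

def J_quadratic(x):
    """Assumed objective x1^2 + 2 x2^2 + x1 x2, with Hessian [[2, 1], [1, 4]]."""
    return x[0] ** 2 + 2.0 * x[1] ** 2 + x[0] * x[1]

def averaged_hessian(x, num_samples=200000, seed=0):
    """Average many independent one-sample Hessian estimates at the point x."""
    rng = random.Random(seed)
    n = len(x)
    total = [[0.0] * n for _ in range(n)]
    for _ in range(num_samples):
        H = rdsa_hessian(J_quadratic, x, delta=0.1, eta=1.0, rng=rng)
        for i in range(n):
            for j in range(n):
                total[i][j] += H[i][j]
    return [[t / num_samples for t in row] for row in total]
```

The averaged estimate at any point should be close to the constant Hessian $[[2, 1], [1, 4]]$ of the assumed objective. The diagonal weight $5/2$ is what makes the diagonal entries unbiased, since $E[d_i^4] - (\eta^2/3)^2 = 4\eta^4/45$ for uniform $d_i$.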


Main Convergence Result

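RDSA Algorithm:

$$x_{n+1} = x_n - a(n)\, \Gamma(\hat{\nabla}^2 J(x_n))^{-1}\, \hat{\nabla} J(x_n),$$

where the estimators are as before except that $\delta$ is replaced with $\delta_n \downarrow 0$ and

$$\sum_n a(n) = \infty; \qquad \sum_n \left(\frac{a(n)}{\delta_n}\right)^2 < \infty \qquad (2)$$

Let $x^*$ be the unique globally asymptotically stable equilibrium of the ODE

$$\dot{x}(t) = -\Gamma(\nabla^2 J(x))^{-1}\, \nabla J(x)$$

Let $a(n) = 1/n^{\alpha}$ and $\delta_n = 1/n^{\gamma}$ with $\alpha - \gamma > 0.5$ and $\beta \triangleq \alpha - 2\gamma > 0$

Theorem: Under (C1) on $\nabla J$, (2), (C3) and (C4),

1. $x_n \to x^*$ almost surely

2. $n^{\beta/2}(x_n - x^*) \to \mathcal{N}(\mu, \Omega)$ in distribution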
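One admissible choice of exponents, given here only as a worked check of the stated conditions (the specific values are an assumption, not from the slides), is $\alpha = 1$ and $\gamma = 1/4$:

```latex
a(n) = \frac{1}{n}, \quad \delta_n = \frac{1}{n^{1/4}}: \qquad
\alpha - \gamma = \tfrac{3}{4} > \tfrac{1}{2}, \qquad
\beta = \alpha - 2\gamma = \tfrac{1}{2} > 0,
\qquad
\sum_n a(n) = \sum_n \frac{1}{n} = \infty, \qquad
\sum_n \Big(\frac{a(n)}{\delta_n}\Big)^2 = \sum_n \frac{1}{n^{3/2}} < \infty .
```

With these values the second part of the theorem gives a rate of $n^{\beta/2} = n^{1/4}$.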


Sufficient Conditions for Stability of SA⁵

(C5)

(i) $J_c(x) \triangleq J(cx)/c$, $c \ge 1$, satisfies $J_c \to J_\infty$ as $c \to \infty$, for some $J_\infty : \mathbb{R}^N \to \mathbb{R}^N$, uniformly on compacts

(ii) The origin in $\mathbb{R}^N$ is the unique globally asymptotically stable equilibrium for the ODE $\dot{x}(t) = J_\infty(x(t))$

(iii) There is a unique globally asymptotically stable equilibrium $x^* \in \mathbb{R}^N$ for the ODE $\dot{x}(t) = J(x(t))$

The Stability Theorem: Under (C1)–(C3), (C5), for any initial condition $x_0 \in \mathbb{R}^N$, $\sup_n \|x_n\| < \infty$ a.s. Further, $x_n \to x^*$ a.s.
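A simple case where (C5) can be checked by hand, offered as an illustration rather than as an example from the slides, is the affine map $J(x) = b - x$ with a hypothetical constant $b \in \mathbb{R}^N$:

```latex
J_c(x) = \frac{J(cx)}{c} = \frac{b}{c} - x
\;\xrightarrow[c \to \infty]{}\; J_\infty(x) = -x,
\qquad
\sup_x \| J_c(x) - J_\infty(x) \| = \frac{\|b\|}{c} .
```

The scaled ODE $\dot{x} = -x$ has the origin as its globally asymptotically stable equilibrium, and $\dot{x} = b - x$ has $x^* = b$, so (i) to (iii) all hold. Scaling by $c$ removes the constant term and keeps only the behavior of $J$ far from the origin.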

⁵ V. S. Borkar and S. P. Meyn, SIAM Journal on Control and Optimization, 38(2):447–469, 2000


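Analysis of Algorithms with Biases⁶ ⁷

Let $\hat{\nabla} J(x)$ denote an estimator for $\nabla J(x)$ s.t.

$$\|\hat{\nabla} J(x) - \nabla J(x)\| \le \epsilon(\delta) \to 0 \text{ as } \delta \to 0$$

[Figure: the estimate $\hat{\nabla} J(x)$ lies within distance $\epsilon$ of the true gradient $\nabla J(x)$]

Consider the recursion $x_{n+1} = x_n - a(n)(\nabla J(x_n) + \epsilon_n)$, where $\|\epsilon_n\| \le \epsilon$ $\forall n$

⁶ A. Ramaswamy and S. Bhatnagar, IEEE Transactions on Automatic Control, 63(5):1465–1471, 2018

⁷ A. Ramaswamy and S. Bhatnagar, Mathematics of Operations Research, 42(3):648–661, 2017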


Marchaud Map

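[Figure: a set-valued map $h$ from $\mathbb{R}^n$ to $\mathbb{R}^m$; the points $x_1, x_2, x_3, x$ are mapped to the sets $h(x_1), h(x_2), h(x_3), h(x)$, which contain $y_1, y_2, y_3, y$ respectively]

A set-valued map $h$ is called Marchaud if

$h(x)$ is convex and compact for each $x$

$\sup_{w \in h(x)} \|w\| \le K(1 + \|x\|)$ for each $x$, for some $K > 0$

$h$ is upper-semicontinuous, i.e., given $\{x_n\} \subset \mathbb{R}^n$ and $\{y_n\} \subset \mathbb{R}^m$ with $x_n \to x$ and $y_n \to y$ with $y_n \in h(x_n)$ $\forall n$, we have $y \in h(x)$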
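A standard one-dimensional example, added here for illustration, is the subdifferential of $|x|$ on $\mathbb{R}$:

```latex
h(x) =
\begin{cases}
\{-1\}, & x < 0, \\
[-1, 1], & x = 0, \\
\{+1\}, & x > 0 .
\end{cases}
```

Each $h(x)$ is convex and compact, the growth bound holds with $K = 1$, and any limit of points $y_n \in h(x_n)$ with $x_n \to 0$ lies in $[-1, 1] = h(0)$. Replacing $h(0)$ by the single value $\{0\}$ would break upper-semicontinuity: $x_n = 1/n \to 0$ and $y_n = 1 \to 1$, but $1 \notin \{0\}$.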


Differential Inclusion

Consider the differential inclusion (DI) in $\mathbb{R}^d$:

$$\dot{x}(t) \in H(x(t)), \qquad (3)$$

where $H : \mathbb{R}^d \to \{\text{subsets of } \mathbb{R}^d\}$ is Marchaud. Then the above DI has at least one solution $\mathbf{x}$ and each solution is absolutely continuous.⁸

⁸ J. Aubin and A. Cellina, Differential Inclusions: Set-Valued Maps and Viability Theory, Springer, 1984


Semiflow

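The set-valued semiflow $\Phi$ associated with (3) is defined on $[0, \infty) \times \mathbb{R}^d$ as

$$\Phi(t, x) = \{\mathbf{x}(t) \mid \mathbf{x} \in \Sigma,\ \mathbf{x}(0) = x\},$$

where $\Sigma$ is the set of all absolutely continuous solutions to (3).

[Figure: the set $\Phi(t, x)$ reached at time $t$ by solutions starting from the point $x$]

For $B \times L \subset [0, \infty) \times \mathbb{R}^d$, let

$$\Phi(B, L) = \bigcup_{t \in B,\, x \in L} \Phi(t, x)$$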


Invariant Set

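$M \subset \mathbb{R}^d$ is invariant if for every $x \in M$, there exists $\mathbf{x} \in \Sigma$ with $\mathbf{x}(0) = x$ s.t. $\mathbf{x}(t) \in M$ $\forall t$

[Figure: a solution through the point $x$ that remains inside the set $M$]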


Attractor of a DI

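$A \subset \mathbb{R}^d$ is attracting if it is compact and there exists a neighborhood $U$ of $A$ such that for any $\epsilon > 0$, $\exists\, T(\epsilon) \ge 0$ with

$$\Phi([T(\epsilon), \infty), U) \subset N_\epsilon(A),$$

where $N_\epsilon(A)$ denotes the $\epsilon$-neighborhood of $A$. Such a $U$ is called a fundamental neighborhood of $A$.

[Figure: the set $A$ inside its neighborhood $U$; after time $T(\epsilon)$ the semiflow carries $U$ into $N_\epsilon(A)$]

If the above $A$ is invariant, it is called an attractor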
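A one-dimensional worked example may help fix the definitions; it is an illustration chosen here, not taken from the slides. Take the Marchaud map $H(x) = -x + [-\epsilon_0, \epsilon_0]$ on $\mathbb{R}$ with a hypothetical constant $\epsilon_0 > 0$, and let $A = [-\epsilon_0, \epsilon_0]$.

```latex
\dot{x}(t) \in -x(t) + [-\epsilon_0, \epsilon_0]
\;\Longrightarrow\;
|x(t)| \le \epsilon_0 + \max\{|x(0)| - \epsilon_0,\ 0\}\, e^{-t}, \qquad t \ge 0 .

% Attracting: take U = (-r, r) with r > \epsilon_0; then
\Phi([T(\epsilon), \infty), U) \subset N_\epsilon(A)
\quad \text{with} \quad
T(\epsilon) = \max\Big\{0,\ \ln \frac{r - \epsilon_0}{\epsilon}\Big\} .

% Invariant: for x \in A the constant trajectory \mathbf{x}(t) \equiv x is a solution,
% because 0 \in -x + [-\epsilon_0, \epsilon_0] whenever |x| \le \epsilon_0 .
```

So $A$ is both attracting and invariant, hence an attractor. The same structure, a contracting drift plus a bounded set of perturbations, reappears in the map $G(x) = \nabla J(x) + B_\epsilon(0)$ used for biased gradient schemes.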


An Alternative View of Algorithm

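Recall the recursion

$$x_{n+1} = x_n - a(n)(\nabla J(x_n) + \epsilon_n),$$

where $\|\epsilon_n\| \le \epsilon$ $\forall n$

Alternatively consider

$$x_{n+1} = x_n - a(n)\, g(x_n), \qquad (4)$$

where $g(x_n) \in G(x_n)$ $\forall n$ and $G(x) \triangleq \nabla J(x) + B_\epsilon(0)$, i.e., the gradient estimate lies in an $\epsilon$-ball around the true gradient

[Figure: the set $G(x)$, a ball of radius $\epsilon$ centered at the true gradient $\nabla J(x)$]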
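Recursion (4) can be tried on a toy problem. The sketch below assumes $J(x) = x^2/2$ on $\mathbb{R}$ (so $\nabla J(x) = x$ and the minimum set is $\{0\}$), a deterministic bounded bias $\epsilon \cos(n)$ standing in for an arbitrary selection from $B_\epsilon(0)$, and step sizes $a(n) = 1/(n+2)$; all of these are illustrative choices.

```python
import math

def biased_gradient_descent(x0, eps, num_iters=10000):
    """Iterate x_{n+1} = x_n - a(n) g(x_n) with g(x_n) in grad J(x_n) + B_eps(0),
    for the assumed objective J(x) = x^2 / 2."""
    x = x0
    for n in range(num_iters):
        a_n = 1.0 / (n + 2)            # sum a(n) = inf, sum a(n)^2 < inf
        bias = eps * math.cos(n)       # any selection with |bias| <= eps would do
        g = x + bias                   # element of G(x_n) = grad J(x_n) + B_eps(0)
        x -= a_n * g
    return x
```

For this objective one can check by induction that $|x_m| \le \epsilon + (|x_0| - \epsilon)/(m+1)$ whenever $|x_0| \ge \epsilon$, so the iterates end up in an $\epsilon$-neighborhood of the minimizer rather than at the minimizer itself. For instance, `biased_gradient_descent(5.0, 0.1)` returns a value of magnitude at most about 0.1005.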


Assumptions

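(A1) $\nabla J$ is a continuous function s.t. $\|\nabla J(x)\| \le K(1 + \|x\|)$ for all $x \in \mathbb{R}^d$, for some $K > 0$

(A2) $a(n) > 0$ $\forall n$ with $\sum_n a(n) = \infty$, $\sum_n a(n)^2 < \infty$

Can show that $G$ is upper-semicontinuous

Let $G_c(x) \triangleq \{\, y/c \mid y \in G(cx) \,\}$

Let $G_\infty(x) \triangleq \mathrm{co}(\mathrm{Limsup}_{c \to \infty}\, G_c(x))$, where $\mathrm{Limsup}_{c \to \infty}\, G_c(x) \triangleq \{\, y \mid \liminf_{c \to \infty} d(y, G_c(x)) = 0 \,\}$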
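To see what the scaling does to the bias ball, consider the assumed quadratic $J(x) = \frac{1}{2}\|x\|^2$, for which $\nabla J(x) = x$. This is a worked illustration, not an example from the slides.

```latex
G(x) = x + B_\epsilon(0)
\;\Longrightarrow\;
G_c(x) = \Big\{ \tfrac{y}{c} \;\Big|\; y \in cx + B_\epsilon(0) \Big\} = x + B_{\epsilon / c}(0),

d(y, G_c(x)) = \max\{ \|y - x\| - \epsilon / c,\ 0 \} \;\xrightarrow[c \to \infty]{}\; \|y - x\|
\;\Longrightarrow\;
G_\infty(x) = \{x\} .
```

The bias ball shrinks to a point under scaling, so in this case the limiting inclusion $\dot{x}(t) \in -G_\infty(x(t))$ reduces to the ODE $\dot{x}(t) = -x(t)$, which has the origin as a globally attracting equilibrium.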


An Important Result

Lemma 1: The map $x \mapsto G_\infty(x)$ is Marchaud

Thus, $\dot{x}(t) \in -G_\infty(x(t))$ has at least one solution, and every solution is absolutely continuous

(A3) $\dot{x}(t) \in -G_\infty(x(t))$ has an attractor set $A$ such that $A \subset B_a(0)$ for some $a > 0$ and $B_a(0)$ is a fundamental neighborhood of $A$

(A4) Let $c_n \ge 1$ be an increasing sequence of integers such that $c_n \uparrow \infty$ as $n \uparrow \infty$. Let $x_n \to x$ and $y_n \to y$ as $n \uparrow \infty$, such that $y_n \in G_{c_n}(x_n)$ $\forall n$. Then $y \in G_\infty(x)$


The Stability Result

Theorem 1: Under (A1)–(A4), the iterates (4) are stable, i.e., $\sup_n \|x_n\| < \infty$ a.s.

Now recall that $G(x) = \nabla J(x) + B_\epsilon(0)$

Let the minimum set $M$ of $J$ be the global attractor of $\dot{x}(t) = -\nabla J(x(t))$

It can be shown that any compact set $K$ with $M \subset K \subset \mathbb{R}^d$ is a fundamental neighborhood of $M$

From Theorem 1, $x(t) \in K_0$ $\forall t \ge 0$ for some (possibly sample path dependent) compact set $K_0$, which then is a fundamental neighborhood of $M$


The Main Result

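Theorem 2: Given $\delta > 0$, there exists $\epsilon(\delta) > 0$ such that (4) converges to $N_\delta(M)$ provided $\epsilon \le \epsilon(\delta)/2$

[Figure: two trajectories labelled $x(t)$, with time markers $T_1, T_2, T_3, T_4$ and $t(1), t(2)$]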


Related Recent Work

A general stochastic recursion with set-valued maps and Markov noise:

$$x_{n+1} = x_n + a(n)(h(x_n, Z_n) + M_{n+1})$$

General convergence with $Z_n$ a non-ergodic, iterate-dependent Markov process [V. Yaji and SB, Stochastics, 2018]

Two-timescale stochastic recursions with Markov noise:

$$x_{n+1} = x_n + a(n)(h(x_n, y_n, Z^1_n) + M^1_{n+1})$$

$$y_{n+1} = y_n + b(n)(g(x_n, y_n, Z^2_n) + M^2_{n+1})$$

$Z^1_n, Z^2_n$ independent non-ergodic iterate-dependent Markov processes, $M^1_{n+1}, M^2_{n+1}$ independent martingale differences, $a(n) = o(b(n))$, $h, g$ point-to-point maps; analysis and application to reinforcement learning [P. Karmakar and SB, Math of OR, 2018]

Analysis under set-valued $h, g$ [V. Yaji and SB, Math of OR, 2019]
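The two-timescale structure can be illustrated with a deliberately simplified sketch that omits the Markov noise $Z^1_n, Z^2_n$ and uses point-to-point maps. The drifts $h(x, y) = 1 - y$ and $g(x, y) = 2x - y$, the Gaussian martingale noise, and the step sizes are all assumptions made for illustration.

```python
import random

def two_timescale(num_iters=20000, noise_std=0.1, seed=0):
    """Coupled recursions with a(n) = o(b(n)): y is the fast iterate, x the slow one."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    for n in range(num_iters):
        a_n = 1.0 / (n + 1)             # slow step size
        b_n = 1.0 / (n + 1) ** 0.6      # fast step size, a(n) / b(n) -> 0
        h = 1.0 - y                     # assumed slow drift h(x, y)
        g = 2.0 * x - y                 # assumed fast drift g(x, y)
        x_next = x + a_n * (h + rng.gauss(0.0, noise_std))
        y_next = y + b_n * (g + rng.gauss(0.0, noise_std))
        x, y = x_next, y_next           # both updates use the old (x_n, y_n)
    return x, y
```

Because $y$ moves on the faster timescale, it tracks the solution $y = 2x$ of $g(x, y) = 0$ for the current $x$, and the slow iterate then behaves like $\dot{x} = 1 - 2x$. Under these assumptions the pair should settle near $(0.5, 1)$.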


Ongoing and Future Work

Finding minima of non-differentiable functions under noise

Algorithms for convergence to global minima

Asynchronous update algorithms

Reinforcement learning algorithms for partially observed Markov decision processes

Analysis of deep reinforcement learning algorithms

Applications in robotics, microgrids, vehicular traffic control etc.
