Distributed Subgradient Methods for Multi-agent Optimization

Asu Ozdaglar
February 2009
Department of Electrical Engineering & Computer Science
Massachusetts Institute of Technology, USA
MIT Laboratory for Information and Decision Systems

Source: groups.csail.mit.edu/tds/seminars/s09/MIT-talk.pdf (April 13, 2009)


Motivation

• Increasing interest in distributed control and coordination of networks consisting of multiple autonomous agents
• Motivated by many emerging networking applications, such as ad hoc wireless communication networks and sensor networks, characterized by:
  – Lack of centralized control and access to information
  – Time-varying connectivity
• Control and optimization algorithms for such networks should be:
  – Completely distributed, relying on local information
  – Robust against changes in the network topology

Multi-Agent Optimization Problem

Goal: Develop a general computational model for cooperatively optimizing a global system objective through local interactions and computations in a multi-agent system

• The global objective is a combination of individual agent performance measures

Examples:
• Consensus problems: alignment of estimates maintained by different agents
  – Control of moving vehicles (UAVs), computing averages of initial values
• Parameter estimation in distributed sensor networks:
  – Regression-based estimates using local sensor measurements
• Congestion control in data networks with heterogeneous users

Related Literature

• Parallel and Distributed Optimization Algorithms:
  – General computational model for distributed asynchronous optimization
    ∗ Tsitsiklis 84, Bertsekas and Tsitsiklis 95
• Consensus and Cooperative Control:
  – Analysis of group behavior (flocking) in dynamical-biological systems
    ∗ Vicsek 95, Reynolds 87, Toner and Tu 98
  – Mathematical models of consensus and averaging
    ∗ Jadbabaie et al. 03, Olfati-Saber and Murray 04, Boyd et al. 05
• Game Theory/Mechanism Design for Distributed Cooperative Control:
  – Assign each agent a local utility function such that:
    ∗ The equilibrium of the resulting game is the same as (or close to) the global optimum
    ∗ Learning algorithms can be derived that reach the equilibrium
  – Marden, Arslan, and Shamma 07 used this approach for the consensus problem to deal with constraints

This Talk

• Development of a distributed subgradient method for multi-agent optimization [Nedic, Ozdaglar 08]
  – Convergence analysis and performance bounds for time-varying topologies under general connectivity assumptions
• Effects of local constraints [Nedic, Ozdaglar, Parrilo 08]
• Effects of networked-system constraints: quantization, delay, asynchronism

Model

• We consider a network of m agents with node set V = {1, . . . , m}
  – The agents want to cooperatively solve

        minimize_{x ∈ R^n}  Σ_{i=1}^m f_i(x)

  – The function f_i : R^n → R is a convex objective function known only by node i

[Figure: network of m agents, with node i holding the local objective f_i(x_1, . . . , x_n)]

• Agents update and send their information at discrete times t_0, t_1, t_2, . . .
• We use x_i(k) ∈ R^n to denote the estimate of agent i at time t_k

Agent Update Rule:
• Agent i locally minimizes f_i according to

        x_i(k+1) = Σ_{j=1}^m a_{ij}(k) x_j(k) − α_i(k) d_i(k),

  where the a_{ij}(k) are weights, α_i(k) is a stepsize, and d_i(k) is a subgradient of f_i at x_i(k)
• a_i(k) = (a_{i1}(k), . . . , a_{im}(k)) represents i's time-varying neighbors at slot k
• The model includes consensus as a special case (f_i(x) = 0 for all i)
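
The update rule above can be sketched concretely for scalar estimates. This is an illustrative example, not from the talk: the weights, stepsize, and objectives f_i(x) = |x − c_i| (whose sum is minimized at the median of the c_i) are all made up.

```python
# Illustrative sketch of the agent update rule; weights, stepsize, and
# objectives are made-up examples, not from the talk.
def subgradient_step(x, A, alpha, subgrad):
    """One round: mix neighbors' estimates, then step along -subgradient."""
    m = len(x)
    return [sum(A[i][j] * x[j] for j in range(m)) - alpha * subgrad(i, x[i])
            for i in range(m)]

# f_i(x) = |x - c_i| with c = (0, 1, 2); a subgradient is sign(x - c_i),
# and sum_i f_i is minimized at the median, x* = 1.
c = [0.0, 1.0, 2.0]
def subgrad(i, xi):
    return (xi > c[i]) - (xi < c[i])

A = [[0.5, 0.25, 0.25],    # doubly stochastic weights with a_ii bounded away from 0
     [0.25, 0.5, 0.25],
     [0.25, 0.25, 0.5]]
x = [5.0, -3.0, 4.0]
for _ in range(1000):
    x = subgradient_step(x, A, 0.01, subgrad)
# With a constant stepsize, all estimates end up in a small neighborhood of x* = 1.
```

With a constant stepsize the estimates do not converge exactly but settle in an O(α) neighborhood of the optimum, matching the error bound discussed later in the talk.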

Linear Dynamics and Transition Matrices

• We let A(s) denote the matrix whose ith column is the vector a_i(s), and introduce the transition matrices

    Φ(k, s) = A(s) A(s+1) · · · A(k−1) A(k)   for all k ≥ s

• We use these matrices to relate x_i(k+1) to x_j(s) at time s ≤ k:

    x_i(k+1) = Σ_{j=1}^m [Φ(k, s)]_{ij} x_j(s) − Σ_{r=s}^{k−1} ( Σ_{j=1}^m [Φ(k, r+1)]_{ij} α_j(r) d_j(r) ) − α_i(k) d_i(k).

• We analyze convergence properties of the distributed method by establishing:
  – Convergence of the transition matrices Φ(k, s) (consensus part)
  – Convergence of an approximate subgradient method (effect of optimization)
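
A quick numerical sketch of the transition matrices (the 3×3 weight matrix is made up; for simplicity it uses the row-stochastic convention with a_{ij} as entry (i, j), whereas the talk stacks the weight vectors a_i(s) as columns, which transposes the product):

```python
# Sketch: Phi(k, s) as the product A(s) A(s+1) ... A(k) of per-slot weight
# matrices (made-up example data).
def mat_mul(P, Q):
    n = len(P)
    return [[sum(P[i][l] * Q[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def phi(As, k, s):
    """Transition matrix Phi(k, s) = A(s) A(s+1) ... A(k), for k >= s."""
    P = As[s]
    for t in range(s + 1, k + 1):
        P = mat_mul(P, As[t])
    return P

A = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
As = [A] * 6                       # same weights every slot, for simplicity
P = phi(As, 5, 0)
row_sums = [sum(row) for row in P]                       # each ~ 1.0
spread = max(abs(P[0][j] - P[2][j]) for j in range(3))   # rows nearly equal
```

Products of stochastic matrices remain stochastic, and as k − s grows the rows of Φ(k, s) approach a common stochastic vector — the consensus part of the analysis that the next slides make precise.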

Assumptions

Assumption (Weights) At all times k, we have:
(a) The weights a_{ij}(k) are nonnegative for all agents i, j.
(b) There exists a scalar η ∈ (0, 1) such that for all agents i, a_{ii}(k) ≥ η, and a_{ij}(k) ≥ η for all agents j communicating with agent i at time k.
(c) The agent weight vectors a_i(k) = [a_{i1}(k), . . . , a_{im}(k)]^T are stochastic, i.e., Σ_{j=1}^m a_{ij}(k) = 1 for all i and k.

Example: Equal neighbor weights

    a_{ij}(k) = 1 / (n_i(k) + 1),

where n_i(k) is the number of agents communicating with i at time k
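
A minimal sketch of the equal-neighbor rule (the neighbor lists below are a made-up topology):

```python
def equal_neighbor_weights(neighbors, m):
    """a_ij(k) = 1 / (n_i(k) + 1): agent i splits weight equally among the
    n_i(k) agents communicating with it, plus itself."""
    A = [[0.0] * m for _ in range(m)]
    for i in range(m):
        w = 1.0 / (len(neighbors[i]) + 1)   # +1 accounts for agent i itself
        A[i][i] = w
        for j in neighbors[i]:
            A[i][j] = w
    return A

# Line graph 0 - 1 - 2 at some slot k (made-up example)
A = equal_neighbor_weights([{1}, {0, 2}, {1}], 3)
# Each row sums to 1, so the weight vectors a_i(k) are stochastic, and every
# nonzero entry is at least 1/m, satisfying the lower bound in part (b).
```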

Information Exchange

• Agent i influences any other agent infinitely often (connectivity)
• Agent j sends his information to neighboring agent i within a bounded time interval (bounded intercommunication interval)

At slot k, information exchange may be represented by a directed graph (V, E_k) with

    E_k = {(j, i) | a_{ij}(k) > 0}

Assumption (Connectivity) The graph (V, E_∞) is connected, where

    E_∞ = {(j, i) | (j, i) ∈ E_k for infinitely many indices k}.

Assumption (Bounded Intercommunication Interval) There is some B ≥ 1 such that

    (j, i) ∈ E_k ∪ E_{k+1} ∪ · · · ∪ E_{k+B−1}   for all (j, i) ∈ E_∞ and k ≥ 0.

Properties of Transition Matrices

Lemma: Let the Weights and Information Exchange assumptions hold. We then have

    [Φ(s + (m−1)B − 1, s)]_{ij} ≥ η^{(m−1)B}   for all s, i, and j,

where η is the lower bound on the weights and B is the intercommunication interval bound.

• We introduce the matrices D_k(s) as follows: for a fixed s ≥ 0,

    D_k(s) = Φ'(s + kB_0 − 1, s + (k−1)B_0)   for k = 1, 2, . . . ,

where B_0 = (m−1)B.
• By the previous lemma, all entries of D_k(s) are positive.

Convergence of Transition Matrices

Lemma: Let the Weights and Information Exchange assumptions hold. For each s ≥ 0, we have:

(a) The limit D(s) = lim_{k→∞} D_k(s) · · · D_1(s) exists.
(b) The limit D(s) has identical rows and the rows are stochastic.
(c) For every j, the entries [D_k(s) · · · D_1(s)]_{ji}, i = 1, . . . , m, converge to the same limit φ_j(s) as k → ∞ with a geometric rate:

    | [D_k(s) · · · D_1(s)]_{ji} − φ_j(s) | ≤ 2 (1 + η^{−B_0}) (1 − η^{B_0})^k,

where η is the lower bound on the weights, B is the intercommunication interval bound, and B_0 = (m−1)B.

Proof Outline

• We show that the sequence {(D_k · · · D_1) x} converges for every x ∈ R^m
• Consider the sequence {x_k} with x_k = D_k · · · D_1 x and write x_k as

    x_k = z_k + c_k e,   where c_k = min_{1 ≤ i ≤ m} [x_k]_i

• Using the property that each entry of the matrix D_k is positive, we show

    ‖z_k‖_∞ ≤ (1 − η^{B_0})^k ‖z_0‖_∞   for all k.

  Hence z_k → 0 with a geometric rate.
• We then show that the sequence {c_k} converges to some c ∈ R and use the contraction constant to establish the rate estimate
• The final relation follows by picking x = e_j, the jth unit vector

Convergence of Transition Matrices

Proposition: Let the Weights and Information Exchange assumptions hold. For each s ≥ 0, we have:

(a) The limit Φ(s) = lim_{k→∞} Φ(k, s) exists.
(b) The limit matrix Φ(s) has identical columns and the columns are stochastic, i.e.,

    Φ(s) = φ(s) e',

where φ(s) ∈ R^m is a stochastic vector.
(c) For every i, the entries [Φ(k, s)]_{ji}, j = 1, . . . , m, converge to the same limit φ_i(s) as k → ∞ with a geometric rate, i.e., for all i, j and all k ≥ s,

    | [Φ(k, s)]_{ji} − φ_i(s) | ≤ 2 · (1 + η^{−B_0}) / (1 − η^{B_0}) · (1 − η^{B_0})^{(k−s)/B_0},

where η is the lower bound on the weights, B is the intercommunication interval bound, and B_0 = (m−1)B.

• The rate estimate in part (c) was recently improved in [Nedic, Olshevsky, Ozdaglar, Tsitsiklis 08]

Convergence Analysis

• Recall the evolution of the estimates (with α_i(s) = α):

    x_i(k+1) = Σ_{j=1}^m [Φ(k, s)]_{ij} x_j(s) − α Σ_{r=s}^{k−1} ( Σ_{j=1}^m [Φ(k, r+1)]_{ij} d_j(r) ) − α d_i(k).

• Proof method: consider a "stopped process", where after time k̄ the agents stop computing subgradients but keep exchanging their estimates: d_i(k) = 0 for all k ≥ k̄ and all i.
• It can be seen that the stopped process takes the form

    x_i(k+1) = Σ_{j=1}^m [Φ(k, 0)]_{ij} x_j(0) − α Σ_{r=1}^{k̄} ( Σ_{j=1}^m [Φ(k, r)]_{ij} d_j(r−1) )

• Using lim_{k→∞} [Φ(k, s)]_{ij} = φ_j(s) for all i, we see that the limit vector lim_{k→∞} x_i(k) exists and is independent of i, but depends on the stopping time k̄:

    lim_{k→∞} x_i(k) = y(k̄)

Behavior of “Stopped Process”

• The stopped process is described, as the stopping time k advances, by

    y(k+1) = y(k) − α Σ_{j=1}^m φ_j(k) d_j(k)

  – The subgradients d_j(k) of f_j are computed at x_j(k) instead of y(k)
• This would correspond to an approximate subgradient method for minimizing Σ_j f_j(x), provided that the values φ_j(k) were the same for all j
• Assumption (Doubly Stochastic Weights) The matrices A(k) are doubly stochastic, i.e., Σ_{i=1}^m a_{ij}(k) = 1 for all j and k.
  – Can be ensured when the agents exchange their information simultaneously and coordinate the selection of the weights a_{ij}(k)
• In this case the stopped process reduces to

    y(k+1) = y(k) − (α/m) Σ_{j=1}^m d_j(k)
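
A one-line numerical check (with made-up numbers) of why double stochasticity matters here: mixing with a doubly stochastic matrix leaves the average of the estimates unchanged, so the stopped process moves only through the averaged subgradient term, with effective stepsize α/m.

```python
# Doubly stochastic mixing preserves the average of the agent estimates
# (made-up 3-agent example).
A = [[0.5, 0.25, 0.25],
     [0.25, 0.5, 0.25],
     [0.25, 0.25, 0.5]]       # rows AND columns each sum to 1
x = [5.0, -3.0, 4.0]
mixed = [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]
avg_before = sum(x) / 3
avg_after = sum(mixed) / 3    # equal to avg_before: the average is invariant
```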

Main Convergence Result

Proposition: Let
• the Weights, Doubly Stochastic Weights, and Information Exchange assumptions hold,
• the subgradients of each f_i be uniformly bounded by a constant L, and max_{1≤j≤m} ‖x_j(0)‖ ≤ αL.
Then, for the averages x̂_i(k) of the estimates x_i(0), . . . , x_i(k−1), we have

    f(x̂_i(k)) ≤ f* + m dist²(y(0), X*) / (2αk) + αL² (C/2 + 2mC_1),

where f* is the optimal value of f = Σ_i f_i, X* is the optimal set,

    y(0) = (1/m) Σ_i x_i(0),   C = 1 + 8mC_1,

    C_1 = 1 + m / (1 − (1 − η^{B_0})^{1/B_0}) · (1 + η^{−B_0}) / (1 − η^{B_0}),   B_0 = (m−1)B

• Estimates are per iteration
• Captures the tradeoff between accuracy and computational complexity

Proof Outline

• We analyze the stopped process:

    y(k+1) = y(k) − (α/m) Σ_{j=1}^m d_j(k)

• Establish an approximate convergence relation for the running averages

    ŷ(k) = (1/k) Σ_{h=0}^{k−1} y(h)

• Using the convergence rate estimate for the transition matrices Φ(k, s), we show that the ŷ(k) are close to the agents' averages x̂_i(k) = (1/k) Σ_{h=0}^{k−1} x_i(h):

    ‖ŷ(k) − x̂_i(k)‖ ≤ 2αLC_1   for all i and k

• Infer the result for the running averages x̂_i(k)

Constrained Consensus Problem

• The estimates of agent i are restricted to lie in a closed convex constraint set X_i
• We assume that the intersection set X = ∩_{i=1}^m X_i is nonempty
• Examples where constraints are important:
  – Motion planning and alignment problems, where each agent's position is limited to a certain region or range
  – Distributed constrained multi-agent optimization
• This talk: the pure consensus problem in the presence of constraints

Projected Consensus Algorithm

• For the constrained consensus problem, we develop a consensus algorithm based on projections.
• We use x_i(k) ∈ X_i to denote the estimate of agent i at time t_k
• Given a closed convex set X ⊂ R^n and a vector y ∈ R^n, we define

    dist(y, X) = min_{x ∈ X} ‖y − x‖,   P_X(y) = argmin_{x ∈ X} ‖y − x‖

Agent Update Rule:
• Agent i updates his estimate subject to his constraint set:

    x_i(k+1) = P_{X_i} [ Σ_{j=1}^m a_{ij}(k) x_j(k) ],

where a_i(k) = (a_{i1}(k), . . . , a_{im}(k))' is the weight vector

Connection to Alternating Projections

• The update rule is similar to the classical alternating projection method

Alternating/Cyclic Projection Methods:
Given closed convex sets X_1, X_2 ⊂ R^n, find a point in X = X_1 ∩ X_2:

    x(k+1) = P_{X_1}(x(k))
    x(k+2) = P_{X_2}(x(k+1))

Convergence analysis: X_i affine [Von Neumann; Aronszajn 50], X_i convex [Gubin, Polyak, Raik 67]

Constrained Consensus Algorithm:
Given closed convex sets X_1, X_2 ⊂ R^n:

    w_1(k) = Σ_j a_{1j}(k) x_j(k)
    w_2(k) = Σ_j a_{2j}(k) x_j(k)
    x_1(k+1) = P_{X_1}(w_1(k))
    x_2(k+1) = P_{X_2}(w_2(k))
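
A tiny runnable sketch of the classical alternating projection scheme, for two made-up intervals on the real line:

```python
# Alternating projections between X1 = [0, 2] and X2 = [1, 5];
# the intersection is X = [1, 2] (made-up sets for illustration).
def proj_interval(lo, hi, y):
    """Euclidean projection of y onto the closed interval [lo, hi]."""
    return min(max(y, lo), hi)

x = 10.0
for _ in range(5):
    x = proj_interval(0.0, 2.0, x)   # x(k+1) = P_X1(x(k))
    x = proj_interval(1.0, 5.0, x)   # x(k+2) = P_X2(x(k+1))
# x now lies in the intersection [1, 2].
```

The constrained consensus algorithm interleaves the same projections with a consensus-weighted averaging step, which is what the convergence analysis on the next slides handles.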

Convergence Analysis

• Analysis of the impact of constraints
• Recall the update rule

    x_i(k+1) = Σ_{j=1}^m a_{ij}(k) x_j(k) + e_i(k),

where the projection error e_i(k) is given by

    e_i(k) = P_{X_i}[ Σ_{j=1}^m a_{ij}(k) x_j(k) ] − Σ_{j=1}^m a_{ij}(k) x_j(k) = x_i(k+1) − w_i(k).

Proposition: Assume that the intersection X = ∩_{i=1}^m X_i is nonempty, and let the Doubly Stochastic Weights assumption hold. Then

    lim_{k→∞} e_i(k) = 0   for all i.

• Implication: the analysis translates into the unconstrained case
• The proof relies on two lemmas

Convergence of Projection Error

Lemma: Let w ∈ R^n and let X ⊂ R^n be closed and convex. For all x ∈ X,

    ‖w − P_X(w)‖² ≤ ‖w − x‖² − ‖P_X(w) − x‖².

Lemma: Let the Doubly Stochastic Weights assumption hold. Then:
• ‖x_i(k+1) − x‖ ≤ ‖w_i(k) − x‖ for all x ∈ X_i (nonexpansiveness of projection)
• Σ_{i=1}^m ‖w_i(k) − x‖² ≤ Σ_{i=1}^m ‖x_i(k) − x‖² for all x (by double stochasticity)
  – Of independent interest in the convergence analysis of doubly stochastic matrices

Implication:

    lim_{k→∞} Σ_{i=1}^m ( ‖w_i(k) − x‖² − ‖x_i(k+1) − x‖² ) = 0   for all x ∈ X
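
The first lemma can be checked numerically in one dimension (the interval and sample points below are made up):

```python
# Check of the projection inequality
#   ||w - P_X(w)||^2 <= ||w - x||^2 - ||P_X(w) - x||^2   for all x in X,
# with X = [0, 2] and w = 7 (made-up values).
def proj_interval(lo, hi, y):
    """Euclidean projection of y onto the closed interval [lo, hi]."""
    return min(max(y, lo), hi)

w = 7.0
p = proj_interval(0.0, 2.0, w)            # P_X(w) = 2.0
checks = [(w - p)**2 <= (w - x)**2 - (p - x)**2 + 1e-12
          for x in (0.0, 0.7, 1.3, 2.0)]  # sample points of X
```

The inequality is strictly stronger than nonexpansiveness: it quantifies how much closer the projection is to every point of X than w itself is.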

Convergence of the Estimates

• Recall that the transition matrices are defined by

    Φ(k, s) = A(s) A(s+1) · · · A(k−1) A(k)   for all s and k with k ≥ s

• Using the transition matrices, the estimate evolution decomposes as

    x_i(k+1) = Σ_{j=1}^m [Φ(k, s)]_{ij} x_j(s) + Σ_{r=s+1}^{k} ( Σ_{j=1}^m [Φ(k, r)]_{ij} e_j(r−1) ) + e_i(k).

• Use a two-time-scale analysis, where we define a similar "stopped process"

    y(k) = (1/m) Σ_{j=1}^m x_j(s) + (1/m) Σ_{r=s+1}^{k} ( Σ_{j=1}^m e_j(r−1) )

• It can be seen that the agent estimates reach a consensus:

    lim_{k→∞} ‖x_i(k) − y(k)‖ = 0   for all i

Proposition: (Convergence) Let the Weights, Doubly Stochastic Weights, and Information Exchange assumptions hold. We have, for some x̃ ∈ X,

    lim_{k→∞} ‖x_i(k) − x̃‖ = 0   for all i.

Rate Analysis

Assumption (Interior Point): There exists a vector x̄ such that

    x̄ ∈ int(X) = int( ∩_{i=1}^m X_i ),

i.e., there exists some scalar δ > 0 such that {z | ‖z − x̄‖ ≤ δ} ⊂ X.

Proposition: Let the Interior Point assumption hold, and let the weight vectors a_i(k) be given by a_i(k) = (1/m, . . . , 1/m)' for all i and k. Then

    Σ_{i=1}^m ‖x_i(k) − x̃‖² ≤ (1 − 1/(4R²))^k Σ_{i=1}^m ‖x_i(0) − x̃‖²   for all k ≥ 0,

where x̃ ∈ X is the limit of the sequences {x_i(k)} and R = (1/δ) Σ_{i=1}^m ‖x_i(0) − x̄‖, i.e., the convergence rate is linear.

• The linear convergence rate extends to time-varying weights with X_i = X
• The convergence rate for time-varying weights and different local constraints is open

Summary

• We presented a distributed subgradient method for multi-agent optimization
• The method can operate over networks with time-varying connectivity
• We proposed a constrained consensus policy for the case when agents have local constraints on their estimates
• This policy has connections to the classical alternating projection method
• We analyzed the convergence and rate of convergence of the algorithms

Putting Things Together: Constrained Distributed Optimization

• Each agent has a closed convex local constraint set X_i [Nedic, Ozdaglar, Parrilo 08]
• Agent i updates his estimate by

    v_i(k) = Σ_{j=1}^m a_{ij}(k) x_j(k)
    x_i(k+1) = P_{X_i}[ v_i(k) − α(k) d_i(k) ],

where α(k) > 0 is a diminishing stepsize sequence and d_i(k) is a subgradient of f_i at x = v_i(k).

Results:
• Agent estimates generated by this algorithm converge to the same optimal solution when the weights are constant and equal, and when the weights are time-varying but X_i = X for all i.
• Convergence analysis in the general case is open!
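
A runnable sketch of this projected update for scalar estimates, with identical interval constraint sets X_i = X (one of the cases covered by the result) and objectives f_i(x) = |x − c_i|; all of the data is made up for illustration.

```python
# Projected distributed subgradient step with diminishing stepsize alpha(k) = 1/k.
# Weights, sets, and objectives are made-up examples.
def proj_interval(lo, hi, y):
    """Euclidean projection of y onto [lo, hi]."""
    return min(max(y, lo), hi)

A = [[0.5, 0.25, 0.25],
     [0.25, 0.5, 0.25],
     [0.25, 0.25, 0.5]]           # constant doubly stochastic weights
X = (-10.0, 10.0)                  # identical local sets X_i = X
c = [0.0, 1.0, 2.0]                # f_i(x) = |x - c_i|; sum minimized at x* = 1
x = [9.0, -9.0, 5.0]
for k in range(1, 2000):
    alpha = 1.0 / k                # diminishing stepsize sequence
    v = [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]
    d = [(v[i] > c[i]) - (v[i] < c[i]) for i in range(3)]   # subgradient at v_i
    x = [proj_interval(X[0], X[1], v[i] - alpha * d[i]) for i in range(3)]
# All estimates approach the common optimal solution x* = 1.
```

Unlike the constant-stepsize method earlier in the talk, the diminishing stepsize drives the estimates all the way to the optimum rather than to a neighborhood of it.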

Optimization over Random Networks

• Existing work focuses on deterministic models of network connectivity (i.e., worst-case assumptions about intercommunication intervals)
• Time-varying connectivity modeled probabilistically [Lobel, Ozdaglar 08]
• Each component l of the agent estimates evolves according to

    x^l(k+1) = A(k) x^l(k) − α(k) d^l(k)

• We assume that the matrix A(k) is a random matrix drawn independently over time from a probability space of stochastic matrices.
• This allows the edges at any time k to be correlated.

Results:
• We establish properties of random transition matrices by constructing positive-probability events in which information propagates from every node to every other node in the network
• We provide a convergence analysis of a stochastic subgradient method under different stepsize rules

Limits on Communication

Quantization Effects:
• Agents have access to quantized estimates due to storage and communication bandwidth constraints [Nedic, Olshevsky, Ozdaglar, Tsitsiklis 07]
• Agent i updates his estimate by

    x_i(k+1) = Σ_{j=1}^m a_{ij}(k) x^Q_j(k) − α d_i(k)
    x^Q_i(k+1) = ⌊x_i(k+1)⌋,

where ⌊·⌋ represents rounding down to the nearest integer multiple of 1/Q

Delay Effects:
• Agents have access to outdated estimates due to communication delays [Bliman, Nedic, Ozdaglar 08]
• In the presence of delays, agent i updates his estimate by

    x_i(k+1) = Σ_{j=1}^m a_{ij}(k) x_j(k − t_{ij}(k)) − α d_i(k),

where t_{ij}(k) is the delay in passing information from j to i
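
The rounding operation in the quantized update can be sketched as follows (the Q values are made-up examples):

```python
import math

def quantize_down(y, Q):
    """Round y down to the nearest integer multiple of 1/Q."""
    return math.floor(y * Q) / Q

# With Q = 10, estimates live on the grid ..., 1.1, 1.2, 1.3, ...
q1 = quantize_down(1.234, 10)    # 1.2
q2 = quantize_down(-0.15, 4)     # -0.25 (rounds down, not toward zero)
```

Larger Q means a finer grid, so the quantization error per update is at most 1/Q; the cited analysis quantifies how this error accumulates in the averaging dynamics.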

Applications to Social Networks

• Growing interest in dynamics in a social network of communicating agents
• A specific example is learning and information aggregation over networks
• Consensus policies can be used to develop and analyze myopic/quasi-myopic learning models [Golub, Jackson 07], [Acemoglu, Ozdaglar, ParandehGheibi 08]
• In the context of social networks, these rules may be too myopic:
  – Alternative approach: Bayesian learning over social networks [Acemoglu, Dahleh, Lobel, Ozdaglar 08]

References

• A. Nedic and A. Ozdaglar, "Distributed Subgradient Methods for Multi-agent Optimization," IEEE Transactions on Automatic Control, 2008.
• A. Ozdaglar, "Constrained Consensus and Alternating Projections," Proc. of Allerton Conference, 2007.
• A. Nedic, A. Olshevsky, A. Ozdaglar, and J. N. Tsitsiklis, "On Distributed Averaging Algorithms and Quantization Effects," IEEE Transactions on Automatic Control, 2008.
• A. Nedic, A. Ozdaglar, and P. A. Parrilo, "Constrained Consensus and Optimization in Multi-agent Networks," submitted for publication, 2008.
• I. Lobel and A. Ozdaglar, "Distributed Subgradient Methods over Random Networks," Proc. of Allerton Conference, 2008.
• D. Acemoglu, M. Dahleh, I. Lobel, and A. Ozdaglar, "Bayesian Learning in Social Networks," submitted for publication, 2008.
• D. Acemoglu, A. Ozdaglar, and A. ParandehGheibi, "Spread of (Mis)Information over Social Networks," working paper, 2008.

All papers can be downloaded from web.mit.edu/asuman/www