
Cooperative Control of Multi-Agent Systems

Hideaki Ishii

Dept. Computational Intelligence & Systems Science

[email protected]

Advanced Topics in Mathematical Information Sciences II

Jul 17th, 2015

Part 2: Application of consensus

Distributed computation of PageRank for search engines

PageRank Algorithm at Google

Search Engine at Google

Proposed by the co-founders S. Brin and L. Page

Quantifies importance/popularity of each web page

Popular pages are ranked higher in search results

One of the 200 signals used at Google

Brin & Page (1998), Langville & Meyer (2006)

PageRank in Various Areas

A paradigmatic problem for ranking objects in IT, Bibliometrics, Biology, E-Commerce, …

Scientific Journals: Eigenfactor

cf. Impact Factor: Based on # of citations only (Not enough!)

Ideas similar to PageRank can be traced back to the 70s

Proteins in systems biology

Companies related by business interactions

…

Differences in search results

Keyword: “Department of control engineering” (in Japanese)

Tokyo Tech: Only in the 16th place!

What makes the difference?

PageRank algorithm at Google

Quantifies the importance of each website

Uses this information for ordering the search results

Google Toolbar

How is PageRank determined?

Basic idea (Brin & Page, 1999):

More incoming links, especially those from important pages, make a page important.

Determined by the link structure of the web

Figure: Google Toolbar PageRank values of Tokyo Tech pages: top page of Tokyo Tech 7; Cont. Sys. Eng., Mech. Eng. Sci., and Comp. Intel. Sys. Sci. 5 each

PageRank: Computational aspects

Computed centrally within Google

Web data: Automatically collected by crawlers

Over 8 billion webpage indices

Computed once a month, takes about a week

Recent research on more efficient computing

Distributed randomized approach

Motivation 1: Multi-agent consensus

View each webpage as an agent capable of computing

Motivation 2: Probabilistic algorithms for systems and control problems

Allow the agents to behave asynchronously

Ishii & Tempo, IEEE Control Systems Magazine, 2014

PageRank problem

Web consisting of n pages

Modeled as a directed graph: each page i is a node, each link is an edge

Page without any link (e.g., PDF file): Add an artificial link

$x_i \in [0,1]$: PageRank value of page i

Larger value implies more important

Determined by the link structure only

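PageRank definition

Example: A web of 4 pages

Figure: directed graph on pages 1 to 4 (page 1 links to page 2; pages 2 and 3 have two outgoing links each; page 4 has three)

$$x_1 = \frac{1}{3} x_4$$

Here $x_1$ is the PageRank value of page 1, and 3 is the # of links of page 4

$$x_4 = \frac{1}{2} x_2 + \frac{1}{2} x_3$$

Here $x_4$ is the PageRank value of page 4, and 2 is the # of links of page 2 (and likewise of page 3)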
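$$
\begin{aligned}
x_1 &= \tfrac{1}{3} x_4 \\
x_2 &= x_1 + \tfrac{1}{2} x_3 + \tfrac{1}{3} x_4 \\
x_3 &= \tfrac{1}{2} x_2 + \tfrac{1}{3} x_4 \\
x_4 &= \tfrac{1}{2} x_2 + \tfrac{1}{2} x_3
\end{aligned}
$$

Normalization: $x_1 + x_2 + x_3 + x_4 = 1$

Ordering: pages are ranked in decreasing order of $x_i$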

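In the vector form

$$
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}
=
\begin{bmatrix}
0 & 0 & 0 & 1/3 \\
1 & 0 & 1/2 & 1/3 \\
0 & 1/2 & 0 & 1/3 \\
0 & 1/2 & 1/2 & 0
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}
$$

(Column) stochastic matrix: Sum of elements in each column = 1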
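The 4-page example can be checked numerically. The following is a minimal sketch in Python with NumPy (the helper name `pagerank_by_linear_solve` is hypothetical, not from the slides): it stacks the equations x = Ax with the normalization constraint and solves them as one linear system.

```python
import numpy as np

# Link matrix of the 4-page example: entry (i, j) equals
# 1 / (# of links of page j) if page j links to page i, and 0 otherwise.
A = np.array([
    [0.0, 0.0, 0.0, 1 / 3],
    [1.0, 0.0, 0.5, 1 / 3],
    [0.0, 0.5, 0.0, 1 / 3],
    [0.0, 0.5, 0.5, 0.0],
])


def pagerank_by_linear_solve(A):
    """Solve x = A x together with sum(x) = 1 in the least-squares sense."""
    n = A.shape[0]
    # Stack (A - I) x = 0 on top of the normalization row 1^T x = 1.
    lhs = np.vstack([A - np.eye(n), np.ones((1, n))])
    rhs = np.concatenate([np.zeros(n), [1.0]])
    x, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return x


x = pagerank_by_linear_solve(A)
# Page numbers (1-based), most important first.
ranking = [int(i) + 1 for i in np.argsort(-x)]
print(np.round(x, 4), ranking)
```

For this graph the solution is x = (1/10, 1/3, 4/15, 3/10), so page 2 ranks first and page 1 last, even though page 1 is linked only by page 4.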

PageRank vector

$$x = Ax, \quad x \in [0,1]^n, \quad \sum_{i=1}^n x_i = 1$$

Link matrix $A$: Stochastic matrix

Eigenvector corresponding to eigenvalue 1

Always exists, but there may be multiple such vectors

If the web as a graph is strongly connected, then it is unique

However, the real web does not have this property…

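Modified PageRank problem

Redefine PageRank by

$$x = Mx, \quad x \in [0,1]^n, \quad \sum_{i=1}^n x_i = 1$$

where

$$M := (1-m)A + \frac{m}{n} S, \quad m = 0.15, \quad S := \begin{bmatrix} 1 & \cdots & 1 \\ \vdots & & \vdots \\ 1 & \cdots & 1 \end{bmatrix}$$

$M$ is stochastic

$M > 0$: Each element is a positive value

Only one such eigenvector exists (by Perron theorem in matrix theory)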
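A short worked step may help in reading the modified problem. It assumes, as in the definition above, that $S$ is the $n \times n$ all-ones matrix, written here as $S = \mathbf{1}\mathbf{1}^\top$ (the symbol $\mathbf{1}$ for the all-ones vector is introduced only for this note).

```latex
% Since the entries of x sum to one, S x = 1 (1^T x) = 1, so
\begin{aligned}
x = Mx
  &\;\Longleftrightarrow\;
  x = (1-m)\,A x + \frac{m}{n}\,\mathbf{1}\,(\mathbf{1}^\top x) \\
  &\;\Longleftrightarrow\;
  x = (1-m)\,A x + \frac{m}{n}\,\mathbf{1},
  \qquad \mathbf{1}^\top x = 1 .
\end{aligned}
% Column sums of M: each column of A sums to 1 and each column of S sums to n, hence
\mathbf{1}^\top M = (1-m)\,\mathbf{1}^\top + \frac{m}{n}\, n\, \mathbf{1}^\top = \mathbf{1}^\top .
```

In words: every page receives a base value $m/n$ regardless of the links, and $M$ remains column stochastic.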

Computation based on the power method

$$x = Mx, \quad x \in [0,1]^n, \quad \sum_{i=1}^n x_i = 1$$

$$x(k+1) = M x(k)$$

Eigenvalues of $M$ have magnitude ≤ 1

As a discrete-time system, it is (critically) stable

Asymptotic convergence to PageRank: $x(k) \to x$ as $k \to \infty$
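The power method can be sketched in a few lines of Python with NumPy. This is an illustration on the 4-page example, not the production computation; the function names, the stopping tolerance, and the uniform initial vector are assumptions made for the sketch.

```python
import numpy as np


def modified_link_matrix(A, m=0.15):
    """M = (1 - m) A + (m / n) S, with S the all-ones matrix."""
    n = A.shape[0]
    return (1.0 - m) * A + (m / n) * np.ones((n, n))


def power_method(M, tol=1e-12, max_iter=10_000):
    """Iterate x(k+1) = M x(k) from the uniform vector until the 1-norm change is below tol."""
    n = M.shape[0]
    x = np.full(n, 1.0 / n)
    for k in range(max_iter):
        x_next = M @ x
        if np.abs(x_next - x).sum() < tol:
            return x_next, k + 1
        x = x_next
    return x, max_iter


# Link matrix of the 4-page example.
A = np.array([
    [0.0, 0.0, 0.0, 1 / 3],
    [1.0, 0.0, 0.5, 1 / 3],
    [0.0, 0.5, 0.0, 1 / 3],
    [0.0, 0.5, 0.5, 0.0],
])
M = modified_link_matrix(A)
x_star, iterations = power_method(M)
print(np.round(x_star, 4), iterations)
```

Because M is column stochastic, every iterate keeps the sum of its entries equal to 1, so no renormalization step is needed in the loop.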

Centralized computation: May require high computational load

Can the computation be implemented in a distributed way?

Data center of Google

The Dalles, Oregon, U.S.A.

Distributed randomized approach to PageRank computation

Each page i computes its own value $x_i(k)$

Pages exchange their info over the links

Basic protocol in the algorithm

At each time k, one page is chosen.

Denote the index of this page by $\theta(k)$.

Then follow Steps 1 to 3.

Step 1: Send

Step 2: Return

Step 3: Update

Figure: the page chosen at time k holds the value $x_{\theta(k)}(k)$

One page is chosen at a time

Select a page probabilistically

Each page has the same probability of $1/n$

It can be implemented in a decentralized manner

Distributed update scheme

$$x(k+1) = A_{\theta(k)}\, x(k)$$

The matrix $A_{\theta(k)}$ switches depending on the chosen page

Initial vector: $x(0) \ge 0, \quad \sum_{i=1}^n x_i(0) = 1$

Goal: The scheme should compute the PageRank $x_i$ of each agent $i$ from the state $x_i(k)$

How shall the link matrices be selected?

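Distributed link matrices

Centralized scheme: $x(k+1) = A x(k)$

$$
A = \begin{bmatrix}
0 & 0 & 0 & 1/3 \\
1 & 0 & 1/2 & 1/3 \\
0 & 1/2 & 0 & 1/3 \\
0 & 1/2 & 1/2 & 0
\end{bmatrix}
$$

Distributed scheme: $x(k+1) = A_{\theta(k)}\, x(k)$, $\theta(k) = 4$ (Page 4 is chosen)

Keep only the 4th row and the 4th column of $A$:

$$
\begin{bmatrix}
0 & 0 & 0 & 1/3 \\
0 & 0 & 0 & 1/3 \\
0 & 0 & 0 & 1/3 \\
0 & 1/2 & 1/2 & 0
\end{bmatrix}
$$

Then set the diagonal entries so that each column sums to 1:

$$
A_4 = \begin{bmatrix}
1 & 0 & 0 & 1/3 \\
0 & 1/2 & 0 & 1/3 \\
0 & 0 & 1/2 & 1/3 \\
0 & 1/2 & 1/2 & 0
\end{bmatrix}
$$

Column stochastic matrix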
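The construction of the distributed link matrices for the 4-page example can be written as a small Python function. This is a sketch under two assumptions: pages are indexed from 0 in the code (so page 4 is index 3), and no page links to itself. The name `distributed_link_matrix` is hypothetical.

```python
import numpy as np

# Link matrix of the 4-page example.
A = np.array([
    [0.0, 0.0, 0.0, 1 / 3],
    [1.0, 0.0, 0.5, 1 / 3],
    [0.0, 0.5, 0.0, 1 / 3],
    [0.0, 0.5, 0.5, 0.0],
])


def distributed_link_matrix(A, i):
    """Build A_i: keep row i and column i of A, then fill the diagonal.

    The diagonal entry of every other column l is set to 1 - A[i, l],
    which makes each column sum to 1. Assumes A has a zero diagonal.
    """
    n = A.shape[0]
    A_i = np.zeros_like(A)
    A_i[i, :] = A[i, :]  # links pointing to page i
    A_i[:, i] = A[:, i]  # links leaving page i
    for l in range(n):
        if l != i:
            A_i[l, l] = 1.0 - A[i, l]
    return A_i


A_4 = distributed_link_matrix(A, 3)  # page 4 is chosen
print(A_4)
```

Page i only needs its own row and column of A, that is, its incoming and outgoing links. Under the uniform choice of pages, averaging these matrices over i gives (2/n) A + ((n-2)/n) I for this example, which indicates why the plain switching scheme needs a modification before its average behaves like the centralized one.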

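Modified update scheme

$$x(k+1) = M_{\theta(k)}\, x(k), \quad x(0) \ge 0, \quad \sum_{i=1}^n x_i(0) = 1$$

Stochastic system

Average state: $\bar{x}(k) := E[x(k)]$

Its dynamics: $\bar{x}(k+1) = \bar{M}\, \bar{x}(k)$

Average matrix: $\bar{M} := E[M_{\theta(k)}]$

The average state must converge to the PageRank vector $x^*$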

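Modified link matrix

$$x(k+1) = M_{\theta(k)}\, x(k), \quad x(0) \ge 0, \quad \sum_{i=1}^n x_i(0) = 1$$

Same form as $M$:

$$M_i := (1-\hat{m}) A_i + \frac{\hat{m}}{n} S, \quad i = 1, \ldots, n$$

$\hat{m} \in (0,1)$: chosen so that $\bar{M}$ and $M$ share the eigenvector for eigenvalue 1.

$\bar{x}(k) \to x^*$ as $k \to \infty$

However, the state itself does not converge: $x(k) \not\to x^*$ as $k \to \infty$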

Convergence result

We hence focus on the time average:

$$y(k) := \frac{1}{k+1} \sum_{l=0}^{k} x(l)$$

The time average converges to the PageRank vector $x^*$ in the mean-square sense:

$$E\left[ \left\| y(k) - x^* \right\|^2 \right] \to 0, \quad k \to \infty$$

Ishii and Tempo (2010), Ishii, Tempo, Bai (2012)
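The randomized scheme and its time average can be simulated on the 4-page example. The sketch below is in Python with NumPy and rests on stated assumptions. The slides only say that $\hat{m} \in (0,1)$ makes the average matrix share the eigenvector of M; the value `m_hat = 2m / (n - m(n - 2))` used here is derived from that requirement for uniform page selection and no self-links, and should be checked against Ishii and Tempo (2010) before being relied on. The function names, the seed, and the number of steps are likewise choices made for illustration.

```python
import numpy as np

A = np.array([
    [0.0, 0.0, 0.0, 1 / 3],
    [1.0, 0.0, 0.5, 1 / 3],
    [0.0, 0.5, 0.0, 1 / 3],
    [0.0, 0.5, 0.5, 0.0],
])
n, m = A.shape[0], 0.15
S = np.ones((n, n))
M = (1 - m) * A + (m / n) * S


def distributed_link_matrix(A, i):
    """A_i: row i and column i of A, diagonal filled so columns sum to 1."""
    A_i = np.zeros_like(A)
    A_i[i, :] = A[i, :]
    A_i[:, i] = A[:, i]
    for l in range(A.shape[0]):
        if l != i:
            A_i[l, l] = 1.0 - A[i, l]
    return A_i


# Assumed choice of m_hat (see the lead-in): makes E[M_theta] share M's eigenvector.
m_hat = 2 * m / (n - m * (n - 2))
M_list = [(1 - m_hat) * distributed_link_matrix(A, i) + (m_hat / n) * S
          for i in range(n)]


def simulate(num_steps, seed=0):
    """Run x(k+1) = M_theta(k) x(k) with theta(k) uniform; return x(k) and y(k)."""
    rng = np.random.default_rng(seed)
    x = np.full(n, 1.0 / n)
    y = x.copy()  # y(0) = x(0)
    for k in range(1, num_steps + 1):
        theta = rng.integers(n)  # page chosen at this step
        x = M_list[theta] @ x
        y += (x - y) / (k + 1)  # running form of y(k) = (1/(k+1)) sum_l x(l)
    return x, y


# Reference PageRank vector from the centralized power method.
x_ref = np.full(n, 1.0 / n)
for _ in range(500):
    x_ref = M @ x_ref

x_last, y_avg = simulate(50_000)
print(np.round(x_ref, 4), np.round(y_avg, 4), np.round(x_last, 4))
```

In a run of this kind the last state `x_last` keeps fluctuating, while the time average `y_avg` settles near `x_ref`, which is the behavior the convergence result describes.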

Numerical Experiment

From a university in New Zealand (www.lincoln.ac.nz)

3,756 nodes, 31,718 links, 684 subdomains

Statistical Cybernetics Research Group, Univ. Wolverhampton, U.K.

Web Structure

Figure: nonzero entries in the link matrix A (axes: Index j, Index i). Very sparse.

Large number of dangling nodes (red dots > 85 %)

Large clusters

PageRank Values

Figure: PageRank versus page index i

Pages in the clusters take larger values.

Top ranked pages: “Search” page & Univ. top page

Relation to consensus

Consensus problem

Network of agents: Directed graph, Strongly connected

Agent i holds the value $x_i(k)$

Communication: Edges are chosen randomly

Consensus: With probability 1, it holds that

$$|x_i(k) - x_j(k)| \to 0, \quad k \to \infty, \quad \forall i, j$$

Hatano & Mesbahi (2005), Wu (2006), Tahbaz & Jadbabaie (2008)
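A randomized consensus iteration of this kind can be sketched as follows. The slides do not give the update law, so the rule used here is an assumption for illustration: at each step one directed edge (j, i) is drawn uniformly, and agent i replaces its value by the average of its own value and that of agent j. The edge list is a hypothetical strongly connected graph on four agents.

```python
import numpy as np

# Hypothetical strongly connected directed graph: (sender j, receiver i), 0-based.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (3, 1)]


def run_consensus(x0, edges, num_steps, seed=0):
    """Random-edge averaging: the receiver moves halfway toward the sender."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(num_steps):
        j, i = edges[rng.integers(len(edges))]
        x[i] = 0.5 * (x[i] + x[j])  # one row of a row-stochastic update
    return x


x0 = [1.0, 0.0, -2.0, 5.0]
x_final = run_consensus(x0, edges, num_steps=2000)
spread = x_final.max() - x_final.min()
print(x_final, spread)
```

Each step multiplies the state by a row-stochastic matrix, so all values stay between the initial minimum and maximum. The common limit depends on the random edge sequence and is in general not the average of the initial values.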

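Comparison

| | PageRank | Consensus |
|---|---|---|
| Graph | × (not strongly connected) | Strongly connected |
| Update law | $x(k+1) = M_{\theta(k)}\, x(k)$ | $x(k+1) = M_{\theta(k)}\, x(k)$ |
| Randomization | Page | Edge |
| Objective | Time ave.: $y(k) \to x^*$ | $\lvert x_i(k) - x_j(k) \rvert \to 0$ |
| Matrix $M_{\theta(k)}$ | Column stochastic | Row stochastic |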

Effects of uncertain links

Uncertain links

When the linked page cannot be viewed

Server failure, or the page deleted temporarily

Incorrect data of the web structure

Can we find how much PageRank values vary in the presence of such links?

Especially, for important pages, the error may be large

Ishii & Tempo (2009)

PageRank values under uncertain data

$E_f$: Set of uncertain links (with d links)

Graphs for all combinations of missing links: $2^d$

PageRank vector corresponding to each graph:

$$x^{(i)} = M^{(i)} x^{(i)}, \quad x^{(i)} \in [0,1]^n, \quad \sum_{j=1}^n x_j^{(i)} = 1$$

Difficult to compute all of them (just too many!)

Proposed method: (Centralized) computation of PageRank interval

⇒ Contains error, but computationally efficient
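For a very small d, the true range of PageRank values can be found by brute force, which also shows why this does not scale: the loop below visits all 2^d graphs. This Python sketch is the exhaustive baseline, not the proposed interval method. The choice of uncertain links in the 4-page example is hypothetical and is made so that every page keeps at least one certain outgoing link (no dangling nodes arise).

```python
import itertools

import numpy as np


def link_matrix(n, links):
    """Column-stochastic link matrix from (source, target) pairs, 0-based."""
    A = np.zeros((n, n))
    for j, i in links:
        A[i, j] = 1.0
    return A / A.sum(axis=0)  # divide each column by its number of links


def pagerank(A, m=0.15, iters=500):
    """Power method on M = (1 - m) A + (m / n) S."""
    n = A.shape[0]
    M = (1 - m) * A + (m / n) * np.ones((n, n))
    x = np.full(n, 1.0 / n)
    for _ in range(iters):
        x = M @ x
    return x


n = 4
# Links of the 4-page example, split into certain and (hypothetical) uncertain ones.
certain = {(0, 1), (1, 2), (1, 3), (2, 3), (3, 1), (3, 2)}
uncertain = [(2, 1), (3, 0)]  # plays the role of E_f, with d = 2

vectors = []
for r in range(len(uncertain) + 1):
    for present in itertools.combinations(uncertain, r):
        vectors.append(pagerank(link_matrix(n, certain | set(present))))
vectors = np.array(vectors)  # one row per graph, 2^d rows in total

lower, upper = vectors.min(axis=0), vectors.max(axis=0)
print(np.round(lower, 4), np.round(upper, 4))
```

With d = 18 as in the 150-page example this loop would already need 262,144 PageRank computations, which is the motivation for computing an interval directly.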

Numerical example 1: Web of 150 pages

Total # of links: 2,560, Uncertain links: 18

Figure: Range of PageRank (for 20 pages); axes: Page index, PageRank; legend: True maximum & minimum

Important pages: Linked by about half of the pages

Numerical example 1: Web of 150 pages

Figure: Computation time versus # of uncertain links; legend: True range, Proposed method

Figure: Relative error w.r.t. true values versus # of uncertain links; legend: True range, Proposed method

Numerical example 2: Web of 1,200 pages

Total # of links: 27,500, Uncertain links: 1,000

Figure: Range of PageRank values (for 16 pages); axes: Page index, PageRank; legend: Range obtained by proposed method

Important pages: Linked by over 250 pages

Large enough that rankings may change!

Summary: Part 2

PageRank computation at Google

Algorithms via a distributed randomized approach

Method to study effects of uncertain links

Issues related to computation/communication resources

Collaborators:

Roberto Tempo (IEIIT-CNR, Politecnico di Torino)

Er-Wei Bai (University of Iowa)
