Link Counts


1

Link Counts

[Figure: link graph with nodes Yahoo!, CNN, DB Pub Server, CS361, Sep’s Home Page, and Taher’s Home Page. One home page is labeled "Linked by 2 Important Pages" and the other "Linked by 2 Unimportant Pages".]

Google PageRank engine needs speedup

adapted from G. Golub et al.

2

Definition of PageRank

The importance of a page is given by the importance of the pages that link to it.

$x_i = \sum_{j \in B_i} \frac{1}{N_j} x_j$

$x_i$: importance of page i

$B_i$: pages j that link to page i

$N_j$: number of outlinks from page j

$x_j$: importance of page j
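The definition can be checked numerically on a small example. The sketch below is only an illustration: the three-page graph `outlinks`, the function name `importance`, and the candidate vector `x` are assumptions made for the example, not part of the slides.

```python
# Hypothetical three-page web: each key links to the pages in its list.
outlinks = {"a": ["b", "c"], "b": ["a"], "c": ["b"]}


def importance(page, outlinks, x):
    """Evaluate the sum over j in B_i of x_j / N_j for the given page i."""
    total = 0.0
    for j, targets in outlinks.items():
        if page in targets:               # j links to page, so j is in B_i
            total += x[j] / len(targets)  # N_j = number of outlinks from j
    return total


# A candidate importance vector; the definition holds when each x_i equals its sum.
x = {"a": 0.4, "b": 0.4, "c": 0.2}
for page in outlinks:
    print(page, x[page], importance(page, outlinks, x))
```

For this assumed graph, each printed pair agrees, so the candidate vector satisfies the definition.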

3

Definition of PageRank

[Figure: example link graph over Yahoo!, CNN, DB Pub Server, Taher, and Sep, with link weights 1/2, 1/2, 1, 1 and importance values 0.1, 0.1, 0.1, 0.05, 0.25.]

4

PageRank Diagram

Initialize all nodes to rank $x_i^{(0)} = \frac{1}{n}$

[Figure: three-node graph, each node with rank 0.333.]

5

PageRank Diagram

Propagate ranks across links (multiplying by link weights)

[Figure: three-node graph; values shown on the links: 0.167, 0.167, 0.333, 0.333.]

6

PageRank Diagram

$x_i^{(1)} = \sum_{j \in B_i} \frac{1}{N_j} x_j^{(0)}$

[Figure: three-node graph with updated ranks 0.333, 0.5, 0.167.]

7

PageRank Diagram

[Figure: second propagation step; values shown: 0.167, 0.167, 0.5, 0.167.]

8

PageRank Diagram

$x_i^{(2)} = \sum_{j \in B_i} \frac{1}{N_j} x_j^{(1)}$

[Figure: three-node graph with updated ranks 0.5, 0.333, 0.167.]

9

PageRank Diagram

After a while…

$x_i = \sum_{j \in B_i} \frac{1}{N_j} x_j$

[Figure: three-node graph with converged ranks 0.4, 0.4, 0.2.]

10

Computing PageRank

Initialize:

$x_i^{(0)} = \frac{1}{n}$

Repeat until convergence:

$x_i^{(k+1)} = \sum_{j \in B_i} \frac{1}{N_j} x_j^{(k)}$

$x_i$: importance of page i

$B_i$: pages j that link to page i

$N_j$: number of outlinks from page j

$x_j$: importance of page j
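The initialize-and-repeat procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the names `step` and `pagerank`, the tolerance, and the three-page graph are all hypothetical, and every page is assumed to have at least one outlink. The graph was chosen so that its iterates reproduce the values on the PageRank Diagram slides (0.333, 0.5, 0.167 after one step).

```python
def step(outlinks, x):
    """One update: x_i^(k+1) = sum over j in B_i of x_j^(k) / N_j."""
    new_x = {page: 0.0 for page in outlinks}
    for j, targets in outlinks.items():
        share = x[j] / len(targets)   # page j splits its rank over its N_j outlinks
        for i in targets:
            new_x[i] += share
    return new_x


def pagerank(outlinks, tol=1e-10, max_iter=1000):
    """Iterate from the uniform vector until the L1 change drops below tol."""
    n = len(outlinks)
    x = {page: 1.0 / n for page in outlinks}   # Initialize: x_i^(0) = 1/n
    for _ in range(max_iter):
        new_x = step(outlinks, x)
        change = sum(abs(new_x[p] - x[p]) for p in outlinks)
        x = new_x
        if change < tol:
            break
    return x


# Hypothetical graph (an assumption): a -> b, a -> c, b -> a, c -> b.
outlinks = {"a": ["b", "c"], "b": ["a"], "c": ["b"]}
ranks = pagerank(outlinks)
print({page: round(r, 3) for page, r in ranks.items()})
```

On this assumed graph the ranks settle near 0.4, 0.4, and 0.2, the same values as the "After a while…" diagram.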

11

Matrix Notation

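$x_i = \sum_{j \in B_i} \frac{1}{N_j} x_j$

[Figure: the sum drawn as the row (0, .2, 0, .3, 0, 0, .1, .4, 0, .1) of $P^T$ multiplied by the column vector $x$, giving one entry of $x$.]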

12

Matrix Notation

[Figure: the column vector $x$ equated to the row (0, .2, 0, .3, 0, 0, .1, .4, 0, .1) of $P^T$ multiplied by $x$.]

Find x that satisfies:

$x = P^T x$
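To make the matrix form concrete, the sketch below builds $P^T$ from a link structure and checks the fixed-point equation by a plain matrix-vector product. The helper names `transition_transpose` and `matvec`, the three-page graph, and the test vector are assumptions for illustration; each page is assumed to link to any other page at most once.

```python
def transition_transpose(outlinks, pages):
    """Return P^T as a list of rows: entry [i][j] is 1/N_j if page j links to page i, else 0."""
    index = {page: k for k, page in enumerate(pages)}
    n = len(pages)
    PT = [[0.0] * n for _ in range(n)]
    for j, targets in outlinks.items():
        for i in targets:
            PT[index[i]][index[j]] = 1.0 / len(targets)
    return PT


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


pages = ["a", "b", "c"]
outlinks = {"a": ["b", "c"], "b": ["a"], "c": ["b"]}  # hypothetical graph
PT = transition_transpose(outlinks, pages)
x = [0.4, 0.4, 0.2]
print(PT)
print(matvec(PT, x))  # matches x when x satisfies x = P^T x
```

Row i of the matrix holds exactly the weights $1/N_j$ that appear in the sum for $x_i$, which is why the two formulations agree.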

13

Power Method

Initialize:

$x^{(0)} = \left( \frac{1}{n}, \dots, \frac{1}{n} \right)^T$

Repeat until convergence:

$x^{(k+1)} = P^T x^{(k)}$
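A minimal sketch of the Power Method as stated above, written against a matrix stored as a list of rows. The function name `power_method`, the stopping tolerance, and the example matrix (the $P^T$ of a hypothetical three-page graph) are assumptions for illustration.

```python
def power_method(M, tol=1e-10, max_iter=1000):
    """Iterate x^(k+1) = M x^(k) from the uniform vector; return (x, iterations used)."""
    n = len(M)
    x = [1.0 / n] * n                     # Initialize: x^(0) = (1/n, ..., 1/n)^T
    for k in range(1, max_iter + 1):
        new_x = [sum(m * v for m, v in zip(row, x)) for row in M]
        change = sum(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if change < tol:                  # stop once successive iterates agree
            return x, k
    return x, max_iter


# P^T for a hypothetical three-page graph (a -> b, a -> c, b -> a, c -> b).
PT = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 1.0],
    [0.5, 0.0, 0.0],
]
x, iterations = power_method(PT)
print([round(v, 3) for v in x], iterations)
```

The L1 change between iterates is one common stopping test; other norms or a fixed iteration count would work as well.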

14

A side note

PageRank doesn’t actually use $P^T$. Instead, it uses $A = cP^T + (1-c)E^T$.

So the PageRank problem is really:

Find x that satisfies: $x = Ax$

not:

Find x that satisfies: $x = P^T x$
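The slide does not say what $E$ or $c$ are. The sketch below assumes the common choice of a uniform matrix (every entry of $E^T$ equal to $1/n$) and $c = 0.85$; both are assumptions, as is the helper name `damped_matrix`.

```python
def damped_matrix(PT, c=0.85):
    """Form A = c * P^T + (1 - c) * E^T, assuming E^T is the uniform matrix with entries 1/n."""
    n = len(PT)
    return [[c * PT[i][j] + (1.0 - c) / n for j in range(n)] for i in range(n)]


# P^T for a hypothetical three-page graph (a -> b, a -> c, b -> a, c -> b).
PT = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 1.0],
    [0.5, 0.0, 0.0],
]
A = damped_matrix(PT)
column_sums = [sum(A[i][j] for i in range(3)) for j in range(3)]
print(A)
print(column_sums)  # each column still sums to 1 (up to rounding)
```

Under these assumptions every entry of A is strictly positive and each column still sums to 1, so the total rank is preserved from one iterate to the next.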

15

Power Method

And the algorithm is really . . .

Initialize:

$x^{(0)} = \left( \frac{1}{n}, \dots, \frac{1}{n} \right)^T$

Repeat until convergence:

$x^{(k+1)} = A x^{(k)}$
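As a sketch of this iteration, the code below applies $A$ without ever forming it. It assumes that $E^T$ is the uniform matrix with entries $1/n$ and that $c = 0.85$ (neither is given on the slide); under that assumption every entry of $E^T x$ equals the sum of $x$ divided by $n$. The function name `pagerank_damped` is hypothetical.

```python
def pagerank_damped(PT, c=0.85, tol=1e-10, max_iter=1000):
    """Iterate x^(k+1) = c * P^T x^(k) + (1 - c) * E^T x^(k) with a uniform E (an assumption)."""
    n = len(PT)
    x = [1.0 / n] * n                              # Initialize: x^(0) = (1/n, ..., 1/n)^T
    for _ in range(max_iter):
        uniform_part = (1.0 - c) * sum(x) / n      # each entry of (1 - c) * E^T x
        new_x = [c * sum(m * v for m, v in zip(row, x)) + uniform_part for row in PT]
        change = sum(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if change < tol:
            break
    return x


# P^T for a hypothetical three-page graph (a -> b, a -> c, b -> a, c -> b).
PT = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 1.0],
    [0.5, 0.0, 0.0],
]
x = pagerank_damped(PT)
print(x, sum(x))
```

Skipping the dense matrix matters for web-scale graphs, where $P^T$ is sparse but $A$ is not.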

16

Power Method

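Express $x^{(0)}$ in terms of eigenvectors of A

[Figure: components of $x^{(0)}$ along $u_1, \dots, u_5$, with coefficients $1, \alpha_2, \alpha_3, \alpha_4, \alpha_5$.]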

17

Power Method

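[Figure: components of $x^{(1)}$ along $u_1, \dots, u_5$, with coefficients $1, \alpha_2\lambda_2, \alpha_3\lambda_3, \alpha_4\lambda_4, \alpha_5\lambda_5$.]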

18

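Power Method

[Figure: components of $x^{(2)}$ along $u_1, \dots, u_5$, with coefficients $1, \alpha_2\lambda_2^2, \alpha_3\lambda_3^2, \alpha_4\lambda_4^2, \alpha_5\lambda_5^2$.]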

19

Power Method

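[Figure: components of $x^{(k)}$ along $u_1, \dots, u_5$, with coefficients $1, \alpha_2\lambda_2^k, \alpha_3\lambda_3^k, \alpha_4\lambda_4^k, \alpha_5\lambda_5^k$.]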

20

Power Method

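[Figure: components of $x^{(\infty)}$ along $u_1, \dots, u_5$: coefficient 1 on $u_1$, with no coefficients left on $u_2, \dots, u_5$.]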

21

Why does it work?

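Imagine our n x n matrix A has n linearly independent eigenvectors $u_i$:

$A u_i = \lambda_i u_i$

Then, you can write any n-dimensional vector as a linear combination of the eigenvectors of A:

$x^{(0)} = u_1 + \alpha_2 u_2 + \dots + \alpha_n u_n$

[Figure: components of $x^{(0)}$ along $u_1, \dots, u_5$, with coefficients $1, \alpha_2, \alpha_3, \alpha_4, \alpha_5$.]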
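A small worked case may help here. The sketch below uses a hypothetical 2 x 2 column-stochastic matrix whose eigenpairs can be checked by hand, and solves for the coefficients of the uniform starting vector with Cramer's rule. The matrix, the eigenvector scaling, and the helper names are all assumptions chosen for the example.

```python
# Hypothetical 2 x 2 column-stochastic matrix with eigenvalues 1 and 0.7.
A = [[0.9, 0.2],
     [0.1, 0.8]]
u1 = [2.0 / 3.0, 1.0 / 3.0]   # A u1 = 1.0 * u1, scaled so its entries sum to 1
u2 = [1.0, -1.0]              # A u2 = 0.7 * u2


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def coefficients(u1, u2, x):
    """Solve a1 * u1 + a2 * u2 = x for (a1, a2) with Cramer's rule."""
    det = u1[0] * u2[1] - u2[0] * u1[1]
    a1 = (x[0] * u2[1] - u2[0] * x[1]) / det
    a2 = (u1[0] * x[1] - x[0] * u1[1]) / det
    return a1, a2


x0 = [0.5, 0.5]               # uniform starting vector for n = 2
a1, a2 = coefficients(u1, u2, x0)
print(a1, a2)
```

With $u_1$ scaled so that its entries sum to 1, the coefficient on $u_1$ comes out as 1 (up to rounding), matching the form $x^{(0)} = u_1 + \alpha_2 u_2 + \dots$ used on the slide. Here $\alpha_2 = -1/6$.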

22

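Why does it work?

From the last slide:

$x^{(0)} = u_1 + \alpha_2 u_2 + \dots + \alpha_n u_n$

To get the first iterate, multiply $x^{(0)}$ by $A$:

$x^{(1)} = A x^{(0)} = A u_1 + \alpha_2 A u_2 + \dots + \alpha_n A u_n = \lambda_1 u_1 + \alpha_2 \lambda_2 u_2 + \dots + \alpha_n \lambda_n u_n$

First eigenvalue is 1:

$\lambda_1 = 1; \quad \lambda_1 > \lambda_2 \ge \dots$

Therefore:

$x^{(1)} = u_1 + \alpha_2 \lambda_2 u_2 + \dots + \alpha_n \lambda_n u_n$

where $\lambda_2, \dots, \lambda_n$ are all less than 1.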

23

Power Method

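$x^{(0)} = u_1 + \alpha_2 u_2 + \dots + \alpha_n u_n$

$x^{(1)} = u_1 + \alpha_2 \lambda_2 u_2 + \dots + \alpha_n \lambda_n u_n$

$x^{(2)} = u_1 + \alpha_2 \lambda_2^2 u_2 + \dots + \alpha_n \lambda_n^2 u_n$

[Figure: components of $x^{(0)}$, $x^{(1)}$, and $x^{(2)}$ along $u_1, \dots, u_5$; the coefficient on $u_1$ stays 1 while the coefficient on each $u_i$ goes from $\alpha_i$ to $\alpha_i\lambda_i$ to $\alpha_i\lambda_i^2$.]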
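The pattern in these iterates can be checked numerically. The sketch below compares repeated multiplication by $A$ against the closed form $u_1 + \alpha_2 \lambda_2^k u_2$ on a hypothetical 2 x 2 matrix. The matrix, its eigenpairs, and the value $\alpha_2 = -1/6$ are assumptions chosen so that $x^{(0)}$ is the uniform vector.

```python
# Hypothetical 2 x 2 example (an assumption, not from the slides).
A = [[0.9, 0.2],
     [0.1, 0.8]]
u1 = [2.0 / 3.0, 1.0 / 3.0]   # eigenvector for lambda_1 = 1
u2 = [1.0, -1.0]              # eigenvector for lambda_2 = 0.7
lam2 = 0.7
alpha2 = -1.0 / 6.0           # chosen so that x^(0) = u1 + alpha2 * u2 = (0.5, 0.5)


def iterate(k):
    """Compute x^(k) by multiplying x^(0) by A k times."""
    x = [0.5, 0.5]
    for _ in range(k):
        x = [sum(m * v for m, v in zip(row, x)) for row in A]
    return x


def closed_form(k):
    """Compute x^(k) = u1 + alpha2 * lam2**k * u2 directly."""
    return [u1[i] + alpha2 * lam2 ** k * u2[i] for i in range(2)]


for k in range(4):
    print(k, iterate(k), closed_form(k))
```

The two columns of output agree up to floating-point rounding, and the $u_2$ component shrinks by a factor of 0.7 at every step.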

24

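Convergence

$x^{(k)} = u_1 + \alpha_2 \lambda_2^k u_2 + \dots + \alpha_n \lambda_n^k u_n$

[Figure: components of $x^{(k)}$ along $u_1, \dots, u_5$, with coefficients $1, \alpha_2\lambda_2^k, \alpha_3\lambda_3^k, \alpha_4\lambda_4^k, \alpha_5\lambda_5^k$.]

The smaller $\lambda_2$, the faster the convergence of the Power Method.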
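The effect of $\lambda_2$ on convergence speed can be illustrated with a rough experiment. The sketch below builds a family of hypothetical 2 x 2 column-stochastic matrices whose second eigenvalue is a chosen value, then counts Power Method steps until successive iterates agree. The construction, the tolerance, and the function name `iterations_to_converge` are assumptions for the example.

```python
def iterations_to_converge(lam2, tol=1e-10, max_iter=10000):
    """Count power-method steps on a 2 x 2 column-stochastic matrix with second eigenvalue lam2."""
    a = (1.0 - lam2) / 3.0          # the eigenvalues of the matrix below are 1 and 1 - a - b
    b = 2.0 * (1.0 - lam2) / 3.0    # so a + b = 1 - lam2 gives second eigenvalue lam2
    A = [[1.0 - a, b],
         [a, 1.0 - b]]
    x = [0.5, 0.5]
    for k in range(1, max_iter + 1):
        new_x = [sum(m * v for m, v in zip(row, x)) for row in A]
        change = sum(abs(p - q) for p, q in zip(new_x, x))
        x = new_x
        if change < tol:
            return k
    return max_iter


for lam2 in (0.2, 0.7, 0.95):
    print(lam2, iterations_to_converge(lam2))
```

The step count grows as $\lambda_2$ approaches 1, since the error along $u_2$ shrinks by only a factor of $\lambda_2$ per iteration.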