The $25 Billion Eigenvector: How does Google do PageRank?

Page 1: The $25 Billion Eigenvector

The $25 Billion Eigenvector

How does Google do PageRank?

Page 2: The $25 Billion Eigenvector

The Imaginary Web Surfer:

• Starts at any page,
• Randomly goes to a page linked from the current page,
• Randomly goes to any web page from a dangling page,
• … except sometimes (e.g. 15% of the time), goes to a purely random page.
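The surfer's rules can be tried out directly with a small simulation. The sketch below is only an illustration: the function name `simulate_surfer`, the four-page web `toy_links`, and the fixed random seed are all hypothetical and not part of the original slides. It counts how often each page is visited, which approximates the ranking the rest of the deck computes exactly.

```python
import random

def simulate_surfer(links, steps, jump_prob=0.15, seed=0):
    """Return the fraction of steps the surfer spends on each page."""
    rng = random.Random(seed)
    pages = sorted(links)
    visits = {p: 0 for p in pages}
    page = rng.choice(pages)              # starts at any page
    for _ in range(steps):
        visits[page] += 1
        out = links[page]
        if not out or rng.random() < jump_prob:
            page = rng.choice(pages)      # dangling page, or the occasional random jump
        else:
            page = rng.choice(out)        # follow a link from the current page
    return {p: visits[p] / steps for p in pages}

# Hypothetical four-page web; "d" is a dangling page (no outgoing links).
toy_links = {"a": ["b", "c"], "b": ["c"], "c": ["a", "d"], "d": []}
freq = simulate_surfer(toy_links, steps=100_000)
print(freq)
```

The visit frequencies sum to 1, and with enough steps they settle near the stationary distribution of the surfer's Markov chain.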

Page 3: The $25 Billion Eigenvector

A tiny web: who should get the highest rank?

[Figure: a tiny web of ten pages labeled A through J]

Page 4: The $25 Billion Eigenvector

The associated stochastic matrix:

0.0150  0.0150  0.0150  0.0150  0.0150  0.0150  0.4400  0.0150  0.0150  0.2983
0.4400  0.0150  0.0150  0.0150  0.0150  0.0150  0.0150  0.0150  0.0150  0.0150
0.0150  0.2983  0.0150  0.0150  0.0150  0.0150  0.0150  0.0150  0.0150  0.0150
0.0150  0.2983  0.8650  0.0150  0.0150  0.0150  0.0150  0.0150  0.0150  0.0150
0.4400  0.0150  0.0150  0.8650  0.0150  0.8650  0.0150  0.0150  0.0150  0.0150
0.0150  0.2983  0.0150  0.0150  0.8650  0.0150  0.0150  0.0150  0.0150  0.0150
0.0150  0.0150  0.0150  0.0150  0.0150  0.0150  0.0150  0.8650  0.0150  0.0150
0.0150  0.0150  0.0150  0.0150  0.0150  0.0150  0.0150  0.0150  0.8650  0.2983
0.0150  0.0150  0.0150  0.0150  0.0150  0.0150  0.0150  0.0150  0.0150  0.2983
0.0150  0.0150  0.0150  0.0150  0.0150  0.0150  0.4400  0.0150  0.0150  0.0150
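Each entry of this matrix is 0.15/n = 0.015 (the random-jump share) plus, where page j links to page i, 0.85 divided by the number of links leaving page j. The sketch below rebuilds the matrix that way. The dictionary `out_links` is a hypothetical name; its contents are the link lists encoded by the connection and end arrays on the next slide, with pages numbered 1 to 10.

```python
import numpy as np

n = 10
# Out-links per page (1-based numbering), taken from the connection/end arrays.
out_links = {1: [2, 5], 2: [3, 4, 6], 3: [4], 4: [5], 5: [6],
             6: [5], 7: [1, 10], 8: [7], 9: [8], 10: [1, 8, 9]}

A = np.full((n, n), 0.15 / n)                   # random-jump share: 0.015 everywhere
for j, targets in out_links.items():
    for i in targets:
        A[i - 1, j - 1] += 0.85 / len(targets)  # link-following share of column j

print(np.round(A, 4))
print(A.sum(axis=0))                            # every column sums to 1
```

The column sums confirm that the matrix is column-stochastic: each column is the probability distribution of the surfer's next page.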

Page 5: The $25 Billion Eigenvector

How is y_{k+1} = A x_k performed?

[Figure: the tiny web of ten pages labeled A through J]

connection = [2 5 3 4 6 4 5 6 5 1 10 7 8 1 8 9]
end = [2 5 6 7 8 9 11 12 13 16]
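One way to read the two arrays: connection lists the link targets of page 1, then of page 2, and so on, while end_j gives the position of the last entry that belongs to page j. This reading reproduces the stochastic matrix on the previous slide. The sketch below decodes the arrays under that reading; the name `out_links` and the lettering A = 1, ..., J = 10 are assumptions made for illustration.

```python
connection = [2, 5, 3, 4, 6, 4, 5, 6, 5, 1, 10, 7, 8, 1, 8, 9]
end = [2, 5, 6, 7, 8, 9, 11, 12, 13, 16]

out_links = {}
start = 1                                      # 1-based position, as on the slide
for j, last in enumerate(end, start=1):
    out_links[j] = connection[start - 1:last]  # entries start..last inclusive
    start = last + 1

letters = "ABCDEFGHIJ"                         # assumed lettering: A = 1, ..., J = 10
for j, targets in out_links.items():
    print(letters[j - 1], "->", " ".join(letters[t - 1] for t in targets))
```

Storing only the targets of existing links keeps the memory cost proportional to the number of links rather than n squared.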

Page 6: The $25 Billion Eigenvector

How is y_{k+1} = A x_k performed?

1. y_{k+1} = (.15/n) e, where e is the vector of all 1's
2. start = 1
3. for j = 1, …, n
   a) col_tot = end_j - start + 1
   b) for i = start, …, end_j
      • ii = connection_i
      • (y_{k+1})_ii = (y_{k+1})_ii + (.85/col_tot) * (x_k)_j
   c) start = end_j + 1
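The loop can be written out as a short Python function. This is a sketch, not the presenter's code: the name `pagerank_step` is hypothetical. It counts the links of page j as end_j - start + 1 (the entries from start to end_j inclusive) and weights them by the j-th component of the input vector. It also assumes the input sums to 1 and that no page is dangling.

```python
def pagerank_step(x, connection, end, damping=0.85):
    """Compute y = A x using only the stored links (1-based arrays from the slides)."""
    n = len(end)
    y = [(1 - damping) / n] * n                # step 1: the random-jump share
    start = 1
    for j in range(1, n + 1):
        last = end[j - 1]
        col_tot = last - start + 1             # number of links leaving page j
        for i in range(start, last + 1):
            ii = connection[i - 1]             # page that j links to
            y[ii - 1] += damping / col_tot * x[j - 1]
        start = last + 1
    return y

connection = [2, 5, 3, 4, 6, 4, 5, 6, 5, 1, 10, 7, 8, 1, 8, 9]
end = [2, 5, 6, 7, 8, 9, 11, 12, 13, 16]
y1 = pagerank_step([0.1] * 10, connection, end)
print(y1)
```

Each call touches every stored link once, so one step costs on the order of n plus the number of links, instead of the n squared operations of a dense matrix product.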

Page 7: The $25 Billion Eigenvector

[Plot: Start with equal components (vertical axis 0 to 0.35)]

Page 8: The $25 Billion Eigenvector

[Plot: One iteration (vertical axis 0 to 0.35)]

Page 9: The $25 Billion Eigenvector

[Plot: Two iterations (vertical axis 0 to 0.35)]

Page 10: The $25 Billion Eigenvector

[Plot: Three iterations (vertical axis 0 to 0.35)]

Page 11: The $25 Billion Eigenvector

[Plot: Four iterations (vertical axis 0 to 0.35)]

Page 12: The $25 Billion Eigenvector

[Plot: Five iterations (vertical axis 0 to 0.35)]

Page 13: The $25 Billion Eigenvector

[Plot: Six iterations (vertical axis 0 to 0.35)]

Page 14: The $25 Billion Eigenvector

[Plot: Seven iterations (vertical axis 0 to 0.35)]

Page 15: The $25 Billion Eigenvector

[Plot: Eight iterations (vertical axis 0 to 0.35)]

Page 16: The $25 Billion Eigenvector

[Plot: Nine iterations (vertical axis 0 to 0.35)]

Page 17: The $25 Billion Eigenvector

[Plot: Ten iterations (vertical axis 0 to 0.35)]

Page 18: The $25 Billion Eigenvector

[Plot: The Eigenvector (vertical axis 0 to 0.35)]
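The sequence of plots is the power iteration: start from equal components and apply the matrix repeatedly until the vector stops changing. The sketch below repeats that for the tiny web. The stopping tolerance, the iteration cap, and the lettering A = 1, ..., J = 10 are assumptions, and the dense matrix is used only for brevity. Under these assumptions page E comes out on top with a rank of roughly 0.35.

```python
import numpy as np

connection = [2, 5, 3, 4, 6, 4, 5, 6, 5, 1, 10, 7, 8, 1, 8, 9]
end = [2, 5, 6, 7, 8, 9, 11, 12, 13, 16]
n = len(end)

# Build the dense stochastic matrix from the stored links.
A = np.full((n, n), 0.15 / n)
start = 0
for j, last in enumerate(end):
    targets = connection[start:last]      # 0-based slice holding page j's links
    for t in targets:
        A[t - 1, j] += 0.85 / len(targets)
    start = last

x = np.full(n, 1.0 / n)                   # start with equal components
for k in range(1000):
    x_prev, x = x, A @ x                  # one power-iteration step
    if np.abs(x - x_prev).sum() < 1e-12:  # assumed stopping tolerance
        break

ranking = sorted(zip("ABCDEFGHIJ", x), key=lambda pair: -pair[1])
for page, rank in ranking:
    print(page, round(float(rank), 4))
```

Because every column of the matrix sums to 1, the iterates keep summing to 1 and no renormalization is needed between steps.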

Page 19: The $25 Billion Eigenvector

The Imaginary Web Surfer:

• Starts at any page,
• Randomly goes to a page linked from the current page,
• Randomly goes to any web page from a dangling page,
• … except sometimes (e.g. 15% of the time), goes to a purely random page.

Page 20: The $25 Billion Eigenvector

[U,G] = surfer('http://google.com', 100)

[Plot: both axes run from 0 to 100]

Page 21: The $25 Billion Eigenvector

[Plot: PageRank Power Iteration, 1 step (horizontal axis 0 to 120, vertical axis 0 to 0.03)]

Page 22: The $25 Billion Eigenvector

[Plot: PageRank Power Iteration, 2 steps (horizontal axis 0 to 120, vertical axis 0 to 0.03)]

Page 23: The $25 Billion Eigenvector

[Plot: PageRank Power Iteration, 3 steps (horizontal axis 0 to 120, vertical axis 0 to 0.03)]

Page 24: The $25 Billion Eigenvector

[Plot: PageRank Power Iteration, 4 steps (horizontal axis 0 to 120, vertical axis 0 to 0.035)]

Page 25: The $25 Billion Eigenvector

[Plot: PageRank Power Iteration, 5 steps (horizontal axis 0 to 120, vertical axis 0 to 0.035)]

Page 26: The $25 Billion Eigenvector

[Plot: PageRank Power Iteration, the limit (horizontal axis 0 to 120, vertical axis 0 to 0.09)]

Page 27: The $25 Billion Eigenvector

And the winners are…

'http://www.loc.gov/standards/iso639-2'
'http://www.sil.org/iso639-3'
'http://www.loc.gov/standards/iso639-5'
'http://purl.org/dc/elements/1.1'
'http://purl.org/dc/terms'
'http://purl.org/dc'
'http://creativecommons.org/licenses/by/3.0'
'http://i.creativecommons.org/l/by/3.0/88x31.png'
'http://www.nlb.gov.sg'
'http://purl.org/dcpapers'
'http://www.nl.go.kr'
'http://purl.org/dcregistry'
'http://www.kc.tsukuba.ac.jp/index_en.html'