213
The Problem Google Models Further thoughts Google’s PageRank Algorithm Anders O. F. Hendrickson Concordia College Moorhead, MN Math/CS Colloquium March 23, 2010 Anders O. F. Hendrickson Google’s PageRank Algorithm

faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Google’s PageRank Algorithm

Anders O. F. Hendrickson

Concordia CollegeMoorhead, MN

Math/CS ColloquiumMarch 23, 2010

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 2: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Outline

1 The ProblemToo many resultsPrinciples behind a solution

2 Google ModelsFive ModelsMarkov Chains

3 Further thoughts

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 3: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

How do you organize an index?

This is two questions, really:How do you organize thetopics in the index?

(Alphabetically)

How do you organize thepages found for eachtopic?

(Sequentially)

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 4: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

How do you organize an index?

This is two questions, really:How do you organize thetopics in the index?

(Alphabetically)

How do you organize thepages found for eachtopic?

(Sequentially)

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 5: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

How do you organize an index?

This is two questions, really:How do you organize thetopics in the index?(Alphabetically)How do you organize thepages found for eachtopic?

(Sequentially)

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 6: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

How do you organize an index?

This is two questions, really:How do you organize thetopics in the index?(Alphabetically)How do you organize thepages found for eachtopic?

(Sequentially)

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 7: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

How do you organize an index?

This is two questions, really:How do you organize thetopics in the index?(Alphabetically)How do you organize thepages found for eachtopic? (Sequentially)

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 8: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

How do you organize a computer search?

For a computer search, youspecify the topic, so the firstquestion is irrelevant, but thesecond still matters.

Only one topic isdesired.How do you organize theentries found for a topic?

(Sequentially)

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 9: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

How do you organize a computer search?

For a computer search, youspecify the topic, so the firstquestion is irrelevant, but thesecond still matters.

Only one topic isdesired.How do you organize theentries found for a topic?

(Sequentially)

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 10: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

How do you organize a computer search?

For a computer search, youspecify the topic, so the firstquestion is irrelevant, but thesecond still matters.

Only one topic isdesired.How do you organize theentries found for a topic?

(Sequentially)

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 11: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

How do you organize a computer search?

For a computer search, youspecify the topic, so the firstquestion is irrelevant, but thesecond still matters.

Only one topic isdesired.How do you organize theentries found for a topic?(Sequentially)

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 12: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

Drawbacks to returning results sequentially

Sequential ordering isn’treally informative.Okay for rare topics, badfor common topics.Slight improvements:

SubtopicsMarkup

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 13: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

Drawbacks to returning results sequentially

Sequential ordering isn’treally informative.Okay for rare topics, badfor common topics.Slight improvements:

SubtopicsMarkup

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 14: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

Drawbacks to returning results sequentially

Sequential ordering isn’treally informative.Okay for rare topics, badfor common topics.Slight improvements:

SubtopicsMarkup

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 15: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

Drawbacks to returning results sequentially

Sequential ordering isn’treally informative.Okay for rare topics, badfor common topics.Slight improvements:

SubtopicsMarkup

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 16: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

Drawbacks to returning results sequentially

Sequential ordering isn’treally informative.Okay for rare topics, badfor common topics.Slight improvements:

SubtopicsMarkup

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 17: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

What if we’re searching the Web?

There is no sequential ordering!Even “rare” topics find many, many webpages. (Example)Attempted improvements:

Subtopics—e.g. Yahoo! DirectoryMarkup slightly possible—e.g. PDF vs. HTML, showing afew relevant lines

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 18: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

What if we’re searching the Web?

There is no sequential ordering!Even “rare” topics find many, many webpages. (Example)Attempted improvements:

Subtopics—e.g. Yahoo! DirectoryMarkup slightly possible—e.g. PDF vs. HTML, showing afew relevant lines

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 19: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

What if we’re searching the Web?

There is no sequential ordering!Even “rare” topics find many, many webpages. (Example)Attempted improvements:

Subtopics—e.g. Yahoo! DirectoryMarkup slightly possible—e.g. PDF vs. HTML, showing afew relevant lines

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 20: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

What if we’re searching the Web?

There is no sequential ordering!Even “rare” topics find many, many webpages. (Example)Attempted improvements:

Subtopics—e.g. Yahoo! DirectoryMarkup slightly possible—e.g. PDF vs. HTML, showing afew relevant lines

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 21: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

What if we’re searching the Web?

There is no sequential ordering!Even “rare” topics find many, many webpages. (Example)Attempted improvements:

Subtopics—e.g. Yahoo! DirectoryMarkup slightly possible—e.g. PDF vs. HTML, showing afew relevant lines

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 22: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

A solution: ask the librarian

An expert will know which books are particularly good orrelevant.Likewise, you could ask people for recommendations ofgood websites.Of course, the quality of the recommendation is only asgood as the quality of the recommender.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 23: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

How does this translate to the web?

Fundamental PrincipleEvery hyperlink on the Web is a recommendation to visitanother page.

So one way to figure out which pages are “important” or“good” is to trust the judgment of the other website authorsout there who link to it.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 24: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Too many resultsPrinciples behind a solution

How does this translate to the web?

Fundamental PrincipleEvery hyperlink on the Web is a recommendation to visitanother page.

So one way to figure out which pages are “important” or“good” is to trust the judgment of the other website authorsout there who link to it.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 25: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Modeling the web

For simplicity, suppose the web has only six pages, A, B,C, D, E, and F.There are two natural ways to model its hyperlink structuremathematically.

Directed Graph Matrix

A

B C

D E

F

A BFROM

C D E F

A 0 1 1 1 0 0B 1 0 0 0 0 0

TO C 0 1 0 0 0 0

D 0 1 0 0 1 0E 0 0 0 0 0 0F 0 0 0 1 0 0

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 26: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Modeling the web

For simplicity, suppose the web has only six pages, A, B,C, D, E, and F.There are two natural ways to model its hyperlink structuremathematically.

Directed Graph Matrix

A

B C

D E

F

A BFROM

C D E F

A 0 1 1 1 0 0B 1 0 0 0 0 0

TO C 0 1 0 0 0 0

D 0 1 0 0 1 0E 0 0 0 0 0 0F 0 0 0 1 0 0

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 27: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Modeling the web

For simplicity, suppose the web has only six pages, A, B,C, D, E, and F.There are two natural ways to model its hyperlink structuremathematically.

Directed Graph

Matrix

A

B C

D E

F

A BFROM

C D E F

A 0 1 1 1 0 0B 1 0 0 0 0 0

TO C 0 1 0 0 0 0

D 0 1 0 0 1 0E 0 0 0 0 0 0F 0 0 0 1 0 0

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 28: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Modeling the web

For simplicity, suppose the web has only six pages, A, B,C, D, E, and F.There are two natural ways to model its hyperlink structuremathematically.

Directed Graph Matrix

A

B C

D E

F

A BFROM

C D E F

A 0 1 1 1 0 0B 1 0 0 0 0 0

TO C 0 1 0 0 0 0

D 0 1 0 0 1 0E 0 0 0 0 0 0F 0 0 0 1 0 0

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 29: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 1: Counting hyperlinks

A BFROM

C D E F

A 0

+

1

+

1

+

1

+

0

+

0B 1

+

0

+

0

+

0

+

0

+

0

TO C 0

+

1

+

0

+

0

+

0

+

0D 0

+

1

+

0

+

0

+

1

+

0E 0

+

0

+

0

+

0

+

0

+

0F 0

+

0

+

0

+

1

+

0

+

0

= 3 A= 1 B= 1 C= 2 D= 0 E= 1 F

We want to count the number of hyperlinks into a webpage.In other words, we want to count the entries in each row.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 30: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 1: Counting hyperlinks

A BFROM

C D E F

A

0 + 1 + 1 + 1 + 0 + 0

B

1 + 0 + 0 + 0 + 0 + 0

TO C

0 + 1 + 0 + 0 + 0 + 0

D

0 + 1 + 0 + 0 + 1 + 0

E

0 + 0 + 0 + 0 + 0 + 0

F

0 + 0 + 0 + 1 + 0 + 0

= 3 A= 1 B= 1 C= 2 D= 0 E= 1 F

We want to count the number of hyperlinks into a webpage.In other words, we want to count the entries in each row.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 31: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 1: Counting hyperlinks

A BFROM

C D E F

A

0 + 1 + 1 + 1 + 0 + 0

B

1 + 0 + 0 + 0 + 0 + 0

TO C

0 + 1 + 0 + 0 + 0 + 0

D

0 + 1 + 0 + 0 + 1 + 0

E

0 + 0 + 0 + 0 + 0 + 0

F

0 + 0 + 0 + 1 + 0 + 0

= 3 A= 1 B= 1 C= 2 D= 0 E= 1 F

We want to count the number of hyperlinks into a webpage.In other words, we want to count the entries in each row.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 32: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Matrix Multiplication

As it happens, this is exactly what matrix multiplication by

~e =

1...1

does!

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

111111

=

311201

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 33: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 1

Look again at the graph.We trust A most, since he has 3 recommendations.So when A recommends B, shouldn’t that count for morethan E ’s recommendation of D?

A

B C

D E

F

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 34: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 2: Iteration

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

311201

=

431102

Let’s re-run our matrix multiplication, but multiplying eachrecommendation by the recommender’s previous score.This is M(M~e) = M2~e.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 35: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 2: Iteration

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

311201

=

431102

Let’s re-run our matrix multiplication, but multiplying eachrecommendation by the recommender’s previous score.This is M(M~e) = M2~e.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 36: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 2: Iteration

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

311201

=

431102

Let’s re-run our matrix multiplication, but multiplying eachrecommendation by the recommender’s previous score.This is M(M~e) = M2~e.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 37: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 2: Iteration

So let’s iterate! Let ~π0 = ~e and let ~πk+1 = M~πk .

M1~e =

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

111111

=

311201

= ~π1

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 38: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 2: Iteration

So let’s iterate! Let ~π0 = ~e and let ~πk+1 = M~πk .

M1~e =

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

111111

=

311201

= ~π1

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 39: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 2: Iteration

So let’s iterate! Let ~π0 = ~e and let ~πk+1 = M~πk .

M2~e =

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

311201

=

431102

= ~π2

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 40: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 2: Iteration

So let’s iterate! Let ~π0 = ~e and let ~πk+1 = M~πk .

M2~e =

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

311201

=

431102

= ~π2

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 41: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 2: Iteration

So let’s iterate! Let ~π0 = ~e and let ~πk+1 = M~πk .

M3~e =

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

431102

=

543301

= ~π3

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 42: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 2: Iteration

So let’s iterate! Let ~π0 = ~e and let ~πk+1 = M~πk .

M3~e =

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

431102

=

543301

= ~π3

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 43: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 2: Iteration

So let’s iterate! Let ~π0 = ~e and let ~πk+1 = M~πk .

M4~e =

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

543301

=

1054403

= ~π4

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 44: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 2: Iteration

So let’s iterate! Let ~π0 = ~e and let ~πk+1 = M~πk .

M4~e =

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

543301

=

1054403

= ~π4

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 45: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 2: Iteration

So let’s iterate! Let ~π0 = ~e and let ~πk+1 = M~πk .

M5~e =

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

1054403

=

13105504

= ~π5

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 46: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 2: Iteration

So let’s iterate! Let ~π0 = ~e and let ~πk+1 = M~πk .

M5~e =

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

1054403

=

13105504

= ~π5

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 47: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 2: Iteration

So let’s iterate! Let ~π0 = ~e and let ~πk+1 = M~πk .

M6~e =

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

13105504

=

2013101005

= ~π6

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 48: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 2: Iteration

So let’s iterate! Let ~π0 = ~e and let ~πk+1 = M~πk .

M6~e =

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

13105504

=

2013101005

= ~π6

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 49: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 2: Iteration

So let’s iterate! Let ~π0 = ~e and let ~πk+1 = M~πk .

M7~e =

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

2013101005

=

33201313010

= ~π7

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 50: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 2: Iteration

So let’s iterate! Let ~π0 = ~e and let ~πk+1 = M~πk .

M7~e =

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

2013101005

=

33201313010

= ~π7

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 51: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 2: Iteration

So let’s iterate! Let ~π0 = ~e and let ~πk+1 = M~πk .

M8~e =

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

33201313010

=

46332020013

= ~π8

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 52: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 2: Iteration

So let’s iterate! Let ~π0 = ~e and let ~πk+1 = M~πk .

M8~e =

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

33201313010

=

46332020013

= ~π8

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 53: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problems with Model 2

Four problems:Where should we stop?When we iterate, the numbers get bigger and bigger.How do we know the rankings stabilize?A profligate recommender will have an undue influence onthe results.

Solution: when B recommends 3 people (A, C, and D), let’smake that worth 1

3 each instead of 1.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 54: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problems with Model 2

Four problems:Where should we stop?When we iterate, the numbers get bigger and bigger.How do we know the rankings stabilize?A profligate recommender will have an undue influence onthe results.

Solution: when B recommends 3 people (A, C, and D), let’smake that worth 1

3 each instead of 1.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 55: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problems with Model 2

Four problems:Where should we stop?When we iterate, the numbers get bigger and bigger.How do we know the rankings stabilize?A profligate recommender will have an undue influence onthe results.

Solution: when B recommends 3 people (A, C, and D), let’smake that worth 1

3 each instead of 1.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 56: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problems with Model 2

Four problems:Where should we stop?When we iterate, the numbers get bigger and bigger.How do we know the rankings stabilize?A profligate recommender will have an undue influence onthe results.

Solution: when B recommends 3 people (A, C, and D), let’smake that worth 1

3 each instead of 1.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 57: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problems with Model 2

Four problems:Where should we stop?When we iterate, the numbers get bigger and bigger.How do we know the rankings stabilize?A profligate recommender will have an undue influence onthe results.

Solution: when B recommends 3 people (A, C, and D), let’smake that worth 1

3 each instead of 1.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 58: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problems with Model 2

Four problems:Where should we stop?When we iterate, the numbers get bigger and bigger.How do we know the rankings stabilize?A profligate recommender will have an undue influence onthe results.

Solution: when B recommends 3 people (A, C, and D), let’smake that worth 1

3 each instead of 1.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 59: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

Scaling the columns to sum to 1 gives us a hyperlink matrix H:

H =

0 1 1 1 0 01 0 0 0 0 00 1 0 0 0 00 1 0 0 1 00 0 0 0 0 00 0 0 1 0 0

.

So let’s let ~π0 = ~e and let ~πk+1 = H~πk .

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 60: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

Scaling the columns to sum to 1 gives us a hyperlink matrix H:

H =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

.

So let’s let ~π0 = ~e and let ~πk+1 = H~πk .

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 61: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

Scaling the columns to sum to 1 gives us a hyperlink matrix H:

H =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

.

So let’s let ~π0 = ~e and let ~πk+1 = H~πk .

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 62: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π0 = H0 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

0

161616161616

=

0.170.170.170.170.170.17

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 63: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π1 = H1 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

1

161616161616

=

0.310.170.060.220.000.08

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 64: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π2 = H2 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

2

161616161616

=

0.220.310.060.060.000.11

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 65: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π3 = H3 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

3

161616161616

=

0.190.220.100.100.000.03

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 66: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π4 = H4 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

4

161616161616

=

0.230.190.070.070.000.05

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 67: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π5 = H5 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

5

161616161616

=

0.170.230.060.060.000.04

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 68: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π6 = H6 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

6

161616161616

=

0.170.170.080.080.000.03

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 69: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π7 = H7 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

7

161616161616

=

0.170.170.060.060.000.04

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 70: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π8 = H8 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

8

161616161616

=

0.140.170.060.060.000.03

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 71: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π9 = H9 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

9

161616161616

=

0.140.140.060.060.000.03

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 72: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π10 = H10 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

10

161616161616

=

0.130.140.050.050.000.03

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 73: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π11 = H11 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

11

161616161616

=

0.120.130.050.050.000.02

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 74: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π12 = H12 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

12

161616161616

=

0.110.120.040.040.000.02

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 75: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π13 = H13 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

13

161616161616

=

0.110.110.040.040.000.02

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 76: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π14 = H14 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

14

161616161616

=

0.100.110.040.040.000.02

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 77: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π15 = H15 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

15

161616161616

=

0.090.100.040.040.000.02

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 78: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π16 = H16 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

16

161616161616

=

0.090.090.030.030.000.02

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 79: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π17 = H17 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

17

161616161616

=

0.080.090.030.030.000.02

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 80: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π18 = H18 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

18

161616161616

=

0.060.080.030.030.000.02

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 81: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π19 = H19 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

19

161616161616

=

0.060.060.030.030.000.01

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 82: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π20 = H20 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

20

161616161616

=

0.050.060.020.020.000.01

A

B C

D E

F

Problem: All the scores go to zero!

Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 83: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 3: hyperlink matrix H

~π21 = H21 1n~e =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

21

161616161616

=

0.050.050.020.020.000.01

A

B C

D E

F

Problem: All the scores go to zero!Why? Because F is “dangling node.”

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 84: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Fix the dangling node

A dangling node like F recommends no other page.Instead, let’s “read” F as recommending everyone equally.

H =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

+

0 0 0 0 0 16

0 0 0 0 0 16

0 0 0 0 0 16

0 0 0 0 0 16

0 0 0 0 0 16

0 0 0 0 0 16

= H +16

111111

(

0 0 0 0 0 1)

= H +1n~e~aT .

where ~a is a vector whose 1’s indicate dangling nodes.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 85: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Fix the dangling node

A dangling node like F recommends no other page.Instead, let’s “read” F as recommending everyone equally.

S =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

+

0 0 0 0 0 16

0 0 0 0 0 16

0 0 0 0 0 16

0 0 0 0 0 16

0 0 0 0 0 16

0 0 0 0 0 16

= H +16

111111

(

0 0 0 0 0 1)

= H +1n~e~aT .

where ~a is a vector whose 1’s indicate dangling nodes.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 86: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Fix the dangling node

A dangling node like F recommends no other page.Instead, let’s “read” F as recommending everyone equally.

S =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

+

0 0 0 0 0 16

0 0 0 0 0 16

0 0 0 0 0 16

0 0 0 0 0 16

0 0 0 0 0 16

0 0 0 0 0 16

= H +16

111111

(

0 0 0 0 0 1)

= H +1n~e~aT .

where ~a is a vector whose 1’s indicate dangling nodes.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 87: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Fix the dangling node

A dangling node like F recommends no other page.Instead, let’s “read” F as recommending everyone equally.

S =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

+

0 0 0 0 0 16

0 0 0 0 0 16

0 0 0 0 0 16

0 0 0 0 0 16

0 0 0 0 0 16

0 0 0 0 0 16

= H +16

111111

(

0 0 0 0 0 1)

= H +1n~e~aT .

where ~a is a vector whose 1’s indicate dangling nodes.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 88: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Fix the dangling node

A dangling node like F recommends no other page.Instead, let’s “read” F as recommending everyone equally.

S =

0 13 1 1

2 0 01 0 0 0 0 00 1

3 0 0 0 00 1

3 0 0 1 00 0 0 0 0 00 0 0 1

2 0 0

+

0 0 0 0 0 16

0 0 0 0 0 16

0 0 0 0 0 16

0 0 0 0 0 16

0 0 0 0 0 16

0 0 0 0 0 16

= H +16

111111

(

0 0 0 0 0 1)

= H +1n~e~aT .

where ~a is a vector whose 1’s indicate dangling nodes.Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 89: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π0 = S0 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

0

161616161616

=

0.170.170.170.170.170.17

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 90: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π1 = S1 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

1

161616161616

=

0.330.190.080.250.030.11

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 91: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π2 = S2 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

2

161616161616

=

0.290.350.080.110.020.14

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 92: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π3 = S3 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

3

161616161616

=

0.280.320.140.160.020.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 93: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π4 = S4 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

4

161616161616

=

0.340.290.120.140.010.09

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 94: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π5 = S5 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

5

161616161616

=

0.300.360.110.130.020.09

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 95: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π6 = S6 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

6

161616161616

=

0.310.320.130.150.010.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 96: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π7 = S7 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

7

161616161616

=

0.330.320.120.130.010.09

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 97: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π8 = S8 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

8

161616161616

=

0.310.340.120.130.010.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 98: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π9 = S9 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

9

161616161616

=

0.320.320.130.140.010.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 99: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π10 = S10 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

10

161616161616

=

0.320.330.120.130.010.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 100: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π11 = S11 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

11

161616161616

=

0.310.330.120.140.010.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 101: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π12 = S12 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

12

161616161616

=

0.320.320.120.140.010.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 102: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π13 = S13 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

13

161616161616

=

0.320.330.120.140.010.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 103: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π14 = S14 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

14

161616161616

=

0.310.330.120.140.010.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 104: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π15 = S15 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

15

161616161616

=

0.320.330.120.140.010.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 105: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π16 = S16 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

16

161616161616

=

0.310.330.120.140.010.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 106: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π17 = S17 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

17

161616161616

=

0.310.330.120.140.010.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 107: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π18 = S18 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

18

161616161616

=

0.320.330.120.140.010.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 108: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π19 = S19 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

19

161616161616

=

0.310.330.120.140.010.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 109: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π20 = S20 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

20

161616161616

=

0.320.330.120.140.010.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 110: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π21 = S21 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

21

161616161616

=

0.320.330.120.140.010.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 111: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π22 = S22 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

22

161616161616

=

0.310.330.120.140.010.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 112: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π23 = S23 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

23

161616161616

=

0.320.330.120.140.010.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 113: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π24 = S24 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

24

161616161616

=

0.320.330.120.140.010.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 114: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π25 = S25 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

25

161616161616

=

0.320.330.120.140.010.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 115: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 4: S = H + 1n~e~aT

~π26 = S26 1n~e =

0 13 1 1

2 0 16

1 0 0 0 0 16

0 13 0 0 0 1

60 1

3 0 0 1 16

0 0 0 0 0 16

0 0 0 12 0 1

6

26

161616161616

=

0.320.330.120.140.010.08

A

B C

D E

F

Hooray!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 116: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π0 = S0 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

0

161616161616

=

0.170.170.170.170.170.17

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 117: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π1 = S1 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

1

161616161616

=

0.220.170.170.060.220.17

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 118: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π2 = S2 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

2

161616161616

=

0.110.220.170.060.220.22

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 119: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π3 = S3 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

3

161616161616

=

0.110.110.220.060.280.22

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 120: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π4 = S4 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

4

161616161616

=

0.130.110.110.070.300.28

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 121: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π5 = S5 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

5

161616161616

=

0.110.130.110.040.310.30

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 122: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π6 = S6 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

6

161616161616

=

0.070.110.130.040.330.31

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 123: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π7 = S7 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

7

161616161616

=

0.080.070.110.040.360.33

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 124: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π8 = S8 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

8

161616161616

=

0.080.080.070.040.370.36

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 125: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π9 = S9 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

9

161616161616

=

0.060.080.080.020.380.37

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 126: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π10 = S10 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

10

161616161616

=

0.050.060.080.030.400.38

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 127: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π11 = S11 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

11

161616161616

=

0.050.050.060.030.410.40

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 128: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π12 = S12 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

12

161616161616

=

0.050.050.050.020.420.41

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 129: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π13 = S13 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

13

161616161616

=

0.040.050.050.020.430.42

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 130: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π14 = S14 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

14

161616161616

=

0.030.040.050.020.440.43

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 131: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π15 = S15 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

15

161616161616

=

0.030.030.040.020.440.44

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 132: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π16 = S16 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

16

161616161616

=

0.030.030.030.010.450.44

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 133: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π17 = S17 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

17

161616161616

=

0.020.030.030.010.450.45

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 134: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π18 = S18 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

18

161616161616

=

0.020.020.030.010.460.45

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 135: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π19 = S19 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

19

161616161616

=

0.020.020.020.010.460.46

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 136: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π20 = S20 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

20

161616161616

=

0.020.020.020.010.470.46

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 137: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π21 = S21 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

21

161616161616

=

0.020.020.020.010.470.47

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 138: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π22 = S22 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

22

161616161616

=

0.010.020.020.010.470.47

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 139: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π23 = S23 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

23

161616161616

=

0.010.010.020.010.480.47

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 140: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π24 = S24 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

24

161616161616

=

0.010.010.010.010.480.48

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 141: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π25 = S25 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

25

161616161616

=

0.010.010.010.000.480.48

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 142: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π26 = S26 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

26

161616161616

=

0.010.010.010.000.480.48

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 143: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π27 = S27 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

27

161616161616

=

0.010.010.010.000.490.48

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 144: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π28 = S28 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

28

161616161616

=

0.010.010.010.000.490.49

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 145: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π29 = S29 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

29

161616161616

=

0.010.010.010.000.490.49

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 146: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π30 = S30 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

30

161616161616

=

0.010.010.010.000.490.49

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 147: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π31 = S31 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

31

161616161616

=

0.000.010.010.000.490.49

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 148: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π32 = S32 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

32

161616161616

=

0.000.000.010.000.490.49

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 149: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π33 = S33 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

33

161616161616

=

0.000.000.000.000.490.49

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 150: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π34 = S34 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

34

161616161616

=

0.000.000.000.000.490.49

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 151: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π35 = S35 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

35

161616161616

=

0.000.000.000.000.490.49

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 152: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π36 = S36 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

36

161616161616

=

0.000.000.000.000.490.49

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 153: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π37 = S37 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

37

161616161616

=

0.000.000.000.000.500.49

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 154: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π38 = S38 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

38

161616161616

=

0.000.000.000.000.500.50

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 155: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Problem with Model 4: cycles

~π39 = S39 1n~e =

0 0 13 1 0 0

1 0 0 0 0 00 1 0 0 0 00 0 1

3 0 0 00 0 1

3 0 0 10 0 0 0 1 0

39

161616161616

=

0.000.000.000.000.500.50

A

B

C

D

E F

Bother.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 156: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

The Problem

A

B

C

D

E F

In this graph, E and F form a cycle. Neither is dangling,but together they absorb all the prestige.The fix: give some probability of “teleporting” from allnodes, not just dangling ones.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 157: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

The Problem

A

B

C

D

E F

In this graph, E and F form a cycle. Neither is dangling,but together they absorb all the prestige.The fix: give some probability of “teleporting” from allnodes, not just dangling ones.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 158: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

The Solution: The Google Matrix

At each turn, every node will share only α ≈ 85% of itsprestige equally among the pages it links to, as it didbefore.The remaining 1− α ≈ 15% of its prestige is sharedequally among all nodes, just like with dangling nodes.This gives us a new matrix, the “Google matrix”

G = αS + (1− α)

(1n~e~eT

).

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 159: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

The Solution: The Google Matrix

At each turn, every node will share only α ≈ 85% of itsprestige equally among the pages it links to, as it didbefore.The remaining 1− α ≈ 15% of its prestige is sharedequally among all nodes, just like with dangling nodes.This gives us a new matrix, the “Google matrix”

G = αS + (1− α)

(1n~e~eT

).

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 160: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

The Solution: The Google Matrix

At each turn, every node will share only α ≈ 85% of itsprestige equally among the pages it links to, as it didbefore.The remaining 1− α ≈ 15% of its prestige is sharedequally among all nodes, just like with dangling nodes.This gives us a new matrix, the “Google matrix”

G = αS + (1− α)

(1n~e~eT

).

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 161: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 5: G = αS + (1− α)1n~e~eT , α = 0.85

~π0 = G0 1n~e =

0.170.170.170.170.170.17

A

B

C

D

E F

Splendid! It converged fast, too!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 162: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 5: G = αS + (1− α)1n~e~eT , α = 0.85

~π1 = G1 1n~e =

0.210.170.170.070.210.17

A

B

C

D

E F

Splendid! It converged fast, too!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 163: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 5: G = αS + (1− α)1n~e~eT , α = 0.85

~π2 = G2 1n~e =

0.130.210.170.070.210.21

A

B

C

D

E F

Splendid! It converged fast, too!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 164: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 5: G = αS + (1− α)1n~e~eT , α = 0.85

~π3 = G3 1n~e =

0.130.140.200.070.250.21

A

B

C

D

E F

Splendid! It converged fast, too!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 165: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 5: G = αS + (1− α)1n~e~eT , α = 0.85

~π4 = G4 1n~e =

0.140.140.140.080.260.24

A

B

C

D

E F

Splendid! It converged fast, too!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 166: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 5: G = αS + (1− α)1n~e~eT , α = 0.85

~π5 = G5 1n~e =

0.140.150.140.070.270.24

A

B

C

D

E F

Splendid! It converged fast, too!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 167: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 5: G = αS + (1− α)1n~e~eT , α = 0.85

~π6 = G6 1n~e =

0.120.140.150.070.270.25

A

B

C

D

E F

Splendid! It converged fast, too!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 168: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 5: G = αS + (1− α)1n~e~eT , α = 0.85

~π7 = G7 1n~e =

0.120.130.140.070.280.26

A

B

C

D

E F

Splendid! It converged fast, too!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 169: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 5: G = αS + (1− α)1n~e~eT , α = 0.85

~π8 = G8 1n~e =

0.120.130.130.070.280.26

A

B

C

D

E F

Splendid! It converged fast, too!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 170: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 5: G = αS + (1− α)1n~e~eT , α = 0.85

~π9 = G9 1n~e =

0.120.130.140.060.290.27

A

B

C

D

E F

Splendid! It converged fast, too!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 171: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 5: G = αS + (1− α)1n~e~eT , α = 0.85

~π10 = G10 1n~e =

0.120.130.140.060.290.27

A

B

C

D

E F

Splendid! It converged fast, too!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 172: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 5: G = αS + (1− α)1n~e~eT , α = 0.85

~π11 = G11 1n~e =

0.120.120.130.060.290.27

A

B

C

D

E F

Splendid! It converged fast, too!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 173: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 5: G = αS + (1− α)1n~e~eT , α = 0.85

~π12 = G12 1n~e =

0.120.120.130.060.290.27

A

B

C

D

E F

Splendid! It converged fast, too!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 174: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 5: G = αS + (1− α)1n~e~eT , α = 0.85

~π13 = G13 1n~e =

0.120.120.130.060.290.27

A

B

C

D

E F

Splendid! It converged fast, too!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 175: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 5: G = αS + (1− α)1n~e~eT , α = 0.85

~π14 = G14 1n~e =

0.110.120.130.060.300.28

A

B

C

D

E F

Splendid! It converged fast, too!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 176: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 5: G = αS + (1− α)1n~e~eT , α = 0.85

~π15 = G15 1n~e =

0.110.120.130.060.300.28

A

B

C

D

E F

Splendid! It converged fast, too!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 177: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 5: G = αS + (1− α)1n~e~eT , α = 0.85

~π16 = G16 1n~e =

0.110.120.130.060.300.28

A

B

C

D

E F

Splendid! It converged fast, too!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 178: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 5: G = αS + (1− α)1n~e~eT , α = 0.85

~π17 = G17 1n~e =

0.110.120.130.060.300.28

A

B

C

D

E F

Splendid! It converged fast, too!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 179: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 5: G = αS + (1− α)1n~e~eT , α = 0.85

~π18 = G18 1n~e =

0.110.120.130.060.300.28

A

B

C

D

E F

Splendid! It converged fast, too!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 180: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Model 5: G = αS + (1− α)1n~e~eT , α = 0.85

~π19 = G19 1n~e =

0.110.120.130.060.300.28

A

B

C

D

E F

Splendid! It converged fast, too!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 181: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Markov Chains

G = αS + (1− α)1n~e~eT .

This situation is called a Markov chain. Notice thatG is square, every entry of G is nonnegative, and eachcolumn of G sums to 1. So G is stochastic.Every entry of G is greater than zero. This implies that G isregular.

Regular Markov Chain Theorem

If G is a regular stochastic matrix, then there exists a limitingvector ~π such that

limk→∞

Gk~x = ~π

for all starting vectors ~x .

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 182: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Markov Chains

G = αS + (1− α)1n~e~eT .

This situation is called a Markov chain. Notice thatG is square, every entry of G is nonnegative, and eachcolumn of G sums to 1. So G is stochastic.Every entry of G is greater than zero. This implies that G isregular.

Regular Markov Chain Theorem

If G is a regular stochastic matrix, then there exists a limitingvector ~π such that

limk→∞

Gk~x = ~π

for all starting vectors ~x .

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 183: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Markov Chains

G = αS + (1− α)1n~e~eT .

This situation is called a Markov chain. Notice thatG is square, every entry of G is nonnegative, and eachcolumn of G sums to 1. So G is stochastic.Every entry of G is greater than zero. This implies that G isregular.

Regular Markov Chain Theorem

If G is a regular stochastic matrix, then there exists a limitingvector ~π such that

limk→∞

Gk~x = ~π

for all starting vectors ~x .

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 184: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Markov Chains

G = αS + (1− α)1n~e~eT .

This situation is called a Markov chain. Notice thatG is square, every entry of G is nonnegative, and eachcolumn of G sums to 1. So G is stochastic.Every entry of G is greater than zero. This implies that G isregular.

Regular Markov Chain Theorem

If G is a regular stochastic matrix, then there exists a limitingvector ~π such that

limk→∞

Gk~x = ~π

for all starting vectors ~x .

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 185: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Eigenvectors

Another way to phrase this is in terms of eigenvectors.

Definition

Let G be a square matrix. If ~x is a vector and λ a scalar suchthat G~x = λ~x , we say λ is an eigenvalue and ~x aneigenvector for G.

FactsIn a regular stochastic matrix like G,

G~π = ~π

So the limiting vector ~π is an eigenvector with eigenvalue 1.In fact, 1 is the largest eigenvalue of G.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 186: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Eigenvectors

Another way to phrase this is in terms of eigenvectors.

Definition

Let G be a square matrix. If ~x is a vector and λ a scalar suchthat G~x = λ~x , we say λ is an eigenvalue and ~x aneigenvector for G.

FactsIn a regular stochastic matrix like G,

G~π = ~π

So the limiting vector ~π is an eigenvector with eigenvalue 1.In fact, 1 is the largest eigenvalue of G.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 187: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Eigenvectors

Another way to phrase this is in terms of eigenvectors.

Definition

Let G be a square matrix. If ~x is a vector and λ a scalar suchthat G~x = λ~x , we say λ is an eigenvalue and ~x aneigenvector for G.

FactsIn a regular stochastic matrix like G,

G~π = ~π

So the limiting vector ~π is an eigenvector with eigenvalue 1.In fact, 1 is the largest eigenvalue of G.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 188: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Computation

The matrix G is dense—all positive entries.Naively, multiplying G~π requires O(n2) operations.However, we never actually need G to compute:

~πk+1 = G~πk

=

(αS + (1− α)

1n~e~eT

)~πk

=

(αH +

1n~e~aT +

1− α

n~e~eT

)~πk

= αH~πk +1n~e~aT~πk +

1− α

n~e~eT~πk

= αH~πk +1n

(~a · ~πk + 1− α)~e

Since H is an extremely sparse matrix, this can be done inO(n) time!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 189: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Computation

The matrix G is dense—all positive entries.Naively, multiplying G~π requires O(n2) operations.However, we never actually need G to compute:

~πk+1 = G~πk

=

(αS + (1− α)

1n~e~eT

)~πk

=

(αH +

1n~e~aT +

1− α

n~e~eT

)~πk

= αH~πk +1n~e~aT~πk +

1− α

n~e~eT~πk

= αH~πk +1n

(~a · ~πk + 1− α)~e

Since H is an extremely sparse matrix, this can be done inO(n) time!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 190: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Computation

The matrix G is dense—all positive entries.Naively, multiplying G~π requires O(n2) operations.However, we never actually need G to compute:

~πk+1 = G~πk

=

(αS + (1− α)

1n~e~eT

)~πk

=

(αH +

1n~e~aT +

1− α

n~e~eT

)~πk

= αH~πk +1n~e~aT~πk +

1− α

n~e~eT~πk

= αH~πk +1n

(~a · ~πk + 1− α)~e

Since H is an extremely sparse matrix, this can be done inO(n) time!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 191: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Computation

The matrix G is dense—all positive entries.Naively, multiplying G~π requires O(n2) operations.However, we never actually need G to compute:

~πk+1 = G~πk

=

(αS + (1− α)

1n~e~eT

)~πk

=

(αH +

1n~e~aT +

1− α

n~e~eT

)~πk

= αH~πk +1n~e~aT~πk +

1− α

n~e~eT~πk

= αH~πk +1n

(~a · ~πk + 1− α)~e

Since H is an extremely sparse matrix, this can be done inO(n) time!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 192: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Computation

The matrix G is dense—all positive entries.Naively, multiplying G~π requires O(n2) operations.However, we never actually need G to compute:

~πk+1 = G~πk

=

(αS + (1− α)

1n~e~eT

)~πk

=

(αH +

1n~e~aT +

1− α

n~e~eT

)~πk

= αH~πk +1n~e~aT~πk +

1− α

n~e~eT~πk

= αH~πk +1n

(~a · ~πk + 1− α)~e

Since H is an extremely sparse matrix, this can be done inO(n) time!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 193: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Computation

The matrix G is dense—all positive entries.Naively, multiplying G~π requires O(n2) operations.However, we never actually need G to compute:

~πk+1 = G~πk

=

(αS + (1− α)

1n~e~eT

)~πk

=

(αH +

1n~e~aT +

1− α

n~e~eT

)~πk

= αH~πk +1n~e~aT~πk +

1− α

n~e~eT~πk

= αH~πk +1n

(~a · ~πk + 1− α)~e

Since H is an extremely sparse matrix, this can be done inO(n) time!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 194: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Computation

The matrix G is dense—all positive entries.Naively, multiplying G~π requires O(n2) operations.However, we never actually need G to compute:

~πk+1 = G~πk

=

(αS + (1− α)

1n~e~eT

)~πk

=

(αH +

1n~e~aT +

1− α

n~e~eT

)~πk

= αH~πk +1n~e~aT~πk +

1− α

n~e~eT~πk

= αH~πk +1n

(~a · ~πk + 1− α)~e

Since H is an extremely sparse matrix, this can be done inO(n) time!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 195: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Five ModelsMarkov Chains

Computation

The matrix G is dense—all positive entries.Naively, multiplying G~π requires O(n2) operations.However, we never actually need G to compute:

~πk+1 = G~πk

=

(αS + (1− α)

1n~e~eT

)~πk

=

(αH +

1n~e~aT +

1− α

n~e~eT

)~πk

= αH~πk +1n~e~aT~πk +

1− α

n~e~eT~πk

= αH~πk +1n

(~a · ~πk + 1− α)~e

Since H is an extremely sparse matrix, this can be done inO(n) time!

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 196: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

The α factor

G = αS + (1− α)1n~e~eT .

The α factor needs to be carefully chosen!The higher α is, the more you’re using the hyperlinkstructure of the Web.So if α ≈ 0, your ranking will be nearly useless.But if α ≈ 1, your answer will

Converge more slowly, andFluctuate rapidly as links are created or destroyed.

Google has found α ≈ 0.85 to work well.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 197: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

The α factor

G = αS + (1− α)1n~e~eT .

The α factor needs to be carefully chosen!The higher α is, the more you’re using the hyperlinkstructure of the Web.So if α ≈ 0, your ranking will be nearly useless.But if α ≈ 1, your answer will

Converge more slowly, andFluctuate rapidly as links are created or destroyed.

Google has found α ≈ 0.85 to work well.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 198: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

The α factor

G = αS + (1− α)1n~e~eT .

The α factor needs to be carefully chosen!The higher α is, the more you’re using the hyperlinkstructure of the Web.So if α ≈ 0, your ranking will be nearly useless.But if α ≈ 1, your answer will

Converge more slowly, andFluctuate rapidly as links are created or destroyed.

Google has found α ≈ 0.85 to work well.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 199: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

The α factor

G = αS + (1− α)1n~e~eT .

The α factor needs to be carefully chosen!The higher α is, the more you’re using the hyperlinkstructure of the Web.So if α ≈ 0, your ranking will be nearly useless.But if α ≈ 1, your answer will

Converge more slowly, andFluctuate rapidly as links are created or destroyed.

Google has found α ≈ 0.85 to work well.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 200: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

The α factor

G = αS + (1− α)1n~e~eT .

The α factor needs to be carefully chosen!The higher α is, the more you’re using the hyperlinkstructure of the Web.So if α ≈ 0, your ranking will be nearly useless.But if α ≈ 1, your answer will

Converge more slowly, andFluctuate rapidly as links are created or destroyed.

Google has found α ≈ 0.85 to work well.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 201: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

The α factor

G = αS + (1− α)1n~e~eT .

The α factor needs to be carefully chosen!The higher α is, the more you’re using the hyperlinkstructure of the Web.So if α ≈ 0, your ranking will be nearly useless.But if α ≈ 1, your answer will

Converge more slowly, andFluctuate rapidly as links are created or destroyed.

Google has found α ≈ 0.85 to work well.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 202: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

The α factor

G = αS + (1− α)1n~e~eT .

The α factor needs to be carefully chosen!The higher α is, the more you’re using the hyperlinkstructure of the Web.So if α ≈ 0, your ranking will be nearly useless.But if α ≈ 1, your answer will

Converge more slowly, andFluctuate rapidly as links are created or destroyed.

Google has found α ≈ 0.85 to work well.

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 203: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Secrets

Everything I’ve said been made public by Google’s founders,Larry Page and Sergey Brin. But much of Google’s powercomes from other factors:

“Relevance” scoreGeographic targetingPersonalizationPlenty of secrets

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 204: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Problems with Google

Is popularity really the best way to rank websites?Example: Netflix search engineExample: Jim JefsonExample: Anders Hendrickson

Google bombsExample: French military victories

Link farmsGoogle as “gatekeeper” to the web

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 205: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Problems with Google

Is popularity really the best way to rank websites?Example: Netflix search engineExample: Jim JefsonExample: Anders Hendrickson

Google bombsExample: French military victories

Link farmsGoogle as “gatekeeper” to the web

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 206: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Problems with Google

Is popularity really the best way to rank websites?Example: Netflix search engineExample: Jim JefsonExample: Anders Hendrickson

Google bombsExample: French military victories

Link farmsGoogle as “gatekeeper” to the web

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 207: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Problems with Google

Is popularity really the best way to rank websites?Example: Netflix search engineExample: Jim JefsonExample: Anders Hendrickson

Google bombsExample: French military victories

Link farmsGoogle as “gatekeeper” to the web

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 208: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Problems with Google

Is popularity really the best way to rank websites?Example: Netflix search engineExample: Jim JefsonExample: Anders Hendrickson

Google bombsExample: French military victories

Link farmsGoogle as “gatekeeper” to the web

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 209: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Problems with Google

Is popularity really the best way to rank websites?Example: Netflix search engineExample: Jim JefsonExample: Anders Hendrickson

Google bombsExample: French military victories

Link farmsGoogle as “gatekeeper” to the web

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 210: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Problems with Google

Is popularity really the best way to rank websites?Example: Netflix search engineExample: Jim JefsonExample: Anders Hendrickson

Google bombsExample: French military victories

Link farmsGoogle as “gatekeeper” to the web

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 211: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Problems with Google

Is popularity really the best way to rank websites?Example: Netflix search engineExample: Jim JefsonExample: Anders Hendrickson

Google bombsExample: French military victories

Link farmsGoogle as “gatekeeper” to the web

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 212: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

The ProblemGoogle Models

Further thoughts

Questions?

Anders O. F. Hendrickson Google’s PageRank Algorithm

Page 213: faculty.cord.edufaculty.cord.edu/ahendric/colloquium/Google.pdf · The Problem Google Models Further thoughts Too many results Principles behind a solution How do you organize a computer

References

Google’s PageRank and Beyond: The Science of SearchEngine Rankings, by Amy N. Langville and Carl D. Meyer.(2006)Java applets by Tim Chartierhttp://mathdl.maa.org/images/upload_library/23/chartier/google-opoly/index.html

Sergey Brin, Lawrence Page, R. Motwami, and TerryWinograd. The PageRank citation ranking: Bringing orderto the Web. Technical Report 1999-0120, ComputerScience Department, Stanford University, 1999.

Anders O. F. Hendrickson Google’s PageRank Algorithm