Two Unrelated Talks

Local computation of PageRank: the ranking side
• Introduction
• Motivations
• Local ranking in theory
• Local ranking in practice
• Conclusions

psort, yet another fast stable external sorting software
• Introduction
• Making sorting a complicated task
• Inside psort
• Conclusions

Conclusions

Two unrelated talks

MARCO BRESSAN

January 30, 2012

Outline

1. Local computation of PageRank: the ranking side
   • Introduction
   • Motivations
   • Local ranking in theory
   • Local ranking in practice
   • Conclusions

2. psort, yet another fast stable external sorting software
   • Introduction
   • Making sorting a complicated task
   • Inside psort
   • Conclusions

3. Conclusions

Local computation of PageRank: the ranking side

Ranking robustly

Rank a graph’s nodes

1. the graph
2. external factors
   • (varying) parameters
   • graph availability
   • . . .

Is ranking robust?

How is ranking influenced by external factors?

PageRank

[Figure: node u linking to node v.]

PageRank of node v:

$$P(v) = \alpha \sum_{u \to v} \frac{P(u)}{o(u)} + \frac{1-\alpha}{n}$$

where o(u) is the out-degree of u, n = |G|, and α is the damping factor.

Applications: web search, web crawling, web spam detection, personalized web search, social network mining, ranking in databases, structural re-ranking, opinion mining, word sense disambiguation, credit and reputation systems, bibliometrics, gene ranking, . . .

Among top data mining algorithms: Wu et al. Top 10 algorithms in data mining. Knowl. and Inform. Systems, 2007.
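The PageRank recurrence can be evaluated by simple fixed-point iteration. The following is a minimal sketch, not the method used in the talk: the function name `pagerank`, the toy graph, and the assumption that every node has at least one out-link (no dangling nodes) are all illustrative choices.

```python
def pagerank(out_links, alpha=0.85, iterations=100):
    """Iterate P(v) = alpha * sum_{u->v} P(u)/o(u) + (1 - alpha)/n.

    out_links maps every node to the list of nodes it links to.
    Assumption: every node has at least one out-link, so the
    scores keep summing to 1 at every iteration.
    """
    nodes = list(out_links)
    n = len(nodes)
    scores = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every node starts from the uniform term (1 - alpha)/n ...
        new_scores = {v: (1.0 - alpha) / n for v in nodes}
        # ... and receives alpha * P(u)/o(u) from each in-neighbour u.
        for u, targets in out_links.items():
            share = alpha * scores[u] / len(targets)
            for v in targets:
                new_scores[v] += share
        scores = new_scores
    return scores


# Hypothetical toy graph: a -> b, a -> c, b -> c, c -> a.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
scores = pagerank(graph)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Only the order in `ranking` is needed by the applications that care about the rank rather than the score.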

Choose the damping, choose the ranking?

$$P(v) = \alpha \sum_{u \to v} \frac{P(u)}{o(u)} + \frac{1-\alpha}{n}$$

Is PageRank’s ranking robust to small variations in α?

Results
1. not robust in theory (permutation theorem, reversal theorem)
2. novel tools for checking robustness (lineage analysis)
3. somewhat robust in real-world graphs (experiments)

Marco Bressan, Enoch Peserico. Choose the damping, choose the ranking? J. Discrete Algorithms 8(2): 199-213 (2010)

Marco Bressan, Enoch Peserico. Choose the damping, choose the ranking? Proc. of WAW 2009: 76-89

Is it possible to compute the rank locally?

[Figure: local computation around nodes u and v, next to a ranking of five nodes with scores 0.3, 0.25, 0.2, 0.15 and 0.1, labelled 1st to 5th.]

In many applications only the rank matters!

Is it possible to compute the rank locally?

• stated by Chen et al. (CIKM 2004)
• restated by Bar-Yossef and Mashiach (CIKM 2008)

Motivating examples (I): crawling

The visited graph expands starting from seed nodes.

Which red nodes should be visited now? And in what order?

Order the nodes with PageRank!

Cho et al. Efficient crawling through URL ordering. Computer Networks, 1998.

Is it possible to rank the red frontier for a low cost, without visiting the whole crawled graph?

Motivating examples (II): ranking with competitors

Retrieve the graph structure using e.g. Google’s link: operator.

Bar-Yossef and Mashiach. Local approximation of PageRank and reverse PageRank. Proc. ACM CIKM, 2008.

Is it possible to compute this rank efficiently, using few queries?

Motivating examples (III): social network mining

Rank key users in social networks

Heidemann et al. Identifying key users in online social networks: A PageRank based approach. Proc. ICIS, 2010.

Full graph not available (privacy settings).

Is it still possible to guarantee the correctness of the output ranking?

Formal definition of the problem

Input

• graph G of size n

• target nodes v1, . . . , vk

• score separation ε > 0

Output

• ranking of v1, v2, . . . , vk

If $(1-\varepsilon) < \frac{P(v_i)}{P(v_j)} < (1+\varepsilon)$, any ranking of $v_i, v_j$ is valid.

Cost model
• computation for free
• but visiting G costs (query to link server)

cost of ranking = |queries| = |nodes visited|
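The validity rule and the cost model above can be sketched as follows. This is an illustration under stated assumptions: `is_valid_ranking` and `LinkServer` are hypothetical names, the server is assumed to return in-links, and each pair is normalised so that the larger score is the numerator (one possible reading of the ε condition).

```python
from itertools import combinations


def is_valid_ranking(ranking, scores, eps):
    """Return True if `ranking` (best first) respects every eps-separated pair."""
    position = {v: i for i, v in enumerate(ranking)}
    for vi, vj in combinations(ranking, 2):
        # Put the larger score first (assumption: scores are positive).
        hi, lo = (vi, vj) if scores[vi] >= scores[vj] else (vj, vi)
        if scores[hi] / scores[lo] < 1 + eps:
            continue  # scores too close: either order is accepted
        if position[hi] > position[lo]:
            return False  # a clearly better node was ranked lower
    return True


class LinkServer:
    """Hypothetical link server: answers in-link queries and counts the cost."""

    def __init__(self, in_links):
        self._in_links = in_links
        self.visited = set()

    def query(self, node):
        self.visited.add(node)  # cost = number of distinct nodes visited
        return self._in_links.get(node, [])

    @property
    def cost(self):
        return len(self.visited)


# Hypothetical scores for three target nodes.
scores = {"v1": 0.30, "v2": 0.29, "v3": 0.10}
server = LinkServer({"v1": ["v2", "v3"]})
server.query("v1")
server.query("v1")
server.query("v2")
```

With ε = 0.1, v1 and v2 may appear in either order, while v3 must come last; the server above reports a cost of 2 because computation is free and only distinct visited nodes are charged.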

Is it possible to compute the rank locally?

Our contribution: NO!

NO in theory: lower bounds

1. Every deterministic local ranking algorithm has an adversarial graph forcing Ω(n) queries (and this can be tightened)
2. Every randomized local ranking algorithm has an adversarial graph forcing Ω(n) queries

even to rank the top k nodes, even if their scores are highly separated!

⟹ a general low-cost local ranking algorithm does not exist

NO in practice: experimental results

1. real web/social graphs behave like worst-case input instances for local ranking
2. approximating is not trivial: state-of-the-art local score approximation algorithms do not turn into low-cost local rank approximation algorithms

Lower bounds (I): deterministic algorithms

Every deterministic algorithm has an adversarial graph forcing cost $\Omega(n)$, tightened to $n(1 - O(\varepsilon k))$.

Theorem 1 (paper Thm. 4)

Choose integers $k > 1$ and $n_0 \ge k^2$, a damping factor $\alpha \in (0, 1)$, and $\varepsilon \le \frac{\alpha^2}{20k}$. For any deterministic local algorithm A there exists a graph of size $n \in \Theta(n_0)$ where the top k nodes $v_0, \ldots, v_{k-1}$ are ε-separated and, to compute their relative ranking according to $P_\alpha(\cdot)$, algorithm A performs $\Omega(n)$ queries; more precisely, $n(1 - O(\varepsilon k))$ queries.

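Lower bounds (II): randomized algorithms

Every randomized (Las Vegas or Monte Carlo) algorithm has an adversarial graph forcing cost $\Omega\left(\alpha\sqrt{n/\varepsilon}\right)$, up to $\Omega(n)$.

[Figure: a randomized algorithm A queries a link server to rank nodes v1, v2, ..., v20 of a graph G with 10^9 nodes and outputs a ranking such as [v3 v10 ... v7]; the costs shown are about 10^4.5 and 10^8 queries.]

Theorem 2 (paper Thm. 3)

Choose $k > 1$, $n_0 \ge 6k^3$, a damping factor $\alpha \in (0, 1)$, and $\varepsilon \in \left[\frac{\alpha^2 k^2}{4 n_0}, \frac{\alpha^2}{24k}\right]$. Then

1. for any Las Vegas local algorithm A
2. for any Monte Carlo local algorithm A with constant confidence

there exists a graph of size $n \in \Theta(n_0)$ where the top k nodes $v_0, \ldots, v_{k-1}$ are ε-separated and, to compute their relative ranking, A performs in expectation $\Omega\left(\alpha\sqrt{n/\varepsilon}\right)$ queries.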
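As a worked check of how the randomized bound scales, the following evaluates it at the two ends of the admissible range for ε. It assumes that the bound of Theorem 2 reads $\Omega(\alpha\sqrt{n/\varepsilon})$, that ε ranges over $[\alpha^2 k^2/(4n_0),\ \alpha^2/(24k)]$, and that $n \approx n_0$; all three are readings of the slide, not statements taken from the paper.

```latex
\begin{align*}
\varepsilon = \frac{\alpha^2}{24k}
  &\;\Longrightarrow\;
  \alpha\sqrt{\frac{n}{\varepsilon}}
  = \alpha\sqrt{\frac{24kn}{\alpha^2}}
  = \sqrt{24kn}, \\
\varepsilon = \frac{\alpha^2 k^2}{4n}
  &\;\Longrightarrow\;
  \alpha\sqrt{\frac{n}{\varepsilon}}
  = \alpha\sqrt{\frac{4n^2}{\alpha^2 k^2}}
  = \frac{2n}{k}.
\end{align*}
```

Under these assumptions the cost grows from order $\sqrt{kn}$ for well-separated scores to order $n/k$ for barely separated ones. With $n = 10^9$ and $k = 20$, the second expression gives $2n/k = 10^8$, in line with the 10^8 queries quoted in the example.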

What happens in practice?

Two experiments

1. Hardness of real-world graphs
   Compute the minimal number of nodes that an algorithm must visit to always guarantee a correct ranking.

2. Performance of approximation algorithms
   Evaluate cost and accuracy of local ranking algorithms derived from state-of-the-art local score approximation algorithms.

Datasets

              nodes   arcs    crawled
.it           40M     1150M   2004
LiveJournal   5M      79M     2008

Publicly available from LAW, Univ. Milan: http://law.dsi.unimi.it

Exp. 1: hardness of real-world graphs (1/2)

Breakdown of a local ranking algorithm

1. Visit ancestors
   Thm.: must visit at least |minset(G, u, v)| ancestors

2. Compute ranking
   Thm.: must agree with natural PageRank score approximation

|minset(G, u, v)| ≤ cost of ranking u, v in graph G

Exp. 1: hardness of real-world graphs (2/2)

[Figure: average number of visited nodes (log scale, 10^3 to 10^7) versus ε (0.01 to 2.56), for the .it web graph and the LiveJournal graph.]

Exp. 2: performance of approximation algorithms

Improved variant of the pruned brute-force algorithm: limit PageRank computation to ancestors giving a high contribution.

[Figure: ancestors of node v labelled with their contributions (35%, 24%, 17%, 10%); ancestors contributing less than 10% are pruned. Pruning threshold = 10%.]
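The pruning idea can be sketched as a backward exploration from the target node that stops expanding ancestors whose contribution falls below the threshold. This is a simplified illustration, not the improved variant evaluated in the talk: the name `pruned_local_score`, the use of an absolute threshold on per-depth path weight (rather than a percentage of the score), and the toy graph are all assumptions.

```python
def pruned_local_score(target, in_links, out_degree, n,
                       alpha=0.85, threshold=0.1, max_depth=20):
    """Approximate P(target) from ancestors whose influence is >= threshold.

    Uses P(v) = (1 - alpha)/n * sum over paths u ~> v of prod(alpha / o(w)),
    accumulating path weight depth by depth and dropping weak ancestors.
    Returns (approximate score, number of distinct nodes queried).
    """
    score = 0.0
    queried = {target}
    frontier = {target: 1.0}  # node -> summed weight of kept paths to target
    for _ in range(max_depth + 1):
        if not frontier:
            break
        score += (1.0 - alpha) / n * sum(frontier.values())
        candidates = {}
        for node, weight in frontier.items():
            for parent in in_links.get(node, []):
                queried.add(parent)  # reading o(parent) costs one query
                share = weight * alpha / out_degree[parent]
                candidates[parent] = candidates.get(parent, 0.0) + share
        # Pruning step: only ancestors with enough influence are expanded.
        frontier = {u: w for u, w in candidates.items() if w >= threshold}
    return score, len(queried)


# Hypothetical toy graph: a and b link to v, c links to a, d links to b.
in_links = {"v": ["a", "b"], "a": ["c"], "b": ["d"]}
out_degree = {"a": 1, "b": 10, "c": 1, "d": 1}
pruned = pruned_local_score("v", in_links, out_degree, n=100,
                            alpha=0.5, threshold=0.1)
full = pruned_local_score("v", in_links, out_degree, n=100,
                          alpha=0.5, threshold=0.0)
```

In this toy run the pruned exploration never visits d, so it saves one query at the price of a slightly lower score; a higher threshold lowers the cost but can reorder nodes whose scores are close.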

Exp. 2: performance of approximation algorithms

.it web graph

[Figure: average cost (log scale, 10^3 to 10^6) versus pruning threshold (10^-7 to 10^-1), one curve per interval: (0.01,0.02), (0.02,0.04), (0.04,0.08), (0.08,0.16), (0.16,0.32), (0.32,0.64), (0.64,1.28), (1.28,2.56), (2.56,5.12).]

Exp. 2: performance of approximation algorithms

LiveJournal graph

[Figure: average cost (log scale, 10^3 to 10^6) versus pruning threshold (10^-7 to 10^-1), one curve per interval: (0.01,0.02), (0.02,0.04), (0.04,0.08), (0.08,0.16), (0.16,0.32), (0.32,0.64), (0.64,1.28), (1.28,2.56), (2.56,5.12).]

Exp. 2: performance of approximation algorithms

.it web graph

[Figure: fraction of correctly ranked node pairs (axis from -0.2 to 1) versus pruning threshold (10^-7 to 10^-1), one curve per interval: (0.01,0.02), (0.02,0.04), (0.04,0.08), (0.08,0.16), (0.16,0.32), (0.32,0.64), (0.64,1.28), (1.28,2.56), (2.56,5.12).]

Exp. 2: performance of approximation algorithms

LiveJournal graph

[Figure: fraction of correctly ranked node pairs (axis from -0.2 to 1) versus pruning threshold (10^-7 to 10^-1), one curve per interval: (0.01,0.02), (0.02,0.04), (0.04,0.08), (0.08,0.16), (0.16,0.32), (0.32,0.64), (0.64,1.28), (1.28,2.56), (2.56,5.12).]

Conclusions

1. Local computation of PageRank ranking is infeasible

2. Cost of exact local ranking algorithms bounded by minsets

3. Tested real web/social graphs are near worst-case

4. And approximation is not trivial

Marco Bressan, Luca Pretto. Local computation of PageRank: the ranking side. Proc. of CIKM 2011: 631-640

psort, yet another fast stable external sorting software

In a nutshell

the psort sorting library

• written in C++
• handles large datasets (> TB)
• stable sorting
• fast
• designed for PC-class machines

ideal applications of psort

• sorting large databases
• sorting large log files
• sorting on commodity machines
• . . .
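To make "stable external sorting" concrete, here is a minimal sketch of the generic technique: sort memory-sized runs, write them to disk, then merge the runs. It is not psort's implementation (psort is written in C++ and its internals are not described at this point); the function `external_sort`, the `run_size` parameter, and the line-based record format are illustrative assumptions.

```python
import heapq
import os
import tempfile


def external_sort(lines, key, run_size):
    """Yield `lines` in key order, holding at most `run_size` records in memory.

    Assumption: records are strings without embedded newlines.
    """
    run_paths = []

    def flush(buffer):
        buffer.sort(key=key)  # list.sort is stable
        with tempfile.NamedTemporaryFile("w", delete=False, suffix=".run") as f:
            f.writelines(record + "\n" for record in buffer)
        run_paths.append(f.name)

    # Phase 1: cut the input into sorted runs stored on disk.
    buffer = []
    for record in lines:
        buffer.append(record)
        if len(buffer) == run_size:
            flush(buffer)
            buffer = []
    if buffer:
        flush(buffer)

    # Phase 2: k-way merge of the runs. heapq.merge breaks ties in favour
    # of earlier iterables, so equal keys keep their input order.
    files = [open(path) for path in run_paths]
    try:
        streams = [(line.rstrip("\n") for line in f) for f in files]
        yield from heapq.merge(*streams, key=key)
    finally:
        for f in files:
            f.close()
        for path in run_paths:
            os.remove(path)


# Hypothetical records: sort by the first field, keep ties in input order.
records = ["b 1", "a 2", "b 3", "a 4", "c 5", "a 6"]
first_field = lambda record: record.split()[0]
result = list(external_sort(records, first_field, run_size=2))
```

Stability here means that the three records with key "a" come out in the order 2, 4, 6, exactly as they appeared in the input, even though they were written to three different runs.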



psort and the Sort Benchmark (1/2)

The PennySort Benchmark

Sort as much as you can in $0.01 of computing time.

[Figure: PennySort results by year (1998, 1999, 2000, 2002, 2003, 2007, 2008, 2009, 2011); y-axis: 0 GB to 400 GB; series: yearly record (Sort Benchmark), psort]

Source: http://sortbenchmark.org

Paolo Bertasi, Marco Bressan, Enoch Peserico. psort, yet another fast stable sorting software. ACM Journal of Experimental Algorithmics 16: (2011)


psort and the Sort Benchmark (2/2)

The Datamation Benchmark

Sort 100MB disk-to-disk as fast as you can.

• 440 ms: NOW-sort (2001)
• 980 s: thunder (1987)
• psort (2011)

Paolo Bertasi, Michele Bonazza, Marco Bressan, Enoch Peserico: Datamation. A Quarter of a Century and Four Orders of Magnitude Later. CLUSTER 2011: 605-609


psort and the STXXL library

[Figure: sort speed (in MB/s, 0 to 200) versus sort size (in MB, 10^1 to 10^4); series: stxxl on disks (8,8), (8,32), (8,128); stxxl on RAID (8,8), (8,32), (8,128); psort on RAID (8,8), (8,32), (8,128)]


Machine budget for Sort Benchmark 2011

• Power Supply Unit: 15 EUR
• Case: 22 EUR
• CPU: 38 EUR
• RAM: 47 EUR
• Motherboard: 60 EUR
• Hard Disks: 215 EUR
• Assembly fee: 35 EUR


The big picture

psort execution diagram

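[Diagram: psort execution over time, across the memory hierarchy]

Memory levels:
• CPU/cache: 1MB, 10GB/s
• main memory: 1GB, 3GB/s
• external memory: 1TB, 0.7GB/s

Phases along the time axis: mergesort, heap merge, heap merge

Disk passes along the time axis: 1st disk pass, 2nd disk pass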
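The "heap merge" phases in the diagram combine several sorted runs into one. The following is a minimal sketch of the general technique, a k-way merge driven by a binary min-heap, written with std::priority_queue on in-memory vectors of int. It is an illustration under those assumptions, not psort's implementation; the name heap_merge is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Merge k sorted runs into one sorted vector using a binary min-heap.
// Ties on the key are broken by run index, so lower-numbered runs win.
std::vector<int> heap_merge(const std::vector<std::vector<int>>& runs) {
    // Heap entry: (key, run index). std::greater makes the smallest entry the top.
    using Entry = std::pair<int, std::size_t>;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    std::vector<std::size_t> next(runs.size(), 0);  // next unread position per run

    std::size_t total = 0;
    for (std::size_t r = 0; r < runs.size(); ++r) {
        total += runs[r].size();
        if (!runs[r].empty()) {
            heap.push({runs[r][0], r});
            next[r] = 1;
        }
    }

    std::vector<int> out;
    out.reserve(total);
    while (!heap.empty()) {
        Entry e = heap.top();
        heap.pop();
        out.push_back(e.first);
        std::size_t r = e.second;
        if (next[r] < runs[r].size()) {  // refill the heap from the same run
            heap.push({runs[r][next[r]], r});
            next[r]++;
        }
    }
    return out;
}
```

Each output element costs one pop and at most one push, so merging n elements from k runs takes O(n log k) comparisons while touching every run strictly sequentially.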


The big picture - now complicated

Hardware/software details you must deal with:

I/O
• hdd quality
• file system
• scheduling
• buffer size
• direct transfer
• data placement

memory
• size
• bandwidth
• latency
• page size
• access pattern
• conflicts

cache
• size
• speed
• line size
• associativity


Hard disks

The speed curve of 13 “identical” WD1600JS disks

[Figure: bandwidth (MB/s, 0 to 150) versus distance from the outer rim (in GB, 0 to 150)]


Memory

Why main memory is not really a RAM

[Figure: bandwidth (GB/s, 0.5 to 4.5) versus struct size (bytes, 2^0 to 2^18); series: sequential read, random read, sequential write, random write; the L2 cache line size is marked on the plot]


CPU

Is a dual-core always worth its price?

[Figure: bandwidth (MB/s, 0 to 3e+10) versus log2(bytes visited) (16 to 30); series: Intel dual core read, Intel dual core write, AMD single core read, AMD single core write]


A list of psort’s tricks

general
• fast polling
• payload detachment
• key pre/postprocessing
• . . .

disk access
• O_DIRECT
• independent disks
• uniform fetching
• . . .

mergesort
• smart merging
• quasi-in-place
• special base case
• . . .

heapsort
• key caching
• key offsetting
• payload interleaving
• . . .
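The slides do not define "payload detachment". One plausible reading is that the sort moves only compact (key, position) entries and gathers the bulky payloads once at the end. The sketch below illustrates that reading only; the record layout, the KeyRef type and the function name are all hypothetical, and std::stable_sort stands in for psort's own sorting passes.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record layout: a small key and a bulky payload.
struct Record {
    std::uint32_t key;
    std::array<char, 96> payload;
};

// Compact entry that is cheap to move around while sorting.
struct KeyRef {
    std::uint32_t key;
    std::uint32_t index;  // position of the full record in the input
};

// Sort the detached keys, then gather the full records in sorted order.
std::vector<Record> sort_detached(const std::vector<Record>& in) {
    std::vector<KeyRef> refs(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        refs[i] = {in[i].key, static_cast<std::uint32_t>(i)};

    std::stable_sort(refs.begin(), refs.end(),
                     [](const KeyRef& a, const KeyRef& b) { return a.key < b.key; });

    std::vector<Record> out;
    out.reserve(in.size());
    for (const KeyRef& r : refs) out.push_back(in[r.index]);  // one move per payload
    return out;
}
```

Under this reading, every comparison and swap touches 8 bytes instead of 100, and each payload is copied exactly once.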



Smart merging (1/3)

Naive merging

void merge(T *s1, T *s2, T *out, int size) {
    int i = 0, j = 0, k = 0;
    bool bit;
    while ((i < size) & (j < size)) {
        if (s1[i] > s2[j]) {     // READ + READ
            out[k] = s2[j];      // READ
            j++;
        } else {
            out[k] = s1[i];      // (READ)
            i++;
        }
        k++;
    }
    ...
}

total mem READs per iteration: 3



Smart merging (2/3)

Smart merging

void merge(T* s1, T* s2, T* out, int size) {
    int i = 0, j = 0, k = 0;
    bool bit;
    T cache[2];                  // current heads of the two sequences
    cache[0] = s1[0];
    cache[1] = s2[0];
    while ((i < size) & (j < size)) {
        if (cache[0] > cache[1]) {
            out[k] = cache[1];
            j++;                 // advance first, then load the next head
            if (j < size) cache[1] = s2[j]; // READ
        } else {
            out[k] = cache[0];
            i++;
            if (i < size) cache[0] = s1[i]; // (READ)
        }
        k++;
    }
    ...
}

total mem READs per iteration: 1
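The slide code elides what happens once one sequence is exhausted. A self-contained sketch of the same idea, including that tail, could look as follows. It is a reconstruction, not psort's source: the name smart_merge is hypothetical, and the two separate lengths n1 and n2 are an assumption (the slide uses a single size for both inputs).

```cpp
#include <cassert>
#include <vector>

// Stable two-way merge of s1[0..n1) and s2[0..n2) into out. The two current
// heads live in local variables, so each iteration loads one new element.
template <typename T>
void smart_merge(const T* s1, int n1, const T* s2, int n2, T* out) {
    int i = 0, j = 0, k = 0;
    if (n1 > 0 && n2 > 0) {
        T head1 = s1[0];
        T head2 = s2[0];
        while (true) {
            if (head1 > head2) {      // strict '>' sends equal keys from s1 first
                out[k++] = head2;
                if (++j == n2) break; // s2 exhausted; head1 is still unconsumed
                head2 = s2[j];        // the single READ of this iteration
            } else {
                out[k++] = head1;
                if (++i == n1) break; // s1 exhausted; head2 is still unconsumed
                head1 = s1[i];        // the single READ of this iteration
            }
        }
    }
    // Copy whatever remains of the sequence that was not exhausted.
    while (i < n1) out[k++] = s1[i++];
    while (j < n2) out[k++] = s2[j++];
}
```

A compiler may already keep the heads in registers for the naive loop; writing the caching explicitly removes the dependence on that optimization.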



Smart merging (3/3)

Time required to merge two sequences

[Figure: time in microseconds (0 to 800000) versus log2(merge size) (10 to 24); series: smart merge, naive merge]


Quasi-in-place mergesort (1/3)

traditional mergesort

void mergesort(T* input, T* output, int size) {
    // size is assumed to be a power of two; subsize is the length of the runs being merged
    for (int subsize = 1; subsize < size; subsize *= 2) {
        for (int j = 0; j < size / (2 * subsize); j++) {
            merge(&input[2 * j * subsize],
                  &input[(2 * j + 1) * subsize],
                  &output[2 * j * subsize],
                  subsize);
        }
        T* tmp = input; // swap input and output
        input = output;
        output = tmp;
    }
}

extra space = N
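For reference, a runnable version of this traditional scheme is sketched below: a bottom-up mergesort that ping-pongs between the data and a second buffer of N elements. It assumes the input size is a power of two, and the names merge_runs and mergesort_pingpong are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Plain stable merge of two sorted runs of `len` elements each.
template <typename T>
void merge_runs(const T* s1, const T* s2, T* out, int len) {
    int i = 0, j = 0, k = 0;
    while (i < len && j < len) {
        if (s1[i] > s2[j]) out[k++] = s2[j++];
        else               out[k++] = s1[i++];
    }
    while (i < len) out[k++] = s1[i++];
    while (j < len) out[k++] = s2[j++];
}

// Bottom-up mergesort with a second buffer of N elements (extra space = N).
// Assumes v.size() is a power of two.
template <typename T>
void mergesort_pingpong(std::vector<T>& v) {
    const int n = static_cast<int>(v.size());
    assert(n > 0 && (n & (n - 1)) == 0);
    std::vector<T> buffer(v.size());
    T* input = v.data();
    T* output = buffer.data();
    for (int run = 1; run < n; run *= 2) {
        for (int j = 0; j < n / (2 * run); j++) {
            merge_runs(&input[2 * j * run], &input[(2 * j + 1) * run],
                       &output[2 * j * run], run);
        }
        std::swap(input, output);  // the merged data is the next pass's input
    }
    // After an odd number of passes the result sits in the buffer: copy it back.
    if (input != v.data()) std::copy(input, input + n, v.begin());
}
```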



Quasi-in-place mergesort (2/3)

“quasi-in-place” mergesort

void mergesort(T* input, T* output, int size) {
    // size is assumed to be a power of two; input must be preceded by size/2 free slots
    for (int subsize = 1; subsize < size/2; subsize *= 2) {
        for (int j = 0; j < size / (2 * subsize); j++) {
            /* merge, overwriting the input vector */
            merge(&input[2 * j * subsize],
                  &input[(2 * j + 1) * subsize],
                  &input[(2 * j - 1) * subsize],
                  subsize);
        }
        input = &input[-subsize]; // shift input left
    }
    // finally merge into the output vector
    merge(input, &input[size/2], output, size/2);
}

extra space = N/2
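The shifting trick works because a merge that writes one run length to the left of its first input never overtakes an element it has not yet read. A runnable sketch under stated assumptions follows: the input size is a power of two, the work buffer is allocated here with N/2 free slots in front of the data (a real implementation would presumably load the data there directly), and the names merge_left and mergesort_quasi_in_place are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Stable merge of two sorted runs of `len` elements. `out` may start up to
// `len` slots before s1: the write position never passes an unread element.
template <typename T>
void merge_left(const T* s1, const T* s2, T* out, int len) {
    int i = 0, j = 0, k = 0;
    while (i < len && j < len) {
        if (s1[i] > s2[j]) out[k++] = s2[j++];
        else               out[k++] = s1[i++];
    }
    while (i < len) out[k++] = s1[i++];
    while (j < len) out[k++] = s2[j++];
}

// Sort v (size a power of two, at least 2) with N/2 slots of working slack.
template <typename T>
std::vector<T> mergesort_quasi_in_place(const std::vector<T>& v) {
    const int n = static_cast<int>(v.size());
    assert(n >= 2 && (n & (n - 1)) == 0);
    // Work buffer: n/2 free slots on the left, then the n input elements.
    std::vector<T> work(n / 2 + n);
    std::copy(v.begin(), v.end(), work.begin() + n / 2);
    T* input = work.data() + n / 2;
    for (int run = 1; run < n / 2; run *= 2) {
        for (int j = 0; j < n / (2 * run); j++) {
            // merge, overwriting the input vector one run to the left
            merge_left(&input[2 * j * run], &input[(2 * j + 1) * run],
                       &input[2 * j * run] - run, run);
        }
        input -= run;  // shift input left
    }
    // finally merge the two sorted halves into the output vector
    std::vector<T> output(n);
    merge_left(input, input + n / 2, output.data(), n / 2);
    return output;
}
```

The passes use run lengths 1, 2, ..., N/4, so the data moves left by N/2 - 1 slots in total, which is why N/2 slots of slack suffice.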



Quasi-in-place mergesort (3/3)

Average time required to compare two keys

[Figure: relative units (0 to 4) versus log2(input size in bytes) (10 to 24); series: quasi-in-place]


Conclusions

1. Solving old problems really fast is still tricky

2. To do it, you must match today’s hardware

3. Solution: software engineering and tuning

Paolo Bertasi, Marco Bressan, Enoch Peserico. psort, yet another fast stable sorting software. ACM Journal of Experimental Algorithmics 16: (2011)


Conclusions

Ranking

1. Local computation of PageRank ranking infeasible in theory

2. On tested web/social graphs, infeasible also in practice

3. Rank analysis requires novel tools!

Sorting

1. Solving old problems really fast is still tricky

2. To do it, you must match today’s hardware

3. Software engineering and tuning are the way

And of course now you should pay me twice! :-)
