Engineering shortest path algorithms (a round trip between theory and experiments) Camil Demetrescu University of Rome “La Sapienza” Giuseppe F. Italiano

Engineering shortest path algorithms

(a round trip between theory and experiments)Camil Demetrescu University of Rome “La Sapienza”

Giuseppe F. ItalianoUniversity of Rome “Tor Vergata”

Why Engineering Algorithms?

Theory

In theory, theory and practice are the same.

Why Engineering Algorithms?

The real world out there...

In practice, theory and

practice may be quite

different…

Programs are first class citizens as well

int, float, doubleNumber types: N, R

Only asymptotics matter Seconds do matter

Abstract algorithm description

Non-trivial implementationdecisions, error-prone

Unbounded memory,unit access cost

Memory hierarchy / bandwidth

Elementary operations take constant time

Instruction pipelining, …

Theory Practice

The Algorithm Engineering Cycle

Algorithm implementation

Experimental analysis

Theoretical analysisBottlenecks,Heuristics

Deeperinsights

Morerealisticmodels

Hints torefine analysis

Algorithm design

Fully Dynamic all-pairs shortest paths

Given a weighted directed graph G=(V,E,w),

perform any intermixed sequence of the following

operations:

return distance from x to y(or shortest path from x to y)

Query(x,y):

update cost of edge (u,v) to wUpdate(u,v,w): update edges incident to v [w( )]Update(v,w):

State of the Art

First fully dynamic algorithms date back to the 60’s

• P. Loubal, A network evaluation procedure, Highway

Research Record 205, 96-109, 1967.

• J. Murchland, The effect of increasing or decreasing the

length of a single arc on all shortest distances in a graph, TR

LBS-TNT-26, Transport Network Theory Unit, London

Business School, 1967.

• V. Rodionov, A dynamization of the all-pairs least cost

problem, USSR Comput. Math. And Math. Phys. 8, 233-277,

1968.

• …

State of the Art

65-69 70-74 75-79 80-84 85-89 90-94 95-99 00-

# papers

State of the Art

Until 1999, none of them was better in the worst case than recomputing APSP from scratch (~ cubic time!)

general real ? < o(n3) O(1)

QueryUpdateGraph Weight

Ramaling.&Reps 96 general real O(n3) O(1)

King 99 general [0,C] O(n2.5 (C log n)0.5) O(1)

D. & Italiano 01 general S real O(n2.5 (S log3 n)0.5) O(1)

Fully Dynamic APSP

Edge insertions (edge cost decreases)

Quite easy: O(n2)

10

10

10

10

For each pair x,y check whether d(x,i) + w(i,j) + d(j,y) < d(x,y)

x y

i j

Fully Dynamic APSP

• Edge deletions (edge cost increases)

Seem the hard operations

• When edge (shortest path) deleted: need info about second shortest path? (3rd, 4th, …)

• Now we take a little break from this problem…

• Edge insertions (edge cost decreases)

Quite easy: O(n2)

Engineering Dijkstra’s AP Shortest Path

y’

The algorithm:

Run Dijkstra from all vertices “in parallel”

Possibly insert (x,y’) into heap or decrease priority3.

2. Scan onlyy’ for which (a,y’) shortest

(suboptimality)

Can we do better?

Extract shortest pair (x,y) from heap:

x y

1.

a

Possibly insert (x,y’) into heap or decrease its priority3.

y’ 2. Scan all neighborsy’ of y

Edge scanning bottleneck for dense graphs [Goldberg]

Extract shortest pair (x,y) from heap:

x y

1.

Definition: Locally Shortest Paths [DI’03]

A path is locally shortest if all of its proper subpaths are shortest paths

Shortest path Shortest path

x yπxy

Locally shortest path

Locallyshortest paths

Locally Shortest Paths

Shortest paths

By optimal-substructure property of shortest paths:

How much do we gain?

Running time on directed graphswith real edge weights

O(n2) spaceO( #LS-paths + n2 log n) time

Q.: How many locally shortest paths ?

Q.: How much can we gain in practice?

A.: #LS-paths ≤ mn. No gain in asymptopia…

How many LS paths in a graph?

aaa

Locally shortest paths in random graphs (500 nodes)

0

5,000,000

10,000,000

15,000,000

20,000,000

25,000,000

0 5000 10000 15000 20000 25000 30000 35000 40000 45000 50000

# edges

m*n#LS-paths

n*n

#LS-paths

m*n

n*n

What about real-world graphs?

Locally shortest paths in US road networks

0.0

0.5

1.0

1.5

2.0

2.5

3.0

3.5

DE NV NH ME AZ ID MT ND CT NM MA NJ LA CO MD NE MS IN AR KS KY MI IA AL MN MO CA NC

US states

average degree#LS-paths per pair

US road networks

Can we exploit this in practice?

aaa

Experiment for increasing number of edges (rnd, 500 nodes)

0

2

4

6

8

10

12

14

16

18

0 5000 10000 15000 20000 25000 30000 35000 40000 45000 50000

Number of edges

Dijkstra

New algorithm

Dijkstra'salgorithm

Algorithm basedon locally shortest paths

Theoretical analysisBottlenecks,Heuristics

What we have seen so far


Deeperinsights

Morerealisticmodels


Algorithm design


Bottlenecks,Heuristics

Return trip to Theory


Theoretical analysisDeeperinsights

Morerealisticmodels


Algorithm design


Back to Fully Dynamic APSP

Given a weighted directed graph G=(V,E,w),

perform any intermixed sequence of the following

operations:

return distance from x to y(or shortest path from x to y)

Query(x,y):

update cost of edge (u,v) to wUpdate(u,v,w):

Recall Fully Dynamic APSP

• Hard operations seem edge deletions

(edge cost increases)

• When edge (shortest path) deleted: need info

about second shortest path? (3rd, 4th, …)

Shortest path Shortest path

x yπxy

• Hey… what about locally shortest paths?

Candidate for being shortest path?Locally shortest path

Locally shortest paths for dynamic APSP

Idea: Maintain all the locally shortest paths of the graph

How do locally shortest paths change in a dynamic graph?

Assumptions behind the analysis

Locally shortest paths πxy are internally vertex-disjoint

x y

π1

π3

π2

No two paths share the same weight.

Tie breaking

Shortest paths are unique

Assumptions

In theory, tie breaking is not a problem

Practice

In practice, tie breaking can be subtle

Appearing Locally Shortest Paths

Fact 1

At most mn (n3) paths can START being locallyshortest after an increase

x y

10

20

30

40

x y100

10

20

30

40

Disappearing Locally Shortest Paths

Fact 2

At most n2 paths can STOP being locally shortestafter an increase

if π stops being locally shortest after increase of e

subpath of π (was shortest path) must contain e

shortest paths are unique: at most n2 contain e

Maintaining Locally Shortest Paths

# Locally shortest paths appearing after increase: < n3

# Locally shortest paths disappearing after increase: < n2

The amortized number of changes in the set of locally shortest paths at each update in an increase-only sequence is O(n2)

An Increase-only update Algorithm

This gives (almost) immediately:

O(n2 log n) amortized time per increase

O(mn) space

Maintaining Locally Shortest Paths

x y

10

20

30

40

x y100

10

20

30

40

What about fully dynamic sequences?

x y

How to pay only once?

x yx y

This path remains the same while flipping between being LS and non-LS:

Would like to have update algorithm that pays only once for it until it is further updated...

x y

Looking at the substructure

x y

…but if we removed the same edge it would be a shortest path again!

It is notdead!

This path remains a shortest pathafter the insertion

This path is no longer a shortest pathafter the insertion…

Historical Paths

x y

A path is historical if it was shortest at some time since it was last updated

historical path

Locally Historical Paths

Shortest path

Shortest path

x

yπxyLocally shortest

path

Historical path

Historical path

x

yπxy

Locally historicalpath

Key idea for Partially Dynamic

LSPSP

Key idea for Fully Dynamic

LHP

HP

Key Idea for Fully Dynamic

LHP

SPHP

Putting things into perspective:

LHP

HP LSPSP

The fully dynamic update algorithm

O(n2 log3 n) amortized time per update

Fully dynamic update algorithm very similar to partially dynamic, but maintains locally historical paths instead of locally shortest paths (+ performs some other operations)

O(mn log n) space

Idea:Maintain all the locally historical paths of the graph

Experiments (update time)

aaa

Experiments for increasing #edges (rnd, 500 nodes, Intel Xeon)

0.001

0.01

0.1

1

10

100

0 5000 10000 15000 20000 25000 30000 35000 40000 45000 50000

# edges

S-DIJ

S-LSP

D-KIN

D-RRL

D-LHP

What about real graphs? (update time)

aaa

Experiments on US road networks

0

0.005

0.01

0.015

0.02

0.025

0.03

0.035

HI DE NV NH ME ID MT ND CT NM MA NJ LA CO MD NE MS IN AR KS KY MI IA AL MN MO CA NC

US states

D-RRL

D-LHP

Rel

ativ

e ti

me

perf

orm

ance

D-R

LL

/D-L

HP

Number of edges (x 100)

D-LHP slower than D-RRL

1 2 3 4 5 6 7 8 9 10

D-LHP faster than D-RRL

Experiments on US road networks

0.4

0.6

0.8

1.0

1.2

1.4

1.6

1.8

2.0

2.2

2.4

IBM Power 4 Sun UltraSPARC IIi AMD Athlon

(32MB L3 cache)

(256KB L2 cache)(2MB L2 cache)

OH

NC

CAMO

MNAL

IA

MI

KY

KS

AR

INMS

NEMDCO

LANJ

MACT

NMND

MT

ID

AZ

MENH

NV

DE

IBM Power 4

Sun UltraSPARC IIi

AMD Athlon

(on different platforms)

Cache effects

Cache size

0.69

0.84

1.00

1.17

1.30

1.41

1.83

0.75

0.870.92

0.71

11.12

0.6

0.8

1.0

1.2

1.4

1.6

1.8

2.0

128KB 256KB 512KB 1MB 2MB 4MB 8MB 16MB 32MB

Simulated cache miss ratio D-RRL/D-LHPPerformance ratio D-RRL/D-LHP on real architectures

UltraSPARC IIi

Xeon

Athlon

1.59

Power 4

Colorado road network

Conclusions and open problems…

C Code publicly available at

http://www.dis.uniroma1.it/~demetres/experim/dsp

O(n2) space too much for applications

Trading-off space and (query) time?

Fully dynamic single source shortest path problem?

The world is not perfect…

Documents

Engineering shortest path algorithms (a round trip between theory and experiments) Camil Demetrescu University of Rome “La Sapienza” Giuseppe F. Italiano