Fast Top-k Simple Shortest Paths Discovery in Graphs. Database Research Group, Department of Computer Science, Peking University. Jun Gao, Huida Qiu, Xiao Jiang, Dongqing Yang, Tenjiao Wang

Top-k shortest path

DESCRIPTION

cikm 2009: top k shortest paths

Page 1: Top-k shortest path

Fast Top-k Simple Shortest Paths Discovery in Graphs

Database Research Group

Department of Computer Science

Peking University

Jun Gao, Huida Qiu, Xiao Jiang, Dongqing Yang, Tenjiao Wang

Page 2: Top-k shortest path

Outline

• Motivation

• Related Work

• Our Method

• Experiments

• Conclusion

Page 3: Top-k shortest path

Motivation

From “Finding the k Shortest Paths” by David Eppstein

• Additional constraints

• Model evaluation

• Sensitivity analysis

• Generation of alternatives

When the shortest path alone is not sufficient for an application, the top-k shortest paths are needed.

Page 4: Top-k shortest path

Top-k shortest paths query

[Figure: example graph on nodes 1 to 6, with edge weights 1 and 3]

• Top 2 general shortest paths (allowing loops): 1st: 1234, length: 3; 2nd: 1236234, length: 6

• Top 2 simple shortest paths (without loops): 1st: 1234, length: 3; 2nd: 12534, length: 8
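The difference between the two queries can be sketched in a few lines of Python. This is a brute-force illustration, not the method of the talk. The edge list is an assumption, chosen so that it reproduces the path lengths quoted on the slide; the names `GRAPH` and `top_k_paths` are hypothetical.

```python
import heapq

# Assumed edge list (node -> [(neighbor, weight)]), reconstructed so that
# the path lengths match the ones quoted on the slide.
GRAPH = {
    1: [(2, 1)],
    2: [(3, 1), (5, 3)],
    3: [(4, 1), (6, 1)],
    5: [(3, 3)],
    6: [(2, 1)],
}

def top_k_paths(graph, source, target, k, simple):
    """Best-first enumeration of the k cheapest source-target paths.

    simple=False: nodes may repeat (paths with loops are allowed).
    simple=True:  any extension that revisits a node is skipped.
    Assumes positive edge weights; only suitable for toy graphs.
    """
    found = []
    heap = [(0, [source])]  # (cost so far, partial path)
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        if path[-1] == target:
            found.append((cost, path))
        for nxt, weight in graph.get(path[-1], []):
            if simple and nxt in path:
                continue
            heapq.heappush(heap, (cost + weight, path + [nxt]))
    return found

print(top_k_paths(GRAPH, 1, 4, 2, simple=False))
print(top_k_paths(GRAPH, 1, 4, 2, simple=True))
```

With loops allowed, the second path reuses the cycle 2, 3, 6; without loops, it has to take the more expensive detour through node 5.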

Page 6: Top-k shortest path

Top K general shortest path problem

Related work

• David Eppstein. Finding the k shortest paths. SIAM J. Comput., 28(2):652–673, 1998

Basic Idea

[Figure panels: Original Graph; Shortest Path Tree; Side Cost on Edges]

Time Complexity

• O(m + n log n + k)
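For reference, the side cost named in the figure can be written out as follows. The notation is an assumption (not from the slides): w(u,v) is the edge weight, d(x,t) is the shortest distance from x to the target t, and P is any path from s to t.

```latex
% Side cost of an edge: the extra length paid for using (u,v)
% instead of following the shortest path tree from u.
\delta(u,v) = w(u,v) + d(v,t) - d(u,t) \;\ge\; 0

% Summing over the edges of P, the distance terms telescope:
\mathrm{cost}(P) = \sum_{(u,v)\in P} w(u,v)
                 = d(s,t) + \sum_{(u,v)\in P} \delta(u,v)
```

Edges of the shortest path tree have side cost 0, so ranking paths by total side cost is the same as ranking them by length.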

Page 7: Top-k shortest path

Top K loopless Shortest Path Problem

Related work

• J. Y. Yen. Finding the k shortest loopless paths in a network. Management Science, 17(11):712–716, 1971.

Basic Idea

• Find the shortest path first

• The 2nd shortest path should be

- different from the shortest path

- loopless

- the shortest among the remaining paths

• Find the next shortest paths iteratively

Time Complexity

• O(kn(m + n log n))

[Figure: six copies of an example graph on nodes s, b, e, g, f, t, illustrating the candidate paths]
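The iteration above can be sketched as follows. This is a compact textbook-style rendering of Yen's scheme, assuming positive weights and no parallel edges; the function names and the toy graph are hypothetical and do not come from the slides.

```python
import heapq

def dijkstra(graph, source, target, banned_nodes=(), banned_edges=()):
    """Shortest path that avoids the banned nodes/edges: (cost, path) or None."""
    heap = [(0, source, [source])]
    done = set()
    while heap:
        cost, u, path = heapq.heappop(heap)
        if u == target:
            return cost, path
        if u in done:
            continue
        done.add(u)
        for v, w in graph.get(u, []):
            if v in banned_nodes or (u, v) in banned_edges or v in done:
                continue
            heapq.heappush(heap, (cost + w, v, path + [v]))
    return None

def path_cost(graph, path):
    return sum(dict(graph[u])[v] for u, v in zip(path, path[1:]))

def yen_k_shortest(graph, source, target, k):
    first = dijkstra(graph, source, target)
    if first is None:
        return []
    accepted, candidates = [first], []  # candidates is a heap of (cost, path)
    while len(accepted) < k:
        _, last = accepted[-1]
        for i in range(len(last) - 1):
            root = last[:i + 1]  # shared prefix, ends at the deviation node
            # Ban the next edge of every accepted path that shares this prefix,
            # so the new path must differ from all of them.
            banned_edges = {(p[i], p[i + 1]) for _, p in accepted
                            if len(p) > i + 1 and p[:i + 1] == root}
            # Ban the prefix nodes so the result stays loopless.
            tail = dijkstra(graph, last[i], target, set(root[:-1]), banned_edges)
            if tail is None:
                continue
            cand = (path_cost(graph, root) + tail[0], root[:-1] + tail[1])
            if cand not in candidates and cand not in accepted:
                heapq.heappush(candidates, cand)
        if not candidates:
            break
        accepted.append(heapq.heappop(candidates))
    return accepted

# Toy graph (assumed): node -> [(neighbor, weight)]
G = {1: [(2, 1)], 2: [(3, 1), (5, 3)], 3: [(4, 1), (6, 1)], 5: [(3, 3)], 6: [(2, 1)]}
print(yen_k_shortest(G, 1, 4, 2))
```

Each accepted path triggers one shortest-path computation per deviation node, which is where the factor n in the time bound comes from.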

Page 8: Top-k shortest path

Top K loopless Shortest Path Problem

Related work

• J. Hershberger, S. Suri, and A. Bhosle. On the difficulty of some shortest path problems. ACM Transactions on Algorithms, 3(1), 2007

Basic Idea

• Remove an edge to find the next shortest path

• Reuse intermediate results to lower the cost

Time Complexity

• O(k(m + n log n))

• May produce loops in some cases

[Figure: six copies of an example graph on nodes s, b, e, g, f, t, illustrating the candidate paths]

Page 10: Top-k shortest path

Basic Idea

The key is to reduce the redundant computation that is repeated for the same target node.

We pre-compute the shortest path tree rooted at the target node.

With the shortest path tree, we expect the candidate path search to terminate early.

The final path is the concatenation of 3 sub-paths: the first sub-path lies in the current shortest path, the second is discovered online, and the third lies in the shortest path tree.

• Existing methods need to discover both the second and the third sub-paths online.
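A minimal sketch of the three-part concatenation, assuming the shortest path tree is stored as a `next_hop` map (node to its successor on the shortest path to the target). The names `tree_suffix`, `assemble` and the toy data are hypothetical.

```python
def tree_suffix(next_hop, node, target):
    """Follow shortest-path-tree pointers from node to target (node excluded)."""
    suffix = []
    while node != target:
        node = next_hop[node]
        suffix.append(node)
    return suffix

def assemble(current_path, dev_index, online_segment, next_hop, target):
    """prefix (from the current path) + online segment + tree suffix.

    Returns None when the concatenation repeats a node, i.e. contains a loop.
    """
    prefix = current_path[:dev_index + 1]  # ends at the deviation node
    suffix = tree_suffix(next_hop, online_segment[-1], target)
    path = prefix + online_segment + suffix
    return path if len(set(path)) == len(path) else None

# Assumed shortest path tree rooted at target 4 of a toy graph.
NEXT_HOP = {1: 2, 2: 3, 3: 4, 5: 3, 6: 2}

print(assemble([1, 2, 3, 4], 1, [5], NEXT_HOP, 4))  # deviate at node 2 via 5
print(assemble([1, 2, 3, 4], 2, [6], NEXT_HOP, 4))  # deviate at node 3 via 6
```

Only the middle segment has to be searched for; the suffix is a lookup in the pre-computed tree, which is what allows the search to stop early.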

Page 11: Top-k shortest path

Graph Pre-processing

Pre-compute the shortest path tree rooted at t

Compute the side cost of each edge

Assign a (pre, post, parent) encoding to each node to accelerate loop detection
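The first two pre-processing steps might look like the sketch below: Dijkstra on the reversed graph gives the shortest path tree rooted at t, and the side cost of an edge (u, v) is taken as w(u,v) + d(v,t) - d(u,t). The function names and the toy graph are assumptions for illustration.

```python
import heapq

def shortest_path_tree_to(graph, target):
    """Dijkstra on the reversed graph.

    Returns dist[v] = d(v, target) and next_hop[v] = parent of v in the tree.
    """
    reverse = {}
    for u, edges in graph.items():
        for v, w in edges:
            reverse.setdefault(v, []).append((u, w))
    dist, next_hop = {target: 0}, {}
    heap = [(0, target)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale heap entry
        for u, w in reverse.get(v, []):
            if d + w < dist.get(u, float("inf")):
                dist[u] = d + w
                next_hop[u] = v
                heapq.heappush(heap, (d + w, u))
    return dist, next_hop

def side_costs(graph, dist):
    """Side cost of every edge whose endpoints can both reach the target."""
    return {(u, v): w + dist[v] - dist[u]
            for u, edges in graph.items() if u in dist
            for v, w in edges if v in dist}

# Toy graph (assumed): node -> [(neighbor, weight)]
G = {1: [(2, 1)], 2: [(3, 1), (5, 3)], 3: [(4, 1), (6, 1)], 5: [(3, 3)], 6: [(2, 1)]}
dist, next_hop = shortest_path_tree_to(G, 4)
side = side_costs(G, dist)
print(dist, next_hop, side)
```

Tree edges get side cost 0, so a path's length is d(s,t) plus the side costs of its non-tree edges.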

Page 12: Top-k shortest path

Searching for the candidate paths

On the transformed graph, we search using the side costs

The path with the minimal side cost is the path with the minimal cost in the original graph

During the search, loops need to be detected.

When no loop is found, the path can be discovered directly

[Figure: search from the starting node; node labels s, t, d1, d2, u, i, l]

Page 13: Top-k shortest path

Path Searching Example

The edge e to g cannot be considered

d is then considered, but e is the ancestor of d

c is then considered, but e is the ancestor of c

f is then considered; the path from f to t does not go through nodes s, b, e
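The ancestor checks in this example can be done in constant time with pre/post numbering of the shortest path tree. The sketch below is one standard way to do it and is not necessarily the encoding used by the authors; `label_tree`, `is_ancestor`, `tree_path_hits_prefix` and the toy tree are hypothetical.

```python
def label_tree(next_hop, root):
    """Assign pre/post visit numbers by an iterative DFS over the tree.

    next_hop maps each node to its parent (the tree is rooted at the target).
    """
    children = {}
    for child, parent in next_hop.items():
        children.setdefault(parent, []).append(child)
    pre, post, counter = {}, {}, 0
    stack = [(root, False)]
    while stack:
        node, finished = stack.pop()
        if finished:
            post[node] = counter
        else:
            pre[node] = counter
            stack.append((node, True))
            for c in children.get(node, []):
                stack.append((c, False))
        counter += 1
    return pre, post

def is_ancestor(pre, post, a, d):
    """True if a lies on the tree path from d to the root (a == d counts)."""
    return pre[a] <= pre[d] and post[d] <= post[a]

def tree_path_hits_prefix(pre, post, prefix, node):
    """Naive loop test: does the tree path from node pass through the prefix?"""
    return any(is_ancestor(pre, post, p, node) for p in prefix)

# Assumed shortest path tree rooted at target 4 of a toy graph.
NEXT_HOP = {1: 2, 2: 3, 3: 4, 5: 3, 6: 2}
pre, post = label_tree(NEXT_HOP, 4)
print(tree_path_hits_prefix(pre, post, [1, 2, 3], 6))  # tree path 6, 2, 3, 4
print(tree_path_hits_prefix(pre, post, [1, 2], 5))     # tree path 5, 3, 4
```

A node a is an ancestor of d exactly when the visit interval of d is nested inside that of a, so no tree walk is needed at query time.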

Page 14: Top-k shortest path

Optimization-1

k-reduction strategy: stop when

• k1 shortest paths discovered;

• k2 paths in candidate pool have the same length as the k1-th shortest path;

• k1 + k2 ≥ k
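The stopping test can be sketched as a small predicate. It assumes that accepted paths are kept in non-decreasing order of length; the name `can_stop` and its arguments are hypothetical.

```python
def can_stop(accepted_lengths, candidate_lengths, k):
    """k-reduction check.

    accepted_lengths: lengths of the k1 paths found so far, in output order.
    candidate_lengths: lengths of the paths currently in the candidate pool.
    """
    if not accepted_lengths:
        return False
    k1 = len(accepted_lengths)
    # Candidates tied with the k1-th path cannot be beaten by any path
    # discovered later, so they can be output without further search.
    k2 = sum(1 for length in candidate_lengths if length == accepted_lengths[-1])
    return k1 + k2 >= k

print(can_stop([3, 8], [8, 8, 9], 4))
print(can_stop([3, 8], [8, 9], 4))
```

The check is safe because paths are produced in non-decreasing order of length: nothing still undiscovered can be shorter than the k1-th path.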

Page 15: Top-k shortest path

Optimization-2

Suppose we know that the length of the shortest path is l1 and the length of the k-th shortest path is l2.

Let th = l2 – l1.

When looking for paths from deviation nodes, we can stop searching as soon as the accumulated side cost exceeds th.
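A sketch of the pruning rule on a side cost graph. It shows only the threshold cut-off and leaves out loop detection; the function name `bounded_search`, the `start_cost` parameter (side cost already accumulated along the prefix) and the toy side costs are assumptions.

```python
import heapq

def bounded_search(side_graph, start, start_cost, th):
    """Nodes reachable from start whose accumulated side cost stays <= th.

    Returns a map node -> smallest accumulated side cost.
    """
    best = {start: start_cost}
    heap = [(start_cost, start)]
    while heap:
        cost, u = heapq.heappop(heap)
        if cost > best[u]:
            continue  # stale heap entry
        for v, side_cost in side_graph.get(u, []):
            new_cost = cost + side_cost
            if new_cost > th:
                continue  # pruned: this branch cannot yield a top-k path
            if new_cost < best.get(v, float("inf")):
                best[v] = new_cost
                heapq.heappush(heap, (new_cost, v))
    return best

# Assumed side costs of a toy graph: node -> [(neighbor, side cost)]
SIDE = {1: [(2, 0)], 2: [(3, 0), (5, 5)], 3: [(4, 0), (6, 3)], 5: [(3, 0)], 6: [(2, 0)]}

print(bounded_search(SIDE, 2, 0, 5))
print(bounded_search(SIDE, 2, 0, 4))
```

Since a path's length is l1 plus its total side cost, any branch whose side cost already exceeds th = l2 – l1 would be longer than the k-th shortest path.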

Page 16: Top-k shortest path

Optimization-2

Approximate threshold: only the shortest path must be exact; any other k-1 paths will do to estimate th.

• Eager policy: search for k-1 candidate paths, instead of one, from the first deviation node;

• Lazy policy: determine the threshold once there are k-1 paths in the candidate pool.

• As more paths are discovered, the threshold can be updated adaptively and gradually becomes tighter.

Page 18: Top-k shortest path

Experimental Evaluation

Comparison algorithms:

• YEN: Yen’s classic algorithm;

• JH: the edge-replacement based method by John Hershberger et al.

• Implementation: C++, by Hershberger et al.

Our method: all implemented in Java

• KR: the base method with k-reduction;

• KRE: k-reduction plus Eager policy;

• KRL: k-reduction plus Lazy policy;

Page 19: Top-k shortest path

Datasets

• Real datasets (Density = # of Edges / # of Nodes):

Dataset    # of Nodes    # of Edges    Density
Add32      4,960         9,462         1.91
Crack      10,240        30,380        2.97
Gupta3     16,783        4,670,105     278.26
FLA        1,070,376     2,712,798     2.53

• Synthetic datasets: random graphs generated by the Barabasi Graph Generator (by Derek Dreier, available on the Internet)

Page 20: Top-k shortest path

Impact of Graph Size

Density = 3, # of nodes from 10k to 100k;

Page 21: Top-k shortest path

Impact of k

Performed on real graphs.

Page 22: Top-k shortest path

Impact of Density

k=1000

Page 24: Top-k shortest path

Conclusion

We speed up top-k shortest path discovery.

• Combine Yen’s and Eppstein’s ideas

• Transform candidate path discovery to the side cost graph

- Terminate earlier

• Use structural labels to detect loops efficiently.

• Introduce two other optimizations

- Reduce k

- Avoid the worst case of path searching.

Page 25: Top-k shortest path

Future work

Extend top-k shortest paths between two nodes to paths between two node sets

Find the top-k shortest path core

Find approximate top-k shortest paths

Page 26: Top-k shortest path

Thanks for your attention!
