SHORTEST PATH ANALYSIS IN REAL GRAPHS

Authors: Waqas Nawaz, Kifayat Ullah Khan, Young-Koo Lee
Department of Computer Engineering, Kyung Hee University, South Korea
The 3rd International Conference on Convergence and its Application (ICCA 2014), 25-27 June, Seoul, Korea



MOTIVATION

“... shortest path problems are among the most fundamental combinatorial optimization problems with many applications, both direct and as subroutines in other combinatorial optimization algorithms. Algorithms for these problems have been studied since the 1950’s and still remain an active area of research.” [1]

[1] Camil Demetrescu, Andrew V. Goldberg, and David S. Johnson. Implementation challenge for shortest paths. In Encyclopedia of Algorithms. 2008.

SHORTEST PATH APPLICATIONS

Telephone routes
  Which communication links to activate when a user makes a phone call, e.g. from HK to New York, USA.
Road systems design
  Problem: how to determine the number of lanes in each road?
  Given: expected traffic between each pair of locations
  Method: estimate the total traffic on each road link, assuming each passenger will use the shortest path
Many other applications, including:
  Finance (arbitrage): in economics and finance, arbitrage is the practice of taking advantage of a price difference between two or more markets: striking a combination of matching deals that capitalize upon the imbalance, the profit being the difference between the market prices. (Wikipedia)
  Assembly line inspection systems design
  Graph median, traffic simulation, image segmentation, drug target identification, community detection, social search, social networking, message routing

SHORTEST PATH ANALYSIS: CONTRIBUTIONS

Empirically demonstrate that a significant number of shortest paths overlap
The behavior of the overlapped regions in diverse networks
  E.g. scale-free networks
The impact of hub nodes on the shortest paths
  E.g. what portion of the shortest paths pass through the hub nodes or across dense regions?
Analysis of how much of the entire graph is covered by shortest paths

[Figure: graph illustration labeled "Hub-nodes"]

PROBLEM STATEMENT

Which portion of a graph is traversed through shortest paths?
OR
Validate:
  A significant number of shortest paths overlap
  Hub nodes are contained in shortest paths

To the best of our knowledge, no such empirical analysis exists in the literature.

DEFINITION: SHORTEST PATH (SP)

Definition: A sequence of edges, i.e. $p_i = E_{seq} = \{e_1, e_2, \dots, e_m\}$, from source vertex $v_s$ to destination vertex $v_d$ for which $dist(p_i)$ is minimum, where
  $m$ is the number of edges,
  $e_i = \{(v_{i-1}, v_i, cost) \mid v_{i-1}, v_i \in V\}$, $v_s \neq v_d$,
  $dist(\dots)$ is the distance function based on edge cost.

Example: Shortest path $p(v_0, v_7) = \{e_1, e_2, e_8, e_5, e_6, e_7\}$, where $m = 7$, $v_s = v_0$ and $v_d = v_7$.

[Figure: example graph with vertices v0 to v7 and edges e1 to e9; v0 is marked as the source and v7 as the destination]
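The definition above can be illustrated with a small sketch. It is not the authors' Java implementation: it assumes Dijkstra's algorithm with non-negative edge costs, an undirected graph, and a hypothetical toy graph (`toy_edges`) that is not the figure from the slides. The function name `shortest_path` and the edge labels are illustrative only.

```python
import heapq

def shortest_path(edges, source, target):
    """Return (distance, edge-label sequence) of a minimum-cost path, or None."""
    # Undirected adjacency list: vertex -> [(neighbor, cost, edge label)]
    adj = {}
    for label, (u, v, cost) in edges.items():
        adj.setdefault(u, []).append((v, cost, label))
        adj.setdefault(v, []).append((u, cost, label))

    dist = {source: 0}
    prev = {}  # vertex -> (previous vertex, label of the edge used to reach it)
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, cost, label in adj.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = (u, label)
                heapq.heappush(heap, (nd, v))

    if target not in dist:
        return None
    # Walk back from the target to recover the edge sequence
    path, node = [], target
    while node != source:
        node, label = prev[node]
        path.append(label)
    path.reverse()
    return dist[target], path

# Hypothetical toy graph (assumption, not from the slides): label -> (u, v, cost)
toy_edges = {
    "a": ("s", "x", 1),
    "b": ("x", "y", 1),
    "c": ("y", "t", 1),
    "d": ("s", "y", 5),
    "f": ("x", "t", 4),
}
result = shortest_path(toy_edges, "s", "t")
```

Here `result` holds the total cost together with the sequence of edge labels, which corresponds to E_seq in the definition.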

HOW TO ANALYZE SPS? BASIC IDEA (1/2)

Straightforward Approach (Brute-force)
  Generate all-pair shortest paths into SP-DB (file on disk)
    If N is the number of vertices, there are $N^2$ paths, which may not be appropriate for very large graphs
  Manually scan SP-DB to identify the overlaps and frequently occurring vertices
    Efficiency depends on a careful choice of data structure or indexing method
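A minimal sketch of the brute-force idea, under several assumptions: the graph is unweighted and undirected, so breadth-first search yields shortest paths; only one shortest path is kept per ordered vertex pair; and SP-DB is modeled as an in-memory list rather than a file on disk. The function names and the toy graph are hypothetical.

```python
from collections import Counter, deque

def bfs_paths_from(adj, source):
    """One shortest path (vertex list) from source to every reachable vertex."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    paths = {}
    for target in parent:
        if target == source:
            continue
        path, node = [], target
        while node is not None:
            path.append(node)
            node = parent[node]
        paths[target] = path[::-1]
    return paths

def all_pairs_sp_db(adj):
    """In-memory stand-in for SP-DB: one path per ordered vertex pair."""
    return [p for s in adj for p in bfs_paths_from(adj, s).values()]

def vertex_frequencies(sp_db):
    """Count in how many stored paths each vertex occurs."""
    counts = Counter()
    for path in sp_db:
        counts.update(set(path))
    return counts

# Hypothetical toy graph in which "h" acts as a hub
adj = {"h": ["a", "b", "c"], "a": ["h"], "b": ["h"], "c": ["h", "d"], "d": ["c"]}
sp_db = all_pairs_sp_db(adj)
freq = vertex_frequencies(sp_db)
```

On this toy graph the hub `h` occurs in more stored paths than any other vertex. The number of stored paths grows quadratically with the number of vertices, which is the scalability concern raised above.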

HOW TO ANALYZE SPS? BASIC IDEA (2/2)

Alternate Approach (Non-Exhaustive)
  Generate all-pair shortest paths (small graphs) OR k-source shortest paths (for large graphs, where k << N) into SP-DB
  Estimate the occurrences of vertices using a data mining approach
    Frequent item-set mining with a given threshold to limit the search space
    We can easily prune the rarely occurring vertices
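The non-exhaustive variant might be sketched as follows. The slides do not say how the k sources are chosen, so uniform random sampling is an assumption here, as are the unweighted graph, the function names, and the `min_support` parameter used for pruning.

```python
import random
from collections import Counter, deque

def bfs_paths_from(adj, source):
    """One shortest path (vertex list) from source to every reachable vertex."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    paths = []
    for target in parent:
        if target == source:
            continue
        path, node = [], target
        while node is not None:
            path.append(node)
            node = parent[node]
        paths.append(path[::-1])
    return paths

def k_source_sp_db(adj, k, seed=0):
    """Shortest paths from k sampled sources only (assumed: uniform sampling)."""
    rng = random.Random(seed)
    sources = rng.sample(sorted(adj), k)
    return [p for s in sources for p in bfs_paths_from(adj, s)]

def frequent_vertices(sp_db, min_support):
    """Keep vertices occurring in at least min_support paths; prune the rest."""
    counts = Counter(v for path in sp_db for v in set(path))
    return {v: c for v, c in counts.items() if c >= min_support}

# Hypothetical toy graph in which "h" acts as a hub
adj = {"h": ["a", "b", "c"], "a": ["h"], "b": ["h"], "c": ["h", "d"], "d": ["c"]}
sample_db = k_source_sp_db(adj, k=2)
kept = frequent_vertices(sample_db, min_support=6)
```

With k sources the database holds roughly k·N paths instead of N², and the support threshold discards rarely occurring vertices before any further analysis.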

NON-EXHAUSTIVE APPROACH: EXAMPLE

Shortest Path Computation
  All-Pair Shortest Paths
  k-Source Shortest Paths, k=2
Frequent Item-set Mining towards finding SPOREs
  FP-Growth approach
  Each shortest path is considered as a transaction which contains nodes as a set of items
  If sup = 2 then
    1-len SPORE (C, D, E)
    2-len SPORE (CD, DE, CE)
    3-len SPORE (CDE)

[Figure: two example shortest paths sharing the nodes C, D, E; one also contains M, N, P and the other A, B, F]
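The example above can be reproduced with a short sketch. Note that this uses a simple level-wise (Apriori-style) search, not FP-Growth as named on the slide; it is swapped in only because it is compact enough to read at a glance. The two transactions are the node sets from the example; the order of nodes inside each path is an assumption and does not affect the result, since each path is treated as a set.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {itemset tuple: support}."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})
    result = {}

    # Level 1: frequent single items
    current = []
    for i in items:
        support = sum(1 for t in transactions if i in t)
        if support >= min_support:
            current.append((i,))
            result[(i,)] = support

    size = 2
    while current:
        # Only items that survived the previous level can form larger itemsets
        alive = sorted({i for itemset in current for i in itemset})
        current = []
        for cand in combinations(alive, size):
            support = sum(1 for t in transactions if t.issuperset(cand))
            if support >= min_support:
                current.append(cand)
                result[cand] = support
        size += 1
    return result

# Each shortest path is one transaction; its nodes are the items
paths = [list("ABCDEF"), list("MNCDEP")]
spores = frequent_itemsets(paths, min_support=2)
```

With `min_support=2` the keys of `spores` are C, D, E, then CD, CE, DE, then CDE, matching the 1-len, 2-len and 3-len SPOREs listed in the example.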

EXPERIMENTS

Real Dataset
  Social circles from Facebook (anonymized)
  Vertices (4,050), Edges (88,254), Diameter (8)
Environment
  Windows 7, 32-bit, Java implementation

[Figures: Original Facebook Graph; Shortest Path Traversals]


FACEBOOK STATS: DEGREE DISTRIBUTION VS. FREQUENCY OF OCCURRENCES

CONCLUSION

The frequency distribution of shortest path overlaps is influenced by the network's node degree distribution
The probability of a shortest path passing through hub nodes is high
A significant number of shortest paths overlap
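One way the hub-node and coverage questions could be quantified is sketched below. The slides do not define how hub nodes are selected, so taking the `top` highest-degree vertices is an assumption; the function names, the toy graph, and the hand-written path list are hypothetical and do not come from the Facebook experiment.

```python
def hub_coverage(adj, sp_db, top=1):
    """Fraction of stored paths containing at least one of the top-degree vertices."""
    # Assumed hub definition: the `top` vertices with the highest degree
    hubs = set(sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:top])
    hits = sum(1 for path in sp_db if hubs.intersection(path))
    return hits / len(sp_db) if sp_db else 0.0

def graph_coverage(adj, sp_db):
    """Fraction of the graph's vertices that appear on at least one stored path."""
    touched = {v for path in sp_db for v in path}
    return len(touched & set(adj)) / len(adj) if adj else 0.0

# Hypothetical toy graph and a hand-written set of shortest paths
adj = {"h": ["a", "b", "c"], "a": ["h"], "b": ["h"], "c": ["h", "d"], "d": ["c"]}
sp_db = [["a", "h", "b"], ["a", "h", "c", "d"], ["c", "d"], ["b", "h"]]

through_hub = hub_coverage(adj, sp_db, top=1)
covered = graph_coverage(adj, sp_db)
```

In this toy case three of the four paths contain the single highest-degree vertex, and every vertex lies on some path.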