Traffic-driven model of the World-Wide-Web Graph

A. Barrat, LPT, Orsay, France
M. Barthélemy, CEA, France
A. Vespignani, LPT, Orsay, France

Outline

• The WebGraph
• Some empirical characteristics
• Various models
• Weights and strengths
• Our model: definition; analysis (analytics + numerics)
• Conclusions

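The Web as a directed graph

[Figure: three nodes i, j, l joined by directed links]

• Nodes i: web pages
• Directed links: hyperlinks
• In- and out-degrees: k_in,i and k_out,i, the numbers of hyperlinks pointing to and leaving page i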

• Small world: captured by Erdős-Rényi graphs

Each pair of vertices is joined by an edge with probability p

Poisson degree distribution with mean <k> = p N
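The relation <k> = p N can be checked numerically. The following is a minimal Python sketch, not taken from the slides; the function names `erdos_renyi` and `mean_degree` and the parameter values are illustrative assumptions.

```python
import random

def erdos_renyi(n, p, seed=0):
    """Undirected G(n, p) graph, returned as a list of neighbour sets."""
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Each pair of vertices is linked independently with probability p.
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def mean_degree(adj):
    """Average number of neighbours per vertex."""
    return sum(len(nb) for nb in adj) / len(adj)

adj = erdos_renyi(n=1000, p=0.01)
k_avg = mean_degree(adj)  # expected to lie close to p * N = 10
```

The exact expectation is p (N - 1), which is indistinguishable from p N for large N.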

Empirical facts

• Small world
• Large clustering: different neighbours of a node are likely to know each other

[Figure: nodes 1, 2, 3 and n, with a higher probability of being connected]

=> graph models with large clustering, e.g. Watts-Strogatz (1998)

Empirical facts

• Small world
• Large clustering
• Dynamical network
• Broad connectivity distributions

• Also observed in many other contexts (from biological to social networks)
• Huge modeling activity

Empirical facts

(Barabási-Albert 1999; Broder et al. 2000; Kumar et al. 2000; Adamic-Huberman 2001; Laura et al. 2003)

Various growing network models

• Barabási-Albert (1999): preferential attachment
• Many variations on the BA model: rewiring (Tadić 2001, Krapivsky et al. 2001), addition of edges, directed model (Dorogovtsev-Mendes 2000, Cooper-Frieze 2001), fitness (Bianconi-Barabási 2001), ...
• Kumar et al. (2000): copying mechanism
• Pandurangan et al. (2002): PageRank + preferential attachment
• Laura et al. (2002): multi-layer model
• Menczer (2002): textual content of web pages

The Web as a directed graph

[Figure: three nodes i, j, l joined by directed links]

• Nodes i: web pages
• Directed links: hyperlinks
• Broad P(k_in); cut-off for P(k_out)

(Broder et al. 2000; Kumar et al. 2000; Adamic-Huberman 2001; Laura et al. 2003)

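Additional level of complexity: Weights and Strengths

[Figure: nodes i, j, l; the link from i to j carries the weight w_ij]

• Links carry weights/traffic: w_ij
• In- and out-strengths: s_in,i and s_out,i, the total traffic on the links entering and leaving page i
• Adamic-Huberman 2001: broad distribution of s_in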
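As an illustration of how strengths follow from link weights, here is a minimal Python sketch. It assumes the usual definition (the strength of a page is the sum of the weights on its links); the function name `strengths` and the toy weights are hypothetical.

```python
from collections import defaultdict

def strengths(weights):
    """Return (s_in, s_out) from a dict mapping a directed link (i, j) to w_ij."""
    s_in, s_out = defaultdict(float), defaultdict(float)
    for (i, j), w in weights.items():
        s_out[i] += w  # traffic leaving page i
        s_in[j] += w   # traffic arriving at page j
    return dict(s_in), dict(s_out)

# Toy example with three pages i, j, l (weights are made up).
w = {("i", "j"): 2.0, ("l", "j"): 1.5, ("j", "l"): 0.5}
s_in, s_out = strengths(w)
# s_in["j"] adds up the traffic on i->j and l->j.
```

Pages with no incoming (or outgoing) links simply do not appear in `s_in` (or `s_out`).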

Model: directed network

[Figure: new node n linking to node i, which links to node j]

(i) Growth

(ii) Strength-driven preferential attachment (new node n: k_out = m out-links)

AND...

“Busy gets busier”

Weights reinforcement mechanism

[Figure: nodes n, i, j]

The new traffic n→i increases the traffic i→j: “Busy gets busier”
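The two ingredients (growth with strength-driven attachment, then weight reinforcement) can be sketched as a small simulation. This is not the authors' code, and several details are assumptions because the corresponding formulas are not legible in this transcript: targets are drawn with probability proportional to in-strength; each new link has weight 1; a new link n→i adds a total weight `delta` to the out-links of i, shared in proportion to their current weights; the seed graph is a small complete directed graph. The offset `s0` is a purely hypothetical device so that new pages with zero in-strength can be selected at all.

```python
import random

def grow_web(steps, m=2, delta=0.5, s0=1.0, seed=0):
    """Grow a weighted directed graph; returns (w, s_in, s_out, k_in)."""
    rng = random.Random(seed)
    # Assumed seed graph: m + 1 pages, each linking to all others with weight 1.
    n0 = m + 1
    w = {(i, j): 1.0 for i in range(n0) for j in range(n0) if i != j}
    out = {i: [j for j in range(n0) if j != i] for i in range(n0)}
    s_in = [float(m)] * n0
    s_out = [float(m)] * n0
    k_in = [m] * n0

    for _ in range(steps):
        n = len(s_in)  # label of the new page
        # Strength-driven attachment: weight s_in + s0 (s0 is hypothetical).
        attract = [s + s0 for s in s_in]
        targets = set()
        while len(targets) < m:
            targets.add(rng.choices(range(n), weights=attract)[0])
        # Growth: add the new page with no links yet.
        s_in.append(0.0)
        s_out.append(0.0)
        k_in.append(0)
        out[n] = []
        for i in targets:
            # Reinforcement: the new traffic n->i boosts every link i->j.
            total = s_out[i]
            for j in out[i]:
                dw = delta * w[(i, j)] / total
                w[(i, j)] += dw
                s_in[j] += dw
            s_out[i] += delta
            # New link n->i with weight 1.
            w[(n, i)] = 1.0
            out[n].append(i)
            s_in[i] += 1.0
            s_out[n] += 1.0
            k_in[i] += 1
    return w, s_in, s_out, k_in

w, s_in, s_out, k_in = grow_web(steps=2000, m=2, delta=0.5)
```

Under these assumptions every step adds m links and a total weight m (1 + delta), so the lists `s_in` and `k_in` can be histogrammed to inspect the strength and degree distributions.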

Evolution equations

(Continuous approximation)

Coupling term

Resolution

Ansatz

supported by numerics:

Results

Approximation

Total in-weight Σ_i s_in,i: approximately proportional to the total number of in-links Σ_i k_in,i, times the average weight <w> = 1+δ

Then: A = 1+δ

γ_s_in ∈ [2; 2+1/m]
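The counting behind the average weight can be spelled out as a short worked equation. It is a sketch under two assumptions that are not legible in this transcript: every new link enters with weight 1, and every new link adds a total weight δ to the out-links of the page it points to. The symbol δ for that increment is itself assumed.

```latex
% After t steps, each adding one page with m out-links:
\[
\sum_i k^{\mathrm{in}}_i(t) \simeq m\,t ,
\qquad
\sum_i s^{\mathrm{in}}_i(t) \simeq m\,(1+\delta)\,t
\]
% so the average weight per link is
\[
\langle w \rangle
  = \frac{\sum_i s^{\mathrm{in}}_i}{\sum_i k^{\mathrm{in}}_i}
  \simeq 1+\delta .
\]
```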

Measure of A

prediction of

Numerical simulations

Approx of

Numerical simulations

NB: broad P(s_out) even if k_out = m

Clustering spectrum

c_i: fraction of connected pairs of neighbours of node i
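A minimal Python sketch of this quantity and of its average over nodes of equal degree is given below. It assumes that links are treated as undirected for the purpose of counting neighbours, which the transcript does not state; the function names and the toy graph are illustrative.

```python
from collections import defaultdict
from itertools import combinations

def local_clustering(adj, i):
    """Fraction of pairs of neighbours of i that are themselves linked."""
    nb = adj[i]
    k = len(nb)
    if k < 2:
        return 0.0  # convention chosen here: no pairs means zero clustering
    linked = sum(1 for a, b in combinations(nb, 2) if b in adj[a])
    return linked / (k * (k - 1) / 2)

def clustering_spectrum(adj):
    """Average local clustering of the nodes of degree k, for each k."""
    by_k = defaultdict(list)
    for i in adj:
        by_k[len(adj[i])].append(local_clustering(adj, i))
    return {k: sum(c) / len(c) for k, c in sorted(by_k.items())}

# Toy undirected graph: a triangle 0-1-2 plus a pendant node 3 attached to 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
spectrum = clustering_spectrum(adj)
```

In the toy graph, node 0 has three neighbours of which only one pair (1, 2) is linked, so its clustering is 1/3, while nodes 1 and 2 have clustering 1.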

Clustering spectrum

• δ increases => clustering increases

• New pages: point to various well-known pages, often connected together => large clustering for small nodes

• Old, popular pages with large k: many in-links from many less popular pages which are not connected together => smaller clustering for large nodes

Clustering and weighted clustering

takes into account the relevance of triangles in the global traffic

Clustering and weighted clustering

Weighted clustering larger than topological clustering: triangles carry a large part of the traffic
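The comparison can be illustrated with a small Python sketch. The formula used on the slides is not preserved in this transcript; the code assumes the weighted clustering coefficient introduced by Barrat et al. (2004) for undirected weighted networks, in which each triangle through node i counts with the average weight of the two links joining i to it, normalised by s_i (k_i - 1). Function names and weights are illustrative.

```python
from itertools import combinations

def symmetric(edges):
    """Build a symmetric dict-of-dicts w[i][j] from (i, j, weight) triples."""
    w = {}
    for i, j, wij in edges:
        w.setdefault(i, {})[j] = wij
        w.setdefault(j, {})[i] = wij
    return w

def weighted_clustering(w, i):
    """Assumed definition: sum of (w_ij + w_ih) over linked neighbour pairs,
    divided by s_i * (k_i - 1)."""
    nb = w[i]
    k = len(nb)
    if k < 2:
        return 0.0
    s = sum(nb.values())
    num = sum(nb[j] + nb[h] for j, h in combinations(nb, 2) if h in w[j])
    return num / (s * (k - 1))

def topological_clustering(w, i):
    """Same count with all weights ignored."""
    nb = w[i]
    k = len(nb)
    if k < 2:
        return 0.0
    tri = sum(1 for j, h in combinations(nb, 2) if h in w[j])
    return 2 * tri / (k * (k - 1))

# Node 0 sits on one triangle (0, 1, 2) that carries most of its traffic.
w = symmetric([(0, 1, 5.0), (0, 2, 5.0), (1, 2, 1.0), (0, 3, 1.0)])
c_w = weighted_clustering(w, 0)
c = topological_clustering(w, 0)
```

With equal weights on all links the two coefficients coincide; here the heavy links lie on the triangle, so `c_w` exceeds `c`.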

Assortativity

k_nn: average connectivity of the nearest neighbours of i

Assortativity

• k_nn: disassortative behaviour, as usual in growing network models, and typical of technological networks

• Lack of correlations in popularity as measured by the in-degree
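The quantity behind this diagnostic can be sketched in a few lines of Python. The sketch assumes undirected neighbourhoods and plain degrees, whereas the slides may use in- or out-degree variants; the function names and the star graph are illustrative.

```python
from collections import defaultdict

def knn(adj, i):
    """Average degree of the nearest neighbours of node i."""
    return sum(len(adj[j]) for j in adj[i]) / len(adj[i])

def knn_spectrum(adj):
    """Average of knn over all nodes of degree k, for each k."""
    by_k = defaultdict(list)
    for i, nb in adj.items():
        if nb:  # isolated nodes have no neighbours to average over
            by_k[len(nb)].append(knn(adj, i))
    return {k: sum(v) / len(v) for k, v in sorted(by_k.items())}

# Star graph: one hub linked to four leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
spectrum = knn_spectrum(star)
```

A spectrum that decreases with k, as in the star where low-degree leaves attach to the high-degree hub, is the signature of disassortative behaviour.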

Summary

• Web: heterogeneous topology and traffic
• Mechanism taking into account the interplay between topology and traffic
• Simple mechanism => complex behaviour, scale-free distributions for connectivity and traffic
• Analytical study possible
• Study of correlations: non-trivial hierarchical behaviour
• Possibility to add features (fitnesses, rewiring, addition of edges, etc.), to modify the redistribution rule...
• Empirical studies of traffic and correlations?
