53
School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Embed Size (px)

Citation preview

Page 1: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

School of InformationUniversity of Michigan

SI 614Small Worlds

Lecture 5

Instructor: Lada Adamic

Page 2: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Outline

Milgram’s small world experiment Watts & Strogatz small world model Kleinberg small world model Watts, Dodds & Newman community model Network models: a few examples

Things for Lada to remember: Online survey for short profiles Email in groups by Monday Feb. 6th

Cormen chapter now available as PDF on cTools PS 1 graded PS 3 available by tomorrow

Page 3: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

NE

MA

Milgram’s experiment (1960’s):

Given a target individual and a particular property, pass the message to a person you correspond with who is “closest” to the target.

Small world experiments then

Page 4: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Milgram’s small world experiment

Target person worked in Boston as a stockbroker. 296 senders from Boston and Omaha. 20% of senders reached target. average chain length = 6.5.

“Six degrees of separation”

Page 5: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Small world experiments now

email experiment Dodds, Muhamad, Watts, Science 301, (2003)

18 targets13 different countries

60,000+ participants24,163 message chains 384 reached their targetsaverage path length 4.0

image by Stephen G. Eick http://www.bell-labs.com/user/eick/index.html(unrelated to small world experiment…)

Page 6: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Targets for the small world experiment at Columbia

a professor at an Ivy League university, an archival inspector in Estonia, a technology consultant in India, a policeman in Australia, a veterinarian in the Norwegian army.

no chain reached the target in Croatia

Page 7: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Accounting for attrition

Approximate 37% participation rate approximately . Probability of a chain of length 10 getting through: .3710 ~ 5 x 10-5

so only one out of 20,000 chains would make it actual # of completed chains: 384 (1.6% of all chains).

Small changes in attrition rates lead to large changes in completion rates

e.g., a 15% decrease in attrition rate would lead to a 800% increase in completion rate

Page 8: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Estimating ‘recovered’ chain lengths for uncompleted chains

<L> = 4.05 for all completed chains

L* = Estimated `true' median chain length

Intra-country chains: L* = 5

Inter-country chains: L* = 7

All chains: L* = 7

Milgram: L * ~ 8-9 hops

Page 9: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Attrition rate stays approx. constant throughout

rL – probability of not passing on the message at distance L from the source

average

95 % confidence interval

Page 10: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Estimated ‘recovered’ chain lengths

observed chain lengths

‘recovered’ histogram of path lengths

inter-countryintra-country

Page 11: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Small world experiment at Columbia

Successful chains disproportionately used• weak ties (Granovetter)• professional ties (34% vs. 13%)• ties originating at work/college• target's work (65% vs. 40%)

. . . and disproportionately avoided• hubs (8% vs. 1%) (+ no evidence of funnels)• family/friendship ties (60% vs. 83%)

Strategy: Geography -> Work

Page 12: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

How many hops actually separate any two individuals in the world?

Participants are not perfect in routing messages They use only local information “The accuracy of small world chains in social networks”

Peter D. Killworth, Chris McCarty , H. Russell Bernard& Mark House: Analyze 10920 shortest path connections between 105 members of

an interviewing bureau, together with the equivalent conceptual, or ‘small world’ routes,

which use individuals’ selections of intermediaries. This permits the first study of the impact of accuracy within small

world chains. The mean small world path length (3.23) is 40% longer than the

mean of the actual shortest paths (2.30) Model suggests that people make a less than optimal small world

choice more than half the time.

Page 13: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Why study small world phenomena?

Curiosity:Why is the world small?How are people able to route messages?

Social Networking as a Business:Friendster, Orkut, MySpaceLinkedIn, Spoke, VisiblePath

Page 14: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Six degrees of separation - to be expected

Pool and Kochen (1978) - average person has 500-1500 acquaintances

Ignoring clustering (the probability that my friend’s friend is not someone unknown to me, but is actually my friend…)

~ 103 first neighbors, 106 second neighbors, 109 third neighbors

Since the number of neighbors grows exponentially with distance (measured in hops traversed in a breadth-first search)Connected random networks have short average path lengths:

<dAB> ~ log(N)

N = population size,dAB = distance between nodes A and B.

But: social networks aren't random…

Page 15: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Reverse small world experiment

Killworth & Bernard (1978): Given hypothetical targets (name, occupation, location, hobbies, religion…)

participants choose an acquaintance for each target Acquaintance chosen based on (most often) occupation, geography only 7% because they “know a lot of people” Simple greedy algorithm: most similar acquaintance two-step strategy rare

Page 16: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

The small world model

High clustering: my friends’ friends tend to be my friends

Watts & Strogatz (1998) - a few random links in an otherwise clustered graph give an average shortest path close to that of a random graph

Page 17: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Networks in nature (empirical observations)

)ln(network Nl

graph randomnetwork CC

neural network of C. elegans,

semantic networks of languages,

actor collaboration graph,

food webs.

Page 18: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Model proposed

Crossover from regular lattices to random graphs Tunable Small world network with (simultaneously):

Small average shortest path Large clustering coefficient (not obeyed by RG)

Page 19: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Two ways of constructing a small world graph

As in many network generating algorithms Disallow self-edges Disallow multiple edges

Select a fraction p of edges

Reposition on of their endpoints

Add a fraction p of additional

edges leaving underlying lattice

intact

Page 20: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Original model

Each node has K>=4 nearest neighbors (local) Probability p of rewiring to randomly chosen nodes p small: regular lattice p large: classical random graph

Page 21: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

p=0 Ordered lattice

Compute the clustering coefficient as follows each node is connected to K neighbors, who can have K*(K-1)/2

pairwise connections between them some of the connections between them are present in the lattice

If K = 4 (connected to two closest neighbors on each side)

C = 3*2/4/3 = ½

Caution: sometimes the lattice will be specified as

each node connects to K closest neighbors

each node connects to all neighbors within distance k (k = K/2)

Page 22: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Clustering coefficient for regular lattice

In general, can have any K a neighbor K/2 hops away from i

can connect to (K/2 – 1) of i’s neighbors

a neighbor K/2-1 hops away can connect to (1 + K/2 – 1) neighbors

K/2 – 2 hops away (2 + K/2 – 1) neighbors

1 hop away 2*(K/2 – 1)

Sum this up multiply by factor of 2 because i

has neighbors on both sides divide by a factor of 2 because

edges are undirected

i

i

i

Page 23: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Clustering coefficient for regular lattice

The number of connections between neighbors is given by

i

i

i

)2(8

3)1

2(

12

0

KKiK

K

j

The maximum number of connections is K*(K-1)/2

→ clustering coefficient is

)1(4

)2(3

K

KC

Page 24: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Average shortest path – regular lattice

Average node is N/4 hops away (a quarter of the way around the ring), and you can hop over K/2 nodes at a time

12

K

Nl

Page 25: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

p=1 Random graph

We’ll talk more about this next week

small N

KC

small ln

ln

K

Nl

There are an average of K links per node.

The probability that any two nodes are connected is p = K/N.

The probability that two nodes which share in a neighbor in common

are connected themselves is the same as any two random nodes: K/N

(actually (K-1)/N because they have already expended one edge on their

common neighbor.

Page 26: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

What happens in between?

Small shortest path means small clustering? Large shortest path means large clustering? Through numerical simulation

As we increase p from 0 to 1 Fast decrease of mean distance Slow decrease in clustering

Page 27: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Change in clustering coefficient and average path length as a function of the proportion of rewired edges

l(p)/l(0)

C(p)/C(0)

10% of links rewired1% of links rewired

No exact analytical solution

Exact analytical solution

Page 28: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Clustering coefficient for SW model with rewiring

The probability that a connected triple stays connected after rewiring probability that none of the 3 edges were rewired (1-p)3

probability that edges were rewired back to each other very small, can ignore

Clustering coefficient = C(p) = C(p=0)*(1-p)3

0.2 0.4 0.6 0.8 1

0.2

0.4

0.6

0.8

1

C(p)/C(0)

p

Page 29: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Clustering coefficient: addition of random edges

How does C depend on p? C’(p)= 3xnumber of triangles / number of connected

triples C’(p) computed analytically for the small world model

without rewiring

)2(4)12(2

)1(3)('

pkpk

kpC

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

p

C’(p)

Page 30: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Degree distribution

p=0 delta-function p>0 broadens the distribution Edges left in place with probability (1-p) Edges rewired towards i with probability 1/N

Page 31: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Model: small world with probability p of rewiring

visit nodes sequentially and rewire linksexponential decay, all nodes have similar number of links

1000 vertices

random network

with average connectivity K

Why does each

node keep at least

K/2 links?

Even at p = 1,

graph is not a purely random graph

Page 32: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Some examples for real networks(in averages)

Network size vertex degree

shortest path

Shortest path in fitted random graph

Clustering

(# triangles)

Clustering

(averaged over vertices)

Clustering in random graph

Film actors 225,226 61 3.65 2.99 0.20 0.79 0.00027

MEDLINE co-authorship

1,520,251 18.1 4.6 4.91 0.45 0.56 1.8 x 10-4

E.Coli substrate graph

282 7.35 2.9 3.04 0.32 0.026

C.Elegans 282 14 2.65 2.25 0.28 0.05

Page 33: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

What if long range links depend on distance?

“The geographic movement of the [message] from Nebraska to Massachusetts is striking. There is a progressive closing in on the target area as each new person is added to the chain”

S.Milgram ‘The small world problem’, Psychology Today 1,61,1967

NE

MA

Page 34: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

nodes are placed on a lattice andconnect to nearest neighbors

additional links placed with puv~

Kleinberg’s geographical small world model

ruvd

Kleinberg, ‘The Small World Phenomenon, An Algorithmic Perspective’ (Nature 2000)

Page 35: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

When r=0, links are randomly distributed, ASP ~ log(n), n size of grid

no locality

0~p p

Page 36: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Links highly localized links on a lattice

4

1~p

d

Page 37: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Links balanced between long and short range

2

1~p

d

Page 38: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

How the small world phenomenon arises

T

S

R|R|<|R’|<|R|

k = c log2n calculate probability that s fails to have a link in R’

R’

Page 39: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Kleinberg, ‘Small-World Phenomena and the Dynamics of Information’ NIPS 14, 2001

Hierarchical network models:

Individuals classified into a hierarchy, hij = height of the least common ancestor.

Group structure models:Individuals belong to nested groupsq = size of smallest group that v,w belong to

f(q) ~ q-

ijhijp b

h b=3

e.g. state-county-city-neighborhoodindustry-corporation-division-group

Page 40: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Identity and search in social networksWatts, Dodds, Newman (Science,2001)

individuals belong to hierarchically nested groups

multiple independent hierarchies h=1,2,..,H coexist corresponding to occupation, geography, hobbies, religion…

pij ~ exp(- x)

Page 41: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic
Page 42: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Other generative models

Assign properties to nodes (e.g. spatial location, group membership)

Add or rewire links according to some rule optimize for a particular property (simulated annealing) add links with probability depending on property of existing

nodes, edges (preferential attachment, link copying) simulate nodes as agents ‘deciding’ whether to rewire or add

links

Page 43: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Example: trade-off between wiring and connectivity

E is the ‘energy’ cost we are trying to minimize L is the average shortest path in ‘hops’ W is the total length of wire used

Small worlds: How and Why, Nisha Mathias and Venkatesh Gopal

Page 44: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Network configuration

rewire using simulated annealing

sequence is shown in order of increasing

Page 45: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Small worlds: the how and the why

same networks, but the vertices are allowed to move using a spring layout algorithm

wiring cost associated with the physical distance between nodes

Page 46: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Shape and efficiency in spatial distribution networksMichael Gastner & Mark Newman

(a) Commuter rail network in the Boston area. The arrow marks the assumed root of the network.

(b) Star graph.

(c) Minimum spanning tree.

(d) The model of Eq. (3) applied to the same set of stations.

Page 47: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Assign an effective cost to each edge

l incorporates a person’s preference for short distances or a small number of hops car travel: short distance airplane travel: small number of hops, sometimes at the expense

of total distance

Construct network using simulated annealing

physical distance number of hops

Page 48: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

slide by Mark Newman

Page 49: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

slide by Mark Newman

Page 50: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

slide by Mark Newman

Page 51: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Roads Air routes

slide by Mark Newman

Page 52: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

How do networks become navigable?Aaron Clauset and Christopher Moore

arxiv.org/abs/cond-mat/0309415

• start with a 1-D lattice (a ring)

• each node is connected to its 2 nearest neighbors and has one long range link (initially just a self-loop)

• we start going from x to y, but go no more than a certain threshold # of steps which repesents how many hops we think getting to y should take

• if we give up, we rewire x’s long range link to the last node we reached

• In the limit N->

• long range link distribution becomes 1/r, r = lattice distance between nodes

• search time starts scaling as log(N)

x

y

Page 53: School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

Summary

The world is small! Watts & Strogatz came up with a simple model to explain

why Later, more sophisticated models of social structure

were developed There are many, many more models that can be thought

up and that give useful insights