Superstar Model: ReTweets, Lady Gaga and Surgery on a Branching Process J. Michael Steele INFORMS-APS San Jose, Costa Rica 2013 J.M. Steele (U Penn, Wharton) July 2013 1 / 30


Superstar Model: ReTweets, Lady Gaga and Surgery on a Branching Process

J. Michael Steele

INFORMS-APS, San Jose, Costa Rica 2013

J.M. Steele (U Penn, Wharton) July 2013 1 / 30


Outline

1 Empirical Observations on the Retweet Graph

2 Where Preferential Attachment Fails

3 The Super Star Model: Just One Parameter

4 Predictions of the Superstar Model

5 Comparison with Preferential Attachment Model

6 Superstar Model: Tools for Analysis

7 Superstar Model: Patterns (or News) You Can Use?


Empirical Observations on the Retweet Graph

Passage from the Retweet Graph to the Superstar Model

Joint work with Tauhid Zaman (MIT) and Shankar Bhamidi (UNC), genuine members of the Twitter generation!

Retweet graph: Given a topic and a time frame, form all the (undirected) retweet arcs and look at the giant component of the resulting graph.

Black Entertainment Television (BET) Awards 2010


Reading the Message from Some Empirical Retweet Graphs

Retweet graphs were constructed for 13 different public events. (Data courtesy of Microsoft Research, Cambridge, MA.)

  • Sports, breaking news stories, and entertainment events
  • Time range for each topic was between 4 and 6 hours

Empirically the graphs are very tree-like (almost no cycles).

Empirically the graphs each have one giant component; this is what we model.

The graphs are taken as undirected, and the degrees tell the whole story.

[Figure: four empirical retweet graphs. A) Federer, N = 505; B) England, N = 1024; C) BET Awards, N = 1724; D) World Cup, N = 2847]


Lady Gaga and ....

[Figure: BET Awards retweet graph, with the superstar vertex labeled]


Where Preferential Attachment Fails

What Goes Wrong with Plain Vanilla Preferential Attachment?

In the empirically observed retweet graphs, the maximum degree is of the order of the graph size, i.e. MaxDeg ∼ pn.

Preferential attachment would predict a sub-linear maximum degree, of order √n.

[Figure: maximum degree versus number of vertices (n) for the empirical retweet graphs, compared with the √n growth of preferential attachment]
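The sub-linear growth under plain preferential attachment can be checked numerically. The following is a minimal sketch, assuming a preferential attachment tree in which each new vertex attaches to one existing vertex chosen with probability proportional to its degree. The function name, the starting edge, and the seed are illustrative choices, not taken from the talk.

```python
import random

def pa_tree_max_degree(n, seed=0):
    """Grow a preferential attachment tree with n edges; return its maximum degree."""
    rng = random.Random(seed)
    # Assumed starting graph: a single edge between vertices 0 and 1.
    degree = [1, 1]
    # Each vertex appears in `endpoints` once per incident edge, so a uniform
    # draw from this list picks a vertex with probability proportional to degree.
    endpoints = [0, 1]
    for new in range(2, n + 1):
        target = rng.choice(endpoints)
        degree[target] += 1
        degree.append(1)
        endpoints.extend((target, new))
    return max(degree)

for n in (1000, 4000, 16000):
    d = pa_tree_max_degree(n)
    # Compare the maximum degree with sqrt(n) and with n itself.
    print(n, d, round(d / n ** 0.5, 2), round(d / n, 4))
```

In runs of this sketch the ratio to √n should stay of order one while the ratio to n shrinks, which is the mismatch with the retweet data that motivates the superstar model.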


The Super Star Model: Just One Parameter

The Superstar Model — It’s Completely Determined by p

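[Figure: the graph G2 with the superstar v0 and non-superstar vertices v1 and v2; edge labels p (attachment to v0) and (1 − p) deg(v1, G2) (attachment to v1)]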

Attach to the superstar with probability p.

Otherwise, with probability 1 − p, attach to one of the non-superstar vertices.

Non-superstar attachment rule: choose the vertex with probability proportional to its degree (the preferential attachment rule).

The only model parameter is p, the superstar parameter.

This is a very simple model, but (1) it has empirical benefits and (2) it is tractable, though not particularly easy.
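The attachment rule can be written out as a short simulation. This is a sketch under stated assumptions: the starting graph is taken to be the superstar v0 joined to a single non-superstar v1, and the degree of a non-superstar counts its edge to the superstar. The function name `simulate_superstar` and the seed are hypothetical.

```python
import random

def simulate_superstar(n, p, seed=0):
    """Return the degree list of a simulated G_n; index 0 is the superstar v0."""
    rng = random.Random(seed)
    # Assumed G_1: the superstar v0 joined to one non-superstar v1.
    degree = [1, 1]
    # One entry per edge endpoint at a non-superstar vertex, so a uniform draw
    # picks a non-superstar with probability proportional to its degree.
    endpoints = [1]
    for new in range(2, n + 1):
        if rng.random() < p:
            # Attach to the superstar; only the new vertex is a non-superstar endpoint.
            target = 0
            endpoints.append(new)
        else:
            # Preferential attachment among the non-superstar vertices.
            target = rng.choice(endpoints)
            endpoints.extend((target, new))
        degree[target] += 1
        degree.append(1)
    return degree

deg = simulate_superstar(5000, 0.2)
print("superstar degree / n:", deg[0] / 5000)
print("largest non-superstar degree:", max(deg[1:]))
```

Keeping a list of edge endpoints makes the degree-proportional choice a single uniform draw, at the cost of memory linear in the number of edges.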


Predictions of the Superstar Model

The Degree of the Superstar Under the Superstar Model

Remark (Built-In Easy Fact)

Let deg(v0,Gn) be the degree of the superstar in Gn. We then have that

\[
\frac{\deg(v_0, G_n)}{n} \to p \quad \text{with probability 1 as } n \to \infty .
\]

Empirically the Superstar degree is Θ(n), and the Superstar Model “Bakes this into the Cake”

But that is ALL that is baked in...

The value of p predicts other features of the graph

The Superstar Model is TESTABLE.
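The reason the remark is "easy" can be spelled out in one line. The sketch below assumes that G_n is obtained from a fixed starting graph G_1 by adding n − 1 vertices, and writes B_i for the indicator that the i-th added vertex attaches to the superstar. This indexing is an assumption made for illustration.

```latex
\[
\deg(v_0, G_n) = \deg(v_0, G_1) + \sum_{i=2}^{n} B_i ,
\qquad B_i \ \text{i.i.d. Bernoulli}(p),
\]
\[
\frac{\deg(v_0, G_n)}{n}
  = \frac{\deg(v_0, G_1)}{n} + \frac{n-1}{n} \cdot \frac{1}{n-1} \sum_{i=2}^{n} B_i
  \;\longrightarrow\; 0 + 1 \cdot p = p
  \quad \text{with probability 1,}
\]
```

where the limit is the strong law of large numbers applied to the B_i. Nothing about the non-superstar vertices enters this argument, which is why the other predictions of the model are genuine tests.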


The Most Starry of the Non-Superstars

Theorem

Let deg_max(G_n) be the maximal non-superstar degree in G_n, i.e.

\[
\deg_{\max}(G_n) = \max_{1 \le i \le n} \deg(v_i, G_n).
\]

If we set

\[
\gamma = \frac{1 - p}{2 - p},
\]

then there is a non-degenerate, strictly positive random variable Δ* such that

\[
n^{-\gamma} \deg_{\max}(G_n) \to \Delta^* \quad \text{with probability 1 as } n \to \infty .
\]

The maximal non-superstar degree is little-oh of the degree of the superstar.

The Superstar Model makes an explicit prediction for the growth rate of the maximum degree of a non-superstar.
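To get a feel for the exponent, the following sketch tabulates γ for a few values of p and compares the order n^γ with the superstar degree of order pn. For 0 ≤ p < 1 we have γ ≤ 1/2 < 1, which is why the maximal non-superstar degree is little-oh of the superstar degree whenever p > 0. The values of p are illustrative, and the random factor Δ* is ignored.

```python
def gamma(p):
    """Growth exponent of the maximal non-superstar degree, for 0 <= p < 1."""
    return (1 - p) / (2 - p)

n = 2847  # size of the World Cup retweet graph quoted earlier in the talk
for p in (0.0, 0.1, 0.3):
    g = gamma(p)
    # n**g is only the order of growth; the random factor Delta* is not modeled.
    print(f"p={p:.1f}  gamma={g:.3f}  n^gamma={n ** g:.1f}  p*n={p * n:.1f}")
```

At p = 0 the exponent is 1/2, the plain preferential attachment value, and it decreases as p grows.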


Realized Degree Distribution in the Superstar Model

Theorem

Let F(k, G_n) be the realized degree distribution of G_n under the Superstar Model,

\[
F(k, G_n) = n^{-1} \, \bigl| \{ 1 \le j \le n : \deg(v_j, G_n) = k \} \bigr| ,
\]

and introduce the superstar model scaling constant

\[
f_{SSM}(k, p) = \frac{2 - p}{1 - p} \, (k - 1)! \prod_{i=1}^{k} \left( i + \frac{2 - p}{1 - p} \right)^{-1} .
\]

We then have

\[
F(k, G_n) \to f_{SSM}(k, p) \quad \text{with probability 1 as } n \to \infty .
\]

KEY POINT: The degree distribution scales like k^{−β}, where β = 3 + p/(1 − p).

This contrasts with the preferential attachment model, which scales like k^{−3}.
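The formula for f_SSM can be evaluated stably with log-gamma functions, since (k − 1)! / ∏(i + c) = Γ(k) Γ(1 + c) / Γ(k + 1 + c) with c = (2 − p)/(1 − p). The sketch below is an illustration under that identity. It checks the p = 0 case against the preferential attachment formula 4/(k(k + 1)(k + 2)) and estimates the tail exponent numerically; the function names are hypothetical.

```python
from math import exp, lgamma, log

def f_ssm(k, p):
    """Limiting fraction of non-superstar vertices with degree k, for 0 <= p < 1."""
    c = (2 - p) / (1 - p)
    # (k-1)! / prod_{i=1}^{k} (i + c) = Gamma(k) * Gamma(1 + c) / Gamma(k + 1 + c)
    return c * exp(lgamma(k) + lgamma(1 + c) - lgamma(k + 1 + c))

def f_pa(k):
    """Limiting degree distribution of plain preferential attachment."""
    return 4 / (k * (k + 1) * (k + 2))

def local_exponent(p, k=10_000):
    """Estimate beta from the decay of f_ssm between k and 2k."""
    return log(f_ssm(k, p) / f_ssm(2 * k, p)) / log(2)

p = 0.2
print("p = 0 agrees with PA at k = 5:", f_ssm(5, 0.0), f_pa(5))
print("estimated exponent:", local_exponent(p))
print("3 + p/(1 - p):     ", 3 + p / (1 - p))
```

Working on the log scale avoids overflow in (k − 1)! for large k.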


Comparison with Preferential Attachment Model

Superstar Model vs Preferential Attachment

Quantity                                  Superstar Model      Preferential Attachment
Superstar degree                          ∼ pn                 NA
Maximal non-superstar degree exponent     (1 − p)/(2 − p)      1/2
Degree distribution power-law exponent    3 + p/(1 − p)        3


Superstar Model Predictions

Use the actual data $G_n$ to fit the superstar degree and predict the degree distribution

Consider the observed degree distribution for each empirical retweet graph:

$F(k, G_n) = n^{-1} \, |\{1 \le j \le n : \deg(v_j, G_n) = k\}|$

Consider the theoretical asymptotic degree distribution under the Superstar Model

$f_{SM}(k, p) = \frac{2-p}{1-p} \, (k-1)! \prod_{i=1}^{k} \left( i + \frac{2-p}{1-p} \right)^{-1}.$

Bottom Line: We get a pretty impressive fit “observed vs predicted”

$F(k, G_n) \approx f_{SM}(k, p)$ where $p = \frac{\text{observed superstar degree}}{n}$

Basis for Tests: Preferential Attachment always predicts...

$f_{PA}(k) = \frac{4}{k(k+1)(k+2)}$
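The two predicted distributions on this slide are easy to evaluate numerically. The sketch below is one way to do it, with the hypothetical function names `f_sm` and `f_pa`; it builds the Superstar Model value through the ratio of consecutive terms rather than through factorials, which avoids overflow for large k.

```python
def f_sm(k, p):
    """Superstar Model limit for integer degree k >= 1 and 0 <= p < 1."""
    a = (2.0 - p) / (1.0 - p)
    value = a / (1.0 + a)            # k = 1 term: a * 0! / (1 + a)
    for j in range(1, k):
        value *= j / (j + 1.0 + a)   # ratio of the (j + 1)-th term to the j-th
    return value


def f_pa(k):
    """Preferential Attachment limit for integer degree k >= 1."""
    return 4.0 / (k * (k + 1) * (k + 2))


if __name__ == "__main__":
    for k in range(1, 6):
        print(k, f_sm(k, 0.0), f_pa(k), f_sm(k, 0.58))
```

Setting p = 0 in the Superstar Model formula gives back the Preferential Attachment formula, so the first two printed columns should agree up to rounding.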


Degree distribution

[Figure: four panels plotting the degree distribution (logarithmic vertical axis) against the degree k (0 to 20), one per retweet graph: Lebron (p' = 0.09), Brazil Portugal (p' = 0.28), BET Awards (p' = 0.58), and Federer (p' = 0.37). Each panel shows the observed $F(k, G_n)$, the Superstar Model prediction $f_{SM}(k, p')$, and the Preferential Attachment prediction $f_{PA}(k)$.]


Degree Distribution Comparison

Compare the relative error of the Superstar Model and Preferential Attachment for different degrees k

Model | Superstar Model | Preferential Attachment
Relative Error | $\frac{|F(k, G_n) - f_{SM}(k, p')|}{f_{SM}(k, p')}$ | $\frac{|F(k, G_n) - f_{PA}(k)|}{f_{PA}(k)}$
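The comparison can be sketched end to end in a few lines: tabulate the observed degree distribution, estimate p as the superstar degree divided by n, and compute the relative error of each model at each degree. Everything below is a minimal sketch; the function names are illustrative, and the degree list and superstar degree in the demo are made-up toy values, not Twitter data.

```python
from collections import Counter


def empirical_degree_distribution(degrees):
    """Fraction of the listed (non-superstar) vertices that have each degree k."""
    n = len(degrees)
    return {k: count / n for k, count in Counter(degrees).items()}


def f_sm(k, p):
    """Superstar Model limit for integer degree k >= 1 and 0 <= p < 1."""
    a = (2.0 - p) / (1.0 - p)
    value = a / (1.0 + a)
    for j in range(1, k):
        value *= j / (j + 1.0 + a)
    return value


def f_pa(k):
    """Preferential Attachment limit for integer degree k >= 1."""
    return 4.0 / (k * (k + 1) * (k + 2))


def relative_error(observed, predicted):
    """|observed - predicted| / predicted, as in the table above."""
    return abs(observed - predicted) / predicted


if __name__ == "__main__":
    degrees = [1, 1, 1, 1, 1, 1, 1, 2, 2, 3]   # hypothetical toy data
    superstar_degree = 4                        # hypothetical toy value
    p_hat = superstar_degree / len(degrees)
    observed = empirical_degree_distribution(degrees)
    for k in sorted(observed):
        print(k,
              relative_error(observed[k], f_sm(k, p_hat)),
              relative_error(observed[k], f_pa(k)))
```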


Degree Distribution Comparison

[Figure: four panels, one each for degree = 1, 2, 3, and 4, plotting the Relative Error (vertical axis, 0 to 1) against the Retweet Graph (horizontal axis, ticks 1 to 13) for Preferential Attachment and for the Superstar Model.]


The Superstar Model and the Realized Degree Distribution: Bottom Line

The Superstar Model implies a mathematical link between the superstar degree and the degree distribution of the non-superstars.

When we look at Twitter data for actual events, we see (1) a superstar and (2) a degree distribution of non-superstars that is more compatible with the superstar model than with the preferential attachment model.

The first property was “baked” into our model, but the second was not. It’s an honest discovery.

Next: How Can One Analyze the Superstar Model?


Superstar Model: Tools for Analysis


Basic Link: Branching Processes

Proto-Idea: Branching processes have a natural role almost anytime one considers a stochastically evolving tree.

More Concrete Observation: If the birth rates depend on the number of children, the arithmetic of the Poisson process relates lovingly to the arithmetic of preferential attachment; this is sweet.

Creating the Superstar: Yule processes don’t come with a superstar. Still, it is not terribly hard to move to multi-type branching processes. In a world with multiple types, you have the possibility of doing some surgery that lets you build a superstar.

Realistic Expectations: The paper is a dense 29 pages. Some of the branching process theory is drawn from the dark well of experts; it’s not off-the-shelf stuff. Still, if you want the deeper parts of the theory (e.g. the distribution of the maximum degree of the non-superstars) then you have to pay the piper.

News You Can Use? One can see the benefits of using multi-type branching processes. One can see that the connection between the Yule process and preferential attachment is natural. This is enough to get you rolling in a variety of applied probability models (social networks are a good start, but they are not the only game).
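The "arithmetic" mentioned above rests on a standard fact about exponential clocks: if each vertex carries an independent clock with its own rate, the first clock to ring belongs to a given vertex with probability equal to its rate divided by the sum of all rates. When the rate is tied to the number of children, that is a preferential attachment choice. The sketch below checks the fact by simulation; the function names and the example rates are illustrative assumptions.

```python
import random


def first_to_ring(rates, rng):
    """Index of the exponential clock that rings first, given each clock's rate."""
    times = [rng.expovariate(rate) for rate in rates]
    return min(range(len(rates)), key=times.__getitem__)


def ring_frequencies(rates, trials=20000, seed=0):
    """Empirical frequency with which each clock is the first to ring."""
    rng = random.Random(seed)
    counts = [0] * len(rates)
    for _ in range(trials):
        counts[first_to_ring(rates, rng)] += 1
    return [count / trials for count in counts]


if __name__ == "__main__":
    rates = [1.0, 2.0, 3.0]   # e.g. 1 + number of children, for three vertices
    total = sum(rates)
    for rate, freq in zip(rates, ring_frequencies(rates)):
        print(rate / total, freq)
```

The two printed columns should be close, which is why a continuous-time birth process with degree-dependent rates reproduces the discrete preferential attachment rule.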


Introduction of a Special Branching Process

Two types of vertices: red and blue

Each vertex gives birth to vertices according to a non-homogeneous Poisson process that has rate proportional to (1 + number of blue children)

$c_B(v, t)$ = number of blue children of $v$ at $t$ time units after the birth of $v$

At birth, each vertex is painted red with probability $p$ and painted blue with probability $1 - p$

[Figure: example tree on the vertices $v_1, \dots, v_6$, annotated with $c_B(v_1, t) = 1$ and $c_B(v_3, t - \tau_3) = 0$.]

$F(t)$ = branching process at time $t$

$\tau_n = \inf\{t : |F(t)| = n\}$
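A minimal simulation sketch of this process follows. It tracks only the order of births (the embedded jump chain), not the continuous birth times: because the rates change only at births, the next mother is vertex v with probability proportional to 1 + (number of blue children of v). The function name, the dictionary layout, and the choice to label the root v1 red (so that a surgery step would join it to a superstar) are assumptions of this sketch, not details taken from the slides.

```python
import random


def simulate_branching(n, p, seed=0):
    """Grow the two-type tree until it holds the n vertices 1, ..., n.

    Returns (parent, color, blue_children), each a dict keyed by vertex.
    """
    rng = random.Random(seed)
    parent = {1: None}
    color = {1: "red"}          # assumption: the root is treated as red
    blue_children = {1: 0}
    for new in range(2, n + 1):
        vertices = list(range(1, new))
        weights = [1 + blue_children[v] for v in vertices]
        mother = rng.choices(vertices, weights=weights, k=1)[0]
        parent[new] = mother
        blue_children[new] = 0
        if rng.random() < p:
            color[new] = "red"   # red children do not raise the birth rate
        else:
            color[new] = "blue"
            blue_children[mother] += 1
    return parent, color, blue_children


if __name__ == "__main__":
    parent, color, blue_children = simulate_branching(10, 0.3, seed=1)
    for v in sorted(parent):
        print(v, parent[v], color[v], blue_children[v])
```

Recomputing the weights at every step costs quadratic time overall, which is fine for illustration but would need a smarter sampler for large n.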


Surgery: From BP Model to Superstar Model

Add an exogenous superstar vertex $v_0$ to the vertex set

For each red vertex, remove the edge to its parent and create an undirected edge to the superstar vertex $v_0$

With the surgery done, all edges are made undirected and all colors are erased

[Figure: the branching process tree $F(\tau_6)$ on $v_1, \dots, v_6$ and the graph $S(\tau_6)$ obtained from it by surgery, with the superstar $v_0$ added.]
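The surgery steps can be sketched as a small function on a parent map and a color map. The data layout and the name `surgery` are illustrative; the six-vertex tree in the demo is hypothetical and is not the tree drawn on the slide. The root has no parent edge to remove, so this sketch simply joins it to the superstar (vertex 0), which is an assumption chosen to match the base case in which $v_1$ is attached to $v_0$.

```python
def surgery(parent, color):
    """Return the undirected edge set after surgery; vertex 0 is the superstar.

    parent maps each vertex to its mother (None for the root);
    color maps each vertex to "red" or "blue".
    """
    edges = set()
    for v, mother in parent.items():
        if color[v] == "red" or mother is None:
            edges.add(frozenset((0, v)))        # rerouted to the superstar
        else:
            edges.add(frozenset((mother, v)))   # blue vertices keep their edge
    return edges                                # colors are dropped here


if __name__ == "__main__":
    # Hypothetical tree, for illustration only.
    parent = {1: None, 2: 1, 3: 1, 4: 2, 5: 3, 6: 4}
    color = {1: "red", 2: "blue", 3: "red", 4: "blue", 5: "blue", 6: "red"}
    for edge in sorted(tuple(sorted(e)) for e in surgery(parent, color)):
        print(edge)
```

Every vertex contributes exactly one edge, so the result on n non-superstar vertices always has n edges.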


Relating the BP Construction with the Superstar Model

Claim: $S(\tau_n)$ is “probabilistically the same” as $G_{n+1}$

Base case: $S(\tau_1) = G_2$, the single edge joining $v_0$ and $v_1$

Need to show that $S(\tau_n)$ and $G_{n+1}$ have the same probabilistic evolution

Superstar: probability of joining superstar = probability of red vertex being born = $p$

Same probability for $S$ and $G$

Non-superstars: degree of vertex = number of blue children + 1

$\deg(v_k, G_{n+1}) = c_B(v_k, \tau_n - \tau_k) + 1$

[Figure: example with $n = 6$. In $F(\tau_6)$, $c_B(v_1, \tau_6 - \tau_1) + 1 = 2$; in $G_7$, $\deg(v_1, G_7) = 2$.]
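The degree identity can be checked numerically with a compact sketch that grows the two-type tree and performs the surgery in a single pass. As before, only the order of births is simulated, the root $v_1$ is assumed to be joined to the superstar (vertex 0), and all names are illustrative rather than taken from the paper.

```python
import random
from collections import Counter


def grow_and_operate(n, p, seed=0):
    """Grow the two-type tree to n vertices and apply the surgery on the fly.

    Returns (edges, blue_children); vertex 0 is the superstar.
    """
    rng = random.Random(seed)
    blue_children = {1: 0}
    edges = [(0, 1)]   # assumption: the root is attached to the superstar
    for new in range(2, n + 1):
        vertices = list(blue_children)
        weights = [1 + blue_children[v] for v in vertices]
        mother = rng.choices(vertices, weights=weights, k=1)[0]
        blue_children[new] = 0
        if rng.random() < p:
            edges.append((0, new))        # red: edge rerouted to the superstar
        else:
            edges.append((mother, new))   # blue: edge to the mother survives
            blue_children[mother] += 1
    return edges, blue_children


def degrees(edges):
    """Degree of every vertex in an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


if __name__ == "__main__":
    edges, blue_children = grow_and_operate(1000, 0.4, seed=2)
    deg = degrees(edges)
    print(all(deg[v] == blue_children[v] + 1 for v in blue_children))
    print(deg[0] / 1000)   # fraction of vertices attached to the superstar
```

Each non-superstar vertex keeps one edge from its own birth and gains one edge per blue child, which is the content of the identity above.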

Further Linking of the BP Model and the Superstar Model

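$$P(v_n \text{ joins } v_k \mid G_n) = P(v_n \text{ is blue and born to } v_k \mid F(\tau_{n-1}))$$

Superstar Model side:

$$P(v_n \text{ joins } v_k \mid G_n) = (1-p)\,\frac{\deg(v_k, G_n)}{\sum_{v_j \in G_n \setminus v_0} \deg(v_j, G_n)} = (1-p)\,\frac{\deg(v_k, G_n)}{2(n-1) - \deg(v_0, G_n)}$$

BP side:

$$P(v_n \text{ is blue and born to } v_k \mid F(\tau_{n-1})) = (1-p)\,\frac{c_B(v_k, \tau_n - \tau_k) + 1}{\sum_{v_j \in F(\tau_{n-1})} \bigl(c_B(v_j, \tau_n - \tau_j) + 1\bigr)} = (1-p)\,\frac{\deg(v_k, G_n)}{2(n-1) - \deg(v_0, G_n)}$$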
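The attachment rule can be made concrete with exact arithmetic. The following is a minimal sketch, assuming a hypothetical degree sequence for a 7-vertex tree and an arbitrary value p = 1/4; the function name `attachment_probabilities` is invented for this illustration. It uses the fact that a tree on n vertices has n − 1 edges, so the non-superstar degrees sum to 2(n − 1) − deg(v0).

```python
from fractions import Fraction


def attachment_probabilities(deg, p):
    """deg: dict vertex -> degree in G_n (vertex 0 is the superstar); p: Fraction."""
    n = len(deg)  # G_n is a tree with n vertices and n - 1 edges
    denom = 2 * (n - 1) - deg[0]  # total degree of the non-superstars
    # Sanity check: for a tree this matches the direct sum of degrees.
    assert denom == sum(d for v, d in deg.items() if v != 0)
    probs = {v: (1 - p) * Fraction(d, denom) for v, d in deg.items() if v != 0}
    probs[0] = p  # the new vertex joins the superstar with probability p
    return probs


# Hypothetical degree sequence of a 7-vertex superstar graph.
deg_G7 = {0: 4, 1: 2, 2: 2, 3: 1, 4: 1, 5: 1, 6: 1}
probs = attachment_probabilities(deg_G7, Fraction(1, 4))
for v, q in sorted(probs.items()):
    print(v, q)
```

Using `Fraction` keeps the probabilities exact, so one can confirm that they sum to 1 and that the non-superstar mass is exactly 1 − p.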

Non-Superstar Degree

Theorem

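There exists a strictly positive, non-degenerate random variable $W$ such that

$$|F(t)|\, e^{-(2-p)t} \to W \quad \text{with probability 1 as } t \to \infty$$

The number of blue children is a Yule process with rate $1-p$

$$c_B(v_j, t)\, e^{-(1-p)t} \to T \quad \text{where } T \sim \mathrm{Exp}(1-p)$$

$$\frac{\deg(v_j, G_n)}{n^{(1-p)/(2-p)}} \approx \frac{c_B(v_j, \tau_n - \tau_j)}{|F(\tau_{n-1})|^{(1-p)/(2-p)}} = \frac{c_B(v_j, \tau_n - \tau_j)\, e^{-(1-p)\tau_n}}{\bigl(|F(\tau_{n-1})|\, e^{-(2-p)\tau_n}\bigr)^{(1-p)/(2-p)}} \to \frac{T}{W^{(1-p)/(2-p)}} \quad \text{with probability 1}$$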
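The growth exponent (1 − p)/(2 − p) can also be seen from a rough mean-field calculation. This is a heuristic sketch, not the branching-process argument of the slide. It assumes that the random denominator 2(n − 1) − deg(v0, G_n) may be replaced by its expectation, with E[deg(v0, G_n)] = 1 + p(n − 2) (one edge from v1 plus one independent chance p per later vertex). The function name and the choice p = 0.5 are illustrative only.

```python
import math


def mean_field_degree(p, n_start, n_end, d_start=1.0):
    """Iterate d_{n+1} = d_n * (1 + (1 - p) / D_n) for n = n_start, ..., n_end - 1,

    where D_n = 2(n - 1) - (1 + p(n - 2)) is the expected total degree
    of the non-superstar vertices in G_n (a mean-field assumption).
    """
    d = d_start
    for n in range(n_start, n_end):
        expected_superstar_deg = 1 + p * (n - 2)
        denom = 2 * (n - 1) - expected_superstar_deg
        d *= 1 + (1 - p) / denom
    return d


p = 0.5
N = 100_000
d_N = mean_field_degree(p, 2, N)
d_2N = mean_field_degree(p, 2, 2 * N)
# If d_n grows like n**alpha, then log(d_2N / d_N) / log(2) is close to alpha.
slope = math.log(d_2N / d_N) / math.log(2)
print(slope, (1 - p) / (2 - p))
```

Doubling n and measuring the log-ratio isolates the power-law exponent without having to fit the random prefactor, which in the theorem is the limit T / W^{(1−p)/(2−p)}.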

Superstar Model: Patterns (or News) You Can Use?

What Did I Learn?

Value of Simple but Honest “Variation”: This is one of the most reliable processes in science. Two old but famous examples: Neyman-Scott models and the GARCH model. Nice company for the Superstar Model.

Nature of Difficulty: Things are often substantially harder than they look at first blush. Here we took quite an obvious variation on the Preferential Attachment model, and we were led to quite different mathematics. Still, the implications of this work do tell us something even about the PA model. One can pass from the SS model to the PA model by letting p → 0.

Using the SS Model:

- The Superstar Model “looks like” preferential attachment with a twist, but the differences are HUGE!

- It’s easy to use since it is easy to reject. The plain vanilla SS Model is rigid. If it works, it’s great; if it doesn’t, you’ll find out quickly.

- This is the charm of a one-parameter model where the parameter is easy to estimate.

- Still, if modeling needs demand changes, further parameters can be introduced.
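To illustrate why the single parameter is easy to estimate, here is a small simulation sketch. Everything in it is an assumption made for illustration: the function names, the seed, the sample size, and the estimator itself, which is simply the fraction of vertices after v1 that attached to the superstar. The simulation follows the attachment rule stated earlier: join v0 with probability p, otherwise join a non-superstar with probability proportional to its degree.

```python
import random


def simulate_superstar(n_vertices, p, seed=0):
    """Grow a superstar graph and return the list of degrees (index 0 is v0)."""
    rng = random.Random(seed)
    deg = [1, 1]  # G_2: the single edge between v0 and v1
    endpoints = [1]  # each non-superstar appears once per unit of degree
    for v in range(2, n_vertices):
        deg.append(1)  # the new vertex always arrives with one edge
        if rng.random() < p:
            deg[0] += 1  # join the superstar
        else:
            # Uniform draw from the multiset = degree-proportional choice.
            target = rng.choice(endpoints)
            deg[target] += 1
            endpoints.append(target)
        endpoints.append(v)
    return deg


def estimate_p(deg):
    """Fraction of the vertices v2, v3, ... that attached to the superstar."""
    n = len(deg)
    return (deg[0] - 1) / (n - 2)  # v1 is joined to v0 by construction


deg = simulate_superstar(20_000, p=0.3, seed=1)
print(estimate_p(deg))
```

The same superstar-degree fraction could be computed on an observed retweet graph; a poor fit of the remaining degree distribution would then be a quick way to reject the plain one-parameter model.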

Thank you!

Thanks Again to My Co-Authors on This Project

Shankar Bhamidi

Tauhid Zaman
