© 2004 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice Implicit Structure and Dynamics of Blogspace Lada Adamic Accelerating Change 2004 (joint work with: Eytan Adar, Li Zhang, and Rajan Lukose)

Implicit Structure and Dynamics of Blogspace


Page 1: Implicit Structure and Dynamics of Blogspace

© 2004 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice

Implicit Structure and Dynamics of Blogspace

Lada Adamic

Accelerating Change 2004

(joint work with: Eytan Adar, Li Zhang, and Rajan Lukose)

Page 2: Implicit Structure and Dynamics of Blogspace

Blogs and the digital experience

• Use:
− Record real-world and virtual experiences
− Easy to record and discuss things “seen” on the net
• Structure: blog-to-blog linking
• Use + Structure
− Great to track “memes”: ideas spreading in the blogosphere like an epidemic

Page 3: Implicit Structure and Dynamics of Blogspace

Our interest

• Macroscopic patterns of blog epidemics
− How does the popularity of a topic evolve over time?
• Microscopic patterns of blog epidemics
− Implicit & Explicit
− Who is getting information from whom?
• Ranking algorithms that take advantage of infection patterns

Page 4: Implicit Structure and Dynamics of Blogspace

Tracking Blogs

• Blogdex: Earliest example
− Lets you see which blogs linked to a site, and when
− Others emerged with similar/related functionality
• Can find epidemic profiles (popularity over time)
• Our question: do different types of information have different epidemic profiles?

Page 5: Implicit Structure and Dynamics of Blogspace

For Example…

[Figure: popularity plotted against time, with one curve labeled “Slashdot Effect” and one labeled “BoingBoing Effect”]

Page 6: Implicit Structure and Dynamics of Blogspace

Clusters reflect different epidemic profiles

[Figure: two panels plotting % of hits (0 to 1) against day (0 to 20)]

• Slashdot: huge surge followed by sharp drop (slashdot-effect)
• Major News – front page: more delayed death (broader interest)

Page 7: Implicit Structure and Dynamics of Blogspace

Clusters

[Figure: two panels plotting % of hits (0 to 1) against day (0 to 20)]

• Products, etc.: sustained over a period of time
• Major-news site (editorial content) – back of the paper
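
To make the idea of clustering epidemic profiles concrete, here is a minimal sketch. It assumes each URL comes with a list of daily hit counts, normalizes them to fractions of the total (the “% of hits” axis), and groups the resulting curves with plain k-means. The slides do not say which clustering method was used, so k-means, the function names, and the toy numbers below are all illustrative assumptions.

```python
import math
import random


def normalize(daily_hits):
    """Turn raw daily hit counts into fractions of the total number of hits."""
    total = sum(daily_hits)
    if total == 0:
        return [0.0] * len(daily_hits)
    return [h / total for h in daily_hits]


def distance(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(profiles, k, iterations=20, seed=0):
    """Plain k-means (an assumed stand-in); returns one cluster index per profile."""
    rng = random.Random(seed)
    centroids = rng.sample(profiles, k)
    assignment = [0] * len(profiles)
    for _ in range(iterations):
        # Assign every profile to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: distance(p, centroids[c]))
            for p in profiles
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(profiles, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


# Toy data (made up): two spiky profiles and two sustained ones.
raw_counts = [
    [90, 5, 3, 1, 1],
    [80, 10, 5, 3, 2],
    [20, 20, 20, 20, 20],
    [25, 20, 20, 20, 15],
]
labels = kmeans([normalize(r) for r in raw_counts], k=2)
print(labels)
```

Normalizing first means the clusters reflect the shape of the curve (surge and drop versus sustained interest) rather than the absolute number of hits.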

Page 8: Implicit Structure and Dynamics of Blogspace


Microscale Example: Giant Microbes

Page 9: Implicit Structure and Dynamics of Blogspace

Microscale Dynamics

• What do we need to track specific epidemics?
− Timings
− Graphs

[Diagram: blogs b1, b2, and b3 placed along a “time of infection” axis running from t0 to t1]

Page 10: Implicit Structure and Dynamics of Blogspace

Microscale Dynamics

• Challenges
− Root may be unknown
− Multiple possible paths
− Uncrawled space, alternate media (email, voice)
− No links

[Diagram: blogs b1, b2, b3, and bn placed along a “time of infection” axis running from t0 to t1, with a “??” marking an unknown element]

Page 11: Implicit Structure and Dynamics of Blogspace

Microscale Dynamics: who is getting info from whom

• Explicit blog to blog links (easy)
− Via links are even better
• Implicit/Inferred transfer (harder)
− Use ML algorithm for link inference problem
  • Support Vector Machine (SVM)
  • Logistic Regression
− What we can use
  • Full text
  • Blogs in common
  • Links in common
  • History of infection
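
As a rough illustration of the link inference problem, the sketch below trains a small logistic regression on per-pair features. The three features (text similarity, blogs in common, links in common, each scaled to 0..1), the training rows, and all function names are assumptions made for this example; the slides list the feature families but not how they were encoded or trained.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def link_probability(features, weights, bias):
    """Estimated probability that information passed between a pair of blogs."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, features)))


# Toy training pairs (made up): (text similarity, blogs in common, links in common).
pairs = [
    (0.8, 0.7, 0.9), (0.6, 0.9, 0.7), (0.9, 0.6, 0.8),  # linked pairs
    (0.1, 0.2, 0.1), (0.2, 0.1, 0.3), (0.3, 0.2, 0.2),  # unlinked pairs
]
is_linked = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(pairs, is_linked)
print(link_probability((0.9, 0.9, 0.9), weights, bias))
print(link_probability((0.1, 0.1, 0.1), weights, bias))
```

In practice the labels would come from pairs where an explicit link is known, and the same feature vectors could be fed to an SVM instead.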

Page 12: Implicit Structure and Dynamics of Blogspace

Visualization

• Zoomgraph tool
− Using GraphViz (by AT&T) layouts
• Simple algorithm
− If single, explicit link exists, draw it
− Otherwise use ML algorithm
  • Pick the most likely explicit link
  • Pick the most likely possible link
• Tool lets you zoom around space, control threshold, link types, etc.
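
The simple algorithm above can be sketched as follows. This is one reading of the slide: candidates are restricted to blogs infected earlier, a single explicit link is drawn directly, several explicit links are ranked by a classifier score, and with no explicit link the best-scoring earlier blog is drawn as an inferred link. The function name, the data layout, and the `score` callback are hypothetical.

```python
def choose_source(blog, infection_time, explicit_links, score):
    """Return (source, kind) for `blog`, or (None, None) if no earlier blog exists.

    infection_time: {blog: time it first mentioned the URL}
    explicit_links: {blog: set of blogs it links to}
    score:          callable(source, target) -> likelihood from a classifier
    """
    earlier = [b for b, t in infection_time.items() if t < infection_time[blog]]
    explicit = [b for b in earlier if b in explicit_links.get(blog, set())]
    if len(explicit) == 1:
        # A single explicit link exists: draw it.
        return explicit[0], "explicit"
    if explicit:
        # Several explicit links: pick the most likely one.
        return max(explicit, key=lambda b: score(b, blog)), "explicit"
    if earlier:
        # No explicit link: pick the most likely possible link.
        return max(earlier, key=lambda b: score(b, blog)), "inferred"
    return None, None


# Toy example (made up): four blogs, with a fixed score per candidate source.
times = {"a": 0, "b": 1, "c": 2, "d": 3}
links = {"b": {"a"}, "c": {"a", "b"}}
toy_scores = {"a": 0.2, "b": 0.7, "c": 0.4}

for blog in times:
    print(blog, choose_source(blog, times, links, lambda src, dst: toy_scores[src]))
```

A threshold on the score, as the tool exposes, could be added by returning `(None, None)` when the best inferred candidate scores too low.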

Page 13: Implicit Structure and Dynamics of Blogspace

Giant Microbes epidemic visualization

[Figure legend: via link, explicit link, inferred link, blog]

Page 14: Implicit Structure and Dynamics of Blogspace

iRank

• “Practical” uses of inferred epidemic information
− Can use a simpler inference (timing)
• Finding good sources
− Invisible authorities

[Diagram: blogs b1, b2, b3, b4, b5, …, bn, with the labels “True source” and “Popular site”]

Page 15: Implicit Structure and Dynamics of Blogspace

iRank Algorithm

• Draw a weighted edge for all pairs of blogs that cite the same URL
• Higher weight for mentions closer together
• Run PageRank
• Control for ‘spam’

[Diagram: “time of infection” axis running from t0 to t1]
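
A minimal sketch of these steps, under stated assumptions: edges point from the later blog to the earlier one, the weight is 1 / (1 + gap in days), and PageRank is a plain power iteration with damping 0.85. The slides specify none of these details, the spam control step is omitted, and the function names and toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_implicit_graph(citations):
    """citations: {url: [(blog, day), ...]} -> {later_blog: {earlier_blog: weight}}."""
    graph = defaultdict(lambda: defaultdict(float))
    for mentions in citations.values():
        for (blog_a, day_a), (blog_b, day_b) in combinations(mentions, 2):
            if blog_a == blog_b or day_a == day_b:
                continue  # no direction can be assigned
            earlier, later = (blog_a, blog_b) if day_a < day_b else (blog_b, blog_a)
            # Assumed weighting: mentions closer together in time weigh more.
            graph[later][earlier] += 1.0 / (1.0 + abs(day_a - day_b))
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration; dangling nodes spread rank evenly."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for node in nodes:
            targets = graph.get(node, {})
            total = sum(targets.values())
            if total == 0:
                for n in nodes:
                    new_rank[n] += damping * rank[node] / len(nodes)
            else:
                for target, weight in targets.items():
                    new_rank[target] += damping * rank[node] * weight / total
        rank = new_rank
    return rank


# Toy data (made up): "source" always mentions a URL first.
citations = {
    "url1": [("source", 0), ("popular", 1), ("fan1", 2), ("fan2", 2)],
    "url2": [("source", 0), ("popular", 2), ("fan1", 3)],
}
ranks = pagerank(build_implicit_graph(citations))
for blog, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(blog, round(value, 3))
```

Pointing edges at the earlier mention means rank flows toward blogs that tend to post first, which is the sense in which a ranking like this could surface a source that few blogs link to explicitly.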

Page 16: Implicit Structure and Dynamics of Blogspace

Do Bloggers Kill Kittens?

Friday morning Wired writes:

"Warning: Blogs Can Be Infectious."

Shortly thereafter Slashdot posts:

"Bloggers' Plagiarism Scientifically Proven"

Which is picked up by Metafilter as:

"A good amount of bloggers are outright thieves."

Page 17: Implicit Structure and Dynamics of Blogspace


Research at the Information Dynamics Lab at HP:

http://www.hpl.hp.com/research/idl

[email protected]