The War on Attention Poverty: Measuring Twitter Authority

Daniel Tunkelang, Google

http://www.wvculture.org/history/thisdayinwvhistory/0424.html

Disclaimers

Much of the material in this presentation is work done prior to my employment at Google.

Google is not, to the best of my knowledge, using TunkRank.

Any opinions expressed are my own, and do not represent Google's official positions.

Executive Summary

Authority requires scarcity.

http://www.southparkstudios.com/

http://en.wikipedia.org/wiki/Diamond

Overview

Aboutness and Authority

Social Networks 101

Measuring Twitter Authority

TunkRank

Aboutness and Authority

http://www.ncgenealogy.org/blogs/ngs2009/2009_04_01_archive.html

http://www.clker.com/clipart-2406.html

Information Retrieval: Pre-Web

http://archimedes.fas.harvard.edu/presentations/2002-03-09/img13.html

Information Retrieval: Web

http://blogoscoped.com/archive/2007-01-11-n25.html

How Authority Matters for IR

Promoting official content

Demoting spam

Ranking everything in between

http://whitehouse.org/

Social Networking Sites

2003: goes live

2010: claims 400M+ users

Global Alexa Top 30 also includes:

Social Networks = Information Feeds

Social Information Overload!

http://loiclemeur.com/english/2007/06/im-overload.html

What's a Friend?

Bands of Reduced Attention

http://bhc3.wordpress.com/2009/02/25/the-serendipity-of-attention/

Asymmetric Follower Model

http://www.engineeringdaily.net/brain-game-weighing-24-coins/

Follower Count as Status

http://www.southparkstudios.com/

Follower Count as Authority?

http://loiclemeur.com/english/2008/12/twitter-we-need-search-by-authority.html

http://twithority.com/

Buy Followers...on eBay!

Exploit Norm of Reciprocity

72% of users follow at least 80% of their followers.

80% of users have at least 80% of their friends as followers.

TwitterRank: finding topic-sensitive influential twitterers. [Weng et al, WSDM 2010]
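Reciprocity figures like these could be measured on any follower graph along the following lines. This is a minimal sketch, not the method used by Weng et al.: the function names (`followers_of`, `follow_back_rate`, `share_of_users_at_least`) and the toy graph are hypothetical, and users with no followers are simply left out of the count.

```python
from typing import Dict, Optional, Set

def followers_of(following: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Invert a 'user -> accounts they follow' map into 'user -> their followers'."""
    followers: Dict[str, Set[str]] = {user: set() for user in following}
    for user, friends in following.items():
        for friend in friends:
            followers.setdefault(friend, set()).add(user)
    return followers

def follow_back_rate(user: str,
                     following: Dict[str, Set[str]],
                     followers: Dict[str, Set[str]]) -> Optional[float]:
    """Fraction of a user's followers that the user follows back (None if no followers)."""
    fans = followers.get(user, set())
    if not fans:
        return None
    return len(fans & following.get(user, set())) / len(fans)

def share_of_users_at_least(following: Dict[str, Set[str]],
                            threshold: float = 0.8) -> float:
    """Share of users (with at least one follower) whose follow-back rate meets the threshold."""
    followers = followers_of(following)
    rates = [follow_back_rate(u, following, followers) for u in followers]
    rates = [r for r in rates if r is not None]
    if not rates:
        return 0.0
    return sum(r >= threshold for r in rates) / len(rates)

# Hypothetical toy graph: three mutual followers, plus a celebrity who follows no one.
toy_following = {
    "alice": {"bob", "carol"},
    "bob": {"alice"},
    "carol": {"alice"},
    "celebrity": set(),
    "fan": {"celebrity"},
}
reciprocal_share = share_of_users_at_least(toy_following)
```

In the toy graph, alice, bob and carol follow back all of their followers, while the celebrity follows back none, so three of the four users with followers clear the 80% threshold.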

Do Actions Speak Louder?

influence = potential of an action of a user to initiate a further action by another user

The Influentials: New Approaches for Analyzing Influence on Twitter [Leavitt et al, 2009]

Dan Zarrella's ReTweetability Metric:

Gaming Retweet Count

Create two users. Tweet. Retweet. Repeat.

Retweet counts are low: less than 2% of tweets

State of the Twittersphere [Zarrella, June 2009]

Twitter cyborgs already produce retweet spam

Twitter Cyborgs [Mowbray and Andrade, 2010]

Actions can be (and are) Faked

What Should We Measure?

"in an information-rich world, the wealth of information means... a scarcity of whatever it is that information consumes... the attention of its recipients."

Designing Organizations for an Information-Rich World [Herbert Simon, 1971]

Introducing...TunkRank!

Demo

http://tunkrank.com/

Retweet Decision Model

Simple Recurrence

Measures expected propagation of tweet from X
p_notice = total attention user devotes to Twitter
p_retweet = probability that user retweets
Note Following(Y) in denominator!
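A recurrence of this kind can be solved by simple fixed-point iteration. The sketch below assumes the commonly quoted form of TunkRank, Influence(X) = sum over Y in Followers(X) of (1 + p * Influence(Y)) / |Following(Y)|, with one constant p standing in for the retweet probability and each user's attention split evenly across everyone they follow. That form, the function name `tunkrank`, the default p = 0.05, and the toy graph are all assumptions made for illustration, not details taken from the slide.

```python
from typing import Dict, Set

def tunkrank(following: Dict[str, Set[str]],
             p: float = 0.05,
             iterations: int = 100,
             tol: float = 1e-9) -> Dict[str, float]:
    """Iterate the assumed recurrence until scores change by less than tol."""
    # Collect every user that appears as a follower or as someone followed.
    users = set(following)
    for friends in following.values():
        users |= friends
    if not users:
        return {}

    # Build the reverse map: user -> set of accounts following that user.
    followers: Dict[str, Set[str]] = {u: set() for u in users}
    for user, friends in following.items():
        for friend in friends:
            followers[friend].add(user)

    influence = {u: 0.0 for u in users}
    for _ in range(iterations):
        updated = {
            # Each follower y contributes (1 + p * influence[y]),
            # diluted by the number of accounts y follows.
            x: sum((1.0 + p * influence[y]) / len(following[y])
                   for y in followers[x])
            for x in users
        }
        delta = max(abs(updated[u] - influence[u]) for u in users)
        influence = updated
        if delta < tol:
            break
    return influence

# Hypothetical toy graph: a, b and c follow "star"; c also follows a.
toy_following = {
    "a": {"star"},
    "b": {"star"},
    "c": {"star", "a"},
    "star": set(),
}
scores = tunkrank(toy_following)
```

With p < 1 the update is a contraction, since each follower's attention sums to one across the accounts they follow, so the iteration settles regardless of the starting scores.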

Discourages Exploiting Reciprocity

Indiscriminate followers who follow many users make low contributions to TunkRank.

Consistent with idea that influence correlates to high follower-friend ratio.

But TunkRank only considers user's followers, not user's friends.
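To make the dilution concrete, here is a small worked example of what a single follower adds to a user's score. It again assumes the commonly quoted per-follower term (1 + p * Influence(Y)) / |Following(Y)|; the helper name `follower_contribution`, the value p = 0.05, and the example numbers are hypothetical.

```python
import math

def follower_contribution(follower_influence: float,
                          following_count: int,
                          p: float = 0.05) -> float:
    """Score added by one follower Y under the assumed TunkRank term."""
    if following_count < 1:
        raise ValueError("a follower must follow at least one account")
    return (1.0 + p * follower_influence) / following_count

# Two followers with the same influence but very different following counts.
selective = follower_contribution(follower_influence=10.0, following_count=20)
indiscriminate = follower_contribution(follower_influence=10.0, following_count=2000)
ratio = selective / indiscriminate
```

Under these assumptions, the follower who follows 2,000 accounts contributes one hundredth of what the follower who follows 20 accounts does, which is why trading reciprocal follows with indiscriminate accounts buys very little.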

TunkRank Pros and Cons

Based entirely on follower graph. Ignores retweets, etc.

Resists manipulation.

Uniformly distributes attention among followers. Distribution is probably a power law.

But fake follow data is hidden.

Bug or a feature?

Press

http://techcrunch.com/2010/06/16/barackobama-techcrunch-twitter-followers/

http://blogs.forbes.com/firewall/2010/07/09/a-better-way-to-filter-twitters-spambots-ask-google/

Research

TwitterRank: finding topic-sensitive influential twitterers. [Weng et al, 2010]

Overcoming Spammers in Twitter: A Tale of Five Algorithms [Gayo-Avello and Brenes, 2010]

Nepotistic Relationships in Twitter and their Impact on Rank Prestige Algorithms [Gayo-Avello, 2010]

Go TunkRank! [Gayo-Avello, 2010]

similar to PageRank but better vs. cheating

aggressive marketers almost indistinguishable from common users

spammers grab small amount of global available prestige

agrees with PageRank for top-ranked users

simple, induces plausible rankings, severely penalizes spammers compared to PageRank

Room for Improvement

Still can be gamed through fake users.

Multiply by follow cost?

Consider user actions?

Topic-sensitivity?

Non-uniform distribution?

Tradeoff of simplicity vs. realism.

http://followcost.com/

Conclusion

Web IR is unthinkable without modeling attention scarcity.

Social networks are new and increasingly important information feeds.

We need measures to mitigate social information overload.

TunkRank is a promising proof-of-concept.

Thank you!

...and thanks to Jason Adams for developing
and maintaining the http://tunkrank.com site!

Questions?

Email: [email protected]
Twitter: @dtunkelang
Blog: http://thenoisychannel.com/