February 11, 2008 WSDM 1
Preferential Behavior in Online Groups
Lars Backstrom, Ravi Kumar, Cameron Marlow, Jasmine Novak, Andrew Tomkins
Power Users
Executive Summary of Preferential Treatment
Long term power-users are:
1) 20 times more likely to receive a response upon joining
2) Twice as likely to receive a response upon becoming heavily engaged
3) 9 times more likely to have early responses come from other power-users
Outline
• Introduction to online groups
• Experimental set-up
• Statistics of group “cores”
• Statistics of heavily engaged users
• Preferential treatment of engaged users
• Model: Predicting deep engagement
Online Groups
• A majority of internet users participate in some form of “online group” related to hobbies, beliefs or offline relationships (Pew 2001)
• Groups vary along a number of dimensions:
  – Scale
  – Online vs. offline relationships
  – Broadcast, Q/A, and interaction
  – etc.
• Examples
  – Ithaca Rotary Club mailing list
  – Palo Alto Parenting group
  – Aerosmith fan club on MySpace
Yahoo Groups
• 100 million users, 6 million groups
• A group can be created by any user, who then becomes its moderator
  – The moderator controls privacy settings, access, memberships, etc.
• Content includes information pages, multimedia content, and message boards.
• The majority of content resides in message boards.
  – Members may post a message on a new topic or respond to a message posted earlier
  – Users may read content online or receive it by email
  – ~6 million groups, ~6 billion messages
• We used data from one year: May 2005 - May 2006
Privacy and Size
• Analysis performed on several categories of groups
• Size
  – Small: fewer than 20 unique posters
  – Medium: 20-99 unique posters
  – Large: greater than 100 posters
• Privacy
  – Public: open and listed, or open and unlisted
  – Semi-public: restricted and listed
  – Private: closed and listed, closed and unlisted, or restricted and unlisted
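The size and privacy buckets above can be expressed as a small lookup. This is only a sketch: the function names and the `access` / `listed` inputs are hypothetical, and because the slide leaves a group with exactly 100 posters unassigned, the code assumes such a group counts as large.

```python
def size_category(unique_posters: int) -> str:
    """Bucket a group by its number of unique posters."""
    if unique_posters < 20:
        return "small"
    if unique_posters < 100:
        return "medium"
    # Assumption: the slide says "greater than 100"; exactly 100 is treated as large here.
    return "large"


# (access setting, listed in the directory?) -> privacy category from the slide
PRIVACY = {
    ("open", True): "public",
    ("open", False): "public",
    ("restricted", True): "semi-public",
    ("restricted", False): "private",
    ("closed", True): "private",
    ("closed", False): "private",
}


def privacy_category(access: str, listed: bool) -> str:
    """Map a group's access setting and listing flag to a privacy category."""
    return PRIVACY[(access, listed)]
```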
Yahoo Groups
Engaged Users & Thriving Groups
• Various degrees of user engagement
  – From lurkers to heavily engaged users
• Our focus: users who are heavily engaged in the group, with a high level of posting activity
• What differentiates these engaged users? Are they treated differently? Do they behave differently?
• Look at “thriving” groups
Thriving Groups
Three requirements to be a “thriving” group:
1) Baseline users
  – At least 10 users must post during the year
2) Baseline traffic
  – At least two messages in every 30-day window
3) Dense period
  – A two-month period during which every 7-day interval has at least 10 posts
New corpus: 44,473 groups, 1M users
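The three requirements can be checked mechanically from a group's post log. The sketch below is one possible reading, not the authors' code: it assumes windows slide one day at a time, that "two-month" means 60 days, and that posts arrive as hypothetical `(user_id, date)` pairs.

```python
from datetime import date, timedelta


def _window_sums(daily, width):
    """Post counts for every run of `width` consecutive days."""
    prefix = [0]
    for count in daily:
        prefix.append(prefix[-1] + count)
    return [prefix[i + width] - prefix[i] for i in range(len(daily) - width + 1)]


def is_thriving(posts, start, end, dense_days=60):
    """posts: iterable of (user_id, date); start and end are inclusive dates."""
    daily = [0] * ((end - start).days + 1)
    users = set()
    for user, day in posts:
        if start <= day <= end:
            daily[(day - start).days] += 1
            users.add(user)

    # 1) Baseline users: at least 10 distinct posters during the year.
    if len(users) < 10:
        return False

    # 2) Baseline traffic: at least two messages in every 30-day window.
    if any(total < 2 for total in _window_sums(daily, 30)):
        return False

    # 3) Dense period: some stretch of `dense_days` days in which every
    #    7-day interval has at least 10 posts, i.e. a run of
    #    dense_days - 7 + 1 consecutive qualifying 7-day windows.
    needed = dense_days - 7 + 1
    run = 0
    for total in _window_sums(daily, 7):
        run = run + 1 if total >= 10 else 0
        if run >= needed:
            return True
    return False
```

Prefix sums keep each window check constant-time, so a year of daily counts is scanned in a single pass per requirement.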
k-Cores
• We define the k-core of a group at time t as follows:
• For a two-week window around t, a user is in the k-core if:
  – the user has replied to k distinct users in the group
  – the user has been replied to by k distinct users
[Figure: example of a 3-core user]
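The definition above can be sketched directly from a reply log. The code makes several assumptions not stated on the slide: the two-week window is centered on t, "k distinct users" is read as "at least k", self-replies are ignored, and replies arrive as hypothetical `(replier, original_poster, timestamp)` triples. It also follows the slide's per-user condition literally, without the iterative pruning used in the graph-theoretic k-core.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def k_core_members(replies, t, k, window=timedelta(days=14)):
    """Return the users in the k-core of one group at time t.

    replies: iterable of (replier, original_poster, timestamp) triples.
    """
    half = window / 2
    replied_to = defaultdict(set)  # user -> distinct users they replied to
    replied_by = defaultdict(set)  # user -> distinct users who replied to them
    for replier, poster, ts in replies:
        if replier != poster and abs(ts - t) <= half:
            replied_to[replier].add(poster)
            replied_by[poster].add(replier)
    return {
        user
        for user in replied_to
        if len(replied_to[user]) >= k and len(replied_by.get(user, ())) >= k
    }
```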
Core Size
48% of group/time pairs have a 2-core of at least 6 people
Fraction of Posters in Core
[Figure: fraction of posters in the core (y-axis, 0 to 1) by group size (Small, Medium, Large), with one series each for Private, Semi-public, and Public groups]
Time Spent in Core
Small private groups: 20% of core users are in the core for less than 2 weeks
Large public groups: 94% of core users are in the core for less than 2 weeks
Small private groups: 48% of core users are in the core for 200+ days
Half-life of Cores
Core Populations
• Light: briefly enters the conversation but does not enter the core
• Short core: enters the core for less than 50 days
• Long core: enters the core for 50 days or more
Population sizes:
Light: 774k
Short core: 134k
Long core: 90k
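The three populations amount to a threshold on time spent in the core. A minimal sketch, assuming a hypothetical `days_in_core` value has already been computed per user; the shares at the end are plain arithmetic on the counts given on the slide.

```python
from collections import Counter

LONG_CORE_DAYS = 50  # threshold stated on the slide


def core_population(days_in_core: int) -> str:
    """Assign a user to the light, short-core, or long-core population."""
    if days_in_core <= 0:
        return "light"
    if days_in_core < LONG_CORE_DAYS:
        return "short-core"
    return "long-core"


def population_shares(counts):
    """Convert population counts into fractions of the total."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


# Illustrative input only: made-up days-in-core values for five users.
example = Counter(core_population(d) for d in [0, 3, 49, 50, 200])

# Counts reported on the slide, in thousands.
slide_counts = {"light": 774, "short-core": 134, "long-core": 90}
shares = population_shares(slide_counts)
```

With the slide's counts, the long-core population is roughly 9% of the 998k total.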
Long Core Users Across Groups
Reading the plot: the point (6, 0.55) means that a user who was long-core in the first 6 groups joined has a 55% probability of being long-core in the 7th.
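A curve of this kind can be estimated from per-user membership histories. The sketch below is an assumption about how such points might be computed, not the authors' procedure; `histories` is a hypothetical input holding, for each user, one boolean per group in join order (True if the user became long-core in that group).

```python
from collections import Counter


def continuation_probability(histories):
    """Estimate P(long-core in group n+1 | long-core in the first n groups).

    Returns a dict mapping n to the estimated probability.
    """
    eligible, continued = Counter(), Counter()
    for history in histories:
        for n in range(1, len(history)):
            if not history[n - 1]:
                break  # the user was not long-core in all of the first n groups
            eligible[n] += 1
            continued[n] += history[n]  # True counts as 1
    return {n: continued[n] / eligible[n] for n in eligible}
```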
Multiple Memberships
[Figure; axis label: probability]
Preferential Treatment of Engaged Users
• Are engaged, or “long-core”, users treated differently within a group?
• Yes! We detail three key forms of preferential treatment given to heavily engaged users.
Response to Newcomer
[Figure: chart with y-axis from 0 to 0.5, by newcomer type (Light, Short-core, Long-core), with one series per privacy and size combination (Private, Semi-public, and Public for each of Small, Medium, and Large)]
Response to Core Members
[Figure: chart with y-axis from 0 to 0.35, comparing Long-core and Short-core members, with one series each for Public, Semi-public, and Private groups]
Response Probability by Newcomer Type
[Figure: probability (y-axis, 0 to 1) by newcomer type (Light, Short-core, Long-core); legend entries as extracted: Long-core, Column 2, Light]
Long-core Response Probability
[Figure: chart with y-axis from 0.28 to 0.34, comparing “At time of joining core” with “When in core 50+ days”, with one series each for Public, Semi-public, and Private groups]
Summary of Preferential Treatment
Heavily engaged “long-core” users are:
1) 20 times more likely to receive a response upon joining
2) Twice as likely to receive a response upon becoming heavily engaged
3) 9 times more likely to have early responses come from other long-core users
Note: Probability of receiving a response increases until joining the core, then begins to decline.
First Post Types
100 first posts of long-core users:
• No decision (57%): no information to determine a relationship
• Introduction (37%): the new poster introduces herself to the group
• Friends (6%): the newcomer has some prior relationship with another group member
Pregnancy-and-pups: Coccidiosis
I am new to this board and I have enjoyed reading the posts. I am hoping you can help me learn more about coccidiosis.
I have a litter of puppies whose stool is good. They have been on Albon for 10 days. One of the puppies went home and to the vet today and has coccidiosis. Why is that? What else can I use to be sure the puppies are free of cocci?
I do appreciate all your input and all your time in helping me!
skatefans: Appropos of Barbara Cook
Hi! This is my first "Skatefans" post. (I've been reading -- don't like the word "lurking" -- Skatefans since about the summer of 2000, but haven't had the time to post before.)
While everyone, of course, is entitled to their own likes and dislikes, I'd just like to add some thoughts about "Fosse" from someone who's been a very big fan of that program. I've only been following figure skating since 1997 so my frame of reference is obviously limited but, in terms of the exhibition programs that I've seen duing this period, I think "Fosse" is one of the "landmark" exhibition programs that I've seen (although I can also see some of the problems with it that people have been pointing out).
Modeling Long-Core Engagement
Factors at work creating long-core engagement:
• User factor: a user’s personality causes her to become long-core in every group she joins
• Group factor: a group is so welcoming, or its topic so engaging, that users are likely to become long-core
Model
[Figure: users and groups. Example user: chance of being long-core in a group, 0.7. Example group: chance of a random member being a long-core user, 0.3.]
• For each (u,g) pair, predict whether the pair is in the set H of memberships that are long-core.
• Pr[(u,g) in H] = 1 - (1-p(u))(1-p(g))
• Task is to choose the best p(u) and p(g) to reproduce H, the set of long-core memberships
• Evaluate quality by the likelihood of predicting H
• Consider three variants:
  – Use only properties of users, p(g) = 0
  – Use only properties of groups, p(u) = 0
  – Allow both p(u) and p(g) to be arbitrary
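The formula above can be evaluated directly. The sketch below computes the membership probability and a log-likelihood over a set of memberships; the log-likelihood form (independent Bernoulli outcomes per membership), the clamping constant, and the dictionary inputs are assumptions for illustration, and no fitting procedure is shown because the slides do not describe one.

```python
import math


def membership_probability(p_u: float, p_g: float) -> float:
    """Pr[(u,g) in H] = 1 - (1 - p(u)) * (1 - p(g))."""
    return 1.0 - (1.0 - p_u) * (1.0 - p_g)


def log_likelihood(p_user, p_group, memberships, long_core):
    """Log-likelihood of the observed long-core set H under the model.

    p_user, p_group: dicts of per-user and per-group probabilities
                     (a missing key is treated as probability 0).
    memberships:     iterable of (u, g) pairs.
    long_core:       set of (u, g) pairs that are long-core (the set H).
    """
    eps = 1e-12  # assumption: clamp so that log(0) never occurs
    total = 0.0
    for u, g in memberships:
        p = membership_probability(p_user.get(u, 0.0), p_group.get(g, 0.0))
        p = min(max(p, eps), 1.0 - eps)
        total += math.log(p if (u, g) in long_core else 1.0 - p)
    return total


# Using the two values from the model figure: p(u) = 0.7 and p(g) = 0.3.
example = membership_probability(0.7, 0.3)  # 1 - 0.3 * 0.7, about 0.79
```

The user-only and group-only variants fall out of the same code by passing an empty dictionary for `p_group` or `p_user`, which sets the corresponding factor to 0.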
Analytical Results
Model        Constraint    % of correct edges
User-only    p(g) = 0      94.9
Group-only   p(u) = 0      85.6
Combined     none          95.1
Improvement Using Group Factor
Fin
• Social analysis of one of the world’s largest collections of online communities
• Proposed a partitioning of the data to select for active communities of engaged users
• Examined several levels of engagement: “light”, “short-core”, and “long-core”
• Identified several striking ways in which heavily engaged users are given preferential treatment from other members of the group
• Proposed a model to study factors contributing to long-term engagement and showed that both user and group factors play a role.
Fin++
Special thanks to the Groups team: Di-fa Chang, Lee Clancy, David Kopp, Bobby Lee, Maria Saltz and Gordon Strause.
Ravi Kumar, Jasmine Novak & Andrew Tomkins
{ravikuma,jnovak,atomkins}@yahoo-inc.com
Lars Backstrom [email protected]
Cameron Marlow [email protected]