1
QUANTIFYING AND BURSTING THE ONLINE FILTER BUBBLE
KIRAN GARIMELLA
KCL, 13 FEB 2017
2
HELLO!
2011
2013
2014
BACHELORS & MASTERS IN COMPUTER SCIENCE
BARCELONA
DOHA, QATAR
HYDERABAD, INDIA
HELSINKI, FINLAND
RESEARCH ENGINEER
RESEARCH ASSOCIATE
PHD
ADVISOR: ARISTIDES GIONIS
EXPECTED: SEPT 2017
3
OVERVIEW
▸ Motivation
▸ Summary of the thesis
▸ Shallow dive into one sub-topic
4
SOCIAL MEDIA BUBBLE
5
FILTER BUBBLE
6
ECHO CHAMBERS
7
THE POLARIZATION CYCLE
USER HOMOPHILY
ALGORITHMIC PERSONALIZATION
Increased Polarization
8
POLARIZATION - TWITTER
9
BLOGS
10
11
US SENATE VOTES
12
HOW CAN WE DEAL WITH THE POLARIZATION ON SOCIAL MEDIA?
THIS THESIS
13
RESEARCH QUESTIONS
1. Identify polarized discussions on social media and quantify their severity.
2. Track the evolution of polarized discussions and understand their properties.
3. Design ways to reduce the polarization.
14
RESEARCH QUESTION I
1. IDENTIFYING AND QUANTIFYING POLARIZED DISCUSSIONS
▸ Using different types of user interactions
  ▸ A. Retweet network
  ▸ B. Reply network
15
IDENTIFYING AND QUANTIFYING POLARIZED DISCUSSIONS
1 A. QUANTIFYING CONTROVERSY ON SOCIAL MEDIA [WSDM’16, CSCW’16]
16
QUANTIFYING CONTROVERSY
▸ In the wild
▸ Not necessarily political controversies
▸ Compare across controversies
▸ Language independent
17
NEED NOT BE POLITICAL …
18
NEED NOT BE POLITICAL …
19
COMPARING ACROSS CONTROVERSIES
20
SOLUTION
▸ Graph-based formulation
▸ Model conversations using a retweet graph
▸ Nodes: users, Edges: retweets
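A minimal sketch of the retweet-graph formulation above, assuming the input is a list of (retweeter, original author) pairs. That input format, the function name and the sample users are illustrative assumptions, not part of the slides.

```python
from collections import defaultdict

def build_retweet_graph(retweets):
    """Build a directed, weighted retweet graph.

    `retweets` is an iterable of (retweeter, original_author) pairs.
    Nodes are users; an edge u -> v means u retweeted v, and the edge
    weight counts how many times that happened.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for retweeter, author in retweets:
        if retweeter == author:
            continue  # ignore self-retweets
        graph[retweeter][author] += 1
        graph[author]  # touch the key so the author exists as a node
    return {user: dict(targets) for user, targets in graph.items()}

# Hypothetical sample data.
sample = [("alice", "bob"), ("alice", "bob"), ("carol", "bob"), ("dave", "erin")]
retweet_graph = build_retweet_graph(sample)
print(retweet_graph["alice"])
```

The graph is kept as a plain dictionary of dictionaries so that any clustering step can consume it without extra dependencies.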
21
EXAMPLE
Retweet graphs. Controversial: #beefban, #russia march. Non-controversial: #sxsw, #germanwings.
22
EXAMPLE
Retweet graphs and follow graphs, controversial vs. non-controversial
23
PIPELINE
▸ Graph building: Retweets, Mentions, Social network, Content
▸ Graph partitioning: any clustering algorithm
▸ Controversy measure: Random walk, Edge-betweenness, 2d-embedding, Sentiment variance
▸ Output: controversy score
24
SENTIMENT VARIANCE
▸ Controversy = intensified sentiments
▸ Positive and negative sentiments on each side are higher compared to non-controversial issues
▸ Language dependent
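The idea can be illustrated with a small sketch. It assumes each tweet already has a sentiment score in [-1, 1] from some language-specific sentiment tool (not shown). The scores below are invented purely for illustration and are not results from the talk.

```python
from statistics import pvariance

def sentiment_variance(scores):
    """Population variance of per-tweet sentiment scores for one topic."""
    return pvariance(scores)

# Made-up scores: strong feelings in both directions vs. mild, similar feelings.
heated_topic = [-0.9, -0.8, 0.9, 0.7, -0.6, 0.8]
calm_topic = [0.1, 0.0, -0.1, 0.2, 0.1, 0.0]

print(sentiment_variance(heated_topic) > sentiment_variance(calm_topic))
```

Because the scores come from a sentiment tool trained for one language, this measure inherits that language dependence, unlike the graph-based measures.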
25
IDENTIFYING AND QUANTIFYING POLARIZED DISCUSSIONS
1 B. A MOTIF-BASED APPROACH FOR IDENTIFYING CONTROVERSY [UNDER REVIEW]
▸ Use motifs defined on the reply networks
26
REPLY NETWORKS
27
CONTROVERSIAL NON-CONTROVERSIAL
REPLY NETWORKS
28
MOTIFS
29
RESEARCH QUESTION II
2. POLARIZATION OVER TIME
▸ A. How do polarized debates change with interest?
▸ B. Has polarization on Twitter increased over the years?
30
POLARIZATION OVER TIME
2 A. HOW DO POLARIZED DEBATES CHANGE WITH INTEREST? [UNDER REVIEW]
▸ Polarization increases with interest
▸ Most retweeting activity occurs within a side
▸ Endorsement network becomes more hierarchical, and a large fraction of edges go from periphery to core
▸ Content becomes more similar between the two sides
31
POLARIZATION OVER TIME
2 B. HAS POLARIZATION INCREASED OVER THE YEARS? [UNDER REVIEW]
▸ Are Twitter users less likely to follow/retweet users from both sides?
▸ Are users less likely to use biased content?
▸ Large-scale study: 700,000 users, 2B tweets, 8 years
32
RESEARCH QUESTION III
3. REDUCING POLARIZATION
▸ A. Reducing Controversy by Connecting Opposing Views
▸ B. Balancing Information Exposure in Social Networks
33
REDUCING POLARIZATION
3 A. REDUCING CONTROVERSY BY CONNECTING OPPOSING VIEWS [WSDM’17]
34
POLARIZATION - TWITTER
35
HOW CAN WE BRIDGE THE DIVIDE?
THIS PAPER
36
REDUCING CONTROVERSY BY CONNECTING OPPOSING VIEWS
HOW CAN WE BRIDGE THE DIVIDE?
▸ Connect the two sides
▸ Model interactions as a graph
  ▸ Retweet graph (Nodes: users, Edges: retweets)
37
MEASURE OF POLARIZATION
▸ Quantify degree of polarization in a network
▸ How well does information flow between the two sides?
38
RANDOM WALK CONTROVERSY SCORE
▸ Authoritative users exist on both sides of the controversy
▸ How likely a random user on either side is to be exposed to authoritative content from the opposing side
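The slides do not spell out the formula. The sketch below uses the form given in the cited WSDM’16 work as recalled here, RWC = P_XX · P_YY − P_XY · P_YX, where P_AB is the probability that a walk started on side A given that it ended at an authoritative user on side B. Treating the highest-degree users as "authoritative", the undirected adjacency lists, and all parameter names are assumptions of this sketch.

```python
import random

def rwc_score(adj, side_x, side_y, n_auth=1, walks=2000, max_steps=1000, seed=0):
    """Monte Carlo estimate of RWC = P_XX * P_YY - P_XY * P_YX."""
    rng = random.Random(seed)

    def authorities(side):
        # Highest-degree users of a side; ties are broken by node name.
        return set(sorted(side, key=lambda u: (-len(adj[u]), u))[:n_auth])

    auth = {"X": authorities(side_x), "Y": authorities(side_y)}
    counts = {(s, e): 0 for s in "XY" for e in "XY"}  # (start side, end side)

    for start, side in (("X", sorted(side_x)), ("Y", sorted(side_y))):
        for _ in range(walks):
            node = rng.choice(side)
            for _ in range(max_steps):
                end = next((lab for lab in "XY" if node in auth[lab]), None)
                if end is not None:
                    counts[(start, end)] += 1  # walk stops at an authority
                    break
                if not adj[node]:
                    break  # dead end: the walk is discarded
                node = rng.choice(adj[node])

    def p(start, end):
        total = counts[("X", end)] + counts[("Y", end)]
        return counts[(start, end)] / total if total else 0.0

    return p("X", "X") * p("Y", "Y") - p("X", "Y") * p("Y", "X")

# Two toy graphs: sides that never interact, and sides with bridging edges.
side_x, side_y = {0, 1, 2}, {3, 4, 5}
separated = {0: [1, 2], 1: [0], 2: [0], 3: [4, 5], 4: [3], 5: [3]}
bridged = {0: [1, 2, 4, 5], 1: [0, 3], 2: [0, 3],
           3: [4, 5, 1, 2], 4: [3, 0], 5: [3, 0]}
separated_score = rwc_score(separated, side_x, side_y)
bridged_score = rwc_score(bridged, side_x, side_y)
print(separated_score, bridged_score)
```

A score close to 1 means walks almost never reach the other side's authorities. Adding edges across the divide pulls the score down, which is the quantity the edge-recommendation problem tries to reduce.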
39
RANDOM WALK CONTROVERSY SCORE (RWC)
X Y
40
RANDOM WALK CONTROVERSY SCORE (RWC)
X Y
41
RANDOM WALK CONTROVERSY SCORE (RWC)
X Y
42
RANDOM WALK CONTROVERSY SCORE (RWC)
43
RWC SCORE: 0.95
RWC SCORE: 0.12
44
PROBLEM
▸ Given a graph
▸ Two sides
▸ RWC score
45
FIND THE k BEST EDGES TO ADD TO THE GRAPH THAT MAXIMIZE THE REDUCTION IN RWC SCORE
46
REDUCING POLARIZATION
Side 1 Side 2
REDUCING CONTROVERSY BY CONNECTING OPPOSING VIEWS
47
ALGORITHMS
▸ Greedy
  ▸ Look at all pairs of nodes
  ▸ Find the k pairs that give the highest reduction in RWC
  ▸ O(n²), n: number of nodes
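The brute-force procedure above can be sketched as follows. The controversy measure is passed in as a function. The `toy_score` used in the example is only a stand-in so that the snippet runs on its own; it is not RWC. The node names and weights are hypothetical, and restricting candidates to cross-side pairs is an assumption of this sketch.

```python
from itertools import product

def best_k_edges(side_x, side_y, edges, score, k):
    """Score every missing cross-side edge on its own and keep the k best.

    `score(edges)` is any controversy measure (lower is better). Each of
    the O(n^2) candidate pairs triggers one re-computation of the score.
    """
    base = score(edges)
    gains = []
    for u, v in product(sorted(side_x), sorted(side_y)):
        if (u, v) in edges or (v, u) in edges:
            continue  # edge already exists
        reduction = base - score(edges | {(u, v)})
        gains.append((reduction, (u, v)))
    gains.sort(key=lambda g: (-g[0], g[1]))  # largest reduction first
    return [edge for _, edge in gains[:k]]

# Toy stand-in for a controversy score (NOT the real RWC measure).
weight = {"x1": 3, "x2": 1, "y1": 2, "y2": 1}
side_x, side_y = {"x1", "x2"}, {"y1", "y2"}

def toy_score(edges):
    bridge = sum(weight[u] * weight[v]
                 for u, v in edges if u in side_x and v in side_y)
    return 1 / (1 + bridge)

existing = frozenset({("x1", "x2"), ("y1", "y2")})
chosen = best_k_edges(side_x, side_y, existing, toy_score, k=2)
print(chosen)
```

Re-scoring the whole graph for every candidate pair is what makes this approach expensive on large networks.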
48
REDUCING CONTROVERSY BY CONNECTING OPPOSING VIEWS
Side 1 Side 2
OUR ALGORITHM
The best edges are between the highest degree nodes
49
REDUCING CONTROVERSY BY CONNECTING OPPOSING VIEWS
Side 1 Side 2
OUR ALGORITHM
The best edges are between the highest degree nodes
O(p²), p << n
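A rough sketch of the pruning idea: only the p highest-degree users on each side are considered, so at most p² candidate edges remain. The function name, the adjacency-set representation and the toy graph are assumptions made for illustration.

```python
def high_degree_candidates(adj, side_x, side_y, p):
    """Candidate edges between the p highest-degree nodes of each side."""
    def top(side):
        # Sort by degree (descending), breaking ties by node name.
        return sorted(side, key=lambda u: (-len(adj[u]), u))[:p]

    return [(u, v) for u in top(side_x) for v in top(side_y)
            if v not in adj[u]]  # skip edges that already exist

# Hypothetical toy graph: "a" and "w" are the hubs of their sides.
adj = {
    "a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"},
    "w": {"x", "y"}, "x": {"w"}, "y": {"w"},
}
side_x, side_y = {"a", "b", "c", "d"}, {"w", "x", "y"}
candidates = high_degree_candidates(adj, side_x, side_y, p=2)
print(candidates)
```

Only these candidates then need to be scored, instead of every pair of nodes in the graph.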
50
NOT PRACTICAL
▸ High-degree users are highly retweeted users
▸ We cannot recommend @realDonaldTrump to follow @BarackObama
▸ Not likely to materialize
51
ACCEPTANCE PROBABILITY
▸ Take into account the probability of the user liking the recommendation
▸ Not all users are the same
  ▸ Popular users
  ▸ Highly polarized users
▸ Compute polarity scores for users
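One simple way to take acceptance into account, shown here only as an assumption-laden sketch, is to rank each candidate edge by its score reduction multiplied by the probability that the recommendation is accepted. The field names and all numbers below are invented for illustration.

```python
def rank_by_expected_reduction(candidates):
    """Order candidate edges by reduction * acceptance probability."""
    return sorted(candidates,
                  key=lambda c: c["reduction"] * c["p_accept"],
                  reverse=True)

# Hypothetical candidates: a hub-to-hub edge helps most if accepted,
# but is very unlikely to be accepted.
candidates = [
    {"edge": ("celebrity_a", "celebrity_b"), "reduction": 0.30, "p_accept": 0.01},
    {"edge": ("user_1", "user_2"), "reduction": 0.10, "p_accept": 0.40},
    {"edge": ("user_3", "user_4"), "reduction": 0.05, "p_accept": 0.50},
]
ranked = rank_by_expected_reduction(candidates)
print([c["edge"] for c in ranked])
```

With these numbers the hub-to-hub edge drops to the bottom of the ranking, which mirrors the argument that such recommendations are not likely to materialize.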
52
ACCEPTANCE PROBABILITY
POLARITY SCORE: -0.99
POLARITY SCORE: 0.95
53
based on connections
based on retweets
p(u, v) =
ACCEPTANCE PROBABILITY
▸ Learn probabilities from data
54
DEMO
55
REDUCING POLARIZATION
Side 1 Side 2
3 B. BALANCING INFORMATION EXPOSURE IN SOCIAL NETWORKS
▸ Find a set of seed nodes that can balance the exposure of information