Learning about Personal Experiences and their Outcomes:

Analyzing Social Media as an Observational Study

Emre Kıcıman

emrek@microsoft.com

Microsoft Research

should i buy a fixer upper or new home?

should i rent or buy a home in 2014

should i get preapproved before house hunting

should i refinance my mortgage

should i get a divorce

should i break up with my boyfriend

should i pay off my mortgage

should i file bankruptcy

should i get a flu shot

should i lease or buy a car

should i join a gym

should i eat before or after working out

should i text him

should i buy a chromebook

should i quit my job

should i pop a blister

should i get bangs

should i wash my hair before coloring it

should i buy a house

should i cut my hair short

should i retire at 65

should i go to college

should i go to law school

should i take a multivitamin

should i text him or wait for him to text me

should i join the military

should i leave my husband

should i get married

should i pop a burn blister

should i see a doctor

should i consolidate my student loans

should i do cardio before or after weights

should i get a tattoo

should i workout with a cold

should i join the army

should i buy an xbox one

should i move to florida

should i have a third child

should i take creatine

should i get a dog

should i run everyday

should i be a teacher

should i become a real estate agent

should i quit drinking coffee

should i shave my head

100s of millions reporting their experiences

People post about the actions they take.

… and post about what happens later.

Broad goal: Create a system for open-domain querying of the relationships between situations, actions and outcomes

• Enable this for the long tail of unfamiliar situations

• Accommodate many possible data sources

Personal and policy questions

• Explore Situations: What should I expect now?

• Understand Actions: Should I do ____?

• Plan for Outcomes: What might help reach goals?

[Figure: Messages → Individual timelines (users u1, u2) → Query-aligned timelines, marked E+ and E-, with Precedents before and Subsequents after the query event]

1. Extract timeline of event occurrences from social media

2. Select timelines that match a query event, along with a control group.

3. Identify outcomes
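As a rough illustration of steps 1 and 2, the sketch below aligns one user's timeline on the first message that matches a query event and splits the remaining messages into precedents and subsequents. The function name `align_on_event`, the substring-based event test, and the toy messages are assumptions made for this example; they are not the system described in the talk.

```python
def align_on_event(timeline, is_event):
    """Align a timeline on the first message that matches the query event.

    timeline: list of (timestamp, message) pairs sorted by timestamp.
    is_event: hypothetical predicate deciding whether a message reports the event.
    Returns (precedents, subsequents) with timestamps relative to the event,
    or None when the event never occurs (a candidate for the control group).
    """
    for i, (t0, message) in enumerate(timeline):
        if is_event(message):
            precedents = [(t - t0, m) for t, m in timeline[:i]]
            subsequents = [(t - t0, m) for t, m in timeline[i + 1:]]
            return precedents, subsequents
    return None


# Toy timeline (invented messages, timestamps in days).
timeline = [(1, "went running"), (5, "got a dog"), (9, "long walks every day")]
aligned = align_on_event(timeline, lambda m: "got a dog" in m)
```

Timelines for which the function returns `None` never mention the event, so they are the natural pool from which a control group could be drawn.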

• Introduction

• Causal inference (very) basics

• Brief sanity check on applying causal inference to social media

• Case Study: Effects of alcohol usage during college

• Case Study: Transitions from mental health to suicidal ideation posts

• Stepping back: Insights & bias

A Brief Introduction to Causal Inference

[Figure: Alice under treatment, with outcome $Y_{T=1}$, and Alice without treatment, with outcome $Y_{T=0}$]

Counterfactual Framework

Treatment effect = $Y_{T=1} - Y_{T=0}$

• We have a missing data problem. We can estimate missing values.

• We’ll focus on average treatment effect (ATE)

ATE = $\bar{Y}_{T=1} - \bar{Y}_{T=0}$

• Careful: $Y$ is dependent not only on $T$, but also on covariates $X$
  • And $T$ can depend on $X$ as well
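To make the missing data problem concrete, here is a small sketch on invented data in which both potential outcomes are written down for every unit, something that is never available in practice. It compares the true ATE with the naive difference in observed group means when treatment is concentrated among units with a high baseline. All names and numbers are hypothetical.

```python
from statistics import mean

# Hypothetical units: (name, y0, y1, treated).
# In real data only one of y0 / y1 is observed per person; both are listed
# here purely to show how far the naive estimate can drift.
units = [
    ("Alice", 1.0, 2.0, 0),
    ("Bob",   1.0, 2.0, 0),
    ("Carol", 5.0, 6.0, 1),
    ("Dave",  5.0, 6.0, 1),
]


def true_ate(units):
    """Average of individual treatment effects y1 - y0 (needs both outcomes)."""
    return mean(y1 - y0 for _, y0, y1, _ in units)


def naive_difference(units):
    """Difference in observed means: y1 for treated units, y0 for untreated."""
    treated = [y1 for _, _, y1, t in units if t == 1]
    control = [y0 for _, y0, _, t in units if t == 0]
    return mean(treated) - mean(control)


ate = true_ate(units)            # every unit gains exactly 1.0 from treatment
naive = naive_difference(units)  # inflated by the baseline gap between groups
```

Here the covariate driving both treatment and outcome is the baseline level itself, which is why the naive comparison overstates the effect.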

[Diagram: covariates X, treatment T, and outcome Y; X influences both T and Y, and T influences Y]

Disentangling covariates: Randomized Expts.

• Randomized assignment of treatment status provides independence: 𝑇 ⊥ 𝑋

Observational Studies (not random)

• … Identify comparable treated and untreated subgroups such that independence holds

Matching

• For every person who received treatment, find another with identical covariates who did not.

• In high dimensions, we cannot find identical or even nearest-neighbor matches

• Instead, project units down to a one-dimensional space

A balancing score subdivides observational data so that:

𝑇 ⊥ 𝑋 | 𝑠𝑐𝑜𝑟𝑒

Estimated propensity is one possible balancing score

Match on propensity score.

Generalizes to stratification and weighting approaches

Strong assumptions:

- Ignorability of unobserved covariates: $Y_0, Y_1 \perp T \mid \text{score}$

- Individual outcome does not depend on whether others receive treatment
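A minimal sketch of matching on a propensity score, assuming the scores have already been estimated by some model. Nearest-neighbor matching with replacement is one common choice and is an assumption here, as are the function name and the toy numbers. Because every treated unit is matched to a control, the average below estimates the effect on the treated units.

```python
def match_on_propensity(treated, control):
    """Nearest-neighbor matching (with replacement) on a one-dimensional score.

    treated, control: lists of (propensity_score, outcome) pairs.
    Returns the mean outcome difference between each treated unit and the
    control unit whose score is closest to it.
    """
    diffs = []
    for score, outcome in treated:
        nearest = min(control, key=lambda c: abs(c[0] - score))
        diffs.append(outcome - nearest[1])
    return sum(diffs) / len(diffs)


# Toy data (invented): (estimated propensity, outcome).
treated = [(0.80, 10.0), (0.60, 7.0)]
control = [(0.75, 8.0), (0.55, 6.0), (0.20, 1.0)]
effect_on_treated = match_on_propensity(treated, control)
```

The control unit with score 0.20 is never used, which reflects the point of matching: only comparable units contribute to the estimate.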

Other causal inference methods

• Regression Discontinuities

• Instrumental Variables

Disentangle 𝑋 and 𝑇 through another variable 𝑍.

[Diagram: variables X, Y, T, and Z]

Sanity check: Might this work? With Olteanu and Varol

CSCW17

1. Extract timeline of event occurrences from texts; identify treatment and control groups

2. Learn propensity score estimator and stratify users

3. Calculate population average outcomes
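Steps 2 and 3 can be sketched as follows, assuming each user already has an estimated propensity score. Users are binned into equal-width score strata, the treated-minus-control outcome difference is computed per stratum, and the differences are averaged with weights proportional to stratum size. The equal-width binning, the handling of strata without overlap, and the toy data are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean


def stratified_effect(users, n_strata=2):
    """Population-average effect from propensity-score strata.

    users: list of (propensity_score, treated_flag, outcome) tuples.
    Strata containing only treated or only control users are skipped,
    since no within-stratum comparison is possible there.
    """
    strata = defaultdict(list)
    for score, treated_flag, outcome in users:
        idx = min(int(score * n_strata), n_strata - 1)
        strata[idx].append((treated_flag, outcome))

    weighted_sum, used = 0.0, 0
    for members in strata.values():
        treated = [y for t, y in members if t == 1]
        control = [y for t, y in members if t == 0]
        if not treated or not control:
            continue
        weighted_sum += len(members) * (mean(treated) - mean(control))
        used += len(members)
    return weighted_sum / used


# Toy data (invented): outcome 1.0 means the user later mentioned the outcome.
users = [
    (0.1, 0, 0.0), (0.2, 0, 1.0), (0.3, 1, 1.0), (0.4, 1, 1.0),
    (0.6, 0, 0.0), (0.7, 0, 0.0), (0.8, 1, 1.0), (0.9, 1, 1.0),
]
effect = stratified_effect(users)
```

In this toy data the low-propensity stratum shows a difference of 0.5 and the high-propensity stratum a difference of 1.0, so the size-weighted average is 0.75.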

Evaluation

Applied technique to 39 specific situations; across 9 high-level topics.

• Chose 9 domains for diversity.

• Selected situations as popularity-weighted random sample within domain.

Data: Firehose Twitter data, March-May, 2014

• Extract ranked outcomes and temporal evolution;

• Correctness judgements via MTurk

Raw Results: Health\Diseases\Triglycerides

Outcome          Count  Absolute Increase  Z-Score
Your_risk           46  24.8%              18.12
Statin              48  23.1%              17.69
Lower              120  35.9%              17.18
Cardiovascular      54  23.0%              16.72
Healthy_diet        55  19.3%              16.54
Fatty_acid          29  18.3%              16.37
Help_prevent        73  26.9%              16.01
Risk_factor         33  18.3%              15.55
Fish_oil            48  24.4%              15.42
inflammation        78  25.1%              15.30

Raw Results: Health\Diseases\Gout

Outcome          Count  Absolute Increase  Z-Score
Flare_up            35  4.1%               12.33
Uric_acid           27  2.9%               10.36
Uric                28  2.9%               10.11
Flare               81  4.9%                9.92
Big_toe             38  2.9%                9.86
Joint              301  7.2%                7.22
Aged                32  1.7%                6.51
Correlation         45  2.8%                6.11
Bollock             53  2.5%                5.96
Shite              108  3.4%                5.93

Raw Results: Society\Issues\Belly Fat

Outcome          Count  Absolute Increase  Z-Score
Burn               156  62.2%               8.96
Ab_workout          13  8.5%                5.82
Workout_lose        13  8.5%                5.82
Help_burn            8  11.1%               5.82
add_video           26  14.0%               5.75
url_playlist        26  14.0%               5.75
Fitness             39  18.6%               5.51
Ab                  43  19.1%               5.51
Playlist_metion     30  15.3%               5.39
Biceps               7  4.7%                4.74

Raw Results: Business\Financial\Pension

Outcome          Count  Absolute Increase  Z-Score
Tax               1334  18.9%              18.89
Retire             675  15.6%              17.97
Budget             762  14.0%              17.05
Benefit            920  14.7%              15.82
Vote              1278  13.9%              15.57
Government         876  11.5%              15.47
Financial          673  13.7%              15.22
Income             619  12.3%              14.87
Report            1125  13.7%              14.70
investment         490  10.4%              14.28

Evaluating correctness

• Use Mechanical Turk workers to judge correctness of results:

True or False: Someone mentioning T will later be more likely to talk about Y

• Provide discussion context to aid annotation

• Calculate precision at rank (P@N), ranked by average effect
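Precision at rank N is the fraction of the top N ranked outcomes that annotators judged correct. The sketch below ranks outcomes by estimated effect and computes P@N; the outcome names, effect sizes, and judgements are invented for the example.

```python
def precision_at_n(judgements, n):
    """Fraction of the first n ranked judgements that are True."""
    top = judgements[:n]
    return sum(top) / len(top) if top else 0.0


# Hypothetical outcomes: (name, estimated effect, judged correct by annotators).
outcomes = [
    ("a", 0.10, True),
    ("b", 0.30, True),
    ("c", 0.05, False),
    ("d", 0.20, False),
    ("e", 0.25, True),
]

# Rank by average effect, largest first, then read off the judgements in order.
ranked = sorted(outcomes, key=lambda o: o[1], reverse=True)
judgements = [correct for _, _, correct in ranked]

p_at_3 = precision_at_n(judgements, 3)
p_at_5 = precision_at_n(judgements, 5)
```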

Context to aid annotating relevance:

Precision at Rank

P@10 across domains

Data volume vs P@10

Summary

• Propensity score analysis can be applied to social media timelines to extract outcomes, and it does extract relevant results

• Developed methodology to allow human judges to evaluate correctness using the discussion context.

• Evaluated on a large variety of domains:

• Outcomes with higher statistical significance and outcomes with larger treatment effects are more likely to be judged correct

• Quality of results is correlated with data volume

Effects of Alcohol Use During College

With Scott Counts (MSR) and Melissa Gasser (UW)

College is an important transition period

• Success in college predicts career success, career happiness, economic achievement.
  • High rates of college graduates drive regional income levels and other positive macro-economic indicators

• Over 40% of college students leave without earning a degree.
  • Many factors: academic and social integration, financial pressures, …

• Excessive alcohol consumption is negatively associated with college success and linked to other long-term negative consequences

5-year longitudinal social media analysis

• Existing study methods primarily use surveys.
  • Limited to single institutions and/or small number of participants
  • Rely on participant recall
  • Response biases

• Social media studies can complement
  • Large number of participants
  • In situ reporting of experiences
  • (different) reporting biases
  • Granular observations

What insight are we looking for?

Goal: might intervening to stop early alcohol usage aid college success?

Does early alcohol usage have measurable effects on topics linked to college success?

Does early alcohol usage have measurable effects on future alcohol usage?

5-year longitudinal social media analysis

1. Build a dataset of college students' Twitter timelines:
  • Identify Twitter accounts of entering college students in Fall 2010
  • Gather their tweets from Fall 2010 through Summer 2015

2. Identify relevant events and topics:
  • Drinking and alcohol mentions in Fall 2010
  • Topics known to be related to college success: financial pressures, negative academic outcomes, studying, family, friends, criminal/legal issues.
  • We could not find a simple, reliable indicator of college graduation

3. Infer effects of early drinking on college-success linked topics
  • Stratified propensity score analysis.

Identifying a college cohort

1. Find all tweets matching high-recall, low-precision phrases

2. Build and apply high-precision classifier
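The two-stage idea can be sketched as a cheap phrase filter followed by a stricter classifier. Everything specific below is an assumption for illustration: the candidate phrases, the function names, and especially `toy_classifier`, which is a hand-written stand-in for whatever trained model would be used in practice.

```python
# Hypothetical high-recall, low-precision phrases.
CANDIDATE_PHRASES = ("freshman year", "my dorm", "college")


def candidate_filter(tweets, phrases=CANDIDATE_PHRASES):
    """Stage 1: keep any tweet containing at least one candidate phrase."""
    return [t for t in tweets if any(p in t.lower() for p in phrases)]


def identify_cohort(tweets, classifier):
    """Stage 2: keep only the candidates that the classifier accepts."""
    return [t for t in candidate_filter(tweets) if classifier(t)]


def toy_classifier(tweet):
    """Stand-in for a trained model: require a first-person statement."""
    text = tweet.lower()
    return text.startswith("i ") or " my " in f" {text} "


tweets = [
    "Moving into my dorm today!",
    "College football scores are in",
    "I love pizza",
]
cohort = identify_cohort(tweets, toy_classifier)
```

The filter keeps the first two tweets and the classifier then rejects the football tweet, which is the intended division of labor: the filter bounds the work, the classifier supplies the precision.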

Identifying drinking and alcohol mentions

• Previous studies find link between alcohol mentions on social media and real-world alcohol usage

• Identify all tweets that contain a validated list of high-precision alcohol phrases

• Alcohol mentions in first semester (fall 2010) will be our treatment

[Figure: Messages → Individual timelines (users u1, u2) → Query-aligned timelines, marked E+ and E-, with Precedents before and Subsequents after the query event]

1. Extract timeline of word usage per user

2. Treatment = alcohol mention in 1st semester

3. Identify outcome effects in a sliding window

Details on causal inference setup

• Covariates:
  • Word counts for top 50k words
  • Daily tweet frequency and tweet length statistics
  • Featurize word counts as proportions
  • Don’t use words in the week before alcohol mention as covariates

• Treatment:
  • Marker: Alcohol mention in first semester
  • Semantically: the treatment is everything that happens the week before a drinking mention.

• Outcomes
  • Same as covariates
  • Week-long sliding window, starting from drinking mention for ~5 years.
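A minimal sketch of the outcome featurization described above: word counts are converted to proportions within each week-long window that starts at the drinking mention. The whitespace tokenizer, the function names, and the default step of 7 days (non-overlapping windows) are assumptions; the setup above does not specify how far the window slides at each step.

```python
from collections import Counter


def word_proportions(texts):
    """Word counts over a set of tweets, normalized to proportions."""
    counts = Counter(word for text in texts for word in text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()} if total else {}


def sliding_windows(timeline, horizon_days, width=7, step=7):
    """Featurize tweets in successive windows after the treatment marker.

    timeline: list of (days_since_alcohol_mention, tweet_text) pairs.
    Returns a list of (window_start_day, word_proportions) pairs.
    """
    windows = []
    for start in range(0, horizon_days - width + 1, step):
        texts = [text for day, text in timeline if start <= day < start + width]
        windows.append((start, word_proportions(texts)))
    return windows


# Toy timeline (invented tweets).
timeline = [(0, "party tonight"), (3, "study study exam"), (8, "exam")]
windows = sliding_windows(timeline, horizon_days=14)
```

Using proportions rather than raw counts keeps the features comparable between users and periods with very different tweet volumes.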

High-level results

• Increase in tweet rates after drinking mentions

• Mentioning drinking indicates higher rates of alcohol mentions for roughly the next 2 years, compared to control.

• Drinking mentions have a 6mo-2yr effect on most college-success linked topics, with a longer effect on study habits and friend mentions.

Increase in tweet rate

[Figure: three panels (75th, 50th, and 25th percentile strata) plotting Tweets/Day against Days Before/After Alcohol Mention, for Treatment and Placebo groups]

• Before drinking, tweet rates are approximately balanced

• After drinking, tweet rates increase by 0.98 tweets/day

• Further validated with a difference-in-differences analysis, with effect persisting.

• Minor implications for analysis: we have to represent words and tokens as proportions, not counts
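A difference-in-differences estimate subtracts the control group's before/after change from the treated group's before/after change, which removes any trend shared by both groups. The sketch below shows the arithmetic on invented tweet rates; the numbers are not the study's data.

```python
from statistics import mean


def difference_in_differences(treated_before, treated_after,
                              control_before, control_after):
    """(treated after - treated before) - (control after - control before)."""
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change


# Hypothetical tweets/day for two users per group, before and after the marker.
did = difference_in_differences(
    treated_before=[3.0, 5.0], treated_after=[4.0, 6.0],
    control_before=[4.0, 4.0], control_after=[4.0, 4.5],
)
```

In this toy case the treated group rises by 1.0 tweets/day and the control group by 0.25, leaving an estimated effect of 0.75.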

Effect on future drinking

[Figure: Relative Treatment Effect plotted against Days since first alcohol tweet in Fall 2010, shown at two y-axis scales]

[Figure: Proportion of alcohol-tweets plotted against Days since first alcohol tweet in Fall 2010, for control and treatment groups]

Relative effect shows early drinking leads to more drinking, with diminishing effect over time. This is because the non-drinkers “catch up”, increasing the drinking mentions over time.
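One way to read a relative effect, assuming it is defined as the ratio of the treatment group's proportion of alcohol tweets to the control group's, is sketched below. The ratio definition and the series values are assumptions chosen to mimic the "catch up" pattern; they are not taken from the study.

```python
def relative_effect(treatment_series, control_series):
    """Per-period ratio of treatment to control proportions."""
    return [t / c for t, c in zip(treatment_series, control_series)]


# Hypothetical proportions of alcohol tweets in three successive periods.
treatment = [0.024, 0.022, 0.021]
control = [0.020, 0.020, 0.021]
ratios = relative_effect(treatment, control)
```

The ratio falls toward 1 here because the control series rises to meet the treatment series, not because the treatment series collapses.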

Effect on topics related to college success

Both control and treatment users follow similar temporal patterns.

But generally see positive effect on each outcome initially, with diminishing effect over time.

Effect on topics related to college success

Initially, drinkers are more social, but after about 1-1.5 years, drinkers mention peers less than control group.

Strong effect on negative academic outcome mentions in year, then no difference.

Persistently lower proportion of study habit mentions.

Summary of findings & insights

Q: Does early alcohol usage have measurable effects on topics linked to college success?

A: Yes, especially in short-term (6mo-2yrs). However, effects might be due to other factors closely associated with drinking (e.g., socialization at parties)

Q: Does early alcohol usage have measurable effects on future alcohol usage?

A: Not long term. In fact, control group catches up over time. Note, this does not account for frequency of drinking.

Goal: might intervening to stop early alcohol usage aid college success?

Causes of effects: Case study on transitions to suicide ideation

With Munmun De Choudhury, Glen Coppersmith, Mark Dredze, Mrinal Kumar

CHI 2016

Shifts from Mental health discussions to Suicide Ideation

• Suicide is one of the 10 leading causes of death in the United States; yet identifying suicide risk and preventing suicide is difficult. History of mental illness is known to be a major factor, but there is little research on characterizing the risk of transition from mental health problems to suicide ideation

• We studied individuals posting to mental health (MH) forums on Reddit, who go on to post in suicide ideation forum (SW).

• What words in MH forum posts indicate that user is more likely to post in SW in the future?

• Causal inference analysis, with same methodology, but sweeping treatments instead of sweeping outcomes

• In particular, we’ll go into more detail looking into heterogeneous effects

• Not forgetting limitations that restrict actual causal claims

What insight are we looking for?

• Validate that off-line theories of factors that drive suicidal ideation also manifest in online discussions

• Use findings to build better, more nuanced early identification of people at risk of suicidal ideation.

Example posts

MHs (Mental Health Subreddits)

I have been considering going for some formal therapy. Any suggestions?

Everyday I feel sad and lonely

Since past sometime I think I am having panic attacks. I really need help from you guys.

SW (SuicideWatch Subreddit)

I know I was never meant to lead this life

Don’t want to hurt the people I care but I can’t take this anymore

Today I felt I have nothing left, why am I even living… I don’t see a point

Content paraphrased to protect privacy of individuals

Data gathering

To focus on transition from mental health postings to suicide ideation, find people who have been posting in MH forum, but not in SW, before August 11, 2014.

Some of these individuals never post in SW. Label this group MH

Some of these individuals later do post in SW. Label this group MHSW
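The cohort construction can be sketched as a pass over (user, forum, date) records. The cutoff date and the SuicideWatch forum come from the description above; the mental health forum names are hypothetical placeholders, and treating "before" as strictly earlier than the cutoff date is an assumption.

```python
from datetime import date

CUTOFF = date(2014, 8, 11)
SW = "SuicideWatch"
MH_FORUMS = {"mh_forum_a", "mh_forum_b"}  # hypothetical placeholder names


def label_users(posts):
    """Label users who posted in an MH forum, but not in SW, before the cutoff.

    posts: iterable of (user, forum, post_date) tuples.
    Returns {user: "MHSW"} for users who later post in SW, {user: "MH"} otherwise.
    """
    mh_before, sw_before, sw_after = set(), set(), set()
    for user, forum, when in posts:
        if forum in MH_FORUMS and when < CUTOFF:
            mh_before.add(user)
        elif forum == SW:
            (sw_before if when < CUTOFF else sw_after).add(user)
    cohort = mh_before - sw_before
    return {user: ("MHSW" if user in sw_after else "MH") for user in cohort}


# Toy records (invented users).
posts = [
    ("a", "mh_forum_a", date(2014, 5, 1)), ("a", SW, date(2014, 9, 1)),
    ("b", "mh_forum_b", date(2014, 6, 1)),
    ("c", "mh_forum_a", date(2014, 3, 1)), ("c", SW, date(2014, 4, 1)),
    ("d", SW, date(2014, 10, 1)),
]
labels = label_users(posts)
```

User "c" is excluded for having posted in SW before the cutoff, and user "d" never posted in an MH forum, so neither belongs to the cohort.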

PS CAUSE-OF-Effects analysis results

Most statistically significant treatment tokens. Results consistent with social theories on suicide ideation

Stratified results

Opposite Effects across strata

Case study results

• Distinct markers associated w/increased risk of suicide ideation conform with social science theories

• Highlighted heterogeneous effects and importance of context in interpreting risk factors

Actionable Insights and Biases

Derived from work with Alexandra Olteanu, Carlos Castillo and Fernando Diaz

Working paper: Social Data: Biases, Methodological Pitfalls, and Ethical Boundaries

Actionable Insight

A deep understanding that affects our actions

Lots of social media data == lots of insight??

Barriers to insight: Do we believe the results?

We have to ensure:

• Construct validity: Are we measuring what we think we’re measuring?
  • E.g., are text classifiers correct?

• Internal validity: Is our analysis correct?
  • E.g., did we need causal inference but not use it?

• External validity: Do our results generalize to target situations?
  • E.g., population biases, platform biases, …

Incomplete list of potential biases

• Data source issues
  • Population biases, behavioral biases, platform biases
  • Temporal biases
  • System errors, adversaries, fake content

• Data collection issues
  • Querying biases, sampling biases, system errors

• Data processing
  • Cleaning, featurization, aggregation biases

• Analysis issues
  • Measures, classifiers, featurization
  • Algorithms and interpretation

Mitigation approaches

• Construct validity
  • Validate measures through qualitative reading.
  • Validate measures in separate experiments, against known ground-truth
  • Use more direct measures (e.g., step counters rather than text interpretation)

• Internal validity
  • When appropriate, use causal inference to reduce bias from observed confounders

• External validity
  • Try to avoid need for significant generalization with an “in population” task and intervention.
  • Otherwise, measure and account for population and other biases.
  • Validate biases over time.

Conclusions

1. Estimating causal relationships among experience reports on social media is a rich and promising approach to understanding a broad set of phenomena

Alcohol usage in college;

Transitions to suicidal ideation online

2. Causal inference reduces some kinds of bias, but not enough for actionable insights!

In particular, still need measurement validity and generalizability.

Achieve through separate validation, and/or expt design & scoping of goals

Questions?

Referenced papers:

- Towards Decision Support and Goal Achievement: Identifying Action-Outcome Relationships from Social Media. Kıcıman, Richardson. KDD15

- Distilling the Outcomes of Personal Experiences: A Propensity-scored Analysis of Social Media. Olteanu, Varol, Kıcıman. CSCW17

- Using Social Media to Understand the Effects of Alcohol Use During College. Kıcıman, Counts, Gasser (email for draft)

- Shifts to Suicidal Ideation from Mental Health Content in Social Media. De Choudhury, Kıcıman, Dredze, Coppersmith, Kumar. CHI16

- Social Data: Biases, Methodological Pitfalls, and Ethical Boundaries. Olteanu, Castillo, Diaz and Kıcıman. Working paper