
Medicine2.0'12: A Study of Positive and Negative Affects in Tracking Influenza-like Illness Rate in Twitter Data


presented at Medicine 2.0'12


A Study of Positive and Negative Affect in Tracking Influenza-like Illness Rate in Twitter Data

Son Doan¹, Mike Conway¹, Nigel Collier²

¹Division of Biomedical Informatics, University of California San Diego
²National Institute of Informatics

Medicine 2.0, Boston Sep 15, 2012


Seasonal influenza and influenza-like illness

Influenza-like Illness (ILI) = fever (> 100°F)* AND cough and/or sore throat (in the absence of a known cause other than influenza)

*Temperature can be measured in the office or at home

Case definition from CDC

Epidemics of seasonal influenza result in about three to five million cases of severe illness and 250 000 to 500 000 deaths worldwide each year (WHO, 2009)
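The case definition above can be read as a boolean predicate. The following is a minimal sketch of that reading; the `Presentation` class, its field names, and `is_ili` are hypothetical names introduced for illustration only and are not part of any CDC tool.

```python
from dataclasses import dataclass

# Threshold taken from the case definition on this slide: fever (> 100°F).
FEVER_THRESHOLD_F = 100.0


@dataclass
class Presentation:
    """Hypothetical record of one patient presentation (illustration only)."""
    temperature_f: float
    cough: bool
    sore_throat: bool
    known_other_cause: bool = False


def is_ili(p: Presentation) -> bool:
    """Fever AND (cough and/or sore throat), with no known non-influenza cause."""
    has_fever = p.temperature_f > FEVER_THRESHOLD_F
    has_respiratory_symptom = p.cough or p.sore_throat
    return has_fever and has_respiratory_symptom and not p.known_other_cause
```

For example, `is_ili(Presentation(101.2, cough=True, sore_throat=False))` is true, while a temperature of exactly 100.0°F does not count as fever under the strict inequality.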

Calculating ILI rate is key for seasonal influenza surveillance


Related work

• Events tracking/predicting:
– Predict election, gasoline price: O’Connor et al. (2010)
– Predict stock market: Bollen et al. (2011)
– Earthquake warning: Sakaki et al. (2010), Guy et al. (2010)
– Public mood tracking: Golder and Macy (2011), Doan and Collier (2011)

• Predicting the Influenza-Like Illness rate:
– Google Flu Trends: Ginsberg et al. (2009), Valdivia et al. (2010), now extended to dengue tracking (Chan et al. (2012)), use query logs
– Culotta (2009), Lampos and Cristianini (2010), Signorini et al. (2011), Chew and Eysenbach (2011) use Twitter


Positive and Negative Affect

• We manually created a positive affect (PA) and a negative affect (NA) word list from:
– Wikipedia
– Internet resources: text book, literature

• Final lists: 966 terms (509 PA and 457 NA)

Positive Affect (PA): Happy, Keen, Joyful, …

Negative Affect (NA): Sad, Nervous, Angry, …

Link between positive and negative affect and ILI rate?

I am terribly sick, I'm thinking swine flu:O

i really hope i dont get sick otherwise im gonna be angry

what are the symptoms for swine flu I am nervous!

I do not have H1N1. I'm so happy about this fact.

tell everyone how great my swine flu cookies were.


Twitter Corpus

Timeline: 36 weeks for the US 2009 influenza season (Aug 30, 2009 to May 8, 2010)
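Tracking a weekly ILI rate requires assigning each tweet to a week of the season. The sketch below checks that the stated date range spans 36 weeks and maps a date to a zero-based week index. The function name `week_index` and the choice of 7-day bins starting on the first day of the range are assumptions for illustration, not the authors' documented procedure.

```python
from datetime import date

# Date range stated on the slide.
SEASON_START = date(2009, 8, 30)
SEASON_END = date(2010, 5, 8)


def week_index(day: date) -> int:
    """Zero-based index of the 7-day bin containing `day` (assumed binning)."""
    if not SEASON_START <= day <= SEASON_END:
        raise ValueError(f"{day} is outside the season")
    return (day - SEASON_START).days // 7


# Number of weekly bins covered by the inclusive date range.
n_weeks = week_index(SEASON_END) + 1
```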

Name        Total
Tweets      587,290,394
Users       23,571,765
URL         136,034,309
Hash Tags   96,399,587

Thanks to Brendan O’Connor (CMU) and Twitter Inc.



Methods

Twitter corpus → ILI-related keywords filtering → ILI-related tweets → PA/NA filtering → Non-PA/Non-NA ILI-related tweets

Culotta       Signorini et al.   Chew and Eysenbach
flu           swine              h1n1
cough         flu                swine flu
headache      influenza          swineflu
sore throat
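The two filtering stages of this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the keyword sets are the ones listed on the slide, but `PA_SAMPLE` and `NA_SAMPLE` contain only the six example words shown earlier (the full 966-term lists are not given), and the tokenizer and all function names are assumptions.

```python
import re

# Keyword sets as listed on the slide.
KEYWORDS = {
    "Culotta": ["flu", "cough", "headache", "sore throat"],
    "Signorini et al.": ["swine", "flu", "influenza"],
    "Chew and Eysenbach": ["h1n1", "swine flu", "swineflu"],
}

# Tiny stand-ins for the PA/NA lists (only the examples shown in the slides).
PA_SAMPLE = {"happy", "keen", "joyful"}
NA_SAMPLE = {"sad", "nervous", "angry"}


def tokens(text):
    """Lowercase word tokens; punctuation is dropped (assumed tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def is_ili_related(text, keywords):
    """True if any keyword (single- or multi-word) occurs as whole tokens."""
    padded = " " + " ".join(tokens(text)) + " "
    return any(f" {kw} " in padded for kw in keywords)


def has_affect(text, lexicon):
    """True if any token of the tweet is in the affect lexicon."""
    return any(tok in lexicon for tok in tokens(text))


def filter_tweets(tweets, keywords, drop_lexicon):
    """Keep ILI-related tweets that contain no word from `drop_lexicon`."""
    return [
        t for t in tweets
        if is_ili_related(t, keywords) and not has_affect(t, drop_lexicon)
    ]
```

Calling `filter_tweets(tweets, KEYWORDS["Culotta"], PA_SAMPLE)` would give the Non-PA ILI-related tweets for the Culotta keyword set; passing `NA_SAMPLE` instead gives the Non-NA set.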


Main results

Keyword set          ILINet   Non-NA   Non-PA
Culotta              0.9485   0.9483   0.9548
Signorini et al.     0.9470   0.9532   0.9586
Chew and Eysenbach   0.9448   0.9444   0.9467

• Gold standard: Laboratory data from the US Outpatient Influenza-Like Illness Surveillance Network (ILINet)

• Pearson correlation coefficients are used
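For reference, the Pearson correlation coefficient between a weekly tweet-based series and the gold-standard series can be computed as in the sketch below. The function name `pearson` is an assumption, and the code is a plain textbook implementation rather than the authors' own.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

In this setting, `x` would hold the 36 weekly counts (or rates) of filtered tweets and `y` the corresponding weekly ILINet values.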

Note: Google Flu Trends got 0.9912 (using query logs)


Discussion

• Retaining negative affect tweets, but filtering out positive affect tweets, helps to increase the correlation coefficient with the ILI rate from laboratory data.

• Many true positive tweets have negation associated with positive affect, e.g., “My little guy and I just got our H1N1. He wasn't too happy about getting the shot.” This emphasizes the need for NLP (negation detection).

• Further semantic filtering might help to significantly improve results (our work at HISB 2012)
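The negation issue raised above can be illustrated with a very simple window-based check: an affect word is ignored if a negation cue appears within a few tokens before it. This is a rough sketch only; the cue list, the window size of 3, the sample lexicon, and the function names are all assumptions, and the window does not respect sentence boundaries.

```python
import re

# Small illustrative cue list and lexicon (assumptions, not the authors' lists).
NEGATION_CUES = {"not", "no", "never"}
PA_SAMPLE = {"happy", "keen", "joyful"}


def tokens(text):
    """Lowercase word tokens, with curly apostrophes normalized."""
    return re.findall(r"[a-z0-9']+", text.lower().replace("\u2019", "'"))


def is_negation(tok):
    """True for explicit cues and for contractions ending in n't."""
    return tok in NEGATION_CUES or tok.endswith("n't")


def has_unnegated_affect(text, lexicon, window=3):
    """True if some lexicon word has no negation cue in the preceding window."""
    toks = tokens(text)
    for i, tok in enumerate(toks):
        if tok in lexicon:
            preceding = toks[max(0, i - window):i]
            if not any(is_negation(p) for p in preceding):
                return True
    return False
```

With this check, the tweet quoted above ("He wasn't too happy about getting the shot.") is no longer treated as positive affect, so a Non-PA filter would retain it.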


DIZIE: system for syndromic surveillance on Twitter

Syndromic surveillance for gastrointestinal, respiratory, neurological, dermatological, haemorrhagic, and musculoskeletal syndromes from tweets in 40 world cities.

Collier and Doan. eHealth 2012;186-95

http://born.nii.ac.jp/dizie/


THANK YOU !!!