TRACKING CLIMATE CHANGE OPINIONS FROM TWITTER DATA
XIAORAN AN, AUROOP GANGULY, YI FANG, STEVEN SCYPHERS, ANN HUNTER, JENNIFER DY
NORTHEASTERN UNIVERSITY
Presented by Roi Ceren




Page 1:

TRACKING CLIMATE CHANGE OPINIONS FROM TWITTER DATA
XIAORAN AN, AUROOP GANGULY, YI FANG, STEVEN SCYPHERS, ANN HUNTER, JENNIFER DY

NORTHEASTERN UNIVERSITY

Presented by Roi Ceren

Page 2:

OVERVIEW

Introduction

Climate Change Debate

Principled Twitter Modeling

Source Data With(out) Re-Tweets

Hierarchical Classification

Feature selection

Sentiment Analysis

Model Selection

Event Prediction

Conclusion

Page 3:

INTRODUCTION

Anthropogenic climate change is unequivocal

Yet, very controversial

Public perception varies widely, but is poorly studied

Twitter contains high-density, unfiltered opinion data

Not labeled

Highly subject to naïve language

Applying principled techniques to Twitter data can yield accurate perception data

Authors compare Naïve Bayes and SVM approaches for classifying shifts in public perception of anthropogenic climate change

Identify sentiment w.r.t. climate change, not just activity

Page 4:

CLIMATE DEBATE

Several sources state it is unequivocal that humans are causing climate change

Intergovernmental Panel on Climate Change (IPCC), NASA GISS, etc.

However, the climate change debate is very controversial

More people believe that aliens have visited Earth (77%) than that humans are causing climate change (44%)

Climate events may cause extreme shifts in public reaction

Existing methodologies do not capture the disproportionate effect of climate events or recent politics

Vast quantities of data in social media may be a departure point for more accurate models

http://environment.yale.edu/climate-communication/article/scientific-and-public-perspectives-on-climate-change

Page 5:

CLIMATE DEBATE

Previous attempts to model public perception suffer from significant biases

Small sample size

Selection bias: individuals selected may be disproportionately passionate

Infrequent

Response bias: since surveys are elicited, individuals may be led to an answer they don’t believe

Social media outlets provide superior data, at a cost

Massive data set

Self-reported

Unstructured

Other biases?

Page 6:

CLIMATE DEBATE

Previous attempts to model public perception suffer from significant biases

Small sample size

Selection bias: individuals selected may be disproportionately passionate

Infrequent

Response bias: since surveys are elicited, individuals may be led to an answer they don’t believe

Social media outlets provide superior data, at a cost

Massive data set

Self-reported

Unstructured

Twitter biases?

Age (73% < 50 y.o.), Political leanings (40% Democrat, 22% Republican), Education (22% High School or lower, 78% above), technologically inclined

http://www.nationaljournal.com/thenextamerica/demographics/twitter-demographics-and-public-opinion-20130306

Page 7:

PRINCIPLED TWITTER MODELING

Twitter contains vast, sparse data on public opinion, some concerning the climate debate

Vast: over 7M tweets collected during a two-month period

Sparse: tweets contain a maximum of 140 characters

Data collected using Twitter Streaming API

Data must be pruned

English language and climate change relevant

Climate hashtags, or weather-related?

Java library Apache Lucene used for data pruning

Should re-tweets be allowed?

Might be difficult to discern whether the user supports the claim

Useful in determining the proportion of tweets concerning climate change

Not useful in determining sentiment

Page 8:

MODELING: SOURCE DATA WITH RETWEETS

Identify proportion of tweets discussing climate change

~494k tweets out of 7M over the 2-month period

~7k per day

Average 7.5% of total tweets

Authors don’t mention how they identify related tweets

Several spikes in discussion correlate with climate events

Australian bushfires

Hurricane Haiyan

Page 9:

MODELING: SOURCE DATA WITHOUT RETWEETS

The majority of the contribution is in sentiment analysis using data without retweets

Retweets removed because their sentiment is difficult to analyze

~285k tweets without retweets (all inclusive)

Validation set

1/5th of the data manually labeled

Three Groups

Objective tweets, stating fact (1,050)

Subjective tweets, stating opinion (1,500)

Positive: belief in anthropogenic climate change (1,000)

Negative: disbelief (500)

Small data set…

Page 10:

MODELING: HIERARCHICAL CLASSIFICATION

Approach: classify data hierarchically

First, identify objective and subjective tweets

Next, identify positive or negative subjective tweets

Pre-process data

Treated as bag-of-words

Lowercase, tokenize, remove rare words, remove stop/frequent words, and stem
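The pre-processing steps above can be sketched as follows; the stop-word list and the crude suffix-stripping "stemmer" are illustrative stand-ins (a real pipeline would use a full stop list and e.g. a Porter stemmer):

```python
import re

# Illustrative stop-word subset; the paper's actual list is not given.
STOP_WORDS = {"the", "a", "an", "is", "are", "to", "of", "in", "and", "rt"}

def preprocess(tweet, rare_words=frozenset()):
    """Lowercase, tokenize, drop stop/rare words, then crudely 'stem'."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    tokens = [t for t in tokens if t not in STOP_WORDS and t not in rare_words]
    stemmed = []
    for t in tokens:
        # Toy stemmer: strip a couple of common English suffixes.
        if t.endswith("ing"):
            t = t[:-3]
        elif t.endswith("s"):
            t = t[:-1]
        stemmed.append(t)
    return stemmed
```

Each cleaned token list then becomes one row of the bag-of-words matrix.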

Categorization methods

Naïve Bayes

Support Vector Machines

[Diagram: Objective vs. Subjective; Subjective splits into Positive vs. Negative]
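The two-stage hierarchy can be sketched as a simple router; the keyword rules below are hypothetical stand-ins for the trained Naïve Bayes/SVM models:

```python
def classify_hierarchical(tweet, is_subjective, polarity):
    """Stage 1: objective vs. subjective. Stage 2 (subjective only):
    positive vs. negative sentiment toward anthropogenic climate change."""
    if not is_subjective(tweet):
        return "objective"
    return "positive" if polarity(tweet) else "negative"

# Hypothetical stand-in rules purely for demonstration; the paper
# trains an NB or SVM classifier for each stage instead.
def is_subjective(t):
    return any(w in t.lower() for w in ("believe", "think", "hoax"))

def polarity(t):
    return "hoax" not in t.lower()
```

A tweet only reaches the polarity model after the subjectivity model has classified it as opinionated.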

Page 11:

MODELING: FEATURE SELECTION

Issue: with a bag-of-words representation of the Twitter vocabulary, 140-character tweets are very sparse

D = 1,500 features: high dimensional

Solution? Feature selection!

Task: identify features that discriminate the presence of a document class

Exploring all 2^D feature subsets is intractable

Instead, score each feature individually

Chi-squared test

Essentially, if χ² is high for a feature and class, they are not independent

Select the top k features to reduce dimensionality in the classification

$$\chi^2(D, f, c) = \sum_{e_f \in \{0,1\}} \sum_{e_c \in \{0,1\}} \frac{(N_{e_f e_c} - E_{e_f e_c})^2}{E_{e_f e_c}}$$
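A direct implementation of this score for one (feature, class) pair, from the four cells of the feature/class contingency table (variable names are my own):

```python
def chi_squared(n11, n10, n01, n00):
    """Chi-squared score: sum over e_f, e_c in {0,1} of (N - E)^2 / E,
    where N is the observed document count for each cell and E is the
    count expected if feature f and class c were independent.
    n11: docs in c containing f; n10: docs with f outside c;
    n01: docs in c without f;   n00: docs with neither."""
    n = n11 + n10 + n01 + n00
    observed = {(1, 1): n11, (1, 0): n10, (0, 1): n01, (0, 0): n00}
    f_total = {1: n11 + n10, 0: n01 + n00}   # docs with / without f
    c_total = {1: n11 + n01, 0: n10 + n00}   # docs in / outside c
    score = 0.0
    for ef in (0, 1):
        for ec in (0, 1):
            expected = f_total[ef] * c_total[ec] / n
            score += (observed[(ef, ec)] - expected) ** 2 / expected
    return score
```

Every feature is scored against each class this way, and the top k features by score are kept.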

Page 12:

MODELING: FEATURE SELECTION

Selecting k features depends on the F-measure

Perform feature selection and classification, then evaluate classification performance

Prefer higher F-measures

F-measure (F1 score) evaluates the accuracy of a classifier

Precision: correct positive classifications over all positive classifications (TP/(TP+FP))

Recall: correct positive classifications over all positive events (TP/(TP+FN))

Wikipedia (Precision and recall): http://en.wikipedia.org/wiki/Precision_and_recall
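Putting the two definitions above together (F1 is the harmonic mean of precision and recall):

```python
def f1_score(tp, fp, fn):
    """F1 from raw counts: precision = TP/(TP+FP), recall = TP/(TP+FN),
    F1 = 2 * precision * recall / (precision + recall)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```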

Page 13:

SENTIMENT ANALYSIS

Experiments performed by hierarchically classifying the previously examined Twitter data set, pruned to English-language, climate-centric tweets

1/5th of the data set reserved for validation

Rest of data used to train Naïve Bayes/SVM classifiers using 10-fold cross-validation

2,030 tweets comprised the training set

840 objective tweets

790 positive tweets

400 negative tweets

“Default settings” for the Naïve Bayes/SVM classifiers in the scikit-learn Python package
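The 10-fold protocol itself is easy to sketch; this shows only the fold mechanics, not the scikit-learn calls the authors used:

```python
def k_fold_indices(n, k=10):
    """Yield (train, held_out) index lists: the n examples are split into
    k folds, and each fold is held out once while the rest train."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        held_out = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, held_out
```

Reported accuracy and F1 are then averaged over the k held-out folds.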

Page 14:

SENTIMENT ANALYSIS: MODEL SELECTION

Classifiers tested using a variety of feature counts

Tested accuracy and F1 both on identifying objective vs. subjective tweets and then on the sentiment

Significant overfitting problems

As features increase, feature vectors for tweets become increasingly sparse

Training set too small for such sparse features

Candidate models selected balancing accuracy and F1 measure

Page 15:

SENTIMENT ANALYSIS: MODEL SELECTION

Naïve Bayes performs admirably on average, but requires far more features

SVM performs comparably in F1 and accuracy with a fraction of the feature set

However, no computational gains are garnered by this reduction in feature set, though lower-dimensional models are more resistant to overfitting

Page 16:

SENTIMENT ANALYSIS: PREDICTION AND EVENT DETECTION

SVM used to delineate objective vs. subjective tweets (the latter consisting of positive vs. negative)

30 features for subjectivity, 100 for polarity

While the SVM is good at identifying the proper subjectivity and sentiment, the classifications are poor predictors of events

Fluctuations in subjectivity may indicate major events and stimuli for shifts in public perception, but they matched poorly with actual events

Almost no fluctuations in proportion of sentiment

However, it’s clear most Twitter users believe in anthropogenic climate change

[Figure: time series annotated with the Australian bushfires and Hurricane Haiyan]

Page 17:

SENTIMENT ANALYSIS: PREDICTION AND EVENT DETECTION

As a last-ditch experiment, the authors analyzed the slope of the negative-percentage data using z-score normalization

±2.0 indicates a significant change

Authors conjecture that the changes on days 21 and 40 relate to natural disasters in Australia and the Philippines

Recall: the data set is only 500 tweets

Variance in the negative tweet count might be accounted for in the z-score, but the total variance in tweets is not

i.e., is this significant considering the variance in positive tweets, since this metric depends on that count?
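The z-score check can be sketched as below: standardize the day-to-day changes of the negative-percentage series and flag days where |z| exceeds 2 (the series values in the test are hypothetical):

```python
def slope_z_scores(series):
    """Z-score-normalize the day-to-day changes (slope) of a daily
    percentage series; |z| > 2.0 is treated as a significant shift,
    as on the slide."""
    slopes = [b - a for a, b in zip(series, series[1:])]
    mean = sum(slopes) / len(slopes)
    std = (sum((s - mean) ** 2 for s in slopes) / len(slopes)) ** 0.5
    return [(s - mean) / std for s in slopes]
```

A sudden jump after a flat stretch produces a z-score above the ±2.0 threshold.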

[Figure: z-scored slope series annotated with the Australian bushfires and Hurricane Haiyan]

Page 18:

CONCLUSION

Hierarchical Twitter classification

7M tweets streamed over 2 months

500k in English and relevant to climate change

285k non-retweets

2.5k labeled tweets, 1k objective, 1.5k subjective (1k positive, 500 negative)

Naïve Bayes and SVM compared on accuracy and F1 measure

Feature selection used to lower dimensionality

SVM performed comparably to NB with far fewer features

Classification proved a poor predictor of changes in opinion

Subjectivity proved highly variable over time

Z-normalized decreases in disbelief potentially related to climate events

Page 19:

QUESTIONS?

Page 20:

POTENTIAL IMPROVEMENTS?

BIG data

Sample sizes were far too small

Lack of statistical significance analysis makes some results dubious

Automated classification

Authors note, but do not comment on, previous automation attempts

Manual training/validation set labeling expensive

Better models!

Naïve Bayes and SVMs are hardly principled process models

Simple classification techniques can be bootstrapped with social network graph analysis