Sentiment Analysis and Political Disaffection in Italy

Outline: Introduction; Development of the classification system; Application and experimental results; Conclusions

Sentiment Analysis for Italian-language microblogging: development and evaluation of an automatic system

Corrado Monti

22-04-2013




Twitter

- Main microblogging platform
- Instant publication of short textual content
- Tweet → 140 characters
- Widespread in Italy too

[Figure: Twitter audience in Italy, December 2012. Bars: Italian internet users (28.6 M people, 100%); hearsay knowledge of Twitter (25.3 M people, 88.6% of internet users); weekly Twitter users; daily Twitter users.]




Textual classification

- We would like to classify millions of texts
- Rule-based approach: rules (keywords, regular expressions) defined together with domain experts
  - Inaccurate, hard to define
- Supervised machine learning
  - These algorithms need a training set to learn how to classify new texts
  - Every text is transformed into a sequence of numerical features; the algorithm learns what these numbers mean
- Recent problem: Sentiment Analysis → recognize the sentiment expressed by the author of the text
  - Many applications, both industrial and academic
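As an illustration of "every text is transformed into a sequence of numerical features", here is a minimal bag-of-words sketch. The toy texts and the function names are invented for this example; the slides do not say which tokenizer was used, so whitespace splitting is an assumption.

```python
from collections import Counter

def build_vocabulary(texts):
    """Map every distinct lowercase token in the training texts to a column index."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def to_features(text, vocab):
    """Turn one text into a fixed-length vector of token counts."""
    counts = Counter(text.lower().split())
    # Dicts keep insertion order, so the columns follow the vocabulary indices.
    return [counts.get(token, 0) for token in vocab]

# Toy training texts (invented for illustration).
train_texts = ["great day", "bad day", "great great news"]
vocab = build_vocabulary(train_texts)
vectors = [to_features(t, vocab) for t in train_texts]
```

A supervised algorithm then only sees `vectors` and the labels, never the raw text.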




Measuring collective feelings

- One of the first works is O'Connor et al. (2010), who measured economic confidence through Sentiment Analysis on Twitter

[Figure: two time series plotted against dates from January 2008 to November 2009. Panels: Twitter sentiment ratio (axis from 1.5 to 4.0) and Gallup Economic Confidence (axis from −60 to −20).]

- For which phenomena is this possible?
- How valid are these measures? Do they work in Italy too?




Goals of this work

- Hypothesis: can political disaffection be measured through massive tweet classification?
  - It is a relevant phenomenon, especially in Italy
  - Much interest, both academic (sociology) and non-academic
- We would also like to build reusable classifiers




Political disaffection

- How to define a "disaffected" tweet?
- According to domain experts, it must
  1. have a politics-related topic → topic detection
  2. have negative sentiment → sentiment analysis
  3. be directed at all politicians

[Diagram: Tweet → Is political? (No: discarded) → Yes → Is negative? (No: discarded) → Yes → Is general? (No: discarded) → Yes → Politically disaffected tweet]
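The three-stage cascade can be sketched as a chain of predicates. The three `toy_*` functions below are hypothetical keyword stand-ins, not the classifiers described in the following slides; only the control flow is meant to match.

```python
def is_disaffected(tweet, is_political, is_negative, is_general):
    """Apply the three filters in order; stop at the first one that fails."""
    return is_political(tweet) and is_negative(tweet) and is_general(tweet)

# Hypothetical stand-ins for the three real classifiers.
def toy_is_political(tweet):
    return "politici" in tweet or "governo" in tweet

def toy_is_negative(tweet):
    return "vergogna" in tweet

def toy_is_general(tweet):
    return "tutti" in tweet

tweets = [
    "tutti i politici sono una vergogna",
    "il governo approva la legge",
    "che bella giornata",
]
disaffected = [
    t for t in tweets
    if is_disaffected(t, toy_is_political, toy_is_negative, toy_is_general)
]
```

Because `and` short-circuits, a tweet discarded at the first stage never reaches the later, possibly more expensive, stages.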




Training Set

- 28,340 tweets labelled by 40 political science students
  - Between April and June 2012
  - 3 labellers for every text
- We keep in the dataset only tweets with a unanimous "politic" label
- For the other labels we measure agreement with Krippendorff's α:
  - "negative" → 0.78 → reliable labels ✓
  - "generic" → 0.41 → very noisy labels ✗

⇓

              negative   non-negative
politic       7,965      4,544
not politic   15,831
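For reference, Krippendorff's α for nominal labels can be computed from a coincidence matrix as α = 1 − D_o / D_e (observed over expected disagreement). The sketch below is a generic implementation of that textbook definition, not the code used for this work; the input layout (one list of labels per tweet) is an assumption.

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels.

    `units` is a list of lists: the labels given to one item by its labellers.
    Items with fewer than two labels carry no agreement information and are skipped.
    """
    coincidences = Counter()  # o_ck: weighted count of ordered label pairs
    for labels in units:
        m = len(labels)
        if m < 2:
            continue
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    totals = Counter()  # n_c: marginal total for each label
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(totals[a] * totals[b] for a in totals for b in totals if a != b)
    if expected == 0:
        return float("nan")  # a single label was used everywhere: alpha is undefined
    return 1.0 - (n - 1) * observed / expected
```

With three labellers per tweet, each entry of `units` would hold three labels, for example `["negative", "negative", "non-negative"]`.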




1 – Topic Detection: politics

- Time-robust classifier
  - Training set extension: 17,388 news titles (January-October 2012)
- Best feature extraction:
  - Character 5-grams → uniformity across different spacings
  - tf-idf
  - Discard terms with fewer than 4 occurrences
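A minimal sketch of this kind of feature extraction follows. Several details are assumptions, since the slides do not specify them: whitespace is collapsed before extracting n-grams, "occurrences" is read as the total count over the corpus, and tf-idf is taken as raw count times log(N / document frequency).

```python
import math
from collections import Counter

def char_ngrams(text, n=5):
    """All overlapping character n-grams, after collapsing runs of whitespace."""
    normalized = " ".join(text.lower().split())
    return [normalized[i:i + n] for i in range(len(normalized) - n + 1)]

def tfidf_vectors(texts, n=5, min_count=4):
    """Sparse tf-idf vectors (dicts) over character n-grams.

    N-grams seen fewer than `min_count` times in the whole corpus are dropped.
    """
    grams_per_text = [Counter(char_ngrams(t, n)) for t in texts]
    corpus_counts = Counter()
    doc_freq = Counter()
    for grams in grams_per_text:
        corpus_counts.update(grams)      # total occurrences
        doc_freq.update(grams.keys())    # number of texts containing the n-gram
    kept = {g for g, c in corpus_counts.items() if c >= min_count}
    n_docs = len(texts)
    return [
        {g: tf * math.log(n_docs / doc_freq[g]) for g, tf in grams.items() if g in kept}
        for grams in grams_per_text
    ]
```

For example, `char_ngrams("abcdef")` yields the two 5-grams `"abcde"` and `"bcdef"`.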




- 45,728 points with 78,642 features
- SMO SVM solver, k-Nearest Neighbor, Random Forest and kernel methods are too resource-hungry
- We preferred online algorithms
  - ALMA, Passive-Aggressive, PEGASOS, OIPCAC gave good results

Classifier   Accuracy        F-Measure      Global time
ALMA         0.88 ± 0.01     0.87 ± 0.01    13.5 ± 1
PA           0.89 ± 0.01     0.89 ± 0.01    10.6 ± 0.1
PEGASOS      0.88 ± 0.01     0.88 ± 0.01    1103 ± 10
OIPCAC       0.89 ± 0.001    0.89 ± 0.01    5911 ± 52

- Other algorithms were tested, but with worse results (e.g. Naïve Bayes)
- We selected Passive-Aggressive: good results, low cost
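To show why an online algorithm is cheap, here is a sketch of the Passive-Aggressive update for binary classification on sparse features. It uses the PA-I variant with an aggressiveness cap `C`; the slides do not say which variant or which parameters were used, so both are assumptions.

```python
def dot(w, x):
    """Dot product between a weight dict and a sparse feature dict."""
    return sum(w.get(f, 0.0) * v for f, v in x.items())

def pa_train(examples, epochs=1, C=1.0):
    """Passive-Aggressive (PA-I variant) training for binary labels in {-1, +1}.

    `examples` is a list of (features, label) pairs, where features is a dict.
    """
    w = {}
    for _ in range(epochs):
        for x, y in examples:
            loss = max(0.0, 1.0 - y * dot(w, x))  # hinge loss
            norm_sq = sum(v * v for v in x.values())
            if loss == 0.0 or norm_sq == 0.0:
                continue  # passive: the margin is already satisfied
            tau = min(C, loss / norm_sq)  # aggressive step, capped by C
            for f, v in x.items():
                w[f] = w.get(f, 0.0) + tau * y * v
    return w

def pa_predict(w, x):
    """Predict +1 or -1 from the sign of the score."""
    return 1 if dot(w, x) >= 0 else -1
```

Each example is seen once per epoch and touches only its own non-zero features, which keeps memory and time low even with tens of thousands of features.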




2 – Sentiment Analysis: feature extraction

- Different approaches were tested:
  - n-grams of characters, words, n-grams of words
  - boolean presence, term frequency, tf-idf, fuzzy match between n-grams
- Best: term frequency of single words
- Adjustments:
  - Fraction of uppercase words as a feature
  - Features of synonyms were merged
- Other adjustments did not give better results:
  - Treating words at the beginning or at the end of the tweet as distinct
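The best-performing configuration can be sketched as follows. The synonym table is hypothetical, tokenization by whitespace is an assumption, and term frequency is taken here as a raw count.

```python
from collections import Counter

# Hypothetical synonym table: every key is counted as its canonical value.
SYNONYMS = {"pessimo": "brutto", "orribile": "brutto"}

def sentiment_features(tweet, synonyms=SYNONYMS):
    """Word term frequencies plus the fraction of fully uppercase words."""
    words = tweet.split()
    if not words:
        return {}
    # Merge synonyms into one feature by mapping them to a canonical word.
    counts = Counter(synonyms.get(w.lower(), w.lower()) for w in words)
    features = {f"tf:{word}": count for word, count in counts.items()}
    features["uppercase_fraction"] = sum(w.isupper() for w in words) / len(words)
    return features

example = sentiment_features("Governo PESSIMO e orribile")
```

In `example`, the two synonyms contribute to the single feature `tf:brutto`, and one word out of four is uppercase.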




2 – Removal of the target of sentiment

- We do not want the sentiment to be guessed from specific entities (PD, Berlusconi, ...). It should be based on the surrounding words.
  - Training set prejudices ⇒ overfitting
  - Problem sometimes ignored, especially in Italian-language Sentiment Analysis
- How can we decide which words must be removed?
  - Proposed solution: ontology-based filter

[Diagram: DBpedia → SPARQL query → first list of objects → appeared in text from online newspapers? (No: discarded; Yes: added to the list of objects to remove)]
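The second half of this filter (keeping only the candidates that actually occur in newspaper text, then stripping them from tweets) might look like the sketch below. The candidate list is a hypothetical stand-in for the result of the DBpedia query, and matching is limited to single-token names for simplicity.

```python
def entities_to_remove(candidates, newspaper_texts):
    """Keep only the candidate entity names that occur as tokens in the newspaper texts."""
    corpus_tokens = {tok.lower() for text in newspaper_texts for tok in text.split()}
    return {name for name in candidates if name.lower() in corpus_tokens}

def strip_entities(tweet, to_remove):
    """Drop every token of the tweet that matches an entity to remove."""
    blocked = {name.lower() for name in to_remove}
    return " ".join(tok for tok in tweet.split() if tok.lower() not in blocked)

# Hypothetical candidate list, standing in for the result of the DBpedia query.
candidates = ["Berlusconi", "PD", "Cavour"]
news = ["Berlusconi attacca il PD sulla riforma"]
to_remove = entities_to_remove(candidates, news)
cleaned = strip_entities("Berlusconi fa solo promesse", to_remove)
```

Here "Cavour" is discarded because it never appears in the newspaper sample, while the two names in current use are removed from the tweet text before feature extraction.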




2 – Sentiment Analysis: classification

- 12,476 points with 2,383 features

Classifier   Accuracy       F-Measure      Global time
ALMA         0.70 ± 0.03    0.75 ± 0.03    0.82 ± 0.3
PA           0.67 ± 0.06    0.71 ± 0.12    0.9 ± 10⁻³
PEGASOS      0.69 ± 0.03    0.73 ± 0.05    76 ± 0.1
OIPCAC       0.71 ± 0.03    0.75 ± 0.02    121 ± 25
RF           0.72 ± 0.03    0.78 ± 0.03    2173 ± 48

- We select ALMA: good results, low cost
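For readers unfamiliar with the two quality columns, this sketch shows the usual definitions of accuracy and F-measure (harmonic mean of precision and recall). Which class counts as positive in the table is not stated on the slide, so the `positive` argument is an assumption.

```python
def accuracy_and_f_measure(y_true, y_pred, positive=1):
    """Accuracy and F-measure (harmonic mean of precision and recall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # 2*TP / (2*TP + FP + FN) is algebraically equal to 2PR / (P + R).
    f_measure = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 0.0
    return accuracy, f_measure
```

The "±" values in the table would then come from repeating this measurement over several train/test splits.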




3 – Genericity: rule-based approach

- Problem: select tweets directed at all politicians
- Unhelpful training set labels → we simplify the problem and decide to use a rule-based approach
- Rules used:

Presence of keywords defined with domain experts (based on the most reliable part of the training set)
∨
Entities from at least three different political areas (identified through DBpedia ontologies)
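The disjunction of the two rules can be written directly. The keyword set and the entity-to-area map below are placeholders invented for this sketch; the real ones came from domain experts and DBpedia.

```python
# Hypothetical keyword list and entity-to-area map, for illustration only.
GENERIC_KEYWORDS = {"casta", "politici"}
ENTITY_AREA = {
    "party_a": "area_1",
    "party_b": "area_2",
    "party_c": "area_3",
    "party_d": "area_1",
}

def is_generic(tweet, keywords=GENERIC_KEYWORDS, entity_area=ENTITY_AREA, min_areas=3):
    """True if a keyword is present OR entities from at least `min_areas` areas are named."""
    tokens = set(tweet.lower().split())
    if tokens & keywords:
        return True
    areas = {entity_area[tok] for tok in tokens if tok in entity_area}
    return len(areas) >= min_areas
```

Counting distinct areas rather than distinct entities means that naming two parties from the same area does not make a tweet "generic".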




Application of the classification system

Hypothesis:

By applying this classification system to tweets from April-October 2012, can we obtain a good measure of the diffusion of political disaffection in society?




Surveys

- Surveys are courtesy of IPSOS
- We compute two indexes:
  1. Inefficacy: fraction of Italians who, for every party, say that their propensity to vote for it is 1 on a 1-to-10 scale
     - Sociologists say that this captures the sentiment of inefficacy perceived by citizens, a symptom of political disaffection
  2. Non-vote: fraction of Italians who will not vote
     - It measures a behaviour
     - Influenced by other factors: election proximity, "moral" perception of abstention
- We have a survey every ∼10 days in April-October
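The inefficacy index as defined above can be computed from raw survey answers as in this sketch. The data layout (one dict of party scores per respondent) is an assumption; the actual IPSOS data format is not described.

```python
def inefficacy_index(respondents):
    """Fraction of respondents whose propensity to vote is 1 for every party.

    `respondents` is a list of dicts mapping party name to a score from 1 to 10.
    """
    if not respondents:
        return 0.0
    disaffected = sum(
        1 for scores in respondents
        if scores and all(s == 1 for s in scores.values())
    )
    return disaffected / len(respondents)

# Invented answers from four respondents about two parties.
sample = [
    {"A": 1, "B": 1},
    {"A": 1, "B": 7},
    {"A": 3, "B": 2},
    {"A": 1, "B": 1},
]
index = inefficacy_index(sample)
```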




Tweet sample

- We gather a set of users active in October and we also sample their followers
- We obtain 261,313 users active in October
  - ∼5% of Italian users and ∼10% of active Italian users in October 2012
- We select only those active in April → 167,557 users
- Scraping of their tweets in this time period: 35,882,423 tweets















Comparison

- For every survey date, we compute the ratio between disaffected tweet volume and political tweet volume in a certain time window. We consider:
  - ∆^14_1: the last two weeks
  - ∆^14_7: between 14 and 7 days before
  - ∆^7_1: the last week

[Figure: calendars of June 2012 highlighting the three time windows.]

- We compare these ratios with the polls through the Pearson correlation coefficient
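The windowed ratio and the correlation can be sketched as follows. The window is taken to include both endpoints (from `max_days` to `min_days` before the survey), which is an assumption about the exact boundaries; the daily counts in the usage example are invented.

```python
from datetime import date, timedelta
import math

def window_ratio(survey_day, max_days, min_days, disaffected_per_day, political_per_day):
    """Disaffected / political tweet volume from max_days to min_days before the survey."""
    days = [survey_day - timedelta(days=k) for k in range(min_days, max_days + 1)]
    disaffected = sum(disaffected_per_day.get(d, 0) for d in days)
    political = sum(political_per_day.get(d, 0) for d in days)
    return disaffected / political if political else 0.0

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented daily counts around one survey date.
survey = date(2012, 6, 15)
disaffected = {date(2012, 6, 14): 3, date(2012, 6, 8): 1, date(2012, 6, 1): 5}
political = {date(2012, 6, 14): 20, date(2012, 6, 8): 10, date(2012, 6, 1): 50}
last_week = window_ratio(survey, 7, 1, disaffected, political)
```

Computing one such ratio per survey date gives a series that can be passed to `pearson` together with the survey index.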




Inefficacy:

∆^max_min   ρ        95% confidence interval   p-value for ρ > 0
∆^14_1      0.7860   0.476-0.922               0.031%
∆^14_7      0.7749   0.454-0.917               0.042%
∆^7_1       0.6880   0.310-0.878               0.226%

Non-vote:

∆^max_min   ρ        95% confidence interval   p-value for ρ > 0
∆^14_1      0.5579   0.190-0.788               0.567%
∆^14_7      0.5920   0.248-0.803               0.231%
∆^7_1       0.4433   0.049-0.718               3.00%
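One common way to obtain a confidence interval for a Pearson correlation is the Fisher z-transformation, sketched below. The slides do not state how the intervals in the tables were computed or how many survey points entered each correlation, so both the method and the sample size `n` are assumptions here.

```python
import math

def pearson_confidence_interval(r, n, z_crit=1.96):
    """Approximate confidence interval for a Pearson correlation (Fisher z-transform).

    `n` is the number of paired observations; z_crit=1.96 gives a two-sided 95% interval.
    """
    if n <= 3:
        raise ValueError("need more than 3 observations")
    z = math.atanh(r)                       # map r to an approximately normal scale
    half_width = z_crit / math.sqrt(n - 3)  # standard error is 1 / sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)
```

The interval is asymmetric around `r` because the transformation back to the correlation scale compresses values near ±1.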




[Figure: Twitter disaffection ratio and inefficacy indicator over time, from mid-April to early October. One vertical axis ranges from 0.00 to 0.25, the other from 0.004 to 0.011.]




Interpretation

- The data seem to indicate a fairly strong correlation between disaffected tweets and the diffusion of the phenomenon in society
- This does not mean that Twitter is a representative sample
- We can guess that the quantity of discussion about this phenomenon is connected with how much it will spread




Further investigation

- With this large tweet sample, we can study the causes of oscillations with text mining techniques
- We compute the ratio day by day and find when disaffection peaks occur
- For every peak
  1. we take the news of that day
  2. we compare them with the mass of tweets of the peak
  3. which is the most similar news item?
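Step 3 needs a similarity measure between a news item and the pooled tweets of a peak. The slides do not name the one that was used; the sketch below assumes cosine similarity over word counts, with invented example texts.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bags of words (Counters)."""
    shared = set(a) & set(b)
    num = sum(a[w] * b[w] for w in shared)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def most_similar_news(peak_tweets, news_titles):
    """Return the news title closest to the pooled text of the peak-day tweets."""
    tweets_bag = Counter(" ".join(peak_tweets).lower().split())
    return max(
        news_titles,
        key=lambda title: cosine(tweets_bag, Counter(title.lower().split())),
    )

# Invented tweets and news titles for one peak day.
peak_tweets = ["scandalo rimborsi in regione", "vergogna rimborsi"]
news_titles = ["Scandalo rimborsi alla regione", "La nazionale vince ancora"]
best = most_similar_news(peak_tweets, news_titles)
```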




[Figure: daily Twitter disaffection ratio (tweets) and inefficacy indicator (surveys) over time, from April to October. Annotated peaks: (conspiracy theories about State-sponsored massacres), Lega scandal, local elections, Brindisi bombing, Fiorito scandal.]




Conclusions

- We built classifiers for political messages
  - Topic and sentiment classifiers are reusable
- We showed a correlation between the quantity of discussion on Twitter and the diffusion of a phenomenon
- Future work
  - Classifying users as nodes of a social network
  - Confirming or refuting models that could explain our data




Thanks

