Sentiment is Shite (2014)

DESCRIPTION

Slide deck from a talk given at the BreakingNews.ie Measurement Conference on 10th September 2014. It's an updated version of a talk I gave (only once) in 2011. My understanding (and the technology) have developed since then.

NEGATIVE SENTIMENT another grumbly presentation from @mediaczar

TWITTER VOLUME (chart: thousands of tweets, axis 0 to 15; 1 Jun 2014 to 1 Jul 2014)

“facebook” AND (“newsfeed” OR “news feed”) June 1 – July 31 2014

Data: Netbase

WHAT HAPPENED HERE?

AND HOW DOES THAT MAKE YOU FEEL?

Topic Analysis Emotions

Data: Brandwatch, Netbase

DELOITTE: BALANCE OF SENTIMENT AFFECTS SALES

The balance of sentiment in Tweets is a more powerful driver of sales than reach (or volume) alone, with positive Tweets having generally a higher impact than negative Tweets. Therefore, to gain the most out of the online word-of-mouth embodied by Tweets, companies would be best served by addressing the balance of sentiment about their games through increasing the number of positive Tweets.

30% more positive Tweets: 6.10%

30% fewer negative Tweets: 3.30%

30% more non-Twitter advertising: 1.60%

Deloitte, “Tweets for Sales, Gaming” (2013)

We want to increase the positive buzz around Your brand

“SENTIMENT” HAS BECOME A MARKETING OBJECTIVE

Real case study. Name obscured to protect the innocent.

THIS IS HONEST ABE

Public sentiment is everything. With public sentiment, nothing can fail. Without it, nothing can succeed.

SENTIMENT ANALYSIS

I’M GOING TO BE EVEN MORE HONEST

DON'T HOLD BACK...TELL US WHAT YOU REALLY THINK!

SENTIMENT ANALYSIS IS SHITE!

THE PROMISE OF SOCIAL INTELLIGENCE

AN ALMOST INFINITE SOURCE OF QUAL & QUANT DATA!

FINALLY THEY WILL REVEAL WHAT THEY REALLY THINK!

VOLUME

RELEVANCE

AUTHORITY

TOPICS

SENTIMENT

WEIGHTED SENTIMENT, MRS BROWN’S BOYS, JULY 2014

POSITIVE: 14.8%, 12.2%, 11.4%, 7.3%

NEGATIVE: -4.1%, -5.2%, -6.3%, -8.5%

4 DIFFERENT TOOLS GIVE 4 DIFFERENT MEASURES OF SAME DATA

METHODOLOGICAL PROBLEMS

LEXICAL ANALYSIS

CLASSIFICATION

MANUAL INPUT

LEXICAL ANALYSIS

visit: http://www.wjh.harvard.edu/~inquirer/spreadsheet_guide.htm

e.g. Harvard General Inquirer: 11.8K categorised, tagged words

QUICK & DIRTY DIY SENTIMENT ANALYSIS TOOL

SOME OBVIOUS PROBLEMS…

CAN’T HANDLE IDIOMS, SYNONYMS (OR IRONY)

WELL, THAT'S JUST GREAT
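
A minimal sketch of the lexical approach and of why irony defeats it. The tiny word lists are invented for illustration (a real lexicon such as the General Inquirer tags thousands of words), and the scoring rule, positive hits minus negative hits, is one common simplification rather than the method of any particular tool.

```python
import re

# Hypothetical mini-lexicon, invented for this example.
POSITIVE_WORDS = {"good", "great", "love", "delicious"}
NEGATIVE_WORDS = {"bad", "awful", "hate", "rotten"}


def lexical_score(text):
    """Return (number of positive words) minus (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE_WORDS for word in words)
    negatives = sum(word in NEGATIVE_WORDS for word in words)
    return positives - negatives


def lexical_label(text):
    """Turn the score into a positive / negative / neutral label."""
    score = lexical_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# The word "great" is in the positive list, so the sarcasm is missed.
print(lexical_label("Well, that's just great"))
```

The scorer only sees that "great" is on the positive list; nothing in a word-count approach can register the tone a human reader hears.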

“Posts were determined to be positive or negative if they contained at least one positive or negative word”

Visit: http://www.pnas.org/content/111/24/8788.full LIWC: http://www.liwc.net/

FACEBOOK USED LEXICAL APPROACH
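
The rule quoted above is simple enough to sketch directly. The word lists below are hypothetical stand-ins (the slide points to LIWC, whose dictionaries are not reproduced here), and `classify_post` is a name made up for this example.

```python
import re

# Hypothetical stand-in word lists, not the real LIWC dictionaries.
POSITIVE_WORDS = {"happy", "great", "love"}
NEGATIVE_WORDS = {"sad", "awful", "hate"}


def classify_post(text):
    """Apply the 'at least one word' rule; a post can be both positive and negative."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return {
        "positive": bool(words & POSITIVE_WORDS),
        "negative": bool(words & NEGATIVE_WORDS),
    }


print(classify_post("I am not happy"))
print(classify_post("Love the idea, hate the execution"))
```

Under this rule "I am not happy" counts as a positive post, because the negation is never examined.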

CLASSIFIERS & SUPERVISED LEARNING…

TAGGED DATA

TRAINING DATA: TRAIN MODEL

TEST DATA: TEST MODEL

POSITIVE / NEUTRAL / NEGATIVE
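
The train-then-test loop can be sketched with a small naive Bayes classifier. This is an illustrative assumption, not the model behind any tool named in this talk: the tagged examples are invented, only two of the three labels are used to keep it short, and the functions `train` and `classify` are made up for this example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical hand-tagged examples, invented for illustration.
TAGGED_DATA = [
    ("this talk is good", "positive"),
    ("what a great presentation", "positive"),
    ("i love this show", "positive"),
    ("this talk is bad", "negative"),
    ("what an awful presentation", "negative"),
    ("i hate this show", "negative"),
    ("this presentation is good", "positive"),
    ("this presentation is bad", "negative"),
]


def train(examples):
    """Count words per label; these two tables are the whole model."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    return word_counts, label_counts


def classify(model, text):
    """Pick the label with the highest naive Bayes log-probability."""
    word_counts, label_counts = model
    vocabulary = {word for counts in word_counts.values() for word in counts}
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_examples)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            # Add-one smoothing so an unseen word does not zero out a label.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hold back the last two tagged examples as test data.
training_data, test_data = TAGGED_DATA[:6], TAGGED_DATA[6:]
model = train(training_data)
correct = sum(classify(model, text) == label for text, label in test_data)
print(f"correct on held-out data: {correct}/{len(test_data)}")
```

The model is nothing more than word counts per label, which is why the quality and domain of the tagged data matter so much.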

THIS PRESENTATION IS GOOD

visit: text-processing.com/demo/sentiment

CLASSIFIER SAYS “POSITIVE”

THIS PRESENTATION IS BAD

CLASSIFIER SAYS “NEGATIVE”

THIS PRESENTATION IS NOT GOOD

CLASSIFIER SAYS “NEGATIVE”

THESE ARE BOTH POSITIVE

THIS IS NEUTRAL

WHAT ABOUT THIS?

DOES IT UNDERSTAND WHAT IT’S READING?

1. Take the original text
2. Randomise the word order
3. Re-test

recipe: http://stackoverflow.com/questions/17825945/generating-a-list-of-random-words-in-excel-but-no-duplicates

WORD ORDER MAKES NO DIFFERENCE
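
One plausible explanation, assuming the classifier works from single-word ("bag of words") features; the slides do not say how the demo is built. Under that assumption a shuffled text produces exactly the same features as the original, as this sketch shows. The sentence and the `bag_of_words` helper are invented for the example.

```python
import random
from collections import Counter


def bag_of_words(text):
    """Reduce a text to word counts, discarding word order."""
    return Counter(text.lower().split())


original = "this presentation is not good but the speaker is funny"
words = original.split()
random.seed(2014)  # fixed seed so the shuffle is repeatable
random.shuffle(words)
shuffled = " ".join(words)

print(shuffled)
# Same words, same counts, so identical features in either order.
print(bag_of_words(original) == bag_of_words(shuffled))
```

Any model fed only these counts has to give the original and the shuffled text the same score.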

DOMAIN SPECIFIC

Models trained on one set of data may not work well on other sets

WHAT ABOUT HUMAN MARKERS?

FAIRLY EASY TO ASSESS ENTERTAINMENT CATEGORY

EXPERIMENT: IS GUINNESS GOOD FOR YOU?

Selected 50 positive and 50 negative tweets as scored by the classifier. Passed these tweets to human markers. Each tweet was scored 3 times (on a 5-point scale). The average score was compared to the classifier's label.
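
A rough sketch of the comparison step. The scores, the tweet ids and the cut-offs that turn an average into a label are all assumptions made for this example; the slides do not say how averages were mapped to negative, neutral or positive.

```python
from statistics import mean

# Hypothetical scores: three human markers per tweet, 1 (very negative) to 5 (very positive).
human_scores = {
    "tweet_a": [4, 5, 4],
    "tweet_b": [3, 3, 2],
    "tweet_c": [1, 2, 1],
}
# Hypothetical classifier output for the same tweets.
classifier_labels = {"tweet_a": "positive", "tweet_b": "positive", "tweet_c": "negative"}


def label_from_scores(scores, neutral_band=0.5):
    """Map the mean score to a label; the 0.5 band around 3 is an assumed cut-off."""
    average = mean(scores)
    if average > 3 + neutral_band:
        return "positive"
    if average < 3 - neutral_band:
        return "negative"
    return "neutral"


for tweet_id, scores in human_scores.items():
    manual = label_from_scores(scores)
    agrees = manual == classifier_labels[tweet_id]
    print(tweet_id, manual, "agrees" if agrees else "disagrees")
```

Because the classifier here never says "neutral", any tweet the markers place in the middle of the scale automatically counts as a disagreement.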

MECHANICAL TURKS!

Visit: http://www.crowdflower.com/

RESULTS: IS GUINNESS GOOD FOR YOU?

             NEG   NEUT   POS
CLASSIFIER    50      0    50
MANUAL        23     30    47
AGREEMENT    40%     0%   72%

(Agreement based on number of tweets with matched judgments)
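
A quick piece of arithmetic on the figures above. It rests on one assumption that the slide does not spell out: that each agreement percentage is the share of the classifier's tweets in that class which the human markers placed in the same class.

```python
# Figures taken from the results above.
classifier_counts = {"NEG": 50, "NEUT": 0, "POS": 50}
agreement_percent = {"NEG": 40, "NEUT": 0, "POS": 72}

# Assumption: each percentage is relative to the classifier's count for that class.
matched = {
    label: classifier_counts[label] * agreement_percent[label] // 100
    for label in classifier_counts
}
total_tweets = sum(classifier_counts.values())
overall_percent = 100 * sum(matched.values()) / total_tweets

print(matched)
print(f"overall agreement: {overall_percent:.0f}%")
```

On that reading, 20 of the negative and 36 of the positive judgments match, so humans and classifier agree on 56 of the 100 tweets.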

Advert used to say #Guinness is good for you but I think it is not acceptable to say that these days, but in moderation I thrive on it at 70

It's called a rotten apple #twobeersonecup #Guinness #angryorchard #delicious http://t.co/GUFxrYoF6i

HUMANS DON’T ALWAYS AGREE…

CLASSIFY THIS…

OR

THERE ARE HUGE LINES AT THE APPLE STORE TODAY

BIG ENOUGH NUMBERS

If the sample is large enough, won’t these problems get ironed out?

SAMPLE BIAS (1936 US PRESIDENTIAL ELECTION)

See: Tim Harford, “Big Data: are we making a big mistake?” (FT Magazine, 28 March 2014)

SURVEY            SIZE        ROOSEVELT
LITERARY DIGEST   2,400,000   43%
GALLUP            50,000      54%
ACTUAL                        61%

ALF LANDON

I THINK YOU'LL FIND IT'S 48 TIMES BIGGER
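
A toy simulation of why size does not cure bias. Every number in it is invented: the share of voters who are easy to reach and the two support rates were chosen only so that overall support comes out near 61%, and the sample sizes only so that the biased sample is roughly 48 times the random one. It is not a model of the 1936 polls.

```python
import random

random.seed(1936)  # fixed seed so the run is repeatable


def simulate_voter():
    """Return (easy_to_reach, supports_candidate) for one hypothetical voter."""
    easy_to_reach = random.random() < 0.30
    # Assumed support rates, chosen so overall support is about 61%.
    support_rate = 0.40 if easy_to_reach else 0.70
    return easy_to_reach, random.random() < support_rate


def share(votes):
    """Fraction of True values in a list of votes."""
    return sum(votes) / len(votes)


population = [simulate_voter() for _ in range(200_000)]

# Large but biased: everyone the pollster can easily reach.
biased_sample = [supports for reachable, supports in population if reachable]
# Small but random: 1,250 voters drawn from the whole population.
random_sample = [supports for _, supports in random.sample(population, 1250)]

print(f"true support:        {share([s for _, s in population]):.1%}")
print(f"large biased sample: {share(biased_sample):.1%} (n={len(biased_sample)})")
print(f"small random sample: {share(random_sample):.1%} (n={len(random_sample)})")
```

The large sample stays wrong however many voters it contains, because more responses from the same skewed group only measure that group more precisely.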

WIN 30 MINS OF FREE CONSULTANCY (VALUE £750)

I'm loving #breakingnewsconf, @mediaczar.

IT’S FAR TOO EASY TO GAME “POSITIVE SENTIMENT” METRICS

RECOMMENDATIONS

PUT GREAT TECH TO MORE MEANINGFUL USE

DON’T LET SENTIMENT BECOME A KPI

MAKE SENTIMENT A TOOL FOR MORE COMPLEX RESEARCH

LIFE ISN’T A POPULARITY CONTEST

THANK YOU!

PLEASE DON’T ASK ME ANY TRICKY QUESTIONS THAT WILL MAKE ME LOOK STUPID.

I’M @MEDIACZAR ON TWITTER.

FEEL FREE TO COME AND TALK TO ME AFTERWARDS.
