Dr. Diana Maynard, University of Sheffield, UK
Are We Really Interested In Climate Change?
www.decarbonet.eu
What do people think about climate change?
And how much do we really know about it?
How do we know what's really true?
It's cold in my flat
The DecarboNet Project
● Scientists predict adverse consequences to our climate unless stronger actions are taken
● Collective awareness about many climate change issues is still problematic
● We are exposed to vast amounts of conflicting information
● Hard to know what is accurate and relevant
● DecarboNet: “A Decarbonisation Platform for Citizen Empowerment and Translating Collective Awareness into Behavioural Change”
● 3-year EU project, started October 2013
DecarboNet Objectives
• Raise Individual and Collective Awareness
• Trigger Behavioural Change and Foster Social Innovation
• Analyse Behavioural Patterns and Information Diffusion
Social media analysis for climate change
● What are people tweeting about?
● NLP tools enable the discovery of new insights by automatically extracting information from social media.
● Extracted information can be linked together to form new facts or to allow new hypotheses to be explored further.
● What arguments for and against man-made causes of climate change develop in social media?
● What impact does this information have?
● How do people's opinions change over time?
● What kinds of topics are most engaging for social media users?
Earth Hour Campaign represented on Twitter
What is going on here???
Media Watch for Climate Change
How can this be used to raise awareness of climate change?
● Organisations need to have a better understanding of public perception of climate change in order to develop campaigns and strategies.
● Our technology helps them to understand what the opinions are on crucial topics and events
● How are these opinions distributed in relation to demographic user data?
● How have these opinions evolved?
● Who are the opinion leaders, and what is their impact and influence?
● Helps them to improve their campaigns and target them better
● Helps improve both the development and marketing of environment-related tools and technology by better understanding social perception and behaviour
● Hard to get this information by traditional means (YouGov polls etc.)
We are all connected to each other...
● Information, thoughts and opinions are shared prolifically on the social web these days
● 72% of online adults use social networking sites
Your grandmother is three times as likely to use a social networking site now as in 2009
Social media and climate change are not just for young people!
Analysing Social Media is harder than it sounds
There are lots of things to think about!
Not every tweet about global warming is relevant
Some tweets are sarcastic
Let's search for keywords like “Arctic”
Oops!
Opinion Mining involves finding out what people think
● A simple approach would look for positive and negative words in a tweet.
● e.g. “Climate change is terrible.” “Recycling is a good way to save the planet.”
● We could simply collect lists of positive and negative words and categorise the tweets accordingly.
● But it's not as simple as that.
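The word-list approach described on this slide can be sketched in a few lines of Python. This is only an illustration: the word lists and the function name `naive_polarity` are assumptions made up for the example, not the lexicons or code used in the talk.

```python
import re

# Tiny illustrative word lists (assumptions, not the lexicons used in the talk).
POSITIVE = {"good", "great", "save", "amazing"}
NEGATIVE = {"terrible", "bad", "awful", "scam"}


def naive_polarity(tweet: str) -> str:
    """Classify a tweet by counting positive and negative words."""
    words = re.findall(r"[a-z']+", tweet.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(naive_polarity("Climate change is terrible."))
print(naive_polarity("Recycling is a good way to save the planet."))
# The negation is ignored, which is exactly why word counting is too simple.
print(naive_polarity("not so good campaign"))
```

The last call comes out as "positive" because the counter sees "good" and has no notion of "not".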
Language in Social Media is complicated
● Grundman:politics makes #climatechange scientific issue,people don’t like knowitall rational voice tellin em wat 2do
● Want to solve the problem of #ClimateChange? Just #vote for a #politician! Poof! Problem gone! #sarcasm #TVP #99%
● Human Caused #ClimateChange is a Monumental Scam! http://www.youtube.com/watch?v=LiX792kNQeE … F**k yes!! Lying to us like MOFO's Tax The Air We Breath! F**k Them!
Challenges for NLP
● Noisy language: unusual punctuation, capitalisation, spelling, use of slang, sarcasm etc.
● Terse nature of microposts such as tweets
● Use of hashtags, @mentions etc. causes problems for tokenisation #thisistricky
● Lack of context gives rise to ambiguities
● NER performs poorly on microposts, mainly because of linguistic pre-processing failure
● Standard NER tools almost halve their performance rate when used on tweets
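To make the tokenisation problem concrete, here is a minimal Python sketch. It keeps #hashtags and @mentions as single tokens and then tries to split a hashtag such as #thisistricky into words. The vocabulary, the regular expression and both function names are assumptions for illustration; this is not how TwitIE is implemented.

```python
import re

# Hypothetical mini-vocabulary; a real system would need a large word list.
VOCAB = {"this", "is", "tricky", "climate", "change", "earth", "hour"}

# Hashtags and @mentions first, then ordinary words, then single punctuation marks.
TOKEN_RE = re.compile(r"[#@]\w+|\w+(?:'\w+)?|[^\w\s]")


def tokenise(text):
    """Split text into tokens, keeping #hashtags and @mentions whole."""
    return TOKEN_RE.findall(text)


def segment_hashtag(tag):
    """Split a hashtag into vocabulary words, or return None if that fails."""
    body = tag.lstrip("#").lower()
    # best[i] holds one segmentation of body[:i], or None if none was found.
    best = [None] * (len(body) + 1)
    best[0] = []
    for end in range(1, len(body) + 1):
        for start in range(end):
            word = body[start:end]
            if best[start] is not None and word in VOCAB:
                best[end] = best[start] + [word]
                break
    return best[len(body)]


print(tokenise("@jimcramer I hate your energy #notreally"))
print(segment_hashtag("#thisistricky"))
print(segment_hashtag("#sarcasm"))  # not in the toy vocabulary
```

A hashtag made of words outside the vocabulary cannot be split at all, which hints at why standard pre-processing struggles on tweets.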
Persons in news articles
Persons in tweets
Lack of context causes ambiguity
Branching out from Lincoln park after dark ...
Hello Russian Navy, it's like the same thing but with glitter!
??
Getting the NEs right can be tricky
Branching out from Lincoln park after dark ...
Hello Russian Navy, it's like the same thing but with glitter!
“Positive” tweets about fracking
● Help me stop fracking. Sign the petition to David Cameron for a #frack-free UK now!
● I'll take it as a sign that the gods applaud my new anti-fracking country love song.
● #Cameron wants to change the law to allow #fracking under homes without permission. Tell him NO!!!!!
● Clearly, existing tools don't work very well!
Whitney Houston wasn't very popular...
Death confuses opinion mining tools
Opinion mining tools are good for a general overview, but not for some situations
Text Analysis with GATE
● GATE is a toolkit for Natural Language Processing (NLP), in development at the University of Sheffield for 20 years
● http://gate.ac.uk
● components for language processing: parsers, machine learning tools, stemmers, IR tools, IE components for various languages, opinion mining
● tools for visualising and manipulating text, annotations, ontologies, parse trees, etc.
● various information extraction tools
● evaluation and benchmarking tools
GATE Components for Opinion Mining
● TwitIE
● structural and linguistic pre-processing, specific to Twitter
● includes language detection, hashtag retokenisation, POS tagging, NER
● Term recognition using TermRaider
● Sentiment gazetteer lookup
● JAPE opinion detection grammars
● Include target and opinion holder detection based on entities/terms
● currently positive/negative, extending to emotion detection (happy/sad/anger/fear etc)
Basic approach for opinion finding
● Find sentiment-containing words in a linguistic relation with terms/entities (opinion-target matching)
● life flourishing in Antarctica
● Dictionaries give a starting score for sentiment words
● Use a number of linguistic sub-components to deal with issues such as negatives, adverbial modification, swear words, conditionals, sarcasm etc.
A positive sentiment list
● awesome category=adjective score=0.5
● beaming category=adjective score=0.5
● belonging category=noun score=0.5
● benefic category=adjective score=0.5
● benevolently category=adverb score=0.5
● caring category=noun score=0.5
● charitable category=adjective score=0.5
● charm category=verb score=0.5
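Entries written in the layout shown on this slide can be loaded into a lookup table with a small parser. The sketch below assumes the whitespace-separated "word key=value" layout exactly as it appears on the slide; the file format GATE actually uses for gazetteers may differ, and `parse_entry` is a hypothetical helper.

```python
def parse_entry(line):
    """Parse 'entry key=value ...' into (entry, features).

    Tokens containing '=' are treated as features; the remaining tokens
    are joined to form the (possibly multi-word) entry.
    """
    tokens = line.split()
    words = [t for t in tokens if "=" not in t]
    features = dict(t.split("=", 1) for t in tokens if "=" in t)
    if "score" in features:
        features["score"] = float(features["score"])
    return " ".join(words), features


# A few entries copied from the slide.
ENTRIES = """\
awesome category=adjective score=0.5
benevolently category=adverb score=0.5
charm category=verb score=0.5
"""

lexicon = dict(parse_entry(line) for line in ENTRIES.splitlines())
print(lexicon["charm"])
```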
A negative sentiment list
Examples of phrases following the word “go”:
● down the pan
● down the drain
● to the dogs
● downhill
● pear-shaped
Opinion scoring
● Sentiment gazetteers (developed from sentiment words in WordNet) have a starting “strength” score
● These get modified by context words, e.g. adverbs, swear words, negatives and so on
amazing campaign --> really amazing campaign.
good campaign --> not so good campaign
● Swear words modifying adjectives count as intensifiers
good campaign --> damned good campaign
● Swear words on their own are classified as negative
Damned politicians.
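The scoring rules on this slide can be sketched as a toy scorer in Python. Everything numeric here is an assumption chosen for illustration: the starting scores, the 1.5 boost for intensifiers, the sign flip for negation, the two-word context window and the -0.5 score for a lone swear word. The real system uses JAPE grammars in GATE, not this code.

```python
# Hypothetical starting scores and word lists, for illustration only.
SENTIMENT = {"amazing": 0.5, "good": 0.5, "terrible": -0.5}
INTENSIFIERS = {"really", "very"}
NEGATORS = {"not", "never"}
SWEAR_WORDS = {"damned"}

BOOST = 1.5         # assumed multiplier for each intensifier
SWEAR_SCORE = -0.5  # assumed score for a swear word used on its own
WINDOW = 2          # assumed number of context words to inspect


def score_phrase(phrase):
    """Sum sentiment scores, adjusted by the words just before each one."""
    words = [w.strip(".,!?").lower() for w in phrase.split()]
    total = 0.0
    for i, word in enumerate(words):
        if word in SENTIMENT:
            score = SENTIMENT[word]
            context = words[max(0, i - WINDOW):i]
            for modifier in context:
                # Adverbs and swear words both act as intensifiers here.
                if modifier in INTENSIFIERS or modifier in SWEAR_WORDS:
                    score *= BOOST
            if any(m in NEGATORS for m in context):
                score = -score
            total += score
        elif word in SWEAR_WORDS:
            # A swear word with no sentiment word close behind it is negative.
            following = words[i + 1:i + 1 + WINDOW]
            if not any(w in SENTIMENT for w in following):
                total += SWEAR_SCORE
    return total


for phrase in ["amazing campaign", "really amazing campaign",
               "not so good campaign", "damned good campaign",
               "Damned politicians."]:
    print(phrase, score_phrase(phrase))
```

Flipping the sign on negation is the crudest possible choice; a weaker adjustment would arguably suit "not so good" better.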
A positive tweet
A negative tweet
A Sarcastic Tweet
A little look at sarcasm
● Sarcasm is usually about conveying the opposite meaning to the words we use
● Frequent hashtags: #sarcasm, #notreally, #whocares, #whoknew, #lol
● Even knowing it's sarcastic isn't enough unless you know how to interpret the sarcasm
● Did you know, trees grow? YEEEEEEE #yee #trees #biodiversity #wee #igottapee #notreally
● RT @James_BG: Some lunchtime reading - why golf is an environmental abomination and MUST BE BANNED http://t.co/tBs0aGW8bc #notreally.
● It can be positive as well as negative
● @jimcramer I hate your energy and knowledge #notreally Booyah Cramer.
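As a rough illustration of the "opposite meaning" idea, the Python sketch below flips the polarity of a tweet when it carries an explicit sarcasm hashtag. The choice of just two marker hashtags and the function name `adjust_for_sarcasm` are assumptions. A blanket flip is a deliberate simplification: as the slide notes, spotting the sarcasm is not enough unless you also know which part of the tweet it applies to.

```python
import re

# Two of the hashtags listed on the slide; treating only these as reliable
# sarcasm markers is an assumption made for this sketch.
SARCASM_TAGS = {"#sarcasm", "#notreally"}


def adjust_for_sarcasm(tweet, score):
    """Invert a sentiment score if the tweet has an explicit sarcasm hashtag."""
    tags = {tag.lower() for tag in re.findall(r"#\w+", tweet)}
    if tags & SARCASM_TAGS:
        return -score
    return score


# "hate" would score as negative, but the hashtag signals the opposite.
tweet = "@jimcramer I hate your energy and knowledge #notreally Booyah Cramer."
print(adjust_for_sarcasm(tweet, -0.5))
```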
Analysis of the EarthHour campaign
● Analysis of hashtags and topics mentioned
● The main activities and themes of the campaign drove most of the social media conversations
● Users engaged in the campaign but did not necessarily engage with climate change and sustainability issues.
● Lack of correlation between Durex campaign and climate change engagement
Engagement Analysis
● Retweets as the strongest engagement action
● Identify the characteristics of those tweets that are followed by
● an engagement action (retweet)
● a high level of engagement (high number of retweets)
● For generating engagement, the content of the tweet is more relevant than the reputation of the user.
● This contradicts previous findings in other domains
● Posts generating attention are slightly longer, easier to read, have positive sentiment, mention other users and repeat terminology from other posts
● People are perhaps bored of hearing doom and gloom about how the world is going to end?
Summary
● Tools for social media analysis could be very useful to companies and organisations involved in climate change
● Understanding engagement and user impact of campaigns, products etc.
● Also useful to the general public to understand issues better
● While retweets etc. are a useful indicator of engagement, the key is understanding content of social media posts
● For this we need in-depth analysis of many language issues
● NLP can help us understand what is really going on!
Some final thoughts on climate change
Acknowledgements and further Information
● Research partially supported by the European Union/EU under the Information and Communication Technologies (ICT) theme of the 7th Framework Programme for R&D (FP7) DecarboNet (610829) http://www.decarbonet.eu
● GATE website http://gate.ac.uk
● Opinion mining demo: http://demos.gate.ac.uk/arcomem/opinions/
● Diana Maynard, Gerhard Gossen, Marco Fisichella, Adam Funk. Should I care about your opinion? Detection of opinion interestingness and dynamics in social media. Journal of Future Internet, Special Issue on Archiving Community Memories, 2014.
● Diana Maynard and Mark A. Greenwood. Who cares about sarcastic tweets? Investigating the impact of sarcasm on sentiment analysis. Proc. of LREC 2014, Reykjavik, Iceland, May 2014.
This document does not represent the opinion of the European Community, and the European Community is not responsible for any use that might be made of its content