Social Media Analytics as a Service Dr. Diana Maynard University of Sheffield, UK

Social media analytics as a service: tools from GATE

Slides from my talk at LT-Accelerate 2014, Brussels

Page 1: Social media analytics as a service: tools from GATE

Social Media Analytics as a Service

Dr. Diana Maynard
University of Sheffield, UK

Page 2: Social media analytics as a service: tools from GATE

We are all connected to each other...

● Information, thoughts and opinions are shared prolifically on the social web these days

● 72% of online adults use social networking sites

Page 3: Social media analytics as a service: tools from GATE

● In Britain and the US, approx 1 hour a day on social media

● 90% of marketers use social media channels for business

Page 4: Social media analytics as a service: tools from GATE

Popularity of Social Networking Sites

Twitter

● 284 million monthly active users

● 100 million daily active users

● 80% of world leaders use Twitter

Facebook

● 1.35 billion monthly active users, 864 million daily active, 10 billion messages a day

● 30% of Americans get their news from Facebook

● Facebook has more users than the whole of the Internet did in 2005

Other networks

● Google+: 300 million monthly active users

● LinkedIn: 332 million users

● MySpace: 36 million users

● 44% of “users” have never sent a tweet

● 390 million users have no followers

● Google+: only 7 minutes per month

Page 5: Social media analytics as a service: tools from GATE

Your grandmother is three times as likely to use a social networking site now as in 2009

Page 6: Social media analytics as a service: tools from GATE

Why analyse social media?

● Contrary to popular belief, Twitter isn't just full of tweets about Justin Bieber.

● In an emergency, one in two people would use social media to let people know they were safe or to find out more information

● Less than 24 hours after the recent Nepal trekking disaster hit, Facebook and Twitter accounts had been set up to provide information channels, missing persons register etc.

● For companies, sentiment analysis tools are critical to keep track of the market pulse, customer feedback, etc.

● Fast-growing, highly dynamic and high-volume source of data

● Reflects language and current views of today's society

● Analysing social media is far more efficient than e.g. YouGov polls

Page 7: Social media analytics as a service: tools from GATE

Opinion mining from social media

● Understanding customer reviews and so on is a huge business

● But also:

● Tracking political opinions: what events make people change their minds?

● How does public mood influence the stock market, consumer choices etc?

● How are opinions distributed in relation to demographics?

● Who are the opinion influencers?

● SMA tools are crucial in order to make sense of all the information

Page 8: Social media analytics as a service: tools from GATE

Social media analysis for journalists

● Twitter is immensely valuable to news professionals

  ● gauging opinion on breaking news

  ● discovering new stories

  ● first-hand reports from disasters, war zones, ...

● Issues of veracity: London Eye on Fire!

Page 9: Social media analytics as a service: tools from GATE

Analysing language in social media is hard

● Grundman:politics makes #climatechange scientific issue,people don’t like knowitall rational voice tellin em wat 2do

● @adambation Try reading this article , it looks like it would be really helpful and not obvious at all. http://t.co/mo3vODoX

● Want to solve the problem of #ClimateChange? Just #vote for a #politician! Poof! Problem gone! #sarcasm #TVP #99%

● Human Caused #ClimateChange is a Monumental Scam! http://www.youtube.com/watch?v=LiX792kNQeE … F**k yes!! Lying to us like MOFO's Tax The Air We Breath! F**k Them!

Page 10: Social media analytics as a service: tools from GATE

We need tools for hashtag analysis

● Hashtags need unravelling:

  ● #gasprices

● And disambiguating:

  ● #therapist

  ● #nowthatcherisdead
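To make the unravelling and disambiguation problem concrete, here is a minimal Python sketch that lists every way a hashtag can be split into known words. It is an illustration only and not how GATE or TwitIE handle hashtags: the function name `segmentations` and the tiny `VOCAB` set are assumptions made up for this example.

```python
def segmentations(tag, vocab):
    """Return every way to split a hashtag into words found in `vocab`."""
    text = tag.lstrip("#").lower()
    results = []

    def walk(start, words):
        # Reached the end of the hashtag: record this complete split.
        if start == len(text):
            results.append(words)
            return
        # Try every prefix of the remaining text that is a known word.
        for end in range(start + 1, len(text) + 1):
            piece = text[start:end]
            if piece in vocab:
                walk(end, words + [piece])

    walk(0, [])
    return results


# Toy vocabulary (hypothetical); a real system would need a large lexicon.
VOCAB = {
    "gas", "prices", "the", "rapist", "therapist",
    "now", "that", "thatcher", "cher", "is", "dead",
}

for tag in ["#gasprices", "#therapist", "#nowthatcherisdead"]:
    print(tag, segmentations(tag, VOCAB))
```

With this vocabulary, #gasprices has a single reading, while #therapist and #nowthatcherisdead each have two, so splitting alone is not enough and some context is needed to pick the intended one.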

Page 11: Social media analytics as a service: tools from GATE

Hijacking of hashtags

Page 12: Social media analytics as a service: tools from GATE

#earthhour2014

Page 13: Social media analytics as a service: tools from GATE

NER is dead! Long live NER!

Page 14: Social media analytics as a service: tools from GATE

NER on Tweets

● NER on Tweets much harder than on longer text

● Very short, so ambiguous terms hard to interpret

● Poor grammar and spelling, use of abbreviations, shorthands

● Twitter-specific features: hashtags, @mentions, etc.

● Tools designed for longer texts do very badly on Twitter

System       P      R      F1     F0.5
OpenCalais   68.59  67.17  67.87  68.30
Lupedia      70.93  44.17  54.44  63.27
TextRazor    59.12  83.83  69.34  62.82
TwitIE       69.69  61.03  65.07  67.76
Zemanta      29.64  29.31  29.47  29.57
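For readers unfamiliar with the last two columns, the sketch below recomputes them from P and R using the standard F-beta formula, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R). The function name `f_beta` and the `ROWS` list are introduced here for illustration; the numbers are copied from the table above.

```python
def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall.

    beta < 1 weights precision more heavily; beta = 1 gives the usual F1.
    """
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# (system, P, R, reported F1, reported F0.5), copied from the table.
ROWS = [
    ("OpenCalais", 68.59, 67.17, 67.87, 68.30),
    ("Lupedia", 70.93, 44.17, 54.44, 63.27),
    ("TextRazor", 59.12, 83.83, 69.34, 62.82),
    ("TwitIE", 69.69, 61.03, 65.07, 67.76),
    ("Zemanta", 29.64, 29.31, 29.47, 29.57),
]

for name, p, r, _, _ in ROWS:
    print(f"{name:<12}{f_beta(p, r):7.2f}{f_beta(p, r, 0.5):7.2f}")
```

The recomputed values agree with the reported ones to within about 0.01; the small differences presumably arise because the table was computed from unrounded precision and recall.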

Page 15: Social media analytics as a service: tools from GATE

Tools for Sentiment Analysis

● There are lots of tools for sentiment analysis around

● Many of them don't work well at more than a very basic level

● They mainly use dictionary lookup for positive and negative words

● ML methods only work for text that's similar in style to the training data, and it's hard to understand when they go wrong

● Things like sarcasm tend not to get picked up

● They classify the tweets as positive or negative, but not with respect to the keyword you're searching for

  ● keyword search just retrieves any tweet mentioning it, but not necessarily about it as a topic

  ● no correlation between the keyword and the sentiment

Page 16: Social media analytics as a service: tools from GATE

Sentiment Analysis in GATE

● Knowledge-based linguistic approach based on entity detection for opinion holders and targets

● Sentiment words have to be in a linguistic relation to the opinion holder and target

● Use linguistic analysis to deal with scope issues (negation, hashtags, sarcasm etc)

● Sentiment word scores are modified incrementally

● Easy understanding of errors and adaptation of the rules

● Twitter-specific pre-processing using TwitIE
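The idea of modifying sentiment word scores incrementally can be sketched as follows. This is a toy Python illustration, not GATE's actual rules: the lexicons, the two-token lookback window and the `score` function are all assumptions, and the sketch leaves out the entity detection that ties a sentiment word to its opinion holder and target.

```python
# Hypothetical mini-lexicons for illustration only.
LEXICON = {"good": 1.0, "helpful": 1.0, "bad": -1.0, "scam": -2.0}
INTENSIFIERS = {"really": 1.5, "very": 1.5}
NEGATORS = {"not", "never"}
PUNCT = ".,!?"


def score(text):
    """Sum lexicon scores, adjusting each one by nearby modifiers."""
    tokens = text.lower().split()
    total = 0.0
    for i, token in enumerate(tokens):
        word = token.strip(PUNCT)
        if word not in LEXICON:
            continue
        value = LEXICON[word]  # start from the base score
        # Adjust step by step using up to two preceding tokens.
        for prev in tokens[max(0, i - 2):i]:
            prev = prev.strip(PUNCT)
            if prev in INTENSIFIERS:
                value *= INTENSIFIERS[prev]
            elif prev in NEGATORS:
                value = -value
        total += value
    # A #sarcasm hashtag flips the polarity of the whole tweet.
    if "#sarcasm" in tokens:
        total = -total
    return total


print(score("this is really helpful"))
print(score("not very good"))
print(score("really helpful #sarcasm"))
```

Because every adjustment is an explicit rule, it is easy to trace why a tweet received its score and to adapt the rule that caused an error, which is the property the slide highlights.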

Page 17: Social media analytics as a service: tools from GATE

This all sounds like it would be hard to set up on my system!

Page 18: Social media analytics as a service: tools from GATE

GATE Cloud to the Rescue

● What?

  ● End-to-end text and web processing solutions from the GATE family running on cloud computing infrastructures.

● Why?

● Solve any sort of text processing problem: web, text or opinion mining; indexing and search (fulltext, boolean, conceptual, structural); information extraction; semantic annotation; sentiment analysis; ontology population; etc.

● Run large-scale jobs without investing in server hardware or other fixed costs.

● Exploit a 15-year R&D programme, the expertise of the GATE community and a defined and repeatable process.

Page 19: Social media analytics as a service: tools from GATE

Benefits of GATE Cloud

Text Analytics Consumer

Cloud: Large scale, no CAPEX, no system admin, no commitment

Open Source: No vendor lock-in

TA Services: Twitter, News, BioMed, Sentiment, etc.; low-level pre-processing support (POS tagging etc)

APIs: Integrate

Page 20: Social media analytics as a service: tools from GATE

Application Types

● Low-level: stemmers, PoS taggers, phrase chunkers, morphological analysers

● Coverage: tools for 18 languages including Bulgarian and Russian

● General Purpose IE: named entities, numbers, measurements, language ID

● Domain-specific IE: News, TwitIE, Biomed

● LOD-based semantic annotation: DBpedia, GeoNames, Freebase

● Sentiment analysis

● Summarisation

● Includes many 3rd party tools also

Page 21: Social media analytics as a service: tools from GATE

On-demand document processing workflow

Page 22: Social media analytics as a service: tools from GATE

It's just like online shopping

● Click through to the online shop, browse products and add them to your shopping basket.

● Create an account and then buy credit vouchers

● Put the vouchers in your account, and go to checkout.

● We'll email you the login or job creation details for your cloud servers.

● Monitor and control your cloud machines on your dashboard.

● Use our existing applications:

  ● Just upload your documents and sit down with a cup of tea

● Create your own pipeline:

  ● Upload your own customised application along with your documents, and sit down with a cup of tea

Page 23: Social media analytics as a service: tools from GATE

Summary

● SMA tools are crucial, but it's hard to find what's good

● Solutions are readily available in GATE

● Easy to test different versions and configurations

● Open source and easily customisable

● Big data and installation problems are solved with GATE Cloud:

  ● PaaS for text analytics

  ● Low barrier to entry

  ● Just pay for what you use

  ● State-of-the-art pipelines for news and social media

  ● More pipelines constantly being added

Page 24: Social media analytics as a service: tools from GATE

Acknowledgements and more information

● GATE: http://gate.ac.uk

● GATE Cloud: http://gatecloud.net

● AnnoMarket: http://www.annomarket.eu

● Research partially supported by the European Union/EU under the Information and Communication Technologies (ICT) theme of the 7th Framework Programme for R&D (FP7) DecarboNet (610829) and AnnoMarket (296322)

● Original GATE Cloud development supported by JISC/EPSRC, reference number EP/I034092/1

This document does not represent the opinion of the European Community, and the European Community is not responsible for any use that might be made of its content