Building Effective Frameworks for Social Media Analysis Presented by: Josh Liss Open Analytics Summit DC 2013

Open analytics social media framework

Segue

- 200+ million users in 200 countries - TechCrunch
- Incredible amount of personal information
- Reached 10 million monthly unique visitors faster than any independent site in history - Sirona Consulting
- 28.1% of users have an annual household income of $100K - UltraLinx
- 1+ billion monthly active users - Facebook
- 17 billion geo-tagged pictures & check-ins - Gizmodo
- 230+ million monthly active users - GlobalWebIndex
- 175 million tweets/day in 2012 - Infographics Labs
- Google +1 button used 5 billion times/day - AllTwitter
- 625,000 new users on Google+ every day - AllTwitter

Agenda

• Social Media: An Intelligence Perspective
• Common Analytic Pitfalls
• An Analytic Framework
• Case Study: Superstorm #Sandy
  – Problem Definition
  – Source Selection
  – Data Capture
  – Data Reporting
  – Data Analysis
• The Way Forward – Do’s & Don’ts
• Discussion

Intelligence

• Intelligence is information that has been transformed to meet an operational need

Data → Operational Lens → Intelligence

Intelligence Cycle

• No matter what methodology you use, intelligence analysis is an iterative process.

Collect → Store → Analyze → Distribute → (back to Collect)

Social Media: Intelligence Perspective

• Intelligence derived from social media brings with it the best and worst aspects of:
  – HUMINT (human intelligence)
  – SIGINT (signals intelligence)
  – OSINT (open-source intelligence)

Social Media Analysis Goals

• Provide value to the organization – turn data into intelligence using an “operational lens”
• Ensure cyclical feedback occurs during collection, processing, analysis, and consumption
• Validate that a particular network is the right source of data for the questions you need answered
• $$$

Common Misconceptions

• Social media is not a panacea
  – Not everyone uses social media
  – Users of social media use it unevenly
  – User behavior changes based on situations
• Just because people can talk about anything does not mean they talk about everything all the time.

Common Pitfalls

• Analyzing What Instead of Why: The important thing is often not what people are saying, but why they are saying it.
• Using the Wrong Analysis Tools: Reporting tools rarely help dig into the why. Many common tools, reports, and metrics are misleading:
  – Word clouds atomize message context
  – Sentiment metrics are often highly inaccurate
  – Information in aggregate hides more than it reveals

Pitfalls: An Example of the Challenge

Dangers of Disintegration

Source: Matthew Auer, Policy Studies Journal, Volume 39, Issue 4, pages 709–736, Nov 2011

The problems are analytical rather than aesthetic or technical. The context is virtually indecipherable: -

Analytic Framework

• Data Capture (DC) – What to measure
• Data Reporting (DR) – What the data is saying
• Data Analysis (DA) – What should be done based on the data

Source: Avinash Kaushik, Occam’s Razor Blog, http://www.kaushik.net/avinash/web-analytics-consulting-framework-smarter-decisions/

Capture → Report → Analyze

Choosing a Platform

• Social media, and the ways that it is used, is relatively new and evolving rapidly:
  – Static approaches to social media are flawed from the outset
  – No one metric or set of metrics will always let you know what is happening
  – No turn-key solution to all problems
• Platforms need to be open and highly adaptable to facilitate data capture, reporting, and analysis

Case Study: Superstorm Sandy

• Industry: Disaster Response/Crisis Informatics
  – 14 billion-dollar disasters in 2011
  – 11 billion-dollar disasters in 2012
    • Over $100 billion in total damages
• Oct 29, 2012 - Hurricane Sandy
  – $50+ billion in damages
  – 72 deaths directly attributed to the storm
    • Additional 87 deaths indirectly attributed
• Can social media SAVE money/lives/resources?

Problem Definition

• Question: How can social media assist civil authorities responding to natural disasters?
  – Prevent/limit loss of life and limb
  – Prevent/limit damage and loss of property
  – Protect critical infrastructure
• Challenges: Capture relevant information from social media sources.
  – Query too large/broad = false positives
  – Query too small/narrow = miss potential information
  – Signal vs. Noise
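One way to make the broad-versus-narrow trade-off concrete is to score candidate queries against a small hand-labeled sample. The sketch below is only an illustration: the sample tweets, the labels, and the keyword lists are all invented, and simple substring matching stands in for whatever query language a real capture tool offers. Precision falls when a query is too broad (false positives); recall falls when it is too narrow (missed information).

```python
from typing import Iterable

# Hypothetical hand-labeled sample: (tweet text, relevant to responders?)
SAMPLE = [
    ("Power lines down on Main St in Hoboken #Sandy", True),
    ("Flooding on our block, water rising fast", True),
    ("Sandy from accounting brought cookies today", False),
    ("Watching the #Sandy coverage from California", False),
    ("Tree fell on a car near the park, road blocked", True),
]


def matches(text: str, keywords: Iterable[str]) -> bool:
    """True if any keyword occurs in the text (case-insensitive substring match)."""
    lowered = text.lower()
    return any(k.lower() in lowered for k in keywords)


def precision_recall(keywords: Iterable[str]) -> tuple[float, float]:
    """Precision and recall of a keyword query against the labeled sample."""
    keywords = list(keywords)
    hits = [relevant for text, relevant in SAMPLE if matches(text, keywords)]
    true_positives = sum(hits)
    total_relevant = sum(relevant for _, relevant in SAMPLE)
    precision = true_positives / len(hits) if hits else 0.0
    recall = true_positives / total_relevant if total_relevant else 0.0
    return precision, recall


# Broad query: catches every relevant tweet but also pulls in noise.
broad = precision_recall(["sandy", "flood", "tree", "power"])
# Narrow query: everything it returns is relevant, but it misses most of the signal.
narrow = precision_recall(["power lines down"])
```

Re-scoring a query after each change to the keyword list gives a quick check on whether a refinement is trading signal for noise.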

The Source: Twitter

• Twitter has excellent analytical potential:
  – Enormous volume, 400+ million tweets per day
  – Large user base, 200+ million active users
  – Open API
• But it’s not without its limitations:
  – 140 characters
  – Limited historical (look-back) capacity without using a third-party provider like DataSift or Gnip = $$$
  – Anonymity, credibility
  – Fact vs. satire

Data Capture

• 975,000+ Tweets
  – Filters: temporal, geo, keywords, hashtags
  – Timeline: 28 Oct to 06 Nov
    • Pre-landfall, Landfall, Aftermath, Recovery
  – Geo focus on Tri-state area
• Entity Extraction / Sentiment
  – NLP extracts the entities, events and associations from unstructured text
    • Isolates Twitter Handles, Keywords, URLs, etc.
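The capture filters listed above (temporal, geo, keywords, hashtags) can be sketched as a single predicate applied to each tweet. This is a minimal illustration, not the pipeline used in the study: the `Tweet` class, the term list, and the bounding box are assumptions, and only the 28 Oct to 06 Nov 2012 window comes from the slides.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple


@dataclass
class Tweet:
    text: str
    created_at: datetime
    coords: Optional[Tuple[float, float]] = None  # (lat, lon); absent on most tweets


# All values below are illustrative assumptions, not the study's actual settings.
START = datetime(2012, 10, 28)
END = datetime(2012, 11, 7)          # exclusive bound, so 06 Nov is fully covered
TERMS = ("#sandy", "flood", "power", "gas")
LAT_RANGE = (38.9, 42.1)             # rough tri-state bounding box
LON_RANGE = (-75.6, -71.8)


def keep(tweet: Tweet) -> bool:
    """Apply the temporal, keyword/hashtag, and geo filters in turn."""
    if not (START <= tweet.created_at < END):
        return False
    text = tweet.text.lower()
    if not any(term in text for term in TERMS):
        return False
    if tweet.coords is None:
        return True  # design choice: do not discard tweets that lack a geo tag
    lat, lon = tweet.coords
    return LAT_RANGE[0] <= lat <= LAT_RANGE[1] and LON_RANGE[0] <= lon <= LON_RANGE[1]


tweets = [
    Tweet("Flooding on Washington St #Sandy", datetime(2012, 10, 29, 21, 0), (40.74, -74.03)),
    Tweet("No power since last night", datetime(2012, 10, 30, 8, 0)),
    Tweet("#Sandy looks scary on TV", datetime(2012, 10, 29, 22, 0), (34.05, -118.24)),
    Tweet("Gas prices are up again", datetime(2012, 12, 1, 9, 0)),
]
captured = [t for t in tweets if keep(t)]
```

Keeping untagged tweets is a deliberate choice in this sketch: dropping them would shrink the dataset to the small share of tweets that carry coordinates.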

Data Capture: Entities & Associations

Entities extracted from each tweet: hashtags, Twitter handles, URLs, unstructured keywords, time/date stamp

• Who: Twitter handles, retweeters
• What: hashtags, keywords, URLs
• When: time, date
• Where: geo (if available)
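The who/what/when/where breakdown can be approximated with a few regular expressions. The sketch below is a simplified stand-in for the NLP entity extraction described in the deck, not a reproduction of it: the patterns, the stopword list, the `extract` function, and the sample tweet are all assumptions made for illustration.

```python
import re
from datetime import datetime

URL = re.compile(r"https?://\S+")
HANDLE = re.compile(r"@\w+")
HASHTAG = re.compile(r"#\w+")
WORD = re.compile(r"[A-Za-z']+")
# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"rt", "a", "an", "the", "in", "on", "of", "to", "and", "are", "is"}


def extract(author, text, created_at, coords=None):
    """Split one tweet into a who/what/when/where record."""
    urls = URL.findall(text)
    body = URL.sub(" ", text)  # drop URLs first so "#fragment" parts are not read as hashtags
    handles = HANDLE.findall(body)
    hashtags = HASHTAG.findall(body)
    remainder = HASHTAG.sub(" ", HANDLE.sub(" ", body))
    keywords = [w.lower() for w in WORD.findall(remainder) if w.lower() not in STOPWORDS]
    return {
        "who": [author] + handles,
        "what": {"hashtags": hashtags, "keywords": keywords, "urls": urls},
        "when": created_at,
        "where": coords,  # None when the tweet is not geo-tagged
    }


record = extract(
    "@reporter",
    "RT @example_news: Gas lines in #Brooklyn are long http://example.com/map #Sandy",
    datetime(2012, 11, 1, 14, 30),
)
```

Storing each tweet in this shape keeps the original fields together, so later reports can be traced back to the source message instead of to isolated terms.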

Data Reporting

Data Reporting: Keywords, Twitter Handles

Data Analysis

• Analysis must be rooted in the operational need:
  – How can social media help civil authorities & first responders during natural disaster response and relief efforts?
• Emphasis on hypothesis generation, testing, and experimentation

Data Analysis: Hashtags

• Top hashtags were almost all generic or abstract
  – Undermines tracking and understanding
  – Generates leads for further analysis

Hashtags: #Sandy, #Recovery, #NYC, #Power, #Hoboken, #SandyABC7, #NJ, #Gas, #Brooklyn, #JERSEYSTRONG
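A top-hashtag list like the one above can be produced with a simple frequency count. The sketch below assumes tweets are available as plain strings and folds case so that #Sandy and #SANDY count together; the function name and the sample texts are invented for illustration.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#\w+")


def top_hashtags(texts, n=10):
    """Return the n most frequent hashtags, ignoring case."""
    counts = Counter(tag.lower() for text in texts for tag in HASHTAG.findall(text))
    return counts.most_common(n)


sample = [
    "#Sandy power out in #Hoboken",
    "Lines for #gas in #NJ #sandy",
    "#SANDY recovery update #NJ",
]
top_two = top_hashtags(sample, n=2)
```

As the slide notes, the most frequent tags tend to be generic, so a ranking like this is better treated as a source of leads than as a finding.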

Data Analysis: Sentiment

• Sentiment analysis generally performs poorly on small chunks of text like tweets
• Follow and convert linked URLs into derivative sources
  – Larger text sources offer potential value for sentiment analysis that tweets alone cannot

• Top negative and positive sentiment scores can provide a glimpse into aggregate attitudes

• Provide starting points for additional analysis

Data Analysis: Narrow the scope

Next Steps: Agile Intelligence

• New Problem Identified:
  – NYC 911 received approx. 20,000 calls/hour
  – Life/limb emergencies could not get through
  – Callers prompted to text or call 311
  – NYC spent $2 billion since 2009 “overhauling” the system
    • $680 million call center – “Unified Call Taker” system
• New Question: Can social media serve as a supplement/alternative to traditional emergency response systems during natural disasters and states of emergency?
  – Promote/monitor hashtags
  – Dedicated analysts/dispatchers
  – Facilitate proactive use of local/city/state resources

Next Steps: Segment the Data

• Segment, or cluster, your data to explore patterns and trends at the micro level rather than across the entire dataset. Segment by:
  – User name or Twitter handle
  – Hashtags
  – Keywords
  – Geographic region
  – Timeline
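Segmentation along these dimensions amounts to grouping records by a chosen key. The sketch below is a minimal illustration under assumed data: the record layout, the `segment` helper, and the sample values are hypothetical. It groups by user, by hashtag (a record with several hashtags lands in several segments), and by hour on the timeline.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical records, shaped like the output of an entity-extraction step.
records = [
    {"user": "@a", "hashtags": ["#sandy", "#gas"], "when": datetime(2012, 10, 29, 20, 15)},
    {"user": "@b", "hashtags": ["#sandy"], "when": datetime(2012, 10, 29, 20, 40)},
    {"user": "@a", "hashtags": ["#power"], "when": datetime(2012, 10, 30, 9, 5)},
]


def segment(items, key):
    """Group items by key(item); a list-valued key puts the item in every matching group."""
    groups = defaultdict(list)
    for item in items:
        values = key(item)
        if not isinstance(values, (list, tuple, set)):
            values = [values]
        for value in values:
            groups[value].append(item)
    return dict(groups)


by_user = segment(records, lambda r: r["user"])
by_hashtag = segment(records, lambda r: r["hashtags"])
by_hour = segment(records, lambda r: r["when"].replace(minute=0, second=0, microsecond=0))

# The hour with the most tweets in this sample.
busiest_hour = max(by_hour, key=lambda hour: len(by_hour[hour]))
```

Each segment is still a list of full records, so an analyst can read the underlying messages in a group rather than relying on its count alone.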

Next Steps: Try on different lenses

Highest traffic occurred during the height of the storm, despite spreading power outages

Next Steps: Segment the Data

< 5% of Tweets are geo-tagged

Next Steps: Graph Analysis

Visualize associations between top influencers
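The deck does not say how influencers were identified, so the sketch below makes an assumption: it builds a directed mention graph (author to mentioned account) and ranks accounts by in-degree, the number of distinct accounts that mention or retweet them. The function names and sample tweets are hypothetical, and in-degree is only one possible proxy for influence.

```python
import re
from collections import Counter, defaultdict

MENTION = re.compile(r"@(\w+)")


def mention_graph(tweets):
    """Build {author: Counter({mentioned_account: count})} from (author, text) pairs."""
    graph = defaultdict(Counter)
    for author, text in tweets:
        for mentioned in MENTION.findall(text):
            if mentioned.lower() != author.lower():  # ignore self-mentions
                graph[author.lower()][mentioned.lower()] += 1
    return graph


def top_influencers(graph, n=5):
    """Rank accounts by in-degree: how many distinct accounts mention them."""
    in_degree = Counter()
    for targets in graph.values():
        for mentioned in targets:
            in_degree[mentioned] += 1
    return in_degree.most_common(n)


sample = [
    ("alice", "RT @cityalerts: shelters open"),
    ("bob", "@cityalerts any news on power?"),
    ("carol", "thanks @bob and @cityalerts"),
    ("alice", "RT @cityalerts: update"),
]
graph = mention_graph(sample)
ranked = top_influencers(graph, n=2)
```

The edge weights kept in `graph` (how often one account mentions another) are the associations a graph visualization would draw between the top-ranked accounts.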

Next Steps: Findings

• Targeted queries based on tailored information requirements
• Findings:
  – Few legitimate “calls for help”
  – No dedicated hashtags
    • #help used for encouraging donations/volunteering
    • #distress used for
  – Significant & accurate i-reporting on flooding, downed trees/power lines, fires, etc.
  – Crowd-sourced info on where to find gas, food/water, donate goods, volunteer, etc.
  – Despite widespread power outages, cell service was a life-line

Lessons Learned

• Don’t:
  – Try drinking from a fire hose
    • Sometimes less really is more
  – Use metrics you can’t tie to actions
  – Use visualizations or reports that strip the data from its context

• Do:
  – Segment data rather than attempting to work in the aggregate
  – Look for the why behind the message
  – Always return to the source material
  – Explore alternative explanations
  – Always consider the ultimate goal

Discussion

Success stories or lessons learned from social media analysis/monitoring in 2012?

Arguments for or against the use of social media?

Where will social media monitoring/analysis be in 2014?

Thank You!

Joshua Liss

github.com/ikanow/infinit.e