Presentation by Seth Grimes to the Washington DC chapter of the Data Warehousing Institute, June 21, 2013.
Big Data Analytics: Facts and Feelings
Seth Grimes
Alta Plana Corporation
@sethgrimes
TDWI – Washington DC, June 21, 2013
Theses
• We gain knowledge when we make connections.
• Data analysis is a process of connection discovery.
• The more data, the greater the possibilities.
• The more data, the greater the need to filter, reduce, and contextualize.
• Timeliness counts.
The World of Big Data
Machine data (e.g., logs, sensor outputs, clickstreams).
Actions, interactions, and transactions: geolocation and time.
Profiles: individual, demographic & behavioral.
Text, audio, images, and video.
Facts and feelings.
A 3-slide reprise...
Imperatives for the 2010s:
Do more with more.
“It’s Not Information Overload. It’s Filter Failure”: Clay Shirky, 2008.
• More sources & types of data.
• Greater data volumes.
• New hardware and methods.
Automate more, more intelligently.
• Analytics.
• Semantics.
Engage. Socialize.
I see three categories of data:
1. Quantities, whether measured, observed, or computed.
2. Content, which I’ll characterize as non-quantitative information.
3. Metadata (semantic & structural) describing quantities and content.
• Our concern is content, analytics & fusion.
• Structured/unstructured is a false dichotomy.
• Where do relationships fit?
http://www.businessweek.com/magazine/content/04_19/b3882029_mz072.htm
En route.
End of reprise. So, is this Big Data?
44
Of course not. It’s a number, not data.
Size (Volume) is only one Big Data factor.
Other factors (standard definition) are Velocity and Variety.
I reject reVisionist 3Vs extensions such as:
Variation/Variability. Veracity. Value.
These factors are the province of analytics.
Gary King, Harvard Univ. – “Big Data isn’t about the data. It’s about analytics.”
Me – Analytics is a collection of tools and techniques that extract insights from data.
I’d argue – The Value in Big Data is in content, patterns, and connections, derived via analytics.
Variability is an interpretive property.
I say – The sense of Big Data is in context and intent… of both the data producer and the data consumer, captured in metadata and (also) derived via analytics.
As for Veracity, data is data. Consider:
“The Iraqi regime… possesses and produces chemical and biological weapons.” – George W. Bush, October 7, 2002.
Data… and more. Is this Big Data?
No, it’s a screen of aggregated query results.
The Big Data is behind it.
http://www.newyorker.com/online/blogs/culture/2012/05/google-knowledge-graph.html
And behind comparably-scaled/structured systems.
Comparably-scaled/structured systems?
http://www.cambridgesemantics.com/semantic-university/semantic-search-and-the-semantic-web
Graphs model language models relationships.
Another view, using an old GATE image.
http://gate.ac.uk/hamish/talks/ibot-slidy.html
… for instance, social language.
Text analytics applies natural-language processing (NLP) techniques to discern –
Entities
Relationships
Context Identity
– and get at the sense of “unstructured” online, social, and enterprise information.
Semantic identity unites data of all types.
http://searchuserinterfaces.com/
Sensemaking: “It is convenient to divide the entire information access process into two main components: information retrieval through searching and browsing, and analysis and synthesis of results. This broader process is often referred to in the literature as sensemaking. Sensemaking refers to an iterative process of formulating a conceptual representation from a large volume of information.”
– Marti Hearst, 2009
Intelligent computing – sensemaking – involves:
Big (and little) Data.
• Quantities.
• Content.
• Metadata.
Analytics.
Semantics.
Integration.
Facts and feelings.
Feelings: Sentiment detection, classification.
http://techpresident.com/news/21618/politico-facebook-sentiment-analysis-bogus
“Sentiment analysis is the task of identifying positive and negative opinions, emotions, and evaluations.”
-- Wilson, Wiebe & Hoffmann, 2005, “Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis”
“Sentiment analysis or opinion mining is the computational study of opinions, sentiments and emotions expressed in text… An opinion on a feature f is a positive or negative view, attitude, emotion or appraisal on f from an opinion holder.”
-- Bing Liu, 2010, “Sentiment Analysis and Subjectivity,” in Handbook of Natural Language Processing
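As a rough illustration of the polarity task these definitions describe, here is a minimal lexicon-based sketch in Python. The word lists, the `polarity` function name, and the one-token negation rule are assumptions made for this example; they are not a method from the talk or from the cited papers.

```python
import re

# Tiny illustrative lexicons (assumed for this sketch; real systems use far larger resources).
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "sad"}
NEGATORS = {"not", "no", "never"}


def polarity(text: str) -> str:
    """Classify text as 'positive', 'negative', or 'neutral' by counting lexicon hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, tok in enumerate(tokens):
        # +1 for a positive word, -1 for a negative word, 0 otherwise.
        value = (tok in POSITIVE) - (tok in NEGATIVE)
        # Flip the sign when the previous token is a negator ("not good").
        if i > 0 and tokens[i - 1] in NEGATORS:
            value = -value
        score += value
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for sample in ("I love this great phone", "This is not good", "The meeting is at noon"):
        print(sample, "->", polarity(sample))
```

Word counting of this kind ignores the opinion holder and the feature being judged, both of which Liu's definition treats as part of an opinion.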
Sentiment may be of interest at multiple levels.
• Corpus / data space, i.e., across multiple sources.
• Document.
• Statement / sentence.
• Entity / topic / concept.
Human language is noisy and chaotic!
Jargon, slang, irony, ambiguity, anaphora, polysemy, synonymy, etc.
Context is key. Discourse analysis comes into play.
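The levels listed above can be sketched as successive roll-ups of one sentence-level score. This is a toy Python example under stated assumptions: the lexicons, the function names, and the substring-based entity matching are all hypothetical, and nothing here handles irony, anaphora, or discourse context.

```python
import re
from collections import defaultdict

# Illustrative lexicons only (assumed for this sketch).
POSITIVE = {"great", "love", "fast"}
NEGATIVE = {"bad", "slow", "hate"}


def split_sentences(document):
    """Naive sentence splitter on ., !, and ?."""
    return [s.strip() for s in re.split(r"[.!?]+", document) if s.strip()]


def sentence_score(sentence):
    """Statement / sentence level: positive hits minus negative hits."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)


def document_score(document):
    """Document level: sum of the document's sentence scores."""
    return sum(sentence_score(s) for s in split_sentences(document))


def corpus_score(documents):
    """Corpus level: sum across all documents (e.g., multiple sources)."""
    return sum(document_score(d) for d in documents)


def entity_scores(documents, entities):
    """Entity level: credit each sentence's score to every listed entity it mentions."""
    totals = defaultdict(int)
    for doc in documents:
        for sentence in split_sentences(doc):
            lowered = sentence.lower()
            for entity in entities:
                if entity.lower() in lowered:
                    totals[entity] += sentence_score(sentence)
    return dict(totals)


if __name__ == "__main__":
    docs = [
        "The camera is great. The battery is bad.",
        "I love the camera! The battery is slow and bad.",
    ]
    print(corpus_score(docs))
    print(entity_scores(docs, ["camera", "battery"]))
```

In this sample the first document scores zero overall even though it holds one positive and one negative opinion, which is one reason the entity level can matter more than the document level.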
Emotion and effect.
Emotion and understanding.
Prediction/Feeling/Wish... and Intent.
http://www.aiaioo.com/whitepapers/intention_analysis_use_cases.pdf
http://sentibet.com/
Many options (text).
(Accessible) Data Everywhere
Beyond Text:
• Audio including speech.
• Images.
• Video.
http://www.geekosystem.com/facebook-face-recognition/
http://www.sciencedirect.com/science/article/pii/S0167639312000118
http://flylib.com/books/en/2.495.1.54/1/
Or just ask: Swipp mobile polling (example).
Surfaced.
A Big Data analytics architecture (example).
http://www.geeklawblog.com/2011/12/lexis-advance-platform-launch-two.html
Complementary.