John Sing's April 2014 technology presentation, an overview of how Cloud, Big Data Analytics, Mobile, and Social technologies interact with today's business models. Competitive advantage in today's business models comes from the proper *blending* of these technologies, in the way unique to one's own business model. We discuss the Journey from Data to Value and provide a big-picture review of these technologies' impact on today's world. My goal here is to provide a holistic overview of these technologies that multiple business audiences can understand. You may use this material to further your own business goals. All I ask is that you give full attribution to me or to the original authors (attributed on each slide) and use proper business ethics and courtesy. If you are further interested in what I have to say or contribute, please contact me! http://www.johnsing.us
© 2014 John Sing – All Rights Reserved
Big Data: the Big Picture
For your 2014+ Business and Career
Opening video
John Sing, Executive IT Consultant
http://johnsing.us
University of South Florida - Spring 2014
John Sing 32 years of experience in enterprise servers, storage, and software
– 2009 – 2014: IBM Executive IT Consultant: IT Strategy and Planning, Enterprise Large Scale Storage, Internet Scale Workloads and Data Center Design, Big Data Analytics, HA/DR/BC
– 2002-2008: IBM IT Data Center Strategy, Large Scale Systems, Business Continuity, HA/DR/BC, IBM Storage
– 1998-2001: IBM Storage Subsystems Group - Enterprise Storage Server Marketing Manager, Planner for ESS Copy Services (FlashCopy, PPRC, XRC, Metro Mirror, Global Mirror)
– 1994-1998: IBM Hong Kong, IBM China Marketing Specialist for High-End Storage
– 1989-1994: IBM USA Systems Center Specialist for High-End S/390 processors
– 1982-1989: IBM USA Marketing Specialist for S/370, S/390 customers (including VSE and VSE/ESA)
http://johnsing.us
Follow my daily IT research blog – http://www.delicious.com/atsf_arizona
Follow me on Slideshare.net – http://www.slideshare.net/johnsing1
LinkedIn – http://www.linkedin.com/in/johnsing
Agenda
Data, the new natural resource
Big Data in context: Cloud, Analytics, Mobile, Social
Innovating using Big Data: monetizing, innovating, creating competitive advantage out of Big Data
1. Data + Analytics = Information
2. Information + Context = Insight
3. Insight + Actions = Desired Outcomes
Today’s message: The Big Data Journey to Value
Data, the new natural resource
What we see in the world today
[Chart: x-axis Time, y-axis Computing Power Growth; curves: available data for observation, traditional IT “sensemaking” capability]
Chart by: Jeff Jonas/Las Vegas/IBM, Chief Scientist, IBM SWG Entity Analytics http://jeffjonas.typepad.com/
What we see in the world today
[Chart: x-axis Time, y-axis Computing Power Growth; curves: available data for observation, traditional IT “sensemaking” capability; labels: Context, Enterprise Amnesia]
Chart by: Jeff Jonas/Las Vegas/IBM, Chief Scientist, IBM SWG Entity Analytics http://jeffjonas.typepad.com/
Enterprise Amnesia, definition
A defect in memory, resulting in missed opportunity, wasted resources, lower revenues, unnecessary fraud losses, and other bad news.
WHY? Because traditional IT methods could not keep pace
[Chart: x-axis Time, y-axis Computing Power Growth; curves: available observation space, traditional IT “sensemaking” capabilities]
Chart by: Jeff Jonas/Las Vegas/IBM, Chief Scientist, IBM SWG Entity Analytics http://jeffjonas.typepad.com/
This is the Big Data Opportunity
Add: Big Data Sensemaking Algorithms
[Chart: x-axis Time, y-axis Computing Power Growth; labels: available observation space, context, Big Data capability, new/useful information, data, analytics]
Think of the Gold Mine analogy – in the “Olden Days”
Miners could actually see nuggets / veins of gold
There was much more gold out there….
– but it wasn’t visible to naked eye…
It was a big gambling game
– You dig like crazy, but you’ve no idea where more gold will be found
In the “olden days”, no one could afford to dig everywhere
Where gold is mined on Earth (as of 2006)
Despite gold rush fevers, no one could afford to mobilize millions of people to dig everywhere
Gold mining in 2012: Massive capital equipment
Millions of tons of dirt
Ore of 30 mg/kg (30 ppm)
– Needed to even see the gold
By using the right equipment
On a massive scale
We can process lots of dirt affordably and keep the gold we find
That’s like Big Data!
Processor power: Google, Yahoo, Facebook surpassed the Supercomputer community in compute power and scale…… in 2008
Google in 2012:
– 200+ petaflops
– Processes 1 TB / hour
– 2003: Batch
– 2005: Warehouse
– 2011: Instant
– Dumped MapReduce
– Wrote replacement real-time indexing (“Percolator”)
– Click here for architecture
Facebook in 20 Minutes in 2012
– 30 PB cluster of storage
– 2.7M Photos, 10.2M Comments, 4.6M Messages
– Facebook's New Realtime Analytics System: Hadoop HBase To Process 20 Billion Events Per Day
May 21, 2008: http://www.circleid.com/posts/85218_google_surpasses_supercomputer/
http://highscalability.com real life internet architectures
http://highscalability.com/display/Search?searchQuery=facebook&moduleId=4876569
Hmmmm. What might we find in all this data? And How?
Cisco estimate: by 2015, 4.8 zettabytes of data center traffic will flow through the Internet annually; only 5% will be traditional OLTP database traffic
Data in existence today = 1,000 exabytes = 1 million petabytes
http://venturebeat.com/2011/11/29/cisco-global-cloud-traffic/
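The unit arithmetic behind the figures on this slide can be checked with a few lines of Java. This is only a sketch: it assumes decimal (SI) units, where each step up is a factor of 1,000, and the class and method names are hypothetical.

```java
import java.math.BigDecimal;

class DataUnits {
    // Assumption: decimal (SI) units, so each step is a factor of 1,000.
    static final long PB_PER_EB = 1_000L;
    static final long EB_PER_ZB = 1_000L;

    /** Converts exabytes to petabytes, failing loudly on overflow. */
    static long exabytesToPetabytes(long exabytes) {
        return Math.multiplyExact(exabytes, PB_PER_EB);
    }

    /** Converts a possibly fractional number of zettabytes to exabytes. */
    static BigDecimal zettabytesToExabytes(BigDecimal zettabytes) {
        return zettabytes.multiply(BigDecimal.valueOf(EB_PER_ZB));
    }

    public static void main(String[] args) {
        // 1,000 exabytes expressed in petabytes
        System.out.println(exabytesToPetabytes(1_000L) + " PB");
        // 4.8 zettabytes of annual traffic expressed in exabytes
        System.out.println(zettabytesToExabytes(new BigDecimal("4.8")).toPlainString() + " EB");
    }
}
```

Under these assumptions, 1,000 exabytes is indeed 1 million petabytes, and 4.8 zettabytes is 4,800 exabytes.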
Visualizing Big Data
Source: Wikibon March 2011
Goal: Analyze *all* the data, real time
Original source: Wikibon.org, March 1, 2011 public broadcast on “Big Data”, http://wikibon.org/blog/ten-%E2%80%9Cbig-data%E2%80%9D-realities-and-what-they-mean-to-you/
Very large
Distributed aggregation
Loosely structured
Often incomplete
Sampling not strategically competitive
Visualizing Big Data….
Source: Wikibon March 2011
Original source: Wikibon.org, March 1, 2011 public broadcast on “Big Data”, http://wikibon.org/blog/ten-%E2%80%9Cbig-data%E2%80%9D-realities-and-what-they-mean-to-you/
Petabytes, Exabytes
Millions / Billions of people
Billions / Trillions of records
Time-stamped events
Unknown inter-relationships
Flat files
Visualizing Big Data…..
Source: Wikibon March 2011
Original source: Wikibon.org, March 1, 2011 public broadcast on “Big Data”, http://wikibon.org/blog/ten-%E2%80%9Cbig-data%E2%80%9D-realities-and-what-they-mean-to-you/
Connections determined by probability
Process entire (huge) data set
Data generated by collective action over the Internet
Open Source innovation
It’s more than the algorithms….
It’s also:
Source: Wikibon March 2011
Original source: Wikibon.org, March 1, 2011 public broadcast on “Big Data”, http://wikibon.org/blog/ten-%E2%80%9Cbig-data%E2%80%9D-realities-and-what-they-mean-to-you/
It’s collaboration of algorithms
Combined / collaborated in innovative ways
A software ecosystem is essential
On a worldwide scale
Multiple worldwide “Pockets of Value”
Kevin Slavin at TEDGlobal July 2011
“How algorithms shape our world” http://www.ted.com/talks/kevin_slavin_how_algorithms_shape_our_world
Visualizing what Algorithms are doing
Big Data and Hadoop: worldwide usage
eBay
Yahoo!
Major Fortune 500 customers
Including all IBM industries:
– Financial
– Healthcare
– M&E
– Telecom
– Utilities
– Retail
http://www.datanami.com/datanami/2012-04-26/six_super-scale_hadoop_deployments.html
One source for Hadoop users (but not the only one!): http://wiki.apache.org/hadoop/PoweredBy
Hadoop is a well-developed ecosystem for Big Data app development
Hadoop – Overall name of software stack
HDFS – Hadoop Distributed File System
MapReduce – Software compute framework
• Map = queries
• Reduce = aggregates answers
Hive – Hadoop-based data warehouse
Pig – Hadoop-based language
HBase – Non-relational database, fast lookups
Flume – Populate Hadoop with data
Oozie – Workflow processing system
Whirr – Libraries to spin up Hadoop on Amazon EC2, Rackspace, etc.
Avro – Data serialization
Mahout – Data mining
Sqoop – Connectivity to non-Hadoop data stores
BigTop – Packaging / interop of all Hadoop components
http://wikibon.org/wiki/v/Big_Data:_Hadoop%2C_Business_Analytics_and_Beyond
http://blog.cloudera.com/blog/2013/01/apache-hadoop-in-2013-the-state-of-the-platform/
http://www.ibm.com/developerworks/data/library/techarticle/dm-1209hadoopbigdata/
Visualizing why Hadoop was created for Big Data
Traditional approach: Move data to program
[Diagram labels: User request, Application server, Query Data, Database server, Data, return Data, process Data, Send result]
Big Data approach: Move function/programs to data
[Diagram labels: User request, Master node, Send Function to process on Data, Data nodes, Data, Query & process Data, Send Consolidate result]
Traditional approach
• Application server and Database server are separate
• Analysis Program can run on multiple Application servers
• Network is still in the middle
• Data has to go through network
• Designed to analyze TBs of data
Big Data approach
• Analysis Program runs where the data is: on Data Node
• Only Analysis Program has to go through the network
• Analysis Program is executed on every DataNode
• Designed to analyze PBs of data
• Highly scalable: 1000s of nodes, petabytes and more
Thank you to: Pascal VEZOLLE/France/IBM@IBMFR and Francois Gibello/France/IBM for the use of this slide
Example of Hadoop in action
[Diagram labels, traditional approach: User request, Application server, Query Data, Database server, Data, return Data, process Data, Send result]
[Diagram labels, Big Data approach: User request, Master node, Send Function to process on Data, Data nodes, Data, Query & process Data, Send Consolidate result]
Example: How many hours does Clint Eastwood appear on screen in all the movies he has made?
Task: All movies need to be parsed to find Clint’s face
• Traditional approach:
1) Upload a movie to the application server through the network
2) The Analysis Program compares Clint’s picture with every frame of the loaded movie
3) Repeat the 2 previous steps for every movie
• Big Data approach:
1) Send the Analysis Program and Clint’s picture to all the DataNodes
2) The Analysis Program in every DataNode (all in parallel) compares Clint’s picture with every frame of the loaded movie
3) The results from all the DataNodes are consolidated into a single result
Traditional approach : Move data to program
Big Data approach: Move function/programs to data
Thank you to: Pascal VEZOLLE/France/IBM@IBMFR and Francois Gibello/France/IBM for the use of this slide
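The Big Data approach on this slide can be sketched in plain Java. This is a toy model under stated assumptions, not Hadoop code: the `DataNode` class, the string "frames", and the thread pool standing in for a cluster are all hypothetical. The point it illustrates is that the analysis function travels to each node, runs against local data in parallel, and only small partial counts travel back to be consolidated.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

class MoveFunctionToData {

    /** Hypothetical stand-in for a DataNode: it holds its own frames locally. */
    static final class DataNode {
        private final List<String> localFrames;

        DataNode(List<String> localFrames) {
            this.localFrames = localFrames;
        }

        /** Runs the shipped function against local data; only the count leaves the node. */
        long countMatches(Predicate<String> analysis) {
            return localFrames.stream().filter(analysis).count();
        }
    }

    /** Sends the same analysis to every node in parallel, then consolidates the partial counts. */
    static long countMatchingFrames(List<DataNode> nodes, Predicate<String> analysis) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, nodes.size()));
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (DataNode node : nodes) {
                tasks.add(() -> node.countMatches(analysis));
            }
            long total = 0;
            for (Future<Long> partial : pool.invokeAll(tasks)) {
                total += partial.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Analysis failed on a node", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Each string is a stand-in for one video frame; "clint" marks a matching frame.
        List<DataNode> cluster = Arrays.asList(
                new DataNode(Arrays.asList("clint", "other", "clint")),
                new DataNode(Arrays.asList("other", "other")),
                new DataNode(Arrays.asList("clint")));
        long frames = countMatchingFrames(cluster, frame -> frame.equals("clint"));
        System.out.println("Matching frames: " + frames);
    }
}
```

Converting the frame count to hours would additionally need a frame rate, which the slide does not give.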
Hadoop in action – details
Hadoop Distributed File System = HDFS: where Hadoop stores the data
– HDFS file system spans all the nodes in a cluster with locality awareness
Hadoop data storage, computation model
– Data stored in a distributed file system, spanning many inexpensive computers
– Send function/program to the data nodes
– i.e. distribute application to compute resources where the data is stored
– Scalable to thousands of nodes and petabytes of data
MapReduce Application
1. Map Phase (break job into small parts)
2. Shuffle (transfer interim output for final processing)
3. Reduce Phase (boil all output down to a single result set)
Return a single result set
public static class TokenizerMapper extends Mapper<Object,Text,Text,IntWritable> {
  private final static IntWritable one = new IntWritable(1);
  private Text word = new Text();

  // Map: emit (word, 1) for every token in the input line
  public void map(Object key, Text val, Context context)
      throws IOException, InterruptedException {
    StringTokenizer itr = new StringTokenizer(val.toString());
    while (itr.hasMoreTokens()) {
      word.set(itr.nextToken());
      context.write(word, one);
    }
  }
}

public static class IntSumReducer extends Reducer<Text,IntWritable,Text,IntWritable> {
  private IntWritable result = new IntWritable();

  // Reduce: sum the counts emitted for each word
  public void reduce(Text key, Iterable<IntWritable> val, Context context) {
    int sum = 0;
    for (IntWritable v : val) {
      sum += v.get();
      . . .
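The Hadoop word-count fragment above only runs inside a Hadoop job. To make the three phases concrete, here is a minimal sketch of the same word count in plain Java with no Hadoop dependency. The class and method names are hypothetical, and a real cluster would run the map and reduce steps on many nodes rather than in one process.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

class WordCountPhases {

    /** Map phase: emit a (word, 1) pair for every token in every input line. */
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                pairs.add(new SimpleEntry<String, Integer>(itr.nextToken(), 1));
            }
        }
        return pairs;
    }

    /** Shuffle: group the interim values by key so each key is reduced in one place. */
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    /** Reduce phase: sum the values for each key into a single result set. */
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int v : entry.getValue()) {
                sum += v;
            }
            result.put(entry.getKey(), sum);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("big data big picture", "data to value");
        // prints {big=2, data=2, picture=1, to=1, value=1}
        System.out.println(reduce(shuffle(map(lines))));
    }
}
```

A sorted `TreeMap` is used here only so the printed result is deterministic; Hadoop makes its own ordering and partitioning decisions.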
Distribute map tasks to cluster
Hadoop Data Nodes
Data is loaded, spread, resident in Hadoop cluster
Performance = tuning Map Reduce workflow, network, application, servers, and storage
http://www.ibm.com/developerworks/data/library/techarticle/dm-1209hadoopbigdata/ http://blog.cloudera.com/blog/2009/12/7-tips-for-improving-mapreduce-performance/ http://www.slideshare.net/allenwittenauer/2012-lihadoopperf
What is being done
with Big Data today?
Humans are collecting useful data on massive scale
Chart in public domain: IEEE Massive File Storage presentation, author: Bill Kramer, NCSA: http://storageconference.org/2010/Presentations/MSST/1.Kramer.pdf:
We are building real-time, integrated stream computing on massive scale
Chart in public domain: IEEE Massive File Storage presentation, author: Bill Kramer, NCSA: http://storageconference.org/2010/Presentations/MSST/1.Kramer.pdf:
• Unlimited in amount, but you have to refine it
• Basis of competitive advantage, no matter what industry
• Every market being transformed by data
Data is the new natural resource
Aerospace / defense transformation
Automobile transformation
Ford: https://www.youtube.com/watch?v=nFUszkSv5X0
Energy & utilities transformation
Government transformation
Miami-Dade County: https://www.youtube.com/watch?v=toL4Yx9WYPo
Miami-Dade Police: https://www.youtube.com/watch?v=1b5RiPWd-Pw
Media and entertainment transformation
Predictive Analytics: Movement in a City
•10 minute-ahead volume forecast (blue) vs. actual value (black)
•10 minute-ahead speed forecast (blue) vs. actual value (black).
Blue line: analytics prediction 10 minutes in advance
Black line: actual result
Stockholm: http://www.youtube.com/watch?v=rfMylzF4lv8
Predictive Analytics: Using Information to Ensure Public Safety: Blue CRUSH in Memphis, TN & Richmond, VA
Blue CRUSH predictive analysis for officer deployment & risk management generated easy-to-read crime maps every four hours
Richmond, VA: Violent crime decreased in the first year by 32%, another 40% thereafter, moving Richmond from #5 on the list of the most dangerous US cities to #99
Memphis Blue CRUSH Map
Police videos: http://www.youtube.com/watch?v=8SJQtn4RO7I
https://www.youtube.com/watch?v=_xsffIAHY3I
A new class of data-rich industries has already emerged
Yesterday’s Hyperscale Data Companies
New business models: company’s value based on amount of information stored, exploited
Today’s Hyperscale Data Companies
Industries:
Aerospace
Banking
Energy
Government
Healthcare
Insurance
Manufacturing
Media and Entertainment
Retail
Examples:
Healthcare Provider: 3.5 PB in 2010; 1 TB CT scanner → 2.5 PB/Year/Scanner
Claims Processor: 20 PB in 2011; grow 300 TB per month, every month
How much data?
1. Data + Analytics = Information
2. Information + Context = Insight
3. Insight + Actions = Desired Outcomes
Solution: take Big Data on the Journey to Value
Data + Analytics = Information
Information + context = Insight
So…. What is “context”?
Review: this is the Big Data Opportunity
Add: Big Data Sensemaking Algorithms
[Chart: x-axis Time, y-axis Computing Power Growth; labels: available observation space, context, Big Data capability (“context”), new/useful information, data, analytics]
No Context
Context, definition
Better understanding something by taking into account the things around it.
Information in Context … = Insights
Top 200 Customer
Job Applicant
Identity Thief
Criminal Investigation
The Puzzle Metaphor: what we mean by “Context”
Imagine an ever-growing pile of puzzle pieces of varying sizes, shapes and colors
What it represents is unknown – there is no picture on hand
Is it one puzzle, 15 puzzles, or 1,500 different puzzles?
Some pieces are duplicates, missing, incomplete, low quality, or have been misinterpreted
Some pieces may even be professionally fabricated lies
Until you take the pieces to the table and attempt assembly, you don’t know what you are dealing with
Puzzling
270 pieces: 90%
200 pieces: 66%
150 pieces: 50%
6 pieces: 2% (pure noise)
30 pieces: 10% (duplicates)
First Discovery
More Data Finds Data
Duplicates in Front Of Your Eyes
First Duplicate Found Here
Incremental Context – Incremental Discovery
6:40pm START
22min “Hey, this one is a duplicate!”
35min “I think some pieces are missing.”
37min “Looks like a bunch of hillbillies on a porch.”
44min “Hillbillies, playing guitars, sitting on a porch, near a barber sign … and a banjo!”
150 pieces: 50%
Incremental Context – Incremental Discovery
47min “We should take the sky and grass off the table.”
2hr “Let’s switch sides, and see if we can make sense of this from different perspectives.”
2hr10m “Wait, there are three … no, four puzzles.”
2hr17m “We need a bigger table.”
2hr18m “I think you threw in a few random pieces.”
How Context Accumulates
With each new observation … one of three assertions is made: 1) un-associated; 2) placed near like neighbors; or 3) connected
New observations sometimes reverse earlier assertions
Some observations produce new discovery
As the working space expands, computational effort increases
Given sufficient observations, there can come a tipping point. Thereafter, confidence improves while computational effort decreases!
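The three assertions described on this slide can be illustrated with a small Java sketch. It is only an illustration under loud assumptions: observations are modeled as sets of feature strings, the overlap thresholds are made up, and the class name `ContextAccumulator` is hypothetical. It does not model reversal of earlier assertions or the tipping point, and it is not the algorithm behind the puzzle experiment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ContextAccumulator {

    enum Assertion { UNASSOCIATED, NEAR, CONNECTED }

    // Hypothetical thresholds, chosen only for illustration.
    private static final int CONNECT_THRESHOLD = 2;
    private static final int NEAR_THRESHOLD = 1;

    /** Each entity is the set of features accumulated from its connected observations. */
    private final List<Set<String>> entities = new ArrayList<>();

    /** Convenience helper to build a feature set. */
    static Set<String> features(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    /** Compares a new observation with everything seen so far and makes one assertion. */
    Assertion observe(Set<String> observation) {
        Set<String> best = null;
        int bestOverlap = 0;
        for (Set<String> entity : entities) {
            int overlap = 0;
            for (String feature : observation) {
                if (entity.contains(feature)) {
                    overlap++;
                }
            }
            if (overlap > bestOverlap) {
                bestOverlap = overlap;
                best = entity;
            }
        }
        if (bestOverlap >= CONNECT_THRESHOLD) {
            best.addAll(observation); // connected: the entity's context grows
            return Assertion.CONNECTED;
        }
        entities.add(new HashSet<>(observation)); // kept as its own entity for now
        return bestOverlap >= NEAR_THRESHOLD ? Assertion.NEAR : Assertion.UNASSOCIATED;
    }

    int entityCount() {
        return entities.size();
    }

    public static void main(String[] args) {
        ContextAccumulator acc = new ContextAccumulator();
        System.out.println(acc.observe(features("name:ann", "phone:555")));
        System.out.println(acc.observe(features("name:ann", "phone:555", "addr:1 main")));
        System.out.println(acc.observe(features("addr:1 main", "name:bob")));
        System.out.println(acc.observe(features("name:zed")));
        System.out.println("Entities: " + acc.entityCount());
    }
}
```

Note that the third observation is judged "near" only because the second one added an address to the first entity: earlier connections change how later observations are interpreted.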
Big Data [in context] = Insights.
More data: better the predictions
– Lower false positives
– Lower false negatives
More data: bad data … good
– Suddenly glad your data was not perfect
More data: less compute
1. Data + Analytics = Information
2. Information + Context = Insight
3. Insight + Actions = Desired Outcomes
Quiz: The Big Data Journey to Value
The most competitive organizations
are going to make sense of what they are observing
fast enough to do something about it
while they are observing it.
Thus, there is a Workflow in a Big Data infrastructure
[Architecture diagram labels: Data in Motion; Data at Rest; Data in Many Forms; Information Ingestion and Operational Information; Streams; Real-time Analytics (Video/Audio, Network/Sensor, Entity Analytics, Predictive); Landing Area, Analytics Zone, Archive (Raw Data, Structured Data, Text Analytics, Data Mining, Entity Analytics, Machine Learning); Exploration, Integrated Warehouse, and Mart Zones (Discovery, Deep Reflection, Operational, Predictive); Stream Processing; Data Integration; Master Data; Decision Management; BI and Predictive Analytics; Navigation and Discovery; Intelligence Analysis; Information Governance, Security and Business Continuity; Batch parallel Big Data processing; Real-Time In-memory servers; Data Warehouse (Traditional IT)]
In order to build a workflow for Big Data, you must know:
Where/how is Big Data stored, analyzed, delivered?
Understanding Big Data in Context
C = cloud
A = analytics
M = Mobile
S = Social
Remember this acronym: C.A.M.S.
C = cloud
A = analytics
M = Mobile
S = Social
Big Data in Context:
Where data is generated and collected
Where data is stored
How data is analyzed
Where data is analyzed
How data is delivered
Who is consuming it
Cloud – today’s Delivery Model
Where is the Big Data?
Answer: Cloud Data Centers
Bandwidth availability is tipping point for adoption of “The Cloud”………
Worldwide broadband bandwidth availability is becoming commonplace
Facilitates a pervasive web services delivery model – (i.e. “The Cloud”)
Hosted in mega data centers with massive amounts:
– Processors, Storage, Network
As a result:
– We are seeing on-premise data centers worldwide rapidly disappearing, off-premise, into the cloud
http://wikibon.org/blog/wp-content/uploads/2011/10/5-top-data-centers.html
Large Cloud Data Centers
10. SUPERNAP, LAS VEGAS, 407,000 SF
9A and 9B. MICROSOFT QUINCY AND SAN ANTONIO DATA CENTERS, 470,000 S
Container Data Center Architecture
7. PHOENIX ONE, PHOENIX, ARIZ. 538,000 SF
5. MICROSOFT CHICAGO DATA CENTER, CHICAGO, 700,000 SF
2. QTS METRO DATA CENTER, ATLANTA, 990,000 SF
Microsoft’s Chicago Container Data Center
More data centers….
4. NEXT GENERATION DATA EUROPE, WALES 750,000 SF
3. NAP OF THE AMERICAS, MIAMI, 750,000 SF
1. 350 EAST CERMAK, CHICAGO, 1.1 MILLION SQUARE FEET
Consumes 100 megawatts of power, 2nd-largest power customer for Commonwealth Edison, trailing only Chicago’s O’Hare Airport.
Now….. what about the web giants?
i.e. Apple, Facebook, Google, Amazon, etc?
That’s Big!
Great Technology Wars of 2012 – Future of the Innovation Economy - Fast Company.com
Apple
Here’s what powers iCloud; see Jobs at WWDC 2011 iCloud announcement (YouTube)
Rendering of Apple's new North Carolina Data Center. Credit: Apple
Other Apple data centers:
• Cork, Ireland
• Munich, Germany
• Newark, California
• Cupertino, Calif.
Apple Data Center FAQ
Maiden, North Carolina: 500K sq ft, USD $1B
This is phase 1 only
Apple Data Center, Newark, California
Purposes for all these data centers:
• iCloud
• Support Apple’s WW install base of devices
• Futures: Move Content Delivery Network in-house?
• Futures: Streaming video?
Under construction: Prineville, Oregon
Facebook’s North Carolina Data Center Goes Live
Lulea, Sweden - 290K sq ft (27K sq meters) by late 2012
Facebook – Prineville, Oregon
Has spent $1B on its data centers
Open Compute Project
http://www.wired.com/wiredenterprise/2011/12/facebook-data-center/all/1
Amazon Web Services
Amazon Web Services 1Q12: 450,000 servers (estimated)
Amazon Perdix Modular Datacenter
EC2 17K core, 240 teraflop cluster: 42nd fastest supercomputer in world
1Q13: > 2 trillion objects in S3
1Q13: 1.1 M req/sec
http://aws.typepad.com/aws/2012/04/amazon-s3-905-billion-objects-and-650000-requestssecond.html
http://gigaom.com/cloud/how-big-is-amazon-web-services-bigger-than-a-billion/
http://aws.typepad.com/aws/2013/04/amazon-s3-two-trillion-objects-11-million-requests-second.html
What is Google? Google is not a search engine
Google is a real-time “Data Factory” ecosystem
– De facto organizer of all human internet data
– Provides worldwide Patterns of Life data
• Search, analytics, etc. as processing
• Interactive maps as visualization
– Android as ingest / output devices
• Motorola Wireless acquisition $12B
– Supporting businesses and ecosystem roles:
• Google+, Play, Shop, Books, Gmail, Docs
• Voice recognition software
The history of search engines: http://www.wordstream.com/articles/internet-search-engines-history
Google: The Dalles, Oregon internet scale data center
Google Data Center – The Dalles, Oregon
Google Data Center Photo Gallery
http://www.google.com/about/datacenters/gallery/#/
Google Data Centers in 2008:
Google Data Center CAPEX worldwide
Capital expenditures on datacenters:
– YTD 2013: USD$ 2.4B
– 2012: USD$ 3.2B
– 2011: USD$ 3.4B
– 2010: USD$ 4.0B
– 2009: USD$ 809M
The Dalles, Oregon
Each data center between $200M and $600M
Time to market
Cost Reduction
Data proximity
Better/faster technology support
Self-service
Shift the culture/business process
New kinds of applications
At scale never before imagined
Why Cloud Delivery Model, Cloud Data Centers
Primary drivers for move to cloud = business reasons
http://www.kpmg.com/global/en/issuesandinsights/articlespublications/cloud-service-providers-survey/pages/service-providers.aspx
Competitive Advantage, Revenue
Cloud Infrastructure Business Value
Value delivered
Time-to-Delivery, Competitive Advantage, Revenue: “Time is Money”
For users
IT Infrastructure Provisioning: from traditional (weeks) to cloud (minutes)
Continuous Access to data
Localized, any time, anywhere
Dynamic (Elastic)
Centralized
Fixed Capacity
For IT
Reduced admin costs: up to 50% savings
Reduced energy costs: up to 36%
Increased utilization: up to 90%, from 50%
Growth of The Cloud by 2016
Mobile
Geo-locational
Real-time data
Shift to cloud mega-data centers
> 50% in cloud
Cisco already knows > 50% workload is in the cloud
Source: http://www.datacenterknowledge.com/archives/2012/10/23/cisco-releases-2nd-annual-global-cloud-index/
Visualizing Mobile and Social
C = cloud
A = analytics
M = Mobile
S = Social
91
Space-Time-Travel
Source - blog by: Jeff Jonas/Las Vegas/IBM, Chief Scientist, IBM SWG Entity Analytics http://jeffjonas.typepad.com/
6 billion mobile phones
6.8 billion people
92
Space-Time-Travel
6 billion mobile phones
6.8 billion people
Source - blog by: Jeff Jonas/Las Vegas/IBM, Chief Scientist, IBM SWG Entity Analytics http://jeffjonas.typepad.com/
Re-Identify
(figuring who is who) is somewhat trivial
Reveal
Where you spend time
Who with (e.g., friends)
Geo-location data
Mobile phones: 600B transactions / day (in US)
De-Identify
in volume in real-time
share with third parties
93
Space-Time-Travel
6 billion mobile phones
6.8 billion people
Source - blog by: Jeff Jonas/Las Vegas/IBM, Chief Scientist, IBM SWG Entity Analytics http://jeffjonas.typepad.com/
Here Now
More to come
Unravel
All of one’s secrets
Absolute identification
Ultimate biometric
Reshape
Tough problems
Image classification
Identification
Enormous Opportunity
Challenge all notions of privacy
94
Possible….. Like Magic …
Source - blog by: Jeff Jonas/Las Vegas/IBM, Chief Scientist, IBM SWG Entity Analytics http://jeffjonas.typepad.com/jeff_jonas/2009/08/your-movements-speak-for-themselves-spacetime-travel-data-is-analytic-superfood.html
87% certainty where you will be this Thursday at 5pm
Top 10 people you co-locate with (home / work)
High quality traffic-avoid predictions pushed to you real-time
Transactions not consistent with your pattern = reduce credit card theft 90%
Political opponent crushed, resigns two days after announcing candidacy
Governments change
Due to mass online social networking
Cannot truly be turned off
6 billion mobile phones
6.8 billion people
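To make the "where you will be this Thursday at 5pm" idea concrete, here is a minimal sketch of one simple way such a prediction could be made: count how often each place appears per (weekday, hour) slot in past location pings, then report the most frequent place and its share. This is only an illustrative frequency model, not the method described in the cited blog; the function names, the sample pings, and the 5% threshold are all hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime


def build_profile(pings):
    """Count how often each place was seen in each (weekday, hour) slot."""
    profile = defaultdict(Counter)
    for timestamp, place in pings:
        profile[(timestamp.weekday(), timestamp.hour)][place] += 1
    return profile


def predict(profile, weekday, hour):
    """Return (most likely place, share of past observations) for a slot."""
    counts = profile.get((weekday, hour))
    if not counts:
        return None, 0.0
    place, hits = counts.most_common(1)[0]
    return place, hits / sum(counts.values())


def is_unusual(profile, timestamp, place, min_share=0.05):
    """Flag an event whose place is rare for that slot (hypothetical 5% cutoff).
    A slot with no history at all is also flagged."""
    counts = profile.get((timestamp.weekday(), timestamp.hour), Counter())
    total = sum(counts.values())
    share = counts[place] / total if total else 0.0
    return share < min_share


# Hypothetical history: four Thursdays in April 2014, around 5pm.
history = [
    (datetime(2014, 4, 3, 17, 5), "gym"),
    (datetime(2014, 4, 10, 17, 10), "gym"),
    (datetime(2014, 4, 17, 17, 2), "gym"),
    (datetime(2014, 4, 24, 17, 20), "office"),
]

profile = build_profile(history)
place, confidence = predict(profile, weekday=3, hour=17)  # 3 = Thursday
print(place, confidence)
print(is_unusual(profile, datetime(2014, 5, 1, 17, 0), "airport"))
```

The same profile supports both uses named on the slide: predicting a likely location, and flagging a transaction whose location does not fit the established pattern.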
95
Mobile/Social:
84% of smartphone users check an app as soon as they wake up.
2x as many people in 2013 were willing to share their geolocation data in return for personalized offers compared to the previous year.
5 minutes: the response time users expect from a company once they have contacted them via social media.
4/5ths of U.S. adult smartphone users keep their phones with them 22 hours per day.
80% of individuals are willing to trade their information for a personalized offering.
2/3rds of U.S. adults say they would not return to a business that lost their personal, confidential information.
$1 trillion of upside potential in online retail sales if buyers trust more.
84% of Millennials say social and user-generated content has an influence on what they buy. 70% of Boomers agree.
57% of companies in 2014 expect to devote more than 25% of their IT spending to systems of engagement. (Almost double the investment one year ago.)
IBM CONFIDENTIAL 2014
96
Observe: how fast mobile internet grows by 2014
By 2014: mobile will be the main way of connecting to the Internet
97
Mobile affects all business models…
Mobile = Geo-locational superfood for real-time analytics
98
Mobile / Social endpoints for Data Supertransformagicability
TaxiWiz
HousingMaps
Source: http://mashable.com/2007/07/11/google-maps-mashups-2/
Weatherbug
99
By 2016, how much mobile data? What kind?
• 2012:
– Mobile-connected devices > number of people
• 2016:
– 10 billion mobile devices
– (world population: 7.3 B)
http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/white_paper_c11-520862.html
Smartphones 48%
Web data, video 70%
100
Monetizing, innovating, creating competitive advantage out of Big Data
Innovating using Big Data
101
Different forms of automation have had a profound impact
102
Manufacturing changes with the end of mass production
• Growth in manufacturing capable countries
• Global levelling out
• Hybridised manufacturing
• Micro multi-nationals clusters
• Globally recognised specialisation
103
3D printing has the potential to drive another step change
• Digitisation often leads to the freemium
• Defining a sustainable position in the value chain
• Really understanding customer value is critical
104
“The Curve”: giving away things for free, in exchange for data?
http://www.youtube.com/watch?v=pcyzn5oiDrI
Today’s changing business models
105
1. Augment Products: Instrument products to create new data and extend notion of client value
2. Codify Services: Expand use of differentiated capabilities through ecosystems or business platforms to create additional value
3. Interconnect Industries: Use information to create new value chains that reduce waste and bridge gaps between organizations
4. Trade Information: Translate data into information that is of value to adjacent industries
5. Digitise Assets: Transform analogue into digital assets
New Patterns for Innovation have emerged
106
Using these patterns requires elastic enterprises
Adjacency
Leverage core competency
Earn market permission
Differentiation
Maintainable advantage
Serve individual needs
Scaling Ecosystems
Amplified innovation
Co-creation of new value
Dynamic Operating Model
Able to share the new value
Scalable business platform
Source: Elastic Enterprise, Nicholas Vitalari and Haydn Shaughnessy
107
Interesting to look at Corning
Strong light glass for light bulbs
Dishes, plates… They are the “standard” in some cultures
Glass for LCD screens.
Now predicting the future of glass
http://www.youtube.com/watch?v=jZkHpNnXLB0
108
Big Data is at the heart of innovation in business
109
Big Data business decisions URL:
https://bda.expertise.client-conversations.com
Available on the internet
Complete information on Innovating with Big Data:
110
Jobs replaced by Technology
http://www.businessinsider.com/the-future-of-jobs-the-onrushing-wave-2014-1
111
1. Data + Analytics = Information
2. Information + Context = Insight
3. Insight + Actions = Desired Outcomes
Quiz: The Big Data Journey to Value
112
Data, the new natural Resource
Data + Analytics = Information. Information + Context = Insight. Insight + Action = Outcomes
Big Data in context:
Cloud, Analytics, Mobile, Social
Innovating using Big Data:
Monetizing, innovating, creating competitive advantage out of Big Data
Summary – what we covered today:
113
Thank You (English)
Merci (French)
Grazie (Italian)
Obrigado (Brazilian Portuguese)
Danke (German)
Tesekkurler (Turkish)
Japanese
Hebrew
Russian
Arabic
Traditional Chinese
Simplified Chinese
Hindi
Tamil
Korean
Thai
114