1,000,000,000,000,000,000,000 bytes = 1000^7 bytes = 10^21 bytes
1 Zettabyte = 1 billion terabytes
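The unit arithmetic above can be checked with a short sketch. It assumes decimal (SI) units, where each step is a factor of 1000; the `UNITS` table and the `convert` helper are illustrative names, not part of the original slides.

```python
# Decimal (SI) storage units expressed in bytes (assumption: powers of 1000, not 1024).
UNITS = {
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
}


def convert(amount, from_unit, to_unit):
    """Convert an amount from one unit to another (hypothetical helper)."""
    return amount * UNITS[from_unit] / UNITS[to_unit]


if __name__ == "__main__":
    # 1000^7 and 10^21 are the same number of bytes.
    print(1000**7 == UNITS["zettabyte"])
    # Number of terabytes in one zettabyte.
    print(convert(1, "zettabyte", "terabyte"))
```

Calling `convert(1, "zettabyte", "terabyte")` gives one billion, which matches the statement that 1 Zettabyte equals 1 billion terabytes.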
Internet of things: Audio / Video, Log Files, Text/Image, Social Sentiment, Data Market Feeds, eGov Feeds, Weather, Wikis / Blogs, Click Stream, Sensors / RFID / Devices, Spatial & GPS Coordinates
WEB 2.0: Mobile, Advertising, Collaboration, eCommerce, Digital Marketing, Search Marketing, Web Logs, Recommendations
ERP / CRM: Sales Pipeline, Payables, Payroll, Inventory, Contacts, Deal Tracking
Volume: Gigabytes (10^9), Terabytes (10^12), Petabytes (10^15), Exabytes (10^18)
Variety, Variability & Velocity
Storage cost per GB
1980: $190,000
1990: $9,000
2000: $15
2010: $0.07
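To put the storage-cost figures above in perspective, the sketch below derives an average yearly rate of change from them. Only the four price points come from the slide; the helper names `annual_change` and `cost_of` are hypothetical, and the result is a smoothed average rather than a statement about any individual year.

```python
# Cost per gigabyte by year, in US dollars, as listed on the slide.
COST_PER_GB = {1980: 190_000, 1990: 9_000, 2000: 15, 2010: 0.07}


def annual_change(start_year, end_year, costs=COST_PER_GB):
    """Average yearly rate of change between two listed years (negative means decline)."""
    years = end_year - start_year
    return (costs[end_year] / costs[start_year]) ** (1 / years) - 1


def cost_of(gigabytes, year, costs=COST_PER_GB):
    """Cost of storing the given number of gigabytes at that year's listed price."""
    return gigabytes * costs[year]


if __name__ == "__main__":
    print(f"Average yearly change, 1980-2010: {annual_change(1980, 2010):.1%}")
    print(f"1 TB (1000 GB) in 1980: ${cost_of(1000, 1980):,.0f}")
    print(f"1 TB (1000 GB) in 2010: ${cost_of(1000, 2010):,.2f}")
```

The same terabyte that would have cost $190 million at the 1980 price costs about $70 at the 2010 price, which is the point the figures make about why large-volume storage became practical.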
ERP / CRM, WEB 2.0, Internet of things
What’s the social sentiment for my brand or products?
How do I optimize my fleet based on weather and traffic patterns?
How do I better predict future outcomes?
Big Data, BIG OPPORTUNITY
49% of CEOs and CIOs are planning big data projects
Software Growth (billions $)
2012: 1.8
2013: 2.5
2014: 3.4
2015: 4.6
34% compound annual growth rate
Services Growth (billions $)
2012: 2.7
2013: 3.9
2014: 5.1
2015: 6.5
39% compound annual growth rate
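For readers who want to see how a compound annual growth rate is derived, here is a minimal sketch using the yearly values plotted above. The `cagr` and `series_cagr` helpers are illustrative names. Computed from these rounded 2012 and 2015 endpoints, the results come out near 37% for software and 34% for services, so the quoted rates may rest on a different baseline year or on unrounded data; the slides do not say which.

```python
def cagr(start_value, end_value, periods):
    """Compound annual growth rate over the given number of yearly periods."""
    return (end_value / start_value) ** (1 / periods) - 1


# Yearly values in billions of dollars, as plotted on the slide.
SOFTWARE = {2012: 1.8, 2013: 2.5, 2014: 3.4, 2015: 4.6}
SERVICES = {2012: 2.7, 2013: 3.9, 2014: 5.1, 2015: 6.5}


def series_cagr(series):
    """CAGR between the first and last year of a {year: value} series."""
    first, last = min(series), max(series)
    return cagr(series[first], series[last], last - first)


if __name__ == "__main__":
    print(f"Software, 2012-2015: {series_cagr(SOFTWARE):.1%}")
    print(f"Services, 2012-2015: {series_cagr(SERVICES):.1%}")
```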
1. McKinsey & Company, McKinsey Global Survey Results, Minding Your Digital Business, 2012
2. IDC Market Analysis, Worldwide Big Data Technology and Services 2012–2015 Forecast, 2012
[Diagram: Block #1, Block #2, and Block #3, each shown three times]
Symbol  Date        High   Low
AET     2009-09-21  30.49  31.09
AET     2009-09-18  31.01  31.44
…
MRK     1988-11-25  55.00  55.25

Key  Value
AET  0.61
AET  0.33
…
MRK  0.25

Key  Value
AET  13.75
AXP  15.12
…
MRK  29.00
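The three tables above read like the stages of a MapReduce job: input records, intermediate key/value pairs with repeated keys, and one value per key after reduction. The sketch below illustrates that pattern in plain Python. It is not the job behind the slide: the map function (absolute difference between the two price columns) and the reduce function (maximum per symbol) are assumptions, and only the three rows visible above are used, so the slide's own numbers are not reproduced.

```python
from collections import defaultdict

# The rows visible in the input table: symbol, date, and the two price columns.
RECORDS = [
    ("AET", "2009-09-21", 30.49, 31.09),
    ("AET", "2009-09-18", 31.01, 31.44),
    ("MRK", "1988-11-25", 55.00, 55.25),
]


def map_record(record):
    """Map step (assumed): emit (symbol, absolute spread between the two prices)."""
    symbol, _date, first_price, second_price = record
    yield symbol, round(abs(second_price - first_price), 2)


def shuffle(pairs):
    """Shuffle step: group all intermediate values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_values(key, values):
    """Reduce step (assumed): keep the largest value seen for each key."""
    return key, max(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on an in-memory list of records."""
    pairs = (pair for record in records for pair in map_record(record))
    return dict(reduce_values(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    for symbol, value in sorted(run_job(RECORDS).items()):
        print(symbol, value)
```

In a real cluster the map and reduce functions run in parallel on different machines and the framework performs the shuffle; here everything runs in one process to keep the data flow visible.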
                    | Batch processing | Interactive analysis | Stream processing
Query runtime       | Minutes to hours | Milliseconds to minutes | Never-ending
Data volume         | TBs to PBs | GBs to PBs | Continuous stream
Programming model   | MapReduce | Queries | DAG
Users               | Developers | Analysts and developers | Developers
Originating project | Google MapReduce | Google Dremel | Twitter Storm
Open source project | Hadoop / Spark | Hive* / Drill / Shark / Impala / HBase | Storm / Apache S4 / Kafka
Hive, Pig, Mahout, Cascading, Scalding, Scoobi, Pegasus…
C#, F# Map/Reduce, LINQ to Hive, .NET management clients
JavaScript Map/Reduce, Browser hosted console, Node.js management clients
PowerShell, Cross Platform CLI tools
https://www.windowsazure.com/en-us/develop/net/how-to-guides/hadoop/
http://blogs.msdn.com/hpctrekker/
https://github.com/wenming/BigDataSamples