saurabhyadav84
DESCRIPTION
This presentation describes big data in general and its use in BI. Big data is the future of IT analytics, and a thorough understanding of it will help the industry do predictive forecasting.
Big Data
New Frontiers for IT Management
Executive Briefing
• Definition
• Catalysts
• Potential Value of Big Data
• Thought Leaders
• Leading Technology Vendors
The Age of Big Data
“Data is a new class of economic asset, like currency and gold.”
What is Big Data
“A massive volume of both structured and unstructured data that is so large that it's difficult to process with traditional database and software techniques.”
What is Big Data
• Walmart handles more than 1 million customer transactions every hour.
• Facebook handles 40 billion photos from its user base.
• Decoding the human genome originally took 10 years to process; now it can be achieved in one week.
Volume, Velocity and Variety
Big data spans three dimensions
• Volume: petabytes per day/week
• Velocity: real-time capture and real-time analytics
• Variety: unstructured data, web logs, audio, video, image
Traditional Approach vs. Big Data Approach
Traditional Approach: Structured & Repeatable Analysis
• Business users determine what question to ask
• IT structures the data to answer that question
• Examples: monthly sales reports, profitability analysis, customer surveys
Big Data Approach: Iterative & Exploratory Analysis
• IT delivers a platform to enable creative discovery
• Business explores what questions could be asked
• Examples: brand sentiment, product strategy, maximum asset utilization
Catalyst – Commodity Servers
• Commodity server hardware creating the possibility for cost-effective massively parallel processing (MPP)
– Example server might contain:
  CPU – 16 cores
  RAM – 1 Terabyte
  Disk – 500 Terabytes
  Ethernet – 1 Gbit
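As a minimal sketch of the MPP idea on this slide, the Python below splits a data set across several simulated nodes, lets each node aggregate only its own slice, and then combines the partial results. Everything here is an assumption for illustration: the function names, the sample transactions, and the use of threads as stand-ins for separate commodity servers are not from the original deck.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(rows, n_nodes):
    """Deal rows out round-robin so each simulated node owns one slice."""
    return [rows[i::n_nodes] for i in range(n_nodes)]

def node_partial_sum(rows):
    """Work done locally on one node: sum the amounts in its own slice."""
    return sum(amount for _, amount in rows)

def mpp_total(rows, n_nodes=4):
    """Scatter the rows, aggregate each slice in parallel, then combine."""
    slices = partition(rows, n_nodes)
    # Threads stand in for servers here; a real MPP system would ship
    # each slice to a separate machine with its own CPU, RAM, and disk.
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(node_partial_sum, slices))
    return sum(partials)

# Hypothetical sample data: (transaction_id, amount) pairs.
transactions = [(i, i % 7 + 1) for i in range(1000)]
total = mpp_total(transactions, n_nodes=4)
```

The point of the design is that no node ever needs the full data set, so adding more inexpensive servers adds capacity without changing the logic.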
Catalyst – Humans and the Internet
• 1.2 billion active mobile broadband subscriptions
• Web sites with 300+ million unique visitors/month
– Facebook
– Yahoo
– YouTube
Potential Value of Big Data
• $300 billion potential annual value to US health care.
• $600 billion potential annual consumer surplus from using personal location data.
• 60% potential increase in retailers’ operating margins.
Source: McKinsey Global Institute - 2010
Leading Technology Vendors
Example Vendors
• IBM – Netezza
• EMC – Greenplum
• Oracle – Exadata
Commonality
• MPP architectures
• Commodity hardware
• RDBMS based
• Full SQL compliance
Hadoop – Open Source
• Started by Google and Yahoo!
• Now Open Source – Hadoop
• “NoSQL” approach to data
• Foundational Technologies:
– Hadoop Data Storage Framework
– MapReduce engine
– Hive and Pig query tools
  • Almost SQL compliant
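To make the MapReduce engine mentioned above more concrete, here is a small word-count sketch in plain Python. It only imitates the map, shuffle, and reduce phases in a single process; it does not use Hadoop or any of its APIs, and the function names and sample lines are assumptions made for this illustration.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)

def word_count(lines):
    """Run the three phases in order over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

# Hypothetical input lines.
lines = ["big data big value", "data is an asset"]
counts = word_count(lines)
```

In a real cluster the map and reduce calls would run on many machines at once, with the framework handling the shuffle between them; the per-key independence shown here is what makes that distribution possible.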
Leading Vendors - Hadoop
• Cloudera – Open Source Hadoop
– Production releases
– Very good support
– Conferences and education
• Amazon's Elastic Compute Cloud
– Map/Reduce environment
– MPP for everyone
– Cost effective
And you can buy a book!
IBM – Netezza
Simplifies Data Warehousing
• Speed: 10-100x better performance
• Simplicity: Admin cost reduced by 75-90%
• Scalability
• Smart System: In-database analytics
IBM – Netezza
• Significant response improvement
– Faster platform means better report response
• Direct Data Availability
– Higher trust in data, one version of truth
– Aggregation reduction
– Any attribute available
• Operational Benefits
– Storage savings (no data replicas)
– Administration cost reduction (DBA)
• Infrastructure Simplification
– Lower environment complexity
BigDataArchitecture.com