In this slide deck, Infochimps Director of Product Tim Gasper discusses how Infochimps tackles customers' business problems by deploying a comprehensive Big Data infrastructure in days, sometimes in just hours. Tim explains how Infochimps now takes the same aggressive approach to application development, helping customers build analytic applications quickly and reach value sooner.
Getting to Insights Faster: A Framework for Agile Big Data
@TimGasper, Director of Product
Infochimps, a CSC Big Data Business
Agenda
(1) IT’S ALL ABOUT THE APP
(2) WHAT IS A BIG DATA APP
(3) TRADITIONAL VS AGILE APPROACH
(4) ENABLERS OF AGILE BIG DATA
(5) DEMONSTRATION
What problem are you trying to solve?
It’s all about the apps.
Poll Question 1
What is a Big Data app?
Critical Business Problems + ? = Impactful Analytic Applications
Source: PARC
Predictive Inventory Levels to Minimize Warehousing Costs
Personalized Medicine Treatment Programs
Smart Meter Monitoring for Customer Value Add
Customer Churn Analysis for Increased Customer Lifetime Value
Trade Options and Futures Pricing Platform
Poll Question 2
It’s all about the apps.
Source: Tableau
Predictive Manufacturing + Smart Manufacturing & Energy
Ad Publisher Campaign Analytics
360° Customer Experience Management
Social Media Monitoring & Analytics
The Traditional Way
Business Discovery
Info Discovery
Logical Data Model
Physical Data Model
System Staging
Data Ingestion, Transformation, ETL
Application Development
Analytics
Production Staging
Data Warehouse Project: 12-24 Months to Reach Production
Big Data: A New Hope
Data Warehouse Project: 12-24 Months to Reach Production
Business Discovery
Info Discovery
System Staging
Initial Data Ingest
Production Staging
Big Data Project: 3-6 Months to Reach Production
Schema on Read, Analytics, App Dev (cycle repeated five times)
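A minimal sketch of schema on read, assuming raw events are kept as JSON lines and each app iteration applies its own field list at read time. The sample events and the helper `read_with_schema` are hypothetical illustrations, not part of the Infochimps platform.

```python
import json

# Raw events are stored untouched; no table schema is declared at ingest time.
RAW_EVENTS = [
    '{"device": "d1", "temp_c": 41.5, "firmware": "1.2"}',
    '{"device": "d2", "temp_c": 39.0}',
    '{"device": "d3", "temp_c": "44.1", "site": "austin"}',
]


def read_with_schema(lines, schema):
    """Apply a schema (field name -> caster) while reading.

    Fields missing from a raw record come back as None.
    """
    for line in lines:
        record = json.loads(line)
        yield {
            field: cast(record[field]) if field in record else None
            for field, cast in schema.items()
        }


# Iteration 1: the app only needs device and temperature.
v1 = list(read_with_schema(RAW_EVENTS, {"device": str, "temp_c": float}))

# Iteration 2: a new question needs the site, with no re-ingest of the data.
v2 = list(read_with_schema(RAW_EVENTS, {"device": str, "site": str}))
```

Each iteration changes only the schema passed at read time, which is what lets the analytics and app dev cycle repeat without a new data model or reload.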
Application Development Timelines
2 Developers, 6 Months
2 Developers, 5 Months
1 Developer, 3 Months
2 Developers, 4 Months
Speed to Value: A Case Study
HGST, a Western Digital company, is improving customer support and product quality by collecting, analyzing, and acting on massive quantities of machine and sensor data.
Greatly diminished operational burden with ability to focus on analysis and driving business action
Fast project delivery and success
Expertise with Big Data technologies like Hadoop
KEY STATS
Industry: Storage Technology
Solution: Machine Data Analysis Engine
Channel: B2B
Cloud Services: Cloud::Queries, Cloud::Hadoop
Users: Application Developers, Data Scientists, Analysts
Deployment: Amazon Web Services
Poll Question 3
Enablers of Agile Big Data
1. Managed infrastructure means focusing on Big Data apps
2. The community tech itself and what it enables
3. Our customer engagement framework for choosing use cases that have impact and designing successful solutions
4. Agile, iterative analytics app dev lifecycle
5. Our application reference design framework for kick starting application development
A Managed Platform
Technologies Under the Hood, PART 1
HADOOP • Java MapReduce • Streaming MapReduce • SQL on Hadoop, Pig, Hive
NOSQL DATABASES • HBase/Accumulo • Elasticsearch • Cassandra, MongoDB
STREAM PROCESSING, MESSAGE QUEUES • Storm • Kafka
Technologies Under the Hood, PART 2
HADOOP INTERFACES • Hue • Command Line
STATISTICAL TOOLS • R, SAS, SPSS
BUSINESS INTELLIGENCE AND DATA VIZ • Legacy: Cognos, Biz Objects, OBIEE, Microsoft BI • New Gen: Tableau, Qlikview, SiSense, Kibana
Public Cloud Private Cloud
IaaS
Abstract to any cloud with Orchestration DSL
Virtual Private Cloud
Develop & Test Locally with App/Analytics Scripting & “Deploy Pack” Orchestration
SaaS
Real-time Analytics with Cloud::Streams
Batch Analytics with Cloud::Hadoop
Interactive Analytics with Cloud::Queries
PaaS
Our Unique Toolset Addition
MAJOR ACTIVITIES
Solution
Week 1-2: Discovery
Week 3-4: Technical Design
Week 5-8+: Platform Rollout
Ongoing: Iterative App Development
Interview Key Business Stakeholders
Interview Key Technical Stakeholders
Define Objectives & Challenges
Define Target Use Case
Identify Data Sources
Define Business Benefits
Define Architecture
Develop High-Level Approach & Costs
Agree to Project Plan/Rollout
Standup / Connect Environment
Design Data Flows
Architecture Validation
Build Data Flows
Historical Data
Real-Time Data Flow
Production
Design & Build
Service Requirements
Tuning
Customer Engagement Framework
• Identify data sources for target use case
• Develop high-level tech approach and costs
• Define high-level benefits
• Develop initial case for action
• Develop go-forward plan
• Develop Data Model
• Technical architecture & integration design
• Stand up environment
• Dashboard design workshops
• Data mapping
• Build prototype dashboard
• Configure prototype application
• Data load
• Run solution iterations
• Analytical modeling
• Run 2-4 hour Design Thinking Workshop
• Review current state metrics
• Review business pain points & opportunities
• Review application & infrastructure environment
• Define target use case
Agile Iteration for App Dev
App Reference Design Framework
• A use-case-driven reference design
• A code repository with:
o Domain-specific sample data sets/sources
o Sample data flows
o Sample data processors/analytics
o Simple data visualization
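A rough sketch of how the four repository pieces above could fit together, using a social media use case. Everything here is a hypothetical stand-in (the sample posts and the names `flow`, `count_hashtags`, and `render`); it is not the actual Infochimps reference design code.

```python
from collections import Counter

# Hypothetical sample data set standing in for a domain-specific source.
SAMPLE_POSTS = [
    {"user": "a", "text": "Loving the new #bigdata dashboard"},
    {"user": "b", "text": "#hadoop cluster is up, #bigdata here we come"},
    {"user": "c", "text": "No tags in this one"},
]


def flow(posts):
    """Sample data flow: normalize each record before analysis."""
    for post in posts:
        yield {**post, "text": post["text"].lower()}


def count_hashtags(posts):
    """Sample processor: count hashtag mentions across all posts."""
    counts = Counter()
    for post in posts:
        counts.update(
            word.strip(".,!?")
            for word in post["text"].split()
            if word.startswith("#")
        )
    return counts


def render(counts):
    """Simple visualization: one text bar per hashtag, most frequent first."""
    return [f"{tag:<10} {'*' * n}" for tag, n in counts.most_common()]


counts = count_hashtags(flow(SAMPLE_POSTS))
chart = render(counts)
```

Starting from a working skeleton like this, a team would swap in its own data source and processors rather than begin from an empty repository.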
App Reference Designs
Predictive Manufacturing + Smart Manufacturing & Energy
Ad Publisher Campaign Analytics
360° Customer Experience Management
Social Media Monitoring & Analytics
Social Media App Reference Design
Demonstration
Big Data Benefits
New Use Cases
New Analytics and Analytical Techniques
Time to Value
Faster Iteration
Increased Flexibility
ENABLED BY
• Unstructured data and semi-structured data allow for faster path to data integration
• Real-time analysis and batch analysis with scripting tools
• Schema on read for app-driven data models and data structures
• Local to cloud, small data to big data… tools can talk to each other
More Data
Faster Data
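To illustrate the real-time and batch point above, here is a small sketch that computes the same metric both ways in a scripting language. The readings and the names `batch_mean` and `StreamingMean` are assumptions made for illustration, not platform APIs.

```python
def batch_mean(values):
    """Batch analysis: compute the mean over the full stored data set."""
    return sum(values) / len(values)


class StreamingMean:
    """Real-time analysis: update the mean one event at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        # Incremental mean: shift the estimate by the new value's share.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


# Hypothetical sensor readings.
readings = [40.0, 42.0, 47.0, 39.0]

stream = StreamingMean()
running = [stream.update(r) for r in readings]
overall = batch_mean(readings)
```

The streaming version has an answer after every event, while the batch version needs the whole data set; both agree once all readings have arrived.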
What is Your First Big Data App?