© 2014 IBM Corporation
Client Approaches to Successfully Navigate through the Big Data Storm
June 2014
Does Your Big Data Project Look Like This?
You need cost predictability, together with a solution that can quickly take you places!

Hadoop is a fascinating, exciting engine. However, it is:
– Ungoverned
– All custom, all the time
– Dependent on expensive, constantly changing skills
– Lacking any concept of quality, governance, or lineage
And MapReduce was originally designed for fine-grained fault tolerance, which makes it slow for big data integration processing
Hadoop is just not a solution for big data integration
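The fault-tolerance point above can be sketched in miniature. The toy Python illustration below (not Hadoop's actual API) mimics the classic MapReduce pattern: intermediate results are materialized to disk so that any failed task can be re-run in isolation, which is exactly the round-trip a multi-stage integration flow pays at every stage boundary.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path

def map_phase(records, out_dir):
    """Emit (key, value) pairs and spill them to disk, as a mapper would."""
    out = Path(out_dir) / "map_output.jsonl"
    with out.open("w") as f:
        for rec in records:
            for word in rec.split():
                f.write(json.dumps([word.lower(), 1]) + "\n")
    return out

def reduce_phase(map_output):
    """Re-read the spilled pairs, group by key, then aggregate."""
    groups = defaultdict(int)
    with Path(map_output).open() as f:
        for line in f:
            key, value = json.loads(line)
            groups[key] += value
    return dict(groups)

with tempfile.TemporaryDirectory() as d:
    spilled = map_phase(["big data big", "data storm"], d)
    counts = reduce_phase(spilled)
    print(counts)  # {'big': 2, 'data': 2, 'storm': 1}
```

A real integration flow chains many such stages (cleanse, standardize, join, aggregate), and each boundary repeats the write-then-re-read cycle shown here.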
If so, that’s because 80% of the development work for a big data project is to address Big Data Integration challenges
“By most accounts, 80 percent of the development effort in a big data project goes into data integration and only 20 percent goes towards data analysis.”
Source: Intel Corporation, "Extract, Transform, and Load Big Data with Apache Hadoop" (white paper)
Most Hadoop initiatives end up achieving "garbage in, garbage out" faster, against larger data volumes, because:
– MapReduce was not designed to handle all the processing logic necessary for big data integration
– Teams forget that Hadoop initiatives require collecting, moving, transforming, cleansing, integrating, exploring, and analyzing volumes of disparate data (of various types, from various sources), i.e., data integration
To succeed, you need Data Integration capabilities that create consumable data by:
– Collecting, moving, transforming, cleansing, governing, integrating, exploring, and analyzing volumes of disparate data
– Providing simplicity, speed, scalability, and reduced risk
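As a deliberately tiny illustration of the capabilities listed above, the sketch below cleanses and integrates records from two hypothetical sources. The field names and rules are illustrative assumptions, not any product's behavior.

```python
def cleanse(record):
    """Trim whitespace, normalize case, and drop records missing the key field."""
    if not record.get("customer_id", "").strip():
        return None
    return {
        "customer_id": record["customer_id"].strip(),
        "country": record.get("country", "").strip().upper() or "UNKNOWN",
    }

def integrate(*sources):
    """Merge records from several sources, keeping the first clean copy of each key."""
    seen = {}
    for source in sources:
        for record in source:
            cleaned = cleanse(record)
            if cleaned and cleaned["customer_id"] not in seen:
                seen[cleaned["customer_id"]] = cleaned
    return list(seen.values())

crm = [{"customer_id": " C1 ", "country": "de"}]     # same customer as web's C1
web = [{"customer_id": "C1", "country": "DE"},
       {"customer_id": "", "country": "fr"}]         # dropped: no key field
result = integrate(crm, web)
print(result)  # [{'customer_id': 'C1', 'country': 'DE'}]
```

Real integration platforms add governance and lineage on top of this kind of flow; the point here is only that every step (cleanse, dedupe, merge) is explicit logic that has to live somewhere.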
A large US Bank needed to reduce total cost of ownership …
Business Problem:
– Primary: Reduce Teradata total cost of ownership
– Secondary: Allow for new analytic exploration and asset optimization

Challenges:
– Create a Data Distribution Hub / Big Data platform to cut costs
– Move front-end processing from Teradata to the Data Distribution Hub
– Offload the ELT workload in a cost-effective, efficient way
… and successfully offloaded ELT workloads to reduce costs
Approach:
– Reduce costs by offloading ELT workloads from Teradata to a Big Data platform
– Leverage existing InfoSphere Information Server data integration skills and assets (jobs)
– Hand coding: the client would not consider hand coding for data integration capabilities

Outcome:
– The client deployed IBM PureData System for Hadoop
– The client uses InfoSphere Information Server as its single scalable, flexible Big Data Integration solution
– The client successfully migrated its Teradata ELT and now uses InfoSphere Information Server to exploit the lower cost of running data integration on Hadoop
A government entity anticipated the need to support a 10x increase in incoming data volumes over 3-5 years …
Business Problem:
This Master Data Management (MDM) client compares frequently updated records to identify potential national security threats. They needed to:
– Support a 10x increase in incoming data volumes (in the next 3-5 years)
– Reduce high software and hardware costs

Project Challenges:
– Create a solution that could support scalable probabilistic matching for up to 10x data growth
– Modernize ETL practices and remove bottlenecks
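Probabilistic matching, the workload this client needed to scale, compares records by similarity rather than exact equality. Below is a minimal Python sketch; the field weights and the 0.8 threshold are illustrative assumptions, not the client's actual configuration.

```python
from difflib import SequenceMatcher

def field_similarity(a, b):
    """String similarity in [0, 1] using difflib's ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match_score(rec_a, rec_b, weights=None):
    """Weighted average of per-field similarities."""
    weights = weights or {"name": 0.6, "city": 0.4}  # illustrative weights
    return sum(w * field_similarity(rec_a[f], rec_b[f]) for f, w in weights.items())

a = {"name": "Jonathan Smith", "city": "Boston"}
b = {"name": "Jon Smith", "city": "Boston"}
score = match_score(a, b)
print(round(score, 2), "likely match" if score > 0.8 else "no match")
```

At 10x data volumes, naive all-pairs comparison like this grows quadratically, which is why scalable matching engines rely on blocking/candidate selection and massively parallel execution rather than simple loops.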
… and replaced an expensive and failing hand-coding approach with a massively scalable Big Data Integration solution
Approach:
– Eliminate hand coding for data integration to significantly reduce software costs
– Deploy a data integration solution that can scale fast enough to feed the MDM system
– Reduce the high costs of ELT running in their database

Outcome:
– Removed hand coding and replaced it with InfoSphere Information Server for massively scalable data integration processing
– Stopped running ELT in the database, leveraging Hadoop instead
– The client purchased an end-to-end Big Data solution from IBM, spanning MDM, Hadoop, and Information Integration
A large European telco wanted to leverage big data to increase revenue and customer satisfaction …
Business Problem:
– Increase revenue and customer satisfaction by analyzing usage patterns of mobile devices to match user demand
– Reduce costs by reducing inventory

Project Challenges:
– Needed a comprehensive Big Data platform that could keep up with analytics requirements
– The client used Informatica for ETL generally and planned to extend its use to the Big Data effort. They asked Informatica to improve existing Netezza loading performance in support of their goals, and:
  – The ETL process broke with a small sample of jobs
  – They switched to an ELT approach and encountered technical problems
… and learned that ELT only was not sufficient to support Big Data Integration
Approach:
– Leverage a worldwide predictive solution to anticipate customer requirements
– Add a Hadoop layer to enrich predictive models with unstructured social media data
– Expand the existing IBM Netezza footprint to keep pace with new data volumes

Outcome:
– The client requested a full-workload data integration POC with IBM
– The client realized ELT only was not sufficient for Big Data Integration (not all data integration logic can be pushed into IBM Netezza or Hadoop)
– The client found InfoSphere Information Server can often run data integration faster than either Netezza or Hadoop
– The client selected InfoSphere Information Server over Informatica for Big Data Integration, and InfoSphere BigInsights over Cloudera
Plan for Success! Successfully navigate the big data maze
Hadoop is not a Data Integration platform, 80% of the work is around Big Data Integration, and MapReduce is slow.

To move into production successfully, you need to plan ahead and make sure you have accounted for your Big Data Integration needs: hand coding does not meet Big Data Integration scalability, flexibility, or performance requirements.
Get more information about Big Data Integration requirements and key success factors.
ELT only is NOT sufficient to meet most Big Data Integration requirements, because you cannot push ALL the data integration logic into the data warehouse or into Hadoop.
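To make the pushdown limitation concrete, here is a hedged Python sketch using sqlite3 as a stand-in warehouse: a simple count pushes down cleanly as SQL (the ELT pattern), while pairwise fuzzy deduplication has no simple SQL equivalent and has to run in an engine outside the database. The 0.7 threshold is an illustrative assumption.

```python
import sqlite3
from difflib import SequenceMatcher

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?)",
                 [("Acme Corp",), ("ACME Corporation",), ("Globex",)])

# Pushdown-friendly (ELT): a simple aggregate is trivially expressed in SQL.
row_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

# Not pushdown-friendly: pairwise fuzzy comparison of names must leave SQL.
names = [r[0] for r in conn.execute("SELECT name FROM customers")]

def is_duplicate(a, b, threshold=0.7):  # illustrative threshold
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

dupes = [(a, b) for i, a in enumerate(names)
         for b in names[i + 1:] if is_duplicate(a, b)]
print(row_count, dupes)  # 3 [('Acme Corp', 'ACME Corporation')]
```

The first query is the kind of logic ELT handles well; the second is the kind of record-level integration logic that, per the slides above, cannot all be pushed into the warehouse or into Hadoop.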