Hadoopable Problems and Hadoop Application Challenges

Dr. Tariq Mahmood

Chief Data Scientist, NexDegree Pvt. Ltd.

Email: [email protected]

[Professor at PAF-KIET, Karachi]

Open-BDA Hadoop Summit 2014 - Dr. Tariq Mahmood (Hadoopable Problems and Hadoop Application Challenges)

Agenda

Enterprise Hadoop Architecture: Decisions and Process

6 Common Hadoopable Problems

Hadoop Application Challenges

HADOOPING PROCESS

ZERO IN

REVELATIONS

Distinguish Big Data from Small Data

Specify Big Data Analytical Requirements

From Requirements to MapReduce Code

Develop Integrated Hadoop Base

Generate Smart Data for Big Data

Generate Smart Data for Small Data

VISUALIFY

SMART DATA

Hadoopable Problems

Credit Card Late Payment Risk

Small Data Analytics:

Significant segments of Credit Card Customers

Risk of Late Payment for each Segment

Accuracy of Risk Prediction for each Segment

Big Data Analytics with Hadoop:

Discover Segments in Real-Time

Discover Late Payment Risk of each Segment in Real-Time

Discover Per Segment Accuracy in Real-Time

Apply Risk-Aversion Policies Per Segment in Real-Time
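The per-segment late payment risk listed above fits the MapReduce pattern: map each record to a segment key, group by key, then reduce each group to a rate. The following is a minimal single-machine sketch of that pattern in plain Python, not a Hadoop job and not the speaker's implementation. The record layout (customer id, segment, late flag), the segment names, and the function names are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical record layout: (customer_id, segment, paid_late)
Record = Tuple[str, str, bool]


def map_phase(records: Iterable[Record]) -> Iterator[Tuple[str, Tuple[int, int]]]:
    """Emit (segment, (late_count, total_count)) for every payment record."""
    for _customer_id, segment, paid_late in records:
        yield segment, (1 if paid_late else 0, 1)


def shuffle(pairs: Iterable[Tuple[str, Tuple[int, int]]]) -> Dict[str, List[Tuple[int, int]]]:
    """Group mapped values by key, as the framework's shuffle step would."""
    groups: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[Tuple[int, int]]]) -> Dict[str, float]:
    """Reduce each segment's values to a late payment rate."""
    rates = {}
    for segment, values in groups.items():
        late = sum(v[0] for v in values)
        total = sum(v[1] for v in values)
        rates[segment] = late / total
    return rates


def late_payment_risk(records: Iterable[Record]) -> Dict[str, float]:
    return reduce_phase(shuffle(map_phase(records)))


# Made-up sample data, for illustration only.
SAMPLE = [
    ("c1", "students", True),
    ("c2", "students", False),
    ("c3", "retirees", False),
    ("c4", "students", True),
    ("c5", "retirees", False),
    ("c6", "retirees", True),
    ("c7", "retirees", False),
]

if __name__ == "__main__":
    for segment, rate in sorted(late_payment_risk(SAMPLE).items()):
        print(segment, round(rate, 2))
```

Because the mapper emits (late, total) count pairs rather than a finished rate, partial results from different nodes can be summed before the final division, which is what lets this kind of aggregation scale out.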

Credit Card Late Payment Risk

Card Activation

Authorized Name Match

CVV Validation

Fraud Risk Identification Services

Internal Fraud Monitoring

FICO Card Usage Anomaly Prediction

Customer Churn Analysis

Discover Customer Segments in Real-Time

Discover Churn Rate of Each Segment in Real-Time

Dynamic Customer Retention Policies per Segment
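One way to read "in real-time" here is incremental aggregation: per-segment counters are updated as each customer event arrives, so the churn rate is always current without rescanning history. The sketch below illustrates that idea under stated assumptions. The class name, the event shape, the 0.3 threshold, and the policy labels are hypothetical and do not come from the talk.

```python
from collections import defaultdict


class SegmentChurnTracker:
    """Keeps running churn counts per segment (illustrative sketch only)."""

    def __init__(self, alert_threshold: float = 0.3) -> None:
        # Assumed threshold above which a retention policy is triggered.
        self.alert_threshold = alert_threshold
        self._customers = defaultdict(int)
        self._churned = defaultdict(int)

    def observe(self, segment: str, churned: bool) -> None:
        """Update the counters with one customer outcome."""
        self._customers[segment] += 1
        if churned:
            self._churned[segment] += 1

    def churn_rate(self, segment: str) -> float:
        """Return the current churn rate, or 0.0 for an unseen segment."""
        total = self._customers.get(segment, 0)
        if total == 0:
            return 0.0
        return self._churned.get(segment, 0) / total

    def policy(self, segment: str) -> str:
        """Pick a (hypothetical) retention policy from the current rate."""
        if self.churn_rate(segment) >= self.alert_threshold:
            return "retention_offer"
        return "no_action"


if __name__ == "__main__":
    tracker = SegmentChurnTracker(alert_threshold=0.5)
    for churned in (True, False, False):
        tracker.observe("prepaid", churned)
        print(tracker.churn_rate("prepaid"), tracker.policy("prepaid"))
```

The policy decision changes as the counters move, which is the "dynamic" part of the retention bullet. A production system would also need time windows and persistent state, both of which this sketch leaves out.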

Product Recommendations

Discover Product Preferences Per Segment in Real-Time

Apply Personalization Strategies Per Segment Dynamically

Ad Targeting

Discover Web Usage Behavior Per Segment in Real-Time

Apply Ad Targeting Policies Per Segment in Real-Time

MMM… I could go for some Pizza Tonight

Plan to Shop for Clothes this Weekend… Wanna Join?

Going Berserk over my new iPad

Going on a Long Drive to Uncle John’s this Friday

POS Transaction Analysis

Discover Customer Segments in Real-Time

Discover Shopping Patterns per Segment in Real-Time

Dynamically Manage Promotion Policies per Segment

Real-Time Inventory Control

Real-Time Purchasing

Real-Time Warehouse Stock Transfers

Real-Time Cash Management

BOUTIQUE

FASHION STORE

SPORTING GOODS

APPAREL STORE

CLOTHING STORE
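Shopping patterns per segment, as named in the POS transaction list above, are often approximated by counting which items are bought together. The sketch below uses the well-known "pairs" co-occurrence approach on a single machine. It is an assumption about what such an analysis could look like, not the method from the talk, and the basket layout, segment names, and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, List, Tuple

# Hypothetical transaction layout: (segment, list of items in one basket)
Transaction = Tuple[str, List[str]]


def pair_counts_by_segment(transactions: Iterable[Transaction]) -> Dict[str, Counter]:
    """Count how often each unordered item pair shares a basket, per segment."""
    counts: Dict[str, Counter] = {}
    for segment, basket in transactions:
        # Deduplicate and sort so each pair has one canonical form.
        items = sorted(set(basket))
        counter = counts.setdefault(segment, Counter())
        for pair in combinations(items, 2):
            counter[pair] += 1
    return counts


def top_pairs(counts: Dict[str, Counter], segment: str, n: int = 3):
    """Return the n most frequent item pairs for a segment."""
    return counts.get(segment, Counter()).most_common(n)


# Made-up sample data, for illustration only.
SAMPLE = [
    ("families", ["milk", "bread", "eggs"]),
    ("families", ["milk", "bread"]),
    ("singles", ["pizza", "soda"]),
    ("families", ["bread", "eggs"]),
]

if __name__ == "__main__":
    counts = pair_counts_by_segment(SAMPLE)
    print(top_pairs(counts, "families"))
```

In a MapReduce setting the inner loop would be the mapper (emitting one count per segment and pair) and the counter update would be the reducer. Frequent pairs per segment could then feed the promotion policies mentioned in the list.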

Predict Network Anomalies

Why Do Hadoop Challenges Arise?

Hadoop Continues to Evolve

Lack of a standardized implementation infrastructure – too much breadth

A Huge Clash of Technologies – A Big Muddle!

Business Intelligence, Statistics, Data Mining, Machine Learning, Analytics, Data Warehousing, Distributed Computing (Hadoop), Cloud, Computer Visualization, Natural Language Processing

Lack of Relevant Talent to Harness Big Data Technologies

List of Challenges

Stream Analysis: Develop, Drill and Standardize

Difficult to Standardize Adoption Across the 3 Vs (Volume, Velocity, Variety)

Adapting MapReduce dynamics and Hardware to generate Smart Data – clusters, configurations

Too Much Focus on Big Players

Full Resource Optimization not Guaranteed

Big Data – Effective Game Plan

Conviction in Mind

Analyses Required – Think Small for Big Data and Don’t Expect Too Much

Hire the Competence – rigorous process

Focus on Data Revelations for Some Time

To Hadoop or not to Hadoop? To Cloud or not to Cloud? – Technology Compromise

Ensure Smart Data Validity

Merge Infographics with Dashboards

Be on your Guard all the time

THANK YOU!

Comments and Questions?

Tariq Mahmood

Email: [email protected]