inmobi-technology
DESCRIPTION
In today's world, Big Data is the buzzword, and enabling BI on it seems like a dream come true. Traditionally, BI systems have run on RDBMS and embraced the star model to enable DWH queries. Imagine enabling the same for data lying in Hadoop clusters along with RDBMS, and lowering the barrier for business users to work with this data. The slides are built around this theme.
Who I am – Rohit Chatter
Big Data – Opportunities & Challenges
agenda
What’s Inside – 10,000 Feet
Use Cases
The Big Data Product
Big Data - Preview
Rohit Chatter was a Senior Architect at Yahoo! in the Advertiser and Data Platform group. He is now Principal Architect - Analytics at inMobi.
He is a thought leader specializing in designing solutions involving huge amounts of data. He architected the Paid Search BI stack for the Microsoft-Yahoo alliance, which uses Hadoop, Hive, GraphDB & HBase.
He has deep knowledge and understanding of various usage models involving traditional databases and newer Big Data platforms to provide customer-centric and cost-effective solutions.
He has spent 17 years in the industry. Before joining inMobi, he worked for companies such as Yahoo!, Tivo, Alcatel Lucent and TCS. Some of his recent projects include BI solutions for Paid Search Advertiser Analytics, Partner Analytics and Web Analytics.
Business Domain: Web Analytics, Search Advertising Analytics, Publisher Analytics
Technology: Hadoop, Hive, HBase, RDBMS, BI tools & technology, Data Modeling
[email protected]@TDWI Bangalore Chapter
Panel Member @ Hadoop The Fifth Elephant
Today’s Dynamic World
“Information is the oil of the 21st century,
...and analytics is the combustion engine.”
“Unfortunately, we spend 80% of the time collecting data and 20% analyzing it.”
“With increasing importance of precise and timely insights, analysts want to be able to deliver accurate data reports quickly.”
Big Data
Business Problems
Scale
• Data growth with time
• Granularity needed for right business decisions
Data Reach
• Ease of data access
• Distance between data and business
• One-time reports for investigation or validation of analysis
Reprocessing
• Data reprocessing becomes a nightmare
• IT always in catch-up mode
Timely Insights
• Data acquisition to insight – in time
Low Flexibility for New Reports & Dashboards
• Add new dimensions and metrics with complex business rules
• Modify reports
• New dashboards
Engineering Involvement
• Huge dependency on the IT/BI team on a day-to-day basis
IT/BI Business
BI Framework on Hadoop
• Custom Reports & Dashboards
• Canned & schedule-based reports
• Cubes (Yes!! On Hadoop)
• Pivot interface for Visualization & Dashboards
STAR Model on Hadoop
• Define Entities & Relationships
• Define complex metrics
• Define dimensions
Data to Analytics - Improved SLAs
• Significantly reduces time to analytics from the time data is acquired
Single Sign On
What should Big Data BI have?
Analytics, Dashboards & Reports
• Business grouping of reports
• Report Designer
• Dashboard Builder
• Ad hoc Analysis
Scalable & Pluggable architecture
• Any source: HBase, Solr, GraphDB, Pig, Shark, Impala, Hive, Oracle, MySQL
Data Re-processing – Simplified
• All data processing happens on the grid and stays on the grid
Security
• Report & data access are managed via roles
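The star-model items above (entities and relationships, dimensions, metrics) can be illustrated with a minimal sketch. It uses SQLite purely as a stand-in so it runs anywhere; the slides target Hadoop, where a query of this shape would more likely be written for Hive or a similar engine. All table names, column names and data values here are hypothetical and not taken from the slides.

```python
import sqlite3

# In-memory database as a stand-in for the real warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimensions: the entities that facts are sliced by (hypothetical).
    CREATE TABLE dim_campaign (campaign_id INTEGER PRIMARY KEY, advertiser TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, month TEXT);

    -- Fact table: one row per campaign per date, keyed to the dimensions.
    CREATE TABLE fact_ad_events (
        campaign_id INTEGER REFERENCES dim_campaign(campaign_id),
        date_id     INTEGER REFERENCES dim_date(date_id),
        impressions INTEGER,
        clicks      INTEGER
    );

    INSERT INTO dim_campaign VALUES (1, 'Acme'), (2, 'Acme'), (3, 'Globex');
    INSERT INTO dim_date VALUES (1, '2013-01'), (2, '2013-02');
    INSERT INTO fact_ad_events VALUES
        (1, 1, 1000, 10), (2, 1, 3000, 30), (3, 1, 2000, 50),
        (1, 2, 1000, 30), (3, 2, 4000, 40);
""")

# A derived metric (click-through rate) aggregated over two dimensions.
STAR_QUERY = """
    SELECT c.advertiser,
           d.month,
           SUM(f.impressions)                     AS impressions,
           SUM(f.clicks)                          AS clicks,
           1.0 * SUM(f.clicks) / SUM(f.impressions) AS ctr
    FROM fact_ad_events f
    JOIN dim_campaign c ON c.campaign_id = f.campaign_id
    JOIN dim_date d     ON d.date_id = f.date_id
    GROUP BY c.advertiser, d.month
    ORDER BY c.advertiser, d.month
"""
rows = conn.execute(STAR_QUERY).fetchall()
for row in rows:
    print(row)
```

The point of the shape is that a new report only needs a different choice of dimensions and metrics over the same fact table, not a new pipeline.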
What should it do for you?
Simplify?
Data to Insights
Data Accessibility
Self Serve
New Dashboard
[Chart: BigData BI vs. Others, in hours (axis 0 to 25)]
INGEST DEFINE RELATIONSHIP VISUALIZE
Days
Stack
Media Industry
• Audience Engagement, User Value life cycle, User Behavior
• Ad Network – Campaign optimization, Better ROI, Brand Performance
• Exchange
E-commerce
• Recommendation engine
• Sentiment Analysis & Brand loyalty
Where can BigData BI help?
CHURN PREDICTION FOR A TELECOM OPERATOR
► Dependent variable to define attritors: a customer was defined as an attritor if they made fewer than 2 calls over a period of 3 months
► Logistic regression was used to develop a model equation to calculate an attrition propensity score for all customers
► Customer scores were used to rank customers into high, medium and low attritors
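The three steps above (label attritors, fit a logistic regression, rank customers by score) can be sketched as follows. This is a minimal illustration, not the model from the case study: the two features, the synthetic data, the training settings and the tier cut-offs are all assumptions introduced here, and only the "fewer than 2 calls in 3 months" label comes from the slide.

```python
import math
import random

def is_attritor(calls_in_3_months):
    """Label from the slide: fewer than 2 calls over 3 months means attrition."""
    return 1 if calls_in_3_months < 2 else 0

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def propensity(weights, bias, features):
    """Attrition propensity score in (0, 1) for one customer."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)

def fit_logistic(X, y, learning_rate=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for features, label in zip(X, y):
            error = propensity(weights, bias, features) - label
            for j in range(d):
                grad_w[j] += error * features[j]
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias

def rank_tier(score, medium_cut=0.33, high_cut=0.66):
    """Bucket a score into high / medium / low. Cut-offs are hypothetical."""
    if score >= high_cut:
        return "high"
    if score >= medium_cut:
        return "medium"
    return "low"

def make_synthetic_customers(n=200, seed=0):
    """Hypothetical data: prior usage and prior complaints, both scaled to 0..1."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        usage = rng.random()
        complaints = rng.random()
        # Synthetic outcome window: low usage plus many complaints -> no calls.
        calls = 0 if complaints - usage + rng.gauss(0, 0.1) > 0 else 5
        X.append([usage, complaints])
        y.append(is_attritor(calls))
    return X, y

X, y = make_synthetic_customers()
weights, bias = fit_logistic(X, y)

risky = propensity(weights, bias, [0.1, 0.9])  # low usage, many complaints
loyal = propensity(weights, bias, [0.9, 0.1])  # high usage, few complaints
print(rank_tier(risky), rank_tier(loyal))
```

In practice a library implementation would normally replace the hand-written training loop; it is spelled out here only to keep the sketch dependency-free.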
Based on the model, customers were proactively targeted with a marketing offer, which reduced attrition and resulted in $2.3 MM incremental volume.
                            Predicted value
Observed value              Likelihood for attrition   Likelihood for no attrition   Total
Customers on Attrition      8,422                      1,824                         10,246
Customers on No attrition   1,708                      14,012                        15,720
Total                       10,130                     15,836                        25,966
The statistical model performed 2.67% better than random prediction
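The counts in the table above can be turned into the usual classification metrics. The sketch below is plain arithmetic on the four cells; the choice of metrics is an assumption made here, and the slides do not say how the "better than random" figure was derived, so that figure is not reproduced.

```python
# Cell counts taken from the confusion matrix on the slide.
true_pos = 8422    # observed attrition, predicted attrition
false_neg = 1824   # observed attrition, predicted no attrition
false_pos = 1708   # observed no attrition, predicted attrition
true_neg = 14012   # observed no attrition, predicted no attrition

total = true_pos + false_neg + false_pos + true_neg

# Share of all customers classified correctly.
accuracy = (true_pos + true_neg) / total
# Of the customers predicted to attrite, the share that actually did.
precision = true_pos / (true_pos + false_pos)
# Of the customers that actually attrited, the share the model caught.
recall = true_pos / (true_pos + false_neg)
# Share of attritors in the whole population, for comparison.
base_rate = (true_pos + false_neg) / total

print(f"total={total}")
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} base_rate={base_rate:.3f}")
```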
BUSINESS IMPACT
SOLUTION APPROACH
Identify the risky customers and develop focused strategies to retain them.
Customer Life Time Value
Segmentation:
The natural segmentation conducted through K-Means clustering showed 4 distinct segments:
S1 – Utility Customers
S2 – Premium and Loyal Customers
S3 – Premium and careful
S4 – Service shy
The final cluster comprised low-value customers, though the number of customers in that segment was high.
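The K-Means segmentation described above can be sketched in a few lines. This is an illustration only: the two features (spend and service visits), the data points and the farthest-point initialization are assumptions chosen here to keep the example deterministic, and none of them come from the case study.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_farthest_point(points, k):
    """Deterministic start: first point, then repeatedly the farthest point."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
        for p in points
    ]

def kmeans(points, k, max_iters=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = init_farthest_point(points, k)
    for _ in range(max_iters):
        labels = assign(points, centroids)
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(v) / len(members) for v in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assign(points, centroids)

# Hypothetical customers: (annual spend, service visits), four tight groups.
offsets = [(0, 0), (0.5, 0), (0, 0.5), (-0.5, 0), (0, -0.5)]
group_centers = [(1, 1), (1, 9), (9, 1), (9, 9)]
customers = [(cx + dx, cy + dy) for cx, cy in group_centers for dx, dy in offsets]

centroids, labels = kmeans(customers, k=4)
print(labels)
```

The number of clusters (4, matching S1 to S4) is supplied by the analyst; K-Means does not discover it on its own.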
SOLUTION APPROACH
► Behavioral change among the customers falling in the two groups of interest represented over $12 million of revenue to be gained annually
► The CLTV was compared with marketing investment per customer to assess the viability of customer acquisition. The organization was able to reduce marketing investment by 35% and increase revenue by 43%.
BUSINESS IMPACT
SAMPLE OUTPUT: LABOR REVENUE FROM 4 SEGMENTS
[Chart: Labor Revenue (axis 0 to 2000) vs. Total Revenue Share (axis 0% to 45%) for segments S1, S2, S3 and S4]
Customer Life Time Value
The NPV method was employed for calculating the Customer Life Time
Value (CLTV).
A CLTV model was built for each segment, and the CLTV of each customer was calculated.
Based on the CLTV values, customers were further segmented as: High value, Moderate value and Low value.
SAMPLE OUTPUT: Top 10 customers of S2 Segment CLTV
Develop targeted marketing programs for high potential/high value clients
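The NPV-based CLTV calculation and the High / Moderate / Low value tiers described above can be sketched as follows. The slides name only the method, so the discount rate, the projected cash flows, the customer names and the tier cut-offs below are all hypothetical.

```python
def npv(discount_rate, cash_flows):
    """Net present value of cash flows received at the end of years 1, 2, ..."""
    return sum(
        cf / (1 + discount_rate) ** year
        for year, cf in enumerate(cash_flows, start=1)
    )

def value_tier(cltv, moderate_cut=500.0, high_cut=1500.0):
    """Bucket a CLTV into the three tiers. Cut-offs are hypothetical."""
    if cltv >= high_cut:
        return "High value"
    if cltv >= moderate_cut:
        return "Moderate value"
    return "Low value"

# Hypothetical projected net margin per customer for the next two years.
projected_margin = {
    "customer_a": [110.0, 121.0],
    "customer_b": [1100.0, 1210.0],
    "customer_c": [550.0, 605.0],
}
DISCOUNT_RATE = 0.10  # assumed annual discount rate

cltv = {name: npv(DISCOUNT_RATE, flows) for name, flows in projected_margin.items()}

# Rank customers by CLTV, highest first, and attach a tier to each.
ranking = sorted(cltv, key=cltv.get, reverse=True)
for name in ranking:
    print(name, round(cltv[name], 2), value_tier(cltv[name]))
```

A "top 10" list such as the sample output mentioned on the slide would simply be the first ten entries of a ranking like this, restricted to one segment.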
Thank [email protected]