Business Situation
The client sources its market research data from one of the largest market
research companies, which provides around 34 trillion data points of relevant
market research data, made available to the client through a proprietary tool.
The client's market research department would then work through this data for
about two months to manually create a 220-tab Excel report. The analytics and
calculations on the raw data were done by hand, using tools provided by the
market research company along with other tools. Marketers and brand managers
used the report to make business decisions.
This approach was very inefficient: valuable time was lost before the report was
published. Most of the process was manual, leaving room for human error. The
reports were shared by email and as physical documents, which limited flexibility
and availability. Creating new reports was extremely time-consuming and
expensive. The client also had to source some of the data from one of the largest
market research companies, which usually turned out to be expensive.
Solution Approach
We received the data from the market research company as raw database files and
processed it to generate a macro-level database in the form of OLAP cubes, served
to end users through iCharts, a web-based advanced data visualization platform.
Because this huge data set had to be processed and analyzed, the data was
processed in parallel through a MapReduce implementation.
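The parallel processing step can be illustrated with a minimal sketch. It is not the implementation used in the project: the row layout, the sample values, and the function names (`split`, `map_partition`, `reduce_partials`, `run`) are all assumptions made for illustration, and local threads stand in for the separate worker machines. It only shows the general map-then-reduce shape of rolling SKU-level rows up to macro-level totals.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical SKU-level rows: (sku, brand, market, units_sold).
ROWS = [
    ("SKU-001", "BrandA", "North", 120),
    ("SKU-002", "BrandA", "North", 80),
    ("SKU-003", "BrandB", "North", 50),
    ("SKU-001", "BrandA", "South", 200),
    ("SKU-003", "BrandB", "South", 70),
    ("SKU-004", "BrandB", "South", 30),
]


def split(rows, n_parts):
    """Split the rows into n_parts partitions (round-robin)."""
    return [rows[i::n_parts] for i in range(n_parts)]


def map_partition(rows):
    """Map step: roll one partition up from SKU level to (brand, market)."""
    partial = Counter()
    for _sku, brand, market, units in rows:
        partial[(brand, market)] += units
    return partial


def reduce_partials(partials):
    """Reduce step: merge the per-partition totals into one result."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # adds counts key by key
    return dict(total)


def run(rows, n_workers=3):
    """Process each partition in its own worker, then merge the results."""
    # Threads stand in for separate worker machines in this sketch.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(map_partition, split(rows, n_workers)))
    return reduce_partials(partials)


totals = run(ROWS)
for (brand, market), units in sorted(totals.items()):
    print(brand, market, units)
```

Because the map step only sums within a partition and the reduce step only adds partial sums, the result does not depend on how many workers the rows are split across.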
Big Data : Market Research
Client Profile
Beverages
The client is a division of the world's largest beverage company. Globally,
through the world's largest beverage distribution system, consumers in more than
200 countries enjoy their sparkling beverages, ready-to-drink coffees, juices,
juice drinks and beverages at the rate of 1.8 billion servings a day!
You can read more about Compassites at www.compassitesinc.com
Reach out to us at +91 - 80- 4203 2572 or +91 - 80- 6500 2371, or via email.
The raw data files provided by the research firm came in a highly complex,
proprietary format and amounted to around 130 GB. We had to work with the data at
the SKU level to derive some of the KPIs required by the client, and the reports
had to be generated for highly complex groupings of products and markets. The
technical solution was designed as follows.
Custom-written decoders decode the complex data and load it into tables in a MySQL database as readable raw data.
MapReduce was used to split the data across multiple EC2 instances for massively parallel processing. Each EC2
instance executes a pipeline of Linux and MySQL processes on the raw data to reduce it to macro-level data suitable
for analytical processing, and loads the result into the star schema. This data is then loaded into OLAP cubes for
analytical processing.
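To make the star schema step concrete, here is a small hedged sketch. The table and column names (`fact_sales`, `dim_product`, `dim_market`) and the sample rows are hypothetical, not taken from the project, and an in-memory SQLite database stands in for MySQL so the example runs without a server. It shows one fact table keyed to two dimension tables, and the kind of grouped roll-up an OLAP cube would precompute.

```python
import sqlite3

# In-memory SQLite is an assumption of this sketch; the case study used MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, sku TEXT, brand TEXT);
CREATE TABLE dim_market  (market_key  INTEGER PRIMARY KEY, market TEXT, region TEXT);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    market_key  INTEGER REFERENCES dim_market(market_key),
    units       INTEGER
);
""")

# Hypothetical dimension rows: products grouped by brand, markets by region.
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)", [
    (1, "SKU-001", "BrandA"),
    (2, "SKU-002", "BrandA"),
    (3, "SKU-003", "BrandB"),
])
conn.executemany("INSERT INTO dim_market VALUES (?, ?, ?)", [
    (1, "Pune", "West"),
    (2, "Mumbai", "West"),
    (3, "Bangalore", "South"),
])

# Hypothetical macro-level facts: (product_key, market_key, units).
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", [
    (1, 1, 100),
    (2, 2, 50),
    (3, 1, 30),
    (1, 3, 70),
    (3, 3, 20),
])


def units_by_brand_and_region(connection):
    """Roll the fact table up along the product and market dimensions."""
    query = """
        SELECT p.brand, m.region, SUM(f.units)
        FROM fact_sales f
        JOIN dim_product p ON p.product_key = f.product_key
        JOIN dim_market  m ON m.market_key  = f.market_key
        GROUP BY p.brand, m.region
        ORDER BY p.brand, m.region
    """
    return connection.execute(query).fetchall()


rollup = units_by_brand_and_region(conn)
for brand, region, units in rollup:
    print(brand, region, units)
```

Keeping the grouping attributes in dimension tables means a new grouping of products or markets only needs a different `GROUP BY`, not a reload of the facts.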
Technology Used
Adobe Flex, Adobe LiveCycle Data Services, J2EE, MySQL and Pentaho
Benefits & Results
Fully automated process, removing the manual steps that were prone to human error.
Quick turnaround of reports: the monthly data is now available to managers within 3 days.
Highly organized dashboard with advanced data visualization capabilities such as filtering.
Access-controlled dashboard that gives selective access to different users.
Individual reports can be downloaded and shared with management.
Filtered raw data is available, so new reports can be created in a short time and at a fraction of the cost.
Intermediate database allows data from other sources to be merged into an integrated report.
Built on an open source technology stack, saving substantially on software licenses.
The whole process runs on the elastic cloud of Amazon EC2, which is instantiated only for the duration of the process, reducing the need for permanent infrastructure to process the data.
Pune Center
608 - Kapil Zenith Building, Near Chandani Chowk,
Off Bangalore - Mumbai bypass, Bavdhan, Pune - 411021
Maharashtra, India
Phone: +91-20-6500 2371
Bangalore Center
"Gaayatthri Chambers", #948, 24th Main Rd, 2nd Phase,
J.P.Nagar, Bangalore - 560078 Karnataka, India
Phone: +91-80-4203 2572