Real-Time Inventory Tracking with Big Data Tools
Andy McNalis
Big Data / Data Warehouse Administration
Intro Data Warehouse and Big Data
• MPP – Massively Parallel Processing
• Multiple computers interconnected together (the interconnection and
communication is the secret sauce)
• Large amounts of data divided up and stored across this network of
computers.
• When an inquiry is made of the data, the work is split between all these
machines and run simultaneously, in parallel.
• Answer is consolidated and delivered back to the requestor.
• Splitting the data and the parallelization of the processes are abstracted
from the user.
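The split, parallel-run, and consolidate steps above can be sketched in a few lines. This is only an illustration under stated assumptions: threads stand in for the separate machines, and the names `partition`, `local_total`, and `query_total` are hypothetical, not part of any product named in this deck.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(rows, n_nodes):
    """Divide rows across n_nodes round-robin (stand-in for data distribution)."""
    return [rows[i::n_nodes] for i in range(n_nodes)]

def local_total(node_rows, sku):
    """Work done on one node: total quantity for one SKU in its own slice."""
    return sum(qty for row_sku, qty in node_rows if row_sku == sku)

def query_total(rows, sku, n_nodes=4):
    """Send the same question to every node at once, then consolidate the answers."""
    slices = partition(rows, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(lambda s: local_total(s, sku), slices))
    return sum(partials)

# Hypothetical (sku, quantity) rows; the caller never sees the partitioning.
sales = [("A", 2), ("B", 1), ("A", 5), ("C", 7), ("A", 1), ("B", 4)]
total_a = query_total(sales, "A")
```

The caller only sees `query_total`; how the rows are split and how the partial answers are merged stays hidden, which is the abstraction the last bullet describes.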
Data Warehousing Tools and Products
DATA WAREHOUSE
Traditional
Proprietary
High costs – licensing, support, etc.
Teradata
• Market leader
• Solution includes software and hardware
• Works as advertised
• Costly to scale
Greenplum (EMC/Pivotal)
• Software-only option (but they do have an appliance)
• Offered MapReduce
• Columnar projects
Netezza (IBM)
Big Data Tools and Products
BIG DATA
Open source
Many contributors
Less expensive to deploy as compared to traditional Data Warehouse vendors
Hadoop
• HDFS Hadoop Distributed File System
• Map/Reduce processing framework
• Batch centric
• Becoming more database like
• Scales easily and inexpensively
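As a rough illustration of the Map/Reduce processing model listed above, the sketch below runs the map, shuffle, and reduce steps in a single process. It does not use Hadoop itself; the record layout `(store, sku, qty)` and all function names are assumptions made for the example.

```python
from collections import defaultdict

def map_phase(record):
    """Map: emit a (key, value) pair for each input record."""
    store, sku, qty = record
    yield sku, qty

def shuffle(pairs):
    """Shuffle: group all values that share a key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)

def run_job(records):
    """Run the three phases in order and return {sku: total quantity sold}."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())

# Hypothetical sales records: (store, sku, quantity)
totals = run_job([("s1", "A", 2), ("s2", "A", 3), ("s1", "B", 1)])
```

In a real cluster the map and reduce calls run on many machines over blocks stored in HDFS, and the whole input is processed in one pass, which is why the model is described as batch centric.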
NoSQL
• HBase
• Cassandra
• MongoDB
• Real-time analytics
Right Tool for the Right Job
Challenge of Achieving Big Data ROI
Enterprise practitioners believe the
potential value of big data is significant
However, many struggle to derive
maximum value from their big data
investments
• 46% -- Only partial value from
their Big Data deployments
• 2% -- No value achieved
Source: Enterprises Struggling to Derive Maximum Value from Big Data, Wikibon, Sep 2013
http://wikibon.org/wiki/v/Enterprises_Struggling_to_Derive_Maximum_Value_from_Big_Data
Compelling reasons for this struggle to
achieve maximum business value from
big data…
1. A lack of skilled Big Data
practitioners
2. Perceived as "Raw" and
relatively immature technology
3. A lack of compelling
business use case
According to Wikibon
Making Business Decisions Quickly
Ability to create value from data by being able to process and store
large volumes of data from disparate sources
Hadoop enables analytics, deep analytics, and real-time analytics to
make business agile
We recommend using Hadoop to create an Enterprise Data Hub
where all data is sourced once and re-used across the enterprise
Hadoop for Big Data
Real-Time Inventory Management
Real-Time Analytics with Cassandra
By adding Hadoop and Cassandra to a
traditional environment, Business Intelligence teams
are able to provide more accurate, real-time
inventory, pricing, sales and return data, and to
predict ideal floor plans.
Managing inventory with up-to-the-second data...
In-Store Purchases
Online Purchases
Real-time inventory data ensures that items ordered are in stock.
Real-Time Analytics with Cassandra
POS data was stored in different formats in different
legacy systems (Mainframe and Teradata)
No single version of truth
No real-time capability
Inventory
Batch File Sent
ONCE A DAY
CHALLENGE
This latency resulted in potential loss of sales and customer
dissatisfaction when customers ordered items that were no longer in stock.
POS Volume
Average 100,000 messages per day
Peak 77,000 messages in 1 hour at
4:00am the day after Thanksgiving
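To put the two volume figures on the same scale, the short calculation below converts both to messages per second. It uses only the numbers quoted on this slide and assumes, for simplicity, that the daily average is spread evenly over 24 hours.

```python
# Figures quoted on the slide
avg_per_day = 100_000    # average POS messages per day
peak_per_hour = 77_000   # peak POS messages in one hour

# Convert both to a per-second rate (assumes an even spread within each window)
avg_per_sec = avg_per_day / (24 * 60 * 60)
peak_per_sec = peak_per_hour / (60 * 60)

# How many times the average rate the peak hour runs at
peak_to_average = peak_per_sec / avg_per_sec

print(f"average: {avg_per_sec:.2f} msg/s")
print(f"peak:    {peak_per_sec:.1f} msg/s")
print(f"peak is {peak_to_average:.2f}x the average rate")
```

On these figures the peak hour runs at roughly 18 times the average rate, so a real-time pipeline has to be sized for the burst rather than for the daily average.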
Real-Time Analytics with Cassandra
SOLUTION – Phase 1
Condense all POS data from different legacy
systems and applications into Hadoop
Enterprise Data Hub
Create a Single Version of Truth
Hadoop enables a single version of truth for deep analytics,
but there is still no real-time capability…
Real-Time Analytics with Cassandra
SOLUTION – Phase 2
Use Cassandra to extract messages
from POS queue for real-time
processing
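The idea of processing POS messages as they arrive, rather than in a daily batch, might look something like the sketch below. It is a simplified stand-in: a plain dictionary plays the role of the real-time inventory store, a `deque` plays the role of the POS queue, and the message fields (`type`, `sku`, `qty`) are hypothetical. No actual Cassandra or queue API is used.

```python
from collections import deque

def apply_message(inventory, msg):
    """Apply one POS message: a sale lowers stock, a return raises it."""
    delta = -msg["qty"] if msg["type"] == "sale" else msg["qty"]
    inventory[msg["sku"]] = inventory.get(msg["sku"], 0) + delta

def drain(queue, inventory):
    """Process every queued message immediately; return how many were applied."""
    processed = 0
    while queue:
        apply_message(inventory, queue.popleft())
        processed += 1
    return processed

def in_stock(inventory, sku, qty=1):
    """Check current availability before accepting an order."""
    return inventory.get(sku, 0) >= qty

# Hypothetical starting stock and POS messages
inventory = {"A": 3}
pos_queue = deque([
    {"type": "sale", "sku": "A", "qty": 2},
    {"type": "sale", "sku": "A", "qty": 1},
    {"type": "return", "sku": "A", "qty": 1},
])
handled = drain(pos_queue, inventory)
```

Because stock is updated per message, an order check such as `in_stock(inventory, "A", 2)` reflects sales made seconds ago instead of yesterday's batch file.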
Real-Time Analytics with Cassandra
SOLUTION – End-to-End
Messages are sent from Cassandra to
Hadoop for back-end, deep analytics.
Real-Time Analytics with Cassandra
Faster decision making…
Business Intelligence Teams
are able to provide more
accurate and real-time
inventory, pricing, sales and
return data.
BEFORE Cassandra Real-Time Solution: Inventory Batch File Sent Once a Day
AFTER Cassandra Real-Time Solution: Inventory Data Sent in Minutes/Seconds
RESULT
Real-Time Analytics with Cassandra
Increased sales by improving item
availability.
Value for the Organization
Increased customer satisfaction
because customers are able to get
what they ordered.
Real-Time Analytics with Cassandra
Value for the Organization
Cost savings from reduced
customer service center calls.
Aha Moments
Cost savings from reduced truck
load times.
Additional Components
Single Version of Truth
Hadoop Enterprise
Data Hub gives
business users access
to more data from
more sources for deep
analytics.
Enterprise Data Hub
Unique Challenges
Firewall Issues
Normally, Storm or Kafka can be
used to send POS messages to
Cassandra.
In certain situations where a firewall
exists between the data source and the
processing cluster (for example, one created
by a merger or spin-out), both
Storm and Kafka can be used to
send messages over the firewall.
Unique Challenges
Real-Time Over Firewall
Data Driven Decision Making
Advanced Analytics
Inventory forecasting with
Machine Learning on data from
Weather Reports
Once the Hadoop / Cassandra framework is in place, data
from virtually any source can be consumed in the Enterprise
Data Hub for Advanced Analytics.
New ways to use Social, Geo,
Sensor data to develop
predictive models…
KEY TAKEAWAYS
Achieving Big Data Success
Bring IT and Business together
Understand how Hadoop will fit into your environment
Ask “what are you really trying to accomplish?”
See the end results first before you start your journey
Define realistic success criteria
Discover your big data use case!
Hadoop Ecosystem for Big Data ROI
Big Data Business Wins
Enterprise Data Hub and single version of truth for all data
Hadoop can help you answer questions that were difficult
or cost prohibitive to answer before
Hadoop can transform your organization’s approach to
how you use data and ask questions you never even
thought of
Must have a clear strategy and
long-term plan
Leverage the right partnerships to
achieve your goals
Questions?
Q & A
For further information
email:
visit: www.metascale.com
MetaScale is a company of Sears Holdings
Thank You