#AnalyticsX Copyright © 2016, SAS Institute Inc. All rights reserved.
Visual Analytics and Hadoop
Rosie Poultney, VP Analytics, 89 Degrees
About the presenter
Rosie Poultney, VP Analytics, 89 Degrees
30 years in analytics, mostly marketing analytics using SAS
I focus on giving people appropriate tools to solve business
problems. I believe that increasing data availability throughout
an organization, whether standard reporting or advanced
analytics, leads to better business decisions
Twitter: @RosiePoultney
Visual Analytics and SAS/ACCESS® Interface for Hadoop
Improving Efficiency and Increasing Analyst Satisfaction
Why tell this story?
• Demand for analysts is outpacing supply, and this is likely to continue
• We have changed the way our analytics are delivered
  - Visual Analytics was implemented to reduce report development time, and is now a collaboration tool
  - Hadoop was cost-effective storage, and now the SAS/ACCESS Interface for Hadoop supports faster, better analytics
• It’s not just about implementing software; you have to change the business to make the best use of resources
Visual Analytics
Business Intelligence checklist
Provide more people with easy and secure access to trusted, relevant data to enable better business decisions focused around the customer
• Enterprise-wide, scalable, different types of user, enables standard reports and ad-hoc querying
• Web-based, intuitive interface. Able to create and edit reports quickly. Email alerts
• Up-to-date information using agreed definitions and metrics. Single source of truth for the organization
• Reports and metrics are integrated into the planning process. Consistency of baseline information
• Links together all customer interactions (e.g. purchases, browsing and social, communications) to create customer-driven metrics of success
• Credential-driven permissions
Proving the case for Visual Analytics
Started Q4 2013
Non-distributed instance on AWS
Built a community of users among analysts and BI specialists
Solution provided reporting for three clients
Further details of the environment in Pasion and Aanderud, SASGF 2015
Moved to our own hardware
4-node distributed system, each node with 16 cores, running in a virtualized Linux environment
POC provided justification and fueled demand
Wide user base of report viewers and builders
Faster report development
Easy data access for non-SAS coders
Growing the user base
• Find advocates, and tackle their pain points
• Share exploratory results using Visual Analytics
  - Quicker than creating Excel charts and easier to make changes
  - Focus on the insights earlier in the process
• Understand how the reports will ultimately be used, and design accordingly
  - KPI dashboard vs data extraction tool
  - Include data for common filters
Make it presentation ready
• Profiling template used by our internal teams
  - Change one calculated variable and 30+ charts and tables automatically update
• Users were exporting data and recreating charts
  - Matched required format
  - Able to use screenshots
  - Request for ‘no title’
Increased use led to a better product
[Diagram: virtuous circle]
• Easier access to data
• Wider user base
• Focus on implications
• Analysis for complex questions
• Improved datasets
Create familiarity: around 100 standard reports, across multiple clients
Increase utility: summary tables, and training, to answer ad-hoc questions (“How many customers…?”)
Collaboration
Exploring a new dataset - collaboration
• Easy histograms!
• Typical requests
  - Remove annual spend over $1,000
  - Just customers joining via an in-store event
  - Exclude new markets
Caution: choose participants wisely – pick people who like playing with data!
Adjustable spend and visit bands
Create RFM bands using custom categories
• Focus on strategic questions earlier
  - Where do my best customers shop?
  - What are their retention rates and how should we incentivize them?
  - How do they respond to email?
  - Who uses free shipping?
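For readers who want to reproduce this kind of banding in code rather than through Visual Analytics custom categories, the following is a minimal sketch in Base SAS. The dataset, variable names, and band cutoffs are all hypothetical and chosen only for illustration; they are not taken from the presentation.

```sas
/* Hypothetical input: one row per customer. All values are made up. */
data customer_summary;
  input customer_id annual_spend visits days_since_purchase;
  datalines;
1 1250 14 20
2 310 3 95
3 45 1 400
;
run;

/* Illustrative band cutoffs; changing a cutoff here re-bands every report below */
proc format;
  value spendband   low -< 100 = 'Under $100'
                    100 -< 500 = '$100 to under $500'
                    500 - high = '$500 and over';
  value visitband   low -< 2   = 'Under 2 visits'
                    2 -< 6     = '2 to 5 visits'
                    6 - high   = '6 or more visits';
  value recencyband low -< 31  = '0 to 30 days'
                    31 -< 181  = '31 to 180 days'
                    181 - high = 'Over 180 days';
run;

/* Count customers in each recency x frequency x monetary combination */
proc freq data=customer_summary;
  format annual_spend spendband. visits visitband. days_since_purchase recencyband.;
  tables days_since_purchase * visits * annual_spend / list missing;
run;
```

Keeping the cutoffs in formats rather than in derived columns means the bands can be adjusted in one place, which loosely mirrors the adjustable bands described on the previous slide.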
5 tips for increasing use of Visual Analytics
• Create standard reports to get people using the tool
• Understand how reports will be used
• Integrate Visual Analytics into the analytic process
• Understand the hot topics for your audience, and build in
the ability to filter on these
• Build a system that can scale as needed
SAS/ACCESS Interface for Hadoop
The importance of purchase intent
Many retailers have high-ticket items combined with long consideration cycles, e.g. cars, furniture and appliances, high-end clothing
Models built on historic purchasing and demographics can miss key triggers
[Diagram: customers plotted by value (Low Value, High Value) and engagement (Low Engagement, High Engagement)]
Use browsing behavior and email response to quantify engagement and purchase intent
In a recent analysis, highly engaged but historically low-value shoppers were twice as likely to shop.
Identify trigger behavior…for more products
Tailor content to encourage the customer toward subsequent steps in the journey
Hadoop reduces the analytic cost
Not just ‘large’ purchases
[Chart: % site visitors and % 'large' purchase by page type (Category pages, Lookbooks, Buying guide)]
Linking Online and Offline Behavior
• Customers identify themselves by email
  - Unidentified customer browses 32 pages on website. A cookie is installed on their machine
  - 8 days later they receive an unrelated loyalty email and click through to the website. Customer id from email linked to cookie
  - Online behavior linked to online and in-store purchases
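The linking step described above can be sketched in PROC SQL. This is an illustrative sketch only: the table names (weblog, email_clicks, cookie_map, identified_weblog), the column names, and the sample rows are assumptions, not the presenter's actual schema.

```sas
/* Hypothetical page views keyed by cookie; the first visits are anonymous */
data weblog;
  input cookie_id $ page_url :$40.;
  datalines;
c001 /us/en/catalog/products/lookbook
c001 /us/en/catalog/categories/departments
c002 /us/en/home
;
run;

/* Hypothetical email click-throughs: the point where a cookie meets a customer id */
data email_clicks;
  input cookie_id $ customer_id;
  datalines;
c001 501
;
run;

proc sql;
  /* One row per cookie / customer pair */
  create table cookie_map as
  select distinct cookie_id, customer_id
  from email_clicks;

  /* Attach the customer id to every page view for that cookie,
     including views made before the customer was identified */
  create table identified_weblog as
  select m.customer_id, w.cookie_id, w.page_url
  from weblog as w
       inner join cookie_map as m
       on w.cookie_id = m.cookie_id;
quit;
```

Once page views carry a customer id, they can be joined to online and in-store purchase tables on that id. Cookies that never click through from an email (c002 here) stay unlinked and drop out of the inner join.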
Needed faster access to weblog data
• Weblog data managed by separate team
• Increasing volume of data requests
Analyst identifies data need → Analyst writes data request → Data team executes request → Data team publishes file → Analyst uses in analysis
One data request per project – ask for everything!
Hadoop POC using AWS
• In 2014, trialed Hadoop for storage
• Weblog data accessible to analysts through HiveQL
• Better control of timeline, but expensive to scale
Analyst identifies data need → Analyst queries AWS using HiveQL → Analyst transfers data to SAS → Analyst uses in analysis
SAS/ACCESS to Hadoop
• Weblog data queried directly from SAS
• Easier and faster data access allows use in more projects
Analyst identifies data need → Analyst codes in SAS / HiveQL → Analyst uses in analysis
Including web behavior in shopper models
• Simulated data for a fashion retailer. One million customers clicked through from an email to the website
  - Web behavior now linked to online and in-store purchases
• Likelihood to purchase is a function of the following:
  - Browsing specific products, categories, or inspiration/look book pages (Hadoop + SAS/ACCESS)
  - Engagement with the brand / responsiveness to messaging (campaign responses, number of web sessions)
  - Previous purchasing (standard RFM measures, category purchasers)
Example code
/* Assign a libref to the Hive schema through SAS/ACCESS; the x values are placeholders */
libname hdplib hadoop subprotocol=hive2 port=x server="x" user=x password=x schema=omniture_datamart;

/* Flag each page view by the type of page browsed (1 = match, 0 = no match) */
data weblog;
  set hdplib.OUTWEAR_weblogs (keep=userid page_url);
  view_outwear     = (index(upcase(page_url), '/US/EN/CATALOG/CATEGORIES/DEPARTMENTS/OUTWEAR') > 0);
  view_lookbook    = (index(upcase(page_url), '/US/EN/CATALOG/PRODUCTS/LOOKBOOK') > 0);
  view_inspiration = (index(upcase(page_url), '/US/EN/CATALOG/PRODUCTS/INSPIRATION') > 0);
run;

/* Count page views of each type per user */
proc sql;
  create table user_web_view as
  select userid,
         sum(view_outwear) as view_outwear,
         sum(view_lookbook) as view_lookbook,
         sum(view_inspiration) as view_inspiration
  from weblog
  group by userid
  order by userid;
quit;
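As a possible variation, not shown in the presentation, the same per-user counts could be requested through explicit SQL pass-through, so that Hive does the flagging and aggregation and only the summary rows come back to SAS. The sketch below assumes the same placeholder connection values, table, and columns as the example code; whether it is actually faster depends on the cluster and the data volume.

```sas
proc sql;
  /* Connection values are placeholders, matching the LIBNAME statement */
  connect to hadoop (subprotocol=hive2 port=x server="x" user=x password=x
                     schema=omniture_datamart);

  /* The inner query is HiveQL and runs in Hadoop; SAS receives one row per user */
  create table user_web_view as
  select * from connection to hadoop
  (
    select userid,
           sum(case when instr(upper(page_url), '/US/EN/CATALOG/CATEGORIES/DEPARTMENTS/OUTWEAR') > 0
                    then 1 else 0 end) as view_outwear,
           sum(case when instr(upper(page_url), '/US/EN/CATALOG/PRODUCTS/LOOKBOOK') > 0
                    then 1 else 0 end) as view_lookbook,
           sum(case when instr(upper(page_url), '/US/EN/CATALOG/PRODUCTS/INSPIRATION') > 0
                    then 1 else 0 end) as view_inspiration
    from OUTWEAR_weblogs
    group by userid
  )
  order by userid;

  disconnect from hadoop;
quit;
```

The DATA step version is easier for analysts who prefer SAS syntax; the pass-through version suits those comfortable with HiveQL who want to limit the rows transferred.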
Results
• Merged three sources of data, added a post-period purchase flag, and analyzed using PROC LOGISTIC
• Allowing for historic transactions and general level of engagement, we were able to see:
  - Viewing the look book = 4x more likely to purchase the product
  - Viewing specific products = 7x more likely to buy the product
• Can replicate analysis for lower price-point categories
Note: data simulated to reflect results we have seen in live examples
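A minimal sketch of this kind of model is shown below. It does not use the presenter's data or model specification: the toy table, the variable names, and the coefficients used to simulate purchases are all invented for illustration, and the fitted estimates will not match the figures quoted above.

```sas
/* Simulate a toy modelling table; every variable and coefficient is invented */
data model_data;
  call streaminit(2016);
  do userid = 1 to 5000;
    prior_purchases = rand('poisson', 2);       /* stand-in for RFM measures */
    web_sessions    = rand('poisson', 3);       /* stand-in for general engagement */
    view_lookbook   = rand('bernoulli', 0.20);  /* 1 = viewed a look book page */
    view_product    = rand('bernoulli', 0.10);  /* 1 = viewed the specific product */
    /* Log-odds of a post-period purchase; coefficients are arbitrary */
    xb = -3 + 0.2*prior_purchases + 0.1*web_sessions
            + 1.0*view_lookbook + 1.5*view_product;
    purchase_flag = rand('bernoulli', 1 / (1 + exp(-xb)));
    output;
  end;
  drop xb;
run;

/* Model the post-period purchase flag, controlling for history and engagement */
proc logistic data=model_data;
  model purchase_flag(event='1') = prior_purchases web_sessions
                                   view_lookbook view_product;
run;
```

PROC LOGISTIC prints an Odds Ratio Estimates table by default. The odds ratios for the browsing flags are one way to express "x times more likely" statements after allowing for the other predictors, though odds ratios and probability ratios are not the same thing when purchase rates are high.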
Effect on efficiency and satisfaction
• Visual Analytics and SAS/ACCESS for Hadoop have changed how we work
  - Less time preparing data, stronger focus on analytics
  - Easier to explore, and explain, data
  - Including weblog data improves results
• We have happier business partners
  - Self-serve answers to common questions
  - Use analyst time for custom / strategic questions
On our work plan
• Continue on the virtuous circle
  - Give more people more access to deeper data
  - Deliver more project work in Visual Analytics
• Build on our capabilities
  - Add Visual Statistics and link to Hadoop environments (in addition to the LASR server)
• Extend our toolset by exploring machine learning in SAS