Geekier analytics for SaaS data
Sumit Sarkar, Chief Data Evangelist, Progress DataDirect
@SAsInSumit
www.linkedin.com/in/meetsumit
© 2016 Progress Software Corporation and/or its subsidiaries or affiliates. All rights reserved.
Agenda
§ Demand for geekier analytics
§ In depth: SaaS marketing data lake
§ Our guidance
Demand for geekier analytics
As Lines of Business Control Technology Selection, SaaS Adoption Continues to Accelerate
http://www.forbes.com/sites/louiscolumbus/2015/01/24/roundup-of-cloud-computing-forecasts-and-market-estimates-2015/#192c1957740c
Are SaaS apps getting left behind the analytics revolution?
Current state
§ Data silos in each SaaS App behind process APIs
§ Embedded BI provides limited analytics options
Opportunity
§ Open access to $200 billion analytics market with standard interfaces for analytics
§ Engage data skills and tools in IT for richer insights
What are geekier analytics?
Business Intelligence / Data Integration
ODBC, JDBC or OData?
JDBC ?
Who is getting geeky today with analytics connectivity?
In depth: SaaS marketing data lake
What is a Marketing Data Lake?
A data lake is a large-scale storage repository and processing engine. A data lake provides "massive storage for any kind of data, enormous processing power and the ability to handle virtually limitless concurrent tasks or jobs" (SAS Institute).
Benefits of a Marketing Data Lake?
Some of the benefits of a data lake include:
§ Store data in all shapes and sizes
§ Flexible analytics with “schema on read”
§ Query data using SQL or big data programming frameworks
§ Eliminate data silos
Why Marketing Data?
§ CMOs will outspend CIOs on technology by 2017 (Gartner)
§ Oracle spent $3B on a martech acquisition spree to gain CMO mindshare.
§ Expect more collaboration between CMO and CIO (CIO.com)
§ Marketing Data Warehouse/Lake Webinars ~750 registrations (Progress)
It’s easy to forget that it’s still about solving real business problems.
Diagram: Manage Data, Perform Analytics, Drive Decisions (continuous feedback loop)
Labels: Relevant data; Transaction / behavior history; Insights; Appropriate data sources; Answers to business questions
Strategy (Thinking) Moves Right to Left
Implementation Moves Left to Right
Before you think data, think decisions!
Our marketing data is almost all in the cloud
And it’s almost all complex, stream data – which means APIs that only give aggregations aren’t too useful
How to ingest data directly from SaaS applications into HDFS
How to ingest data directly from SaaS applications into Spark
JDBC access to SaaS data
Progress DataDirect JDBC Connector
Schema Manager
Apache Sqoop
Salesforce.com Schema
User Defined Schema
Driver uses
§ SOAP API
§ Bulk API
§ Metadata API
Geek Speak
$ sqoop help import
usage: sqoop import [GENERIC-ARGS] [TOOL-ARGS]
Common arguments:
--connect <jdbc-uri> Specify JDBC connect string
--connect-manager <jdbc-uri> Specify connection manager class to use
--driver <class-name> Manually specify JDBC driver class to use
--hadoop-mapred-home <dir>+ Override $HADOOP_MAPRED_HOME
--help Print usage instructions
-P Read password from console
--password <password> Set authentication password
--username <username> Set authentication username
--verbose Print more information while working
--hadoop-home <dir>+ Deprecated. Override $HADOOP_HOME
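As a rough illustration of how those common arguments fit together, here is a minimal dry-run sketch of a Sqoop import from Salesforce into HDFS. The JDBC URL prefix and driver class are the ones used in the Spark example in this deck; the user name, security token, table name, target directory, and the idea of passing SecurityToken as a URL property are all placeholder assumptions, so check them against your driver documentation before relying on them.

```shell
#!/bin/sh
# Dry-run sketch: prints the sqoop command; set RUN=1 to actually execute it.
# All values below are placeholders (assumptions), not taken from the slides.
SF_USER="${SF_USER:-user@example.com}"
SF_TOKEN="${SF_TOKEN:-REPLACE_ME}"
SF_TABLE="${SF_TABLE:-ACCOUNT}"
TARGET_DIR="${TARGET_DIR:-/data/lake/salesforce/account}"

# URL prefix and driver class match the Spark example in this deck.
# Passing SecurityToken as a URL property is an assumption.
JDBC_URL="jdbc:datadirect:sforce://login.salesforce.com;SecurityToken=${SF_TOKEN}"

# -P prompts for the password on the console; -m 1 uses a single mapper,
# so no split column is needed.
set -- sqoop import \
  --connect "$JDBC_URL" \
  --driver com.ddtek.jdbc.sforce.SForceDriver \
  --username "$SF_USER" \
  -P \
  --table "$SF_TABLE" \
  --target-dir "$TARGET_DIR" \
  -m 1

SQOOP_CMD_LINE="$*"
if [ "${RUN:-0}" = "1" ]; then
  "$@"
else
  printf '%s\n' "$SQOOP_CMD_LINE"
fi
```

Printing the command by default makes it easy to review the connection string before anything touches the cluster; the driver jar would also need to be on Sqoop's classpath.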
Geek Speak
// Load a Salesforce table into a Spark DataFrame through the JDBC driver
val dataframe_salesforce = sqlContext.read.format("jdbc").
  option("url", "jdbc:datadirect:sforce://login.salesforce.com;").
  option("driver", "com.ddtek.jdbc.sforce.SForceDriver").
  option("dbtable", "SFORCE.<table_name>").
  option("user", "<Username>").
  option("password", "<password>").
  option("securitytoken", "<security_token>").
  load()
// Register the DataFrame as a temporary table and query it with SQL
dataframe_salesforce.registerTempTable("account")
dataframe_salesforce.sqlContext.sql("select * from account").collect.foreach(println)
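The Spark snippet only works if the JDBC driver jar is visible to both the Spark driver and the executors. A minimal sketch of launching spark-shell that way follows; the jar location is a hypothetical path, and the script only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Dry-run sketch: prints the spark-shell command; set RUN=1 to launch it.
# The jar path is a placeholder assumption; point it at your driver jar.
DRIVER_JAR="${DRIVER_JAR:-/opt/datadirect/lib/sforce.jar}"

# --driver-class-path puts the jar on the Spark driver's classpath;
# --jars ships it to the executors.
set -- spark-shell \
  --driver-class-path "$DRIVER_JAR" \
  --jars "$DRIVER_JAR"

SPARK_CMD_LINE="$*"
if [ "${RUN:-0}" = "1" ]; then
  "$@"
else
  printf '%s\n' "$SPARK_CMD_LINE"
fi
```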
Our guidance
Data in SaaS Applications is Siloed, Protected by Proprietary APIs Designed for Process Integration, not Data Integration
Partner Summit registration report from SFDC
Each SaaS API is vastly different
§ Get OData, JDBC, ODBC interfaces on top of any API
Data Source | API | Query language
Marketo | Web Services API (REST/SOAP); Bulk and non-Bulk APIs | No query language
Oracle Service Cloud | Web Services APIs (REST/SOAP) | ROQL
Google Analytics | Hypercube (query limits of 10 metrics grouped by max of 7 dimensions) | (none listed)
Salesforce | SOAP, BULK, Metadata APIs | SOQL
#DontBeJeff
§ http://prgress.co/dontbejeff
Detail is important because this digital data is true big data
The relationship between events is critical
We’re almost never solving for one problem with a big data system
Data levels: Summarized Data, Segmented Data, Detail Data
Reporting: Dashboarding, Campaign Optimization, Customer Drill-down
Analytics: Attribution, CLTV, Experience, Personalization, Targeting, Forecasting
We can’t just aggregate / We can’t not aggregate
Expose variety of detailed data using standard interfaces
SaaS vendors
• Expose bulk, non-bulk and analytics data model
• Leverage standard interfaces for SQL and REST
• Direct secure database access option
SaaS data consumers
• Support same standard interfaces
• Turn to trusted data connectivity partners or dedicate significant headcount
Other SQL to SaaS platforms
Progress DataDirect Embed SaaS Connectors into the Data Access Layer
Q&A with R&D
§ How do you handle the varying quality of services across SaaS APIs?
§ With analytics style connectivity, are SaaS vendors concerned about scalability against large extracts from multiple tenants?
§ Which SaaS vendors are easier to work with?
Ingest data across 200+ data sources (beyond marketing data sources)
Big Data/NoSQL
§ Apache Hadoop Hive
§ Cloudera
§ Hortonworks
§ Pivotal HD
§ MapR
§ EMR
§ Pivotal HAWQ
§ Cloudera Impala
§ MongoDB
§ Spark SQL
§ Cassandra
§ SAP HANA
Data Warehouses
§ Amazon Redshift
§ SAP Sybase IQ
§ Teradata
§ Pivotal Greenplum
Relational
§ Oracle DB
§ Microsoft SQL Server
§ IBM DB2
§ MySQL
§ PostgreSQL
§ IBM Informix
§ SAP Sybase
§ Pervasive SQL
§ Progress OpenEdge
§ Progress Rollbase
SaaS/Cloud
§ Salesforce.com
§ Database.com
§ FinancialForce
§ Veeva CRM
§ ServiceMAX
§ Any Force.com App
§ Hubspot
§ Marketo
§ Microsoft Dynamics CRM
§ Microsoft SQL Azure
§ Oracle Eloqua
§ Oracle Service Cloud
§ Google Analytics
EDI/XML/Text
§ EDIFACT
§ EDIG@S
§ EANCOM
§ X12
§ IATA
§ Healthcare EDI: X12, HIPAA, ICD-10, HL7
§ Custom EDI
§ Flat files: CSV, TSV, dBase, Clipper, Foxpro, Paradox
§ Text Files
Any
§ SDK
§ SequeLink Socket Server
§ Customer Engineering