Spark Usage in Enterprise Business Operations Ken Tsai VP, Data Management & Platform-as-Services SAP @kentsaiSAP 2.17.16: Spark Summit, NYC

Spark Summit presentation by Ken Tsai

Page 1: Spark Summit presentation by Ken Tsai

Spark Usage in Enterprise Business Operations
Ken Tsai
VP, Data Management & Platform-as-Services, SAP
@kentsaiSAP

2.17.16: Spark Summit, NYC

Page 2: Spark Summit presentation by Ken Tsai

© 2016 SAP SE or an SAP affiliate company. All rights reserved. Spark Summit New York, 2.17.16

© 2016 SAP SE or an SAP affiliate company. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP SE or an SAP affiliate company.

SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other countries. Please see http://global12.sap.com/corporate-en/legal/copyright/index.epx for additional trademark information and notices.

Some software products marketed by SAP SE and its distributors contain proprietary software components of other software vendors.

National product specifications may vary.

These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or warranty of any kind, and SAP SE or its affiliated companies shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP SE or SAP affiliate company products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.

In particular, SAP SE or its affiliated companies have no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality mentioned therein. This document, or any related presentation, and SAP SE’s or its affiliated companies’ strategy and possible future developments, products, and/or platform directions and functionality are all subject to change and may be changed by SAP SE or its affiliated companies at any time for any reason without notice. The information in this document is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.

Page 3: Spark Summit presentation by Ken Tsai

SAP – Our Quick Snapshot in the Enterprise Computing World

74% of the world’s transaction revenue touches an SAP system.

SAP’s product focus: Enterprise Applications, Business Networks, Platforms – 15 yrs on IMC

SAP customers represent 87% of Forbes Global 2,000 companies.

SAP touches $16 trillion of world consumer purchases.

Page 4: Spark Summit presentation by Ken Tsai

SAP HANA – An In-Memory Platform to Enable New Business Scenarios Previously Not Feasible

[Figure: database inserts and updates across finance tables COSP, COEP, COBK, BKPF, BSEG, BSIS, BSIK, BSET, LFC1, GLT0]

SAP Finance with aggregates and indices: 10 inserts, 5 updates

SAP Simple Finance: 4 inserts, 0 updates (no indices, no aggregates, no redundancies)

CORE DATA STRUCTURE REMAINS UNCHANGED

• Soft financial close anytime
• Real-time revenue and cost analysis
• Real-time liquidity forecasts
• Real-time alerts and blocks on suspicious transactions

Page 5: Spark Summit presentation by Ken Tsai

Distributed Big Data Is Everywhere
How to better use it in core enterprise business applications?

~79% of Data Reservoirs/Lakes are still disconnected from core business operations

How do I embed big data signal into my business applications and enterprise analytics?

53% Difficulty integrating with CRM and/or other systems

49% Unable to apply or integrate external data quickly enough to inform real-time decision making

59% Only a few analysts with specialized training can analyze big data

Harvard Business Review Analytic Services, Global Survey of 251 Respondents, Sept. 2015

Page 6: Spark Summit presentation by Ken Tsai

Introducing SAP HANA Vora

An in-memory query engine that extends the Apache Spark execution framework to enrich the interactive analytics experiences on massively distributed computing clusters

• OLAP processing
• In-Memory Computing for high performance
• Connecting to Enterprise Systems
• Unified System Management

[Diagram: SAP HANA (ERP data) and Vora (big data) connected by parallelized queries]

Page 7: Spark Summit presentation by Ken Tsai

Key Open Source Contribution to Apache Spark Ecosystem
Spark to HANA Push-downs & Data Hierarchies

scala> val hierarchy = sqlContext.sql(s"""
  SELECT LVL, COUNT(*), ROUND(AVG(P_RETAILPRICE), 2)
  FROM (
    SELECT LEVEL(node) AS LVL, P_RETAILPRICE
    FROM HIERARCHY(
      USING PART_HIERARCHY AS c
      JOIN PARENT p ON c.P_PARENT = p.P_PARTKEY
      SEARCH BY P_PARTKEY ASC
      START WHERE P_PARTKEY = 1
      SET node
    ) AS H0
  ) T1
  GROUP BY LVL
  """.stripMargin).collect().foreach(println)

[Figure: part hierarchy annotated with retail prices. Level 0: 901; level 1: 903, 904; level 2: 913, 912, 911]

+-----+-----+------------------+
|LEVEL|COUNT|AVG(P_RETAILPRICE)|
+-----+-----+------------------+
|    0|    1|               901|
|    1|    2|             903.5|
|    2|    3|               912|
+-----+-----+------------------+
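To make the hierarchy query's result easier to follow, here is a minimal plain-Scala sketch (no Spark or HANA required) of what LEVEL(node) followed by GROUP BY LVL computes. Only the six retail prices come from the slide; the part keys, the parent-child edges, and all names (HierarchyLevels, Part, levels, levelStats) are assumptions made for illustration, and this is not how Vora implements hierarchies.

```scala
object HierarchyLevels {
  // One row of a parent-child table; fields mirror P_PARTKEY, P_PARENT, P_RETAILPRICE.
  final case class Part(key: Int, parent: Option[Int], retailPrice: Double)

  // Hypothetical rows: prices are from the slide, keys and edges are assumed.
  val parts: Seq[Part] = Seq(
    Part(1, None, 901.0),
    Part(2, Some(1), 903.0),
    Part(3, Some(1), 904.0),
    Part(4, Some(2), 913.0),
    Part(5, Some(2), 912.0),
    Part(6, Some(3), 911.0)
  )

  // Breadth-first walk from the start key, tagging each row with its depth
  // (the analogue of LEVEL(node)). Assumes the data contains no cycles.
  def levels(rows: Seq[Part], startKey: Int): Seq[(Int, Part)] = {
    val byParent: Map[Option[Int], Seq[Part]] = rows.groupBy(_.parent)
    @annotation.tailrec
    def walk(frontier: Seq[Part], depth: Int, acc: Seq[(Int, Part)]): Seq[(Int, Part)] =
      if (frontier.isEmpty) acc
      else {
        val next = frontier.flatMap(p => byParent.getOrElse(Some(p.key), Seq.empty))
        walk(next, depth + 1, acc ++ frontier.map(p => (depth, p)))
      }
    walk(rows.filter(_.key == startKey), 0, Seq.empty)
  }

  // GROUP BY level: (level, count, average price rounded to 2 decimals), ordered by level.
  def levelStats(rows: Seq[Part], startKey: Int): Seq[(Int, Int, Double)] =
    levels(rows, startKey)
      .groupBy(_._1)
      .toSeq
      .sortBy(_._1)
      .map { case (lvl, xs) =>
        val avg = xs.map(_._2.retailPrice).sum / xs.size
        (lvl, xs.size, math.round(avg * 100) / 100.0)
      }
}
```

With the assumed rows, HierarchyLevels.levelStats(HierarchyLevels.parts, 1) yields (0, 1, 901.0), (1, 2, 903.5), (2, 3, 912.0), which matches the level, count, and average columns of the result table.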

val options = Map(
  "dbschema" -> config.user,
  "host" -> config.host,
  "instance" -> config.instance)

// HANA Live CustomerBasicData Virtual Data Model
val custConf = options + ("path" -> s"""sap.hba.ecc/CustomerBasicData""")
val cust = sqlContext.read.format("com.sap.spark.hana").options(custConf).load()
cust.registerTempTable("customer")

// HANA Live SalesOrderHeader VDM
val sohConf = options + ("path" -> s"""sap.hba.ecc/SalesOrderHeader""")
val soh = sqlContext.read.format("com.sap.spark.hana").options(sohConf).load()
soh.registerTempTable("salesOrder") // table name must match the query below

// Top 5 Countries by Sales Order Volume
val salesOrder = sqlContext.sql(
  """select Country, count(*) as Frequency
    |from salesOrder as s
    |LEFT OUTER JOIN customer as c on s.soldToParty = c.Customer
    |GROUP BY Country
    |ORDER BY Frequency desc""".stripMargin)
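The join and aggregation in the query above can be traced on small in-memory collections. The following is a plain-Scala sketch with hypothetical types and names (SalesOrderJoin, Customer, SalesOrder, countryFrequency); it assumes each customer key appears at most once and does not use the Spark or HANA APIs.

```scala
object SalesOrderJoin {
  // Hypothetical row types; field names follow the columns used in the query.
  final case class Customer(customer: String, country: String)
  final case class SalesOrder(id: Int, soldToParty: String)

  // Count sales orders per country, most frequent first.
  // Orders without a matching customer are kept under None (LEFT OUTER JOIN semantics).
  def countryFrequency(
      orders: Seq[SalesOrder],
      customers: Seq[Customer]): Seq[(Option[String], Int)] = {
    // Assumes customer keys are unique; a real join would duplicate rows otherwise.
    val countryOf: Map[String, String] =
      customers.map(c => c.customer -> c.country).toMap
    orders
      .map(o => countryOf.get(o.soldToParty)) // join on soldToParty = Customer
      .groupBy(identity)                      // GROUP BY Country
      .map { case (country, xs) => (country, xs.size) }
      .toSeq
      .sortBy { case (country, n) => (-n, country.getOrElse("")) } // ORDER BY Frequency desc
  }
}
```

Taking the first five rows of the result (for example with take(5)) gives the top five countries.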

Page 8: Spark Summit presentation by Ken Tsai

Airline Use Case – Optimize MRO scheduling with Sensor Data

Challenges

• $10,000 loss for every hour spent on maintenance, repair, and overhaul (MRO)

• Predictive MRO generates TB of sensor data per flight

Solution

• SAP HANA Vora rapidly processes sensor data in HDFS and combines it with flight schedule and staffing data in SAP HANA to prioritize maintenance jobs and accelerate MRO

Why SAP HANA Vora

• Optimize MRO operations with interactive, on-demand drill down by airport, flight route, etc.

Page 9: Spark Summit presentation by Ken Tsai

Utility Use Case – CenterPoint Energy

Challenge

• Smart meters generate TBs of data/month

• Regulatory requirement to retain data for 10 years

• Current storage solution full by end-2016

• Need to leverage HDFS as an additional tier for storage

Solution

• SAP HANA for the most recent sensor signals and operational data, Dynamic Tiering for data 1 to 2 years old, HDFS for historical sensor data

• SAP HANA Vora accesses and queries data across all tiers

Why SAP HANA Vora

• SAP HANA Vora provides an enterprise analytics and OLAP-like experience across the data warehouse and HDFS.
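The three-tier layout in this use case can be sketched as an age-based routing rule. This is an illustrative plain-Scala sketch, not CenterPoint Energy's or SAP's implementation: the exact cutoffs (under 1 year for SAP HANA, 1 to 2 years for Dynamic Tiering, older for HDFS) are assumptions derived from the slide's wording, and the names TierRouting, tierFor, and tiersFor are hypothetical.

```scala
import java.time.LocalDate
import java.time.temporal.ChronoUnit

object TierRouting {
  sealed trait Tier
  case object Hana extends Tier            // most recent sensor and operational data
  case object DynamicTiering extends Tier  // data 1 to 2 years old
  case object Hdfs extends Tier            // historical sensor data

  // Tiers ordered from oldest to newest data.
  val ordered: Vector[Tier] = Vector(Hdfs, DynamicTiering, Hana)

  // Assumed cutoffs: fewer than 1 full year -> HANA, fewer than 2 -> Dynamic Tiering, else HDFS.
  def tierFor(readingDate: LocalDate, today: LocalDate): Tier = {
    val ageYears = ChronoUnit.YEARS.between(readingDate, today)
    if (ageYears < 1) Hana
    else if (ageYears < 2) DynamicTiering
    else Hdfs
  }

  // Every tier a query over [from, to] must touch; requires from <= to.
  def tiersFor(from: LocalDate, to: LocalDate, today: LocalDate): Vector[Tier] = {
    require(!from.isAfter(to), "from must not be after to")
    val oldest = ordered.indexOf(tierFor(from, today))
    val newest = ordered.indexOf(tierFor(to, today))
    ordered.slice(oldest, newest + 1)
  }
}
```

A query spanning 2010 through early 2016, evaluated on the talk date, would touch all three tiers, which is the "query data within and across tiers" case.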

Page 10: Spark Summit presentation by Ken Tsai

Utility Use Case – How It Works
CenterPoint Energy

“Our benchmark tests proved that SAP HANA paired with SAP HANA Vora are the right solutions for us. We expect immediate cost benefits and to see competitive differentiation in the future.”
Gary Hayes, CIO & SVP at CenterPoint Energy

SAP HANA: most recent sensor data

Dynamic Tiering: 1-2 yr old data

HDFS: historical sensor data

Parallelized queries: query data within and across tiers

Page 11: Spark Summit presentation by Ken Tsai

Financial Services Use Case – Extend Fraud Pattern Detection

Challenges

• 100+ million business transactions daily, 25% growth YoY

• Limited access to archived data

• Difficult to detect patterns in historical transactions

Solution

• Current transactions in SAP HANA, historical transactions in HDFS clusters

• Real-time detection of abnormalities

Why SAP HANA Vora

• Real-time, aggregated insights from current and historical transactions
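The slide does not say how abnormalities are detected. As one illustration of combining historical and current transactions, the sketch below derives a baseline from historical amounts and flags current amounts that deviate strongly from it. The z-score rule, the threshold k, and the names FraudBaseline, baseline, and abnormal are assumptions chosen for this example, not the method described in the talk.

```scala
object FraudBaseline {
  // Mean and population standard deviation of historical transaction amounts.
  def baseline(history: Seq[Double]): (Double, Double) = {
    require(history.nonEmpty, "history must not be empty")
    val mean = history.sum / history.size
    val variance = history.map(x => math.pow(x - mean, 2)).sum / history.size
    (mean, math.sqrt(variance))
  }

  // Current amounts that lie more than k standard deviations from the historical mean.
  def abnormal(history: Seq[Double], current: Seq[Double], k: Double = 3.0): Seq[Double] = {
    val (mean, sd) = baseline(history)
    current.filter(x => math.abs(x - mean) > k * sd)
  }
}
```

In the architecture on the slide, the history would come from HDFS and the current amounts from SAP HANA; here both are plain in-memory sequences.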

Page 12: Spark Summit presentation by Ken Tsai

2016 and the Road Ahead

TODAY

• Customers in North America, APJ, and EMEA
• Dev edition available on AWS

COMING SOON

• General Availability
• Vora Modeler to build and query OLAP-style cubes on data

FUTURE

• Planning (HR, Financial)
• Extend engine support for time series
• Transaction management
• Analytics on archived ERP data in Hadoop

Page 13: Spark Summit presentation by Ken Tsai

Contribute to Spark Ecosystem, Embrace Best of Community Innovation

Contribution to Open Source:

Hierarchy capabilities

Connection to ERP: predicate pushdown to HANA

On-the-market solution

SAP HANA Vora

Page 14: Spark Summit presentation by Ken Tsai

Thank you!
Ken Tsai: [email protected] @kentsaiSAP

Enter to Win a GoPro HERO4

Session at SAP Booth 102

Learn More @hana.sap.com/vora

Try Dev Edition bit.ly/1K1qLyo

We’re Hiring: https://spark-summit.org/east-2016/jobs/