Apache Spark and z Systems

Point of View Workshop on Apache Spark and z Systems

Background:

The large volume and variety of data available today offer the possibility of seeing things in new ways. Value comes from data-driven insights, not data alone. Analytic algorithms can combine and extract insights from a wide range of data: transactional systems and other data on the mainframe, customer data collected from mobile and social applications, and data streaming in from sensors and the Internet of Things. With that data they can identify trends, model possible scenarios, and predict future results. These insights can provide business advantage by identifying emerging opportunities, improving the customer experience, enhancing operational efficiencies, reducing risks, and more.

Are you building data lakes to gain insight from multiple sets of data?

These data lakes can quickly become data swamps.

Enterprises need to develop insights from high value, highly sensitive, operational data and data from virtually any other source — without any of the data leaving the system of origin.

Data scientists are spending as much as 80% of their time wrangling data.

Meet Apache Spark on z Systems:

“The Operating System for analytics.”

“The most exciting thing happening in Big Data today”

Apache® Spark™ is an open-source computing framework that uses in-memory processing to run analytic applications up to 100 times faster than technologies on the market today and to enhance mission-critical applications with deep intelligence.

Benefits with Apache Spark on z Systems:

Simplify data access

◦ Reduce data access complexity with seamless Spark access to enterprise data.

Speed development

◦ Take advantage of your expertise with popular programming languages such as Scala, Python, and SQL.

Accelerate results

◦ Use in-memory processing for fast results. Apply a range of analytics, including machine learning, iterative, streaming, and batch processing.

Analyze all of your enterprise data in place

◦ Avoid the latency, costly processing, and security concerns associated with data movement. Optimized data abstraction services enable efficient access to a broad set of structured and unstructured data sources – in place. These services allow analytic applications deployed on z/OS to use standard Spark APIs to analyze data without first copying it; for example, they can access traditional z/OS data sources such as VSAM or SMF with Spark SQL.

Objectives:

The goal of this Point of View workshop is to help clients explore the unique capabilities of Apache Spark on IBM z Systems and learn more about use cases and the performance and cost benefits that reduced ETL needs can deliver.

Agenda:

Introduction and brief history of Apache Spark

IBM’s Commitment to Open Source Software & Apache Spark

Analytics Across Heterogeneous Data Environments

Why Spark on z Systems, Advantages

Spark z/OS Demo

Ecosystem

Next steps

Cost: This no-fee workshop is an investment of both your time and IBM’s, and it provides significant value to your organization.

Contact: IBM z Systems and LinuxONE Solutions Team, Asia Pacific ([email protected])

https://www.ibm.com/blogs/systems/sparking-a-data-revolution-on-z-systems-mainframe/

https://spark-summit.org/2014/wp-content/uploads/2014/07/Spark-and-the-Future-of-Big-Data-Applications-Eric14.pdf
