© 2015 IBM Corporation
IBM Analytics for Apache Spark (Spark as a Service)
Arancha Ocaña
IT Specialist, CTP Big Data and Spark.
For questions about this presentation, contact Arancha Ocaña [email protected]
© 2016 IBM Corporation
Today’s apps must keep up with the speed of the app revolution.
Timing is critical…
Time to initial deployment:
- Core IT: ~ Weeks
- IaaS: ~ Days
- IBM Bluemix (Platform as a Service): ~ Minutes
Stack layers, split between Customer Managed and Service Provider Managed: Code, Data, Runtime, Middleware, OS, Virtualization, Servers, Storage, Networking
Benefits
- Set up environments and deploy apps very quickly.
- Infrastructure and platform managed by the service provider (SP).
Time Commitment
- Minutes to set up and deploy.
- Focus on your apps and their data.
How does Bluemix work?
Bluemix embraces Cloud Foundry as an open source Platform as a Service and extends it with IBM, third-party, and community-built services.
IBM Analytics for Apache Spark
Only IBM brings strength in enterprise, scale, and a managed offering to the Spark market
As a service
- Fully managed and secured Spark environment, accessible on demand or via reserved instances
- Pay-as-you-go or Reserved deployment options
Performant Architecture
- In-memory architecture greatly reduces disk I/O
- 20-100x faster for common tasks
Productive Workflows
- Analytic workflows across a multitude of sources
- Simplified but powerful syntax (~5x less code)
Leverages Existing Investments
- Integrates with SQL, Python, Scala, etc.
- No lock-in: 100% open source Spark
Continually Improving
- Spark v1.6 available
- Continually updated to keep pace with the evolving Spark ecosystem
IBM Analytics for Apache Spark – Personas & Practitioners
Accessible: >5,000 active users
Integrated: Available standalone, within platforms, & within solutions
Powerful: 10–100x faster in-memory processing
Data Scientist: Derive insights which are immediately actionable with powerful Spark tools.
Business Analyst: Self-service, rapid access to understanding of the business, without IT intervention.
Application Developer: Integrate 100% open-standards Spark with any application, regardless of the platform.
Data Engineer: Assemble data pipelines with ease to power interactive dashboards and services.
Managed Service Ecosystem
[Diagram labels: Client Environment (Apps, Data, Env); Request, Data, Result; as a service: IBM hosted, managed, secure environment; Bluemix Platform; Other managed cloud services + 3rd party tools]
Access Points: Notebooks, and others to come (Spark Submit, REST API, Streaming)
Analytics for Apache Spark – Blends Multiple Data Types, Sources, & Workloads
Data Sources: BigInsights (HDFS), Cloudant (DBaaS), dashDB (Analytics), Swift (Object Storage), SQDB (Managed DB2)
Source locations: IBM Cloud, Public Cloud, Cloud Apps, On-Premises
Spark components:
- Spark SQL: Execute SQL statements
- Spark Streaming: Streaming analytics via micro-batch
- MLlib (Machine Learning): M.L. and statistical algorithms
- GraphX (Graphing): Distributed graph processing framework
- Spark Core: General compute engine (basic I/O functions, task dispatching, scheduling)
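The slide describes Spark Streaming as streaming analytics via micro-batch. The following is a minimal plain-Python sketch of that idea only: events are grouped into fixed-width time windows and each window is processed as a small batch. It does not use the Spark Streaming API; the function names, the millisecond timestamps, and the sample events are all assumptions made for illustration.

```python
from collections import defaultdict


def micro_batches(events, interval_ms):
    """Group (timestamp_ms, value) events into fixed-width batches.

    Returns a list of (batch_start_ms, [values]) sorted by batch start.
    This is a toy illustration, not Spark's implementation.
    """
    batches = defaultdict(list)
    for timestamp_ms, value in events:
        # Every event in the same interval lands in the same batch.
        batch_start = (timestamp_ms // interval_ms) * interval_ms
        batches[batch_start].append(value)
    return sorted(batches.items())


def summarize(events, interval_ms):
    """Run a small batch computation (count and sum) on each micro-batch."""
    return [
        (start, len(values), sum(values))
        for start, values in micro_batches(events, interval_ms)
    ]


if __name__ == "__main__":
    # Hypothetical sample stream: (timestamp in ms, reading)
    sample = [(200, 5), (900, 3), (1100, 4), (2500, 1), (2700, 2)]
    for start, count, total in summarize(sample, interval_ms=1000):
        print(start, count, total)
```

The batch interval controls the trade-off in this sketch: a shorter interval gives fresher results at the cost of more, smaller batches.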
Spark Application Architecture
A Spark application is initiated from a driver program
Spark execution modes:
– Standalone with the built-in cluster manager
Spark application execution via spark-submit.sh
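To make the driver program's role concrete, here is a toy sketch using only Python's standard library, not the Spark API. A "driver" function splits the input into partitions, dispatches one task per partition to a pool of workers, and merges the partial results. The thread pool stands in for the cluster manager and executors; the names `run_driver` and `count_words` are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(partition):
    """Task executed by a worker: count the words in one partition of lines."""
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def run_driver(lines, num_workers=2):
    """Driver: partition the data, dispatch tasks to workers, merge results."""
    # Round-robin split of the input into one partition per worker.
    partitions = [lines[i::num_workers] for i in range(num_workers)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_counts = list(pool.map(count_words, partitions))
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return dict(total)


if __name__ == "__main__":
    sample = ["spark as a service", "spark notebooks", "a managed service"]
    print(run_driver(sample))
```

In a real Spark application the driver would create a SparkContext and the work would be distributed across executors on the cluster rather than across local threads.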
Analytics for Apache Spark – Notebooks
Interactive, unified, and collaborative Spark work environments built with Jupyter (IPython)
Graphical user interface for executing and visualizing the results of Spark programs
Accessible as in-browser web documents
Easily used by both business analysts and deeply technical programmers
Bridge the gap from “concept” to “production” application, all within a single environment
Examples in Spark as a service
Lessons learned
Running the Spark service and developing with a notebook
Demo
IBM Analytics for Apache Spark – Editions
[Diagram: Spark Application (Driver Program) with SparkContext; Cluster Manager; Worker 1 … Worker n running Spark Executor #1 … Spark Executor n; Swift (SoftLayer), AWS S3, HDFS, or other storage]
No additional charge for the master node, unlike competitors
Permanent storage billed separately
Bare metal machine specs per Spark Executor:
- Memory: 12.5 GB
- CPU: 1 Core
- Temp Storage: 20 GB
IBM Analytics for Apache Spark editions: Personal, Reserved, Enterprise
Allowances (as listed on the slide):
- Access to interactive Spark notebooks; Spark v1.4.1; 2 Spark Executors per instance
- Access to interactive Spark notebooks; Spark v1.4.1; 30 Spark Executors per instance
Sold Via: Bluemix; CDS Sales (SQO), PPA
Price: € … per instance-hour; € … per instance-month
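As a quick arithmetic check on the figures quoted on this slide, the sketch below multiplies the per-executor specs (12.5 GB memory, 1 core, 20 GB temp storage) by the two executor counts listed (2 and 30 per instance). It assumes resources simply add up per executor, which the slide implies but does not state; the names used are hypothetical.

```python
# Per-executor specs as quoted on the slide.
EXECUTOR_SPEC = {"memory_gb": 12.5, "cores": 1, "temp_storage_gb": 20}


def instance_totals(num_executors, spec=EXECUTOR_SPEC):
    """Total resources for an instance, assuming they scale linearly per executor."""
    return {name: amount * num_executors for name, amount in spec.items()}


if __name__ == "__main__":
    for executors in (2, 30):  # executor counts listed on the slide
        totals = instance_totals(executors)
        print(
            f"{executors} executors: {totals['memory_gb']} GB memory, "
            f"{totals['cores']} cores, {totals['temp_storage_gb']} GB temp storage"
        )
```

Under that assumption, an instance with 2 executors has 25 GB of memory in total, and one with 30 executors has 375 GB.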
. . .