Apache Spark: The Analytics Operating System
Anjul Bhambhri, IBM Vice President, Big Data
IBM Invests in Reinventing Computing
Linux, 1999: 13,000,000 lines of code. 500+ Server Solutions. Ushered in Computer Science.
System 360, 1964: 10,000,000 lines of code. 54 Peripheral Solutions. Ushered in Information Science.
Apache Spark, 2015: 400,000 lines of code. 15+ Data & Analytics Solutions. Ushered in Data Science.
The Analytics Operating System
1 platform
Apache Spark
IBM | Spark
expressiveness, speed
any data: on disk or on the wire
(almost) any application
unified model -> high productivity
unparalleled performance
Why Spark?
Enhance it! Offer it! Leverage it!
Spark Technology Center @ SF
Shipping with BigInsights / Spark as a Service
Inside our products
At IBM, We Love Spark!
IBM is Building on Apache Spark
• IBM Analytics
• IBM Commerce
• IBM Watson
• IBM Research
• IBM Cloud
Spark for scalable financial reporting
Financial data lakes are growing
• Regulatory requirements => data retention
• 30+ years of historical data (petabytes)
• 100s of business analysts
• 1000s of disparate reports requested
Overnight and real-time transactions also large
• Complex ledger “posting” processes
Tight timelines (2-3 hours before banks open)
Scalable “scan-sharing” engine to the rescue:
• SQL-inspired “financial” DSL built on Spark
• Runs common portions of queries simultaneously
• Dramatically lowers cost of producing the “next” analyst request that comes along
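The scan-sharing idea above can be illustrated with a minimal sketch in plain Python. This is not the IBM engine or its DSL, and it does not use Spark: the `Posting` schema, the `Report` type, and `run_shared_scan` are hypothetical names invented here. The point is only that several report requests can be answered from one pass over the ledger instead of one pass per request.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, NamedTuple


class Posting(NamedTuple):
    """One ledger row (hypothetical schema, for illustration only)."""
    account: str
    region: str
    year: int
    amount: float


class Report(NamedTuple):
    """One analyst request: a row filter plus a grouping key (hypothetical)."""
    name: str
    keep: Callable[[Posting], bool]
    group_by: Callable[[Posting], str]


def run_shared_scan(postings: Iterable[Posting],
                    reports: List[Report]) -> Dict[str, Dict[str, float]]:
    """Answer every report from a single pass over the postings."""
    totals = {r.name: defaultdict(float) for r in reports}
    for row in postings:          # the one shared scan
        for r in reports:         # each report sees the same row
            if r.keep(row):
                totals[r.name][r.group_by(row)] += row.amount
    return {name: dict(groups) for name, groups in totals.items()}


LEDGER = [
    Posting("cash", "EU", 2014, 100.0),
    Posting("cash", "US", 2015, 250.0),
    Posting("loans", "EU", 2015, 40.0),
    Posting("cash", "EU", 2015, 10.0),
]

REPORTS = [
    Report("eu_by_account", lambda p: p.region == "EU", lambda p: p.account),
    Report("y2015_by_region", lambda p: p.year == 2015, lambda p: p.region),
]

# An iterator can be consumed only once, so passing one confirms
# that both reports are produced from a single pass over the data.
RESULT = run_shared_scan(iter(LEDGER), REPORTS)
```

In this sketch, adding a third report costs one extra filter and key evaluation per row rather than another full scan, which is the sense in which the "next" request becomes cheaper.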
Spark maps Customer Experience “journey”
• Multiple channels of customer interaction.
• Very large data volumes that need fast processing.
• Correlating events across channels to interactions.
• Continuous classification of interactions and mapping of the customer's journey across channels.
• Sequence mining algorithm on Spark processes terabytes of interactions in minutes
• MLlib models detect frustration in customers by length and frequency of interaction across channels
• Spark SQL and Parquet support multiple concurrent queries
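As a rough illustration of correlating events across channels and mining journey sequences, here is a minimal sketch in plain Python. It is not the algorithm from the deck and does not use Spark or MLlib: the event layout `(customer_id, timestamp, channel)` and the helpers `build_journeys` and `frequent_paths` are assumptions made up for this example. It orders each customer's events into a journey, then counts how many customers followed each contiguous channel path of a given length.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical event layout: (customer_id, timestamp, channel)
Event = Tuple[str, int, str]


def build_journeys(events: Iterable[Event]) -> Dict[str, List[str]]:
    """Group events by customer and order each group by timestamp."""
    by_customer = defaultdict(list)
    for customer, ts, channel in events:
        by_customer[customer].append((ts, channel))
    return {c: [ch for _, ch in sorted(items)]
            for c, items in by_customer.items()}


def frequent_paths(journeys: Dict[str, List[str]],
                   length: int,
                   min_support: int) -> Dict[Tuple[str, ...], int]:
    """Return contiguous channel paths followed by >= min_support customers."""
    support = Counter()
    for path in journeys.values():
        # A set, so each customer counts at most once per path.
        seen = {tuple(path[i:i + length])
                for i in range(len(path) - length + 1)}
        support.update(seen)
    return {p: n for p, n in support.items() if n >= min_support}


EVENTS = [
    ("a", 3, "call"), ("a", 1, "web"), ("a", 2, "chat"),
    ("b", 1, "web"), ("b", 2, "chat"), ("b", 3, "call"),
    ("c", 1, "web"), ("c", 2, "call"),
]

JOURNEYS = build_journeys(EVENTS)
COMMON = frequent_paths(JOURNEYS, length=2, min_support=2)
```

With the sample events, customers `a` and `b` both go web, chat, call, so the two-step paths they share meet the support threshold, while the web-to-call path taken only by `c` does not. Features such as journey length or interaction count per customer could be derived from the same `JOURNEYS` structure.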
[Architecture diagram. Labels: PUB / SUB (MQTT / WebSockets / Flume / Kafka); Journey Dashboards; Interaction & Journey Data; Voice & Text Data]
visit www.spark.tc for more information
IBM Spark Technology Center, San Francisco
Growing pool of contributors
300+ inventors
Contributed SystemML
Founding member of AMPLab
Partnerships in the ecosystem
IBM has made a significant investment in Spark
Power of data. Simplicity of design. Speed of innovation.
IBM Apache Spark
For Apache Spark news and innovation from IBM’s Spark Technology Center —