VoDcast Slides: The Rise in Popularity of Apache Spark

The Rise in Popularity of Apache Spark

With Ian Lumb, Product Marketing Manager

YouTube VoDcast: https://youtu.be/PimVUaQBMLM

What is the single most appealing aspect of Apache Spark?

Single-most Appealing Aspect

https://spark.apache.org/

Abstraction for in-memory computing

Fault-tolerant, parallel data structures

• Cluster-ready

Optionally persistent

Can be partitioned for optimal placement

Manipulated via operators

Resilient Distributed Datasets (RDDs)

Zaharia et al., NSDI 2012: http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf
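
The RDD properties listed on this slide (partitioned, manipulated via operators, optionally persistent, fault-tolerant) can be illustrated with a small toy model. This is a single-process sketch in plain Python, not the Spark API: the `MiniRDD` class and its `parallelize`, `persist`, `lose_partition` and `collect` methods are hypothetical names chosen for illustration, and recovery is simulated by simply recomputing a dropped partition from the recorded chain of operators.

```python
from typing import Callable, Dict, Iterable, List, Optional


class MiniRDD:
    """Toy, single-process stand-in for an RDD (hypothetical; not the Spark API)."""

    def __init__(self, compute: Callable[[int], List], num_partitions: int,
                 lineage: str) -> None:
        self._compute = compute              # how to (re)build one partition
        self.num_partitions = num_partitions
        self.lineage = lineage               # readable record of operators applied
        self._cache: Optional[Dict[int, List]] = None  # used only after persist()

    @classmethod
    def parallelize(cls, data: Iterable, num_partitions: int) -> "MiniRDD":
        items = list(data)
        # Round-robin split: element i goes to partition i % num_partitions.
        parts = [items[i::num_partitions] for i in range(num_partitions)]
        return cls(lambda i: list(parts[i]), num_partitions, "parallelize")

    def map(self, fn: Callable) -> "MiniRDD":
        # Operators are lazy: they return a new dataset that remembers its parent.
        return MiniRDD(lambda i: [fn(x) for x in self.partition(i)],
                       self.num_partitions, self.lineage + " -> map")

    def filter(self, pred: Callable) -> "MiniRDD":
        return MiniRDD(lambda i: [x for x in self.partition(i) if pred(x)],
                       self.num_partitions, self.lineage + " -> filter")

    def persist(self) -> "MiniRDD":
        # Optional persistence: keep computed partitions in memory from now on.
        self._cache = {}
        return self

    def partition(self, i: int) -> List:
        if self._cache is None:
            return self._compute(i)
        if i not in self._cache:
            # Missing (never computed, or lost): rebuild from the parent chain.
            self._cache[i] = self._compute(i)
        return self._cache[i]

    def lose_partition(self, i: int) -> None:
        """Simulate losing one cached partition, e.g. through a node failure."""
        if self._cache is not None:
            self._cache.pop(i, None)

    def collect(self) -> List:
        out: List = []
        for i in range(self.num_partitions):
            out.extend(self.partition(i))
        return out


numbers = MiniRDD.parallelize(range(10), num_partitions=3)
evens_squared = numbers.filter(lambda x: x % 2 == 0).map(lambda x: x * x).persist()

first = evens_squared.collect()     # computes and caches all three partitions
evens_squared.lose_partition(0)     # drop one cached partition
second = evens_squared.collect()    # partition 0 is recomputed from its lineage

print(evens_squared.lineage)
print(first == second)
```

In this sketch nothing is computed until `collect()` is called, and a lost partition is rebuilt only from the operators that produced it rather than from a replica, which is the general idea the slide attributes to RDDs. Real Spark distributes the partitions across a cluster and schedules the recomputation itself.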

What are the key differences between Hadoop clusters and Spark clusters?

Cluster Management

Hadoop Distributions

Well-managed Clusters

https://spark.apache.org/

http://aryannava.com/2014/02/19/apache-hadoop-ecosystem/hadoopecosystem/

Applications

What applications benefit from utilizing Hadoop vs. Spark?

Hadoop Limitations

https://tomsitpro.com/

Spark’s Converged Applications

http://www.informationweek.com/big-data/big-data-analytics/apache-spark-3-promising-use-cases/a/d-id/1319660

Big Data Analytics

• Combine SQL, streaming, machine learning and graph analytics

HPC

• Decouple from Hadoop to easily incorporate with existing infrastructure

Spark’s converged application play

https://spark.apache.org/

www.brightcomputing.com/solutions-hadoop
