Analytics on Mesos w/ Spark, Docker
April 23, 2015 @ AddThis
Brenden Matthews @brndnmtthws
© 2015 Mesosphere, Inc.
Agenda
• DCOS Cloud Early Access update
• Spark with Docker on Marathon demo
• Cluster resizing with AWS Auto Scaling Groups
• Use case discussion
• Q&A
DCOS Cloud
This video shows the DCOS Cloud provisioning process with AWS CloudFormation templates on EC2.
DEMO
Spark with Docker
• Spark scheduler runs on Marathon, within Docker container
• Spark tasks run atop Mesos
• Spark worker tasks are Docker containers too
[Diagram: Schedulers and Workers]
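The layout above (scheduler in a Docker container on Marathon, worker tasks in Docker containers on Mesos) could be expressed roughly as follows. This is a minimal sketch, not the configuration used in the demo: the image name, command, app id, and master URL are placeholders, and `spark.mesos.executor.docker.image` is a property documented in Spark releases newer than this talk, so the demo may have used a different mechanism.

```python
import json


def spark_scheduler_app(app_id, image, cmd, cpus=1.0, mem=1024):
    """Build a Marathon app definition that runs one Docker container.

    The field names follow Marathon's app JSON format; every value passed
    in by the caller here is a placeholder, not taken from the talk.
    """
    return {
        "id": app_id,
        "cmd": cmd,
        "cpus": cpus,
        "mem": mem,  # MiB
        "instances": 1,
        "container": {
            "type": "DOCKER",
            "docker": {"image": image, "network": "HOST"},
        },
    }


def spark_conf(mesos_master, executor_image):
    """Spark properties asking for worker (executor) tasks to run in Docker."""
    return {
        "spark.master": mesos_master,
        "spark.mesos.executor.docker.image": executor_image,
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    app = spark_scheduler_app(
        "/spark-scheduler", "example/spark:latest", "./run-spark-scheduler.sh"
    )
    print(json.dumps(app, indent=2))
    print(spark_conf("mesos://zk://master.mesos:2181/mesos", "example/spark:latest"))
```

The printed JSON is the kind of document that would be posted to Marathon's `/v2/apps` endpoint; the Spark properties would go into `spark-defaults.conf` or be passed with `--conf`.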
Launch Spark
• Install Spark from DCOS universe
• Install HDFS from DCOS universe
DEMO
Run TeraSort benchmark
• Submit TeraGen to generate 100m of data
• Submit TeraSort to sort 100m of data
• Submit TeraValidate to validate TeraSort output
• Resize cluster w/ Auto Scaling Groups
DEMO
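The three benchmark steps chain together: TeraGen writes the input, TeraSort reads it and writes sorted output, and TeraValidate checks that output. A rough sketch of the `spark-submit` invocations follows. The jar path, the `example.terasort` package, the class argument order, and the HDFS paths are all hypothetical; only the `--class` and `--master` flags are standard `spark-submit` options. The size is passed through verbatim, since the slide does not say what unit "100m" is in.

```python
def spark_submit(main_class, jar, *app_args,
                 master="mesos://zk://master.mesos:2181/mesos"):
    """Return a spark-submit argument list (the master URL is a placeholder)."""
    return ["spark-submit", "--class", main_class, "--master", master,
            jar, *app_args]


def terasort_pipeline(jar, size, base_dir, package="example.terasort"):
    """Build the TeraGen, TeraSort, and TeraValidate commands in run order.

    `package` and the argument order of each class are assumptions; adjust
    them to whichever TeraSort implementation is actually on the classpath.
    """
    raw = f"{base_dir}/terasort_in"
    sorted_out = f"{base_dir}/terasort_out"
    report = f"{base_dir}/terasort_validate"
    return [
        spark_submit(f"{package}.TeraGen", jar, size, raw),
        spark_submit(f"{package}.TeraSort", jar, raw, sorted_out),
        spark_submit(f"{package}.TeraValidate", jar, sorted_out, report),
    ]


if __name__ == "__main__":
    for cmd in terasort_pipeline("terasort.jar", "100m", "hdfs:///benchmarks"):
        print(" ".join(cmd))
```

Each command would be run only after the previous one succeeds, for example with `subprocess.run(cmd, check=True)`.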
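Resizing the cluster through an Auto Scaling Group comes down to changing the group's desired capacity. The sketch below assumes a boto3 `autoscaling` client is passed in; the group name in the usage comment is a placeholder, and the demo itself may well have used the AWS console or CLI instead.

```python
def resize_group(autoscaling, group_name, desired_capacity):
    """Set the desired instance count of an Auto Scaling Group.

    `autoscaling` is expected to behave like boto3.client("autoscaling").
    Scaling down terminates instances, so tasks running on the removed
    Mesos agents are lost and have to be rescheduled.
    """
    if desired_capacity < 0:
        raise ValueError("desired_capacity must be non-negative")
    autoscaling.set_desired_capacity(
        AutoScalingGroupName=group_name,
        DesiredCapacity=desired_capacity,
        HonorCooldown=False,  # apply immediately, ignoring the cooldown period
    )


# Usage, assuming boto3 is installed and AWS credentials are configured:
#   import boto3
#   resize_group(boto3.client("autoscaling"), "example-mesos-agents", 10)
```

New instances that join the group register with the Mesos master as additional agents, so their resources are offered to running frameworks such as Spark without restarting them.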
Use Cases
• Analytics
  • Data warehousing (Spark SQL, aka Hive on Spark)
  • One-off analysis (copy to S3)
• Machine learning
  • Spark includes the excellent Machine Learning Library
• Stream processing
  • Spark Streaming w/ Kafka
Questions?
EOF