Spark Internals
Taro L. Saito, Treasure Data, Inc.
Hadoop Source Code Reading #16, NTT Data, Tokyo
May 29, 2014
Spark Code Base Size
spark/core/src/main/scala
2012 (version 0.6.x): 20,000 lines of code
2014 (branch-1.0): 50,000 lines of code
Other components: Spark Streaming, Bagel (graph processing library), MLlib (machine learning library), container support (Mesos, YARN, Docker, etc.), Spark SQL (Shark: Hive on Spark)
Spark Core Developers
IntelliJ Tips
Install Scala Plugin
Useful commands for code reading: Go to Definition (Ctrl + Click), Show Usages, Navigate Class/Symbol/File, Bookmark / Show Bookmarks, Ctrl + Q (show type info), Find Action (Ctrl + Shift + A)
Use your favorite key bindings
Scala Console (REPL)
$ brew install scala
Scala Basics
object: a singleton; its members behave like static methods
Package-private scope: private[spark] is visible only from the spark package
Pattern matching
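The basics above can be sketched in a few lines. This is an illustrative snippet, not Spark code; `Util` and `describe` are made-up names, and `private[spark]` is omitted because it requires an enclosing `spark` package.

```scala
// Illustrative only: `object` singletons and pattern matching.
object Util {                      // `object` defines a singleton; its members
  def describe(x: Any): String =   // act like static methods in Java
    x match {                      // pattern matching on value and type
      case 0         => "zero"
      case n: Int    => s"int: $n"
      case s: String => s"string: $s"
      case _         => "unknown"
    }
}
```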
Scala: Case Classes
Case classes are immutable and serializable, and can be used with pattern matching.
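A small illustration of those properties (`Point` is a made-up example, not a Spark class):

```scala
// A case class is immutable, serializable, and matchable.
case class Point(x: Int, y: Int)

val p = Point(1, 2)            // no `new` needed; fields are immutable vals
val moved = p.copy(x = 10)     // `copy` returns a modified immutable copy

val label = moved match {      // usable directly in pattern matching
  case Point(0, _) => "on y-axis"
  case Point(x, y) => s"($x, $y)"
}
```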
Scala Cookbook
http://xerial.org/scala-cookbook
Components
sc = new SparkContext
f = sc.textFile("…")
f.filter(…).count()
...

[Architecture diagram: your program runs on the Spark client (app master), which holds the RDD graph, the scheduler, the block tracker, and the shuffle tracker; a cluster manager assigns work to Spark workers, each running a block manager and task threads, reading from HDFS, HBase, …]

https://cwiki.apache.org/confluence/display/SPARK/Spark+Internals
Scheduling Process

rdd1.join(rdd2).groupBy(…).filter(…)

RDD Objects: build the operator DAG
DAGScheduler: splits the DAG into stages of tasks and submits each stage as it becomes ready; agnostic to operators; a failed stage is resubmitted
TaskScheduler: launches tasks (a TaskSet) via the cluster manager and retries failed or straggling tasks; doesn't know about stages
Worker: executes tasks in threads; stores and serves blocks via the block manager

https://cwiki.apache.org/confluence/display/SPARK/Spark+Internals
RDD
Reference: M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, I. Stoica. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. NSDI 2012, April 2012.
SparkContext: contains the SparkConf and the scheduler; the entry point for running jobs (runJob)
Dependency: input RDDs
RDD.map operation

map: RDD[T] => RDD[U]
MappedRDD: for each element in a partition, apply the function f
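The idea can be sketched in plain Scala. These are illustrative classes (`MiniRDD`, `MiniMappedRDD`), not Spark's actual RDD hierarchy: a partition is exposed as an Iterator, and map wraps the parent so that f is applied lazily, element by element.

```scala
// Minimal sketch of the MappedRDD idea (not the real Spark classes).
class MiniRDD[T](partitions: Seq[Seq[T]]) {
  // compute returns an iterator over one partition's elements
  def compute(split: Int): Iterator[T] = partitions(split).iterator
  def map[U](f: T => U): MiniMappedRDD[T, U] = new MiniMappedRDD(this, f)
}

class MiniMappedRDD[T, U](prev: MiniRDD[T], f: T => U) {
  // apply f lazily to each element of the parent partition
  def compute(split: Int): Iterator[U] = prev.compute(split).map(f)
}
```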
RDD Iterator

First, check the local cache; if not found, compute the RDD
StorageLevel: off-heap storage uses Tachyon, a distributed memory store
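The cache-then-compute pattern behind RDD.iterator can be sketched as follows (illustrative names, not Spark's actual CacheManager API):

```scala
// Sketch: return the cached partition if present, otherwise compute it
// once and store the result for later lookups.
val cache = scala.collection.mutable.Map[Int, Seq[Int]]()

def iterator(split: Int, compute: Int => Seq[Int]): Seq[Int] =
  cache.getOrElseUpdate(split, compute(split))
```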
Task
The DAGScheduler organizes stages; each stage has several tasks, and each task has preferred locations (host names) to favor data-local computation
Task Locality
Preferred location to run a task: process-local, node-local, or rack-local
Delay Scheduling
Reference: M. Zaharia, D. Borthakur, J. Sen Sarma, K. Elmeleegy, S. Shenker, I. Stoica. Delay Scheduling: A Simple Technique for Achieving Locality and Fairness in Cluster Scheduling. EuroSys 2010, April 2010.
Try to run tasks in the following order:
1. Local
2. Rack-local (involves data serialization and local transfer)
3. At any node (may involve remote data transfer)
Serializing Tasks
TaskDescription
ResultTask: RDD, function, stage ID, output ID
func: the aggregation function
TaskScheduler: submitTasks
Serializes the task request, then sends it to the ExecutorBackend
The ExecutorBackend handles task requests (an Akka actor)
ClosureSerializer
Clean
A function in Scala is a closure: free variables + a function body (compiled to a class)
x: bound variable, N: free variable, M: unused variable

class A$apply$1 extends Function1[T, U] {
  val $outer: A$outer
  def apply(input: T): U = …
}
class A$outer { val N = 100; val M = (large object) }

Fill M with null, then serialize the closure.
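The problem the cleaner addresses can be illustrated like this (`Outer`, `n`, `m` are made-up names, and this is not Spark's ClosureCleaner; exactly what a lambda captures also depends on the Scala compiler version):

```scala
// Illustration: a closure that reads only `n` may still drag its enclosing
// instance (including the large, unused field `m`) into serialization,
// because the compiled anonymous class can hold an $outer reference.
class Outer extends Serializable {
  val n = 100
  val m = new Array[Byte](1 << 20)   // large field the closure never uses
  def makeClosure: Int => Int = (x: Int) => x + n
}
```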
Traversing Byte Codes
A closure is a class in Scala; traverse its outer-variable accesses using the ASM4 bytecode library
JVM Bytecode Instructions
Cache/Block Manager
CacheManager: stores computed RDDs in the BlockManager
BlockManager: write-once storage; manages block data according to StorageLevel (memoryStore, diskStore, shuffleStore); serializes/deserializes block data for remote transfer
Compression: LZF or snappy-java (faster decompression)
Storing Block Data
IteratorValues: raw objects
ArrayBufferValues: Array[Byte]
ByteBufferValues: ByteBuffer
ConnectionManager
Asynchronous data I/O server using its own protocol
Sends and receives block data (BufferMessage), split into 64 KB chunks (ChunkHeader)
RDD.compute
Local Collection
SparkContext: runJob
Hands the RDD over to the DAGScheduler
SparkConf
Key-value configuration: master address, jar file paths, environment variables, JAVA_OPTS, etc.
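A key-value configuration holder in the style of SparkConf can be sketched in a few lines (`MiniConf` is an illustrative stand-in, not Spark's class):

```scala
// Minimal sketch of a chainable key-value configuration, SparkConf-style.
class MiniConf {
  private val settings = scala.collection.mutable.Map[String, String]()
  def set(key: String, value: String): MiniConf = { settings(key) = value; this }
  def get(key: String, default: String): String = settings.getOrElse(key, default)
}
```

Returning `this` from `set` allows the fluent chaining seen in Spark programs, e.g. `new MiniConf().set("spark.master", "local[4]").set("spark.app.name", "demo")`.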
SparkEnv
Holds the Spark components
SparkContext.makeRDD
Converts a local Seq[T] into an RDD[T]
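The core of that conversion is slicing the sequence into partitions. A sketch of the idea (illustrative `slice` helper, not Spark's actual implementation):

```scala
// Sketch: split a local Seq into numSlices contiguous partitions,
// the idea behind SparkContext.makeRDD / parallelize.
def slice[T](seq: Seq[T], numSlices: Int): Seq[Seq[T]] =
  (0 until numSlices).map { i =>
    val start = (i * seq.length) / numSlices
    val end   = ((i + 1) * seq.length) / numSlices
    seq.slice(start, end)
  }
```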
HadoopRDD
Reads HDFS data as (key, value) records
Mesos Scheduler – Fine-Grained

Mesos offers slave resources; the scheduler determines resource usage
Task lists are stored in the TaskScheduler
Launches a JVM for each task: createMesosTask, createExecutorInfo
Mesos Fine-Grained Executor
spark-executor: a shell script for launching the JVM
Coarse-grained Mesos Scheduler
Launches a Spark executor on each Mesos slave, which runs CoarseGrainedExecutorBackend
Coarse-grained ExecutorBackend
An Akka actor that registers itself with the master and initializes the executor after the response
Cleanup RDDs
ReferenceQueue: notified when weakly referenced objects are garbage collected
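The mechanism can be shown with the standard java.lang.ref API (this is the general JVM pattern, not Spark's ContextCleaner code):

```scala
import java.lang.ref.{ReferenceQueue, WeakReference}

// Sketch of the cleanup pattern: register an object with a ReferenceQueue
// via a WeakReference, then poll the queue to learn when the object has
// been garbage collected and its resources can be released.
val queue = new ReferenceQueue[AnyRef]()
val payload = new Object
val ref = new WeakReference[AnyRef](payload, queue)

// While a strong reference (`payload`) exists, nothing is enqueued.
// Once the last strong reference is dropped and a GC runs, queue.poll()
// eventually returns `ref`, signaling that cleanup can proceed.
```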
Copyright ©2014 Treasure Data. All Rights Reserved.
WE ARE HIRING!