
Page 1: Anatomy of in memory processing in Spark

Anatomy of in-memory processing in Spark

A deep-dive into custom memory management in Spark

https://github.com/shashankgowdal/introduction_to_dataset

Page 2: Anatomy of in memory processing in Spark

● Shashank L

● Big data consultant and trainer at datamantra.io

● www.shashankgowda.com

Page 3: Anatomy of in memory processing in Spark

Agenda

● Era of in-memory processing
● Big data frameworks on JVM
● JVM memory model
● Custom memory management
● Allocation
● Serialization
● Processing
● Benefits of these on Spark

Page 4: Anatomy of in memory processing in Spark

Era of in-memory processing

● After Spark, in-memory has become a de facto standard for big data workloads
● Advancements in hardware are pushing more frameworks in that direction
● Memory management is coupled with the runtime of the framework
● Using memory efficiently in a big data workload is a challenging task

Page 5: Anatomy of in memory processing in Spark

Why the JVM is a prominent runtime for big data workloads

● Managed runtime
● Portable
● Hadoop was built on the JVM
● Rich ecosystem

Page 6: Anatomy of in memory processing in Spark

Big data frameworks on JVM

● Many frameworks run on the JVM today
○ Spark
○ Flink
○ Hadoop
○ etc.
● Organising data in memory
○ In-memory processing
○ In-memory caching of intermediate results
● Memory management influences
○ Resource efficiency
○ Performance

Page 7: Anatomy of in memory processing in Spark

Straightforward approach

● JVM memory model approach
● Store a collection of objects and perform any processing on the collection
● Advantages
○ Eases the development cycle
○ Built-in safety checks before modifying any memory
○ Reduces complexity
○ JVM built-in GC

Page 8: Anatomy of in memory processing in Spark

JVM memory model - Disadvantages

● Predicting memory consumption is hard
○ If it fails, an OutOfMemory error kills the JVM
● High garbage collection overhead
○ Easily 50% of the time is spent in GC
● Objects have space overhead
○ JVM objects don't take the same amount of memory as we might think

Page 9: Anatomy of in memory processing in Spark

Java object overhead

● Consider the string “abcd” as a JVM object. By looking at it, it should take up 4 bytes (one per character) of memory. In reality, java.lang.String stores two bytes per character (UTF-16) and adds an object header, a reference to its backing array, and alignment padding, so “abcd” ends up consuming around 48 bytes.
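This can be checked empirically. A minimal sketch using OpenJDK's JOL (Java Object Layout) tool, assuming the org.openjdk.jol:jol-core dependency is on the classpath (the tool choice is ours, not the slide's):

```scala
import org.openjdk.jol.info.GraphLayout

object StringOverhead {
  def main(args: Array[String]): Unit = {
    // Total retained size: the String object itself plus its backing array.
    println(GraphLayout.parseInstance("abcd").totalSize()) // ~48 bytes, not 4
  }
}
```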

Page 10: Anatomy of in memory processing in Spark

Garbage collection challenges

● Many big data workloads create objects in a way that is unfriendly to the regular Java GC
● Young-generation garbage collection is frequent
● Objects created in big data workloads tend to live and die in the young generation itself because they are used only a few times

Page 11: Anatomy of in memory processing in Spark

Generality has a cost, so semantics and schema should be used to exploit specificity instead.

Page 12: Anatomy of in memory processing in Spark

Custom memory management

● Allocation - Allocate a fixed number of memory segments upfront

● Serialization - Data objects are serialized into the memory segments

● Processing - Implement algorithms on the binary representation

Page 13: Anatomy of in memory processing in Spark

Allocation

Page 14: Anatomy of in memory processing in Spark

Managing memory on our own

● sun.misc.Unsafe
● Directly manipulates memory without safety checks (hence, it's unsafe)
● This API is used to build off-heap data structures in Spark

Page 15: Anatomy of in memory processing in Spark

sun.misc.Unsafe

● Unsafe is one of the gateways to low-level programming in Java
● Exposes C-style memory access
● Explicit allocation, deallocation, and pointer arithmetic
● Unsafe methods are intrinsics (the JIT replaces them with direct machine instructions)
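A minimal sketch of this C-style access (allocate, write, read, free). Unsafe's constructor is private, so the singleton has to be fetched via reflection:

```scala
import sun.misc.Unsafe

object UnsafeAccess {
  // Grab the singleton through the private static "theUnsafe" field.
  val unsafe: Unsafe = {
    val field = classOf[Unsafe].getDeclaredField("theUnsafe")
    field.setAccessible(true)
    field.get(null).asInstanceOf[Unsafe]
  }

  def main(args: Array[String]): Unit = {
    val address = unsafe.allocateMemory(4L) // raw off-heap memory, invisible to the GC
    unsafe.putInt(address, 42)              // no type or bounds checks
    println(unsafe.getInt(address))         // prints 42
    unsafe.freeMemory(address)              // deallocation is entirely manual
  }
}
```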

Page 16: Anatomy of in memory processing in Spark
Page 17: Anatomy of in memory processing in Spark

Hands-on

● DirectIntArray

● MemoryCorruption
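The actual exercises live in the repo linked on the first slide. A hypothetical reconstruction of the DirectIntArray idea (only the name comes from the slide): an int "array" backed by raw off-heap memory, reusing the handle from the previous sketch.

```scala
class DirectIntArray(length: Long) {
  import UnsafeAccess.unsafe                 // reflective Unsafe handle from above
  private val IntSize = 4L
  private val start = unsafe.allocateMemory(length * IntSize)

  // No bounds checks: an out-of-range index silently corrupts memory,
  // which is exactly what the MemoryCorruption exercise demonstrates.
  def set(i: Long, value: Int): Unit = unsafe.putInt(start + i * IntSize, value)
  def get(i: Long): Int = unsafe.getInt(start + i * IntSize)
  def free(): Unit = unsafe.freeMemory(start)
}
```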

Page 18: Anatomy of in memory processing in Spark

Custom memory management in Spark

On heap
● Stores data inside an array of type Long
● Capable of storing 16GB at once
● Bytes are encoded into longs and stored there

Off heap
● Allocates memory outside the heap, within the memory assigned to the JVM process
● Uses the Unsafe API
● Stores bytes directly
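A sketch of both modes through Spark's Platform helper (org.apache.spark.unsafe.Platform, an internal class in the spark-unsafe module; assuming it is on the classpath). The same put/get calls take a (base object, offset) pair on heap, or a null base with a raw address off heap:

```scala
import org.apache.spark.unsafe.Platform

object SegmentModes {
  def main(args: Array[String]): Unit = {
    // On heap: the address is a (base object, byte offset) pair into a long[].
    val base = new Array[Long](1024)                      // an 8 KB on-heap segment
    Platform.putInt(base, Platform.LONG_ARRAY_OFFSET, 42) // bytes encoded into longs
    println(Platform.getInt(base, Platform.LONG_ARRAY_OFFSET))

    // Off heap: the base object is null and the offset is a raw pointer.
    val addr = Platform.allocateMemory(8192)              // an 8 KB off-heap segment
    Platform.putInt(null, addr, 42)                       // bytes stored directly
    println(Platform.getInt(null, addr))
    Platform.freeMemory(addr)
  }
}
```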

Page 19: Anatomy of in memory processing in Spark

Encoding memory addresses

● Off heap: addresses are raw memory pointers
● On heap: addresses are (base object, offset) pairs
● Spark uses its own page table abstraction to enable a more compact encoding of on-heap addresses
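A simplified rendering of that page table encoding, following Spark's TaskMemoryManager (the 13-bit/51-bit split is Spark's actual choice; the helper object here is ours):

```scala
// A 64-bit logical address = 13-bit page number + 51-bit offset within the page.
object PageAddress {
  private val PageNumberBits = 13
  private val OffsetBits = 64 - PageNumberBits           // 51
  private val OffsetMask = (1L << OffsetBits) - 1

  def encode(pageNumber: Int, offsetInPage: Long): Long =
    (pageNumber.toLong << OffsetBits) | (offsetInPage & OffsetMask)

  def decodePageNumber(address: Long): Int = (address >>> OffsetBits).toInt
  def decodeOffsetInPage(address: Long): Long = address & OffsetMask
}
```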

Page 20: Anatomy of in memory processing in Spark

Serialization

Page 21: Anatomy of in memory processing in Spark

Data Structures prominently used in Big data

● Sequence

● Key-Value pair

Page 22: Anatomy of in memory processing in Spark

Java object-based row notation

● 3 fields of type (int, string, string) with values (123, “data”, “mantra”)

➔ 5+ objects
➔ High space overhead
➔ Expensive hashCode()
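A hedged illustration of where those objects come from (pre-Tungsten Spark rows are backed by an Array[Any]; the exact object count depends on the JVM):

```scala
// The object-based notation boils down to an Array[Any], so every field is
// its own heap object and the int gets boxed.
val row: Array[Any] = Array(123, "data", "mantra")
// On the heap: the array itself, a boxed java.lang.Integer, two String
// objects, and their backing char arrays - the slide's "5+ objects".
```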

Page 23: Anatomy of in memory processing in Spark

Tungsten’s unsafe row format

● Bitset for tracking null values
● Every column appears in the fixed-length value region
○ Fixed-length values are inlined
○ For variable-length values, we store a relative offset into the variable-length data section
● Rows are always 8-byte aligned
● Equality comparison can be done on the raw bytes

Page 24: Anatomy of in memory processing in Spark

Example of unsafe row

[Diagram: UnsafeRow layout for (123, “data”, “mantra”) — null-tracking bitmap, three 8-byte fixed-length slots, then the variable-length bytes of “data” and “mantra”]

Page 25: Anatomy of in memory processing in Spark

Hands-on

● UnsafeRowCreator
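Along the lines of the UnsafeRowCreator exercise, a hedged sketch that produces an UnsafeRow for (123, “data”, “mantra”) through Spark's catalyst internals (these are not a stable public API; usage assumed from Spark 1.6+):

```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.UnsafeProjection
import org.apache.spark.sql.types._
import org.apache.spark.unsafe.types.UTF8String

object UnsafeRowSketch {
  def main(args: Array[String]): Unit = {
    val schema = StructType(Seq(
      StructField("id", IntegerType),
      StructField("first", StringType),
      StructField("second", StringType)))
    // Project a generic InternalRow into Tungsten's binary UnsafeRow format.
    val toUnsafe = UnsafeProjection.create(schema)
    val row = toUnsafe(InternalRow(
      123, UTF8String.fromString("data"), UTF8String.fromString("mantra")))
    // Bitset + three fixed-length 8-byte slots + variable-length section.
    println(row.getSizeInBytes)
  }
}
```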

Page 26: Anatomy of in memory processing in Spark

java.util.HashMap

[Diagram: an array of buckets, each chaining to separately allocated key and value objects scattered across the heap]

● Huge object overheads
● Poor memory locality
● Size estimation is hard

Page 27: Anatomy of in memory processing in Spark

Tungsten BytesToBytesMap

[Diagram: key-value records stored back-to-back inside memory pages, addressed through a compact array of pointers]

● Low overheads
● Good memory locality, especially for scans
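A toy stand-in for the idea (Spark's real org.apache.spark.unsafe.map.BytesToBytesMap stores records in memory pages and requires a TaskMemoryManager; this sketch only mimics the layout): records sit back-to-back in one byte region, and lookups go through flat arrays rather than per-entry node objects.

```scala
import java.nio.charset.StandardCharsets.UTF_8

// Toy only: keys/values must be < 128 bytes and capacity must exceed the
// number of distinct keys.
final class ToyBytesToBytesMap(capacity: Int) {
  private val records = new java.io.ByteArrayOutputStream() // [klen][key][vlen][value]...
  private val offsets = Array.fill(capacity)(-1)            // slot -> record offset
  private val slotKeys = new Array[Array[Byte]](capacity)   // toy-only collision check

  private def slotFor(k: Array[Byte]): Int = {
    var slot = (java.util.Arrays.hashCode(k) & Int.MaxValue) % capacity
    while (offsets(slot) != -1 && !java.util.Arrays.equals(slotKeys(slot), k))
      slot = (slot + 1) % capacity                          // linear probing
    slot
  }

  def put(key: String, value: String): Unit = {
    val k = key.getBytes(UTF_8); val v = value.getBytes(UTF_8)
    val slot = slotFor(k)
    offsets(slot) = records.size(); slotKeys(slot) = k      // append-only record heap
    records.write(k.length); records.write(k)
    records.write(v.length); records.write(v)
  }

  def get(key: String): Option[String] = {
    val k = key.getBytes(UTF_8)
    val slot = slotFor(k)
    if (offsets(slot) == -1) return None
    val bytes = records.toByteArray
    var p = offsets(slot)
    p += 1 + bytes(p)                                       // skip key length + key
    val vlen = bytes(p); p += 1
    Some(new String(bytes, p, vlen, UTF_8))
  }
}
```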

Page 28: Anatomy of in memory processing in Spark

Processing

Page 29: Anatomy of in memory processing in Spark

Many big data workloads are now compute bound

● Network optimizations can only reduce job completion time by a median of at most 2%
● Optimizing or eliminating disk accesses can only reduce job completion time by a median of at most 19%

[1] Kay Ousterhout et al., “Making Sense of Performance in Data Analytics Frameworks”, NSDI 2015

Page 30: Anatomy of in memory processing in Spark

Hardware trends

Page 31: Anatomy of in memory processing in Spark

Why is CPU the new bottleneck?

● Hardware has improved
○ 1 Gbps to 10 Gbps network links
○ High-bandwidth SSDs or striped HDD arrays
● Spark IO has been optimized
○ Many workloads now avoid significant disk IO by pruning data that is not needed in a given job
○ New shuffle and network layer implementations
● Data formats have improved
○ Parquet, binary data formats
● Serialization and hashing are the CPU-bound bottlenecks

Page 32: Anatomy of in memory processing in Spark

Code generation

● Generic evaluation of expression logic is very expensive on the JVM
○ Virtual function calls
○ Branches based on expression type
○ Object creation due to primitive boxing
○ Memory consumption by boxed primitive objects
● Instead, generate code that directly applies the expression logic to the serialized data (sketched below)
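A hedged, self-contained illustration of the contrast (the trait and case classes are ours, modelled on how Catalyst expressions evaluate, not Spark's actual classes):

```scala
// Generic (interpreted) evaluation walks an expression tree: every node is a
// virtual call and every intermediate value is boxed.
trait Expression { def eval(row: Array[Any]): Any }
case class Column(ordinal: Int) extends Expression {
  def eval(row: Array[Any]): Any = row(ordinal)       // returns a boxed value
}
case class Add(left: Expression, right: Expression) extends Expression {
  def eval(row: Array[Any]): Any =                    // virtual calls + unbox/rebox
    left.eval(row).asInstanceOf[Int] + right.eval(row).asInstanceOf[Int]
}

// Generated code for the same `a + b` collapses to a single allocation-free,
// branch-free expression over the binary row, e.g.:
//   row.getInt(0) + row.getInt(1)
```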

Page 33: Anatomy of in memory processing in Spark

Which Spark APIs benefit?

● Spark DataFrames
● Spark SQL
● RDD

Page 34: Anatomy of in memory processing in Spark

Why do only DataFrames benefit?

[Diagram: Python, Java/Scala, and R DataFrames, as well as Spark SQL, all produce a common logical plan, which the Catalyst optimizer compiles to physical execution; the RDD API goes straight to physical execution and bypasses Catalyst]

Page 35: Anatomy of in memory processing in Spark

Runtime bytecode generation

DataFrame code → Catalyst expressions → low-level bytecode
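One way to watch this pipeline in action (assuming Spark 2.x, where SparkSession and the sql.execution.debug helpers exist) is to dump the Java source Catalyst generates for a query:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.debug._   // adds debugCodegen()

object CodegenPeek {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("codegen").getOrCreate()
    import spark.implicits._
    val df = Seq((123, "data", "mantra")).toDF("id", "first", "second")
    df.groupBy($"first").count().debugCodegen() // prints the generated Java source
    spark.stop()
  }
}
```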

Page 36: Anatomy of in memory processing in Spark

Code generation

Page 37: Anatomy of in memory processing in Spark

Aggregation optimization in DataFrame

Page 38: Anatomy of in memory processing in Spark

Aggregation optimization in DataFrame

[Diagram: scan input rows → project/convert each input row into an UnsafeRow grouping key → probe the BytesToBytesMap → update the aggregation buffer in place → iterate the map to emit results]

Page 39: Anatomy of in memory processing in Spark

Performance results with optimizations (Run time)

Page 40: Anatomy of in memory processing in Spark

Performance results with optimizations (GC Time)