Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
ElasticMemoryManagementforCloudDataAnalytics
JingjingWangandMagdalenaBalazinska
1
…
Large-ScaleCloudDataAnalytics
2
Large-ScaleData
MyriaSpark
Hadoop
Impala
MemoryGiraph
• Memoryintensive• Deployedonclusters• Sharedenvironments
MemoryManagement
MyriaQuery
Container-BasedCloudMemoryManagement
3
SparkApp
MyriaQuery
ResourceManagerSparkApp
MyriaQuery
Containers
• Hard limits: Good forisolation but lack flexibility• Estimating memory usage before execution is hard
0 2 4 6 8 10 12 14 16Heap Size (GB)
20
50100200
500
Que
ry T
ime
(s)
1..
MyriaSpark 1.1Spark 2.0
InaccurateMemoryEstimatesAffectPerformance
• Applicationfailures due to out-of-memory• Performance degradationduetogarbage collection
4Aself-joinquery
OOM
OurApproach:ElasticMem• Makecontainermemorylimitsdynamic
• Allocatememorytomultiple applications– Perform actions:garbagecollection,change memlimits,etc
• Predicthowmemoryactionsaffectperformance– Usepredictionstodrivememoryallocationdecisions
• Ourfocus:analytical(relational) queriesinJava-basedcontainers (JVM)
5
OurApproach:ElasticMem• Makecontainermemorylimitsdynamic
• Allocatememorytomultiple applications– Perform actions:garbagecollection,change memlimits,etc
• Predicthowmemoryactionsaffectperformance– Usepredictionstodrivememoryallocationdecisions
6
ImplementingDynamicHeapAdjustmentinaJVM
• OpenJDK hasarigiddesign:– Reserveheapspacebasedonuser-specifiedvalue– Cannotbechangedduringruntime
• Butmemoryovercommitting+64-bitaddressspaceopensup anopportunity– Reserveandcommitalargeaddressspace• Doesnotphysicallyoccupymemory
– Adjustlimitsaccordingtoactualusage
7
ImplementingDynamicHeapAdjustmentinaJVM
8
Change limit(GROW)
GC(Shrink)DeadLive
ResourceManager
JVM
Socket
Kill
Live
Actions
(OpenJDK 7)
Dynamiclimit
Used
OurApproach:ElasticMem• Makecontainermemorylimitsdynamic
• Allocatememorytomultiple applications– Perform actions:garbagecollection,change memlimits, etc
• Predicthowmemoryactionsaffectperformance– Usepredictionstodrivememoryallocationdecisions
9
DynamicMemoryAllocation
• Problemdescription:–Multiplequeriessharingmemory– At each timestep, allocatememorybyperformingactions
– Goal:Reducequerytimesandfailures• 0-1knapsackproblem:– Capacity:totalmemory– Items:JVMmemoryusagesafterperformingactions– Itemvalue:definedonmultipleattributes
10
DynamicMemoryAllocation
11
SparkApp
MyriaQuery
SparkApp
MyriaQuery
GC GROW
SparkApp
GROW
MyriaQuery
KILL
SparkApp
MyriaQuery
SparkApp
MyriaQuery
PAUSE GROW
GROWGROW
ValuesofActionsandStates
• Kill(KILL):#ofkilledqueries,fewerisbetter• Pause (NOOP):#ofpausedqueries,fewer is better• Costto acquire more memory(cost)– Time/space efficiency
12
Action Value.KILL Value.NOOP Value.costKILL 1 0 N/A
NOOP 0 1 N/A
Others 0 0 time/space
• Valueofastate:sumofactionvalues• Lexicographicorder
ValuesofActions:TimeandSpace
• Increase memory limit (GROW):– Space:estimatedheapgrowth• Maximumheapusagechangeinthepast few timesteps
– Time:acquiringandaccessingmemoryfromOS• Runacalibrationprogram
• Reclaim memory (GCactions)– Space:sizeofrecycledmemory– Time:GCtime– Howtopredictthemfromheapstates?
13
OurApproach:ElasticMem• Makecontainermemorylimitsdynamic
• Allocatememorytomultiple applications– Perform actions:garbagecollection,change memlimits,etc
• Predicthowmemoryactionsaffectperformance– Usepredictionstodrivememoryallocationdecisions
14
BuildPerformanceModelsfromHeapStates
• Our focus: analytical (relational) queries– Largein-memorydatastructures
• Pickhashtablesasourfocus– Predicttime&spacefordifferentGCsfromstats
Join
Aggregate
Select Select
HashTable✕ 2
HashTable
15
FeaturesofHashTables
9819595054
…
Alice Bob Carol
Alice
… …
• #oftuples• #ofkeys• Schemainformation
– #oflong columns– #ofString columns– SumoflengthsofString
Keys(zip code),long Values(name),String
16
Join
Select Select
…
FeaturesofHashTables
98195
95054
Alice Bob Carol
Alice98195 Alice Bob
98195 Alice Bob
• #oftuples&#ofkeys:Total&deltasincelastGC• 7features to collect,4valuestopredict
GC
17
L LL D D L L L
L L LL LL D D DYoung Collection
Full Collection
LforliveobjectsDfordeadobjectsYoungGen OldGen
Evaluation:GCModels
• Model: M5PinWeka• Training:generatehashtableswithspecificfeaturevalues
18
Run a query with thegenerated hash table
WekaM5P
Pick featurevalues Collect stats
Evaluation:GCModels
• Testing:17 TPC-H queries, randomlytriggerGCs
Valuesto Predict RelativeAbsoluteErrors(RAE)
Totalsize ofliveobjectintheyounggeneration(ylive)
23%
Totalsize ofliveobjectintheoldgeneration(olive)
6%
Timeforayounggeneration GC(gcy) 25%Timeforanoldgeneration GC(gco) 22%
19
Evaluation:Scheduling
• OneAmazonEC2r3.4xlargeinstance• 4mostmemoryintensiveTPC-Hquerieswithscalefactors1and2onMyria
• Original:OpenJDK 7u85– Serialexecution/fullyparallel
• Elastic:ourapproach– Resubmit:resubmittingkilledqueriesseriallyafterallqueriescomplete
20
Total Memory (GB)10 15 20 30 40 50 70 90
0
500
1000
1500
Ela
psed
Tim
e (s
) 78
88
8
8
8
8
8
8
8
8
8
8
8
8
Elastic-Resubmit, U=1/12ori,null,1,f
ComparetoSerial:MuchLessTimeinMostCases
21
Total Memory (GB)10 15 20 25 30 40 50 70 90
0
500
1000
1500
2000
Elap
sed
Tim
e (s
)
4 4
78
2 45
6
8 8
468 8
8
58 8
8
7 58 8
8
7 68 8
8
8 8 8
8
87
8 8
8
87
8 8
8
8 8
Elastic-Resubmit, U=1/12Elastic, U=1/12
Original, DOP=1Original, DOP=4
Original, DOP=8
#ofcompletedqueries Total Memory (GB)10 15 20 30 40 50 70 90
0
500
1000
1500
Elap
sed
Tim
e (s
) 78
88
8
8
8
8
8
8
8
8
8
8
8
8
Elastic-Resubmit, U=1/12Original, Serial
Total Memory (GB)10 15 20 30 40 50 70 90
0
500
1000
1500
Ela
psed
Tim
e (s
) 7
2 4
8
48 58
68 8
78
78 8
Elastic-Resubmit, U=1/12ori,null,8,f
ComparetoFullyParallel:LessFailures,LessTime
22
Total Memory (GB)10 15 20 25 30 40 50 70 90
0
500
1000
1500
2000
Elap
sed
Tim
e (s
)
4 4
78
2 45
6
8 8
468 8
8
58 8
8
7 58 8
8
7 68 8
8
8 8 8
8
87
8 8
8
87
8 8
8
8 8
Elastic-Resubmit, U=1/12Elastic, U=1/12
Original, DOP=1Original, DOP=4
Original, DOP=8
Total Memory (GB)10 15 20 30 40 50 70 90
0
500
1000
1500
Elap
sed
Tim
e (s
) 7
2 4
8
48 58
68 8
78
78 8
Elastic-Resubmit, U=1/12Original, Fully Parallel
• AdvantagesofElasticMem:– Automaticallyadjustsconcurrencylevel– Fasterqueryexecutionsandfewerfailures– Lowoverheadincaseserialexecutionisnecessary
2 4 4 5 6 7 7
7
Total Memory (GB)10 15 20 30 40 50 70 90
0
20
40
60
80
GC
Tim
e R
educ
tion
Ove
r Ful
ly P
aral
lel
(%)
Elastic, U=1/8Elastic, U=1/12
Elastic, U=1/16Elastic, U=500MB
Elastic, U=1000MB
GCTimeReduction:Upto80%
23
• DifferentmemoryincrementsU:– Fixed(U=500MB)ordynamic(U=1/12offreespace)
• Whenmemoryisabundant,carefultuningofU isnotrequired
OtherResults
24
• Query time saving up to 30%• Elasticmethodsusememorymoreefficiently
ElasticMem:Conclusion
• Schedulingwithhardmemorylimitsisinefficient• AvoidusingcontainerswithhardlimitsbymodifyingJVM
• Designaschedulingalgorithmtoallocatememoryacrossmultipleapplicationsinrealtime– BuildmodelstopredictGCtimeandspacesaving– Reducequerytimeupto30%,GCtimeupto80%,usememorymoreefficiently
25