27
Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning Alessandro Pellegrini Sapienza, University of Rome HPCS 2016

Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

Embed Size (px)

Citation preview

Page 1: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

Optimizing Memory Management for Optimistic Simulationwith Reinforcement Learning

Alessandro Pellegrini

Sapienza, University of Rome

HPCS 2016

Page 2: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

Context

• Simulation is a powerful technique to explore complex scenarios

• Parallel Discrete Event Simulation (PDES) has been applied to alarge set of research fields

• Speculative Simulation (Time Warp-based) is proven to beeffective to deliver high performance simulations

• Ensuring consistency of speculative simulation requires effort

• Transparency towards the application-model developer is critical

2 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 3: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

Organization of a Time Warp Kernel

Communication Network

Machine

CPU

Kernel

LPLP

LP LPLP

LP LPLP

LP LPLP

LP

...

...

CPU CPU CPU

Machine

CPU

Kernel

...CPU CPU CPU

Kernel

3 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 4: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

The Synchronization Problem

LPi

LPj

LPk Execution Time

Execution Time

Execution Time

4 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 5: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

The Synchronization Problem

LPi

LPj 15

5

LPk Execution Time7

10 Execution Time

Execution Time

4 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 6: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

The Synchronization Problem

LPi

LPj 15

5

LPk Execution Time7 17

10

17

Execution Time

Execution Time

4 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 7: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

The Synchronization Problem

LPi

LPj 15

5 10

20

LPk Execution Time7 17 25

10

17

Execution Time

Execution Time

4 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 8: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

The Synchronization Problem

LPi

LPj 15

5 10

20

12

LPk Execution Time7 17 25

10

17

Execution Time

Execution Time

4 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 9: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

The Synchronization Problem

LPi

LPj 15

5 10

20

Straggler Message

12

LPk Execution Time7 17 25

10

17

Rollback Execution:

Recovering state at

LVT 10

Execution Time

Execution Time

4 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 10: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

The Synchronization Problem

LPi

LPj 15

5 10

20

Straggler Message

12

LPk Execution Time7 17 25

10

17 17

Anti-message

anti-message

reception

Rollback Execution:

Recovering state at

LVT 10

Rollback Execution:

Recovering State at

LVT 7

Execution Time

Execution Time

4 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 11: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

The Synchronization Problem

LPi

LPj 15

5 10

20 12

Straggler Message

12

LPk Execution Time7 17 25

10

25

17 17

Anti-message

anti-message

reception

Rollback Execution:

Recovering state at

LVT 10

Rollback Execution:

Recovering State at

LVT 7

Execution Time

Execution Time

4 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 12: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

Memory Management and Rollbacks

• How can a runtime environment restore a state?

• It has to know the complete memory map of each LP

• It should take “sometimes” a snapshot of that map

• The snapshot could be either full or incremental

• Memory management is fundamental to Time Warp systems◦ Too many snapshots: memory/latency inefficiency◦ Too few: rollbacks are long!◦ Full vs incremental: how to decide?

5 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 13: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

Memory Management and Rollbacks

• How can a runtime environment restore a state?

• It has to know the complete memory map of each LP

• It should take “sometimes” a snapshot of that map

• The snapshot could be either full or incremental

• Memory management is fundamental to Time Warp systems◦ Too many snapshots: memory/latency inefficiency◦ Too few: rollbacks are long!◦ Full vs incremental: how to decide?

5 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 14: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

Our memory management organization

• Transparency◦ Interception of memory-related operations (no platform APIs)◦ No application-level procedure for (incremental) log/restore tasks

• Optimism-Aware Runtime Supports◦ Recoverability of generic memory operations: allocation, deallocation,

and updating

• Incrementality◦ Cope with memory “abuse” of speculative rollback-based

synchronization schemes◦ Enhance memory locality

6 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 15: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

Our memory management organization

• Lightweight software instrumentation◦ Optimized memory-write access tracing and logging◦ Arbitrary-granularity memory-write tracing◦ Concentration of most of the instrumentation tasks at a pre-running

stage:

• No costly runtime dynamic disassembling

• Standard API wrappers◦ Code can call standard malloc services◦ Memory map transparently managed by the simulation platform

7 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 16: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

Our memory management organization

malloc_area

malloc_area

... Block status

bitmap

Dirty

bitmap

chunk

chunk

...

• Memory (for each LP) is pre-allocated

• Requests are served on a chunk basis

• Explicit avoidance of per-chunk metadata◦ Block status bitmap: tracks used chunks◦ Dirty bitmap: tracks updated chunks since last log

8 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 17: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

Our memory management organization

• We use Hijacker to track memory-update instructions

mov $3, xoriginal memory

update

jmp *%eaxindirect branch

mov $3, x

jmp .Jump

call track

jmp 0xXXXX.Jump:

push struct

new writeable section regular jump modi ed

by branch_corrector

call corrector

Instrumentation Process

Original Executable Final executable

push struct

regular jump

• Multi-code packs two different version of the program9 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 18: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

Memory management self-optimization

• To optimize the memory manager we have to determine:◦ When to take a snapshot◦ Its mode (incremental vs incremental)

• But to take an incremental log, tracing must be active

• Traditional approaches are based on analytic models◦ Periodic recomputation (e.g., checkpoint interval)◦ Non-responsive if dynamics change fast◦ Might not capture secondary effects

• We use reinforcement learning

10 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 19: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

Memory management self-optimization

• To optimize the memory manager we have to determine:◦ When to take a snapshot◦ Its mode (incremental vs incremental)

• But to take an incremental log, tracing must be active

• Traditional approaches are based on analytic models◦ Periodic recomputation (e.g., checkpoint interval)◦ Non-responsive if dynamics change fast◦ Might not capture secondary effects

• We use reinforcement learning

10 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 20: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

Reinforcement Leraning-based self-optimization

• An agent takes an action in the environment depending on thecurrent state

• An a-posteriori reward tells whether the choice was good

• Previous decisions affect future ones (we learn from history!)

• With some random probability, we ignore history and explore

• We take an action after the execution of each event

• After some knowledge has been acquired, the system can becomevery responsive

11 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 21: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

States and actions

Actions: Monitored, Unmonitored, Checkpoint

12 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 22: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

The reward

• We want to reduce the time spent in non-necessary tasks

• We define the expected computation loss:

Γ = E[∫ T

0X (t)dt

]• where:

X (t) =

0 if x = Non-IncrementalδM

(δe+δM) if x = Incremental

1 − γ if x = CKPTI

1 if x = CKPTF

1 if x = Rollback

13 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 23: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

Experimental Setup

• 64-bit NUMA machine, 24 cores, 32GB of RAM

• SuSe Enterprise 11, Linux 2.6.32.13

• GSM coverage simulation model

• High fidelity model (fading, power regulation, meteorologicalconditions)

• Ring highway coverage

• 1000 channels per cell

• Variable call interarrival (simulation of one week of traffic)

14 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 24: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

Experimental Results

0

50000

100000

150000

200000

250000

10000 30000 50000 70000

Cum

ulat

ed C

omm

itted

Eve

nts

Wall-clock-time (sec)

Overall Execution Speed

Q-LearningIncremental

Full

15 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 25: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

Experimental Results

0

20000

40000

60000

80000

100000

120000

RL Genetic

Total execution time

16 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 26: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

Experimental Results

0

5

10

15

20

25

30

35

40

200 400 600 800 1000Eve

nts

Com

mitt

ed b

etw

een

Che

ckpo

ints

Logical Processes

Checkpoint Interval

17 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning

Page 27: Optimizing Memory Management for Optimistic Simulation ...pellegrini/talks/hpcs16.pdfLPj LPk Execution Time Execution Time Execution Time 4 of 18 - Optimizing Memory Management for

Thanks for your attention

Questions?

[email protected]

http://www.dis.uniroma1.it/∼pellegrini

https://github.com/HPDCS/ROOT-Sim

18 of 18 - Optimizing Memory Management for Optimistic Simulation with Reinforcement Learning