©2019 VMware, Inc. Don't shoot down TLB shootdowns! Nadav Amit, Amy Tai, Michael Wei April 2020

Don't shoot down TLB shootdowns! - EuroSys 2020 · 2020. 4. 29.

Translation Lookaside Buffer (TLB)

TLB = cache for virtual-to-physical address translations

[Figure: a virtual address is resolved through the page-tables (PGD, PUD, PMD, PTE); the TLB caches the resulting VA→PA translation]
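
As a rough illustration of what the TLB caches, the sketch below splits a virtual address into the four table indices named on this slide (PGD, PUD, PMD, PTE) plus a page offset. It assumes x86-64 4-level paging with 4 KiB pages, i.e. 9 index bits per level and a 12-bit offset. `split_va` and the constants are illustrative names, not kernel identifiers.

```python
# Toy sketch: decompose a virtual address for a 4-level page-table walk.
# Assumption: x86-64, 4 KiB pages, 9 index bits per level.

PAGE_SHIFT = 12                          # 4 KiB pages -> 12 offset bits
BITS_PER_LEVEL = 9                       # 512 entries per table
LEVELS = ("pgd", "pud", "pmd", "pte")    # top level first


def split_va(va: int) -> dict:
    """Return the per-level table indices and the page offset of va."""
    parts = {"offset": va & ((1 << PAGE_SHIFT) - 1)}
    # Walk from the lowest level (pte) upwards, 9 bits at a time.
    for depth, name in enumerate(reversed(LEVELS)):
        shift = PAGE_SHIFT + depth * BITS_PER_LEVEL
        parts[name] = (va >> shift) & ((1 << BITS_PER_LEVEL) - 1)
    return parts


# Build an address from known indices, then take it apart again.
example = split_va((1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5)
print(example)
```

A TLB hit skips all four table lookups, which is why a stale entry is invisible to the page-tables themselves.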

TLB Coherency

Hardware does not keep TLBs coherent

The problem is left to software (the OS)

[Figure: PTEs and TLBs holding different translations of the same address (VA→PA, VA→PA’, VA→PA’’); the TLBs are marked incoherent]

TLB Shootdown (in Linux)

[Figure: timeline of an initiator core and a responder core]
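
The incoherence problem, and the job a shootdown performs, can be seen in a toy model. This is a sketch only: `ToyCPU` is a hypothetical class, the TLB is a plain dictionary, and the "shootdown" is a direct method call rather than an inter-processor interrupt.

```python
# Toy model: two CPUs with private TLBs sharing one page table.

class ToyCPU:
    """A CPU whose private TLB caches page-table lookups."""

    def __init__(self, page_table):
        self.page_table = page_table
        self.tlb = {}

    def translate(self, vpn):
        if vpn not in self.tlb:              # TLB miss: walk the page table
            self.tlb[vpn] = self.page_table[vpn]
        return self.tlb[vpn]                 # TLB hit: page table not consulted

    def flush(self, vpn):
        self.tlb.pop(vpn, None)


page_table = {0x10: 0xA0}                    # virtual page 0x10 -> frame 0xA0
cpu0, cpu1 = ToyCPU(page_table), ToyCPU(page_table)
cpu0.translate(0x10)
cpu1.translate(0x10)                         # both TLBs now cache 0x10 -> 0xA0

page_table[0x10] = 0xB0                      # the OS on cpu0 changes the PTE
cpu0.flush(0x10)                             # a local flush alone is not enough
stale = cpu1.translate(0x10)                 # cpu1 still sees the old frame

cpu1.flush(0x10)                             # remote flush: what a shootdown achieves
fresh = cpu1.translate(0x10)
print(hex(stale), hex(fresh))
```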

Challenge

TLB shootdowns are expensive.

How can we further optimize them?

This work focuses on:
• Linux/x86 – common lessons
• Userspace mappings – common case

Lessons are relevant to other environments

Existing Solutions

Hardware-based TLB invalidations
• Not available on all architectures
• Do not coexist (yet) with software techniques:
  – No selective target cores for TLB invalidation

Software solutions
• Replicating page-tables [RadixVM, Clements’13]
  – Can increase overhead with low-latency IPIs
• Aggressive batching [LATR, Kumar’18]
  – Breaks POSIX semantics

TLB Flushes in Linux and FreeBSD

[Figure: timeline of initiator and responder; the initiator busy-waits]

Optimization 1: Concurrent Flushes (forgotten lesson)

RP3 TLB consistency algorithm [Rosenburg’89]

[Figure: timeline of initiator and responder]
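
The benefit of overlapping the initiator's own flush with the responders' work can be approximated with a simple cost model. This is a sketch under invented cycle counts, not measurements from the talk; it ignores contention and treats the remote phase as one number, and the function names are hypothetical.

```python
# Toy cost model: initiator latency with and without overlapping flushes.

def serial_shootdown(local_flush, ipi_send, remote_phase):
    """Local flush and remote phase happen one after the other."""
    return local_flush + ipi_send + remote_phase


def concurrent_shootdown(local_flush, ipi_send, remote_phase):
    """IPIs go out first; the local flush overlaps with the remote phase."""
    return ipi_send + max(local_flush, remote_phase)


# Invented cycle counts, for illustration only.
LOCAL_FLUSH = 1000      # initiator flushing its own TLB
IPI_SEND = 500          # sending the inter-processor interrupts
REMOTE_PHASE = 3000     # responders flushing and acknowledging

serial = serial_shootdown(LOCAL_FLUSH, IPI_SEND, REMOTE_PHASE)
overlapped = concurrent_shootdown(LOCAL_FLUSH, IPI_SEND, REMOTE_PHASE)
print(serial, overlapped)
```

In this model the saving equals the shorter of the two overlapped phases, so it never makes the initiator slower.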

TLB Shootdown Responder

[Figure: responder stages Entry, SMP, TLB; annotated with Page Table Isolation]

Optimization 2: Cacheline Consolidation

[Figure: SMP info and TLB flush info in memory; responder stages Entry, SMP, TLB]
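
Why consolidation helps can be sketched by counting the cachelines a responder has to fetch. The addresses and sizes below are invented, a 64-byte cacheline is assumed, and `lines_touched` is a hypothetical helper, not a kernel function.

```python
# Toy sketch: how many cachelines does the responder read?

CACHELINE = 64  # bytes; assumed line size


def lines_touched(regions):
    """Return the set of cacheline indices covered by (address, size) regions."""
    lines = set()
    for addr, size in regions:
        first = addr // CACHELINE
        last = (addr + size - 1) // CACHELINE
        lines.update(range(first, last + 1))
    return lines


# Invented layout: 16 bytes of SMP info, 32 bytes of TLB flush info.
separate = lines_touched([(0x1000, 16), (0x2000, 32)])      # far apart in memory
consolidated = lines_touched([(0x1000, 16), (0x1010, 32)])  # packed back to back
print(len(separate), len(consolidated))
```

Each line the responder misses on has to be transferred from another core, so fewer lines means fewer coherence transfers on the responder's path.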

Optimization 3: Early Acknowledgment

[Figure: responder stages Entry, SMP, TLB]

Safe: flush will happen
Better: initiator is faster
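
One way to read the two claims on this slide is as a statement about ordering. The sketch below assumes that the responder acknowledges as soon as it enters the handler and that the flush still completes before the responder returns to the interrupted code. Both functions and the cycle counts are hypothetical.

```python
# Toy sketch: ordering of responder steps with and without early acknowledgment.

def responder_events(early_ack):
    """Order of the responder's steps inside the IPI handler."""
    steps = ["enter handler"]
    steps += ["ack", "flush TLB"] if early_ack else ["flush TLB", "ack"]
    steps.append("return to interrupted code")
    return steps


def initiator_wait(entry_cycles, flush_cycles, early_ack):
    """Cycles the initiator busy-waits until it sees the acknowledgment."""
    return entry_cycles + (0 if early_ack else flush_cycles)


early = responder_events(early_ack=True)
late_wait = initiator_wait(1500, 2000, early_ack=False)    # invented cycle counts
early_wait = initiator_wait(1500, 2000, early_ack=True)
print(early, late_wait, early_wait)
```

In both orderings the flush precedes the return to the interrupted code; only the point at which the initiator is released changes.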

Optimization 4: In-Context Flushes

[Figure: responder stages Entry, SMP, TLB]

1. Efficient
2. Better batching

In the Paper

Userspace-safe batching
• Deferring TLB shootdowns while the kernel runs

Avoiding TLB flushes on Copy-on-Write
• Special case we can optimize

TLB flushes in virtualization
• The effect of page size mismatch

Many important and subtle details

Evaluation: Unmapping and Flushing 10 PTEs
madvise(MADV_DONTNEED)

[Figure: two bar charts, titled Initiator and Responder. Y-axis: cycles. X-axis groups: same core, same socket, different socket. Series: base, concurrent, cacheline, early-ack, in-context.]

Values labeled on the bars (cycles), in the order they appear per group:
• First panel: 16208, 7685; 14361, 6247; 16475, 6929
• Second panel: 8411, 6785; 7313, 5879; 8039, 6290
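
The evaluated operation can be reproduced from user space. The sketch below, which assumes Linux and Python 3.8 or later, maps 10 private anonymous pages, touches them so that their PTEs are populated, and then drops them with `madvise(MADV_DONTNEED)`. It only exercises the system call and measures nothing; whether the kernel actually sends shootdown IPIs depends on whether other threads of the process are running on other cores. `touch_and_drop` is an illustrative name.

```python
# Sketch (Linux only): populate 10 PTEs, then unmap them with MADV_DONTNEED.
import mmap

PAGE = mmap.PAGESIZE


def touch_and_drop(num_pages=10):
    """Return the first byte of each page before and after MADV_DONTNEED."""
    length = num_pages * PAGE
    flags = mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS
    with mmap.mmap(-1, length, flags=flags) as buf:
        for page in range(num_pages):
            buf[page * PAGE] = 0xFF                    # fault the page in
        before = bytes(buf[p * PAGE] for p in range(num_pages))
        buf.madvise(mmap.MADV_DONTNEED, 0, length)     # kernel drops the pages
        # Private anonymous pages read back as zero after MADV_DONTNEED.
        after = bytes(buf[p * PAGE] for p in range(num_pages))
    return before, after


before, after = touch_and_drop()
print(before, after)
```

A private mapping is used on purpose: with a shared anonymous mapping the data would survive the call, so the effect would not be visible.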

Evaluation: SysBench – Random Writes

[Figure: line plot. Y-axis: speedup (1 to 1.25). X-axis: threads [#] (0 to 25). Series: base, concurrent, cacheline-consol, early-ack, in-context flushes, userspace-safe batching.]

Random writes

Periodic flushes

Memory-mapped file

Emulated persistent memory, no write-cache

Conclusions

TLB shootdowns can be improved

Doing it well in software → better hardware interfaces

We are working to push these enhancements to Linux