Counter Wars (JEEConf 2016)

Aleksey Fedorov, Odnoklassniki

Counter Wars

Why are you here?

There are many interesting talks in the other rooms

Counter

public interface Counter {
    long get();
    void increment();
}

Simple Counter

class SimpleCounter implements Counter {
    private long value = 0;

    public long get() {
        return value;
    }

    public void increment() {
        value++;
    }
}

Volatile Counter

class VolatileCounter implements Counter {
    volatile long value = 0;

    public long get() {
        return value;
    }

    public void increment() {
        value++;
    }
}

Volatile Counter

class VolatileCounter implements Counter {
    volatile long value = 0;

    public long get() {
        return value;
    }

    public void increment() {
        long oldValue = value;          // read
        long newValue = oldValue + 1;   // modify
        value = newValue;               // write
    }
}
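
Spelled out this way, the increment is three separate steps. volatile makes each step visible to other threads, but it does not make the group atomic. A minimal sketch: it replays one unlucky interleaving of two increments by hand in a single thread, so the outcome is deterministic. LostUpdateDemo and replayInterleaving are hypothetical names, and the "threads" A and B are only local variables.

```java
// Hypothetical demo: replays one unlucky interleaving of two increments
// by hand, so the result does not depend on thread scheduling.
class LostUpdateDemo {
    static volatile long value = 0;

    static long replayInterleaving() {
        value = 0;
        long oldA = value;      // thread A: read  (sees 0)
        long oldB = value;      // thread B: read  (also sees 0)
        long newA = oldA + 1;   // thread A: modify
        long newB = oldB + 1;   // thread B: modify
        value = newA;           // thread A: write (value is 1)
        value = newB;           // thread B: write (value is still 1, A's update is lost)
        return value;
    }

    public static void main(String[] args) {
        // Two increments were attempted, but only one is visible.
        System.out.println("value after two increments: " + replayInterleaving());
    }
}
```

With real threads the same interleaving happens only occasionally, which is why the bug is easy to miss in testing.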

Synchronized Counter

class SynchronizedCounter implements Counter {
    volatile long value = 0;

    public synchronized long get() {
        return value;
    }

    public synchronized void increment() {
        value++;
    }
}

Synchronized Counter

class SynchronizedCounter implements Counter {
    long value = 0;

    public synchronized long get() {
        return value;
    }

    public synchronized void increment() {
        value++;
    }
}
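
One way to check that no increments are lost is to hammer the counter from several threads and compare the result with the expected total. The harness below is a sketch, not the benchmark used in the talk. SynchronizedCounterCheck and run are hypothetical names, and the counter is declared again inside the class so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical harness: N threads each call increment() M times.
class SynchronizedCounterCheck {
    static class SynchronizedCounter {
        private long value = 0;
        public synchronized long get() { return value; }
        public synchronized void increment() { value++; }
    }

    static long run(int threads, int incrementsPerThread) {
        SynchronizedCounter counter = new SynchronizedCounter();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.increment();
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();   // wait until every worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        // With synchronized, no increment is lost: 4 * 100_000 = 400_000.
        System.out.println(run(4, 100_000));
    }
}
```

Swapping in the plain or volatile counter makes the total come up short on most runs, by an amount that varies from run to run.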

Test bench

Core(TM) i7-4770, 4 x 2 x 2.0 GHz (downscaled)

Linux Ubuntu 14.04.4, 3.13.0-86-generic x86_64

taskset -c 0,1,2,3 (thread affinity)

Benchmarks, op/µs

Thanks to Nitsan Wakart: http://psy-lob-saw.blogspot.ru/2014/06/jdk8-update-on-scalable-counters.html

               1 thread   2 threads  2 threads  2 threads  4 threads  8 threads
               Core 0     Core 0     Cores 0,4  Cores 0,1  Cores 0-3  Cores 0-7
LONG           308        267        182        220        180        86
VOLATILE_LONG  77         77         79         22         21         35
SYNCHRONIZED   26         43         27         12         12         13

Conclusion 1: synchronization costs something

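Lock Counter

import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class LockCounter implements Counter {
    long value;
    final Lock lock = new ReentrantLock();

    public long get() {
        lock.lock();   // acquire before try, so unlock() runs only if the lock is held
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }

    public void increment() {   // named add() on the slide; renamed to match Counter
        lock.lock();
        try {
            value += 1;
        } finally {
            lock.unlock();
        }
    }
}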

Benchmarks, op/µs

               1 thread   2 threads  2 threads  2 threads  4 threads  8 threads
               Core 0     Core 0     Cores 0,4  Cores 0,1  Cores 0-3  Cores 0-7
SYNCHRONIZED   26         43         27         12         12         13
REENTRANTLOCK  32         32         18         5          20         20

http://mechanical-sympathy.blogspot.com/2011/11/java-lock-implementations.html
http://mechanical-sympathy.blogspot.com/2011/11/biased-locking-osr-and-benchmarking-fun.html
http://www.javaspecialist.ru/2011/11/synchronized-vs-reentrantlock.html
http://dev.cheremin.info/2011/11/synchronized-vs-reentrantlock.html

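Lock Counter

import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class LockCounter implements Counter {
    long value;
    final Lock lock = new ReentrantLock(true);   // true = fair: the longest-waiting thread is favored

    public long get() {
        lock.lock();   // acquire before try, so unlock() runs only if the lock is held
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }

    public void increment() {   // named add() on the slide; renamed to match Counter
        lock.lock();
        try {
            value += 1;
        } finally {
            lock.unlock();
        }
    }
}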

Benchmarks, op/µs

               1 thread   2 threads  2 threads  2 threads  4 threads  8 threads
               Core 0     Core 0     Cores 0,4  Cores 0,1  Cores 0-3  Cores 0-7
SYNCHRONIZED   26         43         27         12         12         13
UNFAIR_LOCK    32         32         18         5          20         20
FAIR_LOCK      31         5 ± 9      0.5 ± 0.3  0.26       0.24       0.23

How much slower than the unfair lock? By two orders of magnitude! Is that scary?

How a typical Core i7 is laid out

[Diagram: four physical cores, each exposing two hardware threads (CPU0/CPU4, CPU1/CPU5, CPU2/CPU6, CPU3/CPU7). Each core has its own L1 and L2 cache; a single L3 cache is shared.]

Conclusion 2: fairness costs something

Atomics and CAS

Compare and Swap: Hardware Support

compare-and-swap (CAS): cmpxchg (x86)

load-link / store-conditional (LL/SC): ldrex/strex (ARM), lwarx/stwcx (PowerPC)
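
In Java the CAS primitive is exposed through compareAndSet on the atomic classes. A small sketch of its contract, using the hypothetical class CasSemanticsDemo: the write happens only if the current value equals the expected one.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo of the compareAndSet contract.
class CasSemanticsDemo {
    static String run() {
        AtomicLong value = new AtomicLong(10);

        // The expected value matches the current one: the swap happens.
        boolean first = value.compareAndSet(10, 11);

        // The expected value is stale (the current value is now 11): nothing is written.
        boolean second = value.compareAndSet(10, 12);

        return first + " " + second + " " + value.get();
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints "true false 11"
    }
}
```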

CAS Counter

import java.util.concurrent.atomic.AtomicLong;

public class CasLoopCounter implements Counter {
    private AtomicLong value = new AtomicLong();

    public long get() {
        return value.get();
    }

    public void increment() {
        for (;;) {
            long oldValue = value.get();
            long newValue = oldValue + 1;
            if (value.compareAndSet(oldValue, newValue))
                return;
            // CAS failed: another thread updated the value first, so retry
        }
    }
}

Benchmarks, op/µs

               1 thread   2 threads  2 threads  2 threads  4 threads  8 threads
               Core 0     Core 0     Cores 0,4  Cores 0,1  Cores 0-3  Cores 0-7
SYNCHRONIZED   26         43         27         12         12         13
UNFAIR_LOCK    32         32         18         5          20         20
CAS_LOOP       62         62         45         10         6          5

Get-and-Add Counter

import java.util.concurrent.atomic.AtomicLong;

public class GetAndAddCounter implements Counter {   // the slide reuses the name CasLoopCounter
    private AtomicLong value = new AtomicLong();

    public long get() {
        return value.get();
    }

    public void increment() {
        value.getAndAdd(1);
    }
}

AtomicLong.getAndAdd()

Benchmarks, op/µs

               1 thread   2 threads  2 threads  2 threads  4 threads  8 threads
               Core 0     Core 0     Cores 0,4  Cores 0,1  Cores 0-3  Cores 0-7
SYNCHRONIZED   26         43         27         12         12         13
UNFAIR_LOCK    32         32         18         5          20         20
CAS_LOOP       62         62         45         10         6          5
GET_AND_ADD    100        100        97         27         28         28

True Sharing

[Diagram: the Core i7 layout with four physical cores, two hardware threads per core (CPU0/CPU4, CPU1/CPU5, CPU2/CPU6, CPU3/CPU7), per-core L1 and L2 caches, and one shared L3 cache.]

atomicLong.getAndAdd(5), ops/μs

threads  JDK7u95  JDK8u72
1        60       100
2        9        27
3        7        27
4        6        27

atomicLong.getAndAdd(5)

JDK7u95, -XX:+PrintAssembly

loop:
    mov    0x10(%rbx),%rax
    mov    %rax,%r11
    add    $0x5,%r11
    lock cmpxchg %r11,0x10(%rbx)
    sete   %r11b
    movzbl %r11b,%r11d
    test   %r10d,%r10d
    je     loop

atomicLong.getAndAdd(5)

JDK7u95, -XX:+PrintAssembly

loop:
    mov    0x10(%rbx),%rax
    mov    %rax,%r11
    add    $0x5,%r11
    lock cmpxchg %r11,0x10(%rbx)
    sete   %r11b
    movzbl %r11b,%r11d
    test   %r10d,%r10d
    je     loop

JDK8u72, -XX:+PrintAssembly

    lock addq $0x5,0x10(%rbp)

AtomicLong.getAndAdd() in JDK7: cmpxchg

AtomicLong.getAndAdd() in JDK8: lock addq, JVM intrinsic

Conclusion 3: don't believe everything written in the OpenJDK sources
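
For reference, the JDK 7 library code for getAndAdd is essentially a compare-and-set retry loop, which is what the lock cmpxchg loop in the JDK7u95 listing reflects. A sketch of that shape, written as a standalone helper rather than copied from the JDK (CasLoopGetAndAdd is a hypothetical name):

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper with the same contract as AtomicLong.getAndAdd(delta),
// implemented as a CAS retry loop.
class CasLoopGetAndAdd {
    static long getAndAdd(AtomicLong target, long delta) {
        for (;;) {
            long current = target.get();
            long next = current + delta;
            if (target.compareAndSet(current, next)) {
                return current;   // the value before the addition
            }
            // Another thread changed the value in between: retry.
        }
    }

    public static void main(String[] args) {
        AtomicLong value = new AtomicLong(37);
        long previous = getAndAdd(value, 5);
        System.out.println(previous + " -> " + value.get());   // prints "37 -> 42"
    }
}
```

On JDK 8 the JIT treats the call as an intrinsic, so reading the Java source alone does not tell you which instructions end up running.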

JDK8

StampedLock Counter

import java.util.concurrent.locks.StampedLock;

public class StampedLockCounter implements Counter {
    private long value = 0;
    private StampedLock lock = new StampedLock();

    public long get() { ... }

    public void increment() {   // named add() on the slide; renamed to match Counter
        long stamp = lock.writeLock();
        try {
            value++;
        } finally {
            lock.unlock(stamp);
        }
    }
}
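
The slide elides get(). One option StampedLock offers for read-mostly access is an optimistic read that falls back to a read lock when a writer interferes. The sketch below is an assumption about how such a get() could look, not the speaker's code; OptimisticStampedCounter is a hypothetical name.

```java
import java.util.concurrent.locks.StampedLock;

// Hypothetical variant, not the speaker's code: get() via an optimistic read.
class OptimisticStampedCounter {
    private long value = 0;
    private final StampedLock lock = new StampedLock();

    public long get() {
        long stamp = lock.tryOptimisticRead();   // does not block
        long result = value;
        if (!lock.validate(stamp)) {
            // A writer held the lock or intervened: fall back to a real read lock.
            stamp = lock.readLock();
            try {
                result = value;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return result;
    }

    public void increment() {
        long stamp = lock.writeLock();
        try {
            value++;
        } finally {
            lock.unlockWrite(stamp);
        }
    }
}
```

The optimistic path only pays off for readers; every increment still goes through the exclusive write lock.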

Benchmarks, op/µs

               1 thread   2 threads  2 threads  2 threads  4 threads  8 threads
               Core 0     Core 0     Cores 0,4  Cores 0,1  Cores 0-3  Cores 0-7
SYNCHRONIZED   26         43         27         12         12         13
UNFAIR_LOCK    32         32         18         5          20         20
CAS_LOOP       62         62         45         10         6          5
GET_AND_ADD    100        100        97         27         28         28
STAMPED_LOCK   31         31         24         5          22         21

Long Adder Counter

import java.util.concurrent.atomic.LongAdder;

public class LongAdderCounter implements Counter {
    private LongAdder value = new LongAdder();

    public long get() {
        return value.longValue();
    }

    public void increment() {
        value.add(1);
    }
}

Benchmarks, op/µs

               1 thread   2 threads  2 threads  2 threads  4 threads  8 threads
               Core 0     Core 0     Cores 0,4  Cores 0,1  Cores 0-3  Cores 0-7
SYNCHRONIZED   26         43         27         12         12         13
UNFAIR_LOCK    32         32         18         5          20         20
CAS_LOOP       62         62         45         10         6          5
GET_AND_ADD    100        100        97         27         28         28
STAMPED_LOCK   31         31         24         5          22         21
LONG_ADDER     62         62         85         124        248        340
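
LongAdder keeps scaling with the thread count because threads update separate cells instead of one shared variable, and reading sums the cells. A much simplified sketch of that striping idea follows. StripedCounter is a hypothetical class, not the JDK implementation, which also grows its cell array under contention and pads cells against false sharing.

```java
import java.util.concurrent.atomic.AtomicLong;

// Simplified illustration of striping; not how LongAdder is actually written.
class StripedCounter {
    private final AtomicLong[] cells;

    StripedCounter(int stripes) {
        cells = new AtomicLong[stripes];
        for (int i = 0; i < stripes; i++) {
            cells[i] = new AtomicLong();
        }
    }

    public void increment() {
        // Pick a cell from the thread id so different threads tend to hit different cells.
        int index = (int) (Thread.currentThread().getId() % cells.length);
        cells[index].getAndIncrement();
    }

    public long get() {
        // The sum is not an atomic snapshot if increments are still in flight.
        long sum = 0;
        for (AtomicLong cell : cells) {
            sum += cell.get();
        }
        return sum;
    }
}
```

The trade-off is the read side: get() has to visit every cell, and its result is only exact once the writers are quiet.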

Literature

Materials

• Everyone, all in one place: bit.ly/concurrency-interest
• Nitsan Wakart: psy-lob-saw.blogspot.com
• Aleksey Shipilev: shipilev.net

Questions and answers