Top Performance
Myths and Folklore
Martin Thompson - @mjpt777
Top 10
Performance Mistakes
Martin Thompson - @mjpt777
10
Not Upgrading
9
Duplicated Work
Database Tuning?
Where is the real issue?
8
Data Dependent Loads
Aka “Pointer Chasing”
Are all memory
operations equal?
Access Pattern Benchmark
Average time in ns/op to sum all longs in a 1GB array?

Access pattern                    Benchmark                Mode   Score    Error   Units  Approx.
Sequential access                 testSequential           avgt   0.832 ±  0.006   ns/op  ~1 ns/op
Random walk per OS page           testRandomPage           avgt   2.703 ±  0.025   ns/op  ~3 ns/op
Data dependent walk per OS page   testDependentRandomPage  avgt   7.102 ±  0.326   ns/op  ~7 ns/op
Random heap walk                  testRandomHeap           avgt  19.896 ±  3.110   ns/op  ~20 ns/op
Data dependent heap walk          testDependentRandomHeap  avgt  89.516 ±  4.573   ns/op  ~90 ns/op

Really??? Less than 1 ns per operation for sequential access?
Need to ADD 40+ ns/op for NUMA access on a server!!!
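To make "data dependent" concrete, here is a minimal sketch (not the benchmark from the talk, whose source is not shown here). It assumes an int array where each element holds the index of the next element to visit; the class and method names are hypothetical. Walking the sequential chain gives the hardware predictable addresses, while walking the random chain forces each load to wait for the previous one.

```java
import java.util.Random;

final class AccessPatterns
{
    /** Chain in which element i points at element i + 1, wrapping at the end. */
    static int[] sequentialChain(final int length)
    {
        final int[] chain = new int[length];
        for (int i = 0; i < length; i++)
        {
            chain[i] = (i + 1) % length;
        }
        return chain;
    }

    /** Chain that visits every element exactly once in random order (Sattolo's algorithm). */
    static int[] randomChain(final int length, final long seed)
    {
        final int[] chain = new int[length];
        for (int i = 0; i < length; i++)
        {
            chain[i] = i;
        }

        final Random random = new Random(seed);
        for (int i = length - 1; i > 0; i--)
        {
            final int j = random.nextInt(i); // j < i keeps the permutation a single cycle
            final int tmp = chain[i];
            chain[i] = chain[j];
            chain[j] = tmp;
        }
        return chain;
    }

    /** Data dependent walk: the address of each load comes from the value just loaded. */
    static long walk(final int[] chain)
    {
        long sum = 0;
        int index = 0;
        for (int i = 0; i < chain.length; i++)
        {
            index = chain[index];
            sum += index;
        }
        return sum;
    }
}
```

Both chains visit the same set of indices, so `walk` returns the same sum for each; only the order of memory accesses differs. Timing the two with a harness such as JMH on a chain much larger than the CPU caches should show the gap, though the exact numbers will depend on the machine.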
What does this mean for
data structures?
[Diagram: chained hash map built up one entry at a time. A Buckets array points to separately allocated nodes, each holding Hash, Key, Value and Next. Entries shown: 1 → EUR/USD, 2 → GBP/EUR, 3 → GBP/USD.]
[Diagram: array-backed map built up one entry at a time. A Buckets array holds indices into a contiguous entries array; a Next of -1 ends a chain. Final entries:]

Key  Value    Hash  Next
1    EUR/USD  4     2
2    GBP/EUR  2     -1
3    GBP/USD  4     -1
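The array-backed layout can be sketched in Java roughly as follows. This is an illustrative sketch only, not the .NET Dictionary implementation: the class name is hypothetical, keys are plain ints, capacity is fixed, and resizing and removal are omitted. Collision chains are followed through an int array of indices rather than through node references, so there is one less pointer to chase per entry.

```java
import java.util.Arrays;

final class ArrayBackedMap
{
    private final int[] buckets;   // index of the first entry in each bucket, -1 when empty
    private final int[] keys;
    private final String[] values;
    private final int[] next;      // index of the next entry in the same bucket, -1 at the end
    private int size;

    ArrayBackedMap(final int capacity)
    {
        buckets = new int[capacity];
        Arrays.fill(buckets, -1);
        keys = new int[capacity];
        values = new String[capacity];
        next = new int[capacity];
    }

    void put(final int key, final String value)
    {
        final int bucket = bucketOf(key);
        for (int i = buckets[bucket]; i != -1; i = next[i])
        {
            if (keys[i] == key)
            {
                values[i] = value; // existing key: overwrite in place
                return;
            }
        }

        if (size == keys.length)
        {
            throw new IllegalStateException("full: resizing is omitted in this sketch");
        }

        keys[size] = key;
        values[size] = value;
        next[size] = buckets[bucket]; // new entry becomes the head of the chain
        buckets[bucket] = size++;
    }

    String get(final int key)
    {
        for (int i = buckets[bucketOf(key)]; i != -1; i = next[i])
        {
            if (keys[i] == key)
            {
                return values[i];
            }
        }
        return null;
    }

    private int bucketOf(final int key)
    {
        return (key & 0x7FFFFFFF) % buckets.length;
    }
}
```

The values here are still references to String objects on the heap, so this layout only removes the node indirection, not every dependent load.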
.NET Dictionary is >10X faster than
Java HashMap for 2+ GB of data
Understand object relationships
and then choose appropriate
data structures
Java desperately needs
Value Types on the stack
and Aggregates on the heap
Data Structures are becoming
ever more important again!
7
Too Much Allocation
“Allocation is free…”
Reclamation is NOT free!
Remember
Data Dependent Loads?
Too much allocation or copying
will wash out your cache
6
Going Parallel
http://www.frankmcsherry.org/assets/COST.pdf
Amdahl’s Law
[Chart: Speedup (0 to 20) against Processors (1 to 1024); series: Amdahl]
Universal Scalability Law
[Chart: Speedup (0 to 20) against Processors (1 to 1024); series: Amdahl, USL]
C(N) = N / (1 + α(N – 1) + βN(N – 1))
C = capacity or throughput
N = number of processors
α = contention penalty
β = coherence penalty
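As a worked example of the formula, the sketch below evaluates both curves. The class name and the parameter values used in the usage note are assumptions chosen for illustration, not figures from the talk. With β = 0 the USL reduces to Amdahl's Law, where α plays the role of the serial fraction.

```java
final class Scalability
{
    /** Amdahl's Law: speedup on n processors when serialFraction of the work cannot be parallelised. */
    static double amdahl(final int n, final double serialFraction)
    {
        return n / (1.0 + serialFraction * (n - 1));
    }

    /** Universal Scalability Law: alpha is the contention penalty, beta the coherence penalty. */
    static double usl(final int n, final double alpha, final double beta)
    {
        return n / (1.0 + alpha * (n - 1) + beta * n * (n - 1));
    }
}
```

With an assumed α of 0.05 and β of 0.0001, `usl(32, 0.05, 0.0001)` is roughly 12 while `usl(1024, 0.05, 0.0001)` is roughly 6.5: past a certain point, adding processors reduces throughput because the coherence term grows with N squared.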
Shared mutable state is Evil!
“You can have a second
computer once you’ve shown
you know how to use the first one”
– Paul Barham
“You can have a second CPU
once you’ve shown you know
how to use the first one”
– Martin Thompson
5
Not Understanding TCP
TCP – Sequenced Flow 1
Client Server
SYN
SYN, ACK
ACK
Data == MSS
Delayed ACK
Data < MSS
TCP – Sequenced Flow – TCP_NODELAY
Client Server
SYN
SYN, ACK
ACK
Data == MSS
ACK
Data < MSS
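The stall in the first flow comes from Nagle's algorithm holding back the small trailing segment until earlier data is acknowledged, while the receiver delays that acknowledgement. A minimal sketch of turning Nagle's algorithm off from Java is shown below; the class and method names are hypothetical, and whether disabling it is appropriate depends on the message pattern of the application.

```java
import java.io.IOException;
import java.net.StandardSocketOptions;
import java.nio.channels.SocketChannel;

final class NoDelayExample
{
    /** Opens an unconnected channel with Nagle's algorithm disabled. */
    static SocketChannel openWithNoDelay() throws IOException
    {
        final SocketChannel channel = SocketChannel.open();
        // TCP_NODELAY = true: small writes are sent without waiting for outstanding ACKs
        channel.setOption(StandardSocketOptions.TCP_NODELAY, true);
        return channel;
    }

    /** Reports whether Nagle's algorithm is disabled on the channel. */
    static boolean isNoDelay(final SocketChannel channel) throws IOException
    {
        return channel.getOption(StandardSocketOptions.TCP_NODELAY);
    }
}
```

With plain `java.net.Socket` the equivalent call is `socket.setTcpNoDelay(true)`.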
4
Synchronous Communications
Client Server
Asynchronous Communications
Client Server
Synchronous Communications
is the crystal meth
of distributed computing
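The difference between the two styles can be sketched with `CompletableFuture`. Everything here is a hypothetical stand-in: `request` simulates a remote call by doubling its argument, and the class name is invented for illustration. The synchronous version waits out a full round trip per request; the asynchronous version puts all requests in flight before collecting any response.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class RequestStyles
{
    /** Hypothetical stand-in for a remote call: the "server" doubles the value. */
    static CompletableFuture<Integer> request(final int value)
    {
        return CompletableFuture.supplyAsync(() -> value * 2);
    }

    /** Synchronous: each request blocks until its response arrives before the next is sent. */
    static List<Integer> synchronous(final int count)
    {
        final List<Integer> results = new ArrayList<>();
        for (int i = 0; i < count; i++)
        {
            results.add(request(i).join());
        }
        return results;
    }

    /** Asynchronous: all requests are sent first, then responses are collected in order. */
    static List<Integer> asynchronous(final int count)
    {
        final List<CompletableFuture<Integer>> inFlight = new ArrayList<>();
        for (int i = 0; i < count; i++)
        {
            inFlight.add(request(i));
        }

        final List<Integer> results = new ArrayList<>();
        for (final CompletableFuture<Integer> response : inFlight)
        {
            results.add(response.join());
        }
        return results;
    }
}
```

Both methods return the same results. With a real network between client and server, the synchronous total time grows with the number of requests times the round-trip time, while the asynchronous version can overlap those round trips.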
3
Text Encoding
“But it’s human readable...”
“Binary is hard to work with...”
// Writing an int as ASCII digits: one modulo and one divide per character
while (i >= 0)
{
    int remainder = quotient % 10;
    quotient = quotient / 10;
    results[i--] = (byte)('0' + remainder);
}
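For comparison, a minimal sketch of the same int encoded as text and as a fixed-width binary field. The class and method names are hypothetical; the binary form uses `ByteBuffer` with its default big-endian byte order.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class EncodingComparison
{
    /** Text: one byte per decimal digit (plus a sign), produced by repeated divide and modulo. */
    static byte[] encodeAsText(final int value)
    {
        return Integer.toString(value).getBytes(StandardCharsets.US_ASCII);
    }

    /** Binary: always four bytes, written without any arithmetic on the digits. */
    static byte[] encodeAsBinary(final int value)
    {
        return ByteBuffer.allocate(Integer.BYTES).putInt(value).array();
    }

    /** Reads back a value written by encodeAsBinary. */
    static int decodeBinary(final byte[] encoded)
    {
        return ByteBuffer.wrap(encoded).getInt();
    }
}
```

For a value such as 1234567890 the text form takes 10 bytes against 4 for binary, and the text form also has to be parsed digit by digit on the receiving side.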
Communications
Battery life and bandwidth?
2
API Design
public void characters(
    char[] ch,
    int start,
    int length) throws SAXException

public void startElement(
    String uri,
    String localName,
    String qName,
    Attributes atts) throws SAXException
APIs can be composed for
usability vs performance trade-offs
public String[] split(String regex)

public Iterable<String> split(String regex)

public void split(
    String regex, Collection<String> dst)
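A minimal sketch of the third shape, where the caller supplies the destination collection and can clear and reuse it between calls. To keep the sketch short it splits on a single character rather than a regex, which is a simplification of the signature above; the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

final class Splitter
{
    /** Splits input on a single-character delimiter, appending each token to dst. */
    static void split(final String input, final char delimiter, final Collection<String> dst)
    {
        int start = 0;
        for (int i = 0; i < input.length(); i++)
        {
            if (input.charAt(i) == delimiter)
            {
                dst.add(input.substring(start, i));
                start = i + 1;
            }
        }
        dst.add(input.substring(start)); // trailing token, possibly empty
    }

    public static void main(final String[] args)
    {
        final List<String> tokens = new ArrayList<>(); // allocated once, reused per message
        split("EUR/USD,GBP/EUR,GBP/USD", ',', tokens);
        System.out.println(tokens);
        tokens.clear();
    }
}
```

The caller decides what kind of collection to use and when to allocate it, so no array or list is created per call. The token strings themselves are still allocated by `substring`.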
selector.selectNow();
Set<SelectionKey> selectedKeys =
    selector.selectedKeys();
Iterator<SelectionKey> iter =
    selectedKeys.iterator();
while (iter.hasNext())
{
    SelectionKey key = iter.next();
    if (key.isReadable())
    {
        key.attachment(); // do work
    }
    iter.remove();
}

// Keep and reuse (a proposed API shape, not a method on the JDK Selector)
List<SelectionKey> keys = new ArrayList<>();
selector.selectNow(keys, READABLE);
keys.forEach(keyHandler);
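Since Java 11 the JDK offers something close in spirit: `Selector.selectNow(Consumer<SelectionKey>)` invokes a callback for each ready key, so the caller no longer iterates over and removes from the selected-key set. A minimal sketch follows; the wrapper class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.function.Consumer;

final class SelectorPolling
{
    /**
     * Polls without blocking and passes each readable key to the handler.
     * Returns the number of keys the selector reported as ready.
     */
    static int poll(final Selector selector, final Consumer<SelectionKey> handler) throws IOException
    {
        return selector.selectNow(
            (key) ->
            {
                if (key.isReadable())
                {
                    handler.accept(key);
                }
            });
    }
}
```

Note that the lambda passed to `selectNow` captures `handler`, so to avoid allocating on every poll it would need to be created once and kept in a field.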
1
[Chart: Average (Mean) Logging Duration. Y-axis: Time (nanoseconds), 0 to 160,000. X-axis: 1 to 8.]
Why do we Log?
Recording Events
Recording Errors
Instrumentation
Debugging
Recording Events
Big Data
Recording Errors
public class DistinctErrorLog
{
    public void record(Throwable observation)
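The idea of recording each distinct error once, with a count, instead of writing a log line per occurrence can be sketched as below. This is a simplified illustration under stated assumptions, not the DistinctErrorLog class named on the slide: the class name is hypothetical, and errors are treated as the same when their class and message match.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

final class DistinctErrors
{
    private final Map<String, LongAdder> occurrences = new ConcurrentHashMap<>();

    /** Counts the observation against its class and message rather than logging it again. */
    void record(final Throwable observation)
    {
        occurrences.computeIfAbsent(keyOf(observation), (key) -> new LongAdder()).increment();
    }

    /** Number of different errors seen so far. */
    int distinctCount()
    {
        return occurrences.size();
    }

    /** How many times an equivalent error has been recorded. */
    long occurrencesOf(final Throwable observation)
    {
        final LongAdder count = occurrences.get(keyOf(observation));
        return count == null ? 0 : count.sum();
    }

    private static String keyOf(final Throwable observation)
    {
        return observation.getClass().getName() + ": " + observation.getMessage();
    }
}
```

A repeated failure then costs a map lookup and an increment, and the report of what went wrong stays small however often it happens.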
Instrumentation
systemCounters.get(FAILED_LOGIN).increment();
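One way a call like the one above could be backed is sketched here. The `SystemCounters` class, its `Counter` enum and the second enum constant are assumptions made for illustration; the slide shows only the call site.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;

final class SystemCounters
{
    /** Hypothetical set of counters; only FAILED_LOGIN appears on the slide. */
    enum Counter
    {
        FAILED_LOGIN,
        SUCCESSFUL_LOGIN
    }

    private final Map<Counter, LongAdder> counters = new EnumMap<>(Counter.class);

    SystemCounters()
    {
        // Every counter is created up front, so the map is never modified after construction
        for (final Counter counter : Counter.values())
        {
            counters.put(counter, new LongAdder());
        }
    }

    LongAdder get(final Counter counter)
    {
        return counters.get(counter);
    }
}
```

Incrementing a counter is far cheaper than formatting and writing a log line, and the totals can be read out by a monitoring thread with `sum()`.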
Debugging
Byte Buddy
In Closing…
Where are you spending your
Computing Resource Budget?
Run a profiler regularly!!!
Blog: http://mechanical-sympathy.blogspot.com/
Twitter: @mjpt777
“Any intelligent fool can make things bigger, more complex, and more violent.
It takes a touch of genius, and a lot of courage, to move in the opposite direction.”
- Albert Einstein
Questions?