2015
High Availability and High Frequency Big Data Analytics
Esther Kundin, Bloomberg LP, 10/15/2015
#GHC15
Outline
The Problem Space
High Availability
High Frequency
Takeaways
Questions
The Problem Space
Total data set: 2 TB – roughly 2x10^13 data points
− “medium data”
Average write: 4 billion data points a day
Average read: 140 trillion data points a day
Read/write latency: 50 ms
Read throughput: 3 trillion points in the peak minute – 2000 bulk requests
Allowable downtime < read latency
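As a rough sanity check, the figures above can be converted into per-second rates. The sketch below is only back-of-the-envelope arithmetic; the variable names are illustrative, and it assumes the 2000 bulk requests all arrive within the peak minute, which the slide suggests but does not state outright.

```python
# Back-of-the-envelope rates derived from the figures on the slide.
SECONDS_PER_DAY = 24 * 60 * 60

writes_per_day = 4e9        # average write: 4 billion data points a day
reads_per_day = 140e12      # average read: 140 trillion data points a day
peak_minute_reads = 3e12    # read throughput in the peak minute
peak_bulk_requests = 2000   # ASSUMPTION: all arrive in that same minute

avg_writes_per_sec = writes_per_day / SECONDS_PER_DAY
avg_reads_per_sec = reads_per_day / SECONDS_PER_DAY
peak_reads_per_sec = peak_minute_reads / 60
points_per_bulk_request = peak_minute_reads / peak_bulk_requests
peak_to_average = peak_reads_per_sec / avg_reads_per_sec

if __name__ == "__main__":
    print(f"average writes/s:        {avg_writes_per_sec:,.0f}")
    print(f"average reads/s:         {avg_reads_per_sec:,.0f}")
    print(f"peak reads/s:            {peak_reads_per_sec:,.0f}")
    print(f"points per bulk request: {points_per_bulk_request:,.0f}")
    print(f"peak / average reads:    {peak_to_average:.1f}x")
```

Under that assumption, each bulk request in the peak minute covers about 1.5 billion points, and the peak read rate is roughly 30 times the daily average.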
High Availability – Pain Points and Solutions
High Availability - Major Points of Failure
[Diagram labels: Client; HDFS; RegionServer (x3); Meta Region Server]
High Availability – Solution HBASE-10070
[Diagram labels: Client; HDFS; RegionServer 1, 2, 3; Meta Region Server; Secondary RegionServer 1, 2, 3; Secondary Meta Region Server]
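HBASE-10070 adds read-only secondary replicas of regions, so a read can be served by a secondary copy when the primary is slow or down. The general pattern can be sketched as a "hedged" read: ask the primary first, and if it has not answered within a short timeout, ask the secondaries as well and take whichever reply arrives first. The code below is a pure-Python simulation of that pattern, not the HBase client API; the function names, delays, and timeout value are all hypothetical.

```python
import asyncio

async def replica_get(name, delay_s, row):
    """Simulate one replica answering a Get after delay_s seconds."""
    await asyncio.sleep(delay_s)
    return name, f"value-of-{row}"

async def hedged_get(row, primary_delay_s, secondary_delays_s,
                     primary_timeout_s=0.01):
    """Return (replica_name, value, is_stale) from the first replica to answer.

    The primary gets a head start of primary_timeout_s; only if it misses
    that deadline are the secondaries asked as well.
    """
    tasks = [asyncio.create_task(replica_get("primary", primary_delay_s, row))]
    done, pending = await asyncio.wait(tasks, timeout=primary_timeout_s)
    if not done:
        # Primary missed its deadline: fan out to the secondaries too.
        for i, delay_s in enumerate(secondary_delays_s, start=1):
            tasks.append(
                asyncio.create_task(replica_get(f"secondary-{i}", delay_s, row)))
        done, pending = await asyncio.wait(
            tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()  # the slower replies are no longer needed
    name, value = next(iter(done)).result()
    # A reply from a secondary may lag behind the primary, so flag it.
    return name, value, name != "primary"

if __name__ == "__main__":
    # Primary stalled for 500 ms (for example by a GC pause); secondaries are healthy.
    print(asyncio.run(hedged_get("row1", 0.5, [0.02, 0.2])))
```

In the HBase Java client, the corresponding opt-in is to set `Consistency.TIMELINE` on a `Get` and check `Result.isStale()` on the reply; a secondary may return slightly older data than the primary, which is why the sketch returns a staleness flag.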
High Availability Across Data Centers
3 Options
− HBASE-12259 – HydraBase integration – HBASE + Raft – In Progress
− Cloudera BDR in Cloudera Enterprise 5 – Not Open Source
− Roll Your Own!
Replication Across Data Centers
[Diagram labels: HBase 1, HBase 2; Writer1, Writer2; Reader1, Reader2; Global ZK; Replication]
High Frequency – Pain Points and Solutions
HA to remove fat tails
[Figure: Avg Latency per-Get Distribution. x-axis: Percentile (50, 60, 80, 90, 95, 99); y-axis: Latency in ms (0 to 12)]
High Frequency – Pain Points
Speed bounded by slowest responding region server
Garbage Collection causes spikes in latency
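To see why the slowest server dominates: a bulk request that fans out to n region servers completes only when the last one answers. If each server is independently slow with probability p, the request as a whole is slow with probability 1 - (1 - p)^n. The sketch below works this through; the value p = 0.01 and the server counts are illustrative assumptions, not figures from the talk.

```python
def prob_request_hits_slow_server(p_slow, n_servers):
    """Probability that at least one of n independent servers is slow."""
    return 1.0 - (1.0 - p_slow) ** n_servers

def request_latency_ms(per_server_latencies_ms):
    """A fan-out request is only as fast as its slowest server."""
    return max(per_server_latencies_ms)

if __name__ == "__main__":
    # ASSUMPTION: each server is slow 1% of the time (e.g. during a GC pause).
    for n in (1, 10, 100):
        chance = prob_request_hits_slow_server(0.01, n)
        print(f"{n:>3} servers: {chance:.1%} of requests see a slow server")
    # One paused server sets the latency for the whole request.
    print(request_latency_ms([2.0, 3.0, 250.0]), "ms")
```

With a 1% per-server chance of a pause, a request touching 100 servers is slowed about 63% of the time, which is why rare GC pauses show up so clearly in the tail percentiles.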
The Art of Fine Tuning
Use Data to set your heuristics
− Identify repeatable base-line tests
− Identify performance parameters
− Tweak one setting at a time
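One way to make a baseline test repeatable is to replay the same workload with a fixed random seed, record per-request latency, and compare percentiles before and after changing a single setting. The sketch below shows the shape of such a harness. The workload is simulated rather than measured against a real cluster, and the `gc_pause_probability` knob and the latency values are hypothetical stand-ins for whatever setting is being tuned.

```python
import math
import random

def percentile(samples, pct):
    """Nearest-rank percentile of a list of numbers (pct from 1 to 100)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def simulated_get_latency_ms(rng, gc_pause_probability):
    """Stand-in for timing one real request: 1-3 ms, plus an occasional pause."""
    latency = rng.uniform(1.0, 3.0)
    if rng.random() < gc_pause_probability:
        latency += 200.0  # hypothetical GC pause
    return latency

def run_baseline(gc_pause_probability, requests=10_000, seed=42):
    """Replay the same seeded workload and report p50/p95/p99 in ms."""
    rng = random.Random(seed)  # fixed seed makes the run repeatable
    samples = [simulated_get_latency_ms(rng, gc_pause_probability)
               for _ in range(requests)]
    return {pct: percentile(samples, pct) for pct in (50, 95, 99)}

if __name__ == "__main__":
    # Change exactly one setting between runs and compare the percentiles.
    print("before:", run_baseline(gc_pause_probability=0.02))
    print("after: ", run_baseline(gc_pause_probability=0.0))
```

Comparing percentiles rather than averages matters here because the median barely moves when a small fraction of requests hit a pause, while the 99th percentile changes sharply.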
Tuning Your DB – Garbage Collection
What Did Not Work
− Stop The World
− Small Memory Footprint – 4GB
− Synchronized GC via coprocessors
What worked for us:
− CMS – shorter pauses
− Very large memory footprint – 28GB
− Read from backup RS when GC in progress
Takeaways
High Availability can solve most availability and latency concerns
Multiple Data Center Support Needed
Tune those settings!
Questions?
Resources: Tuning Your DB – What to Tweak
Key Design
Column Family Design
hbase-site.xml – Lots of configuration to try!
Bloom Filters
Short-Circuit Reads
Block Cache
Scheduling Major Compactions Judiciously
Got Feedback?
Rate and review the session on our mobile app
Download at http://ddut.ch/ghc15 or search GHC 2015 in the app store