MIGRATING A 130TB CLUSTER FROM ELASTICSEARCH 2 TO 5 IN 20 HOURS WITHOUT DOWNTIME
FRED DE VILLAMIL @FDEVILLAMIL
OCTOBER 2017
ABOUT ME
FRED DE VILLAMIL, FORMER DIRECTOR OF INFRASTRUCTURE @SYNTHESIO
FIRST ELASTICSEARCH IN PRODUCTION WAS 0.17.6
LINUX / (FREE)BSD USER SINCE 1996, OPEN SOURCE CONTRIBUTOR SINCE 1998,
LOVES COOL TECHS, TENNIS, PHOTOGRAPHY, CUTE OTTERS, INAPPROPRIATE HUMOR AND ELASTICSEARCH CLUSTERS OF UNUSUAL SIZE.
WRITES ABOUT ES & MORE AT HTTPS://THOUGHTS.T37.NET
ABOUT SYNTHESIO
SYNTHESIO IS THE LEADING SOCIAL INTELLIGENCE TOOL FOR SOCIAL MEDIA MONITORING & SOCIAL ANALYTICS
SYNTHESIO CRAWLS THE WEB FOR RELEVANT DATA, ENRICHES IT WITH SENTIMENT ANALYSIS AND DEMOGRAPHICS TO BUILD SOCIAL ANALYTICS DASHBOARDS.
ELASTICSEARCH @SYNTHESIO
8 production clusters:
• +600 hosts, all bare metal
• 3 data centers
• 1.7PB storage SSD / NVMe
• 37.5TB RAM
Hardware:
• 6 core Xeon E5v3 or dual 12 core Xeon E5-2687Wv4 (160 watts!!!)
• 64GB to 256GB RAM
• 4 x 800GB SSD / 2 x 1.2TB NVMe
• RAID0 everywhere
We aggregate data from various cold storage systems and make it searchable in a jiffy.
Average cluster stats:
• writes: 85k documents / second, 1.5M at peak
• 800 searches / second, with some clusters having a continuous 25k searches / second
• Doc size from 150KB to 200MB
THE BLACKHOLE CLUSTER
Topology:
• 68 data nodes
• 3 master nodes
• 6 ingest nodes
• 200TB storage SSD
• 2.4TB heap
• 924 cores
Cluster stats:
• 1137 indices (daily)
• 27266 shards
• 130TB data
• 201 billion documents
• 7000 new documents / second
• 800 searches / second on the whole dataset
FEEDING BLACKHOLE FOR FUN AND PROFIT
BLACKHOLE ALLOCATION SETTINGS
"cluster.routing.allocation.node_initial_primaries_recoveries": 50
"cluster.routing.allocation.node_concurrent_recoveries": 20
"indices.recovery.max_bytes_per_sec": "2048mb"
"indices.recovery.concurrent_streams": "30"
"cluster.routing.allocation.disk.threshold_enabled": true
"cluster.routing.allocation.disk.watermark.low": "78%"
"cluster.routing.allocation.disk.watermark.high": "79%"
"cluster.routing.rebalance.enable": "all"
"cluster.routing.allocation.cluster_concurrent_rebalance": 50
"cluster.routing.allocation.allow_rebalance": "always"
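As a minimal sketch, the settings above could be wrapped into a request body for the cluster settings API (`PUT /_cluster/settings`). The slide does not say whether they were applied as transient or persistent settings, so the `scope` default below is an assumption, and the helper name `cluster_settings_body` is hypothetical.

```python
import json

# Values copied from the slide. The slide does not state the scope
# (transient vs persistent), so "transient" below is an assumption.
BLACKHOLE_ALLOCATION_SETTINGS = {
    "cluster.routing.allocation.node_initial_primaries_recoveries": 50,
    "cluster.routing.allocation.node_concurrent_recoveries": 20,
    "indices.recovery.max_bytes_per_sec": "2048mb",
    "indices.recovery.concurrent_streams": "30",
    "cluster.routing.allocation.disk.threshold_enabled": True,
    "cluster.routing.allocation.disk.watermark.low": "78%",
    "cluster.routing.allocation.disk.watermark.high": "79%",
    "cluster.routing.rebalance.enable": "all",
    "cluster.routing.allocation.cluster_concurrent_rebalance": 50,
    "cluster.routing.allocation.allow_rebalance": "always",
}


def cluster_settings_body(settings, scope="transient"):
    """Serialize settings as a JSON body for PUT /_cluster/settings."""
    if scope not in ("transient", "persistent"):
        raise ValueError("scope must be 'transient' or 'persistent'")
    return json.dumps({scope: settings}, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(cluster_settings_body(BLACKHOLE_ALLOCATION_SETTINGS))
```

The body can then be sent with any HTTP client; nothing here talks to a cluster.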
USING THE REINDEX API?
REINDEX API:
• NO SLICED SCROLL UNTIL ES 6.0
• SLOW
• MIGHT LOSE SOME DOCUMENTS, NEEDS LOTS OF ERROR CONTROL
LOGSTASH:
• NO SLICED SCROLLS UNTIL ES 6.0
• FASTER THAN THE REINDEX API
• REALLY DOESN’T LIKE ERRORS
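One form the "error control" mentioned above could take is a per-index document count comparison between the source and the target cluster. This is only a sketch under assumptions: the counts are assumed to have been collected beforehand (for example with the `_count` API), and the function name and index names are hypothetical.

```python
def find_count_mismatches(source_counts, target_counts):
    """Return {index: (source_count, target_count)} for indexes whose
    document counts differ. An index missing on the target counts as 0."""
    mismatches = {}
    for index, expected in source_counts.items():
        actual = target_counts.get(index, 0)
        if actual != expected:
            mismatches[index] = (expected, actual)
    return mismatches


if __name__ == "__main__":
    # Hypothetical index names and counts, for illustration only.
    source = {"docs-2017.10.01": 1000, "docs-2017.10.02": 2000}
    target = {"docs-2017.10.01": 1000, "docs-2017.10.02": 1990}
    for index, (expected, actual) in find_count_mismatches(source, target).items():
        print(f"{index}: expected {expected}, found {actual}")
```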
BEFORE UPGRADING
• USE THE UPGRADE CHECK PLUGIN TO VALIDATE THE COMPATIBILITY OF YOUR CURRENT INDEXES
• UPGRADE YOUR MAPPING TEMPLATES TO BE ES 5 COMPLIANT
• CREATE THE NEXT 10 DAYS INDEXES (JUST IN CASE)
• TELL YOUR HOSTING PROVIDER YOU’RE GOING TO TRANSFER 130TB IN 17 HOURS
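Two of the preparation steps above lend themselves to a quick sketch: listing the names of the next 10 daily indexes, and estimating the sustained bandwidth behind "130TB in 17 hours". The index prefix and date format are hypothetical (the slides do not give the naming scheme), and the bandwidth figure assumes decimal terabytes (10^12 bytes).

```python
from datetime import date, timedelta


def upcoming_daily_indexes(start, days=10, prefix="blackhole-"):
    """Names of the daily indexes for the `days` days following `start`.
    The prefix and the %Y.%m.%d suffix are assumptions for illustration."""
    return [
        f"{prefix}{(start + timedelta(days=n)).strftime('%Y.%m.%d')}"
        for n in range(1, days + 1)
    ]


def required_gbit_per_sec(terabytes, hours):
    """Average throughput in Gbit/s needed to move `terabytes`
    (decimal TB, 10**12 bytes) within `hours`."""
    total_bits = terabytes * 10**12 * 8
    return total_bits / (hours * 3600) / 10**9


if __name__ == "__main__":
    for name in upcoming_daily_indexes(date.today()):
        print(name)
    print(f"{required_gbit_per_sec(130, 17):.1f} Gbit/s sustained")
```

With these assumptions the transfer works out to roughly 17 Gbit/s sustained, which is why the hosting provider is worth warning.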
EXPANDING BLACKHOLE
OPS:
• +90 NEW SERVERS IN 2 NEW RACKS
• RAISED THE REPLICATION FACTOR TO 3
RESULT:
• 167 NODES
• 53626 SHARDS
• 279TB DATA
• 391TB STORAGE
• 5.42TB HEAP
• 2004 CORES
SETTINGS UPDATE DURING THE REPLICA INIT
"indices.recovery.max_bytes_per_sec": "4096mb"
"indices.recovery.concurrent_streams": "50"
"cluster.routing.allocation.disk.watermark.low": "98%"
"cluster.routing.allocation.disk.watermark.high": "99%"
"cluster.routing.rebalance.enable": "none"
PROBLEMS
• THE TRANSFER BROUGHT THE WHOLE CLUSTER TO ITS KNEES.
• THIS SLOWS DOWN THE WRITES.
• THE BULK THREAD POOL STARTS TO FILL UP.
SOLUTION: ZONING FOR FUN & PROFIT
• ALLOCATE THE FRESHEST AND ONGOING DATA TO ONE ZONE
• SEGREGATE EVERYTHING ELSE IN A DIFFERENT ZONE
• WAIT FOR THE CLUSTER TO CALM DOWN
• TOTAL SPENT TIME FOR THE TRANSFER: 17 HOURS
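Zoning of this kind is typically done with shard allocation filtering, where an index setting such as `index.routing.allocation.require.<attribute>` pins an index to nodes carrying a given attribute. The sketch below assumes a node attribute named `zone` with the values `fresh` and `archive`, and a 2-day cutoff; all three are hypothetical, as the slides do not give the actual names or thresholds.

```python
from datetime import date


def zone_for_index(index_day, today, fresh_days=2):
    """Pick a zone for a daily index: recent indexes go to the 'fresh'
    zone, everything older to 'archive'. Names and cutoff are assumed."""
    age_in_days = (today - index_day).days
    return "fresh" if age_in_days < fresh_days else "archive"


def allocation_filter_body(zone, attribute="zone"):
    """Index settings body (for PUT /<index>/_settings) that requires
    shards to live on nodes whose `attribute` equals `zone`."""
    return {f"index.routing.allocation.require.{attribute}": zone}


if __name__ == "__main__":
    today = date(2017, 10, 10)
    for day in (date(2017, 10, 10), date(2017, 10, 1)):
        zone = zone_for_index(day, today)
        print(day.isoformat(), allocation_filter_body(zone))
```

Keeping the ongoing indexes on their own nodes means recovery traffic for older data does not compete with the bulk writes.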
SPLITTING THE CLUSTER IN 2
• SET "cluster.routing.allocation.enable" TO "all"
• SHUT DOWN 2 OF THE RACKS
• SHUT DOWN ONE OF THE MASTERS
• SWITCH THE NUMBER OF REPLICAS TO 1
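The two API-driven steps in this list can be sketched as plain request descriptions. The values mirror the slide; applying the replica change to every index through `/_all/_settings`, the `transient` scope, and the helper name are assumptions. Shutting down the racks and the master happens outside the API, between the two calls.

```python
def split_preparation_requests():
    """(method, path, body) tuples for the API-driven steps of the split.
    Scope and target path are assumptions; values come from the slide."""
    return [
        (
            "PUT",
            "/_cluster/settings",
            {"transient": {"cluster.routing.allocation.enable": "all"}},
        ),
        # ... shut down 2 racks and one master here (not an API call) ...
        (
            "PUT",
            "/_all/_settings",
            {"index": {"number_of_replicas": 1}},
        ),
    ]


if __name__ == "__main__":
    for method, path, body in split_preparation_requests():
        print(method, path, body)
```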
BUILDING BLACKHOLE02
• RECONFIGURE THE 2 SHUT-DOWN RACKS AND MASTER SO THEY TALK TO EACH OTHER
• START THE MASTER, ALONE, AND CLOSE THE INDEXES
• UPGRADE THE MASTER TO ES 5.1.1
• UPGRADE ALL THE PLUGINS
• START THE MASTER: THE WHOLE UPGRADE TOOK 32 SECONDS
BRINGING BACK THE DATA
• UPGRADE ES AND THE PLUGINS ON THE DATA NODES
• START ELASTICSEARCH
• WAIT 30 MINUTES FOR THE CLUSTER TO GO BACK GREEN
• PLUG A WORK UNIT TO CATCH UP WITH THE PAST 18 HOURS OF DATA
• UPDATE THE LOAD BALANCER CONFIGURATION TO USE THE NEWLY UPGRADED CLUSTER
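The "wait for green" step can be automated by polling the cluster health API (`GET /_cluster/health`), whose response carries a `status` field. The sketch below keeps the HTTP call out of the function: `fetch_health` is a hypothetical callable returning the parsed health response, and the timeout and polling interval are assumptions loosely based on the 30 minutes mentioned above.

```python
import time


def wait_for_green(fetch_health, timeout_s=1800, poll_s=10,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until the cluster reports status 'green'.

    fetch_health: callable returning the parsed /_cluster/health response.
    Returns True once green, False if timeout_s elapses first.
    """
    deadline = clock() + timeout_s
    while True:
        if fetch_health().get("status") == "green":
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


if __name__ == "__main__":
    # Simulated health responses, for illustration only.
    responses = iter([{"status": "red"}, {"status": "yellow"}, {"status": "green"}])
    print(wait_for_green(lambda: next(responses), poll_s=0))
```

Switching the load balancer only after this returns True avoids sending searches to a cluster that is still recovering shards.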
TIMELINE
QUESTIONS ?
@FDEVILLAMIL