Rebalance API for SolrCloud: Presented by Nitin Sharma, Netflix & Suruchi Shah, Bloomreach

O C T O B E R 1 1 - 1 4 , 2 0 1 6 • B O S T O N , M A

Rebalance API for SolrCloud

Nitin Sharma - Senior Software Engineer, Netflix Suruchi Shah - Software Engineer, Bloomreach

3

Agenda ➢ Motivation ➢ Introduction to Rebalance API ➢ Scaling Scenarios ○  Query Performance ○  Redistribution of Data ○  Removing/Replacing Nodes ○  Indexing Performance

➢ Allocation Strategies

➢ Open Source ➢ Summary

4

Motivation ●  Harder to scale & guarantee SLA (Availability, Query Performance, Data Freshness ) for a multi-tenant, cross datacenter

search architecture based on solrcloud

●  Scaling issues related to Query Performance:

■  Inability to auto scale Solr serving with increasing index sizes

■  Dynamic Shard Setup based on index size

●  SLA issues due to Availability:

■  Nightmare to manipulate cluster/collection setup with expanding clusters and frequent node replacements (AWS issues)

■  Flexible Replica Allocation based on custom strategies to guarantee 99.995% availability

●  Data Freshness (aka Indexing Performance) hiccups:

■  Break the tight coupling between Indexing and Serving latency (Tp 95) SLAs

●  Generic framework for Solr SLA management that can be open-sourced

5

Rebalance API ●  Fine-grained SLA management in Solr

●  Smarter Index, Cluster & Data Management for SolrCloud

●  Forms the basis for Solr Auto Scale

●  Admin handler in Solr

●  2 levels of abstraction

●  Scaling Strategies

○  Aid with shard, replica and cluster manipulation techniques to guarantee SLA

○  Zero Downtime operations

○  Tunable for Availability, Performance or Cost

●  Allocation Strategies

○  Decides Core Placement


Rebalance API

Scaling Strategies Allocation Strategies

Auto Shard

Redistribute

Smart Merge

Replace

Scale Up

Scale Down

Least Used

Unused

AZ Aware

6

Agenda ➢ Motivation ➢ Introduction to Rebalance API ➢ Scaling Scenarios ○  Query Performance ○  Redistribution of Data ○  Removing/Replacing Nodes ○  Indexing Performance



7

01 Query Performance Issues

●  Indexing doubles documents - Per shard latency goes up

●  Tp 95 shoots up ●  No way to change shards

dynamically ●  Delete, Recreate, Re-index ●  Availability goes down

Node 1

Zk Ensemble

Node 2

Solr Collection A

Indexing 2x documents

Shard 1 Shard 2

Query Tp95

Shard 3 Shard 4

8

03 Rebalance Auto Shard

●  Re-sharding existing collection to any number of destination shards. (e.g can help with reducing latency)

●  Includes re-distributing the index and configs consistently.

●  Zero downtime - No query failures

●  Avoiding any heavy re-indexing processes.

●  Can be size based as well

●  Sample API Call:

/admin/collections?action=REBALANCE &scaling_strategy=AUTO_SHARD &collection=collection_name&num_shards=number of shards &size_cap=1G &allocation_strategy=least_used

9

Solr Collection A

Node 1

Zk Ensemble

Node 2

Shard 1 Shard 2

Rebalance Auto Shard - Overview

Shard 3 Shard 4

10

01 Auto Shard (1) - Internals - Simple Strategy

●  Merge documents from all shards - Lucene library based

●  Split the merged shard into desired

number ●  Auto Zk update for shard ranges ●  Heavyweight but works ●  Based on size, could take in 20-30 of

minutes to complete

Solr Collection A

Shard 1 Shard 2

Merged Shard (Temp)

Shard 1 Shard 2 Shard 3 Shard 4

Solr Collection A’

merge

Even Split

11

01 Auto Shard (2) - Internals - Smart Split Strategy

●  Identify minimum number of splits ●  Split shards in parallel to required desired

setup ●  Relatively high performance ○  2 Tb index from 2 to 4 shards - 2.5 mins

●  Auto Zk Update

Solr Collection A

Shard 1 Shard 2

Shard 1_1

Shard 1_2

Shard 2_1

Shard 2_2

Solr Collection A


Solr Collection A

Smart Split Renamed Shards

12

01 Solution : Auto Sharding

●  Dynamically Increase/Decrease shards

●  E.g. Increase shards from 2 to 4 to reduce latency

●  E.g. Tp 95 reduced from > 1 sec to 250

ms.

Solr Collection A

Node 1

Zk Ensemble

Node 2

Shard 1 Shard 2

Shard 3 Shard 4

13

03 Agenda ➢ Motivation ➢ Introduction to Rebalance API ➢ Scaling Scenarios ○  Query Performance ○  Redistribution of Data ○  Removing/Replacing Nodes ○  Indexing Performance ○  Data Consistency



14

01 Data distribution issues

●  Adding a new node - Does nothing ●  No Redistribution of solr cores ●  Machines running out of disk space -

heavier collections need to be moved out ●  Problem amplifies at large scale - 100s of

nodes, 1000s of collections ●  Manual Management of core placement

becomes an issue

Default Solr Behavior

Node 1

Zk Ensemble

Node 2

Node 3

Core A2

Core B1

Core D2

Core D1

Core A1

Core B2

Core C1

Core C2

Node 4

15

01 Re-distribute Strategy - Internals

●  Internal topology construction from ZK

●  Desired Core placement computation - External

or Trigger based

●  Migration of cores within the cluster

●  Knobs to control min/max to reduce cluster

load

●  Zero downtime - No query failures and

resiliency to node failures

●  API Call: /admin/collections?

action=REBALANCE&scaling_strategy=REDIST

RIBUTE

Compute & Redistribute

Node 1

Zk Ensemble

Node 2

Node 3

Core A2

Core B1

Core D1

Core A1

Core B2

Core C1

Node 1

Zk Ensemble

Node 2

Node 3

Core A2

Core B1

Core D1

Core A1

Core B2

Core C1

16

01 Solution: Auto Redistribution of Data

Redistribute

●  Adding new node -

triggers redistribution

●  Respects the core

placement allocation

strategy

●  Zero downtime

Node 1

Zk Ensemble

Node 2

Node 3

Core A2

Core B1

Core D2

Core D1

Core A1

Core B2

Core C1

Core C2

Node 4

Node 1

Zk Ensemble

Node 2

Node 3

Core A2

Core B1

Core D2

Core D1

Core A1

Core B2

Core C1

Core C2

Node 4

17

03 Agenda ➢ Motivation ➢ Introduction to Rebalance API ➢ Scaling Scenarios ○  Query Performance ○  Redistribution of Data ○  Removing/Replacing Nodes ○  Indexing Performance



18

01 Replace Solr Nodes Default Solr Behavior

● A node might die, need to be replaced,

decommissioned

● Default behavior - Do nothing

● Can cause downtime - Heavy cores on

the nodes

● Problem exacerbated with 1000s of

nodes/collections

Node 1

Zk

Ense

mbl

e

Node 2

Node 3

Core A2

Core B1

Core D1

Core A1

Core B2

Core C1

19

01 Replace Nodes with Rebalance

● Read the Topology of cluster

● Migrate replicas from node about to die to

new node


● API Call:

○  /admin/collections?

action=REBALANCE&scaling_strategy=REPLACE

&collection=collectionName&source_node=so

urce_host &dest_node=dest_host

Node 1

Zk

Ense

mbl

e

Node 2

Node 3

Core A2

Core B1

Core D1

Core A1

Core B2

Core C1

Node 4

Core B1

Core A1

Replaced Node

20

03 Agenda ➢ Motivation ➢ Introduction to Rebalance API ➢ Scaling Scenarios ○  Query Performance ○  Redistribution of Data ○  Removing/Replacing Nodes ○  Indexing Performance



21

01 Indexing Performance ●  Higher the shards, faster the indexing

(parallelism) ●  Faster indexing - Data Freshness SLA ●  Solr - # shards is the same for indexing

vs serving ●  Shard setup - tweaked for serving

query performance ●  E.g. ○  Indexing 100M docs in 2 shards - 2

hours ○  Serving 100M docs in 2 shards - Tp

95 < 100 ms


Indexing

500M Documents

Performance Hit

Serving Queries

22

01 Indexing Performance - Smart Merge

●  Separate Indexing shard setup vs serving

● More shards for indexing - Merged into lesser shards for serving

● Post Indexing issue API to merge into serving collection

● API Call: ○  /admin/collections?

action=Rebalance&scaling_strategy=SMART_MERGE_DISTRIBUTED&collection=collectionName&num_shards=numRequiredShards

● Parallel Merge


23

01 Indexing Performance - Smart Merge

●  Index vs Serving has different collections

●  Indexed Collection

merged into Serving - Using smart merge call

●  Indexing can be tuned

independently for performance

● Serving SLA unaffected


Indexing

500M Documents

Serving Queries

Shard 1

Shard 4

Shard 5

Shard 8

Shard 9

Shard 12

Shard 13

Shard 16… … … …

Collection A_Indexing

Collection A

Parallel merge Parallel merge Parallel merge Parallel merge

24

Rebalance API ●  Fine-grained SLA management in Solr

●  Smarter Index, Cluster & Data Management for SolrCloud


●  Admin handler in Solr

●  2 levels of abstraction

●  Scaling Strategies

○  Aid with shard, replica and cluster manipulation techniques to guarantee SLA

○  Zero Downtime operations


●  Allocation Strategies

○  Decides Core Placement


Rebalance API

Scaling Strategies Allocation Strategies

Auto Shard

Redistribute

Smart Merge

Replace

Scale Up

Scale Down

Least Used

Unused

AZ Aware

25

01 Allocation Strategies

● Abstracts out the core placement methodology

●  Least Used Strategy - Pick the node that has the least amount of cores

● AZ aware Strategy - Pick the node that is in a different availability zone than the other

cores for a given collection

● Unused Strategy - Pick the node that does not have any cores for a given collection

● All of them are compatible with all scaling strategies

26

01 Open Source

●  Fully open sourced - SOLR-9241

(4.6.1).

● Contributed patch works on top of

4.6.1 and tested up to 4.10

●  SOLR-9241 (epic) - patches/features

on master. Has sub patches

○  SOLR-93{16-21}

○  SOLR-9407

● Actively working with community to

get the rest of the API 6+ compatible.

27

01 Summary ●  Harder to scale & guarantee SLA (Availability, Query Performance, Data Freshness ) for a multi-tenant, cross datacenter search

architecture based on solrcloud

●  Rebalance API

○  Scaling Strategies - How to scale?

○  Allocation Strategies - Where to place cores?


●  Zero Downtime operations for

○  Dynamically changing shard setup

○  Decoupling indexing SLA from Serving

○  Replacing Nodes

○  Auto -Redistributing data with cluster expansion

●  Open Source

28

01 Speakers

Nitin Sharma https://www.linkedin.com/in/knitinsharma [email protected]

Suruchi Shah https://www.linkedin.com/in/suruchishah [email protected]

Technology

Rebalance API for SolrCloud: Presented by Nitin Sharma, Netflix & Suruchi Shah, Bloomreach