Tubular Labs - Using Elastic to Search Over 2.5B Videos

Talk structure

● 4 steps to make user experience great again

● 4 patterns to simplify architecture and reduce costs

© 2016 Tubular Labs

Data size

● 2.5B documents

● Average doc size 2 KB, 4 TB total size

● 200M daily updates (~8% of the index)

● Constant indexing rate of 3k/s with spikes

● Querying rate 1-3 r/s (low concurrency)
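The update figures above can be sanity-checked with a few lines of arithmetic. This is only a rough sketch using the numbers from the slide; the averaged rate ignores the spikes the slide mentions.

```python
# Back-of-the-envelope check of the figures on the slide.
TOTAL_DOCS = 2_500_000_000    # 2.5B documents
DAILY_UPDATES = 200_000_000   # 200M daily updates
SECONDS_PER_DAY = 24 * 60 * 60

# Fraction of the index that is rewritten every day.
update_share = DAILY_UPDATES / TOTAL_DOCS

# Average update rate if the updates were spread evenly over the day.
avg_rate = DAILY_UPDATES / SECONDS_PER_DAY

print(f"{update_share:.0%} of the index per day")
print(f"{avg_rate:.0f} updates/s on average")
```

The evenly spread average lands somewhat below the quoted 3k/s, which fits a steady load with spikes on top.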

Hardware

Before

● 52 x c3.4xlarge

● 128 shards

● 16 cores per node

● ~3 shards per node

● 832 cores, 16 TB SSD, 1.5 TB RAM

After (25% bigger)

● 26 x c3.8xlarge

● 416 shards

● 32 cores per node

● 16 shards per node

● 832 cores, 16 TB SSD, 1.5 TB RAM

Indexing

Optimize indexing

● Using bulk API

• 1 MB per batch (500 docs), should be 5k docs/s

• Recommended 5-15 MB

● Increasing refresh interval

• From 1 to 30 seconds

● Monitoring bulk.rejected

• Increased bulk.queueSize from 50 to 2000
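A minimal sketch of size-based batching for the bulk API, assuming documents arrive as (id, source) pairs. The function name `bulk_bodies`, the index name `videos`, and the 5 MB default are illustrative choices, not taken from the talk. The `_type` field that Elasticsearch versions of that period expected is left out here and would have to be supplied, for example in the request URL.

```python
import json


def bulk_bodies(docs, index, max_bytes=5 * 1024 * 1024):
    """Yield newline-delimited bulk request bodies.

    Each body stays at or below max_bytes, unless a single document
    is larger than max_bytes, in which case it is sent on its own.
    docs is an iterable of (doc_id, source_dict) pairs.
    """
    lines, size = [], 0
    for doc_id, source in docs:
        action = json.dumps({"index": {"_index": index, "_id": doc_id}})
        pair = action + "\n" + json.dumps(source) + "\n"
        pair_size = len(pair.encode("utf-8"))
        # Flush the current batch before it would exceed the limit.
        if lines and size + pair_size > max_bytes:
            yield "".join(lines)
            lines, size = [], 0
        lines.append(pair)
        size += pair_size
    if lines:
        yield "".join(lines)


# Index settings body for the refresh interval change (1s -> 30s).
REFRESH_SETTINGS = {"index": {"refresh_interval": "30s"}}
```

Batching by bytes rather than by document count keeps request sizes stable when document sizes vary.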

Searching

Product view

(Product view panels: summary, search results, term aggregations)

Before optimization

Goal

Problem

• Slow queries

Goal

• From 15 to 5 seconds for the 95th percentile

• Seeking a 3x improvement

Understand hardware utilization

• Run the heaviest query

• No bottlenecks (CPU, disk IO, network)

• Thread pool search.size 25

• Max search.active is 3
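The gap between a search pool of 25 threads and only 3 active threads can be illustrated with a small model. It assumes that a single request is served by one search thread per shard on each node, so the shard count per node caps how many cores one query can keep busy. The function name is hypothetical, and the pool size of 49 for the 32-core nodes is an assumed value that the slides do not state.

```python
def cores_used_by_one_query(shards_per_node, cores_per_node, search_pool_size):
    """Upper bound on busy cores per node for one search request.

    Assumes one search thread per shard (an assumption of this sketch).
    Returns (busy_cores, fraction_of_node).
    """
    busy = min(shards_per_node, search_pool_size, cores_per_node)
    return busy, busy / cores_per_node


# "Before" hardware from the slides: ~3 shards on a 16-core node, pool size 25.
before = cores_used_by_one_query(3, 16, 25)

# "After" hardware: 16 shards on a 32-core node; pool size 49 is assumed.
after = cores_used_by_one_query(16, 32, 49)

print(before, after)
```

With 1-3 requests per second and low concurrency, most cores sit idle unless each request is spread over more shards per node.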

CPU utilization

• Know your concurrency

Benchmarking # of shards

On a single 32-core node

More CPU per request results

15s to 7.5s

Search & Aggregations

• Searching and sorting is fast

• 8 term aggregations are slow

Aggregation impact

Check facet usage

● Talk to your product manager

● Low product usage

● Remove networks and claims aggregations

● Replace facets with filters
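Replacing a facet with a filter could look roughly like the sketch below. The helper names, the field name `network`, and the value `youtube` are hypothetical, and the `bool`/`filter` syntax assumes Elasticsearch 2.0 or later. The first helper shows the costly variant (a terms aggregation computed over every matching document); the second applies a value the user picked as a plain filter with no aggregation at all.

```python
def with_terms_aggregation(query, field, size=10):
    """Search body that computes a facet (terms aggregation) on field."""
    return {
        "query": query,
        "aggs": {field: {"terms": {"field": field, "size": size}}},
    }


def with_term_filter(query, field, value):
    """Search body that narrows the results to one value of field."""
    return {
        "query": {
            "bool": {
                "must": [query],
                "filter": [{"term": {field: value}}],
            }
        }
    }


base_query = {"match": {"title": "cats"}}
facet_body = with_terms_aggregation(base_query, "network")
filter_body = with_term_filter(base_query, "network", "youtube")
```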

Removing two aggregations results

15s to 5.3s

Cardinality

● Reduce cardinality

● Going from 200M to 5M (channels to creators)

● Reducing # of topics from 5M to 500

Reducing cardinality results

15s to 4.4s

Split query and aggregations

● Searching and aggregating separately

● Using shard-level query cache

● Showing results in UI asynchronously
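One way to picture the split is a helper that turns a combined request into two bodies: one that returns hits without aggregations, and one that returns only aggregations. The function name and the default page size are hypothetical. The aggregation body uses `size: 0`, since requests that return no hits are the ones the shard-level cache is designed for; the parameter that enables the cache differs between Elasticsearch versions, so it is left out of this sketch.

```python
def split_request(body, page_size=20):
    """Split a combined search body into (search_body, aggs_body)."""
    # Hits only: drop the aggregations, keep query, sort, paging.
    search_body = {
        k: v for k, v in body.items() if k not in ("aggs", "aggregations")
    }
    search_body.setdefault("size", page_size)

    # Aggregations only: same query, no hits returned.
    aggs = body.get("aggs") or body.get("aggregations") or {}
    aggs_body = {
        "query": body.get("query", {"match_all": {}}),
        "size": 0,
        "aggs": aggs,
    }
    return search_body, aggs_body


combined = {
    "query": {"match": {"title": "cats"}},
    "sort": [{"views": "desc"}],
    "aggs": {"creators": {"terms": {"field": "creator_id"}}},
}
search_body, aggs_body = split_request(combined)
```

The two bodies can then be sent as independent requests, so the UI can render the result list while the aggregations are still being computed.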

Split query and aggregations results

15s to 4.0s

Performance gain

● From 15 to 4 seconds (<5 seconds)

● Overall improvement 3.7x

● What about costs?
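The latencies reported on the result slides can be turned into speedups against the 15 s baseline. The step labels below are paraphrased from the slide titles.

```python
BASELINE_SECONDS = 15.0

# Latency after each optimization step, as reported on the result slides.
steps = [
    ("more CPU per request", 7.5),
    ("removing two aggregations", 5.3),
    ("reducing cardinality", 4.4),
    ("splitting query and aggregations", 4.0),
]

# Speedup of each step relative to the 15 s baseline.
speedups = {name: BASELINE_SECONDS / latency for name, latency in steps}

for name, latency in steps:
    print(f"{name}: {latency}s, {speedups[name]:.2f}x")
```

The final step works out to 3.75x, which the slide reports as 3.7x.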

Architecture patterns

Part 2. Goals

● Reduce costs

● Improve reliability

● Simplify architecture

● Reduce variability in latency

Current flow

● Too many dependencies

● Expensive intermediate storage

Denormalization

● 90% of data is shared

● No extra calls from frontend

Partial updates with Update API (experimental)
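The slide gives no details beyond its title, so the following is only a sketch of what a partial update looks like when sent through the bulk API: an `update` action line followed by a `doc` line carrying just the changed fields. The function name, the index name `videos`, and the field `views` are hypothetical.

```python
import json


def partial_update_lines(doc_id, changed_fields, index):
    """Return the two bulk lines for a partial update of one document."""
    action = {"update": {"_index": index, "_id": doc_id}}
    # Only the changed fields are sent; the rest of the document is kept.
    partial = {"doc": changed_fields}
    return json.dumps(action) + "\n" + json.dumps(partial) + "\n"


body = partial_update_lines("abc123", {"views": 1050}, "videos")
```

Elasticsearch still reindexes the whole document internally on a partial update; what shrinks is the payload the client has to assemble and send.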

“Partial” updates with parent-child relations (experimental)
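One reading of this slide title is that frequently changing fields move into a small child document, so an update rewrites only the child. The sketch below shows the general shape of a parent-child mapping and a `has_child` query in Elasticsearch versions before 6.0, where the `_parent` mapping field was available. The type names `video` and `video_stats` and the field `views` are hypothetical, and nothing here is confirmed as the speaker's actual setup.

```python
def parent_child_mappings(parent_type, child_type):
    """Index creation body declaring child_type as a child of parent_type."""
    return {
        "mappings": {
            parent_type: {},
            child_type: {"_parent": {"type": parent_type}},
        }
    }


def has_child_query(child_type, child_query):
    """Search body matching parents that have a child matching child_query."""
    return {"query": {"has_child": {"type": child_type, "query": child_query}}}


mappings = parent_child_mappings("video", "video_stats")
query = has_child_query("video_stats", {"range": {"views": {"gte": 1000}}})

# A child document is indexed with the parent id passed as the `parent`
# request parameter, so that it is routed to the same shard as its parent.
```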

Split data by hot/full (idea for future)

● Cheaper hardware on full

● Shard allocation filtering
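Shard allocation filtering pins an index to nodes that carry a given attribute, via index settings of the form `index.routing.allocation.require.<attribute>`. A minimal sketch follows, assuming nodes are tagged with a custom attribute named `box_type`; the attribute name, its values, and the helper name are hypothetical.

```python
def allocation_settings(tier, attribute="box_type"):
    """Index settings body that requires nodes tagged with attribute=tier."""
    return {f"index.routing.allocation.require.{attribute}": tier}


# The hot index stays on fast nodes, the full index on cheaper ones.
hot_settings = allocation_settings("hot")
full_settings = allocation_settings("full")
```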

Thank you