©2011 Azul Systems, Inc. Speeding up Enterprise Search with In-Memory Indexing Gil Tene, CTO & co-Founder, Azul Systems

Enterprise Search Summit - Speeding Up Search



In this presentation, Gil Tene, CTO of Azul Systems, discusses the benefits of using in-memory, in-process, and in-heap index representations with heap sizes that can now make full use of current commodity server scale. He also compares throughput and latency characteristics of the various architectural alternatives, using some of the configurable choices available in Apache Lucene™ 4.0 for specific examples.


Page 1: Enterprise Search Summit - Speeding Up Search

©2011 Azul Systems, Inc.

Speeding up Enterprise Search

with In-Memory Indexing

Gil Tene, CTO & co-Founder, Azul Systems

Page 2: Enterprise Search Summit - Speeding Up Search


This Talk’s Purpose / Goals

Describe how in-memory indexing can help enterprise search

Explain why in-memory and large memory use has been delayed

Show how in-memory indexing is now (finally) practical for interactive search applications

Page 3: Enterprise Search Summit - Speeding Up Search


About me: Gil Tene

co-founder, CTO @Azul Systems

Have been working on “think different” GC approaches since 2002

Created Pauseless & C4 core GC algorithms (Tene, Wolf)

A long history building Virtual & Physical Machines, Operating Systems, Enterprise apps, etc.

* Working on real-world trash compaction issues, circa 2004

Page 4: Enterprise Search Summit - Speeding Up Search


About Azul

We make scalable Virtual Machines

Have built “whatever it takes to get job done” since 2002

3 generations of custom SMP Multi-core HW (Vega)

Now Pure software for commodity x86 (Zing)

“Industry firsts” in Garbage collection, elastic memory, Java virtualization, memory scale

Vega

C4

Page 5: Enterprise Search Summit - Speeding Up Search


About Azul

Supporting mission-critical deployments around the world

Page 6: Enterprise Search Summit - Speeding Up Search


High level agenda

How can in-memory help things?

Why isn’t everyone just using it?

The “Application Memory Wall” problem

Some Garbage Collection banter

What an actual solution looks like...

Page 7: Enterprise Search Summit - Speeding Up Search


Memory use

How many of you use heap sizes of:

☐ more than ½ GB?
☐ more than 1 GB?
☐ more than 2 GB?
☐ more than 4 GB?
☐ more than 10 GB?
☐ more than 20 GB?
☐ more than 50 GB?

Page 8: Enterprise Search Summit - Speeding Up Search


Why is In-Memory indexing interesting?

Page 9: Enterprise Search Summit - Speeding Up Search


Interesting things happen when an entire problem fits in a single heap

No more out-of-process access for “hot” stuff

~1 microsecond instead of ~1 millisecond access

10s of GB/sec instead of 100s of MB/sec rates

Work elimination: no more reading/parsing/deserializing per access

No network or I/O load

These can amount to dramatic speed benefits
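The work-elimination point can be made concrete with a small stand-in sketch (hypothetical catalog data, not from the talk): fetching a serialized record and re-parsing it on every access, versus a single in-heap hash lookup on objects that were parsed once.

```java
import java.util.HashMap;
import java.util.Map;

// Contrast per-access parsing (out-of-process style) with in-heap access.
public class WorkElimination {

    // Simulates fetching a serialized record and parsing it on every access.
    static int priceByParsing(String serializedCatalog, String sku) {
        for (String row : serializedCatalog.split("\n")) {
            String[] cols = row.split(",");
            if (cols[0].equals(sku)) return Integer.parseInt(cols[1]);
        }
        return -1;
    }

    public static void main(String[] args) {
        String catalog = "sku-1,100\nsku-2,250\nsku-3,75";

        // In-heap representation: parse once, keep live objects.
        Map<String, Integer> inHeap = new HashMap<>();
        for (String row : catalog.split("\n")) {
            String[] cols = row.split(",");
            inHeap.put(cols[0], Integer.parseInt(cols[1]));
        }

        // Both paths return the same answer; only one re-parses per access.
        System.out.println(priceByParsing(catalog, "sku-2")); // parses every time
        System.out.println(inHeap.get("sku-2"));              // single hash lookup
    }
}
```

At scale, the second path is what turns ~1 millisecond accesses into ~1 microsecond ones.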

Page 10: Enterprise Search Summit - Speeding Up Search


Many business problems can now comfortably fit in server memory

Product Catalogs are often in the 2-10GB size range

Overnight reports are often done on 10-20GB data sets

All of English Language Wikipedia text is ~29GB

Indexing “bloats” things, but not by that much...

A full index of English Language Wikipedia (including stored fields and term vectors for highlighting) fits in ~78GB.

You’ll need even more than that in heap size...
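The sizing arithmetic above can be sketched as follows; the 1.5x heap headroom factor is an illustrative assumption (the collector needs room to work in), not a figure from the talk.

```java
// Rough heap-sizing arithmetic using the numbers from the slides.
public class HeapSizing {
    static long requiredHeapGB(long indexGB, double headroomFactor) {
        return (long) Math.ceil(indexGB * headroomFactor);
    }

    public static void main(String[] args) {
        long wikipediaTextGB = 29;   // English Wikipedia text (from the slides)
        long wikipediaIndexGB = 78;  // full index incl. stored fields + term vectors
        System.out.printf("Index bloat: ~%.1fx%n",
                wikipediaIndexGB / (double) wikipediaTextGB);
        // Assumed ~1.5x headroom over the live index.
        System.out.println("Suggested heap: ~" + requiredHeapGB(wikipediaIndexGB, 1.5) + " GB");
    }
}
```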

Page 11: Enterprise Search Summit - Speeding Up Search

[Chart: “Heap size vs. GC CPU %” — GC CPU % (up to 100%) plotted against heap size, with the live set marked on the heap-size axis]

Page 12: Enterprise Search Summit - Speeding Up Search


Case in point: the impact of an in-memory index in Lucene

Lucene provides multiple index representation options:

SimpleFSDirectory, NIOFSDirectory (on-disk, file-based)

MMapDirectory (mapped files, better leverage of OS cache memory)

RAMDirectory (in-process, in-heap)

Comparing throughput (e.g. on English Lang. Wikipedia):

RAMDirectory: ~2-3x throughput vs. MMapDirectory

Similar impact on average response.

Even higher impact with promising things in Lucene 4.0

But average is not all that matters:

There is a good reason people aren’t doing this (yet...)
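RAMDirectory keeps the whole index as in-heap objects. As a self-contained stand-in (deliberately not Lucene code — no Lucene classes are used), a minimal in-heap inverted index looks roughly like this:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal in-heap inverted index: term -> postings list of doc ids.
public class InHeapIndex {
    private final Map<String, List<Integer>> postings = new HashMap<>();

    void addDocument(int docId, String text) {
        for (String term : text.toLowerCase().split("\\s+")) {
            postings.computeIfAbsent(term, t -> new ArrayList<>()).add(docId);
        }
    }

    // A query is a plain heap lookup: no I/O, no parsing, no deserialization.
    List<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), List.of());
    }

    public static void main(String[] args) {
        InHeapIndex index = new InHeapIndex();
        index.addDocument(1, "speeding up enterprise search");
        index.addDocument(2, "in-memory search indexes");
        System.out.println(index.search("search")); // [1, 2]
    }
}
```

The catch, as the rest of the talk explains, is that heap-resident postings like these are exactly the long-lived data that stop-the-world collectors struggle with.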

Page 13: Enterprise Search Summit - Speeding Up Search


Why isn’t all search done with in-memory indexes already?

Page 14: Enterprise Search Summit - Speeding Up Search


The “Application Memory Wall”

Page 15: Enterprise Search Summit - Speeding Up Search


Memory use

How many of you use heap sizes of:

☐ more than ½ GB?
☐ more than 1 GB?
☐ more than 2 GB?
☐ more than 4 GB?
☐ more than 10 GB?
☐ more than 20 GB?
☐ more than 50 GB?

Page 16: Enterprise Search Summit - Speeding Up Search

Reality check: servers in 2012

Retail prices, major web server store (US $, Oct 2012)

Cheap (< $1/GB/Month), and roughly linear to ~1TB

10s to 100s of GB/sec of memory bandwidth

24 vCore, 128GB server ≈ $5K

24 vCore, 256GB server ≈ $8K

32 vCore, 384GB server ≈ $14K

48 vCore, 512GB server ≈ $19K

64 vCore, 1TB server ≈ $36K

Page 17: Enterprise Search Summit - Speeding Up Search


The Application Memory Wall

A simple observation:

Application instances appear to be unable to make effective use of modern server memory capacities

The size of application instances as a % of a server’s capacity is rapidly dropping

Page 18: Enterprise Search Summit - Speeding Up Search


How much memory do applications need?

“640KB ought to be enough for anybody”

WRONG!

So what’s the right number? 6,400K? 64,000K? 640,000K? 6,400,000K? 64,000,000K?

There is no right number

Target moves at 50x-100x per decade

“I've said some stupid things and some wrong things, but not that. No one involved in computers would ever say that a certain amount of memory is enough for all time …” - Bill Gates, 1996

Page 19: Enterprise Search Summit - Speeding Up Search


“Tiny” application history

100KB apps on a ¼ to ½ MB Server

10MB apps on a 32 – 64 MB server

1GB apps on a 2 – 4 GB server

??? GB apps on a 256 GB server

Assuming Moore’s Law means:

“transistor counts grow at ≈2x every ≈18 months”

It also means memory size grows ≈100x every 10 years
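The “≈100x every 10 years” figure follows directly from the doubling rate; a quick check:

```java
public class MooresLawMemory {
    // 2x every `monthsPerDoubling` months, compounded over 120 months (a decade).
    static double growthPerDecade(double monthsPerDoubling) {
        return Math.pow(2.0, 120.0 / monthsPerDoubling);
    }

    public static void main(String[] args) {
        // 2^(120/18) = 2^6.67 ≈ 101.6 -- i.e. roughly 100x per decade
        System.out.printf("Growth per decade: ~%.1fx%n", growthPerDecade(18.0));
    }
}
```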

[Timeline graphic: 1980, 1990, 2000, 2010 — “Tiny”: would be “silly” to distribute; labeled “Application Memory Wall”]

Page 20: Enterprise Search Summit - Speeding Up Search


What is causing the Application Memory Wall?

Garbage Collection is a clear and dominant cause

There seem to be practical heap size limits for applications with responsiveness requirements

[Virtually] All current commercial JVMs will exhibit a multi-second pause on a normally utilized 2-6GB heap.

It’s a question of “When” and “How often”, not “If”.

GC tuning only moves the “when” and the “how often” around

Root cause: The link between scale and responsiveness

Page 21: Enterprise Search Summit - Speeding Up Search

What quality of GC is responsible for the Application Memory Wall?

It is NOT about overhead or efficiency: CPU utilization, bottlenecks, memory consumption and utilization

It is NOT about speed: average speeds, 90%, even 95% speeds, are all perfectly fine

It is NOT about minor GC events (right now): GC events in the 10s of msec are usually tolerable for most apps

It is NOT about the frequency of very large pauses

It is ALL about the worst observable pause behavior

People avoid building/deploying visibly broken systems
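The point above — that averages and mid-range percentiles hide exactly the pauses that matter — can be demonstrated with a nearest-rank percentile over a latency series containing a single GC-sized outlier (illustrative numbers, not measurements from the talk):

```java
import java.util.Arrays;

public class WorstCaseMatters {
    // Nearest-rank percentile over a sorted copy of the samples.
    static double percentile(double[] samples, double p) {
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    public static void main(String[] args) {
        // 99 fast responses plus one multi-second GC pause.
        double[] millis = new double[100];
        Arrays.fill(millis, 2.0);
        millis[99] = 8000.0;

        System.out.println("95th percentile: " + percentile(millis, 95));  // 2.0 ms
        System.out.println("max:             " + percentile(millis, 100)); // 8000.0 ms
    }
}
```

The 95th (and even 99th) percentile here look perfectly healthy; only the max exposes the pause users actually notice.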

Page 22: Enterprise Search Summit - Speeding Up Search


GC Problems

Page 23: Enterprise Search Summit - Speeding Up Search


Framing the discussion: Garbage Collection at modern server scales

Modern Servers have 100s of GB of memory

Each modern x86 core (when actually used) produces garbage at a rate of ¼ - ½ GB/sec +

We want to make use of that memory and throughput in responsive applications

Monolithic stop-the-world operations are the cause of the current Application Memory Wall

Even if they are done “only a few times a day”

Page 24: Enterprise Search Summit - Speeding Up Search

One way to deal with Monolithic-STW GC

Page 25: Enterprise Search Summit - Speeding Up Search

[Chart: “Hiccups by Time Interval” and “Hiccups by Percentile Distribution” — hiccup duration (msec) vs. elapsed time (sec) and percentile (0% to 99.9999%); Max = 16023.552 msec]

Page 26: Enterprise Search Summit - Speeding Up Search

[Chart: same “Hiccups by Time Interval” / “Hiccups by Percentile Distribution” plot; Max = 16023.552 msec]

Page 27: Enterprise Search Summit - Speeding Up Search
Page 28: Enterprise Search Summit - Speeding Up Search


Another way to cope: “Creative Language”

“Guarantee a worst case of X msec, 99% of the time”

“Mostly” Concurrent, “Mostly” Incremental
Translation: “Will at times exhibit long monolithic stop-the-world pauses”

“Fairly Consistent”
Translation: “Will sometimes show results well outside this range”

“Typical pauses in the tens of milliseconds”
Translation: “Some pauses are much longer than tens of milliseconds”

Page 29: Enterprise Search Summit - Speeding Up Search


Actually measuring things

Page 30: Enterprise Search Summit - Speeding Up Search


Discontinuities in Java platform execution

[Chart: “Hiccups by Time Interval” and “Hiccups by Percentile Distribution” — hiccup duration (msec) vs. elapsed time (sec) and percentile; Max = 1665.024 msec]

Page 31: Enterprise Search Summit - Speeding Up Search


But reality is simple

People deal with GC problems primarily by keeping heap sizes artificially small.

Keeping heaps small has translated to avoiding the use of in-process, in-memory contents

That needs to change...

Page 32: Enterprise Search Summit - Speeding Up Search


Solving the problems behind the Application Memory Wall

Page 33: Enterprise Search Summit - Speeding Up Search


We needed to solve the right problems

Focus on the causes of the Application Memory Wall:
Scale is artificially limited by responsiveness

Responsiveness must be unlinked from scale:
Data size, throughput, queries per second, concurrent users, ...
Heap size, live set size, allocation rate, mutation rate, ...

Responsiveness must be continually sustainable:
Can’t ignore “rare” events

Eliminate all stop-the-world fallbacks:
At modern server scales, any stop-the-world fallback is a failure

Page 34: Enterprise Search Summit - Speeding Up Search


The GC problems that needed solving

Robust concurrent marking:
In the presence of high throughput (mutation and allocation rates)

Compaction that is not stop-the-world:
E.g. stay responsive while compacting 100+GB heaps
Must be robust: not just a tactic to delay STW compaction

Young-gen that is not stop-the-world:
Stay responsive while promoting multi-GB data spikes
Surprisingly little work done in this specific area

Page 35: Enterprise Search Summit - Speeding Up Search


Azul’s “C4” Collector: Continuously Concurrent Compacting Collector

Concurrent, guaranteed-single-pass marker:
Oblivious to mutation rate
Concurrent ref (weak, soft, final) processing

Concurrent compactor:
Objects moved without stopping mutator
References remapped without stopping mutator
Can relocate entire generation (new, old) in every GC cycle

Concurrent, compacting old generation

Concurrent, compacting new generation

No stop-the-world fallback:
Always compacts, and always does so concurrently

Page 36: Enterprise Search Summit - Speeding Up Search


Benefits

Page 37: Enterprise Search Summit - Speeding Up Search


Sample responsiveness behavior

๏ SpecJBB + slow-churning 2GB LRU cache
๏ Live set is ~2.5GB across all measurements
๏ Allocation rate is ~1.2GB/sec across all measurements

Page 38: Enterprise Search Summit - Speeding Up Search


A JVM for Linux/x86 servers

ELIMINATES Garbage Collection as a concern for enterprise applications

Decouples scale metrics from response time concerns

Transaction rate, data set size, concurrent users, heap size, allocation rate, mutation rate, etc.

Leverages elastic memory for resilient operation

Page 39: Enterprise Search Summit - Speeding Up Search


Sustainable Throughput: the throughput achieved while safely maintaining service levels

Unsustainable Throughput

Page 40: Enterprise Search Summit - Speeding Up Search


Instance capacity test: “Fat Portal”

HotSpot CMS: peaks at ~3GB / 45 concurrent users

* LifeRay portal on JBoss @ 99.9% SLA of 5 second response times

Page 41: Enterprise Search Summit - Speeding Up Search


Instance capacity test: “Fat Portal”

C4: still smooth @ 800 concurrent users

Page 42: Enterprise Search Summit - Speeding Up Search


Lucene: English Language Wikipedia

RAMDirectory based index

Page 43: Enterprise Search Summit - Speeding Up Search


Fun with jHiccup
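jHiccup’s core idea is a measurement thread that repeatedly sleeps a fixed interval and records how much longer the sleep actually took; any overshoot is time the platform (e.g. a GC pause) stalled the thread. A minimal sketch of that idea (not jHiccup’s actual implementation, which records full HdrHistogram percentiles):

```java
public class HiccupMeter {
    // Sleep for intervalMs repeatedly and record the worst overshoot:
    // extra delay beyond the requested sleep is platform-induced stall time.
    static long worstHiccupMs(int iterations, long intervalMs) throws InterruptedException {
        long worst = 0;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            Thread.sleep(intervalMs);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            worst = Math.max(worst, elapsedMs - intervalMs);
        }
        return worst;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("worst hiccup: " + worstHiccupMs(50, 1) + " ms");
    }
}
```

Because the meter does almost no work of its own, whatever it observes is a lower bound on the stalls every application thread experienced.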

Page 44: Enterprise Search Summit - Speeding Up Search


Oracle HotSpot CMS, 1GB in an 8GB heap

[Chart: hiccups by time interval and by percentile distribution, hiccup duration in msec; Max = 13156.352 msec]

Zing 5, 1GB in an 8GB heap

[Chart: hiccups by time interval and by percentile distribution, hiccup duration in msec (0-25 msec axis); Max = 20.384 msec]

Page 45: Enterprise Search Summit - Speeding Up Search


Oracle HotSpot CMS, 1GB in an 8GB heap

[Chart: hiccups by time interval and by percentile distribution, 0-14000 msec axis; Max = 13156.352 msec]

Zing 5, 1GB in an 8GB heap

[Chart: hiccups by time interval and by percentile distribution, plotted on the same 0-14000 msec axis; Max = 20.384 msec]

Drawn to scale

Page 46: Enterprise Search Summit - Speeding Up Search


Summary: in-memory indexing and search

Interesting content can now fit in memory

In-memory indexing makes things go fast

In-memory indexing makes new things possible:
Change batch to interactive. Change business processes.

It is finally practical to use in-memory indexing for interactive search applications

Zing makes this work...

Page 47: Enterprise Search Summit - Speeding Up Search


Q & A

Lucene In-RAM English Lang. Wikipedia test: http://blog.mikemccandless.com/2012/07/lucene-index-in-ram-with-azuls-zing-jvm.html

jHiccup: http://www.azulsystems.com/dev_resources/jhiccup

GC: G. Tene, B. Iyengar and M. Wolf, “C4: The Continuously Concurrent Compacting Collector”, in Proceedings of ISMM’11, ACM, pages 79-88