Solr on Docker - the Good, the Bad and the Ugly

Radu Gheorghe

Sematext Group, Inc.

Agenda

The Good (well, arguably). Why containers? Orchestration, configuration drift...

The Bad (actually, not so bad). How to do it? Hardware, heap size, shards...

The Ugly (and exciting). Why is it slow/crashing? Container limits, GC & OS settings

Clients

Sematext Cloud (logs, metrics, ...)

Our own dockerizing (dockerization?)

Because Docker is the future!

Orchestration

* you’re not tied to the provider’s autoscaling

* you may get better deals with huge VMs

Demo: Kubernetes

https://github.com/sematext/lucene-revolution-samples

More on “the Good” of containerization

dev=test=prod; infrastructure as code. Sounds familiar? But:

○ light images

○ faster start & stop

○ hype ⇒ community

Efficiency (overhead vs isolation): (processes + VMs)/2 = containers

Moving on to “how”

Zookeeper on separate hosts

nodes

Avoid hotspots:

Equal nodes per host

Equal shards per node (per collection)

podAntiAffinity on k8s
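The podAntiAffinity item can be sketched as the `affinity` stanza of a pod spec, built here as a Python dict and printed as JSON (which kubectl accepts alongside YAML). This is only an illustration: the `app: solr` label is a hypothetical name, and a hard `required...` rule is one possible choice (a `preferred...` rule would be the softer variant).

```python
import json


def solr_anti_affinity(app_label: str = "solr") -> dict:
    """Build an affinity stanza that keeps pods with the same label on different hosts.

    The label value is an assumption for illustration; use whatever label
    your Solr pods actually carry.
    """
    return {
        "podAntiAffinity": {
            # Hard rule: the scheduler refuses to co-locate two matching pods.
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    "labelSelector": {"matchLabels": {"app": app_label}},
                    # One matching pod per distinct host name.
                    "topologyKey": "kubernetes.io/hostname",
                }
            ]
        }
    }


if __name__ == "__main__":
    print(json.dumps({"affinity": solr_anti_affinity()}, indent=2))
```

With a hard rule like this, the number of Solr pods can never exceed the number of schedulable hosts, which is worth keeping in mind when scaling out.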

On scaling

Overshard*. A bit.

* Moving shards creates load ⇒ be aware of spikes

Time series? Size-based indices

(Figure: logs1, logs2, logs3 along a time axis)
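A minimal sketch of the size-based idea for time series: keep writing to the current collection until it reaches a size threshold, then roll over to the next one (logs1, logs2, logs3...). The function names and the threshold are hypothetical; how you measure collection size and create the new collection is left out.

```python
def should_roll_over(current_size_gb: float, max_size_gb: float) -> bool:
    """Return True once the current collection has reached the size threshold."""
    return current_size_gb >= max_size_gb


def next_collection_name(current: str) -> str:
    """Increment the trailing number of a collection name, e.g. logs1 -> logs2."""
    prefix = current.rstrip("0123456789")
    suffix = current[len(prefix):]
    if not suffix:
        raise ValueError(f"collection name has no numeric suffix: {current!r}")
    return f"{prefix}{int(suffix) + 1}"


if __name__ == "__main__":
    name, size_gb = "logs1", 52.0
    if should_roll_over(size_gb, max_size_gb=50.0):
        name = next_collection_name(name)
    print(name)
```

Compared to fixed time windows, rolling over by size keeps shards roughly equal, which fits the "avoid hotspots" advice.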

volumes/StatefulSet for persistence

local > network (esp. for full-text search)

permissions

latency (mostly to Zookeeper). AWS → enhanced networking

network storage on different interface. AWS → EBS-optimized

Big vs small hosts

Not too small

OS caches are shared between containers ⇩

>1 Solr nodes per host?

Co-locate with less IO-intensive apps?

Not too big

Host failure will be really bad

Overhead (e.g. memory allocation)

Big vs small containers/nodes

Many small Solr nodes ⇒ bigger cluster state, # of shards

Multithreaded indexing

Full text search is usually bound by IO latency

Facets are usually parallelized between shards/collections

Size usually limited by heap (can’t be too big due to GC) or by recovery time

bigger = better

How much heap?

More data → more heap (terms, docValues, norms…)

Caches (generally, fieldValueCache is evil, use docValues)

Transient memory (serving requests) → add 50-100% headroom

Make sure to leave enough room for OS caches
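The sizing arithmetic can be sketched as follows, assuming you have already measured how much heap the index data structures and Solr caches take. All numbers and names are illustrative; the fixed JVM overhead figure in particular is a placeholder assumption, not a recommendation.

```python
def estimate_heap_gb(static_gb: float, caches_gb: float, headroom: float = 0.5) -> float:
    """Heap = (index data structures + caches) plus 50-100% for transient memory."""
    if not 0.5 <= headroom <= 1.0:
        raise ValueError("headroom is expected to be between 0.5 and 1.0")
    return (static_gb + caches_gb) * (1 + headroom)


def os_cache_room_gb(host_memory_gb: float, heap_gb: float, jvm_overhead_gb: float = 1.0) -> float:
    """Memory left for OS caches after the heap and an assumed JVM overhead."""
    return host_memory_gb - heap_gb - jvm_overhead_gb


if __name__ == "__main__":
    heap = estimate_heap_gb(static_gb=8, caches_gb=2, headroom=0.5)
    print(f"heap: {heap} GB, left for OS caches: {os_cache_room_gb(64, heap)} GB")
```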

The 32GB heap problem

@32GB → no more compressed object pointers

Depending on OS, >30GB → still compressed, but not 0-based → more CPU

Uncompressed pointers’ overhead varies by use-case; 5-10% is a good estimate

Larger heaps → GC is a bigger problem
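Rather than trusting a fixed threshold, you can ask the JVM what it would do for a given heap size. The sketch below builds a `java -XX:+PrintFlagsFinal` command and parses the `UseCompressedOops` line from its output. The sample text is illustrative of the usual HotSpot layout, not captured from a real run, and the exact column layout may differ between JVM versions.

```python
import re
from typing import List

# Matches lines such as "bool UseCompressedOops := true {lp64_product}".
_FLAG_RE = re.compile(r"^\s*bool\s+UseCompressedOops\s+:?=\s+(true|false)\b", re.MULTILINE)


def compressed_oops_command(heap_gb: int) -> List[str]:
    """Command that prints the final JVM flags for the given max heap size."""
    return ["java", f"-Xmx{heap_gb}g", "-XX:+PrintFlagsFinal", "-version"]


def uses_compressed_oops(print_flags_output: str) -> bool:
    """Parse PrintFlagsFinal output and report whether compressed oops are on."""
    match = _FLAG_RE.search(print_flags_output)
    if match is None:
        raise ValueError("UseCompressedOops not found in output")
    return match.group(1) == "true"


if __name__ == "__main__":
    # Illustrative sample line, assumed format.
    sample = "     bool UseCompressedOops    := true    {lp64_product}\n"
    print(" ".join(compressed_oops_command(31)))
    print(uses_compressed_oops(sample))
```

Running the command for a few heap sizes around 30-32GB on the target OS and JVM shows where the switch actually happens.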

GC Settings

Defaults → should be good up to 30GB

Larger heaps need tuning for latency

100GB+ per node is doable.

CMS: NewRatio, SurvivorRatio, CMSInitiatingOccupancyFraction

G1 trades heap for latency and throughput:

■ Adaptive sizing depending on MaxGCPauseMillis

■ Compacts old gen (check G1HeapRegionSize)

usually jump to 45GB+

typical cluster killer (timeouts)

More useful info: https://wiki.apache.org/solr/ShawnHeisey#GC_Tuning_for_Solr
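As a sketch, the settings named above map to JVM options that can be joined into one string, for example for the `GC_TUNE` variable in Solr's `solr.in.sh`. The numeric values below are placeholders to show the syntax, not tuning advice, and the region-size check assumes the 1-32MB power-of-two range of the JVMs this talk targets.

```python
from typing import List, Optional


def cms_flags(new_ratio: int, survivor_ratio: int, initiating_occupancy: int) -> List[str]:
    """JVM options for the CMS collector, using the three knobs named on the slide."""
    return [
        "-XX:+UseConcMarkSweepGC",
        f"-XX:NewRatio={new_ratio}",
        f"-XX:SurvivorRatio={survivor_ratio}",
        f"-XX:CMSInitiatingOccupancyFraction={initiating_occupancy}",
    ]


def g1_flags(max_pause_ms: int, region_size_mb: Optional[int] = None) -> List[str]:
    """JVM options for G1: a pause target and, optionally, an explicit region size."""
    flags = ["-XX:+UseG1GC", f"-XX:MaxGCPauseMillis={max_pause_ms}"]
    if region_size_mb is not None:
        if region_size_mb not in (1, 2, 4, 8, 16, 32):
            raise ValueError("region size is assumed to be a power of two from 1 to 32 MB")
        flags.append(f"-XX:G1HeapRegionSize={region_size_mb}m")
    return flags


def gc_tune(flags: List[str]) -> str:
    """Join flags into a single space-separated string."""
    return " ".join(flags)


if __name__ == "__main__":
    print(gc_tune(g1_flags(max_pause_ms=250, region_size_mb=16)))
```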

N nodes per host ⇒ threads

GC-related

young: ParallelGCThreads

old: ConcGCThreads + G1ConcRefinementThreads

facet.threads

merges*: maxThreadCount & maxMergeCount

* also account for IO throughput & latency

<Java 9 defaults depend on host’s #CPUs
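One way to act on this is to set the GC thread counts explicitly from each node's share of the host's CPUs, instead of letting every JVM size itself for the whole host. The split below (even share per node, a quarter of that for concurrent threads) is an assumed heuristic for illustration, not the JVM's own formula.

```python
from typing import List


def gc_thread_flags(host_cpus: int, nodes_per_host: int) -> List[str]:
    """Derive explicit GC thread settings from each node's share of host CPUs."""
    if host_cpus < 1 or nodes_per_host < 1:
        raise ValueError("host_cpus and nodes_per_host must be positive")
    cpus_per_node = max(1, host_cpus // nodes_per_host)
    parallel = cpus_per_node                 # young-generation (stop-the-world) threads
    concurrent = max(1, cpus_per_node // 4)  # assumed ratio for concurrent old-gen work
    return [
        f"-XX:ParallelGCThreads={parallel}",
        f"-XX:ConcGCThreads={concurrent}",
        f"-XX:G1ConcRefinementThreads={parallel}",
    ]


if __name__ == "__main__":
    # 4 Solr nodes sharing a 32-CPU host.
    print(" ".join(gc_thread_flags(host_cpus=32, nodes_per_host=4)))
```

The same per-node CPU share is a reasonable starting point for facet.threads and the merge scheduler's maxThreadCount, with IO capacity as the other constraint.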

Container limits

Memory: more than heap, but won’t include OS caches

CPU

Single NUMA node? --cpu-shares

Multiple NUMA nodes? --cpuset*

vm.zone_reclaim_mode to store caches only on local node?

* Docker isn’t NUMA aware (https://github.com/moby/moby/issues/9777), but the kernel automatically balances threads by default
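A small sketch of how these limits translate to `docker run` options: the memory limit is the heap plus an allowance for off-heap JVM memory, and CPU is constrained either by relative shares or by pinning to CPUs and memory nodes. The sizes and the CPU ranges are placeholders; the off-heap allowance is an assumption you would measure for your own setup.

```python
from typing import List, Optional


def docker_limit_args(
    heap_gb: int,
    off_heap_gb: int,
    cpu_shares: Optional[int] = None,
    cpuset_cpus: Optional[str] = None,
    cpuset_mems: Optional[str] = None,
) -> List[str]:
    """Build docker run resource options; the memory limit must exceed the heap."""
    if off_heap_gb < 1:
        raise ValueError("leave room above the heap for off-heap JVM memory")
    args = [f"--memory={heap_gb + off_heap_gb}g"]
    if cpu_shares is not None:
        args.append(f"--cpu-shares={cpu_shares}")     # relative weight, single NUMA node
    if cpuset_cpus is not None:
        args.append(f"--cpuset-cpus={cpuset_cpus}")   # pin to specific CPUs
    if cpuset_mems is not None:
        args.append(f"--cpuset-mems={cpuset_mems}")   # pin to specific NUMA memory nodes
    return args


if __name__ == "__main__":
    # Pin a 16GB-heap node to the CPUs and memory of NUMA node 0 (placeholder ranges).
    print(" ".join(docker_limit_args(16, 4, cpuset_cpus="0-7", cpuset_mems="0")))
```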

JVM+Docker+Linux = love. Or not.

Memory leak → OOM killer with a wide range of Java versions*

What helps:

Similar leaks (growing RSS) → NativeMemoryTracking

Don’t overbook memory + leave room for OS caches

Allocate on startup via AlwaysPreTouch

Increase vm.min_free_kbytes?

* https://bugs.openjdk.java.net/browse/JDK-8164293
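For the NativeMemoryTracking and AlwaysPreTouch items, a sketch of the JVM flags and the matching `jcmd` invocations might look like this. The helper names are hypothetical; tracking in `summary` mode adds some overhead, so treat it as a diagnostic setting.

```python
from typing import List


def leak_hunting_jvm_flags(pretouch: bool = True) -> List[str]:
    """JVM flags that enable native memory tracking and, optionally, heap pre-touching."""
    flags = ["-XX:NativeMemoryTracking=summary"]
    if pretouch:
        flags.append("-XX:+AlwaysPreTouch")  # touch all heap pages at startup
    return flags


def nmt_command(pid: int, mode: str = "summary") -> List[str]:
    """jcmd invocation for a tracking report, a baseline, or a diff against the baseline."""
    if mode not in ("summary", "baseline", "summary.diff"):
        raise ValueError(f"unsupported mode: {mode}")
    return ["jcmd", str(pid), "VM.native_memory", mode]


if __name__ == "__main__":
    print(" ".join(leak_hunting_jvm_flags()))
    # Take a baseline, wait while RSS grows, then diff to see which area grew.
    print(" ".join(nmt_command(1234, "baseline")))
    print(" ".join(nmt_command(1234, "summary.diff")))
```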

More on that love

Newer kernels and Dockers are usually better (e.g. 4.4+ and 1.13+)

Open files and locked memory limits

Check dmesg and kswapd* CPU usage

Dare I say it: Try smaller hosts

Try niofs? (if you thrash the cache - and TLB - too much)

A bit of swap? (swappiness is configurable per container, too)

Play with mmap arenas and THP

* kernel’s (single-threaded) GC: https://linux-mm.org/PageOutKswapd
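Some of these knobs can be sketched as `docker run` options (open-file and locked-memory limits, per-container swappiness), plus a helper that reads which transparent huge pages mode is active from the text of `/sys/kernel/mm/transparent_hugepage/enabled`. The limit values are placeholders and the helper names are hypothetical.

```python
import re
from typing import List, Optional


def docker_os_args(
    nofile: int = 65000,
    unlimited_memlock: bool = True,
    swappiness: Optional[int] = None,
) -> List[str]:
    """Build docker run options for file/memlock limits and per-container swappiness."""
    args = [f"--ulimit=nofile={nofile}:{nofile}"]
    if unlimited_memlock:
        args.append("--ulimit=memlock=-1:-1")
    if swappiness is not None:
        if not 0 <= swappiness <= 100:
            raise ValueError("swappiness must be between 0 and 100")
        args.append(f"--memory-swappiness={swappiness}")
    return args


def active_thp_mode(sysfs_text: str) -> str:
    """Return the bracketed (active) value, e.g. 'madvise' from 'always [madvise] never'."""
    match = re.search(r"\[(\w+)\]", sysfs_text)
    if match is None:
        raise ValueError("no active mode found")
    return match.group(1)


if __name__ == "__main__":
    print(" ".join(docker_os_args(swappiness=1)))
    print(active_thp_mode("always [madvise] never\n"))
```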

Summary

The Good:

Orchestration

Dynamic allocation of resources (works well for bigger boxes)

Might actually deliver the promise of dev=testing=prod, because Docker is the future!

The Bad:

Pets → cattle requires good sizing, config, scaling practices

The Ugly:

Ecosystem is still young → exciting bugs

Thank You! And please check out:

Solr & Kubernetes cheatsheets: https://sematext.com/resources/#publications

Openings: http://sematext.com/jobs

@sematext @radu0gheorghe

Our booth :)
