Lessons Learned: Refactoring a Solr-Based API App - Torsten Koester


Description

See conference video: http://www.lucidimagination.com/devzone/events/conferences/ApacheLuceneEurocon2011 In this case study I'll discuss architectural lessons learned from refactoring an existing REST API backed by Apache Solr. The initial goal of the refactoring was to speed up data access while scaling from 5m documents to 20-50m documents stored in Solr. Under consideration were the hosting infrastructure, the REST API Java code, and the Solr documents and configuration. In this talk I'll give a brief review of the results. "Pimping" the Solr configuration, the client access and the document structure achieved better results, but the elementary lesson learned was that a significant increase in data access speed can only be realized through a functional redesign and a simplification of the REST API. I'll explain how this led us directly to distinct Solr cores and why we dropped the introduction of Solr shards or a breathing cloud infrastructure.


Architectural lessons learned from refactoring a Solr-based API application.

Torsten Bøgh Köster (Shopping24) Apache Lucene Eurocon, 19.10.2011

Contents

Shopping24 and its API

Technical scaling solutions

Sharding, Caching, Solr Cores, "Elastic" infrastructure

Business requirements as the key factor

@tboeghk

Software and systems architect
2 years experience with Solr
3 years experience with Lucene

Team of 7 Java developers currently at Shopping24

shopping24 internet group

1 portal became n portals

30 partner shops became 700

500k to 7m documents

index fact time

• 16 GB of data
• Single-core layout
• Up to 17s response time
• Machine size limited
• Stalled at Solr version 1.4
• API designed for small tools

Scaling goal: 15-50m documents

ask the nerds

„Shard!“ That‘ll be fun!

„Use spare compute cores at Amazon?“

breathe load into the cloud

„Reduce that index size“

„Get rid of those long running queries!“

data sharding ...

... is highly effective.

[Chart: response time (125ms to 500ms) vs. concurrent requests (1 to 20) for 1, 2, 3, 4, 6 and 8 shards.]

Sharding: size matters

The bigger your index gets, the more complex your queries are, and the more concurrent requests you serve, the more sharding you need.
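For reference, distributed search in Solr fans a single request out over a "shards" parameter listing the participating cores. A minimal SolrJ sketch against the Solr 1.4/3.x-era client API (host and core names are made up for illustration):

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.client.solrj.response.QueryResponse;

  public class ShardedQueryExample {
      public static void main(String[] args) throws Exception {
          // Any shard can coordinate the distributed request.
          CommonsHttpSolrServer solr =
              new CommonsHttpSolrServer("http://solr1.example.com:8983/solr/products");

          SolrQuery query = new SolrQuery("jeans");
          query.setRows(20);
          // The "shards" parameter lists every core taking part in the search.
          query.set("shards",
              "solr1.example.com:8983/solr/products,"
            + "solr2.example.com:8983/solr/products");

          QueryResponse response = solr.query(query);
          System.out.println("hits: " + response.getResults().getNumFound());
      }
  }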

but wait ...

Why do we have such a big index?

7m documents vs. 2m active products

Fashion product lifecycle meets SEO

Bastografie / photocase.com

Separation of duties! Remove unsearchable data from your index.
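One way to apply that separation (a sketch only, not the exact document structure used at Shopping24; field names are invented): keep just the id and the fields you search, facet or sort on in Solr, and fetch the full product record from the primary data store when rendering.

  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.common.SolrInputDocument;

  public class LeanDocumentExample {
      public static void main(String[] args) throws Exception {
          CommonsHttpSolrServer solr =
              new CommonsHttpSolrServer("http://solr.example.com:8983/solr/products");

          // Index only what is needed for searching, faceting and sorting.
          SolrInputDocument doc = new SolrInputDocument();
          doc.addField("id", "product-4711");
          doc.addField("title", "Slim fit jeans");
          doc.addField("brand", "ExampleBrand");
          doc.addField("price", 49.90);
          // Descriptions, images, shop links etc. stay in the primary data
          // store and are looked up by id when the result page is rendered.

          solr.add(doc);
          solr.commit();
      }
  }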

Why do we have complex queries?

A Solr index designed for 1 portal

Grown into a multi-portal index

Let “sharding“ follow your data ...

... and build separate cores for every client.

Duplicate data as long as access is fast.
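In practice this can be as simple as deriving the core URL from the portal, so every client only ever queries its own, much smaller index. A hedged sketch (the URL scheme and portal ids are assumptions):

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.client.solrj.response.QueryResponse;

  public class PerPortalCoreExample {
      // One Solr core per portal, e.g. /solr/portal-fashion, /solr/portal-living
      static CommonsHttpSolrServer coreFor(String portalId) throws Exception {
          return new CommonsHttpSolrServer(
              "http://solr.example.com:8983/solr/portal-" + portalId);
      }

      public static void main(String[] args) throws Exception {
          CommonsHttpSolrServer solr = coreFor("fashion");
          QueryResponse response = solr.query(new SolrQuery("jeans"));
          System.out.println("hits: " + response.getResults().getNumFound());
      }
  }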

andybahn / photocase.com

Streamline your index provisioning process.

A thousand splendid cores at your fingertips.
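Core creation can be scripted against Solr's CoreAdmin API, so provisioning another portal index becomes a build step rather than manual work. A minimal SolrJ sketch (host name and instance directory layout are assumptions):

  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.client.solrj.request.CoreAdminRequest;

  public class CoreProvisioningExample {
      public static void main(String[] args) throws Exception {
          // The CoreAdmin handler lives at the Solr root, not inside a core.
          CommonsHttpSolrServer admin =
              new CommonsHttpSolrServer("http://solr.example.com:8983/solr");

          // Create a fresh core for a new portal from a prepared instance directory.
          CoreAdminRequest.createCore("portal-living", "portal-living", admin);
      }
  }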

Throwing hardware at problems. Automated.

evil traps: latency, $$

mirror your complete system – solve load balancer problems

froodmat / photocase.com

I said faster!

Use a cache layer like Varnish.
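For a reverse proxy like Varnish to help, the API has to emit cacheable responses in the first place. A hedged sketch of a servlet filter that marks search responses as cacheable for a short time (the TTL and the URL pattern it would be mapped to are assumptions):

  import java.io.IOException;
  import javax.servlet.Filter;
  import javax.servlet.FilterChain;
  import javax.servlet.FilterConfig;
  import javax.servlet.ServletException;
  import javax.servlet.ServletRequest;
  import javax.servlet.ServletResponse;
  import javax.servlet.http.HttpServletResponse;

  // Mapped to the search endpoints in web.xml, e.g. /api/search/*
  public class CacheHeaderFilter implements Filter {
      public void init(FilterConfig filterConfig) {}

      public void doFilter(ServletRequest request, ServletResponse response,
                           FilterChain chain) throws IOException, ServletException {
          HttpServletResponse httpResponse = (HttpServletResponse) response;
          // Let Varnish (and browsers) cache identical search requests briefly.
          httpResponse.setHeader("Cache-Control", "public, max-age=300");
          chain.doFilter(request, response);
      }

      public void destroy() {}
  }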

What about those complex queries? Why do we have them? And how do we get rid of them?

Lost in encapsulation: the Solr API was exposed to the world.

What‘s the key factor?

look at your business requirements

decrease complexity
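The decisive change was functional: instead of letting clients pass arbitrary Solr query syntax through the REST API, the API accepts a small, business-level parameter set and builds the Solr query internally. A hedged sketch of that idea (parameter and field names are invented):

  import org.apache.solr.client.solrj.SolrQuery;

  public class SearchRequestMapper {

      // The REST API exposes only a few business-level parameters;
      // the Solr query syntax stays an internal implementation detail.
      // (The portal is selected by choosing the per-portal core, not by a filter.)
      public SolrQuery toSolrQuery(String keywords, String category) {
          SolrQuery query = new SolrQuery(keywords == null ? "*:*" : keywords);
          query.setRows(20);
          if (category != null) {
              query.addFilterQuery("category:" + category);
          }
          return query;
      }
  }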

Questions? Comments? Ideas?

Twitter: @tboeghk
Github: @tboeghk
Email: torsten.koester@s24.com

Web: http://www.s24.com

Images: sxc.hu (unless noted otherwise)