Magento Expert Consulting Group Webinar | July 31, 2013 Thinking Beyond Search with Solr Understanding How Solr Can Help Your Business Scale



The presenters: Magento Expert Consulting Group

• Udi Shamay, Head, Expert Consulting Group
• Steve Kukla, Business Solution Architect, Expert Consulting Group
• Kirill Morozov, Application Architect, Expert Consulting Group

Thinking Beyond Search with Solr – Understanding How Solr Can Help Your Business Scale July 31, 2013 | 2


Today’s agenda

• What is Apache Solr?
• Business Use Cases for Scale
    – Supporting Initial Catalog Growth
    – Supporting Growing Traffic
    – Supporting Substantial Catalog Growth
    – Supporting A Real-Time Catalog
• Key Points to Remember
• Q&A


What is Apache Solr?


What is Apache Solr? General Solr Overview

• Solr is a separate application, installed on its own server or on an existing server in the environment, depending on business needs
• Solr uses schema configuration files, which can be found in Magento/lib/Apache
• Magento communicates with Solr via HTTP/XML
• Search options are configured via the Magento admin panel
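To make the HTTP/XML point concrete, here is a minimal sketch of the kind of XML update message an application can send to Solr. The field names (`id`, `name`, `category`) are hypothetical and are not taken from the Magento schema; in a typical setup the resulting payload would be POSTed to the Solr `/update` endpoint, which is assumed here rather than shown.

```python
import xml.etree.ElementTree as ET


def build_add_xml(doc):
    """Serialize a flat dict into a Solr XML <add><doc> update message."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        # Multi-valued fields are sent as repeated <field> elements.
        values = value if isinstance(value, (list, tuple)) else [value]
        for item in values:
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(item)
    return ET.tostring(add, encoding="unicode")


# Hypothetical product document, for illustration only.
payload = build_add_xml(
    {"id": "42", "name": "Blue T-Shirt", "category": ["Apparel", "Sale"]}
)
print(payload)
```

The same flat name/value shape is what lets Solr avoid the joins that a relational catalog needs.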


What is Apache Solr? Solr the Search Platform

Better text-based searching provides a better customer experience

• More relevant “fuzzy” searching*
• Faceted searches
• Search corrections
• Out-of-the-box type-ahead*
• Response caching for better performance

*Requires customization to leverage at 100%
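As an illustration of how some of these features surface in a request, the sketch below builds the query string for a Solr select request with a fuzzy term, facets, and spellcheck suggestions. The facet field names are hypothetical, and `spellcheck=true` only has an effect if a spellcheck component is configured on the request handler, which is assumed here.

```python
from urllib.parse import urlencode, parse_qs


def build_search_query(term, facet_fields, fuzzy=True):
    """Build a Solr select query string with fuzzy matching and facets."""
    # A trailing "~" asks the Lucene query parser for a fuzzy match.
    q = f"{term}~" if fuzzy else term
    params = [("q", q), ("wt", "json"), ("facet", "true")]
    # facet.field may be repeated, one entry per facet.
    params += [("facet.field", field) for field in facet_fields]
    params.append(("spellcheck", "true"))
    return urlencode(params)


# Hypothetical facet fields, for illustration only.
query_string = build_search_query("shirt", ["color", "size"])
print(query_string)
```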


What is Apache Solr? What Makes Solr Powerful

Solr is more than a search engine because…

• Most data customers see is handled by Solr instead of MySQL

• Solr uses a simpler data structure

MySQL (EAV), several tables:
    product_id
    attribute_id | product_id | attribute_name
    attribute_id | product_id | attribute_value

Solr (No EAV), one flat document:
    product_id | attribute_name | attribute_value
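The difference between the two layouts can be sketched as a small flattening step: EAV rows spread over several tables are joined once, at indexing time, into one flat document per product. The tuple layouts and attribute names below are assumptions made for illustration and do not reproduce Magento's actual tables.

```python
from collections import defaultdict


def flatten_eav(attribute_names, attribute_values):
    """Join EAV rows into one flat document per product.

    attribute_names:  iterable of (attribute_id, attribute_name)
    attribute_values: iterable of (product_id, attribute_id, attribute_value)
    """
    names = dict(attribute_names)
    docs = defaultdict(dict)
    for product_id, attribute_id, value in attribute_values:
        docs[product_id]["product_id"] = product_id
        # The attribute name becomes a field of the flat document.
        docs[product_id][names[attribute_id]] = value
    return dict(docs)


# Hypothetical sample rows.
names = [(1, "name"), (2, "color")]
values = [(10, 1, "Blue T-Shirt"), (10, 2, "blue"), (11, 1, "Red Cap")]
flat = flatten_eav(names, values)
print(flat[10])
```

Once documents are stored flat, reading a product's attributes no longer requires a join per attribute.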

• Solr supports replication, which allows it to truly scale for growth

[Diagram: Magento connected to multiple replicated Solr instances]


Supporting Initial Catalog Growth


Business Use Case: Supporting Initial Catalog Growth

Business Background
• Growing catalog – from 10K to 100K SKUs
• From 1 to 2 stores
• From 1 to 2+ web nodes / 1 database node
• Using native Solr Search

Problems
• Increased indexing time
• Outdated information on the front end

Supporting Initial Catalog Growth: Problem – Increasing Index Footprint

Expected indexing time

SKUs       Control (1 website, 1 store view)   Year 2 (2 websites, 2 store views)
10,000     1.75 min                            3.5 min
50,000     10 min                              17.5 min
100,000    17.5 min                            35 min

Result: slow indexing

Supporting Initial Catalog Growth: Solution – Custom Data Import Handler

Concept
• Connects to the database using JDBC
• Extra data transformations must be written in Java/JavaScript
• Uses a prepared XML configuration
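Once a Data Import Handler is configured, imports are triggered over HTTP. The sketch below only builds the request URL; the base URL and the `/dataimport` handler path are assumptions (the path is whatever the handler is registered under in `solrconfig.xml`), while `full-import`, `delta-import`, `status`, `abort`, and `reload-config` are the handler's documented commands.

```python
from urllib.parse import urlencode

DIH_COMMANDS = {"full-import", "delta-import", "status", "abort", "reload-config"}


def dih_command_url(base_url, command, **options):
    """Build the URL that triggers a Data Import Handler command."""
    if command not in DIH_COMMANDS:
        raise ValueError(f"unknown DIH command: {command}")
    params = {"command": command}
    # Boolean options such as clean/commit are sent as "true"/"false".
    params.update({key: str(value).lower() for key, value in options.items()})
    return f"{base_url.rstrip('/')}/dataimport?{urlencode(params)}"


# Hypothetical Solr location, for illustration only.
url = dih_command_url("http://localhost:8983/solr", "delta-import", commit=True)
print(url)
```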


Supporting Initial Catalog Growth: Data Import Handler – Results

Results
• 10 times faster indexing
• Supports delta-indexing

Things to keep in mind
• Solr knows about its data source
• May require extra development effort
• Extra data transformations must be written in Java/JavaScript
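Delta-indexing means re-indexing only the rows that changed since the previous import instead of the whole catalog. A minimal sketch of that selection step follows; the `updated_at` column and the row layout are assumptions, and in a real Data Import Handler setup the equivalent comparison would be expressed in the handler's SQL configuration rather than in Python.

```python
from datetime import datetime


def select_delta(rows, last_index_time):
    """Return only the rows modified after the previous import."""
    return [row for row in rows if row["updated_at"] > last_index_time]


# Hypothetical catalog rows.
rows = [
    {"product_id": 1, "updated_at": datetime(2013, 7, 30, 9, 0)},
    {"product_id": 2, "updated_at": datetime(2013, 7, 31, 12, 30)},
    {"product_id": 3, "updated_at": datetime(2013, 7, 31, 8, 0)},
]
last_index_time = datetime(2013, 7, 31, 0, 0)
changed = select_delta(rows, last_index_time)
print([row["product_id"] for row in changed])
```

Only the changed rows are sent to Solr, so the cost of an update tracks the number of changes rather than the catalog size.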


Supporting Growing Traffic


Business Use Case: Supporting Growing Traffic

Business Background
• Growing catalog – 1,000,000 SKUs
• Growing traffic: up to 100 requests / second
• 3 stores
• 3+ web nodes / 1 database node
• Using Data Import Handler

Problem
• Solr can’t handle increasing user concurrency

Supporting Growing Traffic: Increasing Index Footprint – OK

Expected indexing time

SKUs        Control (2 websites, 2 store views)   Year 3 (3 websites, 3 store views)
100,000     3.5 min                               4.75 min
500,000     17.5 min                              23.75 min
1,000,000   35 min                                47.5 min

< 1000 updates/sec: indexing delta data handles updates

Supporting Growing Traffic: Problem – Increased Response Time

Expected average response time

Load                       Control (2 websites, 2 store views)   Year 3 (3 websites, 3 store views)
100,000 SKUs, 30 RPS       75 msec                               80 msec
500,000 SKUs, 60 RPS       95 msec                               100 msec
1,000,000 SKUs, 100 RPS    105 msec                              120 msec

Result: Solr CPU is maxed out

Supporting Growing Traffic: Solution – Solr Replication

Concept
• Separate read requests from indexing (write) requests
• Replicate index across multiple nodes
• Read from multiple servers
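The read/write split described above can be sketched as a small router: updates always go to the node that owns the index, while searches rotate over the replicas. The class name, host names, and round-robin policy are assumptions for illustration; the slides do not say how requests were distributed.

```python
from itertools import cycle


class SolrRouter:
    """Send updates to the master and spread searches over replicas."""

    def __init__(self, master_url, replica_urls):
        if not replica_urls:
            raise ValueError("at least one replica is required")
        self.master_url = master_url
        # Round-robin over replicas; a real setup may use a load balancer.
        self._replicas = cycle(replica_urls)

    def url_for(self, operation):
        if operation == "update":
            return f"{self.master_url}/update"
        if operation == "select":
            return f"{next(self._replicas)}/select"
        raise ValueError(f"unsupported operation: {operation}")


# Hypothetical hosts.
router = SolrRouter(
    "http://solr-master:8983/solr",
    ["http://solr-r1:8983/solr", "http://solr-r2:8983/solr"],
)
reads = [router.url_for("select") for _ in range(3)]
write = router.url_for("update")
print(reads, write)
```

In practice this logic would live in middleware or a Magento customization in front of the Solr nodes.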


Supporting Growing Traffic: Solr Replication – Results

Results
• Allows Solr to handle read traffic
• Introduces fail-over

Things to keep in mind
• Requires middleware or Magento customization
• Possible heavy data duplication
• Extra changes in infrastructure


Supporting Substantial Catalog Growth


Business Use Case: Supporting Substantial Catalog Growth

Business Background
• Growing catalog – 5,000,000 SKUs
• 4 stores
• 4+ web nodes / 1 database node
• Using Data Import Handler
• Using Solr replication

Problems
• Delta-indexing delays
• Slow response time

Supporting Substantial Catalog Growth: Problem – Increasing Index Footprint

Expected indexing time

SKUs        Control (3 websites, 3 store views)   Year 4 (4 websites, 4 store views)
1,000,000   47.5 min                              63.5 min
2,500,000   118.75 min                            158.75 min
5,000,000   237.5 min                             317.5 min

> 1000 updates/sec

Result: delta-indexing delays

Supporting Substantial Catalog Growth: Problem – Increased Response Time

Expected average response time

Load                       Control (3 websites, 3 store views)   Year 4 (4 websites, 4 store views)
1,000,000 SKUs, 100 RPS    120 msec                              150 msec
2,500,000 SKUs, 200 RPS    230 msec                              270 msec
5,000,000 SKUs, 400 RPS    300 msec                              400 msec

Result: slow response time

Supporting Substantial Catalog Growth: Solution – Index Sharding

Concept
• Distributed search
• Distributed + Replication (SolrCloud)
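With plain distributed search, a query is sent to one Solr node together with a `shards` parameter listing every shard to consult, and that node merges the partial results. The sketch below builds such a request URL; the host names are hypothetical, and SolrCloud setups normally manage the shard list themselves instead of taking it from the client.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def distributed_query_url(entry_url, shard_hosts, query):
    """Build a select URL that fans the query out to the listed shards."""
    # Shards are given as host:port/path entries, separated by commas.
    params = {"q": query, "shards": ",".join(shard_hosts)}
    return f"{entry_url.rstrip('/')}/select?{urlencode(params)}"


# Hypothetical shard hosts.
shards = ["solr1:8983/solr", "solr2:8983/solr", "solr3:8983/solr"]
url = distributed_query_url("http://solr1:8983/solr", shards, "name:shirt")
print(url)
```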

Supporting Substantial Catalog Growth: Index Sharding – Results

Results
• Distributed search for faster response time
• 50 times faster indexing with 5 shards

[Diagram: documents A through I stored together in MySQL and split across Solr shards (for example D E F and G H I), with Magento querying the shards]

Things to keep in mind…
• Custom solution
• Requires Magento customization or middleware introduction
• Extra changes in infrastructure
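One way to picture how documents end up on different shards is a stable hash of the product identifier. This is a generic sketch under that assumption, not the routing the presenters used and not Solr's own routing logic; `zlib.crc32` is chosen only because it gives the same result on every run.

```python
import zlib


def shard_for(product_id, num_shards):
    """Map a product id to a shard index using a stable hash."""
    return zlib.crc32(str(product_id).encode("utf-8")) % num_shards


def partition(product_ids, num_shards):
    """Group product ids by the shard that should index them."""
    shards = [[] for _ in range(num_shards)]
    for product_id in product_ids:
        shards[shard_for(product_id, num_shards)].append(product_id)
    return shards


# Each shard indexes only its own subset, so shards can index in parallel.
groups = partition(range(1, 101), 5)
print([len(group) for group in groups])
```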


Supporting A Real-Time Catalog


Business Use Case: Supporting A Real-Time Catalog

Business Background
• Growing catalog – 10,000,000 SKUs
• 5 stores
• 5+ web nodes / 1 database node
• Data Import Handler
• SolrCloud and distributed search

Business Requirement
• Always up-to-date index

Supporting A Real-Time Catalog: Solution – Listen To The MySQL Bin Log

Concept
• Connect via the MySQL replication protocol
• Listen to data-related events
• Extract information from events
• Update the corresponding documents in the Lucene index

[Diagram: standard MySQL replication, where a MySQL slave receives the binlog from MySQL]

[Diagram: a replication listener receives the binlog from MySQL, a log parser extracts the changes, and the changes are applied to Solr]
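The last two steps of the concept, extracting information from events and updating the index, can be sketched as follows. The event dictionaries are a hypothetical shape invented for this example (real binlog listeners expose their own event types), and a plain dict stands in for the Lucene index.

```python
def event_to_action(event):
    """Translate one row-level event into an index action."""
    kind = event["type"]
    if kind in ("insert", "update"):
        # New and changed rows both become a document (re)index.
        return ("add", dict(event["row"]))
    if kind == "delete":
        return ("delete", event["row"]["product_id"])
    return None  # not a data-related event, nothing to index


def apply_events(index, events):
    """Apply a stream of events to an in-memory stand-in for the index."""
    for event in events:
        action = event_to_action(event)
        if action is None:
            continue
        operation, payload = action
        if operation == "add":
            index[payload["product_id"]] = payload
        else:
            index.pop(payload, None)
    return index


# Hypothetical event stream.
events = [
    {"type": "insert", "row": {"product_id": 1, "name": "Blue T-Shirt"}},
    {"type": "update", "row": {"product_id": 1, "name": "Navy T-Shirt"}},
    {"type": "rotate"},
    {"type": "insert", "row": {"product_id": 2, "name": "Red Cap"}},
    {"type": "delete", "row": {"product_id": 2}},
]
index = apply_events({}, events)
print(index)
```

Because every committed change arrives as an event, the index can follow the database without periodic full or delta imports.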

Supporting A Real-Time Catalog: Listening To The MySQL Bin Log – Results

Results
• Replication-like connection
• Indexes are always up-to-date

Things to keep in mind
• Relatively complex implementation

[Diagram: Magento writes to MySQL; changes flow through the bin log to the Solr shards, which hold documents A through I split into groups]


Key Points to Remember


Key Points to Remember: Solr helps businesses scale

• Solr’s search capabilities provide a better site experience than MySQL LIKE or full-text search
• Solr is more than a search platform – it is key to scalability and growth
• Solr’s data import handler keeps Solr performing well as your catalog grows
• Solr replication helps accommodate growing traffic
• Solr shards help keep indexing execution time and search response times low for very large catalogs
• Listening to the MySQL bin log can help facilitate a continuously updating catalog

References

Scaling Solr
• Solr Wiki: http://wiki.apache.org/solr/
• Type-Ahead: http://wiki.apache.org/solr/Suggester
• Data Import Handler (DIH): http://wiki.apache.org/solr/DataImportHandler
• Replication: http://wiki.apache.org/solr/SolrReplication
• Shard: http://wiki.apache.org/solr/SolrCloud
• Distributed Search: http://wiki.apache.org/solr/DistributedSearch

MySQL Replication Listening
• Change Data Capture: http://www.slideshare.net/mkindahl/binary-log-api-presentation-oscon-2011
• Replication Listener (C): https://launchpad.net/mysql-replication-listener
• Open-Replicator (Java): http://code.google.com/p/open-replicator/


Q&A
