Intro to Solr in Drupal

DESCRIPTION

Does your website have a ton of data? How do your users find the relevant pages among all the noise on your site? Solr can help deliver pertinent search results to your users regardless of your site's size. Apache Solr is a Java application that integrates with Drupal through a contrib module, letting your users quickly search millions of records and narrow down the results with minimal system impact.

Page 1: Intro to Solr in Drupal

Intro to Solr

Page 2: Intro to Solr in Drupal

DrupalConPortland

Page 3: Intro to Solr in Drupal
Page 4: Intro to Solr in Drupal

Andrew Riley, Director of Drupal Development

@andrewmriley

Page 5: Intro to Solr in Drupal

Agenda

•Search?

•Why Solr?

•Searching

•Behind the Scenes

Page 6: Intro to Solr in Drupal

Search?

Page 7: Intro to Solr in Drupal

What is Search?

Search (v): to go or look through (a place, area, etc.) carefully in order to find something missing or lost: I searched the desk for the letter.

Source: http://dictionary.reference.com/browse/search

@Mediacurrent

Page 8: Intro to Solr in Drupal

Why Users Search

•Navigation doesn't make sense

• It can be faster

•Lots of data

•Frequent data changes

•Might just be looking for something

Page 9: Intro to Solr in Drupal

Search Problems

•Search accuracy

•Too much data

•Slow response

•Wrong results

Page 10: Intro to Solr in Drupal

Why Solr?

Page 11: Intro to Solr in Drupal

History

Solr was initially created in 2004 as an in-house project for CNET. It was open sourced in 2006 and donated to the Apache Software Foundation.

Page 12: Intro to Solr in Drupal

Lucene

•Solr is a layer on top of Lucene

•Lucene is a library

•Solr stores files in Lucene format

*http://wiki.apache.org/solr/SolrPerformanceData

Page 13: Intro to Solr in Drupal

Speed

Search speed is important!

Page 14: Intro to Solr in Drupal

Speed

Source: Web Performance Today http://j.mp/12h8wLZ

Page 15: Intro to Solr in Drupal

Speed

• Important!

• It scales well

•No database required

•Clustering & Sharding

•Netflix runs 1.2 million queries/day on 4 servers*

*http://wiki.apache.org/solr/SolrPerformanceData

Page 16: Intro to Solr in Drupal

Natural Results

•Stemming: Blogging vs. Blog

•Stop Word Removal: The

•Synonyms: Tissue vs Kleenex

•Highly Configurable
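The synonym bullet can be illustrated with a small sketch. This is a toy model of the idea, not Solr's synonym filter: the `SYNONYM_GROUPS` list and the `expand` helper are hypothetical names invented for this example.

```python
# Hypothetical synonym groups; a real deployment would configure its own list.
SYNONYM_GROUPS = [{"tissue", "kleenex"}, {"couch", "sofa"}]

def expand(term):
    """Return the term plus any configured synonyms, sorted for stable output."""
    term = term.lower()
    for group in SYNONYM_GROUPS:
        if term in group:
            return sorted(group)
    return [term]

print(expand("Kleenex"))  # ['kleenex', 'tissue']
print(expand("solr"))     # ['solr']
```

With an expansion like this, a search for "Kleenex" can also match pages that only mention "tissue".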

Page 17: Intro to Solr in Drupal

Drupal Search

•Not stemmed by default

•Queries the database

•Stores tokenized words in a single large table

•Much slower to index
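To make the "single large table" point concrete, here is a rough sketch of a database-backed word index using an in-memory SQLite database. The table name `word_index`, its columns, and the `db_search` helper are illustrative assumptions, not Drupal's actual schema or API.

```python
import sqlite3

# Toy stand-in for a database-backed search index: one row per (word, item).
# The table and column names are illustrative assumptions, not Drupal's schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE word_index (word TEXT, item_id INTEGER, score REAL)")
conn.executemany(
    "INSERT INTO word_index VALUES (?, ?, ?)",
    [
        ("blog", 1, 2.0),
        ("blogging", 2, 1.0),
        ("solr", 1, 1.0),
        ("solr", 3, 3.0),
    ],
)

def db_search(word):
    """Return item ids matching the exact word, best score first."""
    rows = conn.execute(
        "SELECT item_id FROM word_index WHERE word = ? ORDER BY score DESC",
        (word,),
    )
    return [item_id for (item_id,) in rows]

print(db_search("solr"))  # [3, 1]
print(db_search("blog"))  # [1]
```

Because nothing is stemmed in this sketch, a search for "blog" misses item 2, which only contains "blogging". Every search is also a query against the same database that serves the rest of the site.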

Page 18: Intro to Solr in Drupal

VS

Page 19: Intro to Solr in Drupal

Searching

Page 20: Intro to Solr in Drupal

Ordering

•Score

•Comes from Lucene

•Not "out of 100"

•Bigger score first

More Info: http://lucene.apache.org/core/3_6_1/scoring.html
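As a minimal sketch of score-based ordering: the result titles and score values below are made up, and `order_by_score` is a hypothetical helper. The point is only that scores are relative numbers compared against each other, not percentages.

```python
# Hypothetical result list; titles and scores are made up for illustration.
results = [
    {"title": "Page A", "score": 3.2},
    {"title": "Page B", "score": 12.5},
    {"title": "Page C", "score": 0.7},
    {"title": "Page D", "score": 4.1},
]

def order_by_score(hits):
    """Return hits with the highest score first."""
    return sorted(hits, key=lambda hit: hit["score"], reverse=True)

ranked = order_by_score(results)
print([hit["title"] for hit in ranked])  # ['Page B', 'Page D', 'Page A', 'Page C']
```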

[Figure: sample results listed by score, highest first: 201, 200, 199, 184]

Page 21: Intro to Solr in Drupal

Facets

•Users do the work

•Fixes too much data

•Native to Solr

•Requires the Facet API module

•Shopping Sites
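A small sketch of what faceting does conceptually, in the shopping-site spirit of the last bullet. This is plain Python rather than Solr or the Facet API module; the product documents, field names, and both helper functions are assumptions made for the example.

```python
from collections import Counter

# Hypothetical product documents; field names are assumptions for illustration.
docs = [
    {"title": "Red shirt", "color": "red", "size": "M"},
    {"title": "Blue shirt", "color": "blue", "size": "M"},
    {"title": "Red hat", "color": "red", "size": "L"},
]

def facet_counts(hits, field):
    """Count how many hits carry each value of the given field."""
    return Counter(hit[field] for hit in hits)

def apply_facet(hits, field, value):
    """Keep only the hits whose field equals the value the user clicked."""
    return [hit for hit in hits if hit[field] == value]

print(facet_counts(docs, "color"))
red_only = apply_facet(docs, "color", "red")
print(facet_counts(red_only, "size"))
```

The counts are what a user sees next to each facet link; clicking one narrows the result set, and the remaining facets are recounted against the smaller set.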

Page 22: Intro to Solr in Drupal

Behind the Scenes

Page 23: Intro to Solr in Drupal

Index?

• Index contains Documents

•Documents have Fields

•Fields have Terms

•Updates take ~2 minutes to appear

•Uses Lucene syntax
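The document, field, and term hierarchy above can be sketched as a tiny inverted index. This is a simplified illustration in plain Python, not Lucene's on-disk format; the sample documents and the `build_index` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical documents: each has an id and named fields.
documents = {
    1: {"title": "Intro to Solr", "body": "Solr sits on top of Lucene"},
    2: {"title": "Drupal search", "body": "Drupal search queries the database"},
}

def build_index(docs):
    """Map (field, term) -> set of document ids containing that term."""
    index = defaultdict(set)
    for doc_id, fields in docs.items():
        for field, text in fields.items():
            for term in text.lower().split():
                index[(field, term)].add(doc_id)
    return index

index = build_index(documents)
print(sorted(index[("body", "solr")]))     # [1]
print(sorted(index[("title", "search")]))  # [2]
```

Looking up a term is a dictionary access rather than a scan over every document, which is the basic reason an index makes search fast.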

Page 24: Intro to Solr in Drupal

Tokenizing

•Splits words and numbers: "this" "is" "blogging"

•Excludes Stopwords: "this" "blogging"

•Handles Stemming (if enabled): "this" "blog"

•Very configurable
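The three steps on this slide can be sketched as a small pipeline. It is a toy version under stated assumptions: the stop word list is invented for the example, and the `stem` function is a deliberately naive suffix stripper, not the stemming algorithm Solr uses.

```python
import re

# Assumed stop word list and a deliberately naive suffix stemmer; real
# analyzers use much larger lists and proper stemming algorithms.
STOP_WORDS = {"is", "the", "a", "an"}

def tokenize(text):
    """Split text into lowercase word and number tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def remove_stop_words(tokens):
    """Drop tokens that appear in the stop word list."""
    return [t for t in tokens if t not in STOP_WORDS]

def stem(token):
    """Strip a trailing 'ing' as a toy stand-in for stemming."""
    if token.endswith("ing") and len(token) > 5:
        base = token[:-3]
        # Undo consonant doubling: "blogg" -> "blog".
        if len(base) >= 2 and base[-1] == base[-2]:
            base = base[:-1]
        return base
    return token

tokens = tokenize("This is blogging")
print(tokens)                                        # ['this', 'is', 'blogging']
print(remove_stop_words(tokens))                     # ['this', 'blogging']
print([stem(t) for t in remove_stop_words(tokens)])  # ['this', 'blog']
```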

Page 25: Intro to Solr in Drupal

Bias

•Adjusts the order of search results

•Works on: Content Type, Fields, Comments, Promoted to Home Page and more

•Can be dynamic with custom modules.
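One way to picture bias is as a multiplier applied to each result's score before sorting. The sketch below is only an illustration of that idea, not how the Apache Solr module computes it; the boost values, field names, and `biased_score` helper are all assumptions.

```python
# Hypothetical boost factors; the numbers and field names are assumptions.
TYPE_BOOST = {"article": 2.0, "page": 1.0}
PROMOTED_BOOST = 1.5

def biased_score(hit):
    """Scale the base score by content type and promoted-to-home-page status."""
    score = hit["score"] * TYPE_BOOST.get(hit["type"], 1.0)
    if hit["promoted"]:
        score *= PROMOTED_BOOST
    return score

hits = [
    {"title": "About us", "type": "page", "promoted": False, "score": 4.0},
    {"title": "Solr tips", "type": "article", "promoted": False, "score": 3.0},
    {"title": "Welcome", "type": "page", "promoted": True, "score": 3.0},
]

ranked = sorted(hits, key=biased_score, reverse=True)
print([hit["title"] for hit in ranked])  # ['Solr tips', 'Welcome', 'About us']
```

"About us" has the highest base score, but the article boost and the promoted boost both lift other results above it.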

Page 26: Intro to Solr in Drupal

Recap

Page 27: Intro to Solr in Drupal

Modules

•Apache Solr (apachesolr)

•Facet API (facetapi)

•Chaos tool suite (ctools)

Page 28: Intro to Solr in Drupal

Overall

•Search is becoming more and more important

•You want to control your search results

• If you don't provide a good search experience, somebody else will.

•Solr doesn't have to be complex.

•Solr is fast and scales.

Page 29: Intro to Solr in Drupal

Thank You!

Questions?

@Mediacurrent Mediacurrent.com

@andrewmriley

slideshare.net/mediacurrent