Searching with Solr: When, Why and How?
By Paul Matthews
2
86p
@paulmatthews86
86p.paul-matthews.co.uk
[email protected]
techportal.ibuildings.com
Projects: Travel companies, Media corporations
3
Searching…
What?
When?
Why?
How?
5
What is search?
Text navigation
Customers describing
Sorting
Examples: Quick search, Category listings
6
The power of search
7
Database Like
8
Database Like
Very little effort
A very basic search
Poor at: queries of more than one word
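To illustrate why LIKE struggles with more than one word, here is a minimal sketch using Python's built-in sqlite3 module. The `products` table and its rows are hypothetical, invented only for this example; the point is that `LIKE '%red sofa%'` only matches the two words when they sit next to each other in that order.

```python
import sqlite3

# Hypothetical in-memory table; the name and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT)")
conn.executemany(
    "INSERT INTO products (name) VALUES (?)",
    [("red leather sofa",), ("leather sofa in red",), ("blue fabric chair",)],
)


def like_search(term):
    """Return names that contain `term` as one contiguous substring."""
    rows = conn.execute(
        "SELECT name FROM products WHERE name LIKE ? ORDER BY name",
        (f"%{term}%",),
    )
    return [name for (name,) in rows]


one_word = like_search("sofa")       # both sofas are found
two_words = like_search("red sofa")  # neither row contains "red sofa" verbatim
print(one_word)
print(two_words)
```

Both sofa rows mention "red" and "sofa", yet the two-word search finds nothing, which is the gap a real search engine is meant to close.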
9
Database Full-Text
10
Database Full-Text
Some power
Convenient
Feature poor
Often very slow
11
Basic Search Systems
12
Basic Search Systems
Rapid search
Simple to setup
Feature poor
Accuracy
Require more application code
13
Solr Search
14
Solr Search
Very powerful
Feature rich
Relatively simple
Lots of plugins (community)
Overkill?
Java
15
Things you need to know
16
Searching…
What?
When?
Why?
How?
17
Applicable to me?
Who is Solr designed for? Traffic, Features
When is a good time to implement it? Creation, Post-live, Open Source projects
18
Business indicators
Money / Time / Effort spent: Bugs, Tuning, Features
Customers
19
Development indicators
Data
MySQL Full Text
Degradation
20
Searching…
What?
When?
Why?
How?
21
Is Solr right for me?
Know your enemy
With great functionality comes great responsibility
22
Data sources
Database: Easy
API: Features
CSV & XML
Solr Cell - Rich Documents: PDF, MS Office
23
Indexing
Parsing
Half now, half later
24
Analyzer
Process documents
The query gets analyzed too
25
Tokenizer
26
Tokenizer and Filter
Synonym
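The analyzer idea (tokenize, then filter, and do the same to the query) can be sketched in plain Python. This is a toy illustration and not Solr's implementation: the `SYNONYMS` map, the function names and the sample text are all assumptions made for the example.

```python
import re

# Hypothetical synonym map; a real deployment would define its own.
SYNONYMS = {"couch": "sofa", "settee": "sofa"}


def tokenize(text):
    """Split on non-alphanumeric characters and lowercase each token."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def synonym_filter(tokens):
    """Map each token to its canonical synonym, if one is defined."""
    return [SYNONYMS.get(token, token) for token in tokens]


def analyze(text):
    """Run the same chain at index time and at query time."""
    return synonym_filter(tokenize(text))


indexed = set(analyze("Red leather Sofa, three-seater"))
query = analyze("COUCH")
matches = all(token in indexed for token in query)
print(sorted(indexed), query, matches)
```

Because the query passes through the same chain as the document, a search for "COUCH" lines up with the indexed token "sofa".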
27
Stemming
Matching similar words
Reduce to Stem
Searching, Search, Searches, Searched, Searchers
==> Search
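As a rough illustration of reducing words to a stem, the sketch below strips a few common English suffixes. It is a deliberately naive toy, not the stemming algorithm Solr ships with; the suffix list and the minimum stem length are assumptions chosen to fit the slide's example words.

```python
# Toy suffix list, checked longest first. This is an assumption for the
# example and far simpler than a production stemmer.
SUFFIXES = ("ers", "ing", "es", "ed", "er", "s")


def toy_stem(word):
    """Lowercase the word and strip the first matching suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


words = ["Searching", "Search", "Searches", "Searched", "Searchers"]
stems = {toy_stem(word) for word in words}
print(stems)
```

All five variants collapse to one stem, so a query for any of them can match documents containing the others.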
28
Hit Highlighting
“Hit” ==> “This is a <em>Hit</em> test.”
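The highlighting transformation above can be imitated with a regular expression. This is only a sketch of the idea; Solr performs highlighting itself on the server, and the function name and `tag` parameter here are hypothetical.

```python
import re


def highlight(text, term, tag="em"):
    """Wrap whole-word, case-insensitive matches of `term` in <tag>."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return pattern.sub(lambda m: f"<{tag}>{m.group(0)}</{tag}>", text)


snippet = highlight("This is a Hit test.", "Hit")
print(snippet)
```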
29
Spell Check
Spelchk Did you mean …?
“flickr”
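A "Did you mean …?" suggestion can be approximated with Python's standard difflib module. This is a sketch of the concept only: Solr's spell checking draws suggestions from its own index or dictionaries, whereas the `VOCABULARY` list and the 0.6 cutoff below are assumptions for the example.

```python
import difflib

# Hypothetical vocabulary; in practice this would come from indexed terms.
VOCABULARY = ["spellcheck", "search", "solr", "stemming"]


def did_you_mean(word, vocabulary=VOCABULARY):
    """Return the closest known word, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=0.6)
    return matches[0] if matches else None


suggestion = did_you_mean("Spelchk")
print(suggestion)
```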
30
31
By the power of Queries!
Phrase: “Search for a phrase”
Wildcards: Look*familiar?
Fuzzy: fuzzy~
Proximity: “two words”~12
Range: name:{Paul TO Jeff}
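The query types above are ordinarily sent to Solr in the `q` request parameter, URL-encoded. The sketch below builds such a URL with the standard library only. The host, port and `/solr/select` path are assumptions (the exact path depends on the Solr version and any core name), and `select_url` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint; adjust host, port and path to the actual installation.
BASE = "http://localhost:8983/solr/select"

# The query strings from the slide, written with plain ASCII quotes.
QUERIES = {
    "phrase": '"Search for a phrase"',
    "wildcard": "Look*familiar?",
    "fuzzy": "fuzzy~",
    "proximity": '"two words"~12',
    "range": "name:{Paul TO Jeff}",
}


def select_url(query, rows=10):
    """Build a select URL with the query safely URL-encoded."""
    return BASE + "?" + urlencode({"q": query, "rows": rows, "wt": "json"})


url = select_url(QUERIES["proximity"])
params = parse_qs(urlsplit(url).query)  # decode again to check the round trip
print(url)
print(params["q"])
```

Encoding matters because quotes, spaces and braces in the query syntax are not safe to place in a URL as they are.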
32
name:paul AND location:uk
A single field
Multiple Fields
33
Faceting (21)
Pre-fetching (11)
Results (37)
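The numbers in brackets are facet counts: how many matching documents fall under each value. A minimal way to picture the calculation, using made-up documents and a hypothetical `facet_counts` helper (Solr computes these counts itself on the server):

```python
from collections import Counter

# Made-up search results; field names echo the query example in this talk.
docs = [
    {"name": "paul", "location": "uk"},
    {"name": "jeff", "location": "nl"},
    {"name": "anna", "location": "uk"},
]


def facet_counts(documents, field):
    """Count how many documents carry each value of `field`."""
    counter = Counter(d[field] for d in documents if field in d)
    return counter.most_common()


counts = facet_counts(docs, "location")
print(counts)
```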
34
Ranked Search
Ordered
Any field
35
Simultaneous update & search
Hold on a minute!
Actually, I don’t have to…
36
Searching…
What?
When?
Why?
How?
37
Flow
38
Container
Choose container
Make accessible
http://<host>:<port>/solr/admin
39
Solr Config
Cores ~ Database Schema
schema.xml ~ Schema definition
40
Fields
Define the data: Indexed, Stored
Important to model accurately
Tweak to achieve functionality
Conscious of space and index
41
Index
Create documents to Schema Spec
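One way to create documents is Solr's XML update message, an `<add>` element containing `<doc>` elements with named `<field>` children. The sketch below only builds that message as a string; the field names are hypothetical and must match whatever schema.xml defines, and sending the message to the update handler is left out.

```python
import xml.etree.ElementTree as ET


def build_add_message(documents):
    """Serialise dictionaries into an <add><doc>...</doc></add> message."""
    add = ET.Element("add")
    for document in documents:
        doc = ET.SubElement(add, "doc")
        for name, value in document.items():
            field = ET.SubElement(doc, "field", {"name": name})
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


# Hypothetical document; "id", "name" and "location" are assumed field names.
message = build_add_message([{"id": "1", "name": "paul", "location": "uk"}])
print(message)
```

Building the XML with a library rather than string concatenation means special characters in field values are escaped automatically.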
42
Search
Quick Search
Default Search
Advanced Search
43
Quick Search
Partial words
Search all fields?
Required response data
44
Default Search
Consider useful Analyzers
Potentially match on more fields
Enrich or refine results with personal data
More in depth results
45
Advanced Search
Offer user control
Consider search storage: Data size vs Additional queries
To return more / fewer results: “Search entire document”, “Filter by Colour”
46
Searching…
What?
When?
Why?
How?
47
Questions?
48
We’re Hiring
NL: Vlissingen, Utrecht
UK: London, Sheffield, Liverpool
Speak to me at the end… [email protected]
49
Thank you
Resources Links: http://www.delicious.com/paulm86/solr
This talk: http://joind.in/3221
Contact Me: @paulmatthews86 http://about.me/paul.matthews