Sunspot - The Ruby Way into Solr

Apache Solr

Solr is an open source enterprise search platform, written in Java, from the Apache Lucene project. Its major features include full-text search, hit highlighting, faceted search, real-time indexing, dynamic clustering, database integration, NoSQL features, and rich document (e.g., Word, PDF) handling. Providing distributed search and index replication, Solr is designed for scalability and fault tolerance. Solr is the second-most popular enterprise search engine after Elasticsearch.

Major search features in Solr:

● Full text.
● Phrases.
● Boosting.
● Scoping.
● Disjunctions and conjunctions.
● Pagination.
● Faceting.
  ○ Field facets.
  ○ Query facets.
  ○ Range facets.
● Ordering.
  ○ Order by function.
● Grouping.
● Geospatial.
● Highlighting.
● Stats.
● Dynamic fields.

Agenda

➢ Text fields will be full-text searchable. Other fields (e.g., integer and string) can be used to scope queries.

Setting Up Objects

Searching Objects

Search In Depth
● Full text

➢ Phrase searches are represented as a double-quoted group of words.

➢ query_phrase_slop sets the number of words that may appear between the words in a phrase.

Search In Depth - cont.
● Phrases
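To make the effect of query_phrase_slop concrete, here is a minimal in-memory sketch in plain Ruby. It is not Sunspot or Solr code: `phrase_match?` is a hypothetical helper, it handles two-word phrases only, and it ignores reordered words, which real slop matching can also accept.

```ruby
# Hypothetical helper (not part of Sunspot): true when `first` is followed by
# `second` with at most `slop` other words in between. In-order matches only.
def phrase_match?(text, first, second, slop = 0)
  words = text.downcase.scan(/\w+/)
  words.each_index.any? do |i|
    next false unless words[i] == first

    # Look at the next (slop + 1) words after the first term.
    window = words[(i + 1)..(i + 1 + slop)] || []
    window.include?(second)
  end
end

exact  = phrase_match?('Great pizza here', 'great', 'pizza')        # slop 0
strict = phrase_match?('A great big pizza', 'great', 'pizza')       # slop 0
sloppy = phrase_match?('A great big pizza', 'great', 'pizza', 1)    # slop 1
```

With a slop of 0 only adjacent words match, so "great big pizza" needs a slop of at least 1.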

Search In Depth - cont.
● Phrase boosts

➢ Fields not defined as text (e.g., integer, boolean, time) can be used to scope (restrict) queries before full-text matching is performed.

Search In Depth - cont.
● Scoping (Scalar Fields)
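The idea of scoping can be sketched in plain Ruby without a Solr server. The example below is an in-memory illustration only: the document hashes and the `scoped_search` helper are hypothetical, and the keyword match is a naive substring test rather than real full-text search.

```ruby
# Hypothetical in-memory stand-in for indexed documents.
posts = [
  { title: 'Best pizza in town', blog_id: 1, published: true },
  { title: 'Pizza dough basics', blog_id: 2, published: true },
  { title: 'Draft: pizza ovens', blog_id: 1, published: false }
]

# Apply the scalar restrictions first, then match the keyword on the text field.
def scoped_search(docs, keyword, scope = {})
  scoped = docs.select { |doc| scope.all? { |field, value| doc[field] == value } }
  scoped.select { |doc| doc[:title].downcase.include?(keyword.downcase) }
end

results = scoped_search(posts, 'pizza', { blog_id: 1, published: true })
titles = results.map { |doc| doc[:title] }
```

All three posts mention pizza, but only one survives the scalar restrictions on blog_id and published.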

Search In Depth - cont.
● Disjunctions and Conjunctions
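Disjunctions (OR) and conjunctions (AND) combine restrictions into a single filter. The sketch below models that in plain Ruby with lambdas. The `any_of` and `all_of` helpers here are in-memory stand-ins written for this example, and the document fields are assumptions.

```ruby
# Hypothetical helpers: each restriction is a lambda that takes a document hash.
def any_of(*restrictions)
  ->(doc) { restrictions.any? { |restriction| restriction.call(doc) } }
end

def all_of(*restrictions)
  ->(doc) { restrictions.all? { |restriction| restriction.call(doc) } }
end

featured  = ->(doc) { doc[:featured] }
top_rated = ->(doc) { doc[:rating] >= 4 }
recent    = ->(doc) { doc[:year] >= 2020 }

docs = [
  { id: 1, featured: true,  rating: 2, year: 2018 },
  { id: 2, featured: false, rating: 5, year: 2021 },
  { id: 3, featured: false, rating: 5, year: 2015 }
]

# featured OR (top_rated AND recent)
filter = any_of(featured, all_of(top_rated, recent))
matching_ids = docs.select(&filter).map { |doc| doc[:id] }
```

Document 3 is top rated but not recent, so the nested conjunction rejects it.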

➢ The results array that is returned has methods mixed in that allow it to operate seamlessly with common pagination libraries like will_paginate and kaminari.

➢ By default, Sunspot requests the first 30 results from Solr.

Search In Depth - cont.
● Pagination
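Pagination boils down to an offset and a row count in the request sent to Solr, which names these parameters `start` and `rows`. A minimal sketch of that arithmetic follows. The `solr_window` helper is hypothetical, and the default of 30 comes from the note above.

```ruby
# Default page size taken from the slide above.
DEFAULT_PER_PAGE = 30

# Hypothetical helper: translate a 1-based page number into Solr's
# `start` (offset) and `rows` (page size) parameters.
def solr_window(page: 1, per_page: DEFAULT_PER_PAGE)
  raise ArgumentError, 'page must be >= 1' if page < 1

  { start: (page - 1) * per_page, rows: per_page }
end

first_page = solr_window
third_page = solr_window(page: 3, per_page: 15)
```

The first call requests results 0 through 29, and the second skips 30 results and asks for the next 15.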

Faceting

➢ Faceting is a feature of Solr that determines the number of documents that match a given search and an additional criterion. This allows you to build powerful drill-down interfaces for search.

➢ Each facet returns zero or more rows, each of which represents a particular criterion conjoined with the actual query being performed. For field facets, each row represents a particular value for a given field. For query facets, each row represents an arbitrary scope; the facet itself is just a means of logically grouping the scopes.

➢ By default Sunspot will only return the first 100 facet values. You can increase this limit, or force it to return all facets by setting limit to -1.

Faceting - cont.
● Field Facets
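A field facet returns one row per distinct field value, each with the number of matching documents. The plain-Ruby sketch below mimics this over an in-memory result set. The `field_facet` helper and the sample data are assumptions. The default limit of 100 and the meaning of -1 follow the description above.

```ruby
# Hypothetical helper: count matching documents per value of `field`.
# Rows are ordered by descending count; a limit of -1 returns every row.
def field_facet(docs, field, limit: 100)
  counts = docs.group_by { |doc| doc[field] }
               .map { |value, group| [value, group.size] }
  rows = counts.sort_by { |value, count| [-count, value.to_s] }
  limit.negative? ? rows : rows.first(limit)
end

matches = [
  { title: 'Pizza 101',   author: 'ana' },
  { title: 'Pizza ovens', author: 'bob' },
  { title: 'Pizza dough', author: 'ana' }
]

all_rows = field_facet(matches, :author, limit: -1)
top_row  = field_facet(matches, :author, limit: 1)
```

Each row pairs a field value with its count, which is what a drill-down link such as "ana (2)" would display.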

Faceting - cont.
● Query Facets
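In a query facet each row is an arbitrary scope rather than a field value. A minimal in-memory sketch is shown below. The `query_facet` helper, the row labels, and the rating thresholds are all hypothetical.

```ruby
# Hypothetical helper: each facet row is a labelled scope (a lambda), and the
# row count is the number of matching documents that satisfy that scope.
def query_facet(docs, scopes)
  scopes.map { |label, scope| [label, docs.count { |doc| scope.call(doc) }] }
end

matches = [
  { rating: 1.5 }, { rating: 2.5 }, { rating: 4.5 }, { rating: 4.8 }
]

rows = query_facet(matches, {
  'low'  => ->(doc) { doc[:rating] < 2.0 },
  'mid'  => ->(doc) { doc[:rating] >= 2.0 && doc[:rating] < 4.0 },
  'high' => ->(doc) { doc[:rating] >= 4.0 }
})
```

The facet only groups the scopes logically; the scopes themselves do not have to share a field or be mutually exclusive.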

Faceting - cont.
● Range Facets
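A range facet slices a numeric field into equal-width buckets and counts the matches in each. The sketch below shows that bucketing in plain Ruby. The `range_facet` helper is hypothetical and assumes each bucket includes its lower bound and excludes its upper bound.

```ruby
# Hypothetical helper: split [first, last) into buckets of width `interval`
# and count the values that fall into each bucket.
def range_facet(values, first, last, interval)
  (first...last).step(interval).map do |lower|
    bucket = lower...(lower + interval)
    [bucket, values.count { |value| bucket.cover?(value) }]
  end
end

ratings = [1.5, 2.0, 2.7, 4.9]
rows = range_facet(ratings, 1, 5, 1)
```

Empty buckets still produce a row with a count of zero, which keeps the drill-down layout stable.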

➢ By default Sunspot orders results by "score": the Solr-determined relevancy metric. Sorting can be customized with the order_by method.

Ordering
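The default and customized orderings can be illustrated with an in-memory sort. This is a plain-Ruby sketch, not Sunspot's order_by itself: the `order_hits` helper and the hit hashes are hypothetical.

```ruby
# Hypothetical helper: order hits by a field, defaulting to relevance score
# in descending order (best match first).
def order_hits(hits, field = :score, direction = :desc)
  sorted = hits.sort_by { |hit| hit[field] }
  direction == :desc ? sorted.reverse : sorted
end

hits = [
  { id: 1, score: 0.4, published_year: 2021 },
  { id: 2, score: 0.9, published_year: 2019 },
  { id: 3, score: 0.7, published_year: 2023 }
]

by_relevance = order_hits(hits).map { |hit| hit[:id] }
by_year_asc  = order_hits(hits, :published_year, :asc).map { |hit| hit[:id] }
```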

➢ Solr supports sorting on multiple fields using custom functions (Solr 3.1 and above).

Ordering by function
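Ordering by function means the sort key is computed from one or more fields instead of being read from a single field. A minimal in-memory sketch using a sum function follows. The `order_by_sum` helper and the field names are assumptions for illustration.

```ruby
# Hypothetical helper: sort documents by the sum of the given numeric fields,
# highest total first.
def order_by_sum(docs, *fields)
  docs.sort_by { |doc| -fields.sum { |field| doc[field] } }
end

docs = [
  { id: 1, rating1: 2, rating2: 3 },  # total 5
  { id: 2, rating1: 5, rating2: 4 },  # total 9
  { id: 3, rating1: 1, rating2: 6 }   # total 7
]

ordered_ids = order_by_sum(docs, :rating1, :rating2).map { |doc| doc[:id] }
```

Document 3 outranks document 1 even though its rating1 is lower, because the combined value is what gets sorted.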

➢ Solr supports grouping documents, similar to an SQL GROUP BY.

➢ Grouping is only supported on string fields that are not multivalued. To group on a field of a different type (e.g., integer), add a denormalized string type.

Grouping
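The grouping behaviour, including the denormalized string field, can be sketched in memory. The code below is an illustration in plain Ruby: `group_hits`, the `blog_id_str` field name, and the per-group limit of 1 are assumptions made for this example.

```ruby
# Hypothetical helper: group hits by a string field and keep at most `limit`
# hits per group, similar in spirit to an SQL GROUP BY.
def group_hits(hits, field, limit: 1)
  hits.group_by { |hit| hit[field] }
      .transform_values { |group| group.first(limit) }
end

posts = [
  { id: 1, blog_id: 10 },
  { id: 2, blog_id: 20 },
  { id: 3, blog_id: 10 }
]

# Grouping needs a string field, so denormalize the integer blog_id first.
indexed = posts.map { |post| post.merge(blog_id_str: post[:blog_id].to_s) }

groups = group_hits(indexed, :blog_id_str)
group_ids = groups.transform_values { |group| group.map { |hit| hit[:id] } }
```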

Geospatial

➢ Filter By Radius.

➢ Sort By Distance.
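Both geospatial operations rest on the distance between two coordinates. The plain-Ruby sketch below filters by radius and sorts by distance using the haversine formula. It is an in-memory illustration only: the helper names, the kilometre unit, and the sample coordinates are assumptions.

```ruby
EARTH_RADIUS_KM = 6371.0

# Great-circle distance between two points, using the haversine formula.
def haversine_km(lat1, lon1, lat2, lon2)
  to_rad = Math::PI / 180
  dlat = (lat2 - lat1) * to_rad
  dlon = (lon2 - lon1) * to_rad
  a = Math.sin(dlat / 2)**2 +
      Math.cos(lat1 * to_rad) * Math.cos(lat2 * to_rad) * Math.sin(dlon / 2)**2
  2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a))
end

# Hypothetical helper: keep places within `radius_km` of the origin and
# return them nearest first.
def within_radius(places, lat, lon, radius_km)
  places
    .map { |place| place.merge(distance_km: haversine_km(lat, lon, place[:lat], place[:lon])) }
    .select { |place| place[:distance_km] <= radius_km }
    .sort_by { |place| place[:distance_km] }
end

places = [
  { name: 'A', lat: 0.0, lon: 0.5 },  # roughly 56 km from the origin
  { name: 'B', lat: 0.0, lon: 2.0 },  # roughly 222 km from the origin
  { name: 'C', lat: 0.0, lon: 0.1 }   # roughly 11 km from the origin
]

nearby = within_radius(places, 0.0, 0.0, 100).map { |place| place[:name] }
```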

➢ Highlighting allows you to display snippets of the part of the document that matched the query.

Highlighting
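What a highlighted snippet looks like can be shown with a few lines of plain Ruby. This sketch does not use Solr's highlighter: the `highlight` helper, the character-based snippet window, and the `<em>` markers are assumptions chosen for illustration.

```ruby
# Hypothetical helper: return a snippet around the first occurrence of
# `keyword`, with every occurrence inside the snippet wrapped in <em> tags.
# Returns nil when the keyword does not occur in the text.
def highlight(text, keyword, radius: 20)
  index = text.downcase.index(keyword.downcase)
  return nil unless index

  start  = [index - radius, 0].max
  finish = [index + keyword.length + radius, text.length].min
  snippet = text[start...finish]
  snippet.gsub(/#{Regexp.escape(keyword)}/i) { |match| "<em>#{match}</em>" }
end

snippet = highlight('Pizza night', 'pizza')
missing = highlight('Pizza night', 'sushi')
```

The match is case-insensitive, and the original casing of the matched word is preserved inside the markers.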


● Solr can return some statistics on indexed numeric fields. Fetching statistics for average_rating.
  ○ Stats on multiple fields.
  ○ Faceting on stats.

Stats
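The kind of numbers a stats request produces, and what faceting on stats adds, can be sketched in memory. The helpers below are hypothetical plain Ruby, and the `featured` field is an assumption. Only average_rating comes from the slide.

```ruby
# Hypothetical helper: basic statistics over a list of numeric values.
def field_stats(values)
  return nil if values.empty?

  { min: values.min, max: values.max, count: values.size,
    sum: values.sum, mean: values.sum.to_f / values.size }
end

# Hypothetical helper: the same statistics, broken down by a facet field.
def stats_by_facet(docs, stat_field, facet_field)
  docs.group_by { |doc| doc[facet_field] }
      .transform_values { |group| field_stats(group.map { |doc| doc[stat_field] }) }
end

docs = [
  { average_rating: 2.0, featured: true },
  { average_rating: 4.0, featured: true },
  { average_rating: 5.0, featured: false }
]

overall    = field_stats(docs.map { |doc| doc[:average_rating] })
by_feature = stats_by_facet(docs, :average_rating, :featured)
```

Faceting on stats answers questions such as "what is the mean rating of featured versus non-featured documents" in a single pass.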

Dynamic Fields

Dynamic fields allow Solr to index fields that you did not explicitly define in your schema. This is useful if you discover you have forgotten to define one or more fields. Dynamic fields can make your application less brittle by providing some flexibility in the documents you can add to Solr.

Note: you can’t define a dynamic_text field, so it is not possible to run a full-text search on dynamic fields.

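class MyClass
  searchable do
    # Index one dynamic integer field per custom category, keyed by category name.
    dynamic_integer :custom_category_ids, :multiple => true do
      custom_categories.inject(Hash.new { |h, k| h[k] = [] }) do |map, custom_category|
        map[custom_category.name] << custom_category_values_for(custom_category)
        map
      end
    end
  end
end

search = MyClass.search do
  # The dynamic scope uses the base name declared in the searchable block.
  dynamic(:custom_category_ids) do
    facet(some_custom_category.name)
  end
end

# Look up the facet by base name plus the dynamic key used at index time.
facet = search.facet(:custom_category_ids, some_custom_category.name)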
