Finding the right stuff

an intro to Elasticsearch with Ruby/Rails

Michael Reinsch

at Ruby User Group Berlin, Feb 2016

How does it fit into my app?


Blackbox with REST API

elasticsearch

Update API: your app pushes updates (updates are fast, but asynchronous)

Search API: returns search results
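To make the two APIs concrete, here is a minimal sketch of the requests an app might send, built with Ruby's standard library and deliberately not sent. The host `localhost:9200`, the `events` index, the `event` type and the document id are assumptions for illustration; in a Rails app the gems wrap these calls.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Assumed base URL of a local Elasticsearch node (not from the slides).
ES_URL = 'http://localhost:9200'

# Update API: index (create or replace) one document under /index/type/id.
def build_index_request(index, type, id, document)
  request = Net::HTTP::Put.new(URI("#{ES_URL}/#{index}/#{type}/#{id}"))
  request['Content-Type'] = 'application/json'
  request.body = JSON.generate(document)
  request
end

# Search API: send a query body to /index/_search.
def build_search_request(index, query)
  request = Net::HTTP::Post.new(URI("#{ES_URL}/#{index}/_search"))
  request['Content-Type'] = 'application/json'
  request.body = JSON.generate(query)
  request
end

index_request  = build_index_request('events', 'event', 1, { title: 'Drop in Ruby' })
search_request = build_search_request('events', { query: { match: { title: 'ruby' } } })

# Only print the request lines; nothing is sent over the network.
puts "#{index_request.method} #{index_request.path}"
puts "#{search_request.method} #{search_request.path}"
```

Because indexing is asynchronous, a search sent right after the index request may not see the new document yet.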


For Ruby / Rails

• https://github.com/elastic/elasticsearch-rails

• gems for Rails:
    • elasticsearch-model & elasticsearch-rails

• without Rails / AR:
    • elasticsearch-persistence

class Event < ActiveRecord::Base
  include Elasticsearch::Model

  # The document that gets sent to Elasticsearch for this record
  def as_indexed_json(options={})
    {
      title: title,
      description: description,
      starts_at: starts_at.iso8601,
      featured: group.featured?
    }
  end

  # Explicit mapping: only the listed fields are indexed
  settings do
    mapping dynamic: 'false' do
      indexes :title, type: 'string'
      indexes :description, type: 'string'
      indexes :starts_at, type: 'date'
      indexes :featured, type: 'boolean'
    end
  end
end


Event.import

Elasticsearch cluster

• Index: events
    • Type: event (doc 1)

• Index: creations
    • Type: creation (doc 1)
    • Type: activity (doc 1, doc 2)

Documents, not relationships

compose documents with all relevant data

➜ "denormalize" your data
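A minimal sketch of what denormalizing means in practice, using plain Ruby structs as hypothetical stand-ins for ActiveRecord models. The field names and the sample data are made up for illustration; the point is that the document carries a copy of the related data instead of foreign keys.

```ruby
require 'json'

# Hypothetical stand-ins for ActiveRecord models (names are illustrative).
Group    = Struct.new(:name, :featured)
Location = Struct.new(:name, :address)
Event    = Struct.new(:title, :group, :locations)

# Copy everything a search needs into one self-contained document,
# instead of storing foreign keys that would require a join.
def event_document(event)
  {
    title: event.title,
    group_name: event.group.name,
    featured: event.group.featured,
    locations: event.locations.map { |loc| { name: loc.name, address: loc.address } }
  }
end

group = Group.new('Tokyo Rubyist Meetup', true)
event = Event.new('Drop in Ruby', group, [Location.new('EC Navi', 'Shibuya, Tokyo')])

puts JSON.pretty_generate(event_document(event))
```

The trade-off is that a change to the group (for example its name) means re-indexing every event document that embeds it.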

class Event < ActiveRecord::Base
  include Elasticsearch::Model

  # Multi-valued fields and embedded (nested) documents
  def as_indexed_json(options={})
    {
      titles: [ title1, title2 ],
      locations: locs.map(&:as_indexed_json)
    }
  end

  settings do
    mapping dynamic: 'false' do
      indexes :titles, type: 'string'
      indexes :locations, type: 'nested' do
        indexes :name, type: 'string'
        indexes :address, type: 'string'
        indexes :location, type: 'geo_point'
      end
    end
  end
end

response = Event.search 'tokyo rubyist'

response.took                         # => 28
response.results.total                # => 2075
response.results.first._score         # => 0.921177
response.results.first._source.title  # => "Drop in Ruby"
response.page(2).results              # => second page of results

supports kaminari / will_paginate
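At the Elasticsearch level, paging is expressed with the `from` and `size` search parameters. The sketch below shows the arithmetic a page number presumably boils down to; `pagination_params` and its default page size are hypothetical and not part of the gems.

```ruby
# Translate a 1-based page number into Elasticsearch's from/size parameters.
# per_page: 25 is an assumed default, not taken from the slides.
def pagination_params(page, per_page: 25)
  page = [page.to_i, 1].max # treat anything below 1 as the first page
  { from: (page - 1) * per_page, size: per_page }
end

p pagination_params(1)
p pagination_params(2)
```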

response = Event.search 'tokyo rubyist'

response.records.to_a     # => [#<Event id: 12409, ...>, ...]
response.page(2).records  # => second page of result records

response.records.each_with_hit do |rec, hit|
  puts "* #{rec.title}: #{hit._score}"
end
# * Drop in Ruby: 0.9205564
# * Javascript meets Ruby in Kamakura: 0.8947
# * Meetup at EC Navi: 0.8766844
# * Pair Programming Session #3: 0.8603562
# * Kickoff Party: 0.8265461


Event.search 'tokyo rubyist'

only upcoming events?

sorted by start date?

Event.search query: {
  filtered: {
    query: {                                  # our query
      simple_query_string: {
        query: 'tokyo rubyist',
        default_operator: 'and'
      }
    },
    filter: {                                 # filtered by conditions
      and: [
        { range: { starts_at: { gte: 'now' } } },
        { term: { featured: true } }
      ]
    }
  }
}, sort: { starts_at: { order: 'asc' } }      # sorted by start time

Query DSL

query: { <query_type>: <arguments> }     (scores matching documents)
filter: { <filter_type>: <arguments> }   (filters documents)

valid arguments depend on query / filter type
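Since the DSL is just nested hashes, the query and filter parts can be assembled separately. A minimal sketch, assuming the `filtered` query and `and` filter shape used in this talk; `filtered_query` is a hypothetical helper name.

```ruby
require 'json'

# Wrap one scoring query and a list of non-scoring filters into the
# `filtered` query shape. The helper name is hypothetical.
def filtered_query(query, filters)
  { query: { filtered: { query: query, filter: { and: filters } } } }
end

body = filtered_query(
  { simple_query_string: { query: 'tokyo rubyist', default_operator: 'and' } },
  [
    { range: { starts_at: { gte: 'now' } } },
    { term: { featured: true } }
  ]
)

puts JSON.pretty_generate(body)
```

Only the query part contributes to `_score`; the filters just decide whether a document is included at all.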


Match Query, Multi Match Query, Bool Query, Boosting Query, Common Terms Query, Constant Score Query, Dis Max Query, Filtered Query, Fuzzy Like This Query, Fuzzy Like This Field Query, Function Score Query, Fuzzy Query, GeoShape Query, Has Child Query, Has Parent Query, Ids Query, Indices Query, Match All Query, More Like This Query, Nested Query, Prefix Query, Query String Query, Simple Query String Query, Range Query, Regexp Query, Span First Query, Span Multi Term Query, Span Near Query, Span Not Query, Span Or Query, Span Term Query, Term Query, Terms Query, Top Children Query, Wildcard Query, Minimum Should Match, Multi Term Query Rewrite, Template Query

And Filter, Bool Filter, Exists Filter, Geo Bounding Box Filter, Geo Distance Filter, Geo Distance Range Filter, Geo Polygon Filter, GeoShape Filter, Geohash Cell Filter, Has Child Filter, Has Parent Filter, Ids Filter, Indices Filter, Limit Filter, Match All Filter, Missing Filter, Nested Filter, Not Filter, Or Filter, Prefix Filter, Query Filter, Range Filter, Regexp Filter, Script Filter, Term Filter, Terms Filter, Type Filter

Event.search query: {
  bool: {
    should: [
      { simple_query_string: { query: "tokyo rubyist", default_operator: "and" } },
      { function_score: {
          filter: {
            and: [
              { range: { starts_at: { lte: 'now' } } },
              { term: { featured: true } }
            ]
          },
          gauss: {
            starts_at: { origin: 'now', scale: '10d', decay: 0.5 },
          },
          boost_mode: "sum"
      } }
    ],
    minimum_should_match: 2
  }
}
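The `gauss` entry scores events by how far `starts_at` is from `now`. Below is a minimal sketch of the decay formula as documented for Elasticsearch's function_score query, with distances expressed in days. `gauss_decay` is a hypothetical helper for building intuition; the real computation happens inside Elasticsearch.

```ruby
# Gaussian decay: the score is 1.0 at the origin and falls to `decay`
# at a distance of `scale` (after subtracting an optional `offset`).
def gauss_decay(distance, scale:, decay: 0.5, offset: 0.0)
  sigma_squared = -(scale**2) / (2.0 * Math.log(decay))
  effective = [0.0, distance.abs - offset].max
  Math.exp(-(effective**2) / (2.0 * sigma_squared))
end

# Distances in days from 'now', with scale '10d' and decay 0.5 as on the slide.
[0, 5, 10, 20].each do |days|
  puts format('%2d days away: %.4f', days, gauss_decay(days, scale: 10))
end
```

With these settings an event 10 days from the origin gets half the score of one at the origin, and one 20 days away gets one sixteenth.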

Create service objects

class EventSearch
  def initialize
    @filters = []
  end

  # Each method adds one filter and returns self, so calls can be chained
  def starting_after(time)
    tap { @filters << { range: { starts_at: { gte: time } } } }
  end

  def featured
    tap { @filters << { term: { featured: true } } }
  end

  def in_group(group_id)
    tap { @filters << { term: { group_id: group_id } } }
  end
end
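The slide stops before the filters are turned into a search. A minimal sketch of how such a builder might be completed, assuming a hypothetical `matching` method for the text query and a hypothetical `to_query` that emits the `filtered` shape used earlier in the talk; neither is shown in the original.

```ruby
require 'json'

# Same builder idea as on the slide; `matching` and `to_query` are
# hypothetical additions so the example runs on its own.
class EventSearch
  def initialize
    @text = nil
    @filters = []
  end

  def matching(text)
    tap { @text = text }
  end

  def starting_after(time)
    tap { @filters << { range: { starts_at: { gte: time } } } }
  end

  def featured
    tap { @filters << { term: { featured: true } } }
  end

  # Build the search body; use match_all when no text was given and
  # skip the filtered wrapper when there are no filters.
  def to_query
    query = if @text
              { simple_query_string: { query: @text, default_operator: 'and' } }
            else
              { match_all: {} }
            end
    return { query: query } if @filters.empty?

    { query: { filtered: { query: query, filter: { and: @filters } } } }
  end
end

search = EventSearch.new.matching('tokyo rubyist').starting_after('now').featured
puts JSON.pretty_generate(search.to_query)
```

In an app, the resulting hash would be handed to `Event.search`, which keeps the query details out of controllers.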


Event.search '東京rubyist'


Dealing with different languages

built-in analysers for arabic, armenian, basque, brazilian, bulgarian, catalan, cjk, czech, danish, dutch, english, finnish, french, galician, german, greek, hindi, hungarian, indonesian, irish, italian, latvian, lithuanian, norwegian, persian, portuguese, romanian, russian, sorani, spanish, swedish, turkish, thai

class Event < ActiveRecord::Base
  include Elasticsearch::Model

  # One sub-field per language for each translated attribute
  def as_indexed_json(options={})
    {
      title: { en: title_en, de: title_de, ja: title_ja },
      description: { en: desc_en, de: desc_de, ja: desc_ja },
      starts_at: starts_at.iso8601,
      featured: group.featured?
    }
  end

  # Each language sub-field gets its own analyzer
  settings do
    mapping dynamic: 'false' do
      indexes :title do
        indexes :en, type: 'string', analyzer: 'english'
        indexes :de, type: 'string', analyzer: 'german'
        indexes :ja, type: 'string', analyzer: 'cjk'
      end
      indexes :description do
        indexes :en, type: 'string', analyzer: 'english'
        indexes :de, type: 'string', analyzer: 'german'
        indexes :ja, type: 'string', analyzer: 'cjk'
      end
      indexes :starts_at, type: 'date'
      indexes :featured, type: 'boolean'
    end
  end
end
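With one sub-field per language, a search has to name the fields it wants to hit. A minimal sketch of one way to do that with a `multi_match` query over `title.en`, `title.de` and `title.ja`; `localized_query` and the `LANGUAGES` constant are hypothetical names, and the slides do not show how the talk's app queries these fields.

```ruby
require 'json'

# Languages indexed as sub-fields in the mapping (assumed to match it).
LANGUAGES = %w[en de ja].freeze

# Build a multi_match query over one attribute's per-language sub-fields,
# e.g. title.en, title.de and title.ja.
def localized_query(text, field: 'title', languages: LANGUAGES)
  {
    multi_match: {
      query: text,
      fields: languages.map { |lang| "#{field}.#{lang}" }
    }
  }
end

puts JSON.pretty_generate({ query: localized_query('東京rubyist') })
```

The search text is analysed per field, so the same input is processed with the english, german and cjk analyzers respectively.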


Changes to mappings?

⚠ can't change field types / analysers ⚠

but: we can add new field mappings

class AddCreatedAtToES < ActiveRecord::Migration
  def up
    client = Elasticsearch::Client.new
    # Add the new field to the existing mapping
    client.indices.put_mapping(
      index: Event.index_name,
      type: Event.document_type,
      body: {
        properties: {
          created_at: { type: 'date' }
        }
      }
    )
    # Re-import so existing documents get the new field
    Event.__elasticsearch__.import
  end

  def down
  end
end


Automated tests

Index names with environment

class Event < ActiveRecord::Base
  include Elasticsearch::Model

  # Separate index per environment, so tests never touch development data
  index_name "drkpr_#{Rails.env}_events"
end


Test helpers

• everything is asynchronous!

• Helpers: wait_for_elasticsearch, wait_for_elasticsearch_removal, clear_elasticsearch! ➜ https://gist.github.com/mreinsch/094dc9cf63362314cef4

• specs: Tag tests which require elasticsearch
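Because indexing is asynchronous, a test has to wait until a change is visible before asserting on search results. A minimal sketch of a generic polling helper; `wait_until` is a hypothetical name and this is not the code from the gist, which may work differently.

```ruby
# Poll until the block returns a truthy value or the timeout expires.
# The helper name and defaults are assumptions for illustration.
def wait_until(timeout: 5.0, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise 'timed out waiting for condition'
    end
    sleep interval
  end
end

# Stand-in for "has the document become searchable yet?".
attempts = 0
wait_until(timeout: 1.0, interval: 0.01) { (attempts += 1) >= 3 }
puts "condition met after #{attempts} attempts"
```

In a spec, the block would typically run a search and check for the expected record.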


Production ready?

• use elastic.co/found or AWS ES

• use two clustered instances for redundancy

• Elasticsearch could go away

• keep impact at a minimum!

• update Elasticsearch from background worker
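One way to keep the impact of an outage small is to let search degrade to an empty result instead of an error page. A minimal sketch under that assumption; `safe_search` and the list of rescued errors are hypothetical, and a real app would rescue the client's specific connection errors rather than `StandardError`.

```ruby
# Run a search block and fall back to a default result when it fails,
# so a search outage degrades one feature rather than the whole page.
def safe_search(fallback: [], errors: [StandardError])
  yield
rescue *errors => e
  warn "search unavailable: #{e.class}: #{e.message}"
  fallback
end

# Simulate Elasticsearch being unreachable.
results = safe_search { raise Errno::ECONNREFUSED, 'localhost:9200' }
puts "got #{results.size} results"
```

The same reasoning applies to writes: pushing index updates from a background worker means a failed update can be retried without affecting the request that triggered it.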


Questions?

Resources:

Elastic Docs https://www.elastic.co/guide/index.html

Ruby Gem Docs https://github.com/elastic/elasticsearch-rails

Elasticsearch rspec helpers https://gist.github.com/mreinsch/094dc9cf63362314cef4

Elasticsearch indexer job example https://gist.github.com/mreinsch/acb2f6c58891e5cd4f13

or ask me later:

[email protected] @mreinsch