Introduction to Elasticsearch Jason Austin - @jason_austin

DESCRIPTION

Elasticsearch is a powerful, distributed, open source search technology. Integrating Elasticsearch into your application gives you a way to search a lot of data very quickly. Elasticsearch has a RESTful API, it scales, it's super fast, you can customize it with plugins, and much more. In this talk I go over the basics of setting up Elasticsearch, creating a search index, importing your data, and doing some basic searching. I also touch on a few advanced topics that show the flexibility of this awesome service.

Page 1: Introduction to Elasticsearch

Introduction to Elasticsearch

Jason Austin - @jason_austin

Page 2: Introduction to Elasticsearch

The Problem

• You are building a website to find beers

• You have a huge database of beers and breweries to sift through

• You want simple keyword-based searching

• You also want structured searching, like finding all beers > 7% ABV

• You want to run some analytics on what beers are in your dataset

Page 3: Introduction to Elasticsearch

Enter Elasticsearch

• Lucene based

• Distributed

• Fast

• RESTful interface

• Document-Based with JSON

Page 4: Introduction to Elasticsearch
Page 5: Introduction to Elasticsearch

Install Elasticsearch

• Download from http://elasticsearch.org

• Requires Java to run

Page 6: Introduction to Elasticsearch

Run Elasticsearch

• From the install directory:

./bin/elasticsearch -d

http://localhost:9200/

Page 7: Introduction to Elasticsearch

Communicating

• Elasticsearch listens to RESTful HTTP requests

• GET, POST, PUT, DELETE

• CURL works just fine
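Any HTTP client can stand in for curl here. As a minimal sketch, assuming the default local node at http://localhost:9200 and a placeholder `/phpbeer/beer/1` path (the index used later in the deck), the Python standard library can build one request per verb. The `build_request` helper is hypothetical, and nothing is sent until `urllib.request.urlopen(req)` is called.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: default local node


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for Elasticsearch."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(BASE_URL + path, data=data, method=method)


# One example per verb; the paths and bodies are placeholders.
examples = [
    build_request("GET", "/phpbeer/beer/1"),
    build_request("POST", "/phpbeer/beer", {"name": "Pliny the Elder"}),
    build_request("PUT", "/phpbeer/beer/1", {"name": "Pliny the Elder"}),
    build_request("DELETE", "/phpbeer/beer/1"),
]

for req in examples:
    print(req.get_method(), req.full_url)
```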

Page 8: Introduction to Elasticsearch

ES Structure

Relational DB    Elasticsearch
Databases        Indices
Tables           Types
Rows             Documents
Columns          Fields

Page 9: Introduction to Elasticsearch

ES Structure

Elasticsearch    Example
Indices          phpbeer
Types            beer
Documents        Pliny the Elder
Fields           ABV, Name, Desc
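The hierarchy shows up directly in the URL of a document: index, then type, then document ID. A small sketch of that mapping follows. The `DocumentAddress` class is hypothetical, and the ID `1` is an assumption borrowed from the curl examples in this deck. Mapping types belong to the Elasticsearch versions this deck targets and were removed in later releases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentAddress:
    index: str     # comparable to a database
    doc_type: str  # comparable to a table
    doc_id: str    # identifies one document, comparable to a row

    def path(self):
        """Return the REST path that addresses this document."""
        return f"/{self.index}/{self.doc_type}/{self.doc_id}"


pliny = DocumentAddress(index="phpbeer", doc_type="beer", doc_id="1")
fields = {"name": "Pliny the Elder", "abv": 7.0}  # fields, comparable to columns

print(pliny.path())
```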

Page 10: Introduction to Elasticsearch

Create an Index

curl -XPOST 'http://localhost:9200/phpbeer'

Page 11: Introduction to Elasticsearch

What to Search?

• Define the types of things to search

• Beer

• Brewery - Maybe later

Page 12: Introduction to Elasticsearch

Define a Beer

• Name

• Style

• ABV

• Brewery

‣ Name

‣ City

Page 13: Introduction to Elasticsearch

Beer JSON

{
  "name": "Pliny the Elder",
  "style": "Imperial India Pale Ale",
  "abv": 7.0,
  "brewery": {
    "name": "Russian River Brewing Co.",
    "city": "Santa Rosa",
    "state": "California"
  }
}

Page 14: Introduction to Elasticsearch

Saving The Beer

curl -XPOST 'http://localhost:9200/phpbeer/beer/1' -d '{
  "name": "Pliny the Elder",
  "style": "Imperial India Pale Ale",
  "abv": 7.0,
  "brewery": {
    "name": "Russian River Brewing Co.",
    "city": "Santa Rosa",
    "state": "California"
  }
}'
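The same save can be sketched from application code. This is only an illustration with the Python standard library: the `index_request` helper is hypothetical, and the explicit `Content-Type` header is an addition not shown on the slide, included because newer Elasticsearch releases reject request bodies that lack it.

```python
import json
import urllib.request

beer = {
    "name": "Pliny the Elder",
    "style": "Imperial India Pale Ale",
    "abv": 7.0,
    "brewery": {
        "name": "Russian River Brewing Co.",
        "city": "Santa Rosa",
        "state": "California",
    },
}


def index_request(index, doc_type, doc_id, document):
    """Build (but do not send) the request that stores one document."""
    url = f"http://localhost:9200/{index}/{doc_type}/{doc_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = index_request("phpbeer", "beer", 1, beer)
# To actually send it against a running node: urllib.request.urlopen(req)
print(req.get_method(), req.full_url)
```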

Page 15: Introduction to Elasticsearch

Getting a Beer

curl -XGET 'http://localhost:9200/phpbeer/beer/1?pretty'

Page 16: Introduction to Elasticsearch

Updating a Beer

curl -XPOST 'http://localhost:9200/phpbeer/beer/1' -d '{
  "name": "Pliny the Elder",
  "style": "Imperial India Pale Ale",
  "abv": 8.0,
  "brewery": {
    "name": "Russian River Brewing Co.",
    "city": "Santa Rosa",
    "state": "California"
  }
}'

Page 17: Introduction to Elasticsearch

POST vs PUT

• POST

• No ID - Creates new doc, assigns ID

• With ID - Updates or creates new doc

• PUT

• No ID - Error

• With ID - Updates or creates doc
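One way to encode these rules in client code is to pick the verb based on whether you already know the document ID. The `save_request` helper below is a hypothetical sketch of that choice, not part of any Elasticsearch library.

```python
def save_request(index, doc_type, doc_id=None):
    """Return the (method, path) pair for saving a document."""
    if doc_id is None:
        # No ID: POST to the type and let Elasticsearch assign one.
        return "POST", f"/{index}/{doc_type}"
    # Known ID: PUT addresses the document directly.
    return "PUT", f"/{index}/{doc_type}/{doc_id}"


print(save_request("phpbeer", "beer"))
print(save_request("phpbeer", "beer", 1))
```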

Page 18: Introduction to Elasticsearch

Delete a Beer

curl -XDELETE 'http://localhost:9200/phpbeer/beer/1'

Page 19: Introduction to Elasticsearch

Finally! Searching!

curl -XGET 'http://localhost:9200/_search?pretty&q=pliny'

Page 20: Introduction to Elasticsearch

Specific Field Searching

curl -XGET 'http://localhost:9200/_search?pretty&q=style:pliny'

curl -XGET 'http://localhost:9200/_search?pretty&q=style:imperial'
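When the search text comes from a user, the query string needs URL encoding. A minimal sketch, assuming the same local node; the `search_url` helper is hypothetical, and it uses `pretty=true` as the explicit form of the bare `pretty` flag.

```python
from urllib.parse import urlencode


def search_url(text, field=None, base="http://localhost:9200"):
    """Build a URI search URL, optionally restricted to one field."""
    q = f"{field}:{text}" if field else text
    return f"{base}/_search?" + urlencode({"pretty": "true", "q": q})


print(search_url("pliny"))
print(search_url("imperial", field="style"))
```

The colon in `style:imperial` is percent-encoded as `%3A`, which the server decodes back before parsing the query.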

Page 21: Introduction to Elasticsearch

Alternate Approach

• Search using DSL (Domain Specific Language)

• JSON in request body

Page 22: Introduction to Elasticsearch

DSL Searching

curl -XGET 'http://localhost:9200/_search?pretty' -d '{
  "query" : {
    "match" : {
      "style" : "imperial"
    }
  }
}'

Page 23: Introduction to Elasticsearch

DSL = Query + Filter

• Query - “How well does the document match”

• Filter - Yes or No question on the field

Page 24: Introduction to Elasticsearch

Query DSL

• match

• Used to query a field for a string

• match_phrase

• Used to query an exact phrase

• match_all

• Matches all documents

• multi_match

• Runs the same match query on multiple fields
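The four query types above differ mainly in the shape of their JSON. The sketch below builds each body as a plain dictionary. The helper names are hypothetical; the field names and search text come from the beer example.

```python
import json


def match(field, text):
    return {"match": {field: text}}


def match_phrase(field, phrase):
    return {"match_phrase": {field: phrase}}


def match_all():
    return {"match_all": {}}


def multi_match(text, fields):
    return {"multi_match": {"query": text, "fields": list(fields)}}


def search_body(query):
    """Wrap a query clause in the top-level search request body."""
    return {"query": query}


body = search_body(multi_match("russian", ["name", "brewery.name"]))
print(json.dumps(body, indent=2))
```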

Page 25: Introduction to Elasticsearch

Filter DSL

• term

• Exact match on a field

• range

• Match numbers over a specified range

• exists / missing

• Match based on the existence of a value for a field
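The filters can be sketched as dictionaries in the same way. The helper names are hypothetical, and `missing` is left out to keep the sketch short. The bound names `gt`, `gte`, `lt` and `lte` are the ones the `range` filter accepts.

```python
ALLOWED_BOUNDS = {"gt", "gte", "lt", "lte"}


def term_filter(field, value):
    """Exact match on a field."""
    return {"term": {field: value}}


def range_filter(field, **bounds):
    """Match values inside the given bounds, e.g. gte=4, lte=6."""
    if not bounds or set(bounds) - ALLOWED_BOUNDS:
        raise ValueError(f"bounds must come from {sorted(ALLOWED_BOUNDS)}")
    return {"range": {field: bounds}}


def exists_filter(field):
    """Match documents that have a value for the field."""
    return {"exists": {"field": field}}


print(term_filter("abv", 7.0))
print(range_filter("abv", gte=4, lte=6))
print(exists_filter("brewery.city"))
```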

Page 26: Introduction to Elasticsearch

More Complex Search

• Find beers whose style includes “Pale Ale” and that are under 7% ABV

Page 27: Introduction to Elasticsearch

Match + Range

curl -XGET 'http://localhost:9200/_search?pretty' -d '{
  "query" : {
    "match" : {
      "style" : "pale ale"
    }
  },
  "filter" : {
    "range" : {
      "abv" : {
        "lt" : 7
      }
    }
  }
}'
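The request above uses a top-level `filter`, which fits the Elasticsearch version this deck was written for. As an alternative sketch, assuming Elasticsearch 2.0 or later, the same search is commonly written as a `bool` query with a `must` clause for scoring and a `filter` clause for the yes/no condition. The `pale_ales_under` helper is hypothetical.

```python
import json


def pale_ales_under(max_abv):
    """Body for: style matches 'pale ale' AND abv < max_abv."""
    return {
        "query": {
            "bool": {
                "must": {"match": {"style": "pale ale"}},       # scored
                "filter": {"range": {"abv": {"lt": max_abv}}},  # not scored
            }
        }
    }


print(json.dumps(pale_ales_under(7), indent=2))
```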

Page 28: Introduction to Elasticsearch

Embedded Field Search

curl -XGET 'http://localhost:9200/_search?pretty' -d '{
  "query" : {
    "match" : {
      "brewery.state" : "California"
    }
  }
}'

Page 29: Introduction to Elasticsearch

Highlighting Search Results

Page 30: Introduction to Elasticsearch

Highlighting Search Results

curl -XGET 'http://localhost:9200/_search?pretty' -d '{
  "query" : {
    "match" : {
      "style" : "pale ale"
    }
  },
  "highlight": {
    "fields" : {
      "style" : {}
    }
  }
}'
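Each hit in the response carries a `highlight` object keyed by field, holding fragments with the matched words wrapped in `<em>` tags by default. The sketch below pulls those fragments out of a response. The `sample_response` is a hand-written illustration of the response shape, not captured output, and the `highlights` helper is hypothetical.

```python
# Hand-written illustration of the response shape (not real output).
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "_source": {"name": "Pliny the Elder"},
                "highlight": {
                    "style": ["Imperial India <em>Pale</em> <em>Ale</em>"]
                },
            },
            {"_id": "2", "_source": {"name": "No highlight here"}},
        ]
    }
}


def highlights(response, field):
    """Map document ID to highlighted fragments for one field."""
    found = {}
    for hit in response.get("hits", {}).get("hits", []):
        fragments = hit.get("highlight", {}).get(field)
        if fragments:
            found[hit["_id"]] = fragments
    return found


print(highlights(sample_response, "style"))
```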

Page 31: Introduction to Elasticsearch

Aggregations

• Collect analytics on your documents

• 2 main types

• Bucketing

• Produce a set of buckets with documents in them

• Metric

• Compute metrics over a set of documents

Page 32: Introduction to Elasticsearch

Bucketing Aggregations

Page 33: Introduction to Elasticsearch

Metric Aggregations

• How many beers exist of each style?

• What is the average ABV of beers for each style?

• How many beers exist that are brewed in California?

Page 34: Introduction to Elasticsearch

What is the average ABV of beers for each style?

curl -XGET 'http://localhost:9200/_search?pretty' -d '{
  "aggs" : {
    "all_beers" : {
      "terms" : {
        "field" : "style"
      },
      "aggs" : {
        "avg_abv" : {
          "avg" : {
            "field" : "abv"
          }
        }
      }
    }
  }
}'
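The result comes back under `aggregations`, with one bucket per term and the nested `avg_abv` metric inside each bucket. A sketch of reading it follows. The `sample_response` and its numbers are invented for illustration, and the helper is hypothetical. The bucket keys are shown as single lowercase words on the assumption that `style` still has the default analyzed mapping, which is what the Mappings slides go on to address.

```python
# Invented example of the response shape (numbers are placeholders).
sample_response = {
    "aggregations": {
        "all_beers": {
            "buckets": [
                {"key": "ale", "doc_count": 3, "avg_abv": {"value": 6.5}},
                {"key": "imperial", "doc_count": 1, "avg_abv": {"value": 8.0}},
            ]
        }
    }
}


def average_abv_by_bucket(response, agg_name="all_beers", metric="avg_abv"):
    """Map each bucket key to the value of its nested metric."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket[metric]["value"] for bucket in buckets}


print(average_abv_by_bucket(sample_response))
```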

Page 35: Introduction to Elasticsearch

Mappings

• Define how ES searches

• Completely optional

• Must re-index after defining mapping

Page 36: Introduction to Elasticsearch

Create Index with Mapping

curl -XDELETE localhost:9200/phpbeer

curl -XPOST localhost:9200/phpbeer -d '{
  "mappings" : {
    "beer" : {
      "_source" : { "enabled" : true },
      "properties" : {
        "style" : {
          "type" : "string",
          "index" : "not_analyzed"
        }
      }
    }
  }
}'

Page 37: Introduction to Elasticsearch

What is the average ABV of beers for each style?

curl -XGET 'http://localhost:9200/_search?pretty' -d '{
  "aggs" : {
    "all_beers" : {
      "terms" : {
        "field" : "style"
      },
      "aggs" : {
        "avg_abv" : {
          "avg" : {
            "field" : "abv"
          }
        }
      }
    }
  }
}'

Page 38: Introduction to Elasticsearch

Non-Analyzed Fields

curl -XGET 'http://localhost:9200/_search?pretty&q=style:imperial'

curl -XGET 'http://localhost:9200/_search?pretty&q=style:hefeweizen'
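Why a single word stops matching once `style` is `not_analyzed` can be shown with a toy model. This is only a rough stand-in: the real standard analyzer does more than lowercase and split on whitespace, and all function names here are hypothetical.

```python
def analyzed_terms(value):
    """Toy analyzer: lowercase and split on whitespace."""
    return value.lower().split()


def not_analyzed_terms(value):
    """A not_analyzed field is indexed as one exact term."""
    return [value]


def term_matches(query_term, indexed_terms):
    return query_term in indexed_terms


style = "Imperial India Pale Ale"

print(term_matches("imperial", analyzed_terms(style)))      # True
print(term_matches("imperial", not_analyzed_terms(style)))  # False
print(term_matches(style, not_analyzed_terms(style)))       # True
```

The trade-off is that the whole value now behaves as one unit, which is what makes per-style aggregation buckets meaningful.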

Page 39: Introduction to Elasticsearch

Flexibility

• Mixing aggregations, filters and queries all together

• Which beers have the word “night” in the name and are between 4% and 6% ABV, broken down by style?
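One possible request body for that question is sketched below. It assumes Elasticsearch 2.0 or later for the `bool` query with a `filter` clause, treats "between 4 and 6" as inclusive, and uses a hypothetical aggregation name `by_style`. Putting the range inside the query means the aggregation only counts beers that pass it.

```python
import json


def night_beers_by_style(min_abv=4, max_abv=6):
    """Body for: name matches 'night', abv in range, bucketed by style."""
    return {
        "query": {
            "bool": {
                "must": {"match": {"name": "night"}},
                "filter": {
                    "range": {"abv": {"gte": min_abv, "lte": max_abv}}
                },
            }
        },
        "aggs": {"by_style": {"terms": {"field": "style"}}},
    }


print(json.dumps(night_beers_by_style(), indent=2))
```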

Page 40: Introduction to Elasticsearch

Elasticsearch and PHP

• Elasticsearch PHP Lib - https://github.com/elasticsearch/elasticsearch-php

• Elastica - http://elastica.io/

Page 41: Introduction to Elasticsearch

Other Awesome ES Features

• Search analyzers

• Geo-based searching

• Elasticsearch Plugins

• kopf - http://localhost:9200/_plugin/kopf

Page 42: Introduction to Elasticsearch

Questions?

• @jason_austin

• http://www.pintlabs.com

• https://joind.in/10821