
AWS October Webinar Series - Introducing Amazon Elasticsearch Service


Page 1: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Pravin Pillai, Sr. Product Manager
Jon Handler, Principal Solutions Architect

October, 2015

Introduction to Amazon Elasticsearch Service

Page 2: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Amazon Elasticsearch Service

Page 3: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

What to Expect from the Session

• Context: Managing your growing data
• Introducing Amazon Elasticsearch Service (Amazon ES)
• Configuring, securing, connecting, monitoring, and scaling your Amazon ES cluster

Page 4: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Your data is constantly growing

Product usage

Page 5: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Your data is constantly growing

System logs

Page 6: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Your data is constantly growing

Customer conversations

Page 7: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

That’s a lot of data!

Page 8: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

“Big data is not about the data”
- Gary King, Harvard University, making the point that while data is plentiful and easy to collect, the real value is in the analytics.

Page 9: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

So what can you do with all this data?

• Share information
• Extract insight
• Recognize patterns
• Track performance

Ultimately, make better business, technical, and operational decisions

Page 10: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Scenario 1: Full-text search

Knowledge Sharing Systems

• Your team is constantly generating content
• You are tasked with making this knowledge base searchable and accessible
• You need key search features including text matching, faceting, filtering, fuzzy search, autocomplete, and highlighting

Page 11: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Scenario 2: Streaming data analytics

Intrusion detection

• You have to protect your system from attacks
• You need easy-to-use, yet powerful analytics and data visualization tools to detect issues in near real time
• Easy and flexible data ingestion is important to capture information from a variety of key data sources

Page 12: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Scenario 3: Batch data analytics

Usage Monitoring

• You are a mobile app developer
• You have to monitor and manage users across multiple app versions
• You want to analyze and report on usage and migration between app versions

Page 13: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

What options do you have?

Page 14: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

How Elasticsearch can help

A powerful, real-time, distributed, open-source search and analytics engine:
• Built on top of Apache Lucene
• Schema free
• Developer-friendly RESTful API

Page 15: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

How Elasticsearch can help

Combined with Logstash and Kibana, the ELK stack provides a tool for real-time analytics and data visualization

Page 16: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Operating Elasticsearch is time-consuming

“Elasticsearch allows us to easily and quickly build bleeding edge big data and analytics applications using the ELK stack. By offering direct access to the Elasticsearch API while offloading administrative tasks, Amazon Elasticsearch Service gives us the manageability, flexibility and control we need.”

Sean Curtis, SVP Engineering at Major League Baseball Advanced Media

Page 17: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Introducing Amazon Elasticsearch Service

Amazon Elasticsearch Service is a managed service from AWS that makes it easy to set up, operate, and scale Elasticsearch clusters in the cloud.

Page 18: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Key benefits

• Easy cluster creation and configuration management
• Support for ELK
• Security with AWS IAM
• Monitoring with Amazon CloudWatch
• Auditing with AWS CloudTrail
• Integration options with other AWS services (CloudWatch Logs, Amazon DynamoDB, Amazon S3, Amazon Kinesis)

Page 19: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Create the cluster

Page 20: AWS October Webinar Series - Introducing Amazon Elasticsearch Service
Page 21: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

AWS CLI commands

• add-tags
• create-elasticsearch-domain
• delete-elasticsearch-domain
• describe-elasticsearch-domain
• describe-elasticsearch-domain-config
• describe-elasticsearch-domains
• list-domain-names
• list-tags
• remove-tags
• update-elasticsearch-domain-config

aws es create-elasticsearch-domain \
  --domain-name my-domain \
  --elasticsearch-cluster-config InstanceType=m3.xlarge.elasticsearch,InstanceCount=3 \
  --ebs-options EBSEnabled=true,VolumeType=gp2,VolumeSize=512

Page 22: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Amazon ES domain overview

[Diagram: Amazon ES domain (Amazon Route 53, Elastic Load Balancing, IAM, CloudWatch, CloudTrail, Elasticsearch API)]

Page 23: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Amazon ES domain overview

Nodes under management

[Diagram: Amazon ES domain (Amazon Route 53, Elastic Load Balancing, IAM, CloudWatch, CloudTrail, Elasticsearch API)]

Page 24: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Amazon ES domain overview

Single endpoint, REST API

[Diagram: Amazon ES domain (Amazon Route 53, Elastic Load Balancing, IAM, CloudWatch, CloudTrail, Elasticsearch API)]

Page 25: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Amazon ES domain overview

IAM integration

[Diagram: Amazon ES domain (Amazon Route 53, Elastic Load Balancing, IAM, CloudWatch, CloudTrail, Elasticsearch API)]

Page 26: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Amazon ES domain overview

CloudWatch/CloudTrail for monitoring

[Diagram: Amazon ES domain (Amazon Route 53, Elastic Load Balancing, IAM, CloudWatch, CloudTrail, Elasticsearch API)]

Page 27: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Scale for your workload

Page 28: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Online scaling operations

[Diagram: Update]

Page 29: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Data partitioning for search

[Diagram: an index of documents, each with an ID, split across Shard 1 and Shard 2]

• Document: The unit of search
• ID: Unique identifier, one per document
• Field: Documents comprise a collection of fields
• Shard: An instance of Lucene with a portion of an index
• Index: A collection of data
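As a rough illustration of how an ID can decide which shard holds a document, the sketch below hashes the ID and takes the remainder modulo the shard count. This is a minimal sketch under stated assumptions: Elasticsearch uses its own internal hash function, and MD5 here is only a stand-in; `shard_for` is a hypothetical helper, not part of any Elasticsearch API.

```python
import hashlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document ID to a shard number in the range [0, num_shards).

    Assumption: MD5 stands in for the engine's real routing hash.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    # Use the first four bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:4], "big") % num_shards


if __name__ == "__main__":
    for doc_id in ("1", "2", "3"):
        print(doc_id, "-> shard", shard_for(doc_id, 2))
```

Because the shard depends only on the ID and the shard count, the same ID always resolves to the same shard, which is why the number of primary shards is fixed when an index is created.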

Page 30: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Deployment of indices to a cluster

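• Index 1: Shard 1, Shard 2, Shard 3
• Index 2: Shard 1, Shard 2, Shard 3

[Diagram: an Amazon ES cluster of three instances. Each index has primary and replica copies of its three shards, spread across the instances: Instance 1 holds shards 1 and 3, Instance 2 holds shards 2 and 1, Instance 3 holds shards 3 and 2.]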
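The layout in the diagram can be approximated with a simple round-robin placement that never puts a replica on the same instance as its primary. This is only a sketch under that assumption; the real shard allocator in Elasticsearch weighs more factors, and `allocate` is a hypothetical helper.

```python
def allocate(num_shards: int, num_instances: int) -> dict:
    """Place each primary round-robin and its replica on the next instance."""
    if num_instances < 2:
        raise ValueError("need at least two instances to separate replicas")
    placement = {i: [] for i in range(1, num_instances + 1)}
    for shard in range(1, num_shards + 1):
        primary_instance = (shard - 1) % num_instances + 1
        # The replica goes to the following instance, wrapping around.
        replica_instance = primary_instance % num_instances + 1
        placement[primary_instance].append((shard, "primary"))
        placement[replica_instance].append((shard, "replica"))
    return placement


if __name__ == "__main__":
    for instance, shards in allocate(3, 3).items():
        print("Instance", instance, shards)
```

With three shards and three instances, every instance ends up with one primary and one replica of a different shard, so losing any single instance still leaves a full copy of the index.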

Page 31: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Performance: single shard, single node

Values in parentheses are the EBS results; a dash means no non-EBS figure is given.

Instance type (EBS volume)   Average write, 1000-doc _bulks (EBS)   Average read (EBS)   vCPU   RAM (GB)
T2.micro (35 GB)             - (1.3)                                - (0.47)             1      1
T2.small (35 GB)             - (2.6)                                - (0.77)             1      2
T2.medium (35 GB)            - (4.2)                                - (1.3)              2      4
M3.medium (100 GB)           2.95 (2.86)                            1.31 (1.39)          1      3.75
M3.large (100 GB)            6.35 (6.29)                            2.81 (2.84)          2      7.5
M3.xlarge (100 GB)           11.6 (11.6)                            4.62 (5.57)          4      15
M3.2xlarge (100 GB)          18.45 (18)                             11.32 (12.05)        8      30
R3.large (100 GB)            5.72 (5.94)                            2.86 (2.88)          2      15.25
R3.xlarge (100 GB)           10.8 (10.5)                            5.76 (5.79)          4      30.5
R3.2xlarge (100 GB)          16.8 (16.5)                            11.31 (11.38)        8      61
R3.4xlarge (100 GB)          19.1 (19.2)                            24.05 (24.66)        16     122
R3.8xlarge (100 GB)          22.2 (21.8)                            44 (47.29)           32     244
I2.xlarge (100 GB)           10.8 (10.8)                            5.09 (5.88)          4      30.5
I2.2xlarge (100 GB)          17.8 (18.1)                            10.05 (10.93)        8      61

Page 32: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Instance type recommendations

Instance   Workload
T2         Entry point. Dev and test. OK for dedicated masters.
M3         Equal read and write volumes. Up to 5 TB of storage with EBS.
R3         Read-heavy workloads, or workloads with high query demands (e.g., aggregations).
I2         Up to 16 TB of SSD instance storage.

Page 33: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Secure access to your domain

Page 34: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Secure access to your domain

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::123456789012:user/susan" },
      "Action": [
        "es:ESHttpGet",
        "es:ESHttpPut",
        "es:ESHttpPost",
        "es:CreateElasticsearchDomain",
        "es:ListDomainNames"
      ],
      "Resource": "arn:aws:es:us-east-1:###:domain/logs-domain/<index>/*"
    }
  ]
}

Page 35: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Secure access to your domain

Control access by user with signed requests

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::123456789012:user/susan" },
      "Action": [
        "es:ESHttpGet",
        "es:ESHttpPut",
        "es:ESHttpPost",
        "es:CreateElasticsearchDomain",
        "es:ListDomainNames"
      ],
      "Resource": "arn:aws:es:us-east-1:###:domain/logs-domain/<index>/*"
    }
  ]
}

Page 36: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Secure access to your domain

Allow/Deny HTTP methods and Config operations per policy

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::123456789012:user/susan" },
      "Action": [
        "es:ESHttpGet",
        "es:ESHttpPut",
        "es:ESHttpPost",
        "es:CreateElasticsearchDomain",
        "es:ListDomainNames"
      ],
      "Resource": "arn:aws:es:us-east-1:###:domain/logs-domain/<index>/*"
    }
  ]
}

Page 37: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Secure access to your domain

Fine-grained control to the index level

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::123456789012:user/susan" },
      "Action": [
        "es:ESHttpGet",
        "es:ESHttpPut",
        "es:ESHttpPost",
        "es:CreateElasticsearchDomain",
        "es:ListDomainNames"
      ],
      "Resource": "arn:aws:es:us-east-1:###:domain/logs-domain/<index>/*"
    }
  ]
}
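To see why a Resource ending in `<index>/*` restricts access to one index, it can help to model the wildcard match yourself. The sketch below is an approximation that treats `*` as "any run of characters" and `?` as "any single character"; it is not the IAM policy evaluator, and `resource_matches`, the account ID, and the index names are hypothetical.

```python
import re


def resource_matches(pattern: str, arn: str) -> bool:
    """Return True if the ARN matches the wildcard pattern.

    Assumption: only '*' and '?' are wildcards; everything else is literal.
    """
    regex = re.escape(pattern).replace(r"\*", ".*").replace(r"\?", ".")
    return re.fullmatch(regex, arn) is not None


if __name__ == "__main__":
    # Hypothetical account ID and index name, for illustration only.
    pattern = "arn:aws:es:us-east-1:123456789012:domain/logs-domain/app-logs/*"
    base = "arn:aws:es:us-east-1:123456789012:domain/logs-domain/"
    print(resource_matches(pattern, base + "app-logs/_search"))    # inside the index
    print(resource_matches(pattern, base + "other-index/_search")) # different index
```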

Page 38: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Secure access to your domain

And/or use IP-based access control

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": { "AWS": "*" },
      "Action": [
        "es:ESHttpGet",
        "es:ESHttpPut",
        "es:ESHttpPost",
        "es:CreateElasticsearchDomain",
        "es:ListDomainNames"
      ],
      "Resource": "arn:aws:es:us-east-1:###:domain/logs-domain/<index>/*",
      "Condition": {
        "IpAddress": {
          "aws:SourceIp": [ "xx.xx.xx.xx/yy" ]
        }
      }
    }
  ]
}
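The `aws:SourceIp` condition takes CIDR blocks, so a request is allowed only when the caller's address falls inside one of them. The sketch below reproduces that membership test locally with the Python standard library; it is an illustration of the idea, not how AWS evaluates the policy, and `ip_allowed` and the sample addresses are hypothetical.

```python
import ipaddress


def ip_allowed(source_ip: str, allowed_cidrs: list) -> bool:
    """Return True if source_ip falls inside any of the CIDR blocks."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        addr in ipaddress.ip_network(cidr, strict=False)
        for cidr in allowed_cidrs
    )


if __name__ == "__main__":
    # Documentation-range addresses, used here purely as examples.
    allow_list = ["192.0.2.0/24"]
    print(ip_allowed("192.0.2.10", allow_list))
    print(ip_allowed("198.51.100.1", allow_list))
```

Checking a planned allow list this way before updating the access policy can help avoid locking yourself out of the domain.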

Page 39: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Load data

Page 40: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Direct access to the Elasticsearch API

$ curl -XPUT 'https://<endpoint>/blog' -d '{
  "settings": { "number_of_shards": 3, "number_of_replicas": 1 }
}'

$ curl -XPOST 'https://<endpoint>/blog/post/1' -d '{
  "author": "jon handler",
  "title": "Amazon ES Launch"
}'

$ curl -XPOST 'https://<endpoint>/blog/post/_bulk' --data-binary '{ "index": { "_index": "blog", "_type": "post", "_id": "2" } }
{ "title": "Amazon ES for search", "author": "pravin pillai" }
{ "index": { "_index": "blog", "_type": "post", "_id": "3" } }
{ "title": "Analytics too", "author": "vivek sriram" }
'

$ curl -XGET 'https://<endpoint>/_search?q=ES'
{"took":16,"timed_out":false,"_shards":{"total":3,"successful":3,"failed":0},
 "hits":{"total":2,"max_score":0.13424811,"hits":[
  {"_index":"blog","_type":"post","_id":"1","_score":0.13424811,"_source":{"author":"jon handler","title":"Amazon ES Launch"}},
  {"_index":"blog","_type":"post","_id":"2","_score":0.11506981,"_source":{"title":"Amazon ES for search","author":"pravin pillai"}}]}}
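The `_bulk` endpoint expects newline-delimited JSON: one action line, then one source line per document, with a trailing newline at the end. A small helper can build that body from ordinary dictionaries. This is a minimal sketch; `build_bulk_body` is a hypothetical name, and sending the body to the endpoint is left out.

```python
import json


def build_bulk_body(index: str, doc_type: str, docs: list) -> str:
    """Build a newline-delimited _bulk body from (doc_id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    body = build_bulk_body("blog", "post", [
        ("2", {"title": "Amazon ES for search", "author": "pravin pillai"}),
        ("3", {"title": "Analytics too", "author": "vivek sriram"}),
    ])
    print(body, end="")
```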

Page 41: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Loading data using Logstash

[Diagram: Application nodes/Logstash forwarders -> Logstash indexer -> Amazon Elasticsearch Service]

Page 42: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Logstash plugin for Amazon ES

https://github.com/awslabs/logstash-output-amazon_es

output {
  amazon_es {
    hosts => ["foo.us-east-1.es.amazonaws.com"]   # required
    region => "us-east-1"                         # required
    access_key => 'ACCESS_KEY'                    # optional
    secret_key => 'SECRET_KEY'                    # optional
    codec => "plain"
    workers => 1
    index => "logstash-%{+YYYY.MM.dd}"
  }
}
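The `index => "logstash-%{+YYYY.MM.dd}"` setting produces one index per day, named after the event date. The sketch below computes the same style of name in Python, assuming the date is taken in UTC; `daily_index_name` is a hypothetical helper and requires a timezone-aware timestamp.

```python
from datetime import datetime, timezone


def daily_index_name(timestamp: datetime, prefix: str = "logstash") -> str:
    """Return '<prefix>-YYYY.MM.dd' for a timezone-aware timestamp, in UTC."""
    if timestamp.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    utc = timestamp.astimezone(timezone.utc)
    return f"{prefix}-{utc:%Y.%m.%d}"


if __name__ == "__main__":
    event_time = datetime(2015, 10, 28, 23, 30, tzinfo=timezone.utc)
    print(daily_index_name(event_time))
```

Daily indices make it straightforward to expire old log data by deleting whole indices instead of individual documents.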

Page 43: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Loading data using Lambda

[Diagram: Amazon S3, DynamoDB, and Amazon Kinesis -> AWS Lambda -> Amazon Elasticsearch Service]

Page 44: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Lambda code snippet (node.js) for upload

var AWS = require('aws-sdk');
var creds = new AWS.EnvironmentCredentials('AWS');

function postDocumentToES(doc, context) {
  var req = new AWS.HttpRequest(endpoint);
  // Sign the request with Signature Version 4 for the 'es' service
  var signer = new AWS.Signers.V4(req, 'es');
  signer.addAuthorization(creds, new Date());
  // Send the signed request
  var send = new AWS.NodeHttpClient();
  send.handleRequest(req, null, function(httpResp)...

https://github.com/awslabs/amazon-elasticsearch-lambda-samples
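The `AWS.Signers.V4` call in the snippet hides the Signature Version 4 details. One piece of that process, deriving the signing key from the secret key, is a chain of HMAC-SHA256 operations and can be sketched with the standard library alone. This covers only the key derivation, not the canonical request or the final signature; `derive_signing_key` and the sample secret are hypothetical, and in practice an AWS SDK should do the signing.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str,
                       region: str, service: str = "es") -> bytes:
    """Derive a Signature Version 4 signing key.

    date_stamp is the request date as YYYYMMDD.
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


if __name__ == "__main__":
    # Placeholder secret, for illustration only.
    key = derive_signing_key("EXAMPLE-SECRET", "20151028", "us-east-1")
    print(key.hex())
```

Because the key is scoped to a date, region, and service, a key derived for `es` in one region cannot sign requests for another service or region.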

Page 45: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Export logs to Amazon ES

[Diagram: CloudWatch -> Amazon Elasticsearch Service]

Page 46: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Export CloudWatch Logs

Demo

Page 47: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Monitor and audit

• CloudWatch
• CloudTrail

Page 48: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Monitoring

Page 49: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

What should I monitor?

• FreeStorageSpace – monitor and alarm before the cluster runs out of space

• CPUUtilization – alarm at 80% CPU to signal the need to scale up

• ClusterStatus.yellow – check whether replication requires additional nodes

• JVMMemoryPressure – check instance type and count for sufficient resources

• MasterCPUUtilization – monitoring for master nodes is separated from data nodes
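The checks above can be expressed as a small rule function over the latest metric values. This is a sketch only: the 80% CPU limit comes from the slide, while the JVM memory pressure limit and the free-storage floor are assumed defaults you would tune, and `find_alarms` is a hypothetical helper that does not call CloudWatch.

```python
def find_alarms(metrics: dict, cpu_limit: float = 80.0,
                jvm_limit: float = 80.0,
                min_free_storage: float = 1024.0) -> list:
    """Return the names of metrics that breach their thresholds.

    Assumptions: jvm_limit and min_free_storage are illustrative defaults;
    min_free_storage uses the same unit as the FreeStorageSpace value.
    """
    alarms = []
    if metrics.get("FreeStorageSpace", float("inf")) < min_free_storage:
        alarms.append("FreeStorageSpace")
    if metrics.get("CPUUtilization", 0.0) >= cpu_limit:
        alarms.append("CPUUtilization")
    if metrics.get("ClusterStatus.yellow", 0) >= 1:
        alarms.append("ClusterStatus.yellow")
    if metrics.get("JVMMemoryPressure", 0.0) >= jvm_limit:
        alarms.append("JVMMemoryPressure")
    if metrics.get("MasterCPUUtilization", 0.0) >= cpu_limit:
        alarms.append("MasterCPUUtilization")
    return alarms


if __name__ == "__main__":
    sample = {"CPUUtilization": 85.0, "FreeStorageSpace": 5000.0,
              "ClusterStatus.yellow": 0}
    print(find_alarms(sample))
```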

Page 50: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Snapshot and restore for data durability

Page 51: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Daily automated snapshots

• No additional charges
• Snapshots retained for 14 days

Page 52: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Taking manual snapshots

[Diagram: snapshot repository in Amazon S3, accessed through a role]

Trust relationship:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": { "Service": "es.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}

Page 53: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Taking manual snapshots

[Diagram: snapshot repository in Amazon S3, accessed through a role]

Role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [ "s3:ListBucket" ],
      "Effect": "Allow",
      "Resource": [ "arn:aws:s3:::bucket" ]
    },
    {
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "iam:PassRole"
      ],
      "Effect": "Allow",
      "Resource": [ "arn:aws:s3:::bucket/*" ]
    }
  ]
}

Page 54: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Taking manual snapshots

Register the bucket:

curl -XPUT 'https://<endpoint>/_snapshot/<repo-name>' -d '{
  "type": "s3",
  "settings": { "bucket": "<bucket>", "region": "<region>", "role_arn": "<arn>" }
}'

Take a snapshot:

curl -XPUT 'https://<endpoint>/_snapshot/<repo-name>/snapshot1'

Snapshot time is proportional to size.

Page 55: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Built-in Kibana

Page 56: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Application overview

[Diagram: Application nodes/Logstash forwarders -> Logstash indexer -> Amazon Elasticsearch Service]

Page 57: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Kibana UI

Page 58: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Securing Kibana

• IAM
• Proxy (optional)

Page 59: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

IAM policy for Kibana

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": { "AWS": "*" },
      "Action": [
        "es:ESHttpGet",
        "es:ESHttpPut",
        "es:ESHttpPost",
        "es:ESHttpHead"
      ],
      "Resource": [ "arn:aws:es:us-east-1:####:domain/<domain>/*" ],
      "Condition": {
        "IpAddress": {
          "aws:SourceIp": [ "xx.xx.xx.xx" ]
        }
      }
    }
  ]
}

Page 60: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Pay for what you use

Page 61: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Pay for compute and storage you use

With Amazon Elasticsearch Service, you pay only for the compute and storage resources you use. The AWS Free Tier is available to qualifying customers.

Page 62: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Amazon Elasticsearch Service is publicly available now!

You can use Amazon Elasticsearch Service in these regions:

• us-east-1
• us-west-1
• us-west-2
• eu-west-1
• eu-central-1
• ap-southeast-1
• ap-southeast-2
• ap-northeast-1
• sa-east-1

Page 63: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Wrap up

1. Elasticsearch is a tool for full-text search, analysis, and visualization of time series data that helps you get the most out of your growing data set

2. Amazon Elasticsearch Service makes it easy to deploy and manage an Elasticsearch cluster in the AWS cloud

3. Amazon Elasticsearch Service is a drop-in replacement for your existing Elasticsearch cluster

Page 64: AWS October Webinar Series - Introducing Amazon Elasticsearch Service

Thank you!

aws.amazon.com/elasticsearch-service