Log Analytics with Amazon Elasticsearch Service Christoph Schmitter ([email protected])

aws-de-media.s3.amazonaws.com/images/_Munich_Loft_Slides/Elast…



What we'll cover

• Understanding Elasticsearch capabilities
• Elasticsearch, the technology
• Aggregations; ad-hoc analysis
• Amazon Elasticsearch Service is a drop-in replacement for self-managed Elasticsearch
• Q&A


Understanding Elasticsearch capabilities


Scenario: Log data analytics

• Application monitoring and event diagnosis

• You need to monitor the performance of your application, web servers, and hardware

• You need easy-to-use yet powerful data visualization tools to detect issues in near real time

• You want the ability to dig into your logs in an intuitive, fine-grained way

• Kibana provides fast, easy visualization


Scenario: Batch data analytics

• Reporting and Analysis

• You are a mobile app developer

• You have to monitor/manage users across multiple app versions

• You want to analyze and report on usage and migration between app versions

• Use Kibana for dashboarding. Use the query API for deeper analysis


Scenario: Full-text search

• Traditional search

• Your application or website provides search capabilities over diverse documents

• You are tasked with making this knowledge base searchable and accessible

• You need key search features including text matching, faceting, filtering, fuzzy search, auto complete, and highlighting

• Use the query API to support application search


CloudTrail delivers API calls to you

• AWS API call monitoring

• You need to understand the changing landscape of your AWS resources

• You need to do security analysis and compliance auditing

• You want the ability to dig into your logs in an intuitive, fine-grained way


How Elasticsearch can help

• Combined with Kibana, Elasticsearch provides a tool for search, real-time analytics, and data visualization


Demo Architecture

Diagram components: AWS Resources, CloudTrail, Logs, Amazon CloudWatch Logs, Amazon Elasticsearch Service


Log lines


Demo: Log Analytics


Elasticsearch, the technology


Elasticsearch is like a database

Search      Database
Value       Value
Field       Column
Document    Row
Index       Table
Cluster     Database
Queries     SQL


Documents are the core entity

ID, F1 Value, F2 Value

{
  "eventVersion": "1.03",
  "eventTime": "2016-06-01T00:16:19Z",
  "eventSource": "dynamodb.amazonaws.com",
  "eventName": "DescribeStream",
  "awsRegion": "eu-west-1",
  "sourceIPAddress": "52.51.24.XX",
  "userAgent": "leb-kcl-580935a6-5f94-4ce0-ac69-cdeb609ba16a,amazon-kinesis-client-library-java-lambda_1.2.1, aws-internal/3",
  "requestParameters": {
    "streamArn": "arn:aws:dynamodb:eu-west-1:17816119XXXX:table/restaurant/stream/2016-04-08T18:07:53.837"
  },
  "responseElements": null,
  "requestID": "KC608PH8POAF2I184E2SL1PS2FVV4KQNSO5AEMVJF66Q9ASUAAJG",
  "eventID": "49b56379-903b-4f04-8ce5-d21bbfcf8ab3",
  "eventType": "AwsApiCall",
  "apiVersion": "2012-08-10",
  "recipientAccountId": "17816119XXXX",
  "userIdentity": {
    "type": "AssumedRole",
    "principalId": "AROAJBQVRM7LN25CAHX7Y:awslambda_338_20160531233813522",
    "arn": "arn:aws:sts::178161197791:assumed-role/geospatial-rec-engine-ApplicationExecutionRole-9LPKB77QMR97/awslambda_338_20160531233813522", ...


Lucene provides text analysis and indexing

Term ID  Term   Postings
0        quick  1,3,5
1        brown  2,3,4,6
2        fox    1,7,9
3        lazy   2,8
4        dog    24

Diagram labels: IndexWriter, IndexSearcher, Segment
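The term-to-postings table can be illustrated with a minimal Python sketch of an inverted index. This is not Lucene's implementation: whitespace splitting and lowercasing stand in for real text analysis, and the three documents are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of document IDs containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        # Whitespace tokenization is a stand-in for a real analyzer.
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


# Hypothetical documents, chosen only for illustration.
docs = {
    1: "quick fox",
    2: "brown lazy dog",
    3: "quick brown",
}
index = build_inverted_index(docs)
print(index["quick"])  # [1, 3]
print(index["brown"])  # [2, 3]
```

Looking up a term is then a dictionary access that returns its postings list, which is what makes term queries fast.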


Elasticsearch query processing

Query

Index Lookup (terms: quick, brown, fox, lazy; lorem, ipsum, dolor, sit)

Matches: id: 216, id: 305, id: 486, id: 713

Query logic and post-filtering, scoring, aggs

Sorted matches (results): id: 713, id: 305, id: 486, id: 216
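The lookup, match, score, and sort stages can be sketched in a few lines of Python. The scoring here (number of matched query terms) is a deliberately simple stand-in, not Elasticsearch's relevance formula, and the postings lists are hypothetical.

```python
def search(index, query):
    """Look up each query term, score documents by matched-term count, sort by score."""
    scores = {}
    for term in query.lower().split():
        for doc_id in index.get(term, []):
            scores[doc_id] = scores.get(doc_id, 0) + 1
    # Highest score first; ties broken by document ID for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical postings lists keyed by term.
index = {
    "quick": [216, 713],
    "brown": [305, 713],
    "fox": [486, 713],
    "lazy": [305],
}
results = search(index, "quick brown fox")
print(results)  # [(713, 3), (216, 1), (305, 1), (486, 1)]
```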


Aggregations; ad-hoc analysis


Faceting: basic aggregation

• Query: shirt

Facets
Carhartt (1092)
Russell Athletic (1087)
Dickies (954)
RALPH LAUREN (823)
Wrangler (701)
Doublju (259)
Levi's (12)

ID, F1 Value, F2 Value


Elasticsearch Aggregations

• Buckets – a collection of documents meeting some criterion

• Metrics – calculations on the content of buckets.

Bucket: time

Metric: count
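To make the bucket/metric split concrete, here is a small Python sketch that mimics a time bucket with a count metric in plain code. It does not call Elasticsearch; the events are hypothetical, and the hour-sized bucket is an assumption.

```python
from collections import Counter

# Hypothetical CloudTrail-style events; only eventTime is used here.
events = [
    {"eventTime": "2016-06-01T00:16:19Z"},
    {"eventTime": "2016-06-01T00:47:02Z"},
    {"eventTime": "2016-06-01T01:05:40Z"},
]

# Bucket: truncate each timestamp to the hour. Metric: count per bucket.
counts = Counter(event["eventTime"][:13] for event in events)
for bucket, count in sorted(counts.items()):
    print(bucket, count)
```

Each distinct hour is a bucket, and the count is the metric computed over the documents that fall into it.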


A more complicated aggregation

Bucket: ARN
Bucket: Region
Bucket: eventName
Metric: Count
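A nested aggregation like this one could be expressed as a search request body with terms aggregations inside one another. The sketch below only builds the JSON; the field names are assumptions based on the CloudTrail record shown earlier, and the aggregation names (by_...) are hypothetical. Each terms bucket reports a document count, which serves as the Count metric.

```python
import json


def nested_terms_aggs(fields):
    """Build nested terms aggregations, outermost field first."""
    body = None
    for field in reversed(fields):
        agg = {"terms": {"field": field}}
        if body is not None:
            agg["aggs"] = body
        body = {"by_" + field.replace(".", "_"): agg}
    # size 0 asks for aggregation results only, without individual hits.
    return {"size": 0, "aggs": body}


# Field names are assumptions based on the CloudTrail record shown earlier.
request = nested_terms_aggs(["userIdentity.arn", "awsRegion", "eventName"])
print(json.dumps(request, indent=2))
```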


More kinds of aggregations

Buckets
• Date histogram
• Histogram
• Range
• Terms
• Filters
• Significant terms

Metrics
• Count
• Average
• Sum
• Min
• Max
• Std. Dev
• Unique Count
• Percentiles


Setting up your cluster


Shards: independent collections of documents

Index/Type: Shard 1, Shard 2, Shard 3, Shard 4

Documents: Id, Id, Id, . . .


Deployment of indices to a cluster

• Index 1
– Shard 1
– Shard 2
– Shard 3

• Index 2
– Shard 1
– Shard 2
– Shard 3

Amazon ES cluster (primary and replica shards of both indices)

Instance 1, Master: shards 1 and 3
Instance 2: shards 2 and 1
Instance 3: shards 3 and 2


Determining storage

• Data:Index ratio is typically close to 1:1
• Add a replica, double the storage
• Figure out data node count based on storage
– Current limits; 10T EBS, 32T instance store
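The sizing rules above amount to simple arithmetic, sketched here in Python. The function names, the workload figures, and the per-node storage value are hypothetical, and the sketch ignores any headroom for operating overhead.

```python
import math


def required_storage_gb(source_gb, replicas=1, index_ratio=1.0):
    """Index size is about source size times the ratio; each replica adds a full copy."""
    return source_gb * index_ratio * (1 + replicas)


def data_node_count(total_gb, storage_per_node_gb):
    """Smallest number of data nodes whose combined storage holds the total."""
    return math.ceil(total_gb / storage_per_node_gb)


# Hypothetical workload: 2,000 GB of source data, one replica, 512 GB per node.
total = required_storage_gb(2000, replicas=1)
print(total)                        # 4000.0
print(data_node_count(total, 512))  # 8
```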


Determining instance type

• Instance type is workload-dependent
• T2; dev, test, QA
• M3; solid performance
• R3; heavier queries, aggs
• I2; largest storage option


Best practices

• Use the minimum number of shards that keeps each shard at or below 50 GB of data
• Number of replicas = 1
• For all prod workloads: use 3 dedicated masters
• Use the _bulk API. Some ingest mechanisms do this automatically
• Increase index.refresh_interval for higher throughput
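The shard-count guideline can be written as a one-line calculation. This is a minimal sketch; the function name and the example index sizes are hypothetical.

```python
import math

MAX_SHARD_GB = 50  # soft limit per shard, from the best practices above


def primary_shard_count(index_size_gb, max_shard_gb=MAX_SHARD_GB):
    """Minimum number of primary shards that keeps each shard within the limit."""
    return max(1, math.ceil(index_size_gb / max_shard_gb))


print(primary_shard_count(120))  # 3
print(primary_shard_count(30))   # 1
```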


Indexing strategy


Indexing strategy for streaming data

• Use an index per time period, typically index-per-day; high-volume workloads can go to index-per-hour

• Shard the index according to data size; use 50GB as a soft limit per shard

• Master nodes increase cluster stability
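An index-per-period scheme usually comes down to deriving the index name from the event timestamp. The sketch below assumes a "cwl" prefix and a dotted date suffix; both are illustrative choices, not something the service requires.

```python
from datetime import datetime, timezone


def index_name(prefix, timestamp, hourly=False):
    """Return the time-based index name for an event timestamp (UTC)."""
    pattern = "%Y.%m.%d.%H" if hourly else "%Y.%m.%d"
    return f"{prefix}-{timestamp.astimezone(timezone.utc).strftime(pattern)}"


# The "cwl" prefix and the dotted date format are illustrative assumptions.
event_time = datetime(2016, 6, 1, 0, 16, 19, tzinfo=timezone.utc)
print(index_name("cwl", event_time))               # cwl-2016.06.01
print(index_name("cwl", event_time, hourly=True))  # cwl-2016.06.01.00
```

With names like these, dropping old data is a matter of deleting whole indices rather than individual documents.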


Index settings control sharding and more

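# number_of_shards is fixed when the index is created, so set it at creation time;
# number_of_replicas and refresh_interval can also be changed later via <index>/_settings.
curl -XPUT <endpoint>/<index> -d '{
  "settings" : {
    "number_of_shards" : 5,
    "number_of_replicas" : 1,
    "refresh_interval" : "5s"
  }
}'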


Mappings control how data is indexed

curl -XPUT <endpoint>/<index> -d '{
  "mappings" : {
    "<type>" : {
      "properties" : {
        "eventName" : { "type" : "string", "index" : "not_analyzed" }
      }
    }
  }
}'


Index templates simplify mapping creation

curl -XPUT <endpoint>/_template/<name> -d '{
  "template" : "<wildcard e.g. cwl-*>",
  "settings" : { "number_of_shards" : 2 },
  "mappings" : {
    "<type, e.g. _default_>" : {
      "dynamic_templates" : [ {
        "<name>" : {
          "match_mapping_type" : "string",
          "mapping" : { "type" : "string", "index" : "not_analyzed" }
        }
      } ],
      "properties" : {
        "@timestamp" : { "type" : "date" }
      }
    }
  }
}'


Don't forget the query API!


Direct access to the Elasticsearch API

$ curl -XPUT https://<endpoint>/blog -d '{
    "settings" : { "number_of_shards" : 3, "number_of_replicas" : 1 } }'

$ curl -XPOST https://<endpoint>/blog/post/1 -d '{
    "author":"jon handler",
    "title":"Amazon ES Launch" }'

$ curl -XPOST https://<endpoint>/blog/post/_bulk -d '
{ "index" : { "_index" : "blog", "_type" : "post", "_id" : "2"}}
{"title":"Amazon ES for search", "author": "carl meadows"}
{ "index" : { "_index":"blog", "_type":"post", "_id":"3" } }
{ "title":"Analytics too", "author": "vivek sriram"}
'

$ curl -XGET 'https://<endpoint>/_search?q=ES'
{"took":16,"timed_out":false,
 "_shards":{"total":3,"successful":3,"failed":0},
 "hits":{"total":2,"max_score":0.13424811,"hits":[
   {"_index":"blog","_type":"post","_id":"1","_score":0.13424811,
    "_source":{"author":"jon handler", "title":"Amazon ES Launch" }},
   {"_index":"blog","_type":"post","_id":"2","_score":0.11506981,
    "_source":{"title":"Amazon ES for search", "author": "carl meadows"}}]}}
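The _bulk request body is newline-delimited JSON: one action line followed by one document line, with no commas between lines and a trailing newline at the end. A minimal Python sketch for producing such a body follows; the helper name bulk_body is hypothetical, and the sketch only builds the string without sending it.

```python
import json


def bulk_body(index, doc_type, docs):
    """Serialize (doc_id, document) pairs into newline-delimited _bulk index actions."""
    lines = []
    for doc_id, doc in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The _bulk API expects a trailing newline after the last line.
    return "\n".join(lines) + "\n"


body = bulk_body("blog", "post", [
    ("2", {"title": "Amazon ES for search", "author": "carl meadows"}),
    ("3", {"title": "Analytics too", "author": "vivek sriram"}),
])
print(body, end="")
```

Generating the body programmatically avoids hand-editing mistakes such as stray commas between lines.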


Elasticsearch is a full-featured search engine

• Built on Lucene, the popular, open-source library
• Search structured and unstructured data with complex, boolean queries
• Supports common search features: geo search, aggregations, highlighting, search suggestions, and more


Challenges with self-managed Elasticsearch

• Easy to get started, challenging to scale
• Scaling ingest pipelines is difficult
• Undifferentiated heavy lifting


Amazon Elasticsearch Service


Amazon ES overview

Diagram components: Amazon Route 53, Elastic Load Balancing, IAM, CloudWatch, Elasticsearch API, CloudTrail


Easy cluster configuration and reconfiguration

• Elasticsearch Version
• Data nodes, count and type
• Master nodes, count and type
• Storage option – EBS/instance
• HA option
• Advanced options


High availability with Zone Awareness

Diagram: an Amazon ES cluster of four instances (Instance 1 to Instance 4) split across Availability Zone 1 and Availability Zone 2, with copies of shards 1, 2, and 3 distributed across the instances.


Monitor with CloudWatch metrics

• FreeStorageSpace – monitor and alarm before the cluster runs out of space

• CPUUtilization – alarm at 80% CPU to signal the need to scale up

• ClusterStatus.yellow – check whether replication requires additional nodes

• JVMMemoryPressure – check instance type and count for sufficient resources

• MasterCPUUtilization – monitoring for master nodes is separated from data nodes
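As one way to act on these metrics, the sketch below builds the parameters for a CloudWatch alarm on CPUUtilization at 80%. It only constructs the dictionary; the alarm name, period, and evaluation settings are assumptions, as are the domain name and account ID. Amazon ES publishes its metrics in the AWS/ES namespace with DomainName and ClientId dimensions.

```python
def cpu_alarm_params(domain_name, account_id, threshold=80.0):
    """Build keyword arguments for CloudWatch's PutMetricAlarm on domain CPU."""
    return {
        "AlarmName": f"{domain_name}-cpu-high",  # hypothetical naming scheme
        "Namespace": "AWS/ES",
        "MetricName": "CPUUtilization",
        "Dimensions": [
            {"Name": "DomainName", "Value": domain_name},
            {"Name": "ClientId", "Value": account_id},
        ],
        "Statistic": "Average",
        "Period": 300,            # seconds; an assumed evaluation window
        "EvaluationPeriods": 3,   # assumed: three consecutive periods
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    }


# Hypothetical domain name and account ID.
params = cpu_alarm_params("logs-domain", "123456789012")
print(params["MetricName"], params["Threshold"])
```

With boto3 installed and credentials configured, these parameters could be passed to the CloudWatch client as put_metric_alarm(**params); an alarm on FreeStorageSpace would follow the same shape with a lower-bound comparison.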


Integration with the AWS ecosystem

Diagram components: Logstash, REST, CWL Agent, EC2 Instances, Amazon Kinesis, Amazon RDS, Amazon DynamoDB, Amazon SQS Queue, Logstash Cluster, Amazon Elasticsearch Service, Amazon CloudWatch, AWS Lambda, AWS CloudTrail, Access Logs, Amazon VPC Flow Logs, Amazon S3 bucket, AWS IoT, Amazon Kinesis Firehose, Amazon ECS


Security with IAM

{
  "Version": "2012-10-17",
  "Statement": [ {
    "Sid": "",
    "Effect": "Allow",
    "Principal": { "AWS": "arn:aws:iam::123456789012:user/susan" },
    "Action": [ "es:ESHttpGet", "es:ESHttpPut", "es:ESHttpPost",
                "es:CreateElasticsearchDomain", "es:ListDomainNames" ],
    "Resource": "arn:aws:es:us-east-1:###:domain/logs-domain/<index>/*"
  } ]
}


Pay for compute and storage you use

• With Amazon Elasticsearch Service, you pay only for the compute and storage resources you use. The AWS Free Tier is available for qualifying customers.


Wrap up

• Combined with Kibana, Elasticsearch provides search and visualization for streaming data and full-text use cases.

• Elasticsearch is based on Lucene, which reads and writes search indices

• Aggregations allow you to analyze your data, splitting into Buckets and computing Metrics

• Amazon Elasticsearch Service makes it easy to set up and manage your Elasticsearch cluster on AWS

• Amazon ES is a great way to get started with Elasticsearch!


Q&A

• Christoph Schmitter: [email protected] Architect

• https://run.qwiklab.com/searches/elasticsearch


Demo Screenshots
