Fast Data: Acting on Information
µs are the new ms
What is Fast Data?
• Generated fast
• Actionable fast
Source: Financial Trading
Bids, asks, trades, equities, bonds, derivatives …
Source: Sensors
Any Thing in the Internet of Things
• Temperature
• Location
• Velocity
• Acceleration
• Radiation
• Electricity
Source: Clickstream
You click, someone tracks…
Source: Log Aggregation
• Servers
• Storage
• Applications
• Network
• Security
Capturing the Data
[Diagram labels: Data Sources, IoT Device, AWS IoT, Amazon Kinesis Streams, Amazon Kinesis Firehose, AWS Lambda, Amazon Elasticsearch. Storage tiers: Hot Data (Amazon ElastiCache, Amazon DynamoDB), Longer Retention (Data Lake, Amazon S3), Cold Data (Amazon Glacier).]
In-Memory Key-Value Store
High-performance
Redis and Memcached
Fully managed; Zero admin
Highly Available and Reliable
Hardened by Amazon
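[Figure: AWS data stores compared along four axes: Request Rate (high to low), Latency (low to high), Data Volume (low to high), and Structure (low to high). Stores shown: Amazon ElastiCache, Amazon DynamoDB, Amazon RDS, Amazon CloudSearch & Elasticsearch, HDFS, Amazon S3, Amazon Glacier.]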
Redis – the fast in-memory database
• Powerful: ~200 commands + Lua scripting
• In-memory database
• Utility data structures: strings, lists, hashes, sets, sorted sets, bitmaps & HyperLogLogs
• Simple
• Atomic operations: supports transactions, has ACID properties
• Ridiculously fast! <500 microsecond latency for most commands
• Highly Available: replication
• Persistent: snapshots or append-only log
• Open Source
One Simple Example ….
• On-demand c3.8xlarge
• 3,000,000 objects, 100 bytes each
• memtier benchmark, 50 read : 50 write
• Result: 1,228,432 TPS
Source: http://highscalability.com/blog/2014/8/27/the-12m-opssec-redis-cloud-cluster-single-server-unbenchmar.html
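To put the benchmark numbers in perspective, here is a rough back-of-the-envelope calculation. It uses only the figures quoted above; the variable names are illustrative, and the per-operation figure is an aggregate across all concurrent connections, not the latency a single client would observe.

```python
# Figures quoted on the slide.
OBJECTS = 3_000_000
BYTES_PER_OBJECT = 100
TPS = 1_228_432

# Raw payload size, ignoring keys and any per-object overhead (an assumption).
dataset_bytes = OBJECTS * BYTES_PER_OBJECT

# Average server time budget per operation, in microseconds.
# This is aggregate throughput, not per-request latency.
micros_per_op = 1_000_000 / TPS

print(f"payload: {dataset_bytes / 1e6:.0f} MB")
print(f"aggregate time per operation: {micros_per_op:.3f} µs")
```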
Fully managed service = Automated Operations
[Diagram labels: Redis datastore hosted on Amazon EC2; Amazon ElastiCache for Redis]
ElastiCache for Redis Key Attributes
Extreme Performance
• Sub-millisecond access latencies
• 4.5 million writes / sec
• 20 million reads / sec
Open Source Compatible
• Compatible with open source Redis
• Works with any Redis client
Fully Managed
• Automates node replacement, software patching, upgrades and backups
• CloudWatch automates monitoring of cache performance metrics
Secure and Hardened
• Supports Amazon VPC and IAM for secure and fine-grained access
• Monitors nodes and applies security patches when needed
Highly Available and Scalable
• Clusters of up to 15 shards, each with a primary node and up to 5 replica nodes
• Multi-AZ with rapid automatic failover, with no human intervention or code changes needed
Cost Effective
• Pay as low as US$0.017 per hour
• Get started with AWS Free Tier
• Zero data transfer costs for cross-AZ replication
Amazon ElastiCache
Enhanced Redis Engine on ElastiCache: Hardened by Amazon
• RDB Without Fork: mitigates the risk of increased swap usage during syncs and snapshots.
• Dynamic write throttling: improves output buffer management when the node’s memory is close to being exhausted.
• Smoother failovers: clusters recover faster because replicas avoid flushing their data to do a full re-sync with the primary.
GEO Commands
Store longitude and latitude as members of a Sorted Set, then retrieve them by position, distance, or radius
• GEOADD
• GEODIST
• GEOHASH
• GEOPOS
• GEORADIUS
• GEORADIUSBYMEMBER
Amazon ElastiCache for Redis
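As a rough illustration of what GEODIST and GEORADIUS compute, the following pure-Python sketch measures great-circle distance with the haversine formula and filters members by radius. It is not how Redis is implemented: Redis encodes each position into a sorted-set score, while this sketch does a plain linear scan. The function names, the sample `SICILY` data, and the Earth radius constant are assumptions made for the example.

```python
import math

# Earth radius in metres; assumed here to match the constant Redis uses.
EARTH_RADIUS_M = 6372797.560856

def geodist(a, b):
    """Great-circle distance in metres between two (longitude, latitude) pairs."""
    lon1, lat1 = map(math.radians, a)
    lon2, lat2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

def georadius(points, center, radius_m):
    """Return (name, distance) pairs within radius_m of center, nearest first."""
    hits = [(name, geodist(center, pos)) for name, pos in points.items()]
    return sorted((hit for hit in hits if hit[1] <= radius_m), key=lambda hit: hit[1])

# Hypothetical sample data: member name -> (longitude, latitude).
SICILY = {
    "Palermo": (13.361389, 38.115556),
    "Catania": (15.087269, 37.502669),
}

print(round(geodist(SICILY["Palermo"], SICILY["Catania"])))
print([name for name, _ in georadius(SICILY, (15.0, 37.0), 200_000)])
```

Against a real cluster the same questions would be asked with GEOADD to load the members, then GEODIST and GEORADIUS to query them.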
BITFIELD
• Memory-efficient approach to storing many small integers as a large bitmap
• Treats a Redis string as an array of bits
• GET, SET, or INCRBY any group of bits within a string
• Control increment / decrement behavior with OVERFLOW
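The idea behind BITFIELD can be sketched in plain Python: a byte string is treated as an array of bits, and small unsigned integers are read, written, and incremented at arbitrary bit offsets. This is only an illustration under simplifying assumptions (unsigned fields only, most significant bit first); the helper names `get_bits`, `set_bits`, and `incrby` are invented for the example and are not a Redis client API.

```python
def get_bits(buf, offset, width):
    """Read an unsigned integer of `width` bits starting at bit `offset`."""
    value = 0
    for i in range(offset, offset + width):
        byte = buf[i // 8] if i // 8 < len(buf) else 0  # missing bytes read as zero
        value = (value << 1) | ((byte >> (7 - i % 8)) & 1)
    return value

def set_bits(buf, offset, width, value):
    """Write `value` into `width` bits at bit `offset`, growing buf if needed."""
    needed = (offset + width + 7) // 8
    if len(buf) < needed:
        buf.extend(bytes(needed - len(buf)))
    for k in range(width):
        i = offset + k
        mask = 1 << (7 - i % 8)
        if (value >> (width - 1 - k)) & 1:
            buf[i // 8] |= mask
        else:
            buf[i // 8] &= ~mask & 0xFF

def incrby(buf, offset, width, delta, overflow="WRAP"):
    """Increment a field; overflow is 'WRAP', 'SAT', or 'FAIL' (returns None)."""
    limit = (1 << width) - 1
    new = get_bits(buf, offset, width) + delta
    if overflow == "WRAP":
        new %= limit + 1
    elif overflow == "SAT":
        new = max(0, min(limit, new))
    elif not 0 <= new <= limit:  # FAIL: leave the field untouched
        return None
    set_bits(buf, offset, width, new)
    return new

buf = bytearray()
wrapping = [incrby(buf, 100, 2, 1, "WRAP") for _ in range(4)]
saturating = [incrby(buf, 102, 2, 1, "SAT") for _ in range(4)]
print(wrapping, saturating, len(buf))
```

Two 2-bit counters live side by side at bit offsets 100 and 102; the whole structure occupies 13 bytes, which is where the memory efficiency comes from.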
Clusters!
• Enables Horizontal Scaling
• In ElastiCache for Redis 3.2:
  • a Cluster is one or more Shards
  • a Shard is a primary node plus up to 5 replica nodes
  • Up to 15 Shards per Cluster
  • 15 r3.8xlarge = 3.5 TiB!
• Cluster-level backup and restore
[Diagram: a Client connected to a Cluster of five Shards, S1 to S5]
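The 3.5 TiB figure can be checked with a quick calculation. The per-node memory below (237 GiB for a cache.r3.8xlarge node) is an assumption that does not appear on the slide; substitute the value for whichever node type is actually used.

```python
MAX_SHARDS = 15            # from the slide: up to 15 Shards per Cluster
NODE_MEMORY_GIB = 237      # assumed usable memory of one cache.r3.8xlarge node

# Only primaries add capacity; replicas hold copies of the same data.
cluster_gib = MAX_SHARDS * NODE_MEMORY_GIB
cluster_tib = cluster_gib / 1024

print(f"{cluster_gib} GiB = {cluster_tib:.2f} TiB")
```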
Autosharding
• 16384 hash slots per Cluster
• Slot for a key is CRC16(key) modulo 16384
• Slots are distributed across the Cluster into Shards
• Developers must use a Redis cluster client!
• Clients are redirected to the correct shard
• Smart clients store a map
[Diagram: a Client connected to a Cluster of five Shards, S1 to S5]
Shard S1 = slots 0 – 3276
Shard S2 = slots 3277 – 6553
Shard S3 = slots 6554 – 9829
Shard S4 = slots 9830 – 13106
Shard S5 = slots 13107 – 16383
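A minimal sketch of what a "smart client" does with this map: compute the hash slot for a key, then look up which shard owns that slot. Redis Cluster takes the CRC16 (XMODEM variant) of the key modulo 16384, and if the key contains a non-empty `{...}` hash tag, only the tag is hashed. The shard boundaries are the ones listed above; the function and constant names are made up for the example.

```python
from bisect import bisect_right

def crc16_xmodem(data):
    """CRC16-CCITT, XMODEM variant: polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def hash_tag(key):
    """Return the part of the key that is hashed (the {tag} if one is present)."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end > start + 1:  # the tag must be non-empty
            return key[start + 1:end]
    return key

def key_slot(key):
    """Hash slot (0..16383) for a string key."""
    return crc16_xmodem(hash_tag(key.encode())) % 16384

# First slot of each shard, taken from the five-shard layout above.
SHARD_FIRST_SLOTS = [0, 3277, 6554, 9830, 13107]
SHARD_NAMES = ["S1", "S2", "S3", "S4", "S5"]

def shard_for_slot(slot):
    """Find the shard whose slot range contains `slot`."""
    return SHARD_NAMES[bisect_right(SHARD_FIRST_SLOTS, slot) - 1]

for key in ("foo", "hello", "{user1000}.following", "{user1000}.followers"):
    slot = key_slot(key)
    print(key, slot, shard_for_slot(slot))
```

The key `foo` hashes to slot 12182, which falls in S4 in this layout. The two `{user1000}` keys share a hash tag, so they land in the same slot and can be used together in multi-key operations.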
Faster Recovery
• A Cluster consists of 1 – 15 Shards
• Each Shard has a primary node and up to 5 replica nodes
• In the event of a primary node outage, a replica node in that Shard will be promoted
[Diagram: three Shards, each with a primary (S1a, S2a, S3a) and two replicas (S1b/S1c, S2b/S2c, S3b/S3c); the failure illustration involves nodes S2a and S2c.]
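The promotion step can be modelled with a small simulation. This is a toy sketch, not ElastiCache's actual failover logic: it assumes the replica that has applied the most of the primary's write stream (the highest replication offset) is the one promoted, and the `Node` and `Shard` classes are invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    repl_offset: int = 0  # how much of the primary's write stream was applied

@dataclass
class Shard:
    primary: Node
    replicas: list = field(default_factory=list)

    def fail_over(self):
        """Promote the most up-to-date replica after the primary is lost."""
        if not self.replicas:
            raise RuntimeError("no replica available to promote")
        # Assumed policy: prefer the replica with the least replication lag.
        best = max(self.replicas, key=lambda node: node.repl_offset)
        self.replicas.remove(best)
        self.primary = best
        return best

# Node names follow the diagram; the offsets are made-up values.
shard = Shard(Node("S2a", 100), [Node("S2b", 90), Node("S2c", 100)])
promoted = shard.fail_over()
print(promoted.name, [node.name for node in shard.replicas])
```

Because the promoted replica already holds the data, the shard keeps serving its slots without a full re-sync, which is what makes recovery fast.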
Open Source Distributed Index
Managed Service using Elasticsearch and Kibana
Fully managed; Zero admin
Highly Available and Reliable
RESTful API for easy integration
Amazon Elasticsearch Service
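As one example of the REST integration, log records are commonly sent to Elasticsearch through its `_bulk` endpoint, which accepts newline-delimited JSON: an action line followed by the document itself. The sketch below only builds that request body; it does not send it. The index name `logs` and the document fields are hypothetical, request signing and the domain endpoint are left out, and older Elasticsearch versions also expect a `_type` in the action line.

```python
import json

def build_bulk_body(index, docs):
    """Build a newline-delimited JSON body for the Elasticsearch _bulk API."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                            # document source
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"

# Hypothetical log records.
records = [
    {"host": "web-1", "status": 200, "path": "/"},
    {"host": "web-2", "status": 500, "path": "/checkout"},
]

body = build_bulk_body("logs", records)
print(body)
```

The resulting string would be sent as the body of a POST to the domain's `_bulk` path with an NDJSON content type.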
Integration with the AWS ecosystem
[Diagram: components shown alongside Amazon Elasticsearch Service: Logstash (REST), CWL Agent on EC2 Instances, Amazon Kinesis, Amazon Kinesis Firehose, Amazon RDS, Amazon DynamoDB, Amazon SQS Queue with a Logstash Cluster, Amazon CloudWatch, AWS Lambda, AWS CloudTrail, Access Logs, Amazon VPC Flow Logs, Amazon S3 bucket, AWS IoT, Amazon ECS.]
Using Fast Data: Finance
• Fraud Detection
• Regulatory Compliance
Using Fast Data: Retail
• Recommendations
Using Fast Data: Utilities
• Incident Response
Using Fast Data: Health Care
• Drug Interactions
What’s Next?
How will Fast Data change your industry?
Featured Tools:
• Amazon ElastiCache for Redis
• Amazon Elasticsearch Service
Darin Briskman ~ [email protected]