© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Diego Macadar - Michael Fort
December 1, 2016
From 0 to 100M Records in 1 Second
AWS Metering
ARC308
Three principles
• 100% accurate
    • Once and only once guarantee
    • Idempotent processing
• Horizontally scalable
    • Loosely coupled components
    • Elasticity: Automated scaling
• Focus on the business
    • Operationally excellent
    • Use managed frameworks
Global data
Unstructured data: Amazon S3
vs
Structured data: Amazon DynamoDB, Amazon RDS
• Must be immutable
• Avoid performance bottlenecks by using storage best practices
• Monitoring with Amazon CloudWatch
• Secure data using versioning and encryption
Global data example
{ "clientId": "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6",
  "timestamp": 1476477276000,
  "eventType": "energyUsage",
  "socketIdentifier": "dac06b790cb5b0856437b3efa92bd891",
  "lightIdentifier": "000000000001",
  "value": "23" }
{ "clientId": "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6",
  "timestamp": 1476477276000,
  "eventType": "lumens",
  "socketIdentifier": "dac06b790cb5b0856437b3efa92bd891",
  "lightIdentifier": "000000000001",
  "value": "300" }
{ "clientId": "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6",
  "timestamp": 1476477276000,
  "eventType": "outage",
  "socketIdentifier": "dac06b790cb5b0856437b3efa92bd8f3",
  "lightIdentifier": "000000000001",
  "value": "1" }
Local data
Diagram: a server with a local store, alongside Amazon S3, Amazon DynamoDB, and Amazon RDS in the AWS Cloud.
• Can be mutable
• Cache data locally to speed up processing
• Invalidate local data once processed
• Persist all long-term data in globally accessible cloud store
Local data example
{ "clientId": "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6",
  "timestamp": 1476477276000,
  "eventType": "energyUsage",
  "socketIdentifier": "dac06b790cb5b0856437b3efa92bd891",
  "lightIdentifier": "000000000001",
  "value": "23" }
{ "clientId": "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6",
  "timestamp": 1476477276000,
  "eventType": "lumens",
  "socketIdentifier": "dac06b790cb5b0856437b3efa92bd891",
  "lightIdentifier": "000000000001",
  "value": "300" }
{ "clientId": "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6",
  "timestamp": 1476477276000,
  "eventType": "outage",
  "socketIdentifier": "dac06b790cb5b0856437b3efa92bd8f3",
  "lightIdentifier": "000000000001",
  "value": "1" }
Transform
{ "clientId": "bestHotel",
  "timestamp": "10/14/2016",
  "eventType": "energyUsage",
  "socketIdentifier": "dac06b790cb5b0856437b3efa92bd891",
  "lightIdentifier": "LED_A1",
  "value": "23" }
{ "clientId": "bestHotel",
  "timestamp": "10/14/2016",
  "eventType": "lumens",
  "socketIdentifier": "dac06b790cb5b0856437b3efa92bd891",
  "lightIdentifier": "LED_A1",
  "value": "300" }
{ "clientId": "bestHotel",
  "timestamp": "10/14/2016",
  "eventType": "outage",
  "lightIdentifier": "LED_A1",
  "value": "1" }
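The step from the raw records to the transformed ones can be sketched roughly as follows. This is only an illustration: the lookup tables `CLIENT_NAMES` and `LIGHT_NAMES`, the function name `transform`, the UTC date conversion, and the rule that drops `socketIdentifier` from outage events are all assumptions read off the example records, not something the slides specify.

```python
from datetime import datetime, timezone

# Hypothetical lookup tables; the slides do not say where these mappings live.
CLIENT_NAMES = {"a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6": "bestHotel"}
LIGHT_NAMES = {"000000000001": "LED_A1"}


def transform(record):
    """Return a transformed copy of a raw record; the input is left untouched."""
    out = dict(record)
    out["clientId"] = CLIENT_NAMES[record["clientId"]]
    out["lightIdentifier"] = LIGHT_NAMES[record["lightIdentifier"]]
    # Epoch milliseconds to a calendar date (UTC is an assumption).
    moment = datetime.fromtimestamp(record["timestamp"] / 1000, tz=timezone.utc)
    out["timestamp"] = moment.strftime("%m/%d/%Y")
    # Assumption: outage events are filtered down to the light, as in the example.
    if record["eventType"] == "outage":
        out.pop("socketIdentifier", None)
    return out


raw_outage = {
    "clientId": "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6",
    "timestamp": 1476477276000,
    "eventType": "outage",
    "socketIdentifier": "dac06b790cb5b0856437b3efa92bd8f3",
    "lightIdentifier": "000000000001",
    "value": "1",
}
transformed_outage = transform(raw_outage)
```

Returning a copy rather than editing in place keeps the raw record intact, in line with the rule that global data must be immutable.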
Architecture
Diagram: pipeline stages Collect, Transform, Aggregate, Analyze, and Delivery, plus Audit; Global State and Global Store backed by Amazon DynamoDB and Amazon S3.
Channel selection
Channel attributes    Amazon SQS       Amazon Kinesis
Order                 Not ordered      Ordered
Locality              Not localized    Localized
Delivery              At-least-once    At-least-once
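Because both channels deliver at-least-once, a consumer may see the same message more than once, so processing has to be idempotent to stay accurate. A minimal sketch follows, assuming each message carries a unique `recordId` field; that field and the `IdempotentConsumer` class are hypothetical and do not appear in the slides, and a real system would keep the set of seen IDs in a durable store rather than in memory.

```python
class IdempotentConsumer:
    """Applies each message at most once, even if it is delivered repeatedly."""

    def __init__(self):
        self._seen = set()   # IDs of messages already applied (in-memory stand-in)
        self.totals = {}     # eventType -> running sum of values

    def handle(self, message):
        record_id = message["recordId"]  # hypothetical unique ID per message
        if record_id in self._seen:
            return False  # duplicate delivery, ignore
        self._seen.add(record_id)
        key = message["eventType"]
        self.totals[key] = self.totals.get(key, 0) + int(message["value"])
        return True


consumer = IdempotentConsumer()
message = {"recordId": "r-1", "eventType": "outage", "value": "1"}
first = consumer.handle(message)
second = consumer.handle(message)  # redelivery of the same message
```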
Amazon Kinesis hotspot management
Producer → PutRecord → Amazon Kinesis Stream → GetRecords → Consumer
Partition Key = Hotel ID + Entropy
Entropy = MD5 % Partition Key Size
CONSTRAINTS:
• Hash function calculation is idempotent
• Partition Key Size changes are time versioned
• Partition Key Size is selected based on time information of the entity
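One way to read the partition key scheme above is sketched below. The slide does not say what the MD5 is computed over, so hashing a per-record identifier (`record_id`) is an assumption, as are the `SIZE_VERSIONS` table, its values, and the `hotelId-entropy` key layout.

```python
import hashlib
from bisect import bisect_right

# Hypothetical time-versioned sizes: (effective_from_ms, partition_key_size),
# sorted by time. A size applies from its start time until the next entry.
SIZE_VERSIONS = [(0, 1), (1476000000000, 4)]


def partition_key_size(timestamp_ms):
    """Select the size that was in effect at the record's own timestamp."""
    starts = [start for start, _ in SIZE_VERSIONS]
    return SIZE_VERSIONS[bisect_right(starts, timestamp_ms) - 1][1]


def partition_key(hotel_id, record_id, timestamp_ms):
    """Entropy = MD5 % Partition Key Size, appended to the hotel ID."""
    digest = hashlib.md5(record_id.encode("utf-8")).hexdigest()
    entropy = int(digest, 16) % partition_key_size(timestamp_ms)
    return f"{hotel_id}-{entropy}"


key_a = partition_key("bestHotel", "dac06b790cb5b0856437b3efa92bd891", 1476477276000)
key_b = partition_key("bestHotel", "dac06b790cb5b0856437b3efa92bd891", 1476477276000)
```

Because the size is chosen from the record's timestamp rather than from the current time, recomputing the key for an old record gives the same answer after the size has been changed, which is what keeps the calculation idempotent.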
Amazon Kinesis hotspot management
Producer → PutRecord → Amazon Kinesis Stream → GetRecords → Consumer
AWS Cloud: Amazon S3, Amazon DynamoDB, Amazon RDS
Diagram labels:
• Capture stream IO statistics
• Hotspot Manager reads stream IO statistics
• DescribeStream, SplitShard, MergeShards
• Read partition information
• Update partition information
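The decision a hotspot manager makes from stream IO statistics might look something like the sketch below. It only plans actions and does not call AWS; the function name `plan_resharding`, the utilization thresholds, and the input shape (shards listed in hash-key order with a utilization between 0 and 1) are all assumptions for illustration.

```python
def plan_resharding(shards, high=0.8, low=0.2):
    """Plan SplitShard / MergeShards actions from per-shard utilization.

    `shards` is a list of (shard_id, utilization) pairs in hash-key order,
    so neighbouring entries are adjacent shards and therefore mergeable.
    """
    actions = []
    i = 0
    while i < len(shards):
        shard_id, load = shards[i]
        if load > high:
            # Hot shard: split it to spread the load.
            actions.append(("SplitShard", shard_id))
            i += 1
        elif load < low and i + 1 < len(shards) and shards[i + 1][1] < low:
            # Two cold neighbours: merge them and skip past both.
            actions.append(("MergeShards", shard_id, shards[i + 1][0]))
            i += 2
        else:
            i += 1
    return actions


plan = plan_resharding([("s1", 0.9), ("s2", 0.1), ("s3", 0.1), ("s4", 0.5)])
```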
Local state
Diagram: a server with a local cache, alongside Amazon S3, Amazon DynamoDB, and Amazon RDS in the AWS Cloud.
• Cache state locally when it has a Write-Once-Read-Many (WORM) characteristic
• Validate the state cache against the global store as often as possible
• Read state that changes often directly from the global store
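The caching rules above can be sketched as a local cache that checks a version number in the global store before trusting its own copy. Both classes here are hypothetical in-memory stand-ins: `GlobalStore` plays the role of a store such as Amazon DynamoDB, and the per-key version counter is an assumption about how validation could work.

```python
class GlobalStore:
    """In-memory stand-in for the global store; every write bumps a version."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def put(self, key, value):
        version = self._data.get(key, (0, None))[0] + 1
        self._data[key] = (version, value)

    def get(self, key):
        return self._data[key]

    def version(self, key):
        return self._data[key][0]


class LocalStateCache:
    """Serves state locally, but only while it still matches the global version."""

    def __init__(self, store):
        self._store = store
        self._cache = {}  # key -> (version, value)

    def get(self, key):
        cached = self._cache.get(key)
        if cached is not None and cached[0] == self._store.version(key):
            return cached[1]  # cache validated against the global store
        entry = self._store.get(key)  # stale or missing: reload from global store
        self._cache[key] = entry
        return entry[1]


store = GlobalStore()
store.put("rates", "v1")
cache = LocalStateCache(store)
before = cache.get("rates")
store.put("rates", "v2")
after = cache.get("rates")
```

The version check is a cheap read compared with reloading the full value, which is why validating often is affordable; state that changes on nearly every read gains nothing from this and is better read directly.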
Compute
Server-based compute: Amazon EC2
• Fine grained controls
• Time-sensitive response
• Co-location of resources
• Clustering
vs
Serverless compute: AWS Lambda
• No server management
• Out-of-the-box scaling
• Out-of-the-box metrics
• Out-of-the-box logging
Architecture
Diagram: the architecture with AWS Lambda and Amazon EC2 now shown alongside the Collect, Aggregate, Delivery, and Audit stages; Global State and Global Store backed by Amazon DynamoDB and Amazon S3.
Amazon EC2 Auto Scaling
Diagram: Amazon EC2 w/ Auto Scaling, Amazon CloudWatch
• EC2 emits metrics to CloudWatch
• Auto Scaling monitors CloudWatch alarms
Architecture
Diagram: the architecture with AWS Lambda and Amazon EC2 w/ Auto Scaling shown alongside the Collect, Aggregate, Delivery, and Audit stages; Global State and Global Store backed by Amazon DynamoDB and Amazon S3.
Map Reduce workflow
Lock input dataset for idempotent execution
Diagram: List of Manifests → List of Batches → Map and Reduce Records, using Amazon DynamoDB and Amazon S3.
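Locking the input dataset so that a batch runs idempotently could be sketched as below. `DatasetLocks` is a hypothetical in-memory stand-in; with Amazon DynamoDB the same effect is commonly achieved with a conditional write, though the slides do not show how the lock is actually taken. The manifest and run identifiers are also made up for the example.

```python
class DatasetLocks:
    """First run to claim a manifest owns it; re-claims by the same run succeed."""

    def __init__(self):
        self._owners = {}  # manifest_id -> run_id that holds the lock

    def try_lock(self, manifest_id, run_id):
        # setdefault only stores run_id if no owner exists yet.
        owner = self._owners.setdefault(manifest_id, run_id)
        return owner == run_id


def process_manifest(locks, manifest_id, run_id, records):
    """Reduce a manifest's records, but only for the run that holds the lock."""
    if not locks.try_lock(manifest_id, run_id):
        return None  # another run owns this input dataset
    return sum(int(record["value"]) for record in records)


locks = DatasetLocks()
records = [{"value": "1"}, {"value": "1"}]
first_run = process_manifest(locks, "manifest-1", "run-a", records)
competing_run = process_manifest(locks, "manifest-1", "run-b", records)
retry_of_first = process_manifest(locks, "manifest-1", "run-a", records)
```

A retry by the owning run recomputes the same result over the same locked input, while a competing run is turned away, so the dataset is never reduced twice by different runs.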
Architecture
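Diagram: the architecture with AWS Lambda, Amazon EC2 w/ Auto Scaling, and a further AWS Lambda and Amazon EMR pair shown alongside the Collect, Delivery, and Audit stages; Global State and Global Store backed by Amazon DynamoDB and Amazon S3.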
Cluster management
Components: Controller, Cluster Manager, Amazon DynamoDB, Amazon EMR
• Gather backlog information
• Find and lease cluster
• Spin up / tear down clusters
• Enqueue step
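The "find and lease cluster" step, with a spin-up when nothing is free, might be sketched as follows. Everything here is hypothetical: the `ClusterManager` class, the lease duration, and the generated cluster IDs stand in for real Amazon EMR clusters and for lease records that would more plausibly live in Amazon DynamoDB.

```python
import time


class ClusterManager:
    """Leases an idle cluster if one exists, otherwise registers a new one."""

    def __init__(self, lease_seconds=60, clock=time.monotonic):
        self._lease_seconds = lease_seconds
        self._clock = clock
        self._leases = {}  # cluster_id -> time at which the lease expires
        self._next_id = 1

    def find_and_lease(self):
        now = self._clock()
        for cluster_id, expires in self._leases.items():
            if expires <= now:  # lease expired or released: cluster is free
                self._leases[cluster_id] = now + self._lease_seconds
                return cluster_id
        # No free cluster: stand-in for spinning up a new one.
        cluster_id = f"cluster-{self._next_id}"
        self._next_id += 1
        self._leases[cluster_id] = now + self._lease_seconds
        return cluster_id

    def release(self, cluster_id):
        self._leases[cluster_id] = self._clock()  # expire the lease immediately


fake_now = [0.0]  # controllable clock so the example is deterministic
manager = ClusterManager(lease_seconds=10, clock=lambda: fake_now[0])
first = manager.find_and_lease()
second = manager.find_and_lease()
manager.release(first)
reused = manager.find_and_lease()
```

Expiring leases mean a cluster held by a crashed worker becomes available again without manual cleanup.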
External-facing API
Services: Elastic Load Balancing, Amazon CloudFront, Amazon Route 53, Amazon API Gateway
• Authorization
• Authentication
• Caching
• Scale
• Version control
• DDoS prevention
• Throttling
Architecture
Diagram: the architecture now showing two Amazon API Gateway endpoints, four AWS Lambda components, Amazon EC2 w/ Auto Scaling, and Amazon EMR, plus Audit; Global State and Global Store backed by Amazon DynamoDB and Amazon S3.
Incremental auditing
Transitive property of equality
If A = B and B = C, then A = C
Color() Unique()
Audit() Audit()
Checksum auditing
Checksum = HF(Fixed + Transformed) * Aggregating Value
Fixed – Static through the end of processing
Source
{ "clientId": "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6",
  "timestamp": 1476477276000,
  "eventType": "outage",
  "socketIdentifier": "dac06b790cb5b0856437b3efa92bd8f3",
  "lightIdentifier": "000000000001",
  "value": "1" }
{ "clientId": "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6",
  "timestamp": 1476477276000,
  "eventType": "outage",
  "socketIdentifier": "dac06b790cb5b0856437b3efa92bd8f3",
  "lightIdentifier": "000000000001",
  "value": "1" }
Result
{ "clientId": "bestHotel",
  "timestamp": "10/14/2016",
  "eventType": "outage",
  "lightIdentifier": "LED_A1",
  "value": "2" }
Transformed – Changed in the lifetime of processing
Perform transformations and filters on source data
Source
{ "clientId": "bestHotel",
  "timestamp": "10/14/2016",
  "eventType": "outage",
  "socketIdentifier": "dac06b790cb5b0856437b3efa92bd8f3",
  "lightIdentifier": "LED_A1",
  "value": "1" }
{ "clientId": "bestHotel",
  "timestamp": "10/14/2016",
  "eventType": "outage",
  "socketIdentifier": "dac06b790cb5b0856437b3efa92bd8f3",
  "lightIdentifier": "LED_A1",
  "value": "1" }
Result
{ "clientId": "bestHotel",
  "timestamp": "10/14/2016",
  "eventType": "outage",
  "lightIdentifier": "LED_A1",
  "value": "2" }
Run a hashing function over the fixed and transformed fields
Source
{ "clientId": "bestHotel",
  "timestamp": "10/14/2016",
  "eventType": "outage",
  "lightIdentifier": "LED_A1",
  "value": "1" }
{ "clientId": "bestHotel",
  "timestamp": "10/14/2016",
  "eventType": "outage",
  "lightIdentifier": "LED_A1",
  "value": "1" }
Result
{ "clientId": "bestHotel",
  "timestamp": "10/14/2016",
  "eventType": "outage",
  "lightIdentifier": "LED_A1",
  "value": "2" }
Hash
1ae035081ed6c9a40f1c6eb1177350a9
Aggregating Value – Field used for aggregation during processing
Source
1ae035081ed6c9a40f1c6eb1177350a9 { "value": "1" }
1ae035081ed6c9a40f1c6eb1177350a9 { "value": "1" }
Result
1ae035081ed6c9a40f1c6eb1177350a9 { "value": "2" }
Multiply hash * aggregating value
Source
1AE035081ED6C9A40F1C6EB1177350A9
1AE035081ED6C9A40F1C6EB1177350A9
Result
35C06A103DAD93481E38DD622EE6A152
Checksum auditing
Assert(sum(sourceChecksums) = sum(resultChecksums))
Sum the checksums, then compare source vs results
Source
1AE035081ED6C9A40F1C6EB1177350A9
1AE035081ED6C9A40F1C6EB1177350A9
Sum: 35C06A103DAD93481E38DD622EE6A152
Result
35C06A103DAD93481E38DD622EE6A152
Sum: 35C06A103DAD93481E38DD622EE6A152
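The checksum audit walked through above can be sketched end to end as follows. Several details are assumptions: MD5 is used as the hashing function HF, the split of fields into fixed and transformed is a guess from the example, and the fields are joined with a `|` separator before hashing. The hash values produced here are therefore not expected to match the ones shown on the slides.

```python
import hashlib

# Assumed field roles; the slides name the categories but not every member.
FIXED_FIELDS = ("eventType",)
TRANSFORMED_FIELDS = ("clientId", "timestamp", "lightIdentifier")
AGGREGATING_FIELD = "value"


def checksum(record):
    """Checksum = HF(Fixed + Transformed) * Aggregating Value."""
    material = "|".join(str(record[f]) for f in FIXED_FIELDS + TRANSFORMED_FIELDS)
    digest = int(hashlib.md5(material.encode("utf-8")).hexdigest(), 16)
    return digest * int(record[AGGREGATING_FIELD])


def audit(source_records, result_records):
    """True when the summed source checksums equal the summed result checksums."""
    return sum(map(checksum, source_records)) == sum(map(checksum, result_records))


def outage(value):
    # Records as they look after transformation and filtering.
    return {"clientId": "bestHotel", "timestamp": "10/14/2016",
            "eventType": "outage", "lightIdentifier": "LED_A1", "value": value}


source = [outage("1"), outage("1")]
passes = audit(source, [outage("2")])   # aggregation preserved the total
fails = audit(source, [outage("3")])    # a miscounted result is caught
```

Multiplying the hash by the aggregating value is what lets two source records of value 1 balance against one result record of value 2, so the audit can run without replaying the aggregation itself.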
Architecture
Diagram: the complete architecture with two Amazon API Gateway endpoints, four AWS Lambda components, Amazon EC2 w/ Auto Scaling, Amazon EMR, and two Audit components; Global State and Global Store backed by Amazon DynamoDB and Amazon S3.