Upload
amazon-web-services
View
331
Download
3
Embed Size (px)
Citation preview
AWS Cloud Kata for Start-Ups and Developers
Hong
Kong
Real-time Data Processing
Using AWS Lambda
KJ Wu
Solutions Architect
AWS Cloud Kata for Start-Ups and Developers
AWS Services for Data ProcessingAWS Lambda
Amazon Kinesis
Architecture & Workflow for Streaming Data Processing
Demo
Best Practices in Building Data Processing Solutions
Agenda
AWS Cloud Kata for Start-Ups and Developers
AWS Services for Data
Processing
Amazon
Kinesis
AWS
Lambda
AWS Cloud Kata for Start-Ups and Developers
Amazon Lambda: OverviewServerless compute service that runs code in response to events without need to manage servers
No Servers to Manage
• Automatically runs code without Provisioning or Managing servers.
• Just write the code and upload it to Lambda.
Continuous Scaling • Auto scales Application
precisely with the size of the workload.
• Code runs in Parallel & Processes each trigger individually
Subsecond Metering
• Starts Code within
milliseconds of an Event
Executes only when event
is triggered.
AWS Cloud Kata for Start-Ups and Developers
AWS Lambda: Serverless Compute in the Cloud
Easy to author, deploy, maintain,
secure and manage
Allows for focus on business logic,
not infrastructure
Stateless, event-driven code with native support for
Node.js, Java, and Python languages
Compute & Code without managing infrastructure like
EC2 instances and auto scaling groups
Makes it easy to Build back-end
services that perform at scale
AWS Cloud Kata for Start-Ups and Developers
Amazon Kinesis Streams
• Build your own custom
applications that process or
analyze streaming data
Amazon Kinesis Firehose
• Easily load massive volumes of
streaming data into Amazon S3,
ElasticSearch and Redshift
Amazon Kinesis: OverviewA managed service for streaming data ingestion and processing
Buffer size/interval
Data compression
Data Encryption with
KMS
AWS Cloud Kata for Start-Ups and Developers
Data Processing/Streaming
Architecture & Workflow
Smart
Devices
Click
Stream
Log
Data
AWS Cloud Kata for Start-Ups and Developers
AWS Lambda and Amazon Kinesis integration
Stream-based model: ▪ Lambda polls the stream; When new records detected Lambda
function invoked.
▪ New records are passed by Kinesis as parameter.
▪ Kinesis mapped as Event source in Lambda
Synchronous invocation: ▪ Lambda invoked by RequestResponse invocation
▪ Lambda function is executed once.
Event structure:▪ Event received by Lambda function is a collection of records from
Kinesis stream.
▪ Lambda Kinesis Event source: Batch size/Max records configured to be received per invocation.
AWS Cloud Kata for Start-Ups and Developers
Streaming Architecture Workflow: Lambda+Kinesis
Data Input Kinesis Action Lambda Data Output
IT application activity
Capture the
stream
Audit
Process the
stream
SNS
Metering records Condense Redshift
Change logs Backup S3
Financial data** Store RDS
Transaction orders** Process SQS
Server health metrics Monitor EC2
User clickstream Analyze EMR
IoT device data Respond Backend endpoint
Custom data Custom action Custom application
AWS Cloud Kata for Start-Ups and Developers
Common Architecture: Lambda + KinesisReal Time Data Processing
Amazon
Kinesis
AWS
Lambda 1
Amazon
CloudWatch
Amazon
DynamoDB
AWS
Lambda 2 Amazon
S3
1. Real-time event data sent to Amazon
Kinesis, allows multiple AWS Lambda
functions to process the same events.
2. In AWS Lambda, Function 1 processes the
incoming events and stores event data in
Amazon DynamoDB
3. Lambda Function 1 also sends values to
Amazon CloudWatch for simple monitoring
of metrics.
4. In AWS Lambda function, Function 2 stores
incoming data events in Amazon S3
AWS Cloud Kata for Start-Ups and Developers
Demo: Real time processing of
Amazon Kinesis data streams with
AWS Lambda
AWS Cloud Kata for Start-Ups and Developers
Data Processing:
Best Practices & Tips
AWS Cloud Kata for Start-Ups and Developers
Best Practices: Kinesis
Batch size:
▪ Number of records that AWS Lambda will retrieve from Kinesis at the time of invoking your function
▪ Increasing batch size will cause fewer Lambda function invocations with more data processed per function
Starting Position:
▪ The position in the stream where Lambda
starts reading
▪ Set to “Trim Horizon” for reading from
start of stream (all data)
▪ Set to “Latest” for reading most recent
data (LIFO) (latest data)
Performance tuning Kinesis as an
event source
AWS Cloud Kata for Start-Ups and Developers
Best Practices: Lambda
• Write your Lambda function code in a stateless style
• Be aware of Lambda retries on different error and retry scenario
• Synchronous invocation – The invoking application receives
a 429 error, and is responsible for retries.
• Stream-based event sources - Lambda attempts to process
the erring batch of records until the time the data expires
• Minimizing the use of startup code not directly related to
processing the current event, ex. the third party lib Cost &
Performance
AWS Cloud Kata for Start-Ups and Developers
Get Started: Data Processing with AWS
1. Create your first Kinesis stream. Configure hundreds of thousands of data
producers to put data into an Amazon Kinesis stream. Ex. data from Social
media feeds.
2. Create and test your first Lambda function. Use any third party library, even
native ones. First 1M requests each month are on us! (free-tier)
3. Read the Developer Guide, AWS Lambda and Kinesis Tutorial, and resources
on GitHub at AWS Labs
• http://docs.aws.amazon.com/lambda/latest/dg/with-kinesis.html
• https://github.com/awslabs/lambda-streams-to-firehose lambda-streams-to-firehose
Next Steps
AWS Cloud Kata for Start-Ups and Developers
THANK YOU