AWS Architecture Case Study:
Real-Time Bidding
Tom Maddox, Solutions Architect
Who am I?
• Gardener (Capacity Planning)
• Motorcyclist (Agility)
• Mobile App Writer
• Problem Solver
• Technology Geek
• Solutions Architect
Agenda
• What Is Real-Time Bidding (RTB)?
• Architectural Challenges
• Architecture Deep Dive
– DynamoDB Streams
– Big Data Transformation and Analysis Tools
– Machine Learning
What is Real-Time Bidding?
Real-Time Bidding (RTB) is a service offered by advertising networks to
agencies. The agencies decide on the value of advertising opportunities in
real time and bid accordingly on behalf of their advertising clients. Typically,
the window for calculating a bid from the provided consumer details
(e.g. cookies) and submitting it is 100 ms.
The 100ms Handshake
Real Estate Owner
• Websites
• Mobile Apps
• Video Streaming
Advertising Agency
• Logged in with…
• Referred by…
• Location
• Historic user profiles
• Active Campaigns
• Content Management
• Billing
• Campaign Management
• Keywords
• Interests
Why is it interesting?
• Most agencies want to maximize their
campaign audiences by responding to
advertising opportunities all over the world.
• Responses based on data-driven, calculated
decisions are what deliver value to
campaigns.
• Consistently bidding on global opportunities
in under 100ms can be challenging at any
scale.
AdRoll
AdRoll is a global leader in retargeting with more than 10,000 active
advertisers across over 100 countries.
AdRoll stores 1.5 PB of data in Amazon S3 and runs just 30 core Amazon Elastic
Compute Cloud (Amazon EC2) instances. Additional instances (anywhere
from 200 to 1,000 of them, including Amazon EC2 Spot Instances) are used
for variable capacity.
“We need high performance, but we need more than that,” says Valentino
Volonghi, CTO. “We need flexibility, and we need software that could scale
across multiple data centers and machines, software we could optimize as we
go. Moving our operations to the cloud was really our only option.”
SocialVibe
SocialVibe has built a global business that handles peaks in its traffic using
Amazon DynamoDB and multiple Availability Zones across different Regions.
Using AWS has enabled SocialVibe to experiment with new architectures to
meet the demands of a diverse worldwide customer base.
“We had to order hardware in advance, we couldn’t experiment with new
hardware easily,” explains Joshua Rangsikitpho, CTO. “Once we moved over to
AWS, all those problems went away.”
On to the architecture…
Architecture Overview
Click Stream Ingest
Real-Time Bidding
Regional Hubs
Big Data Processing (Billing, Profile Tracking, Machine Learning)
Campaign Mgmt
Architecture Overview
Flyby Comments
• This architecture focuses on bidding logic. We’re not
going to look closely at content management or serving.
• The workload splits by time sensitivity. Bidding must be done
as fast as possible, but clickstream data can be buffered before
it updates the data used for bidding decisions.
• Long-range connectivity can be error-prone. We use
AWS-managed, resilient replication techniques wherever
possible.
Deep Dive: Campaign Management
• Role: Management and monitoring of advertising
campaigns.
• Usage: Marketing departments and advertising
agencies log into a marketing portal to define
campaigns and monitor dashboards.
• Top Tip: Campaign managers can optimize target
audiences in real time, based on ongoing success.
• Services: Elastic Beanstalk, Elastic Load
Balancers, EC2 Instances, RDS
Deep Dive: Click Stream Ingest
Real-time processing
High throughput; elastic
Easy to use
S3, Redshift, DynamoDB Integrations
Amazon Kinesis
Amazon Kinesis Streams
Managed service for streaming ingest & processing
Sending
• HTTP Post
• AWS SDK
• AWS Mobile SDK
• LOG4J
• Flume
• Fluentd
Consuming
• Get* APIs
• Kinesis Client Library + Connector Library
• Apache Storm
• Amazon Elastic MapReduce
Deep Dive: Click Stream Ingest
• Role: Collect and aggregate click stream data
from audience interactions.
• Usage: Interactions with audiences are
offloaded to a Kinesis Stream or Kinesis
Firehose. Records are persisted to Amazon
S3.
• Services: Elastic Beanstalk, Kinesis, Lambda,
S3
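The offload step above can be sketched as follows. This is a minimal illustration, assuming Python with boto3; the stream name and the event fields (`user_id`, `ad_id`) are hypothetical and do not come from the slides. The client is passed in so the function can be tried without AWS credentials.

```python
import json


def send_click_event(kinesis_client, stream_name, event):
    """Write one click event to a Kinesis stream.

    kinesis_client is expected to behave like boto3.client("kinesis").
    stream_name and the event fields are illustrative assumptions.
    """
    return kinesis_client.put_record(
        StreamName=stream_name,
        # Kinesis stores opaque bytes; JSON is one common encoding choice.
        Data=json.dumps(event, sort_keys=True).encode("utf-8"),
        # Records sharing a partition key are routed to the same shard.
        PartitionKey=str(event["user_id"]),
    )
```

With real credentials this might be called as `send_click_event(boto3.client("kinesis"), "clickstream", {"user_id": 42, "ad_id": "a1"})`.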
Amazon Kinesis Firehose
Load massive volumes of streaming data into Amazon S3 and Amazon Redshift
• Zero administration: Capture and deliver streaming data into S3, Redshift, and
other destinations without writing an application or managing infrastructure.
• Direct-to-data store integration: Batch, compress, and encrypt streaming data
for delivery into data destinations in as little as 60 seconds using simple configurations.
• Seamless elasticity: Scales to match data throughput without intervention.
Capture and submit streaming data to Firehose
Firehose loads streaming data continuously into S3 and Redshift
Analyze streaming data using your favorite BI tools
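As a rough sketch of the "capture and submit" step, assuming Python with boto3, the function below sends events to a Firehose delivery stream in batches. The delivery stream name and event shape are hypothetical. The 500-record limit is the documented maximum for a single `PutRecordBatch` call.

```python
import json

MAX_BATCH = 500  # PutRecordBatch accepts at most 500 records per call


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def deliver_events(firehose_client, delivery_stream, events):
    """Send events to Firehose in batches; return how many were reported failed.

    firehose_client is expected to behave like boto3.client("firehose").
    """
    failed = 0
    for batch in chunked(events, MAX_BATCH):
        response = firehose_client.put_record_batch(
            DeliveryStreamName=delivery_stream,
            # Firehose concatenates records as-is, so a trailing newline
            # keeps the delivered objects line-delimited.
            Records=[{"Data": (json.dumps(e) + "\n").encode("utf-8")} for e in batch],
        )
        failed += response.get("FailedPutCount", 0)
    return failed
```

A production caller would also inspect `RequestResponses` in each reply and retry the individual records that failed; that is left out here to keep the sketch short.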
Deep Dive: Big Data Processing
S3 Cross-Region Replication
Reliable replication of data across AWS Regions
• All new uploads into source bucket will be replicated
• Asynchronous
• Entire bucket or prefixes
• Versioning required
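To make the bullets above concrete, here is a hedged sketch of a replication configuration, assuming Python with boto3. The role ARN, bucket ARN, rule ID, and prefix are placeholders, not values from the slides. The function only builds the dictionary; nothing is sent to AWS.

```python
def replication_config(role_arn, destination_bucket_arn, prefix="", rule_id="regional-clickstream"):
    """Build a ReplicationConfiguration dict for S3 cross-region replication.

    An empty prefix replicates the entire bucket; a non-empty prefix limits
    replication to keys that start with it. All arguments are placeholders.
    """
    return {
        # IAM role that S3 assumes to copy objects on your behalf.
        "Role": role_arn,
        "Rules": [
            {
                "ID": rule_id,
                "Prefix": prefix,
                "Status": "Enabled",
                "Destination": {"Bucket": destination_bucket_arn},
            }
        ],
    }
```

With boto3, the result would be passed to `put_bucket_replication(Bucket=..., ReplicationConfiguration=...)` on an S3 client, after versioning has been enabled on both source and destination buckets with `put_bucket_versioning`.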
Deep Dive: Big Data Processing
• Role: Consolidate clickstream data
from around the world, update user
profiles, KPIs, and client invoices.
• Usage: S3 bucket replication is used to bring regional data together.
Then AWS analytics services transform the data to derive insights.
Updated user profiles and bidding tactics are distributed with the
DynamoDB replication client.
• Top Tip: Faster analysis can lead to better return on investment.
Leveraging Spot Instances can maximize this optimization opportunity.
• Services: S3 bucket replication, Data Pipeline, EMR, Redshift,
Machine Learning, Kinesis, DynamoDB Replication Client, Spot Market.
Don’t lock your big data pipeline down.
Keep it agile and experiment often!
Deep Dive: Real-Time Bidding
Optimized Connectivity
• Many advertising exchanges already use AWS.
• Enquire into low-latency connectivity options such as VPC
peering.
• If an exchange is not on AWS, consider Direct Connect as a
means to get the best possible latency between AWS and that exchange.
Deep Dive: Real-Time Bidding
• Role: Respond to advertising opportunities
with bids based on campaign and user data in
under 100ms.
• Usage: An API responds to opportunities in
real-time using in-memory caches and cross-
region replication.
• Services: Elastic Beanstalk, ElastiCache,
RDS and DynamoDB
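The bidding path described above might look roughly like the sketch below. It is written in Python and rests on several assumptions not found in the slides: the cache is a plain dictionary standing in for an in-memory store such as ElastiCache, the scoring rule (base bid multiplied by the number of matching interests) is invented for illustration, and the 50 ms compute budget is an arbitrary share of the 100 ms window that leaves room for network time.

```python
import time


def decide_bid(opportunity, profile_cache, campaigns, started_at,
               budget_seconds=0.050, now=time.monotonic):
    """Return (campaign_id, price) for the best match, or None for "no bid".

    Only in-memory data is consulted. If the time budget runs out, the best
    candidate found so far is returned. The scoring rule is hypothetical.
    """
    profile = profile_cache.get(opportunity["user_id"])
    if profile is None:
        return None  # unknown user: skip rather than query a slower store
    best = None
    for campaign in campaigns:
        if now() - started_at > budget_seconds:
            break  # out of time: stop evaluating further campaigns
        overlap = len(profile["interests"] & campaign["keywords"])
        if overlap == 0:
            continue
        price = campaign["base_bid"] * overlap
        if best is None or price > best[1]:
            best = (campaign["id"], price)
    return best
```

The caller records `time.monotonic()` when the bid request arrives and passes it as `started_at`, so time spent parsing the request counts against the budget.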
What are DynamoDB Streams?
A stream of updates that scales with your table
Asynchronous
Exactly once
Strictly ordered records
(Diagram: successive updates to one item, i=A, i=B, i=C, appearing in the stream.)
Cross region replication
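As an illustration of why ordered records matter for replication, the Python sketch below applies stream records to a replica. It assumes the record layout that DynamoDB Streams delivers (`eventName`, plus `Keys` and `NewImage` under `dynamodb`) and a stream view type that includes new images. The replica is a plain dictionary standing in for a table in another Region; this is not the replication client mentioned in the slides.

```python
import json


def apply_stream_records(replica, records):
    """Apply DynamoDB stream records, in order, to a dict-based replica.

    Because records for an item arrive in the order the writes happened,
    replaying them one by one leaves the replica in the latest state.
    """
    for record in records:
        data = record["dynamodb"]
        # Serialize the key attributes to get a hashable dictionary key.
        key = json.dumps(data["Keys"], sort_keys=True)
        if record["eventName"] == "REMOVE":
            replica.pop(key, None)
        else:  # INSERT and MODIFY both carry the item's new state
            replica[key] = data["NewImage"]
    return replica
```

Replaying the updates i=A, i=B, i=C for one item leaves the replica holding i=C, which matches the source table.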
Summary
• Use regional hubs to minimize latency.
• Leverage asynchronous cross-region replication to keep
regional hubs in sync, in real-time.
• Use event driven workloads that remain responsive at any
scale.
• Many exchanges already use AWS, so enquire about
optimized connectivity.
• Let us manage moving data around, so that you can
concentrate on finding new insights from your data.
Further Challenges
• How could we make sure a campaign budget is never
exceeded?
• What’s the best way to manage and serve advert content
too?
Call to Action
1. Come to us and tell us more about your AdTech use cases
and what your priorities are.
2. We are building an AdTech community for AWS customers
to get early previews of products and collaborate with us.
3. We are looking for cool and popular AdTech blog ideas
and for customers to tell their story.