Tracking and Monitoring APIs at Scale
Building a big-data pipeline with Nginx
Cosmin Stanciu
Lead Software Engineer, Adobe I/O
#nginx #nginxconf
Adobe APIs
Adobe Document Cloud
Adobe Creative Cloud
Adobe Marketing Cloud
API Gateway
Adobe API Gateway
Validation Caching Throttling Logging
GitHub
• apigateway
• api-gateway-request-validation
• api-gateway-cachemanager
• api-gateway-aws
• api-gateway-async-logger
metadata
request
response
600M/day
API Gateway
Kinesis / Kafka
Data
Analytics
Debugging
Business model
Monitoring
"If you wish to make an apple pie from scratch, you must first invent the universe."
– Carl Sagan
Architecture
[Diagram labels: {API}, Consumers, Kinesis, S3, Streaming, Speed, Batch, Service, HDFS, SQL, Speed Index, Agg. Index]
- Streaming layer
Kinesis
Kinesis Logger Config
local logger_module = "api-gateway.logger.BufferedAsyncLogger"

local logger_opts = {
    flush_length = 500,
    flush_interval = 5,
    flush_concurrency = 16,
    flush_throughput = 10000,
    sharedDict = "stats_kinesis"
}
flush_length: each PutRecords request sends up to 500 records
flush_interval: 5 s, the interval in seconds after which the buffer is flushed whether or not it is full
flush_concurrency: max parallel threads used for sending logs
flush_throughput: max logs per second that can be sent to the Kinesis backend
sharedDict: shared dict for caching the logs
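The interplay between flush_length and flush_interval can be illustrated with a minimal sketch. This is not the api-gateway-async-logger code: the class `BufferedLogger`, the `send_batch` callback and the injectable `clock` are hypothetical names used only for illustration, and the sketch ignores flush_concurrency and flush_throughput.

```python
import time
from typing import Callable, List


class BufferedLogger:
    """Toy buffer that flushes when it is full or when the interval has elapsed.

    Hypothetical illustration only; not the gateway's implementation.
    """

    def __init__(self, send_batch: Callable[[List[str]], None],
                 flush_length: int = 500, flush_interval: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.send_batch = send_batch          # stands in for one PutRecords call
        self.flush_length = flush_length
        self.flush_interval = flush_interval
        self.clock = clock
        self.buffer: List[str] = []
        self.last_flush = clock()

    def log(self, record: str) -> None:
        self.buffer.append(record)
        self.maybe_flush()

    def maybe_flush(self) -> None:
        """Flush if the buffer is full or the last flush is too old."""
        full = len(self.buffer) >= self.flush_length
        stale = self.clock() - self.last_flush >= self.flush_interval
        if self.buffer and (full or stale):
            self.flush()

    def flush(self) -> None:
        if not self.buffer:
            return
        # Send at most flush_length records per batch.
        batch = self.buffer[:self.flush_length]
        self.buffer = self.buffer[self.flush_length:]
        self.send_batch(batch)
        self.last_flush = self.clock()


if __name__ == "__main__":
    sent: List[List[str]] = []
    logger = BufferedLogger(sent.append, flush_length=3, flush_interval=5.0)
    for i in range(7):
        logger.log(f"request-{i}")
    print([len(batch) for batch in sent], "buffered:", len(logger.buffer))
```

In a real gateway a timer would call `maybe_flush` periodically, so that a quiet period still drains the buffer after flush_interval seconds.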
Kinesis Logger Config
backend_opts = {
    aws_region = ngx.var.aws_region or "us-east-1",
    kinesis_stream_name = "api-gateway-stream",
    aws_credentials = {
        provider = "api-gateway.aws.AWSSTSCredentials",
        role_ARN = "arn:aws:iam::123456789012:user/admin",
        role_session_name = "kinesis-logger-session",
        shared_cache_dict = "aws_credentials"
    }
}
Security Token Service
AssumeRole returns a set of temporary security credentials
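Because the credentials are temporary, a client typically caches them and refreshes shortly before they expire. The sketch below shows that idea under stated assumptions: `TemporaryCredentials`, `CachedCredentials`, the `fetch` callback and the 60-second `refresh_margin` are hypothetical, and the fetch function is a stand-in rather than a real STS call.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TemporaryCredentials:
    access_key: str
    secret_key: str
    session_token: str
    expires_at: float  # epoch seconds


class CachedCredentials:
    """Return cached credentials and re-fetch them shortly before expiry.

    Hypothetical illustration; `fetch` stands in for an AssumeRole call.
    """

    def __init__(self, fetch: Callable[[], TemporaryCredentials],
                 refresh_margin: float = 60.0,
                 clock: Callable[[], float] = time.time) -> None:
        self.fetch = fetch
        self.refresh_margin = refresh_margin
        self.clock = clock
        self._cached: Optional[TemporaryCredentials] = None

    def get(self) -> TemporaryCredentials:
        now = self.clock()
        expired = (self._cached is None
                   or now >= self._cached.expires_at - self.refresh_margin)
        if expired:
            self._cached = self.fetch()
        return self._cached


if __name__ == "__main__":
    def fake_assume_role() -> TemporaryCredentials:
        # Placeholder values; a real call would contact the token service.
        return TemporaryCredentials("AK", "SK", "TOKEN", time.time() + 3600)

    cache = CachedCredentials(fake_assume_role)
    print(cache.get() is cache.get())  # the second call reuses the cached object
```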
Cluster
Cluster deployment
VPC: Public, Private
Stateless Body
Stateful Body
Nucleus, Membrane
Mesos Agents
HDFS NameNode
Zookeeper
Mesos master
Stateless Workloads
Stateful Workloads
- HDFS DataNodes
API Gateway
- Auto discover
- Load balancing
- Security
- Speed layer
Speed Index
Real-time data
Kibana UI
Debugging
Monitoring
- Batch layer
HDFS, SQL, Streaming, Agg. Index
Docker / Marathon Batch size
Checkpointing Kinesis / Spark
Store in Parquet format
Temporary storage Parquet files
S3 sync
Elasticsearch Index
Docker / Chronos
Daily / Hourly aggregation
Run job hourly or daily
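An hourly aggregation of this kind can be sketched as a simple group-and-count. The record fields used here (`timestamp`, `service`, `consumer`) and the function `hourly_counts` are assumptions for illustration; the talk's actual job runs on Spark over Parquet files and is not shown in the slides.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Counter as CounterType, Iterable, Mapping, Tuple

Key = Tuple[str, str, str]  # (hour bucket, service, consumer)


def hourly_counts(records: Iterable[Mapping]) -> CounterType[Key]:
    """Count requests per (hour, service, consumer).

    Each record is assumed to carry an epoch-seconds `timestamp` plus
    `service` and `consumer` names (hypothetical field names).
    """
    counts: CounterType[Key] = Counter()
    for rec in records:
        ts = datetime.fromtimestamp(rec["timestamp"], tz=timezone.utc)
        hour = ts.strftime("%Y-%m-%dT%H:00Z")  # truncate to the hour
        counts[(hour, rec["service"], rec["consumer"])] += 1
    return counts


if __name__ == "__main__":
    sample = [
        {"timestamp": 0, "service": "photos", "consumer": "app-a"},
        {"timestamp": 60, "service": "photos", "consumer": "app-a"},
        {"timestamp": 3600, "service": "photos", "consumer": "app-b"},
    ]
    for key, count in sorted(hourly_counts(sample).items()):
        print(key, count)
```

A daily aggregation would follow the same shape with the bucket truncated to the day instead of the hour.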
OLAP Data Cube
count
time
consumer
service
The Elasticsearch aggregated index can be represented as a Data Cube
The cube is actually a hypercube with more than 3 dimensions
Users can apply filters, roll-ups or drill-downs
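The cube operations can be sketched over a plain dictionary of cells. This is an in-memory illustration, not an Elasticsearch query: the three dimensions come from the slide, while `roll_up`, `slice_cube` and the sample values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Sequence, Tuple

DIMENSIONS = ("time", "consumer", "service")
Cube = Dict[Tuple[str, ...], int]  # cell coordinates -> request count


def roll_up(cube: Cube, keep: Sequence[str]) -> Cube:
    """Sum counts over every dimension not listed in `keep`."""
    idx = [DIMENSIONS.index(d) for d in keep]
    out: Dict[Tuple[str, ...], int] = defaultdict(int)
    for cell, count in cube.items():
        out[tuple(cell[i] for i in idx)] += count
    return dict(out)


def slice_cube(cube: Cube, **filters: str) -> Cube:
    """Keep only cells whose coordinates match every filter."""
    return {
        cell: count
        for cell, count in cube.items()
        if all(cell[DIMENSIONS.index(d)] == v for d, v in filters.items())
    }


if __name__ == "__main__":
    cube = {
        ("h1", "alice", "photos"): 3,
        ("h1", "bob", "photos"): 2,
        ("h2", "alice", "docs"): 5,
    }
    print(roll_up(cube, ("service",)))                           # totals per service
    print(roll_up(slice_cube(cube, consumer="alice"), ("time",)))  # one consumer over time
```

A drill-down is the reverse of a roll-up: calling `roll_up` with more dimensions in `keep` splits a total back into finer cells.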
- Service layer
{API}
Testing
Performance testing
Functional testing
HDFS SQLStreaming
Kafka
Testing app
Canary testing
{API}
Agg. Index Speed Index
Canary traffic
Results
Monitoring
[Chart: Realtime and Forecast series]
Demo
Functional testing
HDFS SQLStreaming
Kafka
Testing app
Kafka Logger Config
local logger_module = "api-gateway.logger.BufferedAsyncLogger"

local logger_opts = {
    flush_length = 500,
    flush_interval = 1,
    flush_concurrency = 16,
    flush_throughput = 10000,
    sharedDict = "stats_kafka"
}
Thank You
facebook.com/selfxp
linkedin.com/in/cosminstanciu
@selfxp80