Future-proofing Application Delivery at Yelp: Building and tuning traffic management for large web-scale applications

Page 1: Future-proofing Application Delivery at Yelp

Future-proofing Application Delivery at Yelp

Building and tuning traffic management for large web-scale applications

Page 2: Future-proofing Application Delivery at Yelp

Hello! Who are these guys anyway?

Sarguru Mohan

Site Reliability Engineer

Yelp

Kris Beevers

Founder

NS1

Page 3: Future-proofing Application Delivery at Yelp

Intelligent DNS & Traffic Management

Page 4: Future-proofing Application Delivery at Yelp

A broad look at modern DNS & traffic management

What’s changed / changing?

Page 5: Future-proofing Application Delivery at Yelp

Application delivery & traffic automation are deeply intertwined

• Today’s application architectures -- like Yelp’s -- are distributed, dynamic, and driven by real-time conditions

• As elastic infrastructure shifts, traffic needs to move with it

• Deep integration between traffic management and the application is key

Page 6: Future-proofing Application Delivery at Yelp

What makes DNS a good place in the stack to take control over traffic?

• Ubiquitous: every client speaks DNS to find your application

• No application changes: you’re already using DNS to direct traffic

• Early: first indication a user needs to interact with your infrastructure, first opportunity to impact traffic

• Simple: mapping names to services -- lightweight idea with lots of flexibility

Page 7: Future-proofing Application Delivery at Yelp

Automate DNS & traffic management like everything else

• Software development & operations are increasingly intertwined -- DNS should be tightly integrated with deployments, scaling, testing & burn-in, etc.

• Demand comprehensive API control

– Zone files aren’t granular or expressive enough for modern traffic management

– Change propagation is a key consideration -- how long does it take for an automated change to propagate across global authoritative DNS (independent of TTL)?
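
As a rough illustration of the change-propagation point above: the delay before every client sees an automated change can be bounded by the provider's propagation time plus the record TTL. The function name and the numbers below are hypothetical, not figures from the talk, and the bound assumes resolvers honor the TTL.

```python
def worst_case_visibility_seconds(propagation_s: float, ttl_s: float) -> float:
    """Upper bound on how long a resolver may keep serving the old answer.

    Assumption: a resolver caches the old answer just before the change
    reaches the authoritative server it queried, then holds it for one
    full TTL. Resolvers that ignore TTLs are not modeled here.
    """
    if propagation_s < 0 or ttl_s < 0:
        raise ValueError("durations must be non-negative")
    return propagation_s + ttl_s


# Hypothetical numbers: 5 s global propagation, 60 s TTL.
print(worst_case_visibility_seconds(5, 60))  # 65
```

The sketch treats propagation and TTL as two separate terms, which is why the slide asks about propagation "independent of TTL": lowering the TTL does nothing for the first term.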

Page 8: Future-proofing Application Delivery at Yelp

Beware the pitfalls!

• Caching / TTL induce limitations

– Can’t control every connection like L7

– DNS is better for global load balancing than for local

– But! Can still impact availability /

Page 9: Future-proofing Application Delivery at Yelp

Beware the pitfalls!

• Modern DNS traffic management doesn’t translate across providers

– (Yet)

– Different approaches / semantics

– Lose zone transfer & traditional approaches for redundancy

– Think about this up front -- today’s internet demands DNS network redundancy

Page 10: Future-proofing Application Delivery at Yelp

DNS based traffic management enables powerful automation

• Health checks & geo-routing: still important, but we can do better

• Load shedding: optimize infrastructure utilization with thin provisioning

• Real-time performance management: traffic routing based on RUM telemetry -- bust the geo approximation & optimize real perf metrics
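
A minimal sketch of how health checks and load shedding might combine when choosing which endpoints to answer with. The `Endpoint` type, the `eligible_endpoints` function, the 0.8 watermark, and the datacenter names are all assumptions made for illustration; they do not describe NS1's or Yelp's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    up: bool      # result of the latest health check
    load: float   # current utilization, 0.0 to 1.0


def eligible_endpoints(endpoints, high_watermark=0.8):
    """Drop unhealthy endpoints, then shed those at or above the watermark."""
    healthy = [e for e in endpoints if e.up]
    not_overloaded = [e for e in healthy if e.load < high_watermark]
    # If every healthy endpoint is overloaded, answer with all of them
    # rather than returning an empty answer.
    return not_overloaded or healthy


edges = [
    Endpoint("dc-a", up=True, load=0.55),
    Endpoint("dc-b", up=True, load=0.93),
    Endpoint("dc-c", up=False, load=0.10),
]
print([e.name for e in eligible_endpoints(edges)])  # ['dc-a']
```

Falling back to all healthy endpoints when everything is overloaded is one possible design choice; a real system might instead shed gradually as load approaches the watermark.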

Page 11: Future-proofing Application Delivery at Yelp

DNS based traffic management enables powerful automation

• Flexibility around datacenter / infrastructure elasticity, migration, expansion: weighting, stickiness, etc.

• Network-based control: specifically route ISPs/prefixes

• Data ingestion, aggregation, and propagation drive global traffic automation
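
Weighting and prefix-based control can be sketched as follows, assuming a hypothetical override table and weight map. The networks shown are documentation ranges, the datacenter names are made up, and stickiness is not modeled.

```python
import ipaddress
import random

# Hypothetical overrides: clients in these networks are pinned to one datacenter.
PREFIX_OVERRIDES = {
    ipaddress.ip_network("198.51.100.0/24"): "dc-b",
}

# Hypothetical weights, e.g. while migrating traffic from dc-a to dc-b.
WEIGHTS = {"dc-a": 90, "dc-b": 10}


def pick_datacenter(client_ip, rng=random):
    """Apply prefix overrides first, then fall back to a weighted pick."""
    addr = ipaddress.ip_address(client_ip)
    for network, datacenter in PREFIX_OVERRIDES.items():
        if addr in network:
            return datacenter
    names = list(WEIGHTS)
    return rng.choices(names, weights=[WEIGHTS[n] for n in names], k=1)[0]


print(pick_datacenter("198.51.100.7"))  # dc-b
```

Passing a seeded `random.Random` instance as `rng` makes the weighted branch reproducible in tests.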

Page 12: Future-proofing Application Delivery at Yelp

TLDR

Lots going on in modern DNS & traffic management.

Let’s look at Yelp’s use case.

Page 13: Future-proofing Application Delivery at Yelp

Yelp’s Mission: Connecting people with great local businesses.

Page 14: Future-proofing Application Delivery at Yelp

Yelp Stats

As of Q1 2016

90M 102M

Page 15: Future-proofing Application Delivery at Yelp

And some more stats

6 datacenters 5 continents

Page 16: Future-proofing Application Delivery at Yelp

Goals

• Availability

• Performance

Page 17: Future-proofing Application Delivery at Yelp

Challenges

• Balance traffic across multiple edges.

• Actively monitor edges and respond to events.

• Respond to changes in elastic infrastructure

Page 18: Future-proofing Application Delivery at Yelp

The “edge” stack

Page 22: Future-proofing Application Delivery at Yelp

Why is DNS critical here?

• Public-facing DNS records resolve to the CDN’s anycast address.

• Users are routed to the nearest PoP.

• The CDN is configured to route requests to our “backend” DNS records.

Page 23: Future-proofing Application Delivery at Yelp

What’s behind the edge

Page 25: Future-proofing Application Delivery at Yelp

What’s behind the edge

• Load Balancers

• Webservers

• 100s of micro-services

• smartstack for service discovery

Page 28: Future-proofing Application Delivery at Yelp

What’s behind the edge

• Hardware boxes.

• EC2 instances.

• Auto Scaling Groups.

• Spot Fleets.

Page 33: Future-proofing Application Delivery at Yelp

DNS & Elastic Infrastructure

• Elastic Infra demands Intelligent DNS

• Intelligent = Fast

• Intelligent = Flexible

• Intelligent = API Driven

Page 34: Future-proofing Application Delivery at Yelp
Page 35: Future-proofing Application Delivery at Yelp

Let’s Talk About Infra Automation

Page 37: Future-proofing Application Delivery at Yelp

So how do we launch these AMIs?

• AWS Web console? !!!

• Shell script?

• clops

Page 39: Future-proofing Application Delivery at Yelp

Terraform

• Declarative Infrastructure

• “Self Documenting”

• Infra changelog

• Reproducible Infrastructure

Page 40: Future-proofing Application Delivery at Yelp

Terraform at Yelp

• Base AWS Infrastructure.

• Front-end Infrastructure

• Datastores

• Data pipeline

• Batch systems

• DNS

Page 42: Future-proofing Application Delivery at Yelp

DNS @ Pre-terraformic days

• BIND

• Web UI

• Script reading YAML for traffic management.

Page 43: Future-proofing Application Delivery at Yelp

Terraform at Yelp

• 4 custom providers

– aws wrapper

– git

– nsone

– ddns

• 16 custom resources

Page 45: Future-proofing Application Delivery at Yelp

Terraforming DNS

• So we wrote a Go API client.

• And a custom Terraform provider too!

Page 46: Future-proofing Application Delivery at Yelp

Why not X?

• Web UI??

• BIND??

• Yet Another Script??

Page 47: Future-proofing Application Delivery at Yelp

Why Terraform?

• Common tool

• Source of Truth

• Declaratively describe your complete infrastructure.

• Remote states <3

Page 48: Future-proofing Application Delivery at Yelp
Page 49: Future-proofing Application Delivery at Yelp

Show me the Code!

Page 50: Future-proofing Application Delivery at Yelp
Page 51: Future-proofing Application Delivery at Yelp

So how does automation help here?

• Test/deploy new features

• IP Fencing

• ALIAS records

• CDN gets DDoSed?

– Let go of it!

– Keep your TTLs reasonable.

– Use multiple CDNs.

Page 52: Future-proofing Application Delivery at Yelp

So how does automation help here?

• DNS provider gets DDoSed?

– Use multiple DNS providers.

– Automation to keep them in sync.

– vendor_agnostic_tools++
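
One way to picture the "keep them in sync" automation is a vendor-agnostic diff between the desired records and what a given provider currently serves. The `plan_sync` function, the record layout, and the example names below are assumptions for illustration, not the tooling described in the talk.

```python
def plan_sync(desired, actual):
    """Compare desired records with what a provider currently serves.

    Both arguments map (name, record_type) to a list of answers.
    Returns (to_create, to_update, to_delete).
    """
    to_create = {k: v for k, v in desired.items() if k not in actual}
    to_update = {
        k: v for k, v in desired.items()
        if k in actual and sorted(actual[k]) != sorted(v)
    }
    to_delete = [k for k in actual if k not in desired]
    return to_create, to_update, to_delete


desired = {
    ("www.example.com", "A"): ["192.0.2.10", "192.0.2.11"],
    ("api.example.com", "A"): ["192.0.2.20"],
}
provider_b = {
    ("www.example.com", "A"): ["192.0.2.10"],
    ("old.example.com", "A"): ["192.0.2.99"],
}
create, update, delete = plan_sync(desired, provider_b)
print(sorted(create))  # [('api.example.com', 'A')]
print(sorted(update))  # [('www.example.com', 'A')]
print(delete)          # [('old.example.com', 'A')]
```

Running the same plan against each provider from one source of truth keeps the providers consistent. Provider-specific traffic management features would still need separate handling, since those do not translate across providers.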

Page 53: Future-proofing Application Delivery at Yelp

Problems

• Terraform config is verbose and unexpressive for some zones compared to BIND

• Sometimes Terraform’s remote state refresh step is not fast enough.

Page 54: Future-proofing Application Delivery at Yelp

Future of DNS at Yelp

• Automated traffic distribution management.

• Load shedding.

• Continuous Integration and Delivery.

Page 55: Future-proofing Application Delivery at Yelp

Code

https://github.com/bobtfish/terraform-provider-nsone

https://github.com/bobtfish/go-nsone-api

Page 56: Future-proofing Application Delivery at Yelp

Thanks!!

@sargru90

@YelpCareers

@YelpEngineering

github.com/yelp

engineeringblog.yelp.com

@beevek

@nsoneinc

github.com/ns1

ns1.com/blog