Dyn.com | @dyninc
Optimizing Web Performance by Using Fast DNS
Tom Daly, Chief Scientist, Dyn | [email protected] | @tomdyninc | #velocityconf
Optimizing Web Performance By Using Fast DNS Tom Daly @tomdyninc #velocityconf
Goal of the Talk
• Understand the impact of DNS response latency on total web page load time
• Understand the trade-offs between Unicast and Anycast DNS architectures
• Understand how to deploy Anycast services
• Demonstrate benefits through a series of case studies
Connecting DNS and HTTP Load Time
• Speed metrics:
  – Google: 400 ms of added delay => 0.6% decrease in searches
  – Shopzilla: 5 seconds faster => 12% revenue increase
  – Dyn: post-Velocity 2011 efforts => 55.8% speed improvement
• DNS queries are the first blocking operation encountered by your browser’s HTML parser.
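A rough way to see what that first blocking lookup costs from a given client is to time it. The sketch below is an illustration only, not something from the talk: `time_lookup` is a hypothetical helper, and by default it goes through the operating system's stub resolver, so a warm local cache will make the number look better than a cold lookup.

```python
import socket
import time


def time_lookup(hostname, resolver=socket.getaddrinfo):
    """Return (elapsed_ms, addresses) for one blocking lookup of hostname.

    resolver is injectable so the helper can be exercised without a network;
    the default is the standard-library call into the OS stub resolver.
    """
    start = time.perf_counter()
    results = resolver(hostname, 80, proto=socket.IPPROTO_TCP)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    addresses = sorted({entry[4][0] for entry in results})
    return elapsed_ms, addresses


# Example call (needs network access, so it is left commented out):
# ms, addrs = time_lookup("www.example.com")
# print(f"resolved to {addrs} in {ms:.1f} ms")
```

Running it twice in a row against the same name usually shows the gap between a cold and a cached lookup.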
Understand the Tradeoffs
• HTTP optimizations everywhere! YSlow, ShowSlow, etc.
• Many HTTP optimizations ignore DNS impact:
  – Short TTLs require more queries
  – Pipelining across host FQDNs requires more queries
  – Multiple CNAMEs in a chain require more queries
  – GSLB devices require multiple delegations
• Legacy DNS architectures + more queries = more DNS latency overall!
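The "more queries = more latency" arithmetic can be sketched as a sum of round trips that a resolver with a cold cache has to make one after another. The numbers below are hypothetical, chosen only to show the shape of the calculation, and `cold_lookup_cost_ms` is an illustrative helper rather than anything from the talk.

```python
def cold_lookup_cost_ms(steps):
    """Sum the round-trip times of every query a cold resolver must send.

    steps is a list of (description, rtt_ms) tuples, one per sequential query.
    """
    return sum(rtt_ms for _description, rtt_ms in steps)


# Hypothetical round-trip times, for illustration only.
direct = [("A record for www.example.com", 40)]
chained = [
    ("CNAME at www.example.com", 40),
    ("delegation for the CNAME target's zone", 40),
    ("A record at the CNAME target", 40),
]

print(cold_lookup_cost_ms(direct))   # 40
print(cold_lookup_cost_ms(chained))  # 120
```

Each extra CNAME hop or delegation adds another sequential round trip, which is why the per-query speed of the nameservers matters more as the chain grows.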
A Worst Case Scenario!
Poor anycast + CDN CNAME chain + back to origin node
Don’t worry, there is hope!
Easy Wins for WebOps
• Higher TTLs mean fewer queries to keep caches hot, but less agility in moving resources around
  – Not so compatible with today’s move to the cloud
• Fewer FQDNs mean fewer queries
  – Also means less opportunity to use pipelining
  – Choose carefully how much to break out across different FQDNs
• Watch for long CNAME chains
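The TTL trade-off is easy to put numbers on. As a back-of-the-envelope sketch (the resolver count is made up, and the model assumes every resolver re-queries the instant its cached copy expires), the upper bound on daily queries for one record scales inversely with the TTL:

```python
def max_queries_per_day(ttl_seconds, resolvers):
    """Upper bound on daily queries for a single record.

    Assumes the record is popular enough that every resolver refreshes it
    as soon as the TTL runs out; real traffic will be at or below this.
    """
    seconds_per_day = 86_400
    return resolvers * seconds_per_day // ttl_seconds


# 1000 resolvers is a hypothetical population, for illustration only.
for ttl in (30, 300, 3600):
    print(ttl, max_queries_per_day(ttl, resolvers=1000))
# 30 2880000
# 300 288000
# 3600 24000
```

Going from a 30-second TTL to a one-hour TTL cuts the worst-case query volume by a factor of 120, at the price of waiting up to an hour for a change to propagate.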
Harder Wins for WebOps
• Offline nameservers / lame delegations
  – These mean timed-out queries upstream: 2 or 10 seconds, depending upon implementation
  – For unicast, monitor your DNS extensively
• DNS deployment approach:
  – Registrar / hosting provider (cost center, no focus)
  – In-house (Unicast, due to cost)
  – Managed external DNS (Anycast, focused business driver)
Hardest Win for WebOps
• Self-deploy a proper DNS architecture:
  – For speed and performance, you must deploy anycast DNS.
  – You need multiple sites for HTTP anyway, but you need LOTS of sites to achieve redundancy with anycast; otherwise the approach is useless.
  – Then there are LOTS of specific limitations to figure out that you don’t deal with in HTTP serving.
• Choices: Unicast vs. Anycast DNS…
Traditional DNS: Unicast
• Need to deploy many nameservers for redundancy and geographic diversity.
• With Unicast, you get a one-to-one mapping between the domain’s NS records and the nameservers listed.
• Non-primed, non-cached queries MUST contact all of the nameservers in the delegation, taking time.
Today’s DNS: Anycast
• Again, need to deploy many nameservers for redundancy and geographic diversity.
• But with Anycast, we decouple the mapping of NS records to a nameserver.
• We reduce the number of NSes in the delegation – reducing timeouts!
• We make all of the nameservers much faster, lowering the non-primed, non-cached tax.
The Ultimate Performance Enemy: DNS Protocol Resiliency
• When was the last time you saw a DNS query drop, given enough time to resolve?
• DNS was designed with crazy protocol level redundancy techniques due to lossy networks of the 1980s – lots of retry mechanisms.
• DNS RTT banding requires all nameservers in a delegation to be contacted. An offline NS causes 2-10 seconds of latency in non-cached lookups.
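To get a feel for how much one dead nameserver hurts, here is a deliberately simplified model, not a description of any particular resolver: with no RTT history, the resolver picks a nameserver uniformly at random; if that server is offline, it waits out the timeout and then retries against an online one. The RTT values are hypothetical, and the 2000 ms timeout is the lower figure from the slide.

```python
def expected_cold_latency_ms(rtts_ms, offline, timeout_ms):
    """Expected latency of a first, uncached query under the simplified model.

    rtts_ms maps nameserver name -> round-trip time in ms.
    offline is the set of nameserver names that never answer.
    """
    online = [rtt for name, rtt in rtts_ms.items() if name not in offline]
    if not online:
        raise ValueError("at least one nameserver must be online")
    mean_online = sum(online) / len(online)

    total = 0.0
    for name, rtt in rtts_ms.items():
        if name in offline:
            # Wait for the timeout, then retry against an online server.
            total += timeout_ms + mean_online
        else:
            total += rtt
    return total / len(rtts_ms)


# Hypothetical RTTs for a four-nameserver unicast delegation.
rtts = {"ns1": 20, "ns2": 30, "ns3": 40, "ns4": 50}
print(expected_cold_latency_ms(rtts, offline=set(), timeout_ms=2000))    # 35.0
print(expected_cold_latency_ms(rtts, offline={"ns4"}, timeout_ms=2000))  # 530.0
```

Even with only one of four servers down, the expected cold lookup in this model goes from 35 ms to over half a second, because a quarter of first attempts eat the full timeout.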
Unicast Experience
[Diagram: six unicast nameservers: ns1: Seattle, ns2: Palo Alto, ns3: Los Angeles, ns4: New York, ns5: Ashburn, ns6: Miami]
Anycast Experience
[Diagram: three anycast nameserver names across six sites: ns1: Seattle and New York, ns2: Palo Alto and Ashburn, ns3: Los Angeles and Miami]
Handling Site Outages
• What happens during a Unicast site outage?
  – RTT banding timeouts delay DNS query response times, delaying web load times.
• What about in Anycast?
  – BGP routing “pulls” traffic to the next best site.
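The anycast failover behaviour can be sketched as a toy route selection: every site announces the same prefix, the client's network prefers the announcement with the shortest AS path, and a site that withdraws its announcement simply drops out of the candidates. The path lengths below are hypothetical, and real BGP considers more attributes than AS-path length.

```python
def best_site(as_path_lengths, withdrawn=frozenset()):
    """Pick the anycast site a client's network would route to.

    as_path_lengths maps site name -> AS-path length as seen by the client.
    withdrawn holds sites that have stopped announcing the anycast prefix.
    """
    candidates = {
        site: length
        for site, length in as_path_lengths.items()
        if site not in withdrawn
    }
    if not candidates:
        return None
    # Shortest AS path wins; ties are broken by name to keep results stable.
    return min(candidates, key=lambda site: (candidates[site], site))


# Hypothetical view from one client network.
view = {"New York": 1, "Ashburn": 2, "Seattle": 3}
print(best_site(view))                          # New York
print(best_site(view, withdrawn={"New York"}))  # Ashburn
```

The client keeps querying the same nameserver address throughout; only the routing underneath changes, so no resolver timeout is involved once the withdrawal has propagated.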
Unicast Redundancy
[Diagram: six unicast nameservers: ns1: Seattle, ns2: Palo Alto, ns3: Los Angeles, ns4: New York, ns5: Ashburn, ns6: Miami]
Anycast Redundancy
[Diagram: three anycast nameserver names across six sites: ns1: Seattle and New York, ns2: Palo Alto and Ashburn, ns3: Los Angeles and Miami]
Handling DDoS
• What happens during a Unicast DDoS?
  – Likely, all of the nameservers in the delegation will be enjoying the packet love!
• What about in Anycast?
  – BGP routing “isolates” traffic to the origins of the DDoS.
  – Attackers are “blinded” from seeing the whole topology.
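The "isolation" effect follows from catchments: each attack source reaches whichever anycast site its own network routes to, so the load lands near where the bots are. A minimal sketch, with made-up regions, a made-up catchment map and made-up bot counts:

```python
from collections import Counter


def attack_load_by_site(bot_regions, catchment):
    """Count how many attack sources land on each anycast site.

    bot_regions is a list of region labels, one per attacking host.
    catchment maps region -> the anycast site that region routes to.
    """
    return Counter(catchment[region] for region in bot_regions)


# Hypothetical catchments and botnet distribution.
catchment = {"us-west": "Seattle", "us-east": "New York", "us-south": "Miami"}
bots = ["us-east"] * 900 + ["us-south"] * 100

load = attack_load_by_site(bots, catchment)
print(load)             # Counter({'New York': 900, 'Miami': 100})
print(load["Seattle"])  # 0
```

In this toy case Seattle sees none of the attack, whereas with unicast every listed nameserver is a directly addressable target.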
Unicast DDoS
[Diagram: six unicast nameservers: ns1: Seattle, ns2: Palo Alto, ns3: Los Angeles, ns4: New York, ns5: Ashburn, ns6: Miami]
Anycast DDoS
[Diagram: three anycast nameserver names across six sites: ns1: Seattle and New York, ns2: Palo Alto and Ashburn, ns3: Los Angeles and Miami]
Anycast sounds pretty awesome, right?
Deploying Anycast Services
• Deploying anycast isn’t easy:
  – Understanding of, and capability with, BGP routing
  – Need your own routable PI IP address space
  – Consistency of connectivity is important to ensure performance, which leads to limited colocation options
  – Data synchronization across the deployment is critical
  – In-house monitoring is nearly impossible
Routing 101
• Backbones and routing protocols:
  – IGP (OSPF) for internal connections (link state / metric based)
  – BGP for external connections (path vector)
• Mixing IGP (OSPF) with BGP
  – OSPF “floods” routes about the network’s interfaces and point-to-point links throughout the entire network.
  – iBGP is “stacked” on top using adjacencies formed in OSPF.
  – eBGP routes get carried through iBGP, with a partial decision factor coming from the OSPF metrics.
Internet Scale Routing
[Diagram: networks AS 1, AS 2, AS 3, and AS 4; ns1: New York]
• A network is defined as an ASN.
• BGP exchanges “best” routes between networks.
• OSPF floods “all” routes inside a network.
BGP 101
• With BGP, routing information is exchanged between “peers” – routers connected to each other
• Only the “best” routes get exchanged, limiting scope of information shared
• BGP provides next-hop information along AS-paths only, then the IGP takes over.
BGP Routing
[Diagram: networks AS 1, AS 2, AS 3, and AS 4; ns1: New York]
With BGP, the shortest AS path is selected as the best path.
BGP Routing Example
[email protected]> show route www.level3.com
inet.0: 409151 destinations, 1656770 routes (409148 active, 0 holddown, 14 hidden)
Restart Complete
+ = Active Route, - = Last Active, * = Both
4.0.0.0/9 *[BGP/170] 10w4d 11:35:04, MED 100, localpref 100
AS path: 3356 I
> to 4.53.90.149 via ge-0/0/4.0
[BGP/170] 1w2d 14:18:00, MED 100, localpref 100
AS path: 174 3356 I
> to 38.104.190.53 via ge-0/0/2.0
The BGP path with the lowest number of AS hops traversed is the best path.
[Callouts: “Prefix” marks 4.0.0.0/9; “Best BGP Path” marks the active route with AS path 3356]
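The comparison in the router output above can be sketched in a few lines. This is a simplified illustration covering only the first two tie-breakers (highest local preference, then shortest AS path); real BGP implementations go on to compare origin, MED, eBGP versus iBGP, IGP metric and more. The `Route` class is a made-up container; the values are the two routes from the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    next_hop: str
    local_pref: int
    as_path: tuple


def best_route(routes):
    """Prefer highest local preference, then the shortest AS path."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path)))


# The two candidate routes for 4.0.0.0/9 from the example output.
routes = [
    Route("38.104.190.53", 100, (174, 3356)),
    Route("4.53.90.149", 100, (3356,)),
]
print(best_route(routes).next_hop)  # 4.53.90.149
```

Both routes carry localpref 100, so the one-hop path through AS 3356 beats the two-hop path through AS 174.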
OSPF + iBGP 101
• IGP (OSPF) enumerates all potential “paths” and “costs” in the network.
• Links between routers are given metrics for traffic engineering
  – Longer links tend to have higher metrics
• OSPF will calculate end-to-end cost for a path through the network
• iBGP carries all Internet routes on top; OSPF costs decide the paths to take.
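The end-to-end cost calculation is a shortest-path computation over link metrics. OSPF uses Dijkstra's algorithm for this; the sketch below applies it to a made-up five-router backbone with hypothetical metrics, and `lowest_cost` is an illustrative helper, not router code.

```python
import heapq


def lowest_cost(graph, source, target):
    """Return (total_cost, path) from source to target using Dijkstra.

    graph maps router -> {neighbor: link_metric}. Returns None if the
    target cannot be reached.
    """
    queue = [(0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, metric in graph.get(node, {}).items():
            if neighbor not in settled:
                heapq.heappush(queue, (cost + metric, neighbor, path + [neighbor]))
    return None


# Hypothetical backbone; metrics are invented for illustration.
backbone = {
    "SEA": {"CHI": 20, "LAX": 10},
    "LAX": {"SEA": 10, "DAL": 15},
    "CHI": {"SEA": 20, "NYC": 8},
    "DAL": {"LAX": 15, "NYC": 16},
    "NYC": {"CHI": 8, "DAL": 16},
}
print(lowest_cost(backbone, "SEA", "NYC"))  # (28, ['SEA', 'CHI', 'NYC'])
```

The northern path costs 28 against 41 for the southern one, so traffic entering at SEA for a destination behind NYC takes the route through CHI. Raising a single link metric during maintenance is enough to flip that choice.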
OSPF Routing in AS4
[Diagram: networks AS 1, AS 2, AS 3, and AS 4; ns1: New York]
Within the ASN, OSPF picks paths based upon metric preferences.
IGP (iBGP + OSPF) Routing
inet.0: 414878 destinations, 4526985 routes (413866 active, 19 holddown, 422693 hidden)
208.78.70.0/24 (16 entries, 1 announced)
        *BGP    Preference: 170/-121
                Next hop type: Indirect
                Next hop type: Router, Next hop index: 2100575
                Next hop: 129.250.4.69 via ae1.0
                State: <Active Int Ext>
                Local AS: 65000 Peer AS: 65000
                Age: 1w2d 10:33:19 Metric: 0 Metric2: 17
                Task: BGP_65000.129.250.0.18+???
                AS path: 33517 I
                Communities: 2914:370 2914:1009 2914:2000 2914:3000
                Accepted
                Localpref: 120
                Router ID: 129.250.0.18
         BGP    Preference: 170/-121
                Next hop type: Indirect
                Next hop type: Router, Next hop index: 2101345
                Next hop: 129.250.2.183 via ae5.0 weight 0x1
                State: <NotBest Int Ext>
                Inactive reason: Not Best in its group - IGP metric
                Local AS: 65000 Peer AS: 65000
                Age: 5d 8:44:14 Metric2: 20
                Task: BGP_65000.129.250.0.178+???
                AS path: 33517 I
                Communities: 2914:370 2914:1001 2914:2000 2914:3000 4459:412
                Accepted
                Localpref: 120
                Router ID: 129.250.0.178
[Callouts: “Prefix” marks 208.78.70.0/24; “IGP Metric2” marks the Metric2 values, 17 on the active route and 20 on the inactive one]
Anycast, BGP and OSPF
• To make Anycast work, you need to:
  – Understand the impact of BGP peering upon route path selection as traffic is exchanged between ISPs.
  – Understand the impact of OSPF route selection to ensure traffic is off-ramped at the right spot to your services.
• BGP-wise: your transit providers’ and peers’ external routing policies govern your traffic
  – BGP communities can help “steer” traffic
  – Peering policy is king!
• IGP-wise: your backbone providers’ internal routing metrics govern your traffic
  – Maintenance events can drag traffic around oddly!
Putting it All Together
[Diagram: networks AS 1, AS 2, AS 3, and AS 4; ns1: New York and ns1: Los Angeles]
Autonomous Systems and IP Addresses
• To multihome your network, you need to run BGP, and for that you need an autonomous system number (ASN).
• To stay independent of any provider, you need to apply for and obtain your own address space.
  – You /want/ to have lots of choice. ISPs do funny things!
Datacenters, IP Transit, and Peering
• Consistent ISP connectivity is key to ensure networks take advantage of IGP metrics over regular BGP routing.
• This means connecting in carrier-neutral facilities (so you have multiple connections), which means more cost, multiple contracts, access lists, procedures, etc.
• Finding consistent IP transit between US/EU and APAC is a difficult challenge (solved by communities)
• To achieve performance, you MUST depend upon IGP routing metrics: granular information is needed to off-ramp traffic to you in the right spot
• Peering networks don’t always appreciate anycast traffic engineering (inconsistent route sets)
Data Synchronization and Monitoring
• Two networks are deployed:
  – First, for Anycast services facing TOWARDS users
  – Second, for Unicast data replication in and amongst the application
• CAP Theorem kicks in:
  – Keeping things in sync while making changes becomes very, very hard at scale
  – You enjoy the inter-datacenter latency during replication
• Visibility of an anycast network is reduced:
  – Monitoring from an anycast site means you ONLY see that same site.
  – Monitoring externally could mean non-deterministic coverage of anycast instances.
• This topic is another talk altogether!
Example Datacenter Configuration
[Diagram: Router, ISP #1, ISP #2]
BGP Announcement: 208.78.69.0/24, 208.78.70.0/24
em0: 208.78.69.140
lo0: 208.78.70.1
Local LAN: 208.78.69.0/24
Static Route: 208.78.70.1 -> 208.78.69.140
Example BGP Peer Config
show configuration protocols bgp
...
group NTT {
    type external;
    peer-as 2914;
    neighbor 129.250.192.57 {
        import [ Full-Routes-In ];
        export [ Dyn-Anycast Site-Unicast ];
    }
}
group Tata {
    type external;
    peer-as 6453;
    neighbor 209.58.26.53 {
        import [ Full-Routes-In ];
        export [ Dyn-Anycast Site-Unicast ];
    }
}
...
Example Routing Policy Config
show configuration policy-options
...
policy-statement Dyn-Anycast {
    term advertise {
        from {
            protocol aggregate;
            route-filter 208.78.70.0/24 exact;
        }
        then accept;
    }
    term next-policy {
        then next policy;
    }
}
...
policy-statement Site-Unicast {
    term advertise {
        from {
            protocol aggregate;
            route-filter 208.78.69.0/24 exact;
        }
        then accept;
    }
    term next-policy {
        then next policy;
    }
}
...
Example Static Route Config
show configuration routing-options
...
static {
    route 0.0.0.0/0 next-hop 129.250.192.57;
    route 208.78.70.1 next-hop 208.78.69.140;
    ...
}
aggregate {
    route 208.78.70.0/24 {
        as-path {
            origin igp;
        }
        brief;
    }
    ...
}
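Why the host route matters: the router forwards on the most specific matching prefix, so traffic for the anycast service address is handed to the server while everything else follows the default route. The sketch below illustrates longest-prefix matching with Python's `ipaddress` module. It treats the static route for 208.78.70.1 as a /32, which is an assumption since the slide shows no prefix length, and `longest_prefix_match` is an illustrative helper.

```python
import ipaddress


def longest_prefix_match(routes, destination):
    """Return (prefix, next_hop) of the most specific route covering destination.

    routes maps prefix string -> next-hop string. Returns None if no route
    covers the destination.
    """
    address = ipaddress.ip_address(destination)
    matches = []
    for prefix, next_hop in routes.items():
        network = ipaddress.ip_network(prefix)
        if address in network:
            matches.append((network, next_hop))
    if not matches:
        return None
    network, next_hop = max(matches, key=lambda item: item[0].prefixlen)
    return str(network), next_hop


# The two static routes from the slide, with the host route assumed to be a /32.
static_routes = {
    "0.0.0.0/0": "129.250.192.57",
    "208.78.70.1/32": "208.78.69.140",
}
print(longest_prefix_match(static_routes, "208.78.70.1"))
# ('208.78.70.1/32', '208.78.69.140')
print(longest_prefix_match(static_routes, "8.8.8.8"))
# ('0.0.0.0/0', '129.250.192.57')
```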
Example Server Config
[tom@dns ~]$ ifconfig igb0
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=1bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4>
        ether 00:1b:21:aa:61:d0
        inet 208.78.69.140 netmask 0xffffff00 broadcast 208.78.69.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
[tom@dns ~]$ ifconfig lo0
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=3<RXCSUM,TXCSUM>
        inet 127.0.0.1 netmask 0xffffffff
        inet 208.78.70.1 netmask 0xffffffff
Do this Twice and… Anycast
[Diagram: two routers, ISP #1, ISP #2]
Site 1:
  BGP Announcement: 208.78.69.0/24, 208.78.70.0/24
  em0: 208.78.69.140
  lo0: 208.78.70.1
  Local LAN: 208.78.69.0/24
  Static Route: 208.78.70.1 -> 208.78.69.140
Site 2:
  BGP Announcement: 208.78.68.0/24, 208.78.70.0/24
  em0: 208.78.68.140
  lo0: 208.78.70.1
  Local LAN: 208.78.68.0/24
  Static Route: 208.78.70.1 -> 208.78.68.140
And it works for HTTP services too!
Example Improvements
• Warning: YMMV! But let’s chat about it.
• Monitoring from 50 Catchpoint Nodes, excluding China (too much noise)
• Configured www.foo.com as www.foo.com.dynect-demo.com, matching as many DNS parameters as possible.
• Some results expected, others were drastic and surprising!
• The domains have been obfuscated to protect the innocent.
Unicast vs. Anycast DNS
www.domain.com. 1800 IN A X.Y.162.26
domain.com. 1800 IN NS ns1-auth.sprintlink.net.
domain.com. 1800 IN NS ns2-auth.sprintlink.net.
domain.com. 1800 IN NS ns3-auth.sprintlink.net.
domain.com. 1800 IN NS ns-XXX-01.lXXig.com.
domain.com. 1800 IN NS ns-XXX-02.lXXig.com.
;; Received 199 bytes from 144.228.255.10#53(ns3-auth.sprintlink.net) in 99 ms
www.domain.com.dynect-demo.com. 1800 IN A X.Y.162.26
dynect-demo.com. 86400 IN NS ns4.p13.dynect.net.
dynect-demo.com. 86400 IN NS ns2.p13.dynect.net.
dynect-demo.com. 86400 IN NS ns1.p13.dynect.net.
dynect-demo.com. 86400 IN NS ns3.p13.dynect.net.
;; Received 157 bytes from 204.13.251.13#53(ns4.p13.dynect.net) in 11 ms
~100ms of page load decrease!
~60ms of DNS latency decrease!
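Comparing captures like the two above by hand gets tedious, so here is a small sketch that pulls the response time out of each ";; Received" line. The regular expression and the `response_times` helper are illustrative assumptions about this output format, not a general dig parser.

```python
import re

# Matches e.g. "from 144.228.255.10#53(ns3-auth.sprintlink.net) in 99 ms"
TIMING = re.compile(r"from (\S+)#53\((\S+?)\) in (\d+) ms")


def response_times(dig_output):
    """Map nameserver name -> response time in ms for each ';; Received' line."""
    return {name: int(ms) for _ip, name, ms in TIMING.findall(dig_output)}


sample = """
;; Received 199 bytes from 144.228.255.10#53(ns3-auth.sprintlink.net) in 99 ms
;; Received 157 bytes from 204.13.251.13#53(ns4.p13.dynect.net) in 11 ms
"""
times = response_times(sample)
print(times["ns3-auth.sprintlink.net"] - times["ns4.p13.dynect.net"])  # 88
```

This single pair of samples differs by 88 ms. The ~60 ms figure on the slide presumably reflects the wider monitoring run across many nodes rather than this one capture.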
Unicast Domain Pointing to CDN
www.sport.com. 300 IN CNAME www.sport.com.edgesuite.net.
sport.com. 172800 IN NS ns40.sport.com.
sport.com. 172800 IN NS ns50.sport.com.
sport.com. 172800 IN NS ns60.sport.com.
;; Received 276 bytes from 209.133.83.36#53(ns60.sport.com) in 45 ms
www.sport.com.dynect-demo.com. 300 IN CNAME www.sport.com.edgesuite.net.
dynect-demo.com. 172800 IN NS ns1.p13.dynect.net.
dynect-demo.com. 172800 IN NS ns3.p13.dynect.net.
dynect-demo.com. 172800 IN NS ns2.p13.dynect.net.
dynect-demo.com. 172800 IN NS ns4.p13.dynect.net.
;; Received 292 bytes from 204.13.250.13#53(ns2.p13.dynect.net) in 18 ms
~75ms of page load decrease, and more stability!
~62ms of DNS latency decrease!
Unicast + Extra Lookups on GSLB Servers
bank.com. 172800 IN NS ns1.bank.com.
bank.com. 172800 IN NS ns2.bank.com.
bank.com. 172800 IN NS ns05.bank.com.
bank.com. 172800 IN NS ns06.bank.com.
;; Received 183 bytes from 192.5.6.30#53(a.gtld-servers.net) in 188 ms
www.bank.com. 600 IN CNAME wwwbc.gslb.bank.com.
gslb.bank.com. 3600 IN NS dbes1gbx01.bank.com.
gslb.bank.com. 3600 IN NS dcss1gbx01.bank.com.
gslb.bank.com. 3600 IN NS dbes1gbx02.bank.com.
gslb.bank.com. 3600 IN NS dbws1gbx01.bank.com.
gslb.bank.com. 3600 IN NS drds1gbx01.bank.com.
gslb.bank.com. 3600 IN NS dbws1gbx02.bank.com.
gslb.bank.com. 3600 IN NS drds1gbx02.bank.com.
;; Received 370 bytes from 159.53.110.152#53(ns05.bank.com) in 90 ms
~3s of page load decrease!
~140ms of DNS latency decrease plus 2 round trips!
So, go do some DNS magic to speed up your site or use DynECT Managed DNS to
Questions?
Send me email: [email protected]
Twitter: @tomdyninc | #velocityconf
Thanks for attending!