
11:20 - Route Server 2.0 in Detail

DE-CIX

Benedikt Rudolph, Researcher

DE-CIX Route Server Test Framework

Benedikt Rudolph (other contributors: Allen Taylor, Daniel Spierling, Johannes Moos)

Researcher, DE-CIX

RS functionality, scalability & stress testing: existing projects

2

[Timeline figure, 2010 / 2014 / 2016 / today: prior RS testing projects by LONAP, Switch & Data; AMS-IX, LINX; PLIX, DE-CIX, VIX; AMS-IX, DE-CIX, NIX.CZ; JPNAP, NTT; annotated with observed incidents: secondary flaps, expiring sessions]

Motivation: Make IXP Route Servers future proof!

3

• Estimate route server capacity limits
• Make assumptions objective through measurement
• Stress testing → better capacity estimation
• Route server hardening based on observed incidents (unresponsive BGPd ≥ tBGP_hold, flapping BGP sessions)
• Execute tests under realistic conditions
• Test new SW features / configurations in large-scale deployments

Design elements: inherited and new ones

4

• Do not re-invent the wheel; build on existing tools
• Container virtualization
• Emphasize automation (for repeatable results)
• Consider scalability (to meet future requirements)

Execution of a benchmark run

5

$ ./bgperf.py

1. Initialization
2. Execution
3. Evaluation
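As a rough illustration of these three phases, the run can be sketched in Python. The function names (setup_testbed, run_scenario, collect_results) are hypothetical, not bgperf's actual API:

```python
# Hypothetical sketch of a bgperf-style benchmark run in three phases.
# All names are illustrative, not the real bgperf implementation.

def setup_testbed(config):
    """Phase 1, Initialization: prepare target/tester/monitor state."""
    return {"target": config["target"], "peers": config["peers"], "log": []}

def run_scenario(testbed, actions):
    """Phase 2, Execution: apply the test actions in order."""
    for action in actions:
        testbed["log"].append(action)
    return testbed

def collect_results(testbed):
    """Phase 3, Evaluation: summarize what was executed."""
    return {"actions_run": len(testbed["log"]), "target": testbed["target"]}

result = collect_results(run_scenario(
    setup_testbed({"target": "bird", "peers": 2}),
    ["wait-convergent", "interrupt-peers", "sleep"],
))
```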

BGPerf implementation revisited

6

[Architecture diagram: bgperf.py orchestrates three containers attached to the bgperf-br bridge via eth1. The :tester container runs multiple ExaBGP instances, which advertise routes to the target; the :target container runs BIRD, the BGPd under test; the :monitor container runs GoBGP. bgperf.py collects CPU/MEM stats and the number of routes the monitor receives.]
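bgperf samples CPU/MEM statistics from the target container. One way to take such a one-shot sample is via the docker CLI; this is only a sketch under that assumption (bgperf itself talks to the Docker API, so the details differ):

```python
import subprocess

def parse_docker_stats(line):
    """Parse one line of `docker stats --format '{{.CPUPerc}} {{.MemUsage}}'`,
    e.g. "12.34% 150MiB / 7.5GiB" -> (12.34, "150MiB")."""
    cpu, mem = line.split(maxsplit=1)
    return float(cpu.rstrip("%")), mem.split(" / ")[0]

def get_container_stats(name):
    """One-shot CPU/MEM sample for a container via the docker CLI.
    Requires a running Docker daemon; illustrative, not bgperf's code."""
    out = subprocess.check_output(
        ["docker", "stats", "--no-stream",
         "--format", "{{.CPUPerc}} {{.MemUsage}}", name],
        text=True,
    )
    return parse_docker_stats(out.strip())
```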

DE-CIX enhanced BGPerf

7

[Architecture diagram: the base BGPerf setup (:tester with ExaBGP instances, :target with BIRD, :monitor with GoBGP, all on the bgperf-br bridge, driven by bgperf.py) extended with:
• a VPN-GW with L2TP interfaces (l2tpeth1, l2tpeth2, l2tpeth3) connecting additional ExaBGP containers hosted on AWS
• an AWS Manager, a PHP-based solution that starts/stops ExaBGP containers and deploys realistic, auto-generated ExaBGP configs
• the DE-CIX production BIRD.conf on the target, and a BIRD monitor using BGP Add-Path
• a Benchmark Action Sequencer that reads an input script (wait-convergent, interrupt, sleep, ...), executes/stops actions on the testbed, and collects CPU/MEM/route statistics]

What do we measure?

8

• BGP best paths
• All received BGP routes
• CPU utilization in %
• Memory usage

Realistic Peer Generation & Simulation

14

• Emulation of peers with ExaBGP (https://github.com/Exa-Networks/exabgp); one ExaBGP process per peer
• Real-world peer snapshots from the LIVE DE-CIX route servers
• Auto-generated ExaBGP configs including:
  • session hold timers
  • announced prefixes
  • AS path, BGP next hop, local pref, (extended) BGP communities, ...
• Export from per-customer RIBs; includes all filtered prefixes as well
• ~720,000 routes in ExaBGP configs
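A generated per-peer config could look roughly like the following sketch, which renders a minimal ExaBGP neighbor block from a peer-snapshot dict. The field names and the reduced attribute set are illustrative assumptions, not DE-CIX's actual snapshot schema:

```python
def exabgp_neighbor_config(peer):
    """Render a minimal ExaBGP neighbor block from a peer snapshot dict.
    The `peer` keys (ip, asn, rs_ip, rs_as, hold_time, routes) are
    hypothetical; real snapshots carry more attributes (communities, ...)."""
    routes = "\n".join(
        "        route {prefix} next-hop {nh} as-path [{path}];".format(
            prefix=r["prefix"], nh=peer["ip"],
            path=" ".join(str(a) for a in r["as_path"]))
        for r in peer["routes"]
    )
    return (
        "neighbor {rs_ip} {{\n"
        "    local-address {ip};\n"
        "    local-as {asn};\n"
        "    peer-as {rs_as};\n"
        "    hold-time {hold};\n"
        "    static {{\n{routes}\n    }}\n"
        "}}\n"
    ).format(rs_ip=peer["rs_ip"], ip=peer["ip"], asn=peer["asn"],
             rs_as=peer["rs_as"], hold=peer["hold_time"], routes=routes)
```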

Simulation of L2 problems

15

• Emulate packet loss and delay with an existing tool (https://github.com/tylertreat/comcast); makes use of iptables and tc (on Linux)
• Simulate L2 problems and emerging peer flaps: high loss leads to missed keepalives, which results in peer flaps
• Example: simulate entire switch / linecard failures
  • generate 100% packet loss for a given time
  • no flaps, but a high number of sessions go down
  • the RS needs to calculate new best paths / send withdraws
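The comcast tool wraps tc and iptables; the same 100% loss interruption can be sketched directly with tc netem. This is a simplified stand-in (requires root), not the framework's actual code:

```python
import subprocess
import time

def netem_loss_cmd(dev, loss_pct):
    """Build the tc command that adds loss_pct percent packet loss on dev."""
    return ["tc", "qdisc", "add", "dev", dev, "root", "netem",
            "loss", f"{loss_pct}%"]

def interrupt_interface(dev, loss_pct, duration_s):
    """Apply packet loss for duration_s seconds, then remove the qdisc.
    Needs root privileges; simplified stand-in for the comcast tool."""
    subprocess.check_call(netem_loss_cmd(dev, loss_pct))
    try:
        time.sleep(duration_s)
    finally:
        # Always restore the interface, even if interrupted mid-sleep.
        subprocess.check_call(["tc", "qdisc", "del", "dev", dev,
                               "root", "netem"])
```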


Automation: Benchmark Action Sequencer

17

• Extension for bgperf (Execution Phase)
• Loads a script and executes actions on the testbed (three types):
  • Wait-Convergent action: wait until the BGPd under test reaches a steady state, e.g. CPU below x % for n measurement intervals, or BGPd has received at least m routes
  • Interrupt-Peers action: disrupt communication to a user-defined list of IPs from the testbed; complete interruption or configurable packet loss
  • Sleep action: wait for a defined amount of time
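The steady-state check behind the Wait-Convergent action can be sketched as follows; the parameter names mirror the script file keys (cpu-below, routes, confidence), but the implementation is illustrative:

```python
def is_convergent(samples, cpu_below, min_routes, confidence):
    """Return True once the last `confidence` (cpu, routes) samples all
    show CPU below `cpu_below` percent and at least `min_routes` routes.
    Illustrative sketch of the wait-convergent criteria, not bgperf code."""
    if len(samples) < confidence:
        return False  # not enough consecutive measurement intervals yet
    return all(cpu < cpu_below and routes >= min_routes
               for cpu, routes in samples[-confidence:])
```

In a sequencer loop this would be called once per measurement interval, blocking the next action until it returns True.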

Benchmark Action Sequencer script file

18

    --- # example input script for the bgperf benchmark action sequencer
    script:                  # a script is an ordered list of actions that are executed in sequence
      - wait-convergent:     # wait for bgpd convergence / steady state
          cpu-below: 1       # threshold for cpu utilization
          routes: 200        # greater or equal number of routes in master table
          confidence: 3      # check criteria in n consecutive measurement intervals
      - interrupt-peers:     # interrupt a single peer
          duration: 1        # duration of the interruption in seconds
          peers: [172.31.192.43]  # list of peers, at least one peer
          loss: 100          # packet loss in percent [0-100]
          recovery: 20       # optional: how long to wait for recovery after duration?
      - interrupt-peers: &bigOnes1  # interrupt multiple peers (reusable through anchor)
          duration: 20
          peers: [172.31.192.43, 172.31.192.44, 172.31.192.45]
      - interrupt-peers:     # reuse entry above, but redefine duration and add repetition
          <<: *bigOnes1      # reference to anchor "&bigOnes1"
          duration: 10
      - sleep:               # sleep for a fixed duration
          duration: 1        # duration in seconds

BIRD Memory Leak / Cisco Graceful Restart bug

21

• Detection and investigation of a memory leak in BIRD
• Customer-facing Cisco bug CSCus56036, graceful restart (4s)
• Memory leak; BIRD process killed by the OoM killer
• Communicated with the developers; bug fixed in BIRD v1.6.3
• Reproduced the scenario and tested the effectiveness of the fix

Simulation of a realistic L2 disruption

22

328 peers interrupted for 800 s (e.g. caused by an edge switch SW upgrade)

[Graphs: BGP best paths, CPU utilization in %, and memory usage over time, annotated "RS convergent" and "one peer flapping"]

[Graphs: BGP best paths, all received BGP routes, and CPU utilization in % during an L2 interrupt for 328 peers (800 s); BIRD v1.5.0 (Multi-RIB config) vs. BIRD v1.6.3 (Multi-RIB config)]

Conclusion

21

• New: a new toolset for route server testing
• Enhanced & unique: an enhanced and unique test framework; a realistic one-to-one copy of our live IXP network (use of a custom BGPd config)
• High scalability of peers due to AWS cloud integration
• Dynamic and automated test benchmarks, using the action sequencer extension

Benedikt Rudolph, Researcher
Daniel Spierling, Network Engineer
Johannes Moos, Systems Engineer

Thank you!