Some Observations on Network Failures NANOG 15 Craig Labovitz <[email protected]>



Some Observations on Network Failures

NANOG 15

Craig Labovitz

<[email protected]>


Observations

• Goal: Model Internet topological changes
• Lots of strange BGP routing
• Strange BGP routing went away
• What causes the remaining BGP topological and policy changes?
  – Not just count flaps, but study how routing tables change over extended periods
  – Not end-to-end


Internet Failures Analysis

• Look at default-free BGP announcements from multiple large providers
  – Long lived (60% of 9 months)
  – Consider stable if covered by less specifics
  – 15-minute filter window
  – Mean time to failure, mean time to repair, and availability
• Case study of a regional network
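The criteria listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis code used for the talk: the function names, the `(down, up)` outage representation, and the reading of the 15-minute filter window as "ignore outages shorter than 15 minutes" are all hypothetical choices made for the example.

```python
from ipaddress import ip_network

# Assumed reading of the "15 minute filter window": outages shorter
# than this are ignored. The slides do not spell out the exact rule.
FILTER_WINDOW = 15 * 60  # seconds


def covered_by_less_specific(prefix, announced):
    """Return True if a strictly less specific announced prefix contains `prefix`."""
    net = ip_network(prefix)
    for other in announced:
        cand = ip_network(other)
        if (cand.version == net.version
                and cand.prefixlen < net.prefixlen
                and net.subnet_of(cand)):
            return True
    return False


def filter_outages(outages, window=FILTER_WINDOW):
    """Keep only outages lasting at least `window` seconds.

    `outages` is a list of (down, up) timestamps in seconds (hypothetical format).
    """
    return [(down, up) for down, up in outages if up - down >= window]


def availability_stats(outages, period_start, period_end):
    """Compute mean time to failure, mean time to repair, and availability."""
    total = period_end - period_start
    downtime = sum(up - down for down, up in outages)
    uptime = total - downtime
    count = len(outages)
    if count == 0:
        return {"mttf": None, "mttr": None, "availability": 1.0}
    return {
        "mttf": uptime / count,      # average up interval per failure
        "mttr": downtime / count,    # average outage duration
        "availability": uptime / total,
    }
```

With this sketch, a withdrawn /24 would still count as stable while a covering /16 remains announced, and availability is simply uptime divided by the length of the observation period.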


What We Did

• Lots of probe machines
  – Mae-East, Mae-West, Paix, PacBell, AADs
• A default-free collector at UM
  – Routeviews: multi-hop EBGP, 6 providers in US, Canada, Europe and Japan (300,000 routes)
• Case study of regional backbone (OSPF, IBGP/BGP)
• 42 gigabytes and four years of logged routing packets


RouteTracker

• Peer with ISP routers

• Log all routing packets to disk

• Maintain statistics
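The log-and-count pattern described above might look roughly like the following. This is a hypothetical sketch, not RouteTracker itself: the `UpdateLog` class, the `|`-separated record layout, and the `"A"`/`"W"` codes for announcements and withdrawals are all assumptions made for illustration, and the BGP peering session is left out entirely.

```python
import io
from collections import Counter


class UpdateLog:
    """Append each routing update to a sink and keep simple per-peer statistics."""

    def __init__(self, sink):
        self.sink = sink          # any file-like object opened for text writing
        self.counts = Counter()   # (peer, kind) -> number of updates seen
        self.table = {}           # peer -> set of currently announced prefixes

    def record(self, ts, peer, kind, prefix):
        # Log first, so the on-disk record is complete even if later
        # bookkeeping changes.
        self.sink.write(f"{ts}|{peer}|{kind}|{prefix}\n")
        self.counts[(peer, kind)] += 1
        routes = self.table.setdefault(peer, set())
        if kind == "A":           # announcement (assumed code)
            routes.add(prefix)
        elif kind == "W":         # withdrawal (assumed code)
            routes.discard(prefix)


# Usage with an in-memory sink; a real collector would open a file instead.
log = UpdateLog(io.StringIO())
log.record(0, "peer1", "A", "192.0.2.0/24")
log.record(30, "peer1", "W", "192.0.2.0/24")
```

Keeping the raw log separate from the derived counters means the statistics can be recomputed later under a different definition of failure.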


MTBF


Route Fail-Over


MTTR


Availability


Default-Free Route Availability


Backbone MTR


Network Failures

Michnet Backbone Failures 11/97 - 11/98


Observations

• The Internet is significantly less available than the PSTN (99.99%+)

• Low mean time to change
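To put the 99.99% figure in perspective, an availability percentage can be converted into allowed downtime per year. The helper below is a small illustrative sketch; the function name is hypothetical and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def downtime_minutes_per_year(availability):
    """Convert an availability fraction (e.g. 0.9999) to downtime minutes per year."""
    return (1.0 - availability) * MINUTES_PER_YEAR


# 99.99% availability allows roughly 52.56 minutes of downtime per year.
four_nines = downtime_minutes_per_year(0.9999)
```

By the same arithmetic, 99% availability would correspond to about 5,256 minutes, or roughly 3.65 days, of downtime per year.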


Next Steps

• Host other routeviews machines?
  – Merit has several FreeBSD desktop boxes

• Looking for peers…
  – [email protected]
