Copyright © GREE, Inc. All Rights Reserved.
No-Full Route Changed Our Lives @AS55394
Osamu Kurokochi, Data Center Team, Infrastructure Headquarters
Self Introduction
Name: Osamu Kurokochi
Dept.: Data Center Team, Infrastructure Headquarters, GREE, Inc.
Summer in 2012 (Configuration at the time)
There were 8 Border Gateway Protocol (BGP) routers.
All routers received full routes, and the BGP routers were connected to one another as iBGP peers in a full-mesh topology.
[Figure: GREE environment, 8 routers (R) connected in an iBGP full mesh]
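To get a feel for the size of the full mesh described above, the session count can be worked out with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the only input taken from the slide is the router count of 8.

```python
def full_mesh_sessions(routers: int) -> int:
    """Total iBGP sessions when every router peers with every other router."""
    return routers * (routers - 1) // 2


def peers_per_router(routers: int) -> int:
    """Number of iBGP peers each router maintains in a full mesh."""
    return routers - 1


if __name__ == "__main__":
    n = 8  # router count stated on the slide
    print(f"{n} routers: {full_mesh_sessions(n)} sessions in total, "
          f"{peers_per_router(n)} peers per router")
```

With 8 routers this gives 28 sessions in total and 7 iBGP peers per router.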
Summer in 2012 (occurrence of fault)
On one occasion, a fault on the transit side brought the peers down.
Route convergence took a long time, and the router CPUs froze.
iBGP peers then began to go down as well, which caused chaos.
Outages of about 5 minutes recurred intermittently until the routes converged. (All we could do was watch.)
Summer in 2012 (cause of fault)
There were 3 main factors.
1. Insufficient hardware processing capability
2. Increased number of routes
3. Too many iBGP peers (though not that many)
The fault was caused by one of these three factors or a combination of them.
Breakthrough Solutions Considered at the Time
Solution 1. Reinforced hardware: Buy hardware with better performance.
Solution 2. Configuration change: Introduce route reflectors (RR) and reduce the number of iBGP peers.
Solution 3. Decreased number of routes: Reduce the number of routes to lighten the load during a BGP update.
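As a rough illustration of what Solution 2 would change, the sketch below counts iBGP sessions when some routers act as route reflectors. The number of reflectors is an assumption (the slides do not say how many were considered), and the function name is hypothetical. It only counts sessions; it says nothing about the load each BGP update puts on a router.

```python
def route_reflector_sessions(routers: int, reflectors: int) -> int:
    """iBGP sessions when every client peers with every reflector
    and the reflectors are fully meshed among themselves."""
    if not 0 < reflectors <= routers:
        raise ValueError("reflectors must be between 1 and the router count")
    clients = routers - reflectors
    client_sessions = clients * reflectors
    reflector_mesh = reflectors * (reflectors - 1) // 2
    return client_sessions + reflector_mesh


if __name__ == "__main__":
    # 8 routers as in the deck; 2 reflectors is an assumed value.
    print(route_reflector_sessions(8, 2))
```

Under that assumption, 8 routers with 2 reflectors need 13 sessions, compared with 28 in a full mesh.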
Key Judgment Point
Replacing the BGP routers at all sites was also considered, but it would take considerable effort.
Given the required procedure of "verification → order → delivery → maintenance arrangement", this remedy was too slow. We judged that the problem would be difficult to solve by introducing new hardware.
Key Judgment Point
We narrowed the solutions down to solution 3.
In our company's business model, 99% of accesses came from mobile devices. We therefore reconsidered whether the full route was necessary at all and concluded as follows:
Full route → Partial route + Default route
*Partial route = 3 domestic mobile carriers and 5 ASs.
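The idea of "partial route + default route" can be sketched as a filter over a routing table. Everything concrete here is an assumption for illustration: the AS numbers and prefixes come from the ranges reserved for documentation, the `Route` type and `to_partial_table` function are hypothetical, and matching on the origin AS is just one possible way to select the carriers' routes. The slides do not show the actual filter.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class Route(NamedTuple):
    prefix: str
    as_path: Tuple[int, ...]  # last element is the origin AS


# Hypothetical origin ASNs standing in for the selected carriers.
ALLOWED_ORIGINS = {64496, 64497, 64498}
DEFAULT_ROUTE = Route("0.0.0.0/0", ())

# Hypothetical sample of a full table (documentation prefixes).
SAMPLE_FULL_TABLE = [
    Route("192.0.2.0/24", (64500, 64496)),
    Route("198.51.100.0/24", (64500, 64510)),
    Route("203.0.113.0/24", (64501, 64497)),
]


def to_partial_table(full_table: Iterable[Route],
                     allowed_origins: Set[int]) -> List[Route]:
    """Keep only routes originated by an allowed AS, plus a default route."""
    kept = [r for r in full_table
            if r.as_path and r.as_path[-1] in allowed_origins]
    return [DEFAULT_ROUTE] + kept


if __name__ == "__main__":
    for route in to_partial_table(SAMPLE_FULL_TABLE, ALLOWED_ORIGINS):
        print(route.prefix)
```

In this sample, the route originated by AS 64510 is dropped; traffic to it would follow the default route instead.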
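There Are Two Partial Routes
1. In-house filtering: the transit router sends full route + default route, and our own router filters them down to partial route + default route.
2. Transit filter method: the transit router sends only partial route + default route, and our own router passes them through transparently.
[Figure: Transit Router → Own router (RIB, FIB) for each method; one method is marked "GREE adopts this solution."]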
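Whichever method produces the partial table, forwarding then works by longest-prefix match, with the default route catching everything else. The sketch below shows this with Python's standard `ipaddress` module. The table contents and the "specific"/"default" labels are hypothetical, and a real router's FIB lookup is implemented very differently.

```python
import ipaddress
from typing import Dict

# Hypothetical forwarding table: one partial route plus the default route.
FIB = {
    "0.0.0.0/0": "default",
    "192.0.2.0/24": "specific",
}


def lookup(fib: Dict[str, str], destination: str) -> str:
    """Return the label of the longest matching prefix for a destination."""
    addr = ipaddress.ip_address(destination)
    best = None  # (network, label) of the longest match found so far
    for prefix, label in fib.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, label)
    if best is None:
        raise LookupError(f"no route to {destination}")
    return best[1]


if __name__ == "__main__":
    print(lookup(FIB, "192.0.2.10"))    # covered by the partial route
    print(lookup(FIB, "198.51.100.7"))  # falls back to the default route
```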
Summary of Solutions
Solution 1. Reinforced hardware: Buy hardware with better performance. → Verification is required, and delivery takes time.
Solution 2. Configuration change: Introduce RR to reduce the number of iBGP peers. → Verification is required, delivery takes time, and there is no conclusive evidence that it would fix the problem.
Solution 3. Decreased number of routes: Reduce the number of routes to lighten the load during a BGP update. → This can solve the problem in a short time and is reliable.
Solution of Problem
We actually tried the solution.
Number of routes: 400,000 routes at the time → reduced to approx. 2,600 routes
(These have since been further reduced to approx. 1,800 routes.)
Come on, line trouble!
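To put the route counts above in proportion, the reduction can be expressed as a percentage. The helper below is a hypothetical one-liner; the three figures are the approximate values quoted on the slide.

```python
def reduction_percent(before: int, after: int) -> float:
    """Percentage by which the route count shrank."""
    return (before - after) / before * 100


if __name__ == "__main__":
    full = 400_000  # full route at the time
    for partial in (2_600, 1_800):
        print(f"{full} -> {partial}: "
              f"{reduction_percent(full, partial):.2f}% fewer routes")
```

That is a reduction of roughly 99.35% at first and roughly 99.55% after the further cut.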
I thought...
We have found that we can run our operations without full route. Why not reconsider whether you need it too?