119
BGP Techniques for Internet Service Providers Philip Smith Philip Smith < [email protected] [email protected]> NANOG 34 NANOG 34 Seattle, 15 Seattle, 15- 17 May 2005 17 May 2005

NANOG34 BGP Techniques

Embed Size (px)

DESCRIPTION

NANOG34 BGP Techniques

Citation preview

111 NANOG 34BGP Techniques for Internet Service ProvidersPhilip Smith Philip Smith < >NANOG 34 NANOG 34Seattle, 15 Seattle, 15- -17 May 2005 17 May 2005222 NANOG 34Presentation Slides Slides are at:ftp://ftp-eng.cisco.com/pfs/seminars/NANOG34-BGP-Techniques.pdfAnd on the NANOG 34 meeting website Feel free to ask questions any time333 NANOG 34BGP for Internet Service Providers BGP Basics Scaling BGP Using Communities Deploying BGP in an ISP network4 NANOG 34BGP BasicsReminder Reminder 555 NANOG 34Border Gateway Protocol Routing Protocol used to exchange routing information between networksexterior gateway protocol Described in RFC1771 work in progress to updatewww.ietf.org/internet-drafts/draft-ietf-idr-bgp4-26.txt The Autonomous System is BGPs fundamental operating unitIt is used to uniquely identify networks with common routing policy666 NANOG 34Autonomous System (AS) Collection of networks with same routing policy Single routing protocol Usually under single ownership, trust and administrative control Identified by a unique numberAS 100777 NANOG 34Autonomous System Number (ASN) An ASN is a 16 bit integer1-64511 are for public network use64512-65534 are for private use and should never appear on the Internet0 and 65535 are reserved 32 bit ASNs are coming soonwww.ietf.org/internet-drafts/draft-ietf-idr-as4bytes-09.txtWith ASN 23456 reserved for the transition888 NANOG 34Autonomous System Number (ASN) ASNs are distributed by the Regional Internet RegistriesAlso available from upstream ISPs who are members of one of the RIRs Current ASN allocations up to 37887 have been made to the RIRsOf these, around 19500 are visible on the Internet Current estimates are that 4-byte ASNs will be required by July 20109 NANOG 34Applying Policy with BGPControl!10 10 10 NANOG 34Applying Policy in BGP:Why? Policies are applied to:Influence BGP Path Selection by setting BGP attributesDetermine which prefixes are announced or blockedDetermine which AS-paths are preferred, permitted, or deniedDetermine route groupings and their effects Decisions are generally based on prefix, AS-path and community11 11 11 NANOG 34Applying Policy with BGP:Tools Most implementations have tools to apply policies to BGP:Prefix manipulation/filteringAS-PATH manipulation/filteringCommunity Attribute setting and matching Implementations also have policy language which can do various match/set constructs on the attributes of chosen BGP routes12 NANOG 34BGP CapabilitiesExtending BGP13 13 13 NANOG 34BGP Capabilities Documented in RFC2842 Capabilities parameters passed in BGP open message Unknown or unsupported capabilities will result in NOTIFICATION message Codes:0 to 63 are assigned by IANA by IETF consensus64 to 127 are assigned by IANA first come first served128 to 255 are vendor specific14 14 14 NANOG 34BGP CapabilitiesCurrent capabilities are:0 Reserved [RFC3392]1 Multiprotocol Extensions for BGP-4 [RFC2858]2 Route Refresh Capability for BGP-4 [RFC2918]3 Cooperative Route Filtering Capability [ID]4 Multiple routes to a destination capability [RFC3107]64 Graceful Restart Capability [ID]65 Support for 4 octet ASNs [ID]66 Deprecated 2003-03-0667 Support for Dynamic Capability [ID]See www.iana.org/assignments/capability-codes15 15 15 NANOG 34BGP Capabilities Multiprotocol extensionsThis is a whole different world, allowing BGP to support more than IPv4 unicast routesExamples include: v4 multicast, IPv6, v6 multicast, VPNsAnother tutorial (or many!) Route refresh is a well known scaling technique covered shortly The other capabilities are still in development or not widely implemented or deployed yet16 16 16 NANOG 34BGP for Internet Service Providers BGP Basics Scaling BGP Using Communities Deploying BGP in an ISP network17 NANOG 34BGP Scaling Techniques18 18 18 NANOG 34BGP Scaling Techniques How does a service provider:Scale the iBGP mesh beyond a few peers?Implement new policy without causing flaps and route churning?Keep the network stable, scalable, as well as simple?19 NANOG 34Route Refresh20 20 20 NANOG 34Route Refresh BGP peer reset required after every policy changeBecause the router does not store prefixes which are rejected by policy Hard BGP peer reset:Terminates BGP peering & Consumes CPUSeverely disrupts connectivity for all networks Soft BGP peer reset (or Route Refresh):BGP peering remains activeImpacts only those prefixes affected by policy change21 21 21 NANOG 34Route Refresh Capability Facilitates non-disruptive policy changes For most implementations, no configuration is neededAutomatically negotiated at peer establishment No additional memory is used Requires peering routers to support route refresh capability RFC291822 22 22 NANOG 34Route Refresh Use Route Refresh capability if supportedfind out from the BGP neighbour status displayNon-disruptive, Good For the Internet If not supported, see if implementation has a workaround Only hard-reset a BGP peering as a last resortConsider the impact to be equivalent to a router reboot23 NANOG 34Route Flap DampingStabilising the Network24 24 24 NANOG 34Route Flap Damping Route flapGoing up and down of path or change in attributeBGP WITHDRAW followed by UPDATE = 1 flapeBGP neighbour peering reset is NOT a flapRipples through the entire InternetCauses instability, wastes CPU Damping aims to reduce scope of route flap propagation25 25 25 NANOG 34Route Flap Damping (continued) RequirementsFast convergence for normal route changesHistory predicts future behaviourSuppress oscillating routes Advertise stable routes Documented in RFC243926 26 26 NANOG 34Operation Add penalty for each flapNB: Change in attribute is also penalized Exponentially decay penaltyhalf life determines decay rate Penalty above suppress-limitdo not advertise route to BGP peers Penalty decayed below reuse-limitre-advertise route to BGP peers27 27 27 NANOG 34OperationReuse limit0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 2501000200030004000TimePenaltySuppress limitNetworkAnnouncedNetworkRe-announcedNetworkNot AnnouncedPenalty28 28 28 NANOG 34Operation Only applied to inbound announcements from eBGP peers Alternate paths still usable Controllable by at least:Half-lifereuse-limitsuppress-limitmaximum suppress time29 29 29 NANOG 34Configuration Implementations allow various policy control with flap dampingFixed damping, same rate applied to all prefixesVariable damping, different rates applied to different ranges of prefixes and prefix lengths30 30 30 NANOG 34Implementing Flap Damping Flap Damping should only be implemented to address a specific network stability problem Flap Damping can and does make stability worseFlap Amplification from AS path attribute changes caused by BGP exploring alternate paths being unnecessarily penalisedRoute Flap Damping Exacerbates Internet Routing ConvergenceZhuoqing Morley Mao, Ramesh Govindan, George Varghese & Randy H. Katz, August 200231 31 31 NANOG 34Implementing Flap Damping If you have to implement flap damping, understand the impact on the networkVendor defaults are very severeVariable flap damping can bring benefitsTransit provider flap damping impacts peer ASes more harshly due to flap amplification Recommendations for ISPshttp://www.ripe.net/docs/ripe-229.html(work by European and US ISPs a few years ago as vendor defaults were considered to be too aggressive)32 NANOG 34Route Reflectors33 33 33 NANOG 34Scaling iBGP meshTwo solutionsRoute reflector simpler to deploy and runConfederation more complex, corner case benefits13 Routers 78 iBGP Sessions!n=1000 nearlyhalf a millionibgp sessions!n=1000 nearlyhalf a millionibgp sessions!Avoid n(n-1) iBGP mesh34 34 34 NANOG 34AS 100Route Reflector: PrincipleA AC C B BRoute Reflector35 35 35 NANOG 34Route ReflectorAS 100A AB B C CClientsReflectors Reflector receives path from clients and non-clients Selects best path If best path is from client, reflect to other clients and non-clients If best path is from non-client, reflect to clients only Non-meshed clients Described in RFC279636 36 36 NANOG 34Route Reflector Topology Divide the backbone into multiple clusters At least one route reflector and few clientsper cluster Route reflectors are fully meshed Clients in a cluster could be fully meshed Single IGP to carry next hop and local routes37 37 37 NANOG 34Route Reflectors:Loop Avoidance Originator_ID attributeCarries the RID of the originator of the route in the local AS (created by the RR) Cluster_list attributeThe local cluster-id is added when the update is sent by the RRBest to set cluster-id is from router-id (address of loopback)(Some ISPs use their own cluster-id assignment strategy but needs to be well documented!)38 38 38 NANOG 34Route Reflectors:Redundancy Multiple RRs can be configured in the same cluster not advised!All RRs in the cluster must have the same cluster-id (otherwise it is a different cluster) A router may be a client of RRs in different clustersCommon today in ISP networks to overlay two clusters redundancy achieved that wayEach client has two RRs = redundancy39 39 39 NANOG 34Route Reflectors:RedundancyAS 100Cluster OneCluster TwoPoP2PoP1PoP340 40 40 NANOG 34Route Reflectors:Migration Where to place the route reflectors?Always follow the physical topology!This will guarantee that the packet forwarding wont be affected Typical ISP network:PoP has two core routersCore routers are RR for the PoPTwo overlaid clusters41 41 41 NANOG 34Route Reflectors:Migration Typical ISP network:Core routers have fully meshed iBGPCreate further hierarchy if core mesh too bigSplit backbone into regions Configure one cluster pair at a timeEliminate redundant iBGP sessionsPlace maximum of one RR per clusterEasy migration, multiple levels42 42 42 NANOG 34Route Reflector:MigrationAS 200AS 100AS 300A AB BG GF FE ED DC C Migrate small parts of the network, one part at a time.43 43 43 NANOG 34BGP Scaling Techniques Route RefreshUse should be mandatory Route flap dampingOnly use if you understand why Route ReflectorsThe way to scale the iBGP mesh44 44 44 NANOG 34BGP for Internet Service Providers BGP Basics Scaling BGP Using Communities Deploying BGP in an ISP network45 NANOG 34Service Providers use of CommunitiesSome examples of how ISPs make life easier for themselves Some examples of how ISPs make life easier for themselves46 46 46 NANOG 34BGP Communities Another ISP scaling technique Prefixes are grouped into different classes or communities within the ISP network Each community means a different thing, has a different result in the ISP network47 47 47 NANOG 34BGP Communities Communities are generally set at the edge of the ISP networkCustomer edge: customer prefixes belong to different communities depending on the services they have purchasedInternet edge: transit provider prefixes belong to difference communities, depending on the loadsharing or traffic engineering requirements of the local ISP, or what the demands from its BGP customers might be One simple example follows to explain the concept48 48 48 NANOG 34Community Example Customer Edge This demonstrates how communities might be used at the customer edge of an ISP network ISP has three connections to the Internet:IXP connection, for local peersPrivate peering with a competing ISP in the regionTransit provider, who provides visibility to the entire Internet Customers have the option of purchasing combinations of the above connections49 49 49 NANOG 34Community Example Customer Edge Community assignments:IXP connection: community 100:2100Private peer: community 100:2200 Customer who buys local connectivity (via IXP) is put in community 100:2100 Customer who buys peer connectivity is put in community 100:2200 Customer who wants both IXP and peer connectivity is put in 100:2100 and 100:2200 Customer who wants the Internet has no community setWe are going to announce his prefix everywhere50 50 50 NANOG 34Community Example Customer EdgeCORE COREAggregation RouterCustomers Customers CustomersCommunities set at the aggregation router where the prefix is injected into the ISPs iBGPBorder Router51 51 51 NANOG 34Community Example Customer Edge No need to alter filters at the network border when adding a new customer New customer simply is added to the appropriate communityBorder filters already in place take care of announcementsEase of operation! More experienced operators tend to have more sophisticated options availableAdvice is to start with the easy examples given, and then proceed onwards as experience is gained52 52 52 NANOG 34Some ISP Examples ISPs also create communities to give customers bigger routing policy control Public policy is usually listed in the IRRFollowing examples are all in the IRRExamples build on the configuration concepts from the introductory example Consider creating communities to give policy control to customersReduces technical support burdenReduces the amount of router reconfiguration, and the chance of mistakes53 53 53 NANOG 34 www.sprintlink.net/policy/bgp.htmlSome ISP Examples: SprintlinkMore info at www.sprintlink.net/policy/bgp.html54 54 54 NANOG 34Some ISP ExamplesMCI Europe Permits customers to send communities which determinelocal preferences within MCIs networkReachability of the prefixHow the prefix is announced outside of MCIs network55 55 55 NANOG 34Some ISP ExamplesMCI Europeaut-num: AS702descr: MCI EMEA - Commercial IP service provider in Europeremarks: MCI uses the following communities with its customers:702:80Set Local Pref 80 within AS702702:120 Set Local Pref 120 within AS702702:20Announce only to MCI AS'es and MCI customers702:30Keep within Europe, don't announce to other MCI AS's702:1 Prepend AS702 once at edges of MCI to Peers702:2 Prepend AS702 twice at edges of MCI to Peers702:3 Prepend AS702 thrice at edges of MCI to PeersAdvanced communities for customers702:7020Do not announce to AS702 peers with a scope ofNational but advertise to Global Peers, EuropeanPeers and MCI customers.702:7001Prepend AS702 once at edges of MCI to AS702peers with a scope of National.702:7002Prepend AS702 twice at edges of MCI to AS702peers with a scope ofNational.(more)56 56 56 NANOG 34Some ISP ExamplesMCI Europe(more) 702:7003 Prepend AS702 thrice at edges of MCI to AS702peers with a scopeof National.702:8020 Do not announce to AS702 peers with a scope ofEuropean but advertise to Global Peers, NationalPeers and MCIcustomers.702:8001 Prepend AS702 once at edges of MCI to AS702peers with a scope of European.702:8002 Prepend AS702 twice at edges of MCI to AS702peers with a scope ofEuropean.702:8003 Prepend AS702 thrice at edges of MCI to AS702peers with a scopeof European.--------------------------------------------------------------Additional details of the MCI communities are located at:http://global.mci.com/uk/customer/bgp/--------------------------------------------------------------mnt-by:WCOM-EMEA-RICE-MNTchanged: [email protected] 20040523source:RIPE57 57 57 NANOG 34Some ISP ExamplesBT One of the most comprehensive community lists aroundwhois h whois.ripe.net AS5400 reveals all Extensive community definitions allow sophisticated traffic engineering by customers58 58 58 NANOG 34Some ISP ExamplesBT Igniteaut-num:AS5400descr:BT Ignite European Backboneremarks:remarks:Community to Community toremarks:Not announceTo peer: AS prepend 5400remarks:remarks:5400:1000 All peers & Transits 5400:2000remarks:remarks:5400:1500 All Transits 5400:2500remarks:5400:1501 Sprint Transit (AS1239)5400:2501remarks:5400:1502 SAVVIS Transit (AS3561)5400:2502remarks:5400:1503 Level 3 Transit (AS3356) 5400:2503remarks:5400:1504 AT&T Transit (AS7018)5400:2504remarks:5400:1505 UUnet Transit (AS701)5400:2505remarks:remarks:5400:1001 Nexica (AS24592) 5400:2001remarks:5400:1002 Fujitsu (AS3324) 5400:2002remarks:5400:1003 Unisource (AS3300) 5400:2003

notify: [email protected]: CIP-MNTsource: RIPEAnd many many more!59 59 59 NANOG 34BGP for Internet Service Providers BGP Basics Scaling BGP Using Communities Deploying BGP in an ISP network60 NANOG 34Deploying BGP in an ISP NetworkOkay, so weve learned all about BGP now; how do we use it on our network??61 61 61 NANOG 34Deploying BGP The role of IGPs and iBGP Aggregation Receiving Prefixes Configuration Tips62 NANOG 34The role of IGP and iBGPShips in the night?OrGood foundations?63 63 63 NANOG 34BGP versus OSPF/ISIS Internal Routing Protocols (IGPs)examples are ISIS and OSPFused for carrying infrastructure addressesNOT used for carrying Internet prefixes or customer prefixesdesign goal is to minimise number of prefixes in IGP to aid scalability and rapid convergence64 64 64 NANOG 34BGP versus OSPF/ISIS BGP used internally (iBGP) and externally (eBGP) iBGP used to carrysome/all Internet prefixes across backbonecustomer prefixes eBGP used toexchange prefixes with other ASesimplement routing policy65 65 65 NANOG 34BGP/IGP model used in ISP networks Model representationIGPiBGPIGPiBGPIGPiBGPIGPiBGPeBGP eBGP eBGP66 66 66 NANOG 34BGP versus OSPF/ISIS DO NOT:distribute BGP prefixes into an IGPdistribute IGP routes into BGPuse an IGP to carry customer prefixes YOUR NETWORK WILL NOTSCALE67 67 67 NANOG 34Injecting prefixes into iBGP Use iBGP to carry customer prefixesdont ever use IGP Point static route to customer interface Enter network into BGP processEnsure that implementation options are used so that the prefix always remains in iBGP, regardless of state of interfacei.e. avoid iBGP flaps caused by interface flaps68 NANOG 34AggregationQuality or Quantity?69 69 69 NANOG 34Aggregation Aggregation means announcing the address block received from the RIR to the other ASes connected to your network Subprefixes of this aggregate may be:Used internally in the ISP networkAnnounced to other ASes to aid with multihoming Unfortunately too many people are still thinking about class Cs, resulting in a proliferation of /24s in the Internet routing table70 70 70 NANOG 34Aggregation Address block should be announced to the Internet as an aggregate Subprefixes of address block should NOT be announced to Internet unless special circumstances (more later) Aggregate should be generated internallyNot on the network borders!71 71 71 NANOG 34Announcing an Aggregate ISPs who dont and wont aggregate are held in poor regard by community Registries publish their minimum allocation sizeAnything from a /20 to a /22 depending on RIRDifferent sizes for different address blocks No real reason to see anything longer than a /22 prefix in the InternetBUT there are currently >87000 /24s!72 72 72 NANOG 34Aggregation Example Customer has /23 network assigned from AS100s /19 address block AS100 announced /19 aggregate to the InternetAS100customer100.10.10.0/23100.10.0.0/19aggregateInternet100.10.0.0/1973 73 73 NANOG 34Aggregation Good Example Customer link goes downtheir /23 network becomes unreachable/23 is withdrawn from AS100s iBGP /19 aggregate is still being announcedno BGP hold down problemsno BGP propagation delaysno damping by other ISPs Customer link returns Their /23 network is visible againThe /23 is re-injected into AS100s iBGP The whole Internet becomes visible immediately Customer has Quality of Service perception74 74 74 NANOG 34Aggregation Example Customer has /23 network assigned from AS100s /19 address block AS100 announces customers individual networks to the InternetAS100customer100.10.10.0/23 Internet100.10.10.0/23100.10.0.0/24100.10.4.0/2275 75 75 NANOG 34Aggregation Bad Example Customer link goes downTheir /23 network becomes unreachable/23 is withdrawn from AS100s iBGP Their ISP doesnt aggregate its /19 network block/23 network withdrawal announced to peersstarts rippling through the Internetadded load on all Internet backbone routers as network is removed from routing table Customer link returnsTheir /23 network is now visible to their ISPTheir /23 network is re-advertised to peersStarts rippling through InternetLoad on Internet backbone routers as network is reinserted into routing tableSome ISPs suppress the flapsInternet may take 10-20 min or longer to be visibleWhere is the Quality of Service???76 76 76 NANOG 34Aggregation Summary Good example is what everyone should do!Adds to Internet stabilityReduces size of routing tableReduces routing churnImproves Internet QoS for everyone Bad example is what too many still do!Why? Lack of knowledge? Laziness?77 77 77 NANOG 34The Internet Today (May 2005) Current Internet Routing Table StatisticsBGP Routing Table Entries 162009Prefixes after maximum aggregation 94157Unique prefixes in Internet 78129Prefixes smaller than registry alloc 75990/24s announced 88342only 5702 /24s are from 192.0.0.0/8ASes in use 1962778 78 78 NANOG 34The New Swamp Swamp space is name used for areas of poor aggregationThe original swamp was 192.0.0.0/8 from the former class C blockName given just after the deployment of CIDRThe new swamp is creeping across all parts of the InternetNot just RIR space, but legacy space too79 79 79 NANOG 34The New SwampRIR Space May 1999Block Networks192/8 6287193/8 2439194/8 2921195/8 1429196/8 550197/8 0198/8 4015199/8 3503200/8 1459201/8 0202/8 2398203/8 3782204/8 3936205/8 2694206/8 3421207/8 3014Block Networks208/8 2995209/8 3083210/8 700211/8 0212/8 840213/8 1214/8 2215/8 4216/8 1218217/8 0218/8 0219/8 0220/8 0221/8 1222/8 0223/8 0Block Networks24/8 2460/8 061/8 262/8 10063/8 7864/8 065/8 066/8 067/8 068/8 069/8 070/8 071/8 172/8 073/8 0Block Networks80/8 081/8 082/8 083/8 084/8 085/8 086/8 087/8 088/8 0124/8 0125/8 0126/8 0RIR blocks contribute 50891 prefixes or 86% of the Internet Routing Table80 80 80 NANOG 34The New SwampRIR Space May 2005Block Networks192/8 6867193/8 4818194/8 3776195/8 3168196/8 965197/8 0198/8 4977199/8 4184200/8 6445201/8 654202/8 8659203/8 8629204/8 5252205/8 2834206/8 3990207/8 4162Block Networks208/8 3297209/8 5124210/8 3622211/8 2355212/8 2731213/8 2776214/8 316215/8 356216/8 6217217/8 2495218/8 1114219/8 948220/8 1273221/8 618222/8 731223/8 0Block Networks24/8 298960/8 24961/8 241962/8 162863/8 278264/8 476865/8 358666/8 595367/8 173168/8 247469/8 263270/8 86971/8 25072/8 47973/8 0Block Networks80/8 152281/8 125882/8 115883/8 165384/8 64285/8 80686/8 3587/8 288/8 2124/8 0125/8 0126/8 7RIR blocks contribute 142575 prefixes or 88% of the Internet Routing Table81 81 81 NANOG 34The New SwampSummary RIR space shows creeping deaggregationIt seems that an RIR /8 block averages around 4000 prefixes once fully allocatedSo their existing 58 /8s will eventually result in 232000 prefix announcements Food for thought:Remaining 74 unallocated /8s and the 58 RIR /8s combined will cause:528000 prefixes with density of 4000 prefixes per /8Plus ~10% due to non RIR space deaggregation82 82 82 NANOG 34The New SwampSummary Rest of address space is showing similar deaggregation too L What are the reasons?Main justification is traffic engineering Real reasons are:Lack of knowledgeLazinessDeliberate & knowing actions83 83 83 NANOG 34BGP Report(bgp.potaroo.net) 157000 total announcements 108000 prefixesAfter aggregating including full AS PATH infoi.e. including each ASNs traffic engineering 33% saving possible 93000 prefixesAfter aggregating by Origin ASi.e. ignoring each ASNs traffic engineering10% saving possible84 84 84 NANOG 34The excuses Traffic engineering causes 10% of the Internet Routing table Deliberate deaggregation causes 33% of the Internet Routing table85 85 85 NANOG 34Efforts to improve aggregation The CIDR ReportInitiated and operated for many years by Tony BatesNow combined with Geoff Hustons routing analysiswww.cidr-report.orgResults e-mailed on a weekly basis to most operations lists around the worldLists the top 30 service providers who could do better at aggregating86 86 86 NANOG 34Efforts to improve aggregationThe CIDR Report Also computes the size of the routing table assuming ISPs performed optimal aggregation Website allows searches and computations of aggregation to be made on a per AS basisFlexible and powerful tool to aid ISPsIntended to show how greater efficiency in terms of BGP table size can be obtained without loss of routing and policy informationShows what forms of origin AS aggregation could be performed and the potential benefit of such actions to the total table sizeVery effectively challenges the traffic engineering excuse87 87 87 NANOG 3488 88 88 NANOG 3489 89 89 NANOG 3490 90 90 NANOG 3491 91 91 NANOG 3492 92 92 NANOG 3493 93 93 NANOG 34Aggregation Potential(source: bgp.potaroo.net/as4637/)AS PathAS Origin94 94 94 NANOG 34AggregationSummary Aggregation on the Internet could be MUCH better35% saving on Internet routing table size is quite feasibleTools are availableCommands on the routers are not hardCIDR-Report webpage95 NANOG 34Receiving Prefixes96 96 96 NANOG 34Receiving Prefixes There are three scenarios for receiving prefixes from other ASNsCustomer talking BGPPeer talking BGPUpstream/Transit talking BGP Each has different filtering requirements and need to be considered separately97 97 97 NANOG 34Receiving Prefixes:From Customers ISPs should only accept prefixes which have been assigned or allocated to their downstream customer If ISP has assigned address space to its customer, then the customer IS entitled to announce it back to his ISP If the ISP has NOT assigned address space to its customer, then:Check in the four RIR databases to see if this address space really has been assigned to the customerThe tool:whois h whois.apnic.net x.x.x.0/2498 98 98 NANOG 34Receiving Prefixes:From Customers Example use of whois to check if customer is entitled to announce address space:pfs-pc$ whois -h whois.apnic.net 202.12.29.0inetnum:202.12.29.0 - 202.12.29.255netname:APNIC-AP-AU-BNEdescr:APNIC Pty Ltd - Brisbane Offices + Serversdescr:Level 1, 33 Park Rddescr:PO Box 2131, Miltondescr:Brisbane, QLD.country:AUadmin-c:HM20-APtech-c: NO4-APmnt-by: APNIC-HMchanged:[email protected] 20030108status: ASSIGNED PORTABLEsource: APNICPortable means its an assignment to the customer, the customer can announce it to you99 99 99 NANOG 34Receiving Prefixes:From Customers Example use of whois to check if customer is entitled to announce address space:$ whois -h whois.ripe.net 193.128.2.0inetnum:193.128.2.0 - 193.128.2.15descr:Wood Mackenziecountry:GBadmin-c:DB635-RIPEtech-c: DB635-RIPEstatus: ASSIGNED PAmnt-by: AS1849-MNTchanged:[email protected] 20020211source: RIPEroute:193.128.0.0/14descr:PIPEX-BLOCK1origin: AS1849notify: [email protected]: AS1849-MNTchanged:[email protected] 20020321source: RIPEASSIGNED PA means that it is Provider Aggregatable address space and can only be used for connecting to the ISP who assigned it100 100 100 NANOG 34Receiving Prefixes:From Peers A peer is an ISP with whom you agree to exchange prefixes you originate into the Internet routing tablePrefixes you accept from a peer are only those they have indicated they will announcePrefixes you announce to your peer are only those you have indicated you will announce101 101 101 NANOG 34Receiving Prefixes:From Peers Agreeing what each will announce to the other:Exchange of e-mail documentation as part of the peering agreement, and then ongoing updatesORUse of the Internet Routing Registry and configuration tools such as the IRRToolSetwww.isc.org/sw/IRRToolSet/102 102 102 NANOG 34Receiving Prefixes:From Upstream/Transit Provider Upstream/Transit Provider is an ISP who you pay to give you transit to the WHOLE Internet Receiving prefixes from them is not desirable unless really necessaryspecial circumstances see later Ask upstream/transit provider to either:originate a default-routeORannounce one prefix you can use as default103 103 103 NANOG 34Receiving Prefixes:From Upstream/Transit Provider If necessary to receive prefixes from any provider, care is requireddont accept RFC1918 etc prefixesftp://ftp.rfc-editor.org/in-notes/rfc3330.txtdont accept your own prefixesdont accept default (unless you need it)dont accept prefixes longer than /24 Check Project Cymrus list of bogonswww.cymru.com/Documents/bogon-list.htmlUsing the bogon list means you MUST keep it up to date104 104 104 NANOG 34Receiving Prefixes Paying attention to prefixes received from customers, peers and transit providers assists with:The integrity of the local networkThe integrity of the Internet Responsibility of all ISPs to be good Internet citizens105 NANOG 34Configuration TipsOf templates, passwords, tricks, and more templates106 106 106 NANOG 34iBGP and IGPsReminder! Make sure loopback is configured on routeriBGP between loopbacks, NOT real interfaces Make sure IGP carries loopback /32 address Consider the DMZ nets:Use unnumbered interfaces?Use next-hop-self on iBGP neighboursOr carry the DMZ /30s in the iBGPBasically keep the DMZ nets out of the IGP!107 107 107 NANOG 34Next-hop-self Used by many ISPs on edge routersPreferable to carrying DMZ /30 addresses in the IGPReduces size of IGP to just core infrastructureAlternative to using unnumbered interfacesHelps scale networkBGP speaker announces external network using local address (loopback) as next-hop 108 108 108 NANOG 34Templates Good practice to configure templates for everythingVendor defaults tend not to be optimal or even very useful for ISPsISPs create their own defaults by using configuration templates eBGP and iBGP examples followAlso see Project Cymrus BGP templateswww.cymru.com/Documents109 109 109 NANOG 34iBGP TemplateExample iBGP between loopbacks!So IGP can do intelligent re-route Next-hop-selfKeep DMZ and external point-to-point out of IGP Always send community attribute for iBGPOtherwise accidents will happen Hardwire BGP to version 4Yes, this is being paranoid! Use passwords on iBGP sessionNot being paranoid, VERY necessary110 110 110 NANOG 34eBGP TemplateExample BGP dampingDo NOT use it unless you understand whyUse RIPE-229 parameters, or something even weakerDo NOT use the vendor defaults without thinking Remove private ASes from announcementsPrivate ASNs should not appear on the public Internet Use extensive filters, with backupUse as-path filters to backup prefix filtersKeep policy language for implementing policy, rather than basic filtering Use password agreed between you and peer on eBGP session111 111 111 NANOG 34eBGP TemplateExample continued Consider using maximum-prefix trackingRouter will warn you if there are sudden increases in BGP table size, bringing down eBGP if desired Log changes of neighbour stateand monitor those logs...both on and off the router! Make BGP admin distance higher than that of any IGPOtherwise prefixes heard from outside your network could override your IGP!!112 112 112 NANOG 34Limiting AS Path Length Some BGP implementations have problems with long AS_PATHSMemory corruptionMemory fragmentation Even using AS_PATH prepends, it is not normal to see more than 20 ASes in a typical AS_PATH in the Internet todayThe Internet is around 5 ASes deep on averageLargest AS_PATH is usually 16-20 ASNs113 113 113 NANOG 34Limiting AS Path Length Some announcements have ridiculous lengths of AS-paths:*> 3FFE:1600::/24 3FFE:C00:8023:5::2 22 11537 145 12199 10318 10566 13193 1930 2200 3425 293 5609 5430 13285 6939 14277 1849 33 15589 25336 6830 8002 2042 7610 iThis example is an error in one IPv6 implementation If your implementation supports it, consider limiting the maximum AS-path length you will accept114 114 114 NANOG 34BGP TTL hack Implement RFC3682 on BGP peeringsNeighbour sets TTL to 255Local router configured to expect TTL of incoming BGP packets to be 254No one apart from directly attached devices can send BGP packets which arrive with TTL of 254, so any possible attack by a remote miscreant is dropped due to TTL mismatchISPR1 R2AS 100AttackerTTL 254TTL 253 TTL 254115 115 115 NANOG 34BGP TTL hack TTL Hack:Both neighbours must agree to use the featureTTL check is much easier to perform than MD5(Called BTSH BGP TTL Security Hack) Provides security for BGP sessionsIn addition to packet filters of courseMD5 should still be used for messages which slip through the TTL hack See www.nanog.org/mtg-0302/hack.html for more details116 116 116 NANOG 34Passwords on BGP sessions Yes, I am mentioning passwords again Put password on the BGP sessionIts a secret shared between you and your peerIf arriving packets dont have the correct MD5 hash, they are ignoredHelps defeat miscreants who wish to attack BGP sessions Powerful preventative tool, especially when combined with filters and the TTL hack117 117 117 NANOG 34Using Communities Use communities to:Scale iBGP managementEase iBGP management Come up with a strategy for different classes of customersWhich prefixes stay inside networkWhich prefixes are announced by eBGPetc118 118 118 NANOG 34Summary Use configuration templates Standardise the configuration Be aware of standard tricks to avoid compromise of the BGP session Anything to make your life easier, network less prone to errors,network more likely to scale Its all about scaling if your network wont scale, then it wont be successful119 119 119 NANOG 34BGP Techniques for Internet Service ProvidersPhilip Smith Philip Smith < >NANOG 34 NANOG 34Seattle, 15 Seattle, 15- -17 May 2005 17 May 2005