- Number of slides: 53
Crafting Confederations: An Overview of the Confederation POP Approach to Network Architecture. Dan Golding, NetRail, Inc., dan@netrail.net. Miguel Dimayuga, Earthlink, Inc., mdimayuga@corp.earthlink.net
The Old Way… Conventional Network Routing Architectures • Full Mesh iBGP or Route Reflectors • A fully meshed network via ATM PVCs.
What’s Wrong With The Old Way? • It’s not adapted to the New Optical Network! • POS is here in force, ATM’s value in the core is receding. • It is far more fragile, and far less agile than newer methods of Inter-domain Routing. • The Old Way was prone to user-error. The ECommerce Revolution demands a New Way!
A Better Way • Emphasizes Large Scale, IP Based, Fiber Ring Networks • Optimized for Service Provider Needs • Utilizes cutting edge routing technologies to provide far greater fault tolerance and usable traffic engineering. • Implemented via advanced BGP techniques: Communities and Confederations.
How the Old Way worked… (Full Mesh iBGP) • Every router must be fully meshed with all others. • Works well in small systems • Session count grows quadratically: n routers need n(n-1)/2 sessions • Eventually consumes all CPU, memory, and engineering resources. (Diagram: full iBGP mesh, illustrating how quickly the session count grows.)
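The scaling problem of a full mesh can be made concrete with a little arithmetic. The sketch below is only an illustration (the function name is hypothetical, not from the slides): it counts the iBGP sessions needed when every router peers with every other router.

```python
def full_mesh_sessions(routers: int) -> int:
    """Return the number of iBGP sessions in a full mesh of `routers` routers."""
    if routers < 0:
        raise ValueError("router count must be non-negative")
    # Each router peers with the other n - 1 routers,
    # and every session is shared by two routers.
    return routers * (routers - 1) // 2


if __name__ == "__main__":
    for n in (5, 20, 100):
        print(f"{n} routers -> {full_mesh_sessions(n)} sessions")
```

Going from 20 to 100 routers multiplies the router count by 5 but the session count by roughly 26.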
How the Old Way worked… (Route Reflectors) • Scaled Well • Well suited to fully meshed ATM Networks – Star Topology. But... • Not Survivable in a Fiber Ring Network. (Diagram: Peer Isolation with BGP Route Reflection, showing peers, RR clients, and an RR server.)
How the Old Way worked… (Filtering) • List of IP Prefixes and/or AS numbers set on all border routers to other ISPs. Only the access-list contents would be advertised. • Worked well when most customers were single-homed and didn’t run BGP. • Changes were VERY manpower intensive. • With multi-homed e-commerce shops, no longer feasible.
How the New Way works… (Confederations) • Routers peer with neighbors • Highly Survivable • Very Scalable • Easily Configured • Aids Troubleshooting (Diagram: BGP Confederations, Routers Peer with Neighbors.)
Confederation Overview • BGP allows three types of peer relationships: – iBGP (Full iBGP mesh) – eBGP (External Peering or Transit) – Confederation eBGP (it's iBGP with an eBGP look!) • Confederation eBGP is like regular eBGP, except – Next Hop, Local Preference and MEDs are preserved – Confederation elements in the AS-PATH are not counted for route selection purposes
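The last point, that confederation elements do not count toward AS-path length, can be sketched in a few lines. This is a simplified model, not a BGP implementation: the segment labels "seq" and "confed_seq" are hypothetical names for ordinary and confederation path segments, and AS sets are left out.

```python
from typing import List, Tuple

# A path is a list of (segment_kind, [AS numbers]) pairs.
Segment = Tuple[str, List[int]]


def as_path_length(path: List[Segment]) -> int:
    """Length used for route selection: confederation segments are skipped."""
    length = 0
    for kind, asns in path:
        if kind == "confed_seq":
            continue  # sub-AS hops inside the confederation are not counted
        if kind == "seq":
            length += len(asns)
        else:
            raise ValueError(f"unknown segment kind: {kind}")
    return length


if __name__ == "__main__":
    # Two sub-AS hops inside the confederation, then two real AS hops.
    path = [("confed_seq", [65012, 65000]), ("seq", [701, 3703])]
    print(as_path_length(path))  # 2
```

A route that crosses many POPs therefore looks no longer to the selection process than one learned in the local POP.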
Confederation Overview • Confederations allow groups of routers to form “sub-autonomous systems” to eliminate scaling problems with full mesh iBGP • All routers within a sub-AS must be fully meshed (or optionally in a route reflector cluster configuration) • Confederations are most advantageous when there are few routers per sub-AS. There is no reason to limit the number of sub-ASes you have; nothing is gained by doing so.
Confederation Overview • Most confederation designs start out with only two or three sub-ASes. This offers few advantages over full mesh iBGP in a ring network topology. • The more sub-ASes you add, the greater the advantage • The final result: One sub-AS per POP • The upper limit is roughly 1,000 sub-ASes: the private AS range 64512-65535 (RFC 1930) holds 1,024 numbers
The Advantages of a Confederation of POPs • The routers within each POP need only peer with each other, utilizing iBGP • Neighboring POPs are peered with via POP border routers speaking confederation eBGP • Next Hop, Local Pref and MEDs are preserved • More survivable than Route Reflectors • Far more scalable than full iBGP mesh
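A rough comparison shows why one sub-AS per POP scales better than a network-wide mesh. The sketch assumes a hypothetical network of equal-sized POPs joined in a ring, with one confederation eBGP session per ring link; real designs will have more inter-POP sessions, so the numbers are only indicative.

```python
def mesh_sessions(n: int) -> int:
    """Sessions in a full mesh of n routers."""
    return n * (n - 1) // 2


def confederation_sessions(pops: int, routers_per_pop: int, inter_pop_links: int) -> int:
    """Full mesh inside each POP plus one session per inter-POP link."""
    return pops * mesh_sessions(routers_per_pop) + inter_pop_links


if __name__ == "__main__":
    pops, per_pop = 20, 4   # assumed sizes, for illustration only
    ring_links = pops       # a ring of N POPs has N links
    print("full iBGP mesh:", mesh_sessions(pops * per_pop))
    print("confederation of POPs:", confederation_sessions(pops, per_pop, ring_links))
```

With these assumed sizes the full mesh needs 3160 sessions, while the confederation needs 140.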
How to Make It Work • Thoughtful use of sub-AS numbers • Local Preference Hierarchy • Useful and Descriptive Community Strings • Meaningful MEDs • Use of various policies (via access lists, community lists, etc.) as building blocks • Use of Peer Groups whenever implementation allows.
Sub-AS Assignment • Sub-AS’s become useful tools for debugging – show ip bgp, show route • Suggested assignment is geographical • Always remember to keep room for expansion! • Put plenty of extra sub-AS’s in your configs – don’t count on adding them later!
Geographical Region as sub-AS • Regions: Southeast, Northcentral, Southcentral, Western, Canadian, Latin/South American, European, Asian, Reserved • Sub-AS ranges: 65000-65099, 65100-65199, 65200-65299, 65300-65399, 65400-65499, 65500-65535, 64512-64599, 64600-64699, 64700-64799, 64800-64999
Sample Community Assignments
Community Strings are the Key • Communities are “tags” or “post-it notes” attached to routes to help identify them. – There can be more than one community attached to a route. • Communities are recommended to be set at the ingress point. – Communities need be applied only once – administrative burden and complexity is greatly reduced. • When routes egress, filtering can be based on one or more community strings. • Sample Communities – Regional, by Peer, Customer, Internal, Peer, Transit
Communities Set at Ingress
(Diagram: transit AS 701 originates 4.0.0.0/8 and 5.0.0.0/8. AS 4355 holds 207.69.0.0/16 and 198.99.146.0/24 as local routes and learns 4.0.0.0/8 and 5.0.0.0/8 with path 701.)
router bgp 4355
 network 207.69.0.0/16 route-map make-green
 network 199.174.166.0/24 route-map make-red
router bgp 4355
 neighbor a.a remote-as 701
 neighbor a.a route-map make-blue in
Communities Used to Filter on Egress
(Diagram: same topology as the previous slide, with customer AS 3703 attached to AS 4355. The customer receives 4.0.0.0/8, 5.0.0.0/8, and 207.69.0.0/16.)
router bgp 4355
 neighbor b.b remote-as 3703
 neighbor b.b route-map blue-green out
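The tag-at-ingress, filter-at-egress idea from the last two slides can be modelled in a few lines. This is a toy sketch, not router code: the colour names follow the slides' route-map names (make-green, make-red, make-blue, blue-green), the prefixes are taken from the slide configuration, and the helper functions are hypothetical.

```python
from typing import Dict, Iterable, List, Set

# Routing table model: prefix -> set of community "colours".
Table = Dict[str, Set[str]]


def tag(table: Table, prefixes: Iterable[str], colour: str) -> None:
    """Attach a community to each prefix once, at the point of ingress."""
    for prefix in prefixes:
        table.setdefault(prefix, set()).add(colour)


def egress(table: Table, allowed: Set[str]) -> List[str]:
    """Advertise only prefixes carrying at least one allowed community."""
    return sorted(p for p, colours in table.items() if colours & allowed)


if __name__ == "__main__":
    table: Table = {}
    tag(table, ["207.69.0.0/16"], "green")            # make-green
    tag(table, ["199.174.166.0/24"], "red")           # make-red
    tag(table, ["4.0.0.0/8", "5.0.0.0/8"], "blue")    # make-blue, from AS 701
    # blue-green out: the customer gets blue and green routes, not red ones.
    print(egress(table, {"blue", "green"}))
```

The egress policy never mentions a prefix; adding a new green route at ingress changes what the customer sees without touching the egress side.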
Community Categories – Route Type • Customer Routes: 4006:65150 • Private Peering: 4006:65140 • Transit: 4006:65130 • Public Peering: 4006:65120 • Internal Routes (OPN-visible): 4006:65110 • Internal Routes (Global-visible): 4006:65100
Other People's Networks (OPNs) • To expand our national coverage, MindSpring utilized third-party networks’ dialup facilities. We call these networks OPNs. • Prefixes for core services that we want restricted to MindSpring customers and not visible to the rest of the world (e.g. news, radius, smtp) are announced to our OPNs alone. – This has the added advantage of protecting against abuse of our services by non-customers. • With communities, we can tag routes for export to OPNs alone.
Community Categories – Route Ingress Location • Field Peering: 4006:65020 • Exchange Point Peer: 4006:65010 • Northeast Region Peering (DC): 4006:65030 • Southeast Region Peering (Atlanta): 4006:65040 • Northcentral Region Peering (Chicago): 4006:65050 • West Peering Region (Palo Alto): 4006:65060 • Southcentral Region Peering (Dallas): 4006:65070
Community Categories – Specials • No Export to any external BGP peer: No-Export • Do Not Advertise to any peer (Well Known): No-Advertise • Always Prefer (proposed Well Known): Prefer-Me (65535:65519) • Always Avoid (proposed Well Known): Avoid-Me (65535:65504)
Community Categories – Origin AS • Also add a community string for the origin AS • If the route comes from UUNet, add 4006:701 • If the route comes from Sprint, add 4006:1239
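The AS:value notation used on these slides is a readable form of the 32-bit community attribute from RFC 1997, with the AS number in the high 16 bits and the local value in the low 16 bits. The helpers below are a minimal sketch with hypothetical names; only the two well-known constants come from the RFC.

```python
NO_EXPORT = 0xFFFFFF01       # well-known community, RFC 1997
NO_ADVERTISE = 0xFFFFFF02    # well-known community, RFC 1997


def encode_community(asn: int, value: int) -> int:
    """Pack AS:value into the 32-bit on-the-wire form."""
    if not (0 <= asn <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError("both halves must fit in 16 bits")
    return (asn << 16) | value


def decode_community(community: int) -> str:
    """Render a 32-bit community as AS:value text."""
    return f"{community >> 16}:{community & 0xFFFF}"


if __name__ == "__main__":
    uunet_origin = encode_community(4006, 701)   # "route came from UUNet"
    print(decode_community(uunet_origin))        # 4006:701
    print(decode_community(NO_EXPORT))           # 65535:65281
```

Because each half is only 16 bits, the origin-AS scheme above works for any two-byte AS number.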
Local Preference
(Diagram: 165.200.1.0/24, originated by customer AS 3703, reaches AS 4355 three ways: directly from the customer with local preference 100, via peering with AS 4006 with local preference 90, and via transit AS 701 with local preference 60.)
router bgp 4355
 neighbor a.a remote-as 3703
 neighbor c.c remote-as 701
 neighbor b.b remote-as 4006
 neighbor a.a route-map setlocpref 100 in
 neighbor c.c route-map setlocpref 60 in
 neighbor b.b route-map setlocpref 90 in
Local Preference Hierarchy • The higher the Local Preference, the more desirable the route. • Customers ALWAYS come first – we never want to send their traffic to a peer, regardless of ASPath padding • Private Peering is always more desirable than Public Peering • Transit is less desirable than private peering for economic reasons
Local Preference Hierarchy • Always Preferred: 250 • Customer Routes: 100 • Customer Backup Routes: 90 • Private Peering: 80 • Less Preferred Private Peering (congested): 70 • Paid Transit: 60 • Less Preferred Paid Transit (congested): 50 • Public Peering (ATM NAPs): 40 • Less Preferred Public Peering (FDDI NAPs): 30 • Never Preferred: 1
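The effect of this hierarchy on route selection can be sketched as follows. The local preference values are the ones listed on the slide; the dictionary keys, the Route class, and the tie-break on AS-path length alone are simplifying assumptions, since real BGP selection has more steps.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

LOCAL_PREF = {
    "always_preferred": 250,
    "customer": 100,
    "customer_backup": 90,
    "private_peering": 80,
    "private_peering_congested": 70,
    "paid_transit": 60,
    "paid_transit_congested": 50,
    "public_peering_atm": 40,
    "public_peering_fddi": 30,
    "never_preferred": 1,
}


@dataclass(frozen=True)
class Route:
    source: str                 # key into LOCAL_PREF
    as_path: Tuple[int, ...]


def best_route(routes: Iterable[Route]) -> Route:
    """Highest local preference wins; shorter AS path only breaks ties."""
    return max(routes, key=lambda r: (LOCAL_PREF[r.source], -len(r.as_path)))


if __name__ == "__main__":
    padded_customer = Route("customer", (3703, 3703, 3703, 3703))
    short_peer = Route("private_peering", (4006, 3703))
    print(best_route([padded_customer, short_peer]).source)  # customer
```

The customer route wins despite its padded AS path, because local preference is compared before path length.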
Peer Types • Local sub-AS Peer (within a POP) • Confederation Peers (other POPs or sub-ASes) • Transit Peers (we buy transit from them) • Public/Private Peering • Customer Peers
Local sub-AS Peers • All peers within a POP are members of this group. • The update source for these BGP sessions will be the loopback address of the router. • Communities must be recognized. • Option to use full-mesh or route-reflectors. For Each Local Sub-AS Peer neighbor
Update-Source Loopback Address • The routers will use the loopback address as the source of the BGP packets. – Only one session needs to be created even with multiple paths between routers. • Peering between loopback addresses increases the stability of the BGP sessions, since loopback addresses don’t go down. (Diagram: two routers with loopbacks 192.168.128.1/32 and 192.168.128.2/32, joined by two links: 207.69.132.1/24 to 207.69.132.2/24 and 207.69.133.1/24 to 207.69.133.2/24.)
Confederation Peers • All peers that are POP border routers are members of this group. • The update source for these BGP sessions will be the facing interface of the router. • Inbound Soft Reconfiguration is not necessary. – Outbound soft reconfiguration can be done at the remote end • Communities must be recognized. • Filtering is done on egress, MEDs are set on ingress.
Soft Reconfiguration • “clear ip bgp” drops the TCP session. Soft reconfiguration is much friendlier. • “clear ip bgp
Confederation Peer Configuration
Peer-Group
 neighbor internal peer-group
 neighbor internal version 4
 neighbor internal send-community
For Each Peer
 neighbor
Confederation Peer Routes • Don’t Send: No Advertise • Send: Customer, Peer, Transit, Internal
Additive MEDs • Why – Allows a tiebreaker based on optimum routing – Allows an alternate method to de-prefer routes in case of transit/peering congestion • Possible Values – Mileage – Delay in ms – Fixed value per hop • Supported by – Cisco IOS – Feature request in JUNOS, Riverstone, Foundry IronWare
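Additive MEDs in Confederations (Diagram: 207.69.0.0/16 as seen at four POPs, with link metrics 120, 580, 600, and 40.) • Originating POP: MED 0 (originated here) • Second POP: MED 120, path (65000) • Third POP: MED 700, path (65012 65000), or MED 760, path (65401 65012 65000) • Fourth POP: MED 720, path (65012 65000), or MED 740, path (65400 65012 65000)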
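The MED values on this slide are consistent with each POP adding the metric of the link a route arrived on. The sketch below reproduces them under one reading of the diagram, which is an assumption: sub-AS 65000 originates the route, 65012 is its neighbor over the 120 link, and 65400 and 65401 hang off 65012 over the 580 and 600 links with a 40 link between them.

```python
from typing import Sequence

# Assumed link metrics between neighboring sub-ASes (one reading of the diagram).
LINK_METRIC = {
    frozenset({65000, 65012}): 120,
    frozenset({65012, 65400}): 580,
    frozenset({65012, 65401}): 600,
    frozenset({65400, 65401}): 40,
}


def additive_med(local_sub_as: int, confed_path: Sequence[int]) -> int:
    """Sum link metrics from the local sub-AS back to the originator.

    `confed_path` lists sub-ASes from the nearest neighbor to the originator,
    in the same order as the confederation part of the AS path.
    """
    hops = [local_sub_as, *confed_path]
    return sum(LINK_METRIC[frozenset(pair)] for pair in zip(hops, hops[1:]))


if __name__ == "__main__":
    print(additive_med(65400, [65012, 65000]))          # 700
    print(additive_med(65400, [65401, 65012, 65000]))   # 760
    print(additive_med(65401, [65012, 65000]))          # 720
    print(additive_med(65401, [65400, 65012, 65000]))   # 740
```

In each POP the lower MED marks the shorter way around the ring, which is the tiebreaker the previous slide describes.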
Transit Peers • The update source for these BGP sessions will be the facing interface address of the router. • Soft Reconfiguration should be used. • Communities must be recognized. • Send out only customer and internal routes. • Apply an import ACL to the routes that prevents reception of martian routes, and assigns proper communities and local preference. • Allows prepending certain subsets of routes with additional AS numbers.
Transit Peer Config neighbor
Transit Peer Config • Don’t Send: No-Export, No-Advertise, Peer, or Transit routes • Send: Customers, Internal
Transit Tricks • De-prefer routes for congested outbound – Set Local Pref normally for routes with AS-Path Length=1 or 2 – Set Local Pref Lower for all other routes – Effect: Only most direct routes flow through that connection. Others flow through other transit, if available • OPN’s and sending OPN routes – Send special routes – usually for servers and services – only to your own network, and OPNs – Have a special community list or policy specifying the routes.
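The first trick amounts to a small decision rule applied to routes learned from a congested transit link. The sketch below is illustrative only: the function name is hypothetical, and the default values 60 and 50 are borrowed from the Paid Transit and congested Paid Transit entries of the local preference hierarchy.

```python
from typing import Sequence


def transit_local_pref(
    as_path: Sequence[int],
    congested: bool,
    normal: int = 60,
    lowered: int = 50,
    direct_max_len: int = 2,
) -> int:
    """Local preference for a route learned from a transit provider.

    On a congested link only the most direct routes (AS-path length 1 or 2)
    keep the normal value; everything else is de-preferred.
    """
    if congested and len(as_path) > direct_max_len:
        return lowered
    return normal


if __name__ == "__main__":
    print(transit_local_pref((701,), congested=True))             # 60
    print(transit_local_pref((701, 3703), congested=True))        # 60
    print(transit_local_pref((701, 1239, 3703), congested=True))  # 50
```

Routes to the provider and its direct customers stay on the congested link, while more distant destinations move to another transit provider if one offers them.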
Private/Public Peers • The update source for these BGP sessions will be the facing interface address of the router. • Soft Reconfiguration should be used. • Communities must be recognized. • Send out only customer and internal routes. • Apply an import ACL to the routes that prevents reception of martian routes, and assigns proper communities and local preference. • Option to use local preference to prefer unconditionally all or only some routes coming from a free peer.
Peer Configuration
 neighbor free-peering peer-group
 neighbor free-peering send-community
 neighbor free-peering version 4
 neighbor free-peering next-hop-self
 neighbor free-peering-full soft-reconfiguration inbound
 neighbor free-peering-full distribute-list martians in
 neighbor free-peering route-map
Free Peering Routes • Don’t Send: No-Export, No-Advertise, Peer, or Transit routes • Send: Customers, Internal
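The send and don't-send lists for confederation, transit, and free peers can be expressed as one community-driven export check. This is a sketch under assumptions: the community values are the route-type assignments from the earlier slide, the peer-type names and table layout are hypothetical, and sending only global-visible internal routes to transit and free peers is an interpretation of the OPN slide.

```python
from typing import Set

CUSTOMER = "4006:65150"
PRIVATE_PEER = "4006:65140"
TRANSIT = "4006:65130"
PUBLIC_PEER = "4006:65120"
INTERNAL_OPN = "4006:65110"
INTERNAL_GLOBAL = "4006:65100"
NO_EXPORT = "no-export"
NO_ADVERTISE = "no-advertise"

EXTERNAL_BLOCK = {NO_EXPORT, NO_ADVERTISE, PRIVATE_PEER, PUBLIC_PEER, TRANSIT}

# Communities a route must carry to be sent to each peer type.
SEND = {
    "confederation": {CUSTOMER, PRIVATE_PEER, PUBLIC_PEER, TRANSIT,
                      INTERNAL_OPN, INTERNAL_GLOBAL},
    "transit": {CUSTOMER, INTERNAL_GLOBAL},
    "free_peering": {CUSTOMER, INTERNAL_GLOBAL},
}

# Communities that block a route regardless of anything else it carries.
NEVER_SEND = {
    "confederation": {NO_ADVERTISE},
    "transit": EXTERNAL_BLOCK,
    "free_peering": EXTERNAL_BLOCK,
}


def should_send(route_communities: Set[str], peer_type: str) -> bool:
    """Decide on egress using only the communities set at ingress."""
    if route_communities & NEVER_SEND[peer_type]:
        return False
    return bool(route_communities & SEND[peer_type])


if __name__ == "__main__":
    print(should_send({CUSTOMER}, "transit"))        # True
    print(should_send({TRANSIT}, "free_peering"))    # False
    print(should_send({TRANSIT}, "confederation"))   # True
```

Keeping the policy in two small tables mirrors the slides' point that community lists serve as reusable building blocks.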
Customer Peers • The update source for these BGP sessions will be the facing interface address of the router. • Soft Reconfiguration should be used. • Communities must be recognized. This includes communities sent from customers. • Send out selected routes, based on customer request. • Apply an import ACL to the routes that prevents reception of martian routes, and assign proper communities and local preference. • The import filter must also accept only specific customer routes. – We recommend using Rtconfig to query RADB and generate the ACLs.
What Type of Routes Can We Send? • Full Routes – Customer, Peers, Internals, Transit. – AKA “A Full View” • Customer Routes – Customer and Internal Routes. – Good for weaker routers (Cisco) – AKA “A Partial View” • Default Route – Send only a default route, 0.0.0.0/0, pointed to the router interface – Limited utility
Special Considerations for Customers • Carefully Filter routes – the farther downstream you get, the less clueful (generally) • Filtering can be based on AS or Prefix • The generally accepted practice is to filter by IP Access List at ingress (use radb tools if possible) • Customers do not have to advertise the same routes everywhere – peers do!
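An ingress prefix filter for a customer session can be sketched with the standard ipaddress module. The assumptions are loud ones: the martian list is a short illustrative sample, not a complete or current bogon list; the exact-match rule is one possible policy; and the registered prefixes would in practice come from a registry query rather than a hand-written list.

```python
import ipaddress
from typing import Iterable

# Illustrative sample only; a production martian list is longer.
MARTIANS = [
    ipaddress.ip_network(p)
    for p in ("0.0.0.0/8", "10.0.0.0/8", "127.0.0.0/8",
              "172.16.0.0/12", "192.168.0.0/16", "224.0.0.0/3")
]


def accept_customer_route(prefix: str, registered: Iterable[str]) -> bool:
    """Accept a prefix only if it is registered for the customer and not a martian."""
    net = ipaddress.ip_network(prefix)
    if any(net.overlaps(martian) for martian in MARTIANS):
        return False
    # Exact match against the customer's registered prefixes (assumed policy).
    return any(net == ipaddress.ip_network(r) for r in registered)


if __name__ == "__main__":
    registered = ["165.200.1.0/24"]
    print(accept_customer_route("165.200.1.0/24", registered))  # True
    print(accept_customer_route("165.200.2.0/24", registered))  # False
    print(accept_customer_route("10.1.0.0/16", ["10.1.0.0/16"]))  # False
```

The martian check runs first, so a private or reserved prefix is rejected even if it somehow appears in the registered list.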
Customer Configuration – Full Routes bgp { group
Customer Configuration – Partial Routes bgp { group
Default Route Only • Cisco – neighbor a.b.c.d default-originate • Juniper – A little more complex... bgp { group
Question and Answer • Confederations • General BGP Questions
The New Way gives us… • Less complexity • More stability • More flexibility for traffic management • Greater Survivability • Lower Engineering and Administrative costs • Increased Uptime • A Scalable, Next Generation IP Network
Bibliography • RFC 1771 A Border Gateway Protocol 4 (BGP-4) • RFC 1965 Autonomous System Confederations for BGP • RFC 1930 Guidelines for creation, selection, and registration of an Autonomous System (AS) • RFC 1997 BGP Community Attributes • Nussbacher, Rudnev, and Hares, Global BGP Community Values, Internet Draft, 12/99 • Halabi, Bassam; Internet Routing Architectures • Freedman, Avi, Lecture Notes: January 1999 NANOG Conference Session: “BGP 102”
In Tribute to the Memory of... • MindSpring Enterprises, Inc. Very Special Thanks to… • Brandon Ross, NetRail • Avi Freedman, Akamai • Khalid Raza, Cisco


