Design Requirements for Bullet-Proof Packet Passers Avi Freedman

Avi Freedman, avi@freedman.net • Chief Technical Officer, Netaxs • VP and Chief Network Architect, Akamai

Overview • Goals and problems in Good Networking • Current and future SLAs • Failure analysis • Hardware requirements • Software requirements • Sample architecture – Nortel OPC • Open questions

Goals for Good Networking • The four things that customers seem to want from IP networking: – Stability – Performance – Burstability/capacity assurance – Price • Order varies, but Stability is almost always #1

Problems in Good Networking • Performance is often a backbone capacity issue, and more often a peering/transit issue. • Burstability problems come from a lack of large aggregation capabilities (no 100 gb ports to connect 1 gb customers to); this is a soluble engineering problem, though, with enough of even today’s hardware.

Problems in Good Networking • The biggest problem is stability. Four main causes: – Operator error – Software – Fiber cuts – Hardware • One can argue over ranking, but all are important. • Fiber is a soluble issue with money and engineering. • We’ll revisit these.

Current and Future SLAs • Today’s SLAs are fairly weak. SLAs of the future will trend towards minutes per year of outage, with large credits for complete outages. • CDNs already offer SLAs that give 1 day credit for a 15 minute slowdown (not even outage). • Today’s hardware and software cannot be relied upon to pass IP packets reliably enough to meet these SLAs. • To meet these SLAs, 5 minutes/year of systemwide outage is probably all that customers will tolerate at some point – and the first network to offer it in a vacuum will win huge market share.

Failure Analysis – Op Error • What causes operator error? • Often it’s not ignorance, but the fact that doing distributed configuration is hard with today’s tools. • Key point – the Cisco ‘no’ method has caused many a network outage. • GUIs are unwieldy, though. • And a Unix OS on routers is a security problem! • Industry work on ‘safer’ GUIs is needed.

Failure Analysis - Hardware • Hardware is typically less of a problem, but OIR often stands for “Online insert and reboot”. • The design needs to be simple, elegant, and redundant. • Ideally, scalable and expandable as well, but simplicity of design is the best assurance of stability.

Failure Analysis - Software • Router software causes literally hundreds of outages per year – even (excuse the term) megalapses inside networks. • Most of the problems do NOT relate to protocol design, though there are scaling issues to be solved there. • Most of the problems come from: – Bad code – Bad OS (OS fails to protect against bad code)

Failure Analysis – CPU Protection • Additionally, there is a chronic problem in that vendors are not providing sufficient protection for the route-processing engines, and as denial of service attacks get more aggressive, this is a growing problem! • The industry needs to describe to vendors what rules are needed – (Don’t allow multicast except for OSPF to connected interfaces, etc…)

Failure Analysis – Software Modularity • In addition to contributing to bad code, the more monolithic nature of current router OSs makes it hard to avoid downtime while upgrading the network. • Upgrade-on-the-fly (with a base OS that remains unchanged) is an elusive goal, but it is achievable – 5ESS and DMS boxes prove it.

Sample Architecture – Nortel OPC • As a case study, we consider the Nortel OPTera Packet Core, which has been designed around carrier-class robustness, with feedback from industry and telephony-switch engineers. • The OPC is a 3+-year-old research project that went into “product” mode about a year ago. Products are about a year out, so Nortel is aggressively seeking input about robustness!

OPC – Design Requirements • The OPC team defined 99.999% as the target uptime, and defined “uptime” as uptime across ports. So, 5 minutes of downtime per year across all (of up to) 480 ports, or potentially more downtime across fewer ports. • The team figures on 2 software upgrades/year, and splits “acceptable” failures roughly evenly between hardware and software.
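The downtime budget on this slide can be checked with a little arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes that "uptime across ports" means a yearly budget of port-minutes, so that an outage touching fewer ports may last proportionally longer.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def downtime_budget_minutes(availability: float) -> float:
    """Minutes per year of allowed downtime for an availability fraction."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def port_minutes_budget(availability: float, total_ports: int) -> float:
    """Yearly downtime budget expressed in port-minutes (assumed model)."""
    return downtime_budget_minutes(availability) * total_ports


def allowed_outage_minutes(availability: float, total_ports: int,
                           affected_ports: int) -> float:
    """Longest outage on `affected_ports` that still fits the yearly budget."""
    return port_minutes_budget(availability, total_ports) / affected_ports


if __name__ == "__main__":
    five_nines = 0.99999
    # Whole-system outage: roughly 5.26 minutes per year.
    print(downtime_budget_minutes(five_nines))
    # An outage confined to 48 of 480 ports could last about ten times longer.
    print(allowed_outage_minutes(five_nines, 480, 48))
```

Under this reading, the "5 minutes" figure is the whole-system case, and the slide's "more downtime across fewer ports" follows directly from dividing the same budget over fewer ports.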

OPC – Hardware Overview • The OPC starts with a base 20-slot “application shelf” chassis of port and/or processor cards, and fabric slots. Base config can run in-chassis fabric, but is not expandable on the fly. • If broken out into an application shelf and fabric shelf, can be expanded to full 480-slot config without downtime or packet loss.

OPC – Hardware Overview • Each slot has (up to) 10 gb of “port” capacity, and 16 gb of backplane (14.5 gb effective after overhead). • Maximally configured, a 4.8 tb router consisting of 24 application shelves in 12 racks, 16 fabric shelves in 4 bays, and a processor shelf. • Each shelf can be up to 1 km apart (entire system must be within 1 km diameter per spec, though it’s not clear this is a robustness-enhancing function until the router can operate partitioned)
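The capacity figures on the two hardware slides are consistent with each other, as the short calculation below shows. It is a sketch under one stated assumption: in the maximal configuration all 20 slots of each application shelf carry port cards, since the fabric has moved to its own shelves. The constant and function names are illustrative, not Nortel terminology.

```python
SLOTS_PER_APP_SHELF = 20      # from the base application shelf
APP_SHELVES = 24              # maximal configuration
PORT_GB_PER_SLOT = 10         # "port" capacity per slot
BACKPLANE_GB_PER_SLOT = 16    # raw backplane capacity per slot
BACKPLANE_EFFECTIVE_GB = 14.5 # after overhead


def total_slots() -> int:
    """Slots in the maximal config (assumes every shelf slot is a port slot)."""
    return SLOTS_PER_APP_SHELF * APP_SHELVES


def total_port_capacity_tb() -> float:
    """Aggregate port capacity in tb, using 1 tb = 1000 gb."""
    return total_slots() * PORT_GB_PER_SLOT / 1000


def backplane_overhead_fraction() -> float:
    """Share of raw backplane capacity lost to overhead."""
    return 1 - BACKPLANE_EFFECTIVE_GB / BACKPLANE_GB_PER_SLOT


def backplane_speedup() -> float:
    """Effective backplane capacity relative to port capacity per slot."""
    return BACKPLANE_EFFECTIVE_GB / PORT_GB_PER_SLOT


if __name__ == "__main__":
    print(total_slots())                  # 480
    print(total_port_capacity_tb())       # 4.8
    print(backplane_overhead_fraction())  # 0.09375
    print(backplane_speedup())            # 1.45
```

So 24 shelves of 20 slots give the 480 slots quoted earlier, 480 slots at 10 gb give 4.8 tb, and each slot's effective backplane is about 1.45 times its port capacity.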

OPC – Fabric • The OPC fabric is “passive” – with each possible set of boards, the config is fixed, and no software is required to drive or configure the fabric. • Can be imagined as parallel train tracks, with each board being a “station”, and slightly fewer “trains” shuttling 4 cells of traffic (each cell carrying one of 4 fixed priorities). More boards means more stations.

OPC – Card Architecture • Each card has a general-purpose CPU (Motorola 750), and two packet-processor chips (the RSP2). • The RSP2 runs “software”, mostly microcode, scheduling, etc… • The RSP2 can do up to 100 instructions on each of 16 packets in parallel, and then in serial for packet modification. • For read-only packet processing, within 1% of line rate is possible per card. 40-43 byte packets are line-rate, 65-70 byte packets yield < 1% loss, beyond is line-rate.

OPC - Software • The major cause of software-based router failures is bad code. Ultimately, better software engineering is required. • Along the way, sound software architecture and protective features are needed. • And on-the-fly upgrade-ability. • As well as main-CPU-protection.

OPC – Main-CPU Protection • Each board’s RSP2s can do packet classification inbound or outbound, can throw away packets, replicate them (multicast or sniffing), kick them up to the main CPU, or send them to another port/card. • The capability exists as well to shape different classes of traffic as part of kicking packets up to the main CPU on-card or on another card. • The key is the ruleset; input is needed.

Main CPU Protection • As a general issue, rules should be reflected in multiple router vendors. • Rules such as: – 64 k/sec of BGP from an IP, only if we are talking to that IP – No non-OSPF multicast – 10 packets per second to each connected IP
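The three example rules can be sketched as a classifier sitting in front of the route processor, using token buckets for the rate limits. Everything here is hypothetical and is not OPC or RSP2 code: the class names, the packet dictionary layout, the reading of "64 k/sec" as 64,000 bits per second, and the reading of "to each connected IP" as a per-destination limit are all assumptions made for illustration.

```python
import ipaddress


class TokenBucket:
    """Classic token bucket: `rate` tokens per second, capped at `burst`."""

    def __init__(self, rate: float, burst: float):
        self.rate = rate
        self.burst = burst
        self.tokens = burst
        self.last = 0.0

    def allow(self, now: float, cost: float = 1.0) -> bool:
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class CpuGuard:
    """Decides whether a packet may be punted to the main CPU."""

    BGP_BITS_PER_SEC = 64_000   # assumed meaning of "64 k/sec"
    PPS_PER_CONNECTED_IP = 10

    def __init__(self, bgp_peers, connected_ips):
        # One bucket per configured BGP peer; unknown sources get none.
        self.bgp = {ip: TokenBucket(self.BGP_BITS_PER_SEC, self.BGP_BITS_PER_SEC)
                    for ip in bgp_peers}
        # One packets-per-second bucket per connected interface address.
        self.pps = {ip: TokenBucket(self.PPS_PER_CONNECTED_IP,
                                    self.PPS_PER_CONNECTED_IP)
                    for ip in connected_ips}

    def admit(self, pkt: dict, now: float) -> bool:
        # Rule: no non-OSPF multicast.
        if ipaddress.ip_address(pkt["dst"]).is_multicast:
            return pkt["proto"] == "ospf"
        # Rule: BGP only from peers we are talking to, rate-limited in bits.
        if pkt["proto"] == "bgp":
            bucket = self.bgp.get(pkt["src"])
            return bucket is not None and bucket.allow(now, pkt["size"] * 8)
        # Rule: 10 packets per second to each connected IP.
        bucket = self.pps.get(pkt["dst"])
        return bucket is not None and bucket.allow(now)


if __name__ == "__main__":
    guard = CpuGuard(bgp_peers={"192.0.2.1"}, connected_ips={"198.51.100.1"})
    ping = {"src": "203.0.113.7", "dst": "198.51.100.1", "proto": "icmp", "size": 64}
    results = [guard.admit(ping, now=0.0) for _ in range(11)]
    print(results.count(True))  # 10 admitted, the 11th is dropped
```

The point of the sketch is that each rule is small and vendor-neutral; the open question from the slide, which rules belong in the set, is not something the code answers.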

Nortel OPC - CLI • Nortel is soliciting input on robust CLI design to reduce operator error. • Possibilities include the ability to add comments, transactions (commit/rollback), and network-wide synchronized updates (though these can cause instability as well)
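To make the commit/rollback idea concrete, here is a minimal sketch of a transactional configuration session with comments attached to each commit. It is not a description of any Nortel CLI; the `ConfigSession` class, its method names, and the flat key/value config model are all hypothetical.

```python
import copy


class ConfigSession:
    """Edits go to a candidate config; the running config changes only on commit."""

    def __init__(self, running: dict):
        self.running = copy.deepcopy(running)
        self.candidate = copy.deepcopy(running)
        self.history = []  # (comment, running config before that commit)

    def set(self, key, value):
        self.candidate[key] = value

    def delete(self, key):
        # Raises KeyError if the key is absent, so a mistyped removal fails loudly.
        del self.candidate[key]

    def diff(self) -> dict:
        """Pending changes as {key: (running value, candidate value)}."""
        keys = set(self.running) | set(self.candidate)
        return {k: (self.running.get(k), self.candidate.get(k))
                for k in sorted(keys)
                if self.running.get(k) != self.candidate.get(k)}

    def commit(self, comment: str, validate=lambda cfg: True):
        if not validate(self.candidate):
            raise ValueError("candidate rejected; running config unchanged")
        self.history.append((comment, self.running))
        self.running = copy.deepcopy(self.candidate)

    def rollback(self) -> str:
        """Undo the most recent commit and return its comment."""
        comment, previous = self.history.pop()
        self.running = previous
        self.candidate = copy.deepcopy(previous)
        return comment


if __name__ == "__main__":
    session = ConfigSession({"bgp.peer.192.0.2.1": "as 64500"})
    session.set("ospf.area0", "enabled")
    print(session.diff())        # the change is pending, not yet live
    session.commit("enable OSPF in area 0")
    print(session.rollback())    # returns the comment of the undone commit
```

A validation hook at commit time is one place where rules against dangerous changes (for example, removing the last BGP peer) could live, which is the kind of safety the operator-error slide asks for.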

OPC – Software Architecture • We now talk about the software that runs on the main CPUs, and the main Motorola 750 procs per board. • Chorus multi-threaded, multi-CPU real-time OS as a base. Has memory protection and preemptive multitasking. • IPC layer (“RACE”) on top, handles communication between processes (“agents”) and threads. Among other things, RACE allows “virtual synchrony” – running multiple processes in parallel and taking the first answer as the result. • This allows for easy upgrading of processes, and robustness in case of single- or multi-card failures.
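The "take the first answer" behavior described for RACE can be illustrated with ordinary threads. This is a sketch of that one idea only, assuming replicas are plain callables; it does not model RACE or Chorus APIs, and it does not attempt the message-ordering guarantees a real group-communication layer would provide. The function and replica names are hypothetical.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def first_answer(replicas, request):
    """Run every replica on the same request; return the first successful result."""
    pool = ThreadPoolExecutor(max_workers=len(replicas))
    pending = {pool.submit(replica, request) for replica in replicas}
    try:
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                if future.exception() is None:
                    return future.result()
        raise RuntimeError("all replicas failed")
    finally:
        # Do not block on slower replicas; they finish in the background.
        pool.shutdown(wait=False)


def fast_agent(x):
    return x * 2


def slow_agent(x):
    time.sleep(0.2)  # stands in for a busy or distant card
    return x * 2


def failed_agent(x):
    raise RuntimeError("card failed")


if __name__ == "__main__":
    # A failed replica is ignored as long as another one answers.
    print(first_answer([failed_agent, slow_agent, fast_agent], 21))  # 42
```

Because callers see only the first good answer, one replica can be stopped and replaced (an upgrade) or can crash (a card failure) without the caller noticing, which is the robustness property the slide claims.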

Open Questions • What are other vendors doing? Cisco, Juniper, and Avici all seem to be missing major areas that Nortel is addressing. Of course, you can buy Cisco, Juniper, and Avici products now. • CLI design input • CPU protection rule input • Software architecture input (what modules should be on-the-fly upgrade-able); for example, tradeoffs in BGP convergence vs. upgrade-ability.