
3e1cd1f8da7ab5445ad89f2c8efae31e.ppt

  • Number of slides: 60

Large-Scale “Ethernets” and Enterprise Networks

Major Themes:
• Scaling Ethernets to millions of nodes
• Building scalable, robust and “plug-&-play” networks
• Building “beyond IP” and “future-proof” networks

Two Case Studies: SEATTLE, VIRO; see also TRILL (IETF draft standard)

Please do the required readings!

CSci 5221: Data Center Networking, and Large-Scale Enterprise Networks: Part II

Current Trends and Future Networks
• Large number of mobile users and systems
• Large number of smart appliances
• High-bandwidth core and edges
• But also resource-limited elements and networks (e.g., sensors, MANETs)
• Heterogeneous technologies
[Figure: the Internet (or a future networking substrate) interconnecting home users, cell & smart phones, online TV, multimedia streaming, games, web/email, POTS, VoIP, surveillance & security, and banking & e-commerce.]

Even within a Single Administrative Domain
• Large ISPs and enterprise networks
• Large data centers with thousands or tens of thousands of machines
• Metro Ethernet
• More and more devices are “Internet-capable” and plugged in
• Likely richer and more diverse network topology and connectivity

Challenges Posed by These Trends
• Scalability: capability to connect tens of thousands, millions, or more users and devices
  – routing table size, constrained by router memory and lookup speed
• Mobility: hosts are mobile
  – need to separate location (“addressing”) from identity (“naming”)
• Availability & Reliability: must be resilient to failures
  – need to be “proactive” instead of reactive
  – need to localize the effect of failures
• Manageability: ease of deployment, “plug-&-play”
  – need to minimize manual configuration
  – self-configure and self-organize, while ensuring security and trust
• …

Quick Overview of Ethernet
• Dominant wired LAN technology
  – Covers the first IP hop in most enterprises/campuses
• First widely used LAN technology
• Simpler and cheaper than token-ring LANs, ATM, and IP
• Kept up with the speed race: from 10 Mbps up to 40 Gbps
  – Soon 100 Gbps will be widely available
[Figure: Metcalfe’s original Ethernet sketch.]

Ethernet Frame Structure
• Addresses: source and destination MAC addresses
  – Flat, globally unique, and permanent 48-bit values
  – The adaptor passes the frame to the network-level protocol
    • if the destination address matches the adaptor’s
    • or the destination address is the broadcast address
  – Otherwise, the adaptor discards the frame
• Type: indicates the higher-layer protocol
  – Usually IP
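The header layout above can be sketched as a short parser (a minimal Python illustration; the function name and dictionary shape are assumptions, and real stacks do this in the NIC driver/kernel, not in application code):

```python
import struct

def parse_ethernet_frame(frame: bytes) -> dict:
    """Parse the 14-byte Ethernet header: dst MAC, src MAC, EtherType."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    fmt = lambda b: ":".join(f"{x:02x}" for x in b)
    return {
        "dst": fmt(dst),
        "src": fmt(src),
        "type": ethertype,                  # e.g. 0x0800 = IPv4, 0x0806 = ARP
        "is_broadcast": dst == b"\xff" * 6,  # all-ones destination address
        "payload": frame[14:],
    }

# A minimal frame: broadcast destination, some source, EtherType IPv4
frame = b"\xff" * 6 + b"\x00\x11\x22\x33\x44\x55" + b"\x08\x00"
parsed = parse_ethernet_frame(frame)
```

An adaptor would accept this frame (broadcast destination) and hand the payload to the protocol identified by `type`.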

Interaction w/ the Upper Layer (IP)
• Bootstrapping end hosts by automating host configuration (e.g., IP address assignment)
  – DHCP (Dynamic Host Configuration Protocol)
  – Broadcast DHCP discovery and request messages
• Bootstrapping each conversation by enabling resolution from IP to MAC address
  – ARP (Address Resolution Protocol)
  – Broadcast ARP requests
• Both protocols work via Ethernet-layer broadcasting (i.e., shouting!)

Broadcast Domain and IP Subnet
• Ethernet broadcast domain
  – A group of hosts and switches to which the same broadcast or flooded frame is delivered
  – Note: broadcast domain != collision domain
• Broadcast domain == IP subnet
  – Uses ARP to reach other hosts in the same subnet
  – Uses the default gateway to reach hosts in different subnets
• Too large a broadcast domain leads to
  – Excessive flooding and broadcasting overhead
  – Insufficient security/performance isolation

Ethernet Bridging: “Routing” at L2
• Routing determines paths to destinations, through which traffic is forwarded
• Routing takes place at any layer (including L2) where devices are reachable across multiple hops
  – App layer: P2P or CDN routing, overlay routing
  – IP layer: IP routing
  – Link layer: Ethernet bridging

Ethernet (Layer-2) “Routing”
• Self-learning algorithm for dynamically building switch (forwarding) tables
  – “Eavesdrop” on source MACs of data packets
  – Associate source MACs with port # (cached, “soft state”)
• Forwarding algorithm
  – If the dst MAC is found in the switch table, send to the corresponding port
  – Otherwise, flood to all ports (except the one it came from)
• Dealing with “loopy” topologies
  – Run the spanning tree algorithm (periodically) to convert the topology into a tree (rooted at an “arbitrary” node)
• 802.11 wireless LANs use somewhat similar methods
  – Use the same 48-bit MAC addresses, but more complex frame structures
  – End hosts need to explicitly associate with APs
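The self-learning and flood-based forwarding described above can be sketched as follows (illustrative Python; the class name, integer port numbers, and the `FLOOD` marker are assumptions, and real switches also age out “soft-state” entries, which is omitted here):

```python
class LearningSwitch:
    """Minimal sketch of an Ethernet self-learning switch."""
    FLOOD = -1  # pseudo-port: "send out of every port except the ingress"

    def __init__(self, num_ports: int):
        self.num_ports = num_ports
        self.table = {}  # source MAC -> port it was last seen on

    def handle_frame(self, src: str, dst: str, in_port: int) -> list:
        # Self-learning: eavesdrop on the source MAC, remember its port.
        self.table[src] = in_port
        # Forwarding: known destination -> its port; unknown -> flood.
        out = self.table.get(dst, self.FLOOD)
        if out == self.FLOOD:
            return [p for p in range(self.num_ports) if p != in_port]
        return [out]

sw = LearningSwitch(num_ports=4)
first = sw.handle_frame(src="A", dst="B", in_port=0)  # B unknown: flood
reply = sw.handle_frame(src="B", dst="A", in_port=2)  # A learned: unicast
```

After the first exchange the switch has learned both hosts, so subsequent frames between A and B are unicast rather than flooded.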

Pros and Cons of Layer-2 Technologies
Pluses:
• “Plug-&-play,” with minimal configuration
  – But VLANs require manual configuration!
• MAC address: flat name, host mobility
Minuses:
• Data-plane flooding, not scalable to large networks
• Sub-optimal routing (using a single spanning tree)
  – Can’t deal with complex topologies!
• Not robust to failures!
  – if any edge or node on the spanning tree fails, the spanning tree must be recomputed
  – slow convergence!

Pros and Cons of (Layer-3) IP
Pluses:
• Better data-plane scalability
  – unicast packets are sent point-to-point!
• More “optimal” routing
  – one spanning tree per destination
  – link weights can be used to reflect link bandwidth, latency, load, etc.
Minuses:
• Two-level hierarchical names & (manual) address management
  – prefix assignment per link; router configuration
  – poorer support for mobility
  – difficulty/complexity in “re-naming,” esp. changing addressing schemes (IPv4 -> IPv6 transition)
• Control-plane flooding & reactive approach to failures!
  – global effect of network failures
  – can’t take advantage of rich network topologies (path diversity)!

State of the Practice: A Hybrid Architecture
Enterprise networks are composed of Ethernet-based IP subnets interconnected by routers.
• Ethernet bridging, within a broadcast domain (LAN or VLAN)
  – Flat addressing
  – Self-learning
  – Flooding
  – Forwarding along a tree
• IP routing (e.g., OSPF) between routers
  – Hierarchical addressing
  – Subnet configuration
  – Host configuration
  – Forwarding along shortest paths

“All-Ethernet” Enterprise Network?
• “All-Ethernet” makes network management easier
  – Flat addressing and self-learning enable plug-and-play networking
  – Permanent and location-independent addresses also simplify
    • host mobility
    • access-control policies
    • network troubleshooting

Data Center Networks!
• Data centers
  – Backend of the Internet
  – Mid-scale (most enterprises) to mega-scale (Google, Yahoo, MS, etc.)
    • E.g., a regional DC of a major online service provider consists of 25K servers + 1K switches/routers
• To ensure business continuity, and to lower operational cost, DCs must
  – Adapt to varying workloads (breathing)
  – Avoid/minimize service disruption during maintenance or failures (agility)
  – Maximize aggregate throughput (load balancing)

But, Ethernet Bridging Does Not Scale
• Flooding-based delivery
  – Frames to unknown destinations are flooded
• Broadcasting for basic service
  – Bootstrapping relies on broadcasting
  – Vulnerable to resource-exhaustion attacks
• Inefficient forwarding paths
  – Loops are fatal due to broadcast storms; hence the STP
  – Forwarding along a single tree leads to inefficiency and lower utilization

Layer 2 vs. Layer 3 Again
Neither bridging nor routing is satisfactory. Can’t we take only the best of each?
[Table: the three architectures (Ethernet bridging, IP routing, SEATTLE) compared on these features: ease of configuration, optimality in addressing, host mobility, path efficiency, load distribution, convergence speed, tolerance to loops.]

“Floodless” in SEATTLE (Scalable Ethernet ArchiTecture for Larger Enterprises)
• Objectives
  – Avoiding flooding
  – Restraining broadcasting
  – Keeping forwarding tables small
  – Ensuring path efficiency
• SEATTLE architecture
  – Hash-based location management
  – Shortest-path forwarding
  – Responding to network dynamics

Avoiding Flooding
• Bridging uses flooding as a routing scheme
  – Unicast frames to unknown destinations are flooded (“Don’t know where the destination is. Send it everywhere! At least they’ll learn where the source is.”)
  – Does not scale to a large network
• Objective #1: Unicast unicast traffic
  – Need a control-plane mechanism to discover and disseminate hosts’ location information

Restraining Broadcasting
• Liberal use of broadcasting for bootstrapping (DHCP and ARP)
  – Broadcasting is a vestige of shared-medium Ethernet
  – Very serious overhead in switched networks
• Objective #2: Support unicast-based bootstrapping
  – Need a directory service
• Sub-objective #2.1: Yet, support general broadcast
  – Nonetheless, handling broadcast should be more scalable

Keeping Forwarding Tables Small
• Flooding and self-learning lead to unnecessarily large forwarding tables
  – Large tables are not only inefficient, but also dangerous
• Objective #3: Install hosts’ location information only when and where it is needed
  – Need a reactive resolution scheme
  – Enterprise traffic patterns are better suited to reactive resolution

Ensuring Optimal Forwarding Paths
• Spanning tree avoids broadcast storms, but forwarding along a single tree is inefficient
  – Poor load balancing and longer paths
  – Multiple spanning trees are insufficient and expensive
• Objective #4: Utilize shortest paths
  – Need a routing protocol
• Sub-objective #4.1: Prevent broadcast storms
  – Need an alternative measure to prevent broadcast storms

Backwards Compatibility
• Objective #5: Do not modify end hosts
  – From the end hosts’ view, the network must work the same way
  – End hosts should
    • use the same protocol stacks and applications
    • not be forced to run an additional protocol

SEATTLE in a Slide
• Flat addressing of end hosts
  – Switches use hosts’ MAC addresses for routing
  – Ensures zero-configuration and backwards compatibility (Obj #5)
• Automated host discovery at the edge
  – Switches detect the arrival/departure of hosts
  – Obviates flooding and ensures scalability (Obj #1, 5)
• Hash-based on-demand resolution
  – A hash deterministically maps a host to a switch
  – Switches resolve end hosts’ location and address via hashing
  – Ensures scalability (Obj #1, 2, 3)
• Shortest-path forwarding between switches
  – Switches run link-state routing to maintain only the switch-level topology (i.e., do not disseminate end-host information)
  – Ensures data-plane efficiency (Obj #4)
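The hash-based location management above can be sketched in a few lines (illustrative Python; the switch names, the `resolver_for` helper, and the modulo hash are assumptions — SEATTLE itself uses consistent hashing over the switch-level topology maintained by link-state routing):

```python
import hashlib

def resolver_for(host_mac: str, switches: list) -> str:
    """F(x): deterministically map a host to the switch (the "relay")
    that stores the host's (host -> egress switch) location entry."""
    digest = int.from_bytes(hashlib.sha256(host_mac.encode()).digest(), "big")
    return switches[digest % len(switches)]

switches = ["A", "B", "C", "D"]

# Host x attaches at egress switch A; A registers x's location at F(x).
relay = resolver_for("x", switches)
location_at_relay = {"x": "A"}  # stored only at the relay switch

# An ingress switch with traffic for x computes the same F(x) and asks
# the relay directly: no flooding, no per-host state anywhere else.
lookup = resolver_for("x", switches)
```

Because every switch computes the same hash, resolution is a single directed query instead of a network-wide flood.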

How Does It Work?
[Figure: an entire enterprise operated as one large IP subnet over a link-state (LS) core of switches. Host x attaches at switch A (host discovery or registration); A stores x’s location at the relay switch B = F(x) (“store ⟨x, A⟩ at B”). Host y at switch D sends traffic to x: D hashes F(x) = B and tunnels the frame to relay switch B, which tunnels it on to the egress node A, which delivers to x. B notifies D of x’s location, so subsequent traffic is forwarded directly from D to A (optimized forwarding). Legend: switches, end hosts, control flow, data flow.]

Terminology
[Figure: host y (Src) at the ingress switch D sends to host x (Dst) at the egress switch A, via switch B, the relay for x. The tunneled frame carries the entry ⟨x, A⟩; shortest-path forwarding is used between switches, and the ingress applies a cache-eviction policy to its cached ⟨x, A⟩ entry.]

Responding to Topology Changes
• The quality of hashing matters!
[Figure: switches A through F placed on a hash ring; consistent hashing minimizes re-registration overhead when switches join or leave.]
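Why consistent hashing minimizes re-registration can be shown directly (illustrative Python; the switch/host names and the 64-bit ring are assumptions, and production designs add virtual nodes for balance, omitted here): when a switch leaves, only the hosts it was the resolver for must re-register.

```python
import bisect
import hashlib

def h(key: str) -> int:
    # 64-bit point on the hash ring
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

class ConsistentHashRing:
    """A host is resolved by the first switch clockwise from h(host)."""

    def __init__(self, switches):
        self.ring = sorted((h(s), s) for s in switches)

    def resolver_for(self, host: str) -> str:
        keys = [k for k, _ in self.ring]
        i = bisect.bisect_right(keys, h(host)) % len(self.ring)
        return self.ring[i][1]

    def remove(self, switch: str):
        self.ring = [(k, s) for k, s in self.ring if s != switch]

hosts = [f"host{i}" for i in range(100)]
ring = ConsistentHashRing(["A", "B", "C", "D"])
before = {x: ring.resolver_for(x) for x in hosts}
ring.remove("C")  # switch C fails or leaves
after = {x: ring.resolver_for(x) for x in hosts}
moved = [x for x in hosts if before[x] != after[x]]
```

Only hosts whose resolver was C move; every other (host, resolver) registration is untouched, unlike a plain modulo hash where nearly all hosts would remap.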

Single-Hop Lookup
• y sends traffic to x: the ingress hashes F(x) to locate x’s resolver directly
• Every switch on the ring is logically one hop away
[Figure: switches A through E on the hash ring.]

Responding to Host Mobility
[Figure: host x moves from the old egress switch A to the new egress switch G. The relay for x (switch B) updates its entry from ⟨x, A⟩ to ⟨x, G⟩; the ingress switch D serving source y likewise replaces its cached ⟨x, A⟩ with ⟨x, G⟩, so y’s traffic follows x.]

Unicast-based Bootstrapping: ARP
• ARP
  – Ethernet: broadcast requests
  – SEATTLE: hash-based on-demand address resolution
[Figure: host a, owner of (IPa, maca), attaches at switch sa: 1. host discovery; 2. sa hashes F(IPa) = ra; 3. ra stores (IPa, maca, sa). Host b at switch sb: 4. b broadcasts an ARP request for a; 5. sb hashes F(IPa) = ra; 6. sb unicasts the ARP request to ra; 7. ra unicasts the ARP reply (IPa, maca, sa) to the ingress. Legend: switches, end hosts, control msgs, ARP msgs.]

Unicast-based Bootstrapping: DHCP
• DHCP
  – Ethernet: broadcast requests and replies
  – SEATTLE: utilize the DHCP relay agent (RFC 2131)
    • proxy resolution by ingress switches via unicasting
[Figure: DHCP server d (macd = 0xDHCP) attaches at switch sd: 1. host discovery; 2. sd hashes F(macd) = r; 3. r stores (macd, sd). Host h at switch sh: 4. h broadcasts DHCP discovery; 5. sh hashes F(0xDHCP) = r; 6. the DHCP message is sent to r; 7. then to sd; 8. sd delivers the DHCP message to d. Legend: switches, end hosts, control msgs, DHCP msgs.]

Control-Plane Scalability When Using Relays
• Minimal overhead for disseminating host-location information
  – Each host’s location is advertised to only two switches
• Small forwarding tables
  – The number of host-information entries over all switches is O(H), not O(SH)
• Simple and robust mobility support
  – When a host moves, updating only its relay suffices
  – No forwarding loop is created, since the update is atomic

Data-Plane Efficiency w/o Compromise
• Price for path optimization
  – Additional control messages for on-demand resolution
  – Larger forwarding tables
  – Control overhead for updating stale info of mobile hosts
• The gain is much bigger than the cost
  – Because most hosts maintain small, static communities of interest (COIs) [Aiello et al., PAM’05]
  – Classical analogy: COI ↔ working set (WS); caching is effective when a WS is small and static

SEATTLE: Summary
• SEATTLE is a plug-and-playable enterprise architecture ensuring both scalability and efficiency
• Enabling design choices
  – Hash-based location management
  – Reactive location resolution and caching
  – Shortest-path forwarding
• Lessons
  – Trading a little data-plane efficiency for huge control-plane scalability makes a qualitatively different system
  – Traffic patterns are our friends

VIRO: Virtual Id Routing
• How are the nodes (racks) inter-connected?
  – Typically a hierarchical inter-connection structure
• Today’s typical (Cisco-recommended) data center structure, starting from the bottom level:
  – rack switches
  – 1-2 layers of (layer-2) aggregation switches
  – access routers
  – core routers
• Is such an architecture good enough?

VIRO: Virtual Id Routing
A Scalable, Robust, “Plug-&-Play” and Namespace-Independent Routing Architecture for “Future” Networks (beyond simply large-scale Ethernets, …)
• Key Features
  – Separate identities (identifiers) from locations (locators)
    • introduce (“hierarchical”) virtual ids as “locators”
    • a “topology-aware”, self-organizing vid layer
  – Decouple routing/forwarding from addressing/naming
    • unify (traditional) layer-2 & layer-3 data-plane operations
    • routing/forwarding done using vids only (except the first/last hop)
  – Support any namespaces (identifiers), whether flat or not!
    • “native” or “application” naming/address-independent
    • co-existence and inter-operability between different namespaces!
    • “future-proof”

VIRO: Virtual Id Routing …
• More Key Features …
  – Highly scalable, and fully takes advantage of path diversity
    • DHT-like routing paradigm --- beyond shortest-path routing!
    • O(log N) routing table size
    • easy to support multi-path routing & dynamic load balancing
  – And highly robust
    • eliminates (network-wide) control-plane flooding!
    • localizes failures, enables fast rerouting (proactive!)
• Other Important Features
  – Allows for (logically centralized) network management
    • access and other policy control
    • also facilitates other security mechanisms
  – Can easily be adapted to support multiple topologies or virtualized network services

Virtual Id Layer and Vid Space
• Topology-aware, structured virtual id (vid) space
  – Kademlia-like “virtual” binary tree
  – self-configurable and self-organizing
[Figure: other app namespaces and IPv4/IPv6 map onto the Virtual ID Layer, which sits above the layer-2 physical network topology; nodes A through G appear both in the physical topology and as leaves of the virtual binary tree.]

VIRO: Three Core Components
• Virtual id space construction and vid assignment
  – performed mostly at the bootstrap process (i.e., network set-up):
    • a vid space “skeleton” is created
  – once the network is set up and the vid space is constructed:
    • a new node (a “VIRO switch”) joins: its vid is assigned based on its neighbors’ vids
    • an end host/device inherits a vid (prefix) from the “host switch” (to which it is attached), plus a randomly assigned host id; the host may be agnostic of its vid
• VIRO routing algorithm/protocol
  – DHT-style, but needs to build end-to-end connectivity/routes
    • a bottom-up, round-by-round process, with no network-wide control flooding
    • O(log N) routing entries per node, N: # of VIRO switches
• (Persistent) layer-2/3 address/name resolution and vid lookup
  – DHT directory services built on top of the same vid space
    • a “persistent” identifier (e.g., MAC/IP address) is hashed to a “vid” key, which is then used for (pid, vid) mapping registration, lookup, etc.
  – Data forwarding among VIRO switches uses vids only

Vid Assignment: Bootstrap Process
• Initial vid assignment and vid space construction are performed during the network bootstrap process
• Depending on the network operating environment, this can be done using either a centralized or a distributed vid assignment algorithm, e.g.,
  – top-down graph partitioning (centralized)
  – bottom-up clustering (distributed)
[Figure: the example topology (nodes A through G) with 0/1 bit labels assigned along a recursive binary partition.]

Vid Assignment: Key Properties
• Logical distance defined on the vid space:
  d(vidx, vidy) := L - lcp(vidx, vidy), where L is the maximum tree height and lcp is the longest-common-prefix length
• Key invariant properties (to ensure “topology-awareness”):
  – closeness: if two nodes are close in the vid space, then they are also close in the physical topology
    • esp., any two logical neighbors must be directly connected
  – connectivity: any two adjacent logical sub-trees must be physically connected
[Figure: the binary vid tree over nodes A through G; a vid consists of a switch vid (of height L) followed by a host id.]
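The logical distance can be computed directly from two vids (a minimal sketch, assuming vids are equal-length bit strings as in the deck’s 5-bit examples; the function name is an assumption):

```python
def logical_distance(vid_x: str, vid_y: str) -> int:
    """d(vid_x, vid_y) = L - lcp(vid_x, vid_y): L is the vid length
    (max tree height), lcp the longest-common-prefix length."""
    assert len(vid_x) == len(vid_y)
    lcp = 0
    for bx, by in zip(vid_x, vid_y):
        if bx != by:
            break
        lcp += 1
    return len(vid_x) - lcp

# 5-bit vids (L = 5), as in the later routing example:
d_AB = logical_distance("00000", "00010")  # lcp = 3, so d = 2
d_AL = logical_distance("00000", "11110")  # lcp = 0, so d = 5
```

Nodes sharing a long vid prefix sit in the same small sub-tree, which (by the closeness invariant) means they are also physically nearby.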

VIRO Routing: Key Features
• Inspired by the Kademlia DHT
  – but needs to build end-to-end connectivity/routes!
• Bottom-up, round-by-round process
  – first: neighbor discovery
  – then: build routing entries to reach nodes within each level of the vid space (virtual binary tree)
    • use “publish-query” mechanisms
• Highly scalable: O(L) routing entries per node
  – L ~ O(log N), N: number of nodes (VIRO switches)
  – more importantly, path diversity (e.g., multi-path routing) can be naturally exploited for load balancing, etc.
    • routing is no longer “shortest-path”-based!
• No single point of failure, and localizes the effect of failures
  – unlike link-state routing, node/link failure/recovery is not flooded network-wide; the impact scope is limited
  – also enables localized fast rerouting
  – completely eliminates “network-wide” control flooding

VIRO Routing: Some Definitions
For k = 1, …, L, and any node x:
• (level-k) sub-tree, denoted Sk(x): the set of nodes within a logical distance of k from x
• (level-k) bucket, denoted Bk(x): the set of nodes at logical distance exactly k from x
• (level-k) gateway, denoted Gk(x): a node in Sk-1(x) that is connected to a node in Bk(x) is a gateway to reach Bk(x) for node x; a direct neighbor of x that can reach this gateway node is a next-hop for this node
Example (from the figure’s topology):
  S1(A) = {A, F}, B1(A) = {F}, G1(A) = {A}
  S2(A) = {A, C, F}, B2(A) = {C}, G2(A) = {A}
  S3(A) = {A, B, C, D, E, F, G}, B3(A) = {B, D, E, G}, G3(A) = {C, F}
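The sub-trees and buckets follow mechanically from the vids (illustrative Python; the 3-bit vids below are hypothetical, chosen only so that the computed sets reproduce the slide’s example; gateways are omitted since they also depend on the physical links, which the vids alone don’t capture):

```python
def logical_distance(vx: str, vy: str) -> int:
    # d = L - lcp, over equal-length bit-string vids
    lcp = next((i for i, (a, b) in enumerate(zip(vx, vy)) if a != b), len(vx))
    return len(vx) - lcp

def subtree(x: str, k: int, vids: dict) -> set:
    """S_k(x): nodes within logical distance k of x."""
    return {n for n, v in vids.items() if logical_distance(vids[x], v) <= k}

def bucket(x: str, k: int, vids: dict) -> set:
    """B_k(x): nodes at logical distance exactly k from x."""
    return {n for n, v in vids.items() if logical_distance(vids[x], v) == k}

# Hypothetical 3-bit vids (L = 3), matching the slide's sets for node A:
vids = {"A": "000", "F": "001", "C": "010", "B": "100",
        "D": "101", "E": "110", "G": "111"}
```

With these vids, S1(A) = {A, F}, B2(A) = {C}, and B3(A) = {B, D, E, G}, exactly as in the slide’s example.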

VIRO Routing: Some Specifics
Bottom-up, “round-by-round” process:
• round 1: neighbor discovery
  – discover and find directly/locally connected neighbors
• round k (2 <= k <= L):
  – build the routing entry to reach the level-k bucket Bk(x) -- a list of one or more (gateway, next-hop) pairs
  – use “publish-query” (rendezvous) mechanisms
• Algorithm for building the Bk(x) routing entry at node x:
  – if node x is directly connected to a node in Bk(x), then it is a gateway for Bk(x), and it publishes this within Sk-1(x)
    • next-hop to reach Bk(x) = direct physical neighbor in Bk(x)
  – else node x queries within Sk-1(x) to discover gateway(s) to reach Bk(x)
    • next-hop to reach Bk(x) = next-hop(gateway)
• The correctness of the algorithm can be formally established.

VIRO Routing: Routing Table (An Example)
[Figure: a topology with 5-bit vids: A = 00000, B = 00010, C = 00100, D = 00110, M = 01000, N = 01001, E = 10000, F = 10010, G = 10100, H = 10110, J = 11000, K = 11100, L = 11110. From node A’s perspective, the routing table holds one (gateway, next-hop) entry per bucket, from level 1 up to L = 5.]

VIRO Routing: Example
• Round 1:
  – each node x discovers and learns about its directly/locally connected neighbors
  – builds the level-1 routing entry to reach nodes in B1(x)
E.g., node A: discovers its three direct neighbors, B, C, D; builds the level-1 routing entry to reach B1(A) = {}
[A’s routing table: level 1: no entry, since B1(A) is empty.]

VIRO Routing: Example …
• Round 2:
  – if directly connected to a node in B2(x), enter self as gateway in the level-2 routing entry, and publish it in S1(x)
  – otherwise, query the “rendezvous point” in S1(x) and build the level-2 routing entry to reach nodes in B2(x)
E.g., node A: B2(A) = {B}; node A is directly connected to node B, so it publishes itself as gateway to B2(A)
[A’s routing table so far: level 2: gateway A, next-hop B.]

VIRO Routing: Example …
• Round 3:
  – if directly connected to a node in B3(x), enter self as gateway in the level-3 routing entry, and publish it in S2(x)
  – otherwise, query the “rendezvous point” in S2(x) and build the level-3 routing entry to reach nodes in B3(x)
E.g., node A: B3(A) = {C, D}; A publishes edges A->C, A->D to the “rendezvous point” in S2(A), say, B
[A’s routing table so far: level 2: (A, B); level 3: gateway A, next-hops C, D.]

VIRO Routing: Example …
• Round 4:
  – if directly connected to a node in B4(x), enter self as gateway in the level-4 routing entry, and publish it in S3(x)
  – otherwise, query the “rendezvous point” in S3(x) and build the level-4 routing entry to reach nodes in B4(x)
E.g., node A: B4(A) = {M, N}; A queries the “rendezvous point” in S3(A), say, C, and learns C as gateway
[A’s routing table so far: level 2: (A, B); level 3: (A, C/D); level 4: gateway C, next-hop C.]

VIRO Routing: Example …
• Round 5:
  – if directly connected to a node in B5(x), enter self as gateway in the level-5 routing entry, and publish it in S4(x)
  – otherwise, query the “rendezvous point” in S4(x) and build the level-5 routing entry to reach nodes in B5(x)
E.g., node A: B5(A) = {E, F, G, H, J, K, L}; A queries the “rendezvous point” in S4(A), say, D, and learns B as gateway
[A’s routing table so far: level 2: (A, B); level 3: (A, C/D); level 4: (C, C); level 5: gateway B, next-hop B.]

VIRO Routing: Packet Forwarding
• To forward a packet to a destination node, say L:
  – compute the logical distance to that node
  – use the next-hop corresponding to that logical distance to forward the packet
  – if there is no routing entry: drop the packet
[A’s routing table: level 1: none; level 2: (A, B); level 3: (A, C/D); level 4: (C, C); level 5: (B, B).]
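The forwarding rule above reduces to a single table lookup keyed by logical distance (a minimal sketch; `next_hop` and the dictionary shapes are assumptions, and the table values are node A’s entries from the running example):

```python
def logical_distance(vx: str, vy: str) -> int:
    # d = L - lcp, over equal-length bit-string vids
    lcp = next((i for i, (a, b) in enumerate(zip(vx, vy)) if a != b), len(vx))
    return len(vx) - lcp

# Node A's table from the example: level -> (gateway, next-hop)
table_A = {2: ("A", "B"), 3: ("A", "C"), 4: ("C", "C"), 5: ("B", "B")}
vids = {"A": "00000", "B": "00010", "L": "11110"}

def next_hop(src: str, dst: str, table: dict, vids: dict):
    """Forward by logical distance: index the routing table with
    d(src, dst); no matching entry means the packet is dropped (None)."""
    d = logical_distance(vids[src], vids[dst])
    if d == 0:
        return src  # packet has arrived
    entry = table.get(d)
    return entry[1] if entry else None

hop = next_hop("A", "L", table_A, vids)  # d(A, L) = 5: use the level-5 entry
```

For destination L, A computes d = 5 and forwards to B, the next-hop of its level-5 entry, matching the slide’s example.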

<pid, vid> Mapping and Vid Lookup
• pid: persistent identifier of an end host/device, or a switch
  – e.g., MAC/IP address, or any other layer-2/3 address, “flat” host id, or higher-layer names
  – can simultaneously support multiple namespaces
• Mapping registration and lookup use a Kademlia DHT on top of the same vid space
  – Hash(pid) -> vidkey: used for registration & lookup
  – the mapping is stored at the “access switch” whose vid is “closest” to vidkey
• Lookup speed, scalability & mobility-support trade-offs
  – can use a one-hop or multi-hop DHT
  – or use hierarchical (or “geographically scoped”) hash tables
• Vid lookup and data forwarding may be combined
  – use hierarchical (or “geographically scoped”) rendezvous points
  – provides better support for mobility

VEIL: a VIRO Realization over Ethernet (and 802.11, etc.)
• Re-use 48-bit MAC addresses as vids
• vid structure: divided into two fields
  – switch vid (32 bits)
    • assigned to switches using the vid assignment process
    • L: default 24 bits
  – host id (16 bits)
    • assigned by “host switches”
    • uniquely identifies hosts directly connected to a switch
• End hosts are agnostic of their vids
• The host switch performs vid/MAC address translation
• Backward compatible with Ethernet, 802.11, etc.
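The 48-bit vid split above can be sketched as simple packing/unpacking (illustrative Python; the field widths are from the slide, but the big-endian byte layout and helper names are assumptions):

```python
def make_vid(switch_vid: int, host_id: int) -> bytes:
    """Pack a VEIL vid into a 48-bit, MAC-address-shaped value:
    a 32-bit switch vid followed by a 16-bit host id."""
    assert 0 <= switch_vid < 2**32 and 0 <= host_id < 2**16
    return switch_vid.to_bytes(4, "big") + host_id.to_bytes(2, "big")

def split_vid(vid: bytes) -> tuple:
    """Recover (switch_vid, host_id) from a 6-byte vid."""
    return int.from_bytes(vid[:4], "big"), int.from_bytes(vid[4:], "big")

vid = make_vid(switch_vid=0x0A0B0C0D, host_id=0x0102)
sw, host = split_vid(vid)
```

Because the result is exactly 6 bytes, it can travel in the ordinary MAC-address fields of Ethernet or 802.11 frames, which is what makes the scheme backward compatible.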

VEIL: <IP/MAC, vid> Mapping
• Host switch:
  – a switch directly connected to the host
  – discovers the host’s MAC/IP through (gratuitous) ARP, and assigns a vid to the host
  – publishes the pid -> vid mappings at an “access switch”
• Access switch:
  – a switch whose vid is closest to hash(pid of the host)
[Figure: an example using the IP address as pid. Host y attaches to its host switch Sy, which holds (IPy, MACy, VIDy) and registers the mapping IPy -> VIDy at y’s access switch Sz; switch Sx (serving host x) can later look it up.]

Address/Vid Lookup & Data Forwarding
• Use DHT lookup for address/vid resolution
  – with a local cache
• Vid-to-MAC address translation at the last-hop VIRO switch
[Figure: mapping table at Sz holds (VIDy, IPy, MACy). 1. Host x sends an ARP query (IPy -> MAC?); 2. Sx forwards the ARP query as a unicast request to Sz; 3. Sz returns the ARP reply (IPy -> VIDy); 4. x sends the Ethernet packet (MACx -> VIDy); 5. Sx rewrites the source MAC address in the packet (VIDx -> VIDy); 6. Sy rewrites the destination MAC address (VIDx -> MACy) and delivers to y.]

Other Advantages/Features
• Load balancing & fast rerouting can be naturally incorporated
  – no additional complex mechanisms needed
• Can support multiple namespaces, and inter-operability among such namespaces (e.g., IPv4 <-> IPv6, IP <-> phone no., etc.)
  – VIRO: “native” naming/address-independent
  – simply incorporate more directory services
• Support multiple topologies or virtualized network services
  – e.g., support for VLANs
  – multiple vid spaces may be constructed
    • e.g., by defining different types of “physical” neighbors
• Also facilitates security support
  – host and access switches can perform access control
  – the “persistent” id is not used for data forwarding
    • eliminates flooding attacks

Robustness: Localized Failures
[Figure: the example topology before and after link F-K fails.]
• The routing table for node A does not change despite the failure!
[A’s routing table: level 1: none; level 2: (A, B); level 3: (A, C/D); level 4: (C, C); level 5: (B, B).]

More Than One Gateway Exists and Can Be Used!
[Figure: the binary vid tree, highlighting that multiple gateways exist between adjacent logical sub-trees.]

Multi-Path Routing, Load Balancing and Resilient Fast Re-Routing
• Learn multiple gateways at each level
  – The default gateway is the one that is logically closest
• Use additional gateways for multi-path routing and fast failure re-routing
  – Requires consistent gateway-selection strategies
    • otherwise forwarding loops may occur
  – Use an appropriate “forwarding directive” while using alternate gateways

Other VIRO Features … (not discussed in paper)
• Multi-path routing & dynamic load balancing
  – esp., Valiant load balancing can be adopted
  – use of “forwarding directives”
    • to ensure no forwarding loops!
• Failure management and fast re-routing
• Namespace and network management
  – can/should be logically centralized
  – declarative paradigm?
• Network security (?)
  – access control
  – network monitoring, defense against denial-of-service, …
• Virtual topologies/services & network virtualization