
Network Layer: IP
COMS W6998 Spring 2010
Erich Nahum

Outline
- IP Layer Architecture
- Netfilter
- Receive Path
- Send Path
- Forwarding (Routing) Path

Recall what IP Does
- IP packet format (each row is 32 bits wide, bits 0-31):
    Version | IHL | Codepoint | Total length
    Fragment-ID | DF | MF | Fragment-Offset
    Time to Live | Protocol | Checksum
    Source address
    Destination address
    Options and payload
- Encapsulates/decapsulates transport-layer messages into IP datagrams
- Routes datagrams to their destination
- Handles static and/or dynamic routing updates
- Fragments/reassembles datagrams
- Does all of this unreliably
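To make the header layout concrete, here is a minimal userspace sketch that decodes the fixed 20-byte header from a raw buffer. The type `ipv4_fields` and the function `ipv4_decode` are hypothetical names chosen for this illustration; they are not the kernel's `struct iphdr` or any kernel API.

```c
#include <stddef.h>
#include <stdint.h>

/* Decoded view of the fixed IPv4 header (illustrative type only). */
struct ipv4_fields {
    uint8_t  version;      /* bits 0-3 of the first byte */
    uint8_t  ihl;          /* header length in 32-bit words */
    uint8_t  codepoint;    /* TOS / DS byte */
    uint16_t total_length; /* header plus payload, in bytes */
    uint16_t fragment_id;
    int      df;           /* Don't Fragment flag */
    int      mf;           /* More Fragments flag */
    uint16_t frag_offset;  /* in units of 8 bytes */
    uint8_t  ttl;
    uint8_t  protocol;
    uint16_t checksum;
    uint32_t saddr;
    uint32_t daddr;
};

/* Read big-endian (network order) integers from a byte buffer. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the buffer is shorter than the fixed header. */
int ipv4_decode(const uint8_t *buf, size_t len, struct ipv4_fields *out)
{
    if (len < 20)
        return -1;

    uint16_t flags_off = be16(buf + 6);

    out->version      = buf[0] >> 4;
    out->ihl          = buf[0] & 0x0f;
    out->codepoint    = buf[1];
    out->total_length = be16(buf + 2);
    out->fragment_id  = be16(buf + 4);
    out->df           = (flags_off & 0x4000) != 0;
    out->mf           = (flags_off & 0x2000) != 0;
    out->frag_offset  = flags_off & 0x1fff;
    out->ttl          = buf[8];
    out->protocol     = buf[9];
    out->checksum     = be16(buf + 10);
    out->saddr        = be32(buf + 12);
    out->daddr        = be32(buf + 16);
    return 0;
}
```

Decoding byte by byte avoids any dependence on host endianness or struct packing, which is why the sketch does not overlay a C struct on the buffer.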

IP Implementation Architecture
[Figure: call graph of the IP layer, between the device layer (dev.c) and the higher layers]
- Receive (ip_input.c): netif_receive_skb -> ip_rcv -> NF_INET_PRE_ROUTING -> ip_rcv_finish -> ip_route_input (ROUTING, Forwarding Information Base) -> ip_local_deliver -> NF_INET_LOCAL_IN -> ip_local_deliver_finish -> higher layers
- Multicast: ip_mr_input
- Forward (ip_forward.c): ip_forward -> NF_INET_FORWARD -> ip_forward_finish -> ip_output
- Send (ip_output.c): higher layers -> ip_queue_xmit -> ip_route_output_flow (ROUTING) -> ip_local_out -> NF_INET_LOCAL_OUT -> ip_output -> NF_INET_POST_ROUTING -> ip_finish_output -> ip_finish_output2 -> neigh_resolve_output (ARP) -> dev_queue_xmit (dev.c)

Sources of IP Packets
1. Packets arrive on an interface and are passed to the ip_rcv() function.
2. TCP/UDP packets are packed into an IP packet and passed down to IP via ip_queue_xmit().
3. The IP layer generates IP packets itself:
   1. Multicast packets
   2. Fragmentation of a large packet
   3. ICMP/IGMP packets


What is Netfilter?
- A framework for packet "mangling"
- A protocol defines "hooks", which are well-defined points in a packet's traversal of that protocol stack.
  - IPv4 defines 5
  - Other protocols include IPv6, ARP, Bridging, DECNET
- At each of these points, the protocol will call the netfilter framework with the packet and the hook number.
- Parts of the kernel can register to listen to the different hooks for each protocol.
- When a packet is passed to the netfilter framework, it will call the registered callbacks for that hook and protocol.

Netfilter IPv4 Hooks
- NF_INET_PRE_ROUTING
  - Incoming packets pass this hook in ip_rcv() before routing
- NF_INET_LOCAL_IN
  - All incoming packets addressed to the local host pass this hook in ip_local_deliver()
- NF_INET_FORWARD
  - All incoming packets not addressed to the local host pass this hook in ip_forward()
- NF_INET_LOCAL_OUT
  - All outgoing packets created by this local computer pass this hook in ip_local_out(), which is reached from, e.g., ip_queue_xmit() and ip_build_and_send_pkt()
- NF_INET_POST_ROUTING
  - All outgoing packets (forwarded or locally created) pass this hook in ip_output(), just before ip_finish_output()

Netfilter Callbacks
- Kernel code can register a callback function to be called when a packet arrives at each hook; callbacks are free to manipulate the packet.
- The callback can then tell netfilter to do one of five things:
  - NF_DROP: drop the packet; don't continue traversal.
  - NF_ACCEPT: continue traversal as normal.
  - NF_STOLEN: I've taken over the packet; stop traversal.
  - NF_QUEUE: queue the packet (usually for userspace handling).
  - NF_REPEAT: call this hook again.
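The register-then-dispatch idea can be sketched in plain userspace C. This is a toy model, not the kernel's netfilter API: `struct packet`, `register_hook`, `run_hook` and the two sample callbacks are hypothetical names, and the real framework also orders callbacks by priority and actually queues or frees packets, which is omitted here.

```c
#include <stddef.h>

/* The five verdicts a callback may return. */
enum nf_verdict { NF_DROP, NF_ACCEPT, NF_STOLEN, NF_QUEUE, NF_REPEAT };

/* The five IPv4 hook points (names shortened for this sketch). */
enum hook_num { PRE_ROUTING, LOCAL_IN, FORWARD, LOCAL_OUT, POST_ROUTING, NUM_HOOKS };

/* Toy packet: a single field that callbacks may mangle. */
struct packet { int mark; };

typedef enum nf_verdict (*hook_fn)(struct packet *pkt);

#define MAX_CALLBACKS 8
static hook_fn callbacks[NUM_HOOKS][MAX_CALLBACKS];
static size_t  num_callbacks[NUM_HOOKS];

/* Append a callback to a hook; returns 0 on success, -1 if the hook is full. */
int register_hook(enum hook_num hook, hook_fn fn)
{
    if (num_callbacks[hook] == MAX_CALLBACKS)
        return -1;
    callbacks[hook][num_callbacks[hook]++] = fn;
    return 0;
}

/* Call the registered callbacks in order and return the final verdict. */
enum nf_verdict run_hook(enum hook_num hook, struct packet *pkt)
{
    size_t i = 0;

    while (i < num_callbacks[hook]) {
        enum nf_verdict v = callbacks[hook][i](pkt);

        if (v == NF_REPEAT)
            continue;       /* call the same callback again */
        if (v != NF_ACCEPT)
            return v;       /* DROP, STOLEN or QUEUE ends the traversal */
        i++;
    }
    return NF_ACCEPT;       /* every callback accepted the packet */
}

/* Sample callback: mangles the packet by bumping its mark. */
enum nf_verdict bump_mark(struct packet *pkt)
{
    pkt->mark++;
    return NF_ACCEPT;
}

/* Sample callback: drops packets whose mark exceeds 1. */
enum nf_verdict drop_if_marked(struct packet *pkt)
{
    return pkt->mark > 1 ? NF_DROP : NF_ACCEPT;
}
```

With `bump_mark` and `drop_if_marked` registered on the same hook, the first traversal of a fresh packet is accepted and the second is dropped, since the mark survives between traversals.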

IPTables
- A packet selection system called IP Tables has been built over the netfilter framework.
  - It is a direct descendant of ipchains (that came from ipfwadm, that came from BSD's ipfw), with extensibility.
- Kernel modules can register a new table, and ask for a packet to traverse a given table.
- This packet selection method is used for:
  - Packet filtering (the `filter' table),
  - Network Address Translation (the `nat' table) and
  - General pre-route packet mangling (the `mangle' table).


Naming Conventions
- Methods are frequently broken into two stages, where the second has the same name with a suffix of "finish" or "slow". This is typical for networking kernel code.
  - E.g., ip_rcv() and ip_rcv_finish()
- In many cases the second method has a "slow" suffix instead of "finish"; this usually happens when the first method looks in some cache and the second method performs a lookup in a more complex data structure, which is slower.

Receive Path: ip_rcv
- Packets that are not addressed to the host (packets received in promiscuous mode) are dropped.
- Does some sanity checking:
  - Does the packet have at least the size of an IP header?
  - Is this IP Version 4?
  - Is the checksum correct?
  - Does the packet have a wrong length?
- If skb->len is larger than the total length given in the IP header (e.g., because of link-layer padding), the skb is trimmed to that length: skb_trim(skb, iph->tot_len)
- Invokes netfilter hook NF_INET_PRE_ROUTING
- ip_rcv_finish() is called
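The sanity checks listed above can be sketched in userspace C over a raw byte buffer. This is an illustration under stated assumptions, not the kernel's ip_rcv(): the names `ip_rcv_checks`, `rcv_result` and `ip_header_checksum` are hypothetical, the buffer stands in for the skb, and `*len` stands in for skb->len.

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement sum over the header (RFC 1071 style).
 * For a header whose checksum field is correct, the result is 0. */
static uint16_t ip_header_checksum(const uint8_t *hdr, size_t hdr_len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < hdr_len; i += 2)
        sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]);
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

enum rcv_result { RCV_OK, RCV_DROP };

/* Validate a received buffer of *len bytes. On RCV_OK, *len is reduced to
 * the total length from the IP header, which drops any trailing padding. */
enum rcv_result ip_rcv_checks(const uint8_t *pkt, size_t *len)
{
    if (*len < 20)                          /* at least a minimal header? */
        return RCV_DROP;
    if ((pkt[0] >> 4) != 4)                 /* IP version 4? */
        return RCV_DROP;

    size_t hdr_len = (size_t)(pkt[0] & 0x0f) * 4;
    if (hdr_len < 20 || *len < hdr_len)     /* plausible header length? */
        return RCV_DROP;
    if (ip_header_checksum(pkt, hdr_len) != 0)
        return RCV_DROP;                    /* checksum incorrect */

    size_t tot_len = (size_t)((pkt[2] << 8) | pkt[3]);
    if (tot_len < hdr_len || *len < tot_len)
        return RCV_DROP;                    /* bogus length or truncated */

    *len = tot_len;                         /* trim padding */
    return RCV_OK;
}
```

Note the two different length outcomes: a buffer shorter than the advertised total length is dropped as truncated, while a longer one is merely trimmed.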

Receive Path: ip_rcv_finish
- If skb->dst is NULL, ip_route_input() is called to find the route of the packet.
  - Someone else could have filled it in
  - skb->dst is set to an entry in the routing cache which stores both the destination IP and the pointer to an entry in the hard header cache (cache for the layer 2 frame header)
- If the IP header includes options, an ip_option structure is created.
- skb->dst->input() now points to the function that should be used to handle the packet (delivered locally or forwarded further):
  - ip_local_deliver()
  - ip_forward()
  - ip_mr_input()

Receive Path: ip_local_deliver
- The only task of ip_local_deliver(skb) is to reassemble fragmented packets by invoking ip_defrag().
- The netfilter hook NF_INET_LOCAL_IN is invoked.
- This in turn calls ip_local_deliver_finish()

Recv: ip_local_deliver_finish
- Remove the IP header from skb by __skb_pull(skb, ip_hdrlen(skb));
- The protocol ID of the IP header is used to calculate the hash value in the inet_protos hash table.
- Packet is passed to a raw socket if one exists (which copies skb)
- If transport protocol is found, then the handler is invoked:
  - tcp_v4_rcv(): TCP
  - udp_rcv(): UDP
  - icmp_rcv(): ICMP
  - igmp_rcv(): IGMP
- Otherwise dropped with an ICMP Destination Unreachable message returned.

Hash Table inet_protos[MAX_INET_PROTOS]
[Figure: the array inet_protos with slots 0, 1, ..., MAX_INET_PROTOS. Each used slot points to a net_protocol structure with the fields handler, err_handler, gso_send_check, gso_segment, gro_receive and gro_complete. Example entries: handler udp_rcv() with err_handler udp_err(); handler igmp_rcv() with err_handler NULL.]
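A table-driven dispatch of this kind can be sketched as follows. This is a simplified userspace model: `proto_entry`, `add_protocol`, `local_deliver` and the UDP stub are hypothetical names, the entry holds only a receive handler rather than the full net_protocol field set, and the "unreachable" result merely stands in for the ICMP message the kernel would send.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PROTOS 256

/* Reduced stand-in for net_protocol: only the receive handler. */
struct proto_entry {
    void (*handler)(const uint8_t *payload, size_t len);
};

static const struct proto_entry *protos[MAX_PROTOS];

enum deliver_result { DELIVER_OK, DELIVER_UNREACHABLE, DELIVER_MALFORMED };

/* Register a handler for an IP protocol number; -1 if the slot is taken. */
int add_protocol(const struct proto_entry *entry, uint8_t protocol)
{
    size_t hash = protocol & (MAX_PROTOS - 1);

    if (protos[hash] != NULL)
        return -1;
    protos[hash] = entry;
    return 0;
}

/* Strip the IP header and hand the payload to the transport handler. */
enum deliver_result local_deliver(const uint8_t *pkt, size_t len)
{
    if (len < 20)
        return DELIVER_MALFORMED;

    size_t hdr_len = (size_t)(pkt[0] & 0x0f) * 4;
    if (hdr_len < 20 || hdr_len > len)
        return DELIVER_MALFORMED;

    /* The protocol field (byte 9) selects the table slot. */
    const struct proto_entry *entry = protos[pkt[9] & (MAX_PROTOS - 1)];
    if (entry == NULL)
        return DELIVER_UNREACHABLE;  /* kernel: ICMP Destination Unreachable */

    entry->handler(pkt + hdr_len, len - hdr_len);
    return DELIVER_OK;
}

/* Stub transport handler that only counts the payload bytes it receives. */
size_t udp_bytes_seen;

void udp_rcv_stub(const uint8_t *payload, size_t len)
{
    (void)payload;
    udp_bytes_seen += len;
}

const struct proto_entry udp_stub_proto = { udp_rcv_stub };
```

Because the table has one slot per possible 8-bit protocol number in this sketch, the "hash" is just a mask and lookups never collide.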


Send Path: ip_queue_xmit (1)
- skb->dst is checked to see if it contains a pointer to an entry in the routing cache.
  - Many packets are routed through the same path, so storing a pointer to a routing entry in skb->dst saves an expensive routing table lookup.
- If a route is not present (e.g., for the first packet of a socket), then ip_route_output_flow() is invoked to determine a route.

Send Path: ip_queue_xmit (2)
- Header is pushed onto packet
  - skb_push(skb, sizeof(header) + options length);
- The fields of the IP header are filled in (version, header length, TOS, TTL, addresses and protocol).
- If IP options exist, ip_options_build() is called.
- ip_local_out() is invoked.

Send Path: ip_local_out
1. The checksum is computed
   - ip_send_check(iph)
2. Netfilter is invoked with NF_INET_LOCAL_OUT, continuing via dst_output(), which calls skb->dst->output()
   - This is ip_output()
- If the packet is for the local machine:
  - dst->output = ip_output
  - dst->input = ip_local_deliver
  - ip_output() will send the packet on the loopback device
  - Then we will go into ip_rcv() and ip_rcv_finish(), but this time dst is NOT null, so we will end in ip_local_deliver().

Send Path: ip_output
- ip_output() does very little; it is essentially an entry into the output path from the forwarding layer.
- Updates some stats.
- Invokes Netfilter with NF_INET_POST_ROUTING and ip_finish_output()

Send Path: ip_finish_output
- Checks message length against the destination MTU
- Calls either
  - ip_fragment()
  - ip_finish_output2()
    - Latter is actually a very long inline, not a function
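The MTU decision, and what fragmentation would produce when the packet does not fit, can be sketched as a small planning function. It is a hedged illustration, not the kernel's ip_fragment(): `struct frag` and `plan_fragments` are hypothetical names, only offsets and sizes are computed (no data is copied), and the Don't Fragment bit is not modeled.

```c
#include <stddef.h>

/* One planned fragment: offset in 8-byte units, payload size, MF flag. */
struct frag {
    size_t offset8;
    size_t payload_len;
    int    more_fragments;
};

/* Split payload_len bytes (carried behind a header of hdr_len bytes) so that
 * every fragment fits into mtu. Returns the number of fragments written to
 * out, or -1 if the plan does not fit in max_frags or the MTU is too small. */
int plan_fragments(size_t hdr_len, size_t payload_len, size_t mtu,
                   struct frag *out, int max_frags)
{
    if (max_frags < 1)
        return -1;

    /* Packet fits: send it unfragmented. */
    if (hdr_len + payload_len <= mtu) {
        out[0] = (struct frag){ 0, payload_len, 0 };
        return 1;
    }

    /* Each fragment needs room for the header plus one 8-byte unit. */
    if (mtu < hdr_len + 8)
        return -1;

    /* Every fragment except the last carries a multiple of 8 bytes. */
    size_t per_frag = (mtu - hdr_len) & ~(size_t)7;
    size_t off = 0;
    int n = 0;

    while (off < payload_len) {
        if (n == max_frags)
            return -1;

        size_t chunk = payload_len - off;
        if (chunk > per_frag)
            chunk = per_frag;

        out[n].offset8 = off / 8;
        out[n].payload_len = chunk;
        out[n].more_fragments = (off + chunk < payload_len);
        off += chunk;
        n++;
    }
    return n;
}
```

For a 20-byte header, 4000 bytes of payload and an MTU of 1500, the plan is three fragments carrying 1480, 1480 and 1040 bytes at offsets 0, 185 and 370 (in 8-byte units).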

Send Path: ip_finish_output2
- Checks skb for room for MAC header. If not, call skb_realloc_headroom().
- Send the packet to a neighbor by:
  - dst->neighbour->output(skb)
  - arp_bind_neighbour() sees to it that the L2 address (a.k.a. the MAC address) of the next hop will be known.
- These eventually end up in dev_queue_xmit(), which passes the packet down to the device.


Forwarding: ip_forward (1)
- Does some validation and checking, e.g.:
  - If skb->pkt_type != PACKET_HOST, drop
  - If TTL <= 1, then the packet is dropped, and an ICMP packet with ICMP_TIME_EXCEEDED set is returned.
  - If the packet length is too large (skb->len > mtu) and no fragmentation is allowed (Don't Fragment bit is set in the IP header), the packet is discarded and the ICMP message with ICMP_FRAG_NEEDED is sent back.

Forwarding: ip_forward (2)
- skb_cow(skb, headroom) is called to check whether there is still sufficient space for the MAC header of the output device. If not, skb_cow() calls pskb_expand_head() to create sufficient space.
- The TTL field of the IP packet is decremented by 1.
  - ip_decrease_ttl() also incrementally modifies the header checksum.
- The netfilter hook NF_INET_FORWARD is invoked.
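The incremental checksum update is worth seeing in code: because the TTL is the high byte of one 16-bit header word, lowering it by 1 lowers the header sum by 0x0100, so the stored checksum (the complement of that sum) rises by 0x0100 with an end-around carry. The sketch below works on a raw header buffer in userspace; `decrease_ttl` and `ip_header_checksum` are hypothetical names modeled on the idea behind ip_decrease_ttl(), and the caller is assumed to have already rejected packets with TTL <= 1.

```c
#include <stddef.h>
#include <stdint.h>

/* Full one's-complement checksum over the header, used here only to
 * verify the incremental update. Returns 0 for a valid header. */
uint16_t ip_header_checksum(const uint8_t *hdr, size_t hdr_len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < hdr_len; i += 2)
        sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]);
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Decrement the TTL (byte 8) and patch the checksum (bytes 10-11)
 * without re-summing the header. Returns the new TTL. */
uint8_t decrease_ttl(uint8_t *hdr)
{
    uint32_t check = (uint32_t)((hdr[10] << 8) | hdr[11]);

    check += 0x0100;             /* TTL is the high byte of its 16-bit word */
    check += (check >= 0xffff);  /* fold the end-around carry back in */

    hdr[10] = (uint8_t)(check >> 8);
    hdr[11] = (uint8_t)check;
    return --hdr[8];
}
```

The update touches three bytes regardless of header length, which is the point of doing it incrementally on the forwarding fast path.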

Forwarding: ip_forward_finish
- Increments some stats.
- Handles any IP options if they exist.
- Calls the destination output function via skb->dst->output(skb), which is ip_output()

IP Backup

Recall the IP Header
IP packet format (each row is 32 bits wide, bits 0-31):
    Version | IHL | Codepoint | Total length
    Fragment-ID | DF | MF | Fragment-Offset
    Time to Live | Protocol | Checksum
    Source address
    Destination address
    Options and payload

Recall the sk_buff structure
[Figure: sk_buff layout, from linux-2.6.31/include/linux/skbuff.h. An sk_buff_head links sk_buff structures. Each sk_buff has the fields next, prev, sk (pointing to a struct sock), tstamp, dev (pointing to a net_device), ...lots of stuff..., transport_header, network_header, mac_header, head, data, tail, end, truesize and users. The head, data, tail and end pointers refer to the packet data area, laid out as "headroom", MAC header, IP header, UDP data and "tailroom", followed by skb_shared_info (dataref: 1, nr_frags, ..., destructor_arg).]

Recall pkt_type in sk_buff
- pkt_type: specifies the type of a packet
  - PACKET_HOST: a packet sent to the local host
  - PACKET_BROADCAST: a broadcast packet
  - PACKET_MULTICAST: a multicast packet
  - PACKET_OTHERHOST: a packet not destined for the local host, but received in promiscuous mode
  - PACKET_OUTGOING: a packet leaving the host
  - PACKET_LOOPBACK: a packet sent by the local host to itself