
  • Number of slides: 181

PEER-TO-PEER MULTIMEDIA APPLICATIONS
Jin Li, Principal Researcher, Microsoft Research
jinl@microsoft.com

Outline
  • Introduction
  • P2P today
  • Anatomy of BitTorrent, Skype & PPLive
  • Components and tools for P2P applications
  • P2P deployment issues
  • Summary

INTRODUCTION

Why P2P applications
  • P2P is ideal to serve the long tail … (figure label: 6 B)
  • Advantages
    ○ Economy to run: saves centralized bandwidth and/or storage
    ○ Robustness: no single point of failure
    ○ Super-scalability: system capacity increases with the number of nodes

P2P Application History
  • 1st generation
    ○ Napster: 5/99
  • 2nd generation
    ○ Gnutella: early 2000
    ○ FastTrack (Kazaa, Grokster, iMesh): 03/2001
    ○ eDonkey
  • 3rd generation
    ○ BitTorrent: 2002
    ○ Skype: 08/2003
    ○ PPLive: 12/2004

P2P Isn't New
  • Existing P2P technology may find its origin in
    ○ IP routers
    ○ DNS
    ○ Distributed computing
  • What is P2P
    ○ "Peer-to-peer is a class of applications that take advantage of resources (storage, cycles, content, human presence) available at the edges of the Internet." (Clay Shirky)
    ○ Nodes serve both as server & client
    ○ Every node pays for service by providing access to some of its resources (bandwidth, storage, etc.)
    ○ No single point of bottleneck or failure
    ○ Distributed algorithms for service/content discovery, status tracking, application layer routing, resilience, …

P2P TODAY

P2P Traffic Today
  • 1999 to present: fuelled by Napster, KaZaA, eDonkey and BitTorrent
  • [Figure: CacheLogic Research, Internet Protocol Breakdown 1993-2006]

P2P Today
  • Source: Sandvine, www.sandvine.com

P2P Networks: Large, Growing and Active
  • Estimated 200 M P2P users worldwide
  • 420 million searches are conducted daily on P2P networks, rivaling searches on Google, Yahoo and Live
  • The number of P2P files downloaded in the US was up 24% in 2006
  • 60% of Internet backbone traffic is P2P, and up to 90% of upstream user traffic is now consumed by P2P applications
  • Source: CacheLogic

ANATOMY OF BITTORRENT, SKYPE & PPLIVE

P2P App 1: BitTorrent
  • Information
    ○ Debut: 2002, by Bram Cohen
    ○ For file sharing (content location is handled by the tracker, which is a centralized server rather than P2P)
    ○ Accounts for 35% of traffic (according to analysis by CacheLogic)
  • Numerous clients
    ○ Official client (Python), Azureus (Java), BitComet (C++)

Authorized Use of BitTorrent
  • Many adopters report that only with BitTorrent could they afford to distribute their files
    ○ Major open source & free software projects
    ○ Game updates & games (e.g., World of Warcraft)
    ○ Films (Warner Bros., fan films)
    ○ Other materials

Step 1. Install BitTorrent Client

Step 2. Tracker – Centralized or DHT
  • Centralized tracker
  • Trackerless BitTorrent (DHT)
    ○ Eliminates the need for the tracker, more robust
    ○ Less efficient
    ○ Lacks content distribution control
    ○ Lacks content distribution statistics

Step 3. Make/Download Torrent File
  • .torrent file (tracker address, hash with integrity check)
  • Distribution channels: web, torrent search, BBS, mail

Step 4. Overlay Forming & Sharing
  • [Diagram labels: tracker, swarms, client]

Scheduling Rule 1: Tit-For-Tat
  • BitTorrent rule: a node preferentially uploads to the neighbors that provide it with the best download rate (top m)
  • [Jumpstart] Optimistic unchoking: every 30 s, unchoke a random neighbor regardless of its download rate
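The two rules above can be sketched in a few lines. This is a minimal illustration, not the code of any BitTorrent client: the function names, the dictionary of measured rates, and the way the optimistic slot is passed in are all assumptions made for the example.

```python
import random

def choose_unchoked(download_rates, m, optimistic=None):
    """Return the set of neighbors to upload to.

    download_rates: dict mapping neighbor id -> download rate observed from it
                    (hypothetical input; a real client measures this itself).
    m: number of regular tit-for-tat upload slots.
    optimistic: neighbor currently holding the optimistic slot, if any.
    """
    # Tit-for-tat: keep the m neighbors that give us the best download rate.
    ranked = sorted(download_rates, key=download_rates.get, reverse=True)
    unchoked = set(ranked[:m])
    # Optimistic unchoke: one extra neighbor, chosen regardless of its rate.
    if optimistic is not None and optimistic in download_rates:
        unchoked.add(optimistic)
    return unchoked

def pick_optimistic(download_rates, rng=random):
    """Pick a random neighbor; meant to be re-run roughly every 30 s."""
    if not download_rates:
        return None
    return rng.choice(sorted(download_rates))
```

The optimistic slot is what lets a newcomer with nothing to trade receive its first pieces, which is why the slide tags it as a jumpstart.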

Scheduling Rule 2: Local Rarest Rule
  • [Diagram labels: tracker, swarms, client]
  • The client downloads pieces in rarest-first order
  • End game: the client sends requests for all missing blocks and sends a cancel every time a block arrives
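A rough sketch of rarest-first ordering follows. It assumes each neighbor's advertised pieces are available as a set of piece indices; the function name and the tie-break by piece index are choices made for the example, not part of the protocol description above.

```python
from collections import Counter

def rarest_first(missing, neighbor_pieces):
    """Order the pieces we still miss by how few neighbors hold them.

    missing: set of piece indices we do not have yet.
    neighbor_pieces: dict neighbor id -> set of piece indices it advertises.
    Pieces that no neighbor holds are left out, since nobody can serve them.
    """
    counts = Counter()
    for pieces in neighbor_pieces.values():
        counts.update(pieces)
    available = [p for p in missing if counts[p] > 0]
    # Fewest holders first; ties broken by piece index for determinism.
    return sorted(available, key=lambda p: (counts[p], p))
```

"Local" in the rule's name refers to the fact that rarity is counted only over the client's own neighbors, not over the whole swarm.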

P2P App 2: Skype
  • Information
    ○ Debut: 08/2003, by N. Zennstrom and J. Friis, who founded KaZaA
    ○ A P2P overlay network for VoIP and other applications
    ○ Free Skype-to-Skype VoIP and fee-based SkypeOut/SkypeIn

Skype Usage (Apr. 2008)
  • 11 million concurrent Skype users online at peak time (180,000+ simultaneous calls)
  • 309 million registered users worldwide, the largest registered user base within the eBay portfolio (33 million added users in Q1 FY08)
  • $126 M revenue in Q1 FY08 (61% YOY growth, 5.6 billion SkypeOut minutes in FY2007)
  • 100 billion cumulative Skype-to-Skype minutes

Skype Share of International VoIP Traffic

Skype Gadgets
  • Motorola CN620 WiFi cellphone
  • IPEVO Free-1 USB Skype phone
  • IPDRUM mobile Skype cable
  • Netgear Skype Wi-Fi phone
  • 50 hardware partners, 150+ Skype-certified devices

Skype vs. VoIP
  • Public VoIP standards: H.323, SIP
  • Skype is a proprietary VoIP solution
    ○ Relies on the P2P network for the user directory: scalable without costly infrastructure
    ○ Routes calls through supernodes: universal firewall/NAT traversal
    ○ Encrypted traffic (but you have to trust eBay/Skype)

Skype Ingredient (1)
  • User retrieves ID from a Skype server

Skype Network
  • Skype server: authentication
  • Supernode overlay: any computer with sufficient CPU, memory & network bandwidth that is not behind a firewall
    ○ For distributed directory service
    ○ Relays traffic for computers behind NAT/firewall

NAT Traversal (Skype)
  • NAT/firewall detection
    ○ Try UDP connection
    ○ Try TCP connection (arbitrary port, 80 (HTTP), 443 (HTTPS))
  • Traversal
    ○ Direct connection if a) both clients have no NAT, or b) one client has no NAT and the other is behind a cone NAT
    ○ Relay by supernode otherwise
  • Since Skype doesn't need to pay for relay cost
    ○ High bitrate wideband voice codec (>24 kbps)
Tutorial, Jin Li
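The direct-versus-relay rule on this slide reduces to a small decision function. The sketch below only encodes the two conditions listed; the `Nat` enum and the `connection_mode` name are hypothetical, and Skype's actual logic is proprietary and not known from this deck.

```python
from enum import Enum

class Nat(Enum):
    NONE = "none"    # publicly reachable client
    CONE = "cone"    # behind a cone NAT
    OTHER = "other"  # anything stricter (e.g. symmetric NAT, blocking firewall)

def connection_mode(a, b):
    """Return "direct" or "relay" for two clients with NAT types a and b."""
    # a) both clients have no NAT
    if a is Nat.NONE and b is Nat.NONE:
        return "direct"
    # b) one client has no NAT and the other is behind a cone NAT
    if Nat.NONE in (a, b) and Nat.CONE in (a, b):
        return "direct"
    # everything else is relayed through a supernode
    return "relay"
```

Note that under this rule two clients that are both behind cone NATs are still relayed, which is consistent with the slide's point that relay capacity costs Skype nothing.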

Skype: Call Routing Through Supernode
  • [Diagram labels: Skype server (authentication), supernode overlay]
  • Route call through supernodes
  • High bitrate wideband voice codec (>24 kbps)

Skype Encryption
  • [Diagram labels: Peer 1, Peer 2]
  • 256-bit AES over 128-bit data blocks
  • 1536/2048-bit RSA for key negotiation (2048/2048 for paid service)

Skype: Complete Black Box (Security by Obfuscation)
  • Almost everything is obfuscated
  • Many protections, anti-debugging tricks, ciphered code
    ○ Avoid static disassembly: XOR the binary with a hardcoded key, erase the beginning of the code, own packer
    ○ Code integrity check: use checksums to detect breakpoints
    ○ Anti-debugging techniques: anti-SoftICE, integrity check
    ○ Code obfuscation
  • Network obfuscation

P2P App 3: PPLive

Online Video Usage On the Rise
  • China: 70% of broadband users watch TV over broadband
    ○ 18-24 year old broadband users: 87% watch music videos online, 82% watch TV programs
  • UK
    ○ 18-24 year old broadband users: 77% watch music videos online, 60% watch TV programs
  • US
    ○ 72% stream news videos consistently
    ○ 59% watch short clips from movies or TV
    ○ 48% watch music videos
    ○ 44% stream sports highlights
    ○ 43% watch user-generated home videos
    ○ 23% stream concert clips
    ○ 22% download a full-length movie or TV show
    ○ 17% stream live sporting events

Video Dominates P2P Traffic
  • Source: CacheLogic, www.cachelogic.com
  • [Figure: CacheLogic Research, Breakdown of File-Types on Major P2P Networks, 2007]
  • More than 60% of P2P traffic is video
  • Asia: 50% of objects > 2.5 Gb

Video Streaming as a Major Video Distribution Vehicle
  • Video streams served increased 38.8% in 2006 to 24.92 billion across all entertainment and media sites*
  • *Source: AccuStream iMedia Research (excluding UGC)

Is CDN the Answer?
  • CDN capacity
    ○ Akamai: 400 Gbps
    ○ Limelight: 1000 Gbps
  • TV quality: around 500 kbps
    ○ 100,000 viewers = 50 Gbps
    ○ 2.8 million viewers in total from the top two CDNs
  • Current TV audience
    ○ 2.5 billion watched the Olympics
    ○ 1.1 billion still watch Baywatch EVERY day?
    ○ Soccer World Cup final: 3 billion
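The viewer counts on this slide follow from simple division, reproduced below as a sanity check. It assumes both CDN capacities are meant in Gbps, which is the reading under which the slide's 2.8 million total works out; the function names are illustrative.

```python
def viewers_supported(capacity_gbps, stream_kbps=500):
    """Number of concurrent streams of stream_kbps that fit in capacity_gbps."""
    return int(capacity_gbps * 1_000_000 // stream_kbps)

def bandwidth_needed_gbps(viewers, stream_kbps=500):
    """Aggregate server bandwidth, in Gbps, to unicast one stream per viewer."""
    return viewers * stream_kbps / 1_000_000

# Top two CDNs combined, assuming 400 Gbps + 1000 Gbps.
combined = viewers_supported(400 + 1000)
# The slide's own example: 100,000 viewers at 500 kbps.
example = bandwidth_needed_gbps(100_000)
```

With these assumptions `combined` is 2,800,000 viewers and `example` is 50 Gbps, far short of audiences counted in billions.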

Peer Assisted Streaming
  • Peer assisted streaming is the only solution for a popular site to distribute
    ○ A large number of streams (virtual channels)
    ○ Without IP multicast
    ○ In a cost-effective manner

PPLive History
  • Started in Dec. 2004
  • Status: Aug. 2007
    ○ 75 million installed base
    ○ 3.5 million daily active viewers
    ○ 2.2 million peak concurrent viewers
    ○ 1.48 million peak concurrent viewers per show (an NBA play-off game with the Houston Rockets, live, China, Q2 2007): 740 Gbps bandwidth bill if not P2P

PPLive – Contact Channel Server
  • [Diagram label: channel list server]
  • Approx. 300-400 channels; all viewers of the same channel watch the media at the same point

PPLive – Obtain Peer List
  • [Diagram labels: channel list server, tracker, IP lists]

PPLive: Video Display
  • [Diagram labels: PPLive engine, media player, Internet, P2P, queue, HTTP request, PPLive UI]

PPLive: Protocol Analysis (Hei'06)
  • Chunk size: 1 s of video
  • Buffer map: coded chunk availability
  • Proprietary exchange protocol & algorithm

Subsidizing Behavior (Hei'06)
  • [Diagram labels: campus node, home node]
  • High bandwidth nodes subsidize low bandwidth nodes
  • High bandwidth nodes upload much more than low bandwidth nodes

Other P2P Streaming Apps
  • PPStream.com (World Cup soccer)
  • Roxbeam
  • Mysee.com

LESSONS LEARNT

Content is King
  • P2P is popular when it facilitates content distribution that is prohibitively expensive by other means
    ○ Much content (e.g., software from small vendors & movies from small providers) would not be available without BitTorrent or another P2P application
  • Quality matters
    ○ P2P attracts users because they receive a much higher quality of service than with the client-server approach
    ○ BitTorrent and PPLive offer file download/media streaming with unparalleled quality of service

Server Role
  • A server is not necessarily evil in P2P
    ○ Both BitTorrent & Skype have server components
    ○ Greatly simplifies design
  • Replacing all server components with distributed components may lead to high implementation cost and lower quality of service for the user
    ○ Identify the primary source of P2P saving
    ○ Identify which key components can be better served by the server to simplify system design

Security
  • Eavesdropping
  • Confidentiality
  • Imposter
  • Pollution
  • Denial of service

Proper Incentive is Crucial
  • Incentive is crucial
    ○ BitTorrent does succeed in discouraging free riding: with tit-for-tat, if you reduce your upload rate, your download rate suffers; additional community features help further
    ○ Skype: users are online (and contributing) during work hours
  • Users don't like, but may tolerate, subsidizing behavior
    ○ Skype: supernodes subsidize other nodes for relay
    ○ PPLive: high bandwidth nodes subsidize low bandwidth nodes

NAT/Firewall Traversal
  • NAT traversal is always a pain point in developing P2P applications
  • Skype wins mainly because it provides "free relay" capacity

DATA CENTER VS. CDN VS. P2P

Our Vision Of Microsoft Internet Infrastructure: Inner Layer
  • [Diagram label: DC]
  • Inner layer: massive data center
    ○ 40 Gbps egress, 20k+ servers, 50 pairs of switch/router chassis
    ○ Good for: computation-intensive applications, e.g., Live Search

Our Vision Of Microsoft Internet Infrastructure: Middle Layer
  • [Diagram labels: ECN, DC]
  • Middle layer: edge computing network (ECN)
    ○ A few dozen sites strategically placed all over the world
    ○ Hundreds of servers or more per site
  • Good for
    ○ Forming a high speed network core of the Internet
    ○ Latency-sensitive/throughput-hungry applications

Our Vision Of Microsoft Internet Infrastructure: Outer Layer
  • [Diagram labels: P2P, ECN, DC]
  • Outer layer: P2P delivery
    ○ Peers contribute resources (network bandwidth, CPU, memory, hard drive)
  • Good for
    ○ Throughput-intensive applications: improve server scalability, use locality to improve throughput to the end peers

P2P COMPONENTS AND TOOLS

P2P Components and Tools
  • Overlay network
  • Scheduling algorithm
  • Erasure resilient coding (ERC)
  • NAT/firewall traversal

Overlay network

Overlay Network
  • An overlay network is a computer network which is built on top of another network

Why Study Overlay
  • 1st step in building a P2P application
  • The overlay graph affects
    ○ Content distribution efficiency
    ○ Robustness of the P2P application

Overlay Building Methods
  • Tracker based overlay construction
    ○ Random overlay with super peer: BitTorrent
  • Distributed overlay construction
    ○ Pure random overlay: Gnutella
    ○ DHT: trackerless BitTorrent

Tracker Based Overlay Construction: Random Overlay (BitTorrent)
  • The client sends a request to the tracker asking for a set of peers
  • The tracker randomly selects peers to include in the response
    ○ The tracker returns numwant peers [default = 50, fewer if there are fewer peers]
  • Upper & lower limits
    ○ 30 peers is plenty (below that, new connections will be formed)
    ○ 55 peers is too many (the client will refuse connections)
  • The parameter is important to performance
    ○ Too few peers: not enough for the scheduling algorithm to work with
    ○ Too many peers: high overhead in exchanging HAVE messages
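The tracker side of this exchange, and the client's reaction to its current peer count, can be sketched as follows. Only `numwant` and the 30/55 thresholds come from the slide; the function names, the in-memory swarm list and the returned labels are assumptions made for the illustration.

```python
import random

def tracker_response(swarm, requester, numwant=50, rng=random):
    """Return up to numwant randomly chosen peers, excluding the requester.

    swarm: list of peer addresses known to the (hypothetical) tracker.
    """
    candidates = [p for p in swarm if p != requester]
    return rng.sample(candidates, min(numwant, len(candidates)))

def connection_action(num_peers, low=30, high=55):
    """What a client does given how many peers it is connected to."""
    if num_peers < low:
        return "form new connections"
    if num_peers >= high:
        return "refuse connections"
    return "steady"
```

Random selection keeps the overlay graph close to a random graph, which is what the mesh scheduling rules rely on.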

Distributed Overlay Building: Random Walk (Gnutella)
  • Each node maintains a neighborhood table (IP addresses)
    ○ Symmetric table
    ○ With upper and lower bounds on the number of entries
  • A joining node uses a random walk from a bootstrap node to find other nodes for its neighborhood table
    ○ The neighbor discovery message carries a count of to-be-filled entries
    ○ Upon receiving the neighbor discovery message, a node checks whether its number of neighbors has reached the upper bound; if not, it invites the joining node to become its neighbor
    ○ The message is forwarded to a random node in the neighborhood if the counter is still greater than zero
    ○ Failure recovery: acknowledge all neighbor discovery messages
  • Failure detection
    ○ Every t seconds, send a keep-alive to every neighbor
    ○ No response: probe; still no response: assume failure
    ○ A cache is maintained to replace failed neighbors
    ○ Cache empty: send a neighbor discovery message to a randomly chosen neighbor

Distributed Overlay Building: DHT (BitTorrent)
  • Trackerless BitTorrent
    ○ All trackerless BitTorrent clients, across all shared files, form a DHT
    ○ Each peer becomes a virtual tracker
    ○ The ID (hash) of the file determines which peers will serve as its tracker

BitTorrent: Trackerless
  • A centralized tracker is a single point of failure
  • Multi-trackers
    ○ Defined by BitTornado
    ○ Specify an order in which the trackers should be accessed
  • Trackerless BitTorrent
    ○ Uses the Kademlia DHT
    ○ Azureus BitTorrent: 1.3 million members, Kademlia with k=20
    ○ Official BitTorrent: 200k members, Kademlia with k=8

Distributed Hash Table
  • Partitions ownership of a set of keys among participating nodes
  • Basic functionality (routing): route the message to the unique owner of any given key
  • DHT operations
    ○ Store(ID, value)
    ○ Retrieve(ID)
  • Examples
    ○ CAN, Chord, Pastry, Kademlia

DHT: The Key is Routing
  • A P2P cloud
  • Each peer has a unique ID
  • For a given VALUE, which peer has the largest ID that is smaller than VALUE?

DHT: Two Sets of Routing Entries
  • Leaf set
    ○ The nodes whose IDs come immediately before and immediately after the current node
    ○ Correctness of DHT routing is guaranteed by the leaf set
  • Finger set
    ○ A set of fingers that stick out for fast routing
    ○ May consider node proximity in finger set construction
  • DHT schemes differ primarily in leaf set construction

Kademlia DHT: History
  • Designed by P. Maymounkov and D. Mazieres (NYU, 2002)
  • Used by
    ○ eDonkey & eMule
    ○ BitTorrent Azureus DHT
    ○ Trackerless BitTorrent (official client, µTorrent, BitSpirit, BitComet)

Kademlia DHT: XOR Based Routing
  • Uses an XOR based distance measure
  • Node ID: 160 bits
  • Each node is treated as a leaf of a binary tree, with its position determined by the shortest unique prefix of its ID
  • Subtrees of a node
    ○ 1st: the half of the binary tree not containing the node
    ○ 2nd: the half of the remaining tree not containing the node
    ○ …
  • A node knows at least one node in each of its subtrees (it can know more)

Kademlia DHT
  • Node: 0011……
    ○ 1st subtree: prefix 1
    ○ 2nd subtree: prefix 01
    ○ 3rd subtree: prefix 000
    ○ 4th subtree: prefix 0010
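The subtree a peer falls into is determined by the highest bit in which its ID differs from ours, which is exactly what the XOR distance exposes. A small sketch, using 4-bit IDs instead of 160-bit ones for readability (the function names are illustrative):

```python
def xor_distance(a, b):
    """Kademlia distance between two node IDs, interpreted as integers."""
    return a ^ b

def subtree_index(node_id, other_id):
    """Index i such that 2**i <= distance < 2**(i + 1).

    With b-bit IDs, index b - 1 is the 1st (largest) subtree on the slide,
    index b - 2 the 2nd, and so on.
    """
    d = xor_distance(node_id, other_id)
    if d == 0:
        raise ValueError("a node is not in any of its own subtrees")
    return d.bit_length() - 1

me = 0b0011
first = subtree_index(me, 0b1010)   # prefix 1    -> index 3
second = subtree_index(me, 0b0110)  # prefix 01   -> index 2
third = subtree_index(me, 0b0001)   # prefix 000  -> index 1
fourth = subtree_index(me, 0b0010)  # prefix 0010 -> index 0
```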

Kademlia DHT: Distance
  • Each node keeps a k-bucket for each subtree (distance 2^i to 2^(i+1))
    ○ A list of at most k nodes
    ○ Sorted by time last seen
    ○ Default: k=20
  • When a new message arrives from node x
    ○ Node x already in a k-bucket: move it to the tail
    ○ Node x not in a k-bucket and the associated bucket has fewer than k nodes: add x
    ○ Node x not in a k-bucket and the associated bucket has k nodes: ping the least recently seen node; if there is no response, evict that node
  • A live node is never evicted
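The bucket update rule can be sketched as a small class. The slide does not say what happens to x after the ping; the sketch assumes the behavior described in the Kademlia paper (x replaces an evicted node, and is dropped if the oldest node answers). The class name and the `ping` callback are hypothetical.

```python
class KBucket:
    """One k-bucket: least recently seen node at the head, most recent at the tail."""

    def __init__(self, k=20):
        self.k = k
        self.nodes = []

    def seen(self, x, ping):
        """Update the bucket after a message from node x.

        ping: callable taking a node and returning True if it responds.
        """
        if x in self.nodes:
            # Already known: move to the tail (most recently seen).
            self.nodes.remove(x)
            self.nodes.append(x)
        elif len(self.nodes) < self.k:
            self.nodes.append(x)
        else:
            oldest = self.nodes[0]
            if ping(oldest):
                # Live nodes are never evicted; x is dropped (assumption
                # taken from the Kademlia paper, not from the slide).
                self.nodes.remove(oldest)
                self.nodes.append(oldest)
            else:
                # No response: evict the stale node and admit x.
                self.nodes.pop(0)
                self.nodes.append(x)
```

Preferring old, live contacts makes the routing table resistant to flooding with fresh node IDs, since long-lived nodes tend to stay online.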

Kademlia DHT: Fewer than k Nodes
  • [Figure: one k-bucket]

Kademlia DHT: k < NUM < 2k
  • At most k fingers to each k-bucket
  • [Figure: one k-bucket]

Kademlia DHT: 2k < NUM < 4k
  • [Figure: one k-bucket]

Kademlia DHT: Protocol
  • Kademlia protocol
    ○ PING(ID): ping a node
    ○ FIND_NODE(ID): returns the k nodes that are closest to the target ID
    ○ STORE(ID, VALUE): store the pair at a node
    ○ FIND_VALUE(ID): similar to FIND_NODE, except that if a value is stored under ID, the stored value is returned

Kademlia DHT: Lookup
  • Node x wants to find the k closest nodes to some given node ID
    ○ Call FIND_NODE(ID) on x; of the k nodes closest to the target, pick the closest ones (default = 3): x1, x2, x3
    ○ Node x resends FIND_NODE(ID) to each xi
    ○ If xi fails to respond, remove xi from the k-bucket and resend the query
    ○ If a closer node is found, repeat the step
    ○ If a round of FIND_NODE(ID) fails to return a node any closer than the closest already seen, resend FIND_NODE(ID) to all k closest neighbors
    ○ Terminate if, after FIND_NODE(ID) has been sent to all k closest neighbors, no closer neighbor is found
  • Kademlia operations rely on lookup
    ○ Store is implemented by sending STORE(ID, VALUE) to the k closest nodes found in the lookup
    ○ Retrieve is implemented by sending FIND_VALUE(ID) instead of FIND_NODE(ID) during the lookup
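A simplified version of the iterative lookup is sketched below. It omits failure handling and timeouts, and it replaces the network RPC with a caller-supplied function `find_node(node, target)`, which is a stand-in for FIND_NODE and not a real API. It terminates once every one of the k closest nodes seen so far has been queried without revealing anything closer.

```python
def lookup(target, start, find_node, k=3, alpha=3):
    """Return the k nodes closest to target (by XOR) reachable from start.

    find_node(node, target): returns the nodes that `node` reports as
    closest to target (simulates the FIND_NODE RPC; hypothetical).
    alpha: how many not-yet-queried candidates are contacted per round.
    """
    def dist(n):
        return n ^ target

    shortlist = sorted(set(find_node(start, target)), key=dist)[:k]
    queried = {start}
    while True:
        pending = [n for n in shortlist if n not in queried][:alpha]
        if not pending:
            # All k closest known nodes were queried; nothing closer exists.
            return shortlist
        found = set(shortlist)
        for n in pending:
            queried.add(n)
            found.update(find_node(n, target))
        shortlist = sorted(found, key=dist)[:k]


# Toy network for illustration: 16 nodes with 4-bit IDs, each knowing the
# four nodes whose ID differs from its own in exactly one bit.
def toy_find_node(node, target, k=3):
    known = [node ^ (1 << bit) for bit in range(4)]
    return sorted(known, key=lambda n: n ^ target)[:k]

closest = lookup(target=13, start=2, find_node=toy_find_node)
```

In the toy network the lookup from node 2 converges on node 13 itself followed by its two nearest neighbors in XOR distance.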

Scheduling Algorithm

Scheduling Algorithm
  • Given a certain overlay, how can we efficiently move content in a P2P network?
    ○ Tree based distribution (push): content is distributed in a deterministic way in the overlay
    ○ Mesh based distribution (pull/push): content flows dynamically, with the specific delivery path negotiated by the sender & receiver
  • Key measurements
    ○ Efficiency
    ○ Robustness

Content Delivery Efficiency
  • P2P content delivery efficiency: content delivery throughput in P2P / bandwidth in P2P
  • For example, in P2P file delivery
    ○ N: number of nodes
    ○ L: length of the file
    ○ T: session length (time at which the last node finishes)
    ○ B_i: upload bandwidth of node i
    ○ B_s: upload bandwidth of source node s
    ○ Efficiency: delivered throughput divided by the total upload bandwidth, expressed in the quantities above
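The efficiency formula is not preserved in this transcript. One plausible reading of the ratio defined above, offered as an assumption rather than the slide's exact expression: N nodes each receive a file of length L within time T, so the delivered throughput is N·L/T, and the available bandwidth is the source's upload bandwidth plus that of all peers.

```latex
\eta \;=\; \frac{N \cdot L / T}{B_s + \sum_{i=1}^{N} B_i}
```

Under this reading, an efficiency of 1 means every bit of upload capacity in the system carried useful data for the whole session.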

Content Delivery Robustness
  • How the scheduling algorithm behaves when
    ○ Nodes join/leave the network gracefully/abruptly
    ○ A certain node/network link is congested
    ○ A certain node slows down due to a CPU constraint, network constraint, etc.

Tree Based Delivery: CoopNet
  • FEC/MDC striped across trees
  • Up/download bandwidths equalized
  • [Diagram label: a failed node]

Mesh Based Delivery: BitTorrent

Tree vs. Mesh Comparison

                  Tree (single)   Tree (multiple)   Mesh
  Efficiency      Poor            Fair              Good
  Robustness      Poor            Fair              Good
  Balancing       Poor            Fair              Good
  Latency         Low (both tree variants)          High
  Implementation  Easy            Fair              Tricky

Mesh Delivery In-Depth
  • Pull vs. push
  • Peer selection (flow control)
  • Block selection
  • Bandwidth (resource) allocation

Mesh Delivery: Pull vs. Push
  • Pull (receiver-driven)
    ○ The receiver first learns what data exists and where
    ○ Then it requests the data
  • Push (sender-driven)
    ○ The sender learns what the receiver has (to avoid receipt of duplicates)
  • Hybrid approach (pull-push)

Pull vs. Push: Comparison
  • [Table: efficiency of pull, push and hybrid for 40 nodes with 14 neighbors and for 40 nodes with 20 neighbors; values shown: 74%, 99%, 99%, 99%]
  • Push & hybrid can achieve more efficient distribution with a sparse overlay
  • When the overlay becomes dense, the gap between push/hybrid and pull shrinks
  • Pull can better control QoS for the receiver

Mesh Delivery: Peer Selection & Flow Control
  • Ensure that the network link between the client & each peer is fully utilized

Life of a Request & Reply
  • [Diagram labels: bottleneck, request, pending]

Mesh Delivery: Peer Selection & Flow Control
  • Maintain a queue between the receiver and each sender
    ○ Queue size: amount of data pending from the sender
    ○ The queue may 1) identify data from the sender, 2) redirect lost/delayed requests, 3) perform flow control
  • Maintain a constant request-reply time between the receiver and all senders
    ○ This is equivalent to letting the queue size be proportional to the link bandwidth
  • Link bandwidth estimates
    ○ Amount of replied data from the link
    ○ Packet size / inter-packet arrival time
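Why a constant request-reply time implies a queue proportional to bandwidth: the number of requests in flight is roughly bandwidth × delay / block size. The sketch below sizes each sender's queue that way; the function names, units and the minimum of one outstanding request are assumptions made for the illustration.

```python
def estimate_bandwidth(packet_bytes, inter_arrival_s):
    """Link bandwidth estimate in bytes/s: packet size / inter-packet arrival time."""
    return packet_bytes / inter_arrival_s

def queue_sizes(link_bandwidth, target_delay_s, block_bytes):
    """Pending requests to keep per sender so all share one request-reply time.

    link_bandwidth: dict sender id -> estimated bandwidth in bytes/s.
    target_delay_s: the common request-reply time to aim for.
    block_bytes: size of one requested block.
    """
    return {
        sender: max(1, round(bw * target_delay_s / block_bytes))
        for sender, bw in link_bandwidth.items()
    }
```

For example, with 1000-byte blocks and a 1 s target, a 100 kB/s sender gets 100 pending requests and a 25 kB/s sender gets 25, so both links stay busy without the slow one accumulating a long backlog.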

Idle Peer Detection
  • Redirection of requests
    ○ A peer becomes idle: no data received for a while (say 1 s)
    ○ Requests to the idle peer are redirected to other active peers (using the same peer selection policy)
  • Follow-up
    ○ If packets arrive from the idle peer, it is reactivated
    ○ If the peer is disconnected due to a TCP disconnect event/timeout, it is removed from the neighbor list

Mesh Delivery: Block Selection
  • Sequential: receive the blocks in sequence [poor performance]
  • Random: receive a random block [trails close behind]
  • Rarest: receive the rarest block in the neighborhood, with no method for tie breaking (BitTorrent) [trails close behind]
  • Rarest random: among the rarest blocks, select a random one [found to be best, Kostic'05 & Liu'08]
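The "rarest random" policy differs from plain rarest-first only in how ties are broken. A minimal sketch, with hypothetical names and neighbor holdings modeled as sets of block indices:

```python
import random
from collections import Counter

def rarest_random(missing, neighbor_blocks, rng=random):
    """Pick one block to request: random choice among the rarest obtainable ones.

    missing: set of block indices we still need.
    neighbor_blocks: dict neighbor id -> set of block indices it holds.
    Returns None when no neighbor holds any missing block.
    """
    counts = Counter(b for blocks in neighbor_blocks.values() for b in blocks)
    candidates = [b for b in missing if counts[b] > 0]
    if not candidates:
        return None
    rarest = min(counts[b] for b in candidates)
    # Random tie-break: spreads requests so peers do not all chase the same block.
    return rng.choice(sorted(b for b in candidates if counts[b] == rarest))
```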

Sender Bandwidth Allocation
  • Fair vs. subsidizing
    ○ BitTorrent, tit-for-tat: a node preferentially uploads to the neighbors that provide it with the best download rate (top m)
    ○ Subsidizing: a sender uploads blocks to receivers in round-robin order
    ○ Subsidizing is desirable for resource-intensive P2P applications (e.g., peer assisted streaming)

Erasure Resilient Coding (ERC)

Erasure Resilient Coding
  • Original data: k messages (1, 2, 3, …, k)
  • ERC: n coded messages (1, 2, 3, …, k, k+1, …, n)
  • At a certain instance, some of the blocks may be lost in delivery (marked X in the figure). However, as long as at least k blocks are delivered, the original data can be reconstructed.

ERC in P2P File Sharing
  • Split the file into k blocks
  • Generate n encoded blocks
  • Perform P2P file sharing (e.g., in a BitTorrent-like fashion)
  • The peer succeeds in receiving the file if it receives any k of the n coded blocks
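The "any k of n" property can be seen in the simplest possible erasure code: k data blocks plus one XOR parity block (n = k + 1). This is a deliberately small stand-in, not the Reed-Solomon style code the deck discusses later; the function names are illustrative.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(blocks):
    """k equal-length data blocks -> k + 1 coded blocks (data followed by parity)."""
    parity = bytes(len(blocks[0]))
    for block in blocks:
        parity = xor_bytes(parity, block)
    return list(blocks) + [parity]

def decode(received, k):
    """Rebuild the k data blocks from any k of the k + 1 coded blocks.

    received: dict coded-block index (0..k) -> block contents.
    """
    missing = [i for i in range(k) if i not in received]
    if len(missing) > 1 or (missing and k not in received):
        raise ValueError("fewer than k coded blocks received")
    blocks = dict(received)
    if missing:
        # XOR of the parity with the k - 1 received data blocks is the lost block.
        rebuilt = received[k]
        for i in range(k):
            if i in received:
                rebuilt = xor_bytes(rebuilt, received[i])
        blocks[missing[0]] = rebuilt
    return [blocks[i] for i in range(k)]
```

A receiver that got blocks 0, 2 and the parity block of a 3-block file can call `decode({0: c0, 2: c2, 3: parity}, k=3)` and recover block 1 without ever having downloaded it.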

ERC Terms
  • Number of original blocks: k
  • Number of coded blocks: n
  • Rate of ERC: k/n
  • MDS: Maximum Distance Separable
    ○ Any k of the n coded blocks can recover the original
    ○ The theoretically optimal performance

Erasure Encoding: Mathematics
  • Original data: $x_1, x_2, \dots, x_k$; coded data: $y_1, y_2, \dots, y_n$
  • $(y_1, \dots, y_n)^T = G\,(x_1, \dots, x_k)^T$, where $G$ is the $n \times k$ generator matrix
  • $x_i$, $y_j$: vectors over a Galois field

Example: ERC of 10 MB
  • Original data (10 MB): $x_1, x_2, \dots, x_k$ with k=10
  • Coded data: $y_1, y_2, \dots, y_n$ with n=30
  • Field: GF($2^8$); each vector is 1 MB
  • The generator matrix is 30 × 10

Erasure Decoding: Mathematics
  • Original data: $x_1, x_2, \dots, x_k$; coded data: $y_1, y_2, \dots, y_n$
  • Select the rows of the generator matrix that correspond to the available coded blocks
  • The original data can be recovered if this sub-generator matrix has full rank k

Systematic vs Non-Systematic ERC
  • Original data: k messages (1, 2, 3, …, k)
  • Non-systematic ERC: n coded messages (1, 2, 3, …, k, k+1, …, n)
  • Systematic ERC: slightly lower encoding & decoding complexity

Reed-Solomon
  • Only known MDS code for arbitrary k and n
  • Has been around for decades
  • Has a systematic form
  • Cauchy Reed-Solomon code

Reed-Solomon Decoding
  • [Figure labels: inverse, receive]

Network Coding in P2P File Delivery
  • Original data: k messages
  • Source coding: n coded messages, n >> k
  • [Diagram labels: host, friend nodes, intermediate nodes (mix & generate new blocks), client node]
  • As long as we get more than k1 messages, we can decode the original data. For an MDS code, k1 = k; otherwise k1 > k

ERC & Network Coding
  • ERC in P2P
    ○ The source sends out different ERC blocks to the connected peers
    ○ ERC blocks are forwarded, but not mixed, during delivery
  • Network coding in P2P
    ○ The source sends out different network coded blocks to the connected peers
    ○ The coded blocks are mixed during delivery

Network Coding (Random Linear Code)
  • Each coded block is a randomly formed linear combination of the original blocks
  • The generator vector is attached to each coded block
  • Block mixing
    ○ Start with blocks c0, c1
    ○ Get block c2 as a random linear combination of c0 and c1
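A toy version of random linear network coding over GF(2), where a linear combination is simply an XOR, is sketched below. Practical systems usually work over a larger field such as GF(2^8); GF(2) is used here only to keep the example short. Blocks are modeled as integers, the generator vector as a bitmask, and all names are illustrative.

```python
import random

def mix(coded, rng=random):
    """Combine a random non-empty subset of coded blocks into a new one.

    coded: list of (mask, payload) pairs; bit i of mask is set when original
    block i is part of the XOR that produced payload.
    """
    picks = []
    while not picks:
        picks = [c for c in coded if rng.random() < 0.5]
    mask, payload = 0, 0
    for m, p in picks:
        mask ^= m
        payload ^= p
    return mask, payload

def decode(coded, k):
    """Gaussian elimination over GF(2); returns the k originals or None."""
    basis = {}  # highest set bit of mask -> (mask, payload)
    for mask, payload in coded:
        while mask:
            top = mask.bit_length() - 1
            if top not in basis:
                basis[top] = (mask, payload)
                break
            bm, bp = basis[top]
            mask ^= bm
            payload ^= bp
    if len(basis) < k:
        return None  # generator vectors do not yet span all k originals
    for top in sorted(basis):  # back-substitution, lowest pivot first
        mask, payload = basis[top]
        for low in range(top):
            if (mask >> low) & 1:
                lm, lp = basis[low]
                mask ^= lm
                payload ^= lp
        basis[top] = (mask, payload)
    return [basis[i][1] for i in range(k)]
```

The attached mask plays the role of the generator vector: an intermediate node can mix blocks it cannot decode, and the receiver decodes once the masks it has collected reach rank k.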

How Useful is ERC in P2P Delivery
  • Theory
    ○ Broadcast (Edmonds, 1972), all nodes are receivers: $\mathrm{maxflow}(s, T) = \min_{t_i \in T} \mathrm{mincut}(s, t_i)$
    ○ Routing is enough; block coding/mixing cannot further improve theoretical throughput

Network Coding vs ERC
  • Simulation by Gkantsidis (INFOCOM 2005)
    ○ In a homogeneous topology: network coding performs slightly better than ERC at the source, which performs slightly better than no coding
    ○ With heterogeneous capacity, and especially in topologies with clusters: network coding performs better than ERC, which performs better than no coding

Network Coding / ERC at Source
  • Implementation by Kostic (USENIX 2005)
    ○ In a well connected graph, ERC doesn't help
  • Implementation by Wang (IWQoS 2006)
    ○ Network coding offers inferior performance, due to its need to wait for at least two blocks before it can redistribute
    ○ Computational complexity hinders the use of network coding in high capacity nodes (e.g., core routers)

NAT/Firewall Traversal 109

NAT/Firewall Traversal. A very important component in consumer P2P applications. You have to build the component. Its performance greatly affects the system performance. NAT/firewall traversal behavior may also affect system design decisions. Diagram labels: 192.168.0.3, 4.18.133.70, Internet, 192.168.0.2. 110

NAT/Firewall Traversal. NAT: Network Address Translation, an Internet standard that enables a local-area network (LAN) to use one set of IP addresses for internal traffic and a second set of addresses for external traffic. A NAT box located where the LAN meets the Internet makes all necessary IP address translations. NAT serves three main purposes: it provides a type of firewall by hiding internal IP addresses; it enables a company to use more internal IP addresses (since they're used internally only, there's no possibility of conflict with IP addresses used by other companies and organizations); it allows a company to combine multiple ISDN connections into a single Internet connection. 111

Firewall. A piece of hardware and/or software which functions in a networked environment to prevent some communications forbidden by the security policy. Egress filtering: only allow certain outbound traffic (to certain IP:port, from a selected set of IP addresses). Ingress filtering: only allow certain inbound traffic (to certain IP:port, following known outbound traffic). 112

NAT/Firewall Traversal: Naïve Approach Under Windows. Use IPv6 (Windows): Windows XP SP2 & Vista implement Teredo tunneling (turned on by default in Vista; turned off by default in Windows XP, so it needs to be turned on). Supports STUN traversal (and TCP on top of UDP). About 60% traversal success rate. 113

Build Your Own NAT/Firewall Traversal. NAT/firewall discovery. Peer address advertisement. Port prediction & traversal. 114

Traversal Procedure 1: NAT/Firewall Discovery. What type of NAT/firewall am I behind? 115

Traversal Procedure 2: Peer Address Advertisement. How do I advertise my contact information to other peers, and know if there are peers who want to connect to me? 116

Traversal Procedure 3: NAT/Firewall Traversal. How to establish direct connections between peers that are behind a NAT/firewall? 117

NAT/Firewall Detection: Echo Servers. With one or more echo servers, determine what type of NAT/firewall I am behind. 118

NAT/Firewall Detection. Ext Addr:Port = Int Addr:Port? Yes: public Internet; incoming UDP/TCP allowed, no ingress firewall filtering, can serve as a new public peer. No: behind NAT/firewall. UDP/TCP connection to server allowed? Yes: no egress firewall filtering. No: behind a firewall with egress filtering; connect to a popular port (say TCP 80, TCP 443): if yes, may connect by relay; if no, connection failed. 119
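The decision steps on this slide can be written as a small pure function. This is only a sketch: the function name, its parameters and the returned labels are hypothetical, and the actual probing of echo servers is assumed to have happened elsewhere.

```python
def classify(internal_ep, external_ep, server_reachable, popular_port_reachable=False):
    """Classify the local network situation from echo-server probe results.

    internal_ep / external_ep: (address, port) tuples; external_ep is what the
    echo server reports and is None when the server could not be reached.
    """
    if not server_reachable:
        # Egress filtering: fall back to a popular port such as TCP 80/443.
        if popular_port_reachable:
            return "egress-filtered firewall: connect by relay"
        return "connection failed"
    if external_ep == internal_ep:
        # No address translation observed.
        return "public Internet: can serve as a public peer"
    return "behind NAT/firewall"
```

For example, `classify(("192.168.0.2", 8000), ("4.18.133.70", 36721), True)` falls into the "behind NAT/firewall" branch; telling cone from symmetric NATs needs further probes against a second echo port, which this sketch leaves out.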

Cone NAT (70%). Diagram labels: 131.107.224.9:3074, 131.107.224.9:3075; Port: 36721; 192.168.0.1:8000; NAT; 4.35.148.9:8100. 120

Symmetric NAT: Sequential (30%). Diagram labels: 131.107.224.9:3074, 131.107.224.9:3075; Port: 36721, 36722, 36723; 192.168.0.1:8000; NAT; 4.35.148.9:8100. 121

Symmetric NAT: Random (1%). Diagram labels: 131.107.224.9:3074, 131.107.224.9:3075; 192.168.0.1:8000; NAT; 4.35.148.9:8100. 122

Advertise for Access. To identify a peer: Addr:Port of the presence server; public port of the peer (useful for Cone NAT); private port of the peer. 123

NAT Traversal (method used for each pair of peer NAT types)

            D-IP  UPnP  Full Cone  Res Cone  Seq Sym  Sym Rand  Firewall
D-IP         1     1     1          3         3        3         4
UPnP         1     1     1          3         3        3         R
Full Cone    1     1     1          3         3        3         R
Res Cone     3     3     3          2         5        R         R
Seq Sym      3     3     3          5         6        R         R
Sym Rand     3     3     3          R         R
Firewall     4     R     R          R

Legend (keys 1 to 6 and R, in the order listed on the slide): Direct Connection; STUN; Direct Connection or Connect Back to Specific Port; Symmetric NAT traversal; Relay. 124

Direct Connection (A B). A, B: D-IP, UPnP, Full Cone. Diagram labels: Presence Server, Echo Server, NAT, Client A, NAT, Client B. 125

Connect Back (A B). A: D-IP, UPnP, Full Cone; B: not Firewall. Diagram labels: Presence Server, Echo Server, NAT, Client A, NAT, Client B. 126

Connect Back @ port 80/443 (A B). A: D-IP; B: Firewall. Diagram labels: Presence Server, Echo Server, NAT, Client A, NAT, Client B. 127

STUN (A B). A, B: Restricted Cone NAT. Diagram labels: Presence Server, Echo Server, Probing, NAT, Client A, NAT, Client B. 128

Symmetric/Restricted Cone. A: sequential Symmetric NAT; B: Res Cone. Diagram labels (steps 1 to 3): Presence Server, Get Recent Port Mapping, Send Predicted Port Mapping to Peer, Echo Server, Probing, NAT, Client A, NAT, Client B. 129

Symmetric/Restricted Cone (continued). A: sequential Symmetric NAT; B: Res Cone. Diagram labels: Presence Server, Get Recent Port Mapping, Send Predicted Port Mapping to Peer, Echo Server, Multiple Tries, Probing, NAT, Client A, NAT, Client B. 130
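The "get recent port mapping, send predicted port mapping" step can be sketched as follows for a sequential symmetric NAT. The function name and the rule of reusing the last observed increment are assumptions for illustration; the slides do not give the prediction formula.

```python
def predict_ports(recent_ports, tries=3):
    """Guess the next external ports a sequential symmetric NAT will allocate.

    recent_ports: external ports recently observed by the echo servers,
    oldest first. The step is assumed to equal the last observed increment
    (defaulting to 1), and several candidates are returned so the peer can
    probe each of them ("multiple tries").
    """
    step = 1
    if len(recent_ports) >= 2:
        step = recent_ports[-1] - recent_ports[-2] or 1
    last = recent_ports[-1]
    return [last + step * i for i in range(1, tries + 1)]
```

With the mappings 36721, 36722, 36723 shown on the sequential symmetric NAT slide, the candidates would be 36724, 36725 and 36726. Other hosts behind the same NAT may consume ports in between, which is why more than one candidate is probed.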

Symmetric/Symmetric. A, B: both sequential Symmetric. Doable through an algorithm similar to the one above. If NAT A has k1 uncertain ports after port range prediction, and NAT B has k2 uncertain ports after port range prediction, the probability of successful traversal in a single pass is 1/(k1·k2). 131
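A short numeric illustration of the 1/(k1·k2) figure follows. The second function assumes that repeated passes are independent, which is a simplification made here and not a claim from the slides.

```python
def single_pass_success(k1, k2):
    """Probability that both port guesses are right in one pass."""
    return 1.0 / (k1 * k2)

def success_within(k1, k2, passes):
    """Probability of at least one success in `passes` attempts,
    assuming (for illustration only) that passes are independent."""
    p = single_pass_success(k1, k2)
    return 1.0 - (1.0 - p) ** passes
```

For instance, with k1 = 2 and k2 = 5 a single pass succeeds one time in ten, so several passes are needed before traversal becomes likely.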

TCP NAT Traversal. Simple implementation: just like the UDP NAT traversal scheme. Do bind(), listen() & bind(), connect() on the same port. Whichever socket successfully establishes a connection completes the traversal. Requires OS support for TCP simultaneous open (Windows XP SP2; each peer may launch a connection attempt separately). 132

TCP Failure Due to NAT Filtering. TCP failure case: the NAT responds with RST upon an unseen SYN. Reason: it causes the socket at the end to close prematurely. 133

TCP NAT Traversal. Variations to counter RST: low-TTL SYN; spoofed/RAW SYN-ACK. Both don't work well in deployment: not supported by the OS, not supported by ISP routers, security risk. 134

P2P DEPLOYMENT ISSUES 135

P2P Deployment Issues. P2P economy & incentive. Attacks in P2P networks. Proximity and heterogeneity. P2P monitoring and debugging aid. 136

P2P ECONOMY & INCENTIVE 137

P2P Economy & Incentive. P2P social behavior observation: people like to free ride (if given the choice), as shown in studies of Gnutella (70% of clients do not share) and Kazaa-Lite. However, people are OK with contributing if there is no choice (most people don't bother to hack) or if contribution improves performance. BitTorrent: sharing improves performance. Skype & PPLive: one class of nodes subsidizes another class. 138

Existing P2P Incentive. Reciprocal incentive: a fair exchange mechanism directly between two peers. Force sharing: Skype & PPLive. Micropayment system: needs extensive server infrastructure. 139

Reciprocal Incentive. Fair exchange mechanism directly between two peers. Can be used in open protocols. Can easily deter hacking & attacks. Most straightforward to implement. 140

Iterated Prisoner's Dilemma. Payoff matrix (row player's payoff, column player's payoff): Cooperate/Cooperate: R = d − u, R = d − u; Cooperate/Defect: S = −u, T = d; Defect/Cooperate: T = d, S = −u; Defect/Defect: P = 0, P = 0. Here d (> 0) is the utility of downloading a fragment, u (> 0) is the cost of uploading a fragment, and d > u. We have T > R > P > S and 2R > S + T > 2P. 141
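The two inequalities can be checked directly from d and u. The helper names below are hypothetical; the payoff definitions are the ones on the slide.

```python
def payoffs(d, u):
    """Return (T, R, P, S) for download utility d and upload cost u."""
    T = d        # defect while the other peer cooperates: download for free
    R = d - u    # both cooperate: download and upload
    P = 0        # both defect: nothing is exchanged
    S = -u       # cooperate while the other peer defects: upload for nothing
    return T, R, P, S

def is_prisoners_dilemma(d, u):
    """Check T > R > P > S and 2R > S + T > 2P."""
    T, R, P, S = payoffs(d, u)
    return T > R > P > S and 2 * R > S + T > 2 * P
```

Since S + T = d − u, the second chain reduces to 2(d − u) > d − u > 0, which holds exactly when d > u; with d ≤ u mutual cooperation is no better than mutual defection and the dilemma disappears.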

Axelrod's Tournament & TIT-FOR-TAT. Axelrod's tournaments in 1981 & 1984: 14 entries & 62 entries. Four principles for highly effective strategies: nice, retaliatory, forgiving, clear. TIT-FOR-TAT: cooperate on the first move, then copy the last move of the opponent. 142
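A minimal simulation of the TIT-FOR-TAT rule, using the download/upload payoffs from the previous slide. The function names and the string encoding of moves are illustrative assumptions.

```python
def tit_for_tat(opponent_moves):
    """Cooperate on the first move, then copy the opponent's last move."""
    return "C" if not opponent_moves else opponent_moves[-1]

def always_defect(opponent_moves):
    """A free rider: never uploads."""
    return "D"

def play(strategy_a, strategy_b, rounds, d, u):
    """Iterate the game and return the total payoff of each player."""
    moves_a, moves_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)   # each strategy sees the other's history
        b = strategy_b(moves_a)
        # A player gains d when the other uploads and pays u when it uploads.
        total_a += (d if b == "C" else 0) - (u if a == "C" else 0)
        total_b += (d if a == "C" else 0) - (u if b == "C" else 0)
        moves_a.append(a)
        moves_b.append(b)
    return total_a, total_b
```

With d = 3 and u = 1 over ten rounds, two TIT-FOR-TAT players earn d − u each round, while a permanent defector gains d once from a TIT-FOR-TAT player and nothing afterwards.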

Tit-For-Tat in BitTorrent. Rule: preferentially upload to the m neighbors that provide the best download rate. Surprisingly simple yet effective. Free riding leads to relatively poor performance and is thus deterred (although research shows that in networks with a large number of seed nodes, free riding may incur only a small penalty). However, it may not lead to the best utilization of peer resources. 143
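The upload rule above amounts to a top-m selection. The sketch below shows only that selection; the function name is hypothetical and other parts of the real client behavior (for example optimistic unchoking) are deliberately left out.

```python
def select_upload_targets(download_rates, m):
    """Pick the m neighbors that currently give us the best download rate.

    download_rates: {neighbor_id: measured download rate from that neighbor}.
    """
    ranked = sorted(download_rates, key=download_rates.get, reverse=True)
    return ranked[:m]
```

A neighbor that uploads nothing has a download rate of zero and therefore ranks last, which is how free riding ends up with relatively poor service.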

Force Sharing. Skype & PPLive. Pro: superior system performance for all users, as users with more resources subsidize users with fewer resources. Con: inherently unfair; relies on proprietary implementation; subject to hacking (Skype is hacked, leading to Skype Lite?); force sharing results in poor system performance (PPLive slows other apps to a crawl). 144

Micropayments. Basics: virtual money (may or may not be convertible to real currency); peers pay for resources; used in MMORPGs, Xbox 360. Problems: needs server support; subject to hacking; mental transaction cost argument (each price, no matter how small, carries a burden of decision); minors and those without a credit card may be deterred. 145

Attacks in P2P Network 146

P2P Threat Scenario. P2P vs. client-server: in client-server, the client only needs to trust the server; in P2P, all peers are servers, so the trust issue is severe. P2P networks must assume some nodes are malicious. P2P attack scenarios: DoS attack (including Sybil attack); pollution & poisoning attack; other attacks. 147

Denial of Service (DoS) Attack. DoS attack on the P2P application itself: “Berman's bill would give us [copyright owners] the right to launch denial-of-service attacks, known as ‘interdiction,’ that would deluge P2P file servers with false file requests to slow the system or bring it to a halt.” (Network World Fusion, 8/5/02). DoS attack towards a system that is not necessarily a participant in P2P: Naoumov (IWP2P, 2006) demonstrates a DDoS attack launched from Overnet, by poisoning the distributed index or by poisoning the routing table. 148

Sybil Attacks. Sybil is a well-known character of the 70s, a woman with multiple personality disorder, with 16 personalities. Sybil attack: a single faulty entity masquerades and presents multiple identities, and thus controls a substantial portion of the network. 149

Why Use Sybil Attacks in P2P? The tracker is a single weak point in P2P systems. Sybil attack: create massive numbers of false identities that bring down the tracker. Counter: make identities more expensive; have a trusted central agency certify identities; attach each identity to certain real-world information, to create accountability for each identity. 150

Pollution & Index Poisoning Attack. MediaSentry.com, OverPeer.com. Index poisoning attack. Content pollution attack. 151

Index Poisoning & Pollution Level (Liang, Infocom 2006): FastTrack network. 152

Index Poisoning & Pollution Level (Liang, Infocom 2006): Overnet. 153

Counter: Pollution Attack. BitTorrent: provide the block hash in the torrent file (counters pollution; a signature works as well); automatically contact peers (counters index poisoning). 154
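The block-hash defense can be sketched with the standard `hashlib` module. The helper names are hypothetical, and the list of digests stands in for the hashes carried in the torrent file; SHA-1 is used here because it is the per-piece hash of the original BitTorrent format.

```python
import hashlib

def block_hashes(data, block_size):
    """Publisher side: SHA-1 digest of every fixed-size block of the content."""
    return [
        hashlib.sha1(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]

def verify_block(index, block, expected_hashes):
    """Downloader side: accept a received block only if its digest matches
    the digest published for that block index."""
    return hashlib.sha1(block).digest() == expected_hashes[index]
```

A polluted block fails `verify_block` and can be discarded and requested again from another peer, so pollution costs the attacker bandwidth without corrupting the download.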

Pollution Attack in Network Coding. Problem: peer nodes may mix content, so the hash/signature calculated by the source may not be available for the newly mixed content. Solutions: request a new hash/signature from the source (the source has to be online; heavy computation burden); homomorphic hash (computationally expensive); Secure Random Checksums (SRCs) (the server needs to be online for each client to distribute SRCs); homomorphic signature (needs a large Galois field; computationally simple, but with large overhead). 155

Counter: Index Poisoning. Signing techniques to verify content authenticity. This helps with the piracy issue in P2P as well. P2P is closely associated with piracy today: the history is with Napster, Kazaa, Grokster, PirateBay, … Commercial P2P networks must make an effort to prevent illegal publishers, e.g., the Grokster case, http://www.eff.org/IP/P2P/MGM_v_Grokster 156

Other Attacks (1). To grab peer identity: man-in-the-middle, replay, password guessing attacks; counter: end-to-end encryption. Identity attack: tracking and harassing users. Debugging/reverse-engineering, e.g., reverse-engineering of Skype; counter: security by obfuscation. 157

Other Attacks (2). Insertion of viruses. Spyware in the P2P software. Spamming (sending junk). 158

Proximity and Heterogeneity 159

Peer Parameter Estimation. Peer selection based on: latency (RTT); bandwidth/throughput estimation (heterogeneity), e.g., link bandwidth (upload); ISP locality (proximity); packet loss; availability (outage). This section covers how to calculate heterogeneity (bandwidth) & proximity (ISP locality). Building heterogeneity- & proximity-aware P2P applications is still an ongoing research topic. 160

Bandwidth Estimation. Problem: what is my available bandwidth? TCP throughput: intrusive measurement, resource consuming, slow to get a result. Back-to-back packet pair: measures bottleneck link capacity, not suited to available bandwidth measurement. Various packet train approaches, e.g., Pathneck. 161

Available Bandwidth Measure: Definition. Consider an end-to-end path of n links: L1, L2, …, Ln. Their capacities are B1, B2, …, Bn. Their traffic loads are C1, C2, …, Cn. The bottleneck link is the one with Bb = min(B1, B2, …, Bn). The tight link is the one with Bt − Ct = min(B1 − C1, B2 − C2, …, Bn − Cn); Bt − Ct is the available bandwidth. 162
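The two definitions can be evaluated directly on a list of per-link capacities and loads. The function names and the example numbers are made up for illustration.

```python
def bottleneck_link(capacities):
    """Index and capacity of the link with the smallest capacity B_i."""
    b = min(range(len(capacities)), key=lambda i: capacities[i])
    return b, capacities[b]

def tight_link(capacities, loads):
    """Index and spare capacity of the link with the smallest B_i - C_i.
    The spare capacity of the tight link is the path's available bandwidth."""
    spare = [b - c for b, c in zip(capacities, loads)]
    t = min(range(len(spare)), key=lambda i: spare[i])
    return t, spare[t]

# Hypothetical three-link path, values in Mbps.
B = [100, 10, 50]
C = [95, 2, 10]
```

In this example the bottleneck link is the second link (10 Mbps), but the tight link is the heavily loaded first link with only 5 Mbps to spare, so the bottleneck and the tight link need not coincide.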

Basic Tool. A back-to-back packet pair estimates the bandwidth of the bottleneck link. A packet train estimates the bandwidth of the tightest link. 163
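For the packet-pair case, the usual estimate divides the packet size by the spacing (dispersion) of the two packets at the receiver. The sketch below assumes the arrival gap has already been measured and ignores cross traffic; the function name is hypothetical.

```python
def packet_pair_capacity(packet_size_bytes, arrival_gap_seconds):
    """Estimate bottleneck capacity in bits per second from one packet pair.

    Two packets sent back to back are spread out by the slowest link, so the
    gap at the receiver is roughly packet_size / bottleneck_capacity.
    """
    return packet_size_bytes * 8 / arrival_gap_seconds
```

For example, 1500-byte packets arriving 1.2 ms apart suggest a bottleneck of about 10 Mbps. In practice many pairs are sent and the estimates filtered, because cross traffic can stretch or compress individual gaps.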

Available Bandwidth Measurement. Figure: arriving gap versus sending gap for probing packets in the presence of background packets; the turning point of the curve marks the available bandwidth. 164

Pathneck (Hu, Sigcomm'04). Recursive Packet Train (RPT): 30 measurement packets (60 B, TTL 1 to 30), 60 load packets (500 B, TTL 255), 30 measurement packets (60 B, TTL 30 to 1). Load packets are used to measure available bandwidth. Measurement packets are used to obtain location information. 165

Transmission of RPT (Hu, Sigcomm'04). Figure: the packet train travels from source S through routers R1, R2, R3; packet TTL values decrease by one at each hop, and gaps g1, g2, g3 are measured at successive routers. Gap values are the raw measurement. 166

Choke Point Detection (Hu, Sigcomm'04). Figure: gap and available bandwidth (a_bw) versus hop number (0 to 8), with the choke points and the bottleneck point marked. 167

ISP Locality. P2P reality: P2P represents 60% of Internet traffic and is still growing; 92% of P2P traffic crosses transit/peering links; 80% of upstream capacity is consumed by P2P; P2P protocols will aggressively consume all available capacity. P2P & ISP: P2P affects QoS levels for ALL subscribers; as content providers use P2P to deliver content, their costs are being passed on to service providers; weak linkage between traffic generated/served to the end user and charging. Solution: try best to deliver content from nearby peers; convert cross-transit traffic to intra-ISP traffic. 168

Internet Today. Diagram labels: end users, gateway router, Verizon AS, MSN AS, core router, BGP router & peering point, Comcast AS, end users, gateway router. 169

Inside ISP 170

ISP POP (Point of Presence) 171

Home Networking 172

Internet Topology Discovery. Information used: peer external IP address; peer subnet mask. Topology discovery: BGP feed: external IP → AS. GeoLocation service (e.g., IP2Location): external IP → country, region, city, latitude, longitude, ISP domain. External IP + subnet mask: POP. External IP: home/corporation. 173
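One way to use "external IP + subnet mask" is to group peers that fall into the same network prefix, as a rough stand-in for sharing a POP. The sketch below uses Python's standard `ipaddress` module; the function name and the example addresses are illustrative, and treating a shared prefix as a shared POP is an assumption of this example.

```python
import ipaddress
from collections import defaultdict

def group_by_subnet(peers):
    """Group peers by the network derived from external IP and subnet mask.

    peers: iterable of (external_ip, subnet_mask) strings.
    Returns {network_in_CIDR_form: [external_ip, ...]}.
    """
    groups = defaultdict(list)
    for ip, mask in peers:
        # strict=False lets a host address stand in for its network.
        network = ipaddress.ip_network(f"{ip}/{mask}", strict=False)
        groups[str(network)].append(ip)
    return dict(groups)

example_peers = [
    ("4.18.133.70", "255.255.255.0"),
    ("4.18.133.99", "255.255.255.0"),
    ("131.107.224.9", "255.255.0.0"),
]
```

A peer-selection policy could then prefer neighbors from the same group before falling back to peers in the same AS or region.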

Hybrid CDN-P2P: Internet video. Figures: edge server load (Mbps) and data center load (Mbps), pure CDN versus CDN-P2P, by location, for the Akamai network and the Limelight network. 174

Hybrid CDN-P2P: software distribution 175

P2P Monitoring Tools and Debugging Aid 176

Debugging of P2P Applications. Problems: debugging is difficult, but debugging a P2P application is especially hard. Execution is non-deterministic: two executions may yield different results; bugs are not reproducible; bugs that occur only in large-scale runs may be difficult to debug. 177

Nondeterminism in P2P Applications. Nondeterminism: two executions may lead to different results. Race: two or more processes attempt to access the same resource and at least one of the processes is storing (writing) into the resource (read-write race, write-write race). Deadlock: two or more processes are waiting for events such that none of the events occurs. Solution: deterministic replay; distributed assert. 178

Deterministic Replay. Log into a trace all incoming network traffic, timing events, and thread switching events that affect the execution path of the P2P application. During debugging, have the capability to replay the activity of a set of peers from the trace. Tools available: Liblog & Friday (UC Berkeley), WiDS (MSRA). 179
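The record-then-replay idea can be shown on a toy peer. This is not the API of Liblog, Friday or WiDS; the `Recorder` and `Replayer` classes and `peer_logic` are hypothetical, and only two sources of nondeterminism (incoming messages and random draws) are captured.

```python
import random

class Recorder:
    """Live run: pass nondeterministic inputs through and log them to a trace."""
    def __init__(self):
        self.trace = []

    def recv(self, message):
        self.trace.append(("recv", message))
        return message

    def random(self):
        value = random.random()
        self.trace.append(("rand", value))
        return value

class Replayer:
    """Debug run: serve the logged inputs back in the original order."""
    def __init__(self, trace):
        self._events = iter(trace)

    def _next(self, kind):
        logged_kind, value = next(self._events)
        assert logged_kind == kind, "replay diverged from the recorded run"
        return value

    def recv(self, message=None):
        return self._next("recv")

    def random(self):
        return self._next("rand")

def peer_logic(env, incoming):
    """Toy peer whose decisions depend on messages and on random choices."""
    decisions = []
    for message in incoming:
        received = env.recv(message)
        decisions.append((received, env.random() < 0.5))
    return decisions
```

Running `peer_logic` once with a `Recorder` and again with a `Replayer` built from its trace yields the same decisions, which is what makes an otherwise unreproducible bug repeatable under a debugger.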

Distributed Assert. Collect status information (e.g., state, % of task finished, bandwidth usage, etc.) with timestamps in a distributed fashion. Send the information to a central reporting server. Create a snapshot of the whole system. Tools available: D3S: debugging deployed distributed systems (MSRA). 180
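The reporting-server side of a distributed assert can be sketched as below. This is not the D3S interface; the function names are hypothetical, and peer clocks are assumed to be synchronized well enough for the timestamps to be comparable.

```python
def snapshot(reports, at_time):
    """Latest reported status of every peer at or before `at_time`.

    reports: iterable of (peer_id, timestamp, status) collected centrally.
    """
    latest = {}
    for peer, timestamp, status in sorted(reports, key=lambda r: r[1]):
        if timestamp <= at_time:
            latest[peer] = status
    return latest

def distributed_assert(reports, at_time, predicate):
    """Evaluate a global predicate over the system snapshot at `at_time`."""
    return predicate(snapshot(reports, at_time))

reports = [
    ("a", 1.0, {"done": 10}),
    ("b", 2.0, {"done": 40}),
    ("a", 3.0, {"done": 60}),
]
```

A predicate such as "no peer reports more than 100% of the task finished" can then be checked against every snapshot, catching violations that no single peer could detect locally.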

Summary. P2P can add significant value for content owners: reduce the cost of delivery; scale up to meet customer demand without significant infrastructure investment. P2P can deliver better quality of service (QoS) to the end user: users have access to more resources (bandwidth, storage) on the network, leading to an improved QoS. But it needs to be done right! 181