  • Number of slides: 31

Automatically Inferring Patterns of Resource Consumption in Network Traffic
Cristian Estan, Stefan Savage, George Varghese
University of California, San Diego

Who is using my link?
Traffic Clusters - 2003

Looking at the traffic
  • Too much data for a human
  • Do something smarter!

Looking at traffic aggregates
[Diagram: packet header fields: Src. IP, Dest. IP, Src. net, Dest. net, Protocol, Src. port, Dest. port]
  • Aggregating on individual packet header fields gives useful results, but:
      - Traffic reports are not always at the right granularity (e.g. individual IP address, subnet, etc.)
      - Cannot show aggregates defined over multiple fields (e.g. which network uses which application)
  • The traffic analysis tool should automatically find aggregates over the right fields at the right granularity

Rank  Destination IP        Traffic
1     jeff.dorm.bigU.edu    11.9%
2     tracy.dorm.bigU.edu   3.12%
3     risc.cs.bigU.edu      2.83%

Rank  Source port  Traffic
1     Web          42.1%
2     Kazaa        6.7%
3     Ssh          6.3%

Rank  Destination network  Traffic
1     library.bigU.edu     27.5%
2     cs.bigU.edu          18.1%
3     dorm.bigU.edu        17.8%

Callouts: Where does the traffic come from? What apps are used? Which network uses web and which one kazaa? Most traffic goes to the dorms…

Ideal traffic report

Traffic aggregate                                            Traffic
Web traffic                                                  42.1%
Web traffic to library.bigU.edu                              26.7%
Web traffic from www.schwarzenegger.com                      13.4%
ICMP traffic from sloppynet.badU.edu to jeff.dorm.bigU.edu   11.9%

Callouts: Web is the dominant application. The library is a heavy user of web. That’s a big flash crowd! This is a Denial of Service attack!!
This paper is about giving the network administrator insightful traffic reports

Contributions of this paper
  • Approach
  • Definitions
  • Algorithms
  • System
  • Experience

Approach
  • Characterize traffic mix by describing all important traffic aggregates
      - Multidimensional aggregates (e.g. flash crowd described by protocol, port number and IP address)
      - Aggregates at the right level of granularity (e.g. computer, subnet, ISP)
      - Traffic analysis is automated: it finds insightful data without human guidance

Definition: traffic clusters
  • Traffic clusters are the multidimensional traffic aggregates identified by our reports
  • A cluster is defined by a range for each field
  • The ranges are from natural hierarchies (e.g. IP prefix hierarchy), which gives meaningful aggregates
  • Example
      - Traffic aggregate: incoming web traffic for CS Dept.
      - Traffic cluster: ( SrcIP=*, DestIP in 132.239.64.0/21, Proto=TCP, SrcPort=80, DestPort in [1024, 65535] )
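
As a rough illustration of the definition above, a cluster can be modelled as one range per header field. This is only a sketch under assumptions: the class name `Cluster`, its field names, and the sample packet addresses used to exercise it are hypothetical and are not taken from AutoFocus.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional, Tuple

ANY_NET = ipaddress.ip_network("0.0.0.0/0")   # stands for "*" on an IP field


@dataclass(frozen=True)
class Cluster:
    """One range per header field (hypothetical names, not AutoFocus's)."""
    src_net: ipaddress.IPv4Network
    dst_net: ipaddress.IPv4Network
    proto: Optional[str]            # None stands for "*"
    src_ports: Tuple[int, int]      # inclusive port range
    dst_ports: Tuple[int, int]      # inclusive port range

    def matches(self, src_ip, dst_ip, proto, src_port, dst_port):
        """Return True if a packet header falls inside every range."""
        return (
            ipaddress.ip_address(src_ip) in self.src_net
            and ipaddress.ip_address(dst_ip) in self.dst_net
            and (self.proto is None or self.proto == proto)
            and self.src_ports[0] <= src_port <= self.src_ports[1]
            and self.dst_ports[0] <= dst_port <= self.dst_ports[1]
        )


# The slide's example cluster: incoming web traffic for the CS Dept.
cs_web_in = Cluster(
    src_net=ANY_NET,
    dst_net=ipaddress.ip_network("132.239.64.0/21"),
    proto="TCP",
    src_ports=(80, 80),
    dst_ports=(1024, 65535),
)
```

A packet belongs to the cluster only if every field falls inside its range, so a call such as `cs_web_in.matches("198.51.100.7", "132.239.65.10", "TCP", 80, 40000)` checks all five fields at once.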

Definition: traffic report
  • Traffic reports give the volume of chosen traffic clusters
  • To keep report size manageable, describe only clusters above threshold (e.g. H = total traffic/20)
  • To avoid redundant data, compress by omitting clusters whose traffic can be inferred (up to error H) from non-overlapping, more specific clusters in the report
  • To highlight non-obvious aggregates, prioritize by using unexpectedness label
      - Example
          » 50% of all traffic is web
          » Prefix B receives 20% of all traffic
          » The web traffic received by prefix B is 15% instead of 50%*20%=10%, unexpectedness label is 15%/10%=150%
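
The threshold and the unexpectedness arithmetic from this slide can be sketched in a few lines. The function names are hypothetical, and the sketch covers only the two-field case shown in the example, where the expected share is the product of the per-field shares.

```python
def report_threshold(total_traffic, divisor=20):
    """H = total traffic / 20, as in the slide's example."""
    return total_traffic / divisor


def unexpectedness(actual_share, field_shares):
    """Ratio of the observed share to the share expected if the
    fields were independent (product of the per-field shares)."""
    expected = 1.0
    for share in field_shares:
        expected *= share
    return actual_share / expected


# Slide example: web is 50% of traffic, prefix B receives 20%,
# and web traffic to prefix B is 15% of all traffic.
label = unexpectedness(0.15, [0.50, 0.20])
print(f"unexpectedness label: {label:.0%}")
```

A label above 100% means the cluster carries more traffic than its more general clusters would suggest, and a label below 100% means it carries less.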

Contributions of this paper
  • Approach
  • Definitions
  • Algorithms
  • System
  • Experience

Algorithms and theory
  • Algorithms and theoretical bounds in the paper
      - Unidimensional reports are easy to compute
      - Multidimensional reports are exponentially harder as we add more fields
  • Next few slides
      - Example of unidimensional compression
      - Example for the structure of the multidimensional cluster space

Unidimensional report example
Threshold = 100
Hierarchy (prefix: traffic)

10.0.0.0/28: 500
  10.0.0.0/29: 120
    10.0.0.0/30: 50
      10.0.0.2/31: 50
        10.0.0.2: 15
        10.0.0.3: 35
    10.0.0.4/30: 70
      10.0.0.4/31: 70
        10.0.0.4: 30
        10.0.0.5: 40
  10.0.0.8/29: 380
    10.0.0.8/30: 305
      10.0.0.8/31: 270
        10.0.0.8: 160
        10.0.0.9: 110
      10.0.0.10/31: 35
        10.0.0.10: 35
    10.0.0.12/30: 75
      10.0.0.14/31: 75
        10.0.0.14: 75

Unidimensional report example
Compression
  • 10.0.0.8/29: 380 - 270 ≥ 100, so it stays in the report
  • 10.0.0.8/30: 305 - 270 < 100, so it is omitted

Source IP      Traffic
10.0.0.0/29    120
10.0.0.8/29    380
10.0.0.8       160
10.0.0.9       110
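
The compression step in this example can be reproduced with a short bottom-up pass over the prefix tree. This is a sketch of the rule as the slides state it (keep a cluster only if its traffic exceeds what the more specific reported clusters below it already explain by at least the threshold), not the authors' implementation. The function name `compressed_report` is hypothetical. The per-host volumes are those of the example hierarchy with threshold 100.

```python
import ipaddress
from collections import defaultdict


def compressed_report(host_traffic, root, threshold):
    """Bottom-up 1-D thresholding plus compression over the IP prefix tree."""
    root_net = ipaddress.ip_network(root)

    # Aggregate per-host volumes into every prefix between host and root.
    totals = defaultdict(int)
    for host, volume in host_traffic.items():
        net = ipaddress.ip_network(host)      # a bare address becomes a /32
        if not net.subnet_of(root_net):
            raise ValueError(f"{host} is outside {root}")
        while True:
            totals[net] += volume
            if net == root_net:
                break
            net = net.supernet()

    # explained[p]: traffic of p already accounted for by the most
    # specific reported clusters below p.
    explained = defaultdict(int)
    report = {}
    for net in sorted(totals, key=lambda n: n.prefixlen, reverse=True):
        if totals[net] - explained[net] >= threshold:
            report[str(net)] = totals[net]
            carried = totals[net]       # the parent can infer this whole cluster
        else:
            carried = explained[net]    # omitted: pass up what is already known
        if net != root_net:
            explained[net.supernet()] += carried
    return report


hosts = {
    "10.0.0.2": 15, "10.0.0.3": 35, "10.0.0.4": 30, "10.0.0.5": 40,
    "10.0.0.8": 160, "10.0.0.9": 110, "10.0.0.10": 35, "10.0.0.14": 75,
}
report = compressed_report(hosts, "10.0.0.0/28", threshold=100)
for prefix, volume in sorted(report.items()):
    print(prefix, volume)
```

Processing the most specific prefixes first means each node only needs the running total passed up by its children, so a single pass over the tree is enough in the one-dimensional case.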

Multidimensional structure example
  • Nodes (clusters) have multiple parents
  • Nodes (clusters) overlap
[Diagram: two hierarchies. Source net: All traffic > US (CA, NY), EU (GB, DE). Application: All traffic > Web, Mail. Example combined node: US Web]
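
To make the "multiple parents" point concrete, the sketch below builds the two small hierarchies from the diagram and lists the parents of a two-field cluster. The dictionaries and the function names `parents` and `ancestors` are hypothetical, and `"*"` stands for "All traffic".

```python
# Each hierarchy maps a node to its parent; "*" is the root ("All traffic").
SRC_PARENT = {"CA": "US", "NY": "US", "GB": "EU", "DE": "EU", "US": "*", "EU": "*"}
APP_PARENT = {"Web": "*", "Mail": "*"}


def parents(cluster):
    """Generalize exactly one field by one step in its hierarchy."""
    src, app = cluster
    found = []
    if src in SRC_PARENT:
        found.append((SRC_PARENT[src], app))
    if app in APP_PARENT:
        found.append((src, APP_PARENT[app]))
    return found


def ancestors(cluster):
    """All clusters that contain the given one, excluding itself."""
    seen = set()
    stack = [cluster]
    while stack:
        for parent in parents(stack.pop()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


print(parents(("US", "Web")))          # two parents: one per field
print(sorted(ancestors(("CA", "Web"))))
```

With one field the structure is a tree, but with two fields `("US", "*")` and `("*", "Web")` are distinct clusters that both contain `("US", "Web")`, which is why the simple bottom-up pass used for one dimension does not carry over directly.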

Contributions of this paper
  • Approach
  • Definitions
  • Algorithms
  • System
  • Experience

System: AutoFocus
[Diagram: components: Traffic parser, Cluster miner, Grapher, Web based GUI; inputs: Packet header trace, names, categories]

Contributions of this paper
  • Approach
  • Definitions
  • Algorithms
  • System
  • Experience

Structure of regular traffic mix
  • Backups from CAIDA to tape server
      - Semi-regular time pattern
  • FTP from SLAC Stanford
  • Scripps web traffic
  • Web & Squid servers
  • Large ssh traffic
  • Steady ICMP probing from CAIDA
[Figure label: SD-NAP]

Analysis of unusual events
  • UCSD to UCLA route change
  • Sapphire/SQL Slammer worm
[Figure label: Site 2]

Conclusions
  • Multidimensional traffic clusters using natural hierarchies describe traffic aggregates
  • Traffic reports using thresholding automatically identify conspicuous resource consumption at the right granularity
  • Compression produces compact traffic reports, and unexpectedness labels highlight non-obvious aggregates
  • Our prototype system, AutoFocus, provides insights into the structure of regular traffic and unexpected events

Thank you!
Alpha version of AutoFocus downloadable from http://ial.ucsd.edu/AutoFocus/
Any questions?
Acknowledgements: NIST, NSF, Vern Paxson, David Moore, Liliana Estan, Jennifer Rexford, Alex Snoeren, Geoff Voelker

Bounds and running times

Report type               Report size             Running time     Memory usage
unc. 1 dim. rep.          ≤ 1+(d-1)T/H            O(n+m(d-1))      O(m(d-1))
1 dim. report             ≤ T/H                   linear
1 dim. Δ report           ≤ T1/H + T2/H
unc. multi-dim. rep.      ≤ T/H ∏di/max(di)
multi-dim. Δ report

Other running-time and memory entries on the slide: linear, ≈ result*n, O(m+result), ≈ e^result

Open questions
  • Are there tighter bounds for the size of the reports?
  • Are there algorithms that produce smaller results?
  • Are there algorithms that compute traffic reports more efficiently? In streaming fashion?

Delta reports
  • Why repeat the same traffic report if the traffic doesn’t change from one day to the other?
  • Delta reports describe the clusters that increased or decreased by more than the threshold from one interval to the other
  • On related traffic mixes, delta reports are much smaller than traffic reports
  • Multidimensional compression is very hard for delta reports
      - We have only an exponential algorithm for the cluster delta
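
The basic idea of a delta report, before any compression, can be sketched as a comparison of per-cluster volumes from two intervals. The function name `delta_report`, the cluster names, and the volumes are made-up illustrations. The sketch leaves out the compression step that the slide describes as hard.

```python
def delta_report(before, after, threshold):
    """Clusters whose traffic rose or fell by more than the threshold
    between two measurement intervals (uncompressed)."""
    delta = {}
    for cluster in before.keys() | after.keys():
        change = after.get(cluster, 0) - before.get(cluster, 0)
        if abs(change) > threshold:
            delta[cluster] = change
    return delta


# Made-up per-cluster volumes for two consecutive intervals.
monday = {"web": 420, "ssh": 60, "kazaa": 70}
tuesday = {"web": 430, "ssh": 200, "icmp": 150}

changes = delta_report(monday, tuesday, threshold=50)
for cluster, change in sorted(changes.items()):
    print(f"{cluster}: {change:+d}")
```

Clusters that barely change, such as `web` in this made-up data, drop out of the report, which is why delta reports on related traffic mixes tend to be short.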

Greedy compression algorithm

Multidimensional report example
  • Thresholding
  • Compression

System details

Part      Language           LoC    Status
Backend   C++                5400   stable
GUI       HTML, Javascript   1000   functional
Glue      perl               350    evolving