8187f8d6cb082aa3b7d14897f2991b4a.ppt
- Number of slides: 72
Towards High Performance Network Defense
Zhichun Li, EECS Department, Northwestern University
Motivation
• Professional attackers exploit networks for profit ($$$)
• Figure labels: attackers, botnets, worms
Network Level Defense
• Network gateways/routers are the vantage points for detecting large-scale attacks
• Host-based detection/prevention alone is not enough
  – Some users do not apply host-based schemes because of reliability, overhead, and conflict concerns
  – Many users do not update or patch their systems on time
  – E.g., the Conficker worm infected 9 to 15 million hosts at the end of 2008
  – Cannot rely only on end users for security protection
Challenges
• Scalable to high-speed networks with a large number of users
• Highly accurate
• Adapts fast to emerging threats
• Has good attack coverage
Network-based Intrusion Detection, Prevention, and Forensics System
• Framework (figure): packet streams feed four components
  – (I) Sketch-based monitoring & detection
  – (II) Polymorphic worm signature generation
  – (III) Signature matching engines
  – (IV) Network situational awareness
• Goals labeled in the figure: scalability; accuracy & scalability & coverage; accuracy & adapt fast (twice)
High-speed Network Monitoring and Anomaly Detection
• Online traffic monitoring and recording [SIGCOMM IMC 2004, INFOCOM 2006, ToN 2007] [INFOCOM 2008]
  – Reversible sketch for data streaming computation
  – Record millions of flows (GB of traffic) in a few hundred KB
  – Small number of memory accesses per packet
  – Scalable to large key space sizes ($2^{32}$ or $2^{64}$)
• Online sketch-based flow-level anomaly detection [IEEE ICDCS 2006] [Journal of Computer Networks 2010] [IEEE CG&A, Security Visualization 2006]
• Online stealthy botnet scan detection [IEEE IWQoS 2007]
• Figure: a sketch made of H hash tables with K buckets each (indexed 0 to K-1); a key k is hashed by $h_1(k), \dots, h_j(k), \dots, h_H(k)$
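The figure on this slide (H hash functions, K buckets per table) can be illustrated with a plain count-min style sketch. This is only a minimal sketch of the general idea and not the reversible sketch from the cited papers: the class name, the SHA-256 based hash functions, and the flow key format are assumptions made for illustration.

```python
import hashlib


class CountMinSketch:
    """H hash tables with K counters each; every key maps to one counter per table."""

    def __init__(self, num_tables: int = 4, num_buckets: int = 1024) -> None:
        self.num_tables = num_tables      # H in the slide's figure
        self.num_buckets = num_buckets    # K in the slide's figure
        self.tables = [[0] * num_buckets for _ in range(num_tables)]

    def _bucket(self, j: int, key: str) -> int:
        # Assumption: derive the j-th hash function h_j(key) by salting SHA-256 with j.
        digest = hashlib.sha256(f"{j}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.num_buckets

    def update(self, key: str, value: int = 1) -> None:
        # One counter update per table for each packet.
        for j in range(self.num_tables):
            self.tables[j][self._bucket(j, key)] += value

    def estimate(self, key: str) -> int:
        # Collisions can only inflate a counter, so the minimum is the tightest estimate.
        return min(
            self.tables[j][self._bucket(j, key)] for j in range(self.num_tables)
        )


sketch = CountMinSketch()
for _ in range(100):
    # Hypothetical flow key; the value is the packet size in bytes.
    sketch.update("10.0.0.1->10.0.0.2:80", value=1500)
print(sketch.estimate("10.0.0.1->10.0.0.2:80"))
```

Memory stays fixed at H x K counters no matter how many flows are recorded, which is the property the slide relies on. A plain sketch like this cannot recover the heavy keys from the counters; that is what the reversible variant adds.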
Network and Distributed System Diagnosis
• Overlay network monitoring and diagnosis [SIGCOMM IMC 2003, SIGCOMM 2004, ToN 2007] [SIGCOMM 2006]
• End-user network diagnosis [INFOCOM 2007 (2)]
• Internet-scale Virtual Private Network (VPN) and backbone monitoring and diagnosis [INFOCOM 2009]
• Internet-scale data center and distributed system profiling and diagnosis [NSDI 2010]
Polymorphic Worm Signature Generation
• Exploit-invariant signature generation [IEEE Symposium on Security and Privacy 2006] (cited by ~100; code and test cases released to Columbia U., UT Austin, Purdue, Georgia Tech, UC Davis, etc.)
• Vulnerability signature generation [IEEE ICNP 2007, ToN 2010] [NSF CyberTrust 06 Award]
• Figure: Internet traffic (bit strings 1010101, 10111101, 11111100, 00010111) passing through the network gateway into our network
Online Protocol Parsing and Signature Matching
• NetShield: vulnerability-signature-based NIDS/NIPS [NSF CyberTrust 08 Award] [under submission] [patent filed]
  – Drew interest from Cisco (IPS ruleset & site visit)
  – Code release has been used by researchers at the University of Toronto
• Using failure information to detect enterprise zombies [SecureComm 09]
• Spamming botnet detection [NSDI 09]
Network Situational Awareness
• Situation-aware forensics for large-scale botnet and P2P misconfiguration events
  – Botnet attack target/strategy inference [ASIACCS 09]
  – Root cause analysis of P2P misconfiguration/poisoning traffic [INFOCOM 10]
• Analysis of 2 TB of data across 4 years over five /8 IP blocks
Current Work
• Data center management and configuration
• Internet emergency response
  – AS topology study [CoNEXT 09]
  – Recovery via IXP [INFOCOM 10]
• Network-based web dynamic vulnerability defense
• Social network security
NetShield: Matching a Large Vulnerability Signature Ruleset for High Performance Network Defense
Outline
• Motivation
• High Speed Matching for Large Rulesets
• High Speed Parsing
• Evaluation
• Research Contributions
NetShield Overview
NIDS/NIPS (Network Intrusion Detection/Prevention System) operation (figure): packets and a signature DB feed the NIDS/NIPS, which outputs security alerts
• Accuracy
• Speed
• Attack coverage
State Of The Art
Regular expression (regex) based approaches
Used by: Cisco IPS, Juniper IPS, open source Bro
Example: .*Abc.*\x90+de[^\r\n]{30}
Pros
• Can efficiently match multiple sigs simultaneously, through DFA
• Can describe the syntactic context
Cons of Regex
Limited expressive power: cannot describe semantic context, and is thus inaccurate
• Theoretical perspective (figure comparing regex with context-free and context-sensitive languages, and where protocol grammars fall)
• Practical perspective
  – HTTP chunked encoding
  – DNS label pointers
State Of The Art
Vulnerability Signature [Wang et al. 04]
Vulnerability: design flaws enable bad inputs to lead the program to a bad state
Figure: a bad input drives the program from a good state to a bad state; the vulnerability signature describes such inputs
Blaster Worm (WINRPC) Example:
BIND: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00 && context[0].abstract_syntax.uuid==UUID_RemoteActivation
BIND-ACK: rpc_vers==5 && rpc_vers_minor==1
CALL: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00 && opnum==0x00 && stub.RemoteActivationBody.actual_length>=40 && matchRE(stub.buffer, /^\x5c\x00/)
Pros
• Directly describes semantic context
• Very expressive: can express the vulnerability condition exactly
• Accurate
Cons
• Slow!
• Existing approaches all use sequential matching
• Requires protocol parsing
Motivation of NetShield
Motivation
• Desired features for signature-based NIDS/NIPS
  – Accuracy (especially for IPS)
  – Speed
  – Coverage: large ruleset

             Regular Expression   Vulnerability Signature
  Accuracy   Relatively poor      Much better
  Speed      Good                 ??
  Memory     OK                   ??
  Coverage   Good                 ??

Callouts: "Cannot capture vulnerability condition well!" (regular expressions); Shield [SIGCOMM'04] (vulnerability signatures); "Focus of this work" (the ?? cells)
Research Challenges and Solutions
• Challenges
  – Matching thousands of vulnerability signatures simultaneously
    • Sequential matching → matching multiple sigs simultaneously
  – High-speed protocol parsing
• Solutions
  – An efficient algorithm that matches multiple sigs simultaneously
  – A tailored parsing design for high-speed signature matching
Background
• Vulnerability signature basics
  – Use protocol semantics to express vulnerabilities
  – Defined on a sequence of PDUs, with one predicate for each PDU
  – Example: ver==1 && method=="put" && len(buf)>300
• Data representations
  – For all the vulnerability signatures we studied, we only need numbers and strings
  – Number operators: ==, >, <, >=, <=
  – String operators: ==, match_re(., .), len(.)
Blaster Worm (WINRPC) Example:
BIND: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00 && context[0].abstract_syntax.uuid==UUID_RemoteActivation
BIND-ACK: rpc_vers==5 && rpc_vers_minor==1
CALL: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00 && opnum==0x00 && stub.RemoteActivationBody.actual_length>=40 && matchRE(stub.buffer, /^\x5c\x00/)
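A single-PDU predicate such as the Background example (ver==1 && method=="put" && len(buf)>300) can be evaluated with a few lines of code. The following is a minimal sketch, assuming the PDU has already been parsed into a dictionary of fields; the tuple encoding of conditions and the function names are hypothetical and not NetShield's signature language.

```python
import operator
import re

# Number operators from the slide; "==" also serves as the string equality operator.
OPS = {"==": operator.eq, ">": operator.gt, "<": operator.lt,
       ">=": operator.ge, "<=": operator.le}


def field_value(pdu: dict, expr: str):
    # "len(buf)" means the length of field "buf"; anything else is a plain field name.
    if expr.startswith("len(") and expr.endswith(")"):
        return len(pdu[expr[4:-1]])
    return pdu[expr]


def matches(pdu: dict, predicate: list) -> bool:
    """Return True if every (expression, operator, value) condition holds for this PDU."""
    for expr, op, value in predicate:
        actual = field_value(pdu, expr)
        if op == "match_re":
            ok = re.search(value, actual) is not None
        else:
            ok = OPS[op](actual, value)
        if not ok:
            return False
    return True


# The slide's example predicate: ver==1 && method=="put" && len(buf)>300
signature = [("ver", "==", 1), ("method", "==", "put"), ("len(buf)", ">", 300)]
print(matches({"ver": 1, "method": "put", "buf": "A" * 301}, signature))
```

Checking one signature at a time like this is the sequential matching that the talk identifies as slow; the candidate selection algorithm replaces the per-signature loop.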
Matching Problem Formulation
• Suppose we have n signatures, defined on k matching dimensions (matchers)
  – A matcher is a two-tuple (field, operation), or a four-tuple for associative array elements
  – Translate the n signatures into an n by k table
  – This translation unlocks the potential of matching multiple signatures simultaneously
• Example, Rule 4: URI.Filename=="fp40reg.dll" && len(Headers["host"])>300

  RuleID  Method ==  Filename ==   Header == LEN
  1       DELETE     *             *
  2       POST       Header.php    *
  3       *          awstats.pl    *
  4       *          fp40reg.dll   name=="host"; len(value)>300
  5       *          *             name=="User-Agent"; len(value)>544
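The translation from signatures to an n by k table can be sketched as follows. This is an illustrative encoding only: the rule dictionary, the "len>" operation name, and the (name, threshold) pair used for the associative-array matcher are assumptions, chosen so that the output mirrors the five-rule table on the slide.

```python
WILDCARD = "*"  # "don't care" entry

# Each rule is a conjunction of (field, operation, value) conditions.
# For the associative array "Header", the value is a (name, length threshold) pair.
rules = {
    1: [("Method", "==", "DELETE")],
    2: [("Method", "==", "POST"), ("Filename", "==", "Header.php")],
    3: [("Filename", "==", "awstats.pl")],
    4: [("Filename", "==", "fp40reg.dll"), ("Header", "len>", ("host", 300))],
    5: [("Header", "len>", ("User-Agent", 544))],
}


def build_table(rules: dict):
    """Collect the k distinct matchers and lay the n rules out as an n by k table."""
    matchers = []
    for conditions in rules.values():
        for field, op, _ in conditions:
            if (field, op) not in matchers:
                matchers.append((field, op))
    table = {}
    for rule_id, conditions in rules.items():
        required = {(field, op): value for field, op, value in conditions}
        table[rule_id] = [required.get(m, WILDCARD) for m in matchers]
    return matchers, table


matchers, table = build_table(rules)
for rule_id, row in table.items():
    print(rule_id, row)
```

Each column can then be handed to one specialised matcher, so a PDU field is examined once for all rules rather than once per rule.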
Matching Problem Formulation
• Challenges for the Single PDU Matching problem (SPM)
  – Large number of signatures n
  – Large number of matchers k
  – Large number of "don't cares"
  – Cannot reorder matchers arbitrarily (buffering constraint)
  – Field dependency
    • Arrays, associative arrays
    • Mutually exclusive fields
Difficulty of the SPM
• Bad news
  – A well-known computational geometry problem can be reduced to this problem
  – That problem has a bad worst-case bound: $O((\log N)^{K-1})$ time or $O(N^K)$ space (worst-case ruleset)
• Good news
  – Measurement study on the Snort and Cisco rulesets
  – Real-world rulesets are good: the matchers are selective
  – With our design: $O(K)$
Matching Algorithms
Candidate Selection Algorithm
1. Pre-computation: decides the rule order and matcher order
2. Decomposition: match each matcher separately and iteratively combine the results efficiently
Per-matcher algorithms
• Integer range checking → balanced binary search tree
• String exact matching → trie
• Regex → DFA (XFA)
Step 1: Pre-Computation
• Optimize the matcher order based on the buffering constraint & field arrival order
• Rule reorder (figure): rules 1 to n grouped by whether they require or don't care about matcher 1 and matcher 2
Step 2: Iterative Matching
PDU = {Method=POST, Filename=fp40reg.dll, Header: name="host", len(value)=450}
• $S_1 = \{2\}$: candidates after matching column 1 (Method ==)
• $S_2 = (S_1 \cap A_2) + B_2 = (\{2\} \cap \{\}) + \{4\} = \{4\}$
• $S_3 = (S_2 \cap A_3) + B_3 = (\{4\} \cap \{4\}) + \{\} = \{4\}$
• Merge (figure): a candidate in $S_i$ either doesn't care about matcher i+1, or requires matcher i+1 and must be in $A_{i+1}$

  RuleID  Method ==  Filename ==   Header == LEN
  1       DELETE     *             *
  2       POST       Header.php    *
  3       *          awstats.pl    *
  4       *          fp40reg.dll   name=="host"; len(value)>300
  5       *          *             name=="User-Agent"; len(value)>544
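The iteration on this slide can be reproduced with a small sketch. It encodes one reading of the merge step, which is an assumption: A holds the rules that match the current matcher and also require some earlier matcher, B holds the rules that match the current matcher and were "don't care" on every earlier one, and a candidate survives if it does not care about the current matcher or if it is in A. The per-column matchers here are simple loops rather than the trie, search tree, or DFA named earlier.

```python
# rule id -> (method, filename, (header name, length that must be exceeded)); None = don't care
RULES = {
    1: ("DELETE", None, None),
    2: ("POST", "Header.php", None),
    3: (None, "awstats.pl", None),
    4: (None, "fp40reg.dll", ("host", 300)),
    5: (None, None, ("User-Agent", 544)),
}


def matched(col: int, pdu: dict) -> set:
    """Rules whose requirement in column `col` is present and satisfied by the PDU."""
    hits = set()
    for rule_id, row in RULES.items():
        req = row[col]
        if req is None:
            continue
        if col == 0 and pdu["method"] == req:
            hits.add(rule_id)
        elif col == 1 and pdu["filename"] == req:
            hits.add(rule_id)
        elif col == 2 and len(pdu["headers"].get(req[0], "")) > req[1]:
            hits.add(rule_id)
    return hits


def candidate_selection(pdu: dict) -> list:
    """Return the candidate sets S_1, S_2, S_3 after each matcher column."""
    candidates, trace = set(), []
    for col in range(3):
        hits = matched(col, pdu)
        # A: hits that also require an earlier matcher; B: hits that are new candidates.
        a = {r for r in hits if any(RULES[r][j] is not None for j in range(col))}
        b = hits - a
        # Keep a candidate if it does not care about this matcher or if it is in A.
        candidates = {r for r in candidates if RULES[r][col] is None or r in a} | b
        trace.append(set(candidates))
    return trace


pdu = {"method": "POST", "filename": "fp40reg.dll", "headers": {"host": "x" * 450}}
print(candidate_selection(pdu))
```

For the slide's PDU, the trace is {2}, then {4}, then {4}: rule 2 is dropped when the filename fails to match, and rule 4 enters through B at the second column.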
Complexity Analysis
• Merging complexity
  – Need k-1 merging iterations
  – For each iteration
    • Merge complexity is O(n) in the worst case, since $S_i$ can have O(n) candidates for worst-case rulesets
    • For real-world rulesets, the number of candidates is a small constant, so each merge is O(1)
  – For real-world rulesets: O(k), which is the best we can get
• Measured candidate set sizes
  – Three HTTP traces: avg(|S_i|) < 0.04
  – Two WINRPC traces: avg(|S_i|) < 1.5
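The O(k) figure follows from summing the per-iteration merge cost. A short derivation, assuming each merge costs time proportional to $|S_i|$ (the backup slide on matcher order notes that membership in $A_{i+1}$ is an O(1) hash-table lookup) and ignoring the cost of adding $B_{i+1}$; the constants $c$ and $s$ are introduced here for illustration:

```latex
T_{\text{merge}} \;=\; \sum_{i=1}^{k-1} c\,|S_i|
\;\le\; c\,(k-1)\,n \;=\; O(nk) \quad \text{(worst case, } |S_i| \le n\text{)}

|S_i| \le s \text{ for a constant } s
\;\Longrightarrow\;
T_{\text{merge}} \;\le\; c\,s\,(k-1) \;=\; O(k)
```

With the measured averages on this slide (below 0.04 for HTTP and below 1.5 for WINRPC), $s$ is effectively a small constant, which is why the real-world cost is linear in the number of matchers.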
Refinement and Extension
• SPM improvement
  – Allow negative conditions
  – Handle array cases
  – Handle associative array cases
  – Handle mutually exclusive cases
• Extend to Multiple PDU Matching (MPM)
  – Allow checkpoints
High Speed Parsing
General purpose vs. special purpose
• Keep the whole parse tree in memory vs. parsing and matching on the fly
• Parse all the nodes in the tree vs. only signature-related fields (leaf nodes)
Approach
• Design a parsing state machine
• Build an automated parsing state machine generator
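The contrast between building a full parse tree and extracting only signature-related fields on the fly can be illustrated with a hand-written state machine for an HTTP request. This is a simplified sketch and not the output of the automated generator mentioned in the talk: the state names, the function name, and the choice of fields (method, file name, lengths of selected headers) are assumptions for illustration.

```python
def parse_http_request(data: bytes, wanted=("host",)) -> dict:
    """Scan the request once, keeping only the fields the signatures need."""
    out = {"method": "", "filename": "", "header_len": {}}
    state, token, name, value_len = "METHOD", [], "", 0
    for byte in data:
        c = chr(byte)
        if state == "METHOD":
            if c == " ":
                out["method"], token, state = "".join(token), [], "URI"
            else:
                token.append(c)
        elif state == "URI":
            if c in " ?":
                out["filename"], token = "".join(token), []
                state = "VERSION" if c == " " else "QUERY"
            elif c == "/":
                token = []  # keep only the last path segment
            else:
                token.append(c)
        elif state == "QUERY":
            if c == " ":
                state = "VERSION"
        elif state == "VERSION":
            if c == "\n":
                state = "HEADER_NAME"
        elif state == "HEADER_NAME":
            if c == ":":
                name, token, value_len = "".join(token).lower(), [], 0
                state = "HEADER_VALUE"
            elif c == "\n":
                break  # blank line: end of headers
            elif c != "\r":
                token.append(c)
        elif state == "HEADER_VALUE":
            if c == "\n":
                if name in wanted:
                    out["header_len"][name] = value_len  # store the length, not the value
                state = "HEADER_NAME"
            elif c != "\r" and not (value_len == 0 and c == " "):
                value_len += 1
    return out


request = b"POST /cgi/fp40reg.dll?x=1 HTTP/1.1\r\nHost: " + b"a" * 450 + b"\r\n\r\n"
print(parse_http_request(request))
```

Only a counter is kept for header values and nothing is stored for headers the signatures never reference, which is the kind of saving that keeps per-connection parsing state small.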
Evaluation Methodology
• Fully implemented prototype
  – 12,000 lines of C++ and 3,000 lines of Python
  – Released at: www.nshield.org
  – Deployed at a university DC with up to 106 Mbps
• 26 GB+ of traces from Tsinghua Univ. (TH), Northwestern (NU), and DARPA
• Run on a P4 3.8 GHz single-core PC with 4 GB memory
• Measured after TCP reassembly, with the PDUs preloaded in memory
• For HTTP we have 794 vulnerability signatures, which cover 973 Snort rules
• For WINRPC we have 45 vulnerability signatures, which cover 3,519 Snort rules
Parsing Results

  Trace                                TH DNS  TH WINRPC  NU WINRPC  TH HTTP  NU HTTP  DARPA HTTP
  Avg flow len (B)                     77      879        596        6.6K     55K      2.1K
  Throughput (Gbps), Binpac            0.31    1.41       1.11       2.10     14.2     1.69
  Throughput (Gbps), our parser        3.43    16.2       12.9       7.46     44.4     6.67
  Speed-up ratio                       11.2    11.5       11.6       3.6      3.1      3.9
  Max. memory per connection (bytes)   15      15         15         14       14       14
Matching Results

  Trace                                TH WINRPC  NU WINRPC  TH HTTP  NU HTTP  DARPA HTTP
  Avg flow length (B)                  879        596        6.6K     55K      2.1K
  Throughput (Gbps), sequential        10.68      9.23       0.34     2.37     0.28
  Throughput (Gbps), CS matching       14.37      10.61      2.63     17.63    1.85
  Matching-only time speed-up ratio    4          1.8        11.3     11.7     8.8
  Avg # of candidates                  1.16       1.48       0.033    0.038    0.0023
  Max. memory per connection (bytes)   27         27         20       20       20

Callout on the slide: 8-core, 11.0
Scalability and Accuracy Results
• Rule scaling results (figure): performance decreases gracefully
• Accuracy
  – Created two polymorphic WINRPC exploits that bypass the original Snort rules but are detected accurately by our scheme
  – For a 10-minute "clean" HTTP trace, Snort reported 42 alerts and NetShield reported 0 alerts; we manually verified that the 42 alerts are false positives
Research Contribution
Make vulnerability signatures a practical solution for NIDS/NIPS

            Regular Expression  Existing Vul. IDS  NetShield
  Accuracy  Poor                Good               Good
  Speed     Good                Poor               Good
  Memory    Good                ??                 Good
  Coverage  Good                ??                 Good

• Multiple sig. matching → candidate selection algorithm
• Parsing → parsing state machine
Build a better Snort alternative!
Future work
• Figure labels: client, server, network security
• Data center security
• Web/Web security
  – WebProphet [NSDI 10]
  – WebShield
• Social network security
Q&A
Thanks!
Observations
• PDU parse tree (figure labels: PDU, array)
• Leaf nodes are numbers or strings
General purpose vs. special purpose
• Keep the whole parse tree in memory vs. parsing and matching on the fly
• Parse all the nodes in the tree vs. only signature-related fields (leaf nodes)
Efficient Parsing with State Machines
• Studied eight protocols (HTTP, FTP, SMTP, eMule, BitTorrent, WINRPC, SNMP, and DNS) as well as their vulnerability signatures
• Common relationships among leaf nodes
• Pre-construct parsing state machines based on parse trees and vulnerability signatures
• Automated parsing state machine generator: UltraPAC
Example for WINRPC
• Rectangles are states
• Parsing variables: R0..R4
• 0.61 instructions/byte for the BIND PDU
Experiences
• Work in progress
  – In collaboration with MSR, apply semantically rich analysis to cloud Web service profiling, to understand why services are slow and how to improve them
• Interdisciplinary research
• Student mentoring (three undergraduates, six junior graduate students)
Future Work
• Near term
  – Web security (browser security, web server security)
  – Data center security
  – High-speed network intrusion prevention system with hardware support
• Long-term research interests
  – Combating professional profit-driven attackers will be a continuous arms race
  – Online applications (including Web 2.0 applications) are becoming more complex and vulnerable
  – Network speed keeps increasing, which demands highly scalable approaches
Research Contributions
• Demonstrate that vulnerability signatures can be applied to NIDS/NIPS, which can significantly improve the accuracy of current NIDS/NIPS
• Propose the candidate selection algorithm for matching a large number of vulnerability signatures efficiently
• Propose a parsing state machine for fast protocol parsing
• Implement NetShield
Comparing With Regex
• Memory for 973 Snort rules: DFA 5.29 GB (XFA, 863 rules: 1.08 MB), NetShield 2.3 MB
• Per-flow memory: XFA 36 bytes, NetShield 20 bytes
• Throughput: XFA 756 Mbps, NetShield 1.9+ Gbps
(XFA: [SIGCOMM 08] [Oakland 08])
Measure Snort Rules
• Semi-manually classify the rules
  1. Group by CVE-ID
  2. Manually look at each vulnerability
• Results
  – 86.7% of rules can be improved by protocol-semantic vulnerability signatures
  – Most of the remaining rules (9.9%) are related to web DHTML and scripts, which are not suitable for a signature-based approach
  – On average, 4.5 Snort rules are reduced to one vulnerability signature
  – For binary protocols the reduction ratio is much higher than for text-based ones
    • For netbios.rules the ratio is 67.6
Matcher order
• Figure labels: reduce $S_{i+1}$; enlarge $S_{i+1}$
• Merging overhead: $|S_i|$ (use a hash table to test membership in $A_{i+1}$ in O(1))
• With the rest fixed, putting the matcher later reduces $B_{i+1}$
Matcher order optimization
• Worth buffering only if estmaxB(Mj) <= MaxB
• For Mi in AllMatchers
  – Try to clear all the Mj in the buffer for which estmaxB(Mj) <= MaxB
  – Buffer Mi if estmaxB(Mi) > MaxB
  – When len(Buf) > Buflen, remove the Mj with minimum estmaxB(Mj)
Backup Slides
Motivation
• Network security has been recognized as the single most important attribute of their networks, according to a survey of 395 senior executives conducted by AT&T
• Many newly emerging threats make the situation even worse
Candidate merge operation
• Figure: each candidate in $S_i$ either doesn't care about matcher i+1, or requires matcher i+1 and must be in $A_{i+1}$
A Vulnerability Signature Example
• Data representations
  – For all the vulnerability signatures we studied, we only need numbers and strings
  – Number operators: ==, >, <, >=, <=
  – String operators: ==, match_re(., .), len(.)
• Example signature for the Blaster worm
BIND: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00 && context[0].abstract_syntax.uuid==UUID_RemoteActivation
BIND-ACK: rpc_vers==5 && rpc_vers_minor==1
CALL: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00 && stub.RemoteActivationBody.actual_length>=40 && matchRE(stub.buffer, /^\x5c\x00/)
System Framework
• Figure labels: scalability; accuracy & scalability & coverage; accuracy & adapt fast (twice)
Example of Vulnerability Signatures
• At least 75% of vulnerabilities are due to buffer overflow
• Sample vulnerability signature
  – Length of the field corresponding to the vulnerable buffer > a certain threshold
  – Intrinsic to the buffer overflow vulnerability and hard to evade
• Figure: a protocol message field overflowing the vulnerable buffer
Old Slides 58
Conclusions
• A novel network-based vulnerability signature matching engine
  – Through a measurement study on the Snort ruleset, show that vulnerability signatures can improve most of the signatures in NIDS/IPS
  – Propose a parsing state machine for fast parsing
  – Propose a candidate selection algorithm for matching a large number of vulnerability signatures simultaneously
Outline
• Motivation
• Feasibility Study: a measurement approach
• Problem Statement
• High Speed Parsing
• High Speed Matching for massive vulnerability signatures
• Evaluation
• Conclusions
Limitations of Regular Expression Signatures
• Figure: traffic filtering with the signature 10.*01 between the Internet and our network; sample traffic 1010101, 10111101, 11111100, 00010111, with X marks on the filtered traffic
• Polymorphism! A polymorphic attack (worm/botnet) might not have an exact regular-expression-based signature
What do we do?
• Build a NIDS/NIPS with much better accuracy than, and similar speed to, regular-expression-based approaches
  – Feasibility: of the Snort ruleset (6,735 signatures), 86.7% can be improved by vulnerability signatures
  – High-speed parsing: 2.7 to 12 Gbps
  – High-speed matching
    • Efficient algorithm for matching massive vulnerability rules
    • HTTP, 791 vulnerability signatures at ~1 Gbps
Problem Formulation
• Parsing problem formulation
  – Given a PDU and the protocol specification as input, output the set of fields that are required by matching
Publications
• Zhichun Li, Lanjia Wang, Yan Chen, and Zhi (Judy) Fu, Network-based and Attack-resilient Length Signature Generation for Zero-day Polymorphic Worms, in Proc. of IEEE ICNP, 2007.
• Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parsons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reversible Sketches: Enabling Monitoring and Analysis over High-speed Data Streams, in IEEE/ACM Transactions on Networking, Volume 15, Issue 5, Oct. 2007.
• Zhichun Li, Manan Sanghi, Brian Chavez, Yan Chen, and Ming-Yang Kao, Hamsa: Fast Signature Generation for Zero-day Polymorphic Worms with Provable Attack Resilience, in Proc. of IEEE Symposium on Security and Privacy, 2006.
• Zhichun Li, Yan Chen, and Aaron Beach, Towards Scalable and Robust Distributed Intrusion Alert Fusion with Good Load Balancing, in Proc. of ACM SIGCOMM LSAD, 2006.
• Yan Gao, Zhichun Li, and Yan Chen, A DoS Resilient Flow-level Intrusion Detection Approach for High-speed Networks, in Proc. of IEEE ICDCS, 2006.
• Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parsons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reverse Hashing for High-speed Network Monitoring: Algorithms, Evaluations, and Applications, in Proc. of IEEE INFOCOM, 2006.
Current Status
• Part I: Sketch-based monitoring & detection
  – Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parsons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reversible Sketches: Enabling Monitoring and Analysis over High-speed Data Streams, in IEEE/ACM Transactions on Networking, Volume 15, Issue 5, Oct. 2007.
  – Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parsons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reverse Hashing for High-speed Network Monitoring: Algorithms, Evaluations, and Applications, in Proc. of IEEE INFOCOM, 2006 (252/1400=18%).
  – Yan Gao, Zhichun Li, and Yan Chen, A DoS Resilient Flow-level Intrusion Detection Approach for High-speed Networks, in Proc. of IEEE International Conference on Distributed Computing Systems (ICDCS), 2006 (75/536=14%) (alphabetical order).
• Part II: Polymorphic worm signature generation
  – TOSG: Zhichun Li, Manan Sanghi, Brian Chavez, Yan Chen, and Ming-Yang Kao, Hamsa: Fast Signature Generation for Zero-day Polymorphic Worms with Provable Attack Resilience, in Proc. of IEEE Symposium on Security and Privacy, 2006 (23/251=9%).
  – LESG: Zhichun Li, Lanjia Wang, Yan Chen, and Zhi (Judy) Fu, Network-based and Attack-resilient Length Signature Generation for Zero-day Polymorphic Worms, in Proc. of IEEE International Conference on Network Protocols (ICNP), 2007 (32/220=14%).
Current Status
• Part III: Signature matching engines
  – Work in progress; will be the focus of this talk
  – Zhichun Li, Gao Xia, Yi Tang, Jian Chen, Ying He, Yan Chen, and Bin Liu, NetShield: Towards High Performance Network-based Semantic Signature Matching, in submission.
• Part IV: Network situational awareness
  – Work in progress
  – Zhichun Li, Anup Goyal, Yan Chen, and Vern Paxson, Towards Situational Awareness of Large-Scale Botnet Events using Honeynets, in preparation.
  – Zhichun Li, Anup Goyal, Yan Chen, and Aleksandar Kuzmanovic, P2P Doctor: Measurement and Diagnosis of Misconfigured Peer-to-Peer Traffic, in submission.
Current Status
• Part I: Sketch-based monitoring & detection
  – Results in [INFOCOM 06, ToN, ICDCS 06]
• Part II: Polymorphic worm signature generation
  – Results in [Oakland 06, ICNP 07]
• Part III: Signature matching engines
  – Work in progress; will be the focus of this talk
• Part IV: Network situational awareness
  – Work in progress
Limitations of Exploit-Based Signatures
• Figure: traffic filtering with the signature 10.*01 between the Internet and our network; sample traffic 1010101, 10111101, 11111100, 00010111, with X marks on the filtered traffic
• Polymorphism! A polymorphic worm might not have an exact exploit-based signature
Vulnerability Signature
• Figure: vulnerability-signature traffic filtering between the Internet and our network, with X marks on all traffic that targets the vulnerability
• Works for polymorphic worms
• Works for all the worms that target the same vulnerability


