- Number of slides: 78
Tools and techniques for understanding and defending real systems
Jedidiah R. Crandall
crandall@cs.ucdavis.edu
Overview
- Security is not a problem to be solved, but a battle to be waged by…
  - Antivirus professionals
  - Law enforcement
  - Next-generation security technology developers
  - …
- Give them the tools they need
  - Implementations of useful techniques
  - Theory planted firmly in practice
Vision
- How can we address emerging threats (poly/metamorphic worms/botnets, cryptovirology, advanced rootkits, etc.)?
  - Problem: We don’t have very many real-world samples of these to look at
  - Solution: Look at the way the samples we have interact with the systems we’re trying to defend
Outline
- Code Red II example
  - Define some basic terms and concepts
- Minos
  - Catches worms
- DACODA
  - Used to understand polymorphism and metamorphism
- Temporal Search
  - Analyzes the payload for timebomb attacks
- Looking ahead…
Code Red/Code Red II
- Code Red
  - 359,000 hosts infected
  - $2.6 billion in cleanup [Computer Economics]
  - Attempted DoS on White House
    - Averted after being discovered hours before the attack was to occur
- Code Red II
  - Exploit is basically the same
Exploit-based Worms
[Diagram: Web Server’s Memory; incoming request GET /bla?x=A1B28CD30EE17C]
The Code Red II Exploit
GET /default.ida?XXXXXXXXXXXXXXXXXXXXXX…XXXX%u9090%u6858%ucbd3%u7801%u9090%u6858%ucbd3%u7801%u9090%u8190%u00c3%u0003%u8b00%u531b%u53ff%u0078%u0000%u00=a HTTP/1.0
Three stages of an attack
ε = Exploit Vector
GET /default.ida?XXXXXXXXXXXXXXXXXXXXXX…XXXX%u9090%u6858%ucbd3%u7801%u9090%u6858%ucbd3%u7801%u9090%u8190%u00c3%u0003%u8b00%u531b%u53ff%u0078%u0000%u00=a HTTP/1.0
γ = Bogus Control Data
GET /default.ida?XXXXXXXXXXXXXXXXXXXXXX…XXXX%u9090%u6858%ucbd3%u7801%u9090%u6858%ucbd3%u7801%u9090%u8190%u00c3%u0003%u8b00%u531b%u53ff%u0078%u0000%u00=a HTTP/1.0
π = Payload
GET /default.ida?XXXXXXXXXXXXXXXXXXXXXX…XXXX%u9090%u6858%ucbd3%u7801%u9090%u6858%ucbd3%u7801%u9090%u8190%u00c3%u0003%u8b00%u531b%u53ff%u0078%u0000%u00=a HTTP/1.0
Motivation for ε-γ-π
- Different polymorphic/metamorphic techniques for ε, γ, and π
- Data can be represented differently on the network and where it is used in the attack trace
  - “25 75 63 62 64 33 25 75 37 38 30 31” vs. “d3 cb 01 78” for 0x7801cbd3
- “Information only has meaning in that it is subject to interpretation.” [Cohen, 1984]
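The gap between the network representation and the in-memory representation can be made concrete with a small sketch. It assumes the `%uXXXX` escapes are decoded into 16-bit values stored little-endian, as on x86; the helper names `wire_bytes` and `decode_percent_u` are illustrative only and come from neither the talk nor any library.

```python
import re
import struct


def wire_bytes(text: str) -> str:
    """Hex dump of the ASCII bytes as they appear on the network."""
    return text.encode("ascii").hex(" ")


def decode_percent_u(text: str) -> bytes:
    """Decode each %uXXXX escape into a 16-bit value stored little-endian
    (assumed layout in memory on x86 after the server decodes the URL)."""
    out = bytearray()
    for match in re.finditer(r"%u([0-9a-fA-F]{4})", text):
        out += struct.pack("<H", int(match.group(1), 16))
    return bytes(out)


on_wire = "%ucbd3%u7801"                      # fragment of the Code Red II request
in_memory = decode_percent_u(on_wire)
pointer = struct.unpack("<I", in_memory)[0]   # read back as one 32-bit word

print(wire_bytes(on_wire))     # twelve ASCII bytes on the network
print(in_memory.hex(" "))      # d3 cb 01 78
print(hex(pointer))            # 0x7801cbd3
```

A byte-string signature written against one representation says nothing about the other, which is why the model separates what is seen on the wire from how the host interprets it.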
Network Signatures?
GET /default.ida?XXXXXXXXXXXXXXXXXXXXXX…XXXX%u9090%u6858%ucbd3%u7801%u9090%u6858%ucbd3%u7801%u9090%u8190%u00c3%u0003%u8b00%u531b%u53ff%u0078%u0000%u00=a HTTP/1.0
Polymorphism and metamorphism
- Change successive instances of the worm so signature-based network defenses fail
  - Polymorphic: think syntax
  - Metamorphic: think semantics
- Note: Some researchers call both polymorphism
Poly/metamorphism in γ and π
- Poly/metamorphic possibilities of π are endless (self-modifying code)
- γ: Buttercup [Pasupulati et al.; NOMS 2004]
  - “Register springs”: more details in [Crandall et al.; DIMVA 2005]
    - 11,009 possibilities for Blaster
    - 353 for Slammer
Polymorphism of ε
GET /default.ida?XXXXXXXXXXXXXXXXXXXXXX…XXXX%u9090%u6858%ucbd3%u7801%u9090%u6858%ucbd3%u7801%u9090%u8190%u00c3%u0003%u8b00%u531b%u53ff%u0078%u0000%u00=a HTTP/1.0
Polymorphism of ε
GET /yutiodr.ida?CEOIUXJASKMDIDDEOXIJOEIJXDXNMDKJXNSKJNXIDOIWR…ATUD%u8743%ubc65%ua999%uffff%u873f%ue875%u4568%u99cc%u8333%u7621%ubb66%u9876%u1000%u8732%u9854%u76cd%udddd%u5555%u5234%uff43%u7632%u5632%ucc=i HTTP/1.0
Metamorphism of ε
GET /default.ida?XXXXXXXXXXXXXXXXXXXXXX…XXXX%u9090%u6858%ucbd3%u7801%u9090%u6858%ucbd3%u7801%u9090%u8190%u00c3%u0003%u8b00%u531b%u53ff%u0078%u0000%u00=a HTTP/1.0
Metamorphism of ε
GET /default.ida?X%u61XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX\xd3\xcb\x01\x78XXXXXXX=a HTTP/1.0
Metamorphism of ε
Minos [Crandall and Chong; MICRO 2004]
- Tagged architecture that tracks the integrity of every memory word
  - Network data is tainted
  - Control data (return pointers, function pointers, jump targets, etc.) should not be
- Taint tracking with every instruction
- Great for catching worms
  - Uses the γ mapping
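The tagging idea can be illustrated with a toy model. This is a heavily simplified sketch and not the Minos policy itself: it keeps one taint bit per word, propagates it through copies and additions, and refuses an indirect jump through a tainted word. `ToyMachine`, `Word`, and `ControlDataTainted` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class Word:
    value: int
    tainted: bool = False   # True = low integrity, derived from network data


class ControlDataTainted(Exception):
    """Raised when tainted data is about to be used as a jump target."""


class ToyMachine:
    def __init__(self):
        self.memory = {}

    def store_trusted(self, addr, value):
        self.memory[addr] = Word(value, tainted=False)

    def store_network(self, addr, value):
        # Everything arriving from the network starts out tainted.
        self.memory[addr] = Word(value, tainted=True)

    def copy(self, dst, src):
        src_word = self.memory[src]
        self.memory[dst] = Word(src_word.value, src_word.tainted)

    def add(self, dst, a, b):
        # The result is tainted if either operand is tainted.
        wa, wb = self.memory[a], self.memory[b]
        self.memory[dst] = Word((wa.value + wb.value) & 0xFFFFFFFF,
                                wa.tainted or wb.tainted)

    def jump_indirect(self, addr):
        target = self.memory[addr]
        if target.tainted:
            raise ControlDataTainted(hex(target.value))
        return target.value


machine = ToyMachine()
machine.store_trusted(0x10, 0x00401000)    # a legitimate return pointer
machine.store_network(0x20, 0x7801CBD3)    # bytes that arrived in a request
machine.copy(0x10, 0x20)                   # overflow overwrites the pointer
try:
    machine.jump_indirect(0x10)
except ControlDataTainted as caught:
    print("attack caught, bogus control data:", caught)
```

The check fires at the moment the bogus control data is used, independent of how the exploit vector or the payload was encoded.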
Gratuitous Dante Quote
Minos the dreadful snarls at the gate, … and wraps himself in his tail with as many turns as levels down that shade will have to dwell
Minos Implementation
- Implemented a full-system tagging scheme in a virtual machine
  - Linux (modified kernel)
    - Tracks integrity in the file system
    - Virtual memory swapping [used by Raksha project]
  - Windows (unmodified)
- Works great as a honeypot for catching worms
How to catch worms…
Only one false positive…
Actually a “non-target pest”
Minos Full-System Evaluation
- General Minos concept used in related works (DIFT [Suh et al.; ASPLOS 2004], TaintCheck [Newsome and Song; NDSS 2005]), follow-on works, and at least one commercial product
  - Important to get things right
    - e.g. Code Red II: must taint table lookups
- Able to build DACODA on top of Minos
DACODA [Crandall et al.; CCS 2005]
- DAvis malCODe Analyzer
- Discover invariants in the exploit vector (ε)
  - Symbolic execution on the system trace during attacks that Minos catches
- Used for an empirical analysis of polymorphism and metamorphism
  - Quantify and understand the limits
Worm Polymorphism and Metamorphism
- Viruses: Defender has time to pick apart the attacker’s techniques
  - e.g. Algorithmic scanners, emulation
- Worms: Attacker has time to pick apart the deployed network defense techniques
  - What can defenders do to evaluate the robustness of defenses against attacks that don’t exist yet?
Measuring Poly/metamorphism
- [Ma et al.; IMC 2006]
  - Found relatively little polymorphism “in the wild”
- Worm defense designers don’t have samples of the poly/metamorphic techniques attackers will use on their defenses
  - (Have to build the defense first)
The Epsilon-Gamma-Pi Model
How DACODA Works
“Information only has meaning in that it is subject to interpretation.” [Cohen, 1984]
- Gives each byte of network data a unique label
- Tracks these through the entire system
- Discovers predicates about how the host under attack interprets the network bytes
mov al, [AddressWithLabel1832]   ; AL.expr <= (Label 1832)
add al, 4                        ; AL.expr <= (ADD AL.expr 4)
                                 ; /* AL.expr == (ADD (Label 1832) 4) */
cmp al, 10                       ; ZFLAG.left <= AL.expr
                                 ; /* ZFLAG.left == (ADD (Label 1832) 4) */
                                 ; ZFLAG.right <= 10
je JumpTargetIfEqualToTen        ; P <= new Predicate(EQUAL ZFLAG.left ZFLAG.right)
                                 ; /* P == (EQUAL (ADD (Label 1832) 4) 10) */
                                 ; AddToSetOfKnownPredicates(P)
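The bookkeeping in the annotated instructions above can be mimicked in a few lines. This is only a sketch of the idea, assuming expressions are nested tuples and that a predicate is recorded when the conditional jump is taken; the class `SymbolicCPU` and its method names are invented for illustration and are not DACODA's implementation.

```python
def render(expr):
    """Print a nested-tuple expression in the s-expression style of the slide."""
    if isinstance(expr, tuple):
        return "(" + " ".join(render(part) for part in expr) + ")"
    return str(expr)


class SymbolicCPU:
    def __init__(self):
        self.al_expr = None        # symbolic expression currently held in AL
        self.zflag_left = None     # operands of the most recent comparison
        self.zflag_right = None
        self.predicates = []       # set of known predicates, in discovery order

    def mov_al_from_label(self, label_number):
        self.al_expr = ("Label", label_number)

    def add_al(self, immediate):
        self.al_expr = ("ADD", self.al_expr, immediate)

    def cmp_al(self, immediate):
        self.zflag_left = self.al_expr
        self.zflag_right = immediate

    def je_taken(self):
        # The jump was taken, so the two compared expressions were equal.
        self.predicates.append(("EQUAL", self.zflag_left, self.zflag_right))


cpu = SymbolicCPU()
cpu.mov_al_from_label(1832)   # mov al, [AddressWithLabel1832]
cpu.add_al(4)                 # add al, 4
cpu.cmp_al(10)                # cmp al, 10
cpu.je_taken()                # je JumpTargetIfEqualToTen
print(render(cpu.predicates[0]))   # (EQUAL (ADD (Label 1832) 4) 10)
```

The recorded predicate says that network byte 1832 had to equal 6 for the attack to follow this path, which is the kind of invariant a signature could be built on.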
Why Full-System Analysis?
- Kernel
  - “Remote Windows Kernel Exploitation: Step Into the Ring 0” by Barnaby Jack
  - MS05-027 (SMB)
- Multiple processes
  - Base64 in IIS + ASN.1 in lsass.exe
- Multithreading
  - And listening on multiple ports
  - Even for Slammer, the simplest buffer overflow ever
Actual Worms/Attacks Caught by Minos and Analyzed by DACODA
Name: Sasser, Blaster, Workstation Serv., RPCSS, Slammer, Code Red II, Zotob
OS: WinXP, Whist., Win2K
Port: 445 TCP, 135 TCP, 1434 UDP, 80 TCP, 445 TCP
Class: Buff. Over.
Other Attacks Caught by Minos and Analyzed by DACODA
Name: SQL Auth., rpc.statd, innd, Scalper, ntpd, Turkey
OS: Whist., Linux, Linux, OBSD, FBSD
Port: 1434 TCP, 111 & 918 TCP, 119 TCP, 80 TCP, 123 TCP, 21 TCP
Class: Buff. Over., Form. Str., Buff. Over., Int. Over., Buff. Over., Off By One
Single Contiguous Byte Strings
Name         Longest String (bytes)
Sasser       36
Blaster      92
Work.        23
RPCSS        18
Slammer      1
CRII         17
Zotob        36
SQLAuth      4
rpc.statd    16
innd         27
Scalper      32
ntpd         8
Turkey       21
Single Contiguous Signatures
- Autograph [Kim and Karp; USENIX Security 2004] and EarlyBird [Singh et al.; OSDI 2004] both demonstrated good results at about 40 bytes for the signature length
- [Newsome et al.; IEEE S&P 2005] came to the same conclusion as we did and proposed sets of smaller byte strings called tokens
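The difference between a single contiguous signature and a set of tokens can be sketched as follows. The token list and the shortened requests below are made up for illustration (loosely modeled on the Code Red II request shown earlier); they are not signatures produced by any of the cited systems.

```python
from typing import Sequence


def matches_contiguous(payload: bytes, signature: bytes) -> bool:
    """Classic signature: one byte string that must appear verbatim."""
    return signature in payload


def matches_token_sequence(payload: bytes, tokens: Sequence[bytes]) -> bool:
    """Token signature: several short byte strings that must all appear,
    in order, with arbitrary bytes allowed between them."""
    position = 0
    for token in tokens:
        found = payload.find(token, position)
        if found < 0:
            return False
        position = found + len(token)
    return True


# Hypothetical, shortened requests.
original = b"GET /default.ida?XXXXXXXX%u9090%u6858%ucbd3%u7801=a HTTP/1.0"
variant = b"GET /yutiodr.ida?CEOIUXJA%u8743%ubc65%ua999%uffff=i HTTP/1.0"
benign = b"GET /index.html HTTP/1.0"

contiguous = b"/default.ida?XXXXXXXX"
tokens = [b"GET /", b".ida?", b"%u", b" HTTP/1.0"]

for name, request in [("original", original), ("variant", variant), ("benign", benign)]:
    print(name,
          matches_contiguous(request, contiguous),
          matches_token_sequence(request, tokens))
```

The contiguous signature misses the polymorphic variant, while the token sequence still matches it; the price is that short tokens made of protocol framing say less about the attack itself.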
Tokens
GET /default.ida?XXXXXXXXXXXXXXXXXXXXXX…XXXX%u9090%u6858%ucbd3%u7801%u9090%u6858%ucbd3%u7801%u9090%u8190%u00c3%u0003%u8b00%u531b%u53ff%u0078%u0000%u00=a HTTP/1.0
Where do These Tokens Come From?
- Scalper “Transfer-Encoding: chunked”
  - Same applies to most of these vulnerabilities
- “The Horns of a Dilemma”
  - Use protocol framing as a signature
  - Be very precise
Precision: ASN.1 Dangling Pointer
- Heap corruption
(0x23 [SIZE] … “AAAA” (0x23 [SIZE] 0x77665544 “BBBB”) …)
Conclusions from DACODA
- Whole system analysis is important
- New focus on more semantic signatures
  - How to understand the semantics of the vulnerability?
- We can learn a lot about emerging malware threats by studying existing malware samples and their interactions with the systems they run on
Temporal Search [Crandall et al.; ASPLOS 2006]
- Automated discovery of timebomb attacks
  - Analysis in the π stage
- Prototype of behavior-based analysis
  - Proposed a framework for a problem space nobody has looked at before
  - Implemented parts of it
  - Identified the remaining challenges
    - By testing real worms with timebombs on our prototype
You as an antivirus professional catch a new worm…
- Unpack it
- Polymorphism/metamorphism?
- Anti-debugger tricks?
- Any behaviors predicated on time?
  - How does it get the time?
  - UTC/Local?
  - Conversions between formats?
With Temporal Search…
- Infect a VM
- Automated, behavior-based Temporal Search
- Respond
How to respond?
- Sober.X: 6 and 7 January 2006
  - URLs blocked
- Kama Sutra: 3rd of the month
  - Users removed infections
- Code Red: 20th of the month
  - White House IP address changed
What if we have just hours or even minutes, not days?
Behavior-based Analysis
- [Cohen, 1984] defined behavior-based detection as a question of “defining what is and is not a legitimate use of a service, and finding a means of detecting the difference.”
- Behavior-based analysis is similar
  - Assume the system is infected with malware
  - Analyze its use of a service such as the PIT (programmable interval timer)
Why not just speed up the clock?
- Dramatic time perturbation would be easy to detect
  - Also not easy to do for a busy system (effectively lowers perceived performance)
- May miss some behaviors
  - Kama Sutra
- Will not be able to explain behaviors it does elicit
Basic Idea
- Find timers
  - Run the PIT at different rates of perceived time
    - System performance stays the same
  - Correlate between PIT and memory writes
- Symbolic execution
  - e.g. with DACODA
- Weakest precondition calculation
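One way to picture the timer-discovery step is the sketch below. It assumes each run is summarized as the number of PIT ticks delivered plus, per memory address, how much the stored value advanced during the run; an address is reported as a timer candidate when its advance per tick is the same at every PIT rate. The data layout, the tolerance, and the addresses are assumptions made for this example, not details of the prototype.

```python
def find_timer_candidates(runs, tolerance=0.05):
    """runs: list of (pit_ticks, {address: observed advance of the value}).
    Returns addresses whose advance scales linearly with the tick count."""
    # Only addresses written in every run can be compared across rates.
    shared = set.intersection(*(set(advances) for _, advances in runs))
    candidates = []
    for address in sorted(shared):
        per_tick = [advances[address] / ticks for ticks, advances in runs]
        lowest, highest = min(per_tick), max(per_tick)
        if lowest > 0 and (highest - lowest) <= tolerance * highest:
            candidates.append(address)
    return candidates


# Toy observations for the same wall-clock window at two PIT rates.
runs = [
    (100, {0x1000: 1_000_000, 0x2000: 500, 0x3000: 7}),   # normal rate
    (200, {0x1000: 2_000_000, 0x2000: 500}),              # PIT twice as fast
]
print([hex(address) for address in find_timer_candidates(runs)])   # ['0x1000']
```

Address 0x1000 advances by the same amount per tick in both runs and so behaves like a timer, while 0x2000 changes independently of the PIT and is discarded.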
Filling in the Timetable
SystemTime                       Predicate   Behavior
126,396,288e12 (13 July 2001)    ? >= 20     Spread
Filling in the Timetable
SystemTime                       Predicate   Behavior
126,396,288e12 (13 July 2001)    ? >= 20     Spread
126,402,336e12 (20 July 2001)    ? >= 28     DoS White House
Filling in the Timetable
SystemTime                       Predicate   Behavior
126,396,288e12 (13 July 2001)    ? >= 20     Spread
126,402,336e12 (20 July 2001)    ? >= 28     DoS White House
126,409,248e12 (28 July 2001)    None        Go to sleep
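The way the timetable grows row by row can be sketched as a loop: observe the behavior at the current time, record the time predicate that was evaluated but not yet satisfied, then jump the clock to the point where it becomes true. The function `toy_code_red` is a hypothetical stand-in for running the real worm under analysis, and days of the month replace raw SystemTime values to keep the example small.

```python
def toy_code_red(day_of_month):
    """Hypothetical model: returns the observed behavior and the threshold N
    of the next unsatisfied 'day >= N' predicate (None if there is none)."""
    if day_of_month < 20:
        return "Spread", 20
    if day_of_month < 28:
        return "DoS White House", 28
    return "Go to sleep", None


def fill_timetable(run_malware, start_day):
    timetable = []
    day = start_day
    while True:
        behavior, threshold = run_malware(day)
        predicate = None if threshold is None else f"? >= {threshold}"
        timetable.append((day, predicate, behavior))
        if threshold is None:
            break                # no further time-dependent branch was seen
        day = threshold          # set the clock so the predicate becomes true
    return timetable


for row in fill_timetable(toy_code_red, start_day=13):
    print(row)
```

Starting on the 13th, the loop yields the three rows of the slide: spreading, then the denial-of-service phase from the 20th, then sleep from the 28th.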
Windows
Manual Analysis
- Many different library calls, APIs for date and time
  - GetSystemTime(), GetLocalTime(), GetTimeZoneInformation(), DiffDate(), GetDateFormat(), etc.
  - System call not really necessary
- Conversions back and forth between various representations (e.g. MyParty.A, Blaster.E)
  - UTC vs. Local
  - 1601 vs. 1970
  - 32- vs. 64-bit integers for day, month, year, etc.
  - Strings
  - Not always done with standard library functions
- Have to unpack it first, anti-debugging tricks
- All of this is simply dataflow from SystemTime timer
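To give a feel for the conversions an analyst has to follow by hand, here is a minimal sketch of two of them: Windows FILETIME values (100-nanosecond ticks counted from 1 January 1601) versus Unix time (seconds from 1 January 1970), and UTC versus a local time zone. The function names are illustrative, and the fixed UTC offset is a simplifying assumption that ignores daylight saving rules.

```python
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
TICKS_PER_SECOND = 10_000_000   # FILETIME counts 100 ns intervals
EPOCH_GAP_SECONDS = int((UNIX_EPOCH - FILETIME_EPOCH).total_seconds())


def filetime_to_unix(filetime: int) -> int:
    return filetime // TICKS_PER_SECOND - EPOCH_GAP_SECONDS


def unix_to_filetime(unix_seconds: int) -> int:
    return (unix_seconds + EPOCH_GAP_SECONDS) * TICKS_PER_SECOND


def local_fields(unix_seconds: int, utc_offset_hours: int):
    """Break a Unix time into (year, month, day, hour) at a fixed UTC offset."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    moment = (UNIX_EPOCH + timedelta(seconds=unix_seconds)).astimezone(zone)
    return moment.year, moment.month, moment.day, moment.hour


print(EPOCH_GAP_SECONDS)          # 11644473600
print(unix_to_filetime(0))        # 116444736000000000
print(local_fields(0, 0))         # (1970, 1, 1, 0)
print(local_fields(0, -8))        # (1969, 12, 31, 16)
```

The last two lines show why the UTC/local question matters for a timebomb: the same instant falls on different days of the month depending on the zone the malware uses.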
Setup
[Diagram]
- Windows XP @ 192.168.33.2: Bochs VM w/ DACODA and Timer Discovery
- Host @ 192.168.33.1 w/ DNS, NTP, HTTP, TIME, etc.
- tuntap interface between the two
- ARP cache poisoning, DNS spoofing, etc.
Temporal Search
- Symbolic Execution (DACODA)
  - Code Red, Blaster.E, MyParty.A, Klez.A
- Discovers predicates on day, hour, minute, etc. on a real time trace
- Control-flow sensitivity within loops
  - Code Red, Blaster.E, MyParty.A, Klez.A, Sober.X, Kama Sutra
- Month and year
Adversarial Analysis
- For any technique, being applicable to every possible virus or worm is not a requirement
  - AV companies collect intelligence
- More details in the paper on this
Conclusions from Temporal Search
- Manual analysis is tricky and time-consuming
  - Temporal Search can dramatically improve response time
- Behavior-based analysis is all about the environment
- Malware does not follow a linear timetable
- Gregorian calendar poses its own challenges
Why Behavior-Based Analysis?
“An ant, viewed as a behaving system, is quite simple. The apparent complexity of its behavior over time is largely a reflection of the complexity of the environment in which it finds itself.” –Herbert Simon
Other recent projects…
- (Stuff I’m currently working on)
Replay-Based Entropy Measurement [Crandall et al.; work in progress]
Great Firewall of China [Zinn et al.; work in progress]
- My contribution: Model keyword-based censorship using Latent Semantic Analysis
  - Relate keywords to concepts
  - Efficient probing to discover unknown words that are filtered
Recovery [Oliveira et al.; work in progress]
Virtual Time
Looking ahead…
- Worms, botnets, rootkits, ???
  - Not problems with purely technical solutions
  - Should give defenders the tools they need
- How to develop defenses for emerging threats…
  - Study real malware
  - Understand the systems that the battle takes place on
  - Use the interactions between the two to develop a theory of what is possible
Examples
- Behavior-based analysis
  - Fully-automated implementation of temporal search
    - Different approaches [Reps et al.; ESEC/FSE ’97]?
  - Cryptovirology [Yung and Young; 2004]
- Vulnerability semantics
  - Vector semantics (such as LSA)?
- Testing for unknown vulnerabilities
- Policies for commodity systems
  - Biba’s low-water-mark integrity, Chinese Wall Policy [Fraser; IEEE S&P 2000]
Questions?
- Thank you for inviting me.
Related Work: Vigilante [Costa et al.; SOSP 2005]
- Introduces the idea of Self Certifying Alerts
  - Goal is automatic patching, not network filtering
  - No distinction between what data looks like on the network and what it looks like when processed
- Filter generation is similar to DACODA’s symbolic execution
- DACODA is a whole system approach
- Shield [Wang et al.; SIGCOMM 2004]
Temporal Search Lessons Learned…
- Some interesting times are relative
  - Need to track TickCount
- Behavior-based analysis is all about the environment
  - Code Red and TCP RSTs
Minos Evaluation
- Attacks designed to subvert Minos
  - [Crandall and Chong; MICRO 2004]
  - [Crandall and Chong; WASSA 2004]
  - [Chen et al.; USENIX Security 2005]
  - [Dalton et al.; WDDD 2006]
  - [Piromsopa and Enbody; WDDD 2006]
Adversarial Analysis of Temporal Search
- For any technique, being applicable to every possible virus or worm is not a requirement
  - AV companies collect intelligence
- Challenges
  - What is and is not a malicious use of the PIT?
  - Cryptocounters, covert channels, etc.
  - VM detection
    - [King et al.] SubVirt… at IEEE S&P 2006
    - Pioneer project and related work at CMU
- All analysis can be done on a trace
  - [Oliveira et al.; ASID 2006]