

  • Number of slides: 103

Diversity Algorithms for Worrisome Software and Networks (DAWSON)
James Just, Mark Cornwell, Jason Minto, Art Torrey (Global InfoTek, Inc.)
Karl Levitt, Jeff Rowe, Tufan Demir (UC Davis)
R. Sekar, Consultant (SUNY Stony Brook)
27 January 2005

Overview
• Project overview
• Integration framework
• Diversity to break exploits
• Diversity to break payloads
• Framework for analyzing effectiveness
• Binary rewriting review
• Next steps

Problem Space (1): Excessive Homogeneity => Systemic Vulnerability
How can exponentially cascading failures be prevented?
• Attacks exploit the dense environment with ease to spread fast and/or far
• Foreseeable cyber-risks are dominated by a static, durable monoculture of executables

Problem Space (2): Common Mode Failures Impede Intrusion Tolerant Systems
• Intrusion tolerant systems:
  – Are expensive
  – Have large hw/sw footprints
  – Assume a priori knowledge of attack modalities
• Success depends on availability of spare components
  – Assumption of independent intrusions/faults is flawed
  – Availability of diverse commercial spares limits effectiveness even if the intrusion tolerance system is affordable
  – Rapid learning of attack signatures for blocking is hard
  – Custom N-version programming is costly

DAWSON Approach
• Randomized transforms of Windows executables at runtime
  – Preserve functionality of executable modules (e.g., DLLs)
  – Transform binary code, machine addresses, names, etc.
  – Use annotations to facilitate
  – Pseudo-random numbers produce unique transformations on each application restart
  – Network protocol diversity effort replaced by breaking payload execution
• Goal: Beat program metric by 10X for a large fraction of the exploit space if transforms are focused
  – 100 functional equivalents with no more than 3 susceptible to the same exploit as baseline code, for most exploits
  – Low overhead transforms (runtime performance)

Attack Space of Interest: Memory Error Exploits
Includes common buffer overflows, strncpy(), off-by-one, cast screw-up, format strings, double-free, return to libc, other heap structure exploits
Memory corruption attacks
• Corrupt target of existing pointer
  – Compromise security critical data
    • File names opened for write or execute
    • Security credentials -- has the user authenticated himself?
• Corrupt a pointer value
  – Corrupt data pointer
    • Frame pointer
    • Local variables, parameters
    • Pointer used to copy input
  – Corrupt code pointer
    • Return address
    • Function pointer
    • Dynamic linkage tables (GOT, IAT)
• The corrupted pointer becomes one of:
  – Pointer to injected data
  – Pointer to existing data (example: corrupt string arguments to functions so that they point to attacker desired data already in memory, e.g., “/bin/sh”, “/etc/passwd”)
  – Pointer to injected code
  – Pointer to existing code

Evaluation
• Identify assumptions and ROE for possible Red Teaming
• Internal testing with
  – Fabricated applications with known vulnerabilities and exploits
  – Real applications with known vulnerabilities and exploits
• Possible use of Emulab or Deter network emulation testbeds

Some Assumptions & Red Team ROEs
• Attacks are remote, automated and non-directed
• Attacker cannot observe the execution of valid programs without using system calls
• Processes cannot transition from user mode to kernel mode without using system calls
• Attacker cannot automate non-trivial static analysis of memory contents
• Modification is limited to binary (or memory) editing; source code is unavailable

Status
• Interim products
  – Native Windows (MFC, .NET) PE File Editor
  – Transforms
    • Automated permutation of the Import Address Table in PE files
    • Automated replacement of DLL names and functions with random strings in PE files
    • Local variable location modification (not quite automated yet)
  – In-process
    • Reordering of binary code blocks and insertion of dead code blocks
    • Asymmetric transformation of function parameters using dummy functions
• New insights on requirements
  – Address obfuscation (to defeat trivial static analysis of memory)
  – Fail-crash & detection mechanism (to defeat brute force trial and error)
  – Non-bypassability of transform mechanisms
    • Balzer wrapper mechanism
    • UC Davis investigating option from another project
  – Because of perceived higher value, shifted some effort to developing diversity to break payloads rather than diversity for network protocols

Transition and Future Work
• Interim use of intermediate products by other SRS contractors
• Integrate with follow-on projects/products
• If successful
  – Package for military users
  – Possible GITI “commercial” product
  – Possible open source “toolkit” approach
  – Transition to Microsoft or other software vendors
  – Some expressions of VC interest
• Standard research publications

DAWSON Project Schedule & Milestones
[Gantt chart: baseline tasks (1. Requirement Refinement, 2. Exploit Diversification, 3. Payload Diversification, 4. Integration, 5. T&E, 6. Program Mgt.) plotted by quarter across FY04, FY05 and FY06, with Prototypes 1, 2 and 3 marked as milestones]

GITI

Current Attack Problem
[Diagram: Software Specification, Source Code, Executable Code, Loader and Machine-level Code form the build-and-load chain. The Machine Code Specification gives the attacker a Known V-Spec, so the Exploit & Payload matches the Vulnerability in the machine-level code]

Vulnerability Specification?
• All aspects of a program execution which can be exploited by malicious code to gain control of the Program Counter, e.g.,
  – Memory Topology
  – Stack Specification
  – System APIs
  – Application APIs
  – Libraries
  – Exception Handling
  – Etc.

Breaking the V-Spec
[Diagram: the same chain as the previous slide (Software Specification, Source Code, Executable Code, Machine Code Specification, Known V-Spec), but a Transforming Loader driven by Transform Specifications produces many different Machine-level Code variants. The V-Spec is unknown until load time, so the Exploit & Payload doesn’t match (X) and it is unclear where the vulnerability is]

Transform Techniques in Literature*
• Obfuscation
  – Layout obfuscation (scramble identifiers, remove comments, change formats)
  – Control flow obfuscations (statement grouping, ordering, computation, opaque constructs)
  – Data obfuscation (storage, encoding, grouping, ordering)
  – Preventative transformations (prevent decompilers from operating by exploiting weaknesses)
    • Inherent (aliases, variable or bogus dependencies, opaqueness side effect & difficulty)
    • Targeted
• Source code
  – N-version programming
  – Functional-behavior preserving diversity in components used (e.g., different encryption algorithms, different scales for data such as Celsius or Fahrenheit)
  – Semantics preserving source code transformations
    • Place sensitive data (such as function and data pointers) below the starting address of any buffer
    • Variable ordering
    • Equivalent instructions
  – Variable compilation -- variable internal names, padding and addresses, linking orders
  – Insertion of opaque constructs or other dead code to change memory layout
• Binary code
  – Address transformations (relative and absolute) on binary code
    • Randomize base address of memory regions (stack, heap, DLL, routines/static data in executable)
    • Permute order of variables/routines (local variables in stack frame, static variables, routines in shared libraries or routines in executables)
    • Introduce random gaps between objects (padding in stack frames, between successive malloc allocation requests, between variables in the static area; gaps within routines, with added jump instructions to skip over gaps)
  – System resource, system call, or DLL name/address transformation
  – Instruction set transformation
  – Static and dynamic binary rewriting
*References shown on later slides

Multi-Layer Defense Strategy
[Layered diagram; individual layers are labeled GITI or UC Davis on the slide]
• Prevent Remote Exploit of Memory Errors
• Prevent Injected Code from Properly Executing
• Prevent Access to Windows DLLs
• Prevent Use of Windows DLLs
• Prevent the Bypass of DLLs

Diversity System Functional Architecture: Normal
Modified loader transforms original stored program and generates wrapper that retranslates external calls
[Diagram: Original Program and Annotation File feed the Modified Loader (seeded by a PRN), which produces the Transformed In-memory program and its Wrapper. User Inputs pass through Translation; the Wrapper mediates access to Other System Resources]
Normal user inputs are translated & untranslated so they work

Diversity System Functional Architecture: Initial Exploit
Modified loader transforms original stored program and generates wrapper that retranslates external calls
[Diagram: same components as the normal case (Original Program, Annotation File, Modified Loader, PRN, Wrapper, Transformed In-memory program, Other System Resources), with an Attacker supplying input through Translation]
Some attacks fail because the assumed vulnerability is gone

Diversity System Functional Architecture: Payload Execution
Modified loader transforms original stored program and generates wrapper that retranslates external calls
[Diagram: same components as the normal case (Original Program, Annotation File, Modified Loader, PRN, Wrapper, Transformed In-memory program, Other System Resources), with the Attacker's injected commands passing through Untranslation]
Other attacks fail because injected commands are wrong

DAWSON Implementation Concept (1)
Approach eases integration of various transform techniques
[Diagram: the Original Binary Program on Disk (PE File) and the Program Annotation File enter the DAWSON Randomizer, built on the PE File Editor. A PE File Macro Randomizer drives PE File Component Randomizers through an API; Transform Techniques 1 through N, seeded by a PRNG, are applied through the API to Objects 1 through K. Output is the Modified Binary Program in memory or on disk]

DAWSON Implementation Concept (2)
Manual Approach: how best to automate?
[Pipeline diagram: Original Binary Code, Disassembled Code, (Optional) Decompiled Code, Structure (Pattern) Analyzer, Structure (Pattern) Modifier, (Optional) Recompiled Code, Reassembled Code, Modified Binary Code]

DAWSON Implementation Concept (2)
Automated Runtime Randomization
[Pipeline diagram: Original Binary Code plus Annotation File, Disassembled Code, (Optional) Decompiled Code, Structure (Pattern) Analyzer, Structure (Pattern) Modifier, Modified Binary Code; the whole pipeline is the Binary Rewriting Randomizer]

Automated Defensive Transformations
(Ratings are given for source code and for binary code, followed by issues/comments)
• Control-flow obfuscation: source code ++, binary code +. Accurate disassembly
• Data obfuscation: ++
• Permutation/Encryption of parameters: source code ++, binary code +. Optimizations in code may obscure parameter passing code
• Detection of all memory errors: source code ++, binary code --. May have 20% to 200% overhead
• Pointer “encryption”: source code ?, binary code --. Difficult to decide (even in source) whether a field is an int or pointer
• Instruction set randomization: source code ++, binary code ++. Strong protection from injected code
• System call randomization: source code ++, binary code ++. Limited space of

Automated Layout Randomizations
(For each region: level at which absolute and relative randomization can be applied, followed by potential issues)
• Code, static data in executable: absolute Binary?, relative Source. Need relocation information or annotations in binaries
• Code, static data in DLL: absolute Binary, relative Source. Range of base addresses for DLLs limited (32K?)
• Stack: absolute Binary, relative -. Difficulty randomizing data at stack base; limited range, especially for threads (1M?)
• Heap: absolute Binary, relative -. To fully randomize heap allocations, may need to redefine malloc() implementation

Exploit Breaking Transforms
• Randomize base of stack by a large number (preferably, by 100 MB or more for single-threaded programs)
• Randomize locations of installed DLLs
  – Manage all of the DLLs installed on a system
  – Ensure that they get mapped to non-overlapping locations
  – Change the mappings periodically
  – Need simple management tools to make all of this happen
• Randomize location of functions in the executable
• Randomize base of the heap and the distance between two successively allocated heap blocks
• Randomize location of static variables in the executable

Transform Characterization
• Type of transformation?
• What object is transformed?
• When can transformation occur?
  – Pre-load, post-load?
  – Are annotations or compiler support required?
• What types of exploits, payloads are impacted?
• How difficult is implementation of transform?

Random Stack Rebasing
• Linear randomization over the range
  – 4 KByte granularity
  – Effective address domain approximately 1.0 GB, limited by the demands of other process segments
  – Approximately 256K distinct bases are possible
• Two approaches examined*
  – One approach requires modification of the loader to implement in the NT/2000/XP environments
  – Second approach increases stack reserve space in the PE file and decrements ESP; it does not require loader modification
*Note that stack rebasing can be implemented directly in the PE file for DOS applications
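The "256K distinct bases" figure follows from dividing the roughly 1.0 GB domain by the 4 KB granularity. A minimal sketch of that arithmetic and of drawing one page-aligned offset; the function names are hypothetical and this is not the DAWSON loader code:

```python
import secrets

PAGE_SIZE = 4 * 1024     # 4 KB granularity, as stated on the slide
DOMAIN_SIZE = 1 << 30    # ~1.0 GB randomization domain, as stated on the slide


def count_bases(domain_size=DOMAIN_SIZE, page_size=PAGE_SIZE):
    """Number of distinct page-aligned stack bases inside the domain."""
    return domain_size // page_size


def random_stack_offset(domain_size=DOMAIN_SIZE, page_size=PAGE_SIZE):
    """Pick a uniformly random, page-aligned offset by which to move the stack base."""
    slot = secrets.randbelow(count_bases(domain_size, page_size))
    return slot * page_size


print(count_bases())                 # 262144, i.e. 256K
print(hex(random_stack_offset()))    # different on every run
```

An attacker who must guess the stack base blindly therefore faces about 2^18 equally likely positions per attempt under these assumptions.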

Memory Topology
[Diagram: 2 GByte user space (Win32), typical. From the top: Randomization Domain containing Heap Commit, Heap Reserve, Unmapped Memory, Stack Base, Stack Reserve, Stack Commit and Unmapped Memory; below that the .debug, .rdata and .text segments. Labeled addresses: 0x03600000, 0x00100000, 0x00041000]

Modifying Static Variable Locations (1)
[Diagram: function layout as Preamble, Code Block, Postamble, Padding]
• Preamble & Postamble code generated by compiler
• Code Block built by developer
• Padding inserted by compiler

Modifying Static Variable Locations (2)
[Diagram: function layout as Preamble, Code Block, Postamble, Padding]
• Preamble modified to increase size for local variables
• Code Block modified to use new offsets for local variables
• Postamble stays unchanged

Example: Original Assembly Code
[Assembly listing annotated with Preamble, Code Block, Postamble, Padding; listing not recoverable]

Modified Assembly Code
[Assembly listing annotated with Preamble, Code Block, Postamble, Padding; listing not recoverable]

Diversification of the Windows Vulnerability Environment
Karl Levitt, Hao Chen, Matt Bishop, Zhendong Su, Jeff Rowe, Ivan Balepin, Ebrima Ceesay, Tufan Demir, Bhume Bhumiratana, Lynn Nguyen, Daisuke Nojiri
UC Davis Computer Security Lab

The Problem
• Microsoft Windows provides the ideal conditions for epidemic cyber-attacks
  – Plenty of software vulnerabilities (root level buffer overflows)
  – Widespread installation of identical software
• Attack prevention in MS Windows is difficult
  – No protection via compiler modification
  – No source code for the OS or applications
• A single scripted exploit works against
  – Any machine
  – All machines

Diversification of the Windows Vulnerability Environment
• Windows executables typically call API functions for any significant task
• All API functions are provided in DLLs
  – Load address of API functions is not known until the program loads
  – Load address of API functions varies from host to host
• Major goal of Windows exploits is to locate the addresses of critical DLL functions

Multi-Layer Defense Strategy
• Prevent Remote Exploit of Memory Errors
• Prevent Injected Code from Properly Executing
• Prevent Access to Windows DLLs
• Prevent Use of Windows DLLs
• Prevent the Bypass of DLLs

Outline
• How Code Red and Slammer work
• Permute IAT and change DLL name strings: defeat known attacks
• Hypothesized attacks that will succeed
• Parameter modification: padding, transformation
• Preventing direct system calls from injected code
• Towards quantitative analysis of our approaches
• Techniques for binary rewriting
• Demonstration

How does DLL system work?
[Memory-layout diagram: stack below 80000000; .text at 01001000 containing `Call *010031A0`; the IAT slot at 010031A0 for `LoadLibraryA` holds 77E9D961; kernel32.dll is mapped at 77E80000 with LoadLibraryA() at 77E9D961; heap at 00070000. Other labeled addresses: 20000000, 65D60000]

Code Red Worm
[Memory-layout diagram: injected code on the stack; .text at 01001000; heap at 00070000. The injected code finds `KERNEL32` at 77E80000, then reads the kernel32.dll EAT entry `LoadLibraryA` to obtain LoadLibraryA() at 77E9D961. Other labeled address: 20000000]

SQL Slammer/Sapphire
[Memory-layout diagram: injected code on the stack; .text at 01001000; heap at 00070000. The injected code reads the IAT of sqlsort.dll, which holds 77E9D961, the address of LoadLibraryA() in kernel32.dll (mapped at 77E80000). Other labeled address: 20000000]

Preventing DLL Access
• Add Synthetic Diversity to Windows PE Format
  – Permutation of the Import Address Table
  – Random string replacement of DLL names and functions

Randomize Plain Text Strings
[Memory-layout diagram: the name strings that injected code searches for are replaced with random strings. In the PEB and on the heap, `KERNEL32` (at 77E80000) becomes `a7Ly4SZq19`; in the IAT and in the kernel32.dll EAT, `LoadLibraryA` (at 77E9D961) becomes `4Cu74xIpI9q2`. The program's own `Call *010031A0` in .text (01001000) still works]
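The renaming shown on this slide can be sketched as building a per-load mapping from well-known names to fresh random strings. This is an illustrative model only: `random_name`, `build_rename_map` and the fixed name length are assumptions, and a real tool would also have to patch the PE import and export tables consistently.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def random_name(length=12):
    """Generate a random alphanumeric replacement name."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def build_rename_map(names, length=12):
    """Map each original DLL or function name to a unique random string."""
    used = set(names)          # never reuse an original name by accident
    mapping = {}
    for name in names:
        new = random_name(length)
        while new in used:     # retry on the (unlikely) collision
            new = random_name(length)
        used.add(new)
        mapping[name] = new
    return mapping


renames = build_rename_map(["KERNEL32", "LoadLibraryA", "GetProcAddress"])
for old, new in renames.items():
    print(old, "->", new)
```

Injected code that searches memory for the literal string `LoadLibraryA` then finds nothing, while the rewritten program looks up the replacement name.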

Permute IAT
[Memory-layout diagram: the IAT entries for `LoadLibraryA` (77E9D961) and `GetProcAddress` (77E90332) swap slots between 010031A0 and 0100308C, and the call in .text is rewritten from `Call *010031A0` to `Call *0100308C` to match. kernel32.dll is mapped at 77E80000; .text at 01001000; heap at 00070000. Other labeled addresses: 80000000, 20000000, 65D60000]
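A toy model of the permutation, assuming the IAT is a list of (name, address) slots and each call site stores a slot index. `permute_iat` is a hypothetical helper, not the PE File Editor's interface, and the `ExitProcess` address is made up for the example:

```python
import random


def permute_iat(iat, call_sites, rng):
    """Shuffle IAT slots and patch call sites so the program still resolves correctly.

    iat        -- list of (function_name, address) tuples, one per slot
    call_sites -- list of slot indices used by the program's indirect calls
    rng        -- random.Random instance supplying the permutation
    """
    perm = list(range(len(iat)))
    rng.shuffle(perm)                       # perm[old_slot] == new_slot
    new_iat = [None] * len(iat)
    for old_slot, new_slot in enumerate(perm):
        new_iat[new_slot] = iat[old_slot]
    new_call_sites = [perm[slot] for slot in call_sites]
    return new_iat, new_call_sites


iat = [
    ("LoadLibraryA", 0x77E9D961),    # address from the slide
    ("GetProcAddress", 0x77E90332),  # address from the slide
    ("ExitProcess", 0x77E81234),     # hypothetical entry for illustration
]
call_sites = [0, 1, 0, 2]
new_iat, new_call_sites = permute_iat(iat, call_sites, random.Random())
print(new_iat)
print(new_call_sites)
```

The legitimate program keeps working because its call sites are patched along with the table; an exploit that hard-codes the original slot for `LoadLibraryA` now reaches whichever entry landed there.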


Some Assumptions…
• Attacks are remote, automated and non-directed
• Attacker cannot observe the execution of valid programs without using system calls
• Processes cannot transition from user mode to kernel mode without using system calls
• Attacker cannot automate static analysis of memory contents
• Modification is limited to binary (or memory) editing; source code is unavailable

Preventing DLL Access
• Add Synthetic Diversity to Windows PE Format
  – Permutation of the Import Address Table
  – Random string replacement of DLL names and functions
• Add Diversity to Binary Code
  – Randomize base addresses
  – Reorder code blocks
  – Interleave nonfunctional code blocks

Hypothesis: Operand hijacking
[Memory-layout diagram: injected code on the stack scans .text (01001000) for an existing `Call *0100308C` instruction and reuses its operand, the IAT slot 0100308C holding 77E9D961, to reach LoadLibraryA() in kernel32.dll (mapped at 77E80000). PEB and heap (00070000) also shown. Other labeled addresses: 80000000, 20000000, 65D60000]

Binary Transformation
[Memory-layout diagram: stack, .text, IAT (slot 010031A0 holding 77E9D961) and kernel32.dll at 77E80000. Numbered code blocks 1, 2, 3 in .text are shown together with copies relocated in a different order near 65D60000. Other labeled addresses: 80000000, 20000000]

Binary Transformation
[Memory-layout diagram, next build step: same layout as the previous slide (IAT slot 010031A0 holding 77E9D961, kernel32.dll at 77E80000), with numbered code blocks 1, 2, 3 in their reordered positions. Other labeled addresses: 80000000, 20000000, 65D60000]

Challenges in Binary Rewriting
[Memory-layout diagram: same reordered code blocks 1, 2, 3 as the previous slides, with two cases marked as problems (X)]
• Indirect jumps: JMP EAX, CALL EBX
• Function pointers

Our Binary Rewriting Approach
[Memory-layout diagram: in .text, the original `Call 010031A0` site is replaced by `jmp 697FA0D6`; the relocated code contains `cmp eax, ebx`, `jde 6687EF03`, `call 77E9D961`, `jmp 65D833AE` and `call 77E804AA`. The IAT slot 010031A0 holds 77E9D961, the address of LoadLibraryA in kernel32.dll (mapped at 77E80000). Other labeled addresses: 80000000, 20000000]

Preventing DLL Access
• What about brute force searches for DLL addresses?
  – Insert code and table entries that point to invalid addresses → page fault, start over
  – Attempted execution of inserted dead code blocks → dereference null pointer, start over
  – Landmines: insert code and table entries pointing to similar code segments that actually generate alarms
• Other issues
  – Insertion of redundant DLLs
  – Runtime diversification

Preventing DLL Access
• Key points
  – No performance impact upon running programs
  – Access prevention is policy free
• Challenges
  – Safety properties of address transformation
  – Impact of increased program size in memory

Preventing DLL Use
• For attacks that locate the proper DLLs, diverse transformations prevent their use
• Parameters passed to DLL functions are transformed per machine
  – Asymmetric parameter value transformation
  – Additional parameter padding

Function Parameters in Assembly
[Memory-layout diagram]
• Injected code on the stack: push EDI; push EBX; call 77E9D961
• Program code in .text (01001000): push EAX; push EBX; call *010031A0 (IAT slot holding 77E9D961)
• GetProcAddress() in kernel32.dll (77E9D961, DLL mapped at 77E80000) reads its parameters: mov ECX, EBP+0xc; mov EDX, EBP+0x8

Parameter Padding
[Memory-layout diagram]
• Injected code on the stack (unchanged): push EDI; push EBX; call 77E9D961
• Rewritten program code in .text: push EAX; push EBX; push ECX; push EDX; call *010031A0
• Rewritten GetProcAddress() in kernel32.dll (77E9D961) reads from the shifted offsets: mov ECX, EBP+0x14; mov ECX, EBP+0x10; mov ECX, EBP+0xc; mov EDX, EBP+0x8
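The offsets on this slide follow from ordinary 32-bit x86 stack frames. The sketch below assumes 4-byte arguments and a callee prologue of `push ebp; mov ebp, esp`, so the most recently pushed argument sits at EBP+0x8; `arg_offsets` is a hypothetical helper for working out where each pushed operand lands.

```python
WORD = 4               # bytes per pushed argument on 32-bit x86
FIRST_ARG_OFFSET = 8   # skips the saved EBP and the return address


def arg_offsets(pushes):
    """Map each pushed operand to its offset from EBP inside the callee.

    pushes -- operands in the order the caller pushes them; the last push
              ends up closest to EBP.
    """
    return {
        operand: FIRST_ARG_OFFSET + WORD * i
        for i, operand in enumerate(reversed(pushes))
    }


original = arg_offsets(["EAX", "EBX"])
padded = arg_offsets(["EAX", "EBX", "ECX", "EDX"])
print({k: hex(v) for k, v in original.items()})
print({k: hex(v) for k, v in padded.items()})
```

With two dummy pushes added, the real arguments move from EBP+0x8 and EBP+0xc to EBP+0x10 and EBP+0x14, so injected code that still pushes only two values leaves the rewritten callee reading the wrong stack slots.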

Asymmetric Parameter Transformation
[Memory-layout diagram]
• Injected code on the stack: push EDI; push EBX; call 77E9D961 (parameters are not transformed)
• Rewritten program code in .text: jmp 00045234 to a Transformation stub (cmp EDI, EBX; jde 00897EF1), then push EAX; push EBX; call *010031A0
• Rewritten GetProcAddress() in kernel32.dll (77E9D961) applies the Reverse Transformation before using its parameters: mov ECX, EBP+0xc; mov EDX, EBP+0x8
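The slide does not say which transformation is used. As one possible illustration, the sketch below uses an invertible affine map on 32-bit values with a per-machine key; the choice of map and all names are assumptions, not the DAWSON design.

```python
import secrets

MASK = 0xFFFFFFFF  # parameters are treated as 32-bit words


def make_key():
    """Per-machine key: an odd multiplier (so it is invertible mod 2**32) and an offset."""
    return secrets.randbits(32) | 1, secrets.randbits(32)


def transform(value, key):
    """Applied by the rewritten caller before pushing a parameter."""
    a, b = key
    return (value * a + b) & MASK


def reverse_transform(value, key):
    """Applied by the rewritten DLL function before using a parameter."""
    a, b = key
    a_inv = pow(a, -1, 1 << 32)   # modular inverse; needs Python 3.8+
    return ((value - b) * a_inv) & MASK


key = make_key()
param = 0x00070000
assert reverse_transform(transform(param, key), key) == param
print(hex(reverse_transform(param, key)))  # what the callee sees if the caller skipped the transform
```

Legitimate calls round-trip to the original value, while injected code that passes raw parameters has them scrambled by the reverse transformation.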

Preventing DLL Bypass
• Problem: Attacker can provide assembly components that implement some DLL functions, making direct low level (undocumented) Windows system calls
• Trap system interrupts for runtime checking

Signing System Calls by Location
• Post-load, pre-execute binary instrumentation
  [Flow: foo.exe, normal load and link, foo in memory, instrument in memory, execute]
• What is instrumented:
  – System call ID
  – In Linux, the system call ID is stored in %eax and an interrupt is issued
  – We substitute the original syscall_id with signed_id (stored in %eax prior to the interrupt)
• Advantage:
  – Preserves system consistency. Programs are modified only after they’re loaded
• Definitions:
  – E_k = fast trapdoor permutation with secret key k
  – F: {0,1}^32 → {0,1}^24
  – token = F(Address)
  – %eax = signed_id = E_k(token || syscall_id)
• Address is the 32-bit address of the location where the system call is made
• syscall_id is only 8 bits (only about 200 syscalls exist in Linux)

Authentication
• Assume:
  – Non-bypassability: every time a program makes a system call, we always intercept it before the kernel
  – Memory trace inspection: we need to inspect the stack of the program
• Method:
  – Decrypt the signed_id
    • (token || syscall_id) = D_k(%eax)   # %eax contains the signed_id
  – Inspect the program stack for the return address. Compute the token of the address:
    • check_token = F(Memory[%esp])
  – If check_token == token, then
    • set %eax = syscall_id
    • Forward the system call to the kernel
  – Otherwise fail
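The signing and authentication steps can be sketched end to end. The slides call for a fast trapdoor permutation E_k and a function F from 32 to 24 bits without naming either; this sketch substitutes a 4-round Feistel network keyed with HMAC-SHA256 for E_k and a truncated SHA-256 for F, purely as stand-ins. All function names are hypothetical.

```python
import hashlib
import hmac

ROUNDS = 4


def _round(key, index, half):
    """Feistel round function: 16-bit output derived from HMAC-SHA256."""
    msg = bytes([index]) + half.to_bytes(2, "big")
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:2], "big")


def encrypt32(key, value):
    """Stand-in for E_k: a keyed permutation on 32-bit values."""
    left, right = value >> 16, value & 0xFFFF
    for i in range(ROUNDS):
        left, right = right, left ^ _round(key, i, right)
    return (left << 16) | right


def decrypt32(key, value):
    """Stand-in for D_k: inverse of encrypt32."""
    left, right = value >> 16, value & 0xFFFF
    for i in reversed(range(ROUNDS)):
        left, right = right ^ _round(key, i, left), left
    return (left << 16) | right


def token(address):
    """Stand-in for F: maps a 32-bit address to a 24-bit token."""
    digest = hashlib.sha256(address.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:3], "big")


def sign(key, address, syscall_id):
    """signed_id = E_k(token || syscall_id), with an 8-bit syscall_id."""
    return encrypt32(key, (token(address) << 8) | (syscall_id & 0xFF))


def authenticate(key, signed_id, observed_address):
    """Return the real syscall_id if the call comes from the signed location, else None."""
    plain = decrypt32(key, signed_id)
    tok, syscall_id = plain >> 8, plain & 0xFF
    return syscall_id if tok == token(observed_address) else None


key = b"per-process secret"
signed = sign(key, 0x08048123, 11)
print(authenticate(key, signed, 0x08048123))   # call made from the instrumented site
print(authenticate(key, signed, 0xBFFFF000))   # same value replayed from injected code
```

The sketch treats the address used for signing and the address recovered from the stack as the same call-site value; a real implementation has to account for the difference between the instruction address and the return address.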

Limitations
• Only authenticates whether the system call is made from its original location. An attacker can still use a library function to make the system call.
  – Possible solution is to inspect further up the stack.

Cross Layer Commonalities
• Address obfuscation (to defeat trivial static analysis of memory)
  – Insertion of non-functional blocks
  – Basic block permutation
  – Permutation and insertion of dummy function parameters
  – Run-time obfuscation
• Fail-crash & detection mechanism (to defeat brute force trial and error)
  – Insert invalid virtual address (page fault)
  – Execution of dead code (deref. null pointer error)
  – Deliberate landmines

Evaluation
• Use the DETER Testbed
• Based upon the Emulab technology with extra security controls
• On demand large scale testbed for the testing and evaluation of security tools
• Deploy hundreds (thousands) of hosts with diverse configurations
• Try single attacks against all machines
• Red Team could launch automated worm attacks against all machines

320 Virtual Node Experiment
• Each node has its own OS, filesystem, processes, network interfaces
• 32 gateways
• 10 hosts per gateway
• 320 emulated nodes in total
• Colocate factor: 10
• Uses 44 physical nodes

Status
• We have implemented and demonstrated diversification of the Windows PE format for attack prevention
• We are implementing the reordering of binary code blocks and insertion of dead code blocks
• We are implementing the asymmetric transformation of function parameters using dummy functions
• Investigating potential non-bypassability mechanisms

Probability of Successful Attacks
Pr(A) = Pr(V) / [DE(A) * PEE(A)]
• Success probability of attack A exploiting vulnerability V
• DE denotes “derandomization effort”
  – Range of randomization of addresses involved in A
  – Requires randomization to change after each successful derandomization
    • If rerandomization happens after k attempts, multiply Pr(A) by k
    • No rerandomization => effect of DE and PEE is additive
• PEE denotes “payload execution effort”
  – Attempts to successfully execute “attack payload”
• Note: System susceptibility to attacks can be reduced without addressing every vulnerability
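A small worked example of the formula, including the "multiply by k" rule for rerandomization after k attempts. The function name, the cap at 1.0 and the sample values are assumptions added for illustration; the additive no-rerandomization case is not modeled.

```python
def attack_success_probability(pr_v, de, pee, k=1):
    """Pr(A) = k * Pr(V) / (DE(A) * PEE(A)), capped at 1.0.

    pr_v -- probability that vulnerability V is present and reachable
    de   -- derandomization effort (size of the randomized address range)
    pee  -- payload execution effort
    k    -- attempts the attacker gets before the system rerandomizes
    """
    return min(1.0, k * pr_v / (de * pee))


# Hypothetical numbers: a certain vulnerability, the 256K stack bases from the
# stack rebasing slide as DE, and an assumed PEE of 100.
print(attack_success_probability(1.0, 262144, 100))
print(attack_success_probability(1.0, 262144, 100, k=1000))
```

Because DE and PEE multiply in the denominator, doubling either one halves the attacker's chance per rerandomization period under this model.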

What can the injected code do?
• DLL access
  – Walk PEB to learn base addresses: O(#DLLs loaded)
  – DLLs (except a few) will be renamed, preventing search by name
  – Intercept dynamically loaded ones to rename
  – Can increase the number of loaded DLLs

Cont’d
• Access to functions in DLLs
  – Walk IAT of the application
    • O(#entries in IAT)
    • Permute the IAT
    • Can increase the size of IAT
      – Add invalid addresses
  – Scan the code section for call imm_addr
    • Replace imm_addr with computed goto
    • Force static analysis

State-of-Art in Binary Analysis & Transformation

Motivation
• No source code needed
  – Language-neutral (C, C++ or other)
• Can be largely independent of OS
• Ideally, would provide instruction-set independent abstractions
  – This ideal is far from today’s reality
• Applications in
  – Instrumenting long-running programs
  – Legacy code migration
  – Program optimizations
  – Security
    • Program obfuscation, security-enhancing transformations

Approaches
• Static analysis/transformation
  – Binary files are analyzed/transformed
  – Benefits
    • No runtime performance impact
    • No need for runtime infrastructure
  – Weakness
    • Prone to error, problem with checksums/signed code
• Dynamic analysis/transformation
  – Code analyzed/transformed at runtime
  – Benefit: more robust/accurate
  – Weakness
    • Some runtime overhead
    • Runtime infrastructure needed

Previous Works (Static)
• OM/ATOM (DEC WRL)
  – Proprietary and probably outdated
• EEL (Jim Larus et al., PLDI ’95)
  – The precursor of most modern rewriters
  – Targets RISC (SPARC)
  – Provides processor independent abstractions
  – Follow up works
    • UQBT (for RISC)
    • LEEL (for Linux/i386)

Previous Works (Static)
• WISA (U. Wisconsin)
  – Uses EEL for SPARC
  – Uses IDAPro+CodeSurfer for x86
• Etch (U. of Washington) [x86/Windows]
  – Application in performance optimization
  – Does not seem to be active any more
• PLTO/SOLAR (U. Arizona)
  – Linux/x86, but has limitations (e.g., static linking)
• Brew (Stony Brook)
  – Disassembly+RAD implementation
• Various tools for Java
  – BCEL seems most advanced

Previous Works (Dynamic)
• DynInst (U. Maryland, U. Wisconsin)
  – Instrumentation of running programs
  – Provides OS/architecture independent abstractions for instrumentation
• LibVerify (Bell Labs/RST Corp)
  – Runtime rewriting for StackGuard
• DynamoRIO (HP Labs/MIT), Strata (UVA)
  – Disassembles basic blocks at runtime
  – Provides API to hook into this process and transform executable
  – Used in “Program Shepherding” [USENIX Sec ’02]

Most Active Research Groups
• WISA project (Wisconsin)
  – Somesh Jha, Tom Reps
• DynInst project (Wisconsin/Maryland)
  – Barton Miller
• SOLAR project (U. Arizona)
  – Saumya Debray
• DynamoRIO (HP/MIT)
  – Tool available (binary form), Linux/Win32
• Strata (UVA)
  – Tool claimed to be available in source form

Most Promising Tools
• BREW [Stony Brook]
• DynamoRIO
• IDAPro/CodeSurfer (Commercial)
• DynInst is robust, but capabilities limited for our purpose
• Strata may be good, and is supposedly available in source code, but may not be as mature as DynamoRIO

Phases in Static Analysis of Binaries
• Disassembly
• Instruction decoding/understanding
• Insertion of new code

Questions You Might Ask
How many variants does defense require?
  100 by contract, but the more the better
Why design in depth?
  Belt and suspenders, but get multiplicative advantage from multiple randomization, hopefully
How do we assure defense achieves multiplicative effect from multiple stages of randomization?
  From different phenomena: fail-crash (attacker has to retry attack), independence of stages
How do we achieve multiplicative effect within a stage, e.g., IAT randomization?
  E.g., prevent attacker from doing DE for a DLL at a time, e.g., Kernel32
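The "multiplicative advantage" can be made concrete with a small calculation. The sketch below assumes that stages are independent and that each stage picks uniformly among its variants; both are assumptions the slide itself only hopes for, and the stage sizes in the example are made up for illustration.

```python
from fractions import Fraction
from math import prod


def match_probability(variants_per_stage):
    """Chance that one fixed exploit matches a randomly chosen variant.

    Assumes independent stages, each choosing uniformly among its variants,
    so the per-stage probabilities 1/k simply multiply.
    """
    return prod(Fraction(1, k) for k in variants_per_stage)


def expected_susceptible(hosts, variants_per_stage):
    """Expected number of hosts (out of `hosts`) that the exploit still fits."""
    return hosts * match_probability(variants_per_stage)


# Illustrative numbers only: two stages with 10 variants each.
p = match_probability([10, 10])
hit = expected_susceptible(100, [10, 10])
```

With these illustrative stage sizes, p is 1/100 and the expected number of susceptible hosts among 100 is 1. If the stages were correlated, for example if one information leak revealed both layouts, the product would no longer apply.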

Questions You Might Ask (2)
How often does defense re-randomize?
  Depends on cost of re-randomization (down-time), DE, number of variants needed
What is the cost of randomization?
  Low for IAT randomization, low for parameter padding, unknown but determinable for other kinds, e.g., parameter value transformation, return address authentication to prevent system calls from injected code
How does defense do control flow randomization for subtle optimized code?
  Potentially difficult because static analysis of binary code is hard, but will accept only sound transformations
Can attacker do de-randomization in payload?
  Very unlikely

Questions You Might Ask (3)
What if attacker obtains defense’s randomization algorithm and sample randomized code (a known plaintext and ciphertext attack)?
  Through static analysis he might generate all variants, but cannot use them in a payload to compromise more than a fraction of the hosts; this depends on the fail-crash assumption
Can attacker bypass randomization stages?
  Hopefully this is achieved only if kernel is not secure
How do we verify this assertion?
  Careful analysis and lots of testing
Is all this new?
  Builds on existing obfuscation work, but much is new: defense in depth, parameter transformation, random space analysis

Questions You Might Ask (4)
Do we have the staff to investigate the numerous issues posed?
  ?
Do we have a plan for all this?
  Yes!!!

DAWSON Next Steps
• Continue developing automated transforms to break exploits specifications
  – Implement five key transforms and evaluate cost and effectiveness
  – Evaluate alternative annotation approaches and implement
  – Look specifically at non-buffer overflow attacks
• Continue developing automated transforms to break payload specifications
  – Prevent brute force searches
  – Obfuscating code and landmines
• Evaluate integration approaches
  – PE Editor style vs. Brew style (vs. DynamoRIO style)
• Integrate transforms and test
  – Pre-loader preferred
  – Loader hooks, if absolutely required, or DynamoRIO style

Demonstrations Tonight
Thank You!

Backup

Collberg, Thomborson, Low
• First systematic studies of Java code obfuscation
  – Produced taxonomy (layout, control flow, data, and preventative transforms)
  – Low-cost, stealthy opaque constructs
  – Techniques for obscuring data structures and abstractions
  – Measured effectiveness using software complexity metrics

Wang
• Studied malicious host problem to protect trusted probe communicating with trusted host
  – Key threats: impersonation, intelligent tampering, input spoofing, not DoS or random tampering
  – Input spoofing, in general, unsolvable but “If spoofing input x requires solving the algorithm-secrecy or execution-integrity problem, then techniques to ensure the latter can be used to counteract input spoofing. However, there are applications where this is not possible.”
  – Pervasive aliasing enabled proof: precise analysis of transformed program (e.g., CFG) is NP-hard
  – Replacing 50% of branches =>
    • Execution time = 4X
    • Size = 2X
• Wroblewski extended ideas and implemented purely sequential, controllable approach that worked on binary code

Linn and Debray
• Rewrote binaries (IA-32) to disrupt major static disassembly approaches (linear sweep and recursive traversal)
  – Best commercial tools failed on 65% of instructions and 85% of functions
  – Execution times = 1.13X
  – Executable size = 1.15-1.20X
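How a single junk byte can mislead a linear sweep is easy to show on a made-up instruction set. The sketch below is not Linn and Debray's tool: the four opcodes, their lengths, and both disassembler functions are hypothetical. The traversal handles only unconditional jumps, so it is a simplified stand-in for recursive traversal.

```python
# Toy instruction set (hypothetical): opcode -> instruction length in bytes.
NOP, JMP, RET, PUSH = 0x01, 0x02, 0x03, 0x04
LENGTH = {NOP: 1, JMP: 2, RET: 1, PUSH: 2}


def linear_sweep(code):
    """Decode instructions back to back, starting at address 0."""
    starts, pc = [], 0
    while pc < len(code):
        starts.append(pc)
        pc += LENGTH.get(code[pc], 1)
    return starts


def follow_control_flow(code, entry=0):
    """Decode only what execution can reach (unconditional jumps only)."""
    seen, pc = set(), entry
    while 0 <= pc < len(code) and pc not in seen:
        seen.add(pc)
        op = code[pc]
        if op == RET:
            break
        # JMP carries an absolute target in its operand byte.
        pc = code[pc + 1] if op == JMP else pc + LENGTH.get(op, 1)
    return sorted(seen)


# Address 0: JMP 3, address 2: junk byte that looks like PUSH,
# address 3: NOP, address 4: RET.
program = [JMP, 3, PUSH, NOP, RET]
sweep = linear_sweep(program)
flow = follow_control_flow(program)
```

Here the sweep reports instruction starts at 0, 2 and 4: it decodes the junk byte as a PUSH and swallows the real NOP at address 3. The control-flow walk recovers 0, 3 and 4. Defeating recursive traversal as well needs further tricks that hide jump targets, which this toy does not model.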

Digital Rights Management
• Malicious host is key problem in DRM
• White box cryptography approach
• Chow et al.
  – Notwithstanding Barak, can provide useful commercial levels of security
  – Obscured DES and AES algorithms
• Jacob et al.
  – Broke obscured DES but showed general problem of retrieving data from circuits is NP-hard
  – Admitted that, in practice, usually easy
• Link and Neumann improved on Chow

Barak et al.
• Seminal proof showed
  – Impossibility of completely obscuring code
  – No general obfuscator possible
• Badger et al. began to extend Wang’s work
  – Unable to prove minimum resistance time to reverse engineering effort
  – Redirected to review obfuscation work (tour de force report)

Mitigating Vulnerabilities in Code
• Forrest et al. randomized stack-resident data addresses via modified gcc compiler
• Chew and Song randomized stack base address, system call numbers & library entry points via modifying Linux loader and kernel system call table and binary rewriting
• Xu et al. modified Linux kernel to randomize base addresses of program regions
• Approaches still vulnerable to relative address attacks

Forrest et al.
• Scrambled executable (prn), then unscrambled through modified code emulator (x86)
  – Speed = 1.05X
  – Memory usage = 3X
  – Discussed danger of generating valid instruction during scrambling but did not see experimentally
• Kc produced similar results
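The scramble-then-unscramble idea can be illustrated with a byte-wise XOR against a pseudo-random key stream. The sketch reads "prn" as pseudo-random number, which is an assumption, and the function names are hypothetical. Python's random module stands in for the key generator purely for illustration and is not cryptographically secure.

```python
import random


def scramble(code: bytes, seed: int) -> bytes:
    """XOR every code byte with a seeded pseudo-random key stream."""
    rng = random.Random(seed)
    return bytes(b ^ rng.randrange(256) for b in code)


# XOR with the same key stream undoes itself, so the emulator's
# unscrambling step is the same operation with the same seed.
unscramble = scramble

seed = 20050127
original = bytes([0x55, 0x8B, 0xEC, 0x83, 0xEC, 0x10, 0xC9, 0xC3])
stored = scramble(original, seed)    # what sits in memory
executed = unscramble(stored, seed)  # what the emulator actually runs

# Injected bytes were never scrambled, so the emulator turns them
# into unpredictable bytes instead of the attacker's instructions.
injected = bytes([0x90] * 8)
seen_by_cpu = unscramble(injected, seed)
```

Legitimate code round-trips exactly, while injected code is transformed under a key the attacker does not know. The slide's caveat still applies: the transformed bytes could by chance decode as valid instructions.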

Bhatkar et al.
“Key difference between program obfuscation and address obfuscation is that program obfuscation is oriented towards preventing most static analyses of a program, while address obfuscation has a more limited goal of making it impossible to predict the relative or absolute addresses of program code and data. Other analyses, including reverse compilation, extraction of flow graphs, etc., are generally not affected by address obfuscation”
• Focused on memory error exploits
  – Randomized absolute/relative addresses in Linux binary code
  – Approach offered protection against classic attacks
    • Stack smashing, existing code exploits, format string, data modification, heap overflow, double-free, integer overflows
    • Data modification attacks still possible but Etoh and Yoda approach could help

Performance of Bhatkar Transforms

Program    Combination (1)               Combination (2)
           % Overhead   Std. Dev.        % Overhead   Std. Dev.
                        (% of mean)                   (% of mean)
tar        -1           3.4              0            5.2
wu-ftpd    0            1.4              2            2.1
gv         0            6.1              2            2.1
bison      1            2.0              8            2.3
groff      -1           1.1              13           0.7
gzip       -1           1.9              14           2.5
gnuplot    0            0.9              21           1.0

Combination 1: link-time static relocation of stack, heap and code regions with random gaps in stack frames
Combination 2: load-time dynamic relocation of the above

Vulnerabilities and Exploits
• Aleph One, “Smashing The Stack For Fun And Profit”, Phrack 49, Volume Seven, Issue Forty-Nine, File 14 of 16, 11/8/1996
• David Litchfield, “Defeating the Stack-Based Overflow Prevention Mechanism of Microsoft Windows 2003 Server”, NGS Research Whitepaper, August 9, 2003, http://www.nextgenss.com/papers.htm
• Mudge, “How To write buffer overflows”, http://www.insecure.org/stf/mudge_buffer_overflow_tutorial.html, 10/20/1995
• w00, “Heap Overflow”, http://www.w00.org/files/articles/heaptut.txt, 1/1999
• Ryan Permeh, Marc Maiffret, Code Red Disassembly Analysis, eEye Digital Security, http://www.eeye.com/html/advisories/codered.zip
• Stuart Staniford, Nicholas Weaver, Vern Paxson, “Flash Worms: Is there any Hope?” Silicon Defense, retrieved 27 March 2003

Software Fault Tolerance & N-version Programming
• A. Avizienis, “Fault Tolerance and fault intolerance. Complementary approaches to reliable computing”, Proc. 1975 Int. Conf. Reliable Software, Los Angeles, CA, Apr 21-27, 1975, pp. 458-464
• A. Avizienis, “N-Version Approach to fault tolerant Software”, IEEE Trans. Software Eng., vol. SE-11, No. 12, Dec 1985, pp. 1491-1501
• V. Bharathi, “N-Version programming method of Software Fault Tolerance: A Critical Review”, Indian Institute of Technology, Kharagpur 721302, December 28-30, 2003
• L. Chen and A. Avizienis, “N-version programming: A fault-tolerance approach to reliability of software operation,” IEEE 8th FTCS, pp. 3-9, 1978
• J. C. Knight and N. G. Leveson, “A Large Scale Experiment In N-Version Programming”, Digest of Papers FTCS-15: Fifteenth International Symposium on Fault-Tolerant Computing, June 1985, Ann Arbor, MI, pp. 135-139
• J. C. Knight and N. G. Leveson, “An Experimental Evaluation of the Assumption of Independence in Multi-version Programming”, IEEE Transactions on Software Engineering, Vol. SE-12, No. 1 (January 1986), pp. 96-109
• M. R. Lyu, J.-H. Chen, and A. Avizienis, “Software diversity metrics and measurements,” in Proc. Sixteenth Annual Int. Computer Software and Applications Conf., 1992, pp. 69-78

Obfuscation -- Java Code
• C. Collberg, C. Thomborson, and D. Low, “A Taxonomy of Obfuscating Transformations”, Technical Report 148, Department of Computer Science, University of Auckland, July 1997
• C. Collberg, C. Thomborson, and D. Low, “Manufacturing Cheap, Resilient, and Stealthy Opaque Constructs”, Department of Computer Science, University of Auckland, ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL’98), January 1998
• C. Collberg, C. Thomborson, D. Low, “Breaking Abstractions and Unstructuring Data Structures”, Proceedings of the 1998 International Conference on Computer Languages, pages 28-38, IEEE Computer Society Press, May 1998
• Larry D’Anna, Brian Matt, Andrew Reisse, Tom Van Vleck, Steve Schwab, Patrick LeBlanc, “Self-Protecting Mobile Agents Obfuscation Report - Final report,” Network Associates Laboratories, Report #03-015, June 30, 2003
• Lee Badger, Larry D’Anna, Doug Kilpatrick, Brian Matt, Andrew Reisse, Tom Van Vleck, “Self-Protecting Mobile Agents Obfuscation Techniques Evaluation Report,” Network Associates Laboratories, Report #01-036, Nov 30, 2001, updated March 22, 2002
• Douglas Low, Java Control Flow Obfuscation, MS Thesis, Univ. Auckland, 3 June 1998

Obfuscation -- Protecting Software
• Boaz Barak, Oded Goldreich, Russell Impagliazzo, Steven Rudich, Amit Sahai, Salil Vadhan, and Ke Yang, “On the (im)possibility of obfuscating programs,” in J. Kilian, editor, Advances in Cryptology, CRYPTO ’01, Lecture Notes in Computer Science, Springer-Verlag
• Stanley Chow, Philip A. Eisen, Harold Johnson, Paul C. van Oorschot, “A White-Box DES Implementation for DRM Applications”, Digital Rights Management Workshop 2002: 1-15
• S. Chow, P. Eisen, H. Johnson and P. C. van Oorschot, “White-Box Cryptography and an AES Implementation”, Proceedings of the Ninth Workshop on Selected Areas in Cryptography (SAC 2002)
• Matthias Jacob, Dan Boneh, and Edward Felten, “Attacking an obfuscated cipher by injecting faults”, 2002 ACM Workshop on Digital Rights Management, Washington, D.C., 2002
• Hamilton E. Link and William D. Neumann, “Clarifying Obfuscation: Improving the Security of White-Box Encoding”, Sandia National Laboratories, Albuquerque, NM, downloaded from eprint.iacr.org/2004/025.pdf
• Chenxi Wang, “A Security Architecture for Survivability Mechanisms,” Ph.D. thesis, University of Virginia, October 2000
• Chenxi Wang, “Protection of software-based survivability schemes”, in the proceedings of 2001 Dependable Systems and Networks, Gothenburg, Sweden, July 2001
• Gregory Wroblewski, “General Method of Program Code Obfuscation,” Ph.D. Dissertation, Wroclaw University of Technology, Institute of Engineering Cybernetics, 2002
• Gregory Wroblewski, “General Method of Program Code Obfuscation,” 2002 International Conference on Software Engineering Research and Practice (SERP’02), June 24-27, 2002, Monte Carlo Resort, Las Vegas, Nevada, USA
• Cullen Linn, Saumya Debray, “Obfuscation of Executable Code to Improve Resistance to Static Disassembly,” ACM Conference on Computer and Communications Security, Washington DC, October 27-31, 2003

Source Code Transforms to Mitigate Vulnerabilities
• M. Chew, D. Song, “Mitigating Buffer Overflows by Operating System Randomization,” Technical Report CMU-CS-02-197
• Hiroaki Etoh and Kunikazu Yoda, “Protecting from stack smashing attacks”, published on World-Wide Web at URL http://www.trl.ibm.com/projects/security/ssp/main.html, June 2000
• Stephanie Forrest, Anil Somayaji, and David H. Ackley, “Building diverse computer systems,” in 6th Workshop on Hot Topics in Operating Systems, pages 67-72, Los Alamitos, CA, 1997, IEEE Computer Society Press
• Selvin George, David Evans, Steven Marchette, “A Biological Programming Model for Self-Healing”, First ACM Workshop on Survivable and Self-Regenerative Systems (in association with 10th ACM Conference on Computer and Communications Security), October 31, 2003, George W. Johnson Center, George Mason University, Fairfax, VA
• PaX, published on World-Wide Web at URL http://pageexec.virtualave.net, 2001
• Jun Xu, Z. Kalbarczyk and R. K. Iyer, “Transparent Runtime Randomization for Security”, Proc. of 22nd Symposium on Reliable and Distributed Systems (SRDS), Florence, Italy, October 6-8, 2003
• StackGuard, Libverify, RAD, PointGuard, MS C++ compiler
• Peter Silberman and Richard Johnson, “A Comparison of Buffer Overflow Prevention Implementations and Weaknesses”, I-Defense, 1875 Campus Commons Dr., Suite 210, Reston, VA 20191, http://www.blackhat.com/presentations/bh-usa-04/bh-us-04-silberman-paper.pdf

Run-time Transforms to Mitigate Vulnerabilities
• Elena Gabriela Barrantes, David H. Ackley, Stephanie Forrest, Trek S. Palmer, Darko Stefanovic and Dino Dai Zovi, “Randomized instruction set emulation to disrupt binary code injection attacks,” 10th ACM Conference on Computer and Communications Security, Washington DC, October 27-31, 2003
• Sandeep Bhatkar, Daniel C. DuVarney, and R. Sekar, “Address Obfuscation: An Efficient Approach to Combat a Broad Range of Memory Error Exploits,” 12th USENIX Security Symposium, August 2003
• Gaurav S. Kc, Angelos D. Keromytis, Vassilis Prevelakis, “Countering Code-Injection Attacks with Instruction-Set Randomization,” 10th ACM Conference on Computer and Communications Security, Washington DC, October 27-31, 2003

CMU Ballista Study
• Most production-quality operating system and core library code exhibits large numbers of flaws in validating input, call order, etc.
• Specification-driven testing verifies this result.

SANS Top 10: Top Vulnerabilities to Windows Systems
• W1 Web Servers & Services
• W2 Workstation Service
• W3 Windows Remote Access Services
• W4 Microsoft SQL Server (MSSQL)
• W5 Windows Authentication
• W6 Web Browsers
• W7 File-Sharing Applications
• W8 LSAS Exposures
• W9 Mail Client
• W10 Instant Messaging

SANS Top Vulnerabilities to UNIX Systems
• U1 BIND Domain Name System
• U2 Web Server
• U3 Authentication
• U4 Version Control Systems
• U5 Mail Transport Service
• U6 Simple Network Management Protocol (SNMP)
• U7 Open Secure Sockets Layer (SSL)
• U8 Misconfiguration of Enterprise Services NIS/NFS
• U9 Databases
• U10 Kernel