- Number of slides: 82
HACQIT: Hierarchical Adaptive Control of QoS for Intrusion Tolerance. James E. Just, James C. Reynolds, Karl Levitt. 13 February 2001. [Figure labels: To Critical Users, VPN, FW, GW, Switch, Primary Nodes, Backup Nodes, Servers, Sensors, Controls, Decoys/Fishbowls, Monitor & Adapter]
Outline • Team • HACQIT idea • Goals • Architecture • Status • Plans • Current Capabilities • Questions and Issues
HACQIT Team • Teknowledge Corporation – architecture, design, Quorum component modification, monitor/adapter development, integration – J. Just, J. Reynolds, L. Clough, R. Maglich, E. Lawson • UC Davis – attack modeling, sensing and response options – K. Levitt, R. Pandey, F. Wu, J. Rowe, M. Tylutki
The HACQIT Idea • Utilize robust hierarchical control of QoS and other fault tolerance techniques to deliver critical COTS services to critical users while under attack – Significantly raise adversary work factors – Focus on useful military applications – Policy driven • Leverage current and new technologies – QoS/Quorum – DeSiDeRaTa, AQuA, QCS, others – IA&S – wrappers, intrusion and integrity sensors, active monitoring & response, randomization, VPNs, attack modeling (Jigsaw concepts), honeypots – Fault tolerance – separation, diversity, replication & checkpointing, fail-over – Others – out-of-band signaling, etc. • Incrementally deliver capabilities
Project Goals • Prototype HACQIT controlled cluster delivering – 4 hours of intrusion tolerance – Under active Red Team attacks on hosts – Providing policy-determined critical services from COTS/GOTS applications to critical users (also policy determined) at 75% capacity • Extensible model base for Intrusion Tolerance • Focus – COTS HW & SW based for near term utility – Architecture based framework for longer term extensibility (hierarchical and fractal)
HACQIT Scope • Will address – QoS control for critical services to critical users – Hierarchical, extensible object control model – Attacks on host availability and integrity – Variety of COTS/GOTS applications – Policy specification of above • Won’t address – Network infrastructure (e.g., denial of service attacks on bandwidth, routers or LANs) or physical attacks – Integrity of data sources used as inputs, confidentiality – Legitimate but false insider manipulation of application – Developing new sensors or mechanisms, but will leverage them
HACQIT Reference Architecture [Figure labels: HACQIT Protected Enclave; Other Enclaves; WAN; LAN; FW; Out-of-Band Comms Between HACQIT M/A’s & Cyber Panel??; Noise Generator; HACQIT Protected Node (FW, GW, Switch, Monitor & Adapter); Users 1..N; Servers p, q, r. Key: Sensors, Controller, Attacker, Critical User, Non-Critical User, Critical Service, Non-Critical Service]
HACQIT Node Architecture • Monitor-adapter uses out-of-band signaling for complete separation from network attacks on LAN and WAN [Figure labels: To Critical Users, VPN, Out-of-Band Control Pathways, FW, GW, Switch, Primary Nodes, Backup Nodes, Decoys/Fishbowls, Servers, Sensors, Controls, Monitor & Adapter, Communications with other Controllers]
Notional HACQIT Implementation
HACQIT Monitor-Adapter Software Overview [Figure labels: HACQIT Monitor/Adapter; Policies and Specs; HACQIT Controllers; HACQIT Visualization; IT Event Log, Other Clients; To/From Other M/As; Mediators for Integrity & State Sensors, Intrusion Sensors, and Performance Sensors]
Goals of Control (Increasing Difficulty) • Continue critical service – Migrate critical applications – Administer system (e.g., add or remove critical user or critical service) • Gather more information (e.g., refocus sensors, turn on more intrusive sensing, use decoys or fishbowls) • Stop current attack – Note: Control over enclave firewall and critical user protection features needed • Stop future attacks
Coverage: Illustrative Attack Categories & Characteristics
Paradigm for Responding to Integrity Intrusion • REPEAT UNTIL ATTACK SYMPTOMS DISAPPEAR • Detect integrity violation on a critical file • Switchover to backup server; restore prior version of critical file on primary • Use Jigsaw model to determine possible causes and sources of attack • Deploy sensors and responders as determined by model • If attack persists block with responders
Paradigm for Responding to Internal DOS Attack • REPEAT UNTIL ATTACK SYMPTOMS DISAPPEAR • Detect denial of service violation on primary server • Switchover to newly created process on server; kill process causing denial of service • Use Jigsaw model to determine possible causes and sources of attack • Deploy sensors and responders on server and on firewall as determined by model • If attack persists, block with responders
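The two response paradigms above share one control loop. A minimal Python sketch of that loop follows; the function name `respond`, its callback parameters, and the `max_rounds` bound are all hypothetical stand-ins, and `diagnose` only represents a Jigsaw model lookup rather than implementing one.

```python
from typing import Callable, List


def respond(
    symptoms_present: Callable[[], bool],   # assumed sensor query
    switchover: Callable[[], None],         # to backup server or fresh process
    diagnose: Callable[[], List[str]],      # stand-in for the Jigsaw model
    deploy: Callable[[List[str]], None],    # deploy sensors and responders
    block: Callable[[List[str]], None],     # block with responders
    max_rounds: int = 5,                    # assumption: bound the loop
) -> List[str]:
    """Repeat the response steps until attack symptoms disappear."""
    actions: List[str] = []
    for _ in range(max_rounds):
        if not symptoms_present():
            break
        switchover()
        actions.append("switchover")
        causes = diagnose()
        actions.append("diagnose")
        deploy(causes)
        actions.append("deploy")
        if symptoms_present():              # attack persists
            block(causes)
            actions.append("block")
    return actions


# Toy run: the attack stops only once the responders block it.
state = {"attack": True}


def _stop_attack(causes: List[str]) -> None:
    state["attack"] = False


trace = respond(
    symptoms_present=lambda: state["attack"],
    switchover=lambda: None,
    diagnose=lambda: ["unauthorized process"],
    deploy=lambda causes: None,
    block=_stop_attack,
)
print(trace)
```

The loop bound is a design choice of this sketch; the slides only say to repeat until symptoms disappear.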
HACQIT Actions in Responding to Connection Spoofing Attack • Detect change to .rhosts file on primary • Switchover to backup • Restore previous version of primary, which is now the backup • Use Jigsaw model of attacks to identify possible causes of integrity problem – Change is legitimate by “clean” process -- no integrity problem – Change is by an unauthorized process – Change is by a legitimate rcommand – Change is by an unauthorized rcommand
HACQIT response to Connection Spoofing (cont) • HACQIT checks for erroneous processes -- finds none; so conclude change is legitimate or due to an rcommand • HACQIT starts monitoring for rcommands • Attack persists, but now on backup with arrival of rcommand • HACQIT temporarily blocks rcommand until verification • HACQIT monitoring detects symptoms of connection spoofing attack -- sequence number guessing, DOS on a host • If traceback to true source s is possible, connections from s are blocked; otherwise, degraded mode (no rcommands)
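The narrowing of candidate causes on these two slides can be read as hypothesis elimination. The Python sketch below is one hedged interpretation: the `Cause` enum, `narrow_causes`, and `choose_response` are hypothetical names, and the rule that an observed rcommand rules out the clean-process explanation is an assumption of the sketch, not something the slides state.

```python
from enum import Enum
from typing import Optional, Set


class Cause(Enum):
    CLEAN_PROCESS = "legitimate change by a clean process"
    ROGUE_PROCESS = "change by an unauthorized process"
    LEGIT_RCOMMAND = "change by a legitimate rcommand"
    ROGUE_RCOMMAND = "change by an unauthorized rcommand"


def narrow_causes(
    rogue_process_found: bool, rcommand_seen: bool, spoof_symptoms: bool
) -> Set[Cause]:
    """Eliminate candidate causes as monitoring evidence arrives."""
    causes = set(Cause)
    if not rogue_process_found:
        causes.discard(Cause.ROGUE_PROCESS)
    if rcommand_seen:
        # Assumption: the change is attributed to the observed rcommand.
        causes.discard(Cause.CLEAN_PROCESS)
    if rcommand_seen and spoof_symptoms:
        # Sequence number guessing or DOS on a host points to spoofing.
        causes.discard(Cause.LEGIT_RCOMMAND)
    return causes


def choose_response(traced_source: Optional[str]) -> str:
    """Block the true source if traceback worked, else degrade service."""
    if traced_source is not None:
        return f"block connections from {traced_source}"
    return "degraded mode: rcommands disabled"


print(sorted(c.name for c in narrow_causes(False, True, True)))
print(choose_response(None))
```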
Example w/ Capabilities • Example attack composed of multiple concepts and capabilities [Figure labels: Execute Commands; Remote Login; Remote Execution; cat + + >> /.rhosts; Spoofed Connection; RSH Connection Spoof; RSH Active; Packet Spoofing; Spoofed Packet; Forged Src Address; Seq # Probe; Seq. Number Guess; Synflood; Prevent Connection Response]
Coverage: Illustrative Application “Types” What follows is a first cut. It is not completely clear what the minimum set of characterization “axes” is to give HACQIT the necessary robustness. • Human user, client side applications, central storage, e.g., MS Office • Human user, client-server, e.g., web servers or applications • Human user, client-server-database (three tier), e.g., shared planning applications, web based applications • Human user, store and forward applications, e.g., email – sendmail or Exchange • Human user, communication and collaboration applications, e.g., CVW or Odyssey or NetMeeting • Real time, server to server or server to … to server, e.g., radar processing or weapon system controls • Others as necessary
First Round Capabilities Demonstrated • Manual migration • Simulation of attacks resulting in: – Soft reboots – Hard reboots • Simulation of integrity attack (Tripwire) • Simulation of performance degrading attack (cpuhog) – Detect “runaway” host process – Detect QoS degradation • Second round capabilities described in Part 2
What We’ve Done & Status • Second round of “attack & solution space” exploration – Test applications include Dynbench, Apache web server, Notepad (Exchange or sendmail in works) – Refining architecture & design -- HACQIT requirements and component responsibilities being refined – DeSiDeRaTa code conquered (or at least subdued) – Cross project coordination underway • Attack and response modeling begun • New code developed for: – Secure Task Manager and heartbeat monitor – Sensor manager, e.g., Tripwire and wrappers – Response managers, e.g., firewall and auditing – Policy driven wrappers – HACQIT monitor/adapter
Plans • Continue coordination and leveraging activities • Development – Continue design, rapid experimentation (risk reduction), and research efforts through February – Write specification for first real prototype in March – Develop solid prototype for June evaluation – Evaluation – Red Team, informal hacker exposure via internet, other – Go into next cycle • Leverage – SCC – firewall on NIC (ADF) – Draper – Gateway “cleaner” • Other activities – IFIP WG 10 Dependability Benchmarking
Current Demonstration Purpose • Use of policy driven wrapper technology to intercept suspicious calls and initiate failover • Use of Quality of Service manager to effect switchover • Use of diversity among primary and backup to reduce likelihood of renewed attack against backup • Lab capability to test protective measures against actual attacks and mitigate their effects on simulated users
HACQIT Demonstration Configuration • HACQIT primary is NT running Apache and/or Exchange and/or MS Word • HACQIT backup is Linux • Outside the HACQIT cluster firewall are three client workstations: Good and Continuing, two “weak” legitimate user clients, and Bad, the source of malicious attacks • Bad could be inside or outside the Enclave that is protected by the second firewall – Bad is inside the enclave firewall for convenience – Eventually a user will be outside the enclave firewall • Secure channel to the client – Currently VPN is not used – Eventually will use VPNs or IPSec
HACQIT Lab
HACQIT Software Implementation
Current Demonstration Scenario • Users (Good and Continuing) running NT in enclave – Processes on Good and Continuing simulate client demands on Apache – Other processes will simulate demands on Exchange or saving Word files on primary • Automated attack launched from Bad to take over Good and then attack web server on Primary (NT) – Exploit vulnerabilities in Apache or Exchange or Word – Attack executes a program and/or modifies file • Primary attack detected & mitigated by Apache wrapper – Wrapper communicates with the Monitor/Adapter on our out-of-band machine • Monitor/Adapter starts Apache (under Linux) on the backup and tells firewall controller to switch IP address from the primary to the backup • Exchange or Word could start using Wine or VMWare technology
New Issues • Can we capture or redirect client requests so that users really are not interrupted during a migration? • What does this mean for real time? How does Desi do failover without the DynBench applications losing something? Do they stop processing until the connection is reestablished? • How can we save state and migrate an application? Do different types or classes of applications have different requirements? What is HACQIT’s ability to cover these different classes? • Can we add a new user? We might want to disable all sessions and start over with trusted connections
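The failover path in this scenario (wrapper alert, start service on backup, switch the address at the firewall) can be sketched in Python as below. Every class and method name here (`Node`, `FirewallController`, `MonitorAdapter`, `on_wrapper_alert`) is hypothetical and chosen for illustration; the real components communicate over an out-of-band channel that this sketch does not model.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Node:
    name: str
    os: str
    running: Set[str] = field(default_factory=set)

    def start(self, service: str) -> None:
        self.running.add(service)


@dataclass
class FirewallController:
    # Which node currently receives traffic for the service address.
    service_target: str = "primary"

    def switch_to(self, node_name: str) -> None:
        self.service_target = node_name


@dataclass
class MonitorAdapter:
    primary: Node
    backup: Node
    firewall: FirewallController
    events: List[Tuple[str, str]] = field(default_factory=list)

    def on_wrapper_alert(self, service: str, reason: str) -> None:
        """React to a wrapper report: start on backup, then redirect."""
        self.events.append((service, reason))
        self.backup.start(service)
        self.firewall.switch_to(self.backup.name)


primary = Node("primary", "NT", {"apache"})
backup = Node("backup", "Linux")
ma = MonitorAdapter(primary, backup, FirewallController())
ma.on_wrapper_alert("apache", "suspicious call intercepted")
print(ma.firewall.service_target, sorted(backup.running))
```

Starting the service before switching the address keeps the window with no listener as short as possible, which is a choice of this sketch rather than a documented HACQIT ordering requirement.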
Incremental Implementation Approach (I) • Several capability levels are envisioned • Lower levels are specified • Level 0 – Insider attack and simple migration – No firewall on HACQIT cluster – Only critical application is Apache web server (no Microsoft Exchange) – Simulations of user web server activities would be running on both Good and Continuing – Attack from weak client Good against Apache web server on primary – Sense compromise via wrapper integrity checker on Apache which then communicates with Monitor/Adapter (M/A) – M/A migrates Apache web server from NT primary to NT backup
Incremental Implementation Approach (II) • Level 1 – Outsider attack with simple, cross platform migration and increased sensing – Firewall(s) added to HACQIT cluster – Simulations of user web server activities should be added to a legitimate user machine outside the enclave – Bad attacks weak client, Good, to compromise it and set up attack from Good against Apache web server on NT primary – Same sensing and communication by wrapper as above – M/A migrates web server to Linux – M/A turns on increased auditing on firewall/gateway as another response
Incremental Implementation Approach (III) • Level 2 – Uninterrupted Critical User during Above Attack and Migration – Demonstrate uninterrupted user via ARP table change – Note: we may still lose the response to the last user request for web services • Level 3 – Block the Attack – Identify the IP address of the attacker and block attacker at firewall or router, e.g., add blocking command via OPSEC interface or change a rule to shut out attacker’s access to primary – At some point we’d like to know if the attack was from a compromised weak client or an insider – probably not at this capability level
Incremental Implementation Approach (IV) • Level 4 – Multiple Critical Applications – Add mail server (Microsoft Exchange or some mail server that is cross platform) • Level 5 – Critical Application with State – Save state of critical application and migrate • Level 6 – Same Machine Failover – Use wrappers to ensure non-compromise of OS and failover critical application to the same machine (i.e., start up clean application on same Primary and kill attacked process) • Level 7 – Remediation of Compromised Primary • Level 8 – Other Types of Diversity & Use of Decoys • Level 9 – Randomization of Responses • Note: Levels 4-9 are relatively independent and can be done in parallel or in a different order
Architectural Explorations • Major focus of HACQIT is to develop Intrusion Tolerant – oops! – I mean Organically Assured and Survivable architecture • Our levels of capability demonstrations suggest the Subsumption architecture (Brooks, 86) • Brooks developed the architecture for his famous robot projects but many of our requirements are the same – Need certain amount of “stupid” reactive behavior – Need guaranteed fast response • Brooks implemented each layer in his architecture as a deterministic finite state machine with simple I/O • No world model is depended on • Communication from higher to lower levels is done through suppression and injection
Traditional Architecture
Subsumption Architecture
HACQIT Mapping to Subsumption Architecture from Levels 0-2 Capabilities • Pre-Level 0 capability features could be lowest layer like Brooks’ “Avoid” module – Unauthorized process on primary boosts CPU utilization above threshold: kill process, move critical service to backup – Unauthorized modification of file: move critical service to backup • Level 0 capabilities would be second layer – Wrapper intercepts suspicious call: move critical service to backup – Diversity advantage: Backup runs different OS than primary – TCPDump is turned on after suspicious call is intercepted (heightened awareness) • Level 1 would be third layer – Migration is effected without interrupting critical users (change ARP table) • Level 2 would be fourth layer – Source address of attack is identified – Address blocked by change to firewall policy
Higher Levels of Capability • Multiple critical applications • Failover which saves state • Failover on the same machine • Forensics • All may be too complex, long-lived, or require global information in order to implement • We’re looking at DICAM as architecture for these functions
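A rough Python sketch of the four layers listed on this slide follows. It is a deliberate simplification: each layer is a stateless reactive rule applied lowest layer first, and higher layers only add to the lower layers' output, whereas Brooks' architecture uses finite state machines with suppression and injection. The event fields (`kind`, `source`) and all function names are assumptions made for illustration.

```python
from typing import Callable, Dict, List

Event = Dict[str, str]
Layer = Callable[[Event, List[str]], List[str]]

MIGRATE = "move critical service to backup"


def reactive_layer(event: Event, actions: List[str]) -> List[str]:
    """Lowest layer (pre-Level 0): fast, 'stupid' reactions."""
    kind = event.get("kind")
    if kind == "cpu_runaway":
        return actions + ["kill process", MIGRATE]
    if kind == "file_modified":
        return actions + [MIGRATE]
    return actions


def wrapper_layer(event: Event, actions: List[str]) -> List[str]:
    """Second layer: wrapper intercepts a suspicious call."""
    if event.get("kind") == "suspicious_call":
        return actions + [MIGRATE, "turn on TCPDump"]
    return actions


def seamless_layer(event: Event, actions: List[str]) -> List[str]:
    """Third layer: keep critical users uninterrupted during migration."""
    if MIGRATE in actions:
        return actions + ["change ARP table"]
    return actions


def blocking_layer(event: Event, actions: List[str]) -> List[str]:
    """Fourth layer: block an identified source at the firewall."""
    source = event.get("source")
    if actions and source:
        return actions + [f"block {source} at firewall"]
    return actions


LAYERS: List[Layer] = [reactive_layer, wrapper_layer, seamless_layer, blocking_layer]


def decide(event: Event) -> List[str]:
    actions: List[str] = []
    for layer in LAYERS:  # lowest layer first
        actions = layer(event, actions)
    return actions


print(decide({"kind": "suspicious_call", "source": "10.1.1.7"}))
```

Because the lower layers never depend on the higher ones, removing a higher layer from `LAYERS` still leaves a working, less capable responder, which is the property the layering is meant to preserve.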
DICAM Control
Technology Transfer Exposure Opportunity • NSWC (Mike Masters) is technology transfer target for Quorum – Annual demonstration in September – Security and intrusion tolerance are of interest – Willing to discuss inclusion of HACQIT in demonstration – leverages Quorum technologies – Need to start coordination planning in April
Issues • Looking for interested potential users for feedback (PACOM, NSWC, other?) • Need help in getting ACOA server software • Reuse of research prototypes
Thank you. Questions?
Backup 1: Towards a Formal Methodology for Responding to Integrity and DOS Intrusions. Jim Just -- Teknowledge; Karl Levitt, Jeff Rowe, Marcus Tylutki, Nicole Carlson, Steven Templeton, Mark Heckman -- UCD
Paradigm for Responding to Integrity Intrusion • REPEAT UNTIL ATTACK SYMPTOMS DISAPPEAR • Detect integrity violation on a critical file • Switchover to backup server; restore prior version of critical file on primary • Use Jigsaw model to determine possible causes and sources of attack • Deploy sensors and responders as determined by model • If attack persists, block with responders
Paradigm for Responding to Internal DOS Attack • REPEAT UNTIL ATTACK SYMPTOMS DISAPPEAR • Detect denial of service violation on primary server • Switchover to newly created process on server; kill process causing denial of service • Use Jigsaw model to determine possible causes and sources of attack • Deploy sensors and responders on server and on firewall as determined by model • If attack persists, block with responders
Connection Spoofing Attack • Multiple stage • Attacker establishes a TCP connection to a host (server) H exploiting a trust relationship (through .rhosts) between H and some other host H1. • Attack involves – denial of service on H1 – Connection number guessing – Planting a trojan horse on H • Many variants are possible • Detection is assumed to occur when .rhosts file on H is erroneously modified
Scenario Attacks: an example • RSH trust relation: sarte trusts kafka, will execute programs for kafka [Figure hosts: kafka, sarte, spock]
Scenario Attacks: an example • (1) Spock launches synflood attack against kafka [Figure hosts: kafka, sarte, spock]
Scenario Attacks: an example • (2) Spock probes sarte for starting sequence number on RSH port [Figure hosts: kafka, sarte, spock]
Scenario Attacks: an example • (3) Spock sends SYN packet to TCP/RSH on sarte w/ source forged to be kafka [Figure hosts: kafka, sarte, spock]
Scenario Attacks: an example • (4) Sarte sends SYN/ACK to kafka [Figure hosts: kafka, sarte, spock]
Scenario Attacks: an example • (5) Kafka drops packet due to DoS [Figure hosts: kafka, sarte, spock]
Scenario Attacks: an example • (6) Spock sends forged ACK packet to sarte, w/ guessed sequence number. Data in packet, “cat + + >> /.rhosts”, adds “all hosts” to sarte’s .rhosts file [Figure hosts: kafka, sarte, spock]
Scenario Attacks: an example • (7) The attacker rsh’s into sarte as root and installs a sniffer to collect passwords [Figure hosts: kafka, sarte, spock]
Scenario Attacks: an example • (8) Using one of these he telnets into kafka [Figure hosts: kafka, sarte, spock]
Scenario Attacks: an example • (9) Once on kafka, the attacker exploits a buffer overflow in amd to gain root privileges • (10) Attacker then copies credit card number file back to spock [Figure hosts: kafka, sarte, spock]
HACQIT Actions in Responding to Connection Spoofing Attack • Detect change to .rhosts file on primary • Switchover to backup • Restore previous version of primary, which is now the backup • Use Jigsaw model of attacks to identify possible causes of integrity problem – Change is legitimate by “clean” process -- no integrity problem – Change is by an unauthorized process – Change is by a legitimate rcommand – Change is by an unauthorized rcommand
HACQIT response to Connection Spoofing (cont) • HACQIT checks for erroneous processes -- finds none; so conclude change is legitimate or due to an rcommand • HACQIT starts monitoring for rcommands • Attack persists, but now on backup with arrival of rcommand • HACQIT temporarily blocks rcommand until verification • HACQIT monitoring detects symptoms of connection spoofing attack -- sequence number guessing, DOS on a host • If traceback to true source s is possible, connections from s are blocked; otherwise, degraded mode (no rcommands)
Example w/ Capabilities • Example attack composed of multiple concepts and capabilities [Figure labels: Execute Commands; Remote Login; Remote Execution; cat + + >> /.rhosts; Spoofed Connection; RSH Connection Spoof; RSH Active; Packet Spoofing; Spoofed Packet; Forged Src Address; Seq # Probe; Seq. Number Guess; Synflood; Prevent Connection Response]
NFS Mount Attack -- Overview • Certain partitions (directories) of an NFS system running on server H are exported. • An attacker on host Ha performs information gathering commands remotely on H to identify exported partitions and their owners, e.g., user U. • Once having this information, attacker creates an account for U on Ha. • The last step is the erroneous account U mounting an exportable partition.
NFS Mount Attack -- as attack specification 1. rpcinfo -p Target-IP Attacker learns that host H uses an NFS daemon. The preconditions specify that attacker A has remote network access to target host H and that host H has IP address Target-IP. 2. showmount -e Target-IP Attacker learns that host H exports hard disk partition P via NFS. Preconditions deal with IP addresses and exported services not changing from step 1. 3. showmount -a Target-IP Attacker learns that partition P is locally mounted by H; the preconditions of the previous steps are unchanged.
NFS Mount Attack (cont.) 4. finger @Target-IP Attacker learns that user U is currently connected to H and that the ID for user U is Userid. Among the preconditions is that host H provides the finger service. 5. create-account(U, Userid) The precondition assures that the attacker has an account on some host Ha. After this step, there is an account for U on Ha. Note there are alternatives to this step, such as modifying the password file. 6. mount -t Target-Partition /mnt The attacker can now access the directory of U. The preconditions are that A is connected to Ha and that U is the owner of some directory in the exported partition P.
HACQIT response to NFS Mount Attack • HACQIT detects modification to a critical file • Switchover to backup server; restore file on old primary which becomes backup • Through model of NFS determine possible causes of modification: – By a legitimate user – By a legitimate user, but spoofed – … • HACQIT increases monitoring for NFS • Detects a “write” to critical file • Correlates “write” from a user u with information gathering on the server and for user u
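For the defender, the value of writing the attack as a specification is that each step's preconditions show where the chain can be broken. The Python sketch below models the six steps as requires/provides sets in the spirit of the Jigsaw concept template; the `Step` type, the fact strings, and `first_blocked_step` are hypothetical names, and the preconditions are paraphrased from the slides rather than taken from an actual Jigsaw model.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Step:
    name: str
    requires: FrozenSet[str]
    provides: FrozenSet[str]


def step(name: str, requires: Iterable[str], provides: Iterable[str]) -> Step:
    return Step(name, frozenset(requires), frozenset(provides))


REMOTE = "remote access to H"
NFS_MOUNT_ATTACK: List[Step] = [
    step("rpcinfo", [REMOTE], ["knows H runs NFS"]),
    step("showmount -e", [REMOTE, "knows H runs NFS"], ["knows exported partition P"]),
    step("showmount -a", [REMOTE, "knows exported partition P"], ["knows P is mounted"]),
    step("finger", [REMOTE, "H provides finger"], ["knows user U and Userid"]),
    step("create-account", ["account on Ha", "knows user U and Userid"],
         ["account for U on Ha"]),
    step("mount", ["account for U on Ha", "knows exported partition P",
                   "U owns directory in P"], ["access to directory of U"]),
]


def first_blocked_step(
    steps: List[Step], initial_facts: Iterable[str]
) -> Tuple[Set[str], Optional[str]]:
    """Walk the steps in order; report the first one whose preconditions fail."""
    facts = set(initial_facts)
    for s in steps:
        if not s.requires <= facts:
            return facts, s.name
        facts |= s.provides
    return facts, None


everything = [REMOTE, "H provides finger", "account on Ha", "U owns directory in P"]
print(first_blocked_step(NFS_MOUNT_ATTACK, everything)[1])
hardened = [f for f in everything if f != "H provides finger"]
print(first_blocked_step(NFS_MOUNT_ATTACK, hardened)[1])
```

In this toy model, disabling the finger service stops the chain at step 4, although the slides note that alternatives exist for some steps, so a real model would need to cover those variants as well.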
HACQIT response to NFS attack (cont) • Temporarily, HACQIT operates in degraded mode, disallowing “writes” from unauthenticated users • Through Jigsaw model of NFS attacks and NFS vulnerability analysis HACQIT determines “mount” export problem and corrects configuration
Denial of Service Attack • Compromised client launches a “synflood” attack on server • Temporarily, HACQIT blocks all packets from client • HACQIT identifies possible responses to flooding attack – Block packets at firewall from client – Kill half-open connections as they appear – … • HACQIT chooses 1st response, as it is quickly deployed • HACQIT identifies user and processes on client responsible for attack, and disables them
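Choosing the response that deploys fastest can be sketched as a one-line selection over candidate responses. In the Python sketch below, the `Response` type and `pick_fastest` are hypothetical, and the deployment times are made-up illustrative numbers, not measurements from HACQIT.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Response:
    name: str
    deploy_seconds: float  # illustrative estimate only, not measured


CANDIDATES: List[Response] = [
    Response("block packets from client at firewall", 1.0),
    Response("kill half-open connections as they appear", 30.0),
]


def pick_fastest(responses: List[Response]) -> Response:
    """Prefer the response that can be deployed most quickly."""
    return min(responses, key=lambda r: r.deploy_seconds)


print(pick_fastest(CANDIDATES).name)
```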
Backup 2: Selected HACQIT July PI Slides
HACQIT Schedule
Milestones • Year 1 – Applications: Office, Email, Collaboration, Intranet web server – Control: Specification based performance and integrity, Replication and switchover • Year 2 – Applications: Simple planning application, Network-based military planning, Real time application – Control: Detected intrusions, Limited restoration • Options – Integration of new ITS technologies – Automatic generation of integrity monitors – Extensions for diagnosis and recovery • Note that these milestones are more aggressive than the official SOW. Depending on the results of detailed design effort, some adjustments may be necessary
Technology Transfer • Who needs intrusion tolerant server capabilities for critical users and services – user pull – Government -- military and civilian – Commercial -- large corporations, ISPs and others who offer outsourced application services • Development and maintenance organizations for above – Government development efforts (e.g., IO COP, GCCS) – Government ACTDs (e.g., AIDE, ACOA, CINC 21?) – Commercial security product/service providers (including Teknowledge?) -- significant commercialization costs • Mechanisms – Demonstrations – Ongoing communications – Publications – Code availability
Risks and Mitigations. Risks: • Attacks – Against monitoring & control components – Common mode attacks – Active blocking after unknown attack • Accurate and rapid intrusion sensing & avoidance • Backups – Restoration speed (applications & connections) – Corruption (logical isolation from primary) • Overhead for different types of applications • Recovery of the primary server in a timely manner (not a major focus of base program) • Workload to use & maintain. Mitigations: • Diversity (hw, sw, versions, time, etc.) • Restrictions on services • Out of band control system • VPNs among critical users and servers • Wrappers for low level sensing and control • Adaptive control • Design for usability • Randomization of initialization & response • Content monitoring (and selective logging) of input/output streams • Deception (decoys, honeypots, fishbowls, etc.)
JIGSAW Concept Template Extended [abstract] concept
Fault vs. Intrusion Tolerance • Fault tolerance: Fault => Error => Failure – Goal: system avoids failure despite component faults – Process: Error detection => Damage Confinement => Error Recovery => Fault Treatment & Continuation – Building Blocks: Byzantine agreements, synchronized clocks, stable storage, fail-stop processors, etc. • Intrusion tolerance: Attack => Error => Failure – Goal: system avoids failure despite attacks (errors include loss of integrity)
Design Considerations • COTS/GOTS hardware and software • Distributed hierarchical control paradigm for enclave and wide area protection • Separation is key requirement • Anything not explicitly permitted is forbidden • Intrusion resistance to support intrusion tolerance • System boundary includes some protection for and control of weak clients • Keep footprint small -- more active control than redundancy • Focus on known vulnerable areas, e.g., weak client attacks (a la recent attack against Microsoft) • Adaptive responses are key research area
General Use Case • Development & setup: policies, application specs, etc. • Operations: Assume a backup (hot or cold) • Detect a performance problem in critical application – Switchover to backup, increase auditing and sensing levels, determine if cause is an attack, then expunge attack from the primary and block future occurrences of the attack, return • Detect an integrity problem in critical application (including data files), operating system, or other critical process that indicates an undetected intrusion – Switchover to backup, expunge the attack from the primary, block future occurrences of the attack, return • Detect an intrusion: – If intrusion does not constitute a threat to the critical application, then start a procedure to expunge the attack, if necessary, and block future occurrences, return; – If attack threatens the critical application, then switchover to backup, expunge the attack from the primary, block future occurrences of the attack, return
Backup 3: Simple Capability Demonstrations
Experimental Configuration (in Yellow/Blue) • Primary & Backup: Secure Task Monitor, Desi sensor/controller, Dynbench, Apache web server, Notepad (Word), Tripwire • Firewall: Controller, TCPDump • Monitor/Adapter: Desi HACQIT middleware, Integrity monitor (remoted), HCI [Figure labels: Users, WAN, LAN, FW, VPNs, Managed Cluster, Primaries, Backups & Decoys, Servers, Sensors, Monitor & Adapter, Communications with other HACQIT Controllers]
Desiderata Software Architecture [Figure labels: Spec File; HCI; QoS/RM Services; QoS Collector; QoS Monitor; Spec Generator; Meta Spec; Assessment Metrics File; LoadSim UI; User; Name Server; Experiment Generator; Command File; Program Control; DynBench Sub-System; Scenario File; Doctrine File; Startup Daemon]
Desiderata Software Control Flow
Mapping of Desiderata Middleware to Distributed Hardware
Desiderata’s Real-time Path Paradigm [Figure labels: sensors, event, situation assess, engagement, monitor & guide, actuators]
Dynbench Suite of Real-time Paths [Figure labels: Situation Assess path (Sensor, FM, Filter, EDM, ED); Engagement path (AM, Action, Actuator); Monitor & Guide path (MGM, MG); EG; Radar Display; User; Scenario file; Doctrine file; Command file]
Dynbench Subsystems (Paths) • Situation assessment – Filter Manager (FM): receives radar tracks from the sensor and divides them among the filter programs – Filter: correlates point data of track into equations of motion of the track body – Evaluate and Decide Manager (EDM): distributes workload among ED programs – Evaluate and Decide (ED): determines if current position of radar track is within critical region
Dynbench Subsystems (Paths) • Engagement – Action Manager (AM): receives threat tracks from ED and divides them among the action programs – Action: receives threat tracks from AM and commands actuator – Actuator: receives action and executes • Monitor and Guide – Monitor and Guide Manager (MGM): receives threat tracks and interceptors from ED and divides them among the MG programs – Monitor and Guide (MG): receives threat tracks along with interceptors, updates position of interceptor according to the position of the threat track
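Two recurring operations in these paths are a manager dividing tracks among worker programs and Evaluate and Decide testing a track position against a critical region. The Python sketch below illustrates both under stated assumptions: the round-robin split and the circular region are choices made here for illustration, since the slides do not say how Dynbench partitions work or how the region is shaped.

```python
import math
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")
Point = Tuple[float, float]


def divide_round_robin(tracks: Sequence[T], n_workers: int) -> List[List[T]]:
    """Split tracks among worker programs (assumed round-robin policy)."""
    if n_workers < 1:
        raise ValueError("need at least one worker")
    buckets: List[List[T]] = [[] for _ in range(n_workers)]
    for i, track in enumerate(tracks):
        buckets[i % n_workers].append(track)
    return buckets


def in_critical_region(position: Point, center: Point, radius: float) -> bool:
    """Evaluate-and-decide check, assuming a circular critical region."""
    return math.dist(position, center) <= radius


print(divide_round_robin(["t1", "t2", "t3", "t4", "t5"], 2))
print(in_critical_region((3.0, 4.0), (0.0, 0.0), 5.0))
```

`math.dist` requires Python 3.8 or later.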
HACQIT Prototype Architecture [Figure labels: FWC, TCP Dump, STM, Tripwire, HM, SD, Hub, NS, HB, IM, SB, QM]


