5333dcc23d6eabd897670f83bf3f48c8.ppt
- Количество слайдов: 23
Technological Networks: From Empirical Laws to Theory Stephanie Forrest Dept. of Computer Science University of New Mexico http: //cs. unm. edu/~forrest@cs. unm. edu
UNM CS Dept. Profile • 18 Faculty: – 9 tenured (5 full, 4 assoc), 5 assistant, 1 lecturer – 3 openings (2 junior, 1 senior) – Prince of Asturias Endowed Chair in Information Technology – External faculty appointments from other departments and national labs. • Close collaborations with SFI and national labs (SNL and LANL) • Students: – Degrees: BS: ~40, MS: ~35, Ph. D: ~5 – Undergraduate: ~200 majors; BS degrees • >20% Female; >35% Minorities – Graduate: ~120 MS ~80 Ph. D. • >20% Female; 40 -60% Foreign • Funding: (2004 -2005) – Total: $3. 5 M – NSF, DARPA, DOE, NIH, Sandia and Los Alamos
UNM CS Dept. Research • Strongly Interdisciplinary: – Adaptive computation – New paradigms of computing (molecular and quantum computation) – Computational biology and bioinformatics (phylogenetic tree reconstruction, radiology, RNAi) • • Graphics and visualization High-performance computing: – Light-weight distributed operating systems • Automated reasoning and machine learning: – Otter – POMDP • Complex networks: – Provably robust scalable algorithms for P 2 P networks – Phase transitions in NP-complete problems
Themes of Talk • The real world isn't exactly scale free. • Understanding and predicting network structure is important for engineering: – Network properties can be exploited to enhance computer security (computer epidemiology / border gateway protocol) • We lack theory to explain/predict the structure of technological networks: – Preferential attachment isn’t good enough. • Initial steps toward theory.
Technological Networks • Distribute resources: – Energy – Materials – Information • Energy distribution: – Power grids – Gas pipelines • Transportation: – Highways – Airline routes • The Internet: – – Physical connectivity Autonomous systems (AS) World-wide web Social contacts, e. g. , email • Microarchitecture
Network Structure • Network topology affects network properties: – Shortest distance between two nodes – Bisection width – Rate and extent of contagion • Analysis: – Epidemiological models. – The epidemic threshold. • Degree distribution of network: – Scale-free (power law) networks: – Pk = k-c • Controlling infections on scale-free networks: – Random vaccination is ineffective (e. g. , anti-virus software). – Targeted vaccination of highconnectivity nodes.
Example 1: Computer Epidemiology Justin Balthrop, Mark Newman, Matt Williamson • Viruses and worms spread over networks of contacts between computers: – Email address books. – URL links. • Different types of networks are exploited by different types of infections. UNM CS Dept. Network of Address Books
Degree Distributions of Four Networks Relevant to Computer Security Science 304: 527 -529 (2004) – Not scale-free. – Targeted vaccination unlikely to be effective: • • > 10% of nodes required for address book data > 87% of nodes required for email traffic data – Computer infections can choose their own topology, so network topology is not static. – Viruses spread faster than repairs.
Throttling: Generic Control of Epidemics Matt Williamson and Justin Balthrop • Control network topology in time rather than space. – Limit the rate at which a computer can make new connections – Limits spreading rates rather than stopping. • Assumes that virus traffic is significantly different from normal traffic: – Nimbda infects up to 400 new machines per second compared to normal rate of connections to new Web servers of about one connection per second or slower (Williamson, 2002). – Throttling Nimbda would have increased the epidemic time from one day to over one year. • Advantages: – Effective when the form of infection not known in advance. – Reduces amount of traffic generated during an epidemic. • Limitations: – Effectiveness decreased if not all nodes are throttled (altruism), stealth attacks. • Implementations: – RIOT – HP’s Virus Throttle
Responsive I/O Throttle (RIOT) Justin Balthrop and Matt Williamson • Vision: An adaptive desktop firewall: – Hamper worms, viruses, DOS, misconfigurations, etc. – Graduated automated response (throttling) on all connections. – Adaptive (learn normal). – Robust to false positives • Personal desktop firewalls: – Coarse-grained – Static and preprogrammed • Generic rate limit (throttle): – Delay network connections that occur at an anomalous rate. – Detector activation triggers delay and packets are dropped. IP Address 1 32 bits IP Address 2 32 bits • How to detect anomalous connections? – Set of detectors (lymphocytes) that observe TCP connections: • • Data Path (below) Meta-information (direction, TCP flags) – Each detector matches some portion of IP space. – Each detector has its own normal activity level. TCP Port 1 16 bits TCP Port 2 16 bits
Learning and Throttling Connections 1. A new connection is initiated (SYN Packet) 2. The connection information is translated to a bitstring 3. The bitstring is shown to all detectors 4. An immature detector exceeds its activation threshold 5. The detector’s activation threshold is raised 6. Detectors with stable activation thresholds become mature
Autonomic Responses: A Repertoire LISYS (Steven Hofmeyr) E-mail system administrator when there is an anomalous network connection. Personal Desktop Firewall Block network connections based on user -specified policy. p. H Slow down system calls for anomalous processes. (Anil Somayaji) Throttling (Matt Williamson) RIOT (Justin Balthrop) PGBGP (Josh Karlin) Limit the overall rate of new network connections for a computer. Limit rate of all anomalously behaving TCP connections. Adaptive. Limit the rate at which new BGP routes are adopted.
Example 2: Inter-Domain Routing Josh Karlin, Stephanie Forrest, and Jennifer Rexford • • ~20, 000 Autonomous Systems (ASs) connected via the Border Gateway Protocol (BGP) ASs route blocks of IP’s, know as prefixes – 64. 106. 0. 0/17 (Owned by UNM) • • ~170, 000 prefixes owned by ASs today. Border gateway protocol (BGP): – Tell neighbors about new routes (Announcements). – Tell neighbors about old routes gone bad (Withdrawals). – That’s it.
BGP Networks are Interesting and Important • Distributed: – Nodes are AUTONOMOUS systems. – No centralized routing information (routes stored and maintained locally). – No authentication of new nodes or routes. • Dynamic: – Network connectivity changes routinely and continually. – Network updates are spread through local contact (BGP). • Confluence of technological and economic constraints: – A “policy” network as well as a routing network. • All inter-domain Internet traffic relies upon BGP. • Vulnerable: – Trivial to inject false routing information into network. – Man-in-the-Middle attacks. • Pretty Good BGP (PGBGP) and Internet Alert Registry (IAR). – Throttle the adoption of new routes.
PGBGP Algorithm • Main Idea: Delay Suspicious Routes – Lower the preference of suspicious routes (24 hr) • Detection: – Monitor BGP update messages – Treat origin ASs for a prefix seen within the past few days as normal – Treat new origin ASs as suspicious for 24 hours, then accept as normal (possible prefix hijack) – Treat new sub-prefixes as suspicious for 24 hours, then accept as normal (possible sub-prefix hijack) • Response: – Suspicious origin AS routes are temporarily given low local preference – Suspicious sub-prefixes are temporarily ignored (not forwarded to)
PGBGP Advantages • Incremental deployability – No change to BGP protocol, just to path selection – Immediate benefits to adopting AS and customers • Automated and immediate response – Avoid using and propagating the bogus route – Network has chance to stop the attack before it spreads • Robust to false positives – Lowering preference for suspicious routes – No loss of reachability – Accidental short-term delays do no harm • Offline investigation of suspicious route – Internet Alert Registry, active probing, … • Adaptive, simple
Incremental Deployment Prefix Hijack Subprefix Hijack ICNP (2006) • Limitations: – Doesn’t address path spoofing, redistribution attacks. – Negligence
BGP Network Structure Petter Holme and Josh Karlin • Barabasi-Albert (BA) model (Barabasi and Albert, 1999): – Vertices and edges added iteratively – Probability of attaching to vertex i is proportional to k(i) • Inet model (Winick and Jamin, 2000) – No simple growth principle – Generate random graph with known degree distribution – Augment to mimic additional known correlations (e. g. , connecting all high-degree nodes) Proc. Royal Acad. A (in press)
Tentative Conclusions • Real AS graph is more heterogeneous than can be expected from degree distribution alone: – – Core providers in the low-d tail Peak at d=3 (vertices directly connected to the core) Second peak at d=4 (vertices directly connected to d=3 nodes) More structure in periphery than predicted by earlier models. • Preferential attachment is a poor model for network growth. • What constraints determine network architecture and growth?
Allometric Scaling in Biology • Dominant design constraint: – Distribute resources to every cell in the organism • • • Internal, space-filling, hierarchical networks (vascular system) Invariant terminal units (capillaries) Optimality (minimize transport time, maximize metabolism)
Examples: Scaling in Software H. Inoue Hello. World: Unique Function Calls vs. Invocation Freq Lime. Wire Behavior
Example: Scaling in Software Dave Ackley and Terry Van Belle • How to measure evolvability? – Likelihood (how likely is a location to change). – Impact (change in one location affects other locations). • Work = [Likelihood x Impact]: – Software maintenance costs. – Expected time to evolve. • Study long-lived Java code bases: – Code change sizes and frequencies are power law ish. Van Belle, 2004
A Theory of Network Scaling: Conclusions • Why it’s important – Networking infrastructure continues to expand by orders of magnitude: • NSF GENI project: Redesign the Internet from the ground up. • Interplanetary networking. – Architecture: • Move from performance-oriented to power-aware designs. • The end of silicon. – Scaling problems in software, security problems everywhere • Why it’s hard – Terminal units aren’t necessarily invariant sized – Dimensionality and geometry are not obvious – May require new mathematics


