
- Number of slides: 105
CMU SCS: Graph Mining. Christos Faloutsos, CMU. iCAST, Jan. 09, C. Faloutsos
Thank you! • Prof. Hsing-Kuo Kenneth Pao • Eric, Morgan, Ian, Teenet
Outline • Problem definition / Motivation • Static & dynamic laws; generators • Tools: CenterPiece graphs; Tensors • Other projects (Virus propagation, e-bay fraud detection) • Conclusions
Motivation • Data mining: ~ find patterns (rules, outliers) • Problem#1: What do real graphs look like? • Problem#2: How do they evolve? • Problem#3: How to generate realistic graphs? • Tools: – Problem#4: Who is the ‘master-mind’? – Problem#5: Track communities over time
Problem#1: Joint work with Dr. Deepayan Chakrabarti (CMU/Yahoo R.L.)
Graphs – why should we care? • Intrusion detection – who-contacts-whom [Figure: source vs. destination traffic plots; normal traffic and abnormal traffic]
Graphs – why should we care? [Figures: Internet Map [lumeta.com]; Food Web [Martinez ’91]; Protein Interactions [genomebiology.com]; Friendship Network [Moody ’01]]
Graphs – why should we care? • IR: bi-partite graphs (doc-terms) [Figure: documents D1 ... DN linked to terms T1 ... TM] • web: hyper-text graph • ... and more:
Graphs – why should we care? • network of companies & board-of-directors members • ‘viral’ marketing • web-log (‘blog’) news propagation • computer network security: email/IP traffic and anomaly detection • ...
Problem #1 – network and graph mining • What does the Internet look like? • What does the web look like? • What is ‘normal’/‘abnormal’? • Which patterns/laws hold?
Graph mining • Are real graphs random?
Laws and patterns • Are real graphs random? • A: NO!! – Diameter – in- and out-degree distributions – other (surprising) patterns
Solution#1 • Power law in the degree distribution [SIGCOMM 99] [Figure: internet domains, log(degree) vs. log(rank); slope -0.82; labeled points: att.com, ibm.com]
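The rank-degree power law above can be checked with a straight-line fit in log-log space. The following is only a minimal sketch, assuming NumPy is available; the function names and the synthetic degree sequence are illustrative and do not come from the slides or the SIGCOMM 99 data.

```python
import numpy as np

def fit_loglog_slope(x, y):
    """Least-squares slope and intercept of log10(y) versus log10(x)."""
    lx = np.log10(np.asarray(x, dtype=float))
    ly = np.log10(np.asarray(y, dtype=float))
    slope, intercept = np.polyfit(lx, ly, 1)
    return slope, intercept

def rank_exponent(degrees):
    """Sort degrees in decreasing order and fit degree ~ rank**slope."""
    d = np.sort(np.asarray(degrees, dtype=float))[::-1]
    d = d[d > 0]  # zero-degree nodes cannot be placed on a log axis
    ranks = np.arange(1, len(d) + 1)
    slope, _ = fit_loglog_slope(ranks, d)
    return slope

# Synthetic (made-up) degrees that follow rank**-0.82 exactly.
synthetic = 1000.0 * np.arange(1, 501, dtype=float) ** -0.82
print(round(rank_exponent(synthetic), 2))
```

On real data the fit is noisier, and a plain least-squares line is only one of several ways to estimate the exponent.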
Solution#1’: Eigen Exponent E • A2: power law in the eigenvalues of the adjacency matrix [Figure: eigenvalue vs. rank of decreasing eigenvalue, May 2001; exponent = slope, E = -0.48]
Solution#1’: Eigen Exponent E • [Mihail, Papadimitriou ’02]: slope is ½ of rank exponent [Figure: eigenvalue vs. rank of decreasing eigenvalue, May 2001; exponent = slope, E = -0.48]
But: How about graphs from other domains?
The Peer-to-Peer Topology [Jovanovic+] • Count versus degree • Number of adjacent peers follows a power law
More power laws: citation counts (citeseer.nj.nec.com, 6/2001) [Figure: log(count) vs. log(#citations); labeled point: Ullman]
More power laws: • web hit counts [w/ A. Montgomery] [Figure: Web Site Traffic, log(count) vs. log(in-degree); users, sites; Zipf; labeled point: “ebay”]
epinions.com • who-trusts-whom [Richardson + Domingos, KDD 2001] [Figure: count vs. (out) degree; labeled point: trusts-2000-people user]
Motivation • Data mining: ~ find patterns (rules, outliers) • Problem#1: What do real graphs look like? • Problem#2: How do they evolve? • Problem#3: How to generate realistic graphs? • Tools: – Problem#4: Who is the ‘master-mind’? – Problem#5: Track communities over time
Problem#2: Time evolution • with Jure Leskovec (CMU/MLD) • and Jon Kleinberg (Cornell – sabb. @ CMU)
Evolution of the Diameter • Prior work on Power Law graphs hints at slowly growing diameter: – diameter ~ O(log N) • What is happening in real data?
Evolution of the Diameter • Prior work on Power Law graphs hints at slowly growing diameter: – diameter ~ O(log N) • What is happening in real data? • Diameter shrinks over time
Diameter – ArXiv citation graph • Citations among physics papers • 1992 – 2003 • One graph per year [Figure: diameter vs. time [years]]
Diameter – “Autonomous Systems” • Graph of Internet • One graph per day • 1997 – 2000 [Figure: diameter vs. number of nodes]
Diameter – “Affiliation Network” • Graph of collaborations in physics – authors linked to papers • 10 years of data [Figure: diameter vs. time [years]]
Diameter – “Patents” • Patent citation network • 25 years of data [Figure: diameter vs. time [years]]
Temporal Evolution of the Graphs • N(t) … nodes at time t • E(t) … edges at time t • Suppose that N(t+1) = 2 * N(t) • Q: what is your guess for E(t+1) =? 2 * E(t)
Temporal Evolution of the Graphs • N(t) … nodes at time t • E(t) … edges at time t • Suppose that N(t+1) = 2 * N(t) • Q: what is your guess for E(t+1) =? 2 * E(t) • A: over-doubled! – But obeying the “Densification Power Law”
Densification – Physics Citations • Citations among physics papers • 2003: – 29,555 papers, 352,807 citations [Figure: E(t) vs. N(t); slope: ??]
Densification – Physics Citations • Citations among physics papers • 2003: – 29,555 papers, 352,807 citations [Figure: E(t) vs. N(t); slope 1.69]
Densification – Physics Citations • Citations among physics papers • 2003: – 29,555 papers, 352,807 citations [Figure: E(t) vs. N(t); slope 1.69; reference slope 1: tree]
Densification – Physics Citations • Citations among physics papers • 2003: – 29,555 papers, 352,807 citations [Figure: E(t) vs. N(t); slope 1.69; reference slope 2: clique]
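The densification power law says E(t) grows as N(t)^a with an exponent a between 1 (tree) and 2 (clique), so doubling the nodes multiplies the edges by 2^a. Here is a minimal sketch of fitting a and of that arithmetic, assuming NumPy; the function names and the snapshot sizes are made up for illustration and are not the arXiv measurements.

```python
import numpy as np

def densification_exponent(nodes, edges):
    """Slope of log E(t) versus log N(t), i.e. a in E ~ N**a."""
    slope, _ = np.polyfit(np.log(nodes), np.log(edges), 1)
    return slope

def edge_growth_factor(a, node_factor=2.0):
    """Factor by which edges grow when nodes grow by node_factor."""
    return node_factor ** a

# Synthetic (made-up) snapshots that obey E = 0.01 * N**1.69 exactly.
n = np.array([1000, 2000, 4000, 8000], dtype=float)
e = 0.01 * n ** 1.69

a = densification_exponent(n, e)
print(round(a, 2))                      # fitted exponent
print(round(edge_growth_factor(a), 2))  # edge growth when nodes double
```

With a = 1.69, doubling the nodes multiplies the edges by roughly 3.2 rather than 2, which is the "over-doubled" answer on the slide.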
Densification – Patent Citations • Citations among patents granted • 1999: – 2.9 million nodes – 16.5 million edges • Each year is a datapoint [Figure: E(t) vs. N(t); slope 1.66]
Densification – Autonomous Systems • Graph of Internet • 2000: – 6,000 nodes – 26,000 edges • One graph per day [Figure: E(t) vs. N(t); slope 1.18]
Densification – Affiliation Network • Authors linked to their publications • 2002: – 60,000 nodes • 20,000 authors • 38,000 papers – 133,000 edges [Figure: E(t) vs. N(t); slope 1.15]
Motivation • Data mining: ~ find patterns (rules, outliers) • Problem#1: What do real graphs look like? • Problem#2: How do they evolve? • Problem#3: How to generate realistic graphs? • Tools: – Problem#4: Who is the ‘master-mind’? – Problem#5: Track communities over time
Problem#4: MasterMind – ‘CePS’ • w/ Hanghang Tong, KDD 2006 • htong
Center-Piece Subgraph (CePS) • Given Q query nodes • Find Center-piece ( ) • App. – Social Networks – Law Enforcement, … • Idea: – Proximity -> random walk with restarts
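The proximity idea can be illustrated with a small random-walk-with-restart (RWR) iteration. This is a sketch under stated assumptions, not the CePS implementation: it assumes NumPy, an undirected graph with no isolated nodes, a restart probability of 0.15 chosen arbitrarily, and it combines the per-query scores by a plain product as one way to model an AND query. All function names are hypothetical.

```python
import numpy as np

def rwr(adj, seed, restart=0.15, iters=1000, tol=1e-12):
    """Steady-state visiting probabilities of a walk restarting at `seed`."""
    a = np.asarray(adj, dtype=float)
    col_sums = a.sum(axis=0)
    col_sums[col_sums == 0] = 1.0      # guard against division by zero
    w = a / col_sums                   # column-normalized transition matrix
    e = np.zeros(a.shape[0])
    e[seed] = 1.0
    r = e.copy()
    for _ in range(iters):
        r_next = (1.0 - restart) * (w @ r) + restart * e
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r

def and_query_scores(adj, queries, restart=0.15):
    """Assumed AND combination: product of RWR scores over all query nodes."""
    scores = np.ones(np.asarray(adj).shape[0])
    for q in queries:
        scores *= rwr(adj, q, restart)
    return scores

# Toy path graph 0-1-2-3-4 with query nodes at both ends (illustrative only).
path = np.zeros((5, 5))
for u in range(4):
    path[u, u + 1] = path[u + 1, u] = 1.0
print(and_query_scores(path, [0, 4]))
```

Nodes that score high for every query node are candidates for the center-piece; the soft variants named later in the talk (K_SoftAnd) relax this to "close to at least k of the queries" and are not shown here.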
Case Study: AND query [Figure: query nodes R. Agrawal, Jiawei Han, V. Vapnik, M. Jordan]
Case Study: AND query
Conclusions • Q1: How to measure the importance? • A1: RWR + K_SoftAnd • Q2: How to do it efficiently? • A2: Graph Partition (Fast CePS) – ~90% quality – 150x speedup (ICDM’06, b.p. award)
Outline • Problem definition / Motivation • Static & dynamic laws; generators • Tools: CenterPiece graphs; Tensors • Other projects (Virus propagation, e-bay fraud detection) • Conclusions
Motivation • Data mining: ~ find patterns (rules, outliers) • Problem#1: What do real graphs look like? • Problem#2: How do they evolve? • Problem#3: How to generate realistic graphs? • Tools: – Problem#4: Who is the ‘master-mind’? – Problem#5: Track communities over time
Tensors for time evolving graphs • [Jimeng Sun+ KDD’06] • [ “ , SDM’07] • [CF, Kolda, Sun, SDM’07 tutorial]
Social network analysis • Static: find community structures [Figure: Authors × Keywords matrix; labels: 1990, DB]
Social network analysis • Static: find community structures [Figure: Authors matrices stacked by year; labels: 1990, 1991, 1992, DB]
Social network analysis • Static: find community structures • Dynamic: monitor community structure evolution; spot abnormal individuals; abnormal time-stamps
Application 1: Multiway latent semantic indexing (LSI) • Projection matrices specify the clusters • Core tensors give cluster activation level [Figure: authors × keyword tensor, 1990 to 2004, with projection matrices U_authors and U_keyword; labels: DB, DM, Pattern, Query, Philip Yu, Michael Stonebraker]
Bibliographic data (DBLP) • Papers from VLDB and KDD conferences • Construct 2nd order tensors with yearly windows with
Multiway LSI
Group | Authors | Keywords | Year
DB | michael carey, michael stonebraker, h. jagadish, hector garcia-molina | queri, parallel, optimization, concurr, objectorient | 1995
DB | surajit chaudhuri, mitch cherniack, michael stonebraker, ugur cetintemel | distribut, systems, view, storage, servic, process, cache | 2004
DM | jiawei han, jian pei, philip s. yu, jianyong wang, charu c. aggarwal | streams, pattern, support, cluster, index, gener, queri | 2004
• Two groups are correctly identified: Databases and Data mining • People and concepts are drifting over time
Network forensics • Directional network flows • A large ISP with 100 POPs, each POP 10 Gbps link capacity [Hotnets 2004] – 450 GB/hour with compression • Task: Identify abnormal traffic pattern and find out the cause (with Prof. Hui Zhang and Dr. Yinglian Xie) [Figure: source vs. destination traffic plots; normal traffic and abnormal traffic]
MDL mining on time-evolving graph (Enron emails) • GraphScope [w. Jimeng Sun, Spiros Papadimitriou and Philip Yu, KDD’07]
Conclusions • Tensor-based methods (WTA/DTA/STA): • spot patterns and anomalies on time evolving graphs, and • on streams (monitoring)
Motivation • Data mining: ~ find patterns (rules, outliers) • Problem#1: What do real graphs look like? • Problem#2: How do they evolve? • Problem#3: How to generate realistic graphs? • Tools: – Problem#4: Who is the ‘master-mind’? – Problem#5: Track communities over time
Outline • Problem definition / Motivation • Static & dynamic laws; generators • Tools: CenterPiece graphs; Tensors • Other projects (e-bay fraud detection, blogs, weighted graphs) • Conclusions
E-bay Fraud detection • w/ Polo Chau & Shashank Pandit, CMU
E-bay Fraud detection • lines: positive feedbacks • would you buy from him/her?
E-bay Fraud detection • lines: positive feedbacks • would you buy from him/her? • or him/her?
E-bay Fraud detection – NetProbe
Outline • Problem definition / Motivation • Static & dynamic laws; generators • Tools: CenterPiece graphs; Tensors • Other projects (e-bay fraud detection, blogs, weighted graphs) • Conclusions
Blog analysis • with Mary McGlohon (CMU) • Jure Leskovec (CMU) • Natalie Glance (now at Google) • Mat Hurst (now at MSR) [SDM’07]
Cascades on the Blogosphere [Figure: Blogosphere (blogs B1 to B4 + posts); Blog network (links among blogs); Post network (links among posts a to e)] • Q1: popularity-decay of a post? • Q2: degree distributions?
Q1: popularity over time • Post popularity drops off – exponentially? [Figure: # in links vs. days after post]
Q1: popularity over time • Post popularity drops off – exponentially? POWER LAW! Exponent? [Figure: # in links (log) vs. days after post (log)]
Q1: popularity over time • Post popularity drops off – exponentially? POWER LAW! Exponent? -1.6 (close to -1.5: Barabasi’s stack model) [Figure: # in links (log) vs. days after post (log); slope -1.6]
Q2: degree distribution • 44,356 nodes, 122,153 edges. Half of blogs belong to largest connected component. [Figure: count vs. blog in-degree; shape: ??]
Q2: degree distribution • 44,356 nodes, 122,153 edges. Half of blogs belong to largest connected component. [Figure: count vs. blog in-degree]
Q2: degree distribution • 44,356 nodes, 122,153 edges. Half of blogs belong to largest connected component. • in-degree slope: -1.7 • out-degree: -3 • ‘rich get richer’ [Figure: count vs. blog in-degree]
Outline • Problem definition / Motivation • Static & dynamic laws; generators • Tools: CenterPiece graphs; Tensors • Other projects (e-bay fraud detection, blogs, weighted graphs) • [work in progress: iCAST data analysis] • Conclusions
Joint work with • Leman Akoglu www.andrew.cmu.edu/~lakoglu • Mary McGlohon www.cs.cmu.edu/~mmcgloho • Thanks to Eric, Morgan, Ian, Teenet, for providing a copy of the dataset
Summary of findings • Who-contacts-whom graph follows old (and new, surprising) patterns • Web servers stand out, though • we are packaging all these tools for open-source release: – ADAGE www.cs.cmu.edu/~mmcgloho/pubs/ADAGE.tar.gz – OddBall (under development)
Shrinking diameter • No surprise: shrinking diameter, as ‘expected’
Size of GCC, NLCC • GCC size grows over time, of course. • What is your guess about the size of the rest, say the 2nd CC? • shrinks? • grows? • stays the same?
Size of GCC, NLCC • A: OSCILLATES! • No ‘surprise’ either: typical behavior of graphs [McGlohon+, KDD’08]
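To watch the sizes of the giant connected component (GCC) and the next-largest components over time, one needs the component sizes of each snapshot. A minimal union-find sketch follows; the function name and the toy edge list are illustrative assumptions, not part of the ADAGE tools.

```python
from collections import Counter

def component_sizes(num_nodes, edges):
    """Sizes of all connected components, largest first (undirected graph)."""
    parent = list(range(num_nodes))

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv

    counts = Counter(find(x) for x in range(num_nodes))
    return sorted(counts.values(), reverse=True)

# Toy snapshot: one component of 4 nodes, one of 2, one isolated node.
sizes = component_sizes(7, [(0, 1), (1, 2), (2, 3), (4, 5)])
print(sizes)  # [4, 2, 1]
```

Calling this on growing prefixes of a time-ordered edge list gives one GCC size (first entry) and one second-component size (second entry) per time step.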
Packets over time • # Packets over time: bursty behavior, with daily periodicity
Densification law 1. Densification law: obeyed (with slope ~1: tree-like (!)) 2. ‘hours’ plot has a sudden gap @ 20-80 nodes
‘Weight’ power law • Weight: super-linear on number of edges – also ‘expected’: • the more contacts you have, • the even more packets you send! • Slope 1.3, 1.15; plateau @ 100 edges
OVERALL CONCLUSIONS • Graphs pose a wealth of fascinating problems • self-similarity and power laws work, when textbook methods fail! • New patterns (shrinking diameter!) • SVD / tensors / RWR: valuable tools • Intrusion detection: closely related – ADAGE, OddBall
References • L. Akoglu, M. McGlohon, C. Faloutsos. RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs. IEEE ICDM, Pisa, Italy, Dec. 2008.
References • Jure Leskovec, Jon Kleinberg and Christos Faloutsos. Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations. KDD 2005, Chicago, IL ("Best Research Paper" award). • Jure Leskovec, Deepayan Chakrabarti, Jon Kleinberg, Christos Faloutsos. Realistic, Mathematically Tractable Graph Generation and Evolution, Using Kronecker Multiplication. ECML/PKDD 2005, Porto, Portugal, 2005.
References • Jure Leskovec and Christos Faloutsos. Scalable Modeling of Real Graphs using Kronecker Multiplication. ICML 2007, Corvallis, OR, USA.
References • M. McGlohon, L. Akoglu, C. Faloutsos. Weighted Graphs and Disconnected Components: Patterns and a Generator. ACM SIGKDD, Las Vegas, NV, USA, Aug. 2008.
References • Shashank Pandit, Duen Horng (Polo) Chau, Samuel Wang and Christos Faloutsos. NetProbe: A Fast and Scalable System for Fraud Detection in Online Auction Networks. WWW 2007, Banff, Alberta, Canada, May 8-12, 2007. • Jimeng Sun, Dacheng Tao, Christos Faloutsos. Beyond Streams and Graphs: Dynamic Tensor Analysis. KDD 2006, Philadelphia, PA.
References • Jimeng Sun, Yinglian Xie, Hui Zhang, Christos Faloutsos. Less is More: Compact Matrix Decomposition for Large Sparse Graphs. SDM, Minneapolis, Minnesota, Apr. 2007. • Jimeng Sun, Spiros Papadimitriou, Philip S. Yu, and Christos Faloutsos. GraphScope: Parameter-free Mining of Large Time-evolving Graphs. ACM SIGKDD Conference, San Jose, CA, August 2007.
References • Hanghang Tong, Christos Faloutsos, and Jia-Yu Pan. Fast Random Walk with Restart and Its Applications. ICDM 2006, Hong Kong. • Hanghang Tong, Christos Faloutsos. Center-Piece Subgraphs: Problem Definition and Fast Solutions. KDD 2006, Philadelphia, PA. • Hanghang Tong, Brian Gallagher, Christos Faloutsos, and Tina Eliassi-Rad. Fast Best-Effort Pattern Matching in Large Attributed Graphs. KDD 2007, San Jose, CA.
Contact info: www.cs.cmu.edu/~christos (w/ papers, datasets, code, etc.)
Extra: Graph Generators
Problem#3: Generation • Given a growing graph with count of nodes N1, N2, … • Generate a realistic sequence of graphs that will obey all the patterns
Problem Definition • Given a growing graph with count of nodes N1, N2, … • Generate a realistic sequence of graphs that will obey all the patterns – Static Patterns: Power Law Degree Distribution; Power Law eigenvalue and eigenvector distribution; Small Diameter – Dynamic Patterns: Growth Power Law; Shrinking/Stabilizing Diameters
Problem Definition • Given a growing graph with count of nodes N1, N2, … • Generate a realistic sequence of graphs that will obey all the patterns • Idea: Self-similarity – Leads to power laws – Communities within communities – …
Kronecker Product – a Graph [Figure: intermediate stage; adjacency matrices]
Kronecker Product – a Graph • Continuing to multiply with G1, we obtain G4 and so on … [Figure: G4 adjacency matrix]
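Repeated Kronecker multiplication of a small seed adjacency matrix can be sketched with NumPy's `np.kron`. The 3-node seed G1 below (a chain with self-loops) is an assumption chosen for illustration, since the slide's own G1 did not survive extraction; `kronecker_power` is a hypothetical helper name.

```python
import numpy as np

def kronecker_power(g1, k):
    """k-th Kronecker power of the adjacency matrix g1 (k >= 1)."""
    result = np.asarray(g1)
    for _ in range(k - 1):
        result = np.kron(result, g1)
    return result

# Assumed seed graph: 3-node chain with self-loops (7 nonzero entries).
G1 = np.array([[1, 1, 0],
               [1, 1, 1],
               [0, 1, 1]])

G4 = kronecker_power(G1, 4)
print(G4.shape)       # (81, 81): the node count is 3**4
print(int(G4.sum()))  # 2401: the nonzero-entry count is 7**4
```

For this seed the node count grows as 3^k and the nonzero entries as 7^k, so entries scale as nodes^(log 7 / log 3), about nodes^1.77, which is how the construction ties into the densification law.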
Properties: • We can PROVE that – Degree distribution is multinomial ~ power law – Diameter: constant – Eigenvalue distribution: multinomial – First eigenvector: multinomial • See [Leskovec+, PKDD’05] for proofs
Problem Definition • Given a growing graph with nodes N1, N2, … • Generate a realistic sequence of graphs that will obey all the patterns – Static Patterns: Power Law Degree Distribution; Power Law eigenvalue and eigenvector distribution; Small Diameter – Dynamic Patterns: Growth Power Law; Shrinking/Stabilizing Diameters • First and only generator for which we can prove all these properties
Stochastic Kronecker Graphs • Create N1 × N1 probability matrix P1 • Compute the k-th Kronecker power Pk • For each entry p_uv of Pk include an edge (u, v) with probability p_uv [Figure: P1 = [0.4 0.2; 0.1 0.3]; Kronecker multiplication gives Pk (here k = 2) = [0.16 0.08 0.08 0.04; 0.04 0.12 0.02 0.06; 0.04 0.02 0.12 0.06; 0.01 0.03 0.03 0.09]; flip biased coins to obtain the instance matrix G2]
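The three steps on this slide can be sketched directly, using the slide's own P1 values. This is a minimal illustration assuming NumPy; the function name and the random seed are arbitrary choices, and materializing the full probability matrix only works for small k.

```python
import numpy as np

def stochastic_kronecker(p1, k, rng):
    """Return (Pk, G): the k-th Kronecker power of p1 and one sampled graph."""
    pk = np.asarray(p1, dtype=float)
    for _ in range(k - 1):
        pk = np.kron(pk, p1)
    # Flip one biased coin per entry: edge (u, v) appears with probability pk[u, v].
    g = (rng.random(pk.shape) < pk).astype(int)
    return pk, g

P1 = np.array([[0.4, 0.2],
               [0.1, 0.3]])
rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
P2, G2 = stochastic_kronecker(P1, 2, rng)
print(P2)  # first row is 0.16, 0.08, 0.08, 0.04
print(G2)  # one random 0/1 instance matrix
```

Each call with a fresh generator state yields a different instance matrix, while Pk stays fixed.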
Experiments • How well can we match real graphs? – Arxiv: physics citations: • 30,000 papers, 350,000 citations • 10 years of data – U.S. Patent citation network • 4 million patents, 16 million citations • 37 years of data – Autonomous systems – graph of internet • Single snapshot from January 2002 • 6,400 nodes, 26,000 edges • We show both static and temporal patterns
(Q: how to fit the parameters?) A: • Stochastic version of Kronecker graphs + • Max likelihood + • Metropolis sampling • [Leskovec+, ICML’07]
Experiments on real AS graph [Figure panels: Degree distribution; Hop plot; Adjacency matrix eigenvalues; Network value]
Conclusions • Kronecker graphs have: – All the static properties: Heavy-tailed degree distributions; Small diameter; Multinomial eigenvalues and eigenvectors – All the temporal properties: Densification Power Law; Shrinking/Stabilizing Diameters – We can formally prove these results