
Structural Link Analysis from User Profiles and Friends Networks: A Feature Construction Approach
William H. Hsu, Joseph Lancaster, Martin S. R. Paradesi, Tim Weninger
Monday, 26 March 2007
Laboratory for Knowledge Discovery in Databases, Kansas State University
http://www.kddresearch.org/KSU/CIS/ICWSM-20070326.ppt
First International Conference on Weblogs And Social Media (ICWSM-2007), Boulder, Colorado
Computing & Information Sciences, Kansas State University

Link Analysis in Social Networks: The K-State Corpus

Outline
  • Background, Related Work and Rationale
  • Technical Objective: Link Mining in Social Networks
  • Methodology: Graph Feature Extraction
  • Experimental Results: K-State LJMiner Corpus
  • Continuing Work: Statistical Relational Models

Problem Statement: Link Mining in Social Networks
  • Problem Definition
      • Given: records of users of weblog or social network service
      • Discover
          • Features of entities: users, communities
          • Relationships: friendship, membership, moderatorship
          • Explanations and predictions for relationships
  • Goals
      • Boost precision and recall of link existence prediction
      • Find relevant features
  • Significance: Recommendations (Friendship, Membership)
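
The stated goal, precision and recall of link existence prediction, can be illustrated with a minimal sketch. It assumes links are directed (u, v) pairs and that a predictor outputs a set of such pairs; the function name and the toy user names are hypothetical and do not come from the slides.

```python
def precision_recall(predicted_links, actual_links):
    """Return (precision, recall) for a set of predicted directed links."""
    predicted = set(predicted_links)
    actual = set(actual_links)
    true_positives = len(predicted & actual)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical toy data: (u, v) means "u lists v as a friend".
actual = {("alice", "bob"), ("bob", "carol"), ("carol", "alice")}
predicted = {("alice", "bob"), ("bob", "carol"), ("alice", "carol")}
precision, recall = precision_recall(predicted, actual)
```

Because the links are directed, ("alice", "carol") does not match ("carol", "alice"), so two of the three predictions count as correct here.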

Related Work: Link Mining
  • Getoor and Diehl (2005) - Graphical model representations of link structure
  • Ketkar et al. (2005) - Data mining techniques vs graph-based representation
  • Sarkar & Moore (2005) - Change in link structure across discrete time steps
  • Popescul & Ungar (2003) - ER model to predict links
  • Hill (2003), Bhattacharya & Getoor (2004) - Statistical Relational Learning to resolve identity uncertainty
  • Resig et al. (2004) - Predicting IM online times using friends graph degree
  • McCallum et al. (2005) - Inferring roles and topic categories based on link analysis

Rationale
  • Limitations of Current State of the Art
      • Do not take graph features into account
      • Limited ability to select, extract features
  • Novel Contribution: Link Mining System
      • Extracts, computes features of network model
      • Towards dependent types for relational link mining
  • Rationale
      • Desired functionality: infer new links from old
      • Evaluation: precision, recall for link existence

K-State Test Bed: LJMiner Corpus
[Figure panels: User Interest, Schools, Friends; User Contact Info; Community Membership Info]

LiveJournal Topology [1]: Tools and Security Model
[Figure credits: © 2007 Denga, Inc.; LJMindMap.com © 2004 mcfnord]

LiveJournal Topology [2]: Definitions

Graph Features [1]: Node, Pair, Link-Dependent
  • Node-Dependent Features: specific to one node (vertex) within candidate pair
      • Indegree(u): “source popularity”
      • Outdegree(u): “source fertility”
      • Indegree(v): “target popularity”
      • Outdegree(v): “target fertility”
  • Pair-Dependent Features: specific to one candidate pair of nodes (vertices)
      • Common entities: interests, friends, schools, etc.
      • Attributes of common entities
      • Computed from relational query on entities u, v
  • Link-Dependent Features: specific to one link (edge) in directed graph
      • Past, predicted duration
      • Diagnosed cause
      • Computed and stored with relationship set
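
The node-dependent and pair-dependent features above can be sketched for a single candidate pair (u, v). This is an illustration only, not the LJMiner implementation: it assumes the friends graph is a list of directed (source, target) edges and that interests are stored as a dictionary of sets, and all function and key names are hypothetical.

```python
from collections import defaultdict


def candidate_pair_features(u, v, edges, interests):
    """Compute node- and pair-dependent features for candidate pair (u, v)."""
    indegree = defaultdict(int)
    outdegree = defaultdict(int)
    friends = defaultdict(set)  # friends[x] = users that x links to
    for source, target in edges:
        outdegree[source] += 1
        indegree[target] += 1
        friends[source].add(target)

    return {
        # Node-dependent features
        "indegree_u": indegree[u],    # source popularity
        "outdegree_u": outdegree[u],  # source fertility
        "indegree_v": indegree[v],    # target popularity
        "outdegree_v": outdegree[v],  # target fertility
        # Pair-dependent features (counts of common entities)
        "common_friends": len(friends[u] & friends[v]),
        "common_interests": len(
            interests.get(u, set()) & interests.get(v, set())
        ),
    }


# Hypothetical toy data.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "a")]
interests = {"a": {"jazz", "chess"}, "b": {"chess", "hiking"}}
features = candidate_pair_features("a", "b", edges, interests)
```

Link-dependent features such as duration are omitted because they need per-edge history that the toy data does not carry.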

Graph Features [2]: Node and Pair Features in LJMiner
[Tables: Graph Features; Interest-Related Features]

LJCrawler System Design
  • Data acquisition: client, injector, parser
  • Ancillary issues
      • Multi-threading
      • Distribution
      • Storage
  • Analytical postprocessing: LJClipper, LJStats
  • Distinguishing features of LJCrawler
  • Results
      • 200 users/second maximum, 5 users/second allowed
      • Approximately 2 million pages crawled
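
The gap between the 200 users/second the crawler can reach and the 5 users/second it is allowed implies some form of request throttling. The slides do not say how LJCrawler enforces the limit, so the following is only a sketch of one common approach, spacing requests at a fixed minimum interval. The `Throttle` class and its parameters are hypothetical, and the demonstration uses a simulated clock so that it runs instantly.

```python
import time


class Throttle:
    """Space successive calls to wait() at least 1/rate seconds apart."""

    def __init__(self, rate_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = float("-inf")  # the first call never waits

    def wait(self):
        now = self.clock()
        if now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval


# Demonstration with a simulated clock (no real sleeping).
simulated_time = [0.0]


def fake_clock():
    return simulated_time[0]


def fake_sleep(seconds):
    simulated_time[0] += seconds


throttle = Throttle(rate_per_second=5, clock=fake_clock, sleep=fake_sleep)
request_times = []
for _ in range(3):
    throttle.wait()
    # A real crawler would fetch one user page here.
    request_times.append(fake_clock())
```

With the default `clock` and `sleep` arguments the same class throttles against wall-clock time.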

Network Statistics: Graph Distance
[Figure panels: 1000 nodes; 4000 nodes]
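
The slide does not define graph distance. Assuming it means the length of the shortest directed path between two users in the friends graph, it can be computed with breadth-first search, as in this minimal sketch. The function name and toy edges are hypothetical.

```python
from collections import defaultdict, deque


def graph_distance(edges, source, target):
    """Return the shortest directed path length, or None if unreachable."""
    successors = defaultdict(list)
    for s, t in edges:
        successors[s].append(t)

    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return distances[node]
        for neighbor in successors[node]:
            if neighbor not in distances:  # first visit gives the shortest distance
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return None


# Hypothetical toy friends graph.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]
distance_a_to_d = graph_distance(edges, "a", "d")
distance_d_to_a = graph_distance(edges, "d", "a")
```

Since the graph is directed, the distance from a to d need not equal the distance from d to a; in the toy graph d has no outgoing links, so the reverse distance is undefined.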

Interpretation of Results
  • 941-node graph (Hsu et al., 2006): LJCrawler v1 output
  • 1000-4000 node graphs: LJCrawler v2 output

Results
  • Establishing an Interdisciplinary Research Initiative
      • K-State / KU / UNL collaboration
      • Resources: Linguistic Data Consortium
      • NIST evaluations
  • Involving End Users of Machine Translation
      • Document users
      • Machine learning, data mining, info extraction researchers
  • Novel Applications
      • Social networks and collaborative recommendation
      • Gisting and beyond

Continuing Work
  • Information Extraction and Intelligent IR
      • Learning models for IE: ontologies
      • Latent semantic analysis
  • Machine Learning
      • Natural language learning
      • Time series learning and understanding
      • Relational and first-order models
  • Automated Reasoning
      • Probabilistic
      • Case-based analogical
  • Data Mining and Warehousing
  • Grid Computing

Acknowledgements
  • K-State Lab for Knowledge Discovery in Databases
      • Vikas Bahirwani
      • Tejaswi Pydimarri
      • Andrew King
  • Social Networks, Graph Theory, Graph Algorithms
      • Kirsten Hildrum (IBM T. J. Watson Labs)
      • Todd Easton (K-State, Industrial and Manufacturing Systems Engineering)
  • Machine Learning
      • Dan Roth, Cinda Heeren, Jiawei Han (University of Illinois at Urbana-Champaign)
      • AnHai Doan (University of Wisconsin-Madison)

Questions and Discussion