
  • Number of slides: 23

Discovery, Analysis and Monitoring of Hidden Social Networks and their Evolution
Malik Magdon-Ismail
Rensselaer Polytechnic Institute

Our Group
• M. Goldberg
• M. Magdon-Ismail
• B. Szymanski
• A. Wallace
Students:
• Mykola Hayvanovich
• Apirak Hoonlor
• Stephen Kelley
• Konstantin Mertsalov

Motivation
Communications supporting IED planning have patterns and are correlated. Analysis of the patterns can reveal the groups as well as their internal group structure.

Communications
Time: January 12, 2005, 09:35
From: joe@xyz.com
To: sue@abc.com
Subject: Hello
Message: Where have you been?

[16:06:31] Republicans were the worst pacifists before ww 1 and ww 2
[16:06:43] France Fries
[16:06:50] As a generality, of course their were Republican Hawks.
[16:07:13] Sweet, good pun but bad story!
[16:07:18] yup
[16:07:23] anyways, he's perpetually tormented by presidential actions
[16:07:25] it aint good for no one
[16:07:47] I think they knew it was commiing
[16:07:51] Rossevelt met monthly in New York with mostly trusted Republicans to talk about how to get america into the war.
[16:08:10] and he spent 2 year with Churchill meeting him sometimes secretly in the ocean to discuss the same topic.
[16:08:22] Exchanging a lot of letters.
[16:08:25] telegrams
[16:08:28] There really is nothing like a shorn scrotum. It's breathtaking, I suggest you try it.
[16:08:55] Well they didnt literally meet in the ocean, they were on ships.

Streaming Example
Time: 10:00, 10:05, 10:06, 10:12, 10:13, 10:15, 10:20, 10:22, 10:25, 10:31
From: Alice, Charlie, Alice, Felix, Alice, Bob, Charlie, Bob, Felix
To          Message
Charlie     Golf tomorrow? Tell everyone.
Felix       Alice mentioned golf tomorrow.
Bob         Hey, golf tomorrow. Spread the word.
Bob         Tee off: 8 am at Pinehurst.
Grace       Hey guys, golf tomorrow.
Harry       Hey guys, golf tomorrow.
Charlie     Pinehurst Tee time: 8 am.
Elizabeth   We’re playing golf tomorrow.
Dave        We’re playing golf tomorrow.
Felix       Tee time 8 am at Pinehurst
Elizabeth   We tee off 8 am at Pinehurst.
Dave        We tee off 8 am at Pinehurst.
Grace       Tee time 8 am, Pinehurst.
Harry       Tee time 8 am, Pinehurst.

Streaming Example
Time: 10:00, 10:05, 10:06, 10:12, 10:13, 10:15, 10:20, 10:22, 10:25, 10:31
From: Alice, Charlie, Alice, Felix, Alice, Bob, Charlie, Bob, Felix
To: Charlie, Felix, Bob, Grace, Harry, Charlie, Elizabeth, Dave, Felix, Elizabeth, Dave, Grace, Harry

Overview: SIGHTS & RDM
[Figure: RDM pattern hierarchy. Level 0: "buy, trade ... buy" (pattern id = 2, pattern = "buy,"). Level 1: "2 trade ... 2 trade" (pattern id = 3, pattern = "2 trade"). Level 2: "3, sell ... 3, hell".]
[Figure: SIGHTS group hierarchy: higher ranked leaders, group leader, subgroup leaders, members.]

Communications
• Email, Telephone, Newsgroup, Weblog, Chatrooms, …
Time: January 12, 2005, 09:35
From: joe@xyz.com
To: sue@abc.com
Subject: Hello
Message: Where have you been lately?

Communication Graph
[Figure: an edge from joe@xyz.com to sue@abc.com, labeled January 12, 2005, 09:35.]
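
The step from individual messages to a communication graph can be sketched as below. This is a minimal illustration, not the SIGHTS implementation: the function name `build_communication_graph`, the choice of message counts as edge weights, and the second and third sample messages are assumptions made for the example. Only the time, sender and recipient are used; the message text is ignored.

```python
from collections import defaultdict


def build_communication_graph(messages):
    """Aggregate (time, sender, recipient) records into a weighted directed graph.

    Returns {sender: {recipient: number_of_messages}}.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for _time, sender, recipient in messages:
        graph[sender][recipient] += 1
    # Convert to plain dicts so lookups on missing keys fail loudly.
    return {sender: dict(targets) for sender, targets in graph.items()}


messages = [
    ("2005-01-12 09:35", "joe@xyz.com", "sue@abc.com"),  # from the slide
    ("2005-01-12 09:50", "sue@abc.com", "joe@xyz.com"),  # hypothetical reply
    ("2005-01-12 10:10", "joe@xyz.com", "sue@abc.com"),  # hypothetical follow-up
]
graph = build_communication_graph(messages)
print(graph)
```

Keeping timestamps out of the aggregated graph is a simplification; an analysis over time windows would build one such graph per window.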

Communication Graph
What are the social groups/coalitions?

Social Groups are Clusters
• Clusters may overlap.

Social Groups are Clusters
• Clusters may overlap.
• A cluster is a locally defined object.

Social Groups are Clusters
• Clusters may overlap.
• A cluster is a locally defined object.
• Group members are more introverted than extroverted.
[Figure: two example clusters, one labeled YES and one labeled NO.]
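
One possible reading of "more introverted than extroverted" is that a group has more edges inside it than edges leaving it. The slides do not give a formula, so the sketch below is an assumption: edges are treated as undirected pairs, and the helper names `internal_external_edges` and `is_introverted` are hypothetical.

```python
def internal_external_edges(edges, cluster):
    """Count edges with both endpoints in the cluster and edges crossing its boundary."""
    members = set(cluster)
    internal = sum(1 for u, v in edges if u in members and v in members)
    external = sum(1 for u, v in edges if (u in members) != (v in members))
    return internal, external


def is_introverted(edges, cluster):
    """Assumed criterion: strictly more internal than boundary-crossing edges."""
    internal, external = internal_external_edges(edges, cluster)
    return internal > external


# Toy graph: a triangle a-b-c with a tail c-d-e.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
print(is_introverted(edges, {"a", "b", "c"}))  # triangle: 3 internal, 1 external
print(is_introverted(edges, {"c", "d"}))       # 1 internal, 3 external
```

Because the test only looks at a cluster and its boundary, it is local and can accept overlapping clusters, which matches the other two properties on the slide.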

Social Groups are Clusters
• Clusters may overlap.
• A cluster is a locally defined object.
• Group members are more introverted than extroverted.
• Social groups (clusters) persist
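
Persistence can be illustrated by following a cluster across consecutive time windows. The slides do not say how persistence is measured; the sketch below assumes Jaccard overlap between member sets, and both the function `persistent_groups` and the threshold of 0.5 are hypothetical choices.

```python
def jaccard(a, b):
    """Jaccard similarity of two member sets (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def persistent_groups(windows, threshold=0.5):
    """Return first-window clusters that can be followed through every later window.

    `windows` is a list of time windows, each a list of clusters (sets of members).
    A cluster is followed to its best-overlapping successor in the next window.
    """
    persistent = []
    for cluster in windows[0]:
        current = set(cluster)
        survived = True
        for later in windows[1:]:
            best = max(later, key=lambda c: jaccard(current, c), default=None)
            if best is None or jaccard(current, best) < threshold:
                survived = False
                break
            current = set(best)
        if survived:
            persistent.append(set(cluster))
    return persistent


windows = [
    [{"a", "b", "c"}, {"x", "y"}],
    [{"a", "b", "c", "d"}, {"p", "q"}],
    [{"a", "b", "d"}],
]
result = persistent_groups(windows)
print(result)
```

In this toy data the group around a and b changes membership gradually and is kept, while {x, y} has no successor and is dropped.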

SIGHTS
Statistical Identification of Groups Hidden in Time and Space - System for statistical analysis of social coalitions in communication networks
Data Sources: Blogs; Emails (Enron); Chatroom; Synthetic data
Coalition Discovery: Overlapping Clustering; Streaming groups; Persistent groups
Coalition Analysis: Leaders; Opposing groups; Topic matching
Visualizations: Size-Density plots; Static coalitions; Dynamic coalitions
[Figure: SIGHTS screenshot with callouts: Different analyses on dataset; Group members; Visualization options; Size vs. Density Plot; Leader index; Choose time window; Groups matching analyst topic in red.]

Examples
• ENRON
• Ali Baba Data Set (DoD)
  GROUND TRUTH
  Group A: Dog, Vulture, Camel, Yassir Hussein, Bird, (6 others)
  Group B: Ahmet, Saleh Sarwuk, Shaid, Pavlammed Pavlah, Osan Domenik
  SIGHTS
  Group A: Dog, Vulture, Camel, Gopher
  Group B: Ahmet, Saleh Sarwuk, Shaid, Dajik
• Citeseer
  Two clusters: Electric circuit design; Optimization of Neural Networks
  Intersection: “Sensitivity analysis in degenerate quadratic programming”

Recursive Data Mining (RDM)
Build a classifier to identify the relationship between sender and receiver of a message.
EXAMPLE:
“Do you have time to meet some time this week?”
“Lets meet 2 pm today, ok?”
Which is advisor, which is student?

Pattern Definition
Hierarchical Pattern Construction (recursive definition)
Captures patterns; patterns of patterns… (can even capture long-range patterns)
[Figure: pattern hierarchy, with larger patterns at higher levels. Level 0: "buy, trade ... buy" (pattern id = 2, pattern = "buy,"). Level 1: "2 trade ... 2 trade" (pattern id = 3, pattern = "2 trade"). Level 2: "3, sell ... 3, hell" (pattern id = 4, pattern = "3,_ell").]
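
The recursive construction can be illustrated with a simplified stand-in. The sketch below is not RDM's actual pattern selection: it swaps in a byte-pair-style rule that, at each level, replaces the most frequent adjacent token pair with a new pattern id, and it omits approximate matching (the "_" wildcard in "3,_ell"). The function names and the ids "P1", "P2" are hypothetical.

```python
from collections import Counter


def most_frequent_pair(tokens):
    """Return the most frequent adjacent pair, or None if no pair repeats."""
    pairs = Counter(zip(tokens, tokens[1:]))
    if not pairs:
        return None
    pair, count = pairs.most_common(1)[0]
    return pair if count > 1 else None


def rewrite(tokens, pair, new_id):
    """Replace non-overlapping occurrences of `pair` with the single token `new_id`."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def build_hierarchy(tokens, max_levels=3):
    """Build successive levels; each level rewrites the previous one with a new pattern id."""
    levels, patterns = [list(tokens)], {}
    for level in range(1, max_levels + 1):
        pair = most_frequent_pair(levels[-1])
        if pair is None:
            break
        new_id = f"P{level}"
        patterns[new_id] = pair
        levels.append(rewrite(levels[-1], pair, new_id))
    return levels, patterns


tokens = ["buy", ",", "trade", "sell", "buy", ",", "trade", "hold"]
levels, patterns = build_hierarchy(tokens)
for depth, sequence in enumerate(levels):
    print(depth, sequence)
```

Each level operates on the output of the one below it, so a pattern found at a higher level spans a longer stretch of the original text, which is how patterns of patterns arise.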

A Classifier – Joining the Pieces
• Ensemble of classifiers
• Classifier for each level in the hierarchical approach
• Features gathered from the training messages
• Global features include average length and number of sentences
• Approximate matching allows treatment of noise
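
The global features named above can be computed along these lines. This is a sketch under stated assumptions: "average length" is taken to mean average sentence length in words, sentences are split on ".", "!" and "?", and the function name `global_features` is hypothetical. The sample text joins the two example messages from the RDM slide.

```python
import re


def global_features(message):
    """Compute the number of sentences and the average sentence length in words."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", message) if s.strip()]
    if not sentences:
        return {"num_sentences": 0, "avg_sentence_length": 0.0}
    word_counts = [len(s.split()) for s in sentences]
    return {
        "num_sentences": len(sentences),
        "avg_sentence_length": sum(word_counts) / len(word_counts),
    }


features = global_features(
    "Do you have time to meet some time this week? Lets meet 2 pm today, ok?"
)
print(features)
```

Features like these describe a whole message and would sit alongside the per-level pattern features when training the classifiers.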

Results on Enron
Binary classification: for a given message m, is m sent by a person with role r? r ∈ {CEO, Manager, Trader, Vice-President}
Multi-classification: for a given message m, which role r is the most likely for the sender? r ∈ {CEO, Manager, Trader, Vice-President}
The bars show the classification error. RDM_SVM universally outperforms the other classifiers.

Summing Up
• SIGHTS:
  • Structural; non-semantic; language independent
  • Finds groups, their dynamics and structure; visual analytic capabilities.
• RDM:
  • Uses statistical semantics; language independent
  • Identifies roles within the group

Thank You
http://www.cs.rpi.edu/~magdon