
  • Number of slides: 102

Lecture 5: Network centrality
Slides are modified from Lada Adamic and James Moody

Measures and Metrics
• Knowing the structure of a network, we can calculate various useful quantities or measures that capture particular features of the network topology.
  • Most of these measures originate in social network analysis.
• So far:
  • Degree distribution, average path length, density
• Centrality:
  • Degree, eigenvector, Katz, PageRank, hubs, closeness, betweenness, …
• Several other graph metrics:
  • Clustering coefficient, assortativity, modularity, …

Characterizing networks: Who is most central?

Network centrality
• Which nodes are most 'central'?
• The definition of 'central' varies by context/purpose.
• Local measure:
  • degree
• Relative to the rest of the network:
  • closeness, betweenness, eigenvector (Bonacich power centrality), Katz, PageRank, …
• How evenly is centrality distributed among nodes?
  • Centralization, hubs and authorities, …

Centrality: who's important based on their network position
In each of the following networks, X has higher centrality than Y according to a particular measure: indegree, outdegree, betweenness, closeness.

Outline
• Degree centrality
• Centralization
• Betweenness centrality
• Closeness centrality
• Eigenvector centrality
• Bonacich power centrality
• Katz centrality
• Influence
• Hubs and Authorities
• PageRank
• LexRank

Degree centrality (undirected)
He who has many friends is most important.
When is the number of connections the best centrality measure?
• people who will do favors for you
• people you can talk to (influence set, information access, …)
• influence of an article in terms of citations (using in-degree)

Degree: normalized degree centrality
Divide by the maximum possible degree, i.e. $(N-1)$: $C'_D(i) = C_D(i) / (N-1)$, where $C_D(i)$ is the degree of node $i$ and $N$ is the number of nodes.
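As a minimal sketch of this normalization, the function below divides each node's degree by N-1. The adjacency-dictionary representation and the 5-node star are illustrative assumptions, not part of the lecture.

```python
def degree_centrality(adj):
    """Return normalized degree centrality for an undirected graph.

    adj maps each node to the set of its neighbours.
    """
    n = len(adj)
    if n < 2:
        return {v: 0.0 for v in adj}
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


# Hypothetical 5-node star: "hub" is tied to every other node.
star = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"},
}
scores = degree_centrality(star)
```

In this star the hub reaches the maximum possible score, while each leaf has one tie out of four possible.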

Prestige in directed social networks
• When 'prestige' may be the right word:
  • admiration
  • influence
  • gift-giving
  • trust
• Directionality is especially important in instances where ties may not be reciprocated (e.g. a dining-partner choice network).
• When 'prestige' may not be the right word:
  • gives advice to (can reverse direction)
  • gives orders to (same)
  • lends money to (same)
  • dislikes
  • distrusts

Extensions of undirected degree centrality: prestige
• degree centrality
  • indegree centrality
• a paper that is cited by many others has high prestige
• a person nominated by many others for a reward has high prestige

Degree Centrality in Social Networks
The most intuitive notion of centrality focuses on degree: the actor with the most ties is the most important.

Centralization: how equal are the nodes?
How much variation is there in the centrality scores among the nodes?
Freeman's general formula for centralization (other metrics can be used instead, e.g. the Gini coefficient or the standard deviation):
$$C_D = \frac{\sum_{i=1}^{N} \left[ C_D(n^*) - C_D(i) \right]}{(N-1)(N-2)}$$
where $C_D(n^*)$ is the maximum value in the network.
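A small sketch of Freeman's degree centralization may help here. It assumes the usual degree-based form, in which the summed gap between the largest degree and every node's degree is divided by (N-1)(N-2), the value that sum takes in a star. The three 5-node graphs are hypothetical examples.

```python
def degree_centralization(adj):
    """Freeman degree centralization of an undirected graph.

    adj maps each node to the set of its neighbours.
    Returns 0.0 when all degrees are equal and 1.0 for a star.
    """
    n = len(adj)
    if n < 3:
        return 0.0  # the (N-1)(N-2) denominator is zero below 3 nodes
    degrees = [len(nbrs) for nbrs in adj.values()]
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))


# Hypothetical 5-node examples: a star, a path and a ring.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
ring = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
```

With these inputs the star is maximally centralized, the ring is not centralized at all, and the path falls in between at 2/12.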

Degree Centrality in Social Networks
Degree centralization scores (one pair per example network):
• Freeman: 1.0; Variance: 3.9
• Freeman: .02; Variance: .17
• Freeman: .07; Variance: .20
• Freeman: 0.0; Variance: 0.0

Degree centralization examples
$C_D = 0.167$, $C_D = 1.0$, $C_D = 0.167$

Degree centralization examples
Example: financial trading networks
• high centralization: one node trading with many others
• low centralization: trades are more evenly distributed

Degree Centrality in Social Networks
Degree centrality can be deceiving, because it is a purely local measure.

When degree isn't everything
In what ways does degree fail to capture centrality in the following graphs?
• ability to broker between groups
• likelihood that information originating anywhere in the network reaches you


Betweenness: another centrality measure
• Intuition: how many pairs of individuals would have to go through you in order to reach one another in the minimum number of hops?
• Who has higher betweenness, X or Y?

Betweenness on toy networks
• Non-normalized version, on the path A-B-C-D-E:
  • A lies between no two other vertices
  • B lies between A and 3 other vertices: C, D, and E
  • C lies between 4 pairs of vertices: (A, D), (A, E), (B, D), (B, E)
  • note that there are no alternate paths for these pairs to take, so C gets full credit

Betweenness centrality: definition
$$C_B(i) = \sum_{j < k} \frac{g_{jk}(i)}{g_{jk}}$$
where $C_B(i)$ is the betweenness of vertex $i$, $g_{jk}$ is the number of geodesics connecting $j$ and $k$ (all shortest paths between $j$ and $k$), and $g_{jk}(i)$ is the number of those geodesics that actor $i$ is on (shortest paths between $j$ and $k$ that pass through $i$).
Usually normalized by the number of pairs of vertices excluding the vertex itself:
$$C'_B(i) = \frac{C_B(i)}{(N-1)(N-2)/2}$$
For a directed graph, the normalization is $(N-1)(N-2)$.
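The definition can be checked on the five-vertex path from the toy example with a short sketch. The code uses Brandes' accumulation scheme, which is one standard way to evaluate this sum and is not named in the slides; the adjacency-dictionary representation is also an assumption.

```python
from collections import deque


def betweenness(adj):
    """Non-normalized betweenness for an undirected, unweighted graph.

    adj maps each node to the set of its neighbours.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, sharing credit among geodesics.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every unordered pair was visited from both of its endpoints.
    return {v: b / 2 for v, b in score.items()}


# The path A-B-C-D-E from the toy example.
path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C", "E"}, "E": {"D"}}
b = betweenness(path)
```

On this path the sketch gives A and E a score of 0, B and D a score of 3, and C a score of 4, in line with the counts on the toy-network slide.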

Betweenness Centrality in Social Networks
Centralization scores (one per example network): 1.0, .59, .31, 0

Betweenness Centrality in Social Networks
Centralization: .183

Betweenness on toy networks
• Non-normalized version

Betweenness on toy networks
• Non-normalized version (figure label: broker)

Example
Nodes are sized by degree and colored by betweenness.
Can you spot nodes with high betweenness but relatively low degree? What about high degree but relatively low betweenness?

Betweenness on toy networks
• Non-normalized version (vertices A, B, C, D, E)
• Why do C and D each have betweenness 1?
  • They are both on shortest paths for the pairs (A, E) and (B, E), and so must share credit:
  • ½ + ½ = 1
• Can you figure out why B has betweenness 3.5 while E has betweenness 0.5?

Extending betweenness centrality to directed networks
• We now consider the fraction of all directed paths between any two vertices that pass through a node:
$$C_B(i) = \sum_{j \neq k} \frac{g_{jk}(i)}{g_{jk}}$$
where $C_B(i)$ is the betweenness of vertex $i$, $g_{jk}(i)$ counts the shortest directed paths from $j$ to $k$ that pass through $i$, and $g_{jk}$ counts all shortest directed paths from $j$ to $k$.
• Only modification: when normalizing, we have $(N-1)(N-2)$ instead of $(N-1)(N-2)/2$, because there are twice as many ordered pairs as unordered pairs.

Directed geodesics
• A node does not necessarily lie on a geodesic from j to k if it lies on a geodesic from k to j.

Alternative betweenness computations
• Slight variations in geodesic path computations
  • inclusion of self in the computations
• Flow betweenness
  • based on the idea of maximum flow
  • edge-independent path selection affects the results
  • may not include geodesic paths
• Random-walk betweenness
  • based on the idea of random walks
  • usually yields a ranking similar to geodesic betweenness
• Many other alternative definitions exist, based on diffusion, transmission or flow along network edges

Information Centrality in Social Networks
Information Centrality: It is quite likely that information can flow through paths other than the geodesic. The Information Centrality score uses all paths in the network, and weights them based on their length.



Closeness: another centrality measure
• What if it's not so important to have many direct friends?
• Or to be "between" others?
• But one still wants to be in the "middle" of things, not too far from the center.

Closeness centrality: definition
Closeness is based on the length of the average shortest path between a vertex and all other vertices in the graph.
Closeness centrality depends on the inverse distance to other vertices:
$$C_C(i) = \left[ \sum_{j=1}^{N} d(i, j) \right]^{-1}$$
Normalized closeness centrality:
$$C'_C(i) = C_C(i) \cdot (N-1)$$
where $d(i, j)$ is the geodesic distance between $i$ and $j$.
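A brief sketch of this computation, assuming a connected, unweighted graph stored as an adjacency dictionary (both assumptions): breadth-first search gives the distances from each vertex, the raw score is the inverse of their sum, and the normalized score multiplies by N-1. The 7-node line is a hypothetical test input.

```python
from collections import deque


def closeness(adj, normalized=True):
    """Closeness centrality for a connected, unweighted graph.

    adj maps each node to the set of its neighbours.
    """
    n = len(adj)
    result = {}
    for s in adj:
        # Breadth-first search gives geodesic distances from s.
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        raw = 1 / total if total else 0.0
        result[s] = raw * (n - 1) if normalized else raw
    return result


# Hypothetical 7-node line: 0 - 1 - 2 - 3 - 4 - 5 - 6.
line = {i: {j for j in (i - 1, i + 1) if 0 <= j < 7} for i in range(7)}
c = closeness(line)
```

For the line, the end nodes are at total distance 21 from the others and the middle node at total distance 12, so the normalized scores run from 6/21 at the ends to 6/12 in the middle.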

Closeness centrality: toy example (vertices A, B, C, D, E)

Closeness Centrality in Social Networks
Star (8 nodes):
Distance    Closeness   Normalized
01111111    .143        1.00
10222222    .077        .538
12022222    .077        .538
12202222    .077        .538
12220222    .077        .538
12222022    .077        .538
12222202    .077        .538
12222220    .077        .538
Ring (9 nodes):
Distance    Closeness   Normalized
012344321   .050        .400
101234432   .050        .400
210123443   .050        .400
321012344   .050        .400
432101234   .050        .400
443210123   .050        .400
344321012   .050        .400
234432101   .050        .400
123443210   .050        .400

Closeness Centrality in Social Networks
Line (7 nodes):
Distance        Closeness   Normalized
0 1 2 3 4 5 6   .048        .286
1 0 1 2 3 4 5   .063        .375
2 1 0 1 2 3 4   .077        .462
3 2 1 0 1 2 3   .083        .500
4 3 2 1 0 1 2   .077        .462
5 4 3 2 1 0 1   .063        .375
6 5 4 3 2 1 0   .048        .286

Closeness Centrality in Social Networks
Tree (13 nodes):
Distance        Closeness   Normalized
0112344556556   .021        .255
1011233445445   .027        .324
1101233445445   .027        .324
2110122334334   .034        .414
3221011223223   .042        .500
4332102334112   .034        .414
4332120112334   .034        .414
5443231011445   .027        .324
5443231101445   .027        .324
6554342110556   .021        .255
5443213445011   .027        .324
5443213445101   .027        .324
6554324556110   .021        .255

Closeness centrality: more toy examples

How closely do degree and betweenness correspond to closeness?
• degree
  • number of connections
  • denoted by size
• closeness
  • length of shortest path to all others
  • denoted by color

Closeness centrality
• Values tend to span a rather small dynamic range
  • typical distance increases logarithmically with network size
• In a typical network the closeness centrality C might span a factor of five or less
  • it is difficult to distinguish between central and less central vertices
  • a small change in the network might considerably affect the centrality order
• Alternative computations exist, but they have their own problems

Centrality in Social Networks
Graph Theoretic Center (Barycenter or Jordan Center): identify the point(s) with the smallest maximum distance to all other points. Value = longest distance to any other node.
In the example, the graph theoretic center is '3', but you might also consider a continuous measure, such as the inverse of the maximum geodesic.
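As a rough sketch of the graph theoretic center, the code below computes each node's longest geodesic distance (its eccentricity) and keeps the nodes where that value is smallest. It assumes a connected, unweighted graph with sortable node labels; the function names and the 7-node line are illustrative.

```python
from collections import deque


def eccentricity(adj, s):
    """Longest geodesic distance from s to any other node (connected graph)."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return max(dist.values())


def jordan_center(adj):
    """Return (nodes with the smallest maximum distance, that distance)."""
    ecc = {v: eccentricity(adj, v) for v in adj}
    best = min(ecc.values())
    return sorted(v for v in ecc if ecc[v] == best), best


# Hypothetical 7-node line: 0 - 1 - 2 - 3 - 4 - 5 - 6.
line = {i: {j for j in (i - 1, i + 1) if 0 <= j < 7} for i in range(7)}
center, radius = jordan_center(line)
```

The continuous variant mentioned above would simply be `1 / eccentricity(adj, v)` for each node.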

Influence range
• The influence range of i is the set of vertices that are reachable from node i.

Extensions of undirected closeness centrality
• closeness centrality usually implies
  • all paths should lead to you
  • paths should lead from you to everywhere else
• usually consider only vertices from which the node i in question can be reached

Centrality in Social Networks
Comparing across these 3 centrality values:
• Generally, the 3 centrality types will be positively correlated.
• When they are not (or only weakly) correlated, it probably tells you something interesting about the network.
• High degree, low closeness: embedded in a cluster that is far from the rest of the network
• High degree, low betweenness: ego's connections are redundant; communication bypasses him/her
• High closeness, low degree: key player tied to important/active alters
• High closeness, low betweenness: probably multiple paths in the network; ego is near many people, but so are many others
• High betweenness, low degree: ego's few ties are crucial for network flow
• High betweenness, low closeness: very rare cell; would mean that ego monopolizes the ties from a small number of people to many others

Centrality in Social Networks
In recent work, Borgatti (2003; 2005) discusses centrality in terms of two key dimensions. Substantively, the key question for centrality is knowing what is flowing through the network. The key features are:
• Whether the actor retains the good to pass to others (information, diseases) or passes the good on and then loses it (physical objects)
• Whether the key factor for spread is distance (disease with low $p_{ij}$) or multiple sources (information)
The off-the-shelf measures do not always match the social process of interest, so researchers need to be mindful of this.


Eigenvector Centrality


Eigenvector Centrality
• Can be calculated for directed graphs as well
  • we need to decide between incoming and outgoing edges
• Example with vertices A, B, C, D, E (using incoming edges):
  • A has no incoming edges, hence a centrality of 0
  • B has only an incoming edge from A, hence its centrality is also 0
• Only vertices that are in a strongly connected component of two or more vertices, or in the out-component of such a component, have non-zero centrality
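A minimal sketch of eigenvector centrality for an undirected graph, using the usual definition in which each node's score is proportional to the sum of its neighbours' scores. Power iteration on A + I (rather than A) is an assumption made here so that the iteration also settles on bipartite graphs; it leaves the leading eigenvector unchanged. The small graph is hypothetical.

```python
def eigenvector_centrality(adj, iterations=200):
    """Approximate eigenvector centrality of a connected undirected graph.

    adj maps each node to the set of its neighbours.
    Scores are scaled so that the largest one equals 1.
    """
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # Multiply by (A + I): own score plus the neighbours' scores.
        nxt = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        norm = max(nxt.values())
        x = {v: val / norm for v, val in nxt.items()}
    return x


# Hypothetical graph: triangle a-b-c with a pendant node d attached to a.
g = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
x = eigenvector_centrality(g)
```

Here a scores highest, b and c tie, and d scores lowest even though it is attached to the most central node, because it has no other ties.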

Katz centrality

Bonacich Power Centrality in Social Networks
Bonacich Power Centrality: an actor's centrality (prestige) is a function of the prestige of those they are connected to. Thus, actors who are tied to very central actors should have higher prestige/centrality than those who are not.
$$C(a, b) = a\,(I - bR)^{-1} R\,\mathbf{1}$$
• a is a scaling constant, which is set to normalize the score.
• b reflects the extent to which you weight the centrality of people ego is tied to.
• R is the adjacency matrix (can be valued)
• I is the identity matrix (1s down the diagonal)
• 1 is a column vector of all ones.

Katz / Bonacich Power Centrality: b
• The magnitude of b reflects the radius of power
  • small values of b weight local structure
  • larger values weight global structure
• If b > 0, ego has higher centrality when tied to people who are central
• If b < 0, then ego has higher centrality when tied to people who are not central
• With b = 0, you get degree centrality
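The role of b can be explored with a short sketch. It assumes the standard Bonacich form c = (I - bR)^{-1} R 1 with the scaling constant fixed at 1, and evaluates it by fixed-point iteration c ← R1 + bRc instead of a matrix inverse; that iteration only converges when |b| is below the reciprocal of the largest eigenvalue of R. The 3-node path is a hypothetical input.

```python
def bonacich_power(adj, b, iterations=500):
    """Bonacich power centrality with scaling constant 1.

    adj maps each node to the set of its neighbours (unvalued ties).
    Assumes |b| < 1 / (largest eigenvalue of the adjacency matrix).
    """
    degree = {v: float(len(adj[v])) for v in adj}
    c = dict(degree)
    for _ in range(iterations):
        # c = R1 + b * R c: own degree plus b times the neighbours' scores.
        c = {v: degree[v] + b * sum(c[w] for w in adj[v]) for v in adj}
    return c


# Hypothetical 3-node path: 0 - 1 - 2.
path3 = {0: {1}, 1: {0, 2}, 2: {1}}
plain = bonacich_power(path3, 0.0)
positive = bonacich_power(path3, 0.25)
negative = bonacich_power(path3, -0.25)
```

With b = 0 the scores reduce to the degrees; a positive b raises the end nodes for being tied to the well-connected middle node, and a negative b lowers them.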

Bonacich Power Centrality: examples
b = .25 and b = -.25
Why does the middle node have lower centrality than its neighbors when b is negative?

Bonacich Power Centrality in Social Networks
b = 0.23

Centrality in Social Networks: Power / Eigenvalue
Bonacich Power Centrality: b = .35 and b = -.35

Centrality in Social Networks: Power / Eigenvalue
Bonacich Power Centrality: b = .23 and b = -.23

Centrality in Social Networks
There are other options, usually based on generalizing some aspect of those above:
• Random Walk Betweenness (Mark Newman). Looks at the number of times you would expect node i to be on the path between k and j if information traveled a 'random walk' through the network.
• Peer Influence based measures (Friedkin and others). Based on the assumed network autocorrelation model of peer influence. In practice it's a variant of the eigenvector centrality measures.
• Subgraph centrality. Counts the number of cliques of size 2, 3, 4, … n-1 that each node belongs to. Reduces to (another) function of the eigenvalues. Similar to influence and information centrality, but does distinguish some unique positions.
• Fragmentation centrality. Part of Borgatti's Key Player idea, where nodes are central if they can easily break up a network.
• Moody & White's Embeddedness measure is technically a group-level index, but captures the extent to which a given set of nodes is nested inside a network.
• Removal Centrality. The effect on the rest of the graph (for any given statistic) of removing a given node. Really gets at the system-contribution of a particular actor.


Friedkin: structural bases of influence
• Interested in identifying the structural bases of power. In addition to resources, he identifies:
  • Centrality
  • Similarity
  • Cohesion
• These are thought to affect interpersonal visibility and salience.

Friedkin: structural bases of influence
Centrality: Central actors are likely more influential. They have greater access to information and can communicate their opinions to others more efficiently. Research shows they are also more likely to use the communication channels than are periphery actors.

Friedkin: structural bases of influence
Structural Similarity
• Two people may not be directly connected, but occupy a similar position in the structure. As such, they have similar interests in outcomes that relate to positions in the structure.
• Similarity must be conditioned on visibility. P must know that O is in the same position, which means that the effect of similarity might be conditional on communication frequency.

Friedkin: structural bases of influence
Cohesion
• Members of a cohesive group are likely to be aware of each other's opinions, because information diffuses quickly within the group.
• Groups encourage (through balance) reciprocity and compromise. This likely increases the salience of the opinions of other group members over those of non-group members.
• Actors P and O are structurally cohesive if they are joint members of a cohesive group. The greater their cohesion, the more likely they are to influence each other.

Friedkin: structural bases of influence
Substantive question: influence in establishing school performance criteria.
• Data on 23 teachers
• Collected in 2 waves
• Dyads are the unit of analysis (P → O): want to measure the extent of influence of one actor on another.
• Each teacher identified how much of an influence others were on their opinion about school performance criteria.
• Cohesion = probability of a flow of events (communication) between them, within 3 steps.
• Similarity = pairwise measure of equivalence (profile correlations)
• Centrality = TEC (power centrality)

Total Effects Centrality (Friedkin). Very similar to the Bonacich measure, it is based on an assumed peer influence model, in which W is a row-normalized adjacency matrix and a is a weight for the amount of interpersonal influence.
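For reference, the total-effects matrix of this peer influence model is usually written as below. This form is recalled from Friedkin's published work rather than recovered from the slide, so treat it as an assumption; W and a are the row-normalized adjacency matrix and the influence weight described above, and I is the identity matrix.

```latex
V = (I - aW)^{-1} (1 - a)
```

Entry $v_{ij}$ is then read as the total (direct and indirect) effect of actor $j$ on actor $i$.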

Friedkin: structural bases of influence
Interpersonal communication matters, and communication is what matters most for interpersonal influence.
Source: Noah E. Friedkin, Structural Bases of Interpersonal Influence in Groups: A Longitudinal Case Study. American Sociological Review, Vol. 58, No. 6 (Dec., 1993), pp. 861-872. Published by: American Sociological Association. http://www.soc.ucsb.edu/faculty/friedkin/Reprints/ASRBases.pdf


Hubs and Authorities
• In directed networks, vertices that point to important resources should also get a high centrality
  • e.g. review articles, web indexes
• Recursive definition:
  • hubs are nodes that link to good authorities
  • authorities are nodes that are linked to by good hubs

Hyperlink-Induced Topic Search
• HITS algorithm
  • start with a set of pages matching a query
  • expand the set by following forward and back links
  • take the transition matrix E, whose (i, j)th entry is $E_{ij} = 1/n_i$ where i links to j, and $n_i$ is the number of links from i
  • then one can compute the authority scores a and the hub scores h through an iterative approach
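The iterative approach can be sketched as follows. The update rules used here (authority scores from a ← Eᵀh, hub scores from h ← Ea, each rescaled to sum to 1) are an assumption based on the usual mutual-reinforcement scheme, since the slide's own formulas are not available; the link data is hypothetical.

```python
def hits(links, iterations=50):
    """Iterate authority and hub scores over a link graph.

    links maps each page to the set of pages it links to.
    Each link from page i carries weight 1/n_i, as in the matrix E above.
    """
    pages = set(links) | {q for targets in links.values() for q in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority update: a = E^T h (hub scores flowing along in-links).
        auth = {p: 0.0 for p in pages}
        for i, targets in links.items():
            for j in targets:
                auth[j] += hub[i] / len(targets)
        # Hub update: h = E a (authority scores of the pages linked to).
        hub = {p: 0.0 for p in pages}
        for i, targets in links.items():
            for j in targets:
                hub[i] += auth[j] / len(targets)
        # Rescale both vectors so that each sums to 1.
        a_norm = sum(auth.values()) or 1.0
        h_norm = sum(hub.values()) or 1.0
        auth = {p: v / a_norm for p, v in auth.items()}
        hub = {p: v / h_norm for p, v in hub.items()}
    return auth, hub


# Hypothetical link data: a review and an index page pointing at papers.
example_links = {"review": {"p1", "p2", "p3"}, "index": {"p1", "p2"}, "p3": {"p1"}}
auth, hub = hits(example_links)
```

In this example p1 collects links from all three linking pages and ends up as the top authority, while pages with no incoming links get an authority score of zero.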


PageRank: bringing order to the web
• It's in the links:
  • links to URLs can be interpreted as endorsements or recommendations
  • the more links a URL receives, the more likely it is to be a good/entertaining/provocative/authoritative/interesting information source
  • but not all link sources are created equal:
    • a link from a respected information source
    • a link from a page created by a spammer
• Figure labels: an important page, e.g. Slashdot (if a web page is slashdotted, it gains attention); many webpages scattered across the web

PageRank

Ranking pages by tracking a drunk
• A random walker following edges in a network for a very long time will spend a proportion of time at each node, which can be used as a measure of importance.

Trapping a drunk
• Problem with the pure random walk metric:
  • the drunk can be "trapped" and end up going in circles

Ingenuity of the PageRank algorithm
• Allow the drunk to teleport with some probability
  • e.g. a random websurfer follows links for a while, but with some probability teleports to a "random" page (goes to a bookmarked page, or uses a search engine to start anew)

PageRank algorithm
$$PR(p_i) = \frac{1-d}{N} + d \sum_{p_j \in M(p_i)} \frac{PR(p_j)}{L(p_j)}$$
where $p_1, p_2, \ldots, p_N$ are the pages under consideration, $M(p_i)$ is the set of pages that link to $p_i$, $L(p_j)$ is the number of outbound links on page $p_j$, and $N$ is the total number of pages. $(1-d)$ is the random jumping probability ($d = 0.85$ for Google).
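A compact sketch of the computation, iterating the PageRank recurrence from a uniform start. The handling of pages without outbound links (their score is spread evenly over all pages) is an assumption, as is the three-page example.

```python
def pagerank(links, d=0.85, iterations=100):
    """Iterate PR(p) = (1 - d)/N + d * sum(PR(q)/L(q)) over pages q linking to p.

    links maps each page to the set of pages it links to.
    """
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    pr = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        nxt = {p: (1 - d) / n for p in pages}
        for p in pages:
            targets = links.get(p, ())
            if targets:
                share = d * pr[p] / len(targets)
                for q in targets:
                    nxt[q] += share
            else:
                # Assumption: a page with no outbound links spreads its score
                # evenly over all pages; the formula does not cover this case.
                for q in pages:
                    nxt[q] += d * pr[p] / n
        pr = nxt
    return pr


# Hypothetical three-page web: a and b link to each other, c links to a.
ranks = pagerank({"a": {"b"}, "b": {"a"}, "c": {"a"}})
```

Page c receives no links, so its score stays at the teleportation floor (1 - d)/N, while a outranks b because it also collects c's vote.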

Exercise: PageRank
• What happens to the relative PageRank scores of the nodes in the GUESS PageRank demo as you increase the teleportation probability?
• Can you construct a network such that a node with low indegree has the highest PageRank?
http://www.ladamic.com/netlearn/GUESS/pagerank.html

Example: probable location of random walker after 1 step
20% teleportation probability
[Figure: network of nodes 1-8, shown at t=0 and t=1]

Example: location probability after 10 steps
[Figure: network of nodes 1-8, shown at t=0, t=1 and t=10]


Applications to Information Retrieval
• Can we use the notion of centrality to pick the best summary sentence?
• Can we use the subgraph of query results to infer something about the query?
• Can we use a graph of word translations to expand dictionaries, or to disambiguate word meanings?
• How might one use the HITS algorithm for document summarization?
  • Consider a bipartite graph of sentences and words

Centrality in summarization
• Extractive summarization
  • pick k sentences that are most representative of a collection of n sentences
• Motivation:
  • capture the most central words in a document or cluster
• Centroid score [Radev et al. 2000, 2004a]

Sample multidocument cluster (DUC cluster d1003t)
1 (d1s1) Iraqi Vice President Taha Yassin Ramadan announced today, Sunday, that Iraq refuses to back down from its decision to stop cooperating with disarmament inspectors before its demands are met.
2 (d2s1) Iraqi Vice president Taha Yassin Ramadan announced today, Thursday, that Iraq rejects cooperating with the United Nations except on the issue of lifting the blockade imposed upon it since the year 1990.
3 (d2s2) Ramadan told reporters in Baghdad that "Iraq cannot deal positively with whoever represents the Security Council unless there was a clear stance on the issue of lifting the blockade off of it."
4 (d2s3) Baghdad had decided late last October to completely cease cooperating with the inspectors of the United Nations Special Commission (UNSCOM), in charge of disarming Iraq's weapons, and whose work became very limited since the fifth of August, and announced it will not resume its cooperation with the Commission even if it were subjected to a military operation.
5 (d3s1) The Russian Foreign Minister, Igor Ivanov, warned today, Wednesday against using force against Iraq, which will destroy, according to him, seven years of difficult diplomatic work and will complicate the regional situation in the area.
6 (d3s2) Ivanov contended that carrying out air strikes against Iraq, who refuses to cooperate with the United Nations inspectors, "will end the tremendous work achieved by the international group during the past seven years and will complicate the situation in the region."
7 (d3s3) Nevertheless, Ivanov stressed that Baghdad must resume working with the Special Commission in charge of disarming the Iraqi weapons of mass destruction (UNSCOM).
8 (d4s1) The Special Representative of the United Nations Secretary-General in Baghdad, Prakash Shah, announced today, Wednesday, after meeting with the Iraqi Deputy Prime Minister Tariq Aziz, that Iraq refuses to back down from its decision to cut off cooperation with the disarmament inspectors.
9 (d5s1) British Prime Minister Tony Blair said today, Sunday, that the crisis between the international community and Iraq "did not end" and that Britain is still "ready, prepared, and able to strike Iraq."
10 (d5s2) In a gathering with the press held at the Prime Minister's office, Blair contended that the crisis with Iraq "will not end until Iraq has absolutely and unconditionally respected its commitments" towards the United Nations.
11 (d5s3) A spokesman for Tony Blair had indicated that the British Prime Minister gave permission to British Air Force Tornado planes stationed in Kuwait to join the aerial bombardment against Iraq.

Cosine between sentences
• Let $s_1$ and $s_2$ be two sentences.
• Let x and y be their representations in an n-dimensional vector space.
• The cosine similarity between them is then computed from the inner product of the two: $\cos(x, y) = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert}$
• The cosine ranges from 0 to 1, because the vector components are non-negative term weights.
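As a small illustration of the cosine between two sentences, the sketch below represents each sentence by raw word counts. This is a simplifying assumption: LexRank itself weights terms by idf, and the whitespace tokenization and example sentences are hypothetical.

```python
import math
from collections import Counter


def cosine(s1, s2):
    """Cosine similarity of two sentences using raw word-count vectors."""
    x, y = Counter(s1.lower().split()), Counter(s2.lower().split())
    dot = sum(x[w] * y[w] for w in x if w in y)
    norm = math.sqrt(sum(c * c for c in x.values())) * math.sqrt(
        sum(c * c for c in y.values())
    )
    return dot / norm if norm else 0.0


# Two hypothetical three-word sentences sharing two words.
sim = cosine("Iraq refuses cooperation", "Iraq resumes cooperation")
```

Two of the three words are shared, so the similarity is 2/3; sentences with no words in common score 0 and identical sentences score 1.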

LexRank (Cosine centrality)
      1     2     3     4     5     6     7     8     9     10    11
1     1.00  0.45  0.02  0.17  0.03  0.22  0.03  0.28  0.06  0.06  0.00
2     0.45  1.00  0.16  0.27  0.03  0.19  0.03  0.21  0.03  0.15  0.00
3     0.02  0.16  1.00  0.03  0.00  0.01  0.03  0.04  0.00  0.01  0.00
4     0.17  0.27  0.03  1.00  0.01  0.16  0.28  0.17  0.00  0.09  0.01
5     0.03  0.03  0.00  0.01  1.00  0.29  0.05  0.15  0.20  0.04  0.18
6     0.22  0.19  0.01  0.16  0.29  1.00  0.05  0.29  0.04  0.20  0.03
7     0.03  0.03  0.03  0.28  0.05  0.05  1.00  0.06  0.00  0.00  0.01
8     0.28  0.21  0.04  0.17  0.15  0.29  0.06  1.00  0.25  0.20  0.17
9     0.06  0.03  0.00  0.00  0.20  0.04  0.00  0.25  1.00  0.26  0.38
10    0.06  0.15  0.01  0.09  0.04  0.20  0.00  0.20  0.26  1.00  0.12
11    0.00  0.00  0.00  0.01  0.18  0.03  0.01  0.17  0.38  0.12  1.00

Lexical centrality (t=0.3)
[Figure: similarity graph over the sentences d1s1, d2s1, d2s2, d2s3, d3s1, d3s2, d3s3, d4s1, d5s1, d5s2, d5s3 at cosine threshold t = 0.3]

Lexical centrality (t=0.2)
[Figure: similarity graph over the sentences d1s1, d2s1, d2s2, d2s3, d3s1, d3s2, d3s3, d4s1, d5s1, d5s2, d5s3 at cosine threshold t = 0.2]

Lexical centrality (t=0.1)
[Figure: similarity graph over the sentences d1s1, d2s1, d2s2, d2s3, d3s1, d3s2, d3s3, d4s1, d5s1, d5s2, d5s3 at cosine threshold t = 0.1]
Sentences vote for the most central sentence…
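The effect of the threshold t can be sketched in a few lines: keep an edge only when the cosine exceeds t, then count each sentence's neighbours. The pairwise values are the cosine-matrix entries for sentences 1, 2, 4 and 6; the strict `>` comparison and the name `degree_centrality` are assumptions made for illustration.

```python
# Cosine values for sentences 1, 2, 4 and 6, taken from the cosine matrix.
similarity = {
    (1, 2): 0.45, (1, 4): 0.17, (1, 6): 0.22,
    (2, 4): 0.27, (2, 6): 0.19, (4, 6): 0.16,
}


def degree_centrality(similarity, threshold):
    """Degree of each sentence in the graph that keeps edges with cosine > threshold."""
    nodes = sorted({n for pair in similarity for n in pair})
    degree = {n: 0 for n in nodes}
    for (u, v), sim in similarity.items():
        if sim > threshold:  # assumption: strict inequality
            degree[u] += 1
            degree[v] += 1
    return degree


for t in (0.3, 0.2, 0.1):
    print(t, degree_centrality(similarity, t))
```

A high threshold leaves the graph sparse, so few sentences receive any votes; a low threshold connects almost everything, so degree no longer discriminates between sentences.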

LexRank
• T1…Tn are the pages that link to A,
• c(Ti) is the outdegree of page Ti, and
• N is the total number of pages.
• (1-d) is the "damping factor", or the probability that we "jump" to a far-away node during the random walk.
• It accounts for disconnected components or periodic graphs.
• When d = 0, we have a strict uniform distribution. When d = 1, the method is not guaranteed to converge to a unique solution.
• Typical value for (1-d) is in [0.1, 0.2] (Brin and Page, 1998).
Güneş Erkan and Dragomir R. Radev, LexRank: Graph-based Lexical Centrality as Salience in Text Summarization
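The definitions above are consistent with a PageRank-style update of the form p(A) = (1-d)/N + d · Σ p(Ti)/c(Ti), which is assumed in the sketch below. It is a plain power iteration, not the authors' implementation; the function name `lexrank`, the default d = 0.85 (so that 1-d = 0.15), and the handling of nodes without out-links are all illustrative assumptions.

```python
def lexrank(adjacency, d=0.85, tol=1e-10, max_iter=1000):
    """Power iteration for p(A) = (1-d)/N + d * sum(p(Ti)/c(Ti)).

    adjacency maps every node to the list of nodes it links to;
    every link target must also appear as a key.
    """
    nodes = list(adjacency)
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}  # start from the uniform distribution
    for _ in range(max_iter):
        # Every node receives the random-jump share (1-d)/N.
        new = {v: (1.0 - d) / n for v in nodes}
        for u in nodes:
            out = adjacency[u]
            if not out:
                # Assumption: a node with no out-links spreads its score uniformly.
                for v in nodes:
                    new[v] += d * score[u] / n
            else:
                share = d * score[u] / len(out)  # p(u) / c(u), damped
                for v in out:
                    new[v] += share
        delta = sum(abs(new[v] - score[v]) for v in nodes)
        score = new
        if delta < tol:
            break
    return score


# Small undirected example: "a" is similar to both "b" and "c".
scores = lexrank({"a": ["b", "c"], "b": ["a"], "c": ["a"]})
```

With d = 0 the update returns the uniform distribution after one step, matching the remark above; the example graph is periodic, so it is the jump term that makes the iteration settle.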

lab: LexRank demo
http://clair.si.umich.edu/demos/lexrank/
• how does the summary change as you:
• increase the cosine similarity threshold for an edge (how similar do two sentences have to be)?
• increase the salience threshold (minimum degree of a node)?

Content similarity distributions for web pages (DMOZ) and scientific articles (PNAS)
Menczer, Filippo (2004) The evolution of document networks.

what is that good for?
• How could you take advantage of the fact that pages that are similar in content tend to link to one another?

What can networks of query results tell us about the query?
• If query results are highly interlinked, is this a narrow or broad query?
• How could you use query connection graphs to predict whether a query will be reformulated?
Jure Leskovec, Susan Dumais: Web Projections: Learning from Contextual Subgraphs of the Web

How can bipartite citation graphs be used to find related articles?
• co-citation: both A and B are cited by many other papers (C, D, E …)
[Figure: papers C, D, E each citing both A and B]
• bibliographic coupling: both A and B cite many of the same articles (F, G, H …)
[Figure: A and B each citing F, G, H]
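Both measures reduce to counting shared neighbours in the citation graph, as the sketch below shows. The citation lists are hypothetical and simply mirror the letters used above; the function names are illustrative.

```python
from collections import Counter
from itertools import combinations

# Hypothetical citation lists: citing paper -> papers it cites.
cites = {
    "C": ["A", "B"],
    "D": ["A", "B"],
    "E": ["A", "B"],
    "A": ["F", "G", "H"],
    "B": ["F", "G", "H"],
}


def cocitation(cites):
    """For each pair of papers, the number of papers that cite both."""
    counts = Counter()
    for refs in cites.values():
        for p, q in combinations(sorted(set(refs)), 2):
            counts[(p, q)] += 1
    return counts


def bibliographic_coupling(cites):
    """For each pair of citing papers, the number of references they share."""
    counts = Counter()
    for p, q in combinations(sorted(cites), 2):
        shared = set(cites[p]) & set(cites[q])
        if shared:
            counts[(p, q)] = len(shared)
    return counts


print(cocitation(cites)[("A", "B")])              # A and B are co-cited by C, D, E
print(bibliographic_coupling(cites)[("A", "B")])  # A and B share F, G, H
```

Co-citation counts can keep growing as new papers appear, while the bibliographic coupling of two papers is fixed once both are published.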

which of these pairs is more proximate
• according to cycle free effective conductance:
• the probability that you reach the other node before cycling back on yourself, while doing a random walk…
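One simple reading of this description is the escape probability of a random walk: the chance of hitting the target before returning to the source. The sketch below computes that quantity on a small undirected graph by repeated averaging; it is not the full cycle-free measure from the literature, and the function name, the graph encoding, and the assumption that every node has at least one neighbour are illustrative.

```python
def escape_probability(graph, source, target, sweeps=10_000, tol=1e-12):
    """Probability that a walk from source reaches target before returning to source.

    graph maps each node to a list of neighbours (assumed non-empty).
    """
    # h[v]: probability of reaching target before source when the walk is at v.
    h = {v: 0.0 for v in graph}
    h[target] = 1.0
    for _ in range(sweeps):
        change = 0.0
        for v in graph:
            if v == source or v == target:
                continue  # boundary values stay fixed at 0 and 1
            new = sum(h[u] for u in graph[v]) / len(graph[v])
            change = max(change, abs(new - h[v]))
            h[v] = new
        if change < tol:
            break
    # The first step from source goes to a uniformly chosen neighbour.
    return sum(h[u] for u in graph[source]) / len(graph[source])


path = {"s": ["a"], "a": ["s", "t"], "t": ["a"]}
print(escape_probability(path, "s", "t"))
```

Adding a direct edge or more parallel routes between the two nodes raises the value, which is the sense in which a pair becomes more proximate.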

Proximity as cycle free effective conductance
• Measuring and Extracting Proximity in Networks, by Yehuda Koren, Stephen C. North, Chris Volinsky, KDD 2006
• demo: http://public.research.att.com/~volinsky/cgi-bin/prox.pl

Using network algorithms (specifically proximity) to improve movie recommendations can pay off

final IR application: machine translation
• not all pairwise translations are available
• e.g. between rare languages
• in some applications, e.g. image search, a word may have multiple meanings
• "spring" is an example in English
[Images: several different senses of "spring"]
But in other languages, the word may be unambiguous.
• automated translation could be the key

final IR application: machine translation
• if we combine all known word pairs, can we construct additional dictionaries between rare languages?
[Figure: translation graph with numeric labels 1 to 4, linking ﺭﺑﻴﻊ (Arabic), udaherri (Basque), vzmet (Slovenian), veer (Dutch), koanga (Maori), пружина (Russian), рысора (Belarusian), primavera (Spanish), spring (English), printemps (French), ressort (French)]
source: Reiter et al., 'Lexical Translation with Application to Image Search on the Web'

Automatic translation & network structure
• Two words are more likely to have the same meaning if there are multiple indirect paths of length 2 through other languages
[Figure: translation graph with numeric labels 1 to 3, linking ﺭﺑﻴﻊ (Arabic), udaherri (Basque), primavera (Spanish), spring (English), printemps (French), пружина (Russian), koanga (Maori)]
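Counting paths of length 2 amounts to counting the translations two words have in common. The sketch below does this on a small hypothetical dictionary graph; the word pairs are invented for illustration and are not taken from the cited source, and the function names are likewise illustrative.

```python
from collections import defaultdict

# Hypothetical dictionary entries, treated as undirected word pairs.
pairs = [
    ("udaherri", "primavera"),
    ("udaherri", "printemps"),
    ("koanga", "primavera"),
    ("koanga", "printemps"),
    ("spring", "primavera"),
    ("spring", "printemps"),
    ("spring", "пружина"),
    ("ressort", "пружина"),
]


def build_graph(pairs):
    """Undirected adjacency sets built from a list of translation pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def two_step_paths(graph, a, b):
    """Number of length-2 paths between a and b, i.e. shared translations."""
    return len(graph.get(a, set()) & graph.get(b, set()))


graph = build_graph(pairs)
print(two_step_paths(graph, "udaherri", "koanga"))   # two shared pivots
print(two_step_paths(graph, "udaherri", "ressort"))  # no shared pivot
```

In this toy graph the first pair is supported by two independent pivot words, while the second has none, so only the first would be a candidate for a new dictionary entry.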

wrap up
• Centrality
• many measures: degree, betweenness, closeness, Bonacich
• may be unevenly distributed
• measure via centralization
• extensions to directed networks:
• prestige
• input domain…
• PageRank (down the road…)
• consequences:
• interpersonal influence (Friedkin)
• benefits & risks (Baker & Faulkner)