a (0.5) b (0.1) c (0.15) d (0.25)
a a c b c d d
Test doc Classifier
Prominent off-diagonal entries raise design issues for taxonomy editors and maintainers. Clear block structure derived from coarse-grained topics. Strong diagonals reflect tightly-knit topic communities.
Infected by neighbor Cured internally
Prob. β Prob. β
Prob. δ Prob. β Prob. β
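The labels above suggest an infection model in which a healthy node catches the infection from each infected neighbor with probability β, and an infected node is cured internally with probability δ. A minimal simulation sketch follows. It assumes discrete time and synchronous updates, which the labels do not state, and the function name `sis_step` and the adjacency-dict representation are illustrative choices.

```python
import random

def sis_step(adj, infected, beta, delta, rng):
    """One synchronous round; returns the new set of infected nodes.

    adj      -- dict mapping each node to a list of its neighbors (assumed format)
    infected -- set of currently infected nodes
    beta     -- probability that one infected neighbor transmits the infection
    delta    -- probability that an infected node is cured in this round
    """
    nxt = set()
    for node, nbrs in adj.items():
        if node in infected:
            # Cured internally with probability delta; otherwise stays infected.
            if rng.random() >= delta:
                nxt.add(node)
        else:
            # Each infected neighbor independently transmits with probability beta.
            for nb in nbrs:
                if nb in infected and rng.random() < beta:
                    nxt.add(node)
                    break
    return nxt

# Usage: a three-node path 0 - 1 - 2 with node 0 initially infected.
path = {0: [1], 1: [0, 2], 2: [1]}
state = sis_step(path, {0}, beta=0.3, delta=0.1, rng=random.Random(7))
```

Repeating the step and tracking the size of the infected set shows whether the infection dies out or persists for a given β and δ.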
Two IBM intranet data sets with known top URLs (smaller rank is better). Depth up to which dmoz.org URLs are used as ground truth.
Pagerank HITS Authority
Pagerank Randomized HITS
1/3 1/3 1/2
HITS: The Tightly-Knit Community (TKC) effect
SALSA: Less TKC influence (but no reinforcement!)
Theory Database Author Forum Year
Keyword query = set of words. Pick a query word per some distribution, e.g. IDF. Pick an out-link to walk on in proportion to the relevance of the target out-neighbor.
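The two sampling steps named above can be sketched as follows. This is an illustration under assumptions, not the original system's code: the IDF table, the `relevance(word, target)` scoring function, and the uniform fallback when no out-neighbor is relevant are all hypothetical.

```python
import random

def pick_query_word(idf, rng):
    """Sample one query word with probability proportional to its IDF weight."""
    words = list(idf)
    return rng.choices(words, weights=[idf[w] for w in words], k=1)[0]

def walk_step(node, word, out_links, relevance, rng):
    """Follow one out-link of `node`, chosen in proportion to the relevance
    of each target out-neighbor to `word`. Returns None at a dead end."""
    targets = out_links.get(node, [])
    if not targets:
        return None
    weights = [relevance(word, t) for t in targets]
    if sum(weights) <= 0:
        # Assumption: fall back to a uniform choice when no target is relevant.
        return rng.choice(targets)
    return rng.choices(targets, weights=weights, k=1)[0]

# Usage with toy data (all values invented for illustration).
rng = random.Random(1)
idf = {"kleinberg": 3.2, "web": 0.4}
links = {"p1": ["p2", "p3"]}
text = {"p2": "kleinberg hubs", "p3": "web pages"}
word = pick_query_word(idf, rng)
nxt = walk_step("p1", word, links, lambda w, t: float(w in text[t]), rng)
```

Rare words dominate the walk because a high IDF makes them more likely to be picked, which steers the walker toward pages that match the distinctive part of the query.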
Regions Test image Images Words
Movie “is-a” Gathering Grease Face/Off “directed” “acted-in” A 3 Travolta Cage “is-a” Actor Kleiser Woo “is-a” Director
http://www.cse.iitb.ac.in/banks/
T 4 K 1, K 2, K 3 T 1 T 2 T 4 T 3 K 3 T 5 K 2 T 5 T 2 T 4 T 2 T 3 T 5
Quoc Vu Jon Kleinberg writes Organizing Web pages by “Information Unit” cites writes Authoritative sources in a hyperlinked environment A metric labeling problem cites Divyakant Agrawal author paper writes cites writes Eva Tardos
Negroponte Esther Dyson Palmisano Gerstner
Connected to every node
Neighborhood model params Local model params
Model Log-linear form Parameters to fit
Penalize large params Maximize total conditional likelihood over all instances
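The labels above (model, log-linear form, parameters to fit, penalty on large parameters, total conditional likelihood) match the usual regularized training objective for a log-linear classifier. The formula below is one standard way to write it, not a reconstruction of the original slide: the feature functions f_k, weights w_k, training pairs (x_i, y_i), and the penalty strength λ with a squared-norm penalty are all assumed notation.

```latex
% Log-linear model: conditional probability of label y given instance x
\Pr(y \mid x; w) =
  \frac{\exp\big(\sum_k w_k f_k(x, y)\big)}
       {\sum_{y'} \exp\big(\sum_k w_k f_k(x, y')\big)}

% Training: maximize total conditional log-likelihood, penalizing large parameters
\max_{w} \; \sum_{i} \log \Pr(y_i \mid x_i; w) \;-\; \lambda \sum_k w_k^2
```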
Out-of-vocabulary error. Orthography: use words, plus overlapping features: isCap, startsWithDigit, hasHyphen, endsWith… -ing, -ogy, -ed, -s, -ly, -ion, -tion, -ity, -ies
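A small sketch of such overlapping orthographic features is shown below. The feature names and the suffix list come from the line above; the function name `orthographic_features`, the dict-of-indicators representation, and the reading of isCap as "first character is uppercase" are assumptions made for illustration.

```python
# Suffixes listed in the source line; stored without the leading hyphen.
SUFFIXES = ("ing", "ogy", "ed", "s", "ly", "ion", "tion", "ity", "ies")

def orthographic_features(token):
    """Return a dict of binary features for one token.

    The word identity is kept as a feature, and the orthographic features
    overlap with it so that an out-of-vocabulary token still fires some of them.
    """
    lower = token.lower()
    feats = {
        "word=" + lower: 1,
        "isCap": int(token[:1].isupper()),           # assumed: first char uppercase
        "startsWithDigit": int(token[:1].isdigit()),
        "hasHyphen": int("-" in token),
    }
    for suf in SUFFIXES:
        # Several suffix features may fire at once, e.g. -ion and -tion.
        if lower.endswith(suf):
            feats["endsWith=-" + suf] = 1
    return feats

# Usage: inspect the features of a word that may be missing from the vocabulary.
example = orthographic_features("Classification")
```

Because the suffix features overlap, a word such as "Classification" fires both the -ion and -tion indicators, and the fitted weights decide how much each one contributes.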