43be15a4f62ed4a26146eec7881ad54c.ppt
- Number of slides: 31
AutoLabel: Labeling Places from Pictures and Websites Rufeng Meng, Sheng Shen, Romit Roy Choudhury, Srihari Nelakuditi
Localization Coordinates Names 2
Wi-Fi Based Semantic Localization
Core Problem?
AP Name Mining AP List Macy’s AP Name Mining AT&T Starbucks … Best Buy on Street Macy’s in Mall Store Name …
Opportunity: Stores have two avatars 7
Opportunity: Stores have two avatars 8
Mapping AP Vector to Store Name OCR AutoLabel AP List Matching In-store Text with Website Text Best Buy … Text From Candidate Webpages Best Buy Whole Foods Panera Store Name Starbucks …
Crowdsourcing m n Candidate Store Names 10
Crowdsourcing m n Candidate Store Names 11
Crowdsourcing OCRed In-store Text 1 Text-based Matching 1 Candidate Store Names 12
Text Extraction • Store Text Above Eye-level – Text OCRed from In-store Pictures • Web Text – Candidate Store Names – Meta Keywords in Web Pages – Text in Menu/Category Items Eye-level Below Eye-level 13
Text Matching • Store Text → Noun/Proper Name Extraction → Filter → Bag of Words → Weight Assignment • Web Text → Noun/Proper Name Extraction → Filter → Bag of Words → Weight Assignment • Both → Similarity Calculation → Labeling • Weight assignment: TF-IDF 14
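The text-matching pipeline on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the slide names TF-IDF for weight assignment but does not say which similarity measure is used, so cosine similarity is an assumption here. The stop-word filter is a simple stand-in for noun/proper-name extraction, the smoothed IDF formula is one common variant, and `label_store` and all sample texts are hypothetical.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop list; a stand-in for the slide's noun/proper-name filter.
STOPWORDS = {"the", "a", "an", "and", "of", "for", "to", "in", "on", "with"}


def bag_of_words(text):
    """Lowercase, tokenize on letters, drop stop words, count occurrences."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def tfidf_vectors(bags):
    """Turn a list of bags of words into TF-IDF weighted vectors (dicts)."""
    n_docs = len(bags)
    doc_freq = Counter()
    for bag in bags:
        doc_freq.update(bag.keys())
    vectors = []
    for bag in bags:
        total = sum(bag.values()) or 1
        vectors.append({
            # Smoothed IDF (assumed variant) keeps every weight positive.
            word: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1)
            for word, count in bag.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def label_store(store_text, web_texts):
    """Return the candidate whose web text is most similar to the store text."""
    names = list(web_texts)
    bags = [bag_of_words(store_text)] + [bag_of_words(web_texts[n]) for n in names]
    store_vec, *web_vecs = tfidf_vectors(bags)
    scores = {name: cosine(store_vec, vec) for name, vec in zip(names, web_vecs)}
    return max(scores, key=scores.get), scores


# Made-up OCR text and web texts, for illustration only.
web_texts = {
    "Best Buy": "laptops televisions cameras phones electronics",
    "Starbucks": "coffee latte espresso frappuccino tea pastries",
    "Panera Bread": "bread soup salad sandwich bakery coffee",
}
best_name, scores = label_store("espresso latte coffee grande frappuccino", web_texts)
```

With these made-up inputs the in-store text shares four words with the Starbucks page, one with Panera Bread, and none with Best Buy, so `best_name` is `"Starbucks"`.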
<AP 1, AP 2, AP 5> <AP 2, AP 3, AP 4> <AP 6, AP 7, AP 8> Cluster AutoLabel (Best Buy) (Dollar Tree) (Starbucks) Webs of Candidate Stores Store Name <AP 1, AP 2, AP 5> Best Buy <AP 2, AP 3, AP 4> Dollar Tree <AP 6, AP 7, AP 8> In-store Text vs. Website Text AP Vector (MAC) Starbucks (Panera Bread) 15
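Once AP vectors carry store names as in the table on this slide, a device could look up its place from a Wi-Fi scan. The sketch below is illustrative only: the slides do not specify the lookup rule, so ranking labeled vectors by Jaccard overlap with the scanned AP set is an assumption, and `recognize_place` is a hypothetical helper name.

```python
def recognize_place(scanned_aps, labeled_vectors):
    """Return (store name, overlap score) for the best-matching AP vector.

    labeled_vectors maps a frozenset of AP identifiers to a store name.
    Returns (None, 0.0) when the scan shares no AP with any labeled vector.
    """
    scanned = set(scanned_aps)
    best_name, best_score = None, 0.0
    for ap_vector, name in labeled_vectors.items():
        union = scanned | ap_vector
        # Jaccard overlap: shared APs divided by all APs seen in either set.
        score = len(scanned & ap_vector) / len(union) if union else 0.0
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# AP vector to store name table, as shown on the slide.
labeled_vectors = {
    frozenset({"AP1", "AP2", "AP5"}): "Best Buy",
    frozenset({"AP2", "AP3", "AP4"}): "Dollar Tree",
    frozenset({"AP6", "AP7", "AP8"}): "Starbucks",
}
place, score = recognize_place(["AP2", "AP5"], labeled_vectors)
```

A scan of AP2 and AP5 overlaps Best Buy's vector on two of three APs and Dollar Tree's on one of four, so `place` is `"Best Buy"`.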
Simultaneous Clustering and Labeling • Two images put in different clusters if no common AP • Generate potential clusterings of images • Similarity of clustering = sum of similarities of its clusters • Pick the clustering with highest similarity • Label the AP vectors as per the best clustering 16
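The bullets above can be sketched as a brute-force search over small inputs. This is a toy illustration under stated assumptions, not the authors' algorithm: cluster similarity is taken to be the Jaccard overlap between the cluster's pooled words and a candidate store's web words, and each store name is assumed to label at most one cluster. Neither choice is stated on the slide. All function names and sample data are hypothetical, and enumerating every partition is only feasible for a handful of images.

```python
from itertools import combinations, permutations


def partitions(items):
    """Yield every way to split a list into non-empty clusters."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def is_valid(clustering, aps):
    """Images with no common AP must not share a cluster."""
    return all(
        aps[x] & aps[y]
        for cluster in clustering
        for x, y in combinations(cluster, 2)
    )


def cluster_and_label(aps, words, web_words):
    """Pick the valid clustering with the highest summed similarity.

    Returns {store name: sorted AP vector} for the best clustering,
    or {} when no valid clustering can be labeled.
    """
    names = list(web_words)
    best_score, best_result = -1.0, {}
    for clustering in partitions(list(aps)):
        if not is_valid(clustering, aps) or len(clustering) > len(names):
            continue
        texts = [set().union(*(words[i] for i in c)) for c in clustering]
        # Assumption: try every one-to-one assignment of names to clusters.
        for labels in permutations(names, len(clustering)):
            score = sum(jaccard(t, web_words[n]) for t, n in zip(texts, labels))
            if score > best_score:
                best_score = score
                best_result = {
                    n: sorted(set().union(*(aps[i] for i in c)))
                    for c, n in zip(clustering, labels)
                }
    return best_result


# Made-up images: APs heard when the picture was taken, and OCRed words.
aps = {"img1": {"AP1", "AP2"}, "img2": {"AP2", "AP5"}, "img3": {"AP6", "AP7"}}
words = {
    "img1": {"laptop", "tv"},
    "img2": {"camera", "phone"},
    "img3": {"coffee", "latte"},
}
web_words = {
    "Best Buy": {"laptop", "tv", "camera", "phone"},
    "Starbucks": {"coffee", "latte", "espresso"},
    "Panera Bread": {"bread", "soup", "coffee"},
}
labels = cluster_and_label(aps, words, web_words)
```

In this toy input, img3 shares no AP with the other two images, so it must form its own cluster. Merging img1 and img2 scores higher than keeping them apart, giving AP1, AP2, AP5 to Best Buy and AP6, AP7 to Starbucks.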
Evaluation 17
Data Collection • 40 Stores – 18 from shopping mall @Champaign, IL – 10 from one street area @Champaign, IL – 6 from other places @Champaign, IL – 6 from shopping mall @San Jose, CA • Number of Pics: – 6 ~ 217 pics/store (with readable text) 18
Text Matching without Store Names Accuracy: 80% 19
Text Matching with Store Names Accuracy: 87% 20
Text Matching without Store Names (Mall) 16/18 Correctly Matched 21
Text Matching with Store Names (Mall) 17/18 Correctly Matched 22
Text Matching without Store Names (Street) 10/10 Correctly Matched 23
Accuracy with Varying #Pics/Store 24
Accuracy with Varying #Pics/Store 25
Accuracy with Varying #Stores/Area Mean Accuracy: • 5-store: 100% • 20-store: 90% 26
Performance of Place Recognition with AP Vector–Store Name Mapping 27
Simultaneous Clustering and Labeling 28
Related Work • Visual business recognition – Image based matching of the exterior of store • Human assisted positioning using textual signs – Requires user intervention to localize self • Correlate crowdsourced pictures with social network posts – Stronger correlation between store and its website 29
Limitations & Future Work • Wi-Fi AP-tagged in-store pictures – Reduce uncertainty even with a few pictures – Common decorations besides text • Image clustering – Spatial relationship between stores – Color/Theme + AP name mining • Beyond semantic localization – Correlate website and in-store shopping behavior 30
Summary of AutoLabel 31