- Number of slides: 52
Multi-Document Summarization of Evaluative Text Giuseppe Carenini, Raymond T. Ng, Adam Pauls Computer Science Dept. University of British Columbia Vancouver, CANADA 3/18/2018 EACL 2006 1
Motivation and Focus
• Large amounts of information expressed in text form are constantly produced
  – News, reports, reviews, blogs, emails, ...
• Pressing need to summarize
• Considerable prior work, but limited to factual information
Our Focus
• Evaluative documents (good vs. bad, right vs. wrong) about a single entity
  – Customer reviews (e.g. Amazon.com)
  – Travel logs about a destination
  – Teaching evaluations
  – User studies (!)
  – ...
Our Focus
• We want to do this:
  – Input reviews: “The Canon G3 is a great camera...”, “Though great, the G3 has bad menus...”, “I love the Canon G3! It...”
  – Output summary: “Most users liked the Canon G3. Even though some did not like the menus, many...”
Two Approaches
• Automatic summarizers generally produce two types of summaries:
  1. Extracts: a representative subset of text from the original corpus
  2. Abstracts: generated text that contains the most relevant info from the original corpus
Two Approaches (cont'd)
• Extract-based summarizers generally fare better for factual summarization (cf. DUC 2005)
• But extracts aren't well suited to capturing evaluative info
  – Can't express the distribution of opinions (‘some/all’)
  – Can't aggregate opinions either numerically or conceptually
• So we tried both
Two Approaches (cont'd)
• Extract-based approach (MEAD*):
  – Based on the MEAD (Radev et al. 2003) framework for summarization
  – Augmented with knowledge of evaluative info (I'll explain later)
• Abstract-based approach (SEA):
  – Based on the GEA (Carenini & Moore, 2001) framework for generating evaluative arguments about an entity
Pipeline Approach (for both)
Evaluative documents → Extraction of evaluative info → Organization of extracted info → Selection of extracted info → Presentation of extracted info
• Extraction of evaluative info: shared
Extracting evaluative info
• We adopt previous work of Hu & Liu (2004) (but many others exist...)
• Their approach extracts:
  – What features of the entity are evaluated
  – The strength and polarity of the evaluation on the [-3 ... +3] interval
• Approach is (mostly) unsupervised
Examples
• “the menus are easy to navigate and the buttons are easy to use. it is a fantastic camera ...”
• “... the canon computer software used to download, sort, ... is very easy to use. the only two minor issues i have with the camera are the lens cap (it is not very snug and can come off too easily) ...”
Feature Discovery
• “the menus are easy to navigate and the buttons are easy to use. it is a fantastic camera ...”
• “... the canon computer software used to download, sort, ... is very easy to use. the only two minor issues i have with the camera are the lens cap (it is not very snug and can come off too easily) ...”
Strength/Polarity Determination
• “the menus are easy to navigate (+2) and the buttons are easy to use (+2). it is a fantastic (+3) camera ...”
• “... the canon computer software used to download, sort, ... is very easy to use (+3). the only two minor issues i have with the camera are the lens cap (it is not very snug (-2) and can come off too easily (-2)) ...”
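The result of this step can be pictured as a list of (feature, score) records. The sketch below is only an illustration of that data shape: the `Evaluation` record and `group_by_feature` helper are hypothetical names, the feature labels are one reading of the slide's example, and the scores are copied by hand from the annotations above. It does not implement Hu & Liu's extractor.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Evaluation:
    """One evaluated feature mention (hypothetical record type)."""
    feature: str  # raw feature term as it appears in the review
    score: int    # polarity/strength on the [-3, +3] interval

    def __post_init__(self):
        if not -3 <= self.score <= 3:
            raise ValueError(f"score {self.score} outside [-3, +3]")


# Hand-copied from the two annotated example reviews (feature labels assumed).
annotated = [
    Evaluation("menus", +2),
    Evaluation("buttons", +2),
    Evaluation("camera", +3),
    Evaluation("software", +3),
    Evaluation("lens cap", -2),
    Evaluation("lens cap", -2),
]


def group_by_feature(evaluations):
    """Collect all scores observed for each raw feature term."""
    grouped = defaultdict(list)
    for evaluation in evaluations:
        grouped[evaluation.feature].append(evaluation.score)
    return dict(grouped)


scores_by_feature = group_by_feature(annotated)
```

Grouping by raw term gives the "bag of features" that the organization step then has to clean up.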
Pipeline Approach (for both)
Evaluative documents → Extraction of evaluative info → Organization of extracted info → Selection of extracted info → Presentation of extracted info
• Extraction of evaluative info: shared
• Organization of extracted info: partially shared
Organizing Extracted Info
• Extraction provides a bag of features
• But
  – features are redundant
  – features may range from concrete and specific (e.g. “resolution”) to abstract and general (e.g. “image”)
• Solution: map features to a hierarchy [Carenini, Ng, & Zwart 2005]
Feature Ontology
• Raw terms “canon”, “canon g3”, “digital camera” map to the root node
• Canon G3 Digital Camera [-1, +1, +2, +3, +3]
  – User Interface
    – Buttons [+1]
    – Lever [+1]
    – Menus (raw term “Menu”) [+2, +2, +3, +3]
  – Convenience
    – Battery
      – Battery Life ...
      – Battery Charging System [-1, -2]
  – ...
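A minimal sketch of how redundant raw terms could be merged onto hierarchy nodes. The four term-to-node mappings mirror the slide; the lowercase normalisation, the `merge_by_node` helper, and the way the root's five scores are split across the three raw terms are assumptions made only for illustration.

```python
from collections import defaultdict

# Raw term -> hierarchy node (mappings taken from the slide).
TERM_TO_NODE = {
    "canon": "Canon G3 Digital Camera",
    "canon g3": "Canon G3 Digital Camera",
    "digital camera": "Canon G3 Digital Camera",
    "menu": "Menus",
}


def merge_by_node(raw_evaluations):
    """Pool the scores of all raw terms that map to the same node.

    Terms without a mapping keep their own name (an assumption).
    """
    merged = defaultdict(list)
    for term, score in raw_evaluations:
        node = TERM_TO_NODE.get(term.lower(), term)
        merged[node].append(score)
    return dict(merged)


# Hypothetical split of the slide's scores across raw terms.
raw = [
    ("canon", -1), ("canon g3", +1), ("digital camera", +2),
    ("canon", +3), ("canon g3", +3),
    ("menu", +2), ("menu", +2), ("menu", +3), ("menu", +3),
]
merged = merge_by_node(raw)
```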
Organization: SEA vs. MEAD*
• SEA operates only on the hierarchical data and forgets about the raw extracted features
• MEAD* operates on the raw extracted features and uses the hierarchy only for sentence ordering (I'll come back to this)
Pipeline Approach (for both)
Evaluative documents → Extraction of evaluative info → Organization of extracted info → Selection of extracted info → Presentation of extracted info
• Extraction of evaluative info: shared
• Organization of extracted info: partially shared
• Selection of extracted info: not shared
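Feature Selection: SEA
• We define a measure of importance (moi) for each feature f_i in the hierarchy of features, computed from its polarity/strength values ps_k
• [Formula and tree figure not recovered; the figure shows the Canon G3 Digital Camera node [-1, +1, +2, +3, +3] with the child nodes User Interface and Convenience and a [+1] evaluation]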
Selection Procedure
• Straightforward greedy selection would not work: if a node derives most of its importance from its child(ren), including both the node and the child(ren) would be redundant
• => Dynamic greedy selection: until the desired number of features is selected
  – The most important node is selected
  – That node is removed from the tree
  – The importance of the remaining nodes is recomputed
• Similar to the redundancy reduction step in many automatic summarization algorithms
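The dynamic greedy loop can be sketched as follows. The importance function here is an assumption, since the slide's formula did not survive extraction: a node's own squared strengths plus a discounted share (`CHILD_WEIGHT`, a made-up constant) of its children's importance. "Removed from the tree" is read as "the selected node stops contributing to its ancestors, while its descendants stay selectable", which is one possible interpretation. The tree is loosely modeled on the Feature Ontology slide.

```python
from dataclasses import dataclass, field

CHILD_WEIGHT = 0.5  # assumed discount on importance inherited from children


@dataclass
class Node:
    name: str
    scores: list = field(default_factory=list)    # polarity/strength values
    children: list = field(default_factory=list)
    selected: bool = False


def moi(node):
    """Assumed measure of importance: own evidence plus discounted children."""
    direct = sum(s * s for s in node.scores)
    live = [c for c in node.children if not c.selected]
    return direct + CHILD_WEIGHT * sum(moi(c) for c in live)


def walk(node):
    yield node
    for child in node.children:
        yield from walk(child)


def select_features(root, k):
    """Pick k nodes, recomputing importance after every pick."""
    chosen = []
    while len(chosen) < k:
        candidates = [n for n in walk(root) if not n.selected]
        if not candidates:
            break
        best = max(candidates, key=moi)  # moi is re-evaluated each round
        best.selected = True             # stops feeding its ancestors
        chosen.append(best.name)
    return chosen


root = Node("Canon G3 Digital Camera", [-1, 1, 2, 3, 3], [
    Node("User Interface", [], [
        Node("Buttons", [1]),
        Node("Menus", [2, 2, 3, 3]),
    ]),
    Node("Convenience", [], [
        Node("Battery", [], [Node("Battery Charging System", [-1, -2])]),
    ]),
])
chosen = select_features(root, 3)
```

With these numbers a static ranking would pick User Interface third, purely on importance inherited from Menus. Once Menus has been selected, User Interface loses that contribution and Battery Charging System is picked instead.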
Feature Selection: MEAD*
• MEAD* selects sentences, not features
• Calculate a score for each sentence s_i from the polarity/strength values ps_k of the features in feature(s_i)
  – Example: “the menus are easy to navigate (+2) and the buttons are easy to use (+2).”
  – [Score formula not recovered]
• Break ties with the MEAD centroid (a common feature in multi-document summarization)
Feature Selection: MEAD*
• We want to extract sentences for the most important features, and only one sentence per feature
• Put each sentence s_i in a “bucket” for each feature in feature(s_i)
  – “menus” bucket: “I like the menus...”; “the menus are easy to navigate (+2) and the buttons are easy to use (+2).”
  – “buttons” bucket: “the menus are easy to navigate (+2) and the buttons are easy to use (+2).”
Feature Selection: MEAD*
• Take the (single) highest-scoring sentence from the “fullest” buckets until the desired summary length is reached
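A sketch of the bucket procedure under stated assumptions: the sentence score is taken to be the summed absolute strength of its evaluations (the slide's actual formula was lost), summary length is counted in sentences, and ties are broken by text order rather than by the MEAD centroid. The function names and the +1 score on "I like the menus." are illustrative.

```python
from collections import defaultdict


def sentence_score(evals):
    """Assumed score: total strength of all evaluations in the sentence."""
    return sum(abs(score) for _, score in evals)


def select_sentences(sentences, max_sentences):
    """sentences: list of (text, [(feature, score), ...]) pairs."""
    buckets = defaultdict(list)
    for text, evals in sentences:
        # One bucket entry per distinct feature mentioned in the sentence.
        for feature in dict.fromkeys(f for f, _ in evals):
            buckets[feature].append((sentence_score(evals), text))

    summary = []
    # Fullest buckets first; feature name breaks ties deterministically.
    for feature in sorted(buckets, key=lambda f: (-len(buckets[f]), f)):
        if len(summary) >= max_sentences:
            break
        for _, text in sorted(buckets[feature], reverse=True):
            if text not in summary:  # at most one new sentence per feature
                summary.append(text)
                break
    return summary


sentences = [
    ("I like the menus.", [("menus", 1)]),
    ("The menus are easy to navigate and the buttons are easy to use.",
     [("menus", 2), ("buttons", 2)]),
    ("The lens cap is not very snug.", [("lens cap", -2)]),
]
summary = select_sentences(sentences, max_sentences=2)
```

In this example the "buttons" bucket contributes nothing new, because its only sentence was already taken for "menus".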
Pipeline Approach (for both)
Evaluative documents → Extraction of evaluative info → Organization of extracted info → Selection of extracted info → Presentation of extracted info
• Extraction of evaluative info: shared
• Organization of extracted info: partially shared
• Selection of extracted info: not shared
• Presentation of extracted info: not shared
Presentation: MEAD*
• Display selected sentences in order from most general (top of the feature hierarchy) to most specific
• That's it!
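The ordering rule amounts to sorting by depth in the feature hierarchy. A minimal sketch, assuming each selected sentence is tagged with the single feature it was chosen for; the `PARENT` map is a hypothetical fragment of the hierarchy and the example sentences are taken from the review excerpts earlier in the talk.

```python
# Child -> parent links for a fragment of the hierarchy (illustrative).
PARENT = {
    "Canon G3 Digital Camera": None,
    "User Interface": "Canon G3 Digital Camera",
    "Menus": "User Interface",
}


def depth(feature):
    """Number of edges between the feature and the root."""
    d = 0
    while PARENT[feature] is not None:
        feature = PARENT[feature]
        d += 1
    return d


def order_general_to_specific(selected):
    """selected: list of (sentence, feature) pairs; the sort is stable."""
    return [s for s, f in sorted(selected, key=lambda pair: depth(pair[1]))]


ordered = order_general_to_specific([
    ("the menus are easy to navigate.", "Menus"),
    ("it is a fantastic camera.", "Canon G3 Digital Camera"),
])
```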
Presentation: SEA
• SEA (Summarizer of Evaluative Arguments) is based on GEA (Generator of Evaluative Arguments) (Carenini & Moore, 2001)
• GEA takes as input
  – a hierarchical model of features for an entity
  – objective values (good vs. bad) for each feature of the entity
• Adaptation is (in theory) straightforward
Possible GEA Output
“The Canon G3 is a good camera. Although the interface is poor, the image quality is excellent.”
Target SEA Summary
“Most users thought the Canon G3 was a good camera. Although several users did not like the interface, almost all users liked the image quality.”
Extra work
• What GEA gives us:
  – High-level text plan (i.e. content selection and ordering)
  – Cue phrases for argumentation strategy (“In fact”, “Although”, etc.)
• What GEA does not give us:
  – Appropriate micro-planning (lexicalization)
    – Need to give an indication of the distribution of customer opinions
Microplanning (incomplete!)
• We generate one clause for each selected feature
• Each clause includes 3 key pieces of information:
  1. Distribution of customers who evaluated the feature (“many”, “most”, “some”, etc.)
  2. Name of the feature (“menus”, “image quality”, etc.)
  3. Aggregate of opinions (“excellent”, “fair”, “poor”, etc.)
→ “most users found the menus to be poor”
Microplanning
• Distribution is (roughly) based on the fraction of customers who evaluated the feature (+ disagreement...)
• Name of the feature is straightforward
• Aggregate of opinions is based on a function similar in form to the measure of importance
  – averages polarity/strength over all evaluations rather than summing
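A sketch of how one clause could be assembled from these three pieces. All thresholds and word lists are invented for illustration (the slides give no numbers), disagreement among customers is ignored, and a plain average of the scores stands in for the aggregate function.

```python
def quantifier(fraction):
    """Map the fraction of customers who evaluated the feature to a word.

    Thresholds are illustrative assumptions.
    """
    if fraction >= 0.9:
        return "almost all"
    if fraction >= 0.5:
        return "most"
    if fraction >= 0.25:
        return "many"
    return "some"


def opinion_word(average):
    """Map an average polarity/strength in [-3, +3] to an adjective (assumed cut-offs)."""
    if average >= 2.0:
        return "excellent"
    if average >= 1.0:
        return "good"
    if average > -1.0:
        return "fair"
    return "poor"


def clause(feature, scores, total_customers):
    """scores: one polarity/strength value per customer who evaluated the feature."""
    fraction = len(scores) / total_customers
    average = sum(scores) / len(scores)  # averaging rather than summing
    return f"{quantifier(fraction)} users found the {feature} to be {opinion_word(average)}"


example = clause("menus", [-2, -3, -2, -2, -3, -2, -2], total_customers=10)
```

Here 7 of 10 hypothetical customers evaluated the menus with an average of about -2.3, which yields the slide's example clause.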
Microplanning
• We “glue” clauses together using cue phrases from GEA
• Also perform basic aggregation
Formative Evaluation
• Goal: test users' perceived effectiveness
• Participants: 28 undergraduate students
• Procedure
  – Asked to pretend they worked for the manufacturer
  – Given 20 reviews (from either the Camera or the DVD corpus) and asked to write a summary (~100 words) for the marketing dept
  – After 20 minutes, given a summary of the 20 reviews
  – Asked to fill out a questionnaire assessing summary effectiveness (multiple choice and open form)
Formative Evaluation (cont'd)
• Conditions: each user was given one of 4 summaries
  1. Topline summary (human)
  2. Baseline summary (vanilla MEAD)
  3. MEAD* summary
  4. SEA summary
Quantitative Results
Responses on a scale from 1 (Strongly disagree) to 5 (Strongly agree)
Qualitative Results: MEAD*
• Surprising: many participants didn't notice or didn't mind verbatim text extraction
• Two major complaints about content
  1. Summary was not representative (a negative sentence was extracted even though the majority were positive)
  2. Evaluations of some features were repeated
• (2) could be addressed, but (1) can only partially be fixed with pure extraction
Qualitative Results: SEA
• Some complaints about the “robotic” feel of the summary, and about repetition/lack of pronouns
  – Need to do more complex microplanning
• Some wanted more details (which “manual features...”)
  – Note: this complaint was absent with MEAD*
• Some disagreed with the feature selection (precision/recall), but this is a problem even with human summaries
Conclusions
• Extraction works surprisingly well even for evaluative summarization
• Topline > MEAD* ≈ SEA > Baseline
• Need to combine the strengths of SEA and MEAD* for evaluative summarization
  – Need the detail, variety, and natural-sounding text provided by extraction
  – Need to generate opinion distributions
  – Need argument structure from SEA (?)
Other Future Work
• Automatically induce the feature hierarchy
• Produce summaries tailored to the user's preferences about the evaluated entity
• Summarize corpora of evaluative documents about more than one entity
Examples
MEAD*: Bottom line , well made camera , easy to use, very flexible and powerful features to include the ability to use external flash and lense / filters choices. It has a beautiful design , lots of features, very easy to use , very configurable and customizable , and the battery duration is amazing! Great colors , pictures and white balance. The camera is a dream to operate in automode , but also gives tremendous flexibility in aperture priority , shutter priority, and manual modes. I ’d highly recommend this camera for anyone who is looking for excellent quality pictures and a combination of ease of use and the flexibility to get advanced with many options to adjust if you like.
Examples
SEA: Almost all users loved the Canon G3 possibly because some users thought the physical appearance was very good. Furthermore, several users found the manual features and the special features to be very good. Also, some users liked the convenience because some users thought the battery was excellent. Finally, some users found the editing/viewing interface to be good despite the fact that several customers really disliked the viewfinder. However, there were some negative evaluations. Some customers thought the lens was poor even though some customers found the optical zoom capability to be excellent. Most customers thought the quality of the images was very good.
Examples
MEAD: I am a software engineer and am very keen into technical details of everything i buy , i spend around 3 months before buying the digital camera ; and i must say , g3 worth every single cent i spent on it. I do n’t write many reviews but i ’m compelled to do so with this camera. I spent a lot of time comparing different cameras , and i realized that there is not such thing as the best digital camera. I bought my canon g3 about a month ago and i have to say i am very satisfied.
Examples
Human: The Canon G3 was received exceedingly well. Consumer reviews from novice photographers to semiprofessional all listed an impressive number of attributes, they claim makes this camera superior in the market. Customers are pleased with the many features the camera offers, and state that the camera is easy to use and universally accessible. Picture quality, long lasting battery life, size and style were all highlighted in glowing reviews. One flaw in the camera frequently mentioned was the lens which partially obstructs the view through the view finder, however most claimed it was only a minor annoyance since they used the LCD screen.
Microplanning
• We “glue” clauses together using cue phrases from GEA
  – “Although”, “however”, etc. indicate opposing evidence
  – “Because”, “in particular” indicate supporting evidence
  – “Furthermore” indicates elaboration
• Also perform basic aggregation
  – “most users found the menus to be poor” + “most users found the buttons to be poor” → “most users found the menus and buttons to be poor”
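The aggregation and gluing steps can be sketched as below. This is an illustration, not GEA's text planner: clauses are assumed to be (quantifier, feature, opinion) triples, only adjacent clauses with the same quantifier and opinion are merged, and the `CUE` table keeps a single cue phrase per relation from the list above.

```python
from itertools import groupby

# One cue phrase per rhetorical relation (a simplification of the slide's list).
CUE = {
    "opposing": "However,",
    "supporting": "In particular,",
    "elaboration": "Furthermore,",
}


def aggregate(clauses):
    """Merge adjacent clauses that share quantifier and opinion."""
    merged = []
    for (quant, opinion), group in groupby(clauses, key=lambda c: (c[0], c[2])):
        merged.append((quant, [feature for _, feature, _ in group], opinion))
    return merged


def realize(quant, features, opinion):
    """Turn one (possibly merged) clause into text."""
    if len(features) == 1:
        names = features[0]
    else:
        names = ", ".join(features[:-1]) + " and " + features[-1]
    return f"{quant} users found the {names} to be {opinion}"


def glue(first, second, relation):
    """Join two realized clauses with the cue phrase for their relation."""
    return f"{first[0].upper()}{first[1:]}. {CUE[relation]} {second}."


clauses = [
    ("most", "menus", "poor"),
    ("most", "buttons", "poor"),
    ("some", "lens", "excellent"),
]
realized = [realize(*clause) for clause in aggregate(clauses)]
text = glue(realized[0], realized[1], "opposing")
```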