Generating High Quality Questions from Low Quality Questions
Kateryna Ignatova, Delphine Bernhard, Iryna Gurevych
Workshop on the Question Generation Shared Task and Evaluation Challenge
September 25-26, 2008, Arlington, VA
3/15/2018 | Computer Science Department | Ubiquitous Knowledge Processing Lab | © Prof. Dr. Iryna Gurevych | 1

Examples of User Questions

Characteristics of Low Quality Questions
- 755 questions from Yahoo! Answers:
  - 18% misspelled, 8% Internet slang, 20% ill-formed
- Keyword queries are the natural way for most people to look for information
- Ambiguity / underspecification is harder to identify and is highly context-dependent

Spelling and Grammar Correction
- The performance of spelling correction depends on the training lexicon
- Unrecognized words lead to wrong corrections
- The problem of Internet slang remains unsolved
- For grammar checking, a thorough study of the kinds of grammatical errors found would be needed
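
The dependence on the lexicon can be illustrated with a minimal sketch. It assumes Python's standard `difflib.get_close_matches` as a stand-in for a real spelling corrector; the `correct` helper and the toy `LEXICON` are hypothetical and are not the tools used by the authors.

```python
from difflib import get_close_matches

# Toy lexicon (an assumption for illustration only).
LEXICON = ["buy", "keyboard", "laptop", "low", "mouse"]

def correct(word, lexicon, cutoff=0.6):
    """Return the closest lexicon entry, or the word itself if nothing is close."""
    if word in lexicon:
        return word
    matches = get_close_matches(word, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word

# A genuine misspelling is repaired because the target is in the lexicon.
print(correct("keybord", LEXICON))
# A slang token missing from the lexicon is "corrected" to an unrelated word.
print(correct("lol", LEXICON))
# Once the token is added to the lexicon, it is left untouched.
print(correct("lol", LEXICON + ["lol"]))
```

In this sketch the slang token is rewritten only because the lexicon does not know it, which is the failure mode named in the second bullet.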

Question Generation from a Set of Keywords
- Has not yet been addressed
- Challenges:
  - Generate grammatical questions
  - Generate sensible and relevant questions
  - Generate useful questions
- Example (generateQuestion):
  - buy keyboard mouse → Where can I buy a keyboard and a mouse?
  - buy apple laptop → Where can I buy an apple and a laptop?
- Method:
  - Re-use the questions previously asked on Q&A platforms
  - Use additional information, when available:
    - user profile (questions asked previously) → preferences of a user
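
The idea of re-using previously asked questions can be sketched as a simple keyword lookup over a question archive. This is only an illustration under stated assumptions: the `ARCHIVE` contents, the `tokenize` and `generate_questions` helpers, and the "all keywords must appear" criterion are hypothetical and do not describe the authors' method.

```python
import re

# Hypothetical archive of questions previously asked on a Q&A platform.
ARCHIVE = [
    "Where can I buy a keyboard and a mouse?",
    "Where can I buy an Apple laptop?",
    "How do I clean a laptop keyboard?",
]

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def generate_questions(keywords, archive, top_n=3):
    """Return archived questions that contain every keyword, shortest first."""
    query = tokenize(keywords)
    candidates = []
    for question in archive:
        tokens = tokenize(question)
        if query <= tokens:  # every keyword occurs in the question
            candidates.append((len(tokens), question))
    candidates.sort()
    return [question for _, question in candidates[:top_n]]

print(generate_questions("buy keyboard mouse", ARCHIVE))
print(generate_questions("buy apple laptop", ARCHIVE))
```

Because the candidates are questions that people actually asked, they are grammatical by construction; a user profile, when available, could be used to re-rank the candidates, which this sketch does not attempt.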

Datasets and Evaluation
- Evaluation datasets from WikiAnswers: questions + manually tagged reformulations, including low quality questions
- Extrinsic evaluation: measuring the impact on automatic QA of low quality questions vs. automatically generated high quality questions

Acknowledgements
Ubiquitous Knowledge Processing Lab
http://www.ukp.tu-darmstadt.de