Automated Question Answering
Motivation: support for students
• Demand is for 365 x 24 support
  – Students set aside time to complete a task
  – If a problem is encountered, immediate help is required
• The majority of responses direct students to the teaching materials, so the answer is usually there; the problem is finding it
• Poor forum search
  – Search is per forum, not per course
  – Free-text search options are fixed by the RDBMS: no explicit operators (AND, OR, NEAR)
Research questions
Given the current level of development of natural language processing (NLP) tools, is it possible to:
• Classify messages as question/non-question
• Identify the topic of the question
• Direct users to specific course resources
Natural Language Processing tools
• Tokenisation (words, numbers, punctuation, whitespace)
• Sentence detection
• Part of speech tagging (verbs, nouns, pronouns, etc.)
• Named entity recognition (names, locations, events, organisations)
• Chunking/parsing (noun/verb phrases and relationships)
• Statistical modelling tools
• Dictionaries, word-lists, WordNet, VerbNet
• Corpora tools (Lucene, Lemur)
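Most of these steps are available off the shelf. As a minimal sketch, assuming NLTK (the Python toolkit listed later in this deck) with its standard models downloaded, the first four steps look like this; the sample sentence is invented:

import nltk
# one-off model downloads needed first: punkt, averaged_perceptron_tagger,
# maxent_ne_chunker, words (via nltk.download)

text = "I installed Eclipse in London. The BPEL editor will not start."

sentences = nltk.sent_tokenize(text)       # sentence detection
tokens = nltk.word_tokenize(sentences[0])  # tokenisation
tagged = nltk.pos_tag(tokens)              # part of speech tagging
entities = nltk.ne_chunk(tagged)           # named entity recognition

print(tagged)    # [('I', 'PRP'), ('installed', 'VBD'), ...]
print(entities)  # tree marking named entities, e.g. (GPE London/NNP)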
Question answering solutions
• Open domain
  – No restrictions on question topic
  – Typically answers from web resources
  – Extensive literature
• Closed domain
  – Restricted question topics
  – Typically answers from a small corpus (company documents, structured data)
Open domain QA research
• Well established over two decades
• TREC (Text REtrieval Conference)
  – Funded by NIST/DARPA since 1992
  – QA track 1999–2007, directed at 'factoids'
• CLEF (Cross Language Evaluation Forum)
  – 2001–current
  – Information retrieval, language resources
• NTCIR (NII Test Collection for IR Systems)
  – 1997–current
  – IR, question answering, summarisation, extraction
TREC Factoids
• Given a fact-based question:
  – How many calories in a Big Mac?
  – Who was the 16th President of the United States?
  – Where is the Taj Mahal?
• Return an exact answer in 50/250 bytes:
  – 540 calories
  – Abraham Lincoln
  – Agra, India
Minimal factoid process
• Question analysis
  – Normalisation (verbs, auxiliaries, modifiers)
  – Identify entities (people, locations, events)
  – Pattern detection (who was X?, how high is Y?)
• Query creation, expansion, and execution
  – Ordered terms, combined terms, weighted terms
• Answer analysis
  – Match answer type to question type
A toy sketch of this flow follows.
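The sketch below only illustrates the data flow; every function, including the stand-in search engine, is a hypothetical toy, not the project's implementation:

import re

def normalise(question):
    # toy normalisation: lowercase, 'was' -> 'be', drop trailing '?'
    return re.sub(r"\bwas\b", "be", question.lower()).rstrip("?")

def expected_answer_type(question):
    # toy pattern detection: 'who ...' questions expect a person
    return "person" if question.startswith("who") else "other"

def answer_factoid(question, search_engine):
    q = normalise(question)
    candidates = search_engine(q)   # list of (text, entity_type) pairs
    # answer analysis: match answer type to question type
    return [text for text, etype in candidates
            if etype == expected_answer_type(q)]

# toy search engine standing in for query creation/expansion/execution
hits = lambda q: [("Abraham Lincoln", "person"), ("540 calories", "quantity")]
print(answer_factoid("Who was the 16th President?", hits))  # ['Abraham Lincoln']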
OpenEphyra: open source QA
Source: http://www.cs.cmu.edu/~nico/ephyra/doc/images/overall_architecture.jpg
OpenEphyra: question analysis
Question: 'who was the fourth president of the USA'
Normalization: 'who be fourth president of USA'
Answer type: NEproperName -> NEperson
Interpretation: property = NAME, target = fourth president, context = USA
OpenEphyra: query expansion
1. "fourth president USA"
2. (fourth OR 4th OR quaternary) president (USA OR US OR U.S.A. OR U.S. OR "United States" OR "United States of America" OR "the States" OR America)
3. "fourth president" "USA" fourth president USA
4. "was fourth president of USA"
5. "fourth president of USA was"
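The synonym alternatives in query 2 are the kind of output WordNet provides. A small sketch, assuming NLTK's WordNet corpus has been downloaded:

from nltk.corpus import wordnet  # one-off: nltk.download('wordnet')

def synonyms(word):
    # collect lemma names across all synsets of the word
    return sorted({lemma.name().replace('_', ' ')
                   for synset in wordnet.synsets(word)
                   for lemma in synset.lemmas()})

print(synonyms('fourth'))  # expected to include '4th' and 'quaternary'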
OpenEphyra: result
Answer: James Madison
Score: 0.7561732
Docid: http://www.squidoo.com/james-madison-presidentusa
[Slide shows the document content]
Shallow answer selection
• Answer based on a reformulation of the question
  – Who was the fourth president of the USA?
Importance of named entities
[Diagram: the question is processed for NEs and passed to the search engine; search results are tagged with NEs; during answer matching, the extracted NEs link question and answer]
PREPARATORY TASKS
Task list: the real work
• Create database of forum messages
• Adapt open source NLP tools
  – Tokenisation, sentence detection, parts of speech, parsing
• Establish question patterns
• Create language analysis tools
  – Word frequency
  – Named entities: define, build, and train models
• Prepare corpus
  – Format and tag documents (doc, html, pdf)
  – Build Indri catalogue and search interface
Iterative process: build, test, refine
NLP tools
• Predominantly Java
  – Stanford, OpenNLP, LingPipe
  – GATE: complete analysis + processing system
  – IKVM permits use with the .NET framework
• Some C++, C#
  – WordNet, Lemur/Indri, NooJ, SharpNLP
• Python NLTK
  – Complete NLP toolset and corpus
• Lisp, Prolog
Message database
• MySQL database for FirstClass messages
• Extract:
  – Forum, Subject, Date, Author
  – Body
• Use subject to classify as Original or Reply
No clean-up or filtering of message content undertaken at this stage
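A sketch of the extraction schema and the Original/Reply split. SQLite stands in for MySQL here, and the column names and the 'Re:' convention are assumptions, not the project's actual schema:

import sqlite3

db = sqlite3.connect(":memory:")  # in-memory stand-in for the MySQL store
db.execute("""CREATE TABLE messages
              (forum TEXT, subject TEXT, date TEXT, author TEXT, body TEXT)""")

def is_reply(subject):
    # classify Original vs Reply from the subject line, as described above;
    # assumes replies carry a 'Re:' prefix
    return subject.strip().lower().startswith("re:")

print(is_reply("Re: BPEL deployment error"))  # True, i.e. a Reply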
T320 09B database properties
• Total messages: 4246
• Non-replies: 1051
• Manually tagged questions: 777
• Average length (lines): 7.9
• Containing XML: 17
• Containing Eclipse content: 37
Creating question patterns
• Extract text from forum messages (non-replies)
• Create n-grams ('n' adjacent words)
• Perform frequency analysis of n-grams
• Manually review n-grams to create question patterns
A sketch of the frequency step follows.
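A minimal sketch of the n-gram counting step; plain whitespace splitting here stands in for the purpose-built tokeniser described later, and the two messages are invented:

from collections import Counter

def ngrams(text, n):
    # all runs of n adjacent words in the text
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

messages = ["I get the following error message",
            "but I get the following error"]
counts = Counter(g for m in messages for g in ngrams(m, 5))
print(counts.most_common(20))  # the top patterns are then reviewed manually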
N-gram results
Number of words | Unique patterns
6 | 96900
5 | 96780
4 | 94975
3 | 86338
5-word frequency analysis (top 20 results)
Frequency | N-gram
17 | An unexpected error has occurred.
16 | point me in the right
14 | I get the following error
13 | me in the right direction
12 | unexpected error has occurred. UDDIException
9 | does not seem to be
8 | get the following error message
8 | I get an error message
8 | system cannot find the path
7 | Any help would be appreciated.
7 | I am not sure if
6 | I can not seem to
6 | I do not know what
6 | A problem occured while running
6 | but I get the following
– | cannot find the path specified
– | error has occurred. UDDIException java.net.
– | I am not sure how
– | I do not seem to
Sliding window across message
Frequency | N-gram 1 | N-gram 2
1 | am not that knowledgable Help | I am not that knowledgable
1 | am not the early adopter | I am not the early
1 | am not thinking straight today | I am not thinking straight
1 | am not too far off | I am not too far
1 | am not too sure if | I am not too sure
1 | am not using the fault | I am not using the
1 | am noticing in the console | I am noticing in the
1 | am now a while later | I am now a while
1 | am now adding my exception | I am now adding my
1 | am now getting the following | I am now getting the
1 | am now held up again | I am now held up
1 | am now not sure if | I am now not sure
1 | am now stuck on activity | I am now stuck on
1 | am now trying not to | I am now trying not
1 | am now trying to start | I am now trying to
1 | am now willing to submit | I am now willing to
1 | am obviously missing something here |
Candidate question patterns
Class name | Pattern
#question | (a|my) question (about|on|for|is)
#appreciate | (.*) (advice|comment|guidance|help|direction)
#can/could | (can|could|will|would) (any|some)\s?(body|one) (.*) (explain|tell me)
#does | (any|some)\s?(body|one) (have|know)
#having | (have|having) (.*) (problem|nightmare)s?
#how | (best|can|does|do i|do you|do we)
#i am not | (really )?sure (if|how|what|when|whether|why)
#i cannot | i (can not|cannot|could not) find (.*) answer (.*) question
#just | wonder(ed|ing)? (if|what)
#point me | point (me|one) (.*) right direction
Generalisation of patterns using POS
Question part | POS tag
any|some | DT
advice|comment|guidance | NN
appreciated|welcomed | VB(N|D)
. | ./.

Can/MD anyone/NN offer/VB some/DT help/NN ?/.
Can/MD someone/NN offer/VB some/DT help/NN ?/.
Can/MD anybody/RB give/VB some/DT guidance/NN ?/.
Could/MD somebody/RB give/VB some/DT direction/NN ?/.

POS pattern matching failed due to errors in assigning tags
Final question patterns: RegExs
Pattern ID | Weighting | Regular Expression
1 | 0 | (?<…>(a|my)\squestion\s)(?<…>about|on|for|is)
66 | 0 | (?<…>(i\sam|i'm|im)?\shav(e|ing)\s(difficult(y|ie)|issue|problem)(s)?)
67 | 0 | (?<…>i\s(am|have|was))\b(?<…>.*)\b(?<…
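Classification then reduces to running the weighted regular expressions over each message. A sketch with two simplified patterns standing in for the full set of ~67; the weights and threshold are illustrative:

import re

PATTERNS = [                       # (regex, weighting) pairs
    (re.compile(r"(a|my)\squestion\s(about|on|for|is)", re.I), 1.0),
    (re.compile(r"(can|could|will|would)\s(any|some)\s?(body|one)", re.I), 1.0),
]

def is_question(message, threshold=1.0):
    # sum the weights of all patterns that match anywhere in the message
    score = sum(weight for regex, weight in PATTERNS if regex.search(message))
    return score >= threshold

print(is_question("Could anyone explain why the build fails?"))  # True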
CHALLENGES PROCESSING MESSAGES
Poor message style
Incorrect POS tagging due to spelling errors:
when/WRB I/PRP tried/VBD to/TO generate/VB the/DT sample/NN ,/, it/PRP said/VBD the/DT data/NNS is/VBZ available/JJ ./.
XML within messages
Detected as a single sentence
Eclipse console listing within message
Line breaks not recognised as end of sentence
Open-source NLP problems
• Sentence detection failures:
  – Bad style (capitalisation, punctuation)
  – Ellipsis (i tried... it failed... error message...)
  – XML, BPEL segments concatenated into a single sentence
• Tokenisation failures:
  – Multiple punctuation ???, !!! (student emphasis)
  – Abbreviations (im, cant, doesnt, etc.)
• POS errors
  – Spelling, grammar
Purpose-built tools
• Tokeniser
  – Re-coded for typical forum content/style:
    • Multiple punctuation
    • Abbreviations
    • Common contractions
• Sentence detector
  – New detector based on token sequences
• Pre-filter messages
  – Remove XML, console listings, error messages
Message pre-filters
• Short-forms
  – i'm, i m → i am
  – can't, can t → can not
• Line numbers
• Repeated punctuation (!!!, ???, ...)
• Smilies
• Salutations (Hi all, Hiya, etc.)
• Names, signatures, course codes
A sketch of this pass follows.
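A sketch of the pre-filter as ordered regex rewrites; the substitution list is illustrative, a small subset of the filters described above:

import re

REWRITES = [
    (re.compile(r"\bi\s?'?m\b", re.I), "i am"),       # i'm, i m, im
    (re.compile(r"\bcan\s?'?t\b", re.I), "can not"),  # can't, can t, cant
    (re.compile(r"[!?.]{2,}"), "."),                  # !!!, ???, ... -> .
    (re.compile(r"^(hi( all)?|hiya)[,!]?\s*", re.I), ""),  # salutations
]

def prefilter(message):
    for regex, replacement in REWRITES:
        message = regex.sub(replacement, message)
    return message

print(prefilter("Hi all, i'm stuck!!! can t see why..."))
# -> "i am stuck. can not see why."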
Filtered message
[Slide shows a raw message containing an Eclipse console listing alongside the filtered message ready to process]
PRELIMINARY RESULTS: QUESTION CLASSIFICATION
Message-set properties
• Number of messages: 1051 (100%)
• Number of questions (M): 777 (73.9% of messages; baseline 100% below)
• Number of questions (A): 756 (97.3% of M)
• False positives (A not M): 58 (7.4% of M)
• False negatives (M not A): 79 (10.2% of M)
Approx 90% success rate
M = manually annotated question, A = automatically annotated question
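As a worked check of these figures, using only the counts above: the 756 automatic detections include 58 false positives, so 698 are correct, and 79 of the 777 manually annotated questions were missed:

true_positives = 756 - 58           # 698 questions detected correctly
precision = true_positives / 756    # correct share of automatic detections
recall = true_positives / 777       # share of manual annotations recovered
f1 = 2 * precision * recall / (precision + recall)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.923 0.898 0.911

Both figures sit near 0.9, consistent with the "approx 90% success rate" above.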
Message-set properties – cont.
• Pattern matches per message: average 2.7606, min 1, max 12
• Lines per message (ASCII linefeed): average 7.9, min 1, max 68
• Sentences per message: average 5.0, min 1, max 89
• Messages containing XML: 17
• Messages containing BPEL: 37
Distribution of pattern match count
[Bar chart: number of messages against number of pattern matches (0–12); the largest bars are 295, 240, and 174 messages at low match counts, tailing off to single digits beyond six matches]
Challenges: false positives
Challenges: false negatives
Challenges: detecting the question
Messages matching question pattern
[Bar chart: number of messages per pattern ID]
Common question patterns (10)
• any
• (advice|clarification|clue|comment|further thought|guidance|help|hint|idea|opinion|pointer|reason|suggestion|taker)(s)?
• .*
• appreciated|welcomed
216 matches
Terms added over time to improve detection of questions
Sample question match (10)
Common question patterns (50)
• get|getting|gives|got|receive
• .*
• error(s)?
102 matches
Sample question match (50)
Discrimination vs classification
[Bar chart: number of messages per pattern ID]
Low discrimination >>> increases successful classification at the risk of false positives
High discrimination >>> reduces successful classification and the risk of false positives
Does the process transfer?
• Tested against TT380 forums 04J–07J
  – Preliminary results look promising
  – Need to manually tag >4000 messages
  – Review message pre-filters
• Need access to Humanities course material
PRELIMINARY RESULTS: QUESTION TOPIC IDENTIFICATION
Basic method
• Identify named entities
  – NEs are block-specific
  – Majority of questions linked to assignments
• Parse sentence for dependencies
  – Nouns (that are NEs)
  – Verbs
Named entities: inconsistent usage
[Slide contrasts a message subject and body that use different terms for the same entity: "Error handling" vs "Exception handling"]
Deep parsing: dependencies
Sentence: "How can I properly delete PLTs and PLs and roles from the project in order to have a clean sheet again."
advmod(delete-5, How-1)
aux(delete-5, can-2)
nsubj(delete-5, I-3)
advmod(delete-5, properly-4)
dobj(delete-5, PLTs-6)
conj_and(PLTs-6, PLs-8)
conj_and(PLTs-6, roles-10)
det(project-13, the-12)
prep_from(delete-5, project-13)
prep_in(delete-5, order-15)
aux(have-17, to-16)
xcomp(delete-5, have-17)
det(sheet-20, a-18)
amod(sheet-20, clean-19)
dobj(have-17, sheet-20)
advmod(have-17, again-21)
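The slide shows Stanford-parser output; as a sketch of the same idea, here spaCy stands in as the dependency parser (assuming the en_core_web_sm model is installed), pulling out the nouns hanging off the main verb as topic candidates:

import spacy

nlp = spacy.load("en_core_web_sm")  # python -m spacy download en_core_web_sm
doc = nlp("How can I properly delete PLTs and PLs and roles from the project?")

for token in doc:
    # direct objects and their conjuncts are the candidate topic nouns
    if token.dep_ in ("dobj", "conj"):
        print(token.dep_, token.head.text, "->", token.text)
# expected along the lines of: dobj delete -> PLTs, conj PLTs -> PLs, ...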
Sentences per message
[Bar chart: number of messages against sentences per message, 0 to 89]
Sentence counts under-estimated due to spelling/grammar errors.
Of the 120 questions detected as single sentences, >80% are actually multiple sentences.
Guess the topic
"Excuse me for directing this question at you, but when I try to contact my tutor through my homepage i still go to the details for John Stephenson but I am sure that he is ill at the moment. My question refers to the entities described in ECA part 2 page 2, it states that the term identifier must be unique within the UK business domain. I thought Buyers ID and Sellers ID could be their email address, however, I am stuck on the Order ID which might refer to a depatch note as I do not know what standard these identifiers have to conform to in UK business. I would appreciate being directed as to where I can find this information."
Current status
• Unable to establish the question topic for 95% of detected questions
• Current NLP techniques for multi-sentence questions (anaphora and co-reference resolution) are not well established
Pattern matching in console listing
Practical work: exact patterns
• Process|Assign|Invoke|Scope|Reply
• .*
• Completed with fault:
• invalidVariables|uninitializedVariable|joinFailure
Provide a direct link to FAQ or teaching materials (a sketch follows)
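A sketch of the exact-pattern match above; the console line and the FAQ URL scheme are both invented for illustration:

import re

FAULT = re.compile(
    r"(Process|Assign|Invoke|Scope|Reply).*Completed with fault:\s*"
    r"(invalidVariables|uninitializedVariable|joinFailure)")

def faq_link(console_line):
    m = FAULT.search(console_line)
    # hypothetical FAQ URL scheme: one page per fault type
    return f"faq/{m.group(2)}.html" if m else None

print(faq_link("Invoke (assessClaim) Completed with fault: uninitializedVariable"))
# -> faq/uninitializedVariable.html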
Future work
• Further work on sentence detection
  – Everything else depends on this
• Create patterns to identify content
  – "how do i (.*)"
  – "are you now saying (.*)"
  – "(.*) word count"
• Establish relationships between initial message and replies
• Build tool to process Eclipse console listings
  – Could address 5% of all ECA-related questions