5/24/09

Question Answering
CISC489/689-010, Lecture #24
Wednesday, May 13th
Ben Carterette

Question Answering
• Usual IR problem: submit query (usually keywords), receive ranking of documents
• QA: user asks a question in natural language, receives an answer in natural language or a ranking of answers
• Similar problems, but different in fundamental ways
  – Different approaches are successful
  – Different evaluation methods are needed

Knowledge-Based Systems
• Create a database of known facts
• Reformulate question as a statement with a “blank”
  – Question: “who is president of the U.S.?”
  – Reformulation: “The president of the U.S. is ____”
• Find facts in database that match the statement
  – Fact in database: “The president of the U.S. is Barack Obama”
• If none match, find intermediate facts
  – For example, “The president of the U.S. is the former junior senator from Illinois”, “Barack Obama is the former junior senator from Illinois” → “The president of the U.S. is Barack Obama”

Reformulation
• Different reformulations may be possible depending on query type
  – “who” question: “___ is the president of the U.S.”, “The president of the U.S. is ___”, “___, who was born in ___, has been elected president of the U.S.”, …
  – “when” question: “Barack Obama took office in ___”, “Barack Obama was president of the U.S. from ___”, …
• Therefore it is useful to classify question by type
  – Who/what/where/when/how
  – Use type to generate reformulation patterns

Question Classification
• What are some features we could use for this?
  – Does the question contain “who”/“what”/“where”/“when”?
  – Entity types: person names might imply “who”, place names might imply “where”, etc.
• This is not always as easy as it might seem
  – “Name the first private citizen to fly in space”: what features could we use to determine that is a “who” question?

Query Expansion
• Exact reformulation may not exist in database
  – Try different reformulations by query expansion/rewriting
  – “The president of the U.S.” → “The president of the United States”; “The president of the USA”; “President of America”; “American head of state”; …
• As usual, query expansion is noisy
• It is not always possible to find the right terms to expand with

Statistical Systems
• Knowledge-based systems have a lot of shortcomings
  – They require people to add facts to the database
  – Those facts have to be verified
  – Automatically matching a question to known facts is not easy
  – Identifying when a question does not match any fact is not easy
  – And figuring out that an answer can be deduced from other facts is definitely not easy
• Instead of relying on known facts, use large document corpora and text statistics to find likely answers

How To Do It?
• Locate documents that might contain answer
  – How?
• Locate parts of those documents that are most likely to contain answer
  – How?
• Reformulate those parts into natural-language answers
• Remove answers that seem to be duplicates

Document Retrieval for QA
• This is a task for which it might be useful to have entities tagged
• Named entity recognition: NLP task for finding elements in text and tagging them as belonging to predefined categories
• For example, “Jim bought 300 shares of Acme Corp. in 2006” might become:
  – <ENAMEX TYPE="PERSON">Jim</ENAMEX> bought <NUMEX TYPE="QUANTITY">300</NUMEX> shares of <ENAMEX TYPE="ORGANIZATION">Acme Corp.</ENAMEX> in <TIMEX TYPE="DATE">2006</TIMEX>.
• Given such output, we can index the content of these tags using the same methods used for indexing title words, etc.

Document Retrieval with NEs
• With a named-entity-tagged corpus, we can transform the question into a query based on the question classification
• A “who” query like “who is the president of the U.S.” might become something like
  – #and(president U.S. #person.any)
  – Where #person.any tells the engine to match any document containing something tagged as a “person”
• Query expansion could be applied naturally using top-retrieved documents

Passage Retrieval
• QA systems often depend on retrieving short pieces of documents rather than full documents
• This is called passage retrieval
• Examples of passages: 50-word windows, 250-word windows, sentences, paragraphs, etc.
• Fixed- or variable-length passages can easily be retrieved if term positions have been indexed
• Sentences and paragraphs can be tagged, and tag information can be indexed just like entity or markup tags

Passage Reformulation
• It is possible to “learn” how to reformulate passages
• Idea: use training data (questions with known answers) to find the passages that contain the answers
• From those passages, learn patterns for the question type
• New questions can then be answered by finding passages that match the learned patterns and pulling the answers out

Example
• “When was Bill Clinton elected president?” – 1992
• Passages that match the question and answer:
  – Bill Clinton was elected president in 1992
  – The election was won by Bill Clinton in 1992
  – Clinton defeated Bush in 1992
  – Clinton won the electoral college in 1992
• Take the most common of these and turn them into general patterns
  – #person was elected president in #year
  – The election was won by #person in #year
  – #person defeated Bush in #year     (how useful is this?)
• Then new questions can be answered by finding passages that match the pattern
• “When was Barack Obama elected president?”

QA Experiments
• TREC ran a QA track from 1999 through 2003
• The track has changed a lot over time:
  – Question types, evaluation, document corpus
• The first track used factual questions that definitely had answers in a collection of news articles
  – Subsequent tracks have included definition questions and list questions, and not all questions have answers in the document set
• Systems answer questions with short text passages
  – 50 or 250 bytes that answer the question and support the answer
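
Sketch: answering a “when” question with learned patterns
• The pattern idea from the Passage Reformulation example (#person was elected president in #year) can be sketched in a few lines of Python
• This is only an illustration under stated assumptions, not the method of any particular QA system
  – The names SLOTS, compile_pattern and answer_when are hypothetical
  – The slot regexes are crude stand-ins for real named-entity tags
  – People are matched by last name only
  – Each template is assumed to use each slot at most once

```python
import re
from collections import Counter

# Hypothetical slot definitions: simple regexes standing in for
# named-entity tags (an assumption made for this sketch).
SLOTS = {
    "#person": r"(?P<person>[A-Z][a-z]+(?: [A-Z][a-z]+)*)",
    "#year": r"(?P<year>\d{4})",
}


def compile_pattern(template):
    """Turn a template like '#person was elected president in #year' into a regex."""
    parts = re.split(r"(#\w+)", template)
    # Slots become named groups; everything else is matched literally.
    regex = "".join(SLOTS[p] if p in SLOTS else re.escape(p) for p in parts)
    return re.compile(regex)


def answer_when(person, passages, templates):
    """Return the year most often extracted for `person`, or None if nothing matches."""
    last_name = person.split()[-1]
    votes = Counter()
    for template in templates:
        pattern = compile_pattern(template)
        for passage in passages:
            for match in pattern.finditer(passage):
                # Crude entity match: compare last names only.
                if match.group("person").split()[-1] == last_name:
                    votes[match.group("year")] += 1
    return votes.most_common(1)[0][0] if votes else None


templates = [
    "#person was elected president in #year",
    "The election was won by #person in #year",
]
passages = [
    "Barack Obama was elected president in 2008",
    "The election was won by Barack Obama in 2008",
    "Bill Clinton was elected president in 1992",
]
print(answer_when("Barack Obama", passages, templates))
```

• Counting matches across passages is one simple way to prefer answers supported by several passages; a real system would also need the duplicate-removal step listed under “How To Do It?”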


