Question Answering
What is Question Answering?

Dan Jurafsky

Question Answering
[Figure: dependency structures for the question “What do worms eat?” and for the candidate answers “Worms eat grass”, “Grass is eaten by worms”, “Birds eat worms”, and “Horses with worms eat grass”]
One of the oldest NLP tasks (punched card systems in 1961)
Simmons, Klein, McConlogue. 1964. Indexing and Dependency Logic for Answering English Questions. American Documentation 15:30, 196-204

Question Answering: IBM’s Watson
• Won Jeopardy on February 16, 2011!
WILLIAM WILKINSON’S “AN ACCOUNT OF THE PRINCIPALITIES OF WALLACHIA AND MOLDOVIA” INSPIRED THIS AUTHOR’S MOST FAMOUS NOVEL
Bram Stoker

Apple’s Siri

Wolfram Alpha

Types of Questions in Modern Systems
• Factoid questions
    • Who wrote “The Universal Declaration of Human Rights”?
    • How many calories are there in two slices of apple pie?
    • What is the average age of the onset of autism?
    • Where is Apple Computer based?
• Complex (narrative) questions:
    • In children with an acute febrile illness, what is the efficacy of acetaminophen in reducing fever?
    • What do scholars think about Jefferson’s position on dealing with pirates?

Commercial systems: mainly factoid questions
Where is the Louvre Museum located?                    In Paris, France
What’s the abbreviation for limited partnership?       L.P.
What are the names of Odin’s ravens?                   Huginn and Muninn
What currency is used in China?                        The yuan
What kind of nuts are used in marzipan?                almonds
What instrument does Max Roach play?                   drums
What is the telephone number for Stanford University?  650-723-2300

Paradigms for QA
• IR-based approaches
    • TREC; IBM Watson; Google
• Knowledge-based and Hybrid approaches
    • IBM Watson; Apple Siri; Wolfram Alpha; True Knowledge Evi

Many questions can already be answered by web search

IR-based Question Answering

IR-based Factoid QA
[Figure: pipeline diagram. The Question enters Question Processing (Query Formulation, Answer Type Detection). Documents are indexed; Document Retrieval returns relevant docs, and Passage Retrieval turns them into passages. Answer Processing takes the passages and produces the Answer.]

IR-based Factoid QA
• QUESTION PROCESSING
    • Detect question type, answer type, focus, relations
    • Formulate queries to send to a search engine
• PASSAGE RETRIEVAL
    • Retrieve ranked documents
    • Break into suitable passages and rerank
• ANSWER PROCESSING
    • Extract candidate answers
    • Rank candidates
        • using evidence from the text and external sources

Knowledge-based approaches (Siri)
• Build a semantic representation of the query
    • Times, dates, locations, entities, numeric quantities
• Map from this semantics to query structured data or resources
    • Geospatial databases
    • Ontologies (Wikipedia infoboxes, dbPedia, WordNet, Yago)
    • Restaurant review sources and reservation services
    • Scientific databases

Hybrid approaches (IBM Watson)
• Build a shallow semantic representation of the query
• Generate answer candidates using IR methods
    • Augmented with ontologies and semi-structured data
• Score each candidate using richer knowledge sources
    • Geospatial databases
    • Temporal reasoning
    • Taxonomical classification

Question Answering
Answer Types and Query Formulation

Factoid Q/A
[Figure: the same IR-based factoid QA pipeline diagram, from Question through Question Processing, Passage Retrieval, and Answer Processing to Answer]

Question Processing
Things to extract from the question
• Answer Type Detection
    • Decide the named entity type (person, place) of the answer
• Query Formulation
    • Choose query keywords for the IR system
• Question Type classification
    • Is this a definition question, a math question, a list question?
• Focus Detection
    • Find the question words that are replaced by the answer
• Relation Extraction
    • Find relations between entities in the question

Question Processing
They’re the two states you could be reentering if you’re crossing Florida’s northern border
• Answer Type: US state
• Query: two states, border, Florida, north
• Focus: the two states
• Relations: borders(Florida, ?x, north)

Answer Type Detection: Named Entities
• Who founded Virgin Airlines?
    • PERSON
• What Canadian city has the largest population?
    • CITY

Answer Type Taxonomy
• 6 coarse classes
    • ABBREVIATION, ENTITY, DESCRIPTION, HUMAN, LOCATION, NUMERIC
• 50 finer classes
    • LOCATION: city, country, mountain…
    • HUMAN: group, individual, title, description
    • ENTITY: animal, body, color, currency…
Xin Li, Dan Roth. 2002. Learning Question Classifiers. COLING'02

Part of Li & Roth’s Answer Type Taxonomy
[Figure: taxonomy tree]
• LOCATION: country, city, state
• NUMERIC: date, percent, money, size, distance
• HUMAN: individual, title, group
• ENTITY: food, currency, animal
• DESCRIPTION: definition, reason
• ABBREVIATION: expression, abbreviation

Answer Types

More Answer Types

Answer types in Jeopardy
• 2500 answer types in 20,000 Jeopardy question sample
• The most frequent 200 answer types cover < 50% of data
• The 40 most frequent Jeopardy answer types
  he, country, city, man, film, state, she, author, group, here, company, president, capital, star, novel, character, woman, river, island, king, song, part, series, sport, singer, actor, play, team, show, actress, animal, presidential, composer, musical, nation, book, title, leader, game
Ferrucci et al. 2010. Building Watson: An Overview of the DeepQA Project. AI Magazine. Fall 2010. 59-79.

Answer Type Detection
• Hand-written rules
• Machine Learning
• Hybrids

Answer Type Detection
• Regular expression-based rules can get some cases:
    • Who {is|was|are|were} PERSON
    • PERSON (YEAR – YEAR)
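
The two rule templates above can be sketched as ordinary regular expressions. This is only an illustrative sketch, not the rules used by any particular system: the slide does not say what each rule returns, so the hypothetical `match_rules` function below simply reports which pattern fired and the name it matched. PERSON is approximated here by a run of capitalized words, a crude stand-in for a real named-entity tagger, and YEAR by four digits; both are assumptions.

```python
import re

# Assumption: PERSON is approximated by one or more capitalized words.
# A real system would use a named-entity tagger instead.
PERSON = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"
# Assumption: YEAR is any four-digit number.
YEAR = r"\d{4}"

# Rule 1: Who {is|was|are|were} PERSON
WHO_RULE = re.compile(rf"^Who (?:is|was|are|were) (?P<person>{PERSON})\b")

# Rule 2: PERSON (YEAR – YEAR); accepts a hyphen or an en dash between years.
LIFESPAN_RULE = re.compile(
    rf"(?P<person>{PERSON}) \((?P<born>{YEAR})\s*[-\u2013]\s*(?P<died>{YEAR})\)"
)


def match_rules(text):
    """Return (rule_name, person) for the first rule that fires, else None.

    The rule names are hypothetical labels chosen for this sketch.
    """
    m = WHO_RULE.search(text)
    if m:
        return ("who_person", m.group("person"))
    m = LIFESPAN_RULE.search(text)
    if m:
        return ("person_lifespan", m.group("person"))
    return None


if __name__ == "__main__":
    for text in [
        "Who was Bram Stoker?",
        "Bram Stoker (1847\u20131912) wrote Dracula",
        "What currency is used in China?",
    ]:
        print(text, "->", match_rules(text))
```

Rules like these are precise on the cases they cover but miss most questions (the third example matches nothing), which is why the slides list machine learning and hybrid approaches alongside hand-written rules.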