UT Dallas CS 6359 - Lecture8


CS 6322: Information Retrieval
Sanda Harabagiu
Lecture 8: Question Answering 2

Part II. Structure of main QA modules
- Question processing
- Document retrieval
- Answer extraction

Part III. Advanced topics in QA
- Keyword alternations
- Question caching
- Special cases of questions
- Statistical methods
- Question Treebanks
- Logic forms
- Semantic indexing

An Example

Q733: Who was the first Russian astronaut to walk in space?

Who/WP was/VBD the/DT first/JJ Russian/NNP astronaut/NN to/TO walk/VB in/IN space/NN

[Figure: parse tree of the question (NP, PP, VP, and S constituents) and its semantic representation, with the nodes first, Russian, astronaut, walk, and space and the answer type PERSON.]

Detecting the Answer Type

1. Determine the category(ies) of the question stem
2. Select answer type nodes {A} having the same category as the question stem
3. Select node N that
   (a) is connected to the question stem
   (b) has highest connectivity in the semantic representation
4. Search for the word in node N along Answer hierarchies
5. Return the answer type as the top of the hierarchy found when N was located

Possible Answer Types

TOP: PERSON, LOCATION, DATE, TIME, PRODUCT, NUMERICAL VALUE, MONEY, ORGANIZATION, MANNER, REASON
Second-level types shown in the figure: DEGREE, DIMENSION, RATE, DURATION, PERCENTAGE, COUNT
Word hierarchies attached to the answer types in the figure:
- time of day; midnight; prime time; clock time
- hockey team; team, squad; institution, establishment; financial institution; educational institution
- numerosity, multiplicity; integer, whole number; population; denominator
- thickness; width, breadth; distance, length; altitude; wingspan

Examples

- What is the name of the actress that played in Shine?
  Semantic representation nodes: What, played, actress, name, Shine. Answer type: PERSON
- What does the BMW company produce?
  Semantic representation nodes: What, BMW, company, produce. Answer type: PRODUCT

Question Taxonomy

[Figure: the Question Taxonomy. Each Question Taxonomy Node groups several entries of the form (answer type, Question-STEM, word 1, word 2, ...). Other labels in the figure: Mapping Answer Type, Question Reformulation Rules, Question Word Alternations, Question Semantic Representation, and the Answer Taxonomy (type i, type j, type k, answer type n).]

Named Entity Categories

date, time, organization, town, product, price, country, money, human, disease, phone number, continent, percent, province, other location, plant, mammal, alphabet, airport code, game, bird, reptile, University, dog breed, number, quantity, attraction

Answer taxonomy nodes shown below Top: DATE, TIME, ORGANIZATION, REASON, MANNER, NATIONALITY, PRODUCT, MONEY, LANGUAGE, MAMMAL, GAME, DOG BREED, LOCATION, REPTILE, NUMERICAL VALUE, QUOTATION, ALPHABET, PERCENTAGE
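
The five steps listed under "Detecting the Answer Type" can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration: the tables STEM_CATEGORIES and ANSWER_HIERARCHIES, the function detect_answer_type, and the edges of the two semantic representations are hypothetical, since the slides do not give the actual hierarchies or graph edges.

```python
from collections import defaultdict

# Hypothetical toy tables; the lecture does not list the real ones.
STEM_CATEGORIES = {
    "who": {"PERSON", "ORGANIZATION"},
    "what": {"PERSON", "PRODUCT", "LOCATION", "ORGANIZATION"},
}
ANSWER_HIERARCHIES = {
    "PERSON": {"actress", "astronaut", "pilot"},
    "PRODUCT": {"produce", "car"},
    "ORGANIZATION": {"company", "team"},
}


def detect_answer_type(stem, edges):
    """Return the top of the hierarchy containing the selected node, or None."""
    # Steps 1-2: answer types compatible with the question stem.
    candidates = STEM_CATEGORIES.get(stem, set(ANSWER_HIERARCHIES))

    # Build an undirected graph from the semantic representation.
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)

    # Step 3: among nodes connected to the stem, take the best-connected one.
    neighbours = graph[stem]
    if not neighbours:
        return None
    node = max(sorted(neighbours), key=lambda w: len(graph[w]))

    # Steps 4-5: look the word up in each candidate hierarchy, return its top.
    for top in sorted(candidates):
        if node in ANSWER_HIERARCHIES.get(top, ()):
            return top
    return None


# Assumed edges for the two example questions (the figure's edges are lost).
shine = [("what", "actress"), ("name", "actress"),
         ("actress", "played"), ("played", "shine")]
bmw = [("what", "produce"), ("company", "produce"), ("bmw", "company")]

print(detect_answer_type("what", shine))  # PERSON
print(detect_answer_type("what", bmw))    # PRODUCT
```

With these assumed edges the sketch reproduces the answer types given for the two example questions; a different choice of edges or hierarchy entries could give a different result.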
Mapping answer types into named entity categories

Answer Type: Person, Money, Speed, Duration, Amount
Named Entity Category: human, money, price, quantity, number
[Figure: arrows link each answer type to one or more named entity categories; the links are not recoverable from the extracted text.]

Document Retrieval

Main approaches used so far: traditional IR and some NLP extensions
- Indexing
  - Word-based
  - Named entities (terms and variants)
  - Conceptual indexing
  - Paragraph indexing
- Retrieval
  - Retrieve documents, then rank them
  - Retrieve documents, extract passages, then rank passages
  - Retrieve passages directly and rank them
- Retrieval methods
  - Vector model
  - Boolean model

LIMSI (Ferret et al., TREC-9, 2000)

[Figure: architecture of the LIMSI system. Labels: Q, NL, Q. Analysis, Term Extraction, Search Engine, Doc, Indexing, Named Entity, Question Sentence Matching, Ranking, Answer.]

Terminological variants for document selection

LIMSI’s QALC System
- high level indexes, comprising terms and variants
- 2-step procedure:
  1) automatic term extraction from questions (uses POS tagging and pattern matching)
  2) automatic document indexing (uses term and variant recognition)

Term Extraction in QALC

- Questions are tagged with TreeTagger (Schmid 1999)
- Patterns of symbolic categories are used to extract terms from the tagged questions. The pattern used to extract terms is:
  (((((JJ | NN | NP | VBG)) ? (JJ | NN | NP | VBG) (NP | NN))) | (VBD) | (NN) | (NP) | (CD))

Extraction Example

name/NN of/IN the/DT US/NP helicopter/NN pilot/NN shot/VBD down/PP

4 terms are acquired:
- US helicopter pilot
- helicopter pilot
- pilot
- shot

Variant Recognition

Uses FASTR (Jacquemin, ACL ’99), a transformational shallow parser for the recognition of term occurrences and variants.
How?
- Terms are transformed into grammar rules, and the single words building these terms are extracted and linked to their morphological and semantic families.

Morphological and Semantic Families

The morphological family of w is M(w): the words returned by the CELEX database that have the same root morpheme as w.
Example: M(maker) = {maker, make, to make, to remake}
The semantic family of w is S(w): all the WordNet synsets containing w.
Example: S(maker) = {maker, manufacturer, shaper, manufacturing business} (2 senses!)

Variants

Variant patterns that rely on morphological and semantic families are generated through METARULES.
Example: the pattern N to SemArg
  VM(‘maker’) RP? PREP? (ART (NN | NP)? PREP)? ART? (JJ | NN | NP | VBD | VBG){0-3} NS(‘car’)
extracts ‘making many automobiles’ as a variant of ‘car manufacturer’.
Problem: some incorrect variants are extracted as well, e.g. ‘make those cuts in auto’.

Document Selection

The result of NLP-based indexing is a list of term occurrences composed of:
- a document
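
The tag pattern given under "Term Extraction in QALC" can be tried out with a short Python sketch. It is a sketch under stated assumptions and not the QALC implementation: the names TERM_PATTERN and extract_terms are hypothetical, the pattern is rewritten as a regular expression over space-separated tags, and every contiguous span of one to three tokens is tested against it.

```python
import re

# The slide's pattern as a regex over space-separated POS tags:
# an optional modifier, a modifier, and an NP/NN head, or a single
# VBD, NN, NP, or CD token.
TERM_PATTERN = re.compile(
    r"(?:(?:JJ|NN|NP|VBG) )?(?:JJ|NN|NP|VBG) (?:NP|NN)"
    r"|VBD|NN|NP|CD"
)


def extract_terms(tagged):
    """Return every contiguous span (1-3 tokens) whose tags match the pattern."""
    terms = []
    for start in range(len(tagged)):
        for end in range(start + 1, min(start + 3, len(tagged)) + 1):
            span = tagged[start:end]
            tags = " ".join(tag for _, tag in span)
            if TERM_PATTERN.fullmatch(tags):
                terms.append(" ".join(word for word, _ in span))
    return terms


# The tagged fragment from the extraction example.
tagged_question = [
    ("name", "NN"), ("of", "IN"), ("the", "DT"), ("US", "NP"),
    ("helicopter", "NN"), ("pilot", "NN"), ("shot", "VBD"), ("down", "PP"),
]
terms = extract_terms(tagged_question)
print(terms)
```

Testing every span over-generates relative to the extraction example: alongside the four listed terms, the sketch also returns 'name', 'US', 'helicopter', and 'US helicopter'. That suggests QALC applies a further selection step that is not described in this section.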

