
A Natural Language Question and Answer System
Chris Callison-Burch and Philip Shilane
June 3, 2000

1 Introduction

Our project is a question and answer system that allows natural language questions to be asked of a knowledge base of information. Our program is a grammar-based system which maps English questions and statements onto predicate logic. User input is parsed, converted to KIF (Knowledge Interchange Format), and sent to a reasoner that does inference using a set of first order logic axioms and returns a solution. In order to implement this program we developed a large scale grammar for English questions based on recent work in Head-driven Phrase Structure Grammar. Highlights of our grammar include the facts that it:

• Distinguishes between questions and propositions,
• Handles a wide range of question constructions, including polar interrogatives, subject and non-subject wh-interrogatives, and multiple wh-questions,
• Provides a robust, linguistically motivated analysis of long-distance dependencies,
• Creates a mapping between wh-words and the semantic arguments that they are associated with.

We tested our system on a family tree domain, but the approach is flexible and can be adapted to other fields. The grammar is general enough to be reused simply by expanding it to include the necessary vocabulary. This is a strong advantage over systems which do not provide such broad coverage of English, and instead are designed with domain specific questions in mind. The theorem proving program is completely domain independent, though a new knowledge base of facts and inference rules would have to be designed for each new domain. Creating a new knowledge base is often time consuming, but any system of this type would likely have this limitation, and there are a number of efforts to create general purpose, reusable KBs.
The main advantage of a question answering system is that a human user can ask questions in natural language instead of generating queries in a logic format.

In the rest of this section, we give an overview of our linguistically motivated approach and then describe the pre-existing parser and theorem prover which we integrated into our application. In Section 2, we describe some of the mechanisms in our grammar, give details about our hand-built family tree knowledge base, and end with a description of the code which we use to interface between the two. In Section 3, we give an example session highlighting the question types that our grammar covers. In Section 4, we discuss the coverage of our grammar based on a few user testing sessions. In the final section, we describe future extensions to the system.

1.1 A Linguistically Motivated Approach

Our system implements the Ginzburg and Sag (forthcoming) theory of English interrogative constructions. Their theory is formulated in Head-driven Phrase Structure Grammar (HPSG), a constraint based linguistic formalism, and influenced by situation theory semantics. HPSG uses typed feature structures to define a fine-grained typology of language, which provides a comprehensive account of a wide range of syntactic phenomena. Some of the syntactic phenomena specific to English questions include extraction, inversion, and sensitivity to the presence of wh-words. Each of these is treated in the theory, and integrated into our implementation.

We modeled the types and constraints described in Ginzburg and Sag (forthcoming) using the LKB (Copestake in preparation) parsing software, and leverage it in interpreting the input to our question and answer system. This type of grammar-based approach allows us to precisely model utterances in a language, and specify a mapping between syntax and semantics.
Creating a correspondence between the syntax and semantics of a language allows the surface form of statements to be mapped onto a more abstract, logical representation. We use this logical representation as a way of interfacing with the theorem prover.

The following is an example of the type of representation that our grammar builds for a question:

(1) Who left? ↦

    question
    [ params  < param [ index  1
                        restr  { person-rel(1) } ] >
      prop    proposition
              [ sit  s
                soa  [ quants  < >
                       nucl    leave-rel [ leaver  1 ] ] ] ]

Contrast that with the representation for a statement such as “Kim left”:

(2) Kim left ↦

    proposition
    [ sit  s
      soa  [ quants  < >
             nucl    leave-rel [ leaver  kim ] ] ]

Thus, each question is treated as being about a certain proposition, with a set of variables, or parameters, to be determined in an answer. The elements in the set of parameters correspond to the wh-words in a question, with each wh-word introducing one parameter. The params feature thereby links the parameter for each wh-word to the argument position within the proposition. The params feature further introduces restrictions that the referent of the parameter must satisfy, but we ignore those for the purposes of our implementation.

Beyond being closely matched to the logical representations used in automated reasoners like JTP, the primary advantage of using feature structure notation in a system like this is that it allows an elegant approach to long-distance dependencies. The innovation of early logic/grammar-based question and answer systems, such as the CHAT-80 system (Warren and Pereira 1982), is precisely that they provided an analysis of extraposition within a framework which used context free grammar notation.
Our treatment is more elegant in that it does not cause a ballooning in the number of grammar rules, as the “/” notation in CFGs for extraction does. We’ll describe our grammar and its treatment of extraposition in more detail in the next section.

1.2 The LKB

The LKB (Linguistic Knowledge Building) system is a grammar and lexicon development environment for use with constraint-based linguistic formalisms. The best way to think about the LKB system is as a development environment for a very high-level specialized programming language. Typed feature structure languages are essentially based on one data structure – the typed feature structure, and one operation – unification. This combination is powerful enough to allow the grammar developer to write grammars and lexicons that can be used to parse and generate natural languages. In effect, the grammar developer is a
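
As a rough illustration of the unification operation mentioned in Section 1.2, the following Python sketch treats feature structures as nested dictionaries with string atoms. It is a simplification and not the LKB's implementation: it ignores types and reentrancy (shared values), and the function name `unify` and the example features (`head`, `agr`, `per`, `num`) are assumptions chosen for illustration.

```python
from typing import Optional, Union

# A feature structure is either an atomic value (a string) or a
# dictionary mapping feature names to feature structures.
FeatureStructure = Union[str, dict]


def unify(a: FeatureStructure, b: FeatureStructure) -> Optional[FeatureStructure]:
    """Return the unification of a and b, or None if they clash.

    Simplified sketch: untyped, no reentrancy, inputs are not mutated.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)  # shallow copy; nested values are never mutated
        for feature, b_value in b.items():
            if feature in result:
                merged = unify(result[feature], b_value)
                if merged is None:
                    return None  # a clash anywhere makes the whole unification fail
                result[feature] = merged
            else:
                result[feature] = b_value
        return result
    # Atomic values (or an atom against a dict) unify only if they are equal.
    return a if a == b else None


# Hypothetical agreement features, for illustration only.
third_person = {"head": {"agr": {"per": "3rd"}}}
singular = {"head": {"agr": {"num": "sg"}}}
plural = {"head": {"agr": {"num": "pl"}}}

print(unify(third_person, singular))  # {'head': {'agr': {'per': '3rd', 'num': 'sg'}}}
print(unify(singular, plural))        # None
```

Compatible structures merge into one structure carrying the information of both, while conflicting atomic values cause failure. A grammar formalism built on this operation can state constraints declaratively and let unification combine them during parsing.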
