DOC PREVIEW
MIT 6 863J - Phrase Structure Parsing

This preview shows page 1-2-3-4-5-6 out of 18 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 18 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

1Massachusetts Institute of Technology6.863J/9.611J, Natural Language Processing, Spring, 2002Department of Electrical Engineering and Computer ScienceDepartment of Brain and Cognitive SciencesLaboratory 2: Phrase Structure ParsingHanded out: March 10, 2003 Due: March 19, 2003Introduction: goals of the laboratoryThe major aim of this laboratory is to give you an (admittedly brief) experience in building “real”grammars, along with an understanding of how syntactic phenomena interact with context-free parsingalgorithms. The laboratory is designed so that you will become familiar with the Earley algorithm/AKAchart context-free parsing. By the end of the lab, if you want to build a large grammar on your own,for any language, you’ll know the basics about how to proceed (see the first footnote). In addition,you will learn how to use features to simplify grammar construction. Here are the main tasks for thislaboratory:1. We will have you do a very simple warm-up exercise, and answer two basic questions about theparser’s operation.2. We will give you a simple context-free “starter” grammar in the course locker. We ask you ask youto extend this grammar to handle a slightly wider variety of syntactic phenomena with differentkinds of verb arguments and embedded sentences.3. Next, you will extend your augmented grammar to handle simple questions, such as “Whichdetectives solved the case?”4. Finally, using features, you’ll write an augmented context-free grammar to cover the same exam-ples, and compare its size to the grammar you have built that does not use features.Similar to previous labs, your completed laboratory report should be written as a web page, andcontain at a minimum these items:1. A description of how your system operates. How does your system handle each of the syntacticphenomena described below? Include a description of the rules that were used to encode therelevant changes to the grammars and how these were implemented (e.g. “I added the rule VP...In order to implement this rule, we had to . . . ”). If appropriate, mention problems that arose inthe design of your grammar.2. Your modifications to any initial grammar rule files that we supply to you.3. A record of a parser run on the relevant sentences. (We supply them as part of the “starter”grammar, as described below — you won’t have to type these in yourself.)4. Your feature-based context-free grammar for the same sentences, as well as a comparison to yourfeature-free context-free grammar.5. Bonus: if you are (very) ambitious, you might want to tackle a larger set of “challenge” sentencesthat are very much harder to successfully parse using a context-free grammar. You are welcomePhrase Structure Parsing 2to try your hand at these. This makes a very wonderful final project. It shows how truly hard itis to write a complete context-free grammar for English.1Before proceeding to the laboratory questions themselves, we first describe the Earley/chart parserimplementation in the course locker, and how to use it.General Introduction to the Earley Parser ImplementationThe Earley parser we have implemented runs in Common Lisp. (It is linkedonthecoursewebpage).The following is a brief descriptionof how to load and start the system, load in grammar rules, list, and modify them, etc. For completedocumentation of the system, you can, and should, go to the links on the course web page. These pdf and ps files are also in the earley directory in the course locker.The 6.863 Earley parser will work with any well-defined context-free grammar (CFG), includingrecursive ones and those containing empty categories (rules that rewrite as the empty symbol). It canuse either no lookahead or a one word lookahead, i.e. k =0ork = 1. One-word lookahead is typicallya big 15% time win, so it is the default setting.A feature system has been overlaid on the parser, allowing powerful grammars to be created moreeasily than with a simple CFG. We will use this feature system in the final part of this laboratory —NOT in the first portions of this lab.Following this convention, when we use the term CFG we mean an ordinary context-free grammarwithout superimposed features. When we use the term GPSG we mean a Generalized Phrase structuregrammar, namely, a context-free grammar with features imposed on top of it. We discuss this extensionin the lab’s final section.In addition to the parser, the parser system has Lisp facilities for creating and editing CFG andGPSG grammars, dictionaries, and sentence sets.2 Loading the SystemAll of the parser and utility code is in Common Lisp, and were all Common Lisps equal, the parser wouldbe machine independent. The files are set to load into the package USER. The file earley.lisp containsall the necessary code. As it stands, the current system runs on Athena under the implementation ofCommon Lisp called clisp. As for other Common Lisps, your mileage may indeed vary. (For example,we have not tested this code on Allegro Common Lisp for PCs or for Macs, but in all likelihood withsome tweaking on your part the code would work; but we cannot supply system assistance for this.)2.1 Athena business to load the systemThe Earley system stores various types of rule sets, such as CFG grammars, GPSG grammars, orsentence sets. These can be edited and used as input for the parser. The parser requires special grammartables as input, which are compiled from featureless context-free grammars (GPSGs are automaticallytranslated into featureless CFGs).1If you want to know more, you can ask me for a copy of one of the “original” papers on this, by G. Gazdar (1981)“Unbounded dependencies and coordinate structure,” Linguistic Inquiry 12, 155-184. Reprinted in Walter J. Savitch,Emmon Bach, William Marsh and Gila Safran-Naveh, eds. The Formal Complexity of Natural Language Dordrecht:Reidel, 183-226 (1987). In past years, the challenge was to build grammar rules to parse all the 100-plus example sentencesin this paper. This sounds easy, but it is very difficult — it includes such sentences as, “Mary shot, and John killed,a rabid dog”; and “I know John has been, and Mary was previously, a rabid Red Sox fan”. The point about this issimply that the Gazdar paper claimed efficient parsability on the grounds that the formalism described in the paper wascontext-free; but it is apparent when one tries to implement the rules that a real grammar for the example sentencesprobably


View Full Document

MIT 6 863J - Phrase Structure Parsing

Documents in this Course
N-grams

N-grams

42 pages

Semantics

Semantics

75 pages

Semantics

Semantics

82 pages

Semantics

Semantics

64 pages

Load more
Download Phrase Structure Parsing
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Phrase Structure Parsing and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Phrase Structure Parsing 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?