MIT 6 441 - Three Models for the Description of Language

Unformatted text preview:

Three Models for the Description of LanguageMatt WillseyMay 10, 20061 IntroductionNoam Chomsky hopes to accomplish two goals withthe model developed herein. First, the model must besimple, yet sophisticated enough to generate sentences.Second, he uses the developed model of the language tounderstand basic, underlying principles to the structure of alanguage. The model is developed by observing patterns ina finite set of sentences and extended such that it is capableof describing every sentence in the entire language. Theauthor then evaluates the performance of the model bytesting it on clear grammatical and ungrammaticalsentences. Ideally, the model should be capable ofdescribing grammatical sentences and not capable ofdescribing ungrammatical sentences. For example considerthe two sentences as given in [1].(1) John ate a sandwich.(2) Sandwich a ate John.A correct model describing English would is capable ofdescribing (1) but not (2).In his paper, Noam Chomsky analyzes three models todescribe language structure, which are described in moredetail in this review. The first model analyzed is a methodfor generating English sentences with finite-state Markovprocesses. Conceptually, this method is very simple; butunfortunately, Chomsky concludes using finite-stateMarkov processes to generate sentences does properlymodel the English language. Next Chomsky develops theconcept of “phrase structure” in the attempt to develop amore powerful model for the English language. By usingthis model, very simple English sentences can be generated.Unfortunately, this limited model can only generate a smallsubset of sentences in the complete set of Englishsentences. However, this model can be generalized to firstuse phrase structure to generate a basic sentence kernel andthen any number of transformations can be applied. Usingthis extended model, practically all English sentences canbe generated. Moreover, this model also provides insightinto the basic structure of the English language. 2 Finite State Markov ProcessesBefore describing finite-state Markov processes, it isnecessary to precisely define terms used in Chomsky’spaper. First, language is defined as “a set (finite or infinite)of sentences, each of finite length, all constructed from afinite alphabet of symbols [1].” Secondly, for an alphabet,A, a string is any concatenation of the symbols of A. Lastly,grammar of the language, L, is a device that produces allthe sentences of L and only the sentences of L.With these definitions in place, it is possible to describethe first, and conceptually the simplest, model of a languageby defining a finite-state grammar, G, using finite-stateMarkov processes. To do so consider the finite states, S0,…,Sq, a set of transition symbols, A={ai,j,k| 0 ≤ i, j ≤ q; l ≤ k≤ Ni,j for each i, j}, and finally a set of connected pairs ofstates of G, C = {(Si, Sj)}, where Ni,J is the total number oftransition symbols when transitioning from Si to SJ. As theprocess moves from one state to the next, it produces asymbol ai,j,k which is contained in A. Thus to produce asentence in the language (LG), we run through the statesmSS,,1 where α1 = αm = 0. This sequence of stateswill yield the string,11232121,,,,,,1^^^mmmkkkaaas for1,iiNki, where the string is the sentence in LG. However, neither this model nor any other finite-statemodel can generate every possible sentence in the Englishlanguage. Unfortunately, the strings in English haveinterdependencies among words. For example, consider thesentences given in (3) where S1 and S2 are English strings. (3)(i) If S1, then S2. (ii) Either S3, or S4. (iii) If either if S5, then S6, or S7, then S8.As shown in (3)(i) and (ii), the words “if”-“then” and“either”-“or” are dependent, since every “if” requires a“then” (although it could be an implied “then”) and every“either” requires an “or.” Moreover, these sentences can benested an infinite number of times and still form agrammatically-correct English sentence. A case of nestingthree dependencies is shown in (3)(iii), and it can be easilyextended to nesting an infinite number of dependencies.When nesting dependencies, memory is required toremember the presence of an “if” or “either.” Thus, nofinite state machine can capture this property of nesting aninfinite number of dependencies; therefore, no finite-statemachine can completely describe the English language.Chomsky comments that it is not helpful to limit the nestingto finite numbers of dependencies; since once it is assumedthat the sentence structure is finite, it is obvious that a finitestate machine is capable of describing the structure.However, it still provides little insight into the underlyingstructure of English, since it cannot provide insight to a setof grammatically correct sentences.3 Phrase-Structure GrammarAfter demonstrating that finite-state Markov processesare incapable of modeling the English language, Chomskyexplains a syntactic description commonly referred to as“immediateconstituentanalysis.” Thedescription breaksapart a sentenceinto its constituentparts. For example,the string“sentence” givenin (4) can be firstseparated into anoun phrase (NP)and a verb phrase(VP). The NP can be broken down into “the man,” and theVP can be separated in V and NP, which can be brokendown into “took” and “the book,” respectively. Thissentence description can be formalized to describe mostlyall English sentences. Note that “#” just refers to theterminal symbol of the vocabulary.A phrase-structured grammar is then described as afinite vocabulary or alphabet (VP) and a finite set of initialstrings (∑), and a finite set of rules (F) of the form X → Y,where both X and Y are contained in VP. Note that somerulesF could be mandatory while others may be optional.We refer to this grammar as a [∑, F] grammar.Now consider a [∑, F] grammar, which accounts forthe syntactic


View Full Document

MIT 6 441 - Three Models for the Description of Language

Download Three Models for the Description of Language
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Three Models for the Description of Language and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Three Models for the Description of Language 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?