BEYOND STATISTICAL LEARNING IN SYNTAX

ERI TAKAHASHI AND JEFFREY LIDZ

1. Introduction

Knowing the structural representation of sentences is a fundamental step in acquiring a language. However, the input to a child does not come with obvious labels to signal constituency–it appears as a simple linear sequence of words. Since both words and constituent structure vary from language to language, children have to learn how words go together to form constituents in the particular language they are learning. Some learning mechanism must therefore be present that guides the learner to build the correct phrase structure. What kind of input is necessary, and what kind of information do children use to arrive at the correct representation?

Children most likely employ various kinds of information to arrive at the correct phrase structure representation–perhaps a combination of cues from prosody, function words, agreement morphology, semantics and distribution. This paper focuses on distributional cues to phrase structure. Recent studies in artificial language learning have shown that distributional information can play a role in the acquisition of phonemes (Maye, Werker & Gerken 2002, Maye & Gerken 2000), word segmentation (Saffran, Aslin & Newport 1996), word categories (Mintz 2003) and syntax-like regularities (Gomez & Gerken 1999). In particular, it has been proposed that “transitional probabilities”, statistics measuring the predictiveness of a following element given a preceding element, can be used by learners to learn the phrasal groupings of words in miniature artificial languages (Thompson & Newport 2007). Thompson & Newport (2007) showed that adult subjects successfully learned the phrasal groupings of an artificial language on the basis of transitional probabilities. However, the artificial grammar in Thompson & Newport (2007) contained phrases with no internal structure and consequently leaves open the question of whether statistical cues to multiply embedded hierarchical structures can be detected by learners. Our first question is therefore: can learners use transitional probabilities to detect multiply embedded hierarchical structures?

A further question is: how is statistical information used by learners in the acquisition of phrase structure? Broadly speaking, we can think of at least two possibilities. One possibility is that each child has to discover the existence of phrase structure and its characteristics on the basis of distributional information alone (the “phrase structure invention” hypothesis). A second possibility is that each child uses the input distribution to determine how the particular language maps words onto structural descriptions of a highly restricted character (the “phrase structure identification” hypothesis). These two hypotheses make distinct predictions about the nature of the mechanism for acquiring phrase structure representations. The “phrase structure invention” hypothesis predicts that the output of learning is determined solely by experience, while the “phrase structure identification” hypothesis holds that the output of learning derives from an interaction between the input and an experience-independent representational system.
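For concreteness, the transitional probabilities at issue can be stated as conditional probabilities over adjacent elements. The following formulation is a sketch in the spirit of Saffran, Aslin & Newport (1996) and Thompson & Newport (2007), not a quotation of their definitions:

\[
\mathrm{TP}_{\mathrm{forward}}(X \rightarrow Y) = P(Y \mid X) = \frac{\mathrm{freq}(XY)}{\mathrm{freq}(X)},
\qquad
\mathrm{TP}_{\mathrm{backward}}(X \rightarrow Y) = P(X \mid Y) = \frac{\mathrm{freq}(XY)}{\mathrm{freq}(Y)}
\]

On this formulation, a forward TP of 1.00 between two word classes means that the first class is always immediately followed by the second in the input, while lower values indicate a less predictable, and hence more boundary-like, transition.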
The current paper presents two new experiments investigating (a) what kinds of distributional information can be used to identify hierarchical phrase structure and (b) whether learners use the distributional information to discover the existence of phrase structure or to map the experience onto a template.

2. Experiment 1

Experiment 1 explores whether statistical cues to multiply embedded hierarchical structures can be detected by learners. Two miniature artificial languages–Grammar 1¹ and Grammar 2–were created. The two grammars share identical word classes and lexical items, which were adapted from Thompson & Newport (2007). Each word class contained three nonsense lexical items.

Word Class    A     B     C     D     E     F
              KOF   HOX   JES   SOT   FAL   KER
              DAZ   NEB   REL   ZOR   TAF   NAV
              MER   LEV   TID   LUM   RUD   SIB

Table 1: Nonsense words assigned to each word class

The basic phrase structure trees for Grammar 1 and Grammar 2 are given below.

¹ We adapted the artificial languages from Morgan & Newport (1981), Morgan et al. (1987, 1989) and Saffran (2001).

Fig 1: PS trees for Grammars 1 and 2

The canonical sentence in both grammars is identical–ABCDE. Grammars 1 and 2 differ only in constituent structure. For example, while AB is a constituent in Grammar 1, it is not in Grammar 2. Additionally, the grammars display nested hierarchical structure. In Grammar 1, a phrasal unit EP consists of an E word and another phrase CP, which in turn consists of C and D. These grammars incorporate four types of manipulation, which (a) made certain constituents optional, (b) allowed for the repetition of certain constituents, (c) substituted proforms for certain constituents and (d) moved certain constituents. For example, in Grammar 1 the constituent AP can be replaced by the proform ib. As for the movement operation, the EP can be moved to the front in Grammar 1 and the DP can be moved in Grammar 2.

Fig 2: PS trees involving movement in Grammars 1 and 2

Eighty sentences from each language were selected as a presentation set. Incorporating all the manipulations discussed above resulted in higher TPs between words within phrases than between words across phrase boundaries. Within a phrase, the TP is always 1.00, whereas TPs across phrase boundaries are substantially lower. The TP patterns of the presentation sets are given below.

              A-B     B-C     C-D     D-E
Forward TP    1.00    0.24    1.00    0.25
Backward TP   1.00    0.19    1.00    0.34

Table 2: Transitional probabilities for 80 input sentences in Grammar 1

              A-B     B-C     C-D     D-E
Forward TP    0.33    1.00    0.15    1.00
Backward TP   0.18    1.00    0.16    1.00

Table 3: Transitional probabilities for 80 input sentences in Grammar 2

The sentences lacked any prosodic cues to phrase boundaries. The 80 sentences
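To make the computation behind Tables 2 and 3 concrete, the following is a minimal sketch (a hypothetical illustration, not the authors' materials or code) that counts class-level bigrams over a toy presentation set and derives forward and backward TPs. The word-to-class mapping follows Table 1; the proform sentence assumes, for illustration only, that ib replaces an AP spanning the A and B words.

from collections import Counter

# Word-to-class mapping taken from Table 1; "ib" is the AP proform mentioned
# for Grammar 1 (treated here as its own class label).
WORD_CLASS = {
    "KOF": "A", "DAZ": "A", "MER": "A",
    "HOX": "B", "NEB": "B", "LEV": "B",
    "JES": "C", "REL": "C", "TID": "C",
    "SOT": "D", "ZOR": "D", "LUM": "D",
    "FAL": "E", "TAF": "E", "RUD": "E",
    "KER": "F", "NAV": "F", "SIB": "F",
    "ib": "ib",
}

def transitional_probabilities(sentences):
    """Return {(X, Y): (forward TP, backward TP)} for adjacent class pairs.

    forward TP(X-Y)  = freq(XY) / freq(X as the first member of a bigram)
    backward TP(X-Y) = freq(XY) / freq(Y as the second member of a bigram)
    """
    bigrams, left_counts, right_counts = Counter(), Counter(), Counter()
    for sentence in sentences:
        classes = [WORD_CLASS[word] for word in sentence.split()]
        for x, y in zip(classes, classes[1:]):
            bigrams[(x, y)] += 1
            left_counts[x] += 1
            right_counts[y] += 1
    return {
        (x, y): (count / left_counts[x], count / right_counts[y])
        for (x, y), count in bigrams.items()
    }

if __name__ == "__main__":
    # Toy presentation set: four canonical A B C D E sentences plus one
    # illustrative proform sentence (ib C D E). The real presentation sets
    # contain 80 sentences each and use all four manipulations.
    toy_input = [
        "KOF HOX JES SOT FAL",
        "DAZ NEB REL ZOR TAF",
        "MER LEV TID LUM RUD",
        "KOF LEV REL LUM TAF",
        "ib JES ZOR FAL",
    ]
    for (x, y), (fwd, bwd) in sorted(transitional_probabilities(toy_input).items()):
        print(f"{x}-{y}: forward TP = {fwd:.2f}, backward TP = {bwd:.2f}")

Run on the actual 80-sentence presentation sets, the same computation would yield boundary profiles of the kind shown in Tables 2 and 3: TPs of 1.00 within phrases and substantially lower TPs across phrase boundaries.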

