A Bayesian framework for word segmentation

A Bayesian framework for word segmentation: Exploring the effects of context

Contents
- Introduction
- Words and transitional probabilities
- Probabilistic models for word segmentation
  - Maximum-likelihood estimation
  - Bayesian models
- Unigram model
  - Generative model
  - Inference
  - Simulations (Data; Evaluation procedure; Results and discussion)
  - Other unigram models (MBDP-1 and search; The impact of the lexical model on word segmentation; MBDP-1, the DP model, and other unigram models)
- Bigram model
  - The hierarchical Dirichlet process model
  - Simulations (Method; Results and discussion)
- General discussion
  - Ideal observer models of statistical learning
  - Online inference in word segmentation
  - Representational assumptions
  - Implications for behavioral research
- Conclusion
- Model definitions
  - Unigram model (The Chinese restaurant process; The Dirichlet process; Modeling utterance boundaries; Inference)
  - Bigram model (The hierarchical Dirichlet process; Inference)
- Proofs
  - Connecting MBDP-1 and the DP model
  - Relationship to other unigram models
- References

Sharon Goldwater (a,*), Thomas L. Griffiths (b), Mark Johnson (c)

(a) School of Informatics, University of Edinburgh, Informatics Forum, 10 Crichton Street, Edinburgh, EH8 9AB, UK
(b) Department of Psychology, University of California, Berkeley, CA, United States
(c) Department of Cognitive and Linguistic Sciences, Brown University, United States

Article history: Received 30 May 2008; Revised 11 March 2009; Accepted 13 March 2009

Keywords: Computational modeling; Bayesian; Language acquisition; Word segmentation

Abstract

Since the experiments of Saffran et al. [Saffran, J., Aslin, R., & Newport, E. (1996). Statistical learning in 8-month-old infants. Science, 274, 1926–1928], there has been a great deal of interest in the question of how statistical regularities in the speech stream might be used by infants to begin to identify individual words.
In this work, we use computational modeling to explore the effects of different assumptions the learner might make regarding the nature of words – in particular, how these assumptions affect the kinds of words that are segmented from a corpus of transcribed child-directed speech. We develop several models within a Bayesian ideal observer framework, and use them to examine the consequences of assuming either that words are independent units, or units that help to predict other units. We show through empirical and theoretical results that the assumption of independence causes the learner to undersegment the corpus, with many two- and three-word sequences (e.g. what's that, do you, in the house) misidentified as individual words. In contrast, when the learner assumes that words are predictive, the resulting segmentation is far more accurate. These results indicate that taking context into account is important for a statistical word segmentation strategy to be successful, and raise the possibility that even young infants may be able to exploit more subtle statistical patterns than have usually been considered.

© 2009 Elsevier B.V. All rights reserved.

1. Introduction

One of the first problems infants must solve as they are acquiring language is word segmentation: identifying word boundaries in continuous speech. About 9% of utterances directed at English-learning infants consist of isolated words (Brent & Siskind, 2001), but there is no obvious way for children to know from the outset which utterances these are. Since multi-word utterances generally have no apparent pauses between words, children must be using other cues to identify word boundaries. In fact, there is evidence that infants use a wide range of weak cues for word segmentation.
These cues include phonotactics (Mattys, Jusczyk, Luce, & Morgan, 1999), allophonic variation (Jusczyk, Hohne, & Bauman, 1999), metrical (stress) patterns (Jusczyk, Houston, & Newsome, 1999; Morgan, Bonamo, & Travis, 1995), effects of coarticulation (Johnson & Jusczyk, 2001), and statistical regularities in the sequences of syllables found in speech (Saffran, Aslin, & Newport, 1996). This last source of information can be used in a language-independent way, and seems to be used by infants earlier than most other cues, by the age of 7 months (Thiessen & Saffran, 2003). These facts have caused some researchers to propose that strategies based on statistical sequencing information are a crucial first step in bootstrapping word segmentation (Thiessen & Saffran, 2003), and have provoked a great deal of interest in these strategies (Aslin, Saffran, & Newport, 1998; Saffran, Newport, & Aslin, 1996; Saffran et al., 1996; Toro, Sinnett, & Soto-Faraco, 2005). In this paper, we use computational modeling techniques to examine some of the assumptions underlying much of the research on statistical word segmentation.

[Cognition 112 (2009) 21–54. doi:10.1016/j.cognition.2009.03.008. Corresponding author: S. Goldwater.]

Most previous work on statistical word segmentation is based on the observation that transitions from one syllable or phoneme to the next tend to be less predictable at word boundaries than within words (Harris, 1955; Saffran et al., 1996). Behavioral research has shown that infants are indeed sensitive to this kind of predictability, as measured by statistics such as transitional probabilities (Aslin et al., 1998; Saffran et al., 1996).
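A transitional probability between adjacent syllables is simply P(next | current), estimated from pair counts. The sketch below is a minimal illustration of that statistic, not the Bayesian model developed in this paper; the syllable stream is invented here for demonstration. It shows why the transitional probability dips at a word boundary: within a word the next syllable is nearly determined, while across a boundary it varies.

```python
from collections import Counter

def transitional_probs(syllables):
    """Estimate P(next | current) for each adjacent syllable pair."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

# Toy stream "pretty baby pretty doggy": within the word "pretty",
# "pre" is always followed by "ty", but after "ty" the next syllable
# varies across the word boundary, so the TP dips there.
stream = ["pre", "ty", "ba", "by", "pre", "ty", "do", "ggy"]
for (a, b), p in sorted(transitional_probs(stream).items()):
    print(f"P({b} | {a}) = {p:.2f}")
```

A segmentation strategy built directly on this statistic would posit a boundary wherever the transitional probability falls below that of its neighbors; the models in this paper instead ask what assumptions about words make such dips informative in the first place.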
This research, however, is agnostic as to the mechanisms by which infants use statistical patterns to perform word segmentation. A number of researchers in both cognitive science and computer science have developed algorithms based on transitional probabilities, mutual information, and similar statistics of predictability in order to clarify how these statistics can be used procedurally to identify words or word boundaries (Ando & Lee, 2000; Cohen & Adams, 2001; Feng, Chen, Deng, & Zheng, 2004; Swingley, 2005). Here, we take a different approach: we seek to identify the assumptions the learner must make about the nature of language in order to correctly segment natural language input.

Observations about predictability at word boundaries are consistent with two different kinds of assumptions about what constitutes a word: either a word is a unit that is statistically independent of other units, or it is a unit that helps to predict other units (but to a lesser

