View Full Document

A Bayesian framework for word segmentation



View the full content.
View Full Document
View Full Document

1 views

Unformatted text preview:

Cognition 112 2009 21 54 Contents lists available at ScienceDirect Cognition journal homepage www elsevier com locate COGNIT A Bayesian framework for word segmentation Exploring the effects of context Sharon Goldwater a Thomas L Grif ths b Mark Johnson c a b c School of Informatics University of Edinburgh Informatics Forum 10 Crichton Street Edinburgh EH8 9AB UK Department of Psychology University of California Berkeley CA United States Department of Cognitive and Linguistic Sciences Brown University United States a r t i c l e i n f o Article history Received 30 May 2008 Revised 11 March 2009 Accepted 13 March 2009 Keywords Computational modeling Bayesian Language acquisition Word segmentation a b s t r a c t Since the experiments of Saffran et al Saffran J Aslin R Newport E 1996 Statistical learning in 8 month old infants Science 274 1926 1928 there has been a great deal of interest in the question of how statistical regularities in the speech stream might be used by infants to begin to identify individual words In this work we use computational modeling to explore the effects of different assumptions the learner might make regarding the nature of words in particular how these assumptions affect the kinds of words that are segmented from a corpus of transcribed child directed speech We develop several models within a Bayesian ideal observer framework and use them to examine the consequences of assuming either that words are independent units or units that help to predict other units We show through empirical and theoretical results that the assumption of independence causes the learner to undersegment the corpus with many two and three word sequences e g what s that do you in the house misidenti ed as individual words In contrast when the learner assumes that words are predictive the resulting segmentation is far more accurate These results indicate that taking context into account is important for a statistical word segmentation strategy to be successful and



Access the best Study Guides, Lecture Notes and Practice Exams

Loading Unlocking...
Login

Join to view A Bayesian framework for word segmentation and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view A Bayesian framework for word segmentation and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?