Convergence Bounds for Language Evolution by Iterated Learning

Anna N. Rafferty ([email protected])
Computer Science Division, University of California, Berkeley, CA 94720 USA

Thomas L. Griffiths ([email protected])
Department of Psychology, University of California, Berkeley, CA 94720 USA

Dan Klein ([email protected])
Computer Science Division, University of California, Berkeley, CA 94720 USA

Abstract

Similarities between human languages are often taken as evidence of constraints on language learning. However, such similarities could also be the result of descent from a common ancestor. In the framework of iterated learning, language evolution converges to an equilibrium that is independent of its starting point, with the effect of shared ancestry decaying over time. Therefore, the central question is the rate of this convergence, which we formally analyze here. We show that convergence occurs in a number of generations that is O(n log n) for Bayesian learning of the ranking of n constraints or the values of n binary parameters. We also present simulations confirming this result and indicating how convergence is affected by the entropy of the prior distribution over languages.

Introduction

Human languages share a surprising number of properties, ranging from high-level characteristics like compositional mapping between sound and meaning to relatively low-level syntactic regularities (Comrie, 1981; Greenberg, 1963; Hawkins, 1988). One explanation for these universal properties is that they reflect constraints on human language learning, with the mechanisms by which we acquire language being restricted to languages with these properties (e.g., Chomsky, 1965). However, if all modern languages are descended from a common ancestor, these similarities could just reflect the properties of that ancestor. Evaluating these different possibilities requires establishing how constraints on learning influence the properties of languages, and how long it takes for this process to remove the influence of a common ancestor. In this paper, we explore these questions using a simple model of language evolution.

We model language evolution as a process of iterated learning (Kirby, 2001). This model assumes that each generation of people learns language from utterances generated by the previous generation. While this model makes certain simplifying assumptions, such as a lack of interaction between learners in the same generation, it has the advantage that it can be analyzed mathematically. Previous research has shown that after some number of generations, the distribution over languages produced by learners converges to an equilibrium that reflects the constraints that guide learning (Griffiths & Kalish, 2007). After convergence, the behavior of learners is independent of the language spoken by the first generation.

These results provide a way to relate constraints on learning to linguistic universals. However, convergence to the equilibrium has to occur in order for these constraints to be the sole factor influencing the languages learners acquire. Our key contribution is providing bounds on the number of generations required for convergence, known as the convergence time, which we obtain by analyzing Markov chains associated with iterated learning.
Bounding the convergence time is a step towards understanding the source of linguistic universals: If convergence occurs in relatively few generations, it suggests constraints on learning are more likely than common descent to be responsible for linguistic universals.

To bound the number of generations required for iterated learning to converge, we need to make some assumptions about the algorithms and representations used by learners. Following previous analyses (Griffiths & Kalish, 2007), we assume that learners update their beliefs about the plausibility of a set of linguistic hypotheses using Bayesian inference. We outline how this approach can be applied using two kinds of hypothesis spaces that appear in prominent formal linguistic theories: constraint rankings, as used in Optimality Theory (Prince & Smolensky, 2004), and vectors of binary parameter values, consistent with a simple Principles and Parameters model (Chomsky & Lasnik, 1993). In each case, we show that iterated learning with a uniform prior reaches equilibrium after O(n log n) generations, where n is the number of constraints or parameters.

Analyzing Iterated Learning

Iterated learning has been used to model a variety of aspects of language evolution, providing a simple way to explore the effects of cultural transmission on the structure of languages (Kirby, 2001; Smith, Kirby, & Brighton, 2003). The basic assumption behind the model – that each learner learns from somebody who was themselves a learner – captures a phenomenon we see in nature: Parents pass on language to their children, and these children in turn pass on language to their own children. The sounds that the children hear are the input, and the child produces language (creates output) based on this input, as well as prior constraints on the form of the language.

Formally, we conceptualize iterated learning as follows (see Figure 1). A first learner receives data, forms a hypothesis about the process that generated these data, and then produces output based on this hypothesis. A second learner receives the output of the first learner as data and produces a new output that is in turn provided as data to a third learner. This process may continue indefinitely, with the t-th learner receiving the output of the (t − 1)-th learner. The iterated learning models we analyze make the simplifying assumptions that language evolution occurs in only one direction (previous generations do not change their hypotheses based on the data produced by future generations) and that each learner receives input from only one previous learner. We first characterize how learning occurs, independent of specific representation, and then give a more detailed description of the form of these hypotheses and data.

Our models assume that learners represent (or act as if they represent) the degree to which constraints predispose them to certain hypotheses about language through a probability distribution over hypotheses, and that they combine these predispositions with information from the data using Bayesian inference. Starting with a prior distribution over hypotheses p(h) for all hypotheses h in a hypothesis space H, the posterior distribution over hypotheses given data d is obtained via Bayes' rule, p(h | d) = p(d | h) p(h) / Σ_{h' ∈ H} p(d | h') p(h').
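To make this setup concrete, the sketch below simulates iterated learning as a chain over hypotheses. It is an illustrative toy, not the paper's simulation code: it assumes hypotheses are vectors of n binary parameters with a uniform prior, each learner sees b utterances in which each parameter value is corrupted with probability eps, and each learner samples a new hypothesis from its posterior. It then measures how quickly the distribution over hypotheses across many independent chains approaches the prior, which is the equilibrium for posterior-sampling learners (Griffiths & Kalish, 2007). The particular values of n, b, and eps, and all function names, are assumptions made for the example.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (not from the paper): hypotheses are vectors of n binary
# parameters, the prior is uniform, and each parameter value is transmitted
# correctly with probability 1 - eps in each utterance.
n = 4        # number of binary parameters
b = 5        # utterances passed from one learner to the next
eps = 0.1    # per-parameter transmission noise

hypotheses = np.array(list(itertools.product([0, 1], repeat=n)))
prior = np.full(len(hypotheses), 1.0 / len(hypotheses))

def produce_data(h):
    """Generate b noisy utterances from hypothesis h (each bit flips w.p. eps)."""
    flips = rng.random((b, n)) < eps
    return np.where(flips, 1 - h, h)

def posterior(data):
    """Bayes' rule: p(h | d) proportional to p(d | h) p(h)."""
    matches = (data[None, :, :] == hypotheses[:, None, :])
    log_like = np.where(matches, np.log(1 - eps), np.log(eps)).sum(axis=(1, 2))
    unnorm = np.exp(log_like) * prior
    return unnorm / unnorm.sum()

def iterate(h0, generations):
    """One chain of iterated learning: each learner samples from its posterior."""
    h = h0
    history = [h]
    for _ in range(generations):
        post = posterior(produce_data(h))
        h = hypotheses[rng.choice(len(hypotheses), p=post)]
        history.append(h)
    return history

# Run many chains from the same initial language and watch the distribution
# over hypotheses at generation t approach the prior (total variation distance).
chains = [iterate(hypotheses[0], 30) for _ in range(2000)]
for t in [0, 1, 5, 10, 30]:
    counts = np.zeros(len(hypotheses))
    for chain in chains:
        idx = np.flatnonzero((hypotheses == chain[t]).all(axis=1))[0]
        counts[idx] += 1
    tv = 0.5 * np.abs(counts / len(chains) - prior).sum()
    print(f"generation {t:2d}: total variation distance to prior = {tv:.3f}")
```

For this small n the printed distance should drop to the sampling noise floor within a handful of generations; the paper's contribution is to bound how that convergence time grows with n, namely O(n log n) generations for a uniform prior over constraint rankings or binary parameter vectors.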

