COGNITIVE SCIENCE 14, 179-211 (1990)

Finding Structure in Time

JEFFREY L. ELMAN
University of California, San Diego

Time underlies many interesting human behaviors. Thus, the question of how to represent time in connectionist models is very important. One approach is to represent time implicitly by its effects on processing rather than explicitly (as in a spatial representation). The current report develops a proposal along these lines first described by Jordan (1986) which involves the use of recurrent links in order to provide networks with a dynamic memory. In this approach, hidden unit patterns are fed back to themselves; the internal representations which develop thus reflect task demands in the context of prior internal states. A set of simulations is reported which range from relatively simple problems (temporal version of XOR) to discovering syntactic/semantic features for words. The networks are able to learn interesting internal representations which incorporate task demands with memory demands; indeed, in this approach the notion of memory is inextricably bound up with task processing. These representations reveal a rich structure, which allows them to be highly context-dependent, while also expressing generalizations across classes of items. These representations suggest a method for representing lexical categories and the type/token distinction.

I would like to thank Jay McClelland, Mike Jordan, Mary Hare, Dave Rumelhart, Mike Mozer, Steve Poteet, David Zipser, and Mark Dolson for many stimulating discussions. I thank McClelland, Jordan, and two anonymous reviewers for helpful critical comments on an earlier draft of this paper. This work was supported by contract N00014-85-K-0076 from the Office of Naval Research and contract DAAB-07-87-C-H027 from Army Avionics, Ft. Monmouth. Requests for reprints should be sent to the Center for Research in Language, C-008; University of California, San Diego, CA 92093-0108. The author can be reached via electronic mail as elman@crl.ucsd.edu.

Introduction

Time is clearly important in cognition. It is inextricably bound up with many behaviors (such as language) which express themselves as temporal sequences. Indeed, it is difficult to know how one might deal with such basic problems as goal-directed behavior, planning, or causation without some way of representing time.

The question of how to represent time might seem to arise as a special problem unique to parallel processing models, if only because the parallel nature of computation appears to be at odds with the serial nature of temporal events. However, even within traditional (serial) frameworks, the representation of serial order and the interaction of a serial input or output with higher levels of representation presents challenges. For example, in models of motor activity an important issue is whether the action plan is a literal specification of the output sequence, or whether the plan represents serial order in a more abstract manner (e.g., Lashley, 1951; MacNeilage, 1970; Fowler, 1977, 1980; Kelso, Saltzman, & Tuller, 1986; Saltzman & Kelso, 1987; Jordan & Rosenbaum, 1988). Linguistic theoreticians have perhaps tended to be less concerned with the representation and processing of the temporal aspects to utterances (assuming, for instance, that all the information in an utterance is somehow made available simultaneously in a syntactic tree); but the research in natural language parsing suggests that the problem is not trivially solved (e.g., Frazier & Fodor, 1978; Marcus, 1980). Thus, what is one of the most elementary facts about much of human activity, that it has temporal extent, is sometimes ignored and is often problematic.

In parallel distributed processing models, the processing of sequential inputs has been accomplished in several ways. The most common solution is to attempt to "parallelize time" by giving it a spatial representation. However, there are problems with this approach, and it is ultimately not a good solution. A better approach would be to represent time implicitly rather than explicitly. That is, we
represent time by the effect it has on processing and not as an additional dimension of the input.

This paper describes the results of pursuing this approach, with particular emphasis on problems that are relevant to natural language processing. The approach taken is rather simple, but the results are sometimes complex and unexpected. Indeed, it seems that the solution to the problem of time may interact with other problems for connectionist architectures, including the problem of symbolic representation and how connectionist representations encode structure. The current approach supports the notion outlined by Van Gelder (1989; see also Smolensky, 1987, 1988; Elman, 1989), that connectionist representations may have a functional compositionality without being syntactically compositional.

The first section briefly describes some of the problems that arise when time is represented externally as a spatial dimension. The second section describes the approach used in this work. The major portion of this report presents the results of applying this new architecture to a diverse set of problems. These problems range in complexity from a temporal version of the Exclusive-OR function to the discovery of syntactic/semantic categories in natural language data.

The Problem with Time

One obvious way of dealing with patterns that have a temporal extent is to represent time explicitly by associating the serial order of the pattern with the dimensionality of the pattern vector. The first temporal event is represented by the first element in the pattern vector, the second temporal event is represented by the second position in the pattern vector, and so on. The entire pattern vector is processed in parallel by the model. This approach has been used in a variety of models (e.g., Cottrell, Munro, & Zipser, 1987; Elman & Zipser, 1988; Hanson & Kegl, 1987).

There are several drawbacks to this approach, which basically uses a spatial metaphor for time. First, it requires that there be some interface with the world which
buffers the input so that it can be presented all at once. It is not clear that biological systems make use of such shift registers. There are also logical problems: how should a system know when a buffer's contents should be examined?

Second, the shift register imposes a rigid limit on the duration of patterns (since the input layer must provide for the longest possible pattern), and further suggests that all input vectors be the same length. These problems are particularly troublesome in domains such as language.
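
The spatial treatment of time described in this section can be illustrated with a small sketch. It is not taken from any of the models cited; the function name spatialize, the parameters buffer_len and pad, and the zero-padding convention are all assumptions made for illustration. The sketch lays a sequence of events out as one fixed-width vector, which makes both drawbacks concrete: the buffer width caps the longest pattern, and shorter patterns must be padded so that every input vector has the same length.

```python
from typing import List, Sequence


def spatialize(events: Sequence[float], buffer_len: int, pad: float = 0.0) -> List[float]:
    """Lay a temporal sequence out as a single fixed-width input vector.

    Hypothetical helper: position i of the result holds the i-th event,
    so serial order is encoded by vector position (a spatial metaphor).
    """
    if len(events) > buffer_len:
        # The input layer must be as wide as the longest possible pattern;
        # anything longer simply cannot be presented.
        raise ValueError(
            f"pattern of length {len(events)} exceeds buffer of length {buffer_len}"
        )
    # Shorter patterns are padded (an assumed convention) so that all
    # input vectors end up the same length.
    return list(events) + [pad] * (buffer_len - len(events))


if __name__ == "__main__":
    print(spatialize([1, 0, 1], buffer_len=6))           # padded to width 6
    print(spatialize([1, 0, 1, 1, 0, 1], buffer_len=6))  # fills the buffer exactly
```

Under these assumptions, a three-event pattern and a six-event pattern both occupy six input positions, and a seven-event pattern is rejected outright. The padding value is itself a design decision that the network must learn to ignore.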