14
doi: 10.1098/rstb.2008.0145 , 3591-3603 363 2008 Phil. Trans. R. Soc. B Kenny Smith and Simon Kirby language faculty and its evolution Cultural evolution: implications for understanding the human References http://rstb.royalsocietypublishing.org/content/363/1509/3591.full.html#ref-list-1 This article cites 44 articles, 7 of which can be accessed free Rapid response http://rstb.royalsocietypublishing.org/letters/submit/royptb;363/1509/3591 Respond to this article Subject collections (2309 articles) evolution Articles on similar topics can be found in the following collections Email alerting service here right-hand corner of the article or click Receive free email alerts when new articles cite this article - sign up in the box at the top http://rstb.royalsocietypublishing.org/subscriptions go to: Phil. Trans. R. Soc. B To subscribe to This journal is © 2008 The Royal Society on January 5, 2011 rstb.royalsocietypublishing.org Downloaded from

Cultural evolution: implications for understanding the ...kenny/publications/smith_08_cultural.pdf · language faculty and its evolution Cultural evolution: implications for understanding

  • Upload
    doduong

  • View
    221

  • Download
    0

Embed Size (px)

Citation preview

doi: 10.1098/rstb.2008.0145, 3591-3603363 2008 Phil. Trans. R. Soc. B

Kenny Smith and Simon Kirby language faculty and its evolutionCultural evolution: implications for understanding the human

References http://rstb.royalsocietypublishing.org/content/363/1509/3591.full.html#ref-list-1 This article cites 44 articles, 7 of which can be accessed free

Rapid response http://rstb.royalsocietypublishing.org/letters/submit/royptb;363/1509/3591 Respond to this article

Subject collections (2309 articles)evolution

Articles on similar topics can be found in the following collections

Email alerting service hereright-hand corner of the article or click Receive free email alerts when new articles cite this article - sign up in the box at the top

http://rstb.royalsocietypublishing.org/subscriptions go to: Phil. Trans. R. Soc. BTo subscribe to

This journal is © 2008 The Royal Society

on January 5, 2011rstb.royalsocietypublishing.orgDownloaded from

Cultural evolution: implications for understandingthe human language faculty and its evolution

Kenny Smith1,* and Simon Kirby2

1Cognition and Communication Research Centre, Division of Psychology, Northumbria University,Northumberland Building, Northumberland Road, Newcastle NE1 8ST, UK

2Language Evolution and Computation Research Unit, School of Philosophy, Psychology and LanguageSciences, University of Edinburgh, Dugald Stewart Building, 3 Charles Street, Edinburgh, EH8 9AD, UK

Human language is unique among the communication systems of the natural world: it is sociallylearned and, as a consequence of its recursively compositional structure, offers open-endedcommunicative potential. The structure of this communication system can be explained as aconsequence of the evolution of the human biological capacity for language or the cultural evolutionof language itself. We argue, supported by a formal model, that an explanatory account that involvessome role for cultural evolution has profound implications for our understanding of the biologicalevolution of the language faculty: under a number of reasonable scenarios, cultural evolution canshield the language faculty from selection, such that strongly constraining language-specific learningbiases are unlikely to evolve. We therefore argue that language is best seen as a consequence ofcultural evolution in populations with a weak and/or domain-general language faculty.

Keywords: language; communication; language faculty; cultural evolution; biological evolution

1. INTRODUCTIONWhen compared with other animals, humans strike usas special. First, we are highly cultural—while cultureappears not to be unique to humans (Whiten 2005), itsubiquity in human society and human cognition ishighly distinctive. Second, humans have a uniquecommunication system, language. Language differsfrom the communication systems of non-humananimals along a number of fairly well-defined dimen-sions, to be elucidated below.

The first main contention of this article is that theco-occurrence of these two unusual properties is notcoincidental: they are causally related. The view thatlanguage facilitates human culture is not a new one: it isfor this reason thatMaynard Smith & Szathmary (1995)described language as the most recent major evolution-ary transition in the history of life on Earth. However, wewill argue that the relationship works in the otherdirection as well—at least some of the distinctive featuresof human language are adaptations to its culturaltransmission, and cultural evolution potentially plays amajor role in explaining why the human communicationsystem has the particular features that it does. The mostextreme form of this argument, to be expounded in §5, isthat a communication system looking a lot like humanlanguage is a natural and inevitable consequence ofcultural evolution in populations of a particular sort ofsocial learner.

Our second main contention is that a seriousconsideration of cultural evolution radically changesthe kinds of evolutionary stories we can tell about the

human capacity for language. Specifically, as we willargue in §4, cultural evolution potentially shields thehuman language faculty from selection, ruling out theevolution of a strongly constraining and domain-specific language faculty in our species.

2. LANGUAGE DESIGNWhat is special about language? In an early attempt toanswer this question, Hockett (1960) identified 13design features of language. Of particular relevance arethe following three features, whose conjunction (to afirst approximation) distinguish language from thecommunication systems of all other animals.

—Semanticity: ‘there are relatively fixed associationsbetween elements of messages (e.g. words) andrecurrent features or situations of the world aroundus’ (Hockett 1960, p. 6).—Productivity: ‘[language provides] the capacity to saythings that have never been said or heard before and yetto be understood by other speakers of the language’(Hockett 1960, p. 6).—Traditional (i.e. cultural) transmission: ‘Human genescarry the capacity to acquire a language, and probablyalso a strong drive towards such acquisition, but thedetailed conventions of any one language are trans-mitted extragenetically by learning and teaching’(Hockett 1960, p. 6).

In isolation, each of these features is not particularlyrare. Limited non-productive semantic communicationsystems are common in the natural world, notably in thealarm-calling behaviours of various species (e.g. diversespecies of bird and monkey; Marler 1955; Cheney &Seyfarth 1990; Evans et al. 1993; Zuberbuhler 2001).

Phil. Trans. R. Soc. B (2008) 363, 3591–3603

doi:10.1098/rstb.2008.0145

Published online 19 September 2008

One contribution of 11 to a Theme Issue ‘Cultural transmission andthe evolution of human behaviour’.

*Author for correspondence ([email protected]).

3591 This journal is q 2008 The Royal Society

on January 5, 2011rstb.royalsocietypublishing.orgDownloaded from

However, such semantic communication systems arenot traditionally transmitted: the consensus view is thatthere is no vocal learning of the form of alarm calls insuch systems, although there may well be a role forlearning in establishing the precise situations underwhich calls must be produced, or how calls must beresponded to (usage and comprehension learning,respectively: Slater 2005).

Traditionally transmitted communication systemsalso exist in non-humans—in mammals (seals, bats,whales and dolphins) and in birds (most notably theoscine passerines, the songbirds). Aspects of theproduction of communicative signals in all of thesegroups show sensitivity to input and (particularly in thecase of songbirds and whales) patterns of local dialectscharacteristic of cultural transmission. Furthermore,some bird song (e.g. that of the starling, willow warblerand Bengalese finch; Eens 1997; Gil & Slater 2000;Okanoya 2004) is productive, to the extent that thesesongs are constructed according to rules that generateseveral possible songs with the same underlyingstructure. However, none of these systems rise abovemore than a superficial degree of semanticity: this isclearest in the songbirds, where song seems to serve adual function as a means of attracting mates andrepelling rivals (Catchpole & Slater 1995). Indeed, thesame song being sung by a particular individual can bedifferentially interpreted depending on the identity ofthe listener, with female listeners interpreting it assexual advertisement and males interpreting it asterritory defence.

Finally, traditionally transmitted and semanticcommunication systems seem to exist, to a limitedextent, in the gestural communication systems of ournearest extant relatives, the apes. Pairs of apes develop,through repeated interactions, meaningful gesturesthat can be subsequently used to communicate aboutvarious situations (feeding, play, sex, etc.; Call &Tomasello 2004). However, such systems are notproductive: each sign is underpinned by a ratherlaborious history of interaction (through a processknown as ontogenetic ritualization; Tomasello 1996)and consequently such systems cannot be expanded toinclude meaningful novel signals.

While there are, in principle, several ways ofdesigning a productive semantic communicationsystem, in human language these features are under-pinned by four subsidiary design features (two fromHockett 1960, the others implicit in his account).

—Arbitrariness: ‘the ties between the meaningfulmessage elements and their meanings can be arbitrary’(Hockett 1960, p. 6). Arbitrary signals of this sort can becontrasted with, for example, signals that have meaningby resemblance (icons) or signals that are causallyrelated to their meanings (indexes, such as the gesturalsignals established by ontogenetic ritualization).—Duality of patterning: ‘The meaningful elements inany language (e.g. words).constitute an enormousstock. Yet they are represented by small arrangementsof a relatively very small stock of distinguishable soundswhich are in themselves wholly meaningless’ (Hockett1960, p. 6). Languages have a few tens of phonemesthat are combined to form tens of thousands of words.

—Recursion: a signal of a given category can containcomponent parts that are of the same category, and thisembedding can be repeated without limit. For example,‘I think he saw her’ is a sentence of English, whichcontains an embedded sentence, ‘he saw her’. Thisembedding can be continued indefinitely (‘I think shesaid he saw her’, ‘You know I think she said he saw her’,and so on), yielding an infinite number of sentences.—Compositionality: the meaning of a complex signal is afunction of the meaning of its parts and the way in whichthey are combined (Krifka 2001). For example, thesentence ‘She slapped him’ consists of four componentparts—‘she’, ‘him’, ‘slap’ and ‘-ed’ (the past tensemarker). The meaning of the sentence is determined bythe meaning of these parts and the order in which theyare combined—‘She slapped him’ differs predictably inmeaning from ‘They slapped him’ and ‘She kicked him’,but also from ‘He slapped her’, where the order of themale and female pronouns has been reversed.

The combination of these four subsidiary featuresresults in a system that is productive and semantic.Duality of patterning allows for the generation of anextremely large set of basic communicative units froma small inventory of discriminable sounds.1 Arbitrari-ness allows those basic units to be mapped onto theworld in a flexible fashion, without additional con-straints of iconicity or indexicality. Recursion allowsthat large inventory of basic units to be combined toform a truly open-ended system. Finally, compositionalstructure makes the interpretation of novel utterancespossible—in a recursive compositional system, if youknow the meaning of the basic elements and the effectsassociated with combining them, you can deduce themeaning of any utterance in the system, includinginfinitely many entirely novel utterances.

Again, we can ask to what extent these subsidiarydesign features are realized in the communicationsystems of non-human animals. Perhaps, as we mightexpect, the productive systems highlighted above (e.g.the song of certain species of bird) appear to adoptrudiments of duality of patterning, in that they consist ofrecombinable subunits (notes or syllables). Semanticsystems show limited amounts of arbitrariness (in alarm-calling systems) and compositionality (in the boomalarm combination call in Campbell’s monkeys, wherethe preceding boom serves to change the meaning of thesubsequent alarm call, or reduces its immediacy;Zuberbuhler 2002). Finally, there is little evidencefor recursion in the communicative behaviour of non-humans—indeed, Hauser et al. (2002) argue that it isentirely absent from the cognitive repertoires of all non-human species, although this point is still a matter fordebate (Kinsella in press).

The foregoing discussion suggests that language isunusual owing to the particular bundle of features itpossesses, and the pervasiveness of those features inlanguage, rather than possession of any individualunique element. Why is language designed like this?This at first seems like a fairly straightforward questionto answer: language is designed like this because thesedesign features make for a useful communicationsystem, specifically a system with open-ended expressiv-ity. A population of individuals sharing such a system

3592 K. Smith & S. Kirby Human language faculty and its evolution

Phil. Trans. R. Soc. B (2008)

on January 5, 2011rstb.royalsocietypublishing.orgDownloaded from

can in principle communicate with each other aboutanything they chose, including survival relevant issuessuch as where to find food and shelter, how to deal withpredators and prey, how social relationships are to bemanaged, and so on. Furthermore, each individual onlyrequires a finite (and fairly small) set of cognitiveresources to achieve this expressive range—a few tensof speech sounds, a few tens of thousands of meaningfulwords created from those speech sounds and a fewhundred grammatical rules constraining the com-bination of those words into meaningful sequences.

However, the fact that human language makes gooddesign sense in various ways does not explain howlanguage came to have these properties. To trulyanswer the question ‘why is language designed likethis?’ we need to establish the mechanisms that explainthis fit between function and form: how did themanifest advantages of such a linguistic system becomerealized in language as a system of human behaviour?

We will review two potential mechanisms below:one that explains the fit between function and form asarising from the biological evolution of the humanlanguage faculty (in §3a), and the other that views it as aconsequence of the cultural evolution of language itself(in §3b). In both cases we will see how these contrastingexplanations have been applied to the specific questionof the evolution of compositionality, one of the languagedesign features subserving productivity. The evolutionof compositionality has received a great deal of attentionin the literature, which is why we focus on it here.Ultimately, all these design features will require suchexplanation, and a similar research effort aimed atunderstanding the evolution of duality of patterning (inboth biological and cultural terms) is underway (see, e.g.Nowak & Krakauer 1999; Nowak et al. 1999; Oudeyer2005; Zuidema & de Boer in press). In §4, we will turnto the issue of interactions between biological andcultural evolutionary accounts.

3. TWO EXPLANATIONS FOR LANGUAGEDESIGN(a) An explanation from evolutionary biologyThe uniqueness of human language must have somebiological basis—there must be some feature of humanbiology that results in this unusually rich, expressivesystem of communication in our species alone. Oneobvious biological adaptation for vocal communicationis the unusual structure of the human vocal tract, whichprovides a wide range of highly discriminable sounds(see Fitch 2000, for review). However, language ismore than just speech: the design features picked outabove are ambivalent as to the modality of the system inquestion, and while adaptations for speech areimportant in that they provide a good clue as to theage, modality and selective importance of language inhuman evolution history, they are peripheral to whatwe see as the fundamental design features of language.

Moving beyond the productive apparatus, then,what is our biological endowment for language?Linguists have approached this question via a consider-ation of the problem of language acquisition: how dochildren acquire a complex and richly structuredlanguage with apparent ease at a relatively early stage

in their lives? The difficulty of reconciling the precocityof language acquisition with the complexity of languageleads to the fairly prominent view that humans must beendowed with a highly structured and constraininglanguage-specific mental faculty that, to a large degree,prefigures much of the structure of language. One ofthe main cornerstones of this argument is known asthe argument from the poverty of the stimulus, a termintroduced by Chomsky (1980): the data language mustbe learned from lack direct evidence for a number offeatures of the grammars that children must learn. Ifchildren end up with grammars that contain features forwhich they received no evidence in their input, then (theargument goes) those features of language must beprefigured in the acquisition device. Rather than aflexible and open-ended process of social learning,language acquisition is ‘better understood as the growthof cognitive structures along an internally directedcourse under the triggering and partially shaping effectof the environment’ (Chomsky 1980, p. 34). Designfeatures of language are then naturally explained asfeatures of the internally directed course of acquisition.

This account has been so successful that it nowconstitutes a scientific orthodoxy, at least in some form(and despite a lack of empirical support for stimuluspoverty arguments; Pullum & Scholz 2002). However,even if we accept that language is the way it is becausethe language faculty forces it to be that way, this simplypushes the question back one step: why is the languagefaculty designed the way it is?

A well-established solution to such questions inbiological systems is that of evolution by naturalselection. In a highly influential article, Pinker &Bloom (1990) argued that this solution can be appliedto language, assuming (following the argument above)that language is to a non-trivial extent a biologicalcapacity: ‘It would be natural, then, to expect everyoneto agree that human language is the product ofDarwinian natural selection. The only successfulaccount of the origin of complex biological structureis the theory of natural selection’ (Pinker & Bloom1990, p. 707). To support this claim, Pinker andBloom provided a number of arguments that language,in general, offers reproductive pay-offs. Some of thesearguments have been rehearsed in brief above—forexample, the open-ended communicative possibilitiesthat language affords might plausibly have reproductiveconsequences. The compositionality of language shouldthen be viewed in the context of the functionaladvantages that compositionality affords: languagemust be compositional because the language facultyforces it to be that way, and the language faculty must bedesigned like that because a compositional languageoffers reproductive advantages (what could be moreuseful than the ability to produce entirely novelutterances about a wide range of situations, and havethem understood?).

Nowak et al. (2000) develop this argument in amathematical model of the evolution of composition-ality. They assume two types of language learner: thosewho learn a holistic (non-compositional) mappingbetween meanings and signals, and those who learn asimple compositional system. They consider the case ofpopulations of such learners converged on stable

Human language faculty and its evolution K. Smith & S. Kirby 3593

Phil. Trans. R. Soc. B (2008)

on January 5, 2011rstb.royalsocietypublishing.orgDownloaded from

languages and find that, as expected, populations ofcompositional learners have higher within-populationcommunicative accuracy than learners who learn in aholistic fashion, assuming that the number of eventsthat individuals are required to communicate about isnot small. Under conditions where there are a largenumber of fitness relevant situations to communicateabout, the productivity advantage of compositionallanguage pays off in evolutionary terms.

(b) An explanatory role for cultural evolutionThe biological account of the evolution of languagesketched by Pinker & Bloom (1990) is an attractiveone. However, their assertion that it is the only possibleexplanation for the evident adaptive design of languageis based on a false premise: while we would accept that‘[t]he only successful account of the origin of complexbiological structure is the theory of natural selection’,we would dispute that language is solely a biologicalstructure. As already discussed in §2, language ismanifestly a socially learned, culturally transmittedsystem. Individuals acquire their knowledge of languageby observing the linguistic behaviour of others, and goon to use this knowledge to produce further examples oflinguistic behaviour, which others can learn from in turn(see, e.g. Andersen 1973; Hurford 1990; Kirby 1999).We have previously termed this process of culturaltransmission iterated learning: learning from thebehaviour of another, where that behaviour was itselfacquired through the same process of learning. The factthat language is socially learned and culturally trans-mitted opens up a second possible explanation for thedesign features of language: those features arose throughcultural, rather than biological, evolution. Rather thantraditional transmission being another design featurethat a biological account must explain, traditionaltransmission is the feature from which the otherstructural properties of language spring.

One of the primary objections to this account hasalready been stated: the argument from the poverty ofthe stimulus suggests that language learning should beimpossible, and therefore the apparent social learningof language must be illusory. However, culturaltransmission offers a potential solution to this con-undrum that does not require an assumption ofinnateness—while the poverty of the stimulus poses achallenge for individual learners, language adapts overcultural time so as to minimize this problem, becauseits survival depends upon it. We have dubbed thisprocess cultural selection for learnability:

in order for linguistic forms to persist from onegeneration to the next, they must repeatedly survivethe processes of expression [production] and induction[learning]. That is, the output of one generation mustbe successfully learned by the next if these linguisticforms are to survive.

(Brighton et al. 2005, p. 303).

Cultural selection for learnability offers a solution tothe conundrum posed by the argument from thepoverty of the stimulus:

Human children appear preadapted to guess the rules ofsyntax correctly, precisely because languages evolve so as

to embody in their syntax the most frequently guessedpatterns. The brain has coevolved with respect tolanguage, but languages have done most of the adapting

(Deacon 1997, p. 122).

As a consequence of cultural selection for learn-ability, the poverty of the stimulus problem induces apressure for languages that are learnable from the kindsof data learners can expect to see: ‘the poverty of thestimulus solves the poverty of the stimulus’ (Zuidema2003), but only if we look at language in its propercontext of cultural transmission.

How does cultural selection for learnability explaincompositionality? As discussed above, languages areinfinitely expressive (due to the combination ofrecursion and compositionality). However, suchlanguages must be transmitted through a finite set oflearning data. We call this mismatch between the sizeof the system to be transmitted and its medium oftransmission the learning bottleneck (Kirby 2002a; Smithet al. 2003). Compositionality provides an elegantsolution to this problem: to learn a compositionalsystem, a learner must master a finite set of words andrules for their combination, which can be learned from afinite set of data but can generate a far larger system.This fit between the form of language (it is compo-sitional) and a property of the transmission medium(it is finite but the system passing through it is infinite) issuggestive, and computational models2 of the iteratedlearning process (known as iterated learning models) haverepeatedly demonstrated that cultural evolution drivenby cultural selection for learnability can account for thisgoodness of fit.

In their simplest form, iterated learning modelsconsist of a chain of simulated language learner/users(known as agents). Each agent in this chain learns theirlanguage by observing a set of utterances produced bythe preceding agent in the chain, and in turn producesexample utterances for the next agent to learn from.The treatment of language learning and languageproduction varies from model to model, with similarresults having been shown for a fairly wide range ofmodels (see Kirby 2002b, for review). In all cases, thecrucial feature is that these agents are capable oflearning both holistic and compositional languages: theycan memorize a holistic meaning–signal mapping,but they can also generalize to a partially or whollycompositional language when the data they arelearning from merit such generalization. In otherwords, compositionality is not hard-wired into thelanguage learners.

These models show that the presence of a learningbottleneck is a key factor in determining the evolutionof compositionality. In conditions where there is nolearning bottleneck, the set of utterances produced byone agent for another to learn from covers (or is highlylikely to cover) the full space of expressions that anyagent will ever be called upon to produce. This is ofcourse impossible for human language, where the setof expressions is extremely large or infinite. On theother hand, where there is a learning bottleneck, thereis some (typically high) probability that learner/userswill subsequently be called upon to produce a novelexpression and will therefore be required to generalize.

3594 K. Smith & S. Kirby Human language faculty and its evolution

Phil. Trans. R. Soc. B (2008)

on January 5, 2011rstb.royalsocietypublishing.orgDownloaded from

It is this pressure to generalize that leads to theevolution of compositional languages: whereas compo-sitional languages or compositional parts of languageare generalizable and can therefore be successfullytransmitted through a bottleneck, holistic (sub)systemsare by definition not generalizable and will be subject tochange. This adaptive dynamic has been used toexplain the putative transition from a hypothesizedholistic protolanguage (see, e.g. Wray 1998) to amodern compositional system.

Compositionality can therefore be explained as acultural adaptation by language to the problem oftransmission through a learning bottleneck. Note thatthroughout this explanation we are appealing to adifferent notion of function to that discussed in §3a:while Pinker & Bloom (1990) and Nowak et al. (2000)appealed to compositionality as an adaptation forcommunication, it is also a potential adaptation forcultural transmission.

(c) Explanations for language design:conclusionsBased on the preceding discussion, we believe there isat least as good a case for explaining one design featureof language as a cultural, rather than biological,adaptation. Of course, cultural and biological evolutioncan happily work in the same direction, and in somecases (such as the evolution of duality of patterning),the distinction between cultural and biologicalmechanisms seems rather minor: in both cases, thelinguistic system is being optimized for its ability toproduce numerous maximally discriminable signals(see Zuidema & de Boer in press). For the case ofcompositionality, there is a difference in the functionbeing optimized by the two alternative adaptivemechanisms: communicative usage under the bio-logical account and learnability under the culturalaccount. However, given that the conclusions in allcases are the same—compositionality and duality ofpatterning are good ideas—the temptation might beto gloss over the differences in mechanism leading tothese adaptations.

However, the two competing explanations makerather different predictions about the structure of thehuman language faculty, which, as linguists, is ourultimate object of study. Biological accounts suggest afairly direct mapping between properties of thelanguage faculty and properties of language, whereasthe cultural accounts propose a rather more opaquerelationship. This opaque relationship between thelanguage faculty and language design significantlycomplicates evolutionary accounts of the languagefaculty, as we will see in §4.

4. IMPLICATIONS FOR UNDERSTANDING THEEVOLUTION OF THE LANGUAGE FACULTYAcknowledging a role for cultural processes in theevolution of language might not actually have anyconsequences for our understanding of the humanlanguage faculty in its present form. For example, itwould be perfectly consistent to accept that compos-itionality was initially a cultural adaptation by languageto maximize its own transmissibility, but to argue that

this feature has subsequently been assimilated into thelanguage faculty. Arguments of this sort, appealing tothe mechanism known as the Baldwin effect or geneticassimilation (Baldwin 1896; Waddington 1975) are infact reasonably common in evolutionary linguistics.Pinker & Bloom (1990) appeal directly to the notionthat initially learned aspects of language will graduallybe assimilated into the language faculty, with anyremaining residue of learning being a result ofdiminishing returns from assimilation. More detailedarguments and formal models of this effect arepresented by Briscoe (2000, 2003). The prevalence ofassimilational accounts in the evolutionary linguisticsliterature can be neatly characterized in a parentheticalremark from Ray Jackendoff: ‘I agree with practicallyeveryone that the ‘Baldwin effect’ had something to dowith it’ (Jackendoff 2002, p. 237).

We believe this picture is fundamentally wrong, forat least two reasons. First, we have previously arguedthat the coevolution of culturally transmitted systemsand biological predispositions for learning those systemis somewhat problematic, due to time lags introducedby the cumulative and frequency-dependent nature ofcultural evolution (Smith 2004). Our focus here will beon a second problem: social transmission and culturalevolution can shield the language-learning machineryof individuals from selection, such that two ratherdifferent language faculties end up being behaviourallyequivalent and selectively neutral. This neutrality rulesout evolution of more strongly constraining languagefaculties via assimilational processes. Furthermore,selection itself can drive evolution precisely into theseconditions—a plausible evolutionary scenario seesnatural selection acting on the language facultyselecting for conditions where the language faculty isshielded from selection.

(a) The link between language structure andbiological predispositionsA useful vehicle to develop this argument is provided bya series of recent papers that seek to explicitly addressthe link between language structure, biological predis-positions and constraints on cultural transmission(Griffiths & Kalish 2005, 2007; Kirby et al. 2007;Griffiths et al. 2008). These papers are based arounditerated learning models where learners apply theprinciples of Bayesian inference to language acqui-sition. A learner’s confidence that a particular grammarh accounts for the linguistic data d that they haveencountered is given by

P!hjd"Z P!djh"P!h"Ph0P!djh0"P!h0" ; !4:1"

where P(h) is the prior belief in each grammar andP(djh) gives the probability that grammar h could havegenerated the observed data. Based on the posteriorprobability of the various grammars, P(hjd ), the learnerthen selects a grammar and produces utterances thatwill form the basis, through social learning, of languageacquisition in others. This learning model provides atransparent division between the contribution of thelearning bias of individuals prior to encountering data(the prior) and the observed data in shaping behaviour.We will equate prior bias with the innate language

Human language faculty and its evolution K. Smith & S. Kirby 3595

Phil. Trans. R. Soc. B (2008)

on January 5, 2011rstb.royalsocietypublishing.orgDownloaded from

faculty of individuals—while Griffiths and Kalishrightly point out that the prior need not necessarilytake the form of an innate bias at all (e.g. it might bederived from non-linguistic data), ours is a possible and(we believe) natural interpretation.

Within this framework, Griffiths & Kalish (2005,2007) showed that cultural transmission factors (suchas noise or a learning bottleneck imposed by partialdata) have no effect on the distribution of languagesdelivered by cultural evolution: the outcome of culturalevolution is solely determined by the prior biases oflearners, given by P(h). In other words, only thestructure of the language faculty matters in determin-ing the outcomes of linguistic evolution. Griffiths &Kalish (2007) and Kirby et al. (2007) demonstratedthat this result is a consequence of the assumption thatlearners select a grammar with probability proportionalto P(hjd )—if learners instead select the grammar thatmaximizes the posterior probability (known as MAPlearners), then cultural transmission factors play animportant role in determining the distribution oflanguages delivered by cultural evolution: while thedistribution of languages produced by culturalevolution will be approximately centred on thelanguage most favoured by the prior, different trans-mission bottlenecks (for example) lead to differentdistributions. Furthermore, and crucially for ourarguments here, for MAP learners the strength of theprior bias is irrelevant over a wide range of theparameter space (Kirby et al. 2007)—it matters whichlanguage is most favoured in the prior, but not howmuch it is favoured over alternatives.

These models suggest two candidate components ofthe innate language faculty: first, the prior bias, P(h),and second, the strategy for selecting a grammar basedon P(hjd )—sampling proportional to P(hjd ), orselecting the grammar that maximizes P(hjd ). Further-more, they allow us to explore, using a single frame-work, the evolution of the language faculty under tworather different conceptions of what cultural evolutiondoes: either it ensures that the distribution of languagesultimately delivered by cultural evolution reflectsexactly the biases of language learners, regardless oftransmission factors such as the learning bottleneck(if learners sample from the posterior), or it allows suchtransmission factors to play a significant role indetermining the languages we see, and obscuresdifferences in the nature of individual languagefaculties (if learners maximize). We can thereforestraightforwardly extend models of this sort to askwhether cultural evolution alters the plausibility ofscenarios regarding the biological evolution of thelanguage faculty and, if so, under what conditions.

(b) The model of learning and culturaltransmissionWe adopt Kirby et al.’s (2007) model of language andlanguage learning. Full details are given below, but tobriefly summarize: languages are mappings betweenmeanings and signals, where learners have a paramet-rizable preference for regular languages. Implicit in thismodel of learning is a model of cultural transmission,which can be used to calculate the stable outcomes ofcultural evolution.

A language consists of a system for expressing mmeanings, where each meaning can be expressed usingone of k means of expression, called signal classes. In aperfectly regular (or systematic) language the samesignal class will be used to express each meaning—forexample, the same inflectional paradigm will be usedfor each verb, or the same compositional rules will beused to construct an utterance for each meaning. Bycontrast, in a perfectly irregular system each meaningwill be associated with a distinct signal class—each verban irregular, each complex utterance an idiom.

We will assume two types of prior bias. For unbiasedlearners, all grammars have the same prior probability:P(h)Z1/km. Biased learners have a preference forlanguages that use a consistent means of expression,such that each meaning is expressed using the samesignal class. Following Kirby et al. (2007), this prior isgiven by the expression

P!h"Z G!ka"G!a"kG!mCka"

Yk

jZ1

G!nj Ca"; !4:2"

where G!x"Z !xK1"! when x is an integer; nj is thenumber of meanings expressed using class j; and aR1determines the strength of the preference for consist-ency: low a gives a strong preference for consistentlanguages and higher a leads to a weaker preference forsuch languages.3 Kirby et al. (2007) justify the use ofthis particular prior distribution on the basis thatBayesian inference with this prior can be viewed ashypothesis selection based on minimum descriptionlength principles, which has been argued (convincingly,to our minds) to be relevant to cognition in general andlanguage acquisition in particular (see, e.g. Brighton2003; Chater & Vitanyi 2003). However, the precisedetails of this prior are unimportant for our purposeshere: any prior that provides a (partial) ordering overhypotheses supports our conclusions.

The probability of a particular dataset d (consistingof b observed meaning–form pairs: b gives the bottle-neck on language transmission) being produced by anindividual with grammar h is

P!djh"ZY

hxyi2d

P! yjx; h" 1m; !4:3"

where all meanings are equiprobable; hxyi is a meaning–signal pair consisting of a meaning x and a signal class y;and P! yjx; h" gives the probability of y being producedto convey x given grammar h and noise e:

P!yjx;h"Z

1Ke if y is the classcorresponding to x in h

e

kK1otherwise

:

8>>>><

>>>>:

!4:4"

Bayes’ rule can then be applied to give a posteriordistribution over hypotheses, given a particular set ofutterances.Thisposteriordistribution isusedbya learnerto select a grammar, according to one of two strategies.Sampling learners simply select a grammar propor-tional to its posterior probability: PL!hjd "ZP!hjd ".MAP learners select randomly from among those

3596 K. Smith & S. Kirby Human language faculty and its evolution

Phil. Trans. R. Soc. B (2008)

on January 5, 2011rstb.royalsocietypublishing.orgDownloaded from

grammars with the highest posterior probability:

PL!hjd "Z1= Hj j if h2H0 otherwise

;

(

!4:5"

whereH is the set of hypotheses for which P(hjd ) is at amaximum.

A model of cultural transmission follows straight-forwardly from this model of learning: the probabilityof a learner at generation n arriving at grammar hn givenexposure to data produced by grammar hnK1 is simply

P!hn Z ijhnK1 Z j "ZX

d

PL!hn Z ijd "P!djhnK1 Z j ":!4:6"

The matrix of all such transition probabilities isknown as the Q matrix (Nowak et al. 2001): entry Qij

gives the transition probability from grammar j to i. Asdiscussed in Griffiths & Kalish (2005, 2007), the stableoutcome of cultural evolution (the stationary distributionof languages) can be calculated given this Qmatrix andis proportional to its first eigenvector.4 We will denotethe probability of grammar i in the stationarydistribution as Q#

i .Table 1 gives some example prior probabilities and

stationary distributions, for various strengths of priorand both selection strategies. As shown in table 1,strength of prior determines the outcome of culturalevolution for sampling learners, but is unimportant forMAP learners as long as some bias exists.

(c)Evaluating within-population communicativeaccuracyIn order to calculate which selection strategies andpriors are favoured by biological evolution, we need todefine a measure that determines reproductive successand therefore predicts the evolutionary trajectory of thelanguage faculty. One possibility (following Nowak et al.2000) is that the relevant quality is the extent to whichmembers of a genetically homogeneous population cancommunicate with one another: the probability that anytwo randomly selected individuals drawn from apopulation at equilibrium (at the stationary distributionprovided by cultural evolution) will share the samelanguage.5 This quantity, C, is simply

C ZX

h

Q#h

! "2; !4:7"

where the sum is over all possible grammars. Thewithin-population communicative accuracies of variouscombinations of strength of prior and hypothesisselection strategy are given in figure 1. Three resultsare apparent from this figure. First, in samplingpopulations, stronger priors (lower values of a) yieldhigher communicative accuracy. Stronger priors makefor a less uniform stationary distribution, with moreregular languages being over-represented. This skewingaway from the uniform distribution results in greaterwithin-population coherence. By contrast, strength ofprior is irrelevant in populations of MAP learners: it hasno impact on the stationary distribution, and as a resultthere is no communicative advantage associated withstronger priors.

Finally, MAP populations always have higherwithin-population communicative accuracy thansampling learners. As shown by Kirby et al. (2007),and illustrated in table 1, in MAP populations thedifferences between languages are amplified by culturalevolution, and the extent of the amplification isinversely proportional to the size of the transmissionbottleneck: tight bottlenecks give greater amplification,such that the a priori most likely language is greatlyoverrepresented in the stationary distribution. Bycontrast, cultural evolution in sampling populations

Table 1. P(h) for three grammars given various types of bias (unbiased, weak bias (aZ40) and strong bias (aZ1), denoted by u,bw and bs, respectively), and the frequency of those grammars in the stationary distribution for sampling and MAP learners.(Grammars are given as strings of characters, with the first character giving the class used to express the first meaning and so on:aaa is a perfectly regular language and abc is perfectly irregular. All results here and throughout the paper are for mZ3, kZ3,bZ3 and eZ0.1. For MAP learners, qualitatively similar results are obtainable for a wide range of the parameter space. Forsampling learners, as shown by Griffiths & Kalish (2007), qualitatively similar results are obtainable for any region of theparameter space where eO0.)

P(h) Q#, sampler Q#, maximizer

h u bw bs u bw bs u bw bs

aaa 0.0370 0.0389 0.1 0.0370 0.0389 0.1 0.0370 0.2499 0.2499aab 0.0370 0.0370 0.0333 0.0370 0.0370 0.0333 0.0370 0.0135 0.0135abc 0.0370 0.0361 0.0167 0.0370 0.0361 0.0167 0.0370 0.0014 0.0014

0

0.05

0.10

0.15

0.20

0.25

1 10 20 30 40

com

mun

icat

ive

accu

racy

, C

a

Figure 1. Fitness of sampler (filled squares) and MAP (opensquares, bZ3; circles, bZ6; triangles, bZ9) learners. Resultsfor MAP learners are for eZ0.1 and for various bottlenecks(values of b). C for sampler populations is the same regardlessof noise level and amount of data, as these factors have noimpact on the stationary distribution.

Human language faculty and its evolution K. Smith & S. Kirby 3597

Phil. Trans. R. Soc. B (2008)

on January 5, 2011rstb.royalsocietypublishing.orgDownloaded from

returns the unmodified prior distribution. Amplifyingthe differences between languages provided by the priorincreases the likelihood that two individuals from apopulation will share the same language, and thereforeyields higher C. This analysis therefore suggests thatevolution should favour MAP learning over sampling,leading to selective neutrality with respect to thestrength of prior biases: strength of prior bias isshielded in MAP populations.

(d) Evaluating evolutionary stabilityWhile the preceding analysis tells us which priors andhypothesis selection strategies are objectively best froma communicative point of view, it tells us nothing aboutthe evolvability of those features. A more satisfyingsolution is to evaluate the evolutionary stability ofhypothesis selection strategies and priors. In order todo this, we require a slightly more detailed means ofevaluating how communication influences reproduc-tive success. We make the following initial assumptions(see below for a slightly different set of assumptions):(i) a population consists of several subpopulations,(ii) each subpopulation has converged on a singlegrammar through social learning, with the probabilityof each grammar being used by a subpopulation givenby that grammar’s probability in the stationarydistribution (as suggested by the analysis provided inGriffiths et al. 2008, §5a), and (iii) natural selectionfavours learners who arrive at the same grammar6 astheir peers in a particular subpopulation, where peersare other learners exposed to the language of thesubpopulation. Given these assumptions, the commu-nicative accuracy between two individuals A and B isgiven by

ca!A;B"ZX

h

X

h0QA

hh0$QBhh0$Q

#h0 ; !4:8"

where the superscripts on Q indicate that learnersA and B may have different selection strategies andpriors. The relative communicative accuracy of a singlelearner A with respect to a large and homogeneouspopulation of individuals of type B is therefore given byrca!A;B"Zca!A;B"=ca!B;B". Where this quantity isgreater than 1, the combination of selection strategyand prior (the learning behaviour) of individual A offerssome reproductive advantage relative to the populationlearning behaviour, and may come to dominate thepopulation. Where relative communicative accuracyis less than 1, learning behaviour A will tend to beselected against; and when relative communicativeaccuracy is 1, both learning behaviours are equivalentand genetic drift will ensue. Following MaynardSmith & Price (1973), the conditions for evolutionarystability for a behaviour of interest, I, are therefore:(i) rca!J; I"!1 for all JsI (populations of type Iresist invasion by all other learning behaviours) or(ii) rca!J; I"Z1 for some JsI, but in each such caserca!I ; J"O1. The second condition covers situationswhere the minority behaviour J can increase by driftto the point where encounters between type J indi-viduals become common, at which point type Iindividuals are positively selected for and the dom-inance of behaviour I is re-established.

Again, we can use this model to ask severalevolutionary questions. First, what does the evolutionof the strength of prior preference look like in samplingand MAP populations? Do different learning models(and their associated differences in the outcomes ofcultural evolution) change the evolutionary stability ofdifferent strengths of prior? Figure 2a,b shows theresults of numerical calculations of evolutionarystability in sampling populations.

As can be seen in figure 2, in sampling populationsthere is selection against weaker priors and selection fora stronger prior bias than the population norm. Thestronger the strength of the prior bias of the population,the more dominant languages favoured by that priorbias (i.e. the regular languages) will be, and the greater

min

ority

am

inor

ity a

min

ority

a

majority a

10

20

30

10

20

30

10

20

30

10 20 30

(a)

(b)

(c)

Figure 2. (a,b) Relative communicative accuracy of variouscombinations of strengths of prior in sampling populations((a) bZ3; (b) bZ9). Black cells indicate the minority a-valuewill be selected against (rca!1), white cells indicate theminority a-value will be selected for (rcaO1) and grey cells(seen on the diagonal) indicate selective neutrality (rcaZ1).(c) An example result given the global evaluation ofcommunicative accuracy (bZ9); see text for details.

3598 K. Smith & S. Kirby Human language faculty and its evolution

Phil. Trans. R. Soc. B (2008)

on January 5, 2011rstb.royalsocietypublishing.orgDownloaded from

the advantage in being biased in favour of acquiringthose languages. In maximizing populations, strengthof prior is always selectively neutral—regardless of the aof the majority and minority, rcaZ1. These results aretherefore not plotted. In accordance with the analysisgiven in §4c, the evolutionary stability of differentstrengths of prior bias differs markedly across samplingand maximizing populations.

The degree to which stronger priors are favoured insampling populations is somewhat sensitive to thestrength of the population prior, however. For example,populations with a very weak prior bias in favour ofregularity (aZ40) resist invasion by mutants withmuch stronger prior preferences for regularity (az1)—given the relatively uniform distribution of languages inthese populations, the added ease of acquiring highlyregular languages provided by a very strong preferencefor such languages is counteracted by the reducedlikelihood of being born into populations speaking suchlanguages. This phenomenon becomes more markedgiven larger bottlenecks (higher b, meaning thatlearners observe more data during learning), as thisreduces the likelihood of misconvergence for individ-uals with weaker priors.

As shown in table 2, this tendency for weakermajority priors to reduce the extent to which strength-ened priors can invade also pertains for populationswhere the majority have a flat, unbiased prior: in suchpopulations every language is equally probable, and anybias to acquire a particular language is penalized due tothe consequent decreased ability to acquire the otherlanguages. Consequently, there are two evolutionarilystable strategies (ESSs) in sampling populations: thestrongest possible prior (aZ1) or a completelyunbiased prior.

In contrast with the situation in sampler popu-lations, selection in MAP populations is neutral withrespect to bias strength. This follows naturally from theinsensitivity of MAP learners to the strength of theirprior biases—all that matters is the ranking of thoselanguages under the prior. Consequently, strength ofthe prior makes no difference to the stationarydistribution provided by cultural evolution (as shownin table 1), nor to the ability of learners to acquire thoselanguages. Consequently, strength of prior is selectivelyneutral. However, as shown in table 2, this neutralitywith respect to strength of prior bias only applies whenthere is some prior bias—a completely flat prior is notevolutionarily stable (through the second clause of thedefinition given above). Consequently, the set of all

non-flat priors constitutes the evolutionarily stable set(Thomas 1985) for MAP learners.

These two models of learning, which make differentpredictions about the outcomes of cultural evolution,also make different predictions about the coevolution ofthe language faculty and language. In particular, thelearning model that predicts an opaque relationshipbetween bias and outcomes of cultural evolution (MAPlearning) also predicts extensive shielding of thelanguage faculty from selection, and certainly nopositive selection for increasingly constraininglanguage faculties.

We can ask a final evolutionary question: is it betterto be a sampling learner or a maximizing learner? Asshown in table 3, maximizing is the ESS, regardlessof bias strength. MAP learners boost the probabilitythat the most likely grammar will be learned, and areconsequently more likely to arrive at the same grammaras some other learner exposed to the same data-generating source. Given the fitness function thatemphasizes convergence on the same language asother members of the population, maximizing isobviously the best thing to do.

It is worth noting that this pattern of results is notdependent on the assumption that learners arerewarded for arriving at the same grammar as otherindividuals reared in the same linguistically homo-geneous subpopulation. While this strikes us as areasonable fitness function, it is not the only possibleone. For example, we could make the following ratherdifferent set of assumptions: (i) rather than consistingof several subpopulations, there is a single well-mixed(meta-)population, (ii) the probability of each grammarbeing used by any individual is given by that grammar’sprobability in the stationary distribution (again, assuggested by the analysis provided in Griffiths et al.2008, §5a) and (iii) natural selection favours learnerswho arrive at the same grammar as any randomlyselected peer. This global fitness function evaluateslearners on their ability to communicate with anyrandomly selected individual from a linguisticallyheterogeneous metapopulation, as opposed to reward-ing communication within a local subpopulation.

Under this alternative global fitness regime, thesame pattern of results emerges: stronger priors arefavoured in sampling populations; prior strength isselectively neutral in maximizing populations; andmaximizing is favoured over sampling. There arethree relatively minor qualitative differences.

Table 2. Relative communicative accuracy of various strengthsof prior in (a) sampling and (b) maximizing populations.

majority

u bs bw

(a) minority u — 0.81 0.9997bs 0.98 — 0.82bw 0.99998 0.99 —

(b) minority u — 0.45 0.45bs 1 — 1bw 1 1 —

Table 3. Relative communicative accuracy of the minorityhypothesis selection strategy for strong (aZ1), weak (aZ40)and flat priors.

majority

minority sampling maximizing

strong sampling — 0.6maximizing 1.39 —

weak sampling — 0.38maximizing 1.14 —

flat sampling — 0.88maximizing 1.12 —

Human language faculty and its evolution K. Smith & S. Kirby 3599

Phil. Trans. R. Soc. B (2008)

on January 5, 2011rstb.royalsocietypublishing.orgDownloaded from

First, in sampling populations, the flat prior no longerconstitutes an ESS: the sole ESS is the strongest possibleprior, aZ1. This is shown in table 4 (cf. table 2a).Populations converged on a flat prior are invasible bydrift by learners with a non-flat prior: in a well-mixedunbiased population, any randomly selected unbiasedindividual will be equally likely to have each grammar;therefore, no matter what grammar a biased learnerconverges on, they will successfully communicate withan identical proportion of the population. Note thedifference with the local population case, where biasedlearners are penalized whenever they are born into asubpopulation using a language disfavoured by thebiased learner’s prior.

Second, and following the same reasoning, insampling populations, the zone of selection wheremuch stronger priors than the population norm areselected against disappears (figure 2c, cf. figure 2b)—again, global mixing removes the risk associated withstrong priors of converging on the wrong languagewithin a homogeneous subpopulation.

Finally, and again by the same reasoning, in popu-lationswith a flat prior bias, sampling andmaximizing areequivalent: under the global evaluation, the selectiveadvantage to maximizing disappears in such populations(but not in biased populations, where maximizing is stillpreferred). This is illustrated in table 5 (cf. table 3,subtable for flat priors). Given the completely uniformdistribution over languages and global mixing, samplers(who would normally be disproportionately penalizedin situations where they fail to converge on their sub-population’s language) will successfully communicatewith an equal proportion of the population regardless ofwhich grammar they (mis)learn.

Notwithstanding these minor differences, the resultsof the evolutionary analysis seem to be fairly robustunder different conceptions of how communicativesuccess within a population should be defined. It is anopen question whether the same conclusion pertainsgiven more complex fitness functions—for example,one that favours the acquisition of a specific languagerather than the (locally or globally) most frequent, orfavours the acquisition of a frequent but not toofrequent language.

To summarize: this model therefore suggests thatselection for communication acting on the languagefaculty should select learners who maximize over theposterior distribution rather than sampling from it.Given this initial selective choice, selection shouldconsequently be neutral with respect to the strength ofthe prior preference in favour of particular linguistic

structures: strongly constraining language faculties areno more likely than extremely weak biases. This is thefirst main conclusion we would like to draw from thismodel: the prediction that selection will favour lessflexible rather than more flexible learning is not alwayswarranted, because learning and cultural evolution canshield the language faculty from selection.

Furthermore, there are conditions under whichselection will favour the weakest possible prior biases.Cost can be integrated into the model by assuming thatnatural selection favours learners who arrive at thesame grammar as their peers and minimize cost as afunction of a: for example, individuals might reproduceproportionally to their weighted communicativeaccuracy, where the weighted communicative accuracybetween two individuals A and B is given by

ca0!A;B"Z ca!A;B"c!aA"Cc!aB"

; !4:9"

where ca(A, B) is as given in equation (4.8); c(a) is a costfunction, with higher values corresponding to greatercost; and aA gives the strength of prior of individualA. Ifwe assume that strong prior biases have some cost (i.e.c(a) is inversely proportional to a: perhaps strongerpriors require additional, restrictive but costly cognitivemachinery), there are conditions under which only weakbias would be evolutionarily stable. There will be somehigh value of a, which we will call a#, for which: (i) theprior is sufficiently weak that its costs relative to theunbiased strategy are low enough to allow a# individualsto invade unbiased populations and (ii) the priorremains sufficiently strong that a# populations resistinvasion by unbiased individuals. Under such a scenario,the extremely weak a# prior becomes the sole ESS:evolution will favour maximization and the weakestpossible (but not flat) prior.

The evolutionary argument sketched above onlyapplies if we assume that the only selective advantage toa particular prior bias arises from its communicativefunction: were particular priors to offer some non-linguistic advantage, less prone to being shielded bylearning and cultural evolution, then it could beselected for (or against) based on those more selectivelyobvious functions. This means that language-learningstrategies that are strongly constraining but domaingeneral are more likely to evolve than constrainingdomain-specific strategies. In other words, thetraditional transmission of language means that themost likely strongly constraining language facultywill not be a language faculty in the strict sense at all,but a more general cognitive faculty applied to theacquisition of language.

Table 4. Some results for the alternative global evaluation offitness. (Relative communicative accuracy of variousstrengths of prior in sampling populations. Unbiased priorsare no longer evolutionarily stable.)

majority

u bs bw

minority u — 0.79 0.9997bs 1 — 1.008bw 1 0.79 —

Table 5. Some results for the alternative global evaluation offitness. (Relative communicative accuracy of sampling versusmaximizing for flat priors. Under these circumstances,maximizing is no longer preferred over sampling.)

majority

sampling maximizing

minority sampling — 1maximizing 1 —

3600 K. Smith & S. Kirby Human language faculty and its evolution

Phil. Trans. R. Soc. B (2008)

on January 5, 2011rstb.royalsocietypublishing.orgDownloaded from

(e) Conclusions on the evolution of thelanguage facultyCultural transmission alters our expectations abouthow the language faculty should evolve. Differentlanguage faculties may lead to identical outcomes forcultural evolution (as shown by Kirby et al. 2007),resulting in selective neutrality over those languagefaculties. If we believe that stronger prior biases (morerestrictive innate machinery) cost, then selection canfavour the weakest possible prior bias in favour of aparticular type of linguistic structure. This result, inconjunction with those provided in Smith (2004), leadsus to suspect that a domain-specific language faculty,selected for its linguistic consequences, is unlikely to bestrongly constraining—such a language faculty is theleast likely to evolve. Any strong constraints fromthe language faculty are likely to be domain general, notdomain specific, and owe their strength to selection foralinguistic, non-cultural functions.

5. CONCLUSIONSHow do the predictions of this evolutionary modellingrelate back to the uniqueness question posed in §2:what is it about humans that gives us our unusualcommunication system? Based on the argument out-lined in §4, we think that a strongly constrainingdomain-specific language faculty is unlikely to haveevolved in humans. It strikes us as more likely thathumans have a collection of domain-general biases thatthey bring to the language-learning task (see alsoChristiansen & Chater in press)—while these biasesmight be rather weak, their cumulative applicationduring language learning will lead to strong universalpatterns of linguistic structure, such as the designfeatures identified in §2. This prediction is consonantwith a recent body of work in developmental linguisticsthat seeks to identify how domain-general statisticallearning techniques can be used to acquire languagefrom data (e.g. Saffran et al. 1996; Gomez 2002), whatthe biases of those statistical processes are (e.g.Newport & Aslin 2004; Endress et al. 2005; HudsonKam & Newport 2005) and whether non-linguisticspecies share the same capacities and biases (Hauseret al. 2001; Fitch & Hauser 2004; Newport et al. 2004;Gentner et al. 2006). The comparative aspect of thiswork strikes us as particularly important, in that itsheds light on the extent to which humans actually haveany specializations for the acquisition of language,rather than specializations for sequence learning or forsocial learning more generally.

We would not be surprised if species-specificspecializations for the acquisition of linguistic structureturn out to be rare or even non-existent—this is whatthe evolutionary argument in §4 suggests. Rather, wesuspect that the uniqueness of language arises from theco-occurrence of a number of relatively unusualcognitive capacities that constitute preconditions forthe cultural evolution of these linguistic features.

The modelling literature on the cultural evolution ofcommunication is a useful tool in identifying what thesepreconditions might look like. The componentsrequired for cultural evolution to produce a simple,

traditionally transmitted, semantic and productivesystem seem to be fairly minimal:

(i) ability to modify own produced signal formsbased on observed usage;

(ii) ability to learn to associate meanings with signals;(iii) ability to infer communicative intentions

(meanings) in others;(iv) thesemeanings are drawn from a reasonably large

and structured meaning space.

At a first approximation, cultural evolution in apopulation of social learners meeting these fourpreconditions should yield a productive and semanticcommunication system. The first three conditions aresimply the component parts required for a traditionallytransmitted semantic communication system. Thefourth stipulation relates to the requirement for alearning bottleneck (a bottleneck is more likely if thesystem contains lots of meanings to be communicated),but also the possibility of generalizing from meaning tomeaning, which requires some similarity structureamong meanings—when this similarity structure isreduced, the learnability advantage of compositionallanguage is reduced (Brighton 2002; Smith et al. 2003).

As discussed in §2, some of these abilities are presentin non-human species. The first is present in all vocallearners, and also in non-human primates capable ofsustaining ritualized gestural communication systems.While the second is not present to any meaningfulextent in songbirds and other vocal learners (at leastwith reference to their vocally learned communicationsystems), it is again present in gesturally communi-cating apes. However, the third ability seems to be rare,including among other primates. The ability to inferthe communicative intentions of others underpins thelearning of large sets of arbitrary meaning–formassociations: in order to learn such associations, youmust be able, somewhat reliably, to infer the meaningassociated with each utterance you encounter. Whilesome other primates are able to learn to infer themeaning of a communicative signal, the laboriousprocess of ontogenetic ritualization by which meaningfor signals become established stands in stark contrastto the human ability to rapidly and accurately infercommunicative intentions (see, e.g. Bloom 1997,2000). There is no good evidence that apes are ableto acquire communicative signs by more streamlinedobservational processes (and see Tomasello et al. 1997,for a negative result).

This difference in efficiency in inferring the meaningassociated with a signal may be important for thefollowing reasons. First, the requirement for a sharedhistory of interaction reduces the potential spreadof any signal, limiting it to the individuals able todevote time to its establishment. Second, it restricts theultimate size of the repertoires of signals that can belearned, which directly links to the fourth requirementlisted above, for a large and structured meaning spaceto associate signals with. The (limited) success of apestrained on artificial communication systems (see, e.g.Savage-Rumbaugh et al. 1986), with modest inven-tories of communicative tokens and meaningfulcombinations of those tokens, suggests that there may

Human language faculty and its evolution K. Smith & S. Kirby 3601

Phil. Trans. R. Soc. B (2008)

on January 5, 2011rstb.royalsocietypublishing.orgDownloaded from

be no fundamental cognitive barrier to the acquisitionof a small productive communication system in apes, ifonly sufficient scaffolding is provided for its acquisition.Such communication systems may not exist in the wild(and we offer this as no more than a tentativesuggestion) due to the limits on repertoire size imposedby the ritualization process, which prevents theestablishment of large and systematically related setsof conventional meaning–signal pairings and thereforeremoves any cultural evolutionary pressure for pro-ductive structure.

Of course the question then becomes: why are thesepreconditions for the cultural evolution of languageonly met in humans? Why have humans evolved theunusual abilities to vocally learn, infer meaningefficiently, and so on? We have at present no answerto these questions, but we suspect that exploring theevolution of the preconditions for the cultural evolutionof language is likely to be more profitable, and moreamenable to the comparative approach, than seeking toestablish the evolutionary history of a strongly con-straining, domain-specific and species-unique facultyof language.

K.S. was funded by a British Academy Postdoctoral ResearchFellowship. The development of the model described in §4was facilitated by our attendance at the NWO-fundedMasterclass on Language Evolution, organized by Bart deBoer and Paul Vogt.

ENDNOTES1Or some other discriminable unit, e.g. handshape, orientation and

location in sign language (Stokoe 1960).2See Kirby et al. (2008) for an experimental treatment of the same

phenomenon.3We will only consider the case where a takes integer values.4Provided the Markov process described by the Q matrix is ergodic

(see Griffiths & Kalish 2007, p. 446).5This assumes that identical grammars are required for communi-

cation. We could instead evaluate communicative accuracy based on

the degree of similarity between grammars or the degree of similarity

between datasets produced by those grammars. These more graded

notions of communication produce qualitatively similar results to

those presented here, the quantitative change being a reduction in the

magnitude of the selection differentials in some cases.6Again, more graded notions of match between grammars produce

qualitatively similar results.

REFERENCESAndersen, H. 1973 Abductive and deductive change.

Language 40, 765–793. (doi:10.2307/412063)Baldwin, J. M. 1896 A new factor in evolution. Am. Nat. 30,

441–451. (doi:10.1086/276408)Bloom, P. 1997 Intentionality and word learning. Trends Cogn.

Sci. 1, 9–12. (doi:10.1016/S1364-6613(97)01006-1)Bloom, P. 2000 How children learn the meanings of words.

Cambridge, MA: MIT Press.Brighton, H. 2002 Compositional syntax from cultural

transmission. Artif. Life 8, 25–54. (doi:10.1162/106454602753694756)

Brighton, H. 2003 Simplicity as a driving force in linguisticevolution, PhD thesis, The University of Edinburgh.

Brighton, H., Kirby, S. & Smith, K. 2005 Cultural selectionfor learnability: three principles underlying the view that

language adapts to be learnable. In Language origins:perspectives on evolution (ed. M. Tallerman), pp. 291–309.Oxford, UK: Oxford University Press.

Briscoe, E. 2000 Grammatical acquisition: inductive bias andcoevolution of language and the language acquisitiondevice. Language 76, 245–296. (doi:10.2307/417657)

Briscoe, E. 2003 Grammatical assimilation. In Languageevolution (eds M. H. Christiansen & S. Kirby), pp. 295–316.Oxford, UK: Oxford University Press.

Call, J. & Tomasello, M. (eds) 2004 The gestural communi-cation of apes and monkeys. Hillsdale, NJ: LawrenceEarlbaum Associates.

Catchpole, C. K. & Slater, P. J. B. 1995 Bird song: biologicalthemes and variations. Cambridge, UK: CambridgeUniversity Press.

Chater, N. & Vitanyi, P. M. B. 2003 Simplicity: a unifyingprinciple in cognitive science? Trends Cogn. Sci. 7, 19–22.(doi:10.1016/S1364-6613(02)00005-0)

Cheney, D. & Seyfarth, R. 1990 How monkeys see the world:inside the mind of another species. Chicago, IL: University ofChicago Press.

Chomsky, N. 1980 Rules and representations. London, UK:Basil Blackwell.

Christiansen, M. & Chater, N. In press. Language as shapedby the brain. Behav. Brain Sci.

Deacon, T. 1997 The symbolic species. London, UK: Penguin.Eens, M. 1997 Understanding the complex song of European

starling: an integrated ethological approach. Adv. StudyBehav. 26, 355–434.

Endress, A. D., Scholl, B. J. & Mehler, J. 2005 The role ofsalience in the extraction of algebraic rules. J. Exp. Psychol.Gen. 134, 406–419. (doi:10.1037/0096-3445.134.3.406)

Evans, C. S., Evans, L. & Marler, P. 1993 On the meaning ofalarm calls: functional reference in an avian vocal system.Anim. Behav. 46, 23–38. (doi:10.1006/anbe.1993.1158)

Fitch, W. T. 2000 The evolution of speech: a comparativereview. Trends Cogn. Sci. 4, 258–267. (doi:10.1016/S1364-6613(00)01494-7)

Fitch, W. T. &Hauser, M. D. 2004 Computational constraintson syntactic processing in a nonhuman primate. Science303, 377–380. (doi:10.1126/science.1089401)

Gentner, T. Q., Fenn, K. M., Margoliash, D. & Nusbaum,H. C. 2006 Recursive syntactic pattern learning by song-birds.Nature 440, 1204–1207. (doi:10.1038/nature04675)

Gil, D. & Slater, P. J. B. 2000 Song organisation and singingpatterns of the willow warbler, Phylloscopus trochilus.Behaviour 137, 759–782. (doi:10.1163/156853900502330)

Gomez, R. L. 2002 Variability and detection of invariantstructure. Psychol. Sci. 13, 431–436. (doi:10.1111/1467-9280.00476)

Griffiths, T. L. & Kalish, M. L. 2005 A Bayesian view oflanguage evolution by iterated learning. In Proc. 27thAnnual Conf. of the Cognitive Science Society (eds B. G.Bara, L. Barsalou & M. Bucciarelli), pp. 827–832.Mahwah, NJ: Erlbaum.

Griffiths, T. L. & Kalish, M. L. 2007 Language evolution byiterated learning with Bayesian agents. Cogn. Sci. 31,441–480.

Griffiths, T. L., Kalish, M. L. & Lewandowsky, S. 2008Theoretical and empirical evidence for the impact ofinductive biases on cultural evolution. Phil. Trans. R. Soc.B 363, 3591–3603. (doi:10.1098/rstb.2008.0146)

Hauser, M. D., Newport, E. L. & Aslin, R. N. 2001Segmentation of the speech stream in a non-humanprimate: statistical learning in cotton-top tamarins.Cognition 78, B53–B64. (doi:10.1016/S0010-0277(00)00132-3)

3602 K. Smith & S. Kirby Human language faculty and its evolution

Phil. Trans. R. Soc. B (2008)

on January 5, 2011rstb.royalsocietypublishing.orgDownloaded from

Hauser, M. D., Chomsky, N. & Fitch, W. T. 2002 The facultyof language: what is it, who has it, and how did it evolve?Science 298, 1569–1579. (doi:10.1126/science.298.5598.1569)

Hockett, C. F. 1960 The origin of speech. Sci. Am. 203,88–96.

Hudson Kam, C. L. & Newport, E. L. 2005 Regularizingunpredictable variation: the roles of adult and childlearners in language formation and change. Lang. Learn.Dev. 1, 151–195. (doi:10.1207/s15473341lld0102_3)

Hurford, J. R. 1990 Nativist and functional explanations inlanguage acquisition. In Logical issues in language acquisition(ed. I.M. Roca), pp. 85–136. Dordrecht, The Netherlands:Foris.

Jackendoff, R. 2002 Foundations of language: brain, meaning,grammar, evolution. Oxford, UK: Oxford University Press.

Kinsella, A. R. In press. Language evolution and syntactictheory. Cambridge, UK: Cambridge University Press.

Kirby, S. 1999 Function, selection and innateness: the emergenceof language universals. Oxford, UK: Oxford UniversityPress.

Kirby, S. 2002a Learning, bottlenecks and the evolution ofrecursive syntax. In Linguistic evolution through languageacquisition: formal and computational models (ed. E. Briscoe),pp. 173–203. Cambridge, UK: Cambridge UniversityPress.

Kirby, S. 2002b Natural language from artificial life. Artif.Life 8, 185–215. (doi:10.1162/106454602320184248)

Kirby, S., Dowman, M. & Griffiths, T. L. 2007 Innatenessand culture in the evolution of language. Proc. Natl Acad.Sci. USA 104, 5241–5245. (doi:10.1073/pnas.0608222104)

Kirby, S., Cornish, H. & Smith, K. 2008 Cumulative culturalevolution in the laboratory: an experimental approach to theorigins of structure in human language.Proc. Natl Acad. Sci.USA 105, 10 681–10 686. (doi:10.1073/pnas.0707835105)

Krifka, M. 2001 Compositionality. In The MIT encyclopae-dia of the cognitive sciences (eds R. A. Wilson & F. Keil),pp. 152–153. Cambridge, MA: MIT Press.

Marler, P. 1955 Characteristics of some animal calls. Nature176, 6–8. (doi:10.1038/176006a0)

Maynard Smith, J. & Price, G. R. 1973 The logic of animalconflict. Nature 146, 15–18. (doi:10.1038/246015a0)

Maynard Smith, J. & Szathmary, E. 1995 The major transitionsin evolution. Oxford, UK: Oxford University Press.

Newport, E. L. & Aslin, R. N. 2004 Learning at a distance I:statistical learning of non-adjacent dependencies. Cogn.Psychol. 48, 127–162. (doi:10.1016/S0010-0285(03)00128-2)

Newport, E. L., Hauser, M. D., Spaepan, G. & Aslin, R. N.2004 Learning at a distance II: statistical learning of non-adjacent dependencies in a non-human primate. Cogn.Psychol. 49, 85–117. (doi:10.1016/j.cogpsych.2003.12.002)

Nowak, M. A. & Krakauer, D. C. 1999 The evolution oflanguage. Proc. Natl Acad. Sci. USA 96, 8028–8033.(doi:10.1073/pnas.96.14.8028)

Nowak, M. A., Krakauer, D. C. & Dress, A. 1999 An errorlimit for the evolution of language. Proc. R. Soc. B 266,2131–2136. (doi:10.1098/rspb.1999.0898)

Nowak, M. A., Plotkin, J. B. & Jansen, V. A. A. 2000 Theevolution of syntactic communication. Nature 404,495–498. (doi:10.1038/35006635)

Nowak, M. A., Komarova, N. L. & Niyogi, P. 2001 Evolutionof universal grammar. Science 291, 114–117. (doi:10.1126/science.291.5501.114)

Okanoya, K. 2004 The bengalese finch: a window on thebehavioral neurobiology of birdsong syntax. Ann. N. Y.Acad. Sci. 1016, 724–735. (doi:10.1196/annals.1298.026)

Oudeyer, P.-Y. 2005 The self-organization of speech sounds.J. Theor. Biol. 233, 435–449. (doi:10.1016/j.jtbi.2004.10.025)

Pinker, S. & Bloom, P. 1990 Natural language and naturalselection. Behav. Brain Sci. 13, 707–784.

Pullum, G. K. & Scholz, B. C. 2002 Empirical assessment ofstimulus poverty arguments. Linguist. Rev. 19, 9–50.(doi:10.1515/tlir.19.1-2.9)

Saffran, J. R., Aslin, R. N. & Newport, E. L. 1996 Statisticallearning by 8-month-old infants. Science 274, 1926–1928.(doi:10.1126/science.274.5294.1926)

Savage-Rumbaugh, S., McDonald, K., Sevcik, R. A.,Hopkins, W. D. & Rubert, E. 1986 Spontaneous symbolacquisition and communicative use by pygmy chimpan-zees (Pan paniscus). J. Exp. Psychol. Gen. 115, 211–235.(doi:10.1037/0096-3445.115.3.211)

Slater, P. J. B. 2005 Animal communication: vocal learning.In The encyclopedia of language and linguistics (ed.K. Brown), pp. 291–294, 2nd edn. Amsterdam, TheNetherlands: Elsevier

Smith, K. 2004 The evolution of vocabulary. J. Theor. Biol.228, 127–142. (doi:10.1016/j.jtbi.2003.12.016)

Smith, K., Brighton, H. & Kirby, S. 2003 Complex systemsin language evolution: the cultural emergence of compo-sitional structure. Adv. Complex Syst. 6, 537–558. (doi:10.1142/S0219525903001055)

Stokoe, W. C. 1960 Sign language structure. Silver Spring,MD: Linstok Press.

Thomas, B. 1985 On evolutionarily stable sets. J. Math. Biol.22, 105–115. (doi:10.1007/BF00276549)

Tomasello, M. 1996 Do apes ape? In Social learning inanimals: the roots of culture (eds C. Heyes & B. Galef ),pp. 319–436. San Diego, CA: Academic Press.

Tomasello, M., Call, J., Warren, J., Frost, G., Carpenter, M. &Nagell, K. 1997 The ontogeny of chimpanzee gesturalsignals: a comparison across groups and generations. Evol.Commun. 1, 223–259.

Waddington, C. H. 1975 The evolution of an evolutionist.Ithaca, NY: Cornell University Press.

Whiten, A. 2005 The second inheritance system ofchimpanzees and humans. Nature 437, 52–55. (doi:10.1038/nature04023)

Wray, A. 1998 Protolanguage as a holistic system for socialinteraction. Lang. Commun. 18, 47–67. (doi:10.1016/S0271-5309(97)00033-5)

Zuberbuhler, K. 2001 Predator-specific alarm calls in Camp-bell’s monkeys, Cercopithecus campbelli. Behav. Ecol. Socio-biol. 50, 414–422. (doi:10.1007/s002650100383)

Zuberbuhler, K. 2002 A syntactic rule in forest monkeycommunication. Anim. Behav. 63, 293–299. (doi:10.1006/anbe.2001.1914)

Zuidema, W. H. 2003 How the poverty of the stimulus solvesthe poverty of the stimulus. InAdvances in neural informationprocessing systems 15 (Proceedings ofNIPS ‘02) (eds S.Becker,S. Thrun & K. Obermayer), pp. 51–58. Cambridge, MA:MIT Press.

Zuidema, W. H. & de Boer, B. In press. The evolution ofcombinatorial phonology. J. Phon.

Human language faculty and its evolution K. Smith & S. Kirby 3603

Phil. Trans. R. Soc. B (2008)

on January 5, 2011rstb.royalsocietypublishing.orgDownloaded from