Looking for a ‘gold standard’ to measure language complexity: What psycholinguistics and neurolinguistics can (and can’t) offer to formal linguistics

Lise Menn & Jill Duffield
Linguistics Dept / Institute of Cognitive Science
University of Colorado

Priming (short-term effects): Everything that speakers have experienced, planned or unplanned, externally or internally generated, over the preceding time intervals from milliseconds to minutes to decades, affects how hard it is for them to deploy grammar and lexicon at a given instant.

A basic reason that general real-world knowledge influences the complexity involved in producing and comprehending words and constructions is that if a hearer has good grounds for predicting what someone is likely to say (or write), they have more processing resources available to handle difficult syntax and obscure lexicon. One way to think of the moment-to-moment dynamics (the highly variable nature of what's hard and what's surprisingly easy) is in terms of reducing or increasing the competition; that is, of channeling the flow of activation more narrowly in the direction of response, or of having it spread out (when there is more competition, there is less activation going in any given direction). Having a strongly directed flow can give apparently simultaneous access to all parts of a construction, because rapid assembly looks like retrieval from storage. Conversely, being pulled in multiple directions (for example, when an utterance competes with well-established collocation patterns) slows down tasks like processing "No ifs, ands, or buts." (Duffield and Menn submitted, p. 18)

Background for our stance

Cross-linguistic work on basic morphosyntax in aphasia, and on the earliest stages of child phonology, shows that these areas are loaded with individual and language-specific differences.
Markedness keeps vanishing into the mist of unverifiability. It's no guide to complexity.
So the issue of what's simple and why, especially in those domains, has been a constant undercurrent and a frequent topic of discussion.

3/23/12 Grammatical Complexity Workshop

Menn, L. & L. K. Obler (eds.) (1990). Agrammatic Aphasia: A Cross-Linguistic Narrative Sourcebook. Philadelphia: John Benjamins.
Menn, L. & M. M. Vihman (2011). Features in child phonology: inherent, emergent, or artefacts of analysis? In N. Clements and R. Ridouane, eds. Where Do Phonological Features Come From? Cognitive, physical and developmental bases of distinctive speech categories. Amsterdam: John Benjamins.

What we're going to say: 1 to 4

1. What our brains find difficult is not always what grammars consider complex, partly because what's hard for our brains is not constant; it depends on many factors. (It's complicated.)
2. Proposed metrics for language or grammar complexity should correspond in some way to the "gold standard" of what's hard for our brains to process.
3. Language complexity measures will have to go beyond a single measure of grammar complexity, because complexity for speakers/writers is not the same as for hearers/readers.
4. Psycho- and neurolinguistics can't provide a royal road to measuring the complexity of a grammar or a language, but they do provide tools to measure processing complexity of individual sentences/utterances for speakers vs. listeners or readers, and learners vs. skilled language users.

Nicol, J., C. Jakubowicz & M.-C. Goldblum. 1996. Sensitivity to grammatical marking in English-speaking and French-speaking non-fluent aphasics. Aphasiology 10 (6): 593-622. (Both structure and frequency matter.)

What we're going to say: 5 to 8

5. Psycho- and neurolinguistic studies indicate that a valid measure of complexity will have to integrate across many linguistic levels (including semantics) and take frequency into account.
6. This implies that construction-based and usage-based approaches to grammar can provide insights into how grammars can come closer to reflecting what our brains do.
7. But: complexity measures must handle competition and how it gets resolved in both comprehension and production: the "paradigmatic axis" also plays a role in complexity.
8. Pragmatics and real-world knowledge are involved in resolving this competition. (Implications for practical applications are clear; for comparison of languages, much less so.)

Problematic examples to start from

Pourquoi l'aphasique peut-il dire "Je ne peux pas le dire" et pas "Elle ne peut pas la chanter"? (Nespoulous & Lecours 1989) [Why can the aphasic person say "I can't say it" but not "She can't sing it"?]
Possible culprits: lexical frequency, collocation frequency (formula status?), emotional weight.

Dressler's (1991) work on Breton: A speaker with fluent aphasia tends to name pictures or examples of a single object using the plural form if the object itself is most frequently found in quantity (leaves, potatoes), and using the dual if the object is usually found in pairs (eyes, hands). This goes against all notions of markedness. The relative frequency of the particular inflected form of the particular word must be the explanation.

What brains find difficult is not always what grammar and linguists consider complex.

Nespoulous, Jean-Luc, and André Roch Lecours. 1989. Pourquoi l'aphasique peut-il dire "Je ne peux pas le dire" et pas "Elle ne peut pas la chanter"? (De l'intérêt des dissociations verbales dans l'étude du comportement verbal des aphasiques). Cahiers du Centre Interdisciplinaire des Sciences du Langage, Mélanges offerts à J. Verguin. Toulouse: Université de Toulouse-Le Mirail.

Nespoulous, Jean-Luc, Chris Code, Jacques Virbel, and André Roch Lecours. 1998. Hypotheses on the dissociation between referential and modalizing verbal behaviour in aphasia. Applied Psycholinguistics 19: 311-331.

Dressler, Wolfgang. 1991. The sociolinguistic and patholinguistic attrition of Breton. In First language attrition, Herbert W. Seliger and Robert Michael Vago (eds.), 99-112. Cambridge: Cambridge University Press.

Why is there a problem?

1. What our brains find difficult is not always what grammars consider complex, partly because what's hard for our brains is not constant; it depends on many factors.

We cannot equate "simpler" with "what is learned earlier", because the neural networks change over the course of development.

Another example from aphasia

Aphasic verb tense production errors are often, as one might expect, substitutions of present tense for past tense.
But the reverse seems to be true for at least some agrammatic aphasic speakers of Arabic (Mimouni & Jarema, 1997), Polish (Jarema & Kądzielawa, 1990), and Korean (Halliwell, 2000).

Mimouni, Z., & Jarema, G. (1997). Agrammatic aphasia in Arabic. Aphasiology, 11, 125-144.
Jarema, G., & Kądzielawa, D. (1990). Agrammatism in Polish: A case study. In L. Menn & L. K. Obler (Eds.), Agrammatic aphasia, Vol. II (pp. 817-894). Amsterdam: Benjamins.
Halliwell, J. (2000). Korean agrammatic production. Aphasiology, 14, 1187-1203.

What brains find difficult is not always what grammars consider complex

This is not just about speakers with brain damage. Difficulty is tied to particular circumstances.
Production/comprehension asymmetry: the most obvious case is ambiguity. Speakers, who know what they intend to say, often produce utterances that are difficult for hearers because of ambiguity in their referring expressions ("He did it!") and elsewhere.
A long history of studies of ambiguity resolution in psycholinguistics demonstrates, and relies on, the processing difficulty caused by hearers' or readers' need to resolve ambiguity on-line.
Other studies show that speakers have to put effort into being simple for their hearers (any teacher knows this!).

With respect to ambiguity resolution in reading: Aro, M., & Wimmer, H. (2003). Learning to read: English in comparison to six more regular orthographies. Applied Psycholinguistics, 24, 621-635.
See also Ferreira, V. S. (2008). Ambiguity, accessibility, and a division of labor for communicative success. Psychology of Learning and Motivation, 49, 209-246.

What brains find difficult is not always what grammars consider complex

Learning changes the brain (how could it fail to?), creating a learner/skilled user asymmetry.

So relative simple-to-complex rankings must shift as we learn our first languages (OT calls this constraint (re-)ranking).
Phonotactics provides many uncontroversial examples.
Blevins' (1995) illustrations of syllable types: Spanish and Sedang permit CCVC but not CVCC, while the reverse is true for Klamath and Finnish.
Japanese speakers struggle with English /tr/ but routinely produce [tst] (e.g., in the place name Tsuchiura).

Blevins, Juliette. 1995. The syllable in phonological theory. In Goldsmith, John A. (ed.), The Handbook of Phonological Theory. Cambridge MA: Blackwell. 206-244.

Before we go on: An essential distinction

The grammar of a language as an abstraction across speakers: the patterns "out there" to be learned (E-language). This grammar isn't directly testable by psycholinguistics/neurolinguistics. If that grammar is your main concern, what we have to say has to be mediated by your idea of the relationship between the grammar of a language and the grammar of each speaker.
The grammar internal to a given speaker, which should be that speaker's internal approximation to the patterns "out there" (I-language). This is what we're concerned with in this presentation.

But focusing on speaker-internal grammar is only a start

Making this distinction can handle (some) differences between learners (who have a cruder approximation to that abstract grammar) and skilled users (who have a better one).
But there are more problems to deal with. One that we'll keep coming back to: if there's only one internal grammar, a single measure of its complexity can't handle the fact that what's difficult for comprehension (e.g., ambiguity, unclear reference) is not necessarily difficult for production.

It's complicated

Back to our first three points, slightly elaborated:

What our brains find difficult is not always what grammars consider complex, partly because what's hard for our brains is not constant and depends on many factors.
Proposed metrics for language or grammar complexity, at least for speaker-internal language or grammar, should correspond in some way to the empirical "gold standard" of what's hard for our brains to process.
So language complexity measures that claim to be valid metrics for what's in human minds will also have to go beyond a single measure of grammar or language complexity.

And we still have the problem cases we started with:

Some aphasic people can say "I can't say it" but not "She can't sing it."

A Breton speaker with fluent aphasia tends to name pictures or examples of a single object using the plural form if the object itself is most frequently found in quantity (leaves, potatoes), and using the dual form if the object is usually found in pairs (eyes, hands).

(Breton: Dressler 1991, reference above.)

What can (or can't) psycholinguistics & neurolinguistics offer?

What's a "gold standard"? Why do linguists need one for complexity?

A rigorous standpoint, outside of particular formalisms and levels of language, to inform proposed measures of complexity.
Needed in order to test whether a proposed metric corresponds to measures of what our brains find effortful to process, just as a proposed metric of color must correspond to some psychophysical measure of human responses to color if it's going to be useful in accounting for perception. A metric that is useful for calibrating printers may not do well at accounting for what colors people find similar.

Computational measures of utterance complexity need to be validated against processing measures, i.e., measures of performance.
Validating a particular formal analysis of processing (e.g., an analysis that can take the number of competing antecedents for a relative pronoun into account, or one that can take various aspects of frequency into account) puts us into the domains of psycholinguistics and neurolinguistics, as other speakers have already made clear.

Quotes on taking a psycholinguistic approach to grammatical complexity

"the emerging correlation between performance and grammars exists because grammars have conventionalized the preferences of performance, in proportion to their strength and in proportion to their number, as they apply to the relevant structures in the relevant language types." (Hawkins 2004)
"In order to test the hypothesis that typological distributions reflect processing complexity, an independently motivated, well-defined, and empirically assessable notion of processing difficulty is essential." (Jaeger & Tily 2011)
"not only [should] grammatical theorists be interested in performance modeling, but also empirical facts about various aspects of performance can and should inform the development of the theory of linguistic competence." (Sag & Wasow 2011)
"the competence-performance distinction acknowledges the value of the sort of work linguists do in their day-to-day research, while recognizing that this work eventually must be placed in a broader psychological context." (Jackendoff 2002)

Hawkins, J. (2004). Efficiency and complexity in grammars. Oxford University Press, New York.
Jaeger, T. Florian and Tily, H. (2011). Language Processing Complexity and Communicative Efficiency. WIREs Cognitive Science 2 (3): 323-335.
Sag, I. A. & Wasow, T. (2011). Performance-Compatible Competence Grammar. In Robert Borsley and Kersti Börjars, eds. Non-Transformational Syntax: Formal and Explicit Models of Grammar. Oxford: Wiley-Blackwell.
Jackendoff, R. (2002). Foundations of Language. Oxford University Press, New York.

Complexity measures must predict processing effort for different levels, and their interactions

Psycholinguistic experiments with normal speakers
Study of cognitive constraints and island effects (Hofmeister & Sag, 2010)
Results: Island constraints interact with other features to affect processing effort, correlating with grammaticality judgments.
WH-island violations are processed more easily when the extracted element is complex (a WH-phrase):
"Which employee did Albert learn whether they dismissed after the annual performance review?" is processed more easily than
"Who did Albert learn whether they dismissed after the annual performance review?"

Hofmeister, P. & Sag, I. (2010). Cognitive constraints and island effects. Language 86 (2): 366-415.

Measuring complexity for speakers: Error rate as an example

Studies of formation of grammatical dependencies: even subject-verb agreement can be complex!
Producing subject-verb agreement: a local noun embedded in a subject noun phrase may interfere with the production of agreement, but structure constrains that interference. (Bock & Cutting 1992)
The key [PP to the wooden cabinet] is / are
The key [PP to the wooden cabinets] is / are
(PP modifier: attraction effect)
The key [RC that ___ opened the cabinet] is / are
The key [RC that ___ opened the cabinets] is / are
(RC modifier: no attraction effect)

Agreement: Clausal boundaries encapsulate information, so while relative clauses may be structurally more complex, they don't interfere with agreement relations as much as phrases do.
Bock, K., & Cutting, J. C. (1992). Regulating mental energy: Performance units in language production. Journal of Memory and Language, 31 (1), 99-127.

And there's a glitch: Complexity measures must be compatible with different performance metrics

Speakers and listeners show different sensitivities to certain structures in processing tasks.
Error data from Bock & Cutting (1992) showed that, for speakers, relative clauses isolate information that would otherwise interfere with agreement, while prepositional phrase modifiers do not (i.e., a clause-boundary effect). Tanner (2012), by contrast, uses ERP and reading times to show no interaction of structure x local noun number x grammaticality (no clause-boundary effect).

In production (the effect of the number of the embedded noun differs across structures):
The key [PP to the wooden cabinet(s)]
The key [RC that ___ opened the cabinet(s)]
In comprehension (it does not):
The key [PP to the wooden cabinet(s)]
The key [RC that ___ opened the cabinet(s)]

Tanner, D. (2012). Structural effects in agreement processing: ERP and reaction time evidence for comprehension/production asymmetries. Paper presented at the Annual Meeting of the Linguistic Society of America. Portland, OR. January 2012.

Glitch: Complexity measures must be compatible with different performance metrics

Speakers and listeners show different sensitivities to certain structures in processing tasks.
Speaker-hearer asymmetries aren't just matters of discourse ambiguity,
which means that there can't be just one measure of the complexity of a sentence: comprehension and production may give different complexity rankings.

Glitch: Different sensitivities to structures in different processing tasks

Study of discourse and weight-based factors in relative clause extraposition: Francis & Michaelis (2012) used two different tasks to measure the combined effects of:
Definiteness: e.g., (Some/The) research
VP length: e.g., (was conducted / has been conducted fairly recently)
RC length: e.g., (that refutes the existing theories / that refutes the existing theories with very clear and convincing evidence)
Judgment task: readers saw two versions of a relative clause sentence (e.g., "Further research that indicates..." vs. "Further research has been conducted...").
Elicited production task: speakers were given three constituents (a subject NP, a relative clause, and a verb phrase) and asked to order those constituents in a full sentence.

Francis, E. & Michaelis, L. (2012). Effects of weight and definiteness on speakers' choice of clausal ordering in English. Paper presented at the Annual Meeting of the Linguistic Society of America. Portland, OR. January 2012.

Glitch: Speakers and listeners may show different sensitivities to structures in particular processing tasks

In both experiments, indefinite subjects (e.g., "Some research"), as opposed to definite ones (e.g., "The research"), were more likely to be used with extraposed relative clauses. BUT:
In the judgment (comprehension) task, readers preferred the extraposed version for longer relative clauses. VP length, however, didn't matter.
In the production task, VP length did matter: extraposed relative clauses were more likely with shorter VPs. Relative clause length, however, didn't matter in production.
No explanation for this particular pair of comprehension/production discrepancies yet.

So: We don't have a royal road to offer; how about a set of road-building tools?

What tools do we have to measure utterance complexity?

Complexity for the listener (comprehension): Neural correlates of relative effort: one current measure is Event-Related Potential (ERP). Errors, reaction time and eye-tracking can also be used. (Imaging tools, though glamorous, don't have good enough spatiotemporal resolution yet.)
Complexity for the speaker (production): Error-based measures of relative effort make up the majority of production research. Reaction time measures and eye-tracking measures are also useful.

Things you probably know about measuring performance

Available performance measures for both comprehension and production can only look at one word, utterance, or short passage at a time.
Performance always has a large random element: minds differ, and they are simultaneously busy with many things besides the task set by the experimenter (wishing for coffee, worrying about politics or the weather).
So most measures have to be averaged over fairly large sets of similar items, and sometimes over speakers as well.

Measuring complexity for listeners: Event-Related Potential as an example

An ERP study of metaphor comprehension (Lai, Curran, & Menn 2009)
Comprehending conventional and novel metaphors: Lexical semantics affects comprehension effort when structure is held constant.
Sense/nonsense judgment task, comparing listeners' ERPs for these four semantic groups:
Literal: Every soldier in the frontline was attacked.
Conventional: Every point in my argument was attacked.
Novel: Every second of our time was attacked.
Anomalous: Every drop of rain was attacked.
(Assignment of sentences to groups was checked by naïve subject ratings.)

Using a sensicality judgment task, we compared ERPs elicited by the same target word when it was used to end literal, conventional metaphorical, novel metaphorical and anomalous sentences.

Lai, Vicky T., Tim Curran, & Lise Menn. 2009. Comprehending conventional and novel metaphors: An ERP study. Brain Research 1284: 145-155.

Measuring complexity for listeners: Event-Related Potential as an example

Result: Conventional metaphors required a short burst of additional processing effort when compared with literal sentences. Novel metaphors required a more sustained effort, similar to the effort observed in anomalous sentences.
Literal: Every soldier in the frontline was attacked.
Conventional: Every point in my argument was attacked.
Novel: Every second of our time was attacked.
Anomalous: Every drop of rain was attacked.
Comprehension of metaphors involves an initial stage of mapping from one concept to another; such mappings are cognitively taxing, implying that complexity (as processing effort) involves more than structure.
ERP matches our intuitions about complexity and elaborates them.

Measuring complexity for speakers: more about agreement errors

Studies of formation of grammatical dependencies: even subject-verb agreement can be complex!
Producing subject-verb agreement: a local noun embedded in a subject noun phrase may interfere with the production of agreement, but structure constrains that interference. (Bock & Cutting 1992)
The key [PP to the cabinet] is / are
The key [PP to the cabinets] is / are
The key [RC that ___ locks the cabinets] is / are

Measuring complexity for speakers: more about agreement errors

Producing subject-verb agreement: Local nouns that are embedded more deeply are less likely to interfere with agreement production (Franck, Vigliocco & Nicol, 2002).
The threat [PP to the president [PP of the companies]] is / are
The threat [PP to the presidents [PP of the company]] is / are
So syntactic structure can directly affect production complexity, and here the more complex structure has made one component of producing this sentence easier!

Franck, J., Vigliocco, G., & Nicol, J. (2002). Subject-verb agreement errors in French and English: The role of syntactic hierarchy. Language and Cognitive Processes, 17 (4), 371-404.

Measuring complexity for speakers: Other measures

Onset latencies (how long before the speaker starts to respond): show difficulty in processing (e.g., competition between verb forms in subject-verb agreement: Haskell & MacDonald, 2003; Staub, 2009).
Eye tracking: shows the interface between high-level message formulation and sentence planning (Brown-Schmidt & Tanenhaus, 2006).
Directed elicitation of alternate forms: provides a measure of accessibility (e.g., production of optional complementizers allows speakers time to access upcoming constituents: Ferreira & Firato, 2002).

Haskell, T. R. & MacDonald, M. C. (2003). Conflicting cues and competition in subject-verb agreement. Journal of Memory and Language, 48: 760-778. Participants were asked to produce questions, e.g., "The plates for the party guests ... broken"; latencies reflect time to plan the verb.

Staub, A. (2009). On the interpretation of the number attraction effect: Response time evidence. Journal of Memory and Language, 60: 308-327. Two-choice reaction time experiment, e.g., "The gang with the dangerous rivals ... WAS/WERE"; tested whether latencies were due to agreement processing or subject planning.

Brown-Schmidt, S. & Tanenhaus, M. (2006). Watching the eyes when talking about size: An investigation of message formulation and utterance planning. Journal of Memory and Language, 54: 592-609.

Ferreira, V. S. & Firato, C. E. (2002). Proactive interference effects on sentence production. Psychonomic Bulletin & Review, 9: 795-800.

So that was the fourth point:

Psycho- and neurolinguistics can't provide a royal road to measuring the complexity of a grammar or a language, but they do provide tools to measure processing complexity of individual sentences/utterances for speakers vs. listeners or readers, and learners vs. skilled language users.

5. A valid measure of complexity will have to integrate across many linguistic levels (including semantics), and take frequency into account.

What does that mean, and what kinds of data support it? Let's break it up into sub-claims that we can examine one at a time.

5. Complexity measures must predict processing effort for multiple levels, and for interactions among levels, taking frequency into account

What valid complexity measures would have to do:
O'Grady: "the interaction of simple elements and phenomena can yield systems and effects of a qualitatively different and more complex nature."

O'Grady, William. To appear. Relative clauses: Processing and acquisition. In The acquisition of relative clauses: Functional and typological perspectives, Evan Kidd (ed.). Amsterdam: John Benjamins.

First, let's talk about processing effort for different levels and their interactions

A testable general complexity measure would have to be able to make predictions about the effort for processing short individual utterances or passages, and correctly predict the relative effort needed.

We've already seen that this effort depends partly on the choice of individual lexical items within those utterances or passages: "She can't sing it"; "Every second of our time was attacked."

5. Complexity measures must predict processing effort for different levels and their interactions

While linguists may analyze a particular utterance into several components, the mind may store it and process it as a whole, or as both whole and analyzed.
Work on idiom blends suggests that idioms are both analyzable and stored as lexical items (Cutting & Bock, 1997).
Observed speech error: "Help all you want", blended from the idioms/formulas "Help yourself!" and "Take all you want!" Idiom blends like this respect the internal structure of each component, because their surface syntax remains well-formed, so the speaker must have access to those internal structures.
The "whole vs. analyzed" opposition is much too crude, as many linguists have argued on theoretical grounds (Culicover 1999, among many others).

Idiom blends are important arguments for not assuming that idiomaticity and analyticity are opposites. Here are things that blend perfectly well syntactically, while at the same time the semantics is somewhat opaquely assigned to the parts. Formal measures of complexity are typically understood to be syntactic measures. It's not difficult to imagine some expansion of such a measure that incorporates word frequency, so that an utterance counts as more difficult if it has less frequent words. But without taking into account the particular words appearing together in a particular structure, that is, without taking constructions into account, you're still going to miss essential contributors to both types of complexity (speaker and hearer).

Cutting, J. C. & Bock, K. (1997). That's the way the cookie bounces: Syntactic and semantic components of experimentally elicited idiom blends. Memory and Cognition, 25: 57-71.

Culicover, P. W. 1999. Syntactic Nuts: Hard Cases, Syntactic Theory, and Language Acquisition. Oxford University Press.

5. Complexity measures must predict processing effort for different levels, and their interactions

Start with some intuitive evidence for constructions

Particular verbs may be specified or preferred by a particular construction, such as "mind if" in the politeness formula "Do/will [person A] mind if [event X]?" ("Do you mind if I sit here?"). Easy to process, and not completely fixed lexically. (This construction is nested in more general patterns, cf. "Would your mother have a fit if I...")
Conversely, when an unexpected word is used in a familiar construction, especially if it evokes a different construction, it can make the construction relatively harder to process ("Would you care if I sit here?").
So "effort for different levels" needs to include mixed levels, e.g., structures with some lexical items specified.

5. Complexity measures must predict processing effort for different levels and their interactions

Psycho- and neurolinguistics also tell us that the "whole vs. analyzed" opposition is too crude. We even have to go beyond constructions and talk about collocations.
Collocations with high transitional probabilities, even when the collocations don't form constructions (e.g., subject + aux collocations, "he has" or "I am"), are easier for speakers to produce. (That's why we can have contractions across the NP-VP boundary!)
Smooth flow across this boundary is well exemplified in fluent aphasias, e.g., French jargon aphasia (Lecours et al. 1981).
Learning probabilities is a subconscious (procedural) and gradual process. Expectations that A will be followed by B, or that A will occur in a given structure, become stronger over time, rather than clicking from nothing to all.

Lecours et al. 1981

Agrammatic aphasic speakers show the effect of high sequential probabilities

"forgot the wash the dishes" (intended: "forgot that she was washing the dishes")
"I like the go home." (intended: "I'd like to go home.")
Once the speaker has chosen the definite article to follow "forget" or "like", she is in trouble; both utterances plug in familiar phrases ("wash the dishes", "go home") with appropriate semantic content, but in forms that cannot follow "the". These utterances are difficult to explain in grammatical terms, because they show the article being substituted for the infinitive marker, and, even more strikingly, because the collocation V + "the" goes across the major syntactic boundary between the verb and what should be the start of its NP object.

The agrammatic aphasic speaker is SK (aka Mrs. K., Shirley), a participant for Menn in various published studies of agrammatic speech since 1991.

5. Complexity measures must predict processing effort... taking frequency into account

We can't escape dealing with usage and frequency.
In the limit, there may be no empirically testable difference between "being stored as a unit" and "having very strong, predictable links from one sub-unit to another". So if your theory doesn't permit multi-level or other kinds of complex units and/or doesn't recognize collocations that aren't constructions, that's not necessarily a problem.
What you do need, at least, is a way to incorporate item frequencies and transition probabilities into the representation of a structure after the structure has its words filled in (or during the process of getting them filled in).

5. Complexity measures must predict processing effort for different levels, and their interactions

We'll shortly consider some more evidence for constructions, as structures lying between the extremes of lexical items and full clauses.
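As a purely illustrative aside on the requirement stated above (a way to attach item frequencies and transition probabilities to a structure once its words are filled in), here is a minimal Python sketch. It is not a model proposed in this talk: the toy corpus, the function names, and the add-alpha smoothing are all assumptions made for the example. It estimates P(word | previous word) from bigram counts and reports per-word surprisal in bits, one common way of turning transition probabilities into a graded difficulty score.

```python
import math
from collections import Counter

# Hypothetical toy corpus, invented for illustration only.
TOY_CORPUS = [
    "i like to go home",
    "i would like to go home",
    "i like the dishes",
    "she forgot to wash the dishes",
    "i like to wash the dishes",
]

def train_bigrams(sentences):
    """Count how often each word is followed by each other word."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split()  # <s> marks sentence start
        vocab.update(tokens[1:])
        contexts.update(tokens[:-1])                 # words that have a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, vocab

def transition_prob(prev, word, contexts, bigrams, vocab, alpha=1.0):
    """P(word | prev), with add-alpha smoothing so unseen pairs are not zero."""
    return (bigrams[(prev, word)] + alpha) / (contexts[prev] + alpha * len(vocab))

def surprisal_profile(sentence, contexts, bigrams, vocab, alpha=1.0):
    """Return (word, surprisal in bits) for each word of the sentence."""
    tokens = ["<s>"] + sentence.lower().split()
    return [
        (word, -math.log2(transition_prob(prev, word, contexts, bigrams, vocab, alpha)))
        for prev, word in zip(tokens, tokens[1:])
    ]

contexts, bigrams, vocab = train_bigrams(TOY_CORPUS)
for utterance in ("i like to go home", "i like the go home"):
    profile = surprisal_profile(utterance, contexts, bigrams, vocab)
    total = sum(bits for _, bits in profile)
    print(utterance, [(w, round(b, 2)) for w, b in profile], round(total, 2))
```

With this toy corpus, the sequence "like the go" scores as less expected than "like to go". A sketch like this covers only word-to-word transitions; it says nothing about structure, constructions, or competition among alternatives, which the surrounding slides argue also matter.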

But the complexity of deploying a construction is not determined by construction frequency alone (e.g., Dutch word order in aphasia; Bastiaanse, Bouma & Post 2009). For example, the interpretation or deployment of a construction, such as the English subject or object cleft, may be made better or worse by the existence of similar constructions (Dick et al. 2001:772). The "paradigmatic axis" is relevant to processing complexity; more on that soon.

Bastiaanse, Roelien, Gosse Bouma, and Wendy Post. 2009. Linguistic complexity and frequency in agrammatic speech production. Brain and Language 109: 18-28.

Dick, Frederick, Elizabeth Bates, Beverly Wulfeck, Jennifer Aydelott Utmann, Nina Dronkers, and Morton Ann Gernsbacher. 2001. Language deficits, localization and grammar: Evidence for a distributive model of language breakdown in aphasic patients and neurologically intact individuals. Psychological Review 108(4): 759-788.

Purely structural complexity does affect normal and aphasic language processing. For example, Thompson & Shapiro (2007) showed that practice on structurally more complex clauses generalizes to improvement on less complex clauses, but practice on simple clauses doesn't generalize to more complex ones.

But in general, processing effort is a function of the interaction of structure and frequency at multiple levels. Let's look at a psycholinguistic study of normal speakers and a related one of aphasic speakers that demonstrate this.

Thompson, C. K., & Shapiro, L. P. (2007). Complexity in treatment of syntactic deficits. American Journal of Speech-Language Pathology, 16, 30-42.

Thompson, C. K., Shapiro, L. P., Kiran, S., & Sobecks, J. (2003). The role of syntactic complexity in treatment of sentence deficits in agrammatic aphasia: The complexity account of treatment efficacy (CATE). Journal of Speech, Language, and Hearing Research, 46, 591-607.

O'Grady, William. Relative clauses: Processing and acquisition. In The acquisition of relative clauses: Functional and typological perspectives, Evan Kidd (ed.). Amsterdam: John Benjamins.

An example of frequency/structure interaction: relative verb (subcategorization) frame frequencies create a bias (a reader's expectation) that affects readers' processing patterns and comprehension.

"Shrink", for example, has a syntactic bias towards the undergoer-subject argument structure: it is more frequently used in the unaccusative frame than in any other.
- unaccusative: "The sweater shrank two sizes."
- transitive: "They shrank the sweater two sizes."


Garnsey, Susan M.; Neal J. Pearlmutter; Elizabeth Myers; and Melanie A. Lotocky. 1997. The contributions of verb bias and plausibility to the comprehension of temporarily ambiguous sentences. Journal of Memory & Language 37(1), 58-93.

Gahl, S. & Garnsey, S. M. 2006. Knowledge of grammar includes knowledge of syntactic probabilities. Language 82(2), 405-410.

Eye movements during reading: verbs with similar syntactic biases pattern together, whereas verbs that are similar only in meaning do not (Garnsey et al. 1997).

Comprehension: clauses that conform to a verb's bias ("They proposed X", "They suggested that Y") are comprehended faster and more accurately than bias-violating sentences with the same structure ("They proposed that X", "They suggested Y") (ibid.).

Production studies supporting this: Gahl & Garnsey 2004, 2006.

Gahl, Susanne, Menn, Lise, Ramsberger, Gail, Jurafsky, Daniel S., Elder, Elizabeth, Rewega, Molly, & Audrey L. Holland. 2003. Syntactic frame and verb bias in aphasia: Plausibility judgments of undergoer-subject sentences. Brain and Cognition, 53: 223-228.


Gahl, S. & Garnsey, S. M. 2004. Knowledge of grammar, knowledge of usage: Syntactic probabilities affect pronunciation variation. Language 80(4), 748-75.

Gahl et al. 2003: People with aphasia comprehend sentences better when the verb is in its preferred frame.

Clauses with unaccusatives (undergoer-subjects) can be hard or easy, depending on how typical it is for the verb to be used in the unaccusative construction.
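
The notion of a verb's frame bias can be stated as simple arithmetic: the share of a verb's uses that occur in each frame. The sketch below is a minimal illustration, assuming made-up counts; the numbers for "shrink", the frame labels, and the function names `frame_bias` and `preferred_frame` are hypothetical and are not taken from Gahl et al. 2003 or Garnsey et al. 1997.

```python
def frame_bias(frame_counts):
    """Convert raw frame counts for one verb into relative frequencies."""
    total = sum(frame_counts.values())
    return {frame: n / total for frame, n in frame_counts.items()}


def preferred_frame(frame_counts):
    """Return the frame in which the verb occurs most often."""
    return max(frame_counts, key=frame_counts.get)


# Invented counts for illustration only (hypothetical, not corpus data).
shrink_counts = {"unaccusative": 60, "transitive": 30, "other": 10}

bias = frame_bias(shrink_counts)
print(preferred_frame(shrink_counts))  # frame with the highest count
print(bias["unaccusative"])            # share of uses in that frame
```

Under these invented counts, a clause like "The sweater shrank two sizes" matches the preferred frame, while "They shrank the sweater two sizes" violates it; the prediction discussed above is that the first is easier even though both are simple clauses.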

This supports a processing model that has relatively direct semantic construal of the verb frames of simple clauses, rather than indirect construal involving, e.g., traces.

5. ... taking both structure and frequency into account

The accessibility of a lexical item depends not only on its frequency, but also on the density of its phonological and semantic neighborhoods (e.g., Gordon 2002; Mirman and Magnuson 2008), among other factors. Although there is no standard approach to defining constructional neighborhood density, constructions within a family of constructions sharing similar properties, such as subject-gap relative clauses and clefts sharing SVO or NVN word order, or the family of constructions having filler-gap dependencies (Sag 2010), may be subject to neighborhood effects similar to those of lexical items (footnote 1, Duffield & Menn submitted).
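
For lexical items, a common operationalization of phonological neighborhood density counts the words that differ from the target by one substituted, added, or deleted segment. The sketch below illustrates that count under simplifying assumptions: letters stand in for phonemes, the lexicon is a toy list invented here, and the function names are hypothetical. It says nothing about how a constructional neighborhood should be defined, which remains open.

```python
def is_neighbor(a, b):
    """True if b differs from a by exactly one substitution, insertion, or deletion."""
    if a == b:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    shorter, longer = sorted((a, b), key=len)
    # Length differs by one: deleting one segment of the longer form
    # must yield the shorter form (insertion or deletion).
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))


def neighborhood_density(word, lexicon):
    """Number of lexicon entries that are one-segment neighbors of word."""
    return sum(is_neighbor(word, other) for other in lexicon)


# Toy lexicon, spelled orthographically for illustration (hypothetical data).
toy_lexicon = ["cat", "cot", "cut", "at", "cast", "dog", "coat"]
density = neighborhood_density("cat", toy_lexicon)
print(density)
```

A denser neighborhood means more candidates are partially activated alongside the target, which is the sense in which neighborhood density bears on competition.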

The interaction of levels and the effect of frequency on complexity also support our sixth point:

6. Construction-based and usage-based approaches to grammar can provide insights into how grammars can come closer to reflecting what our brains do.

Finally, let's look at evidence for our last two points.

Complexity measures must be compatible with different performance metrics

A valid measure of complexity will have to integrate across many linguistic levels, and furthermore, there can be no single performance metric. Why not?
- What is effortful for speakers does not always match what is effortful for hearers.
- Speakers and hearers have been shown to be sensitive to different structural features of the same utterance types.

Therefore: the goal of an overall complexity measure needs to be split into sub-goals; several roughly commensurable but sometimes incompatible measures are required. Let's look at some experimental evidence.

Identifying referents: production

Cognitive demands can swamp information that would improve referential success (Wardlow Lane & Ferreira, 2008).


Speakers are asked to name target objects (in the common ground) so that listeners can identify them, while being faced with privileged objects of varying saliency. Speakers could name the target object by using only information that is common ground (e.g., "the heart") or by also using privileged information (e.g., "the small heart") that is useless and possibly confusing to the hearer.

Wardlow Lane, L., Ferreira, V.S., 2008. Speaker-external versus speaker-internal forces on utterance form: Do cognitive demands override threats to referential success? Journal of Experimental Psychology: Learning, Memory, and Cognition 34 (6), 1466-81.

Results: When the saliency of privileged information was increased, speakers made more reference to it (e.g., identifying the target object as "the small heart") than when such information was less salient, despite the risk of confusing the listener. In this task, the cognitive demands mean that producing the longer, more complex description imposes a lower processing load on the speaker.

Identifying referents: comprehension experiment using eye-tracking (following the listeners' successive visual fixations)

Visual context affects ambiguity resolution (Spivey, Tanenhaus, Eberhard & Sedivy, 2002).

Four conditions: Listeners heard either "Put the apple on the towel in the box" or "Put the apple that's on the towel in the box".

Sometimes what they saw was this:

A. "on the towel" is redundant: there's only one apple.

And sometimes what they saw was this:

B. "on the towel" is relevant: there are two apples.

Spivey, Michael J., Tanenhaus, Michael K., Eberhard, Kathleen M., & Sedivy, Julie C. 2002. Eye movements and spoken language comprehension: Effects of visual context on syntactic ambiguity resolution. Cognitive Psychology 45, 447-481.

"Put the apple on the towel in the box."

Listeners' eye movements were recorded to see whether they treated the PP in the instruction (e.g., "on the towel") as a goal, in which case they would look to the empty towel, or as a modifier, in which case they would look to the apple on the towel and not to the empty towel.


Displays: A. redundant information (one apple); B. relevant information (two apples).

Participants heard instructions in one of these two forms:
- temporarily ambiguous: "Put the apple on the towel in the box", or
- unambiguous (control): "Put the apple that's on the towel in the box"

Results: In the one-referent visual condition (A), listeners were more likely to look towards the empty towel (that is, to treat the PP "on the towel" in the temporarily ambiguous instruction as a goal) than in the two-referent condition.


Results, part 2: In the two-referent condition (B), there was no difference between the responses to the temporarily ambiguous "Put the apple on the towel in the box" and the unambiguous "Put the apple that's on the towel in the box". Participants looked to the correct apple, and then to the correct goal (the box), but not to the false goal (the towel).


So the visual context (the second apple in display B) resulted in a lower processing load for the listeners who heard the temporarily ambiguous (garden-path) instruction "Put the apple on the towel in the box". The structural ambiguity (does "on the towel" modify "the apple" or describe what is to be done with it?) apparently added no complexity or processing load for them.

So we've seen that both speaker and hearer are influenced by their environments.
- Speakers may produce something that's more complex because it's easier to say in the context, even though that context can make the utterance more difficult for the listener (Wardlow Lane & Ferreira, 2008).
- Listeners can easily comprehend something that's more complex by making use of the visual context (Spivey et al., 2002).

7. Complexity measures must handle competition and how it gets resolved in both comprehension and production

What valid complexity measures would have to do:

Processing is strongly influenced by competition; normal and aphasic speakers both produce substitution errors and blends.
- Competing lexical items (substitutions):
  - addressing a child using the name of their sibling
  - tip-of-the-tongue states ("Bhatnagar... Baharav... Bhuvana!")
  - aphasic word search: "my right side was blunt- lump- trip- slip-"
- Competing members of an inflectional paradigm
- Competing interpretations ("visiting relatives", garden paths)
- Competing possible continuations of a partially produced structure

An English-speaking aphasic speaker struggles: "And the boy give to a cookie ... the boy give to girl a cookie", apparently torn between the double-object and prepositional constructions.

There are more errors of article choice among people with agrammatic aphasia in German than in Italian. A German speaker struggles to find the right form: "die...der...das...die...den...den Hund" (Bates, Wulfeck, & MacWhinney 1991).

There are more verb inflection errors among people with agrammatic aphasia in languages with richer verb paradigms.

So the paradigmatic axis, as well as the structural syntagmatic axis, affects the complexity of deploying a given structure.

Bates, Wulfeck, & MacWhinney 1991. Crosslinguistic Research in Aphasia: An Overview. Brain and Language, 41, 123-148.

8. Pragmatics/real-world knowledge is involved in resolving this competition

What valid complexity measures would have to accommodate:
- Pragmatics has an enormous effect on processing.
- We already saw this in the metaphor processing study.
- Other speakers here have also made this point.

Let's look at a clear, reasonably controlled example.

Suggested:
Kaiser & Trueswell (2004). The role of discourse context in the processing of a flexible word-order language.
Mak, Willem, Wietske Vonk & Herbert Schrieffers. 2008. Discourse structure and relative clause processing. Memory and Cognition 36(1): 170-181.
Gibson, Edward (1998). Linguistic complexity: locality of syntactic dependencies. Cognition 68: 1-76.

What's this? [picture 1]

What's this? [picture 2]

What's this? [picture 3]

8. Pragmatics/real-world knowledge is involved in resolving this competition

Competing perspectives add to the difficulty of choosing among truth-value-equivalent constructions.

"There's a table with a lamp", "A lamp on a table": we're rarely tempted to say "There's a table under the lamp"!

"There's a bed with a pillow at the wrong end" or "A pillow at the foot of a bed": we don't hesitate to choose the larger object as the location.

In the third picture, we're slowed up by the four-way competition among possible orders of mention and choices of location:
- The footstool is behind the armchair.
- The armchair has a footstool behind it.
- There's a footstool with an armchair in front of it.
- There's an armchair in front of a footstool.

Performance data like these lead us to conceptualize complexity as requiring an integration of fluctuating levels of competition among different types of structures, varying from moment to moment as well as from speaker to speaker.

Wrapping up

To evaluate complexity measures for speaker-internal grammar, we need studies with systematic variation of structures and lexical items that would let us zero in on how construction frequency, lexical frequency, and other sources of processing difficulty interact. These should be carried out with normal speakers using ERP or other sensitive measures of processing load, because the number of linguistic variables is so high that a good design would impose serious burdens on a speaker with aphasia.

Implications for a valid complexity measure

Point 1: The need to work across unit sizes/levels and to predict the interaction of frequency and structure (including transition probabilities) argues for framing a processing account in terms of a formalization of grammar that takes surface structure (constructions) and usage into account.

Point 2: Compatibility with different performance metrics implies that the goal of creating a formal processing complexity metric needs to be split into sub-goals for comprehension, production, and learning.

Point 3: That competition adds to complexity implies that the paradigmatic axis, as well as the structural syntagmatic axis, affects the complexity of a given structure.

Point 4: The need to deal with the impact of real-world situations on processing suggests that any formal metric, even one that satisfies the above requirements, will be incomplete. But that doesn't mean it won't be interesting and useful!

How might a formal system handle this?

The construction grammar approach treats language as an inventory of constructions, both lexical and combinatoric: abstract entities that are the loci of constraints on the interface of form and meaning (Sag, Boas, & Kay 2012).

Each construction has specific information about its properties; we don't assume generalizations across utterances simply because of structural similarity. For example, there are unique properties of various filler-gap constructions (Sag, 2010). By using a type hierarchy, we can represent different grains of constructions.

Sag, Ivan A., Hans C. Boas, and Paul Kay. Introducing Sign-Based Construction Grammar. To appear in H.C. Boas and I.A. Sag (Eds.), Sign-Based Construction Grammar. Stanford: CSLI Publications. "Final" version of March 8, 2012, downloadable from http://lingo.stanford.edu/sag/publications.html

Sag, I. A. (2010). English filler-gap constructions. Language, 86(3), 486-545.

Furthermore, constructions contain more than just syntactic information: they contain semantic and pragmatic information as well, thus formalizing these elements.

These features may allow us to deal with the interaction of different levels (morphological, lexical, syntactic, semantic, and pragmatic), although the approach does not directly address the issue of frequency and transition probabilities.

The unique specification of construction properties, organized in a type hierarchy, could allow us to predict certain types of competition. For example, if two constructions are specified for the same lexical item, argument structure, or co-occurrence restrictions, we would expect there to be competition between those two constructions, although the competition itself is not formally represented.

It's complicated
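
The suggestion that a type hierarchy could let us predict competition from shared specifications can be sketched in a few lines. This is a toy illustration only, not Sign-Based Construction Grammar: the class `Construction`, the feature names, and the two "give" constructions are hypothetical stand-ins chosen to echo the double-object vs. prepositional example from the aphasia data.

```python
from dataclasses import dataclass, field


@dataclass
class Construction:
    """A node in a toy type hierarchy of constructions (hypothetical)."""
    name: str
    parent: "Construction | None" = None
    features: dict = field(default_factory=dict)

    def all_features(self):
        """Features inherited from supertypes, overridden by more specific types."""
        inherited = self.parent.all_features() if self.parent else {}
        return {**inherited, **self.features}


def shared_specifications(a, b):
    """Features on which two constructions carry identical values."""
    fa, fb = a.all_features(), b.all_features()
    return {k: fa[k] for k in fa.keys() & fb.keys() if fa[k] == fb[k]}


# Hypothetical fragment: two subtypes that inherit the same verb and roles.
transfer = Construction(
    "transfer-clause",
    features={"verb": "give", "roles": ("agent", "recipient", "theme")},
)
double_object = Construction("double-object", transfer, {"order": "V NP NP"})
prepositional = Construction("prepositional-dative", transfer, {"order": "V NP to NP"})

overlap = shared_specifications(double_object, prepositional)
print(sorted(overlap))  # the specifications the two constructions share
```

Here the two subtypes share verb and role specifications but differ in order, which is the configuration in which competition would be expected. As noted above, the competition itself, and any frequency weighting, would still have to be added on top of such a representation.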

Thank you!

Special thanks to Laura Michaelis-Cummings for steering us to good readings and giving feedback.
