Upload
chapa
View
32
Download
0
Tags:
Embed Size (px)
DESCRIPTION
Elicitation Experiments in Language Acquisition. Sonja Eisenbeiss (University of Essex) [email protected]. Overview. I: Elicitation and other Types of Production Data II: Semi-Structured Elicitation and Stimulus Materials III: Production Experiments See Eisenbeiss (2009). - PowerPoint PPT Presentation
Citation preview
Elicitation Experiments in Language Acquisition
Sonja Eisenbeiss (University of Essex)
Overview
I: Elicitation and other Types of Production
Data
II: Semi-Structured Elicitation and Stimulus
Materials
III: Production Experiments
See Eisenbeiss (2009)
Part I: Elicitation and other Types of Production Data
• naturalistic data
• semi-structured elicitation
• production experiments
• transcription and data analysis
Naturalistic Data
• other term: spontaneous speech data
• recording of ongoing communicative events (free play, dinner table conversation,…)
Advantages
• age-independent• no special task-demands, thus high
ecological validity• frequency information available• input-analysis possible• analysable for different
phenomena
Problems
• low comparability
• underestimation of productivity due to recurrent situations which require similar linguistic encoding
• lack of data for low-frequency phenomena (morphemes, constructions,…)
NPs in German Child Language(Eisenbeiss 1994, 2003)
child
age files (w. elicit.) utterances
AND 2;1 1 1.4500
ANN 2;4-2;9 6 1.977
CAR 3;6 1 1.795
HAN 2;0-2;8 8 1.399
LEO 1;11-2;11 15 (all) 4.383 (4.383)
MAT 2;3-3;6 18 1.978
SVE 2;9-3;3 15 (10) 3.811 (2814)
total 1;11-3;6 64 (20) 16.793 (7.197)(Clahsen 1982, Wagner 1985, Clahsen et al. 1990)
NPs in Spontaneous Speech
Noun
phrase
with
context for
numbe
r
% of
utterances
correlation with
mean length of
utterance
article 2.646 28 0.489; sign.
adjective +
article
249 3 0.274; n.s.
possessive
‘s
19 <1 -0.097; n.s.
Problems for Naturalistic Corpus Studies on NPs
•Some NP types are comparatively frequent and become more frequent with increasing utterance length (MLU).
•BUT: Some NP types (e.g. those with contexts for adjective+article or possessive ‘s) :
•are rare•occur in some files, but not in others•do not become more frequent over time (though children’s utterances get longer)
Semi-Structured Elicitation• Encouraging speech production in a
naturalistic (often game-like) setting.• e.g. eliciting complete sentences with
the verb to give in an "animal feeding game": participants have to feed toy animals and explain which food items they would like to give to which animals (Eisenbeiss 1994)
• often used as supplements to naturalistic data or experiments
Advantages
• appropriate for very young learners (<2)• high comparability• no underestimation of productivity due
to recurrent situations which require similar linguistic encoding
• no underestimation of productivity due to high task demands
• data for low-frequency phenomena (morphemes, constructions,…)
• analysable for different phenomena
NPs with Adjective+Article (Eisenbeiss 2003)
For instance:
• picture-matching game: putting picture cards on a board with pictures (red balloon, blue ballon,...)
What is this ? This is ... (NOM).What goes onto this picture here? (NOM)What do you want to put here (ACC)
MLU
4.54.03.53.02.52.01.51.0
% a
ller
an
aly
sier
bar
en Ä
uß
eru
ng
en25
20
15
10
5
0
KORPUS
sve
sve
mat
leo
han
car
ann
and
NPs with Adjective+Article
% o
f analy
sab
le
utte
ran
ces
child
black symbols: files with elicitation
NPs with possessive –s (Eisenbeiss 2003)
For instance:
• possession-matching game: assigning possessions (depicted on cards) to people (depicted on board)
Whose bicycle is this? This is ...
MLU
4.54.03.53.02.52.01.51.0
% a
ller
an
aly
sier
bar
en Ä
uß
eru
ng
en6
5
4
3
2
1
0
KORPUS
sve
sve
mat
leo
leo
han
car
ann
and
NPs with possessive -s
% o
f analy
sab
le
utte
ran
ces
child
black symbols: files with elicitation
Problems
• age-dependent (usually at least 1;6 years)
• no frequency information available
• no input-analysis possible
Experiments
• Systematic control of variables (properties of participants and stimulus materials)
• Standardized procedures
• Limited range of response options
Advantages
• high comparability
• data for low-frequency phenomena (morphemes, constructions,…)
• no underestimation of productivity due to recurrent situations which require similar linguistic encoding
Problems
• age-dependent (usually at least 3 years)
• underestimation of productivity due to comparatively high task-demands
• no frequency information available
• no input-analysis possible
• analysis restricted to target phenomenon
Transcription and Data Analysis• Naturalistic Data and Semi-Structured
Elicitation Games require Transcription.• The most common format is the CHAT-
format, developed for the largest child language data deposit: CHILDES (http://childes.psy.cmu.edu/).
• Digital CHAT-files can be searched using the CLAN tools provided by CHILDES.
• All three data types require “error”-analysis, i.e. classifying deviations from the target.
Transcription: CHAT-Format• Transcripts are written in a text editor
and stored as unformatted ASCII files (text only or plain text).
• All lines are ended by a carriage return (ENTER).
• Every transcript must begin with the line: @BEGIN and end with the line: @End.
• Between @BEGIN and @End:– headers with information about the
transcript (obligatory: @Participants)– main tier for transcription– dependent tiers for further annotations
CHAT-Format: Basic Structure
• @BEGIN• @Participants• [other headers]• *JOE: [spoken material]• %mor: [morpho-syntactic coding]• *INT: [spoken material]• %mor: [morpho-syntactic coding]• @End
CHAT-Format: Headers• three letters followed by a colon and a tab• obligatory: @ Participants, on the second
line of the transcript; e.g.:@Participants: JOE Joe child, INT Interviewer
• optional; e.g.:– @Birth of Learner: …– @Age of Learner: …– @Date: … – @Language: …– @Transcriber: …
CHAT-Format: Main Tiers• what was actually said, one utterance per
tier• introduced by "*", the three-letter code for
the participant and a tab; e.g.: *JOE: the boy put the leash on the cat.
• orthographical transcription in lower case Latin letters; except for proper nouns (e.g. John) and "I"
• numbers spelled out (ten, not 10) • normalisation of phonetically deviant forms
(phonetic information about forms can be presented on a %pho dependent tier)
Main Tiers: Markers• unfilled pauses: #• filled pauses: eh@fp• interruption: +/.• self-interruption: +//.• repetition w/o correction: [/]• repetition with correction: [//]• unintelligible speech: xxx• material coded on phonol. tier: yyy• doubtful material: [?] or
[=? text]• omitted parts of words: ( )• to refer to more than one word: < >
CHAT: Dependent Tiers
further annotations, e.g.
• %mor [morphosyntactic coding]• %pho [phonological coding]• %syn [syntactic coding]• %err [errors]• %com [comments]• %spa [speech acts]
CLAN: Windows• the commands window where you specify
the folders, files, and commands you want to use
• the CLAN output window, where you will see the results of your searches. If you have not specified an output file, your results will be displayed in this window. If you have saved your outputs into a file (as you will be asked to do for this exercise), you will not be able to see it in the output window, but the name and location of the output file will be displayed in the output window.
CLAN: Command Window
CLAN: Steps
• specify your working and lib directory, where the files you will be working with are stored
• specify your output directory, where any output files will be stored
• select a command • select one or more transcription files for
analysis• optionally use some so-called switches to
modify the commands.
CLAN: Core Commands
• FREQ: will provide you with type and
token frequency information
• COMBO: will find utterances matching a given set of criteria
• MLU: will calculate the MLU (mean length of utterance)
CLAN: Useful Switches+f saves output to file. For each transcription that
you have chosen to analyse, an output file will be
generated. By default, this output file will have the name of the transcription file and an extension that will show you which command
was used to create the output (e.g. frq, mlu or cmb).
+s searches for a string in a file. +t restricts the search to a particular tier – e.g.
the tier of a particular speaker.+u treats all files together.+o orders FREQ lists according to token frequency+w –w1 and +w1 provide one preceeding/following
line, -w2 and +w2 will provide two preceeding/following lines, etc.
CLAN: Search Strings
^ immediately followed by+ inclusive OR! logical NOT* “joker” “” strings including
blanks, etc. should be put in quotes
CLAN: Search Strings
^ immediately followed by+ inclusive OR! logical NOT* “joker” “” strings including
blanks, etc. should be put in quotes
Error Analysis
• % suppliance in obligatory contexts (e.g. % –ed in past tense contexts) vs. % correct use of a particular form (e.g. % correct use of all present tense forms)
• errors of omission (e.g. two mouse) vs. errors of comission (two mouses)
• overregularisations (e.g. sing-singed) vs. overirregularisations (e.g. bite-bote)
Part II: Semi-Structured Elicitation and Stimulus
Materials• Interactional Setting
• Target Type
• The Puzzle Task
• Stimulus Type
• Transcription and
Analysis
Interactional Setting• Director/matcher (or “confederate description”):
A “director” describes a scene/object etc. and a “matcher” who is not able to see this scene/object, has to recreate it.E.g.: The matcher has to build a toy house identical to the one created by the director who is hidden behind a screen.
• Speaker/Listener: A speaker provides information for someone who does not have access to this information. E.g.: The speaker retells a story (s)he heard/read while the listener was not in the room.
• Co-Players:All participants are involved in a game and provide each other with information to co-ordinate their actions. E.g.: The players are involved in a construction or puzzle game where not everyone has access to all pieces.
Target Type
• broad-spectrum (generally encouraging participants to speak)
• form-focused: the use of a particular form or construction
• meaning-focused: the linguistic encoding of a particular function or meaning (which can be encoded in different ways)
Broad-Spectrum• Frog story: a picture book w/owords used
to elicit narratives (Berman/Slobin 1994)
• Bag task: a bag with bag for blocks and animals of different sizes and colours. The bag has pockets that match the animals in colour and have coloured buttons, ties, etc.; and children frequently refer to colours, sizes and locations when they ask other players to help them hide or find animals in the pockets (Eisenbeiss 2009b)
Form-focused
• Picture-matching game: aimed at the elicitation of noun phrases with adjectives in different case contexts (Eisenbeiss 1994), see part I
• Possession-matching-games: aimed at the elicitation of adnominal possessive constructions (Eisenbeiss 1994)
Meaning-focused
• “circle of dirt”: a picture book w/owords used to elicit descriptions of part-whole relationships and actions affecting (body) parts (Eisenbeiss and McGregor 1999):
• “cut-and-break”: video stimulus created for cross-linguistic studies of “separation and material destruction” events (Bohnemeyer, Bowerman and Brown 2001).
The Puzzle Task (Eisenbeiss 2009b)
• a task with co-players: child describes contrasting pictures on a puzzle board, adult finds the matching pieces, child puts them into the correct cut-out
• exchangable pictures and puzzle pieces
• can be used to elictit particular forms or to elicit the linguistic encoding of particular meanings
Eisenbeiss/Matsuo (2005)Eisenbeiss et al. (2009)
• German: 39 recordings(picture descriptions and asking for pieces)1286 utterances with/without V 21 children (3;7-6;6)
• Japanese:67 recordings (asking for pieces) 421 utterances with V 16 children (2;11-6;5)
Elicitation Material: give
Elicitation Material: bite
Elicitation Material: wash
Elicitation Material: put on
German: The Use of Verbs (%)
GIVE BITE WASH PUT
GIVE 60 0 < 1 < 1
BITE 0 62 0 0
WASH 0 0 70 0
PUT 0 0 0 48
other 25 16 14 27
no V 14 22 15 25
GAME
Verb
Error Types
•More contexts for errors but no qualitative differences in error types observed so far
•For instance, PPs instead of IOs:
• naturalistic (Carsten 3;6): für'n papa sollste aber den schenken for the daddy shall PART this-one give
• elicited (Jannik 6;4): da gibt das baby fuer das schaf ehm den gras. there gives the baby for the sheep ehm the grass
Overt Arguments in Japanese (%)
0102030405060708090
100
Jun (3;8) Puzzle(</=3:8)
Puzzle (all)
subjectobject
Jun: naturalistic data
Stimulus Type
• pictures
• photos
• computer animations
• videos
• toys
• real objects
Pictures
• better for descriptions of static objects/properties than for event descriptions and verb elicitation
• requires knowledge of artistic conventions (e.g. lines for movement, shading etc.)
• can be used for “unrealistic” events (e.g. animals in different colours or positions)
• comparatively easy to create with clip art and standard software
• comparatively easy to modify for minimal variations (e.g. colour)
Photos• better for descriptions of static objects/properties
than for event descriptions and verb elicitation
• do not require knowledge of artistic conventions and are comparatively “natural”
• problematic for “unrealistic” events (e.g. animals in different colours or positions)
• comparatively easy to create and manipulate with standard photo equipment and Photoshop or similar software
• cannot be as easily modified for minimal variations (e.g. colour) as pictures, but possible in principle
Computer Animations• better for descriptions of dynamic events
and verb elicitation than for descriptions of static objects/properties
• not very naturalistic, in particular when it comes to natural movements of people and animals
• can be used for “unrealistic” events (e.g. funny movements of animals)
• difficult and time-consuming to create
• good control for minimal variations (e.g. direction or manner of motion)
Videos• better for descriptions of dynamic events and verb
elicitation than for descriptions of static objects/properties
• comparatively natural
• cannot easily be used for “unrealistic” events
• comparatively simple to create with standard digital video equipment and editing software (Adobe etc.)
• cannot be as easily modified for minimal variations as computer animations (e.g. direction or manner of motion) because “actors” tend to introduce unwanted modifications
Toys• e.g. stuffed animals, cars, blocks
• appropriate for descriptions of dynamic events and verb elicitation as well as for descriptions of static objects/properties
• not completely naturalistic, in particular when it comes to natural movements of people and animals
• often very culture-dependent
• can be used for “unrealistic” events (e.g. funny movements of animals)
• usually easily obtainable
• object properties can be fairly well controlled, but for dynamic events, actions of toy “actors” tend to introduce unwanted modifications
Real Objects• e.g. tools, household items like pots, dishes
• appropriate for descriptions of dynamic events and verb elicitation as well as for descriptions of static objects/properties
• very naturalistic
• basic objects are often less culture-dependent than toys
• can be used for realistic and “unrealistic” events (e.g. funny movements of pots)
• usually easily obtainable
• object properties are a bit harder to control than for toys and modifications might reduce naturalness (e.g. atypical colours for household items)
• for dynamic events, manipulations of the objects tend to introduce unwanted modifications
Part III: Production Experiments
• elicited imitation• elicited production• speeded production• syntactic priming • input/feedback manipulations • eye-tracking and speech production
Elicited Imitation
Participants are asked to imitate spoken sentences. Thus, it is clear what learners should say; and when stimuli are sufficiently long and complex, participants cannot simply memorise them as a whole, but have to employ their own grammar to recreate them. Thus, a comparison of the target utterance and the learner’s actual production can shed light on the grammatical knowledge that learners employ to express a given meaning.
Elicited Imitation: Pro & Con Advantages
• easy to carry out
• clear target for comparisons
Problems:
• good performance might be due to simple
memorisation.
• errors might be due to a lack of vocabulary knowledge, memory limitations, etc.
Elicited Production
Participants receive a prompt to produce a form (e.g. This is a door. These are two …?). Alternatively, learners can be instructed to turn a given sentence into a question, a negated sentences, etc. (e.g. I'll say something and then you say the opposite).
Elicited Production: Pro & Con Advantages independence of memorised models
Problems: • requires reliable prompts • unclear influence of participants’ earlier
experience - unless novel words are used (This is a wug. These are two …?, Berko 1958).
• performance errors due to task difficulties (especially with novel words)
Berko (1958)
Speeded Production The frequency of stimulus items is manipulated
to study storage and computation in learners’ developing mental lexicon: If a morphologically complex form such as walk-ed is stored as a whole, then high-frequency forms should have stronger memory traces, due to additional exposure. Hence, they should be retrieved and produced faster than low-frequency forms. In contrast, if morphological complex forms are computed from stems and affixes, production latencies should only be affected by the frequency of these components (e.g. walk and –ed), not by the frequency of the complex form (e.g. walk-ed).
Speeded Production: Pro & Con
Advantages: can provide information about storage and
computation
Problems:• requires highly reliable prompts• requires frequency information for the
words in the variety the learner is exposed to (e.g. child directed German,…)
• requires reaction-time equipment
Clahsen et al. (2004)
Elicitation of German participles with a computer game involving an alien: overgeneralisations of regularly inflected participles inflection and a frequency effect for irregulars only; which suggests that irregular word forms are stored as wholes, while regularly inflected word forms are computed.
Syntactic Priming
Speakers tend to repeat syntactic structure across otherwise unrelated utterances (Branigan 2007). For instance, speakers are more likely to use a passive after hearing or producing a passive sentence than after an active sentence. If learners show this effect, this indicates that they have acquired a grammatical representation that can be activated by priming.
Syntactic Priming: Variants
• presentation of primes:• between groups• in blocks • alternating
• required participant response to primes:• listening only• listening and repeating
• lexical overlap between prime and target:• no lexical overlap• same verb or head noun
Syntactic Priming: Pro & Con
Advantages: provides insights into representations
Problems:requires pairs of alternative structures that speakers could use (e.g. active/passive)
Input/Feedback Manipulation
The effect of different types of input (e.g. models) or feedback (explicit corrections, reformulations, etc.) on speakers’ (elicited) production is studied.
Input/Feedback Manipulation: Pro & Con
Advantages: can provide information about the role of input/feedback
Problems:dependency of corrective feedback on the occurrence of errors
Eye-Tracking and Speech Production
Eye-movements are monitored while participants describe pictures and videos. This allows us to investigate the amount of visual information required to start speaking and the planning processes involved in speech production (pre-viewing scenes, aligning speech and vision, etc.).
Dobel et al. (2009)
Eye-Tracking and Speech Production: Pro & Con
Advantages: can provide information about the role of
visual information and speech planning
processes
Problems:•requires specialised equipment•requires good knowledge of visual
processing
Conclusion
Converging Evidence
References IBerko, J. 1958. The child's learning of English morphology.
Word 14, 150-177.Berman, R. A., & Slobin, D. I. (1994). Relating Events in
Narrative: A Crosslinguistic Developmental Study. Hillsdale, NJ: Lawrence Erlbaum.
Bohnemeyer, J , Brown, P. & Bowerman, M. 2001. Cut and Break Clips. In ‘Manual’ for the field season 2001, Levinson, S.C. & Enfield, N.J. (eds), 90-96. Nijmegen: Max Planck Institute for Psycholinguistics.Branigan, H. (2007). Syntactic Priming. Language and Learning Compass: 1, 1-16.
Clahsen, H. 1982. Spracherwerb in der Kindheit. Eine Untersuchung zur Entwicklung der Syntax bei Kleinkindern. Tübingen: Narr.
References IIClahsen, H., Hadler, M. & Weyerts, H. 2004. Speeded
production of inflected words in children and adults. Journal of Child Language 31: 683-712.
Clahsen, H., Vainikka, A. & Young-Scholten, M. 1990. Lernbarkeitstheorie und Lexikalisches Lernen. Eine kurze Darstellung des LEXLERN-Projekts. Linguistische Berichte 130, 466-477.
Dobel, C., Glanemann, R., Kreysa, H., Zwitserlood, P., Eisenbeiss, S. (2009) Visual encoding of meaningful and meaningless events. Submitted.
References IIIEisenbeiss, S. 1994. Elizitation von Nominalphrasen und
Kasusmarkierungen. In: Sonja Eisenbeiß, Susanne Bartke, Helga Weyerts & Harald Clahsen (Eds.), Elizitationsverfahren in der Spracherwerbsforschung: Nominalphrasen, Kasus, Plural, Partizipien. (Arbeiten des Sonderforschungsbereichs 282, Nr. 57). Düsseldorf: Heinrich-Heine-Universität, 1-38.
Eisenbeiss, S. 2003: Merkmalsgesteuerter Grammatikerwerb: Eine Untersuchung zum Erwerb der Struktur und Flexion von Nominalphrasen. Dissertation; University of Duesseldorf. http://diss.ub.uni-duesseldorf.de/ebib/diss/file?dissid=1185
Eisenbeiss, S. 2009a: Production Methods. Ms. University of Essex
Eisenbeiss, S. 2009b: Semi-Structured Elicitation Tasks. Ms. University of Essex (to appear in Essex Research Reports in Linguistics: http://www.essex.ac.uk/linguistics/errl/)
References IVEisenbeiss, S., Matsuo, A. 2003. External and Internal
Possession - A Comparative Study of German and Japanese Child Language. Paper presented at the 28th Annual Boston University Conference on Language Development, Boston University Wagner, Klaus R. (1985). How much do children say in a day? Journal of Child Language 12, 475-487.
Eisenbeiss, S. & Matsuo, A. 2005. Eliciting Language Production Data from Young Children. Presentation at the Xth International Congress for the Study of Child Language, Berlin, Germany
Eisenbeiss, S., Matsuo, A. & Sonnenstuhl, I. (2009): Learning to Encode Possession. Submitted.
Eisenbeiss, S. & McGregor, W.B. 1999. The circle of dirt. Ms. Max-Planck-Institute for Psycholinguistics, Nijmegen.
Further Readings IBreakwell, G.M., Hammond, S., Fife-Schaw, C. (eds.)
2003. Research Methods in Psychology. London: Sage Publications.
Field, A., Hole, G. 2003. How to Design and Report Experiments. Sage Publications.
McDaniel, D., McKee, C., Smith Cairns, H. (eds.) 1996. Methods for Assessing Children's Syntax. Cambridge, MA: MIT Press, 3-22.
Menn, L., Bernstein Ratner, N. (eds.) 2000. Methods for studying language production. Mahwah, N.J. : Lawrence Erlbaum Associates
Further Readings IISekerina, I.A., Fernandez, E.M., Clahsen, H. (eds)
2008. Developmental Psycholinguistics: On-Line Methods in Children’s Language Processing. Amsterdam: Benjamins.
Wei, Li, Moyer, M. (eds) 2008. The Blackwell Guide to Research Methods in Bilingualism and Multilingualism. London: Blackwell.
Wray, A., Bloomer, A. 2006. Projects in Linguistics. A Practical Guide to Researching Language. London: Arnold.
-s/’s and von/of Possessives
English: ‘s for animate, topical, prototypical poss. N(P)
German: -s restricted to unmodified N (name, kinship)
Search in CLAN for strings: FREQ, COMBO (with context)
• Search in CLAN for codes : FREQ, COMBO (with context)
• How would you collect spontaneous speech (participants, setting, activities)?
• How would you create:• a broad-spectrum elicitation game• a meaning-focused elicitation game• a form-focused elicitation game• an elicitation/comprehension experiment for recursive –s: the boy’s father’s mother’s ballon
Nikola Koch: Elicitation/Comprehension
Nikola Koch: Elicitation/Comprehension