Learner Corpus Research Conference, Bergen, Norway, September 27, 2013

Discriminating CEFR levels in Greek L2: a corpus-based study of young learners’ written narratives

Giagkou Maria, Kantzou Vicky, Stamouli Spyridoula, Tzevelekou Maria




Background

•Specification of CEFR functional descriptions: criterial features (specific lexical and grammatical features used differently by L2 learners at different proficiency levels)

• Cambridge English Profile Programme (Hawkins & Filipović 2012, Hawkins & Buttery 2010)

•CEFR proficiency levels and L2 acquisition of various linguistic features

• SLATE network (Second Language Acquisition and Testing in Europe)

• Different languages: Dutch, Italian and Spanish (Kuiken, Vedder & Gilabert 2010), Finnish (Alanen, Huhta & Tarnanen 2010, Martin et al. 2010), French (Forsberg & Bartning 2010, Prodeau et al. 2012), Norwegian (Carlsen 2010)

• Mostly adult learners, smaller number of studies on young L2 learners (Pallotti 2010).


Research objectives

• Identification of criterial properties for Greek as L2 -> specification of the CEFR proficiency levels with respect to the linguistic features of Greek -> help educators and researchers discriminate the language production of each level from the production of adjacent levels.

• Focus:

  • young L2 learners of Greek enrolled in Greek state schools (immigrants and indigenous minorities): 18% of students nowadays are immigrants or repatriated Greeks learning Greek as L2 (Gropas & Triandafyllidou 2011)

  • written narratives, because children are familiar with them from an early age and because narratives have been widely investigated in L1 and L2 acquisition.

• Investigation of the developing narrative ability at micro- and macro- level, as indicated by:

  • Narrative length
  • Clause subordination
  • Discourse markers
  • Modifiers
  • Grammatical accuracy
  • Lexical density

• Previous research findings in Greek L1 and L2 acquisition (Kantzou 2010, 2012, Stamouli 2010, Varlokosta & Triantafillidou 2003), and in evaluation of Greek text difficulty (Giagkou 2012)


Elicitation task and level allocation

• Two writing tasks performed by ca. 1200 immigrant and repatriated children (October 2011 to February 2012):

  • a narrative based on the Cat Story picture series (Hickmann 2003)

  • a letter or diary entry

• Two evaluators placed each student at a CEFR level on the basis of the two written productions

• Rating was based on CEFR descriptors, specifically the Overall Written Production, Creative Writing, and Lexical, Grammatical and Orthographic Competence scales.


Corpus

• Narratives based on the Cat Story picture series (letters and diary entries were excluded from further analysis)

• Only narratives placed at the same level by both evaluators are included

• Corpus of 150 scripts (9742 tokens). Levels A2, B1 and B2 are represented in the corpus, with 50 scripts in each level.

Level   Scripts   Tokens                                  Clauses
        N         N       Mean (SD)       Min   Max       N       Mean (SD)      Min   Max
A2      50        2,384   47.68 (14.34)   19    83        511     10.22 (3.19)   4     18
B1      50        3,193   63.86 (13.2)    31    95        654     13.08 (3.2)    8     22
B2      50        4,165   83.30 (22.19)   53    181       842     16.84 (4.11)   9     33
Total   150       9,742   64.95 (22.37)   19    181       2,007   13.38 (4.43)   4     33
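As a quick sanity check on the corpus figures, the sketch below recomputes the per-level means from the reported totals (tokens or clauses divided by the 50 scripts per level) and sums the totals. The dictionary layout and the function name are illustrative and not part of the study.

```python
# Totals as reported for the corpus: (scripts, tokens, clauses) per CEFR level.
CORPUS = {
    "A2": (50, 2384, 511),
    "B1": (50, 3193, 654),
    "B2": (50, 4165, 842),
}


def per_script_means(corpus):
    """Return {level: (mean tokens per script, mean clauses per script)}."""
    return {
        level: (round(tokens / scripts, 2), round(clauses / scripts, 2))
        for level, (scripts, tokens, clauses) in corpus.items()
    }


means = per_script_means(CORPUS)
total_scripts = sum(scripts for scripts, _, _ in CORPUS.values())
total_tokens = sum(tokens for _, tokens, _ in CORPUS.values())
total_clauses = sum(clauses for _, _, clauses in CORPUS.values())

for level, (mean_tokens, mean_clauses) in means.items():
    print(level, mean_tokens, mean_clauses)
print("Total", total_scripts, total_tokens, total_clauses)
```

The recomputed means and totals agree with the values reported for the corpus (for example 47.68 tokens per A2 script and 9,742 tokens overall).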


Sample

• 150 primary school pupils
• 83 boys, 67 girls
• grades 3 to 6 (aged 8-14)
• different linguistic backgrounds, mainly Albanian (49%) and Russian (15%)
• resided in geographically diverse regions of Greece



Transcription and annotation

• Manual transcription: a) a version preserving learner’s spelling, and b) a corrected version

• Clause separation: a clause expresses a single situation and has one predicate (Berman and Slobin 1994).

• Annotation:
  • Type of clause
  • Clitics within the verb frame
  • Adjectives and adverbs
  • Discourse markers


Transcription and annotation: Type of clause

• Independent
• Dependent
  • Relative clauses
  • Complement clauses
  • Clauses of purpose
  • Clauses of cause
  • Clauses of time
• Center-embedding
  mia mera // mia γata citaksa kala kala ta mikra pulacia [pu i mitera iχe pai] [na vri trofi]
  One day // a cat looked at the little birdies [that the mother had left] [to find food]


Transcription and annotation: Clitics

• Clitics within the verb frame

• Appropriate use: ce o scilos tis δaγose tin ura tis (A2)
  and the dog bit its (=the cat’s) tail

• Inappropriate use: ce i γata arχize na treχi ce o scilos ton ciniγuse (B1)
  and the cat started running and the dog was chasing him* (inappropriate gender marking)


Transcription and annotation: Adjectives and adverbs

• Adjectives
  • Descriptive: itan ena mikro spitaci pu iχe mikra pulacia (A2)
    there was a little house that had little birdies
  • Evaluative: i kakurγa γata pinuse (B2)
    the wicked cat was hungry

• Adverbs
  • Descriptive: i γata skarfani apano sto δendro (A2)
    the cat was climbing on the tree
  • Evaluative: ta pulacia citusan ti γata paraxena (B2)
    the birdies were looking at the cat weirdly


Transcription and annotation: Discourse markers

• Additive
• Temporal
• Contrastive
• Inferential
• Other


Analysis

• Annotated linguistic features -> metrics based on frequency of occurrence per level

• Means comparison: One-Way ANOVA

• Post-hoc multiple comparisons between levels A2, B1 and B2: Bonferroni tests
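To illustrate the procedure named above, the following sketch computes the one-way ANOVA F statistic and its degrees of freedom for three groups, plus the Bonferroni-adjusted significance threshold for the pairwise comparisons. The data are invented toy values, the function names are hypothetical, and the 0.05 significance level is an assumption (the slides do not state one). P-values are not computed here, since they require the F distribution from a statistics library.

```python
from itertools import combinations
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict of {level: [values]}."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    k = len(groups)        # number of levels
    n = len(all_values)    # number of scripts
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    ss_within = sum(
        (v - mean(values)) ** 2
        for values in groups.values()
        for v in values
    )
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def bonferroni_alpha(groups, alpha=0.05):
    """Divide alpha (assumed 0.05) by the number of pairwise comparisons."""
    n_pairs = len(list(combinations(groups, 2)))
    return alpha / n_pairs


# Toy metric values per level, invented for illustration only.
toy = {"A2": [1, 2, 3], "B1": [2, 3, 4], "B2": [5, 6, 7]}
f_stat, df_b, df_w = one_way_anova(toy)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
print(f"Bonferroni-adjusted alpha for 3 pairs: {bonferroni_alpha(toy):.4f}")
```

With 50 scripts in each of the three levels, the same function yields degrees of freedom (2, 147), which matches the F(2, 147) values reported in the results.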


Results: Narrative length

• Main effect and all post-hoc comparisons were significant for:
  • Number of tokens [F(2, 147) = 54.673, p < 0.001]
  • Number of clauses [F(2, 147) = 44.000, p < 0.001]

The longer the narrative, the higher the level.


Results: Clause subordination (1/2)

• Percentage of dependent clauses:
  • Main effect is significant [F(2, 147) = 40.172, p < 0.001] and so are all post-hoc comparisons
  • A successful discriminator both between A2-B1 and between B1-B2
  • Zero occurrences are possible in A2 and B1 (though rarer), but at least one dependent clause is expected in B2

A script with no dependent clauses is most likely to be below B2.
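The percentage of dependent clauses can be sketched as a simple ratio over the annotated clause types of one script. The list-of-labels representation, the label names and the function name below are assumptions made for illustration; the slides do not describe the annotation format.

```python
# Clause-type labels assumed for illustration, following the annotation scheme.
DEPENDENT_TYPES = {"relative", "complement", "purpose", "cause", "time"}


def dependent_clause_percentage(clause_types):
    """Percentage of clauses in one script that are dependent clauses."""
    if not clause_types:
        raise ValueError("a script must contain at least one clause")
    dependent = sum(1 for label in clause_types if label in DEPENDENT_TYPES)
    return 100.0 * dependent / len(clause_types)


# Hypothetical script with five clauses, two of them dependent.
script = ["independent", "independent", "time", "independent", "relative"]
print(dependent_clause_percentage(script))
```

A script for which this value is zero contains no dependent clauses, which the results above treat as a sign that it is most likely below B2.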


Results: Clause subordination (2/2)

• Percentage of the different types of dependent clauses:
  • Complement, relative, purpose and causal clauses did not significantly discriminate levels
  • Only temporal clauses achieved a significant main effect [F(2, 114) = 6.109, p = 0.003], but only in discriminating A2 from B1 and from B2.

In A2 narratives sequential events are not subordinated. Temporal clauses are used from B1 onwards.
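In a one-way ANOVA the second degrees-of-freedom value equals the number of observations minus the number of groups, so the reported F values also reveal how many scripts entered each test. The short sketch below does that arithmetic. The function name is illustrative, and the reading that scripts without any dependent clause were left out of the clause-type analysis is an assumption, not something the slides state.

```python
def scripts_in_analysis(df_within, n_levels=3):
    """Recover the number of scripts from the within-groups degrees of freedom.

    In a one-way ANOVA, df_within = N - k, hence N = df_within + k.
    """
    return df_within + n_levels


# F(2, 147): the full corpus of 150 scripts entered the test.
print(scripts_in_analysis(147))  # prints 150
# F(2, 114): fewer scripts entered the temporal-clause comparison.
print(scripts_in_analysis(114))  # prints 117
```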


Results: Center-embedding

• Percentage of embedded clauses:
  • Significant main effect [F(2, 147) = 6.417, p = 0.001]
  • Post-hoc tests: significant for A2-B2

• Embedding used by:
  • A2: 3 learners
  • B1: 9 learners
  • B2: 29 learners, with more than one embedding in the same script

More than one embedding indicates a B2 learner.


Results: Clitics

• Percentage of correct clitics out of all clitics:
  • Significant main effect [F(2, 120) = 17.380, p < 0.001] and all post-hoc comparisons

• A2: minimum = 0%, maximum = 100%

• B1: more than half of the learners got all their clitics correct

• B2: occasional inappropriate uses by only 3 learners

A B2 learner should be expected to use clitics correctly in terms of gender, number, person and case agreement.


Results: Discourse markers (general metrics)

• Both features were found statistically significant:
  • average number of discourse markers per clause [F(2, 147) = 14.141, p < 0.001]
  • percentage of discourse markers to tokens [F(2, 147) = 19.958, p < 0.001]

• Both are successful discriminators of A2-B2 and B1-B2
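The two general discourse-marker metrics are plain ratios over per-script counts. A minimal sketch follows, assuming the counts of markers, clauses and tokens are already available for a script; the function name and the example counts are hypothetical.

```python
def discourse_marker_metrics(n_markers, n_clauses, n_tokens):
    """Return the two general discourse-marker metrics for one script."""
    if n_clauses <= 0 or n_tokens <= 0:
        raise ValueError("a script needs at least one clause and one token")
    return {
        "markers_per_clause": n_markers / n_clauses,
        "marker_pct_of_tokens": 100.0 * n_markers / n_tokens,
    }


# Hypothetical script: 6 discourse markers, 12 clauses, 60 tokens.
metrics = discourse_marker_metrics(n_markers=6, n_clauses=12, n_tokens=60)
print(metrics)
```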


Results: Discourse markers (type of marker)

• Mean number of the different types of markers per clause: statistically significant for
  • Additive markers: all level pairs
  • Temporal markers: A2-B2 and B1-B2
  • Contrastive markers: A2-B1 and A2-B2
  • Inferential markers: A2-B2
  • Other markers: B1-B2

Exclusive use of the additive και /ce/ (=and) and the temporal μετά /metá/ (=then) is expected in A2. Any other additive or temporal marker should indicate a learner above A2.

B1 learners reduce the use of και and μετά, and they start marking contrast.

Inference marking is never encountered in A2. It should be expected from learners in B1 or above.
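The observations above can be read as a simple check on the annotated markers of one script: any additive form other than και, any temporal form other than μετά, and any inferential marker points above A2. The sketch below encodes only that reading. The tuple representation and the function name are hypothetical, the two extra Greek forms are illustrative examples rather than corpus data, and contrastive markers are deliberately left out because the slides only say that B1 learners start marking contrast.

```python
# The only forms expected at A2, keyed by marker type.
A2_FORMS = {"additive": {"και"}, "temporal": {"μετά"}}


def above_a2_signals(markers):
    """Return the (type, form) pairs that point to a learner above A2.

    `markers` is a list of (marker_type, form) tuples, a hypothetical
    representation of the annotated discourse markers in one script.
    """
    signals = []
    for marker_type, form in markers:
        if marker_type == "inferential":
            # Inference marking is never encountered in A2.
            signals.append((marker_type, form))
        elif marker_type in A2_FORMS and form not in A2_FORMS[marker_type]:
            # An additive or temporal marker other than και / μετά.
            signals.append((marker_type, form))
    return signals


script_markers = [
    ("additive", "και"),
    ("temporal", "μετά"),
    ("temporal", "ύστερα"),   # illustrative form, not taken from the corpus
    ("inferential", "έτσι"),  # illustrative form, not taken from the corpus
]
print(above_a2_signals(script_markers))
```

An empty result does not prove that a script is A2; it only means that none of these particular signals is present.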


Results: Verb and noun modifiers

• Not statistically significant:
  • average number of adjectives per clause
  • percentage of adjectives to tokens
  • average number of adverbs per clause
  • percentage of adverbs to tokens

• Main effect statistically significant:
  • percentage of evaluative adjectives to adjectives: B1-B2 and A2-B2
  • percentage of evaluative adverbs to adverbs: all level pairs

Systematic use of evaluative adjectives and adverbs indicates a learner above level A2, most likely at level B2.


Results: Lexical density

• Not statistically significant


Results at a glance

Level pairs that each metric discriminates, as reported in the preceding results ("yes" = significant post-hoc comparison):

Metric                                          A2-B1   B1-B2   A2-B2
Narrative length
  Number of tokens and clauses                  yes     yes     yes
Subordination
  Percentage of dependent clauses               yes     yes     yes
  Percentage of temporal clauses                yes     -       yes
  Percentage of embedded clauses                -       -       yes
Clitics
  Percentage of correct clitics                 yes     yes     yes
Discourse markers
  Mean number of discourse markers per clause   -       yes     yes
  Percentage of discourse markers to tokens     -       yes     yes
  Percentage of temporal discourse markers      -       yes     yes
  Percentage of contrastive discourse markers   yes     -       yes
  Percentage of additive discourse markers      yes     yes     yes
  Percentage of inferential discourse markers   -       -       yes
Modifiers
  Percentage of evaluative adjectives           -       yes     yes
  Percentage of evaluative adverbs              yes     yes     yes


Criterial features at a glance

Subordination
• A2: Temporal clauses are not expected
• B1: Systematic use of temporal clauses
• B2: At least one dependent clause; embedding is encountered more than once

Discourse
• A2: Exclusive use of the additive και and the temporal μετά is expected; no inference
• B1: Start marking contrast; start marking inference
• B2: Systematize inference marking

Grammatical accuracy
• B2: Clitics used correctly in terms of gender, number and case agreement

Evaluation
• B2: Systematic use of evaluative adjectives and adverbs
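A few of these criterial features can be expressed as count-based checks on a script. The sketch below is only an illustration of how such a checklist might be applied: the dictionary keys and the function name are hypothetical, the hints are not a validated classifier, and features described as "systematic" are left out because the slides give no threshold for them.

```python
def level_hints(script):
    """Return a list of textual hints derived from count-based criterial features.

    `script` is a dict of per-script counts (hypothetical key names).
    """
    hints = []
    if script["dependent_clauses"] == 0:
        hints.append("no dependent clauses: most likely below B2")
    if script["embeddings"] > 1:
        hints.append("more than one embedding: points to B2")
    if script["temporal_clauses"] > 0:
        hints.append("temporal clauses present: points to B1 or above")
    if script["inferential_markers"] > 0:
        hints.append("inference marking present: points to B1 or above")
    return hints


# Hypothetical counts for one script.
example = {
    "dependent_clauses": 4,
    "embeddings": 2,
    "temporal_clauses": 1,
    "inferential_markers": 0,
}
for hint in level_hints(example):
    print(hint)
```

The hints are kept as separate statements rather than combined into a single predicted level, since the slides present the features as indications and not as a scoring scheme.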


Further research…

• Larger sample of A2-B2 learners and C1-C2
• More fine-grained analysis of indices, e.g. temporal clauses denoting simultaneity
• New indices, e.g. verbal morphology, vocabulary growth
• Different discourse types and modalities


References

Alanen, Riikka, Huhta, Ari & Tarnanen, Mirja (2010). Designing and assessing L2 writing tasks across CEFR proficiency levels. In Bartning, Martin & Vedder (Eds.), 21-56.

Bartning, Inge, Martin, Maisa & Vedder, Ineke (eds.) (2010) Communicative proficiency and linguistic development: intersections between SLA and language testing research. Eurosla Monographs Series 1. Available at: http://eurosla.org/monographs/EM01/EM01tot.pdf (date accessed 21/05/2013).

Carlsen, Cecilie (2010) Discourse connectives across CEFR-levels: A corpus based study. In Bartning, Martin & Vedder (Eds.), 191-210.

Forsberg, Fanny & Bartning, Inge (2010) Can linguistic features discriminate between the communicative CEFR-levels? A pilot study of written L2 French. In Bartning, Martin & Vedder (Eds.), 133-158.

Giagkou, Maria (2012). A readability statistical model for pedagogically relevant text retrieval. In Papadopoulou & Revithiadou (Eds.), Proceedings of the 32nd Annual Meeting of the Department of Linguistics, AUTH (pp. 65-76). Thessaloniki: Institute of Modern Greek Studies.

Gropas, R. & Triandafyllidou, A. (2011). Greek education policy and the challenge of migration: an ‘intercultural’ view of assimilation. Race Ethnicity and Education, 14(3), 399-419.

Hawkins, John A. & Buttery, Paula (2010) Criterial Features in Learner Corpora: Theory and Illustrations. English Profile Journal 1(1): 1-23.

Hawkins, John A. & Filipović, Luna (2012) Criterial Features in L2 English: Specifying the Reference Levels of the Common European Framework (English Profile Studies). Cambridge: Cambridge University Press.

Hickmann, Maya (2003) Children’s discourse: Person, space and time across languages. Cambridge: Cambridge University Press.

Kantzou, Vicky (2010) The temporal structure of narrative in the acquisition of Greek as a first and as a second language. PhD thesis. Athens: National and Kapodistrian University of Athens. [in Greek]

Kantzou, Vicky (2012) The temporal structure of narratives in second language acquisition of Greek. In: Gavriilidou Zoi, Efthymiou Aggeliki, Thomadaki Evangelia & Kambakis-Vougiouklis Penelope (eds) Selected Papers – The 10th International Conference of Greek Linguistics (pp. 354-364). Komotini/Greece: Democritus University of Thrace. Available at: http://www.icgl.gr/files/English/26.Kantzou_10ICGL_pp.354-364.pdf (date accessed 21/05/2013).

Kuiken, Folkert, Ineke Vedder & Roger Gilabert (2010) Communicative adequacy and linguistic complexity in L2 writing. In Bartning, Martin & Vedder (Eds.), 81-100.

Pallotti, Gabriele (2010) Doing interlanguage analysis in school contexts. In Bartning, Martin & Vedder (Eds.), 159-190.

Prodeau, Mireille, Lopez, Sabine & Véronique, Daniel (2012) Acquisition of French as a Second Language: Do developmental stages correlate with CEFR levels? Journal of Applied Language Studies 6(1): 47–68.

Stamouli, Spyridoula (2010) Narrative development in Greek L1 and child L2. PhD thesis. Athens: National and Kapodistrian University of Athens. [in Greek]

Varlokosta, Spyridoula & Triantafillidou, Leda (2003). Proficiency Levels in Greek as a Second Language. Athens: Centre for Intercultural Education, University of Athens. [in Greek]


Thank you!

Part of this work, data collection and rating, was funded by the educational project “Education of Repatriate and Immigrant Students”, Action 1 “Linguistic and Educational Support for Reception Classes”, Aristotle University of Thessaloniki (National Strategic Reference Framework 2007-2013 and the Ministry of Education and Religious Affairs)