Hispania 97.4 (2014): 466–88
AATSP Copyright © 2014

Meaning-based Scoring: A Systemic Functional Linguistics Model for Automated Test Tasks

Jesse Gleason
Southern Connecticut State University

Abstract: Communicative approaches to language teaching that emphasize the importance of speaking (e.g., task-based language teaching) require innovative and evidence-based means of assessing oral language. Nonetheless, research has yet to produce an adequate assessment model for oral language (Chun 2006; Downey et al. 2008). Limited by automatic speech recognition (ASR) technology, which compares non-native speaker discourse to native-like discourse, most tests exclusively focus on accuracy while ignoring how examinees use language to make meaning. In order to offer stakeholders more trustworthy evidence of how examinees might use language in target language domains, a model anchored in systemic functional linguistics (SFL) is put forth. Specific examples are given of how SFL might be used to evaluate test task types, such as the story retell. Three examinees’ responses are contrasted using genre analysis (Derewianka 1990) and transitivity analysis (Ravelli 2000) in order to demonstrate elements in their linguistic profiles that ASR-based assessment would overlook. In so doing, implications are drawn regarding the potential of SFL models for enhancing automated scoring procedures by focusing on the meaning-form relations in the linguistic construction of narrative.

Keywords: adult Spanish literacy/alfabetización adulta de español, automated language assessment/evaluación automática del lenguaje, discourse analysis/análisis del discurso, systemic functional linguistics/lingüística sistémico-funcional, transitivity analysis/análisis de transitividad

Introduction

A prevailing issue in the field of language testing is the inability of automated oral assessments to offer sufficient evidence of how examinees might be expected to perform in real-life target language domains (Chun 2006; Downey et al. 2008; Xi 2008). Although concurrent validation projects have shown that the scores on automated tests correlate highly with other measures of language proficiency (e.g., those measured by the TOEFL iBT, see Bernstein, Rosenfeld, Townshend, and Barbier 2004; Bernstein, Van Moere, and Cheng 2010), evidence about how examinee responses can be assessed based upon criteria other than correct phonological, lexical, and grammatical forms is underresearched. Such evidence would be especially helpful for teachers interested in using automated assessments to prepare their learners for target language domains.

Assessment practices that solely focus on the errors in examinees’ linguistic forms treat grammar from a structuralist perspective, separating forms from their meanings. Marking form and content separately renders efforts toward form-meaning integration ineffectual. A framework based upon systemic functional linguistics (SFL), operationalized using genre and transitivity analysis, offers test developers a complementary way of evaluating exam responses for meaning in addition to form. Halliday (1979) called for a grammar that instead of being “formal; rigid; based on the notion of ‘rule’; syntactic in focus, and oriented toward the sentence” was “functional; flexible, based on the notion of ‘resource’; semantic in focus, and oriented towards the text” (186). The current paper aims to demonstrate how SFL can provide the basis to look at the meanings language users make while engaged in test tasks. One task in particular, the story retell, will be the focus of discussion.

Systemic Functional Linguistics: A Model for Language Assessment

“Remember that all models are wrong; the practical question is how wrong do they have to
be to not be useful” (Box and Draper 1987: 74). Although outside the field of language testing, Box and Draper’s quote encapsulates why certain models of language have been adopted at certain moments throughout history. While some models might seem appropriate at the time, they later become cumbersome, limiting the creativity and commonsense notions of what is fair. Certain assumptions about language stemming from structural linguistics continue to influence language teaching and testing practices. From a structural perspective, language forms are disconnected from their meanings. Viewing language as static, structuralism was put forth as a ‘scientific’ basis of linguistics (Bloomfield 1933). This entailed formal procedures for mechanistic linguistic analysis. Since issues of meaning were unanswerable, they were largely ignored. Instead, analysis focused on the number of errors that language users made. This approach formed the basis for Chomskian generative grammar, which continues to greatly influence the field today, especially in the United States.

Systemic functional linguistics, a model based on a holistic view of language as a system for making meaning, differs epistemologically from structuralism, offering insight into form-meaning integration. Instead of decontextualized sentences or utterances, SFL takes the complete text, a meaningful stretch of language, as the object of language inquiry. Halliday (1994) conceived of languages as organizing three main functions: 1) to represent experience, 2) to establish and maintain interaction between individuals, and 3) to create coherent and connected discourse. These aspects correspond to three ‘metafunctions’ of language: the ideational, the interpersonal, and the textual, respectively. A further way that SFL differs from structuralist grammar, and one that is especially applicable to language testing efforts, is that rather than being a catalogue of rules, language is described in terms of sets of meaning choices, or sets of options. This means that language users constantly make choices based upon the meanings that they would like to impart. What is commonly assessed as incorrect from a structuralist viewpoint may actually be appropriate in a given context.

Researchers in SFL uphold that any evaluation of language should focus on meaning and the ways in which users of language make choices in order to mean differently. While structuralism focuses exclusively on linguistic form, SFL sees language as a resource for making meaning. Table 1 contrasts several of the most important differences between SFL and structuralism. See also Derewianka (1999) and Mohan and Slater (2004) for a discussion of structuralist versus SFL approaches to grammar.

Language assessment models based on structuralist assumptions of language as a set of rules and language learning as the acquisition of a correct set of forms are insufficient for responsible and fair language assessment practices in today’s world. Genre analysis and transitivity analysis, as described in detail later, are two tools rooted in SFL that can be useful for evaluating features of language that may remain overlooked by other models.

SFL in Formative Assessment

An SFL-based assessment model can provide a basis for addressing the quality of discourse based on language features other than form errors. There is an abundance of research that takes an SFL approach to the assessment of classroom discourse (Early 1990; Early, Mohan, and Hooper 1989; Early and Tang 1991; Fang and Wang 2011; Hood 2010; Huang and Mohan 2003; Huang and Morgan 2009; Mohan and Beckett 2003; Tang 1992). Early, Thew, and Wakefield (1986) found that six functional language structures (i.e., knowledge structures) recurred both as texts and thinking skills in several Canadian curriculum resource guides and textbooks. Early, Mohan, and Hooper (1989) later used this as the basis for the Vancouver School Board Project, a large-scale action research project involving over one hundred teachers and countless activities.

Mohan and Beckett (2003) described the role of functional recasts in classroom language, which are important for grammatically scaffolding learners toward more advanced language development. Whereas formal recasts stem from a traditional view of assessment as a judgment of correctness of form, functional recasts are associated with a functional view of assessment, which judges the functional appropriateness of an expression of meaning (Mohan, Leung, and Slater 2010).

Mohan and Slater (2004) addressed important questions with regard to how speakers and writers use language as a resource for meaning. The authors showed how two assessment instruments, the Test of Written English scoring guide and a locally developed tool for assessing communicative competence, were incapable of distinguishing between student compositions in a way that supported teacher/rater intuitions. Whereas teachers could definitively say that one examinee’s essay was superior to another’s, describing the former as more ‘scientific,’ ‘more advanced,’ and at a ‘higher caliber’ than the latter, there was nothing in either assessment instrument that teachers could draw on to account for this difference. Since neither essay presented many grammatical errors, they were given the same score.

Other studies have shown that salient discourse features offer raters more trustworthy evidence of how language construes meaning. Slater and Mohan (2010) used an SFL approach to show how an ESL teacher worked collaboratively with her school’s science department to help students understand how to construct science knowledge. Results showed that the developmental move in learners’ science discourse exhibited both a semantic and a grammatical shift. Semantically, L1 high school students were able to use a wider range of linguistic features than L2 high school students and younger learners. Lexicogrammatically, learners shifted away from the use of conjunctions, drawing on more metaphoric ways of constructing meaning using nominalization. The authors illustrated how the knowledge framework, a model anchored in SFL, could be used to contribute to the inferences of a validity argument.

Although SFL has been frequently used for formative classroom assessment, less has been said about norm-referenced testing. In one of the few studies, Mohan (1998) showed how SFL approaches have major implications for the theory and practice of assessment of academic oral proficiency. Using an SFL-based framework to examine the responses of teaching assistants during oral proficiency interviews (OPIs), the author demonstrated that, despite their best attempts to view language from a functional perspective, performance-based oral assessments such as the OPI continue to rely on superficial and disconnected conceptions of grammar as an indicator of accuracy. Data suggested that important discourse features used in the construction of knowledge were not captured by the ACTFL OPI rating system, prompting the author to put forth an SFL-based model for better illuminating questions of how knowledge is constructed by discourse. Such a model held much potential to contribute to a more robust functional understanding of how speakers during the OPI used language to co-construct meaning.

Table 1. Assumptions of SFL versus Structuralist Grammar

Systemic Functional Linguistics | Structuralist Grammar
Discourse level | Sentence level and below
Functions of language and how they evolve in our culture to enable us to do things | Form and structure of language
How discourse varies with context | General description of language
Language as a resource for meaning making | Language as a set of rules
Language learning as extending resources for making meaning in context | Form unrelated to meaning (‘conduit’)
Evaluate discourse as making meaning with resources in context | Evaluate correctness. Judge meaning independently from form

The following sections will detail two other SFL models, which have high potential for helping to automatically assess oral language based on form-meaning criteria: genre analysis, which looks at whole texts in their social contexts, and transitivity analysis, a tool for linguistic analysis at the clausal level and below.

Above the Clause: Genre Analysis of a Recount

Genre analysis encompasses a major vein of linguistic research in Australia and elsewhere.
Martin (2009) defines a genre as “a staged, goal-oriented social process” (10). In a series of action research projects, the focus has been on defining and describing families of genres in different school disciplines, such as science (e.g., Veel 1997) and history (e.g., Coffin 2006). Derewianka (1990) describes the textual features of six common genres used in academic contexts, including recounts, instructions, narratives, information reports, explanations, and arguments. Later, I will draw on the author’s description of the recount genre to explore how it may inform automated language assessment tasks, such as the story retell, but first I will explore several important features of the recount.

As mentioned, recounts are one of a number of different textual genres that have been explored. They begin with an orientation stage, where the listener is given the necessary background information to understand a text, followed by a chronological series of events. The resultant texts from examinees taking story retell (SR) tasks exhibit many of the language features of a factual recount, including 1) the use of third-person pronouns; 2) selected details to help the listener accurately reconstruct the incident; 3) specific details of time, place, and manner; 4) descriptive details; and 5) the passive voice.
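The five recount features listed above lend themselves to a simple checklist. The following is a minimal sketch, not part of the original study, of how a rater's annotations for one response might be recorded and summarized. The feature labels and the function name `recount_coverage` are hypothetical, and the sample annotation is invented for illustration.

```python
# Hypothetical labels for the five recount features listed above
# (after Derewianka 1990); the names themselves are assumptions.
RECOUNT_FEATURES = (
    "third_person_pronouns",
    "selected_details",
    "specific_details",
    "descriptive_details",
    "passive_voice",
)


def recount_coverage(observed):
    """Return (present, missing) feature lists for one annotated response.

    `observed` is the set of feature labels a human rater marked as present.
    """
    unknown = set(observed) - set(RECOUNT_FEATURES)
    if unknown:
        raise ValueError(f"unrecognized feature labels: {sorted(unknown)}")
    present = [f for f in RECOUNT_FEATURES if f in observed]
    missing = [f for f in RECOUNT_FEATURES if f not in observed]
    return present, missing


# Invented annotation for illustration only; not data from the study.
present, missing = recount_coverage(
    {
        "third_person_pronouns",
        "selected_details",
        "specific_details",
        "descriptive_details",
    }
)
print(len(present), missing)
```

A checklist of this kind records only whether a feature occurs; judging how well a feature serves the retelling would still fall to the rater.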

Below the Clause: Transitivity Analysis

Since SFL is based on the tenet that language is a resource for relating linguistic forms to their meanings, language assessment needs to examine how language users manipulate such forms to impart certain nuances through the selection of linguistic choices that are available to them. As mentioned, Halliday (1994) proposes that any text can be analyzed using the ideational, interpersonal, and textual metafunctions, which correspond to the situational or register variables of field, tenor, and mode, respectively. According to Huang and Morgan (2003), ideational meanings reflect and construct subject matter or content, allowing for the interpretation of how language is integrated with content.

One tool situated within the ideational metafunction is transitivity analysis. As the main purpose of the ideational is to represent experience, meanings about the world as reflected by discourse are present in the lexicogrammatical choices that language users make to convey their particular representations of reality. Transitivity analysis has been used extensively in the fields of stylistics, critical discourse analysis, and genre analysis (Martínez 2001; Meurer 2004; Shokhui and Amin 2010). However, it has yet to be applied to language assessment.

Transitivity analysis offers explicit procedures for the linguistic analysis of narrative. Drawing on ideational meanings, it classifies the clause into six major process types: mental, verbal, relational, behavioral, existential, and material (Ravelli 2000). Mental processes (e.g., believe, see, worry) are concerned with aspects of thinking, feeling, and seeing, and oftentimes project, or pertain to, a separate clause (e.g., I believe you are correct). Verbal processes (e.g., whisper, say, tell) can also project and generally pertain to forms of saying and their related synonyms. Behavioral processes (e.g., watch, smile, shave) are closely related to material, mental, and verbal processes, usually relating to some type of human behavior. The two types of relational processes, attributive and identifying, are generally used to relate participants to each other. While attributive processes ascribe descriptive attributes to an entity (e.g., He has a nice smile), identifying processes give the entity a definite identity (e.g., She is a good worker). Existential processes (e.g., there is/there are) function to represent something that exists or happens. Finally, material processes (e.g., crouch, melt, spin) often signify actions and include events, common characteristics of a recount.

All processes have specific participants and circumstances that accompany them. For example, verbal processes can have a sayer (the addresser), a receiver (the addressee), and the verbiage (what is said). Circumstances help to represent further details in a text, giving more extensive information to answer questions such as How long? When? Where? How? and Why? (Martin, Matthiessen, and Painter 1997).

The specific participant roles of material processes include the actor (the one performing the action), the goal (that which is affected by the action), and the beneficiary (the receiver of goods or services). As Ravelli (2000) explains, identifying the material processes is an important part of a transitivity analysis, but a further step is needed to examine such processes in conjunction with their associated participants. These can include the characters themselves (generally the actors) as well as the other human or non-human participants involved in the narrative (goals and beneficiaries).

Automated Scoring of Meaning and Form: The Story Retell Task

Based on structuralist notions of what language is, automated scoring procedures currently conflate language learners’ correctness in grammatical form with their ability to mean. Automatic speech recognition (ASR) systems, which make use of natural language processing technology, detect erroneous forms in non-native speaker (NNS) responses by comparing them to native speaker (NS) speech samples (Bernstein et al. 2004; Bernstein et al. 2010; Van Moere 2012; Xi 2008). Unfair juxtaposition of NS to NNS discourse assumes that NSs use language correctly and that NNSs should learn to speak in the same ways. This is a clear case of mistaken identity, as anyone who has ever heard someone with a heavy accent or poor use of grammatical form use language expressively knows. By relying solely on form-based accuracy, ASR-based tests currently disregard the ways that NNSs manipulate language in non-standard ways to represent their experiences. This not only presumes that the NS way is the correct way to use language but also ignores the way that NNSs are often able to manipulate language more colorfully and vividly than monolingual speakers due to their knowledge of two or more linguistic codes. Multicompetent language users may break NS rules for language use, while at the same time drawing on a diverse repertoire of codes which allows them to effectively represent experience (Cook 1999; Cook et al. 2006).

While automated scoring may be able to evaluate the correspondence of NS to NNS forms, such procedures have yet to develop ways of linking forms to meanings or of addressing the meanings present in non-corresponding forms. As such, current ASR-based testing approaches are inadequate for assessing underlying meaning-form relationships on tasks that require full, complex oral texts. Such evaluations will inevitably lead to misinterpretations about examinees’ spoken language ability because their underlying theory of language as a set of rules is based solely on surface structure.

The Versant (Pearson 2008; 2011) is an example of one such automated assessment instrument. Although test developers claim that the surface structures evaluated by the Versant may be indicative of examinees’ underlying ‘core’ language, such structures are still only based on traditional conceptions of the way NSs use language to make meaning. For example, the four language constructs evaluated by the Versant include pronunciation, fluency, sentence mastery (syntax), and vocabulary. Task types include reading sentences aloud, repeating spoken utterances, and reorganizing spoken vocabulary items into a correct sentence. Using language for communicative purposes or to convey a message is all but ignored.

In addition to the other task types mentioned, the Versant uses an SR task, which asks examinees to listen to and retell a brief story, including “the situation, characters, actions, and ending” in an extended response of thirty words or less. Because the task requests information about the specific content of the story, examinee responses must necessarily be evaluated for their meaning in addition to their grammatical form; however, they are not (Pearson 2008; 2011). Although there is a dearth of publicly available information regarding exactly how the different subconstructs of the Versant are incorporated in order to reach a given score, we can see that the subconstructs evaluated are all form-focused.

The scoring of complex responses must offer stakeholders information about how well examinees are able to use their linguistic resources to successfully retell the story. These tasks must provide additional meaning-based evidence in order to offer trustworthy evaluations indicative of the quality of oral narrative (Mohan 1998; Xi 2008). The SR task undoubtedly calls upon an examinee’s ability to make meaning with the linguistic resources that s/he possesses. One might argue that the successful telling of a story involves a skilled unfolding of a sequence of events or a scenario involving a plot, climax, or dénouement. It may also involve the ability to draw on other strategies, such as circumlocutions, to retell a story vividly.

Versant developers admit the difficulty of evaluating extended response items. As Van Moere (2012) asserts, “the assessor must evaluate the respondents on an uneven playing field because the respondents have provided qualitatively different performances for rating” (8). Seen through an SFL lens, this “uneven playing field” of contextual variables is an indispensable indicator of examinees’ ability to use language purposefully: what is termed “noise in measurement” (Van Moere 2012: 8) offers important information about that ability. Clearly, a scoring approach that treats such variation as noise stands in stark contrast to an SFL theory of language, which sees grammar as a resource for making meaning, not merely a conduit through which meaning flows. One way in which the SFL approach holds enormous potential for reconceptualizing the way that examinees’ oral texts are scored is by offering a complementary way of assessing forms as they relate to meanings. From this view, grammatical form is indeed important, but in the sense that it is related to meaning. How does wording connect to meaning? Any linguistic analysis must build meaning in.

The goal of the present study is to show how SFL as a theoretical construct can be used to help assess oral language. The following research question is presented: how can SFL theory provide information about oral language use that other methods of assessment do not provide?

Methodology

To answer the above question, examinee responses to a story retell task on a computer-based speaking test of L2 Spanish were analyzed using a detailed transitivity analysis (Appendix C). The transitivity system includes choices about processes, participants, and circumstances, which form the core of ideational meaning within a text. The following sections outline the specific participants, data collection procedures, and analysis. For clarification purposes, human participants will be referred to as examinees.

Examinees

The examinees for this study included 19 NNSs of Spanish enrolled in an intermediate-advanced conversation course at a large US university. All were undergraduate students, ranging from 18 to 30 years of age, with an average age of 21. There were eight males and 11 females. All spoke English as their native language and had passed the academic institution’s introductory Spanish requirements. Many had spent several months abroad in a Spanish-speaking country.

Three key examinees (Irene, Will, and Lisa) were selected from a convenience sample as representative of students with high-, mid-, and low-level Spanish speaking ability, respectively. These names are pseudonyms chosen to protect examinees’ confidentiality and do not necessarily reflect their cultural backgrounds. Irene, a heritage speaker of Spanish, spoke both Spanish and English in her home as a child. Will had studied Spanish for several years in high school and had experience speaking Spanish for missionary purposes. Lisa had also studied Spanish in high school but had had no prior naturalistic language experience.

Data Collection

The data collected for the present study were part of a larger investigation on automated speaking tests. After obtaining Institutional Review Board clearance, examinees were asked to take a computer-administered test of spoken L2 Spanish with tasks similar to those contained on the Versant, as shown in Appendix B. Examinees’ tests were audio-recorded and later transcribed. The current study will focus only on examinee responses to the SR task.

Data Analysis

Examinee test discourse on the SR task was analyzed using genre and transitivity analysis. For the former, examinee responses were classified using Derewianka’s (1990) description of a recount. For the transitivity analysis, all discourse was classified using SFL processes, participants, and circumstances. Processes were divided into the six categories of material, mental, verbal, relational, behavioral, and existential. If the process was material, it was also assigned a potential beneficiary and goal. If it was verbal, the process was potentially assigned verbiage. Circumstances were also noted.
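The coding scheme just described can be pictured as one record per clause. The sketch below is an illustrative assumption rather than the instrument used in the study: the class name `CodedClause`, its field names, and the English sample clause are all hypothetical. It checks participant roles only for material and verbal processes, the two types whose roles are spelled out in the text, and leaves the other four unconstrained.

```python
from dataclasses import dataclass, field

# The six process types named in the text (Ravelli 2000).
PROCESS_TYPES = {
    "material", "mental", "verbal", "relational", "behavioral", "existential",
}

# Participant roles given in the text for two of the process types.
# Roles for the remaining types are deliberately not modeled here.
ROLES_BY_PROCESS = {
    "material": {"actor", "goal", "beneficiary"},
    "verbal": {"sayer", "receiver", "verbiage"},
}


@dataclass
class CodedClause:
    """One clause as a human coder might record it (hypothetical layout)."""

    text: str
    process: str
    participants: dict = field(default_factory=dict)   # role -> wording
    circumstances: list = field(default_factory=list)  # e.g., time, place

    def __post_init__(self):
        if self.process not in PROCESS_TYPES:
            raise ValueError(f"unknown process type: {self.process!r}")
        allowed = ROLES_BY_PROCESS.get(self.process)
        if allowed is not None:
            extra = set(self.participants) - allowed
            if extra:
                raise ValueError(
                    f"roles {sorted(extra)} do not belong to a "
                    f"{self.process} process"
                )


# Invented example clause; not taken from the study's transcripts.
clause = CodedClause(
    text="she gave her friend the keys yesterday",
    process="material",
    participants={"actor": "she", "goal": "the keys", "beneficiary": "her friend"},
    circumstances=["yesterday"],
)
print(clause.process, sorted(clause.participants))
```

All participant roles are optional in this layout, which mirrors the wording that a process is assigned a "potential" beneficiary, goal, or verbiage.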

Data Presentation and Discussion

Genre Analysis: Discourse Features of an Oral Recount

Table 2 compares the linguistic features of the three key examinees’ responses on the SR task. It uses Derewianka’s (1990) categorization to classify these features according to those of an oral recount, including the use of third-person pronouns, the passive voice, and selected, specific, and descriptive details. Original transcripts can be found in Appendix A.

From Table 2, one can see that Irene used the past simple tense to retell much of the story. Her recount included almost all of the same or similar linguistic resources as those of the prompt. She also made use of all five linguistic categories present in a recount according to Derewianka’s classification (third-person references, the passive voice, and selected, specific, and descriptive details).

In the second row of Table 2, we can see that Will’s recount included four of the five linguistic categories of a recount. However, its short length and somewhat less streamlined constructions (e.g., una joven que asiste la universidad versus una joven universitaria) make it possible that he would receive a poor mark if automatically assessed. Even though Will’s recount did not include any instances of the passive voice, it did exhibit several aspects that both Irene’s and Lisa’s lack, such as the use of descriptive details of time (e.g., cuando llega) and manner (e.g., por su coche). Note the embedded process (una joven que asiste la universidad) as a feature of grammatical metaphor that has been linked to advanced academic discourse (see Halliday 1994; Martin 2000; Thompson 1996).


In the final column of Table 2, we can see that although Lisa’s response exhibited fewer features, her text did still exhibit some aspects of a recount, such as use of the past simple tense (fue, hablé). The potentially confounding variable of memory must also be mentioned. It is plausible that Lisa did not completely understand the story and for this reason was unable to retell it appropriately.

Transitivity Analysis: Quantifying Meaning-Form Units

The following sections outline one approach that human raters and possibly machine raters can use to quantify the processes, participants, and circumstances in examinees’ oral texts in order to present a more valid way of evaluating oral narrative. The Versant SR prompt asks examinees to recall “the situation, characters, actions and ending,” rather broad and inexact descriptors of narrative. In order to evaluate the ideational meanings, transitivity analysis can be used to complement form-focused criteria currently targeted by ASR systems.

The number of clauses present in examinee responses to the SR task is displayed in Table 3. These counts were facilitated by the transitivity analysis. The descriptive statistics of responses given by Lisa, Will, and Irene support the notion that 1) Irene’s recount was superior to Lisa’s and 2) Will’s falls somewhere between the two, exhibiting characteristics unique to both. These assumptions were formed based upon the criteria included in a factual recount according to Derewianka (1990) as presented in Table 2. Using this classification, discourse captured from the storytelling task can be coded using the textual descriptors of the recount genre and transitivity analysis, offering researchers an alternative way to assess the SR task based upon form-meaning relationships.

Processes

In order to use transitivity analysis to determine the quality of examinees’ recounts, a quantification of processes was conducted. First, we can see that Irene’s response included a total of 11 processes and Lisa’s contained only two, corresponding to approximately 70% and 10% of the total processes included in the computer prompt, respectively. Based on these percentages, Irene’s response was more similar to the story provided by the computer prompt than Lisa’s recount, at least in terms of number of processes (which may also be a function of length).

Table 2. A comparison of examinee recount characteristics

Irene’s recount
3rd Person Reference: Ella, Él, Ellos
Selected Details: quería ir, quería comprar boletos, no tuvieron más boletos, le tocó al brazo, le ofreció boletos, le dijo ‘no me toques’, le pidió perdón, le ofreció un…
Specific Details: A la casa de sus abuelos, el fin de semana, al estación de trenes
Descriptive Details: una niña universitaria, un hombre
Passive Voice: se asustó, se asustó

Will’s recount
3rd Person Reference: Ella, Él
Selected Details: va para buscar una llevada, llega, no hay lugar, le ofrezca un transporte
Specific Details: a la estación de autobuses, por su coche
Descriptive Details: una joven que asiste la universidad, un hombre, cuando llega
Passive Voice: none

Lisa’s recount
3rd Person Reference: Él
Selected Details: fue, hablé
Specific/Descriptive Details: un joven, con un profesor
Passive Voice: none


Will’s response included five processes, or 33% of the total processes, falling somewhere between the two. The variety of Will’s processes should also be noted, as these varied in type and were more proportionally similar to the variation of the original story (Will: 3 material, 1 verbal, 1 existential; computer prompt: 7 material, 3 mental, 3 verbal, 1 existential). In sum, it is likely that both number and variety of processes can be a measure of quality of narrative. While Irene’s processes also varied in type, her four mental processes exceeded the number used by the other examinees as well as the three in the original story. This difference is further discussed in the next section.
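The two measures discussed here, number and variety of processes, can be sketched as a short calculation over the counts in Table 3. The dictionary layout and the function names `coverage` and `types_used` are illustrative assumptions. Totals are taken as reported in the Tot column rather than recomputed.

```python
PROCESS_TYPES = (
    "material", "mental", "verbal", "relational", "behavioral", "existential",
)

# (counts per process type, total as reported in the Tot column), copied from Table 3.
TABLE_3 = {
    "CP":    ((7, 3, 3, 0, 0, 1), 15),
    "Lisa":  ((1, 0, 1, 0, 0, 0), 2),
    "Will":  ((3, 0, 1, 0, 0, 1), 5),
    "Irene": ((2, 4, 4, 1, 0, 0), 11),
}


def coverage(name, reference="CP"):
    """Share of the prompt's reported process total found in a response."""
    return TABLE_3[name][1] / TABLE_3[reference][1]


def types_used(name):
    """Process types that occur at least once in a response."""
    counts, _total = TABLE_3[name]
    return [t for t, n in zip(PROCESS_TYPES, counts) if n > 0]


for name in ("Lisa", "Will", "Irene"):
    print(f"{name}: {coverage(name):.0%} of prompt processes; "
          f"types used: {', '.join(types_used(name))}")
```

With the Table 3 totals this gives about 13%, 33%, and 73% for Lisa, Will, and Irene, close to the rounded figures quoted in the text. Whether such ratios track narrative quality is the open question the analysis raises, not something the calculation settles.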

Mental processes

Table 4 shows the qualitative data from the transitivity analysis, which complements the descriptive statistics given in Table 3. Notably, Irene’s response had more mental processes than any other examinee’s response and more than the original story. The fact that she chose to use such mental processes, even when they were not included in the original story, suggests her ability to relate the emotions of the story’s female protagonist to the extent that she was able to report on the young university woman’s unpleasant predicament (e.g., she wanted to go to her grandparents’ house and she became scared). Although other examinees made similar assumptions about the feelings of the male protagonist in the story (e.g., he was very mad and the boy is not happy), such feelings were never actually implied in the story (see Appendix A). That Irene was the only examinee who retold the story using a high incidence of accurate mental processes suggests that the number of mental processes, and their correspondence to those of the original story, may be an indicator of quality of narrative.

Verbal processes

In the example story, one can see that the main instances of verbal processes that occurred in the text were those when the young woman yelled, “¡no me toques!” and when the gentleman asked forgiveness. The first instance could be interpreted as a direct quote from the protagonist, whereas the latter instance was an example of the fixed phrase “pedir perdón.” Of the examinees, 13 were able to provide examples of verbal processes; however, only five of them were able to imitate the examples of verbal processes with corresponding verbiage as given in the computer

Table 3. Quantitative transitivity analysis of examinee responses to the story retelling task

                Processes
ID       #Wd    Ma    Me    V     R     B     E     Tot   Act   Bfy   Vb    Gol   Circ
CP       83     7     3     3     0     0     1     15    13    1     0     6     10
Lisa     12     1     0     1     0     0     0     2     2     0     0     0     2
Will     39     3     0     1     0     0     1     5     4     1     0     2     4
Irene    68     2     4     4     1     0     0     11    12    2     2     6     3
Mean     33.0   2.8   0.6   1.0   0.3   0.1   0.4   5.0   5.2   0.3   0.3   1.8   2.4
St. dev  15.0   1.0   1.0   1.0   0.6   0.2   0.6   2.1   2.6   0.6   0.6   1.3   1.0

#Wd = number of words, CP = computer prompt, Ma = material, Me = mental, V = verbal, R = relational, B = behavioral, E = existential, Tot = total processes, Act = actors, Bfy = beneficiaries, Vb = verbiage, Gol = goals, Circ = circumstances


Table 4. Qualitative transitivity analysis of examinee responses to the story retelling task

Each entry gives the oral discourse, followed by its coding under the column headings Process, Actor, Bfcy, Goal, and Circumstances.

Irene

una niña universitaria quería ir a casa de sus abuelos el fin de semana
Process: Mental: quería ir; Actor: niña universitaria; Circumstances: casa de sus abuelos, fin de semana

entonces fue al estación del tren
Process: Material: fue; Actor: Implicit: ella; Circumstances: estación del tren

y quería comprar boletos
Process: Mental: quería comprar; Actor: Implicit: ella; Goal: boletos

pero ya no tuvieron más boletos
Process: Relational: no tuvieron; Actor: Implicit: ellos; Goal: más boletos

y un hombre le tocó al brazo
Process: Material: tocó; Actor: Explicit: hombre; Bfcy: le; Goal: brazo

y le ofreció boletos
Process: Verbal: ofreció; Actor: Implicit: él; Bfcy: le; Goal: boletos

pero se asustó
Process: Mental: se asustó; Actor: Implicit: ella

y le dijo no me toques
Process: Verbal: dijo; Actor: Implicit: ella; Bfcy: le; Goal: no me toques

y el hombre también se asustó
Process: Mental: se asustó; Actor: Explicit: el hombre

y le pidió perdon...
Process: Verbal: pidió; Actor: Implicit: él; Bfcy: le; Goal: perdón

y el hombre le ofreció un...
Process: Verbal: ofreció; Actor: Explicit: el hombre; Bfcy: le; Goal: un…

Will

una joven que asiste la universidad
Actor: Explicit: una joven; Circumstances: que asiste la universidad

va a la estación de autobuses
Process: Material: va; Circumstances: a la estación de autobuses

para buscar una llevada
Process: Material: buscar; Goal: una llevada

y cuando llega
Process: Material: llega; Actor: Implicit: ella

no hay lugar para ella en los autobuses
Process: Existential: no hay; Actor: Explicit: lugar; Circumstances: para ella en los autobuses

y un hombre le ofrezca un transporte por su coche
Process: Verbal: ofrezca; Actor: Explicit: un hombre; Bfcy: le; Goal: un transporte; Circumstances: por su coche

y él
Actor: Explicit: él

Lisa

un joven fue colegio
Process: Material: fue; Actor: Explicit: Un joven; Circumstances: colegio

y hablé con un profesor y
Process: Verbal: hablé; Actor: Implicit: yo; Circumstances: con un profesor

él no
Actor: Explicit: él


prompt. Irene’s response showed her ability to make use of verbal processes to reiterate what the characters in the story said. Although other examinees also used verbal processes, none matched the scope of Irene’s use.

Material processes

Material processes were by far the most extensively used (M_Ma = 2.8, SD_Ma = 1.0; M_Tot = 5.0, SD_Tot = 2.1). These show a strong sense of “doing” and “happening,” which might be expected of the recount genre. Indeed, since material processes often signify actions and events, it is not surprising that examinees used them to advance the story’s plot. Many were also able to demonstrate advanced use of material processes, including appropriate accompanying participants, goals, beneficiaries, and circumstances. The ability to use embedded processes with corresponding participants, goals, circumstances, and in some cases beneficiaries may indicate developed meaning-making ability. A failure to link a circumstance to its respective process may be associated with decreased meaning-making resources. The discussion of SFL participants as they occur in Spanish will be further elaborated in the following section.

Participants

Actors

The computer prompt provided a total of seven explicit and five implicit actors, as observed in Table 4. In Spanish, a pro-drop language in which a subject does not need to be present, implicit actors are represented in grammatical verb morphemes, where the inflection of the verb indicates who is doing the action. Explicit actors, on the other hand, where the subject of the sentence stands alone and accompanies the verb + the implicit actor, are often used to emphasize who is doing the action. The majority of examinee responses did not approximate the number of actors in the original story (M = 5.2; SD = 2.6), with the exception of Irene’s, which contained four explicit and eight implicit actors. This represented 90% of the total actors included in the prompt.

The majority of examinee responses contained five actors or fewer, representing 40% or less of the total number of actors. Looking at the number and type of actors also offered insight into the complexity of examinees’ responses. Whereas implicit actors may have demonstrated a control of verb number and tense, examinees’ overuse of explicit actors or a non-correspondence between the actors and process verbs may have indicated an elementary grasp of meaning resources, especially in Spanish where verb conjugation is actor-dependent. The correspondence between actor and process is another area where meaning is tightly linked to form in Spanish and thus provides a means of distinguishing between examinee responses based on form-meaning criteria.
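One way to picture the explicit/implicit distinction in a coded response is to tag each actor with how it is realized and then count the tags. The sketch below does this for three clauses from Irene’s recount as coded in Table 4; the tuple layout and the names `actor_profile` and `share_of_prompt` are hypothetical. The final line uses the Act column of Table 3 (Irene 12, prompt 13).

```python
from collections import Counter

# (clause, actor, realization) for three clauses of Irene's recount in Table 4.
CODED_ACTORS = [
    ("y un hombre le tocó al brazo", "hombre", "explicit"),
    ("y le ofreció boletos", "él", "implicit"),   # actor carried by verb inflection
    ("pero se asustó", "ella", "implicit"),       # actor carried by verb inflection
]


def actor_profile(coded):
    """Count explicit versus implicit actor realizations."""
    return Counter(realization for _clause, _actor, realization in coded)


def share_of_prompt(actor_count, prompt_actor_count):
    """Actors in a response as a fraction of the actors in the prompt."""
    return actor_count / prompt_actor_count


profile = actor_profile(CODED_ACTORS)
print(profile["explicit"], profile["implicit"])
print(f"{share_of_prompt(12, 13):.0%}")
```

The ratio for Irene comes to about 92%, which the text reports as 90%. A count like this does not check whether each actor agrees with its verb, which the discussion above treats as a separate indicator.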

Goals and Beneficiaries

In Spanish, beneficiaries and goals are often represented by clitics, which entail expressing what (e.g., el boleto) was given and to whom (e.g., a ella). On average, the number of goals included in examinees’ responses was much less than that of the prompt (M = 1.8; SD = 1.3), suggesting that the complexity of language, specifically the ability of examinees to express complex material processes involving multiple beneficiaries and goals, was lacking. It was observed that Irene’s recount was the only one that came close to paralleling the number of goals in the prompt, and it even exceeded the prompt in its use of beneficiaries. If resemblance to the original story indicates quality of narrative, one can argue the superiority of Irene’s response. Will, on the other hand, used two goals and one beneficiary, which corresponded to una llevada, un transporte, and a la joven universitaria (i.e., le), respectively.


Only five of the examinee responses were able to match (or in Irene’s case outnumber) the quantity of beneficiaries included in the prompt (M = 0.3, SD = 0.6). However, it is worth noting that the computer prompt only provided one example of a material process that included a goal and beneficiary (e.g., pidió perdón), a collocation that some examinees used.

Irene’s recount also had one example of a material process with both goal and beneficiary (e.g., le ofreció boletos); however, it should be noted that in the actual story the gentleman offering to take the young university woman to her grandparents’ house by car never actually offered her a train ticket. For this reason, it is important not only to quantify the processes, participants, and circumstances but also to understand how well their semantic relationships match those of the original story.
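The check described here, whether a process and its complement in a response correspond to something in the original story, might be sketched as a comparison of hand-coded pairs. Everything below is an illustrative assumption: the lemmas and complements were assigned by hand from Appendix A, the name `split_by_match` is hypothetical, and exact matching is a crude stand-in for a rater’s judgment (a paraphrase with the same meaning would be flagged as unmatched).

```python
# Hand-assigned (process lemma, complement) pairs; lemmas and labels are assumptions.
PROMPT_RELATIONS = {
    ("ofrecer", "llevar en coche"),   # "le ofrece llevarla en coche"
    ("pedir", "perdón"),              # "le pide perdón"
}

IRENE_RELATIONS = [
    ("ofrecer", "boletos"),           # "le ofreció boletos"
    ("pedir", "perdón"),              # "le pidió perdón"
]


def split_by_match(response, prompt):
    """Separate response pairs into those found in the prompt and those not."""
    matched = [pair for pair in response if pair in prompt]
    unmatched = [pair for pair in response if pair not in prompt]
    return matched, unmatched


matched, unmatched = split_by_match(IRENE_RELATIONS, PROMPT_RELATIONS)
print("matched:", matched)
print("unmatched:", unmatched)
```

Here the offer of tickets is separated from the request for forgiveness, mirroring the observation that the gentleman in the story never offered a train ticket.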

Will uses the term una llevada, which technically translates to importation rather than a ticket. This term is likely to be deduced and understood by a listener in context but would likely be marked erroneous by automated systems, despite its root llevar. If Will’s non-standard use of llevada is likely to be understood by a fellow listener/speaker in context thus carrying out a message, this raises questions concerning whether it is fair to mark such creative constructions of meaning-making as completely erroneous.

Circumstances

Lastly, the circumstances contained in the original story answered more complex questions than those provided by examinees, such as how? (e.g., sola and rápidamente). Such qualitative details add to the life and dynamism of the story, helping the listener to visualize a young university girl and empathize with her predicament. However, in the retelling of the present story, such fine nuances are generally underrepresented or can also be represented in other ways, as in Will’s nominalization of una joven que asiste a la universidad (e.g., what kind?).

None of the examinee responses approximated the quantity of circumstances included in the original story (M = 2.4, SD = 1.0). The maximum number used was four, the majority of which pertained to the location (e.g., a la casa de sus abuelos, en un coche), time (e.g., para el fin de semana), means (e.g., por tren, a pie), or cause (e.g., un boleto, para ella). Most examinees focused on time, place, and other factual details, pointing to the possibility that greater numbers of circumstances may be present in the recounts of examinees with higher abilities than those analyzed here. Since there was a large gap between the number of circumstances in the “best” recount and that of the original story, circumstances may be another area that helps to usefully distinguish between higher levels of ability than the ones taken from this sample. Although only a preliminary analysis, circumstances may provide a scale of measurement indicating quality of narrative.

Conclusion

Language assessment would greatly benefit from an approach that considers linguistic form as inextricably linked to meaning. It is posited here that SFL, specifically genre analysis and transitivity analysis, can provide such a framework to understand how examinees use language to create meaning on automated speaking tests. While current automated assessments evaluate examinee oral discourse by focusing solely on sub-sentence-level structures in order to detect mismatched forms and evaluate correctness, an overreliance on comparisons of NNS-to-NS linguistic forms offers an incomplete picture about the ways in which examinees use language to make meaning.

The SFL analyses shown here have offered a deeper understanding of the features of examinee discourse that might be used as the basis for future training of automated systems. Nevertheless, much work remains before conclusions can be drawn, as this is only a preliminary analysis of how such operations might be approached. The following paragraphs summarize what has been shown in the present paper and suggest several future directions for research.

First, one can intuitively observe using genre analysis (Derewianka 1990) that certain examinee responses to the story retell task exhibited more features of a factual recount than others. Responses that contained rhetorical features such as the use of third-person references, selected, specific, and descriptive details, as well as the passive voice appeared to more effectively reconstruct past experience by using language to chronicle a sequence of events over time. Such judgments provide a first step toward understanding what exactly makes an examinee’s recount successful.

Next, a transitivity analysis (Ravelli 2000) has shown how intuitive judgments about the quality of examinee narratives can be supported by evaluating sub-clausal meaning units, including processes, participants, and circumstances. This analysis showed how the oral recounts of three examinees exhibited varying linguistic resources to recreate meaning. Specific features that may offer test developers evidence of meaning-form connections in oral recounts in Spanish include material, mental, and verbal processes; participants, such as (implicit and explicit) actors, goals, and beneficiaries; and circumstances.

Evidence offering insight into how examinees manipulate their linguistic resources to make meaning is currently absent in most assessment frameworks, which tend to view grammar from a structuralist viewpoint. As is often the case, breaking grammatical rules leads to flawed forms but does not necessarily indicate unsuccessful communication. In fact, there is little evidence to suggest a direct link between erroneous grammatical forms and ability to communicate in an L2. Oftentimes the most eloquent of speakers is able to convey meaning while breaking innumerable grammatical, phonological, or syntactical rules. This is not meant to imply that L2 learners should disregard grammatical forms, but rather to emphasize that linguistic evaluation must be viewed as integrated within an individual’s ultimate ability to extend their resources for making meaning in context.

ACKNOWLEDGEMENTS

I would like to extend my gratitude to Tammy Slater and Bernard Mohan for their time and expertise, which led to the refinement and improvement of this manuscript.

WORKS CITED

Bernstein, Jared, Elizabeth Rosenfeld, Brent Townshend, and Isabella Barbier. (2004). “An Automatically-Scored Spoken Spanish Test and Its Relation to OPIs.” Conference on the International Association for Language Assessment (IAEA) 2004. Pennsylvania. Presentation.

Bernstein, Jared, Alistair Van Moere, and Jian Cheng. (2010). “Validating Automated Speaking Tests.” Language Testing 27.3: 355–77. Print.

Bloomfield, Leonard. (1933). Language. New York: Holt. Print.

Box, George E. P., and Norman R. Draper. (1987). Empirical Model-Building. New York: Wiley. Print.

Chun, Christian W. (2006). “Commentary: An Analysis of a Language Test for Employment. The Authenticity of the PhonePass Test.” Language Assessment Quarterly 3.3: 295–306. Print.

Coffin, Caroline. (2006). Historical Discourse: The Language of Time, Cause and Evaluation. London: Continuum. Print.

Cook, Vivian. (1999). “Going Beyond the Native Speaker in Language Teaching.” TESOL Quarterly 33.2: 185–209. Print.

Cook, Vivian, Benedetta Biassetti, Chise Kasai, Miho Sasaki, and Jun A. Takahashi. (2006). “Do Bilinguals Have Different Concepts? The Case of Shape and Material in Japanese L2 Users of English.” International Journal of Bilingualism 10.2: 137–52. Print.

Derewianka, Beverly. (1990). Exploring How Texts Work. Newtown: Primary English Teaching Association. Print.

———. (1999). Introduction to Systemic Functional Linguistics: Literacy and Discursive Power. Washington DC: Falmer. Print.


Downey, Ryan, Hossein Farhady, Rebecca Present-Thomas, Mansori Suzuki, and Alistair Van Moere. (2008). “Evaluation of the Usefulness of the Versant for English Test: A Response.” Language Assessment Quarterly 5.2: 160–67. Print.

Early, Margaret. (1990). “Enabling First and Second Language Learners in the Classroom.” Language Arts 67: 567–74. Print.

Early, Margaret, Bernard Mohan, and Hugh R. Hooper. (1989). “The Vancouver School Board Language and Content Project.” Multicultural Education and Policy: ESL in the 1990s. Ed. John H. Esling. Ontario: Ontario Institute for Studies in Education. 107–22. Print.

Early, Margaret, and Gloria Tang. (1991). “Helping ESL Students Cope with Content-Based Texts.” TESL Canada Journal 8.2: 34–44. Print.

Early, Margaret, Carol Thew, and Patricia Wakefield. (1986). Integrating Language and Content Instruction K–12: An ESL Resource Book. Vol. 1. Victoria, BC: Ministry of Education, Modern Language Services Branch. Print.

Fang, Zhihui, and Zhuhun Wang. (2011). “Beyond Rubrics: Using Functional Language Analysis to Evaluate Student Writing.” Australian Journal of Language and Literacy 34.2: 147–65. Print.

Halliday, Michael A. K. (1979). Working Conference on Language in Education: Report to Participants. Sydney: University of Sydney. Print.

———. (1994). An Introduction to Functional Grammar. London: Arnold. Print.

Hood, Susan. (2010). Evaluation in Academic Writing. New York: Palgrave. Print.

Huang, Jingzi, and Glenn Morgan. (2003). “A Functional Approach to Evaluating Content Knowledge and Language Development in ESL Students’ Science Classification Texts.” International Journal of Applied Linguistics 13.2: 234–59. Print.

Huang, Jingzi, and Bernard Mohan. (2009). “A Functional Approach to Integrated Assessment of Teacher Support and Student Discourse Development in an Elementary Chinese Program.” Linguistics and Education 20.1: 22–38. Print.

Martin, Jim R. (2000). “Close Reading: Functional Linguistics as a Tool for Critical Discourse Analysis.” Researching Language in Schools and Communities: Functional Linguistic Perspectives. Ed. Len Unsworth. London: Cassell. 275–302. Print.

———. (2009). “Genre and Language Learning: A Social Semiotic Perspective.” Linguistics and Education 20.1: 10–21. Print.

Martin, Jim R., Christian M. I. M. Matthiessen, and Claire Painter. (1997). Working with Functional Grammar. London: Arnold. Print.

Martínez, Iliana A. (2001). “Impersonality in the Research Article as Revealed by Analysis of the Transitivity Structure.” English for Specific Purposes 20.3: 227–47. Print.

Meurer, Detmar. (2004). “Role Prescriptions, Social Practices, and Social Structures: A Logical Basis for the Contextualization of Analysis in SFG and CDA.” Systemic Functional Linguistics and Critical Discourse Analysis: Studies in Social Change. Ed. Lynne Young and Claire Harrison. London: Continuum. 85–99. Print.

Mohan, Bernard. (1998). “Knowledge Structures in Oral Proficiency Interviews for International Teaching Assistants.” Talking and Testing: Discourse Approaches to the Assessment of Oral Proficiency. Ed. Richard Young and Agnes W. He. Philadelphia: Benjamins. 173–204. Print.

Mohan, Bernard, and Gulbahar H. Beckett. (2003). “A Functional Approach to Research on Content-Based Language Learning: Recasts in Causal Explanations.” The Modern Language Journal 87.3: 421–32. Print.

Mohan, Bernard, Constant Leung, and Tammy Slater. (2010). “Assessing Language and Content: A Functional Perspective.” Testing the Untestable in Language Education. Ed. Amos Paran and Lies Sercu. Bristol: Multilingual. 217–40. Print.

Mohan, Bernard, and Tammy Slater. (2004). “The Evaluation of Causal Discourse and Language as a Resource for Meaning.” Language, Education, and Discourse: Functional Approaches. Ed. Joseph A. Foley. London: Continuum. 255–69. Print.

———. (2010). “Examining the Theory/Practice Relation in a High School Science Register: A Functional Linguistic Perspective.” Journal of English for Academic Purposes 5.4: 302–16. Print.

Pearson Education (2008). Versant for English Test Design and Validation. MS. Pearson Education Inc. Print.

Pearson Education (2011). Versant for English Test Design and Validation. MS. Pearson Education Inc. Print.

Ravelli, Louise. (2000). “Getting Started with Functional Analysis of Texts.” Researching Language in Schools and Communities: Functional Linguistic Perspectives. Ed. Len Unsworth. London: Cassell. 27–64. Print.

Shokhui, Hossein, and Forough Amin. (2010). “A Systemist ‘Verb Transitivity’ Analysis of the Persian and English Newspaper Editorials: A Focus of Genre Familiarity on EFL Learner’s Reading Comprehension.” Journal of English Language Teaching and Research 1.4: 387–96. Print.


Slater, Tammy, and Bernard Mohan. (2010). “Towards Systematic and Sustained Formative Assessment of Causal Explanations in Oral Interactions.” Ed. Amos Paran and Lies Sercu. Testing the Untestable in Language and Education. London: Multilingual. 256–69. Print.

Tang, Gloria M. (1992). “The Effect of Graphic Representation of Knowledge Structures on ESL Reading Comprehension.” Studies in Second Language Acquisition 14: 177–95. Print.

Thompson, Geoff. (1996). Introducing Functional Grammar. London: Arnold. Print.

Van Moere, Alistair. (2012). “A Psycholinguistic Approach to Automated Oral Assessment.” Language Testing 29.2: 1–20. Print.

Veel, Robert. (1997). “Learning How to Mean—Scientifically Speaking: Apprenticeship Into Scientific Discourse in the Secondary School.” Genre and Institutions: Social Processes in the Workplace and School. Ed. Frances Christie and Jim R. Martin. London: Pinter. 161–95. Print.

Xi, Xiaoming. (2008). “What and How Much Evidence Do We Need? Critical Considerations in Validating an Automated Scoring System.” Towards Adaptive CALL: Natural Language Processing for Diagnostic Language Assessment. Ed. Carol A. Chapelle, Yoo-Ree Chung, and Jing Xu. Ames: Iowa State University. 102–14. Print.

APPENDICES

Appendix A: Story Retelling Task

Computer Prompt: Story Retelling

You will hear a brief story in Spanish; after the story, you will have 30 seconds to retell it in Spanish as best you can. Try to retell as much of the story as you can in Spanish, including the situation, characters, actions, and ending.

Una joven universitaria va a la estación de trenes para comprar un pasaje en tren a la casa de sus abuelos para el fin de semana. Cuando llega a la estación, ve que ya no quedan más pasajes. Un señor la ve sola en la estación de buses y, tocando su brazo, le ofrece llevarla en coche. La joven queda muy asustada y le grita “¡no me toques!” El señor se asusta y le pide perdón, regresando rápidamente a su auto para irse.

Irene

Una niña universitaria quería ir a casa de sus abuelos el fin de semana. Entonces fue al estación del tren y quería comprar boletos. Ya no tuvieron más boletos y un hombre le tocó al brazo y le ofreció boletos pero se asustó y le dijo “¡no me toques!” Y el hombre también se asustó y le pidió perdón . . . y el hombre le ofreció un . . . oh . . .

Will

Una joven que asiste la Universidad. Va a la estación de autobuses. Para buscar una llevada. Y cuando llega, no hay lugar para ella en los autobuses. Y un hombre le ofrezca un transporte por su coche. Y el . . .

Lisa

Un joven fue ah colegio. Y hablé con un profesor ahm y el uh el (clears throat) no ah (laughing) . . .

Appendix B: Spanish Speaking Text

Part A: Reading

Please read the sentences as you are instructed.

1. Juana le dio de regalo a su hermano una suscripción a una revista moderna.
2. Iba al trabajo y al colegio en su nuevo coche todos los días.
3. Un día a la salida de la universidad se le ocurrió la idea de ir al gimnasio.
4. Se quedó muy feliz pensando que nunca le había fallado.
5. Habían quedado en reunirse en la plaza a las dos de la tarde.
6. Mientras daban una vuelta Joaquín le hablaba de sus metas para el futuro y le preguntó si quería ayudarlo.
7. Ella se quedó hablando un poquito, después decidió que sí, que le haría el favor.
8. Tristes, se besaron y optaron por despedirse en vez de ir a comer juntos.

Part B: Repeat

Please repeat each sentence that you hear.

Example: a voice says, “Les encanta comer mariscos.” and you say, “Les encanta comer mariscos.”

Part C: Opposites

Now, when you hear a word, just say the opposite.

Example: a voice says, “bonito” and you say “feo”; a voice says, “pesado” and you say “liviano”

Part D: Questions

Now, please just give a simple answer to the questions.

Example: a voice says, “¿Qué es más grande, un elefante o una silla?” and you say, “un elefante”.

Part E: Sentence builds

Now, please rearrange the word groups into a sentence.

Example: a voice says, “y María”… “Alejandro”… “saltan la cuerda” and you say, “Alejandro y María saltan la cuerda.”

Part F: Story retelling

You will hear a brief story in Spanish. After the story, you will have 30 seconds to retell it in Spanish as best you can. Try to retell as much of the story as you can in Spanish, including the situation, characters, actions, and ending.

Part G: Open question

You will have 30 seconds to answer a question. The question will be about family life or personal choices. The question will be spoken twice, followed by a beep. When you hear the beep, you will have 30 seconds to answer the question. At the end of the 30 seconds, another beep will signal the end of the time you have to answer.

Part H: Simulated interactions

You will hear a series of spoken questions and statements. After listening to each one, respond in a way that is appropriate for the given situation. Make sure you respond to all of them, even the goodbyes. You will have 10 seconds to respond to each one. Answer in a way that you think would be natural given the situation. Try to elaborate as much as the 10-second time limit allows.

Example: a voice says “Hola, ¿cómo está?” and you say “Hola, bien. Tuve un buen día en mi universidad y saqué una A en mi exámen ¿y Usted?”


Appendix C: Transitivity Analysis

Totals rows list the non-empty cells of each participant's totals row from left to right, ending with the process-type counts (Ma, Me, V, R, B, E).

Number | #Wd | Story-retelling task oral discourse | Process | Participant/s | M: Benef. / V: Verbiage / Goal | Circumstances
CP | 83 | una joven universitaria va a la estación de trenes # | Material: is going | A university woman | | To the train station
 | | para comprar un pasaje en tren a la casa de sus abuelos # | Material: to buy | | A train ticket | To her grandparents' house
 | | para el fin de semana | | | | For the weekend
 | | cuando llega a la estación # | Material: arrives | (She) | | To the station
 | | ve que # | Mental: sees | (She)
 | | ya no quedan más pasajes # | Existential: there aren't | Tickets | | More
 | | un señor la ve sola en la estación de buses y # | Material: sees | A gentleman | Her | Alone, In the bus station
 | | tocando su brazo # | Material: touching | | Her arm
 | | le ofrece llevar en coche # | Verbal: offers to take | (He) | Her | By car
 | | la joven queda muy asustada # | Mental: is left scared | The young woman
 | | y le grita no me toques< # | Verbal: yells | (She); (You) | Don't touch me!; Me | At him
 | | el señor se asusta | Mental: is startled | The gentleman
 | | y le pide perdón # | Verbal: asks | | Her forgiveness; Her
 | | regresando rápidamente a su auto # | Material: returning | (He) | | Quickly, to his car
 | | para irse # | Material: to leave
Totals (CP): 15 13 1 6 10 7 3 3 1
P1 | 10 | los la gente visitaran sus abuelos # | Material: visited | The people | Their grandparents
 | | manejaban un coche y # | Material: used to drive | (They) | A car
Totals (P1): 2 2 2 2
P2 | 12 | a la universidad durante la fin de semana | | | | To the university, during the weekend
 | | el caractár es Juan # | Relational: is | The character | Juan
Totals (P2): 1 1 1 2 1
P3 | 12 | un joven fue colegio< # | Material: went | A young man | | To school
 | | y hablé con un profesor< y él no # | Verbal: talked | (I) | | With a teacher


P4 | 22 | en ese histora a un joven es # | Existential: is | A young man | | In this story
 | | no toca un un coche # | Material: doesn't touch | (He) | A car
 | | él dice # | Verbal: says | He
 | | y diganse # | Verbal: say to yourself | (You all)
 | | y él va a en pies # | Material: goes | He | | By foot
Totals (P4): 5 5 1 2 2 2 1
P5 | 24 | una joven universitaria va a un estación de tren # | Material: is going | A university woman | | To a train station
 | | para un paisaje # | | (She) | | For a ticket
 | | y dice que no le tocará # | Verbal: says | | Not to touch her
 | | y volver a a la estación # | Material: to go | (She) | Him | To the station
Totals (P5): 4 3 1 3 3 1
P6 | 24 | una joven vaya a la centro # | Material: go | A young woman | | To the center
 | | para comprar una para # | Material: to buy
 | | y ella tocó # | Material: touched | She
 | | y le grita no y # | Verbal: yells | (She) | | At him
 | | y él sale el centro imediantemente # | Material: leaves | He | | The center, immediately
Totals (P6): 5 4 1 3 4 1
P7 | 26 | un estudiante vol volvé a su casa # | Material: (I) returned | A student | | To his house
 | | y un hombre to ya tocó el coche # | Material: touched | A man | The car
 | | y él era muy enfadado # | Mental: was very mad | He
 | | y grita # | Verbal: yells | (He)
 | | gritaban a él y # | Verbal: used to yell | (They) | | At him
Totals (P7): 5 5 2 1 2 1 2


P8 | 28 | una persona llegó a la casa de de un de otra persona # | Material: arrived | A person | | To the house of another person
 | | y estaba buscando para un joven # | Material: was looking for | (She) | A young man
 | | encontró el joven # | Material: found | (She) | The young man
 | | y se besó # | Behavioral: kissed herself | (She)
 | | y antes llegó corte # | Material: arrived | (She)
Totals (P8): 5 5 2 1 4 1
P9 | 30 | una joven ... está llevando a sus abuelos # | Material: taking | A young woman | Her grandparents
 | | y va a en estación de tren pero # | Material: is going | (She) | | To the train station
 | | hay muchos paisajes entonces # | Existential: there are | Tickets
 | | un hombre ah ofrece que # | Verbal: offers | A man
 | | el joven va en coche # | Material: goes | The young man | | By car
Totals (P9): 5 5 1 2 3 1 1
P10 | 31 | una estudiante de lar universidad quieren visitar a sus abuelas cuando por tren # | Mental: wanted (they) to visit | A student of the university | Her grandparents | By train
 | | cuando ir # | Material: to go
 | | cuando ve # | Material: sees | (She)
 | | fue a la estación del tren # | Material: went | (She) | | To the train station
 | | no puedo tener más pasajes # | Relational: can't have | (I) | More tickets
 | | entonces una hombre | | A man
Totals (P10): 5 5 2 2 3 1 1


P11 | 35 | una un joven fui fue a la casa de su abuelo # | Material: went | A young (wo)man | | To his grandfather's house
 | | por el fin de semana y | | | | For the weekend
 | | su abuelo tocó su brazo # | Material: touched | His grandfather | His arm
 | | y el joven ah dijo no me toques< # | Verbal: said | The young man | Don't touch me!
 | | y fue a su casa # | Material: went | (He) | | To his house
Totals (P11): 4 4 1 1 3 3 1
P12 | 36 | sobre un joven que necesitaba unos billetes # | Existential: (It is) about | A young man | Who needed some bills
 | | para el tren pero ahm estaban # | Relational: were | (They) | | For the train
 | | no tenían más billetes entonces un # | Relational: have | (They) | More bills
 | | alguien más viejo le ofreció un billete # | Verbal: offered | Someone older | Him; A bill
 | | y le asustó el joven y pues # | Material: scared | (He) | Him (the young man)
 | | los dos salieron # | Material: left | Both of them
Totals (P12): 2 0 1 2 0 1
P13 | 39 | una joven que asiste la universidad | | A woman that attends university
 | | va a la estación de autobuses # | Material: goes | | | To the bus station
 | | para buscar una llevada # | Material: to look for | (She) | A lift?
 | | y cuando llega # | Material: arrives
 | | no hay lugar para ella en los autobuses # | Existential: there isn't | Place | | For her, On the buses
 | | y un hombre le ofrezca un transporte por su coche # | Verbal: offers | A man | Her; A ride | By his car
 | | y el
Totals (P13): 5 4 1 2 4 3 1 1


P14 | 40 | una mujer vas va a ver sus abuelos # | Material: is going to see | A woman | Her grandparents
 | | y cuando llega a sus abuelos # | Material: arrives | (She) | | To her grandparents
 | | una pers alguien toque tu brazo y # | Material: touch | Someone | Your arm
 | | y venga su venga en un coche # | Material: come | (You) | | In a car
 | | y mujer es muy asustado y sí # | Mental: is very scared | Woman
 | | se dice no me toques # | Verbal: says | One | Don't touch me!
Totals (P14): 6 6 1 2 2 4 1 1
P15 | 42 | un chico que asiste una universidad | | A boy who attends a university,
 | | devuelva a su casa a su familia y # | Material: returns | (I) | | Home, To his family
 | | cuando llego a # | Material: return | (I)
 | | había un un hombre que que lo toque a un a un coche< # | Existential: There was | A man | It, A car | That touch
 | | no sé # | Mental: don't know | (I)
 | | porque lo toqué # | Material: touched | The boy | It
 | | pero el chico no es muy alegre # | Mental: is not happy | | | Very
Totals (P15): 6 7 3 4 3 2 1
P16 | 43 | una chico fue va a ir a la casa de sus abuelos por tren # | Material: is going to go | A (fem) boy | | To his grandparent's house; By train
 | | y cuando llegaba # | Material: used to arrive | (He)
 | | el llegaste o no llegó al estación de tren # | Material: didn't arrive | (He) | | To the train station
 | | había personas o pasajes # | Existential: there were | People, tickets
 | | y había un pasaje muy que le dio miedo # | Existential: there was | A ticket that made him afraid, He


P16 (continued) | | y él trataba de # | Material: used to try
Totals (P16): 6 7 3 4 2
P17 | 50 | había una chica # | Existential: there was | A girl
 | | quee quería visitar sus abuelos # | Mental: wanted to visit | (She) | Her grandparents
 | | el fin de semana | | | | For the weekend
 | | y ella va a una estación de tren para tomar un billete # | Material: is going to take | She | A bill | To a train station
 | | | Existential: there weren't | Bills, So many bills
 | | había un hombre # | Existential: there was | A man
 | | que toca esa chica # | Material: touches | (He) | That girl
 | | y esa chica gritó no me toques # | Verbal: yelled | That girl | Don't touch me!
 | | y el hombre hulló # | Material: fled | The man
Totals (P17): 8 9 1 3 2 3 1 1 3
P18 | 55 | un joven quiere ir a su a la casa de sus abuelos # | Mental: wanted to go | A young man | | To the house of his grandparents,
 | | por fin de semana
 | | y va a tocar un o tomar un tren # | Material: is going to take | (He) | A train
 | | pero cuando llega a la estachión # | Material: arrives | (He) | | To the station
 | | un hombre ahm le toca # | Material: touches | A man | Him
 | | y el joven gritó # | Verbal: yelled | The young man | Don't touch me
 | | y diice y dijo no me toca # | Verbal: says, said | (He) | Him
 | | y el hombre le dio a perdon y # | Material: gave | The man | Forgiveness
Totals (P18): 7 7 2 3 3 4 1 2


P19 | 68 | una niña universitaria quería ir a a casa de sus abuelos # | Mental: wanted to go | A young university woman | | To her grandparents' house
 | | el fin de semana | | | | The weekend
 | | entonces fue al estación del tren # | Material: went | (She) | | To the train station
 | | y quería comprar boletos # | Mental: wanted to buy | (She) | Passes
 | | pero ya no tuvieron más boletos # | Relational: didn't have | (They) | More passes
 | | y un hombre le tocó al brazo # | Material: touched | A man | Her arm
 | | y le ofreció boletos # | Verbal: offered | (He) | Her; Passes
 | | pero se asustó # | Mental: became scared | (She)
 | | y le dijo no me toques # | Verbal: told | (She) | Don't touch me!; Him
 | | y el hombre también se asustó # | Mental: became scared | The man
 | | y le pidió perdon ... # | Verbal: asked | (He) | Her forgiveness; her
 | | y el hombre le ofreció un... # | Verbal: offered | The man | A…
Totals (P19): 11 12 4 6 3 2 4 4 1

#Wd = number of words, CP = computer prompt, Ma = material, Me = mental, V = verbal, R = relational, B = behavioral, E = existential, Act = actors, Bfy = beneficiaries, Vb = verbiage, Gol = goals, Circ = circumstances
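
The process-type counts reported for each participant in Appendix C (Ma, Me, V, R, B, E) follow from the Process labels assigned to that participant's clauses. As a minimal sketch, assuming the labels are available as plain strings, the tally for participant P12 could be recomputed as below. The function name `tally_processes` and the `ABBREVIATION` mapping are illustrative and not part of the original study.

```python
from collections import Counter

# Column order of the process-type tallies in Appendix C.
PROCESS_ORDER = ["Ma", "Me", "V", "R", "B", "E"]

# Map the process labels used in the table to the legend's abbreviations.
ABBREVIATION = {
    "Material": "Ma",
    "Mental": "Me",
    "Verbal": "V",
    "Relational": "R",
    "Behavioral": "B",
    "Existential": "E",
}


def tally_processes(process_cells):
    """Count process types from cells such as 'Material: goes'."""
    counts = Counter()
    for cell in process_cells:
        # The process type is the text before the first colon.
        label = cell.split(":", 1)[0].strip()
        counts[ABBREVIATION[label]] += 1
    # Counter returns 0 for process types that never occur.
    return [counts[abbr] for abbr in PROCESS_ORDER]


# Process cells for participant P12, as listed in Appendix C.
p12 = [
    "Existential: (It is) about",
    "Relational: were",
    "Relational: have",
    "Verbal: offered",
    "Material: scared",
    "Material: left",
]

print(tally_processes(p12))  # [2, 0, 1, 2, 0, 1]
```

The result matches the process-type counts listed for P12 in the appendix (2 material, 0 mental, 1 verbal, 2 relational, 0 behavioral, 1 existential).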