Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
Title Examining the notion of listening sub-skill divisibility and its implications for
second language listening Author(s) Christine Chuen Meng Goh and Vahid Aryadoust Source International Journal of Listening, 29(3), 109-133 Published by Taylor & Francis (Routledge) Copyright © 2014 Taylor & Francis This is an Accepted Manuscript of an article published by Taylor & Francis in International Journal of Listening on 30/09/2014, available online: http://www.tandfonline.com/doi/full/10.1080/10904018.2014.936119 Notice: Changes introduced as a result of publishing processes such as copy-editing and formatting may not be reflected in this document. For a definitive version of this work, please refer to the published source. Citation: Goh, C. C., & Aryadoust, V. (2014). Examining the notion of listening sub-skill divisibility and its implications for second language listening. International Journal of Listening, 29(3), 109-133. http://dx.doi.org/10.1080/10904018.2014.936119
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
1
Examining the Notion of Listening Sub-Skill Divisibility and its Implications for
Second Language Listening
Abstract
The testing and teaching of listening has been partially guided by the notion of sub-skills, or a
set of listening abilities that are needed for achieving successful comprehension and
utilisation of the information from listening texts. Although this notion came about mainly
through applications of theoretical perspectives from psychology and communication studies,
the actual divisibility of the sub-skills has rarely been examined. This paper reports an
attempt to do so by using data from the answers of 916 test takers of a retired version of the
Michigan English Language Assessment Battery (MELAB) listening test. First, an iterative
content analysis of items was carried out, identifying five key sub-skills. Next, the
discriminability of sub-skills was examined through confirmatory factor analysis (CFA). Five
independent measurement models representing the sub-skills were evaluated. The overall
CFA model comprising the measurement models showed excessively high correlations
between factors. Further tests through CFA resolved the inadmissible correlations, though the
high correlations persisted. Finally, we made 23 aggregate-level items which were used in a
higher-order model, which induced best fit indices and resolved the inadmissible estimates.
The results show that the sub-skills in the test were empirically divisible, hence lending
support to scholarly attempts in discussing components in the listening construct for the
purpose of teaching and assessment.
Keywords: confirmatory factor analysis; listening sub-skills; second language
listening; sub-skill divisibility;
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
2
The teaching and testing of listening has been guided in part by the notion of sub-skills, or a
set of listening abilities that are needed for achieving successful comprehension and
utilisation of the information from listening texts. An underlying assumption of this approach
is that by distinguishing the listening components and operationalizing them in a language
curriculum or a language test, English learners will be able to build up a repertoire of skills
that enable effective comprehension while test takers’ ability levels can be accurately
reflected by their test results. The ability to use certain sub-skills or a combination of these
skills would therefore distinguish one learner or test taker from another (Buck & Tatsuoka,
1998). In spite of this received practice of focusing on listening sub-skills, the psychological
reality of some of these individual skills and the way they combine to bring about
comprehension has yet to be ascertained empirically. This paper attempts to offer some
insights from a study that examined whether different abilities considered to be important for
academic listening can be empirically separated based on test-takers’ performance in an
international standardised listening test. It begins with a review of the introduction of sub-
skills into second language teaching, and some findings and discussions that have contributed
to our current understanding of the issue of sub-skill divisibility. Next, it describes our study
which investigated the discriminability of different parts of the test and discusses the results
in relation to expressed test aims. It discusses the implications of the results for teaching and
assessing listening.
Sub-skills and Second Language Listening
The notion of sub-skills was introduced in various discussions about second language (L2)
listening skills during the dawn of the communicative language teaching era. Listening and
other language skills were defined as complex skill operations comprising a number of
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
3
constituent elements (i.e., sub-skills) which enable listeners to achieve comprehension. Some
well-known frameworks for sub-skills include:
Carroll’s (1972) two-stage taxonomy of parsing spoken language which comprised
the ability to comprehend linguistic information and relating it to the wider discourse;
Oakeshott-Taylor’s (1977) model of listening macro- and micro-comprehension
attributes;
Aitken’s (1978) categories of major listening skills for a) comprehending vocabulary
and syntax, b) identifying prosodic patterns, c) identifying attitudes, views, and
intentions of interlocutors, and rhetorical clues, and d) making inferences; and
Richard’s (1983) theoretical taxonomy of top-down and bottom-up sub-skills
organised around the context of conversational and academic listening.
The sub-skill approach to assessing L2 listening comprehension recognizes the interwoven
relationships between listening elements and seeks to create coherence in the development of
sub-skills. Richard (1983) proposed a list of 18 academic listening “micro-skills” to describe
simple and complex skills such as “[the] ability to identify purpose and scope of lecture [and]
identify relationships among units within discourse” (Richards, 1983, pp. 229-239). Similarly,
Flowerdew (1994, pp. 11-12) argued that academic listening has distinct features and
challenges, as compared with conversational listening, such as “[the] amount of implied
meaning or number of indirect speech acts” and “ability to concentrate on and understand
long stretches of talk”.
Munby’s (1978) list of 250 sub-skills for each of the four language skills arguably is the
lengthiest list ever proposed. For listening sub-skills, he proposed a complex model including
recognising the role of intonation and discourse markers and selecting key points, resulting in
the precise scope of the term sub-skill becoming relatively undefined. Other authors have also
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
4
suggested lists of listening phases, for example, Rixon’s (1981) five-stage framework which
stresses the importance of predicting contents, sample discourse, and checking the hypotheses
formed during listening.
Listening researchers have applied a variety of latent trait models to support the validity of
speculative taxonomies. Drawing on Buck’s (2001) default listening construct, Wagner (2004)
posited that measurable components of L2 listening would be correlated and include the
ability to comprehend explicitly and implicitly articulated information. Although similar
models have been verified in reading studies, Wagner’s exploratory factor analysis (EFA) did
not support the hypothesis, a finding which led to the assumption that listening sub-skills
were not discriminable, that is, listening comprehension was not empirically multidivisible.
He attributed this finding to the simultaneous operation of the hypothesized sub-skills.
Liao (2007) investigated the validity of Wagner’s model (2004). Contrary to Wagner’s
findings, her research provided evidence favouring the statistical divisibility of items
measuring the ability to understand explicit and implicit information, although some items
did not load on the pre-specified factors. However, confirmatory factor analysis (CFA)
showed that the correlation coefficient of these factors was .97, indicating that the factors
were hardly discriminable.
Soon after Liao’s (2007) research, Eom (2008, p. 81) proposed that the MELAB listening test
would rest upon a two-factorial structure consisting of “linguistic knowledge” and
“comprehension” as measured by 14 subcomponents. As indicated by their correlation index
(.85), these factors proved to be separable yet highly related. In the same year, Shin (2008)
researched the underlying structure of a Web-based listening test and showed that a higher-
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
5
order multitrait-multimethod CFA model where aggregate-level test items—variables that are
made on the basis of the combination of independent test items—could best capture the
complexity of the underlying structure of the listening test. The traits measured by the test
constructed a hierarchy and comprised the abilities to identify overarching main ideas, major
ideas, and supports. Apart from the methodological differences between Shin’s and previous
research, Shin used polytomous constructed response test items1 which reliably measured the
academic listening construct. Shin’s finding is in line with Bodie et al.’s (2011, p. 38) study
of the Watson–Barker Listening Test (WBLT), a test of listening comprehension, who argued
that dichotomous scales might not “fully reflect listening ability”.
More recently, cognitive diagnostic assessment (CDA) has been applied to listening
comprehension. For example, Sawaki, Kim, and Gentile (2009, pp. 200-201) reported that the
underlying test structure could be explained by three dominant sub-skills: understanding
“general and specific information”, “text structure and speaker intention”, and “connecting
ideas”.
The aforementioned studies suggest that research into L2 and academic listening has
produced mixed results with regard to the divisibility of sub-skills (see also Field, 2008;
Flowerdew & Miller, 2006; Goh, 2010; Rost, 2002; Tsui & Fullilove, 1998; Vandergrift
2007). However, the supporting studies (e.g., Shin, 2008) indicate that listening
comprehension sub-skills would generally include one or more of following sets of sub-skills,
which also feature in many recent discussions of the nature of learner listening:
(a) Understanding lexico-grammatical features (e.g., Shin, 2008);
(b) Understanding explicitly stated information (e.g., Field, 2008);
1 Unlike dichotomous test items which have only two possible results (e.g., 0 and 1), polytomous items have
more than two possible results (e.g., 0, 1, and 2).
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
6
(c) Understanding paraphrases (e.g., Wagner, 2004);
(d) Identifying intentions, attitudes, and rhetorical clues (e.g., Vandergrift 2007);
(e) Making inferences (e.g., Tsui & Fullilove, 1998); and
(f) Drawing conclusions (e.g., Liao, 2007; Sawaki et al., 2009).
Proponents of listening sub-skills nevertheless cautioned against any attempt at assessing
individual sub-skills, particularly for decoding (e.g., identifying vowels or recognizing
vocabulary) because of the view that these skills or processes function concurrently to
achieve an outcome and therefore cannot be evaluated separately (e.g., Dunkel, Henning, &
Chaudron 1993). For example, tests that assess the ability to identify prosodic features cannot
be said to assess comprehension, because the perception of prosodic patterns is important
only to the extent that it assists the listener to use more global skills such as understanding the
surface meaning (Dunkel et al., 1993).
In effect, the verification of the divisibility of listening comprehension would confer
important benefits on listening assessment and teaching (Hansen & Jensen, 1994; Shohamy
& Inbar, 1991). For example, it would attest to the genuineness of the speculative taxonomies
that have been used in both testing and teaching. Such evidence can be utilized to construct
fair (diagnostic) assessment systems which furnish fine-grained information concerning
weaknesses and strengths of test takers. An efficient (diagnostic) assessment system—which
would operate for the benefit of test takers—would provide both subscores and overall scores
on the basis of test takers’ performance on each bundle of items which tap a specific sub-skill
(Kunnan & Jang, 2008;). Such a system demands evidence of divisibility of sub-skills,
because supporting evidence which can provide subscores to assist test takers would make for
a more reliable practice (Sawaki, Quinlan, & Lee, 2013).
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
7
Learners also benefit from a sub-skill approach to listening in L2 pedagogy and (diagnostic)
assessment. For example, learners who receive helpful skill-based instructions on preparing
for academic lectures would perform better at the comprehension of lectures and tutorials.
The focus on sub-skills in academic listening is a significant part in the process of developing
L2 listening syllabi and materials. For example, Espeseth’s (2004) and Sanabria’s (2004)
course books of English for Academic Proficiency (EAP) listening focus on improving
students’ ability to listen for details, summarize the message, take notes, and understand
implicit messages. These materials expose students to relatively extensive oral texts to
improve their attention span and concentration, alongside intensive materials to strengthen
other sub-skills such as “whole utterance recognition” (Lynch, 2010, p. 82).
In light of the importance of listening sub-skills ascribed to listening pedagogy, particularly
in academic listening, and the paucity of empirical research in examining the listening
construct, the present study was an attempt to investigate the issue of divisibility through an
analysis of data from an academic listening test. It aimed to answer the following questions:
(a) Are there empirically divisible listening sub-skills underlying L2 listening test
performance?
(b) If so, what kind of relations do they have?
The Study
The Test
A retired version of the Michigan English Language Assessment Battery (MELAB) listening
test was used in this study. The test has three sections, with items tapping various sub-skills
that could potentially form part of the test takers ability to understand English spoken in
academic contexts in a comprehensive manner. The sections are a) 15 minimal-context items,
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
8
b) 20 short conversation items, and c) 15 radio interview items2. The format is entirely
multiple-choice, and the audio stimuli are played only once (see the CaMLA website).
In Section 1, the candidate hears a question or a statement and chooses the best answer in the
test booklet according to the illocutionary force of the utterance, including understanding “the
unexpected” statements or questions that would be posed to them in short daily exchanges
where they are expected to instantaneously provide a response, and vocabulary that had been
misunderstood by language users in real life (Johnson 2003, p. 36). In Section 2, the
candidate hears short conversations and chooses an answer option that most accurately
reflects the intent or outcome of the conversation. In Section 3, the candidate hears relatively
long simulated radio broadcast excerpts, mini lectures, or interviews. She may take notes and
refer to visual aids such as graphs and tables to help them answer comprehension questions
based on the texts. Each section intends to tap into various listening sub-skills, which are
discussed below (see Item Analysis subsection).
Data Source
The data were provided by the English Language Institute at the University of Michigan
(ELI-UM), now CaMLA, comprised a retired version of the MELAB, and graded answers
from 916 test takers, 425 female, 427 male, and 64 missing, who took the test from 76
countries. The participant-to-item ratio (916/50 = 18.32) was sufficient for CFA (Marsh &
Hocevar, 1988). Information about test structure, aims and focus was obtained from the
MELAB manual and other related documents.
2 Examples of test items for the current design of the MELAB test are available
at http://www.cambridgemichigan.org/.
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
9
Item analysis
Item analysis was chiefly guided by the relevant literature and Johnson’s (2003) MELAB
Technical Bulletin. We adopted the listening sub-skills that Johnson describes in the bulletin
and, where necessary, relied on the findings of the literature survey, which was previously
discussed. Johnson (2003, p. 34) stated that
The listening test of the MELAB is intended to assess the ability to comprehend
spoken English. It attempts to determine the examinee’s ability to understand the
meaning of short utterances and of more extended discourse as spoken by university-
educated, native speakers of standard American English.
Johnson (2003, p. 34) further argued that, in addition to understanding surface meaning, the
test demands activating a variety of listening comprehension sub-skills including “the
capacity to make inferences and draw conclusions”. Based on this, we carried out an iterative
and exploratory content analysis of individual test items in multiple meetings, discussing
extensively our individual results before reaching a consensus on key ones. This resulted in a
preliminary list of listening sub-skills, as follows:
(a) understanding vocabulary,
(b) understanding grammatical structures,
(c) understanding and responding to minimal context stimuli,
(d) understanding explicitly articulated information,
(e) making inferences,
(f) understanding propositional paraphrases,
(g) understanding the attitude of speaker(s), and
(h) drawing conclusions.
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
10
This initial set of sub-skills reflects those within the listening construct which was discussed
earlier; however, the list was revised after multiple rounds of discussion because the sub-
skills had considerable commonalities in definition and function (e.g., inference-making and
understanding implicit information); because they would not result directly in comprehension
although they would facilitate it (e.g., understanding vocabulary and grammatical structures);
and because none of the test items seemed to assess some sub-skills that emerged from the
literature (e.g., the ability to draw conclusions).
Only effective and measurable comprehension sub-skills were retained. For example, we
decided that perception and morpheme recognition sub-skills would facilitate comprehension
but were different from comprehension, as we made a decision to posit models that elicited
major or complex sub-skills (see Dunkel et al., 1993, for a discussion).
In addition, every effort was made to postulate the “conceptually distinguishable” sub-skills
that can be identified in “a minimal number of items” and agree with the item design
descriptions (Kim & Jang, 2009, p. 839). Haberman and von Davier (2007) had cautioned
against positing many excessively fine-grained sub-skills, as this would likely contribute to
the invalidity of the model and that there would also be a trade-off between the number of
sub-skills and convergence of a model (see Table 1).
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
11
Table 1 Description of the Listening Sub-skills Identified in the MELAB Listening Test
Providing inter-coder reliability information is an important research requirement. A
commonly adhered-to methodology is to propose guidelines for item coding and train coders
to carry out the coding. The agreement of coders is subsequently sought and reported. It has
been argued that, although useful, this method sometimes yields low reliability statistics in
the first and/or second rounds due to the biases of coders; reliability improves and coder
biases are “neutralized” only when raters convene, discuss with the rest of the group, and
refine their performance (Compton, Love, & Sell, 2012, p. 351). Whereas there is no
inaccuracy inherent in this method, it seems that the initial rounds of coding are sometimes
inefficient, as they basically resemble individual training sessions for coders3. To overcome
these limitations, we took a directed qualitative content analysis approach where we started
with the findings of our literature survey “as guidance for initial codes” (Hsieh & Shannon,
2005, p. 1277). This approach has been adapted in Buck and Tatsuoka’s (1998) listening
3 In the studies where coding rehearsal is conducted, if the targeted reliability is not obtained, an external
independent coder will be required to make the final decision.
Skills Description Items No. of
items
Understanding and
responding to the unexpected
statements and/or questions
(minimal context items)
Ability to understand illocutionary
forces (e.g., invitations, offers,
suggestions, and so forth) and find
an appropriate response among the
item options.
1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11,
12, 13 14, 15
15
Understanding details and
explicit information
Ability to comprehend individual
lexical items, utterances and
supporting information.
40, 46, 47, 48,
49, 50
6
Making close paraphrases Ability to paraphrase the key
phrases or use synonyms to answer
the item.
18, 21, 30, 35,
36, 37, 38, 39,
41, 42, 43, 44,
45
13
Making propositional
inferences
Ability to understand what might
logically follow in a text.
16, 22, 24, 25,
26, 27, 29, 31,
33
9
Making enabling inferences Ability to make inferences based
on the causal relationships implied
in the text.
17, 19, 20, 23,
28, 32, 34
7
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
12
study with relative success. In short, we convened for multiple meetings and discussed the
coding scheme and test items in detail. We concurred on the coding of the majority of test
items (100% agreement) but modified the coding of others several times across meetings till
we reached a full agreement. Approximately two weeks from the final round of joint coding,
one of the researchers re-examined the finalized codes to ascertain the trustworthiness of the
codes over time.
Here we provide further details about the coding process using items similar to the ones in the
paper. The actual items are secure. According to Johnson (2003), the first sub-skill, to which
an entire test section has been devoted, is the ability to understand minimal context stimuli,
that is, the ability to comprehend and respond to statements (e.g., requests or invitations) or
questions that are asked unexpectedly by people around us, for example, on a campus or at an
office. (Interested readers are referred to CaMLA website.)
In trying to understand an utterance in the minimal context item presented in this item, a
candidate needs to recognise and understand the word “When’s” in order to select ‘c’ as it is
the correct answer. In addition, the candidate also needs to be able to parse the unit “When’s
she going” grammatically to know that the speaker is referring to an event in the future. The
listener is also expected to understand the illocutionary force of this inquiry and, in response,
provide the date of her vacation. Although grammatical and vocabulary knowledge alongside
understanding the illocutionary forces of the stimuli can facilitate and lead up to
comprehension, they were not regarded as independent sub-skills (Dunkel et al., 1993).
Another sub-skill is the ability to understand details or explicit information. The following
example engages this sub-skill:
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
13
The tea plant can grow to a very tall tree if it is left untrimmed. The plant grows in
tropical regions; it cannot tolerate the cold climate of mountainous areas or the
regions that receive an extremely low amount of precipitation ….
Q: Where does the tea plant grow?
a. mountainous regions
b. warm regions
c. tropical regions *
The short text in which the answer is located states that “The plant grows in tropical regions”.
We identified five items that would engage this sub-skill. However, the response is not
always explicit and the reader needs to understand paraphrases of the oral language to arrive
at the answer. For example, understanding the text above could be evaluated through the
following item:
Q: Where does the tea plant grow?
a. mountainous regions
b. cold regions
c. warm and humid regions *
The correct option is c but the text does not include the “exact wording” of the answer. To
arrive at the answer, the successful listener is supposed to know that “tropical regions” in this
text refers to “warm and humid regions”. We found that 13 items elicit the ability to
understand paraphrases.
We also found that 16 items engage inference-making sub-skills. Two related inferences
emerge from literature: propositional and enabling (Hildyard & Olson, 1978). Propositional
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
14
inferences are typically instances of syllogism or logical appeal. They are logical arguments
without explicitly stated conclusions about which listeners should make inferences. We adapt
a listening test item from Wagner (2004) which tests the ability to make propositional
inferences:
There are probably more skunks alive today than there were a thousand years ago.
This is mostly because human development usually involves clearing the land of tree
cover. This is fortunate for skunks, because they like to live in open areas.
Q: According to the speaker, skunks have_____.
a. been raised by human as pets
b. evolved in the last thousand years
c. benefitted from the presence of humans*
d. almost gone extinct because of human development (Wagner, 2004, p. 10)
This item is an instance of syllogism, because the listener hears two sentences based on
which she should draw an inference: She hears “human development usually involves
clearing the land of tree cover” followed by “they like to live in open areas.” These two
premises follow that skunks benefit from human’s deforestation activities, though the
conclusion is not stated explicitly. Propositional inferences do not rely on the world
knowledge; they are implicit conclusions of explicitly articulated sentences or ideas
(Hildyard & Olson, 1978). By contrast, enabling inferences require that listeners connect the
ideas or statements by using their world knowledge and schemata. Hildyard and Olson (1978)
stated that “Enabling inferences are inferences that must be draw to make discourse coherent
and, therefore, comprehensible” (p. 95). Wagner (2004) gives the following example:
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
15
A skunk’s odor works as a very good defence against predators. Skunks spray their
musk at predators, causing a really bad smell.
Q. Why is the skunk’s smell a good defence against predators?
a. The predators can smell the skunks.
b. Skunks use their odor to blend in to their environment.
c. Skunks do not taste very good to predators because their odor is so bad.
d. The predators know that if they attack the skunks, they will end up smelling very
bad.* (Wagner, 2004, p. 10)
There are two important pieces of information in the text, which must be understood to draw
the inferences: “Skunks spray their musk at predators” and “causing a really bad smell”.
Successful listeners realize that because spraying the musk at predators would irritate them,
they avoid approaching the skunks.
Modelling the Sub-skills
A five-factor listening model comprising the following sub-skills was constructed:
(a) Understanding and responding to the unexpected statements and/or questions
(minimal context items);
(b) Understanding details and explicit information;
(c) Making propositional inferences;
(d) Making enabling inferences; and
(e) Drawing conclusions.
Figure 1 shows the model accommodating five divisible sub-skills underlying the test. The
boxes represent test items and the big ovals represent factors (latent variables or sub-skills)
for listening. Each item has an error of measurement which is expressed as a small circle.
Arrows going from latent variables to items represent prediction relationships (or statistical
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
16
causality): the observed variance in the item is mainly accounted for by the posited latent
variables. Regression paths are indicated as unidirectional arrows (e.g., the unidirectional
path connecting Minimal context to Item 1) and suggest that the latent variables cause
variations in examinees performance on items; and correlations are indicted as bidirectional
arrows (e.g., correlations of Minimal context and Propositional inferences).
Factor Analysis
Factor analysis (FA) is basically a variable simplification technique to identify factors or
underlying constructs of tests or questionnaires and to reduce items into smaller groups of
variables. It is performed by exploring the intercorrelation patterns between test items. It is
supposed that the variance of test items is due to the variation in latent (unobservable) traits
represented by factors (Schumacker & Lomax, 2010). Numerous researchers have
successfully proposed factors representing cognitive and metacognitive strategies (e.g.,
Phakiti, 2008; Vandergrift, Goh, Mareschal, & Tafaghodatari, 2006) and different language
abilities (Bae & Bachman, 1993; Bodie, 2011; Bodie et al., 2011; Oller, 1983; Oller &
Hinofitis, 1980).
Confirmatory factor analysis (CFA) models are developed based on preconceived theories,
(as opposed to exploratory FA models which discover the underpinning factor structure of
data sets where there is no preconceived theory [Fabrigar, Wegener, MacCallum, & Strahan,
1999]). CFA allows the analyst to verify whether the postulated relations between items and
latent variables exist (Schumacker & Lomax, 2010). In the present study, we specified the
model by drawing the diagrams (e.g., Figure 1) and evaluated the significant features of the
model such as fit statistics and the magnitude of correlation and regression coefficients.
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
17
Figure 1. The proposed models for the MELAB listening test.
Minimalcontext
item 1 e11
1
item 2 e21
item 3 e31
item 4 e41
item 5 e51
item 6 e61
item 7 e71
item 8 e81
item 9 e91
item 10 e101
item 11 e111
item 12 e121
item 13 e131
item 14 e141
item 15 e151
item_16 e161
item_22 e171
item_24 e181
item_25 e191
item_26 e201
item_27 e211
item_29 e221
item_31 e231
item_33 e241
item_18 e251
item_21 e261
item_30 e271
item_35 e281
item_36 e291
item_37 e301
item_38 e311
item_39 e321
item_41 e331
item_42 e341
item_43 e351
item_44 e361
item_45 e371
item_17 e381
item_19 e391
item_20 e401
item_23 e411
item_28 e421
item_32 e431
item_34 e441
item_40 e451
item_46 e461
item_47 e471
item_48 e481
item_49 e491
item_50 e501
Propos.inference
1
Closeparaph.
1
Enablinference
1
Detailedinform.
1
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
18
We used Kline’s (1998) method for testing CFA because of its comprehensiveness when
compared to other methods. It includes a two-stage CFA analysis where the measurement
model is tested prior to the full structural model. The measurement model comprises a latent
variable with strictly causal relationships to test items. For example, a measurement model in
Figure 1 includes Minimal context latent variable, 15 items, and their error terms; it is
hypothesized that Minimal context latent variable would account for the performance of test
takers on those items (Schumacker & Lomax, 2010). The full structural model includes all
relationships among the measurement models, which are primarily causal, although they can
also be correlational (bidirectional arrows in Figure 1). If the structural model includes
correlational relationships, then the entire analysis is called CFA (Jöreskog & Sörbom, 2001).
CFA was performed on the LISREL statistical computer package, Version 8.8 (Jöreskog &
Sörbom, 2006). Measurement models were initially developed, followed by the CFA model
for all the latent variables. This was followed by a calculation of a range of fit statistics for
each of the models we developed.
To evaluate the adequacy of the model, we examined a number of features of the proposed
model: fit of the model expressed as fit indices, regression statistics which show how much of
the variance observed in the performance data is explained by (or caused by) the latent
variables, and the correlation coefficients of the latent variables, which should fall
between .30 and .80 (Schumacker & Lomax, 2010). Following Schumacker and Lomax, we
used the following fit indices to report the fit of the model:
a) Chi-square test (χ2
), an index that shows the difference between the observed and
implied covariance matrices. For larger samples, it tends to be significant; therefore,
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
19
other fit statistics have been developed that impose a penalty for the sample size and
number of variables in the model.
b) χ2
/df (or normed χ2
), the ratio of χ2
to the degree of freedom (df). This ratio should be
small (preferably below 3) in well-fitting models.
c) RMSEA (Root Mean Square Error of Approximation), a measure that corrects for the
tendency of the chi-square test to be significant in large samples. Lower RMSEA
indices are desirable.
d) NNFI (Non-Normed Fit Index) or Tucker-Lewis Index, to compare the posited and
the baseline models. It is possible for the NNFI to be greater than unity, but it is
usually set at unity. Indices greater than .90 are desirable.
(e) CFI (Comparative Fit Index), an index to test the fit of a postulated model relative to a
baseline model; indices greater than .90 are desirable.
First, we evaluated five independent measurement models including minimal context, explicit
information, close paraphrasing, enabling inference, and propositional inferencing latent
variables. Overall, 14 items were deleted due to their low regression coefficients (< 0.30),
leaving 36 items for CFA modelling (Hair et al., 2010). The 36-item CFA model was tested:
the results indicated reasonable fit but the emergence of a few “offending estimates”, which
in this study includes extremely high correlation indices some of which are greater than unity,
made the 36-item model solutions non-admissible.
Large CFA factor correlations can be due to negligible item cross-loadings (i.e., there might
not be exact zero loadings) or the dependence of some sub-skills on others (Schumacker &
Lomax, 2010). Small cross-loadings can be eliminated if we aggregate (combine) related
items into parcels and run an aggregate-level CFA analysis (see Little, Cunningham, Shahar,
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
20
Widaman, 2002). Aggregation converts the dichotomous into polytomous items whose joint
statistical information is the same as their combination into a polytomous item at every point
on the latent variable (local independence assumed) (Huynh & Mayer, 2003). As a general
rule, the more categories in a polytomous item, the more is the total summed information in
the item across the latent variable (Bond & Fox, 2007). Following Johnson’s (2003) FA study
of MELAB, we combined odd-numbered items and then even-numbered items tapping the
five sub-skills, making 23 aggregate-level items. Given the large correlations among factors,
we further added a higher-order general listening factor on which the five factors were
statistically regressed (Chen, West, & Sousa, 2006; Mulaik & Quartetti, 1997).
To examine the higher-order CFA model with aggregate or parcel items, we
incorporated prediction relationships among the factors (i.e., single-headed arrows running
from one lower-order factor to another one). That is, we postulated that the ability to
understand explicit information would predict inferencing as well as understanding
paraphrases. We further discuss parcel items and prediction relationships in the CFA model
below.
Parcel items
The use of parcel or bundle items has become widespread in second language and educational
assessment research (e.g., Aryadoust, 2013; Goh & Hu, 2013; Kunnan, 1998; Purpura, 1998;
Phakiti, 2007, 2008; L. Zhang, Aryadoust, & L. J. Zhang, in press). Bandalos & Finney
(2001) showed that approximately 20% of the articles published in a number of measurement
journals had employed parcel items in CFA studies. Indeed, parcelling is an efficient
technique when the test items are scored dichotomously, as dichotomous scales can inflate
the chi-square index in CFA models (Bandalos, 2002) and/or result in inadmissible regression
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
21
or correlation coefficients (Schumacker & Lomax, 2010). Parcelling can also improve the
reliability and fit of the CFA models (Little et al., 2002).
However, in discussions of parcelling, one controversial issue has been the number of
parameters estimated. On the one hand, advocates of parcelling argue that using parcel items
results in model parsimony (i.e., fewer parameter to estimate) and accordingly stable models
(Bagozzi & Edwards, 1998). On the other hand, Hall, Snell, & Singer Foust (1999) contend
that parcel-item CFA would not be as robust as CFA of individual items. In the present study,
we advocate the former view which is in line with the common practice in language
assessment literature. We agree with Bandalos (2002, p. 100) who argued that the application
of parcel items to dichotomous items can “ameliorate the effects of coarsely categorized and
nonnormally distributed item-level data on model fit.”
It is also worth noting that Pearson correlation matrix is not suitable for the factor
analysis of dichotomous data (Uebersax, 2006). Therefore, the CFA models were all drawn
from polychoric correlation matrices.
Prediction relationships in CFA
Researchers attempting to establish the empirical multidivisibility of second language sub-
skills have used CFA and cognitive diagnostic models (CDMs) (Aryadoust et al., 2012; Lee
& Sawaki, 2009; Sawaki et al., 2009). Traditionally, in both approaches, the statistical
correlations of sub-skills—as represented by dimensions in CDMs and factors in CFA—are
estimated. However, new research by Sawaki et al. (2013) and Aryadoust, Goh, and Lee
(2012) shows that the relations between second language sub-skills are more complex than
commonly postulated correlations. Sawaki et al. (2013, p. 88) found that, contrary to their
initial presumption, some of the posited sub-skills would predict test takers’ performance on
other sub-skills in a sourced-based writing test. They reported that the CFA model
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
22
incorporating the prediction relationships was “the most plausible” model among the tested
CFA models. Sawaki et al. argued that if a parsimonious CFA model incorporating
correlations among factors does not plausibly fit the second language test performance data, it
might indicate that some of the important relationships among the factors are missing. The
researcher then should consider incorporating the prediction relationships among factors to
capture the complexity of the data.
Results
Preliminary Statistics and Reliability
SPSS for Windows (Version 16, SPSS Inc., Chicago, IL) was used to examine the mean (or
facility indices), standard deviation, skewness, and kurtosis coefficients of the data set. Most
items possessed skewness and kurtosis coefficients in the range between -2 and +2, an
indicator of univariate normality (Schumacker & Lomax, 2010). The composite Cronbach’s
alpha reliability coefficient was .84, which fell within the reliability range from .74 to .84 as
reported in the test manual (Johnson, 2003). High Cronbach’s alpha values indicate the
internal consistency of the data in the sense that items are interrelated.
Confirmatory Factor Analysis
Initially, we manufactured five independent measurement models which were the
constituents of the five-factor CFA model, and then the five-factor CFA model
accommodating all latent variables and their indicators (Kline, 1998). Following Hair et al.,
2010, we deleted 14 items (items 2, 3, 6, 16, 17, 23, 26, 28, 29, 31, 36, 40, 44, and 45) out of
50 through iterative analysis of measurement models because their regression coefficients
were below .30 and therefore their contribution to measuring the latent trait was fairly low.
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
23
The fit statistics of the measurement models and their modified forms are presented in Table
2. For example, the modified Minimal context model where Items 2, 3, and 6 were deleted
fitted the data reasonably well (χ2
= 48.94, NNFI = 0.95, CFI = 0.96, RMSEA = 0.029). Next,
the measurement models were incorporated into the CFA model which comprised 36 test
items and five latent variables and fitted the data well (χ2
= 795.70, NNFI = 0.97, CFI = 0.97,
RMSEA = 0.020). One correlation coefficient greater than 1.00 nevertheless emerged, thus
rendering the model basically inadmissible (see Table 3). This is caused when the variance-
covariance matrix, based on which CFA parameters are estimated, is non-positive definite
(NPD), which is an “offending estimate” (see Brown, 2006, pp. 187). Thus, while the model
fits the data, it is not admissible.
Next, we performed a higher-order aggregate-level CFA analysis (Widaman, 2002) by
combining odd-numbered items and then even-numbered items and making 23 aggregate-
level or parcel items, whose descriptive statistics and correlations are presented in Appendix
A and B, respectively. As Figure 2 shows, the general listening predicts and lower-order
latent variables, as opposed to the previously evaluated CFA models where the relationships
were correlational. In other words, the lower-order factor are all sub-components of a general
listening factor and the prediction relationships extends to the lower-order factors; for
example, understanding explicitly stated information can predict understanding paraphrases.
This model further postulates a prediction relationship between lower-order latent variables
(Hair et al., 2010).
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
24
Table 2 CFA Models of the MELAB Listening Test
Model χ2
df χ2
/df NNFI CFI RMSEA RMSEA 90%
confidence interval
Minimal context model 99.46* 65 1.54 0.95 0.96 0.023 0.013 – 0.032
Modified minimal context a 48.94* 27 1.81 0.95 0.96 0.029 0.014 – 0.042
Explicit information 5.07 9 0.56 1.03 1.00 0.000 0.000 – 0.023
Modified explicit information b 0.70 5 0.14 1.06 1.00 0.000 0.000 – 0.000
Close paraphrase 99.46* 65 1.53 0.95 0.96 0.023 0.013 – 0.032
Modified close paraphrase c 47.24 27 1.74 0.95 0.96 0.029 0.014 – 0.042
Propositional inference 36.61* 37 0.99 0.96 0.96 0.020 0.000 – 0.034
Modified propositional inference d 4.50 5 0.90 1.01 1.00 0.000 0.000 – 0.043
Enabling inference 11.90 14 0.85 1.02 1.00 0.000 0.000 – 0.027
Modified enabling inference e 0.58 2 0.29 1.06 1.00 0.000 0.000 – 0.045
Five-factor 36-item CFA 631.88* 454 1.40 0.97 0.97 0.021 0.017 – 0.024
Higher-order 25-item model 255.37 221 1.15 1.00 1.00 0.013 0.013 – 0.019
Note. n = 916. df = degree of freedom. NNFI = Non-Normed Fit Index. CFI = Comparative Fit Index. RMSEA = Root Mean Square Error of Approximation. a Deleted items = 2, 3, 6. b Deleted items = 40 . c Deleted items = 36, 44, 45. d Deleted items = 16, 26, 29, 31. e Deleted items = 17, 23, 28.
Good fit is indicated by NNFI and CFI ≥ 0.95, RMSEA ≤ 0.05, χ2/df < 3, and non-significant χ
2.
* p < .01.
Table 3 Correlation Coefficients of the Latent Variables of the 36- and 30-Item CFA Models
Minimal context
Enabling inference
Propositional inference
Close paraphrase
Explicit information
Minimal context 1
Enabling inference 0.92a (0.94b)
1
Propositional inference
0.98 (0.99)
0.92 (0.91)
1
Close paraphrase 0.97 (0.98)
0.94 (0.89)
1.04* (0.95) 1
Explicit information 0.74 (0.70)
0.77 (0.73)
0.88 (0.81) 0.87 (0.91) 1
Note. * non-admissible correlation index. a correlation coefficients of the latent variables in the 36-item five-factor model. b correlation coefficients of the latent variables in the 30-item five-factor model.
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
25
Figure 2. Path model of the higher-order aggregate-level 25-item model. (χ2 = 255.37, NNFI =
1.00, CFI = 1.00, RMSEA = 0.013).
Listenin1.00
MiminlCo
ParaphIt
EnablInf
ProposIn
ExplicIn
i1i3 0.85
i2i4 0.78
i6i8 0.83
i5i7 0.70
i9i11 0.80
i10i12i1 0.72
i13i15 0.81
i18i30 0.68
i21i35 0.80
i36i38 0.92
i37i39 0.77
i41i43i4 0.79
i42i44 0.79
i40i46 0.70
i47i49 0.82
i48i50 0.80
i16i22 0.82
i24i26 0.88
i25i27 0.86
i29i31i3 0.68
i17i19i2 0.74
i20i28 0.85
i32i34 0.82
Chi-Square=253.13, df=221, P-value=0.06792, RMSEA=0.013
0.39
0.47
0.58
-0.28
0.55
0.45
0.53
0.43
0.56
0.44
0.28
0.48
0.45
0.46
0.55
0.43
0.45
0.43
0.35
0.37
0.56
0.51
0.39
0.43
0.18
0.18
0.13
0.97
0.86
0.84
0.97
0.79
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
26
As expected, the higher-order aggregate-level 23-item model fitted the data extremely well
(χ2
= 255.37, NNFI = 1.00, CFI = 1.00, RMSEA = 0.013) and resolved the inadmissibility of
correlation coefficients. As displayed in Figure 2, coefficients of the regression paths between
the higher-order listening factor and lower-order factors are significantly high (> 0.001). For
example, for a one-unit increase in the higher-order latent variable listening, we expect to see
0.97 increase in the magnitude of the lower-order minimal context variable. Similarly, if the
magnitude of explicit information variable increases for one unit, we expect to see a small but
significant increase in enabling inferences variable (i.e., 0.18).
Discussion
This study investigated the divisibility of listening sub-skills in a retired version of the
MELAB listening test through an examination of mixed evidence from modelling listening as
a five-factor model. We found evidence for the divisibility of the five sub-skills posited
earlier:
(a) Understanding and responding to the unexpected statements and/or questions
(minimal context items)
(b) Understanding details and explicit information
(c) Making propositional inferences
(d) Making enabling inferences
(e) Drawing conclusions
We discuss our findings in more detail here. A notable finding in the item-level CFA
modelling was the presence of high correlations among factors representing sub-skills (p <
0.001), making the models inadmissible because discriminant validity has been violated
(Brown 2006; Hair et al., 2010). After the data were further tested through the higher-order
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
27
aggregate-level CFA model where five lower-order factors were statistically regressed on a
general listening factor, the results supported distinctions among factors. This observation
resonates with Bodie et al. (2011), Eom (2008), Hansen and Jensen (1994), Sawaki et al.
(2009), Shin (2008), and Shohamy and Inbar (1991), and partly with Liao (2007) who
suggested or found that listening sub-skills are divisible, but contradicts the findings of
Wagner (2004). In addition, we found that the factor structure of the test is much more
complicated than what a simple EFA or CFA could represent. That is, all factors were
attributed to a higher-order factor, and the inference making and understanding paraphrase
factors were partly predicted by the ability to understand explicit information. We suggest
that divisibility was not observed in previous EFA research by Oller (1983), Oller & Hinofitis
(1980), and Wagner (2004) due to the statistical devices employed which were not able to
represent otherwise known interconnections of factors which represent sub-skills.
An important reason for this interconnection could be that listening performance is
attributable to the presence of a general listening ability. Therefore, the identified sub-skills
might not operate in isolation but in unison to facilitate the achieving of a listening
comprehension goal. For example, although a group of items measure the ability to make
inferences and understand paraphrases, to achieve this type of comprehension test takers must
initially understand the explicitly stated information. This relationship is represented by
regression paths, indicating that while parts of the performance on inferences and paraphrase
items can be explained by the ability to understand explicit information, these sub-skills are
indeed separate and contribute to the general listening ability (Aryadoust, Goh, & Lee, 2011).
Relatedly, to make inferences or draw conclusions from short conversations, the candidate
also has to use similar sub-skills to interpret the outcome of the conversation. For example,
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
28
the questions in the minimal context section would require candidates to concurrently activate
their linguistic competence, including vocabulary, syntax, and phonology. The single
utterance prompts in the version of the test we examined (see Item analysis section) contained
what could be new or unfamiliar lexical items including phrasal verbs and idioms. Choosing
the correct answer depended heavily on the candidate’s hearing, simultaneous recognising
and understanding these vocabulary items before they were able to apply their functional
knowledge to interpret the illocutionary force of the message. Thus, performance in this
section appears to rely on fundamental decoding skills of sounds discrimination and making
sound-script connection and even vocabulary in some instances when unfamiliar lexical items
are included in the prompts. These skills are already in use when test takers answer test items
tapping other sub-skills. For example, the inclusion of slightly less common words and
idioms requires listeners to activate top-down processes to assist meaning building, which
seems to combine two uses of context: compensatory use to deal with unknown words and
use to build higher-level meaning.
Another example is the test items such as:
You hear:
A: Let’s go to the football game.
B: Yeah, that’s a good idea. I don’t want to (wanna) stay home.
You read:
a. They’ll stay home.
b. They don’t like football.
c. They’ll go to a game.* (Fleurquin el al., 2009).
The test taker needs to recognise the units of relevant information (RIs), “Let’s”, “Yeah”, and
“good”. A difference (with minimal context items) here is that a simple context has been built
up and the candidate may use the clues from B’s second utterance “I don’t want to (wanna)
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
29
stay home” to monitor and evaluate their understanding, or as a strategy for choosing the
right option. From this we can see the redundancy and overlaps of the sub-skills needed for
answering different types of questions which are aimed at testing different sub-skill, such as
understanding details and drawing conclusions. This kind of model raises questions about
how connected to each other sub-skills really can be; and the test formats that aim to tap
different aspects of a global listening skill actually load on to the different yet connected set
of central processes. This has implications for the on-going effort in the field to postulate a
model of listening.
This hypothesis alludes to the theoretical propositions made by Bechtel and Abrahamsen
(1991) about human cognition, which propose parallel processing of language through a
spreading activation of interconnected or associative neural networks in the brain (Bechtel &
Abrahamsen, 1991). An effective L1 and L2 listener is someone who is able to carry out
parallel processing through the activation of different skills and items of knowledge (Hulstijn,
2003; Vandergrift, 2007).
It seems that sub-skills in the present L2 context functioned in an interactive and
interdependent manner in listening events particularly those that are interactional in nature.
This relationship between different levels of listening sub-skills is appropriately illustrated in
Wolvin and Coakley’s (1996) metaphor of a tree where the roots are discriminative listening
and the trunk represents general comprehension, while the branches refer to different types of
higher-level listening, such as critical listening, which is contingent upon successful
discriminative listening and general overall surface level comprehension (see Imhof &
Janusik, 2006).
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
30
The second reason why the item-level CFA failed to support the divisibility of listening sub-
skills might have to do with the nature of the scale and/or the statistical model used. A close
examination of the relevant literature shows that the application of CFA to dichotomously
scored tests of L1 or L2 listening has resulted in mixed findings. For example, Bodie et al.
(2011) investigated the structure of the Watson–Barker Listening Test (WBLT)—a
dichotomously scored L1 listening test—and found that the test structure did not emerge as
multi-factor, contradicting the theoretical structure of the test but consistent with Wagner
(2004). They discuss the implications for constructing listening tests, as follows:
Perhaps dichotomous scoring does not fully reflect listening ability, with the valid use
of dichotomous scoring likely dependent on context […] Research across the
academic landscape suggests that deriving meaning from conversation is more
complex than picking out a single, correct meaning […] Consequently, right–wrong
scoring may misrepresent the multitude of meanings that may be viable alternatives.
Indeed, a person who is able to generate multiple alternative meanings from a given
utterance may be a more proficient or competent listener. (Bodie et al., 2011, p. 38)
It follows that either L2 listening tests would discriminate high from low ability test takers
better if they use polytomous scales, or that with binary tests researchers should use
modelling approaches other than item-level CFA. Evidence favouring the former postulation
is that aggregate-level items in the higher-order model established the divisibility and
interrelations of the sub-skills, which is consistent with Shin’s (2008) findings. It would
prove worthwhile to carry out further investigation in L2 listening field with tests which are
scored on polytomous scales or apply other modelling approaches for dichotomously scored
tests. Relatedly, it would be crucial to further investigate the applicability of CFA for the
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
31
context of L2 listening. We speculate that sub-skill divisibility might be established by the
statistical tools that would suit dichotomously scored tests (see Aryadoust & Goh, 2013).
Conclusion
The issue of the sub-skills and the divisibility of the listening construct into separate ones will
continue to capture the attention of researchers while curriculum developers and teachers will
continue to be concerned with the practical challenges of helping learners improve their
language skills in order to communicate better socially, pass examinations, and do well
academically. There are benefits in focusing on some aspects of the listening ability which
are crucial to successful comprehension be they at the decoding level or a higher processing
level (Field, 2008). For example, tasks that bring the separate sub-skills together so that they
can be used in conjunction with each other and the issue of whether they are separable or not
ceases to be of central importance in pedagogical terms (Sawaki et al., 2013). Alternatively, it
may be useful to focus separately on the development of different kinds of processes
according to the kinds of listening context within an academic environment. The
development of listening for academic contexts is in many ways similar to general listening
development although distinctions can be made in the types of discourse that students
participate in. Thus, there is rightly a great deal of emphasis on helping students construct
meaning of what they hear through extended discourse such as lectures and seminar
presentations.
As pointed out by Brown & Yule (1983), utterances during informal social interaction tend to
contain grammar that is “loose”; they are typically short and often strung together through the
use of relevant coordinators such as ‘and’ to achieve cohesion. In contrast, the utterances in
lectures and talks which are of a transactional nature are longer and the grammar is “tighter”,
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
32
containing many features of the syntax of written language. Learners would certainly benefit
from learning how to handle the two types of discourse in an academic context.
The findings from this study suggested that the sub-skills that the test aimed to assess through
separate sections of the instrument were empirically divisible, should the right modelling
approach be used. This lends support to the view that it is possible to distinguish sub-skills
from one another, hence supporting the language assessment and teaching experts who have
proposed theoretical taxonomies in an attempt to deconstruct listening into manageable parts
for testing and teaching purposes.
It is worth noting that sub-skills function in an interactive and interdependent manner in
listening events particularly those that are interactional in nature such as when someone is
participating in or listening to a conversation. Because of this, it may be worthwhile that test
developers focus on processes that are important for different listening contexts and types of
discourses encountered in an academic context, as language processing may be constrained or
assisted by the discourse structure of texts and the way meaning is packaged in utterances
(see Bodie, 2013). The key skills that are needed for listening to a short conversation and
those for listening to a lecture may not be the same even though some low-level skills such as
decoding are fundamental to both. It would still be advisable to separate the sub-skills to
develop tasks in assessment and pedagogy, and ensure that sub-skills are integrated in
extended listening tasks such as those that focus on comprehension.
Finally, evidence favouring divisibility of sub-skills is useful as it may be used to generate
assessment systems where both subscores and composite scores are provided to test takers.
Subscores disaggregate the performance of test takers on the basis of the sub-skills tapped by
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
33
each set of items. Test takers and teachers will benefit from such score reports, as it
highlights the weaknesses and strengths of test takers and helps the educational institutes
develop the kind of educational programs that would best meet the language needs of the test
takers.
Limitations and Further Research
An obvious limitation of our study is that we used data from only one version of the test.
Future research could repeat the same modelling procedures or other statistical methods over
a larger number of tests (specifically the tests that possess polytomous scales). Researchers
would need to constantly refine the methods for gathering evidence of divisibility of sub-
skills not only through a statistical quantitative approach but also through rigorous qualitative
methods wherever possible.
In this study, we examined the test takers’ performance with reference to the attributes of test
items and the sub-skills that they would engage. As one of the reviewers suggested,
metacognition, first and second language vocabulary, (implicit) grammar, and working
memory are extremely important factors in predicting listening comprehension (see, for
example, Goh & Hu, 2013). Future research can adapt independent measures for these factors
and examine their predictive power in listening test performance of low- and high-ability test
takers. Although this approach has been used in reading studies (e.g., D. Zhang, 2012; L.
Zhang et al., in press), it has scarcely been researched in listening comprehension studies and
is one line of research that has great potential.
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
34
Acknowledgement
Authors would like to thank the Spaan Fellowship Program of Cambridge-Michigan
Language Assessment Institute (CaMLA) for their support of the project.
References
Aitken, K. G. (1978). Measuring listening comprehension in English as a second language.
TEAL Occasional Papers, Volume 2. Vancouver: British Columbia Association of
Teachers of English as an Additional Language.
Aryadoust, V. (2013). Building a validity argument for a listening test of academic
proficiency. Newcastle: Cambridge Scholars Publishing (CSP).
Aryadoust, V., Goh, C., & Lee, O. K. (2011). An investigation of differential item
functioning in the MELAB listening test. Language Assessment Quarterly, 8(4), 361-
385. doi: 10.1080/15434303.2011.628632
Aryadoust, V., Goh, C., & Lee, O. K. (2012). Developing an academic listening self-
assessment questionnaire: A study of modeling academic listening. Psychological Test
and Assessment Modeling, 54(3), 227-256.
Bae, J., & Bachman, L. F. (1998). A latent variable approach to listening and reading: testing
factorial invariance across two groups of children in the Korean/English two-way
immersion program. Language Testing, 15(3), 380–414. doi:
10.1191/026553298676874967
Bagozzi, R. P., & Edwards, J. R. (1998). A general approach for representing constructs in
organizational research. Organizational Research Methods, 1, 45–87. doi:
10.1177/109442819800100104
Bandalos, D. L. (2002). The effects of item parceling on goodness-of-fit and parameter
estimate bias in structural equation modeling. Structural Equation Modeling: A
Multidisciplinary Journal, 9, 78-102. doi: 10.1207/S15328007SEM0901_5
Bechtel, W., & Abrahamsen, A. (1991). Connectionism and the mind: An introduction to
parallel processing in networks. Oxford: Blackwell.
Bodie, G. D. (2011). The Active-Empathic Listening Scale (AELS): Conceptualization and
evidence of validity within the interpersonal domain. Communication Quarterly, 59,
277–295. Doi: 10.1080/01463373.2011.583495
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
35
Bodie, G. D. (2013). Issues in the measurement of listening. Communication Research
Reports, 30, 76-84. Doi: 10.1080/08824096.2012.733981
Bodie, G. D., Worthington, D., & Fitch-Hauser, M. (2011). A Comparison of four
measurement models for the Watson-Barker Listening Test (WBLT)-Form C.
Communication Research Reports, 28, 32-42. Doi: 10.1080/08824096.2011.540547
Bond, T. G., & Fox, C. M. (2007). Applying the Rasch model: Fundamental measurement in
the human sciences. London: Lawrence Erlbaum Associates.
Borsboom, D. (2005). Measuring the mind: Conceptual issues in contemporary
psychometrics. Cambridge: Cambridge University Press.
Brown, T. A. (2006). Confirmatory factor analysis for applied research. New York: Guilford.
Buck, G. (1992). Listening comprehension: Construct validity and trait characteristics.
Language Testing, 42, 313-357. doi: 10.1111/j.1467-1770.1992.tb01339.x
Buck, G. (2001). Assessing listening. UK: Cambridge University Press.
Buck, G., & Tatsuoka, L. (1998). Application of the rule-space procedure to language testing
examining attributes of a free response listening test. Language Testing, 15, 119-157.
doi: 10.1177/026553229801500201
Carroll, J. B. (1972). Defining language comprehension. In R. O. Freedle, & J. B. Carroll
(Eds.), Language comprehension and the acquisition of knowledge (pp. 1–29). New
York: John Wiley and Sons.
Chen, F. F., West, S. G., & Sousa, K. H. (2006). A comparison of bifactor and second-order
models of quality of life. Multivariate Behavioral Research, 41, 189-225. Doi:
10.1207/s15327906mbr4102_5
Compton, D., Love, T. P., & Sell, J. (2012). Developing and assessing intercoder reliability in
studies of group interaction. Sociological Methodology, 42, 348–364. Doi:
10.1177/0081175012444860
Dunkel, P., Henning, G., & Chaudron, C. (1993). The assessment of a listening
comprehension construct: A tentative model for test specification and development.
Modern Language Journal, 77, 180-191. Doi: 10.1111/j.1540-4781.1993.tb01962.x
Eom, M. (2008). Underlying factors of MELAB listening construct. Spaan Fellow Working
Papers in Second or Foreign Language Assessment, 6, 77-94. Ann Arbor, MI:
University of Michigan English Language Institute.
Espeseth, M. (2004). Academic listening encounters: Human behavior listening, note taking,
and discussion. Cambridge: Cambridge University Press.
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
36
Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the
use of exploratory factor analysis in psychological research. Psychological Methods, 4,
272-299. Doi: 10.1037/1082-989X.4.3.272
Field, J. (2008). Listening in the language classroom. Cambridge: Cambridge University
Press.
Fleurquin, F., Huntley, M., Johnson, J., Lagergren, E., Rose, B., & Wood, B. (2009). MELAB
information and registration bulletin. Ann Arbor: English Language Institute,
University of Michigan.
Flowerdew, J. (1994). Research of relevance to second language lecture comprehension: An
overview. In J. Flowerdew (Ed.), Academic listening: Research perspectives (pp. 7-29).
Cambridge: Cambridge University Press.
Flowerdew, J., & Miller, J. (2006) Second language listening. Cambridge: Cambridge
University Press.
Freedle, R., & Kostin, I. (1991). The prediction of SAT reading comprehension item difficulty
for expository prose passages (TOEFL Research Report RR 91-29). Princeton, NJ:
Educational Testing Service.
Freedle, R., & Kostin, I. (1996). The prediction of TOEFL listening comprehension item
difficulty for mini-talk passages: implications for construct validity (TOEFL Research
Report RR 96). Princeton, NJ: Educational Testing Service.
Goh, C. C. M. (2010). Listening as process: Learning activities for self-appraisal and self-
regulation. In N. Harwood (ED.), Materials in ELT: Theory and practice (pp. 179-206).
Cambridge: Cambridge University Press.
Goh, C. C. M., & Hu, G. (2013). Exploring the relationship between metacognitive
awareness and listening performance with questionnaire data. Language Awareness.
Doi: 10.1080/09658416.2013.769558
Haberman, S. J., & von Davier, M. (2007). Some notes on models for cognitively based skills
diagnosis. In C. R. Rao & S. Sinharay (Eds.), Handbook of statistics: Vol. 26.
Psychometrics (pp. 1031–1038). Amsterdam, the Netherlands: Elsevier.
Hair, J. F. Jr., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate data
analysis (7th ed.). Upper Saddle River, NJ: Prentice Hall.
Hall, R. J., Snell, A. F., & Singer Foust, M. (1999). Item parceling strategies in SEM:
Investigating the subtle effects of unmodeled secondary constructs. Organizational
Research Methods, 2, 233–256. Doi: 10.1177/109442819923002
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
37
Hansen, C., & Jensen, C. (1994). Evaluating lecture comprehension. In J. Flowerdew (Ed.),
Academic listening: Research perspective (pp. 241-268). Cambridge: Cambridge
University Press.
Hildyard, A., & Olson, D. (1978). Memory and inference in the comprehension of oral and
written discourse. Discourse Processes, 1, 91-107. Doi: 10.1080/01638537809544431
Hsieh, H.-F., & Shannon, S.E. (2005). Three approaches to qualitative content analysis.
Qualitative Health Research, 15, 1277-1288. Doi: 10.1177/1049732305276687
Hulstijn, J. H. (2003). Connectionist models of language processing and the training of
listening skills with the aid of multimedia software. Computer Assisted Language
Learning, 16(5), 413-425. Doi: 10.1076/call.16.5.413.29488
Huynh, H., & Meyer, P. L. (2003). Maximum information approach to scale description for
affective measures based on the Rasch model. Journal of Applied Measurement, 4,
1010-1110.
Imhof, M., & Janusik, L. A. (2006). Development and validation of the Imhof-Janusik
listening concepts inventory to measure listening conceptualization differences
between cultures. Journal of International Communication Research, 35, 79-98. Doi:
10.1080/17475750600909246
Johnson, J. (2003). MELAB technical manual. Ann Arbor: English Language Institute, the
University of Michigan.
Jöreskog, K. G. (1993). Testing structural equation models. In K. Bollen & J. S. Long (Eds.),
Testing structural equation models (pp. 294–316). Newbury Park, CA: Sage.
Jöreskog, K. G., & Sörbom, D. (2001). LISREL 8.8: User’s reference guide. Lincolnwood, IL:
Scientific Software International, Inc.
Jöreskog, K. G., & Sörbom, D. (2006). LISREL 8.8 for Windows [Computer software].
Lincolnwood, IL: Scientific Software International, Inc.
Kim, Y. H., & Jang, E. E. (2009). Differential functioning of reading sub-skills on the
OSSLT for L1 and ELL students: A multidimensionality model-based DBF/DIF
approach. Language Learning, 59, 825-865.
Kline, R. B. (1998). Principles and practice of structural equation modeling. New York:
Guilford Press.
Kunnan, A. J., & Jang, E. E. (2009). Diagnostic feedback in language assessment. In M.
Long & C. Doughty (Eds.), The handbook of language teaching (pp. 610-625).
Blackwell Publishing.
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
38
Kunnan, A. J. (1998). An introduction to structural equation modeling for language
assessment research. Language Testing, 15, 295-332. Doi:
10.1177/026553229801500302
Lee, Y.-W., & Sawaki, Y. (2009). Cognitive diagnostic approaches to language assessment:
An overview. Language Assessment Quarterly, 6, 172-189. Doi:
10.1080/15434300902985108
Liao, Y. (2007). Investigating the construct validity of the grammar and vocabulary section
and the listening section of the ECCE: Lexico-grammatical ability as a predictor of L2
listening ability. Spaan Fellow Working Papers in Second or Foreign Language
Assessment, 5, 37-78. Ann Arbor, MI: University of Michigan English Language
Institute.
Liao, Y. (2009). Construct validation study of the GEPT reading and listening sections: Re-
examining the models of L2 reading and listening abilities and their relations to lexico-
grammatical knowledge. Unpublished doctoral dissertation in Applied Linguistics,
Teachers College, Columbia University.
Little, T. D., Cunningham, W. A., Shahar, G., & Widaman, K. F. (2002). To parcel or not to
parcel: Exploring the question, weighing the merits. Structural Equation Modelling, 9,
151–173. Doi: 10.1207/S15328007SEM0902_1
Lynch, T. (2010). Listening: Sources, skills and strategies. In R. Kaplan (Ed.), Oxford
handbook of applied linguistics (pp. 74-87). Oxford: Oxford University Press. Doi:
10.1093/oxfordhb/9780195384253.013.0005
Marsh, H. W., & Hocevar, D. (1988). A new, more powerful approach to multitrait-
multimethod analyses: Application of second-order confirmatory factor analysis.
Journal of Applied Psychology, 73, 107–117. Doi: 10.1037/0021-9010.73.1.107
Mulaik, S. A., & Quartetti, D. A. (1997). First order or higher order general factor? Structural
Equation Modeling: A Multidisciplinary Journal, 4, 193-211. Doi:
10.1080/10705519709540071
Munby, J. (1978) Communicative syllabus design. Cambridge: Cambridge University Press.
Nissan, S., DeVincenzi, F., & Tang, L. (1996). An analysis of factors affecting the difficulty
of dialogue items in TOEFL listening comprehension. (TOEFL Research Report RR 95-
37). Princeton, NJ: Educational Testing Service.
Oakeshott-Taylor, J. (1977). Information redundancy, and listening comprehension. In R.
Dirven (Ed.), Ho¨rversta¨ndnis im Fremdsprachenunterricht [Listening comprehension
in foreign language teaching]. Kronberg/Ts: Scriptor.
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
39
Oller, J. W., Jr. (1983). Evidence for a general proficiency factor: An Expectancy grammar.
In J. W. Oller, Jr. (Ed.), Issues in language testing research (pp. 3–10). Rowley, MA:
Newbery House Publishers.
Oller, J. W., Jr., & Hinofitis, F. B. (1980). Two manually exclusive hypotheses about second
language ability: Indivisible or partly divisible competence. In J. W. Oller, Jr., & K.
Perkins (Eds.), Research in language testing (pp. 13–23). Rowley, Massachusetts:
Newbery House Publishers.
Pandya, D. N. (1995). Anatomy of the auditory cortex. Review Neurology (Paris), 151, 486–
494.
Phakiti, A. (2008). Strategic competence as a fourth-order factor: A structural equation
modeling approach. Language Assessment Quarterly, 5, 20–42. Doi:
10.1080/15434300701533596
Phakiti, A. (2007). Strategic Competence and EFL reading test performance: A structural
equation modeling approach. Frankfurt am Main: Peter Lang.
Phakiti, A. (2008). Construct validation of Bachman and Palmer’s (1996) strategic
competence model over time in EFL reading tests (This paper was the runner-up for
the International Language Testing Association's Best Article Award 2008.).
Language Testing, 25, 237–272. Doi: 10.1177/0265532207086783
Purpura, J. (1998). Investigating the effects of strategy use and second language test
performance with high- and low-ability test takers: A structural equation modeling
approach. Language Testing, 15, 333-379. Doi: 10.1177/026553229801500303
Resnick, L. B., & Resnick, D. P. (1992). Assessing the thinking curriculum: New tools for
educational reform. In B. G. Gifford & M. C. O’Conner (Eds.), Changing
assessments: Alternative views of aptitude, achievement and instruction (pp. 37–75).
Boston: Kluwer Academic.
Richards, J. C. (1983). Listening comprehension: Approach, design, procedure. TESOL
Quarterly, 17, 219-39. Doi: 10.2307/3586651
Rixon, S. (1981). The design of materials to foster particular linguistic skills: The teaching of
listening comprehension. (ERIC Document Reproduction Service No. ED 258 465).
Ross, S. (1997). An introspective analysis of listener inferencing on a second language
listening test. In G. Kasper & E. Kellerman (Eds.), Communication strategies:
Psycholinguistic and sociolinguistic perspectives (pp. 216-237). London: Longman.
Rost, M. (2002). Teaching and researching listening. New York: Longman.
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
40
Rouiller, E. M. (1997). Functional organization of the auditory pathways. In G. Ehret, & R.
Romand (Eds.), The central auditory system (pp. 3-96). New York: Oxford University
Press.
Sanabria, K. (2004). Academic listening encounters: Life in society. Cambridge: Cambridge
University Press.
Sawaki, Y., Kim, H.-J., & Gentile, C. (2009). Q-Matrix construction: Defining the link
between constructs and test items in large-scale reading and listening comprehension
assessments. Language Assessment Quarterly, 6, 190–209. Doi:
10.1080/15434300902801917
Sawaki, Y., Quinlan, T., & Lee, Y.-W. (2013): Understanding learner strengths and
weaknesses: Assessing performance on an integrated writing task. Language
Assessment Quarterly, 10, 73-95. Doi: 10.1080/15434303.2011.633305
Schumacker, R. E., & Lomax, R. G. (2010). A beginner’s guide to structural equation
modeling. Mahwah, NJ: Lawrence Erlbaum.
Shin, S. (2008). Examining the construct validity of a web-based academic listening test: An
investigation of the effects of response formats. Spaan Fellow Working Papers in
Second or Foreign Language Assessment, 6, 95–129. Ann Arbor, MI: University of
Michigan English Language Institute.
Shohamy, E., & Inbar, O. (1991). Construct validation of listening comprehension tests: The
effect of text and question type. Language Testing, 8, 23-40. Doi:
10.1177/026553229100800103
Suga, N., & Ma, X. (2003). Multiparametric corticofugal modulation and plasticity in the
auditory system. Nature Review, Neuroscience, 4, 783–794. Doi: 10.1038/nrn1222
Tsui, A. B. M., & Fullilove, J. (1998). Bottom-up or top-down processing as a discriminator
of L2 listening performance. Applied Linguistics, 19, 432-451. Doi:
10.1093/applin/19.4.432
Ur, P. (1984). Teaching listening comprehension. Cambridge: Cambridge University Press.
van Dijk, T. A., & Kintsch, W. (1983). Strategies of discourse comprehension. New York:
Academic Press Inc.
Vandergrift, L. (2007). Recent developments in second and foreign language listening
comprehension research. Language Teaching, 40, 191-210. Doi:
10.1017/S0261444807004338
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
41
Vandergrift, L., Goh, C., Mareschal, C., & Tafaghodatari, M. H. (2006). The Metacognitive
Awareness Listening Questionnaire (MALQ): Development and validation. Language
Learning, 56, 431-462. Doi: 10.1111/j.1467-9922.2006.00373.x
Wagner, A. (2004). A construct validation study of the extended listening sections of the
ECRE and MELAB. , 2, 1-23. Ann Arbor, MI: University of Michigan English
Language Institute.
Weigle, S. C. (2000). Test review: The Michigan English language assessment battery.
Language Testing, 17, 449-455.
Weir, C. (1990). Communicative language testing. London: Prentice Hall.
Widaman, K.F. (2002). To parcel or not to parcel: Exploring the question, weighing the
merits. Structural Equation Modeling, 9, 151–173. Doi:
10.1207/S15328007SEM0902_1
Wolvin, A. D., & Coakley, C. G. (1996). Listening (5th Ed.). Madison: Brown & Benchmark.
Zhang, D. (2012). Vocabulary and grammar knowledge in L2 reading comprehension: A
structural equation modeling study. The Modern Language Journal, 96, 558-575.
Doi: 10.1111/j.1540-4781.2012.01398.x
Zhang, L., Aryadoust,V., & Zhang, L. J. (in press) Development and validation of the Test
Takers' Metacognitive Awareness Reading Questionnaire. The Asia-Pacific Education
Researcher. doi:10.1007/s40299-013-0083-z Doi: 10.1007/s40299-013-0083-z
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
42
Appendix A
Descriptive statistics of the aggregate-level test items.
Note: n = 916. a Each aggregate-level item is made by combining two or three test items. For example, i1i3 was made by combining Items 1 and 3.
Mean Standard deviation Skewness Kurtosis
i1i3 a 1.12 0.727 -0.195 -1.094
i2i4 1.26 0.698 -0.419 -0.907
i6i8 1.32 0.602 -0.286 -0.645
i5i7 1.28 0.738 -0.508 -1.022
i9i11 1.53 0.637 -1.044 -0.019
i10i12i14 1.81 0.884 -0.186 -0.833
i13i15 1.30 0.683 -0.467 -0.823
i18i30 1.28 0.712 -0.474 -0.937
i21i35 1.18 0.746 -0.310 -1.158
i36i38 1.06 0.724 -0.096 -1.090
i37i39 1.28 0.706 -0.477 -0.910
i41i43i45 2.24 0.803 -0.835 0.065
i42i44 1.18 0.682 -0.257 -0.866
i40i46 1.34 0.679 -0.555 -0.761
i47i49 1.11 0.726 -0.182 -1.091
i48i50 1.32 0.705 -0.555 -0.857
i16i22 1.15 0.744 -0.260 -1.161
i24i26 1.19 0.674 -0.254 -0.829
i25i27 1.19 0.711 -0.306 -0.998
i29i31i33 1.98 0.890 -0.455 -0.671
i17i19i23 1.78 0.911 -0.226 -0.818
i20i28 1.28 0.701 -0.461 -0.900
i32i34 1.44 0.667 -0.792 -0.496
EXAMINING THE NOTION OF LISTENING SUB-SKILL DIVISIBILITY AND ITS IMPLICATIONS FOR SECOND LANGUAGE LISTENING
43
Appendix B
Correlation of the aggregate-level items.
Note. * p < .05. ** p < .01.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
1 1
2 .208**
1
3 .180**
.198**
1
4 .197**
.232**
.188**
1
5 .151**
.183**
.170**
.240**
1
6 .234**
.229**
.219**
.290**
.252**
1
7 .168**
.209**
.145**
.251**
.170**
.252**
1
8 .183**
.278**
.183**
.343**
.286**
.318**
.180**
1
9 .145**
.150**
.148**
.262**
.183**
.229**
.189**
.248**
1
10
.134**
.137**
.088**
.052
.102**
.110**
.083*
.173**
.092**
1
11
.183**
.244**
.176**
.204**
.192**
.216**
.206**
.254**
.221**
.184**
1
12
.225**
.186**
.144**
.297**
.187**
.225**
.186**
.261**
.165**
.178**
.228**
1
13
.195**
.227**
.160**
.288**
.190**
.247**
.196**
.238**
.215**
.126**
.238**
.191**
1
14
.165**
.207**
.101**
.192**
.179**
.230**
.222**
.265**
.210**
.133**
.231**
.204**
.217**
1
15
.157**
.139**
.111**
.195**
.080*
.168**
.158**
.208**
.130**
.148**
.222**
.219**
.132**
.233**
1
16
.189**
.145**
.040
.193**
.118**
.181**
.186**
.181**
.161**
.103**
.179**
.194**
.206**
.240**
.212**
1
17
.154**
.212**
.131**
.295**
.188**
.229**
.180**
.269**
.198**
.148**
.247**
.198**
.241**
.207**
.185**
.128**
1
18
.111**
.169**
.123**
.232**
.196**
.165**
.169**
.188**
.175**
.105**
.176**
.147**
.175**
.193**
.126**
.153**
.162**
1
19
.145**
.156**
.149**
.155**
.171**
.227**
.185**
.189**
.189**
.162**
.254**
.170**
.148**
.187**
.110**
.190**
.171**
.164**
1
20
.237**
.319**
.177**
.314**
.263**
.298**
.258**
.360**
.299**
.151**
.277**
.263**
.276**
.262**
.208**
.231**
.222**
.168**
.222**
1
21
.161**
.276**
.158**
.269**
.281**
.208**
.214**
.272**
.253**
.226**
.237**
.205**
.152**
.271**
.175**
.161**
.258**
.180**
.231**
.299**
1
22
.140**
.159**
.066*
.217**
.177**
.207**
.141**
.205**
.172**
.074*
.189**
.214**
.177**
.215**
.153**
.137**
.172**
.158**
.134**
.250**
.210**
1
23
.145**
.160**
.161**
.221**
.219**
.225**
.176**
.287**
.215**
.109**
.178**
.194**
.210**
.167**
.148**
.142**
.164**
.167**
.163**
.279**
.210**
.162**
1