
CHAPTER 5 RESULTS

5.1 Descriptive Statistics

5.1.1 Test Set A

5.1.1.1 Overall Results

Table 5-1 shows the number of test takers, mean, standard deviation, maximum score, minimum score, and KR-20 of Test Set A for all test takers. Overall, the present author finds no problematic results in the descriptive statistics. The KR-20 may seem a bit low, considering that GTEC administered in a formal situation usually yields a KR-20 higher than 0.9. However, since the test employed in the present study had only 27 test items and a sufficient but limited number of test takers, it cannot be helped that the reliability is lower than that of GTEC. Given this, and given that the test items were written by the present author with no particular means of item banking, the reliability of 0.768 seems acceptable for the present situation.

Table 5-1 Descriptive statistics and reliability coefficients of Test Set A

# of Test Takers    Mean    S.D.    Minimum    Maximum    KR20
573                 17.7    4.5     4          27         0.768
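As a rough illustration of how a KR-20 value such as the one in Table 5-1 is obtained, the sketch below computes the coefficient from a matrix of dichotomously scored (0/1) responses. The function name `kr20` and the toy data are assumptions made for illustration and are not taken from the study; the use of the population variance of total scores is one common convention and may differ from the scoring software used here.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson formula 20 for 0/1 item scores.

    responses: list of rows, one row per test taker, each row a list of
    0/1 scores with one entry per item.
    """
    n_people = len(responses)
    n_items = len(responses[0])

    # Variance of the total scores (population variance: an assumption).
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)

    # Sum over items of p * q, where p is the proportion correct.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / var_total)


# Toy data (hypothetical): 4 test takers, 3 items.
TOY = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(round(kr20(TOY), 3))
```

With real data, each row would hold one test taker's 27 item scores.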

There was no minimum score of zero, though the maximum score was a full mark. Ideally, there should be no full marks in a proficiency test, because a full mark indicates that the test did not accurately measure the test taker's ability, which could have been beyond what was tested. However, since this test was not strictly a proficiency test, and since only one test taker obtained a full mark, it was judged that this aspect of the results posed no threat to the reliability of the present study.

東京外国語大学博士学位論文 Doctoral thesis (Tokyo University of Foreign Studies)

The histogram in Figure 5-1 shows the distribution of test takers' scores as a whole. The statistics were -0.808 for skewness and 0.227 for kurtosis, which allows the present author to determine that the curve is close enough to a normal curve, although the peak lies slightly to the right. This problem was solved when different ability groups were determined.
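Skewness and kurtosis of a score distribution can be computed directly from a frequency table. The following is a minimal sketch using simple moment-based estimators and reporting kurtosis as excess kurtosis (0 for a normal curve); both choices are assumptions, and statistical packages often apply small-sample corrections, so the output need not match the reported figures exactly. The function name and the toy distribution are hypothetical.

```python
from math import sqrt


def distribution_shape(freq):
    """Return (mean, sd, skewness, excess kurtosis) for a frequency table.

    freq: dict mapping a score to the number of test takers with that score.
    """
    n = sum(freq.values())
    mean = sum(score * f for score, f in freq.items()) / n

    def central_moment(k):
        # k-th central moment of the score distribution.
        return sum(f * (score - mean) ** k for score, f in freq.items()) / n

    m2, m3, m4 = central_moment(2), central_moment(3), central_moment(4)
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return mean, sqrt(m2), skewness, excess_kurtosis


# Toy symmetric distribution (hypothetical): skewness is zero by symmetry.
TOY_FREQ = {0: 1, 1: 2, 2: 1}
print(distribution_shape(TOY_FREQ))
```

A negative skewness, as reported for both test sets, corresponds to a longer tail toward low scores, with the peak shifted to the right.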

Figure 5-1 Histogram of overall results for Test Set A
[Histogram: horizontal axis SCORES (0 to 30); vertical axis FREQUENCY (0.0 to 15.0)]

5.1.1.2 Item Validation

Facility value (percentage correct) and discrimination index (point-biserial coefficient) for each item in Test Set A are provided in Table 5-2. Items 1, 2, 7 and 16 seem a little problematic when their point-biserial coefficients are examined. One reason could be that, among the four options, the answer in each item was unclear and hard to distinguish from the other options due to defective construction. Furthermore, the fact that the percentages correct for Items 2, 7 and 16 are especially low could mean that they were so difficult that even those who had scored well on the test as a whole tended to get them wrong, resulting in low point-biserial coefficients. However, overall, the figures seemed satisfactory for a test instrument to be employed in this study, and it was decided to employ all the items in the analyses that follow.

Table 5-2 Percentage Correct and Point-biserial Coefficient of Items in Test Set A

ITEM#      PC      PBs
1          0.66    0.16
2          0.34    0.07
3          0.37    0.26
4          0.92    0.36
5          0.84    0.30
6          0.46    0.23
7          0.39    0.00
8          0.58    0.25
9          0.36    0.28
13         0.68    0.49
14         0.72    0.54
15         0.90    0.40
16         0.37    0.04
17         0.70    0.55
18         0.71    0.47
19         0.81    0.53
20         0.71    0.42
21         0.51    0.46
22         0.84    0.51
23         0.71    0.50
24         0.74    0.54
25         0.76    0.49
26         0.78    0.57
27         0.87    0.40
28         0.71    0.55
29         0.68    0.58
30         0.63    0.47
Average    0.66    0.39
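The two indices in Table 5-2 can be sketched as follows. This is a minimal illustration, assuming the point-biserial coefficient is computed against the uncorrected total score (the item itself included) with the population standard deviation; the thesis does not state which variant was used. The function name `item_statistics` and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def item_statistics(responses, item):
    """Return (facility value, point-biserial coefficient) for one item.

    responses: list of rows of 0/1 scores, one row per test taker.
    item: zero-based column index of the item.
    """
    totals = [sum(row) for row in responses]
    correct = [t for row, t in zip(responses, totals) if row[item] == 1]
    wrong = [t for row, t in zip(responses, totals) if row[item] == 0]

    p = len(correct) / len(responses)  # facility value (proportion correct)
    sd = pstdev(totals)
    if not correct or not wrong or sd == 0:
        return p, None  # coefficient is undefined without variation

    # Point-biserial: standardized difference between the mean total score
    # of those who answered correctly and those who did not.
    r_pb = (mean(correct) - mean(wrong)) / sd * sqrt(p * (1 - p))
    return p, r_pb


# Toy data (hypothetical): 4 test takers, 3 items.
TOY = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(item_statistics(TOY, 0))
```

A coefficient near zero, as for Items 2, 7 and 16, means that high and low scorers answered the item correctly at about the same rate.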

5.1.1.3 Predetermining the Ability Groups

As explained in 4.3.1, the ability groups of Group A-Low and Group A-High were predetermined based on the overall descriptive statistics for Test Set A. Examining Figure 5-2, which is the score distribution table for the whole population of Test Set A, the present author detected something unusual about the population who scored 9 and under. They seem to deviate from the rest of the group since they form a small normal curve of their own. Furthermore, when the distribution was reviewed from 11 (items correct) to 26, it seemed to form a nearly perfect normal curve. Since the median is 19 (items correct), those test takers who fall in the zone between lines b and c would be considered as having

Figure 5-2 Score distribution table for Test Set A

Number                Cum
Correct   Frequency   Freq    PR   PCT   Percentage of Examinees
. . . No examinees below this score . . .
   3          0          0     1     0
   4          2          2     1     0
   5          5          7     1     1   #
   6          4         11     2     1   #
   7          9         20     3     2   ##
   8         10         30     5     2   ##
   9          6         36     6     1   #
  10         15         51     9     3   ###                51 people (8.9%)
---------------------------------------------------------------------------- a
  11          7         58    10     1   #
  12         24         82    14     4   ####
  13         18        100    17     3   ###
  14         24        124    22     4   ####               181 people (31.6%)
  15         30        154    27     5   #####              <Group A-Low>
  16         24        178    31     4   ####
  17         54        232    40     9   #########
---------------------------------------------------------------------------- b
  18         51        283    49     9   #########          161 people (28.1%)
  19         43        326    57     8   ########           <- median
  20         67        393    69    12   ############
---------------------------------------------------------------------------- c
  21         73        466    81    13   #############
  22         46        512    89     8   ########
  23         31        543    95     5   #####              180 people (31.4%)
  24         17        560    98     3   ###                <Group A-High>
  25          8        568    99     1   #
  26          4        572    99     1   #

(A score of 27 correct is omitted from the table because its frequency was 1, less than the number for which a # would be given, which is 4.)

marginal ability between those people who would be considered as having high ability and those as having low ability. When the population percentage was calculated for each zone, the population that fell in the zone between lines a and b was 31.6%, between lines b and c 28.1%, and below line c 31.4%. Since this grouping allotted roughly the same percentage of people to each group, the present author decided that those people who had scored between 11 and 17 would be predetermined to be in Group A-Low and those between 21 and 27 in Group A-High.
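The zone percentages can be checked with a short calculation from the frequencies in Figure 5-2. In the sketch below, the counts for scores 18 to 20 are taken as differences of the printed cumulative frequencies (232, 283, 326, 393), and the single full mark (27) is added back; the zone labels "10 and under" and "marginal" and the function name are introduced here only for illustration.

```python
# Frequency of each number-correct score, read from Figure 5-2.
FREQ = {
    3: 0, 4: 2, 5: 5, 6: 4, 7: 9, 8: 10, 9: 6, 10: 15,
    11: 7, 12: 24, 13: 18, 14: 24, 15: 30, 16: 24, 17: 54,
    18: 51, 19: 43, 20: 67,
    21: 73, 22: 46, 23: 31, 24: 17, 25: 8, 26: 4, 27: 1,
}

# Score bands separated by lines a, b and c (labels are hypothetical).
ZONES = {
    "10 and under": (3, 10),
    "A-Low": (11, 17),
    "marginal": (18, 20),
    "A-High": (21, 27),
}


def zone_summary(freq, zones):
    """Return {zone: (count, percentage of all test takers, mean score)}."""
    total = sum(freq.values())
    summary = {}
    for name, (low, high) in zones.items():
        scores = [s for s in freq if low <= s <= high]
        n = sum(freq[s] for s in scores)
        mean = sum(s * freq[s] for s in scores) / n
        summary[name] = (n, round(100 * n / total, 1), round(mean, 1))
    return summary


SUMMARY = zone_summary(FREQ, ZONES)
for name, values in SUMMARY.items():
    print(name, values)
```

Under these assumptions the counts sum to the 573 test takers of Test Set A, and the two ability groups come out at 181 and 180 people.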


Table 5-3 shows the number of test takers, mean, standard deviation, maximum score and minimum score for Group A-Low and Group A-High.

Table 5-3 Descriptive statistics for Group A-Low and Group A-High

Group     # of Test Takers    Mean    S.D.    Minimum    Maximum
A-Low     181                 14.8    1.9     11         17
A-High    180                 22.2    1.3     21         27

5.1.2 Test Set B

5.1.2.1 Overall Results


Table 5-4 shows the number of test takers, mean, standard deviation, maximum score, minimum score, and KR-20 of Test Set B. Overall, the present author finds no problematic results in the descriptive statistics. The KR-20 may seem a bit too low, considering that GTEC and TOEIC administered in a formal situation usually yield a KR-20 higher than 0.9. However, since the tests employed in the present study had only 27 test items, far fewer than the number of items included in the original tests, and a sufficient but limited number of test takers, the drop in the index seems unavoidable. Judging that this would not disturb the present study, the present author decided to proceed with this result.

Table 5-4 Descriptive statistics and reliability coefficients of Test Set B

# of Test Takers    Mean    S.D.    Minimum    Maximum    KR20
257                 16.3    4.0     4          25         0.675
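The argument that a shorter test tends to have lower reliability can be illustrated with the Spearman-Brown prophecy formula, which the thesis does not itself invoke; it is offered here only as a sketch. The formula assumes that the removed items are parallel to the retained ones, and the length ratio in the example is hypothetical, since the item counts of the original tests are not given in this chapter.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when test length is multiplied by length_factor.

    reliability: reliability of the original test (between 0 and 1).
    length_factor: new length divided by original length, e.g. 1/3 for a
    test cut to one third of its items.
    """
    k = length_factor
    return k * reliability / (1 + (k - 1) * reliability)


# Hypothetical example: a test with reliability 0.90 cut to one third of
# its original length.
print(round(spearman_brown(0.90, 1 / 3), 2))
```

Under these assumptions, a drop from above 0.9 to the range observed for Test Sets A and B is of the order that shortening alone could produce.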

The histogram in Figure 5-3 shows the distribution of test takers' scores. The statistics were -0.340 for skewness and 0.250 for kurtosis, which allows the present author to determine that the curve is close enough to a normal curve. There was no minimum score of zero, and the maximum score was 25. From this, along with the mean score of 16.3 and the score distribution in Figure 5-3, it was presumed that Test Set B had worked well to illustrate the reading ability of the population who had worked on this test set.

Figure 5-3 Histogram of results for Test Set B
[Histogram: horizontal axis SCORES (0 to 30); vertical axis FREQUENCY (0.0 to 15.0)]

5.1.2.2 Item Validation

Facility value (percentage correct) and discrimination index (point-biserial coefficient) for each item in Test Set B are provided in Table 5-5. When their point-biserial coefficients are examined, items 2 and 4 seem to bear problems. One explanation could be that, among the four options, the answer in each item was unclear and difficult to choose from the given options due to defective construction. However, overall, the figures seemed satisfactory for a test instrument to be employed in this study, and all the items were included in the analyses that follow.

Table 5-5 Percentage Correct and Point-biserial Coefficient of Items in Test Set B

ITEM#      PC      PBs
1          0.73    0.24
2          0.24    0.19
3          0.38    0.26
4          0.95    0.15
5          0.81    0.26
6          0.51    0.23
10         0.65    0.28
11         0.61    0.30
12         0.72    0.36
13         0.72    0.41
14         0.42    0.34
15         0.88    0.30
16         0.40    0.22
17         0.53    0.38
18         0.78    0.26
19         0.44    0.29
20         0.68    0.33
21         0.45    0.46
22         0.63    0.38
23         0.40    0.31
24         0.55    0.25
25         0.77    0.48
26         0.73    0.49
27         0.70    0.46
28         0.66    0.44
29         0.54    0.41
30         0.46    0.30
Average    0.61    0.33
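The screening of items by point-biserial coefficient can be sketched as a simple threshold check over the values in Table 5-5. The cut-off of 0.20 is an assumption introduced for illustration; the thesis does not state a numerical criterion. With this cut-off the check happens to single out the same two items named in the text.

```python
# Point-biserial coefficients from Table 5-5, keyed by item number.
PBS_SET_B = {
    1: 0.24, 2: 0.19, 3: 0.26, 4: 0.15, 5: 0.26, 6: 0.23,
    10: 0.28, 11: 0.30, 12: 0.36, 13: 0.41, 14: 0.34, 15: 0.30,
    16: 0.22, 17: 0.38, 18: 0.26, 19: 0.29, 20: 0.33, 21: 0.46,
    22: 0.38, 23: 0.31, 24: 0.25, 25: 0.48, 26: 0.49, 27: 0.46,
    28: 0.44, 29: 0.41, 30: 0.30,
}

# Hypothetical cut-off; not a criterion stated in the thesis.
THRESHOLD = 0.20


def flag_weak_items(pbs, threshold):
    """Return the item numbers whose coefficient falls below the threshold."""
    return sorted(item for item, r in pbs.items() if r < threshold)


FLAGGED = flag_weak_items(PBS_SET_B, THRESHOLD)
print(FLAGGED)
```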

5.2 Factor Analytic Studies

5.2.1 Group A-Low

A full-information factor analysis was applied to all the items in Test Set A with the responses of Group A-Low. Here, a two-factor solution was adopted because of its interpretability. The correlation between the first and second factors is low, .341, which indicates that the orthogonal (VARIMAX) analysis is preferable. Table 5-6 illustrates the factor loadings for this group.

For inspection of loadings on each factor, the factor loadings of each item were rearranged in order from those that bear high loadings to those that bear low loadings on the first factor. The predetermined question type of each item is indicated in the table as "Q-TYPE" so that the relationship between factor loadings and question types might be sought. The numbers under "P-#" in the table show the passage to which each test item refers.
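The arrangement described above amounts to sorting the rows of the loading table by the first-factor loading while keeping the question type and passage number alongside each item. The sketch below uses entirely hypothetical item labels and loadings; it does not reproduce any value from Table 5-6, and the field names are chosen only for illustration.

```python
# Hypothetical rows for illustration only (not values from Table 5-6).
ROWS = [
    {"item": "X1", "q_type": "GI", "passage": 1, "f1": 0.12, "f2": 0.40},
    {"item": "X2", "q_type": "LL", "passage": 1, "f1": 0.55, "f2": -0.10},
    {"item": "X3", "q_type": "LI", "passage": 2, "f1": -0.30, "f2": 0.05},
]


def order_by_first_factor(rows):
    """Sort rows from the highest to the lowest first-factor loading."""
    return sorted(rows, key=lambda row: row["f1"], reverse=True)


ORDERED = order_by_first_factor(ROWS)
for row in ORDERED:
    print(row["item"], row["q_type"], row["passage"], row["f1"], row["f2"])
```

Sorting this way places items with high positive loadings at the top and items with high negative loadings at the bottom, which makes patterns in question type or item position easier to see.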

Table 5-6 Factor Loadings for Test Set A by Group A-Low

11 In order to make the reference to the terms simpler, "global-inferential" will be presented as "GI", "local-literal" as "LL" and "local-inferential" as "LI" in the tables and in the discussion from here on.


The first thing noticed is that the text (context) characteristics did not affect the extraction of factors. Items from different passages had significant loadings on the same factors, and the loadings of items that came from the same prompt varied greatly.

In looking for patterns in the loadings on the first factor, one notices that the items that bear rather high loadings on the first factor are the items that have smaller item numbers. In other words, they are the items that appear early in the test set. In the same way, the items that bear high negative loadings are the items that appear later in the test set. What this indicates is that the first factor in the factor loadings for Test Set A via the performance of Group A-Low could be determined as "where a test item is located in the test set" or "item position". This point will be further discussed in Chapter 6.

As for the interpretation of the second factor in the present analysis, the possibility of a "literal" type of reading being an attribute arises. The items that load heavily on the second factor are items 2 and 7. The predetermined question type varies between the two, so further analyses of the two items were done.

Item 7, which was originally categorized as a GI ("global-inferential", see note 11) item, is presented in the test as follows:

7. What is the main topic of this passage?
(A) The possibility of space colonies
(B) Space travel in the twenty-first century
(C) How to become an astronaut
(D) What people think about space exploration

The correct option (D) could be chosen if the test taker could observe that the explanations about different percentages introduced in the passage are all about "what people think about space exploration", option (D), and that this is the theme of the passage. However, at the same time, it could be supposed that some test takers look at the first sentence in the passage, "During the last forty years, many studies have been done to learn people's opinions about space exploration," and match the phrase "people's opinions about space exploration" with what is said in option (D). If it could be presumed that this type of reading was done to reach the answer, this is a literal matching of limited (or local) information, and item 7 could be considered an LL ("local-literal", see note 11) item.

　　　　　Item　2　is　also　an　LL　item：

2．The　speed　at　which　the　seafloor　is　spreading　is

（A）　　about　an　inch　in　200　million　years．

（B）　　changes　according　to　the　year．

（C）　　half　as　fast　as　human　fingernails　grow．

（D）　　slower　than　the　scientists　can　process．

　　　　　This　is　indeed　an　LL　item　since　the　correct　option（C）would　be　chosen　when

the　test　taker　notices　that　the　last　sentence　in　the　passage，‘‘This　spreading　occurs　in

half　of　a　speed　of　how　fast　fingernails　grow，”　perfectly　matches　the　phrases　in　option

（C）．Thus，　it　is　possible　that　the‘‘local－literal”element　explains　the　second　factor．

To further confirm this interpretation, another thing to be pointed out is that there are quite a few items that bear high negative loadings on the second factor: items 14, 17, 18, and 19.

Items 17 and 18 share the same passage, and they were originally categorized as an LL item and an LI ("local-inferential", see note 11) item, respectively.

17. According to this passage, what do scientists now believe about the ocean depths?
(A) There are many dark-shaded jellies.
(B) Sea color changes with the seasons.
(C) A kind of desert exists in some parts.
(D) Most of the living things there are jellies.

18. We can guess from the passage that
(A) scientists have found that deep-sea is like a watery desert.
(B) scientists don't know why deep-sea jellies have bright colors.
(C) many jellies in the underwater have their common clear color.
(D) many fish in the deep sea have very bright colors.

For item 17, the correct answer is (D). The sentence, "Scientists now believe that jelly animals may be one of the most common types of animal life in the ocean depths," starting from the seventh line, was to be matched with the question, "what do scientists now believe about the ocean depths?" and what is written in option (D). However, it seems that, for Group A-Low test takers, making a link between the phrase "the most common types of animal life" from the passage and the phrase "Most of the living things" in option (D) was an "inferential" type of reading, rather than a "matching", which makes us identify item 17 as an LI item for this group.

With regard to item 18, the correct option (B) would be chosen if the test taker could locate the last sentence, "The reason for these bright colors is a mystery," and infer that "a mystery" means that nobody knows why jellies in the deep sea have bright colors. In other words, this item was constructed with the intention to test test takers' ability to make an inference after understanding a small amount of information, and, therefore, it was labeled as an LI item. If this was what was done by the test takers, it might be possible to explain that the second factor is indeed a "local-literal", or at least a "literal" element, on account that the items that load negatively high on the same factor are perceived to present an "inferential" feature, a feature that would be on the other end of "literal".

This proposition is further confirmed when items 14 and 19 are consulted. Item 19 is presented as:

19. What is the main idea of this passage?
(A) Scientists work very hard to make new discoveries.
(B) Important things can be discovered accidentally.
(C) Making scientific discoveries is an easy thing to do.
(D) Sticky materials are useful in today's world.

This item was predetermined to be a GI item because the whole passage was about how Art Fry had come up with the idea of stick-ons by luck, giving the message that "(B) Important things can be discovered accidentally." This answer could also be reached if the first sentence of the passage, "Some discoveries have come as a result of luck - an accident that causes a scientist to look differently at what has occurred," is located, and the phrases "as a result of luck" and "an accident" from the sentence are correctly linked with "accidentally" in option (B). Making this link might require a bit of inferring, so this item could be determined as an "inferential" item, whether it is categorized as a GI or LI item.

Item 14 was an item which was predetermined to be an LL item:

14. Why was the Great Smoky Mountains National Park built?
(A) People in the East needed a place to take a walk for exercises.
(B) Many kinds of birds and trees were discovered in Smoky Mountains.
(C) Many parks in the West were becoming too crowded with cars.
(D) There were few national parks in the eastern part of the US.

The correct answer is (D), which gives the same explanation as the first sentence in the passage, "In the early 1920s, the new United States National Park Service realized that most of its parks were in the West," in a slightly different expression. This item was first categorized as an LL item in the item-writing process because the present author thought that this "matching" was of a "literal" nature. However, given the population of Group A-Low, it might be considered that the nature of matching here is something "inferential", which further suggests that the attribute of the second factor is whether the item elicits a "literal" or "inferential" type of reading.


5.2.2 Group A-High

A full-information factor analysis was applied to all the items in Test Set A with the responses of Group A-High. The present author adopted a two-factor solution for its interpretability. The correlation of .054 between the first factor and the second factor is rather low, so the orthogonal (VARIMAX) analysis seemed more appropriate. Table 5-7 illustrates the factor loadings for this group.

Table 5-7 Factor Loadings for Test Set A by Group A-High

For inspection of loadings on each factor, the factor loadings of each item were rearranged in order from those that bear high loadings to those that bear low loadings on the first factor. The question type (Q-TYPE) of each item is indicated in the table, along with the passage number (P-#), so that the relationship between the factor loadings and question types and passage numbers might be sought.

The first thing noticed is that the text (context) characteristics did not affect the extraction of factors. Items from different passages had significant loadings on the same factors, and the loadings of items that came from the same prompt varied greatly.

In seeking patterns in the loadings on the first factor, it seems that the items that bear rather high loadings on the first factor are the items that have smaller item numbers and those with low loadings have larger item numbers, as was the case with Group A-Low. However, at the same time, one can also observe that the items that bear rather high loadings on the first factor are the items that are labeled GI for question type. These are the questions that ask for the main ideas of the given passage. To make sure that these items actually elicit a GI type of reading, items 7, 4, and 22 were revisited with some test takers after the test implementation, and it was confirmed that they do.

　　　　　Item　6，　labeled　as　an　LI　item，　also　bears　a　rather　high　loading　on　the　first　factor．

When　each　item　is　closely　examined，　item　6　is　presented　as：


(A) some trees in Muir Woods existed 1,200 years ago.
(B) the redwood trees have been discovered just recently.
(C) redwood trees are very popular in the US.
(D) cutting down of the redwood is not allowed in the US.

This question is given with the intention that, if the test taker could locate the sentence, "Some are about 1,200 years of age," on the fifth line of the passage (refer to Test Set A in Appendix A), option (A) would be chosen after inferring that if the trees are 1,200 years old, they should have existed in Muir Woods 1,200 years ago. This item was constructed with the intention to test a test taker's ability to make an inference after understanding a small amount of information.

However, in closely examining item 6, one thing that draws attention is that, compared to other LI items in Test Set A, in order to discard incorrect options for this item, the test takers had to read and refer to a rather large amount of information. This could be considered a type of reading predetermined for a GI type of question, and, therefore, it could be said that item 6 had worked as a GI item in the present analysis, allowing us further to interpret the first factor to be the ability to read a rather large amount of information and make inferences from its comprehension.

　　　　　Item　5，　another　item　which　loads　heavily　on　the　first　factor，　shares　the　passage

with　item　6　and　asks：

5．What　is　one　reason　why　redwood　trees　have　existed　so　long？

（A）　　They　form　an　unusual　forest　just　outside　San　Francisco．

（B）　　They　have　special　covers　that　protect　themselves．

（C）　　They　are　very　tall，　so　the　fire　can’t　reach　the　whole　tree．

（D）　　They　are　officially　protected　by　the　State　of　California．

The correct answer (B) could be chosen if the test taker could locate the sentence, "They contain chemicals which protect them against fire, decay, and insects," that starts from the eighth line of the passage. This item was labeled as an LL (local-literal) type and was constructed to test a test taker's ability to understand a small amount of information with little or no inferring. However, when option (B) is closely reviewed, to correctly choose option (B), the test takers had to comprehend (and maybe infer) that the word "cover" in option (B) means the bark of the tree. Furthermore, the correct option could also be chosen when the sentence, "One reason is that they are not easily harmed by fire because they have very thick bark, and there is much water in their wood," starting from the sixth line of the passage, is located, and the same inference about the "bark" was made by the test taker, which would make this item "LI".

At the same time, one notices that, although the correct answer could be reached by an LI type of reading as examined above, what is asked in item 5 is actually the central theme of the passage. On the sixth line, the passage presents a question, "How do these trees live so long?" and the rest of the text is focused on answering this question. Although this question was given in the middle of the passage, because the explanation of why redwood trees have lived so long seems to dominate the main discussion in the passage, it could be judged that item 5 is asking for the main idea to be comprehended. Therefore, it could be deduced that item 5 might as well be categorized as a GI item, which would allow us further to conclude that the first factor in the present analysis is the ability to perform a GI type of reading comprehension.

If the first factor could be explained by a GI nature of reading, items that hold high negative loadings could be perceived as items that elicit non-GI, and perhaps LL, types of reading performance. These are items 17, 19, 28, and 30, in the order of how negatively high the loadings on each item are.

Item 28 was an item which was predetermined to be a GI item as is clear from the question given.


（A）The　popularity　of　national　parks　is　creating　problems．

（B）National　parks　are　built　as　children’s　playground．

（C）Pollution　is　a　problem　in　national　parks．

（D）The　cost　of　visiting　a　national　park　is　increasing．

The passage was about how national parks in the US have problems because too many people are visiting them. A similar proposition is expressed in option (A), which should be chosen if the test taker had correctly comprehended the passage. However, even if the whole passage was not read globally, the correct option could be chosen after reading the first sentence, "The U.S. National Parks Service is trying to solve a difficult problem," along with an earlier part of the second sentence, "Many national parks have become too popular." If this was the case, it might be more appropriate to consider this item as testing an LL type, or at least a local type, of reading.

A similar case holds true for item 19:

19. What is the main idea of this passage?
(A) Scientists work very hard to make new discoveries.
(B) Important things can be discovered accidentally.
(C) Making scientific discoveries is an easy thing to do.
(D) Sticky materials are useful in today's world.

This item was termed a GI item in the factor analytic study done for Group A-Low. However, with the population of Group A-High, because of their higher ability and the ease with which they read English, the matching of "Some discoveries have come as a result of luck - an accident that causes a scientist to look differently at what has occurred," from the first sentence and option (B) had an LL nature rather than GI. In the same respect, item 17, which was categorized as an LI item with Group A-Low, could now be considered an LL item.

　　　　　Item　30，　which　shares　the　same　passage　with　item　29　above，　was　constructed

with　the　intention　to　elicit　a　test　taker’s　LI　reading　performance．

30. Why is it necessary for some parks to limit the number of visitors?
(A) There aren't enough parking spaces for all the visitors around the parks.
(B) Having too many visitors has bad influences on the living things in the parks.
(C) They don't have enough money to hire people as the guides in the parks.
(D) There would be too much traffic on the roads inside and around the parks.

In order to correctly choose (B) as an answer, the test taker was to locate the second to last sentence, "The large number of visitors is also a threat to the plant and animal life of the parks," and infer that if something is "a threat to the plant and animal life of the parks", it "has bad influences on the living things." When the item was revisited with some of the test takers, this seemed to be the case.

Nevertheless, what became clear as items 17, 19, 28, and 30 were revisited was that they were certainly not GI items; in fact, they seem to have elicited a type of reading that could be considered the opposite of GI, and shared a common feature of a "local" nature. Therefore, it might as well be concluded that the first factor in the present analysis is indeed the "global", if not global-inferential, element of reading performance.

As for the second factor in the factor loadings presented in Table 5-7, items 16, 23 and 30 bear high factor loadings. They are from different reading prompts, so the text features cannot be a factor. Furthermore, all three items bear different predetermined question types. Therefore, each item was reexamined to seek a common feature that would help interpret this factor.

　　　　　Item　16　was　predetermined　to　be　a　GI　question：


(A) Scientists believe that the deep sea is like a desert in water.
(B) Scientists learned a lot about jellies in the sea from the sailors.
(C) Scientists discovered a lot about jellies in the ocean depths.
(D) Scientists were surprised to find so many jellies in the deep-sea.

This item is given with the intention to elicit a test taker's global comprehension of the passage. The test taker is to read the whole passage and infer that the main idea presented by the author is (C). When this item was revisited with some test takers, more or less, this was what was done to reach (C) as an answer, which confirms that item 16 was indeed a GI item. They said that the second sentence, "But with new ways to explore the ocean's depths, we are finding that they are much richer in life than we ever expected," had worked as a clue to infer that the theme of this passage was how scientists are "discovering a lot about jellies in the ocean depths", and that the rest of the passage was giving examples to support this theme.

Looking at item 23, which was originally labeled as an LL item, the correct answer (D) would be chosen if the test taker could locate the first sentence in the passage, "About 85% of all animal life consists of insects," and match the phrase "85%" with the phrase "the most part" and "consists of" with "forms" in option (D).

23．According　to　the　passage，

（A）　some　insects　eat　other　insects　for　food．

（B）　some　insects　make　food　from　oil　pool．

（C）　insects　are　usually　found　near　the　water．

（D）　insects　form　the　most　part　of　animal　life．

However, in the analysis subsequent to the data collection, it was perceived that what had been presumed to be a "literal matching" (an LL type of reading) was actually an "inferring." In other words, interpreting "the most part" in option (D) to mean "85%" and "forms" to mean "consists of" in the original sentence could actually be considered "inferring" rather than "literal matching". If this is true, item 23 should be called an LI item, and now, the common feature that items 16 and 23 share is an "inferential" element. Here, the possibility that the attribute that explains the second factor is an "inferential element" arises.

This proposition is further confirmed when item 30 is examined. Item 30, in the analysis that was done for the first factor, was determined to be an LI item, a question type that holds an "inferential" element. Therefore, this leads us to affirm that an "inferential" element is the attribute that explains the feature of the second factor.

Conversely, if the second factor could be explained by an "inferential" nature of reading, items that hold high negative loadings could be perceived as items that elicit a "non-inferential", or "literal", type of reading performance. These are items 18, 26, and 27, in the order of how negatively high the loadings on each item are.

Item 18 was termed an LI item in the factor analytic study done for Group A-Low.

18. We can guess from the passage that
(A) scientists have found that deep-sea is like a watery desert.
(B) scientists don't know why deep-sea jellies have bright colors.
(C) many jellies in the underwater have their common clear color.
(D) many fish in the deep sea have very bright colors.

     However, as was the case with items 17 and 19 in examining the nature of the first factor in the present analysis, with the population of Group A-High, because of their higher ability and the ease with which they read English, what was determined as the "inferential" matching of the last sentence, "The reason for these bright colors is a mystery," with the correct option (B) for the test takers of Group A-Low had an LL rather than a GI nature for those in Group A-High.

　　　　　Items　26　and　27　share　the　same　passage　and　are　presented　as：

26．Mendez　could　succeed　because　his　parents

（A）helped　him　travel　around　the　world．

（B）brought　him　up　very　strictly．

（C）put　in　much　money　and　time．

（D）taught　him　many　kinds　of　sports．

27. Robert Mendez is

(A) a father of two children

(B) a fisherman from California

(C) a tennis player

(D) a TV star

     Item 26 was predetermined to be an LL item because the correct option (C) could be reached if the test taker could locate the sixth and seventh sentences in the passage, "Robert traces his success to his parents' sacrifices. They invested every spare penny and every spare moment in their sons' future," and follow that, in essence, what is said by these sentences is what is said in option (C), which makes this an LL item.

     In the same way, item 27 is an LL question because the reading performance elicited by this item is the "literal" matching of "a world-class player" and "a tennis racket" from the first two sentences of the passage with option (C). Although item 27 was predetermined to be an LI question because the matching above was thought to hold an "inferential" nature, in reality, the case above seems to hold true, which allows the present author to conclude that the second factor in the present analysis is well explained by the "inferential/literal" nature that an item exhibits.

5．2．3　Group　B

     All the responses of Group B working on Test Set B were analyzed using a full-information factor analysis. The present author decided to employ a two-factor solution after consulting the results since it seemed the most appropriate. The correlation between the first and second factors was not too high, .549, which indicates that the orthogonal (VARIMAX) analysis is preferable. Table 5-8 illustrates the factor loadings for this group.

     For the purpose of inspecting the loadings on each factor, the factor loadings of each item were rearranged in order from those that bear high loadings to those that bear low loadings on the first factor. To help in seeking the relationship between the factor loadings and question types along with passage numbers, the predetermined question type (Q-TYPE) of each test item and the number of the passage for which each item was answered (P-#) are indicated in the table.

　　　　　It　could　be　said　that　the　text　（context）　characteristics　did　not　affect　the


same factors, and there were sufficient variations in the loadings for the three items that were constructed for the same prompt.

     In seeking particularity in the loadings on the first factor, one notices that the items with larger item numbers bear rather high loadings on the first factor. In other words, the items that load heavily on the first factor are the items that appear later in the test set. In the same way, the items that appear early in the test set bear high negative loadings. What this indicates is that the first factor in the factor loadings for Test Set B via the performance of Group B could be determined as "where a test item is located in the test set", or "item position", as was the case with Group A-Low. This will further be discussed in Chapter 6.

Table 5-8 Factor Loadings for Test Set B by Group B

     As for the interpretation of the second factor in the present analysis, the possibility of an "inferential" type of reading being an attribute arises. The items that load heavily on the second factor are items 14, 15, and 21. The predetermined question types are LL for item 14 and LI for items 15 and 21.



     Items 14 and 15 share the same passage and are presented in the test set as follows:

14．Ptarmigan　keep　warm　in　the　winter　by

（A）　huddling　together　on　the　ground　with　other　birds

（B）　building　nests　in　trees

（C）　burrowing　into　dense　patches　of　vegetation

（D）　digging　tunnels　into　the　snow

15．The　author　mentions　kinglets　in　line　17　as　an　example　of　birds　that

（A）　protect　themselves　by　nesting　in　holes

（B）　nest　with　other　species　of　birds

（C）　nest　together　for　warmth

（D）　usually　feed　and　nest　in　pairs

     In order to correctly answer item 15, which shows the highest loading on the second factor, the test takers are to locate the last and the second-to-last sentences, "Body contact reduces the surface area exposed to the cold air, so the birds keep each other warm. Two kinglets huddling together were found to reduce their heat losses by a quarter, and three together saved a third of their heat." They are to integrate the information given in these sentences to deduce that (C) is the correct answer, and this leads us to confirm that item 15 is indeed an LI item.

     Item 14 was constructed with the intention of eliciting a test taker's LL type of reading performance. The fifth sentence, "Solitary roosters shelter in dense vegetation or enter a cavity - horned larks dig holes in the ground and ptarmigan burrow into snow banks," is the key in choosing the correct option (D), and it was supposed that the test takers in this group would try to match "burrow into snow banks" from the original sentence with "digging tunnels into the snow" in option (D). However, at the same time, it could be presumed that this matching required a bit of inferring since the words used in the targeted phrases are slightly different, which makes this an LI item.

　　　　　Item　21　can　also　be　confirmed　as　an　LI　item：

21．Why　does　the　author　mention　Joseph　Pulitzer　and　William　Randolph

Hearst？

（A）They　established　New　York’s　first　newspaper．

（B）They　published　comic　strips　about　the　newspaper　war．

（C）Their　comic　strips　are　still　published　today．

（D）They　owned　major　competitive　newspapers．

     The information from the two sentences from the passage, "The first full-color comic strip appeared in January 1894 in the New York World, owned by Joseph Pulitzer," and "The first regular weekly full-color comic supplement, similar to today's Sunday funnies, appeared two years later, in William Randolph Hearst's rival New York paper, the Morning Journal," as well as the phrase "between giants of the American press" from the first sentence and "Both were immensely popular," from the first sentence of the second paragraph, are integrated to infer that these two people "owned major competitive newspapers" (option (D)).

     The fact that items 14, 15, and 21 are all considered to be LI items allows us to claim that the second factor in this analysis can be explained by the "local-inferential" element of reading performances.

5.3 Item Analyses

5.3.1 Selecting items to be analyzed in this part of study

     It is clear from the results of factor analytic studies in section 5.2 that some of the question types that were predetermined for each item did not function in the way they were expected. However, at the same time, through the qualitative analyses of each item that were done to specify the nature of each factor in sections 5.2.1, 5.2.2, and 5.2.3, new question types were assigned to the items which revealed a great particularity to each factor. In investigating relationships between question types and item difficulty, it seems necessary to proceed with this part of analysis with the items for which the question types became explicit and coherent in the factor analytic studies above (footnote 12). For this reason, the items which are incorporated in this part of analysis are listed in Table 5-9.

     In Table 5-9, Group B is excluded because items from Test Set B cannot be incorporated in this part of the analysis owing to the fact that, for the second factor, only a few items showed high loadings and that no item showed a strong negative loading.

Table 5-9 Items adopted for item analyses by their question types and ability groups

Question Type    Group A-Low       Group A-High
Inferential      14, 17, 18, 19    16, 23, 30
Literal          2, 7              18, 25, 26, 27
Local            -                 17, 19, 28, 30
Global           -                 4, 5, 6, 7, 22

     Furthermore, in the factor analytic studies, the items in both Group A-Low and Group A-High exhibited only partial aspects of question types that were defined earlier in the present thesis. Therefore, the present author could only specify the question types according to the literal/inferential or local/global dimensions, rather than by their "question types" (i.e. local-inferential). This is why, for Group A-High, item 30 appears twice in Table 5-9: once as an inferential item and again as a local item.

12 The items with factor loadings of .400 and above and -.400 and below were selected as items that had explicit features of question types and were employed for further analyses with each ability group.
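The selection rule in footnote 12 (loadings of .400 and above, or -.400 and below) can be sketched as follows. This is only an illustration: the loadings in `example_loadings` are hypothetical values invented for the example, not the loadings obtained in the present study, and the function name `select_items` is likewise an assumption.

```python
# Cut-off stated in footnote 12 of the thesis.
THRESHOLD = 0.400


def select_items(loadings, threshold=THRESHOLD):
    """Split items into strong positive and strong negative loaders.

    `loadings` maps item number -> loading on one factor.
    """
    positive = sorted(i for i, v in loadings.items() if v >= threshold)
    negative = sorted(i for i, v in loadings.items() if v <= -threshold)
    return positive, negative


# Hypothetical loadings, for illustration only (not the thesis data).
example_loadings = {
    16: 0.62, 23: 0.45, 30: 0.41, 20: 0.12,
    18: -0.58, 26: -0.47, 27: -0.43,
}

positive, negative = select_items(example_loadings)
print("positive:", positive)
print("negative:", negative)
```

Items that fall between the two cut-offs (item 20 in the hypothetical data) are left out of both lists, which mirrors the way only items with explicit features were carried forward.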

5.3.2 Group A-Low

     For each test item in Test Set A, item difficulty was calibrated via Rasch Analysis based on the test performances of the test takers in Group A-Low. RASCAL converged after 3 loops. The final parameter estimates are presented in Table 5-10. The raw score conversion table, item by person distribution map, test characteristic curve, and test information curve are in Appendix C. The present author found nothing problematic with the test characteristic curve and test information curve, and the item by person distribution map indicated that the difficulty of items in Test Set A was generally equal to the ability estimates of the test takers in Group A-Low.

     The value for item difficulty can vary between -3.00 and 3.00, with -3.00 being the easiest and 3.00 the most difficult. The numbers in the "Rank" column indicate the difficulty ranking of each of the 27 items included in Test Set A.
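To illustrate what a difficulty value on this scale means, the following sketch evaluates the standard dichotomous Rasch model, in which the probability of a correct response depends only on the difference between a test taker's ability and the item's difficulty. It assumes that the difficulty values are expressed in logits, as is usual for Rasch calibration; the function name `rasch_probability` is chosen for this example and is not part of RASCAL.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model.

    Both arguments are assumed to be on the same logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# A test taker of average ability (0.0) facing the easiest possible item,
# an item of average difficulty, and the hardest possible item.
for difficulty in (-3.0, 0.0, 3.0):
    p = rasch_probability(0.0, difficulty)
    print(f"difficulty {difficulty:+.2f}: P(correct) = {p:.3f}")
```

Under this model a test taker whose ability equals the item difficulty has a 50% chance of answering correctly, and the chance rises as the difficulty value falls.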

　　　　　In　investigating　the　relationship　between　item　difficulty　and　question　type，　the

mean　scores　of　item　difficulty　for　items　selected　in　Table　5－9　with　reference　to　their

question　types　were　calculated　and　are　presented　in　Table　5－11．　The　items　employed

in　this　part　of　analysis　were　limited　to　the　items　from　Table　5－9　because　they　were　the

items　that　loaded　heavily　on　each　factor　in　the　factor　analytic　study　and　bore　explicit

features　of　each　question　type．

     For a precise examination of the difference in the means of difficulties in these two groups of items, a t-test was carried out (p < 0.1 [p = 0.090]). From this result, it can be seen that, for the population of Group A-Low, "literal" items pose more difficulty than "inferential" items. No analysis of the relationship between question type and item difficulty could be done for "local/global" items since the factors from the factor analytic studies did not indicate this feature.
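The comparison can be retraced from the item difficulties reported for Group A-Low (items 2 and 7 as literal; items 14, 17, 18, and 19 as inferential). The sketch below recomputes the two means and a two-sample t statistic. It assumes a pooled-variance (Student's) t-test; the thesis does not state which variant was used, so the statistic need not correspond exactly to the reported p-value.

```python
import math
from statistics import mean, variance

# Item difficulties for Group A-Low, as reported in Table 5-11.
literal = [0.763, 0.405]                      # items 2, 7
inferential = [0.035, 0.253, 0.100, -0.549]   # items 14, 17, 18, 19

n_lit, n_inf = len(literal), len(inferential)
df = n_lit + n_inf - 2

# Pooled variance (assumption: equal variances in the two item groups).
pooled_var = (
    (n_lit - 1) * variance(literal) + (n_inf - 1) * variance(inferential)
) / df
std_error = math.sqrt(pooled_var * (1 / n_lit + 1 / n_inf))
t_stat = (mean(literal) - mean(inferential)) / std_error

print("mean literal:    ", round(mean(literal), 3))
print("mean inferential:", round(mean(inferential), 3))
print("t statistic:     ", round(t_stat, 2), "with", df, "degrees of freedom")
```

With 4 degrees of freedom, the two-tailed critical values of t are about 2.13 at the 10% level and about 2.78 at the 5% level, so a statistic between the two is in line with a result that reaches p < 0.1 but not p < 0.05.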



Table 5-10 Final Parameter Estimates of test items for Group A-Low

Item #  Difficulty  Rank  Std. Error  Chi Sq.  df  Sc. Diff
1       -0.076      18    0.152       24.899   6   99
2        0.763       4    0.155        5.555   6   107
3        1.265       1    0.168       10.742   6   112
4       -2.045      27    0.255       22.284   6   81
5       -1.270      25    0.193        7.301   6   88
6        0.740       5    0.154        4.057   6   107
7        0.405       9    0.151       25.349   6   104
8        0.188      12    0.150        9.565   6   102
9        1.238       2    0.167        1.796   6   111
13       0.122      13    0.151        5.810   6   101
14       0.035      15    0.151        5.123   6   100
15      -1.650      26    0.219        4.003   6   85
16       0.604       6    0.152       12.121   6   105
17       0.253      11    0.150        6.744   6   102
18       0.100      14    0.151        8.806   6   101
19      -0.549      22    0.162        5.841   6   95
20      -0.032      17    0.152        9.894   6   100
21       1.238       3    0.167        8.678   6   111
22      -0.818      23    0.171        2.087   6   93
23      -0.076      19    0.152        2.394   6   99
24      -0.010      16    0.152        8.168   6   100
25      -0.424      21    0.159        7.846   6   96
26      -0.189      20    0.154        8.111   6   98
27      -1.095      24    0.184        4.450   6   90
28       0.318      10    0.150       12.019   6   103
29       0.449       8    0.151        4.310   6   104
30       0.515       7    0.151        4.856   6   105

Table 5-11 Means of item difficulty for each question type in Group A-Low

Literal
Item #  Difficulty
2        0.763
7        0.405
Mean     0.584

Inferential
Item #  Difficulty
14       0.035
17       0.253
18       0.100
19      -0.549
Mean    -0.040



5.3.3 Group A-High

     Based on the test performances of the test takers in Group A-High, the item difficulty of each test item in Test Set A was calibrated via Rasch Analysis. RASCAL converged after 3 loops. The final parameter estimates are presented in Table 5-12, and the raw score conversion table, item by person distribution map, test characteristic curve, and test information curve are in Appendix D. Nothing problematic was found with the test characteristic curve and the test information curve, and the item by person distribution map indicated that the difficulty of items in Test Set A was generally lower than the ability estimates of the test takers in Group A-High. The numbers in the "Rank" column indicate the difficulty ranking of each item out of the 27 items included in Test Set A.

Table 5-12 Final Parameter Estimates of test items for Group A-High

Item #  Difficulty  Rank  Std. Error  Chi Sq.  df  Sc. Diff
1        0.982       7    0.180        8.249   5   109
2        2.575       2    0.157        4.634   5   123
3        1.995       5    0.155        4.524   5   118
4       -1.866      25    0.563        1.565   5   83
5       -0.312      14    0.279        3.488   5   97
6        1.740       6    0.157        5.201   5   116
7        2.456       3    0.155        5.399   5   122
8        0.823       9    0.187        4.118   5   107
9        2.041       4    0.154        1.467   5   119
13      -0.105      13    0.256        4.752   5   99
14      -0.770      17    0.339        2.884   5   93
15      -1.583      23    0.492       87.938   5   86
16       2.647       1    0.158        5.898   5   124
17      -0.662      16    0.323        5.814   5   94
18      -0.390      15    0.288        8.646   5   96
19      -1.583      24    0.492        4.278   5   86
20       0.126      11    0.235       14.538   5   101
21       0.951       8    0.181        4.244   5   109
22      -2.263      27    0.682        0.783   5   79
23       0.016      12    0.245        4.522   5   100
24      -0.770      18    0.339        6.776   5   93
25      -0.890      19    0.357        2.161   5   92
26      -1.362      22    0.444        3.335   5   88
27      -1.866      26    0.563        4.698   5   83
28      -1.180      21    0.408       12.565   5   89
29      -1.025      20    0.380        5.544   5   91
30       0.276      10    0.222        6.360   5   103



　　　　　In　Table　5－13，　the　means　of　item　difficulty　for　test　items　selected　in　Table　5－9

according　to　their　question　types　are　indicated　so　that　the　relationship　between　item

difficulty　and　question　type　of　the　items　could　be　investigated．　Only　the　items　from

Table　5－9　were　employed　in　this　part　of　analysis　because　they　were　the　items　that

loaded　heavily　on　each　factor　in　the　factor　analytic　study　and　bore　explicit　features　of

each　question　type．

Table 5-13 Means of item difficulty for each question type in Group A-High

Literal
Item #  Difficulty
18      -0.390
25      -0.890
26      -1.362
27      -1.866
Mean    -1.127

Inferential
Item #  Difficulty
16       2.647
23       0.016
30       0.276
Mean     0.980

Local
Item #  Difficulty
17      -0.662
19      -1.583
28      -1.180
30       0.276
Mean    -0.787

Global
Item #  Difficulty
4       -1.866
5       -0.312
6        1.740
7        2.456
22      -2.263
Mean    -0.049

     A t-test was carried out for the precise examination of the difference in the mean difficulties of these two pairs of item groups. The difference was significant between "literal" and "inferential" items (p < 0.05 [p = 0.045]) but not between "local" and "global" items (p > 0.1 [p = 0.533]). From this result, it can be seen that, for the population of Group A-High, "inferential" items pose more difficulty than "literal" items, but no difference could be found with regard to the local/global nature of reading performance.
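The two comparisons for Group A-High can be retraced from the difficulties listed in Table 5-13. The sketch below is not the procedure used in the study: it assumes a pooled-variance two-sample t-test with a two-tailed p-value, and it obtains the p-value by numerically integrating the density of Student's t distribution so that only the Python standard library is needed. The function names are chosen for this example.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Return (t, df) for a pooled-variance two-sample t-test."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    std_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / std_error, df


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 1 - 2 * (total * h / 3)


# Item difficulties for Group A-High, as reported in Table 5-13.
literal = [-0.390, -0.890, -1.362, -1.866]
inferential = [2.647, 0.016, 0.276]
local = [-0.662, -1.583, -1.180, 0.276]
global_items = [-1.866, -0.312, 1.740, 2.456, -2.263]

p_literal_inferential = two_tailed_p(*pooled_t(literal, inferential))
p_local_global = two_tailed_p(*pooled_t(local, global_items))

print("literal vs inferential: p =", round(p_literal_inferential, 3))
print("local vs global:        p =", round(p_local_global, 3))
```

Under these assumptions the two p-values come out close to the reported 0.045 and 0.533, which suggests, without confirming, that a pooled-variance test on these item difficulties is what lies behind the reported figures.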

5.3.4 Group B

     Based on the test performances of the test takers in Group B, item difficulty was calibrated via Rasch Analysis for each test item in Test Set B. RASCAL converged after 3 loops. The final parameter estimates are presented in Table 5-14. Appendix E includes the raw score conversion table, the item by person distribution map, the test characteristic curve, and the test information curve. No problems were found with the test characteristic curve and the test information curve, and the item by person distribution map indicated that the difficulty of items in Test Set B was generally equal to the ability estimates of the test takers in Group B. The numbers in the "Rank" column of Table 5-14 indicate the difficulty ranking of each item out of the 27 items included in Test Set B.

Table 5-14 Final Parameter Estimates of test items for Group B

     As was explained in Section 5.3.1, items from Test Set B cannot be incorporated in the analysis of the relationship between question types and item difficulty since the "item position" factor (or "where a test item is located in the test set"), which accounted for the first factor in the factor analytic study of Group B performances on Test Set B, was very strong, and only a few items showed high loadings on the second factor.


