Chapter 12
Scaling outcomes
This chapter reports the outcomes of applying the item response theory (IRT) scaling and
population modelling to generate the plausible values for the PISA 2018 main survey
assessment data.
RESULTS OF THE IRT SCALING AND POPULATION MODELLING
Results of the IRT scaling and population modelling include the proportions of item
parameters that were common (i.e. invariant) across countries and PISA cycles and the
reliability of the assessments for each country. Large proportions of invariant parameters
across countries and cycles ensured the comparability of the proficiency estimates.
Assessing the invariance of item parameters
The item parameters for all items used in the computer-based assessment (CBA) and paper-
based assessment (PBA) were obtained through IRT scaling. Typically, items received
international model parameters that fitted data for a large majority of the country-by-language
groups. Otherwise, items received unique or group-specific parameters, or, if no parameters
could be found that fit the data in a given country-by-language group or groups, the item was
dropped for these groups. One reading item (DR563Q12C) was identified as problematic based on
classical item analyses and the IRT parameters did not fit the data for the majority of the
country-by-language groups. This item was found to be flawed and it was dropped from all
groups.
To assess the invariance of item parameters across country-by-language groups and cycles,
for each country-by-language group, items were categorized as: invariant when their
parameters were the same as the international item parameters in 2015 and 2018 (in the case
of trend items), or the same as the international item parameters in 2018 (in the case of new
items); group-specific invariant when their parameters were not the same as the international
parameters in 2015 and 2018, but the group-specific parameters did not change between the
2015 and 2018 cycles (in the case of trend items); dropped if the item was dropped from the
group’s scaling; and noninvariant in all other cases (trend items with different unique
parameters in 2015 and 2018, or new items with unique item parameters). For countries with
multiple language groups, the
results were averaged for the country, using the population weights to represent the language
group’s proportion in the country’s sample.
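The categorisation rules above can be sketched in a few lines of Python. This is a minimal illustration, not the operational scaling code: the function names and the boolean inputs (`intl_2015`, `intl_2018`, `group_params_unchanged`) are hypothetical stand-ins for the outcome of the IRT scaling in each country-by-language group.

```python
def classify_item(dropped, is_trend, intl_2018, intl_2015=False,
                  group_params_unchanged=False):
    """Categorise one item for one country-by-language group.

    All arguments are hypothetical flags summarising the scaling outcome:
    intl_2015 / intl_2018 say whether the international parameters were used
    in that cycle; group_params_unchanged says whether group-specific
    parameters stayed the same between 2015 and 2018.
    """
    if dropped:
        return "dropped"
    if is_trend:
        if intl_2015 and intl_2018:
            return "invariant"
        if group_params_unchanged:
            return "group-specific invariant"
        return "noninvariant"
    # New items have no 2015 parameters to compare against.
    return "invariant" if intl_2018 else "noninvariant"


def country_proportion(group_proportions, group_weights):
    """Average language-group proportions using population weights."""
    total = sum(group_weights)
    return sum(p * w for p, w in zip(group_proportions, group_weights)) / total
```

For a country with two language groups, `country_proportion` would combine the two group-level proportions of invariant items into the single country value that enters the averages reported in the tables.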
Table 12.1 shows the proportion of items categorized as invariant, noninvariant, and dropped,
averaged across countries participating in the 2018 CBA. The proportion of invariant items
(with parameters equal to the international parameters), which is critical for ensuring the
comparability of scores across countries and for the stability of trends, was large for all
domains, ranging from 77.35% for the reading trend items to 95.17% for the reading fluency
items. When taking into account the group-specific invariant items, which also contribute to
the stability of the trends, the total proportion of invariant items was near or above 90% for
all domains. Regarding the dropped category, the proportions were very small for all domains
(0.64% or less).
Table 12.1 Proportion of invariant, noninvariant, and dropped CBA items averaged across
countries, for each domain
                         Mathematics Reading     Reading     Reading     Science     Financial   Financial   Global
                                     - Trend     - New       fluency                 literacy    literacy    competence
                                                                                     - Trend     - New
Total items              70          72          172         65          115         29          14          69
Invariant                91.81%      77.35%      88.39%      95.17%      86.73%      84.85%      93.04%      90.05%
Group-specific invariant 4.50%       10.35%      -           -           5.85%       7.11%       -           -
Invariant total 1        96.32%      87.70%      88.39%      95.17%      92.59%      91.97%      93.04%      90.05%
Noninvariant             3.55%       12.30%      11.40%      4.19%       7.25%       7.90%       6.96%       9.65%
Dropped                  0.13%       0.00%       0.22%       0.64%       0.17%       0.13%       0.00%       0.30%
Note: Viet Nam is not included in the analysis due to adjudication issues.
1. Invariant total is the sum of invariant and group-specific invariant. The percentages of the invariant total,
noninvariant, and dropped items add up to 100%.
Table 12.2 shows the proportion of items categorized as invariant, noninvariant, and dropped,
averaged across countries participating in the 2018 PBA. The results are similar to those for
CBA, with high proportions of invariant and group-specific invariant items—with values
greater than 90% for mathematics and science, and slightly below 90% for reading—and very
small proportions of dropped items.
Table 12.2 Proportion of invariant, noninvariant, and dropped PBA items averaged across
countries, for each domain
Mathematics Reading Science
Total items 71 87 85
Invariant 89.82% 77.83% 80.08%
Group-specific invariant 5.50% 9.38% 10.46%
Invariant total 1 95.32% 87.21% 90.54%
Noninvariant 4.51% 11.98% 9.10%
Dropped 0.18% 0.81% 0.36%
Note: Viet Nam is not included in the analysis due to adjudication issues.
1. Invariant total is the sum of invariant and group-specific invariant. The percentages of the invariant total,
noninvariant, and dropped items add up to 100%.
An overview of the frequencies of invariant, noninvariant, and dropped items for each
domain, separated by CBA and PBA, is presented in Figures 12.1 to 12.5. Each country is
represented by a vertical bar, with: dark green representing the number of items classified as
invariant (when applicable, a vertical bar is used to separate the trend items, closer to the x-
axis, and new items); light green representing the number of invariant group-specific items;
yellow indicating noninvariant group-specific items (when applicable, a vertical bar is used
to separate the trend items, closer to the x-axis, and new items); and red indicating items
dropped from scaling. The countries are ordered from left to right by increasing number of
invariant items. These plots show that while most countries have large numbers of invariant
items, a few countries show noticeably lower invariance.
Figure 12.1 Frequency of invariant, noninvariant, and dropped items for mathematics, by
country
Figure 12.2 Frequency of invariant, noninvariant, and dropped items for reading, by country
Figure 12.3 Frequency of invariant, noninvariant, and dropped items for science, by country
Figure 12.4 Frequency of invariant, noninvariant, and dropped items for financial literacy, by
country
Figure 12.5 Frequency of invariant, noninvariant, and dropped items for global competence,
by country
After the IRT scaling was finalised, item parameter estimates were delivered to each
country, with an indication of which items had received international item parameters and
which had received group-specific item parameters. Table 12.3 gives an example of the
information provided to countries: the first column shows the domain; the second column
flags items that had received group-specific parameters or had been excluded from the IRT
scaling; and the remaining columns show the final item parameter estimates (the slope and
difficulty parameters are listed for all items, while the threshold parameters are listed for
the polytomous items). Note that some item parameters that had been estimated before
PISA 2015 using the Rasch or PCM models and still fit the 2018 data retained their slope
value of 1. As indicated earlier, all items in the 2018 main survey were modelled using the
two-parameter logistic model (2PLM) or the generalized partial credit model (GPCM),
with the Rasch model being a special case of these models.
Table 12.3 Example of item parameter estimates provided to countries
Domain       Flag                    Item      Slope    Difficulty  Step 1    Step 2
Mathematics  Excluded from scaling   CM998Q04  -        -           -         -
Mathematics  Unique item parameters  PM155Q01  1.42972  -0.35538    -         -
Mathematics  -                       PM00GQ01  1        1.62226     -         -
Mathematics  -                       PM155Q03  1.08678  0.73497     -0.20119  0.20119
Reliability of the PISA scales
Plausible values were generated for all students by setting all the item parameters to the
values obtained from the final IRT scaling and by applying the population modelling
approach described in Chapter 9.
Given the rotated and incomplete assessment design, it was not possible to calculate the
classical reliability values for each cognitive domain. Nevertheless, test reliability could be
estimated using the commonly used formula: 1 – (expected error variance/total variance).
The expected error variance is the weighted average of the posterior variance (i.e. the
variance across the 10 plausible values, which is an expression of the posterior measurement
error). The total variance was estimated using a resampling approach (Efron, 1982) and was
estimated for each country depending on the country-specific proficiency distributions for
each cognitive domain.
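The reliability formula above can be illustrated with a short Python sketch. It rests on simplifying assumptions that are not part of the operational procedure: the total variance is taken as the weighted variance of the pooled plausible values rather than being estimated by resampling, and the within-student variance uses the sample (n - 1) estimator. The function name and input layout are hypothetical.

```python
from statistics import variance


def reliability(pvs, weights):
    """Return 1 - (expected error variance / total variance).

    pvs     : list of rows, one per student, each holding that student's
              plausible values for one domain (10 per student in PISA 2018).
    weights : one sampling weight per student.
    """
    total_w = sum(weights)
    n_pv = len(pvs[0])

    # Expected error variance: weighted average of each student's variance
    # across his or her plausible values.
    error_var = sum(w * variance(row) for row, w in zip(pvs, weights)) / total_w

    # Total variance (simplification): weighted variance of the pooled PVs.
    mean = sum(w * x for row, w in zip(pvs, weights) for x in row) / (n_pv * total_w)
    total_var = sum(
        w * (x - mean) ** 2 for row, w in zip(pvs, weights) for x in row
    ) / (n_pv * total_w)

    return 1.0 - error_var / total_var
```

If every student's plausible values were identical, the error variance would be zero and the function would return 1; the more the plausible values spread within students relative to the spread between students, the lower the value.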
Table 12.4 presents the distribution of the national reliabilities for the generated scale scores based on all 10 plausible values. The reliabilities for each country are presented in Table 12.5. These tables show that the variance explained by the combined IRT model and population model is at a comparable level across countries. While the median values are above 0.80 in all the domains assessed in CBA and PBA, it is important to keep in mind that this is not to be confused with a classical reliability coefficient, as it is based on more than the item responses. Comparisons among individual students are not appropriate because the apparent accuracy of the measures is obtained by statistically adjusting the estimates based on background data. This approach does provide improved behaviour of subgroup estimates, even if the plausible values obtained using this methodology are not suitable for comparisons of individuals (Mislevy & Sheehan, 1987; von Davier et al., 2006).
Table 12.4 Distribution of the national reliabilities of the cognitive domains and reading
subscales
Mode Domain Median S.D. Min Max
CBA
Mathematics 0.85 0.03 0.77 0.90
Reading 0.93 0.01 0.91 0.95
Science 0.88 0.02 0.83 0.92
Financial literacy 1 0.89 0.01 0.86 0.93
Global competence 0.88 0.02 0.83 0.91
Reading subscale - Evaluate and reflect 0.90 0.02 0.84 0.93
Reading subscale - Locate information 0.88 0.02 0.82 0.92
Reading subscale - Understand 0.92 0.01 0.89 0.94
Reading subscale - Multiple 0.91 0.01 0.88 0.94
Reading subscale - Single 0.91 0.01 0.88 0.94
PBA
Mathematics 0.81 0.02 0.77 0.85
Reading 0.91 0.01 0.89 0.92
Science 0.82 0.02 0.80 0.87
Note: Viet Nam was not included in this analysis due to adjudication issues.
1. The financial literacy sample was separate from the main sample.
Table 12.5 National reliability values of the cognitive domains
Mode Country Mathematics Reading Science Financial literacy 1 Global competence
CBA Albania 0.77 0.92 0.84 0.85
CBA Australia 0.85 0.93 0.89 0.88
CBA Austria 0.88 0.94 0.90
CBA Baku (Azerbaijan) 0.81 0.91 0.84
CBA Belarus 0.87 0.93 0.88
CBA Belgium 0.89 0.93 0.90
CBA Bosnia and Herzegovina 0.83 0.93 0.86
CBA Brazil 0.85 0.94 0.89 0.89
CBA Brunei Darussalam 0.90 0.95 0.92 0.91
CBA B-S-J-Z (China) 2 0.84 0.91 0.87
CBA Bulgaria 0.83 0.94 0.88 0.87
CBA Canada 0.81 0.92 0.86 0.91 0.83
CBA Chile 0.85 0.93 0.87 0.89 0.87
CBA Chinese Taipei 0.87 0.93 0.90 0.90
CBA Colombia 0.85 0.93 0.88 0.89
CBA Costa Rica 0.83 0.91 0.86 0.86
CBA Croatia 0.84 0.93 0.87 0.87
CBA Cyprus 3 0.83 0.94 0.87
CBA Czech Republic 0.87 0.94 0.90
CBA Denmark 0.85 0.93 0.89
CBA Dominican Republic 0.83 0.92 0.85
CBA Estonia 0.84 0.92 0.88 0.89
CBA Finland 0.84 0.93 0.88 0.88
CBA France 0.89 0.94 0.90
CBA Georgia 0.83 0.93 0.86 0.88
CBA Germany 0.88 0.94 0.90
CBA Greece 0.82 0.93 0.86 0.88
CBA Hong Kong (China) 0.84 0.92 0.86 0.85
CBA Hungary 0.87 0.94 0.90
CBA Iceland 0.84 0.94 0.89
CBA Indonesia 0.88 0.94 0.89 0.93 0.87
CBA Ireland 0.85 0.93 0.89
CBA Israel 0.85 0.94 0.90
CBA Italy 0.87 0.93 0.89 0.90
CBA Japan 0.85 0.93 0.89
CBA Kazakhstan 0.78 0.94 0.87 0.83
CBA Korea 0.86 0.93 0.88 0.88
CBA Kosovo 0.82 0.91 0.86
CBA Latvia 0.83 0.93 0.86 0.87 0.88
CBA Lithuania 0.85 0.94 0.89 0.89 0.89
CBA Luxembourg 0.87 0.95 0.91
CBA Macao (China) 0.80 0.91 0.86
CBA Malaysia 0.86 0.93 0.89
CBA Malta 0.87 0.95 0.90 0.90
CBA Mexico 0.82 0.92 0.87
CBA Montenegro 0.82 0.93 0.87
CBA Morocco 0.80 0.91 0.85 0.85
CBA Netherlands 0.90 0.94 0.91 0.90
CBA New Zealand 0.85 0.94 0.90
CBA Norway 0.86 0.93 0.89
CBA Panama 0.86 0.93 0.90 0.89
CBA Peru 0.84 0.92 0.88 0.89
CBA Philippines 0.85 0.94 0.88 0.88
CBA Poland 0.84 0.93 0.88 0.87
CBA Portugal 0.88 0.93 0.90 0.89
CBA Qatar 0.85 0.95 0.88
CBA Russian Federation 0.81 0.93 0.86 0.86 0.85
CBA Serbia 0.83 0.94 0.86 0.87 0.87
CBA Singapore 0.84 0.93 0.89 0.88
CBA Slovak Republic 0.86 0.94 0.88 0.88 0.89
CBA Slovenia 0.87 0.93 0.90
CBA Spain 0.80 0.92 0.83 0.88 0.83
CBA Sweden 0.86 0.93 0.89
CBA Switzerland 0.87 0.94 0.90
CBA Thailand 0.86 0.94 0.90 0.89
CBA Turkey 0.86 0.93 0.89
CBA United Arab Emirates 0.85 0.95 0.89
CBA United Kingdom 0.84 0.93 0.88
CBA United States 0.88 0.94 0.90 0.90
CBA Uruguay 0.86 0.93 0.89
PBA Argentina 0.82 0.89 0.82
PBA Jordan 0.77 0.89 0.80
PBA Lebanon 0.81 0.91 0.82
PBA North Macedonia 0.81 0.90 0.82
PBA Republic of Moldova 0.80 0.91 0.84
PBA Romania 0.85 0.92 0.86
PBA Saudi Arabia 0.80 0.90 0.82
PBA Ukraine 0.85 0.92 0.87
Note: Viet Nam was not included in this analysis due to adjudication issues.
1. The financial literacy sample was separate from the main sample.
2. B-S-J-Z (China) data represent the regions of Beijing, Shanghai, Jiangsu, and Zhejiang.
3. Note by Turkey: The information in this document with reference to “Cyprus” relates to the southern part of
the Island. There is no single authority representing both Turkish and Greek Cypriot people on the Island.
Turkey recognises the Turkish Republic of Northern Cyprus (TRNC). Until a lasting and equitable solution is
found within the context of the United Nations, Turkey shall preserve its position concerning the “Cyprus
issue.”
Note by all the European Union Member States of the OECD and the European Union: The Republic of Cyprus
is recognised by all members of the United Nations with the exception of Turkey. The information in this
document relates to the area under the effective control of the Government of the Republic of Cyprus.
Reading MSAT measurement error
As indicated earlier, the main goal of the new reading multistage adaptive testing (MSAT)
design was to improve measurement accuracy over what would have been obtained with a
linear (nonadaptive) design used in past PISA cycles.
The efficiency of an MSAT design depends, in large part, on the resources available as well
as the constraints placed on the assembly of the MSAT components (i.e., the core, stage 1 and
stage 2 testlets). For the 2018 reading MSAT, the testlets were assembled to provide strong
links between the MSAT forms, to meet sets of content, timing, and other test blueprints, as
well as to create a differentiation between the high (difficult) and low (easy) stage 1 and stage
2 testlets. Using the international item parameters from the IRT scaling of the PISA 2018
main survey data, the average standard errors of measurement for the difficult (HH), easy
(LL), difficult/easy (HL), and easy/difficult (LH) forms were computed across the full range
of PISA reported proficiencies for the reading MSAT. The results are displayed in
Figure 12.6. The lowest standard error of measurement that can be achieved (if students are
routed according to their true proficiency) is shown by the lowest curve at any point on the
proficiency scale. For example, for a student with a true proficiency of 350, the easy (LL)
form provides the lowest standard error of measurement (approximately 30 points on the
PISA scale), and for a student with a true proficiency of 700, the difficult (HH) form provides
the lowest standard error of measurement (approximately 50 points). However, in reality, the
assignment of the MSAT forms is not ideal because of the inaccuracy of the routing
proficiency estimates and because a proportion of the students are randomly routed by design.
The average standard error obtained for the 2018 main survey, taking into account the
proportion of students that were actually assigned to each form at each proficiency level, is
shown as a dashed line in Figure 12.6. Despite the imperfect routing, it can be observed that
across all points on the proficiency scale, the dashed line is only slightly above the lowest
standard error of measurement that can be achieved across the different MSAT forms.
Figure 12.6 Conditional standard error of measurement for the different forms for the reading
MSAT and the weighted average across the actual assigned forms
The standard error of measurement that can be expected for a traditional nonadaptive PISA
design was also estimated using the same item pool and the same test length (i.e. average
number of items in the forms delivered to students) as the reading MSAT. Figure 12.7 shows
the ratio of the conditional standard error of measurement for the reading MSAT design that
was implemented in the 2018 main survey to the traditional nonadaptive design that could
have been implemented. A ratio of less than 1 indicates that the standard error of
measurement for the MSAT design is lower than that of the traditional nonadaptive design.
As expected, the standard errors of measurement were reduced with the MSAT design by as
much as 10% at the lower and higher proficiency levels.
Figure 12.7 Ratio of the conditional standard error of measurement for the MSAT design to
the standard error of measurement for a traditional nonadaptive PISA design
TRANSFORMING THE PLAUSIBLE VALUES TO PISA SCALES
The plausible values generated on the IRT scale by the population modelling were
transformed, using a linear transformation, to be reported on a scale that is linked to the
historic PISA scale. This scale can be used to compare the overall performance of countries
or subgroups within a country.
Mathematics, reading, and science
For mathematics, reading, and science, the transformation coefficients established for the
PISA 2015 cycle were applicable to the 2018 cycle. Note that in 2015, the transformation
coefficients were computed for each domain separately, based on the 2006, 2009, 2012, and
2015 scaled proficiencies from only the OECD countries. The country means and variances
used to compute the transformation coefficients included only the values from the cycle in
which a given content domain was the major domain. Hence, the transformation coefficients
for science are based on the 2006 reported results, the reading coefficients are based on the
2009 results, and the mathematics coefficients are based on the 2012 results. Computational
details are provided in the PISA 2015 technical report (OECD, PISA 2015 Technical report,
chapter 12).
Financial literacy
For financial literacy, results from the 2012 PISA cycle were used to compute the
transformation coefficients. The method for computing the transformation coefficients for
financial literacy was similar to that used for mathematics, reading, and science. However,
the key distinction was that for financial literacy, all available country data were used to
compute the coefficients, whereas for mathematics, reading, and science, only the data from
OECD countries were used. This decision was made because there were too few OECD
countries that had participated in the financial literacy assessment to provide defensible
transformation coefficients.
Global competence
Global competence was a newly established domain in PISA 2018. Consistent with the new
domains that had been introduced in previous PISA cycles, the transformation coefficients for
global competence were computed so that the plausible values for the OECD countries would
have a mean of 500 and a standard deviation of 100. To take into account the 10 sets of
plausible values, all sets were stacked together and the weighted (using the senate weights)
mean and variance were computed. Stated differently, the full set of transformed plausible
values for global competence had a weighted mean of 500 and a weighted standard deviation
of 100 for the OECD countries.
Specifically, the equations used to compute the transformation coefficients for global
competence are presented below. $X_{kv}$ is the $v$th plausible value ($v = 1, 2, \ldots, 10$) for
examinee $k$. The grand mean of the plausible values is $\bar{X}_{kv}$, which is computed by
compiling all 10 sets of plausible values into a single vector (with the corresponding senate
weights $W_{kv}$ compiled in a separate vector) and finding the weighted mean of these values.
The weighted variance of the plausible values is $\tau_{PV}^2$, which is computed using the vector
of plausible values described above. The square root of $\tau_{PV}^2$ is the weighted standard
deviation, $\tau_{PV}$.

$$\tau_{PV} = \sqrt{\tau_{PV}^2} = \sqrt{\frac{\sum_{v=1}^{10} \sum_{k=1}^{n} W_{kv} \left( X_{kv} - \bar{X}_{kv} \right)^2}{\left[ (10n - 1) \sum_{k=1}^{n} W_{kv} \right] / n}} \qquad (12.1)$$

The transformation coefficients for global competence were computed using the following
equations:

$$A = \frac{100}{\tau_{PV}} \qquad (12.2)$$

$$B = 500 - A \left[ \bar{X}_{kv} \right] = 500 - A \left[ \frac{\sum_{v=1}^{10} \sum_{k=1}^{n} X_{kv} W_{kv}}{10 \sum_{k=1}^{n} W_{kv}} \right] \qquad (12.3)$$
The plausible values for global competence were transformed to the PISA scale using a
similar approach to that used for mathematics, reading, science, and financial literacy.
However, one difference is that for global competence, the transformation was based on the
plausible values, because global competence had been introduced for the first time in 2018
(whereas for mathematics, reading, science, and financial literacy, the transformations used
the model-based results from the concurrent calibration in order to align the results with
previously established scales).
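Equations 12.1 to 12.3 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the operational code: each student is assumed to carry one senate weight that applies to all of his or her plausible values, the function names are hypothetical, and the number of plausible values is read from the data instead of being fixed at 10.

```python
import math


def weighted_mean(pvs, weights):
    """Grand weighted mean of the stacked plausible values (bracket in eq. 12.3)."""
    n_pv = len(pvs[0])
    numerator = sum(w * x for row, w in zip(pvs, weights) for x in row)
    return numerator / (n_pv * sum(weights))


def weighted_sd(pvs, weights):
    """Weighted standard deviation of the stacked plausible values (eq. 12.1)."""
    n = len(pvs)
    n_pv = len(pvs[0])
    mean = weighted_mean(pvs, weights)
    numerator = sum(w * (x - mean) ** 2 for row, w in zip(pvs, weights) for x in row)
    denominator = (n_pv * n - 1) * sum(weights) / n
    return math.sqrt(numerator / denominator)


def transformation_coefficients(pvs, weights, target_mean=500.0, target_sd=100.0):
    """Return (A, B) as in eqs. 12.2 and 12.3."""
    a = target_sd / weighted_sd(pvs, weights)
    b = target_mean - a * weighted_mean(pvs, weights)
    return a, b
```

Applying `a * x + b` to every stacked plausible value then yields a weighted mean of 500 and a weighted standard deviation of 100 under the same definitions, which is the property described in the text for the OECD countries.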
Transformation coefficients for all domains
The transformation coefficients for all content domains are presented in Table 12.6. The A
coefficient adjusts the variability (standard deviation) of the resulting scale, while the B
coefficient adjusts the scale location (mean).
Table 12.6 Transformation coefficients for PISA 2018
Domain A B
Mathematics 135.9030 514.1848
Reading 131.5806 437.9583
Science 168.3189 494.5360
Financial literacy 140.0807 490.7259
Global competence 166.1760 530.9083
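As an illustration of how the coefficients are used, the sketch below applies the linear form A × PV + B implied by equations 12.2 and 12.3 to a value on the IRT scale. The coefficients are transcribed from Table 12.6; the function and dictionary names are hypothetical.

```python
# (A, B) pairs transcribed from Table 12.6.
COEFFICIENTS = {
    "Mathematics": (135.9030, 514.1848),
    "Reading": (131.5806, 437.9583),
    "Science": (168.3189, 494.5360),
    "Financial literacy": (140.0807, 490.7259),
    "Global competence": (166.1760, 530.9083),
}


def to_pisa_scale(theta, domain):
    """Map a plausible value on the IRT scale to the PISA reporting scale."""
    a, b = COEFFICIENTS[domain]
    return a * theta + b
```

Under this form, a plausible value of 0 on the IRT scale maps to B, and each unit on the IRT scale corresponds to A points on the reporting scale.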
Table 12.7 shows the average transformed plausible values as well as the resampling-based
standard errors for each country and domain.
Table 12.7 Average plausible values (PVs) and resampling-based standard errors (SE) by
country and domain
Country Mathematics (Average PV, SE) Reading (Average PV, SE) Science (Average PV, SE) Financial literacy 1 (Average PV, SE) Global competence (Average PV, SE)
International average 458.67 0.29 453.40 0.29 457.92 0.28 481.42 0.59 474.06 0.55
Albania 437.22 2.42 405.43 1.92 416.73 1.99 426.93 2.47
Argentina 379.45 2.77 401.50 2.98 404.07 2.87
Australia 491.36 1.94 502.63 1.63 502.96 1.80 510.88 2.07
Austria 498.94 2.97 484.39 2.70 489.78 2.78
Baku (Azerbaijan) 419.64 2.82 389.39 2.51 397.65 2.36
Belarus 471.87 2.67 473.79 2.44 471.26 2.45
Belgium 508.07 2.26 492.86 2.32 498.77 2.23
Bosnia and Herzegovina 406.38 3.06 402.98 2.93 398.50 2.74
Brazil 383.57 2.03 412.87 2.11 403.62 2.06 420.41 2.34
Brunei Darussalam 430.11 1.16 408.07 0.90 430.98 1.21 428.94 1.29
B-S-J-Z (China) 2 591.39 2.52 555.24 2.75 590.45 2.67
Bulgaria 436.04 3.82 419.84 3.91 424.07 3.63 432.24 4.14
Canada 512.02 2.36 520.09 1.80 518.00 2.15 532.29 3.22 553.77 2.32
Chile 417.41 2.42 452.27 2.64 443.58 2.42 450.88 2.90 466.08 2.94
Chinese Taipei 531.14 2.89 502.60 2.84 515.75 2.87 527.27 2.90
Colombia 390.93 2.99 412.30 3.25 413.32 3.05 457.43 3.26
Costa Rica 402.33 3.29 426.50 3.42 415.62 3.27 455.60 3.71
Croatia 464.20 2.55 478.99 2.67 472.36 2.79 506.34 2.77
Cyprus 3 450.68 1.41 424.36 1.37 439.01 1.39
Czech Republic 499.47 2.46 490.22 2.55 496.79 2.55
Denmark 509.40 1.74 501.13 1.80 492.64 1.94
Dominican Republic 325.10 2.62 341.63 2.86 335.63 2.50
Estonia 523.41 1.74 523.02 1.84 530.11 1.88 547.49 2.05
Finland 507.30 1.97 520.08 2.31 521.88 2.51 536.86 2.42
France 495.41 2.32 492.61 2.32 492.98 2.22
Georgia 397.59 2.60 379.75 2.16 382.66 2.31 402.94 2.56
Germany 500.04 2.65 498.28 3.03 502.99 2.91
Greece 451.37 3.09 457.41 3.62 451.63 3.14 487.92 3.59
Hong Kong 551.15 3.00 524.28 2.73 516.69 2.54 542.14 2.81
Hungary 481.08 2.32 475.99 2.25 480.91 2.33
Iceland 495.19 1.95 473.97 1.74 475.02 1.80
Indonesia 378.67 3.12 370.97 2.56 396.07 2.39 388.38 3.21 407.96 2.39
Ireland 499.63 2.20 518.08 2.24 496.11 2.21
Israel 463.03 3.50 470.42 3.67 462.20 3.62 496.37 3.85
Italy 486.59 2.78 476.28 2.44 468.01 2.43 476.49 2.49
Japan 526.97 2.47 503.86 2.67 529.14 2.59
Jordan 399.76 3.31 419.06 2.94 429.25 2.93
Kazakhstan 423.15 1.91 386.91 1.46 397.10 1.66 407.65 1.62
Korea 525.93 3.12 514.05 2.94 519.01 2.80 508.66 2.96
Kosovo 365.88 1.49 353.07 1.14 364.88 1.17
Latvia 496.13 1.96 478.70 1.62 487.25 1.76 501.31 1.77 496.54 1.99
Lebanon 393.45 4.05 353.36 4.32 383.72 3.54
Lithuania 481.19 1.95 475.87 1.52 482.07 1.63 498.29 1.81 489.43 1.89
Luxembourg 483.42 1.10 469.99 1.13 476.77 1.22
Macao 557.67 1.53 525.12 1.23 543.59 1.46
Malaysia 440.21 2.88 414.98 2.87 437.62 2.71
Malta 471.72 1.90 448.23 1.73 456.59 1.87 478.78 2.15
Mexico 408.80 2.49 420.47 2.75 419.20 2.58
Moldova 420.60 2.43 423.99 2.44 428.49 2.26
Montenegro 429.61 1.24 421.06 1.05 415.17 1.31
Morocco 367.73 3.33 359.39 3.13 376.60 3.00 402.30 3.45
Netherlands 519.23 2.63 484.78 2.65 503.38 2.84 557.90 2.64
New Zealand 494.49 1.71 505.73 2.04 508.49 2.10
North Macedonia 394.45 1.56 392.67 1.10 413.04 1.42
Norway 500.96 2.22 499.45 2.17 490.41 2.28
Panama 352.85 2.72 376.97 2.95 364.62 2.89 412.52 2.94
Peru 399.84 2.61 400.51 2.96 404.22 2.67 410.63 3.15
Philippines 352.57 3.47 339.69 3.29 356.93 3.18 371.14 3.41
Poland 515.65 2.60 511.86 2.70 511.04 2.61 519.61 2.55
Portugal 492.49 2.68 491.80 2.43 491.68 2.77 505.36 2.42
Qatar 414.23 1.23 407.09 0.77 419.13 0.92
Romania 429.92 4.90 427.70 5.14 425.76 4.60
Russian Federation 487.79 2.96 478.50 3.08 477.72 2.87 495.14 2.94 479.95 2.84
Saudi Arabia 373.24 2.99 399.15 2.96 386.25 2.84
Serbia 448.28 3.16 439.47 3.27 439.87 3.05 443.62 2.87 463.45 3.20
Singapore 569.01 1.60 549.46 1.59 550.94 1.48 576.48 1.81
Slovak Republic 486.16 2.56 457.98 2.23 464.05 2.28 481.26 2.29 486.42 2.30
Slovenia 508.90 1.36 495.35 1.23 507.01 1.25
Spain 481.39 1.46 476.54 1.58 483.25 1.55 492.25 2.21 512.12 1.61
Sweden 502.39 2.65 505.79 3.02 499.44 3.07
Switzerland 515.31 2.91 483.93 3.12 495.28 3.00
Thailand 418.56 3.45 392.89 3.23 425.81 3.18 423.42 3.01
Turkey 453.51 2.26 465.63 2.17 468.30 2.01
Ukraine 453.12 3.65 465.95 3.50 468.99 3.30
United Arab Emirates 434.95 2.14 431.78 2.30 433.64 2.01
United Kingdom 501.77 2.56 503.93 2.58 504.67 2.56 534.06 4.86
United States 478.24 3.24 505.35 3.57 502.38 3.32 505.68 3.35
Uruguay 417.66 2.63 427.12 2.76 425.81 2.47
Note: Viet Nam was not included in this analysis due to adjudication issues.
1. The financial literacy sample was separate from the main sample.
2. B-S-J-Z (China) data represent the regions of Beijing, Shanghai, Jiangsu, and Zhejiang.
3. Note by Turkey: The information in this document with reference to “Cyprus” relates to the southern part of
the Island. There is no single authority representing both Turkish and Greek Cypriot people on the Island.
Turkey recognises the Turkish Republic of Northern Cyprus (TRNC). Until a lasting and equitable solution is
found within the context of the United Nations, Turkey shall preserve its position concerning the “Cyprus
issue.”
Note by all the European Union Member States of the OECD and the European Union: The Republic of Cyprus
is recognised by all members of the United Nations with the exception of Turkey. The information in this
document relates to the area under the effective control of the Government of the Republic of Cyprus.
LINKING ERROR
The estimation of the linking error between two PISA cycles was accomplished by
considering the differences between the reported country means from the previous PISA
cycles and new estimates of these country means based on the new PISA cycle item
parameters (see Chapter 9 for more information on the estimation process). To estimate the
linking error for trend comparisons between PISA 2018 and a previous PISA cycle, the
subset of countries that had participated in both cycles being compared were used. In the case
of financial literacy, since the number of participating countries was relatively small, all
countries were used.
The 2018 linking errors are reported in Table 12.8, below. Using these values, the extent to
which changes in a country’s or subgroup’s performance between PISA 2018 and a previous
PISA cycle are significant can be evaluated.
For each domain, the change in a country or subgroup’s performance between PISA 2018 and
a previous PISA cycle can be calculated using the following formula:

$$\Delta_{2018-t} = PISA_{2018} - PISA_{t} \qquad (12.4)$$

where $\Delta_{2018-t}$ is the difference in performance between PISA 2018 and a previous PISA
cycle, where $t$ can take any of the following values: 2000, 2003, 2006, 2009, 2012, or 2015.
$PISA_{2018}$ is the observed score in 2018, and $PISA_{t}$ is the observed score in a previous cycle.
The standard error of the change in performance $\sigma(\Delta_{2018-t})$ can be calculated as:

$$\sigma(\Delta_{2018-t}) = \sqrt{\sigma_{2018}^2 + \sigma_{t}^2 + \mathrm{error}_{2018,t}^2} \qquad (12.5)$$

where $\sigma_{2018}$ is the standard error observed in PISA 2018, $\sigma_{t}$ is the standard error observed in
a previous PISA cycle $t$, and $\mathrm{error}_{2018,t}$ is the linking error for the comparisons of the scores
between PISA 2018 and a previous PISA cycle $t$. The values for $\mathrm{error}_{2018,t}$ are presented in
Table 12.8.
Note that for each domain, the earliest cycle for which comparisons can be made between
PISA 2018 and a previous PISA cycle is the cycle in which the domain first became a major
domain. Thus, the comparison of mathematics scores between PISA 2018 and PISA 2000 is
not possible, nor is the comparison of science scores between PISA 2018 and PISA 2000 or
between PISA 2018 and PISA 2003.
Table 12.8 Linking error for score comparisons between PISA 2018 and previous PISA
cycles
Comparison         Mathematics  Reading  Science  Financial literacy
PISA 2000 to 2018  -            4.04     -        -
PISA 2003 to 2018  2.80         7.77     -        -
PISA 2006 to 2018  3.18         5.24     3.47     -
PISA 2009 to 2018  3.54         3.52     3.59     -
PISA 2012 to 2018  3.34         3.74     4.01     5.55
PISA 2015 to 2018  2.33         3.93     1.51     9.37
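Equations 12.4 and 12.5 can be combined with the linking errors in Table 12.8 as in the following Python sketch. The function names are hypothetical, and the two-sided 1.96 threshold used in `is_significant` is an assumption (a conventional 5% level), not a rule stated in this chapter.

```python
import math

# Linking errors transcribed from Table 12.8 (PISA 2018 versus cycle t).
LINKING_ERROR = {
    "Mathematics": {2003: 2.80, 2006: 3.18, 2009: 3.54, 2012: 3.34, 2015: 2.33},
    "Reading": {2000: 4.04, 2003: 7.77, 2006: 5.24, 2009: 3.52, 2012: 3.74, 2015: 3.93},
    "Science": {2006: 3.47, 2009: 3.59, 2012: 4.01, 2015: 1.51},
    "Financial literacy": {2012: 5.55, 2015: 9.37},
}


def change_and_se(domain, t, mean_2018, se_2018, mean_t, se_t):
    """Return the change in performance (eq. 12.4) and its standard error (eq. 12.5)."""
    delta = mean_2018 - mean_t
    link = LINKING_ERROR[domain][t]
    se_delta = math.sqrt(se_2018 ** 2 + se_t ** 2 + link ** 2)
    return delta, se_delta


def is_significant(delta, se_delta, critical=1.96):
    """Assumed decision rule: two-sided test at the conventional 5% level."""
    return abs(delta / se_delta) > critical
```

A comparison that is not available in Table 12.8 (for example mathematics against 2000) raises a `KeyError`, which mirrors the restriction that trends start from the cycle in which a domain first became the major domain.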
INTERNATIONAL CHARACTERISTICS OF THE ITEM POOL
This section provides an overview of the test targeting, the domain inter-correlations, and the
correlations among the reading scale and subscales.
Test targeting
Similar to assigning a specific score on a scale to students according to their performance on
an assessment (OECD, 2002), each item in PISA 2018 was assigned a specific value on a
scale – the response probability (RP) – according to the item’s discrimination and difficulty
parameters that were estimated in the calibration stage. Chapter 15 describes how items can
be placed along a scale based on their RP values and how these values can be used to classify
items into proficiency levels. The different item levels provide information about the
underlying characteristics of an item as it relates to the domain (such as item difficulty), with
higher difficulty indicating a higher level.
In PISA, RP62 values were used to classify items into levels. Respondents with a proficiency
located below this point have less than a 62 percent probability of getting the item correct,
while respondents with a proficiency above this point have more than a 62 percent probability
of getting the item correct. The RP62 values for all items are presented in Annex A, together
with the final item parameters obtained from the IRT scaling.
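For a dichotomous item, the RP62 location can be illustrated with the sketch below. It assumes a two-parameter logistic response function of the form P(θ) = 1 / (1 + exp(−D·a·(θ − b))), where the scaling constant D is a parameter of the sketch (set to 1.0 by default as an assumption); the result is on the IRT scale and would still need to be transformed to the PISA scale. Polytomous items are not covered, and the function names are hypothetical.

```python
import math


def p_correct(theta, slope, difficulty, d=1.0):
    """Assumed 2PL response function; d is an assumed scaling constant."""
    return 1.0 / (1.0 + math.exp(-d * slope * (theta - difficulty)))


def rp_location(slope, difficulty, rp=0.62, d=1.0):
    """Proficiency at which the item is answered correctly with probability rp."""
    return difficulty + math.log(rp / (1.0 - rp)) / (d * slope)
```

Because 0.62 is above 0.5, the RP62 location always lies above the item difficulty, and the gap shrinks as the slope increases.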
Similar to the process above, respondents were also classified into proficiency levels using
PISA scale scores transformed from the plausible values. The purpose of classifying
respondents into levels was to provide more descriptive information about group
proficiencies.
For each cognitive domain, the levels were defined by certain score boundaries which were
determined based on the previous PISA cycles. Tables 12.9 to 12.13 show the score
boundaries used for each cognitive domain, along with the percentage of items and
respondents classified at each level of proficiency.
Table 12.9 Score boundaries for each level of proficiency for mathematics and the
classification of items and respondents
Level     Score points on the PISA scale                       Number of  Percentage  Percentage of
                                                               items      of items    respondents
6         Higher than 669.30                                   18         10.91       18.04
5         Higher than 606.99 and less than or equal to 669.30  22         13.33       18.82
4         Higher than 544.68 and less than or equal to 606.99  40         24.24       21.47
3         Higher than 482.38 and less than or equal to 544.68  32         19.39       19.78
2         Higher than 420.07 and less than or equal to 482.38  37         22.42       13.52
1         Higher than 357.77 and less than or equal to 420.07  8          4.85        6.33
Below 1   Less than 357.77                                     8          4.85        2.04
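The boundaries in Table 12.9 can be applied to a score with a simple lookup. The following is a minimal sketch, not the operational procedure: the names `MATH_CUTS`, `MATH_LEVELS`, and `math_level` are hypothetical, and a score exactly equal to 357.77 (a case the table does not define) is assigned to "Below 1" here by assumption.

```python
from bisect import bisect_left

# Boundaries between adjacent mathematics levels, taken from Table 12.9.
MATH_CUTS = [357.77, 420.07, 482.38, 544.68, 606.99, 669.30]
MATH_LEVELS = ["Below 1", "1", "2", "3", "4", "5", "6"]


def math_level(score: float) -> str:
    """Return the mathematics proficiency level for a PISA scale score."""
    # bisect_left counts the boundaries strictly below the score, so a score
    # equal to a boundary stays in the lower level ("less than or equal to").
    return MATH_LEVELS[bisect_left(MATH_CUTS, score)]


print(math_level(500.0))   # 3
print(math_level(669.30))  # 5
print(math_level(669.31))  # 6
```

The same lookup applies to the other domains once the boundary and label lists are replaced with the values from the corresponding table.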
Table 12.10 Score boundaries for each level of proficiency for reading and the classification
of items and respondents
Level     Score points on the PISA scale                       Number of  Percentage  Percentage of
                                                               items      of items    respondents
6         Higher than 698.32                                   11         3.08        0.32
5         Higher than 625.61 and less than or equal to 698.32  22         6.16        3.22
4         Higher than 552.89 and less than or equal to 625.61  50         14.01       11.33
3         Higher than 480.18 and less than or equal to 552.89  79         22.13       20.13
2         Higher than 407.47 and less than or equal to 480.18  106        29.69       24.23
1a        Higher than 334.75 and less than or equal to 407.47  72         20.17       21.44
1b        Higher than 262.04 and less than or equal to 334.75  14         3.92        13.47
1c        Higher than 189.33 and less than or equal to 262.04  3          0.84        4.96
Below 1c  Less than 189.33                                     0          0.00        0.91
Table 12.11 Score boundaries for each level of proficiency for science and the classification
of items and respondents
Level     Score points on the PISA scale                       Number of  Percentage  Percentage of
                                                               items      of items    respondents
6         Higher than 707.93                                   6          3.00        1.96
5         Higher than 633.33 and less than or equal to 707.93  19         9.50        10.16
4         Higher than 558.73 and less than or equal to 633.33  53         26.50       21.99
3         Higher than 484.14 and less than or equal to 558.73  65         32.50       25.96
2         Higher than 409.54 and less than or equal to 484.14  45         22.50       22.15
1a        Higher than 334.94 and less than or equal to 409.54  10         5.00        13.00
1b        Higher than 260.54 and less than or equal to 334.94  2          1.00        4.17
Below 1b  Less than 260.54                                     0          0.00        0.61
Table 12.12 Score boundaries for each level of proficiency for financial literacy and the
classification of items and respondents
Level     Score points on the PISA scale                       Number of  Percentage  Percentage of
                                                               items      of items    respondents
5         Higher than 626.00                                   9          20.93       7.69
4         Higher than 551.00 and less than or equal to 626.00  7          16.28       15.48
3         Higher than 476.00 and less than or equal to 551.00  16         37.21       23.83
2         Higher than 401.00 and less than or equal to 476.00  6          13.95       26.45
1         Higher than 326.00 and less than or equal to 401.00  4          9.30        18.17
Below 1   Less than 326.00                                     1          2.33        8.39
Table 12.13 Score boundaries for each level of proficiency for global competence and the
classification of items and respondents
Level     Score points on the PISA scale                       Number of  Percentage  Percentage of
                                                               items      of items    respondents
5         Higher than 660.00                                   12         17.39       26.14
4         Higher than 595.00 and less than or equal to 660.00  19         27.54       22.51
3         Higher than 530.00 and less than or equal to 595.00  16         23.19       21.29
2         Higher than 465.00 and less than or equal to 530.00  17         24.64       16.34
1         Higher than 400.00 and less than or equal to 465.00  5          7.25        9.37
Below 1   Less than 400.00                                     0          0.00        4.35
Since RP62 values and the transformed plausible values are on the same PISA scale, the
distribution of respondents’ latent ability and the items’ RP62 values can be placed on the
same scale. In Figures 12.8 to 12.12, the left side of the figures illustrates the distribution of
the first plausible values (PV1) across countries. In each figure, the blue line indicates the
empirical density of the first plausible values across all countries, and the red line indicates
the theoretical normal distribution with the mean of the distribution equal to the mean of the
plausible values and the variance of the distribution equal to the variance of plausible values
across all countries in each domain. The figures show that the distributions of the plausible
values for each domain are approximately normal. On the right side of the figures, each
item’s international RP62 value is plotted on the PISA scale. Note that for polytomous items,
only the lowest category’s RP62 value is plotted.
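The reference curve described above can be sketched as a normal density whose mean and variance are matched to the pooled first plausible values. This is an illustrative sketch only: the `pv1` values are made up, the function name is hypothetical, and the use of the population variance (rather than a weighted or sample variance) is an assumption.

```python
from statistics import NormalDist, fmean, pvariance


def reference_normal(pv1):
    """Normal distribution with the same mean and variance as the PV1 values."""
    mu = fmean(pv1)
    sigma = pvariance(pv1, mu) ** 0.5
    return NormalDist(mu, sigma)


# Hypothetical first plausible values on the PISA scale.
pv1 = [412.3, 455.0, 478.9, 501.2, 530.7, 566.4]
curve = reference_normal(pv1)

# Heights of the theoretical curve at a few points along the scale.
densities = [curve.pdf(x) for x in range(200, 801, 100)]
```

Plotting `densities` against the same scale points as an empirical density estimate gives the kind of visual comparison shown in the figures.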
Figure 12.8 Distribution of the first plausible values and item RP62 values in mathematics
Figure 12.9 Distribution of the first plausible values and item RP62 values in reading
Figure 12.10 Distribution of the first plausible values and item RP62 values in science
Figure 12.11 Distribution of the first plausible values and item RP62 values in financial
literacy
Figure 12.12 Distribution of the first plausible values and item RP62 values in global
competence
Figures 12.13 to 12.15 show the percentage of respondents in each country at each level of
proficiency, sorted in descending order of the average score for the domain.
Figure 12.13 Percentage of respondents in each country at each level of proficiency for
mathematics
Note 1: Viet Nam was not included in this analysis due to adjudication issues.
Note 2: B-S-J-Z (China) data represent the regions of Beijing, Shanghai, Jiangsu, and Zhejiang.
Note 3: Note by Turkey: The information in this document with reference to “Cyprus” relates to the southern
part of the Island. There is no single authority representing both Turkish and Greek Cypriot people on the
Island. Turkey recognises the Turkish Republic of Northern Cyprus (TRNC). Until a lasting and equitable
solution is found within the context of the United Nations, Turkey shall preserve its position concerning the
“Cyprus issue.”
Note by all the European Union Member States of the OECD and the European Union: The Republic of Cyprus
is recognised by all members of the United Nations with the exception of Turkey. The information in this
document relates to the area under the effective control of the Government of the Republic of Cyprus.
[Figure 12.13: stacked bar chart titled "2018 PISA main study - Mathematics: Average scores &
proficiency-level percentages", one bar per country. Legend: Below Level 1, Level 1, Level 2,
Level 3, Level 4, Level 5, Level 6.]
Average mathematics scores shown in the figure, in ascending order:
Dominican Republic 325; Philippines 353; Panama 353; Kosovo 366; Morocco 368; Saudi Arabia 373;
Indonesia 379; Argentina 379; Brazil 384; Colombia 391; Lebanon 393; North Macedonia 394;
Georgia 398; Jordan 400; Peru 400; Costa Rica 402; Bosnia and Herzegovina 406; Mexico 409;
Qatar 414; Chile 417; Uruguay 418; Thailand 419; Baku (Azerbaijan) 420; Moldova 421;
Kazakhstan 423; Montenegro 430; Romania 430; Brunei Darussalam 430; United Arab Emirates 435;
Bulgaria 436; Albania 437; Malaysia 440; Serbia 448; Cyprus 451; Greece 451; Ukraine 453;
Turkey 454; International Average 459; Israel 463; Croatia 464; Malta 472; Belarus 472;
United States 478; Hungary 481; Lithuania 481; Spain 481; Luxembourg 483; Slovak Republic 486;
Italy 487; Russian Federation 488; Australia 491; Portugal 492; New Zealand 494; Iceland 495;
France 495; Latvia 496; Austria 499; Czech Republic 499; Ireland 500; Germany 500; Norway 501;
United Kingdom 502; Sweden 502; Finland 507; Belgium 508; Slovenia 509; Denmark 509; Canada 512;
Switzerland 515; Poland 516; Netherlands 519; Estonia 523; Korea 526; Japan 527;
Chinese Taipei 531; Hong Kong 551; Macao 558; Singapore 569; B-S-J-Z (China) 591.
Figure 12.14 Percentage of respondents in each country at each level of proficiency for
reading
Note 1: Viet Nam was not included in this analysis due to adjudication issues.
Note 2: B-S-J-Z (China) data represent the regions of Beijing, Shanghai, Jiangsu, and Zhejiang.
Note 3: Note by Turkey: The information in this document with reference to “Cyprus” relates to the southern
part of the Island. There is no single authority representing both Turkish and Greek Cypriot people on the
Island. Turkey recognises the Turkish Republic of Northern Cyprus (TRNC). Until a lasting and equitable
solution is found within the context of the United Nations, Turkey shall preserve its position concerning the
“Cyprus issue.”
Note by all the European Union Member States of the OECD and the European Union: The Republic of Cyprus
is recognised by all members of the United Nations with the exception of Turkey. The information in this
document relates to the area under the effective control of the Government of the Republic of Cyprus.
[Figure 12.14: stacked bar chart titled "2018 PISA main study - Reading: Average scores &
proficiency-level percentages", one bar per country. Legend: Below Level 1c, Level 1c, Level 1b,
Level 1a, Level 2, Level 3, Level 4, Level 5, Level 6.]
Average reading scores shown in the figure, in ascending order:
Philippines 340; Dominican Republic 342; Kosovo 353; Lebanon 353; Morocco 359; Indonesia 371;
Panama 377; Georgia 380; Kazakhstan 387; Baku (Azerbaijan) 389; North Macedonia 393;
Thailand 393; Saudi Arabia 399; Peru 401; Argentina 402; Bosnia and Herzegovina 403; Albania 405;
Qatar 407; Brunei Darussalam 408; Colombia 412; Brazil 413; Malaysia 415; Jordan 419;
Bulgaria 420; Mexico 420; Montenegro 421; Moldova 424; Cyprus 424; Costa Rica 426; Uruguay 427;
Romania 428; United Arab Emirates 432; Serbia 439; Malta 448; Chile 452;
International Average 453; Greece 457; Slovak Republic 458; Turkey 466; Ukraine 466;
Luxembourg 470; Israel 470; Belarus 474; Iceland 474; Lithuania 476; Hungary 476; Italy 476;
Spain 477; Russian Federation 479; Latvia 479; Croatia 479; Switzerland 484; Austria 484;
Netherlands 485; Czech Republic 490; Portugal 492; France 493; Belgium 493; Slovenia 495;
Germany 498; Norway 499; Denmark 501; Chinese Taipei 503; Australia 503; Japan 504;
United Kingdom 504; United States 505; New Zealand 506; Sweden 506; Poland 512; Korea 514;
Ireland 518; Finland 520; Canada 520; Estonia 523; Hong Kong 524; Macao 525; Singapore 549;
B-S-J-Z (China) 555.
Figure 12.15 Percentage of respondents in each country at each level of proficiency for
science
Note 1: Viet Nam was not included in this analysis due to adjudication issues.
Note 2: B-S-J-Z (China) data represent the regions of Beijing, Shanghai, Jiangsu, and Zhejiang.
Note 3: Note by Turkey: The information in this document with reference to “Cyprus” relates to the southern
part of the Island. There is no single authority representing both Turkish and Greek Cypriot people on the
Island. Turkey recognises the Turkish Republic of Northern Cyprus (TRNC). Until a lasting and equitable
solution is found within the context of the United Nations, Turkey shall preserve its position concerning the
“Cyprus issue.”
Note by all the European Union Member States of the OECD and the European Union: The Republic of Cyprus
is recognised by all members of the United Nations with the exception of Turkey. The information in this
document relates to the area under the effective control of the Government of the Republic of Cyprus.
[Figure 12.15: stacked bar chart titled "2018 PISA main study - Science: Average scores &
proficiency-level percentages", one bar per country. Legend: Below Level 1b, Level 1b, Level 1a,
Level 2, Level 3, Level 4, Level 5, Level 6.]
Average science scores shown in the figure, in ascending order:
Dominican Republic 336; Philippines 357; Panama 365; Kosovo 365; Morocco 377; Georgia 383;
Lebanon 384; Saudi Arabia 386; Indonesia 396; Kazakhstan 397; Baku (Azerbaijan) 398;
Bosnia and Herzegovina 398; Brazil 404; Argentina 404; Peru 404; North Macedonia 413;
Colombia 413; Montenegro 415; Costa Rica 416; Albania 417; Qatar 419; Mexico 419; Bulgaria 424;
Romania 426; Uruguay 426; Thailand 426; Moldova 428; Jordan 429; Brunei Darussalam 431;
United Arab Emirates 434; Malaysia 438; Cyprus 439; Serbia 440; Chile 444; Greece 452; Malta 457;
International Average 458; Israel 462; Slovak Republic 464; Italy 468; Turkey 468; Ukraine 469;
Belarus 471; Croatia 472; Iceland 475; Luxembourg 477; Russian Federation 478; Hungary 481;
Lithuania 482; Spain 483; Latvia 487; Austria 490; Norway 490; Portugal 492; Denmark 493;
France 493; Switzerland 495; Ireland 496; Czech Republic 497; Belgium 499; Sweden 499;
United States 502; Australia 503; Germany 503; Netherlands 503; United Kingdom 505; Slovenia 507;
New Zealand 508; Poland 511; Chinese Taipei 516; Hong Kong 517; Canada 518; Korea 519;
Finland 522; Japan 529; Estonia 530; Macao 544; Singapore 551; B-S-J-Z (China) 590.
Domain inter-correlations
Estimated correlations between the domains, based on the 10 plausible values and averaged
across all countries and assessment modes, are presented in Table 12.14 for the main sample
and in Table 12.15 for the financial literacy sample. Overall, the correlations are high
(averages between 0.73 and 0.87), as expected, yet they differ among the domains. The estimated correlations for
each country are presented in Table 12.16.
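One way to read "based on the 10 plausible values" is that a correlation is computed separately for each plausible value and the results are then averaged. The sketch below follows that reading; it is an assumption, not the documented estimator, and it ignores sampling weights. The function names and the data layout (one list of respondent scores per plausible value) are hypothetical.

```python
from math import sqrt
from statistics import fmean


def pearson(x, y):
    """Unweighted Pearson correlation between two equally long sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pv_correlation(pvs_x, pvs_y):
    """Average the per-plausible-value correlations between two domains.

    pvs_x[k] and pvs_y[k] hold the k-th plausible value of every respondent.
    """
    return fmean([pearson(x, y) for x, y in zip(pvs_x, pvs_y)])


# Hypothetical data: 2 plausible values for 4 respondents in two domains.
maths = [[480.0, 510.0, 530.0, 590.0], [470.0, 515.0, 540.0, 585.0]]
reading = [[470.0, 500.0, 545.0, 580.0], [465.0, 520.0, 530.0, 600.0]]
r = pv_correlation(maths, reading)
```

With the full data the outer lists would each contain ten entries, one per plausible value.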
Table 12.14 Domain inter-correlations for the main sample
Domain                        Reading        Science        Global competence
Mathematics  Average          0.80           0.80           0.73
             Average (CBA)    0.80           0.81           0.73
             Average (PBA)    0.77           0.78           -
             Range            0.66 ~ 0.89    0.65 ~ 0.88    0.55 ~ 0.83
Reading      Average          -              0.85           0.84
             Average (CBA)    -              0.86           0.84
             Average (PBA)    -              0.81           -
             Range            -              0.78 ~ 0.92    0.75 ~ 0.90
Science      Average          -              -              0.79
             Average (CBA)    -              -              0.79
             Average (PBA)    -              -              -
             Range            -              -              0.68 ~ 0.87
Note: Viet Nam was not included in this analysis due to adjudication issues.
Table 12.15 Domain inter-correlations for the financial literacy sample
Domain                  Reading        Financial literacy
Mathematics  Average    0.81           0.87
             Range      0.78 ~ 0.85    0.84 ~ 0.90
Reading      Average    -              0.83
             Range      -              0.77 ~ 0.86
Note: The financial literacy sample was separate from the main sample.
Table 12.16 Domain inter-correlations by country
Country                 Mathematics  Mathematics  Mathematics  Mathematics  Reading      Reading      Reading      Science
                        & Reading    & Science    & Financial  & Global     & Science    & Financial  & Global     & Global
                                                  literacy 1   competence                literacy 1   competence   competence
Albania                 0.72         0.72         -            0.67         0.81         -            0.83         0.73
Argentina               0.77         0.76         -            -            0.79         -            -            -
Australia               0.79         0.85         0.87         -            0.85         0.82         -            -
Austria                 0.85         0.85         -            -            0.89         -            -            -
Baku (Azerbaijan)       0.74         0.76         -            -            0.78         -            -            -
Belarus                 0.85         0.84         -            -            0.88         -            -            -
Belgium                 0.84         0.87         -            -            0.88         -            -            -
Bosnia and Herzegovina  0.79         0.79         -            -            0.82         -            -            -
Brazil                  0.81         0.82         0.86         -            0.86         0.86         -            -
Brunei Darussalam       0.89         0.87         -            0.83         0.92         -            0.90         0.87
B-S-J-Z (China) 2       0.81         0.82         -            -            0.88         -            -            -
Bulgaria                0.79         0.77         0.85         -            0.87         0.84         -            -
Canada                  0.75         0.76         0.85         0.69         0.84         0.81         0.84         0.78
Chile                   0.78         0.76         0.89         0.73         0.84         0.85         0.85         0.79
Chinese Taipei          0.83         0.88         -            0.82         0.88         -            0.88         0.86
Colombia                0.80         0.77         -            0.74         0.86         -            0.87         0.81
Costa Rica              0.77         0.78         -            0.65         0.84         -            0.78         0.72
Croatia                 0.81         0.78         -            0.76         0.84         -            0.87         0.81
Cyprus 3                0.76         0.79         -            -            0.82         -            -            -
Czech Republic          0.80         0.83         -            -            0.85         -            -            -
Denmark                 0.80         0.82         -            -            0.86         -            -            -
Dominican Republic      0.79         0.75         -            -            0.84         -            -            -
Estonia                 0.79         0.80         0.85         -            0.87         0.83         -            -
Finland                 0.81         0.82         0.87         -            0.88         0.84         -            -
France                  0.84         0.84         -            -            0.88         -            -            -
Georgia                 0.77         0.76         0.85         -            0.82         0.81         -            -
Germany                 0.85         0.86         -            -            0.89         -            -            -
Greece                  0.77         0.76         -            0.73         0.85         -            0.86         0.78
Hong Kong               0.80         0.83         -            0.78         0.84         -            0.85         0.82
Hungary                 0.84         0.85         -            -            0.87         -            -            -
Iceland                 0.78         0.85         -            -            0.87         -            -            -
Indonesia               0.78         0.71         0.84         0.65         0.82         0.84         0.76         0.69
Ireland                 0.82         0.83         -            -            0.88         -            -            -
Israel                  0.82         0.83         -            0.81         0.89         -            0.88         0.85
Italy                   0.78         0.82         0.84         -            0.84         0.77         -            -
Japan                   0.80         0.85         -            -            0.88         -            -            -
Jordan                  0.72         0.73         -            -            0.78         -            -            -
Kazakhstan              0.66         0.65         -            0.55         0.82         -            0.75         0.68
Korea                   0.78         0.84         -            0.81         0.84         -            0.84         0.85
Kosovo                  0.79         0.76         -            -            0.84         -            -            -
Latvia                  0.78         0.77         0.88         0.76         0.84         0.80         0.86         0.80
Lebanon                 0.78         0.79         -            -            0.79         -            -            -
Lithuania               0.83         0.81         0.87         0.76         0.87         0.86         0.88         0.81
Luxembourg              0.83         0.83         -            -            0.89         -            -            -
Macao                   0.73         0.73         -            -            0.86         -            -            -
Malaysia                0.81         0.81         -            -            0.90         -            -            -
Malta                   0.83         0.86         -            0.81         0.88         -            0.88         0.85
Mexico                  0.81         0.78         -            -            0.88         -            -            -
Moldova                 0.76         0.77         -            -            0.82         -            -            -
Montenegro              0.77         0.78         -            -            0.85         -            -            -
Morocco                 0.75         0.73         -            0.69         0.84         -            0.84         0.77
Netherlands             0.84         0.87         0.90         -            0.87         0.85         -            -
New Zealand             0.80         0.82         -            -            0.88         -            -            -
North Macedonia         0.77         0.78         -            -            0.79         -            -            -
Norway                  0.82         0.87         -            -            0.84         -            -            -
Panama                  0.82         0.78         -            0.72         0.89         -            0.86         0.80
Peru                    0.82         0.80         0.88         -            0.86         0.86         -            -
Philippines             0.85         0.80         -            0.76         0.89         -            0.87         0.81
Poland                  0.80         0.81         0.86         -            0.86         0.80         -            -
Portugal                0.82         0.85         0.88         -            0.86         0.85         -            -
Qatar                   0.82         0.81         -            -            0.85         -            -            -
Romania                 0.80         0.81         -            -            0.83         -            -            -
Russian Federation      0.78         0.77         0.86         0.71         0.84         0.80         0.82         0.78
Saudi Arabia            0.77         0.77         -            -            0.81         -            -            -
Serbia                  0.79         0.77         0.87         0.73         0.83         0.82         0.85         0.77
Singapore               0.81         0.80         -            0.76         0.89         -            0.88         0.83
Slovak Republic         0.80         0.81         0.87         0.72         0.85         0.83         0.82         0.77
Slovenia                0.78         0.84         -            -            0.85         -            -            -
Spain                   0.76         0.77         0.84         0.73         0.81         0.80         0.83         0.80
Sweden                  0.81         0.86         -            -            0.85         -            -            -
Switzerland             0.81         0.82         -            -            0.87         -            -            -
Thailand                0.75         0.73         -            0.68         0.83         -            0.81         0.76
Turkey                  0.81         0.84         -            -            0.87         -            -            -
Ukraine                 0.81         0.84         -            -            0.84         -            -            -
United Arab Emirates    0.79         0.81         -            -            0.85         -            -            -
United Kingdom          0.79         0.81         -            0.64         0.85         -            0.78         0.68
United States           0.85         0.86         0.90         -            0.89         0.85         -            -
Uruguay                 0.80         0.84         -            -            0.86         -            -            -
Note: Viet Nam was not included in this analysis due to adjudication issues.
1. The financial literacy sample was separate from the main sample.
2. B-S-J-Z (China) data represent the regions of Beijing, Shanghai, Jiangsu, and Zhejiang.
3. Note by Turkey: The information in this document with reference to “Cyprus” relates to the southern part of
the Island. There is no single authority representing both Turkish and Greek Cypriot people on the Island.
Turkey recognises the Turkish Republic of Northern Cyprus (TRNC). Until a lasting and equitable solution is
found within the context of the United Nations, Turkey shall preserve its position concerning the “Cyprus
issue.”
Note by all the European Union Member States of the OECD and the European Union: The Republic of Cyprus
is recognised by all members of the United Nations with the exception of Turkey. The information in this
document relates to the area under the effective control of the Government of the Republic of Cyprus.
Reading subscales
The reading subscales were divided into two groups. The first group, measuring cognitive
processes, was composed of the following subscales: evaluate and reflect (RCER), locate
information (RCLI), and understand (RCUN). The second group, based on the text structure,
comprised the subscales multiple (RTML) and single (RTSN). Due to the way in which the
proficiency data were generated, correlations among the cognitive processes reading
subscales and the text structure reading subscales cannot be calculated. Therefore, the
correlations between the cognitive domains and the cognitive processes reading subscales are
presented in Table 12.17, while the correlations between the cognitive domains and the text
structure reading subscales are presented in Table 12.18.
Table 12.17 Estimated correlations between the cognitive domains and the cognitive
processes reading subscales
                     RCER 1    RCLI 2    RCUN 3
Mathematics          0.76      0.74      0.76
Science              0.81      0.79      0.81
Global competence    0.78      0.76      0.79
RCER 1               -         0.90      0.93
RCLI 2               -         -         0.93
Note: Viet Nam was not included in this analysis due to adjudication issues.
1. RCER: Evaluate and reflect
2. RCLI: Locate information
3. RCUN: Understand
Table 12.18 Estimated correlations between the cognitive domains and the text structure
reading subscales
                     RTML 1    RTSN 2
Mathematics          0.76      0.76
Science              0.81      0.81
Global competence    0.79      0.79
RTML 1               -         0.94
Note: Viet Nam was not included in this analysis due to adjudication issues.
1. RTML: Multiple
2. RTSN: Single
REFERENCES
Efron, B. (1982), “The jackknife, the bootstrap, and other resampling plans”, Society for
Industrial and Applied Mathematics CBMS-NSF Monographs, Vol. 38.
Mislevy, R. J. and K. M. Sheehan (1987), “Marginal estimation procedures”, in A. E. Beaton
(Ed.), Implementing the New Design: The NAEP 1983-84 Technical Report (Report No.
15-TR-20), Educational Testing Service, Princeton, NJ.
OECD (2002), Reading for Change: Performance and Engagement across Countries (Results
from PISA 2000), OECD Publishing, Paris, http://dx.doi.org/10.1787/9789264099289-en.
Rousseeuw, P. J. and C. Croux (1993), “Alternatives to the median absolute
deviation”, Journal of the American Statistical Association, Vol. 88, pp. 1273-1283.
von Davier, M., S. Sinharay, A. Oranje and A. Beaton (2006), “The statistical procedures
used in National Assessment of Educational Progress: Recent developments and future
directions”, in C. R. Rao and S. Sinharay (Eds.), Handbook of Statistics, Vol. 26,
pp. 1039-1055, Elsevier.