Centre for Educational Measurement (CEMO) Faculty of Educational Sciences
CEMO RESEARCH PROGRAM
Methodological Challenges in Educational Measurement
CEMO’s mission is, firstly, to move forward the field of educational measurement. This includes an
orientation towards action such as the development of instruments that balance educational quality, equity
and effectiveness but also towards reaction such as addressing concerns about unintended consequences
and side-effects of assessments by examining these in-depth. A special objective of CEMO is to
contextualize educational assessments in the societal and cultural characteristics of the five Nordic
countries Denmark, Finland, Iceland, Norway and Sweden with the intention to “unpack” the Nordic model.
Secondly, CEMO’s mission is to cover the full cycle of educational measurement including research on a
theory-driven definition of how to assess constructs, on the development of appropriate instruments to
observe these constructs, on ways to code and score observations, as well as on how to model the
data. A special feature of CEMO is the combination of basic research in the field of educational
measurement with applications of advanced measurement techniques to educational problems.
CEMO’s third mission is to bring substantive and methodological experts together to work on an integrated
approach to educational measurement. Measurement in education has to deal with crucial methodological
challenges such as nested data and performance assessment on the one hand and the complexity of
constructs with respect to their structure, conceptualization and assessment in substantive areas on the
other hand. Interdisciplinary collaboration is therefore a fruitful way to overcome these challenges.
Finally, CEMO strives to address research or method biases by considering different outcomes
(cognitive, socio-emotional, long-term), taking the quality and quantity of contexts and processes into account
(opportunities to learn, student characteristics and resources), using different types of instruments
(paper-and-pencil tests, performance assessments, technology-based assessments) and covering a wide range
of target populations (early childhood, primary and secondary education, higher education, adults).
CEMO RESEARCH PROFILE: Keywords
Methodological areas
Application and applicability of advanced measurement techniques
Measurement equivalence across groups & time
Modeling of causes & effects, development & change, and multi-level effects
Innovative assessment formats & new areas of skills
Substantive areas
Early childhood, primary, secondary and higher education
National & international large-scale assessment of student and teacher knowledge, beliefs, skills
Contextual effects on instructional quality and achievement
CEMO RESEARCH PROFILE: An extended note
METHODOLOGICAL AREAS
Below we provide a short overview of the major categories of methodological areas that CEMO works on,
or is planning to work on. These methodological issues frequently appear in several different substantive
areas, even though they typically take on different shapes in different areas.
Causal inference from observational data
Many research questions in the field of education are causal in nature (e.g., effects of programs or policies),
yet educational researchers often rely on observational (non-experimental) data when addressing these
questions. Modeling causes and effects, as well as development and change, requires the application
of sophisticated and, if possible, longitudinal designs. Causal conclusions, for instance about program effects, drawn
from conventional analyses of observational data are at risk of being biased by unmeasured or poorly
measured third variables. While econometrics offers a wide variety of tools, such as the instrumental
variable approach, that may be used to address causal questions in observational data, these are not
widely known or applied in the field of education. Moreover, as method development in this area mainly
stems from the field of economics, the approaches and their applications are usually not designed to be
combined with psychometrically more complex measurement models.
Objectives of CEMO are
(i) to spread the use of causal modeling approaches in educational research, and
(ii) to work towards exploiting synergies between econometrics and psychometrics.
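
To illustrate the instrumental variable idea mentioned above, the following minimal Python sketch uses simulated data only. All variable names and numbers are hypothetical assumptions made for this illustration (an instrument such as a randomly assigned encouragement, program participation as the treatment, and an unmeasured confounder such as motivation); it is not CEMO's own analysis. With a single instrument, the IV estimate reduces to a ratio of two covariances, whereas the conventional regression slope absorbs the influence of the unmeasured variable.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


def simulate(n=20000, true_effect=2.0, seed=1):
    """Simulate hypothetical data with an unmeasured confounder.

    z: instrument (assumed to affect the outcome only through x)
    x: treatment, e.g. amount of program participation
    y: outcome, e.g. an achievement score
    """
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)            # instrument, independent of the confounder
        ui = rng.gauss(0, 1)            # unmeasured confounder (never observed)
        xi = zi + ui + rng.gauss(0, 1)  # participation depends on both
        yi = true_effect * xi + 3.0 * ui + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate()

# Conventional regression slope of y on x: biased, because the confounder
# raises both x and y.
ols_slope = covariance(x, y) / covariance(x, x)

# Instrumental variable estimate with one instrument: cov(z, y) / cov(z, x).
iv_slope = covariance(z, y) / covariance(z, x)

print(f"conventional slope: {ols_slope:.2f}")
print(f"IV estimate:        {iv_slope:.2f}")
```

In this simulated setting the conventional slope overstates the program effect, while the IV ratio stays close to the effect of 2.0 built into the simulation. The sketch rests on the usual assumptions that the instrument is related to participation and influences the outcome only through it.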
Measurement equivalence
In educational assessment contexts it is essential to account for potentially differing interpretations of
constructs and measures across different subgroups or across time. This is important not only for
communicating research outcomes on group differences or developmental changes clearly, but also
for enabling valid and fair comparisons based on these measures. This comparability aspect is
often referred to as measurement equivalence. CEMO focuses on research about measurement
equivalence across groups and time, in particular with respect to national and international large-scale
assessments in education.
For inanimate objects or well-defined variables, this equivalence usually appears trivial or self-evident.
For instance, for a fair comparison of people’s physical weight, we need to ensure that the same
measuring scale is used across the different groups and time points, according to a standardized procedure
(e.g., by defining conditions such as “in the morning before breakfast”), and that results are expressed in the same
measurement units (e.g., kilograms). However, matters soon become more intricate when dealing with
more complex educational constructs that are difficult to operationalize, particularly across different
cultures.
Critical and expository skills are required when dealing with measurement equivalence issues, because raw
numbers never speak for themselves. The issue of measurement equivalence is not a binary yes/no
question; measurements will only be approximately invariant and comparable to some degree. A core
objective in CEMO’s research is to bridge the gap between
(iii) the methodology used to study these comparability issues and the statistical
analysis models used to formalize measurement equivalence (e.g., factorial
invariance, differential item functioning, and linking and equating errors) and
(iv) the actual application of these procedures in practice. This includes both
investigations of reasons why some subsets of educational assessments are not
comparable across countries or across time, as well as an assessment of policy
implications of measurement inequivalence, because not every detected
discrepancy needs to be directly relevant for a particular comparison. The latter
research objectives connect to essential gaps in research literature and practice
with respect to approximate invariance and effect sizes, and the follow-up of non-
invariance and biased indicators.
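
One widely used screening procedure for differential item functioning is the Mantel-Haenszel common odds ratio, often reported on the ETS delta scale as an effect size. The Python sketch below is an illustration under stated assumptions only: the counts are invented, the function names are hypothetical, and the procedure is named here as one common option rather than as the method CEMO applies. Test takers are matched on total score, and within each score stratum the odds of a correct response on the studied item are compared between a reference group and a focal group.

```python
import math


def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio across matched score strata.

    Each stratum is a tuple
    (ref_correct, ref_incorrect, focal_correct, focal_incorrect).
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio is undefined for these counts")
    return numerator / denominator


def mh_delta(odds_ratio):
    """Express the odds ratio on the ETS delta scale: -2.35 * ln(alpha)."""
    return -2.35 * math.log(odds_ratio)


# Invented counts for two score strata (low and high total score).
no_dif_strata = [(30, 10, 15, 5), (20, 20, 10, 10)]   # equal odds in both groups
dif_strata = [(40, 10, 30, 20), (25, 25, 15, 35)]     # item favors the reference group

alpha_no_dif = mantel_haenszel_odds_ratio(no_dif_strata)
alpha_dif = mantel_haenszel_odds_ratio(dif_strata)
delta_dif = mh_delta(alpha_dif)

print(f"odds ratio without DIF: {alpha_no_dif:.2f}")
print(f"odds ratio with DIF:    {alpha_dif:.2f} (delta = {delta_dif:.2f})")
```

An odds ratio near 1 (delta near 0) indicates that matched test takers from both groups have similar odds of success on the item, while values above 1 favor the reference group. In line with the point made above, such an index describes a degree of non-equivalence; whether a given value matters depends on the comparison at hand.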
Innovative assessment formats
Current educational research is slowly shifting its focus from summative testing towards more formative
testing. The goal is no longer merely “testing-to-test” and describing students’ academic performance in
typical school topics; there is also a need for constructive feedback on test performance and for gaining a
better understanding of cognitive and behavioral learning processes. Alongside this demand for greater
information value and a wider scope of educational tests, there is a competing demand for making data
collection, processing and reporting as efficient as possible. In light of the many technological advances in
recent years and the availability of process data, computer-based technology can, for instance, open up
opportunities to construct a new generation of educational tests that is in line with this change in perspective
(e.g., new dynamic or interactive assessment formats, additional data streams such as response time or
activity logs, automated scoring, game-based learning).
Although computer-based assessments are already used to measure new areas of skills such as problem
solving competences, the potential of these measures has not yet been fully exploited. It is still unclear how
they can be used to assess the development over time and how they can be linked to contextual variables
such as instruction in specific domains (e.g., mathematics, science), particularly in the Nordic countries. In
order to address these aspects, advanced psychometric modeling approaches are necessary which account
for the hierarchical and longitudinal structure of data. Moreover, the development of computer-based
assessments that are able to capture changes over time is necessary.
Core objectives in CEMO’s research are
(v) to investigate the potential added value and implementation challenges of new assessment
formats, such as dynamic and interactive formats;
(vi) to develop an expanded psychometric toolbox for dealing with these new forms of
assessment. Whereas this might require new and/or adapted statistical
measurement models, the new generation of tests will still have to be evaluated in
terms of reliability and validity. Thus, although educational testing formats might
change, measurement concerns will stay the same; and
(vii) to develop computer-based assessments of competences that capture change
over time, and to advance the modeling of contextual effects and longitudinal
changes to link the development of competence with instructional variables.
SUBSTANTIVE AREAS
Below we provide a short overview of the substantive areas that are the focus of CEMO’s current and planned
activities. An overall rationale of CEMO’s activities is to “unpack” the Nordic model. The Nordic countries
provide a unique social and educational context. At a macro level, economies are mostly thriving despite
economic downturns, income inequality and unemployment are low, and there is a broad consensus
that education across the life span is mostly a public concern. Thus, education, including both early and
higher education, is mostly free or heavily subsidized, and accessible for everyone.
Moreover, the Nordic pedagogical model differs in considerable ways from educational practice in most
other countries. For early education, this includes a strong focus on play-based activities, children’s
participation, and a low level of direct teaching and instruction. For primary, secondary and higher
education, this includes a strong focus on social skills and a positive class climate. CEMO takes two main
strategies to “unpack” this Nordic model: the first is through international comparisons; the second is to
take research questions investigated in other sociopolitical contexts and analyse them with the
specifics of the Nordic contexts in focus.
Early childhood education
Early Childhood Education and Care (ECEC) receives increasing attention as an integrated first step of
children’s educational careers. ECEC policies in the Nordic countries, with universal access to
quality-regulated ECEC following an extensive parental leave allowance, are considered by many to be model policies. Yet,
while a considerable amount of research on ECEC has been conducted internationally, the research base in
the Nordic countries is rather scarce, both regarding effects of ECEC on cognitive and language
development and academic achievement, and regarding potentially negative side-effects. Moreover, while the
Nordic ECEC pedagogical model is more play based and less structured than what is common in most other
countries, little is known about variability in the quality of children’s ECEC-experiences in the Nordic
countries and the consequences of this variability for children’s educational attainment.
The development of high-quality methodological approaches to measure socio-emotional and cognitive
outcomes prior to school age, and to estimate short- and long-term causal effects of ECEC on child
outcomes, is therefore on CEMO’s agenda, with the aim of enhancing knowledge of the positive and negative effects of the
Nordic ECEC model.
Primary and secondary education
In Norway, as in many other countries, international large-scale assessments, such as PISA and TIMSS, have
a central role in the monitoring of the development of the school system. There is, however, much debate
and controversy with respect to country rankings of educational achievement that are based on these
assessments. The question arises whether it is possible to compare the performance of Norwegian students
on these achievement tests to the results of their Chinese or American peers. Moreover, it is of interest to
what extent cultural differences (“are all Nordic countries the same?”), response styles,
translation/language issues, within-country differences (“are all Norwegians the same?”), testing traditions,
etc. affect these comparisons. Such questions of measurement equivalence have high priority on CEMO’s
research agenda.
Questions about what factors influence educational outcomes, and through which mechanisms this occurs,
are of central importance in primary and secondary education. Given that experimentation is only rarely a
possibility in this field, it is often necessary to rely on observational data, for example from comparative
and longitudinal studies. This places high demands on both the availability and quality of data, and on skills in
using analytic techniques that allow causal inferences.
Both at primary and secondary educational levels, there is great need for assessments that provide
information that can support all phases of learning, and that can provide rich information about a wider
range of outcomes than is currently possible. As schools increasingly gain access to powerful information
technology, there are both opportunities and challenges in taking advantage of this new technology to improve
educational assessment and its applications.
Twenty-first century skills are emphasized both in national curricula and in international educational policy
discussions. These skills cover a broad range of domain-general competences (e.g., ICT and information
literacy, creativity and innovation, communication and collaboration, critical thinking and problem solving)
and are regarded as desirable educational outcomes that can be transferred into real-life and different
academic contexts. Of the 21st century skills, problem solving competence has attracted researchers from
the fields of educational psychology, didactics, and educational measurement. In particular, within the last
20 years, new assessments and measurement approaches have been developed to capture not only
analytical problem solving processes but also the ability to solve dynamic and complex problems, requiring
active knowledge acquisition and knowledge application in computer-based environments.
Higher education
In cooperation with the Faculty of Medicine at the University of Oslo, CEMO examines the reliability and
validity of the newly established examination and grading system in medical education including the
benefits and limitations of different assessment formats as well as the structure, level and development of
knowledge and skills during medical education including its impact on workplace performance.
The dimensionality and growth of medical knowledge and skills within a conceptual framework of medical
competence, analyses of method and rater effects within a multitrait-multimethod framework, and
the examination of content, criterion and construct validity, including the development of a feedback system
for the Faculty of Medicine, will be major research topics in this context.
Such innovative assessment formats require in-depth methodological research with results generally
important for the field of educational measurement and higher education research. Examination models
with objective structured clinical examinations including practical stations using actors as patients and real
laboratory equipment combined with digital exams including multiple-choice and constructed-response
items are increasingly used in many countries. CEMO’s research on the applicability of such innovative
assessment formats will therefore provide important new insight. In this research CEMO will establish
innovative assessments of medical competences that contain different types of tasks ranging from
traditional formats (e.g., multiple-choice items) to dynamic and interactive problems (e.g., simulation-
based formats). CEMO will also study the reliability and validity of medical assessments with respect to
rater effects and the structure of medical competences.
CEMO AREAS OF EXPERTISE: Members and projects
Sigrid Blömeke, CEMO director and professor
Sigrid Blömeke is interested in applied research on the relationship of competencies and performance,
primarily with respect to higher education graduates but also across the life span. Her research covers all
stages of the measurement cycle with a focus on instrument development and analyzing the data arising
from these. Blömeke is experienced in applying advanced statistical methods, for example latent multilevel
structural equation modeling, but she also likes more exploratory approaches such as latent class
analysis. A particular interest of hers is measurement non-invariance or DIF and the reasons behind it.
In parallel to her CEMO responsibilities, Sigrid Blömeke holds a Leibniz-Humboldt Chair for Instructional
Research at the Leibniz Institute for Science and Mathematics Education, Kiel, and at Humboldt University
of Berlin, Germany. Her research projects include the assessment of prospective pre-school teachers’
knowledge and skills in mathematics and their relation to pre-school teacher education characteristics,
secondary analyses of data from OECD’s “Programme for the International Assessment of Adult
Competencies (PIAAC)” and IEA’s comparative “Teacher Education and Development Study: Learning to
Teach Mathematics (TEDS-M)” as well as an examination of the relationship between mathematics
teachers’ knowledge, performance skills, instructional quality and student achievement.
Two important recent publications are Blömeke, S., Buchholtz, N., Suhl, U. & Kaiser, G. (2014). Resolving the
chicken-or-egg causality dilemma: The longitudinal interplay of teacher knowledge and teacher beliefs.
Teaching and Teacher Education, 37, 130-139; and König, J., Blömeke, S., Klein, P., Suhl, U., Busse, A., &
Kaiser, G. (2014). Is teachers' general pedagogical knowledge a premise for noticing and interpreting
classroom situations? A video-based assessment approach. Teaching and Teacher Education, 38, 76-88.
Johan Braeken, CEMO associate professor
Johan Braeken's research focus is on psychometrics and the broad class of latent variable models. This
includes both the classical techniques (i.e., factor analysis, item response theory, latent class analysis) as
well as current advances that combine and extend these methods across their default boundaries. One
particular innovation Braeken works on is the integration of so-called copula functions in this latent variable
framework. At the same time he enjoys working with interdisciplinary teams on more applied research in
various fields of the social sciences. At CEMO Braeken will continue the psychometric work and direct his
applied work towards explanatory and formative approaches to test construction and assessment.
Two important recent publications are Braeken, J., Tuerlinckx, F., Kuppens, P., & De Boeck, P. (2013).
Contextualized personality questionnaires: A case for Coupled error terms in Structural Equation Models for
Categorical data. Multivariate Behavioral Research, 48, 845-870; and Hoffenkamp, H., R., Tooten, A., Hall,
R., Braeken, J., Knots, E., Eliens, M., Winkel, F., Vingerhoets, A., & van Bakel, H. (2015). Effectiveness of
Hospital-Based Video Interaction Guidance on Parental Interactive Behavior, Bonding and Stress after
Preterm Birth: a Randomized trial. Journal of Consulting and Clinical Psychology, 83, in press.
Ronny Scherer, CEMO postdoctoral researcher
Ronny Scherer is interested in applied research on the modeling and assessment of cognitive, motivational,
and instructional constructs in secondary and higher education. Among other constructs, his research
addresses students’ problem solving competences and self-beliefs, teachers’ perceptions about the use of
information and communication technology (ICT), and instructional quality. Ronny is experienced in applying a range
of advanced statistical methods such as multilevel structural equation and item response theory modeling,
Bayesian structural equation modeling, response-time modeling, and multi-group comparisons. He is
particularly interested in cross-country comparisons of multilevel measures and contextual effects (e.g., in
the context of instructional quality).
Two important recent publications are: Scherer, R., & Siddiq, F. (2015). The Big-Fish-Little-Pond Effect
revisited: Do different types of assessments matter? Computers & Education, 80, 195-210.
doi:10.1016/j.compedu.2014.09.003; and: Scherer, R., Greiff, S., & Hautamäki, J. (2015). Exploring the
relation between time on task and ability in complex problem solving. Intelligence, 48, 37-50.
Stephan Daus, CEMO PhD student
Stephan Daus joined CEMO in 2014 for doctoral research until 2018. His research attempts to investigate
student responses in science assessments with the aim of providing improved formative diagnosis
methods. The methods of particular attention are item difficulty models and cognitive diagnostic models.
His background is in social statistics, international development and linguistics, and his interests lie at
the intersection of educational research and applied statistics.
Jan-Eric Gustafsson, CEMO professor II
Gustafsson’s research focuses on prerequisites for education, on outcomes of education, and on the
educational and instructional processes that mediate between prerequisites and outcomes. He also studies
learning and development outside of formal schooling, and works on age groups from early childhood to
adult age. Most research is based on data from longitudinal studies, often combined with register data, and
from large-scale international comparative studies such as PIRLS, TIMSS, PISA and PIAAC. His
methodological tools are primarily different kinds of latent variable models, such as item response theory
and structural equation models. He also has an interest in techniques for causal inference from
observational data. In addition to his involvement in CEMO, Gustafsson is a professor of education at the
University of Gothenburg, where he leads a research group which shares his methodological and
substantive interests.
Two important recent publications are: Gustafsson, J.-E. (2013). Causal inference in educational
effectiveness research: a comparison of three methods to investigate effects of homework on student
achievement. School Effectiveness and School Improvement, 24, 275-295; and Thorsen, C., Gustafsson, J.-E.,
& Cliffordson, C. (2014). The influence of fluid and crystallized intelligence on the development of
knowledge and skills. British Journal of Educational Psychology, DOI: 10.1111/bjep.12041.
Henrik Daae Zachrisson, CEMO professor II
Henrik Daae Zachrisson’s main applied research interests are in the field of Early Childhood Education and
Care (ECEC). Special foci are on whether ECEC can provide more equal opportunities for socially
disadvantaged children, and the special context of the Nordic (early) educational- and welfare models. His
interests are also more broadly in consequences of social inequality for children’s and adolescents’
development and educational attainment, and early child development in general. Henrik’s main
methodological focus is on causal modeling in observational data, the integration of econometric and
psychometric techniques, longitudinal modeling (measurement of change), and measurement models in
early childhood. Current projects funded by the Norwegian Research Council include consequences of
transition to school for social and academic adaptation, and early academic stimulation and investment in
the home for achievement in 1st grade. When not at CEMO, Henrik works at the Norwegian Center for
Child Behavioral Development, University of Oslo (Atferdssenteret).
Two important recent publications are: Zachrisson, H.D., & Dearing, E. (2014). Family income dynamics and
early child behavior problems in Norway. Child Development, DOI:10.1111/cdev.12306; and: Zachrisson,
H.D., Dearing, E., Lekhal, R., & Toppelberg, C.O. (2013). Little evidence that time in child care causes
externalizing problems during early childhood in Norway. Child Development, 84, 1152-1170.
Anne-Catherine Lehre, CEMO senior adviser
Anne-Catherine Lehre is CEMO’s administrative coordinator. She has a PhD in Biostatistics from the
University of Oslo. Her research interests are inter- and intrasex variability in student achievement, i.e., sex
differences and equity in the Norwegian educational system, controlling for factors influencing
performance.
Two recent publications are: Lehre, Hansen, Lehre, & Laake (2014). Effects of the 2003 Quality Reform on
gender gaps in student learning outcomes in Norwegian higher education. Scandinavian Journal of
Educational Research, 58, 315-336; and Lehre, Laake, & Sexton (2014). Using quantile distance functions to
assess inter- and intrasex variability in PISA achievement scores. In Strietholt, Bos, Gustafsson, & Rosén
(Eds.), Educational Policy Evaluation through International Comparative Assessments (pp. 177-190).
Münster: Waxmann.
Øystein Andresen, CEMO higher executive officer
Øystein Andresen joined CEMO in 2014, bringing CEMO’s administrative staff to two persons. At CEMO he
works with the public outreach of the centre’s research, including the websites and social media, as well as
managing conferences. Øystein holds an MA in Western European studies from University of Kassel,
Germany.
If you have any questions regarding CEMO’s research or wish to contribute to the centre’s research, contact
Anne-Catherine Lehre by e-mail [email protected] or by phone + 47 22 85 41 51.