Lost in Translation? Meaning and Decision Making in Actual and Possible Worlds


Measurement: Interdisciplinary Research and Perspectives. Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/hmes20

Lost in Translation? Meaning and Decision Making in Actual and Possible Worlds. André A. Rupp, University of Maryland. Published online: 26 Jun 2008.

To cite this article: André A. Rupp (2008) Lost in Translation? Meaning and Decision Making in Actual and Possible Worlds, Measurement: Interdisciplinary Research and Perspectives, 6(1–2), 117–123, DOI: 10.1080/15366360802035612

To link to this article: http://dx.doi.org/10.1080/15366360802035612



Mythology-free accounts of a number of the most frequently employed replacement variate generators, including the principal component, LISREL, and two-parameter item response generator, have been worked out in Maraun (2007).

REFERENCES

Borsboom, D. (2008). Latent variable theory. Measurement: Interdisciplinary Research and Perspectives, 6, 25–53.
Guttman, L. (1955). The determinacy of factor score matrices with implications for five other basic problems of common-factor theory. The British Journal of Statistical Psychology, 8(2), 65–81.
Kestelman, H. (1952). The fundamental equation of factor analysis. British Journal of Psychology, 5, 1–6.
Maraun, M. (2007). Myths and confusions: Psychometrics and the latent variable model. Retrieved from http://www.sfu.ca/~maraun/Mikes%20page-%20Myths%20and%20Confusions.html
Maraun, M., & Halpin, P. (2007). MAXEIG: An analytical treatment. Retrieved from http://www.sfu.ca/~maraun/Mikes%20page-%20Ongoing%20Research.html
Maraun, M., Halpin, P., Slaney, K., Gabriel, S., & Tkatchouk, M. (2007). What the researcher should know about Meehl's taxometric tools of detection. Retrieved from http://www.sfu.ca/~maraun/Mikes%20page-%20Ongoing%20Research.html
McDonald, R. P. (1974). The measurement of factor indeterminacy. Psychometrika, 39, 203–222.
McDonald, R. P. (1975). Descriptive axioms for common factor theory, image theory and component theory. Psychometrika, 40(2), 137–152.
Piaggio, H. (1931). The general factor in Spearman's theory of intelligence. Nature, 127, 56–57.
Rozeboom, W. (1988). Factor indeterminacy: The saga continues. British Journal of Mathematical and Statistical Psychology, 41, 209–226.
Wilson, E. B. (1928). On hierarchical correlation systems. Proceedings of the National Academy of Sciences, 14, 283–291.

Lost in Translation? Meaning and Decision Making in Actual and Possible Worlds

André A. Rupp
University of Maryland

Correspondence should be addressed to André A. Rupp, Department of Measurement, Statistics, and Evaluation (EDMS), 1230 Benjamin Building, University of Maryland, College Park, MD 20742. E-mail: [email protected]

From different angles, Borsboom, Markus, and Michell present a careful analysis of the way that specialists reason with empirical data about latent characteristics of individuals. They jointly argue for a more precise and thoughtful use of the key terms and measurement procedures that are ubiquitously employed in disciplines concerned with measurement, in particular psychology. They all focus, at some level, on the meaning that can be ascribed to the latent variables that are utilized in measurement models.

They specifically demonstrate how such meaning is dependent on the assumed relationship between the scale properties of the latent variables and the ontogenetic properties of the latent traits these are supposed to index. They remind readers how this relationship is specified within nomological networks that enable causal inference referenced to actual and possible worlds. All three papers clearly serve as engaging consciousness-raising devices for specialists across different disciplines that can help readers become more responsible designers of research, analysts of data structures, and, above all, communicators of the inferential limits of their research endeavors.

PRECISION IN ASSESSMENT NARRATIVES

Being able to describe as unambiguously as possible the inferences that are desired about a certain population of individuals, the data by which these inferences are supported, and the measurement models that structure the data to provide this support is, undoubtedly, at the heart of every empirically defensible, ethically responsible, and practically successful measurement enterprise. To help researchers and practitioners collaborate successfully in this complex task, several frameworks for principled assessment design have been formulated in recent years by experts in the measurement literature. These include the cognitive design system (e.g., Embretson, 1998) and the evidence-centered design framework (e.g., Mislevy, Steinberg, Almond, & Lukas, 2006) on the one hand as well as several frameworks for inquiries into validity (e.g., Borsboom, Mellenbergh, & van Heerden, 2004; Kane, 2007; Messick, 1989) on the other hand. The purposes of these frameworks are similar to the purposes of the three papers in this issue in one important aspect, namely to make specialists become more aware of how and why they make certain classes of inferences and decisions. Consequently, it is in the context of such existing frameworks that I evaluate the contributions of the three papers in this issue. The following comments are, thus, my attempt to struggle with the lines of reasoning in the three papers and to translate the nuanced arguments of all three authors into a single coarser narrative, hopefully without getting lost in translation.

SEMANTICS AND PRAGMATICS

Because of their philosophy of science perspectives, Borsboom, Markus, and Michell seem to be primarily concerned with the semantics of meaning making and only secondarily concerned with the pragmatics of decision making, even though both are intricately tied. As a methodological researcher who is interested in bridging the differential needs of theoreticians and practitioners through work that supports the construction of transparent, coherent, and concise narratives about latent traits, I was left feeling incredibly inspired yet dissatisfied at the same time.

One source of my dissatisfaction stems from the perception that the foundational issues raised in the three papers appear to have been motivated by a vision of a methodological landscape whose argumentative boundaries are marked by a categorical contrast between dystopian ignorance and utopian omniscience. That is, I felt that the authors seem to juxtapose mechanistic decision making in an atheoretical latent-variable vacuum on the one end (seemingly the current state of affairs in much research practice) with theoretically thoughtful decision-making practices sanctioned by the theoretical pillars of the discipline of psychology on the other end (seemingly the desired state of affairs).

I would argue, however, that any practical work that happens in this landscape, imperfect as it may be, is not necessarily driven solely by ignorance or methodological blindfoldedness of the researchers who conduct it. Instead, it is largely driven by the practically unavoidable compromises that they have to make, which are necessitated by real-world constraints and demands. These include balancing the different types of errors and the potential consequences that might ensue for stakeholders, considerations that are preferred to a quest for utmost fidelity of construct representation.

In my view, making such compromises does not equate with a mechanistic operationalist stance on the part of the researchers or acting “pathologically,” as Michell provocatively argues, because the only alternative in many real-life situations might then be inaction due to a lack of appropriate available measurement procedures. If specialists developed narratives only when high psychological standards for psychophysical measurement had been achieved, human and scientific progress would probably be hindered more than it would be helped. Of course, this is a fundamental difference in belief about what is preferable: incomplete knowledge coupled with a proper communication of the knowledge boundaries, or a search for complete knowledge without having to worry about knowledge boundaries.

I am certainly not endorsing blind action over inaction, and I certainly acknowledge the potential for biases and incorrect decisions that can ensue when scales are not properly constructed (see, e.g., Embretson, 2006). I simply believe that it is naive to think that the only “meaningfulness” of decision making can come from an empirically airtight link between a comprehensive definition of latent traits, the constitutive populations that display these traits, and the latent variables in the measurement models that index variation on these traits. Therefore, a key question is how residual uncertainty about the meaning of latent variables, which will undoubtedly always exist for most variables, is treated in processes of consensus-building during assessment development, analysis, and reporting in practice. It is at least as important as the question of how psychologists could work together toward reducing this uncertainty generally to improve seemingly deficient measurement practices and narratives.

PERSPECTIVES ON CAUSALITY

With imperfect knowledge comes weak causal inference or inference that only appears to be causal but is merely correlational. The careful discussions of causality by Borsboom and Markus remind readers that the choice of variables and measurement models utilized in a particular study always represents a specific lens that is used to empirically investigate a problem. This allows for and restricts insight into the nature of a problem at the same time. As a result, causal vision with a certain methodological lens might be 20/20 within a given frame of reference, but it may also be nearsighted or farsighted with respect to alternative explanations that could have been generated with alternative methodological tools within alternative frames of reference.

Moreover, as Markus argues, it is important to pay attention to the connotations of the labels that are used for theoretical entities when developing narratives about different levels of structural and causal generalizability. Similarly, it is imperative that researchers carefully disentangle how their relationships are or might be defined within actual or hypothetical nomological networks, respectively. Whether these labels need to be constructs and concepts and whether it is useful to consistently use the term population for different groups of individuals within different nomological networks specifically is debatable but also beside the point. It is clearly useful for specialists not merely to think of populations as collections of diffusely described individuals that resemble some vague prototype but rather to think of defining these populations in terms of variables and in terms of the relationship that these variables have to one another within a nomological network. This helps to delineate which defining variables are believed to be affected by a certain experimental or observational data-collection design, which helps clarify the construct that is the focus of the investigation, which, in turn, can help to motivate a thoughtful model-building rather than a thoughtless model-selection process.

Unfortunately, describing populations by feature variables introduces an unwanted problem, namely that the defining variables may also be measurable only with error with a given investigative lens. This introduces the same layer of complexity for secondary constructs that the approach is meant to eliminate for the primary constructs of interest. Thus, the supposed increase in the level of precision seems to come at a hefty price. Even though a precise definition of the population and the resulting constructs would clearly help sharpen construct definitions that can guide variable definitions and their summaries in latent variable models, it is unrealistic to expect that it can ever be developed such that all residual ambiguities are completely removed.

DEFINITIONAL GRANULARITY

Because the set of defining features of a member of a population is, coarsely speaking, indefinite across all potential layers of definitional grainsize, another challenge in the above task becomes to characterize and fix the definitional grainsize within a particular study. This is necessary so as to limit the incompleteness of population and variable definitions to a theoretically and, again, practically defensible degree within a chosen frame of reference. Theoretical definitions of constructs operate like a Russian construct doll in that one can essentially always continue to discover psychological multidimensionality/construct clusters at finer resolutions grounded in theoretical plausibility. Yet, only a few definitional grainsizes, perhaps only one, are relevant within any given research context.

What makes this problem even more complex is that one may not be able to capture psychologically justifiable multidimensionality at a fixed level of grainsize through distinct quantitative latent variables with presently available data structures and multidimensional measurement models. Even though publications in areas such as cognitively diagnostic assessment (e.g., Leighton & Gierl, 2007) may seem to suggest that static measurement models can directly represent fluid response processes, this will, at best, be only approximately true. One always has to accept that a measurement model with latent variables is an empirical snapshot that forms a proxy for a cognitive process and the constructs that fuel it.

This relates to the heart of the argument of Borsboom and Michell, who lament that specialists seem to believe that a latent trait is quantitative just because they use a quantitative latent variable in their measurement models. This discussion is primarily about the highest level of measurement precision that can be attained for a latent trait rather than the nature of the latent trait itself as if it were a psychophysiological entity in people's heads, which is an important distinction to make. Contrary to the experience of the authors, however, interactions that I have had with specialists suggest to me that they rarely believe that the manner in which a construct is represented by a particular measurement model is the only way that it can be measured generally or that the statistical dimensionality of the model implies ontogenetic psychological dimensionality as well. Even if one would be willing to speak of ignorance, the type of ignorance that I often encounter is different from the type of ignorance that the authors of the three papers speak about. I have seen many specialists being privately frustrated about their own lack of knowledge concerning the level of technical sophistication of measurement procedures or psychological theories, but I rarely encounter someone who does not acknowledge the imperfection of their work stemming from his or her lack of knowledge in a certain area. Maybe I have been just lucky or am ignorant myself. To resolve this matter, what might be useful are surveys that go beyond polished peer-reviewed publications and more accurately reflect actual practices, examining the ways in which specialists from different fields with different types and levels of training construct and interpret latent variables.

Finally, despite my calls for precision in some of my own writings on measurement practices, I believe that there is a legitimacy that can be ascribed to reflective pragmatism that leads to operationalism. For example, as I have argued above, response processes are always psychologically multidimensional at some level of definitional grainsize, but it may still be truly meaningful, and not just statistically convenient, to summarize data via a unidimensional latent variable model. The resulting latent variable might serve as a more effective and efficient pragmatic communication tool that represents a decidedly conscious compromise between cognitive fidelity, empirical feasibility, and utilitarian practicability.

In this regard, authors such as Embretson and Reise (2000) recognize item response modeling as a psychometric, and not a psychological, theory because latent variable models do not magically “infuse” assessment data with psychological meaning; this, thankfully, remains the responsibility of the human specialists. Moreover, as authors such as Junker (1999) remind us, there are legitimate reasons to view measurement models as purely data-reduction devices within an operationalist perspective, because all they technically do is condense multidimensional response data into more succinct lower-dimensional representations. Of course, the resulting variables should still be psychologically meaningful even at lower standards of psychological rigor, but I would argue that the standards for such an interpretation will, and should, differ across application contexts.
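To make the data-reduction reading concrete, here is a minimal sketch of the kind of condensation referred to above; it is my own illustration rather than anything from the commentary or the cited sources, and the simulated response matrix, the use of scikit-learn's FactorAnalysis, and all numerical settings are assumptions chosen purely for demonstration.

```python
# Illustrative sketch only (not from the commentary): a latent variable model
# used purely as a data-reduction device, condensing a multi-item response
# matrix into a single lower-dimensional summary per respondent.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Simulate 500 respondents answering 10 items that share one common source of
# variance plus item-specific noise (all values are arbitrary assumptions).
n_persons, n_items = 500, 10
theta = rng.normal(size=(n_persons, 1))               # simulated trait scores
loadings = rng.uniform(0.5, 1.0, size=(1, n_items))   # simulated item loadings
responses = theta @ loadings + rng.normal(scale=0.7, size=(n_persons, n_items))

# Condense the 10-dimensional responses into one summary variable per person.
fa = FactorAnalysis(n_components=1, random_state=0)
summary_scores = fa.fit_transform(responses)          # shape: (500, 1)

print(summary_scores.shape)
```

The condensation step itself carries no psychological meaning; as noted above, supplying that interpretation remains the responsibility of the human specialists.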

Thus, coming full circle with my argument again, I believe that critical debates about different facets of latent variable theory as represented by the three articles in this issue would benefit from acknowledging the complex interplay of theoretical desiderata and practical constraints that shape research practices. Stated differently, I believe that perceived deficits in disciplinary practices and resulting narratives about constructs are insufficiently attacked with a criticism of theoretical imprecision that rests on methodological carelessness alone.

CONCLUSION

As the three authors acknowledge in their papers, translations of arguments surrounding theoretical entities and events in actual and hypothetical worlds are at the heart of every measurement enterprise, and those translations are indeed challenging to develop. It can only be beneficial for measurement practices generally if specialists across different disciplines struggle with the complexities and intricacies involved in meaning- and decision-making processes while building domain-specific landscapes of constructs along with tools for charting and navigating them.

In this commentary, I have developed an argument based on my own translation of the authors' intended meanings, which was clearly influenced by my own training, beliefs, and practices with their resulting expertise and ignorance. It is only appropriate, then, that I thank the authors for inspiring me, along with undoubtedly many other readers, to think about these subtle issues and to translate them into an idiosyncratic, meaningful narrative. As these three papers strongly demonstrate, it is crucial that we, as a community of professionals from different disciplines, can debate when, how, and why we may be lost in translation.

REFERENCES

Borsboom, D. (2006). The attack of the psychometricians. Psychometrika, 71, 425–440.
Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2004). The concept of validity. Psychological Review, 111, 1061–1071.
Embretson, S. E. (1998). A cognitive design system approach to generating valid tests: Application to abstract reasoning. Psychological Methods, 3, 380–396.
Embretson, S. E. (2006). The continued search for non-arbitrary metrics in psychology. American Psychologist, 61, 50–55.
Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Erlbaum.
Junker, B. W. (1999). Some statistical models and computational methods that may be useful for cognitively-relevant assessment. Unpublished manuscript. Accessed November 28, 2006, from http://www.stat.cmu.edu/~brian/nrc/cfa
Kane, M. (2007). Validation. In R. J. Brennan (Ed.), Educational measurement (4th ed.). Oxford: Ace/Praeger.
Leighton, J. P., & Gierl, M. (2007). Diagnostic assessment for education: Theory and applications. Cambridge: Cambridge University Press.
Messick, S. (1989). Validity. In R. Linn (Ed.), Educational measurement (pp. 13–103). New York: Macmillan.
Mislevy, R. J., Steinberg, L. S., Almond, R. G., & Lukas, J. F. (2006). Concepts, terminology, and basic models of evidence-centered design. In D. M. Williamson, I. I. Bejar, & R. J. Mislevy (Eds.), Automated scoring of complex tasks in computer-based testing (pp. 15–48). Mahwah, NJ: Erlbaum.
Molenaar, P. C. M. (2008). On the implications of the classical ergodic theorems: Analysis of developmental processes has to focus on intra-individual variation. Developmental Psychobiology, 50, 60–69.
