Research Assessment: New metrics? More metrics? No metrics?
Ismael Rafols
Ingenio (CSIC-UPV), Univ. Politècnica de València
SPRU (Science Policy Research Unit), Univ. Sussex
Re-shaping design and use of indicators
• Indicators may be harming research
  • Current indicators are only (partially) appropriate for some types of science.
  • Biases against and potential suppression of creative and valuable types of research (agro-, health,…). Threat to diversity.
• Not only more, but other types of indicators needed
  • Making visible other contributions (e.g. IDR) and other types of research (e.g. action research, co-creation)
  • Enhancing visualisation of metrics for “opening up” perspectives rather than facilitating “closing down”
• Towards different uses of indicators
  • New embedding in assessment or policy context
  • Indicators used to pluralise (opening up) perspectives, as tools for interpretation and deliberation, not a substitute for judgement
Uses of indicators: Pressing demands of research management and evaluation --- Can indicators help?
Yes, indicators can help make decisions…
Reduce time and costs
Increase transparency and sense of objectivity
Reduce complexity, accessible to managers
but do they lead to the “right” decisions?
Evaluation gap (Wouters):
“discrepancy between evaluation criteria [implicit in indicators] and the social and economic functions of science”
* Academia – “excellence”
* Innovation – economic “growth”
Missions not well covered: agriculture, public health, defence, development, social inclusion,…
Often related to marginalised / “neglected” populations?
Problems, research, indicators and marginalisation
[Figure: nested spaces. The space of STI indicators (research well illuminated by indicators) lies within the space of research, which lies within the space of problems]
STI peripheries: research spaces not well captured by indicators
Multiple types of space:
Cognitive: SSH, engineering
Linguistic: non-English
Sectoral: low-tech, agriculture, creative ind.
Social: gender, minorities
Geographical: regional, “South”
Streetlight effect in indicators: mistaking the light for the “problems”
[Figure: research well illuminated by indicators, within the space of research, within the space of problems]
Hypothesis: reduced indicator coverage may contract research space
[Figure: space of STI indicators within the space of research, within the space of problems]
(No “hard evidence”) The societal needs addressed by research that falls under the streetlight will be better rewarded.
Reduced diversity of research efforts…
…reduced coverage of societal needs
Demands for expanding role of science in society…
…may require an expanded set of indicators: MORE
[Figure: space of STI indicators within the space of research, within the space of problems]
Broadening out vs. Opening up (Stirling, 2008)
[Figure: range of appraisal inputs (issues, perspectives, scenarios, methods), from narrow to broad, against effect of appraisal ‘outputs’ on decision-making, from closing-down to opening-up. Leach et al. 2010]
Appraisal methods: broad vs. narrow & closing vs. opening
[Figure: appraisal methods placed on the same two axes: range of appraisal inputs (narrow to broad) against effect of appraisal ‘outputs’ on decision-making (closing-down to opening-up)]
Methods shown:
cost-benefit analysis
open hearings
consensus conference
scenario workshops
citizens’ juries
multi-criteria mapping
q-method
sensitivity analysis
narrative-based participant observation
decision analysis
risk assessment
structured interviews
Stirling et al. (2007)
Appraisal methods: broad vs. narrow & closing vs. opening
[Figure: range of appraisal inputs (narrow to broad) against effect of appraisal ‘outputs’ on decision-making (closing-down to opening-up)]
Most conventional S&T indicators??
Broadening out S&T Indicators
[Figure: range of appraisal inputs (narrow to broad) against effect of appraisal ‘outputs’ on decision-making (closing-down to opening-up)]
Conventional S&T indicators?? → Broadening out
Incorporating plural analytical dimensions: global & local networks, hybrid lexical-actor nets, etc.
New analytical inputs: media, blogosphere.
Broadening out S&T Indicators
[Figure: range of appraisal inputs (narrow to broad) against effect of appraisal ‘outputs’ on decision-making (closing-down to opening-up)]
Journal rankings
University rankings
European Innovation Scoreboard
Unitary measures that are opaque, with a tendency to favour the established perspectives… and easily translated into prescription
Opening up in S&T Indicators
[Figure: range of appraisal inputs (narrow to broad) against effect of appraisal ‘outputs’ on decision-making (closing-down to opening-up)]
Conventional S&T Indicators?? → opening-up
Making explicit underlying conceptualisations and creating heuristic tools to facilitate exploration
NOT about the uniquely best method
Or about the unitary best explanation
Or the single best prediction
A research and policy agenda
• Pluralising indicators (supply)
  • Broadening out: Create more diverse indicators
    • Indicators of open science, RRI, hidden, social innovation
    • Improve representation of SSH scholarship, languages other than English, the “South”,…
  • Opening up: develop more pluralistic toolkits that present contrasting perspectives
    • Multi-ranking tools
    • Interactive visualisations
• New embedding of indicators in assessment (demand)
  • Develop new social processes on use of indicators
  • Indicators to inform decisions, not a substitute for judgement
  • STI indicators as tools for interpretation and deliberation
From S&T indicators for justification and disciplining…
…towards S&T indicators as tools for deliberation
Model 1: Unique and prescriptive
Proposing “best choices”
Rankings -- ranking list of preferences
Model 2: Plural and conditional
Exploring diverse choices
Facilitating options/choices in landscapes
From S&T indicators for justification and disciplining…
…towards S&T indicators as tools for deliberation
• ‘Conventional’ use of indicators (‘Pure scientist’ -- Pielke)
  • Purely analytical character (i.e. free of normative assumptions)
  • Instruments of objectification of dominant perspectives
  • Aimed at legitimising / justifying decisions (e.g. excellence)
  Unitary and prescriptive advice
• Opening up scientometrics (‘Honest broker’ -- Pielke)
  • Aimed at locating the actors in their context and dynamics
    Not predictive, or explanatory, but exploratory
  • Construction of indicators is based on choice of perspectives
    Make explicit the possible choices on what matters
  • Supporting debate
    Making science policy more ‘socially robust’
  Plural and conditional advice
Barré (2001, 2004, 2010), Stirling (2008)
Hicks, Wouters, De Rijcke, Waltman and Rafols (2015)
Nature (23 April 2015)
Principles of “The Leiden Manifesto”
1. Quantitative evaluation should support qualitative, expert assessment.
2. Measure performance against the research missions of the institution, group or researcher.
3. Protect excellence in locally relevant research.
4. Keep data collection and analytical processes open, transparent and simple.
5. Allow those evaluated to verify data and analysis.
6. Account for variation by field in publication and citation practices.
7. Base assessment of individual researchers on a qualitative judgement of their portfolio.
8. Avoid misplaced concreteness and false precision.
9. Recognize the systemic effects of assessment and indicators.
10. Scrutinize indicators regularly and update them.
Hicks, Wouters, Waltman, de Rijcke and Rafols (Nature, 2015)
2. Examples of Opening Up
a. Broadening out AND Opening up
b. Opening up WITH NARROW inputs
1. Preserving multiple dimensions in broad appraisals
[Figure: range of appraisal inputs (narrow to broad) against effect of appraisal ‘outputs’ on decision-making (closing-down to opening-up). Leach et al. 2010]
Conventional S&T indicators?? → Broadening out, opening-up
Composite Innovation Indicators (25-30 indicators)
European (Union) Innovation Scoreboard
Grupp and Schubert (2010) show that the order is highly dependent on indicator weightings.
Sensitivity analysis
Solution: representing multiple dimensions (critique by Grupp and Schubert, 2010)
Use of spider diagrams allows comparing like with like
U-rank, University performance comparison tools (Univ. Twente)
[Figure label: 5.4 Community trademarks indicator]
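The dependence of a composite ranking on weightings can be illustrated with a minimal sketch. It assumes a plain weighted-average composite; the three units, the three component indicators and all scores are hypothetical and are not taken from the European Innovation Scoreboard or from Grupp and Schubert (2010).

```python
from itertools import product

# Hypothetical, already-normalised scores (0-1) for three made-up units
# on three made-up component indicators. Illustration only.
SCORES = {
    "Unit A": {"patents": 0.9, "education": 0.3, "collaboration": 0.5},
    "Unit B": {"patents": 0.4, "education": 0.8, "collaboration": 0.6},
    "Unit C": {"patents": 0.6, "education": 0.6, "collaboration": 0.4},
}
INDICATORS = ("patents", "education", "collaboration")


def composite(unit_scores, weights):
    """Weighted average of a unit's component scores."""
    total = sum(weights.values())
    return sum(unit_scores[name] * w for name, w in weights.items()) / total


def ranking(weights):
    """Units ordered from highest to lowest composite score."""
    return sorted(
        SCORES,
        key=lambda unit: composite(SCORES[unit], weights),
        reverse=True,
    )


def rank_ranges(weight_levels=(1, 2, 3)):
    """Best and worst rank each unit reaches across a grid of weightings."""
    ranges = {unit: [len(SCORES), 1] for unit in SCORES}
    for combo in product(weight_levels, repeat=len(INDICATORS)):
        order = ranking(dict(zip(INDICATORS, combo)))
        for position, unit in enumerate(order, start=1):
            ranges[unit][0] = min(ranges[unit][0], position)
            ranges[unit][1] = max(ranges[unit][1], position)
    return {unit: tuple(r) for unit, r in ranges.items()}


if __name__ == "__main__":
    equal = dict.fromkeys(INDICATORS, 1)
    patent_heavy = {"patents": 3, "education": 1, "collaboration": 1}
    print("equal weights:  ", ranking(equal))
    print("patent-heavy:   ", ranking(patent_heavy))
    print("rank ranges:    ", rank_ranges())
```

Reporting the range of ranks each unit can reach, rather than a single position, is one simple way of keeping the weighting choice visible to the reader.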
2. Examples of Opening Up
b. Opening up WITH NARROW inputs
Opening up S&T Indicators
[Figure: range of appraisal inputs (narrow to broad) against effect of appraisal ‘outputs’ on decision-making (closing-down to opening-up). Leach et al. 2010]
Conventional S&T Indicators?? → opening-up
Making explicit underlying conceptualisations and creating heuristic tools to facilitate exploration
NOT about the uniquely best method
Or about the unitary best explanation
Or the single best prediction
1. Measures of “scientific excellence”
[Figure: bar charts comparing ISSTI, SPRU, MIoIR, Imperial, WBS and LBS on four measures: ABS rank; citations/pub, journal-field normalised; citations/pub, citing-paper normalised; Journal Impact Factor]
Which one is more meaningful??
Rafols et al. (2012, Research Policy)
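Why different measures can order the same units differently may be easier to see in a small sketch of field normalisation. It assumes the simplest variant: each paper's citations are divided by an average for its field, then averaged per unit. The units, fields, baselines and counts are hypothetical, and this is not the procedure used in Rafols et al. (2012).

```python
from statistics import mean

# Hypothetical field baselines: average citations per paper in each field.
FIELD_BASELINE = {"management": 4.0, "innovation studies": 8.0}

# Hypothetical units, each a list of (field, citations) per publication.
PUBLICATIONS = {
    "Unit X": [("management", 6), ("management", 6), ("management", 6)],
    "Unit Y": [
        ("innovation studies", 8),
        ("innovation studies", 10),
        ("innovation studies", 6),
    ],
}


def raw_citations_per_pub(pubs):
    """Plain average of citation counts."""
    return mean(citations for _, citations in pubs)


def field_normalised_citations_per_pub(pubs, baseline):
    """Average of citations divided by the baseline of each paper's field."""
    return mean(citations / baseline[field] for field, citations in pubs)


if __name__ == "__main__":
    for unit, pubs in PUBLICATIONS.items():
        raw = raw_citations_per_pub(pubs)
        norm = field_normalised_citations_per_pub(pubs, FIELD_BASELINE)
        print(f"{unit}: raw={raw}, field-normalised={norm}")
```

In this made-up case Unit Y leads on raw citations per publication while Unit X leads once field baselines are taken into account, so the choice of normalisation is itself a choice of perspective.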
2. Measures of interdisciplinarity
Multiple concepts of interdisciplinarity:
Conspicuous lack of consensus but most indicators aim to capture the following concepts
Integration (diversity & coherence)
• Research that draws on diverse bodies of knowledge
• Research that links different disciplines
Intermediation
• Research that lies between or outside the dominant disciplines
[Figure: diversity (low to high) against coherence (low to high), with regions labelled Monodisciplinary, Multidisciplinary and Interdisciplinary]
[Figure: intermediation (low to high), running from Monodisciplinary to Interdisciplinary]
Assessing interdisciplinarity: Diversity
[Figure: ISSTI Edinburgh, WoS categories of references]
Assessing interdisciplinarity: Coherence
[Figure: ISSTI Edinburgh, observed/expected cross-citations]
Assessing interdisciplinarity: Intermediation
[Figure: ISSTI Edinburgh, map of the journals in the references (e.g. ResPolicy, Scientometrics, SocStudSci, EnergPolicy, WorldDev, Nature, Lancet)]
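Diversity of references is commonly operationalised as Rao-Stirling diversity, the sum over pairs of distinct categories of p_i · p_j · d_ij, where p_i is the share of references in category i and d_ij is the distance between categories i and j. The sketch below assumes the convention that sums over ordered pairs (i ≠ j); the category names, shares and distances are hypothetical.

```python
def rao_stirling(shares, distance):
    """Rao-Stirling diversity: sum of p_i * p_j * d_ij over ordered pairs i != j.

    shares:   dict mapping category -> proportion of references (sums to 1)
    distance: dict mapping frozenset({cat_a, cat_b}) -> distance in [0, 1]
    """
    categories = list(shares)
    return sum(
        shares[a] * shares[b] * distance[frozenset((a, b))]
        for a in categories
        for b in categories
        if a != b
    )


# Hypothetical shares of a unit's references across three categories.
SHARES = {"economics": 0.5, "management": 0.3, "ecology": 0.2}

# Hypothetical distances: economics and management are close, ecology is far.
DISTANCES = {
    frozenset(("economics", "management")): 0.2,
    frozenset(("economics", "ecology")): 0.9,
    frozenset(("management", "ecology")): 0.8,
}

if __name__ == "__main__":
    print("Rao-Stirling diversity:", rao_stirling(SHARES, DISTANCES))
```

The same shares give a higher value when the categories are further apart, which is what distinguishes this measure from a simple count of categories.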
3. Research trajectories
Explore different directions of research
Thinking in terms of research portfolios: the case of rice
[Figure: map of rice research topics]
Rice varieties (classic genetics)
Transgenics (mol. biology)
Genomics
Pests (plant protection)
Weeds (plant protection)
Plant nutrition
Production & socioeconomic issues
Consumption (hum. nutrition, food techs)
[Figures: rice research, 2000-12, for the US, India, Thailand and Brazil]
Ciarli and Rafols (2014, unpublished)
3. Summary and conclusions
From S&T indicators for justification and disciplining…
…towards S&T indicators as tools for deliberation
Instead of designing indicators for ranking (summative assessment), design indicators that foster reflection (formative assessment) and pluralisation of perspectives
This shift is facilitated by trends pushed by ICT and visualisation tools
More inputs (pubs, pats, but also news, webs, etc.)
Multidimensional outputs (interactive maps)
Institutional repositories
Multiple solutions -- highlighting variation, confidence intervals
More inclusive and contrasting classifications (by-passing private data ownership? Pubmed, Arxiv)
More possibilities for open scrutiny (new research groups)
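One way of highlighting variation is to report an interval alongside an average. The sketch below assumes a percentile bootstrap for mean citations per publication; the citation counts are hypothetical and the function name is illustrative.

```python
import random
from statistics import mean


def bootstrap_ci(values, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `values`.

    Resamples the publications with replacement, recomputes the mean each
    time, and returns the lower and upper percentiles of those means.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_resamples))
    lower_index = int((1 - level) / 2 * n_resamples)
    upper_index = int((1 + level) / 2 * n_resamples) - 1
    return means[lower_index], means[upper_index]


# Hypothetical citation counts for one unit's publications (skewed, as
# citation counts usually are).
CITATIONS = [0, 0, 1, 1, 2, 2, 3, 4, 5, 7, 9, 12, 30]

if __name__ == "__main__":
    low, high = bootstrap_ci(CITATIONS)
    print(f"mean citations/pub: {mean(CITATIONS):.2f}")
    print(f"95% bootstrap interval: {low:.2f} to {high:.2f}")
```

With small and skewed sets of publications the interval is typically wide, which is a reminder that small differences between units in a bar chart may not be meaningful.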
S&T indicators as tools to open up the debate
• ‘Conventional’ use of indicators (‘Pure scientist’ -- Pielke)
  • Purely analytical character (i.e. free of normative assumptions)
  • Instruments of objectification of dominant perspectives
  • Aimed at legitimising / justifying decisions (e.g. excellence)
  Unitary and prescriptive advice
• Opening up scientometrics (‘Honest broker’ -- Pielke)
  • Aimed at locating the actors in their context and dynamics
    Not predictive, or explanatory, but exploratory
  • Construction of indicators is based on choice of perspectives
    Make explicit the possible choices on what matters
  • Supporting debate
    Making science policy more ‘socially robust’
  Plural and conditional advice
Barré (2001, 2004, 2010), Stirling (2008)
Strategies for opening up or how to “keep it complex” yet “manageable”
• Presenting contrasting perspectives
  • At least TWO, in order to give a taste of choice
• Simultaneous visualisation of multiple properties / dimensions
  • Allowing the user to take their own perspective
• Interactivity
  • Allowing the user to give their own weight to criteria / factors
  • Allowing the user to manipulate visuals
Is ‘opening up’ worth the effort? (1) Sustaining diversity in S&T system
Potential unintended consequence of the evaluation machine: decrease in diversity.
Why diversity matters
Systemic (‘ecological’) understanding of S&T
• S&T outcomes depend on synergistic interactions between disparate elements.
Dynamic understanding of excellence and relevance
• New social needs, challenges, expectations from S&T
Manage diverse portfolios to hedge against uncertainty in research
• Office of Portfolio Analysis (National Institutes of Health) http://dpcpsi.nih.gov/opa/
Open possibility for S&T to work for the disenfranchised
• Topics outside dominant science (e.g. neglected diseases)
STI Indicators Conference
European and Latin American Networks
14-16 September 2016, València
Measures of “scientific impact”
[Figure: bar charts comparing ISSTI, SPRU, MIoIR, Imperial, WBS and LBS on citations per publication under four treatments: raw; field normalised; citing-side normalised; journal normalised]
Summary: IS (blue) units are more interdisciplinary than BMS (orange)
[Figure: three panels. More diverse: Rao-Stirling diversity. More coherent: observed/expected cross-citation distance. More interstitial: average similarity]