ABSTRACT
Applied prevention research centres (APRCs) are important parts of public health efforts
to prevent chronic disease and promote healthy living, yet how to measure their practical impacts
on society remains poorly understood. This study aimed to identify indicators considered by a
diverse set of stakeholders to be most important for capturing the practical impacts of APRCs
(outside of contributions to new knowledge); and, to identify opportunities for adaptation and
further development of measures for these most important indicators. A modified Delphi
approach was used to gather the perspectives of centre leaders, funders and knowledge users
associated with 36 APRCs from diverse international settings. An initial set of 22 decision
making and capacity development indicators was compiled from existing research impact
frameworks. During a three-round Delphi process, panelists rated these indicators on importance
and feasibility, proposed refinements to existing indicators and developed new indicators. Only
those indicators rated above average on importance were retained between rounds. This process
identified eight indicators that were rated as highly important and highly feasible for collection,
such as the number of APRC projects driven by policy needs, the number and quality of
knowledge exchange activities, and citations of APRC research in public policy documents.
Seven indicators were rated as highly important but with low feasibility, such as measures of
APRC reputation, evidence of contributions to the field of prevention research, and the influence
of the APRC’s work over time on the knowledge, skills and commitment of policy and practice
partners. These indicators may be suitable for future methods development.
INTRODUCTION
Research evaluation may be conducted for four broad purposes: to inform advocacy
efforts; to meet accountability requirements; to analyse and understand how, where and why
research is effective; and to inform funding allocation decisions (Guthrie et al. 2013; Penfield et
al. 2014). In many jurisdictions, interest in research evaluation is increasing, driven by a greater
focus on accountability, good research governance, a need to make better choices about limited
funding for research activities, and a desire to limit waste in conducting research (Greenhalgh et
al. 2016; Guthrie et al. 2013). Research evaluation therefore provides an important means for
demonstrating the value of investing in research activities, informing the planning of research
institutions and funding agencies, as well as rewarding past experience and incentivizing future
activity (Greenhalgh et al. 2016; Guthrie et al. 2013; Upton et al. 2014).
Within the field of research evaluation are specific efforts to examine and document
research impact, particularly the ‘societal’ impact arising from research activities (Bornmann
2013, 2016; Penfield et al. 2014). As noted by Bornmann, societal impact is concerned with “the
assessment of (a) social, (b) cultural, (c) environmental, and (d) economic returns (impact and
effects) from results (research output) or products (research outcome) of publicly funded
research” (Bornmann 2013). Societal impact therefore recognizes the importance of research
that leads to marketable and consumable products or services (Bornmann 2013, 2016).
Despite its importance, assessing or measuring societal impact is difficult. Social,
cultural, environmental, and economic impacts are not mutually exclusive, as evidenced by
the multiple contributions made by a new medical treatment that improves quality of life,
reduces absenteeism, and increases economic productivity (Bornmann 2013). Societal impacts
may be intended or unintended, and confined to a particular target area or population, or extend
more broadly. Increasingly, research carried out in one location may have far-reaching effects on
actors beyond an intended area, be they governments, industries, clinicians, or individual
citizens. Finally, societal impacts often require years or decades to become apparent. As a result,
drawing causal links between a particular research project or activity and a definable societal
impact is often extremely difficult.
For some disciplines, these linkages are more readily apparent, such as between research
in engineering and economic impacts (Upton et al. 2014). For other disciplines, such as applied
research efforts related to public policymaking, social sciences, or applied public health, the links
between research and societal impact are more difficult to describe, and are influenced by a plethora
of uncontrollable factors (Greenhalgh et al. 2016). In these, and other similar fields, evidence
from research is often used “conceptually (for general enlightenment) or symbolically (to justify
a chosen course of action), rather than instrumentally (feeding directly into a particular policy
decision)” (Greenhalgh et al. 2016). A key contributing factor is the complexity of the decision
making process itself, which exerts powerful effects on how research evidence from social
science can be used. As a consequence, social science research is often used to bring attention to
and highlight the complexity of a situation or problem, for which multiple responses may exist
(Greenhalgh et al. 2016).
Given these challenges, there are some who question the usefulness of assessing the
societal impact of research at all, and suggest the potential for such assessments to negatively
influence the type of research that is undertaken (Penfield et al. 2014). For example, concerns
exist that assessing research impact may encourage research on topics and questions for which
research impact may be more readily identified, for which economic impacts (and products) may
be more easily generated, and which fit the interests and priorities of powerful donors (Johnston
1995; Penfield et al. 2014; Stuckler et al. 2011). This research may come at the expense of more
exploratory and creative research, and/or research that is not easily translated into quantified
societal impacts. Recent studies suggest that while researchers and the public both place a high
value on research with societal impacts (Mulligan and Conteh 2016; Pollitt et al. 2016), concern
exists, particularly among the public, that research should not be conducted solely for the pursuit
of economic gains (Miller et al. 2013).
Therefore, measuring the societal impact of research, without compromising the creative
pursuit of new knowledge, requires nuanced measurement approaches that capture a range of
societal impacts, alongside those measures of new knowledge generation. Existing research
impact frameworks offer insights into some of the different ways these societal impacts may be
understood and measured.
Research impact frameworks
Recent reviews have synthesized insights into the growing number of available research
impact frameworks (Banzi et al. 2011; Boaz et al. 2009; Buykx et al. 2012; Greenhalgh et al.
2016; Milat et al. 2015; Penfield et al. 2014). Among the most commonly cited frameworks are
six examined in detail by the RAND Corporation: the Canadian Academy of Health Sciences
(CAHS); Excellence in Research for Australia (ERA); National Institute for Health Research
(NIHR) Dashboard; Productive Interactions framework; Research Excellence Framework (REF);
and STAR METRICS (Guthrie et al. 2013). Many of these approaches have been informed by
Buxton and Hanney’s ‘Payback’ framework, originally developed in the UK to examine returns
or ‘paybacks’ from investment in research (Buxton and Hanney 1996).
The Payback framework contains five measurement dimensions: (1) knowledge
production; (2) benefits to future research and research use (research targeting, capacity building
and absorption); (3) political and administrative benefits (informing policy and product
development); (4) health sector benefits; and (5) broader economic benefits (Buxton and Hanney
1996; Hanney et al. 2000; Hanney 2005). These dimensions are situated in an input-output model
of how each can be best assessed. This model contains multiple stages (e.g. research needs
assessment, primary research outputs, secondary research outputs etc.), with interfaces between
the research system and the reservoir or stock of knowledge, and the broader political,
professional and industrial environment or society in which research is conducted. As noted by
Hanney, this emphasizes the need for research that meets the “…needs of potential users and
engages the interest of leading researchers, and is then fed back into the wider environment in a
way that increases the chances of the research being utilized” (p. 11) (Hanney 2005).
The Payback framework and others have now been widely applied in a range of contexts.
From this work, most consistency appears to relate to measures of knowledge production,
particularly those metrics focused on research publications and funding. For publications,
common measures include the number of publications, publication quality, and citation data,
while funding-specific measures relate to the number of applications made to funding agencies or
the total value of funding support secured from various sources (research councils as well as
industry) (Australian Research Council 2016; Higher Education Funding Council for England
2014; Panel on Return on Investment in Health Research 2009).
Measuring the societal impact of research, including impacts on policy and product
development, capacity development, and decision making, is comparatively more difficult (Boaz
et al. 2009; Greenhalgh et al. 2016; Penfield et al. 2014). As noted by Buxton and Hanney, it is
important to be able to demonstrate that research has improved the quality and depth of an
information base, as well as influenced the decisions made by those responsible for policy and
practice and their capacity to do so (Buxton and Hanney 1996). Capacity development is
considered to relate to personnel, the acquisition of funding, and investment in research
infrastructure, and may include improving the skills and competencies of staff, creating larger
and more comprehensive datasets, enhancing centre reputation, and fostering cross-fertilization
of ideas (Panel on Return on Investment in Health Research 2009). Decision making relates to
decisions made by those in the broad fields of health, research, the health product industry, and
by the general public. This may be evidenced in practitioner behavior, clinical management
guidelines, how resources are allocated, regulatory decisions, media coverage, or research and
development agendas (Panel on Return on Investment in Health Research 2009).
Applications of the Payback framework have demonstrated multiple ways that the
societal impacts of research may be captured, including as they relate to the work of specific
research centres (Graham et al. 2012; Hanney et al. 2000; Wooding et al. 2014). For example,
Hanney et al. describe the impact of two research centres: one focused on substance misuse and
the other on community and primary care (Hanney et al. 2000). Results from applying the
Payback model to the work of these centres suggested both had demonstrated impacts on
capacity development and decision making. In part, these impacts were reflected in the use of a
centre’s research in needs assessments and project specifications, as well as evidence that those
in policy made deliberate efforts to engage with centre activity (Hanney et al. 2000). These types
of impacts appear to be particularly important for clinical or applied research centres, including
those with a focus on chronic disease prevention (Wooding et al. 2014).
Applied Prevention Research Centres (APRCs)
In recent years, many high-income countries, such as Canada, the USA, Australia and the
UK, have established a variety of applied prevention research centres (APRCs) to tackle the rising
burden of chronic conditions through prevention research (e.g. the Healthy Populations Institute
(Canada), Prevention Research Centre at St. Louis (USA), the Australian Prevention Partnership
Centre (Australia), Development and Evaluation of Complex Interventions for Public Health
Improvement (UK)). While there is no single organizational model applicable to all APRCs,
many are characterised by the explicit inclusion and engagement of research, policy and practice
perspectives; foci on scientific and practical contributions; and capacity development among
research scholars and policy/practice partners. APRCs are often affiliated with universities, and
may or may not be housed within traditional university structures. Activities within APRCs are
understandably diverse and involve gathering and interpreting surveillance data; developing,
testing and evaluating preventive interventions; mobilizing knowledge from research and
practice; and developing and delivering training programs (Greenlund and Giles 2012). As
investments in such centres continue to grow, being able to describe and document their societal
impacts – beyond new knowledge – is becoming increasingly important. At the same time, the
challenges of measuring the impact of centres are being acknowledged, including the range of
indirect outcomes that centres influence, the difficulty in linking specific pieces of centre
research with societal impacts, and the often lengthy time required for societal impacts to
develop (Scott et al. 2011). The result is a need and desire to understand and improve the societal
impact of APRCs, but limited methods and tools for doing so in their specific context.
This study represents an early and exploratory response to this problem and builds on
existing research impact frameworks such as Payback and CAHS. It seeks to contextualize these
frameworks to the specific circumstances of APRCs, and in doing so, provide new insights into
those indicators that may be important and feasible for assessing the societal impacts of these
centres. Specifically, this study aims to identify:
1. Those indicators considered by a diverse set of stakeholders to be most important for
capturing the practical impacts of applied prevention research centres; and,
2. Opportunities for adaptation and further development of measures for these most
important indicators.
In doing so, this study tailors available research impact frameworks to the particular
research domain of applied prevention research, and explicitly engages diverse and relevant
perspectives – i.e., research funders, scientific leaders of research centres, knowledge users of
APRCs. Findings from this work are intended to provide inputs for catalyzing new conversations
with members of participating communities, and discerning promising directions for evaluating
practical impacts of APRCs.
METHODS
This study adopted a modified Delphi approach (Day and Bobeva 2005; Goodman 1987;
Hsu and Sandford 2007). The Delphi technique is commonly used for gaining group consensus
on a given topic or theme, and typically involves multiple rounds of a questionnaire delivered
either online or by post (Day and Bobeva 2005; Goodman 1987; Hsu and Sandford 2007). A
traditional Delphi commences with exploratory questions that are used to refine subsequent
rounds. In this instance, the first round instead presented an initial set of indicators drawn from a
review of available research evaluation frameworks. The design of this study was guided by an Advisory
Group and involved three rounds.
Advisory Group
The Advisory Group (n=12) was convened at the commencement of the study. The
Advisory Group included international experts from Canada, Australia, the UK, the USA, and
the Netherlands with expertise in chronic disease prevention research, prevention policy and
practice, research funding, research governance, and assessing research impact. The purpose of
the Advisory Group was to assist in defining the characteristics of APRCs, identifying centres
that met these criteria, and recruiting members of APRCs to participate in the study.
The group also helped to identify evaluation frameworks to inform the study. The group
communicated via email with small group teleconferences conducted as needed.
Delphi panelists
An ‘expert panel’ is critical for the Delphi method (Keeney et al. 2001). Given the focus
of this study on APRCs, a two-step process was used to identify expert participants: (1)
identifying APRCs of relevance to the study; and (2) identifying expert panelists associated with
those centres.
Identifying applied prevention research centres (APRCs)
Through a combination of targeted online searches and consultation with the Advisory
Group, 36 APRCs were identified internationally that met the definition outlined in Figure 1.
While the search strategy did not specifically use “chronic disease” as a key word, those centres
that did not include a focus on chronic diseases were excluded. Using this definition, the
websites of identified research centres were reviewed to determine if the information provided
within the mission, vision, objectives, services and/or additional material such as the annual
report could be considered to fall within the proposed definition of APRCs. The website and
document review were completed by one team member and discussed with two additional team
members to determine a final list of APRCs.
Identifying expert panelists
The leaders of these centres were identified through a review of available websites and
associated documentation. Centre leaders were considered to be those holding the position of
director, executive director, CEO or another comparable title. The first round of the Delphi (see below)
was sent to these centre leaders, who were asked during that round to provide the names and
business email addresses for their core funders and up to two policy/practice ‘knowledge users’
with whom they collaborate. This combination of centre leaders, core funding agencies, and
knowledge users comprised the expert panel in this Delphi.
Expert panelists were recruited through an initial study invitation sent via email that
introduced the study and provided participants with information about what participation
required. After receiving confirmation of their participation, participants were sent an email with
a link to the round 1 questionnaire. Two reminders were sent for each round of the
questionnaire. The round 1 questionnaire was open for 3 weeks. Only panel members who
returned the round 1 questionnaire were sent the round 2 questionnaire, which was administered
1.5 months after round 1 using the same approach. The round 3 questionnaire followed the same
approach and was administered 2.5 months after round 2.
The study was reviewed and received ethics clearance through a University of Waterloo
Research Ethics Committee (ORE#20421).
Conducting the Delphi process
Round 1
The initial list of indicators for the round 1 questionnaire was informed by a review of
research impact frameworks. Indicators of impact were compiled from the six frameworks
included in the RAND report, beginning with the CAHS framework as this was considered by
the Advisory Group to be most closely aligned with the field under study (i.e. chronic disease
prevention). As this study focused on the practical/societal contributions from APRCs, the
initial set of indicators focused on the ‘capacity development’ (CD) and ‘decision making’ (DM)
categories of CAHS. Therefore, the survey did not include indicators on the scientific
contributions of APRCs to new knowledge (e.g. peer-reviewed publications). Once the relevant
indicators from the CAHS had been identified, the REF, ERA, STAR METRICS, Productive
Interactions, and the NIHR Dashboard were reviewed for additional indicators considered by the
Advisory Group as relevant for inclusion. Only those indicators that were applicable at an
institutional level (i.e. the centre level) were included.
In round 1, participants were emailed a link to the survey containing a list of 22
indicators subdivided into CD (9 indicators) and DM (13 indicators). Participants were asked to
rate each indicator on its importance using a nine-point Likert scale with anchors of 1 = ‘not at
all important’ and 9 = ‘extremely important’. A comment box was available for each indicator to
allow participants to suggest how they would refine the indicator to make it most useful to
evaluations of APRCs. Participants were asked to complete the ratings from their own
perspective as a centre leader, funder or knowledge user collaborator. Following the rating of
indicators, an open ended question asked participants to provide any additional indicators
specific to the evaluation of APRCs that they considered to be important but which were not
included in the list of 22 indicators.
Round 2
Wording refinements suggested by participants to original indicators in round 1 were
incorporated into ‘refined’ indicators (meaning that one original indicator could have been
developed into multiple ‘refined’ indicators linked back to the original). Completely new
ideas not captured in any original indicator were developed into new indicators, using the words
of the participant(s) where possible. The round 2 questionnaire provided participants with the
mean importance ratings for each of the original indicators and requested participants provide
importance ratings for refined indicators (n = 49, 27 CD and 22 DM) and newly developed
indicators (n = 15, 12 CD and 3 DM). An open-ended question was included for each set of
indicators to provide comments or suggestions related to the refined and new indicators.
Round 3
In contrast to rounds 1 and 2, round 3 focused on feasibility ratings for those indicators
deemed by participants to be most important. Based on rounds 1 and 2 results, the mean rating of
importance for all indicators (original, refined and newly nominated) was calculated. All
indicators falling above this mean were identified and reviewed by three of the authors (CW, LS,
BR) to determine a set of indicators that each represented a unique idea. Full agreement was
required among these authors, with disagreements resolved through open discussion. Where two
indicators addressed a similar concept as judged by the three authors, the indicator with the
highest mean rating was retained. These procedures generated a list of 39 unique indicators rated
as above average in importance by participants (26 CD and 13 DM indicators). The authorship
team considered the feasibility of collecting indicators to be an important marker of their
usefulness in practice. As such, round 3 asked panelists to rate the feasibility of measuring these
indicators using a nine-point Likert scale with anchors of 1 = ‘not at all feasible’ and 9 =
‘extremely feasible’. Open ended questions for each set of indicators (CD and DM indicators)
asked participants to identify promising methods and/or data sources (both existing and
potential) and potential challenges to using these indicators for evaluating APRCs.
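To make the retention rule described above concrete, the following minimal Python sketch shows how above-average indicators could be identified from panelist ratings. The indicator names, scores and data structure are hypothetical illustrations, not the study's data or code.

```python
from statistics import mean

# Hypothetical panelist importance ratings (nine-point Likert scale) per indicator.
importance_ratings = {
    "CD: knowledge exchange activities undertaken by the centre": [8, 9, 7, 8],
    "CD: professional development opportunities for centre staff": [6, 5, 7, 6],
    "DM: citations of centre research in public policy documents": [9, 8, 8, 9],
}

# Mean importance per indicator, then the overall mean across all indicators.
indicator_means = {name: mean(scores) for name, scores in importance_ratings.items()}
overall_mean = mean(indicator_means.values())

# Retain only indicators rated above the overall mean for feasibility rating in round 3.
retained = {name: m for name, m in indicator_means.items() if m > overall_mean}
print(retained)
```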
In addition to ratings of importance and feasibility, round 3 also gave panelists the
opportunity to provide an overall comment on methodological challenges associated with the
indicators. This included the opportunity to describe important considerations, potentially useful
sources of evaluative data, and approaches for collecting new data of relevance to the evaluation
of APRCs. Qualitative data gathered from participants were collated into a brief table and
analyzed thematically.
Analysis
Of the 39 unique indicators with above average ratings of importance, we identified two
subsets: (1) those indicators with the 5 highest importance ratings and with feasibility ratings
also falling above the mean; and (2) those indicators with the 5 highest importance ratings but
with feasibility ratings falling below the mean. Indicators in subset 1 represent those which
panelists believe are important to measure for APRCs and which are considered feasible to
measure with current methods, while those indicators in the second subset represent important
indicators for APRCs, but for which current measurement methods may not be suitable. Given
this study explicitly engaged a small number of panelists with expert knowledge of APRCs (i.e.
centre leaders, funders and knowledge users), comparisons between panelists were not possible.
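As a minimal sketch of this subsetting step, the snippet below (Python, with purely illustrative names and values) splits a set of retained indicators, all already above the mean on importance, into the two subsets according to whether their feasibility ratings fall above or below the feasibility mean; the precise selection rules used in the study are as described in the paragraph above.

```python
from statistics import mean

# Hypothetical (mean importance, mean feasibility) ratings for retained indicators.
retained = {
    "Projects driven by expressed policy/practice needs": (8.4, 7.1),
    "Citations of centre research in policy documents": (8.1, 6.8),
    "Centre reputation": (8.0, 4.2),
    "Sustained influence on policy/practice partners": (7.9, 3.9),
}

feasibility_mean = mean(feas for _, feas in retained.values())

# Subset 1: important and feasible to measure with current methods.
important_feasible = [n for n, (_, feas) in retained.items() if feas >= feasibility_mean]

# Subset 2: important, but current measurement methods may not be suitable.
important_low_feasibility = [n for n, (_, feas) in retained.items() if feas < feasibility_mean]

print(important_feasible)
print(important_low_feasibility)
```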
RESULTS
As noted, 36 APRCs were identified and invited to participate in this study, of which 22
agreed to participate. Of the 36 panel members (22 centre leaders, 6 funders and 8 knowledge
users) who agreed to participate in the study, 27 individuals completed round 1 (75%), 23
completed round 2 (85% of round 1 respondents; 13 centre leaders, 4 funders, 6 knowledge
users) and 20 completed round 3 (87% of round 2 respondents; 11 centre leaders, 4 funders, 5 knowledge users).
More than half (63%) of participating APRCs had fewer than 25 full- or part-time
employees, and an approximate annual operating budget of $7.3M USD (range $750,000 - $19.7M
USD). Fifty-six percent of participating APRCs had been in operation for more than 10 years.
Panelists from funding organizations represented government and charitable sectors that provided
funding support for infrastructure, capacity development, project and program grants, and
contracts. All participating funding organizations had been in operation for more than 10 years.
Knowledge users brought a variety of perspectives, most prominently advocacy, program
implementation and evaluation (5 of 7 knowledge user panelists). Other areas of focus for
knowledge users included health services, planning and policy development (4 of 7 knowledge
user panelists). Six of the knowledge user panelists reported that their organization had been in
operation for more than 10 years.
A total of 22 CD and DM indicators were selected from research impact evaluation
frameworks and included in the round 1 questionnaire (Table 1). CD indicators (n=9) broadly
related to issues of student training and employment, centre staffing structure and professional
development, centre and project funding, and collaborations with external partners. DM
indicators (n=13) covered domains relevant to the use of research in public policy and programs,
citation of research in guidelines and other policy documents, and reported use of research in and
outside health domains. In round 1, panelists proposed refinements to 9 of the original CD
indicators and 13 of the original DM indicators in order to increase their specificity for APRCs.
This resulted in 27 refined CD indicators (Table 2a) and 22 refined DM indicators (Table 2b)
that had origins in the originally circulated indicators. In addition, panelists proposed a further 12
new CD indicators and 3 new DM indicators. These newly identified CD indicators related to
concepts of co-production with research, policy and practice perspectives, field building,
reputational indicators, and specific indicators for knowledge exchange activities. Newly
developed DM indicators focused on the engagement of APRC staff in formal decision making
committees, specific details on the organizations influenced by APRC activities, and sustained
impact of centre research over time.
Round 1 generated 64 new or refined indicators in the CD and DM categories. Using the
specified cut-off values, 25 indicators were discarded following round 2, leaving 39 indicators with
above average importance ratings that were then rated by panelists for their feasibility in round 3
(Tables 3a and 3b).
Examination of round 3 results revealed 8 indicators (5 CD and 3 DM indicators) with
ratings for importance and feasibility above the mean (Table 4). These indicators related to
aspects of centre funding, the responsiveness of centre projects to user needs, knowledge
exchange, citations in policy documents and centre staff contributing to decision making bodies.
Round 3 analyses also identified 7 indicators with importance ratings above the mean, but
feasibility ratings below the mean (5 CD and 2 DM indicators) (Table 4). As noted in Table 4,
these indicators relate broadly to concepts of centre reputation, influence of centre activities on
knowledge users, and contributions of the centre to scientific fields and policy development.
Qualitative data highlighted a number of challenges in advancing the evaluation of
APRCs. Panelists noted the lack of funding available for supporting evaluative activities,
including for contracting external and independent evaluations (despite their potential value to
funding agencies). Panelists also referred to the ambiguity present in many of the indicators
themselves, as illustrated by one panelist:
“There are many definitional challenges, such as what counts as applied prevention
research; what is relevant knowledge and skills for applied prevention research; what are types
of 'influence' and 'use' of centre products/activities; what defines "co-production"?”
These challenges were seen as particularly problematic for those indicators requiring
large amounts of complex qualitative data. As highlighted by the following quote, panelists
noted the challenges in gathering relevant evaluative data from collaborators and potential
beneficiaries:
“It requires significant resources and skills to do well, as an arm's length skilled
interviewer must be paid to collect qualitative data of this sort from a wide range of such
stakeholders, many of whom are reluctant to be interviewed or say much when they are.”
Panelists also suggested that the users of knowledge generated by APRCs are not always
easily identified and are not included in an easily accessed database or repository. Even where
such individuals were identifiable, panelists raised concerns about overburdening collaborators
and partners in collecting evaluative data (particularly qualitative data), as well as a potential
lack of interest among policy and practice partners in participating in evaluation activities (as
noted by the quote above). In addition, some panelists anticipated challenges in
gathering insightful evaluative data from partner organizations experiencing high rates of staff
turnover (i.e. limited organizational memory).
In addition to the challenges of evaluating APRCs, panelists also identified a number of
ways in which high quality data may be gathered as part of APRC evaluation efforts. Panelists
advocated for a range of data sources to be used, suggesting a ‘triangulation’ of methods and
approaches. These data sources included routine national data repositories containing aggregate
data from defined geographic areas, as well as readily available data in centre blogs, social media
data, committee minutes and ministerial press releases. While making use of available data was a
key consideration suggested by panelists, a number of data collection methods were also proposed
for gathering new evaluative data. These included interviews with relevant individuals and
teams, surveys of partner organizations, and social network analyses (Knoke and Yang 2008).
An overall approach, such as that offered by Contribution Analysis (Mayne 2012), was considered
be one potentially useful strategy for gathering and organizing diverse sources of data to examine
centre impact. An important consideration is the alignment of these methods to the domains of
preferred CD and DM indicators.
DISCUSSION
This study has identified a number of existing indicators, refined indicators and new
indicators that may be useful for evaluating the impact of APRCs. Many indicators that rated
highly on importance represent refined versions of similar indicators in existing frameworks. For
example, while the originally circulated indicators included two focused on the skills and
experience of centre staff, the refined indicators tied the skills of centre staff to the specific
needs of prevention research, the availability of career paths in prevention centres, and the
professional development opportunities provided to centre staff that are relevant to prevention.
Other originally circulated indicators that did not rate highly on importance (e.g. those related to
funding, student training and partnerships) were refined into highly rated indicators more
closely aligned to the contexts of prevention (e.g. stability of funding for indirect costs, students
employed in relevant fields following graduation, the quality of partnerships and not just their
quantity).
These results share some similarities with measures contained in existing frameworks,
and highlight how common concepts related to capacity, policy impact, and health status may be
tailored to the context of APRCs. For example, the Payback framework’s dimension on research
targeting, capacity development and absorption appears to relate to this study’s indicators on the
influence of the centre’s training on public and population health practitioners, and how the
centre’s work influences more junior researchers. Payback dimensions focused on informing
policy and product development are expressed in indicators related to the contributions of the
centre’s work to policy development, implementation and evaluation, and the sustained impact of
the centre’s work on decision making that affects the public’s health. Similarly, the Payback
dimension on knowledge production appears to be related to indicators from this study focused
on the contributions of APRCs to the field of prevention research.
Applications of the Payback framework, and associated frameworks such as CAHS, have
demonstrated how these tools may be adapted for use in specific contexts, as well as some of the
key activities required to achieve impact. For example, the CAHS framework (based on the
Payback framework) was adapted by Graham et al. for use in evaluating the development and
implementation of Alberta Innovates-Health Solutions: a Canadian-based, publicly funded
provincial health research and innovation organization (Graham et al. 2012). As noted by the
authors, through both retrospective and prospective application of the CAHS, additional concepts
were added, including domains for reach, training and mentorship, as well as specific indicators
related to budget variance, partnership expenditures, and the number of programs and services
delivered (Graham et al. 2012). Like the adaptations made by Graham et al., results from the
present study highlight specific adaptations that may be useful in the context of APRCs,
particularly around indicators for assessing knowledge mobilization such as those focused on
knowledge exchange activities, co-production with policy and practice partners, invitations to
join policy and practice decision making events, and formal and informal relationships with
those in policy and practice.
Consistent with existing research impact frameworks, these indicators highlight the
crucial role of sustained engagement, partnerships and co-production for achieving research
impact. Similar results are suggested by empirical applications of the Payback framework,
including in evaluating the impact of Heartstart Scotland: a national program attempting to
introduce automatic defibrillators into all Scotland’s ambulances (Wooding et al. 2014). In that
analysis, an explicit and sustained effort to build relationships that linked and exchanged the
perspectives of health researchers and potential users was identified as crucial in achieving
impact (Wooding et al. 2014). These concepts resonate with the present findings, and are in
keeping with the theoretical foundations underlying the core functions of APRCs. These
functions are grounded in concepts from traditions such as engaged scholarship, integrated
knowledge-to-action, and partnership-based research (Davies et al. 2008; Greenhalgh and
Wieringa 2011; Moser et al. 2016; Van de Ven 2007). While unique in their own right, these
traditions each highlight the importance of engagement with those that benefit from the research
generated by APRCs. While this is in part reflected by output measures such as the citation of
APRC research in guidelines, policy documents and by those working in policy and practice
settings, results here reinforce other research impact frameworks, and suggest potential value in
also measuring the processes of engagement, and the quality of relationships that exist between
researchers and partners. Cultivating such relationships is a critical role for APRCs and requires a
significant investment of time and resources to enable the use of research knowledge.
The distinction between process, outcome and output measures is important for debates
around measuring the impact of APRCs, particularly as there is variability in how these terms are
used: for example, what some consider as process markers may be classified by others as
proximal outcomes. While greater consistency exists regarding what constitutes longer-term
outcomes in health (e.g. the incidence and prevalence of a disease or risk factor in a population),
measurement of these outcomes is challenging for two reasons: the time lag required to generate
a detectable change in the health of populations, and the attribution of any change to the work of
an APRC or prevention researcher (Pollack 2011). Process measures (or proximal outcome
measures) have the advantage of measuring the more immediate and often incremental work of
APRCs, and are critical for capturing the ongoing and sustained nature of preventive
research that influences those in positions to create positive change (Samuel and Derrick 2015).
Such markers also recognize the indirect nature of many policy contributions, the tendency for
drivers of policy change to often go un-cited, and the potential for high quality research evidence
to be dismissed by powerful political forces (Upton et al. 2014).
Process markers identified in the present study are consistent with the measurement
domains suggested by the Productive Interactions framework originating from the Netherlands
(Spaapen and Van Drooge 2011). As noted, this framework focuses on the role of productive
interactions between researchers and policy/practice partners. As described by Spaapen and Van
Drooge, productive interactions are those that occur between researchers and society, and which “lead
to efforts by stakeholders to somehow use or apply research results or practical information or
experiences” (p 212) (Spaapen and Van Drooge 2011). The social impacts that result from this
knowledge are changes in behaviour that may relate to human well-being and/or the relationships
that exist between people or organizations (Spaapen and Van Drooge 2011).
Unlike other impact frameworks, the Productive Interactions approach is tailored to the
needs and contexts of specific groups and teams, limiting its utility for making comparisons, and
hence, its capacity to inform funding allocation decisions. As such, Productive Interactions has
been noted for its primary role in fostering learning and improvement within teams over time
(Guthrie et al. 2013).
The utility of the Productive Interactions framework for learning and improvement
highlights the different uses that research evaluations might have, including for accountability as
noted above, as well as for informing advocacy and analysis/understanding of research activities
(Guthrie et al. 2013). As noted by Guthrie et al., different approaches to evaluation tend to lend
themselves to different purposes. Quantitative approaches tend to be most useful for gathering
longitudinal data and making comparisons across centres and over time; formative approaches
tend to focus on learning and improvement with a flexible and comprehensive approach to
evaluation but with limited utility for drawing comparisons; approaches with a high burden on
those agencies conducting the evaluation are used infrequently and typically capture large
amounts of qualitative data; while those requiring low levels of expertise in their deployment
often have a low burden on participants, and are therefore more commonly employed (Guthrie et
al. 2013).
Therefore, the ease with which data may be collected for a specific suite of indicators
influences who will use those indicators, for what purposes, and how frequently they will collect
data on one or more indicators. Results from the present study have identified a subset of
indicators that panelists rated as both important and feasible for collection. Many of these
indicators are quantitative measures, such as the number of centre projects driven by expressed
policy/practice needs of engaged organizations; the number of partners/collaborators (by sector)
and products and contributions of collaboration; the number and type of knowledge exchange
activities undertaken by the centre; or evidence of the centre’s contributions to supporting
decision making processes and groups (e.g., participation of centre staff on steering groups,
Ministerial Working groups, government committees etc.). Data on these and other similar
indicators help to quantify the performance of APRCs and are comparatively simple to gather
through regular tracking procedures within many APRCs. As such, these data may be
built into ongoing monitoring efforts, allowing performance to be tracked over time as well as
between centres, and may therefore be of appeal to those evaluation efforts with few resources or
limited specialist skills available. A common set of quantitative metrics may also be of appeal to
centre funders with interests in conducting comparative analyses across centres. Yet these
quantitative methods are unlikely to capture the full range of societal impacts an APRC might
have or desire to have.
Alongside these important and feasible quantitative measures, this study also identified a
sub-set of indicators with high importance ratings, yet low feasibility ratings. These measures
cover a wide range of foci, including aspects of centre reputation, changes over time
in how people think, work and behave as a result of centre activities, the contribution of a centre
to building the field of prevention research, and sustained impact of the centre’s work on policy,
programs and practice. For many APRCs, these are key domains of activity, yet, as noted by
panelists in this study, they are regarded by many as difficult to capture. Gaining useful insights
into these aspects of centre performance likely requires multi-modal methods, including desk
analysis, panel assessments, interviews and case studies, which include the perspectives of those
within and external to APRCs (Cohen et al. 2015; Milat et al. 2015). Narrative case studies, such
as those gathered by the 2014 Research Excellence Framework, have been noted to succeed in
“capturing the complex links between research and impact” (Higher Education Funding Council
for England 2014). Through the REF, 1621 qualitative case studies from the health and
biomedical research fields were submitted for evaluation by expert panels, with 91% considered
by panelists as being outstanding or very considerable in terms of the significance of their
impact. The most successful case studies were considered to be those involving a compelling
narrative that linked research to impact; verifiable evidence of the link; and a description of how the
impact spread from immediate to distant beneficiaries (Higher Education Funding Council for England
2014). Critically, these details cannot be captured through sole use of quantitative metrics
(Higher Education Funding Council for England 2014). Yet gathering these and similar data
requires a high level of expertise, time and financial support: resources not always available to
many evaluation practices.
Given the importance of these indicators, advancing the evaluation of APRCs may be
well served through investing in methods development work that captures the contribution of
such centres to societal and economic goals. A number of contribution-based approaches have
been described in the literature, including contribution mapping (Kok and Schuit 2012),
contribution analysis (Mayne 2012) and an extension of the latter, the Research Contribution
Framework (Morton 2015). A companion article in this journal issue (Riley et al.) reports on
experience of an APRC in Canada – the Propel Centre for Population Health Impact – in
applying contribution analysis methods to three case studies as part of developmental work for a
centre-wide evaluation intended to serve learning, improvement, and accountability purposes
(Riley et al. 2017). The study adapts and extends Mayne’s six steps to contribution analysis in
an effort to increase their relevance to evaluating research impacts on public health policy, with
potential application more broadly to evaluating societal impacts of APRCs (Mayne 2012).
Exploring this and other approaches to describing contribution rather than attribution is an
important area for future research to better understand the societal impacts of research centres.
Strengths and limitations
This study engaged a broad group of panelists, from multiple jurisdictions, and with
varying perspectives on APRCs (i.e. prevention researchers, funders and knowledge users
working with APRCs). The response rate between rounds remained high throughout the study.
While the study did not seek to identify a final suite of indicators for assessing APRCs, it did
succeed in describing subsets of indicators considered by panelists as being more important
and/or more feasible than others.
While efforts were made to engage diverse perspectives, there were too few panelists to
meaningfully explore any between-group differences. Participation was particularly low for those
in funding agencies or those collaborating with APRCs as knowledge users. Future studies may
seek to explore these perspectives through alternative qualitative methods, such as in-person
focus groups or key-informant interviews. Such approaches may provide a more engaging forum
for eliciting the opinions of those funding and collaborating with APRCs.
The indicators identified in this study may be used to provide particular insights into the
impact of APRCs. They cannot stand alone, however, especially data on individual indicators. To
more fully understand impact, the indicators identified in this study need to be placed in context
– individually and as a set – which includes the interplay of performance indicators, and the
multi-level influences on APRCs such as broader institutional climates and cultures, local
community settings, and broader socio-economic conditions. In addition, this study has not
considered the time periods over which change in the identified indicators may be expected.
Therefore, some of the indicators identified in this study may be suitable for annual monitoring,
while others may require longer time frames for change to take place. As noted above, a next
step may be to more deeply understand how those leading APRCs, funding APRCs and
collaborating with APRCs interpret and value different measures, for different purposes, and
over different time periods.
Conclusions
Applied prevention research centres are important parts of societal efforts to promote
healthy living and prevent chronic disease. Documenting, describing and comparing the diverse
impacts such centres can have on advancing new knowledge, improving decision making, and
developing capacity, is therefore important. The challenges of measuring these impacts –
particularly in relation to the effects of APRCs on decision making and capacity development –
are shared with other fields of research, such as those focused on public policy, social sciences,
or applied public health. While previous frameworks such as CAHS and Payback are potentially
useful for understanding the impacts of APRCs, no specific set of indicators exists for measuring
the impact of APRCs. As such, this study sought to gather expert perspectives on promising
measurement domains focused on the practical impacts of APRCs. Findings from this work are
consistent with existing research impact frameworks, and highlight measurement domains that
are both important and feasible, including the number of APRC projects driven by explicit policy
needs, the number and quality of knowledge exchange activities, and citations of APRC research
in public policy documents. Findings also suggest important measurement domains that require
future methods refinement, including measures of centre reputation, evidence of contributions to
the field of prevention research, and the influence of the centre’s work over time on the
knowledge, skills and commitment of policy and practice partners. Methods such as Contribution
Analysis hold promise for advancing the measurement of APRC impacts and warrant further
investigation in future empirical studies. Improving our understanding and measurement of the
practical impacts of APRCs is important for both learning and accountability: findings from this
study provide new insights into what such measures may include, and where future effort may be
usefully directed.
REFERENCES
Australian Research Council (2016), 'Excellence in Research for Australia',
<http://www.arc.gov.au/excellence-research-australia>, accessed.
Banzi, R., et al. (2011), 'Conceptual frameworks and empirical approaches used to assess the
impact of health research: an overview of reviews', Health Res Policy Syst, 9, 26.
Boaz, A., Fitzpatrick, S., and Shaw, B. (2009), 'Assessing the impact of research on policy: A
literature review', Science & Public Policy, 36 (4), 255-70.
Bornmann, L. (2013), 'What is societal impact of research and how can it be assessed? A
literature survey', Journal of the American Society for Information Science and
Technology, 64 (2), 217-33.
--- (2016), 'Scientific Revolution in Scientometrics: the Broadening of Impact from Citation to
Societal', in C. R. Sugimoto (ed.), Theories of Informetrics and Scholarly
Communication (Berlin: de Gruyter), 347-59.
Buxton, M. and Hanney, S. (1996), 'How can payback from health services research be
assessed?', J Health Serv Res Policy, 1 (1), 35-43.
Buykx, P., et al. (2012), ''Making evidence count': a framework to monitor the impact of health
services research', Aust J Rural Health, 20 (2), 51-8.
Cohen, G., et al. (2015), 'Does health intervention research have real world policy and practice
impacts: testing a new impact assessment tool', Health Res Policy Syst, 13, 3.
Davies, H., Nutley, S., and Walter, I. (2008), 'Why 'knowledge transfer' is misconceived for
applied social research', J Health Serv Res Policy, 13 (3), 188-90.
Day, J. and Bobeva, M. (2005), 'A generic toolkit for the successful management of Delphi
studies', Electron J Bus Res Methodol, 3, 103-16.
Goodman, C. M. (1987), 'The Delphi technique: a critique', J Adv Nurs, 12 (6), 729-34.
Graham, K.E.R., et al. (2012), 'Evaluating health research impact: Development and
implementation of the Alberta Innovates - Health Solutions impact framework', Research
Evaluation, 21, 354-67.
Greenhalgh, T. and Wieringa, S. (2011), 'Is it time to drop the 'knowledge translation' metaphor?
A critical literature review', J R Soc Med, 104 (12), 501-9.
Greenhalgh, T., et al. (2016), 'Research impact: a narrative review', BMC Med, 14, 78.
Greenlund, K. J. and Giles, W. H. (2012), 'The Prevention Research Centers program: translating
research into public health practice and impact', Am J Prev Med, 43 (3 Suppl 2), S91-2.
Guthrie, S., et al. (2013), 'Measuring research: A guide to research evaluation frameworks and
tools', (Santa Monica, CA: RAND).
Hanney, S. (2005), 'Developing and applying a framework for assessing the payback from
medical research', Kiel Institute's International Research Conference on New Technology
and National Health Systems (Kiel).
Hanney, S., Packwood, T., and Buxton, M. (2000), 'Evaluating the benefits from health research
and development centres', Evaluation, 6 (2), 137-60.
Higher Education Funding Council for England (2014), '2014 REF: Assessment framework and
guidance on submissions. Panel A criteria', (London HEFCE).
Hsu, C.C. and Sandford, B.A. (2007), 'The Delphi Technique: Making Sense Of Consensus',
Practical Assessment, Research and Evaluation, 12 (10), 1-8.
Johnston, R. (1995), 'Research impact quantification', Scientometrics, 34 (3), 415-26.
Keeney, S., Hasson, F., and McKenna, H. P. (2001), 'A critical review of the Delphi technique as
a research methodology for nursing', Int J Nurs Stud, 38 (2), 195-200.
Knoke, D. and Yang, S. (2008), Social Network Analysis (2 edn.; Thousand Oaks: Sage).
Kok, M. O. and Schuit, A. J. (2012), 'Contribution mapping: a method for mapping the
contribution of research to enhance its impact', Health Res Policy Syst, 10, 21.
Mayne, J. (2012), 'Contribution analysis: Coming of age?', Evaluation, 18 (3), 270-80.
Milat, A. J., Bauman, A. E., and Redman, S. (2015), 'A narrative review of research impact
assessment models and methods', Health Res Policy Syst, 13, 18.
Miller, F. A., et al. (2013), 'Do Canadian researchers and the lay public prioritize biomedical
research outcomes equally? A choice experiment', Acad Med, 88 (4), 519-26.
Morton, S. (2015), 'Progressing research impact assessment: A ‘contributions’ approach',
Research Evaluation, 24 (4), 405-19.
Moser, D., Ream, T. C., and Braxton, J. M. (2016), Scholarship reconsidered: Priorities of the
Professoriate (expanded edition) (San Francisco: Jossey-Bass).
Mulligan, J. A. and Conteh, L. (2016), 'Global priorities for research and the relative importance
of different research outcomes: an international Delphi survey of malaria research
experts', Malar J, 15 (1), 585.
Panel on Return on Investment in Health Research (2009), 'Making an Impact: A Preferred
Framework and Indicators to Measure Returns on Investment in Health Research',
(Ottawa: Canadian Academy of Health Sciences (CAHS)).
Penfield, T., et al. (2014), 'Assessment, evaluations, and definitions of research impact: A
review', Research Evaluation, 23, 21-32.
Pollack, H. A. (2011), 'Prevention and public health', J Health Polit Policy Law, 36 (3), 515-20.
Pollitt, A., et al. (2016), 'Understanding the relative valuation of research impact: a best-worst
scaling experiment of the general public and biomedical and health researchers', BMJ
Open, 6 (8), e010916.
Riley, B.L., et al. (2017), 'Adapting and extending contribution analysis methods to evaluate the
impacts of research on three tobacco control policies in Canada', Research Evaluation,
Under Review.
Samuel, G.N. and Derrick, G. E. (2015), 'Societal impact evaluation: Exploring evaluator
perceptions of the characterization of impact under the REF2014', Research Evaluation,
24 (3), 229-41.
Scott, J.E., et al. (2011), 'An evaluation of the Mind Body Interactions and Health Program:
assessing the impact of an NIH program using the Payback Framework', Res Eval, 20 (3),
185-92.
Spaapen, J. and Van Drooge, L. (2011), 'Introducing “Productive Interactions” in Social Impact
Assessment', Research Evaluation, 20 (3), 211-18.
Stuckler, D., Basu, S., and McKee, M. (2011), 'Global health philanthropy and institutional
relationships: how should conflicts of interest be addressed?', PLoS Med, 8 (4),
e1001020.
Upton, S., Vallance, P., and Goddard, J. (2014), 'From outcomes to process: evidence for a new
approach to research impact assessment', Research Evaluation, 23 (4), 352-65.
Van de Ven, A.H. (2007), Engaged Scholarship: A Guide for Organizational and Social
Research (Oxford: Oxford University Press).
Wooding, S., et al. (2014), 'Understanding factors associated with the translation of
cardiovascular research: a multinational case study approach', Implement Sci, 9 (1), 47.