Usage Factor: a new metric to measure journal quality
Jayne Marks, Co-Chair, Usage Factor Project
UKSG, March 27, 2012
Usage Factor – a COUNTER project
Usage Factor: agenda
• Why usage-based measures?
• Strengths and weaknesses of the Impact Factor
• Background and aims of the Usage Factor project
• The story so far:
  • Stage 1 – market research
  • Stage 2 – modelling and analysis
• Results
• Recommendations
• Next steps
Usage Factor: background
Why usage-based measures?
• A growing body of reliable journal usage statistics
• The burgeoning availability of reliable usage data for online journals has opened the door to usage-based measures of impact, value and status.
• Since 2002 COUNTER has provided a global standard for usage statistics
• A complement to citation-based measures
• Impact Factors, based on citation data, have become generally accepted as a valid measure of the impact and status of scholarly journals
• Widely used by publishers, authors, funding agencies and librarians
• There are misgivings about an over-reliance on Impact Factor alone
• The online availability of content, combined with reliable COUNTER-compliant online usage statistics, raises the possibility of a parallel usage-based measure of journal performance becoming a viable additional metric: the Usage Factor for journals
Usage Factor - background
Journal Impact Factors: strengths
• well-established, widely recognised, accepted and understood
• difficult to defraud
• endorsed by funding agencies and scientists
• simple and accessible quantitative measure
• independent
• global
• journals covered are measured on the same basis
• comparable data available over a period of decades
• broadly reflect the relative scientific quality of journals in a given field
• faults are generally known
Usage Factor: background
Journal Impact Factors: weaknesses
• bias towards US journals; non-English-language journals poorly covered
• optimized for the biomedical sciences; works less well in other fields
• as an arithmetic mean, a journal IF can be dramatically influenced by one highly cited article
• can be manipulated by, e.g., self-citation
• over-used, mis-used and over-interpreted
• formula is flawed; two-year time window too short for most fields
• only a true reflection of the value of a journal in pure research fields
• impact of practitioner-oriented journals is understated
• does not cover all fields of scholarship
• over-emphasis on IF distorts the behaviour of authors and publishers
• time-lag before IFs are calculated and reported; new journals have no IF
• emphasis on IF masks the great richness of citation data
Usage Factor: background
Usage Factor: providing a new perspective
UF is a complementary measure that will compensate for the weaknesses of Impact Factors in several important ways:
• UFs will be available for a much larger number of journals
• coverage of all fields of scholarship that have online journals
• impact of practitioner-oriented journals is better reflected in usage
• usage is recorded and reported immediately upon publication of an article
• availability of UF will reduce the current over-emphasis on IFs
• authors would welcome a usage-based measure for journals
Who will benefit from the Usage Factor?
Four major groups will benefit from the introduction of Usage Factors:
• Authors, especially those in practitioner-oriented fields, where citation-based measures understate the impact of journals, as well as those in areas outside the core STM fields of pure research, where coverage of journals by citation-based measures is weak.
• Publishers, especially those with large numbers of journals outside the core STM research areas, where there is no reliable, universal measure of journal impact because citation-based measures are either inadequate or non-existent for these fields.
• Librarians, when deciding on new journal acquisitions, have no reliable, global measures of journal impact for fields outside the core STM research fields. They would use usage-based measures to help them prioritise journals to be added to their collections.
• Research Funding Agencies, who are seeking a wider range of credible, consistent quantitative measures of the value and impact of the outputs of the research that they fund.
Usage Factor Project - aims and objectives
The overall aim of this project was to explore how online journal usage statistics might form the basis of a new measure of journal impact and quality, the Usage Factor for journals. Specific objectives were to answer the following questions:
• Will Usage Factor be a statistically meaningful measure?
• Will Usage Factor be accepted by researchers, publishers, librarians and research institutions?
• Will Usage Factor be statistically credible and robust?
• Is there an organizational and economic model for its implementation that would be cost-effective and acceptable to the major stakeholder groups?
The project is being carried out in three stages:
• Stage 1 (2007-2008): market research
• Stage 2 (2009-2011): modelling and analysis
• Stage 3 (2011-2012): further tests based on a draft Code of Practice
Usage Factor Stage 1 – market research
Interviews with 29 key authors/editors, librarians and publishers; web-based survey of 155 librarians; web-based survey of 1400 authors. Key findings:
• the majority of publishers were supportive of the UF concept, appeared to be willing, in principle, to participate in the calculation and publication of UFs, and were prepared to see their journals ranked according to UF
• the great majority of authors in all fields of academic research would welcome a new, usage-based measure of the value of journals
• UF, were it available, would be ranked highly by librarians, not only in the evaluation of journals for potential purchase, but also in the evaluation of journals for retention or cancellation
• COUNTER was on the whole trusted by librarians and publishers and was seen as having a role in the development and maintenance of the UF
Usage Factor Stage 2 – modelling and analysis
• Real journal usage data analysed by John Cox Associates, Frontline GMS and CIBER
• Participating publishers:
  • American Chemical Society
  • Emerald
  • IOP
  • Nature Publishing
  • OUP
  • Sage
  • Springer
The data
• 326 journals
  • 38 Engineering
  • 32 Physical Sciences
  • 119 Social Sciences
    • 29 Business and Management
  • 35 Humanities
  • 102 Medicine and Life Sciences
    • 57 Clinical Medicine
• c.150,000 articles
Recommendations: the metric
• Usage Factors should be calculated using the median rather than the arithmetic mean
  • Usage data are highly skewed; most items attract relatively low use and a few are used many times. As a result, the use of the arithmetic mean is not appropriate
• A range of Usage Factors should ideally be published for each journal: a comprehensive UF (all items, all versions) plus supplementary factors for selected items
  • There is considerable variation in the relative use made of different article types and versions. This means that the UF will be affected significantly by the particular mix of items included in a given journal, all other things being equal
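The effect of skew on the two averages can be illustrated with a toy example (hypothetical download counts, not project data):

```python
import statistics

# Hypothetical per-item download counts for one journal: most items
# attract modest use, while a couple of popular items dominate the total.
downloads = [3, 5, 6, 8, 9, 11, 12, 15, 400, 950]

mean_uf = statistics.mean(downloads)      # dragged upwards by the two outliers
median_uf = statistics.median(downloads)  # reflects the 'typical' item

print(round(mean_uf), median_uf)   # → 142 10.0
```

Here the mean suggests a typical item is downloaded over a hundred times, while the median shows that half the items were downloaded ten times or fewer — which is why a median-based UF is harder for a few blockbuster (or gamed) items to distort.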
Recommendations: the metric
• Usage Factors should be published as integers with no decimal places
  • Monthly patterns of use at the item level are quite volatile, and usage factors therefore include a component of statistical noise
• Usage Factors should be published with appropriate confidence levels around the average to guide their interpretation
  • As a result of this statistical noise, the mean usage factor should be interpreted within intervals of plus or minus 22 per cent. (Amin and Mabe* have pointed out that ISI Journal Impact Factors are subject to similar statistical factors, and recommend that Impact Factors differing by less than 25% be treated as belonging to the same rank.)

* Amin, M. and Mabe, M., Impact Factors: Use and Abuse, Elsevier, Perspectives in Publishing, October 2000. http://www.elsevier.com/framework_editors/pdfs/Perspectives1.pdf
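As a rough illustration of how a confidence band around a journal's UF might be derived (hypothetical data and method, not the project's own ±22% calculation), a percentile bootstrap around the median could look like this:

```python
import random
import statistics

random.seed(42)

# Simulated per-item downloads for one journal: heavy-tailed,
# like real usage data (hypothetical, for illustration only).
downloads = [int(random.lognormvariate(3.0, 1.2)) for _ in range(400)]

def bootstrap_median_ci(data, n_boot=2000, alpha=0.05):
    """Percentile-bootstrap confidence interval for the median:
    resample the items with replacement many times and take the
    middle (1 - alpha) share of the resulting medians."""
    medians = sorted(
        statistics.median(random.choices(data, k=len(data)))
        for _ in range(n_boot)
    )
    lo = medians[int(n_boot * alpha / 2)]
    hi = medians[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi

uf = round(statistics.median(downloads))   # UF reported as an integer
lo, hi = bootstrap_median_ci(downloads)
print(f"UF = {uf}, 95% interval ~ [{round(lo)}, {round(hi)}]")
```

The point is simply that the noise band can be published alongside the integer UF, so that two journals whose intervals overlap are not over-interpreted as differently ranked.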
Recommendations: the metric
• The Usage Factor should be calculated initially on the basis of a maximum usage time window of 24 months
  • It might be helpful later on to consider a 12-month window as well (or possibly even a 6-month window) to provide further insights
  • This study shows that relatively short time windows capture a substantial proportion of the average lifetime interest in full journal content
  • Windows longer than 24 months are not recommended; 24 months should be considered a maximum
  • There is possibly a case for considering a 12-month window, but there are counter-arguments: on average only 25-30% of lifetime usage takes place during the first twelve months, compared with 60-65% during the first 24 months
Recommendations: the metric
• The Usage Factor is not directly comparable across subject groups and should therefore be published and interpreted only within appropriate subject groupings
  • Usage, in months 1-12 especially, follows different patterns in different subject areas
• The Usage Factor should be calculated using a publication window of 2 years
  • Usage factors will tend to inflate across the board year-on-year as a result of many factors, including greater item discoverability through search engines and gateways. The use of a two-year publication window would ameliorate some of these effects by providing a moving average as well as a greater number of data points for calculating the usage factor
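The two-year publication window can be sketched as follows — a minimal, hypothetical example (invented download counts; `usage_factor` is an illustrative helper, not the project's specification) in which the UF is the median of per-item downloads pooled across two publication years:

```python
import statistics

# Hypothetical per-item download counts (within the usage window),
# keyed by publication year. In practice these would come from
# COUNTER-compliant usage reports.
downloads_by_pub_year = {
    2009: [12, 40, 7, 95, 3, 22, 18],
    2010: [30, 5, 61, 14, 9, 120, 2, 44],
}

def usage_factor(downloads_by_year, pub_years):
    """Median downloads per item over a multi-year publication window.

    Pooling two publication years acts as a moving average and supplies
    more data points, as the recommendation suggests.
    """
    pooled = [n for year in pub_years for n in downloads_by_year[year]]
    return round(statistics.median(pooled))

print(usage_factor(downloads_by_pub_year, [2009, 2010]))   # → 18
```

Next year's UF would pool 2010 and 2011 instead, so year-on-year inflation affects adjacent UFs gradually rather than abruptly.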
Recommendations: the metric
• There seems to be no reason why ranked lists of journals by usage factor should not gain acceptance
  • The usage factor delivers journal rankings that are comparable, in terms of their year-on-year stability, with those generated from citation metrics such as the ISI impact factor and SNIP
• Small journals and titles with fewer than 100 downloads per item are unsuitable candidates for Journal Usage Factors: these are likely to be inaccurate and easily gamed
  • Usage factors below a certain threshold value (perhaps 100, but research is needed on a larger scale to explore this further) are likely to be inaccurate due to statistical noise. The size of the journal should also be taken into account
Recommendations: the metric
• The Usage Factor provides very different information from the citation Impact Factor, and this fact should be emphasised in public communications
  • The usage factor does not appear to be statistically associated with measures of citation impact
• Further work is needed on usage factor gaming and on developing robust forensic techniques for its detection
  • Attempts to game the usage factor are highly likely
  • CIBER's view is that the real threat comes from software agents rather than human attack
  • The first line of defence has to be making sure that COUNTER protocols are robust against machine attack
  • A second line of defence would be to develop statistical forensics to identify suspicious behaviour, whether human or machine in origin
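One toy illustration of such statistical forensics (a hypothetical sketch, not a technique from the project) is to flag months whose downloads sit far above a journal's typical level, using a robust z-score so that the spike itself cannot hide by inflating the average:

```python
import statistics

def flag_suspicious(monthly_counts, threshold=3.5):
    """Return indices of months with abnormally high usage.

    Uses a robust z-score based on the median absolute deviation (MAD);
    only upward deviations are flagged, since gaming inflates usage.
    """
    med = statistics.median(monthly_counts)
    mad = statistics.median(abs(x - med) for x in monthly_counts)
    if mad == 0:
        return []  # perfectly flat usage: nothing stands out
    return [
        i for i, x in enumerate(monthly_counts)
        if 0.6745 * (x - med) / mad > threshold
    ]

# Eleven ordinary months, then a ~30x spike in the final month.
usage = [110, 95, 102, 120, 98, 105, 99, 111, 103, 97, 100, 3000]
print(flag_suspicious(usage))   # → [11]
```

Real forensics would need to distinguish legitimate spikes (press coverage, course adoption) from robot-driven ones, so a flag like this would trigger investigation rather than automatic exclusion.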
Recommendations: the metric
• Further work is needed to broaden the scope of the project over time to include other usage-based metrics
  • Although the scope of this study was to consider the journal Usage Factor only, future work could look at other indicators that capture other aspects of online use, such as a 'journal usage half-life' or a 'reading immediacy index'
Recommendations: infrastructure
• Development of systems to automate the extraction and collation of data needed for UF calculation is essential if calculation of this metric is to become routine
• Development of an agreed standard for content item types, to which journal specific item types would be mapped, is desirable as it would allow for greater sophistication in UF calculation
• Development or adoption of a simple subject taxonomy to which journal titles would be assigned by their publishers
• Publishers should adopt standard “article version” definitions based on ALPSP/NISO recommendations
Next steps: Stage 3
Objectives
• Publication of a draft Code of Practice for the Usage Factor
• Further testing of the recommended methodology for calculating Usage Factors for journals
• Investigation of an appropriate, resilient subject taxonomy for the classification of journals
• Exploration of the options for an infrastructure to support the sustainable implementation of Usage Factor
• Investigation of the feasibility of applying the Usage Factor concept to other categories of publication in addition to journals
Usage Factor
The draft Code of Practice
• The Code of Practice will be consistent with COUNTER and will provide:
  • A list of definitions and other terms that are relevant to Usage Factor
  • A methodology for the calculation of Usage Factor as a median value, including specifications for the metadata to be recorded, the content types and article versions whose usage may be counted, as well as the Publication Period and Usage Period to be used
  • Specifications for the reporting of the Usage Factor
  • Data processing rules to ensure that Usage Factors are credible, consistent and compatible, including protocols for identifying and dealing with attempts to game the Usage Factor
  • Specifications for the independent auditing of Usage Factors
  • A description of the role of the Central Registry for Usage Factors in the consolidation of usage data and in the publication of Usage Factors
• The draft Code of Practice will be published in Q1 2012
• Publishers are being invited to prepare UFs using the draft Code of Practice
Usage Factor: organization
Project Co-Chairs:
• Jayne Marks, Wolters Kluwer, USA
• Hazel Woodward, Cranfield University, UK
International Advisory Board Members:
• Mayur Amin, Elsevier, UK
• Kim Armstrong, CIC Center for Library Initiatives, USA
• Peter Ashman, BMJ Group, UK
• Terry Bucknell, University of Liverpool, UK
• Ian Craig, Wiley, UK
• Joanna Cross, Taylor & Francis, UK
• David Hoole, Nature Publishing Group, UK
• Tim Jewell, University of Washington, USA
• Jack Ochs, ACS Publications, USA
• Tony O’Rourke, IOP Publishing, UK
• Clive Parry, Sage Publications, UK
• Jason Price, Claremont College, USA
• Ian Rowlands, CIBER, UK
• Bill Russell, Emerald, UK
• Ian Russell, Oxford University Press, UK
• John Sack, HighWire Press, USA
• David Sommer, COUNTER, UK
• Harald Wirsching, Springer, Germany
Project Director:
• Peter Shepherd, COUNTER
Usage Factor project
Acknowledgements
• Major funding from UKSG and RIN
• Additional funding from:
  • ALPSP
  • American Chemical Society
  • STM
  • Nature Publishing Group
  • Springer
• Usage data for Stage 2 analysis provided by the American Chemical Society, Emerald, IOP, Nature Publishing Group, OUP, Sage, Springer
For further information
Access the full report on Stages 1 and 2 of the Usage Factor project, as well as information on the progress of Stage 3, on the COUNTER website at: http://www.projectcounter.org/usage_factor.html