Usage of grey literature in open archives

Page 1: Usage of grey literature  in open archives

GL11 – December 14-15, 2009

Usage of grey literature in open archives

J. Schöpfel (University of Lille 3)

C. Boukacem-Zeghmouri (University of Lille 3)

H. Prost (INIST-CNRS)

Page 2: Usage of grey literature  in open archives

Size of repositories and total number of items

[Figure: logarithmic scale from 1 to 10,000,000 items, plotted over repositories ranked 1 to 133. Labels: Total number of items (cumulated); Number of items in archives (ranking); 2008; 2009]

Page 3: Usage of grey literature  in open archives

Content evolution

1.87m items

= x2.7 since 2008

Representativity 10% (?)

Share of GL unchanged (17%)

But: +200,000 new grey items

[Pie chart legend: Other*, nd; Articles; Grey literature; Datasets]

* = heritage, books…
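
The figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes that the growth factor applies to the total number of items and that the 17% grey literature share holds for both 2008 and 2009 (the slide only states that the share is unchanged). All variable names are illustrative, and the inputs are the rounded values from the slide.

```python
# Cross-check of the slide's figures (rounded inputs, so results are approximate).
total_2009 = 1_870_000   # total items in 2009, from the slide
growth_factor = 2.7      # growth of the total since 2008, from the slide
grey_share = 0.17        # share of grey literature, ASSUMED identical in both years

total_2008 = total_2009 / growth_factor
grey_2009 = total_2009 * grey_share
grey_2008 = total_2008 * grey_share
new_grey_items = grey_2009 - grey_2008

print(f"Estimated total items in 2008: {total_2008:,.0f}")
print(f"Estimated grey items in 2009:  {grey_2009:,.0f}")
print(f"Estimated new grey items:      {new_grey_items:,.0f}")
```

Under these assumptions the estimate comes out at roughly 200,000 new grey items, in line with the figure given on the slide.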

Page 4: Usage of grey literature  in open archives

GL document types

Conferences

Reports

ETD

Working papers

Other

Courseware

Page 5: Usage of grey literature  in open archives

Repository type and presence of grey literature

[Figure: bar chart (axis 0 to 90) of repositories with grey literature (yes) and without (no), by repository type: Institutional, Doc-type, Subject-based, Other]

= 74% of all repositories contain GL

(and 93% of IR)

Page 6: Usage of grey literature  in open archives

Size of repository and number of grey items

[Figure: two charts. First chart (standard scores), labelled repositories: HAL, HAL SHS, PERSEE, IRD, INRA. Second chart, axes 0 to 70,000 and 0 to 300,000, series 2008 and 2009, labelled repositories: HAL, INRA, PERSEE, HAL SHS, IRD, TEL, I-Revues, HAL-INRIA]

Page 7: Usage of grey literature  in open archives

Quality improvement

[Figure: bar chart (axis 0% to 70%) comparing 2008 and 2009 for two criteria: Metadata, Validation]

Slightly more archives with specific metadata for grey items

Significantly more archives with some kind of content validation and/or quality control

Page 8: Usage of grey literature  in open archives

Access to full text…

[Pie chart values: 53%, 38%, 9%. Legend: All items, Restricted, NA]

(+ 5%)

Page 9: Usage of grey literature  in open archives

… but items without full text

Half of all open archives contain bibliographic records that don’t link to the document

The share of these records varies from 5% to 90%

Overall share of records without full text: 16%

Page 10: Usage of grey literature  in open archives

Usage statistics of GL

[Figure: Average downloads per document type (axis 0 to 120)]

Importance of grey literature: 2.2 (ETD)

University of Toulouse (OATAO)

Page 11: Usage of grey literature  in open archives

Usage statistics of GL

Average downloads per document type

Importance of grey literature:

4.7 - 7 (ETD)

1.4 - 3 (reports)

1.3 (conferences)

IFREMER (Archimer)

Page 12: Usage of grey literature  in open archives

Usage statistics of GL

Importance of grey literature: 1.7 - 5 (working papers)

[Figure: Average downloads per document type, 1998 to 2009 (axis 0 to 20). Series: Working papers, Articles]

RePEc

Page 13: Usage of grey literature  in open archives

Problems

Cumulative statistics

No history

No details (formats, …)

No specific information on GL

[Figure: chart for HAL (axis 0 to 7000)]

Without metadata, no statistics

Page 14: Usage of grey literature  in open archives

Metadata

On the one hand:

- Difficulties in identifying the types of documents.

- Documents are only marked as "published" or "unpublished".

- No count of results.

Page 15: Usage of grey literature  in open archives

Metadata

On the other hand:

- Query by: author’s affiliation, scientific department, research theme, document type, keywords

- Selection by date

- Selection by full text or not

- Ranking of results

Page 16: Usage of grey literature  in open archives

(Figures in parentheses refer to the 7-day period ending 16-Nov-2009 00:00).

Successful requests: 132,810 (6,975)
Average successful requests per day: 914 (996)
Successful requests for pages: 132,810 (6,975)
Average successful requests for pages per day: 914 (996)
Failed requests: 84 (0)
Distinct files requested: 530 (526)
Distinct hosts served: 40,015 (3,609)
Corrupt logfile lines: 55
Unwanted logfile entries: 2,743,109
Data transferred: 172.81 gigabytes (9.18 gigabytes)
Average data transferred per day: 1.19 gigabytes (1.31 gigabytes)

INP Toulouse

Log analysis (1)

Diversity of tools (Google Analytics / Sitemap, Webalizer Xtended, AWStats, HAL, PhpMyVisite, Analog …)
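
The INP Toulouse summary on this slide can be sanity-checked with simple arithmetic. The sketch below is only an illustration: the numbers are copied from the slide, the variable names are invented, and the length of the logging period is inferred from the averages rather than stated in the source.

```python
# Figures copied from the log summary (values for the last 7 days where noted).
total_requests = 132_810
avg_requests_per_day = 914
requests_last_7_days = 6_975
total_gigabytes = 172.81
avg_gigabytes_per_day = 1.19

# Weekly average: should reproduce the reported 996 requests per day.
weekly_avg = requests_last_7_days / 7

# Length of the whole logging period implied by the overall averages (inferred).
days_from_requests = total_requests / avg_requests_per_day
days_from_volume = total_gigabytes / avg_gigabytes_per_day

print(f"Average requests per day, last 7 days: {weekly_avg:.0f}")
print(f"Implied period (requests): {days_from_requests:.1f} days")
print(f"Implied period (data volume): {days_from_volume:.1f} days")
```

Both estimates point to a logging period of about 145 days, which suggests the summary covers a little under five months of traffic.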

Page 17: Usage of grey literature  in open archives

Log analysis (2)

reqs: %bytes: last time: file

• 2044: 0.40%: 29/Nov/09 22:53: García Martinez (2009) Development and validation of the Euler-Lagrange formulation on a parallel a...
• 1283: 0.26%: 29/Nov/09 22:53: Delgado Zambrano (2009) Bioréacteur à membrane externe pour le traitement d'effluents contenant de...
• 1115: 0.30%: 29/Nov/09 22:53: Sepret (2009) Application de la PIV sur traceurs fluorescents à l'étude de l'entraîneme...
• 1063: 0.21%: 29/Nov/09 22:53: Nerisson (2009) Modélisation du transfert des aérosols dans un local ventilé.
• 1057: 0.95%: 29/Nov/09 17:34: Delabrouille (2004) Caractérisation par MET de fissures de corrosion sous contrainte d'alliages...
• 1029: 0.14%: 29/Nov/09 22:53: Rajsiri (2009) Knowledge-based system for collaborative process specification.
• 1014: 0.79%: 29/Nov/09 22:10: Delay (2005) Analyse des écoulements transitoires dans les systèmes d'injection directe...
• 984: 0.88%: 29/Nov/09 22:55: Geneau (2006) Procédé d'élaboration d'agromatériau composite naturel par extrusion biv...

INP Toulouse
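
Request reports of this kind are easy to turn into structured data for further analysis. The following is a minimal sketch that assumes every entry follows the "reqs: %bytes: last time: file" layout shown on the slide; `RequestLine` and `parse_request_line` are hypothetical names, and the timestamp is kept as plain text to avoid locale-dependent date parsing.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    requests: int         # number of requests for the file
    bytes_percent: float  # share of transferred bytes, in percent
    last_time: str        # timestamp of the last request, kept as text
    title: str            # file description as shown in the report


def parse_request_line(line: str) -> RequestLine:
    """Parse one 'reqs: %bytes: last time: file' entry (layout assumed from the slide)."""
    # Drop a leading bullet and whitespace, then split on the first three ': ' separators.
    reqs, pct, last_time, title = line.lstrip("• ").strip().split(": ", 3)
    return RequestLine(int(reqs), float(pct.rstrip("%")), last_time, title)


sample = [
    "•2044: 0.40%: 29/Nov/09 22:53: García Martinez (2009) Development and "
    "validation of the Euler-Lagrange formulation on a parallel a...",
    "•1029: 0.14%: 29/Nov/09 22:53: Rajsiri (2009) Knowledge-based system "
    "for collaborative process specification.",
]
rows = [parse_request_line(line) for line in sample]
for row in rows:
    print(row.requests, row.bytes_percent, row.last_time, row.title[:40])
```

Splitting at most three times keeps any later ": " inside the title intact, which matters for thesis titles that contain a subtitle.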

Page 18: Usage of grey literature  in open archives

Log analysis (3)

Pastel ParisTech

Access to website: search engines, geographical origin, strategies, etc.

On-site behaviour: bouncing, downloading, duration, domains, etc.

Page 19: Usage of grey literature  in open archives

Towards standardization: PIRUS (JISC)

Publisher and Institutional Repository Usage Statistics

For authors and institutions

Article level (DOI)

COUNTER compliant

XML prototype

Article Report 1: <title>Number of Successful Full-Text Article Requests by Month and DOI</title>
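
The idea behind Article Report 1 (successful full-text requests counted by month and DOI) can be illustrated with a few lines of Python. This is a sketch only: it does not reproduce the PIRUS XML prototype, COUNTER compliance involves further processing rules that are not shown here, and the events and DOIs below are hypothetical examples.

```python
from collections import Counter
from datetime import date

# Hypothetical request events: (date of request, DOI, request succeeded?).
events = [
    (date(2009, 10, 3), "10.1234/example.001", True),
    (date(2009, 10, 17), "10.1234/example.001", True),
    (date(2009, 10, 20), "10.1234/example.002", False),  # failed, not counted
    (date(2009, 11, 2), "10.1234/example.001", True),
    (date(2009, 11, 9), "10.1234/example.002", True),
]


def article_report(events):
    """Count successful full-text requests per (month, DOI)."""
    counts = Counter()
    for day, doi, success in events:
        if success:
            counts[(day.strftime("%Y-%m"), doi)] += 1
    return counts


report = article_report(events)
for (month, doi), n in sorted(report.items()):
    print(month, doi, n)
```

Keying the counter on the (month, DOI) pair means the same table can be pivoted either by article or by month without recounting the raw events.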

Page 20: Usage of grey literature  in open archives

Towards standardization: PIRUS 2 (JISC)

COUNTER standards & PIRUS results

Different "Article Reports" (core set of standard usage statistics reports)

Open Source software for production and sharing of usage statistics at article (item) level for OA

Cost analysis

Final report in December 2010

Page 21: Usage of grey literature  in open archives

Towards standardization: OA-Statistik (DINI)

For authors (usage follow-up), readers-scientists (relevance, alert), institutions (impact)

Article level (= document)

Tools for transfer/sharing (network)

Added-value services

Page 22: Usage of grey literature  in open archives

Towards standardization: other websites, projects

LogEc http://logec.repec.org/ Usage statistics of the RePEc repository

IFABC http://www.ifabc.org/ Definition of usage metrics (user, visit…)

SURF http://www.surffoundation.nl/nl/projecten/Pages/SURE.aspx Aggregation of log files

JISC Usage statistics review http://ie-repository.jisc.ac.uk/250/ Proposal of standard

Page 23: Usage of grey literature  in open archives

Recommendations (1)

Recipients: authors, users, institutions

COUNTER principle: different levels, with a basic minimum level (AR1)

Selection of minimum elements for a basic log analysis (who, what, request type, when, identifier)
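
The five minimum elements listed above can be captured in a small record type. The sketch below is an assumption about how such a record might look, not a format proposed in the presentation: `LogEntry` and `from_raw` are hypothetical names, and the meaning given to each field in the comments is one possible reading of the slide.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LogEntry:
    who: str           # requester, e.g. an anonymised host or session id (assumed)
    what: str          # what was requested, e.g. "fulltext" or "record" (assumed)
    request_type: str  # e.g. the HTTP method (assumed)
    when: datetime     # time of the request
    identifier: str    # identifier of the item (DOI, handle, repository id ...)


REQUIRED = ("who", "what", "request_type", "when", "identifier")


def from_raw(raw: dict) -> LogEntry:
    """Build a LogEntry, rejecting rows that lack one of the minimum elements."""
    missing = [key for key in REQUIRED if not raw.get(key)]
    if missing:
        raise ValueError(f"missing elements: {', '.join(missing)}")
    return LogEntry(
        who=raw["who"],
        what=raw["what"],
        request_type=raw["request_type"],
        when=datetime.fromisoformat(raw["when"]),
        identifier=raw["identifier"],
    )


entry = from_raw({
    "who": "host-42",
    "what": "fulltext",
    "request_type": "GET",
    "when": "2009-11-16T00:00:00",
    "identifier": "hypothetical-item-0001",
})
print(entry)
```

Rejecting incomplete rows at this stage keeps later reports comparable across repositories, since every counted request carries the same five elements.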

Page 24: Usage of grey literature  in open archives

Recommendations (2)

Definition of elements and terminology (access, downloading, visit, request, hit…)

Periodicity (monthly) and delay (30 days)

Distinction full text / records

Page 25: Usage of grey literature  in open archives

Recommendations (3)

Added-value services*:

Modular statistics (collections, document types, time period, etc.)

Summary tables

Assistance-help / FAQ

Link with other tools measuring the impact of deposited items (citations, tagging etc.)

(…)

* see PLoS http://article-level-metrics.plos.org/

Page 26: Usage of grey literature  in open archives

Forthcoming

2010 IRIS case study (Lille 1)

2010 Final report of DUAO-F project

2010 Study on search engines

??? Partnership with JISC/COUNTER and DINI

??? Project with CCSD and/or COUPERIN

Page 27: Usage of grey literature  in open archives

Thank you!

Joachim Schöpfel, University Charles de Gaulle Lille 3

[email protected] ++ (0) 33 688 35 01 47

Chérifa Boukacem-Zeghmouri, University Charles de Gaulle Lille 3

[email protected] ++ (0) 33 620 62 18 12

Hélène Prost, INIST-CNRS

[email protected] ++33 (0) 383 50 47 12