EERQI Final Conference, Brussels, 15-16 March 2011 This project is funded by the Socioeconomic Sciences and Humanities Section. Interrelations Of Indicators Work in Progress Prof. Dr. Stefan Gradmann / Dr. Frank Havemann Humboldt-Universität zu Berlin / Berlin School of Library and Information Science (IBI)

EERQI Final Conference, Brussels, 15-16 March 2011

This project is funded by the Socioeconomic Sciences and Humanities Section.

Interrelations Of Indicators

Work in Progress

Prof. Dr. Stefan Gradmann / Dr. Frank Havemann

Humboldt-Universität zu Berlin / Berlin School of Library and Information Science (IBI)

Overview

• Base Data

• Intrinsic Indicators: Interrelation

• Extrinsic paper data from search engines and social-network services

• Citations in Google Scholar

• Correlation of intrinsic total score with extrinsic scores

Base Data

• Assessments of 179 papers based on intrinsic criteria

• Two files of extrinsic data:

– citation numbers of rated papers obtained with Google Scholar (on March 8, 2011)

– data from search engines and social-network services

• Extrinsic author data suffer from homonymic authors → we only use paper attributes.

• Papers in English and in German distributed over three thematic groups:

– group 1 includes papers about "assessment, evaluation, testing & measurement" (35 / 35)

– group 2 about "comparative and inter-/multicultural education" (33 / 17)

– group 3 about "history and philosophy of education" (34 / 17)

Intrinsic Indicators: Interrelation

• Rigour ratings

– average of nine ratings of different aspects

• Originality ratings

– average of three ratings of different aspects

• Significance ratings

– average of four ratings of different aspects

• Combined rating score for each paper: the average of the ratings of all 16 aspects (total score on a scale from 0 to 7).

• To do: weight the mean ratings of each paper with its number of ratings (→ we need all individual ratings by different persons, which have not been available until now).

• The scatterplots in the three figures of mean scores of rigour, originality, and significance show that the latter two correlate best, especially for English-language papers.
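The score construction above can be sketched as follows. This is a minimal illustration, assuming that the 16 aspect ratings of a paper are available as three lists; the function name and the example values are hypothetical and are not EERQI data. Because the total score averages all 16 aspects, rigour (nine aspects) carries more weight than it would in a plain mean of the three sub-scores.

```python
from statistics import mean

def paper_scores(rigour, originality, significance):
    """Return the three sub-scores and the total score of one paper.

    Assumption: each argument is a list of aspect ratings on the 0-7 scale
    (9 rigour, 3 originality and 4 significance aspects).
    """
    assert len(rigour) == 9 and len(originality) == 3 and len(significance) == 4
    return {
        "rigour": mean(rigour),
        "originality": mean(originality),
        "significance": mean(significance),
        # Total score: mean over all 16 aspects, not over the 3 sub-scores.
        "total": mean(rigour + originality + significance),
    }

# Made-up ratings, for illustration only.
scores = paper_scores([5] * 9, [3] * 3, [4] * 4)
print(scores["total"])  # (9*5 + 3*3 + 4*4) / 16 = 4.375
```

In this toy case the plain mean of the three sub-scores would be 4, while the total score over all 16 aspects is 4.375.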

Originality – Rigour Interrelation

→ Lowest Correlation Strength

Rigour – Significance Interrelation

→ Medium-low Correlation Strength

Originality – Significance Interrelation

→ Maximum Correlation Strength
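The three pairwise comparisons can be quantified with a correlation coefficient. The sketch below uses the Pearson coefficient as one possible measure; the slides do not state which measure underlies the reported correlation strengths, and the values in `means` are made up for illustration.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical mean scores of five papers (not EERQI data).
means = {
    "rigour": [4.0, 5.5, 3.0, 6.0, 4.5],
    "originality": [3.5, 4.0, 4.5, 5.0, 3.0],
    "significance": [3.0, 4.5, 4.0, 5.5, 3.5],
}

# One coefficient per pair of indicators.
pairwise = {
    (a, b): pearson(means[a], means[b]) for a, b in combinations(means, 2)
}
for pair, r in pairwise.items():
    print(pair, round(r, 2))
```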

Intrinsic Indicators: Distributions of Total Scores

• Box-and-whisker plots of distributions of total scores per language and group can be compared in the figures

• We display only distributions of rated papers which also have data from search engines or social-network services.

– The box in each plot contains 50 % of papers around the median (black horizontal line).

– The range of ratings is visualised by the "whiskers".

– Lonely points show outliers (which are more distant from the box than 1.5 times the box's height).
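The elements of such a plot can be computed directly. The following sketch applies the 1.5-times-box-height rule stated above; the quartile convention (`method="inclusive"`) is an assumption, since plotting tools differ in how they interpolate quartiles, and the sample values are made up.

```python
from statistics import quantiles

def box_stats(values):
    """Return quartiles, whisker ends and outliers of a sample."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    height = q3 - q1  # the box contains the middle 50 % of the values
    low_fence = q1 - 1.5 * height
    high_fence = q3 + 1.5 * height
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers reach the most extreme values that are not outliers.
        "whiskers": (min(inside), max(inside)),
        "outliers": outliers,
    }

# Hypothetical total scores (not EERQI data).
stats = box_stats([0.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5])
print(stats)
```

Here the box spans 3.25 to 4.75, so any value below 1.0 or above 7.0 is drawn as a lonely point; only 0.5 qualifies.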

Intrinsic Indicators: Distributions of Total Scores

Extrinsic paper data from search engines and social-network services

• Sources:

– CiteULike, LibraryThing, MendReader

– Google, Metager

• Many papers have hits in only one service.

• To get useful data we apply the in-dubio-pro-reo rule (when in doubt, decide in favour of the paper) and select maximum values.

• We assume that zero hits cannot be used as a valid value of an indicator and thus exclude papers without hits from the analysis.

• The hit distribution of papers with at least one hit is heavily right-skewed: many papers have only a few hits and only a few papers have many hits. We therefore use the logarithm of hit numbers as a more adequate representation.
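The selection and transformation steps above can be sketched as follows. This is a minimal illustration under two assumptions that the slides do not confirm: the maximum is taken over the services of one kind (here social-network services), and the logarithm is to base 10. The paper identifiers and hit numbers are made up.

```python
from math import log10

def log_hit_score(hits_by_service):
    """Return log10 of the maximum hit number, or None if there are no hits."""
    best = max(hits_by_service.values(), default=0)
    # Zero hits are not treated as a valid indicator value.
    return log10(best) if best > 0 else None

# Hypothetical hit numbers per paper and service (not EERQI data).
papers = {
    "paper_a": {"CiteULike": 10, "LibraryThing": 3},
    "paper_b": {"CiteULike": 0, "LibraryThing": 0},
}

hit_scores = {}
for paper_id, hits in papers.items():
    score = log_hit_score(hits)
    if score is not None:  # papers without hits are excluded
        hit_scores[paper_id] = score

print(hit_scores)  # {'paper_a': 1.0}
```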

Extrinsic paper data from social-network services

Extrinsic paper data from search engines: similar to social networks

Extrinsic paper data from search engines and social-network services

• All papers with social-network hits also have search engine hits.

• Both hit numbers correlate

– quite well in each of the three groups for papers in English

– and less well for papers in German
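A comparison per language and thematic group amounts to computing one coefficient for each sub-sample. The sketch below shows this stratification with a Pearson coefficient on log-transformed hit numbers; the choice of coefficient, the log base, and all values in `records` are assumptions made for illustration.

```python
from collections import defaultdict
from math import log10, sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# (language, group, social-network hits, search-engine hits); made-up values.
records = [
    ("en", 1, 2, 40), ("en", 1, 5, 90), ("en", 1, 9, 300),
    ("de", 1, 1, 50), ("de", 1, 4, 20), ("de", 1, 6, 80),
]

# Collect the log hit numbers of each (language, group) sub-sample.
by_stratum = defaultdict(list)
for lang, group, social, search in records:
    by_stratum[(lang, group)].append((log10(social), log10(search)))

correlations = {}
for key, pairs in by_stratum.items():
    xs, ys = zip(*pairs)
    correlations[key] = pearson(xs, ys)

for key, r in correlations.items():
    print(key, round(r, 2))
```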

Extrinsic paper data from search engines and social networks: English

Extrinsic paper data from search engines and social networks: German

Citations in Google Scholar: citation distributions for samples of the three groups

• Not all papers are listed in Google Scholar.

• Only a few papers in German are in the sample. We omit them.

• Here the y-axis shows the binary (base-2) logarithm of the number of citations plus 1. Adding 1 is a common bibliometric device for including papers without citations in the analysis of log-values. It can be justified with the argument that publishing a new result is its first citation.
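The transformation is small enough to state directly. The function name below is hypothetical; the formula is the base-2 logarithm of the citation number plus 1 described above.

```python
from math import log2

def citation_score(citations):
    """Return log2(citations + 1), so that uncited papers map to 0."""
    return log2(citations + 1)

for c in (0, 1, 3, 7):
    print(c, citation_score(c))  # 0 -> 0.0, 1 -> 1.0, 3 -> 2.0, 7 -> 3.0
```

Without the added 1, papers with zero citations would have no finite log-value and would drop out of the plot.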

Citations in Google Scholar: The total scores (mean ratings)

• Note that the first (red) group is rated best but cited least (in contrast to the results for search engines and social-network services, where, for papers in English, ratings and hit numbers seem to correlate at the aggregated level of thematic groups).

• Extrinsic author data remain a to-do: an effective method for disambiguating authors is needed first.

Correlation?

Correlation of intrinsic total score with extrinsic scores

• As the scatterplots show, hit numbers in social networks and in search engines do not correlate with the intrinsic total score.

Correlation of intrinsic total score with extrinsic scores: social networks

Correlation of intrinsic total score with extrinsic scores: social networks

Correlation of intrinsic total score with extrinsic scores: search engine data

Correlation of intrinsic total score with extrinsic scores: search engine data

Correlation of intrinsic total score with extrinsic scores

• The same is true for citations of papers in English drawn from Google Scholar.

Conclusion

• As a consequence, any correlation analysis (including rank correlation) of these intrinsic and extrinsic paper data does not make sense …

• … as long as such an analysis is based on paper attributes exclusively!

• → Effective author name disambiguation and disciplinary allocation are key

• Preliminary results do not yet invalidate the correlation methodology …

• … but they are revealing in terms of source data quality!

• How to understand the variance among the sub-samples?