
Learned Publishing, 23: 9–14
doi:10.1087/20100103

LEARNED PUBLISHING VOL. 23 NO. 1 JANUARY 2010

CASE STUDY

CrossCheck: an effective tool for detecting plagiarism

Helen (Yuehong) ZHANG
Zhejiang University Press

ABSTRACT. The plagiarism detection service CrossCheck has been used since October 2008 as part of the paper reviewing process for the Journal of Zhejiang University – Science (A & B). Between October 2008 and May 2009, 662 papers were CrossChecked; 151 of these (around 22.8% of submitted papers) were found to contain apparently unreasonable levels of copying or self-plagiarism, and 25.8% of these cases (39 papers) gave rise to serious suspicions of plagiarism and copyright infringement. Four types of copying or plagiarism were identified, in an attempt to reach a consensus on this type of academic misconduct.

© Zhang Yuehong 2010

Introduction

CrossCheck (http://www.crossref.org/CrossCheck) is an international project intended to help publishers cope with the increasingly high incidence of plagiarism.1 CrossCheck helps to protect the original authors’ copyrights, and helps to improve authors’ behaviour by identifying instances of academic plagiarism. It is led by the Publishers International Linking Association (CrossRef); several global publishing groups are participating.2 In 2008 CrossCheck won the ALPSP Award for Publishing Innovation.3

In October 2008 the Journal of Zhejiang University – Science (A & B),3 which is supported by the National Natural Science Foundation of China, became the first member of CrossCheck in China.4 CrossCheck is used as part of the journal’s review process. Each paper is CrossChecked twice: the first check takes place before it is sent to international reviewers; a second check takes place just before ‘online-first’ publication, to ensure that no potential plagiarism is missed owing to the inevitable time-lag in updating the CrossCheck database. The date of the latest CrossCheck is included on the first page of each journal paper (Figure 1) for the information of readers, authors, and databases. The majority of authors behave correctly, submitting papers that bear little or no similarity to other published papers. However, around 22.8% of papers appear to contain unreasonable copying or self-plagiarism, and about a quarter of these give rise to serious suspicions of plagiarism and copyright infringement; in some cases, the similarity with the plagiarized original was as high as 83%.

Four distinct types of plagiarism were identified, each of which we consider sufficiently serious to constitute a form of academic misconduct:

1. duplicate publication;
2. self- (or team) plagiarism;
3. direct copying of Methods section, with new data inserted; and
4. uncited or excessive extracts.

A fuller report of the findings, in Chinese, was published in ScienceTimes.5
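The proportions quoted in the abstract follow directly from the reported counts. As a quick check (a sketch only; the function name `percentage` is ours and does not come from the article):

```python
def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

checked = 662   # papers CrossChecked, October 2008 to May 2009
flagged = 151   # apparently unreasonable copying or self-plagiarism
serious = 39    # serious suspicion of plagiarism and copyright infringement

print(percentage(flagged, checked))  # 22.8
print(percentage(serious, flagged))  # 25.8
print(percentage(serious, checked))  # 5.9
```

The last figure is not stated in the article; it simply expresses the 39 serious cases as a share of all 662 papers checked.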

Duplicate publication

CrossChecking identified the fact that a few authors had contributed almost identical papers to several journals, or had submitted – completely unchanged – papers previously published in conference proceedings or electronic journals. If the similarity is more than 40–50%, we automatically reject the article on the basis of duplication. For example, in May 2009 the final CrossCheck identified one article which duplicated about 78% of the content from a paper by the same author published in an IEEE journal in early 2009 (Figure 2).
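The rejection rule described above can be sketched as a simple triage function. This is an illustration only: the article gives the threshold as a range ("more than 40–50%") and does not say how lower scores are handled, so the band boundaries, the action labels, and the name `triage` are all assumptions.

```python
def triage(similarity_percent: float,
           review_from: float = 40.0,
           reject_from: float = 50.0) -> str:
    """Map a similarity score (0-100) to a hypothetical editorial action."""
    if not 0.0 <= similarity_percent <= 100.0:
        raise ValueError("similarity must be between 0 and 100")
    if similarity_percent > reject_from:
        # Above the upper end of the quoted range: rejected as duplication.
        return "reject as duplicate"
    if similarity_percent > review_from:
        # Inside the quoted 40-50% range: assumed to need an editor's decision.
        return "editor decides"
    # A low score does not clear a paper: figures, tables and
    # references still have to be checked by other means.
    return "inspect matches manually"

print(triage(78))  # reject as duplicate
print(triage(18))  # inspect matches manually
```

The lowest band deliberately avoids the word "accept", since a low text-similarity score can still conceal duplicated figures, tables, or references.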

Identification of duplicated text is not difficult using CrossCheck. However, CrossCheck is currently unable to check duplication in figures and tables, so we have recourse to other sources (Google, PubMed Central, etc.) for further analysis of articles highlighted by CrossCheck. Comparison of a paper from France and one from Burkina Faso showed that, while only 18% of the main text was duplicated, the references were identical (Figure 3). When we referred to PubMed Central’s full-text database, we discovered that the figures and tables were completely duplicated from the earlier publication, so that the actual duplication was nearer 80%. This shows that our editors cannot rely on CrossCheck alone, but also have to make additional efforts to detect duplication at the time of submission and of publication.
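CrossCheck's own matching method is not described in this article. Purely to illustrate how a text-similarity percentage and a reference-list comparison might be computed, the sketch below uses word n-gram ("shingle") overlap, a generic technique swapped in for illustration; all function names are hypothetical.

```python
import re

def words(text: str) -> list[str]:
    """Lower-case the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def shingles(text: str, n: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping n-word sequences in the text."""
    w = words(text)
    return {tuple(w[i:i + n]) for i in range(len(w) - n + 1)}

def overlap_percent(submission: str, source: str, n: int = 3) -> float:
    """Percentage of the submission's n-word shingles also found in the source."""
    sub = shingles(submission, n)
    if not sub:
        return 0.0
    return 100.0 * len(sub & shingles(source, n)) / len(sub)

def same_references(refs_a: list[str], refs_b: list[str]) -> bool:
    """True if both lists hold the same entries once case and punctuation are ignored."""
    def normalise(ref: str) -> str:
        return " ".join(words(ref))
    return {normalise(r) for r in refs_a} == {normalise(r) for r in refs_b}
```

A score of this kind covers running text only. As the France and Burkina Faso example shows, it can badly understate duplication when figures and tables are copied, which is why a separate check such as `same_references` and a manual comparison of figures remain necessary.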

In another typical example, a paper was found to have the same abstract as another previously published paper. Further investigation revealed that the author’s Ph.D. thesis had already been published online through a university press, and that the author had also already published, before Ph.D. graduation, two papers containing the core content of his thesis. However, he argued that although 39% of the paper replicated previous publications, the rest of the material was previously unpublished content from his Ph.D. thesis, and thus that the whole paper should still be published. Since the whole thesis was already available online, and its core content had been published five years previously, we felt that it was unacceptable to republish it unless it contained significant new information, and the article was therefore rejected.

Figure 1. The date of the most recent CrossCheck is indicated on the first page of each journal paper.

Figure 2. Duplicate publication.

In our view, duplicate publication injures the interests of many journals, wastes publication resources and should be condemned both by academia and the publishing industry. As Errami and Garner state,

the repeated publication of the same results by those who conducted the research is ethically questionable. It not only artificially inflates an author’s publication record but places an undue burden on journal editors and reviewers, and is expressly forbidden by most journal copyright rules.6

Further work is needed to define relevant criteria.

Self- (or team) plagiarism

Another familiar phenomenon is self-plagiarism (or plagiarism of the publications of other team members). This can frequently be found in papers of authors from the same research programme (Figure 4). Some authors, or even programme leaders, believe that this is justified by different focuses in the same research project, even when the equipment and methods adopted are the same; thus they do not feel it is unreasonable to duplicate parts of the Introduction, Methods and Discussion sections.

However, in our view, once a paper is published the authors should not recycle any of its content in new papers. Self-plagiarism wastes not only the publication resources of journals but also the time of readers. Instead, authors should simply cite previous studies, giving no more than an overview in their current paper. It is preferable to combine the content of several papers together to form a single high-quality paper, rather than repeating some of the contents to form different papers.

Publishers also object to this practice. As Arnout Jacobs, Vice-President of the Science & Technology Department, Chinese section, at Elsevier, says:

it is a nuisance for journal editors when researchers publish a series of highly similar papers. Often, these papers could easily be rewritten as one single excellent paper. This happens less often in the USA or Europe, where editors or funding agencies check earlier publications routinely as a reference and authors would be judged negatively for publishing multiple papers with the same topics and replicated contents.7

Direct copying of Methods section, with new data inserted

This is a particularly common phenomenon in biomedical papers, where all or part of the Methods section may be copied verbatim, only changing some of the experimental conditions and data (see Figure 5). Some authors feel that it is acceptable to copy all or part of the Methods section from a previously published article, simply inserting their own data.

However, we observe that this type of direct copying is hardly ever found in leading journals such as Science and Nature. In principle we believe that, although much research refers to or repeats others’ successful methods in testing new materials and discussing new results, the authors should use their own language to describe and summarize their methods and ideas.

Uncited or excessive extracts

We found that some authors incorporated extracts from other papers without providing citation details (see Figure 6). In one instance, when we raised the matter with the author, he argued that, since his own view was identical to that of the other author, it was acceptable to use the same words without citation. However, such conduct misleads readers into believing that they are reading the author’s own words and, quite apart from its academic impropriety, this is an infringement of copyright.

Figure 3. Identical references in two apparently different papers.

Figure 4. Plagiarism of the work of members of the same research team.

Figure 5. Direct copying of Methods section with new data inserted.

Sometimes, too, authors believe that, with a full citation, it is reasonable to copy whole paragraphs from other papers; this is not the case, and the ‘fair dealing’ rules always apply.

The phenomenon of ‘copy and paste’ is also all too common, particularly in papers from non-English-speaking authors. In a few extreme cases, we found that many sentences and whole paragraphs were identical to those in published papers, and scarcely any of the words were the authors’ own (see Figure 7).

The Council of Science Editors gives clear definitions of piracy and plagiarism:8

Piracy is defined as the appropriation of ideas, data, or methods from others without adequate permission or acknowledgment. Again, deceit plays a central role in this form of misconduct. The intent of the perpetrator is the untruthful portrayal of the ideas or methods as his or her own.

Plagiarism is a form of piracy that involves the use of text or other items (figures, images, tables) without permission or acknowledgment of the source of these materials. Plagiarism generally involves the use of materials from others, but can apply to researchers’ duplication of their own previously published reports without acknowledgment (this is sometimes called self-plagiarism or duplicate publication).

Figure 6. Uncited extracts from other papers.

Figure 7. Copy and paste, with almost no original text.

Authors should ensure that any article they submit for publication is original and does not contain plagiarized content from either their own or others’ work. If an author’s text follows the source so closely that the result is more of a quotation than a paraphrase, it constitutes plagiarism; the author must either completely recast the summary in his or her own words (changing a few words is not sufficient), or quote explicitly.
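One rough way to see whether a passage follows its source too closely is to measure the longest run of consecutive words the two share. The sketch below uses Python's standard `difflib` module; the function name, and any cut-off an editor might apply to the result, are assumptions and are not part of the article or of CrossCheck.

```python
import difflib
import re

def longest_shared_run(passage: str, source: str) -> list[str]:
    """Return the longest sequence of consecutive words common to both texts."""
    a = re.findall(r"[a-z0-9']+", passage.lower())
    b = re.findall(r"[a-z0-9']+", source.lower())
    # autojunk=False stops difflib from ignoring very frequent words.
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]

run = longest_shared_run(
    "We found that the method works well in practice",
    "Earlier studies found that the method works well",
)
print(len(run), " ".join(run))  # 6 found that the method works well
```

A long shared run that appears without quotation marks suggests the passage should either be quoted explicitly or recast in the author's own words.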

Conclusions

The importance of science should be measured by the quality of papers rather than their quantity. In China, as elsewhere, researchers and their institutions should be evaluated on the basis of real original research results, rather than on the basis of paper output. An emphasis on quantity rather than quality is liable to lead to authors taking short cuts such as plagiarism.

Academia is not a perfect world; inevitably academic journals all over the world are likely to encounter these or similar problems. As editors, we have a responsibility to promote professional ethics. CrossCheck enables us to see that most scientists do behave ethically. However, it is up to the editorial community to propose criteria and processes for handling these types of academic misconduct. In this way we can help to protect the copyrights of original authors, and promote the healthy development of academic journals.

Acknowledgements

I thank my colleagues Hanfeng Lin, Ziyang Zhai, Xinxin Zhang, and Chunjie Zhang for collecting and analyzing the case study data.

References

1. Meddings, K. 2010. Credit where credit’s due: plagiarism screening in scholarly publishing. Learned Publishing, 23: 5–8. doi:10.1087/20100102

2. http://www.crossref.org/CrossCheck_members.html

3. http://www.alpsp.org/ngen_public/article.asp?id=0&did=0&aid=19899&st=awards&oaid=0

4. English edition at http://www.zju.edu.cn/jzus/

5. Zhang, Y.H. 2009. Academic misconduct as perceived from recent examples. ScienceTimes, 7 May (in Chinese), http://www.sciencenet.cn/dz/dznews_photo.aspx?id=6007

6. Errami, M. and Garner, H. 2008. Commentary – a tale of two citations. Nature, 451: 397–9. doi:10.1038/451397a

7. Jacobs, A. 2008. How to counter academic dishonesty in STM journals. Sciencenet, 5 April. www.sciencenet.cn/htmlnews/2008428182451204881.html (in Chinese)

8. http://www.councilscienceeditors.org/editorial_policies/whitepaper/3-1_misconduct.cfm#3.1.3

Helen (Yuehong) Zhang
Journal Director, Zhejiang University Press
38 Zheda Road, Hangzhou, 310027, China
Email: [email protected]
