
Dr. Rao Muhammad Adeel Nawab


How to Read a Research Paper?

Session V

Making Summary and Documenting a Paper

How to Work


Verse: "You alone we worship, and You alone we ask for help." (Qur'an 1:5)

Make this dua daily: "Guide us to the straight path, the path of those on whom You have bestowed favor." (Qur'an 1:6-7)

Power of Dua

The Messenger of Allah (peace be upon him) said: "Actions are judged by intentions."

Power of Neyat

Dua - Take Help from Allah before starting any Task


Balanced Life is Ideal Life

Get Excellence in five things

A Journey from BEGINNER to EXCELLENCE

You must have a combination of five things with different variations. However, the aggregate will be the same.


Have a DADDU YAR in life to drain out on daily basis

Excellence – FRIENDS

Excellence – FAMILY


Earn the duas of parents and elders through khidmat (service) and adab (respect)

Your wife/husband must be your best friend

Be humble and kind to kids, subordinates and poor people

Outline

Understand Order and Flow

Template-based Approach to Read a Paper

Understand Order and Flow

Order and Flow


Document (or Paper) Level

Connection between Sections

Section Level

Connection between Paragraphs

Paragraph Level

Connection between Sentences

Sentence Level

Connection between Words / Phrases

Paper Outline

Abstract

Introduction

Related Work

Corpus Generation
- Extract sentence/passage pairs
- Annotation guidelines
- Annotations
- Corpus statistics
- Examples from the corpus
- Linguistic analysis of the transformations

Paper Outline (cont.)

Text reuse detection experiments
- Translation plus mono-lingual analysis
- Experimental setup

Results and analysis

Conclusions and future work

Acknowledgements

References

Template-based Approach to Read a Paper

Reading Abstract

Read sentence by sentence and interpret each sentence

Template
- Problem (or Research Problem)
- Importance of Problem
- Application(s) of Problem
- Summary of Existing Literature
- Research Gap
- Proposed Solution
- Characteristics of Proposed Solution
- Results and Main Findings

Abstract

Sentence 1:
Text reuse is becoming a serious issue in many fields and research shows that it is much harder to detect when it occurs across languages.

Interpretation
Cross-lingual Text Reuse Detection is a challenging task

Insights
Research Problem - Cross-Lingual Text Reuse Detection
Importance - It is a widespread problem and also difficult to detect

Abstract

Sentence 2:
The recent rise in multi-lingual content on the Web has increased cross-language text reuse to an unprecedented scale.

Interpretation
Reason(s) for rise in cross-lingual text reuse

Insights
Justifies why cross-lingual text reuse detection is an important problem to address, that it has generic applications, and why it is on the rise

Abstract

Sentence 3:
Although researchers have proposed methods to detect it, one major drawback is the unavailability of large-scale gold standard evaluation resources built on real cases.

Interpretation
Summary of existing literature and Research Gap

Insights
Summary of existing literature - Researchers have proposed methods to detect it
Research gap - Unavailability of large-scale gold standard evaluation resources built on real cases

Abstract

Sentence 4:
To overcome this problem, we propose a cross-language sentence/passage level text reuse corpus for the English-Urdu language pair.

Interpretation
Proposed Solution

Insights
Need to be very specific in proposing a solution
- Cross-language sentence/passage level text reuse corpus
- English-Urdu language pair

Abstract

Sentence 5:
The Cross-Language English-Urdu Corpus (CLEU) has source text in English while the derived text is in Urdu.

Interpretation
Brief details of Main Contribution i.e. proposed solution (cross-lingual text reuse corpus for English-Urdu language pair)

Insights
Brief Detail of Proposed Solution - source text in English while the derived text is in Urdu.
Note that this is the Selling Point of the Paper

Abstract

Sentence 6:
It contains in total 3,235 sentence/passage pairs manually tagged into three categories i.e., near copy, paraphrased copy, and independently written.

Interpretation
Main characteristic of the proposed solution (cross-lingual text reuse corpus for English-Urdu language pair)

Insights
- Total 3,235 sentence/passage pairs
- Manually tagged
- Three categories i.e., near copy, paraphrased copy, and independently written

Abstract

Sentence 7:
Further, as a second contribution, we evaluate the Translation plus Mono-lingual Analysis method using three sets of experiments on the proposed dataset to highlight its usefulness.

Interpretation
Brief details of Secondary contribution and applications of the proposed solution

Insights
Technique – translation plus mono-lingual analysis
Experiments – 3
Evaluation – comparison of various techniques on the same dataset (proposed in this study)
Note that it is not the selling point of this research work


Abstract

Sentence 8:
Evaluation results (F1 = 0.732 binary, F1 = 0.552 ternary classification) indicate that it is harder to detect cross-language real cases of text reuse, especially when the language pairs have unrelated scripts.

Interpretation
Types of classification, best results and main finding

Insights
Ternary Classification – Verbatim, Paraphrased and Independently Written
Binary Classification – Derived vs non-Derived
Results - (F1 = 0.732 binary, F1 = 0.552 ternary classification)
Main Findings - It is harder to detect cross-language real cases of text reuse, especially when the language pairs have unrelated scripts.


Abstract

Sentence 9:
The corpus is a useful benchmark resource for the future development and assessment of cross-language text reuse detection systems for the English-Urdu language pair.

Interpretation
Strengths and applications of proposed solution (cross-lingual text reuse corpus for English-Urdu language pair)

Abstract: Overall Interpretations

1. Cross-lingual text reuse detection is a challenging task
2. Reasons for rise in cross-lingual text reuse (Importance and application)
3. Summary of existing literature and Research Gap
4. Proposed Solution
5. Brief details of Main Contribution i.e. proposed solution (cross-lingual text reuse corpus for English-Urdu language pair)
6. Main characteristic of proposed solution (cross-lingual text reuse corpus for English-Urdu language pair)
7. Brief details of secondary contribution
8. Types of classification, best results and main finding
9. Strengths and Applications of proposed solution

Reading Introduction


Summarize each paragraph into a single sentence

See the order and flow of paragraphs

Introduction: Passage 1

Text reuse, the process of creating new texts using existing ones, has become very common because of free, readily available, and large digital repositories. In addition, state-of-the-art text processing applications have made it very simple to copy-paste text and give it a new identity. Text borrowed from such sources can be reused verbatim (copy paste) or rewritten (paraphrased). If the rewriting process involves complex editing operations (e.g., lexical substitution, changes in syntax, summarization, synonym replacement, altering word order, or verb or noun nominalization) then the borrowed text transforms into an independently written piece (Clough, Gaizauskas, Piao, & Wilks, 2002; Maurer, Kappe, & Zaka, 2006). Moreover, new text can be created using text from one or more sources and the amount of reused text varies from local text reuse (such as, a single word, small chunks, or sentences) to global text reuse (i.e., an entire document; Mittelbach, Lehmann, Rensing, & Steinmetz, 2010; Seo & Croft, 2008).

Introduction - Passage 1 - Summary


Definition of Text Reuse

Importance of Text Reuse

Levels of Text Reuse

Verbatim

Paraphrased

Independently Written

Types of Text Reuse (Local vs Global)



Introduction: Passage 2

Unlike academic plagiarism (the unacknowledged reuse of text), text reuse is a common practice in journalism. Newspapers pay news agencies for their text(s) (here termed source text) to generate news stories (termed derived text). The text purchased from a news agency can be reused “verbatim” or “paraphrased” to create the newspaper story. However, at times the newspaper story might also be independently written without using any news agency text (Clough, 2010).

Introduction - Passage 2 - Summary

Definition of Plagiarism

Process of text reuse in Journalism



Introduction: Passage 3

Text reuse can either be mono-lingual (when the source and derived text share the same language) or cross-lingual (when the source text is in one language and the derived text is in another). Mono-lingual text reuse detection has been a subject undergoing intense study for the research community for some time, but recently the focus has shifted towards detecting text reuse across languages (Ceska, Toman, & Jezek, 2008; Franco-Salvador, Gupta, Rosso, & Banchs, 2016; Gupta, Barrón-Cedeño, & Rosso, 2012; Potthast, Barrón-Cedeño, Stein, & Rosso, 2011). A recent study suggested that the scale of cross-language text reuse and plagiarism is increasing (Barrón-Cedeño, Gupta, & Rosso, 2013). This is because of the following reasons: (a) users of under-resourced languages, which are very large in number, commonly use text(s) from resource-rich languages, (b) speakers of one language staying in a country other than their own can consult the text(s) in their native language, and (c) often speakers of one language are keen to write in a foreign language. Likewise, the recent rise in multi-linguality, freely available machine translation systems, and intelligent word processors are contributing to an environment where it is easy to reuse text across languages, but with a perception of being harder to detect such reuse (Somers, Gaspari, & Niño, 2006). Therefore, there is an ever-increasing necessity to develop standard evaluation resources and methods to detect cross-language text reuse for the various language pairs.

Introduction - Passage 3 - Summary

Mono-lingual vs Cross-lingual text reuse

Three main reasons for rise in cross-lingual text reuse



Introduction: Passage 4

To develop, evaluate, and analyze methods for cross-language text reuse (either local or global), gold standard benchmark corpora are needed. These corpora can be generated in three ways: (a) artificial - using an automatic text altering tool, (b) simulated - humans are asked to rewrite source text to create new text, and (c) real - news agency’s text is reused by journalists to create the newspaper story. It seems likely that cross-language text reuse detection methods which are trained on real examples are more likely to give realistic performance that we investigate further in our paper.

Introduction - Passage 4 - Summary

Why is it important to develop a cross-lingual text reuse corpus?

Three ways to generate a corpus
- artificial
- simulated
- real



Introduction: Passage 5

This study aims to develop a publicly available large-scale benchmark corpus that contains real examples of cross-language text reuse at sentence/passage level for the English-Urdu language pair. Urdu belongs to the Indo-Aryan family, widely spoken in Pakistan and the northern parts of India (Alam, Mehmood, & Nelson, 2015). Moreover, it has a strong Perso-Arabic influence in its vocabulary and is written in a Perso-Arabic script from right to left. It is also spoken world-wide because of the South Asian Diaspora (with large populations in the Middle East, United States, UK, Norway, and Canada etc.; Daud, Khan, & Che, 2016). Despite that, for the English-Urdu language pair, there are no publicly available cross language text reuse detection datasets known to us. Moreover, previous research has tended to focus more on European languages. The corpus developed as an outcome of this study contains 3,235 pairs of real examples of cross-language text reuse at sentence/passage level (the source text is in English whereas derived text is in Urdu). Each sentence/passage pair is categorised as i) Near Copy (NC; 751 pairs), ii) Paraphrased Copy (PC; 1751 pairs), or iii) Independently Written (IW; 733 pairs). The corpus is representative enough to serve as a benchmark dataset for: (a) developing and evaluating techniques for cross-language text reuse detection for the English-Urdu language pair, (b) obtaining an insight into what edit operations are likely used by journalists in reusing text, and (c) to foster text reuse detection research in the English-Urdu language pair.

Introduction - Passage 5 - Summary

Main aim of this study

Importance of Urdu

Summary of Literature Review

Research Gap

Proposed Solution (or Corpus)

Characteristics and applications of proposed solution



Introduction: Passage 6

The remainder of this article is organized as follows. We first review previously developed cross-lingual text reuse or plagiarism detection corpora. Then we present a detailed discussion on the CLEU corpus construction, its statistics, characteristics, linguistic analysis, and example cases. This is followed by the explanation of cross language text reuse detection experiments that we performed on our corpus to highlight its strengths and its utility for evaluation purposes. Finally, we present the results and their analysis and then conclude the article.

Introduction - Passage 6 - Summary

Organization of Paper

Introduction: Overall Interpretations

1. Definition of Text Reuse, Importance of Text Reuse, Levels of Text Reuse (Verbatim, Paraphrased, Independently Written) and Types of Text Reuse (Local vs Global)
2. Definition of Plagiarism and Process of text reuse in Journalism
3. Mono-lingual vs Cross-lingual text reuse and Three main reasons for rise in cross-lingual text reuse
4. Why is it important to develop a cross-lingual text reuse corpus? and Three ways to generate a corpus (artificial, simulated and real)
5. Main aim of this study, Importance of Urdu, Summary of Literature Review, Research Gap, Proposed Corpus (or Solution), its characteristics and applications
6. Organization of Paper

Reading - Related Work


Summarize each paragraph into a single sentence

See the order and flow of paragraphs

Related Work: Passage 1

In the previous literature, efforts have been made to develop standard evaluation resources for measuring cross language text reuse (and plagiarism) for different language pairs. For example, PAN authors have developed a series of corpora with artificial and simulated examples of plagiarism at document level (Potthast, Barrón-Cedeño, Eiselt, Stein, & Rosso, 2010; Potthast, Eiselt, Barrón-Cedeño, Stein, & Rosso, 2011; Potthast et al., 2012–2014; Stein, Rosso, Stamatatos, Koppel, & Agirre, 2009). The majority (90%) of the text plagiarism cases in these corpora are mono-lingual, however, there exists a small portion (10%) of cross-lingual plagiarism cases too. These cross language plagiarism cases are for the English-German and English-Spanish language pairs. Most of these cases are artificial (created using automatic MT [Machine Translation] system that is, Google Translate) but a small number of them are created manually (i.e., translated by humans). These corpora have been used to evaluate text plagiarism detection methods in the competitions held annually.

Related Work: Passage 1 - Summary


Opening sentence

Summary of PAN text reuse corpora



Related Work: Passage 2

The CL!TR (Cross-Language Indian Text Reuse) corpus is the first of its kind developed specifically for the analysis of cross-language text reuse detection in the Hindi-English language pair at document level (Barrón-Cedeño, Rosso, Devi, Clough, & Stevenson, 2013). The suspicious documents it contains are in Hindi and the source documents in English language. The training set includes 198 suspicious (Hindi) and 5,032 source (English) documents, whereas the test set has 190 suspicious (Hindi) and 5,032 source (English) documents. The CL!TR corpus contains simulated cases of text reuse. The volunteers involved in the study were asked to answer a set of 10 questions, related to the tourism and computer science domains, to create suspicious documents. It contains three types of revisions, categorized by the amount of obfuscation used, namely “Exact” (without any modifications, translation only), “Light” (very few modifications, translation, and manual correction), and “Heavy” (detailed modifications, translation, and manual correction). The corpus also contains “Original” (independently written) documents which were generated without referring to the source documents but using the learning material provided.

Related Work: Passage 2 - Summary

English-Hindi CL!TR Cross-Lingual Text Reuse Corpus



Related Work: Passage 3

Another cross-language corpus of 110 documents (55 source in English and 55 plagiarized in Bangla) that contains simulated plagiarism cases and was built using student’s reports from a university (Arefin, Morimoto, & Sharif, 2013). Two groups of 55 students each, were asked to write a report on a given topic. 50 reports are used as training set whereas the remaining 10 as test set. Plagiarism cases were obfuscated by replacing contents with several plagiarized fragments of different lengths. However, the corpus is not available to download.

Related Work: Passage 3 - Summary

English-Bangla Cross-Lingual Text Reuse Corpus



Related Work: Passage 4

Recently, a cross-language (Urdu-English language pair) document level plagiarism detection corpus was submitted for the PAN 2016 shared task (Hanif et al., 2015). The corpus is divided in two sets, 500 source (Urdu) and 500 suspicious (English) documents, and contains only simulated examples of plagiarism. The source documents are Wikipedia excerpts whereas the plagiarized documents were manually created by university students. The students were asked to plagiarize 270 documents on three levels of obfuscation (“Near Copy,” “Light Revision,” and “Heavy Revision”), whereas 230 documents in the corpus are “Nonplagiarized.” Moreover, the plagiarism cases inserted in the suspicious documents are of various length that is small (< 50 tokens), medium (50–100 tokens), and large (100–200 tokens). The corpus is the first cross language (Urdu-English pair) dataset created for plagiarism detection research at the document level.

Related Work: Passage 4 - Summary

English-Urdu CLUE Cross-Lingual Text Reuse Corpus



Related Work: Passage 5

CLiPA (Cross-Language Plagiarism Analysis) is a publicly available fragment or sentence level corpus containing five source sentences (in English) which were used to generate plagiarized cases (in Spanish and Italian) using both machine translation (artificial) and manual translation (simulated; Barrón-Cedeño, Rosso, Pinto, & Juan, 2008). The machine translation cases were generated using five different services to have variations whereas for manually (human) simulated plagiarism cases, nine volunteers were asked to plagiarize each of the five source fragments. They were further requested to generate the same number of nonplagiarized cases as well. The corpus was used in experiments on text plagiarism detection research in the English-Spanish and English-Italian language pairs.

Related Work: Passage 5 - Summary

English-Spanish and English-Italian CLiPA Cross-Lingual Text Reuse Corpus



Related Work: Passage 6

In summary, the corpora discussed above either contain artificial or simulated examples of cross-language text reuse (or plagiarism). Cross-language text reuse detection methods developed using these non-real types of text reuse are unlikely to perform well on real cases of text reuse that occur in real world scenarios (e.g., academia, journalism; Weber-Wulff, 2010).

Related Work: Passage 6 - Summary

Summary of existing literature and Research Gap (unavailability of real examples of text reuse).



Related Work: Passage 7

Moreover, the simulated cases created in a controlled environment using crowd-sourcing do not represent the strategies used by humans when rewriting text in real life. Because cross-language text reuse is increasing day by day, first, there is an urgent need to develop text reuse detection corpora with real examples of text reuse. Second, the available corpora for research are created at document level and there are no corpora available at sentence/passage level for the English-Urdu language pair. Last, the corpora listed above are not large enough to generate robust results. This is not surprising because it takes a lot of manual effort to create corpora with simulated examples of text reuse or plagiarism.

Related Work: Passage 7 - Summary

Limitations of existing work (or corpora)



Related Work: Passage 8

To develop and evaluate cross-language text reuse detection methods for the real-world scenario, we need to create corpora with real examples of text reuse. To fill this gap, our research work proposes a large-scale gold standard benchmark corpus containing real examples to measure cross-language text reuse at sentence/passage level for the English-Urdu language pair. The next section describes the corpus generation process in detail.

Related Work: Passage 8 - Summary

Justification for the need of a new corpus, our contribution in developing a new corpus, and connection with the next section

Related Work: Overall Interpretations

1. Opening sentence and Summary of PAN text reuse corpora
2. English-Hindi CL!TR Cross-Lingual Text Reuse Corpus
3. English-Bangla Cross-Lingual Text Reuse Corpus
4. English-Urdu CLUE Cross-Lingual Text Reuse Corpus
5. English-Spanish and English-Italian CLiPA Cross-Lingual Text Reuse Corpus
6. Summary of existing literature and Research Gap (unavailability of real examples of text reuse)
7. Limitations of existing work (or corpora)
8. Justification for the need of a new corpus, our contribution in developing a new corpus, and connection with the next section

Reading - Corpus Generation

Purpose of Corpus
- Cross-lingual text reuse detection

Corpus Generation Process
- Extracting sentence/passage pairs
- Preparation of annotation guidelines
- Annotation of text by three annotators
- Computing inter-annotator agreement
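The last step of the process, computing inter-annotator agreement, can be sketched in a few lines. The slides do not say which agreement measure the paper uses, so the sketch below assumes Cohen's kappa averaged over the three annotator pairs. The function names and the toy labels (NC, PC, IW) are illustrative only and are not taken from the corpus.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)


def mean_pairwise_kappa(annotations):
    """Average Cohen's kappa over every pair of annotators (an assumption)."""
    pairs = list(combinations(annotations, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Toy labels for six sentence/passage pairs, one list per annotator.
ann1 = ["NC", "PC", "IW", "PC", "NC", "IW"]
ann2 = ["NC", "PC", "IW", "NC", "NC", "IW"]
ann3 = ["NC", "PC", "PC", "PC", "NC", "IW"]
print(round(mean_pairwise_kappa([ann1, ann2, ann3]), 3))
```

When you document a paper, note which agreement measure it actually reports, since kappa variants are not directly comparable.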

Corpus Characteristics

Language Pair
English-Urdu

Levels of Text Reuse
Verbatim – 751
Paraphrased – 1751
Independently Written – 733

Standardization
XML format

Global or Local
Local (Sentence / Passage level)

Size of Corpus
Total 3235 pairs

Reading – Text Reuse Detection Experiments

Techniques:
Translation plus Monolingual analysis
- N-gram Overlap
- Greedy String Tiling
- Longest Common Subsequence

For each technique note 4 things
- In which category does it fall?
- How does it work?
- What are its strengths and weaknesses?
- In which previous studies has it been used?
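To answer "how does it work?" for two of the listed techniques, here is a minimal sketch of word n-gram overlap and Longest Common Subsequence on tokenised text. It assumes the translation step has already been done, so both texts are in the same language. The containment-style normalisation, the function names, and the example sentences are assumptions for illustration, not the paper's exact configuration.

```python
def word_ngrams(tokens, n):
    """Set of distinct word n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_containment(source_tokens, derived_tokens, n=1):
    """Share of the derived text's distinct n-grams that also occur in the source."""
    derived = word_ngrams(derived_tokens, n)
    if not derived:
        return 0.0
    return len(derived & word_ngrams(source_tokens, n)) / len(derived)


def lcs_length(a, b):
    """Length of the longest common subsequence (dynamic programming, two rows)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_score(source_tokens, derived_tokens):
    """LCS length normalised by the shorter text (one possible normalisation)."""
    shortest = min(len(source_tokens), len(derived_tokens))
    return lcs_length(source_tokens, derived_tokens) / shortest if shortest else 0.0


# Hypothetical pair: source sentence and an already-translated derived sentence.
source = "the prime minister visited the flood hit areas on monday".split()
derived = "on monday the prime minister toured flood hit areas".split()
print(ngram_containment(source, derived, n=1))
print(ngram_containment(source, derived, n=2))
print(lcs_score(source, derived))
```

N-gram overlap ignores word order beyond the n-gram window, while LCS rewards words that appear in the same order, which is one way to contrast their strengths and weaknesses in your notes.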

Evaluation Methodology

Binary classification
derived (verbatim + paraphrased) vs non-derived (independently written).

Ternary classification
verbatim vs paraphrased vs independently written.

Supervised text classification task
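The relation between the two setups is a simple label mapping: the binary task merges verbatim and paraphrased into one "derived" class. The sketch below shows that mapping and applies it to the category counts quoted in the Introduction (751, 1751, 733), treating Near Copy as the verbatim level as the slides do. The function and variable names are illustrative.

```python
from collections import Counter

TERNARY_LABELS = ("verbatim", "paraphrased", "independently written")


def to_binary(ternary_label):
    """Collapse a ternary label into the binary derived / non-derived scheme."""
    if ternary_label not in TERNARY_LABELS:
        raise ValueError(f"unknown label: {ternary_label!r}")
    return "non-derived" if ternary_label == "independently written" else "derived"


# Category sizes as reported in the paper's Introduction.
corpus_counts = {"verbatim": 751, "paraphrased": 1751, "independently written": 733}

binary_counts = Counter()
for label, count in corpus_counts.items():
    binary_counts[to_binary(label)] += count

print(dict(binary_counts))
```

The mapping also shows why the binary task is less balanced than the ternary one: the derived class absorbs two of the three categories.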

Evaluation Methodology

Evaluation Measures
- Precision
- Recall
- F1

Machine Learning algorithms
- J48
- Random Forest
- SMO

Machine Learning Toolkit
- WEKA
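The three evaluation measures can be computed by hand for one class, which helps when explaining the terms in a results table. This is a minimal sketch on made-up predictions; how the paper averages the per-class scores is not stated in the slides, so only the single-class definitions are shown.

```python
def precision_recall_f1(gold, predicted, positive):
    """Precision, recall and F1 for one class treated as the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy gold labels and classifier output for five sentence/passage pairs.
gold = ["derived", "derived", "derived", "non-derived", "non-derived"]
pred = ["derived", "derived", "non-derived", "derived", "non-derived"]
print(precision_recall_f1(gold, pred, positive="derived"))
```

F1 is the harmonic mean of precision and recall, so it is high only when both are high.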

Reading – Results and Analysis

Explain “Terms” in the Table.

Explain “Overall” best results

Explain results with individual techniques

Conclude your results
- e.g., proposed approach outperforms baseline approach

Summarize and Document Paper in Tabular Format

Sr no.: 1
Year: 2018
Paper Title: CLEU - A Cross-Language English-Urdu Corpus and Benchmark For Text Reuse Experiments
Authors: Iqra Muneer, Muhammad Sharjeel, Muntaha Iqbal, Rao M. Adeel Nawab, Paul Rayson

Conference / Journal: Journal of the Association for Information Science and Technology (JASIST)
Publisher: John Wiley & Sons
Problem: Cross-Lingual Text Reuse Detection
Importance of Problem: The recent rise in multi-lingual content on the Web has increased cross-language text reuse to an unprecedented scale.

Applications of Problem:
1. Cross-lingual plagiarism detection
2. Duplicate content removal from the Web

Summary of Literature Review: Cross-lingual text reuse detection corpora have been developed for various language pairs including English-Urdu, English-Hindi, and English-Spanish.

Research Gap: One major drawback is the unavailability of large-scale gold standard evaluation resources built on real cases for cross-lingual text reuse detection, particularly for the English-Urdu language pair


Proposed Solution: A cross-language sentence/passage level text reuse corpus for the English-Urdu language pair

Corpus / Dataset

Purpose of Corpus: Develop systems to detect cross-lingual text reuse for the English-Urdu language pair

Corpus Generation Process:
1. Data collection from news articles
2. Related pairs extraction
3. Annotation guidelines

Corpus / Dataset

Corpus Characteristics:
Number of documents: 900
Levels of Text reuse
1. Exact Copy
2. Paraphrase Copy
3. Independently Written
Language: English – Urdu
License: Creative (Open access)
Publicly available


Technique: Translation + Mono-lingual Analysis
1. Longest Common Subsequence
2. N-gram Overlap
3. Greedy String Tiling

Toolkit: WEKA

Evaluation Measures:
1. Precision
2. Recall
3. F1-measure

Evaluation Methodology:
1. Supervised document classification task
2. Ten-fold cross-validation
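Ten-fold cross-validation splits the data into ten parts, trains on nine, and tests on the remaining one, repeating so that every item is tested exactly once. The sketch below only generates the index splits. It is an unstratified illustration with a hypothetical function name, and WEKA's own fold assignment may differ (for example by stratifying on the class label).

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and the number of items")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 3235 is the number of sentence/passage pairs in the corpus.
splits = list(k_fold_indices(3235, k=10))
print([len(test) for _, test in splits])
```

Reported scores are then averaged over the ten test folds, which is why a single F1 value appears per technique in the results.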


Classifiers:
1. Random Forest
2. Naive Bayes
3. J48
4. SMO

Results:
Binary classification - F1 = 0.735 using GST-mml1
Ternary classification - F1 = 0.549 using GST-mml1

Main Finding(s): It is harder to detect cross-language real cases of text reuse, especially when the language pairs have unrelated scripts.
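The best results above come from Greedy String Tiling, which repeatedly marks the longest common runs of tokens ("tiles") shared by two texts. Reading "mml1" as a minimum match length of 1 is an assumption based on the usual GST parameter name. The sketch below is a simple, unoptimised version of the algorithm, and the similarity normalisation is one common choice rather than necessarily the paper's.

```python
def greedy_string_tiling(a, b, min_match_length=1):
    """Return non-overlapping tiles as (start_in_a, start_in_b, length)."""
    if min_match_length < 1:
        raise ValueError("min_match_length must be at least 1")
    marked_a = [False] * len(a)
    marked_b = [False] * len(b)
    tiles = []
    while True:
        max_len = min_match_length
        matches = []
        # Find the longest runs of equal, still-unmarked tokens.
        for i in range(len(a)):
            for j in range(len(b)):
                k = 0
                while (i + k < len(a) and j + k < len(b)
                       and a[i + k] == b[j + k]
                       and not marked_a[i + k] and not marked_b[j + k]):
                    k += 1
                if k > max_len:
                    matches, max_len = [(i, j, k)], k
                elif k == max_len:
                    matches.append((i, j, k))
        if not matches:
            break
        # Turn the maximal matches into tiles, skipping any that now overlap.
        for i, j, k in matches:
            if any(marked_a[i:i + k]) or any(marked_b[j:j + k]):
                continue
            marked_a[i:i + k] = [True] * k
            marked_b[j:j + k] = [True] * k
            tiles.append((i, j, k))
    return tiles


def gst_similarity(a, b, min_match_length=1):
    """Tiled tokens relative to the combined length (assumed normalisation)."""
    if not a or not b:
        return 0.0
    covered = sum(length for _, _, length in greedy_string_tiling(a, b, min_match_length))
    return 2 * covered / (len(a) + len(b))


x = "a b c d e".split()
y = "c d e a b".split()
print(greedy_string_tiling(x, y, min_match_length=1))
print(gst_similarity(x, y, min_match_length=3))
```

Raising the minimum match length ignores short coincidental matches, so comparing several values is a natural way to set up experiments with this technique.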


Future Work: Improve results by developing a new technique / algorithm.
Any Remarks: -
Source Code URL?: Not available publicly
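If you document many papers this way, it can help to keep the table as structured data. The sketch below stores one row per paper and writes it out as CSV. The `PaperSummary` class and its field names are a hypothetical subset of the columns used in these slides, and the values are copied from the CLEU example.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class PaperSummary:
    """One row of the paper summary table (illustrative subset of columns)."""
    year: int
    title: str
    authors: str
    venue: str
    problem: str
    research_gap: str
    proposed_solution: str
    main_finding: str


def to_csv(summaries):
    """Serialise summaries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(PaperSummary)])
    writer.writeheader()
    for summary in summaries:
        writer.writerow(asdict(summary))
    return buffer.getvalue()


cleu = PaperSummary(
    year=2018,
    title="CLEU - A Cross-Language English-Urdu Corpus and Benchmark For Text Reuse Experiments",
    authors="Muneer; Sharjeel; Iqbal; Nawab; Rayson",
    venue="JASIST",
    problem="Cross-lingual text reuse detection",
    research_gap="No large-scale gold standard resources built on real cases",
    proposed_solution="Sentence/passage level English-Urdu text reuse corpus",
    main_finding="Real cross-language reuse is harder to detect with unrelated scripts",
)
print(to_csv([cleu]))
```

A spreadsheet works equally well; the point is that every paper is summarised under the same column headings so that papers can be compared side by side.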



Physical Health

Mental Health

Social Health

Key to Success

7-9 hours sleep per night

3 healthy meals daily

30 minutes brisk walk or running or exercise

Offer 5 Namaaz daily with Jamaat

Help at least one person daily for the pleasure of Allah

Practice Six Things on Daily Basis to Become a Great Human Being (Insha Allah)

Recite Durood Sharif daily (Min: 100 – Max: 125K)


BECOME A VOLUNTEER

MAKE A DIFFERENCE

Jazak Allah Khair


Dr. Rao Muhammad Adeel Nawab
Email: [email protected]
