Innovations in Digital Library Curation, Ranking, & Management
Mayank Singh Assistant Professor
Dept. of CSE, Indian Institute of Technology Gandhinagar
Changing Landscape of Science & Technology Libraries (CLSTL 2019)
Scientific documents
Secondary Sources
Books
Scientific Literature
Patent articles
Primary Sources
Review articles
Scientific Literature
Tertiary Sources
Magazines Blogs Crowd-sourced platforms
Focus of the Talk
Platforms that curate, rank & manage
Scientific Documents
Scientific documents
Google Scholar
Michael Gusenbauer, Scientometrics’18
389 million documents including articles, citations and patents
Microsoft Academic Search
212M documents, 256M authors
Semantic Scholar
GrapAL
And many more…
● Ferosa
● CLScholar
● Discern
● ….
Scholarly Ranking
Title
Older papers
Well-known
URLs
Can we rank papers based on performance?
Tabular Information
Comparative
Descriptive
Performance Improvement Graphs
Embedding comparative information into graphs to rank research papers
Dataset
Local Sanitization (prune noisy extractions)
improvement scores
Prune edges having improvement scores > 100%.
Improvement score
Im(B, C) = 100 * (0.6 - 0.1) / 0.1
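The improvement score and the pruning rule above can be sketched in a few lines of Python. This is a minimal illustration, not the talk's actual code: the function names, the edge tuple layout `(source, target, score)`, and the reading of Im(B, C) as the relative gain of the higher score (0.6) over the lower one (0.1) are all assumptions.

```python
def improvement_score(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline`, in percent.

    Assumed reading of the slide's formula: 100 * (improved - baseline) / baseline.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (improved - baseline) / baseline


def prune_noisy_edges(edges, max_improvement=100.0):
    """Keep only edges whose improvement score is at most `max_improvement` percent."""
    return [edge for edge in edges if edge[2] <= max_improvement]


# Hypothetical edges: (lower-scoring paper, higher-scoring paper, improvement in %).
edges = [
    ("B", "C", improvement_score(0.1, 0.6)),   # the slide's example, far above 100%
    ("A", "B", improvement_score(0.08, 0.1)),  # an invented, more modest improvement
]
kept = prune_noisy_edges(edges)
```

With these inputs the B-to-C edge exceeds the 100% threshold and is dropped, which matches the intent of treating implausibly large jumps as extraction noise.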
Local Sanitization (combining multi-edges)
No guarantee of the same data set or experimental conditions across different tables, let alone different papers.
1. UNW: Unweighted Graph
2. ALL: Weighted Graph (Total number of comparisons)
3. UNQ: Weighted Graph (Unique number of metric comparisons)
4. SIG: Sigmoid of actual improvements on edges (MAX and Average)
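The four ways of collapsing parallel edges listed above might look as follows. This is a sketch under stated assumptions: the tuple layout `(source, target, metric, improvement_percent)`, the scheme names `SIG_MAX` and `SIG_AVG`, and the choice to rescale percentages to fractions before applying the logistic sigmoid are illustrative guesses, since the slide does not give these details.

```python
import math
from collections import defaultdict


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def combine_multi_edges(comparisons, scheme="UNW"):
    """Collapse parallel comparisons between the same pair of papers into one weight."""
    grouped = defaultdict(list)
    for src, dst, metric, improvement in comparisons:
        grouped[(src, dst)].append((metric, improvement))

    weights = {}
    for pair, items in grouped.items():
        improvements = [imp for _, imp in items]
        if scheme == "UNW":        # every connected pair gets weight 1
            weight = 1.0
        elif scheme == "ALL":      # total number of comparisons
            weight = float(len(items))
        elif scheme == "UNQ":      # number of distinct metrics compared
            weight = float(len({metric for metric, _ in items}))
        elif scheme == "SIG_MAX":  # sigmoid of the largest improvement (assumed scaling)
            weight = sigmoid(max(improvements) / 100.0)
        elif scheme == "SIG_AVG":  # sigmoid of the mean improvement (assumed scaling)
            weight = sigmoid(sum(improvements) / len(improvements) / 100.0)
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        weights[pair] = weight
    return weights


# Invented comparisons: paper A is improved upon by B in three table cells.
comparisons = [
    ("A", "B", "F1", 10.0),
    ("A", "B", "F1", 20.0),
    ("A", "B", "accuracy", 30.0),
    ("B", "C", "F1", 5.0),
]
all_weights = combine_multi_edges(comparisons, "ALL")
unq_weights = combine_multi_edges(comparisons, "UNQ")
```

Here the A-to-B pair has three comparisons over two distinct metrics, so ALL and UNQ assign it different weights while UNW treats it like any other edge.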
Ranking Schemes and SOTA Lists
1. Sink nodes
2. Cocitation
3. Linear tournament
4. Exponential tournament
5. PageRank
https://github.com/sbrugman/deep-learning-papers
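Two of the ranking schemes listed above, sink nodes and PageRank, can be illustrated on a tiny improvement graph. The sketch assumes that an edge points from a paper to a paper that improves upon it, so that rank mass flows toward better-performing work; that direction, the function names, and the example graph are assumptions, and the PageRank shown is a plain power iteration rather than whatever implementation was used in the talk.

```python
def sink_nodes(weights):
    """Papers with no outgoing edge, i.e. papers nobody is recorded as improving upon."""
    nodes = {node for pair in weights for node in pair}
    sources = {src for src, _ in weights}
    return sorted(nodes - sources)


def pagerank(weights, damping=0.85, iterations=100):
    """Weighted PageRank by power iteration; dangling mass is spread uniformly."""
    nodes = sorted({node for pair in weights for node in pair})
    n = len(nodes)
    out_weight = {u: 0.0 for u in nodes}
    for (src, _), w in weights.items():
        out_weight[src] += w

    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is redistributed to everyone.
        dangling = sum(rank[u] for u in nodes if out_weight[u] == 0.0)
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for (src, dst), w in weights.items():
            new_rank[dst] += damping * rank[src] * w / out_weight[src]
        rank = new_rank
    return rank


# Hypothetical graph: B improves on A, and C improves on both A and B.
weights = {("A", "B"): 1.0, ("B", "C"): 1.0, ("A", "C"): 1.0}
sinks = sink_nodes(weights)
ranks = pagerank(weights)
ranking = sorted(ranks, key=ranks.get, reverse=True)
```

On this toy graph both schemes put C first, but they diverge on larger graphs: sink-node search returns an unordered set of candidates, whereas PageRank yields a full ordering that also reflects how strong the improved-upon papers were.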
Performance Evaluation
● Google Scholar (GS) and Semantic Scholar (SS) are mediocre
● Among baselines, sink node search led to the worst performance
● Co-citation performed quite well
● UNW is better than UNQ
Organic Leaderboards
Metrics
Competing papers
Ranks
Generating Organic Leaderboards
Conclusion and Future Directions
● Framework to mine experimental performance embedded within comparative tables in papers.
● Information extraction from figures and tables embedded in PDF research articles.
● Extension to non-CS domains.
Thank you for your attention!
Contact me: [email protected], [email protected]
Webpage: http://mayank4490.github.io/