Prof. Partha Pratim Das
Professor, Department of Computer Science & Engineering,
Head, Rajendra Mishra School of Engineering Entrepreneurship,
Professor-in-Charge, IIT Kharagpur Research Park, Kolkata,
Joint Principal Investigator, National Digital Library of India Project
Indian Institute of Technology Kharagpur
SHORT TERM COURSE ON DIGITAL INDIA BY UGC-HRDC
21 – 26 December 2018
The Digital India programme is a flagship programme of the Government of India with a vision to transform India into a digitally empowered society and knowledge economy.
INDIA’S NEED Quality education for ANYONE, ANYTIME, ANYWHERE
*http://www.ficci.in/spdocument/20787/FICCI-Indian-Higher-Education.pdf
NDLI VISION
Build up the National Digital Library of India as a National Knowledge and Cultural Asset:
The key driving force for Education, Research, Cultural heritage, Innovation, and knowledge-sharing in India
“IT IS HIGH TIME THAT INDIA SPEAKS FOR ITSELF, TO HIGHLIGHT, OFFER AND SHARE ITS OWN CULTURAL, SPIRITUAL, ACADEMIC AND SCIENTIFIC HERITAGE ON ITS OWN TERMS”
NDLI MISSION
1. To create a 24X7-enabled, integrated, ubiquitous digital knowledge source
2. To protect and preserve India’s cultural, academic and scientific heritage
OPEN
INCLUSIVE
WINNER mBillionth South Asia Award 2017: in Learning and Education Category for Android Mobile App
NDLI MOTTO
ABOUT NDLI ISSUES, ARCHITECTURE AND USE MODELS
“Google can bring you back 100,000 answers. A librarian can bring you back the right one.”
- Neil Gaiman
NDLI ISSUES
USER-SIDE PROVIDER-SIDE
- Wide geographic expanse & large population
- Huge number of students
- Large number of institutions
- Varied linguistic diversity
- Severe lack of Teachers
- Wealth of digital content
● Books and Articles
● ETD
● Question Papers and Solutions
● Video Lectures - MOOCs
● Simulations & Animations
● NMEICT Projects
● Data
● …
- No single-window search
- Google search uses keyword – no metadata search
- Widely varied DL technology
- Lack of Interactivity, Vernacular support
- Low integration between content and learning system
- Weak ecosystem between learners and teachers
SERVICE ARCHITECTURE
Dissemination Services
Learning Services Personalization Services
Localization Services Open Services …
Digital Library
NATIONAL DIGITAL LIBRARY
Content Creation
Content Search
Learning Content Experience based Learning
Multilingual Content
Mobile Apps
Content Borrowing
Content Access
Multi-faceted Interface
Dissemination Services Digital Library
APIs
Authoring Services Acquisition Services
Digital Repository
USE MODELS
NDLI REPOSITORY
Content Access
Physically Challenged
Researchers
Learners
Professionals
Web Content Harvesting
Institutional Digital Repositories
Web Content
Content Contribution
Contributing Institutions
Learning Management System
Educators
PRESENTATION MODELS
● Not a new library – an umbrella
● Collects and ingests metadata only
● Presents full-text from source view
Provides:
● Search
● Browse
OBJECTIVE AND SCOPE
● Targets
● Contents
● Stakeholders
● Contributors
● Users
● Architecture
● The Big Picture
OBJECTIVE AND SCOPE
● A 24X7-enabled Infrastructure for NDLI with single window search facility
– To include h/w systems, n/w, s/w tools, applications and interoperability
● Harvest IDRs across institutions of the nation to provide integrated access
● Facilitate institutes to disseminate existing content and create new digital content
● Immersive E-learning environments at multiple levels spanning across
All academic levels – school to life-long learning
All disciplines – Science, Arts, Engineering, Medical, Law, and
All languages – used as medium of instruction
● Interfaces in Indian Languages & for the Differently Abled
CONTENT AT NDLI
● Born-digital object
● Digital surrogate of a physical object
● Digital metadata of physical object
METADATA AT NDLI
● NDLI does not store contents
● NDLI only ingests metadata for Search & Browse
● Content (Full-text) is delivered from Source
Content is included (its metadata ingested) in NDLI if it is expected to have educational value
RANGE OF CONTENTS
Institutional Digital Repository of Contributing Institutes
Institutions of School & Higher Education, Boards
● Lecture Slides, Videos, Class Notes, Courseware
● Term Papers, Assignments, Solutions
● Lab Experiments, Manuals, Case Studies
● Question Banks (JEE / GATE / NET / CAT), Model Answers
Institutional and Open Contributions. Multi-modal, Multi-faceted
● Datasets, Benchmarks, Models, Maps, Software
● Manuscripts, Painting, Sculpture, Music, Dance, Drama
● Audio & Video Content
Research and Professional Institutions, Central / State University
● Faculty Publications, ETD (Electronic Thesis & Dissertation): DSc-PhD-Masters-Undergrad, Research Projects
● Books & Periodicals, Open Access Journals, E-Books & Subscribed E-Resource
● Annual Reports, Project Reports, Convocation, Working Papers, Others
● Encyclopaedia, Dictionaries, Directories, Others
CONTENT VIEW ARCHITECTURE
Vertical-Specific Custom Interface and Search
Generic Interface and Search
Content Baseline
● School Vertical
● Domain Vertical (Medical/Legal/…)
● Competitive Exam Vertical
● Data Vertical
● Application Vertical
Search Browse
● Textbook, Lesson View
● Domain Metadata
● MCQ/MSQ/...
● Data Browser
● App Browser
STAKEHOLDERS
Stakeholder: Roles and Responsibility
Government
1. Sponsor and facilitator
2. Content Contributor
● Ministries / Departments
● R & D Labs
Institutions (Public / Private; Academic / R & D / Educational)
1. Host Institution – IIT Kharagpur
2. Contributing Institution – Supporting IDRs
3. Participating Institution – Providing Users & Feedback
Public (NGOs, Individuals)
1. Use and Feedback
2. Metadata by Crowd Sourcing
3. Content by Crowd Sourcing
Industry
1. Technology Providers
Publishers
1. Metadata Provider
2. Content Provider (under various licensing schemes)
CONTRIBUTORS: CFTI, State and Central Universities, R & D Labs, Govt. Depts, Free Portals, Publishers, etc.
WHO IS THIS FOR?
The NDLI platform is for ALL learners.
SCHOOL
COLLEGE PROFESSIONALS
LIFELONG LEARNERS
HOW TO REGISTER?
● Registration to NDLI is OPEN FOR ALL.
REGISTRATION TYPE:
● Individual : Register directly
● Institutional: Bulk registration managed by the institution and overseen by an authenticated nodal person appointed by NDLI.
LOG ON TO : HTTPS://NDL.IITKGP.AC.IN
THE BIG PICTURE
NATIONAL DIGITAL LIBRARY OF INDIA
SWAYAM, SWAYAM Prabha, GIAN
CREDIT TRANSFER AND VIRTUAL CERTIFICATION (NAD)
Knowledge Repository: Internet & Mobile
Up to 20 credits from MOOCs
School, Certificate, Diploma, UG & PG: Internet
SWAYAM: instrument for self-actualisation
School, UG, PG, Open U, IIT PAL : TV – DTH
SWAYAM Prabha: 32 DTH, 24x7
THE INTERFACE HTTPS://NDL.IITKGP.AC.IN
The NDLI portal is also available as an Android and iOS mobile app.
NDLI WEBSITE: Landing page
NDLI M-SITE: Landing page
NDLI WEBSITE: Browse
NDLI WEBSITE: Search Result page
REFINE SEARCH: By author (169 results)
REFINE SEARCH: By content type : video (323 results)
REFINE SEARCH: By source: MIT Open Courseware (23 results)
REFINE SEARCH: By language: Malayalam (4 results)
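The refine options above amount to faceted filtering over metadata fields. A minimal sketch, assuming hypothetical field names (`author`, `content_type`, `source`, `language`) and made-up sample records; none of this is NDLI's actual schema or data:

```python
from collections import Counter

# Hypothetical sample records; field names and values are illustrative assumptions.
RECORDS = [
    {"title": "Intro to Algorithms", "author": "A. Author", "content_type": "video",
     "source": "MIT Open Courseware", "language": "English"},
    {"title": "Data Structures", "author": "B. Writer", "content_type": "text",
     "source": "NCERT", "language": "English"},
    {"title": "Ganitham", "author": "C. Scholar", "content_type": "text",
     "source": "NCERT", "language": "Malayalam"},
]


def refine(records, **facets):
    """Keep only the records whose fields match every requested facet value."""
    return [r for r in records if all(r.get(k) == v for k, v in facets.items())]


def facet_counts(records, field):
    """Count how many records fall under each value of one facet field."""
    return Counter(r[field] for r in records)


videos = refine(RECORDS, content_type="video")
by_language = facet_counts(RECORDS, "language")
```

Facets combine by simple conjunction here, so `refine(RECORDS, source="NCERT", language="Malayalam")` narrows by both at once; the per-value counts are what a result page would show next to each refine option.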
MULTI-LINGUAL INTERFACE: BROWSE IN PREFERRED LANGUAGE
NDLI BENGALI
NDLI HINDI
FULL-TEXT: CONTENT PAGE - Sample 1
FULL-TEXT: METADATA PAGE - Sample 1
FULL-TEXT: FURTHER PAGES - Sample 1
FULL-TEXT: CONTENT PAGE - Sample 2
FULL-TEXT: METADATA PAGE - Sample 2
FULL-TEXT: ACTUAL PAGE - Sample 2
IN THE WORKS: AN ACCESSIBLE INTERFACE FOR THE DIFFERENTLY ABLED
● Voice-based Input
● Voice-based Output
● Braille keyboards
METADATA FUNDAMENTALS
Flash-course on data aggregation!
WHAT IS METADATA?
● Metadata is primarily textual information relating to Content.
● It includes information that enables users to identify, discover, search, browse, interpret, or manage Content
● Includes hyperlinks that direct users to Content on the Source
● May include an expressive description of the Content
NOTE: NDLI does not store contents, it only ingests metadata for Search & Browse. The content (Full-text) is delivered from Source.
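As a rough illustration of the metadata-only model, the sketch below searches over metadata text and hands back a link to the source instead of the content itself. The record fields, the `search` helper and the `source.example` URLs are all assumptions made for this example, not NDLI's implementation:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataRecord:
    """Metadata only: the full text stays at the source, reached via source_url."""
    title: str
    description: str
    source_url: str  # hyperlink that directs the user to the content on the Source


# Hypothetical index; the URLs are placeholders, not real NDLI or source addresses.
INDEX = [
    MetadataRecord("Photosynthesis", "Class 7 science chapter on plants",
                   "https://source.example/photosynthesis"),
    MetadataRecord("Linear Algebra", "Video lectures on vector spaces",
                   "https://source.example/linear-algebra"),
]


def search(index, term):
    """Match the term against metadata text only; no full text is held locally."""
    t = term.lower()
    return [r for r in index if t in r.title.lower() or t in r.description.lower()]


hits = search(INDEX, "vector")
links = [r.source_url for r in hits]
```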
METADATA STANDARD
● Defines a common understanding of the semantics of the data
● Ensures correct and proper use and interpretation of the data
● Defined through a set of fields, vocabularies (optional), and instructions for filling them in
NDLI METADATA DESIGN CHALLENGES
1 WIDE CATEGORY OF RESOURCES
Generic metadata or domain specific?
2 OPENNESS OF REPOSITORY
Closed metadata standard may fail to describe a new resource
3 SCALE IS ENORMOUS
● Manual annotation is infeasible
● Automatic annotation guided by crowdsourcing?
NDLI METADATA REQUIREMENT SPECS
To describe any digital resource
● Generic content metadata: Contributor, Description, Language, Format etc.
To describe domain specific resources
● Educational content metadata: Educational level, ToC, Type of learning material etc.
● Thesis metadata: Institution, advisor, degree, researcher
TRANSLATION ISSUES
➢ Variation in Subject classification standards
● Dewey Decimal Classification (DDC)
● Library of Congress Classification (LCC)
● Library of Congress Subject Headings (LCSH)
➢ Mapping terminology for different languages
● Translate when equivalent terminology is present
● Transliterate otherwise
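The translate-else-transliterate rule can be sketched as a simple fallback. The two-entry English-to-Hindi glossary and the `placeholder_transliterate` stand-in are assumptions for illustration; a real system would need full terminology resources and a proper transliteration engine for each language:

```python
from typing import Callable, Dict, Tuple

# Hypothetical English-to-Hindi glossary with two entries, for illustration only.
GLOSSARY: Dict[str, str] = {"library": "पुस्तकालय", "science": "विज्ञान"}


def placeholder_transliterate(term: str) -> str:
    """Stand-in only: a real engine would render the term's sounds in the target script."""
    return f"<{term}>"


def localize(term: str, glossary: Dict[str, str],
             transliterate: Callable[[str], str]) -> Tuple[str, str]:
    """Translate when an equivalent term exists; transliterate otherwise."""
    key = term.lower()
    if key in glossary:
        return glossary[key], "translated"
    return transliterate(term), "transliterated"


known = localize("Library", GLOSSARY, placeholder_transliterate)
unknown = localize("Wikification", GLOSSARY, placeholder_transliterate)
```

Returning the method alongside the text lets a curator review transliterated terms separately from translated ones.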
SO WE COLLECTED CONTENT.
HOW DO WE ENSURE RELEVANCE?
HOW DO WE PROPAGATE USAGE?
HOW DO WE GENERATE MORE VALUE?
HOW DO WE ENSURE SUSTAINABILITY? HOW DO WE GROW?
NDLI INITIATIVES & INNOVATIONS
SKILL DEVELOPMENT AND TECH INTEGRATION LEADERSHIP: IDR WORKSHOPS
TECHNOLOGY INNOVATION: METADATA EXTRACTION
METADATA ENVELOPE
NDLI METADATA
DUBLIN CORE (Generic): Type, Title, Subject, Language, Description, Date, Author
LRMI (Educational): Board, Difficulty Level, Typical LT, Prerequisite Topic, Type of LM, Pedagogic Objective, Educational Level
SHODHGANGA (Thesis): Researcher, Keyword, Advisor, Place, Department, Institution, Awarded Degree
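One way to picture the envelope is a record that always carries the generic profile and optionally attaches the educational or thesis profile. The class and attribute names below are hypothetical and cover only a subset of the fields listed; this is a sketch, not the NDLI schema definition:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Generic:
    """Dublin Core-style fields (a subset of those named above)."""
    title: str
    language: str
    author: Optional[str] = None


@dataclass
class Educational:
    """LRMI-style fields (a subset of those named above)."""
    educational_level: Optional[str] = None
    difficulty_level: Optional[str] = None


@dataclass
class Thesis:
    """Shodhganga-style fields (a subset of those named above)."""
    researcher: Optional[str] = None
    advisor: Optional[str] = None
    institution: Optional[str] = None


@dataclass
class Envelope:
    """Generic metadata is always present; domain profiles are attached when relevant."""
    generic: Generic
    educational: Optional[Educational] = None
    thesis: Optional[Thesis] = None

    def profiles(self) -> List[str]:
        """Names of the profiles this record actually carries."""
        names = ["generic"]
        if self.educational is not None:
            names.append("educational")
        if self.thesis is not None:
            names.append("thesis")
        return names


phd = Envelope(
    generic=Generic(title="A Study of River Deltas", language="English"),
    thesis=Thesis(researcher="R. Scholar", advisor="Dr. Example"),
)
```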
ACQUISITION SCENARIOS
LOCATE CONTENT
➢ Harvest IDRs
➢ Crawl Websites
➢ In Bulk – from Publishers
➢ Donated by Source
➢ Source-supported API
…
ACQUIRE METADATA
➢ Creation
● Manual
● Automated
➢ Translation
● Format
● Standard / Schema
➢ Curation
● Manual
● Assisted
➢ Ingestion
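The translation, curation and ingestion steps can be sketched as a small pipeline. The field map, the `required` rule and the sample harvested records are hypothetical, chosen only to show how a record either reaches the store or is set aside for manual curation:

```python
def translate_schema(raw, field_map):
    """Translation step: rename source-specific fields to the target schema."""
    return {field_map[k]: v for k, v in raw.items() if k in field_map}


def curate(record, required=("title",)):
    """Curation step: trim string values and report missing required fields."""
    cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
    problems = [f for f in required if not cleaned.get(f)]
    return cleaned, problems


def ingest(store, record):
    """Ingestion step: add the curated record to the searchable store."""
    store.append(record)


# Hypothetical mapping from one source's field names to generic field names.
FIELD_MAP = {"name": "title", "lang": "language", "link": "source_url"}

store, rejected = [], []
harvested = [
    {"name": " Organic Chemistry ", "lang": "English", "link": "https://source.example/1"},
    {"lang": "Hindi", "link": "https://source.example/2"},  # no title: needs manual curation
]
for raw in harvested:
    record, problems = curate(translate_schema(raw, FIELD_MAP))
    if problems:
        rejected.append((record, problems))
    else:
        ingest(store, record)
```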
CHALLENGES IN METADATA ACQUISITION
➢ Different sources follow different annotation norms
● Automation of curation
➢ Errors in sourced data
● Error listing
● Manual curation
➢ Errors in manual annotation
● Review step before actual submission in database
● Manual curation
➢ Set up norms/guidelines
● Controlled vocabulary
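A controlled vocabulary plus an error listing might look like the following sketch. The language vocabulary and the sample values are assumptions for illustration; unresolved values are listed with their position so they can go to manual curation:

```python
# Hypothetical controlled vocabulary for a language field: variant -> canonical term.
LANGUAGE_VOCAB = {"en": "English", "eng": "English", "english": "English",
                  "hi": "Hindi", "hin": "Hindi", "hindi": "Hindi"}


def normalize(values, vocab):
    """Map each value to its canonical term; list values the vocabulary cannot resolve."""
    normalized, errors = [], []
    for i, value in enumerate(values):
        canonical = vocab.get(value.strip().lower())
        if canonical is None:
            errors.append((i, value))  # error listing for manual curation
        else:
            normalized.append(canonical)
    return normalized, errors


clean, error_list = normalize(["ENG", " hindi", "Engish"], LANGUAGE_VOCAB)
```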
NDLI METADATA STANDARDS
NDLI Metadata Standard v 1.0 is an Open Virtual Standard
● URL: http://www.ndlproject.iitkgp.ac.in/ndl/header.php?mname=Metadata%20Schema
Schema is categorized into three profiles:
1. Generic Metadata:
● Describes general attributes of the contents
● Adopted from Dublin Core Metadata Standard
2. Educational Metadata:
● Describes the educational attributes of the resources and helps in enumerating properties of the contents relevant to teaching-learning process
● Adopted from Learning Resource Metadata Initiative (LRMI)
3. Thesis Metadata:
● Describes dissertation or thesis related metadata fields
● Adopted from Shodhganga Thesis Metadata Standard
More profiles may be added in the future
Uses the namespace of Dublin Core (dc.)
NDLI METADATA EXTRACTION TOOLKIT
● Automated metadata extraction workflow
● Syntactic metadata extractor
○ Author name, ISBN, Publisher, dates etc.
● Table of contents extractor
● Wikification for keyword extraction
● Learning resource metadata
● Toolchain for metadata extraction
Once completed and tested, the toolkit will be released to the public and open-sourced
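To give a feel for what a syntactic extractor does, here is a minimal sketch that pulls ISBN-13 numbers and years out of free text with regular expressions. The patterns are simplified assumptions written for this example and are not the toolkit's actual rules; the ISBN check-digit test is the standard ISBN-13 one:

```python
import re

# Simplified patterns: assumptions for illustration, not the toolkit's actual rules.
ISBN13 = re.compile(r"\b97[89][- ]?\d{1,5}[- ]?\d{1,7}[- ]?\d{1,7}[- ]?\d\b")
YEAR = re.compile(r"\b(1[5-9]\d{2}|20\d{2})\b")


def isbn13_is_valid(isbn):
    """Check the ISBN-13 check digit (weights alternate 1 and 3)."""
    digits = [int(c) for c in isbn if c.isdigit()]
    if len(digits) != 13:
        return False
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return total % 10 == 0


def extract(text):
    """Return candidate ISBNs that pass the check digit, plus four-digit years."""
    isbns = [m.group(0) for m in ISBN13.finditer(text) if isbn13_is_valid(m.group(0))]
    years = YEAR.findall(text)
    return {"isbn": isbns, "years": years}


found = extract("Published 2006. ISBN 978-3-16-148410-0.")
```

Validating the check digit filters out digit runs that merely look like an ISBN, which matters when the input is noisy OCR or scraped text.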
COMMUNITY INNOVATION: NDLI CLUBS
VIRTUAL LIBRARIES ARE MORE THAN A PASSIVE COLLECTION OF BOOKS...
● As with all other things digital, users expect far more customization and interaction from digital libraries. The digital library hence needs to evolve into a bouquet of services instead of just a storage platform.
● Apart from collating data and connecting them meaningfully, the end-success of the library will always lie with its users. NDLI Clubs encourage the active involvement of users with the content as well as with each other to Learn, Share, Grow.
NDLI CLUBS
● E-learning environment for schools
● Involving all K-12 Stakeholders
● Single point of access for unified academic resources
● Aimed at building active learners among users - NDLI club newsletters, quiz, workshops on themes like STEM, e-commerce, IoT etc.
NDLI CLUBS
● Clubs can provide insights on learning outcomes and cognitive abilities to improve the curriculum
● Will allow development of concept-based questions modelled on learning behaviour
● Student profiles and performance will allow school management to make better decisions
ELEPHANT IN THE ROOM: COPYRIGHTS
● WHAT CAN I SHARE?
● WHAT CAN I REPRODUCE?
● HOW DO I KNOW IF THE CONTENT IS ORIGINAL?
● IS THIS CITABLE?
● HOW CREDIBLE IS THIS INFORMATION?
OPERATIONS LEADERSHIP: INTELLECTUAL PROPERTY
Copyright and Intellectual Property have been a major area of learning through our two years of operation.
NDLI conducted the National Workshop on Copyright Issues in 2018, in collaboration with eminent IP lawyers, librarians of premier institutes across the nation, and government policy makers, to come up with a Manual of Copyright Best Practices, the very first of its kind for India.
NDLI’S LEARNINGS ON INTELLECTUAL PROPERTY
Copyright is a bundle of rights given by the law to:
● Creators of literary, dramatic, musical and artistic works and
● Producers of cinematograph films and sound recordings
Rights are:
● reproduction of the work,
● communication of the work to the public,
● adaptation of the work and,
● translation of the work
Creator / Author is the first Owner
Ownership (Title) can be transferred (assigned)
Copyright Laws are subject to time period and geographical location and the subsequent IP laws of the land
NDLI’S APPROACH TO COPYRIGHTS
● Vast Majority of Metadata on NDLI is not subject to Copyright Restrictions
● NDLI’s Partners share NDLI’s Commitment
● NDLI Asserts No Rights Over its Database of Metadata and Dedicates its Contributions to the Public Domain
● Users have free and unencumbered access to Metadata
● NDLI strives to provide India with its own geographical definition of copyrights at par with international standards
NDLI ACCESS OPTIONS
Open
Full-text available to all (Example: NCERT)
NDLI Users
Full-text available through NDL, not directly from Source (Example: South Asia Archive)
Limited Access
Part of text available but full-text requires authorization by Source authority (Example: IISER, Bhopal)
Subscribed
Full-text available from institutions that have subscribed to the Source (Example: Springer)
Restricted
Full-text access requires authorization by Source authority and separate login to the Source (Example: IIT Jodhpur)
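The five access options can be read as a small decision table. The sketch below encodes one possible reading; the flag names are hypothetical, and treating "NDLI Users" as "requires an NDLI login" is an assumption, not something the slide states:

```python
from enum import Enum


class Access(Enum):
    OPEN = "Open"
    NDLI_USERS = "NDLI Users"
    LIMITED = "Limited Access"
    SUBSCRIBED = "Subscribed"
    RESTRICTED = "Restricted"


def can_read_full_text(access, *, ndli_login=False, institution_subscribed=False,
                       source_authorized=False, source_login=False):
    """Return True when the user's credentials satisfy the item's access option."""
    if access is Access.OPEN:
        return True
    if access is Access.NDLI_USERS:
        return ndli_login  # assumption: served through NDL to logged-in users
    if access is Access.LIMITED:
        return source_authorized
    if access is Access.SUBSCRIBED:
        return institution_subscribed
    if access is Access.RESTRICTED:
        return source_authorized and source_login
    raise ValueError(f"unknown access option: {access!r}")


anonymous_open = can_read_full_text(Access.OPEN)
anonymous_subscribed = can_read_full_text(Access.SUBSCRIBED)
```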
RIGHTSSTATEMENTS
● NDLI is a member of rightsstatements.org at the Steering Committee level
● Develop standardized copyright summaries for digital libraries
● Facilitate learners with access to knowledge without boundaries
INNOVATION IN KNOWLEDGE DISSEMINATION: NATIONAL LICENSING
NATIONAL LICENSING
An initiative by eSS:
● e-ShodhSindhu: Consortium for Higher Education Electronic Resources
Licensing being negotiated with Publishers
● Institutional License:
Accessible from within designated institutional network: IP filtered
● NDLI Supported License:
Accessible if requested from NDLI: Concurrent use-basis
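The two licence types map onto two familiar checks: an IP range test and a cap on simultaneous sessions. A minimal sketch using Python's standard `ipaddress` module; the network ranges are reserved documentation addresses, and the `ConcurrentLicense` class is a hypothetical illustration, not NDLI's mechanism:

```python
import ipaddress


def within_institution(client_ip, networks):
    """IP filtering: True if the client address falls inside any licensed network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(n) for n in networks)


class ConcurrentLicense:
    """Grant at most `seats` simultaneous sessions; callers release when done."""

    def __init__(self, seats):
        self.seats = seats
        self.active = set()

    def acquire(self, user):
        if user in self.active:
            return True  # user already holds a seat
        if len(self.active) >= self.seats:
            return False  # all seats are in use
        self.active.add(user)
        return True

    def release(self, user):
        self.active.discard(user)


# Placeholder network from the reserved documentation range, not a real campus network.
CAMPUS = ["192.0.2.0/24"]
on_campus = within_institution("192.0.2.15", CAMPUS)
off_campus = within_institution("198.51.100.7", CAMPUS)
```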
CATEGORY INNOVATION: CHAVI
DATA-DRIVEN CANCER RESEARCH
● India currently does not have the infrastructure for dedicated bio banks to aid medical research
● This prevents the Indian medical system from learning of and from trends to prepare for subsequent treatment
● NDLI has tied up with TATA MEDICAL CENTRE in Kolkata to develop a prototype for bio-banks that will aid cancer-research
CHAVI
COMPREHENSIVE DIGITAL ARCHIVE OF CANCER IMAGING
● Dedicated image banking service
● Will provide annotated images with associated information of cancer patients
● Furthering India’s data banking initiatives
● Library migrating to create a database for bio-banks
AI INNOVATION: CREATING KNOWLEDGE
UNDERSTANDING THE MODERN KNOWLEDGE SEEKER:
● Technology and hence technology-enabled access have led to users becoming virtualized
● As a data aggregator, digital libraries can not only facilitate search and browse, but also bring together content libraries and algorithm libraries to compute big-data solutions remotely and in real time!
● So not only can the library store knowledge for its users, it can also evolve to create/compute knowledge by cross-linking programs with data-sets
SERVICE ARCHITECTURE:
Cloud Storage Data Storage Computer
Server
Compute and Storage Infrastructure
Internal
Cloud AWS
Cloud Computer
(Back-End) Processing
Units
Cloud Setup Interface and Services
Metadata Storage
Structured Attributes Knowledge
Graph Representation
Ontology and Classification
Access
Mechanisms
and Policies
Query Result
(Front-End)
Search-Compute
Query Processing
User-1
User-2
PERSONALIZED LEARNING
Experience Tracking
● To offer customized search results
Multi-lingual Support
● Reduce cognitive load for native use
Personalization
● Customized UI to suit user grade
• Multi-Aspect and Multi-Level Structured Prediction for Questions
• Knowledge Tracing to estimate the level of knowledge of a student
• Concept Ecosystem for Addressing Knowledge Gap
• Cognitive Load Index of Concepts
PH-1 PH-2
ACCESS AND DISCOVERY INNOVATION: SURROGATOR
SURROGATOR
● Makes available pre-print copies of academic resources that are otherwise unavailable to learners
● A customizable application that can identify surrogates for access-restricted publications in digital libraries
● Surrogator can make a case for the evolution of specific subject matters through pre-print indexing and open up a whole new realm of research
● Application is being tested through NDLI, can easily be extended to other digital libraries
1. QUERY
2. QUERY
3. RESULTS
4. RESULTS + ACCESS RIGHTS
5. FIND SURROGATES
6. CITATION
7. ARTICLES + RELATED_ARTICLES + CITED_BY
8. SURROGATES
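The core of the flow (take results with access rights, then look for surrogates of the restricted ones) can be sketched as below. The data shapes, the title-based matching and the `preprints.example` URL are assumptions for illustration; how Surrogator actually matches articles is not described here:

```python
def normalize_title(title):
    """Lower-case and collapse whitespace so titles compare reliably."""
    return " ".join(title.lower().split())


def find_surrogates(results, open_copies):
    """Map each access-restricted result to openly readable copies, if any are known."""
    surrogates = {}
    for item in results:
        if item["access"] == "open":
            continue  # already readable, no surrogate needed
        copies = open_copies.get(normalize_title(item["title"]))
        if copies:
            surrogates[item["title"]] = copies
    return surrogates


# Hypothetical library results with access rights, and a hypothetical lookup of
# open copies such as a citation source might return.
results = [
    {"title": "Deep Learning for Crops", "access": "subscribed"},
    {"title": "Open Maths Notes", "access": "open"},
    {"title": "Rare  Manuscript Study", "access": "restricted"},
]
open_copies = {"deep learning for crops": ["https://preprints.example/1234"]}

surrogates = find_surrogates(results, open_copies)
```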
THE ONE ON DIGITAL PRESERVATION:
INTERNATIONAL LEARNINGS ON DIGITAL PRESERVATION:
• National Heritage Digitization Strategy (NHDS): https://nhds.ca/ Canadian memory institutions are working together to find ways to improve access to digital collections and better preserve cultural heritage
• Tour of Gatineau Preservation Centre
• Co-Lab: A collaboration tool used to transcribe, tag, translate and describe digitized records from the collection. It also has a number of open challenges in Transcription, Translation, Tagging and Description
• Library of Congress: International Image Interoperability Framework (IIIF) is the way to go
NDLI PRESERVATION CENTRE:
• NDLI IS POISED TO SET UP A PRESERVATION CENTRE WITH WORLD LIBRARY FOUNDATION
We have already started work on digital preservation of the archives of PRESIDENCY UNIVERSITY, Kolkata.
1. DIGITIZE
2. MAKE SEARCHABLE (OCR)
3. DIGITAL PUBLICATIONS
A GLOBAL PRESENCE
Representatives from National Digital Library of India regularly participate in both national and international workshops with the aim to:
1. Collaborate with institutions and organizations that are leaders in the field of library science and technology
2. Learn from existing knowledge portals and integrate key lessons into further developing NDLI
3. Share India’s culture, knowledge and technology inheritance with the world
NDLI OUTREACH 360
1. PAN INDIA WORKSHOPS FOR USERS, INSTITUTES AND PUBLIC LIBRARIES
2. NATIONAL LEVEL POLICY ON INTELLECTUAL PROPERTY RIGHTS
3. REJUVENATING THE PUBLIC LIBRARY SERVICE THROUGH DIGITAL REFERENCE SOURCES
4. OCR FOR INDIAN LANGUAGES
5. NDLI CLUB
6. TRAINING AND REPOSITORY SERVICE
7. INTERNATIONAL WORKSHOPS AND INTEGRATION ON KNOWLEDGE ENGINEERING
KEDL-2019 http://kedl2019.ndl.gov.in
Out of the various DIGITAL INDIA initiatives, NDLI is already integrated with UMANG.
NDLI SOCIAL MEDIA CAMPAIGNS:
NDLI ON SOCIAL MEDIA : #WWOW
NAME YOUR TOP 5 WOMEN SCIENTISTS - is a list that most of us failed to complete after Marie Curie. #WWOW (Wonderful Women On Wednesday) aims to bring to you stories of some such absolutely wonderful women every Wednesday - we intend to highlight their work and their contribution by directing you to such content from our archives. We hope you will enjoy and learn from them as much as we did.
NDLI ON SOCIAL MEDIA : #NTR
#NightTimeRead is a National Digital Library of India initiative to bring to you some rare and not-so-rare pieces of fiction just before you go to sleep. The content is short and available in the form of PDFs / audio-books to not take up too much of your time but still hopefully instill a bit of magic pre-bedtime.
NDLI ON SOCIAL MEDIA : #STUDYTIME
Take the drudgery out of sitting down to study. Study Time is designed to help students learn better through sharing resources that simplify learning. Interactive tutorials, video lectures, question papers, solutions and other quality resources brought straight to your smart device. Study smart with #StudyTime - anytime, anywhere.
NDLI ON SOCIAL MEDIA : #SME
Sunday Morning Edutainment features graphic novels, comic books, manga, art history, music, dance and much much more! Start your day off by learning about the art of Monet, about believing in yourself from Naruto or about music from Beethoven. Go out of syllabus, edutain yourself with #SME.
FIND US ON SOCIAL MEDIA:
www.facebook.com/ndlindia
www.twitter.com/ndlindia
www.linkedin.com/company/ndlindia
www.youtube.com/c/ndlindia
Don't limit a child to your own learning, for he was born in another time.
-Rabindranath Tagore
Use your e-mail address to sign up at https://ndl.iitkgp.ac.in/
OR https://ndl.gov.in/
Together, let us #LearnShareGrow
@NDLIndia
Dedicated to the Nation.