
Prof. Partha Pratim Das

ppd@cse.iitkgp.ac.in, ppd@see.iitkgp.ac.in, partha.p.das@gmail.com

Professor, Department of Computer Science & Engineering,

Head, Rajendra Mishra School of Engineering Entrepreneurship,

Professor-in-Charge, IIT Kharagpur Research Park, Kolkata,

Joint Principal Investigator, National Digital Library of India Project

Indian Institute of Technology Kharagpur

SHORT TERM COURSE ON DIGITAL INDIA BY UGC-HRDC

21 – 26 December 2018

The Digital India programme is a flagship programme of the Government of India with a vision to transform India into a digitally empowered society and knowledge economy.

INDIA’S NEED Quality education for ANYONE, ANYTIME, ANYWHERE

*http://www.ficci.in/spdocument/20787/FICCI-Indian-Higher-Education.pdf

NDLI VISION

Build up National Digital Library of India as a National Knowledge and Cultural Asset:

The key driving force for Education, Research, Cultural heritage, Innovation, and knowledge-sharing in India

“IT IS HIGH TIME THAT INDIA SPEAKS FOR ITSELF, TO HIGHLIGHT, OFFER AND SHARE ITS OWN CULTURAL, SPIRITUAL, ACADEMIC AND SCIENTIFIC HERITAGE ON ITS OWN TERMS”

NDLI MISSION

1 To create a 24X7-enabled integrated ubiquitous digital knowledge source

2 To protect and preserve India’s cultural, academic and scientific heritage

OPEN

INCLUSIVE

WINNER mBillionth South Asia Award 2017: in Learning and Education Category for Android Mobile App

NDLI MOTTO

ABOUT NDLI ISSUES, ARCHITECTURE AND USE MODELS

“Google can bring you back 100,000 answers. A librarian can bring you back the right one.”

- Neil Gaiman

NDLI ISSUES

USER-SIDE PROVIDER-SIDE

- Wide geographic expanse & Large population

- Huge number of students

- Large number of institutions

- Varied linguistic diversity

- Severe lack of Teachers

- Wealth of digital content

● Books and Articles

● ETD

● Question Papers and Solutions

● Video Lectures - MOOCs

● Simulations & Animations

● NMEICT Projects

● Data

● …

- No single-window search

- Google search uses keyword – no metadata search

- Widely varied DL technology

- Lack of Interactivity, Vernacular support

- Low integration between content and learning system

- Weak ecosystem between learners and teachers

SERVICE ARCHITECTURE

Dissemination Services

Learning Services Personalization Services

Localization Services Open Services …

Digital Library

NATIONAL DIGITAL LIBRARY

Content Creation

Content Search

Learning Content Experience based Learning

Multilingual Content

Mobile Apps

Content Borrowing

Content Access

Multi-faceted Interface

Dissemination Services Digital Library

APIs

Authoring Services Acquisition Services

Digital Repository

USE MODELS

NDLI REPOSITORY

Content Access

Physically Challenged

Researchers

Learners

Professionals

Web Content Harvesting

Institutional Digital

Repositories

Web Content

Content Contribution

Contributing Institutions

Learning Management

System Educators

PRESENTATION MODELS

● Not a new library – an umbrella

● Collects and ingests metadata only

● Presents full-text from source view

Provides:

● Search

● Browse

OBJECTIVE AND SCOPE

● Targets

● Contents

● Stakeholders

● Contributors

● Users

● Architecture

● The Big Picture

OBJECTIVE AND SCOPE

● A 24X7-enabled Infrastructure for NDLI with single window search facility

– To include h/w systems, n/w, s/w tools, applications and interoperability

● Harvest IDRs across institutions of the nation to provide integrated access

● Facilitate institutes to disseminate existing content and create new digital content

● Immersive E-learning environments at multiple levels spanning across

All academic levels – school to life-long learning

All disciplines – Science, Arts, Engineering, Medical, Law, and

All languages – used as medium of instruction

● Interfaces in Indian Languages & for the Differently Abled

CONTENT AT NDLI

● Born-digital object

● Digital surrogate of a physical object

● Digital metadata of physical object

METADATA AT NDLI

● NDLI does not store content

● NDLI only ingests metadata for Search & Browse

● Content (full text) is delivered from the Source

Content is included (its metadata ingested) in NDLI if it is expected to have educational value
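A minimal sketch of the metadata-only model, assuming a hypothetical `MetadataRecord` type and placeholder `example.org` URLs (none of these names come from NDLI itself). The index holds descriptive fields plus a link, search runs over the metadata, and opening an item simply hands back the address at the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataRecord:
    """Hypothetical index entry: descriptive metadata plus a pointer to the source."""
    title: str
    language: str
    source_name: str
    source_url: str  # the full text stays at the source; only this link is stored


def search(index: list[MetadataRecord], term: str) -> list[MetadataRecord]:
    """Case-insensitive title match over the ingested metadata only."""
    needle = term.lower()
    return [record for record in index if needle in record.title.lower()]


def open_full_text(record: MetadataRecord) -> str:
    """Return the location the reader is sent to; no content is served locally."""
    return record.source_url


index = [
    MetadataRecord("Introduction to Algebra", "English", "Example Repository",
                   "https://repository.example.org/item/1"),
    MetadataRecord("Organic Chemistry Notes", "Hindi", "Example Repository",
                   "https://repository.example.org/item/2"),
]
hits = search(index, "algebra")
```

Because only the link is stored, the source keeps control over the full text and over who may read it.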

RANGE OF CONTENTS

Institutional Digital Repository of Contributing Institutes

Institutions of School & Higher Education, Boards
● Lecture Slides, Videos, Class Notes, Courseware
● Term Papers, Assignments, Solutions
● Lab Experiments, Manuals, Case Studies
● Question Banks (JEE / GATE / NET / CAT), Model Answers

Institutional and Open Contributions. Multi-modal, Multi-faceted
● Datasets, Benchmarks, Models, Maps, Software
● Manuscripts, Painting, Sculpture, Music, Dance, Drama
● Audio & Video Content

Research and Professional Institutions, Central / State University
● Faculty Publications, ETD (Electronic Thesis & Dissertation): DSc-PhD-Masters-Undergrad, Research Projects
● Books & Periodicals, Open Access Journals, E-Books & Subscribed E-Resource
● Annual Reports, Project Reports, Convocation, Working Papers, Others
● Encyclopaedia, Dictionaries, Directories, Others

CONTENT VIEW ARCHITECTURE

Layers:
● Vertical-Specific Custom Interface and Search
● Generic Interface and Search
● Content Baseline

Verticals:
● School Vertical
● Domain Vertical (Medical/Legal/…)
● Competitive Exam Vertical
● Data Vertical
● Application Vertical

Generic interface: Search, Browse

Vertical-specific views:
● Textbook, Lesson View
● Domain Metadata
● MCQ/MSQ/...
● Data Browser
● App Browser

STAKEHOLDERS

Stakeholder: Roles and Responsibility

Government:
1. Sponsor and facilitator
2. Content Contributor
● Ministries / Departments
● R & D Labs

Institutions (Public / Private; Academic / R & D / Educational):
1. Host Institution – IIT Kharagpur
2. Contributing Institution – Supporting IDRs
3. Participating Institution – Providing Users & Feedback

Public (NGOs; Individuals):
1. Use and Feedback
2. Metadata by Crowd Sourcing
3. Content by Crowd Sourcing

Industry:
1. Technology Providers

Publishers:
1. Metadata Provider
2. Content Provider (under various licensing schemes)

CONTRIBUTORS:

CFTI, State and Central Universities, R & D Labs, Govt. Depts, Free Portals, Publishers, etc.

WHO IS THIS FOR?

The NDLI platform is for ALL learners.

SCHOOL

COLLEGE PROFESSIONALS

LIFELONG LEARNERS

HOW TO REGISTER?

● Registration to NDLI is OPEN FOR ALL.

REGISTRATION TYPE:

● Individual : Register directly

● Institutional: Bulk registration managed by the institution and overseen by an authenticated nodal person appointed by NDLI.

LOG ON TO : HTTPS://NDL.IITKGP.AC.IN

THE BIG PICTURE

NATIONAL DIGITAL LIBRARY OF INDIA

SWAYAM, SWAYAM Prabha,

GIAN

CREDIT TRANSFER AND VIRTUAL

CERTIFICATION (NAD)

Knowledge Repository: Internet & Mobile

Up to 20 credits from MOOCs

School, Certificate, Diploma, UG & PG: Internet

SWAYAM: instrument for self-actualisation

School, UG, PG, Open U, IIT PAL : TV – DTH

SWAYAM Prabha: 32 DTH, 24x7

THE INTERFACE HTTPS://NDL.IITKGP.AC.IN

The NDLI portal is also available as an Android and iOS mobile app.

NDLI WEBSITE: Landing page

NDLI M-SITE: Landing page

NDLI WEBSITE: Browse

NDLI WEBSITE: Search Result page

REFINE SEARCH: By author (169 results)

REFINE SEARCH: By content type : video (323 results)

REFINE SEARCH: By source: MIT Open Courseware (23 results)

REFINE SEARCH: By language: Malayalam (4 results)
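The refine-search screens narrow one result set by author, content type, source, or language. A rough sketch of that kind of faceted filtering follows; the records, field names, and helper functions are invented for illustration and do not reflect the portal's actual implementation.

```python
from collections import Counter

# Hypothetical search results; field names are assumptions for this sketch.
results = [
    {"title": "Linear Algebra Lecture 1", "content_type": "video",
     "source": "Source A", "language": "English"},
    {"title": "Linear Algebra Notes", "content_type": "text",
     "source": "Source B", "language": "Malayalam"},
    {"title": "Linear Algebra Lecture 2", "content_type": "video",
     "source": "Source A", "language": "English"},
]


def facet_counts(items, field):
    """Count how many results fall under each value of one facet."""
    return Counter(item[field] for item in items)


def refine(items, **filters):
    """Keep only the results that match every requested facet value."""
    return [item for item in items
            if all(item.get(key) == value for key, value in filters.items())]


videos_in_english = refine(results, content_type="video", language="English")
```

The counts shown next to each facet value on a results page correspond to `facet_counts`, and selecting a value corresponds to one `refine` call.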

MULTI-LINGUAL INTERFACE: BROWSE IN PREFERRED LANGUAGE

NDLI BENGALI

NDLI HINDI

FULL-TEXT: CONTENT PAGE - Sample 1

FULL-TEXT: METADATA PAGE - Sample 1

FULL-TEXT: FURTHER PAGES - Sample 1

FULL-TEXT: CONTENT PAGE - Sample 2

FULL-TEXT: METADATA PAGE - Sample 2

FULL-TEXT: ACTUAL PAGE - Sample 2

IN THE WORKS: A SPECIALLY ABLED INTERFACE FOR THE DIFFERENTLY ABLED

● Voice-based Input

● Voice-based Output

● Braille keyboards

METADATA FUNDAMENTALS

Flash-course on data aggregation!

WHAT IS METADATA?

● Metadata is primarily textual information relating to Content.

● It includes information that enables users to identify, discover, search, browse, interpret, or manage Content

● It includes hyperlinks that direct users to Content on the Source

● It may include an expressive description of the Content

NOTE: NDLI does not store contents, it only ingests metadata for Search & Browse. The content (Full-text) is delivered from Source.

METADATA STANDARD

● Defines a common understanding of the semantics of the data

● Ensures correct and proper use and interpretation of the data

● Defined through a set of fields, vocabularies (optional), and instructions for filling them in

NDLI METADATA DESIGN CHALLENGES

1 WIDE CATEGORY OF RESOURCES

Generic metadata or domain specific?

2 OPENNESS OF REPOSITORY

Closed metadata standard may fail to describe a new resource

3 SCALE IS ENORMOUS

● Manual annotation is infeasible

● Automatic annotation guided by crowdsourcing?

NDLI METADATA REQUIREMENT SPECS

To describe any digital resource

● Generic content metadata: Contributor, Description, Language, Format etc.

To describe domain specific resources

● Educational content metadata: Educational level, ToC, Type of learning material etc.

● Thesis metadata: Institution, advisor, degree, researcher

TRANSLATION ISSUES

➢ Variation in Subject classification standards

● Dewey Decimal Classification (DDC)

● Library of Congress Classification (LCC)

● Library of Congress Subject Headings (LCSH)

➢ Mapping terminology for different languages

● Translate when equivalent terminology is present

● Transliterate otherwise
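The translate-or-transliterate rule can be sketched as a lookup with a fallback. The glossary entries and the `placeholder_transliterate` function below are stand-ins invented for this example; a real system would plug in a proper transliteration engine for the target script.

```python
# Hypothetical glossary of subject terms with known Hindi equivalents.
GLOSSARY_HI = {
    "mathematics": "गणित",
    "science": "विज्ञान",
}


def placeholder_transliterate(term: str) -> str:
    """Stand-in for a real transliteration engine; it only tags the term."""
    return f"[translit:{term}]"


def map_term(term, glossary=GLOSSARY_HI, transliterate=placeholder_transliterate):
    """Translate when an equivalent exists in the glossary, else transliterate."""
    key = term.strip().lower()
    if key in glossary:
        return glossary[key], "translated"
    return transliterate(term), "transliterated"


known = map_term("Mathematics")
unknown = map_term("Wikification")
```

Returning the method alongside the text lets a curator review transliterated terms separately from translated ones.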

SO WE COLLECTED CONTENT.

HOW DO WE ENSURE RELEVANCE?

HOW DO WE PROPAGATE USAGE?

HOW DO WE GENERATE MORE VALUE?

HOW DO WE ENSURE SUSTAINABILITY? HOW DO WE GROW?

NDLI INITIATIVES & INNOVATIONS

SKILL DEVELOPMENT AND TECH INTEGRATION LEADERSHIP: IDR WORKSHOPS

TECHNOLOGY INNOVATION: METADATA EXTRACTION

METADATA ENVELOPE

NDLI METADATA

DUBLIN CORE (Generic): Type, Title, Subject, Language, Description, Date, Author

LRMI (Educational): Board, Difficulty Level, Typical LT, Prerequisite Topic, Type of LM, Pedagogic Objective, Educational Level

SHODHGANGA (Thesis): Researcher, Keyword, Advisor, Place, Department, Institution, Awarded Degree

ACQUISITION SCENARIOS

LOCATE CONTENT

ACQUIRE METADATA

➢ Harvest Institutional IDRs

➢ Crawl Websites

➢ In Bulk – from Publishers

➢ Donated by Source

➢ Source-supported API

➢ Creation

● Manual

● Automated

➢ Translation

● Format

● Standard / Schema

➢ Curation

● Manual

● Assisted

➢ Ingestion
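The "Translation: Standard / Schema" step can be pictured as a field crosswalk from a source's own field names to the target schema. The mapping and field names below are hypothetical; each real source would need its own table.

```python
# Hypothetical crosswalk: source field name -> target schema field name.
CROSSWALK = {
    "name": "title",
    "creator": "author",
    "lang": "language",
    "url": "source_url",
}


def translate_record(source_record: dict, crosswalk: dict):
    """Rename mapped fields; set aside anything the crosswalk does not cover."""
    translated, unmapped = {}, {}
    for key, value in source_record.items():
        target = crosswalk.get(key)
        if target is None:
            unmapped[key] = value      # left for manual or assisted curation
        else:
            translated[target] = value
    return translated, unmapped


incoming = {"name": "Soil Survey Report", "creator": "A. Author",
            "lang": "English", "shelf_no": "R-12"}
translated, unmapped = translate_record(incoming, CROSSWALK)
```

Keeping the unmapped fields instead of dropping them gives the curation step something concrete to work from.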

CHALLENGES IN METADATA ACQUISITION

➢ Different sources follow different norms in annotation

● Automation of curation

➢ Errors in sourced data

● Error listing

● Manual curation

➢ Errors in manual annotation

● Review step before actual submission in database

● Manual curation

➢ Set up norms/guidelines

● Controlled vocabulary
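Error listing, a review step before submission, and a controlled vocabulary can be combined in one small check. The required fields and the vocabulary below are assumptions made for this sketch, not NDLI's actual rules.

```python
# Assumed controlled vocabulary and required fields, for illustration only.
LANGUAGE_VOCAB = {"English", "Hindi", "Bengali", "Malayalam"}
REQUIRED_FIELDS = ("title", "language", "source_url")


def list_errors(record: dict) -> list[str]:
    """Return a human-readable error list for one metadata record."""
    errors = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            errors.append(f"missing {field}")
    language = record.get("language")
    if language and language not in LANGUAGE_VOCAB:
        errors.append(f"language not in controlled vocabulary: {language}")
    return errors


def review(records):
    """Split records into those ready for ingestion and those needing curation."""
    accepted, needs_curation = [], []
    for record in records:
        errors = list_errors(record)
        if errors:
            needs_curation.append((record, errors))
        else:
            accepted.append(record)
    return accepted, needs_curation


batch = [
    {"title": "Optics Lab Manual", "language": "English",
     "source_url": "https://repository.example.org/item/7"},
    {"title": "", "language": "english",
     "source_url": "https://repository.example.org/item/8"},
]
accepted, needs_curation = review(batch)
```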

NDLI METADATA STANDARDS

NDLI Metadata Standard v 1.0 is an Open Virtual Standard

● URL: http://www.ndlproject.iitkgp.ac.in/ndl/header.php?mname=Metadata%20Schema

Schema is categorized into three profiles:

1. Generic Metadata:

● Describes general attributes of the contents

● Adopted from Dublin Core Metadata Standard

2. Educational Metadata:

● Describes the educational attributes of the resources and helps in enumerating properties of the contents relevant to teaching-learning process

● Adopted from Learning Resource Metadata Initiative (LRMI)

3. Thesis Metadata:

● Describes dissertation or thesis related metadata fields

● Adopted from Shodhganga Thesis Metadata Standard

More profiles may be added in the future

Uses the namespace of Dublin Core (dc.)
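One way to picture a record that carries a mandatory generic profile and optional educational and thesis profiles is shown below. Only the `dc` prefix comes from the slide; the `lrmi` and `thesis` prefixes, the class name, and the individual field names are placeholders for this sketch, so the published schema should be consulted for the real ones.

```python
from dataclasses import dataclass, field

# Prefix per profile. "dc" comes from the slide; "lrmi" and "thesis" are
# placeholders chosen for this sketch, not published NDLI field names.
PREFIXES = {"generic": "dc", "educational": "lrmi", "thesis": "thesis"}


@dataclass
class ProfiledRecord:
    generic: dict                                    # always present
    educational: dict = field(default_factory=dict)  # optional profile
    thesis: dict = field(default_factory=dict)       # optional profile

    def flatten(self) -> dict:
        """Merge all profiles into one dict of namespaced keys."""
        flat = {}
        for profile, prefix in PREFIXES.items():
            for name, value in getattr(self, profile).items():
                flat[f"{prefix}.{name}"] = value
        return flat


record = ProfiledRecord(
    generic={"title": "A Study of River Basins", "language": "English"},
    thesis={"advisor": "A. Advisor", "degree": "PhD"},
)
flat = record.flatten()
```

Adding a further profile later would mean one more entry in `PREFIXES` and one more optional field, which mirrors the open-ended design described above.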

NDLI METADATA EXTRACTION TOOLKIT

● Automated metadata extraction workflow

● Syntactic metadata extractor

○ Author name, ISBN, Publisher, dates etc.

● Table of content extractor

● Wikification for keyword extraction

● Learning resource metadata

● Toolchain for metadata extraction

Once completed and tested, the toolkit will be released to the public and open-sourced
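As a flavour of what a syntactic extractor does, the sketch below pulls ISBN-13 numbers and four-digit years out of raw text with regular expressions. It is not the NDLI toolkit; the patterns and function names are assumptions, and the year pattern in particular is naive and can pick up unrelated numbers.

```python
import re

# ISBN-13: prefix 978 or 979, then ten more digits, with optional hyphens or spaces.
ISBN13 = re.compile(r"\b97[89][- ]?(?:\d[- ]?){9}\d\b")
# Naive year pattern: any standalone number from 1500 to 2099.
YEAR = re.compile(r"\b(1[5-9]\d{2}|20\d{2})\b")


def isbn13_is_valid(isbn: str) -> bool:
    """Check the ISBN-13 check digit (weights alternate 1 and 3)."""
    digits = [int(ch) for ch in isbn if ch.isdigit()]
    if len(digits) != 13:
        return False
    total = sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0


def extract(text: str) -> dict:
    """Return checksum-valid ISBN-13 strings and candidate years found in the text."""
    isbns = [m.group(0) for m in ISBN13.finditer(text) if isbn13_is_valid(m.group(0))]
    years = [int(y) for y in YEAR.findall(text)]
    return {"isbn": isbns, "years": years}


sample = "ISBN 978-3-16-148410-0, Example Press, 2018."
found = extract(sample)
```

Validating the check digit filters out most digit runs that merely look like an ISBN.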

COMMUNITY INNOVATION: NDLI CLUBS

VIRTUAL LIBRARIES ARE MORE THAN A PASSIVE COLLECTION OF BOOKS...

● As with all things digital, users expect far more customization and interaction from digital libraries. The digital library therefore needs to evolve into a bouquet of services instead of just a storage platform

● Beyond collating data and connecting it meaningfully, the library's eventual success will always lie with its users. NDLI Clubs encourage users to engage actively with the content and with each other to Learn, Share, Grow.

NDLI CLUBS

● E-learning environment for schools

● Involving all K-12 Stakeholders

● Single point of access for unified academic resources

● Aimed at building active learners among users - NDLI club newsletters, quiz, workshops on themes like STEM, e-commerce, IoT etc.

NDLI CLUBS

● Clubs can provide insights on learning outcomes and cognitive abilities to improve the curriculum

● Will allow development of concept-based questions modelled on learning behaviour

● Student profiles and performance will allow school management to make better decisions


ELEPHANT IN THE ROOM: COPYRIGHTS

● WHAT CAN I SHARE?

● WHAT CAN I REPRODUCE?

● HOW DO I KNOW IF THE CONTENT IS ORIGINAL?

● IS THIS CITABLE?

● HOW CREDIBLE IS THIS INFORMATION?

OPERATIONS LEADERSHIP: INTELLECTUAL PROPERTY

Copyright and Intellectual Property have been a major learning through our two years of operation.

NDLI conducted the National Workshop on Copyright Issues in 2018 in collaboration with eminent IP lawyers, librarians of premier institutes across the nation, and government policy makers, to produce a Manual of Copyright Best Practices, the first of its kind for India.

NDLI’S LEARNINGS ON INTELLECTUAL PROPERTY

Copyright is a bundle of rights given by the law to:

● Creators of literary, dramatic, musical and artistic works and

● Producers of cinematograph films and sound recordings

Rights are:

● reproduction of the work,

● communication of the work to the public,

● adaptation of the work and,

● translation of the work

Creator / Author is the first Owner

Ownership (Title) can be transferred (assigned)

Copyright Laws are subject to time period and geographical location and the subsequent IP laws of the land

NDLI’S APPROACH TO COPYRIGHTS

● Vast Majority of Metadata on NDLI is not subject to Copyright Restrictions

● NDLI’s Partners share NDLI’s Commitment

● NDLI Asserts No Rights Over its Database of Metadata and Dedicates its Contributions to the Public Domain

● Users have free and unencumbered access to Metadata

● NDLI strives to provide India with its own geographical definition of copyrights at par with international standards

NDLI ACCESS OPTIONS

Open

Full-text available to all (Example: NCERT)

NDLI Users

Full-text available through NDL, not directly from Source (Example: South Asia Archive)

Limited Access

Part of text available but full-text requires authorization by Source authority (Example: IISER, Bhopal)

Subscribed

Full-text available from institutions that have subscribed to the Source (Example: Springer)

Restricted

Full-text access requires authorization by Source authority and separate login to the Source (Example: IIT Jodhpur)
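The five access options can be read as a small decision table. The sketch below encodes them as an enum with a hypothetical `full_text_available` check; the flag names are assumptions, and the separate source login required for Restricted items is folded into `source_authorized` for brevity.

```python
from enum import Enum


class Access(Enum):
    OPEN = "Open"
    NDLI_USERS = "NDLI Users"
    LIMITED = "Limited Access"
    SUBSCRIBED = "Subscribed"
    RESTRICTED = "Restricted"


def full_text_available(level: Access, *, ndli_login: bool = False,
                        institution_subscribed: bool = False,
                        source_authorized: bool = False) -> bool:
    """Decide whether a reader can open the full text at a given access level."""
    if level is Access.OPEN:
        return True                      # available to all
    if level is Access.NDLI_USERS:
        return ndli_login                # served through NDLI to its users
    if level is Access.SUBSCRIBED:
        return institution_subscribed    # the reader's institution holds a subscription
    # LIMITED and RESTRICTED both depend on authorization by the source authority.
    return source_authorized


anonymous_open = full_text_available(Access.OPEN)
anonymous_subscribed = full_text_available(Access.SUBSCRIBED)
```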

RIGHTSSTATEMENTS

● NDLI is a member of rightsstatements.org at the Steering Committee level

● Develop standardized copyright summaries for digital libraries

● Facilitate learners with access to knowledge without boundaries

INNOVATION IN KNOWLEDGE DISSEMINATION: NATIONAL LICENSING

NATIONAL LICENSING

An initiative by eSS:

● e-ShodhSindhu: Consortium for Higher Education Electronic Resources

Licensing being negotiated with Publishers

● Institutional License:

Accessible from within designated institutional network: IP filtered

● NDLI Supported License:

Accessible if requested from NDLI: concurrent-use basis
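The two licence types suggest two different checks: an IP filter for the institutional licence and a seat counter for concurrent use. The sketch below uses Python's standard `ipaddress` module; the network range is a documentation-only address block, and the `ConcurrentLicense` class and seat count are assumptions made for illustration.

```python
import ipaddress


def on_institution_network(client_ip: str, allowed_networks: list[str]) -> bool:
    """IP filter: is the client inside one of the licensed network ranges?"""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(net) for net in allowed_networks)


class ConcurrentLicense:
    """Seat counter: at most `seats` users may hold access at the same time."""

    def __init__(self, seats: int):
        self.seats = seats
        self.active = set()

    def acquire(self, user_id: str) -> bool:
        if user_id in self.active:
            return True                 # already holds a seat
        if len(self.active) >= self.seats:
            return False                # all seats are in use
        self.active.add(user_id)
        return True

    def release(self, user_id: str) -> None:
        self.active.discard(user_id)


campus = ["192.0.2.0/24"]               # documentation-only example range
inside = on_institution_network("192.0.2.17", campus)
outside = on_institution_network("198.51.100.5", campus)

license_pool = ConcurrentLicense(seats=2)
grants = [license_pool.acquire(user) for user in ("u1", "u2", "u3")]
```

Releasing a seat with `license_pool.release("u1")` would let the third user in on the next attempt.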

CATEGORY INNOVATION: CHAVI

DATA-DRIVEN CANCER RESEARCH

● India currently does not have the infrastructure for dedicated bio banks to aid medical research

● This prevents the Indian medical system from learning of and from trends to prepare for subsequent treatment

● NDLI has tied up with TATA MEDICAL CENTRE in Kolkata to develop a prototype for bio-banks that will aid cancer-research

CHAVI

COMPREHENSIVE DIGITAL ARCHIVE OF CANCER IMAGING

● Dedicated image banking service

● Will provide annotated images with associated information of cancer patients

● Furthering India’s data banking initiatives

● Library migrating to create a database for bio-banks

AI INNOVATION: CREATING KNOWLEDGE

UNDERSTANDING THE MODERN KNOWLEDGE SEEKER:

● Technology and hence technology-enabled access have led to users becoming virtualized

● As a data aggregator, digital libraries can not only facilitate search and browse, but also bring together content libraries and algorithm libraries to compute big-data solutions remotely and in real time!

● So not only can the library store knowledge for its users, it can also evolve to create/compute knowledge by cross-linking programs with data-sets

SERVICE ARCHITECTURE:

Cloud Storage Data Storage Computer

Server

Compute and Storage Infrastructure

Internal

Cloud AWS

Cloud Computer

(Back-End) Processing

Units

Cloud Setup Interface and Services

Metadata Storage

Structured Attributes Knowledge

Graph Representation

Ontology and Classification

Access

Mechanisms

and Policies

Query Result

(Front-End)

Search-Compute

Query Processing

User-1

User-2

PERSONALIZED LEARNING

Experience Tracking

● To offer customized search results

Multi-lingual Support

● Reduce cognitive load for native use

Personalization

● Customized UI to suit user grade

• Multi-Aspect and Multi-Level Structured Prediction for Questions

• Knowledge Tracing to estimate the level of knowledge of a student

• Concept Ecosystem for Addressing Knowledge Gap

• Cognitive Load Index of Concepts
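The slide does not say which knowledge-tracing model is intended. As one common choice, the sketch below uses classic Bayesian Knowledge Tracing, which updates the probability that a student knows a concept after each answer; the parameter values are arbitrary placeholders, not figures from NDLI.

```python
def bkt_update(p_known: float, correct: bool, *, p_learn: float = 0.2,
               p_slip: float = 0.1, p_guess: float = 0.2) -> float:
    """One Bayesian Knowledge Tracing step for a single concept."""
    if correct:
        evidence = p_known * (1 - p_slip)
        posterior = evidence / (evidence + (1 - p_known) * p_guess)
    else:
        evidence = p_known * p_slip
        posterior = evidence / (evidence + (1 - p_known) * (1 - p_guess))
    # The student may also learn the concept while working on the question.
    return posterior + (1 - posterior) * p_learn


def trace(answers, p_init: float = 0.3, **params):
    """Run the update over a sequence of right/wrong answers."""
    p_known = p_init
    history = []
    for correct in answers:
        p_known = bkt_update(p_known, correct, **params)
        history.append(p_known)
    return history


history = trace([True, True, False, True])
```

Each value in `history` is the running estimate that the concept has been mastered, which could then drive which question or resource is offered next.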

PH-1 PH-2

ACCESS AND DISCOVERY INNOVATION: SURROGATOR

SURROGATOR

● Makes available pre-print copies of academic resources that are otherwise unavailable to learners

● A customizable application that can identify surrogates for access-restricted publications in digital libraries

● Surrogator can make a case for the evolution of specific subject matters through pre-print indexing and open up a whole new realm of research

● Application is being tested through NDLI, can easily be extended to other digital libraries

1. QUERY

2. QUERY

3. RESULTS

4. RESULTS + ACCESS RIGHTS

5. FIND SURROGATES

6. CITATION

7. ARTICLES + RELATED_ARTICLES + CITED_BY

8. SURROGATES
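The surrogate-finding step can be approximated, very roughly, by matching the title of an access-restricted publication against openly available candidates. The sketch below swaps in plain title similarity from Python's `difflib`; the actual Surrogator works from citations and related articles, so the function, threshold, and sample data here are illustrative assumptions only.

```python
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lower-case and collapse whitespace so formatting differences do not matter."""
    return " ".join(title.lower().split())


def find_surrogates(restricted_title: str, open_candidates: list[dict],
                    threshold: float = 0.9) -> list[str]:
    """Return URLs of open candidates whose titles closely match, best first."""
    target = normalize(restricted_title)
    matches = []
    for candidate in open_candidates:
        score = SequenceMatcher(None, target, normalize(candidate["title"])).ratio()
        if score >= threshold:
            matches.append((score, candidate["url"]))
    return [url for score, url in sorted(matches, reverse=True)]


candidates = [
    {"title": "deep learning for  metadata extraction",
     "url": "https://preprints.example.org/abs/1"},
    {"title": "A Survey of Graph Databases",
     "url": "https://preprints.example.org/abs/2"},
]
surrogates = find_surrogates("Deep Learning for Metadata Extraction", candidates)
```

A production version would also compare authors and year before treating a pre-print as a surrogate for the published article.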

THE ONE ON DIGITAL PRESERVATION:

INTERNATIONAL LEARNINGS ON DIGITAL PRESERVATION:

• National Heritage Digitization Strategy (NHDS): https://nhds.ca/ Canadian memory institutions are working together to find ways to improve access to digital collections and better preserve cultural heritage

• Tour of Gatineau Preservation Centre

• Co-Lab: A collaboration tool used to transcribe, tag, translate and describe digitized records from the collection. It also has a number of open challenges in Transcription, Translation, Tagging and Description

• Library of Congress: International Image Interoperability Framework (IIIF) is the way to go


NDLI PRESERVATION CENTRE:

• NDLI IS POISED TO SET UP A PRESERVATION CENTRE WITH WORLD LIBRARY FOUNDATION

We have already started work on digital preservation of the archives of PRESIDENCY UNIVERSITY, Kolkata.

1 DIGITIZE

2 MAKE SEARCHABLE (OCR)

3 DIGITAL PUBLICATIONS

A GLOBAL PRESENCE

Representatives from National Digital Library of India regularly participate in both national and international workshops with the aim to:

1 Collaborate with institutions and organizations that are leaders in the field of library science and technology

2 Learn from existing knowledge portals and integrate key lessons into further developing NDLI

3 Share India’s culture, knowledge and technology inheritance with the world

NDLI OUTREACH 360

1 PAN INDIA WORKSHOPS FOR USERS, INSTITUTES AND PUBLIC LIBRARIES

2 NATIONAL LEVEL POLICY ON INTELLECTUAL PROPERTY RIGHTS

3 REJUVENATING THE PUBLIC LIBRARY SERVICE THROUGH DIGITAL REFERENCE SOURCES

4 OCR FOR INDIAN LANGUAGES

5 NDLI CLUB

6 TRAINING AND REPOSITORY SERVICE

7 INTERNATIONAL WORKSHOPS AND INTEGRATION ON KNOWLEDGE ENGINEERING

KEDL-2019 http://kedl2019.ndl.gov.in

Out of the various DIGITAL INDIA initiatives, NDLI is already integrated with UMANG.

NDLI SOCIAL MEDIA CAMPAIGNS:

NDLI ON SOCIAL MEDIA : #WWOW

NAME YOUR TOP 5 WOMEN SCIENTISTS - is a list that most of us failed to complete after Marie Curie. #WWOW (Wonderful Women On Wednesday) aims to bring to you stories of some such absolutely wonderful women every Wednesday - we intend to highlight their work and their contribution by directing you to such content from our archives. We hope you will enjoy and learn from them as much as we did.

NDLI ON SOCIAL MEDIA : #NTR

#NightTimeRead is a National Digital Library of India initiative to bring to you some rare and not-so-rare pieces of fiction just before you go to sleep. The content is short and available in the form of pdfs / audio-books to not take up too much of your time but still hopefully instill a bit of magic pre-bedtime.

NDLI ON SOCIAL MEDIA : #STUDYTIME

Take the drudgery out of sitting down to study. Study Time is designed to help students learn better through sharing resources that simplify learning. Interactive tutorials, video lectures, question papers, solutions and other quality resources brought straight to your smart device. Study smart with #StudyTime - anytime, anywhere.

NDLI ON SOCIAL MEDIA : #SME

Sunday Morning Edutainment features graphic novels, comic books, manga, art history, music, dance and much much more! Start your day off by learning about the art of Monet, about believing in yourself from Naruto or about music from Beethoven. Go out of syllabus, edutain yourself with #SME.

FIND US ON SOCIAL MEDIA:

www.facebook.com/ndlindia

www.twitter.com/ndlindia

www.linkedin.com/company/ndlindia

www.youtube.com/c/ndlindia

Don't limit a child to your own learning, for he was born in another time.

-Rabindranath Tagore

Use your e-mail address to sign up at https://ndl.iitkgp.ac.in/ OR https://ndl.gov.in/

Together, let us #LearnShareGrow

@NDLIndia

Dedicated to the Nation.