1

Building the NSDL

William Y. Arms

Cornell University

Thinking aloud about the NSDL

2

Acknowledgement and Disclaimer

The NSDL is a program of the National Science Foundation's Directorate for Education and Human Resources, Division of Undergraduate Education.

The ideas discussed in this talk do not represent the official views of the NSF (or of anybody except the author).

3

What's in a name?

4

The NSDL

National SMETE Digital Library

SMETE: Science, Mathematics, Engineering and Technology Education

5

The NSDL

National Science? Digital Library

Can we build a comprehensive digital library for science education, without building a National Science Digital Library?

6

The National Science Digital Library

7

The National Science Digital Library

It's BIG!

8

How big might the NSDL be?

To be comprehensive (all branches of science, all levels of education, very broadly defined):

Five-year targets

1,000,000 different users

10,000,000 digital objects

100,000 independent sites

9

Scientific and technical information in digital form

Materials used in education

Digital collections for science

Materials tailored to education

15

Opportunities for the NSDL

• Categories of material that have been given lower priority by libraries and publishers, e.g., datasets, software, and other dynamic content, ...

• Materials that are accessible for automatic processing, e.g., scientific web sites and databases, image collections, ...

• Materials designed for education, e.g., learning objects, curricula, problem sets, ...

Less opportunity for the NSDL

• Conventional scientific literature with restricted access

17

The NSF's strategy

18

The NSF cannot fund all collections

19

The NSF is funding selected collections ...

20

... and a Core Integration team

The Core Integration task is to provide a coherent set of services for users across great diversity.

21

Resources

Core Integration

Budget: $4 million

Staff: 25 - 30

Management: Diffuse

How can a small team, without direct management control, create a very large-scale digital library?

22

A spectrum of interoperability

23

Approaches to interoperability

The conventional approach

Wise people develop standards: protocols, formats, etc.

Everybody implements the standards.

This creates an integrated, distributed system.

Unfortunately ...

Standards are expensive to adopt.

Concepts are continually changing.

Systems are continually changing.

24

Interoperability is about agreements

Technical agreements cover formats, protocols, security systems, etc., so that messages can be exchanged.

Content agreements cover the data and metadata, and include semantic agreements on the interpretation of the messages.

Organizational agreements cover the ground rules for access, for changing collections and services, payment, authentication, etc.

The challenge is to create incentives for independent digital libraries to adopt agreements

25

Function versus cost of acceptance

Function

Cost of acceptance

Many adopters

Few adopters

26

Example: Textual mark-up

Function

Cost of acceptance

SGML

ASCII

HTML

XML

27

Levels of interoperability

Federations

Collections follow strict standards for content, metadata, protocols, authentication, etc.

Harvested Collections

Each collection makes metadata about its collections available in a simple exchange format (Open Archives metadata harvesting protocol).

Gathered Collections

Material is gathered automatically by selective web crawling.
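
As a rough illustration of what a harvested collection asks of a service, the sketch below builds a request URL in the style of the Open Archives metadata harvesting protocol. The endpoint `http://collection.example.org/oai` and the helper name `build_harvest_url` are hypothetical. The `verb`, `metadataPrefix` and `from` parameter names follow the protocol as it was later standardized, and may differ in detail from the version current at the time of this talk.

```python
from urllib.parse import urlencode


def build_harvest_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Return a ListRecords request URL for a collection's harvesting endpoint.

    base_url is whatever address the collection publishes; the one used
    below is a placeholder, not a real repository.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        # Ask only for records added or changed since this date.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint, for illustration only.
print(build_harvest_url("http://collection.example.org/oai", from_date="2001-09-01"))
```

The collection's side of the agreement is small: expose records at one address in one simple format. That low cost of acceptance is the point of the harvesting level.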

28

Levels of interoperability

Level        Agreements                                                               Example

Federation   Strict use of standards (syntax, semantic, and business)                 AACR, MARC, Z 39.50

Harvesting   Digital libraries expose metadata; simple protocol and registry          Open Archives

Gathering    Digital libraries do not cooperate; services must seek out information   Web crawlers and search engines
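
At the gathering level the service has to do all the work itself. The sketch below shows one very simple way a selective crawler might decide which links on a page are worth following: keep only links whose anchor text contains a term from a topic list. The term list, the class and function names, and the sample page are all assumptions made for illustration; a real selective crawler would use much richer relevance measures.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Illustrative topic vocabulary; an assumption, not an NSDL list.
SCIENCE_TERMS = {"physics", "chemistry", "biology", "geology", "mathematics"}


class LinkCollector(HTMLParser):
    """Collect (absolute_url, anchor_text) pairs from an HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None   # URL of the <a> element currently open, if any
        self._text = []     # text fragments seen inside that element

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def select_links(html, base_url, terms=SCIENCE_TERMS):
    """Return the links whose anchor text mentions at least one topic term."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [url for url, text in collector.links
            if set(text.lower().split()) & terms]


page = ('<a href="/physics/waves.html">Physics of waves</a> '
        '<a href="/store">Gift shop</a>')
print(select_links(page, "http://museum.example.org/"))
# prints ['http://museum.example.org/physics/waves.html']
```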

29

Metadata is expensive

The NSDL cannot afford to create it manually
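
A back-of-the-envelope calculation shows why. It uses only two figures from earlier slides, the target of 10,000,000 digital objects and the Core Integration budget of $4 million, and it assumes (unrealistically) that the whole budget could be spent on metadata. The slides do not say what period the budget covers, so treat the result as an order of magnitude only.

```python
# Figures taken from the slides; the pairing of the two is an illustration.
budget_dollars = 4_000_000     # Core Integration budget
target_objects = 10_000_000    # five-year target for digital objects

# Upper bound on spending per object if every dollar went to metadata.
per_object = budget_dollars / target_objects
print(f"${per_object:.2f} available per object")  # prints $0.40 available per object
```

Any manual cataloguing process that costs more than a few tens of cents per record is therefore out of reach at this scale.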

30

User portals

Distributed collections

Metadata repository

31

Every collection is different

32

From an NSF-funded collection: “We are pleased with the technical side…of the database and web access…but we are complete novices in terms of how to make our collection part of the digital library. I assume this hinges on appropriate metadata, but I am not sure exactly what kinds…”

33

Metadata strategy

• Support eight standard formats

• Collect all existing metadata in these formats

• Provide crosswalks to Dublin Core

• Expose records in the metadata repository for others to harvest

• Concentrate on collection-level metadata

• Use automatic generation to augment item-level metadata
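
A crosswalk, in its simplest form, is a table that maps the fields of one metadata format onto the elements of another. The sketch below maps a made-up native record onto Dublin Core element names. The native field names (`name`, `author`, `keywords`, `summary`, `url`, `grade_level`) and the function `to_dublin_core` are hypothetical; only the target element names come from Dublin Core.

```python
# Hypothetical mapping from one collection's native field names
# to Dublin Core elements.
CROSSWALK = {
    "name": "title",
    "author": "creator",
    "keywords": "subject",
    "summary": "description",
    "url": "identifier",
}


def to_dublin_core(record, crosswalk=CROSSWALK):
    """Translate a native record (a dict) into Dublin Core element lists."""
    dc = {}
    for field, value in record.items():
        element = crosswalk.get(field)
        if element is None:
            continue  # no equivalent in this crosswalk, so the field is dropped
        # Dublin Core elements are repeatable, so every value is kept in a list.
        values = value if isinstance(value, list) else [value]
        dc.setdefault(element, []).extend(values)
    return dc


native = {
    "name": "Tides",
    "author": "A. Teacher",
    "keywords": ["oceanography", "tides"],
    "grade_level": "9-12",
}
print(to_dublin_core(native))
# prints {'title': ['Tides'], 'creator': ['A. Teacher'], 'subject': ['oceanography', 'tides']}
```

Note that `grade_level` is lost in translation. Crosswalks to a simple common format trade richness for reach, which is the same function-versus-cost trade-off discussed earlier.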

Most Core Integration services will be created automatically from collection-level metadata or directly from the content (e.g., automatic indexing of text, automatic reference linking).
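
To make "directly from the content" concrete, here is a minimal sketch of automatic indexing of text: an inverted index built from the words of each document, with no hand-made metadata involved. The stop-word list, the document identifiers and the sample texts are assumptions for illustration; production indexing would add stemming, ranking and much more.

```python
import re
from collections import defaultdict

# Small illustrative stop-word list; an assumption, not a standard one.
STOP_WORDS = {"the", "of", "and", "a", "in", "to", "is"}


def tokenize(text):
    """Lower-case the text and return its words, minus stop words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOP_WORDS]


def build_index(documents):
    """Map each word to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {
    "d1": "The phases of the moon",
    "d2": "Moon and tides in the ocean",
    "d3": "Phases of matter",
}
index = build_index(docs)
print(sorted(search(index, "moon phases")))  # prints ['d1']
```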

34

Managing the NSDL

Responsibility without authority

35

A personal observation

Despite all the evidence to the contrary, ...

we repeatedly over-estimate the benefits of collaboration ...

and under-estimate the obstacles.

36

The NSDL challenge

During the preliminary phases ...

• Each project worked independently (NSF grants have little control)

• Coordination was through a loose set of committees, with mailing lists, bulletin boards, etc.

37

The NSDL challenge

For the production phase ...

• We must develop a robust, reliable set of services

• We must make compromises, decide priorities, etc.

• Yet we must attract the energy of many independent individuals and organizations

38

What doesn't work

Decision making by online forums

• Become dominated by a few people, not necessarily the most knowledgeable.

• Either usage dies away, or too many low-value messages drive away the busy people.

Decision making without responsibility

• Vision is easy. Implementation is hard.

39

What does work?

Money

• Thank you NSF!

Online discussions on specific topics

• Structured discussions as part of a decision-making process are often productive

Patience and persistence

Success builds on success

40

The last word

From the Lisle, NY Volunteer Fire Brigade

September 17, 2001

United we stand.

God bless America.

Bingo, Tuesday 7:30 - 10:00.

41

Building the National Science Digital Library

William Y. Arms

Cornell University
