NSDL 2.0: Creating a collaborative digital library

Dean Krafft, Cornell University
dean@cs.cornell.edu

Structure of the Talk

Project Overview and NSDL 1.0

The Fedora-based NSDL Data Repository (NDR) and NSDL 2.0

Inspiring Contribution and Collaboration: ExpertVoices, Soft Matter Wiki, etc.

IS Challenges for NSDL

Q&A

What is the NSDL?

An NSF-funded $20 million/year program in Science, Technology, Engineering and Mathematics (STEM) education

A digital library describing nearly two million carefully selected online STEM resources from well over 100 collections (at http://nsdl.org)

A core integration team (Cornell, UCAR, Columbia) working with 9 “pathways” portals and over 200 NSF grantees

A large community of researchers, librarians, content providers, developers, students, and teachers

Portals to the NSDL

NSDL 1.0

A “Union Catalog”

OAI-PMH Harvesting

Central Metadata Database

OAI Server for catalog

Search index of metadata/content

Initial K-Gray Portal: nsdl.org
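
The harvesting step above can be sketched roughly as follows. This is a minimal illustration of an OAI-PMH `ListRecords` exchange, not NSDL's actual harvester: the endpoint `BASE_URL`, the helper names, and the sample response are assumptions made for the example, and the response is parsed from an inline string rather than fetched over the network.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Standard OAI-PMH 2.0 and Dublin Core namespaces.
OAI_NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical collection endpoint, used only for illustration.
BASE_URL = "http://collection.example.org/oai"


def list_records_url(base_url, resumption_token=None):
    """Build a ListRecords request URL for the first or a follow-up page."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return ([(identifier, title), ...], resumption token or None)."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", OAI_NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=OAI_NS)
        title = rec.findtext(".//dc:title", namespaces=OAI_NS)
        records.append((identifier, title))
    token = root.findtext(".//oai:resumptionToken", namespaces=OAI_NS)
    return records, (token or None)


# Made-up response standing in for what a provider might return.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Hurricane basics</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""
```

A harvester would call `list_records_url` again with each returned token until no token comes back, storing the parsed metadata in the central database.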

Infrastructure overview: NSDL 1.0

[Diagram: STEM Collections on the Web, Central Metadata Repository, Search Service, Archive Service, Collection Registration System, and NSDL.org Portal. Protocols: OAI-PMH, HTTP, REST, SQL]

NSDL 1.0 Lessons

Metadata Repository was quick to implement using known technologies, but:

Limited model

Metadata-centric orientation

No content – only metadata

Limited relationships – collection/item

Limits on context, structure, and access

Severe limits on contribution and collaboration

One-way data flow: NSDL → Users

Photo by Jon Crispin

NSDL 2.0

Create an NSDL that guides not just resource discovery, but:

Supports creating “context” for resources

Presents resources in context: linked to related concepts; with user ratings; with codes and data

Enables community tools for selecting, organizing, evaluating, annotating, contributing, and collaborating

Provides two-way data flow: NSDL ↔ users

Goal: Create a dynamic, living library

In architectural terms, create an NSDL Data Repository that:

Supports storing both content and metadata

Allows arbitrary relationships among resource and metadata objects: organization, annotation, citation

Is accessible through a web service architecture of remixable data sources and transformations

The Fedora Vision: A Repository for Rich Information Networks

[Diagram: an information network linking Data, Actor, Formal document, Informal document, Data sets, and Web service nodes]

Fedora: the NDR middleware

A Flexible, Extensible Digital Object Repository Architecture (http://www.fedora.info)

Open source project with $2.2 million in Mellon funding 2002-2007

Collaboration of Cornell and Univ. of Virginia

Key funded users include:
  eSciDoc project (collaboration of the Max Planck Society and FIZ Karlsruhe)
  Public Library of Science (Topaz Foundation)
  VTLS Corp., Harris Corp., Library of Congress
  Australian Research Repositories Online to the World
  Royal Library Denmark, National Library, and DTU

What is Fedora?

An architecture, toolkit, and implementation: middleware, not a vertical application

Stores arbitrary internal and external digital objects, disseminations (transformations and combinations), relationships among objects

Entirely SOAP/REST based; disseminations are URLs

XML data store; RDBMS cache; RDF triplestore supports relationship queries

NSDL Data Repository (NDR)

References to roughly 2 million selected STEM resources on the web

Sourced metadata statements about those resources

A REST API to allow authenticated access by Pathways and providers

Support for annotation, aggregation, and other relationships

Sample NDR Objects & Relationships

[Diagram: sample NDR objects (Publication Resource, Publication Metadata, Data Set Resource, Data Set Metadata, Code Resource, Metadata Provider, MatForge Collection, SoftMatter Collection, Cornell CCMR, MatDL Pathway) connected by the relationships Cites, Metadata for, Member of, and Selector for]
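
One way to picture objects and relationships like these is as subject/predicate/object triples. The sketch below is an illustration only: the object names and the specific edges are hypothetical, loosely modeled on the labels in the slide, and `match` is a plain in-memory filter, not the Fedora or NDR query interface.

```python
from collections import namedtuple

Triple = namedtuple("Triple", "subject predicate object")

# Hypothetical objects and edges; the real diagram's exact links are not
# recoverable from the slide text.
TRIPLES = {
    Triple("publication-metadata", "metadataFor", "publication-resource"),
    Triple("dataset-metadata", "metadataFor", "dataset-resource"),
    Triple("publication-resource", "cites", "dataset-resource"),
    Triple("publication-resource", "cites", "code-resource"),
    Triple("publication-resource", "memberOf", "softmatter-collection"),
    Triple("dataset-resource", "memberOf", "matforge-collection"),
}


def match(triples, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern, sorted; None is a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.object == obj)
    )


# Which objects does the publication cite?
cited = [t.object for t in match(TRIPLES, subject="publication-resource",
                                 predicate="cites")]
```

Keeping relationships as first-class statements is what lets annotation, citation, and membership be added later without changing the objects themselves.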

Draft NDR API Characteristics

Uses REST calls for all interactions; uses handles (DOIs) for all external references

Ensures external applications can’t violate the NDR model constraints

Disseminations allow combining metadata from multiple sources, or related content

Authentication: Requests signed with private key associated with an agent

Authorization: Agent can become a metadata provider or aggregator; can create resources
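
The signed-request idea can be sketched as below. Note the substitution: the talk describes signing with an agent's private key, while this example uses a shared-secret HMAC-SHA256 from the Python standard library as a stand-in, so it shows the sign-then-verify flow rather than the NDR's actual scheme. The canonical string layout, the `/resource` path, and all function names are assumptions.

```python
import hashlib
import hmac


def canonical_request(method, path, body, timestamp):
    """Join the parts of a request that the signature should cover."""
    body_digest = hashlib.sha256(body).hexdigest()
    return "\n".join([method.upper(), path, timestamp, body_digest])


def sign_request(secret, method, path, body, timestamp):
    """Agent side: compute a hex signature over the canonical request."""
    message = canonical_request(method, path, body, timestamp).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret, method, path, body, timestamp, signature):
    """Repository side: recompute the signature and compare in constant time."""
    expected = sign_request(secret, method, path, body, timestamp)
    return hmac.compare_digest(expected, signature)


# Illustrative values only.
SECRET = b"demo-agent-secret"
STAMP = "2006-10-01T00:00:00Z"
signature = sign_request(SECRET, "post", "/resource", b"<resource/>", STAMP)
```

With a real private-key scheme the repository would hold only the agent's public key, but the shape is the same: the repository rejects any request whose body or path no longer matches the signature.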

An Information Network Overlay

Think of the NDR as a lens for viewing science content on the net

Content can be:
  Local: stored directly in the NDR
  Remote: accessed through a URL
  Computed: derived from a database or web service
  Archived: an older version stored at SDSC

It all has a repository-based URL
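
A small sketch of the overlay idea: four kinds of content, one uniform repository-based address. The base URL, class names, and sample entries are hypothetical; the talk does not give the NDR's real URL scheme.

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical base URL, for illustration only.
REPOSITORY_BASE = "http://ndr.example.org/content/"


class ContentKind(Enum):
    LOCAL = "stored directly in the repository"
    REMOTE = "fetched from an external URL"
    COMPUTED = "derived from a database or web service"
    ARCHIVED = "older version held in an archive"


@dataclass(frozen=True)
class ContentEntry:
    object_id: str
    kind: ContentKind
    source: str  # where the content really lives (illustrative values)


def repository_url(entry):
    """Every entry gets the same style of URL, whatever its kind."""
    return REPOSITORY_BASE + entry.object_id


def describe(entry):
    """Human-readable summary of the URL and what sits behind it."""
    return f"{repository_url(entry)} ({entry.kind.value}: {entry.source})"


ENTRIES = [
    ContentEntry("obj-1", ContentKind.LOCAL, "internal datastream"),
    ContentEntry("obj-2", ContentKind.REMOTE, "http://example.org/page.html"),
    ContentEntry("obj-3", ContentKind.COMPUTED, "http://example.org/service?q=1"),
    ContentEntry("obj-4", ContentKind.ARCHIVED, "archive copy"),
]
```

The point of the design is that a client only ever sees `repository_url`; whether the bytes are local, remote, computed, or archived is a detail of the repository.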

Network Overlay View

[Diagram: User View, API/UI, Repository View with Relations & Annotations, Resources on the Web]

NSDL 2.0 Technical Challenges

Scaling the RDF triple-store past 200 million triples

Constraining RDF queries to be reasonably computable

Building meaningful search indices on explicit metadata, annotations, resource content, and relationships
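
One common way to keep relationship queries "reasonably computable" is to bound how far and how wide a traversal may go. The sketch below is an assumption about what such a constraint could look like, not the NDR's query engine; the edges and the `related` helper are hypothetical.

```python
from collections import deque

# Hypothetical (subject, predicate, object) edges.
EDGES = [
    ("wiki-entry", "annotates", "resource-1"),
    ("resource-1", "cites", "resource-2"),
    ("resource-2", "cites", "resource-3"),
    ("resource-3", "cites", "resource-4"),
]


def related(edges, start, max_depth=2, max_results=100):
    """Breadth-first walk over outgoing edges, bounded in depth and size."""
    outgoing = {}
    for subject, _predicate, obj in edges:
        outgoing.setdefault(subject, []).append(obj)

    seen = {start}          # guards against cycles
    found = []
    queue = deque([(start, 0)])
    while queue and len(found) < max_results:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue        # do not expand beyond the depth limit
        for nxt in outgoing.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                queue.append((nxt, depth + 1))
                if len(found) == max_results:
                    break
    return found
```

Both limits cap the work per query regardless of how large the triple-store grows, at the cost of possibly incomplete answers.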

Applying the NDR

The NDR provides powerful capabilities for:

Creating context around resources

Enabling the NSDL community to directly contribute resources and context

Representing a web of relationships among science resources and information about those resources

How do we use it? Here’s one specific example …

ExpertVoices

What is Expert Voices?

A multi-user blogging tool

Topic-based discussions (e.g. forensics) with pointers to related resources

An outreach tool to explain and document NSF-funded research

A way for NSDL community members to become NSDL contributors: of resources, questions, reviews, annotations, metadata

A question/answer and discussion forum: scientist ↔ teacher ↔ student ↔ librarian

What isn’t EV?

Expert Voices ≠ LiveJournal

Contributors are carefully selected; contributions are about science, the process of science, and education

Comic by Michael Lalonde/orneryboy.com

Hurricane Floyd/Photo by NASA

Photo by Jon Crispin

Broadening Participation: An Expert Voices Learning Scenario

“Hurricane Season Blog” run by a National Weather Service hurricane expert, an Earth Science teacher, and a school media specialist familiar with NSDL

Expert creates an entry for Hurricane Gertrude:
  “On track to hit Ft. Lauderdale in 72 hours”
  “Currently undergoing eyewall replacement cycle”
  “Expecting 15 foot storm surge”

Media specialist adds links to NSDL resources: Hurricane Hunters site, latest satellite photos, and USGS flooding and flood plain web page

Teacher makes connections to relevant standards and appropriate pedagogy for use by other teachers

Students experience engaging real-time, real-world applications of science lessons

Expert Voices Implementation

Open source multi-user blogging system

Published entries become NSDL resources

Owner controls publication of entries and visibility of comments

Entries can contain linked references to NSDL resources, references to URLs that should become resources, and new resource metadata

Integrated with NSDL Shibboleth-based community sign-on

But Expert Voices is just the beginning…

Soft Matter Wiki: Planned NDR Integration

A community of approved contributors (e.g. teachers, librarians, materials scientists) is granted edit access to the Soft Matter wiki

New resources and metadata are created as wiki pages and reflected into the NDR

Relevant non-wiki-based NDR resources and metadata are displayed as read-only wiki pages, subject to comment and linking

User and project pages organize NDR resources

NDR Entry for Soft Matter Wiki

[Diagram: a Wiki Entry with New Metadata and New Audience MD, Referenced New Resource 1, Referenced Existing Resource 2, an Existing Collection, the Soft Matter Wiki collection, and Metadata Providers, connected by the relationships Annotates, Metadata for, and Member of, plus an inferred relationship between resources]

MyNSDL: NDR-integrated tagging, bookmarking, and recommendation

Based on Connotea open-source folksonomic tagging/bookmarking system

Tags and bookmarking structure are reflected back into the NDR

Authorized users can “automatically” recommend new NSDL resources simply by tagging them

Gives user a personal view of NSDL resources
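
The "recommend by tagging" behavior might look something like the sketch below. It is not Connotea's or MyNSDL's code: the `TagStore` class, its fields, and the sample URLs and users are all hypothetical, and it only illustrates the rule that a tag from an authorized user on an unknown URL queues that URL as a candidate resource.

```python
from collections import defaultdict


class TagStore:
    """In-memory stand-in for a tagging service tied to a repository."""

    def __init__(self, known_resources, authorized_users):
        self.known_resources = set(known_resources)
        self.authorized_users = set(authorized_users)
        self.tags = defaultdict(set)   # (user, url) -> set of tags
        self.recommended = []          # new URLs proposed as resources

    def tag(self, user, url, *tags):
        """Record tags; an authorized user's tag on a new URL recommends it."""
        self.tags[(user, url)].update(tags)
        is_new = url not in self.known_resources and url not in self.recommended
        if is_new and user in self.authorized_users:
            self.recommended.append(url)

    def personal_view(self, user):
        """Return this user's bookmarks as {url: sorted tags}."""
        return {url: sorted(tags)
                for (owner, url), tags in self.tags.items()
                if owner == user}


store = TagStore(known_resources={"http://nsdl.example.org/r/1"},
                 authorized_users={"teacher"})
store.tag("teacher", "http://nsdl.example.org/r/1", "hurricanes")
store.tag("teacher", "http://example.org/new-site", "floods", "usgs")
store.tag("student", "http://example.org/other-site", "storms")
```

Here only the teacher's new bookmark is queued as a recommendation; the student's tag still shapes the student's personal view but does not propose a resource.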

NDR Application: Content Assignment Tool

Developed by Anne Diekema, Elizabeth Liddy, et al. at the Syracuse University Center for Natural Language Processing

Uses text analysis and machine learning to suggest Educational Standards alignment for resources

Content expert assigns standard, and system learns from the assignment

Standalone tool available now; standards to be associated with resources in the NDR in 4Q06

Other applications in development

Automated grade-level assignment based on vocabulary analysis (San Diego Supercomputer Center)

OnRamp – multi-user, multi-project NDR-integrated content management system

Instructional Architect: Lesson plan development for K12 teachers (Utah State)

iVia-based Expert-Guided crawl: Tool for Pathways and others to turn websites into resource collections (in development at UC Riverside)

Other proposed applications

Moodle Course Management System – courses integrated with NSDL resources

Electronic lab notebook – integrating lab notes with code, data sets, and reference materials within the library archival framework

NSDL 2.0 Ecosystem

[Diagram: STEM Collections, Search Service, Archive Service, and the Fedora-based NDR. Protocols: OAI-PMH, HTTP, REST, NDR API]

What are the Information Science challenges?

Trust

Photo © 2005 Reuters

Contribution

Trust and reputation in NSDL

We brand NSDL as a source of “trusted” resources

What is our trust mechanism?
  Transitive trust approval
  Community rating/filtering/reputation

Trusted vs. complete “views”

What is the right balance of trust vs. community contribution?
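
To make "transitive trust approval" and "trusted vs. complete views" concrete, here is one possible reading, sketched under assumptions: trust starts at a root agent and flows along approval links, optionally limited to a number of hops. The slide poses this as an open question, so the model, names, and data below are hypothetical.

```python
def trusted_agents(approvals, root, max_hops=None):
    """Agents reachable from root along approval links, within max_hops."""
    trusted = {root}
    frontier = [root]
    hops = 0
    while frontier and (max_hops is None or hops < max_hops):
        next_frontier = []
        for agent in frontier:
            for approved in approvals.get(agent, ()):
                if approved not in trusted:
                    trusted.add(approved)
                    next_frontier.append(approved)
        frontier = next_frontier
        hops += 1
    return trusted


def views(resources, trusted):
    """Return (trusted view, complete view) as sorted resource ids."""
    complete = sorted(resources)
    trusted_view = sorted(rid for rid, contributor in resources.items()
                          if contributor in trusted)
    return trusted_view, complete


# Hypothetical approval graph and contributions.
APPROVALS = {"nsdl": ["pathway"], "pathway": ["teacher"], "stranger": ["friend"]}
RESOURCES = {"r1": "teacher", "r2": "stranger"}
```

Lowering `max_hops` tightens the trusted view and raising it widens the view, which is one way to explore the balance between trust and community contribution.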

Community Formation

Build the tools and they will come?

What can we learn from Wikipedia, MySpace, Flickr, and YouTube?

How do we leverage existing societies and groupings (NSTA, ACM, AAPT, AAAS)?

Is there an NSDL community, or are there many small communities?

Courtesy Kathy Sierra/WickedlySmart.com

Creating Passionate Users

How do we help NSDL users “kick ass”?

What can we learn from game design?
  Motivating goal
  Challenging interaction
  Meaningful payoff
  Multiple levels

Can we use fun, emotion, seduction, surprise, and visuals – and still be academics?

Courtesy Kathy Sierra/WickedlySmart.com

Photo by Jon Crispin

Challenges of ubiquity

Should we target NSDL materials at limited devices (iPods, cell phones)?

How does ubiquitous NSDL access change teacher/student interactions?

Should we build tools to capture field data from these devices?

Other IS Challenges

Personalization: SDI, automated activity analysis, targeted user views

Visualizing the library: alternatives to text search for discovery and context

Location awareness: specializing library views by physical location

Summary

NSDL 2.0 and its tools allow scientists, mathematicians, teachers, engineers, librarians, and students to create a unique web of context, contribution, and collaboration around the high-quality STEM education resources at the core of the NSDL.

NSDL CI needs to solve the IS problems required to turn Capability into Reality.

Acknowledgements

NSDL NSF Program Officers
  Lee Zia
  David McArthur

NSDL Core Integration Team
  UCAR: Kaye Howe, PI and Executive Director
  Cornell: Dean Krafft, PI
  Columbia: Kate Wittenberg, PI

Fedora Development Team
  Cornell: Sandy Payette & Carl Lagoze
  Univ. of Virginia: Thornton Staples

Apology

Courtesy Kathy Sierra/WickedlySmart.com

Questions?

Contact Information

Dean B. Krafft
Cornell Information Science
301 College Ave.
Ithaca, NY 14850
USA
dean@cs.cornell.edu

This work is licensed under the Creative Commons Attribution-NoDerivs 2.5 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nd/2.5/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.