Semantic web and search Richard Nurse Open University Library

Semantic web technologies and digital library search


semantic web technologies and digital library search presentation discussing linked data basics and STELLAR project work


Page 1: Semantic web technologies and digital library search

Semantic web and search

Richard Nurse, Open University Library Services

Page 2: Semantic web technologies and digital library search

Outline

• Background
• Basics of semantic web technologies
• Relevance to libraries and search
• STELLAR search project

Page 3: Semantic web technologies and digital library search

Open University
• UK distance learning university, 200,000+ students
• Undergraduate/Postgraduate/Research
• Online learning supported by course materials & local tutors
• Milton Keynes campus and regional/national offices
• BUT… most students never visit the main campus

Page 4: Semantic web technologies and digital library search

Library Services
• 24/7 helpdesk
• Online library resources
• Online help sessions
• Links to library resources and skills activities embedded in VLE
• Discovery platform, website resource lists
• Librarians work with academics to build new courses

Page 5: Semantic web technologies and digital library search

Library Services
• Cross-university Information Management services
• Institutional Repository ORO http://oro.open.ac.uk/
• Research Data Management Project
• Data retention and records management
• University Archive
• Metadata expertise

Page 6: Semantic web technologies and digital library search

Library Services
• Innovation projects

http://www.open.ac.uk/blogs/macon/

http://www.open.ac.uk/blogs/RISE/

http://www.open.ac.uk/blogs/telstar/

Page 7: Semantic web technologies and digital library search

Library Services
• Innovation and development
• OU Knowledge Media Institute and others
• Semantic web
• Video search

http://kmi.open.ac.uk/projects/name/lucero

http://projects.kmi.open.ac.uk/reflex/index.xml

http://www.open.ac.uk/blogs/AVA/

Page 8: Semantic web technologies and digital library search

Search

Page 9: Semantic web technologies and digital library search

Search

“It’s always so hit-and-miss… I used to sit there for hours and just not find anything. There were thousands and thousands of bits of material but no way of drilling down to find what I really needed. My manager needed to know, by tomorrow, whether there was something we could use or not and I didn’t know the answer, so had to say no”.

Page 10: Semantic web technologies and digital library search

Search

• Terms
• Boolean logic – AND, OR, NOT
• Operators: - (exclude), site:, “ ” (exact phrase)
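
As a rough illustration of how term and Boolean searching works, the sketch below builds a tiny inverted index and combines result sets with AND, OR and NOT. The three documents and all function names are invented for the example; it is not how any particular search engine is implemented.

```python
# Toy corpus; the documents are invented for illustration only.
DOCS = {
    1: "introduction to the humanities resource book",
    2: "open university archive of course materials",
    3: "humanities course materials on video",
}


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index


INDEX = build_index(DOCS)


def ids(term):
    """Documents containing a single term."""
    return INDEX.get(term.lower(), set())


def phrase(text):
    """Exact-phrase match (simple substring test), like quoting a query."""
    return {d for d, body in DOCS.items() if text.lower() in body.lower()}


both = ids("humanities") & ids("materials")    # AND: set intersection
either = ids("archive") | ids("video")         # OR: set union
without = ids("humanities") - ids("video")     # NOT: set difference
quoted = phrase("course materials")            # "course materials"
```

Each query still matches strings rather than meanings: a document about "arts" would not be found by a search for "humanities".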

Page 11: Semantic web technologies and digital library search

Search

http://www.flickr.com/photos/niallkennedy/

Page 12: Semantic web technologies and digital library search

Search

http://www.flickr.com/photos/dullhunk/

Page 13: Semantic web technologies and digital library search

Search
• ‘things not strings’

http://googleblog.blogspot.co.uk/2012/05/introducing-knowledge-graph-things-not.html

Page 14: Semantic web technologies and digital library search

Search

Google’s Knowledge Graph

Page 15: Semantic web technologies and digital library search

Semantic web

Definition: "The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation."

The Semantic Web

Tim Berners-Lee, James Hendler, and Ora Lassila, Scientific American, 2001

http://www.sciam.com/article.cfm?id=the-semantic-web

http://www.nature.com/scientificamerican/journal/v284/n5/pdf/scientificamerican0501-34.pdf

Page 16: Semantic web technologies and digital library search

Semantic web basics

• ‘web of meaning’
• ‘web of data’

http://www.w3.org/2001/sw/

http://semanticweb.org/wiki/Main_Page

http://www.slideshare.net/fadirra/semantic-web-intro-040411

Page 17: Semantic web technologies and digital library search

Semantic web basics
• URIs
• Linked data
• Ontologies
• but also…

Page 18: Semantic web technologies and digital library search

Semantic web basics
• URIs – Uniform Resource Identifier
• http://en.wikipedia.org/wiki/Uniform_resource_identifier

http://www.slideshare.net/mdaquin/sssw13-ldtut
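
To make the parts of a URI concrete, here is a minimal sketch that splits one with Python's standard `urllib.parse`. The URI is the VIAF identifier quoted on a later slide in this deck; the variable names are only illustrative.

```python
from urllib.parse import urlparse

# A VIAF identifier that appears later in this presentation.
uri = "http://viaf.org/viaf/102333412/#foaf:Person"

parts = urlparse(uri)

scheme = parts.scheme      # how to reach the resource
host = parts.netloc        # who issued the identifier
path = parts.path          # which thing is being identified
fragment = parts.fragment  # a named part of the description

print(scheme, host, path, fragment)
```

The point for linked data is that the whole string acts as a globally unique name for a 'thing', and it can also be looked up on the web.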

Page 19: Semantic web technologies and digital library search

Linked data
• “Linked Data is about using the Web to connect related data that wasn't previously linked, or using the Web to lower the barriers to linking data currently linked using other methods.”

• Wikipedia defines Linked Data as "a term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF."

http://linkeddata.org/home

Page 20: Semantic web technologies and digital library search

http://www.nature.com/scientificamerican/journal/v284/n5/pdf/scientificamerican0501-34.pdf

Subject → Predicate → Object

Jane Austen → ‘is the author of’ → Pride and Prejudice
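
The subject, predicate, object pattern can be sketched in a few lines of Python. This is only an illustration of the idea, not a real triplestore: the `Triple` type and `match` function are hypothetical names, and the second and third statements are added here to give the query something to do.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


GRAPH = [
    Triple("Jane Austen", "is the author of", "Pride and Prejudice"),
    # Extra statements added for the example:
    Triple("Jane Austen", "is the author of", "Emma"),
    Triple("Pride and Prejudice", "was published in", "1813"),
]


def match(graph, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (s is None or t.subject == s)
        and (p is None or t.predicate == p)
        and (o is None or t.obj == o)
    ]


# "What did Jane Austen write?"
works = [t.obj for t in match(GRAPH, s="Jane Austen", p="is the author of")]

# "Who wrote Pride and Prejudice?"
authors = [t.subject for t in match(GRAPH, p="is the author of",
                                    o="Pride and Prejudice")]
```

In real linked data each of these strings would normally be a URI, so that statements from different datasets about the same thing can be joined.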

Page 21: Semantic web technologies and digital library search

Ontologies

http://www.slideshare.net/mdaquin/sssw13-ldtut

“An ontology is a formal specification of a shared conceptualization” Tom Gruber

http://en.wikipedia.org/wiki/Tom_Gruber

http://viaf.org/viaf/72955884/

Page 22: Semantic web technologies and digital library search

Ontologies

http://oclc.org/developer/documentation/virtual-international-authority-file-viaf/viaf-rdf-example

e.g.
• Virtual International Authority File – VIAF – maintained by OCLC
• Friend of a Friend – FOAF http://www.foaf-project.org/

Page 23: Semantic web technologies and digital library search

Ontologies

http://viaf.org/viaf/102333412/#foaf:Person

Page 24: Semantic web technologies and digital library search

Ontologies

http://lov.okfn.org/dataset/lov/

http://dbpedia.org/About

Page 25: Semantic web technologies and digital library search

Linked data ‘cloud’ http://lod-cloud.net/

Richard Cyganiak and Anja Jentzsch

Page 28: Semantic web technologies and digital library search

Why is this of interest?

“The change that libraries will need to make … must include the transformation of the library’s public catalog from a stand-alone database of bibliographic records to a highly hyperlinked data set that can interact with information resources on the World Wide Web.”
Karen Coyle, Understanding the semantic web

http://www.alatechsource.org/library-technology-reports/understanding-the-semantic-web-bibliographic-data-and-metadata

Page 29: Semantic web technologies and digital library search

Why is this of interest?

Staff: ‘I would be more likely to explore existing non-current learning materials if there were a better way of finding them.’ (STELLAR survey comment)

Students: ‘The library is very expansive which is great but you can never find what you need. They need to redo the system make it easier.’ (NSS comment)

Search is a major “pain point” for students and staff

Page 31: Semantic web technologies and digital library search

at the OU Library

• Library catalogue
• Archival material
• Old course materials in the University Archive

Page 32: Semantic web technologies and digital library search

University Archive

• OU study materials – print and audio-visual
• Historical materials – photographs, oral history
• Papers of OU people

http://www.open.ac.uk/library/library-resources/the-open-university-archive

Page 34: Semantic web technologies and digital library search

Range of learning resource types

Page 36: Semantic web technologies and digital library search

The OU Digital Library (OUDL)

• Open source, purpose-designed, created by and supported by the digital preservation community

• Supports international metadata standards: PREMIS – METS – MODS – EAD – DC – OAI

• Supports Linked Data natively: Mulgara triplestore

FEDORA: Flexible Extensible Digital Object Repository Architecture

Page 37: Semantic web technologies and digital library search

The STELLAR project

• Semantic Technologies Enhancing the Lifecycle of Learning Resources

• OU Library Services/OU Knowledge Media Institute
• Experiment with semantic technologies in a digital library environment … and to consider the sustainability implications of using semantic technologies.

• Jisc-funded 2012-2013
• Jisc Digital Infrastructure programme – Sustainability of digital content

Page 38: Semantic web technologies and digital library search

Taking collections preserved in the OUDL, the STELLAR project was established to:

• Develop a detailed understanding of the value of legacy learning materials as perceived by academic staff and other key stakeholders

• Experiment with the use of semantic technologies in a digital library environment to ascertain the extent to which the perceived value of these materials might be enhanced and to consider the sustainability implications of using semantic technologies.

• Inform the development of digital libraries of learning resources by contributing to the evidence base for their effectiveness

• Increase the return on investment of learning materials by developing an evidence-based model for lifecycle management

STELLAR project aims

Page 39: Semantic web technologies and digital library search

The STELLAR project

• Project approach
• Create a baseline of perceptions of the value of the collection
• Carry out an enhancement of the collection
• Assess the impact of that enhancement on perceptions of value

Page 40: Semantic web technologies and digital library search

Initial survey into value
• 89.2% of respondents (501) agreed or strongly agreed with the statement that maintaining an archive of non-current OU learning materials is important to the reputation of the OU.

• 75.9% of respondents thought that this should be maintained in perpetuity.

• 90.16% of respondents (504) agreed or strongly agreed that non-current learning materials are important to the context of the history of higher education.

• 91.75% of those respondents who were involved in module production (356) agreed or strongly agreed that when producing new OU learning material, I am likely to look to previous material, whether for inspiration or for potential reuse.

“Some of the materials which the OU has produced in the past continue to be definitive, field-leading and innovative, and are recognised as such by other scholars. We need to keep copies of this as part of our institutional legacy”.

“We are the world leaders in distance learning, so our curriculum designs are much admired and so are our materials. It would be remiss of us not to treat them as potential objects of scholarship themselves”.

“The OU has produced some extraordinary courses in the past, comprised of material no longer covered today. This material can complement existing courses, and would be a fantastic extra resource for current students. This is particularly important since, under the new fee arrangements, value for money will be crucial”.

Page 41: Semantic web technologies and digital library search

Capturing perceptions

Personal and professional perspectives of value

· I would be disappointed if the OU learning materials that I helped to produce were not kept

· I keep my own copies of the OU learning materials that I am involved in producing

· I would be pleased if others chose to reuse or reversion the OU learning materials that I have helped to produce

Financial / bottom line perspectives of value

· I think that there is a monetary value to non-current OU learning materials

· The OU could make savings if more learning material were reused

Value to HE and academic communities

· Maintaining an archive of non-current OU learning materials is important to the reputation of the OU

· I think the non-current OU learning materials are important in the context of the history of higher education

· I think the non-current OU learning materials are important in showing how the OU taught at particular times in history

Value to internal processes and cultures

· I keep my own copies of the OU learning materials that I am involved in producing

· When producing new OU learning material, I am likely to look to previous material, whether for inspiration or for potential reuse

· I would be more likely to explore existing non-current learning materials if there were a better way of finding them.

Using a balanced scorecard approach we conducted a benchmarking survey of academic staff and stakeholders to investigate the value they place on non-current learning materials

http://www.gla.ac.uk/services/library/espida/

Page 42: Semantic web technologies and digital library search

STELLAR allowed us to link the metadata for all this module content, making it more discoverable & reusable

A metadata module record was created which connects the complicated web of content and metadata associated with each module

Module Information

Page 43: Semantic web technologies and digital library search

Basic linked data model(for data.open.ac.uk and to comply with current module descriptions)

doau:a103“An Introduction to the Humanities”

dc:title | rdfs:label | courseware:has-title

courseware:is-taught-present

dc:subject

courseware:Course | mlo:LearningOpportunitySpecification | aiiso:Module | xcri:course

rdf:type

“false”aiiso#code

“A103”

jacs:V900 | doau-topic:arts-and-humanities

doau:a102

dc:isVersionOf

doau:a101

daou-library:339347dc:isVersionOf

“An introduction to the humanities : resource book 2”

dc:title

courseware:has-courseware
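
The diagram on this slide does not survive as text, so the sketch below is only one plausible reading of it: which predicate points at which value, and the direction of `dc:isVersionOf`, are assumptions. It writes the module record as triples using the prefixed names exactly as they appear on the slide, with a hypothetical `describe` helper that gathers everything said about one subject.

```python
# Assumed reading of the slide's graph; prefixed names are copied as shown,
# but the pairing of predicates with values is a guess.
TRIPLES = [
    ("doau:a103", "rdf:type", "courseware:Course"),
    ("doau:a103", "dc:title", "An Introduction to the Humanities"),
    ("doau:a103", "aiiso#code", "A103"),
    ("doau:a103", "courseware:is-taught-present", "false"),
    ("doau:a103", "dc:subject", "jacs:V900"),
    ("doau:a103", "dc:subject", "doau-topic:arts-and-humanities"),
    ("doau:a103", "dc:isVersionOf", "doau:a102"),
    ("doau:a103", "courseware:has-courseware", "daou-library:339347"),
    ("daou-library:339347", "dc:title",
     "An introduction to the humanities : resource book 2"),
]


def describe(subject, triples=TRIPLES):
    """Group every statement about one subject by predicate."""
    description = {}
    for s, p, o in triples:
        if s == subject:
            description.setdefault(p, []).append(o)
    return description


module = describe("doau:a103")

# Follow the has-courseware link to list the titles of attached items.
courseware_titles = [
    title
    for item in module["courseware:has-courseware"]
    for title in describe(item).get("dc:title", [])
]
```

Following a link from the module to its courseware and then reading that item's title is the kind of hop that makes the connected content discoverable from a single module record.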

Page 44: Semantic web technologies and digital library search

Relationship model

Page 45: Semantic web technologies and digital library search

Fedora record

course

Page 46: Semantic web technologies and digital library search

Fedora record

Page 47: Semantic web technologies and digital library search

Application of Linked Data
• Text entered into the tool is passed through a semantic meaning engine and concepts are matched against the concepts contained within the digital library dataset.

• A selection of the closest matches is then displayed. These link through to the object in the Fedora digital library.

• The semantic web tool analyses the meaning of those words and finds related material.

• The tool can also show related material from other datasets from data.open.ac.uk.
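
The matching step described above can be sketched as follows. The slides do not say how the STELLAR tool's semantic engine extracts or scores concepts, so this stand-in uses a hand-made keyword-to-concept table and Jaccard overlap; the vocabulary, object ids and function names are all hypothetical.

```python
# Hypothetical vocabulary: keyword -> concept label.
CONCEPTS = {
    "painting": "art", "sculpture": "art",
    "poetry": "literature", "novel": "literature",
    "symphony": "music", "opera": "music",
}

# Hypothetical digital-library objects, each tagged with concepts.
LIBRARY = {
    "object-1": {"art", "literature"},
    "object-2": {"music"},
    "object-3": {"literature", "music"},
}


def extract_concepts(text):
    """Stand-in for the semantic engine: look up known keywords."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return {CONCEPTS[w] for w in words if w in CONCEPTS}


def jaccard(a, b):
    """Overlap between two concept sets, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def closest_matches(text, limit=2):
    """Rank library objects by concept overlap with the entered text."""
    query = extract_concepts(text)
    scored = [(jaccard(query, tags), obj_id) for obj_id, tags in LIBRARY.items()]
    scored = [(score, obj_id) for score, obj_id in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [obj_id for _, obj_id in scored[:limit]]


results = closest_matches("A novel about an opera singer")
```

Note that none of the matched objects needs to contain the words "novel" or "opera": the match is made on concepts, which is why this kind of search can surface material a keyword search would miss.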

Page 49: Semantic web technologies and digital library search

Directly access digitised content stored in the OUDL

Materials include those originally in print, audio and video formats

Links to the extensive metadata about the course or element of the course, held on a data.open.ac.uk page

Page 50: Semantic web technologies and digital library search

data.open.ac.uk

Page 51: Semantic web technologies and digital library search

Architecture of the STELLAR tool

Page 52: Semantic web technologies and digital library search

Try the technology
• http://discou.info/alfa/

Page 55: Semantic web technologies and digital library search

Headline findings

Page 56: Semantic web technologies and digital library search

Headline findings

• A consistently positive reaction to the enhanced collections. In every area the majority of respondents agreed or strongly agreed that the enhanced materials had value

• There were two dimensions where the evaluation indicates that the transformation of the materials has increased their perceived value:
  • value to internal processes & culture
  • financial/bottom line value

• Participants also made several comments regarding which materials should be preserved & enhanced

• Read the full report on the blog: http://www.open.ac.uk/blogs/stellar/wp-content/uploads/2013/07/STELLAR-Post-Enhancement-Survey-Report.pdf

Page 57: Semantic web technologies and digital library search

Value to internal processes & culture
• 89% of respondents agreed or strongly agreed that they would be more likely to explore existing materials if they knew they had been enhanced

• 94% agreed or strongly agreed that such enhancement makes content easier to reuse or refer to for inspiration during module production

• When thinking about existing systems, 94% also agreed or strongly agreed that the semantic analysis they had seen suggested material which they would not have found using a traditional search

• 78% of respondents agreed or strongly agreed that enhanced materials are more likely to be referred to during module production than those preserved in existing OU systems

Page 58: Semantic web technologies and digital library search

Financial / bottom line value

• Improving the discoverability and reusability of the materials appears to have increased the perceived financial value of the materials

• In the pre-enhancement survey 75.9% of respondents agreed that the OU could make savings if more learning material were reused

• Following the enhancement, an increased 83% agreed or strongly agreed that the OU could make cost savings if existing materials were enhanced to make them more discoverable

“It will be helpful to know what kind of support and budget is available to make more old course resources available. This will help reducing costly budgets for new modules in production.”

Page 59: Semantic web technologies and digital library search

Value of semantic search
Stakeholder views of semantic search
• ‘More likely to use material’ – 89% agreed/strongly agreed
• ‘Content easier to reuse’ – 94% agreed/strongly agreed
• ‘Found material that traditional search wouldn’t’ – 94% agreed/strongly agreed

[Bar chart: ‘Cost-savings could be made if material re-used’, Before vs. After, axis from 72% to 84%]

http://www.open.ac.uk/blogs/stellar/wp-content/uploads/2013/07/STELLAR-Post-Enhancement-Survey-Report.pdf

Page 60: Semantic web technologies and digital library search

Key findings
• Significant effort required to improve the metadata
• To make best use of the Linked Data, it was beneficial to digitise and preserve all course materials for the selected courses
• Trade-off between value of extra content digitised and the cost of cataloguing
• Once you’ve built it into your system you can automatically generate linked data for new content of that type
• Stakeholders can see the value of this type of search

Page 61: Semantic web technologies and digital library search

Follow-up work to STELLAR
• Linked Data embedded into OU Digital Library
• Used to link to related iTunesU and OpenLearn material

Page 62: Semantic web technologies and digital library search

STELLAR
• STELLAR blog www.open.ac.uk/blogs/stellar

• Final report http://www.open.ac.uk/blogs/stellar/wp-content/uploads/2013/09/STELLAR-JISC-Final-Report.pdf

• Final report in Jorum http://hdl.handle.net/10949/18379

Page 63: Semantic web technologies and digital library search

Questions?