semantic web technologies and digital library search presentation discussing linked data basics and STELLAR project work
Semantic web and search
Richard Nurse, Open University Library Services
Outline
• Background
• Basics of semantic web technologies
• Relevance to libraries and search
• STELLAR search project
Open University
• UK distance learning University, +200,000 students
• Undergraduate/Postgraduate/Research
• Online learning supported by course materials & local tutors
• Milton Keynes campus and regional/national offices
• BUT… most students never visit the main campus
Library Services
• 24/7 helpdesk
• Online library resources
• Online help sessions
• Links to library resources and skills activities embedded in VLE
• Discovery platform, website resource lists
• Librarians work with academics to build new courses
Library Services
• Cross-university Information Management services
• Institutional Repository ORO http://oro.open.ac.uk/
• Research Data Management Project
• Data retention and records management
• University Archive
• Metadata expertise
Library Services
• Innovation projects
http://www.open.ac.uk/blogs/macon/
http://www.open.ac.uk/blogs/RISE/
http://www.open.ac.uk/blogs/telstar/
Library Services
• Innovation and development
• OU Knowledge Media Institute and others
• Semantic web
• Video search
http://kmi.open.ac.uk/projects/name/lucero
http://projects.kmi.open.ac.uk/reflex/index.xml
http://www.open.ac.uk/blogs/AVA/
Search
“It’s always so hit-and-miss… I used to sit there for hours and just not find anything. There were thousands and thousands of bits of material but no way of drilling down to find what I really needed. My manager needed to know, by tomorrow, whether there was something we could use or not and I didn’t know the answer, so had to say no”.
Search
• Terms
• Boolean logic – AND, OR, NOT
• Operators: -, site:, “ ”
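As a rough illustration of the Boolean logic on this slide, the sketch below treats each search term as the set of documents that contain it, so AND, OR and NOT become set intersection, union and difference. The three documents are made-up examples, not OU data, and `matching` is a hypothetical helper rather than any real search engine's API.

```python
# Toy collection: document id -> text (made-up examples, not OU data)
docs = {
    1: "introduction to the humanities resource book",
    2: "arts foundation course audio",
    3: "humanities course video",
}


def matching(term):
    """Return the ids of documents whose text contains the term as a whole word."""
    return {doc_id for doc_id, text in docs.items() if term in text.split()}


# AND: documents that contain both terms
humanities_and_course = matching("humanities") & matching("course")

# OR: documents that contain either term
humanities_or_arts = matching("humanities") | matching("arts")

# NOT: documents that contain the first term but not the second
course_not_audio = matching("course") - matching("audio")

print(humanities_and_course, humanities_or_arts, course_not_audio)
```

Each query only compares strings; nothing here knows that "arts" and "humanities" are related, which is the gap the later slides address.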
Search
http://www.flickr.com/photos/niallkennedy/
Search
http://www.flickr.com/photos/dullhunk/
Search
• ‘things not strings’
http://googleblog.blogspot.co.uk/2012/05/introducing-knowledge-graph-things-not.html
Search
Google’s Knowledge Graph
Semantic web
Definition: "The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation."
Tim Berners-Lee, James Hendler, and Ora Lassila, “The Semantic Web”, Scientific American, 2001
http://www.sciam.com/article.cfm?id=the-semantic-web
http://www.nature.com/scientificamerican/journal/v284/n5/pdf/scientificamerican0501-34.pdf
Semantic web basics
• ‘web of meaning’
• ‘web of data’
http://www.w3.org/2001/sw/
http://semanticweb.org/wiki/Main_Page
http://www.slideshare.net/fadirra/semantic-web-intro-040411
Semantic web basics
• URIs
• Linked data
• Ontologies
• but also…
Semantic web basics
• URIs – Uniform Resource Identifiers
• http://en.wikipedia.org/wiki/Uniform_resource_identifier
http://www.slideshare.net/mdaquin/sssw13-ldtut
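To make the idea of a URI a little more concrete, here is a minimal sketch that splits the Wikipedia URI from the slide into its parts using Python's standard `urllib.parse` module. It only shows the syntax of an identifier; it says nothing about what the identifier refers to.

```python
from urllib.parse import urlparse

# The URI quoted on the slide
uri = "http://en.wikipedia.org/wiki/Uniform_resource_identifier"

parts = urlparse(uri)

# A URI names a resource through a scheme, an authority (host) and a path
print(parts.scheme)   # how the resource is addressed
print(parts.netloc)   # who is responsible for the name
print(parts.path)     # which resource on that host
```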
Linked data
• “Linked Data is about using the Web to connect related data that wasn't previously linked, or using the Web to lower the barriers to linking data currently linked using other methods.”
• Wikipedia defines Linked Data as "a term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF."
http://linkeddata.org/home
http://www.nature.com/scientificamerican/journal/v284/n5/pdf/scientificamerican0501-34.pdf
Subject → Predicate → Object
Jane Austen → ‘is the author of’ → Pride and Prejudice
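The subject, predicate, object pattern can be sketched in a few lines of Python. The `http://example.org/` URIs and the `isAuthorOf` predicate below are placeholders invented for illustration, not identifiers from any real vocabulary; the output follows the N-Triples convention of one statement per line, with each URI in angle brackets and a closing full stop.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Placeholder namespace, assumed for this example only
EX = "http://example.org/"

triple = Triple(
    subject=EX + "Jane_Austen",
    predicate=EX + "isAuthorOf",
    obj=EX + "Pride_and_Prejudice",
)


def to_ntriples(t: Triple) -> str:
    """Serialise a triple whose three parts are all URIs as one N-Triples line."""
    return f"<{t.subject}> <{t.predicate}> <{t.obj}> ."


line = to_ntriples(triple)
print(line)
```

Because every part is a URI, another dataset that uses the same URI for Jane Austen can be joined to this statement without any string matching on names.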
Ontologies
http://www.slideshare.net/mdaquin/sssw13-ldtut
“An ontology is a formal specification of a shared conceptualization” Tom Gruber
http://en.wikipedia.org/wiki/Tom_Gruber
http://viaf.org/viaf/72955884/
Ontologies
http://oclc.org/developer/documentation/virtual-international-authority-file-viaf/viaf-rdf-example
e.g.
• Virtual International Authority File (VIAF), maintained by OCLC
• Friend of a Friend (FOAF) http://www.foaf-project.org/
Ontologies
http://viaf.org/viaf/102333412/#foaf:Person
Ontologies
http://lov.okfn.org/dataset/lov/
http://dbpedia.org/About
Linked data ‘cloud’ http://lod-cloud.net/
Richard Cyganiak and Anja Jentzsch
Why is this of interest?
http://www.slideshare.net/lisld/the-inside-out-library
Lorcan Dempsey OCLC
Why is this of interest?
http://www.slideshare.net/lisld/the-inside-out-library
Quoted by Lorcan Dempsey
“Inside Out library: Scale, Learning and Engagement”
Why is this of interest?
“The change that libraries will need to make … must include the transformation of the library’s public catalog from a stand-alone database of bibliographic records to a highly hyperlinked data set that can interact with information resources on the World Wide Web.” Karen Coyle, Understanding the Semantic Web
http://www.alatechsource.org/library-technology-reports/understanding-the-semantic-web-bibliographic-data-and-metadata
Why is this of interest?
Staff: ‘I would be more likely to explore existing non-current learning materials if there were a better way of finding them.’ (STELLAR survey comment)
Students: ‘The library is very expansive which is great but you can never find what you need. They need to redo the system make it easier.’ (NSS comment)
Search is a major “pain point” for students and staff
What are libraries doing?
http://www.w3.org/2005/Incubator/lld/
http://lodlam.net/
http://datahub.io/group/lld
at the OU Library
• Library catalogue
• Archival material
• Old course materials in the University Archive
University Archive
• OU study materials – print and audio-visual
• Historical materials – photographs, oral history
• Papers of OU people
http://www.open.ac.uk/library/library-resources/the-open-university-archive
Range of learning resource types
The OU Digital Library (OUDL)
Open source, created and supported by the digital preservation community; purpose-designed
Supports international metadata standards: PREMIS – METS – MODS – EAD – DC – OAI
Supports Linked Data natively: Mulgara triplestore
FEDORA: Flexible Extensible Digital Object Repository Architecture
The STELLAR project
• Semantic Technologies Enhancing the Lifecycle of Learning Resources
• OU Library Services/OU Knowledge Media Institute
• Experiment with semantic technologies in a digital library environment … and to consider the sustainability implications of using semantic technologies.
• Jisc-funded 2012-2013
• Jisc Digital Infrastructure programme – Sustainability of digital content
Taking collections preserved in the OUDL, the STELLAR project was established to:
• Develop a detailed understanding of the value of legacy learning materials as perceived by academic staff and other key stakeholders
• Experiment with the use of semantic technologies in a digital library environment to ascertain the extent to which the perceived value of these materials might be enhanced and to consider the sustainability implications of using semantic technologies.
• Inform the development of digital libraries of learning resources by contributing to the evidence base for their effectiveness
• Increase the return on investment of learning materials by developing an evidence-based model for lifecycle management
STELLAR project aims
The STELLAR project
• Project approach
• Create a baseline of perceptions of the value of the collection
• Carry out an enhancement of the collection
• Assess the impact of that enhancement on perceptions of value
Initial survey into value
• 89.2% of respondents (501) agreed or strongly agreed with the statement that maintaining an archive of non-current OU learning materials is important to the reputation of the OU.
• 75.9% of respondents thought that this should be maintained in perpetuity.
• 90.16% of respondents (504) agreed or strongly agreed that non-current learning materials are important to the context of the history of higher education.
• 91.75% of those respondents who were involved in module production (356) agreed or strongly agreed with the statement “When producing new OU learning material, I am likely to look to previous material, whether for inspiration or for potential reuse”.
“Some of the materials which the OU has produced in the past continue to be definitive, field-leading and innovative, and are recognised as such by other scholars. We need to keep copies of this as part of our institutional legacy”.
“We are the world leaders in distance learning, so our curriculum designs are much admired and so are our materials. It would be remiss of us not to treat them as potential objects of scholarship themselves”.
“The OU has produced some extraordinary courses in the past, comprised of material no longer covered today. This material can complement existing courses, and would be a fantastic extra resource for current students. This is particularly important since, under the new fee arrangements, value for money will be crucial”.
Capturing perceptions
Personal and professional perspectives of value
· I would be disappointed if the OU learning materials that I helped to produce were not kept
· I keep my own copies of the OU learning materials that I am involved in producing
· I would be pleased if others chose to reuse or reversion the OU learning materials that I have helped to produce
Financial / bottom line perspectives of value
· I think that there is a monetary value to non-current OU learning materials
· The OU could make savings if more learning material were reused
Value to HE and academic communities
· Maintaining an archive of non-current OU learning materials is important to the reputation of the OU
· I think the non-current OU learning materials are important in the context of the history of higher education
· I think the non-current OU learning materials are important in showing how the OU taught at particular times in history
Value to internal processes and cultures
· I keep my own copies of the OU learning materials that I am involved in producing
· When producing new OU learning material, I am likely to look to previous material, whether for inspiration or for potential reuse
· I would be more likely to explore existing non-current learning materials if there were a better way of finding them.
Using a balanced scorecard approach we conducted a benchmarking survey of academic staff and stakeholders to investigate the value they place on non-current learning materials
http://www.gla.ac.uk/services/library/espida/
STELLAR allowed us to link the metadata for all this module content, making it more discoverable & reusable
A metadata module record was created which connects the complicated web of content and metadata associated with each module
Module Information
Basic linked data model (for data.open.ac.uk and to comply with current module descriptions)
[Diagram: properties of the module doau:a103]
• dc:title | rdfs:label | courseware:has-title: “An Introduction to the Humanities”
• rdf:type: courseware:Course | mlo:LearningOpportunitySpecification | aiiso:Module | xcri:course
• courseware:is-taught-present: “false”
• aiiso#code: “A103”
• dc:subject: jacs:V900 | doau-topic:arts-and-humanities
• dc:isVersionOf (×2): doau:a102, doau:a101
• courseware:has-courseware: daou-library:339347
• dc:title (of daou-library:339347): “An introduction to the humanities : resource book 2”
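One way to read the model above is as a small set of triples that can be queried. The sketch below is an assumption-laden illustration: the prefixed names come from the slide, but the pairing of labels and, in particular, the direction and chaining of the two `dc:isVersionOf` links (a103 to a102, a102 to a101) are guesses from the diagram residue, and the `objects` and `earlier_versions` helpers are hypothetical, not part of data.open.ac.uk.

```python
# (subject, predicate, object) triples using prefixed names from the slide.
# Which node each label attaches to is an assumption made for illustration.
triples = [
    ("doau:a103", "rdf:type", "aiiso:Module"),
    ("doau:a103", "dc:title", "An Introduction to the Humanities"),
    ("doau:a103", "dc:subject", "doau-topic:arts-and-humanities"),
    ("doau:a103", "dc:isVersionOf", "doau:a102"),  # assumed direction
    ("doau:a102", "dc:isVersionOf", "doau:a101"),  # assumed chain
]


def objects(subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def earlier_versions(module):
    """Follow dc:isVersionOf links transitively to collect earlier modules."""
    found = []
    frontier = [module]
    while frontier:
        current = frontier.pop()
        for earlier in objects(current, "dc:isVersionOf"):
            if earlier not in found:
                found.append(earlier)
                frontier.append(earlier)
    return found


print(objects("doau:a103", "dc:title"))
print(earlier_versions("doau:a103"))
```

Following links like this is what lets a search starting from one module surface its predecessors without any shared keywords.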
Relationship model
[Diagram: Fedora record, course, Fedora record]
Application of Linked Data
• Text entered into the tool is passed through a semantic meaning engine and concepts are matched against the concepts contained within the digital library dataset.
• A selection of the closest matches is then displayed. These link through to the object in the Fedora digital library.
• The semantic web tool analyses the meaning of those words and finds related material.
• The tool can also show related material from other datasets from data.open.ac.uk.
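The matching step described above can be sketched in miniature. This is not the STELLAR tool's implementation: the slides do not say how the semantic engine works, so a hand-made word-to-concept lookup stands in for it here, and every name, concept and item below is hypothetical.

```python
# Hypothetical word -> concept lookup standing in for the semantic engine
CONCEPTS = {
    "painting": "art",
    "sculpture": "art",
    "poetry": "literature",
    "novel": "literature",
    "symphony": "music",
}

# Hypothetical digital library items, each tagged with concepts
ITEMS = {
    "item-a": {"art", "literature"},
    "item-b": {"music"},
    "item-c": {"literature"},
}


def concepts_in(text):
    """Map the words of the entered text to the concepts they signal."""
    words = text.lower().split()
    return {CONCEPTS[w] for w in words if w in CONCEPTS}


def closest_matches(text, limit=2):
    """Rank items by how many concepts they share with the entered text."""
    wanted = concepts_in(text)
    scored = [(len(wanted & tags), item) for item, tags in ITEMS.items()]
    # Keep only items sharing at least one concept; best score first,
    # ties broken by item name so the order is stable
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [item for _, item in ranked[:limit]]


print(closest_matches("A novel about painting"))
```

The point of the sketch is that matching happens on concepts rather than on the words typed, so an item can be returned even when it shares no literal term with the query.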
http://www.open.ac.uk/blogs/stellar/wp-content/uploads/2013/07/stellar2.mp4
Directly access digitised content stored in the OUDL
Materials include those originally in print, audio and video formats
Links to the extensive metadata about the course or element of the course, held on a data.open.ac.uk page
data.open.ac.uk
Architecture of the STELLAR tool
Headline findings
• A consistently positive reaction to the enhanced collections. In every area the majority of respondents agreed or strongly agreed that the enhanced materials had value
• There were two dimensions where the evaluation indicates that the transformation of the materials has increased their perceived value:
  • value to internal processes & culture
  • financial/bottom line value
• Participants also made several comments regarding which materials should be preserved & enhanced
• Read the full report on the blog: http://www.open.ac.uk/blogs/stellar/wp-content/uploads/2013/07/STELLAR-Post-Enhancement-Survey-Report.pdf
Value to internal processes & culture
• 89% of respondents agreed or strongly agreed that they would be more likely to explore existing materials if they knew they had been enhanced
• 94% agreed or strongly agreed that such enhancement makes content easier to reuse or refer to for inspiration during module production
• When thinking about existing systems, 94% also agreed or strongly agreed that the semantic analysis they had seen suggested material which they would not have found using a traditional search
• 78% of respondents agreed or strongly agreed that enhanced materials are more likely to be referred to during module production than those preserved in existing OU systems
Financial / bottom line value
• Improving the discoverability and reusability of the materials appears to have increased the perceived financial value of the materials
• In the pre-enhancement survey 75.9% of respondents agreed that the OU could make savings if more learning material were reused
• Following the enhancement, an increased 83% agreed or strongly agreed that the OU could make cost savings if existing materials were enhanced to make them more discoverable
“It will be helpful to know what kind of support and budget is available to make more old course resources available. This will help reducing costly budgets for new modules in production.”
Value of semantic search
Stakeholder views of semantic search
• ‘More likely to use material’ – 89% agreed/strongly agreed
• ‘Content easier to reuse’ – 94% agreed/strongly agreed
• ‘Found material that traditional search wouldn’t’ – 94% agreed/strongly agreed
[Bar chart: “Cost-savings could be made if material re-used”, Before vs. After; axis from 72% to 84%]
http://www.open.ac.uk/blogs/stellar/wp-content/uploads/2013/07/STELLAR-Post-Enhancement-Survey-Report.pdf
Key findings
• Significant effort required to improve the metadata
• To make best use of the Linked Data, it was beneficial to digitise and preserve all course materials for the selected courses
• Trade-off between value of extra content digitised and the cost of cataloguing
• Once you’ve built it into your system you can automatically generate linked data for new content of that type
• Stakeholders can see the value of this type of search
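The finding about automatically generating linked data for new content can be pictured as a fixed mapping from metadata fields to predicates, applied to each new record. The sketch below is illustrative only: the field names, the `example:` identifier and the `record_to_triples` helper are hypothetical, and the slides do not describe how the OU Digital Library actually does this.

```python
# Hypothetical mapping from metadata field names to predicates
FIELD_TO_PREDICATE = {
    "title": "dc:title",
    "subject": "dc:subject",
    "version_of": "dc:isVersionOf",
}


def record_to_triples(identifier, record):
    """Turn one metadata record into (subject, predicate, object) triples."""
    triples = []
    for field, predicate in FIELD_TO_PREDICATE.items():
        value = record.get(field)
        if value is None:
            continue  # field not present in this record
        # A field may hold one value or a list of values
        values = value if isinstance(value, list) else [value]
        triples.extend((identifier, predicate, v) for v in values)
    return triples


# Made-up record for a new piece of content
new_record = {
    "title": "Example module",
    "subject": ["example-topic-1", "example-topic-2"],
}

generated = record_to_triples("example:m001", new_record)
for t in generated:
    print(t)
```

Once the mapping exists, the cost of producing linked data for each additional record of the same type is close to zero, which is the trade-off the key findings weigh against the up-front metadata effort.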
Follow-up work to STELLAR
• Linked Data embedded into OU Digital Library
• Used to link to related iTunesU and OpenLearn material
STELLAR
• STELLAR blog www.open.ac.uk/blogs/stellar
• Final report http://www.open.ac.uk/blogs/stellar/wp-content/uploads/2013/09/STELLAR-JISC-Final-Report.pdf
• Final report in Jorum http://hdl.handle.net/10949/18379
Questions?