A presentation given at the Talis Xiphos Research Day, 10 June 2008.
Andy Powell, Eduserv [email protected]
www.eduserv.org.uk/foundation
Web 2.0 and repositories…
…have we got our repository architecture right?
June 2008, Talis "Project Xiphos" Research Day, Birmingham
Outline
• where are we now?
• what’s wrong with where we are now?
• what can we do about it?
• do we need a new vision?
Where are we now?
What is a repository?
a university-based institutional repository is a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members. It is most essentially an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access or distribution. … An institutional repository is not simply a fixed set of software and hardware
(Cliff Lynch, 2003)
Repository “doing” words
• manage
• deposit
• disclose
• make openly available
• curate
• preserve
Repository content
• all sorts… but most “academic” focus currently on
– scholarly publications
– learning objects
– research data
• this talk focuses on the first of these, but with the intention that most of what I say will be generic
Repository architecture
• largely institutional focus, though with some exceptions – arXiv, RePEc, JORUM, etc.
• interoperability through centralised aggregators (national and global)
– search services (OAIster, Intute, …)
– registries (DOAR, ROAR, …)
• harvesting metadata about content using OAI-PMH (metadata = simple Dublin Core)
• content = PDF
• SWORD as deposit API
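As a rough illustration of the harvesting model described above, the sketch below builds an OAI-PMH ListRecords request for simple Dublin Core and pulls titles and creators out of a response. The base URL, the function names and the sample record are invented for illustration, and the response is simplified (a real one also carries elements such as responseDate and datestamp); only the verb and metadataPrefix parameters and the XML namespaces come from the OAI-PMH and Dublin Core specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for a repository's OAI-PMH base URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


# Invented, simplified response; a real harvester would fetch this over HTTP.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example eprint</dc:title>
          <dc:creator>Smith, J.</dc:creator>
          <dc:identifier>http://repository.example.org/1/paper.pdf</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_records(xml_text):
    """Return one dict per harvested record with its simple DC titles and creators."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        records.append({
            "identifier": record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier"),
            "titles": [e.text for e in record.iter(f"{{{DC_NS}}}title")],
            "creators": [e.text for e in record.iter(f"{{{DC_NS}}}creator")],
        })
    return records


if __name__ == "__main__":
    print(list_records_url("http://repository.example.org/oai"))
    print(parse_records(SAMPLE_RESPONSE))
```

Note that the aggregator only ever sees the metadata record; the PDF itself is reachable only through whatever link the repository chose to put in dc:identifier.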
What’s “wrong” with where we are now?
#1 We talk about “repositories”…
…rather than “the Web”
a focus on ‘making content available on the Web’ would be more intuitive to researchers
Whatever happened to the CMS?
• a focus on ‘content management’ would change our emphasis
• OAI-PMH out…
• search engine optimisation, usability, accessibility, Web design, tagging, information architecture, cool URIs in…
#2 We don’t emphasise…
• Google indexing
• RSS feeds
• widget technology – embedding functionality into other sites
#3 Our focus is on sharing metadata…
• …even though we have full-text to share
• worse… the full-text we share tends to be PDF rather than native Web format
– the Web equivalent of a cul de sac
• and the metadata we share tends to be “simple Dublin Core”
– little consistency in approaches to describing ‘files’ vs. ‘documents’
– little consistency in naming authors and subjects
– ultimately, it is both too simple and too complex!
pbo31 @ flickr
#4 We ignore the Web Architecture
• we have tended to adopt service oriented approaches
• in line with a long tradition from Z39.50 to SOAP/WSDL
– e.g. JISC eFramework
• focus is on building “services on content” rather than on the “content”
REST is good
• we don’t tend to adopt a resource oriented approach
• we don’t adopt REST – an architectural style with a focus on resources, their identifiers (e.g. URIs), and a simple uniform set of operations that each resource supports (e.g. GET, PUT, POST, DELETE)
• we don’t encourage a Web style “follow your nose” approach
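To make the resource-oriented idea a little more concrete, here is a minimal sketch of a uniform interface: every resource is named by a URI path and answers to the same four operations. The class, its method names and the status codes it returns are this sketch's own choices, not part of any repository software; the example URIs are invented.

```python
class ResourceStore:
    """In-memory stand-in for a server: URI path -> representation (a dict)."""

    def __init__(self):
        self._resources = {}
        self._next_id = 1

    def handle(self, method, uri, body=None):
        """Apply one of the uniform operations and return (status, result)."""
        if method == "GET":
            if uri in self._resources:
                return 200, self._resources[uri]
            return 404, None
        if method == "PUT":
            # Create or replace the resource at exactly this URI.
            status = 200 if uri in self._resources else 201
            self._resources[uri] = body
            return status, body
        if method == "POST":
            # Create a new member of the collection; the server picks its URI.
            new_uri = f"{uri.rstrip('/')}/{self._next_id}"
            self._next_id += 1
            self._resources[new_uri] = body
            return 201, new_uri
        if method == "DELETE":
            if self._resources.pop(uri, None) is not None:
                return 204, None
            return 404, None
        return 405, None


if __name__ == "__main__":
    store = ResourceStore()
    store.handle("PUT", "/people/andy", {"name": "Andy"})
    # The paper's representation links to its author by URI ("follow your nose").
    status, paper_uri = store.handle(
        "POST", "/papers", {"title": "Web 2.0 and repositories", "author": "/people/andy"}
    )
    _, paper = store.handle("GET", paper_uri)
    print(store.handle("GET", paper["author"]))
```

The client needs no service description: it starts from one URI, reads the representation, and follows the links it finds there.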
#5 We are antisocial…
• … at least, we tend to treat “content” in isolation from the “social networks” that need to grow around that content
• successful “repositories” (Flickr, YouTube, Slideshare, etc.) promote the social activity that takes place around content as well as the content management and disclosure activity
– friends, groups, social tagging, comments, embedding, re-purposing, etc.
But not just about functionality…
• the institutional approach has a fundamental mismatch with the real-life social networks adopted by researchers
– subject-based
– cross-institutional
– global
• while the institutional approach is good from the perspective of institutional management, preservation, etc.
• globally “concentrated” repositories might better reflect the social networks that need to arise
The net effect…
• …is that there is no net effect
• repositories remain uncompelling places to disclose scholarly publications from POV of the researcher
• perceived cost of deposit remains higher than perceived benefits
• we resort to institutional or funder mandates, “thou shalt deposit”, to fill what would otherwise remain empty
Wait just a minute…
• didn’t we use to have globally “concentrated” repository services?
• arXiv – the first Web 2.0 service?
• invented before the Web
• unfortunately, also invented before Amazon S3
• i.e. before we knew how to scale things
Wait just another minute…
• …doesn’t the blogosphere successfully layer a set of globally concentrated services over a distributed network of content?
– e.g. Technorati
• yes… but…
• the content is under the control of ‘individuals’ rather than ‘institutions’, and…
• the interoperability “glue” (RSS and tagging) is very lightweight and RESTful
Having the conversation is hard
• highly political space
• strong “open access” voices who, understandably, don’t want their agenda de-railed by discussion about
– preservation
– search engine optimisation
– Web 2.0
– social networks
– semantic Web
– the future of peer review
• it can be hard to get the conversation started
What can we do about it?
Things can go two ways…
I think that things can go two ways…
The Web 2.0 Way
or
The Semantic Web Way
…possibly both
Things can go two ways…
what would a Web 2.0 repository look like?
Like this?
A Web 2.0 repository?
• high-quality browser-based document viewer (not Acrobat!)
• tagging, commentary, more-like-this, favorites, …
• persistent (cool) URIs to content
• ability to form simple social groups
• ability to embed documents in other Web sites
• high visibility to Google
• offer RSS as primary API
• use of Amazon S3 to cope with scalability
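One way to read “RSS as primary API” is that any client able to parse a feed can reuse the repository's content. The sketch below, using only the Python standard library, reads an RSS 2.0 feed and groups item links by their category tags. The feed content and the function name are invented for illustration; the element names (item, title, link, category) are standard RSS 2.0.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Invented feed of recent deposits; a real client would fetch it over HTTP.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example repository: recent deposits</title>
    <link>http://repository.example.org/</link>
    <description>Invented feed for illustration</description>
    <item>
      <title>Paper one</title>
      <link>http://repository.example.org/papers/1</link>
      <category>web2.0</category>
      <category>repositories</category>
    </item>
    <item>
      <title>Paper two</title>
      <link>http://repository.example.org/papers/2</link>
      <category>repositories</category>
    </item>
  </channel>
</rss>"""


def index_by_tag(feed_xml):
    """Map each category (tag) in the feed to the links of the items carrying it."""
    index = defaultdict(list)
    for item in ET.fromstring(feed_xml).iter("item"):
        link = item.findtext("link")
        for category in item.findall("category"):
            index[category.text].append(link)
    return dict(index)


if __name__ == "__main__":
    for tag, links in index_by_tag(SAMPLE_FEED).items():
        print(tag, links)
```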
In short… we go “simple”
• we develop simple(ish) repositories
• and complex aggregators and search engines
• RSS/Atom as primary “glue”
• social tagging as “description”
• full-text indexing
• microformats
• Google Sitemaps to guide harvesters to content
• complex functional requirements (e.g. author disambiguation) either ignored or met thru complexity in aggregators
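As an example of how little a “simple” repository needs to expose, the following sketch writes a sitemap that points harvesters at each document's URI. The helper name and the example URLs are hypothetical; the urlset, url, loc and lastmod elements and the namespace are those of the Sitemaps protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Serialise (url, last_modified) pairs as a sitemap; last_modified may be None."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            # W3C date format, e.g. 2008-06-10
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([
        ("http://repository.example.org/papers/1", "2008-06-10"),
        ("http://repository.example.org/papers/2", None),
    ]))
```

Anything harder, such as working out that two author strings name the same person, is left to whoever harvests these URLs.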
Alternatively… we go “complex”
• …we look to the Semantic Web
• we create and share much richer metadata about scholarly publications than we do currently
• we explicitly model complexity (a la FRBR)
• and aggregations
• we expose resulting metadata thru the SW “graph”
We go “complex”...
SWAP and ORE
We go “complex”…
• SWAP – Scholarly Works Application Profile
• an application of the Dublin Core Abstract Model and Application Profiles
• capturing relationships between works, expressions, manifestations, items and agents
• ORE – OAI Object Re-use and Exchange
• capturing relationships between aggregations and aggregated resources
• note that ORE is not tied to a specific entity in FRBR
• note that ORE is implemented as a profile of Atom
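The relationship ORE captures between an aggregation and its aggregated resources can be sketched as a handful of RDF-style triples. This is an illustration only: the example URIs are invented, the triples are held as plain Python tuples rather than in any serialisation (Atom or otherwise), and the helper function is hypothetical. The property and class names are taken from the ORE vocabulary.

```python
ORE = "http://www.openarchives.org/ore/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Invented URIs: a resource map that describes one aggregation of two files.
REM = "http://repository.example.org/rem/1"
AGG = "http://repository.example.org/aggregation/1"

TRIPLES = [
    (REM, RDF_TYPE, ORE + "ResourceMap"),
    (REM, ORE + "describes", AGG),
    (AGG, RDF_TYPE, ORE + "Aggregation"),
    (AGG, ORE + "aggregates", "http://repository.example.org/papers/1/paper.pdf"),
    (AGG, ORE + "aggregates", "http://repository.example.org/papers/1/data.csv"),
]


def aggregated_resources(triples, aggregation):
    """Return the URIs that the given aggregation aggregates, in listed order."""
    return [o for s, p, o in triples if s == aggregation and p == ORE + "aggregates"]


if __name__ == "__main__":
    print(aggregated_resources(TRIPLES, AGG))
```

Nothing here says whether the aggregated files are expressions, manifestations or copies, which is the sense in which the aggregation is independent of any one FRBR entity.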
SWAP application profile model
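• ScholarlyWork isExpressedAs Expression (0..∞)
• Expression isManifestedAs Manifestation (0..∞)
• Manifestation isAvailableAs Copy (0..∞)
• relationships to Agent (0..∞ where the diagram shows a multiplicity): isCreatedBy, isPublishedBy, isEditedBy, isFundedBy, isSupervisedBy
• other diagram labels: Agent, AffiliatedInstitution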
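The work, expression, manifestation and copy chain in this model can be sketched as nested data classes. The class and field names below are hypothetical and chosen only to mirror the relationship names on the slide; the agent relationships are left out, and the example values are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Copy:
    uri: str  # where this particular copy can be retrieved


@dataclass
class Manifestation:
    media_type: str  # e.g. "application/pdf"
    is_available_as: List[Copy] = field(default_factory=list)


@dataclass
class Expression:
    version: str  # e.g. "author's draft", "published version"
    is_manifested_as: List[Manifestation] = field(default_factory=list)


@dataclass
class ScholarlyWork:
    title: str
    is_expressed_as: List[Expression] = field(default_factory=list)


def all_copies(work):
    """Walk work -> expressions -> manifestations -> copies and collect copy URIs."""
    return [
        copy.uri
        for expression in work.is_expressed_as
        for manifestation in expression.is_manifested_as
        for copy in manifestation.is_available_as
    ]


if __name__ == "__main__":
    work = ScholarlyWork("An example paper", [
        Expression("author's draft", [
            Manifestation("application/pdf", [Copy("http://repository.example.org/1/draft.pdf")]),
        ]),
        Expression("published version", [
            Manifestation("text/html", [Copy("http://publisher.example.com/paper.html")]),
            Manifestation("application/pdf", [Copy("http://repository.example.org/1/final.pdf")]),
        ]),
    ])
    print(all_copies(work))
```

Compared with a single flat simple DC record, this keeps the draft and the published version, and each file format, distinguishable, at the cost of a more involved structure to create and display.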
OAI ORE
Summary
• what can we learn from Web 2.0?
– user interface design matters
– global ‘concentration’ is an enabler of social interaction
• simple DC is both too simple and too complex
• richer DC application profiles such as SWAP and/or RDF applications like ORE may be a way forward
• but need to ensure that their use does not over-complicate user interfaces and workflows
A new vision?
Flickr and digital cameras…
• didn’t just take the practice of photography and put it on the Web
• they fundamentally changed what photography was about
What’s our vision?
• the standards we adopt in the scholarly communication space…
• OAI-PMH, OpenURL, DOI, PDF, …
• are primarily about replicating in a Web world what we have always done on paper
• this is not surprising given the necessary inertia of the scholarly communication life-cycle
• but… do we need to re-envision scholarly communication as a true Web process?
• if so, what would a repository look like?
thank you