Web Preservation, or Managing your Organisation’s Online Presence After the Organisation Ceases to Exist
IRMS 2016 conference, Brighton, 15-17 May 2016
Brian Kelly
Independent researcher/consultant at UK Web Focus Ltd.
Contact Details
Brian Kelly
Email: [email protected]
Twitter: @briankelly
Blog: http://ukwebfocus.com/
Slides and further information available at
http://ukwebfocus.com/events/irms-2016-web-preservation/
UK Web Focus Event hashtag: #irms16
View slides & abstract at http://bit.ly/irms16-kelly
Tweet comments using #irms16 #kelly
You are free to:
copy, share, adapt, or re-mix;
photograph, film, or broadcast;
blog, live-blog, or post video of
this presentation provided that:You attribute the work to its author and respect the rights
and licences associated with its components.
Slide concept by Cameron Neylon, who has waived all copyright and related or neighbouring rights. This slide only: CCZero.
Social Media Icons adapted with permission from originals by Christopher Ross. Original images are available under GPL at:
http://www.thisismyurl.com/free-downloads/15-free-speech-bubble-icons-for-popular-websites
Your comments
may be useful
in evaluation &
subsequent
reflections on
this talk
Abstract
Your organisation has failed to survive cutbacks and will shortly
close. Public sector organisations may feel responsibilities for
ensuring that information about their activities is not lost if their
organisation is closed down. This talk summarises approaches
taken to managing web content provided by UKOLN, a national
centre of expertise in digital information management at the
University of Bath, which closed in July 2015.
UKOLN existed for 30+ years and had an important role to play in
development of online services for the UK’s higher education sector.
This case study summarises approaches taken to minimising loss of
this history.
Learning Outcomes:
1. Strategies for managing the termination of online services
2. Useful tools and services
3. Addressing the challenges and opportunities provided by social
media services
In Other Words …
This talk describes:
• Steps taken over ~6 months to ensure web
products were not lost after cessation of funding
• Approaches taken in updating content
• Services used
• Understanding of risks
What did we want to preserve?
• Documents e.g. PDFs
• Web resources (web sites)
• Software
• Ease of access to online content (e.g. functional links,
Google juice, …)
• Audiences, communities, …
• Resources which could inform stories
We Know About Web Preservation!
Web preservation
services are
available:
• UK Web Archive
• Internet Archive
What does this talk
have to add?
Focus Of This Talk
This talk addresses:
• Web preservation challenges when an organisation is to be closed
• Motivational issues for preserving web products
• Perspectives from higher education:
Moves towards open access; open practices; …
Blurring between social & professional online services
Increasing importance of online services hosted beyond the institution
The talk provides:
• Summary of pragmatic approaches
• A real-world case study
• Suggestions on who the “Information Superheroes who enable business excellence” may be
Funding Will Cease on 31 July 2013!
Background:
• Jisc announce cessation of core funding for UKOLN in Dec 2012
• 7 months to manage web preservation work
Challenges:
• What to do; how to do it!
• Why should I do it?!
Outcomes:
• Preservation work completed
• rUKOLN subsequently folded (July 2015)
http://www.ukoln.ac.uk/
Why Bother?
What do I care about web preservation? I’ve lost my job, I’ve bills to pay, I don’t
know if I’ll get another job, …
Image from pixabay.com
Available under a CC-0 licence
Motivating Factors
About UKOLN
• Established in 1977
• A centre of expertise in digital information management
• Funded by JISC and MLA (and predecessors)
• A national centre with an international reputation
• Influential in early digital library work in UK (eLib programme); metadata (Dublin Core); digital preservation (!); …
About UKOLN Staff:
• Many looking to continue work in digital library environment post-UKOLN
• “Will evidence of my professional work disappear?”
30th anniversary event held at the
British Library in 2008
Disappearing Content
Web content can
disappear for various
reasons:
• It’s no longer aligned
with current policies
• It’s embarrassing
• It’s illegal
• …
Painting of famous photograph (which cannot be shown)
Organisations may have online
content of value to others which
they would prefer to vanish
In this case MySociety have
republished Conservative &
Labour party speeches
Learning From Doctor Who!
“The Doctor Who missing episodes are the portions of the long-running British science-fiction television programme Doctor Who no longer held by the BBC. Between 1967 and 1978 the BBC routinely deleted archive programmes, for various practical reasons (lack of space, scarcity of materials, a lack of rebroadcast rights).”
https://en.wikipedia.org/wiki/
Doctor_Who_missing_episodes
Hobbyists, working under the radar, to the rescue!
Why We Can’t Rely on the Funders
Ownership of online
content:
• Typically managed by
marketing
• Being positive
• Looking to the future
• “If web content is not
relevant to current
strategy it must go!”
For example consider the
eFramework
• A “visionary new initiative”
• Gained international support
(New Zealand & Netherlands)
The eframework.org Site Today
Issues:
• Learning (from apparent failures)
Preservation of:
• Content (beyond news items)
• Provenance (who funded / carried out work)
• Significant dates (when started; when partners joined; when work finished)
• Why it stopped: technical reasons? politics? funding? …
• What can be learnt from this?
Approaches Taken At UKOLN
Summary of approaches published on 29 July 2013:
• Identifying UKOLN’s web assets and the owner.
• Preparing the content so that it was suitable for preservation.
• Submitting details of web resources to UK Web Archive.
• Liaison with UK Web Archive to ensure that resources successfully archived.
Looking back:
• Uncertainties of rUKOLN continuation (lasted for 2 years)
• Assess and manage risks of dependencies (technical & organisational)
• Addressing motivational issues
• Continuation of preservation activities
• Sharing experiences with others (today!)
UKOLN Projects
UKOLN A-Z of
projects and
activities page used
as (public) list of
archiving work
Note some activities
may have
continued after
cessation of Jisc
funding and
continuation of
UKOLN at reduced
staffing levels (e.g.
Ariadne ejournal)
UKOLN Projects
Typical archived site:
• Status clearly visible on home page
• Content updated where possible (removed ‘will’; years for events included; …)
• Summary of archiving approaches documented
• Audit provided
• Links provided to significant resources
• Information on key contributors provided
• Links to archive copies provided
Second Example: QA Focus project web site
• Migrate key reports to more trusted environment (Bath Uni repository)
• Summarise licences for reuse
• Describe technical architecture (and remove ‘dynamic’ aspects; search interfaces; …)
Note much of this work was
carried out when the project
funding finished in 2012, as an
example of best practice on
project termination (QA for
mothballing project sites)
Trusted Hosting Agencies
The content has been updated. What happens next?
Papers
• Ensure key papers are migrated to Opus, University of Bath institutional repository
• Update links to point to copy on Opus
Web sites
• Explore resources which are available on Internet Archive and provide links
• Submit content to UK Web Archive
• Discussions with local computer service. Agreement to mirror content to new server and maintain static web site with existing URLs
Software
• Notification of closure of online services (analysis of incoming links & usage patterns)
• Software deposited in repositories e.g. Google Code
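The "analysis of incoming links" step could be approximated from web server logs. The following is a minimal Python sketch, assuming the logs are in the Apache combined format (the slides do not say which log format or tool was used); the function name and the sample log lines are illustrative only.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Tail of a combined-format line: "request" status size "referrer" "user-agent"
LOG_TAIL = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)" "[^"]*"$')

def count_referrer_hosts(lines, own_host):
    """Count external hosts that link to the site, based on the referrer field."""
    hosts = Counter()
    for line in lines:
        match = LOG_TAIL.search(line.strip())
        if not match:
            continue  # not a combined-format line
        host = urlsplit(match.group("referrer")).netloc.lower()
        if host and host != own_host:
            hosts[host] += 1
    return hosts

# Made-up sample lines (documentation IP addresses, placeholder referrers).
sample = [
    '192.0.2.1 - - [01/Jun/2013:10:00:00 +0100] "GET /tool/ HTTP/1.1" 200 512 "http://example.org/links.html" "Mozilla/5.0"',
    '192.0.2.2 - - [01/Jun/2013:10:05:00 +0100] "GET /tool/ HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '192.0.2.3 - - [01/Jun/2013:10:09:00 +0100] "GET /tool/ HTTP/1.1" 200 512 "http://www.ukoln.ac.uk/index.html" "Mozilla/5.0"',
]
print(count_referrer_hosts(sample, own_host="www.ukoln.ac.uk").most_common())
```

Hosts that appear often in such a count are the ones most worth notifying before an online service is switched off.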
When Things Go Wrong
The UKOLN IRG Web site:
• Continuation of UKOLN work after cessation of core funding
• Ceased 2 years later due to lack of continuation funding, departure of director, lack of technical expertise
• Web site migrated to static mirror hosted locally, but …
• Link is to a dynamic page: http://irg.ukoln.ac.uk/index.html?p=2206.html
http://irg.ukoln.ac.uk/
http://irg.ukoln.ac.uk/index.html%3Fp=2206.html
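The two URLs above differ only in how the "?" is written. The following is a minimal Python sketch of the likely cause, assuming the mirroring process saved the page under a file name that contains a literal "?" (an inference from the working URL, not something the slides state); the function name is illustrative.

```python
from urllib.parse import quote, urlsplit, urlunsplit

def static_file_url(link):
    """Rewrite a link so that '?...' is read as part of the file name."""
    parts = urlsplit(link)
    if not parts.query:
        return link  # nothing to escape
    # Fold the query back into the path, then percent-encode the '?'.
    file_path = quote(f"{parts.path}?{parts.query}", safe="/=")
    return urlunsplit((parts.scheme, parts.netloc, file_path, "", ""))

print(static_file_url("http://irg.ukoln.ac.uk/index.html?p=2206.html"))
# http://irg.ukoln.ac.uk/index.html%3Fp=2206.html
```

A browser sends the unescaped form to the server as a request for index.html plus a query string, so the statically mirrored page is never reached unless the link itself is rewritten.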
When Things Go Wrong
Let’s Google the missing page, “New UKOLN Informatics news site”:
• A static version of page exists
• Nobody would know this!
• Need to preserve links and not just content!
• Don’t use http://www.foo.com/?p=nnn
http://irg.ukoln.ac.uk/
Note also problems accessing http://ukoln.ac.uk/
Mirroring processes may not know about
redirects & other server configuration options
http://irg.ukoln.ac.uk/2013/12/09/new-ukoln-informatics-news-site/
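One way to act on this advice before mirroring a site is to list the links that rely on a query string. The following is a minimal sketch using only the Python standard library; the class name and sample HTML are illustrative, and this is not a description of the audit process used at UKOLN.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

class DynamicLinkFinder(HTMLParser):
    """Collect href values whose URL contains a query string."""

    def __init__(self):
        super().__init__()
        self.dynamic_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and urlsplit(value).query:
                self.dynamic_links.append(value)

page = """
<a href="http://www.foo.com/?p=123">dynamic</a>
<a href="http://irg.ukoln.ac.uk/2013/12/09/new-ukoln-informatics-news-site/">static</a>
"""
finder = DynamicLinkFinder()
finder.feed(page)
print(finder.dynamic_links)  # ['http://www.foo.com/?p=123']
```

Each link reported this way is a candidate for replacement with its path-based equivalent before the static copy is made.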
When Things Go Wrong
A UKOLN IRG project:
• Ran from October 2011 to July 2013
• Project reports hosted on Bath repository
• Staff list provided
• Link provided by project blog, hosted by Bath University (not in-house)
http://irg.ukoln.ac.uk/
When Things Go Wrong
Research 360 blog hosted by Bath University:
• After(?) UKOLN demise blog deleted and link provided to copy on Internet Archive
• 20 copies taken between 2012 and 2014
• Most recent copy taken on 25 April 2014
http://irg.ukoln.ac.uk/
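Links to dated Internet Archive copies follow a predictable pattern, which makes them easy to generate when a deleted page is replaced with a pointer to its archived copy. The following is a minimal sketch, assuming the Wayback Machine address form web.archive.org/web/<timestamp>/<url>; the blog address is a placeholder, since the slides do not give the original URL.

```python
from datetime import date

def wayback_url(original_url, snapshot_date):
    """Build a Wayback Machine address for the copy nearest to a date."""
    timestamp = snapshot_date.strftime("%Y%m%d")
    return f"https://web.archive.org/web/{timestamp}/{original_url}"

# Placeholder address; 25 April 2014 is the date of the most recent copy.
print(wayback_url("http://blog.example.org/research360/", date(2014, 4, 25)))
# https://web.archive.org/web/20140425/http://blog.example.org/research360/
```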
When Things Go Wrong
Internet Archive copy, Oct 2013.
Looks good!
Internet Archive copy, April 2013.
Looks different. Branding, blog theme & site
navigation changed.
When Things Go Wrong
Archived copy from 24 April 2014 …
linked to About page archived on 21 April 2013
Archived copies held on Internet Archive may be incomplete, inconsistent
and missing images
Opus
University of Bath’s institutional repository, Opus:
• Hosts many of UKOLN’s important publications
• Provides a CV of research-like outputs
http://opus.bath.ac.uk/view/person_id/588.html
Apparently I have 81 items – but 1 (at least) isn’t mine!
Note:
• Open access papers held in several places (LOCKSS)
• Location unknown for papers with strict copyright
Opus
University of Bath’s institutional repository, Opus:
• Hosts many of UKOLN’s important publications
• Provides a list of UKOLN staff & their outputs
But:
• Only some have their own CV page
• Others don’t:
Left ages ago
Left recently
http://opus.bath.ac.uk/view/divisions/cent=5Fukoln.html
Problems probably due to bugs rather than policy
Take Control of Your CV!
Background:
• IR profile pages have disappeared
• No longer have access to IR
• IR is now a read-only silo
Decision:
• Use ResearchGate (and Academia.edu) to list publications
Then:
• Use them to host papers
• Control regained over content & presentation
• Richer functionality
Events
IWMW (Institutional Web Management Workshop) launched in 1997
• 20th anniversary this year
• 16 years of event web site hosted on UKOLN site
Thoughts:
• Not research
• But evidence of 17 years of development of institutional web services in UK HE
• My main area of work over 20 years!
Content at risk. Need to
preserve content and
contextualise experiences
IWMW Content
IWMW resources
• Hosted on UK Web Archive
• Not fully functional
• Not maintainable
IWMW Content
IWMW content migrated to Lanyrd
• Timetable
• Abstracts
• Speaker details
Content provided for recent post-UKOLN events
Plus links to:
• Speaker slides
• Twitter archives (where available)
• Other related resources
Slides uploaded to
Slideshare and embedded
in Lanyrd pages
Twitter Captioning
Links from Lanyrd entry to resources for Chris Sexton’s plenary at IWMW 2010:
• Slides hosted on Slideshare
• Video of talk on Vimeo
• Twitter commentary of videos on iTitle service by Martin Hawksey, ALT
Long-term access to this
information is uncertain.
Record of what was done
described on UK Web Focus
blog
Slideshare Repository
(Most) slides from IWMW events hosted on UKOLN web site since 1997 uploaded to Slideshare.
Note to facilitate discoverability:
• Slides embedded in Lanyrd
• Use of tags (iwmw1997)
Only PPT & PDF files uploaded (not HTML, etc!)
Writing The Book
Who will be able to write
about 25 years of edtech
developments in UK HE?
Compare challenges of writing
400+ page history of the JANET
network, published by JANET,
with writing one on the history of
web developments in UK HE
The IWMW Blog
Some questions:
• What’s the point of preservation?
• What’s missing beyond resources?
My thoughts:
• Understand the past in order to plan for the future
• But we need the context and reflections
Hence establishment of IWMW blog, for 20th
anniversary of event
The IWMW Blog
Derek Law’s reflection on his IWMW 2009 plenary talk:
• Link to post about talk is now to a marketing page
[Risk – professionals repurpose old content]
• JISC PoWR blog has closed
[Risk – blog service provider at jiscinvolve.org could terminate service]
Closed using
described practices
“So the challenge for Brian and his remarkable array of
colleagues is to keep the faith, keep proselytising and make
sure that the links to this 20th birthday set of blog posts
still work when the 25th birthday comes along!”
The Individual’s Perspective
We should all expect to lose access to our institutional digital environment!
We should therefore make plans for migrating content
from institutional silos!
Where Did My Work Go?!
Developer /
researcher:
• Worked at Bristol
University
• Evidence of
research work
available
(publications)
• Online legacy is
harder to find
“The Individual as Institution”
Importance
of individual
as agent for
preservation
Individual as Institution, Lawrie :
converged blog, Lawrie Phipps, 7 May
2013, http://lawriephipps.co.uk/?p=199
After Institutional The Need For Individual …
Jisc focus on institutional digital preservation issues
Others address personal digital preservation
Gaps for individual in:
• An institutional perspective
• A UK context
• A HE context
• A research context
Revisiting the Learning Outcomes
Learning Outcomes:
1. Strategies for managing termination of online
services
Update content (provide context; removal of
problematic links & services; …)
2. Useful tools and services
UK Web Archive & Internet Archive
Institutional Repository
Research repositories; Slideshare; Lanyrd; …
3. Addressing the challenges and opportunities
provided by social media services
Opportunities to complement institutional, national &
international services
The Information Superheroes
Who are the information superheroes who will ensure that the UK’s higher education digital memories are maintained for future generations?
• The British Library
• The research councils
• The funders
• The digital preservation services
• The institutions
• The motivated professionals
• The staff who support the motivated professionals, help shape institutional policies and embrace the role of the “individual as institution”
Conclusions
Preservation of UKOLN resources
• A learning journey (doing the work and then reflecting on the work)
• Just letting the Internet Archive archive your site isn’t sufficient (but can be useful)
• Submitting your site to the UK Web Archive is useful, but not sufficient by itself
• Management of mothballing sites should be carried out routinely
• Motivational factors are important
• Importance of ‘refreshing’ content, especially by motivated professionals
• Need to consider implications of “Individual as institution” – by both individuals and institutions!
• An ongoing process with multiple key stakeholders!
Questions?