
ARCHIVING PRESERVING WEB CONTENT and OCUL Archi… · ARCHIVE-IT: A WEB ARCHIVING SERVICE A web-based application launched in 2006 that allows users to create, manage, access and


ARCHIVING & PRESERVING WEB CONTENT


THE INTERNET ARCHIVE

What? A non-profit digital library and archive

Where? San Francisco, CA

When? Who? Founded in 1996 by Brewster Kahle

How? Officially designated a library by the state of California in 2007


THE WAYBACK MACHINE

Online: https://archive.org/web/

The largest publicly available web archive in existence.

> 280 billion pages

> 100 million websites

> 150 languages

~ 1 billion URLs added per week
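A minimal sketch of how archived pages in the Wayback Machine can be addressed programmatically. The slide itself only gives https://archive.org/web/; the availability endpoint, the replay URL pattern, and the JSON shape used below come from the Internet Archive's public Wayback availability API rather than from this presentation, and the helper names (`availability_query`, `replay_url`, `closest_snapshot`) are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: public Wayback endpoints, not named anywhere in the slides.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"
REPLAY_PREFIX = "https://web.archive.org/web"


def availability_query(url, timestamp=None):
    """Build the query URL that asks whether `url` has an archived snapshot.

    `timestamp` is an optional YYYYMMDD[hhmmss] string; the service returns
    the snapshot closest to it.
    """
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def replay_url(url, timestamp):
    """Build the address at which a capture of `url` near `timestamp` is replayed."""
    return f"{REPLAY_PREFIX}/{timestamp}/{url}"


def closest_snapshot(response):
    """Return the closest snapshot URL from a decoded JSON response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None
```

The functions only build and parse strings, so they can be tried without network access; fetching the query URL with any HTTP client and passing the decoded JSON to `closest_snapshot` would complete the lookup.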


WEB ARCHIVING

What is a web archive?

A collection of archived URLs grouped by theme, event, subject area, or web address.

A web archive contains as much as possible from the original resources and documents their change over time. It is a priority to recreate the same experience a user would have had if they had visited the live site on the day it was archived.


THE LIFESPAN OF A WEBSITE

How long does a website last?

In general, a typical web page can be expected to last ~90-100 days before changing, moving, or disappearing completely.

> In 2013, our colleagues at Old Dominion University determined that over 10% of event-related content posted to social media platforms is lost after one year.

> In 2014, a study by UCLA determined that 7-in-10 scholarly articles that include citations with hyperlinks suffer from reference rot.


ARCHIVE-IT: A WEB ARCHIVING SERVICE

A web-based application launched in 2006 that allows users to create, manage, access and store collections of web-based digital content.

A fully hosted solution, including access and storage.

A suite of tools for selecting and scoping, and cataloging.

Provides the ability to capture content using 10 different frequencies.

Archived web content includes: HTML, text, videos, audio, social media, PDF, images, password-protected content, static databases and newspapers.

Browse archived content 24 hours after a capture is complete; full text search is available within 7 days.

Private access options are available.
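The access timing stated above (browsing 24 hours after a capture completes, full-text search within 7 days) can be sketched as simple date arithmetic. This is an illustrative calculation only, assuming the two delays are fixed offsets from the capture's completion time; the function and constant names are hypothetical and not part of Archive-It.

```python
from datetime import datetime, timedelta

# Delays taken from the slide; treating them as exact offsets is an assumption.
BROWSE_DELAY = timedelta(hours=24)
SEARCH_DELAY = timedelta(days=7)


def access_windows(capture_completed):
    """Return when a finished capture becomes browsable and the latest
    time by which full-text search should be available."""
    return {
        "browse_from": capture_completed + BROWSE_DELAY,
        "search_by": capture_completed + SEARCH_DELAY,
    }


if __name__ == "__main__":
    windows = access_windows(datetime(2017, 5, 1, 9, 0))
    for label, moment in windows.items():
        print(label, moment.isoformat())
```

A planner could use the two timestamps to schedule quality-assurance review after the browse time and patron-facing announcements after the search time.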


HOW IS ARCHIVE-IT DIFFERENT THAN THE GENERAL/GLOBAL WAYBACK?

Archive-It:

> Focused collections

> Control over scope and frequency

> Technical support

> All content and metadata indexed for search

> Archived data shipped/downloaded

> Private access options

> Available 24 hours after captured

> Subscription service

General/Global Wayback:

> One collection

> Snapshot

> Automated

> Search and cataloging not available

> Shipping/download not available

> Public access only

> Access varies

> Absolutely free


WHAT OUR PARTNERS ARE COLLECTING...


ARCHIVE-IT USE CASES

Create a thematic/topical web archive on a specific subject or event

> Often related to traditional collecting activity around the same topical focus

> Capture spontaneous events

> Document different perspectives and social commentaries

Fulfill a mandate to capture/preserve evolving web history

> Construct a historical record of an institution or individual’s web/social media presence

> Support an electronic records system to meet records retention requirements

> Collect publications/documents that are no longer in print form

Closure crawls

> Document a public institution’s presence on the web before it changes or closes


UNIVERSITY OF ALBERTA: ALBERTA FLOODS JUNE 2013

Use Case: Archive web content before, during, and after the 2013 Alberta floods

> Personal and institutional blogs

> News articles

> Institutional websites


WILFRID LAURIER UNIVERSITY

Use Cases:

> Document the university’s social media presence

> Archive the university’s web presence in order to meet required records retention mandates.


ACCESS TO COLLECTIONS

Partners:

> Can view through private web application with login/password

General Public:

> Can view from Archive-It’s website: http://www.archive-it.org/

> Search Archive-It data and metadata from institutional domains

> Landing Pages: branded pages that link back to Archive-It hosted data


EXAMPLES OF ORGANIZATIONS’ LANDING PAGES

Library of Virginia

University of Texas at Austin


PRIVATE ACCESS OPTIONS

> Entire account

> Individual collections

> Specific URLs

> IP address


STORAGE AND PRESERVATION

Storage:

> 2 copies (primary & backup) of archived data are stored at San Francisco data centers.

> A third copy is transferred to the General Archive.

> A copy of archived data can be shipped on a hard drive

> Partners can always download their archived data from Internet Archive’s servers.

Preservation partnerships:

> 2008: LOCKSS

> 2013: DuraCloud

> 2017: Multiple in development...


DATA REPOSITORY


KEY ARCHIVE-IT FEATURES

> Different levels of access for account users

> Ten available capture frequencies (from twice daily to yearly)

> Browse collections by URL, search by full-text and metadata

> Detailed post crawl reports for analysis

> Quality Assurance (QA) tools

> Online Help Center and User Manual

> Web Archivists and technical support

> Hosting, access, and redundant storage


SUBSCRIPTION MODEL

> Annual, renewable subscription

> Subscription levels vary by the amount of data archived

> Factors include: type and number of sites, how large they are, and how frequently they are archived

> All subscriptions include hosting, access, and perpetual storage (primary and backup)


TIME COMMITMENTS

Staff dedicated to web archiving program

Source: NDSA, Web Archiving in the United States: A 2016 Survey

[Chart values: 58%, 13%, 5%, 5%, 19%; category labels not preserved]


THE WEB ARCHIVING LIFE CYCLE

http://www.archive-it.org/publications


COMPLIMENTARY TRIAL

Create a collection of up to 5 websites, archive content, and view the results!


ARCHIVE-IT WEB APPLICATION DEMO



LEARN MORE

Check out our blog: www.archive-it.org/blog

Follow us on Twitter: @archiveitorg

Like us on Facebook: https://www.facebook.com/ArchiveIt

Questions? [email protected]

THANK YOU!