K12 Web Archiving Program

Lori Donovan
Coordinator, K12 Web Archiving Program
Internet Archive

Why collaborate on K12 Web Archiving?

• Archiving at-risk web content from a new perspective - K12 students

• Library of Congress has a strong educational foundation and contacts in the schools

• The Archive-It web application allows students and teachers to do hands-on archiving with a user-friendly interface, automated crawling and (almost) immediate results

Pilot and Year 1

• Pilot Program, Spring 2008
  – 3 High Schools (California, Illinois, and Louisiana)

• 2008-2009 school year:
  – 9 schools in 7 states
    • 1 Elementary School
    • 2 Middle Schools
    • 6 High Schools
  – Student archiving groups were:
    • Social studies, journalism, or extended learning classes
    • Extracurricular groups

Program Year 1

• Crawling began in October 2008 and continued through April 2009

• Crawl frequency - usually weekly, monthly or quarterly

• Seed level description

Program Account Specifics

• Document Budget: 15,000,000 documents

• Data Budget: 1 terabyte of data

• Active Collection Budget: Up to three collections can be scheduled for crawling at the same time.

• Active Seed Budget: up to 300 seeds can be scheduled for crawling at any one time
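The four account limits above can be written down as a small configuration check. This is only an illustrative sketch, not part of Archive-It: the names `AccountBudget` and `exceeded_limits` are hypothetical, and it assumes "1 terabyte" means 10^12 bytes.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AccountBudget:
    """Limits taken from the slide; field names are hypothetical."""
    max_documents: int = 15_000_000
    max_data_bytes: int = 10**12        # assumption: 1 terabyte = 10^12 bytes
    max_active_collections: int = 3
    max_active_seeds: int = 300


def exceeded_limits(budget: AccountBudget, documents: int, data_bytes: int,
                    active_collections: int, active_seeds: int) -> List[str]:
    """Return the names of every limit that the given usage goes over."""
    checks = [
        ("documents", documents, budget.max_documents),
        ("data", data_bytes, budget.max_data_bytes),
        ("active collections", active_collections, budget.max_active_collections),
        ("active seeds", active_seeds, budget.max_active_seeds),
    ]
    return [name for name, used, limit in checks if used > limit]


# Example: three collections scheduled (allowed) but 301 seeds (one too many).
print(exceeded_limits(AccountBudget(), 1_000, 10**9, 3, 301))
```

A check like this could be run before scheduling a new crawl, so that a class notices it is over the seed or collection limit before the crawl starts.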

Students as Curators: the Numbers

• Created 68 collections

• Crawled 1,704 seeds

• Archived 233,554,220 URLs

• 87% of K12 seeds were not being archived by more than one school within the program

• 97% of K12 seeds were not being archived by other Archive-It partners

• 24% of K12 seeds are not in the Internet Archive’s general web archive

Access Page: www.archive-it.org/k12

Students as Curators: Sample Collections

• The Heartland - Ames High School, Ames, IA

• Prom Guide - Lincoln Park High, Chicago, IL

• Flower Power! - New York Public School 56, Queens

• Social Networking - Moran Middle School, Wallingford, CT

Students as Curators: Sample Seeds

• iwaswondering.org – Current Events Collection, Ames Middle School

• Awesomepedia.org – Internet Culture Collection, Ames High School

• Werewolf-movies.com – Recreation Collection, Charleston High School

• Peacesites.org – Peace Collection, Lincoln Park High School

Lessons Learned

Changes for the 2009-2010 school year

Current School Year: At a Glance

• 15 schools in 13 states
  – 1 Elementary School
  – 9 Middle Schools
  – 4 High Schools
  – 1 District-wide Program

• Student archiving groups are:
  – History, social studies or extended learning classes
  – Extracurricular groups

Student Age Groups

• Program pilot - all high school students

• 2008-2009 school year - 5th graders and middle school students were very creative, web-savvy and successful in the program

• This helped us justify expanding the program to include more younger students in the current school year

Training teachers and students

• Simplified the Archive-It training
  – Less information about scoping options

• Created a special area in the Archive-It Help Wiki for K12 documentation

• Encouraged teachers to use K12 listserv so that new teachers can learn from those in their second or third year of the program

Budgeting and Crawl Frequency

• Due to the timing of the students’ work on the program, weekly and especially daily crawls could get out of hand

• This year we explicitly asked schools not to crawl at daily or twice daily frequencies

• We also encouraged the use of test crawls when creating new collections

Do Not Archive list

• Students were generally very creative with their seed selection

• A Do Not Archive list was created to avoid continually archiving the following large sites:
  – google.com
  – yahoo.com (including answers.yahoo.com)
  – imdb.com
  – wikipedia.org (and en.wikipedia.org)
  – amazon.com
  – youtube.com
  – www.ebay.com
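A list like this can be applied by screening each proposed seed before it is added to a collection. The sketch below is only an illustration, not how Archive-It enforces the list: the function names are hypothetical, and it assumes each entry also covers its subdomains (the slide says so explicitly only for yahoo.com and wikipedia.org).

```python
from urllib.parse import urlsplit

# Entries copied from the slide as written.
DO_NOT_ARCHIVE = {
    "google.com", "yahoo.com", "imdb.com", "wikipedia.org",
    "amazon.com", "youtube.com", "www.ebay.com",
}


def seed_host(seed: str) -> str:
    """Extract the lowercase host name from a seed such as 'peacesites.org/about'."""
    # urlsplit only recognises the host when a scheme is present.
    if "://" not in seed:
        seed = "http://" + seed
    return (urlsplit(seed).hostname or "").lower()


def is_do_not_archive(seed: str) -> bool:
    """True if the seed's host is a listed site or (by assumption) one of its subdomains."""
    host = seed_host(seed)
    return any(host == site or host.endswith("." + site) for site in DO_NOT_ARCHIVE)


for seed in ["en.wikipedia.org", "answers.yahoo.com/question", "peacesites.org"]:
    print(seed, "blocked" if is_do_not_archive(seed) else "ok")
```

Matching on the whole host name or a dot-prefixed suffix keeps an unrelated site such as a hypothetical notgoogle.com from being blocked by accident.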

Program Participants

• Cindy Rich
  – Eastern Illinois University (Charleston High School), Charleston, Illinois

• Brian Hewlett
  – Library Media Specialist, Francis C. Hammond Middle School, Alexandria, Virginia
