K12 Web Archiving Program Lori Donovan Coordinator, K12 Web Archiving Program Internet Archive


K12 Web Archiving Program

Lori Donovan

Coordinator, K12 Web Archiving Program

Internet Archive


Why collaborate on K12 Web Archiving?

• Archiving at-risk web content from a new perspective - K12 students

• Library of Congress has a strong educational foundation and contacts in the schools

• The Archive-It web application allows students and teachers to do hands-on archiving with a user-friendly interface, automated crawling, and (almost) immediate results


Pilot and Year 1

• Pilot Program, Spring 2008
  – 3 High Schools (California, Illinois, and Louisiana)

• 2008-2009 school year:
  – 9 schools in 7 states
    • 1 Elementary School
    • 2 Middle Schools
    • 6 High Schools
  – Student archiving groups were:
    • Social studies, journalism, or extended learning classes
    • Extracurricular groups


Program Year 1

• Crawling began in October 2008 and continued through April 2009

• Crawl frequency - usually weekly, monthly or quarterly

• Seed level description


Program Account Specifics

• Document Budget: 15,000,000 documents

• Data Budget: 1 terabyte of data

• Active Collection Budget: Up to three collections can be scheduled for crawling at the same time.

• Active Seed Budget: up to 300 seeds can be scheduled for crawling at any one time
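The four limits above can be read as a simple set of checks. The following is a minimal sketch only: the `AccountUsage` type and `budget_violations` function are hypothetical names introduced for illustration, not part of Archive-It, and the terabyte is assumed to be decimal (10^12 bytes).

```python
from dataclasses import dataclass

# Limits as stated on the slide.
DOCUMENT_BUDGET = 15_000_000
DATA_BUDGET_BYTES = 10**12        # assumption: 1 terabyte = 10^12 bytes
MAX_ACTIVE_COLLECTIONS = 3
MAX_ACTIVE_SEEDS = 300


@dataclass
class AccountUsage:
    """Hypothetical snapshot of an account's usage (illustrative only)."""
    documents: int
    data_bytes: int
    active_collections: int
    active_seeds: int


def budget_violations(usage: AccountUsage) -> list[str]:
    """Return a human-readable message for each limit that is exceeded."""
    problems = []
    if usage.documents > DOCUMENT_BUDGET:
        problems.append("document budget exceeded")
    if usage.data_bytes > DATA_BUDGET_BYTES:
        problems.append("data budget exceeded")
    if usage.active_collections > MAX_ACTIVE_COLLECTIONS:
        problems.append("too many collections scheduled at once")
    if usage.active_seeds > MAX_ACTIVE_SEEDS:
        problems.append("too many seeds scheduled at once")
    return problems
```

An account sitting exactly at a limit (for example, 300 active seeds) produces no message, since the slide phrases the seed and collection limits as "up to".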


Students as Curators: the Numbers

• Created 68 collections
• Crawled 1,704 seeds
• Archived 233,554,220 URLs
• 87% of K12 seeds were not being archived by more than one school within the program
• 97% of K12 seeds were not being archived by other Archive-It partners
• 24% of K12 seeds are not in the Internet Archive’s general web archive


Access Page: www.archive-it.org/k12


Students as Curators: Sample Collections

• The Heartland - Ames High School, Ames, IA
• Prom Guide - Lincoln Park High, Chicago, IL
• Flower Power! - New York Public School 56, Queens
• Social Networking - Moran Middle School, Wallingford, CT


Students as Curators: Sample Seeds

• iwaswondering.org – Current Events Collection, Ames Middle School

• Awesomepedia.org – Internet Culture Collection, Ames High School

• Werewolf-movies.com – Recreation Collection, Charleston High School

• Peacesites.org – Peace Collection, Lincoln Park High School


Lessons Learned

Changes for the 2009-2010 school year


Current School Year: At a Glance

• 15 schools in 13 states
  – 1 Elementary School
  – 9 Middle Schools
  – 4 High Schools
  – 1 District-wide Program

• Student archiving groups are:
  – History, social studies or extended learning classes
  – Extracurricular groups


Student Age Groups

• Program pilot - all high school students

• 2008-2009 school year - 5th graders and middle school students were very creative, web-savvy and successful in the program

• This helped us justify expanding the program to a larger number of younger students in the current school year


Training teachers and students

• Simplified the Archive-It training
  – Less information about scoping options

• Created a special area in the Archive-It Help Wiki for K12 documentation

• Encouraged teachers to use the K12 listserv so that new teachers can learn from those in their second or third year of the program


Budgeting and Crawl Frequency

• Due to the timing of the students’ work on the program, weekly and especially daily crawls could get out of hand

• This year we explicitly asked schools not to crawl at daily or twice daily frequencies

• We also encouraged the use of test crawls when creating new collections
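To see why frequent crawls can get out of hand, it helps to count how many crawls a single seed receives over a school year at each frequency. The sketch below is illustrative only: the interval lengths (30 days for monthly, 91 for quarterly) and the function name `estimated_crawls` are assumptions, and the date range simply uses the October 2008 through April 2009 crawling period mentioned earlier in the deck.

```python
from datetime import date

# Assumed interval between crawls, in days, for each frequency.
FREQUENCY_DAYS = {
    "twice daily": 0.5,
    "daily": 1,
    "weekly": 7,
    "monthly": 30,    # approximation
    "quarterly": 91,  # approximation
}


def estimated_crawls(start: date, end: date, frequency: str) -> int:
    """Roughly estimate how many crawls one seed gets between two dates."""
    days = (end - start).days + 1  # count both endpoints
    return max(1, int(days / FREQUENCY_DAYS[frequency]))


if __name__ == "__main__":
    start, end = date(2008, 10, 1), date(2009, 4, 30)
    for frequency in FREQUENCY_DAYS:
        print(f"{frequency:>12}: {estimated_crawls(start, end, frequency)} crawls per seed")
```

Each crawl of a seed draws on the document and data budgets again, so under these assumptions a daily schedule consumes budget roughly seven times faster than a weekly one for the same seed list.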


Do Not Archive list

• Students were generally very creative with their seed selection

• A Do Not Archive list was created to avoid continually archiving the following large sites:
  – google.com
  – yahoo.com (including answers.yahoo.com)
  – imdb.com
  – wikipedia.org (and en.wikipedia.org)
  – amazon.com
  – youtube.com
  – www.ebay.com
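A list like this could be applied as a check on student-proposed seeds before they are scheduled. The following is a rough sketch, not a description of how the program or Archive-It enforced the list: the function names are hypothetical, `www.ebay.com` is normalized to `ebay.com`, and matching every subdomain of a listed site is an assumption based on the "including answers.yahoo.com" note.

```python
from urllib.parse import urlsplit

# Domains from the slide; "www.ebay.com" is normalized to "ebay.com" (assumption).
DO_NOT_ARCHIVE = {
    "google.com",
    "yahoo.com",
    "imdb.com",
    "wikipedia.org",
    "amazon.com",
    "youtube.com",
    "ebay.com",
}


def host_of(seed: str) -> str:
    """Extract the lowercase hostname from a seed, with or without a scheme."""
    if "://" not in seed:
        seed = "http://" + seed
    return (urlsplit(seed).hostname or "").lower()


def is_blocked(seed: str) -> bool:
    """True if the seed's host is a listed domain or one of its subdomains."""
    host = host_of(seed)
    return any(host == d or host.endswith("." + d) for d in DO_NOT_ARCHIVE)


def allowed_seeds(seeds: list[str]) -> list[str]:
    """Keep only the seeds that are not on the Do Not Archive list."""
    return [s for s in seeds if not is_blocked(s)]
```

Matching on the hostname suffix with a leading dot avoids false positives such as a hypothetical `notgoogle.com`, which merely ends in the same characters as a listed domain.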


Program Participants

• Cindy Rich
  – Eastern Illinois University (Charleston High School), Charleston, Illinois

• Brian Hewlett
  – Library Media Specialist, Francis C. Hammond Middle School, Alexandria, Virginia