Learning by Doing
The Digital Archive for Chinese Studies
Archiving Workshop, Oxford, 28.11.03, Jennifer Gross
Facts and Figures
• Founded in the summer of 2001
• Institute of Chinese Studies, Heidelberg Univ.
• 630,000 files = 9 GB = 1,300 metadata entries
• Selective + harvesting approach
• Network of researchers: selection
• 2 part-time student workers:
  – Download
  – Metadata creation
The Chinese Web
• Public discourse never accessible before
  – Dissident web sites, discussion boards
• Important up-to-date information
  – Newspapers, government sites
The Hole in the Structure
• Chinese National Library
  – No internet archive
  – Censorship
• Internet Archive
  – Arbitrary harvesting at irregular intervals
  – Data destruction on owner's request
  – No full text search
Legal Position
• Everything is downloaded, unless access is restricted and downloading is explicitly prohibited
• Free access from within Heidelberg University's campus
• Otherwise restricted to password holders
• Negotiations if a copyright holder protests:
  – Payment
  – Further access restriction
  – Deletion of files
Collection policy
"The Digital Archive for Chinese Studies (DACHS) aims at identifying, archiving and making accessible Internet resources relevant for Chinese Studies. Special emphasis is given to the social and political discourse as reflected by articulations on the Chinese Internet. [...]"
(Mission statement)
Identification of resources
• Human information network
  – Scanning of mailing lists by specialists
  – Internet research by scholars
• Integration of external collections
Pros and Cons
Pros:
• High grade of specialisation
• Flexibility
• Use of background knowledge
Cons:
• Small fraction of Internet
• Chance of selection
• High amount of human labour
Resource allocation
• Selection – unpaid researchers (?h/w)
• Download – paid student workers (8h/w)
• Metadata creation – paid students (8h/w)
Download
• Metaproducts Offline Explorer – more than one page
  – Site structure is retained
• Microsoft Internet Explorer – single page
  – Alterations of site structure
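What "site structure is retained" can mean on disk is sketched below: each URL is mapped to a local path that mirrors the host and directory layout of the original site. This is only an illustration in Python, not how Offline Explorer works internally; the function name `local_path` and the `archive` root folder are assumptions made for the example.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def local_path(url: str, root: str = "archive") -> PurePosixPath:
    """Map a URL to a path under `root` that mirrors the site's layout.

    Hypothetical helper: the name and the `archive` root are illustrative.
    """
    parts = urlsplit(url)
    path = parts.path
    # A URL with no path, or one ending in "/", names a directory;
    # store its index page under a fixed file name.
    if path == "" or path.endswith("/"):
        path += "index.html"
    # Drop empty and dot segments so the result stays inside `root`.
    segments = [s for s in path.split("/") if s not in ("", ".", "..")]
    return PurePosixPath(root, parts.netloc, *segments)


print(local_path("http://example.org/news/2003/item.html"))
print(local_path("http://example.org/"))
```

Because the host name and every directory level survive in the local path, relative links between saved pages can keep working, which a single-page save does not guarantee.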
Metadata creation
• Framework: developed from the OAIS information model and the library's cataloguing system
• Entries made for single documents, e-journals, websites or whole collections
• Contains information for users, as well as technical and administrative data
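The three kinds of information in an entry (for users, technical, administrative) and the four entry types could be modelled roughly as follows. This is a sketch only: every field name here is an assumption chosen for illustration and is not taken from the actual DACHS metadata set.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

# Entry types named on the slide.
ENTRY_KINDS = {"document", "e-journal", "website", "collection"}


@dataclass
class MetadataEntry:
    """Hypothetical metadata entry; all field names are illustrative."""

    # Information for users
    title: str
    url: str
    kind: str
    # Technical data
    file_count: int = 0
    encodings: List[str] = field(default_factory=list)
    # Administrative data
    download_date: Optional[date] = None
    selected_by: str = ""

    def __post_init__(self) -> None:
        # Reject entry types other than the four listed above.
        if self.kind not in ENTRY_KINDS:
            raise ValueError(f"unknown kind: {self.kind!r}")


entry = MetadataEntry(
    title="Example discussion board",
    url="http://example.org/board/",
    kind="website",
    download_date=date(2003, 11, 28),
)
print(entry.kind, entry.file_count)
```

Keeping one record type for all four kinds matches the idea that an entry may describe anything from a single document to a whole collection.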
Access
• Project's homepage
  – Using a basic classification system
• Library's catalogue
  – Searching the metadata entries
• To come: full text search engine
  – Diversity of file types, encodings, mass of docs
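Why the diversity of encodings complicates full text search can be shown with a small sketch: before a file can be indexed, its bytes have to be turned into text, and Chinese pages may arrive in several encodings. The example below is an assumption-laden illustration, not the project's search engine; the function name, the candidate list and its order are all hypothetical.

```python
from typing import Optional, Tuple

# Assumed candidate list; a real archive may need others.
CANDIDATE_ENCODINGS = ("utf-8", "gb18030", "big5")


def decode_for_indexing(raw: bytes, declared: Optional[str] = None) -> Tuple[str, str]:
    """Return (text, encoding used). Hypothetical helper for illustration."""
    # Try the encoding declared by the page first, if there is one.
    candidates = ([declared] if declared else []) + list(CANDIDATE_ENCODINGS)
    for enc in candidates:
        try:
            return raw.decode(enc), enc
        except (UnicodeDecodeError, LookupError):
            # Wrong guess or unknown codec name: try the next candidate.
            continue
    # Last resort: keep what can be read and mark the result as lossy.
    return raw.decode("utf-8", errors="replace"), "utf-8 (lossy)"


text, used = decode_for_indexing("中文".encode("gb18030"))
print(text, used)
```

Trial decoding is unreliable on its own: many Big5 byte sequences also decode without error as gb18030, only to the wrong characters, so a declared encoding recorded with the file is worth far more than a guess.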
Security issues
• Dedicated, air-conditioned IT room
• UPS
• Software RAID level 1
• Daily ADSM backup to University Computer Center
• Additional backup to University of Karlsruhe
• Virus scan on download
• Daily virus scan of the complete archive
• Hourly update of virus definitions
Future
• Improvement / fine tuning of the metadata set according to the needs of the collection
• Further establishment of the “information network”
• Testing and implementation of a metadata harvester
• Testing and implementation of a search engine
• Promoting the project
Contact
Digital Archive for Chinese Studies
http://www.sino.uni-heidelberg.de/dachs/
Jennifer Gross