METS at UC Berkeley Generating METS Objects

Page 1: METS at UC Berkeley

METS at UC Berkeley

Generating METS Objects

Page 2: METS at UC Berkeley

Background

• Kinds of materials:
  – primarily imaged content & TEI-encoded content
    • archival materials: manuscripts and pictorial collections
    • oral histories
• Kinds of metadata
  – Structural metadata: physical structure
  – Descriptive metadata
  – Basic technical metadata about digital files and how they were produced

Page 3: METS at UC Berkeley

Tools For Producing METS Objects

• GenDB
  – Gathers structural, descriptive and technical metadata
• GenX
  – Generates METS objects from GenDB

Page 4: METS at UC Berkeley

GenDB

• Consists of:
  – Relational database (currently SQL Server)
  – Locally developed software for gathering metadata and facilitating digital processing

Page 5: METS at UC Berkeley

GenDB Database Structure: Structural Metadata

Structural Md Table

• Object 1
  – Div 1 (root)
  – Div 2 (parent = div 1)
  – Div 3 (parent = div 1)
• Object 2
  – Div 1 (root)
  – Div 2 (parent = div 1)
  – Div 3 (parent = div 2)
  – Div 4 (parent = div 2)
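The parent pointers in the Structural Md Table amount to an adjacency list, from which each object's div hierarchy can be rebuilt. The following is a minimal sketch in Python, using an in-memory SQLite table in place of SQL Server. The table name `structural_md`, its columns, and the `div_tree` helper are hypothetical and not GenDB's actual schema; reading Object 2's Div 3 and Div 4 as children of Div 2 is also an assumption drawn from the diagram labels.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory table of divs with parent pointers (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE structural_md ("
        " object_id INTEGER, div_id INTEGER, parent_div_id INTEGER,"
        " PRIMARY KEY (object_id, div_id))"
    )
    rows = [
        # (object, div, parent div); None marks the root div
        (1, 1, None), (1, 2, 1), (1, 3, 1),
        (2, 1, None), (2, 2, 1), (2, 3, 2), (2, 4, 2),
    ]
    conn.executemany("INSERT INTO structural_md VALUES (?, ?, ?)", rows)
    return conn

def div_tree(conn, object_id):
    """Return one object's div hierarchy as nested (div_id, children) tuples."""
    children = {}
    root = None
    cursor = conn.execute(
        "SELECT div_id, parent_div_id FROM structural_md"
        " WHERE object_id = ? ORDER BY div_id",
        (object_id,),
    )
    for div_id, parent in cursor:
        if parent is None:
            root = div_id
        else:
            children.setdefault(parent, []).append(div_id)

    def expand(div_id):
        # Recursively attach each div's children
        return (div_id, [expand(child) for child in children.get(div_id, [])])

    return expand(root)

conn = build_demo_db()
print(div_tree(conn, 1))
print(div_tree(conn, 2))
```

A nested structure of this kind maps naturally onto the nested `div` elements of a METS structural map.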

Page 6: METS at UC Berkeley

GenDB Database Structure: Descriptive Metadata

[Diagram: each div in the Structural Md Table (Object 1: Div 1 to Div 3; Object 2: Div 1 to Div 4) has a Core Desc Md record. Names (Name 1 to Name 3) are held in a Name Table and notes (Note 1 to Note 3) in Note Tables.]

Page 7: METS at UC Berkeley

GenDB Database Structure: Content File/Technical Md

[Diagram: Object 1 (Div 1 to Div 3) in the Structural Md Table. The Master Image Table holds Mstr 1 and Mstr 2; the Derivative Image Table holds Drv 1 to Drv 4. Each master and each derivative has its own Technical Md record.]

Page 8: METS at UC Berkeley

Populating the Database Tables

• Web interface: manual input of structural and descriptive metadata

• Digitization Management modules
  – Generate work orders to guide digitization process
  – Import content file information and technical metadata coming out of digitization process

• Batch loader: batch input based on TEI encodings, legacy metadata

Page 9: METS at UC Berkeley

Web Interface: WebGenDB

[Diagram: Web Interface, Java Servlet, Java Server, SQL Server Database and XML Config Files; the connections are labeled rmi and jdbc.]

Page 10: METS at UC Berkeley

Digitization Management Modules

[Diagram: Web Interface, Java Servlet, Java Server and SQL Server Database, together with Imaging/Transcription Work Orders, Vendor and Technical MD Spreadsheets.]

Page 11: METS at UC Berkeley

Batch Loader

[Diagram: Web Interface, Java Servlet, Java Server and SQL Server Database, together with TEI Docs, XSLT, XML Batch Load File and Java Batch Loader.]

Page 12: METS at UC Berkeley

WebGenDB

The concepts that drove the design
• Shielding user from METS complexity
• Highly configurable
• Unicode support
• Access driven by login privileges
• Use of Open Source software and components
• Distributed approach

Page 13: METS at UC Berkeley

XML Configuration Files

• Three levels
  – Elements common to all projects
  – Elements common to all screens in a project
  – Elements specific to a screen in a project
• Define fields common to all projects
• Define fields used in a specific project
• Define screens by project & object type
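One way to picture the three levels is as layered lookups: a screen names the fields it shows, and each field is resolved first against the project's definitions and then against the definitions common to all projects. The slides do not say how WebGenDB resolves overlapping definitions, so the override order below is an assumption, as are the `resolve_screen` helper and the sample values (field names are borrowed from the project file example). The sketch is in Python for brevity, not the Java used by the project.

```python
from collections import ChainMap

# Level 1: fields common to all projects (hypothetical definitions)
all_projects = {
    "Title": {"type": "text", "size": 60},
    "Note": {"type": "text", "size": 40},
}

# Level 2: fields used in one specific project
project = {
    "Image": {"type": "checkbox", "size": 1},
    "Title": {"type": "text", "size": 80},  # assumed project-level override
}

# Level 3: a screen for one object type lists the fields it displays
screen = {"object_type": "workorder", "fields": ["Title", "Image", "Note"]}

def resolve_screen(screen, project, all_projects):
    """Resolve each screen field, preferring project-level definitions."""
    pool = ChainMap(project, all_projects)  # earlier maps win on lookup
    return [(name, pool[name]) for name in screen["fields"]]

for name, definition in resolve_screen(screen, project, all_projects):
    print(name, definition)
```

Here `Title` resolves to the project-level definition, while `Note` falls back to the definition shared by all projects.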

Page 14: METS at UC Berkeley

Relation among XML files

• AlProjects.xml
  – Proj1.xml
    • ObjectType1.xml
    • ObjectType2.xml
  – Proj2.xml
    • ObjectType1.xml
    • ObjectType2.xml

Page 15: METS at UC Berkeley

<ObjectType>
  <name>workorder</name>
  <fileLocation>/data/_w/GenDB/WEB-INF/classes/edu/berkeley/library/propertyFiles/CalCultureWorkOrderScreensFile.xml</fileLocation>
</ObjectType>

<Field>
  <name>Image</name>
  <type>checkbox</type>
  <label>Image </label>
  <size>1</size>
</Field>

<Field>
  <name>Text</name>
  <type>checkbox</type>
  <label>Text </label>
  <size>1</size>
</Field>

<Field>
  <name>Title</name>
  <type>text</type>
  <label>Title </label>
  <size>60</size>
</Field>

Project XML file example
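To illustrate how `Field` entries like these could drive a data-entry screen, here is a minimal Python sketch that parses them and renders each as an HTML input. The wrapping `<Fields>` root element, the `parse_fields` and `render_input` helpers, and the HTML produced are all assumptions made for illustration; the slides do not show how WebGenDB itself reads or renders these files.

```python
import xml.etree.ElementTree as ET
from html import escape

# The Field entries from the example, wrapped in a hypothetical root element
# so that the fragment parses as a single XML document.
FRAGMENT = """
<Fields>
  <Field><name>Image</name><type>checkbox</type><label>Image </label><size>1</size></Field>
  <Field><name>Text</name><type>checkbox</type><label>Text </label><size>1</size></Field>
  <Field><name>Title</name><type>text</type><label>Title </label><size>60</size></Field>
</Fields>
"""

def parse_fields(xml_text):
    """Turn each <Field> element into a plain dictionary."""
    root = ET.fromstring(xml_text)
    fields = []
    for element in root.iter("Field"):
        fields.append({
            "name": element.findtext("name", default=""),
            "type": element.findtext("type", default="text"),
            "label": element.findtext("label", default="").strip(),
            "size": int(element.findtext("size", default="0")),
        })
    return fields

def render_input(field):
    """Render one field as an HTML form control (illustrative markup only)."""
    label = escape(field["label"])
    name = escape(field["name"])
    if field["type"] == "checkbox":
        return f'<label>{label} <input type="checkbox" name="{name}"></label>'
    return f'<label>{label} <input type="text" name="{name}" size="{field["size"]}"></label>'

for field in parse_fields(FRAGMENT):
    print(render_input(field))
```

Keeping field definitions in configuration rather than in code is what lets a new project or object type get its own screens without changes to the servlet.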

Page 16: METS at UC Berkeley

Software used

• MSSQL running on NT
• Tomcat 4.1.2 implementing servlets 2.3
• Jsdk 1.4
• Xalan 2.4
• Xerces 1.0.3
• FOP 0.12.1
• JDOM beta 8
• Opta 2000

Page 17: METS at UC Berkeley

Relationship of GenDB to METS

• Metadata not directly stored in METS, MODS or MIX schema formats.
  – Much of the database structure was developed before these standards emerged
  – Database structure and content adjusted to be compatible with all these formats

Page 18: METS at UC Berkeley

GenX: From GenDB to METS

• Allows Digital Publishing Group staff to select the objects in the GenDB database that are ready for export and to export them as METS objects.

Page 19: METS at UC Berkeley

GenX Architecture

[Diagram: App Interface, Java Application, GenDB (connection labeled JDBC), METS XML and Repository.]

Page 20: METS at UC Berkeley

GenX Output

• METS output corresponding to version 1.3

• Descriptive metadata exported to METS descMD in MODS 2.0 format

• Technical Metadata exported to METS techMD in MIX format

• Planned:
  – Text technical md to METS descMD in NYU TextMD
  – Rights to METS rightsMD in ODRL subset
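The overall shape of such an export can be sketched as a METS document that wraps MODS in a descriptive metadata section and MIX in a `techMD` section. In the METS schema the descriptive section is the `dmdSec` element, and MIX is identified by the `MDTYPE` value `NISOIMG`. The sketch below is in Python rather than the Java used by GenX; the `build_mets` helper, the IDs, and the sample title are hypothetical, the MODS and MIX namespace URIs are assumptions for the versions named above, and the MIX content itself is left empty.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
MODS = "http://www.loc.gov/mods/"  # assumed namespace for MODS 2.0
MIX = "http://www.loc.gov/mix/"    # assumed namespace for the MIX schema

def q(namespace, tag):
    """Build a namespace-qualified tag name for ElementTree."""
    return f"{{{namespace}}}{tag}"

def build_mets(title):
    """Return a skeletal METS document as a string (illustrative only)."""
    for prefix, uri in (("mets", METS), ("mods", MODS), ("mix", MIX)):
        ET.register_namespace(prefix, uri)

    mets = ET.Element(q(METS, "mets"))

    # Descriptive metadata: MODS wrapped inside dmdSec
    dmd = ET.SubElement(mets, q(METS, "dmdSec"), ID="DM1")
    wrap = ET.SubElement(dmd, q(METS, "mdWrap"), MDTYPE="MODS")
    mods = ET.SubElement(ET.SubElement(wrap, q(METS, "xmlData")), q(MODS, "mods"))
    title_info = ET.SubElement(mods, q(MODS, "titleInfo"))
    ET.SubElement(title_info, q(MODS, "title")).text = title

    # Technical metadata: MIX wrapped inside amdSec/techMD (content omitted)
    amd = ET.SubElement(mets, q(METS, "amdSec"))
    tech = ET.SubElement(amd, q(METS, "techMD"), ID="TM1")
    wrap = ET.SubElement(tech, q(METS, "mdWrap"), MDTYPE="NISOIMG")
    ET.SubElement(ET.SubElement(wrap, q(METS, "xmlData")), q(MIX, "mix"))

    # Structural map: a single root div pointing at the descriptive metadata
    struct_map = ET.SubElement(mets, q(METS, "structMap"))
    ET.SubElement(struct_map, q(METS, "div"), DMDID="DM1")

    return ET.tostring(mets, encoding="unicode")

print(build_mets("Sample title"))
```

A real export would also carry a file section listing the master and derivative images, and one nested `div` per row of the structural metadata.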

Page 21: METS at UC Berkeley

Links

• GenDB Web Interface Demo
  – http://sunsite2.berkeley.edu/GenD
  – login: demo
  – password: demo
• Developers:
  – [email protected]
  – [email protected]
  – [email protected]