National Archives and Records Administration National Archives Catalog (The Catalog) Archival Description Data Model Design – Catalog Perspective – Status-Final Version 1.15 September 3, 2015


National Archives and Records Administration

National Archives Catalog (The Catalog)

Archival Description Data Model Design

– Catalog Perspective –

Status-Final

Version 1.15

September 3, 2015


NARA Catalog Archival Descriptions DMD

National Archives & Records Administration

NARA Catalog Archival Descriptions Data Model Design

Lisong Liu

Paul Nelson

Madhu Koneni

Kristy Martin

Version 1.15

Contract Number GS-35F-0541U

Order Number NAMA-13-F-0120

September 3, 2015


Contents

1 Overview
    1.1 Introduction to Archival Descriptions
    1.2 What is a DMD?
    1.3 Document Conventions
2 DAS XML
3 File Processing and Parsing
    3.1 Create the Catalog Information Package (OPA-IP)
    3.2 Download Files
    3.3 Extract Technical Metadata
        3.3.1 Images
        3.3.2 Video
        3.3.3 Audio Files
        3.3.4 Documents
    3.4 Create Catalog Renditions
        3.4.1 Text Content Rendition
        3.4.2 Image Thumbnails
        3.4.3 Image Tiles
    3.5 Create Objects File
4 Mapping to Index Fields
    4.1 Mappings for Digital Objects and Archival Descriptions
    4.2 Mapping Table
    4.3 DateQualifer mapping
    4.4 Keyword search relevancy mapping
5 Search Fields
    5.1 Advanced Search Fields
    5.2 Search Fields - Additional Information
        5.2.1 Date range field sources
        5.2.2 Recurring date search field sources
        5.2.3 Exact date search field sources
        5.2.4 Type of Archival Materials Values
        5.2.5 Level of Description Values
        5.2.6 File Format Values
        5.2.7 Location values
6 Search Results Presentation
7 Content Details Presentation
    7.1 Rules for ${online-availability-header}
    7.2 Rules for displaying fields
        7.2.1 Email addresses
    7.3 Record Group
        7.3.1 Record Group Link Table
    7.4 Collection
        7.4.1 Collection Link Table
    7.5 Series
        7.5.1 Series Link Table
    7.6 File Unit
        7.6.1 File Unit Link Table
        7.6.1 Electronic Records: Download Display Identifier
    7.7 Item
        7.7.1 Item Link Table
8 Object Metadata Presentation


Version Control

Version  Date        Reviewer                                  Summary Description

0.1      2013-10-25  Paul Nelson and Madhu Koneni              Initial Outline

0.2      2013-11-22  Paul Nelson & Rhea Mandavilli             Additional Outline work

0.3      2013-12-30  Madhu Koneni & Rhea Mandavilli            Updates to the sections 1, 1.1, 1.2

0.4      2014-02-24  Lisong Liu                                Team efforts to complete the document

0.5      2014-02-25  Paul Nelson & Team                        Review sections 1-4

1.0      2014-02-25  Paul Nelson & Team                        Review sections 5-7

1.1      2014-03-19  Lisong Liu                                Updated based on NARA review and feedback in DCRF of 3/11/14

1.2      2014-03-31  Lisong Liu                                Took out ARC XML format

1.3      2014-04-22  Lisong Liu                                Incorporate feedback from NARA/PPC

1.4      2014-08-14  Andrew Gullett                            Make corrections:
    Add zip, tiff, and jpeg2000 to File Format Values table in 5.2.3.
    4.2 Mapping Table: I/creatorIds, I/donorIds, I/subjectIds, I/contributorIds, and I/personalReferenceIds use NAIDs instead of OPA-IDs.
    4.2 Mapping Table: I/dateRangeFacet additionally includes release, copyright, and broadcast dates.
    4.2 Mapping Table: Add scopeAndContentNote to I/content.
    4.2 Mapping Table: Change I/opaId prefix to “desc-” to match API documents.
    4.2 Mapping Table: Change I/firstIngestDateTime to I/firstIngestedDateTime to match actual search index field name.
    4.4: Add parent creator names and location names to grank3.
    4.2 Mapping Table: Change I/oldScope to properly map ADs with and without digital objects.
    4.2 Mapping Table: Change I/iconType to properly map ADs with and without digital objects.
    4.2 Mapping Table: Change I/hmsEntryNumbers to only include HMS numbers.
    4.2 Mapping Table: Change to map itemAv to “item”.
    4.2 Mapping Table: Simplified I/hasOnline since browse requirements were removed from R1.

1.5      2014-09-07  Kristy Martin                             Updated content:
    7.2 Record Group: To accommodate displaying Variant Control Number data for Record Groups as supported by the data.
    7.4 Series: Variant Control Numbers table to {VAR/variant-type/termName}.
    7.4 Series: Archived Copies, Copy N Media Information updated “Dimension” to reference the “termName”; also changed “technicalAccessNote” to “technicalAccessRequirementsNote”.

1.6      2014-11-14  Kristy Martin                             Removed “Confidential to Search Technologies” text from the footer.

1.7      2014-11-24  Carlos Araya/Brandon Stahl                Made corrections to 5.2.3 File Format Values to specify correct mime types for MP3, PowerPoint and Excel file types.
    Replaced https://research.archives.gov url with https://catalog.archives.gov url

1.8      2014-12-11  Kristy Martin                             Updated locations, section 5.2.4 Location values (NARAOPA-395)

1.9      2015-03-18  Madhu Koneni, Kristy Martin               Fixed grank mappings in section 4.4; also in: 7.2, 7.3
    Updated the following sections to address Congressional fields for R1P2R1a: 4.4, 7.2, 7.4, 7.5, 7.6
    Add content for later releases (HMS Entry Number, Finding Aid URL): 6, 7.2, 7.3, 7.4, 7.5, 7.6
    Added Electronic Records section, 7.5.2; documents current implementation of R1P1.

1.10     2015-03-20  Kristy Martin                             Removed congressional fields from the grank table. Change beginCongress and endCongress fields to match the DAS specification.

1.11     2015-03-30  Madhu Koneni                              Comments from DRCF addressed. Section 4.4, 7.5.1.7.6 are updated

1.12     2015-04-21  Kristy Martin/Carlos Alberto Arraya       Added Object Designator/Description section (8) and Advanced Search date fields for R1P2I1B in section 5.1. Fixed “filter” references in section 5.1.

1.13     2015-07-09  Kristy Martin, Alejandro Baltodano        Changed branding for system name throughout document. Added cover sheet. Added section 7.2. Updated screenshots.

1.14     2015-07-24  Kristy Martin                             Updated content based on “DRCF_NAC R1P2 combined 1b_Increment 2 Design_IQS_Consolidated_7_21_15V1 (1).docx”

1.15     2015-09-03  Kristy Martin                             Updated screenshots in section 6.


1 Overview

This is the Data Model Design document (DMD) for Archival Descriptions from the National Archives and Records Administration (NARA). It documents the data flow for every archival description field as it is processed through the National Archives Catalog system, from receipt from DAS to presentation to the end user.

The DMD is both a document and a process. It defines the following for each data source:

• How data is acquired from the original data source.
• A list of all source metadata elements.
    o This includes files (digital objects) and renditions.
• How metadata from the source is parsed and processed.
    o This includes creating “Catalog renditions” such as thumbnails and tiles.
    o It also includes text extraction and technical metadata extraction, as necessary.
• How source data (metadata and files) is stored in Catalog storage (called opastorage; “opa” is a remnant from the previous name of the system).
• How source data is processed into Catalog search engine index fields.
• Which advanced search fields are available.
• How the search results (aka the “brief results”) are formatted and filled.
• How content-detail pages (aka the “full results”) are formatted and filled.

1.1 Introduction to Archival Descriptions

Archival descriptions document the permanent holdings of the federal government in the custody of the National Archives and Records Administration (NARA). They include information on traditional paper holdings, electronic records, and artifacts.

The archival descriptions data source to be indexed by the Catalog includes approximately 7-8 million archival description records, which describe NARA holdings at the following levels of aggregation:

• Record Groups (~600) – Generally include records from one federal agency.
• Collections (~3000) – Usually consist of donated materials from a single person, family, or organization.
• Series (~200,000) – Group materials that were originally filed together, resulted from a specific activity, or are related in some other way.
• File Units (~5.4 million) – Group materials originally filed together because they share specific characteristics, for example all correspondence with one individual, or a court case.
• Items (~1.3 million) – Are usually single documents, such as a letter, report, photograph, map, sound recording, or film.
• ItemAV – Are usually single audio or video files.

The definitions and metadata requirements for each type of description can be found in the Lifecycle Data Requirements Guide (LCDRG).

The content of the archival descriptions is created and maintained by NARA subject matter experts. All of the content is stored in Description and Authority Services (DAS), formerly known as the Archival Research Catalog (ARC).

The information contained in archival descriptions is accessed using the National Archives Catalog system. Based on the user-entered search parameters, the Catalog system retrieves descriptions and description metadata from the Search Engine Index.

1.2 What is a DMD?

The purpose of a Data Model Design document (DMD) is to document and describe all relevant data fields within a data source necessary to support all Catalog functionality. This metadata includes all data fields and their structure (nesting, type, number of values, etc.).

The DMD further describes how metadata values are transformed and stored within the Catalog. This careful accounting of data processing is required to gain a complete understanding of how every bit and byte is handled through the Catalog system.

Finally, the DMD also describes how metadata values are presented to the user, in the brief results, on the content detail page (aka the “full results”), from API calls, and in various metadata downloads.

The following diagram shows all of the metadata mappings and their place in the Catalog system architecture:


Specifically, this DMD includes the following:

• File Processing (section 3)
    o Identifies how digital objects are gathered and stored in Catalog storage (called opastorage; “opa” is a remnant from the previous name of the system).
    o This includes metadata and digital object content extraction.
• Index Representation (section 4)
    o How the Catalog content is represented in the search engine indexes.
    o This includes business rules (concatenations, string processing, date formats, etc.) for transforming the DAS XML data so that it can be stored efficiently in the Catalog indexes.
• Advanced Search Form Mappings (section 5)
    o Identifies each advanced search form filter and how it will be used for search.
    o This includes mapping of advanced search form controls to index fields for search.
• Brief Results Presentation (section 6)
    o Identifies how index fields are mapped to show the brief results.
• Content Detail (section 7)
    o Identifies how fields from Catalog storage (“opastorage”) (DAS XML and digital objects) are mapped to the content detail page.


1.3 Document Conventions

Since there are many different metadata fields for different purposes and from different systems, field mappings will be used throughout this document to clearly identify the originating source for every field, as follows:

Abbrev    Description

DAS       Fields from the DAS <item>, <itemAv>, <fileUnit>, <series>, <collection> or <recordGroup>. For example, DAS/title will represent any of /series/title, /collection/title from DAS.

I         Fields from the search engine index. For example, “I/title” will represent the title as it is stored in the search engine index for the record.

OBJECTS   Objects from the objects.xml file. See section 3.5 for more information.


2 DAS XML

The metadata used for constructing the Catalog archival descriptions will come directly from DAS, using DAS native XML format. The DAS XML format definition is available from NARA. The latest schema is “NaraDasUi-1_1.xsd”.

Using DAS XML

There are several reasons for using native DAS XML format:

1. This will always be the format directly supported by DAS production.
    a. Therefore, it will be the most up-to-date version of all fields available for descriptions from DAS.

2. Content can be accessed per description / authority record.
    a. DAS will identify when a particular record is updated.
    b. When this record is updated, we can fetch it directly through DAS.

3. When records are updated, all related content will need to be re-indexed.
    a. When a particular record is updated in DAS, the Catalog will need to identify and re-index all related content. This includes:
        i. Child archival descriptions (when a parent is changed)
        ii. Child digital objects (when the container archival description is changed)
        iii. Linked archival descriptions (when an authority record is changed)
        iv. Child granules (when the container archival description is changed)
        v. When a child archival description is changed, the counts of the parent may need to be updated and the parent re-indexed.
        vi. When an archival description is changed, the counts of the related authority records need to be updated and the parent re-indexed.
    b. Because of the complexity of re-processing and re-indexing content inside of the Catalog, getting only the original DAS XML is the best option.
        i. Further, the Catalog can decide what metadata changes will cause what sorts of reprocessing.
        ii. This allows the Catalog to optimize exactly which records are re-indexed when a change is made.
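The re-indexing rules in item 3 can be illustrated with a minimal Python sketch. It is not the Catalog's implementation: the Record class, its fields, and the records_to_reindex function are hypothetical names introduced here, and the sketch assumes that only direct children (not all descendants) are revisited.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Record:
    """Hypothetical in-memory stand-in for a Catalog record."""
    naid: str
    kind: str                       # "description", "authority", "object" or "granule"
    parent: Optional[str] = None    # NAID of the parent or container description
    authorities: Set[str] = field(default_factory=set)  # linked authority NAIDs


def records_to_reindex(changed: str, records: Dict[str, Record]) -> Set[str]:
    """Return the NAIDs to revisit when the record `changed` is updated."""
    rec = records[changed]
    affected = {changed}
    if rec.kind == "authority":
        # Rule iii: descriptions linked to a changed authority record.
        affected |= {r.naid for r in records.values() if changed in r.authorities}
        return affected
    # Rules i, ii, iv: direct children (descriptions, digital objects, granules).
    affected |= {r.naid for r in records.values() if r.parent == changed}
    # Rule v: the parent's counts may change, so the parent is revisited.
    if rec.parent is not None:
        affected.add(rec.parent)
    # Rule vi: counts on the related authority records need updating.
    affected |= rec.authorities
    return affected
```

Because the Catalog holds the original DAS XML, a function of this kind could be narrowed further, for example by skipping children when the changed field does not affect them.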


3 File Processing and Parsing

File processing for the Catalog requires a number of steps for processing digital objects, as shown in the following diagram:

Each of these steps will be described in the following sections.

3.1 Create the Catalog Information Package (OPA-IP)

Every archival description with digital objects will be represented in Catalog storage (“opastorage”) as an “Information Package”, called an “OPA-IP” (“opa” is a remnant of the previous name of the system).

The first step for file processing is therefore to create a directory in Catalog storage (“opastorage”) with the following structure:

/<NAID>/
    objects.xml                List of all content files with technical metadata
    description.xml            Holds a copy of the archival description XML
    content/                   Digital object files go here
    opa-renditions/            Catalog-generated renditions go here
        extracted-content/     Extracted text and content metadata (doc properties)
        thumbnails/            Thumbnails of digital objects
        image-tiles/           Image tiles for low-bandwidth browsing of images

Note: Tags, comments, transcriptions, and translations are saved in the annotation database.
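As an illustration only, the directory skeleton could be created with a few lines of Python. The function name create_opa_ip and the storage_root parameter are hypothetical, and placing the three rendition directories under opa-renditions/ is an assumption about the intended nesting. The objects.xml and description.xml files are written by later steps and are not created here.

```python
from pathlib import Path

# Sub-directories of an OPA-IP; nesting under opa-renditions/ is assumed.
OPA_IP_DIRS = (
    "content",
    "opa-renditions/extracted-content",
    "opa-renditions/thumbnails",
    "opa-renditions/image-tiles",
)


def create_opa_ip(storage_root: Path, naid: str) -> Path:
    """Create the empty OPA-IP directory skeleton for one NAID and return it."""
    package = storage_root / naid
    for sub in OPA_IP_DIRS:
        # exist_ok makes the step safe to repeat when a package is re-processed.
        (package / sub).mkdir(parents=True, exist_ok=True)
    return package
```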


3.2 Download Files

Digital object files will need to be cached in the Catalog for fast retrieval through the Catalog APIs. This is also required as part of the Catalog data management requirements.

Initial Migration Process

For the initial bulk migration of OPA Pilot to Catalog Production, the following steps will be required:

1. Go through all <digitalObjectArray> elements in the DAS XML.
    a. This would be all objects found in DAS/digitalObjectArray/digitalObject.

2. Download the digital file (based on the URL in DAS/digitalObjectArray/digitalObject/accessFilename) from the media server.

3. Store the digital file in the “content” directory of the OPA-IP.
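The first two steps can be sketched in Python as follows. This is a simplified illustration, not the migration code: it assumes the DAS XML elements carry no XML namespace, and the function names list_access_files and content_path are hypothetical. The actual download from the media server is left out.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath
from typing import List
from urllib.parse import urlparse


def list_access_files(description_xml: str) -> List[str]:
    """Return the accessFilename URL of every digitalObject in a description."""
    root = ET.fromstring(description_xml)
    urls = []
    for obj in root.iter("digitalObject"):
        name = obj.findtext("accessFilename")
        if name and name.strip():
            urls.append(name.strip())
    return urls


def content_path(naid: str, url: str) -> str:
    """Return the location inside the OPA-IP where the downloaded file goes."""
    filename = PurePosixPath(urlparse(url).path).name
    return f"{naid}/content/{filename}"
```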

Incremental, “Day Forward” Processing

The process for copying new digital objects into the Catalog is specified in the NARA Catalog Ingestion Design document.

The anticipated process for copying new digital objects into the Catalog is:

1. Periodically scan the “pre-ingestion” area for new digital objects.

2. Copy those objects into the “content” directory of the appropriate OPA-IP package.
    a. These objects would be specified in an ingestion package, such as a SIP (for simple objects) or an SEIP (for more complex objects).
    b. Objects should be copied from the “representation” directory of the SEIP into the “content” directory inside the OPA-IP.
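Step 2.b can be sketched as a small copy routine. The function name copy_new_objects and its parameters are hypothetical, and skipping files that already exist in the target is an assumption about how repeated scans might avoid duplicate work; the Ingestion Design document remains the authority on the real process.

```python
import shutil
from pathlib import Path
from typing import List


def copy_new_objects(seip_dir: Path, opa_ip_dir: Path) -> List[Path]:
    """Copy files from <SEIP>/representation into <OPA-IP>/content."""
    source = seip_dir / "representation"
    target = opa_ip_dir / "content"
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source.rglob("*")):
        if not path.is_file():
            continue
        destination = target / path.relative_to(source)
        if destination.exists():
            continue  # assumed: files copied by an earlier scan are left alone
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, destination)
        copied.append(destination)
    return copied
```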

3.3 Extract Technical Metadata

Technical metadata extraction will be required for all digital objects. This will involve inspecting the contents of each file to extract the metadata inherent in its file type.

Metadata extraction will be different for each type of file.

The file type of each file is determined from its calculated MIME type.
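To illustrate what a calculated MIME type means, the Python sketch below inspects the leading bytes of a file rather than trusting its extension, then picks an extraction family. It is a deliberately small stand-in for a real detector: only a handful of well-known signatures are covered, and the names sniff_mime and extractor_family are hypothetical.

```python
# Leading-byte signatures for a few formats; a real detector knows many more.
SIGNATURES = (
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"BM", "image/bmp"),
    (b"II*\x00", "image/tiff"),   # little-endian TIFF
    (b"MM\x00*", "image/tiff"),   # big-endian TIFF
    (b"\x00\x00\x00\x0c\x6a\x50\x20\x20\x0d\x0a\x87\x0a", "image/jp2"),
    (b"%PDF-", "application/pdf"),
)


def sniff_mime(head: bytes) -> str:
    """Guess a MIME type from the first bytes of a file."""
    for magic, mime in SIGNATURES:
        if head.startswith(magic):
            return mime
    return "application/octet-stream"


def extractor_family(mime: str) -> str:
    """Map a MIME type to the extraction rules that apply to it."""
    if mime.startswith("image/"):
        return "image"
    if mime.startswith("video/") or mime == "application/vnd.rn-realmedia":
        return "video"
    if mime.startswith("audio/"):
        return "audio"
    if mime == "application/octet-stream":
        return "unknown"
    return "document"
```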

3.3.1 Images

Technical metadata will need to be extracted for the following image types:

image/bmp

image/gif


image/jpeg

image/jp2 [JPEG 2000 images are not currently in the OPA pilot, but NARA indicated that they want them to be supported]

image/tiff [TIFF images are not currently in the OPA pilot, but NARA indicated that they want them to be supported]

3.3.1.1 Technical Metadata Extraction Tools

The tool which will be used to extract technical metadata from these formats is TIKA.

3.3.1.2 Technical Metadata Fields

The following technical metadata fields will be extracted for digital objects if the objects have the fields; not all objects will have all fields listed below:

createDate – The date the image was created, as specified in the image metadata.

metadataDate – The date the image metadata was created.

modifyDate – The date the image was last modified, as specified in the image metadata.

size – The size of the file in bytes

mime – The IANA MIME type for the file format

dimensions – The X and Y dimensions for the image, with the following sub-elements:

o @x – The number of pixels of the image in the X dimension

o @y – The number of pixels of the image in the Y dimension

resolution – The X and Y resolution of the image, with the following sub-elements:

o @x – The X resolution of the image

o @y – The Y resolution of the image

o @units – The units for specifying the resolution, expected to be “pixels/inch” for all images

bitsPerSample – The number of bits for each color dimension for each pixel in the image.

o Specified as three space-separated integers. Typically “8 8 8”

photometricInterpretation – The color model for the image. Typically “RGB”.

orientation – The image orientation. Can be any of:

o 1 = horizontal (normal)


o 2 = mirrorHorizontal

o 3 = rotate180

o 4 = mirrorVertical

o 5 = mirrorHorizontalRotate270CW

o 6 = rotate90CW

o 7 = mirrorHorizontalRotate90CW

o 8 = rotate270CW

samplesPerPixel – The number of color components per pixel. Typically “3”.

planarConfiguration – Indicates whether pixel components are recorded in “chunky” or “planar” format.

colorSpace – The color space information tag (ColorSpace) is always recorded as the color space specifier. Normally sRGB (=1) is used to define the color space based on the PC monitor conditions and environment.

compression – The compression scheme used for the image data. When a primary image is JPEG compressed, this designation is not necessary and is omitted. When thumbnails use JPEG compression, this tag value is set to 6.

software – The name and version of the software or firmware of the camera or image input device used to generate the image.

3.3.1.3 Example

<technicalMetadata>
    <createDate>2012-10-22T03:26:34Z</createDate>
    <metadataDate>2013-02-20T02:55:27Z</metadataDate>
    <modifyDate>2013-02-20T02:55:27Z</modifyDate>
    <size>1142680</size>
    <mime>image/jpeg</mime>
    <dimensions width="2181" height="2781"/>
    <resolution x="300" y="300" units="pixels/inch"/>
    <bitsPerSample>8 8 8</bitsPerSample>
    <photometricInterpretation>RGB</photometricInterpretation>
    <orientation>horizontal</orientation>
    <samplesPerPixel>3</samplesPerPixel>
    <planarConfiguration>chunky</planarConfiguration>
    <colorSpace>sRGB</colorSpace>
    <compression>uncompressed</compression>
    <software>Adobe Photoshop CS6 (Windows)</software>
</technicalMetadata>
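A consumer of this element might read it as in the following Python sketch. The function name summarize and the returned keys are hypothetical. The example above writes the image size as width and height attributes while the field list names them @x and @y, so the sketch accepts either form. The ORIENTATION_NAMES table restates the numeric codes from section 3.3.1.2.

```python
import xml.etree.ElementTree as ET

# Numeric orientation codes and the names listed in section 3.3.1.2.
ORIENTATION_NAMES = {
    1: "horizontal",
    2: "mirrorHorizontal",
    3: "rotate180",
    4: "mirrorVertical",
    5: "mirrorHorizontalRotate270CW",
    6: "rotate90CW",
    7: "mirrorHorizontalRotate90CW",
    8: "rotate270CW",
}


def summarize(technical_metadata_xml: str) -> dict:
    """Pull a few typed values out of a <technicalMetadata> element."""
    root = ET.fromstring(technical_metadata_xml)
    dims = root.find("dimensions")
    return {
        "mime": root.findtext("mime"),
        "size": int(root.findtext("size")),
        # Accept width/height (as in the example) or x/y (as in the field list).
        "width": int(dims.get("width", dims.get("x"))),
        "height": int(dims.get("height", dims.get("y"))),
        "bits_per_sample": [int(n) for n in root.findtext("bitsPerSample").split()],
        "orientation": root.findtext("orientation"),
    }
```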

3.3.2 Video

Technical metadata will need to be extracted for the following video types:


video/mp4

video/x-ms-wmv (WMV)

video/avi

application/vnd.rn-realmedia

3.3.2.1 Technical Metadata Extraction Tools

The tool which will be used to extract technical metadata from these formats is TIKA.

3.3.2.2 Technical Metadata Fields and examples

The following technical metadata fields will be extracted for digital objects if the objects have the fields; not all objects will have all fields listed below:

Content-Length : 194506834
Content-Type : video/mp4
Creation-Date : 2013-06-25T19:38:00Z
Last-Modified : 2013-06-25T19:38:06Z
Creation-Date : 2013-06-25T19:38:00Z
Last-Modified : 2013-06-25T19:38:06Z
Last-Save-Date : 2013-06-25T19:38:06Z
Date : 2013-06-25T19:38:06Z
dcterms:created : 2013-06-25T19:38:00Z
dcterms:modified : 2013-06-25T19:38:06Z
meta:creation-date : 2013-06-25T19:38:00Z
meta:save-date : 2013-06-25T19:38:06Z
modified : 2013-06-25T19:38:06Z
resourceName : 208_143.mp4
tiff:ImageLength : 0
tiff:ImageWidth : 0
xmpDM:audioChannelType : Stereo
xmpDM:audioSampleRate : 48000

3.3.3 Audio Files

Technical metadata will need to be extracted for the following audio types:

audio/mpeg (MP3)

3.3.3.1 Technical Metadata Extraction Tools

The tool which will be used to extract technical metadata from these formats is TIKA.


3.3.3.2 Technical Metadata Fields and examples

The following technical metadata fields will be extracted for digital objects if the objects have the fields; not all objects will have all fields listed below:

Author : null
Content-Length : 58604223
Content-Type : audio/mpeg
Channels : 2
Creator : null
dc:creator : null
dc:title : 021-ILND-70C1384-Conlisk-01
meta:author : null
resourceName : 021-ILND-70C1384-Conlisk-01.mp3
samplerate : 44100
title : 021-ILND-70C1384-Conlisk-01
version : MPEG 3 Layer III Version 1
xmpDM:album : null
xmpDM:artist : null
xmpDM:audioChannelType : Stereo
xmpDM:audioCompressor : MP3
xmpDM:audioSampleRate : 44100
xmpDM:duration : 1833496.875
xmpDM:genre : null
xmpDM:logComment : null
xmpDM:releaseDate : null
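Listings in this “name : value” form can be read into a lookup table with a short Python sketch. This describes only the listing format used in this document, not an API of the extraction tool, and the function name parse_metadata_listing is hypothetical. Field names such as dcterms:created and values such as timestamps both contain colons, so the split is made on the first colon that is preceded by a space. Repeated names are kept as lists.

```python
from typing import Dict, List


def parse_metadata_listing(text: str) -> Dict[str, List[str]]:
    """Parse 'name : value' lines into a dict of name -> list of values."""
    fields: Dict[str, List[str]] = {}
    for line in text.splitlines():
        # Split on the first " :" so colons inside names and values survive.
        name, sep, value = line.partition(" :")
        if not sep:
            continue  # not a metadata line
        fields.setdefault(name.strip(), []).append(value.strip())
    return fields
```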

3.3.4 Documents

Technical metadata will need to be extracted for the following document types:

application/pdf

text/plain

application/vnd.ms-powerpoint (PPT)

application/msword

application/x-wri (Microsoft Write)

application/vnd.ms-excel (Excel)

text/html (Web Page)


3.3.4.1 Technical Metadata Extraction Tools

The tool (or tools) which will be used to extract technical metadata from these formats is TIKA.

3.3.4.2 Technical Metadata Fields and Examples

The following technical metadata fields will be extracted for digital objects if the objects have the fields; not all objects will have all fields listed below:

Application/pdf

AllPermissions : 1048575

Company : NARA

Container : Databases

Content-Length : 136909

Content-Type : application/pdf

Creation-Date : 2012-07-25T16:07:19Z

DateCreated : 4/21/2011 10:16:58 AM

DisplayAllViewsOnSharePointSite : 1

Last-Modified : 2012-07-25T16:10:08Z

Last-Save-Date : 2012-07-25T16:10:08Z

LastUpdated : 4/21/2011 10:16:58 AM

Name : UserDefined

Owner : admin

Permissions : 1048575

ReplicateProject : Yes

UserName : admin

Created : Wed Jul 25 12:07:19 EDT 2012

Date : 2012-07-25T16:10:08Z

dc:title : Federal Assistance Award Data System, Technical Specifications Summary

dcterms:created : 2012-07-25T16:07:19Z

dcterms:modified : 2012-07-25T16:10:08Z

meta:creation-date : 2012-07-25T16:07:19Z

meta:save-date : 2012-07-25T16:10:08Z

modified : 2012-07-25T16:10:08Z

producer : Acrobat Distiller 9.0.0 (Windows)

resourceName : FAADS_TSS135.pdf

title : Federal Assistance Award Data System, Technical Specifications Summary

xmp:CreatorTool : Acrobat PDFMaker 9.0 for Access

xmpTPg:NPages : 22

Text/plain

Content-Encoding : ISO-8859-1

Content-Length : 2820573

Content-Type : text/plain; charset=ISO-8859-1

resourceName : cbp07prc.txt

Text/html

PICS-Label : PICS-1.1 "http://www.weburbia.com/safe/ratings.htm" l r (s 0))

Content-Language : en-US

y_key : 0e94d8e9e5732ec6

engine.cache : false

imagetoolbar : no

date : 2013-10-21

distribution : global

Content-Encoding : windows-1252

Content-Location : http://www.archives.gov

rating : general

resourceName : http://www.archives.gov

dc:title : National Archives and Records Administration

Application/vnd.ms-powerpoint (PPT)

Application-Name : Microsoft PowerPoint

Author : Ray Holland

Company : DOT

Content-Length :18959360

Content-Type : application/vnd.ms-powerpoint

Creation-Date : 2001-09-17T15:00:47Z

Edit-Time :167793170000

Last-Author : Preferred Customer

Last-Modified : 2001-09-19T14:56:41Z

Last-Save-Date : 2001-09-19T14:56:41Z

Revision-Number :39

Slide-Count :247

Word-Count :9733

cp:revision :39

creator : Ray Holland

date : 2001-09-19T14:56:41Z

dc:creator : Ray Holland

dc:title : PowerPoint Presentation

dcterms:created : 2001-09-17T15:00:47Z

dcterms:modified : 2001-09-19T14:56:41Z

extended-properties:Application : Microsoft PowerPoint

extended-properties:Company : DOT

meta:author : Ray Holland

meta:creation-date : 2001-09-17T15:00:47Z

meta:last-author : Preferred Customer

meta:save-date : 2001-09-19T14:56:41Z

meta:slide-count :247

meta:word-count :9733

modified : 2001-09-19T14:56:41Z

resourceName : 5 AWA 213 ATCSCC September 11.pps

title : PowerPoint Presentation

xmpTPg:NPages :247

Application/msword

Application-Name : Microsoft Word 8.0

Author : Dennis Rowe

Character Count :24377

Comments :null

Company : The MITRE Corporation

Content-Length :95232

Content-Type : application/msword

Creation-Date : 2002-04-19T18:49:00Z

Edit-Time :53400000000

Keywords :null

Last-Author : asignore

Last-Modified : 2002-04-24T16:59:00Z

Last-Save-Date : 2002-04-24T16:59:00Z

Page-Count :1

Revision-Number :32

Template : Normal

Word-Count :4276

comment :null

cp:revision :32

cp:subject :null

creator : Dennis Rowe

date : 2002-04-24T16:59:00Z

dc:creator : Dennis Rowe

dc:subject :null

dc:title : Appendix M

dcterms:created : 2002-04-19T18:49:00Z

dcterms:modified : 2002-04-24T16:59:00Z

extended-properties:Application : Microsoft Word 8.0

extended-properties:Company : The MITRE Corporation

extended-properties:Template : Normal

meta:author : Dennis Rowe

meta:character-count :24377

meta:creation-date : 2002-04-19T18:49:00Z

meta:keyword :null

meta:last-author : asignore

meta:page-count :1

meta:save-date : 2002-04-24T16:59:00Z

meta:word-count :4276

modified : 2002-04-24T16:59:00Z

resourceName : 5 AWA 199 Appendix M.doc

subject : null

title : Appendix M

w:comments : null

xmpTPg:NPages : 1

Application/vnd.ms-excel (Excel)

Application-Name : Microsoft Excel

Application-Version : 12.0000

Author : M1CHJ00

Content-Length : 108858

Content-Type : application/vnd.openxmlformats-officedocument.spreadsheetml.sheet

Creation-Date : 2002-11-20T19:42:53Z

Last-Author : Image

Last-Modified : 2012-08-15T20:19:08Z

Last-Printed : 2004-01-30T21:26:08Z

Last-Save-Date : 2012-08-15T20:19:08Z

creator : M1CHJ00

date : 2012-08-15T20:19:08Z

dc:creator : M1CHJ00

dc:publisher : FRB

dcterms:created : 2002-11-20T19:42:53Z

dcterms:modified : 2012-08-15T20:19:08Z

extended-properties:AppVersion : 12.0000

extended-properties:Application : Microsoft Excel

extended-properties:Company : FRB

meta:author : M1CHJ00

meta:creation-date : 2002-11-20T19:42:53Z

meta:last-author : Image

meta:print-date : 2004-01-30T21:26:08Z

meta:save-date : 2012-08-15T20:19:08Z

modified : 2012-08-15T20:19:08Z

protected : false

publisher : FRB

resourceName : HMDA_CRA_2006_layout.xlsx

Application/x-wri (Microsoft Write)

No Microsoft Write samples were available in the OPA Pilot.

3.4 Create Catalog Renditions

The next step in file processing is to create the file renditions required by the Catalog user interface and other Catalog functions.

3.4.1 Text Content Rendition

Text content will need to be extracted from all document files (MS-Word, PDF, Excel, etc.). The expected tool for text content processing (currently being reviewed as part of the Analysis of Alternatives) will be Apache Tika (see https://tika.apache.org/ for more details).

Since text extraction is a relatively slow process, extracted text content will be stored in the OPA-IP under “opa-renditions/extracted-content”, so that digital objects can be quickly and efficiently re-indexed as archival description records or annotations change.

Extracted text content will be stored in XHTML format as produced by Apache Tika.

Additional Business Rules:

Text extraction will not be performed on any digital object where DAS/digitalObjectArray/digitalObject/in-database is set to “N”.

Text extraction will occur if this field is missing OR if it contains “Y”.
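The business rule above can be sketched as a small predicate. This is only an illustration: the function name is hypothetical, and treating any value other than “N” as eligible for extraction is an assumption, since the rule only defines the behavior for “N”, “Y”, and a missing field.

```python
from typing import Optional


def should_extract_text(in_database: Optional[str]) -> bool:
    """Decide whether text extraction runs for a digital object.

    `in_database` is the value of
    DAS/digitalObjectArray/digitalObject/in-database, or None when the
    field is missing. Only an explicit "N" disables extraction; any other
    value is treated like "Y" (assumption, not stated in the design).
    """
    return in_database != "N"
```

For example, `should_extract_text(None)` and `should_extract_text("Y")` both allow extraction, while `should_extract_text("N")` skips it.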

3.4.2 Image Thumbnails

Thumbnails for all images, if not available for download from the media server, will need to be created.

Images will be scaled down so that the maximum dimension (either X or Y) will be 200 pixels. The other dimension will be scaled appropriately to maintain the same aspect ratio as the original.

Thumbnail images will be stored in the “opa-renditions/thumbnails” directory.

The tool for creating thumbnails is VIPS.
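The scaling rule can be illustrated with a short calculation of the target thumbnail size. This sketch does not call VIPS; it only computes the dimensions that would be passed to the resizing tool. The function name is hypothetical, and leaving images that already fit within 200 pixels unchanged is an assumption based on the phrase “scaled down”.

```python
from typing import Tuple


def thumbnail_size(width: int, height: int, max_dim: int = 200) -> Tuple[int, int]:
    """Return (width, height) scaled so the longest side equals max_dim."""
    longest = max(width, height)
    if longest <= max_dim:
        # Assumption: images are only ever scaled down, never enlarged.
        return width, height

    def scale(value: int) -> int:
        # Integer arithmetic with round-half-up; never collapse to zero.
        return max(1, (value * max_dim + longest // 2) // longest)

    return scale(width), scale(height)
```

Applied to the 2181 x 2781 sample image in section 3.5, the longest side becomes 200 and the width scales proportionally.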

3.4.3 Image Tiles

Very large images, greater than 500K, will be converted into “tiles” so they can be efficiently downloaded and browsed by users with limited bandwidth (e.g. users on cell phones or tablets).

Tiles will be stored in “DZI” (Deep Zoom) format, which is used by the Open Seadragon tool and a number of other image browsers.

Tiles for images will be stored in the “opa-renditions/tiles” directory.

The tool for creating tiles is VIPS.

Note that simpler image pyramids may be used for smaller files. For example, simple ½ resolution and ¼ resolution files may be created, since these are easier and faster to create than image tiles. Open Seadragon can support both image formats.

3.5 Create Objects File

The final step in file processing is to create the objects.xml file. This file holds the technical metadata for every file and links each primary file to its renditions.

The current objects XML file format is described in section 2.3.1.2 of the NARA Catalog Public API Reference Guide. A sample of the file is shown below.

<objects version="OPA-OBJECTS-1.0">
  <object id="1" hasAnnotations="true">
    <designator>1</designator>
    <file type="primary" path="./content/hst-ppp_93-1_18-01.jpg" mime="image/jpeg">
      <technicalMetadata>
        <createDate>2012-10-22T03:26:34Z</createDate>
        <metadataDate>2013-02-20T02:55:27Z</metadataDate>
        <modifyDate>2013-02-20T02:55:27Z</modifyDate>
        <size>1142680</size>
        <mime>image/jpeg</mime>
        <dimensions width="2181" height="2781"/>
        <resolution x="300" y="300" units="pixels/inch"/>
        <bitsPerSample>8 8 8</bitsPerSample>
        <photometricInterpretation>RGB</photometricInterpretation>
        <orientation>horizontal</orientation>
        <samplesPerPixel>3</samplesPerPixel>
        <planarConfiguration>chunky</planarConfiguration>
        <colorSpace>sRGB</colorSpace>
        <compression>uncompressed</compression>
        <software>Adobe Photoshop CS6 (Windows)</software>
      </technicalMetadata>
    </file>
    <thumbnail path="./opa-rendition/thumbnails/hst-ppp_93-1_18-01-thumb.jpg" mime="image/jpeg"/>
    <imageTiles path="./opa-rendition/image-tiles/hst-ppp_93-1_18-01_jpg"/>
  </object>

  <object id="2">
    <designator>2</designator>
    <file type="primary" path="./content/hst-ppp_93-1_18-02.jpg" mime="image/jpeg">
      <technicalMetadata>
        .
        .
        Technical metadata for the primary file for the second object will go here
        .
      </technicalMetadata>
    </file>
    <thumbnail path="./opa-rendition/thumbnails/hst-ppp_93-1_18-02-thumb.jpg" mime="image/jpeg"/>
    <imageTiles path="./opa-rendition/image-tiles/hst-ppp_93-1_18-02_jpg"/>
  </object>

  <object id="3">
    <designator>3</designator>
    <file type="primary" path="./content/hst-ppp_93-1_18-03.jpg" mime="image/jpeg">
      <technicalMetadata>
        .
        .
        Technical metadata for the primary file for the third object will go here
        .
      </technicalMetadata>
    </file>
    <thumbnail path="./opa-rendition/thumbnails/hst-ppp_93-1_18-03-thumb.jpg" mime="image/jpeg"/>
    <imageTiles path="./opa-rendition/image-tiles/hst-ppp_93-1_18-03_jpg" type="DZI"/>
  </object>
</objects>
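To show how the objects file ties a primary file to its renditions, the following sketch reads a trimmed copy of the sample with Python's standard xml.etree.ElementTree module. The function name and the shape of the returned dictionaries are hypothetical; only the element and attribute names come from the sample.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the first object from the sample objects file.
SAMPLE = """<objects version="OPA-OBJECTS-1.0">
  <object id="1" hasAnnotations="true">
    <designator>1</designator>
    <file type="primary" path="./content/hst-ppp_93-1_18-01.jpg" mime="image/jpeg">
      <technicalMetadata>
        <size>1142680</size>
        <mime>image/jpeg</mime>
      </technicalMetadata>
    </file>
    <thumbnail path="./opa-rendition/thumbnails/hst-ppp_93-1_18-01-thumb.jpg" mime="image/jpeg"/>
    <imageTiles path="./opa-rendition/image-tiles/hst-ppp_93-1_18-01_jpg"/>
  </object>
</objects>"""


def summarize_objects(xml_text):
    """Return one summary dict per <object>, linking primary file and renditions."""
    root = ET.fromstring(xml_text)
    summaries = []
    for obj in root.findall("object"):
        primary = obj.find("file[@type='primary']")
        thumbnail = obj.find("thumbnail")
        tiles = obj.find("imageTiles")
        size_text = None if primary is None else primary.findtext("technicalMetadata/size")
        summaries.append({
            "id": obj.get("id"),
            "primary_path": None if primary is None else primary.get("path"),
            "mime": None if primary is None else primary.get("mime"),
            "size": int(size_text) if size_text else None,
            "thumbnail_path": None if thumbnail is None else thumbnail.get("path"),
            "tiles_path": None if tiles is None else tiles.get("path"),
        })
    return summaries
```

Each summary carries the object ID, so a single digital object's renditions can be looked up by the same identifier that section 4 uses for indexing.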

4 Mapping to Index Fields

The following table shows the mapping from the <DAS> fields (identified with prefix “DAS/”) to the Catalog index fields (identified with prefix “I/”). See section 1.3 for all prefixes used in this document.

See the NARA Catalog Public API Reference Guide and the NARA Catalog Search Engine Design documents for more information on each Catalog index field.

4.1 Mappings for Digital Objects and Archival Descriptions

There will be separate entries indexed into the Catalog for each digital object as well as for the Archival Description as a whole. For example, if an archival description has 10 digital objects, there will be 11 entries in the Catalog index.

For digital objects, all fields mapped from “DAS/” will take their values from the archival description to which the digital object belongs, except for fields mapped from DAS/digitalObjectArray/digitalObject. Those fields will take the value specific to the digital object.

The OBJECTS/ fields (from the objects table) follow the same pattern. For descriptions, the entire OBJECTS/ objects table will be mapped into the associated index field. For digital objects, only the portion of the OBJECTS table for the specified digital object (indexed by object ID) will be mapped into the Catalog index field for that digital object.
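The one-entry-per-description plus one-entry-per-object rule can be sketched as follows. The function name and the dictionary layout are hypothetical; the “description” and “object” values correspond to the I/type field in section 4.2.

```python
def build_index_entries(na_id, object_ids):
    """Return one index entry for the description plus one per digital object."""
    entries = [{"naId": na_id, "type": "description"}]
    for object_id in object_ids:
        # Each object entry inherits the description's naId and adds its own ID.
        entries.append({"naId": na_id, "type": "object", "objectId": object_id})
    return entries
```

A description with 10 digital objects therefore yields 11 entries, matching the example in this section.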

4.2 Mapping Table

Index Field XML document element Purpose

I/naId DAS/naId results, sorting, search

I/opaId “desc-” + {DAS/naId}
Example: desc-5132492

results, sorting, search

I/containerId One of the following:
DAS/physicalOccurrenceArray/itemPhysicalOccurrence/mediaOccurrenceArray/mediaOccurrence/containerId
DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/mediaOccurrenceArray/mediaOccurrence/containerId
DAS/physicalOccurrenceArray/fileUnitPhysicalOccurrence/mediaOccurrenceArray/mediaOccurrence/containerId
DAS/physicalOccurrenceArray/seriesPhysicalOccurrence/mediaOccurrenceArray/mediaOccurrence/containerId

results, search

I/localId DAS/localIdentifier results, sorting, search

I/url “http://catalog.archives.gov/id/”+{DAS/naId} results, search

I/accessPath For descriptions: “id/” + {DAS/naId}
Example: id/7229423
For objects: “id” + {DAS/naId} + “/” + {OBJECTS/object/@id}

result

I/dateRangeFacet Bucket as in API doc 3.4.4:
centuries to 1699 (1600 - 1699)
half-centuries in 1700s (1700 - 1749, 1750 - 1799)
decades from 1800 (1820 - 1829)

DAS/inclusiveDates/inclusiveStartDate
DAS/inclusiveDates/inclusiveEndDate

DAS/coverageDates/coverageStartDate
DAS/coverageDates/coverageEndDate

DAS/productionDateArray/proposableQualifiableDate/year
DAS/releaseDateArray/proposableQualifiableDate/year
DAS/copyrightDateArray/proposableQualifiableDate/year
DAS/broadcastDateArray/proposableQualifiableDate/year

DAS/parent*/inclusiveDates/inclusiveStartDate/year
DAS/parent*/inclusiveDates/inclusiveEndDate/year

results, facet, search

I/type “description” for archival description or “object” for digital object

results, facet, search

I/oldScope “descriptions” for archival description without digital objects.
“online” for archival description with digital objects.
“online” for digital object.

results, facet, search

I/level Root tag of the DAS record. Enumerated values:
“recordGroup”
“collection”
“series”
“fileUnit”
“item” (for both item and itemAv)
“object” (for all digital objects)

results, facet, search

I/parentLevel DAS/parent[Level]
Based on the value of Level:
“RecordGroup” → “recordGroup”
“Collection” → “collection”
“Series” → “series”
“FileUnit” → “fileUnit”

results, search

I/iconType For descriptions without digital objects, based on the value of I/level:
“recordGroup” → “nara/record-group”
“collection” → “nara/collection”
“series” → “nara/series”
“fileUnit” → “nara/file-unit”
“item” → “nara/item”
“itemAv” → “nara/item”
For descriptions with digital objects: OBJECTS/object[1]/file[@type='primary']/@mime
For all digital objects: OBJECTS/object/file[@type='primary']/@mime

(normalized)

results, search

I/fileFormat For descriptions: The list of all unique, normalized mime types from:

OBJECTS/object/file[@type='primary']/@mime (normalized)

For digital objects: OBJECTS/object/file[@type='primary']/@mime

(normalized)

results, facet, search

I/originalMimeType OBJECTS/object/file[@type='primary']/@mime
Note: Not normalized

search

I/tabType “all” will be added to all entries for this field.
For descriptions: Add multiple entries to this field, one for each:

OBJECTS/object/file[@type='primary']/@mime (normalized)

Map each mime type to the “type” field from the table in section 5.2.6.

Add “online” if DAS/digitalObjectArray exists OR OBJECTS/object exists.

For digital objects: Add a single additional entry for:

OBJECTS/object/file[@type='primary']/@mime (normalized)

Map the mime type to the “type” field from the table in section 5.2.6.

Add “online” for all digital objects.

result, search

I/materialsType DAS/generalRecordsTypeArray/generalRecordsType/termName

results, facet, search

I/title {DAS/title} + “ - ” + {DAS/subtitle}
NOTE: Do not include the “ - ” if there is no DAS/subtitle

results, search

I/titleSort DAS/title with leading articles and prepositions removed

Articles include: “the”, “an”, “a”
Prepositions include: “of”, “by”, “in”, “on”, “as”, “at”, “for”, “to”

sorting, search

I/allTitles This will be a multi-valued field which contains:
DAS/title
DAS/subtitle
DAS/otherTitleArray/otherTitle/title
DAS/productionSeriesTitle
DAS/productionSeriesSubtitle
If there are multiple of any of these titles, all should be added into I/allTitles

results, search

I/hmsEntryNumbers DAS/variantControlNumberArray/variantControlNumber/number where DAS/variantControlNumberArray/variantControlNumber/type/termName indicates type: “HMS/MLR Entry Number”

results, search

I/hmsEntryNumberSort Steps:

1. Sort all {I/hmsEntryNumbers} alphabetically.

2. Concatenate into a single string.

results, sorting

I/parentTitle DAS/parent[Level]/title
Enumeration of Level: RecordGroup, Collection, Series, FileUnit

results, search

I/content DAS/scopeAndContentNote
Text content extracted from the digital object (see section 3.4.1).

results, search

I/creators DAS/creatingIndividualArray/creatingIndividual/creator/termName
OR
DAS/creatingOrganizationArray/creatingOrganization/creator/termName

search

I/teaser substring(I/content, 500) results, search

I/isOnline “true” for all digital objects and for archival descriptions that contain a digital object

results, search

I/hasOnline “true” for all digital objects.
“true” for descriptions with digital objects.

results, search

I/thumbnailFile For descriptions: OBJECTS/object[1]/thumbnail/@path
For digital objects: OBJECTS/object/thumbnail/@path

result

I/titleDate concat(DAS/inclusiveDates/inclusiveStartDate, “ - ”, DAS/inclusiveDates/inclusiveEndDate)

results, search

I/location Add an entry for every:
DAS/physicalOccurrenceArray/itemPhysicalOccurrence/referenceUnitArray/referenceUnit/termName
DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/referenceUnitArray/referenceUnit/termName
DAS/physicalOccurrenceArray/fileUnitPhysicalOccurrence/referenceUnitArray/referenceUnit/termName
DAS/physicalOccurrenceArray/seriesPhysicalOccurrence/referenceUnitArray/referenceUnit/termName

Note: recordGroup and collection shall have a location too, but the DAS schema does not show that. If a physicalOccurrenceArray is added for them, the field will be populated from:
DAS/physicalOccurrenceArray/collectionPhysicalOccurrence/referenceUnitArray/referenceUnit/termName
DAS/physicalOccurrenceArray/recordGroupPhysicalOccurrence/referenceUnitArray/referenceUnit/termName

results, search

I/locationKeywords Add an entry for every tokenized:
DAS/physicalOccurrenceArray/itemPhysicalOccurrence/referenceUnitArray/referenceUnit/termName
DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/referenceUnitArray/referenceUnit/termName
DAS/physicalOccurrenceArray/fileUnitPhysicalOccurrence/referenceUnitArray/referenceUnit/termName
DAS/physicalOccurrenceArray/seriesPhysicalOccurrence/referenceUnitArray/referenceUnit/termName

Note: recordGroup and collection shall have a location too, but the DAS schema does not show that. If a physicalOccurrenceArray is added for them, the field will be populated from:
DAS/physicalOccurrenceArray/collectionPhysicalOccurrence/referenceUnitArray/referenceUnit/termName
DAS/physicalOccurrenceArray/recordGroupPhysicalOccurrence/referenceUnitArray/referenceUnit/termName

search

I/locationIds Add an entry for every:
DAS/physicalOccurrenceArray/itemPhysicalOccurrence/referenceUnitArray/referenceUnit/naId
DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/referenceUnitArray/referenceUnit/naId
DAS/physicalOccurrenceArray/fileUnitPhysicalOccurrence/referenceUnitArray/referenceUnit/naId
DAS/physicalOccurrenceArray/seriesPhysicalOccurrence/referenceUnitArray/referenceUnit/naId

Note: recordGroup and collection shall have a location ID too, but the DAS schema does not show that. If a physicalOccurrenceArray is added for them, the field will be populated from:
DAS/physicalOccurrenceArray/collectionPhysicalOccurrence/referenceUnitArray/referenceUnit/naId
DAS/physicalOccurrenceArray/recordGroupPhysicalOccurrence/referenceUnitArray/referenceUnit/naId

results, facet, search

I/parentNaId DAS/parent*/naId results, search

I/ancestorNaIds Add an entry for every:
DAS/parent[Level]/naId
DAS/parent[Level]/parent[Level]/naId
DAS/parent[Level]/parent[Level]/parent[Level]/naId
Enumeration of Level: RecordGroup, Collection, Series, FileUnit

results, search

I/hasChildren “true” if this record has children in the archival hierarchy results, search

I/objectId OBJECTS/object/@id access, search

I/objectSortNum If the record contains both images and documents, re-sort all of the OBJECTS/object entries with images first and documents last. Within each group, sub-sort by the position in which the object occurs in the DAS XML.
Then set I/objectSortNum to the new sorted order.

search, sorting

I/fileSize OBJECTS/object/file[@type='primary']/technicalMetadata/size

search

I/objectDesignator OBJECTS/object/designator results

I/objectDescription OBJECTS/object/description search

I/description If record is archival description: XML content of archival description

content delivery

I/technical OBJECTS/object/technicalMetadata search

I/objects For descriptions:
OBJECTS = entire objects.xml from section 3.5.
(NOTE: Copy the Transcription, Translation and seData XML into the objects XML before indexing)

For digital objects:
OBJECTS/object = Just the objects.xml for the specified object
(NOTE: Copy the Transcription, Translation and seData XML into the objects XML before indexing)

I/allAuthorityIds Add an entry for every: I/creatorIds, I/donorIds, I/subjectIds, I/contributorIds, I/personalReferenceIds

search

I/creatorIds Add an entry for every:
DAS/creatingIndividualArray/creatingIndividual/creator/naId OR
DAS/creatingOrganizationArray/creatingOrganization/creator/naId

search

I/donorIds Add an entry for every:
DAS/organizationalDonorArray/organizationName/naId OR
DAS/personalDonorArray/person/naId OR
DAS/archivalDescriptionsDonorArray/descriptionReference/naId

search

I/subjectIds Add an entry for every:
DAS/personalReferenceArray/person/naId
DAS/organizationalReferenceArray/organization/naId
DAS/geographicReferenceArray/geographicPlaceName/naId
DAS/descriptionReferenceArray/descriptionReference/naId

search

I/contributorIds Add an entry for every:
DAS/organizationalContributorArray/organizationName/naId
DAS/personalContributorArray/person/naId
DAS/archivalDescriptionsContributorArray/descriptionReference/naId

search

I/personalReferenceIds Add an entry for every: DAS/personalReferenceArray/person/naId

search

I/recordCreatedDateTime DAS/recordHistory/created/dateTime results, sorting

I/recordUpdatedDateTime

DAS/recordHistory/changed/modification/dateTime results, sorting

I/firstIngestedDateTime The date/time the index entry was first ingested into the Catalog.
Note: This will need to be maintained by the Catalog in a caching table.

results, sorting

I/ingestedDateTime now() = The current date-time. results, sorting

I/productionDate DAS/productionDateArray/proposableQualifiableDate results, sorting

I/productionDateMonth DAS/productionDateArray/proposableQualifiableDate search

I/productionDateDay DAS/productionDateArray/proposableQualifiableDate search

I/productionDateYear DAS/productionDateArray/proposableQualifiableDate search

I/productionDateQualifier see section 4.3 result

I/broadcastDate DAS/broadcastDateArray/proposableQualifiableDate results, sorting

I/broadcastDateMonth DAS/broadcastDateArray/proposableQualifiableDate search

I/broadcastDateDay DAS/broadcastDateArray/proposableQualifiableDate search

I/broadcastDateYear DAS/broadcastDateArray/proposableQualifiableDate search

I/broadcastDateQualifier see section 4.3 result

I/releaseDate DAS/releaseDateArray/proposableQualifiableDate results, sorting

I/releaseDateQualifier see section 4.3 result

I/coverageStartDate DAS/coverageDates/coverageStartDate results, sorting

I/coverageStartDateMonth DAS/coverageDates/coverageStartDate search

I/coverageStartDateDay DAS/coverageDates/coverageStartDate search

I/coverageStartDateYear DAS/coverageDates/coverageStartDate search

I/coverageStartDateQualifier see section 4.3 result

I/coverageEndDate DAS/coverageDates/coverageEndDate results, sorting

I/coverageEndDateMonth DAS/coverageDates/coverageEndDate search

I/coverageEndDateDay DAS/coverageDates/coverageEndDate search

I/coverageEndDateYear DAS/coverageDates/coverageEndDate search

I/coverageEndDateQualifier see section 4.3 result

I/inclusiveStartDate DAS/inclusiveDates/inclusiveStartDate results, sorting

I/inclusiveStartDateMonth DAS/inclusiveDates/inclusiveStartDate search

I/inclusiveStartDateDay DAS/inclusiveDates/inclusiveStartDate search

I/inclusiveStartDateYear DAS/inclusiveDates/inclusiveStartDate search

I/inclusiveStartDateQualifier see section 4.3 result

I/inclusiveEndDate DAS/inclusiveDates/inclusiveEndDate results, sorting

I/inclusiveEndDateMonth DAS/inclusiveDates/inclusiveEndDate search

I/inclusiveEndDateDay DAS/inclusiveDates/inclusiveEndDate search

I/inclusiveEndDateYear DAS/inclusiveDates/inclusiveEndDate search

I/inclusiveEndDateQualifier see section 4.3 result
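Several of the derived fields in the mapping table are small string or number transformations. The sketch below illustrates three of them: I/dateRangeFacet bucketing, I/title, and I/titleSort. All function names are hypothetical. The “start - end” label format is inferred from the examples in the table, and stripping leading words repeatedly and case-insensitively is an assumption about how I/titleSort should behave.

```python
from typing import Optional

# Leading articles and prepositions listed for I/titleSort.
LEADING_WORDS = {"the", "an", "a", "of", "by", "in", "on", "as", "at", "for", "to"}


def date_range_bucket(year: int) -> str:
    """Bucket a year: centuries to 1699, half-centuries in the 1700s, decades from 1800."""
    if year <= 1699:
        start = (year // 100) * 100
        end = start + 99
    elif year <= 1799:
        start = 1700 if year < 1750 else 1750
        end = start + 49
    else:
        start = (year // 10) * 10
        end = start + 9
    return f"{start} - {end}"


def build_title(title: str, subtitle: Optional[str]) -> str:
    """I/title: join title and subtitle with ' - ', omitting it when there is no subtitle."""
    return f"{title} - {subtitle}" if subtitle else title


def title_sort(title: str) -> str:
    """I/titleSort: drop leading articles and prepositions from the title."""
    words = title.split()
    while words and words[0].lower() in LEADING_WORDS:
        words.pop(0)
    return " ".join(words)
```

For instance, the year 1824 falls in the “1820 - 1829” bucket, and “The Papers of John Smith” sorts as “Papers of John Smith” because only leading words are removed.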

4.3 DateQualifier mapping

Most date fields in an archival description record have a corresponding DateQualifier field that describes the accuracy of the date. These date fields are: productionDate, broadcastDate, releaseDate, coverageStartDate, coverageEndDate, inclusiveStartDate, inclusiveEndDate.

The value of each of the DateQualifier fields will be based on the date format of the corresponding date as it is represented in the DAS XML. The mapping will be as follows:

Y – If the date contains only a year and no additional characters, date is known

Y? – If the date contains only a year and the “?” character, for example: “1944?”, date is uncertain

Yca – If the date contains only a year and the “ca.” string, for example: “ca. 1944”, date is approximate

YM – If the date contains only year and month, no additional characters, date is known

YM? – If the date contains only year, month and the “?” character, for example: “01/1944?”, date is uncertain

YMca – If the date contains only year, month and “ca.” string, for example: “ca. 01/1944”, date is approximate

YMD – If the date contains Full date: year, month and day, date is known
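The qualifier rules above can be sketched as a classifier over the date string. The function name is hypothetical, and the accepted string forms (“YYYY”, “MM/YYYY”, “MM/DD/YYYY”, an optional leading “ca.” and an optional trailing “?”) are assumptions extrapolated from the examples; the actual DAS XML representation may differ. Combinations the list does not define, such as a full date with “?”, return None.

```python
import re
from typing import Optional

# Assumed textual forms: [ca. ][MM/][DD/]YYYY[?]
_DATE = re.compile(
    r"^(?P<ca>ca\.\s*)?"
    r"(?:(?P<month>\d{1,2})/)?"
    r"(?:(?P<day>\d{1,2})/)?"
    r"(?P<year>\d{4})"
    r"(?P<uncertain>\?)?$"
)


def date_qualifier(date_text: str) -> Optional[str]:
    """Return Y, Y?, Yca, YM, YM?, YMca or YMD, or None if no rule applies."""
    match = _DATE.match(date_text.strip())
    if match is None:
        return None
    approximate = match.group("ca") is not None
    uncertain = match.group("uncertain") is not None
    if approximate and uncertain:
        return None  # not covered by the rules above
    suffix = "ca" if approximate else "?" if uncertain else ""
    if match.group("day") is not None:
        # Only a known full date is defined (YMD); qualified full dates are not.
        return "YMD" if not suffix else None
    if match.group("month") is not None:
        return "YM" + suffix
    return "Y" + suffix
```

With these assumptions, “ca. 1944” maps to Yca and “01/1944?” maps to YM?, in line with the examples in the list.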

4.4 Keyword search relevancy mapping

Archival Description metadata will be mapped to the Catalog relevancy model as follows. Note that all of the fields specified (grank1, grank2, grank3, grank4, and content) will be searched by all “q=” parameters.

When multiple DAS/ fields are mapped to the same relevancy field, all of their content will be concatenated together into the same field and searched together.
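The concatenation rule can be sketched with a small subset of the mappings from the table in this section. The function name, the flat path-to-values input, and the single-space separator are assumptions made for illustration; the design does not specify how the concatenated values are delimited.

```python
# A few source-to-relevancy mappings taken from the table in this section.
RELEVANCY_MAP = {
    "DAS/naId": "I/grank1",
    "DAS/title": "I/grank2",
    "DAS/subtitle": "I/grank2",
    "DAS/scopeAndContentNote": "I/grank3",
    "DAS/dateNote": "I/grank4",
}


def build_relevancy_fields(values):
    """Concatenate all source values that map to the same relevancy field.

    `values` maps a DAS path to the list of strings found at that path
    (a hypothetical flattened view of the DAS XML).
    """
    collected = {}
    for source, target in RELEVANCY_MAP.items():
        for value in values.get(source, []):
            collected.setdefault(target, []).append(value)
    # Assumption: values are joined with a single space.
    return {target: " ".join(parts) for target, parts in collected.items()}
```

A record with both a title and a subtitle ends up with one I/grank2 value containing both strings, so a keyword match on either one scores at the same relevancy level.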

Source field Relevancy Field

DAS/naId I/grank1

DAS/recordGroupNumber I/grank2

DAS/collectionIdentifier I/grank2

DAS/title I/grank2

DAS/otherTitleArray/otherTitle/title I/grank2

DAS/subtitle I/grank2

DAS/localIdentifier I/grank2

DAS/recordsCenterTransferArray/recordsCenterTransferNumber/number I/grank2

DAS/accessionNumberArray/accessionNumber/number I/grank2

DAS/variantControlNumberArray/variantControlNumber/number I/grank2

DAS/microformPublicationArray/microformPublication/microformPublicationTitle/identifier I/grank2

DAS/productionSeriesTitle I/grank2

DAS/productionSeriesSubtitle I/grank2

DAS/organizationalDonorArray/organizationName/termName OR
DAS/personalDonorArray/person/termName OR
DAS/archivalDescriptionsDonorArray/descriptionReference/title

I/grank3

DAS/organizationalContributorArray/organizationalContributor/contributor/termName OR
DAS/personalContributorArray/personalContributor/contributor/termName OR
DAS/archivalDescriptionsContributorArray/descriptionReference/title

I/grank3

DAS/creatingIndividualArray/creatingIndividual/creator/termName OR
DAS/creatingOrganizationArray/creatingOrganization/creator/termName

I/grank3

DAS/personalReferenceArray/person/termName
DAS/organizationalReferenceArray/organization/termName
DAS/geographicReferenceArray/geographicPlaceName/termName
DAS/descriptionReferenceArray/descriptionReference/title

I/grank3

DAS/scopeAndContentNote I/grank3

DAS/*[starts-with(local-name(), 'parent')]/creatingIndividualArray/creatingIndividual/creator/termName
DAS/*[starts-with(local-name(), 'parent')]/creatingOrganizationArray/creatingOrganization/creator/termName

I/grank3

DAS/physicalOccurrenceArray/*/referenceUnitArray/referenceUnit/termName I/grank3

DAS/dateNote I/grank4

DAS/custodialHistoryNote I/grank4

DAS/scaleNote I/grank4

DAS/transferNote I/grank4

DAS/arrangement I/grank4

DAS/accessRestriction/accessRestrictionNote I/grank4

DAS/accessRestriction/status/termName i/grank4

DAS/useRestriction/status/termName i/grank4

DAS/onlineResourceArray/onlineResource/url i/grank4

DAS/shotList i/grank4

DAS/physicalOccurrenceArray/fileUnitPhysicalOccurrence/physicalOccurrenceNote
DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/physicalOccurrenceNote
DAS/physicalOccurrenceArray/itemPhysicalOccurrence/physicalOccurrenceNote
DAS/physicalOccurrenceArray/seriesPhysicalOccurrence/physicalOccurrenceNote
i/grank4

DAS/physicalOccurrenceArray/seriesPhysicalOccurrence/containerList i/grank4

DAS/dispositionAuthorityNumberArray/dispositionAuthorityNumber/number i/grank4

DAS/variantControlNumberArray/variantControlNumber/type/termName i/grank4

DAS/variantControlNumberArray/variantControlNumber/note i/grank4

DAS/digitalObjectArray/digitalObject/objectDescription i/grank4

DAS/generalNoteArray/generalNote/note i/grank4

DAS/microformPublicationArray/microformPublication/note i/grank4

DAS/microformPublicationArray/microformPublication/publication/title i/grank4

DAS/physicalOccurrenceArray/itemPhysicalOccurrence/mediaOccurrenceArray/itemmediaOccurrence/technicalAccessRequirementsNote
DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/mediaOccurrenceArray/itemAvmediaOccurrence/technicalAccessRequirementsNote
DAS/physicalOccurrenceArray/fileUnitPhysicalOccurrence/mediaOccurrenceArray/mediaOccurrence/technicalAccessRequirementsNote
DAS/physicalOccurrenceArray/seriesPhysicalOccurrence/mediaOccurrenceArray/mediaOccurrence/technicalAccessRequirementsNote
i/grank4

DAS/formerRecordGroupArray/recordGroup/recordGroupNumber ? or title
DAS/formerCollectionArray/collection/collectionIdentifier ? or title
DAS/parentSeries/formerRecordGroupArray/recordGroup/recordGroupNumber ? or title
DAS/parentSeries/formerCollectionArray/collection/collectionIdentifier ? or title
DAS/parentFileUnit/formerRecordGroupArray/recordGroup/recordGroupNumber ? or title
DAS/parentFileUnit/formerCollectionArray/collection/collectionIdentifier ? or title
i/grank4


DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/mediaOccurrenceArray/itemAvmediaOccurrence/recordingSpeed/termName i/grank4

DAS/physicalOccurrenceArray/itemPhysicalOccurrence/mediaOccurrenceArray/itemmediaOccurrence/color/termName
DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/mediaOccurrenceArray/itemAvmediaOccurrence/color/termName
DAS/physicalOccurrenceArray/fileUnitPhysicalOccurrence/mediaOccurrenceArray/mediaOccurrence/color/termName
DAS/physicalOccurrenceArray/seriesPhysicalOccurrence/mediaOccurrenceArray/mediaOccurrence/color/termName
i/grank4

DAS/physicalOccurrenceArray/itemPhysicalOccurrence/mediaOccurrenceArray/itemmediaOccurrence/dimension/termName
DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/mediaOccurrenceArray/itemAvmediaOccurrence/dimension/termName
DAS/physicalOccurrenceArray/fileUnitPhysicalOccurrence/mediaOccurrenceArray/mediaOccurrence/dimension/termName
DAS/physicalOccurrenceArray/seriesPhysicalOccurrence/mediaOccurrenceArray/mediaOccurrence/dimension/termName
i/grank4

DAS/physicalOccurrenceArray/itemPhysicalOccurrence/mediaOccurrenceArray/itemmediaOccurrence/base/termName
DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/mediaOccurrenceArray/itemAvmediaOccurrence/base/termName
i/grank4

DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/mediaOccurrenceArray/itemAvmediaOccurrence/soundtrackLanguage/termName i/grank4

DAS/physicalOccurrenceArray/itemPhysicalOccurrence/mediaOccurrenceArray/itemmediaOccurrence/specificMediaType/termName
DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/mediaOccurrenceArray/itemAvmediaOccurrence/specificMediaType/termName
DAS/physicalOccurrenceArray/fileUnitPhysicalOccurrence/mediaOccurrenceArray/mediaOccurrence/specificMediaType/termName
DAS/physicalOccurrenceArray/seriesPhysicalOccurrence/mediaOccurrenceArray/mediaOccurrence/specificMediaType/termName
i/grank4

DAS/physicalOccurrenceArray/itemPhysicalOccurrence/mediaOccurrenceArray/itemmediaOccurrence/process/termName
DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/mediaOccurrenceArray/itemAvmediaOccurrence/process/termName
DAS/physicalOccurrenceArray/fileUnitPhysicalOccurrence/mediaOccurrenceArray/mediaOccurrence/process/termName
DAS/physicalOccurrenceArray/seriesPhysicalOccurrence/mediaOccurrenceArray/mediaOccurrence/process/termName
i/grank4

DAS/physicalOccurrenceArray/itemPhysicalOccurrence/mediaOccurrenceArray/itemmediaOccurrence/format/termName
DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/mediaOccurrenceArray/itemAvmediaOccurrence/format/termName
i/grank4

DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/mediaOccurrenceArray/mediaOccurrence/soundtrackConfiguration/termName i/grank4

DAS/findingAidArray/findingAid/type i/grank4

DAS/findingAidArray/findingAid/source i/grank4

DAS/findingAidArray/findingAid/note i/grank4

DAS/organizationalContributorArray/organizationalContributor/contributorType OR
DAS/personalContributorArray/personalContributor/contributorType OR
DAS/archivalDescriptionsContributorArray/descriptionReference/recordType
i/grank4

DAS/generalRecordsTypeArray/generalRecordsType/termName i/grank4

DAS/physicalOccurrenceArray/itemPhysicalOccurrence/copyStatus/termName
DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/copyStatus/termName
DAS/physicalOccurrenceArray/fileUnitPhysicalOccurrence/copyStatus/termName
DAS/physicalOccurrenceArray/seriesPhysicalOccurrence/copyStatus/termName
i/grank4

DAS/physicalOccurrenceArray/itemPhysicalOccurrence/mediaOccurrenceArray/itemmediaOccurrence/emulsion/termName
DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/mediaOccurrenceArray/itemAvmediaOccurrence/emulsion/termName
i/grank4

DAS/physicalOccurrenceArray/itemPhysicalOccurrence/mediaOccurrenceArray/itemmediaOccurrence/mediaOccurrenceNote
DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/mediaOccurrenceArray/itemAvmediaOccurrence/mediaOccurrenceNote
DAS/physicalOccurrenceArray/fileUnitPhysicalOccurrence/mediaOccurrenceArray/mediaOccurrence/mediaOccurrenceNote
DAS/physicalOccurrenceArray/seriesPhysicalOccurrence/mediaOccurrenceArray/mediaOccurrence/mediaOccurrenceNote
i/grank4

DAS/onlineResourceArray/onlineResource/description i/grank4

DAS/onlineResourceArray/onlineResource/note i/grank4

DAS/editStatus/termName i/grank4

DAS/languageArray/language/termName i/grank4

DAS/useRestriction/note i/grank4

DAS/functionAndUse i/grank4


DAS/numberingNote i/grank4

DAS/soundType/termName i/grank4

OBJECTS/seData1 I/content

OBJECTS/transcription1 I/content

OBJECTS/translation1 I/content

Notes:

1. For these fields, copy the data as follows:

a. For archival descriptions – Extract the specified data from all nested digital objects, concatenate it, and copy it to the relevancy field.

b. For digital objects – Extract the specified data only from the appropriate digital object to the relevancy field.
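
The copy rule in Note 1 can be sketched as follows. This is a minimal illustration only: it assumes each digital object is available as a dictionary keyed by the OBJECTS field names (reading the trailing "1" in the table as a reference to Note 1), and the function names, the dictionary shape, and the single-space separator are hypothetical, not part of the design.

```python
from typing import Dict, List

# Field names taken from the OBJECTS rows above (assumed to be
# seData, transcription, and translation without the note marker).
CONTENT_FIELDS = ("seData", "transcription", "translation")


def content_for_digital_object(obj: Dict[str, str]) -> str:
    """Note 1b: use only the data of the one digital object being indexed."""
    parts = [obj[field] for field in CONTENT_FIELDS if obj.get(field)]
    return " ".join(parts)


def content_for_description(objects: List[Dict[str, str]]) -> str:
    """Note 1a: extract the data from every nested object and concatenate it."""
    parts = [content_for_digital_object(obj) for obj in objects]
    # Assumption: objects without any content contribute nothing.
    return " ".join(part for part in parts if part)
```

The result of either function would then be copied to the I/content relevancy field.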


5 Search Fields

5.1 Advanced Search Fields

The following table describes the fields available for searching in Advanced Search.

Display Name Query Type Allowed Values

Record Group Number/Collection ID f.recordGroupCollectionId=record-group-number:(${s}) OR hierarchy-item-record-group-number:(${s}) OR collection-id:(${s}) OR hierarchy-item-collection-id:(${s}) string Free text

Search by Date Range: From f.beginDate=${s} string YYYY-MM-DD

Search by Date Range: To f.endDate=${s} string YYYY-MM-DD

Search by Recurring Date: Recurring Date f.recurringDateMonth=${s} string List of values in a drop-down menu; values are 1–12.

Search by Recurring Date: Recurring Date f.recurringDateDay=${s} string List of values in a drop-down menu; values are 1–31.

Search by Exact Date: Exact Date f.exactDate=${s} string YYYY-MM-DD

Type of Archival Materials f.materialsType=${s} string List of values in the scrollable list box. The list of values is shown in 5.2.4

Level of Descriptions f.level=${s} string List of values in the scrollable list box. The list of values is shown in 5.2.5

File Format of Archival Descriptions f.fileFormat=${s} string List of values in the scrollable list box. The list of values is shown in 5.2.6

Location of Archival Materials f.locationIds=${s} string List of values in the scrollable list box. The list of values is shown in 5.2.7

Title f.allTitles=${s} string Free text

Geographic References f.geographicReferences=subject-reference:(display-name:(${s}) AND @subect-type:TGN) string Free text

Creator f.creators: (${s}) string Free text


Description Identifier f.descriptionIdentifier=naId:(${s}) OR localId:(${s}) OR description.accession-numbers:(accession-number:(${s})) OR description.microform-publication:(microform-id:(${s})) OR description.variant-control-number:(variant-number:(${s})) OR description.rct-numbers:(rct-number:(${s})) OR description.internal-transfer-numbers:(internal-transfer-number:(${s})) OR description.record-group-number:(${s}) OR description.hierarchy-item-record-group-number:(${s}) OR collection-id:(${s}) OR hierarchy-item-collection-id:(${s}) string Free text
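
The Query column uses ${s} as a placeholder for the value entered by the user. A minimal sketch of that substitution is shown below, using two templates copied from the table. The dictionary, the function name, and the use of Python's string.Template are assumptions made for illustration; the sketch does not escape query syntax in the user input, which a real implementation would need to consider.

```python
from string import Template

# Two query templates copied from the table above; the dictionary itself
# is illustrative and not part of the design.
QUERY_TEMPLATES = {
    "f.level": "${s}",
    "f.recordGroupCollectionId": (
        "record-group-number:(${s}) OR "
        "hierarchy-item-record-group-number:(${s}) OR "
        "collection-id:(${s}) OR "
        "hierarchy-item-collection-id:(${s})"
    ),
}


def expand_query(field: str, user_input: str) -> str:
    """Replace every ${s} placeholder in the field's template with the input."""
    return Template(QUERY_TEMPLATES[field]).substitute(s=user_input)
```

For example, expand_query("f.recordGroupCollectionId", "59") searches all four record group and collection fields for the value 59.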

5.2 Search Fields - Additional Information

5.2.1 Date range field sources

Date field sources for date range searches include the following:

description//productionDateArray/proposableQualifiableDate//logicalDate

description//broadcastDates//logicalDate

description//coverageDates/coverageStartDate|coverageEndDate/logicalDate

description//inclusiveDates/inclusiveStartDate|inclusiveEndDate/logicalDate

5.2.2 Recurring date search field sources

Date field sources for recurring date searches include the following:

description//productionDateArray/proposableQualifiableDate/month|day

description//broadcastDates//month|day

description//coverageDates/coverageStartDate|coverageEndDate/month|day

description//inclusiveDates/inclusiveStartDate|inclusiveEndDate/month|day

5.2.3 Exact date search field sources

Date field sources for exact date searches include the following:

description//productionDateArray/proposableQualifiableDate//logicalDate

description//broadcastDates//logicalDate


description//coverageDates/coverageStartDate|coverageEndDate/logicalDate

description//inclusiveDates/inclusiveStartDate|inclusiveEndDate/logicalDate

5.2.4 Type of Archival Materials Values

Type of Archival Material Value            Search Value
Architectural and Engineering Drawings     drawings
Artifacts                                  artifacts
Data Files                                 dataFiles
Maps and Charts                            mapsAndCharts
Moving Images                              movingImages
Photographs and Other Graphic Materials    photographsAndGraphics
Sound Recordings                           sound
Textual Records                            text
Web Pages                                  web

5.2.5 Level of Description Values

Level of Description Value    Search Value
Record Group                  recordGroup
Collection                    collection
Series                        series
File                          fileUnit
Item                          Item

5.2.6 File Format Values

File Format Value                         Search Value                    Type
ASCII Text                                text/plain                      document
Audio Visual (Real Media Video Stream)    application/vnd.rn-realmedia    video
Audio Visual File (AVI)                   video/x-msvideo                 video
Audio Visual File (MOV)                   video/quicktime                 video
Audio Visual File (MP4)                   video/mp4                       video
Audio Visual File (WMV)                   video/x-ms-wmv                  video
Image (BMP)                               image/bmp                       image
Image (GIF)                               image/gif                       image
Image (JPG)                               image/jpeg                      Image
Image (JPEG2000)                          image/jp2                       image
Image (TIFF)                              image/tiff                      image
Compressed File (ZIP)                     application/zip                 document
MS Excel Spreadsheet                      application/excel               document
Microsoft PowerPoint Document             application/mspowerpoint        document
Microsoft Word Document                   application/msword              document
Microsoft Write Document                  application/mswrite             document
Portable Document File (PDF)              application/pdf                 document
Sound File (MP3)                          audio/mpeg3                     audio
Sound File (WAV)                          audio/x-wav                     audio
World Wide Web Page                       text/html                       web

5.2.7 Location values

Location Value Search Value (ref-id)

William J. Clinton Library 1

Dwight D. Eisenhower Library 2

Franklin D. Roosevelt Library 3

George Bush Library 4

Gerald R. Ford Library 5

Gerald R. Ford Museum 6

Herbert Hoover Library 7

Harry S. Truman Library 8

Jimmy Carter Library 9

John F. Kennedy Library 10

Lyndon Baines Johnson Library 11

Richard Nixon Library - College Park 12

Ronald Reagan Library 13

National Archives at Boston 14

National Archives at New York 15

National Archives at Philadelphia 17

National Archives at Atlanta 18

National Archives at Chicago 19

National Archives at Kansas City 20

National Archives at Fort Worth 21

National Archives at Denver 22


National Archives at Riverside 23

National Archives at San Francisco 24

National Archives at Seattle 26

Note: NARA will be moving the records currently assigned to the National Archives at Anchorage reference unit to the National Archives at Seattle by the end of March 2015.

National Personnel Records Center - Civilian Personnel Records 27

National Personnel Records Center - Military Personnel Records 28

National Archives at College Park – Cartographic 29

National Archives at College Park - Motion Pictures 30

National Archives at College Park - Still Pictures 31

National Archives at Washington, DC - Textual Reference 32

National Archives at College Park - Textual Reference 33

National Archives at College Park – FOIA 34

Center for Legislative Archives 36

National Archives at College Park – Electronic Records 37

Library of Congress, Prints and Photographs Division (an affiliated archives) 38

National Park Service, Yellowstone National Park Archives (an affiliated archives) 39

New Mexico Commission of Public Records, State Records Center and Archives (an affiliated archives) 40

Oklahoma Historical Society (an affiliated archives) 41

Pennsylvania Historical and Museum Commission, State Archives (an affiliated archives) 42

United States Military Academy Archives (an affiliated archives) 43

United States Naval Academy, William W. Jeffries Memorial Archives (an affiliated archives) 44

Presidential Materials Division 48

National Archives at St. Louis 50

Richard Nixon Library 51

George W. Bush Library 53

U.S. Government Printing Office (an affiliated archives) 54

University of North Texas Libraries (an affiliated archives) 57

Note: NARA pointed out that the Location Value table will be updated in the future, so the system must be built to handle changes to location values.


6 Search Results Presentation

This section covers the “brief results” display – the standard output for archival descriptions when shown in the search results.

Archival descriptions with digital objects:

Archival description without digital object

Note: If the description record has a variant control number whose type is “HMS/MLR Entry Number”, that variant control number is displayed as the HMS Entry Number on the brief results page.

Icon

Use the following algorithm to display the icon:

1. If {I/thumbnailFile} exists, then use:

<img src="http://catalog.archives.gov/id/{I/naid}/opa-renditions/thumbnails/{I/thumbnailFile}"/>

2. Otherwise, use:

<img src="http://catalog.archives.gov/id/{I/iconType}.jpg"/>
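
The icon algorithm above can be sketched as a small function. This is an illustration under stated assumptions: the function and parameter names are hypothetical, and the values are assumed to be safe to place in a URL without further encoding or HTML escaping.

```python
from typing import Optional

BASE_URL = "http://catalog.archives.gov/id"


def icon_img_tag(naid: str, icon_type: str, thumbnail_file: Optional[str]) -> str:
    """Return the <img> tag for a brief result, preferring the thumbnail."""
    if thumbnail_file:
        # Step 1: I/thumbnailFile exists, so link to the thumbnail rendition.
        src = f"{BASE_URL}/{naid}/opa-renditions/thumbnails/{thumbnail_file}"
    else:
        # Step 2: otherwise fall back to the generic icon for I/iconType.
        src = f"{BASE_URL}/{icon_type}.jpg"
    return f'<img src="{src}"/>'
```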

Metadata Fields

Display Rules for HMS Entry Number(s)

a. If there is an HMS/MLR Entry number, display it after the National Archives Identifier

b. If there is more than one HMS/MLR Entry number, display the first two HMS/MLR Entry numbers, followed by an ellipsis (…) and a comma


c. If there is no HMS/MLR Entry number at the file unit or item level, display the parent HMS/MLR Entry number(s) as described in a and b above.
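
The HMS Entry Number display rules can be sketched as follows. The function name, the parameter names, and the level strings are hypothetical. The rules do not say whether the ellipsis is shown when exactly two numbers exist; the sketch adds it only when numbers are actually omitted, which is an assumption, and it leaves the trailing comma to the surrounding line pattern.

```python
from typing import List


def hms_entry_numbers(own: List[str], parent: List[str], level: str) -> str:
    """Format HMS/MLR Entry numbers for the brief results display."""
    numbers = own
    # Rule c: file units and items without their own number use the parent's.
    if not numbers and level in ("fileUnit", "item"):
        numbers = parent
    if not numbers:
        return ""
    # Rules a and b: show at most the first two numbers.
    shown = ", ".join(numbers[:2])
    # Assumption: the ellipsis marks omitted numbers only.
    if len(numbers) > 2:
        shown += " (…)"
    return shown
```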

Line Index Fields & Pattern

1 {I/title}, {I/titleDate}
Remove “, “ after {I/title} if {I/titleDate} does not exist.

2 From RG: {DAS/parent[Record Group]/ recordGroupNumber } OR From: Collection: {DAS/parent[Collection]/ collectionIdentifier } *{DAS/parent[Series]/title} **

3 {Show highlighted content if available, otherwise show I/teaser}

4 National Archives Identifier: {I/naId}, Local Identifier: {I/localIdentifier}, Container ID: {I/containerId}

5 Creator{s}1: { I/creators }2

Note 1: Output “Creator” (if only one creator) or “Creators” (if multiple creators) for the label

Note 2: Separate multiple creators with “;” (semi-colon)

* For Series, File Units, and Items, display the parent Record Group Number or Collection Identifier.

** Below the parent Record Group Number or Collection Identifier, display the parent Series title for file units and items.
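
Lines 1 and 5 of the pattern table carry small formatting rules that can be sketched as below. The function names are hypothetical, and the space after the semicolon separator is an assumption, since Note 2 only specifies the semicolon itself.

```python
from typing import List, Optional


def title_line(title: str, title_date: Optional[str]) -> str:
    """Line 1: drop the ', ' separator when there is no title date."""
    return f"{title}, {title_date}" if title_date else title


def creators_line(creators: List[str]) -> str:
    """Line 5: singular or plural label, creators joined with a semicolon."""
    if not creators:
        return ""
    # Note 1: the label depends on the number of creators.
    label = "Creator" if len(creators) == 1 else "Creators"
    # Note 2: multiple creators are separated with a semicolon.
    return f"{label}: {'; '.join(creators)}"
```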


7 Content Details Presentation

The page layouts for Record group, Collection, Series, File Unit and Item are shown below.

7.1 Rules for ${online-availability-header}

The rules for displaying the “online-availability-header” are as follows:

1. Record groups, collections, series, and file units are often only partially digitized, in which case the header is appropriate. The system should, however, update and remove the header when the entire record group, collection, series, or file unit is completely digitized. [Implementation: R2/3]

2. Moderator users can add a specific header for records that are not available online. In the Moderator Workbench they can edit how this header should look, hide it from view, or remove it entirely. For example:

This ${recordType} describes records, some of which may not be available online. To obtain a copy or view the records, please contact or visit the National Archives and Records Administration location(s) listed in the Contact information below.

${recordType} is determined by the root tag of the record. The rule is:

“Collection” if root tag is <collection>
“Record Group” if root tag is <recordGroup>
“Series” if root tag is <series>
“File Unit” if root tag is <fileUnit>
“Item” if root tag is <item>
“Item” if root tag is <itemAv>
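
The root-tag rule can be sketched as a lookup followed by a template substitution. The function names are hypothetical, the header text is the example wording quoted above, and the namespace handling is an assumption about how the description XML might be delivered.

```python
import xml.etree.ElementTree as ET
from string import Template

# Mapping from the record's root tag to the ${recordType} display value.
RECORD_TYPE_BY_ROOT_TAG = {
    "collection": "Collection",
    "recordGroup": "Record Group",
    "series": "Series",
    "fileUnit": "File Unit",
    "item": "Item",
    "itemAv": "Item",
}

HEADER = Template(
    "This ${recordType} describes records, some of which may not be "
    "available online. To obtain a copy or view the records, please contact "
    "or visit the National Archives and Records Administration location(s) "
    "listed in the Contact information below."
)


def record_type(xml_text: str) -> str:
    """Derive ${recordType} from the root tag of a description record."""
    root = ET.fromstring(xml_text)
    # Assumption: strip an XML namespace prefix such as "{uri}series" if present.
    local_name = root.tag.rsplit("}", 1)[-1]
    return RECORD_TYPE_BY_ROOT_TAG[local_name]


def online_availability_header(xml_text: str) -> str:
    """Fill the example header text with the record type."""
    return HEADER.substitute(recordType=record_type(xml_text))
```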

7.2 Rules for displaying fields

The rules for displaying some fields that appear on the Content Details layouts are shown below.

7.2.1 Email addresses

Whenever NARA contact information is displayed in the detailed results, all NARA email addresses should be standard links. These links should be displayed in the standard Catalog format and style and should behave in the standard manner when clicked or right-clicked.

7.3 Record Group

{DAS/title}, {DAS/inclusiveDates/inclusiveStartDate + “-“ + DAS/inclusiveDates/inclusiveEndDate}1


${online-availability-header}

National Archives Identifier: {DAS/naId}

Details

Level of Description: Record Group

Record Group Number: {DAS/recordGroupNumber}

Contact(s): {for-each: RU = DAS/referenceUnit, repeat the following lines:}4

{RU/name} {(RU/mailcode)}2, {RU/address1}1

{RU/address2}

{RU/city}, {RU/state} {RU/postcode}

Phone: {RU/phone}3

Fax: {RU/fax}3

Email: {RU/email}3

This record group was compiled or maintained between:

{DAS/inclusiveDates/inclusiveStartDate} - {DAS/inclusiveDates/inclusiveEndDate}

This record group documents the time period:

{DAS/coverageDates/coverageStartDate} -{DAS/coverageDates/coverageEndDate}

Date Note: {DAS/dateNote}

These records document the following Congresses:

{DAS/beginCongress/termName} –{DAS/endCongress/termName }

Includes {count(DAS/containsOrderArray)} series described in the catalog 7.3.1

7.3.1

{for-each: FA = DAS/findingAidArray/findingAid,repeat the next three rows:}

Finding Aid Type: {FA/type}

Finding Aid Note: {FA/note}

Finding Aid URL: {FA/url/termname}

Finding Aid Source: {FA/source}


Scope & Content

{DAS/scopeAndContentNote}

Variant Control Numbers

ARC Identifier: {DAS/naId}

{for-each: VAR = DAS/variantControlNumberArray/variantControlNumber

Repeat the following row}

{VAR/variant-type}: {VAR/number}{VAR/note}

Notes:

1. If field does not exist, remove the preceding comma

2. If field does not exist, remove the enclosing parentheses

3. If field does not exist, remove entire line including the preceding label
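
The three notes can be illustrated on the Contact(s) lines of the layout. The sketch below is a simplified assumption: the reference unit is modelled as a dictionary, the function name is hypothetical, and the address2, city, state, and postcode lines are left out for brevity.

```python
from typing import Dict, List


def contact_lines(ru: Dict[str, str]) -> List[str]:
    """Render one reference unit, applying Notes 1 to 3 for missing fields."""
    first = ru.get("name", "")
    if ru.get("mailcode"):
        # Note 2: the parentheses are only output together with the value.
        first += f" ({ru['mailcode']})"
    if ru.get("address1"):
        # Note 1: the comma is only output together with the value.
        first += f", {ru['address1']}"
    lines = [first]
    # Note 3: a labelled line is dropped entirely when its field is missing.
    for label, key in (("Phone", "phone"), ("Fax", "fax"), ("Email", "email")):
        if ru.get(key):
            lines.append(f"{label}: {ru[key]}")
    return lines
```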

7.3.1 Record Group Link Table

The following table defines the action taken when the user clicks a link on the full result page of a Record Group record.

Link: Includes <series-count> series described in the catalog
Expected Result: Search for all Series descriptions with <parent-id> equal to current <naId>.
Action: http://catalog.archives.gov/?action=search and f.parentNaId ={DAS/naId}&ui.sw={DAS/naId}

Link: Search within this Record Group
Expected Result: Searches for all Series, File(s), and Item(s) which are descendants of the current Record Group. Searches for all Series, File(s) and Item(s) which have parent[Level]/naId equal to the current record’s naId. The search parameters are then executed on only that subset of child records.
Action: http://catalog.archives.gov/?action=search and f.ancestorNaIds ={DAS/naId}&ui.sw={DAS/naId}
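
The two link actions can be sketched as URL builders. This sketch assumes that the word "and" in the Action values stands for the "&" query-string separator and that the stray space before "=" is not significant; both are interpretations, and the function names are hypothetical.

```python
from urllib.parse import urlencode

CATALOG_URL = "http://catalog.archives.gov/"


def series_link(na_id: str) -> str:
    """'Includes N series' link: Series whose parent is the current record."""
    params = {"action": "search", "f.parentNaId": na_id, "ui.sw": na_id}
    return CATALOG_URL + "?" + urlencode(params)


def search_within_link(na_id: str) -> str:
    """'Search within' link: all descendants of the current record."""
    params = {"action": "search", "f.ancestorNaIds": na_id, "ui.sw": na_id}
    return CATALOG_URL + "?" + urlencode(params)
```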

7.4 Collection

{DAS/title}, {DAS/inclusiveDates/inclusiveStartDate + “-“ + DAS/inclusiveDates/inclusiveEndDate}1

${online-availability-header}

National Archives Identifier: {DAS/naId}

Details

Level of Description: Collection

Collection Identifier: {DAS/collectionIdentifier}

Contact(s): {for-each: RU = DAS/referenceUnit, repeat the following lines:}

{RU/name} {(RU/mailcode)}2, {RU/address1}1

{RU/address2}

{RU/city}, {RU/state} {RU/postcode}

Phone: {RU/phone}3

Fax: {RU/fax}3

Email: {RU/email}3

This collection was compiled or maintained between:

{DAS/inclusiveDates/inclusiveStartDate} - {DAS/inclusiveDates/inclusiveEndDate}

This collection documents the time period:

{DAS/coverageDates/coverageStartDate} - {DAS/coverageDates/coverageEndDate}

Date Note: {DAS/dateNote}

Includes {count(DAS/containsOrderArray)} series described in the catalog 7.4.1 7.4.1

Donor(s): { for-each: DO = DAS/organizationalDonorArray/organizationName OR DAS/personalDonorArray/person OR DAS/archivalDescriptionsDonorArray/descriptionReference }{DO/termName}7.4.1

{for-each: FA = DAS/findingAidArray/findingAid, repeat the following three rows:}

Finding Aid Type: {FA/type}

Finding Aid Note: {FA/note}

Finding Aid URL: {FA/url/termname}

Finding Aid Source: {FA/source}

Scope & Content

{DAS/scopeAndContentNote}

Variant Control Numbers

ARC Identifier: {DAS/naId}

{for-each: VAR = DAS/variantControlNumberArray/variantControlNumber,Repeat the following row:}

{VAR/variant-type}: {VAR/number}{VAR/note}

Notes:

1. If field does not exist, remove the preceding comma

2. If field does not exist, remove the enclosing parentheses

3. If field does not exist, remove entire line including the preceding label

7.4.1 Collection Link Table

The following table defines the action taken when the user clicks a link on the full result page of a Collection record.

Link: Includes <series-count> series described in the catalog
Expected Result: Search for all Series descriptions with parent[Level]/naId equal to current naId.
Action: http://catalog.archives.gov/?action=search and f.parentNaId ={DAS/naId}&ui.sw={DAS/naId}

Link: Search within this Collection
Expected Result: Searches for all Series, File(s), and Item(s) which are descendants of the current Collection. Searches for all Series, File(s) and Item(s) which have parent[Level]/naId equal to the current record’s naId. The search parameters are then executed on only that subset of child records.
Action: http://catalog.archives.gov/?action=search and f.ancestorNaIds ={DAS/naId}&ui.sw={DAS/naId}

Link: Donor(s)
Expected Result: Links to the specific Authority record of the Donor, based on the donor array type and donor naId.
Action:
If donor array type is organizationalDonorArray: http://catalog.archives.gov/id/${DAS/organizationalDonorArray/organizationName/naId}
If donor array type is personalDonorArray: http://catalog.archives.gov/id/${DAS/personalDonorArray/person/naId}
If donor array type is archivalDescriptionsDonorArray: http://catalog.archives.gov/id/${DAS/archivalDescriptionsDonorArray/descriptionReference/naId}

7.5 Series

{DAS/title}, {DAS/inclusiveDates/inclusiveStartDate + “-“ + DAS/inclusiveDates/inclusiveEndDate}1


${online-availability-header}

National Archives Identifier: {DAS/naId}

Local Identifier: {DAS/localIdentifier}

HMS Entry Number(s): {DAS/variantControlNumberArray/variantControlNumber/number}, {DAS/variantControlNumberArray/variantControlNumber/number} (…)2

Creator(s) {for-each: CO = DAS/creatingIndividualArray/creatingIndividual/ OR DAS/creatingOrganizationArray/creatingOrganization/ }{CO/creator/termName}7.5.1 {(CO/creatorType/termName)}3

From: {DAS/parent[Level]/title}7.5.1

Details

Level of Description: Series

Type(s) of Archival Materials:

{for-each: GT = DAS/generalRecordsTypeArray/generalRecordsType}{GT/termName}

The creator compiled or maintained the series between:

{DAS/inclusiveDates/inclusiveStartDate} - {DAS/inclusiveDates/inclusiveEndDate}
Note: date qualifier like “ca” shall be included

This series documents the time period:

{DAS/coverageDates/coverageStartDate} - {DAS/coverageDates/coverageEndDate}
Note: date qualifier like “ca” shall be included

Date Note: {DAS/dateNote}

These records document the following Congresses:

{DAS/beginCongress/termName} – {DAS/endCongress/termName}

Includes {count(DAS/containsOrderArray)} file(s) described in the catalog 7.5.1

7.5.1

Other Title(s): {for-each: OT = DAS/otherTitleArray/otherTitle/title}{OT}

Function and Use: {DAS/functionAndUse}


Scale Note: {DAS/scaleNote}

Numbering Note: {DAS/numberingNote}

General Note(s): {for-each: GN = DAS/generalNoteArray/generalNote/note}{GN}

Arrangement: {DAS/arrangement}

Access Restriction(s):

{DAS/accessRestriction/status/termName}

Specific Access Restriction: {for-each: SAR = DAS/accessRestriction/specificAccessRestrictionArray/specificAccessRestriction}4

{SAR/restriction}, {SAR/restriction} {SAR/securityClassification}, {SAR/securityClassification}

Note: {DAS/accessRestriction/accessRestrictionNote}4

Use Restriction(s): {DAS/useRestriction/status/termName}

Specific Use Restriction: {for-each: SUR = DAS/useRestriction/specificUseRestrictionArray/specificUseRestriction}

{SUR/restriction}, {SUR/restriction}

Note: {DAS/useRestriction/note}4

Custodial History: {DAS/custodialHistoryNote}

Transfer Information: {DAS/transferNote}

Edited: {DAS/editStatus/termName}

Sound Type: {DAS/soundType/termName}

Language(s): {for-each: LG = DAS/languageArray/language/termName}{LG}

{for-each: FA = DAS/findingAidArray/findingAid Repeat the following three rows:}

Finding Aid Type: {FA/type}

Finding Aid Note: {FA/note}

Finding Aid URL: {FA/url/termname}

Finding Aid Source: {FA/source}

Accession Number(s):

{for-each: AN = DAS/ accessionNumberArray/accessionNumber/number}{AN}


Disposition Authority Number(s):

{for-each: DAN = DAS/dispositionAuthorityNumberArray/dispositionAuthorityNumber/number}

{DAN}

Records Center Transfer Number(s):

{for-each: RCTN = DAS/recordsCenterTransferArray/recordsCenterTransferNumber}

{RCTN}

Internal Transfer Number(s):

{for-each: ITN = DAS/internalTransferNumberArray/internalTransferNumber/number}

{ITN}

Microform Publication(s):

{for-each: MP = DAS/microformPublicationArray/microformPublication}{MP/identifier}{MP/note}{MP/title}

Online Resource(s): {for-each: OR = DAS/onlineResourceArray/onlineResource}{OR/description}7.5.1

{OR/note}

Subjects Represented in the Archival Material(s):

{for-each: SR =
DAS/personalReferenceArray/person OR
DAS/organizationalReferenceArray/organizationName OR
DAS/geographicReferenceArray/geographicPlaceName OR
DAS/descriptionReferenceArray/specificRecordsType OR
DAS/descriptionReferenceArray/TopicalSubject}

{SR/termName}7.5.1

Contributors to Authorship and/or Production of the Archival Material(s):

{for-each: CR =
DAS/organizationalContributorArray/organizationalContributor OR
DAS/personalContributorArray/personalContributor OR
DAS/archivalDescriptionsContributorArray/descriptionReference}

{CR/termName or CR/title}7.5.1

Former Record Group(s):

{for-each: FRG = DAS/formerRecordGroupArray/recordGroup}{FRG/recordGroupNumber}

Former Collection(s): {for-each: FCL = DAS/formerCollectionArray/collection}{FCL/collectionIdentifier}

Scope & Content

{DAS/scopeAndContentNote}

Variant Control Numbers

ARC Identifier: {DAS/naId}

{for-each: VAR = DAS/variantControlNumberArray/variantControlNumber

Repeat the following row}

{VAR/variant-type/termName}: {VAR/number}{VAR/note}

Archived Copies

{for-each: PO = DAS/physicalOccurrenceArray/seriesPhysicalOccurrence}

Copy N5: {PO/copyStatus/termName}

Extent (Size): {PO/extent}

Contact(s): {for-each: RU = DAS/physicalOccurrenceArray/seriesPhysicalOccurrence/referenceUnitArray/referenceUnit, Repeat the following lines:}

{RU/termName} {(RU/mailCode)}3, {RU/address1}1

{RU/address2}{RU/city}, {RU/state} {RU/postcode}

Phone: {RU/phone}4

Fax: {RU/fax}4

Email: {RU/email}4

Count: {for-each: HM = PO/holdingsMeasurementArray/holdingsMeasurement}

{HM/count}{HM/holdingsMeasurementType/termName}

Physical Occurrence Note: {PO/physicalOccurrenceNote}

Copy N Media Information: {for-each: MEDIA = PO/mediaOccurrenceArray/mediaOccurrence Repeat the following lines}

Specific Media Type: {MEDIA/specificMediaType/termName}4

Color: {MEDIA/color}4

Container ID: {MEDIA/containerId}4

Dimension: {MEDIA/dimension/termName}4

Height: {MEDIA/height}4

Width: {MEDIA/width}4

Depth: {MEDIA/depth}4

Media Occurrence Note: {MEDIA/mediaOccurrenceNote}4

Physical Restriction Note: {MEDIA/physicalRestrictionNote}4

Piece Count: {MEDIA/pieceCount}4

Process: {MEDIA/process}4

Reproduction Count: {MEDIA/reproductionCount}4

Technical Access Requirements Note: {MEDIA/technicalAccessRequirementsNote}4

Container List

{DAS/physicalOccurrenceArray/seriesPhysicalOccurrence/containerList}

Notes:

1. If field does not exist, remove the preceding comma

2. If there are fewer than two variant control numbers, remove the ellipsis and the enclosing parentheses; only variant control numbers whose mlr attribute is “true” will be counted.

3. If field does not exist, remove the enclosing parentheses

4. If field does not exist, remove entire line including the preceding label

5. The value N should start with 1 and increment by 1 per each occurrence
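The conditional display notes can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names and the dictionary keys (`copyStatus`, `extent`) are hypothetical stand-ins for values already extracted from the DAS record, not part of the Catalog code base.

```python
from typing import List, Mapping, Optional, Sequence


def render_line(label: str, value: Optional[str]) -> Optional[str]:
    """Note 4: drop the whole line, label included, when the field is missing."""
    if not value:
        return None
    return f"{label}: {value}"


def render_reference_unit(name: str, mail_code: Optional[str],
                          address1: Optional[str]) -> str:
    """Notes 1 and 3: omit the comma or the parentheses when the field is missing."""
    text = name
    if mail_code:
        text += f" ({mail_code})"
    if address1:
        text += f", {address1}"
    return text


def render_copies(copies: Sequence[Mapping[str, str]]) -> List[str]:
    """Note 5: number the copies starting at 1, incrementing by 1."""
    lines: List[str] = []
    for n, copy in enumerate(copies, start=1):
        lines.append(f"Copy {n}: {copy.get('copyStatus', '')}")
        extent = render_line("Extent (Size)", copy.get("extent"))
        if extent is not None:
            lines.append(extent)
    return lines
```

For example, `render_line("Fax", None)` returns `None`, so the caller skips the Fax row entirely rather than printing an empty label.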

7.5.1 Series Link Table

The following table defines the action when the user clicks a link in the full result page of a Series record.

Link: Creator(s):

Expected result: Links to the specific Authority record who is the Creator based on the creator record type and creator-id

Action: Links to the specific Authority record who is the Creator based on creatorType/termName and creator/naId

Link: From:

Expected result: Links to the description whose naId is equal to the parent[Level]/naId

Action: {for-each: ANCESTOR = DAS/parent[Level]} http://catalog.archives.gov/id/ANCESTOR/naId

Link: Includes <file-unit-count/> file(s) described in the catalog

Expected result: Search for all File descriptions with parent[Level]/naId equal to the current Series naId.

Action: http://catalog.archives.gov/?action=search&f.parentNaId={DAS/naId}&ui.sw={DAS/naId}

Link: Includes <item-count> item(s) described in the catalog

Expected result: Search for all Item descriptions with parent[Level]/naId equal to the current Series naId

Action: http://catalog.archives.gov/?action=search&f.parentNaId={DAS/naId}&ui.sw={DAS/naId}

Link: Search within this Series

Expected result: Searches for all File(s) and Item(s) which are descendants of the current Series.

Action: Searches for all File(s) and Item(s) which have parent[Level]/naId equal to the current record’s naId. The search parameters are then executed on only that subset of child records.

http://catalog.archives.gov/?action=search&f.ancestorNaIds={DAS/naId}&ui.sw={DAS/naId}

Link: online resource description

Expected result: Links to online-resource-url

Action: Links to {DAS/onlineResourceArray/onlineResource/url}

Link: Subjects Represented in the Archival Material(s):

Expected result: Links to the specific Authority record who is the Subject based on the subjectType and subjectId

Action: http://catalog.archives.gov/id/${DAS/organizationalReferenceArray/organization/naId} OR

http://catalog.archives.gov/id/${DAS/personalReferenceArray/person/naId} OR

http://catalog.archives.gov/id/${DAS/descriptionReferenceArray/descriptionReference/naId} OR

http://catalog.archives.gov/id/${DAS/geographicReferenceArray/geographicPlaceName/naId}

Link: Contributors to Authorship and/or Production of the Archival Material(s):

Expected result: Links to the specific Authority record who is the Contributor based on the contributorRecordType and contributorId

Action: http://catalog.archives.gov/id/${DAS/organizationalContributorArray/organizationName/naId} OR

http://catalog.archives.gov/id/${DAS/personalContributorArray/person/naId} OR

http://catalog.archives.gov/id/${DAS/archivalDescriptionsContributorArray/descriptionReference/naId}
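As an illustration of the link actions in this table, the following Python sketch builds the three URL shapes from a naId. It assumes that the word "and" in the search URLs stands for the query-string separator `&`; the function names are hypothetical and chosen only for this example.

```python
from urllib.parse import urlencode

BASE_URL = "http://catalog.archives.gov"


def description_link(na_id: str) -> str:
    """Link to a description or authority record by naId (Creator, From, Subject rows)."""
    return f"{BASE_URL}/id/{na_id}"


def child_search_link(na_id: str) -> str:
    """'Includes ... described in the catalog': search for direct children of the record."""
    query = urlencode({"action": "search", "f.parentNaId": na_id, "ui.sw": na_id})
    return f"{BASE_URL}/?{query}"


def search_within_link(na_id: str) -> str:
    """'Search within this Series': search across all descendants of the record."""
    query = urlencode({"action": "search", "f.ancestorNaIds": na_id, "ui.sw": na_id})
    return f"{BASE_URL}/?{query}"
```

The only difference between the two search links is the filter name: `f.parentNaId` restricts results to direct children, while `f.ancestorNaIds` matches any descendant.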

7.6 File Unit

{DAS/title}, {DAS/inclusiveDates/inclusiveStartDate + “-“ + DAS/inclusiveDates/inclusiveEndDate}1

${online-availability-header}

National Archives Identifier: {DAS/naId}

Local Identifier: {DAS/localIdentifier}

Creator(s) {for-each: CO = DAS/creatingIndividualArray/creatingIndividual/ OR DAS/creatingOrganizationArray/creatingOrganization/ }

{CO/creator/termName} {(CO/creatorType/termName)}3

From: {DAS/parent[Level]/title}

if (DAS/variantControlNumberArray/variantControlNumber/variantControlNumberType/termName == “HMS/MLR Entry Number”) {

HMS Entry Number(s): {DAS/variantControlNumberArray/variantControlNumber/number}, {DAS/variantControlNumberArray/variantControlNumber/number} (…)2

}

{for-each: ANCESTOR = DAS/parent[Level]/parent[Level]}{ANCESTOR/title}

Details

Level of Description: File Unit

Type(s) of Archival Materials:

{for-each:GT = DAS/generalRecordsTypeArray/generalRecordsType}{GT/general-records-type-desc}

The creator compiled or maintained the series between: {DAS/inclusiveDates/inclusiveStartDate} - {DAS/inclusiveDates/inclusiveEndDate}

Note: date qualifiers such as “ca” shall be included

This file documents the time period: {DAS/coverageDates/coverageStartDate} - {DAS/coverageDates/coverageEndDate}

Note: date qualifiers such as “ca” shall be included

Date Note: {DAS/dateNote}

These records document the following Congresses: {DAS/beginCongress/termName} – {DAS/endCongress/termName}

Includes {count(DAS/containsOrderArray)} item(s) described in the catalog

Other Title(s): {for-each: OT = DAS/otherTitleArray/otherTitle/title}{OT}

Scale Note: {DAS/scaleNote}

Numbering Note: {DAS/numberingNote}

General Note(s): {for-each: GN = DAS/generalNoteArray/generalNote}{GN}

Arrangement: {DAS/arrangement}

Access Restriction(s):

{DAS/accessRestriction/status/termName}

Specific Access Restriction: {for-each:SAR = DAS/accessRestriction/specificAccessRestrictionArray/specificAccessRestriction}4

{SAR/restriction}, {SAR/restriction} {SAR/securityClassification}, {SAR/securityClassification}

Note: {DAS/accessRestriction/accessRestrictionNote}4

Use Restriction(s): {DAS/useRestriction/status/termName}

Specific Use Restriction: {for-each:SUR = DAS/useRestriction/specificUseRestrictionArray/specificUseRestriction}4

{SUR/restriction}, {SUR/restriction}

Note: {DAS/useRestriction/note}4

Custodial History: {DAS/custodialHistoryNote}

Transfer Information: {DAS/transferNote}

Edited: {DAS/editStatus/termName}

Sound Type: {DAS/soundType/termName}

Language(s): {for-each: LG = DAS/languageArray/language/termName}{LG}

{for-each: FA = DAS/findingAidArray/findingAid }

Finding Aid Type: {FA/type}

Finding Aid Note: {FA/note}

Finding Aid URL: {FA/url/termName}

Finding Aid Source: {FA/source}

Accession Number(s):

{for-each: AN = DAS/accessionNumberArray/accessionNumber/number}{AN}

Records Center Transfer Number(s):

{for-each: RCTN = DAS/recordsCenterTransferArray/recordsCenterTransferNumber}

{RCTN}

Internal Transfer Number(s):

{ for-each: ITN = DAS/internalTransferNumberArray/internalTransferNumber/number}

{ITN}

Microform Publication(s):

{ for-each: MP = DAS/microformPublicationArray/microformPublication}{MP/identifier}{MP/note}{MP/title}

Online Resource(s): { for-each: OR = DAS/onlineResourceArray/onlineResource}{OR/description}{OR/note}

Subjects Represented in the Archival Material(s): {for-each: SR =

DAS/personalReferenceArray/person OR

DAS/organizationalReferenceArray/organizationName OR

DAS/geographicReferenceArray/geographicPlaceName OR

DAS/descriptionReferenceArray/specificRecordsType OR

DAS/descriptionReferenceArray/TopicalSubject}{SR/termName}

Contributors to Authorship and/or Production of the Archival Material(s):

{for-each: CR = DAS/organizationalContributorArray/organizationalContributor OR DAS/personalContributorArray/personalContributor OR DAS/archivalDescriptionsContributorArray/descriptionReference}

{CR/termName or CR/title}

Former Record Group(s):

{for-each: FRG = DAS/formerRecordGroupArray/recordGroup}{FRG/recordGroupNumber}

Former Collection(s): {for-each: FCL = DAS/formerCollectionArray/collection}{FCL/collectionIdentifier}

Scope & Content

{DAS/scopeAndContentNote}

Variant Control Numbers

ARC Identifier: {DAS/naId}

{for-each: VAR = DAS/variantControlNumberArray/variantControlNumber

Repeat the following row}

{VAR/variant-type}: {VAR/number}{VAR/note}

Archived Copies

{for-each: PO = DAS/physicalOccurrenceArray/fileUnitPhysicalOccurrence}

Copy N5: {PO/copyStatus/termName}

Extent (Size): {PO/extent}

Physical Occurrence Note: {PO/note}

Contact(s): {for-each: RU = DAS/physicalOccurrenceArray/fileUnitPhysicalOccurrence/referenceUnitArray/referenceUnit Repeat the following lines:}

{RU/termName} {(RU/mailcode)}3, {RU/address1}1

{RU/address2}{RU/city}, {RU/state} {RU/postcode}

Phone: {RU/phone}4

Fax: {RU/fax}4

Email: {RU/email}4

Count: {for-each: HM = PO/holdings-measurements/holdings-measurement}{HM/measurement-count}{HM/measurement-type}

Copy N5 Media Information: {for-each: MEDIA = PO/media-occurrences/media-occurrence Repeat the following lines}

Specific Media Type: {MEDIA/mediaType/termName}4

Color: {MEDIA/color}4

Container ID: {MEDIA/containerId}4

Dimension: {MEDIA/dimension}4

Height: {MEDIA/height}4

Width: {MEDIA/width}4

Depth: {MEDIA/depth}4

Media Occurrence Note: {MEDIA/mediaOccurrenceNote}4

Physical Restriction Note: {MEDIA/physicalRestrictionNote}4

Piece Count: {MEDIA/pieceCount}4

Process: {MEDIA/process}4

Reproduction Count: {MEDIA/reproductionCount}4

Technical Access Requirements Note: {MEDIA/technicalAccessNote}4

Container List

{DAS/physicalOccurrenceArray/fileUnitPhysicalOccurrence/containerList}

Notes:

1. If field does not exist, remove the preceding comma

2. If there are fewer than two variant control numbers, remove the ellipsis and the enclosing parentheses; only variant control numbers whose mlr attribute is “true” will be counted.

3. If field does not exist, remove the enclosing parentheses

4. If field does not exist, remove entire line including the preceding label

5. The value N should start with 1 and increment by 1 per each occurrence

Display Rules for HMS Entry Number(s)

a. If there is an HMS/MLR Entry number, display it

b. If there is more than one HMS/MLR Entry number, display the first two HMS/MLR Entry numbers, followed by an ellipsis (…)

c. If there is no HMS/MLR Entry number at the file unit or item level, display the parent HMS/MLR Entry number(s) as described in a and b above.
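A minimal Python sketch of rules a to c follows. It assumes the caller has already filtered the variant control numbers down to HMS/MLR Entry numbers (those counted under the mlr rule in the notes); `hms_entry_display` is a hypothetical helper name, and the rules are applied exactly as written, so two or more numbers always produce the ellipsis.

```python
from typing import Sequence


def hms_entry_display(own_numbers: Sequence[str],
                      parent_numbers: Sequence[str] = ()) -> str:
    """Format the 'HMS Entry Number(s)' value for a File Unit or Item."""
    # Rule c: fall back to the parent's numbers when the record has none.
    numbers = list(own_numbers) if own_numbers else list(parent_numbers)
    if not numbers:
        return ""  # nothing to display at either level
    if len(numbers) == 1:
        return numbers[0]  # rule a: a single number is shown as-is
    # Rule b: show the first two numbers followed by an ellipsis.
    return f"{numbers[0]}, {numbers[1]} (…)"
```

An empty return value signals that the whole "HMS Entry Number(s)" line should be omitted.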

7.6.1 File Unit Link Table

The following table defines the action when the user clicks a link in the full result page of a File Unit record.

Link: Creator(s):

Expected Result: Links to the specific Authority record who is the Creator based on the creator-record-type and creator-id

Action: Links to the specific Authority record who is the Creator based on creator/naId and creator type

Link: From:

Expected Result: Links to the description whose naId is equal to the ancestor naId.

Action: {for-each: ANCESTOR = DAS/parent[Level].. } http://catalog.archives.gov/description/ANCESTOR/naId

Link: Includes <item-count> item(s) described in the catalog

Expected Result: Search for all Item descriptions with parent naId equal to the current File Unit naId

Action: http://catalog.archives.gov/?action=search&f.parentNaId={DAS/naId}&ui.sw={DAS/naId}

Link: Search within this File

Expected Result: Searches for all Item(s) which are descendants of the current File.

Action: Searches for all Item(s) which have ancestor naId equal to the current record’s naId

http://catalog.archives.gov/?action=search&f.ancestorNaIds={DAS/naId}&ui.sw={DAS/naId}

Link: online resource description

Expected Result: Links to online resource url

Action: Links to {DAS/onlineResourceArray/onlineResource/url}

Link: Subjects Represented in the Archival Material(s):

Expected Result: Links to the specific Authority record who is the Subject based on the subject type and subject id

Action: http://catalog.archives.gov/id/${DAS/organizationalReferenceArray/organization/naId} OR

http://catalog.archives.gov/id/${DAS/personalReferenceArray/person/naId} OR

http://catalog.archives.gov/id/${DAS/descriptionReferenceArray/descriptionReference/naId} OR

http://catalog.archives.gov/id/${DAS/geographicReferenceArray/geographicPlaceName/naId} OR

http://catalog.archives.gov/id/${DAS/topicalSubjectArray/topicalSubject/naId} OR

http://catalog.archives.gov/id/${specificRecordsTypeArray/specificRecordsType/naId}

Link: Contributors to Authorship and/or Production of the Archival Material(s):

Expected Result: Links to the specific Authority record who is the Contributor based on the contributor record type and contributor id

Action: http://catalog.archives.gov/id/${DAS/organizationalContributorArray/organizationName/naId} OR

http://catalog.archives.gov/id/${DAS/personalContributorArray/person/naId} OR

http://catalog.archives.gov/id/${DAS/archivalDescriptionsContributorArray/descriptionReference/naId}

7.6.2 Electronic Records: Download Display Identifier

Some digital objects associated with File Unit descriptions are born-Electronic Records. Some of these are:

DAS/variantControlNumberArray/variantControlNumber/number=DDI

The layout for these File Units is identical to that outlined in section 7.6 above, with the addition of a list of associated objects presented below the title and above the rest of the record metadata.

{DAS/title}, {DAS/inclusiveDates/inclusiveStartDate + “-“ + DAS/inclusiveDates/inclusiveEndDate}1

{total number of objects} files available

Technical Documentation *

{for-each object where objects/object/designator="Technical Documentation": Repeat the following lines in numbered/ordered list format based on the value in objects/object@objectSortNum}

#. {objects/object/display}** {objects/object/description} ***

({objects/object/file@name}, {objects/object/file@mime}, {objects/object/technicalMetadata/file} ****)*****

Electronic Records *

{for-each object where objects/object/designator="Electronic Records": Repeat the following lines in numbered/ordered list format based on the value in objects/object@objectSortNum}

#. {objects/object/display}** {objects/object/description} ***

({objects/object/file@name}, {objects/object/file@mime}, {objects/object/technicalMetadata/file} ****)*****

* Use {designator} to determine whether the object gets displayed under the “Electronic Records” section label or the “Technical Documentation” section label. If there is no {designator} value or any other value, do NOT display a section label, and just list the files.

** To determine whether the object is available for display and download, or download only, follow these rules:

If objects/object/display=Y or [null], display View/Download.

If objects/object/display=N, display Download.

*** The link applied to this line is found in objects/object/file@url.

**** File size presented in the data is in bytes. This should be converted to KB, MB, or GB and displayed in the UI to the right of the file size.

***** In cases where there are more than 10 files for any given section, display a “show all files” link. When clicked, all files are displayed in the UI. Until the link is clicked, only the top 10 file links are displayed.
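The footnoted rules for the file list can be sketched in Python as follows. This is an illustration under assumptions rather than the Catalog implementation: the helper names are hypothetical, binary units (1 KB = 1024 bytes) and one decimal place are assumed for the size conversion, any `display` value other than "N" is treated like "Y", and each file is assumed to be a dictionary carrying the `@objectSortNum` key as a numeric string.

```python
from typing import List, Mapping, Optional, Sequence

MAX_VISIBLE_FILES = 10  # threshold from the "show all files" rule


def action_label(display: Optional[str]) -> str:
    """Rule **: 'N' means download only; 'Y' or a missing value also allows viewing."""
    return "Download" if display == "N" else "View/Download"


def format_file_size(num_bytes: int) -> str:
    """Rule ****: convert a byte count to KB, MB, or GB."""
    size = num_bytes / 1024  # assumption: binary units
    for unit in ("KB", "MB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} GB"


def visible_files(files: Sequence[Mapping[str, str]],
                  show_all: bool = False) -> List[Mapping[str, str]]:
    """Order files by @objectSortNum and apply the top-10 limit (rule *****)."""
    ordered = sorted(files, key=lambda f: int(f["@objectSortNum"]))
    return ordered if show_all else ordered[:MAX_VISIBLE_FILES]


def needs_show_all_link(files: Sequence[Mapping[str, str]]) -> bool:
    """The 'show all files' link appears only when a section has more than 10 files."""
    return len(files) > MAX_VISIBLE_FILES
```

Sorting on the integer value of `@objectSortNum` avoids the string ordering problem where "10" would sort before "9".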

(Record metadata sections begin at this point; please refer to 7.6 for all fields.)

7.7 Item

{DAS/title}: {DAS/subtitle}1, {DAS/inclusiveDates/inclusiveStartDate + “-“ + DAS/inclusiveDates/inclusiveEndDate}2

${online-availability-header}

National Archives Identifier: {DAS/naId}

Local Identifier: {DAS/localIdentifier}

Creator(s) {for-each: CO = DAS/creatingIndividualArray/creatingIndividual/ OR DAS/creatingOrganizationArray/creatingOrganization/ }{CO/creator/termName}{(CO/creatorType/termName)}4

From: {DAS/parent[Level]/title}

if (DAS/variantControlNumberArray/variantControlNumber/variantControlNumberType/termName == “HMS/MLR Entry Number”) {

HMS Entry Number(s): {DAS/variantControlNumberArray/variantControlNumber/number}, {DAS/variantControlNumberArray/variantControlNumber/number}2 (…)3

}

{for-each: ANCESTOR = DAS/parent[Level]/parent[Level]…}{ANCESTOR/title}

Note: For the header, follow the rules below:

1) If there is a production date, this should display.

2) If there are coverage dates, this should display.

3) If there are inclusive dates, this should display.

Details

Level of Description: Item

Type(s) of Archival Materials:

{for-each:GT = DAS/generalRecordsTypeArray/generalRecordsType/}{GT/general-records-type-desc}

This item was broadcast:

{for-each:BCD = DAS/broadcastDateArray/proposableQualifiableDate}{BCD}

This item’s copyright was established:

{for-each:CRD = DAS/copyrightDateArray/proposableQualifiableDate }{CRD}

This item was produced or created:

{for-each:PDD = DAS/productionDateArray/proposableQualifiableDate }{PDD}

This item was released:

{for-each:RLD = DAS/releaseDateArray/proposableQualifiableDate }{RLD}

The creator compiled or maintained the series between: {DAS/inclusiveDates/inclusiveStartDate} - {DAS/inclusiveDates/inclusiveEndDate}

Note: date qualifiers such as “ca” shall be included

This item documents the time period: {DAS/coverageDates/coverageStartDate} - {DAS/coverageDates/coverageEndDate}

Note: date qualifiers such as “ca” shall be included

Date Note: {DAS/dateNote}

These records document the following Congresses: {DAS/beginCongress/termName} – {DAS/endCongress/termName}

Other Title(s): {for-each: OT = DAS/otherTitleArray/otherTitle/title}{OT}

Production Series: {for-each: PST = DAS/productionSeriesTitle}

Title: {PST}5

{for-each: PSST = DAS/productionSeriesSubtitle}

Subtitle: {PSST}5

{for-each: PSN = DAS/productionSeriesNumber}

Number: {PSN}5

Scale Note: {DAS/scaleNote}

General Note(s): {for-each: GN = DAS/generalNoteArray/generalNote}{GN}

Access Restriction(s):

{DAS/accessRestriction/status/termName}

Specific Access Restriction: {for-each:SAR = DAS/accessRestriction/specificAccessRestrictionArray/specificAccessRestriction} {SAR/restriction}5, {SAR/restriction}2

{SAR/securityClassification}5, {SAR/securityClassification}2

Note: {DAS/accessRestriction/accessRestrictionNote}5

Use Restriction(s): {DAS/useRestriction/status/termName}

Specific Use Restriction: {for-each:SUR = DAS/useRestriction/specificUseRestrictionArray/specificUseRestriction} {SUR/restriction}, {SUR/restriction}2

Note: {DAS/useRestriction/note}5

Custodial History: {DAS/custodialHistoryNote}

Transfer Information: {DAS/transferNote}

Edited: {DAS/editStatus/termName}

Sound Type: {DAS/soundType/termName}

Language(s): {for-each: LG = DAS/languageArray/language/termName}{LG}

{for-each: FA = DAS/findingAidArray/findingAid Repeat the following 3 rows}

Finding Aid Type: {FA/type}

Finding Aid Note: {FA/note}

Finding Aid URL: {FA/url/termName}

Finding Aid Source: {FA/source}

Accession Number(s):

{for-each: AN = DAS/accessionNumberArray/accessionNumber/number}{AN}

Disposition Authority Number(s):

{ for-each: DAN = DAS/dispositionAuthorityNumberArray/dispositionAuthorityNumber/number}

{DAN}7

Records Center Transfer Number(s):

{for-each: RCTN = DAS/recordsCenterTransferArray/recordsCenterTransferNumber}

{RCTN}

Internal Transfer Number(s):

{ for-each: ITN = DAS/internalTransferNumberArray/internalTransferNumber/number}

{ITN}

Microform Publication(s):

{for-each: MP = DAS/microformPublicationArray/microformPublication}{MP/identifier}{MP/note}{MP/title}

Online Resource(s): { for-each: OR = DAS/onlineResourceArray/onlineResource}{OR/description}{OR/note}

Subjects Represented in the Archival Material(s): {for-each: SR =

DAS/personalReferenceArray/person OR

DAS/organizationalReferenceArray/organizationName OR

DAS/geographicReferenceArray/geographicPlaceName OR

DAS/descriptionReferenceArray/specificRecordsType OR

DAS/descriptionReferenceArray/TopicalSubject}{SR/termName}

Contributors to Authorship and/or Production of the Archival Material(s):

{for-each: CR = DAS/organizationalContributorArray/organizationalContributor OR DAS/personalContributorArray/personalContributor OR DAS/archivalDescriptionsContributorArray/descriptionReference}

{CR/termName or CR/title}

Former Record Group(s):

{for-each: FRG = DAS/formerRecordGroupArray/recordGroup}{FRG/recordGroupNumber}

Former Collection(s): {for-each: FCL = DAS/formerCollectionArray/collection}{FCL/collectionIdentifier}

Scope & Content

{DAS/scopeAndContentNote}

Variant Control Numbers

ARC Identifier: {DAS/naId}

{for-each: VAR = DAS/variantControlNumberArray/variantControlNumber

Repeat the following row}

{VAR/type/termName}: {VAR/number}{VAR/note}

Archived Copies

{for-each: PO = DAS/physicalOccurrenceArray/itemPhysicalOccurrence OR DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence}

Copy N6: {PO/copyStatus/termName}

Extent (size): {PO/extent}

Physical Occurrence Note: {PO/note}

Contact(s): {for-each: RU = DAS/physicalOccurrenceArray/itemPhysicalOccurrence/referenceUnitArray/referenceUnit OR DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/referenceUnitArray/referenceUnit Repeat the following lines:}

{RU/termName} {(RU/mailcode)}4, {RU/address1}2

{RU/address2}{RU/city}, {RU/state} {RU/postcode}

Phone: {RU/phone}5

Fax: {RU/fax}5

Email: {RU/email}5

Total Footage: {DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/totalFootage}

Total Running Time: {DAS/physicalOccurrenceArray/itemAvPhysicalOccurrence/totalRunningTime}

Copy N6 Media Information: {for-each: MEDIA = PO/media-occurrences/media-occurrence Repeat the following lines}

Specific Media Type: {MEDIA/mediaType/termName}5

Color: {MEDIA/color}5

Container ID: {MEDIA/containerId}5

Dimension: {MEDIA/dimension}5

Height: {MEDIA/height}5

Width: {MEDIA/width}5

Depth: {MEDIA/depth}5

Media Occurrence Note: {MEDIA/mediaOccurrenceNote}5

Physical Restriction Note: {MEDIA/physicalRestrictionNote}5

Piece Count: {MEDIA/pieceCount}5

Process: {MEDIA/process}5

Reproduction Count: {MEDIA/reproductionCount}5

Technical Access Requirements Note: {MEDIA/technicalAccessNote}5

Height: {MEDIA/height}5

Emulsion: {MEDIA/emulsion}5

Footage: {MEDIA/footage}5

Format: {MEDIA/format}5

Record Speed: {MEDIA/recordingSpeed/termName}5

Reel/Tape/Disc Number: {MEDIA/reelTapeDiscNumber}5

Element Number: {MEDIA/elementNumber}5

Roll: {MEDIA/rollType}5

Running Time: {MEDIA/runningTime}5

Soundtrack Configuration: {MEDIA/soundtrackConfig}5

Soundtrack Language: {MEDIA/soundtrackLang}5

Tape Thickness: {MEDIA/tapeThickness}5

Wind: {MEDIA/wind}5

Shot List

{DAS/shotlist}

Notes:

1. If field does not exist, remove the preceding colon

2. If field does not exist, remove the preceding comma

3. If there are fewer than two variant control numbers, remove the ellipsis and the enclosing parentheses; only variant control numbers whose mlr attribute is “true” will be counted.

4. If field does not exist, remove the enclosing parentheses

5. If field does not exist, remove entire line including the preceding label

6. The value N should start with 1 and increment by 1 per each occurrence

7. If item is Unrestricted Electronic Records

Display Rules for HMS Entry Number(s)

a. If there is an HMS/MLR Entry number, display it

b. If there is more than one HMS/MLR Entry number, display the first two HMS/MLR Entry numbers, followed by an ellipsis (…)

c. If there is no HMS/MLR Entry number at the file unit or item level, display the parent HMS/MLR Entry number(s) as described in a and b above.

7.7.1 Item Link Table

The following table defines the action when the user clicks a link in the full result page of an Item record.

Link: Creator(s):

Expected Result: Links to the specific Authority record who is the Creator based on the creator type and creator-id

Action: http://catalog.archives.gov/id/${DAS/creatingOrganizationArray/creatingOrganization/creator/naId} OR

http://catalog.archives.gov/id/${DAS/creatingIndividualArray/creatingIndividual/creator/naId}

Link: From:

Expected Result: Links to the description whose naId is equal to the ancestor naId

Action: {for-each: ANCESTOR = DAS/parent[Level].. } http://catalog.archives.gov/id/ANCESTOR/naId

Link: online resource description

Expected Result: Links to online resource url

Action: Links to {DAS/onlineResourceArray/onlineResource/url}

Link: Subjects Represented in the Archival Material(s):

Expected Result: Links to the specific Authority record who is the Subject based on the subject type and subject id

Action: http://catalog.archives.gov/id/${DAS/organizationalReferenceArray/organization/naId} OR

http://catalog.archives.gov/id/${DAS/personalReferenceArray/person/naId} OR

http://catalog.archives.gov/id/${DAS/descriptionReferenceArray/descriptionReference/naId} OR

http://catalog.archives.gov/id/${DAS/geographicReferenceArray/geographicPlaceName/naId} OR

http://catalog.archives.gov/topical-subject/${DAS/topicalSubjectArray/topicalSubject/naId} OR

http://catalog.archives.gov/specific-records-type/${specificRecordsTypeArray/specificRecordsType/naId}

Link: Contributors to Authorship and/or Production of the Archival Material(s):

Expected Result: Links to the specific Authority record who is the Contributor based on the contributor record type and contributor-id

Action: http://catalog.archives.gov/id/${DAS/organizationalContributorArray/organizationName/naId} OR

http://catalog.archives.gov/id/${DAS/personalContributorArray/person/naId} OR

http://catalog.archives.gov/id/${DAS/archivalDescriptionsContributorArray/descriptionReference/naId}

8 Object Metadata Presentation

This section covers the UI presentation of object metadata, Object Designator and Object Description.

When an object (types include: image, audio file, video, or PDF) has this data, the Object Designator and Object Description are displayed in a dedicated row below the object viewer and above the object tools, on the left side.

Note: If an object of the above types is presented in a UI that does not include a viewer, e.g., a File Unit with Electronic Records/Technical documentation, this data isn’t displayed.

These two values are concatenated, separated by a comma and space.

{DAS/object/designator}, {DAS/object/description}

The UI should display exactly what is returned by the API/data; for example:

"num": "5",
"type": "object",
"naId": "305300",
"objects": {
    "@created": "2015-02-26T12:00:22Z",
    "@version": "OPA-OBJECTS-1.0",
    "object": {
        "@id": "15021316",
        "@objectSortNum": "9",
        "description": "Forrest, Franklin, George, Greene, Grenada and Hancock Count",
        "designator": "5",

If the object, regardless of what type it is, has this data, display as shown in the following mock-ups. If the object doesn’t have this data, the UI appears as it currently appears in PROD R1P1, i.e., no dedicated row.

All object viewers display this data in the Content Details page, Preview modal opened in the Results page by clicking on a thumbnail, and the Contributions Workspace.

As the user clicks on any navigation element to change from one object to another, the Object Designator/Description text dynamically updates to show the data, if available, for the newly selected object.

In cases where there is only one value, it is still displayed in the same position; there is no comma.

{DAS/object/designator}

OR

{DAS/object/description}

The UI should display exactly what is returned by the API/data; for example:

"num": "1",
"type": "object",
"naId": "305300",
"objects": {
    "@created": "2015-02-26T12:00:22Z",
    "@version": "OPA-OBJECTS-1.0",
    "object": {
        "@id": "15021307",
        "@objectSortNum": "1",
        "designator": "COVER",

[Mock-up: an object with Object Designator and Object Description (the Object Designator field value is 5 and the Object Description field value is Forrest, Franklin, George, Greene, Grenada and Hancock Count)]

[Mock-up: an object with Object Designator only (the Object Designator field value is COVER; there is no Object Description field/value)]

[Mock-up: an object without Object Designator and Object Description values (no dedicated row appears)]
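The three cases described in this section (both values, one value, no values) can be summarized in a short Python sketch. It assumes the object has already been parsed into a dictionary with optional `designator` and `description` keys, as in the API samples above; `object_caption` is a hypothetical name used only for illustration.

```python
from typing import Mapping, Optional


def object_caption(obj: Mapping[str, str]) -> Optional[str]:
    """Build the text for the dedicated row below the object viewer.

    Returns None when neither value is present, meaning no row is shown.
    """
    # Keep only the values that exist, designator first, exactly as returned.
    parts = [obj[key] for key in ("designator", "description") if obj.get(key)]
    # Two values are joined with a comma and a space; one value has no comma.
    return ", ".join(parts) if parts else None
```

Calling the function again whenever the user navigates to another object yields the dynamic update described above.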
