Digital Curation Plan ~LIS 889~ Lisa Roberts Fall 2016~ Digital Curation~ LIS 889 Dr. Stacy Kowalczyk



Digital Curation Plan

Abstract:

In contemporary society, institutions are dealing with an unprecedented amount, and continual modification, of data demanded by public and professional users. Information technology in institutions is in high demand to make sure that essential information is accessible, reliable, secure, quality driven, and trusted. Trusted sources are integrity driven, innovative, and opportunity based. Great opportunities are born in contemporary culture; therefore, collaborative efforts to build relationships with other institutions are vital to the long-term preservation of the born-digital e-collections presented here.

The goal is to share and sustain data across social cultures divided by economic disparities and class barriers. Sharing information with the Smithsonian Institution fits this description and will serve as a globally accessible educational template. Pairing a smaller collection with a larger establishment such as the Smithsonian will bridge the knowledge gap for research and scholarly work. Under-resourced institutions need advocacy-based organizations that can articulate the need to bridge the opportunity gap that continues to distress underserved communities. This will help alleviate the strain on smaller educational establishments that may not have access to information that is readily available in larger collections. Smaller institutions lack adequate economic resources, and more resources yield better results. The Smithsonian Institution is the ideal partner for this proposed blueprint.


Why digital curation is important:

Have you ever lost vital information on a portable device? Has one of your personal devices ever crashed, become corrupted, been lost, or suddenly vanished? Most people have experienced at least one of these unforgiving scenarios. Data loss can happen at any time and at any level, whether the data lives on a smartphone, a USB drive, a tablet, a cloud service, or a large online storage network. Digital curation is important: it establishes digital assets, maintains their preservation, and adds value to digital repositories for present and future access.

Drawback:

“Because of these technical dependencies, digital objects are by nature very fragile, often more at

risk of data loss and even sudden death than information recorded on brittle paper or nitrate film

stock.” (Susan Schreibman, 2004)

Keeping born-digital information accessible and reusable is a challenge that experts face constantly. The risk of data loss is extremely high when more than half of the world’s population uses smartphones and the internet. Because information changes constantly, it is essential that migrated information remain interchangeable, so that it can be crosswalked to other software and survive technological upgrades. The purpose of formulating a digital curation plan for born-digital data is to safeguard the information in an authentic and complete form.

The expertise of computer scientists, engineers, librarians, information technology specialists, archivists, and scholars is critical to ensuring that data remains usable when its original software and hardware are no longer supported. This expensive task requires relentless, persistent training to keep data manageable and flowing without interruption over time, so that it remains accessible for future use.

Benefits:

The benefit of a digital curation plan is that it guides the maintenance of specific data that librarians, scientists, scholars, and historians have invested a large amount of time in preserving. The goal of this project is to create an online archive database for the born-digital E-Science, Humanities, and social media collections that will be preserved. With supported software and hardware from a capable digital repository such as the Smithsonian, the fear of data becoming obsolete over time is no longer a severe threat. The objective is to provide open access to born-digital data for scholars, researchers, and other users. The digital curation plan for the three collections presented will be a joint effort with the Smithsonian Institution. The plethora of assets that already exist within the organization will secure proper management for the smaller collections.

Given the rapid growth of file formats and metadata schemas, the Smithsonian Institution is a well-established organization with ample resources to aid long-term preservation efforts. Please review the digital curation goals and assessments for each collection. Feedback is welcome.


E-Science Collection #1

The South Aegean Volcanic Arc

I. Introduction

Goals of Assessment: The objective is to create an active online archival database for the collection's users. The petrographic analysis database of pottery and other ceramic artifacts has been dormant. The goal is to help preserve the extensive collection of delicate ceramic thin sections that exist worldwide, and to build a relationship with a successful organization that can sustain long-term preservation of the collection.

Describe the Institution:

The Digital Curation Plan will build on the Smithsonian Institution’s track record of accomplishments. “The Smithsonian is the world’s largest museum, education, and research complex. The Smithsonian is a museum and research complex, comprised of 19 museums and galleries and the National Zoological Park. The total number of objects, works of art and specimens at the Smithsonian is estimated at nearly 137 million. The collections range from insects and meteorites to locomotives and spacecraft.” Sharing information with the Smithsonian fits this description and will serve as an educational component globally. Pairing a smaller collection with a larger establishment such as the Smithsonian will bridge the knowledge gap for research and scholarly work.

Describe the Collection:

The South Aegean Volcanic Arc (SAVA) website is a raw material database designed for comparative archaeological and geological research. The compilation holds embedded databases of scientific data that were strategically researched, collected, and assembled. The provenance data gathered for the artifacts consists of text, fieldwork studies, images, numbers, and animation. “The site is designed to be used intuitively by multidisciplinary researchers. It is hoped that the community-based approach will draw the viewer into both numerical and visual databases.” (Indiana University)

Describe how the collection fits within the scope of the institution: The collection will be hosted on the Smithsonian Global Volcanism Program page, together with the original link, so that researchers and scholars can reach information that aids the research process. The SAVA data is already collected on a website that is referred to as a database.

https://web.archive.org/web/20131107195639/http://www.indiana.edu/~sava/

With this foundation established, the SAVA collection database will support the progress and advancement of scholars, students, researchers, and future professionals. Online sharing will bridge the curation gap. http://volcano.si.edu/

Purpose:

The purpose of this collection is to identify environments of origin by collecting, sharing, preserving, analyzing, and integrating geological data for scholars, scientists, and specialists. Quantitative rock and clay-rich sediment data can be used to determine the provenance of artifactual material from the SAVA region.

II. Collection Inventory


The collection's inventory consists of numerical data, image data, maps, and collaborative contributions from specialists. The files are organized in folders that contain JPEG images, PowerPoint slides, and Microsoft Excel and Word documents.

Inventory Files:

Numerical data, Image data, Map data, Data Interpretation, PowerPoint Slides, Supplementary

Material Inventory Files and collaborators.

1. Numerical Data

Geochemical database files- inventoried in PowerPoint (PPT) (Findings: Nature of Aeginetan Ware Distribution: "Local Cultural Change Model")

Searchable database- page could not be found

Aegina Island geologic history files- data only; an additional link will be added for research

Geologic fieldwork files- animated interactive satellite images are accessible with the recommended flash drive download. Researchers can use the interactive research maps, which are available through the map application.

2. Image Data

Ceramic database files- Some of the ceramic images from the ceramic artifact database have not been integrated. The data is charted and categorized by time period, source, and reference points. Along with the artifact image, each entry comprises mineralogical, chemical, and archaeological data. “The mineral composites are data in the form of chronology, class, shape, fabric, provenance and reference samples.” (taken from website)

EXAMPLE OF DATASET

Database

Data Type

Ceramic Athens

Ceramic Halieis

Ceramic Kolonna

Ceramic Tsoungiza

Petrographic Aegina Kolonna Kolonna

Backscatter Images

Lerna Mercouri Asine

Maps

(General)

3. Map database files- Jpeg Format

Map database files- The map database files hold multiple samples that were collected from the arc's lava flows. The data is intended for, and accessible to, researchers. Database records hold itemized Kolonna, Halieis, Athens and Tsoungiza files.


Digital images for the seven sample sets from the Aeginetan Ware distribution list are linked to their respective geochemical analyses. Along with the artifact image, each entry comprises mineralogical and chemical data, archaeological data for the samples, elemental mineral composition, and references to petrography.

Petrographic database files- “The Petrographic data collected gives a detailed descriptions

of the thin-section collections for the sites of Lerna, Kolonna, Halieis, Tsoungiza, Athens,

Asine and Asea. In addition, a thin-section collections for the rocks and clay-rich sediments

which make up the SAVA raw material reference data for both samples and their reference

materials are available.” (Indiana University)

4. Data Interpretation

AW provenance report- PDF

5. PowerPoint Slides

Lerna House of Tiles, Source Clay for Aeginetan Ware, Ceramic Technology for Aeginetan

Ware, Overview for Distribution Problems, Local Cultural Change Model and Environmental

Change Model.

6. Supplementary Material Inventory Files

The inventory has four main file folders, each containing JPEG images of scientific data charts. There are six Microsoft Word documents, described as tables, and six Microsoft Excel spreadsheets. Some of the data is not accessible. Adding the original link as a tool will allow the data to be shared with the public.

a) Tables (Kolonna, Halieis, Athens, and Tsoungiza, itemized in JPEG format)

b) Individual Na-K plots

c) Geochemical data files

d) Protocols for data manipulation (laboratory notes and/or notebooks that document specific data files; the documentation describes all columns, units, abbreviations, and missing-value identifiers)

7. Collaborators

Dr. J. BROPHY, Dr. C. SHRINER, Dr. G. CHRISTIDIS, Dr. H. MURRAY, Dr. C. LI, Dr. A.

SCHIMMELMANN and Dr. E. RIPLEY.

Determine the file type:

The SAVA online database files are inventoried and cataloged in folder directories that contain scientific data archived as JPEG images, Microsoft Word documents, and Microsoft Excel spreadsheets.
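A file-type inventory like the one described above could be tallied with a short script. The following is only a minimal sketch: the extension-to-format table and the function name `tally_formats` are assumptions made for illustration, and the plan itself does not say how the inventory was produced.

```python
from collections import Counter
from pathlib import Path

# Hypothetical mapping from file extensions to the formats named in the plan.
FORMAT_LABELS = {
    ".jpg": "JPEG image",
    ".jpeg": "JPEG image",
    ".doc": "Microsoft Word document",
    ".docx": "Microsoft Word document",
    ".xls": "Microsoft Excel spreadsheet",
    ".xlsx": "Microsoft Excel spreadsheet",
    ".ppt": "PowerPoint slides",
    ".pptx": "PowerPoint slides",
    ".pdf": "PDF",
}


def tally_formats(root):
    """Count the files under root by format label.

    Extensions missing from FORMAT_LABELS are reported as-is, so that
    unexpected formats show up in the inventory instead of being hidden.
    """
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            ext = path.suffix.lower()
            counts[FORMAT_LABELS.get(ext, ext or "(no extension)")] += 1
    return counts
```

Calling `tally_formats` on a local copy of the collection folders returns a counter keyed by format, which can be compared against the inventory list before any files are transferred.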

Describe the significant properties of the data:

The significant property of the collection is the historical context within the data. The artifact images are accompanied by mineralogical, chemical, and archaeological data. The collection provides sample sets of “presumed Aeginetan ware excavated throughout the Aegean and mainland Greece.” (Indiana University)

III. Curation Plan


What data will be archived? All of the data will be archived. It fits the scope of the already established volcanism website, and the original link will serve as an additional source for reference.

Rationale and criteria for the decisions- Choosing the Smithsonian museums as the focus of the digital curation plan seemed logical because the institution caters to all disciplines. The Smithsonian not only serves visiting patrons but also has a strong online presence that provides straightforward access to its many collections. The website interface offers broad navigational opportunities for its global users. Choosing the Smithsonian will add value to the collection and provide vital information for researchers, educators, and knowledge seekers. The Smithsonian is a trusted organization that prides itself on quality curation and preservation throughout the lifecycles of its collections.

What preservation strategy will be employed?

How will the data be archived? The data will be archived using hardware/software dependency:

Description and organization

IV. Metadata

What metadata already exist:

Contextual- The collection's pre-existing metadata is descriptive and structural. It summarizes, in textual and technical terms, information about the data in the collection. The collection is categorized in an online database by contextual data, numerical data, image data, and topographical data. “Artifact images are documented with mineralogical, chemical and archaeological data.” The existing data is organized into databases and datasets that are described by provenance, fabric, location, type, shape, references, and elements. Some of the information has not been analyzed. These products provide access to a variety of environmental data through online map applications that allow users to select and view data based on geographic parameters.

What standards will be used?

The International Organization for Standardization (ISO) 19115 geospatial metadata standard will be used because it is common and widely shared. Records will be stored in XML, which gives a more complete file. Together, the standard and the XML format will provide compatibility. ISO is an independent, nongovernmental membership organization that develops voluntary standards; it also published the Open Archival Information System (OAIS) reference model. The metadata is text based and documented.

How will the metadata be created and/or enhanced?

The metadata will be created with online XML editor software. The software lists the allowable elements, prompts for them, and validates the results, so the descriptive information about the added datasets can support greater use, discovery, and navigation when metadata is shared.
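To make the idea of an ISO 19115 record in XML more concrete, the sketch below builds a small fragment with Python's standard library. It is an illustration under stated assumptions, not the record an XML editor would produce. The namespaces and element names follow the ISO 19139 XML encoding of ISO 19115, the record is deliberately incomplete and would not pass schema validation, and the identifier `sava-0001` and the helper names are hypothetical. The empty-field check only stands in for the prompting and validation that the editor software performs.

```python
import xml.etree.ElementTree as ET

# Namespaces of the ISO 19139 XML encoding of ISO 19115.
GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)


def text_element(parent, tag, value):
    """Append <gmd:tag><gco:CharacterString>value</gco:CharacterString></gmd:tag>."""
    wrapper = ET.SubElement(parent, f"{{{GMD}}}{tag}")
    ET.SubElement(wrapper, f"{{{GCO}}}CharacterString").text = value
    return wrapper


def build_record(identifier, title, abstract):
    """Build a partial metadata record (illustrative, not schema-complete)."""
    root = ET.Element(f"{{{GMD}}}MD_Metadata")
    text_element(root, "fileIdentifier", identifier)
    info = ET.SubElement(root, f"{{{GMD}}}identificationInfo")
    data_id = ET.SubElement(info, f"{{{GMD}}}MD_DataIdentification")
    citation = ET.SubElement(data_id, f"{{{GMD}}}citation")
    ci_citation = ET.SubElement(citation, f"{{{GMD}}}CI_Citation")
    text_element(ci_citation, "title", title)
    text_element(data_id, "abstract", abstract)
    return root


def empty_fields(record):
    """List the names of text elements whose CharacterString is empty."""
    empty = []
    for element in record.iter():
        value = element.find(f"{{{GCO}}}CharacterString")
        if value is not None and not (value.text or "").strip():
            empty.append(element.tag.split("}", 1)[1])
    return empty
```

A record built with an empty abstract would be flagged by `empty_fields`, which is the kind of feedback a curator would want before the metadata is shared.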

V. Discovery

How will patrons access this material?

With advanced technology, patrons will have unlimited access to the Smithsonian's online digital collection on the National Museum of Natural History Global Volcanism Program (GVP) website. The Smithsonian’s digital collections for global volcanism include reports, bulletins, databases, galleries, and research activities. The SAVA database, as an E-Science collection, would fit within the scope of the institution's guidelines. The GVP is housed in the Department of Mineral Sciences, National Museum of Natural History. http://volcano.si.edu/

Updated technologies give patrons the opportunity to access material through online tools. The Smithsonian uses new digital technologies to maximize preservation efforts in a fast-changing environment.

VI. Staffing- The National Museum of Natural History Global Volcanism Program

Curatorial: 1 preservation specialist

Technical: 5 librarian technicians and 1 staff member for IT support

Management: Head of the natural and physical sciences department, 1 assistant department head

Clerical: 2 assistants


Digital Humanities Collection #2

The Owens Family

A researcher has developed a collection of digitized original documents about the Owens family. This prominent historical family settled New Harmony, a utopian community in southern Indiana, in the early 1800s. The community was known as the Athens on the Wabash for its support of the arts and sciences and was a major contributor to 19th century knowledge and culture. Now that his research on this collection is complete, and the resulting papers and books have been written and published, the researcher wishes to contribute all of his research materials to an archive for long-term curation. The data is divided into two directories.

Describe the Collection

I. Introduction

Goals of the assessment- The goal is to create an online database for long-term preservation. The challenge will be maintaining the original data as the infrastructure in which it was created changes. Data loss can be an issue; the risk can be reduced by keeping backups that are maintained as technology changes.

Describe your institution- The Digital Curation Plan will build on the Smithsonian's track record of success. “The Smithsonian is the world’s largest museum, education, and research complex. The Smithsonian is a museum and research complex, comprised of 19 museums and galleries and the National Zoological Park. The total number of objects, works of art and specimens at the Smithsonian is estimated at nearly 137 million. The collections range from insects and meteorites to locomotives and spacecraft.” Sharing information with the Smithsonian Institution fits this description and will serve as an educational component globally. Pairing a smaller collection with a larger organization such as the Smithsonian will bridge the knowledge gap for research and scholarly work.

Describe the collection- The collection holds files from the Owens family, who were settlers in the early 19th century; the records date back as early as 1825. They settled in the community of New Harmony in southern Indiana. The Owens strove to create a utopian society by bringing together the best professionals, and they were major contributors to culture, the arts, and the sciences in this community. The collection holds a variety of business and personal papers, ledgers, correspondence, journals, diaries, and other records belonging to the Owens family. The collected works are categorized into 2 directories.

Describe how the collections fit within the scope of the institution- “The Smithsonian's collections represent our nation's rich heritage,” and the Smithsonian Institution strives to improve the user experience and strengthen its mission of providing access to information. The Smithsonian's digital curation management and research team actively takes steps to maintain integrity, data assurance, and high-quality software, to name a few. “Some data cannot be replaced if destroyed or lost,” so digital curation and data preservation are ongoing processes that require best practices for institutions and researchers to manage and retain their research data. The National Museum of American History Library is the ideal place for the long-term preservation of the Owens Family Digital Collection.

II. Collection Inventory – categorized into 2 directories.


o Inventory files- Directory one holds email, New Harmony, and New Harmony people files. They are preserved in GIF, JPEG, and PNG file formats.

Chart EXAMPLE

Data | Data Type | Format/Duration/Size | Planned Access
Numerical | Digital image | JPEG | Open access via CBS website (bristol.ac.uk) & JORUM
Harmony Society | Manuscript | 15 page docx |
Correspondence_List | | Xlsx, excel |
History of New Harmony | | 10 pages |
History_of_anarchism | | 10 pages docx |
In_Log_Cabin | Digital Image | jpg |
Inventory | | txt |
New harmony | | .html |
nh_people | | .html |
NH_1834_Map | Map | Jpg |
Notes.Branigan | | .docx |
Old George Walker | Digital Image | Jpeg |
Original House | Digital Image | Jpeg |
Rapp Owan Granary | Digital Image | Jpeg |

o Directory two is stored in a zip file that contains two folders: MACOSX and Owens_Correspondence. The MACOSX folder is not accessible without updated software; even with updated software (Adobe), the contents of its 184 files are not retrievable. The Owens_Correspondence directory holds 74 folders, and metadata exists for each record. Each folder contains individual JPEG images of original scanned documents, one JPEG file per page. For example, one folder holds a journal of 184 pages, each stored as a single JPEG file. The folders range from 1 JPEG file up to 180 JPEG records per folder.

o Determine the file types- The files contain digitized still images. The images were made from original tangible documents and are preserved in JPEG format.

o Describe the significant properties of the data- The significant property of the data collection is the historical context within the data. The files contain images of text that were digitized from physical documents into JPEG images.
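The structure of directory two, as described in the inventory above, could be checked without unpacking the archive. The sketch below is a hedged illustration only: it assumes the folder the plan calls MACOSX corresponds to the `__MACOSX` resource folder that macOS adds to zip files, and the function name `summarize_zip` is made up for this example.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath

# Assumption: "MACOSX" in the plan refers to the "__MACOSX" folder that
# macOS adds to zip archives; both spellings are treated the same way.
SYSTEM_FOLDERS = {"__MACOSX", "MACOSX"}


def summarize_zip(zip_path):
    """Return (jpeg_pages_per_folder, skipped_entries) for a zip archive."""
    pages = Counter()
    skipped = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            path = PurePosixPath(info.filename)
            if SYSTEM_FOLDERS & set(path.parts) or path.name == ".DS_Store":
                # Operating-system residue, not part of the scanned documents.
                skipped.append(info.filename)
            elif path.suffix.lower() in {".jpg", ".jpeg"}:
                # One JPEG per scanned page, counted under its folder.
                pages[str(path.parent)] += 1
            else:
                skipped.append(info.filename)
    return pages, skipped
```

The per-folder counts can then be compared with the 74 correspondence folders in the inventory, and the skipped list documents exactly which entries were left out of the archive and why.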

III. Curation Plan

o What data will be archived- Most of the records.

o What data will not be archived- The MACOSX records. Even with updated software (Adobe), the contents of its 184 files are not retrievable, so these inaccessible records will not be archived.

o Rationale and criteria for the decisions- The excluded records are not accessible.

o What preservation strategy will be employed? How will the data be archived?

o Description and organization

IV. Metadata


What metadata already exist? The collection holds digitally scanned handwritten papers and published books. The data is divided into two directories. Directory one metadata- New Harmony: This directory holds emails and New Harmony files, including correspondence, a .DS_Store file, Harmony Society articles, history, and manuscripts.

What standards will be used?

The Text Encoding Initiative (TEI) standard will be used for the collection, with records captured and edited as TEI XML. The TEI develops and maintains a standard for the representation of texts in digital form; its guidelines specify encoding methods for machine-readable texts. This standard supports online research with the collection.

How will the metadata be created and/or enhanced?

The metadata will be created and enhanced using the Oxygen XML editor. The software provides feedback during code editing.
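For readers unfamiliar with TEI, a minimal document skeleton is sketched below using Python's standard library in place of an XML editor. The TEI namespace and the header elements (`teiHeader`, `fileDesc`, `titleStmt`, `publicationStmt`, `sourceDesc`) come from the TEI Guidelines; the function name, the sample title, and the placeholder text are assumptions for illustration and are not taken from the Owens collection.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace


def tei(tag):
    """Qualify a tag name with the TEI namespace."""
    return f"{{{TEI_NS}}}{tag}"


def build_tei(title, source_note, transcription):
    """Build a minimal TEI document: a header plus one transcribed paragraph."""
    root = ET.Element(tei("TEI"))
    header = ET.SubElement(root, tei("teiHeader"))
    file_desc = ET.SubElement(header, tei("fileDesc"))

    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = title

    publication = ET.SubElement(file_desc, tei("publicationStmt"))
    # Placeholder wording; a real record would name the publisher or archive.
    ET.SubElement(publication, tei("p")).text = "Unpublished working transcription."

    source = ET.SubElement(file_desc, tei("sourceDesc"))
    ET.SubElement(source, tei("p")).text = source_note

    text = ET.SubElement(root, tei("text"))
    body = ET.SubElement(text, tei("body"))
    ET.SubElement(body, tei("p")).text = transcription
    return root
```

Serializing the result with `ET.tostring(root, encoding="unicode")` gives a file that can be opened in an XML editor for validation against the TEI schema and for further enhancement.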

V. Discovery

How will patrons access this material?

With advanced technology, patrons will have unlimited access to the Smithsonian’s online digital collection through links on the National Museum of American History Library website. The Smithsonian’s digital collections include over 27,000 digitized books and manuscripts. The Owens family collection, as a humanities compilation, would fit within the scope of the institution's guidelines.

http://library.si.edu/libraries/national-museum-american-history-library

http://library.si.edu/collections

Technologies give patrons the opportunity to access material through online tools.

VI. Staffing- National Museum of American History Library

Curatorial: 1 collections librarian,

Technical: 2 librarian technicians and 1 staff member for IT support

Management: Head of department, Sr. librarian

Clerical: 1 librarian technician

VII. Technologies Requirements

What is required to support your curation plan?

A successful curation plan requires up-to-date, advanced tools that give users physical, virtual, and global access. The Smithsonian's online research tools include One Search, the Library Catalog (SIRIS), E-Journals and Databases, and Smithsonian Research Online.

What is required to support your curation plan

Reliance on advanced hardware and software is a constant concern. The best way to keep data preservation continuous is to have a strategic plan (disaster recovery) in place. For example, equipment, operating systems, media drives, and so forth require an immense investment. Technology changes frequently, so backing up data is necessary in digital preservation and curation. Born-digital metadata is more at risk of permanent damage when technology becomes obsolete. Some metadata schemes are not interchangeable: when mapping between metadata elements, some elements are equivalent to and compatible with existing resource descriptions and others are not. Through crosswalking, some elements can be converted, mapped, and enhanced for interoperability. Migration means copying or converting data from one technology to another, whether hardware or software, while preserving the essential characteristics of the data. These are some of the requirements needed to support the curation plan.
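The point that some metadata elements map cleanly and others do not can be shown with a small crosswalk. This is a sketch under assumptions: Dublin Core is used purely as an illustrative target schema (the plan does not name it), the source field names are borrowed from the descriptors listed for the SAVA datasets, and the sample values are invented.

```python
# Hypothetical crosswalk from local inventory fields to Dublin Core elements.
# Fields with no entry here have no agreed equivalent in the target schema.
CROSSWALK = {
    "title": "dc:title",
    "provenance": "dc:source",
    "type": "dc:type",
    "file_format": "dc:format",
    "references": "dc:relation",
}


def crosswalk(record, mapping=CROSSWALK):
    """Split a record into mapped target elements and unmapped leftovers.

    Unmapped fields are returned rather than dropped, so a curator can
    decide whether to extend the mapping or keep them in a local note.
    """
    mapped, unmapped = {}, {}
    for field, value in record.items():
        target = mapping.get(field)
        if target is None:
            unmapped[field] = value
        else:
            mapped[target] = value
    return mapped, unmapped
```

For a record with `title`, `fabric`, and `type` fields, the title and type would carry over to the target schema while `fabric` would come back in the unmapped set, which is the kind of loss a migration plan needs to account for.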


Social Media Collection #3

The digital preservation of the social media of Michael Richard "Mike" Pence is historically important. Pence, born June 7, 1959, is an American politician and lawyer and the 48th and current Vice President of the United States. Social media is a modern movement of self-documenting one's events and political and religious views.

I. Introduction

The long-term aim of digital curation here is to make sure the digital content of social media is preserved by implementing the components of the data life cycle. Documenting movements, events, and political and religious views on social media has become a popular trend of the modern world. Being able to access and reuse that data over time is the challenge. The historical events documented on a social media platform become central when references are required. The challenge is keeping up with changing technologies.

Goals of the assessment- The goal is to ensure long-term access to social media data archives by preserving an organization's important data content. Organizations preserve data according to what the business deems important and appropriate for curation. Social media platforms have regulations in place that govern the authentication of privacy rights, along with regulatory laws that protect the user.

Describe your institution- The Digital Curation Plan will focus on the success of the Smithsonian

Institution. “The Smithsonian is the world’s largest museum, education, and research complex.

The Smithsonian is a museum and research complex, comprised of 19 museums and galleries and

the National Zoological Park. The total number of objects, works of art and specimens at the

Smithsonian is estimated at nearly 137 million. The collections range from insects and meteorites

to locomotives and spacecraft.” The goal of sharing information with the Smithsonian

Institution fits this description and will serve as a globally accessible educational component.

Efforts to pair a smaller collection with a larger establishment such as the Smithsonian will help

bridge the knowledge gap for research and scholarship.

Describe the collection- The collection is comprised of various social networking platforms that

Michael Pence subscribes to. The most popular are Facebook, Instagram, Twitter and YouTube.

The collection documents the use of social media and provides information on interaction with

others. The social media database focuses on the current trends depending on which site is being

accessed. Users are able to engage with their audience, seek employment, stay on top of current

events and use it as a political platform.

Describe how the collections fit within the scope of the institution- The social media collection

fits within the scope of the institution's versatile databases. Michael Pence's social media pages

captured the political context of the 2016 presidential election. Pence’s social media accounts

captured real-time events and conversations that will one day be used as reference points for future

scholars. The Smithsonian Institution has already begun archiving social media collections

within the institution's facilities. The practice of archiving social media websites is still in its

early stages; preservation is currently carried out by screen capturing.

II. Collection Inventory

Inventory files- Facebook data is categorized by followers, likes, photo albums, interests and

activities. Twitter data is categorized by followers, tweets, following, liked photos and videos.

Instagram data is categorized by about, followers, posts and following. YouTube data is categorized

by search results, data, and hot topics. Finally, Flickr data is categorized by photos, favorites,

following and followers.

Determine the file types- All content will be captured as screenshots.

Describe the significant properties of the data- Data type: social media and social networking

service.

III. Curation Plan

What data will be archived as screenshots- The content of social media is large; this makes the

information complex to curate. Curating all social media information would be ideal; however, the

volume of continuously generated data makes this burdensome. When reviewing the social media

accounts for Michael Pence, similar data was collected from each social media page. Instead of

collecting entire datasets from social media, only data with heavier activity will be archived. Social media data

that will be archived: Facebook, Twitter, YouTube

What data will not be archived- YouTube and Flickr- Michael Pence does not have control over

photo and video sharing on these platforms. All of the providers listed, with the exception of one,

do not have readily available platforms to capture Instagram’s metadata. However, it is stated

that Hanzo Archives is very general in its description of capturing websites and particular

social media platforms. The information is very limited.

Rationale and criteria for the decisions- Preserving and maintaining specific data records makes it

easier to maintain control of the data being archived. Social media metadata preservation can be

overwhelming. Flickr is widely used by photo researchers and bloggers, but this data has no formal

connection with the Vice President himself. YouTube allows users to upload, view, rate, share,

add to favorites, report and comment on the uploaded data. Data that has no direct connection

with the Vice President himself will not be used.

What preservation strategy will be employed? When it comes to preservation strategies for

social media, the “best solution would be manual processes.” (Fasching, Kaliner, & Karel, 2012)

How will the data be archived- Archiving solutions for social media data are still in their early stages. Data capturing

is a popular way to preserve social media content for archives. Aleph Archives, X1 Social

Discover, Hanzo Archives, Iterasi Archives, and Reed Archives are current solutions that are

available for archiving.

Description and organization- X1 Social Discovery is better suited to capturing and maintaining the

content in its original format. Aleph Archives is better suited to the digital curation of social media. Aleph

provides web archiving services using web crawlers to capture the website. Iterasi is a subscription-

based provider that targets corporate, legal and governmental industries. Reed Archives captures

social media as needed; “users can export archives as PDFs or create bulk exports to entire

websites and social media accounts.” (National Archives and Records Administration, 2013)

IV. Metadata

What metadata already exists- Metadata already exists for this collection on the popular social

media websites: Facebook, Twitter, YouTube and Instagram.

How will the metadata be created and/or enhanced- The metadata will be created and/or enhanced

through screen capturing, cloud-based storage, and/or the X1 Social Discovery provider.

“Data is collected and indexed from social media streams, linked content and websites.” (National

Archives and Records Administration, 2013)
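
One possible way to create metadata at the moment of screen capturing is sketched below. This is an assumption-laden illustration, not the procedure used by X1 Social Discovery or the Smithsonian: the function name describe_capture, the record fields, the example URL, and the choice of PNG as the screenshot format are all hypothetical. The SHA-256 checksum is included so that a later fixity check can confirm the file has not changed.

```python
import hashlib
from datetime import datetime, timezone


def describe_capture(image_bytes, platform, account_url, captured_at=None):
    """Build a simple metadata record for one screenshot capture.

    All field names are illustrative; they are not taken from any
    vendor schema.
    """
    if captured_at is None:
        captured_at = datetime.now(timezone.utc)
    return {
        "platform": platform,
        "source_url": account_url,
        "captured_at": captured_at.isoformat(),
        "format": "image/png",  # assumption: screenshots are saved as PNG
        "size_bytes": len(image_bytes),
        # Checksum recorded for later fixity (integrity) checks.
        "sha256": hashlib.sha256(image_bytes).hexdigest(),
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real screenshot file.
    record = describe_capture(b"placeholder image data", "Twitter",
                              "https://example.org/account")
    for key, value in record.items():
        print(f"{key}: {value}")
```

Recording the capture time and source address alongside each screenshot keeps the context that a bare image file would otherwise lose.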

V. Discovery

How will patrons access this material- Patrons will access the Smithsonian online materials

through the Archives database.

VI. Staffing- Staff will maintain the digital archives and keep them up to date throughout the course of

research. Staffing will consist of 2 curators, 2 technical librarians, 1 IT support staff member, 1 head of

department (management) and 1 clerical staff member.

VII. Technologies required

What is required to support your curation plan- Archive-It, a tool by the Internet Archive, will be

used to capture social media accounts before they become outdated. Archive-It uses a crawler, a

program that browses the internet and captures a copy of a particular website as it exists at a

specific moment. The Wayback tool provides access to these captures for future use.
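
Wayback captures are addressed by a 14-digit timestamp followed by the original address, as seen in the web.archive.org links in the Works Cited. The sketch below shows how such an address could be built and read; the function names are hypothetical, and the code only handles the plain timestamp form shown in this document.

```python
import re

WAYBACK_PREFIX = "https://web.archive.org/web/"
# Matches http or https Wayback links with a plain 14-digit timestamp.
_CAPTURE_PATTERN = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def build_wayback_url(timestamp, original_url):
    """Join a YYYYMMDDhhmmss timestamp and an original URL into a capture URL."""
    if not re.fullmatch(r"\d{14}", timestamp):
        raise ValueError("timestamp must be 14 digits: YYYYMMDDhhmmss")
    return f"{WAYBACK_PREFIX}{timestamp}/{original_url}"


def parse_wayback_url(wayback_url):
    """Return (timestamp, original_url) from a capture URL."""
    match = _CAPTURE_PATTERN.match(wayback_url)
    if match is None:
        raise ValueError("not a Wayback capture URL in the expected form")
    return match.group(1), match.group(2)


if __name__ == "__main__":
    cited = ("http://web.archive.org/web/20140604205028/"
             "http://www.indiana.edu/~sava/gallery.html")
    timestamp, original = parse_wayback_url(cited)
    print(timestamp)
    print(original)
```

Storing the timestamp and the original address separately in the collection record lets a curator rebuild the capture link later or look for a newer capture of the same page.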

Works Cited

Fasching, D., Kaliner, S., & Karel, T. (2012, July). Social Media Data Preservation, Tools and Best

Practices. LJN's Law Journal, Legal Tech Newsletter, 29(3).

Indiana University. (n.d.). SAVA South Aegean Volcanic Arc and Aeginetan Ware Database Project.

Retrieved from

http://web.archive.org/web/20140604205028/http://www.indiana.edu/~sava/gallery.html

Indiana University. (n.d.). Retrieved from

https://web.archive.org/web/20140604205018/http://www.indiana.edu/~sava/database.htm

National Archives and Records Administration. (2013). White Paper on Best Practices for the Capture of

Social Media Records. Washington DC: National Archives.

Steven, M. J. (2016). Metadata for Digital Resources. New York, NY, USA: Neal-Schuman.

Schreibman, S., Siemens, R., & Unsworth, J. (Eds.). (2004). A Companion to Digital Humanities. Oxford: Blackwell. Retrieved

from http://www.digitalhumanities.org/companion/