Dataverse Network Bibliography

Brett, K., et al. Reports of Conferences, Institutes, and Seminars. Serials Review, 38(4), 266-288 (2012).

This report summarizes a conference session in which Mercè Crosas gave an overview of the Dataverse Network Project and data management plans. Crosas describes Dataverse as a repository for research data that takes care of long-term preservation and employs established archival practices while allowing researchers to keep control of, and receive recognition for, their data.

Dataverse fits well into data management plans because the network provides access and sharing capabilities. It allows researchers to deposit data in an organized, curated, and citable network, an environment that facilitates and promotes access and sharing.

The Dataverse Network, designed by the Institute for Quantitative Social Science (IQSS), is free and open to all social science research data. Before uploading data, the researcher must agree to terms of use, which contain statements about copyright and confidentiality protections.

Dataverse is a good archival choice because all metadata is exported to XML, data files are reformatted for long-term access, all versions are kept, and both metadata and data are replicated to multiple locations through LOCKSS (the Lots of Copies Keep Stuff Safe project at Stanford University).

Crosas, M. The Dataverse Network®: An Open-Source Application for Sharing, Discovering and Preserving Data. D-Lib Magazine 17, no. 1/2 (2011).

This article was published in D-Lib Magazine, an electronic publication with a focus on digital library research and development. Mercè Crosas walks through the creation of the Dataverse Network at Harvard University and explains why it is a good choice of software and framework for researchers who want to share and archive their data. The central repository supports archival services, backups, recovery, standards-based identifiers, metadata, conversion, and preservation.

Dataverse promotes data sharing and data discoverability, but it also supports restrictions for cases in which authors want to limit use of or access to their data. Researchers can browse and search studies within specific dataverses or across the network. Researchers can also find and easily access small data sets from others that would otherwise sit on a local or personal computer.

The Dataverse software also supports additional services for two types of data: tabular data sets and social network data. Tabular data sets are files with rows and columns (SPSS, STATA, CSV) that can be subset so that a user extracts only the variables they want. Social network data is data that describes a network of entities and relationships. These sets can be uploaded in GraphML format to give users additional flexibility in working with the data.
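The variable subsetting described above for tabular files can be illustrated with a minimal Python sketch. This is not the Dataverse implementation or its API; the function name `subset_columns` and the sample data are hypothetical, invented only to show what extracting a few variables from a CSV file looks like.

```python
import csv
import io


def subset_columns(csv_text, wanted):
    """Return CSV text containing only the columns named in `wanted`.

    Hypothetical helper for illustration; not part of Dataverse.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    available = reader.fieldnames or []
    missing = [name for name in wanted if name not in available]
    if missing:
        raise KeyError(f"unknown variables: {missing}")

    out = io.StringIO()
    # extrasaction="ignore" drops every column that is not in `wanted`.
    writer = csv.DictWriter(
        out, fieldnames=wanted, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Sample data invented for this example only.
SAMPLE = "id,age,income,region\n1,34,52000,NE\n2,45,61000,SW\n"

if __name__ == "__main__":
    # Keep only two of the four variables.
    print(subset_columns(SAMPLE, ["id", "income"]))
```

Here the user asks for `id` and `income` and receives a smaller file with just those two variables, which is the kind of extraction the article attributes to the tabular data service.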

Dataverse can be used in two ways: through a web interface or as a software installation. Data owners can administer all the settings and manage studies through the web interface. A dataverse can be created through a web form on a Dataverse Network.

Crosas states that the Dataverse Network supports data citation, web visibility, and ease of use, which enables data authors to gain recognition and maintain ownership of their data while addressing their data archival concerns. As data sharing initiatives grow, authors need incentives and the right technology solutions in order to participate in successful data sharing.

Interview: Mercè Crosas - Dataverse Network. e-Science Portal for New England Librarians, 2013.

This interview is available on the e-Science Portal for New England Librarians website, in connection with UMass Medical School. Mercè Crosas talks about the Dataverse Network as a data repository product for sharing, citing, and archiving research data.

Other libraries can use the Dataverse Network either as a service or as a software installation. As a service, an individual researcher can create a dataverse at the DVN and deposit and share their own data sets. Dataverse data sets are organized into studies, which are automatically given a data citation so they can be referenced. As a software installation, an institution can install it on its own servers and create its own Dataverse Network.

Crosas also talks about how the library promotes Dataverse and data science services to the Harvard institutional community. Librarians need to be familiar with data science tools for data storage and to provide support and services for the preservation of research objects.

Crosas, M., Sweeney, L., & King, G. Dataverse for Big Data. Dataverse Network Project. 2013.

This handout addresses the use of Dataverse for making very large data sets sharable, citable, and reusable, and for facilitating the reproducibility of research. Dataverse already provides solutions for fixed-size, downloadable data sets, but larger sets are now being used for research.

Large, frequently updated, or streaming data sets pose several challenges. Some sets can no longer be downloaded to the researcher’s computer, the amount of data can no longer be curated or analyzed, and some sets contain private and sensitive data.


Dataverse will integrate with other technologies to deal with these big data sets and with frequently updating or changing data. Data sets that Dataverse could make accessible in this way include images from the Connectome project, health data from hospitals, student usage data from open online courses, and social media data.