From Big Data to the Big Picture

Page 1: From Big Data to the Big Picture

Los Angeles | London | New Delhi | Singapore | Washington DC | Melbourne

From Big Data to the Big Picture

#SAGETalks

Page 2: From Big Data to the Big Picture

Caroline Muglia, Head of Resource Sharing and Collection Assessment Librarian at University of Southern California, manages the Interlibrary Loan and Document Delivery department and leads the collection assessment efforts for the Library. Before this position, Caroline worked at the Library of Congress and later as a Data Librarian for an educational technology company.

Jill Parchuck has been the Associate University Librarian for Science, Social Science and Medicine at Yale University since 2014. Other positions Jill has held at Yale include Director, Science and Social Science Libraries and Co-Director of the Center for Science and Social Science Information from 2010 to 2014, and Director of Social Science Libraries and Information Services from 2007 to 2010.

Page 3: From Big Data to the Big Picture

While we do our best to answer as many questions as we can, time constraints may not allow us to answer every question. Thank you for understanding.

Send us your questions!

Send in your questions via the Question Box on your screen.

Using Twitter? Use the hashtag #SAGETalks.

Page 4: From Big Data to the Big Picture

Introduction

• Big data initiatives are plentiful!

• Libraries can play an important role

• What steps can librarians take to contribute to big data projects?

• How can libraries add value to big data projects?

• How can libraries determine the needs for data support?

Page 5: From Big Data to the Big Picture

Areas we will cover

I. Datasets (Homegrown and Purchased)

II. Licensing Data

III. Storage and Repositories

IV. Software and Tools

V. Looking Ahead

Page 6: From Big Data to the Big Picture

What is Big Data?

• Volume: Amount of data being created and ingested. What qualifies as “big”?

• Variety: Number of types of data

• Velocity: Speed at which data is being created and processed

• Value: How data is being analyzed and utilized

Page 7: From Big Data to the Big Picture

Homegrown Datasets

Guiding questions:

● Who is creating datasets and how?

● What are the uses of the datasets?

● Where are the datasets stored?

Page 8: From Big Data to the Big Picture

USC

• Created by researchers

  • Spatial Sciences student projects

  • USC Neuroimaging and Informatics

• Created by partnerships

  • Big Data for Discovery Science

• Libraries

  • USC Shoah Digital Library (8 petabytes)

Yale

• Created by researchers

  • Institution for Social and Policy Studies (ISPS)

  • Yale Open Data Access Project (YODA)

  • Yale Proteomics Expression Database (YPEDS)

• Produced by administrative units of the institution

  • Yale Sustainability

Page 9: From Big Data to the Big Picture

Purchased Datasets

Guiding questions:

● In what format does the library receive the datasets?

● Where are the datasets stored?

● What kind of access do users have?

● How can users discover the datasets?

Page 10: From Big Data to the Big Picture

• Subject specialist receives request from researchers and places order

• Data librarian receives and manages data and places it on local server

• Cataloger creates records for the online discovery system

Page 11: From Big Data to the Big Picture

Licensing Data

Guiding questions:

● What are the terms of use?

● Access vs. ownership?

Page 12: From Big Data to the Big Picture

• Review criteria of license

• Ensure the widest possible use of content

• Ensure that a viable platform is available to provide access

• Ensure that metadata can be provided

• Can we retain a backup copy?

Page 13: From Big Data to the Big Picture

Storage & Repositories

Guiding questions:

● How much storage space is needed?

● What do we need the repository to do?

Page 14: From Big Data to the Big Picture

• USC Digital Repository

• ICPSR

• Departments/Schools

• Contract to repositories

• Purchase server space from university

• Smaller data sets

• External hard drives

• Registry of Research Data Repositories

• ICPSR - Inter-university Consortium for Political and Social Research

• Yale Social Science Data Archive - all in local discovery system

Page 15: From Big Data to the Big Picture

Services for Analyzing Big Data

Guiding questions:

● Who owns the data? What rights management is needed?

● What do you need to do?

● Who will be using the data?

● What output options do you need to have?

Page 16: From Big Data to the Big Picture

Support For Using Data

• Organizing data

• Statistical analysis

• Cleaning data

• Manipulating data

• Managing data

• Data Visualization

• Retaining data

Page 17: From Big Data to the Big Picture

USC

• Libraries

  • Subject librarians cultivate different skills

  • Tableau license

• University-wide

  • SC-CTSI (Clinical Data Analysis)

  • Center for High Performance Computing (HPC)

Yale

• Yale Center for Science and Social Science Information (CSSSI) - data services

• Yale StatLab Consultants - statistical analysis

• CSSSI Research Data Management - guide

• Yale Research Data Consultation Group

Page 18: From Big Data to the Big Picture

Looking Ahead

What can you do now?

Page 19: From Big Data to the Big Picture

• Identify unique role that library can play

• Information management is a library service

• Data literacy or teaching with data; data education

• Expert trainers in Tableau, TDM tools

• Metadata expertise

• Store/make accessible other departments’ raw data

• Can libraries provide analytical services?

• Learn needs of the institution

• Digital humanities projects can be a starting point

• Identify stakeholders

• Vendors and librarians can act as research partners

• Unique relationship that other departments may not have

Page 20: From Big Data to the Big Picture

Future considerations

What should you be prepared to handle in the near future?

Page 21: From Big Data to the Big Picture

• Data management plans

• NSF data management requirements

• Open data

• Open Government

• Los Angeles Open City

• Open Science

• Data science

• More students trained in data science, increasing knowledge on campus

• What is the library’s role (instruction, collection development) in meeting these research needs, and in capitalizing on them?

Page 22: From Big Data to the Big Picture

Webinar recording, slides, and follow-up Q&A will be emailed to you and available on connection.sagepub.com.

Thank you!

Be sure to check our website for updates on our webinar series!
