Finding Datasets and Statistics

Data and Statistics library research at UCSD




First, what’s the difference?


Datasets are collections of numeric data that can be analyzed using specialized software such as Stata, SPSS, or R.


Statistics are numerical data that have been organized and interpreted, usually displayed in tables.


So what’s the deal with data?


What is Data?

• Data are the raw ingredients from which statistics are created.

• Statistical analysis can be performed on data to show relationships among the variables collected.

• Through secondary data analysis, many different researchers can re-use the same data set for different purposes.
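
As a small illustration of how statistics are created from raw data, the sketch below summarizes one variable and measures the relationship between two. The records and the variable names `hours_studied` and `exam_score` are invented for illustration and do not come from any real dataset.

```python
from statistics import mean, median

# Hypothetical raw data: one record per student (invented values).
records = [
    {"hours_studied": 2, "exam_score": 65},
    {"hours_studied": 4, "exam_score": 70},
    {"hours_studied": 6, "exam_score": 80},
    {"hours_studied": 8, "exam_score": 85},
]


def summarize(rows, variable):
    """Return basic statistics for one variable in the raw data."""
    values = [row[variable] for row in rows]
    return {"n": len(values), "mean": mean(values), "median": median(values)}


def pearson_r(rows, x, y):
    """Pearson correlation between two variables, computed by hand."""
    xs = [row[x] for row in rows]
    ys = [row[y] for row in rows]
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    var_x = sum((a - mx) ** 2 for a in xs)
    var_y = sum((b - my) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5


score_stats = summarize(records, "exam_score")
r = pearson_r(records, "hours_studied", "exam_score")
```

Another researcher could run a different analysis on the same `records`, which is the idea behind secondary data analysis.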


Aggregate Data

Aggregate data are higher-level data that have been compiled from smaller units of data.

• Examples: inflation rate, consumer price index, demographic data for a city or state


Microdata

• Data directly observed or collected from a specific unit of observation.

• Contain individual cases, usually individual people or, in the case of Census data, individual households.

  – Examples:

    • Census: the unit of observation is usually an individual, a household, or a family.

    • Survey or poll: the responses of a single respondent.
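
The relationship between microdata and aggregate data can be sketched in a few lines: unit-level records are grouped and compiled into one summary row per larger unit. The household records, city names, and the helper `aggregate_by` are hypothetical and serve only as an illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical microdata: one row per household (invented values).
households = [
    {"city": "San Diego", "income": 60000},
    {"city": "San Diego", "income": 80000},
    {"city": "Chula Vista", "income": 50000},
    {"city": "Chula Vista", "income": 60000},
    {"city": "Chula Vista", "income": 70000},
]


def aggregate_by(rows, group_key, value_key):
    """Compile unit-level rows into one summary row per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {
        group: {"households": len(values), "mean_income": mean(values)}
        for group, values in groups.items()
    }


# Aggregate data: one entry per city instead of one per household.
city_summary = aggregate_by(households, "city", "income")
```

Note that the step only works in one direction: the city-level summary cannot be turned back into the individual household records.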


Datasets

• A data set or study is made up of the raw data file and any related files, usually the codebook and setup files.

• Most data sets require at least basic statistical or spreadsheet programs to use.
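
To show why the codebook travels with the raw data file, here is a minimal sketch that uses a codebook to turn coded responses into readable labels. The file contents, variable names, and code values are all invented; real codebooks vary widely in format and usually need to be read by hand or with the accompanying setup files.

```python
import csv
import io

# Hypothetical raw data file: coded survey responses (invented values).
RAW_DATA = """respondent_id,sex,employed
1,1,2
2,2,1
3,2,2
"""

# Hypothetical codebook: maps each variable's numeric codes to labels.
CODEBOOK = {
    "sex": {"1": "Male", "2": "Female"},
    "employed": {"1": "Yes", "2": "No"},
}


def apply_codebook(raw_text, codebook):
    """Read the coded rows and replace each code with its label."""
    labeled = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        labeled.append({
            # Variables without a codebook entry keep their raw value.
            variable: codebook.get(variable, {}).get(value, value)
            for variable, value in row.items()
        })
    return labeled


labeled_rows = apply_codebook(RAW_DATA, CODEBOOK)
```

Without the codebook, a value such as `2` in the `employed` column would be impossible to interpret.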


Types of data

• Cross-sectional data are collected only once.

• Time series data track the same variable over time.

• Longitudinal studies are surveys conducted repeatedly, in which the same group of respondents is surveyed each time.
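
The three types differ mainly in how the records are structured. The sketch below shows one possible layout for each, plus a rough check for the longitudinal case. All values, field names, and the helper `is_longitudinal` are invented for illustration.

```python
from collections import defaultdict

# Cross-sectional: many units, each observed once (invented values).
cross_sectional = [
    {"respondent": "A", "year": 2020, "income": 40000},
    {"respondent": "B", "year": 2020, "income": 55000},
]

# Time series: one variable tracked over successive periods (invented values).
time_series = [
    {"year": 2018, "rate": 4.0},
    {"year": 2019, "rate": 3.5},
    {"year": 2020, "rate": 8.0},
]

# Longitudinal: the same respondents observed in every wave (invented values).
longitudinal = [
    {"respondent": "A", "year": 2020, "income": 40000},
    {"respondent": "B", "year": 2020, "income": 55000},
    {"respondent": "A", "year": 2021, "income": 42000},
    {"respondent": "B", "year": 2021, "income": 56000},
]


def is_longitudinal(rows):
    """True if there are several waves and each surveys the same respondents."""
    waves = defaultdict(set)
    for row in rows:
        waves[row["year"]].add(row["respondent"])
    groups = list(waves.values())
    return len(groups) > 1 and all(group == groups[0] for group in groups)


longitudinal_check = is_longitudinal(longitudinal)
cross_sectional_check = is_longitudinal(cross_sectional)
```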


Finding Datasets


1. Think about who might collect the data.

• Could it have been collected by a government agency?

• A nonprofit or nongovernmental organization?

• A private business or industry group?

• Academic researchers?


2. Look for publications that use the kind of data you’re looking for and that cite the dataset.

In other words, are the data you want mentioned in scholarly articles, government reports, or some other source?


3. Once you know that what you want exists, it's time to hunt it down.

• Is it freely available on the web?

• Or part of a package to which the library already subscribes?

• Is it something we can buy? (And is it within the library's budget, and can the purchase be made quickly enough to fit your timeframe?)

• Can it be requested directly from the researcher?