Statistics for Librarians, Session 1: What is statistics & Why is it important?

First of 4 sessions introducing statistics to librarians and library staff.

WHAT IS STATISTICS?

Why is it important?

Goals of Series

Comfort

Fears

Series Objectives

Foundations

Descriptive Statistics

Inferential Statistics

Reading & Interpreting Statistics

Comfort Level

What is Statistics?

• Study of Data

• Collecting

• Organizing

• Summarizing

• Analyzing

• Presenting

• Storing & Sharing

Why is it Important?

• Make sense of the data

• Explain what happens and (possibly) why

• Make sound decisions

• To know how close we are to the truth.

Results

Bias?

Sampling Error?

Invalid Measures?

Random Error?

Other Factors?

Purpose of Statistics

Thinking about Data in your Research Project

Start with your Research Question

• How do users differ when (searching, finding, selecting) (articles, books, Web sites)?

• What are the effects of ___________ on ____________?

• Which is better at improving _________?

• How are people (finding, selecting, using) _______?

• What are factors associated with ___________?

Example of Research Question

PACS

• Low LibQUAL+ Ratings

Collections

• Is it our collections?

Do we have what they use?

• Based on citations

Variables

Independent

Subjects

Factors

Effects of…

Dependent

Objects

Outcomes

Effects on…

Example of Variables

Faculty

• Department

• Years at UNT

Published

• # published by type

Cited

• # cited by type

• UNT accessible

IV

DV

Scales of Data (NOIR)

Nominal

• Counts by category

• Binary (Yes/No)

• No meaning between the categories (Blue is not better than Red)

Ordinal

• Ranks

• Scales

• Space between ranks is subjective

Interval

• Integers

• No baseline

• Space between values is equal and objective, but discrete

Ratio

• Interval data with a baseline

• Space between is continuous

Likert-Type Scale?

Arbitrary

Few Levels

Individual Questions

Ordinal?

Symmetrical

Many Levels

Composite Score

Interval?

Example of Variable Types

Faculty

• Department

• Years at UNT

Published

• # published by type

Cited

• # cited by type

• UNT accessible

N

N

N

N

I

Compared to What?

Book Circulations

180,354

Compared by…

Time Periods

Other Libraries

National Surveys

Patron Types

Material Types

Research Question

Data Type

Comparison Group

Statistical

Methods Used

VALIDITY OF MEASURES

Are you actually measuring what you are trying to

measure?

Selecting Measures

Measures

• Counts

• Survey responses

• Grades/Scores

• Ranks

• Scales (e.g. Likert)

• Age, Length of Time

• Frequency

Units of Analysis

• People

• Books

• Articles

• Uses

• Levels of Analysis

• What is the object (DV)?

• What is the subject (IV)?

Use a tool with established validity

Approaches and Study Skills Inventory for Students (ASSIST)

User Engagement Scale (UES)

Establish Validity of Measures

Reliability

• Consistency

Content Validity

• Corresponds with expectations

• Common understandings

Construct Validity

• Corresponds with other variables based on theory

Criterion Validity

• Corresponds with other measures

Results

Bias?

Invalid Measures?

Sampling Error?

Random Error?

Other Factors?

ROLE OF SAMPLING

All members of population

Hard to measure

The Truth

Census

A selection of the population

Easier to measure

An estimate of the truth

Sample

When to Use Which: Research Question?

Census

• Book usage at UNT Libraries

• Effects of IL instruction on English 1100 students

Sample

• Book usage at all libraries

• Effects of IL instruction on all students

Example - Census or Sample?

All journal articles cited

All Items Published by PACS Faculty

All journal articles published by PACS faculty

Random Samples

• Every Unit of Analysis has an equal and known chance of being included.
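
A minimal sketch of drawing a simple random sample, where every unit of analysis has the same known chance of selection. The sampling frame, its size, and the function name `simple_random_sample` are hypothetical and made up for illustration.

```python
import random

def simple_random_sample(units, n, seed=None):
    """Draw n units without replacement; each unit has chance n / len(units)."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(list(units), n)

# Hypothetical sampling frame: 500 article IDs (made up for illustration).
frame = [f"article-{i:03d}" for i in range(1, 501)]

sample = simple_random_sample(frame, 50, seed=42)

# The chance of inclusion is equal and known for every unit in the frame.
inclusion_chance = 50 / len(frame)
```

Recording the seed alongside the results lets someone else reproduce exactly the same sample.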

Importance of Randomness

Random Samples

Random, Weighted, etc.

Should be representative of population

Can use inferential statistics

Most useful for testing hypotheses

Non-Random Samples

Convenience, Purposive, etc.

May or may not be representative of population

Use descriptive statistics only

Most useful for generating hypotheses

Results

Bias?

Invalid Measures?

Sampling Error?

Random Error?

Other Factors?

ROLE OF DATA COLLECTION IN STATISTICS

Goal of Data Collection in Statistics

Reliability

Bias

Bias

Systematic (not random) deviation from the true value (Statistics.com)

Selection Bias

Measurement

• Observer Bias

• Non-response Bias

Analysis Bias

Data Collection Forms

1 Unit Per Form

• Many or Complex Variables

• Surveys

Spreadsheet

• Fewer Variables

• Collected all at once

• Bibliometric

• Space Surveys

Data Input

Have a data entry plan

Train the inputters

Use data validation tricks

Double-entry
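
Double-entry can be checked mechanically: two people key the same records independently, and any field where the two passes disagree is flagged for review. The sketch below assumes each pass is stored as a dictionary keyed by record ID; the function name, field names, and values are hypothetical.

```python
def find_entry_mismatches(first_pass, second_pass):
    """Compare two independent keyings of the same records.

    Each argument maps a record ID to a dict of field values.
    Returns a list of (record_id, field, first_value, second_value).
    """
    mismatches = []
    for record_id in sorted(set(first_pass) | set(second_pass)):
        a = first_pass.get(record_id, {})
        b = second_pass.get(record_id, {})
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                mismatches.append((record_id, field, a.get(field), b.get(field)))
    return mismatches

# Hypothetical records keyed twice by different people.
entry_1 = {
    "r1": {"year": "2009", "type": "journal"},
    "r2": {"year": "2010", "type": "book"},
}
entry_2 = {
    "r1": {"year": "2009", "type": "journal"},
    "r2": {"year": "2001", "type": "book"},  # transposed digits
}

mismatches = find_entry_mismatches(entry_1, entry_2)
for record_id, field, first, second in mismatches:
    print(f"{record_id}.{field}: {first!r} vs {second!r}")
```

Each flagged field is then resolved by going back to the original source, not by trusting either pass.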

Organizing Data

One Unit of Analysis per Row
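
As an illustration of the one-unit-per-row layout, the sketch below treats each cited article as a row and each variable as a column. The column names, department names, and values are made up for the example.

```python
import csv
import io

# Hypothetical data: each row is one cited article (the unit of analysis),
# and each column is one variable.
raw = """citation_id,department,item_type,unt_accessible
1,Department A,journal,yes
2,Department B,book,no
3,Department B,journal,yes
"""

rows = list(csv.DictReader(io.StringIO(raw)))

# With one unit per row, counting and filtering are straightforward.
accessible = [row for row in rows if row["unt_accessible"] == "yes"]
share_accessible = len(accessible) / len(rows)
```

The same layout works unchanged in a spreadsheet, which is why it is worth settling on before data entry begins.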

Example Spreadsheets

Results

Bias?

Invalid Measures?

Sampling Error?

Random Error?

Other Factors?

STATISTICAL ANALYSIS

Central Tendency

Error

Spread

Elements of Statistical Analysis

Inferential

• Infer associations

Descriptive

• Describe

Descriptive Analysis: Just the Facts, Ma’am

Summarizes

Tables

Charts

Univariate

One variable at a time

Comparison with Population

Demonstrates how random the sample is

Measures of Central Tendency

Mean

• Average

Median

• Middle

Mode

• Most Common
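
A short sketch of the three measures using Python's standard `statistics` module. The citation counts are made-up values chosen so that one large value pulls the mean away from the median.

```python
import statistics

# Hypothetical counts of citations per article (made-up values).
citations = [2, 3, 3, 5, 8, 13, 40]

mean_value = statistics.mean(citations)      # arithmetic average
median_value = statistics.median(citations)  # middle value when sorted
mode_value = statistics.mode(citations)      # most common value

print(f"mean={mean_value:.2f} median={median_value} mode={mode_value}")
```

Here the single article with 40 citations raises the mean well above the median, which is one reason to report both for skewed data.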

Central Tendency by Scales

Interval or Ratio

Mean

Median

Nominal or Rank

Median (rank data only; categories must have an order)

Mode

Spread

Interval & Ratio

• Range

• Quartiles or Quintiles

• Standard Deviation

Nominal & Rank

• Distribution Tables

• Bar Graphs

How variable is the data?

Range & Quartiles
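
A minimal sketch of computing the range and quartiles with the standard library. The values are hypothetical, and `statistics.quantiles` is used with its default method; other tools may interpolate quartiles slightly differently.

```python
import statistics

# Hypothetical counts of citations per article (made-up values).
values = [2, 3, 3, 5, 8, 13, 40]

# Range: distance between the smallest and largest values.
data_range = max(values) - min(values)

# Quartiles: the three cut points that split the sorted data into four parts.
q1, q2, q3 = statistics.quantiles(values, n=4)

# Interquartile range: the spread of the middle half of the data.
iqr = q3 - q1
```

Passing `n=5` instead of `n=4` returns the four cut points for quintiles.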

Standard Deviation

• Measure of dispersion of data

• Square root of the average squared deviation from the mean
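
The calculation can be traced step by step. This sketch uses made-up values and shows both the population form (divide by n) and the sample form (divide by n - 1), checking each against the standard `statistics` module.

```python
import math
import statistics

# Hypothetical values (made up for illustration).
values = [4, 8, 6, 2, 10]

mean = sum(values) / len(values)

# Squared distance of each value from the mean.
squared_deviations = [(x - mean) ** 2 for x in values]

# Population form: average the squared deviations, then take the square root.
population_variance = sum(squared_deviations) / len(values)
population_sd = math.sqrt(population_variance)

# Sample form: divide by n - 1 when the data are a sample of a larger population.
sample_sd = math.sqrt(sum(squared_deviations) / (len(values) - 1))

# The standard library gives the same results.
assert math.isclose(population_sd, statistics.pstdev(values))
assert math.isclose(sample_sd, statistics.stdev(values))
```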

What does the Standard Deviation tell you?

Greater variation, less certainty

Lower variation, more certainty

Presentation of Spread

• Box plots

• Mean

• Upper & lower quintiles

• Outliers

• Cross-tabulations

• Bar graphs

Spread of Nominal data

Bar graphs & plots
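
For nominal data, spread is shown as a distribution table: a count and share for each category. The sketch below builds one with `collections.Counter` and prints a rough text bar graph; the item types are made-up values.

```python
from collections import Counter

# Hypothetical nominal variable: the type of each cited item (made-up values).
item_types = ["journal", "book", "journal", "report", "journal", "book"]

counts = Counter(item_types)
total = sum(counts.values())

# Distribution table: (category, count, share of total), most common first.
table = [(category, n, n / total) for category, n in counts.most_common()]

for category, n, share in table:
    print(f"{category:<8} {n:>2} {share:>6.1%} {'#' * n}")
```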

Inferential Statistics

Tests of hypotheses

• Associations

• Expectations

Accounts for uncertainty

• Random error

• Confidence interval
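
As one way to express uncertainty, the sketch below computes a confidence interval for a proportion using the normal approximation (sometimes called the Wald interval). The counts are hypothetical, the function name is made up, and this approximation is only reasonable for fairly large random samples with proportions not too close to 0 or 1.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a sample proportion."""
    p_hat = successes / n
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)

# Hypothetical sample: 160 of 200 sampled items met the criterion.
low, high = proportion_confidence_interval(160, 200)
print(f"estimate=0.80, 95% CI=({low:.3f}, {high:.3f})")
```

A larger sample narrows the interval; a smaller one widens it.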

Hypotheses

Your Hypothesis (H1)

Null Hypothesis (H0)

Example Hypothesis

UNT Libraries provides access to…

>=75%*

<75%*

*…of journal articles cited by UNT PACS faculty in journal articles published between 2008-2011.
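
A hedged sketch of how a hypothesis about a 75% threshold could be tested with an exact one-sided binomial test. This is one possible technique, not necessarily the one used in the session. The sample counts are invented, the function name is hypothetical, and the test assumes the sampled citations were drawn at random.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical sample: 165 of 200 randomly sampled cited articles were accessible.
n_sampled = 200
n_accessible = 165

# p-value: the chance of seeing a count this high or higher
# if the true access rate were exactly at the 75% boundary.
p_value = binomial_upper_tail(n_accessible, n_sampled, 0.75)

alpha = 0.05  # significance level, chosen before looking at the data
reject_null = p_value < alpha
print(f"p={p_value:.4f}, reject null: {reject_null}")
```

A small p-value means a result like this would be unlikely under the null hypothesis, so the null is rejected at the chosen significance level.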

Hypothesis Testing

p

Sample Size

Central Tendency

Spread

Distribution

Significance Level

Statistical Analysis

Noise

Signal

Results

Bias?

Sampling Error?

Invalid Measures?

Random Error?

Other Factors?

Purpose of Statistics

Valid

• Measures

• Data Collection

• Sample Selection

• Statistical Methods

Valid

• Data

• Sample

• Statistical Analysis

Valid

• Results

Role of Validity in Research
