
Page 1: Is it really that bad?

IS IT REALLY THAT BAD?

Verifying the extent of full-text linking problems

Karen R. Harker, MLS, MPH
Collection Assessment Librarian
UNT Libraries

Page 2: Is it really that bad?

“I find searching for journal articles and actually finding full articles to a very difficult. Sometimes it just links to another site and then to a fragment of an article.”

“It's hard to find links to online articles--some links say they can't find it, but sometimes you can still locate it and I don't know why the main "find links" page doesn't bring it up.”

“Sometimes, if a link is not provided to an article in the search results, it is very difficult to find. The article linker often comes up with no results even though it says the article is in UNT's collection.”

Frustrated with "Find Full-Text" when "it doesn't work"; "if it's not 100% perfect, there is really no point" in offering the service.

Page 3: Is it really that bad?
Page 4: Is it really that bad?
Page 5: Is it really that bad?

What have you done?

Page 6: Is it really that bad?

Link-Checking

RANDOM

Selection Cannot Be Predicted

Page 7: Is it really that bad?

What is required?

Intermediate to advanced Excel (but NOT programming)

Page 8: Is it really that bad?

Knowledge

About your collection

About the problem

Page 9: Is it really that bad?

Clear questions

Page 10: Is it really that bad?

Think about...

your problem

your collection

your link resolver

your people

Page 11: Is it really that bad?

Brainstorm

Page 12: Is it really that bad?

Come away with…

Page 13: Is it really that bad?

What Kind of Research Questions?

Just a few

Be Specific

Compared to what?

What is important to you?

Page 14: Is it really that bad?

Such as…

Are links from EBSCO more successful than links from Ovid?

Is full-text linking better or worse compared to last year?

Is Serials Solutions’ 360 link resolver more likely to get to the full-text from our key resources than EBSCO’s?

What is the chance that a client will get the full-text of an article on the first click?

Page 15: Is it really that bad?

Start with the Results

Page 16: Is it really that bad?

Comparing One Source With Another

Sources   Full-Text   No Full-Text   Total
EBSCO     90          10             100
Ovid      85          15             100
Totals    175         25             200

Confidence Level: 0.90
Chi-Square Test: 0.161
Target Chi-Square: 1.00 - 0.90 = 0.10

Is Chi-Square test < Target? No

Is Ovid Significantly Different from EBSCO? No
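The Chi-Square Test figure for this comparison can be reproduced by hand. This is a sketch under an assumption the slide does not spell out: that the figure is the probability Excel's CHISQ.TEST returns when the EBSCO row is the actual range and the Ovid row is the expected range, which gives one degree of freedom for a row of two counts.

```latex
\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}
       = \frac{(90 - 85)^2}{85} + \frac{(10 - 15)^2}{15}
       \approx 0.294 + 1.667 = 1.961

p = P\left(\chi^2_{1} > 1.961\right)
  = \mathrm{erfc}\left(\sqrt{1.961 / 2}\right)
  \approx 0.161
```

Because 0.161 is not below the target of 0.10, the difference between the two sources is not significant at the 0.90 confidence level.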

Page 17: Is it really that bad?

Comparing One Source With the Average

Sources   Full-Text   No Full-Text   Total
EBSCO     90          10             100
Average   75          25             100
Totals    165         35             200

Confidence Level: 0.90
Chi-Square Test: 0.001
Target Chi-Square: 1.00 - 0.90 = 0.10

Is Chi-Square test < Target? Yes

Is EBSCO Significantly Different from the Average? Yes

Page 18: Is it really that bad?

Comparing One Target With The Expected or Ideal Rate

Targets                  Full-Text   No Full-Text   Total
EBSCO                    90          10             100
Expected or Ideal Rate   95          5              100
Totals                   185         15             200

Confidence Level: 0.90
Chi-Square Test: 0.022
Target Chi-Square: 1.00 - 0.90 = 0.10

Is Chi-Square test < Target? Yes

Is EBSCO Significantly Different from Ideal Rate? Yes

Page 19: Is it really that bad?

Random Sampling

Review or background, depending on your viewpoint

Page 20: Is it really that bad?

Sampling Terms

Universe

Sampling Population

Sampling Frame

Sample

All Citations

Citations in databases to which we have access

to articles to which we have full-text access

Only Journal Articles

Page 21: Is it really that bad?

Selection Methods

Non-probability: The chance of being selected is not known (e.g. convenience sampling).

Probability: The probability of any one citation being selected is known.

Page 22: Is it really that bad?

Simple Random Sampling

Every citation that meets the criteria has an equal chance of being selected.

See Demo.

NOTE: Articles vary greatly by source, target and year.
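For readers who prefer a script to the spreadsheet demo, simple random sampling can be sketched in a few lines of Python. The citation IDs and the sample size of 20 below are made up for illustration; they are not from the presentation.

```python
import random

def simple_random_sample(citations, sample_size, seed=None):
    """Draw sample_size citations without replacement, each equally likely."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(citations, sample_size)

# Hypothetical sampling frame: 500 citations that meet the criteria.
frame = [f"citation-{i}" for i in range(1, 501)]
picked_citations = simple_random_sample(frame, 20, seed=1)
print(picked_citations[:5])
```

Passing a seed is optional; it only helps when the same selection needs to be reproduced later.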

Page 23: Is it really that bad?

Stratified Sampling

Every citation in discrete homogeneous groups has an equal chance of being chosen.

Try Demo again…

Useful to zero-in on a possible problem

Stratify by source, target & year, but would be time consuming
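A minimal sketch of stratified sampling, assuming the strata are the sources. The citation records, the source names used as strata, and the figure of five citations per stratum are illustrative assumptions only.

```python
import random
from collections import defaultdict

def stratified_sample(citations, stratum_of, per_stratum, seed=None):
    """Group citations into strata, then draw a random sample from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for citation in citations:
        strata[stratum_of(citation)].append(citation)
    picked = {}
    for name in sorted(strata):
        members = strata[name]
        # Never ask for more citations than the stratum holds.
        picked[name] = rng.sample(members, min(per_stratum, len(members)))
    return picked

# Hypothetical data: 120 citations spread evenly over three sources.
citations = [
    {"id": i, "source": source}
    for i, source in enumerate(["EBSCO", "Ovid", "ProQuest"] * 40, start=1)
]
by_source = stratified_sample(citations, lambda c: c["source"], 5, seed=1)
for source, sample in by_source.items():
    print(source, [c["id"] for c in sample])
```

Stratifying by source, target and year at once would mean one stratum per combination, which is why the slide warns that it is time consuming.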

Page 24: Is it really that bad?

Sampling Population

Samples Selected from Each Stratum

Strata

Page 25: Is it really that bad?

Cluster Sampling

When the sampling population naturally “clusters” (e.g. source and targets).

The way they cluster doesn’t affect your outcome.

Divide population into these clusters

Randomly select the clusters to be a part of the sampling frame

Randomly select sample from selected clusters

Useful for very large populations.
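The two-stage selection described above can be sketched as follows. The journal names, article labels and counts are hypothetical; only the two random draws (clusters first, then citations within the chosen clusters) reflect the method.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster, seed=None):
    """Stage 1: randomly pick clusters. Stage 2: randomly pick items within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical population: 10 journals (clusters) with 30 articles each.
journals = {
    f"Journal {k}": [f"J{k}-article-{i}" for i in range(1, 31)]
    for k in range(1, 11)
}
cluster_picks = cluster_sample(journals, n_clusters=3, per_cluster=1, seed=1)
print(cluster_picks)
```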

Page 26: Is it really that bad?

Sampling Population

Samples Selected from Selected Clusters

Clusters

Page 27: Is it really that bad?

This Methodology

Simple randomized cluster
1. Select a sample of ejournals (clusters)
2. Search each database for articles
3. Randomly select a citation (sample)
4. Test and record results

Most useful for questions that are focused on the sources.

Page 28: Is it really that bad?

Other Questions, Other Designs

Comparing link-resolvers: Matched-pair

1. Select a sample of ejournals
2. Using one of the link resolvers, search the source for articles in these ejournals.
3. Randomly select a citation
4. Test and record results
5. For the next link resolver, search each source for the same citation (the matched pair).
6. Test and record results.

Page 29: Is it really that bad?

Other Questions, Other Designs

For problems related to environment (browsers, location of user, etc.): Use the same method as the link-resolver, only change the browser or location.

For problems related to targets: Use the same method, but… Randomly select ejournals from each target

Page 30: Is it really that bad?
Page 31: Is it really that bad?

Using Excel to Help You Along

Practical Applications

Page 32: Is it really that bad?

Before we begin…

For those with laptops, download files:

Excel file: http://digital.library.unt.edu/ark:/67531/metadc96818/

PDF of Steps: http://digital.library.unt.edu/ark:/67531/metadc96827/

Or, just follow along…

Page 33: Is it really that bad?

I need to check how many?

May be fewer than you’d think

Need to know:
  • Sampling strategy
  • Kind of analysis
  • Expected rate
  • Chance of a title being indexed in the source
  • Number of databases or sources to examine

Educated Guess
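One way to turn these inputs into a number is sketched below. It is not the calculator from the presentation's Excel file. It assumes the standard normal-approximation formula for estimating a proportion, n = z² · p(1 − p) / e², and it adds a margin of error e that the slide does not list. The function name and every input value are hypothetical.

```python
import math
from statistics import NormalDist

def titles_to_select(expected_rate, margin, confidence, chance_indexed, n_sources):
    """Rough planning figures: tests per source, titles per source, titles overall."""
    # Two-sided z value for the chosen confidence level (about 1.645 for 0.90).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    tests_per_source = math.ceil(
        z * z * expected_rate * (1 - expected_rate) / margin ** 2
    )
    # Not every title is indexed in every source, so draw extra titles to compensate.
    titles_per_source = math.ceil(tests_per_source / chance_indexed)
    return tests_per_source, titles_per_source, titles_per_source * n_sources

# Assumed inputs: 90% expected full-text rate, +/- 5 points, 0.90 confidence,
# a 50% chance that a title is indexed in a given source, and 3 sources.
plan = titles_to_select(0.90, 0.05, 0.90, 0.5, 3)
print(plan)  # (98, 196, 588)
```

The result is only a starting point for the educated guess; a different sampling strategy or kind of analysis would change the formula.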

Page 34: Is it really that bad?
Page 35: Is it really that bad?

Selecting the Journal Titles (Clusters)

1. Download your ejournals list

May want to limit to only those used recently

2. Randomly select the correct number of journals based on sample size

3. Randomly assign each title to the databases or sources you will be searching.

Excel Tricks
  • Remove Duplicates
  • Fill Cell: assigns a new ID number
  • Sampling method in Data Analysis: randomly selects IDs from your list
  • VLOOKUP: gets the titles for the selected IDs
  • RANDBETWEEN: randomly assigns each title to a source to be searched
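The same selection steps can be scripted. The Python sketch below follows the Excel tricks step by step; the journal titles and source names are placeholders, and `rng.sample` draws without replacement, which may not match how the Excel sampling tool behaves.

```python
import random

def select_and_assign(titles, n_titles, sources, seed=None):
    """Deduplicate titles, give them IDs, sample IDs, and assign each to a source."""
    rng = random.Random(seed)
    unique_titles = list(dict.fromkeys(titles))         # Remove Duplicates (keeps order)
    title_by_id = dict(enumerate(unique_titles, start=1))  # fill in ID numbers 1..N
    chosen_ids = rng.sample(sorted(title_by_id), n_titles)  # randomly select IDs
    # Look up each title by ID (VLOOKUP) and pick a source at random (RANDBETWEEN).
    return [(i, title_by_id[i], rng.choice(sources)) for i in chosen_ids]

# Hypothetical ejournal list with one duplicate entry.
ejournals = ["Journal A", "Journal B", "Journal A", "Journal C", "Journal D"]
assignments = select_and_assign(ejournals, 3, ["EBSCO", "Ovid", "ProQuest"], seed=1)
for journal_id, title, source in assignments:
    print(journal_id, title, source)
```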

Page 36: Is it really that bad?

[Flowchart: Search Source -> Random Number -> Select Citation -> Test Citation -> Record Result. Each step follows an "If found" arrow from the previous one; an "If Not Found" branch is also shown.]

Page 37: Is it really that bad?

Test the Sources

1. Log in to the database
2. Search for articles in the first journal
3. If none are found, note this in your results and skip to the next title.
4. If some are found, note the total number of articles.

Page 38: Is it really that bad?

Test Sources

If articles are found:
1. Sort the list by author last name (if possible)
2. Note the total number of articles found
3. Enter this number in the Sample Size Calculator worksheet (Random Article Selector)
4. Note the “Select this article” number
5. In the database, navigate to this article
6. Click on the Find Full-Text button
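The Random Article Selector step amounts to picking one whole number between 1 and the number of articles found. A minimal sketch, with a made-up count of 137 articles; the worksheet itself may work differently.

```python
import random

def select_article_number(total_found, rng=random):
    """Return a position between 1 and total_found, each equally likely."""
    if total_found < 1:
        raise ValueError("no articles found: note this and skip to the next title")
    return rng.randint(1, total_found)  # both ends are included

# Hypothetical search that returned 137 articles for the journal.
position = select_article_number(137, random.Random(1))
print(f"Select article number {position} in the sorted result list")
```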

Page 39: Is it really that bad?

Full-Text PDF!

Page 40: Is it really that bad?

Test Sources

Note the success of that link in your tally sheet

Rinse & repeat:
  • For each journal in the list
  • For each database to be tested

Page 41: Is it really that bad?

[Flowchart: Search Source -> Random Number -> Select Citation -> Test Citation -> Record Result. Each step follows an "If found" arrow from the previous one; an "If Not Found" branch is also shown.]

Page 42: Is it really that bad?

Tips & Tricks

Search by ISSN, if possible.

Display the most citations per page

If the full-text article is in that database, skip the title. This is a non-response: it lowers the Response Rate and raises the Sample Size.

Page 43: Is it really that bad?

Summarize the data

Count up all results in each source

Create ratios for each result (e.g. Full-text ratio): # Full-Text / # Titles Found

Average the ratios
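The three summary steps can be sketched in Python as below. The tally entries and the result labels `full_text` and `no_full_text` are invented for illustration; a real tally sheet may record more result types.

```python
from collections import Counter, defaultdict

def summarize(tally):
    """Count results per source, compute each full-text ratio, and average them."""
    counts = defaultdict(Counter)
    for source, result in tally:
        counts[source][result] += 1
    ratios = {}
    for source, results in counts.items():
        titles_found = sum(results.values())
        ratios[source] = results["full_text"] / titles_found
    average_ratio = sum(ratios.values()) / len(ratios)
    return counts, ratios, average_ratio

# Hypothetical tally sheet: one (source, result) pair per tested citation.
tally = (
    [("EBSCO", "full_text")] * 9 + [("EBSCO", "no_full_text")] * 1
    + [("Ovid", "full_text")] * 3 + [("Ovid", "no_full_text")] * 1
)
counts, ratios, average_ratio = summarize(tally)
print(ratios, average_ratio)
```

Averaging the per-source ratios gives each source equal weight, whatever the number of titles found in it.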

Page 44: Is it really that bad?

Example Data: Raw Counts

Page 45: Is it really that bad?

Example Data: Ratios

Page 46: Is it really that bad?

Test the Results

So you have a ratio – so what? What does it mean? Is it high? Low? Compared to what?

Use the Chi-Squared test to compare the ratios.

Excel: CHISQ.TEST(actual range, expected range)

CHISQ.TEST returns a probability (p-value), not the chi-square statistic itself. If the result is less than 0.10 (or 1.00 – Confidence Level), then the difference is statistically significant.

This may be good or not good, depending on what you're comparing against.

Page 47: Is it really that bad?

Comparing Against an Ideal %

Actual range: the # of successes & # of failures

Expected range: the # of expected successes & # of expected failures

Example: CHISQ.TEST(B2:C2, B3:C3)

     A                                              B           C              D
1    Sources                                        Full-Text   No Full-Text   Total
2    EBSCO                                          90          10             100
3    Expected or Ideal Rate                         95          5              100
4    Totals                                         185         15             200
5    Chi-Square Test                                0.022
6    Significantly Different from Expected Count?   Yes
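Outside Excel, the same figure can be computed with the Python standard library. This sketch covers only the case used here, one row of two counts, where the test has one degree of freedom and the probability reduces to a complementary error function. The function name is hypothetical.

```python
import math

def chisq_test_1x2(actual, expected):
    """p-value comparing two actual counts with two expected counts (1 degree of freedom)."""
    statistic = sum((a - e) ** 2 / e for a, e in zip(actual, expected))
    # For one degree of freedom, P(chi-square > x) equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2))

confidence_level = 0.90
p_value = chisq_test_1x2(actual=(90, 10), expected=(95, 5))
significant = p_value < 1.00 - confidence_level
print(round(p_value, 3), significant)  # 0.022 True
```

With more than two result categories the degrees of freedom change and this shortcut no longer applies.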

Page 48: Is it really that bad?

Sources to Test   Sources    Full-Text   No Full-Text   Total
1                 EBSCO      70          21             91
2                 Ovid       90          23             113
                  Totals     160         44             204

Chi-Square Test: 0.031632

Significantly Different? Yes

Sources to Test   Sources    Full-Text   No Full-Text   Total
3                 ProQuest   87          18             105
1                 EBSCO      70          21             91
                  Totals     157         39             196

Chi-Square Test: 0.032782

Significantly Different? Yes

Sources to Test   Sources    Full-Text    No Full-Text   Total
2                 Ovid       90           23             113
5                 Means      82.3333333   20.66666667    103
                  Totals     172.333333   43.66666667    216

Chi-Square Test: 0.322856

Significantly Different? No
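The three comparisons on this slide, including the Means row, can be reproduced from the raw counts. The sketch assumes the first row of each table is the actual range and the second row is the expected range, as in the CHISQ.TEST example; that reading matches the reported figures but is not stated on the slide.

```python
import math

# (Full-Text, No Full-Text) counts per source, taken from the slide.
counts = {"EBSCO": (70, 21), "Ovid": (90, 23), "ProQuest": (87, 18)}

def p_value(actual, expected):
    """Chi-square p-value for one row of two counts (1 degree of freedom)."""
    statistic = sum((a - e) ** 2 / e for a, e in zip(actual, expected))
    return math.erfc(math.sqrt(statistic / 2))

# The Means row is the column-wise average across the three sources.
means = tuple(sum(column) / len(counts) for column in zip(*counts.values()))

comparisons = {
    "EBSCO vs Ovid": p_value(counts["EBSCO"], counts["Ovid"]),
    "ProQuest vs EBSCO": p_value(counts["ProQuest"], counts["EBSCO"]),
    "Ovid vs Means": p_value(counts["Ovid"], means),
}
for label, p in comparisons.items():
    print(f"{label}: {p:.6f}", "Yes" if p < 0.10 else "No")
```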

Page 49: Is it really that bad?

Overall Success (pie chart): Any F/T Ratio 80%, Any Link Ratio 16%, No Link Ratio 5%

# Titles by Link Type (bar chart by source: EBSCO, Ovid, ProQuest; series: No Link, Any Link, Any Full-Text; axis from 0 to 120)

Page 50: Is it really that bad?

Context is King

The value of the result depends on what you are measuring and comparing.

If the difference between two sources is not significant, then they are statistically similar.

If the difference between two link resolvers is significant, then one is better than the other. NOTE: This doesn’t tell you by how much!

Use your best judgment

Page 51: Is it really that bad?

I am here to help you…

Karen R. Harker
[email protected]