IS IT REALLY THAT BAD?
Verifying the extent of full-text linking problems
Karen R. Harker, MLS, MPH
Collection Assessment Librarian
UNT Libraries
“I find searching for journal articles and actually finding full articles to a very difficult. Sometimes it just links to another site and then to a fragment of an article.”
“It's hard to find links to online articles--some links say they can't find it, but sometimes you can still locate it and I don't know why the main "find links" page doesn't bring it up.”
“Sometimes, if a link is not provided to an article in the search results, it is very difficult to find. The article linker often comes up with no results even though it says the article is in UNT's collection.”
Frustrated with “Find Full-Text” when “it doesn't work”; “if it's not 100% perfect, there is really no point” in offering the service.
What have you done?
Link-Checking
RANDOM Selection Cannot Be Predicted
What is required?
Intermediate to advanced Excel (but NOT programming)
Knowledge
About your collection
About the problem
Clear questions
Think about...
your problem
your collection
your link resolver
your people
Brainstorm
Come away with…
What Kind of Research Questions?
Just a few
Be Specific
Compared to what?
What is important to you?
Such as…
Are links from EBSCO more successful than links from Ovid?
Is full-text linking better or worse compared to last year?
Is Serials Solutions’ 360 link resolver more likely to get to the full-text from our key resources than EBSCO’s?
What is the chance that a client will get the full-text of an article on the first click?
Start with the Results
Comparing One Source With Another
Sources   Full-Text   No Full-Text   Total
EBSCO     90          10             100
Ovid      85          15             100
Totals    175         25             200
Confidence Level: 0.90
Chi-Square Test: 0.161
Target Chi-Square: 1.00 - 0.90 = 0.10
Is Chi-Square test < Target? No
Is Ovid Significantly Different from EBSCO? No
Comparing One Source With the Average
Sources   Full-Text   No Full-Text   Total
EBSCO     90          10             100
Average   75          25             100
Totals    165         35             200
Confidence Level: 0.90
Chi-Square Test: 0.001
Target Chi-Square: 1.00 - 0.90 = 0.10
Is Chi-Square test < Target? Yes
Is EBSCO Significantly Different from the Average? Yes
Comparing One Target With The Expected or Ideal Rate
Targets                  Full-Text   No Full-Text   Total
EBSCO                    90          10             100
Expected or Ideal Rate   95          5              100
Totals                   185         15             200
Confidence Level: 0.90
Chi-Square Test: 0.022
Target Chi-Square: 1.00 - 0.90 = 0.10
Is Chi-Square test < Target? Yes
Is EBSCO Significantly Different from Ideal Rate? Yes
Random Sampling
Review or background, depending on your viewpoint
Sampling Terms
Universe
Sampling Population
Sampling Frame
Sample
All Citations
Citations in databases to which we have access to articles to which we have full-text access
Only Journal Articles
Selection Methods
Non-probability
Convenience sampling
The chance of being selected is not known.
Probability
The probability of any one citation being selected is known.
Simple Random Sampling
Every citation that meets the criteria has an equal chance of being selected.
See Demo.
NOTE: Articles vary greatly by source, target and year.
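A minimal Python sketch of simple random sampling, assuming the citations that meet the criteria are already collected in a list. The citation labels, the frame size of 200 and the sample size of 5 are made up for illustration.

```python
import random

# Hypothetical sampling frame: every citation that meets the criteria.
citations = [f"citation-{i}" for i in range(1, 201)]

def simple_random_sample(frame, size, seed=None):
    """Draw `size` items without replacement; each item is equally likely."""
    rng = random.Random(seed)  # the seed only makes the draw repeatable
    return rng.sample(frame, size)

sample = simple_random_sample(citations, 5, seed=1)
print(sample)
```

Fixing the seed is optional; it is used here only so the same selection can be reproduced later.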
Stratified Sampling
Every citation in discrete homogeneous groups has an equal chance of being chosen.
Try Demo again…
Useful to zero-in on a possible problem
Could stratify by source, target & year, but that would be time-consuming
[Figure: a sampling population divided into strata, with samples selected from each stratum]
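Stratified sampling could be sketched in Python as follows, assuming each citation is tagged with the group (stratum) it belongs to. The source names are used as strata here, and the citation labels and counts are invented for the example.

```python
import random
from collections import defaultdict

# Hypothetical citations tagged with the source they came from (the stratum).
citations = (
    [("EBSCO", f"ebsco-{i}") for i in range(1, 61)]
    + [("Ovid", f"ovid-{i}") for i in range(1, 31)]
    + [("ProQuest", f"proquest-{i}") for i in range(1, 11)]
)

def stratified_sample(items, per_stratum, seed=None):
    """Group items by stratum, then draw a simple random sample from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for stratum, citation in items:
        strata[stratum].append(citation)
    return {
        stratum: rng.sample(members, min(per_stratum, len(members)))
        for stratum, members in strata.items()
    }

picked = stratified_sample(citations, per_stratum=5, seed=1)
for source, chosen in picked.items():
    print(source, chosen)
```

Drawing the same number from every stratum guarantees that a small group (ProQuest here) is represented, which is what makes the design useful for zeroing in on a suspected problem.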
Cluster Sampling
When the sampling population naturally “clusters” (e.g. source and targets).
The way they cluster doesn’t affect your outcome.
Divide population into these clusters
Randomly select the clusters to be a part of the sampling frame
Randomly select sample from selected clusters
Useful for very large populations.
[Figure: a sampling population divided into clusters, with samples selected from the selected clusters]
This Methodology
Simple randomized cluster
1. Select a sample of ejournals (clusters)
2. Search each database for articles
3. Randomly select a citation (sample)
4. Test and record results
Most useful for questions that are focused on the sources.
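The simple randomized cluster design above can be sketched in Python as a two-stage draw. This is an illustration only: the journal names and article lists are hypothetical stand-ins for what a database search would return.

```python
import random

# Hypothetical clusters: each ejournal maps to the citations found for it.
journals = {
    f"journal-{j}": [f"journal-{j}-article-{a}" for a in range(1, 4 * j + 1)]
    for j in range(1, 21)
}

def cluster_sample(clusters, n_clusters, seed=None):
    """Stage 1: randomly pick clusters. Stage 2: one random citation from each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.choice(clusters[name]) for name in chosen}

result = cluster_sample(journals, n_clusters=5, seed=1)
for journal, citation in result.items():
    print(journal, "->", citation)
```

Only the selected journals ever need to be searched, which is why the approach scales to very large collections.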
Other Questions, Other Designs
Comparing link-resolvers: Matched-pair
1. Select a sample of ejournals
2. Using one of the link resolvers, search the source for articles in these ejournals.
3. Randomly select a citation
4. Test and record results
5. For the next link resolver, search each source for the same citation (the matched pair).
6. Test and record results.
For problems related to environment (browsers, location of user, etc.): Use the same method as the link-resolver, only change the browser or location.
For problems related to targets: Use the same method, but randomly select ejournals from each target.
Using Excel to Help You Along
Practical Applications
Before we begin…
For those with laptops, download files:
Excel file: http://digital.library.unt.edu/ark:/67531/metadc96818/
PDF of Steps: http://digital.library.unt.edu/ark:/67531/metadc96827/
Or, just follow along…
I need to check how many?
May be fewer than you’d think
Need to know:
Sampling strategy
Kind of analysis
Expected rate
Chance of a title being indexed in the source
Number of databases or sources to examine
Educated Guess
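As a rough illustration of how the expected rate and the chance of a title being indexed feed into the number of titles to check, here is a Python sketch using the common textbook formula for estimating a proportion. This is an assumption: the presentation does not say which formula its Sample Size Calculator worksheet uses, and the margin of error of 0.05 is an illustrative choice.

```python
import math
from statistics import NormalDist

def required_sample(expected_rate, margin, confidence=0.90, index_chance=1.0):
    """Titles to check for a proportion estimate (textbook formula, an assumption).

    n = z^2 * p * (1 - p) / margin^2, inflated by 1 / index_chance because
    titles that are not indexed in the source produce no usable result.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided z-score
    n = z ** 2 * expected_rate * (1 - expected_rate) / margin ** 2
    return math.ceil(n / index_chance)

print(required_sample(0.90, 0.05))                    # every title is indexed
print(required_sample(0.90, 0.05, index_chance=0.8))  # 80% of titles are indexed
```

The closer the expected rate is to 0% or 100%, the fewer titles are needed, which is one reason the number may be smaller than you'd think.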
Selecting the Journal Titles (Clusters)
1. Download your ejournals list
May want to limit to only those used recently
2. Randomly select the correct number of journals based on sample size
3. Randomly assign each title to the databases or sources you will be searching.
Excel Tricks
Remove Duplicates
Fill Cell – assigns a new ID number
Sampling method in Data Analysis – randomly selects IDs from your list
VLOOKUP – gets the titles for the selected IDs
RANDBETWEEN – randomly assigns each title to a source to be searched
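For readers who prefer a script, the same selection steps can be approximated in Python. This is a sketch, not a replacement for the workbook: the title list and source names are invented, and `random.sample` draws without replacement, so no title is selected twice.

```python
import random

# Hypothetical ejournal list with a duplicate, as a download might contain.
titles = ["Journal A", "Journal B", "Journal C", "Journal B", "Journal D", "Journal E"]
sources = ["EBSCO", "Ovid", "ProQuest"]  # databases to be searched

def select_and_assign(titles, sources, sample_size, seed=None):
    rng = random.Random(seed)
    unique = list(dict.fromkeys(titles))       # Remove Duplicates (keeps order)
    by_id = dict(enumerate(unique, start=1))   # Fill: assign ID numbers 1..n
    picked_ids = rng.sample(sorted(by_id), sample_size)  # randomly select IDs
    # Lookup step: ID -> title. Random-assignment step: one source per title.
    return [
        (i, by_id[i], sources[rng.randint(0, len(sources) - 1)])
        for i in picked_ids
    ]

rows = select_and_assign(titles, sources, sample_size=3, seed=1)
for row in rows:
    print(row)
```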
[Flowchart: Search Source, Random Number, Select Citation, Test Citation, Record Result; each step continues "If found", with an "If Not Found" branch]
Test the Sources
1. Log in to the database
2. Search for articles in the first journal
3. If none are found, note this in your results and skip to the next title.
4. If some are found, note the total number of articles.
Test Sources
If articles are found:
1. Sort the list by author last name (if possible)
2. Note the total number of articles found
3. Enter this number in the Sample Size Calculator worksheet (Random Article Selector)
4. Note the “Select this article” number
5. In the database, navigate to this article
6. Click on the Find Full-Text button
Full-Text PDF!
Test Sources
Note the success of that link in your tally sheet
Rinse & repeat:
For each journal in the list
For each database to be tested
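The article-selection and tallying steps can be pictured with a short Python sketch. It assumes the Random Article Selector simply picks a whole number between 1 and the number of articles found. The outcome labels and the tally structure are hypothetical.

```python
import random
from collections import Counter

def select_article_number(total_found, rng=random):
    """Pick which article to test: a whole number from 1 to total_found."""
    if total_found < 1:
        raise ValueError("no articles found; record a non-response instead")
    return rng.randint(1, total_found)  # inclusive on both ends

# Hypothetical tally sheet: one Counter of outcomes per database.
tally = {"EBSCO": Counter(), "Ovid": Counter()}

def record(database, outcome):
    """Note the result of one link test, e.g. 'full-text' or 'no full-text'."""
    tally[database][outcome] += 1

rng = random.Random(1)
print("Select this article:", select_article_number(37, rng))
record("EBSCO", "full-text")
record("EBSCO", "no full-text")
record("Ovid", "not found")
print(tally)
```

Sorting the result list the same way each time (by author last name, as suggested above) keeps the numbered position meaningful.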
Tips & Tricks
Search by ISSN, if possible.
Display the most citations per page
If full-text article is in that database, skip title. This is a non-response.
The lower the Response Rate, the larger the Sample Size needed.
Summarize the data
Count up all results in each source
Create ratios for each result (e.g. Full-text ratio = # Full-Text / # Titles Found)
Average the ratios
Example Data: Raw Counts
Example Data: Ratios
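A small Python sketch of the summarizing step, using the per-source counts that appear in this presentation's comparison tables. Treating each table's Total column as "# Titles Found" is an assumption.

```python
# Counts copied from the presentation's comparison tables:
# source -> (full-text, no full-text, total).
# Assumption: Total is the number of titles found in that source.
counts = {
    "EBSCO": (70, 21, 91),
    "Ovid": (90, 23, 113),
    "ProQuest": (87, 18, 105),
}

def full_text_ratio(full_text, titles_found):
    """# Full-Text / # Titles Found."""
    return full_text / titles_found

ratios = {
    src: full_text_ratio(ft, total) for src, (ft, _no_ft, total) in counts.items()
}
average_ratio = sum(ratios.values()) / len(ratios)

for src, ratio in ratios.items():
    print(f"{src}: {ratio:.1%}")
print(f"Average full-text ratio: {average_ratio:.1%}")
```

Averaging the per-source ratios gives each source equal weight regardless of how many titles were found in it.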
Test the Results
So you have a ratio – so what? What does it mean? Is it high? Low? Compared to what?
Use the Chi-Squared test to compare the ratios
Excel: CHISQ.TEST(actual range, expected range)
If the result is less than 0.10 (or 1.00 – Confidence Level), then the difference is statistically significant.
This may be good or not good, depending on what you're comparing against.
Comparing Against an Ideal %
Actual range: the # of successes & # of failures
Expected range: the # of expected successes & # of expected failures
Example: CHISQ.TEST(B2:C2, B3:C3)
    A                                              B           C              D
1   Sources                                        Full-Text   No Full-Text   Total
2   EBSCO                                          90          10             100
3   Expected or Ideal Rate                         95          5              100
4   Totals                                         185         15             200
5   Chi-Square Test                                0.022
6   Significantly Different from Expected Count?   Yes
Sources to Test   Sources    Full-Text    No Full-Text   Total
1                 EBSCO      70           21             91
2                 Ovid       90           23             113
                  Totals     160          44             204
Chi-Square Test: 0.031632
Significantly Different? Yes

Sources to Test   Sources    Full-Text    No Full-Text   Total
3                 ProQuest   87           18             105
1                 EBSCO      70           21             91
                  Totals     157          39             196
Chi-Square Test: 0.032782
Significantly Different? Yes

Sources to Test   Sources    Full-Text    No Full-Text   Total
2                 Ovid       90           23             113
5                 Means      82.3333333   20.66666667    103
                  Totals     172.333333   43.66666667    216
Chi-Square Test: 0.322856
Significantly Different? No
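For anyone who wants to check these figures outside Excel, the following Python sketch appears to reproduce the CHISQ.TEST results on the slides when the first row is passed as the actual range and the second row as the expected range. It is limited to one row of two counts (one degree of freedom), and the function names are invented for this example.

```python
import math

def chisq_test(actual, expected):
    """p-value of a chi-square test for one row of two counts (1 degree of freedom)."""
    if len(actual) != 2 or len(expected) != 2:
        raise ValueError("this sketch handles exactly two counts per row")
    statistic = sum((a - e) ** 2 / e for a, e in zip(actual, expected))
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2))

def significantly_different(actual, expected, confidence=0.90):
    """True when the p-value falls below 1.00 - Confidence Level."""
    return chisq_test(actual, expected) < 1.00 - confidence

ebsco, ovid, proquest = (70, 21), (90, 23), (87, 18)
# "Means" row: column averages across the three sources.
means = tuple(sum(col) / 3 for col in zip(ebsco, ovid, proquest))

print(round(chisq_test((90, 10), (95, 5)), 3))  # EBSCO vs ideal rate
print(round(chisq_test(ebsco, ovid), 6))        # EBSCO vs Ovid
print(round(chisq_test(proquest, ebsco), 6))    # ProQuest vs EBSCO
print(round(chisq_test(ovid, means), 6))        # Ovid vs means
print(significantly_different(ovid, means))
```

The statistic is the sum of (actual - expected)^2 / expected over the two cells, so swapping which row counts as "expected" changes the result.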
Overall Success
Any F/T Ratio: 80%
Any Link Ratio: 16%
No Link Ratio: 5%
[Chart: # Titles by Link Type (No Link, Any Link, Any Full-Text) for EBSCO, Ovid and ProQuest]
Context is King
The value of the result depends on what you are measuring and comparing
If the difference between two sources is not significant, then they are statistically similar.
If the difference between two link resolvers is significant, then one is better than the other.
NOTE: This doesn’t tell you by how much!
Use your best judgment