Crunching Numbers: OPAC Log Analysis of WebVoyage Bennett Claire Ponsford Digital Services Librarian Texas A&M University Libraries EndUser 2007 Session 34


Crunching Numbers: OPAC Log Analysis of WebVoyage

Bennett Claire Ponsford
Digital Services Librarian

Texas A&M University Libraries

EndUser 2007 Session 34

Overview

Why analyze your log files?
How to do it
What we found
The changes we made
What do the latest logs say?
Improvements needed

Why Analyze?

To see how your users search when you’re not watching
To resolve internal disagreements over default searches, limits, etc.
To see whether changes to WebVoyage really improved search results
As a counterpoint to task-based user testing

C.S. Lewis
Lewis, C. S. (Clive Staples)
LION WITCH?
LION, WITCH?
LION, WITCH, AND WARDROBE?
Lewis, C. S. (Clive Staples)

Issues to Think About

Does Voyager capture the data you need?
Privacy concerns
Does your network organize data the way you need?
  Staff vs. public IP addresses
Do you want all searches or a sample?
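One way to handle the staff vs. public question is to test each client IP against a list of address ranges. The following is a minimal sketch, assuming the ranges to set aside are the two that the sample query in this talk filters out (128.194.84-87.* and 165.91.39.*). The names EXCLUDED_NETWORKS and is_excluded are hypothetical, and a real list would come from your network staff.

```python
import ipaddress

# Assumption: ranges to set aside, taken from the sample query's filter.
# 128.194.84.0/22 spans 128.194.84.* through 128.194.87.*.
EXCLUDED_NETWORKS = [
    ipaddress.ip_network("128.194.84.0/22"),
    ipaddress.ip_network("165.91.39.0/24"),
]


def is_excluded(client_ip: str) -> bool:
    """Return True if client_ip falls inside any excluded range."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Masked or malformed addresses (e.g. XXX.XXX.XX.XXX) are kept.
        return False
    return any(addr in net for net in EXCLUDED_NETWORKS)
```

Expressing the ranges as networks rather than string patterns avoids accidental matches on addresses that merely share a prefix.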

How To

Read the documentation
  Technical Manual, Chapter 15, Popacjob
Begin logging your data
Extract data into Access database
Clean up data as needed
Run queries
Scratch head and contact Tech Support

5-MAR-07 WebOpac 20061016102711 Title keyword (TKEY New) AND (TKEY York) AND (TKEY Times) Y West Campus Library K Y N 7 1 W 999.999.99.999 AMDB20020820112825 N

Data Fields

Search_date: 5-MAR-07
Stat_string: WebOpac
Session_id: 20070305123912
Search_type: Title keyword
Search_string: (TKEY New) AND (TKEY York) AND (TKEY Times)

Data Fields (cont.)

Limit_flag: Y
Limit_string: West Campus Library
Index_type: K
Relevance: Y
Hyperlink: N
Hits: 7

Data Fields (cont.)

Search_tab: 1
Client_type: W
Client_ip: XXX.XXX.XX.XXX
Dbkey: AMDB20020820112825
Redirect Flag: N
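Before loading the log into a database, it can help to split each record into these named fields. This is a rough sketch only: it assumes the fields arrive in the order listed on the Data Fields slides and are tab-delimited, which the slides do not confirm. FIELDS and parse_record are hypothetical names.

```python
from typing import Dict

# Field names in the order shown on the Data Fields slides.
FIELDS = [
    "Search_date", "Stat_string", "Session_id", "Search_type",
    "Search_string", "Limit_flag", "Limit_string", "Index_type",
    "Relevance", "Hyperlink", "Hits", "Search_tab",
    "Client_type", "Client_ip", "Dbkey", "Redirect_flag",
]


def parse_record(line: str, delimiter: str = "\t") -> Dict[str, str]:
    """Split one log line into a field-name -> value mapping.

    Assumption: fields are separated by `delimiter` (tab here); check the
    Technical Manual for the delimiter your log actually uses.
    """
    values = line.rstrip("\n").split(delimiter)
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


# Sample record built from the values shown on the slides.
sample = "\t".join([
    "5-MAR-07", "WebOpac", "20070305123912", "Title keyword",
    "(TKEY New) AND (TKEY York) AND (TKEY Times)", "Y",
    "West Campus Library", "K", "Y", "N", "7", "1", "W",
    "XXX.XXX.XX.XXX", "AMDB20020820112825", "N",
])
record = parse_record(sample)
```

Rejecting lines with the wrong field count surfaces records that need cleanup before they reach the queries.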

SQL for Count of Search Type

SELECT [Spring 2007 OPAC log].Search_type,
       Count([Spring 2007 OPAC log].Search_type) AS CountOfSearch_type
FROM [Spring 2007 OPAC log]
WHERE [Spring 2007 OPAC log].Client_type = "W"
  AND [Spring 2007 OPAC log].Search_tab = "1"
  AND [Spring 2007 OPAC log].Hyperlink = "N"
  AND [Spring 2007 OPAC log].Client_ip Not Like "128.194.8[4-7].*"
  AND [Spring 2007 OPAC log].Client_ip Not Like "165.91.39.*"
GROUP BY [Spring 2007 OPAC log].Search_type
ORDER BY [Spring 2007 OPAC log].Search_type;
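For sites without Access, the same count can be approximated with Python's built-in sqlite3 module. This is an illustrative sketch, not the query used in the talk: the table name opac_log and the sample rows are made up, and SQLite's GLOB operator stands in for Access's Like because SQLite's LIKE does not support [4-7] character ranges.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE opac_log (Search_type TEXT, Client_type TEXT, "
    "Search_tab TEXT, Hyperlink TEXT, Client_ip TEXT)"
)
# Made-up rows for illustration only.
rows = [
    ("Title keyword", "W", "1", "N", "192.0.2.10"),
    ("Title keyword", "W", "1", "N", "192.0.2.11"),
    ("Author keyword", "W", "1", "N", "198.51.100.5"),
    ("Title keyword", "W", "1", "N", "128.194.85.20"),  # excluded IP range
    ("Author keyword", "W", "1", "Y", "192.0.2.12"),    # hyperlink search
    ("Title keyword", "W", "2", "N", "192.0.2.13"),     # different tab
]
conn.executemany("INSERT INTO opac_log VALUES (?, ?, ?, ?, ?)", rows)

# Same filters as the Access query: Web client, tab 1, typed (not
# hyperlinked) searches, with the two IP ranges left out.
query = """
SELECT Search_type, COUNT(Search_type) AS CountOfSearch_type
FROM opac_log
WHERE Client_type = 'W'
  AND Search_tab = '1'
  AND Hyperlink = 'N'
  AND Client_ip NOT GLOB '128.194.8[4-7].*'
  AND Client_ip NOT GLOB '165.91.39.*'
GROUP BY Search_type
ORDER BY Search_type
"""
counts = conn.execute(query).fetchall()
```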

Results

Author Browse – 334
Author headings – 943
Author keyword – 1587
Builder – 6
Call Number Browse – 777
Command – 16
Documents Call Number – 15
Expert keyword (rel) – 5
Expert keyword (rev) – 2378
Journal title keyword – 793
Journal title – 1375
Keyword – 82
Keyword (Relevance) – 6179
Keyword Search – 1574
LC Call Number browse – 14
Locally Assigned Call# – 2
Simple Search – 58
Subject Browse – 851
Subject headings – 128
Subject Hds keyword – 251
Title keyword – 2045
Title Redirect – 615
Title starts with – 1677
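Raw counts are easier to compare as shares of the total. The sketch below simply sums the counts from the Results slide and converts each to a percentage; the variable names are illustrative.

```python
# Counts copied from the Results slide.
counts = {
    "Author Browse": 334, "Author headings": 943, "Author keyword": 1587,
    "Builder": 6, "Call Number Browse": 777, "Command": 16,
    "Documents Call Number": 15, "Expert keyword (rel)": 5,
    "Expert keyword (rev)": 2378, "Journal title keyword": 793,
    "Journal title": 1375, "Keyword": 82, "Keyword (Relevance)": 6179,
    "Keyword Search": 1574, "LC Call Number browse": 14,
    "Locally Assigned Call#": 2, "Simple Search": 58,
    "Subject Browse": 851, "Subject headings": 128,
    "Subject Hds keyword": 251, "Title keyword": 2045,
    "Title Redirect": 615, "Title starts with": 1677,
}

total = sum(counts.values())
shares = {name: 100 * n / total for name, n in counts.items()}

# Print the five most-used search types with their share of all searches.
top_five = sorted(shares.items(), key=lambda kv: kv[1], reverse=True)[:5]
for name, pct in top_five:
    print(f"{name:<24}{pct:5.1f}%")
```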

June 2006 (Voyager 5)

September 2006 (Voyager 5)

Changed Interface Defaults

Kept tab at Simple Search
Changed search to Keyword (CMD* with JavaScript)
Changed result sort to relevance

Fall 2006

Preparing to upgrade to Voyager 6.1
  New keyword searches with ^ to automatically AND words together
Some people unhappy with recent changes
  Default search
  Search results sort order
Decided to look at the data

Decisions upgrading to V6

Basic data
  Where are our searchers
  What search tab are they using
  How are they searching
Default search
Order of title searches
Simple limits

Where Are Our Searchers?

[Pie chart: share of searches from Inside Libraries vs. Outside Libraries; slices labeled 35% and 65%]

What Search Tab Used?

[Bar chart: percentage of searches (axis 0% to 100%) by search tab: Simple (1), Keyword (aka Builder, 2), Course Reserves (3); series: Inside Libraries, Outside Libraries]

Default Search: Discussion

Title search (TALL)
  What we traditionally had used
  Reference’s preference
General keyword search (new GKEY^*)
  What users are used to in a Google world
  More forgiving search

Comparison of Searches Used

[Bar chart: number of searches used (axis 0 to 8000); series: Inside Libraries, Outside Libraries]

Default Search: Decision

General keyword search (new GKEY^*)
  User preference
  Fewer No Hit results

First Title Search: Discussion

Left-anchored title (TALL)
  Preferred by Reference
Title keyword (new TKEY^*)
  More forgiving

Title Search (TALL): Problems

Title Search: Decision

Title keyword
  Left-anchored title had too many problems

Simple Limits

Several additional location limits requested

Concern that too many would be confusing

Search Limits Used

[Bar chart: percentage of searches using limits (axis 0% to 10%)]

Simple Limits: Decision

Added new limits and will evaluate with more data

Analysis of Voyager 6 Logs

Search frequencies
No hit frequencies
Title search problems
Journal title search hits
Search limits

Keyword and Subject Searches

[Bar chart: percentage of searches (axis 0% to 30%) for Expert keyword, Keyword, Subject browse, Subject keyword; series: Inside Libraries, Outside Libraries]

Author Searches

[Bar chart: percentage of searches (axis 0.0% to 8.0%) for Author Browse, Author headings browse, Author keyword; series: Inside Libraries, Outside Libraries]

Title Searches

[Bar chart: percentage of searches (axis 0% to 30%) for Journal keyword, Journal title, Title keyword, Title Redirect Keyword, Title; series: Inside Libraries, Outside Libraries]

No Hit Title Searches: Do We Own Them?

[Bar chart: percentage of no-hit title searches (axis 0% to 50%) by ownership: Yes, No, Unable to verify, Other; series: Keyword, Left-Anchored (preliminary)]

No Hit Title Searches: Problems

[Bar chart: problems in no-hit title searches (axis 0% to 60%); series: Keyword, Left-Anchored (preliminary)]

Location Limits Used

[Bar chart: number of searches using each location limit (axis 0 to 600): Media Services, Qatar, Cushing, Bestsellers, West Campus, Reference Coll., Curriculum Coll., Web Resources]

Comparison of Limits

[Bar chart: percentage of searches using limits (axis 0% to 8%); series: Inside Libraries, Outside Libraries]

Have Changes Helped?

Search frequency
No hits percentage

Search Frequency

[Bar chart: search frequency (axis 0% to 60%) for Summer 2006, Fall 2006, Spring 2007]

No Hit Percentages

Some improvement but no major change

[Bar chart: no-hit percentage (axis 0% to 30%) for Summer 2006, Fall 2006, Spring 2007]

Detailed No Hits Percentages

[Bar chart: detailed no-hit percentages (axis 0% to 20%) for Summer 2006, Fall 2006, Spring 2007]

Improvements: Spelling

Spellchecking
Automatic searching of variant spellings
  “&” or “and”
  British vs. American spellings
  Numbers
  Abbreviations
Did you mean?
  Suggestions based on field
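To make the variant-spelling idea concrete, here is a small sketch of generating alternative forms of a query to retry when a search returns no hits. It is not a Voyager feature or API: spelling_variants and the two-entry British/American map are hypothetical, and it covers only the “&”/“and” and British/American cases from the list.

```python
import re
from typing import Set

# Hypothetical, tiny British -> American map; a real list would be far longer.
BRITISH_TO_AMERICAN = {"colour": "color", "organisation": "organization"}


def spelling_variants(query: str) -> Set[str]:
    """Return the lowercased query plus simple variant spellings."""
    base = " ".join(query.lower().split())
    variants = {base}
    # Swap "&" and "and" in both directions.
    variants.add(re.sub(r"\s*&\s*", " and ", base).strip())
    variants.add(re.sub(r"\band\b", "&", base))
    # Swap British and American spellings in every variant found so far.
    for brit, amer in BRITISH_TO_AMERICAN.items():
        for v in list(variants):
            variants.add(re.sub(rf"\b{brit}\b", amer, v))
            variants.add(re.sub(rf"\b{amer}\b", brit, v))
    return variants
```

For example, spelling_variants("Crime & Punishment") yields both "crime & punishment" and "crime and punishment", each of which could be tried in turn.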

Improvements: Help

More granular no hits help
  Specific search types
  Any search with “conference” or “proceedings” in it
  Journal title searches including “vol.”, “no.”, or a number
  Searches with more than 4 or 5 words
More granular help for too many hits
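The no-hits rules listed here amount to a short series of tests on the search type and search string. A sketch of how they might be coded follows; the function name, the hint labels, and the four-word threshold are assumptions for illustration (the slide says "4 or 5").

```python
import re
from typing import List


def no_hit_hints(search_type: str, search_string: str,
                 max_words: int = 4) -> List[str]:
    """Pick help topics for a zero-hit search.

    The returned labels are hypothetical keys that a help page could map
    to actual messages.
    """
    text = search_string.lower()
    hints = []
    # Conference papers are often cataloged under the proceedings title.
    if re.search(r"\b(conference|proceedings)\b", text):
        hints.append("conference")
    # Journal title searches that look like a citation (vol., no., digits).
    if "journal" in search_type.lower() and re.search(r"\bvol\.|\bno\.|\d", text):
        hints.append("journal-citation")
    # Long queries tend to fail when every word must match.
    if len(text.split()) > max_words:
        hints.append("too-many-words")
    return hints
```

Keeping each rule independent lets a single search collect more than one hint, for instance a long conference title.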

Improvements: Specific Searches

Keyword searches
  Automatic stemming
  Ignore punctuation and spacing
  Ignore stop words
Title searches
  Ignore initial article
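Several of these wishes are simple text normalizations. The sketch below shows one possible form for three of them (punctuation and spacing, stop words, initial articles); stemming is left out. The word lists are short illustrative assumptions, not Voyager's own, and the function names are hypothetical.

```python
import re
from typing import List

# Assumptions: short illustrative lists, not Voyager's actual stop words.
STOP_WORDS = {"a", "an", "and", "of", "the", "in", "on", "for"}
INITIAL_ARTICLES = ("the ", "an ", "a ")


def normalize_keywords(query: str) -> List[str]:
    """Lowercase, replace punctuation with spaces, and drop stop words."""
    cleaned = re.sub(r"[^\w\s]", " ", query.lower())
    return [w for w in cleaned.split() if w not in STOP_WORDS]


def strip_initial_article(title: str) -> str:
    """Remove one leading English article from a left-anchored title search."""
    stripped = title.strip()
    lowered = stripped.lower()
    for article in INITIAL_ARTICLES:
        if lowered.startswith(article):
            return stripped[len(article):].lstrip()
    return stripped
```

With these, "LION, WITCH, AND WARDROBE" reduces to the keywords lion, witch, wardrobe, and a title typed with its leading "The" can still match a left-anchored index.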

Journal search results layout

Whether to include the index field in the journal title search results
  Same search results, but the order of the results is changed by the inclusion of the index field
  Primarily a problem for single-word titles that retrieve more than one screen of results (Science, Nature, etc.)

JALL Search Results

             Inside Library       Outside Library
Results      Count    Percent     Count    Percent
No hits      252      32.5%       449      32.7%
1 hit        144      18.6%       203      14.8%
2-5 hits     264      34.0%       557      40.6%
5-50 hits    76       9.8%        126      9.2%
51+ hits     40       5.2%        37       2.7%
Total        776                  1372

What Next?

Continued analysis of searches with no hits

Analysis of search repair strategies
Word counts

More Information

Jansen, Bernard J. “Search log analysis: What it is, what’s been done, how to do it.” Library & Information Science Research 28 (2006): 407-432.

Yu, Holly, and Margo Young. “The impact of Web search engines on subject searching in OPAC.” Information Technology and Libraries 23 (2004): 168-180.

Contact Information

Bennett Claire Ponsford
[email protected]
979/845-0877

https://libcat.tamu.edu