The Search for Quality: productive Web searching

John Cox
James Hardiman Library
NUI, Galway



The Problem

• 7.3 million new Web pages daily
• Quality varies, mainly due to ease of publication and lack of checks
• Quality is in the eye of the beholder
• Over-dependence on general search engines
• Simplistic use of search tools

Some Usage Findings

• NUI, Galway Library survey, March 2000:
    • Search engines cited by 79 out of 167 respondents
    • Used exclusively for some topics, eg Nazism, defamation law, hepatitis C
    • Less than 50% satisfied
• Other surveys show very simplistic use:
    • 33% of users enter one word only
    • A further 33% of users enter two words only
• UK survey indicates 80% of searchers waste some time
• US survey shows “search rage” within 12 minutes

Key Question

“How much better than users are information staff at finding high-quality information on the Web and what leadership do we provide?”

5 key actions needed

5 Key Actions

• Get the best from the search engines
• Go vertical: subject-specific sources
• Take time to experiment, eg helper software
• Exploit the invisible Web
• Actively promote quality searching

1. Get the Best from the Search Engines

• Understand how they work
• Know their limitations
• Use advanced features
• Search more than one
• Know when not to use them

Search Engine Components

• Crawler: follows links
• Indexer: builds database
• Query processor: lets us search
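The three components can be illustrated with a toy example. This is only a sketch: the in-memory "web" below, its URLs and page texts are invented for illustration, and real search engines are far more elaborate. It does show why a page that nothing links to never reaches the index.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
# All pages here are invented purely for illustration.
PAGES = {
    "a.example/home": ("library search guide", ["a.example/tips", "b.example/law"]),
    "a.example/tips": ("search tips for quality web sources", []),
    "b.example/law": ("defamation law guide", ["a.example/home"]),
    "c.example/orphan": ("unlinked quality report", []),  # no page links here
}

def crawl(seed):
    """Crawler: follow links breadth-first, starting from a seed URL."""
    seen, queue = [], deque([seed])
    while queue:
        url = queue.popleft()
        if url in seen or url not in PAGES:
            continue
        seen.append(url)
        queue.extend(PAGES[url][1])
    return seen

def build_index(urls):
    """Indexer: map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url in urls:
        for word in PAGES[url][0].lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Query processor: return the URLs that contain every query word."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

crawled = crawl("a.example/home")
index = build_index(crawled)
print(search(index, "guide"))
print(search(index, "unlinked"))  # empty: the orphan page was never crawled
```

The orphan page exists but is never found, because the crawler can only reach pages by following links from pages it already knows.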

Common Limitations

• Profit-oriented
• Paid entries listed at top
• Out of date
• Partial site indexing
• Technically must exclude many sites, eg
    • Password-protected
    • Registration needed
    • Database-driven
• Hidden search facilities

Understanding Google

Strengths
• Coverage
• Cached pages
• File types, eg PDF, .doc, .ppt
• Relevance: link popularity
• Beyond pages: images, newsgroups

Weaknesses
• Poor Boolean support
• No truncation
• Limited date searching
• Invisible search facilities
• Two pages per site displayed by default

Google: coverage

Google: search modes

Basic

Advanced

Google: file types

Google: newsgroup search

Google: cached pages 1

Google: cached pages 2

Google: Boolean limitations 1

Correct syntax: medline OR embase

Google: Boolean limitations 2

Correct syntax: medline -embase (or use Advanced Search)

Google: no truncation

Use clinton (tax OR taxes OR taxation)
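The query patterns on the Boolean and truncation slides (upper-case OR, a leading minus sign to exclude a term, and a parenthesised OR group listing word variants in place of truncation) can be assembled mechanically. The helper names below are hypothetical and the output simply mirrors the syntax shown on the slides; how a given search engine interprets it may differ.

```python
def quote(term):
    """Wrap multi-word terms in double quotes so they are treated as a phrase."""
    return f'"{term}"' if " " in term else term

def any_of(terms):
    """Join alternatives with upper-case OR inside parentheses."""
    terms = [quote(t) for t in terms]
    return terms[0] if len(terms) == 1 else "(" + " OR ".join(terms) + ")"

def build_query(required=(), alternatives=(), excluded=()):
    """Build a query string from required terms, OR-alternatives and exclusions."""
    parts = [quote(t) for t in required]
    if alternatives:
        parts.append(any_of(alternatives))
    parts.extend("-" + quote(t) for t in excluded)
    return " ".join(parts)

# Word variants listed explicitly, since truncation (eg tax*) is not available.
print(build_query(required=["clinton"], alternatives=["tax", "taxes", "taxation"]))
# Excluding a term with a leading minus sign.
print(build_query(required=["medline"], excluded=["embase"]))
```

Listing the variants by hand is tedious, which is the practical cost of the missing truncation feature; a helper like this at least keeps the OR group consistent.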

Google: few date limits

Google: hidden features 1

Discovered at www.searchengineshowdown.com (buried in Google help)

Google: hidden features 2

Partial URL v Specific Site Search:

Not possible on Advanced Search despite “Domains” limit

Other Search Engines

• Always worth searching more than one, eg
    • All the Web (FAST)
    • AltaVista
    • Lycos/HotBot
    • Northern Light (?)
• Overlap may be limited
• Different ranking criteria
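One way to see how limited the overlap is for a particular query is to compare the result lists from two engines. The sketch below assumes the URLs have already been copied out of each engine's results by hand; the example lists are invented, and the URL normalisation is deliberately crude.

```python
def normalise(url):
    """Crude normalisation: lower-case, drop the scheme, "www." and a trailing slash."""
    url = url.strip().lower()
    for prefix in ("http://", "https://"):
        if url.startswith(prefix):
            url = url[len(prefix):]
    if url.startswith("www."):
        url = url[4:]
    return url.rstrip("/")

def compare(results_a, results_b):
    """Report shared and unique URLs, plus overlap as shared / all distinct URLs."""
    a = {normalise(u) for u in results_a}
    b = {normalise(u) for u in results_b}
    shared = a & b
    union = a | b
    return {
        "shared": sorted(shared),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        "overlap": len(shared) / len(union) if union else 0.0,
    }

# Invented result lists for the same query on two hypothetical engines.
engine_a = ["http://www.example.org/", "http://example.com/law", "http://example.net/a"]
engine_b = ["http://example.org", "http://example.edu/b"]
report = compare(engine_a, engine_b)
print(report)
```

A low overlap figure means each engine is surfacing pages the other misses, which is the case for searching more than one.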

2. Go Vertical: specific tools

Type         Example(s)
Region       Doras, Yahoo Australia & NZ
Domain       SearchEdu.com
Genre        Newsindex
Discipline   EEVL, LawCrawler
Subject      Politicalinformation.com

Horses for Courses 1

Horses for Courses 2

Horses for Courses 3

3. Experimentation

Try out “add-on” search software, eg
• BullsEye Pro
• Copernic
• Copernic Summariser

BullsEye Pro: searching

BullsEye Pro: Webliographies

Copernic

Copernic Summariser

4. Explore the “Invisible Web”

• Material, often of high quality, that general search engines can’t or won’t index
    • Unlinked pages
    • Non-HTML file types, eg audio, video, PDF
    • Authenticated sites
    • Databases
• Much greater in size than visible Web
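The four categories above can be turned into a rough checklist for a given URL. This is a heuristic sketch only: the file extensions, the path markers for authentication and the inbound-link count are all assumptions chosen for illustration, and no search engine is claimed to apply these exact rules.

```python
import posixpath
from urllib.parse import urlparse

# Assumed examples of non-HTML file types and of login-style path markers.
NON_HTML_EXTENSIONS = {".pdf", ".mp3", ".wav", ".avi", ".mov"}
AUTH_HINTS = ("login", "signin", "register")

def invisibility_reasons(url, inbound_links=0):
    """List the reasons (if any) why a general search engine might miss this URL."""
    parsed = urlparse(url)
    path = parsed.path.lower()
    reasons = []
    if inbound_links == 0:
        reasons.append("unlinked page")
    if posixpath.splitext(path)[1] in NON_HTML_EXTENSIONS:
        reasons.append("non-HTML file type")
    if any(hint in path for hint in AUTH_HINTS):
        reasons.append("authenticated site")
    if parsed.query:
        # A query string suggests the page is generated from a database on demand.
        reasons.append("database-driven")
    return reasons

print(invisibility_reasons("http://example.org/search?q=hepatitis", inbound_links=5))
print(invisibility_reasons("http://example.org/index.html", inbound_links=2))
```

An empty list does not guarantee that a page is indexed; it only means none of these particular warning signs is present.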

invisibleweb.com

invisible-web.net

WebData

Librarians’ Index to the Internet

5. Promote Quality Searching

• Old sources
• Old habits
• New media

Old Sources

Old Habits

• Search strategy formulation
• Critical source selection
• Patience
• Flexibility
• Concept analysis
• Critical appraisal of search hits

New Media

Library Web Site

E-newsletter

Weblog

http://www.hw.ac.uk/libWWW/irn/irn.html

Towards a Brighter Future

• Automatically-generated, accurate metadata
• Smarter search engines
    • More quality-sensitive
    • More penetrative
• XML: structured data

References

• Sherman, Chris and Price, Gary. The invisible Web: uncovering information sources search engines can't see. Medford, N.J.: Information Today, 2001. ISBN 091096551X. (accompanying database at http://invisible-web.net)

• Search Engine Watch: http://www.searchenginewatch.com

• Search Engine Showdown: www.searchengineshowdown.com