James Tam
Computer Searches
Concepts covered
•What is a search engine and how do they work?
•General search tips
•The Big Six search engines
•Other search tools
Much of the material in these lecture notes is based on Search Engines for the World Wide Web by Alfred and Emily
Looking for Information?
Start with the Internet
• World Wide Web
• Newsgroups
• Archived mailing lists
There are potential problems
Search Engines
What is a search engine?
How do they work?
• Search engines may employ spiders
The Internet
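A spider can be pictured as a program that follows links from page to page and records the words it finds. The sketch below is only an illustration over a tiny in-memory "web": the `PAGES` dictionary, its URLs, and the `crawl` function are all made up for this example, and a real spider would fetch pages over the network instead.

```python
from collections import deque

# Hypothetical miniature "web": URL -> (page text, outgoing links).
PAGES = {
    "a.example": ("search engines use spiders", ["b.example", "c.example"]),
    "b.example": ("spiders follow links", ["a.example"]),
    "c.example": ("human editors build directories", []),
}

def crawl(start):
    """Visit every page reachable from start and index the words on it."""
    index = {}               # word -> set of URLs that contain the word
    seen = {start}           # pages already queued, so none is visited twice
    queue = deque([start])
    while queue:
        url = queue.popleft()
        text, links = PAGES[url]
        for word in text.split():
            index.setdefault(word, set()).add(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index

index = crawl("a.example")
print(sorted(index["spiders"]))   # pages whose text mentions "spiders"
```

The resulting word-to-pages table is what a keyword search would then consult.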
Search Engines (Continued)
• Search engines may search human-created databases
Making Your Web Site More Noticeable
Add relevant keywords (Spiders)
Search engine submission (“suggesting your site” to Humans)
Keywords: The Secret To Effective Searches
Use keywords that are as unique as possible
Run the search using a number of variations
Search only titles
Determine if the search engine is case sensitive
• When searching for proper names, capitalize the first letters
Check your spelling
Re-run previous results
Types Of Searches
Plain English
AND
OR
NOT
Near Searches
Plain English Searches (Natural Language Searches)
Easy to formulate the query but may result in too many hits
Plain English Searches (Continued)
Supported by almost all of the Big Six
AskJeeves (www.ask.com)
AND Searches
Tell the search engine that it must include multiple keywords
Precede each keyword with a plus sign "+" or "AND"
Some search engines use AND as the default, others do not
OR Searches
Provides broader search results
Tells the search engine to include web pages that include at least one keyword out of a list of many (2+)
NOT Searches
Precede the excluded keyword with a minus sign "-" or "NOT"
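The AND, OR and NOT searches described on the last few slides can be pictured as set operations on the lists of pages that contain each keyword. The sketch below assumes a tiny made-up index (`INDEX`, the page names, and the `search_*` helper functions are all hypothetical); it is not how any particular search engine is implemented.

```python
# Hypothetical index: keyword -> set of pages that contain it.
INDEX = {
    "jaguar": {"cars.html", "cats.html", "zoo.html"},
    "car": {"cars.html", "dealers.html"},
    "cat": {"cats.html", "zoo.html"},
}

def pages(word):
    """Pages containing the word (empty set if the word is unknown)."""
    return INDEX.get(word, set())

def search_and(*words):
    """AND: a page must contain every keyword (+jaguar +car)."""
    result = pages(words[0])
    for word in words[1:]:
        result = result & pages(word)
    return result

def search_or(*words):
    """OR: a page must contain at least one of the keywords."""
    result = set()
    for word in words:
        result = result | pages(word)
    return result

def search_not(include, exclude):
    """NOT: pages with the first keyword but without the second (jaguar -car)."""
    return pages(include) - pages(exclude)

print(sorted(search_and("jaguar", "car")))   # narrower than either keyword alone
print(sorted(search_or("car", "cat")))       # broader than either keyword alone
print(sorted(search_not("jaguar", "car")))   # jaguar pages that never mention car
```

AND shrinks the result set, OR enlarges it, and NOT removes pages from it, which is why OR gives the broadest results.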
NEAR Searches
Tell the search engine to show web pages where the keywords appear near each other in the document (within 10 words)
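A NEAR search can be sketched as a check on word positions: two keywords match if some occurrence of one lies within a fixed number of words of the other. The `near` function and the sample sentence below are made up for illustration, and the default of 10 words is taken from the slide; individual search engines may define "near" differently.

```python
def near(text, word_a, word_b, max_distance=10):
    """True if word_a and word_b occur within max_distance words of each other."""
    words = text.lower().split()
    positions_a = [i for i, w in enumerate(words) if w == word_a.lower()]
    positions_b = [i for i, w in enumerate(words) if w == word_b.lower()]
    return any(abs(i - j) <= max_distance
               for i in positions_a for j in positions_b)

doc = ("the spider visits each web page and adds its keywords "
       "to the search engine index")

print(near(doc, "spider", "page"))    # 4 words apart
print(near(doc, "spider", "index"))   # 13 words apart
```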
Using Wildcards "*"
Used to look for variations on particular words
Some search engines allow the wildcard to be placed at the beginning, middle or end of a keyword
Rules of thumb on the use of wildcards
• Use them to find spelling variations
• Use a minimum of three characters before the wildcard¹
¹ This will vary depending upon the particular search engine.
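Wildcard matching can be tried out with Python's standard `fnmatch` module, where "*" stands for any run of characters. The word list `VOCABULARY` and the `expand` function are made up for this sketch, and the three-character minimum simply mirrors the rule of thumb above; it is an assumption, not a rule enforced by every search engine.

```python
from fnmatch import fnmatchcase

# Hypothetical list of words known to the search engine.
VOCABULARY = ["color", "colour", "colorful", "collar", "theater", "theatre"]

def expand(pattern, min_prefix=3):
    """Return the known words that match a wildcard pattern such as 'colo*'."""
    prefix = pattern.split("*", 1)[0]
    if len(prefix) < min_prefix:
        raise ValueError("use at least %d characters before the wildcard" % min_prefix)
    return [word for word in VOCABULARY if fnmatchcase(word, pattern)]

print(expand("colo*"))    # wildcard at the end: spelling variations of "color"
print(expand("theat*"))   # theater / theatre
print(expand("col*r"))    # wildcard in the middle of the keyword
```

Note that the middle wildcard also picks up "collar", which shows how a pattern that is too loose returns unrelated words.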
Stopwords
Ignored by search engines because they are too common or are reserved for some special purpose
• Common words
• Reserved words
The search engine can be forced to include the stopwords
• Use quotes
• Use a plus sign
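The handling of stopwords can be sketched as a small query parser: ordinary stopwords are dropped, but a quoted phrase or a word with a leading plus sign is kept. The `STOPWORDS` set and the `parse_query` function are hypothetical; each real search engine has its own list and its own syntax.

```python
import re

# Illustrative stopword list: common words plus reserved words.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "or", "not"}

def parse_query(query):
    """Split a query into search terms, dropping stopwords unless forced."""
    terms = []
    # A token is either a "quoted phrase" or a run of non-space characters.
    for token in re.findall(r'"[^"]*"|\S+', query):
        if token.startswith('"'):
            terms.append(token.strip('"'))   # quotes keep the whole phrase
        elif token.startswith("+"):
            terms.append(token[1:])          # plus sign forces the word in
        elif token.lower() not in STOPWORDS:
            terms.append(token)
    return terms

print(parse_query("the history of the internet"))
print(parse_query('"to be or not to be" +the play'))
```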
Topic Directories
Searching within a web page
Opening A New Browser
The Big Six
AltaVista (www.altavista.com)
Google (www.google.com)
HotBot (www.hotbot.com)
Lycos (www.lycos.com)
Northern Light (www.northernlight.com)
Yahoo (www.yahoo.com)
Comparing The Big Six
Search engine               Web pages in database   Percentage of web in database¹
Google                      1.2 billion             57%
Yahoo (powered by Google)   1.2 billion             57%
Lycos                       575 million             27%
HotBot                      500 million             24%
AltaVista                   350 million             17%
Northern Light              330 million             16%
¹ Based upon figures from January 2001 and an estimate of 2 billion web pages in existence, from www.searchenginewatch.com
Self-reported sizes
But size isn't everything!
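The percentage column in the comparison is database size divided by the estimated size of the web. The short calculation below reproduces it; the `coverage` function is made up for this sketch. One assumption to flag: the footnote cites roughly 2 billion pages, but the listed percentages line up with a total of about 2.1 billion, so that figure is used here as an inference from the table rather than a number given in the source.

```python
# Database sizes from the table, in millions of pages.
SIZES_MILLIONS = {
    "Google": 1200,
    "Lycos": 575,
    "HotBot": 500,
    "AltaVista": 350,
    "Northern Light": 330,
}

# Assumed total size of the web in millions of pages (inferred, see above).
WEB_MILLIONS = 2100

def coverage(pages_millions, web_millions=WEB_MILLIONS):
    """Percentage of the web held in a database, rounded to a whole number."""
    return round(100 * pages_millions / web_millions)

for name, size in SIZES_MILLIONS.items():
    print(name, str(coverage(size)) + "%")
```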
AltaVista
Types of searches
• Logical OR
• Date
• Field
• Geographic
• Wildcards
• Language
• Case sensitive
• Proximity
• Weighted
Babel Fish
Obscure facts and figures
Dead links
AltaVista (Continued)
Ranking of search results
• Appearance in the title
• Appearance near the beginning of the document
• Links to related content
Google
Types of searches
• Logical OR
• Language
• Domain
• Type of file
• Date
• Not case sensitive
• No wildcards
• Specifies stopwords
Big!
Caches web pages
"I'm Feeling Lucky" feature
Google (Continued)
Ranking of search results
• By the number of links
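Ranking by the number of links can be sketched by counting how many other pages point at each page. The `LINKS` table and the `rank_by_inbound_links` function are made up for this example, and the sketch only counts links; Google's actual ranking is more elaborate than a simple count.

```python
from collections import Counter

# Hypothetical link table: page -> pages it links to.
LINKS = {
    "a.html": ["c.html", "b.html"],
    "b.html": ["c.html"],
    "c.html": [],
    "d.html": ["c.html", "b.html"],
}

def rank_by_inbound_links(links):
    """Order pages from most to fewest inbound links (ties broken by name)."""
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):      # count each linking page only once
            if target != source:         # ignore links from a page to itself
                counts[target] += 1
    return sorted(counts, key=lambda page: (-counts[page], page))

print(rank_by_inbound_links(LINKS))
```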
HotBot
Types of searches
• Logical OR
• Case sensitive
• Wildcard searches
• Language
• Date
• Domain
• Geographic region
• Link searches
• Type of file
• Must contain, should contain, should not contain
Graphical control of searches
HotBot (Continued)
Ranking of search results
• Having the keyword(s) in the title
• Number of occurrences of the keyword
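The two ranking factors listed here (keyword in the title, number of occurrences) can be combined into a simple score. The sketch below is an illustration only: the `score` function, the sample pages, and especially the `title_bonus` weight are assumptions, not HotBot's actual formula.

```python
def score(page, keyword, title_bonus=10):
    """Occurrences of the keyword in the body, plus a bonus if it is in the title."""
    keyword = keyword.lower()
    occurrences = page["body"].lower().split().count(keyword)
    in_title = keyword in page["title"].lower().split()
    return occurrences + (title_bonus if in_title else 0)

# Hypothetical pages to rank.
PAGES = [
    {"title": "Garden notes", "body": "spider spider spider in the garden"},
    {"title": "Web design", "body": "no arachnids here"},
    {"title": "Spider habitats", "body": "a spider spins a web"},
]

ranked = sorted(PAGES, key=lambda page: score(page, "spider"), reverse=True)
print([page["title"] for page in ranked])
```

With this weighting a single title match outranks several occurrences in the body; a different `title_bonus` would change the order.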
Lycos
Types of searches
• Logical AND
• Multimedia searches
• Must include/should include, exclude
• Link searches
• No stopwords
• Not case sensitive
• No searches by date
• No searches by wildcard
Kid's search site
• www.lycoszone.com
Lycos (Continued)
Ranking of search results
• "Popularity" of site
• Occurrences of keyword
Northern Light
Types of searches
• Logical AND
• Special case sensitive search
• Wildcard
• Singular and plural
• Stopwords
WWW and a special database
Free search alerts
Customized search folders
Northern Light (Continued)
Ranking of search results
• By the number of links
• Keyword frequency
• Date of the document
• Keyword appearing in title
Yahoo
Searches
• Logical OR
• Date added
• Wildcards
• Not case-sensitive
Searches Yahoo directories and Google database
Extensive classification
Yahoo (Continued)
Ranking of search results
• Results in the Yahoo directory come before Google results
• Ranking in the Yahoo directory is determined by:
  - The number of keywords matched
  - Exact word matches
  - Location of the word in the web page
Summary of The Big Six and What They Do Best¹
AltaVista
• Obscure facts and figures
• Babel Fish
Google
• Big!
• Often produces relevant search results
• Caches web pages
HotBot
• Multimedia
• Ease of use
¹ From Search Engines for the World Wide Web by Alfred and Emily
Summary of The Big Six and What They Do Best (Continued)
Lycos
• Multimedia
• Kid's zone
Northern Light
• Search on the web and special databases
Yahoo
• The most extensive web directory
Metasearch engines
Search on multiple search engines automatically
Examples
• www.metacrawler.com
• www.dogpile.com
• www.profusion.com
• www.search.com
• www.mamma.com
Drawbacks
• Searches occur in the simplest form
• Timeouts
• Number of results returned
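A metasearch engine can be sketched as a loop that sends the same query to several engines and merges what comes back. In the sketch below the two "engines" are stand-in functions returning fixed, made-up results, and `metasearch` and `per_engine_limit` are hypothetical names; the limit illustrates the drawback that only a few results may be taken from each engine.

```python
def engine_a(query):
    """Stand-in for one search engine; returns made-up results."""
    return ["x.example", "y.example"]

def engine_b(query):
    """Stand-in for a second search engine; returns made-up results."""
    return ["y.example", "z.example"]

def metasearch(query, engines, per_engine_limit=10):
    """Send the query to every engine and merge results without duplicates."""
    merged = []
    seen = set()
    for engine in engines:
        for url in engine(query)[:per_engine_limit]:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

print(metasearch("search tips", [engine_a, engine_b]))
print(metasearch("search tips", [engine_a, engine_b], per_engine_limit=1))
```

Because the same plain query goes to every engine, engine-specific options cannot be used, which matches the first drawback listed above.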
Other (Task-Specific) Search Tools
Products
• Amazon: www.amazon.com
• CDNOW: www.cdnow.com
• Consumer World: www.consumerworld.org
• CNET Shareware.com: www.shareware.com
• ZDNet: www.zdnet.com
Health
• CDC: www.cdc.gov
Other (Task-Specific) Search Tools (Continued)
Food
• CuisineNet Menus Online: www.cuisinenet.com
• Epicurious Food: www.epicurious.com
• Martha Stewart: www.marthastewart.com
Miscellaneous
• Expedia: www.expedia.com
• Internet Movie Database: www.imdb.com
• Monster: www.monster.ca
• Workopolis: www.workopolis.com
Summary
What is a search engine?
How do search engines gather information for their databases?
Types of Searches
• By keyword
• Logical
• Plain English
• Wildcards
Stopwords and searches.
Browsing topic directories.
What are the Big Six search engines?
Metasearch engines.
Task-specific search tools