harris
Web Searching
Everything, now.
History of Search
Archie (archives) - 1990
• Database of FTP filenames with regex query searching
WWW Wanderer
• Web’s first robot
• High bandwidth load
ALIWEB (Archie-Like Indexing of the WEB)
• Pages submitted with descriptions
History of Search
Architext - 1993
• First to use statistical analysis of word relationships to generate results
Yahoo! - 1994
• Searchable directory of pages with descriptions
Webcrawler - 1994 †
• Indexed entire web pages
Lycos - 1994 †
• 60 million documents by 1996
History of Search
Infoseek - 1994 †
AltaVista - 1995 †
LookSmart - 1996
Inktomi - 1996
Ask Jeeves - 1997
Google - 1998
Teoma - 2000
Web Search Today
Search algorithms are highly secret
• Use off-page criteria for ranking
• Constant tweaking
Things to look for:
• Boolean nesting
• Fields
• Clustering?
• Stop words
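As a rough illustration of two of the items above, stop words and Boolean nesting, the sketch below drops stop words while building a tiny inverted index and then evaluates a nested Boolean query against it. The stop-word list, the sample documents, and the tuple-based query form are all assumptions made for this example, not the behavior of any particular engine.

```python
# Hypothetical stop-word list; real engines use their own (undisclosed) lists.
STOP_WORDS = {"the", "a", "an", "of", "and", "or", "to", "in"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def build_index(docs):
    """Map each remaining term to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def evaluate(query, index, all_ids):
    """Evaluate a nested query: a term string or ("AND" | "OR" | "NOT", ...)."""
    if isinstance(query, str):
        return set(index.get(query.lower(), set()))
    op, *args = query
    results = [evaluate(arg, index, all_ids) for arg in args]
    if op == "AND":
        return set.intersection(*results)
    if op == "OR":
        return set.union(*results)
    if op == "NOT":
        return all_ids - results[0]
    raise ValueError(f"unknown operator: {op}")


# Made-up sample documents, keyed by document id.
DOCS = {
    1: "history of the web search engine",
    2: "open source web crawler in java",
    3: "directory of search engine reviews",
}
INDEX = build_index(DOCS)

# Nested query: (web OR directory) AND (search AND NOT java)
QUERY = ("AND", ("OR", "web", "directory"), ("AND", "search", ("NOT", "java")))
MATCHES = evaluate(QUERY, INDEX, set(DOCS))
```

An engine without nesting support would not be able to group the OR clause separately from the AND clause, which is why nesting is worth checking for when comparing engines.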
Web Search Today
Google
• PageRank system
• “Important” sites given artificially high rank
• Strengths
  • Largest database
  • Relevance based on external linkage
• Weaknesses
  • No nesting
  • May search for synonyms / grammatical variants (automatic stemming)
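The idea of relevance based on external linkage can be sketched with the textbook power-iteration form of PageRank. This is a simplified illustration on a made-up four-page graph: the damping factor of 0.85, the iteration count, and the page names are assumptions, and it is not a description of Google's actual ranking, which is not public.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Return a dict page -> rank for a graph given as page -> list of outlinks."""
    pages = list(links)
    n = len(pages)
    ranks = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_ranks = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # Split this page's rank evenly among the pages it links to.
                share = damping * ranks[page] / len(outlinks)
                for target in outlinks:
                    new_ranks[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * ranks[page] / n
                for target in pages:
                    new_ranks[target] += share
        ranks = new_ranks
    return ranks


# Hypothetical link graph: every other page links to "hub".
LINKS = {
    "hub": ["a"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "a"],
}
RANKS = pagerank(LINKS)
```

In this toy graph the page with the most inbound links ("hub") ends up with the highest rank, and pages with no inbound links ("b" and "c") receive only the baseline share.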
Web Search Today
Yahoo!
• Brand new search database (as of Feb ’04)
• Strengths
  • Full boolean searching
  • Very fresh
  • Directory links
• Weaknesses
  • Includes pay for inclusion results (!)
Web Search Today
MSN Search (Inktomi)
• Large Inktomi database
• Strengths
  • Page depth limit
  • Full boolean searching
Web Search Today
Teoma
• Subject-specific popularity
• Strengths
  • Refine
  • Related
• Weaknesses
  • Small database
  • No boolean nesting
Web Search Tomorrow
Kartoo
• Visual meta search engine
Nutch
• Open source web search
• Java (but that could change)
Dipsie
• “2 clicks”
Singingfish
• Multimedia (audio / video) search
Internet Directories
Directory           Selection                   Size
Yahoo!              User submission / editors   3 million
Open Directory      Editors (62,562!)           3.8 million
LookSmart           Selected                    2.3 million
CiteSeer            Submission                  ???
Librarians’ Index   Public Librarians           10 thousand
InfoMine            Academic Librarians         120 thousand
RDN                 Academic Selections         30 thousand
Conclusion
Which search engine is the best?
References
http://searchengineshowdown.com/
http://www.search-marketing.info/