Web Searching

Everything, now.

History of Search

Archie (archives) - 1990
• Database of FTP filenames with regex query searching

WWW Wanderer
• Web’s first robot
• High bandwidth load

ALIWEB (Archie-Like Indexing of the WEB)
• Pages submitted with descriptions
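
To illustrate what regex query searching over a filename database looks like, here is a minimal Python sketch. It is not Archie's actual implementation: the index contents, the paths, and the function name search_filenames are all invented for illustration.

```python
import re

# Hypothetical sample of an FTP filename index; these paths are invented.
FILENAME_INDEX = [
    "pub/gnu/emacs-18.59.tar.Z",
    "pub/x11/xterm.tar.Z",
    "pub/docs/README",
    "pub/gnu/gcc-1.40.tar.Z",
]


def search_filenames(pattern, index=FILENAME_INDEX):
    """Return every indexed path that the regular expression matches anywhere."""
    regex = re.compile(pattern)
    return [path for path in index if regex.search(path)]


# Find compressed tar archives stored under a "gnu" directory.
print(search_filenames(r"gnu/.*\.tar\.Z$"))
```

Note that only filenames are searched here, not file contents, which matches the bullet above.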

History of Search

Architext - 1993
• First to use statistical analysis of word relationships to generate results

Yahoo! - 1994
• Searchable directory of pages with descriptions

Webcrawler - 1994 †
• Indexed entire web pages

Lycos - 1994 †
• 60 million documents by 1996

History of Search

Infoseek - 1994 †
Altavista - 1995 †
Looksmart - 1996
Inktomi - 1996
Ask Jeeves - 1997
Google - 1998
Teoma - 2000

Web Search Today

Search algorithms are highly secret
• Use off-page criteria for ranking
• Constant tweaking

Things to look for:
• Boolean nesting
• Fields
• Clustering?
• Stop words
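
Two of the features listed above, Boolean nesting and stop words, can be shown with a small Python sketch. This is an illustration under stated assumptions, not how any particular engine works: the stop-word list, the tuple-based query format, and the function names tokenize and matches are all hypothetical.

```python
# Hypothetical stop-word list; real engines use their own (often longer) lists.
STOP_WORDS = {"the", "a", "an", "of", "and", "or", "to", "in"}


def tokenize(text, drop_stop_words=True):
    """Lowercase the text, split on whitespace, optionally drop stop words."""
    words = text.lower().split()
    if drop_stop_words:
        words = [w for w in words if w not in STOP_WORDS]
    return set(words)


def matches(query, terms):
    """Evaluate a nested Boolean query against a set of terms.

    A query is either a plain term (str) or a tuple whose first element is
    "AND", "OR", or "NOT", followed by sub-queries.
    """
    if isinstance(query, str):
        return query in terms
    op, *args = query
    if op == "AND":
        return all(matches(a, terms) for a in args)
    if op == "OR":
        return any(matches(a, terms) for a in args)
    if op == "NOT":
        return not matches(args[0], terms)
    raise ValueError(f"unknown operator: {op}")


# Nested query: (archie OR aliweb) AND NOT yahoo
query = ("AND", ("OR", "archie", "aliweb"), ("NOT", "yahoo"))
doc = tokenize("The history of Archie and the WWW Wanderer")
print(matches(query, doc))
```

An engine without nesting could not express the parenthesized OR inside the AND. An engine that drops stop words cannot match a query term such as "the" unless it offers a way to force that term.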

Web Search Today

Google
• PageRank system
• “Important” sites given artificially high rank
• Strengths
  • Largest database
  • Relevance based on external linkage
• Weaknesses
  • No nesting
  • May search for synonyms / grammatical variants (automatic stemming)
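
The idea of ranking by external linkage can be sketched with the simplified, textbook form of PageRank computed by power iteration. This is only an approximation for illustration: Google's production ranking is not public, and the damping value, iteration count, toy link graph, and function name pagerank are assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's damped rank evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Toy web (invented): three pages link to "c", and "c" links back to "a".
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
print(max(ranks, key=ranks.get))
```

In this toy graph the page with the most inbound links ends up with the highest score, and the page it links to inherits much of that score.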

Web Search Today

Yahoo!
• Brand new search database (as of Feb ’04)
• Strengths
  • Full boolean searching
  • Very fresh
  • Directory links
• Weaknesses
  • Includes pay for inclusion results (!)

Web Search Today

MSN Search (Inktomi)
• Large Inktomi database
• Strengths
  • Page depth limit
  • Full boolean searching

Web Search Today

Teoma
• Subject-specific popularity
• Strengths
  • Refine
  • Related
• Weaknesses
  • Small database
  • No boolean nesting

Web Search Tomorrow

Kartoo
• Visual meta search engine

Nutch
• Open source web search
• Java (but that could change)

Dipsie
• “2 clicks”

Singingfish
• Multimedia (audio / video) search

Internet Directories

Directory           Selection                  Size
Yahoo!              User submission / editors  3 million
Open Directory      Editors (62,562!)          3.8 million
LookSmart           Selected                   2.3 million
CiteSeer            Submission                 ???
Librarians’ Index   Public Librarians          10 thousand
InfoMine            Academic Librarians        120 thousand
RDN                 Academic Selections        30 thousand

Conclusion

Which search engine is the best?

References

http://searchengineshowdown.com/

http://www.search-marketing.info/