Web Searching Everything, now.

Page 1: Web Searching

Web Searching

Everything, now.

Page 2: Web Searching

History of Search

Archie (archives) - 1990
• Database of FTP filenames with regex query searching

WWW Wanderer
• Web’s first robot
• High bandwidth load

ALIWEB (Archie-Like Indexing of the WEB)
• Pages submitted with descriptions
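To make the Archie bullet concrete, here is a minimal sketch of regex searching over a filename database. It is not Archie's actual implementation: the `FTP_INDEX` entries, the host names, and the `search_filenames` helper are all invented for illustration.

```python
import re

# Hypothetical sample of an FTP filename index; hosts and paths are invented.
FTP_INDEX = [
    ("ftp.example.org", "/pub/gnu/emacs-18.59.tar.Z"),
    ("ftp.example.org", "/pub/gnu/gcc-2.4.5.tar.Z"),
    ("archive.example.edu", "/mirrors/x11/xv-3.00.tar.Z"),
    ("archive.example.edu", "/docs/README.emacs"),
]

def search_filenames(pattern, index=FTP_INDEX):
    """Return the (host, path) pairs whose file name matches the regex."""
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for host, path in index:
        filename = path.rsplit("/", 1)[-1]  # match the name only, not the directories
        if regex.search(filename):
            hits.append((host, path))
    return hits
```

A query such as `search_filenames(r"^emacs")` returns only files whose names start with "emacs". The point to notice is that only file names are searched, never file contents, which is the limitation the later full-text engines removed.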

Page 3: Web Searching

History of Search

Architext - 1993
• First to use statistical analysis of word relationships to generate results

Yahoo! - 1994
• Searchable directory of pages with descriptions

Webcrawler - 1994 †
• Indexed entire web pages

Lycos - 1994 †
• 60 million documents by 1996

Page 4: Web Searching

History of Search

Infoseek - 1994 †

Altavista - 1995 †

Looksmart - 1996

Inktomi - 1996

Ask Jeeves - 1997

Google - 1998

Teoma - 2000

Page 5: Web Searching

Web Search Today

Search algorithms are highly secret
• Use off-page criteria for ranking
• Constant tweaking

Things to look for:
• Boolean nesting
• Fields
• Clustering?
• Stop words
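Two of the features listed above, Boolean nesting and stop words, can be illustrated with a toy inverted index. This is a sketch under stated assumptions, not how any named engine works: the `STOP_WORDS` list, the sample `DOCS`, and the tuple-based query format are all made up for the example.

```python
# Illustrative stop-word list; real engines each use their own (or none).
STOP_WORDS = {"a", "an", "and", "of", "the", "to"}

# Hypothetical three-document collection.
DOCS = {
    1: "the history of the search engine",
    2: "a directory of web pages",
    3: "search the web directory",
}

def build_index(docs):
    """Map each non-stop word to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            if word in STOP_WORDS:
                continue  # stop words are never indexed
            index.setdefault(word, set()).add(doc_id)
    return index

def evaluate(query, index):
    """Evaluate a term (str) or a nested tuple (op, operand, operand, ...)."""
    if isinstance(query, str):
        return set(index.get(query.lower(), set()))
    op, *operands = query
    results = [evaluate(q, index) for q in operands]
    if op == "AND":
        return set.intersection(*results)
    if op == "OR":
        return set.union(*results)
    raise ValueError(f"unknown operator: {op}")

INDEX = build_index(DOCS)
```

The nested query `("AND", "search", ("OR", "engine", "directory"))` corresponds to typing `search AND (engine OR directory)`. An engine without nesting cannot express the parenthesized part. Searching for a stop word such as "the" returns nothing here because the word was dropped at indexing time.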

Page 6: Web Searching

Web Search Today

Google
• PageRank system
• “Important” sites given artificially high rank
• Strengths
  • Largest database
  • Relevance based on external linkage
• Weaknesses
  • No nesting
  • May search for synonyms / grammatical variants (automatic stemming)
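"Relevance based on external linkage" can be sketched with the textbook power-iteration form of PageRank. This is only the simplified published idea, not Google's production ranking (which, as noted earlier, is secret). The `LINKS` graph, the damping factor of 0.85, and the iteration count are assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: links maps each page to the pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution
    for _ in range(iterations):
        # Every page gets a small baseline share, then shares from its in-links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no out-links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical four-page web: a, b and d all link to c; c links back to a.
LINKS = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
```

In this toy graph, page `c` ends up with the highest score because three pages link to it, and `a` outranks `b` and `d` because its single in-link comes from the highly ranked `c`. Nothing on the pages themselves is consulted, which is what "off-page criteria" means.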

Page 7: Web Searching

Web Search Today

Yahoo!
• Brand new search database (as of Feb ’04)
• Strengths
  • Full boolean searching
  • Very fresh
  • Directory links
• Weaknesses
  • Includes pay for inclusion results (!)

Page 8: Web Searching

Web Search Today

MSN Search (Inktomi)
• Large Inktomi database
• Strengths
  • Page depth limit
  • Full boolean searching

Page 9: Web Searching

Web Search Today

Teoma
• Subject-specific popularity
• Strengths
  • Refine
  • Related
• Weaknesses
  • Small database
  • No boolean nesting

Page 10: Web Searching

Web Search Tomorrow

Kartoo
• Visual meta search engine

Nutch
• Open source web search
• Java (but that could change)

Dipsie
• “2 clicks”

Singingfish
• Multimedia (audio / video) search

Page 11: Web Searching

Internet Directories

Directory           Selection                    Size
Yahoo!              User submission / editors    3 million
Open Directory      Editors (62,562!)            3.8 million
LookSmart           Selected                     2.3 million
CiteSeer            Submission                   ???
Librarians’ Index   Public Librarians            10 thousand
InfoMine            Academic Librarians          120 thousand
RDN                 Academic Selections          30 thousand

Page 12: Web Searching

Conclusion

Which search engine is the best?

Page 13: Web Searching

References

http://searchengineshowdown.com/

http://www.search-marketing.info/