Search Engine Case Study

Search Engine Case Study. Presented by Alan, Aida, Jonathan & Stephen. History. Incorporated in 1996 and based in Emeryville, California.

1

Search Engine Case Study

Presented by Alan, Aida, Jonathan & Stephen

2

History

Incorporated in 1996 and based in Emeryville, California.

Created by: Garrett Gruener, venture capitalist and founder of Virtual Microsystems, and David Warthen, CTO and creator of Ask Jeeves' natural language technology.

In April 1997, Ask Jeeves launched the first of its advertiser-supported public Web sites, www.askjeeves.com.

3

Web properties & international sites

Web properties:
– Ask Jeeves at Ask.com
– Ask Jeeves for Kids at AJKids.com
– DirectHit.com and eTour.com

International sites:
– Pregunta.com (Español)
– Ask.co.uk (United Kingdom)

4

Overview

What does it do?
– combines natural-language parsing software, data mining processes, and knowledge-base creation and maintenance tools with the strengths and capabilities of human editors.

What does that mean?
– user-relevancy ranking algorithms rate websites according to how users interact with the content.
– Ask Jeeves' editors and popularity technology capture human judgments to provide useful, relevant information.
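How interaction-based ranking might work can be shown with a small Python sketch. It assumes the only signal is click-through rate (clicks divided by times shown), which is a simplification and not a description of Ask Jeeves' actual algorithm; the log data and the `popularity_ranking` helper are hypothetical.

```python
# Hypothetical interaction log: (url, times shown, clicks).
interactions = [
    ("http://example.com/a", 200, 20),
    ("http://example.com/b", 50, 15),
    ("http://example.com/c", 100, 0),
]


def popularity_ranking(log):
    """Order URLs by click-through rate, highest first (ties broken by URL)."""
    rates = {url: clicks / shown for url, shown, clicks in log if shown > 0}
    return sorted(rates, key=lambda url: (-rates[url], url))


print(popularity_ranking(interactions))
```

Page b is shown least often but is clicked most often per showing, so it ranks first under this assumption.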

5

How does this work for the user?

Input from the user:
– Pose questions in plain English.

Output from Jeeves:
– Ask Jeeves editorially selected answers
– Automated search results
– Sponsored links
– Metasearch results
– Online communities

6

Interesting Facts & Information

Tips on searching:
– http://static.wc.ask.com/docs/help/help_searchtips.html

Awards & current news:
– http://static.wc.ask.com/docs/about/aboutawards.html
– http://static.wc.ask.com/docs/about/media.html

7

Further analysis of architecture

Jeeves’ three main result sections:
– Editorially selected answers
– Subject-specific popularity results (a.k.a. automated search results)
– Metasearch results

8

Editorially selected answers

The Ask Jeeves editorial staff has manually created an extensive knowledge base of question/answer sets.

These question/answer sets are the first links that appear at the top of the results page. The links point to webpages that are thought to contain the exact answer to the question asked.

9

Editorial Standards

Editorially selected sites that contain ‘answers’ must adhere to the following:
– load quickly and be polished, easy to read, and easy to navigate.
– be well maintained and updated regularly.
– offer accurate information that answers the user query.
– offer other links with information related to the user query.
– demonstrate credibility by providing author and source citations and contact information.

10

Subject-specific popularity results

AskJeeves acquired its subject-specific popularity search process from Teoma Technologies. Teoma's technology uses three methods to acquire meaningful results:
– Individual web pages
– Web pages grouped by topic
– Expert links

AskJeeves implements only the first two of these methods.

11

Implementation

Teoma's technology uses compact mathematical modeling of the Web's structure to generate dynamic queries. After searching using criteria such as popularity and text analysis, it applies dynamic topic clustering, subject-specific link analysis, and expert identification. Dynamic topic clustering looks at the Web from a local perspective, which enables Teoma to understand the subject matter of Web pages.
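The idea of subject-specific link analysis can be illustrated with a toy Python sketch. It is not Teoma's algorithm, whose details are not given here; it simply assumes that pages are already labeled with a topic and that a page's score is the number of in-links it receives from other pages on the same topic. The page names, topics, and the `subject_specific_scores` helper are all made up for illustration.

```python
from collections import Counter

# Hypothetical page -> topic labels (the output of some clustering step).
topics = {
    "a": "music", "b": "music", "c": "music",
    "x": "history", "y": "history",
}

# Hypothetical directed links (source, target).
links = [
    ("a", "b"), ("c", "b"), ("x", "b"),  # x is off-topic for b
    ("b", "a"), ("x", "y"), ("a", "y"),  # a is off-topic for y
]


def subject_specific_scores(topics, links):
    """Count only in-links whose source shares the target's topic."""
    scores = Counter({page: 0 for page in topics})
    for source, target in links:
        if source != target and topics[source] == topics[target]:
            scores[target] += 1
    return scores


scores = subject_specific_scores(topics, links)
music_ranking = sorted(
    (page for page in topics if topics[page] == "music"),
    key=lambda page: (-scores[page], page),
)
print(music_ranking)
```

In this toy data, page b has three in-links but only two from music pages, so the off-topic link from x does not raise its score within the music topic.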

12

Metasearch results

AskJeeves gives users the ability to send their query to a number of third-party search engines. These search engines include:
– Looksmart.com
– About.com
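A metasearch step of this kind can be sketched as fanning one query out to several engines and merging the answers. The Python below is an illustration only: the engine functions are hypothetical stand-ins that return canned results rather than calling LookSmart or About.com, and the round-robin merge with URL de-duplication is an assumed strategy, not Ask Jeeves' documented behavior.

```python
from itertools import zip_longest


def fake_looksmart(query):
    # Hypothetical stand-in; a real client would query the remote engine.
    return ["http://example.com/a", "http://example.com/b"]


def fake_about(query):
    # Hypothetical stand-in with one overlapping result.
    return ["http://example.com/b", "http://example.com/c"]


def metasearch(query, engines):
    """Send the query to every engine and interleave the ranked lists."""
    result_lists = [engine(query) for engine in engines]
    merged, seen = [], set()
    # Round-robin: take rank 1 from each engine, then rank 2, and so on.
    for tier in zip_longest(*result_lists):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


print(metasearch("history of muzak", [fake_looksmart, fake_about]))
```

The duplicate URL returned by both stand-in engines appears only once in the merged list.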

13

AskJeeves vs. UNCA Library

14

UNCA Library (against Jeeves)

Strengths
– ability to limit the language, location, & year of results.
– can search by author, subject, periodical title, author/title, & call numbers.

Weaknesses
– Library search shows where to find things, but doesn’t show the full text because the holdings are printed media.

15

AskJeeves (against Library)

Strengths
– can use natural language queries.
– offers alternative search terms.
– offers a browseable subject index.
– uses a spell checker.
– has a message board where queries can be posted.

Weaknesses
– unable to narrow search.

16

AskJeeves vs. Google

17

Google (against Jeeves)

Strengths
– simple, stripped-down design.
– fast.
– cache option.
– ranking by authorities.
– good navigation within search results.

Weaknesses
– ranking by authority leaves out new/specialized pages.
– no page summary.

18

Jeeves (against Google)

Strengths
– simple English queries for beginner users.
– ‘answers’ query the user with questions to help the search.
– popularity technology makes common questions faster & easier to answer.
– search results include a short summary of each page.
– a search may be posted for other AskJeeves users to respond to.

19

Jeeves (against Google) cont.

Weaknesses
– slow download.
– distracting advertisements.
– paid top spots give dubious results.
– no advanced search.
– questions may not be in the database.
– bad navigation.
– links are displayed in an AskJeeves frame.

20

Engine Reviews

Features (in column order): FAQ, Pay for Rank, Page Summary, Adult Filter, Group by Topic, Case, Phrase Search, Proximity, Stem

AskJeeves: Yes, Yes, Yes, Yes, Yes, No, Yes, No, Yes
Google: No, No, No, No, No, Yes, Yes, No
UNCA Library: No, Yes, No, No, No, Yes, Yes, No

21

Engine Reviews

Features (in column order): Boolean, AND, Exclude, OR, Wildcards

Ask Jeeves: Yes, No
Google: Yes, +, -, No
UNCA Library: Yes, AND, AND Not, OR, Yes; + -

22

Standard Query Comparison

Our four standardized queries were:
– Q1: “Terrorism in the US”
– Q2: “Radioactive waste disposal locations”
– Q3: “History of Muzak”
– Q4: “Federal Firearms License Application”

23

Q1: “Terrorism in the US”

Engine        Total Docs Returned   Relevant Docs Returned   Precision
Ask Jeeves    ~*                    18                       NA
Google        1,720,000             18 (out of 20)           0.90*
UNCA Library  5                     5                        1.0

* Ask Jeeves returns an indefinite number of documents, so precision cannot be computed. Precision for Google is based on the first 20 documents.
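The precision column can be reproduced with a minimal Python sketch, assuming precision is the number of relevant documents divided by the number of documents judged (the first 20 for Google, all returned documents for the UNCA Library). The function name `precision` is illustrative and not part of the original study.

```python
def precision(relevant: int, judged: int) -> float:
    """Fraction of the judged documents that were relevant."""
    if judged <= 0:
        raise ValueError("judged must be a positive document count")
    return relevant / judged


# Q1 figures from the table above.
google_q1 = precision(relevant=18, judged=20)   # first 20 results only
library_q1 = precision(relevant=5, judged=5)    # all 5 returned records

print(f"Google: {google_q1:.2f}")
print(f"UNCA Library: {library_q1:.2f}")
```

The same function cannot be applied to Ask Jeeves here, because the total number of documents it returned is not known.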

24

Q2: “Radioactive waste disposal locations”

Engine        Total Docs Returned   Relevant Docs Returned   Precision
Ask Jeeves    ~*                    20                       NA
Google        27,700                15 (out of 20)           0.75*
UNCA Library  14                    12                       0.857

* Ask Jeeves returns an indefinite number of documents, so precision cannot be computed. Precision for Google is based on the first 20 documents.

25

Q3: “History of Muzak”

Engine        Total Docs Returned   Relevant Docs Returned   Precision
Ask Jeeves    ~*                    2                        NA
Google        8,730                 11 (out of 20)           0.55*
UNCA Library  2                     1                        0.50

* Ask Jeeves returns an indefinite number of documents, so precision cannot be computed. Precision for Google is based on the first 20 documents.

26

Q4: “Federal Firearms License Application”

Engine        Total Docs Returned   Relevant Docs Returned   Precision
Ask Jeeves    ~*                    2                        NA
Google        32,500                15 (out of 20)           0.75*
UNCA Library  12                    3                        0.25

* Ask Jeeves returns an indefinite number of documents, so precision cannot be computed. Precision for Google is based on the first 20 documents.
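To compare the engines across all four queries, the per-query figures can be averaged. The sketch below assumes an unweighted mean is a fair summary, which the original study does not state; Ask Jeeves is left out because its precision was not measurable. The dictionary name `results` and the helper `mean_precision` are illustrative.

```python
from statistics import mean

# (relevant, judged) pairs for Q1..Q4, taken from the tables above.
# Google is judged on its first 20 results; the library on all records returned.
results = {
    "Google": [(18, 20), (15, 20), (11, 20), (15, 20)],
    "UNCA Library": [(5, 5), (12, 14), (1, 2), (3, 12)],
}


def mean_precision(pairs):
    """Unweighted mean of relevant/judged over a list of queries."""
    return mean(relevant / judged for relevant, judged in pairs)


for engine, pairs in results.items():
    print(f"{engine}: {mean_precision(pairs):.3f}")
```

With only four queries and very different result-set sizes, such an average is a rough indication at best.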

27

Conclusions

Due to its design, if you know what you are looking for and want a definitive answer, askjeeves.com might be a good place to start.

Tip: For best results, ask your query as a specific question.

28

Conclusions

If you want a great deal of general information on a specific topic but do not know what questions to ask, Google is an excellent choice.

Tip: For best results, start broad and use the search within results feature to narrow the scope of your search.

29

Conclusions

If you want more localized, relevant information on a topic that has been hand-picked by human experts in the field of categorization, the UNCA Library would be a good option.

Tip: For best results, use all features if searching via keyword.

30

Sources

– http://www.infoworld.com/articles/hn/xml/02/01/07/020107hnjeeves.xml
– http://searchenginewatch.com/sereport/01/10-ask.html
– http://www.searchenginewatch.com/sereport/01/07-teoma.html
– http://www.infotoday.com/newsbreaks/nb010820-2.htm
– http://www.teoma.com/help.html
– http://static.wc.ask.com/docs/about/policy.html
– http://sp.ask.com/docs/about/whatisaskjeeves.html
– http://wncln.appstate.edu/
– http://www.google.com
