Introduction to Web Browsers and Basic Search Strategies Using Search Engines


Davina Pruitt-Mentle, EDUC 698M

EDUC 698M Davina Pruitt-Mentle 2

Outline
• History (WWW & Internet)
• Search tools
• Search Engines vs. Subject Directory
• Meta search Engines
• Steps for Searching
• Effective Strategies
• Narrow or broaden a search?
• Wildcards


Internet History

• Internet made up of thousands of networks worldwide

• No one in charge of Internet - No governing body

• Internet backbone owned by private companies


Looking at the Net

Taken from: http://www.cio.com/WebMaster/sem2_net.html


Understanding the Map

• Computers use TCP/IP to communicate (Transmission Control Protocol/Internet Protocol)

• Computers use client/server architecture


Internet Providers:

• Research and Educational Institutions
• Government and Military Entities
• Businesses
• Private Organizations
• Commercial Providers


Internet Protocols

• Email (Simple Mail Transfer Protocol)
• Telnet (log in to a remote host computer)
• FTP (File Transfer Protocol) - transfers files between server and client
• HTTP (HyperText Transfer Protocol)


History
• WWW or Web or W3 includes all information, text, images, audio, video, and computational services that are accessible from the Internet

• July 8, 1999 Nature - approximately 800 million pages of publicly accessible information(1)

• Web continues to grow, tripling in size over the past two years(2)

(1) Steve Lawrence & C. Lee Giles, “Accessibility of Information on the Web,” Nature 400 (July 8, 1999), 107

(2) OCLC Office of Research, “June 1999 Web Statistics” Web Characterization Project


WWW

• System of Internet servers that support hypertext to access several Internet protocols on a single interface

• Almost all protocols accessible on Internet are accessible on web (email - FTP - Telnet - etc)

• In addition, the WWW has its own protocol: HyperText Transfer Protocol (HTTP)


HTTP

• Hypertext - means of information retrieval
• Contains links that connect to other documents
• Links selected by user
• Virtual “web” of connections


HTTP (cont)

• Produce hypertext through HTML
• HyperText Markup Language
• Way of writing or creating with “tags” added to tell the browser how to present information
– i.e. <b> Bold </b> yields Bold (displayed in bold type)
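To make the idea of tags concrete, here is a minimal sketch that reads the slide's <b> Bold </b> example and lists the pieces a browser would see. It uses Python's standard html.parser module; the class name TagLister and the helper list_events are invented for this illustration.

```python
from html.parser import HTMLParser


class TagLister(HTMLParser):
    """Record each opening tag, closing tag, and run of text."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("open tag", tag))

    def handle_endtag(self, tag):
        self.events.append(("close tag", tag))

    def handle_data(self, data):
        self.events.append(("text", data))


def list_events(markup):
    """Return the sequence of tags and text found in a piece of HTML."""
    parser = TagLister()
    parser.feed(markup)
    parser.close()
    return parser.events


# The slide's example: the text sits between an opening and a closing tag.
for event in list_events("<b> Bold </b>"):
    print(event)
```

The tags themselves are never shown to the reader; they only tell the browser to display the enclosed text in bold.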


More History

• World Wide Web initially conceived in 1989 by Tim Berners-Lee at CERN (European Particle Physics Lab in Switzerland)

• Needed a wide variety of information to be shared and distributed to many different computers and platforms

• “Universal readership”


Web Popular Because:

• Easy to use
• Easy to navigate
• Combines words, graphics, sound, video
• Easy to publish
• Plethora of information
• Reach larger audience


Summary: Web vs. Internet

• What is the relationship between the web and the Internet?

• The Internet contains physical components
– computers
– networks
– services


Web vs. Internet

• The Internet connects thousands of computers across the world, but it is the web that allows communication to occur

• Web - abstraction and common set of services on top of the Internet

• Web - set of protocols and tools that let us share information with each other

Directed Search Strategies

Davina Pruitt-Mentle, July Design Institute

July 20, 2000


How Do I Find Information on the Internet?

• Join an email discussion or USENET newsgroup

• Go directly to a site if you have the address
• Browse
• Explore subject directory
• Conduct search


How Does Information Get Indexed by the Search Tools?

• A publisher of a web page can register the site with the search engine or directory

• The search tool's database collects data autonomously


Browsers
• Netscape Navigator (Communicator)
– Product of Netscape (now owned by AOL)
– Originally was dominant
– Multi-platform (all operating systems)
• Internet Explorer
– Product of Microsoft
– Current dominant browser
– Not available for all operating systems
• Browser compatibility problems can cause web page problems



Netscape Search
• 1. Access to different search engines
• 2. Type words or phrases into text entry box
• 3. Click button
• 4. Preserve favorite search engine


Internet Explorer Search

• Separate panel in browser
• Uses Microsoft Network search


Internet Explorer Search

• Direct access to only Microsoft Network’s search engines

• Allows easy access to different types of search
– Web pages
– People
– Businesses
– Maps


Internet Keywords
• Type straight into the location bar of Netscape/Explorer
• Simple words instead of URL (Uniform Resource Locator)
• Words tie to websites
• Can be tied to language preference
• Example: Typing in maryland converts to http://www.state.md.us/


Know Your URLs
• “Address” of a file on the Internet
• Contains type of protocol followed by the computer name, directory and file name
• Examples
– http://www.capecod.net/Wixon/wixon.htm
– gopher://gopher.boombox.micro/
– ftp://wuarchive.wustl.edu/pub/windows/psp3.zip
– mailto:kschrock@capecod.net


Anatomy of a Web Address

• protocol://host/path/filename

See handout “Anatomy of a Web Address”
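The protocol://host/path/filename pattern can be checked with a few lines of Python. This is a minimal sketch using the standard urllib.parse module on one of the example addresses from the slides; the function name anatomy and the dictionary keys are chosen here to mirror the slide's wording and are not part of any library.

```python
import posixpath
from urllib.parse import urlsplit


def anatomy(url):
    """Split a web address into the four parts named on the slide."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,                     # e.g. http, ftp, gopher
        "host": parts.netloc,                         # the computer name
        "path": posixpath.dirname(parts.path),        # the directory
        "filename": posixpath.basename(parts.path),   # the file itself
    }


for part, value in anatomy("http://www.capecod.net/Wixon/wixon.htm").items():
    print(f"{part:9} {value}")
```

Reading an address this way helps when a link is broken: trimming the file name and then the directory often leads back to a page that still exists on the same host.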


Two Basic Approaches to Searching

(although not really “basic”)

• Search Engines
• Subject Directories


Search Engines vs. Directories

Search Engines
• Computer built index of information on web
• More inclusive
• Used to find specific resources
• Searchable by keyword
• Excessive “hits”
• Every page of a Website is indexed
• Better for general searches, but can be used to find specific information

Directories
• Human aided, organized list
• May be general or subject-specific
• May be able to “search” directory
– Google - general
– NetTech Educational Technology Coordinator Website - subject specific
• User has control of browsing
• Fixed vocabulary
• Links go to Website home pages only
• Better at general searches


What are Search Engines?
• Designed to assist you in searching through the enormous amount of information on the Web
• No single search tool has everything
• Each engine is a large database which utilizes different search techniques and tools (spiders or robots) to build indexes to the Internet (some also utilize submissions and administration)
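What "building an index" means can be sketched in a few lines. The example below assumes a spider has already fetched three pages; the page addresses and text are made up, and build_index is a name invented for this sketch. Real engines differ widely in what they store and how they rank.

```python
from collections import defaultdict


def build_index(pages):
    """Map every word to the set of page addresses that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


# Hypothetical pages a spider might have collected.
pages = {
    "http://example.com/urchins": "sea urchin biology",
    "http://example.com/recipes": "sea food recipes",
    "http://example.com/history": "oral history project",
}

index = build_index(pages)
print(sorted(index.get("sea", set())))      # pages that mention "sea"
print(sorted(index.get("history", set())))  # pages that mention "history"
```

A keyword search then becomes a lookup in this table rather than a fresh scan of the Web, which is why results only reflect what the engine has already collected.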


Which Search Engine?
• Yahoo
• AltaVista
• Excite
• Google
• Northern Light
• HotBot
• Infoseek
See Handout - “The Little Search Engine that Could”


How to Choose
Consider
• Size of the database (# of URLs)
• Currency of the database (updates)
• Search interface
• Help screens
• Search features
• Results listed (# of documents retrieved)
• Relevance of results


More About Search Engines
• Searches for matching terms (keywords or several keywords)
• Results “ranked” by relevancy (for some)
• Can search by
– subject or category
– keyword
• Learn about each search engine’s description, options, and rules and restrictions


GO TO

http://www.google.com/help.html


• Searches for exact matches
– Try different versions of your search term
– Example: “Boston hotel” vs. “Boston hotels”
• Rephrase query
– Example: “cheap plane tickets” vs. “cheap airplane tickets”


• Automatically places “and” between words (narrows search: every word must appear)
• To narrow a search further
– add more terms in original search
– refine search within the current search results (adding terms to first words will return a subset of the original query)
• Exclude a word by using a - sign
– Example: to search bass but not speaker: bass -speaker
• Does not support “or” operator
• Does not support “stemming” or “wildcard” searches
• Not case sensitive
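The rules on this slide (every word required, a leading - sign excludes a word, case ignored) can be sketched as a small matching function. This is only an illustration of the rules as stated here, not a description of how Google is implemented; the function name matches and the two sample pages are invented.

```python
def matches(query, text):
    """Apply implicit AND, -exclusion, and case-insensitivity to one page."""
    words = set(text.lower().split())
    for term in query.lower().split():
        if term.startswith("-") and len(term) > 1:
            if term[1:] in words:      # an excluded word is present
                return False
        elif term not in words:        # a required word is missing
            return False
    return True


# Hypothetical page texts.
pages = {
    "fishing": "Bass fishing tips for lakes",
    "audio": "Choosing a bass speaker for your car",
}

hits = [name for name, text in pages.items() if matches("bass -speaker", text)]
print(hits)
```

Because every added word is another requirement, adding terms can only shrink the result list, which is why adding terms is the first way to narrow a search.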


• Finds street maps
– Just enter a U.S. street address, including zip code or city/state, into the search box
– Google recognizes query as a map request
• Try your address


Phrase Searches and Connectors

• Phrase searches are useful when searching for famous sayings or specific names, e.g. “Gone with the Wind”
• Phrase connectors are recognized
– Hyphens
– Slashes
– Periods
– Equal signs
– Apostrophes
• Example: mother-in-law


Stop Words
• Stop words are ignored
• They rarely help narrow a search and can slow it down
– http
– com
– certain single digits
– certain single letters
• To include a stop word, put [space]+ in front of it
• Example
– Star Wars, Episode 1: Star wars episode +1
– OS/2: OS/ +2
*** Don’t forget the space before the + and - signs
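A short sketch shows the effect of stop words and of the + sign on the terms that are actually searched. The stop-word list below is a small made-up sample (the real list is not given in these slides), and effective_terms is a name invented for the illustration.

```python
# Hypothetical sample; the engine's actual stop-word list is not published here.
STOP_WORDS = {"http", "com", "a", "i", "1", "2"}


def effective_terms(query):
    """Return the terms that remain after stop words are dropped."""
    terms = []
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            terms.append(token[1:])    # + forces the word to be kept
        elif token not in STOP_WORDS:
            terms.append(token)        # ordinary word
        # anything else is a stop word and is silently ignored
    return terms


print(effective_terms("Star wars episode 1"))
print(effective_terms("Star wars episode +1"))
```

The first query loses its episode number because a single digit is treated as a stop word; the second keeps it because of the + sign, which only works when a space comes before it.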


How to Interpret Results

See Handout


Northern Light combines in one search a very large full-text Web-page database (~160 million pages) with over 5,400 searchable full-text published (print) journals and an array of online news resources


You may access both relevant web pages and relevant journals and news releases

Results are tagged either WWW (like other search tools) or Special Collection (published, fee-for-viewing journal articles or other publications)


GO TO http://www.northernlight.com/docs/specoll_help_overview.html
• To obtain an item from the Special Collection:
– Click on link
– Decide if you are willing to pay fee
• Page provides citation so you can locate publication in library


Unique Folders Approach
• Results grouped in folders listed at left
• Folders dynamically generated by search results
– From a controlled vocabulary
– Similar to library cataloging
– Not fixed like subject directories
• Click on any folder to refine or further focus search
• Sub-folders allow you to further “zero in”


Four Types of Folders
• Subjects (baseball, desserts)
• Source descriptors (commercial, personal, magazines, databases)
• Types of documents (press releases, product review, maps)
• Languages (major Romanized languages only)


Approaches to Searching
• Basic Search
• Power Search
• Industry Search
• Investext Search
• News


Basic Search
• http://www.northernlight.com
• From Home Page
• Allows Boolean logic
• Phrase in “ ”
• Truncation (* for many characters or % for 1 character)
• + requires, - excludes
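The two truncation symbols on this slide behave differently, and a small sketch makes the difference visible. It assumes that * stands for zero or more letters and % for exactly one, which is one reading of "many characters" and "1 character"; the function names and the word list are invented for the illustration.

```python
import re


def truncation_to_regex(pattern):
    """Translate * (any number of letters) and % (one letter) into a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(r"\w*")       # assumed: zero or more characters
        elif ch == "%":
            parts.append(r"\w")        # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)


def matching_words(pattern, words):
    """Return the words that the truncation pattern would match."""
    rx = truncation_to_regex(pattern)
    return [w for w in words if rx.fullmatch(w)]


words = ["woman", "women", "womn", "education", "educator", "edict"]
print(matching_words("wom%n", words))
print(matching_words("educat*", words))
```

Use % when only one letter varies (singular and plural forms with an internal change) and * when the ending is open.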


Power Search
• http://www.northernlight.com/power.html
• Combines ALL basic search features in one search
• Limits to major language or country
• Can select subject or document in advance


Industry Search
• http://www.northernlight.com/business.html
• All features of basic search
• Can limit by date range or industry-based subject category
• Default is ALL industries


Investext Search
• http://www.northernlight.com/investext.html
• Search or browse thousands of investment research reports written by expert analysts


News Search
• http://www.northernlight.com/news.html
• Allows on-line news searches


“Meta” Search Tools
• Multi-threaded search engines
• Allow access to multiple databases simultaneously or via a single interface
• (-) Do not offer the same level of control over search interface and logic as individual engines
• (+) Fast
• (+) Improvements
– Results sorted by site used for search, or location of Website
– Able to select search engines to include
– Ability to modify results
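The core idea of a meta search tool, sending one query to several engines at once and merging what comes back, can be sketched as follows. The two "engines" are stand-in functions returning fixed, made-up addresses; no real search service is contacted, and every name in the block is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def engine_a(query):
    """Stand-in for one search engine; returns canned results."""
    return ["http://example.com/a", "http://example.com/shared"]


def engine_b(query):
    """Stand-in for a second search engine."""
    return ["http://example.com/shared", "http://example.com/b"]


ENGINES = {"engine_a": engine_a, "engine_b": engine_b}


def meta_search(query, engines=ENGINES):
    """Query every engine in parallel and record which engines found each URL."""
    names = list(engines)
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        result_lists = list(pool.map(lambda name: engines[name](query), names))
    merged = {}
    for name, urls in zip(names, result_lists):
        for url in urls:
            merged.setdefault(url, []).append(name)
    return merged


for url, found_by in meta_search("oral history").items():
    print(url, "found by", ", ".join(found_by))
```

Duplicates collapse into a single entry, and the list of engines per result corresponds to the slide's "results sorted by site used for search". The trade-off is also visible: the same plain query goes to every engine, so engine-specific syntax is lost.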


Popular Meta-Search Engines

• Dogpile
• MetaCrawler
• Profusion
• SavvySearch


Subject-Specific Search Engines

• Do not index entire web
• Focus on specific Websites/pages within a defined subject area, geographical area, or type of resource
• Specialized search - depth rather than breadth


Selected Subject-Specific Engines

Companies
• Companies Online (http://www.companiesonline.com/)
• Hoover's Online (http://www.hoovers.com/)
• Wall Street Research Net (http://www.wsrn.com/)

People (E-mail and Phone)
• Bigfoot (http://bigfoot.com/)
• WhoWhere? (http://www.whowhere.lycos.com)
• Yahoo! People Search (http://people.yahoo.com/)
• Switchboard.Com (http://www.switchboard.com)


Selected Subject-Specific Engines

Images
• The Amazing Picture Machine (http://www.ncrtec.org/picture.htm)
• Lycos Image Gallery (http://www.lycos.com/picturethis/)
• WebSeek (http://disney.ctr.columbia.edu/webseek/)
• Yahoo! Image Surfer (http://ipix.yahoo.com/)


Selected Subject-Specific Engines

Jobs
• Hotjobs.com (http://www.hotjobs.com/)
• Monster.com (http://www.monster.com/)
• The Riley Guide (http://www.rileyguide.com/)

Games
• CNET Gamecenter.com (http://www.gamecenter.com/)
• Games Domain (http://www.gamesdomain.com/)
• Gamesmania (http://www.gamesmania.com/)
• GameSpot (http://www.gamespot.com/)


Selected Subject-Specific Engines

Software
• Jumbo (http://www.jumbo.com)
• Shareware.com (http://www.shareware.com)
• ZDNet Downloads (http://www.zdnet.com/downloads/)

Health/Medicine
• Achoo (http://www.achoo.com/)
• BioMedNet (http://www.bmn.com/)
• Combined Health Information Database (http://chid.nih.gov/)
• Mayo Clinic Health Oasis (http://www.mayohealth.org/)
• Medical World Search (http://www.mwsearch.com/)
• OnHealth (http://www.onhealth.com)


Selected Subject-Specific Engines

Education/Children's Sites
• AOL NetFind Kids Only (http://www.aol.com/netfind/kids/)
• Blue Web'n (http://www.kn.pacbell.com/wired/bluewebn/)
• Education World (http://www.education-world.com/)
• Kid Info (http://www.kidinfo.com/)
• Kids Domain (http://www.kidsdomain.com)
• KidsClick! (http://sunsite.berkeley.edu/KidsClick!/)
• Yahooligans! (http://www.yahooligans.com)


Subject Directories

• Hierarchically organized indexes of subject categories

• User can browse through lists of Websites by subject in search of relevant information

• Maintained by humans
• May include a search engine for searching their own database


Examples of Subject Directories
• INFOMINE (Academic Scholarly Subject Directory - http://infomine.ucr.edu/)
• LookSmart
• Lycos
• Magellan (http://www.magellan.excite.com/)
• Open Directory (http://www.dmoz.org/)
• Yahoo
Many of these have aspects of both search and directory


Specialized Subject Directory
• Guide compiled by subject specialist
• Lists important resources in his/her area of expertise
• More comprehensive than general guide
• Examples
– Film: Internet Movie Database (http://www.imdb.com/)
• Includes Clearinghouses
– Argus Clearinghouse (http://clearinghouse.net/)
– About.com
– WWW Virtual Library (http://www.vlib.org/)


Summary
• Search Engines
• The Big Guys
– AltaVista
– Google
– Yahoo
• Meta-Search Tools
– Dogpile
– MetaCrawler
• Subject-Specific
– The BigHub.com
– Search Engine Colossus
• Subject Directory
– LookSmart
– Lycos
• Specialized Subject Directory
– WWW Virtual Library
– About.com


Preparing to Search

• What’s the topic, question, area of interest?
• Identify search terms to describe your topic of interest
• Consider synonyms (echinoderm OR echinoidea OR "sea urchin")
• Consider variations of terms (restaurants, dining, gourmet)

See Handout: Practical Steps


Search tips

• Enclosing a multiword phrase in quotation marks tells the search engine to list only sites that contain that exact phrase
– Example: “heart disease”
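The difference between a quoted phrase and the same words without quotes can be sketched with two small checks. Both functions and both sample sentences are invented for the illustration, and punctuation handling is deliberately left out to keep the sketch short.

```python
def contains_phrase(text, phrase):
    """True only if the words of the phrase appear together, in order."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


def contains_all_words(text, phrase):
    """True if every word appears somewhere, in any order (no quotes)."""
    words = set(text.lower().split())
    return all(word in words for word in phrase.lower().split())


page_1 = "Heart disease risk factors"
page_2 = "Disease of the heart muscle"

for page in (page_1, page_2):
    print(page, "|", contains_phrase(page, "heart disease"),
          contains_all_words(page, "heart disease"))
```

Both pages contain the two words, but only the first contains the exact phrase, so quotation marks return fewer and usually more relevant results.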


Boolean Logic

• Combines search terms in many databases
• AND, OR, and NOT or (+) and (-)
• Must check to see if search engines use Boolean logic


Boolean Logic: AND
Limits your search

“Oral History” AND Women

Returns only pages with both of these terms on them


Boolean Logic: OR
Broadens your search

“Oral History” OR Women

Returns every page with either of these terms on it


Boolean Logic: NOT
Limits your search

“Oral History” NOT Women

Returns only pages that contain the first term but not the second
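The three Boolean operators correspond directly to set operations, which a few lines of Python can demonstrate. The three page texts are made up for this sketch, and pages_with is a helper invented here; matching is by simple substring and ignores case.

```python
# Hypothetical pages, numbered for easy reference.
pages = {
    1: "oral history of women in El Salvador",
    2: "oral history of coal miners",
    3: "women in science",
}


def pages_with(term):
    """Return the numbers of the pages whose text contains the term."""
    return {number for number, text in pages.items() if term.lower() in text.lower()}


oral = pages_with("oral history")
women = pages_with("women")

both = oral & women        # AND: pages in both sets (limits)
either = oral | women      # OR: pages in at least one set (broadens)
only_oral = oral - women   # NOT: pages with the first term but not the second

print("AND:", sorted(both))
print("OR: ", sorted(either))
print("NOT:", sorted(only_oral))
```

AND and NOT can never return more pages than the first term alone, while OR can never return fewer, which is the sense in which the slides say a search is limited or broadened.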


Wildcards
• Special character that can be appended to the root of a word so you can search for all possible endings to that root
• Good for variant spellings and common root words
• Example
– rocket* will yield rocket, rockets, rocketry
– psycholog* = psychology, psychological, psychologist
– colo*r = color and colour
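The wildcard examples can be tried against a small word list. This sketch uses Python's standard fnmatch module, whose * matches any run of characters; the vocabulary and the function name expand are invented, and individual search engines may define their wildcard differently.

```python
from fnmatch import fnmatchcase

# Hypothetical vocabulary standing in for the words in an engine's index.
VOCAB = ["rocket", "rockets", "rocketry", "rock",
         "color", "colour", "colder",
         "psychology", "psychologist", "psychic"]


def expand(pattern, vocabulary=VOCAB):
    """List the vocabulary words that a wildcard pattern would match."""
    return [word for word in vocabulary if fnmatchcase(word, pattern)]


print(expand("rocket*"))
print(expand("colo*r"))
print(expand("psycholog*"))
```

A shorter root matches more words, so a pattern such as rock* would also pull in unrelated terms; choose the longest root that still covers the endings you want.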


Ctrl-F

• Use it when you follow a link to a document retrieved by a search engine and can’t tell how relevant it is
• Ctrl-F finds the relevant words in current document
• Example: women +“El Salvador” +“Oral History”
– Pick one link, then Ctrl-F
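What Ctrl-F does inside a page can be sketched as a case-insensitive scan for the search words. The function name find_in_page and the sample text are invented; a browser's own find feature may offer more options.

```python
import re


def find_in_page(text, word):
    """Return the starting position of every case-insensitive match."""
    return [m.start() for m in re.finditer(re.escape(word), text, re.IGNORECASE)]


# Hypothetical page text.
page = ("Oral history project: women of El Salvador recall the war. "
        "Each oral history was recorded in Spanish.")

for term in ("oral history", "El Salvador", "women"):
    positions = find_in_page(page, term)
    print(f"{term!r}: {len(positions)} match(es) at {positions}")
```

If a search word never appears, or appears only once in a sidebar, the page is probably not worth reading in full.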


Searching Summary

• Choose a search engine
– Personal preference
– Different engines for different purposes
• Syntax - quotations, Boolean logic, wildcards
• Ctrl-F to find search words
• Try to stay focused on your task
