Search Engines - University of Medical Sciences and ... Services

Search Engines

Objective

General Search Strategy

Medical Search:

Finding Title

Search for Articles

• Full Text

Clinical Queries

"Research is the process of going up alleys to see if they are blind."

Devices

PC with VPN

Tablet with VPN, Proxy, Anonymizer…

Change settings to Classic, Desktop…

Chrome, Firefox

Cell phone (quick midnight search)

Dissertation

Study

Everyday problems

Clinical challenges

HOW DO SEARCH ENGINES WORK?

"Spiders" or "robots" ("bots")

Sites with no links to other pages

Sends a copy to server => click links

Index most of the words (Tree)

Rank

Update rank

Search: Scan its index of sites and match your keywords
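The crawl, index and search steps above can be sketched as a tiny inverted index. This is only an illustration: the page URLs and texts in `PAGES` are invented, and real engines store far more than a word-to-page map.

```python
from collections import defaultdict

# Hypothetical mini "web": URL -> page text (invented for illustration).
PAGES = {
    "a.example/heart": "heart failure treatment and heart surgery",
    "b.example/colon": "colon tumor surgery outcomes",
    "c.example/diet": "rat diet and heart health",
}

def build_index(pages):
    """Map each lowercase word to the set of URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every query word (implicit AND)."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

INDEX = build_index(PAGES)
print(sorted(search(INDEX, "heart surgery")))
```

The search never reads the pages themselves; it only looks words up in the index built earlier, which is why results can lag behind the live web.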

Ranking? (AI)

Location: Title, Keywords, Words that are mentioned towards the beginning of a document

Frequency: words that are repeated several times

Word proximity
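As a rough sketch of how location, frequency and proximity could be combined into one score: the weights (3.0, 1.0, 0.5, 2.0), the 10-word "beginning" window and the 3-word proximity window are arbitrary assumptions chosen for illustration, not values used by any real engine.

```python
def score(title, body, keywords):
    """Toy relevance score built from location, frequency and proximity."""
    title_words = title.lower().split()
    body_words = body.lower().split()
    keywords = [k.lower() for k in keywords]

    points = 0.0
    for kw in keywords:
        if kw in title_words:
            points += 3.0                     # location: keyword in the title
        if kw in body_words[:10]:
            points += 1.0                     # location: near the beginning
        points += 0.5 * body_words.count(kw)  # frequency: repeated words

    # Proximity: bonus when the first two keywords occur close together.
    if len(keywords) >= 2:
        first = [i for i, w in enumerate(body_words) if w == keywords[0]]
        second = [i for i, w in enumerate(body_words) if w == keywords[1]]
        if first and second:
            gap = min(abs(i - j) for i in first for j in second)
            if gap <= 3:
                points += 2.0
    return points

relevant = score("Colon tumor", "colon tumor surgery is common", ["colon", "tumor"])
unrelated = score("Gardening", "tips for roses", ["colon", "tumor"])
print(relevant > unrelated)
```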

Ranking?

The number of links pointing to a site

Importance of the pages that link

Quality of links

Traffic

Your search history

Your location

ARE SEARCH ENGINES ALL THE SAME?

Size

Speed

Content

Search options

Censorship

Index:

Every word

Only part of the document

ARE SEARCH ENGINES ALL THE SAME?

Stemming?

"cardiac" =>? "heart" / "heart" =>? "cardiac"

Searching a portion of the web, captured in a fixed index created at an earlier date (news)

Ranking Algorithm (AI, Fuzzy logic, Machine learning)

2.5 M Servers?

WHAT WERE METASEARCH ENGINES?

Do not crawl the web

Search the databases of multiple sets of individual search engines simultaneously

A quick way of finding out which engines are retrieving the best results

A fair picture of what's available across the Web

Search quickly and superficially

Meta Cons:

Don't offer the "salad bar" of search options

Not enough query Google

Catch about 1% of search results

DO NOT USE METASEARCH ENGINES

You are in a hurry

A quick overview

Are not having any luck pulling up documents in your search

Work best with simple searches

Not recommended:

quick and dirty

not thorough

unpredictable

METASEARCH ENGINES

Dogpile

Mamma

Clusty

Metacrawler

Copernic

PICO

SUBJECT DIRECTORIES

Human editors

Smaller

When you don't have a precise idea

Most effective for finding general information

To see what kind of information is available

Include a keyword search

The line between subject directories and search engines is blurring

SUBJECT DIRECTORIES

Beaucoup

Looksmart

Open Directory Project

Excite

MSN directory

Netscape

Yahoo! Directory

Google (Rank)

MESH

Gateways

Collections of databases and informational sites, arranged by subject

Assembled by specialists, usually librarians

Academically-oriented pages on the Web

Looking for high quality information

Accuracy and content.

Subject-Specific Databases ("Vortals")

Devoted to a single subject, created by professors, researchers, experts

One particular field

When looking for information on a specific topic

Today, search engines, subject directories and portals are pointing to these

INVISIBLE WEB (Deep Web)

Search engine spiders cannot index

Password-protected sites

Behind firewalls

Archived material

Certain databases

Peer-to-peer

Dark web (Tor, .onion URLs, encryption)

Password

Point your browser directly at them

Determining Page Authenticity

Generally rely on the GOV and EDU domains

http://www.sc.edu/beaufort/library.html

NET, ORG, MIL, and COM => additional verification
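The domain rule of thumb above can be sketched with the standard-library URL parser. The function name `needs_extra_verification` is made up for this example, and the check is deliberately crude: it only looks at the last label of the host name, so country-specific academic domains such as `.ac.ir` are flagged for verification too.

```python
from urllib.parse import urlparse

# Per the slide: GOV and EDU are generally relied on, everything else is checked.
TRUSTED_BY_DEFAULT = {"gov", "edu"}

def needs_extra_verification(url):
    """Return True unless the host's last label is gov or edu."""
    host = urlparse(url).hostname or ""
    top_level = host.rsplit(".", 1)[-1].lower()
    return top_level not in TRUSTED_BY_DEFAULT

print(needs_extra_verification("http://www.sc.edu/beaufort/library.html"))
```

A trusted domain is only a first filter; the remaining checks in this section (author, date, sources) still apply.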

Reputable Web page

Last date page updated

Mail-to link for questions, comments

Name, address, telephone number, and email address of page owner

Sources?

Authority of the author(s)?

Who is linking to the page? (link:)

link:arakmu.ac.ir -site:.ir

Links to other pages?

Last updated?

Verified at other, similar sites?

Promotion, advertising, and serious content?

Stability of the pages

The page you cite today may be altered or revised tomorrow, or it might disappear completely.

Keep a backup

Search Strategy

AI

Fuzzy logic

Neural networks (unsupervised learning)

Deep learning

Target population:

Average English-speaking Americans

(Most common passwords)

AI

Computers are not dumb anymore

Don’t expect exact results

Repeat search even if you are sure

Get a second opinion

Second Opinion

Filter bubble:

DuckDuckGo

US Digital Millennium Copyright Act:

yandex.ru

Search Essentials

Search Operators

STOP WORDS

a, about, an, and, are, as, at, be, by, from, how, in, is, it, of, on, or, that, the, not, this, to, we, what, when, where, which, with, etc.

"to be or not to be", WHO

Search engines differ, change frequently

Caps

Punctuation marks: @, #, .., : , (Space)
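A minimal sketch of stop-word handling, assuming the engine drops the listed words unless they sit inside double quotes. The function name `strip_stop_words` is invented here, and real engines differ and change frequently, as noted above.

```python
import re

# The stop-word list from the slide (without its trailing "etc.").
STOP_WORDS = {
    "a", "about", "an", "and", "are", "as", "at", "be", "by", "from",
    "how", "in", "is", "it", "of", "on", "or", "that", "the", "not", "this",
    "to", "we", "what", "when", "where", "which", "with",
}

def strip_stop_words(query):
    """Drop stop words, but keep anything inside double quotes intact."""
    # Tokens are either whole quoted phrases or single unquoted words.
    tokens = re.findall(r'"[^"]*"|\S+', query)
    kept = [t for t in tokens if t.startswith('"') or t.lower() not in STOP_WORDS]
    return " ".join(kept)

print(strip_stop_words("what is the treatment of asthma"))
print(strip_stop_words('"to be or not to be"'))
```

Unquoted, the phrase to be or not to be consists only of stop words and would vanish entirely, which is why the quoted form matters.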

Start Search

Broad => Narrow

PICO

Add words one-by-one

Check terms in results

Modify keywords

Advanced Search

BOOLEAN LOGIC

Text parsing: splitting a sequence of characters or values (text) into smaller parts based on some rules

Left-to-right: 2/4*2

Unless:

1. Exceptions: 2^2^.5

2. Precedence: 2+2*4

3. Innermost ()
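The same parsing rules can be checked in Python, used here only as a convenient calculator. Note that the slide's `^` (power) is written `**` in Python, where `^` means something else.

```python
# Same precedence: evaluated left to right.
left_to_right = 2 / 4 * 2        # (2 / 4) * 2

# Exception: a chain of powers groups right to left.
power_chain = 2 ** 2 ** 0.5      # 2 ** (2 ** 0.5)

# Precedence: multiplication binds tighter than addition.
precedence = 2 + 2 * 4           # 2 + (2 * 4)

# Parentheses: the innermost group is evaluated first.
nested = 2 + (2 * (1 + 3))

print(left_to_right, precedence, nested)
```

Search engines parse Boolean queries in a similar way, which is why the order of operators and the placement of parentheses change the results.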

BOOLEAN OPERATORS

AND, “+”?

Documents that contain every one of the keywords

Restricts the search

Default in most engines

OR, “|” (Precedence?)

Either or both keywords

Expands the search

Keywords that are similar or synonymous

[Figure: AND, OR]

NOT, “-”

Your first keyword but not the second

dementia -alzheimers

Youtube: -youtube / -inurl:youtube

NESTING

> 2 keywords

More than one type of operator

(stricture OR stenosis) AND Pyloric
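AND, OR, NOT and nesting map directly onto set operations. The sketch below uses five invented document titles; `containing` is a hypothetical helper that returns the ids of documents holding a word.

```python
# Hypothetical documents, invented for illustration.
DOCS = {
    1: "pyloric stenosis in infants",
    2: "esophageal stricture after reflux",
    3: "pyloric stricture following ulcer",
    4: "dementia in alzheimers disease",
    5: "vascular dementia risk factors",
}

def containing(word):
    """Set of document ids whose text contains the word."""
    word = word.lower()
    return {doc_id for doc_id, text in DOCS.items() if word in text.lower().split()}

# (stricture OR stenosis) AND pyloric
nested = (containing("stricture") | containing("stenosis")) & containing("pyloric")

# Without parentheses, AND (&) is applied before OR (|):
# stricture OR (stenosis AND pyloric)
unnested = containing("stricture") | containing("stenosis") & containing("pyloric")

# dementia -alzheimers
excluded = containing("dementia") - containing("alzheimers")

print(sorted(nested), sorted(unnested), sorted(excluded))
```

Dropping the parentheses lets the esophageal stricture document back in, so the nested form is the one that actually restricts the search to pyloric lesions.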

STEMMING

NOT

“……….” double quotation marks (" ")

Force all words in exact order.

Instead of “+……”

No synonym

No AI

No omitting (very different word ranks)

• (“Must include:” link)

All in Advanced Search

Truncat*

Stemming

When appropriate, search for words that are similar to some or all of the terms

rat dietary needs

rat diet needs, food, feed, pellet…?

No need for OR ?
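To make the idea of stemming concrete, here is a deliberately crude suffix stripper. The suffix list is an assumption made for this example; production stemmers (for instance the Porter stemmer) use many more rules, and search engines do not publish theirs.

```python
# Assumed suffix list, checked in this order; far from complete.
SUFFIXES = ("ary", "ing", "es", "s")

def crude_stem(word):
    """Strip one known suffix, keeping at least three letters of the word."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

print([crude_stem(w) for w in "rat dietary needs".split()])
```

Because dietary, diets and diet reduce to the same stem, a stemming engine can match them without an explicit OR. Synonyms such as food or pellet are a different matter and are not covered by stemming alone.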

~

Word limit for Google Searches

Server Overload

2,048 characters

32 word limit:

Google search

Google images

10 word limit:

Google groups

Google news
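A small sketch of trimming a query to the limits listed above. The numbers are taken from the slide and may be out of date; `fit_query` and the service keys are names invented for this example.

```python
# Limits as stated on the slide (assumed, not verified against current services).
WORD_LIMITS = {"search": 32, "images": 32, "groups": 10, "news": 10}
MAX_CHARS = 2048

def fit_query(query, service="search"):
    """Keep only as many words and characters as the service is said to accept."""
    words = query.split()[: WORD_LIMITS[service]]
    return " ".join(words)[:MAX_CHARS]

long_query = " ".join(["term"] * 40)
print(len(fit_query(long_query).split()), len(fit_query(long_query, "news").split()))
```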

Google Search Operators

Wildcard: * (one or more words)

Hip * surgery => Hip reconstruction surgery, Hip dislocation surgery, Hip fracture surgery

(Questions) coronary bypass was invented by *

vitamin * is good for *

“*” does not stand for a fraction or extension of a word: flower * will not match flowerful (word variants are left to stemming technology)
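The whole-word behaviour of the wildcard can be imitated with a regular expression. This is a sketch of the idea only, not how Google implements it; `wildcard_to_regex` is a made-up helper.

```python
import re

def wildcard_to_regex(query):
    """Build a regex in which a standalone * stands for one or more whole words."""
    parts = []
    for token in query.split():
        if token == "*":
            parts.append(r"\w+(?:\s+\w+)*")   # one or more whole words
        else:
            parts.append(re.escape(token))    # literal keyword
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)

pattern = wildcard_to_regex("hip * surgery")
for text in ("Hip fracture surgery", "hip surgery", "hip total joint surgery"):
    print(text, "->", bool(pattern.search(text)))
```

Since the wildcard must consume at least one complete word, hip surgery does not match, and flower * cannot match the single word flowerful.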

Google Search Operators

define:

cache: (e.g. cache:arak.mu.ac.ir)

related: (e.g. related:https://www.tripdatabase.com/)

filetype:

site: (Site search, Domain search)

inurl:

intitle:

allintitle:

Google Search Operators

intext: ≈ default (“……”)

allintext: “…..” “…..” “……”

in

link:

..

1..5 kg abdominal tumor

John Smith 1960..1985

Scalpel $10..$20
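The `..` operator asks for any number between two bounds. A minimal sketch of that test follows, assuming plain numbers with an optional dollar sign; `in_range` is a hypothetical helper, not part of any search API.

```python
import re

def in_range(range_expr, value):
    """Check a number against a 'low..high' expression such as '1960..1985'."""
    match = re.fullmatch(r"\$?(\d+(?:\.\d+)?)\.\.\$?(\d+(?:\.\d+)?)", range_expr)
    if match is None:
        raise ValueError(f"not a range: {range_expr!r}")
    low, high = float(match.group(1)), float(match.group(2))
    return low <= value <= high

print(in_range("1960..1985", 1970), in_range("$10..$20", 25))
```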

Other Google Services

Image Search: (View Image Extension)

Language Tools

Scholar

Books

https://academic.microsoft.com/home

PROXIMITY OPERATORS

NEAR: search for terms situated within a specified distance of each other in any order

colon NEAR tumor

ADJ (adjacent to): ADJ works as a phrase but in any order.

endangered ADJ species => “endangered species”, “species endangered”

PROXIMITY OPERATORS

AROUND (X)

"breast cancer" AROUND(3) aspirin
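Proximity operators can be pictured as a distance check between word positions. The sketch below handles single words only and measures distance as the difference in word position; both choices are assumptions made for illustration, since each engine defines NEAR, ADJ and AROUND in its own way.

```python
import re

def positions(words, term):
    """Indexes at which a term occurs in the word list."""
    return [i for i, w in enumerate(words) if w == term.lower()]

def near(text, first, second, max_distance):
    """True if the two terms occur within max_distance words, in any order."""
    words = re.findall(r"\w+", text.lower())
    return any(
        i != j and abs(i - j) <= max_distance
        for i in positions(words, first)
        for j in positions(words, second)
    )

def adj(text, first, second):
    """True if the two terms are directly next to each other, in any order."""
    return near(text, first, second, 1)

sentence = "A tumor of the sigmoid colon was resected"
print(near(sentence, "colon", "tumor", 5), near(sentence, "colon", "tumor", 3))
print(adj("species endangered by logging", "endangered", "species"))
```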

CREATING A SEARCH STRATEGY

STEP 1: STATE WHAT YOU WANT TO FIND

In one or two sentences, state what you want to find

What are the gastrointestinal side-effects of Brufen?

STEP 2: IDENTIFY KEYWORDS

Underline the main concepts in the statement

What are the gastrointestinal side-effects of Brufen?

Main concepts: gastrointestinal, side-effects, Brufen

STEP 3: SELECT SYNONYMS AND VARIANT WORD FORMS

Gastrointestinal: gastric, stomach, bowel, intestine

Side-effects: "side effect", "side-effect"

Brufen: Ibuprofen, Fenbid

Stemming?

STEP 4: COMBINE SYNONYMS, KEYWORDS, AND VARIANT WORD FORMS

Combine synonyms with Boolean OR, inside parentheses

(Brufen OR Ibuprofen OR Fenbid)

Use the asterisk symbol (*) to combine variant word forms?

(Intestin* OR gastrointestin*)

Combine keywords with Boolean AND

(Brufen OR Ibuprofen OR Fenbid) AND (Intestine OR gastrointestinal OR stomach) AND "side effect"
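Steps 1 to 4 can be sketched as a small query builder: each concept is a list of synonyms, synonyms are joined with OR inside parentheses, and the groups are joined with AND. The helper names `or_group` and `build_query` are invented for this example.

```python
def or_group(terms):
    """Join synonyms with OR inside parentheses; quote multi-word terms."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    if len(quoted) == 1:
        return quoted[0]
    return "(" + " OR ".join(quoted) + ")"

def build_query(concepts):
    """Join the OR groups of all concepts with AND."""
    return " AND ".join(or_group(terms) for terms in concepts)

query = build_query([
    ["Brufen", "Ibuprofen", "Fenbid"],
    ["intestine", "gastrointestinal", "stomach"],
    ["side effect"],
])
print(query)
```

Keeping each concept as its own list makes it easy to add a synonym or drop a whole concept when the search returns too many or too few documents.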

Quick Tips

Truncation - use OR searches for variants

librar* = library OR libraries OR librarian
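Where an engine does not support truncation, the expansion can be done by hand. The sketch below expands a truncated term against a word list; `VOCAB` is an invented sample vocabulary and `expand_truncation` a hypothetical helper.

```python
def expand_truncation(pattern, vocabulary):
    """Expand 'librar*' into an OR group of vocabulary words sharing the stem."""
    if not pattern.endswith("*"):
        return pattern
    stem = pattern[:-1].lower()
    matches = sorted(w for w in vocabulary if w.lower().startswith(stem))
    if not matches:
        return pattern
    return "(" + " OR ".join(matches) + ")"

# Invented sample vocabulary.
VOCAB = ["library", "libraries", "librarian", "liberty", "book"]
print(expand_truncation("librar*", VOCAB))
```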

Be specific

Use nouns and objects as keywords

Put most important terms first

“…..” most important terms?

At least three keywords

Combine keywords into phrases

“acute abdominal pain”

Avoid common words

Anticipate the answers:

Imagine what the ideal page you would like to access would look like. Think about the words in its title and in its first couple of sentences.

Type keywords and phrases in lower case

Always enclose OR statements in parentheses

Use CAPS when typing Boolean operators

WHAT TO DO IF ...

YOUR SEARCH RETURNS A "ZILLION" DOCUMENTS

Too few terms

Common words

Think of some synonyms (read pages)

Try adding more specific terms

TOO FEW DOCUMENTS

Searching in the wrong place

Your search is too narrow (PICO)

You didn't configure your search correctly

The information isn't on the visible Web

Try omitting some of your search terms

Another engine or specialty resource

Ask for help

Remember, you are smarter than a computer. Use your intelligence. Search engines are fast, but dumb.

Pirate Ebooks

Old-school

WebSites with Security Vulnerabilities

-inurl:(htm|html|php) intitle:”index of” ”last

modified” ”parent directory” description size

(pdf|doc) “banned books″

MOBOTIX Webcams: control/userimage.html

P2P: (malware)

Random Websites

Now

Social Networks

Iran IP:

Free

Paid

Russian:

ebook3000, ebookee, avaxhome (!!!)

gen.lib.rus.ec

Recommended