Googling
Welcome!
While you are waiting, please… find in your packet:
Exercise 6 - Questions for the Final Exercise
“What Do You Want Google to Tell You?”
Begin writing down your questions in three or more categories
Infopeople is a federally-funded grant project supported by the California State Library. It provides a wide variety of training to California libraries. Infopeople workshops are offered around the state, and registration is open on a first-come, first-served basis.
For a complete list of workshops, and for other information about the Project, go to the Infopeople Web site at infopeople.org.
This Workshop is Brought to You By the Infopeople Project
Introductions
Name
Library
Position
How do you use Google?
Workshop Overview
Google’s way of “thinking”
Taking charge of the driving
Using limits to find the hard-to-get
Finding information on a subject
Special Google databases and tools
What to do when Google doesn’t work
Go to: bookmarks.infopeople.org
Click on extreme_googling_bk.htm
Make a bookmark of this page
Add to Favorites
Exercise 1
How does Google “think” about your searches?
Please pause and wait for discussion when you reach a
A Close Look at Google Search Results
• Excerpt of page with your terms
• Matched terms in bold
• URL, size, date last crawled
• Link to Cached copy
• Pages supposedly like this one
• 2nd page from same site
• All Google pages from this site
• Which Google database used
• Approx. # of hits
• Terms actually searched on, as Dictionary links
Don’t believe the number of Results
They are approximate, changing, and not comprehensive
Default Matching on Search Terms
Default AND between terms
Google takes a FUZZY approach
only some of the words if a page is “important”
words may occur only in pages that link to the page
words occur somewhere on the site a page belongs to
Cached reveals the page as Google found it
may differ from the current page
Cached exists if a page is full-text indexed
About 1 billion pages in Google are not cached
Not fully searchable
no Cached if a page owner requests not to be cached
How Can You Know Why Google Found a Page?
Click Cache link toward end of results
top area often explains what was matched
Stemming
Google stems “when appropriate”
automatically detects word stem or root
retrieves with various endings
kite flying gets:
kite, kites, kiting
fly, flying, flyers, flyer’s, flyers’
to turn off:
+kite +flying
“kite flying”
single word searches not stemmed
Words Google Does Not Search
Common or “stop” words ignored
to be or not to be
no list of “common” terms
Google tells you below search box in results
to turn off:
+to +be +or not +to +be
“to be or not to be”
single word searches possible on common words
Ranking of Results
Word order matters
favoring phrases (words together)
looks for phrases with something in place of stop words
word repetition and proximity also count
Google ranking is a great mystery
PageRank combines many factors
popularity - links to a page and their importance
“importance” - a value of 0 (low) to 10 (high)
term placement - phrases, proximity, repetition
See Cheat Sheet #1
Google Preferences
Interface language
Selected languages for pages
SafeSearch filtering
“moderate” is default
Number of results returned
20 or 30 is best
Open new browser window for search results
Back of Cheat Sheet #1
The Google Toolbar
Search any Google databases
Search within a site
Pop-up blocker
Search history list
Set Google preferences quickly
Customizable in Options
download from toolbar.google.com
Other browsers toolbar download from googlebar.mozdev.org
Exercise 2
Installing the Google Toolbar
Customizing Preferences
Taking Charge of Driving Google
OR
Getting the Most from Google’s FUZZY Thinking
Improving Google’s “FUZZY” Default AND
Problems with AND default:
words can occur anywhere in results pages
may have different meanings or contexts
some pages may not contain all of your words
some may not have any of your words
Use quotation marks to require words together
turns common words into unique search terms
“working mothers” 145,000 (5% of working mothers 2,680,000)
“dry cells” 11,500 (1% of dry cells 1,010,000)
Hyphen makes phrases and searches with and without hyphens
bite-sized retrieves bite-sized, bite sized, bitesized
Force “FUZZY” with OR Searches
Singulars and plurals not covered by stemming
parent OR parents
Equivalent or synonymous terms
parent OR guardian
Misspellings
libarian OR librarian
Apostrophes and their misuse
april's OR aprils OR april "fools day"
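The OR patterns above can be assembled mechanically when you have a list of variants. The following is a minimal Python sketch, not part of the workshop materials; the helper name or_group is hypothetical, and it only builds the query text to paste into the search box.

```python
def or_group(*terms):
    """Join alternative terms with OR, quoting any multi-word phrase.

    Hypothetical helper: it only builds the query string and does not
    contact any search engine.
    """
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return " OR ".join(quoted)


# Variants taken from the slide, plus one multi-word phrase.
print(or_group("parent", "parents"))
print(or_group("libarian", "librarian"))
print(or_group("hair loss", "balding"))
```

OR must stay in capitals, which is why the helper hard-codes it rather than accepting it as input.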
Ask Google to be “FUZZY”
Synonym search
~ immediately before a word
sometimes “thinks” of very broad, related terms
~food: recipes, nutrition, cooking
~facts: information, statistics
~help: guide, tutorial, FAQ, manual
Often: Terms appear in links pointing to a retrieved page
Take advantage of stemming
Let stemming handle variant endings:
“wild flowers” OR wildflowers hike “point reyes” april OR may OR spring
hike retrieves: hike, hikers, hiking, hikes
Ask for “FUZZY” Number Ranges
Numrange search uses .. (two periods, no spaces)
babe ruth 1921..1935
results have highlighted dates within this range
3..6 megapixels digital camera
most numbers will be associated with megapixels
DVD player $250..
can be open-ended -- any number above starting number
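As a rough illustration of the numrange syntax, here is a small Python sketch. The function name numrange and its prefix parameter are assumptions made for this example; it simply formats the two-period range, including the open-ended form shown for the DVD player search.

```python
def numrange(low=None, high=None, prefix=""):
    """Format a numrange term such as 1921..1935 or $250..

    Hypothetical helper. Leaving high (or low) as None gives an
    open-ended range. prefix is an optional unit sign such as "$".
    """
    if low is None and high is None:
        raise ValueError("give at least one end of the range")
    lo = "" if low is None else f"{prefix}{low}"
    hi = "" if high is None else f"{prefix}{high}"
    return f"{lo}..{hi}"  # two periods, no spaces


print("babe ruth " + numrange(1921, 1935))
print(numrange(3, 6) + " megapixels digital camera")
print("DVD player " + numrange(250, prefix="$"))
```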
The Whole-Word Wildcard: Allowing FUZZY within “ ”
Can’t remember the exact wording in a phrase?
Who wrote something like, “The stag at night drank his fill”?
Try searching:
“the stag * * * his fill” OR “the stag * * * * his fill”
ANSWER: “The stag at eve had drunk his fill” - in most sources
--Sir Walter Scott, “Lady of the Lake”
Construct proximity searches
"george bush"
"george * bush"
"george * * bush"
"bush george"
"bush * george"
Or try GAPS
www.staggernation.com/cgi-bin/gaps.cgi
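Typing out every wildcard variant by hand gets tedious. The sketch below, in Python, generates the quoted phrases for a range of gap sizes. The name wildcard_phrases and its parameters are hypothetical, and the output is only query text to paste into the search box.

```python
def wildcard_phrases(first, last, min_gap=0, max_gap=2):
    """Return quoted phrases with min_gap..max_gap whole-word wildcards.

    Each * stands in for exactly one word between first and last.
    Hypothetical helper for building query text only.
    """
    phrases = []
    for gap in range(min_gap, max_gap + 1):
        words = [first] + ["*"] * gap + [last]
        phrases.append('"' + " ".join(words) + '"')
    return phrases


# The forgotten-quotation search from the slide: three or four unknown words.
print(" OR ".join(wildcard_phrases("the stag", "his fill", min_gap=3, max_gap=4)))

# Proximity searches in both word orders.
print(wildcard_phrases("george", "bush"))
print(wildcard_phrases("bush", "george", max_gap=1))
```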
Excluding to Control “FUZZIness”
You want: Medical info about a pancreatitis diet
Start with: pancreatitis diet 172,000
Eliminate undesirable words in results:
pancreatitis diet -cat -dog 132,000
pancreatitis -cat -dog -"support group" 128,000
Select exclusions carefully
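Exclusions follow a simple pattern: a minus sign directly in front of each unwanted word or quoted phrase. A minimal Python sketch of that pattern follows; the helper name exclude is hypothetical and it builds query text only.

```python
def exclude(query, *unwanted):
    """Append -term exclusions to a query, quoting multi-word phrases.

    Hypothetical helper. The minus sign must touch the term it
    excludes, with no space in between.
    """
    parts = [query]
    for term in unwanted:
        parts.append("-" + (f'"{term}"' if " " in term else term))
    return " ".join(parts)


print(exclude("pancreatitis diet", "cat", "dog"))
print(exclude("pancreatitis", "cat", "dog", "support group"))
```

Each added exclusion can also drop useful pages, so it is worth adding them one at a time and checking the results.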
Ask Google to be Very “FUZZY”: Related & Similar
Two commands for the same function
click Similar at end of result
search related:www.infopeople.org
Sometimes hard to see how related
links to and from the target page
major words in and ranking of related pages
Possible uses
comparison shopping
find more sites like a site
related:www.econsumer.gov
use to evaluate a suspect page
Exercise 3
Taking Charge of Driving Google
Limiting to Find the Hard-to-Get
Limiting: Words in <Title>
intitle:
finds pages concentrated on your term
hybrid cars intitle:mileage 7,060
hybrid cars mileage 296,000
with quotes:
intitle:"cuban embargo" 581
“cuban embargo” 28,000
with OR:
intitle:"global warming" OR intitle:"greenhouse effect"
Use allintitle: to require all words in title
allintitle: hybrid cars mileage 86
can combine only with site:
allintitle: hybrid cars mileage -site:com 11
Exploiting a Page’s URL
Limiting to domain (edu, gov, etc):
site:edu OR site:gov OR site:ca.us
complete list at:
http://en.wikipedia.org/wiki/List_of_Internet_TLDs
Searching within a Site
site:
site:memory.loc.gov lincoln “sheet music”
works only in top/first part of URL
omit http:// and final /
makes Google into a search engine for pages that are indexed in Google
inurl: less specific
term may be anywhere in URLs
inurl:lincoln “sheet music”
finds “lincoln” anywhere in any URL and “sheet music” somewhere in the pages
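When you copy an address from the browser, it usually carries http:// and a trailing path that site: does not want. The Python sketch below strips an address down to its host part using the standard library. The function name site_operator is hypothetical, and the address path in the example is made up for illustration.

```python
from urllib.parse import urlsplit


def site_operator(address):
    """Turn a copied web address into a site: term using only the host part.

    Hypothetical helper. It drops http://, any path, and the final /,
    following the slide's advice that site: works on the first part
    of the URL.
    """
    if "://" not in address:
        address = "http://" + address  # let urlsplit find the host
    return "site:" + urlsplit(address).hostname


# The path /some/page/ is an invented example, not a real page.
print(site_operator("http://memory.loc.gov/some/page/") + ' lincoln "sheet music"')
```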
Limiting to Types of Documents
filetype: OR to find more than one
form 1040 filetype:pdf - finds forms
-filetype: exclude certain filetypes
form 1040 -filetype:pdf - finds help with forms
View as HTML link can be useful
avoids viruses a document might carry if opened
allows viewing without the software or reader
Cannot always be combined
link: similar: must stand alone
allintitle: allintext: allinanchor: allinurl: with site: only
You can mix all other limit commands, usually:
inurl:ucla intitle:admissions statistics
intitle:"thyroid disease" site:edu OR site:com
Be careful not to ask for the impossible:
site:ucla.edu -inurl:edu
site:com site:edu site:gov
Some require understanding HTML
hypertext links: inanchor:links
looks for text in link tags in the HTML code:
<a href="http://www.pancreasweb.com">Pancreatitis links</a>
<a href="www.pancreaticdisease.com/links/links.htm">Links</a>
Caveats for Limit Commands
See Cheat Sheet #3
Advanced Web Search page
Restricted Opportunities
Useful if you want to:
Try limiting to pages updated in 3 mos, 6 mos, year
Change language of results pages
Select from list of filetype formats
Change content filtering (also in Preferences)
Not useful if you want to:
Construct complex searches
OR with phrases
multiple phrases
Use OR for more than one limiter
site: filetype: inurl:
Use intitle: inurl:
only the allin... commands in Advanced Search
I almost never use it
Exercise 4
Limiting
Finding Info on a Subject
Finding Directories & Link Lists
EXAMPLE - looking for links or directories about:
“women’s history” “middle east”
Use words likely to occur in link-list or directory pages
links OR "directory of" OR guide “women’s history” “middle east”
“what’s new” OR “what’s cool” “women’s history” “middle east”
<Title> field limit to focus pages you want
intitle:links OR intitle:"directory of" OR intitle:"encyclopedia of" “women’s history” “middle east”
intitle:"women’s history" intitle:directory “middle east”
Are there agencies or organizations with links on this topic?
inanchor:links society OR association "middle east" "women's studies"
Be creative. Substitute database for “directory” to find searchable databases
Google’s Directory
1.5+ million pages (compare with 8+ billion in web search)
DMOZ Open Directory
Google “importance” ranking within directory
EXAMPLE:
women's history middle east OR eastern
Click on useful subject categories for more:
Science > Social Sciences > Area Studies > Middle Eastern Studies
Society > People > Women > Women's Studies > By Topic
Society > Issues > Human Rights and Liberties > Regional > Middle East
Search Google for Weblogs
Current commentary, opinions, misc. musings
Google indexes “important” blogs more frequently than most web pages
Thorough search impossible
blog OR weblog OR “web log” your subject words
inurl:blog OR inurl:weblog your subject words
If you know the software a blog is using:
“powered by blogger” your subject words
site:blogspot.com your subject words
“powered by geeklog” your subject words
Try searching the Google Directory
Search Google Groups for Info
Usenet news groups back to 1981
archive of UNevaluated public thoughts, advice & opinions
some not found elsewhere
select threads with more than one article for context
Search differences:
search for a group by name
search within a group
+ required for common words even in “ ”
“hair loss” OR "loss +of hair" OR balding group:alt.support.thyroid
use Advanced Search to limit by group or date posted
Create new mailing lists with registration
Google as Encyclopedic Glossary
Use the command define:[no space]
Google finds and ranks Web pages with definitions
define:internet
define:due diligence
Or build searches for pages with definitions:
internet “what is”
“what is the internet”
“internet stands +for”
internet ~beginners
internet ~FAQ
Also many common facts available:
population of japan
currency in algeria
birthplace of hitler
Exercise 5
Finding Info on a Subject
Brainstorming
How would you approach Google to solve each of the following problems?
1. How can I find some good collections of links and information on migraine headaches?
2. I want to find websites directing me to good places for bird watching in Northern California.
3. Where can I find blogs about California and the use of blogs in libraries, particularly blogs to keep in touch with other librarians and libraries in the state and how they’re using blogs?
4. Where can I find debates, from a wide range of perspectives, about what constitutes a near-death experience? I'm interested in proofs that what people report can be believed.
5. What is the birthplace of Teddy Roosevelt?
6. What is the size of California?
7. What is the currency of Nepal, and how much of it could $100 US buy as of January 15, 2004?
Special Google Databases and Tools
Shortcuts and Services
Shortcuts:
dictionaries and other definitions
phonebooks - white and yellow
movie showtimes
stocks with recent news
maps, weather
converters, math problem calculators, physical constants
number searches: UPS, FedEx, USPS, VIN, UPC codes, area codes, airplane reg. #, patents, more
http://www.googleguide.com/shortcuts.html
Translate
click [Translate this page] or URL or enter text at www.google.com/language_tools
Page Info - better to enter a URL @ alexa.com
Many search engines offer useful shortcuts & similar tools:
See Search Cheat Sheet #4 & Supplement
“Hacking” Google URLs
Structure of a Google search result URL
Your search is for: “web searching” tutorial
http://www.google.com/search?   Google URL; ? indicates query
num=20&   Number of results per page
hl=en&   Interface language
lr=&   Search language; blank (ALL)
safe=off&   SafeSearch off
q=%22web+searching%22+tutorial   Query search terms
%22 means quote mark
+ joins terms
Will vary according to your Preferences setting
You can modify results by changing values
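To see how the pieces of the URL fit together, here is a minimal Python sketch that builds and then takes apart the same search URL using only the standard library. The function name build_search_url is hypothetical; the parameter names (num, hl, lr, safe, q) are the ones listed on the slide, and whether Google still honors each of them is an assumption this sketch does not check.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(query, num=20, hl="en", lr="", safe="off"):
    """Assemble a search URL from the parameters named on the slide.

    Hypothetical helper. urlencode turns the quote mark into %22 and
    each space into +, matching the encoding described above.
    """
    params = [("num", num), ("hl", hl), ("lr", lr), ("safe", safe), ("q", query)]
    return "http://www.google.com/search?" + urlencode(params)


url = build_search_url('"web searching" tutorial')
print(url)

# Going the other way: decode the query string back into readable values.
values = parse_qs(urlsplit(url).query, keep_blank_values=True)
print(values["q"][0])
print(values["num"][0])
```

Changing a value in the address bar by hand has the same effect as passing a different argument here, for example num=50 for longer result pages.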
A “Hack” for Country Searches
Type the search: egypt history 1950..1970
http://www.google.com/search?num=20&hl=en&lr=&safe=off&q=egypt+history+1950..1970
Append in Address/URL box (no spaces):
&restrict=countryEG
More countries and pages than in Language Tools search page
www.google.com/language_tools
General format - capitalized country code:
&restrict=countryXX
Complete country codes list:
http://en.wikipedia.org/wiki/List_of_Internet_TLDs
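The country hack is plain string appending, which the following Python sketch makes explicit. The function name restrict_to_country is hypothetical, and the restrict parameter is taken from the slide as given; this example does not verify that Google still accepts it.

```python
def restrict_to_country(search_url, country_code):
    """Append &restrict=countryXX to an existing search result URL.

    Hypothetical helper. The code is capitalized, as the slide
    requires, and nothing is added except the one parameter.
    """
    code = country_code.strip().upper()
    if len(code) != 2 or not code.isalpha():
        raise ValueError("expected a two-letter country code such as EG")
    return f"{search_url}&restrict=country{code}"


base = "http://www.google.com/search?num=20&hl=en&lr=&safe=off&q=egypt+history+1950..1970"
print(restrict_to_country(base, "eg"))
```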
Google’s Other Proprietary Databases
Besides Web, Directory, and Groups
Images
1.3+ billion
SafeSearch filter only works in English language
News
4,500 news sources
30 days
international versions - other news slants
Froogle for shopping
shopping sites from Google - a subset
+ merchant uploads of catalogs not on the web
no fees, no pay for position
Catalogs (Google Labs still)
scanned mail-order catalogs (not web), text searchable
to navigate within a catalog, click an image and use the special catalogs navigation bar
Use Advanced Search forms
Useful, specific limit settings
Local Information
local.google.com
“businesses & services” from Google web database + several yellow pages
topic box
address/location box
restrict to 1, 5, 15, 45 miles away
geographic proximity, maps
EXAMPLE:
vegetarian restaurants
100 Larkin St, San Francisco, CA
maps.google.com
draggable images, satellite view
local (yellow pages), driving directions
earth.google.com
requires download, 200 MB memory
exotic toy or useful tool?
Google Labs
More upcoming Google services (beta)
Sets - create and explore sequences of things
Suggest - browse possible search terms
video.google.com - some TV programs
My search history - registration and privacy considerations
Print.google.com - search only in Print database
project to make full text books available online
Scholar.google.com - special page to search from scholarly articles (mostly) on the web
abstracts if full text not available
integrated with OCLC for library holdings
integrated with some college campuses
See Cheat Sheet #5
Exercise 6
Where would you look?
1. Choose ONE or TWO questions to answer
2. Write down what you did & learned
3. It’s O.K. to talk, ask questions, and help each other as needed
When Google Doesn’t Work
Other Effective Search Engines
Yahoo Search (3+ billion)
no 10-word limit
accepts ( ) around Boolean OR
(“global warming” OR “greenhouse effect”)
(site:edu OR site:gov OR site:uk)
pay-for-position sites not identified
Teoma (1+ billion)
popularity within subjects
sometimes finds link collections as Resources
Bookmarklets for Searching
JavaScript applications that reside in your Bookmarks or Favorites (Favlets)
Search engine tools:
run a search in another search engine
@Teoma @Yahoo!
search highlighted text in a search engine
Information and more about them at searchengineshowdown.com/bmlets
Recommended Directories
By library people
LII.ORG
Academic Info
Infomine
Complement to searching
when search engines do not seem to work
when you know or have a hunch there is a site about your question
Thinking in Sync with Search Engines
Search engine balancing act:
Do we agree with Google’s “importance”?
tyrannical or democratic?
favors established more than new websites
favors trendy, high-speed, consumer, vroom & zoom
Are Google’s secretiveness & fuzziness trustable?
Have search engines changed us?
Do we accept “good enough” quicker?
Have we given up “thorough” and “certain”?
Will semantic & linguistic analysis help?
Or bring in a new age of “whatever” thinking
Exercise 7
Make your own Cheat Sheet
Write down up to seven things you want to remember to do or practice
Circle the ONE you like most
Workshop Evaluation
infopeople.org/WS/eval