Advanced searching
a variety of tricks of the trade
Tefko Saracevic
[email protected]; http://comminfo.rutgers.edu/~tefko/
• Searching is still much more an art than a science
• Main object of searching is to be effective
• Effectiveness is primarily considered in terms of retrieval that is relevant
• But there is no such thing as a perfect search
• This leads to various tactics to achieve certain effectiveness goals & levels
Central ideas
1. Definitions, approaches
2. Search tactics
3. Advanced features 1: Using fields
4. Advanced features 2: Using proximity
5. Case study
ToC
1. Definitions, approaches
Advanced searches as heuristics
Advanced (Encarta)
More highly developed … at a higher stage of development or progress than other similar people or things
Advanced searching
that about sums it up
it is searching at a higher level of complexity without which search goals of increased effectiveness cannot be achieved
Definitions
Heuristic (Encarta)
problem solving by trial and error
a method of solving a problem for which no formula exists, based on informal methods or experience, and employing a form of trial and error (iteration)
using or arrived at by a process of trial and error rather than set rules
a rule of thumb
commonsense rules intended to increase the probability of solving some problem
Definitions …
• It means that searching is a trial & error process & an iterative process
• It means that a searcher modifies a search in response to results or to user reaction
• It is a base for search progression toward more effective results
• And it is behind advanced search strategy and tactics
Advanced searching is a HEURISTIC process
Goals of advanced searching
– achieve higher levels of effectiveness
• getting more relevant & leaving out more irrelevant stuff
– and at higher level of efficiency
• saving on overall time, cost, effort
– center search toward answers & resources most likely to be effective
• also: focus unfocused searches &
• get ideas how to proceed
– use all available system features for goals
– act as a professional (extreme) searcher
Reminder
A search strategy is
• The entire approach to a search – selection of
– files and sources to use
– approaches in proceeding to search
– formats for viewing results
– alternative actions if search yields
• too much
• too little
– problem-solving heuristics
Search tactics are
• A query - command line entered into a system in order to retrieve relevant information & variations in
– terms, operators, fields, delimiters & attributes as allowed by a given system
– vocabulary & syntax used in conjunction with connectors &/or limiters to search a system
Advanced searching possible at several levels
Strategic
• using different approaches to fit circumstances or context, independent of but adapted to a system used
Reminder: Search strategy (big picture):
– overall approach to searching of a question
– decisions on search resource(s), content & format
– variations in these as a search progresses
Tactical
• using system features to the hilt to achieve given objectives
– but as said, features may & do differ from system to system
Reminder: Search tactics (action choices):
– choices & variations in search statements, query
– terms, connectors, attributes …
– using capabilities of a system to the hilt to achieve desired results
Various ways of approaching an advanced search
2. Search tactics
Name Mostly used for
1. Speed search (also called brief search, meatball search, quick & dirty search)
– Questions: usually simple
– Requirement for answers: brief, not comprehensive
– Effort: not willing to spend much; little preparation required
– Extension: possibly also used as a starting point for ill defined questions or more complex searches to see what works, what is there, & for relevance feedback to proceed with other tactics
2. Building block search
– Questions: usually complex & fairly well defined
– Requirement for answers: more comprehensive
– Effort: willing to spend quite a bit, particularly in preparation
– Extension: excellent to proceed with relevance feedback to citation pearl growing or refinements
3. Citation pearl growing search
– Questions: usually complex & not that well defined
– Requirement for answers: comprehensive
– Effort: willing to spend a lot, particularly in examination of answers & following & evaluating citation trails
– Extension: good to proceed with building block tactics
Some major tactics
• Takes little planning & is fast
– searcher gets on to the system quickly, & enters terms using default (or simple Boolean) operators
– only a few terms are used
– there is no or little reiteration & limited interaction between searcher & system
• Can also be used for verification purposes
• Results can be examined for relevance feedback
• Not recommended for comprehensive searches
• Widely used & most preferred by users generally
Speed search
• Speed search is not a be all and end all
• But it could be a very effective beginning
– to do initial exploring and getting ideas about sources, contents, type of documents, magnitude …
– to find some relevant documents and proceed from there
– and then to proceed with refining searches using other tactics
• You do a speed search, examine results, maybe do more & examine again and on that basis refine succeeding searches & tactics
However …for a complex search
Use it as a classic form of feedback
• Commonly used search tactic
– start small & then build upon results
• identification: each important concept of a search is identified; facets, such as fields to be searched, are also identified
• elaboration: for each concept further terms are identified
• combination: search starts with one or just a few concepts & associated terms; as it progresses additional concepts & facets are connected using appropriate Boolean operators &/or attributes
• iteration: as a search proceeds terms may be added to concepts, new concepts created & combined, fields added or dropped
• You build heuristically & modify the query as you go along, adding & changing concepts, their elaborations, and facets/fields
Building block search
Building block search - illustration
[Diagram: Concept A (Term A1, Term A2, … Term An), Concept B (Term B1, Term B2, … Term Bn), Concept C (Term C1, Term C2, … Term Cn) and Facets/fields (Field/limit F1, Field/limit F2, … Field/limit Fn); terms within a concept are joined with OR, concepts are joined with AND]
1. From a question concepts A, B, C ... are identified – terms that could be further analyzed
2. For each concept search terms are added – narrower, broader, related, synonyms, near synonyms - all these are connected with OR
3. Concepts together with their terms are connected with AND
4. Fields and limits may be added to any or all concepts or terms
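The OR-within-a-concept, AND-across-concepts pattern can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this sketch, the terms are borrowed from the case study later in the lecture, and the limit `year:2008` is a placeholder, not the syntax of Dialog, Scopus or any other system.

```python
def or_block(terms):
    """Join the terms of one concept with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def building_block_query(concepts, limits=()):
    """AND together the OR-blocks of all concepts, then AND on any limits."""
    blocks = [or_block(terms) for terms in concepts.values()]
    return " AND ".join(blocks + list(limits))


# Hypothetical concepts and elaborations; "year:2008" is a placeholder limit.
concepts = {
    "A": ["search engines", "Web searches", "Web queries"],
    "B": ["transaction log analysis", "search log analysis"],
}
query = building_block_query(concepts, limits=["year:2008"])
print(query)
```

Adding a concept C later, or dropping the limit, only changes the inputs; the query is rebuilt the same way at every iteration.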
Dialog worksheet helps in planning
Enter question
Select databases
Elaborate terms
Reflect goals
Specify commands
• Concepts in building block searches can also be identified not only from a question but from resulting documents from a speed search
– thus concepts C, D … could be specified after a previous speed search, elaborated, & then added to a subsequent building block set of concepts
– same with facets & fields
Connecting tactics
• A search can start with using one of the concepts and its elaborations & then adding others
– this way it proceeds from broad (one concept) to narrower by adding other concepts – and reviewing
– facets and fields can be added, narrowing still more
– evaluated as one receives answers
– limits/fields can be added at any search, narrowing it further
– used to increase precision & focus
• Same can be done in reverse, from narrow to broad, by subtracting concepts from a comprehensive search
– used to increase recall & focus
Narrowing tactics
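A toy sketch of narrowing, assuming a tiny in-memory collection in which each document is just a set of index terms (the documents, terms and function names are invented for illustration). Each added concept is ANDed on, so the result set can only shrink; dropping a concept broadens it again.

```python
# Hypothetical collection: document id -> set of index terms.
docs = {
    1: {"search engines", "log analysis", "users"},
    2: {"search engines", "ranking"},
    3: {"web searches", "log analysis"},
    4: {"digital libraries"},
}


def match_concept(terms, concept):
    """A document matches a concept if it carries any of its terms (OR)."""
    return bool(terms & concept)


def search(docs, concepts):
    """Return ids of documents that match every concept (AND)."""
    return {doc_id for doc_id, terms in docs.items()
            if all(match_concept(terms, c) for c in concepts)}


concept_a = {"search engines", "web searches"}
concept_b = {"log analysis"}
concept_c = {"users"}

first = search(docs, [concept_a])                         # broad: one concept
second = search(docs, [concept_a, concept_b])             # narrower
third = search(docs, [concept_a, concept_b, concept_c])   # narrowest
print(len(first), len(second), len(third))
```

Reading the three searches in the opposite order gives the reverse tactic: subtracting concepts from a comprehensive search to regain recall.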
Narrowing schematic
[Schematic: 1st search: Concept A (Term A1, Term A2, … Term An); 2nd search: Concept A + Concept B (Term B1, Term B2, … Term Bn); 3rd search: Concept A + Concept B + Concept C (Term C1, Term C2, … Term Cn); 4th, 5th … search: Facets/fields (Field/limit F1, Field/limit F2, … Field/limit Fn), add to any; + = AND]
Citation pearl growing search
What? aims
• It means what the name implies: you start with a nugget & grow upon it
• Starts with a few records of high relevance
• Looks at references or who cites it to find more
• Aims for more recall
• Avoids subject terms, indexing & language
When to use it
• When word lists or thesauri are not available
• When there isn’t a large recall after doing some searching
• When a user has one or two good articles and wants to find more like them
• When a topic is hot with a breakthrough paper
It depends on citations over time
Backward chaining (back in time)
• Following up references in articles of interest
– moving backward in successive leaps through reference lists
• Could be linked to co-citation – authors cited together
• Popular in social sciences, humanities
Citation tracking (forward chaining in time)
• Who has cited a given document, author, journal, institution
– moving forward in time from the publication of the item
• Used also to indicate impact
– higher citation rate assumed higher impact
• Popular in sciences
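Backward and forward chaining can be sketched as two walks over the same citation data. The paper ids and reference lists below are invented, and the helper names are assumptions of this sketch; real citation indexes provide these links through their own interfaces.

```python
# Hypothetical citation data: each paper maps to the papers in its reference list.
references = {
    "pearl": ["old1", "old2"],
    "old1": ["older"],
    "new1": ["pearl"],
    "new2": ["new1", "old2"],
}


def backward_chain(start, references, depth):
    """Follow reference lists back in time for up to `depth` leaps."""
    found, frontier = set(), {start}
    for _ in range(depth):
        frontier = {ref for paper in frontier
                    for ref in references.get(paper, [])} - found - {start}
        found |= frontier
    return found


def cited_by(references):
    """Invert the reference lists: paper -> papers that cite it."""
    inverted = {}
    for paper, refs in references.items():
        for ref in refs:
            inverted.setdefault(ref, []).append(paper)
    return inverted


def forward_chain(start, references, depth):
    """Follow 'cited by' links forward in time for up to `depth` leaps."""
    return backward_chain(start, cited_by(references), depth)


print(sorted(backward_chain("pearl", references, 2)))
print(sorted(forward_chain("pearl", references, 2)))
```

Forward chaining is the same walk run on the inverted links, which is why a citation index (the inverted view) is needed for it, while backward chaining needs only the reference lists of the articles in hand.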
• Tools giving citation links
• particularly Web of Science, Scopus & Google Scholar
• Invaluable for citation pearl growing
– Citation indexes in various subjects (law, science …) provided that for a long time, even before computers
– But it exploded with automation
• Now some search databases provide support for that search tactic
– integrated with subject searching
• e.g. Scopus, even Google Scholar
– easy to jump from subject searches to references to citation tracking to sources to authors
Citation indexes
3. Advanced features 1
Using fields
• Any & all vendors & search engines have advanced search features – none are without them
• In principle most are the same in that they cover similar fields in records
• But in application they differ from vendor to vendor, engine to engine – sometimes greatly
• need to be learned individually. What a bummer!
• cannot be assumed that what works in one, & how, works elsewhere – even though similarities are there
• but once you know them well in a few you generalize & adapt to others
In fact
Fields & advanced features
• Common fields beyond subjects
– author, source, year, institution, type of publication, country, etc.
• Some are used to search on another dimension
– e.g. authors, sources
• Others to limit subject & other searches
– e.g. dates, language
• Everybody has fields
– & they are critical for advanced searching
– it starts with fields
• How displayed for searching differs greatly
– now mostly in menus
• added automatically
– but also available as commands
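The two uses of fields, searching on another dimension and limiting a subject search, can be illustrated with a small sketch. The records, field names and the `field_search` function are invented for this example; actual field labels and limit options differ from system to system.

```python
# Hypothetical records with a few common fields.
records = [
    {"title": "Web search behaviour", "author": "Smith", "year": 2007, "language": "English"},
    {"title": "Suchmaschinen im Web", "author": "Meyer", "year": 2006, "language": "German"},
    {"title": "Web query logs", "author": "Smith", "year": 2001, "language": "English"},
]


def field_search(records, subject=None, **limits):
    """Keep records whose title contains `subject` and whose fields equal the limits."""
    hits = []
    for rec in records:
        if subject and subject.lower() not in rec["title"].lower():
            continue
        if all(rec.get(field) == value for field, value in limits.items()):
            hits.append(rec)
    return hits


# Search on another dimension: everything by one author.
by_author = field_search(records, author="Smith")
# Limit a subject search by language and year.
limited = field_search(records, subject="web", language="English", year=2007)
print(len(by_author), len(limited))
```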
examples
Advanced features for Library Literature & Information Science in Wilson Web
fields
Advanced features for Web of Science
fields
Advanced features for Scopus
fields example
Detailed description in: Google Guide, particularly in Query Input, by Nancy Blachman
“I developed Google Guide because I wanted more information about Google's capabilities, features, and services than I found on Google's website. Google Guide is neither affiliated with nor endorsed by Google.”
Advanced features for Google
Here is what Google says:
Advanced features for Google …
fields
• Many studies show that users (when searching for themselves as end users) use them rarely, if at all
– they do not use Boolean capabilities, availability of searching by given fields, restricting of searching by available delimiters etc.
• But professional searchers use them a lot
Use of advanced features
Use of advanced features is one of the hallmarks of professional competencies
4. Advanced features 2
Using proximity of terms
Proximity
• Searching for
– terms x words apart
• one after the other or in any order
– terms in same sentence, paragraph, field
• Improves precision
– zeros in on specific names, expressions
• Important for searching
– particularly for users in fields with set terminology
• Connected with phrase searching
• Simple idea but handled very differently in different databases
– to find how handled must go to Help
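The idea of "terms x words apart, one after the other or in any order" can be sketched as follows. The function `near` and its parameters are made up for this sketch and do not reproduce any database's proximity operator; the text is split on whitespace only, so punctuation is not handled.

```python
def positions(words, term):
    """Word positions at which `term` occurs."""
    return [i for i, w in enumerate(words) if w == term]


def near(text, a, b, max_gap, ordered=False):
    """True if a and b occur with at most `max_gap` words between them.

    With ordered=True, `a` must come before `b`; otherwise any order counts.
    """
    words = text.lower().split()
    for i in positions(words, a.lower()):
        for j in positions(words, b.lower()):
            if i == j or (ordered and j < i):
                continue
            if abs(j - i) - 1 <= max_gap:
                return True
    return False


text = "analysis of search engine transaction logs"
print(near(text, "search", "logs", 2))                 # two words in between
print(near(text, "logs", "search", 2, ordered=True))   # wrong order
print(near(text, "search", "engine", 0, ordered=True)) # adjacent, like a phrase
```

A gap of zero with fixed order behaves like a two-word phrase search, which is the connection to phrase searching noted above.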
examples
Phrase and string searching (similar to proximity) from Help
Proximity & phrase operators (from Help)
from Help
Stop words
• Words that databases and search engines choose to ignore
– for searching – they will note their position but not include in the index
– some of them also for indexing – they will not index them to start with
• Different databases use very different lists of stop words
– and handle them differently
• Dialog has 9 stop words:
– AN, AND, BY, FOR, FROM, OF, THE, TO, WITH
• How about others?
– let's see
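A minimal sketch of the first behaviour, where a stop word keeps its position in the text but gets no index entry. Only the nine-word list comes from the slide above; the indexing logic is an assumption made for illustration and is not a description of how Dialog is actually implemented.

```python
# The nine stop words listed above for Dialog.
DIALOG_STOP_WORDS = {"AN", "AND", "BY", "FOR", "FROM", "OF", "THE", "TO", "WITH"}


def build_index(text, stop_words=DIALOG_STOP_WORDS):
    """Map each non-stop word to its word positions.

    Stop words still take up a position, so distances between the
    remaining words are preserved, but they are not indexed themselves.
    """
    index = {}
    for position, word in enumerate(text.upper().split(), start=1):
        if word in stop_words:
            continue  # position is counted, but the word is not indexed
        index.setdefault(word, []).append(position)
    return index


index = build_index("the use of search engines by students")
print(index)
```

Because positions are kept, a proximity search can still tell that USE and SEARCH are two words apart, even though OF itself cannot be searched.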
examples
Stop words
important to know what they do NOT search automatically
[from their Help pages]
Stop words – handled very differently
WoK has some 200 stop words that are ignored while searching even for phrases
Watch out!
Stop words – again handled differently
My own question & search – reality show
5. A case study
Question & context
this is a real question & reason I had
Question
• Search engines offer a number of features for searching. They also retrieve a large number of answers. How much are these advanced features used? How many pages do people look at?
Context
• I am interested in studies that have actual data. To be used for update of bibliography in this course and for discussion in a lecture book on relevance in information science that I am currently writing – support for broader conclusion
Databases used
• I used first Library Literature and Information Science – available at RUL
– did not get anywhere really so I lost patience & switched
• Then I used Scopus
– not available at RUL any more, but have class access
• All results are from Scopus
First I did a speed search that led me to making building blocks
Tactics
• Selected a basic concept from the question
Results
• I enlarged the search concepts & terms from index terms found in a few examined documents that seemed relevant
Search
search engines
advanced search Web
Web searches
Online searching
Web queries
Web sessions
Methods
Transaction log analysis
Search log analysis
This resulted in
• Quite a broad search and a lot of results, so I went to limit to certain fields and dates
• Selected to add to search as limitation:
Facets/fields
social sciences
only last two years
and later to articles with a lot of citations – shows impact
One of the searches
Limit years
Limit area
Choose fields
Examined about six pages of results, here are three major selections
These two did not have any results, but were useful for the class, so I included them in the bibliography
This was toward the end but it turned out to be a mother lode, not only for having statistical results but for citations
Here is the mother lode abstract with
a number of features for further searching
for entry in bibliography
looked at references
looked at citations
clicked on authors
Leads to further things:references, index terms, cited by, related works
ideas for index terms
cited by
related works
Articles that cited it
start of the list, newest ones first -with a number of features to explore further
to examine
Multiple use of tactics & results
Search tactics used:
• Speed search
• Building block search
• Citation pearl search
– references (backward chaining)
– cited by (forward chaining)
• Relevance feedback
Results used for:
• Got a few references to include in class bibliography
• Got data to include in lectures and in the future book
• And an example to illustrate the topic for this lecture
It was what Marcia Bates calls berry-picking search
Conclusion: searching is both