Feb. 1, 2007 Search and Sensemaking 1
CoLiDeS and SNIF-ACT: Complementary Models for
Searching and Sensemaking on the Web
Muneo Kitajima, AIST
Peter G. Polson & Marilyn H. Blackmon, University of Colorado
Decoding Acronyms in the Title
• CoLiDeS: Comprehension-based Linked-model of Deliberate Search (Kitajima, Blackmon, and Polson, 2000, 2005)
  – Derived from Kintsch's (1998) construction-integration cognitive architecture
  – Basis for the usability engineering method called Cognitive Walkthrough for the Web (CWW) (Blackmon et al., 2002)
• SNIF-ACT: Scent-based Navigation and Information Foraging in the ACT Cognitive Architecture (Fu and Pirolli, in press; …)
Goals
• To integrate:
  – CoLiDeS
    • Searching and sensemaking at the level of a Web page
  with
  – SNIF-ACT
    • Searching and sensemaking at higher levels
• To show:
  – Role of background knowledge
  – Multiple patches on a single Web page
  – Importance of comprehension processes in searching and sensemaking
Outline
1. Information Foraging Theory (Card and Pirolli, 1999; Pirolli, 2005)
2. Brief Summary of SNIF-ACT
3. Scent AND Background Knowledge
4. CoLiDeS
5. Multiple Patches on a Web Page
6. Interactions Between Search and Sensemaking
7. Conclusions
8. CoLiDeS + SNIF-ACT
1. Information Foraging Theory
Insights from Information Foraging Theory
• World Wide Web is a collection of patches of relevant information (Websites)
• Forager faces two basic decisions
  – Continue foraging in current patch
  – Leave and find a new patch
• Decisions based on comparison of expected gain of staying in current patch versus cost of finding new patch
• Scent-based navigation within a site
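The stay-or-leave comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name and all three parameters are hypothetical, and the expected gain of the new patch is an added quantity that the slide does not name explicitly.

```python
def should_leave_patch(expected_gain_staying: float,
                       cost_of_new_patch: float,
                       expected_gain_new_patch: float) -> bool:
    """Return True when leaving the current patch looks better than staying.

    Assumption (not from the slides): the net value of leaving is the
    expected gain of the new patch minus the cost of finding it.
    """
    net_value_of_leaving = expected_gain_new_patch - cost_of_new_patch
    return net_value_of_leaving > expected_gain_staying


# Hypothetical numbers, chosen only to exercise both branches.
print(should_leave_patch(1.0, 0.5, 2.0))  # leaving nets 1.5, staying nets 1.0
print(should_leave_patch(1.0, 1.5, 2.0))  # leaving nets 0.5, staying nets 1.0
```

Raising the cost of finding a new patch makes the forager stay longer in the current one, which is the qualitative behavior the slide describes.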
Scent-Based Navigation
• Consensus
  – Users perform website navigation tasks by exploration, following a trail of information scent
• In Pete's presentation this morning
  – Showing that very reliable scent cues are required for successful navigation (Hogg and Huberman, 1987)
  – Derivation of theory of scent from first principles
Search, Sensemaking, and Comprehension
• Information Foraging Theory: Two subproblems
  – Search
  – Sensemaking
• CoLiDeS: Both search and sensemaking require comprehension
  – Search requires comprehension of structure and content of a web page
  – Sensemaking entails comprehension of retrieved information
2. Brief Summary of SNIF-ACT
SNIF-ACT
• ACT-R model of searching a website for specific information
  – Based on Information Foraging Theory (Card and Pirolli, 1999; Pirolli, 2005)
• A user's goal and hyperlink texts are represented as collections of chunks
  – Spreading activation from link chunks to goal chunks determines scent
  – Link strengths can be computed from on-line corpora
SNIF-ACT 1 and 2
Knowledge
– Actions are represented as productions
  • Probability of an action is determined by its utility in relation to utilities of other actions under consideration
  • Utility of clicking on a link is determined by its scent
Control
– Evaluate links on webpage
  • One link at a time
  • Moving down through links in sequential order
– Use satisficing problem solving strategy
SNIF-ACT 2.0 Page Level Loop
During each cycle
Decide
– Click on best link found so far*
– Process another link
– Click on back button
• Utilities (U) of alternative actions
  – U(Click on best link): scent of that link
  – U(Process another link): starts high and decreases with number of links evaluated
  – U(Click on back button): average scent of links processed on previous pages and this page
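A minimal sketch of one decision cycle in the page-level loop follows, offered as an illustration rather than as the SNIF-ACT implementation. The function name, its parameters, and the decay form `start_utility / (1 + n)` are assumptions; the slides say only that this utility starts high and decreases. The sketch also picks the highest-utility action deterministically, whereas the model selects actions probabilistically from their relative utilities.

```python
from typing import Dict, List, Tuple


def page_level_choice(scents_seen: List[float],
                      previous_page_scents: List[float],
                      links_remaining: int,
                      start_utility: float = 1.0) -> Tuple[str, Dict[str, float]]:
    """Pick the highest-utility action for one cycle (illustrative only)."""
    utilities: Dict[str, float] = {}

    # U(click on best link) = scent of the best link found so far.
    if scents_seen:
        utilities["click_best_link"] = max(scents_seen)

    # U(process another link): starts high, decreases with links evaluated.
    # The 1 / (1 + n) decay is an assumed form, not taken from the slides.
    if links_remaining > 0:
        utilities["process_another_link"] = start_utility / (1 + len(scents_seen))

    # U(click on back button) = average scent of links processed on
    # previous pages and this page.
    history = list(previous_page_scents) + list(scents_seen)
    if history:
        utilities["click_back"] = sum(history) / len(history)

    if not utilities:
        # Nothing to evaluate and nothing seen: fall back to going back.
        return "click_back", utilities
    return max(utilities, key=utilities.get), utilities


# Hypothetical scent values, for illustration only.
action, table = page_level_choice([0.1, 0.9], [0.2], links_remaining=3)
print(action, table)
```

With a strong link already seen, clicking it wins. With only weak links on an exhausted page and better scents on earlier pages, the back button wins.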
Strengths of SNIF-ACT 2.0
• Treatment of actions with no scent
  – Derived from rational analysis of foraging in information patches (Pirolli, 2005)
    • Press back button
    • Leave website
• Theory of information scent
  – Derived from first principles (Pirolli, 2005)
  – Linked to spreading activation memory mechanisms in ACT-R
3. Scent AND Background Knowledge
Scent and Background Knowledge
– CoLiDeS assumes that scent is a product of successful comprehension of a link
– Familiarity => Having the necessary background knowledge to comprehend the words and categories that make up a link
– Familiarity of words and categories in a link is as important as scent, or more so
Our Version of Scent
• Used Latent Semantic Analysis (LSA) to estimate similarities between goals and
  – Descriptions of patches (headings)
  – Links
• Values of scent
  – -1.0 < s < 1.0, analogous to correlation
    s < 0.1: weak scent
    0.2 < s < 0.3: moderate scent
    s > 0.3: strong scent
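As a rough sketch of how these scent bands could be applied, the code below computes a cosine similarity between two vectors and labels the result using the thresholds listed above. The vectors stand in for LSA representations of a goal and a link, which the slides do not show, and the function names are hypothetical. Values that fall outside the three stated bands (for example 0.15) are returned as "unspecified" rather than guessed.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / norm


def scent_band(s: float) -> str:
    """Label a scent value with the bands stated on the slide."""
    if s < 0.1:
        return "weak"
    if 0.2 < s < 0.3:
        return "moderate"
    if s > 0.3:
        return "strong"
    return "unspecified"  # the slide gives no label for these values


# Toy vectors, not real LSA output.
goal_vector = [0.2, 0.7, 0.1]
link_vector = [0.1, 0.9, 0.0]
print(scent_band(cosine(goal_vector, link_vector)))
```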
Larson and Czerwinski (1998)
Task
– Search for articles in experimental website simulating Encarta online encyclopedia
– One or two words described target articles
  • Unfamiliar targets: Pink Floyd, Tlingit, Trilobite, …
– Unfamiliar links
  • Anthropology, Paleontology, Theology & Practices, …
– Search for articles involving
  • Unfamiliar targets
  • Unfamiliar correct links
Far more difficult than predicted by a model that only describes scent following
Blackmon et al. (2002) Experiment
• Partial replication of Larson & Czerwinski (1998)
• Fix unfamiliar target problems
  – Provided participants with four- or five-sentence definitions of target articles
• 16 and 32 link conditions
  – One click
  – Select correct topic heading from list of 16 or 32 links
  – Link texts from Larson and Czerwinski
Tlingit culture, Tlingit tribes
Observed first-click behavior: Unfamiliar problem hid high-scent Anthropology link
Link from Tlingit webpage              Percent 1st click   Scent
Anthropology (UNFAMILIAR LINK)           9%                .47
U.S. States, Territories, & Regions     27%                .32
History of the Americas                 27%                .28
People in U. S. History                 23%                .25
Other links                             14%                <.25
12 unfamiliar tasks were significantly more difficult (p<.002)
…but unfamiliar tasks had same mean scent for correct link
[Bar chart: mean scent of correct link by set of items (all 64 items; 12 unfamiliar items; 52 non-problem, familiar items). Bar labels as extracted: 0.42, 0.42, 0.43.]
Experiments with Ethnic Minority Parents and Children
• Parents and under-18 children came together & did experiment at adjacent computers
• Most participants were adult parents with high school or middle school students, but a few children were elementary school students
• Each task asked them to search for things familiar to children, for example, ferns, pets, earth-moving equipment, or Mexican art
• 9 category headings always in left hand navigation column, level-2 pages reveal links for one category (e.g., history) in content area
Goals had pictures to help participants quickly grasp what they were looking for,
and texts written at 3rd-grade level
College-level vs. ethnic minority adults and children (3rd-grade reading level)
College-level population
• Encarta links 13% unfamiliar, scattered sparsely across the page
• < 5% low frequency words
• Powerful, consistent evidence that even the college-level population flounders on unfamiliar links but is helped by grouping of links into coherent patches
Ethnic minority adults and children
• Encarta links 56% unfamiliar, and some patches are virtually all unfamiliar links
• 52% low frequency words
• Our preliminary observations: ethnic minority population frequently resorted to trial-and-error exhaustive searching of entire patches
9 top-level category links were surprisingly difficult though predicted
College-level population
• Top-level categories familiar (e.g., Social Science, History)
• 0% low frequency words at top level
• College-level population good at scent following on 9 top-level category headings
Ethnic minority population
• 44% (4 of 9) of top-level categories are unfamiliar
• 41% low frequency words at top level
• Ethnic minority population poor at scent following on 9 top-level category headings
4. CoLiDeS
CoLiDeS is a Comprehension-Based Model in HCI
• Simulates a user searching for information relevant to her goal on a Web site
• Based on the Construction-Integration (C-I) architecture (Kintsch, 1998)
  – C-I is a detailed account of how background knowledge is used to comprehend or to plan actions
  – C-I can fill in gaps in knowledge using hill climbing or pure forward search as a problem solving strategy
  – Hill climbing is controlled by information scent
CoLiDeS Starts with a Complex Goal
• I am interested in reading recent articles that deal with prediction of sea level rise in the near future caused by global warming. I am going to browse the Science section of the New York Times Website to find articles.
• Can include:
  – Description of search target
  – Navigation
CoLiDeS Parses Web Page into Patches to Select a Link
For each page for a given goal:
• Attention phase: Select a patch
  – Parse page into collection of patches
  – Describe each patch
  – Select a patch whose description is comprehended AND is most similar to the goal
• Action selection phase: Select a link
  – Describe each link
  – Select a link whose description is comprehended AND is most similar to the goal
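The two phases can be sketched as the same selection rule applied twice, first to patches and then to the links of the chosen patch. This is an illustrative sketch, not the CoLiDeS implementation: the class names, the `comprehended` flag, and the precomputed `similarity` field are all assumptions standing in for the model's comprehension processes. The link names and scent values in the usage example come from the Tlingit first-click slide, while the patch heading and its similarity are made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class Item:
    description: str
    similarity: float          # goal similarity, e.g. an LSA-style estimate
    comprehended: bool = True  # False models an unfamiliar description


@dataclass
class Patch(Item):
    links: List[Item] = field(default_factory=list)


def select(items: Sequence[Item]) -> Optional[Item]:
    """Pick the comprehended item most similar to the goal, if any."""
    candidates = [item for item in items if item.comprehended]
    if not candidates:
        return None
    return max(candidates, key=lambda item: item.similarity)


def choose_link(patches: Sequence[Patch]) -> Optional[Item]:
    """Attention phase (select a patch), then action selection (select a link)."""
    patch = select(patches)
    if patch is None:
        return None
    return select(patch.links)


links = [
    Item("Anthropology", 0.47, comprehended=False),  # unfamiliar link
    Item("U.S. States, Territories, & Regions", 0.32),
    Item("History of the Americas", 0.28),
    Item("People in U. S. History", 0.25),
]
page = [Patch("hypothetical heading", 0.5, True, links)]
print(choose_link(page).description)
```

Because the highest-scent link is not comprehended, it is never a candidate, which mirrors the point that an unfamiliar link can hide a high-scent correct choice.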
Dec. 14 NY Times
A Page Parsed Into Patches
[Diagram: the page divided into labeled patches: Logo, Site Nav Bar, Search Window, Site Nav Links, Articles, Topics (two regions), Info, and several Ad regions]
Attended to Patch
Determining Patches on A Page
– A mixture of bottom-up and top-down processes
– Bottom-up processes
  • Driven by perceptual features that define visually related regions
– Top-down processes
  • Controlled by the user's knowledge of Web page layout conventions and typical pages for a given Web site or type of Web site
Examples of Phenomena to be Explained by Attention Phase
• Interactions between user's background knowledge, goals, and attention to different patches on a Web page
  – Banner "blindness"
  – Eye movement patterns during interactions with a Web site
CoLiDeS’s Key Assumptions
• Core process underlying Web navigation is comprehension of texts and images
• Comprehension processes
  – Build mental representations of goals, patches on a page, hyperlinks, images, and other targets for action on a page
  – Compare goal with representations of patches, hyperlinks, images, and other targets for action on a page
  – Select a patch to attend to, or an object on the page to act on, based on comprehension AND similarity of descriptions
5. Multiple Patches on a Web Page
Blackmon et al. (2003, 2005) Experiments
• Search for target article on Encarta-like website
• First two levels of hierarchy presented on one or two web page(s)
• Prototype is 93 links nested under 9 category headings
  – Each heading and subordinate links defined a patch
  – Participant task: Select correct link in correct patch
  – Clicking on link led to web page with 8 article titles on page
  – Click on matching title
• Time limit: 130 seconds
• Example task: Find encyclopedia article on Dome of the Rock
Example of One Level Web Page
Summary of Target Article
Religion & Philosophy is the Most Attractive Patch
Webpage for Find Dome of the Rock Task
• One competing patch on page
  – Art, Language & Literature (correct patch)
  – Religion & Philosophy (competing patch)
  – Geography
  – History
  – Social Science
  – Performing Arts
  – 3 other patches
• 4 competing links in most attractive patch:
  – Theology & Practices
  – Religions & Religious Groups
  – Scripture
  – Religious Figures
  – 3 more links in patch
First-click Distribution
[Bar chart: percent of all first clicks by heading/patch (Religion & Philosophy; Geography; Art, Language & Literature; History; Other), shown for the 1-level and 2-level designs. Art, Language & Literature is the correct patch. Data labels as extracted: 54%, 26%, 8%, 8%, 4% and 70%, 18%, 2%, 4%, 6%.]
Performance on Dome of the Rock Task
• N = 38
• Mean total clicks on links = 8.3
• Mean time = 115 seconds
• 58% time expired
• 42% finally clicked the correct link, "Architecture", nested under Art, Language, & Literature
6. Interactions Between Search and Sensemaking
Blackmon, et al. (2002, 2003, 2005) Experiments
• Examining combined impact of
  – Design errors that mislead a scent following heuristic
    • Competing links in incorrect patches
    • Weak scent correct links
    • Unfamiliar links interfering with sensemaking
• Cognitive Walkthrough for the Web can:
  – Correctly identify the above problems
  – Guide successful correction of these problems
Combined Analysis of Blackmon, et al. (2002, 2003, 2005)
• Task => College students search for a target article in Encarta-like website
• Mean clicks and time for 324 tasks
  – N's ranged from 22 to 50, mean N about 40
• Extensive exploration using multiple regression techniques
• Independent variables
  – Total competing links in incorrect patches
  – Weak scent correct link (yes, no)
  – Unfamiliar correct link (yes, no)
• Examples of other variables…
Examples of Other Independent Variables
• Serial positions of links
• Serial positions of patches
• Correct patch scent values
• Correct link scent values
• Number of competing patches
• Lots of other variables…
Overview of Results
• Mean number of clicks ranged from 1.1 to over 10
• Percent solvers ranged from 100% to 25%
• Correlation between clicks and percent solvers = .93
• Three variables account for 51% of variance
  – Total competing links in incorrect patches
  – Weak scent correct link (yes, no)
  – Unfamiliar correct link (yes, no)
Competing Links in Incorrect Patches Increase Task Difficulty (p<.0001)
[Chart: group sizes as extracted: n=236, n=88, n=33, and n=112 tasks]
Weak-Scent Correct Links Increase Task Difficulty (p<.0001)
[Chart: group sizes as extracted: n=271, n=53, n=15, and n=112 tasks]
Unfamiliar Correct Links Increase Task Difficulty (p<.0001)
[Chart: group sizes as extracted: n=268, n=25, n=112, and n=56 tasks]
Conclusions
• Competing Links in Incorrect Patches
• Unfamiliarity
• Means-Ends Analysis
Competing Links in Incorrect Patches
• Patch contains 0 high or moderate scent links
  – Little or no impact on performance
  – Leave patch quickly
  – Low cost of going to new patch on same page
• Patch contains 2 or more high or moderate scent links
  – Large impact on performance
• Garden path effects determined by TOTAL number of competing links in incorrect patches
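The total named in the last bullet can be counted as in the following sketch. It treats a competing link as any link with moderate or high scent in a patch other than the correct one. The function name, the dictionary layout, and the use of s > 0.2 as the lower edge of "moderate" (taken from the scent bands given earlier in the deck) are assumptions, and the example scent values are made up.

```python
from typing import Dict, List


def total_competing_links(patch_link_scents: Dict[str, List[float]],
                          correct_patch: str,
                          moderate_threshold: float = 0.2) -> int:
    """Count moderate-or-high scent links across all incorrect patches."""
    return sum(
        1
        for heading, scents in patch_link_scents.items()
        if heading != correct_patch      # only incorrect patches compete
        for s in scents
        if s > moderate_threshold        # moderate or high scent
    )


# Hypothetical page: three patches with made-up link scent values.
page_scents = {
    "Patch A": [0.35, 0.10],
    "Patch B": [0.25, 0.31, 0.05],
    "Patch C": [0.40],
}
print(total_competing_links(page_scents, correct_patch="Patch C"))
```

Summing over every incorrect patch, rather than looking at one patch at a time, follows the emphasis on the TOTAL number of competing links.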
Number of Moderate or High Scent Incorrect Links vs. Observed Clicks
[Bar chart: mean observed clicks on links by number of incorrect links (n=324 tasks). Data labels, in extracted order for 0, 1, 2, 3, and 4 or more incorrect links: 2.331, 3.591, 5.045, 5.942, 6.512.]
Unfamiliarity
• Two levels
  – Incomplete knowledge of meaning of a superordinate concept
    • e.g., Anthropology, college students
    • Only 5% of links in Blackmon et al. (2002,…)
  – No knowledge of a word
    • Specialized technical terms for college students
    • Many superordinate terms for individuals with 3rd to 6th grade reading skills
Impact of Unfamiliarity
• 5% of link terms (1st year college reading level)
  – Problem solving skills necessary to make inferences
  – Infer meaning from other terms in patch
  – Minor problem
• 50% of link terms (3rd grade reading level)
  – Locus of unfamiliarity is superordinate concepts
  – Too many unknown words to be able to make inferences
  – May lack necessary problem solving skills
Scent Following Is Means-Ends Analysis
• Limited by ability to comprehend links
• Information foraging by scent following is a simple form of Means-Ends Analysis (MEA)
  – Exhibits all of MEA's failure modes
• Classical problem solving literature is directly relevant
  – Operator subgoaling
    • Navigation goals
    • Patch enrichment activities
CoLiDeS + SNIF-ACT
• CoLiDeS
  – A webpage defines multiple patches
  – Both search and sensemaking require comprehension
    • Search requires comprehension of structure and content of a web page
    • Sensemaking entails comprehension of retrieved information
CoLiDeS + SNIF-ACT (Cont.)
• SNIF-ACT
  – ACT-R cognitive architecture
    • Learning mechanisms
    • Perceptual-motor mechanisms
  – Satisficing decision cycle
    • Click on best link found so far*
    • Process another link
    • Click on back button
Contact Information
• Marilyn Blackmon
  – [email protected]
• Muneo Kitajima
  – [email protected]
• Peter Polson
  – [email protected]