COMO CAMPUS
Models and interaction mechanisms for exploratory interfaces
Luigi Spagnolo
1 Information and Communication Quality
Index 2
¨ PREVIEW: Online experimentation!
¨ Part I: Navigation, search and exploration
  ¤ Break
¨ Part II: Faceted search: the model(s) and the interaction
¨ Visualization issues will be covered in another lecture
3 PREVIEW: Online Experimentation
Intro 4
¨ This lecture starts in a rather unusual way :-)
¨ To introduce you to exploratory interfaces, you will take part in a research experiment
¨ But don’t worry!
  ¤ It’s not dangerous for your health :-)
  ¤ The questionnaire you are asked to fill in is anonymous, and the answers will not be graded
The application | 1 5
The application | 2 6
¨ The latest version of a prototype built for the Italian Ministry of Culture
¨ A map for exploring venues of archaeological interest in Italy
¨ According to three properties (facets):
  ¤ Kind of venue: museum, archaeological site or superintendence (a local branch of the Ministry of Culture devoted to archaeological heritage management).
  ¤ Location: the venue location, at the level of macro-area (Northern Italy, Central Italy, etc.), Italian region and Italian province.
  ¤ Civilization or period: the ancient civilizations (Romans, Greeks, etc.) or periods (e.g. Middle Ages, Bronze Age) the venues are relevant to.
The application | 3 7
¨ The tag cloud:
  ¤ Tag size → the number of results that are relevant to the period or civilization in question.
  ¤ Text color → how much the percentage of results that are relevant to the period/civilization deviates from a uniform distribution.
    ▪ Shades of green show a stronger positive correlation between the other selected filters (e.g. the location and/or the venue type) and the civilization/period in question. Red instead shows a negative correlation (the civilization/period is less significant with respect to the other selected criteria).
  ¤ Background color → of all the venues that are relevant to the period/civilization, what percentage is included in the results?
    ▪ Green shows a positive correlation, red a negative one.
  ¤ E.g., for venues in a specific region only (e.g. Lombardy), a green tag indicates that the given civilization was particularly relevant for that region.
  ¤ A green background shows instead that the civilization is peculiar to that region, and is less likely to be found elsewhere.
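One way to read the three tag-cloud encodings is as three simple statistics per civilization. The sketch below is only an assumption, not the prototype's actual code: the slides give no formulas, so the text-color score is taken here as the difference between a civilization's share in the filtered results and its share in the whole collection, and the venue data are made up.

```python
from collections import Counter

# Hypothetical toy data: (region, civilization) for each venue.
venues = [
    ("Lombardy", "Celts"), ("Lombardy", "Celts"), ("Lombardy", "Romans"),
    ("Lazio", "Romans"), ("Lazio", "Romans"), ("Lazio", "Etruscans"),
    ("Sicily", "Greeks"), ("Sicily", "Romans"),
]

def tag_stats(all_venues, region):
    """Per-civilization statistics for the venues of one region."""
    results = [v for v in all_venues if v[0] == region]
    in_all = Counter(civ for _, civ in all_venues)
    in_results = Counter(civ for _, civ in results)
    stats = {}
    for civ, count in in_results.items():
        share_in_results = count / len(results)
        share_overall = in_all[civ] / len(all_venues)
        stats[civ] = {
            "size": count,                               # tag size
            "text": share_in_results - share_overall,    # > 0 green, < 0 red
            "background": count / in_all[civ],           # share of the civilization's venues shown
        }
    return stats

lombardy = tag_stats(venues, "Lombardy")
```

In this toy data the Celts tag for Lombardy gets a positive text score and a background share of 1.0 (all Celtic venues are in Lombardy), while the Romans tag gets a negative text score.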
The application | 4 8
¨ The map:
  ¤ At three levels: Italian region, Italian province, exact location(s)
  ¤ The color of the circle → the specific type of venue
  ¤ The size of the circle → the number of items of that type in that area
The experiment 9
¨ Go to http://tinyurl.com/exp-icq
  ¤ (or http://www.ellesseweb.com/mining/)
¨ You will find a page with two links:
  1. The application
  2. An online questionnaire (on Rational Survey)
  ¤ Keep both open in the browser
¨ Work individually (1 hour max)
¨ Answer with your own opinions, without looking at other websites, just at the ArchaeoItaly application
  ¤ Remember: the survey is anonymous, and there are no “correct answers”!
  ¤ If you have any doubts, ask me!
10 Part 1 | Navigation, search and exploration
Let’s start with a scenario 11
¨ Work in pairs
¨ Imagine you work as journalists for Horse Illustrated magazine
¨ You have to write an essay about horses in art (and in particular in painting) through the centuries.
¨ Find interesting information on the website of the Louvre Museum
  ¤ http://www.louvre.fr/llv/commun/home.jsp?bmLocale=en
Problems with the Louvre 12
¨ Artworks are separated by department (internal “bureaucratic” classification) and by provenience.
¨ It is not possible to search them together (regardless of their age and country of origin) by subject.
¨ There is no introductory content on the subject that can guide the student in her search.
Content-intensive websites 13
¨ Also known as:
  ¤ Information-intensive
  ¤ Often infosuasive = informative + persuasive
  ¤ Like ancient rhetoric: inform and persuade
¨ Mainly intended for:
  ¤ Learning, understanding, discovering, comparing information
  ¤ Leisure and entertainment
Contents 14
¨ Text, multimedia (audio, video, images)
¨ Hypermedia = multimedia + hyperlinks
¨ Information involves subjective judgment
  ¤ Depends on the author and on the user
  ¤ Objective: “10 km from Como”, “the painting was made in 1886”
  ¤ Subjective: “near Como”, “the painting is impressionist”
User experience requirements | 1 15
¤ From the users’ point of view:
  ▪ Usability: usage is effective, efficient and satisfactory
  ▪ Findability: users can locate what they are looking for
  ▪ “At a glance” understandability: users understand the website coverage and can make sense of information
  ▪ Enticing explorability: users are compelled to “stay and play” and discover interesting connections among topics
User experience requirements | 2 16
¤ From the stakeholders’ point of view:
  ▪ Planned serendipity: promoting the most important contents so that users can stumble upon them
    ▪ E.g. “Readers who purchased this book also bought…”
  ▪ Communication strength and branding: the website conveys the intended “message” and “brand” of the institution behind it
    ▪ E.g. “we have the lowest prices”, “we are very authoritative”, etc.
Information architecture 17
¤ Purpose: conceptually organizing information
¤ Providing access to contents
  ▪ Index navigation (a)
  ▪ Guided navigation (b)
¤ Providing the possibility of moving from one content item to related ones
  ▪ Contextual navigation (c): cross-reference links, semantic relationships
“Traditional” structure 18
¤ Taxonomy: hierarchy of categories and subcategories
  ▪ Sections and groups of contents are the branches of the tree
  ▪ Contents are the leaves
¤ Cross-reference links between nodes
An example 19
Sitemap: Art gallery website
¤ Artworks of the month
¤ Paintings
  ▪ Top 10 masterpieces
  ▪ By artist
  ▪ By artistic movement
  ▪ By subject
¤ Sculptures
  ▪ ...
  ▪ By material
¤ Photographs
  ▪ ...
Problems/1 20
¨ What if I want to browse all artworks (regardless of their type) by artist?
  ¤ Classifications are “nested” in a fixed order
  ¤ Designers must choose which classification should prevail (e.g. by type)
¨ What if I want to find “impressionist paintings portraying animals”?
  ¤ I cannot combine multiple “sibling” classifications (e.g. by style and by subject)
Problems/2 21
¨ As long as the website is small, a good taxonomy can satisfy user requirements
¨ For large websites (hundreds or thousands of pages)
  ¤ Indexed/guided navigation doesn’t scale
  ¤ Users can’t easily find what they want
  ¤ Users can’t make sense of all this information
Solutions? 22
¨ What do users do when navigation doesn’t work?
  ¤ They use search!
  ¤ Search arranges contents dynamically and automatically (in a way not predefined by designers)
¨ But keyword-based search is not optimal
  ¤ No hints for users who have no clear idea of what to look for
  ¤ Users must know how the information is described (e.g. the specific jargon used)
  ¤ Only suited to retrieval/focused search
¨ We need a better paradigm: exploratory search
Exploratory search 23
¨ The “query → results” model is too simple
¨ Search is often like berry picking! (Bates 1989)
  ¤ Users explore a corpus of contents
  ¤ They refine the query (again and again) according to what they learn
  ¤ They pick information here and there, piece by piece
From search to exploration 24
¨ From finding to understanding (Marchionini)
  ¤ Acquire knowledge about a domain, its jargon, the properties of information items in it.
  ¤ Useful to (better) understand what to look for
  ¤ …but also to analyze a dataset
Goals of exploratory applications 25
¨ Object seeking
  ¤ Identify the best object(s) whose features match user requirements (e.g. purchasing a camera with concerns regarding price, resolution, etc.)
¨ Knowledge seeking
  ¤ Expand the knowledge about a given topic and related information (e.g. Leonardo da Vinci and the Italian Renaissance)
¨ Wisdom seeking
  ¤ Discover interesting relationships among features in an information space/dataset (e.g. analysis of sales in Esselunga chain stores, according to store location, type of article, price, etc.)
¨ These goals can coexist in the same application
Retrieval vs. exploration models 26
¨ Retrieval model: query + results
  ¤ The query can be either:
    ▪ Free form (e.g. keyword-based search)
    ▪ Structured (parametric search, e.g. Scholar advanced search)
    ▪ Guided (select data from a predefined set of choices)
¨ Exploration model:
  ¤ Query + results + refinements/feedback
  ¤ Query supported by self-adaptive structures for:
    ▪ Further filtering the results to a subset of them
    ▪ Summarizing the features shared by the results
27 Part 2 | Faceted search: model(s) and interaction
(Amazon’s Diamond search was one of the first e-commerce applications of faceted search)
Faceted search 28
¨ An exploratory search/navigation pattern based on progressive filtering of results
¨ The user selects a combination of metadata values belonging to several facets
¨ Each facet corresponds to a particular dimension that describes the content objects made available for search, e.g. for an artwork:
  ¤ Subject: people portrayed, flowers and plants, abstract...
  ¤ Medium: painting, sculpture, photography...
  ¤ Technique: oil, watercolors, digital art...
  ¤ Style: impressionism, expressionism, abstractism...
  ¤ Location: Prado, Louvre, Guggenheim
Let’s see a couple of examples 29
¨ Two examples:
  ¤ http://orange.sims.berkeley.edu/cgi-bin/flamenco.cgi/famuseum/Flamenco
  ¤ http://www.artistrising.com
¨ Try the same search we’ve seen before: find horses in art
¨ More examples at: http://www.flickr.com/photos/morville/collections/72157603789246885/
Not just a matter of finding… 30
¨ E.g. you can learn that horses in art are often found in paintings portraying soldiers, warriors and leaders
How the interaction works 31
¨ When the user chooses a filter, the application selects:
  ¤ The results: items that have been “tagged” with the filter and the other metadata previously chosen
  ¤ The remaining filters: metadata that, combined with the previous choices, can still produce results
¨ The users can continue narrowing the results as long as options are available
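The two selections described above (results and remaining filters) can be sketched in a few lines. This is a minimal illustration under stated assumptions: items are modelled as sets of (facet, value) tags, and the catalogue, item ids and function names are hypothetical.

```python
from collections import Counter

# Hypothetical catalogue: item id -> set of (facet, value) tags.
ITEMS = {
    "a": {("subject", "horse"), ("technique", "oil"), ("style", "romanticism")},
    "b": {("subject", "horse"), ("technique", "oil"), ("style", "impressionism")},
    "c": {("subject", "plants"), ("technique", "oil"), ("style", "impressionism")},
    "d": {("subject", "horse"), ("technique", "watercolor"), ("style", "expressionism")},
}

def select(items, chosen):
    """The results: items tagged with every chosen (facet, value) filter."""
    return {name: tags for name, tags in items.items() if chosen <= tags}

def remaining_filters(results, chosen):
    """The remaining filters: tags found in the results, with their counts."""
    counts = Counter(tag for tags in results.values() for tag in tags)
    return {tag: n for tag, n in counts.items() if tag not in chosen}

chosen = {("subject", "horse")}
results = select(ITEMS, chosen)
options = remaining_filters(results, chosen)
```

Because the offered options are computed from the current results, any option the user picks next is guaranteed to return at least one item.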
A (generalized) formal model | 1 32
¨ Taxonomy: a pair $(T, \preceq)$
  ¤ A set of concepts or terms $T = \{t_1, t_2, \ldots, t_n\}$
  ¤ The subsumption relation $\preceq$ connecting narrower terms (hyponyms) to broader concepts (hypernyms)
  ¤ Terminal concepts: terms not further specialized (the “leaves”)
¨ Examples:
  ¤ $\text{laptop} \preceq \text{computer}$
  ¤ $\text{location:'Como'} \preceq \text{location:'Lombardy'} \preceq \text{location:'Italy'}$
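A taxonomy of this kind can be sketched as a map from each term to its direct broader term. The representation and the function names below are assumptions made for illustration; the terms are the ones used as examples on the slide.

```python
# Each term points to its direct broader term (hypernym).
BROADER = {
    "location:Como": "location:Lombardy",
    "location:Lombardy": "location:Italy",
    "laptop": "computer",
}

def subsumed_by(narrow, broad):
    """True if `narrow` is the same as, or narrower than, `broad`."""
    term = narrow
    while term is not None:
        if term == broad:
            return True
        term = BROADER.get(term)  # climb one level; None at the root
    return False

def terminal_concepts():
    """Terms that are not the broader term of anything (the leaves)."""
    terms = set(BROADER) | set(BROADER.values())
    return terms - set(BROADER.values())
```

Subsumption is treated here as reflexive and transitive, so `subsumed_by("location:Como", "location:Italy")` holds even though the two terms are not directly linked.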
A (generalized) formal model | 2 33
¨ For faceted taxonomies, concepts are given in terms of property-value pairs (restrictions):
  ¤ E.g. subject: “horse”, location: “Como”
¨ A query is any of:
  ¤ A restriction: $q = \mathit{property} : \mathit{value}$
  ¤ A conjunction, disjunction or negation of (sub)queries: $q_1 \text{ and } q_2$, $q_1 \text{ or } q_2$, $\text{not } q$
  ¤ In practice, current facet browser implementations limit the ways in which concepts can be combined
A (generalized) formal model | 3 34
¨ Item description: an information item $o \in O$ is described as a conjunction of restrictions
  ¤ $d(o) = \text{subject:"horse"} \text{ and } \text{style:"Impressionism"} \text{ and } \ldots$
¨ Extension of a query: the set of items in a context $O$ that match the query
  ¤ $ext_O(q) = \{\, o \in O \mid d(o) \preceq q \,\}$
¨ Properties:
  ¤ $ext(q_1 \text{ and } q_2) \subseteq ext(q_1),\ ext(q_2)$
  ¤ $ext(q_1),\ ext(q_2) \subseteq ext(q_1 \text{ or } q_2)$
  ¤ $ext(\text{not } q) \equiv ext(\text{ALL}) \setminus ext(q)$
  ¤ $t_c \preceq t_p \Rightarrow ext(t_c) \subseteq ext(t_p)$
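The notion of extension and its set properties can be checked on a toy context. The sketch below is a simplification under stated assumptions: queries are encoded as nested tuples (an encoding chosen here, not taken from the slides), a restriction matches only when the exact property-value pair appears in the item description, and taxonomy subsumption is ignored.

```python
# Hypothetical context O: item -> description d(o), a set of restrictions.
O = {
    "o1": {("subject", "horse"), ("style", "Impressionism")},
    "o2": {("subject", "horse"), ("style", "Romanticism")},
    "o3": {("subject", "flowers"), ("style", "Impressionism")},
}

def ext(q):
    """Extension of query q in O.

    Queries are nested tuples: ("is", property, value),
    ("and", q1, q2), ("or", q1, q2), ("not", q).
    """
    op = q[0]
    if op == "is":
        return {o for o, d in O.items() if (q[1], q[2]) in d}
    if op == "and":
        return ext(q[1]) & ext(q[2])
    if op == "or":
        return ext(q[1]) | ext(q[2])
    if op == "not":
        return set(O) - ext(q[1])  # ext(ALL) minus ext(q)
    raise ValueError(f"unknown operator: {op}")

horse = ("is", "subject", "horse")
impressionism = ("is", "style", "Impressionism")
```

With this data, `ext(("and", horse, impressionism))` contains only `o1`, and it is a subset of both `ext(horse)` and `ext(impressionism)`, as the conjunction property requires.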
A (generalized) formal model | 4 35
¨ The result of a query is:
  ¤ Its extension $ext_O(q)$ in the given information space
  ¤ The set of features shared by these results: i.e. all the concepts that can be derived from the descriptions of objects in $ext_O(q)$
Query transformations 36
¨ Operations that let the user navigate from one state of the exploration to another
  ¤ Appending new restrictions to the query in conjunction (zoom-in: from a wider to a narrower set of results)
  ¤ Adding alternatives in disjunction to the existing ones (zoom-out: from a narrower to a wider set)
  ¤ Removing existing constraints (zoom-out again)
  ¤ Negating/excluding values
  ¤ Replacing a filter with another (shift)
¨ Implemented by hyperlinks (for conjunctive filters / shift), check boxes (for disjunctions), etc.
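The transformations can be sketched as pure functions over a query state. The representation is an assumption made for this example: a query is a dictionary from facet to a set of accepted values (alternatives within a facet, conjunction across facets), items have one value per facet, and negation is left out for brevity. All names and data are hypothetical.

```python
def zoom_in(query, facet, value):
    """Append a restriction on a new facet (conjunction)."""
    return {**query, facet: {value}}

def add_alternative(query, facet, value):
    """Add an alternative value to a facet (disjunction, zoom-out)."""
    return {**query, facet: query.get(facet, set()) | {value}}

def remove(query, facet):
    """Drop every constraint on a facet (zoom-out)."""
    return {f: v for f, v in query.items() if f != facet}

def shift(query, facet, old, new):
    """Replace one value of a facet with another."""
    return {**query, facet: (query[facet] - {old}) | {new}}

def matches(item, query):
    """Conjunction across facets, disjunction within a facet."""
    return all(item.get(f) in values for f, values in query.items())

items = [
    {"style": "impressionism", "location": "Como"},
    {"style": "impressionism", "location": "Milan"},
    {"style": "cubism", "location": "Como"},
]

def count(query):
    return sum(matches(item, query) for item in items)

q1 = zoom_in({}, "style", "impressionism")
q2 = zoom_in(q1, "location", "Como")
q3 = add_alternative(q2, "location", "Milan")
q4 = shift(q3, "style", "impressionism", "cubism")
q5 = remove(q4, "location")
```

Each function returns a new state rather than changing the old one, which maps naturally onto hyperlinks: every link simply encodes the state it leads to.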
How values are (usually) combined 37
¨ Filters belonging to different facets are combined in conjunction
  ¤ E.g. “technique:oil” AND “style:impressionism”
¨ Filters belonging to the same facet are:
  ¤ Combined in conjunction if the facet admits several values at the same time for each object
    ▪ E.g. “subject:people” AND “subject:animals”
    ▪ (both people and animals in the same picture)
  ¤ Combined in disjunction if the facet admits only one value
    ▪ E.g. “location:Milan” OR “location:Como”
    ▪ (an object which is in Como or in Milan)
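The combination rule can be written down directly. In this sketch the set of multi-valued facets is an assumption supplied by hand, and the two pictures are hypothetical.

```python
# Assumption: which facets may hold several values per object.
MULTI_VALUED = {"subject"}

def facet_matches(item, facet, selected):
    values = item.get(facet, set())
    if facet in MULTI_VALUED:
        return selected <= values    # AND: every selected value is present
    return bool(selected & values)   # OR: any selected value is present

def matches(item, query):
    # Filters on different facets are always combined in conjunction.
    return all(facet_matches(item, f, sel) for f, sel in query.items())

picture = {"subject": {"people", "animals"}, "location": {"Como"}}
portrait = {"subject": {"people"}, "location": {"Milan"}}

query = {"subject": {"people", "animals"}, "location": {"Milan", "Como"}}
```

Here `picture` matches the query, while `portrait` does not: it satisfies the location alternative but lacks the `animals` subject.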
Types of facets 38
¨ Single-valued (functional properties) vs. multi-valued
¨ Flat vs. hierarchical organization of values
  ¤ E.g. hierarchical: nation/region/province
¨ Subjective/arbitrary (properly named facets) vs. objective (attributes)
  ¤ A date, a location, a price are examples of objective data
  ¤ “Topic”, “Audience”, “Artistic movement”, “Importance” are examples of subjective information
  ¤ Assigning/using a value involves some kind of judgment and interpretation and is influenced by cultural and personal backgrounds
Types of facet values 39
¨ Terms (strings of text)
  ¤ Taxonomies, controlled vocabularies
  ¤ User-defined tags (folksonomies)
  ¤ From data mining
¨ Numerical values and dates
¨ Boolean values (yes/no)
  ¤ E.g. “Available for buying?”, “Original?”, “Still living?”
¨ Even shades of color, shapes, etc.
¨ Sortable and comparable?
  ¤ Can we say that value1 <= value2 <= … <= valueN?
  ¤ E.g. dates, magnitudes, scales of judgment, quantitative data
    ▪ e.g. “sufficient” < “excellent”, 10€ < 100€, “Monday” < “Friday”
  ¤ Ranges [value1, value2]
    ▪ E.g. the user is allowed to search for events from 01/06 to 31/08
  ¤ Classes of values
    ▪ e.g. for price: 0-10€, 11-20€, 21-50€, 51-100€, …
    ▪ The way we define classes is arbitrary and depends on the domain
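Grouping a numeric facet into classes amounts to a lookup against the class boundaries. The sketch below uses the price classes listed above; the final catch-all label is an assumption added so that every price maps to some class, and prices are assumed to be whole euros.

```python
from bisect import bisect_left

# Upper bounds (in euros) of the classes 0-10, 11-20, 21-50, 51-100.
UPPER_BOUNDS = [10, 20, 50, 100]
# The last label is a hypothetical catch-all, not taken from the slide.
LABELS = ["0-10€", "11-20€", "21-50€", "51-100€", "over 100€"]

def price_class(price):
    """Label of the class that a whole-euro price falls into."""
    return LABELS[bisect_left(UPPER_BOUNDS, price)]
```

Changing the boundaries changes which items end up together, which is why the choice of classes depends on the domain.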
Benefits of faceted search 40
¨ Easy and natural, almost like “traditional” browsing
¨ Compared with keyword-based search, users have hints
  ¤ Users can more easily make sense of information (if supported by good interfaces)
  ¤ …and learn about the context by interacting with it
¨ Users can freely combine multiple classifications as they wish
  ¤ In traditional browsing, when you reach a terminal concept you can’t refine further
  ¤ With faceted search, you can continue refining with related concepts
¨ Navigation is safe: frustrating “no results found” searches are avoided
  ¤ Only concepts that have been used to classify the current set of results are displayed
Limitations 41
¨ It works well only with structured data
¨ Faceted search does not provide a ranking of results
  ¤ For “object seeking” tasks this might be a limitation
  ¤ It may be better to compute the “distance” with respect to an “optimal” solution → optimization task
¨ Other limitations are discussed in the following slides on advanced issues
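As an illustration of the “distance from an optimal solution” idea, items can be ranked by how far they are from an ideal object. This is only one possible reading: the scaled Euclidean distance, the cameras and all the numbers below are assumptions made for the example.

```python
# Hypothetical cameras and an "ideal" one; all numbers are made up.
cameras = {
    "A": {"price": 300, "megapixels": 12},
    "B": {"price": 450, "megapixels": 24},
    "C": {"price": 900, "megapixels": 26},
}
ideal = {"price": 400, "megapixels": 24}

# Value ranges in the toy data, used to make the features comparable.
scale = {"price": 600, "megapixels": 14}

def distance(item, target, scale):
    """Euclidean distance after dividing each feature by its scale."""
    return sum(((item[k] - target[k]) / scale[k]) ** 2 for k in target) ** 0.5

ranking = sorted(cameras, key=lambda name: distance(cameras[name], ideal, scale))
```

Unlike a filter, this never returns an empty set: an item that misses the ideal on one feature is ranked lower instead of being excluded.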
Advanced (research) issues 42
Full Boolean queries | 1 43
¨ How to achieve something like this?
“Given a budget of 250,000 euros, I’m interested in a flat with at least 4 rooms and without central heating in the centre, or a house with at least 5 rooms in the suburbs”
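Written as a Boolean expression, the request mixes conjunction, disjunction and negation in a way that a flat list of facet filters cannot express. The sketch below assumes one reading of the sentence (the budget applies to both alternatives, and the flat must be in the centre); the property records and key names are hypothetical.

```python
def wanted(p):
    """Budget AND ((flat, 4+ rooms, NOT central heating, centre) OR (house, 5+ rooms, suburbs))."""
    flat = (p["type"] == "flat" and p["rooms"] >= 4
            and p["heating"] != "central" and p["area"] == "centre")
    house = (p["type"] == "house" and p["rooms"] >= 5
             and p["area"] == "suburbs")
    return p["price"] <= 250_000 and (flat or house)

properties = [
    {"id": 1, "type": "flat", "rooms": 4, "heating": "autonomous", "area": "centre", "price": 240_000},
    {"id": 2, "type": "flat", "rooms": 5, "heating": "central", "area": "centre", "price": 200_000},
    {"id": 3, "type": "house", "rooms": 6, "heating": "central", "area": "suburbs", "price": 250_000},
    {"id": 4, "type": "house", "rooms": 6, "heating": "autonomous", "area": "suburbs", "price": 300_000},
]

selected = [p["id"] for p in properties if wanted(p)]
```

Note that the heating constraint applies only inside the “flat” branch, so the query is a tree of sub-expressions rather than a single list of filters.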
Full Boolean queries | 2 44
¨ Foci (Ferré et al.): the set of sub-expressions in the semantic tree of the query
¨ A query is a pair $(q, \varphi)$, where $q$ is an arbitrary combination of filters and $\varphi$ is one of its foci
  ¤ The focus is used to select the subquery to which the new filter should be appended (or the transformation should be applied)
  ¤ …But also to “inspect” different points of view of information
  ¤ The main focus represents the “whole” query
Semantic faceted search 45
¨ We can filter items, but how can we filter facet values?
  ¤ E.g. paintings filtered by artist
  ¤ But how do we filter the Artist facet values by nationality, gender, age, etc.?
¨ Exploring contents at the level of sets using semantic relationships, e.g.
  ¤ The museums that have bronze Greek statues
  ¤ “Women portrayed by women”: paintings with subject:woman and artist:gender:female
  ¤ Schools attended by the daughters of U.S. Democratic presidents (http://www.freebase.com/labs/parallax/)
  ¤ Challenges: effective models and usable interfaces
¨ An example: Sewelis
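The “women portrayed by women” query can be sketched as a two-step filter: first restrict the values of the Artist facet by one of the artists' own properties, then use the surviving values to filter the paintings. The data and names are hypothetical, and this is not how Sewelis or Parallax are implemented.

```python
# Hypothetical data: artists have their own properties.
artists = {
    "artist_1": {"gender": "female", "nationality": "Italian"},
    "artist_2": {"gender": "male", "nationality": "French"},
}
paintings = [
    {"title": "p1", "subject": "woman", "artist": "artist_1"},
    {"title": "p2", "subject": "woman", "artist": "artist_2"},
    {"title": "p3", "subject": "landscape", "artist": "artist_1"},
]

def artists_where(**props):
    """Artist facet values whose own properties match the given ones."""
    return {name for name, data in artists.items()
            if all(data.get(k) == v for k, v in props.items())}

female_artists = artists_where(gender="female")
women_by_women = [p["title"] for p in paintings
                  if p["subject"] == "woman" and p["artist"] in female_artists]
```

The key difference from plain faceted search is that the facet values (artists) are themselves treated as items with facets of their own.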
Beyond binary classification | 1
¤ Classification (faceted or not) is usually binary:
¤ An item must be either relevant (1) or not relevant (0) to a certain category
¤ Problem: quite arbitrary decision in many real domains
Beyond binary classification | 2
¨ How to classify a cathedral by architectural style?
  ¤ Built upon a 6th century building
  ¤ Mainly Gothic
  ¤ 17th century (Baroque) towers
  ¤ Rebuilt during neoclassicism
  ¤ Decorations added in the 19th century
  ¤ Contains Roman forum marbles (donated by Pius IX)
  ¤ …
¨ Do we tag the cathedral with all or only some of these?
¨ A classification may be correct for one kind of user but ineffective for another
Beyond binary classification | 3
¨ The Mona Lisa is a well-known portrait of a woman, but…
¨ There is also a landscape in the background
¨ Do we classify it as “subject: woman” and “subject: Tuscan landscape” too?
Beyond binary classification | 4
¨ Onion is widely used in French cuisine
¨ How do we distinguish “onion-based” recipes from all the recipes that merely contain onion?
Beyond binary classification | 5
¨ A possible solution: associating a weight with each item-facet-value triple
  ¤ A statement about the statement
¨ Values between 0 and 1, or on other scales
¨ Queries could be specified in terms of facet-value pairs and ranges of weights
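A weighted classification can be sketched as a table of item-facet-value triples with a weight attached to each. The recipes and weights below are invented for illustration, and the 0 to 1 scale is just one of the possible scales.

```python
# Hypothetical weighted triples: (item, facet, value) -> weight in [0, 1].
WEIGHTS = {
    ("onion soup", "ingredient", "onion"): 0.9,
    ("beef stew", "ingredient", "onion"): 0.2,
    ("beef stew", "ingredient", "beef"): 0.8,
}

def select(facet, value, low=0.0, high=1.0):
    """Items tagged with facet:value whose weight lies in [low, high]."""
    return sorted(item for (item, f, v), w in WEIGHTS.items()
                  if f == facet and v == value and low <= w <= high)

with_onion = select("ingredient", "onion")
onion_based = select("ingredient", "onion", low=0.5)
```

With the default range the query behaves like ordinary binary classification; raising the lower bound separates “onion-based” recipes from recipes that merely contain onion.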
Beyond binary classification | 6
¨ Subjective weights
  ¤ Relevance: to what extent the item can be considered as belonging to a certain facet value
  ¤ Significance: the relative importance of the item according to a facet value
¨ Objective weights
  ¤ E.g. concentration or quantity (e.g. a thing is made of 10% material:bronze)
  ¤ E.g. for exploring venues: distance from points of interest
Beyond binary classification | 7
¨ Interaction (concepts)
Handling information overload 53
¨ Too many facets and facet values may generate information overload too!
  ¤ Possible solution: display only the most relevant facets (and facet values) for the user profile or the given context
¨ How to determine the most “interesting” facets in a given context?
  ¤ E.g. those with a less “uniform” distribution of values (more correlation)
  ¤ We will discuss this in a later lecture… :-)
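One possible way to measure how “uniform” the values of a facet are in the current results is normalized entropy: 1.0 for a perfectly uniform distribution, lower for a skewed one. This choice of measure is an assumption made here, not the method announced for the later lecture, and the result set is made up.

```python
from collections import Counter
from math import log2

def normalized_entropy(values):
    """Shannon entropy of the value distribution, scaled to [0, 1]."""
    counts = Counter(values)
    if len(counts) < 2:
        return 0.0  # a single value: nothing left to choose from
    total = len(values)
    h = -sum(c / total * log2(c / total) for c in counts.values())
    return h / log2(len(counts))

# Hypothetical facet values of the ten venues in the current results.
facets = {
    "civilization": ["Romans"] * 8 + ["Greeks", "Celts"],
    "kind": ["museum"] * 5 + ["site"] * 5,
}

# Least uniform facet first.
ranked = sorted(facets, key=lambda f: normalized_entropy(facets[f]))
```

A facet with a single remaining value also scores 0.0 although it offers no further refinement, so a real interface would probably handle that case separately.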
Interested in MS Theses? Contact us! :-) 54
¨ Advisors: Prof. Di Blas, Prof. Paolini
¨ Both theoretical and development
¨ Fuzzy facets
¨ Semantic faceted search
¨ Advanced visualizations
¨ …
¨ Your own ideas! :-)
Are you still alive/awake? Thank you for your attention!
Any final questions? 55