School of Computing Science
Exploring Composite Retrieval from the Users’ Perspective
by Horațiu Bota, Ke Zhou, Joemon Jose
@horatiubota
Background: Composite Search
• Finding accessories for an iPhone under budget constraints (Basu Roy et al., 2010)
• Tourist itineraries in a city within time budget (De Choudhury et al., 2010)
• Course recommendations with constraints (Parameswaran et al., 2011)
• A general framework (Amer-Yahia et al., 2013)
• Web search (Bota et al., 2014)
“Rather than returning and merging results from different verticals into the SERP, we propose to return to users a set of information objects (bundles) which are composed of results from several verticals.” (Bota et al., WWW 2014)
Problem
• What do searchers expect from these information objects?
• What characteristics of these objects are most important to searchers?
Roadmap
1. Background
• Aggregated Search
• Composite Search
2. Our work
• Study design
• Research questions
3. Findings
• Contents
• Characteristics
4. Discussion
• Insights
• Future
Our work: study design
• Exploratory user study, 40 participants
• Simulated work task: manual document aggregation (“Select most useful search results for writing a blog post”)
• Compensated £10 for ~1h total duration
• 40 different topics (MillionQuery, FedWeb)
• Live collection, 8 verticals: GW, Images, News, Videos, Social, Blog, QA, Wiki
• Sources: Bing Web Search API, YouTube API, Twitter API, Google Custom Search Engine
Our work: task design
Pre-task questionnaire
Topic briefing
Subtopic selection
Document selection
Build bundles
Document relevance judgements
Bundle characteristics assessments
Rate bundles
Pairwise preference
Post-task questionnaire
Repeated 4×
Our work: questions
1. Do users agree with each other on the subtopics they form bundles on?
2. How do users aggregate information to build bundles?
3. Which bundle characteristics are most important to users?
Findings: subtopic agreement
(1) Do users agree with each other on the subtopics they form bundles on?
% of participants per topic involved in determining subtopic agreement    100%   75%   50%
% of bundles “about” the same subtopic                                     12%   14%   16%
% of topics with at least 1 common subtopic                                32%   75%   90%
% of topics with at least 2 common subtopics                                0%   32%   85%
% of topics with at least 3 common subtopics                                0%    5%   60%
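One way to read the agreement thresholds is as a thresholded overlap count: a subtopic is "common" for a topic when at least a given fraction of that topic's participants built a bundle on it. The sketch below illustrates that reading only. The function name `common_subtopics`, the data format, the subtopic labels, and the choice of four participants per topic are all assumptions for illustration, not details taken from the study.

```python
from collections import Counter
from math import ceil

def common_subtopics(subtopics_by_participant, min_fraction):
    """Return subtopics chosen by at least min_fraction of participants.

    subtopics_by_participant: dict mapping participant id -> set of
    subtopic labels that participant built bundles on (hypothetical format).
    """
    n = len(subtopics_by_participant)
    needed = ceil(min_fraction * n)  # e.g. 0.75 of 4 participants -> 3
    counts = Counter()
    for chosen in subtopics_by_participant.values():
        counts.update(set(chosen))  # count each participant once per subtopic
    return {s for s, c in counts.items() if c >= needed}

# Toy data for one topic with four participants (invented for illustration).
topic = {
    "p1": {"history", "prices", "reviews"},
    "p2": {"history", "prices"},
    "p3": {"history", "safety"},
    "p4": {"history", "prices", "safety"},
}

for fraction in (1.0, 0.75, 0.5):
    print(fraction, sorted(common_subtopics(topic, fraction)))
```

Lowering the threshold can only grow the set of common subtopics, which matches the direction of the percentages in the table as the threshold moves from 100% to 50%.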
Findings: content
(2) How do users aggregate information to build bundles?
[Figure: number of documents (0 to 3) per vertical: GW, Image, Video, News, Social, Blog, Wiki, QA]
[Figure: percentage of bundles (up to 30%) by number of verticals in bundle (1 to 7)]
[Figure: vertical distribution in 3-vertical bundles (up to 30%): GW, Image, Video, News, Blog, Wiki, QA]
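The quantities plotted on this slide can be derived from a simple bundle representation. The sketch below is a minimal illustration, assuming a bundle is a list of (document id, vertical) pairs. The representation, the helper names, and the sample bundles are hypothetical, and whether the study's vertical distribution was computed over documents (as here) or over bundles is not stated on the slide.

```python
from collections import Counter

# A bundle is modelled as a list of (document id, vertical) pairs.
# Both the representation and the sample bundles are hypothetical.
bundles = [
    [("d1", "GW"), ("d2", "Image"), ("d3", "Wiki")],
    [("d4", "GW"), ("d5", "GW"), ("d6", "Video")],
    [("d7", "Wiki"), ("d8", "Image"), ("d9", "QA")],
    [("d10", "News")],
]

def vertical_count(bundle):
    """Number of distinct verticals represented in one bundle."""
    return len({vertical for _, vertical in bundle})

def share_by_vertical_count(bundles):
    """Map 'number of verticals' -> fraction of bundles with that number."""
    counts = Counter(vertical_count(b) for b in bundles)
    return {k: counts[k] / len(bundles) for k in sorted(counts)}

def vertical_distribution(bundles, n_verticals):
    """Share of each vertical among documents in bundles spanning n_verticals verticals."""
    docs = [v for b in bundles if vertical_count(b) == n_verticals for _, v in b]
    total = len(docs)
    return {v: c / total for v, c in Counter(docs).items()} if total else {}

print(share_by_vertical_count(bundles))
print(vertical_distribution(bundles, 3))
```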
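Findings: document roles
[Diagram: two example bundles, A (D1, D2, D3, D4) and B (D3, D4, D5, D6). Each bundle shows three documents labelled REL and one labelled NREL. Documents are marked as pivot documents or ornament documents.]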
Findings: document roles
Ornament type by pivot type
Ornament type   Pivot: GW   Pivot: Wiki
GW              -           24.6%
Image           23.5%       31.1%
Video           21.3%       18%
News            7.1%        1.6%
Social          <1%         6.6%
Blog            9%          11.5%
QA              17.4%       4.9%
Wiki            19.7%       -
Average document relevance per vertical type, by number of verticals in bundle
Vertical   2 verts   3 verts
GW         3.872     3.667
Image      3.208     3.352
Video      3.228     3.649
News       2.954     3.156
Social     2.667     2.200
Blog       3.593     3.402
QA         2.560     2.652
Wiki       3.553     3.584
Findings: characteristics
(3) Which bundle characteristics are most important to users?
[Figure: percentage of selections (0 to 40) per criterion. Criteria: Relevance, Diversity, Overall, Freshness, None, Cohesion. Bar values: 5%, 6%, 7%, 21%, 24%, 37%]
Pearson’s R
Criterion   All     Chosen
Relevance   0.332   0.496
Cohesion    0.228   0.432
Diversity   0.334   0.487
Freshness   0.208   0.213
Overall     0.453   0.454
            Relevance   Diversity   Cohesion   Freshness   Overall
Relevance   -           0.272       0.538      0.334       0.630
Diversity   0.272       -           0.144      0.485       0.478
Cohesion    0.538       0.144       -          0.250       0.548
Freshness   0.334       0.485       0.250      -           0.537
Overall     0.630       0.478       0.548      0.537       -
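For readers who want to reproduce this kind of table on their own data, the sketch below computes Pearson's r between every pair of criteria. It assumes one numeric rating per bundle per criterion; the slide does not say exactly which quantities were correlated, so the data layout, the 1 to 5 scale, and the sample ratings are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical per-bundle ratings on a 1-5 scale (invented for illustration).
ratings = {
    "Relevance": [5, 4, 2, 3, 5, 1],
    "Diversity": [3, 4, 2, 5, 4, 2],
    "Overall":   [5, 4, 2, 4, 5, 1],
}

criteria = list(ratings)
# Pairwise correlation matrix; symmetric, with 1.0 on the diagonal.
matrix = {a: {b: pearson_r(ratings[a], ratings[b]) for b in criteria} for a in criteria}

for a in criteria:
    print(a, [round(matrix[a][b], 3) for b in criteria])
```

The matrix is symmetric by construction, which is why the slide's criterion-by-criterion table mirrors itself across the diagonal.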
Our work: conclusions
1. Do users agree with each other on the subtopics they form bundles on?
• Some agreement between users
• Information objects could be focused on popular facets
2. How do users aggregate information to build bundles?
• Vertically diverse
• Relatively compact
• Pivots & ornaments
3. Which bundle characteristics are most important to users?
• Hard to assess ind.
• Relevance
• Cohesion / diversity
Take home
Searchers want more complex ways of interacting with SERPs, if done right.
Results composition can be used to generate more complex information objects.
Understanding searcher needs in such contexts is very important.
Thank you!
from Horațiu Bota, Ke Zhou, Joemon Jose
This work was partially funded by the LiMoSINe project.
www.limosine-project.eu
Interface code is on GitHub; check www.horatiubota.com for details.