BESDUI: Benchmark for End-User Structured Data User Interfaces


Persistent URI: http://w3id.org/BESDUI


Authors

Roberto García, GRIHO - HCI & Data Integration Research Group, Universitat de Lleida, Spain
Eirik Bakke, Computer Science and Artificial Intelligence Laboratory, MIT, USA
Rosa Gil, GRIHO - HCI & Data Integration Research Group, Universitat de Lleida, Spain
David R. Karger, Computer Science and Artificial Intelligence Laboratory, MIT, USA
Juan Manuel Gimeno, GRIHO - HCI & Data Integration Research Group, Universitat de Lleida, Spain

Motivation
• Inability to reach users traditionally alleged as one of the main barriers for Semantic Web uptake
• No killer app for the Semantic Web?

Desired outcome?
• Client applications should hide the complexities of semantic technologies
• For specific tasks, task-specific user interfaces better satisfy user needs without breaking user experience

Motivation
• Anyway, opportunity for Semantic Web user interfaces: datasets without a dedicated user interface
  • New or rarely used data collections
  • Combinations of existing datasets
• Provide users the power of Web-wide connected data to explore and discover unforeseen connections…
  • Semantic Web killer app?
• Current proposals:
  • Linked Data browsers, Controlled Natural Language query engines, faceted browsers,…
• Difficult to compare from the user perspective
  • What ways of exploring the data do they provide?
  • How efficient are they from a Quality in Use perspective?

Proposal
• Benchmark for comparing user interfaces
  • Set of typical user tasks
  • Procedure for measuring performance per task
  • Low cost and easy to apply, not requiring the intervention of real users
  • For UI tools based on semantic or relational data
• Longer term
  • Trigger a community discussion leading to a framework for comparing, measuring,…
  • …encourage better semantic search/exploration tools

User Tasks
• Criteria:
  • Avoid introducing bias from our a priori conception of the problem or experience developing our own tools
  • Looked outward to find sets of typical end-user tasks related to structured data exploration
  • Applicable both to relational and semantic data
• Somewhere to start:
  • Berlin SPARQL Benchmark (BSBM), Explore Use Case
  • Intended for measuring the computational performance but based on a set of realistic queries inspired by common information needs

User Tasks
1. BSBM-1 Find products for a given set of generic features
2. ADDED Find products for a given set of alternative features
3. BSBM-2 Retrieve basic information about a specific product for display purposes
4. BSBM-3 Find products having some specific features and not having one feature
5. BSBM-4 Find products matching two different sets of features
6. BSBM-5 Find products that are similar to a given product
7. BSBM-6 Find products having a label name that contains a specific string (some text)
8. BSBM-7 Retrieve in-depth information about a specific product including offers and reviews
9. BSBM-8 Give me recent reviews in English for a specific product
10. BSBM-9 Get information about a reviewer
11. COMBINED BSBM-10 Get offers for a given product which fulfill specific requirements; BSBM-11 Get all information about an offer
12. BSBM-12 Export the chosen offer into another information system which uses a different schema

User Tasks
• BESDUI includes for each Task, considering the sample dataset:
  • Information need:
    “List products of type sheeny with product features stroboscopes OR gadgeteers, and a productPropertyNumeric1 greater than 450”
  • Expected output:
    “aliter tiredest”, “auditoriums reducing pappies”, “boozed”, “byplay”, “closely jerries”

User Tasks
• Set of tasks is not closed, work in progress, contributions appreciated
• However, quite complete. References for evaluation:
• Information Seeking Strategies (Belkin et al., 1995)
  • All dimensions covered by the current tasks
  • Method of Interaction: Searching (known item) / Scanning (unknown)
  • Goal of Interaction: Learning / Selecting (for retrieval)
  • Mode of Retrieval: Recognition (by association) / Specification (identified items)
  • Resource Considered: Information / Meta-information

User Tasks
• Frameworks of Information Exploration - Towards the Evaluation of Exploration Systems (Nunes & Schwabe, 2016)
  • Work in progress… but complete for some operations and criteria
  • Boolean Expressivity
    • Conjunction values Same Relation and Different Relations
      Product feature “A” and feature “B”
      Product feature “A” and price “100”
    • Disjunction values Same Relation and Different Relations
      Product feature “A” or feature “B”
      Product feature “A” or price “100”
    • Negation
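As a rough illustration of the Boolean expressivity criteria listed above, the following Python sketch expresses each case as a filter over a handful of made-up product records. The records, the field names (`features`, `price`) and the `select` helper are assumptions for illustration only; they are not part of BESDUI or the BSBM dataset.

```python
# Hypothetical product records; labels, features and prices are illustrative only.
products = [
    {"label": "p1", "features": {"A", "B"}, "price": 100},
    {"label": "p2", "features": {"A"}, "price": 250},
    {"label": "p3", "features": {"B"}, "price": 100},
    {"label": "p4", "features": {"C"}, "price": 80},
]


def select(predicate):
    """Return the labels of all products for which the predicate holds."""
    return [p["label"] for p in products if predicate(p)]


# Conjunction, same relation: feature "A" and feature "B"
conj_same = select(lambda p: {"A", "B"} <= p["features"])
# Conjunction, different relations: feature "A" and price 100
conj_diff = select(lambda p: "A" in p["features"] and p["price"] == 100)
# Disjunction, same relation: feature "A" or feature "B"
disj_same = select(lambda p: bool(p["features"] & {"A", "B"}))
# Disjunction, different relations: feature "A" or price 100
disj_diff = select(lambda p: "A" in p["features"] or p["price"] == 100)
# Negation: feature "A" and not feature "B"
negation = select(lambda p: "A" in p["features"] and "B" not in p["features"])

print(conj_same, conj_diff, disj_same, disj_diff, negation)
```

A tool covers a criterion when its interface lets the user build the corresponding filter without writing a query by hand.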

Metrics

• Measure Quality in Use (ISO/IEC 25010:2011)

Metrics

BESDUI

Alpha Frontal Asymmetry related to Valence (Pleasure)

“Method for Improving EEG Based Emotion Recognition…” (López-Gil et al., 2016)

“Using SWET-QUM to Compare the Quality in Use of Semantic Web Exploration Tools” (González et al., 2013) http://rhizomik.net/swet-qum/


Metrics
• Effectiveness: degree to which users can achieve the tasks with precision and completeness
  • BESDUI Metric:
    Capability: Is performing the task possible with the given system? 0% No – 100% Yes (50% if task has 2 parts)
• Efficiency: degree to which users can achieve tasks investing appropriate amount of resources
  • BESDUI Metrics:
    Operation Count: How many basic steps (mouse clicks, keyboard entry, scrolling) must be performed to carry out the task?
    Time: How quickly can these steps be executed? Map operations to time using Keystroke Level Model (Card et al., 1980)
    Time Efficiency: capability / time, “goals per second” measure

KLM Operator                                             Time (secs.)
K: button press or keystroke                             0.2
P: pointing to a target on a display with a mouse        1.1
H: homing the hand(s) on the keyboard or other device    0.4
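A minimal sketch of how operator counts could be mapped to time with the KLM values in the table. The function name `klm_time` is hypothetical, and the spreadsheet provided with BESDUI remains the reference implementation. The counts used here are the ones reported for Rhizomer on Task 2 in the results.

```python
# KLM operator times in seconds, taken from the table above.
KLM_SECONDS = {"K": 0.2, "P": 1.1, "H": 0.4}


def klm_time(counts):
    """Total time in seconds for operator counts such as {"K": 10, "P": 8, "H": 3}."""
    return sum(KLM_SECONDS[op] * n for op, n in counts.items())


# Rhizomer, Task 2: 21 operations (10K, 8P, 3H).
rhizomer_task2 = {"K": 10, "P": 8, "H": 3}
operation_count = sum(rhizomer_task2.values())
seconds = klm_time(rhizomer_task2)

print(operation_count, round(seconds, 1))  # 21 12.0
```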

Applying BESDUI
1. Anyone, but preferably an experienced tool user, loads the dataset and performs the 12 Tasks
2. For each one, record whether the tool is capable of completing it. If so, detail all interaction steps required
3. Map interaction steps to task time (using provided spreadsheet)

Applying BESDUI
• Task 1: “Look for products of type sheeny with product features stroboscopes AND gadgeteers, and a productPropertyNumeric1 greater than 450”
• Tools
  • Rhizomer:
    • Capability: 0%, no support for conjunction of values of the same property
  • Virtuoso FCT (Faceted Browser):
    • Capability: 100%

Virtuoso FCT – Task 1

1. Type “sheeny” and “Enter”, then click “ProductType10”. (9K, 2P, 3H)
2. Click “Go” for “Start New Facet”, then click “Options”. (2K, 2P)
3. For “Inference Rule” click and select rules graph, then “Apply”. (2K, 2P)
4. Click “Attributes”, then “productFeature” and “stroboscopes”. (3K, 3P)
5. Click “Attributes”, then “productFeature” and “gadgeteers”. (3K, 3P)
6. Click “Attributes” and “productPropertyNumeric1”. (2K, 2P)
7. Click “Add condition: None” and select “>”. (2K, 2P)
8. Type “450” and click “Set Condition”. (5K, 2P, 2H)
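The per-step operator annotations for Virtuoso FCT on Task 1 can be totalled mechanically. The sketch below is one way to do it in Python, assuming each step is written as a string like `"9K, 2P, 3H"`; the `parse_step` helper is hypothetical and is not taken from the BESDUI repository.

```python
import re
from collections import Counter

# KLM operator times in seconds (K: keystroke, P: pointing, H: homing).
KLM_SECONDS = {"K": 0.2, "P": 1.1, "H": 0.4}

# Operators recorded for the eight Virtuoso FCT steps of Task 1.
STEPS = [
    "9K, 2P, 3H", "2K, 2P", "2K, 2P", "3K, 3P",
    "3K, 3P", "2K, 2P", "2K, 2P", "5K, 2P, 2H",
]


def parse_step(text):
    """Turn a string such as '9K, 2P, 3H' into per-operator counts."""
    return Counter({op: int(n) for n, op in re.findall(r"(\d+)([KPH])", text)})


# Add up the counts of all steps, then derive operation count and time.
total = sum((parse_step(step) for step in STEPS), Counter())
operation_count = sum(total.values())
seconds = sum(KLM_SECONDS[op] * n for op, n in total.items())

print(operation_count, round(seconds, 1))  # 51 27.4
```

The totals agree with the 51 operations (28K, 18P, 5H) and 27.4 seconds reported for this tool and task in the results.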

Applying BESDUI
• Task 2: “Look for products of type sheeny with product features stroboscopes OR gadgeteers, and a productPropertyNumeric1 greater than 450”
• Tools
  • Rhizomer:
    • Capability: 100%
  • Virtuoso FCT:
    • Capability: 100%

Rhizomer – Task 2

1. Click menu “ProductType” and then “Sheeny” submenu. (2K, 2P, 1H)
2. Click “Show values” for facet “Product Feature”. (1K, 1P)
3. Click facet value “stroboscopes”. (1K, 1P)
4. Type in input “Search Product Feature” “gad...” (4K, 1P, 1H)
5. Select “gadgeteers” from autocomplete. (1K, 1P, 1H)
6. Set left side of “Product Property Numeric1” slider to “450”. (1K, 2P)

Results

        Rhizomer                                             Virtuoso FCT
Task    Capability   Operation Count     Time (seconds)      Capability   Operation Count      Time (seconds)
1       0%           -                   -                   100%         51 (28K, 18P, 5H)    27.4
2       100%         21 (10K, 8P, 3H)    12.0                100%         53 (29K, 19P, 5H)    28.7
…       …            …                   …                   …            …                    …

* BESDUI provides a spreadsheet to compute these metrics

Results
• Currently, BESDUI applied to:
  • Rhizomer: a semantic data exploration tool with facets and pivoting
  • Virtuoso FCT: the faceted browser for the Virtuoso RDF data store
  • Sieuferd: a general-purpose user interface for relational databases
  • PepeSearch: a search interface for querying SPARQL endpoints

Results & Conclusions

• Sieuferd: the most capable but least performant, most complex user interface
• PepeSearch: the least capable but most performant, least complex user interface
• Rhizomer: best effectiveness/efficiency ratio, more “goals per second”

Averages per Tool   Capability   K (0.2s)   P (1.1s)   H (0.4s)   Operator Count   Time   Time Efficiency (Capability/Time)
Rhizomer            58%          15.9       10.9       2.6        29.3             16.1   3.60
Virtuoso FCT        54%          20.4       12.7       3.0        36.1             19.3   2.80
Sieuferd            96%          48.7       19.7       2.9        71.3             32.6   2.94
PepeSearch          25%          10.3       5.3        5.3        21.0             10.1   2.48
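The Time Efficiency column can be reproduced from the Capability and Time columns. The sketch below assumes capability is taken as a percentage and time in seconds, a reading that matches the reported figures; the `time_efficiency` function name is hypothetical.

```python
# (average capability in percent, average time in seconds) per tool,
# copied from the averages table.
AVERAGES = {
    "Rhizomer": (58, 16.1),
    "Virtuoso FCT": (54, 19.3),
    "Sieuferd": (96, 32.6),
    "PepeSearch": (25, 10.1),
}


def time_efficiency(capability_percent, seconds):
    """Capability divided by time: the 'goals per second' style measure."""
    return capability_percent / seconds


# Rank tools from most to least time-efficient.
ranking = sorted(
    AVERAGES, key=lambda tool: time_efficiency(*AVERAGES[tool]), reverse=True
)

for tool in ranking:
    print(f"{tool}: {time_efficiency(*AVERAGES[tool]):.2f}")
```

Under this reading Rhizomer ranks first, followed by Sieuferd, Virtuoso FCT and PepeSearch.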

Conclusions
• Importance of benchmarks to drive research in a domain
• Simple benchmark (too much?) but adoption key
• BSBM useful source of tasks and data
  • Synthetic nature results in funny product names like “waterskiing sharpness horseshoes”… but no significant impact (no real users)
• Measure UI without having to involve users
  • Less reliable but cheaper
  • Ideal during early dev stages or to compare tools

Future Work
• Continue tasks review and extend set of user tasks
• Consider additional tools:
  • Direct manipulation (Explorator, Tabulator,…)
  • Interactive Query Building (YASGUI, iSPARQL…)
  • Relational data (Cipher, BrioQuery,…)
  • …
• Improve metrics to consider users' mental effort
  • SPARQL command line best UI from a KLM point of view
  • Considering GOMS, includes cognitive and perceptual operators
• Compare results with tests with real users
• Available as GitHub repository: http://w3id.org/BESDUI
• Please, FORK and CONTRIBUTE!




Thank you for your attention
Questions?

[email protected]
http://rhizomik.net/~roberto/

BESDUI Persistent URI: http://w3id.org/BESDUI
