Measuring Mental Workload in IIR User Studies with fNIRS

Page 1: Measuring Mental Workload in IIR User Studies with fNIRS

Dr Max L. Wilson http://cs.nott.ac.uk/~pszmw

Measuring Mental Workload in IIR User Studies with fNIRS

Max L. Wilson University of Nottingham, UK

@gingdottwit

My Brain Team: Norah Alsuraykh, Horia Maior, Matthew Pike, Richard Ramchurn

Page 2: Measuring Mental Workload in IIR User Studies with fNIRS

Page 3: Measuring Mental Workload in IIR User Studies with fNIRS

With Great Power Comes

Great Complexity and Confusion

Page 4: Measuring Mental Workload in IIR User Studies with fNIRS

Increasing Cognitive Costs

Total Mental Capacity

Simple UI

Easy Task

Page 5: Measuring Mental Workload in IIR User Studies with fNIRS

Increasing Cognitive Costs

Total Mental Capacity

Simple UI

Hard Task

Page 6: Measuring Mental Workload in IIR User Studies with fNIRS

Increasing Cognitive Costs

Total Mental Capacity

Complex UI

Hard Task

Page 7: Measuring Mental Workload in IIR User Studies with fNIRS

Mental Workload

system takes place in working memory and consists of a wide variety of the mental activities. In relation to IR, it is interesting to observe how elements of cognition, such as rehearsal of information, planning the search strategy and deciding on the search keywords interconnect.

Multiple Resource Model. One model of mental workload that has been widely accepted in Human Factors is Wickens' Multiple Resource Model [20] (Figure 2). The elements of this model overlap with the needs and considerations of evaluating complex tasks (such as IR). He describes the aspects of human cognition and the multiple resource theory in four dimensions:

Figure 2: The 4-D multiple resource model [20]

• The STAGES dimension refers to the three main stages of the information processing system (Wickens, 2004 [21]).

• The MODALITIES dimension indicates that auditory and visual perception have different sources.

• The CODES dimension refers to the types of memory encodings, which can be spatial or verbal.

• The VISUAL PROCESSING dimension refers to a nested dimension within visual resources, distinguishing between focal vision (reading text) and ambient vision (orientation and movement).

Our aim is to understand how these elements link together and compose more complex components/tasks. Additionally, we want to consider how complex tasks (such as a search task) can be divided into primary components according to the models described. This will help identify possible problems in SUI design as well as indicating a possible solution to the problem (suggested implications by Wickens [21]):

• Minimize working memory load of the SUI system and consider working memory limits in instructions;

• Provide more visual echoes (cues) of different types during IR (verbal vs spatial);

• Exploit chunking (Miller, 1956 [14]) in various ways: physical size, meaningful size, superiority of letters over numbers, etc;

• Minimize confusability;

• Avoid unnecessary zeros in codes to be remembered;

• Encourage regular use of information to increase frequency and redundancy;

• Encourage verbalization or reproduction of information that needs to be reproduced in the future;

• Carefully design information to be remembered;

Resource vs Demands. One other model that is of interest is the limited resource model [22], describing the relationship between the demands of a task, the resources allocated to the task and the impact on performance.

Figure 3: Resources available vs task demands → impact on performance [22]

The graph from Figure 3 is used to represent the limited resource model. The x-axis represents the resources demanded by the primary task: as we move to the right along the axis, the resources demanded by the primary task increase. The axis on the left indicates the resources being used, but also the maximum available resources point (if we think of working memory that is limited in capacity). The right axis indicates the performance of the primary task (the dotted line on the graph). The key element of this model is the concept of a limited set of resources which, if exceeded, has a negative impact on performance. However, it does not distinguish between resource modality, therefore we propose to use both the limited and multiple resources models to inform our work.
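As a rough illustration of the limited resource model, the sketch below maps a task demand to the resources used and the resulting performance. The function name, the normalised capacity of 1.0, and in particular the shape of the decline (performance falling as capacity / demand) are assumptions made for illustration; the excerpt only states that exceeding the available resources harms performance.

```python
def limited_resource_model(demand, capacity=1.0):
    """Return (resources_used, performance) for a primary-task demand.

    Assumption (not from the source): performance is 1.0 while the
    demand fits within capacity, then falls as capacity / demand.
    """
    if demand < 0 or capacity <= 0:
        raise ValueError("demand must be >= 0 and capacity must be > 0")
    # Resources used can never exceed the maximum available resources.
    resources_used = min(demand, capacity)
    # Performance is unaffected until the demand exceeds the capacity.
    performance = 1.0 if demand <= capacity else capacity / demand
    return resources_used, performance


if __name__ == "__main__":
    for demand in (0.25, 0.5, 1.0, 2.0, 4.0):
        used, perf = limited_resource_model(demand)
        print(f"demand={demand:<4} used={used:<4} performance={perf}")
```

Any monotonically decreasing function could replace the capacity / demand term; the point is only that the used-resources curve plateaus at capacity while performance starts to drop.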

5. PATH 2: SUI EVALUATION

Relating quantitative data from brain sensing devices into feedback about SUI designs is one of our ultimate goals in conducting this research. SUIs are inherently information rich and thus affect both visual (results page layout) and verbal (text based results) memory. Detecting a change in either verbal or spatial working memory would help determine if a workload difference was caused by SUI design (spatial) or the amount of information the design provides (verbal). Our first in-progress study has stimulated each memory type in different tasks: verbal memory was tested by performing an n-back [13] number memory task, whereas spatial memory was tested using an n-back visual block matrix task. Other studies have also looked at each type of memory and confirmed the ability of fNIRS to detect changes in haemodynamic responses accordingly [9].

In addition to developing an understanding of the extent to which we can monitor different memory, our initial study also sought to measure the effect of artefacts on the fNIRS data. Controlling the environment and human derived sources of noise is a potentially difficult factor to control without affecting the ecological validity of a study.
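For readers unfamiliar with the n-back task mentioned above, the following minimal sketch marks which items in a stimulus sequence are targets, i.e. match the item shown n steps earlier. The function name and the example sequence are hypothetical; the stimuli could equally be numbers (verbal variant) or grid positions (spatial variant).

```python
def nback_targets(stimuli, n):
    """Return a list of booleans: True where stimuli[i] equals stimuli[i - n]."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # The first n items have nothing n steps back, so they are never targets.
    return [i >= n and stimuli[i] == stimuli[i - n] for i in range(len(stimuli))]


if __name__ == "__main__":
    sequence = [3, 5, 3, 5, 5]
    print(nback_targets(sequence, 1))
    print(nback_targets(sequence, 2))
```

Raising n increases how many items a participant must hold and update in working memory, which is why the task is commonly used to induce graded workload.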

Megaw, T. (2005) The definition and measurement of mental workload. Evaluation of human work, 525-551.

Page 8: Measuring Mental Workload in IIR User Studies with fNIRS

Most Important to My Lab

• We can run a ‘normal’ user study.

• As much ecological validity in
  - the environment they do the study in
  - natural user behaviour in the study
  - as normal/natural an IIR task as possible

• And tell whether there’s a cognitive difference between UIs

Page 9: Measuring Mental Workload in IIR User Studies with fNIRS

5 Challenges for NeuroIIR

• 1: Physical Constraints
  - it’s hard to minimise physical constraints

• 2: Designing Tasks
  - it’s hard to design tasks that extract cognitive differences

• 3: Confounding Variables
  - it’s hard to stop other artefacts affecting measurements

• 4: Protocol Compatibility
  - it’s hard to integrate neurophysiological sensors into ‘normal studies’

• 5: Data Analysis
  - chunking lots of continuous data into task differences is hard

Page 10: Measuring Mental Workload in IIR User Studies with fNIRS

functional Near Infra-Red Spectroscopy

Page 11: Measuring Mental Workload in IIR User Studies with fNIRS

4 Years of HCI Papers

• CHI2014 - Pike et al. Measuring the effect of Think Aloud Protocols on Workload using fNIRS.

• CHI2015 - Maior et al. Examining the Reliability of Using fNIRS in Realistic HCI Settings for Spatial and Verbal Tasks.

• CHI2016 - Lukanov et al. Using fNIRS in Usability Testing: Understanding the Effect of Web Form Layout on Mental Workload.

• TOCHI - Maior et al. Workload Alerts - Using Physiological Measures of Mental Workload to Provide Feedback during Tasks (under review)

Page 12: Measuring Mental Workload in IIR User Studies with fNIRS

Our Approaches

• fNIRS is actually very good for this.

• Research shows ‘most’ normal computer usage is fine
  - although large upper body movements show in our data

• Mostly limited by 1.5-2 m cables
  - Artinis now sell a portable fNIRS

Challenge 1: Physical Constraints

Page 13: Measuring Mental Workload in IIR User Studies with fNIRS

Our Approaches

• We do use a lot of n-back psych tests
  - but mostly for calibration

• We’ve managed to do ‘normal’ tasks

• For CHI2016, we had people fill in insurance claim forms
  - and saw cognitive differences

• For TOCHI, we had people doing Air Traffic Control tasks

Challenge 2: Designing Tasks

By Pedros.lol - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=39241201

Page 14: Measuring Mental Workload in IIR User Studies with fNIRS

Our Approaches

• We’re just learning about this.

• We’ve noticed that anxiety/stress is affecting results

• Norah is starting a PhD on these
  - we’ve started using Empatica in parallel with fNIRS

Challenge 3: Confounding Variables

Page 15: Measuring Mental Workload in IIR User Studies with fNIRS

Our Approaches

• We’ve recently noticed that anxiety/stress is affecting results

• Norah is starting a PhD on this confounding variable
  - we’ve started using Empatica in parallel with fNIRS

• Other confounding variables?

Challenge 3: Confounding Variables

Page 16: Measuring Mental Workload in IIR User Studies with fNIRS

Our Approaches

• Part of ‘normal user studies’ is using other protocols
  - like Think Aloud (see our CHI2014 paper)

• Because fNIRS is not very restrictive, and tolerant of artefacts
  - it’s essentially fine to design a study, and ‘add fNIRS’

• Just involves adding setup times
  - we use Rest, 1-back, and 3-back to calibrate

Challenge 4: Protocol Compatibility

Page 17: Measuring Mental Workload in IIR User Studies with fNIRS

Our Approaches

• We collect data with COBI (from fNIRS supplier) » CSV files
  - 16 channels, so we target sensitive regions for the task
  - We chunk those channels into task periods (& shift by BOLD)

• Using fNIRSoft
  - We use low-pass filtering etc. to remove noise
  - and Correlation Based Signal Improvement (CBSI) (comparison between HbO and Hb signals)
  - and produce some visualisations

• Then do multivariate ANOVA type statistics (channels x conditions)

Challenge 5: Data Analysis
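The steps on this slide can be sketched in plain Python as follows. This is an illustrative sketch, not the COBI or fNIRSoft implementation: the moving average is a crude stand-in for a proper low-pass filter, the CBSI formula follows the published description (corrected HbO = 0.5 × (HbO − α·Hb), with α the ratio of the two standard deviations), and the function names, the 2 Hz sample rate default, and the 5-second haemodynamic delay are all assumptions chosen for the example.

```python
from statistics import mean, pstdev


def moving_average(signal, window=5):
    """Crude low-pass filter: centred moving average (stand-in for a real filter)."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        smoothed.append(mean(signal[lo:hi]))
    return smoothed


def cbsi(hbo, hb):
    """Correlation Based Signal Improvement for one channel.

    Assumes true HbO and Hb are negatively correlated, so any positively
    correlated component is treated as noise and cancelled out.
    Raises ZeroDivisionError if hb is constant (zero standard deviation).
    """
    alpha = pstdev(hbo) / pstdev(hb)
    hbo_corrected = [0.5 * (x - alpha * y) for x, y in zip(hbo, hb)]
    hb_corrected = [-value / alpha for value in hbo_corrected]
    return hbo_corrected, hb_corrected


def chunk_means(signal, periods, rate_hz=2.0, delay_s=5.0):
    """Mean signal per task period, with each period shifted later by delay_s.

    periods maps a label to (start_seconds, end_seconds); delay_s is a
    hypothetical allowance for the slow haemodynamic response.
    """
    result = {}
    for label, (start_s, end_s) in periods.items():
        lo = int((start_s + delay_s) * rate_hz)
        hi = int((end_s + delay_s) * rate_hz)
        window = signal[lo:hi]
        result[label] = mean(window) if window else None
    return result


if __name__ == "__main__":
    hbo = [0.1, 0.4, 0.9, 1.2, 1.0, 0.6, 0.3, 0.2] * 5
    hb = [-0.1, -0.2, -0.5, -0.7, -0.4, -0.3, -0.2, 0.0] * 5
    clean_hbo, _ = cbsi(moving_average(hbo), moving_average(hb))
    print(chunk_means(clean_hbo, {"easy": (0, 5), "hard": (5, 10)}))
```

The per-period means for each channel would then feed the channels x conditions statistics described on the slide.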

Page 18: Measuring Mental Workload in IIR User Studies with fNIRS

Limitations of fNIRS

• Its temporal resolution is slower than others (2 Hz)
  - and affected by BOLD response

• It’s not good for measuring other locations/cognitive responses
  - although full-scalp fNIRS exist

• Doesn’t provide you imaging or location identification like MRI

Page 19: Measuring Mental Workload in IIR User Studies with fNIRS

Questions?