ORIGINAL ARTICLE
Between laboratory and simulator: a cognitive approach to evaluating cockpit interfaces by manipulating informatory context
Armin Eichinger • Johannes Kellerer
Received: 22 March 2013 / Accepted: 24 September 2013
© Springer-Verlag London 2013
Abstract An evaluation approach for correspondence-
driven domains is suggested and implemented. Touch
screen and trackball controls were evaluated as interaction
devices for large-area displays in the cockpits of highly
agile jet aircraft. To account for the context conditions of
selected use cases, informatory quality and the difficulty of
situational demands were analysed and manipulated
experimentally in dual-task scenarios, which were com-
pleted by experienced pilots. Results indicate a clear per-
formance advantage of touch compared to trackball
interaction, accompanied by less workload. Informatory
dimensions induce different performance and workload
ratings. Cognitive demands interfere the least with aiming
performance, followed by visual and motor. Task and
device influences are interdependent. Motor components of
an additional task interfere especially with trackball control
actions. Workload operates as a buffer. When difficulty
increases, performance decrements are lower than workload
increments. It is argued that this cognitive manipulation of
informatory context is advisable for correspondence-driven
domains, where context is expected to influence human–
machine interaction. Transfer to automotive display eval-
uation appears to be straightforward.
Keywords Jet aircraft · Interface evaluation · Informatory context · Panoramic display · Trackball · Touch screen · Correspondence-driven domain
1 Introduction
Evaluation of control devices is strongly related to the
type of work domain under study. In some domains,
quality aspects of human–machine interaction do not
depend on environmental influences; in other domains,
these influences are important determinants of a suc-
cessful interaction. Vicente (1990) classifies the former as
coherence-driven; the latter as correspondence-driven
work domains. Aircraft cockpit interfaces exchange action
and information between the pilot and a dynamic sur-
rounding. The field of aviation, therefore, qualifies as a
classical correspondence-driven domain.
1.1 Aviation as correspondence-driven domain
The context of flying puts strong information processing
constraints on pilot interface interaction. This is especially
true for tactical aircraft, where informatory load is created
by a broad range of highly demanding tasks that have to be
executed simultaneously by the pilot (Williges et al. 1989).
Wickens (1999) describes modern pilots as information
managers and time-sharing systems, spanning the whole
range of information processing dimensions: visual, audi-
tory, cognitive, verbal, manual/motor (Wickens and Car-
swell 2006). Informatory demands of pilot tasks are highly
complex and change dynamically (Clamann and Kaber
2004).
These external aspects of display interaction, induced by
the task of flying an agile jet aircraft, must not be disregarded.
In particular, for early-stage evaluations of cockpit
interaction concepts, these informatory characteristics of
contextual influences have to be taken into account and
mapped empirically, allowing their potentially differential
impact on types of control to be examined. Such an
approach attempts to implement what Alex Kirlik recently
recommended for the empirical work of cognitive engineers:
‘I will continue to recommend (…) that their work
should strive for a reasonable level of fidelity in representing
central aspects of the contexts to which they desire
their research to generalize’ (p. 219; Kirlik 2012).
A. Eichinger (✉)
Institute of Ergonomics, Technische Universität München,
Boltzmannstr. 15, 85747 Garching, Germany
e-mail: [email protected]
J. Kellerer
Department for Human Factors Engineering, Cassidian,
Rechliner Straße, 85077 Manching, Germany
Cogn Tech Work
DOI 10.1007/s10111-013-0270-y
In spite of dynamic surroundings, the pilot’s interaction
with the main instrument panel of modern jet aircraft takes
place under rather comfortable conditions: vibration spec-
trum is dominated by motion at frequencies below 1 Hz
(Griffin 1996); higher frequencies are controlled and minimized
by aircraft stabilizers (Kellerer 2011). After
detailed analyses of Eurofighter flight data, Kellerer (2011)
concludes that interaction performance with the multifunctional
head-down displays (MHDDs) does not
deteriorate under normal flight conditions. More extreme
accelerations usually do not require main panel control
input.
1.2 Technological point of departure: controls for a panoramic jet display
The current Eurofighter main instrument panel is dominated
by three individual multifunctional head-down displays
(MHDDs). The development of a holographic back-pro-
jection technique (Becker et al. 2008) allows for large-area
touch screen displays to be used as a single head-down
display. This large-area display will replace the MHDDs
with their corresponding soft keys as well as other display and
control units on the main instrument panel. Figure 1 illus-
trates the differences between the current and the intended
configuration, the latter of which more than doubles the
available multifunctional display area.
Possible control types are evaluated empirically in order
to explore the potential of this new display concept. The
current Eurofighter cursor control device has not been
optimized for interacting with a large-area screen. A
trackball has comparable interaction characteristics (Boff
and Lincoln 1983), as it is a high-performing representative
of the indirect control devices family. A similar trackball
control is used for display interaction in the Airbus A380.
To explore and understand the suitability of direct
manipulation (Hutchins et al. 1986), interaction by touch
was selected as the second type of control to be studied.
1.3 Filling the gap: deficits of current evaluation studies
In developing new display concepts, evaluation of types of
control has to take place early in the design process. It is
time-consuming and costly to test early interaction proto-
types by means of realistic flight simulation, however.
During this stage of development, it is therefore not fea-
sible to take advantage of highly elaborate simulation and
test environments. Usually, these constraints lead to a
parsimonious kind of assessment involving constricted
studies in a laboratory setting, without taking any contex-
tual influences into account, e.g. demands induced by other
tasks. Implicit in these one-shot analyses or comparisons is
the assumption that the interaction concept is insensitive
to these external aspects.
Bearing these dimensions in mind, the following profile
of requirements can be defined for an evaluation approach
applicable to the aviation setting described above.
• As the intended domain of application is correspon-
dence-driven, study participants have sufficient exper-
tise in the work domain.
• Main interaction tasks are representative of the
intended domain of application.
• The main informatory dimensions are identified, ana-
lysed and quantitatively assessed.
• Informatory load profiles are represented by additional
tasks.
• Relevant context attributes, like difficulty or attention
allocation policy, are considered.
There is a wide spectrum of studies evaluating various
control devices (cf. Stanton et al. 2013; Noyes and Starr
2007; Rogers et al. 2005; Jones and Parrish 1990; Alapetite
et al. 2013). Their methodological approaches differ with
regard to domain focus, practicability/expenditure, degree
of participants’ expertise, main tasks and context attributes
considered.
Fig. 1 Main instrument panel
of the Eurofighter Typhoon in
the current configuration (left)
and the intended area of the new
touch screen display (right;
Kellerer 2011; Eichinger 2011)
Stanton et al. (2013) evaluated input devices for aircraft
cockpits—a correspondence-driven domain. Participants
were not domain experts. Although simulator software was
used for stimulus presentation, the representative tasks
were not performed during flying conditions. The authors
emphasize that, for studies comparing different input
device types for aviation control, ‘it is important to account
for aspects of the context-of-use’ (p. 590). However, their
approach lacks an analysis and representation of these
context aspects. Similarly, in their study of cockpit input
devices, Noyes and Starr (2007) emphasize the multi-task
nature of aviation, which leads to different context and
workload demands, an issue that, they suggest, should be
addressed in future studies. Rogers et al. (2005) evaluated
input devices for desktop software—a coherence-driven
domain. Most interesting is their understanding of con-
textual influences; ‘context’ being taken into consideration
by using representative main tasks. Their research illus-
trates the differences in perspectives between correspon-
dence-driven and coherence-driven evaluation. In an early
study, Jones and Parrish (1990) compared different input
devices for a large-area cockpit display in a simulator
environment. Although a flight condition was set up,
comparisons with the nonflying condition revealed no
performance difference. There were no reasons given for
the selection of the scenario used. Neither a preceding task analysis
nor an identification or manipulation of context conditions
was reported. Alapetite et al. (2013) compared touch
and trackball interaction for correspondence-driven
domains in an experimental evaluation of a user interface
concept. They used a secondary task to keep participants
focused on a head-up display while blindly interacting with
a main task. No further details were given regarding the
context that might influence interaction.
To the best of our knowledge, no study meets the
requirement profile described above.
1.4 Putting it all together: a correspondence-driven empirical evaluation concept
We suggest an evaluation concept that fills the gap between
constrained one-shot laboratory and costly simulator experi-
ments by considering the requirements stated above. This new
approach attempts to enrich laboratory-based experiments by
analysing and explicitly considering the main contextual
influences. In accordance with this, our approach also com-
plies with a requirement defined by Hammond (1986): ‘when
an experiment is carried out, the question of generalization
from the laboratory to the actual conditions of interest outside
the laboratory must be answered by some form of defensible
logic’ (p. 431). Thereby, expenditures for simulator equipment
can be avoided without sacrificing the generalizability
of the results.
The suggested evaluation concept takes into account the
characterizing aspects introduced: the influences of various
informatory qualities tapped by the different tasks, by
multitasking and by the varying intensity of informatory
demand are addressed and mapped for evaluation in a
realistic fashion that corresponds to real settings. To
account for these influences, informatory context is first
analysed and then mapped empirically for an experimental
evaluation of the two control devices: trackball and touch
screen. This mapping is effected by additional tasks, whose
complexity is manipulated. Main interaction tasks are
representative of the domain of application. Study partici-
pants are experienced in their work domain.
Although this approach is tailored to fit the requirements
defined in an aviation setting, its structure is easily trans-
ferable to other correspondence-driven domains (Bennett
and Flach 2011), for example automotive engineering.
1.5 Research questions
The general purpose of the study is to suggest and imple-
ment an evaluation approach for correspondence-driven
domains, especially in highly agile aviation, which makes
the identification and experimental analysis of contextual
influences possible. General research questions, resulting
from the aforementioned approach, are as follows:
How do aspects of the context influence the measured
interaction criteria?—Main effects of informatory dimen-
sions or context complexity.
Is there a differential influence of the context attri-
butes?—Interaction effects of context attributes with con-
trol devices.
Is there a difference between the control devices?—
Main effects of control devices as they are studied in
classical usability studies.
The current study focuses specifically on the evaluative
comparison of performance and workload aspects of
trackball and touch. The manipulation of central context
attributes allows for the analysis of the following ques-
tions: Is there an informatory context influence on inter-
action performance and workload? Is there an
interdependency of informatory context and devices? Do
performance and workload measures react synchronously
to experimental conditions? What conclusions can be
drawn from the perspective of a cockpit designer with
regard to function allocation between devices?
2 Analysing informatory context
To cover a broad range of realistic situational demands,
three typical but divergent mission types were analysed: (1)
Combat Air Patrol, CAP, a tactical flying pattern to detect
incoming intruders; (2) Air-to-Surface, A/S, an attack
mission; (3) Route Management, RM, a use case dominated
by waypoint editing.
For these use cases, a profile of informatory load was to
be created along the five processing dimensions: visual
perception, auditory perception, central processing or
cognition, verbal response and manual motor response.
Assessing visual load did not explicitly distinguish stimuli
inside and outside the cockpit.
2.1 Procedure and participants
A three-step approach was taken to create these profiles.
Firstly, informatory load was identified for the main flying
tasks: aviation, navigation, communication, system man-
agement and tactics (cf. Wickens 2003). Secondly, the
relevance of the main flying tasks for the selected use cases
was assessed on a scale from 0 to 100 %. Finally, infor-
matory load scores were aggregated by using relevance
ratings as linear weights according to the following model:
$L_{dim} = \sum_{task} l_{dim,task} \cdot w_{task}$, with $L_{dim}$ as the load for
dimension $dim$, $l_{dim,task}$ as the load for dimension $dim$
during the main flying task $task$ and $w_{task}$ as the relevance
of this main flying task for the use case in consideration.
The assessments were made by eight test pilots from EADS
Military Air Systems in Manching, Germany. The pilots’
average age was 44; they had an average flight experience
with highly agile jet aircraft of 3,350 h.
2.2 Results
Figure 2 documents almost identical load profiles for the
three use cases. All profiles show peak loads for visual
perception, cognitive processing and motor response.
In spite of the small sample size, Bonferroni-corrected
comparisons using t tests document significant differences
between any of the three peaks and any of the two lesser
loads, except for comparisons comprising the motor com-
ponent. Average corresponding effect sizes of these peak-
low comparisons in terms of contrast correlation coeffi-
cients (Rosenthal et al. 2000) within the three use cases
were $r_{\text{effect size}} = .752$ (air-to-surface), $r_{\text{effect size}} = .631$
(CAP) and $r_{\text{effect size}} = .732$ (Route Management); they all
qualify as very large effects according to Cohen’s (1988)
classification.
According to these results, the study focus was placed
on visual, cognitive and motoric information processing
loads for further analyses.
2.3 Mapping informatory context
The general aspects of informatory context described
above put certain constraints on empirical mapping in the
form of experimental conditions. The information qualities
identified as contributing peak demands—visual, cognitive,
and motor—are reproduced by specific additional tasks,
which have to be accomplished while carrying out a rapid
aiming task for the main comparison of the two control
devices. For the primarily visual demand, a standardized
search task was used as described in ISO/DTS 14198
(2011). Cognitive processing demand was induced by a memory
search task according to the classic Sternberg paradigm (cf.
Sternberg 2004). A motor response task was designed along
the lines of widely used pegboard tests (cf. Strenge
et al. 2002). All additional tasks were selected or constructed
so that difficulty and resource demand could be
manipulated experimentally.
The dynamic complexity of tasks is represented by
presenting each additional task with two levels of difficulty.
Difficulty levels were calibrated in preliminary tests to
qualify as being of average or high difficulty; for ease of
communication, these levels were labelled ‘easy’ and
‘difficult’.
To meet realistic requirements with regard to control
demands, two different aiming tasks were constructed
with differing task emphases. Under the conditions of the
first task, single static targets have to be selected as
quickly as possible after they appear on screen. Under
the conditions of the second task, several target symbols
are presented. Static and moving targets appear together
within a set of static and moving distractor
symbols.
Fig. 2 Informatory load profiles for the three use cases. Error bars indicate ± one standard error
This setting fulfils the requirement of time-sharing by
presenting both tasks in a classic dual-task scenario. Sub-
jects were instructed to place equal emphasis on both tasks
to prevent any bias in performance.
3 Methods
3.1 Design
The main criteria for evaluating interaction concepts under
contextual influences have to take into account perfor-
mance aspects as well as the amount of processing
resources that have to be invested to achieve this level of
performance. For early phases of display development,
rapid aiming tasks suggest themselves as a commonly used
experimental paradigm, which has been in widespread use
since the pioneering experiments of Paul Fitts (1954).
There are two basic questions to be addressed in evaluating
the two forms of control interaction. Firstly, how do they
affect performance in rapid aiming tasks, and secondly,
how high are the corresponding costs in terms of subjective
workload?
As independent variables, the two types of control
devices were compared under three qualities of informatory
load, operationalized through three different
additional tasks, which were presented with two levels of
difficulty. The resulting 2 × 3 × 2 design was
implemented as a three-factor repeated measures
design.
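As a minimal sketch, the cells of this design can be enumerated in Python as follows. The factor labels follow the text. The helper names and the idea of drawing one seeded random order per participant are assumptions made for illustration, not a description of the software used in the study.

```python
import random
from itertools import product

DEVICES = ("trackball", "touch screen")
ADDITIONAL_TASKS = ("visual", "cognitive", "motor")
DIFFICULTIES = ("easy", "difficult")


def design_cells():
    """Return all device x additional task x difficulty combinations."""
    return list(product(DEVICES, ADDITIONAL_TASKS, DIFFICULTIES))


def randomized_order(seed):
    """Return the cells in a reproducible random order for one participant."""
    cells = design_cells()
    random.Random(seed).shuffle(cells)  # hypothetical per-participant seed
    return cells
```

In a repeated measures design every participant completes all twelve cells, so `randomized_order` changes only the sequence and never the set of conditions.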
Two different aiming tasks were constructed as descri-
bed above. As the first dependent variable, the speed of
task fulfilment was selected as a measure of aiming performance.
As the second dependent variable, NASA TLX ratings were recorded
after each experimental round to analyse potential buffer
effects of cognitive workload.
3.2 Participants
Experiments for evaluations were supported by eleven
male test pilots from EADS Military Air Systems and the
German Federal Armed Forces Technical Centre WTD 61
in Manching, Germany. The pilots’ average age was 41
(SD = 9.1); they had an average flight experience with
highly agile aircraft of 3,100 h.
3.3 Instruments
Experiments were conducted in a cockpit mock-up, which
resembled a real cockpit in all relevant geometric respects.
The large-area display was positioned head-down and
integrated into the main instrument panel. The trackball
was positioned to the left of the pilot seat, in the position of
a classic cursor control device that is integrated into the
throttle. Both aiming tasks were executed on the main
display. Figure 3 shows the positioning of instruments used
for aiming and additional tasks in the mock-up.
3.3.1 Aiming task
Aiming performance was assessed in two different tasks.
For both tasks, subjects were instructed to select targets as
fast and accurately as possible.
In the single target condition, single square targets
appeared at random positions on the display. The target
disappeared after it had been selected. Targets were
presented at intervals of 6 s. A new target appeared
regardless of whether the previous target had been selected. Time needed to
select was recorded as a measure of task performance.
In the multiple target condition, multiple square symbols
appeared randomly on the display. Three target squares were
presented in red, and ten distractor squares were presented in
blue. One of the targets and two of the distractor squares
moved on a linear trajectory. After a target or distractor was
selected, the symbol disappeared. Figure 4 illustrates four
steps of symbol presentation and selection for single and
multiple target tasks, respectively. Time for target selection
was recorded as a measure of task performance.
3.3.2 Visual task
To tap visual resources, a visual search task, proposed by
ISO as a reference task (ISO/DTS 14198 2011), was used.
Fig. 3 Instruments for aiming and additional tasks were positioned
within a cockpit mock-up. The head-up display was used for the
visual task. The motor task was positioned to the right of the
participants; the trackball to their left. The large-area display with
integrated touch was installed at the position of the main instrument
panel
Fig. 4 Single target task (1–4 above), multiple target task (1–4 below); distractors are indicated by solid squares as opposed to targets;
movement of targets and distractors is designated by a direction indicator
The task was presented on a 17″ liquid crystal monitor
with a 1,024 × 768 pixel resolution, positioned
where a head-up display would be placed. Participants
were required to identify one white circle as the target
stimulus among a set of 50 similarly shaped and coloured
distractor circles. All circles were randomly configured.
Subjects indicated the half of the screen containing the
target circle by pressing the left or right arrow on a key-
board positioned to their right. Subjects were able to switch
their selection until 2 s after the key press. After this
2-s window, a new random circle configuration was
presented. An example of an easy and a difficult display
configuration before and after selection is shown in Fig. 5.
3.3.3 Cognitive task
This task was presented using the Cognitive Tasks software
(DaimlerChrysler AG). Recorded numbers between one and nine
were played to participants at a rate of one number
per second. Three seconds after the sequence was completed,
a target number was presented. Subjects were
instructed to indicate verbally, as quickly and accurately as
possible, whether this number was part of the preceding
sequence of numbers by answering ‘yes’ or ‘no’. The next
sequence began 5 s after presenting the target number.
Figure 6 depicts the temporal structure used for both easy
and difficult conditions. The easy condition consisted of
five numbers, while the difficult condition consisted of
eight numbers.
3.3.4 Motor task
In order to manipulate motoric load, a pegboard task was used
as described in Kuhn (2005). A wooden stencil was placed on
top of a flat, flexible keyboard to record when the stylus was
inserted into the circular slots (cf. Fig. 7). In the easy condition,
slot diameter was 25 mm and stylus diameter was 23 mm;
stylus edge radius was 7 mm. In the difficult condition,
slot diameter was 8 mm and stylus diameter was 7 mm; stylus
edge radius was 0.5 mm. Subjects were instructed to insert
the stylus into the pegboard slots in clockwise order,
without visual feedback, as quickly and accurately as possible.
3.4 Procedure
All subjects completed every combination of experimental
conditions. All combinations of the independent variables
difficulty (low, high), device (trackball, touch screen) and
additional task (visual, cognitive, motor) were presented in a
completely randomized manner. The experiment was separated
into two sessions, one for each of the two aiming tasks. These
sessions took place on two different days, usually with a week
between them. The sequence of the two kinds of aiming task
was balanced across the subjects. Subjects were given ample
time to become acquainted with the setting and practice each
task before recordings started. After each experimental run, the
NASA TLX workload questionnaire was completed and
scored in the equally weighted fashion suggested by
Nygren (1991).
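One common reading of an equally weighted NASA TLX score is the unweighted mean of the six subscale ratings, often called the raw TLX. The Python sketch below assumes this reading and a 0 to 100 rating scale. Neither detail is stated in the text, and the function name and example ratings are hypothetical.

```python
TLX_SUBSCALES = (
    "mental demand",
    "physical demand",
    "temporal demand",
    "performance",
    "effort",
    "frustration",
)


def raw_tlx(ratings):
    """Return the unweighted mean of the six NASA TLX subscale ratings."""
    missing = [name for name in TLX_SUBSCALES if name not in ratings]
    if missing:
        raise ValueError(f"missing subscale ratings: {missing}")
    return sum(ratings[name] for name in TLX_SUBSCALES) / len(TLX_SUBSCALES)


# Hypothetical ratings for one experimental run (assumed 0-100 scale).
example_ratings = dict(zip(TLX_SUBSCALES, (10, 20, 30, 40, 50, 60)))
example_score = raw_tlx(example_ratings)
```

Equal weighting avoids the pairwise comparison step of the original weighting procedure, which shortens the questionnaire when it has to be completed after every run.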
Fig. 5 Easy (left) and difficult (middle) visual search task configurations before selection. In the rightmost image, the grey field shows the
selected screen half of a difficult configuration
Fig. 6 Easy (above) and
difficult (below) cognitive task
3.5 Data analysis
Data were analysed using three-way repeated measures
ANOVAs. As two kinds of aiming task (single target
and multiple target) and two dependent variables (aiming
performance and workload) were used, four ANOVAs
were computed.
Fig. 7 Easy (left) and difficult (right) motor task
Fig. 8 Mean performance (a–d) and corresponding mean workload (e–h) ratings for each experimental condition and for both aiming tasks, single and multiple targets; error bars indicate ± one standard error
The level of significance for any statistical analysis was
set to 5 %. Partial eta squared ($\eta_p^2$) was used as effect size measure for
ANOVA effect tests. For omnibus effect tests that included
more than one degree of freedom, focused contrast analyses
with one degree of freedom were conducted additionally.
As suggested by Rosenthal et al. (2000), $r_{\text{contrast}}$ was
used as effect size measure for focused contrast analyses.
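Partial eta squared can be recovered from an F statistic and its degrees of freedom through the standard identity $\eta_p^2 = F \cdot df_{effect} / (F \cdot df_{effect} + df_{error})$. The Python sketch below illustrates this with the device effect on single target performance reported in Table 1 (F = 133.47, df = 1, 10). The function name is an assumption, and the sketch is not the authors' analysis code.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Return partial eta squared computed from F and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Device effect on single target performance, values as reported in Table 1.
device_st = partial_eta_squared(133.47, 1, 10)
device_st_rounded = round(device_st, 2)  # .93, the value reported for this effect
```

The identity follows from $F = (SS_{effect}/df_{effect}) / (SS_{error}/df_{error})$ and $\eta_p^2 = SS_{effect} / (SS_{effect} + SS_{error})$.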
4 Results
Figure 8 provides a visual account of the measures taken
and relates performance to corresponding workload ratings.
Statistical analyses reveal significant main effects of
interaction device, difficulty and informatory dimension of
the additional task for both performance and workload ratings
under both aiming conditions. Touch interaction
leads to better aiming performance and lower workload rat-
ings than trackball interaction. Difficult additional tasks lead
to poorer performance and a higher workload. Different
additional tasks lead to differences in performance and
workload ratings. According to Cohen’s effect size classification,
all effects qualify as very large (Cohen 1988). Analyses
of interaction effects document that the device effect
depends on the task context for single and multiple
target performance, as well as for multiple target workload
ratings. These effects also qualify as very large. Table 1
summarizes all relevant ANOVA statistics with error prob-
ability and effect size estimates.
In order to take a closer look at effects with multiple
degrees of freedom, focused contrasts were analysed. For
both single and multiple target performance ratings, they
document significant differences between additional visual
and cognitive demands, as well as between cognitive and
motor demands. For workload ratings under the single and
multiple target conditions, there were significant differences
between visual and motor, as well as between cognitive and
motor demands. Significant interaction effects of devices
and tasks were broken down to focused interaction contrasts.
They all show the same result: There is a significant task
dependency of the difference between touch and trackball
when an additional motor demand is present. The difference
between motor and visual or cognitive task increases when
the trackball is used instead of touch. There is no significant
interaction for visual compared to cognitive demand with
respect to device-induced differences in performance or
workload. Table 2 summarizes all relevant contrast statistics
with error probability and effect size estimates. All signifi-
cant contrasts show very large effects.
5 Discussion
At the heart of this cognitive approach to evaluation lies
the manipulation of informatory context. Touch is faster
than trackball interaction for any additional task and
achieves this with a lower workload for most conditions.
There are, however, differences in informatory dimension
with regard to aiming performance and workload. From a
performance perspective, additional cognitive load leads to
better results than visual or motor demand. From a work-
load perspective, additional motor load leads to higher
ratings than visual or cognitive demand. The two measures
taken complement each other: performance measurement
helps to identify cognitive as the least detrimental, and
workload ratings help to identify motor as the most detri-
mental informatory condition.
There is a clear advantage of touch interaction regarding
task performance. This difference is not a result of
increased effort, the main determinant of cognitive
workload (Kahneman 1973), since workload ratings were
lower for touch interaction. Better aiming performance is
achieved with a lower workload. This finding holds
true for single and multiple target aiming, as well as for
both levels of additional task difficulty.
There is a differentiating influence of informatory context
on device effects. As Fig. 8 illustrates, aiming performance
with the trackball is clearly slower when combined with a
motor task than when combined with a visual or cognitive task. This
suggests an interference of motor demands for both tasks,
which is not present with touch interaction.
Table 1 Summary of resulting statistics for all four ANOVAs; significant effects (p < .05) are marked with an asterisk
                           | ST performance       | MT performance       | ST workload          | MT workload
Effect               df    | F       p       ηp²  | F       p       ηp²  | F       p       ηp²  | F       p       ηp²
Device               1,10  | 133.47  <.001*  .93  | 135.28  <.001*  .93  | 22.16   .001*   .69  | 8.36    .016*   .46
Task                 2,20  | 19.83   <.001*  .55  | 24.28   <.001*  .71  | 27.77   <.001*  .73  | 12.94   <.001*  .56
Difficulty           1,10  | 12.27   .001*   .67  | 12.45   .005*   .56  | 74.88   <.001*  .88  | 40.48   <.001*  .80
Dev. × task          2,20  | 20.59   <.001*  .67  | 7.92    .003*   .44  | 0.63    .544    .06  | 5.34    .014*   .35
Dev. × diff.         1,10  | 0.08    .783    .01  | 0.04    .783    .04  | 0.45    .520    .04  | 2.30    .160    .19
Task × diff.         2,20  | 1.82    .187    .15  | 2.01    .160    .17  | 0.46    .638    .04  | 1.00    .384    .09
Dev. × task × diff.  2,20  | 0.60    .559    .06  | 0.67    .522    .06  | 1.37    .276    .12  | .784    .470    .07
ST single targets, MT multiple targets
Figure 8 indicates a difference in workload compared to
performance measures, when the difficulty of the additional
task is increased. The effect on workload is much more
pronounced than the effect on performance. For single tar-
gets, the effect of difficulty on workload compared to per-
formance is higher by 31 % in effect size; for multiple
targets, this increase amounts to 43 %. Workload seems to
have a buffer function concerning performance that con-
forms to an energetic notion of workload (cf. Kahneman
1973; Sanders 1983). By exerting more effort, a sufficient
level of performance is achieved. As workload is a bounded
resource, performance can only be maintained up to certain
levels. In potential overload situations, there might not be
any resources available. So a workload increase beyond
certain limits, which might go unnoticed by performance
indicators, could lead to a complete performance breakdown.
This observation alone justifies the inclusion of
workload ratings in this type of evaluation. Relying exclu-
sively on performance indicators would reveal only part of
the mechanisms underlying the interaction achievement.
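The percentages quoted for the difficulty effect appear to correspond to the relative difference between the partial eta squared values for workload and performance in Table 1. The Python sketch below works through that arithmetic. Reading the percentages this way is an assumption, and the function name is hypothetical.

```python
def relative_increase(workload_effect, performance_effect):
    """Return how much larger the workload effect is, relative to performance."""
    return (workload_effect - performance_effect) / performance_effect


# Partial eta squared for difficulty, as reported in Table 1.
single_target = relative_increase(0.88, 0.67)    # ST workload vs. ST performance
multiple_target = relative_increase(0.80, 0.56)  # MT workload vs. MT performance

single_target_pct = round(100 * single_target)      # 31
multiple_target_pct = round(100 * multiple_target)  # 43
```

Under this reading, the two results reproduce the 31 % and 43 % figures given for single and multiple targets.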
The aim of this study was not to choose between the
two interaction devices, but to analyse their strengths and
weaknesses under realistic informatory circumstances.
Conforming to the HOTAS principle—hands on throttle
and stick—certain functions of display interaction have to
be accomplished using a remote cursor control device
(Smith 1999). The suggested approach enriches the clas-
sical procedure of evaluating control devices by analysing
and empirically mapping informatory context. This study
highlights possible interferences with additional tasks with
motor demands as one main weakness of remote cursor
control devices, represented by trackball. This reduction in
aiming performance is accompanied by an increase in
workload, especially in highly demanding situations that
might lead to highly detrimental operating conditions or
even performance breakdowns. It is up to regulation
authorities and cockpit designers to take these influences
and interferences into account when analysing tasks to be
accomplished by display interaction and allocating func-
tions to different interaction devices.
Although our evaluation approach led to a broad spectrum
of interesting results and has thus broadened the scope of
the classical approach, there is certainly still room for
improvement. We took single-peak dimensions of
informatory context and mapped them by using isolated
additional tasks. To account for the complex nature of the
tasks to be accomplished in cockpit interaction, one complex
task could be devised that jointly maps the three peak
dimensions of visual, cognitive and motor demands. This task
should be manipulated in difficulty, as were the single tasks
used for this study.
To be able to study the effects of strategic allocation of
attentional resources, as might be expected with highly
skilled jet pilots (Wickens and McCarley 2008), an explicit
task focus could be considered. Subjects might be
instructed to allocate different portions of their resources
to the main task and to the additional task.
Piloting jet aircraft was introduced as a correspondence-driven
working domain, where contextual characteristics
influence display interaction performance. Variable scenarios
induce volatile requirements in task difficulty and, in
turn, in pilot workload. The very same characteristics hold
true for other realms, for example the automotive domain:
a highly agile vehicle is navigated through multifaceted
scenarios requiring various informatory profiles of
attentional resources. As in aviation, workload is an important
construct in evaluating driver–display interaction
performance; this is well documented by a vast body of research
in automotive human factors engineering.
Our approach, as described and implemented for aviation,
can therefore be adopted by the automotive domain
with few, if any, modifications: (1) representative scenarios
are to be identified, e.g. urban and highway traffic; (2) main
driving tasks are to be defined, e.g. in accordance with
popular three-level models of driving, as formulated by
Michon (1985), separating strategic, tactical and control
tasks; (3) informatory load profiles are to be determined
using the weighting scheme described above; (4) load
profiles are represented by additional tasks focusing on
single-demand dimensions, e.g. using the methods described
in this study; (5) relevant task characteristics are
manipulated, e.g. difficulty or resource allocation; (6)
appropriate performance measures are selected for assessing
interaction quality, e.g. speed or error rate; (7) effects
of display and/or interaction devices, e.g. comparing jog
dial and touch, in combination with context characteristics
are analysed by appropriate factorial designs with respect
to performance and workload measures.
Table 2 Results of a contrast analysis of device × task interactions; significant contrasts are set in boldface
Contrast                   df    ST performance          MT performance          ST workload             MT workload
                                 F      p      r_cont.   F      p      r_cont.   F      p      r_cont.   F      p      r_cont.
Vis vs. cog                1,10  22.81  <.001  .83       22.81  <.001  .83       1.23   .294   .33       3.28   .100   .50
Vis vs. mot                1,10  2.20   .168   .43       2.21   .167   .43       32.67  <.001  .88       10.71  .008   .72
Cog vs. mot                1,10  17.98  .002   .80       56.18  <.001  .92       52.52  <.001  .92       18.21  .002   .80
Vis vs. cog, TB vs. touch  1,10  4.40   .073   .55       0.41   .537   .20       0.09   .771   .09       0.16   .699   .12
Vis vs. mot, TB vs. touch  1,10  13.79  .004   .76       7.53   .021   .66       1.66   .227   .38       5.52   .041   .59
Cog vs. mot, TB vs. touch  1,10  44.23  <.001  .90       12.91  .005   .75       0.69   .426   .25       10.45  .009   .71
ST single targets, MT multiple targets; Vis, Cog, Mot visual, cognitive, motor task; TB trackball; Touch touch screen
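The contrast effect sizes reported in Table 2 appear to be consistent with the common conversion r = sqrt(F / (F + df_error)) for contrasts with one numerator degree of freedom (cf. Rosenthal et al. 2000). The text does not state the formula, so this is an assumption. The sketch below applies it to the F values of the "Cog vs. mot" row; the function name is hypothetical.

```python
import math


def r_contrast(f_value: float, df_error: int) -> float:
    """Effect size r for a contrast with one numerator degree of freedom.

    Assumed conversion: r = sqrt(F / (F + df_error)).
    """
    if f_value < 0 or df_error <= 0:
        raise ValueError("f_value must be >= 0 and df_error must be > 0")
    return math.sqrt(f_value / (f_value + df_error))


# F values copied from the "Cog vs. mot" row of Table 2 (df = 1,10).
cog_vs_mot = [
    ("ST performance", 17.98),
    ("MT performance", 56.18),
    ("ST workload", 52.52),
    ("MT workload", 18.21),
]

for measure, f_value in cog_vs_mot:
    print(f"{measure}: r = {r_contrast(f_value, 10):.2f}")
```

For this row the computed values round to .80, .92, .92 and .80, matching the table. A few other cells differ by .01, presumably because the printed F values are themselves rounded.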
This cognitive approach to context-sensitive evaluation
can take place in a very early phase of development of
display or interaction devices for any correspondence-driven
domain sharing the basic characteristics and general
steps described above. It can and should replace evaluation
efforts that restrict themselves to a one-shot study design,
ignoring contextual influences for lack of a feasible or
affordable simulation environment.
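Step (7) of the procedure above calls for factorial designs that cross interaction devices with context characteristics. The following is a minimal sketch of how such a design could be enumerated, using the device and task dimensions of this study. The difficulty labels, the function names and the simple per-participant randomisation are assumptions for illustration, not the counterbalancing scheme actually used.

```python
import itertools
import random

DEVICES = ("trackball", "touch")
TASKS = ("visual", "cognitive", "motor")
DIFFICULTIES = ("low", "high")  # hypothetical labels for two difficulty levels


def design_cells():
    """Return every device x task x difficulty combination of the design."""
    return list(itertools.product(DEVICES, TASKS, DIFFICULTIES))


def participant_order(participant_id: int):
    """Return a reproducible random order of all cells for one participant.

    Seeding with the participant id keeps the order stable across runs.
    Plain shuffling is an assumption; a Latin square could be used instead.
    """
    rng = random.Random(participant_id)
    cells = design_cells()
    rng.shuffle(cells)
    return cells


if __name__ == "__main__":
    for device, task, difficulty in participant_order(1):
        print(device, task, difficulty)
```

For an automotive study, the same structure would apply with the device tuple replaced, e.g. by jog dial and touch.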
References
Alapetite A, Fogh R, Ozkil AG, Andersen HB (2013) A deported
view concept for touch interaction. In: Proceedings of
ACHI’2013, international conference on advances in
computer–human interactions, pp 22–27
Becker S, Neujahr H, Sandl P, Babst U (2008) Holographisches
display—HOLDIS. In: Grandt M, Bauch A (eds) Beiträge der
Ergonomie zur Mensch-System-Integration. DGLR, Bonn
Bennett KB, Flach JM (2011) Display and interface design: subtle
science, exact art. CRC Press, Boca Raton
Boff K, Lincoln J (1983) Engineering data compendium: human
perception and performance. Wiley, New York
Clamann M, Kaber D (2004) Applicability of usability evaluation
techniques to aviation systems. Int J Aviation Psychol
14(4):395–420
Cohen J (1988) Statistical power analysis for the behavioral sciences.
Erlbaum, Hillsdale
Eichinger A (2011) Bewertung von Benutzerschnittstellen für
Cockpits hochagiler Flugzeuge. Südwestdeutscher Verlag für
Hochschulschriften, Saarbrücken
Griffin MJ (1996) Handbook of human vibration. Elsevier,
Amsterdam
Hammond K (1986) Generalization in operational contexts: What
does it mean? Can it be done? IEEE Trans Syst Man Cybern
16(3):428–433
Hutchins EL, Hollan JD, Norman DA (1986) Direct manipulation
interfaces. In: Norman DA, Draper SW (eds) User centered
system design: new perspectives on human–computer
interaction. Lawrence Erlbaum Associates, Hillsdale, pp 87–124
ISO/DTS 14198 (2011) Road vehicles—ergonomic aspects of
transport information and control systems—calibration tasks for
methods which assess demand due to the use of in-vehicle
systems. Revised Draft Version
Jones DR, Parrish RV (1990) Simulator comparison of thumball,
thumb switch and touch screen input concepts for interaction
with large screen cockpit display format. NASA Technical
Memorandum no 1025
Kahneman D (1973) Attention and Effort. Prentice-Hall, Englewood
Cliffs
Kellerer J (2011) Panoramic displays: Untersuchung zur Auswahl von
Eingabeelementen für Großflächendisplays in Flugzeugcockpits.
Südwestdeutscher Verlag für Hochschulschriften, Saarbrücken
Kirlik A (2012) Relevance versus generalization in cognitive
engineering. Cogn Technol Work 14:213–220
Kuhn F (2005) Methode zur Bewertung der Fahrerablenkung durch
Fahrerinformations-Systeme. World Usability Day
Michon JA (1985) A critical view of driver behavior models: what do
we know, what should we do? In: Evans L, Schwing RC (eds)
Human behavior and traffic safety. Plenum Press, New York,
pp 485–524
Noyes JM, Starr AF (2007) A comparison of speech input and touch
screen for executing checklists in an avionics application. Int J
Aviation Psychol 17(3):299–315. doi:10.1080/10508410701462761
Nygren TE (1991) Psychometric properties of subjective workload
measurement techniques: implications for their use in the
measurements of perceived mental workload. Hum Factors
33:17–33
Rogers WA, Fisk AD, McLaughlin AC, Pak R (2005) Touch a screen
or turn a knob: choosing the best device for the job. Hum Factors
47:271–288
Rosenthal R, Rosnow R, Rubin DB (2000) Contrasts and effect sizes
in behavioral research: a correlational approach. Cambridge
University Press, Cambridge
Sanders AF (1983) Towards a model of stress and performance. Acta
Psychol 53(1):61–97
Smith C (1999) Design of the Eurofighter human machine interface.
Air Space Eur 1(3):54–59
Stanton NA, Harvey C, Plant KL, Bolton L (2013) The performance
of computer input devices in a vibration environment.
Ergonomics 56(4):590–611. doi:10.1080/00140139.2012.751458
Sternberg S (2004) Memory-scanning: Mental processes revealed by
reaction-time experiments. In: Balota DA, Marsh EJ (eds)
Cognitive Psychology: Key Readings. Psychology Press, New
York, pp 48–74
Strenge H, Niederberger U, Seelhorst U (2002) Correlation between
tests of attention and performance on grooved and purdue
pegboards in normal subjects. Percept Mot Skills 95(2):507–514
Vicente KJ (1990) Coherence- and correspondence-driven work
domains: implications for system design. Behav Inf Technol
9:493–502
Wickens C (1999) Aerospace psychology. In: Damos D (ed) Human
performance and ergonomics. Academic Press, San Diego,
pp 195–242
Wickens C (2003) Pilot actions and tasks: selection, execution, and
control. In: Tsang P, Vidulich M (eds) Principles and practice of
aviation psychology. Lawrence Erlbaum Associates, Mahwah,
pp 239–265
Wickens C, Carswell C (2006) Information processing. In: Salvendy
G (ed) Handbook of human factors and ergonomics. Wiley,
Hoboken, pp 111–149
Wickens C, McCarley J (2008) Applied attention theory. CRC Press,
Boca Raton
Williges R, Williges B, Fainter R (1989) Software interfaces for
aviation systems. In: Wiener E, Nagel D (eds) Human factors in
aviation. Academic Press, San Diego, pp 463–493