
Advanced Engineering Informatics 24 (2010) 320–328

Contents lists available at ScienceDirect

Advanced Engineering Informatics

journal homepage: www.elsevier.com/locate/aei

Human cognition in manual assembly: Theories and applications

Sonja Stork*, Anna Schubö
Ludwig Maximilian University Munich, Department Psychology, Leopoldstrasse 13, 80802 Munich, Germany

Article info

Article history:
Received 14 May 2010
Received in revised form 25 May 2010
Accepted 26 May 2010
Available online 26 June 2010

Keywords:
Human cognition
Information processing
Visual attention
Mental workload
Task complexity
Worker assistance

1474-0346/$ - see front matter © 2010 Elsevier Ltd. All rights reserved.
doi:10.1016/j.aei.2010.05.010

* Corresponding author.
E-mail address: [email protected] (S. Stork).

Abstract

Human cognition in production environments is analyzed with respect to various findings and theories in cognitive psychology. This theoretical overview describes effects of task complexity and attentional demands on both mental workload and task performance as well as presents experimental data on these topics. A review of two studies investigating the benefit of augmented reality and spatial cueing in an assembly task is given. Results demonstrate an improvement in task performance with attentional guidance while using contact analog highlighting. Improvements were obvious in reduced performance times and eye fixations as well as in increased velocity and acceleration of reaching and grasping movements. These results have various implications for the development of an assistive system. Future directions in this line of applied research are suggested. The introduced methodology illustrates how the analysis of human information processes and psychological experiments can contribute to the evaluation of engineering applications.

© 2010 Elsevier Ltd. All rights reserved.

1. Introduction

The investigation of human cognition in applied scenarios is of growing relevance for the development of applications in daily life as well as in working environments. Humans have to interact with computers and machines in various contexts. Accordingly, research on human–machine interaction (HMI) is a challenge and necessitates the understanding of human information processing and action planning in specific situations. Engineering applications which aim to maximize the usability of task-specific user interfaces need to be evaluated according to objective criteria. A fruitful methodology for such an evaluation thus consists of formulating concrete hypotheses regarding task performance, derived from basic research on human cognitive processes, and testing them in controlled experiments designed for that purpose. It is essential that engineers are not only aware of various usability methods, but are able to quickly determine which method is best suited to their relevant development task within a software project [1,2]. In comparison to field studies, the advantage of a theory-based experimental approach is the possibility of controlling all important influencing factors.

The present account applies this methodology to the development of an assistive system for manual assembly. Firstly, the necessity of an assistive system for manual assembly tasks is illustrated, followed by a description of how psychological research has been integrated into applied research in production environments in recent years. So far, no extensive research on cognitive processes in production environments has been conducted. Systems of predetermined times (e.g. methods-time measurement, MTM) have helped to integrate data on motor executions, which depend on physical task parameters, but more complex cognitive tasks have been neglected.

In working environments, economical and ergonomical aspects must be considered, because performance strongly depends on the quality of both information and guidance. Therefore, assistive systems must be adapted to the needs of the user and the constraints of the task. In manual assembly tasks workers are confronted with multiple sources of information. Relevant information has to be selected, and actions have to be planned and executed appropriately. Moreover, due to a growing demand for flexible and customized production, interfaces designed to optimally support workers in manufacturing become increasingly relevant [3]. In contrast to rather repetitive tasks in mass production, in flexible production the relevant assembly parts and respective actions vary rapidly. Humans are quite good at dealing flexibly with different tasks in comparison to fully automated production systems and industrial robots. Nevertheless, as human information processing capacities are limited, a fluent working process and error avoidance depend on appropriate worker assistance. An assistive system for manual assembly should support the attentional allocation to relevant task aspects as well as action execution. For this purpose, information has to be presented near the positions where it is needed and at the exact moment in time when it is needed. From the technical perspective this is only possible with a situational awareness system equipped with multiple sensors. Nevertheless, a precondition for the adequate utilization of situational data within an assistive system is the knowledge and understanding of those cognitive processes which should be supported.

In recent decades, there has been an increasing interest in cognitive processes in applied scenarios. One starting point of such a line of research can be found in the work of Ulric Neisser, who criticized in 'Cognition and Reality' (1976) the lack of ecological validity in psychological research [4]. In his opinion, in favor of controlled experiments, simplified stimulus presentation and participants isolated from the natural environment, psychology had lost its relationship to the real world. Accordingly, he suggested getting closer to real-world scenarios in order to produce more relevant results. Neisser investigated information processing from sensory input to action execution, including the basic processes of perception, memory and problem solving, and interplays thereof. He coined the term 'Cognitive Psychology' and stressed the computational manner of dynamic information processing. Further, he was interested in dual- and multi-task performance, which are relevant for daily activities as well as for manual assembly tasks.

One account which is well in line with the claim for more real-world connections in research is James Jerome Gibson's 'ecological psychology' and theory of direct perception, which were inspired by Gestalt theory. A key term in this account is 'affordances', defined as possibilities for action which are directly perceived as clues in the environment. In this sense perception drives action in an immediate manner because certain object properties afford certain actions, e.g. a chair affords sitting [5]. Hence, information processing selects action-relevant properties according to the perceiver's intentions. Affordance theory has been applied to the fields of HMI and ergonomics as it grounds methods for the design of interfaces (e.g. buttons and knobs). Naturally, these same methods should be considered for the development of an assistive system in manual assembly and for information presentation.

As the perception- and action-related processes during manual assembly overlap in space and time, it is essential to find the optimal point in time for information presentation. Donald Norman, author of 'The Design of Future Things' [6], has compared the requirements of assistive systems in HMI with determinants of successful dialogues in human–human communication. In contrast to two monologues, collaboration is defined by the synchronization of activities. Further, successful dialogues require a shared knowledge and understanding of environment and context [7]. Technologies have to be adapted to the way people actually behave. 'Smart environments' and 'ambient intelligence' are keywords in this domain. Norman summarizes his concept with the claim 'Augmentation, not automation', that is, the environment should be augmented with relevant information and supporting devices.

In our view, this means that information presentation and guidance should not lead to pacing as in the film 'Modern Times', where Charlie Chaplin impressively demonstrated the effects of externally controlled working processes on a factory assembly line. In contrast, support and guidance should be presented when needed, but the worker should also be able to quickly dismiss suggestions and continue the work differently. Additionally, the mental workload and the experience of the worker must be considered.

The engineer and psychologist Christopher D. Wickens proposed in his multiple-resource theory that mental resources can be divided into four dimensions: perceptual processing (visual–auditory), processing codes (spatial–verbal), processing stages (perception–central processing–responding) and response modalities (manual–verbal) [8]. Each of these resources has specific capacity limitations. An analysis of relevant task aspects and resources along such lines as Wickens proposes is fruitful for the prediction of task difficulties, resulting multiple-task performance and mental workload. The multiple-resource theory can be applied to the manual assembly task, where different spatial areas have to be monitored (e.g. instructions, parts for commissioning, work piece, etc.) and different manual actions have to be performed with both hands at the same time, in order to discover bottlenecks. In Wickens's theory, task-specific processing bottlenecks can lead to problems in working situations if certain resources have to be shared.
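As a rough illustration of this kind of bottleneck analysis, the sketch below (an assumption of ours, not an implementation from the paper) represents each subtask as a set of demanded resources and flags a potential bottleneck whenever two concurrent subtasks share one; the resource labels are invented for illustration.

```python
# Hypothetical sketch: subtasks modeled as sets of demanded resources.
# Wickens's four dimensions (perceptual processing, processing codes,
# processing stages, response modalities) would each contribute such
# labels in a fuller model.

def shared_resources(task_a: frozenset, task_b: frozenset) -> set:
    """Resources demanded by both subtasks; a non-empty result signals
    a potential bottleneck, i.e. the responses must be serialized."""
    return set(task_a & task_b)

grasp_part = frozenset({"visual", "spatial", "manual-left"})
read_list = frozenset({"visual", "verbal"})
monitor_warning = frozenset({"auditory"})

print(shared_resources(grasp_part, read_list))        # both need the visual channel
print(shared_resources(grasp_part, monitor_warning))  # no shared resource
```

Grasping a part while reading the part list competes for the visual channel and must be serialized, whereas an auditory warning signal can be attended to in parallel, matching the example given for the joining operation below.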

The monitoring of task-relevant information as well as action execution can be supported by attentional guidance. Wickens has intensively discussed the relevance of attention theory in applied scenarios [9]. Important for appropriate guidance is that the cues for attentional allocation are noticeable but not disruptive, and that they lead to noise exclusion, that is, the ability to ignore distracting elements.

Consequently, by delivering cognitive support such as attentional guidance, an assistive system can lead to improved performance [10]. There exist several parameters for the analysis and evaluation of human task performance. Parasuraman collected, in the book 'Neuroergonomics: The Brain at Work' [11], methods for the investigation of cognitive processes in real-world scenarios in order to deliver guidelines and constraints for task design and information presentation. Besides several brain imaging methods, which are more relevant for offline analysis of information processing, methods which are easily applicable to online monitoring in applied working scenarios are also described. The tracking of eye and hand movements gives insights into relevant sub-processes during task execution. Furthermore, combining different methods enhances the understanding of the interplay of perception and action processes.

To return to the beginning: the described accounts deliver tools for the investigation of cognitive processes in manual assembly and for the development of assistive systems. It must also be understood that conducting controlled experiments in real-world assembly scenarios has the power to produce ecologically valid and relevant results. Parameters need to be found for the prediction of task performance in various specific working situations. It is assumed, then, that the variation of information presentation regarding content, location and point in time, as well as the variation of task design, determine task complexity and resulting performance.

The following sections give an overview of relevant theories and results in cognitive psychology concerning information processing, visual attention, motor control and task complexity. Subsequently, several parameters for the evaluation of task performance are described. Theories were applied in the development of an assistive system and benefits of attentional guidance were tested in a working scenario. Two experiments are reviewed which deal directly with the above issues.

2. Theoretical background

2.1. Information processing and mental resources

Information processing during manual assembly involves the whole spectrum of cognitive functions from perception, attention and memory to action planning and execution [12]. The information processing framework can be used to divide the complete assembly cycle into relevant processing stages. First, the assembly task itself can be divided into a commissioning task and a joining task. Both subtasks include the cognitive functions from perception to action execution, which are assumed to be partially sequential. Perception involves stimulus preprocessing, feature extraction and stimulus identification. In the commissioning phase, a part on the part list has to be localized, part features have to be analyzed (e.g. small and metal), and the type (e.g. 5 mm screw) as well as the number of relevant parts for a work piece have to be identified. After localizing the relevant part in a box, the grasping action has to be prepared (e.g. precision grip with the left hand) and executed. In the joining phase, the assembly instruction has to be localized, and part positions and orientations have to be identified and memorized. The appropriate joining operation (e.g. screwing) has to be chosen and executed. The ongoing motor action can be adjusted and corrected online in case of unforeseen events (e.g. if the selected box for stored assembly parts is empty or the screwing pressure is insufficient) [13].
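This decomposition of the assembly cycle can be written down as a simple data structure; the following is an illustrative sketch of ours, with stage names paraphrased from the text rather than taken from any implementation in the paper.

```python
# Illustrative sketch: the two subtasks of one assembly cycle and their
# (partially sequential) processing stages, paraphrased from the text.
ASSEMBLY_CYCLE = {
    "commissioning": [
        "localize part on part list",
        "analyze part features",
        "identify type and number of parts",
        "localize part in box",
        "prepare grasping action",
        "execute grasping action",
    ],
    "joining": [
        "localize assembly instruction",
        "identify and memorize part positions and orientations",
        "choose joining operation",
        "execute joining operation",
        "adjust or correct motor action online",
    ],
}

def stage_count(subtask: str) -> int:
    """Number of processing stages in one subtask."""
    return len(ASSEMBLY_CYCLE[subtask])
```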

Fig. 1 shows information processing in the joining and commissioning tasks. The relevant processing resources according to the resource theory of Wickens are shown. Obviously, the interplay of certain subtasks can lead to interference because of the capacity limitations of certain resources. In the case that two tasks need the same resource (e.g. the left hand, or simultaneous attention at two distinct locations), performance declines because the responses can only be executed sequentially. On the other hand, the schema of Wickens also demonstrates which processes can be performed simultaneously and without problems. For example, during the joining operation it is possible to attend to warning signals from the factory environment.

2.2. Selective attention and visual search

As an important proportion of the commissioning and joining task depends on perceptual processing of instructive information, selective visual attention is a resource of outstanding relevance for the assembly process. Unfortunately, the human visual system is limited with respect to the number of elements which can be processed at the same time. Visual attention has to be directed to the task aspects and instruction details relevant for the ongoing task in order to select the content for further processing. Search strategies can facilitate attentional allocation via bottom-up or top-down control mechanisms [15]. Bottom-up or stimulus-driven strategies enable one to directly attend to elements with specific outstanding physical properties (e.g. a big screw among small nails). Additionally, salient element features can lead to efficient search and enhance performance, because they pop out from other distracting elements. Yet, many parts during manual assembly are very similar (e.g. screws of different sizes) and not easy to discriminate. Without physical differences, elements are scanned inefficiently [16]. In case of overlapping features (e.g. several small metal screws), search times increase with the number of distracting elements. Accordingly, the difficulty of visual search increases with increasing similarity between target and distractors [17]. Therefore, search times need to be reduced by enabling a pop-out effect for targets with no naturally salient features.

Fig. 1. Processing stages and resources in manual assembly tasks separately for the commissioning and joining phase. (Adapted from [14,3].)
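The relation between search time and the number of distracting elements can be sketched with the standard linear set-size model from the visual search literature: search time grows with the number of items, and the slope separates efficient "pop-out" search from inefficient serial scanning. The intercept and slope values below are invented for illustration, not taken from the paper.

```python
# Standard set-size model of visual search (illustrative parameters):
# search time = intercept + slope * number_of_items. A near-zero slope
# indicates efficient pop-out search; a steep slope indicates inefficient
# serial scanning of similar items.
def search_time_ms(set_size: int, slope_ms_per_item: float,
                   intercept_ms: float = 400.0) -> float:
    return intercept_ms + slope_ms_per_item * set_size

pop_out = search_time_ms(20, slope_ms_per_item=2.0)   # salient target
serial = search_time_ms(20, slope_ms_per_item=40.0)   # target similar to distractors
```

With twenty items on the workbench, a salient target barely costs extra search time, while a target that overlaps in features with its distractors scales steeply with set size, which is exactly why artificially induced pop-out (e.g. highlighting) is attractive.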

The term 'findability' has been defined by Peter Morville in the context of web design [18]. It describes the ability of users to identify an appropriate website and navigate the pages of the site to discover and retrieve relevant information resources. This concept can also be applied to instruction presentation during manual assembly. In this context, the findability of relevant information should be improved in such a way that no prior knowledge is necessary.

Different methods for directing visual attention to specific spatial locations can be distinguished [19,20]. One method uses a bottom-up mechanism to capture attention by presenting salient spatial cues at the relevant position. This so-called exogenous, stimulus-driven or peripheral cueing leads to automatic, reflexive and involuntary shifts of attention. Another method uses a symbolic cue (e.g. an arrow) in order to indicate the spatial position. This endogenous or central cueing needs to be interpreted by top-down mechanisms and the respective attentional shift has to be initiated voluntarily. Peripheral salient cues induce faster attentional shifts than central cues because the latter need additional time for the interpretation of the symbol. Salient attributes like movement, size, orientation, color and transient luminance changes or onsets can serve as peripheral exogenous cues [16].

2.3. Task complexity

The complexity of a task is determined by various factors and is therefore difficult to define. Task complexity as a construct is hard to measure with a single variable, but various task performance measures are used to evaluate different tasks with respect to workload and other resources needed.

So far, most approaches to the evaluation and computation of task complexity in manual assembly take only the physical attributes of the assembly objects into account. For manual assembly, there are several objective task parameters which describe task difficulty, like the distance and size of the to-be-grasped object or the number of assembly steps. These task parameters are used by standard systems of predetermined times for the calculation of assembly times. Nevertheless, other task demands cannot be measured and evaluated as easily because they result from the interplay of various subtasks, which may influence each other. For the prediction of performance during the complete assembly task, it is important to understand the cognitive processes and respective bottlenecks involved in manual assembly.

The multiple-resource theory of Wickens can be applied to the assembly task in order to predict task complexity. During manual assembly, different spatial areas have to be monitored (e.g. instruction area, parts area, work piece, etc.) and different manual actions have to be performed concurrently with both hands. An analysis of those task aspects that might interfere with respect to a shared resource enables the prediction of multiple-task performance and mental workload [21]. Regarding resource management, the design of instruction presentation and worker support aims at reducing task complexity on the information processing side.

One important determinant of task complexity is the sequence of steps in a manual assembly task. Possible assembly sequences and plans have to be determined. There are several approaches to validating the assemblability of parts and subassemblies with the aim of generating assembly plans automatically [22]. Also, it should be noted that the order of subassemblies within a task contributes to task complexity. For example, differences are expected if certain task aspects are repeated or switch during the assembly sequence. Moreover, the production program and the resulting similarity of products contribute to task complexity with respect to the worker's knowledge and experience [3].

For the evaluation of factors which contribute to task complexity, the impact of several factors can be investigated in controlled experiments: the complexity of the assembly, the complexity of the instruction given, the time and order in which instructions are provided, and the number of parts that must be assembled.
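One way to make such an experimental variation concrete is to enumerate a full factorial design over these factors. The sketch below is hypothetical; the factor levels are invented for illustration and do not describe the designs of the reviewed experiments.

```python
from itertools import product

# Hypothetical full factorial design over the four factors named in the
# text; the levels are invented for illustration.
factors = {
    "assembly_complexity": ["low", "high"],
    "instruction_complexity": ["low", "high"],
    "instruction_timing": ["blocked", "stepwise"],
    "part_count": [5, 15],
}

# One dict per experimental condition, crossing all factor levels.
conditions = [dict(zip(factors, levels)) for levels in product(*factors.values())]
# 2 x 2 x 2 x 2 = 16 conditions
```

Enumerating the crossed conditions this way makes the cost of adding a factor explicit: each new two-level factor doubles the number of cells that must be filled with participants or trials.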

3. Performance parameters

Several parameters enable the analysis of human performance. The combination of different measurement methods and dependent variables provides insights into the interplay of cognitive processes in manual assembly.

3.1. Task performance

Task performance can be evaluated via observation and behavioral measurement. Completion times, the number of sub-steps needed, the dwell time on each instruction and assembly step, as well as error rates enable descriptions of task performance. The analysis of videos recorded during manual assembly provides hints of possible sources of error.

3.2. Movement parameters

Extensive research has been concerned with body and limb movements during HMI. Several models predict characteristics of human motor behavior depending on certain task parameters. For example, Fitts's law predicts that the time required to rapidly move to a target area varies as a function of the distance to the target and its size [23]. On the basis of this model, speed–accuracy tradeoffs can be predicted. Movement parameters can be used to objectively evaluate human task performance. Total movement time, speed–accuracy tradeoff, latency of movement onset and peak velocity can provide useful information on task complexity and task difficulty. The tracking data enable the analysis of movement trajectories, the segmentation of action sequences into meaningful units and the definition of key sequences. Unnecessary movements (e.g. caused by moving unnecessary distances) as well as ill-directed grasping movements can be identified. Additionally, the interaction of workers in joint assembly tasks can be analyzed.
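Fitts's law as cited above is commonly written as MT = a + b * log2(2D/W), where D is the distance to the target and W its width. The following sketch is a minimal illustration; the constants a and b are hypothetical and would in practice be fitted to empirical movement data.

```python
import math

# Classic form of Fitts's law: movement time grows with the index of
# difficulty log2(2D/W). The constants a and b below are hypothetical
# placeholders, not fitted values from the reviewed experiments.
def fitts_movement_time(distance_mm: float, width_mm: float,
                        a: float = 0.1, b: float = 0.15) -> float:
    """Predicted movement time in seconds."""
    index_of_difficulty = math.log2(2 * distance_mm / width_mm)
    return a + b * index_of_difficulty

# A far, small target takes longer to reach than a near, wide one.
t_far_small = fitts_movement_time(400, 20)   # ID = log2(40)
t_near_wide = fitts_movement_time(100, 50)   # ID = log2(4) = 2
```

This is the sense in which object distance and size serve as objective task parameters: given fitted constants, the model predicts movement time and thereby the speed–accuracy tradeoff for each reach on the workbench.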

3.3. Eye movements

Eye tracking allows the analysis of interaction patterns while operating human–machine interfaces. In production environments, non-optimal aspects and weaknesses of information presentation that contribute to performance errors can be identified. The analysis of oculomotor data helps to identify such areas of interest. Accordingly, eye movement data enable inferences about the mental workload of users. In more detail, gaze dwell times and gaze frequencies on the instruction area, the working area or the storage boxes may be taken as indicators of strategies and respective task difficulties.

Saccades are not executed until after covert attentional shifts towards the intended goal location [24]. Therefore, the pattern of gaze shifts may be taken as a predictor of the operator's current focus of attention. Especially for the measurement of visual search performance, eye tracking is of outstanding relevance. Eye movement parameters like fixation count and fixation duration provide information on the intensity of information processing of any single object in view [25]. As such, these eye movement parameters can show whether difficulties occur during information processing [26]. For example, a high fixation count indicates inefficient visual search, as the eye has to return to the object several times for overall localization [27]. Furthermore, eye movement trajectories helpfully visualize the search path and can be analyzed in order to investigate search strategies as well as the influence of distracting items. Previous experiments have, in fact, demonstrated that spatial cues can accelerate attentional shifts as well as eye movements [28].
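As a minimal illustration of the two parameters just named, the sketch below (an assumption of ours, not code from the reviewed studies) computes fixation count and mean fixation duration from an already-segmented list of fixations; the event detection itself (e.g. a velocity-threshold algorithm) is assumed to have run upstream in the eye tracker software.

```python
# Hypothetical sketch: deriving fixation count and mean fixation duration
# from fixation events that were already segmented by the eye tracker.
def fixation_stats(fixations):
    """fixations: list of (onset_ms, offset_ms) tuples for one trial.
    Returns (fixation count, mean fixation duration in ms)."""
    durations = [offset - onset for onset, offset in fixations]
    count = len(durations)
    mean_duration = sum(durations) / count if count else 0.0
    return count, mean_duration

# Three fixations of 250, 220 and 300 ms:
count, mean_dur = fixation_stats([(0, 250), (300, 520), (600, 900)])
```

Comparing such per-trial statistics across presentation modes is one straightforward way to operationalize "inefficient search": more and shorter fixations on the parts area suggest repeated relocalization of candidate parts.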

It is possible to investigate how salient events may attract the attention of a worker and whether they distract the worker from the assembly task. Eye movements sometimes indicate the observer's expectancies concerning future events. By analyzing eye movement behavior, one may draw conclusions about the observer's next steps and intentions. Finally, eye movement data can serve as an input device for human–machine interaction. For example, displayed items may be selected via a simple gaze.

4. Application

The investigation of cognitive processes and their limitations is a key issue for the development of adaptive support [29–31]. Nevertheless, so far, knowledge about information processing in working environments and about the constraints of tasks in such environments is limited. Controlled experiments in applied scenarios have to be conducted, following the demand for ecologically valid results. Experimental task manipulation and systematic quantitative analysis of performance parameters should be the starting point for any assistive system.

4.1. Support via optimized instructions

Assembly instructions can be presented via paper manuals, manuals on a monitor, or different forms of augmented reality (AR), on a continuum of increasing support by the computer system [32]. Computer generated information can be modified dynamically, in contrast to static textbook information. Transitions of information presentation are possible, with contents appearing and disappearing over time [33]. AR applications combine real and virtual objects within a real environment [34]. Therefore, AR enables an overlay of the working environment with additional information at the exact spatial position where it is needed. Assembly instructions can show where the next part can be found and at which position it has to be joined. For example, the aerospace company Boeing facilitates the wiring of cables in the airplane by projections of the cable path [35]. In general, AR technology can reduce eye and head movements, accelerate attentional shifts and support spatial cognition [36]. Comparisons between different presentation modes show an advantage of AR techniques for total assembly times and step times for complex tasks like wiring, but not for repetitive tasks. AR benefits were mainly observed in the part-selection phase and when parts have to be positioned, but not during the manual execution itself [32]. On computer monitors and with AR systems it is easy to present endogenous or exogenous cues. By using a simple LCD projector it is possible to present cues in various contexts [13,37].

Fig. 2. Experimental setup: highlighting of boxes with task-relevant parts and projection of assembly instructions on the working area.

4.2. A working scenario

On the basis of spatial cueing experiments in cognitive psychology, an improvement of task performance by attentional guidance was hypothesized. The main question was which parameters of task execution were modified by the different modes of instruction presentation. Two experiments were conducted in a working scenario consisting of the task to select and grasp assembly parts (Experiments 1 and 2) and to build an object (Experiment 1) according to detailed instructions. The setup of both experiments was similar except for a few details (cf. Figs. 2 and 3).

Fig. 3. Different presentation modes: contact analog highlighting of part positions (left), projection on working area (middle) and monitor presentation (right). The work space was the same for all presentation modes [38].

The setup consisted of a standard workbench equipped with a projector which was placed above the participant. A front-surface mirror was mounted at a 45° angle in front of the projector, enabling the projection of instructions directly onto the working area. Foot pedals were arranged at an easy-to-reach position in order to enable switching to the next task step without disturbing the reaching movements in the manual assembly process. The foot pedal presses could also be employed to segment the data into meaningful step units. In the present setup, the left eye's gaze position was recorded via an EyeLink 1000 system in remote mode. The eye tracker was placed below the working area. Eye movements of the left eye were registered while, additionally, a marker above it was tracked. For the tracking of hand movements a Polhemus system was used, which generates a magnetic field and delivers marker positions with respect to the magnetic transmitter position. The 6-degrees-of-freedom motion tracking device operated at 60 Hz. Each of the four sensors can be attached to the fingers, hand or any other body part to record spatial positions (in coordinates X, Y, Z) and orientation (azimuth, elevation and roll).
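As a rough illustration of how such a recording pipeline might be organized, the sketch below splits a continuous 60 Hz tracking stream into step units at the foot-pedal timestamps. This is a minimal Python sketch under stated assumptions (the sample layout and function names are hypothetical; the original recording software is not described here):

```python
from dataclasses import dataclass
from typing import List

SAMPLE_RATE_HZ = 60  # update rate of the motion tracker used in the setup

@dataclass
class Sample:
    """One 6-DOF tracker sample (layout assumed for illustration)."""
    t: float                                        # time in seconds
    x: float; y: float; z: float                    # position (cm)
    azimuth: float; elevation: float; roll: float   # orientation (degrees)

def segment_by_pedal(samples: List[Sample], pedal_times: List[float]) -> List[List[Sample]]:
    """Split a continuous tracking stream into task-step units.

    Each foot-pedal press marks the transition to the next assembly step,
    so the samples between consecutive presses form one step unit.
    """
    steps = []
    bounds = [float("-inf")] + sorted(pedal_times) + [float("inf")]
    for lo, hi in zip(bounds, bounds[1:]):
        step = [s for s in samples if lo <= s.t < hi]
        if step:
            steps.append(step)
    return steps
```

For example, a stream with one pedal press at t = 2.5 s would be cut into two step units, one before and one after the press.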

4.3. Presentation modes

Three different presentation modes were compared which differed in the location and the content of instruction presentation (cf. Fig. 3). Instructions were presented on the monitor or via a projection on the working area. In all presentation modes instructions consisted of the following: a list showing the number of relevant parts in one work step (cf. Fig. 2, left), pictures of the parts (cf. Fig. 2, middle), and a picture showing the goal state of the to-be-assembled object (cf. Fig. 2, right).

In the monitor and projection conditions, a schematic picture of the part positions was depicted (cf. Fig. 2, top left and Fig. 3, middle). This picture served as an endogenous cue, which had to be interpreted in a top-down manner. In the monitor condition, participants had to switch attention between the information presentation displayed on the computer screen and the working area (cf. Fig. 3, right). In the projection condition, cues for the relevant item position were presented closer to the real part positions (cf. Fig. 2, middle). Accordingly, part positions were supposed to be found more easily and effortlessly. In the contact analog condition, all schematic part positions were of the same color (cf. Fig. 3, left and Fig. 2, middle). The relevant part locations (i.e. the boxes) were highlighted directly by a contact analog projection of white light. The respective luminance change or onset provided a salient exogenous peripheral cue.

The highlighting of part positions was supposed to facilitate attentional allocation, resulting in improved search performance. We expected the shortest completion times with contact analog highlighting. Also, the movement onset, velocity and acceleration should benefit from finding the relevant assembly part faster. The enhancement of attentional allocation with contact analog highlighting should be demonstrated by a reduced number of eye fixations for the localization of relevant parts. In sum, different modes of instruction presentation were expected to modulate task complexity, resulting in performance differences. Additionally, objective parameters of task complexity were varied by using different part numbers in the commissioning and joining phases as well as by using different kinds of assembly types.

Table 1. Overview of investigated parameters in Experiments 1 and 2.

Experiment   | Performance                              | Movement parameters                                              | Eye movements
Experiment 1 | Commissioning + joining: completion time | Movement onset, point-of-grasp, peak velocity, peak acceleration | –
Experiment 2 | Commissioning: completion time           | –                                                                | Fixation counts

5. Review and discussion of results

In two experimental studies, three presentation modes for attentional guidance were compared in order to evaluate facilitation effects of attentional shifts to relevant part positions [37,38]. Several performance parameters were expected to benefit from fast attentional allocation. Table 1 shows which parameters were investigated in the two studies.

In Experiment 1, the performance in the commissioning and joining phases was investigated and overall completion times were calculated. Moreover, the movement onset, time of the estimated point-of-grasp, peak velocity and peak acceleration were analyzed. In Experiment 2, completion times for commissioning as well as the number of eye fixations in the commissioning phase were measured.

5.1. Performance times

Overall completion times increased with the number of relevant assembly parts, because more elements had to be selected, grasped and assembled. In order to estimate the time needed for the completion of single parts within a task step, overall completion times were divided by the number of relevant parts. The completion time for commissioning (Experiment 2) was shorter with contact analog highlighting than in the projection condition, and it was shorter in the projection condition than in the monitor condition. All differences reached statistical significance (cf. Fig. 4c and Table 2).

Fig. 4. Main results of two experiments: mean values and standard errors for (a) movement onset and estimated time to grasp, (b) peak acceleration and velocity of the first grasping movement, (c) commissioning time: time to find, pick and place one assembly part, (d) number of fixations necessary to find one assembly part.

Table 2. Main effects and results of statistical analyses.

Performance parameter                     | Effect                      | F-value/t-value | p-value             | Refs.
Movement onset                            | Contact analog < monitor    | t(18) = 1.85    | p < 0.05 one-tailed | Stork et al. [37]
Point-of-grasp                            | Contact analog < monitor    | t(18) = 1.877   | p < 0.05 one-tailed |
Peak velocity                             | Contact analog > monitor    | t(18) = 3.397   | p < 0.01 one-tailed |
Peak acceleration                         | Contact analog > monitor    | t(18) = 2.617   | p < 0.01 one-tailed |
Commissioning time (find, pick and place) | Contact analog < projection | F(1,23) = 6.27  | p < 0.05            | Stork et al. [38]
                                          | Contact analog < monitor    | F(1,23) = 23.97 | p < 0.001           |
Fixation count                            | Contact analog < projection | F(1,23) = 12.90 | p < 0.01            |

Further analysis revealed that mean search and grasp times for single assembly parts were only significantly reduced in the contact analog condition when one or two parts had to be commissioned, but not when four or six parts had to be selected. However, in the task, participants were instructed to grasp objects in a specific order, which increased the necessity to scan the boxes serially. It is possible that without such a constraint, benefits for the contact analog condition may also have been present for larger part numbers. An analysis of total completion times, including the commissioning as well as the joining phase (Experiment 1), revealed a benefit for contact analog highlighting only for the more complex assembly operations (i.e. where three spatial dimensions had to be taken into account). Here, the assembly times per part varied with the presentation modes: mean assembly times for each part were about 10 s faster with contact analog highlighting and projection on the working area compared to pure monitor presentation [13]. Further, the time to assemble an object increased linearly with the item number [39].

5.2. Movement onset, velocity and acceleration

The tracking data enabled the segmentation of the whole assembly task into assembly steps, the commissioning and joining phases and several submovements (Experiment 1). Fig. 5 shows the movement data of one assembly step. The commissioning phase consists of repeated movements of the right and left hand to the box area, followed by the joining phase, where the hands stay in the working area and perform several smaller movements.

Fig. 5. Y coordinates of the left and right hand during one assembly step (middle), segmentation according to hand position (bottom), pictures of the respective reaching movements (top). (Adapted from [37].)

In the commissioning phase, several movement parameters of the first reaching and grasping movement were compared for the three presentation modes. The movement onset was defined as the point in time when the hand started to move from the working area to the parts area, i.e. in the direction of the boxes. Onset was calculated as the first point in time when the velocity exceeded 25 cm/s. The minimum velocity between entering and leaving the parts area was used to calculate the time of the point-of-grasp.

Mean movement onset and latency of the estimated point-of-grasp were shorter in the contact analog presentation mode in comparison to monitor presentation (cf. Fig. 4a and Table 2). Additionally, the peak velocity and acceleration for the first movement to the boxes were computed. Peak velocity and acceleration were higher with contact analog presentation in comparison to monitor presentation (cf. Fig. 4b and Table 2). Also, the projection of schematic boxes led to movements with higher velocity and acceleration than in the monitor condition.
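The onset and peak criteria described above can be sketched in a few lines. The following is an illustrative reconstruction in Python under stated assumptions (function names are hypothetical, and a clean 1-D position trace in cm sampled at 60 Hz is assumed); it is not the authors' analysis code:

```python
def movement_parameters(pos_cm, rate_hz=60.0, onset_thresh=25.0):
    """Estimate movement onset, peak velocity and peak acceleration
    from a 1-D position trace (cm) sampled at rate_hz.

    Velocity and acceleration are obtained by finite differences;
    onset is the first sample whose speed exceeds onset_thresh (cm/s),
    matching the 25 cm/s criterion described in the text.
    """
    dt = 1.0 / rate_hz
    vel = [(b - a) / dt for a, b in zip(pos_cm, pos_cm[1:])]
    acc = [(b - a) / dt for a, b in zip(vel, vel[1:])]
    speed = [abs(v) for v in vel]
    onset_idx = next((i for i, s in enumerate(speed) if s > onset_thresh), None)
    return {
        "onset_time_s": None if onset_idx is None else onset_idx * dt,
        "peak_velocity_cm_s": max(speed),
        "peak_acceleration_cm_s2": max(abs(a) for a in acc),
    }

def point_of_grasp(speed, in_parts_area, rate_hz=60.0):
    """Time of the speed minimum while the hand is inside the parts
    area -- the estimated moment of grasping."""
    candidates = [i for i, inside in enumerate(in_parts_area) if inside]
    idx = min(candidates, key=lambda i: speed[i])
    return idx / rate_hz
```

In practice the raw tracker signal would first be low-pass filtered before differentiation; that step is omitted here for brevity.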

5.3. Eye fixation counts

Eye movement recordings were only conducted in the commissioning phase (Experiment 2). In the eye movement analysis the number of fixations per task step (fixation count) was computed. Overall fixation counts within a step were divided by the number of relevant parts, resulting in a mean fixation count per item. The number of necessary fixations can be interpreted as an indication of the difficulty of finding a relevant assembly part. In accordance with completion times, the fixation counts in the commissioning phase were also reduced with contact analog presentation in comparison to the projection condition (cf. Fig. 4d and Table 2). Moreover, the benefit for the contact analog condition was present only with part numbers below four.
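The per-item normalization used here (and for the completion times above) amounts to dividing each step's total by its number of relevant parts and averaging. A minimal sketch in Python (the function name is hypothetical):

```python
def mean_per_item(step_totals, parts_per_step):
    """Divide each step's total (fixation count or completion time)
    by the number of relevant parts in that step, then average across
    steps, yielding a mean value per item for one condition."""
    per_item = [total / n for total, n in zip(step_totals, parts_per_step)]
    return sum(per_item) / len(per_item)
```

For instance, steps with 8 fixations over 4 parts and 6 fixations over 2 parts yield per-item counts of 2.0 and 3.0, i.e. a mean of 2.5 fixations per item.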

5.4. Summary

In the commissioning phase, contact analog presentation led to faster completion times for selecting and grasping relevant parts and fewer eye fixations in comparison to the projection and monitor conditions, where symbolic cues were presented. Both the completion times and the number of necessary eye fixations showed the same pattern of results, delivering converging evidence for the benefit of contact analog highlighting. Movement parameters of the first reaching and grasping movements show a benefit of contact analog highlighting in comparison to monitor presentation as well. As the attentional selection of assembly parts was improved in the contact analog condition, the reaching and grasping movement (i.e. the transport and grasp phase) to the first relevant box concomitantly began earlier. The peripheral cues used with contact analog highlighting seemed to enable a better suppression of irrelevant items. Apparently, contact analog instruction presentation in the commissioning phase enabled faster selection of the relevant part position and faster shifting of attention.

However, projections of schematic part positions close to the boxes also enhanced selecting and grasping in comparison to the monitor presentation. Accordingly, the endogenous symbolic cues seemed to be more efficient if presented closer to the search area. With more distant cue presentation, attention has to be shifted back and forth between cue and search area. This would not be necessary with rather simple cues (i.e. arrows), but the schematic pictures of part positions might already be too complex to memorize at once.

In general, instruction presentation on the monitor, where information was spatially most separated from the working area, led to the poorest performance. Here, attentional shifts seemed to take more time and the limitation of spatial attention was more obvious.

The variation of part numbers showed a differential effect in the commissioning and joining phases. Whereas in the joining phase assembly times increased rather linearly with the number of parts, the search and grasp time per part increased to a lesser extent. This pattern might be due to the fact that certain motor actions during the joining phase take a specific time which cannot be speeded up. Additionally, the assembly operations have to be executed one after the other. In the commissioning phase, however, more parallel processing can take place while searching for the parts. Furthermore, parts can be grasped with both the left and right hands, which enables a more parallel and, overall, overlapping movement execution.

6. Conclusions and future directions

The present research exemplifies the concept of controlled experimentation in an applied working scenario, while at the same time delivering results with ecological validity. These results have implications for the development of assistive systems for manual assembly, as they concretely demonstrate the advantage of contact analog highlighting and instruction projection on a working area. The measurement methods and parameters described allow for segmentation of the task steps into relevant processing units and further enable an analysis of the interplay of various cognitive processes. Future data analyses will additionally focus on hand and eye movements during the whole assembly step, i.e. also in the joining phase. This will enhance the understanding of cognitive processes and perception–action interactions during manual assembly.

Further investigations can help in refining the method of cueing and contact analog highlighting of relevant task information. The present results showed that cueing or highlighting improved performance only if participants had to select from no more than two part positions. This might be due to the fact that participants had to search for assembly parts in a specific order. Naturally, cueing will be more efficient with the highlighting of one specific position. This could be realized by making the highlighting contingent on the current eye fixation. For example, the fixation of a specific part on the part list could lead to highlighting of the respective relevant part position.

There are also disadvantages of peripheral cueing which have to be considered. Exogenous cues' effects fade very quickly, and they also bear the risk of capturing attention and thereby disturbing information processing. Here, again, the importance of presenting information at the right time and place becomes obvious. In sum, cues have to be evaluated and adapted to the content and context of information presentation.

The present results are concerned with the support and guidance of visual attention in the manual assembly task. With regard to optimal resource management, multimodal presentation and interaction forms must also be implemented in order to reduce the worker's mental workload and to minimize the problems of processing bottlenecks. Also, the usage of action affordances in the sense of Gibson is expected to reduce mental workload, because they automatically induce certain actions. In the broadest sense, the exogenous cue can be interpreted as an action affordance. In the further development of the assistive system, other forms of affordances should also be investigated and, where appropriate, integrated.

The present results provide insights into the information processing mechanisms involved in the commissioning and joining phases of manual assembly tasks. Theories of cognitive psychology have been applied to a working scenario. The results demonstrate that principles of spatial cueing can be used in order to support a worker during perceptual processing and action execution. Altogether, experimental results and further technical developments will be used in order to optimize an assistive system for manual assembly which adaptively supports the worker.

The described methodology illustrates how theoretical analysis and controlled experiments can contribute to the development and evaluation of engineering applications and user interfaces for HMI. The human information processes described are relevant not only for manual assembly tasks, but for various tasks in the working environment and daily life. Accordingly, many engineering applications in the field of HMI can benefit from comparable analyses and performance measurements.

Acknowledgments

The present research was conducted in the project "Adaptive Cognitive Interaction in Production Environments" (ACIPE) in the Excellence Cluster "Cognition for Technical Systems" (CoTeSys), supported by the German Research Foundation (DFG). We thank the editors and the anonymous reviewers for their insightful comments.

Moreover, we want to thank Christian Stößel and Isabella Hild for their contributions to the experiments, as well as Laura Voss for her support with the figures and Jared Pool for proofreading.

References

[1] A. Holzinger, Usability engineering for software developers, Communications of the ACM 48 (1) (2005) 71–74.

[2] C.-C. Lu, S.-C. Kang, S.-H. Hsieh, R.-S. Shiua, Improvement of a computer-based surveyor-training tool using a user-centered approach, Advanced Engineering Informatics 23 (1) (2009) 81–92.

[3] M.F. Zaeh, M. Wiesbeck, S. Stork, A. Schubö, A multi-dimensional measure for determining the complexity of manual assembly operations, Production Engineering 3 (4–5) (2009) 489–496.

[4] U. Neisser, Cognition and Reality: Principles and Implications of Cognitive Psychology, WH Freeman, New York, 1976.

[5] J.J. Gibson, The Ecological Approach to Visual Perception, Houghton Mifflin, Boston, 1979.

[6] D.A. Norman, The Design of Future Things, Basic Books, New York, 2007.

[7] P. Kropf, Collaboration in scientific visualization, Advanced Engineering Informatics 24 (2010) 188–195.

[8] C.D. Wickens, Multiple resources and performance prediction, Theoretical Issues in Ergonomics Science 3 (2002) 159–177.

[9] C.D. Wickens, J.S. McCarley, Applied Attention Theory, CRC Press, Boca Raton, FL, 2007.

[10] A. Holzinger, M. Kickmeier-Rust, S. Wassertheurer, M. Hessinger, Learning performance with interactive simulations in medical education: lessons learned from results of learning complex physiological models with the HAEMOdynamics SIMulator, Computers and Education 52 (1) (2009) 292–301.

[11] R. Parasuraman, M. Rizzo (Eds.), Neuroergonomics: The Brain at Work, Oxford University Press, New York, 2006.

[12] K. Landau, R. Wimmer, H. Luczak, J. Mainzer, H. Peters, G. Winter, Arbeit im Montagebetrieb [Work in the assembly factory], in: K. Landau, H. Luczak (Eds.), Ergonomie und Organisation in der Montage, Hanser, München, Germany, 2001, pp. 1–82.

[13] C. Stößel, M. Wiesbeck, S. Stork, M.F. Zäh, A. Schubö, Towards optimal worker assistance: investigating cognitive processes in manual assembly, in: Proceedings of the 41st CIRP Conference on Manufacturing Systems, 2008, pp. 245–250.

[14] M.S. Sanders, Human Factors in Engineering and Design, McGraw-Hill, New York, 1993.

[15] J. Theeuwes, Endogenous and exogenous control of visual selection, Perception 23 (4) (1994) 429–440.

[16] J.M. Wolfe, T.S. Horowitz, What attributes guide the deployment of visual attention and how do they do it?, Nature Reviews Neuroscience 5 (2004) 1–7.

[17] J. Duncan, G.W. Humphreys, Visual search and stimulus similarity, Psychological Review 96 (1989) 433–458.

[18] P. Morville, J. Callender, Search Patterns, O'Reilly Media, 2010.

[19] M. Posner, C. Snyder, B.J. Davidson, Attention and detection of signals, Journal of Experimental Psychology: General 109 (1980) 160–174.

[20] H.J. Müller, P.M. Rabbitt, Reflexive and voluntary orienting of visual attention: time course of activation and resistance to interruption, Journal of Experimental Psychology: Human Perception and Performance 15 (2) (1989) 315–330.

[21] A. Johnson, R.W. Proctor, Attention: Theory and Practice, Sage Publications, Thousand Oaks, CA, 2004.

[22] T. Dong, R. Tong, L. Zhang, J. Dong, A collaborative approach to assembly sequence planning, Advanced Engineering Informatics 19 (2005) 155–168.

[23] P.M. Fitts, The information capacity of the human motor system in controlling the amplitude of movement, Journal of Experimental Psychology 47 (1954) 381–391.

[24] W.X. Schneider, H. Deubel, Visual attention and saccadic eye movements: evidence for obligatory and selective spatial coupling, in: J.M. Findlay, R.W. Kentridge, R. Walker (Eds.), Eye Movement Research: Mechanisms, Processes, and Applications, Elsevier Science, 1995, pp. 317–324.

[25] M.A. Just, P.A. Carpenter, Eye fixations and cognitive processes, Cognitive Psychology 8 (1976) 441–480.

[26] R.J.K. Jacob, K.S. Karn, Eye tracking in human–computer interaction and usability research: ready to deliver the promises, in: J. Hyönä, R. Radach, H. Deubel (Eds.), The Mind's Eye: Cognitive and Applied Aspects of Eye Movement Research, Elsevier, Amsterdam, 2003, pp. 573–605.

[27] H.J. Goldberg, X.P. Kotval, Computer interface evaluation using eye movements: methods and constructs, International Journal of Industrial Ergonomics 24 (1999) 631–645.

[28] T.J. Crawford, H.J. Müller, Spatial and temporal effects of spatial attention on human saccadic eye movements, Vision Research 32 (1992) 293–304.

[29] R.W. Proctor, K.-P.L. Vu, Human information processing: an overview for human–computer interaction, in: A. Sears, J. Jacko (Eds.), The Human–Computer Interaction Handbook: Fundamentals, Evolving Technologies, and Emerging Applications, second ed., CRC Press, Boca Raton, FL, 2008, pp. 43–62.

[30] C.D. Wickens, C.M. Carswell, Information processing, in: G. Salvendy (Ed.), Handbook of Human Factors and Ergonomics, third ed., John Wiley, Hoboken, NJ, 2006, pp. 111–149.

[31] S. Stork, C. Stössel, H.J. Müller, M. Wiesbeck, M.F. Zäh, A. Schubö, A neuroergonomic approach for the investigation of cognitive processes in interactive assembly environments, in: 16th IEEE International Conference on Robot and Human Interactive Communication, 2007, pp. 750–755.

[32] S. Wiedenmaier, O. Oehme, L. Schmidt, H. Luczak, Augmented reality (AR) for assembly processes: design and experimental evaluation, International Journal of Human–Computer Interaction 16 (2003) 497–514.

[33] A. Holzinger, M. Kickmeier-Rust, D. Albert, Dynamic media in computer science education; content complexity and learning performance: is less more?, Educational Technology and Society 11 (1) (2008) 279–290.

[34] R. Azuma, Y. Baillot, R. Behringer, S. Feiner, S. Julier, B. MacIntyre, Recent advances in augmented reality, IEEE Computer Graphics and Applications (2001) 34–47.

[35] D. Curtis, D. Mizell, P. Gruenbaum, A. Janin, Several devils in the detail: making an AR app work in the aeroplane factory, in: Proceedings of the 1st IEEE International Workshop on Augmented Reality, Los Alamitos, CA, 1998, pp. 1–8.

[36] A. Tang, C. Owen, F. Biocca, W. Mou, Comparative effectiveness of augmented reality in object assembly, in: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Ft. Lauderdale, FL, USA, 2003, pp. 73–80.

[37] S. Stork, C. Stößel, A. Schubö, The influence of instruction mode on reaching movements during manual assembly, in: A. Holzinger (Ed.), HCI and Usability for Education and Work, LNCS, vol. 5298, Springer, Heidelberg, 2008, pp. 161–172.

[38] S. Stork, I. Hild, M. Wiesbeck, M.F. Zaeh, A. Schubö, Finding relevant items: attentional guidance improves visual selection processes, in: A. Holzinger, K. Miesenberger (Eds.), HCI and Usability for e-Inclusion, LNCS, vol. 5889, Springer, Heidelberg, 2009, pp. 69–80.

[39] S. Stork, C. Stößel, A. Schubö, Optimizing human–machine interaction in manual assembly, in: 17th IEEE International Symposium on Robot and Human Interactive Communication, 2008, pp. 113–118.