
A Multimodal Approach to Estimating VigilanceUsing EEG and Forehead EOG

Wei-Long Zheng1 and Bao-Liang Lu1,2,3,†

1Center for Brain-like Computing and Machine Intelligence, Department ofComputer Science and Engineering, Shanghai Jiao Tong University, Shanghai,China2Key Laboratory of Shanghai Education Commission for Intelligent Interactionand Cognitive Engineering, Shanghai Jiao Tong University, Shanghai, China3Brain Science and Technology Research Center, Shanghai Jiao Tong University,Shanghai, China†Author to whom any correspondence should be addressed.

E-mail: [email protected], [email protected]

August 2016

Abstract.Objective. Covert aspects of ongoing user mental states provide key contextinformation for user-aware human computer interactions. In this paper, we focuson the problem of estimating the vigilance of users using EEG and EOG signals.Approach. To improve the feasibility and wearability of vigilance estimationdevices for real-world applications, we adopt a novel electrode placement forforehead EOG and extract various eye movement features, which contain theprincipal information of traditional EOG. We explore the effects of EEG fromdifferent brain areas and combine EEG and forehead EOG to leverage theircomplementary characteristics for vigilance estimation. Considering that thevigilance of users is a dynamic changing process because the intrinsic mentalstates of users involve temporal evolution, we introduce continuous conditionalneural field and continuous conditional random field models to capture dynamictemporal dependency. Main results. We propose a multimodal approach toestimating vigilance by combining EEG and forehead EOG and incorporatingthe temporal dependency of vigilance into model training. The experimentalresults demonstrate that modality fusion can improve the performance comparedwith a single modality, EOG and EEG contain complementary information forvigilance estimation, and the temporal dependency-based models can enhance theperformance of vigilance estimation. From the experimental results, we observethat theta and alpha frequency activities are increased, while gamma frequencyactivities are decreased in drowsy states in contrast to awake states. Significance.The forehead setup allows for the simultaneous collection of EEG and EOG andachieves comparative performance using only four shared electrodes in comparisonwith the temporal and posterior sites.

Submitted to: J. Neural Eng.

arXiv:1611.08492v1 [cs.HC] 25 Nov 2016


1. Introduction

Humans interact with their surrounding complex environments based on their current states, and context awareness plays an important role during such interactions. However, the majority of existing systems lack this ability and generally interact with users in a rule-based fashion. Covert aspects of ongoing user mental states provide key context information in user-aware human computer interactions (Zander & Jatzev 2012), which can help systems react adaptively in a proper manner. Various studies have introduced the assessment of the mental states of users, such as intention, emotion, and workload, to promote active interactions between users and machines (Muhl, Jeunet & Lotte 2014, Lu, Zheng, Li & Lu 2015, Kang, Park, Gonuguntla, Veluvolu & Lee 2015, Zheng & Lu 2015). Zander and Kothe proposed the concept of a passive brain-computer interface (BCI) to fuse conventional BCI systems with cognitive monitoring (Zander & Kothe 2011). It is attractive to implement such novel BCI systems so that they capture a richer flow of information about human states without a significant increase in cost. Among these cognitive states, vigilance is a vital component, which refers to the ability to endogenously maintain focus.

Various working environments require sustained high vigilance, particularly for dangerous occupations such as driving trucks and high-speed trains. In these cases, a decrease in vigilance (Grier, Warm, Dember, Matthews, Galinsky, Szalma & Parasuraman 2003) or a momentary lapse of attention (Peiris, Davidson, Bones & Jones 2011, Davidson, Jones & Peiris 2007) might severely endanger public transportation safety. Driving fatigue is reported to be a major factor in fatal road accidents.

Various approaches for estimating vigilance levels have been proposed in the literature (Ji, Zhu & Lan 2004, Dong, Hu, Uchimura & Murayama 2011, Sahayadhas, Sundaraj & Murugappan 2012). However, several research challenges still exist. Vigilance decrement is a dynamic changing process because the intrinsic mental states of users evolve over time rather than being fixed at a single time point. This process cannot simply be treated as a function of the time spent engaged in tasks. The ability to predict vigilance levels with high temporal resolution is more feasible in real-world applications (Davidson et al. 2007). Moreover, drivers' vigilance levels cannot simply be classified into several discrete categories but should be quantified in the same way as the blood alcohol level (Dong et al. 2011, Ranney 2008). We still lack a standardized method for measuring the overall vigilance levels of humans.

Among various modalities, EEG is reported to be a promising neurophysiological indicator of the transition between wakefulness and sleep in various studies because EEG signals directly reflect human brain activity (Berka, Levendowski, Lumicao, Yau, Davis, Zivkovic, Olmstead, Tremoulet & Craven 2007, Khushaba, Kodagoda, Lal & Dissanayake 2011, Shi & Lu 2013, Lin, Chuang, Huang, Tsai, Lu, Chen & Ko 2014, Martel, Dahne & Blankertz 2014, Kim, Kim, Haufe & Lee 2014). Rosenberg and colleagues recently presented a neuromarker for sustained attention from whole-brain functional connectivity (Rosenberg, Finn, Scheinost, Papademetris, Shen, Constable & Chun 2016). They developed a network model called the sustained attention network for predicting attentional performance. Moreover, EEG has intrinsic potential to allow fatigue detection at onset or even before onset (Davidson et al. 2007). O'Connell and colleagues examined the temporal dynamics of EEG signals preceding a lapse of sustained attention (O'Connell, Dockree, Robertson, Bellgrove, Foxe & Kelly 2009). Their results demonstrated that the specific neural signatures of attentional lapses are registered in the EEG up to 20 s prior to an error. Lin et al. presented a wireless and wearable EEG system for evaluating drivers' vigilance levels, and they tested their system in a virtual driving environment (Lin et al. 2014). They also combined lapse detection and feedback efficacy assessment to implement a closed-loop system. By monitoring the changes in EEG patterns, they were able to detect driving performance and estimate the efficacy of arousing warning feedback delivered to drowsy subjects (Lin, Huang, Chuang, Ko & Jung 2013).

In addition to EEG, EOG signals contain characteristic information on various eye movements, which are often utilized to estimate vigilance because of their easy setup and high signal-to-noise ratio (Papadelis, Chen, Kourtidou-Papadeli, Bamidis, Chouvarda, Bekiaris & Maglaveras 2007, Damousis & Tzovaras 2008, Ma, Shi & Lu 2010, Ma, Shi & Lu 2014). Researchers have developed various multimodal approaches for constructing hybrid BCIs (Pfurtscheller, Allison, Bauernfeind, Brunner, Escalante, Scherer, Zander, Mueller-Putz, Neuper & Birbaumer 2010) and combining brain signals and eye movements for robotic control and cognitive monitoring (Lee, Woo, Kim, Whang & Park 2010, McMullen, Hotson, Katyal, Wester, Fifer, McGee, Harris, Johannes, Vogelstein, Ravitz et al. 2014, Zheng, Dong & Lu 2014, Ma et al. 2014, Lu et al. 2015). Simola et al. studied the valence and arousal interactions under free viewing of emotional scenes by analysing eye movement behaviours and eye-fixation-related potentials (Simola, Le Fevre, Torniainen & Baccino 2015). Their findings support the multi-dimensional, interactive model of emotional processing. Moreover, Bulling and colleagues found that eye movements from EOG signals are good indicators for activity recognition (Bulling, Ward, Gellersen, Troster et al. 2011). However, the electrodes in the traditional EOG setup are placed around the eyes, which may distract users and cause discomfort. In our previous study, we proposed a new electrode placement on the forehead and extracted various eye movement features from the forehead EOG (Zhang, Gao, Zhu, Zheng & Lu 2015, Huo, Zheng & Lu 2016). Various studies have indicated that signals from different modalities represent different aspects of covert mental states (Calvo & D'Mello 2010, Sahayadhas et al. 2012, D'mello & Kory 2015). EEG and EOG represent internal cognitive states and external subconscious behaviours, respectively. These two modalities contain complementary information and can be integrated to construct a more robust vigilance estimation model.

Figure 1. The simulated driving system and the experimental scene. (a) The virtual-reality-based simulated driving scenes, including various weather and roads. (b) Forehead EOG, EEG and eye movements are simultaneously recorded using the Neuroscan system and eye tracking glasses. (c) The simulated driving experiments are performed in a real vehicle from which the engine and other unnecessary components have been removed. During the experiments, the subjects are asked to drive the car using the steering wheel and gas pedal. The driving scenes are synchronously updated according to the subjects' operations. No warning feedback is given to subjects after they fall asleep.

In this paper, we present a multimodal approach for vigilance estimation by combining EEG and forehead EOG. The main contributions of this paper are as follows: 1) we explore the effect of EEG for vigilance estimation in different brain areas: frontal, temporal, and posterior; 2) we propose a multimodal vigilance estimation framework with EEG and forehead EOG in terms of feasibility and accuracy; 3) we acquire both EEG and EOG signals simultaneously with four shared electrodes on the forehead and combine them for vigilance estimation; 4) we reveal the complementary characteristics of the EEG and forehead EOG modalities for vigilance estimation; 5) we apply continuous conditional neural field (CCNF) and continuous conditional random field (CCRF) models to capture dynamic temporal dependency and enhance the performance of the vigilance estimation model; and 6) we investigate neural patterns regarding critical frequency activities under awake and drowsy states.

2. Methods

2.1. Experiment Setup

To collect EEG and EOG data, we developed a virtual-reality-based simulated driving system. A four-lane highway scene is shown on a large LCD screen in front of a real vehicle from which the engine and other unnecessary components have been removed. The vehicle movements in the software are controlled by the steering wheel and gas pedal, and the scenes are simultaneously updated according to the participants' operations. The road is primarily straight and monotonous to more easily induce fatigue in the subjects. The simulated driving system and the experimental scene are shown in Figure 1.

A total of 23 subjects (mean age: 23.3, STD: 1.4, 12 females) participated in the experiments. All participants possessed normal or corrected-to-normal vision. Caffeine, tobacco, and alcohol were prohibited prior to participation in the experiments. At the beginning of the experiments, a short pre-test was performed to ensure that every participant understood the instructions. Most experiments were performed in the early afternoon (approximately 13:30) after lunch, when the circadian rhythm of sleepiness reaches its peak, in order to induce fatigue easily (Ferrara & De Gennaro 2001). The duration of the entire experiment was approximately 2 hours. The participants were asked to drive the car in the simulated environments without any alerting feedback.

Both EEG and forehead EOG signals were recorded simultaneously using the Neuroscan system with a 1000 Hz sampling rate. The electrode placement of the forehead EOG (Zhang, Gao, Zhu, Zheng & Lu 2015) is shown in Figure 2. For the EEG setup, we recorded 12-channel EEG signals from the posterior site (CP1, CPZ, CP2, P1, PZ, P2, PO3, POZ, PO4, O1, OZ, and O2) and 6-channel EEG signals from the temporal site (FT7, FT8, T7, T8, TP7, and TP8) according to the international 10-20 electrode system, as shown in Figure 3. Eye movements were simultaneously recorded using SMI ETG eye tracking glasses‡, and facial video was recorded by a camera mounted in front of the participants.

To allow the results in this paper to be reproduced and to enhance cooperation in related research fields, the dataset used in this study will be made freely available to the academic community as a subset of SEED§.

2.2. Vigilance Annotations

The primary challenge of vigilance estimation using a supervised machine learning paradigm is how to quantitatively label the sensor data, because the ground truth of covert mental states cannot be accurately obtained in theory. To date, researchers have proposed various vigilance annotation methods in the literature, such as lane departure and local error rates (Wang, Jung & Lin 2015, Makeig & Inlow 1993). Lin et al. designed an event-related lane-departure driving task in which the subjects were asked to respond to random drifts as soon as possible, and the response time reflected the vigilance states of the subjects (Lin, Huang, Chao, Chen, Chiu, Ko & Jung 2010, Lin et al. 2013). Shi and Lu (Shi & Lu 2013) conducted a study in which the local error rate of the subjects' performance was used as the vigilance measurement. The subjects were asked to press correct buttons according to the colours of traffic signs. These two annotation methods are based on subjects' behaviours and can reflect their actual vigilance levels to some extent. However, they are not feasible for dual tasks, particularly in real-world driving environments.

‡ http://eyetracking-glasses.com/

§ http://bcmi.sjtu.edu.cn/~seed/

Figure 2. Electrode placements for the traditional and forehead EOG setups. The yellow and blue dots indicate the electrode placements of the traditional EOG and forehead EOG, respectively. Electrode four is the shared electrode of both setups.

Figure 3. Electrode placements for the EEG setups. 12-channel and 6-channel EEG signals were recorded from the posterior site (red colour) and temporal site (green colour), respectively.

There is another annotation method called PERCLOS (Dinges & Grace 1998), which refers to the percentage of eye closure. It is one of the most widely accepted vigilance indices in the literature (Trutschel, Sirois, Sommer, Golz & Edwards 2011, Bergasa, Nuevo, Sotelo, Barea & Lopez 2006, Dong et al. 2011). Conventional driving fatigue detection methods utilize facial videos to calculate the PERCLOS index.


However, the performance of facial video-based approaches can be influenced by environmental changes, especially varying illumination and heavy occlusion. In this study, we adopt an automatic continuous vigilance annotation method using eye tracking glasses, which was proposed in our previous work (Gao, Zhang, Zheng & Lu 2015). This approach allows vigilance to be measured in both laboratory and real-world environments.

Compared with facial videos, eye tracking glasses can more precisely capture different eye movements, such as blinks, fixations, and saccades, as shown in Figure 4. The eye tracking-based PERCLOS index can be calculated from the percentage of the durations of blinks and 'CLOS' over a specified time interval as follows:

PERCLOS = (blink + CLOS) / interval, and (1)

interval = blink + fixation + saccade + CLOS, (2)

where 'CLOS' denotes the duration of the eye closures. We evaluated the efficiency of the eye tracking-based method for vigilance annotation against the facial videos recorded simultaneously and found a high correlation between the PERCLOS index and the subjects' current cognitive states. Compared with other approaches (Shi & Lu 2010, Ma et al. 2014, Wang et al. 2015), this method is more feasible for real-world driving environments, where performing dual tasks can distract attention and cause safety issues (Oken & Salinsky 2007). This new vigilance annotation method can be performed automatically without much interference to the drivers.
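The PERCLOS computation of Eqs. (1)-(2) can be sketched as follows. This is a minimal illustration, not the authors' code: the event representation ((label, duration) pairs within one time window) and the function name are our assumptions.

```python
# Minimal sketch of the eye-tracking-based PERCLOS index, Eqs. (1)-(2):
#   PERCLOS  = (blink + CLOS) / interval
#   interval = blink + fixation + saccade + CLOS
# `events` is a hypothetical representation: (label, duration-in-seconds) pairs
# for all eye movement events detected within one time window.

def perclos(events):
    """Return the PERCLOS index for one window of labelled eye events."""
    totals = {'blink': 0.0, 'fixation': 0.0, 'saccade': 0.0, 'CLOS': 0.0}
    for label, duration in events:
        totals[label] += duration
    interval = sum(totals.values())                       # Eq. (2)
    if interval == 0:
        return 0.0
    return (totals['blink'] + totals['CLOS']) / interval  # Eq. (1)
```

For example, a window with 0.4 s of blinks, 1.6 s of eye closure, 5.0 s of fixations, and 1.0 s of saccades yields PERCLOS = 2.0 / 8.0 = 0.25.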

Note that although the eye tracking-based approach can estimate the vigilance level more precisely, it is not currently feasible for real-world applications due to its very high cost. Here, we utilize the eye tracking glasses as a vigilance annotation device to obtain more accurately labelled EEG and EOG data for training vigilance estimation models.

Figure 4. The SMI eye tracking glasses used in this study and the pupillary image captured in one experiment.

2.3. Feature Extraction

2.3.1. Preprocessing for Forehead EOG  For traditional EOG recordings, the electrodes are mounted around the eyes, using the electrodes numbered one to four in Figure 2 (a). However, in real-world applications, such an electrode placement is not easily mounted and may distract users with discomfort. To implement wearable devices for real-world vigilance estimation, we propose placing all the electrodes on the forehead, as shown in Figure 2 (b), and separating vertical EOG (VEO) and horizontal EOG (HEO) using the electrodes numbered four to seven shown in Figure 2 (b). For the traditional EOG setup shown in Figure 2 (a), the VEO and HEO signals are obtained by subtracting the signals of electrodes four and three and of electrodes one and two, respectively. VEO and HEO signals contain details of eye movements, such as blinks, saccades, and fixations.

How to extract VEO and HEO signals from the forehead EOG setup is one of the key problems in this study. We extracted VEOf signals from electrodes numbered four and seven and HEOf signals from electrodes five and six using two separation strategies: the minus rule and independent component analysis (ICA). For the minus rule, the subtraction of channels five and seven is an approximation of VEO, named VEOf, and the subtraction of channels five and six is an approximation of HEO, named HEOf. Here, the subscript 'f' indicates 'forehead'.
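The minus rule above can be sketched in a few lines. This is an illustrative sketch only: the array layout (forehead channels Nos. 4-7 as rows) and the subtraction order are our assumptions, since the text does not specify the sign convention.

```python
import numpy as np

# Hedged sketch of the minus rule for forehead EOG separation.
# `forehead` is assumed to have shape (4, n_samples), rows ordered as
# channels Nos. 4, 5, 6, 7; the subtraction order is an assumption.
def minus_rule(forehead):
    ch4, ch5, ch6, ch7 = forehead
    veo_f = ch5 - ch7    # approximation of vertical EOG (VEOf)
    heo_f = ch5 - ch6    # approximation of horizontal EOG (HEOf)
    return veo_f, heo_f
```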

ICA is a blind source separation method proposed to decompose a multivariate signal into independent non-Gaussian signals (Delorme & Makeig 2004). We extracted the VEOf and HEOf components using FASTICA (Delorme & Makeig 2004) from channels four and seven and channels five and six, respectively. The comparison of the traditional EOG and forehead EOG using the minus operation and ICA separation strategies is depicted in Figure 5. As shown, the extracted VEOf and HEOf from the forehead electrodes have waveforms similar to the traditional ones, and the forehead VEOf and HEOf can capture critical eye movements, such as blinks and saccades.

Figure 5. Comparison of traditional EOG and forehead EOG using the minus operation and ICA separation strategies. Here, (a) and (d) are traditional VEO and HEO; (b) and (e) are VEOf and HEOf extracted from forehead EOG using the minus operation; and (c) and (f) are VEOf and HEOf extracted from forehead EOG using the ICA approach.

Figure 6. A blink detected by using the continuous wavelet transform. We applied two thresholds θh and θl to the transformed wavelet signals and detected peaks to locate blink segments. Red markers indicate the peaks of each blink, and green markers indicate the start and end points of each blink.

Figure 7. A saccade detected by using the continuous wavelet transform. Similar to blink detection, we applied two thresholds θh and θl to the transformed wavelet signals and used peak detection on the transformed wavelet signals. Blue cross markers and diamond markers indicate the start and end points of each saccade, respectively.

2.3.2. Feature Extraction from Forehead EOG  After preprocessing the forehead EOG signals and extracting VEOf and HEOf, we detected eye movements such as blinks and saccades using the wavelet transform method (Bulling et al. 2011). We computed the continuous wavelet coefficients at a scale of 8 with a Mexican hat wavelet defined by

ψ(t) = (2 / (√(3σ) π^{1/4})) (1 − t²/σ²) e^{−t²/(2σ²)}, (3)

where σ is the standard deviation. Because the wavelet transform is sensitive to singularities, we used a peak detection algorithm on the wavelet coefficients to detect blinks and saccades from the forehead VEOf and HEOf, respectively. The detected blinks and saccades are shown in Figures 6 and 7, respectively.

By applying thresholds to the continuous wavelet coefficients, we encoded the positive and negative peaks in the forehead VEOf and HEOf into sequences, where a positive peak was encoded as '1' and a negative one as '0'. A saccade is characterized by a sequence of two successive positive and negative peaks in the coefficients. A blink contains three successive large peaks, namely, negative, positive, and negative, and the time between two positive peaks should be smaller than a minimum time threshold. Therefore, in the encoding, segments with '01' or '10' are recognized as saccade candidates, and segments with '010' are recognized as blink candidates. Moreover, there are some other constraints, such as slope, correlation, and maximal segment length, to guarantee precise detection of blinks and saccades. Following the detection of blinks and saccades, we extracted statistical parameters, such as the mean, maximum, variance, and derivative, of the different eye movements with an 8 s non-overlapping window as the EOG features. We extracted a total of 36 EOG features from the detected blinks, saccades, and fixations. Table 1 presents the details of the extracted 36 eye movement features.
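The detection pipeline described above can be sketched as follows: a single-scale continuous wavelet transform with the Mexican hat wavelet of Eq. (3), two-threshold peak picking, and encoding of peak signs so that '010' marks a blink candidate. The thresholds, the support width of the wavelet, and the extra constraints (slope, correlation, segment length) are simplified or omitted here; this is an illustration, not the authors' implementation.

```python
import numpy as np

# Mexican hat mother wavelet of Eq. (3); sigma is the standard deviation.
def mexican_hat(t, sigma=1.0):
    norm = 2.0 / (np.sqrt(3.0 * sigma) * np.pi ** 0.25)
    return norm * (1.0 - t ** 2 / sigma ** 2) * np.exp(-t ** 2 / (2.0 * sigma ** 2))

# Continuous wavelet coefficients at one scale (the paper uses scale 8),
# approximated by convolving with a sampled, dilated wavelet.
def cwt_single_scale(signal, scale=8.0):
    t = np.arange(-4.0 * scale, 4.0 * scale + 1.0)
    wavelet = mexican_hat(t / scale)
    return np.convolve(signal, wavelet, mode='same')

# Encode local maxima above theta_h as '1' and local minima below theta_l
# as '0', as in the two-threshold peak picking of Figures 6 and 7.
def encode_peaks(coeffs, theta_h, theta_l):
    code = []
    for i in range(1, len(coeffs) - 1):
        if coeffs[i] > theta_h and coeffs[i] >= coeffs[i - 1] and coeffs[i] >= coeffs[i + 1]:
            code.append('1')
        elif coeffs[i] < theta_l and coeffs[i] <= coeffs[i - 1] and coeffs[i] <= coeffs[i + 1]:
            code.append('0')
    return ''.join(code)

# A blink candidate is a negative-positive-negative peak triple ('010');
# timing and shape constraints from the paper are omitted in this sketch.
def count_blink_candidates(code):
    return code.count('010')
```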

Group     Extracted Features

Blink     maximum/mean/sum of blink rate; maximum/minimum/mean of blink amplitude; mean/maximum of blink rate variance and amplitude variance; power/mean power of blink amplitude; blink numbers

Saccade   maximum/minimum/mean of saccade rate and saccade amplitude; maximum/mean of saccade rate variance and amplitude variance; power/mean power of saccade amplitude; saccade numbers

Fixation  mean/maximum of blink duration variance and saccade duration variance; maximum/minimum/mean of blink duration and saccade duration

Table 1. The details of the extracted 36 eye movement features.

2.3.3. Forehead EEG Signal Extraction  For conventional EEG-based approaches, the EOG signals are always considered to be severe contamination, particularly at frontal sites. Many methods have been proposed for removing eye movement and blink artifacts from EEG recordings (Uriguen & Garcia-Zapirain 2015, Daly, Scherer, Billinger & Muller-Putz 2015, Delorme & Makeig 2004). However, in this study, we consider that both EEG and EOG contain discriminative information for vigilance estimation. Our intuitive concept is that it is possible to separate EEG and EOG signals from the shared forehead electrodes. The main advantage of this concept is that we can leverage the favourable properties of both EEG and EOG modalities without increasing the setup cost.

Figure 8. (a) The decomposed independent components from the four forehead channels (Nos. 4-7) using ICA. IC 1 and IC 2 are EOG components for eye activities. (b) The reconstructed forehead EEG obtained by filtering out the EOG components. Strong alpha activities can be observed under eye closure conditions.

We utilize the FASTICA algorithm to extract EEG and EOG components from the four forehead channels (Nos. 4-7) shown in Figure 2 (b). The ICA algorithm decomposes the multi-channel data into a sum of independent components (Jung, Makeig, Humphries, Lee, Mckeown, Iragui & Sejnowski 2000). Similar to artifact removal using blind signal separation in conventional approaches, the forehead EEG signals are reconstructed with a weight matrix after discarding the EOG components. The raw data recorded at the four forehead channels (Nos. 4-7) are concatenated as the input matrix X for ICA as follows:

X = [Ch 4; Ch 5; −Ch 6; Ch 7], (4)

where the rows of the input matrix X are the signals Ch 4, Ch 5, −Ch 6, and Ch 7 from channels Nos. 4-7. After ICA decomposition, the un-mixing matrix W is obtained, which decomposes the multi-channel data into a sum of independent components as follows:

U = W X, (5)

where the rows of U are the time courses of activation of the ICA components. The columns of the inverse matrix W^{-1} indicate the projection strengths of the corresponding components. Therefore, the clean forehead EEG signals can be derived as

X̃ = W^{-1} Ũ, (6)

where Ũ is the matrix of activation waveforms U with the rows representing EOG components set to zero.
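The reconstruction of Eqs. (4)-(6) is linear algebra once the un-mixing matrix is known. The sketch below assumes W has already been estimated (e.g. by FASTICA); a fixed random invertible matrix stands in for it, and rows 0 and 1 of U are treated as the EOG components, mirroring IC 1 and IC 2 in Figure 8.

```python
import numpy as np

# Numerical sketch of Eqs. (4)-(6). The data and the un-mixing matrix W
# are synthetic stand-ins; in the paper, W comes from FASTICA.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 1000))      # rows: Ch 4, Ch 5, -Ch 6, Ch 7 (Eq. (4))
W = rng.standard_normal((4, 4))         # stand-in un-mixing matrix

U = W @ X                               # Eq. (5): component activations
U_tilde = U.copy()
U_tilde[[0, 1], :] = 0.0                # zero out the assumed EOG components
X_clean = np.linalg.inv(W) @ U_tilde    # Eq. (6): reconstructed forehead EEG
```

Note that zeroing no rows recovers X exactly, which is a quick sanity check that the projection and back-projection are consistent.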

The decomposed independent components and the reconstructed forehead EEG of one segment under eye closure conditions are shown in Figure 8. Under eye closure conditions, the alpha rhythm appears more dominant in EEG signals, as reported in previous studies (Papadelis et al. 2007). From Figure 8 (a), we can observe that the first two rows are the corresponding eye movement components, and the last two rows contain EEG components with high alpha power values. The reconstructed signals contain characteristics of EEG waves, which are accompanied by high alpha bursts. The results presented in Figure 8 demonstrate the efficiency of our approach in extracting EEG signals from the forehead electrodes.

2.3.4. Feature Extraction from EEG  In addition to forehead EOG, we recorded EEG data from the temporal and posterior sites, which showed high relevance to vigilance in the literature and in our previous work (Khushaba et al. 2011, Shi & Lu 2013). For preprocessing, the raw EEG data were processed with a band-pass filter between 1 and 75 Hz to reduce artifacts and noise and downsampled to 200 Hz to reduce the computational complexity. For feature extraction, an efficient EEG feature called differential entropy (DE) was proposed for vigilance estimation and emotion recognition (Shi, Jiao & Lu 2013, Duan, Zhu & Lu 2013), which showed superior performance compared to the conventional power spectral density features.

The original formula for calculating differential entropy is defined as

h(X) = −∫_X f(x) log f(x) dx. (7)

If a random variable obeys the Gaussian distribution N(µ, σ²), the differential entropy can simply be calculated by the following formulation:

h(X) = −∫_{−∞}^{+∞} f(x) log f(x) dx = (1/2) log(2πeσ²), (8)

where f(x) = (1/√(2πσ²)) exp(−(x − µ)²/(2σ²)).

According to the DE definition mentioned above, for each EEG segment, we extracted the DE features from five frequency bands: delta (1-4 Hz), theta (4-8 Hz), alpha (8-14 Hz), beta (14-31 Hz), and gamma (31-50 Hz). We also extracted the DE features from the total frequency band (1-50 Hz) with a 2 Hz frequency resolution. All the DE features were calculated using short-term Fourier transforms with an 8 s non-overlapping window.
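Under the Gaussian assumption of Eq. (8), the DE of a band-pass-filtered signal reduces to half the log of 2\pi e times its band power, so the extraction can be sketched as follows. The window length, band definitions, and sampling rate follow the text; the Hann taper and the mean-power estimate are illustrative assumptions, not the authors' exact pipeline.

```python
import numpy as np

FS = 200  # sampling rate after downsampling (Hz)
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 14),
         "beta": (14, 31), "gamma": (31, 50)}

def de_features(segment, fs=FS, win_sec=8):
    """Differential entropy per band for one channel, using an
    8 s non-overlapping STFT window. Under a Gaussian assumption,
    DE equals 0.5*log(2*pi*e*sigma^2), i.e. the log band power up
    to constants (Eq. 8)."""
    n = int(win_sec * fs)
    feats = []
    for start in range(0, len(segment) - n + 1, n):
        win = segment[start:start + n] * np.hanning(n)
        psd = np.abs(np.fft.rfft(win)) ** 2 / n
        freqs = np.fft.rfftfreq(n, d=1.0 / fs)
        row = [0.5 * np.log(2 * np.pi * np.e *
                            psd[(freqs >= lo) & (freqs < hi)].mean())
               for lo, hi in BANDS.values()]
        feats.append(row)
    return np.array(feats)  # shape: (n_windows, n_bands)

# 24 s of synthetic data -> three 8 s windows, five bands each
x = np.random.default_rng(1).standard_normal(FS * 24)
F = de_features(x)
```

The 2 Hz-resolution variant used in the text would simply replace the five band masks with consecutive 2 Hz bins over 1-50 Hz.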

2.4. Vigilance Estimation

After obtaining vigilance labels and EOG/EEG features, we used support vector regression (SVR) with radial basis function (RBF) kernels as a basic regression model. The optimal values of the parameters c and g were tuned with a grid search. As the modality fusion strategy, we used feature-level fusion, in which the feature vectors of EEG and EOG are directly concatenated into a larger feature vector as inputs. For evaluation, we separated the entire data from one experiment into five sessions and evaluated the performance with 5-fold cross-validation. There are a total of 885 samples for each experiment.
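A minimal sketch of this pipeline with scikit-learn, assuming random stand-in features: the 885-sample count mirrors the text, but the feature dimensions and the grids for C and gamma (the paper's c and g) are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.svm import SVR

rng = np.random.default_rng(2)
X_eeg = rng.standard_normal((885, 25))   # stand-in DE features
X_eog = rng.standard_normal((885, 36))   # stand-in eye movement features
y = rng.random(885)                      # stand-in PERCLOS labels in [0, 1]

# feature-level fusion: concatenate the two modalities
X_fused = np.hstack([X_eeg, X_eog])

# RBF-SVR with grid search over C and gamma, 5-fold cross-validation
grid = GridSearchCV(SVR(kernel="rbf"),
                    {"C": [1, 10, 100], "gamma": [0.001, 0.01, 0.1]},
                    cv=KFold(n_splits=5))
grid.fit(X_fused, y)
pred = grid.predict(X_fused)
```

Note that the paper's evaluation splits by session rather than at random; a session-aware splitter would replace `KFold` in a faithful reproduction.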

The root mean square error (RMSE) and correlation coefficient (COR) are the most commonly used evaluation metrics for continuous regression models (Nicolaou, Gunes, Pantic et al. 2011). RMSE is the square root of the mean squared error between the prediction and the ground truth, and it is defined as follows:

\mathrm{RMSE}(Y, \hat{Y}) = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2}, (9)

where Y = (y_1, y_2, ..., y_N)^T is the ground truth and \hat{Y} = (\hat{y}_1, \hat{y}_2, ..., \hat{y}_N)^T is the prediction.

Since RMSE-based evaluation cannot provide structural information, we used COR to overcome the shortcomings of RMSE. COR provides an evaluation of the linear relationship between the prediction and the ground truth, which reflects the consistency of their

trends. Pearson's correlation coefficient is defined as follows:

\mathrm{COR}(Y, \hat{Y}) = \frac{\sum_{i=1}^{N}(y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{N}(y_i - \bar{y})^2\sum_{i=1}^{N}(\hat{y}_i - \bar{\hat{y}})^2}}, (10)

where \bar{y} and \bar{\hat{y}} are the means of Y and \hat{Y}. However, COR is sensitive to short segments and is more appropriate for long sequences. Therefore, we concatenated the predictions and ground truth of the five sessions and calculated COR on the concatenated sequences as the final evaluation. In general, the more accurate the model is, the higher the COR and the lower the RMSE.
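The two metrics can be implemented directly from Eqs. (9) and (10); the sketch below uses small made-up vectors for illustration.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error, Eq. (9)."""
    d = np.asarray(y_true) - np.asarray(y_pred)
    return np.sqrt(np.mean(d ** 2))

def cor(y_true, y_pred):
    """Pearson correlation coefficient, Eq. (10)."""
    yt = np.asarray(y_true) - np.mean(y_true)
    yp = np.asarray(y_pred) - np.mean(y_pred)
    return np.sum(yt * yp) / np.sqrt(np.sum(yt ** 2) * np.sum(yp ** 2))

y_true = np.array([0.1, 0.4, 0.5, 0.9])
y_pred = np.array([0.2, 0.3, 0.6, 0.8])
print(rmse(y_true, y_pred))  # each error is 0.1, so RMSE is 0.1
print(cor(y_true, y_pred))   # close to 1: predictions track the trend
```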

2.5. Incorporating Temporal Dependency into Vigilance Estimation

Vigilance is a dynamically changing process because the intrinsic mental states of users involve temporal evolution. To incorporate the temporal dependency into vigilance estimation, we introduced the continuous conditional neural field (CCNF) and continuous conditional random field (CCRF) when constructing vigilance estimation models. CCNF and CCRF are extensions of the conditional random field (CRF) (Lafferty, McCallum & Pereira 2001) for continuous variable modelling that incorporate temporal or spatial information, and they have shown promising performance in various applications (Baltrusaitis, Banda & Robinson 2013, Imbrasaite, Baltrusaitis & Robinson 2014, Baltrusaitis, Robinson & Morency 2014). CCNF combines the nonlinearity of conditional neural fields (Peng, Bo & Xu 2009) and the continuous output of CCRF.

The probability distribution of CCNF for a particular sequence is defined as follows:

P(\mathbf{y}|\mathbf{x}) = \frac{\exp(\Psi)}{\int_{-\infty}^{\infty}\exp(\Psi)\,d\mathbf{y}}, (11)

where \int_{-\infty}^{\infty}\exp(\Psi)\,d\mathbf{y} is the normalization function, \mathbf{x} = \{x_1, x_2, \cdots, x_n\} is a set of input observations, \mathbf{y} = \{y_1, y_2, \cdots, y_n\} is a set of output variables, and n is the length of the sequence.

There are two types of features defined in these models: vertex features f_k and edge features g_k. The potential function \Psi is defined as follows:

\Psi = \sum_i\sum_{k=1}^{K_1}\alpha_k f_k(y_i, \mathbf{x}_i, \theta_k) + \sum_{i,j}\sum_{k=1}^{K_2}\beta_k g_k(y_i, y_j), (12)

where \alpha_k > 0 and \beta_k > 0, the vertex features f_k denote the mapping from \mathbf{x}_i to y_i with a one-layer neural network, and \theta_k is the weight vector for neuron k.

The vertex features of CCNF are defined as

f_k(y_i, \mathbf{x}_i, \theta_k) = -(y_i - h(\theta_k, \mathbf{x}_i))^2, and (13)

h(\theta, \mathbf{x}_i) = \frac{1}{1 + e^{-\theta^T\mathbf{x}_i}}, (14)


where the optimal number of vertex features K_1 is tuned with cross-validation. In our experiments, we evaluated K_1 = {10, 20, 30}.

The edge features g_k denote the similarities between outputs y_i and y_j, which are defined as

g_k(y_i, y_j) = -\frac{1}{2}S^{(k)}_{i,j}(y_i - y_j)^2, (15)

where the similarity measure S^{(k)} controls the existence of the connections between two vertices.

In the experiments, K_2 is set to 1, and S^{(k)}_{i,j} is set to 1 when two nodes i and j are neighbours; otherwise, it is 0. The sequence length n is set to seven. The formulas for CCRF are the same as those for CCNF, except for the definition of the vertex features. The vertex features of CCRF are defined as

f_k(y_i, x_{i,k}) = -(y_i - x_{i,k})^2, (16)

where x_{i,k} is the k-th element of \mathbf{x}_i.

The training of parameters in CCRF and CCNF is based on the conditional log-likelihood, with P(\mathbf{y}|\mathbf{x}) being a multivariate Gaussian. For more details regarding the learning and inference of CCRF and CCNF, please refer to (Baltrusaitis et al. 2014, Imbrasaite et al. 2014). The outputs of support vector regression are used to train CCRF, and the original multi-dimensional features are used to train CCNF. The CCRF and CCNF regularization hyper-parameters for \alpha_k and \beta_k are chosen by a grid search in 10^{[0,1,2]} and 10^{[-3,-2,-1,0]}, respectively, on the training set.
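Because the potentials in Eqs. (12), (15), and (16) are quadratic in y, P(y|x) is Gaussian and MAP inference reduces to a linear solve. The following sketch illustrates CCRF-style inference for a chain with K_1 = K_2 = 1 and 0/1 neighbour similarity, as described above; it is a simplified derivation for illustration, not the authors' implementation.

```python
import numpy as np

def ccrf_predict(x, alpha=1.0, beta=1.0):
    """MAP inference for a chain CCRF with quadratic potentials.

    With vertex features -(y_i - x_i)^2 and edge features
    -0.5*S_ij*(y_i - y_j)^2, setting the gradient of Psi to zero
    gives the linear system (alpha*I + beta*L) y = alpha*x, where
    L is the graph Laplacian of the neighbour similarity S
    (S_ij = 1 for |i - j| = 1 on a chain)."""
    n = len(x)
    S = np.zeros((n, n))
    idx = np.arange(n - 1)
    S[idx, idx + 1] = S[idx + 1, idx] = 1.0   # chain neighbours
    L = np.diag(S.sum(axis=1)) - S            # graph Laplacian
    return np.linalg.solve(alpha * np.eye(n) + beta * L, alpha * x)

# noisy per-sample SVR outputs are smoothed towards a temporally
# consistent sequence (sequence length 7, as in the text)
x = np.array([0.2, 0.8, 0.3, 0.9, 0.4, 1.0, 0.5])
y = ccrf_predict(x, alpha=1.0, beta=2.0)
```

Because the rows and columns of L sum to zero, the smoothed sequence keeps the mean of the inputs while shrinking sample-to-sample fluctuations, which is exactly the temporal-dependency effect the text describes.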

3. Experimental Results

3.1. Forehead EOG-Based Vigilance Estimation

First, we evaluated the similarity between forehead EOG and traditional EOG and the performance of forehead EOG-based vigilance estimation for different separation strategies. We extracted forehead VEO_f and HEO_f using the minus and ICA separation approaches and computed the correlation with traditional VEO and HEO. The mean correlation coefficients of VEO_f-MINUS, VEO_f-ICA, HEO_f-MINUS, and HEO_f-ICA are 0.63, 0.80, 0.81, and 0.75, respectively. These comparative results demonstrate that the extracted forehead VEO_f and HEO_f contain most of the principal information of traditional EOG.

The mean RMSE, the mean COR and their standard deviations for different separation methods are presented in Table 2. 'ICA-MINUS' denotes ICA-based VEO_f and minus-based HEO_f separations, which have the highest correlation coefficients with traditional VEO and HEO. As shown in Table 2, ICA-MINUS achieves the best performance for vigilance estimation in terms of both COR and RMSE. This is consistent with the above results that VEO_f-ICA and HEO_f-MINUS are more similar to the original VEO and HEO. VEO contains many blink components, such as impulses, which are more likely to be detected by ICA; in contrast, the minus method reduces the amplitude of VEO signals since the polarity of the electrode pair is the same. For HEO, saccade components are more difficult to detect with ICA, and the polarity of the electrode pair is different.

         ICA-MINUS         EOG-ICA           EOG-MINUS
         COR      RMSE     COR      RMSE     COR      RMSE
Mean     0.7773   0.1188   0.4774   0.1582   0.7193   0.1288
Std.     0.1745   0.0391   0.5381   0.0844   0.3492   0.0588

Table 2. The mean RMSE, the mean COR, and their standard deviations with different separation methods. Here, the numbers in the first and second rows are the averages and standard deviations, respectively.

3.2. EEG-Based Vigilance Estimation

We reconstructed the frontal 4-channel EEG from the forehead signals based on the ICA algorithm. In the experiments, we also recorded 12-channel and 6-channel EEG signals from posterior and temporal sites. We extracted the DE features in two ways: one from the five frequency bands, and the other using a 2 Hz frequency resolution over the entire frequency band. The mean COR, mean RMSE and their standard deviations of different EEG features from different brain areas are shown in Table 3. The ranking of the performance for EEG-based vigilance estimation from different brain areas is as follows: posterior, temporal, and forehead sites. For the single EEG modality, the posterior EEG contains the most critical information for vigilance estimation, which is consistent with previous findings (Khushaba et al. 2011, Shi & Lu 2013). The EEG features with a 2 Hz frequency resolution achieve better performance than those with five frequency bands. In the later experimental evaluation in this paper, we employ the EEG features with a 2 Hz frequency resolution over the entire frequency band.

(a) COR
         Posterior          Temporal           Forehead
         2 Hz     5 Bands   2 Hz     5 Bands   2 Hz     5 Bands
Mean     0.7001   0.6807    0.6678   0.6410    0.6502   0.5749
Std.     0.2250   0.2129    0.2349   0.2246    0.2116   0.2463

(b) RMSE
         Posterior          Temporal           Forehead
         2 Hz     5 Bands   2 Hz     5 Bands   2 Hz     5 Bands
Mean     0.1327   0.1429    0.1385   0.1603    0.1463   0.1640
Std.     0.0303   0.0393    0.0343   0.0722    0.0383   0.0483

Table 3. The averages and standard deviations of COR and RMSE for different EEG features. Here, the numbers in the first and second rows are the averages and standard deviations, respectively.


In addition to the accuracy that we discussed above for decoding brain states, another important concern is to examine whether patterns of brain activity under different cognitive states exist and whether these patterns are to some extent common across individuals. Identifying the specific relationship between brain activities and cognitive states provides evidence and support for understanding the information processing mechanism of the brain and brain state decoding (Haynes & Rees 2006). Huang et al. demonstrated specific links between changes in EEG spectral power and reaction time during sustained-attention tasks (Huang, Jung & Makeig 2009). They found that significant tonic power increases occurred in the alpha band in the occipital and parietal areas as reaction time increased. Ray and colleagues proposed that alpha activities of EEG reflect attentional demands and that beta activities reflect emotional and cognitive processes (Ray & Cole 1985). They found increasing parietal alpha activities for tasks that do not require attention.

In this work, to investigate the changes in neural patterns associated with vigilance, we split the EEG data into three categories (awake, tired, and drowsy) with two thresholds (0.35 and 0.7) applied to the PERCLOS index. We averaged the DE features over different experiments. Figure 9 presents the mean neural patterns of the awake and drowsy states as well as the difference between them. As shown in Figure 9, increasing theta and alpha frequency activities exist in parietal areas, and decreasing gamma frequency activities exist in temporal areas, in drowsy states in contrast to awake states. These results are consistent with previous findings in the literature (Ray & Cole 1985, Davidson et al. 2007, Huang et al. 2009, O'Connell et al. 2009, Peiris et al. 2011, Lin et al. 2013, Martel et al. 2014) and support the previous evidence that an increasing ratio of slow to fast EEG waves reflects decreasing attentional demands (Jap, Lal, Fischer & Bekiaris 2009).
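The three-way split by PERCLOS thresholds can be written compactly; the function name below is illustrative, while the thresholds 0.35 and 0.7 come from the text.

```python
import numpy as np

def vigilance_category(perclos):
    """Map PERCLOS values to awake/tired/drowsy using the
    thresholds 0.35 and 0.7 from the text."""
    labels = np.array(["awake", "tired", "drowsy"])
    return labels[np.digitize(perclos, [0.35, 0.7])]

states = vigilance_category(np.array([0.1, 0.5, 0.9]))
# -> ['awake', 'tired', 'drowsy']
```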

3.3. Modality Fusion with Temporal Dependency

In this section, we introduce a multimodal vigilance estimation approach with the fusion of EEG and forehead EOG. We combined the EEG signals from different sites (forehead, temporal, and posterior) and forehead EOG signals to utilize their complementary characteristics for vigilance estimation. The performance of each single modality and of different modality fusion strategies is shown in Figure 10. For a single modality, forehead EOG achieves better performance than posterior EEG. The reason for this result is that forehead EOG has more information in common with the annotations of the eye tracking data. Modality fusion can significantly enhance the regression performance in

Figure 9. The mean neural patterns of awake and drowsy states as well as the difference between these two states. By applying two thresholds (0.35 and 0.7) to the PERCLOS index, we split the EEG data into three categories: awake, tired, and drowsy. From the average neural patterns, we observe that drowsy states have higher alpha frequency activities in parietal areas and lower gamma frequency activities in temporal areas.


comparison with a single modality, with a higher COR and lower RMSE. We evaluated the statistical significance using one-way analysis of variance (ANOVA); the p values of COR for forehead EOG and posterior EEG are 0.2978 and 0.0264, respectively, and the p values of RMSE for forehead EOG and posterior EEG are 0.0654 and 0.0002, respectively.

For different brain areas, an interesting observation is that the fusion of forehead EOG and forehead EEG achieves better performance than that of forehead EOG and posterior EEG, whereas for single EEG, the posterior site achieves the best performance. These results indicate that forehead EEG and forehead EOG have more coherent information. The temporal EEG performs slightly better than the forehead EEG. However, the former requires six extra electrodes for the setup, whereas the forehead setup uses only four shared electrodes, from which both EOG and EEG features can be extracted. Therefore, the information flow can be increased without any additional setup cost. From the above discussion, we see that the forehead approach is preferable for real-world applications.

Figure 10. The mean COR and mean RMSE of each single modality and of different modality fusion strategies.

To incorporate temporal dependency information into vigilance estimation, we adopted CCRF and CCNF in this study. As shown in Figures 10(a) and (b), the temporal dependency models can enhance the performance. For the forehead setup, the mean COR/RMSE of SVR, CCRF, and CCNF are 0.83/0.10, 0.84/0.10, and 0.85/0.09, respectively. CCNF achieves the best performance, with higher accuracies and lower standard deviations.

To verify whether the predictions from our proposed approaches are consistent with the subjects' true behaviours and cognitive states, the continuous vigilance estimation of one experiment is shown in Figure 11. The snapshots in Figure 11 show the frames corresponding to different vigilance levels. We can observe that our proposed multimodal approach with temporal dependency can moderately predict the continuous vigilance levels and their trends.

Figure 11. The continuous vigilance estimation of different methods in one experiment. As shown, the predictions from our proposed approaches are almost consistent with the subjects' true behaviours and cognitive states.

To further investigate the complementary characteristics of EEG and EOG, we analysed the confusion matrices of each modality, which reveal the strengths and weaknesses of each modality. We split the EEG data into three categories, namely, awake, tired, and drowsy states, with thresholds applied to the corresponding PERCLOS index as described above. Figure 12 presents the mean confusion graph of forehead EOG and posterior EEG over all experiments. These results demonstrate that posterior EEG and forehead EOG have important complementary characteristics. Forehead EOG has the advantage of classifying awake and drowsy states (77%/76%) compared to the posterior EEG (65%/72%), whereas posterior EEG outperforms forehead EOG in recognizing tired states (88% vs. 84%). The forehead EOG modality achieves better performance than the posterior EEG overall. This result may be because our ground truth labels are obtained with eye movement parameters from eye tracking glasses, so the forehead EOG contains more information in common with the experimental observations. Moreover, awake states and tired states are often misclassified with each other, and similar results are observed for drowsy and tired states. In contrast, awake states are seldom misclassified as drowsy states, and vice versa, for both modalities. These observations are consistent with our intuitive knowledge: EEG and EOG features of awake and drowsy states should have larger differences. These results indicate that EEG and EOG have different discriminative powers for vigilance estimation. By combining the complementary information of these two modalities, modality fusion can improve the prediction performance.

Figure 12. Confusion graph of forehead EOG and posterior EEG, which shows their complementary characteristics for vigilance estimation. Here, the numbers denote the percentage of samples in the class at the arrow tail classified as the class at the arrow head. Bolder lines indicate higher values.
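The class-wise rates behind such a confusion graph can be computed from discretized predictions as in the following illustrative sketch; the toy labels are made up, and the awake/tired/drowsy classes are encoded as 0/1/2.

```python
import numpy as np

def confusion_rates(y_true, y_pred, n_classes=3):
    """Row-normalized confusion matrix: entry [i, j] is the fraction
    of class-i samples (awake=0, tired=1, drowsy=2) predicted as
    class j, i.e. the percentages shown on the graph edges."""
    M = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        M[t, p] += 1
    return M / M.sum(axis=1, keepdims=True)

# toy example: two samples per class
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 1, 2]
M = confusion_rates(y_true, y_pred)
```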

4. Discussion

In this study, we have developed a multimodal approach for vigilance estimation that incorporates temporal dependency and combines EEG and forehead EOG in a simulated driving environment. Several researchers have performed pilot studies for on-road real driving tests. Papadelis et al. designed an on-board system to assess a driver's alertness level in real driving conditions (Papadelis et al. 2007). They found that EEG and EOG are promising neurophysiological indicators for monitoring sleepiness. Haufe et al. performed a study to assess the real-world feasibility of EEG-based detection of emergency braking intention (Haufe, Kim, Kim, Sonnleitner, Schrauf, Curio & Blankertz 2014). Indeed, in addition to driving applications, there are many other scenarios that require vigilance estimation, such as students' performance in classes. Sievertsen and colleagues examined how cognitive fatigue influences students' performance on standardized tests (Sievertsen, Gino & Piovesan 2016). To evaluate the feasibility of our approach, we will apply our vigilance estimation approach to real scenarios in future work.

Considering the wearability and feasibility of a vigilance estimation device for real-world applications, we have designed a four-electrode placement on the forehead, which is suitable for attachment to a wearable headset or headband. We can collect both EEG and EOG simultaneously and combine their advantages via the shared forehead electrodes. The experimental results demonstrate that our proposed approach can achieve performance comparable to the conventional methods on critical brain areas, such as parietal and occipital sites. This approach increases the information flow with an easy setup while not considerably increasing the cost.

In recent years, substantial progress has been


made in dry electrodes and high-performance amplifiers. Several commercial EEG systems have emerged that increase usability in real-world applications (Grozea, Voinescu & Fazli 2011, Hairston, Whitaker, Ries, Vettel, Bradford, Kerick & McDowell 2014, Mullen, Kothe, Chi, Ojeda, Kerth, Makeig, Jung & Cauwenberghs 2015). It is feasible to integrate these techniques with our proposed approach to design a new wearable hybrid EEG and forehead EOG system for vigilance estimation in the future.

In this study, we focus only on vigilance estimation without considering any neurofeedback. For example, feedback could be provided to the driver in a timely manner to enhance driving safety if the vigilance detection system indicates that he or she is in an extremely tired state. An adaptive closed-loop BCI system that consists of vigilance detection and feedback is very useful in changing environments (Wu, Courtney, Lance, Narayanan, Dawson, Oie & Parsons 2010, Lin et al. 2013). How to efficiently provide and assess feedback in high-vigilance tasks should be further investigated.

Due to individual differences in neurophysiological signals across subjects and sessions, the performance of vigilance estimation models may be dramatically degraded. The generalization performance of vigilance estimation models should account for individual differences and adaptability. However, training subject-specific models requires time-consuming calibrations. To address these problems, one efficient approach is to train models on existing labelled data from a group of subjects and generalize the models to new subjects with transfer learning techniques (Pan & Yang 2010, Wronkiewicz, Larson & Lee 2015, Morioka, Kanemura, Hirayama, Shikauchi, Ogawa, Ikeda, Kawanabe & Ishii 2015, Zhang, Zheng & Lu 2015, Zheng & Lu 2016).

5. Conclusion

In this paper, we have proposed a multimodal vigilance estimation approach using EEG and forehead EOG. We have applied different separation strategies to extract VEO_f, HEO_f and EEG signals from four shared forehead electrodes. The COR and RMSE of the single forehead EOG-based and EEG-based methods are 0.78/0.12 and 0.70/0.13, respectively, whereas modality fusion with temporal dependency can significantly enhance the performance, with values of 0.85/0.09. The experimental results have demonstrated the feasibility and efficiency of our proposed approach based on the forehead setup. Our vigilance estimation method has three main advantages: both EEG and EOG signals can be acquired simultaneously with four shared electrodes on the forehead; both internal cognitive states and external subconscious behaviours are modelled through the fusion of forehead EEG and EOG; and temporal dependency is introduced to capture the dynamic patterns of the vigilance of users. From the experimental results, we have observed that increasing theta and alpha frequency activities and decreasing gamma frequency activities do exist in drowsy states in contrast to awake states. We have also investigated the complementary characteristics of forehead EOG and EEG for vigilance estimation. Our experimental results indicate that the proposed approach can be used to implement a wearable passive brain-computer interface for tasks that require sustained attention.

Acknowledgments

This work was supported in part by grants from the National Natural Science Foundation of China (Grant No. 61272248), the National Basic Research Program of China (Grant No. 2013CB329401), and the Major Basic Research Program of the Shanghai Science and Technology Committee (Grant No. 15JC1400103).

References

Baltrusaitis, T., Banda, N. & Robinson, P. (2013). Dimensional affect recognition using continuous conditional random fields, 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition, IEEE, pp. 1–8.

Baltrusaitis, T., Robinson, P. & Morency, L.-P. (2014). Continuous conditional neural fields for structured regression, Computer Vision–ECCV, Springer, pp. 593–608.

Bergasa, L. M., Nuevo, J., Sotelo, M. A., Barea, R. & Lopez, M. E. (2006). Real-time system for monitoring driver vigilance, IEEE Transactions on Intelligent Transportation Systems 7(1): 63–77.

Berka, C., Levendowski, D. J., Lumicao, M. N., Yau, A., Davis, G., Zivkovic, V. T., Olmstead, R. E., Tremoulet, P. D. & Craven, P. L. (2007). EEG correlates of task engagement and mental workload in vigilance, learning, and memory tasks, Aviation, Space, and Environmental Medicine 78(Supplement 1): B231–B244.

Bulling, A., Ward, J., Gellersen, H., Troster, G. et al. (2011). Eye movement analysis for activity recognition using electrooculography, IEEE Transactions on Pattern Analysis and Machine Intelligence 33(4): 741–753.

Calvo, R. A. & D'Mello, S. (2010). Affect detection: An interdisciplinary review of models, methods, and their applications, IEEE Transactions on Affective Computing 1(1): 18–37.

Daly, I., Scherer, R., Billinger, M. & Muller-Putz, G. (2015). FORCe: Fully online and automated artifact removal for brain-computer interfacing, IEEE Transactions on Neural Systems and Rehabilitation Engineering 23(5): 725–736.

Damousis, I. G. & Tzovaras, D. (2008). Fuzzy fusion of eyelid activity indicators for hypovigilance-related accident prediction, IEEE Transactions on Intelligent Transportation Systems 9(3): 491–500.


Davidson, P. R., Jones, R. D. & Peiris, M. T. (2007). EEG-based lapse detection with high temporal resolution, IEEE Transactions on Biomedical Engineering 54(5): 832–839.

Delorme, A. & Makeig, S. (2004). EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis, Journal of Neuroscience Methods 134(1): 9–21.

Dinges, D. F. & Grace, R. (1998). PERCLOS: A valid psychophysiological measure of alertness as assessed by psychomotor vigilance, US Department of Transportation, Federal Highway Administration, Publication Number FHWA-MCRT-98-006.

D'mello, S. K. & Kory, J. (2015). A review and meta-analysis of multimodal affect detection systems, ACM Computing Surveys 47(3): 43.

Dong, Y., Hu, Z., Uchimura, K. & Murayama, N. (2011). Driver inattention monitoring system for intelligent vehicles: A review, IEEE Transactions on Intelligent Transportation Systems 12(2): 596–614.

Duan, R.-N., Zhu, J.-Y. & Lu, B.-L. (2013). Differential entropy feature for EEG-based emotion classification, 6th International IEEE/EMBS Conference on Neural Engineering, IEEE, pp. 81–84.

Ferrara, M. & De Gennaro, L. (2001). How much sleep do we need?, Sleep Medicine Reviews 5(2): 155–179.

Gao, X.-Y., Zhang, Y.-F., Zheng, W.-L. & Lu, B.-L. (2015). Evaluating driving fatigue detection algorithms using eye tracking glasses, 7th International IEEE/EMBS Conference on Neural Engineering, IEEE, pp. 767–770.

Grier, R. A., Warm, J. S., Dember, W. N., Matthews, G., Galinsky, T. L., Szalma, J. L. & Parasuraman, R. (2003). The vigilance decrement reflects limitations in effortful attention, not mindlessness, Human Factors: The Journal of the Human Factors and Ergonomics Society 45(3): 349–359.

Grozea, C., Voinescu, C. D. & Fazli, S. (2011). Bristle-sensors: low-cost flexible passive dry EEG electrodes for neurofeedback and BCI applications, Journal of Neural Engineering 8(2): 025008.

Hairston, W. D., Whitaker, K. W., Ries, A. J., Vettel, J. M., Bradford, J. C., Kerick, S. E. & McDowell, K. (2014). Usability of four commercially-oriented EEG systems, Journal of Neural Engineering 11(4): 046018.

Haufe, S., Kim, J.-W., Kim, I.-H., Sonnleitner, A., Schrauf, M., Curio, G. & Blankertz, B. (2014). Electrophysiology-based detection of emergency braking intention in real-world driving, Journal of Neural Engineering 11(5): 056011.

Haynes, J.-D. & Rees, G. (2006). Decoding mental states from brain activity in humans, Nature Reviews Neuroscience 7(7): 523–534.

Huang, R.-S., Jung, T.-P. & Makeig, S. (2009). Tonic changes in EEG power spectra during simulated driving, Foundations of Augmented Cognition, Neuroergonomics and Operational Neuroscience, Springer, pp. 394–403.

Huo, X.-Q., Zheng, W.-L. & Lu, B.-L. (2016). Driving fatigue detection with fusion of EEG and forehead EOG, International Joint Conference on Neural Networks.

Imbrasaite, V., Baltrusaitis, T. & Robinson, P. (2014). CCNF for continuous emotion tracking in music: Comparison with CCRF and relative feature representation, IEEE International Conference on Multimedia and Expo Workshops, IEEE, pp. 1–6.

Jap, B. T., Lal, S., Fischer, P. & Bekiaris, E. (2009). Using EEG spectral components to assess algorithms for detecting fatigue, Expert Systems with Applications 36(2): 2352–2359.

Ji, Q., Zhu, Z. & Lan, P. (2004). Real-time nonintrusive monitoring and prediction of driver fatigue, IEEE Transactions on Vehicular Technology 53(4): 1052–1068.

Jung, T.-P., Makeig, S., Humphries, C., Lee, T.-W., Mckeown, M. J., Iragui, V. & Sejnowski, T. J. (2000). Removing electroencephalographic artifacts by blind source separation, Psychophysiology 37(2): 163–178.

Kang, J.-S., Park, U., Gonuguntla, V., Veluvolu, K. & Lee, M. (2015). Human implicit intent recognition based on the phase synchrony of EEG signals, Pattern Recognition Letters 66: 144–152.

Khushaba, R. N., Kodagoda, S., Lal, S. & Dissanayake, G. (2011). Driver drowsiness classification using fuzzy wavelet-packet-based feature-extraction algorithm, IEEE Transactions on Biomedical Engineering 58(1): 121–131.

Kim, I.-H., Kim, J.-W., Haufe, S. & Lee, S.-W. (2014). Detection of braking intention in diverse situations during simulated driving based on EEG feature combination, Journal of Neural Engineering 12(1): 016001.

Lafferty, J., McCallum, A. & Pereira, F. C. (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data, 18th International Conference on Machine Learning, Morgan Kaufmann.

Lee, E. C., Woo, J. C., Kim, J. H., Whang, M. & Park, K. R. (2010). A brain–computer interface method combined with eye tracking for 3D interaction, Journal of Neuroscience Methods 190(2): 289–298.

Lin, C.-T., Chuang, C.-H., Huang, C.-S., Tsai, S.-F., Lu, S.-W., Chen, Y.-H. & Ko, L.-W. (2014). Wireless and wearable EEG system for evaluating driver vigilance, IEEE Transactions on Biomedical Circuits and Systems 8(2): 165–176.

Lin, C.-T., Huang, K.-C., Chao, C.-F., Chen, J.-A., Chiu, T.-W., Ko, L.-W. & Jung, T.-P. (2010). Tonic and phasic EEG and behavioral changes induced by arousing feedback, NeuroImage 52(2): 633–642.

Lin, C.-T., Huang, K.-C., Chuang, C.-H., Ko, L.-W. & Jung, T.-P. (2013). Can arousing feedback rectify lapses in driving? Prediction from EEG power spectra, Journal of Neural Engineering 10(5): 056024.

Lu, Y., Zheng, W.-L., Li, B. & Lu, B.-L. (2015). Combining eye movements and EEG to enhance emotion recognition, International Joint Conference on Artificial Intelligence, pp. 1170–1176.

Ma, J.-X., Shi, L.-C. & Lu, B.-L. (2010). Vigilance estimation by using electrooculographic features, Annual International Conference of the IEEE Engineering in Medicine and Biology Society, IEEE, pp. 6591–6594.

Ma, J.-X., Shi, L.-C. & Lu, B.-L. (2014). An EOG-based vigilance estimation method applied for driver fatigue detection, Neuroscience and Biomedical Engineering 2(1): 41–51.

Makeig, S. & Inlow, M. (1993). Lapse in alertness: coherence of fluctuations in performance and EEG spectrum, Electroencephalography and Clinical Neurophysiology 86(1): 23–35.

Martel, A., Dahne, S. & Blankertz, B. (2014). EEG predictors of covert vigilant attention, Journal of Neural Engineering 11(3): 035009.

McMullen, D. P., Hotson, G., Katyal, K. D., Wester, B. A., Fifer, M. S., McGee, T. G., Harris, A., Johannes, M. S., Vogelstein, R. J., Ravitz, A. D. et al. (2014). Demonstration of a semi-autonomous hybrid brain–machine interface using human intracranial EEG, eye tracking, and computer vision to control a robotic upper limb prosthetic, IEEE Transactions on Neural Systems and Rehabilitation Engineering 22(4): 784–796.

Morioka, H., Kanemura, A., Hirayama, J.-i., Shikauchi, M., Ogawa, T., Ikeda, S., Kawanabe, M. & Ishii, S. (2015). Learning a common dictionary for subject-transfer decoding with resting calibration, NeuroImage 111: 167–178.


Muhl, C., Jeunet, C. & Lotte, F. (2014). EEG-based workload estimation across affective contexts, Frontiers in Neuroscience 8.

Mullen, T. R., Kothe, C. A., Chi, Y. M., Ojeda, A., Kerth, T., Makeig, S., Jung, T.-P. & Cauwenberghs, G. (2015). Real-time neuroimaging and cognitive monitoring using wearable dry EEG, IEEE Transactions on Biomedical Engineering 62(11): 2553–2567.

Nicolaou, M., Gunes, H., Pantic, M. et al. (2011). Continuous prediction of spontaneous affect from multiple cues and modalities in valence-arousal space, IEEE Transactions on Affective Computing 2(2): 92–105.

O’Connell, R. G., Dockree, P. M., Robertson, I. H., Bellgrove, M. A., Foxe, J. J. & Kelly, S. P. (2009). Uncovering the neural signature of lapsing attention: electrophysiological signals predict errors up to 20 s before they occur, The Journal of Neuroscience 29(26): 8604–8611.

Oken, B. S. & Salinsky, M. C. (2007). Sleeping and driving: Not a safe dual-task, Clinical Neurophysiology 118(9): 1899.

Pan, S. J. & Yang, Q. (2010). A survey on transfer learning, IEEE Transactions on Knowledge and Data Engineering 22(10): 1345–1359.

Papadelis, C., Chen, Z., Kourtidou-Papadeli, C., Bamidis, P. D., Chouvarda, I., Bekiaris, E. & Maglaveras, N. (2007). Monitoring sleepiness with on-board electrophysiological recordings for preventing sleep-deprived traffic accidents, Clinical Neurophysiology 118(9): 1906–1922.

Peiris, M. T., Davidson, P. R., Bones, P. J. & Jones, R. D. (2011). Detection of lapses in responsiveness from the EEG, Journal of Neural Engineering 8(1): 016003.

Peng, J., Bo, L. & Xu, J. (2009). Conditional neural fields, Advances in Neural Information Processing Systems, pp. 1419–1427.

Pfurtscheller, G., Allison, B. Z., Bauernfeind, G., Brunner, C., Escalante, T. S., Scherer, R., Zander, T. O., Mueller-Putz, G., Neuper, C. & Birbaumer, N. (2010). The hybrid BCI, Frontiers in Neuroscience 4: 3.

Ranney, T. A. (2008). Driver distraction: A review of the current state-of-knowledge, Technical report, National Highway Traffic Safety Administration, Washington, DC, USA.

Ray, W. J. & Cole, H. W. (1985). EEG alpha activity reflects attentional demands, and beta activity reflects emotional and cognitive processes, Science 228(4700): 750–752.

Rosenberg, M. D., Finn, E. S., Scheinost, D., Papademetris, X., Shen, X., Constable, R. T. & Chun, M. M. (2016). A neuromarker of sustained attention from whole-brain functional connectivity, Nature Neuroscience 19: 165–171.

Sahayadhas, A., Sundaraj, K. & Murugappan, M. (2012). Detecting driver drowsiness based on sensors: a review, Sensors 12(12): 16937–16953.

Shi, L.-C., Jiao, Y.-Y. & Lu, B.-L. (2013). Differential entropy feature for EEG-based vigilance estimation, 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, IEEE, pp. 6627–6630.

Shi, L.-C. & Lu, B.-L. (2010). Off-line and on-line vigilance estimation based on linear dynamical system and manifold learning, Annual International Conference of the IEEE Engineering in Medicine and Biology Society, IEEE, pp. 6587–6590.

Shi, L.-C. & Lu, B.-L. (2013). EEG-based vigilance estimation using extreme learning machines, Neurocomputing 102: 135–143.

Sievertsen, H. H., Gino, F. & Piovesan, M. (2016). Cognitive fatigue influences students’ performance on standardized tests, Proceedings of the National Academy of Sciences 113(10): 2621–2624.

Simola, J., Le Fevre, K., Torniainen, J. & Baccino, T. (2015). Affective processing in natural scene viewing: Valence and arousal interactions in eye-fixation-related potentials, NeuroImage 106: 21–33.

Trutschel, U., Sirois, B., Sommer, D., Golz, M. & Edwards, D. (2011). PERCLOS: An alertness measure of the past, Driving Assessment 2011: 6th.

Uriguen, J. A. & Garcia-Zapirain, B. (2015). EEG artifact removal: state-of-the-art and guidelines, Journal of Neural Engineering 12(3): 031001.

Wang, Y.-K., Jung, T.-P. & Lin, C.-T. (2015). EEG-based attention tracking during distracted driving, IEEE Transactions on Neural Systems and Rehabilitation Engineering 23(6): 1085–1094.

Wronkiewicz, M., Larson, E. & Lee, A. K. (2015). Leveraging anatomical information to improve transfer learning in brain–computer interfaces, Journal of Neural Engineering 12(4): 046027.

Wu, D., Courtney, C. G., Lance, B. J., Narayanan, S. S., Dawson, M. E., Oie, K. S. & Parsons, T. D. (2010). Optimal arousal identification and classification for affective computing using physiological signals: virtual reality Stroop task, IEEE Transactions on Affective Computing 1(2): 109–118.

Zander, T. O. & Jatzev, S. (2012). Context-aware brain–computer interfaces: exploring the information space of user, technical system and environment, Journal of Neural Engineering 9(1): 016003.

Zander, T. O. & Kothe, C. (2011). Towards passive brain–computer interfaces: applying brain–computer interface technology to human–machine systems in general, Journal of Neural Engineering 8(2): 025005.

Zhang, Y.-F., Gao, X.-Y., Zhu, J.-Y., Zheng, W.-L. & Lu, B.-L. (2015). A novel approach to driving fatigue detection using forehead EOG, 7th International IEEE/EMBS Conference on Neural Engineering, IEEE, pp. 707–710.

Zhang, Y.-Q., Zheng, W.-L. & Lu, B.-L. (2015). Transfer components between subjects for EEG-based driving fatigue detection, International Conference on Neural Information Processing, Springer, pp. 61–68.

Zheng, W.-L., Dong, B.-N. & Lu, B.-L. (2014). Multimodal emotion recognition using EEG and eye tracking data, 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, IEEE, pp. 5040–5043.

Zheng, W.-L. & Lu, B.-L. (2015). Investigating critical frequency bands and channels for EEG-based emotion recognition with deep neural networks, IEEE Transactions on Autonomous Mental Development 7(3): 162–175.

Zheng, W.-L. & Lu, B.-L. (2016). Personalizing EEG-based affective models with transfer learning, International Joint Conference on Artificial Intelligence, pp. 2732–2738.