
WHITE PAPER

© iMotions - Emotion Technology A/S . Denmark, India, USA . [email protected] . www.imotionsglobal.com CONFIDENTIAL Redistribution is not permitted without written permission from iMotions. 1/20

Validation of iMotions’ Emotion Evaluation System embedded in Attention Tool® 3.0

Jakob de Lemos, Golam Reza Sadeghnia, Íris Ólafsdóttir, Ole Jensen

Version 2 - Apr 2010

Keywords: emotions, emotional response, emotion technology, non-invasive physiological measurements, visual attention, arousal, emotional activation, decision making.

Abstract: This paper describes how Attention Tool® can be used to measure human emotions and which statistical outputs the tool provides for each tested visual stimulus. The method Attention Tool uses to measure emotional strength, also known as physiological arousal, is based on pupil size variation, eye blink pattern and gaze behavior. The method is evaluated against galvanic skin response (GSR) recordings, a well-known and widely used method for measuring arousal. The comparison of the two methods shows that Attention Tool is just as good an evaluator of physiological arousal as the currently used GSR.

Introduction

Emotions are a multifaceted and complex phenomenon with many different conceptualizations and theories (Scherer, 2000). One group of theories considers emotions to be biologically determined responses that were attained through evolutionary challenges (Cosmides & Tooby, 2000). Other theories consider emotions to be based more on learning and cognitive evaluation (Scherer et al., 2001). Despite these theoretical differences, most emotion researchers agree that emotions are manifested and can be assessed in relation to a subjective (experiential), a physiological (bodily) and a behavioral (acting) dimension (Lang, 1988). Research indicates that emotions play an important role in adjusting advantageously to the environment, and are critical for decision making, problem solving and rational behavior in everyday life (Damasio, 1994).

Emotions affect decision making and behavior indirectly. A strong emotional response to an advertisement or a product design can be the prevailing factor at the moment of deciding to buy one product over another. People are often not conscious of their emotional response; it is therefore more accessible through psychophysiological measurements than through self report or questionnaires. Self report and questionnaires also have an innate tendency to make respondents reflect on the answer instead of giving an immediate response. Cognition (thoughts and their associations) is unavoidable when using self report and questionnaires, and will often give misleading results if the intention is to measure the immediate emotional response. Attention Tool can access the physiological response connected to an emotional response quickly and efficiently through eye movements, eye blink pattern and pupil size variation. The result is a good indication of the emotional effect that an advertisement or other visual stimulus may have on the respondent on a subconscious level.

Attention Tool is the first and currently the only non-intrusive tool on the market that can make psychophysiological measurements for emotional response evaluation. Intrusive tools such as functional magnetic resonance imaging (fMRI), electroencephalography (EEG), positron emission tomography (PET) and galvanic skin response (GSR) can measure different aspects of human emotions. GSR is a very good indicator of the strength of an emotion, broadly known as physiological arousal (called Emotional Activation in Attention Tool), and has been used in lie detector apparatus (the polygraph) for 100 years. The problem with GSR is that it is intrusive: electrodes must be attached to the body of the respondent. In this paper, Attention Tool is compared to GSR measurements to validate the arousal estimation system embedded in the tool.

Attention Tool provides a range of statistical information on tested stimuli. For a group of no fewer than ten respondents, the average Emotional Activation estimate for the group is provided along with the corresponding confidence interval and data quality. Furthermore, Attention Tool 3.0 provides a statistical hypothesis test of whether the difference in the average arousal estimate for two different stimuli is significant.

In the validation study, 66 stimuli were shown to 50 males and 50 females while eye tracking data and GSR data were recorded simultaneously. Attention Tool and GSR were each used to classify stimuli into one of two groups: high arousal or low arousal. The classification results from the two methods were consistent. For females, 20 out of 24 stimuli were classified into the correct category, and for males 16 out of 17. The highest Pearson’s correlation coefficient observed was ρ = 0.84. According to this study, Attention Tool 3.0 can be considered at least as reliable an arousal evaluator as the best GSR measurement systems on the market today.
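The classification counts above translate directly into agreement rates. The following minimal Python sketch only reproduces that arithmetic; the function name `agreement` is illustrative and is not part of Attention Tool.

```python
def agreement(correct: int, total: int) -> float:
    """Fraction of stimuli that both methods placed in the same arousal class."""
    return correct / total


# Counts as reported in the study summary.
female_rate = agreement(20, 24)
male_rate = agreement(16, 17)

print(f"females: {female_rate:.1%}")  # females: 83.3%
print(f"males: {male_rate:.1%}")      # males: 94.1%
```

Rounded to whole percentages, these are the 83% and 94% figures quoted for females and males.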

Table of contents

Measuring emotions
    What is an emotion?
    How can emotions be measured?
    How do emotions affect decision making?
Attention Tool approach to emotional response evaluation
    Statistical output
Validation of emotion evaluation method in Attention Tool
    Study design and performance
        Participants
        Stimuli selection
        Procedure
        Equipment setup
        Eye tracker data processing
        Signal processing of galvanic skin response data
    EPOC study results: Validation of Attention Tool
        Precision of galvanic skin response and eye tracking
        A brief note on Arousal calculations
        Confusion matrix accuracy of low and high arousal classification
        Linear regression of measurements
Conclusion
References

Measuring emotions

Before describing how Attention Tool can access emotional response, it is relevant to answer a few questions: what is an emotion, how can it be measured, and how does it affect decision making? A brief discussion of these issues follows.

What is an emotion?

It is generally accepted that emotions are elicited by a specific situation, person or object (real or imagined). An emotion includes changes in three different reactive systems:

• there is an experience of emotion, often referred to as feelings (Damasio, 1994)

• emotions are accompanied by expressive display, e.g. postures, gestures, facial and vocal expressions (Ekman, 1971; Scherer, 1986)

• emotions are accompanied by bodily responses that comprise changes in the somatic and autonomic nervous system, as well as in the endocrine and immune system. These, in turn, modify specific psychophysiological responses such as reflex, cardiovascular, electrodermal, gastrointestinal or pupillary activity (Cacioppo, Tassinary & Berntson, 2000)

Many researchers agree that emotions can be assessed and measured in relation to a limited number of dimensions (Christie & Friedman, 2004). One of the most relevant is arousal, which is the intensity connected to the elicited emotion (Lang, 1995). Another dimension often used is valence, which groups emotions along a pleasant (positive) to unpleasant (negative) scale. Thus, from a dimensional perspective, emotions are often considered subordinate divisions in a coordinate space of valence (pleasant or unpleasant emotion) and arousal (intensity) (Lang et al., 1993).

How can emotions be measured?

A few methods exist for accessing emotional response by monitoring bodily reactions. These reactions are reflected in skin conductance and pupillary response, amongst other peripheral markers, and are to a high degree controlled by the amygdala in the limbic system. The following are some of the most widely used methods for measuring bodily reactions following the experience of an emotion.

fMRI is a neuroimaging technique that measures the hemodynamic response related to neural activity in the brain, as a method of observing which areas of the brain are active at any given time (functional imaging). Blood releases oxygen to active neurons at a greater rate than to inactive neurons. The difference between oxygenated and deoxygenated blood leads to variations in magnetisation, which can be detected using an MRI scanner. The test subject lies in a tube, which makes it virtually impossible to imitate a natural experimental setup. Additionally, MRI equipment is expensive and requires highly trained staff.

EEG is the measurement of electrical activity produced by the brain, recorded from electrodes placed on the scalp, so the method is highly intrusive. Furthermore, the signal is always disrupted by other electrical signal sources in the body, such as muscular activity. EEG can be used to measure different aspects of emotions, but the equipment is quite expensive and requires highly trained staff.

GSR is related to the electrodermal system. It is sensitive to rapid changes in the hydration of the skin and is typically recorded from the surface of the fingers (Hugdahl, 1995). The response is secondary to a hormonal change induced by the sympathetic division of the autonomic nervous system. There is a relationship between sympathetic activity and the intensity of an emotion, although the response is not known to identify the specific emotion being elicited. GSR equipment is quite simple to use and has a low cost. The disadvantage is that GSR is intrusive, as electrodes must be attached to the fingers. The GSR signal is also highly sensitive to body movements, making it quite difficult to use in a natural experimental setup. The respondent is required not to move any part of the body or breathe deeply during the recording.

PET is an imaging technique that uses small amounts of radioactive substances injected into the blood stream. The scanner traces these substances as blood flow in an area increases with brain activity. The technique results in precise functional images, but both equipment and maintenance are very costly. PET requires injection of a tracer and is therefore highly intrusive as well as invasive. PET is also considered to be unhealthy if a subject is repeatedly exposed, as the injected tracer is a radioisotope.

Eye tracking (Attention Tool) is primarily used to measure eye gaze, but can also measure pupil size and blinks with high precision. These parameters reflect, among other things, the immediate emotional response and may indicate interest in the subject of attention and/or sexual stimulation (Hess, Eckhard & Polt 1960). Eye blink rate and pupil dilation have been found to be indicators of cognitive processing as well as of the level of emotional arousal (Cramon, 1977). The eye trackers available on the market today are non-intrusive and relatively low cost.

How do emotions affect decision making?

Decision making has until recently been viewed as a purely cognitive process. It was assumed that people perform a rational cost-benefit analysis, weighing all possible alternatives and then making a decision based on that analysis. Emotions were not considered helpful, only a disturbance. However, the blooming interest over the last decades in the role of emotions in decision making tells us otherwise.

Recent research has shown that emotions are critical for advantageous adjustment to the environment, as well as for decision making and problem solving (Forgas, 1995; Damasio, 1994). Background emotions, which are subconscious, are constantly associated with the new situation being experienced. They continuously work as a subconscious, automatic guide in decision making. These are called somatic markers (Damasio, 1994).

During decision making, the somatic markers from the related reward- and punishment-associated experiences are summed up to produce a net somatic state, which automatically excludes many alternatives. The conscious mind is then left with a small number of alternatives to be considered. If this automatic mechanism were not there, the conscious mind would be overloaded with information. The role of somatic markers (emotions) is twofold: the first is to automatically reduce the number of behavior alternatives by retrieving the emotional states previously encountered in similar experiences, and the second is to mark the new situation with a new, modified somatic marker. The somatic marker consists of the emotions that are associated with the situation or object.

Emotions also affect visual attention and the processing of visual information. Some studies suggest that the valence of an emotion determines the nature of subsequent information processing. For example, unpleasant negative emotions are thought to narrow the attention focus, whereas pleasant positive emotions broaden it (Isen, 1999). The human memory system also depends on emotional information, both at an encoding and a retrieval level. The attention-grabbing nature of emotionally arousing objects often leads to a stronger focus on these, and emotional arousal has been shown to enhance declarative memory (Kensinger and Schacter, 2005).

Given the association of emotions with recall and decision making, the intensity of the immediate emotion, known as arousal or, in Attention Tool, as Emotional Activation, gives much information about a tested stimulus. This is of great value when analyzing the impact of a visual stimulus on an emotional and subconscious level.

Attention Tool approach to emotional response evaluation

Knowledge about the relation between emotional response and pupillary reaction has been available since the seventies. This knowledge was not utilised until recently, when reliable eye tracking hardware capable of measuring pupil size, eye blinks and gaze coordinates with high accuracy became available on the market. Today, dozens of companies provide high quality eye trackers at a reasonable price, which is accelerating research in this field dramatically.

Attention Tool can measure the emotional response of a group to exposed stimuli with good precision. From the gaze movements, pupil size variation and blink characteristics it is possible to evaluate the arousal level of a respondent. This method has been successfully compared to GSR measurements, where the rate of correct high and low arousal classification was observed to be 83% for females and 94% for males (N = 45 on average). The precision of the Attention Tool emotion evaluation system and a proof of concept are described in detail in the next chapter. Attention Tool provides statistical information about the emotional response to the tested stimuli, which is described below.

Statistical output

The emotional response is evaluated for a stimulus exposed to a group of no fewer than 10 valid respondents. It is recommended to use at least 30 respondents of the same gender to evaluate the average arousal level for the stimulus. In Attention Tool 3.0 the arousal measure is referred to as Emotional Activation, indicating the strength of the emotion elicited by the presented stimulus. Attention Tool 3.0 provides the following statistics of emotional response:

• Average arousal estimate (Emotional Activation) of the group

• Standard deviation in arousal for the group

• 95% confidence interval for mean arousal

• Affectivity given as number of respondents below (unaffected) and above or equal to (affected) 5.0 in Emotional Involvement

• Statistical hypothesis test of whether the difference in the average arousal estimate for two different stimuli is significant at a 95% confidence level

• Data quality as the percentage of valid measurements for each stimulus

The average arousal, or Emotional Activation, tells how strong an emotional reaction the group experienced on average to the exposed stimulus. The scale is linear and ranges from zero, indicating no reaction, to ten, indicating a strong emotional response.

The 95% confidence interval of the Emotional Activation is given in the user interface of the software. Additional statistical information is provided in a tabulated text file output, including the 95% and 90% confidence intervals, the standard deviation of the Emotional Activation and number of valid respondents. This information can be used to make further statistical tests and compare results between studies and segments.

Affectivity is calculated from the individual arousal estimates. It indicates what fraction of the respondents experienced a strong emotional response (arousal ≥ 5) when exposed to the stimulus.
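The group statistics listed above can be illustrated with a short Python sketch. It is not the Attention Tool implementation: the paper does not state how the confidence interval is computed, so the normal approximation (mean ± 1.96 · SD / √n) is an assumption, and the function name and the sample scores are hypothetical.

```python
import math
import statistics


def group_statistics(arousal, threshold=5.0, z=1.96, min_respondents=10):
    """Summarise individual arousal estimates (scale 0 to 10) for one stimulus.

    Assumption: the 95% confidence interval uses a normal approximation;
    the paper does not specify the exact method.
    """
    n = len(arousal)
    if n < min_respondents:
        raise ValueError(f"need at least {min_respondents} valid respondents, got {n}")
    mean = statistics.mean(arousal)
    sd = statistics.stdev(arousal)        # sample standard deviation
    half_width = z * sd / math.sqrt(n)    # half-width of the confidence interval
    affected = sum(1 for a in arousal if a >= threshold)
    return {
        "n": n,
        "mean": mean,
        "sd": sd,
        "ci95": (mean - half_width, mean + half_width),
        "affected": affected,             # arousal >= threshold
        "unaffected": n - affected,       # arousal < threshold
    }


# Hypothetical arousal estimates for ten respondents.
scores = [3.2, 6.1, 5.0, 4.4, 7.3, 2.9, 5.8, 6.6, 4.1, 5.5]
stats = group_statistics(scores)
print(stats["mean"], stats["ci95"], stats["affected"], stats["unaffected"])
```

With these hypothetical scores, six respondents count as affected and four as unaffected, because a score of exactly 5.0 falls in the affected group.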

When comparing two stimuli, it is interesting to know whether the average arousal estimates for the stimuli are significantly different. One way to get an indication of this is to compare the confidence intervals and see if they overlap. A more reliable method is to perform a test of significance on the difference between the average arousal estimates of the stimuli. Attention Tool 3.0 provides a comparison matrix for all tested stimuli, telling whether the null hypothesis was accepted (indicating no significant difference) or rejected (indicating that there is a difference). This comparison is performed using a Z-test at a 95% confidence level, corresponding to a two-tailed significance level of α = 0.05 (0.025 in each tail).
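As an illustration of such a comparison, the sketch below runs a two-sample Z-test on summary statistics. It assumes the two groups of arousal estimates are independent and large enough for the normal approximation; the paper does not describe the exact test variant used in Attention Tool, and the function name and the input numbers are hypothetical. A two-tailed test at the 95% confidence level leaves 0.025 in each tail, so the null hypothesis is rejected when |z| exceeds roughly 1.96.

```python
import math


def two_sample_z_test(mean_a, sd_a, n_a, mean_b, sd_b, n_b, alpha=0.05):
    """Two-tailed Z-test for a difference between two mean arousal estimates.

    Returns the z statistic, the two-tailed p-value and whether the null
    hypothesis of equal means is rejected at level alpha.
    Assumption: independent samples with known or well-estimated SDs.
    """
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = (mean_a - mean_b) / standard_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)), the two-tailed p-value.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value, p_value < alpha


# Hypothetical summary statistics for two stimuli, 30 respondents each.
z1, p1, reject1 = two_sample_z_test(6.0, 1.5, 30, 5.0, 1.5, 30)
z2, p2, reject2 = two_sample_z_test(5.2, 1.5, 30, 5.0, 1.5, 30)
print(f"z = {z1:.2f}, p = {p1:.4f}, reject H0: {reject1}")
print(f"z = {z2:.2f}, p = {p2:.4f}, reject H0: {reject2}")
```

In the first hypothetical pair the difference of one scale point is significant; in the second the difference of 0.2 is not.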

Data quality is calculated from the amount of valid data for each stimulus. If the respondent looks away from the screen, blinks a lot or for any other reason causes poor data collection, he or she is defined as an outlier and excluded from the data set.
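A possible reading of this data quality measure is sketched below in Python. The paper gives no validity threshold and does not define a valid measurement precisely, so everything here is an assumption: invalid samples (blinks, gaze off screen) are represented as `None`, a respondent is excluded when less than a hypothetical 80% of samples are valid, and data quality is the percentage of respondents retained.

```python
def valid_fraction(samples):
    """Share of samples in one recording that are usable (not None)."""
    if not samples:
        return 0.0
    return sum(s is not None for s in samples) / len(samples)


def data_quality(recordings, min_valid=0.8):
    """Percentage of respondents whose recording passes the validity cutoff.

    `recordings` maps a respondent id to a list of pupil samples.
    The cutoff `min_valid` is hypothetical; the paper does not state one.
    """
    if not recordings:
        raise ValueError("no recordings supplied")
    kept = [rid for rid, samples in recordings.items()
            if valid_fraction(samples) >= min_valid]
    return 100.0 * len(kept) / len(recordings), kept


# Hypothetical pupil-size samples (mm) for one stimulus; None marks lost data.
recordings = {
    "r1": [3.1, 3.0, 3.2, 3.1, 3.0],
    "r2": [3.1, None, None, 3.0, None],   # mostly lost: treated as an outlier
    "r3": [2.9, 3.0, None, 3.1, 3.0],
    "r4": [3.3, 3.2, 3.2, 3.4, 3.3],
}
quality, kept = data_quality(recordings)
print(quality, kept)  # 75.0 ['r1', 'r3', 'r4']
```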

Validation of emotion evaluation method in Attention Tool

The arousal evaluation in Attention Tool has been compared to GSR for verification. This chapter describes a study that iMotions designed to prove the concept of iMotions’ emotion evaluation system (iEE). The study is referred to as the EPOC study (Extended Proof of Concept).

Study design and performance

The study took place in a laboratory at iMotions’ headquarters in Copenhagen during 2006 and 2007. The selection of participants, the stimuli and the study procedure are described below.

Participants

One hundred persons with normal physiological and psychological health, using no medication and capable of being measured for GSR response on the second phalanx of the index and middle fingers of the left hand, were selected for testing. Correctly functioning vision was an important factor; therefore, only respondents with no ophthalmologic diseases or congenital anomalies of the vision or eyes were selected. The group was equally divided into 50 male and 50 female respondents, as previous studies have demonstrated important gender differences in emotional reactions (Lang et al., 1993). All subjects were Danish citizens and fluent in Danish. They ranged from 18 to 49 years of age, with an average age of 33 years in each gender segment.

Stimuli selection

Using the norms provided by Lang et al. (2005), 45 pictures from the International Affective Picture System (IAPS) database and 21 advertisements were selected. The IAPS database combines emotionally charged color photographs, designed to elicit pleasant (positive) and unpleasant (negative) emotions, with neutral pictures that cause less emotional response. All 900 stimuli in the IAPS database have been evaluated on the arousal and valence dimensions using a rating system called the Self-Assessment Manikin (SAM), through a series of experiments carried out by Peter J. Lang in 1988. Using SAM, the respondents were asked to rate their emotional response to the photographs on the arousal and valence dimensions immediately after being presented with the stimulus. The average and standard deviation of the ratings derived in Peter J. Lang’s study are used to select a subset of the IAPS pictures for the iMotions EPOC study.

The criteria for the selected stimuli are:

• Maximum standard deviation for arousal according to the IAPS database is less than or equal to 2.0 (Lang, Bradley & Cuthbert, 2005)

• The variance in the average arousal ratings across all stimuli was to be maximized by selecting photographs of low, medium and high arousal values

• The variance in valence across all stimuli was kept high
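The first two criteria can be sketched in Python as a filter followed by a spread-out pick. This is an illustrative heuristic, not the procedure the authors used: it keeps pictures whose arousal standard deviation is at most 2.0 and then takes evenly spaced ranks so that low, medium and high arousal values are all represented. The class, the function and the picture ids and ratings are hypothetical and are not IAPS data; the valence criterion is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Picture:
    pic_id: str          # hypothetical identifier, not an IAPS number
    arousal_mean: float  # mean SAM arousal rating
    arousal_sd: float    # standard deviation of the SAM arousal rating


def select_pictures(pictures, k, max_sd=2.0):
    """Pick k pictures with arousal SD <= max_sd, spread across the arousal range."""
    eligible = sorted(
        (p for p in pictures if p.arousal_sd <= max_sd),
        key=lambda p: p.arousal_mean,
    )
    if k >= len(eligible):
        return eligible
    if k == 1:
        return [eligible[len(eligible) // 2]]
    # Evenly spaced ranks from the lowest to the highest arousal mean.
    step = (len(eligible) - 1) / (k - 1)
    return [eligible[round(i * step)] for i in range(k)]


# Hypothetical candidate pool.
pool = [
    Picture("a", 1.6, 1.2), Picture("b", 2.8, 1.9), Picture("c", 4.1, 2.4),
    Picture("d", 5.0, 2.0), Picture("e", 6.3, 1.7), Picture("f", 7.5, 2.1),
    Picture("g", 7.2, 1.8),
]
chosen = [p.pic_id for p in select_pictures(pool, k=3)]
print(chosen)  # ['a', 'd', 'g']
```

Pictures "c" and "f" are dropped by the standard deviation limit, and the three picks cover the low, middle and high end of the remaining range.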

Separate norms of arousal and valence exist for men and women in the IAPS database (Lang et al., 1993). For females the minimum and maximum average SAM arousal ratings are 1.87 (IAPS no. 7175) and 7.77 (IAPS no. 3000). For males the least and most arousing pictures have average SAM arousal ratings of 1.55 (IAPS no. 7010) and 8.25 (IAPS no. 4210), respectively.

The arousal ratings of the pictures selected for the EPOC study range from 1.97 (IAPS no. 7010) to 7.56 (IAPS no. 6230) for females, and from 1.55 (IAPS no. 7010) to 7.76 (IAPS no. 4800) for males. The valence ratings range from 1.86 (IAPS no. 3010) to 7.86 (IAPS no. 8200) for females, and from 1.80 (IAPS no. 3120) to 8.21 (IAPS no. 4180) for males. Moreover, the pictures represent many categories in order to span a large spectrum of emotions. The content varies from mutilated bodies, erotic pictures, and social and ethical content to household objects, food, sports and typical print advertisements. The IAPS SAM reporting method is based on self evaluation and not on physiological measurements. Therefore it is only used to validate our use of the IAPS system.

Procedure

The respondent was welcomed and escorted to the test room. He or she was seated in a comfortable chair at a distance of approximately 70 cm from the monitor. The room was sound attenuated and dimly lit, with a fairly consistent temperature. For each testing session, the same female instructor calibrated the eye measurement equipment and attached electrodes for recording skin conductance. The respondent’s left arm was placed on a table and two electrodes of type EL507 with isotonic gel (manufactured by Biopac Systems) were attached to the second phalanx of the middle and index fingers. The respondent was then instructed to sit as still as possible, as body movements may influence GSR recordings. The respondent was also asked not to look away from the screen, as this may lead to unusable eye tracking data.

The data collection started with a gaze calibration of the respondent. After a successful gaze calibration, the respondent was told that a series of pictures would be displayed, each for an equal amount of time. Before the stimulus images, there was a series of five intensity images (experienced as grey noise) for light calibration purposes. Then, for each stimulus, there was a block consisting of one black interslide of two seconds duration, one greyscale interslide of fifteen seconds duration, the stimulus slide of six seconds duration, one SAM arousal scale pictogram slide and finally one SAM valence scale pictogram slide. The SAM slides had no time restriction: they were designed to wait for a rating from the user before proceeding to the next slide.

The greyscale interslide preceding each stimulus slide was created by scrambling that stimulus slide. It served as light adaptation for the respondent, as well as a psychological pause between stimuli. The total interslide duration of 17 seconds was varied randomly by a couple of seconds, to counteract the respondent’s tendency to learn when to expect stimulus onset. The duration was also chosen to allow the skin conductance responses to return to baseline (Bechara et al., 1996). Each stimulus was presented for six seconds, the same time as in previous related studies (Bradley et al., 2001; Lang et al., 1993).
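The slide block described above can be written down as a small schedule generator. The Python sketch below is illustrative only: the paper says the interslide duration varied "by a couple of seconds" without giving a distribution, so a uniform jitter of ±2 seconds applied to the greyscale interslide is an assumption, as are the function names. The untimed SAM rating slides are left out of the timing.

```python
import random

BLACK_S = 2.0      # black interslide duration, seconds
GREY_S = 15.0      # nominal greyscale interslide duration, seconds
STIMULUS_S = 6.0   # stimulus slide duration, seconds


def build_schedule(stimuli, jitter_s=2.0, seed=None):
    """Return one timed block per stimulus, in random presentation order.

    Assumption: the jitter is uniform in [-jitter_s, +jitter_s] and is
    applied to the greyscale interslide only.
    """
    rng = random.Random(seed)
    order = list(stimuli)
    rng.shuffle(order)
    return [
        {
            "stimulus": stim,
            "black_s": BLACK_S,
            "grey_s": GREY_S + rng.uniform(-jitter_s, jitter_s),
            "stimulus_s": STIMULUS_S,
        }
        for stim in order
    ]


def timed_duration(schedule):
    """Total duration in seconds of the timed slides (SAM slides excluded)."""
    return sum(b["black_s"] + b["grey_s"] + b["stimulus_s"] for b in schedule)


stimuli = [f"stimulus_{i:02d}" for i in range(1, 67)]   # 66 placeholder names
schedule = build_schedule(stimuli, seed=1)
total_minutes = timed_duration(schedule) / 60
print(f"{len(schedule)} blocks, about {total_minutes:.1f} minutes of timed slides")
```

Without jitter the timed slides add up to 66 × 23 s = 1518 s, roughly 25 minutes, before any time spent on the SAM ratings.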

Figure 1 shows the slide order during data collection. The light calibration slides and gaze calibration are shown once, followed by the interslide, stimulus slide and SAM rating slides for each of the 66 stimuli, as the arrow indicates.

Immediately after the display of each stimulus slide, the respondent evaluated valence and arousal with the SAM rating system, using a PC keyboard. The SAM arousal and valence scales were displayed on the screen one after the other, illustrated by pictograms (see Figure 1). It was ensured that every participant was familiar with this system, and the use of the scales was demonstrated with a few sample pictures. These were intended to illustrate neutral and extreme ratings in either valence direction.

After the use of the scales had been explained, the respondent rated four pictures in a short practice session. After the practice session, the instructor ensured that the respondent understood the procedure and then left the room. The 66 selected IAPS and advertisement pictures were presented automatically in random order.

Figure 2 shows the test situation. The respondent is sitting in front of the eye tracker, while her GSR and eye properties are being measured. Using the keyboard with her right hand she rates the pictures on the SAM scales immediately after exposure.

When the respondent had completed the study session, the instructor entered the room for a short debriefing and noted the respondent’s demographic information. The session was video recorded for later analysis and typically lasted between 30 and 45 minutes.

Equipment setup

The stimuli were presented on a 17 inch Tobii computer monitor with built-in eye tracking sensors (Tobii model 1750 [1]). The presentation was carried out by the software package E-Prime, a computer program designed for behavioral experiments. Three sorts of data were recorded. E-Prime registered the respondent’s SAM ratings for each picture and recorded eye tracking data through the Tobii eye tracker. The Biopac GSR equipment [2], a professional system for physiological recordings, recorded the GSR data. Every time a target stimulus was presented, a marker signal was sent from E-Prime to the GSR software program (Biopac AcqKnowledge version 3.8.1) for synchronization. The setup is shown in Figure 3.

Figure 3 illustrates the communication between the hardware used in the EPOC study. GSR data were recorded on the laptop, eye tracker data were recorded on a stationary computer, and a synchronization signal connected the two machines through the Biopac system.

[1] See http://www.tobii.com
[2] The Biopac modules MP100ACE, UIM100C and GSR100C were used.

Eye tracker data processing

Eye tracker data was exported from E-Prime into an Emotion Tool (former version of Attention Tool) compliant database and loaded into Emotion Tool for emotional response classification. The classification was performed using iMotions’ emotion evaluation system, iEE 2.0, on which Attention Tool 3.0 is based. Signal processing of GSR data was a manual process, carried out in accordance with the guidelines of Kenneth Hugdahl (see Hugdahl, 2001).

Signal processing of galvanic skin response data

The GSR signal was filtered to remove small fluctuations that do not belong to the galvanic skin response frequency spectrum. Figure 4 shows an example of the filtered GSR signal from stimulus onset until five seconds after onset.

Figure 4, an example of a filtered galvanic skin response signal. A is the stimulus onset time, B is the time instant where the signal is at maximum, C is where the search for a maximum starts and D where it ends.

The amplitude of the response is defined as the peak of the GSR signal minus the minimum value of the signal. First, the peak is found in the interval from one second to five seconds after stimulus onset. Then, the minimum value is found in the interval from stimulus onset until the time instant of the peak. Writing G(t) for the filtered GSR signal at time t (in seconds), the amplitude is

$$\text{amplitude} = G(B) - \min_{A \le t \le B} G(t), \qquad B = \arg\max_{C \le t \le D} G(t),$$

where A is the stimulus onset time and B is the time instant at which the GSR signal is at its maximum between instants C and D, with C = A + 1 and D = A + 5.
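The peak-minus-minimum amplitude described above can be sketched as follows. This is a minimal illustration and not the manual procedure actually used in the study; the function name `gsr_amplitude` and the representation of the signal as two parallel lists of sample times and values are assumptions.

```python
def gsr_amplitude(times, values, onset):
    """Peak-minus-minimum amplitude of one GSR response.

    times  -- sample times in seconds, in ascending order
    values -- filtered GSR samples, same length as times
    onset  -- stimulus onset time A in seconds
    """
    c, d = onset + 1.0, onset + 5.0  # search window: C = A + 1, D = A + 5
    window = [(t, v) for t, v in zip(times, values) if c <= t <= d]
    if not window:
        raise ValueError("no samples between onset + 1 s and onset + 5 s")
    # B is the time instant of the maximum inside [C, D].
    peak_time, peak_value = max(window, key=lambda tv: tv[1])
    # The minimum is searched from stimulus onset A up to the peak at B.
    minimum = min(v for t, v in zip(times, values) if onset <= t <= peak_time)
    return peak_value - minimum


# Illustrative samples only (not study data): one sample per second.
times = [0, 1, 2, 3, 4, 5, 6]
values = [2.0, 1.5, 1.8, 3.0, 2.5, 2.0, 4.0]
print(gsr_amplitude(times, values, onset=0))
```

In this toy signal the value at t = 6 is larger than the detected peak, but it lies outside the search window and is therefore ignored.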

A large, high-amplitude fluctuation prior to stimulus onset might indicate heavy respiration, physical movements by the respondent or perhaps a technical disruption of the signal. Any of these may distort the response measured after stimulus onset; such a response was therefore marked as an outlier and omitted from the average response calculations.

The respondent was also video recorded during the data collection for later analysis and outlier detection. The video recordings were used to identify any obvious disturbance causing an invalid GSR recording, such as the body movements or heavy respiration mentioned above.

EPOC study results: Validation of Attention Tool

The results from the EPOC study are presented in this chapter. First, the precision of the two types of measurement equipment is compared, to give an overview of the limitations of the techniques and the pros and cons of galvanic skin response and eye tracking. Next, the correlation between the simultaneously acquired eye tracking and galvanic skin response measurements is investigated. This is done by looking at how the iMotions emotion evaluation system behaves in the high and low arousal regions, computing the confusion matrix accuracy of iEE versus GSR. Finally, the whole arousal range is investigated by performing a simple linear regression between iEE arousal and GSR arousal.

The comparison proves iMotions’ emotion evaluation system encapsulated in Attention Tool 3.0 to be a good quantitative arousal estimator.

Precision of galvanic skin response and eye tracking

Before comparing the ability of the two methods to estimate arousal, it is relevant to compare the precision of the data collected by galvanic skin response measurements and eye tracking.

The simultaneously collected skin conductance and eye tracking data from the EPOC study are normalised, and the coefficient of variation (Cv) of the measurements is compared on a feature level. The extracted features are the basis for arousal calculations in both measurement techniques. The coefficient of variation is calculated as

$$C_v = \frac{\sigma}{\mu}$$

where σ is the standard deviation and µ is the average value of the feature.

The coefficient of variation is a normalised, dimensionless measure of dispersion that can be used to compare the relative standard deviation of datasets from different sources. It is given as a ratio and indicates whether the individual measurements are in agreement with the average of the sample. It is desirable that the Cv is less than unity, meaning that the standard deviation of a set of measurements is smaller than the average of the measurements. The higher the Cv, the less reliable the individual measurements in a sample are with regard to the average of the sample.
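As a small worked example, the coefficient of variation can be computed as below. The paper does not state whether the population or the sample standard deviation was used; the population form is assumed here, and the function name and the sample values are purely illustrative.

```python
import statistics


def coefficient_of_variation(samples):
    """Return sigma / mu for a list of feature values."""
    mean = statistics.fmean(samples)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    # ASSUMPTION: population standard deviation (pstdev), not sample (stdev).
    return statistics.pstdev(samples) / mean


# Illustrative values only (not study data): mean 5, standard deviation 2.
feature_values = [2, 4, 4, 4, 5, 5, 7, 9]
print(coefficient_of_variation(feature_values))
```

A result below 1, as in this example, corresponds to the desirable case in which the standard deviation is smaller than the average.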

Data from the 66 stimuli in the EPOC study are used to calculate the Cv of all fundamental features for computing arousal from eye tracking and galvanic skin response measurements. The Cv of these features is plotted in Figure 5 as a function of the normalised average feature values. The comparison shows that the relative standard deviation of the skin response measurements is on average four times as high as that of the eye tracking measurements. In fact, the standard deviation of skin response amplitudes is about twice as high as their average. For eye tracking features, the opposite can be observed: Figure 5 shows that the standard deviation of the main features used for computing arousal from eye tracking is less than half the average value of the features. The Cv of eye tracking features is also notably more stable, staying around the same level throughout the range of average feature values per stimulus, as opposed to the strongly varying Cv of skin response features.

Figure 5, coefficient of variation for features from eye tracking and skin conductance. The features for eye tracking have a much lower dispersion.

Assuming that both types of equipment are capable of measuring physiological arousal triggered by visual stimuli, and that the given features are the basis for calculating emotional arousal, we can conclude that the data from the eye tracker are better suited for delivering consistent, low-dispersion features that can be used for arousal estimation.

A brief note on arousal calculations

For calculating arousal with the galvanic skin response equipment, the natural logarithm of the GSR amplitudes is used.

The iMotions emotion evaluation system, on the other hand, is considered to be a black box: eye tracking features are fed to the input, and the output is the average and standard deviation of iEE 2.0 arousal.

Figure 6, iEE 2.0 is considered to be a black box unit in this white paper. The inputs are features extracted from eye blink, gaze coordinates and pupil diameter. The output is the average arousal and its standard deviation.

Confusion matrix accuracy of low and high arousal classification

For both iEE and GSR it is quite easy to identify low and high arousal emotional responses. In the middle region of the responses, however, it is more difficult to tell with confidence whether the elicited response belongs to the lower or higher region of the arousal scale. This is particularly evident in GSR measurements (see Figure 5), where the dispersion of the measurements is especially high in the low to middle region. In general, it is difficult to use GSR measurements with fair precision due to the overall high dispersion of the data.

High variance is an inherent part of physiological measurements that reflect a transition from an affected to an unaffected state, analogous to a calm or excited bodily reaction. It is therefore quite difficult to maintain low variance, especially in the transition region. Using a scale pan as a metaphor, the transition between the affected and unaffected states in the middle region can be seen as “tipping the scales”. This is also expected to be the case with eye tracking measurements, although the overall variance is somewhat lower than that of skin response measurements.

The accuracy of the iEE arousal scale compared to GSR amplitudes is calculated by classifying the responses of each method into two categories: low arousal and high arousal responses. For this purpose, we selected a middle region that is regarded as having too high a variance with either iEE or GSR. Data from stimuli that were considered to be in this grey zone were left out, whereas the remaining stimuli were used separately for the two genders. The definitions of the two classes of responses are listed in Table 1 below.

| Method | High arousal class | Low arousal class |
|---|---|---|
| GSR | GSR amplitude > average GSR amplitude of all stimuli + 15% | GSR amplitude < average GSR amplitude of all stimuli - 15% |
| iEE 2.0 | iEE 2.0 Emotional Involvement > 6.0 | iEE 2.0 Emotional Involvement < 4.0 |

Table 1, definition of low arousal and high arousal classes for GSR and iEE data.
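The class definitions in Table 1 can be expressed as a short sketch. Two points are assumptions rather than statements from the paper: "+ 15%" is read as 15% of the average amplitude (thresholds at 1.15 and 0.85 times the average), and a stimulus is dropped when it falls in the grey zone of either method. All function names are illustrative.

```python
def classify_gsr(amplitude, mean_amplitude, margin=0.15):
    """Return "high", "low" or None (grey zone) for one GSR amplitude."""
    # ASSUMPTION: the 15% margin is relative to the average amplitude.
    if amplitude > mean_amplitude * (1 + margin):
        return "high"
    if amplitude < mean_amplitude * (1 - margin):
        return "low"
    return None


def classify_iee(involvement):
    """Return "high", "low" or None (grey zone) for one iEE 2.0 score."""
    if involvement > 6.0:
        return "high"
    if involvement < 4.0:
        return "low"
    return None


def classify_pairs(gsr_amplitudes, iee_scores):
    """Return (iEE class, GSR class) for stimuli outside both grey zones."""
    mean_amp = sum(gsr_amplitudes) / len(gsr_amplitudes)
    pairs = []
    for amp, score in zip(gsr_amplitudes, iee_scores):
        gsr_class = classify_gsr(amp, mean_amp)
        iee_class = classify_iee(score)
        # ASSUMPTION: a stimulus is omitted if either method is undecided.
        if gsr_class is not None and iee_class is not None:
            pairs.append((iee_class, gsr_class))
    return pairs


# Illustrative values only (not study data).
print(classify_pairs([2.0, 1.0, 0.2, 0.8], [7.0, 6.5, 3.0, 5.0]))
```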

From the total number of tested stimuli, 24 stimuli fulfilled the criteria for females and 17 stimuli for males.

The classification resulted in 8 correct classifications as high arousal and 12 correct classifications as low arousal for females. The two methods disagreed on 4 of the tested stimuli for females.

For males, 5 stimuli were correctly classified as high arousal and 11 as low arousal, leaving inconsistency between the methods for only one of the stimuli. The results are listed in Table 2.

| | Females: GSR High arousal | Females: GSR Low arousal | Males: GSR High arousal | Males: GSR Low arousal |
|---|---|---|---|---|
| iEE 2.0 High arousal | 8 | 2 | 5 | 0 |
| iEE 2.0 Low arousal | 2 | 12 | 1 | 11 |

Table 2, arousal classification results from GSR and Attention Tool. The accuracy of the classifications is 83% for females and 94% for males.

This leaves 20 out of 24 correctly classified stimuli for females, or 83%, and 16 out of 17 stimuli, or 94%, correctly classified for males.
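The accuracy figures follow directly from the counts in Table 2: the number of stimuli on which both methods agree (the diagonal of the confusion matrix) divided by the total number of classified stimuli. A minimal check, with an illustrative function name:

```python
def confusion_accuracy(matrix):
    """Accuracy of a 2x2 confusion matrix.

    Rows are the iEE 2.0 class (high, low); columns are the GSR class
    (high, low). The diagonal holds the stimuli on which both agree.
    """
    agreed = matrix[0][0] + matrix[1][1]
    total = sum(sum(row) for row in matrix)
    return agreed / total


females = [[8, 2], [2, 12]]   # counts from Table 2
males = [[5, 0], [1, 11]]     # counts from Table 2

print(f"females: {confusion_accuracy(females):.0%}")  # 20 of 24
print(f"males:   {confusion_accuracy(males):.0%}")    # 16 of 17
```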

Linear regression of measurements

The arousal values obtained with galvanic skin response measurements and iEE are compared by performing a linear regression of the two. Pearson’s correlation coefficient (ρ) is calculated for two datasets: one with all 66 stimuli from the EPOC study, and one with the low and high arousal stimuli chosen according to Table 1.
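For readers who want to reproduce this kind of comparison on their own data, the sketch below computes Pearson’s correlation coefficient together with the least-squares regression line of GSR arousal on iEE arousal. The function name and the toy numbers are illustrative and are not taken from the study; the 95% prediction interval is not covered.

```python
import math


def pearson_and_fit(x, y):
    """Return (rho, slope, intercept) for the regression of y on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    rho = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return rho, slope, intercept


# Toy values only (not study data): x plays the role of iEE arousal per
# stimulus, y the role of GSR arousal per stimulus.
iee_arousal = [1.0, 2.0, 3.0, 4.0]
gsr_arousal = [3.0, 5.0, 7.0, 9.0]
rho, slope, intercept = pearson_and_fit(iee_arousal, gsr_arousal)
print(rho, slope, intercept)
```

Running the same function once on all stimuli and once on the subset selected according to Table 1 corresponds to the two datasets compared in the text.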

Figure 7 shows GSR arousal as a function of iEE arousal, the corresponding regression line and 95% prediction interval of males and females. All stimuli in the EPOC study were used for this plot.

The effect of the greater dispersion of individual measurements around the middle range of the arousal scale is also visible in the regression shown in Figure 7. The correlation coefficient is ρ = 0.39 for females and ρ = 0.38 for males for the whole dataset. For the second dataset, the correlation coefficient is ρ = 0.63 for females and ρ = 0.84 for males, as listed in Table 3.

| Correlation coefficient | ρ Females | ρ Males |
|---|---|---|
| All stimuli | 0.39 | 0.38 |
| Low dispersion stimuli | 0.63 | 0.84 |

Table 3, Pearson’s correlation coefficient for all stimuli in the EPOC study and for a subset of the stimuli where high or low arousal was observed.

It can be argued that the correlation between the two methods is weak for the whole dataset. This should be seen in the light of the disadvantages of galvanic skin response measurements and the effect of the greater dispersion around the centre of the arousal scales. Looking at the correlation coefficients obtained with the low dispersion dataset, it is evident that there is a strong correlation between GSR and iEE when the arousal level is not around the centre of the arousal scale. This does not mean that the iEE arousal scale fails around the centre, but that the two methods, especially GSR, have a greater dispersion in this region. This makes it less likely that the methods agree on small differences in arousal level, which results in a lower correlation coefficient.

Conclusion

Emotions can be measured through the physiological response of the body, and these measurements are of high value. One advantage of this information is the possibility of analysing how advertisements that have not yet entered the market are going to affect potential consumers. Many tools on the market today can measure a physiological reaction in connection with emotions, but most of them are expensive and difficult to use, and all of them are intrusive. GSR is one of these measurement methods, although a less costly one, and is considered to be a very good indicator of arousal in the field of psychophysiology.

The coefficient of variation of the features for arousal estimation derived by Attention Tool was compared with that of the GSR features for computing physiological arousal. The result showed that eye tracking features have a much lower dispersion of the individual measurements in a sample with regard to the average of the sample. The comparison revealed that eye tracking is capable of delivering measurements of a peripheral physiological feature usable for arousal computation that are up to four times as consistent as galvanic skin response measurements. Moreover, galvanic skin response is an intrusive method that needs skilled staff to derive a meaningful result from the data output. The respondent is also required to have a minimum understanding of, and compliance with, the equipment’s extreme sensitivity to attain useful results. This is due to the fragile nature of the biological signal, which is being recorded from the skin. Eye tracking requires no involvement from the respondent once the calibration procedure has succeeded, and is 100 % non-intrusive.

Classification of the stimuli into high and low arousal response categories showed consistent results. GSR arousal and iEE 2.0 arousal classified 83% of the stimuli into the same category for females and 94% of the stimuli for males. Pearson’s correlation coefficient further substantiated the agreement between the methods. The correlation coefficient was ρ = 0.39 for females and ρ = 0.38 for males for the whole dataset from the EPOC study. Due to high variance in emotional response in the middle range of the arousal scales, the correlation increased markedly when stimuli in this region were omitted from the dataset. For the same dataset that was used for the confusion matrix accuracy calculations, the correlation was ρ = 0.63 for females and ρ = 0.84 for males.

Based on the above facts, it can be concluded that Attention Tool 3.0 is just as precise in determining the intensity of an emotion as the current, broadly used GSR method, thus allowing the iMotions emotion evaluation system, on which Attention Tool 3.0 is based, to act as the future window to the human emotion spectrum. In addition, Attention Tool 3.0 is non-intrusive, and data output is analyzed and finalized automatically.

References

• Beatty, J. & Lucero-Wagoner, B. (2000): The pupillary system, in Cacioppo, J., Tassinary, L.G. & Berntson, G. (Eds.): The Handbook of Psychophysiology, Cambridge University Press, Hillsdale, New York.

• Biopac: http://www.biopac.com

• Bradley, M. M. & Lang, P. J. (1994): Measuring emotion: The Self-Assessment Manikin and the semantic differential, Journal of Behavior Therapy and Experimental Psychiatry, 25, 49-59.

• Bradley, M. M., Cuthbert, B. N. & Lang, P. J. (1999): Affect and the startle reflex, in Dawson, M.E., Schell, A. & Boehmelt, A. (Eds.): Startle modification: Implications for neuroscience, cognitive science and clinical science, Stanford University Press, Stanford, 242–276.

• Bradley, M.M., Codispoti, M., Cuthbert, B.N. & Lang, P.J. (2001): Emotion and Motivation I: Defensive and Appetitive Reactions in Picture Processing, Emotion, 1 (3), 276–298

• Cacioppo, J.T. & Gardner, W.L. (1999): Emotion, Annual Review of Psychology, 50, 191-214

• Calvo, M.G. & Lang, P.J. (2004): Gaze patterns when looking at emotional pictures: Motivationally biased attention, Motivation and Emotion, 28 (3)

• Christie, I. C. & Friedman, B. H. (2004): Autonomic specificity of discrete emotion and dimensions of affective space: a multivariate approach, International Journal of Psychophysiology, 51 (2), 143-153

• Cosmides, L. & Tooby, J. (2000): Evolutionary Psychology and the Emotions, in Lewis, M. & Haviland-Jones, J.M. (Eds.): Handbook of Emotions, The Guilford Press, New York

• Dichter, G.S., Tomarken, A.J. & Baucom, B.R. (2002): Startle modulation before, during and after exposure to emotional stimuli, International Journal of Psychophysiology, 43, 191-196

• E-Prime: www.pstnet.com

• Granholm, E. & Steinhauer, S.R. (2004): Pupillometric Measures of Cognitive and Emotional Processes, International Journal of Psychophysiology, 52, 1–6

• Hugdahl, K. (2001): Psychophysiology: The Mind-body Perspective, Harvard University Press

• E-mail correspondence with Kenneth Hugdahl (May 2006).

• Lang, P.J. (1988): What are the Data of Emotion?, in Hamilton, V., Bower, G.H. & Frijda, N.H. (Eds.): Cognitive Perspectives on Emotion and Motivation, Kluwer Academic Publishers, the Netherlands

• Lang, P.J., Bradley, M.M., & Cuthbert, B.N. (2005): International affective picture system (IAPS): Affective ratings of pictures and instruction manual, Technical Report A-6, University of Florida, Gainesville, FL.

• Lang, P.J., Greenwald, M.K., Bradley, M.M. & Hamm, A.O. (1993): Looking at pictures: Affective, facial, visceral, and behavioural reactions, Psychophysiology, 30, 261-273

• Parrott, W.G. & Hertel, P. (1999): Research methods in cognition and emotion, in Dalgleish, T. & Power, M.J. (Eds.): Handbook of Cognition and Emotion, John Wiley & Sons, Chichester

• Ruiz-Padial, L., Sollers, J.J., Vila, J. & Thayer, J.F. (2003): The rhythm of the heart in the blink of an eye: Emotion-modulated startle magnitude covaries with heart rate variability, Psychophysiology, 40, 306–313

• Scherer, K.R. (2000): Psychological Models of Emotion, in Borod, J.C. (Ed.): The Neuropsychology of Emotion, Oxford University Press

• Scherer, K.R., Schorr, A. & Johnstone, T. (Eds.) (2001): Appraisal Processes in Emotion: Theory, Methods, Research, Oxford University Press

• Stanners, R.F., Coulter, M., Sweet, A.W. & Murphy, P. (1979): The pupillary response as an indicator of arousal and cognition, Motivation and Emotion, 3 (4), 319-340

• Steinhauer, S. R., Boller, F., Zubin, J. & Pearlman, S. (1983): Pupillary dilation to emotional visual stimuli revisited, Psychophysiology, 20

• Tobii: www.tobii.com

• Tranel, D. (2000): Electrodermal Activity in Cognitive Neuroscience: Neuroanatomical and Neuropsychological Correlates, in Lane, R.D. & Nadel, L. (Eds.): Cognitive Neuroscience of Emotion, Oxford University Press, New York