
Using Deep Learning To Classify Emotions in Tibetan Monks

Bachelor’s Project Thesis

Pim van der Loos, [email protected],

Supervisors: Dr M.K. Van Vugt & P. Kaushik

Abstract: Monastic debate is a form of practice that plays an important role in the training of Tibetan Buddhist monks. The goal of these debates is to deepen the monks' understanding of their study materials and the world. During these debates, many emotions such as anger and happiness arise, which makes this a good environment to study naturally occurring emotions. For this study, the electroencephalography (EEG) data of both participants and the videos of 46 debates were recorded. After manually annotating the debates for anger and happiness, two deep learning algorithms were used to classify the emotions of happiness and anger from the EEG data in a subject-independent approach: a long short-term memory (LSTM) network and a 1-dimensional convolutional neural network (1D CNN). The LSTM achieved the highest accuracy at 91.0%, with the 1D CNN following at 88.8%. These findings show that deep learning can be used to create a robust classifier for anger and happiness.

1 Introduction

Emotions play an important role in human life, for example in bonding with other people and strengthening social ties (Rimé, 2009). Looking specifically at positive versus negative emotions, Gino and Schweitzer (2008) have shown that these emotions play a role in taking advice from other people. They found that negative emotions such as anger made participants less likely to follow useful advice than a positive emotion (gratitude), which resulted in lower performance for those ignoring the advice.

However, despite the big role emotions play in our lives, not much is known about emotions in real-life situations, as described by Trampe et al. (2015). In their study, aimed at discovering which emotions we experience in everyday life, they used a phone application that asked participants to fill in a short questionnaire about their emotions at random times throughout the day. In total, over half a million questionnaires were completed. They found that participants experienced positive emotional states such as joy, love, and gratitude 2.5 times more often than negative emotional states such as anger, anxiety, and sadness. However, they suspect that certain emotions might be under- or over-represented in their results, because the decision to respond to the survey could itself be influenced by the very emotion being measured.

One way of overcoming this limitation could be to decode the emotions experienced by the participants using deep learning and wearable EEG devices such as those described by Casson (2019). This method would also have the advantage of providing a continuous stream of data, rather than unevenly distributed data points.

Matiko et al. (2015) found promising results in classifying emotions using a wearable, solar-powered EEG headband. Their fuzzy logic-based emotion classifier reached an accuracy of 90% (±9%) when classifying positive versus negative emotions in a subject-dependent approach.

One thing almost all research into emotion classification using EEG data has in common is that it uses artificially induced emotions. Matiko et al. (2015), for example, used randomly selected sequences of 24 images from the Geneva affective picture database (Dan-Glauser and Scherer, 2011). These images are supposed to elicit a specific emotional response: images of snakes, for instance, are supposed to cause a negative emotional response, while images of human babies should result in a positive one. Lin et al. (2014), on the other hand, used music to induce the desired emotions. After the experiment, the participants had to describe their emotions in terms of valence (positive vs negative) and arousal (high vs low). Using an SVM, they obtained a valence classification accuracy of 61.09% in a subject-independent approach. Subject-independent here means that their model did not account for individual variability between participants. This has an advantage over a subject-dependent approach because a classifier does not have to be trained for every participant, but it usually results in lower accuracy because the classifier cannot account for individual variability.

Because artificially induced emotions may not be the same as naturally occurring emotions, the classification of naturally occurring emotions will be explored in this paper. I expect the generalizability of classifiers trained on naturally occurring emotions to be higher than that of classifiers in other studies that used emotions evoked in a laboratory setting, as I think that letting people get angry at each other or have fun together naturally results in more 'real' and stronger emotions.

The dataset used in this paper was obtained in the real-life setting of monastic debate as practiced by Tibetan monks. This is a good environment to study naturally occurring emotions, as the debates can get heated at times, involving a lot of shouting, laughing, and wild movements.

Monastic debate, as described by Sera Jey Science Centre (2015), is a clearly structured dialectical debate that is centered around logical reasoning and does not rely on scriptural authority. The debate involves two people: a defender and a challenger. The role of the defender is to state and subsequently defend a hypothesis. Through the use of logic, the challenger must then construct counter-arguments to the hypothesis in such a way that the defender is forced into a corner of either accepting absurd statements or contradicting himself (Dreyfus, 2008). The defender can only reply to the challenger's statements with one of four responses: "I agree", "Please state a reason / Why", "Reason is not established", or "No pervasion". The goal of these debates is not to convince the opponent or any potential spectators of a certain point, but rather to deepen the understanding of the current topic through the use of logic.

Figure 2.1: Still taken from a debate recording. The sitting participant on the left is the defender; the standing participant on the right is the challenger. The red caps both participants are wearing are the EEG caps.

This study aims to examine whether it is possible to use deep learning to classify the emotions of happiness and anger from EEG obtained in a real-life setting. More specifically, a long short-term memory (LSTM) network and a 1D convolutional neural network (1D CNN) will be used in a subject-independent approach. Deep learning was selected because it shows great promise in the area of EEG classification, as shown by Roy et al. (2019). My hypothesis is that this is indeed possible with good accuracy.

2 Methods

Participants

To find out whether we can classify emotions from EEG data, we used EEG data recorded from a group of monks while they were practicing monastic debate. Twenty-four monks from the Sera Jey Monastery in India participated. All monks were male and between the ages of 20 and 30. Of the 24 monks, ten were experienced monks with at least 15 years (18,750 hours) of experience, and 14 were inexperienced monks with at least three years (3,750 hours) of experience.


Data Gathering

The monks were recorded during 46 separate one-on-one debates lasting either 10 or 15 minutes. The 10-minute debates were about easy topics and the 15-minute debates about more difficult topics. For every debate, audio and video were recorded in addition to the EEG signals from both the defender and the challenger. The camera captured the face of the defender and the back of the challenger, as demonstrated in Figure 2.1. To synchronize the video feed with the EEG data, two separate methods were employed. The first method was to start recording audio and video first and then count down from 3 to 1; the EEG recording was started at 1. This method was used for all debates. The second method was to let the challenger blink five times in a row at the start of the debate. These purposefully created EEG artifacts allowed us to synchronize the data more accurately. This method was used for only 15 debates, as the idea came up after several debates had already been recorded.
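The blink-based synchronization method can be illustrated with a small sketch that searches a frontal EEG channel for the first of five large-amplitude excursions. The threshold value, channel choice, and detection logic here are illustrative assumptions, not the procedure actually used in the study:

```python
def find_blink_onset(eeg_frontal, threshold=100.0, n_blinks=5):
    """Return the sample index of the first of n_blinks above-threshold
    excursions (the deliberate blinks), or None if not found.
    The threshold (in microvolts) is a hypothetical value."""
    count, first, above = 0, None, False
    for i, v in enumerate(eeg_frontal):
        if abs(v) > threshold and not above:
            above = True            # entering a new excursion
            count += 1
            if first is None:
                first = i
            if count >= n_blinks:
                return first        # all five blinks seen: sync point found
        elif abs(v) <= threshold:
            above = False           # excursion ended
    return None
```

The returned index can then be aligned with the video frame in which the blinking starts.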

Video Annotation

The videos were analyzed for emotions using BORIS: Behavioral Observation Research Interactive Software (Friard and Gamba, 2016). In BORIS, we annotated anger and happiness as periods with a start point and an end point. The emotions were recorded separately for the challenger and the defender. We classified the emotions by reading body language and listening for audible cues. For example, laughing loudly was considered a clear indicator of happiness, while shouting with clenched fists was considered an indication of anger.

Because of the subjective nature of this process, every recording was annotated by three separate students. Once the annotation was completed, the results were merged: only the moments at which at least two of the three raters agreed on a certain emotion were used.
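The two-out-of-three merging rule can be sketched on a per-second grid. The actual merge granularity and data format are not specified in the text, so the interval representation below is an illustration:

```python
def merge_annotations(raters, n_seconds):
    """Keep only the seconds where at least two raters marked the same
    emotion. Each rater is a list of (start, end, emotion) intervals
    in seconds; this per-second grid is a simplification."""
    merged = []
    for t in range(n_seconds):
        for emotion in ("anger", "happiness"):
            votes = sum(
                any(start <= t < end and label == emotion
                    for (start, end, label) in rater)
                for rater in raters
            )
            if votes >= 2:          # majority of the three raters agree
                merged.append((t, emotion))
    return merged
```

Seconds on which no two raters agree are simply dropped from the dataset.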

EEG Data

Continuous EEG data were recorded from 32 electrodes (Fp1/2, AF3/4, F3/4/7/8/z, FC1/2/5/6, C3/4/z, T7/8, CP1/2/5/6, P3/4/7/8/z, PO3/4, O1/2/z) using a Biosemi EEG cap at a sampling rate of 256 Hz. We received the EEG data of the debates after it had already been processed. A low-pass filter at 45 Hz was used because the frequencies above that contained too much noise, caused by the monks' wild movements. Independent component analysis (ICA) (Iriarte et al., 2003) was used to detect and remove artifacts, such as those caused by movements and eye blinks, from the EEG data.

For this research, the following frequency bands were used: delta (0.5-4 Hz), theta (4-8 Hz), alpha (8-12 Hz), beta (12-30 Hz), and gamma (30-80 Hz). Because of the low-pass filter, the gamma band was limited to 30-45 Hz. These bands were obtained using a discrete wavelet transform. Additionally, the 'raw' EEG data was used; 'raw' here means that it was not treated further after it had been cleaned, and therefore includes the full frequency range.
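The band boundaries can be captured in a small lookup table. The band names and cut-offs come from the text (with gamma already capped at 45 Hz by the low-pass filter); the helper function is purely illustrative:

```python
# EEG frequency bands used in the study; gamma is capped at 45 Hz
# because of the 45 Hz low-pass filter.
BANDS_HZ = {
    "delta": (0.5, 4),
    "theta": (4, 8),
    "alpha": (8, 12),
    "beta": (12, 30),
    "gamma": (30, 45),
}

def band_of(freq_hz):
    """Return the band a frequency falls in, or None if it lies
    outside all bands (e.g. filtered out above 45 Hz)."""
    for name, (lo, hi) in BANDS_HZ.items():
        if lo <= freq_hz < hi:
            return name
    return None
```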

The Dataset

The dataset consists of 1759 batches of 256 timesteps with 192 features each. The 192 features can be broken up into the raw, alpha, beta, gamma, delta, and theta channels for each of the 32 electrodes. There are 256 timesteps per batch because the sampling rate of the EEG equipment was set to 256 Hz, meaning that every batch contains the data of 1 second.
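The dataset dimensions follow directly from the recording setup, as a minimal sketch shows. The windowing into non-overlapping 1-second batches is an assumption; the text only states that each batch covers 1 second:

```python
# 32 electrodes x (raw + 5 frequency bands) = 192 features per timestep;
# 256 samples at 256 Hz = 1 second per batch.
ELECTRODES = 32
CHANNELS = ["raw", "alpha", "beta", "gamma", "delta", "theta"]
SAMPLING_RATE_HZ = 256

n_features = ELECTRODES * len(CHANNELS)  # 192

def window_into_batches(recording):
    """Cut a recording (a list of 192-feature timesteps) into
    non-overlapping 1-second batches of 256 timesteps, dropping any
    incomplete trailing window (windowing details are assumed)."""
    n = SAMPLING_RATE_HZ
    return [recording[i:i + n] for i in range(0, len(recording) - n + 1, n)]
```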

The 1759 batches consist of 1205 instances of happiness and 554 instances of anger. Because these data are clearly imbalanced, every measurement of anger was duplicated, resulting in a total of 1108 instances of anger. We chose this approach over balancing the dataset by removing instances of happiness, as the dataset is already relatively small for deep learning, and removing about a third of the data would have given the models even less to work with.
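Oversampling by duplication can be sketched as follows. The helper name and data layout are hypothetical; in the study, duplicating the 554 anger batches yields 1108 anger instances alongside 1205 happiness instances:

```python
def balance_by_duplication(batches, labels, minority_label=0):
    """Duplicate every minority-class example once (label 0 = anger,
    as in the dataset) to reduce class imbalance."""
    extra = [(b, y) for b, y in zip(batches, labels) if y == minority_label]
    combined = list(zip(batches, labels)) + extra
    return [b for b, _ in combined], [y for _, y in combined]
```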

Deep Learning Algorithms

Two different algorithms were used to classify the data. To find the best settings, many different models were constructed for both algorithms and then compared based on accuracy, sensitivity, and specificity. In the dataset, '1' (true) denotes happiness and '0' (false) denotes anger. This means that the sensitivity, also known as the true positive rate, is the percentage of correct classifications of happiness. Specificity, also known as the true negative rate, is therefore the percentage of correct classifications of anger. Because the two classes were approximately balanced after oversampling, the accuracy is close to the average of the sensitivity and specificity.
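These metrics can be computed directly from the confusion counts. A minimal sketch, using the dataset's label convention (1 = happiness, 0 = anger):

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = true positive rate (correct 'happiness'
    classifications); specificity = true negative rate (correct
    'anger' classifications)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)
```

With perfectly balanced classes, the overall accuracy equals the mean of these two rates.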

In all cases, the models were tested using k-fold cross-validation with k=5 in a 60:20:20 split: 60% of the data were used for training, 20% for validation, and 20% for testing. For each model, the average of the k results from the k-fold cross-validation is reported as its final result.
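One way to realize a 5-fold scheme that yields a 60:20:20 train/validation/test split per fold is to rotate which fold serves as the test set and which as the validation set. The exact fold assignment used in the thesis is not specified, so the rotation below is an assumption:

```python
def kfold_indices(n, k=5):
    """5-fold CV with a 60:20:20 split per fold: one fold is the test
    set, the next fold the validation set, and the remaining three
    folds (60%) form the training set."""
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for f in range(k):
        test = folds[f]
        val = folds[(f + 1) % k]
        train = [i for g in range(k)
                 if g not in (f, (f + 1) % k)
                 for i in folds[g]]
        splits.append((train, val, test))
    return splits
```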

Algorithm 1: LSTM

The first model we used was a long short-term memory (LSTM) neural network (Hochreiter and Schmidhuber, 1997). This algorithm was selected because it is well-suited for time series data. LSTMs are a specialized type of recurrent neural network (RNN). RNNs can use previous information for the current task, as they use the output of the previous calculation for the next one. However, the performance of RNNs decreases as the gap between dependent pieces of information increases, as shown by Bengio et al. (1994). This means that the information persistence or 'memory' in RNNs is limited to the short term.

LSTMs, on the other hand, expand on RNNs by making use of a cell state. The cell state is the long-term memory aspect of the system, as it is used to hold information over long periods. The cell state is accessed by a set of gates that control what is added to and removed from it. At each step, a forget gate determines how much of the cell state should be discarded, using a sigmoid function applied to the output of the previous step and the current input. Then, an input gate decides which values to update; this is achieved using a sigmoid function to determine which values will be updated and a tanh function to supply the updated values. These updated values are then added to the cell state. Lastly, an output gate calculates the result of the step from the new cell state, the output of the previous step, and the current input.
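The gate computations described above can be written out for a single scalar LSTM step. Real implementations operate on vectors and matrices, and the weights here are illustrative placeholders, not trained values:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W):
    """One scalar LSTM step. W maps gate name -> (weight on h_prev,
    weight on x_t, bias); all weights are illustrative placeholders."""
    wf_h, wf_x, bf = W["forget"]
    wi_h, wi_x, bi = W["input"]
    wc_h, wc_x, bc = W["candidate"]
    wo_h, wo_x, bo = W["output"]

    f_t = sigmoid(wf_h * h_prev + wf_x * x_t + bf)      # how much cell state to keep
    i_t = sigmoid(wi_h * h_prev + wi_x * x_t + bi)      # which values to update
    c_hat = math.tanh(wc_h * h_prev + wc_x * x_t + bc)  # candidate values
    c_t = f_t * c_prev + i_t * c_hat                    # new cell state (long-term memory)
    o_t = sigmoid(wo_h * h_prev + wo_x * x_t + bo)      # how much of the state to expose
    h_t = o_t * math.tanh(c_t)                          # new output / hidden state
    return h_t, c_t
```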

Algorithm 2: 1D CNN

The second algorithm we used was a one-dimensional convolutional neural network (1D CNN). This algorithm was selected because of its strengths in identifying relevant patterns in time series data. For example, 1D CNNs have been useful in voice analysis, as shown by Fujimura et al. (2020). A CNN uses convolutional layers to extract features from the data, which are stored in feature maps. A fully connected layer then uses these feature maps to classify the data.
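The core operation of a convolutional layer can be sketched in a few lines. This is an illustration of the mechanism, not the study's implementation:

```python
def conv1d(signal, kernel):
    """Valid 1-D convolution (cross-correlation, as used in CNNs):
    slide the kernel over the signal and take dot products, producing
    a feature map."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

# An edge-detecting kernel responds to sudden changes in the signal.
feature_map = conv1d([0, 0, 1, 1, 1, 0], [-1, 1])
```

In a CNN, many such kernels are learned during training, each producing its own feature map.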

3 Results

Both the LSTM and the 1D CNN models were able to classify the EEG data with high levels of accuracy. The results of the LSTM will be discussed first, followed by the results of the 1D CNN and a comparison of the two.

LSTM Results

The best result obtained by the LSTM was an average accuracy of 91.0%, with a specificity of 90.8% and a sensitivity of 91.3%. The model that obtained these results used two LSTM layers with 32 units each and a tanh activation function. The model used a dropout layer of 40% to avoid overfitting, and a single-unit dense layer with a sigmoid activation function to get a binary output, as the only available options are 'happy' and 'angry'. The RMSProp optimizer was used with a learning rate of 0.001, and the model was trained over 32 epochs with a batch size of 32.
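The described architecture might look as follows in Keras. The thesis does not name its framework, so tf.keras is an assumption; the layer sizes and training hyperparameters follow the text:

```python
import tensorflow as tf

def build_lstm(timesteps=256, features=192):
    """Sketch of the best-performing LSTM configuration: two 32-unit
    tanh LSTM layers, 40% dropout, and a single sigmoid output unit."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(timesteps, features)),
        tf.keras.layers.LSTM(32, activation="tanh", return_sequences=True),
        tf.keras.layers.LSTM(32, activation="tanh"),
        tf.keras.layers.Dropout(0.4),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model

# Training would then use: model.fit(X_train, y_train, epochs=32, batch_size=32)
```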

To find out which selection of the data gave the best results, the model was run for every combination of the available data (ignoring their order): the raw data and the alpha, beta, gamma, delta, and theta waves, resulting in 63 different data sets. The average results of 5-fold cross-validation for the ten best combinations are shown in Figure 3.1. These results clearly show the importance of the raw EEG data for this model: not only does using only the raw channel give the fourth-highest accuracy, but all ten of the best models are combinations that include the raw data. The best results were obtained using the beta and theta channels in combination with the raw data. The results of all 63 combinations can be seen in Appendix A.

When only looking at single-channel performance, as shown in Figure 3.2, we can see that the raw data channel is indeed the most important channel, with an average accuracy of 90.7% and an average sensitivity and specificity of 88.0% and 93.5%.

Figure 3.1: Comparison of the 10 best combinations of data for the LSTM. Ordered by highest average accuracy.

Figure 3.2: Comparison of the standalone channels for the LSTM. Ordered by frequency.

Figure 3.3: Comparison of the 10 best combinations of data for the 1D CNN. Ordered by highest average accuracy.

Figure 3.4: Comparison of the standalone channels for the 1D CNN. Ordered by frequency.

Figure 3.5: Results of the LSTM compared to the 1D CNN when using their optimal dataset.

1D CNN Results

The best results obtained by the 1D CNN were achieved using two layers of 24 filters each, with a kernel size of 3. Both layers used a ReLU activation function and the L1L2 bias regularizer. These 1D CNN layers were followed by a 20% dropout layer and a single-unit dense layer with a sigmoid activation function. The model used the binary cross-entropy loss function and the RMSProp optimizer.

This model reached an average accuracy of 88.8%, with a sensitivity of 85.7% and a specificity of 92.2%. As shown in Figure 3.3, the combination of the raw, beta, and theta channels gave the best results, as it did for the LSTM. When looking at the single-channel performance shown in Figure 3.4, the raw EEG data is the best predictor, as was the case for the LSTM. Using only the raw channel resulted in an accuracy of 87.0% and a sensitivity and specificity of 84.4% and 89.5%. The results of all 63 combinations can be seen in Appendix B.

Comparison of the models

As shown in Figure 3.5, the LSTM model slightly outperforms the 1D CNN model by 2.2 percentage points when only looking at the accuracy of the models. The LSTM also has a smaller difference between its sensitivity and specificity scores of only 0.5 percentage points, while the 1D CNN has a difference of 6.5 percentage points.

4 Discussion

The results confirm our hypothesis that deep neural networks, and specifically an LSTM and a 1D CNN, can be used to accurately classify the emotions of anger and happiness from EEG data obtained in the real-life setting of monastic debate using a subject-independent approach. Of the two models, the LSTM has the highest accuracy at 91.0% and the smallest difference between its sensitivity and specificity results, at 91.3% and 90.8% respectively. Therefore, when computing power is not a factor, the LSTM is the clear winner. However, because the 1D CNN is faster than the LSTM, the 1D CNN might be the algorithm of choice in situations where little processing power is available, such as the wearable EEG headset used by Matiko et al. (2015).

Furthermore, we explored the effects of the different combinations of available data on the performance of the algorithms. In both cases, the raw data performed much better than any of the other channels, but was beaten by the combination of raw data and the beta and theta bands. This makes sense because the raw EEG data contains the most information, which gives the models more to work with. Given the high performance of just the raw data, the trade-off between slightly lower results and using a much smaller dataset that requires both less computing power and less memory is worth considering for lower-power devices.

When comparing these results to Matiko et al. (2015), this new approach has a clear advantage in the subject-independent setting, which eliminates the need to train the classifier on each individual subject. While their fuzzy logic-based classifier had an accuracy of 90% (±9%) with their subject-dependent method, the same system only had an accuracy of 62.62% in a subject-independent approach. Lin et al. (2014) reached a similar accuracy of 61.09% using an SVM in a subject-independent approach. Both results are much lower than those achieved by the models discussed here. However, it should be noted that these direct comparisons between classifiers are not always completely fair, given the wildly different setups of each study. Matiko et al. (2015), for example, used a real-time classifier and a wearable EEG headset, which generally yields much lower quality data than EEG caps such as those used in this study.

During our research, we ran into several issues. While annotating the videos, it was especially difficult to assess the emotion of the challengers, as the camera was positioned behind them, so we could only see their faces on the rare occasions they turned around. Further complicating the annotation process was the fact that none of the annotators knew more than a few words of Tibetan, so we did not know what the participants were saying. At times, this lack of context made it even harder to evaluate their emotions.

It should also be noted that this debate is a kind of game for the monks, and one that even the inexperienced monks have practiced for at least 3,750 hours. Therefore, it is conceivable that what we considered to be anger was not actually anger. This is supported by the fact that the monks would regularly start laughing after or even between bouts of shouting. Additionally, there was no instance where at least two annotators agreed on when the defender was angry; all instances of anger were taken from the challenger.

If it was not actually anger, there are a few explanations of what our models found instead. The models might have picked up on a difference between happiness and another emotion, or a mixture of emotions that, to us, looked like anger. Alternatively, there might be a difference in how much or how wildly the monks moved while they were 'angry', resulting in more noise and artifacts in the EEG data, which were then picked up by the models. Especially the higher frequencies used for the gamma band are susceptible to this, which might explain why it scored so high compared to the other frequency bands. If either explanation is true and our models did not actually classify anger, this would likely impact the generalizability of our models to other real-world situations.

Without the raw EEG and the gamma frequency band, the average accuracy of the LSTM drops to 78.8%, with a sensitivity and specificity of 76.0% and 81.7% for the combination of the alpha, beta, and theta frequency bands, as can be seen in Figure A.3. The results of the CNN drop to an accuracy of 77.3%, with a sensitivity and specificity of 76.4% and 78.1% for the combination of the alpha, beta, delta, and theta frequency bands, as shown in Figure B.3. Future research may explore whether the good results obtained using the raw EEG data and the gamma frequency band can be attributed to the emotions themselves or to other factors such as movement.

This is a first step in reliably decoding naturally occurring emotions using deep learning. Future research could look at how generalizable these models are to other real-life situations.

5 References

Y. Bengio, P. Simard, and P. Frasconi. Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2):157-166, 1994.

Alexander J. Casson. Wearable EEG and beyond. Biomedical Engineering Letters, 9(1):53-71, 2019. doi:10.1007/s13534-018-00093-6.

Elise S. Dan-Glauser and Klaus R. Scherer. The Geneva affective picture database (GAPED): a new 730-picture database focusing on valence and normative significance. Behavior Research Methods, 43(2):468-477, 2011. doi:10.3758/s13428-011-0064-1.

G. B. Dreyfus. What is debate for? The rationality of Tibetan debates and the role of humor. Argumentation, 22(1):43-58, 2008.

Olivier Friard and Marco Gamba. BORIS: a free, versatile open-source event-logging software for video/audio coding and live observations. Methods in Ecology and Evolution, 7(11):1325-1330, 2016. doi:10.1111/2041-210X.12584.

Shintaro Fujimura, Tsuyoshi Kojima, Yusuke Okanoue, Kazuhiko Shoji, Masato Inoue, Koichi Omori, and Ryusuke Hori. Classification of voice disorders using a one-dimensional convolutional neural network. Journal of Voice, 2020. doi:10.1016/j.jvoice.2020.02.009.

Francesca Gino and Maurice E. Schweitzer. Blinded by anger or feeling the love: How emotions influence advice taking. Journal of Applied Psychology, 93(5):1165-1173, 2008. doi:10.1037/0021-9010.93.5.1165.

S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735-1780, 1997.

J. Iriarte, E. Urrestarazu, M. Valencia, M. Alegre, A. Malanda, C. Viteri, and J. Artieda. Independent component analysis as a tool to eliminate artifacts in EEG: a quantitative study. Journal of Clinical Neurophysiology, 20(4):249-257, 2003.

Y. P. Lin, Y. H. Yang, and T. P. Jung. Fusion of electroencephalographic dynamics and musical contents for estimating emotional responses in music listening. Frontiers in Neuroscience, 8:94, 2014. doi:10.3389/fnins.2014.00094.

Joseph W. Matiko, Yang Wei, Russel Torah, Neil Grabham, Gordon Paul, Stephen Beeby, and John Tudor. Wearable EEG headband using printed electrodes and powered by energy harvesting for emotion monitoring in ambient assisted living. Smart Materials and Structures, 24(12), 2015. doi:10.1088/0964-1726/24/12/125028.

Bernard Rimé. Emotion elicits the social sharing of emotion: Theory and empirical review. Emotion Review, 1(1):60-85, 2009. doi:10.1177/1754073908097189.

Y. Roy, J. Faubert, H. Banville, A. Gramfort, I. Albuquerque, and T. H. Falk. Deep learning-based electroencephalography analysis: A systematic review. Journal of Neural Engineering, 16(5), 2019. doi:10.1088/1741-2552/ab260c.

Sera Jey Science Centre. Science - brief introduction to science debate, 2015.

Debra Trampe, Jordi Quoidbach, Maxime Taquet, and Alessio Avenanti. Emotions in everyday life. PLOS ONE, 10(12):e0145450, 2015. doi:10.1371/journal.pone.0145450.


A Appendix: LSTM Results

Figure A.1: The best 16 combinations of the data for the LSTM. Ordered by highest average accuracy.


[Figure: horizontal bar chart of signal combinations against Score (%), with Average Accuracy, Average Sensitivity, and Average Specificity per combination.]

Figure A.2: Data combinations 17–32 for the LSTM, ordered from highest to lowest average accuracy.


[Figure: horizontal bar chart of signal combinations against Score (%), with Average Accuracy, Average Sensitivity, and Average Specificity per combination.]

Figure A.3: Data combinations 33–48 for the LSTM, ordered from highest to lowest average accuracy.


[Figure: horizontal bar chart of signal combinations against Score (%), with Average Accuracy, Average Sensitivity, and Average Specificity per combination.]

Figure A.4: Data combinations 49–63 for the LSTM, ordered from highest to lowest average accuracy.


B Appendix: CNN Results

[Figure: horizontal bar chart. Each entry is one combination of the input signals (raw, alpha, beta, gamma, delta, theta); x-axis: Score (%); bars per combination: Average Accuracy, Average Sensitivity, Average Specificity.]

Figure B.1: The best 16 data combinations for the CNN, ordered from highest to lowest average accuracy. The top combination, raw + beta + theta, reaches 89% average accuracy.


[Figure: horizontal bar chart of signal combinations against Score (%), with Average Accuracy, Average Sensitivity, and Average Specificity per combination.]

Figure B.2: Data combinations 17–32 for the CNN, ordered from highest to lowest average accuracy.


[Figure: horizontal bar chart of signal combinations against Score (%), with Average Accuracy, Average Sensitivity, and Average Specificity per combination.]

Figure B.3: Data combinations 33–48 for the CNN, ordered from highest to lowest average accuracy.


[Figure: horizontal bar chart of signal combinations against Score (%), with Average Accuracy, Average Sensitivity, and Average Specificity per combination.]

Figure B.4: Data combinations 49–63 for the CNN, ordered from highest to lowest average accuracy.
