

RBF Based Responsive Stimulators to Control Epilepsy

by

Siniša Čolić

A thesis submitted in conformity with the requirements

of the degree of Master of Applied Science

Department of Electrical and Computer Engineering

University of Toronto

© Copyright by Siniša Čolić 2009


RBF Based Responsive Stimulators to Control Epilepsy

Siniša Čolić

Master of Applied Science

Department of Electrical and Computer Engineering

University of Toronto

2009

Abstract

Deep Brain Stimulation (DBS) has received attention in the scientific community for its

potential to suppress epileptic seizures. To date, DBS has only achieved marginal positive

results. We believe that a highly complex possibly chaotic (HPC) biologically inspired

stimulation is superior to periodic stimulation. Using Radial Basis Functions (RBFs), we

modeled interictal and postictal time series based on electroencephalograms (EEGs) of rat

hippocampus slices while under low Mg2+/high K+. We then compared the RBF based

interictal and postictal stimulations to the periodic stimulation using a Cognitive Rhythm

Generator (CRG) model for spontaneous Seizure-Like Events (SLEs). What resulted was a

significant improvement in seizure suppression with the HPC stimulators at lower gains as

opposed to the periodic signal. This suggests that the use of biologically inspired HPC

stimulators will achieve better results while confining the stimulation to a narrow region of

the brain.


Acknowledgements

I would like to thank Berj Bardakjian for his guidance, understanding and abundant optimism

that would have me leaving each Thursday lab meeting ready to take on the world. I would like

to thank my group members and friends, Marija Cotic, Osbert Zalay, Eunji Kang, Demitre

Serletis, Josh Dian, Angela Lee and Dave Stanley for the helpful discussions and advice. I would

specifically like to thank Eunji for providing me with the experimental recordings; Osbert for

providing me with the complexity analysis program, developing the CRG stimulation protocol

and the ROC evaluation methodology; Josh for the quick fixes and helpful suggestions. I would

also like to thank Dave and Berj for proofreading my thesis. Finally, I would like to thank my

parents for always being there through the good and the bad.


Table of Contents

1 Introduction and Motivation……………………………………………………….………………………………………….1

1.1 Stimulation Literature Review………………………………………………………………………………….4

1.1.1 Continuous Stimulation……………………………………………………………………………..5

1.1.2 Responsive Stimulation……………………………………………………………………………..6

1.2 Outline……………………………………………………………………………………………………………………..7

1.3 Hypothesis………………………………………………………..……………………………………………………..8

2 Chaos and the brain………………………………………………………………………………………………………………..9

2.1 Chaos and Complexity………………………………………………………………………………………………9

2.1.1 Lyapunov Exponent………………………………………………………………………………….10

2.1.2 Correlation Dimension…………………………………………………………………….……….11

2.2 The Brain and Chaos……………………..……………………………………………………………………….12

3 Modeling Highly Complex Possibly Chaotic Time Series………………………………………………………..14

3.1 Time Series Modeling……………………………………………………………………………………………..14

3.2 RBF Model……………………………………………………………………………………………………………...16

3.2.1 RBF Architecture……………………………………………..……………………………………….16

3.2.2 Recurrent RBF……………………………………………..…………………………………………..18

3.2.3 RBF Training Techniques…………………………….……………………………………………19

3.2.3.1 Gradient Descent……………………………..……………………………………….20

3.2.3.2 Regression Tree…………………………………..…………………………………….22

3.2.3.3 Forward Selection……………………………………………………………………..23

3.3 Application to Henon Map………………………………………………………………………………………26

3.3.1 Henon Map………………………………………………………..…………………………….………26

3.3.2 Data Preprocessing……………………………………….………………………………………….28

3.3.3 RBF Training of Henon Map…………………………………………………………………..…28

3.4 Application to Non-Ictal Time Series.…………..…………………………………………………………31

3.4.1 Low Mg2+/High K+ Animal Data………………………………………………………………..32

3.4.2 Data Preprocessing………………………………………………………………………………….32

3.4.3 RBF Training of Non-Ictal Time Series.……………………………………………..……..33

3.4.2.1 Gradient Descent………………………………………………………………………35

3.4.2.2 Forward Selection……………………………………………………………………..38

3.4.2.3 Tree Regression…………………………………………………………………………41


4 CRGs and Modeling Spontaneous Seizure-Like Events…………………………………………………………..47

4.1 Literature Review……………………………………………………………………………………………………47

4.2 CRG Based Spontaneous Seizure-Like Model………………………………………………………….49

5 Controlling Seizures………………………………………………………………………………………………………………55

5.1 Application of Stimulation to CRGSLE model…………………………………………………………..55

5.2 Periodic Stimulator Frequency Selection………………………………………………………………..57

5.3 Results of RBF stimulation………………………………………………………………………………………60

5.4 ROC Measurements………………………………………………………………………………………………..62

5.4.1 ROC Curve Construction………………………………………………………………………..…63

5.4.2 ROC Curve Comparison…………………………………………………………………………...64

5.4.3 Area Under ROC Curve………………………………………………………………………….…66

6 Discussion and Future Work………………………………………………………………………………………………….72

6.1 RBF Model Captures Complexity…………………………………………………………………………….72

6.2 Complex RBF Stimulation Outperforms Periodic…………………………………………………….73

6.3 Low Gain More Successful in Complex Stimulation…………………………………..……..…….74

6.4 Future work……………………………………………………………………………………………………………74

Conclusion…..…………………………………………………………………………………………………………………………..77

Bibliography…………………………………………………………………………………………………………………………….78


List of Tables

Table 3.1 – Henon Map Gradient descent training parameters and results………………………………29

Table 3.2 – Complexity of interictal and postictal time series…………………………………………………..34

Table 3.3 – Interictal gradient descent training parameters and results…………………………………..36

Table 5.1 – Determination of the ROC cases…………………………………………………………………………….64

Table 5.2 – 0.01 Gain ROC Area Significance…………………………………………………………………………….68

List of Figures

Figure 1.1: Extracellular recording of seizure time series…………………………………………………………..4

Figure 3.1: Radial Basis Function Model………………………………………………………………………………….18

Figure 3.2: Comparison of non-chaotic and chaotic Henon map time series……………………………31

Figure 3.3: Comparing RBF Henon map model to chaotic time series……………………………………..29

Figure 3.4: RBF Interictal model after gradient descent training……………………………………………..37

Figure 3.5: Results of Interictal RBF training with forward selection……………………………………..…39

Figure 3.6: RBF interictal model after training with forward selection…………………………………….40

Figure 3.7: Results of interictal RBF training with tree regression……………………………………………43

Figure 3.8: RBF interictal after training with tree regression…………………………………………………...44

Figure 3.9: Results of postictal RBF training with tree regression…………………………………………...46

Figure 3.10: RBF postictal training with tree regression…………………………………………………………..44


Figure 4.1: CRGSLE model……………………………………………………………………………………………………….52

Figure 4.2: CRGSLE model output waveforms produced……………………………………………………….…53

Figure 4.3: Comparison of the CRGSLE seizures to the actual seizures being modeled……………54

Figure 5.1: Stimulation Setup…………………………………………………………………………………………………..57

Figure 5.2: FFT Comparison of RBF Stimulator and 12Hz Periodic Stimulator………………………….59

Figure 5.3: Stimulation of the CRGSLE model with interictal, postictal and periodic stimulation

models…………………………………………………………………………………………………………………………………….61

Figure 5.4: ROC comparison of the periodic, interictal and postictal stimulation…………………….65

Figure 5.5: ROC area under the curve comparison of the periodic, interictal and postictal

stimulations……………………………………………………………………………………………………………………………..68

Figure 5.6: ROC area for different gains of the stimulation models 50 reinitialization………….…69

Figure 5.7: ROC area for different gains of the stimulation models 500 reinitialization……..……70

Figure 5.8: ROC area for different gains different periodic frequencies………………………………..…71


List of Abbreviations

ANN Artificial Neural Network

CRG Cognitive Rhythm Generator

CRGSLE Cognitive Rhythm Generator Seizure-Like Event Model

DBS Deep Brain Stimulation

EEG Electroencephalogram

EMG Electromyogram

exThr Stimulation Threshold

FPGA Field Programmable Gate Array

FS Forward Selection

GCV Generalized Cross-validation Error

HPC Highly Complex Possibly Chaotic

LPR Low Complexity Possibly Rhythmic

Lmax Maximum Lyapunov Exponent

MSE Mean Square Error

NRE Neural Rhythm Extractor

RBF Radial Basis Function

ROC Receiver Operating Characteristic

SLE Seizure-Like Event

STLmax Short Time Maximum Lyapunov Exponent

TR Tree Regression

VNS Vagus Nerve Stimulator


CHAPTER 1

INTRODUCTION AND MOTIVATION

Epilepsy is a serious neurological disorder often accompanied by seizure or ictal events.

Seizures are characterized as a transition from normal, highly complex possibly chaotic

(HPC) activity to low complexity possibly rhythmic (LPR) activity [1][2][3]. The majority of

epileptics (approx. 80%) can be treated with anticonvulsive drug therapies which inhibit the

channel transport mechanisms [4]. Of the remaining 20%, some resort to surgery, which carries

with it many risks. Those who are not candidates for surgery have turned to a new form of

treatment known as Deep Brain Stimulation (DBS). Still in the early stage of epilepsy research,

DBS has shown promising results in treating patients with intractable epilepsy [5].

DBS is a crude stimulation technique that consists of implanting electrodes around the

seizure focal point and applying high voltage periodic stimulation to counteract seizures [5].

These DBS stimulators are applied for fixed durations or continuously, whether or not the patient

needs the stimulation. This lack of responsiveness is a major shortcoming of the DBS

treatment. Here we propose a new responsive, highly complex possibly chaotic stimulation

technique inspired by biological time series recordings.

A seizure can be broken down into three main regions which we refer to as the interictal, ictal

and postictal (see figure 1.1). The ictal region is where the characteristics of a seizure are

present. The interictal region occurs just prior to the ictal and the postictal region occurs just

after the ictal. From now on we will use the term non-ictal to refer to interictal and postictal

regions of a seizure time series.

As stated earlier, the work in [1][2][3] indicates that normal non-ictal brain activity is highly

complex possibly chaotic (HPC), whereas ictal activity is of lower complexity, possibly

rhythmic. Our goal is to provide a stimulation technique which sustains the brain in the highly

complex state, preventing the transition to the low complexity seizure activity. The stimulation

is only to be applied when a seizure is present. To this end we have constructed a responsive

model based on the non-ictal brain activity.

The model chosen to represent the healthy non-ictal activity was the Radial Basis Function

(RBF). It was chosen due to its success in modeling highly complex time series of the financial

sector, natural generalization tendencies and low processing requirements [6][7]. The RBF

model was trained on extracellular recording samples of seizure-like events (SLE) accumulated

from multiple slices of the rat hippocampus under the in-vitro low Mg2+ epilepsy model.


Due to the chaotic nature of the brain signal we never intended to make perfect predictions

from the time series data. Instead we opted to create models of non-ictal activity from the

interictal and postictal regions of a SLE with matching characteristics in wave shape and

complexity to the original training data.

To assess the feasibility of our model in seizure control we employed our stimulation paradigm

on a coupled oscillator model of SLEs [8]. The goal was to achieve an improvement in seizure

reduction with our biologically inspired HPC stimulation over the presently used periodic

stimulation.


Figure 1.1 – Extracellular recording of seizure time series

The data was sampled at 2kHz from rat hippocampal slices under the influence of low Mg2+.

We have further broken the data into three regions referred to as the interictal, ictal and postictal.

1.1 Stimulation Literature Review

The use of stimulation to control seizures has been around for many years. The most common

and only FDA approved implantable device for treatment of epilepsy is the Vagus Nerve

Stimulator (VNS) [9][10]. The vagus nerve stimulator applies periodic electrical pulses to the left

vagus nerve, which then make their way to the brain. Recent studies have shown that only 30-40%

of patients undergoing the treatment experienced a 50% seizure reduction [11].

An alternative option known as Deep Brain Stimulation (DBS) has recently become a popular

technique to control epilepsy [9]. In the past DBS has been fairly successful in treating disorders


such as Parkinson’s and depression. It is believed that the same level of success can be achieved

in treating epilepsy. DBS uses periodic stimulation which can be described as two square pulses

one after the other with one positive and the other negative. The DBS treatment is highly

dependent on the placement of the electrodes and the type of stimulation used. There are two

main styles of DBS stimulation. The first and the one most often used is the continuous

stimulation with periodic waveforms [9][12][13][14]. The other is responsive stimulation and it

is beginning to gain notice, although it is much harder to achieve as it requires that the seizures

be detected as early as possible [3][9].

1.1.1 Continuous Stimulation

Continuous stimulation, as the name implies is a continuous application of the stimulation

whether the subject is experiencing a seizure or not [9]. The stimulation can also be applied on

a timer basis where the stimulation turns on and off based on a time interval (e.g. on for one

minute, off for two minutes). Many human trials have been performed with varying results [9]. A

study performed recently used periodic stimulation with a frequency range of 130-200Hz to

treat temporal lobe epilepsy in the hippocampus [13][14]. They showed remarkable results with

most subjects experiencing a 50% reduction in seizure frequency and a significant number of

patients experienced a 90% reduction and became completely seizure free. A subsequent study

by a Canadian group that tried to match the same results found an improvement of only 15%

[15]. The general story of DBS is that the results are not repeatable. In addition, many of the

patients who became seizure free remained so only for a short time; after a couple of years

the symptoms returned [12].


1.1.2 Responsive Stimulation

Responsive stimulation differs from continuous stimulation in that the stimulation is only

applied when needed [9]. Determining when the stimulation is needed is a much more difficult

task and requires some way to detect the approaching seizure. There are two ways in which

responsive stimulation can be applied. The first is to apply stimulation once the seizure is

observed, although this may often be too late to stop the seizure [16][17]. The second

method is to use a predictive system that warns of an impending seizure event and applies the

stimulation prior to the event in the hope that the seizure would not occur at all [3].

In a recent study done by Fountas et al., eight patients had an external Responsive

Neurostimulation (eRNS) system implanted [16][17]. The eRNS system detected the occurrence

of a seizure and applied periodic pulses ranging in frequency from 1-333Hz, with amplitude of

0.5-12mA. Of the 8 patients 7 had 45% less seizure activity and 2 had more than 75% reduction

in seizure activity.

There have been numerous DBS trials with marginal success rates. Often the results are

not repeatable. DBS employs a periodic stimulation model, whereas the brain has been shown

to be highly complex possibly chaotic (HPC). It is for that very reason that we constructed an

RBF based stimulator using the highly complex features found in the interictal and postictal

regions of the brain. In the following chapters we explain how the CRGSLE model was configured

to compare the common periodic DBS stimulation to the highly complex interictal and postictal

based stimulations.


1.2 Outline

This thesis outlines the initial steps in a long process towards viable human treatment of

epileptic seizures. The first step is the creation of a stimulation model and its application to an

in-silico model of spontaneous seizure-like events. The subsequent steps will be introduced in

the future work section of this thesis.

Having introduced the problem and motivation for the thesis in Chapter 1 we move onto

Chapter 2. In Chapter 2 we provide the background necessary to define chaos, show how it is

quantified and its relevance to the brain and epilepsy.

Chapter 3 focuses on the RBF. There we defend the use of the RBF in modeling brain complexity

from the time series extracellular data. Further we explain in detail the structure of the RBF and

the many training techniques used to model the complexity of the brain. Chapter 3 concludes

with the training results using different training methods and the verification of the model

selected.

The next two chapters focus on the generation and control of Seizure-Like Events (SLEs). In

chapter 4 we describe the SLE model created from the Cognitive Rhythm Generator (CRGSLE).

Then in chapter 5 we show how the CRGSLE model was modified to test control efficacy of our

RBF stimulations. Chapter 5 concludes with the results of stimulating with the interictal and

postictal RBF models compared to the periodic stimulation commonly used in DBS literature.


Chapter 6 concludes the work with a discussion of the results and the future work planned on

these results.

1.3 Hypothesis

Radial Basis Functions (RBFs) will capture the highly complex possibly chaotic (HPC) features

present across multiple slices of rat hippocampus from the non-ictal extracellular time series.

The stimulation of the CRGSLE model with the HPC RBF generated non-ictal signals will achieve

better results in terms of suppressing ictal events than those achieved through the periodic

signal used in Deep Brain Stimulation (DBS).


CHAPTER 2

CHAOS AND THE BRAIN

In this chapter we will introduce the concept of chaos and how it is measured. We will then

proceed to provide evidence for the existence of HPC activity in a normally functioning brain

and the highly rhythmic, possibly regular activity found in a seizing brain.

2.1 Chaos and Complexity

Chaos is a long term aperiodic behaviour in a nonlinear deterministic system that exhibits

sensitive dependence on initial conditions [18]. There are three important characteristics in this

statement that separate chaotic systems from others. First, they produce behaviour that never

repeats, not even after long term observation. Second, chaotic systems are not driven by

random inputs, but rather arise from the nonlinear evolution of trajectories. Finally, the most

important distinguishing characteristic is that chaotic systems are highly sensitive to initial

conditions. This means that two trajectories starting close to each other will diverge

exponentially with time, often referred to as the butterfly effect [18][19].

If the system’s equations are known the chaotic behaviour of the system can be computed

analytically. In general the system equations are not known and many times the only thing

available is the time series of some variable in the system (e.g. voltage). Over the years there

have been many methods developed to find how chaotic or complex a system is. Here we will

present two of the commonly used methods.

2.1.1 Lyapunov Exponents

The most important distinguishing feature of a chaotic system is the sensitive dependence on

initial conditions, in the sense that neighbouring trajectories separate exponentially fast

[18][20]. A common way to quantify this property is to use Lyapunov exponents. Consider an

n-dimensional sphere in n-dimensional state space. During the evolution of the sphere in the

state space it will deform from a sphere into an infinitesimal ellipsoid, where each dimension k

of the ellipsoid can be described by

$$\delta_k(t) \sim \delta_k(0)\, e^{\lambda_k t}, \qquad (2.1)$$

where $\delta_k(t)$ represents the separation between two trajectories along dimension k at time t. The start and end of the

trajectories are related exponentially by the $\lambda_k$, which are known as the Lyapunov exponents. A

positive Lyapunov exponent indicates the presence of chaos and a negative or zero Lyapunov


exponent means the system is non-chaotic. For a system to be considered chaotic only one of

the Lyapunov exponents needs to be positive, or another way to say it is that the maximum

Lyapunov exponent is greater than 0 in chaotic systems [18][19][20].

A successful method for calculating the Lyapunov exponent from time series is known as Wolf’s

method [20][21]. Wolf’s method takes a reference trajectory and follows the divergence of the

neighbouring trajectories from it. In order to ensure the separation between the two

trajectories does not diverge to infinity or extremely large values it is often necessary to

renormalize. This is done by picking a new point every time a threshold value is exceeded and

the process continues. An average is then taken to find the average divergence rate and with it

the maximum Lyapunov exponent is obtained. The drawback of Wolf’s method is that it

requires many time series points to calculate the divergence. Often the data is

non-stationary and may contain multiple different regions of chaotic and non-chaotic behaviour.

Therefore short time techniques such as the short time maximum Lyapunov exponent

(STLmax) [22][23] and Rosenstein's method [24] are used. In this thesis we opted for the STLmax

method, which is based closely on Wolf's algorithm; the details are outlined in Iasemidis et

al., 1990 [22].
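
The divergence-and-renormalization idea behind these methods can be illustrated with a minimal Python sketch. It is not the STLmax algorithm used in this thesis and it does not work from a recorded time series: it assumes the system equations are known (the Henon map with the commonly used chaotic parameters a = 1.4, b = 0.3) and renormalizes the separation after every iteration instead of when a threshold is exceeded. The function names and default values are illustrative assumptions.

```python
import math

def henon_step(x, y, a=1.4, b=0.3):
    """One iteration of the Henon map (standard chaotic parameters assumed)."""
    return 1.0 - a * x * x + y, b * x

def max_lyapunov_henon(n_steps=20000, d0=1e-8, n_transient=1000):
    """Estimate the maximum Lyapunov exponent (natural log units per iteration)."""
    x, y = 0.0, 0.0
    # Discard a transient so the reference trajectory settles onto the attractor.
    for _ in range(n_transient):
        x, y = henon_step(x, y)
    # Neighbouring trajectory starting a small distance d0 away.
    xp, yp = x + d0, y
    log_growth = 0.0
    for _ in range(n_steps):
        x, y = henon_step(x, y)
        xp, yp = henon_step(xp, yp)
        d = math.hypot(xp - x, yp - y)
        log_growth += math.log(d / d0)
        # Renormalize: pull the neighbour back to distance d0 along the same direction.
        scale = d0 / d
        xp, yp = x + (xp - x) * scale, y + (yp - y) * scale
    # Average exponential divergence rate.
    return log_growth / n_steps

lyap = max_lyapunov_henon()
print(f"estimated maximum Lyapunov exponent: {lyap:.3f}")
```

A positive result indicates exponential divergence of neighbouring trajectories, which is the criterion for chaos described above.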

2.1.2 Correlation Dimension

Much like the Lyapunov exponent, correlation dimension tries to quantify the chaotic

behaviour of a system. Correlation dimension is a geometrical quantity that characterizes the

minimal number of variables needed to fully describe the dynamics of motion [21]. The larger


the number of variables needed the more chaotic it is. Grassberger and Procaccia devised an

efficient way to do this which has become the standard for calculating the correlation

dimension [18][25].

The Grassberger and Procaccia method works by fixing a point x on the attractor A and

letting $N_x(\epsilon)$ denote the number of points on A within the ball of radius $\epsilon$ centred on the fixed

point x. The number of points is then measured as the radius $\epsilon$ is increased. As the radius $\epsilon$

grows, the number of points inside the ball centred at x grows according to a power law

described by

$$N_x(\epsilon) \sim \epsilon^{d}, \qquad (2.2)$$

where d is the correlation dimension. Generally the result varies with the selection of the fixed

point x. To get a more accurate result many different fixed points are used to do the calculation

and then their average is used to find the correlation dimension [18].
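
As a rough illustration of this procedure, the following Python sketch averages the point counts over all fixed points (the correlation sum) and estimates d as the slope of log C against log radius. The choice of the Henon map as the test attractor, the number of points, the radii and all function names are assumptions made for the example; it is not the complexity analysis program used in this thesis.

```python
import math

def henon_orbit(n, a=1.4, b=0.3, n_transient=500):
    """Return n points on the Henon attractor after discarding a transient."""
    x, y = 0.0, 0.0
    points = []
    for i in range(n + n_transient):
        x, y = 1.0 - a * x * x + y, b * x
        if i >= n_transient:
            points.append((x, y))
    return points

def pairwise_distances(points):
    """All distances between distinct pairs of points."""
    dists = []
    for i in range(len(points)):
        xi, yi = points[i]
        for j in range(i + 1, len(points)):
            dists.append(math.hypot(points[j][0] - xi, points[j][1] - yi))
    return dists

def correlation_dimension(points, radii):
    """Least-squares slope of log C(eps) versus log eps."""
    dists = pairwise_distances(points)
    logs = []
    for eps in radii:
        # Fraction of pairs closer than eps: the point count averaged over all centres.
        c_eps = sum(1 for d in dists if d < eps) / len(dists)
        logs.append((math.log(eps), math.log(c_eps)))
    mean_x = sum(lx for lx, _ in logs) / len(logs)
    mean_y = sum(ly for _, ly in logs) / len(logs)
    num = sum((lx - mean_x) * (ly - mean_y) for lx, ly in logs)
    den = sum((lx - mean_x) ** 2 for lx, _ in logs)
    return num / den

dim = correlation_dimension(henon_orbit(1000), [0.01, 0.02, 0.04, 0.08, 0.16])
print(f"estimated correlation dimension: {dim:.2f}")
```

The estimate depends on the number of points and on the range of radii, so in practice the scaling region has to be chosen with care.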

2.2 The brain and chaos

The brain is composed of billions of neurons with roughly 10^10 synaptic connections. These

neurons join together to form the different systems in the brain such as the cerebellum,

neocortex, amygdala, and hippocampus, to name a few. No man-made system in existence

can match the complexity of the brain. Still the question arises whether or not the brain is

chaotic.


Electroencephalogram (EEG) readings, which measure the variability of the electric field in

time and space due to the firing of neuronal populations, may have provided evidence for the

existence of highly complex possibly chaotic neurodynamics in the brain. Some of the early

work done by Babloyantz et al. used correlation dimension measurements to assess the

complexity of the different stages of the sleep cycle [26]. They measured a correlation

dimension greater than 4 for the different stages and concluded that the brain possessed

chaotic dynamics in the sleep state. Further, Fell et al. provided evidence for the existence of

chaotic behaviour in the brain [27]. They applied Wolf's algorithm to time series gathered from

the different stages of sleep and obtained a positive Lyapunov exponent of 2.5 - 3. Lastly,

Babloyantz et al. compared the correlation dimension of a patient in the sleep state to an

epileptic state and found that the sleep state had a correlation dimension of 4.05, and the

epileptic state had a correlation dimension of 2.05 [28]. The drop in dimension supports the view that

during a seizure a patient is trapped in a lower dimensional, less chaotic state and only when

the state returns to a higher complexity can normal brain function resume.

Much of the work in this thesis relies on the assumption of the existence of HPC neurodynamics

in normal non-ictal brain activity. Likewise we assume the existence of lower complexity,

possibly rhythmic neurodynamics during the ictal region. This assumption is well established in

literature [2][3][20][28].


CHAPTER 3

MODELING HIGHLY COMPLEX POSSIBLY CHAOTIC TIME SERIES

In this chapter the challenges of modeling complex time series are outlined in detail. The choice

of the RBF model is defended using references in literature. The RBF model is further

decomposed into its architecture and the learning techniques. The chapter concludes with a

validation of the model's ability to produce time series that match the characteristics of the

highly complex non-ictal recording measured in the brain.

3.1 Time Series Modeling

We modeled the non-ictal time series from the extracellular reading of rat hippocampal slices.

Earlier in section 2.2 it was explained that this type of signal is non-stationary and HPC. The

chaotic feature of the time series meant that the model would not simply be performing

pattern recognition. Rather it would have to generalize to some underlying features not clearly

visible but highly relevant. These features are the key in finding the right stimulation to prevent

seizure propagation. There are also multiple sources of noise embedded in the system that

need to be avoided (e.g. noise from the setup and recording instruments such as the 60Hz

harmonic). The model used has to be able to generalize easily and avoid falling into the traps

caused by the presence of noise. It turns out that this problem is analogous to forecasting stock

market trends.

Stock market time series data is chaotic and highly noisy [7]. Using stock market time series

prediction as a starting point it was discovered that radial basis functions (RBFs) are very

successful for time series prediction [6][7]. RBFs are very similar to Artificial Neural Networks

(ANNs), except for the distinguishing difference that the inputs first pass through a non-linear

transformation, which is then followed by a linear summation. ANNs, on the other hand,

first combine the inputs through a linear summation and then perform the non-linear

transformation on those sums. The non-linear transformations in the ANNs are static, whereas

in the case of RBFs the non-linear transformation of inputs is dynamic during training because

the parameters of the RBF are updated. To compensate for the static non-linear

transformation, the ANNs have multiple layers, which add complexity at a further cost in

training time. The RBFs, in contrast, have only one layer and the relationship between the weights

and the output is linear; therefore the hardest part of training is finding the parameters

of the non-linear RBF transformation.


3.2 RBF Model

The RBF model can be described in two parts: first the architecture and second the learning

techniques. In the following section we will first describe the standard RBF architecture,

followed by the slight modification that makes the RBF function in recurrent mode.

The training of the RBF consisted of three main learning techniques. The first and standard

technique of RBF training is the gradient descent method and it was based on previous work in

the group by Courville [19]. The other learning techniques used were the Tree Regression (TR)

and Forward Selection (FS). They were applied through an RBF Matlab training function created

by the group based at the University of Edinburgh, Scotland [29][30].

3.2.1 RBF Architecture

The radial basis function (RBF) model defines an output $y_n$ as a linear expansion of radial

functions of the input $x_n$ as shown by

$$y_n = \sum_{k=1}^{N} w_k \phi_k(x_n) + w_0, \qquad (3.1)$$

where $\phi_k(x_n)$ is the output of the kth radial basis function given the input vector $x_n$ of

dimension m, N is the number of RBFs, the weight $w_k$ is the influence associated with the kth

RBF and $w_0$ is the bias. The RBF in our model was chosen to be the Gaussian RBF, as shown by


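$$\phi_k(x_n) = \exp\left( -\sum_{j=1}^{m} \frac{(x_{n,j} - c_{k,j})^2}{r_{k,j}^2} \right), \qquad (3.2)$$

where the vector $c_k$ is the center or mean of the kth RBF and the vector $r_k$ represents the radius, which sets the variance,

of the kth RBF; the subscript j denotes the jth component of each vector. The coefficient term of the Gaussian was omitted as it only modifies the scale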

and adds no further complexity. A visual representation of the RBF model is shown in figure 3.1

below.


Figure 3.1 – Radial Basis Function Model

The RBF is composed of three different layers. The input layer takes in an m dimensional input from the time series

recording. The hidden layer contains N RBFs which output a value based on the proximity of the input vector to the

Gaussian centre. Finally the output layer is the sum of all the RBF outputs multiplied by their weight factor w.
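
A minimal Python sketch of this forward computation is given here: each hidden unit evaluates an unnormalized Gaussian of the input vector, and the output layer forms the weighted sum plus a bias. The function name and the toy parameter values (N = 2 RBFs, m = 2 inputs) are hypothetical and chosen only for illustration; they are not the trained model.

```python
import math

def rbf_output(x, centers, radii, weights, bias):
    """Weighted sum of Gaussian RBF activations plus a bias term."""
    total = bias
    for c_k, r_k, w_k in zip(centers, radii, weights):
        # Squared distance to the centre, scaled per dimension by the radius.
        exponent = sum(((xj - cj) / rj) ** 2 for xj, cj, rj in zip(x, c_k, r_k))
        total += w_k * math.exp(-exponent)
    return total

# Hypothetical toy model: N = 2 RBFs, embedding dimension m = 2.
centers = [[0.0, 0.0], [1.0, 1.0]]
radii = [[1.0, 1.0], [0.5, 0.5]]
weights = [2.0, -1.0]
bias = 0.1

# The input sits on the first centre, so that unit responds with exp(0) = 1,
# while the second unit is far away and contributes almost nothing.
y = rbf_output([0.0, 0.0], centers, radii, weights, bias)
print(y)
```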

3.2.2 Recurrent RBF

In its standard mode the RBF model is used to make a prediction based on a given input. There

is an alternative mode of operation known as recurrent RBF mode where the model is only

initialized by one input sample and allowed to generate predictions indefinitely based on

that first input [31]. To achieve this recurrent mode of operation the model output $y_n$ was

replaced by $x_{n+1}$ as shown by

$$x_{n+1} = \sum_{k=1}^{N} w_k \phi_k(x_n) + w_0, \qquad (3.3)$$

where $x_{n+1}$ is the next point in the time series generated when the RBF functions in recurrent

mode. It then gets incorporated in the next input to make the prediction of $x_{n+2}$.

Recurrent mode of operation is a better validation of the model's predictive capabilities [32]. If

the model is able to capture the intrinsic features of the training time series then it should be

able to maintain activity indefinitely without converging to zero. This was one of the main

criteria used in our evaluation and selection of RBF models.
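
The recurrent mode can be sketched in Python as a loop that feeds each prediction back into the input window. This is an illustrative sketch only: it assumes the input vector holds the m most recent samples with the oldest first, and the parameter values and function names are hypothetical.

```python
import math

def rbf_output(x, centers, radii, weights, bias):
    """Weighted sum of Gaussian RBF activations plus a bias term."""
    total = bias
    for c_k, r_k, w_k in zip(centers, radii, weights):
        exponent = sum(((xj - cj) / rj) ** 2 for xj, cj, rj in zip(x, c_k, r_k))
        total += w_k * math.exp(-exponent)
    return total

def free_run(seed, n_steps, centers, radii, weights, bias):
    """Generate n_steps samples, starting from a single seed input vector."""
    window = list(seed)  # the m most recent samples, oldest first (assumed layout)
    generated = []
    for _ in range(n_steps):
        nxt = rbf_output(window, centers, radii, weights, bias)
        generated.append(nxt)
        # Slide the window: drop the oldest sample, append the new prediction.
        window = window[1:] + [nxt]
    return generated

# Hypothetical toy parameters (N = 2 RBFs, embedding dimension m = 2).
centers = [[0.0, 0.0], [1.0, 1.0]]
radii = [[1.0, 1.0], [0.5, 0.5]]
weights = [0.8, -0.6]
bias = 0.05

seed = [0.2, 0.4]
signal = free_run(seed, 50, centers, radii, weights, bias)
print(signal[:5])
```

With a trained model, inspecting whether such a free-running signal stays active or decays to a constant is one way to apply the selection criterion described above.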

3.2.3 RBF Training Techniques

The training of the RBF consists of finding superior parameters for modeling the time series

training data. These parameters consist of the Gaussian center (c) and radius (r) along with the

weight (w). Further parameters optimized are the embedding dimension (m) and number of radial basis

functions (N) needed to accurately model the given time series. To achieve this end we used

three learning techniques:

1. Gradient descent

2. Regression Tree

3. Forward Selection


In the following subsections the algorithms of all three learning techniques will be summarized.

3.2.3.1 Gradient descent

Gradient descent is a standard learning technique for optimization problems and was

the starting point for the training of the RBFs. The gradient descent method uses the gradient

with respect to each of the three parameters mentioned above (c, r, w) to find how the error

changes as those parameters increase or decrease. The error is calculated by equation 3.4,

where $\hat{x}_{n+1}$ is the predicted time series value and D is the length of the training time series

data. After many training epochs the parameters and the error slowly converge to one of a

number of possible error minima, also known as local minima.

$$E = \frac{1}{2} \sum_{n=0}^{D-1} \left( \hat{x}_{n+1} - x_{n+1} \right)^2 \qquad (3.4)$$

Using the previous work done by Courville [19] as a guide, the gradients were calculated for the

three parameters by differentiating the error function (equation 3.4� with respect to the three

parameters (c,r,w). Where c and r are of size [N x m] and w is of size [N+1], with an extra 1

added for the bias. The gradients shown below,

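$$\frac{\partial E}{\partial w_i} = \sum_{t=m}^{D-1} \left( \sum_{j} w_j \phi_j(\mathbf{x}_t) - x_{t+1} \right) \phi_i(\mathbf{x}_t) = 0 \qquad (3.5)$$

$$\frac{\partial E}{\partial c_{ij}} = \sum_{t=m}^{D-1} \left( \hat{x}_{t+1} - x_{t+1} \right) w_i \exp\left( -\sum_{k} \frac{(x_{t-m+k} - c_{ik})^2}{r_{ik}^2} \right) \frac{2 \left( x_{t-m+j} - c_{ij} \right)}{r_{ij}^2} \qquad (3.6)$$

$$\frac{\partial E}{\partial r_{ij}} = \sum_{t=m}^{D-1} \left( \hat{x}_{t+1} - x_{t+1} \right) w_i \exp\left( -\sum_{k} \frac{(x_{t-m+k} - c_{ik})^2}{r_{ik}^2} \right) \frac{2 \left( x_{t-m+j} - c_{ij} \right)^2}{r_{ij}^3} \qquad (3.7)$$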

The gradients are then used to update the parameters such that after each training epoch a discrete step is taken down the error surface towards a local error minimum. The standard way to do this is simply to use a predetermined learning rate to control the step size. This is where we diverged from [19]. Instead the step size was calculated using the conjugate gradient method described by Shewchuk [33]. It requires only a function that outputs the error and gradient information. It then uses the conjugate gradient technique to find the best step direction efficiently and updates the parameters recursively until the desired stopping criterion is met.

Training in a gradient descent method can be stopped in many ways. One way is to stop training when the error reaches a low enough value. Another method is to stop when the error reduction between epoch n and n+1 is smaller than a predetermined value. In our setup we used a predetermined number of training epochs as the stopping criterion.
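The error and its gradients can be sketched in Python as follows. This is only an illustration under stated assumptions: the basis functions are Gaussians with a per-dimension radius, the last entry of `w` is the bias, and `descent_step` uses a fixed learning rate, which is the standard approach mentioned above and not the conjugate gradient step of [33]. All function names are hypothetical.

```python
import math

def forward(x, c, r, w):
    """Return (prediction, activations) for one input window; w[-1] is the bias."""
    phi = [math.exp(-sum(((xk - ck) / rk) ** 2 for xk, ck, rk in zip(x, ci, ri)))
           for ci, ri in zip(c, r)]
    return sum(wi * p for wi, p in zip(w, phi)) + w[-1], phi

def error_and_gradients(inputs, targets, c, r, w):
    """Error of equation 3.4 and its gradients with respect to w, c and r."""
    n, m = len(c), len(c[0])
    err = 0.0
    gw = [0.0] * (n + 1)
    gc = [[0.0] * m for _ in range(n)]
    gr = [[0.0] * m for _ in range(n)]
    for x, target in zip(inputs, targets):
        pred, phi = forward(x, c, r, w)
        diff = pred - target
        err += 0.5 * diff * diff
        gw[n] += diff  # the bias sees a constant activation of 1
        for i in range(n):
            gw[i] += diff * phi[i]
            for j in range(m):
                d = x[j] - c[i][j]
                common = diff * w[i] * phi[i] * 2.0 * d
                gc[i][j] += common / r[i][j] ** 2
                gr[i][j] += common * d / r[i][j] ** 3
    return err, gw, gc, gr

def descent_step(inputs, targets, c, r, w, rate=0.01):
    """One fixed-rate update in place; returns the error before the step."""
    err, gw, gc, gr = error_and_gradients(inputs, targets, c, r, w)
    for i in range(len(w)):
        w[i] -= rate * gw[i]
    for i in range(len(c)):
        for j in range(len(c[0])):
            c[i][j] -= rate * gc[i][j]
            r[i][j] -= rate * gr[i][j]
    return err
```

A function such as `error_and_gradients` is the kind of routine a conjugate gradient optimizer would be handed, since it returns both the error and the gradient information.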

3.2.3.2 Regression Tree

Another learning technique was the regression tree (RT), based on the work of Orr et al. [29]. Much like unsupervised learning techniques, RT takes the initial [d x p] data matrix to compute a regression tree, where d signifies the dimensionality of the data and p is the number of data patterns used in the training. Then the first node, also referred to as the root, is initialized.

The training algorithm orders the training data values along each dimension from least to greatest. It splits the data in two at candidate positions from nmin to p-nmin, where nmin is the minimum number of data values that have to remain in each branch after a split. The boundary that creates the lowest error is then chosen, where the error is determined using,

$$\bar{y}_L = \frac{1}{p_L} \sum_{i \in S_L} y_i, \qquad (3.8)$$

$$\bar{y}_R = \frac{1}{p_R} \sum_{i \in S_R} y_i, \qquad (3.9)$$

$$E(k, b) = \frac{1}{p} \left[ \sum_{i \in S_L} \left( y_i - \bar{y}_L \right)^2 + \sum_{i \in S_R} \left( y_i - \bar{y}_R \right)^2 \right], \qquad (3.10)$$

where k represents the dimension, b is the boundary choice, $\bar{y}_L$ is the average of the output for the left branch, $\bar{y}_R$ is the average of the output for the right branch, and $S_L$ and $S_R$ are the samples on the left and right branches.

After the greedy search through the input space for the branching corresponding to the lowest errors, we can calculate the centre and radius parameters starting from the root node, that is, the parent node. The centres and radii are calculated by,

$$r_k = \frac{1}{2} \left( \max_{i \in S_k} (x_i) - \min_{i \in S_k} (x_i) \right), \qquad (3.11)$$

$$c_k = \frac{1}{2} \left( \max_{i \in S_k} (x_i) + \min_{i \in S_k} (x_i) \right), \qquad (3.12)$$

where $r_k$ and $c_k$ stand for the radii and centre of the kth node in the regression tree.
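The split search and the centre and radius calculation can be sketched in Python as below. This is a simplified illustration, not the implementation of [29]: it assumes nmin is at least 1, places the boundary midway between the two neighbouring samples (an assumption), and handles a single node only. The function names are hypothetical.

```python
def split_error(left, right):
    """Equation 3.10: squared deviation from each branch mean, averaged over p."""
    total = 0.0
    for branch in (left, right):
        mean = sum(branch) / len(branch)
        total += sum((y - mean) ** 2 for y in branch)
    return total / (len(left) + len(right))

def best_split(inputs, outputs, n_min):
    """Greedy search over every dimension k and boundary b; n_min must be >= 1."""
    p, dims = len(outputs), len(inputs[0])
    best = None
    for k in range(dims):
        order = sorted(range(p), key=lambda i: inputs[i][k])
        for cut in range(n_min, p - n_min + 1):
            left = [outputs[i] for i in order[:cut]]
            right = [outputs[i] for i in order[cut:]]
            err = split_error(left, right)
            if best is None or err < best[0]:
                # Assumed convention: boundary halfway between neighbours.
                boundary = 0.5 * (inputs[order[cut - 1]][k] + inputs[order[cut]][k])
                best = (err, k, boundary)
    return best  # (error, dimension, boundary)

def node_centre_radius(samples):
    """Equations 3.11 and 3.12: half-sum and half-range of the node's samples."""
    dims = len(samples[0])
    lo = [min(s[k] for s in samples) for k in range(dims)]
    hi = [max(s[k] for s in samples) for k in range(dims)]
    centre = [0.5 * (h + l) for h, l in zip(hi, lo)]
    radius = [0.5 * (h - l) for h, l in zip(hi, lo)]
    return centre, radius
```

Applying `best_split` recursively to each branch, and `node_centre_radius` to every node, would yield the pool of candidate centres and radii described above.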

The RT produces a large selection of radii and centres which are well suited to model the

training data. From here we can create a fairly good model of the training data; however, the

model would be very large as it contains almost as many parameter choices as the number of

data samples. Not all of the parameters may contribute significantly to a reduction in error. By

using a pruning method known as Forward Selection (FS) we can reduce the number of

parameters and still maintain a high degree of model performance.

3.2.3.3 Forward Selection

Forward selection (FS) finds the subset of model parameters that creates the greatest reduction in output error. As opposed to backward selection, FS works by initializing to the RBF with the greatest influence on error reduction in the matrix representing all RBFs. It then builds on that by recursively searching through the remaining RBFs for the next RBF that creates the greatest reduction in error. This continues until a special error known as the generalized cross-validation (GCV) error stops decreasing, at which point the addition of further RBFs will not contribute to reducing the error. FS can be significantly improved by combining it with orthogonal least squares [34]. This is a Gram-Schmidt orthogonalisation process which ensures that each new parameter added is orthogonal to all the previously added parameters [35]. It improves the calculation by making it easier to compute the sum-squared-error term.

Orthogonal least squares springs from the idea that any matrix can be factored into a product of an orthogonal matrix and an upper triangular matrix

$$F_m = \tilde{H}_m U_m, \qquad (3.13)$$

$$\tilde{H}_m = [\tilde{h}_1 \; \tilde{h}_2 \; \dots \; \tilde{h}_m] \in \mathbb{R}^{p \times m}; \quad \tilde{h}_i^T \tilde{h}_j = 0 \text{ for } i \neq j, \qquad (3.14)$$

where $F_m$ is the design matrix, $\tilde{H}_m$ is the orthogonal matrix and $U_m$ is an upper triangular matrix.

Using this idea of orthogonality, forward selection proceeds to compute the projection of the design matrix F acquired from the RT.

$$\tilde{f}_i = f_i - \sum_{j=1}^{m-1} \frac{f_i^T \tilde{h}_j}{\tilde{h}_j^T \tilde{h}_j} \tilde{h}_j, \qquad (3.15)$$

where $\tilde{f}_i$ represents the projection of the ith parameter and $\tilde{h}_j$ is the jth component of the design matrix being constructed. The process is iterated p times until all the remaining components of F have been calculated to form a new matrix $\tilde{F}$. Then the reduction in mean-squared-error is calculated

$$E_{m-1} - E_m^{(i)} = \frac{\left( y^T \tilde{f}_i \right)^2}{\tilde{f}_i^T \tilde{f}_i} \qquad (3.16)$$

where y are the output values. The process is repeated p times until all the remaining components have been checked. The one that produces the lowest mean-squared-error is then added to the design matrix $\tilde{H}_m$. The process is repeated until either all parameters have been exhausted or until another error, the generalized cross-validation (GCV) error, begins to increase, indicating that the addition of any further parameters to the design matrix will have no further benefit as it leads to over-fitting. The GCV is calculated by

$$\mathrm{GCV} = \frac{p \, y^T P_m^2 \, y}{\left( \mathrm{trace}(P_m) \right)^2} \qquad (3.17)$$

$$P_m = P_{m-1} - \frac{\tilde{f}_i \tilde{f}_i^T}{\tilde{f}_i^T \tilde{f}_i} \qquad (3.18)$$

Once the design matrix has been found it is a straightforward process to calculate the optimal weights, as the RBF output is linearly dependent on the functions through the weight vector. The calculation of the weights is achieved through the calculation of equation 3.21.

$$U_m = \begin{bmatrix} U_{m-1} & \left( \tilde{H}_{m-1}^T \tilde{H}_{m-1} \right)^{-1} \tilde{H}_{m-1}^T f_j \\ 0_{m-1}^T & 1 \end{bmatrix}, \qquad (3.19)$$

$$\hat{w}_m = \left( \tilde{H}_m^T \tilde{H}_m \right)^{-1} \tilde{H}_m^T y, \qquad (3.20)$$

$$w_m = U_m^{-1} \hat{w}_m, \qquad (3.21)$$

where $U_m$ is the upper triangular matrix introduced in the transformation to orthogonality and w is the weight matrix.
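The selection loop can be sketched in Python as follows. The sketch makes several simplifying assumptions: each candidate RBF is supplied as a column of its responses to the p training inputs, no regularization is used (so that trace(P_m) = p - m and y^T P_m^2 y is the sum-squared-error), selection stops the first time the GCV fails to decrease, and the recovery of the weights through equations 3.19 to 3.21 is left out. The names `dot` and `forward_select` are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def forward_select(columns, y):
    """Greedy forward selection with Gram-Schmidt updates and a GCV stop."""
    p = len(y)
    # Candidates are kept orthogonalised against everything chosen so far.
    remaining = {i: list(col) for i, col in enumerate(columns)}
    chosen = []
    sse = dot(y, y)  # sum-squared-error of the empty model
    best_gcv = float("inf")
    while remaining and len(chosen) + 1 < p:
        gains = {}
        for i, f in remaining.items():
            ff = dot(f, f)
            if ff > 1e-12:  # skip candidates already spanned by the model
                gains[i] = dot(y, f) ** 2 / ff  # equation 3.16
        if not gains:
            break
        pick = max(gains, key=gains.get)
        new_sse = max(sse - gains[pick], 0.0)  # clamp rounding noise
        gcv = p * new_sse / (p - (len(chosen) + 1)) ** 2  # equation 3.17
        if gcv >= best_gcv:
            break  # GCV no longer decreasing: stop adding RBFs
        best_gcv, sse = gcv, new_sse
        h = remaining.pop(pick)
        chosen.append(pick)
        hh = dot(h, h)
        for i in list(remaining):  # equation 3.15, one projection at a time
            coef = dot(remaining[i], h) / hh
            remaining[i] = [fv - coef * hv for fv, hv in zip(remaining[i], h)]
    return chosen, sse
```

Because every remaining candidate is already orthogonal to the chosen columns, the error reduction of each candidate is a single inner product, which is the computational benefit of the orthogonal least squares formulation.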

3.3 Application to Henon Map

The Henon map is a 2-dimensional dynamical system that has been well studied due to its ability to exhibit chaotic behavior for certain parameters. This makes it a good model to test out the learning techniques introduced. The dynamics of the Henon map are assumed to be significantly less complex than those of the non-ictal time series, thus making the Henon map a good starting place to verify the RBF model's abilities. In what is to follow we applied the gradient descent training technique on the Henon map to see if the gradient descent RBF model can capture the chaotic behaviour of the Henon map.

3.3.1 Henon Map

The Henon map is defined by,

$$x_{t+1} = y_t + 1 - a x_t^2 \qquad (3.22)$$

$$y_{t+1} = b x_t \qquad (3.23)$$

where a and b are two parameters that can be preset to make the map exhibit chaotic behaviour. In figure 3.2 below we show the difference between the Henon map running in non-chaotic mode with parameters a = 1.25 and b = 0.3 and chaotic mode with a = 1.4 and b = 0.3. The non-chaotic mode shown in figure 3.2a is periodic and can easily be predicted well in advance. The chaotic mode shown in figure 3.2b has a chaotic pattern which cannot be predicted far in advance.
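For reference, the map in its standard form can be iterated with a few lines of Python. The starting point (0, 0) and the function name `henon_series` are assumptions made for this sketch; the original data was generated in Matlab.

```python
def henon_series(n, a=1.4, b=0.3, x0=0.0, y0=0.0):
    """Iterate the Henon map n times and return the x and y time series."""
    xs, ys = [], []
    x, y = x0, y0
    for _ in range(n):
        # Both updates use the previous (x, y) pair.
        x, y = y + 1.0 - a * x * x, b * x
        xs.append(x)
        ys.append(y)
    return xs, ys

# Chaotic and non-chaotic parameter choices used in figure 3.2.
chaotic_x, chaotic_y = henon_series(200, a=1.4, b=0.3)
periodic_x, periodic_y = henon_series(200, a=1.25, b=0.3)
```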

Figure 3.2 – Comparison of non-chaotic and chaotic Henon map time series

a) Non-chaotic Henon map time series with a=1.25 and b=0.3. b) Chaotic Henon map time series with a=1.4 and b=0.3.

[Figure 3.2: two panels plotting Yn against Time (0 to 200). a) Non-chaotic Henon map (a=1.25, b=0.3). b) Chaotic Henon map (a=1.4, b=0.3).]

3.3.2 Data Preprocessing

In order to model the Henon map it was necessary to turn the time series into training samples. Four thousand time series samples were generated in Matlab for training the RBF to model the Henon map. Following the generation of the data, the data was rearranged to produce samples of varying time embedding, where time embedding refers to the number of time points used prior to a prediction. It can also be thought of as the input length. Training samples were created with time embeddings of 1, 2, 5, 10 and 20.
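As an illustration, the rearrangement into embedded samples can be written in Python as below. The sketch assumes consecutive points with no delay between them, and the name `embed` is hypothetical.

```python
def embed(series, m):
    """Pair each window of m consecutive points with the point that follows it."""
    inputs, targets = [], []
    for t in range(m - 1, len(series) - 1):
        inputs.append(list(series[t - m + 1:t + 1]))  # the m most recent points
        targets.append(series[t + 1])                 # the value to predict
    return inputs, targets
```

For a series of D points and an embedding of m this yields D - m training pairs, so larger embeddings leave slightly fewer samples.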

3.3.3 RBF Training of Henon map

The training of the Henon map was done with the gradient descent method mentioned earlier.

The initial center parameters c were selected from the training data, the radius (or variance) r was set
to 0.1, and the weights were randomized from a Gaussian distribution. The training variables

used in the gradient descent method are shown in Table 3.1.

The error calculation was done in two steps. First, during training, the error was calculated using the MSE for the regular non-recurrent mode. Afterwards the model was verified using the MSE of the recurrent mode generation compared with the actual Henon map time series.

$$\mathrm{MSE} = \frac{1}{D - m - 1} \sum_{i} \left( x_i - f(\mathbf{x}_i) \right)^2 \qquad (3.24)$$

where m is the embedding of the model used to make the prediction $f(\mathbf{x}_i)$ and D is the number of sample points used in the training.

Since in recurrent mode the models diverge very rapidly from the original time series (see figure 3.3a), it was decided that MSE alone was not enough to validate the model. Therefore complexity was also used as a way to confirm the model selection. To calculate the maximum Lyapunov exponent we used STLmax [22][23]. To calculate the correlation dimension we used a Matlab program based on Grassberger and Procaccia [25] written by Zalay, who is a member of our group. The complexity of each model was compared with the complexity of the training data. The complexity was calculated using 8000 samples, a time constant of 2 and an embedding dimension of 7. For the training data the result was 0.99 for the maximum Lyapunov exponent and 1.33 for the correlation dimension.

Table 3.1 – Henon Map gradient descent training parameters and results

Model  Embedding (m)  RBFs (N)  Training Epochs  MSE (non-Recurrent)  MSE (Recurrent)  Max Lyapunov Exponent  Correlation Dimension
1      1              10        1000             2.00e-3              3.64e-2          0.20                   NaN*
2      2              10        1000             4.55e-7              1.12e-2          0.87                   1.27
3      5              20        1000             4.47e-6              1.12e-2          -0.12                  0
4      10             20        1000             2.90e-3              7.90e-3          -1.66                  6.80e-3
5      20             20        1000             2.70e-3              1.12e-2          -5.30                  1.70e-3
6      1              20        1000             1.70e-3              1.69e-1          0.22                   NaN*
7      2              20        1000             5.95e-7              1.12e-2          1.10                   1.37
8      5              40        1000             7.98e-5              1.12e-2          1.15                   2.01
9      10             40        1000             4.31e-5              1.05e-1          -0.44                  0
10     20             40        1000             6.93e-2              1.45e-2          0.02                   0

* NaN refers to Not A Number and commonly results when the correlation dimension is unable to be calculated; in this case it is because Model 1 and Model 6 produced a steady constant value in recurrent mode.

The trained RBFs were used in recurrent RBF mode to produce time series of length 8000 as a means to compare the effectiveness of the training. The results are shown in Table 3.1. The lowest training MSE (non-recurrent) was 4.55e-7, for model 2 with an embedding of 2 and 10 RBFs. Its recurrent MSE of 1.12e-2 was shared with models 3, 5, 7 and 8 and was bettered only by model 4 (7.90e-3). Model 2 produced a maximum Lyapunov exponent of 0.87 and correlation dimension of 1.27, which closely matched the values found on the original data, 0.99 and 1.33 respectively. Model 7, also with an embedding of 2 but with 20 RBFs, produced similar complexity to the training data, but it required more RBFs. Therefore model 2 was selected. Further, by simple observation of figure 3.3a it was verified that model 2 matches the characteristics of the Henon map. In figure 3.3b the training MSE over the first 200 epochs is provided to show how the model converged towards the error of 4.55e-7. The RBF with the gradient descent learning method was sufficient for modeling the Henon map. It was not necessary to use any of the other training techniques.

Figure 3.3 – Comparing RBF Henon map model to chaotic time series

a) Comparison of chaotic Henon map time series to RBF generated data in recurrent mode with RBF parameters of m=2 and N=10. b) Mean squared error plot with respect to epoch of gradient descent training for the same model.

[Figure 3.3: a) Comparing RBF Henon Map Model To Original Henon Map Time Series, with panels "RBF model of Henon Map with m=2 and N=10", "Original Henon Map time series for a=1.4 and b=0.3" and "Comparing RBF Prediction to Henon Map" (legend: Henon Map data; RBF (m=2, N=10)), each plotting Yn against Time. b) Gradient Descent MSE on Henon Map RBF model (m=2, N=10), plotting MSE against Epoch.]

3.4 Application to Non-ictal Time Series data

As mentioned earlier the non-ictal data is highly complex and possibly chaotic (HPC). Moreover it is non-stationary and embedded with noise. Modeling this complex time series is far more difficult than modeling the Henon map. In what is to follow we will describe the process of modeling the non-ictal time series data, starting with the acquisition of data, immediately followed by the preprocessing of the data and concluding with the training results and verification of complexity.

3.4.1 Low Mg2+/High K+ Animal Data Acquisition

Training seizure data was collected independently by Eunji, a member of our group. Eight slices from the hippocampus of male Wistar rats aged 17-25 days were obtained. The slices were then bathed in a low Mg2+/High K+ solution and electrodes were placed in the CA1 region of the hippocampus. After roughly 20-40 minutes the slices began to exhibit spontaneous seizures due to the presence of the low Mg2+/High K+ solution. The seizing activity was recorded by the electrodes at a sampling frequency of 2kHz. The whole process is described in further detail in the paper by Chiu et al. [3], with the exception that we sampled at 2kHz whereas Chiu et al. sampled at 10kHz. Once all the data had been collected it was only a matter of separating the ictal regions from the non-ictal regions. The separation of the ictal and interictal regions was the most difficult, as there is a steady increase in spikes as the seizure develops. To avoid overlapping the regions we selected interictal data as far away from the ictal region as possible. The postictal region was selected after the last ictal spike was observed (see figure 1.1).

3.4.2 Data Preprocessing

The non-ictal time series data is susceptible to many noise sources, ranging from the external environment to electromyogram (EMG) interference from muscles to simple artifacts in the measuring instrumentation. Therefore preprocessing consisted of filtering out the noise, trimming outliers in the time series recording and scaling the signal to lie within the range -1 to 1.

The signal was low pass filtered at 50Hz, followed by a light high pass filtering at 0.5Hz to remove some low frequency oscillations which interfere with the training of the RBF. After filtering, the training data was downsampled by a factor of 20, from 2kHz to 100Hz.

The trimming of the signal was done in such a way that only the occasional outliers would be removed and the remainder of the signal would fall just under the trimming threshold. Following the trimming the data was scaled such that its minimum and maximum values would lie within -1 to 1.
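The downsampling, trimming and scaling steps can be sketched in Python as follows. The filtering stage is omitted, trimming is assumed here to mean clipping at a symmetric threshold, and the scaling maps the clipped minimum and maximum exactly onto -1 and 1 (which requires a non-constant signal). These choices and the function names are assumptions for illustration only.

```python
def downsample(signal, factor=20):
    """Keep every factor-th sample (2kHz to 100Hz for a factor of 20)."""
    return signal[::factor]

def trim_and_scale(signal, limit):
    """Clip outliers beyond +/- limit, then rescale linearly onto [-1, 1]."""
    clipped = [max(-limit, min(limit, v)) for v in signal]
    lo, hi = min(clipped), max(clipped)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in clipped]
```

Downsampling without the preceding low pass filter would alias higher frequencies into the signal, so in practice `downsample` would only be applied to the filtered recording.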

3.4.3 RBF Training of Non-Ictal Time Series

The above mentioned training methodologies were then applied in three different sequences

to train our model.

1. Gradient descent

2. Forward Selection

3. Regression Tree with Forward Selection

Similarly to the Henon map the training of the RBF models was done using the MSE defined by

equation 3.24. To verify that the models sufficiently resemble the properties of the non-ictal

extracellular time series we tested the model by running it in recurrent mode and initialized

with ictal data.

The final test of the models was to verify their complexity and how well they resembled the actual non-ictal time series data used in training. The complexity was calculated by finding the maximum Lyapunov exponent and correlation dimension, which were introduced in Sections 2.1.1 and 2.1.2 respectively. Table 3.2 below shows the complexities of the interictal and postictal training data after downsampling to a 100Hz sample rate. To calculate the maximum Lyapunov exponent we used STLmax [22][23]. To calculate the correlation dimension we used a program based on Grassberger and Procaccia [16] written by Zalay, a member of our group. The results were calculated using 6 different samples of length 8000, a time constant of 2 and an embedding dimension of 7. The calculation yielded a maximum Lyapunov exponent and correlation dimension of 1.67 and 5.66 respectively for the interictal time series data. Postictal time series data yielded a maximum Lyapunov exponent and correlation dimension of 1.64 and 6.33 respectively. These complexity results were compared later with the complexity found from the RBF models generating in recurrent mode.

Table 3.2 – Complexity of interictal and postictal training time series

Model                   Maximum Lyapunov Exponent  Standard Error  Correlation Dimension  Standard Error
Interictal Time Series  1.67                       0.04            5.66                   0.27
Postictal Time Series   1.64                       0.06            6.33                   0.07

3.4.2.1 Gradient Descent

Initially the training of the non-ictal data was done with the gradient descent method. The training was first done on the interictal model. The centres c were selected initially from the training data, the variance r was set to 0.1 and the weights w were selected from a normalized Gaussian distribution. The embedding and number of RBFs were swept through a variety of choices from fairly simple to very complex (see Table 3.3).

The gradient descent training worked in reducing the error on predictions of the training data. However, it failed to produce anything resembling the interictal extracellular time series when operated in recurrent mode. The MSE results from both training (non-recurrent) and recurrent modes are shown in Table 3.3 along with the complexity calculations. None of the models succeeded in capturing the characteristics of the interictal time series. The best result was achieved with model 2, which had an embedding of 10 and 20 RBFs. It produced an MSE of 0.0777 after training and an MSE of 0.1784 in recurrent mode. The maximum Lyapunov exponent was close to 0 and the correlation dimension was 0, thus lacking any sort of complexity. Figure 3.4 shows the results of model 2. From figure 3.4c it can be seen that the recurrent mode would produce oscillations until it converged to a constant close to 0. From the above results it was determined that gradient descent based methods were not going to succeed in modeling the chaotic interictal activity. Thus we proceeded to try out the other learning methods.

Table 3.3 – Interictal gradient descent training parameters and results

Model  Embedding (m)  RBFs (N)  Training Epochs  MSE (non-Recurrent)  MSE (Recurrent)  Maximum Lyapunov Exponent  Correlation Dimension
1      5              20        2000             0.0803               0.1789           -0.14                      0
2      10             20        2000             0.0777               0.1784           -0.40                      0
3      20             40        2000             1.1740               2.3482           0.21                       NaN*
4      40             60        2000             0.0913               0.1798           0.17                       NaN*
5      100            60        2000             0.1824               0.3636           0.20                       NaN*
6      20             80        2000             0.1394               0.2780           0.20                       NaN*
7      40             80        2000             0.2153               0.4282           0.21                       NaN*
8      100            100       2000             0.0912               0.1796           0.17                       NaN*
9      140            120       2000             0.1843               0.3636           0.20                       NaN*
10     140            200       2000             1.2509               0.6120           0.21                       NaN*

* NaN refers to Not A Number and commonly results when the correlation dimension is unable to be calculated; in this case it is because Models 3 - 10 produced a steady constant value in recurrent mode.

Figure 3.4 – RBF Interictal model after gradient descent training

a) Interictal training data. b) Prediction of RBF after gradient descent on training data; the embedding of the model is equal to 10 and the number of RBFs used is 20. c) Result of RBF prediction in recurrent mode. d) MSE error curve with respect to number of training epochs.

[Figure 3.4: a) Training Data, b) Non-recurrent RBF Model (m=10, N=20) and c) Recurrent RBF Model (m=10, N=20), each plotting Voltage (mV) against Time (0 to 500). d) Gradient Descent MSE on Interictal RBF Model (m=10, N=20), plotting MSE against Epochs (0 to 400).]

3.4.2.2 Forward Selection

The forward selection (FS) learning technique uses a non-gradient based learning method which may avoid getting trapped in local minima. The advantage of training with FS is that it did not require a lot of parameter selection prior to training. The main parameter that was controlled was the embedding of the time series. The embeddings used for training were 5, 10, 20, 30, 40, 50, 60, 80, 100, 120 and 140. As before, we trained on the interictal training data to see if forward selection learning could capture the features of the interictal region. After training the models were tested in recurrent mode. Figure 3.5 shows the MSE and complexity of the different RBF models. The lowest error achieved was 0.19 for embedding 5, although it failed to produce any complexity. Only the models with embedding 40 and 50 produced complexity in both the Lyapunov exponent and correlation dimension. Even so the Lyapunov complexity fell far short of the 1.67 goal for the interictal time series. The model with embedding 50 seemed slightly superior to the other models and its results are examined further in figure 3.6. With an embedding of 50 the model produced 3591 RBFs. The training is shown in figure 3.6b, where the GCV error decreased as RBFs were added. The addition of further RBFs stops once the GCV error has not changed significantly over the past 5 RBF additions, at which point the selection process backtracks to the point 5 RBFs before and takes that to be the model. This occurred after 3591 RBFs were included. With such a large number of RBFs it is likely the training attempted to select one RBF for each of the training time series points, which negates any real learning. In figure 3.6a we compare the recurrent RBF model time series generation to that of the interictal training data. The result is significantly better than that of the gradient descent training. Even so, simple visual observation shows the two waves are significantly different. The model appears to be stuck in a rhythmic-like pattern with no real complexity.
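The stopping rule described above can be sketched in Python as below. The threshold `tol` that defines a "significant" change is an assumption, as the exact value is not stated, and the function name `rbfs_to_keep` is hypothetical. The list `gcv_errors` is assumed to hold the GCV error after the first, second, third RBF and so on.

```python
def rbfs_to_keep(gcv_errors, patience=5, tol=1e-4):
    """Return how many RBFs to keep once GCV has stalled for `patience` additions."""
    for n in range(patience, len(gcv_errors)):
        # Improvement accumulated over the last `patience` additions.
        if gcv_errors[n - patience] - gcv_errors[n] < tol:
            # Backtrack to the model as it stood `patience` RBFs earlier.
            return n - patience + 1
    return len(gcv_errors)  # never stalled: keep everything tried so far
```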

Figure 3.5 – Results of Interictal RBF training with forward selection

Comparing MSE and complexity of RBF models with different embeddings. Models were tested in recurrent mode. It can be noted that the lowest MSE occurred for embeddings of 5, 10, 20, 140 and 200. However the only consistent complexity occurred at embeddings of 50 and 60. Preference was given to model complexity and thus the model with embedding of 50 was chosen. All the models fall short on the Lyapunov exponent, indicating that the models were unable to match the complexity of the interictal time series.

[Figure 3.5: three panels plotted against Embedding (5 to 200) for the interictal model: MSE, maximum Lyapunov exponent (Lmax) and correlation dimension (Dim).]

Figure 3.6 – RBF interictal model after training with forward selection

a) Comparison of the interictal training data to the recurrent RBF model selected with embedding of 50 and 3591 RBFs. There is significant improvement over the gradient descent training method; however, it still does not resemble the interictal data. b) The RBF selection process showing the reduction in GCV error until the error flat lines and no more RBFs are added.

[Figure 3.6: a) Interictal Training Data and Recurrent RBF Model (m=50, N=3591), each plotting Voltage (mV) against Time (0 to 500). b) GCV Error During FS Training (m=50, N=3591), plotting GCV Error against Number of RBFs (0 to 3500).]

3.4.2.3 Tree Regression and Forward selection

To improve on the results of the FS we employed the tree regression (TR) on the data. TR sampled the training data to create viable center c and variance r parameters. Then, using FS, the best RBFs were selected to produce a much more compact model consisting of far fewer RBFs.

We applied the training on the same embeddings as in section 3.4.2.2 with the FS training. This

time TR was applied before the FS. We first trained on the interictal data. The results were very

encouraging. Figure 3.7 shows the MSE and complexity with respect to the embedding of the

models trained after operating the RBFs in recurrent mode. The MSE was lowest for

embeddings 5, 20 and 30. Even so the MSE was not all that different from section 3.4.2.2 and

3.4.2.1 and those models were not successful in capturing the interictal time series. For that

reason the complexity and resemblance to the training data were taken to be the more reliable

estimates. The complexity of many of the models matched closely to the Lmax of 1.67 and

correlation dimension of 5.66 found for the actual interictal time series.

After training, the RBF models with embeddings of 20, 30 and 50 were found to match closely

the complexity of the interictal data and still managed to resemble the data fairly well in visual

comparisons. The embedding 20 RBF model was found to have a complexity that most closely

matched that of the interictal data with Lmax of 1.68 and correlation dimension of 6.21.

Embedding 30 was fairly close with an Lmax of 1.72 and a correlation dimension of 6.30.

Embedding 50 was also fairly close with Lmax of 1.58 and correlation dimension of 6.22. The

correlation dimension results for the three models were far off from the interictal training data,

but the training data had a large variance so preference was given to the Lmax estimate for the

interictal training case.

In figure 3.8 we compare the recurrent RBF mode of the three top models to the interictal time

series to show how they compare. It was noted that even though the model with embedding of

20 had a lower MSE it failed to match the interictal time series as well as the embedding 50

model. In particular it failed to match the amplitude characteristics. The embedding 50 model

matched the time series the best while at the same time maintaining complexity that closely

resembled the interictal time series. In comparison to the embedding 30 model, the embedding

50 model still had better amplitude characteristics. Therefore we chose the embedding 50

model to represent the interictal time series.

Having successfully trained the interictal time series using the TR technique, the same training technique was applied to the postictal time series, the results of which are shown in figures 3.9 and 3.10. The embeddings of 5, 20 and 30 produced slightly lower MSE values. Having shown that the MSE was not a reliable estimate, we focused more on the complexity of the models. The embeddings of 20, 40 and 50 produced similar complexity results to the postictal time series complexity. The postictal time series had complexity of 1.64 for Lmax and 6.33 for correlation dimension. The embedding 50 model had the closest complexity with an Lmax of 1.59 and a correlation dimension of 6.33. The embedding 40 model had an Lmax of 1.73 and a correlation dimension of 6.39. The embedding 20 model had an Lmax of 1.68 and a correlation dimension of 6.21. Figure 3.10 compares the three top RBF models to the postictal time series. The embedding 20 and 40 models lacked the ability to match the amplitude characteristics as well as the embedding 50 model. The embedding 50 model was selected as the postictal model.

Figure 3.7 – Results of interictal RBF training with tree regression

Comparing MSE and complexity of RBF models with different embeddings. Models were tested in recurrent mode. It can be noted that the lowest MSE occurred for embeddings of 5, 20 and 30. Models with embeddings 20, 30 and 50 produced the closest complexity to the interictal training data.

[Figure 3.7: three panels plotted against Embedding (5 to 200) for the interictal model: MSE, maximum Lyapunov exponent (Lmax) and correlation dimension (Dim).]

Figure 3.8 – RBF interictal after training with tree regression

a) The interictal time series data that the RBF model is striving to replicate. b) The chosen RBF model operated in recurrent mode showing strong resemblance to the interictal time series. The model has an embedding of 50 and uses 99 RBFs. c) The RBF model with 30 embedding and 112 RBFs had slightly better complexity and lower MSE but lacked in the amplitude when compared to the interictal data. d) Similarly the RBF model with 20 embedding and 139 RBFs had good complexity but also lacked in amplitude characteristics.

[Figure 3.8: four panels, each plotting Voltage (mV) against Time (0 to 500). a) Interictal Time Series. b) Recurrent RBF Model (m=50, N=99). c) Panel title not recovered. d) Recurrent RBF Model (m=20, N=139).]

Figure 3.9 – Results of postictal RBF training with tree regression

Comparing MSE and complexity of RBF models with different embeddings. Models were tested in recurrent mode. It can be noted that the lowest MSE occurred for embeddings of 5 and 20. Embedding models 20, 40 and 50 had the closest matching complexity to the postictal time series.

[Figure 3.9: three panels plotted against Embedding (5 to 200) for the postictal model: MSE, maximum Lyapunov exponent (Lmax) and correlation dimension (Dim).]

Figure 3.10 – RBF postictal training with tree regression

a) The postictal time series data that the RBF model is striving to replicate. b) The chosen RBF model operated in recurrent mode showing strong resemblance to the postictal time series while maintaining the closest matching complexity. The model has an embedding of 50 and uses 128 RBFs. c) The RBF model with 40 embedding and 146 RBFs had slightly better complexity and lower MSE but lacked in the amplitude when compared to the interictal data. d) Similarly the RBF model with 20 embedding and 156 RBFs had good complexity but also lacked in amplitude characteristics.

[Figure 3.10: four panels, each plotting Voltage (mV) against Time (0 to 500). a) Postictal Time Series. b) Recurrent RBF Model (m=50, N=128). c) Panel title not recovered. d) Recurrent RBF Model (m=20, N=156).]

CHAPTER 4

MODELING SPONTANEOUS SEIZURE LIKE EVENTS

At this point we have established an RBF model of both the interictal and postictal time series. The RBF model captured the necessary features of the biological system: it sustained a recurrent mode of time series generation while matching the amplitude characteristics and the complexity of the biological system. In this chapter we introduce some SLE models found in the literature. We then describe in detail the spontaneous SLE model that will be used to test the RBF stimulators.

4.1 Literature Review

Spontaneous seizure-like events have been reproduced using computational models [8][36][37][38]. When modeling epilepsy one has to take into consideration that not all epilepsy disorders are the same: epilepsy can occur in many different regions of the brain, and each has its own unique characteristics. Here we review some of the models found in the literature.

In 2002, Wendling et al. constructed a seizure model of human epilepsy using intracerebral EEG recordings from the human hippocampus. They created a macroscopic model that represented the neurodynamics of four populations of neurons using 2nd order differential equations with a static nonlinearity [36]. The four populations were: main cells (pyramidal cells in the hippocampus or neocortex), two feedback subsets composed of local interneurons (either excitatory or inhibitory) and a fourth subset representing the inhibitory interneurons with faster kinetics [36]. The model produced waveforms very similar to the interictal and ictal activity found in human epilepsy.

In 2004, Suffczynski et al. used a bistable neural network model to create a macroscopic model of rat absence epilepsy [37]. They modeled the thalamo-cortical circuits based on relevant physiological data. Transitions between the ictal and interictal states were determined randomly with constant probabilities. They managed to model the seizure-like oscillations fairly accurately. However, the model was not designed to produce the extracellular type of signal recorded from intracerebral electrodes.

Recently, Zalay et al. modeled temporal lobe epilepsy of the rat hippocampus [8]. Like the other models, this model represented the macroscopic neurodynamics of populations of neurons. The model consisted of cognitive rhythm generators (CRGs), each defined by four differential equations. The output of each CRG was calculated using a static nonlinearity. Furthermore, the model was able to produce an extracellular-like signal by summing the outputs of the CRGs relative to a centre point, based on a square topological arrangement of the four CRGs. The model closely matched the real extracellular recording from the low-Mg2+ spontaneous seizure setup in the rat hippocampus.

The RBF stimulation models were trained on temporal lobe epilepsy data, which made the Cognitive Rhythm Generator Seizure-Like Event (CRGSLE) model by Zalay et al. an appropriate choice on which to test our stimulation. The CRGSLE model is described further in the following section.

4.2 CRG Based Spontaneous Seizure-Like Event Model

The CRGSLE model was chosen to validate our hypothesis that stimulating with an HPC non-ictal signal (i.e. interictal or postictal) would successfully suppress an ictal event. The strengths of the model are that it models temporal lobe epilepsy, produces spontaneous seizure-like events (SLEs) and produces an extracellular signal that mimics the extracellular recordings used in training.

As mentioned earlier, the CRGSLE (see figure 4.1) consists of four CRGs that represent different populations of neurons through 2nd order limit cycle dynamics and a static nonlinearity connecting the state variables to the output waveform [8]. The coupling between the different CRGs is done with an exponential impulse response function, which is referred to as an ‘integrating mode’ [8]. The combined dynamics of the nth CRG are defined by four differential equations,

$$\dot{x}_{1,n} = \omega_n\left[x_{2,n}\left(1 + S_{\phi,n}\right) + x_{1,n}\left(1 + S_{\alpha,n} - x_{1,n}^2 - x_{2,n}^2\right)\right], \tag{4.1}$$

$$\dot{x}_{2,n} = \omega_n\left[-x_{1,n}\left(1 + S_{\phi,n}\right) + x_{2,n}\left(1 + S_{\alpha,n} - x_{1,n}^2 - x_{2,n}^2\right)\right], \tag{4.2}$$

$$\dot{x}_{3,n} = x_{4,n}, \tag{4.3}$$

$$\dot{x}_{4,n} = \delta_n u_n(y) - 2\delta_n x_{4,n} - \delta_n^2 x_{3,n}, \tag{4.4}$$

where $\omega_n$ is the intrinsic angular frequency, $\delta_n$ is the parameter controlling the decay rate associated with state variable $x_{3,n}$, and $S_{\phi,n}$ and $S_{\alpha,n}$ are the phase and amplitude modulation functions respectively [8].

$$u_n(y) = \sum_{j} c_{jn}\, y_j + e_n(t), \tag{4.5}$$

$$y_n = c_{0n} + x_{3,n} + \sqrt{x_{1,n}^2 + x_{2,n}^2}\; W\!\left(\tan^{-1}\frac{x_{2,n}}{x_{1,n}}\right), \tag{4.6}$$

where $u_n(y)$ is the mode input function, $y_j$ are the CRG outputs, $c_{jn}$ are the directional coupling coefficients and $e_n(t)$ is the optional external input. W(∙) is the intrinsic output waveform of the CRG normalized over (-π,π], with the 4-quadrant arctangent function providing the instantaneous phase angle [8].

The phase and amplitude modulation functions are defined by,

$$S_{\phi,n} = c_{0n} + k_n x_{3,n} + e_{\phi,n}(t), \tag{4.7}$$

$$S_{\alpha,n} = 0, \tag{4.8}$$

where $k_n$ is a modulatory gain and $e_{\phi,n}(t)$ is an optional additive input. The spontaneous seizure events generated by the CRGSLE model are shown in figure 4.2.
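As an illustration of the limit cycle behaviour of the clock, the sketch below integrates equations 4.1 and 4.2 for a single uncoupled CRG. It assumes the clock has the form $\dot{x}_1 = \omega[x_2(1+S_\phi) + x_1(1+S_\alpha - x_1^2 - x_2^2)]$ and $\dot{x}_2 = \omega[-x_1(1+S_\phi) + x_2(1+S_\alpha - x_1^2 - x_2^2)]$, with both modulation functions set to zero. The frequency, step size, initial state and the fixed-step Runge-Kutta integrator are illustrative assumptions and not the settings used for the CRGSLE model.

```python
import math


def clock_rhs(x1, x2, omega, s_phi=0.0, s_alpha=0.0):
    """Right-hand side of the assumed clock equations (4.1)-(4.2)."""
    radial = 1.0 + s_alpha - x1 * x1 - x2 * x2
    dx1 = omega * (x2 * (1.0 + s_phi) + x1 * radial)
    dx2 = omega * (-x1 * (1.0 + s_phi) + x2 * radial)
    return dx1, dx2


def rk4_step(x1, x2, omega, dt):
    """Advance the clock state by one classical 4th order Runge-Kutta step."""
    k1 = clock_rhs(x1, x2, omega)
    k2 = clock_rhs(x1 + 0.5 * dt * k1[0], x2 + 0.5 * dt * k1[1], omega)
    k3 = clock_rhs(x1 + 0.5 * dt * k2[0], x2 + 0.5 * dt * k2[1], omega)
    k4 = clock_rhs(x1 + dt * k3[0], x2 + dt * k3[1], omega)
    x1 += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
    x2 += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return x1, x2


def simulate_clock(omega=2.0 * math.pi * 12.0, dt=1e-4, steps=20000, x1=0.1, x2=0.0):
    """Integrate the uncoupled clock; all default values are hypothetical."""
    for _ in range(steps):
        x1, x2 = rk4_step(x1, x2, omega, dt)
    return x1, x2


final_x1, final_x2 = simulate_clock()
final_radius = math.hypot(final_x1, final_x2)
print(f"radius after 2 s: {final_radius:.6f}")
```

Under these assumptions the radius obeys $\dot{r} = \omega r(1 - r^2)$, so a trajectory started inside or outside the unit circle settles onto a circle of radius 1, which is the limit cycle on which the output waveform is defined.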

The extracellular field potential used to simulate the SLE time series was produced from the outputs of the four CRGs (see figure 4.2b). It was created by treating each CRG as a point source and placing the center of the electrode above the center of the square arrangement of the CRGs [8]. The simulated extracellular seizure shares many features with the actual seizure data from the rat slice, as shown in figure 4.3. This comparison serves to verify that the CRGSLE model is a good representation of the biological system we are trying to stimulate.
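A minimal sketch of this construction follows. It assumes that each CRG output contributes to the electrode in inverse proportion to its distance (the usual point-source form) and that the four sources sit at the corners of a square with the electrode above its centre. The side length, the electrode height and the omission of any conductivity scaling are hypothetical choices made only for illustration.

```python
import math


def field_potential(outputs, side=1.0, height=0.5):
    """Sum four CRG outputs as equidistant point sources.

    outputs: four equal-length sequences, one per CRG.
    side, height: assumed square side length and electrode height.
    """
    # Every corner of the square is the same distance from a point above its centre.
    distance = math.sqrt(2.0 * (side / 2.0) ** 2 + height ** 2)
    return [sum(samples) / distance for samples in zip(*outputs)]


# Placeholder CRG outputs (three time samples each), not model data.
crg_outputs = [
    [0.0, 1.0, 2.0],
    [1.0, 1.0, 1.0],
    [0.5, 0.5, 0.5],
    [0.0, -1.0, 0.0],
]
field = field_potential(crg_outputs)
print(field)
```

Because the sources are equidistant from the electrode, the field reduces to a scaled sum of the four outputs, so the shape of the simulated recording does not depend on the assumed geometry.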


Figure 4.1 – CRGSLE model

Diagram showing the configuration of the 4 CRGs used to create the SLEs. Each CRG is composed of three parts. First, the integrating mode takes all the inputs coming from the other CRGs and convolves them. The result is then fed into the differential equations, which contain clock-like dynamics. The output of this clock portion is then fed to a mapper, which creates the output that can be fed into the other CRGs.


Figure 4.2 – Output waveforms produced by the CRGSLE model

a) The extracellular recording of a spontaneous seizure-like event. b) The four CRG outputs. The CRGSLE produces spontaneous SLEs based on the outputs of the 4 CRGs. The 4 CRG outputs are combined to produce an extracellular type recording by treating each CRG output as a point source equidistant from the extracellular recording region.

[Figure panels (Voltage (mV) vs Time (s), 0 to 90 s): a) Unstimulated CRGSLE Model; b) CRG Outputs (four traces)]

Figure 4.3 – Comparison of the CRGSLE seizures to the actual seizures being modeled

a) Comparison of the seizures recorded from the rat hippocampus under low-Mg2+ conditions and the seizures produced by the CRGSLE model. b) Close-up comparison of the actual biological and computational seizures.

[Figure panels: a) Seizures Recorded From Hippocampus, with CRGSLE Seizures below; b) Closeup of Seizure Recorded From Hippocampus, with Closeup of CRGSLE Seizure below]

CHAPTER 5

CONTROLLING SEIZURES

This chapter compares standard periodic DBS stimulation with the HPC interictal and postictal RBF stimulation models. We first describe how the stimulation techniques were applied to the CRGSLE model. We then quantify the stimulation efficacy using ROC curves and the area under the ROC curves.

5.1 Application of Stimulation to CRGSLE Model

In section 4.2 we showed that the CRGSLE model gives an accurate representation of the epilepsy found in the rat hippocampus. We now describe how the same model can receive inputs from external stimulation.

Providing responsive external stimulation to the CRGSLE model required two issues to be addressed. First, the stimulation had to be added in the appropriate place to mimic an external stimulation. Second, to apply a responsive stimulation it was necessary to determine when the system was seizing.

The external stimulus was added in equation 4.5 to the mode input function $u_n(y)$ [8]. The external stimulus $e_n(t)$ is equal to the gain multiplied by the stimulus being provided, whether RBF interictal, RBF postictal or periodic. The gain made it possible to modify the intensity of the input being applied.

The decision to apply the stimulus was made by comparing the complexity of a state variable of CRG1 (related to its instantaneous phase) to the specified excitation threshold (exThr) parameter. The basis for this method is that the complexity of the model is lower in the ictal state than in the interictal or postictal states, so the complexity falls as the system moves from the interictal to the ictal state. The complexity was calculated by applying STLmax to windows of 5000 data points from the state variable. The exThr was preset so that a drop in complexity to this value indicated that the system was in the ictal mode, at which point the system received stimulation [8]. Once the system entered the postictal region the complexity rose above the exThr and the stimulation was disengaged.
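The decision rule can be sketched as a simple threshold gate. The sketch assumes that a complexity estimate (for example STLmax per window) has already been computed. The function names, the sample values and the convention that stimulation is on while the complexity is below exThr are illustrative assumptions drawn from the description above.

```python
def stimulation_gate(complexity, ex_thr):
    """Return True for each window whose complexity is below exThr (assumed ictal)."""
    return [c < ex_thr for c in complexity]


def gated_stimulus(stimulus, gate, gain):
    """Scale the stimulator output by the gain while the gate is open, else output zero."""
    return [gain * s if on else 0.0 for s, on in zip(stimulus, gate)]


# Hypothetical per-window complexity values: interictal -> ictal -> postictal.
complexity = [0.8, 0.6, 0.15, 0.1, 0.12, 0.5, 0.9]
# Placeholder stimulator samples (RBF or periodic), one per window.
stimulus = [0.3, -0.2, 0.5, -0.4, 0.1, 0.2, -0.1]

gate = stimulation_gate(complexity, ex_thr=0.2)
applied = gated_stimulus(stimulus, gate, gain=0.01)
print(gate)
print(applied)
```

With these placeholder values the gate opens only for the three low-complexity windows in the middle of the record and closes again once the complexity recovers.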

Figure 5.1 shows the feedback configuration of the stimulator and the CRGSLE computer model. Note that the dotted line going from the CRGSLE output to the stimulator represents the previous m embedding points used to initialize and reinitialize the RBF stimulators.

Figure 5.1 – Stimulation Setup

The stimulation setup that combines the CRGSLE model and the RBF and periodic stimulators. The excitability connection acts like a feedback indicating if stimulation should be applied or not. Stimulation is the stimulator output. Model output is the extracellular field created by the CRGSLE Model.

5.2 Periodic Stimulator Frequency Selection

There is no clearly defined stimulation frequency that works best with DBS. Generally the frequency is tuned over the range 0-300Hz until the best result is achieved. In our case we chose the stimulation frequency based on the FFT of the RBF interictal and postictal stimulation models.

The FFT of one RBF postictal prediction is shown in figure 5.2a. Although the FFTs vary from one RBF prediction to another, one feature was common to all the interictal and postictal predictions: the 12Hz component had the strongest amplitude. Thus, to make a fair comparison with the RBF stimulators, we used a 12Hz periodic stimulation frequency. The FFT of the 12Hz periodic signal is shown in figure 5.2b.
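The frequency selection step can be sketched as a search for the largest peak in a single-sided amplitude spectrum. The sampling rate, the synthetic test signal and the function name below are assumptions made for illustration. The signal is a stand-in for an RBF prediction, not thesis data.

```python
import numpy as np


def dominant_frequency(signal, fs):
    """Return the frequency (Hz) of the largest single-sided amplitude peak, ignoring DC."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    # Single-sided amplitude spectrum; the DC and Nyquist bins are not rescaled,
    # which does not affect the peak search.
    amplitude = 2.0 * np.abs(np.fft.rfft(signal)) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    peak = np.argmax(amplitude[1:]) + 1  # skip the DC bin
    return freqs[peak]


fs = 100.0                      # hypothetical sampling rate in Hz
t = np.arange(500) / fs         # 500 samples, as in the plotted predictions
test_signal = (
    0.8 * np.sin(2 * np.pi * 12 * t)
    + 0.3 * np.sin(2 * np.pi * 5 * t)
    + 0.2 * np.sin(2 * np.pi * 30 * t)
)
f_peak = dominant_frequency(test_signal, fs)
print(f_peak)
```

Applying the same function to each RBF prediction and keeping the frequency that recurs across predictions mirrors the selection described above.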


Figure 5.2 – FFT Comparison of RBF Stimulator and 12Hz Periodic Stimulator

a) Sample postictal RBF prediction and its FFT. It shows that 12Hz is the highest amplitude frequency embedded in the RBF prediction. The 12Hz component was the common frequency found across all interictal and postictal RBF predictions after training. b) The 12Hz periodic stimulation signal, with the FFT showing the strong 12Hz component and its harmonics.

[Figure panels: a) Postictal RBF Model (Voltage (mV) vs Time) with Single-Sided Amplitude Spectrum of Postictal Training Data (|Y(f)| vs Frequency (Hz), 0 to 50Hz); b) 12Hz Periodic Stimulation (Voltage (mV) vs Time) with Single-Sided Amplitude Spectrum of 12Hz Periodic Stimulation (|Y(f)| vs Frequency (Hz), 0 to 50Hz)]

5.3 Results of RBF Stimulation

Having described how the CRGSLE model receives external stimulation, we now show the results of the periodic, interictal and postictal stimulations. Figure 5.3 compares the results of the three stimulations. There is a good reduction in the number of seizures with the interictal stimulation (figure 5.3b) and an even better reduction with the postictal stimulation (figure 5.3c). The periodic stimulation produced only a slight reduction in seizures (figure 5.3d). This result was achieved with an exThr of 0.2 and a gain of 0.01. To better assess the dependence of the stimulation on its parameters, an ROC analysis was performed over a sweep of gain and exThr values.

Figure 5.3 – Stimulation of the CRGSLE model with interictal, postictal and periodic stimulations

The results of applying the three different stimulation techniques are compared with the normal SLE model. b) The application of the interictal RBF stimulation reduced the number of seizures by roughly two thirds. c) The postictal stimulation managed to reduce the number of SLEs even more. d) The application of the periodic stimulation also reduced the number of SLEs but not nearly as much as the interictal and postictal stimulations.

[Figure panels (Voltage (mV) vs Time, 0 to 90): a) Extracellular Recording of the Unstimulated CRGSLE Model; b) Stimulated CRGSLE Model by Interictal RBF, with the Interictal RBF Stimulation below; c) Stimulated CRGSLE Model by Postictal RBF, with the Postictal RBF Stimulation below; d) Stimulated CRGSLE Model by Periodic Stimulation of 12Hz, with the Periodic Stimulation of 12Hz below]

5.4 ROC Measurements

The Receiver Operating Characteristic (ROC) is a practical technique for evaluating the success of a prediction [39]. It came about as a way to deal with complicated cases where the distribution of positive and negative classes is strongly skewed. For example, in the diagnosis of cancer a negative prediction is more likely to be correct than a positive one simply because most cases are negative. This bias tends to lead to procedures that favour a negative prediction rather than an accurate prediction based on the facts. ROC compensates for this by dividing the predictions into four cases: true positive (TP), false negative (FN), false positive (FP) and true negative (TN). From these cases we can calculate the true positive rate (TPR) and the false positive rate (FPR).

$$\mathrm{TPR} = \frac{TP}{TP + FN} \tag{5.1}$$

$$\mathrm{FPR} = \frac{FP}{FP + TN} \tag{5.2}$$

The TPR is the sensitivity of the prediction, meaning the fraction of actual positive cases that are correctly identified. The FPR is equal to 1-specificity, meaning the fraction of actual negative cases that are incorrectly classified as positive. The values of TPR and FPR are found by varying the detection threshold. The points are then plotted on an ROC curve with the x-axis representing 1-specificity and the y-axis representing sensitivity. Returning to the cancer prediction example, a high TPR means that we are catching the positive cases very well. However, a high FPR means that we are also making many mistakes by incorrectly classifying many of the negative cases as positive. Ideally we want a system to have a high TPR and a low FPR, meaning that the system is both sensitive and specific.

Although ROC is typically used to evaluate prediction, we made a slight modification here to apply it to the evaluation of seizure control efficacy. In our case, sensitivity measures how effective the stimulation is, and specificity measures how selectively the stimulation is applied. A low specificity means that the stimulation is applied all the time, whether a seizure is present or not. A high specificity means that we only apply stimulation to the strong seizures. The same ROC curve profile is therefore obtained even though it is applied to a control system.

5.4.1 ROC Curve Construction

As mentioned in the previous section, our application is control rather than prediction, so the general usage of ROC had to be modified. This required a way to convert the outcome of seizure control into TP, FP, TN and FN subgroups. To do this we tracked three variables over time: the SLE complexity without any stimulation, the SLE complexity with stimulation, and the time series of the stimulation itself. Using a threshold of 0.2, we went through the time series of the first two variables and placed a 1 at every time the model was in the ictal state. For the third variable we placed a 1 at every time the stimulation was being applied. We then used table 5.1 to assign a case to each time point (e.g. all three variables equal to 1 is a False Negative (FN)), and the TP, FP, TN and FN counts were tallied. Equations 5.1 and 5.2 were then used to find the TPR and FPR so that we could plot sensitivity vs 1-specificity (TPR vs FPR). We repeated the process for randomly modified CRGSLE parameters, creating a slightly different SLE model each time to reflect the differences across patients. The process was also repeated for different stimulation threshold (exThr) values and different gains. This gave a representative result that was not biased towards a single good model.

Table 5.1 – Determination of ROC cases

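| Control Case | TN | TP | FN | FN | FP | TP | FP | FN |
|---|---|---|---|---|---|---|---|---|
| Seizure Before Stimulation | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| Seizure After Stimulation | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
| Stimulation Applied | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |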
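The tallying procedure can be sketched as a lookup on the three binary variables followed by equations 5.1 and 5.2. The mapping below transcribes table 5.1. The example sequences, the function name and the convention of returning 0 when a denominator is empty are assumptions made for illustration.

```python
from collections import Counter

# (seizure before, seizure after, stimulation applied) -> control case, as in table 5.1
CASES = {
    (0, 0, 0): "TN", (1, 0, 0): "TP", (0, 1, 0): "FN", (1, 1, 0): "FN",
    (0, 0, 1): "FP", (1, 0, 1): "TP", (0, 1, 1): "FP", (1, 1, 1): "FN",
}


def roc_point(before, after, stim):
    """Tally the control cases over time and return (TPR, FPR)."""
    counts = Counter(CASES[key] for key in zip(before, after, stim))
    tp, fp, tn, fn = (counts[k] for k in ("TP", "FP", "TN", "FN"))
    tpr = tp / (tp + fn) if tp + fn else 0.0   # equation 5.1
    fpr = fp / (fp + tn) if fp + tn else 0.0   # equation 5.2
    return tpr, fpr


# Hypothetical binary time series (1 = ictal, or 1 = stimulation on).
before = [0, 1, 1, 1, 0, 0, 1, 0]
after = [0, 0, 1, 0, 0, 1, 0, 0]
stim = [0, 1, 1, 1, 1, 0, 0, 0]

tpr, fpr = roc_point(before, after, stim)
print(tpr, fpr)
```

Each (exThr, gain) setting yields one such (TPR, FPR) pair, and sweeping exThr traces out the ROC curve for a given stimulator and gain.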

5.4.2 ROC Curve Comparison

To create the ROC curves we divided the stimulation models into three groups: periodic, interictal and postictal. Each group was further divided into four subgroups according to the gain used in the stimulation: 0.01, 0.1, 1 and 10. The exThr was then swept over 25 values (0, 0.01:0.02:0.09, 0.1:0.10:0.9, 1:1:10) to produce 25 ROC points. To allow statistical comparison of the results, we created 32 replicate SLE models (samples) with slight changes in the coupling parameters. These changes produced slightly different dynamics, to better reflect the differences found in a population. The ROC results were then averaged for each stimulation gain and model. We found that the gain of 0.01 produced the best results, so we constructed the ROC curve using the 0.01 gain to compare the different stimulator models. The ROC result is shown in figure 5.4.
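The parameter sweep can be sketched as follows. The grid reproduces the 25 exThr values and four gains listed above. The helper names and the two-replicate example are hypothetical, and the averaging assumes each replicate supplies one (TPR, FPR) pair per exThr value in the same order.

```python
def inclusive_range(start, step, stop):
    """Inclusive start:step:stop range, rounded to avoid floating-point drift."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 10) for i in range(count)]


ex_thr_values = (
    [0.0]
    + inclusive_range(0.01, 0.02, 0.09)
    + inclusive_range(0.1, 0.1, 0.9)
    + inclusive_range(1, 1, 10)
)
gains = [0.01, 0.1, 1, 10]


def mean_roc(points_per_replicate):
    """Average (TPR, FPR) pairs across replicate SLE models for each exThr value."""
    n = len(points_per_replicate)
    return [
        (sum(p[0] for p in column) / n, sum(p[1] for p in column) / n)
        for column in zip(*points_per_replicate)
    ]


# Two hypothetical replicates with two ROC points each.
replicates = [
    [(0.5, 0.2), (0.7, 0.4)],
    [(0.7, 0.4), (0.9, 0.6)],
]
averaged = mean_roc(replicates)
print(len(ex_thr_values), averaged)
```

In the full procedure the list of replicates would hold 32 entries of 25 points each for every combination of stimulator and gain.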

Figure 5.4 – ROC comparison of the periodic, interictal and postictal stimulation

The sensitivity of the complex stimulations is significantly higher than that of the periodic stimulation, particularly as the specificity decreases and more stimulation occurs.

[Figure: ROC Curve Comparison of Stimulation Models Under 0.01 Gain. Sensitivity vs 1-Specificity; curves: Interictal, Postictal, Periodic]

The ROC curve shows that both interictal and postictal stimulation produced better ROC results than the periodic stimulation. At very high specificity all the models had low results because there was only minor stimulation, as the exThr parameter was set too high. At low specificity, when all models are applied at low exThr, the best results were achieved with the interictal and postictal stimulation. The complexity of the interictal and postictal stimulation models allowed the CRGSLE model to better maintain its intrinsic complexity even during the ictal events.

5.4.3 Area Under the ROC Curve

The ROC curve can be further quantified using the area under the curve: the larger the area, the better the model being evaluated. Here we used the trapezoidal rule to calculate the area under the curve. Figure 5.5 shows the areas of the three ROC curves from figure 5.4, where the three stimulator models were compared at the 0.01 gain. The result indicates that seizure control improves with the HPC stimulation of the interictal and postictal models. The postictal model performed slightly better than the interictal model based on the area under the ROC curve. We then applied Student's t-test and the Wilcoxon rank sum test to test whether the areas of two stimulators differed significantly over the population of 32. The results are shown in table 5.2. There was no significant difference between the interictal and postictal performance; however, both the interictal and postictal results were significantly different from those of the periodic stimulation.
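The area calculation can be sketched with the trapezoidal rule applied to ROC points sorted by increasing 1-specificity. The example points are made up for illustration, and whether the curve is anchored at (0, 0) and (1, 1) is an assumption of this sketch.

```python
def roc_area(points):
    """Trapezoidal area under ROC points given as (FPR, TPR) pairs."""
    pts = sorted(points)  # order by increasing FPR (1-specificity)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up ROC points, anchored at (0, 0) and (1, 1).
roc_points = [(0.0, 0.0), (0.5, 0.75), (1.0, 1.0)]
area = roc_area(roc_points)
print(area)
```

An area of 0.5 corresponds to the diagonal, so areas above 0.5 indicate control that is better than stimulating at random. One area per replicate model provides the samples that the significance tests compare.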

Figure 5.6 shows how the results of the three stimulators vary with the gain of stimulation. The periodic stimulation varied very little with changes in stimulation gain. However, the highly complex interictal and postictal stimulation models had greater success at lower gains. This means that a strong stimulation (high gain) was not necessary for the HPC interictal and postictal models to be successful.

We also increased the RBF recurrent mode prediction duration from 50 to 500 data points (see figures 5.6 and 5.7). Since stimulation was applied in discrete steps, this meant that the CRGSLE computer model was stimulated for longer at each step. In addition, the stimulator was reinitialized to match the current state less often, allowing more divergence between the RBF model and the CRGSLE model. We observed that, for the low gains, the variation of the areas changed more than the mean area. This suggests that with longer duration stimulation we are at times suppressing the seizures more successfully and at other times creating seizures by overstimulating. Thus it is important to use shorter duration stimulation to allow for more consistent seizure suppression.
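The role of the prediction duration can be sketched as a recurrent loop that is reinitialized from the most recent m model outputs and then runs freely for a fixed number of points. Here `predict_next` is a hypothetical stand-in for the trained RBF map, and the toy predictor and sample values are assumptions for illustration only.

```python
def run_stimulator(predict_next, model_output, m, horizon):
    """Generate `horizon` stimulus samples in recurrent mode from the last m model outputs."""
    window = list(model_output[-m:])   # (re)initialize from the latest embedding points
    stimulus = []
    for _ in range(horizon):
        nxt = predict_next(window)
        stimulus.append(nxt)
        window = window[1:] + [nxt]    # feed the prediction back as the newest input
    return stimulus


def toy_predictor(window):
    """Toy one-step predictor: the average of the two newest points."""
    return 0.5 * (window[-1] + window[-2])


recent_output = [3.0, 1.0, 0.0]        # placeholder CRGSLE output samples
block = run_stimulator(toy_predictor, recent_output, m=2, horizon=3)
print(block)
```

A longer horizon (500 rather than 50 points) means the loop runs on its own predictions for longer before being resynchronized with the model output, which is the source of the extra divergence discussed above.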

Lastly, we verified that the performance of the periodic stimulation remained fairly consistent across different frequencies. We compared the ROC areas using three periodic stimulators: 12Hz, 60Hz and 200Hz. The population size was 16, which was enough to show the trend with frequency. The results are shown in figure 5.8. The ROC area does not vary significantly across frequencies or gains. There was a large drop for the high gain in the 200Hz stimulation, but for the remaining frequencies and gains the results were fairly consistent. This showed that gain and frequency had little effect on the success of periodic stimulation.

Figure 5.5 – ROC area under the curve comparison of the periodic, interictal and postictal stimulation models

Further verification of the area under the ROC curve shows that interictal and postictal achieve better results than the periodic stimulation. Postictal produced slightly better results than the interictal. The error bars represent standard error for a population size of 32.

Table 5.2 – 0.01 Gain ROC Area Significance

| | Interictal : Postictal | *Interictal : Periodic | *Postictal : Periodic |
|---|---|---|---|
| p (t-test) | 0.832 | 0.011 | 0.004 |
| p (Wilcoxon) | 0.722 | 0.027 | 0.002 |

[Figure: ROC Area vs Stimulation at the 0.01 Gain of Each Model. ROC Area for Interictal, Postictal and Periodic]

Figure 5.6 – ROC area for different gains of the stimulation models (reinitialization after 50 points)

Comparison of the area under the ROC curve for different gains and different stimulators with reinitialization after 50 points of stimulation. The error bars represent the standard error, computed with a sample size of 16.

[Figure: ROC Area vs Stimulation Gain of Each Model (50 reinitialization). ROC Area for Interictal, Postictal and 12Hz Periodic at gains 0.01, 0.1, 1 and 10]

Figure 5.7 – ROC area for different gains of the stimulation models (reinitialization after 500 points)

Comparison of the area under the ROC curve for different gains and different stimulators with reinitialization after 500 points of stimulation. The error bars represent the standard error, computed with a sample size of 16. The variation of the results is greater than with reinitialization after 50 points.

[Figure: ROC Area vs Stimulation Gain of Each Model (500 reinitialization). ROC Area for Interictal, Postictal and 12Hz Periodic at gains 0.01, 0.1, 1 and 10]

Figure 5.8 – ROC area for different gains and different periodic frequencies

The periodic stimulation achieved similar results as the complex models for high gain. The complex models outperformed the periodic as the gains were reduced. The error bars represent the standard error.

[Figure: ROC Area vs Stimulation Gain for Different Periodic Signals. ROC Area for 12Hz Periodic, 60Hz Periodic and 200Hz Periodic at gains 0.01, 0.1, 1 and 10]

CHAPTER 6

DISCUSSION AND FUTURE WORK

In this chapter we discuss the three most notable results of this thesis. We then discuss the implications of these results and what they mean for future work.

6.1 RBF Model Captures Complexity

In chapter 3 we showed that the trained RBF models successfully captured the shape and complexity of the interictal and postictal regions of a seizure. This was verified by operating the models in recurrent mode and showing that they sustained dynamics similar to the interictal and postictal time series while maintaining similar complexity. The RBF models were very robust to initialization conditions. For example, if a model was initialized with an ictal time series it would still continue to produce the interictal or postictal dynamics. This ensured that even though the stimulation produced would vary based on initialization, it would never settle to a constant output and become a DC stimulator. We believe that by training the model on multiple slices from different specimens, the RBF generalized, capturing the characteristics common across all the groups.

6.2 Complex RBF Stimulation Outperforms Periodic

The hypothesis that HPC biologically based stimulation would successfully reduce seizure occurrence comes from the understanding that under normal conditions the brain functions in a highly complex, possibly chaotic manner. This has been reported in the literature on numerous occasions [2][3][20][28]. It is therefore reasonable to believe that to achieve better results one needs to communicate with the brain in the same biologically based language.

In this thesis we tested our HPC biologically based stimulators on the CRGSLE computational model. The results in figure 5.3 show that interictal and postictal stimulation reduced the number of seizures to a greater extent than the periodic stimulation. To compare the three stimulation methods quantitatively, we constructed ROC curves based on the success of control. The ROC analysis was applied for multiple gains and the best gain for each model was chosen for the final ROC comparison. The final comparison is shown in figure 5.4; the performance of interictal and postictal stimulation across different exThr values was significantly better than that of the periodic stimulation. The difference between interictal and postictal was minimal. The distinguishing difference between the periodic and the RBF model stimulators was the complexity. Therefore the hypothesis was supported.


6.3 Low Gain More Successful in Complex Stimulation

The final and most significant finding of this thesis is that the HPC RBF stimulators performed better with lower gain of stimulation. The periodic stimulation gained little from lower gain stimulation; in fact it tended to favour higher gain, as shown in figure 5.6. In this respect the CRGSLE model appears to model the dynamics of the SLE well, since many findings in DBS show that higher gain periodic stimulation performs best. Generally the gain has to be increased to achieve successful treatment. At larger gains the stimulation is not very specific to the region intended to be stimulated, and the likelihood of the stimulation affecting other regions of the brain increases. This was not the case for the RBF interictal and postictal stimulation, both of which improved with lower gain. With lower gain, the stimulation can be focused on the regions that need it, avoiding undesired effects on surrounding brain regions.

6.4 Future Work

The promising results achieved in this thesis are only the first steps. The model provided us with a way to test the viability of our hypothesis for treating epilepsy. It has also left many questions to be answered, the first being whether the success achieved on the CRGSLE model can be replicated in-vivo. The next logical step will therefore be to reproduce these results in-vivo.

Another question that arose was whether training on higher frequency content would improve the results. In our preprocessing of the training data we removed the majority of the noise by filtering out all frequencies above 50Hz. This made the model easier to train and less computationally demanding to implement. It also means that our stimulator may have been lacking some key features that could help in suppressing seizures. Previous work by Chiu et al. suggests that the higher frequencies are indicative of seizure onset and hold valuable features for detecting seizures [3]. In the future we will train on the higher frequencies to see if their inclusion yields better results. Due to the high noise content at higher frequencies we may need other means to capture the higher frequency features. The Neural Rhythm Extractor (NRE) developed by Zalay et al. is one proposed method to capture the frequency information [40]. The NRE, which at its core uses a wavelet packet transform, will find the main frequency bands of the interictal and postictal time series. The RBF can then be selectively trained on those bands. By stimulating the CRGSLE with different RBFs trained on different bands, we will track down the frequencies responsible for successful seizure suppression.

As successful as the RBF has been in capturing the low frequency content, we feel that we can do better. A substitute training model being considered is the Restricted Boltzmann Machine (RBM) developed by Hinton [41]. The RBM is a more complex model and, although it is based on ANNs, it is trained highly effectively by an unsupervised, random, dream-like state. The use of RBMs raises a computational issue that needs to be addressed to achieve real time stimulation: we will need to move away from computer based stimulation to hardware such as Field Programmable Gate Arrays (FPGAs). There is a lot of work left to be done, but the goal of the future work will remain to achieve the same success in-vivo as on the in-silico CRGSLE model.


CONCLUSION

With the aid of RBFs we have captured the highly complex possibly chaotic (HPC) neurodynamics of the interictal and postictal regions of seizure time series. We have applied the resulting stimulators to a CRGSLE model and shown that the HPC stimulation significantly outperformed the low complexity periodic stimulation. If the same results can be achieved in a rat in-vivo model, this has serious potential to change the way we treat epilepsy and paves the way towards new treatment opportunities for all those in need.

Bibliography

[1] A. Babloyantz, A. Destexhe, “Low-dimensional chaos in an instance of epilepsy”, Neurobiology, Vol. 83, pp. 3513-3517, 1986

[2] S. J. Schiff, K. Jerger, D. H. Duong, T. Chang, M. L. Spano, W. L. Ditto, “Controlling chaos in the brain”, Nature, Vol. 370, pp. 615-620, 1994

[3] A. W.L. Chiu, E. E. Kang, M. Derchansky, Peter L. Carlen, B. L. Bardakjian, “Online Prediction of Onsets of Seizure-like Events in Hippocampal Neural Networks Using Wavelet Artificial Neural Networks”, Annals of Biomedical Engineering, Vol. 34, pp. 282-294, 2006

[4] A. W. L. Chiu, M. Derchansky, E. E. Kang, P. L. Carlen, B. L. Bardakjian, “Prevention of Spontaneous Seizure-like Events in Both in-silico and in-vitro Epilepsy Models”, Engineering in

Medicine and Biology 27th Annual Conference, pp. 1-4, 2005

[5] M. Hodaie, R. A. Wennberg, J. O. Dostrovsky, and A. M. Lozano, “Chronic Anterior Thalamus Stimulation for Intractable Epilepsy”, Epilepsia, Vol. 43, pp. 603-608, 2002

[6] Y. F. Sun, Y. C. Liang, W. L. Zhang, H. P. Lee, W. Z. Lin, L. J. Cao, “Optimal partition algorithm of the RBF neural network and its application to financial time series forecasting”, Neural

Computation & Applications, Vol. 14, pp. 36-44, 2005

[7] X. Li and Z. Deng, “A Machine Learning Approach to Predict Turning Points for Chaotic Financial Time Series”, 19th IEEE International Conference on Tools with Artificial Intelligence, pp. 331-335, 2007

[8] O. C. Zalay, D. Serletis, P. L. Carlen, B. L. Bardakjian, “System characterization of neuronal excitability and its relevance to spontaneous seizure-like transitions in a hippocampal network model”, Submitted to J Neuroscience, pp. 1-29, 2009

[9] C. Hamani, D. Andrade, M. Hodaie, R. Wennberg, and A. Lozano, “Deep brain stimulation for

the treatment of epilepsy”, Int. J Neural Systems, Vol. 19, pp. 213-226, 2009

[10] B. M. Uthman, B. J. Wilder, J. K. Penry, C. Dean, R. E. Ramsay, S. A. Reid, E. J. Hammond, W.

B. Tarver, BS and J. F. Wernicke, “Treatment of epilepsy by stimulation of the vagus nerve”,

Epilepsia, Vol. 34, pp. 1007-1016, 1993

[11] S. C. Schachter, “Vagus nerve stimulator therapy summary: five years after FDA approval”,

Neurology, Vol. 59, no. 6 Suppl. 4, pp. S15-20, 2002

[12] D.M. Andrade, D. Zumsteg, C. Hamani, M. Hodaie, S. Sarkissian, A.M. Lozano, and R.A.

Wennberg, “Long-term follow-up of patients with thalamic deep brain stimulation for epilepsy”,

Neurology, Vol. 66, pp. 1571-1573, 2006

[13] C. Pollo and J.G. Villemure, “Rationale, mechanisms of efficacy, anatomical targets and

future prospects of electrical deep brain stimulation for epilepsy”, Acta Neurochir. Suppl., Vol.

97, pp. 311-320, 2007

[14] K. Vonck, P. Boon, L. Goossens, S. Dedeurwaerdere, P. Claeys, F. Gossiaux, P. Van Hese, T.

De Smedt, R. Raedt, E. Achten, K. Deblaere, A. Thieleman, P. Vandemaele, E. Thiery, G.

Vingerhoets, M. Miatton, J. Caemaert, D. Van Roost, E. Baert, G. Michielsen, F. Dewaele, K. Van

Laere, V. Thadani, D. Robertson and P. Williamson, “Neurostimulation for refractory epilepsy”,

Acta Neurol. Belg., Vol. 103, pp. 213-217, 2003

[15] J.F. Tellez-Zenteno, R.S. McLachlan, A. Parrent, C.S. Kubu and S. Wiebe, “Hippocampal

electrical stimulation in mesial temporal lobe epilepsy”, Neurology, Vol. 66, pp. 1490-1494,

2006

[16] K. N. Fountas and J. R. Smith, “A novel closed-loop stimulation system in the control of

focal, medically refractory epilepsy”, Acta Neurochir. Suppl., Vol. 97, pp. 357-362, 2007

[17] K. N. Fountas, J. R. Smith, A. M. Murro, J. Politsky, Y. D. Park and P. D. Jenkins, “Implantation of a closed-loop stimulation in the management of medically refractory focal epilepsy: a technical note”, Stereotact Funct. Neurosurg, Vol. 83, pp. 153-158, 2005

[18] S. H. Strogatz, “Nonlinear Dynamics and Chaos”, Addison-Wesley Publishing Company, 1994

[19] A. Courville, “Chaosmakers for Epilepsy” M.A.Sc thesis, University of Toronto, 1998

[20] J. Gao, Y. Cao, W. Tung, J. Hu, “Multiscale Analysis of Complex Time Series”, Wiley, 2007

[21] A. Wolf, J. B. Swift, H. L. Swinney and J. A. Vastano, “Determining Lyapunov Exponents From A Time Series”, Physica, pp. 285-317, 1985

[22] L. D. Iasemidis, J. C. Sackellares, H. P. Zaveri, W. J. Williams, “Phase Space Topography and the Lyapunov Exponent of Electrocorticograms in Partial Seizures”, Brain Topography, Vol. 2, pp. 187-201, 1990

[23] S. P. Nair, D. Shiau, J. C. Principe, L. D. Iasemidis, P. M. Pardalos, W. M. Norman, P. R. Carney, K. M. Kelly, J. C. Sackellares, “An investigation on EEG dynamics in an animal model of

temporal lobe epilepsy using the maximum Lyapunov exponent”, Experimental Neurology, Vol. 216, pp. 115-121, 2009

[24] M.T. Rosenstein, J.J. Collins, and C.J. De Luca, “A practical method for calculating largest Lyapunov exponents from small data sets”, Physica D, Vol. 65, pp. 117-134, 1993

[25] P. Grassberger, and I. Procaccia, “Characterization of strange attractors”, Phys. Rev. Lett., Vol. 50, pp. 346-349, 1983

[26] A. Babloyantz, J.M. Salazar, “Evidence of chaotic dynamics of brain activity during the sleep cycle”, Phys. Letters, Vol. 111A, pp. 152-156, 1985

[27] J. Fell, J. Röschke, and P. Beckmann, “Deterministic chaos and the first positive Lyapunov exponent: a nonlinear analysis of the human electroencephalogram during sleep”, Biol. Cybern, Vol. 69, pp. 139–164, 1993

[28] A. Babloyantz, and A. Destexhe, “Low-dimensional chaos in an instance of epilepsy”, Proc.

Natl. Acad. Sci., Vol. 83, pp. 3513-3517, 1986

[29] M. Orr, J. Hallam, K. Takezawa, A. Murray, S. Ninomiya, M. Oide and T. Leonard, “Combining regression trees and radial basis function networks”, International Journal of

Neural Systems, pp. 1-17, 1999

[30] M.J.L. Orr, “Regularisation in the selection of radial basis function centres”, Neural

Computation, pp. 1-16, 1995

[31] R. Zemouri, D. Racoceanu, N. Zerhouni, “Recurrent radial basis function network for time-series prediction”, Eng. App. of Artificial Intelligence, Vol. 16, pp. 453-463, 2003

[32] H. Kantz, and T. Schreiber, “Nonlinear time series analysis”, Cambridge University Press, 1997

[33] J. R. Shewchuk, “An introduction to the conjugate gradient method without the agonizing pain”, Carnegie Mellon University, Ed. 1.25, 1994

[34] S. Chen, C.F.N. Cowan, and P.M. Grant, “Orthogonal least squares learning for radial basis function networks”, IEEE Transactions on Neural Networks, Vol. 2, pp. 302-309, 1991

[35] R.A. Horn, and C.R. Johnson, “Matrix Analysis”, Cambridge University Press, 1985

[36] F. Wendling, F. Bartolomei, J. J. Bellanger and P. Chauvel, “Epileptic fast activity can be

explained by a model of impaired GABAergic dendritic inhibition”, European Journal of

Neuroscience, Vol. 15, pp. 1499-1508, 2002

[37] P. Suffczynski, S. Kalitzin, Lopes Da Silva, “Dynamics of non-convulsive epileptic

phenomena modeled by a bistable neuronal network”, Neuroscience, Vol. 126, pp. 467-484,

2004

[38] F. Grimbert, O. Faugeras, “Bifurcation analysis of Jansen's neural mass model”, Neural

Comput, Vol. 18, pp. 3052-3068, 2006

[39] T. Fawcett, “An introduction to ROC analysis”, Pattern Recognition Letters, Vol. 27, pp. 861-

874, 2006

[40] O. C. Zalay, E. E. Kang, M. Cotic, P. L. Carlen, and B. L. Bardakjian, “A Wavelet Packet-Based

Algorithm for the Extraction of Neural Rhythms”, Annals of Biomedical Engineering, Vol. 37 No.

3, pp. 595-613, 2009

[41] G. E. Hinton, S. Osindero, Y. Teh, “A fast learning algorithm for deep belief nets”, Neural

Computation, Vol. 18, pp. 1527-1554
