RBF Based Responsive Stimulators to Control Epilepsy
by
Siniša Čolić
A thesis submitted in conformity with the requirements
of the degree of Master of Applied Science
Department of Electrical and Computer Engineering
University of Toronto
© Copyright by Siniša Čolić 2009
Abstract
Deep Brain Stimulation (DBS) has received attention in the scientific community for its
potential to suppress epileptic seizures. To date, DBS has only achieved marginal positive
results. We believe that a highly complex possibly chaotic (HPC) biologically inspired
stimulation is superior to periodic stimulation. Using Radial Basis Functions (RBFs), we
modeled interictal and postictal time series based on electroencephalograms (EEGs) of rat
hippocampus slices while under low Mg2+/high K+. We then compared the RBF based
interictal and postictal stimulations to the periodic stimulation using a Cognitive Rhythm
Generator (CRG) model for spontaneous Seizure-Like Events (SLEs). What resulted was a
significant improvement in seizure suppression with the HPC stimulators at lower gains as
opposed to the periodic signal. This suggests that the use of biologically inspired HPC
stimulators will achieve better results while confining the stimulation to a narrow region of
the brain.
Acknowledgements
I would like to thank Berj Bardakjian for his guidance, understanding and abundant optimism
that would have me leaving each Thursday lab meeting ready to take on the world. I would like
to thank my group members and friends, Marija Cotic, Osbert Zalay, Eunji Kang, Demitre
Serletis, Josh Dian, Angela Lee and Dave Stanley for the helpful discussions and advice. I would
specifically like to thank Eunji for providing me with the experimental recordings; Osbert for
providing me with the complexity analysis program, developing the CRG stimulation protocol
and the ROC evaluation methodology; Josh for the quick fixes and helpful suggestions. I would
also like to thank Dave and Berj for proofreading my thesis. Finally, I would like to thank my
parents for always being there through the good and the bad.
Table of Contents
1 Introduction and Motivation
1.1 Stimulation Literature Review
1.1.1 Continuous Stimulation
1.1.2 Responsive Stimulation
1.2 Outline
1.3 Hypothesis
2 Chaos and the brain
2.1 Chaos and Complexity
2.1.1 Lyapunov Exponent
2.1.2 Correlation Dimension
2.2 The Brain and Chaos
3 Modeling Highly Complex Possibly Chaotic Time Series
3.1 Time Series Modeling
3.2 RBF Model
3.2.1 RBF Architecture
3.2.2 Recurrent RBF
3.2.3 RBF Training Techniques
3.2.3.1 Gradient Descent
3.2.3.2 Regression Tree
3.2.3.3 Forward Selection
3.3 Application to Henon Map
3.3.1 Henon Map
3.3.2 Data Preprocessing
3.3.3 RBF Training of Henon Map
3.4 Application to Non-Ictal Time Series
3.4.1 Low Mg2+/High K+ Animal Data
3.4.2 Data Preprocessing
3.4.3 RBF Training of Non-Ictal Time Series
3.4.3.1 Gradient Descent
3.4.3.2 Forward Selection
3.4.3.3 Tree Regression
4 CRGs and Modeling Spontaneous Seizure-Like Events
4.1 Literature Review
4.2 CRG Based Spontaneous Seizure-Like Model
5 Controlling Seizures
5.1 Application of Stimulation to CRGSLE model
5.2 Periodic Stimulator Frequency Selection
5.3 Results of RBF stimulation
5.4 ROC Measurements
5.4.1 ROC Curve Construction
5.4.2 ROC Curve Comparison
5.4.3 Area Under ROC Curve
6 Discussion and Future Work
6.1 RBF Model Captures Complexity
6.2 Complex RBF Stimulation Outperforms Periodic
6.3 Low Gain More Successful in Complex Stimulation
6.4 Future work
Conclusion
Bibliography
List of Tables
Table 3.1 – Henon Map Gradient descent training parameters and results
Table 3.2 – Complexity of interictal and postictal time series
Table 3.3 – Interictal gradient descent training parameters and results
Table 5.1 – Determination of the ROC cases
Table 5.2 – 0.01 Gain ROC Area Significance
List of Figures
Figure 1.1: Extracellular recording of seizure time series
Figure 3.1: Radial Basis Function Model
Figure 3.2: Comparison of non-chaotic and chaotic Henon map time series
Figure 3.3: Comparing RBF Henon map model to chaotic time series
Figure 3.4: RBF Interictal model after gradient descent training
Figure 3.5: Results of Interictal RBF training with forward selection
Figure 3.6: RBF interictal model after training with forward selection
Figure 3.7: Results of interictal RBF training with tree regression
Figure 3.8: RBF interictal after training with tree regression
Figure 3.9: Results of postictal RBF training with tree regression
Figure 3.10: RBF postictal training with tree regression
Figure 4.1: CRGSLE model
Figure 4.2: CRGSLE model output waveforms produced
Figure 4.3: Comparison of the CRGSLE seizures to the actual seizures being modeled
Figure 5.1: Stimulation Setup
Figure 5.2: FFT Comparison of RBF Stimulator and 12Hz Periodic Stimulator
Figure 5.3: Stimulation of the CRGSLE model with interictal, postictal and periodic stimulation models
Figure 5.4: ROC comparison of the periodic, interictal and postictal stimulation
Figure 5.5: ROC area under the curve comparison of the periodic, interictal and postictal stimulations
Figure 5.6: ROC area for different gains of the stimulation models 50 reinitialization
Figure 5.7: ROC area for different gains of the stimulation models 500 reinitialization
Figure 5.8: ROC area for different gains different periodic frequencies
List of Abbreviations
ANN Artificial Neural Network
CRG Cognitive Rhythm Generator
CRGSLE Cognitive Rhythm Generator Seizure-Like Event Model
DBS Deep Brain Stimulation
EEG Electroencephalogram
EMG Electromyogram
exThr Stimulation Threshold
FPGA Field Programmable Gate Array
FS Forward Selection
GCV Generalized Cross-validation Error
HPC Highly Complex Possibly Chaotic
LPR Low Complexity Possibly Rhythmic
Lmax Maximum Lyapunov Exponent
MSE Mean Square Error
NRE Neural Rhythm Extractor
RBF Radial Basis Function
ROC Receiver Operating Characteristic
SLE Seizure-Like Event
STLmax Short Time Maximum Lyapunov Exponent
TR Tree Regression
VNS Vagus Nerve Stimulator
CHAPTER 1
INTRODUCTION AND MOTIVATION
Epilepsy is a serious neurological disorder often accompanied by seizures, or ictal events.
Seizures are characterized as a transition from normal, highly complex possibly chaotic (HPC)
activity to low complexity possibly regular (LPR) activity [1][2][3]. The majority of
epileptics (approx. 80%) can be treated with anticonvulsive drug therapies which inhibit the
channel transport mechanisms [4]. Of the remaining 20%, some resort to surgery, which carries
many risks. Those who are not candidates for surgery have turned to a newer form of
treatment known as Deep Brain Stimulation (DBS). Although still in the early stages of epilepsy
research, DBS has shown promising results in treating patients with intractable epilepsy [5].
DBS is a crude stimulation technique that consists of implanting electrodes around the
seizure focal point and applying high voltage periodic stimulation to counteract seizures [5].
These stimulators are applied for fixed durations or continuously, whether or not the patient
still needs the stimulation. This lack of responsiveness is a major shortcoming of the DBS
treatment. Here we propose a new responsive, highly complex possibly chaotic stimulation
technique inspired by biological time series recordings.
A seizure can be broken down into three main regions which we refer to as the interictal, ictal
and postictal (see figure 1.1). The ictal region is where the characteristics of a seizure are
present. The interictal region occurs just prior to the ictal and the postictal region occurs just
after the ictal. From now on we will use the term non-ictal to refer to interictal and postictal
regions of a seizure time series.
As stated earlier, the work done by [1][2][3] indicates that normal non-ictal brain activity is
highly complex possibly chaotic (HPC), whereas ictal activity is of lower complexity and possibly
rhythmic. Our goal is to provide a stimulation technique which sustains the brain in the highly
complex state, preventing the transition to the low complexity seizure activity. The stimulation
is only to be applied in the presence of a seizure. To this end we have constructed a responsive
model based on the non-ictal brain activity.
The model chosen to represent the healthy non-ictal activity was the Radial Basis Function
(RBF). It was chosen due to its success in modeling highly complex time series of the financial
sector, its natural generalization tendencies and its low processing requirements [6][7]. The RBF
model was trained on extracellular recording samples of seizure-like events (SLEs) accumulated
from multiple slices of the rat hippocampus under the in-vitro low Mg2+ epilepsy model.
Due to the chaotic nature of the brain signal we never intended to make perfect predictions
from the time series data. Instead we opted to create models of non-ictal activity from the
interictal and postictal regions of an SLE, with characteristics in wave shape and
complexity matching those of the original training data.
To assess the feasibility of our model in seizure control we employed our stimulation paradigm
on a coupled oscillator model of SLEs [8]. The goal was to achieve an improvement in seizure
reduction with our biologically inspired HPC stimulation over the presently used periodic
stimulation.
Figure 1.1 – Extracellular recording of seizure time series
The data was sampled at 2 kHz from rat hippocampal slices under the influence of low Mg2+. We have further broken the data into three regions referred to as the interictal, ictal and postictal.
1.1 Stimulation Literature Review
The use of stimulation to control seizures has been around for many years. The most common
and only FDA approved implantable device for treatment of epilepsy is the Vagus Nerve
Stimulator (VNS) [9][10]. The VNS applies periodic electrical pulses to the left
vagus nerve, which then make their way to the brain. Recent studies have shown that only 30-40%
of patients undergoing the treatment experienced a 50% seizure reduction [11].
An alternative option known as Deep Brain Stimulation (DBS) has recently become a popular
technique to control epilepsy [9]. In the past DBS has been fairly successful in treating disorders
such as Parkinson’s and depression. It is believed that the same level of success can be achieved
in treating epilepsy. DBS uses periodic stimulation which can be described as two square pulses
one after the other, one positive and the other negative. The DBS treatment is highly
dependent on the placement of the electrodes and the type of stimulation used. There are two
main styles of DBS. The first, and the one most often used, is continuous
stimulation with periodic waveforms [9][12][13][14]. The other is responsive stimulation, which
is beginning to gain notice, although it is much harder to achieve as it requires that the seizures
be detected as early as possible [3][9].
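The periodic waveform described above (a positive square pulse immediately followed by a negative one) can be sketched as follows. This is only an illustrative sketch: the function name, the 100 Hz rate, the 1 ms pulse width, the unit amplitude and the 10 kHz sampling rate are assumptions chosen for the example and are not parameters taken from this thesis or from any clinical device.

```python
def biphasic_pulse_train(freq_hz, pulse_width_s, amplitude, duration_s, fs_hz):
    """Return samples of a periodic biphasic square-pulse train.

    Each period starts with a positive pulse, immediately followed by a
    negative pulse of equal width; the rest of the period is zero.
    All arguments are hypothetical illustration values.
    """
    n_samples = int(round(duration_s * fs_hz))
    period_samples = int(round(fs_hz / freq_hz))      # samples per period
    width_samples = int(round(pulse_width_s * fs_hz))  # samples per phase
    samples = []
    for i in range(n_samples):
        pos = i % period_samples  # integer position inside the current period
        if pos < width_samples:
            samples.append(amplitude)
        elif pos < 2 * width_samples:
            samples.append(-amplitude)
        else:
            samples.append(0.0)
    return samples


# Illustrative values only: 100 Hz, 1 ms per phase, 0.1 s at 10 kHz sampling.
train = biphasic_pulse_train(freq_hz=100.0, pulse_width_s=0.001,
                             amplitude=1.0, duration_s=0.1, fs_hz=10000.0)
```

Because the two phases have equal width and opposite sign, the samples sum to zero over any whole number of periods, which is the charge-balanced property usually wanted from a biphasic pulse.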
1.1.1 Continuous Stimulation
Continuous stimulation, as the name implies, is a continuous application of the stimulation
whether the subject is experiencing a seizure or not [9]. The stimulation can also be applied on
a timer basis where the stimulation turns on and off based on a time interval (e.g. on for one
minute, off for two minutes). Many human trials have been performed with varying results [9]. A
recent study used periodic stimulation with a frequency range of 130-200Hz to
treat temporal lobe epilepsy in the hippocampus [13][14]. They showed remarkable results with
most subjects experiencing a 50% reduction in seizure frequency and a significant number of
patients experienced a 90% reduction and became completely seizure free. A subsequent study
by a Canadian group that tried to match the same results found an improvement of only 15%
[15]. The general story of DBS is that the results are not repeatable. In addition, many of the
patients that became seizure free only remained so for a short time; after a couple of years
the symptoms returned [12].
1.1.2 Responsive Stimulation
Responsive stimulation differs from continuous stimulation in that the stimulation is only
applied when needed [9]. Determining when the stimulation is needed is a much more difficult
task and requires some way to detect the approaching seizure. There are two ways in which
responsive stimulation can be applied. The first is to apply stimulation once the seizure is
observed, although this may often be too late to stop the seizure [16][17]. The second
method is to use a predictive system that warns of an impending seizure event and applies the
stimulation prior to the event in the hope that the seizure will not occur at all [3].
In a recent study done by Fountas et al., eight patients had an external Responsive
Neurostimulation (eRNS) system implanted [16][17]. The eRNS system detected the occurrence
of a seizure and applied periodic pulses ranging in frequency from 1-333Hz, with an amplitude of
0.5-12mA. Of the 8 patients, 7 had 45% less seizure activity and 2 had more than a 75% reduction
in seizure activity.
There have been numerous DBS trials with marginal success rates, and often the results are
not repeatable. DBS employs a periodic stimulation model, whereas the brain has been shown
to be highly complex possibly chaotic (HPC). It is for that very reason that we constructed an
RBF based stimulator using the highly complex features found in the interictal and postictal
regions of the brain. In the following chapters we explain how the CRGSLE model was configured
to compare the common periodic DBS stimulation to the highly complex interictal and postictal
based stimulations.
1.2 Outline
This thesis outlines the initial steps in a long process towards viable human treatment of
epileptic seizures. The first step is the creation of a stimulation model and its application to an
in-silico model of spontaneous seizure-like events. The subsequent steps will be introduced in
the future work section of this thesis.
Having introduced the problem and motivation for the thesis in Chapter 1 we move onto
Chapter 2. In Chapter 2 we provide the background necessary to define chaos, show how it is
quantified and its relevance to the brain and epilepsy.
Chapter 3 focuses on the RBF. There we defend the use of the RBF in modeling brain complexity
from the time series extracellular data. Further we explain in detail the structure of the RBF and
the many training techniques used to model the complexity of the brain. Chapter 3 concludes
with the training results using different training methods and the verification of the model
selected.
The next two chapters focus on the generation and control of Seizure-Like Events (SLEs). In
Chapter 4 we describe the SLE model created from the Cognitive Rhythm Generator (CRGSLE).
Then in Chapter 5 we show how the CRGSLE model was modified to test the control efficacy of our
RBF stimulations. Chapter 5 concludes with the results of stimulating with the interictal and
postictal RBF models compared to the periodic stimulation commonly used in DBS literature.
Chapter 6 concludes the work with a discussion of the results and the future work planned to
build on these results.
1.3 Hypothesis
Radial Basis Functions (RBFs) will capture the highly complex possibly chaotic (HPC) features
present across multiple slices of rat hippocampus from the non-ictal extracellular time series.
The stimulation of the CRGSLE model with the HPC RBF generated non-ictal signals will achieve
better results in terms of suppressing ictal events than those achieved through the periodic
signal used in Deep Brain Stimulation (DBS).
CHAPTER 2
CHAOS AND THE BRAIN
In this chapter we will introduce the concept of chaos and how it is measured. We will then
proceed to provide evidence for the existence of HPC activity in a normally functioning brain
and the highly rhythmic, possibly regular activity found in a seizing brain.
2.1 Chaos and Complexity
Chaos is long term aperiodic behaviour in a nonlinear deterministic system that exhibits
sensitive dependence on initial conditions [18]. There are three important characteristics in this
statement that separate chaotic systems from others. First, they produce behaviour that never
repeats, not even after long term observation. Second, chaotic behaviour does not arise from
random inputs, but rather from the nonlinear evolution of trajectories. Finally, the most
important distinguishing characteristic is that chaotic systems are highly sensitive to initial
conditions. This means that two trajectories starting close to each other will diverge
exponentially with time, often referred to as the butterfly effect [18][19].
If the system’s equations are known, the chaotic behaviour of the system can be computed
analytically. In general the system equations are not known and often the only thing
available is the time series of some variable in the system (e.g. voltage). Over the years
many methods have been developed to find how chaotic or complex a system is. Here we will
present two of the commonly used methods.
2.1.1 Lyapunov Exponents
The most important distinguishing feature of a chaotic system is the sensitive dependence on
initial conditions, in the sense that neighbouring trajectories separate exponentially fast
[18][20]. A common way to quantify this property is to use Lyapunov exponents. Consider an
n-dimensional sphere in n-dimensional state space. During the evolution of the sphere in the
state space it will go from being a sphere to an infinitesimal ellipsoid, where each dimension k
of the ellipsoid can be described by

$$\delta_k(t) \sim \delta_k(0)\, e^{\lambda_k t}, \qquad (2.1)$$

where $\delta_k$ represents a finite separation between two trajectories. The start and end of the
trajectories are related exponentially by the $\lambda_k$, which are known as the Lyapunov exponents. A
positive Lyapunov exponent indicates the presence of chaos, and a negative or zero Lyapunov
exponent means the system is non-chaotic. For a system to be considered chaotic only one of
the Lyapunov exponents needs to be positive; in other words, the maximum
Lyapunov exponent is greater than 0 in chaotic systems [18][19][20].
A successful method for calculating the Lyapunov exponent from time series is known as Wolf’s
method [20][21]. Wolf’s method takes a reference trajectory and follows the divergence of the
neighbouring trajectories from it. To ensure the separation between the two
trajectories does not diverge to infinity or extremely large values, it is often necessary to
renormalize. This is done by picking a new point every time a threshold value is exceeded, after
which the process continues. An average is then taken to find the average divergence rate and with it
the maximum Lyapunov exponent is obtained. The drawback of Wolf’s method is that it
requires many time series points to calculate the divergence. Often the data is
non-stationary and may contain multiple different regions of chaotic and non-chaotic behaviour.
Therefore other short time techniques such as the short time maximum Lyapunov exponent
(STLmax) [22][23] and Rosenstein’s method [24] are used. In this thesis we opted for the STLmax
method, which is based closely on Wolf’s algorithm; the details are outlined in Iasemidis et
al., 1990 [22].
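The follow-and-renormalize idea can be illustrated on a system whose equations are known. The sketch below is a simplified two-trajectory variant, not Wolf's time series algorithm and not the STLmax implementation used in this thesis: it evolves a reference and a perturbed trajectory of the Henon map, accumulates the logarithm of the stretching at every step and rescales the separation back to a fixed distance. The Henon parameters a = 1.4 and b = 0.3, the initial condition, the perturbation size and the step counts are assumptions made for the illustration.

```python
import math


def henon(x, y, a=1.4, b=0.3):
    """One iteration of the Henon map (classical parameter values assumed)."""
    return 1.0 - a * x * x + y, b * x


def max_lyapunov(n_steps=20000, d0=1e-8, transient=1000):
    """Estimate the maximum Lyapunov exponent (natural log units per iteration)."""
    x, y = 0.0, 0.0
    for _ in range(transient):      # let the reference orbit settle on the attractor
        x, y = henon(x, y)
    xp, yp = x + d0, y              # perturbed neighbour at distance d0
    total = 0.0
    for _ in range(n_steps):
        x, y = henon(x, y)
        xp, yp = henon(xp, yp)
        d = math.hypot(xp - x, yp - y)
        total += math.log(d / d0)   # stretching accumulated over this step
        # Renormalize: pull the neighbour back to distance d0 along the same direction.
        xp = x + (xp - x) * d0 / d
        yp = y + (yp - y) * d0 / d
    return total / n_steps


lam_est = max_lyapunov()
```

A positive estimate is consistent with the criterion stated above that a chaotic system has a maximum Lyapunov exponent greater than 0.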
2.1.2 Correlation Dimension
Much like the Lyapunov exponent, correlation dimension tries to quantify the chaotic
behaviour of a system. Correlation dimension is a geometrical quantity that characterizes the
minimal number of variables needed to fully describe the dynamics of motion [21]. The larger
the number of variables needed, the more chaotic the system is. Grassberger and Procaccia devised an
efficient way to do this which has become the standard for calculating the correlation
dimension [18][25].
The Grassberger and Procaccia method works by fixing a point x on the attractor A. Let
$N_x(\varepsilon)$ denote the number of points on A within the ball of radius $\varepsilon$ centred on the fixed
point x. The number of points is then measured as the radius $\varepsilon$ is increased. As the radius $\varepsilon$
grows, the number of points inside the ball centred at x grows as a power law
described by

$$N_x(\varepsilon) \sim \varepsilon^{d}, \qquad (2.2)$$

where d is the correlation dimension. Generally the result varies with the selection of the fixed
point x. To get a more accurate result many different fixed points are used to do the calculation
and then their average is used to find the correlation dimension [18].
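As a rough sketch of this procedure, the code below counts, for every reference point, how many other points fall inside a ball of radius eps, averages that count over all reference points, and reads the dimension off the slope of log(count) against log(eps) between two radii. The test set (random points on a unit circle, whose dimension is known to be 1), the two radii and the function names are assumptions chosen so the answer can be checked; the averaging here is done on the counts before taking the slope, which is one common variant and not necessarily the exact procedure used in the thesis.

```python
import math
import random


def mean_ball_count(points, eps):
    """Average over reference points x of N_x(eps): other points within eps of x."""
    n = len(points)
    total = 0
    for i in range(n):
        xi, yi = points[i]
        for j in range(n):
            if i != j and math.hypot(points[j][0] - xi, points[j][1] - yi) < eps:
                total += 1
    return total / n


def correlation_dimension(points, eps_small, eps_large):
    """Slope of log N(eps) versus log eps between two radii (power-law exponent d)."""
    c_small = mean_ball_count(points, eps_small)
    c_large = mean_ball_count(points, eps_large)
    return math.log(c_large / c_small) / math.log(eps_large / eps_small)


# Hypothetical test set: 800 random points on the unit circle (a one-dimensional set).
random.seed(0)
angles = [random.uniform(0.0, 2.0 * math.pi) for _ in range(800)]
circle = [(math.cos(t), math.sin(t)) for t in angles]
d_est = correlation_dimension(circle, 0.05, 0.2)
```

For the circle the estimate should come out close to 1; applied to an attractor reconstructed from a time series, the same slope gives an estimate of d in equation 2.2.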
2.2 The brain and chaos
The brain is composed of billions of neurons with roughly $10^{10}$ synaptic connections. These
neurons join together to form the different systems in the brain such as the cerebellum,
neocortex, amygdala and hippocampus, to name a few. No man-made system in existence
can match the complexity of the brain. Still, the question arises whether or not the brain is
chaotic.
Electroencephalogram (EEG) readings, which measure the variability of the electric field in
time and space due to the firing of neuronal populations, may have provided evidence for the
existence of highly complex possibly chaotic neurodynamics in the brain. Some of the early
work done by Babloyantz et al. used correlation dimension measurements to assess the
complexity of the different stages of the sleep cycle [26]. They measured a correlation
dimension greater than 4 for the different stages and concluded that the brain possesses
chaotic dynamics in the sleep state. Further, Fell et al. provided evidence for the existence of
chaotic behaviour in the brain [27]. They applied Wolf’s algorithm to time series gathered from
the different stages of sleep and obtained a positive Lyapunov exponent of 2.5 - 3. Lastly,
Babloyantz et al. compared the correlation dimension of a patient in the sleep state to that in an
epileptic state and found that the sleep state had a correlation dimension of 4.05, and the
epileptic state had a correlation dimension of 2.05 [28]. The drop in dimension supports the view that
during a seizure a patient is trapped in a lower dimensional, less chaotic state and only when
the state returns to a higher complexity can normal brain function resume.
Much of the work in this thesis relies on the assumption of the existence of HPC neurodynamics
in normal non-ictal brain activity. Likewise we assume the existence of lower complexity,
possibly rhythmic neurodynamics during the ictal region. This assumption is well established in
the literature [2][3][20][28].
CHAPTER 3
MODELING HIGHLY COMPLEX POSSIBLY CHAOTIC TIME SERIES
In this chapter the challenges of modeling complex time series are outlined in detail. The choice
of the RBF model is defended using references in the literature. The RBF model is further
decomposed into its architecture and the learning techniques. The chapter concludes with a
validation of the model’s ability to produce time series that match the characteristics of the
highly complex non-ictal recordings measured in the brain.
3.1 Time Series Modeling
We modeled the non-ictal time series from the extracellular reading of rat hippocampal slices.
Earlier in section 2.2 it was explained that this type of signal is non-stationary and HPC. The
chaotic nature of the time series meant that the model would not simply be performing
pattern recognition. Rather, it would have to generalize to underlying features that are not clearly
visible but highly relevant. These features are key to finding the right stimulation to prevent
seizure propagation. There are also multiple sources of noise embedded in the system that
need to be avoided (e.g. noise from the setup and recording instruments such as the 60Hz
harmonic). The model used has to be able to generalize easily and avoid the traps
caused by the presence of noise. It turns out that this problem is analogous to forecasting stock
market trends.
Stock market time series data is chaotic and highly noisy [7]. Using stock market time series
prediction as a starting point, it was found that radial basis functions (RBFs) are very
successful for time series prediction [6][7]. RBFs are very similar to Artificial Neural Networks
(ANNs), with the distinguishing difference that an RBF first applies a non-linear
transformation to the input signals and then performs a linear summation, whereas an ANN
first combines the inputs through a linear summation and then performs the non-linear
transformation on those sums. The non-linear transformations in the ANNs are static, whereas
in the case of RBFs the non-linear transformation of inputs is dynamic during training because
the parameters of the RBF are updated. To compensate for the static non-linear
transformation, the ANNs have multiple layers, which add complexity at a further cost in
training time. The RBFs have only one layer, and the relationship between the weights
and the output is linear; the hardest part of training is therefore finding the parameters
of the non-linear RBF transformation.
3.2 RBF Model
The RBF model can be described in two parts: first the architecture and second the learning
techniques. In the following sections we will first describe the standard RBF architecture,
followed by the slight modification that makes the RBF function in the recurrent mode.
The training of the RBF consisted of three main learning techniques. The first and standard
technique of RBF training is the gradient descent method, which was based on previous work in
the group by Courville [19]. The other learning techniques used were Tree Regression (TR)
and Forward Selection (FS). They were applied through an RBF Matlab training function created
by the UK group based at the University of Edinburgh, Scotland [29][30].
3.2.1 RBF Architecture
The radial basis function (RBF) model defines an output $y_n$ as a linear expansion of radial
functions of the input $x_n$, as shown by

$$y_n = \sum_{k=1}^{N} w_k\, \phi_k(x_n) + w_0, \qquad (3.1)$$

where $\phi_k(x_n)$ is the output of the kth radial basis function given the input vector $x_n$ of
dimension m, N is the number of RBFs, the weight $w_k$ is the influence associated with the kth
RBF, and $w_0$ is the bias. The RBF in our model was chosen to be the Gaussian RBF, as shown by

$$\phi_k(x_n) = \exp\left(-\sum_{i=1}^{m} \frac{(x_{n,i} - c_{k,i})^2}{r_{k,i}^2}\right), \qquad (3.2)$$

where $x_{n,i}$ is the ith component of the input vector, the vector $c_k$ is the center or mean of the
kth RBF, and the vector $r_k$ represents the variance of the kth RBF. The coefficient term of the
Gaussian was omitted as it only modifies the scale
and adds no further complexity. A visual representation of the RBF model is shown in figure 3.1
below.
Figure 3.1 – Radial Basis Function Model
The RBF is composed of three different layers. The input layer takes in an m dimensional input from the time series
recording. The hidden layer contains N RBFs which output a value based on the proximity of the input vector to the
Gaussian centre. Finally the output layer is the sum of all the RBF outputs multiplied by their weight factor w.
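The three layers described in the caption can be written out directly. The following is a minimal sketch of the forward pass, assuming a Gaussian basis function of the form exp(-sum_i (x_i - c_i)^2 / r_i^2) with one center and one radius per input dimension, and a bias added to the weighted sum; the function names and the numerical values are hypothetical and only serve to make the example checkable.

```python
import math


def gaussian_rbf(x, center, radius):
    """Hidden-layer unit: Gaussian of the scaled distance between x and the center."""
    return math.exp(-sum(((xi - ci) / ri) ** 2
                         for xi, ci, ri in zip(x, center, radius)))


def rbf_output(x, centers, radii, weights, bias):
    """Output layer: weighted sum of the N hidden-unit outputs plus a bias."""
    return bias + sum(w * gaussian_rbf(x, c, r)
                      for w, c, r in zip(weights, centers, radii))


# Hypothetical model with N = 2 basis functions and input dimension m = 2.
centers = [[0.0, 0.0], [1.0, 1.0]]
radii = [[1.0, 1.0], [0.5, 0.5]]
weights = [2.0, -1.0]
bias = 0.1

y = rbf_output([0.0, 0.0], centers, radii, weights, bias)
```

Here the input sits exactly on the first center, so the first unit outputs 1 and contributes its full weight, while the second unit, two radii away in each dimension, contributes almost nothing.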
3.2.2 Recurrent RBF
In its standard mode the RBF model is used to make a prediction based on a given input. There
is an alternative mode of operation known as recurrent RBF mode, where the model is only
initialized by one input sample and allowed to generate predictions indefinitely based on
that first input [31]. To achieve this recurrent mode of operation the model output $y_n$ was
replaced by $x_{n+1}$, as shown by

$$x_{n+1} = \sum_{k=1}^{N} w_k\, \phi_k(x_n) + w_0, \qquad (3.3)$$

where $x_{n+1}$ is the next point in the time series generated when the RBF functions in recurrent
mode. It then gets incorporated in the next input to make the prediction of $x_{n+2}$.
Recurrent mode of operation is a better validation of the model’s predictive capabilities [32]. If
the model is able to capture the intrinsic features of the training time series then it should be
able to maintain activity indefinitely without converging to zero. This was one of the main
criteria used in our evaluation and selection of RBF models.
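Recurrent operation amounts to a feedback loop around the predictor. The sketch below assumes that the input vector is a sliding window of the m most recent values (oldest first), which is one plausible reading of how each new point "gets incorporated in the next input"; the window ordering, the function names and all parameter values are hypothetical and do not come from the trained models in this thesis.

```python
import math


def rbf_predict(window, centers, radii, weights, bias):
    """One-step prediction from an m-sample window using Gaussian RBFs."""
    out = bias
    for w, c, r in zip(weights, centers, radii):
        out += w * math.exp(-sum(((x - ci) / ri) ** 2
                                 for x, ci, ri in zip(window, c, r)))
    return out


def run_recurrent(seed_window, n_steps, centers, radii, weights, bias):
    """Seed the model once, then feed every prediction back in as input."""
    window = list(seed_window)
    generated = []
    for _ in range(n_steps):
        nxt = rbf_predict(window, centers, radii, weights, bias)
        generated.append(nxt)
        window = window[1:] + [nxt]   # drop the oldest sample, append the newest
    return generated


# Hypothetical parameters: N = 2 basis functions, window length m = 2.
centers = [[0.2, 0.4], [0.8, 0.6]]
radii = [[0.5, 0.5], [0.5, 0.5]]
weights = [0.9, 0.7]
bias = 0.05
seed = [0.3, 0.5]

series = run_recurrent(seed, 50, centers, radii, weights, bias)
```

Inspecting whether such a generated series keeps fluctuating or settles to a constant is the kind of check the selection criterion above describes.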
3.2.3 RBF Training Techniques
The training of the RBF consists of finding suitable parameters for modeling the time series
training data. These parameters consist of the Gaussian center (c) and radius (r) along with the
weight (w). Further parameters optimized are the embedding dimension (m) and the number of radial
basis functions (N) needed to accurately model the given time series. To achieve this end we used
three learning techniques:
1. Gradient descent
2. Regression Tree
3. Forward Selection
In the following subsections the algorithms of all three learning techniques will be summarized.
3.2.3.1 Gradient descent
Gradient descent is a standard learning technique for optimization problems and was
the starting point for the training of the RBFs. The gradient descent method uses the gradient
of the error with respect to each of the three parameters mentioned above (c, r, w) to find how
the error changes as those parameters increase or decrease. The error is calculated by equation 3.4,
where $\hat{x}_{n+1}$ is the predicted time series value and D is the length of the training time series
data. After many training epochs the parameters and the error slowly converge to one of a
number of possible error minima, known as a local minimum.

$$E = \frac{1}{2} \sum_{n=1}^{D-1} \left(\hat{x}_{n+1} - x_{n+1}\right)^2 \qquad (3.4)$$

Using the previous work done by Courville [19] as a guide, the gradients were calculated for the
three parameters by differentiating the error function (equation 3.4) with respect to the three
parameters (c, r, w), where c and r are of size [N x m] and w is of size [N+1], with an extra 1
added for the bias. The gradients are shown below, with $\hat{x}_{n+1}$ given by the model output of
equation 3.3 and $\phi_k(x_n)$ by equation 3.2:

$$\frac{\partial E}{\partial w_k} = \sum_{n=1}^{D-1} \left(\hat{x}_{n+1} - x_{n+1}\right) \phi_k(x_n) \qquad (3.5)$$

$$\frac{\partial E}{\partial c_{kj}} = \sum_{n=1}^{D-1} \left(\hat{x}_{n+1} - x_{n+1}\right) w_k\, \phi_k(x_n)\, \frac{2\,(x_{n,j} - c_{k,j})}{r_{k,j}^{2}} \qquad (3.6)$$

$$\frac{\partial E}{\partial r_{kj}} = \sum_{n=1}^{D-1} \left(\hat{x}_{n+1} - x_{n+1}\right) w_k\, \phi_k(x_n)\, \frac{2\,(x_{n,j} - c_{k,j})^{2}}{r_{k,j}^{3}} \qquad (3.7)$$
The gradients are then used to update the parameters such that after each training epoch a
discrete step is taken down the error surface towards a local error minimum. The standard way
to do this is simply to use a predetermined learning rate to control the step size. This is where
we diverged from [19]. Instead, the step size was calculated using the conjugate gradient
method described by Shewchuk [33]. It requires only a function that outputs
the error and gradient information. It then uses the conjugate gradient technique to find the
best step direction efficiently and updates the parameters iteratively until the desired stopping
criterion is met.
Training in a gradient descent method can be stopped in many ways. One way is to stop
training when the error reaches a low enough value. Another is to stop when the error
reduction between epoch n and n+1 is smaller than a predetermined value. In our setup we
used a predetermined number of training epochs as the stopping criterion.
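As a rough illustration of equations 3.4 to 3.7, the sketch below trains a small Gaussian RBF with plain gradient descent. It uses a fixed learning rate rather than the conjugate gradient step of [33], which is a deliberate simplification. The function names, the toy target (t = x^2), the initial values and the learning rate are all assumptions made for the example. As in the thesis, a predetermined number of epochs is the stopping criterion.

```python
import math


def phi(x, c, r):
    """Gaussian basis function: exp(-sum_k (x_k - c_k)^2 / r_k^2)."""
    return math.exp(-sum((xk - ck) ** 2 / rk ** 2 for xk, ck, rk in zip(x, c, r)))


def predict(x, C, R, w):
    """RBF output; w[-1] is the bias weight, w[j] belongs to basis j."""
    return w[-1] + sum(wj * phi(x, cj, rj) for wj, cj, rj in zip(w, C, R))


def error(X, t, C, R, w):
    """Equation 3.4: half the summed squared prediction error."""
    return 0.5 * sum((predict(x, C, R, w) - tn) ** 2 for x, tn in zip(X, t))


def gradient_step(X, t, C, R, w, lr):
    """One epoch: accumulate the gradients of eqs. 3.5-3.7, then step downhill."""
    N, m = len(C), len(C[0])
    gw = [0.0] * (N + 1)
    gC = [[0.0] * m for _ in range(N)]
    gR = [[0.0] * m for _ in range(N)]
    for x, tn in zip(X, t):
        acts = [phi(x, C[j], R[j]) for j in range(N)]
        resid = w[-1] + sum(w[j] * acts[j] for j in range(N)) - tn
        gw[-1] += resid                           # bias gradient
        for j in range(N):
            gw[j] += resid * acts[j]              # eq. 3.5
            common = 2.0 * resid * w[j] * acts[j]
            for k in range(m):
                d = x[k] - C[j][k]
                gC[j][k] += common * d / R[j][k] ** 2      # eq. 3.6
                gR[j][k] += common * d * d / R[j][k] ** 3  # eq. 3.7
    for j in range(N):
        for k in range(m):
            C[j][k] -= lr * gC[j][k]
            R[j][k] -= lr * gR[j][k]
    for j in range(N + 1):
        w[j] -= lr * gw[j]


# Toy usage (assumed data): 21 one-dimensional inputs, three basis functions.
X = [[i / 10.0] for i in range(-10, 11)]
t = [x[0] ** 2 for x in X]
C = [[-0.5], [0.0], [0.5]]
R = [[0.5], [0.5], [0.5]]
w = [0.1, -0.1, 0.1, 0.0]
errors = [error(X, t, C, R, w)]
for _ in range(300):                # fixed number of epochs as the stopping rule
    gradient_step(X, t, C, R, w, lr=0.001)
    errors.append(error(X, t, C, R, w))
```

A fixed learning rate must be kept small to remain stable, which is one practical reason to prefer a line-search method such as conjugate gradient.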
3.2.3.2 Regression Tree
Another learning technique was the regression tree (RT), based on the work of Orr et al.
[29]. Much like unsupervised learning techniques, RT takes the initial [d x p] data matrix to
compute a regression tree, where d signifies the dimensionality of the data and p is the
number of data patterns used in the training. The first node, also referred to as the root, is then
initialized.
The training algorithm orders the training data values along each dimension from least to
greatest. It then considers splitting the ordered data at each position from nmin to p-nmin, where nmin is the minimum
number of data values that have to remain in each branch after a split. The boundary
that creates the lowest error is selected, where the error is determined using
$$\bar{y}_L = \frac{1}{p_L}\sum_{i \in S_L} y_i, \qquad (3.8)$$
$$\bar{y}_R = \frac{1}{p_R}\sum_{i \in S_R} y_i, \qquad (3.9)$$
$$E(k,b) = \frac{1}{p}\left[\sum_{i \in S_L}\left(y_i - \bar{y}_L\right)^2 + \sum_{i \in S_R}\left(y_i - \bar{y}_R\right)^2\right], \qquad (3.10)$$
where k represents the dimension, b is the boundary choice, $\bar{y}_L$ is the average of the output for
the left branch, $\bar{y}_R$ is the average of the output for the right branch, $S_L$ and $S_R$ are the
samples on the left and right branches, and $p_L$ and $p_R$ are the numbers of samples in them.
After the greedy search through the input space for the branching corresponding to the lowest
errors, we can calculate the centre and radius parameters starting from the root node (the
parent node). The radii and centres are calculated by
$$r_k = \frac{1}{2}\left(\max_{i \in S_k}(x_i) - \min_{i \in S_k}(x_i)\right), \qquad (3.11)$$
$$c_k = \frac{1}{2}\left(\max_{i \in S_k}(x_i) + \min_{i \in S_k}(x_i)\right), \qquad (3.12)$$
where $r_k$ and $c_k$ stand for the radius and centre of the kth node in the regression tree and $S_k$ is the set of samples in that node.
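The split search and the node parameters of equations 3.8 to 3.12 can be sketched as below. This is an illustrative reading of the procedure, not the code of [29]. The function names are hypothetical, the data is stored with one pattern per row (the transpose of the [d x p] layout), and placing the boundary midway between neighbouring samples is an assumption.

```python
def best_split(X, y, nmin):
    """Greedy search over dimension k and boundary b for the lowest E(k, b) (eq. 3.10)."""
    p, d = len(X), len(X[0])
    best = None
    for k in range(d):
        order = sorted(range(p), key=lambda i: X[i][k])    # least to greatest
        for cut in range(nmin, p - nmin + 1):              # keep >= nmin per branch
            left, right = order[:cut], order[cut:]
            mean_l = sum(y[i] for i in left) / len(left)     # eq. 3.8
            mean_r = sum(y[i] for i in right) / len(right)   # eq. 3.9
            err = (sum((y[i] - mean_l) ** 2 for i in left)
                   + sum((y[i] - mean_r) ** 2 for i in right)) / p
            if best is None or err < best[0]:
                b = 0.5 * (X[order[cut - 1]][k] + X[order[cut]][k])
                best = (err, k, b, left, right)
    return best


def centre_and_radius(X, members):
    """Equations 3.11 and 3.12, applied per dimension to the samples of one node."""
    d = len(X[0])
    hi = [max(X[i][k] for i in members) for k in range(d)]
    lo = [min(X[i][k] for i in members) for k in range(d)]
    centre = [0.5 * (h + l) for h, l in zip(hi, lo)]
    radius = [0.5 * (h - l) for h, l in zip(hi, lo)]
    return centre, radius


# Toy usage: a step in y at x = 1.5 should be found as the best boundary.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 0.0, 1.0, 1.0]
err, k, b, left, right = best_split(X, y, nmin=1)
```

Applying `best_split` recursively to each branch and calling `centre_and_radius` on every node yields one candidate RBF per node.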
The RT produces a large selection of radii and centres which are well suited to model the
training data. From here we can create a fairly good model of the training data; however, the
model would be very large as it contains almost as many parameter choices as the number of
data samples. Not all of the parameters may contribute significantly to a reduction in error. By
using a pruning method known as Forward Selection (FS) we can reduce the number of
parameters and still maintain a high degree of model performance.
3.2.3.3 Forward Selection
Forward selection (FS) finds the subset of model parameters that creates the greatest reduction
in output error. As opposed to backward selection, FS works by initializing the model with the RBF that has the
greatest influence on error reduction in the matrix representing all RBFs. It then builds on that
by iteratively searching through the remaining RBFs for the next RBF that creates the greatest
reduction in error. This continues until a special error known as the generalized cross-validation (GCV)
error stops decreasing, at which point the addition of further RBFs will not contribute to reducing the
error. FS can be significantly improved by combining it with orthogonal least squares [34]. This
is a Gram-Schmidt orthogonalisation process which ensures that each new parameter added is
orthogonal to all the previously added parameters [35]. It improves the calculation by
making it easier to compute the sum-squared-error term.
Orthogonal least squares springs from the idea that any matrix can be factored into a product
of an orthogonal matrix and an upper triangular matrix
$$F_m = \tilde{H}_m U_m, \qquad (3.13)$$
$$\tilde{H}_m = [\tilde{h}_1\ \tilde{h}_2\ \dots\ \tilde{h}_m] \in \mathbb{R}^{p \times m}; \quad \tilde{h}_i^T \tilde{h}_j = 0 \ \text{for}\ i \neq j, \qquad (3.14)$$
where $F_m$ is the design matrix, $\tilde{H}_m$ is the orthogonal matrix and $U_m$ is an upper triangular
matrix.
Using this idea of orthogonality, forward selection proceeds to compute the projection of the
columns of the design matrix F acquired from the RT,
$$\tilde{f}_i = f_i - \sum_{j=1}^{m-1}\frac{f_i^T \tilde{h}_j}{\tilde{h}_j^T \tilde{h}_j}\,\tilde{h}_j, \qquad (3.15)$$
where $\tilde{f}_i$ represents the projection of the ith parameter and $\tilde{h}_j$ is the jth component of the
design matrix being constructed. The process is iterated p times until all the remaining
components of F have been calculated to form a new matrix $\tilde{F}$. Then the reduction in error is
calculated
$$E_{m-1} - E_m^{(i)} = \frac{\left(y^T \tilde{f}_i\right)^2}{\tilde{f}_i^T \tilde{f}_i}, \qquad (3.16)$$
where y are the output values. The process is repeated p times until all the remaining
components have been checked. The one that produces the lowest mean-squared-error is then
added to the design matrix $\tilde{H}_m$. The process is repeated until either all parameters have been
exhausted or until another error, the generalized cross-validation (GCV) error, begins to increase,
indicating that the addition of any further parameters to the design matrix will have no further
benefit as it leads to over-fitting. The GCV is calculated by
$$GCV = \frac{p\, y^T P_m^2\, y}{\left(\mathrm{trace}(P_m)\right)^2}, \qquad (3.17)$$
$$P_m = P_{m-1} - \frac{\tilde{f}_i \tilde{f}_i^T}{\tilde{f}_i^T \tilde{f}_i}. \qquad (3.18)$$
Once the design matrix has been found it is a straightforward process to calculate the optimal
weights, as the RBF output is linearly dependent on the basis functions through the weight vector.
The weights are obtained from equation 3.21.
$$U_m = \begin{bmatrix} U_{m-1} & (\tilde{H}_{m-1}^T \tilde{H}_{m-1})^{-1}\tilde{H}_{m-1}^T f_j \\ 0_{m-1}^T & 1 \end{bmatrix}, \qquad (3.19)$$
$$\tilde{w}_m = (\tilde{H}_m^T \tilde{H}_m)^{-1}\tilde{H}_m^T y, \qquad (3.20)$$
$$w_m = U_m^{-1}\tilde{w}_m, \qquad (3.21)$$
where $U_m$ is the upper triangular matrix introduced in the transformation to orthogonality and $w_m$ is
the weight vector.
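The selection loop of equations 3.15 to 3.21 can be sketched as follows. This is a simplified illustration under stated assumptions rather than the implementation used in the thesis. There is no regularisation, so the projection matrix satisfies trace(P_m) = p - m and y^T P_m^2 y equals the residual sum of squares, which reduces equation 3.17 to p * SSE / (p - m)^2. The function names and the toy data are hypothetical.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def forward_select(F, y, tol=1e-12):
    """F is a list of candidate columns (each of length p); returns (indices, weights)."""
    p = len(y)
    resid = {i: list(col) for i, col in enumerate(F)}  # candidates, orthogonalised so far
    coef = {i: [] for i in resid}                      # their entries of U (eq. 3.19)
    chosen, U_cols, w_tilde = [], [], []
    sse = dot(y, y)
    gcv = sse / p                                      # GCV of the empty model (m = 0)
    while resid and len(chosen) < p - 1:
        # eq. 3.16: error reduction offered by each remaining candidate
        gains = {i: dot(y, f) ** 2 / dot(f, f)
                 for i, f in resid.items() if dot(f, f) > tol}
        if not gains:
            break
        best = max(gains, key=gains.get)
        m = len(chosen) + 1
        new_sse = max(sse - gains[best], 0.0)
        new_gcv = p * new_sse / (p - m) ** 2           # eq. 3.17, unregularised case
        if new_gcv >= gcv:                             # GCV stopped decreasing
            break
        h = resid.pop(best)
        hh = dot(h, h)
        chosen.append(best)
        U_cols.append(coef.pop(best) + [1.0])
        w_tilde.append(dot(y, h) / hh)                 # eq. 3.20 for orthogonal columns
        sse, gcv = new_sse, new_gcv
        for i in list(resid):                          # eq. 3.15, one term at a time
            a = dot(resid[i], h) / hh
            coef[i].append(a)
            resid[i] = [fi - a * hi for fi, hi in zip(resid[i], h)]
    m = len(chosen)
    w = [0.0] * m
    for r in range(m - 1, -1, -1):                     # eq. 3.21 by back-substitution
        w[r] = w_tilde[r] - sum(U_cols[c][r] * w[c] for c in range(r + 1, m))
    return chosen, w


# Toy usage: y depends on columns 0 and 2 only, so column 1 should be left out.
F = [[1, -1, 1, -1, 1, -1], [1, 1, 1, 1, 1, 1], [1, 1, -1, -1, 0, 0]]
y = [2.5, -1.5, 1.5, -2.5, 2.0, -2.0]                  # 2 * F[0] + 0.5 * F[2]
chosen, weights = forward_select(F, y)
```

Keeping the remaining candidates orthogonal to the chosen columns means each gain in equation 3.16 costs only two dot products, which is the computational benefit the orthogonalisation provides.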
3.3 Application to Henon Map
The Henon map is a 2-dimensional dynamical system that has been well studied due to its ability to
exhibit chaotic behavior for certain parameters. This makes it a good model on which to test the
learning techniques introduced. The dynamics of the Henon map are assumed to be
significantly less complex than those of the non-ictal recordings, making the Henon map a good starting place to verify the RBF
model's abilities. In what follows we apply the gradient descent training technique to the
Henon map to see if the gradient descent RBF model can capture the chaotic behaviour of the
Henon map.
3.3.1 Henon Map
The Henon map is defined by
$$x_{n+1} = y_n + 1 - a x_n^2 \qquad (3.22)$$
$$y_{n+1} = b x_n \qquad (3.23)$$
where a and b are two parameters that can be preset to make the map exhibit chaotic
behaviour. In figure 3.2 below we show the difference between the Henon map running in non-chaotic
mode with parameters a = 1.25 and b = 0.3 and chaotic mode with a = 1.4 and b = 0.3.
The non-chaotic mode shown in figure 3.2a is periodic and can easily be predicted well in
advance. The chaotic mode shown in figure 3.2b follows an irregular pattern which cannot be
predicted far in advance.
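For reference, equations 3.22 and 3.23 can be iterated with a few lines of code. The thesis generated its data in Matlab; the Python sketch below is only an illustration, and the function name and the initial condition (0, 0) are assumptions.

```python
def henon(a, b, steps, x0=0.0, y0=0.0):
    """Iterate the Henon map and return the x and y time series."""
    xs, ys = [x0], [y0]
    for _ in range(steps):
        x, y = xs[-1], ys[-1]
        xs.append(y + 1.0 - a * x * x)   # eq. 3.22
        ys.append(b * x)                 # eq. 3.23
    return xs, ys


chaotic_x, chaotic_y = henon(1.4, 0.3, 4000)     # chaotic mode
periodic_x, periodic_y = henon(1.25, 0.3, 4000)  # non-chaotic mode
```

Because y is a scaled, delayed copy of x, the y series stays within roughly ±0.4 for b = 0.3, matching the vertical range of figure 3.2.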
Figure 3.2 – Comparison of non-chaotic and chaotic Henon map time series
a) Non-chaotic Henon map time series with a=1.25 and b=0.3 b) Chaotic Henon map time series with a=1.4 and b=0.3. Both panels plot Yn against Time.
3.3.2 Data Preprocessing
In order to model the Henon map it was necessary to divide the time series into training samples.
Four thousand time series samples were generated in Matlab for training the RBF to model the
Henon map. The data was then rearranged to produce samples of
varying time embedding, where time embedding refers to the number of time points used
prior to a prediction. It can also be thought of as the input length. Training samples were
created with time embeddings of 1, 2, 5, 10 and 20.
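The construction of training samples for a given time embedding can be sketched as below. The function name `embed` is hypothetical, and the sketch assumes consecutive (unit delay) time points as inputs.

```python
def embed(series, m):
    """Pair each window of m consecutive values with the value that follows it."""
    count = len(series) - m
    inputs = [series[i:i + m] for i in range(count)]
    targets = [series[i + m] for i in range(count)]
    return inputs, targets


inputs, targets = embed([1, 2, 3, 4, 5], 2)
# inputs  -> [[1, 2], [2, 3], [3, 4]]
# targets -> [3, 4, 5]
```

A series of D points therefore yields D - m training pairs, so larger embeddings leave slightly fewer samples.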
3.3.3 RBF Training of Henon map
The RBF was trained on the Henon map with the gradient descent method described earlier.
The initial centres c were selected from the training data, the radius r (or variance) was set
to 0.1, and the weights were randomized from a Gaussian distribution. The training variables
used in the gradient descent method are shown in Table 3.1.
The error calculation was done in two steps. First, during the training, the error was calculated
using the MSE for the regular non-recurrent mode. Afterwards the model was verified
using the MSE of the recurrent mode generation compared with the actual Henon map time
series.
$$MSE = \frac{1}{D-m-1}\sum_{i=1}^{n}\left(y_i - f(\mathbf{x}_i)\right)^2 \qquad (3.24)$$
where m is the embedding of the model used to make the prediction $f(\mathbf{x}_i)$, $y_i$ is the corresponding actual value and D is the number
of sample points used in the training.
Since in recurrent mode the models diverge very rapidly from the original time series (see figure 3.3a), it was decided that
MSE was not enough to validate the model. Therefore complexity was further used as a way to
confirm the model selection. To calculate the maximum Lyapunov exponent we used STLmax
[22][23]. To calculate the correlation dimension we used a Matlab program based on
Grassberger and Procaccia [25] written by Zalay, who is a member of our group. The complexity of each model
was compared with the complexity of the training data, which was calculated using
8000 samples, a time constant of 2 and an embedding dimension of 7. The result for the training data was 0.99 for the
maximum Lyapunov exponent and 1.33 for the correlation dimension.
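To give a feel for the correlation dimension estimate, the sketch below computes the Grassberger-Procaccia correlation sum and a two-radius slope. It is not the program written by Zalay. It assumes the points have already been embedded, it uses a brute-force pair search, and the function names and radii are illustrative choices.

```python
import math


def correlation_sum(points, radius):
    """Fraction of distinct point pairs that lie closer together than radius."""
    n = len(points)
    close = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < radius:
                close += 1
    return 2.0 * close / (n * (n - 1))


def correlation_dimension(points, r_small, r_large):
    """Slope of log C(r) against log r between two radii in the scaling region."""
    c_small = correlation_sum(points, r_small)
    c_large = correlation_sum(points, r_large)
    return (math.log(c_large) - math.log(c_small)) / (math.log(r_large) - math.log(r_small))


# Sanity check on a known geometry: points on a straight line have dimension near 1.
line = [(0.01 * i, 0.0) for i in range(200)]
line_dim = correlation_dimension(line, 0.055, 0.505)
```

In practice the slope is fitted over a range of radii, not just two, and a model that settles to a constant value gives no usable slope, which corresponds to the NaN entries reported in the tables.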
Table 3.1 – Henon Map gradient descent training parameters and results
Model  Embedding (m)  RBFs (N)  Training Epochs  MSE (non-Recurrent)  MSE (Recurrent)  Max Lyapunov Exponent  Correlation Dimension
1      1              10        1000             2.00e-3              3.64e-2          0.20                   NaN*
2      2              10        1000             4.55e-7              1.12e-2          0.87                   1.27
3      5              20        1000             4.47e-6              1.12e-2          -0.12                  0
4      10             20        1000             2.90e-3              7.90e-3          -1.66                  6.80e-3
5      20             20        1000             2.70e-3              1.12e-2          -5.30                  1.70e-3
6      1              20        1000             1.70e-3              1.69e-1          0.22                   NaN*
7      2              20        1000             5.95e-7              1.12e-2          1.10                   1.37
8      5              40        1000             7.98e-5              1.12e-2          1.15                   2.01
9      10             40        1000             4.31e-5              1.05e-1          -0.44                  0
10     20             40        1000             6.93e-2              1.45e-2          0.02                   0
* NaN refers to Not A Number and commonly results when the correlation dimension is unable to be calculated; in
this case it is because Model 1 and Model 6 produced a steady constant value in recurrent mode.
The trained RBFs were used in recurrent mode to produce time series of length 8000 as a
means to compare the effectiveness of the training. The results are shown in Table 3.1. The
lowest training MSE (non-recurrent) was 4.55e-7, for model 2 with an
embedding of 2 and 10 RBFs. Its recurrent MSE of 1.12e-2, shared with models 3, 5, 7 and 8, was
second only to the 7.90e-3 of model 4, which had a negative maximum Lyapunov exponent and so did not
reproduce the chaotic behaviour. Model 2 produced a maximum Lyapunov exponent of 0.87 and
correlation dimension of 1.27, which closely matched the values found on the original data, 0.99
and 1.33 respectively. Model 7, also with an embedding of 2 but with 20 RBFs, produced
similar complexity to the training data, but it required more RBFs. Therefore model 2 was
selected. Further, by simple observation of figure 3.3a it was verified that model 2 matches the
characteristics of the Henon map. In figure 3.3b the training MSE over the first 200
epochs is provided to show how the model converged towards the error of 4.55e-7. The RBF
with the gradient descent learning method was sufficient for modeling the Henon map, so it was
not necessary to use any of the other training techniques.
Figure 3.3 – Comparing RBF Henon map model to chaotic time series
a) Comparison of chaotic Henon map time series to RBF generated data in recurrent mode with RBF parameters of m=2 and N=10 (Yn against Time). b) Mean squared error plot with respect to epoch of gradient descent training for the same model (MSE against Epoch).
3.4 Application to Non-ictal Time Series data
As mentioned earlier the non-ictal data is highly complex and possibly chaotic (HPC). Moreover
it is non-stationary and embedded with noise. Modeling this complex time series is far more
difficult than modeling the Henon map. In what follows we describe the process of
modeling the non-ictal time series data, starting with the acquisition of the data,
followed by the preprocessing of the data and concluding with the training results and
verification of complexity.
3.4.1 Low Mg2+/High K+ Animal Data Acquisition
Training seizure data was collected independently by Eunji Kang, a member of our group. Eight slices
from the hippocampus of male Wistar rats aged 17-25 days were obtained. The slices were
bathed in a low Mg2+/high K+ solution and electrodes were placed in the CA1 region of the
hippocampus. After roughly 20-40 minutes the slices began to exhibit spontaneous seizures due
to the low Mg2+/high K+ solution. The seizing activity was recorded by the electrodes at a
sampling frequency of 2 kHz. The whole process is described in further detail in the paper by
Chiu et al. [3], with the exception that we sampled at 2 kHz whereas Chiu et al. sampled at
10 kHz. Once all the data had been collected it was only a matter of separating the ictal
regions from the non-ictal regions. The separation of the ictal and interictal regions was the most
difficult as there is a steady increase in spikes as the seizure develops. To avoid overlapping the
regions we selected interictal data as far away from the ictal region as possible. The postictal
region was selected after the last ictal spike was observed (see figure 1.1).
3.4.2 Data Preprocessing
The non-ictal time series data is susceptible to many noise sources, ranging from the external
environment to electromyogram (EMG) interference from muscles to artifacts in the
measuring instrumentation. Therefore preprocessing consisted of filtering out the noise,
trimming outliers in the time series recording and scaling the signal to lie within the
range -1 to 1.
The signal was low pass filtered at 50 Hz followed by a light high pass filtering at 0.5 Hz to
remove some low frequency oscillations which interfere with the training of the RBF. After
filtering, the training data was downsampled by a factor of 20, from 2 kHz to 100 Hz.
The trimming of the signal was done in such a way that only the occasional outliers would be
removed and the remainder of the signal would fall just under the trimming threshold. Following the
trimming the data was scaled such that the minimum and maximum values would lie within -1
to 1.
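The downsampling, trimming and scaling steps can be sketched as below. The filtering stage is assumed to have been applied already and is not shown. The function names, the use of clipping to implement the trimming, the threshold value and the scaling by the peak absolute value are all assumptions made for illustration.

```python
def downsample(signal, factor):
    """Keep every factor-th sample (only valid after low pass filtering)."""
    return signal[::factor]


def trim(signal, limit):
    """Clip occasional outliers so that no sample exceeds +/- limit."""
    return [max(-limit, min(limit, v)) for v in signal]


def scale_unit(signal):
    """Scale the signal so that its largest absolute value becomes 1."""
    peak = max(abs(v) for v in signal)
    return [v / peak for v in signal] if peak > 0 else list(signal)


def preprocess(filtered, factor=20, limit=1.5):
    """Downsample, trim and scale an already filtered recording."""
    return scale_unit(trim(downsample(filtered, factor), limit))


example = preprocess([0.0, 4.0, -2.0, 10.0, 1.0, -1.0], factor=2, limit=1.5)
```

With a factor of 20, one second of the 2 kHz recording (2000 samples) reduces to 100 samples, consistent with the 100 Hz rate used for the complexity calculations.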
3.4.3 RBF Training of Non-Ictal Time Series
The above mentioned training methodologies were then applied in three different sequences
to train our model.
1. Gradient descent
2. Forward Selection
3. Regression Tree with Forward Selection
As with the Henon map, the training of the RBF models was done using the MSE defined by
equation 3.24. To verify that the models sufficiently reproduce the properties of the non-ictal
extracellular time series we tested each model by running it in recurrent mode, initialized
with ictal data.
The final test of the models was to verify their complexity and how well they resembled the
actual non-ictal time series data used in training. The complexity was calculated by finding the
maximum Lyapunov exponent and correlation dimension, which were introduced in Sections
2.1.1 and 2.1.2 respectively. Table 3.2 below shows the complexities of the interictal and
postictal training data after downsampling to a 100 Hz sample rate. To calculate the maximum
Lyapunov exponent we used STLmax [22][23]. To calculate the correlation dimension we used a
program based on Grassberger and Procaccia [16] written by Zalay, a member of our group.
The results were calculated using 6 different samples of length 8000, a time constant of 2 and an
embedding dimension of 7. The calculation yielded a maximum Lyapunov
exponent and correlation dimension of 1.67 and 5.66 respectively for the interictal time series
data. The postictal time series data yielded a maximum Lyapunov exponent and correlation
dimension of 1.64 and 6.33 respectively. These complexity results were compared later with
the complexity found from the RBF models generating in recurrent mode.
Table 3.2 – Complexity of interictal and postictal training time series
Model                   Maximum Lyapunov Exponent  Standard Error  Correlation Dimension  Standard Error
Interictal Time Series  1.67                       0.04            5.66                   0.27
Postictal Time Series   1.64                       0.06            6.33                   0.07
3.4.2.1 Gradient Descent
Initially the training on the non-ictal data was done with the gradient descent method. The
training was first done on the interictal model. The centres c were selected initially from the
training data, the variance r was set to 0.1 and the weights w were selected from a normalized
Gaussian distribution. The embedding and number of RBFs were swept through a variety of
choices from fairly simple to very complex (see Table 3.3).
The gradient descent training succeeded in reducing the error on predictions of the training data,
but it failed to produce anything resembling the interictal extracellular
time series when operated in recurrent mode. The MSE results from both training (non-recurrent)
and recurrent modes are shown in Table 3.3 along with the complexity calculations.
None of the models succeeded in capturing the characteristics of the interictal time series. The
best result was achieved with model 2, which had an embedding of 10 and 20 RBFs. It
produced an MSE of 0.0777 after training and an MSE of 0.1784 in recurrent mode. Its maximum
Lyapunov exponent was negative (-0.40) and its correlation dimension was 0, so it lacked any sort of
complexity. Figure 3.4 shows the results of model 2. From figure 3.4c it can be seen that the
recurrent mode would produce oscillations until it converged to a constant close to 0. From the
above results it was determined that gradient descent based methods were not going to
succeed in modeling the chaotic interictal activity. Thus we proceeded to try out the other
learning methods.
Table 3.3 – Interictal gradient descent training parameters and results
Model  Embedding (m)  RBFs (N)  Training Epochs  MSE (non-Recurrent)  MSE (Recurrent)  Maximum Lyapunov Exponent  Correlation Dimension
1      5              20        2000             0.0803               0.1789           -0.14                      0
2      10             20        2000             0.0777               0.1784           -0.40                      0
3      20             40        2000             1.1740               2.3482           0.21                       NaN*
4      40             60        2000             0.0913               0.1798           0.17                       NaN*
5      100            60        2000             0.1824               0.3636           0.20                       NaN*
6      20             80        2000             0.1394               0.2780           0.20                       NaN*
7      40             80        2000             0.2153               0.4282           0.21                       NaN*
8      100            100       2000             0.0912               0.1796           0.17                       NaN*
9      140            120       2000             0.1843               0.3636           0.20                       NaN*
10     140            200       2000             1.2509               0.6120           0.21                       NaN*
* NaN refers to Not A Number and commonly results when the correlation dimension is unable to be calculated; in
this case it is because Models 3 - 10 produced a steady constant value in recurrent mode.
Figure 3.4 – RBF Interictal model after gradient descent training
a) Interictal training data. b) Prediction of RBF after gradient descent on training data; the embedding of the model is
equal to 10 and the number of RBFs used is 20. c) Result of RBF prediction in recurrent mode. d) MSE error curve
with respect to number of training epochs. Panels a) to c) plot Voltage (mV) against Time; panel d) plots MSE against Epochs.
3.4.2.2 Forward Selection
The forward selection (FS) learning technique uses a non-gradient based learning method which
may avoid getting trapped in local minima. The advantage of training with FS is that it did
not require a lot of parameter selection prior to training. The main parameter that was
controlled was the embedding of the time series. The embeddings used for training were 5, 10,
20, 30, 40, 50, 60, 80, 100, 120 and 140. As before, we trained on the interictal training data to
see if forward selection learning could capture the features of the interictal region. After
training, the models were tested in recurrent mode. Figure 3.5 shows the MSE and
complexity of the different RBF models. The lowest error achieved was 0.19 for embedding 5,
although that model failed to produce any complexity. Only the models with embedding 40 and 50
produced complexity in both the Lyapunov exponent and correlation dimension. Even so, the
Lyapunov exponent fell far short of the 1.67 goal for the interictal time series. The model with
embedding 50 seemed slightly superior to the other models and its results are examined further
in figure 3.6. With an embedding of 50 the model produced 3591 RBFs. The training
is shown in figure 3.6b, where the GCV error decreased as RBFs were added. The
addition of further RBFs stops once the GCV error has not changed significantly over the past 5
RBF additions, at which point the selection process backtracks to the point 5 RBFs before and
takes that to be the model. This occurred after 3591 RBFs were included. With such a large
number of RBFs it is likely the training attempted to select one RBF for each of the training time
series points, which negates any real learning. In figure 3.6a we compare the recurrent RBF
model time series generation to that of the interictal training data. The result is significantly
better than that of the gradient descent training. However, simple visual observation shows the two
waves are significantly different. The model appears to be stuck in a rhythmic-like pattern with
no real complexity.
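The stopping rule described above (stop when the GCV error has not changed significantly over the past 5 additions, then backtrack 5 RBFs) can be sketched as follows. The threshold `tol` is a hypothetical parameter, since the text does not state what counted as a significant change, and the function name is likewise an assumption.

```python
def stopping_point(gcv_history, patience=5, tol=1e-4):
    """Return how many RBFs to keep, given the GCV error after each addition.

    gcv_history[i] is the GCV error of the model containing i + 1 RBFs.
    """
    for n in range(patience, len(gcv_history)):
        # improvement gained over the last `patience` additions
        if gcv_history[n - patience] - gcv_history[n] < tol:
            return n - patience + 1      # backtrack to the model `patience` RBFs earlier
    return len(gcv_history)              # never plateaued: keep every RBF


history = [1.0, 0.5, 0.3, 0.2999, 0.2999, 0.2999, 0.2999, 0.2999, 0.2999]
kept = stopping_point(history, patience=5, tol=1e-3)
```

In this toy history the error plateaus after the third RBF, so the rule keeps three RBFs even though nine were tried.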
Figure 3.5 – Results of Interictal RBF training with forward selection
Comparing MSE and complexity of RBF models with different embeddings. Models were tested in recurrent mode. It can be noted that the lowest MSE occurred for embeddings of 5, 10, 20, 140 and 200. However the only consistent complexity occurred at embeddings of 50 and 60. Preference was given to model complexity and thus the model with embedding of 50 was chosen. All the models fall short on the Lyapunov exponent, indicating that the models were unable to match the complexity of the interictal time series. Panels: MSE, maximum Lyapunov exponent (Lmax) and correlation dimension, each plotted against embedding.
Figure 3.6 – RBF interictal model after training with forward selection
a) Comparison of the interictal training data to the recurrent RBF model selected with embedding of 50 and 3591 RBFs (Voltage (mV) against Time). There is significant improvement over the gradient descent training method; however, it still does not resemble the interictal data. b) The RBF selection process showing the reduction in GCV error until the error flattens and no more RBFs are added (GCV error against number of RBFs).
3.4.2.3 Tree Regression and Forward selection
To improve on the results of the FS we employed tree regression (TR), the regression tree technique of section 3.2.3.2, on the data. TR
sampled the training data to create viable center c and variance r parameters. Then, using FS,
the best RBFs were selected to produce a much more compact model consisting of far fewer
RBFs.
We applied the training on the same embeddings as in section 3.4.2.2 with the FS training. This
time TR was applied before the FS. We first trained on the interictal data. The results were very
encouraging. Figure 3.7 shows the MSE and complexity with respect to the embedding of the
models, obtained by operating the trained RBFs in recurrent mode. The MSE was lowest for
embeddings 5, 20 and 30. Even so, the MSE was not all that different from that in sections 3.4.2.1 and
3.4.2.2, where the models were not successful in capturing the interictal time series. For that
reason the complexity and resemblance to the training data were taken to be the more reliable
estimates. The complexity of many of the models closely matched the Lmax of 1.67 and
correlation dimension of 5.66 found for the actual interictal time series.
After training, the RBF models with embeddings of 20, 30 and 50 were found to closely match
the complexity of the interictal data while still resembling the data fairly well in visual
comparisons. The embedding 20 RBF model was found to have a complexity that most closely
matched that of the interictal data, with an Lmax of 1.68 and a correlation dimension of 6.21.
Embedding 30 was fairly close with an Lmax of 1.72 and a correlation dimension of 6.30.
Embedding 50 was also fairly close with an Lmax of 1.58 and a correlation dimension of 6.22. The
correlation dimension results for the three models were far off from that of the interictal training data (5.66),
but the training data had a large variance so preference was given to the Lmax estimate for the
interictal training case.
In figure 3.8 we compare the recurrent mode output of the three top RBF models to the interictal time
series. It was noted that even though the model with an embedding of
20 had a lower MSE, it failed to match the interictal time series as well as the embedding 50
model. In particular it failed to match the amplitude characteristics. The embedding 50 model
matched the time series the best while at the same time maintaining complexity that closely
resembled the interictal time series. In comparison to the embedding 30 model, the embedding
50 model still had better amplitude characteristics. Therefore we chose the embedding 50
model to represent the interictal time series.
Having successfully modeled the interictal time series using the TR technique, the same training
technique was applied to the postictal time series, the results of which are shown in figures 3.9
and 3.10. The embeddings of 5, 20 and 30 produced slightly lower MSE values. Having shown that
the MSE was not a reliable estimate, we focused more on the complexity of the models. The
embeddings of 20, 40 and 50 produced complexity results similar to the postictal time series
complexity. The postictal time series had a complexity of 1.64 for Lmax and 6.33 for correlation
dimension (Table 3.2). The embedding 50 model had the closest complexity with an Lmax of 1.59 and a
correlation dimension of 6.33. The embedding 40 model had an Lmax of 1.73 and a correlation
dimension of 6.39. The embedding 20 model had an Lmax of 1.68 and a correlation dimension of
6.21. Figure 3.10 compares the three top RBF models to the postictal time series. The
embedding 20 and 40 models lacked the ability to match the amplitude characteristics as well
as the embedding 50 model. The embedding 50 model was selected as the postictal model.
Figure 3.7 – Results of interictal RBF training with tree regression
Comparing MSE and complexity of RBF models with different embeddings. Models were tested in recurrent mode. It can be noted that the lowest MSE occurred for embeddings of 5, 20 and 30. Models with embeddings 20, 30 and 50 produced the closest complexity to the interictal training data. Panels: MSE, maximum Lyapunov exponent (Lmax) and correlation dimension, each plotted against embedding.
Figure 3.8 – RBF interictal model after training with tree regression
a) The interictal time series data that the RBF model is striving to replicate. b) The chosen RBF model operated in recurrent mode showing strong resemblance to the interictal time series. The model has an embedding of 50 and uses 99 RBFs. c) The RBF model with 30 embedding and 112 RBFs had slightly better complexity and lower MSE but lacked in the amplitude when compared to the interictal data. d) Similarly the RBF model with 20 embedding and 139 RBFs had good complexity but also lacked in amplitude characteristics. All panels plot Voltage (mV) against Time.
Figure 3.9 – Results of postictal RBF training with tree regression
Comparing MSE and complexity of RBF models with different embeddings. Models were tested in recurrent mode. It can be noted that the lowest MSE occurred for embeddings of 5 and 20. Embedding models 20, 40 and 50 had the closest matching complexity to the postictal time series. Panels: MSE, maximum Lyapunov exponent (Lmax) and correlation dimension, each plotted against embedding.
Figure 3.10 – RBF postictal model after training with tree regression
a) The postictal time series data that the RBF model is striving to replicate. b) The chosen RBF model operated in recurrent mode showing strong resemblance to the postictal time series while maintaining the closest matching complexity. The model has an embedding of 50 and uses 128 RBFs. c) The RBF model with 40 embedding and 146 RBFs had slightly better complexity and lower MSE but lacked in the amplitude when compared to the postictal data. d) Similarly the RBF model with 20 embedding and 156 RBFs had good complexity but also lacked in amplitude characteristics. All panels plot Voltage (mV) against Time.
CHAPTER 4
MODELING SPONTANEOUS SEIZURE LIKE EVENTS
At this point we have established an RBF model of both the interictal and postictal time series. The
RBF model was able to capture the necessary features of the biological system. It was able to
maintain a recurrent mode of time series generation while at the same time matching the
amplitude characteristics and the complexity of the biological system. In this chapter we
introduce some SLE models found in the literature. We then summarize in detail the
spontaneous SLE model that will be used to test the RBF stimulators.
4.1 Literature Review
Modeling of spontaneous seizure-like episodes has been achieved using computational models
[8][36][37][38]. When modeling epilepsy one has to take into consideration that not all epilepsy
disorders are the same. Epilepsy can occur in many different regions of the brain and each has
its own unique characteristics. Here we review some of the
models available in the literature.
In 2002, Wendling et al. constructed a seizure model of human epilepsy using intracerebral
EEG recordings from the human hippocampus. They created a macroscopic model which
represented the neurodynamics of four populations of neurons using 2nd order differential
equations with a static nonlinearity [36]. The four populations were: main cells
(pyramidal cells in the hippocampus or neocortex), two feedback subsets composed of local
interneurons (either excitatory or inhibitory) and a fourth subset representing the inhibitory
interneurons with faster kinetics [36]. The model produced waveforms very similar to the
interictal and ictal activity found in human epilepsy.
In 2004, Suffczynski et al. used a bistable neural network model to create a macroscopic model
for rat absence epilepsy [37]. They modeled the thalamo-cortical circuits based on relevant
physiological data. Transitions between the ictal and interictal states were determined
randomly with constant probabilities. They managed to model the seizure-like oscillations fairly
accurately. However, the model was not designed to produce the extracellular type signals
recorded from intracerebral electrodes.
Recently Zalay et al. modeled temporal lobe epilepsy of the rat hippocampus [8]. Like the other
models, this model represented the macroscopic neurodynamics of populations of neurons.
The model consisted of cognitive rhythm generators (CRGs) defined by four differential
equations. The output of each CRG was calculated using a static nonlinearity. Furthermore the
model was able to produce an extracellular-like signal by taking the outputs of each CRG and
summing them relative to a centre point based on a topological square relationship between
the four CRGs. The model closely matched the real extracellular recording from the low Mg2+
spontaneous seizure setup in the rat hippocampus.
The training of the RBF stimulation models was based on temporal lobe epilepsy, which
made the Cognitive Rhythm Generator Seizure-Like Event (CRGSLE) model by
Zalay et al. an appropriate choice on which to test our stimulation. The CRGSLE
model is described further in the following section.
4.2 CRG Based Spontaneous Seizure-Like Event Model
The CRGSLE model was chosen for validation of our hypothesis that stimulating with an HPC non-ictal
signal (i.e. interictal or postictal) would produce successful suppression of an ictal event.
The strengths of the model are that it models temporal lobe epilepsy, produces spontaneous
seizure-like events (SLEs) and produces an extracellular signal that mimics the extracellular
recordings used in training.
As mentioned earlier, the CRGSLE (see figure 4.1) consists of four CRGs that
represent different populations of neurons through 2nd order limit cycle dynamics and a static
nonlinearity connecting the state variables to the output waveform [8]. The coupling between
the different CRGs is done with an exponential impulse response function, which is referred to
as an ‘integrating mode’ [8]. The combined dynamics of the nth CRG are defined by four differential
equations,
$$\dot{x}_{1n} = \omega_n\left[x_{2n}\left(1 + R_{\phi,n}\right) + x_{1n}\left(1 + R_{a,n} - x_{1n}^2 - x_{2n}^2\right)\right], \tag{4.1}$$
$$\dot{x}_{2n} = \omega_n\left[-x_{1n}\left(1 + R_{\phi,n}\right) + x_{2n}\left(1 + R_{a,n} - x_{1n}^2 - x_{2n}^2\right)\right], \tag{4.2}$$
$$\dot{x}_{3n} = x_{4n}, \tag{4.3}$$
$$\dot{x}_{4n} = \delta_n S_n(y) - 2\delta_n x_{4n} - \delta_n^2 x_{3n}, \tag{4.4}$$
where $\omega_n$ is the intrinsic angular frequency, $\delta_n$ is the parameter controlling the decay rate
given by state variable $x_{3n}$, and $R_{\phi,n}$ and $R_{a,n}$ are the phase and amplitude modulation functions
respectively [8].
$$S_n(y) = \sum_{j} c_{jn} y_j + u_n(t), \tag{4.5}$$
$$y_n = c_{0n} + x_{3n} + \left(x_{1n}^2 + x_{2n}^2\right) W\!\left(\tan^{-1}\left(x_{2n}/x_{1n}\right)\right), \tag{4.6}$$
where $S_n(y)$ is the mode input function, $y_j$ are the CRG outputs, $c_{jn}$ are the directional
coupling coefficients and $u_n(t)$ is the optional external input. $W(\cdot)$ is the intrinsic output
waveform of the CRG normalized over $(-\pi, \pi]$, with the 4-quadrant arctangent function
providing the instantaneous phase angle [8].
The phase and amplitude modulation functions are defined by,
$$R_{\phi,n} = c_{0n} + k_n x_{3n} + u_n^{\phi}(t), \tag{4.7}$$
$$R_{a,n} = 0, \tag{4.8}$$
where $k_n$ is a modulatory gain and $u_n^{\phi}(t)$ is an optional additive input. The spontaneous
seizure events generated by the CRGSLE model are shown in figure 4.2.
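As an illustration of the clock portion of a single CRG, the following minimal sketch integrates equations 4.1 and 4.2 for one uncoupled unit with both modulation functions held at zero. It assumes the equations take the form of a limit cycle oscillator with unit amplitude. The variable names (`x1`, `x2`, `omega`, `r_phi`, `r_a`), the forward-Euler integrator, the step size and the initial condition are assumptions made for this example and are not taken from [8].

```python
import math


def crg_clock_step(x1, x2, omega, r_phi=0.0, r_a=0.0, dt=1e-3):
    """One forward-Euler step of the clock equations (4.1)-(4.2).

    r_phi and r_a stand in for the phase and amplitude modulation
    functions; holding them at zero is an assumption of this sketch.
    """
    radial = 1.0 + r_a - x1 * x1 - x2 * x2
    dx1 = omega * (x2 * (1.0 + r_phi) + x1 * radial)
    dx2 = omega * (-x1 * (1.0 + r_phi) + x2 * radial)
    return x1 + dt * dx1, x2 + dt * dx2


def simulate_clock(omega=2.0 * math.pi, t_end=5.0, dt=1e-3, x1=0.1, x2=0.0):
    """Integrate the clock from a small initial amplitude; return the final state."""
    steps = int(round(t_end / dt))
    for _ in range(steps):
        x1, x2 = crg_clock_step(x1, x2, omega, dt=dt)
    return x1, x2


x1_end, x2_end = simulate_clock()
amplitude = math.hypot(x1_end, x2_end)  # settles close to 1 under these assumptions
```

Under these assumptions the state spirals out from the small initial amplitude and settles onto a cycle whose amplitude is close to one, which is the self-sustained rhythm that the static nonlinearity then maps to the output waveform.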
The extracellular field potential used to simulate the SLE time series was produced from the
outputs of the four CRGs (see figure 4.2b). It was created by treating each of the CRGs as a point
source and placing the center of the electrode above the center of the square-like
arrangement of the CRGs [8]. The extracellular seizure shares many features with the actual
seizure data from the rat slice, as shown in figure 4.3. The comparison in figure 4.3
verifies that the CRGSLE model is a good representation of the actual biological system we
are trying to stimulate.
Figure 4.1 – CRGSLE model
Diagram showing the configuration of the 4 CRGs used to create the SLEs. Each CRG is composed of three parts. First, the integrating mode takes all the inputs coming from the other CRGs and convolves them with its impulse response. It then feeds the result into the differential equations, which contain clock-like dynamics. The result of this clock portion is then fed to a mapper, which creates the output that can be fed into the other CRGs.
Figure 4.2 – CRGSLE model output waveforms produced
a) The extracellular recording of a spontaneous seizure-like event. b) The four CRG outputs. The CRGSLE produces spontaneous SLEs based on the outputs of the 4 CRGs. The 4 CRGs are combined to produce an extracellular type recording by treating each of the CRG outputs as point sources equidistant away from the extracellular recording region.
[Figure 4.2 panels, each plotting Voltage (mV) against Time (s): a) Unstimulated CRGSLE Model; b) CRG Outputs (four panels)]
Figure 4.3 – Comparison of the CRGSLE seizures to the actual seizures being modeled
a) Comparison of the seizures recorded from the rat hippocampus under low Mg2+ conditions and the seizures produced by the CRGSLE model. b) Close-up comparison of the actual biological and computational seizures.
[Figure 4.3 panels: a) Seizures Recorded From Hippocampus, with CRGSLE Seizures below; b) Closeup of Seizure Recorded From Hippocampus, with Closeup of CRGSLE Seizure below]
CHAPTER 5
CONTROLLING SEIZURES
This chapter compares the standard DBS periodic stimulation to the HPC interictal and
postictal RBF stimulation models. We first describe how the stimulation techniques were
applied to the CRGSLE model. We then quantify the stimulation efficacy
using ROC curves and the area under the ROC curves.
5.1 Application of Stimulation to CRGSLE Model
In section 4.2 we showed how the CRGSLE model was able to achieve an accurate
representation of the epilepsy found in the rat hippocampus. Now we describe how the same
model can receive inputs from external stimulation.
Providing responsive external stimulation to the CRGSLE model required two issues to be
addressed. First, the stimulation had to be added in the appropriate place to
mimic an external stimulation. Second, to apply a responsive stimulation it was necessary to
determine when the system was seizing.
The external stimulus was added in equation 4.5 to the mode input function $S_n(y)$ [8]. The
external stimulus $u_n(t)$ is equal to the gain multiplied by the stimulus being provided,
whether RBF interictal, RBF postictal or periodic. The gain made it possible to modify
the intensity of the applied input.
The decision to apply the stimulus was made by comparing the complexity of a CRG1 state
variable (related to the instantaneous phase of CRG1) to the specified excitation threshold
(exThr) parameter. The basis for this method is that the complexity of the model is
lower in the ictal state than in the interictal or postictal states, so the complexity drops
as the system moves from interictal to ictal. The complexity was
calculated by applying STLmax to windows of 5000 data points from that state variable. The
exThr was preset such that when the complexity fell to that value it would indicate
that the system was in the ictal mode, at which point the system would receive stimulation [8].
Once the system entered the postictal region the complexity would rise above the exThr
and the stimulation would be disengaged.
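The decision rule described above can be sketched as follows. The function names and the idea of passing in a precomputed complexity value are assumptions made for illustration. The STLmax computation itself is not reproduced here, and the numerical values are hypothetical.

```python
def stimulation_on(complexity, ex_thr):
    """Stimulation is engaged while the complexity has not risen above exThr."""
    return complexity <= ex_thr


def external_input(complexity, ex_thr, gain, stimulus_sample):
    """Return the external input added to the mode input function.

    The stimulus sample (RBF interictal, RBF postictal or periodic) is
    scaled by the gain while stimulation is engaged, and is zero otherwise.
    """
    if stimulation_on(complexity, ex_thr):
        return gain * stimulus_sample
    return 0.0


# Hypothetical values: a complexity of 0.05 is below exThr = 0.2 (ictal),
# while a complexity of 0.60 is above it (stimulation disengaged).
u_ictal = external_input(complexity=0.05, ex_thr=0.2, gain=0.01, stimulus_sample=0.8)
u_rest = external_input(complexity=0.60, ex_thr=0.2, gain=0.01, stimulus_sample=0.8)
```

Keeping the gain as a separate multiplier mirrors the text: the same stimulus waveform can be applied at different intensities without retraining or regenerating it.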
In figure 5.1 we show the feedback configuration of the stimulator and CRGSLE computer
model. Note that the dotted line going to the stimulator from the CRGSLE output represents
the previous m embedding points used to initialize and reinitialize the RBF stimulators.
Figure 5.1 – Stimulation Setup
The stimulation setup that combines the CRGSLE model and the RBF and periodic stimulators. The excitability connection acts like a feedback indicating if stimulation should be applied or not. Stimulation is the stimulator output. Model output is the extracellular field created by the CRGSLE Model.
5.2 Periodic Stimulator Frequency Selection
There is no clearly defined stimulation frequency that works best with DBS. Generally the
frequency is tuned within the range 0-300Hz until the best result is achieved. In our case we chose the
stimulation frequency based on the FFT of the RBF interictal and postictal stimulation models.
The FFT of one RBF postictal prediction is shown in figure 5.2a. Although the FFTs vary from one
RBF prediction to another, one feature was common to all the interictal and postictal
predictions: the 12Hz component had the strongest amplitude.
Thus, to make a fair comparison with the RBF stimulators, we used a 12Hz periodic stimulation.
The FFT of the 12Hz periodic stimulation is shown in figure 5.2b.
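The frequency selection can be illustrated with a short sketch that finds the strongest non-DC component of a signal. It uses a direct discrete Fourier transform rather than an FFT library so that it is self-contained. The sampling rate, the synthetic test signal and the function names are assumptions for this example and do not come from the recordings.

```python
import cmath
import math


def amplitude_spectrum(signal, fs):
    """Single-sided amplitude spectrum from a direct O(N^2) DFT.

    Returns (frequencies in Hz, amplitudes) for bins 0..N//2.
    """
    n = len(signal)
    freqs, amps = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(signal))
        # DC and Nyquist bins appear once; all other bins are doubled.
        scale = 1.0 if k == 0 or (n % 2 == 0 and k == n // 2) else 2.0
        freqs.append(k * fs / n)
        amps.append(scale * abs(acc) / n)
    return freqs, amps


def dominant_frequency(signal, fs):
    """Frequency (Hz) of the largest non-DC spectral component."""
    freqs, amps = amplitude_spectrum(signal, fs)
    k_max = max(range(1, len(amps)), key=lambda k: amps[k])
    return freqs[k_max]


# Synthetic signal (an assumption, not thesis data): 12 Hz plus a weaker 30 Hz term.
fs = 200.0
signal = [math.sin(2 * math.pi * 12 * i / fs) + 0.3 * math.sin(2 * math.pi * 30 * i / fs)
          for i in range(200)]
peak_hz = dominant_frequency(signal, fs)
```

Applied to each RBF prediction in turn, a routine of this kind would report the component that the periodic stimulator is then matched to.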
Figure 5.2 – FFT Comparison of RBF Stimulator and 12Hz Periodic Stimulator
a) Sample RBF postictal prediction and its FFT, showing that 12Hz is the highest amplitude frequency component in the RBF prediction. 12Hz was the dominant frequency found across all interictal and postictal RBF predictions after training. b) The 12Hz periodic stimulation signal, with the FFT showing the strong 12Hz component and its harmonics.
[Figure 5.2 panels: a) Postictal RBF Model, Voltage (mV) against Time, with the Single-Sided Amplitude Spectrum of Postictal Training Data (y(t)), |Y(f)| against Frequency (Hz); b) 12Hz Periodic Stimulation, Voltage (mV) against Time, with the Single-Sided Amplitude Spectrum of 12Hz Periodic Stimulation (y(t)), |Y(f)| against Frequency (Hz)]
5.3 Results of RBF Stimulation
Having described how the CRGSLE model receives external stimulation, we now present
the results of the periodic, interictal and postictal stimulations. Figure 5.3 compares the
results of the three stimulations. There is a clear
reduction in the number of seizures after the interictal stimulation (figure 5.3b) and an even
greater reduction after postictal stimulation (figure 5.3c). The periodic stimulation produced only
a slight reduction in seizures (figure 5.3d). This result was achieved with an exThr of 0.2 and a gain
of 0.01. To better assess the dependence of the stimulation on these parameters, an ROC analysis was
performed over a sweep of gain and exThr values.
Figure 5.3 – Stimulation of the CRGSLE model with interictal, postictal and periodic stimulations
a) The unstimulated SLE model, against which the results of the three stimulation techniques are compared. b) The application of the interictal RBF stimulation reduced the number of seizures by roughly two thirds. c) The postictal stimulation reduced the number of SLEs even more. d) The application of the periodic stimulation also reduced the number of SLEs, but not nearly as much as the interictal and postictal stimulations.
[Figure 5.3 panels, Voltage (mV) against Time: a) Extracellular Recording of the Unstimulated CRGSLE Model; b) Stimulated CRGSLE Model by Interictal RBF, with the Interictal RBF Stimulation below; c) Stimulated CRGSLE Model by Postictal RBF, with the Postictal RBF Stimulation below; d) Stimulated CRGSLE Model by Periodic Stimulation of 12Hz, with the Periodic Stimulation of 12Hz below]
5.4 ROC Measurements
The Receiver Operating Characteristic (ROC) is a practical technique for evaluating and
comparing the success of predictions [39]. It came about as a way to deal with
complicated cases where the distribution of positive and negative classes was strongly skewed.
For example, in the diagnosis of cancer a negative prediction is far more likely
to be correct than a positive one. This bias tends to favour procedures that
predict negative regardless of the evidence. ROC
compensates for this by dividing the predictions into four cases: true positive (TP), false
negative (FN), false positive (FP) and true negative (TN). From these cases we can calculate the
true positive rate (TPR) and the false positive rate (FPR).
$$TPR = \frac{TP}{TP + FN} \tag{5.1}$$
$$FPR = \frac{FP}{FP + TN} \tag{5.2}$$
The TPR reflects the sensitivity of the prediction, that is, the fraction of actual positive cases
that are detected. The FPR represents 1-specificity, where specificity measures how well the
negative cases are recognized as negative. The values of TPR and FPR are found by varying the detection
threshold. The points are then plotted on an ROC curve with the x-axis representing
1-specificity and the y-axis representing sensitivity. Looking back at the cancer prediction
example, a high TPR means that we are catching the positive cases very well.
However, a high FPR means that we are also making many mistakes by incorrectly classifying
many of the negative cases as positive. Ideally we want a system to have a high TPR and a low
FPR, meaning that the system is both sensitive and specific.
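The two rates in equations 5.1 and 5.2 can be computed directly from the four counts. The sketch below uses hypothetical counts chosen only to illustrate the arithmetic. The function name and the convention of returning zero for an empty class are assumptions of this example.

```python
def roc_rates(tp, fn, fp, tn):
    """Return (TPR, FPR) as defined in equations 5.1 and 5.2.

    A rate is reported as 0.0 when its denominator is zero; this
    convention is an assumption of the sketch.
    """
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical counts: 50 positive cases and 50 negative cases.
tpr, fpr = roc_rates(tp=40, fn=10, fp=5, tn=45)
```

Here 40 of the 50 positive cases are caught and 5 of the 50 negative cases are misclassified, so the point plotted on the ROC curve lies at 1-specificity 0.1 and sensitivity 0.8.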
Although ROC is often used for prediction, we made a slight modification here to apply it to the
evaluation of seizure control efficacy. In our case sensitivity measures how effective the
stimulation is. The specificity measures how accurately the stimulation is applied. A low
specificity means that the stimulation is applied all the time whether a seizure is present or not.
A high specificity means that we only apply stimulation to the strong seizures. Therefore the
same ROC curve profile is achieved even though we applied it to a control system.
5.4.1 ROC Curve Construction
As mentioned in the previous section, our application is not a prediction but rather a
control problem, so the general usage of ROC had to be modified. We needed a way to convert
the outcome of seizure control into TP, FP, TN and FN subgroups. To do this we tracked three
variables over time. The first
was the SLE complexity without any stimulation. The second was the SLE complexity with
stimulation. The last was the time series of the stimulation itself. Using a threshold of 0.2,
we went through the time series of the first two variables and placed a 1 at every time the
model was in the ictal state. For the third variable we placed a 1 at every time the stimulation was
being applied. Then using table 5.1 we assigned each time point to a case (e.g. all three variables
equal to 1 is a False Negative (FN)). The TP, FP, TN and FN were tallied. Equations 5.1 and 5.2 were then
used to find the TPR and FPR so that we could plot sensitivity vs 1-specificity (TPR vs FPR).
We repeated the process for randomly modified CRGSLE parameters, creating a
slightly different SLE model each time to reflect the differences across patients. The process
was also repeated for different stimulation threshold (exThr) values and different gains. In this
way the evaluation was not biased by a single favourable model.
Table 5.1 – Determination of ROC cases
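| Control Case | TN | TP | FN | FN | FP | TP | FP | FN |
|---|---|---|---|---|---|---|---|---|
| Seizure Before Stimulation | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| Seizure After Stimulation | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
| Stimulation Applied | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |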
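The lookup in table 5.1 can be expressed as a small mapping from the three binary variables to a case, followed by a tally over time. The mapping below copies the table column by column. The function name and the short example series are hypothetical.

```python
from collections import Counter

# (seizure before, seizure after, stimulation applied) -> ROC case, per table 5.1
ROC_CASE = {
    (0, 0, 0): "TN", (1, 0, 0): "TP", (0, 1, 0): "FN", (1, 1, 0): "FN",
    (0, 0, 1): "FP", (1, 0, 1): "TP", (0, 1, 1): "FP", (1, 1, 1): "FN",
}


def tally_cases(before, after, stim):
    """Count TP, FP, TN and FN over three aligned binary time series."""
    if not (len(before) == len(after) == len(stim)):
        raise ValueError("the three series must have the same length")
    return Counter(ROC_CASE[triple] for triple in zip(before, after, stim))


# Short hypothetical series, one entry per time step.
before = [0, 1, 1, 1, 0, 0]
after = [0, 0, 1, 0, 1, 0]
stim = [0, 1, 1, 0, 1, 1]
counts = tally_cases(before, after, stim)
```

The resulting counts are what equations 5.1 and 5.2 take as input to produce one ROC point for a given exThr and gain.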
5.4.2 ROC Curve Comparison
To create the ROC curves we divided the stimulation models into three groups:
periodic, interictal and postictal. Each group was further divided into four subgroups according to
the gain used in the stimulation: 0.01, 0.1, 1 and 10. The
exThr was swept over 25 values (0, 0.01:0.02:0.09, 0.1:0.10:0.9, 1:1:10) to produce
25 ROC points. To provide statistical significance we created 32
replicate SLE models (samples) with slight changes in the coupling parameters. The changes
produced slightly different dynamics, to better reflect the differences found
in the population. The ROC results were then averaged for each stimulation gain and model.
We found that the gain of 0.01 produced the best results. We then constructed the ROC curve
using the 0.01 gain to compare the different stimulator models. The ROC result is shown in
figure 5.4.
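The parameter sweep can be written out explicitly to confirm that the three ranges plus zero give 25 threshold values. The sketch below also counts the total number of simulations under the assumption of one run per combination of stimulator, gain, threshold and replicate model. That bookkeeping and all names are assumptions of this example.

```python
def ex_thr_grid():
    """Build the exThr sweep 0, 0.01:0.02:0.09, 0.1:0.1:0.9, 1:1:10."""
    grid = [0.0]
    grid += [round(0.01 + 0.02 * i, 2) for i in range(5)]   # 0.01 ... 0.09
    grid += [round(0.1 * i, 1) for i in range(1, 10)]       # 0.1 ... 0.9
    grid += [float(i) for i in range(1, 11)]                # 1 ... 10
    return grid


GAINS = [0.01, 0.1, 1, 10]
STIMULATORS = ["periodic", "interictal", "postictal"]
N_MODELS = 32  # replicate SLE models

thresholds = ex_thr_grid()
# One run per (stimulator, gain, exThr, model) combination: an assumption.
n_runs = len(STIMULATORS) * len(GAINS) * len(thresholds) * N_MODELS
```

Each threshold contributes one ROC point per stimulator and gain, and the 32 replicate models supply the samples that are averaged at each point.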
Figure 5.4 – ROC comparison of the periodic, interictal and postictal stimulation
The sensitivity of the complex stimulations is significantly superior to the periodic stimulation particularly as the specificity reduces and more stimulation occurs.
[Figure 5.4 plot: ROC Curve Comparison of Stimulation Models Under 0.01 Gain; Sensitivity against 1-Specificity for the Interictal, Postictal and Periodic stimulators]
The ROC curve shows that both interictal and postictal stimulation produced better ROC results
than the periodic stimulation. At very high specificity all the models performed poorly because
there was only minor stimulation, since the exThr parameter was set too high. At
low specificity, when all models are applied at low exThr, the best results were achieved by the
interictal and postictal stimulations. The complexity of the interictal and postictal
stimulation models allowed the CRGSLE model to better maintain its intrinsic complexity even
during the ictal events.
5.4.3 Area Under the ROC Curve
The performance summarized by the ROC curve can be further quantified using the area under the curve.
The larger the area under the curve, the better the model being evaluated. Here we used
the trapezoidal rule to calculate the area under the curve. Figure 5.5 shows the areas of
the three ROC curves from figure 5.4, where the three stimulator models were compared at the 0.01
gain. The result indicates that there is an improvement in control of seizures using the HPC
stimulation of the interictal and postictal models. The postictal performed slightly better than
the interictal based on the area under the ROC curve. We then applied Student's t-test and
the Wilcoxon rank sum test to test whether pairs of samples were significantly different
over the population of 32. The results are shown in table 5.2. There was no significant difference
between the interictal and postictal performance; however, both the interictal and postictal results
were significantly different from the periodic stimulation.
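The trapezoidal rule applied to a set of ROC points can be sketched as follows. The example points are hypothetical, and whether the end points (0, 0) and (1, 1) were included in the original calculation is not stated, so including them here is an assumption.

```python
def roc_area(points):
    """Area under an ROC curve by the trapezoidal rule.

    points is an iterable of (FPR, TPR) pairs; they are sorted by FPR first.
    """
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical ROC points, including the (0, 0) and (1, 1) end points.
curve = [(0.0, 0.0), (0.2, 0.5), (0.6, 0.8), (1.0, 1.0)]
area = roc_area(curve)

# The chance diagonal gives an area of one half.
chance_area = roc_area([(0.0, 0.0), (1.0, 1.0)])
```

An area above the chance value of 0.5 indicates that the stimulation suppresses seizures more often than it fails, which is the sense in which the areas of the three stimulators are compared.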
Figure 5.6 shows how the ROC areas of the three stimulators vary with the stimulation gain. The
periodic stimulation varied very little with changes in stimulation gain. However, the highly
complex interictal and postictal stimulation models had higher success at
lower gains. This means that a strong stimulation (high gain) was not necessary for the HPC
interictal and postictal models to be successful.
We also increased the RBF recurrent mode prediction duration
from 50 to 500 data points (see figures 5.6 and 5.7). Since stimulation was applied in discrete
blocks, this meant longer stimulation of the CRGSLE computer model. In addition,
the reinitialization to match the current state occurred less often, allowing for more
divergence between the RBF model and the CRGSLE model. We observed that, for the low gains, the
variation of the areas changed more than the mean area. This suggests that by applying
longer duration stimulations we are at times suppressing the seizures more successfully and at
other times creating seizures by overstimulating. Thus it is important to use shorter
duration stimulation to allow for more consistent seizure suppression.
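The effect of the prediction duration can be pictured with the following sketch of block-wise recurrent prediction. The names `recurrent_block`, `stimulate` and `toy_predictor`, and the toy one-step predictor itself, are hypothetical stand-ins for a trained RBF model. Only the seeding from the previous m points and the block length (50 or 500 points) come from the text.

```python
def recurrent_block(predict_next, seed, block_len):
    """Run a one-step predictor in recurrent mode for block_len points.

    seed holds the previous m samples; each prediction is fed back
    as an input for the next step.
    """
    window = list(seed)
    out = []
    for _ in range(block_len):
        nxt = predict_next(window)
        out.append(nxt)
        window = window[1:] + [nxt]
    return out


def stimulate(predict_next, model_output, m, block_len, n_blocks):
    """Generate stimulation in blocks, re-seeding from the latest m model outputs."""
    stimulation = []
    for b in range(n_blocks):
        end = m + b * block_len
        seed = model_output[end - m:end]  # previous m embedding points
        stimulation += recurrent_block(predict_next, seed, block_len)
    return stimulation


def toy_predictor(window):
    """Toy stand-in for the RBF model: the next value is the window mean."""
    return sum(window) / len(window)


model_output = [float(i % 7) for i in range(200)]  # placeholder for the CRGSLE output
stim = stimulate(toy_predictor, model_output, m=50, block_len=50, n_blocks=3)
```

With a block length of 500 instead of 50 the predictor runs ten times longer on its own output before being re-seeded, which is where the additional divergence from the CRGSLE state would arise.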
Lastly we verified that the periodic stimulation remained fairly consistent across different
frequencies. We compared the ROC areas using three periodic stimulators:
12Hz, 60Hz and 200Hz. The population size was 16, which was enough
to show the trend with frequency. The results are shown in figure 5.8. As can be seen, the ROC area
does not vary significantly across frequencies or gains. Granted, there was a large drop for the
high gain in the 200Hz stimulation, but for the remainder of the frequencies and gains the
results were fairly consistent. This showed that gain and frequency had little effect on the
success of periodic stimulation.
Figure 5.5 – ROC area under the curve comparison of the periodic, interictal and postictal stimulation models
Further verification of the area under the ROC curve shows that interictal and postictal achieve better results than the periodic stimulation. Postictal produced slightly better results than the interictal. The error bars represent standard error for a population size of 32.
Table 5.2 – 0.01 Gain ROC Area Significance
| | Interictal : Postictal | *Interictal : Periodic | *Postictal : Periodic |
|---|---|---|---|
| p (t-test) | 0.832 | 0.011 | 0.004 |
| p (Wilcoxon) | 0.722 | 0.027 | 0.002 |
[Figure 5.5 plot: ROC Area vs Stimulation at the 0.01 Gain of Each Model; ROC Area for the Interictal, Postictal and Periodic stimulators]
Figure 5.6 – ROC area for different gains of the stimulation models with 50-point reinitialization
Comparison of the area under the ROC curve for different gains and different stimulators with reinitialization after 50 points of stimulation. The error bars represent the standard error, computed from a sample size of 16.
[Figure 5.6 plot: ROC Area vs Stimulation Gain of Each Model (50 reinitialization); ROC Area for the Interictal, Postictal and 12Hz Periodic stimulators at gains 0.01, 0.1, 1 and 10]
Figure 5.7 – ROC area for different gains of the stimulation models with 500-point reinitialization
Comparison of the area under the ROC curve for different gains and different stimulators with reinitialization after 500 points of stimulation. The error bars represent the standard error, computed from a sample size of 16. The variation of results is greater than that with reinitialization after 50 points.
[Figure 5.7 plot: ROC Area vs Stimulation Gain of Each Model (500 reinitialization); ROC Area for the Interictal, Postictal and 12Hz Periodic stimulators at gains 0.01, 0.1, 1 and 10]
Figure 5.8 – ROC area for different gains and different periodic frequencies
The periodic stimulation achieved results similar to the complex models at high gain. The complex models outperformed the periodic as the gains were reduced. The error bars represent the standard error.
[Figure 5.8 plot: ROC Area vs Stimulation Gain for Different Periodic Signals; ROC Area for the 12Hz, 60Hz and 200Hz Periodic stimulators at gains 0.01, 0.1, 1 and 10]
CHAPTER 6
DISCUSSION AND FUTURE WORK
In this chapter we discuss the three most notable results from this thesis. We then discuss the
implications of these results and what they mean for future work.
6.1 RBF Model Captures Complexity
In chapter 3 we showed that the RBF model we trained successfully captured the shape and
complexity of the interictal and postictal regions of a seizure. This was verified by operating the
model in recurrent mode and showing that the model sustained dynamics similar to the
interictal and postictal time series while at the same time maintaining similar complexity. The
RBF models were very robust to initialization conditions. For example, if the model was
initialized with an ictal time series it would still produce the interictal or postictal
dynamics. This ensured that even though the stimulation produced would vary based on
initialization, it would never degenerate into a constant (DC) stimulation. We
believe that by training the model on multiple slices from different specimens, the RBF
generalized, capturing the characteristics common across all the groups.
6.2 Complex RBF Stimulation Outperforms Periodic
The hypothesis that stimulating with HPC biologically based stimulation would successfully
reduce seizure occurrence comes from the understanding that under normal conditions the
brain functions in a highly complex possibly chaotic manner. This fact has been verified in
literature on numerous occasions [2][3][20][28]. Therefore it is reasonable to believe that to
achieve better results one needs to communicate with the brain in the same biologically based
language.
In this thesis we tested our HPC biologically based stimulators on the CRGSLE computational
model. The results in figure 5.3 show that interictal and postictal stimulation reduced the
number of seizures to a greater extent than the periodic stimulation. To quantitatively compare
the three stimulation methodologies we constructed ROC curves based on the success
of control. The ROC analysis was applied for multiple gains and the best gain for each model was chosen
for the final ROC comparison. The final comparison is shown in figure 5.4 and it shows that the
performance of interictal and postictal stimulation across different exThr values was significantly
better than the periodic. The difference between interictal and postictal was minimal. The
distinguishing difference between the periodic and the RBF model stimulators was the complexity.
Therefore the hypothesis was supported.
6.3 Low Gain More Successful in Complex Stimulation
The final and most significant finding of this thesis is that the HPC RBF stimulators performed
better with lower stimulation gain. The periodic stimulation showed little benefit from lower
gain. In fact the periodic stimulation tended to favour higher gain, as can be seen
in figure 5.6. In this respect the CRGSLE model appears to capture the dynamics of the SLE well, as many
findings in DBS show that higher gain periodic stimulation performs best. Generally the gain has
to be increased to achieve successful treatment. At larger gains the stimulation is not very
specific to the region intended to be stimulated and the likelihood of the stimulation affecting
other regions of the brain increases. This was not the case for the RBF interictal and postictal
stimulation, which both showed improvement with lower gain.
With lower gain stimulation they can focus the stimulation on the regions
that need it and avoid inducing other undesired effects on surrounding brain regions.
6.4 Future Work
The promising results achieved in this thesis are only the first steps. The model provided us with
a way to test the viability of our hypothesis in treating epilepsy. It has also left many questions
to be answered. Can the success achieved on the CRGSLE model be replicated in-vivo?
The next logical step will be to reproduce these results in-vivo.
Another question that arose is whether training on the higher frequency content would improve the
results. In our preprocessing of the training data we removed the majority of the noise by
filtering out all frequencies above 50Hz. This made the model easier to train and less
computationally demanding to implement. It also means that our stimulator may have been
lacking some key features that could help in suppressing seizures. Previous work by Chiu et
al. suggests that the higher frequencies are indicative of the seizure onset and hold valuable
features for detecting seizures [3]. In the future we will train on the higher frequencies to see if
their inclusion yields better results. Due to the high noise content at higher frequencies we
may need to use other means to capture the higher frequency features. The Neural Rhythm
Extractor (NRE) developed by Zalay et al. is one proposed method to capture the frequency
information [40]. The NRE, which at its heart uses a wavelet packet transform, will find the main
frequency bands of the interictal and postictal time series. The RBF can then be selectively
trained on those bands. By stimulating the CRGSLE with different RBFs trained on
different bands we will track down the frequencies responsible for successful seizure
suppression.
As successful as the RBF has been in capturing the low frequency content, we feel that we can
do better. A good substitute training model being considered is the Restricted Boltzmann
Machine (RBM) developed by Hinton [41]. The RBM is a more complex model and, although it is
based on ANNs, it is trained highly effectively in an unsupervised manner using a random
dream-like state. The
use of RBMs brings about a computational issue that needs to be addressed to achieve real
time stimulation. This will mean that we will need to move away from computer based
stimulation to hardware stimulation through devices such as Field Programmable Gate Arrays
(FPGAs). There is a lot of work left to be done, but the goal of the future work will remain to
achieve the same success in-vivo as on the in-silico CRGSLE model.
CONCLUSION
With the aid of RBFs we have captured the highly complex possibly chaotic (HPC)
neurodynamics of interictal and postictal regions of seizure time series. We have applied these
stimulation techniques to a CRGSLE model and shown that the HPC stimulation significantly
outperformed the low complexity periodic stimulation. If the same results can be
achieved on a rat in-vivo model then this has serious potential to change the way we treat
epilepsy and paves the way towards new treatment opportunities for all those in need.
Bibliography
[1] A. Babloyantz, A. Destexhe, “Low-dimensional chaos in an instance of epilepsy”, Neurobiology, Vol. 83, pp. 3513-3517, 1986
[2] S. J. Schiff, K. Jerger, D. H. Duong, T. Chang, M. L. Spano, W. L. Ditto, “Controlling chaos in the brain”, Nature, Vol. 350, pp. 615-620, 1994
[3] A. W.L. Chiu, E. E. Kang, M. Derchansky, Peter L. Carlen, B. L. Bardakjian, “Online Prediction of Onsets of Seizure-like Events in Hippocampal Neural Networks Using Wavelet Artificial Neural Networks”, Annals of Biomedical Engineering, Vol. 34, pp. 282-294, 2006
[4] A. W. L. Chiu, M. Derchansky, E. E. Kang, P. L. Carlen, B. L. Bardakjian, “Prevention of Spontaneous Seizure-like Events in Both in-silico and in-vitro Epilepsy Models”, Engineering in
Medicine and Biology 27th Annual Conference, pp.1-4, 2005
[5] M. Hodaie, R. A. Wennberg, J. O. Dostrovsky, and A. M. Lozano, “Chronic Anterior Thalamus Stimulation for Intractable Epilepsy”, Epilepsia, Vol. 34, pp. 603-608, 2002
[6] Y. F. Sun, Y. C. Liang, W. L. Zhang, H. P. Lee, W. Z. Lin, L. J. Cao, “Optimal partition algorithm of the RBF neural network and its application to financial time series forecasting”, Neural
Computation & Applications, Vol. 14, pp. 36-44, 2005
[7] X. Li and Z. Deng, “A Machine Learning Approach to Predict Turning Points for Chaotic Financial Time Series”, 19th IEEE International Conference on Tools with Artificial Intelligence, pp. 331-335, 2007
[8] O. C. Zalay, D. Serletis, P. L. Carlen, B. L. Bardakjian, “System characterization of neuronal excitability and its relevance to spontaneous seizure-like transitions in a hippocampal network model”, Submitted to J Neuroscience, pp. 1-29, 2009
[9] C. Hamani, D. Andrade, M. Hodaie, R. Wennberg, and A. Lozano, “Deep brain stimulation for
the treatment of epilepsy”, Int. J Neural Systems, Vol. 19, pp.213-226, 2009
[10] B. M. Uthman, B. J. Wilder, J. K. Penry, C. Dean, R. E. Ramsay, S. A. Reid, E. J. Hammond,
W. B. Tarver and J. F. Wernicke, “Treatment of epilepsy by stimulation of the vagus nerve”,
Epilepsia, Vol. 34, pp. 1007-1016, 1993
[11] S. C. Schachter, “Vagus nerve stimulator therapy summary: five years after FDA approval”,
Neurology, Vol. 59, no. 6 Suppl. 4, pp. S15-20, 2002
[12] D.M. Andrade, D. Zumsteg, C. Hamani, M. Hodaie, S. Sarkissian, A.M. Lozano, and R.A.
Wennberg, “Long-term follow-up of patients with thalamic deep brain stimulation for epilepsy”,
Neurology, Vol. 66, pp. 1571-1573, 2006
[13] C. Pollo and J.G. Villemure, “Rationale, mechanisms of efficacy, anatomical targets and
future prospects of electrical deep brain stimulation for epilepsy”, Acta Neurochir. Suppl., Vol.
97, pp. 311-320, 2007
[14] K. Vonck, P. Boon, L. Goossens, S. Dedeurwaerdere, P. Claeys, F. Gossiaux, P. Van Hese, T.
De Smedt, R. Raedt, E. Achten, K. Deblaere, A. Thieleman, P. Vandemaele, E. Thiery, G.
Vingerhoets, M. Miatton, J. Caemaert, D. Van Roost, E. Baert, G. Michielsen, F. Dewaele, K. Van
Laere, V. Thadani, D. Robertson and P. Williamson, “Neurostimulation for refractory epilepsy”,
Acta Neurol. Belg., Vol. 103, pp. 213-217, 2003
[15] J.F. Tellez-Zenteno, R.S. McLachlan, A. Parrent, C.S. Kubu and S. Wiebe, “Hippocampal
electrical stimulation in mesial temporal lobe epilepsy”, Neurology, Vol. 66, pp. 1490-1494,
2006
[16] K. N. Fountas and J. R. Smith, “A novel closed-loop stimulation system in the control of
focal, medically refractory epilepsy”, Acta Neurochir. Suppl., Vol. 97, pp. 357-362, 2007
[17] K. N. Fountas, J. R. Smith, A. M. Murro, J. Politsky, Y. D. Park and P. D. Jenkins, “Implantation of a closed-loop stimulation in the management of medically refractory focal epilepsy: a technical note”, Stereotact Funct. Neurosurg, Vol. 83, pp. 153-158, 2005
[18] S. H. Strogatz, “Nonlinear Dynamics and Chaos”, Addison-Wesley Publishing Company, 1994
[19] A. Courville, “Chaosmakers for Epilepsy”, M.A.Sc. thesis, University of Toronto, 1998
[20] J. Gao, Y. Cao, W. Tung, J. Hu, “Multiscale Analysis of Complex Time Series”, Wiley, 2007
[21] A. Wolf, J. B. Swift, H. L. Swinney and J. A. Vastano, “Determining Lyapunov Exponents From A Time Series”, Physica D, Vol. 16, pp. 285-317, 1985
[22] L. D. Iasemidis, J. C. Sackellares, H. P. Zaveri, W. J. Williams, “Phase Space Topography and the Lyapunov Exponent of Electrocorticograms in Partial Seizures”, Brain Topography, Vol. 2, pp. 187-201, 1990
[23] S. P. Nair, D. Shiau, J. C. Principe, L. D. Iasemidis, P. M. Pardalos, W. M. Norman, P. R. Carney, K. M. Kelly, J. C. Sackellares, “An investigation on EEG dynamics in an animal model of
temporal lobe epilepsy using the maximum Lyapunov exponent”, Experimental Neurology, Vol. 216, pp. 115-121, 2009
[24] M.T. Rosenstein, J.J. Collins, and C.J. De Luca, “A practical method for calculating largest Lyapunov exponents from small data sets”, Physica D, Vol. 65, pp. 117-134, 1993
[25] P. Grassberger, and I. Procaccia, “Characterization of strange attractors”, Phys. Rev. Lett., Vol. 50, pp. 346-349, 1983
[26] A. Babloyantz, J.M. Salazar, “Evidence of chaotic dynamics of brain activity during the sleep cycle”, Phys. Letters, Vol. 111A, pp. 152-156, 1985
[27] J. Fell, J. Röschke, and P. Beckmann, “Deterministic chaos and the first positive Lyapunov exponent: a nonlinear analysis of the human electroencephalogram during sleep”, Biol. Cybern., Vol. 69, pp. 139-164, 1993
[28] A. Babloyantz, and A. Destexhe, “Low-dimensional chaos in an instance of epilepsy”, Proc.
Natl. Acad. Sci., Vol. 83, pp. 3513-3517, 1986
[29] M. Orr, J. Hallam, K. Takezawa, A. Murray, S. Ninomiya, M. Oide and T. Leonard, “Combining regression trees and radial basis function networks”, International Journal of
Neural Systems, pp. 1-17, 1999
[30] M.J.L. Orr, “Regularisation in the selection of radial basis function centres”, Neural
Computation, pp. 1-16, 1995
[31] R. Zemouri, D. Racoceanu, N. Zerhouni, “Recurrent radial basis function network for time-series prediction”, Eng. App. of Artificial Intelligence, Vol. 16, pp. 453-463, 2003
[32] H. Kantz, and T. Schreiber, “Nonlinear time series analysis”, Cambridge University Press, 1997
[33] J. R. Shewchuk, “An introduction to the conjugate gradient method without the agonizing pain”, Carnegie Mellon University, Ed. 1.25, 1994
[34] S. Chen, C.F.N. Cowan, and P.M. Grant, “Orthogonal least squares learning for radial basis function networks”, IEEE Transactions on Neural Networks, Vol. 2, pp. 302-309, 1991
[35] R.A. Horn, and C.R. Johnson, “Matrix Analysis”, Cambridge University Press, 1985
[36] F. Wendling, F. Bartolomei, J. J. Bellanger and P. Chauvel, “Epileptic fast activity can be
explained by a model of impaired GABAergic dendritic inhibition”, European Journal of
Neuroscience, Vol. 15, pp. 1499-1508, 2002
[37] P. Suffczynski, S. Kalitzin, Lopes Da Silva, “Dynamics of non-convulsive epileptic
phenomena modeled by a bistable neuronal network”, Neuroscience, Vol. 126, pp. 467-484,
2004
[38] F. Grimbert, O. Faugeras, “Bifurcation analysis of Jansen's neural mass model”, Neural
Comput, Vol. 18, pp. 3052-3068, 2006
[39] T. Fawcett, “An introduction to ROC analysis”, Pattern Recognition Letters, Vol. 27, pp. 861-
874, 2006
[40] O. C. Zalay, E. E. Kang, M. Cotic, P. L. Carlen, and B. L. Bardakjian, “A Wavelet Packet-Based
Algorithm for the Extraction of Neural Rhythms”, Annals of Biomedical Engineering, Vol. 37 No.
3, pp. 595-613, 2009
[41] G. E. Hinton, S. Osindero, Y. Teh, “A fast learning algorithm for deep belief nets”, Neural
Computation, Vol. 18, pp. 1527-1554, 2006