
Modeling Clinical Time Series Using Gaussian Process Sequences

Zitao Liu Lei Wu Milos Hauskrecht

Department of Computer Science, University of Pittsburgh

Motivation

Goal

“Develop accurate models of complex clinical time series!”

Specifically, a prediction model that can:

1. Handle missing values

2. Deal with irregular time sampling intervals

3. Make accurate long term predictions

Problem Statement

We define the time series prediction/regression function for clinical time series as $y = g(t, Y_{obs})$, where $Y_{obs} = \{(y_i, t_i)\}_{i=1}^{n}$ is a sequence of past observation-time pairs, $y_i$ is a p-dimensional observation vector made at time $t_i$ ($0 \le t_{i-1} < t_i$), and n is the number of past observations; $t$ ($t > t_n$) is the time at which we would like to predict the observation $y$. The series is irregularly sampled: in general $t_{i+1} - t_i \ne t_i - t_{i-1}$.

Development of accurate models of complex clinical time series data is

critical for understanding the disease, its dynamics, and subsequently

patient management and clinical decision making.

• Gaussian Process (GP)

GP is an extension of the multivariate Gaussian to distributions over functions, defined by two components, a mean function $m(\mathbf{x})$ and a covariance function $k(\mathbf{x}, \mathbf{x}')$:

Mean function: $m(\mathbf{x}) = \mathbb{E}[f(\mathbf{x})]$

Covariance function: $K(\mathbf{x}, \mathbf{x}') = \mathbb{E}[(f(\mathbf{x}) - m(\mathbf{x}))(f(\mathbf{x}') - m(\mathbf{x}'))]$

GP regression equations:

Estimated mean: $\bar{f}(\mathbf{x}_*) = K(\mathbf{x}_*, \mathbf{x})\,(K(\mathbf{x}, \mathbf{x}) + \sigma^2 I)^{-1}\,\mathbf{y}$

Estimated covariance: $\mathrm{Cov}(f(\mathbf{x}_*)) = K(\mathbf{x}_*, \mathbf{x}_*) - K(\mathbf{x}_*, \mathbf{x})\,(K(\mathbf{x}, \mathbf{x}) + \sigma^2 I)^{-1}\,K(\mathbf{x}, \mathbf{x}_*)$
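As an illustration, the two regression equations above can be sketched in a few lines of NumPy. The squared-exponential kernel, the noise level `sigma`, and all names are our own assumptions, not the authors' implementation:

```python
import numpy as np

# Minimal GP regression sketch of the posterior equations above.
# Kernel choice and sigma are illustrative assumptions.

def sq_exp_kernel(a, b, length=1.0):
    """Squared-exponential covariance between 1-D input sets a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_train, y_train, x_star, sigma=0.1):
    """Posterior mean and covariance of f at test inputs x_star."""
    n = len(x_train)
    K = sq_exp_kernel(x_train, x_train)      # K(x, x)
    K_s = sq_exp_kernel(x_star, x_train)     # K(x*, x)
    K_ss = sq_exp_kernel(x_star, x_star)     # K(x*, x*)
    Kn_inv = np.linalg.inv(K + sigma ** 2 * np.eye(n))
    mean = K_s @ Kn_inv @ y_train            # estimated mean
    cov = K_ss - K_s @ Kn_inv @ K_s.T        # estimated covariance
    return mean, cov
```

Near a training point the posterior mean reproduces the observation and the posterior variance collapses; far from all data the mean reverts to zero and the variance to the prior.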

• Discrete non-linear model (GPIL)

Y – time series of observations; Z – hidden states driving the dynamics.

$\mathbf{z}_{t+1} = r(\mathbf{z}_t) + \mathbf{w}_t$,  $\mathbf{y}_t = u(\mathbf{z}_t) + \mathbf{v}_t$

$\mathbf{z}_1 \sim \mathcal{N}(\boldsymbol{\mu}_1, V_1)$,  $\mathbf{w}_t \sim \mathcal{N}(0, Q)$,  $\mathbf{v}_t \sim \mathcal{N}(0, R)$

$r(\cdot)$ – unknown transition function; $u(\cdot)$ – unknown measurement function.

Acknowledgement

This research work was supported by grants R01LM010019 and R01GM088224 from the

National Institutes of Health. Its content is solely the responsibility of the authors and does not

necessarily represent the official views of the NIH.

Future Work

• Study and model dependencies among multiple time series

• Extend to switching-state and controlled dynamical systems

References

• M. Hauskrecht, M. Valko, I. Batal, G. Clermont, S. Visweswaran, and G. F. Cooper, "Conditional outlier detection for clinical alerting," in AMIA Annual Symposium Proceedings, 2010, p. 286.

• C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning, MIT Press, 2006.

• R. Turner, M. P. Deisenroth, and C. E. Rasmussen, "State-space inference and learning with Gaussian processes," in AISTATS, vol. 9, 2010, pp. 868-875.

• Data

• Evaluation Metric

Root Mean Square Error (RMSE): $\mathrm{RMSE} = \left( \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|^2 \right)^{1/2}$
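The RMSE metric above can be written as a small helper (names are illustrative):

```python
import numpy as np

# Root mean square error between observed and predicted values,
# matching the formula above; a convenience sketch, not library code.
def rmse(y_true, y_pred):
    diff = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.sqrt(np.mean(np.abs(diff) ** 2)))
```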

• Results

• Choice of Covariance Functions ($K$)

Mean-reverting property: $K_1(t, t') = a_1 \exp(-b_1 |t - t'|)$

Periodicity: $K_2(t, t') = a_2 \exp\!\left(-\tfrac{b_2}{2} \sin^2(\omega (t - t'))\right)$

Combined: $K = K_1 K_2$
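A sketch of these two covariance choices and their combination. The parameter names and default values are our own assumptions:

```python
import numpy as np

# Illustrative sketch of the two kernels above and their product.
# Parameter values (a, b, omega) are assumptions, not learned values.

def k_mean_reverting(t1, t2, a=1.0, b=1.0):
    # Ornstein-Uhlenbeck-style kernel: correlation decays with |t - t'|,
    # giving the mean-reverting property.
    return a * np.exp(-b * np.abs(t1 - t2))

def k_periodic(t1, t2, a=1.0, b=1.0, omega=np.pi):
    # Periodic kernel: depends only on sin^2 of the time difference,
    # so it repeats with period pi / omega in (t - t').
    return a * np.exp(-0.5 * b * np.sin(omega * (t1 - t2)) ** 2)

def k_combined(t1, t2):
    # Product of the two components, as in K = K_1 K_2 above.
    return k_mean_reverting(t1, t2) * k_periodic(t1, t2)
```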

Figure 2. Time series for six tests from the Complete Blood Count(CBC) panel for one of the patients.

Figure 3. Root Mean Square Error(RMSE) on CBC test samples.

Figure 1. Graphical representation of the state-space Gaussian process model. Shaded nodes $y_{i,j}$ denote (irregular) observations and $T_{i,j}$ denotes the times associated with each observation. Each rectangle (plate) corresponds to a window, which is associated with its own local GP. $s_i$ is the number of observations in each window; $f_{i,j}$ is the Gaussian field.

• State Space Gaussian Process (SSGP) Model

We consider the Gaussian process q(t) with the mean function formed by a combination of a fixed set of basis functions with coefficients β:

$q(t) = f(t) + \mathbf{h}(t)^T \boldsymbol{\beta}$,  $f(t) \sim \mathcal{GP}(0, K_f(t, t'))$

In this definition, f(t) is a zero-mean GP, $\mathbf{h}(t)$ denotes a set of fixed basis functions, for example $\mathbf{h}(t) = (1, t, t^2)^T$, and β has a Gaussian prior, $\boldsymbol{\beta} \sim \mathcal{N}(\mathbf{b}, I)$. Therefore, q(t) is another GP, defined by:

$q(t) \sim \mathcal{GP}\!\left(\mathbf{h}(t)^T \mathbf{b},\; K_f(t, t') + \mathbf{h}(t)^T \mathbf{h}(t')\right)$
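The construction of the q(t) prior (zero-mean GP plus fixed basis functions with a Gaussian coefficient prior) can be sketched as follows; the polynomial basis and the kernel passed in are illustrative assumptions:

```python
import numpy as np

# Sketch of the SSGP prior over q at a set of times ts:
# mean h(t)^T b and covariance K_f(t,t') + h(t)^T h(t'),
# assuming beta ~ N(b, I) as above.

def h(t):
    """Fixed polynomial basis (1, t, t^2), shape (3, n)."""
    return np.stack([np.ones_like(t), t, t ** 2])

def q_prior(ts, b, k_f):
    """Mean vector and covariance matrix of q at times ts."""
    H = h(ts)                               # basis evaluated at all times
    mean = H.T @ b                          # h(t)^T b for every t
    Kf = k_f(ts[:, None], ts[None, :])      # Gram matrix of the base GP
    cov = Kf + H.T @ H                      # extra term from the beta prior
    return mean, cov
```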

Background

• Linear Dynamical System (LDS)

Y – time series of observations; Z – hidden states driving the dynamics.

$p(\mathbf{z}_t \mid \mathbf{z}_{t-1}) = \mathcal{N}(A \mathbf{z}_{t-1}, Q)$,  $p(\mathbf{y}_t \mid \mathbf{z}_t) = \mathcal{N}(C \mathbf{z}_t, R)$

$\mathbf{z}_{t+1} = A \mathbf{z}_t + \mathbf{w}_t$,  $\mathbf{y}_t = C \mathbf{z}_t + \mathbf{v}_t$

$\mathbf{z}_1 \sim \mathcal{N}(\boldsymbol{\mu}_1, V_1)$,  $\mathbf{w}_t \sim \mathcal{N}(0, Q)$,  $\mathbf{v}_t \sim \mathcal{N}(0, R)$
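The LDS generative equations can be sketched as a short sampler; the matrix values in any use of it are illustrative, not learned parameters:

```python
import numpy as np

# Samples a trajectory from the LDS generative model above:
# z_1 ~ N(mu1, V1), z_{t+1} = A z_t + w_t, y_t = C z_t + v_t.

rng = np.random.default_rng(0)

def simulate_lds(A, C, Q, R, mu1, V1, steps):
    """Return `steps` observations y sampled from the LDS."""
    z = rng.multivariate_normal(mu1, V1)        # initial hidden state
    ys = []
    for _ in range(steps):
        # emit observation with measurement noise v_t ~ N(0, R)
        ys.append(C @ z + rng.multivariate_normal(np.zeros(C.shape[0]), R))
        # advance hidden state with process noise w_t ~ N(0, Q)
        z = A @ z + rng.multivariate_normal(np.zeros(A.shape[0]), Q)
    return np.array(ys)
```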

• Idea Illustration

Time

Va

lue

Time

Valu

e

Time

Valu

e

State Space Gaussian Process

• Learning

$\frac{\partial \log p(\mathbf{Y} \mid \Theta)}{\partial \Theta_j} = \frac{1}{2} \mathbf{Y}^T K^{-1} \frac{\partial K}{\partial \Theta_j} K^{-1} \mathbf{Y} - \frac{1}{2} \mathrm{Tr}\!\left(K^{-1} \frac{\partial K}{\partial \Theta_j}\right)$

Parameter set: $\Omega = \{\Theta, \{\boldsymbol{\beta}_i\}, A, C, R, Q, \boldsymbol{\mu}_1, V_1\}$ (Θ denotes the covariance function parameters)

Learn Θ: gradient-based methods.

Learn Ω\Θ: EM algorithm with $\mathbb{E}_{\boldsymbol{\beta}, \mathbf{z}}[\log p(\boldsymbol{\beta}, \mathbf{z}, \mathbf{Y})]$

Joint distribution: $p(D) = p(\mathbf{z}, \boldsymbol{\beta}, \mathbf{Y}) = p(\mathbf{z}_1) \prod_{i=2}^{m} p(\mathbf{z}_i \mid \mathbf{z}_{i-1}) \prod_{i=1}^{m} p(\boldsymbol{\beta}_i \mid \mathbf{z}_i) \prod_{i=1}^{m} \prod_{j=1}^{s_i} p(\mathbf{y}_{i,j} \mid \boldsymbol{\beta}_i)$
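The gradient of the GP log marginal likelihood used for learning Θ can be checked numerically. A sketch with an exponential kernel and a single scalar parameter; the kernel, jitter term, and all names are illustrative assumptions:

```python
import numpy as np

# Analytic gradient of log p(Y | theta) for a GP with an exponential
# kernel K = exp(-theta * |t - t'|) plus a small fixed jitter,
# following the formula above; verifiable against finite differences.

def kernel(ts, theta):
    d = np.abs(ts[:, None] - ts[None, :])
    return np.exp(-theta * d) + 1e-6 * np.eye(len(ts))

def log_lik(ts, y, theta):
    K = kernel(ts, theta)
    _, logdet = np.linalg.slogdet(K)
    return (-0.5 * y @ np.linalg.solve(K, y)
            - 0.5 * logdet - 0.5 * len(y) * np.log(2 * np.pi))

def grad_log_lik(ts, y, theta):
    K = kernel(ts, theta)
    d = np.abs(ts[:, None] - ts[None, :])
    dK = -d * np.exp(-theta * d)               # dK/dtheta (jitter is constant)
    Kinv_y = np.linalg.solve(K, y)
    # 1/2 y^T K^-1 dK K^-1 y - 1/2 Tr(K^-1 dK)
    return 0.5 * Kinv_y @ dK @ Kinv_y - 0.5 * np.trace(np.linalg.solve(K, dK))
```

Comparing `grad_log_lik` against a central finite difference of `log_lik` is a standard sanity check before plugging the gradient into an optimizer.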

• Prediction

To support the prediction inference, we need the following steps:

1. Split $Y_{obs}$ and t into windows.

2. For windows that do not contain t, extract the last values in those windows as βs and feed them into the Kalman filter algorithm to infer the most recent hidden state $\hat{\mathbf{z}}_k$, where k is the index of the last window that does not contain t.

3. Get $\boldsymbol{\beta}_{k+1}$ from $\hat{\mathbf{z}}_{k+1} = A \hat{\mathbf{z}}_k$ and $\boldsymbol{\beta}_{k+1} = C \hat{\mathbf{z}}_{k+1}$, i.e., $\boldsymbol{\beta}_{k+1} = C A \hat{\mathbf{z}}_k$.

4. If t is in window k+1, use the observations $(\mathbf{y}_{k+1}, \mathbf{t}_{k+1})$ in window k+1 and $\boldsymbol{\beta}_{k+1}$ to make the prediction $\hat{y}(t) = \boldsymbol{\beta}_{k+1} + K(t, \mathbf{t}_{k+1})\, K(\mathbf{t}_{k+1}, \mathbf{t}_{k+1})^{-1} (\mathbf{y}_{k+1} - \boldsymbol{\beta}_{k+1})$; otherwise, find the window index i to which t belongs. The prediction at t is then $\hat{y}(t) = C A^{i-k} \hat{\mathbf{z}}_k$.
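Step 2 relies on a standard Kalman filter pass over the per-window β values. A minimal sketch, treating each window's β as the observation; the matrices and names are illustrative assumptions, not the learned SSGP parameters:

```python
import numpy as np

# Standard Kalman filter over the sequence of window betas to infer
# the most recent hidden state mean; the LDS matrices are inputs.

def kalman_filter(betas, A, C, Q, R, mu1, V1):
    """Return the filtered mean of the hidden state after the last window."""
    mu, V = mu1, V1                       # prior for the first hidden state
    first = True
    for beta in betas:
        if not first:
            mu = A @ mu                   # predict step: propagate mean
            V = A @ V @ A.T + Q           # propagate covariance
        first = False
        S = C @ V @ C.T + R               # innovation covariance
        K = V @ C.T @ np.linalg.inv(S)    # Kalman gain
        mu = mu + K @ (beta - C @ mu)     # update with this window's beta
        V = (np.eye(len(mu1)) - K @ C) @ V
    return mu
```

With constant observations the filtered mean converges toward the observed value, which makes a simple sanity check.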
