
An algebraic-numeric algorithm for the model selection in kinetic networks

Hiroshi Yoshida (Faculty of Math., Kyushu Univ.)
Koji Nakagawa (CBRC, AIST)
Hirokazu Anai (Fujitsu Lab. LTD./CREST JST.)
Katsuhisa Horimoto (CBRC, AIST)

CASC2007, Bonn

Contents

Introduction of kinetic network
Model and Method
(1) Laplace transformation of model formulae and observed data
(2) Matching
(3) Model consistency estimation
Result
Summary
Announcement of AB2008

Background

Metacore: database of networks describing interactions between genes or proteins
D-sep test: can deal only with directed acyclic graphs (DAGs)

Networks containing cycles cannot be dealt with

Aim: select the model that is most (or more) consistent with the observed data

Which is Consistent?

[Figure: two candidate networks over the nodes A, B and C. (A) Model I; (B) Model II]

More…

To select the model most (or more) consistent with the given sampling data, we have performed model selection over the Laplace domain using algebraic equations.

Model (Example)

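SLA --kSN--> N --kNA--> A --kAG--> G

Assuming a linear relation between the variables, the kinetics of the above network can be described as the following system of differential equations:

\[
\begin{aligned}
\frac{d}{dt}\,\mathrm{SLA}(t) &= -k_{SN}\,\mathrm{SLA}(t) \\
\frac{d}{dt}\,N(t) &= k_{SN}\,\mathrm{SLA}(t) - k_{NA}\,N(t) \\
\frac{d}{dt}\,A(t) &= k_{NA}\,N(t) - k_{AG}\,A(t) \\
\frac{d}{dt}\,G(t) &= k_{AG}\,A(t)
\end{aligned}
\]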
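The linear chain above (SLA to N to A to G, with rate constants kSN, kNA, kAG) can be integrated numerically to see what such kinetics look like. The following is a minimal sketch, assuming a classical fourth-order Runge-Kutta scheme; the function names, rate constants and initial values are illustrative and not taken from the slides.

```python
import math


def chain_rhs(state, k_sn, k_na, k_ag):
    """Right-hand side of the linear chain SLA -> N -> A -> G."""
    sla, n, a, g = state
    return (
        -k_sn * sla,
        k_sn * sla - k_na * n,
        k_na * n - k_ag * a,
        k_ag * a,
    )


def rk4_step(state, dt, *rates):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = chain_rhs(state, *rates)
    k2 = chain_rhs(shifted(state, k1, dt / 2), *rates)
    k3 = chain_rhs(shifted(state, k2, dt / 2), *rates)
    k4 = chain_rhs(shifted(state, k3, dt), *rates)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state, rates, dt, steps):
    """Advance the state by `steps` steps of size `dt`."""
    for _ in range(steps):
        state = rk4_step(state, dt, *rates)
    return state


# Illustrative values only (assumptions, not data from the slides).
rates = (1.0, 0.1, 0.5)          # k_SN, k_NA, k_AG
start = (10.0, 0.0, 0.0, 0.0)    # SLA(0), N(0), A(0), G(0)
end = simulate(start, rates, dt=0.01, steps=200)   # state at t = 2
```

Because the four right-hand sides sum to zero, the total amount SLA + N + A + G stays constant along the trajectory, which is a convenient sanity check for any implementation of this model.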

More …

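[Network diagram: SLA --kSN--> N --kNA--> A --kAG--> G]
[Network diagram: nodes SLA, N, G, A; rate constants kSN, kNA, kGA]
[Network diagram: SLA --kSN--> N --kNA--> A --kAG--> G, plus an edge with rate constant kSA]
[Network diagram: SLA --kSN--> N --kNA--> A --kAG--> G, plus a loop with rate constant kSS]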

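Strategy

Laplace domain:
System of differential equations --(Laplace trans.)--> expression in s, e.g. \( \frac{k_1 f_d(s)}{s + k_2} \)
Observed data --(fit as a sum of exponentials)--(Laplace trans.)--> \( \frac{a_1}{s + m_1} + \frac{a_2}{s + m_2} + \cdots \)
Matching of the two expressions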

Solution:

Method

① The kinetics for describing biological phenomena are expressed by a system of differential equations

② The observed data are numerically fitted as a sum of exponentials

③ Both the system of differential equations and the sum of exponentials are transformed into the corresponding system of algebraic equations by Laplace transformation

④ The two systems of algebraic equations are compared according to a consistency measure

Consistency estimation (1)

Change the system of differential equations into algebraic equations by Laplace transformation

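Ex.) The system of differential equations

\[
\begin{aligned}
\frac{d}{dt}\,\mathrm{SLA}(t) &= -k_{SN}\,\mathrm{SLA}(t) \\
\frac{d}{dt}\,N(t) &= k_{SN}\,\mathrm{SLA}(t) - k_{NA}\,N(t) \\
\frac{d}{dt}\,A(t) &= k_{NA}\,N(t) - k_{AG}\,A(t) \\
\frac{d}{dt}\,G(t) &= k_{AG}\,A(t)
\end{aligned}
\]

Into polynomials over the Laplace domain:

\[
\begin{aligned}
s\,L[\mathrm{SLA}(t)] - \mathrm{SLA}(0) &= -k_{SN}\,L[\mathrm{SLA}(t)] \\
s\,L[N(t)] - N(0) &= k_{SN}\,L[\mathrm{SLA}(t)] - k_{NA}\,L[N(t)] \\
s\,L[A(t)] - A(0) &= k_{NA}\,L[N(t)] - k_{AG}\,L[A(t)] \\
s\,L[G(t)] - G(0) &= k_{AG}\,L[A(t)]
\end{aligned}
\]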

The solution over Laplace domain is:
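For the chain SLA to N to A to G, each transformed equation involves only its own transform and the preceding one, so the system can be solved one variable at a time. As a worked sketch, with the initial values SLA(0), N(0), A(0), G(0) kept symbolic (the slide's own formula is not reproduced in this text):

```latex
\begin{align*}
L[\mathrm{SLA}(t)] &= \frac{\mathrm{SLA}(0)}{s + k_{SN}}, \\
L[N(t)] &= \frac{N(0) + k_{SN}\,L[\mathrm{SLA}(t)]}{s + k_{NA}}, \\
L[A(t)] &= \frac{A(0) + k_{NA}\,L[N(t)]}{s + k_{AG}}, \\
L[G(t)] &= \frac{G(0) + k_{AG}\,L[A(t)]}{s}.
\end{align*}
```

Substituting each line into the next gives every transform as a rational function of s whose coefficients are polynomials in the rate constants.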

Consistency estimation (2)

Change the observed data into algebraic equations by Laplace transformation

The observed data are fitted as a sum of exponentials:

\[ \sum_{i=1}^{k} \beta_i \exp(-\alpha_i t) \]

k is the number of exponentials, which can theoretically be determined.

Into functions in s over the Laplace domain:

\[ \sum_{i=1}^{k} \frac{\beta_i}{s + \alpha_i} \]
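The correspondence between a fitted sum of exponentials and its rational function in s can be checked numerically. The sketch below is illustrative only: the (alpha, beta) pairs are hypothetical, and the trapezoidal rule is just one simple way to approximate the Laplace integral.

```python
import math


def exp_sum(t, terms):
    """Time-domain fit: sum of beta_i * exp(-alpha_i * t)."""
    return sum(beta * math.exp(-alpha * t) for alpha, beta in terms)


def exp_sum_laplace(s, terms):
    """Its Laplace transform: sum of beta_i / (s + alpha_i)."""
    return sum(beta / (s + alpha) for alpha, beta in terms)


def laplace_numeric(f, s, t_max=60.0, steps=60000):
    """Trapezoidal approximation of the integral of f(t) * exp(-s * t) on [0, t_max]."""
    h = t_max / steps
    total = 0.5 * (f(0.0) + f(t_max) * math.exp(-s * t_max))
    for i in range(1, steps):
        t = i * h
        total += f(t) * math.exp(-s * t)
    return total * h


terms = [(0.1, 2.0), (1.0, -0.5)]   # hypothetical (alpha_i, beta_i) pairs
s = 1.5
closed_form = exp_sum_laplace(s, terms)
numeric = laplace_numeric(lambda t: exp_sum(t, terms), s)
```

The two values agree to within the quadrature error, which is why the fitted exponents and coefficients can be carried directly into the Laplace domain without any further numerical integration.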

Previous study: Identifiability problem

Cobelli, C., Foster, D. and Toffolo, G.: Tracer Kinetics in Biomedical Research: From Data to Model, Kluwer Academic/Plenum Publishers, 2000.

\( \frac{dy}{dt} = -k_2 \cdot y \): k2 = m, uniquely determined … globally identifiable

\( \frac{dy}{dt} = -(k_2 + k_3)\, y \): k2 + k3 = m … unidentifiable

On the assumption of error-free data … an unrealistic situation

Both data are fitted as: a exp(-m t)
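The difference between the two cases can be made concrete with a small sketch. It assumes a hypothetical fitted exponent m = 0.8 and unit initial value; none of these numbers come from the slides.

```python
import math


def one_rate(t, y0, k2):
    """Solution of dy/dt = -k2 * y."""
    return y0 * math.exp(-k2 * t)


def two_rates(t, y0, k2, k3):
    """Solution of dy/dt = -(k2 + k3) * y."""
    return y0 * math.exp(-(k2 + k3) * t)


times = [0.0, 0.5, 1.0, 2.0]
m = 0.8  # hypothetical fitted exponent

# First model: the fit fixes k2 = m, so the parameter is determined.
curve_one = [one_rate(t, 1.0, m) for t in times]

# Second model: different splits of m between k2 and k3 give the same curve,
# so the individual parameters cannot be recovered from the data.
curve_a = [two_rates(t, 1.0, 0.3, 0.5) for t in times]
curve_b = [two_rates(t, 1.0, 0.7, 0.1) for t in times]
```

All three curves coincide up to rounding, which illustrates why only the sum k2 + k3 is determined in the second model.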

Our model: we handle noisy data

We take the position that it is sufficient to fit noisy observed data as a sum of exponentials:

\[ a_1 e^{-m_1 t} + a_2 e^{-m_2 t} + a_3 e^{-m_3 t} + \cdots \]

Example: the observed data into the Laplace domain

Consistency estimation procedure (3)

Comparison of coefficient List in s

Derive the coefficient List by comparing the algebraic equations of the model and the observed data over Laplace domain:

Ex.) Coefficient List

Without any error, these polynomials would all be zero; with real, noisy observed data they are not. => Least squares method (LSM)

Consistency measure:

The smallest sum of squares of the elements in the Coefficient List, under

k1 > 0, k2 > 0, …, kn > 0 (1)
or
k1 >= 0, k2 >= 0, …, kn >= 0 (2)

If e.g. k1 = 0 … subnetwork

SLA --k1--> N --k2--> A --k3--> G

Coefficient List:

\[ l_1(k) = r_1,\ \ldots,\ l_n(k) = r_n \]

The consistency measure can be calculated as the smallest value of f(k) over the admissible k => Least squares method:

\[ f(k) = \sum_{i=1}^{n} \bigl( l_i(k) - r_i \bigr)^2 \]
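To make the form of f(k) concrete, here is a toy sketch for a two-step chain SLA to N with N(0) = 0. It assumes one simple way of building a coefficient list, namely equating the coefficients of the model's and the data's rational functions after making both denominators monic, and it uses a plain grid search as a stand-in for solving the stationarity conditions algebraically. The function names and numbers are hypothetical.

```python
from itertools import product


def coefficient_residuals(k_sn, k_na, x0, alphas, betas):
    """Return l_i(k) - r_i for the toy chain SLA -> N with N(0) = 0.

    Model:  L[N] = k_sn * x0 / (s^2 + (k_sn + k_na) * s + k_sn * k_na)
    Data:   L[N] = b1 / (s + a1) + b2 / (s + a2)
    """
    a1, a2 = alphas
    b1, b2 = betas
    return [
        (k_sn + k_na) - (a1 + a2),        # coefficient of s in the denominator
        k_sn * k_na - a1 * a2,            # constant term of the denominator
        k_sn * x0 - (b1 * a2 + b2 * a1),  # constant term of the numerator
    ]


def f(k_sn, k_na, x0, alphas, betas):
    """Sum of squared residuals of the coefficient list."""
    return sum(r * r for r in coefficient_residuals(k_sn, k_na, x0, alphas, betas))


def grid_minimum(x0, alphas, betas, step=0.05, count=40):
    """Smallest f over a grid of strictly positive rate constants."""
    grid = [step * i for i in range(1, count + 1)]
    return min(
        (f(k_sn, k_na, x0, alphas, betas), k_sn, k_na)
        for k_sn, k_na in product(grid, repeat=2)
    )


# Hypothetical noise-free fit generated from k_sn = 1, k_na = 0.1, SLA(0) = 10.
best_value, best_k_sn, best_k_na = grid_minimum(
    10.0, alphas=(0.1, 1.0), betas=(100 / 9, -100 / 9)
)
```

With noise-free inputs the minimum is essentially zero at the generating parameters; with noisy fits the residuals no longer vanish and the minimum of f serves as the measure.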

Concrete procedure

Measures (1) and (2)

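Under ki > 0 … Measure (1)

\[ \frac{\partial f}{\partial k_1} = 0 \wedge \cdots \wedge \frac{\partial f}{\partial k_n} = 0 \]

Under ki >= 0 … Measure (2), including subnetworks

\[ \frac{\partial f}{\partial k_1} = 0 \wedge \cdots \wedge \frac{\partial f}{\partial k_n} = 0 \]

Algorithm to compute the measure (2) (recursive procedure)

MinimizePositive … computes the measure (1)

MinimizePositive(f(k1, …, 0, …, kn)) … applied again with one ki set to 0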
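The recursion can be sketched as follows. This is a toy under stated assumptions, not the authors' implementation: the objective is the separable function f(k) = sum((k_i - c_i)^2), chosen because its stationary point is known in closed form, and `minimize_positive` is a hypothetical stand-in for MinimizePositive.

```python
import math


def minimize_positive(targets, free):
    """Stand-in for MinimizePositive on f(k) = sum((k_i - targets[i])**2).

    Coordinates not in `free` are fixed at 0. The only stationary point has
    k_i = targets[i] on the free coordinates; it is accepted only if every
    free k_i is strictly positive, otherwise no admissible point exists.
    """
    if any(targets[i] <= 0 for i in free):
        return math.inf
    return sum(t * t for i, t in enumerate(targets) if i not in free)


def minimize_nonnegative(targets, free=None):
    """Measure (2): best value over the full network and all its subnetworks."""
    if free is None:
        free = frozenset(range(len(targets)))
    best = minimize_positive(targets, free)
    for i in free:
        # Setting k_i = 0 removes that edge, i.e. moves to a subnetwork.
        best = min(best, minimize_nonnegative(targets, free - {i}))
    return best


targets = (1.0, -2.0, 3.0)                       # hypothetical toy data
measure_1 = minimize_positive(targets, frozenset(range(3)))
measure_2 = minimize_nonnegative(targets)
```

In this toy, no stationary point has all parameters positive, so measure (1) is reported as infinity, while measure (2) finds the subnetwork with the second parameter set to zero. The plain recursion revisits the same subnetwork several times; caching results per set of free parameters would avoid that for larger n.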

Result

We used the following five models

Data generation for simulation

We have generated the time series of data for the constituent molecules for the simulation study, before the model consistency estimation.

The given and estimated parameters are as follows (given, estimated):

SLA: α_{SLA,1}: 1, 1.00; β_{SLA,1}: 10, 10.0
N: α_{N,1}: 1/10, 0.100; α_{N,2}: 1, 1.00; β_{N,1}: 163/9, 18.1; β_{N,2}: 100/9, 11.1
A: α_{A,1}: 1/10, 0.100; α_{A,2}: 1/2, 0.500; α_{A,3}: 1, 1.00; β_{A,1}: 163/36, 4.53; β_{A,2}: 15/4, 3.75; β_{A,3}: 20/9, 2.22
G: α_{G,1}: 1/10, 0.100; α_{G,2}: 1/2, 0.500; α_{G,3}: 1, 1.00; β_{G,1}: 815/36, 22.6; β_{G,2}: 15/4, 3.75; β_{G,3}: 10/9, 1.11; β_{G,4}: 21, 21.0

Each figure corresponds to one of the four variables (molecules) in the model: (a) SLA, (b) N, (c) A, (d) G.

Table of Consistency measure

Columns: Measure (1); Measure (2), including subnetworks

The verification of our method, Measure 1

Model A: SLA --kSN--> N --kNA--> A --kAG--> G
Model E: Model A with an additional loop, rate constant kSS

When kSS is small, Model E is almost equal to Model A

The verification of our method, Measure 2

Model A: SLA --kSN--> N --kNA--> A --kAG--> G
Model C: Model A with additional edges, rate constants kSA and kNG

When kSA and kNG are exactly zero, Model C is equivalent to Model A

Query A

Summary

We have proposed a method to select the model that is most (or more) consistent with a time series of observed data. We have verified the method on generated data, including a cyclic relationship, which previous methods could not handle.

Future works … scalability

Focusing on a local network within a large-scale network.
Easy elimination of unnecessary variables by virtue of algebraic equations

[Diagram: compartments C1(t), C(t), C2(t) with rate constants k1, k2, kd]

Eliminating C(t),

\[ k_2\, C_1(t) = k_1\, C_2(t) - k_1 k_d \int_0^t e^{-k_d (t - \tau)}\, C_2(\tau)\, d\tau \]

\[ -(s k_2 + k_2 k_d)\, L[C_1(t)](s) + s k_1\, L[C_2(t)](s) = 0 \]
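The step from the integral relation to the algebraic one can be sketched with the convolution theorem. The derivation below assumes the time-domain relation reads as in its first line (a reading of the slide, with C(t) already eliminated); the integral is a convolution of C2 with exp(-kd t), whose transform is 1/(s + kd).

```latex
\begin{align*}
k_2\, C_1(t) &= k_1\, C_2(t) - k_1 k_d \int_0^t e^{-k_d (t-\tau)}\, C_2(\tau)\, d\tau \\
k_2\, L[C_1(t)](s) &= k_1\, L[C_2(t)](s) - \frac{k_1 k_d}{s + k_d}\, L[C_2(t)](s)
  = \frac{s\, k_1}{s + k_d}\, L[C_2(t)](s) \\
0 &= -(s k_2 + k_2 k_d)\, L[C_1(t)](s) + s k_1\, L[C_2(t)](s)
\end{align*}
```

The last line follows from multiplying through by (s + kd); the eliminated variable leaves no trace other than the polynomial coefficients.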

International Conference on Algebraic Biology

2005 (1st): November 28-30, FUJITSU SOLUTION SQUARE, Tokyo, Japan; organized by H. Anai, H. Hong, K. Horimoto; from Universal Academy Press
2007 (2nd): July 2-4, RISC, Johannes Kepler University, Linz, Austria; organized by B. Buchberger, H. Hong, K. Horimoto; LNCS 4545 from Springer
2008 (3rd): July 31-August 2, RISC, Johannes Kepler University, Linz, Austria; organized by B. Buchberger, K. Horimoto, R. Laubenbacher, B. Mishra
2009 (4th): SAMSI, NC, USA; organized by R. Laubenbacher

http://www.risc.uni-linz.ac.at/about/conferences/ab2008/

Paper deadline for AB2008: January 14, 2008
