Page 1: New Advances in Prediction of Gene-Environment Networks by Applied Mathematics Tools

New Advances in Prediction of Gene-Environment Networks by Applied Mathematics Tools

Gerhard-Wilhelm Weber *
Institute of Applied Mathematics, METU, Ankara, Turkey

Ozlem Defterli
Department of Mathematics and Computer Science, Çankaya University, Ankara, Turkey

Armin Fügenschuh
Department for Optimization, Zuse Institute Berlin, Germany

* Faculty of Economics, Management Science and Law, University of Siegen, Germany;
Center for Research on Optimization and Control, University of Aveiro, Portugal;
Universiti Teknologi Malaysia, Skudai, Malaysia

METU

5th International Summer School
Achievements and Applications of Contemporary Informatics, Mathematics and Physics
National University of Technology of the Ukraine
Kiev, Ukraine, August 3-15, 2010

Page 2

Outline

• Motivation: Bio-Systems
• Computational Biology
• Modeling and Prediction of Gene Patterns
• Genetic Networks
• Gene-Environment Networks
• The Model Class
• The Time-Discretized Model
• Optimization Problem
• The Mixed-Integer Problem
• Numerical Example
• Conclusion

Page 3

Bio-Systems

sustainability

Page 4

Computational Biology

Prediction of gene patterns based on DNA microarray chip experiments

with

M.U. Akhmet, H. Öktem

S.W. Pickl, E. Quek Ming Poh

T. Ergenç, B. Karasözen

J. Gebert, N. Radde

Ö. Uğur, R. Wünschiers

M. Taştan, A. Tezel, P. Taylan

F.B. Yilmaz, B. Akteke-Öztürk

S. Özöğür, Z. Alparslan-Gök

A. Soyler, B. Soyler, M. Çetin

S. Özöğür-Akyüz, Ö. Defterli

N. Gökgöz

Page 5

DNA experiments

Array Preparation:
  Sequence Data (cDNA, Genome, Genbank, etc.)
  Selection or Design and Synthesis of the Probes
  Array Production

Sample Preparation:
  Test Material, Control Material
  mRNA Isolation
  cDNA Synthesis and Labeling
  Hybridization

Data Analysis:
  Laser Scan of the Array
  Picture Analysis

Page 6

Ex.: yeast data

GENE / time          0     9.5    11.5    13.5    15.5    18.5    20.5
'YHR007C'        0.224   0.367   0.312   0.014  -0.003  -1.357  -0.811
'YAL051W'        0.002   0.634    0.31   0.441   0.458  -0.136   0.275
'YAL054C'        -1.07   -0.51   -0.22  -0.012  -0.215   1.741   4.239
'YAL056W'         0.09   0.884   0.165   0.199   0.034   0.148   0.935
'PRS316'        -0.046   0.635   0.194   0.291   0.271   0.488   0.533
'KAN-MX'         0.162   0.159   0.609   0.481   0.447   1.541   1.449
'E. COLI #10'   -0.013    0.88  -0.009   0.144  -0.001    0.14   0.192
'E. COLI #33'   -0.405   0.853  -0.259  -0.124  -1.181   0.095   0.027

http://genome-www5.stanford.edu/

Page 7

Page 8

$E_0$: metabolic state of a cell at $t_0$ (:= gene expression pattern),

$i$th element of the vector $E_0$ := expression level of gene $i$,

$\mathbb{M}_k := I + h_k M(E_k)$; $E_k$ ($k \in \mathbb{N}_0$) is recursively defined as $E_{k+1} := \mathbb{M}_k E_k$.

Scheme of metabolic shift.

Gebert et al. (2006)
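
The recursion on this slide can be sketched in a few lines of Python. This is only an illustration, assuming NumPy is available; the names `predict` and `toy_M` and the 2-gene toy matrix are hypothetical and do not come from the slides.

```python
import numpy as np

def predict(M, E0, steps):
    """Iterate E_{k+1} = (I + h_k * M(E_k)) E_k.

    M     : callable mapping a state vector to an n x n matrix (assumed interface)
    E0    : initial expression pattern, shape (n,)
    steps : sequence of step sizes h_0, h_1, ...
    """
    E = np.asarray(E0, dtype=float)
    trajectory = [E]
    for h in steps:
        M_k = np.eye(E.size) + h * M(E)   # M_k := I + h_k M(E_k)
        E = M_k @ E                       # E_{k+1} := M_k E_k
        trajectory.append(E)
    return np.array(trajectory)

def toy_M(E):
    # Hypothetical 2-gene interaction matrix (constant in E), for illustration only.
    return np.array([[-0.5, 0.0],
                     [0.5, -0.5]])

traj = predict(toy_M, [1.0, 0.0], steps=[0.5, 0.5])
print(traj)
```

Passing `M` as a callable keeps the sketch usable for state-dependent matrices $M(E_k)$ as well as for constant ones.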

Page 9

Modeling & Prediction

Expression data:

$E = E(t) = (e_1(t), e_2(t), \ldots, e_n(t))^T$, with $E(t) \in \mathbb{R}^n$

(a) time-continuous:

$\dot{E} = M(E)\,E, \qquad E(0) = E_0$

$M(E) \in \mathbb{R}^{n \times n}$: matrix-valued function (metabolic reaction)

prediction, anticipation

least squares, max likelihood

statistical learning

Page 10

Modeling & Prediction

(b) time-discrete:

$E_{k+1} = \mathbb{M}_k E_k$

Ex.: Euler, Runge-Kutta

$E_{k+1} = (I + h_k M(E_k))\, E_k$

$E_{k+1} = \left( I + h_k M + \frac{h_k^2}{2} M^2 \right) E_k$

Ex.: $\mathbb{M}_k = (em_{ij})$

We analyze the influence of the $em$-parameters on the dynamics (expression-metabolic).
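
The Runge-Kutta example on this slide can be traced back to one step of the 2nd-order Heun method. The short derivation below is a sketch under the assumption that $M$ is a constant matrix, so that $\dot{E} = M E$; for a state-dependent $M(E)$ the stages would no longer collapse into a single matrix polynomial.

```latex
\begin{align*}
\tilde{E}_{k+1} &= E_k + h_k M E_k
  && \text{(Euler predictor)} \\
E_{k+1} &= E_k + \frac{h_k}{2}\left( M E_k + M \tilde{E}_{k+1} \right)
  && \text{(Heun corrector)} \\
        &= \left( I + h_k M + \frac{h_k^2}{2} M^2 \right) E_k .
\end{align*}
```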

Page 11

• For which parameters, i.e., for which set $\mathcal{M}$ (hence, dynamics), is stability guaranteed?

[Diagram labels: $\mathcal{M}$, stable, unstable, metabolic reaction, feasible, unfeasible]

Stability:

Def.: $\mathcal{M}$ is stable $:\Leftrightarrow$ $\exists\, B$: (complex) bounded neighbourhood of $0$, $\forall\, k \in \mathbb{N},\ \mathbb{M}_0, \mathbb{M}_1, \ldots, \mathbb{M}_{k-1} \in \mathcal{M}$:

$(\mathbb{M}_{k-1} \mathbb{M}_{k-2} \cdots \mathbb{M}_0)\, B \subseteq B.$
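
Reading the definition as a requirement that every finite product of matrices from the set maps a bounded neighbourhood $B$ of the origin into itself, a rough numerical probe is possible. The sketch below is an illustration only: it takes $B$ to be the Euclidean unit ball (an assumption) and tests products up to a fixed length, so a positive answer is evidence, not a proof of stability. The function name and the example matrices are hypothetical.

```python
import itertools
import numpy as np

def products_stay_bounded(matrices, max_len=6, bound=1.0):
    """Finite-horizon check: does every product of at most `max_len`
    matrices from `matrices` have spectral norm <= `bound`?"""
    n = matrices[0].shape[0]
    for length in range(1, max_len + 1):
        for combo in itertools.product(matrices, repeat=length):
            P = np.eye(n)
            for A in combo:
                P = A @ P                      # build the product M_{k-1} ... M_0
            if np.linalg.norm(P, 2) > bound + 1e-12:
                return False                   # the unit ball is not mapped into itself
    return True

theta = 0.3
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
contracting = [0.9 * rotation, np.diag([0.5, 0.8])]   # hypothetical stable set
expanding = [np.diag([2.0, 0.1])]                     # hypothetical unstable set

print(products_stay_bounded(contracting))
print(products_stay_bounded(expanding))
```

The enumeration grows exponentially with `max_len`, so this is only practical for small sets and short products.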

Page 12

Genetic Network

Ex.: $\dot{E} = M E$, $h = 1$

$$
\begin{pmatrix}
E_1(t_0) & E_2(t_0) & E_3(t_0) & E_4(t_0) \\
E_1(t_1) & E_2(t_1) & E_3(t_1) & E_4(t_1) \\
E_1(t_2) & E_2(t_2) & E_3(t_2) & E_4(t_2) \\
E_1(t_3) & E_2(t_3) & E_3(t_3) & E_4(t_3)
\end{pmatrix}
=
\begin{pmatrix}
255 & 250 & 0 & 255 \\
255 & 200 & 50 & 0 \\
255 & 180 & 70 & 255 \\
255 & 170 & 80 & 0
\end{pmatrix},
\qquad
M =
\begin{pmatrix}
0 & 0 & 0 & 0 \\
0.4 & -0.61 & 0 & 0 \\
0 & 0.2 & -0.39 & 0 \\
1 & 0 & 0 & -2
\end{pmatrix}
$$

[Figure: expression level Ė plotted against time t; plotted items labelled Ė 0 to Ė 5]
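
The example on this slide can be checked numerically. The sketch below is a minimal illustration, assuming that the matrix entries 0.4, 0.61, 0.2, 0.39, 1 and 2 are arranged as shown in the code and that the diagonal entries are negative; the signs are an assumption, chosen because one Euler step with $h = 1$ then approximately reproduces each data row from the previous one.

```python
import numpy as np

# Assumed arrangement of the example matrix (negative diagonal is an assumption).
M = np.array([[0.0,  0.0,   0.0,   0.0],
              [0.4, -0.61,  0.0,   0.0],
              [0.0,  0.2,  -0.39,  0.0],
              [1.0,  0.0,   0.0,  -2.0]])

# Expression data: rows are time points t0..t3, columns are genes 1..4.
data = np.array([[255, 250,  0, 255],
                 [255, 200, 50,   0],
                 [255, 180, 70, 255],
                 [255, 170, 80,   0]], dtype=float)

h = 1.0
step = np.eye(4) + h * M           # Euler iteration matrix I + h M

E = data[0].copy()
predicted = [E]
for _ in range(3):
    E = step @ E                   # E_{k+1} = (I + h M) E_k
    predicted.append(E)
predicted = np.array(predicted)

max_error = np.abs(predicted - data).max()
print(np.round(predicted, 1))
print("largest deviation from the data:", round(max_error, 2))
```

Under these assumptions the first column stays constant and the fourth alternates between 255 and 0, while the second and third columns follow the data to within a few units.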

Page 13

Genetic Network

[Network diagram: nodes gene1, gene2, gene3, gene4; edge labels 0.4 x1, 0.2 x2, 1 x1]

Page 14

Gene-Environment Networks

Page 15

Gene-Environment Networks

$$
\chi_{ij} :=
\begin{cases}
1, & \text{if gene } j \text{ regulates gene } i \\
0, & \text{otherwise}
\end{cases}
$$

,i i
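
The 0/1 indicator defined on this slide can be read off any matrix of interaction weights. The sketch below is a minimal illustration with a hypothetical 3-gene weight matrix `W`; the variable names (`chi`, `indegree`, `outdegree`) are chosen here and are not taken from the slides.

```python
import numpy as np

# Hypothetical interaction weights: entry (i, j) is the effect of gene j on gene i.
W = np.array([[0.0, 0.0, 0.0],
              [0.4, 0.0, 0.0],
              [0.0, 0.2, 0.0]])

chi = (W != 0).astype(int)    # chi[i, j] = 1 if gene j regulates gene i, else 0

indegree = chi.sum(axis=1)    # number of regulators of each gene i
outdegree = chi.sum(axis=0)   # number of genes regulated by each gene j

print(chi)
print(indegree, outdegree)
```

Row sums and column sums of the indicator give the indegree and outdegree of each node in the network.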

Page 16

The Model Class

d-vector of positive concentration levels of proteins and of certain levels of environmental factors

continuous change in the gene-expression data in time

is the first time-autonomous form introduced, where

initial values of the gene-expression levels

: experimental data vectors obtained from microarray experiments and environmental measurements, at the sample times

: the gene-expression level (concentration rate) of the i-th gene at time t

denotes any one of the first n coordinates in the d-vector of genetic and environmental states.

is the set of genes.

Weber et al. (2008c), Chen et al. (1999), Gebert et al. (2004a), Gebert et al. (2006), Gebert et al. (2007), Tastan (2005), Yilmaz (2004), Yilmaz et al. (2005), Sakamoto and Iba (2001), Tastan et al. (2005)

Page 17

The Model Class

(i) is an n×n constant matrix

is an n×1 vector of gene-expression levels

(ii) represents the dynamical system of the n genes and their interaction alone.

: n×n matrix whose entries are polynomial, exponential, trigonometric, spline or wavelet functions containing some parameters to be optimized.

(iii)

environmental effects

n genes, m environmental effects

are an (n+m)-vector and an (n+m)×(n+m) matrix, respectively.

(*)

Weber et al. (2008c), Tastan (2005), Tastan et al. (2006), Ugur et al. (2009), Tastan et al. (2005), Yilmaz (2004), Yilmaz et al. (2005), Weber et al. (2008b), Weber et al. (2009b)
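
Case (iii) combines n genes and m environmental effects into one (n+m)-dimensional state with an (n+m)×(n+m) matrix. The slide's own formula is not recoverable from the text, so the sketch below is only one plausible layout: an n×n gene block, an n×m environment block, and zero bottom rows, which amounts to assuming that the environmental levels do not change in time. All names (`extend`, `M`, `C`, `E_env`) are hypothetical.

```python
import numpy as np

def extend(M, C, E, E_env):
    """Stack genes and environment into one (n+m)-dimensional system.

    M     : n x n gene-gene matrix (assumed name)
    C     : n x m environment-gene matrix (assumed name)
    E     : n gene-expression levels
    E_env : m environmental levels
    """
    n, m = C.shape
    big_M = np.zeros((n + m, n + m))
    big_M[:n, :n] = M
    big_M[:n, n:] = C
    # Bottom m rows stay zero: environmental levels are assumed constant in time.
    big_E = np.concatenate([E, E_env])
    return big_M, big_E

M = np.array([[-0.5, 0.0],
              [0.4, -0.3]])
C = np.array([[0.1],
              [0.0]])
E = np.array([2.0, 1.0])
E_env = np.array([3.0])

big_M, big_E = extend(M, C, E, E_env)
rate = big_M @ big_E      # first n entries: M E + C E_env; last m entries: 0
print(big_M)
print(rate)
```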

Page 18

The Model Class

• In general, in the d-dimensional extended space

with

: d×d matrix

: d×1 vectors

Ugur and Weber (2007),

Weber et al. (2008c),

Weber et al. (2008b),

Weber et al. (2009b)

Page 19

The Time-Discretized Model

- Euler’s method,

- Runge-Kutta methods, e.g., 2nd-order Heun's method

3rd-order Heun's method is newly introduced: Defterli et al. (2009)

we rewrite as

where

Ergenc and Weber (2004), Tastan (2005), Tastan et al. (2006), Tastan et al. (2005)
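
One step of a 3rd-order Heun scheme for $\dot{E} = M(E)\,E$ can be sketched as follows. The stage formulas are not visible in the slide text, so this sketch uses the standard Butcher tableau of Heun's third-order method (nodes 0, 1/3, 2/3 and weights 1/4, 0, 3/4); the matrix form used by Defterli et al. (2009) may be written differently. The function name and the test matrix `A` are hypothetical.

```python
import numpy as np

def heun3_step(M, E, h):
    """One step of Heun's third-order method for dE/dt = M(E) E.

    M : callable mapping a state vector to an n x n matrix (assumed interface)
    """
    def f(x):
        return M(x) @ x

    k1 = f(E)
    k2 = f(E + (h / 3.0) * k1)
    k3 = f(E + (2.0 * h / 3.0) * k2)
    return E + (h / 4.0) * (k1 + 3.0 * k3)

# For a constant matrix A the step reduces to the cubic Taylor polynomial of exp(hA).
A = np.array([[-1.0, 0.0],
              [0.5, -0.2]])
E = np.array([1.0, 2.0])
h = 0.1

E_next = heun3_step(lambda x: A, E, h)
taylor = (np.eye(2) + h * A + h**2 / 2 * A @ A + h**3 / 6 * A @ A @ A) @ E
print(E_next, taylor)
```

The comparison with the Taylor polynomial is a quick consistency check: for constant $A$ the two vectors should agree to rounding error.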

Page 20

The Time-Discretized Model

in the extended space denotes the DNA microarray experimental data and the data of environmental items obtained at the time-level

the approximations obtained by the iterative formula above

initial values

k-th approximation or prediction is calculated as:

(**)

Page 21

Matrix Algebra:

are n×n and n×m matrices, respectively

(n+m)×(n+m) matrix

are (n+m)-vectors

Applying the 3rd-order Heun’s method to Eq. (*) gives the iterative formula (**), where

Page 22

Final canonical block form of : .

Page 23

Optimization Problem

mixed-integer least-squares optimization problem:

subject to

Boolean variables

: the numbers of genes regulated by a given gene (its outdegree), by a given environmental item, or by the cumulative environment, respectively.

Ugur and Weber (2007), Weber et al. (2008c), Weber et al. (2008b), Weber et al. (2009b), Gebert et al. (2004a), Gebert et al. (2006), Gebert et al. (2007).

Page 24

The Mixed-Integer Problem

Genetic regulation network

: n×n constant matrix with entries representing the effect which the expression level of gene j has on the change of expression of gene i

mixed-integer nonlinear optimization problem (MINLP):

subject to

: constant vector representing the lower bounds for the decrease of the transcript concentration.

In order to bound the indegree of each node, introduce binary variables:

is a given parameter.
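
The effect of an indegree bound can be illustrated on a toy problem. The sketch below is not the MINLP of the slides: it assumes a linear discrete model $E_{k+1} \approx A E_k$, ignores the lower-bound constraints, and replaces branch-and-cut by exhaustive enumeration of the allowed regulator sets followed by ordinary least squares. The function name `fit_sparse_network` and the test matrix are hypothetical.

```python
import itertools
import numpy as np

def fit_sparse_network(E, max_indegree):
    """Fit E[k+1] ~ A @ E[k] row by row, allowing each gene at most
    `max_indegree` regulators. Exhaustive search, only viable for small n."""
    X, Y = E[:-1], E[1:]                       # states and their successors
    n = E.shape[1]
    A = np.zeros((n, n))
    for i in range(n):
        best_err, best_cols, best_coef = np.inf, [], np.array([])
        for size in range(max_indegree + 1):
            for support in itertools.combinations(range(n), size):
                cols = list(support)
                if cols:
                    coef, *_ = np.linalg.lstsq(X[:, cols], Y[:, i], rcond=None)
                    resid = Y[:, i] - X[:, cols] @ coef
                else:
                    coef, resid = np.array([]), Y[:, i]
                err = float(resid @ resid)
                if err < best_err - 1e-9:      # keep the smaller support on ties
                    best_err, best_cols, best_coef = err, cols, coef
        if best_cols:
            A[i, best_cols] = best_coef
    return A

# Synthetic trajectory from a hypothetical network with indegree at most 2.
A_true = np.array([[0.9, 0.0, 0.0],
                   [0.3, 0.5, 0.0],
                   [0.0, 0.4, 0.7]])
states = [np.array([1.0, 2.0, 3.0])]
for _ in range(8):
    states.append(A_true @ states[-1])
E = np.array(states)

A_fit = fit_sparse_network(E, max_indegree=2)
print(np.round(A_fit, 3))
```

Enumeration scales combinatorially in the number of genes, which is why a dedicated mixed-integer solver is used for realistic sizes.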

Page 25

Numerical Example:

Consider our MINLP for the data:

Gebert et al. (2004a)

Apply 3rd-order Heun method

Take

Using the modeling language Zimpl 3.0, we solve by SCIP 1.2 as a branch-and-cut framework, together with SoPlex 1.4.1 as our LP solver.

Page 26

Numerical Example:

Apply 3rd-order Heun’s time discretization:

Page 27

Results of Euler Method for all genes:

____ gene A

........ gene B

_ . _ . gene C

- - - - gene D

Page 28

Results of 3rd-order Heun Method for all genes:

____ gene A

........ gene B

_ . _ . gene C

- - - - gene D

Page 29

Page 30

Conclusion

The structural behavior of the obtained results is almost the same as that of the given data in Table 1 (constant first column, decreasing second column and increasing third column).

For the values presented in the last column of Table 2, instead of an alternating behavior, we obtain a much smoother behavior by using the 3rd-order Heun discretization scheme.

The time series generated above are convergent, and the stable values are reached after a few time steps.

This shows the speed of the new discretization technique.

Page 31

Conclusion

In this presentation, we contributed to an improved modeling of gene-environment networks, including their rarefication, which may be regarded as a regularization, and to the numerical solution of their dynamics.

In this way, we support a better prediction of how such networks can develop in time, with important consequences in health care, environmental protection, the financial sector and the field of education.

As future challenges, we will work on further improvements of the algorithms, on different kinds of rarefication and on combined methods, together with comparative studies.

Page 32

References

• [Achterberg (2007)] Achterberg T. Constraint integer programming. PhD Thesis. Technische Universität Berlin: Berlin; 2007.

• [Aster et al. (2004)] Aster R C, Borchers B, Thurber C. Parameter Estimation and Inverse Problems. Academic Press: San Diego; 2004.

• [Chen et al. (1999)] Chen T, He HL, Church GM. Modeling gene expression with differential equations. In: Proceedings of Pacific Symposium on Biocomputing 1999; 29-40.

• [Ergenc and Weber (2004)] Ergenc T, Weber G-W. Modeling and prediction of gene-expression patterns reconsidered with Runge-Kutta discretization. Special issue at the occasion of 70th birthday of Prof. Dr. Karl Roesner, TU Darmstadt. Journal of Computational Technologies 2004; 9; 6; 40-48.

• [Gebert et al. (2004a)] Gebert J, Latsch M, Pickl SW, Weber G-W, Wunschiers R. Genetic networks and anticipation of gene expression patterns. In: Computing Anticipatory Systems: CASYS'03 - Sixth International Conference. AIP Conference Proceedings. vol. 718. 2004; 474-485.

• [Hoon et al. (2003)] Hoon M D, Imoto S, Kobayashi K, Ogasawara N, Miyano S. Inferring gene regulatory networks from time-ordered gene expression data of Bacillus subtilis using differential equations. In: Proceedings of Pacific Symposium on Biocomputing. 2003; 17-28.

• [Pickl and Weber (2001)] Pickl S W, Weber G-W. Optimization of a time-discrete nonlinear dynamical system from a problem of ecology - an analytical and numerical approach. Journal of Computational Technologies 2001; 6; 1; 43-52.

• [Sakamoto and Iba (2001)] Sakamoto E, Iba H. Inferring a system of differential equations for a gene regulatory network by using genetic programming. In: Proc. Congress on Evolutionary Computation 2001; 720-726.

Page 33

• [Tastan (2005)] Tastan M. Analysis and Prediction of Gene Expression Patterns by Dynamical Systems, and by a Combinatorial Algorithm. MSc Thesis. Institute of Applied Mathematics, METU: Turkey; 2005.

• [Tastan et al. (2006)] Tastan M, Pickl SW, Weber G-W. Mathematical modeling and stability analysis of gene-expression patterns in an extended space and with Runge-Kutta discretization. In: Proceedings of Operations Research. Bremen; 2006. 443-450.

• [Wunderling (1996)] Wunderling R. Paralleler und objektorientierter Simplex-Algorithmus. PhD Thesis. Technical Report ZIB-TR 96-09. Technische Universität Berlin: Berlin; 1996.

• [Weber et al. (2008a)] Weber G-W, Alparslan Gok S Z, Dikmen N. Environmental and life sciences: Gene-environment networks - optimization, games and control - a survey on recent achievements. In: D. DeTombe (Guest Ed), special issue of Journal of Organizational Transformation and Social Change, vol. 5, no. 3. 2008; 197-233.

• [Weber et al. (2008b)] Weber G-W, Taylan P, Alparslan Gok S Z, Ozogur S, Akteke Ozturk B. Optimization of gene-environment networks in the presence of errors and uncertainty with Chebychev approximation. TOP 2008; 16; 2; 284-318.

• [Weber et al. (2009a)] Weber G-W, Alparslan-Gok S Z, Soyler B. A new mathematical approach in environmental and life sciences: gene-environment networks and their dynamics. Environmental Modeling & Assessment 2009; 14; 2; 267-288.

• [Weber and Ugur (2007)] Weber G-W, Ugur O. Optimizing gene-environment networks: generalized semi-infinite programming approach with intervals. In: Proceedings of International Symposium on Health Informatics and Bioinformatics Turkey '07, HIBIT, Antalya, Turkey, April 30 - May 2 (2007) http://hibit.ii.metu.edu.tr/07/index.html.

• [Yılmaz (2004)] Yılmaz FB. A mathematical modeling and approximation of gene expression patterns by linear and quadratic regulatory relations and analysis of gene networks. MSc Thesis. Institute of Applied Mathematics, METU: Turkey; 2004.