mathematical programming techniques applied to biology
play

Mathematical programming techniques applied to biology Fabien - PowerPoint PPT Presentation

Mathematical programming techniques applied to biology Fabien Tarissan 1 Leo Liberti 2 Camilo La Rota 3 1 ISC-PIF (Paris, France) 2 Ecole Polytechnique (Paris, France) 3 IXXI (Lyon, France) October 31, 2008 Context of work Pre-simulation tool


  1. Mathematical programming techniques applied to biology Fabien Tarissan 1 Leo Liberti 2 Camilo La Rota 3 1 ISC-PIF (Paris, France) 2 ´ Ecole Polytechnique (Paris, France) 3 IXXI (Lyon, France) October 31, 2008

  2. Context of work Pre-simulation tool for the Morphex european project: Heterogeity at many levels: ◮ organisms ◮ data ◮ reliability ◮ level of details ◮ . . .

  3. Network reconstruction

  4. Network reconstruction

  5. Network reconstruction

  6. Network reconstruction Our approach: ◮ Modelisation by means of mathematical programming techniques (constraints) ◮ Reformulation of the models in order to ease the solving Contributions : ◮ Reconstruction of gene regulatory networks: ◮ with continuous dynamics (drosophila) ◮ with discrete dynamics (arabidopsis)

  7. Mathematical Programming � min x f ( x ) subject to g ( x ) ≤ 0 ◮ x ∈ R n are the decision variables ◮ f : R n → R is the objective function ◮ g : R n → R m is the set of constraints + distinction between integer and continuous variables. Let Z ∈ { 1 , . . . , n } such that ∀ i ∈ Z , x i ∈ Z .

  8. Classes of problems � min x f ( x ) subject to g ( x ) ≤ 0 AMPL: A Mathematical Programming Language. Class f , g Z Best solver Best free solver Complexity Θ(10 6 ) LP linear Z = ∅ CPLEX CLP Θ(10 4 ) cNLP convex Z = ∅ SNOPT/FILTER IPOPT Θ(10 3 ) MILP linear Z � = ∅ CPLEX BCP/SYMPHONY Θ(10 2 ) NLP non linear Z = ∅ BARON ? Θ(10 3 ) cMINLP convex Z � = ∅ MINLP bb/FILMINT BONMIN/FILMINT Θ(10 2 ) MINLP non linear Z � = ∅ BARON ?

  9. Application to the drosophila model Continuous regulation of gene products concentrations: dg ia ( t ) = R a Φ( u ia ( t )) − λ a g ia ( t )+ D a ( g i +1 , a ( t ) − 2 g ia ( t )+ g i − 1 , a ( t )) dt ◮ g ia ( t ) is the concentration of gene a in nucleus i at time t ◮ R a is the production rate for gene a ◮ Φ is the sigmoid regulation function ◮ λ a is the decay rate ◮ D a is the diffusion coefficient for gene a

  10. Regulation term The sigmoid definition: Φ( u ) = 1 � u � √ + 1 u 2 + 1 2 Relies on: � W ba g ib ( t ) + m a g bcd u ia ( t ) = + h a i b ∈ N γ ◮ W ba is the weight on the arc ( b , a ) in the GRN ◮ m a is the regulatory influence of the maternal gene bcd ◮ h a is the activation threshold for Φ

  11. The problem Size of the problem: ◮ Network of 6 genes ◮ but missing values for W , R , D , m , λ , h : 66 variables. Confronting the estimation to the observed data: ( t )) 2 + Π R + Π λ + Π D + Π u � � ( g ia ( t ) − g data min ia i ∈ N ι t Penalty function: Π u = e Θ − 1 ) 2 + ( m a v max bcd ) 2 + h 2 � ( W ba v max Θ = Λ( a ) b ( b , a ) ∈ A

  12. Modelling in AMPL 1. Translating the model into AMPL: ◮ Objective function: i ( t )) 2 + max ) 2 + max ) 2 + h 2 X (g a a X (W a b v b X (( m a v bcd i ( t ) − g data min a ) a ∈ N γ a ∈ N γ a ∈ N γ i ∈ N ι b ∈ N γ t ∈ T data ◮ Some penalty functions as constraints: 8 R L ≤ R a ≤ R U > < λ L ≤ λ a ≤ λ U ∀ a ∈ N γ D L ≤ D a ≤ D U > : ◮ PDE as a constraint (discretization): 0 1 u a R a i ( t ) g a i ( t ) − g a + 1) − λ a g a i ( t ) + D a (g a i +1 ( t ) − 2g a i ( t ) + g a i ( t − 1) = ∆ t ( i − 1 ( t )) B C @ 2 q i ( t ) 2 + 1 A u a 2. Other issues: ◮ Mitosis time ◮ Modelling cell division ◮ Updating diffusion coefficient ◮ . . .

  13. Simplifying the model ◮ Driven by biological knowledge: (e.g. boundaries on W , m and h ) ◮ Mathematical reformulating of terms: ◮ exact reformulation: e.g. for u √ u 2 +1 ⇒ z 2 ( u 2 + 1) = 1 = ⇒ ( zu ) 2 + z 2 = 1 1 √ 1. z = u 2 +1 = 2. Let u ′ , u ′′ and z ′ be respectively the uz , u ′ 2 and z 2 . u 2 +1 with u ′ and add constraints: u √ 3. Substitute u ′ = uz 8 > u ′′ + z ′ = 1 > < z ′ = z 2 > u ′′ = u ′ 2 > : ◮ approximative reformulation of z 2

  14. Work achieved so far What is done: 1. the raw model (without any reformulation) 2. various reformulations: ◮ sigmoid (exact): too many variables. ◮ sigmoid (approx): ok. ◮ convex products (approx): ok but feasability issues. 3. run on small data set: good results What will be done: ◮ run on large data set: too heavy for now (need to split the model). ◮ trying other modellisations ( g ia ( t ) = g data ( t )?) ia

  15. Other case of study: Arabidopsis Same approach: ◮ Gene regulatory network ◮ Some knowledge of the network topology ◮ Don’t know the weight on edges Different dynamics: ◮ Descretization of the time � � n ◮ Qualitative activity of gene i : x t +1 α ij w ij x t = H � j − θ i i j =1 • θ i : threshold of activation. � � ( induced production ) • w ij : interaction strength . decay • α ij : Kind of the interaction ( repression = − 1 , activation = +1) Similar problem: Find w ij and θ i

  16. Modelling: defining the GRN Gene Regulatory Network (GRN): ( G , T , α, w , x , ι, θ ) ◮ Sets and Graph: ◮ Functions: α : A → { +1 , − 1 } arc sign ; V : vertexes (genes) w : A → R + arc weight ; A : arcs (interactions) x : V × T → { 0 , 1 } gene activation ; T : = { 1 , 2 , .. } ⊂ N ι : V → { 0 , 1 } initial configuration ; G = ( V , A ) θ : V → R threshold , ◮ Evolution rules x ( v , 1) = ι ( v ) � 1 if � α ( u , v ) w ( u , v ) x ( u , t − 1) ≥ θ ( v ) x ( v , t ) = u ∈ δ − ( v ) 0 otherwise , where δ − ( v ) = { u ∈ V | ( u , v ) ∈ A } for all v ∈ V .

  17. Modelling: defining the problem Given ◮ ( G , T , α ) ◮ S := { 1 .. Smax } : set of stages. ◮ U = { U s } s ∈ S ; U s ⊆ V : nodes of G s (induced subnetworks of G ). ◮ I = { ι s , u } s ∈ S , u ∈ U s ; ι s , u : V → { 0 , 1 } : initial conditions. ◮ Φ = { φ s , u } s ∈ S , u ∈ U s ; φ s , u : V → { 0 , 1 } : expression data. Find w , θ with the property that ∀ � ι s , ( G s , T , α, w ,� x s ,� ι s , θ ) satisfies the evolution rules and has fixed points that collectively minimize the total D H ( ρ, φ ). D H : hamming distance from model fixed points to data. x t for all t ′ > t . fixed points ( � ρ ) : If � x t = � x t − 1 = � ρ then � x t ′ = �

  18. Finding fixed points

  19. Finding fixed points

  20. Finding fixed points

  21. Finding fixed points

  22. Mathematical programming formulation ◮ Objective function X X X ( y s , t − 1 − y s , t ) | x s , u , t − ρ s , u | s ∈ S u ∈ U s t ∈ T \ 1 ◮ Fixed point conditions X | x t s , u − x t − 1 � U s � σ t 1 − y t X σ t s , u | ≤ ≤ s s r u ∈ U s r ≥ t X | x t s , u − x t − 1 σ t X y t σ t s , u | ≥ = 0 s s r u ∈ U s r ≥ t ◮ Evolution rules X α u , v w u , v x t − 1 θ v x t s , v − � V � (1 − x t ≥ s , v ) s , u u ∈ U s :( u , v ) ∈ A X α u , v w u , v x t − 1 ( θ v − ǫ )(1 − x t s , v ) + � V � x t ≤ s , u s , v u ∈ U s :( u , v ) ∈ A

  23. Conclusion on the modelling approach Static modelling of a dynamic system A framework for reconstructing regulatory networks: ◮ of different biological organisms ◮ with different dynamics Drawbacks: ◮ loose of efficiency ◮ might require to introduce new elements Perspectives: ◮ automatization of the reformulations ◮ study more complex qualitative models of GRN ◮ integrating different kind of knowledge (experimental, theoretical, . . . )

  24. Automatic (re)formulation For the modelling part: E.g. 4 “virtual” constraints to express the fixed point (should have been generated!) For the simplification part: Name Nonlinear feasible set Linear feasible set PowBin ( x 1 , x 2 ) ∈ { 0 , 1 } × R : x 2 = x n ( x 1 , x 2 ) ∈ { 0 , 1 } × R : x 2 = x 1 1 exact ( x , x n +1 ) ∈ { 0 , 1 } n × [0 , 1] : ( x , x n +1 ) ∈ { 0 , 1 } n × R : x n +1 = ProdBin x n +1 ≤ x i ∀ i ≤ n Q x i exact x n +1 ≥ 1 − n + P x i i ≤ n i ≤ n 2 ] 2 : ( x 1 , x 2 , x 3 ) ∈ { 0 , 1 } × [ x L 2 , x U x 3 ≤ x U 2 x 1 ProdBin- ( x 1 , x 2 , x 3 ) ∈ { 0 , 1 } × [ x L 2 , x U 2 ] × R : x 3 ≥ x L Cont 2 x 1 x 3 = x 1 x 2 exact x 3 ≤ x 2 + x L 2 (1 − x 1 ) x 3 ≥ x 2 − x U 2 (1 − x 1 ) Leads to Term Rewriting Systems (TRS) properties: ◮ termination ◮ confluence ◮ optimality?

Recommend


More recommend