The space of solutions of metabolic systems Andrea Pagnani pagnani@isi.it ISI Foundation Turin
Outlook • Metabolic modeling • Inferring the space of solutions by message-passing • Biological applications • Conclusions and perspectives
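Metabolic Network
[Figure: cell membrane enclosing metabolites A, B, C, linked by internal fluxes ν_1, ν_2, ν_3 and exchanging with the exterior through b_1, b_2, b_3]
• A, B, C: components/metabolites
• ν_1, ν_2, ν_3: internal fluxes
• b_1, b_2, b_3: external fluxes (uptake, secretion)
Balance equations:
$$\frac{dA}{dt} = -\nu_1 - \nu_2 + b_1, \qquad \frac{dB}{dt} = \nu_1 + \nu_3 - b_2, \qquad \frac{dC}{dt} = \nu_2 - \nu_3 - b_3$$
Matrix formalism:
$$\frac{d}{dt}\begin{pmatrix} A \\ B \\ C \end{pmatrix} = \begin{pmatrix} -1 & -1 & 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & -1 & 0 \\ 0 & 1 & -1 & 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} \nu_1 \\ \nu_2 \\ \nu_3 \\ b_1 \\ b_2 \\ b_3 \end{pmatrix}$$
$$\frac{d\vec{X}}{dt} = \hat{S} \cdot \vec{\nu} - \vec{b}$$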
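The balance equations of the toy network can be checked numerically. The sketch below is only an illustration: the function name `concentration_rates` and the flux values are hypothetical and do not come from the slides; the matrix is the one shown in the matrix formalism, with columns ordered (ν_1, ν_2, ν_3, b_1, b_2, b_3).

```python
# Stoichiometric matrix of the toy network (rows: A, B, C;
# columns: nu_1, nu_2, nu_3, b_1, b_2, b_3).
S = [
    [-1, -1,  0, 1,  0,  0],  # dA/dt
    [ 1,  0,  1, 0, -1,  0],  # dB/dt
    [ 0,  1, -1, 0,  0, -1],  # dC/dt
]

def concentration_rates(S, fluxes):
    """Return dX/dt as the matrix-vector product S . fluxes (plain lists)."""
    return [sum(s * f for s, f in zip(row, fluxes)) for row in S]

# Hypothetical flux values, chosen so that every metabolite is balanced.
balanced = [2, 1, 0, 3, 2, 1]
# Hypothetical flux values that violate the mass balance.
unbalanced = [1, 1, 1, 1, 1, 1]

print(concentration_rates(S, balanced))
print(concentration_rates(S, unbalanced))
```

A flux vector for which all three rates vanish is a steady-state solution of the toy network.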
Steady state metabolism
• Metabolic transients are fast: time constants of order milliseconds to seconds.
• Cell growth is slow: time constants of order hours to days.
Under these hypotheses the cell is in a (quasi-)steady state:
$$\frac{d\vec{X}}{dt} = 0 \quad\Longrightarrow\quad \hat{S} \cdot \vec{\nu} = \vec{b}$$
Constraining the space of fluxes
Fluxes must be non-negative and cannot exceed $\nu_i^{MAX}$: $0 \le \nu_i \le \nu_i^{MAX}$, $i \in (1, \dots, N)$
The steady-state mass balance equations, together with the inequalities on the fluxes, define a convex polytope if M < N (fewer metabolites than fluxes, as in metabolism).
• Each equation defines a hyperplane
• All equations together define the space of all feasible fluxes.
Outline of the computational strategy
The space of feasible solutions is a high-dimensional polytope. To measure the volume of the polytope we will:
• Discretize the space (Riemann integration)
• Transform the linear system into a constraint satisfaction problem (CSP) defined on a sparse topology
• Approximate the volume of the polytope with the number of solutions of the associated CSP
• Set up a message-passing strategy for solving the CSP
• As a byproduct, measure the single-flux distribution functions for all fluxes
Discretizing the problem
Integration à la Riemann: the volume $V_\Pi$ is proportional to the number of cubes intersecting the polytope
$$\Pi = \left\{ \hat{S} \cdot \vec{\nu} = \vec{b}, \quad 0 \le \nu_i \le \nu_i^{MAX}, \quad i \in (1, \dots, N) \right\}$$
The discretized polytope is described by the same equations, but now with variables $\nu_i \in \{0, 1, \dots, q_i^{max}\}$, where $q_i^{max}$ is the integer part of $q_{max} \times \nu_i^{max}$ and $q_{max}$ is the granularity of the approximation.
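On a network small enough to enumerate, the counting idea can be tried by brute force. This is a minimal sketch, not the method of the talk: `count_solutions` is a hypothetical helper, the matrix is the toy network of the Metabolic Network slide with the exchange fluxes treated as extra variables (so the right-hand side is zero), and every flux is assumed to share the same upper bound `q_max`.

```python
from itertools import product

# Toy network: columns are (nu_1, nu_2, nu_3, b_1, b_2, b_3).
S = [
    [-1, -1,  0, 1,  0,  0],
    [ 1,  0,  1, 0, -1,  0],
    [ 0,  1, -1, 0,  0, -1],
]

def count_solutions(S, rhs, q_max):
    """Count integer vectors in {0..q_max}^N that satisfy S . nu == rhs."""
    n = len(S[0])
    count = 0
    for nu in product(range(q_max + 1), repeat=n):
        if all(sum(s * x for s, x in zip(row, nu)) == r
               for row, r in zip(S, rhs)):
            count += 1
    return count

# The number of lattice solutions grows as the grid gets finer.
for q in (1, 2, 3):
    print(q, count_solutions(S, [0, 0, 0], q))
```

The enumeration costs (q_max + 1)^N evaluations, which is why it is only usable on toy examples and a message-passing strategy is needed for real networks.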
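The constraint satisfaction problem
• M function nodes
• N variable nodes (fluxes)
[Figure: function node b_2 connected to the variable nodes ν_1, ν_3, ν_5, ν_8, ν_32]
Example of a balance equation:
$$2\nu_3 + \nu_5 - \nu_1 - 2\nu_8 - \nu_{32} = b_2$$
The function node imposes a hard constraint:
$$\delta\left(2\nu_3 + \nu_5 - \nu_1 - 2\nu_8 - \nu_{32} \,;\, b_2\right)$$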
Belief propagation
[Figure: factor graph with metabolite (function) nodes a_1 … a_8 and flux (variable) nodes x_1 … x_8; messages μ and m travel along the edges]
Bethe approximation (exact on trees and on large locally tree-like structures):
$$P(\{\nu\}) = \prod_{a \in A} P_a\left(\{\nu_l\}_{l \in a}\right) \prod_{i \in I} P_i(\nu_i)^{1 - d_i}$$
• $\mu_{i \to a}(\nu)$: the probability that flux i takes value ν in the absence of the balance constraint a.
• $m_{a \to i}(\nu)$: the non-normalized probability that the balance constraint a is fulfilled given that flux i takes value ν.
$$m_{a \to i}(\nu_i) = \sum_{\{\nu_l\},\, l \in a \setminus i} \delta\left(\sum_{l \in a} s_{a,l}\, \nu_l \,;\, b_a\right) \prod_{l \in a \setminus i} \mu_{l \to a}(\nu_l)$$
$$\mu_{i \to a}(\nu_i) = C_{i \to a} \prod_{b \in i \setminus a} m_{b \to i}(\nu_i)$$
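The two message updates can be tried on a tiny tree-shaped example, where belief propagation is exact and can be compared with brute-force enumeration. This is a sketch under stated assumptions: the function names, the example network `TREE` (constraints ν_0 = ν_1 + ν_2 and ν_1 = ν_3), the fixed number of sweeps and the uniform bound `q_max` are all hypothetical choices for illustration, not the implementation used in the talk.

```python
from itertools import product

def bp_marginals(factors, n_vars, q_max, sweeps=10):
    """Sum-product for hard constraints sum_l s[l] * nu[l] == rhs.

    factors: list of (variables, coefficients, rhs) tuples.
    Returns one normalized marginal (q_max + 1 floats) per variable.
    """
    vals = range(q_max + 1)
    edges = [(a, i) for a, (vs, _, _) in enumerate(factors) for i in vs]
    # mu: flux -> constraint messages, m: constraint -> flux messages
    mu = {(i, a): [1.0 / (q_max + 1)] * (q_max + 1) for a, i in edges}
    m = {(a, i): [1.0] * (q_max + 1) for a, i in edges}
    for _ in range(sweeps):
        for a, (vs, cs, rhs) in enumerate(factors):
            for i, c_i in zip(vs, cs):
                others = [(l, c) for l, c in zip(vs, cs) if l != i]
                msg = [0.0] * (q_max + 1)
                for x_i in vals:
                    for xs in product(vals, repeat=len(others)):
                        total = c_i * x_i + sum(
                            c * x for (_, c), x in zip(others, xs))
                        if total == rhs:  # delta function of the constraint
                            weight = 1.0
                            for (l, _), x in zip(others, xs):
                                weight *= mu[(l, a)][x]
                            msg[x_i] += weight
                m[(a, i)] = msg
        for a, i in edges:
            prod_in = [1.0] * (q_max + 1)
            for b, j in edges:
                if j == i and b != a:
                    prod_in = [p * t for p, t in zip(prod_in, m[(b, j)])]
            norm = sum(prod_in)
            mu[(i, a)] = [p / norm for p in prod_in]
    marginals = []
    for i in range(n_vars):
        belief = [1.0] * (q_max + 1)
        for a, j in edges:
            if j == i:
                belief = [p * t for p, t in zip(belief, m[(a, j)])]
        norm = sum(belief)
        marginals.append([p / norm for p in belief])
    return marginals

def exact_marginals(factors, n_vars, q_max):
    """Marginals obtained by enumerating every assignment (tiny cases only)."""
    counts = [[0] * (q_max + 1) for _ in range(n_vars)]
    total = 0
    for nu in product(range(q_max + 1), repeat=n_vars):
        if all(sum(c * nu[l] for l, c in zip(vs, cs)) == rhs
               for vs, cs, rhs in factors):
            total += 1
            for i, x in enumerate(nu):
                counts[i][x] += 1
    return [[c / total for c in row] for row in counts]

# Hypothetical tree: nu_0 - nu_1 - nu_2 = 0 and nu_1 - nu_3 = 0.
TREE = [((0, 1, 2), (1, -1, -1), 0), ((1, 3), (1, -1), 0)]
print(bp_marginals(TREE, 4, 3)[0])
print(exact_marginals(TREE, 4, 3)[0])
```

On graphs with loops the same updates give only an approximation, and a fixed number of sweeps is no longer guaranteed to converge.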
Check of the performance on artificial data
[Figure: log-log plot of the running time τ [sec] versus N for lrs and BP, with a reference line proportional to N. N = 12 is the maximal value for lrs.]
• 1000 realizations per point
• Fixed α = M/N = 1/3
Avis D, Fukuda K: A pivoting algorithm for convex hulls and vertex enumeration of arrangements and polyhedra. Discrete Comput. Geom. 1992, 8(3):295–313. LRS package [http://cgm.cs.mcgill.ca/~avis
Human Red Blood Cell Network: N=46, M=34
• Wiback, et al. J. Theoretical Biol. 2004. Method: Monte Carlo sampling. Computation time: minutes
• BP marginals. Computation time: ~3 sec! (on a comparable PC)
[Figure: grid of 46 single-flux marginal distributions, one panel per flux: HK, PGI, PFK, ALD, TPI, GAPDH, PGK, DPGM, DPGase, PGM, EN, PK, LDH, G6PDH, PGL, PDGH, R5PI, Xu5PE, TKI, TKII, TA, AMPase, ADA, AK, ApK, AMPDA, AdPRT, IMPase, PNPase, PRM, PRPPsyn, HGPRT, GLC, DPG23, PYR, LAC, HX, ADE, ADO, INO, ADP, ATP, NAD, NADH, NADP, NADPH]
Impact of gene knock-out in E. coli central metabolism
[Figure: bar chart of S_0 − S_KO for each flux of the network]
Most of the high-impact fluxes are involved in the glycolytic pathway.
Simulating enzymopathies
[Figure: S_0 − S_KO versus flux knock-out percentage (0 to 100) for the fluxes h2o, h, HEX1, PGK, GAPD, GLCP, G1PP, glycogen, PGM, ENO, PDH, ACONT, SUCD1i, CS, fadh2, fad, FUM, co2, MDH, TPI]
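A partial knock-out can be mimicked on a toy example by shrinking the upper bound of one flux and comparing solution counts. The sketch below rests on assumptions that the slides do not state: S is taken to be the logarithm of the number of lattice solutions, the network is the toy one from the Metabolic Network slide with exchange fluxes as extra variables, and `count_solutions` and `entropy_drop` are hypothetical helper names.

```python
from itertools import product
from math import log

# Toy network: columns are (nu_1, nu_2, nu_3, b_1, b_2, b_3).
S = [
    [-1, -1,  0, 1,  0,  0],
    [ 1,  0,  1, 0, -1,  0],
    [ 0,  1, -1, 0,  0, -1],
]

def count_solutions(S, upper):
    """Count integer vectors with 0 <= nu_i <= upper[i] and S . nu == 0."""
    count = 0
    for nu in product(*(range(u + 1) for u in upper)):
        if all(sum(s * x for s, x in zip(row, nu)) == 0 for row in S):
            count += 1
    return count

def entropy_drop(S, q_max, flux, percent):
    """S_0 - S_KO, assuming S = log(number of solutions).

    The upper bound of `flux` is reduced by `percent` (100 = full knock-out).
    """
    full = [q_max] * len(S[0])
    reduced = list(full)
    reduced[flux] = q_max * (100 - percent) // 100
    return log(count_solutions(S, full)) - log(count_solutions(S, reduced))

# Full knock-out of each internal flux of the toy network.
for flux in range(3):
    print(flux, entropy_drop(S, 3, flux, 100))
```

The all-zero flux vector always satisfies the constraints here, so the knocked-out count is never zero and the logarithm is well defined.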
Relevance of fluxes as a function of their mean velocity
[Figure: S_0 − S_KO versus ⟨ν⟩ for 100% knock-out and 75% knock-out; labelled points: GLCP, HEX1, G1PP, glycogen, PDH, CS]
E. coli Central Metabolism
[Figure: map of the central metabolism showing glycolysis and the Krebs cycle, with the fluxes GLCP, HEX1, G1PP, CS and PDH labelled]
E. coli organism-wide metabolism
• N = # of reactions = 1025
• M = # of metabolites = 626
• 40 min. on a standard laptop
[Figure: log-log plot against ⟨ν⟩ with the fit $A(\nu - \nu_0)^{-\gamma}$]
$$P_{fit}(\nu) = \frac{A}{(\nu - \nu_0)^{\gamma}}, \qquad \gamma = 1.48(5), \qquad \nu_0 = 0.00020(8)$$
Conclusions and Perspectives • Cell metabolism is a very well defined biological description of organisms (universal, stoichiometric parameters are integers, etc.) • One needs fast algorithms for analyzing metabolism: 1. Global characterization of the space of solutions 2. Fast in-silico flux-KO of organism-wide metabolic systems. Work in progress: • Conserved groups, in collaboration with A. De Martino
Work in collaboration with • Alfredo Braunstein (Politecnico, Torino, Italy) • Roberto Mulet (University of Havana, Cuba)
Extended collaborators: Martin Weigt (ISI), Hamed Mahamoudi (ISI), Riccardo Zecchina (Politecnico Torino & ISI), Enzo Marinari (Univ. Roma 1), Andrea De Martino (University of Roma 1), Ginestra Bianconi (ICTP)
• Braunstein A, Mulet R, Pagnani A: Estimating the size of the solution space of metabolic networks. BMC Bioinformatics 2008, 9:240 (19 May 2008)
• Braunstein A, Mulet R, Pagnani A: The space of feasible solutions in metabolic networks. J. Phys.: Conf. Ser. 95 (2008) 012017