Probabilistic Approximations of ODEs Based Signaling Pathways Dynamics P.S. Thiagarajan School of Computing, National University of Singapore
Biopathways • Metabolic pathways • Signaling pathways • Gene regulatory networks
Signaling Pathways • Chemical reactions in response to external signals (ligands). • Signals pass into the nucleus through a series of protein modifications. • “Data transfer” mechanism of the cell.
A Common Modeling Approach • View a pathway as a network of bio-chemical reactions. • Model the network as a system of ODEs: one for each molecular species. Reaction kinetics: mass action law, Michaelis-Menten, Hill, etc. • Study the ODE system dynamics.
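The ODEs model • Reaction scheme: S + E ⇌ ES (forward rate k_1, reverse rate k_2), ES → E + P (rate k_3). • Assume the mass action law (ES is a single species, the enzyme-substrate complex):
\[
\begin{aligned}
\frac{dS}{dt} &= -k_1\, S\, E + k_2\, \mathit{ES} \\
\frac{dE}{dt} &= -k_1\, S\, E + (k_2 + k_3)\, \mathit{ES} \\
\frac{d\mathit{ES}}{dt} &= k_1\, S\, E - (k_2 + k_3)\, \mathit{ES} \\
\frac{dP}{dt} &= k_3\, \mathit{ES}
\end{aligned}
\]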
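A minimal sketch of simulating the enzyme kinetics system S + E ⇌ ES → E + P numerically, assuming mass action kinetics. The integrator (classical fourth-order Runge-Kutta with a fixed step), the function names and any rate or initial values passed in are illustrative assumptions, not the solver used in the original work.

```python
def enzyme_rhs(state, k1, k2, k3):
    """Mass-action right-hand side for S + E <-> ES -> E + P."""
    S, E, ES, P = state
    bind = k1 * S * E     # S + E -> ES
    unbind = k2 * ES      # ES -> S + E
    cat = k3 * ES         # ES -> E + P
    return (-bind + unbind,
            -bind + unbind + cat,
            bind - unbind - cat,
            cat)


def rk4_trajectory(state, k1, k2, k3, dt, steps):
    """Integrate with the classical 4th-order Runge-Kutta scheme.

    Returns a list of (S, E, ES, P) tuples, one per time point,
    starting with the initial state.
    """
    def shifted(s, d, h):
        return tuple(x + h * dx for x, dx in zip(s, d))

    traj = [tuple(state)]
    for _ in range(steps):
        s = traj[-1]
        a = enzyme_rhs(s, k1, k2, k3)
        b = enzyme_rhs(shifted(s, a, dt / 2), k1, k2, k3)
        c = enzyme_rhs(shifted(s, b, dt / 2), k1, k2, k3)
        d = enzyme_rhs(shifted(s, c, dt), k1, k2, k3)
        traj.append(tuple(
            x + dt / 6 * (p + 2 * q + 2 * r + w)
            for x, p, q, r, w in zip(s, a, b, c, d)))
    return traj
```

A quick sanity check for any such simulation: E + ES and S + ES + P should stay constant along a trajectory, since enzyme and substrate material are conserved by the reaction scheme.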
Alternative approach: • Keep track of the exact number of molecules of each type. • Simulate the dynamics by executing one reaction at a time stochastically (CTMCs). • Stochastic simulations (Gillespie's algorithm). • Kappa, BioNetGen, PRISM, Bio-PEPA, ...
ODEs: Major Hurdles • Many unknown rate constants. • Must be estimated using limited data: low precision, population-based, noisy.
Major Hurdles • High-dimensional non-linear system: no closed-form solutions, so we must resort to numerical simulations. • Point values of initial states/data will not be available. • A large number of numerical simulations is needed to answer each analysis question.
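For comparison, a minimal sketch of Gillespie's direct method applied to the same enzyme kinetics reactions (S + E ⇌ ES → E + P). The function name is hypothetical, and the rate constants are assumed to be already expressed as stochastic (per-molecule) rates; this is an illustration, not the implementation found in any of the tools listed.

```python
import random


def gillespie(S, E, ES, P, k1, k2, k3, t_max, rng):
    """Simulate molecule counts one reaction at a time up to t_max.

    Returns a list of (time, S, E, ES, P) records.
    """
    t = 0.0
    history = [(t, S, E, ES, P)]
    while True:
        # Propensities of: S + E -> ES, ES -> S + E, ES -> E + P.
        props = (k1 * S * E, k2 * ES, k3 * ES)
        total = sum(props)
        if total == 0.0:
            break                      # no reaction can fire any more
        t += rng.expovariate(total)    # exponential waiting time
        if t > t_max:
            break
        u = rng.random() * total       # choose a reaction by propensity
        if u < props[0]:
            S, E, ES = S - 1, E - 1, ES + 1
        elif u < props[0] + props[1]:
            S, E, ES = S + 1, E + 1, ES - 1
        else:
            E, ES, P = E + 1, ES - 1, P + 1
        history.append((t, S, E, ES, P))
    return history
```

Usage, with illustrative counts and rates: `gillespie(50, 10, 0, 0, 0.1, 0.2, 0.3, 5.0, random.Random(0))`.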
“Polling” based approximation • Start with an ODEs system. • Discretize the time and value domains. • Assume a (uniform) distribution of initial states. • Generate a “sufficiently” large number of trajectories by sampling the initial states and running numerical simulations.
The “exit poll” Idea • Encode this collection of discretized trajectories as a dynamic Bayesian network (DBN). • ODEs → DBN. • Pay the one-time cost of constructing the DBN approximation. • Do analysis using Bayesian inference on the DBN.
Time Discretization • Observe the system only at a finite number of time points. [Figure: the curve x(t) = t^3 + 4t + 2, with x(0) = 2, observed at the time points t_0, t_1, t_2, ..., t_max.]
Value Discretization • Observe only with bounded precision. [Figure: the x(t) axis is partitioned into the intervals A, B, C, D, E; the curve is observed at t_0, t_1, t_2, ..., t_max.]
Symbolic trajectories • A trajectory is recorded as a finite sequence of discrete values, e.g. (C,0) (D,1) (D,2) (E,3) (D,4) (C,5) (B,6). [Figure: the curve x(t) over the intervals A to E, observed at t_0, t_1, t_2, ..., t_max.]
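The two discretization steps can be sketched as follows. The cut points, the labels and the function names are illustrative assumptions; the slides do not say how the intervals A to E are chosen.

```python
from bisect import bisect_right


def discretize_value(x, cut_points, labels="ABCDE"):
    """Map a real value to the label of the interval containing it.

    cut_points must be sorted and have len(labels) - 1 entries; a value
    equal to a cut point is assigned to the interval above it.
    """
    return labels[bisect_right(cut_points, x)]


def symbolic_trajectory(x_of_t, time_points, cut_points, labels="ABCDE"):
    """Record x(t) only at the given time points, as (label, index) pairs."""
    return [(discretize_value(x_of_t(t), cut_points, labels), i)
            for i, t in enumerate(time_points)]
```

For example, `symbolic_trajectory(lambda t: t, [0.5, 1.5, 4.5], [1, 2, 3, 4])` observes the line x(t) = t at three time points and yields one symbolic state per time point.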
Collection of Trajectories • Assume a prior distribution of the initial states. • Uncountably many trajectories, represented as a set of (timed) finite sequences. [Figure: the sequence (C,0) (D,1) (D,2) (E,3) (D,4) (C,5) (B,6), together with further sequences such as one passing through (D,3).]
Piecing trajectories together.. • In fact, a probabilistic transition system. • Pr((D, 2) → (E, 3)) is the “fraction” of the trajectories residing in D at t = 2 that land in E at t = 3. [Figure: the edge (D,2) → (E,3) is labeled 0.8 and the edge (D,2) → (D,3) is labeled 0.2.]
The Justification • The value space of the variables is assumed to be a compact subset C of $\mathbb{R}^n$. • In $Z' = F(Z)$, F is assumed to be continuously differentiable in C (mass action law, Michaelis-Menten, ...). • Then the solution $\Phi_t : C \to C$ (for each t) exists, is unique, a bijection, continuous and hence measurable. • But the transition probabilities cannot be computed.
A computational approximation • States: (s, i). Transitions: (s, i) → (s', i+1). • Sample the initial states, say, 1000 times. • Through numerical simulation, generate 1000 trajectories. • Pr((s, i) → (s', i+1)) is the fraction of the trajectories that are in s at t_i which land in s' at t_{i+1}. [Figure: sample counts 1000 and 800 give the label 0.8 on the edge (D,2) → (E,3); the edge (D,2) → (D,3) is labeled 0.2.]
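The counting step can be sketched as below, assuming the simulated trajectories have already been discretized into sequences of interval labels (one label per time point). The function name and data layout are illustrative choices.

```python
from collections import Counter


def transition_probabilities(trajectories):
    """Estimate Pr((s, i) -> (s', i+1)) as a fraction of trajectories.

    trajectories: list of label sequences, one label per time index.
    Returns {((s, i), (s', i+1)): probability}.
    """
    visits = Counter()   # trajectories in state s at time i
    moves = Counter()    # of those, how many land in s' at time i+1
    for traj in trajectories:
        for i in range(len(traj) - 1):
            src = (traj[i], i)
            dst = (traj[i + 1], i + 1)
            visits[src] += 1
            moves[src, dst] += 1
    return {edge: n / visits[edge[0]] for edge, n in moves.items()}
```

With 8 trajectories following C, D, D, E and 2 following C, D, D, D, the estimate for (D, 2) → (E, 3) is 8/10 and for (D, 2) → (D, 3) is 2/10, mirroring the 0.8 / 0.2 split in the figure.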
Infeasible Size! • But the transition system will be huge: $O(T \cdot k^n)$ for T time points, k intervals per variable and n variables, with $k \ge 2$ and n in the range 50-100.
Compact Representation • Exploit the network structure (additional independence assumptions) to construct a DBN instead. • The DBN is a factored form of the probabilistic transition system.
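The DBN representation • Reaction scheme: S + E ⇌ ES (forward rate k_1, reverse rate k_2), ES → E + P (rate k_3). • Assume the mass action law:
\[
\begin{aligned}
\frac{dS}{dt} &= -k_1\, S\, E + k_2\, \mathit{ES} \\
\frac{dE}{dt} &= -k_1\, S\, E + (k_2 + k_3)\, \mathit{ES} \\
\frac{d\mathit{ES}}{dt} &= k_1\, S\, E - (k_2 + k_3)\, \mathit{ES} \\
\frac{dP}{dt} &= k_3\, \mathit{ES}
\end{aligned}
\]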
Dependency diagram • Read off from the right-hand sides of the ODEs for S + E ⇌ ES → E + P: dS/dt, dE/dt and dES/dt each involve S, E and ES; dP/dt involves only ES. [Figure: dependency graph over the nodes S, E, ES and P.]
The DBN Representation • One node per species per time point (superscripts are time indices): S^0, S^1, S^2, S^3, ...; E^0, E^1, E^2, E^3, ...; ES^0, ES^1, ES^2, ES^3, ...; P^0, P^1, P^2, P^3, ... [Figure: the DBN for S + E ⇌ ES → E + P unrolled over time, with edges from each time slice to the next following the dependency diagram.]
• Each node has a CPT (conditional probability table) associated with it. • This specifies the local (probabilistic) dynamics. • Example entries (superscripts are time indices; A, B, C are value intervals): P(S^2 = C | S^1 = B, E^1 = C, ES^1 = B) = 0.2; P(S^2 = C | S^1 = B, E^1 = C, ES^1 = C) = 0.1; P(S^2 = A | S^1 = A, E^1 = A, ES^1 = C) = 0.05; ...
Computational Approximation • Fill up the entries in the CPTs by sampling, simulation and counting. [Figure: the DBN annotated with the sample counts 1000, 500 and 100.]
The Technique • Each CPT entry is a ratio of counts: P(S^2 = C | S^1 = B, E^1 = C, ES^1 = B) = 100/500 = 0.2, i.e. of the 500 sampled trajectories with S^1 = B, E^1 = C and ES^1 = B, 100 have S^2 = C.
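A minimal sketch of filling one CPT by counting, assuming each sampled trajectory has already been simulated and discretized into a list of per-time-point dictionaries mapping variable names to interval labels. This data layout and the function name are assumptions made for illustration.

```python
from collections import Counter


def estimate_cpt(trajectories, child, parents, t):
    """Estimate P(child at t+1 | parents at t) from sampled trajectories.

    trajectories: list of trajectories; trajectory[i] is a dict
                  {variable name: interval label} at time index i.
    Returns {(parent_values, child_value): probability}.
    """
    parent_counts = Counter()   # how often each parent configuration occurs
    joint_counts = Counter()    # ... and is followed by each child value
    for traj in trajectories:
        key = tuple(traj[t][p] for p in parents)
        parent_counts[key] += 1
        joint_counts[key, traj[t + 1][child]] += 1
    return {(key, value): n / parent_counts[key]
            for (key, value), n in joint_counts.items()}
```

If 500 trajectories have S = B, E = C, ES = B at time 1 and 100 of them have S = C at time 2, then `estimate_cpt(trajs, "S", ("S", "E", "ES"), 1)` assigns that entry the value 100/500. Parent configurations that never occur in the sample get no entry, which is one reason a "sufficiently" large sample is needed.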
The Technique • The size of the DBN is $O(T \cdot n \cdot k^d)$, where d is the maximum number of parents of a node. • d will usually be much smaller than n.
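To make the gap between the two size bounds concrete, here is a small arithmetic sketch. The values T = 100, k = 5 and d = 4 used in the usage note are illustrative assumptions; n = 50 is the lower end of the range quoted on the slides, and constant factors hidden by the O-notation are ignored.

```python
def transition_system_size(T, k, n):
    """Order of the number of states in the flat transition system: T * k^n."""
    return T * k ** n


def dbn_size(T, k, n, d):
    """Order of the number of CPT entries in the factored DBN: T * n * k^d."""
    return T * n * k ** d
```

With T = 100, k = 5, n = 50 and d = 4, `dbn_size` gives 100 · 50 · 5^4 = 3,125,000, whereas `transition_system_size` gives 100 · 5^50, a number with 37 digits.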
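Unknown rate constants • Reaction scheme: S + E ⇌ ES → E + P with k_1 = 0.1 and k_2 = 0.2 known, and k_3 unknown. • k_3 is carried along as an extra variable that does not change over time:
\[
\begin{aligned}
\frac{dS}{dt} &= -0.1\, S\, E + 0.2\, \mathit{ES} \\
\frac{dE}{dt} &= -0.1\, S\, E + (0.2 + k_3)\, \mathit{ES} \\
\frac{d\mathit{ES}}{dt} &= 0.1\, S\, E - (0.2 + k_3)\, \mathit{ES} \\
\frac{dP}{dt} &= k_3\, \mathit{ES} \\
\frac{dk_3}{dt} &= 0
\end{aligned}
\]
[Figure: the DBN with rows S^0 ... S^3, E^0 ... E^3, ES^0 ... ES^3, P^0 ... P^3 and a node k_3^0.]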
Unknown rate constants • During the numerical generation of a trajectory, the value of k_3 does not change after sampling, so P(k_3^1 = A | k_3^0 = A) = 1. [Figure: the DBN gains a row of nodes k_3^0, k_3^1, k_3^2, k_3^3, ...]
Unknown rate constants • CPT entries can now be conditioned on the interval of k_3, e.g. P(ES^2 = A | S^1 = C, E^1 = B, ES^1 = A, k_3^1 = A) = 0.4.
Unknown rate constants • Sample uniformly across all the k_3 intervals. [Figure: the DBN with rows for S, E, ES, P and k_3.]
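One way to read "sample uniformly across all the intervals" is sketched below: pick one of the discretization intervals of the unknown rate constant with equal probability, then draw a value uniformly inside it. Equal-width intervals, the range bounds and the function name are assumptions for illustration; the slides do not specify them.

```python
import random


def sample_rate_constant(lo, hi, n_intervals, rng):
    """Pick an interval uniformly, then a value uniformly within it.

    Returns (interval index, sampled value). The sampled value is held
    fixed for the whole trajectory generated from it.
    """
    width = (hi - lo) / n_intervals
    idx = rng.randrange(n_intervals)
    value = lo + (idx + rng.random()) * width
    return idx, value
```

Choosing the interval first guarantees that every interval of k_3 receives samples, so the CPT entries conditioned on each interval can be filled in.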
DBN based Analysis • Use Bayesian inference to do parameter estimation, sensitivity analysis, probabilistic model checking, ... • Exact inference is not feasible for large models. • We do approximate inference, using the Factored Frontier (FF) algorithm.
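A minimal sketch of one propagation step of the basic Factored Frontier idea: the joint distribution over a time slice is approximated by the product of its per-variable marginals, and those marginals are pushed through the CPTs to the next slice. The data layout and function name are illustrative assumptions, and this is the plain textbook update rather than whatever tuned variant the experiments may have used.

```python
from itertools import product


def factored_frontier_step(marginals, parents, cpts):
    """Compute the marginals at time t+1 from the marginals at time t.

    marginals: {var: {value: prob}} at time t
    parents:   {var: tuple of parent variables at time t}
    cpts:      {var: {(parent_values, value): prob}} for the step t -> t+1
    """
    new = {}
    for var, pars in parents.items():
        dist = {}
        # Enumerate every configuration of the parents of this variable.
        for combo in product(*(marginals[p].keys() for p in pars)):
            # FF approximation: parents are treated as independent.
            weight = 1.0
            for p, v in zip(pars, combo):
                weight *= marginals[p][v]
            for (pv, value), prob in cpts[var].items():
                if pv == combo:
                    dist[value] = dist.get(value, 0.0) + weight * prob
        new[var] = dist
    return new
```

The cost per step grows with the number of parent configurations of each node rather than with the size of the joint state space, which is what makes the approximation usable on large models.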
Parameter Estimation 1. For each choice of (interval) values for the unknown parameters, run FF, compare the result with experimental data and assign a score. 2. Return the parameter estimates with maximal likelihood. 3. FF can then be used on the calibrated model to do sensitivity analysis, probabilistic verification etc. [Figure: the DBN with rows for S, E, ES, P and k_3.]
DBN based Analysis • Our experiments with signaling pathway models (taken from the BioModels database) show: the one-time cost of constructing the DBN can be easily amortized by using it to do parameter estimation and sensitivity analysis; the approach is a good compromise between efficiency and accuracy.