Analyzing and Synthesizing Genomic Logic Functions
Nicola Paoletti, Department of Computer Science, University of Oxford, UK
Joint work with Hillel Kugler, Youssef Hamadi, Christoph M. Wintersteiger, Boyan Yordanov (Microsoft Research Cambridge, UK)
CAV, 20 Jul 14
Sea urchin developmental GRN: (one of) the most complete models
• Boolean GRN model (on/off + logical connections)
• 45 genes
• Time delays (due to chemical kinetics)
• [0-30] hours post-fertilization (hpf). Step: 1 hpf
• 4 spatial domains (leading to specific kinds of cells and organs)
• 2 spatial relations between domains (direct and indirect contiguity) describing the evolution of embryonic geometry
Limitations
• Can't fully explain experimental data (discrepancies on 26/45 genes)
• Hard-coded terms to force observations (3/45 genes): $\mathrm{In}\ d \wedge {>}t_i \wedge {<}t_j$
• Informal modelling language
• Simulation semantics (ignores non-determinism)
Davidson et al., PNAS 109(41) 16434-6442, 2012
Synthesis of GRNs
How to obtain a GRN model that fully explains experimental data?
SMT solving
Formal model
GRN model
• genes $G$
• spatial domains $D$
• discrete bounded time domain $T$
• spatial relations $SR$, with $r : D \times D \times T \to \mathbb{B}$
• update functions $F$ (aka Vector Equations), with $f_g : \Pi \times T \times D \to \mathbb{B}$ ($g \in G$). History and domain dependent
Transition system dynamics
• States $Q = \mathbb{B}^{|G \times D|}$ (boolean expression of each gene in each domain)
• Finite paths $\Pi$
• Synchronous dynamics
• Transition relation $\delta : \Pi \to \mathbb{B}$
\[ \delta(\pi) \iff \bigwedge_{i \in T,\ g \in G,\ d \in D} \pi[i](g, d) = f_g(\pi, i, d) \]
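The transition relation above can be read as a plain consistency check on a finite path. The following is a minimal sketch, assuming a path is a list of states and each state maps (gene, domain) pairs to booleans; the two-gene network, the names `delta`, `GENES`, `DOMAINS` and the update functions are hypothetical and only illustrate the definition.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[Tuple[str, str], bool]          # (gene, domain) -> expression
Path = List[State]                           # one state per time step
UpdateFn = Callable[[Path, int, str], bool]  # f_g(pi, i, d)


def delta(path: Path, genes: List[str], domains: List[str],
          F: Dict[str, UpdateFn]) -> bool:
    """Return True iff pi[i](g, d) == f_g(pi, i, d) for all i, g, d."""
    return all(
        path[i][(g, d)] == F[g](path, i, d)
        for i in range(len(path))
        for g in genes
        for d in domains
    )


# Hypothetical two-gene, one-domain network (not from the sea urchin model).
GENES, DOMAINS = ["a", "b"], ["d1"]
F = {
    "a": lambda pi, i, d: True,                            # always on
    "b": lambda pi, i, d: i >= 1 and pi[i - 1][("a", d)],  # a, one step earlier
}

good = [{("a", "d1"): True, ("b", "d1"): False},
        {("a", "d1"): True, ("b", "d1"): True},
        {("a", "d1"): True, ("b", "d1"): True}]
bad = [dict(s) for s in good]
bad[0][("b", "d1")] = True   # b on at step 0 contradicts f_b

print(delta(good, GENES, DOMAINS, F))  # True
print(delta(bad, GENES, DOMAINS, F))   # False
```

Because the update functions receive the whole path, history-dependent terms such as delays need no extra state.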
Experimental observations
Pairs $(C, E)$, where $C$ is a set of perturbed functions and $E$ describes the observed effects. The perturbed dynamics are obtained by replacing in $F$ the set of functions $C$.
• Wildtype expression: $(\emptyset, \pi)$, where $\pi$ is a predicate describing the sequence of observed states
• Knockout of gene $g$: $(\{ f_g(\pi, t, d) := 0 \}, E)$, where $E$ is a predicate comparing the wildtype and the perturbed paths
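A knockout observation can be illustrated on a toy network. This is a sketch under stated assumptions: the network is deterministic and every update function reads only strictly earlier steps, so a path can be produced by simple simulation (the approach in the slides instead asks for the existence of suitable paths). The names `simulate`, `perturb` and the predicate `E` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[Tuple[str, str], bool]
Path = List[State]
UpdateFn = Callable[[Path, int, str], bool]


def simulate(F: Dict[str, UpdateFn], genes: List[str],
             domains: List[str], steps: int) -> Path:
    """Synchronous simulation; assumes functions only read steps < i."""
    path: Path = []
    for i in range(steps):
        path.append({(g, d): F[g](path, i, d) for g in genes for d in domains})
    return path


def perturb(F: Dict[str, UpdateFn], C: Dict[str, UpdateFn]) -> Dict[str, UpdateFn]:
    """Replace in F the functions listed in C."""
    return {**F, **C}


GENES, DOMAINS = ["a", "b"], ["d1"]
F = {
    "a": lambda pi, i, d: True,
    "b": lambda pi, i, d: i >= 1 and pi[i - 1][("a", d)],
}
knockout_a = {"a": lambda pi, i, d: False}   # f_a(pi, t, d) := 0

wild = simulate(F, GENES, DOMAINS, 3)
ko = simulate(perturb(F, knockout_a), GENES, DOMAINS, 3)


def E(wt: Path, pert: Path) -> bool:
    """Hypothetical effect: b is expressed in the wildtype but never after knockout."""
    return any(s[("b", "d1")] for s in wt) and not any(s[("b", "d1")] for s in pert)


print(E(wild, ko))  # True
```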
Problem formulation
Input:
• GRN $N = (G, D, SR, T, F)$ with partial knowledge of $F$
• Observations $O$
Synthesize functions in $F$ s.t. the dynamics of $N$ admits paths that meet all observations (for each $(C_i, E_i) \in O$, there exist paths $\pi$ of $N$ and $\pi_i$ of $N$ perturbed by $C_i$, s.t. $E_i(\pi, \pi_i)$ holds).
Model encoded as constraints in the theory of bit-vectors (SMT QF_UFBV)
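The shape of the synthesis problem can be shown on a very small instance. The sketch below swaps the SMT solver for brute-force enumeration, uses a single implicit domain, and assumes one unknown function whose candidates are delayed copies of another gene; the helper names `at` and `consistent` and the observed path are made up for illustration.

```python
from typing import Callable, Dict, List

State = Dict[str, bool]
Path = List[State]
UpdateFn = Callable[[Path, int], bool]


def at(n: int, gene: str) -> UpdateFn:
    """Candidate term: value of `gene` n steps earlier (False before step n)."""
    return lambda pi, i: i >= n and pi[i - n][gene]


def consistent(path: Path, F: Dict[str, UpdateFn]) -> bool:
    """Check pi[i](g) == f_g(pi, i) at every step, for every gene in F."""
    return all(path[i][g] == F[g](path, i) for i in range(len(path)) for g in F)


# Hypothetical wildtype observation over 4 steps.
observed = [{"a": a, "b": b}
            for a, b in zip([False, True, True, True],
                            [False, False, False, True])]

known = {"a": lambda pi, i: i >= 1}   # the known part of F
# The unknown part: which delay n makes f_b consistent with the observation?
solutions = [n for n in range(4)
             if consistent(observed, {**known, "b": at(n, "a")})]

print(solutions)  # [2]
```

Enumeration only works here because the candidate set has four elements; the slides rely on a bit-vector SMT encoding precisely because realistic search spaces are far larger.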
Vector equation language…before Davidson et al., PNAS 109(41) 16434-6442, 2012
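Vector equation language… after
\[
\begin{array}{rll}
E ::= & g \mid \neg E \mid E \wedge E \mid E \vee E & \\
\mid & {>}t \mid {<}t \mid \mathrm{In}\ \bar{d} & \\
\mid & \mathrm{In}\ \bar{d}\ E & \text{(evaluation in domain } \bar{d}\text{)} \\
\mid & \mathrm{In}\ r\ \bar{d}\ E & \text{(evaluation in a domain in sp. relation } r \text{ with } \bar{d}\text{)} \\
\mid & \text{At-}n\ E & \text{(delay of } n \text{ steps)} \\
\mid & \text{After-}n\ E & \text{(delayed permanent activation)} \\
\mid & \text{Perm-}n\ E & \text{(delayed permanent repression)}
\end{array}
\]
And it's a subset of LTL+P.
[Figure: timing diagrams of E with delay n for At-2 E, After-1 E and Perm-1 E]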
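The delay operators can be illustrated on a boolean trace of E over time. This sketch rests on an assumed reading of the one-line descriptions in the grammar: At-n E holds at step i iff E held at step i - n, and After-n E holds at step i iff E held at some step no later than i - n, so it stays on once triggered. Perm-n is left out because the grammar alone does not pin down its semantics. The function names are hypothetical.

```python
from typing import List


def at(n: int, trace: List[bool]) -> List[bool]:
    """At-n E: E shifted by n steps (False for the first n steps)."""
    return [i >= n and trace[i - n] for i in range(len(trace))]


def after(n: int, trace: List[bool]) -> List[bool]:
    """After-n E (assumed reading): on from n steps after E first held."""
    out, seen = [], False
    for i in range(len(trace)):
        if i >= n and trace[i - n]:
            seen = True
        out.append(seen)
    return out


E = [False, True, False, False, False]   # E holds only at step 1

print(at(2, E))     # [False, False, False, True, False]
print(after(1, E))  # [False, False, True, True, True]
```

Under this reading, At-n corresponds to n nested "previous" operators and After-n to the same prefix applied to a past-time "once", which matches the claim that the language sits inside LTL with past.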
Function synthesis – Basic interactions
Basic interactions (BI) are templates for the synthesis of regulatory terms
  $f = op\ t\ r\ d\ b\ g$
where $op$ and $t$ are temporal operators and delays, $r$ spatial relations, $d$ domains, and $b\ g$ input genes and their expression.
Examples
  $f = \text{At-}[1,3]\ \neg g_1$
  $f = \{\text{After-}, \text{Perm-}\}\,?\ \mathrm{In}\ \{d_1, d_2\}\ ?\ \{g_1, g_2\}$
✓ Clear biological interpretation
✓ Incorporates uncertainty
We encode BIs as bit-vectors, and each evaluation of a BI is mapped to a function.
\[
\bigwedge_{g' \in g,\ b' \in b,\ d' \in d,\ r' \in r,\ t' \in t,\ op' \in op}
f = (g', b', d', r', t', op') \implies \big( f \Rightarrow op'\,t'\,(\mathrm{IN}\ r'\ d'\ (b' \iff g'))(\pi, i, d) \big)
\]
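To make the template idea concrete, the sketch below enumerates every concrete term a BI template stands for. It follows the tuple order (g, b, d, r, t, op) used above; the delay range 0..3, the absence of a spatial relation, and the names `template`, `instances` and `render` are assumptions for illustration, and the slides encode these choices as bit-vectors rather than Python tuples.

```python
from itertools import product
from typing import Dict, List

# Hypothetical template in the spirit of
#   f = {After-, Perm-}? In {d1, d2} ? {g1, g2}
template: Dict[str, list] = {
    "g": ["g1", "g2"],          # candidate input genes
    "b": [True, False],         # True: gene as is, False: negated
    "d": ["d1", "d2"],          # candidate domains
    "r": [None],                # no spatial relation in this template
    "t": [0, 1, 2, 3],          # assumed delay range for the unknown delay
    "op": ["After-", "Perm-"],  # candidate temporal operators
}
KEYS = ["g", "b", "d", "r", "t", "op"]


def instances(tpl: Dict[str, list]) -> List[dict]:
    """All concrete choices (g', b', d', r', t', op') allowed by the template."""
    return [dict(zip(KEYS, combo)) for combo in product(*(tpl[k] for k in KEYS))]


def render(bi: dict) -> str:
    """Pretty-print one concrete basic interaction."""
    gene = bi["g"] if bi["b"] else "¬" + bi["g"]
    rel = f"{bi['r']} " if bi["r"] else ""
    return f"{bi['op']}{bi['t']} In {rel}{bi['d']} {gene}"


all_bis = instances(template)
print(len(all_bis))        # 64
print(render(all_bis[0]))  # After-0 In d1 g1
```

The count is the product of the option-list sizes, which is why leaving several fields open quickly enlarges the search space.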
Function synthesis – Uninterpreted functions
We use Uninterpreted Boolean Functions $uf : \mathbb{B}^n \to \mathbb{B}$ to synthesize logical combinations of regulatory inputs.
Additional constraints can be added to avoid the negation of arguments.
Example: $f_g = uf(f_1, f_2)$ with $f_1$ known to be a promoter.
Without constraints, $uf$ can be synthesized as $\neg f_1 \wedge f_2$, which makes $f_1$ a repressor.
Idea: adding constraints
\[ uf(0, f_2) \implies uf(1, f_2) \]
\[ \bigwedge_{i = 1, \ldots, n} uf(b_1, \ldots, b_{i-1}, 0, b_{i+1}, \ldots, b_n) \implies uf(b_1, \ldots, b_{i-1}, 1, b_{i+1}, \ldots, b_n) \]
[Figure: truth tables of uf over f1 and f2; panels labelled ¬f1 ∧ f2 and f2]
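The effect of the constraint can be checked exhaustively for small n. The sketch below represents each candidate for uf as an explicit truth table and keeps those that satisfy uf(.., 0, ..) ⟹ uf(.., 1, ..) in a chosen argument; it is a plain enumeration for illustration, not the SMT encoding used in the slides, and the function names are hypothetical.

```python
from itertools import product
from typing import Dict, Iterator, Tuple

Table = Dict[Tuple[bool, ...], bool]   # truth table of an n-ary boolean function


def all_functions(n: int) -> Iterator[Table]:
    """Every boolean function of n arguments, as a truth table."""
    inputs = list(product([False, True], repeat=n))
    for outputs in product([False, True], repeat=len(inputs)):
        yield dict(zip(inputs, outputs))


def monotone_in(table: Table, n: int, i: int) -> bool:
    """True iff raising argument i from 0 to 1 never turns the output off."""
    for args in product([False, True], repeat=n):
        if not args[i]:
            raised = args[:i] + (True,) + args[i + 1:]
            if table[args] and not table[raised]:
                return False
    return True


funcs = list(all_functions(2))
allowed = [f for f in funcs if monotone_in(f, 2, 0)]   # f1 is a promoter

repressor = {a: (not a[0]) and a[1] for a in product([False, True], repeat=2)}
conjunction = {a: a[0] and a[1] for a in product([False, True], repeat=2)}

print(len(funcs), len(allowed))   # 16 9
print(repressor in allowed)       # False
print(conjunction in allowed)     # True
```

With two inputs the constraint on f1 removes 7 of the 16 candidates, including ¬f1 ∧ f2, while keeping f1 ∧ f2 and f2 alone.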
Results
Software implementation of VE language and SMT-based synthesis methods with Z3 as solving engine
Original function:
    hfn1 := AT-2 bra AND AT-2 eve
UF of synthesis templates:
    f1 := {AT-,AFTER-}? IN ?? bra
    f2 := {AT-,AFTER-}? IN ?? eve
    hfn1 := uf(f1,f2)
User-guided refinement:
    f1 := AT-[0,5] bra
    f2 := AT-[0,5] eve
    hfn1 := uf(f1,f2)
Synthesized function:
    hfn1 := AT-1 bra AND eve
Results
• Preserved inputs and their effect (activation/repression)
• Most delays preserved (± 3 hpf, the resolution of observations)
• Only a few temporary interactions synthesized into permanent ones (and vice versa)
• Added spatial inputs (of the form $\mathrm{In}\ \bar{d}$ and $\mathrm{In}\ r\ \bar{d}$) are supported by literature data
[Figure: original model (Davidson et al., PNAS 109(41) 16434-6442, 2012) compared with the synthesized model]
FULLY EXPLAINS EXPERIMENTAL DATA
• Solved discrepancies
• No need for hard-coded terms
Related work
- Parameter synthesis for gene networks (Batt et al. Bioinformatics 23(18), 2007)
- Synthesis from mutation experiments (Köksal et al. POPL 13)
- Modular design of biological circuits (Bartocci et al. CMSB 13)
Z34Bio (http://research.microsoft.com/en-us/projects/z3-4biology/)
- Analysis of DNA computing (Yordanov et al. DNA 2013)
- Synthesis of minimal GRN for embryonic stem cells (Dunn et al. Science 344(6188), 2014)
Conclusions
• SMT-based method for synthesizing models of GRNs
• Applied to state-of-the-art model of sea urchin development
  • 45 genes x 4 spatial domains x 2 spatial relations x 30 hpf
  • Wildtype expression + 3 perturbation experiments
• Formal encoding of biological DSL
• Synthesized model fully explains observations with no major changes
• Effective (performance depends on size of search space for functions)
    f1 := {AT-,AFTER-}[0,6] IN ?? bra
    f2 := {AT-,AFTER-}[0,6] IN ?? eve
    hfn1 := uf(f1,f2)
    352800 possible functions!
Improves understanding of biological systems and produces new hypotheses to validate in wet-lab
Thank you