Discretize and Encode the Data

[Figure: time-series data values for CIII — number of times each expression level is seen, Experiments 1–20]

Bin assignments:

Time | CI | CII | CIII | Cro | N
   0 |  0 |  0  |  0   |  0  | 0
   5 |  0 |  0  |  1   |  0  | 0
  10 |  0 |  0  |  0   |  1  | 0
  .. | .. |  .. |  ..  |  .. | ..
 100 |  0 |  0  |  2   |  1  | 1

θ(CIII) = ⟨0, 33, 67, ∞⟩

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 25 / 98
Discretize and Encode the Data

[Figure: time-series data values for CIII — number of times each expression level is seen, Experiments 1–20]

Experimental data:

Time | CI | CII | CIII | Cro | N
   0 |  0 |  0  |  0   |  0  |  0
   5 |  0 | 10  |  0   | 40  |  8
  10 |  0 | 20  | 16   | 60  |  8
  .. | .. |  .. |  ..  |  .. | ..
 100 | 12 | 29  | 35   | 88  | 45

θ(CIII) = ⟨0, 7, 31, ∞⟩

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 25 / 98
Discretize and Encode the Data

[Figure: time-series data values for CIII — number of times each expression level is seen, Experiments 1–20]

Bin assignments:

Time | CI | CII | CIII | Cro | N
   0 |  0 |  0  |  0   |  0  | 0
   5 |  0 |  1  |  0   |  0  | 0
  10 |  0 |  2  |  1   |  1  | 0
  .. | .. |  .. |  ..  |  .. | ..
 100 |  0 |  2  |  2   |  2  | 1

θ(CIII) = ⟨0, 7, 31, ∞⟩

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 25 / 98
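To make the binning step concrete, here is a minimal sketch in Python. It assumes each species has its own threshold vector like θ(CIII) above (only CIII's is given on the slide); the function name and sample values are illustrative:

```python
import bisect

def discretize(value, thresholds):
    """Map a raw expression value to a bin index.

    thresholds holds the finite bin boundaries in increasing order,
    e.g. theta(CIII) = [0, 7, 31] with an implicit final bound of
    infinity: [0, 7) -> bin 0, [7, 31) -> bin 1, [31, inf) -> bin 2.
    """
    return bisect.bisect_right(thresholds, value) - 1

theta_cIII = [0, 7, 31]
print(discretize(35, theta_cIII))  # -> 2, matching CIII's bin at time 100
print(discretize(16, theta_cIII))  # -> 1, matching CIII's bin at time 10
```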
Clustering Algorithms Often applied to microarray data taken over a variety of conditions or a series of time points. Assume that genes that are active at the same time are likely involved in the same regulatory process. Also assume that genes form groups and that, within a group, the genes produce the same expression profile. Due to noise and other uncertainties, groupings are not clear. Goal: determine the original groupings of the genes. Assume that there exists a method to determine the pairwise distance between the expression profiles of any two genes. Many algorithms have been proposed for clustering. Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 26 / 98
K-Means Partitions N genes into K clusters. Begins with K initial clusters, either specified by the user or chosen at random. For each cluster, computes its centroid (i.e., the average expression profile of the genes in the cluster). Reassigns each gene to the cluster whose centroid is closest to the gene's expression pattern. Centroids are recalculated and the process repeats until no assignment changes. Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 27 / 98
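As a sketch of the procedure, the following Python treats each gene's expression profile as a row vector and iterates the assign/recompute loop; the Euclidean distance and random initialization are illustrative choices, not prescribed by the slide:

```python
import numpy as np

def kmeans(profiles, k, n_iter=100, seed=0):
    """Partition gene expression profiles (rows of a NumPy array) into k clusters."""
    rng = np.random.default_rng(seed)
    # Initial centroids: k randomly chosen profiles.
    centroids = profiles[rng.choice(len(profiles), size=k, replace=False)].astype(float)
    assignment = None
    for _ in range(n_iter):
        # Distance from every profile to every centroid.
        dists = np.linalg.norm(profiles[:, None, :] - centroids[None, :, :], axis=2)
        new_assignment = dists.argmin(axis=1)
        if assignment is not None and np.array_equal(new_assignment, assignment):
            break  # no gene changed cluster: done
        assignment = new_assignment
        for j in range(k):  # recompute each centroid as the mean member profile
            members = profiles[assignment == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return assignment, centroids
```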
K-Means Example Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 28 / 98
K-Means Example Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 29 / 98
K-Means Example Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 30 / 98
K-Means Example Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 31 / 98
K-Means Clustering: Phage λ Data

π^K_1 = {cI}
π^K_2 = {B, C, U, V, H, M, L, K, I, J}
π^K_3 = {A, D, E, Z, G, T, S, R}
π^K_4 = {cro, cII, cIII, N, Q, xis, int, O}

(Data courtesy of Osterhout et al., BMC Microbiology 2007)

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 32 / 98
Agglomerative Hierarchical Clustering Begins with N clusters, each containing a single gene. Combines the two clusters that are the smallest distance apart, where distance is measured between their average expression profiles. Continues for N − 1 steps, at which point all the genes are merged into a hierarchical tree. Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 33 / 98
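A minimal sketch using SciPy; 'centroid' linkage matches the slide's merge rule (distance between average profiles), and the random profiles are placeholders:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
profiles = rng.random((10, 6))  # one row per gene, one column per time point

# 'centroid' linkage merges the two clusters whose average profiles are
# closest; N genes yield N-1 merge steps, i.e., a full hierarchical tree.
tree = linkage(profiles, method="centroid")

# Cut the tree into, e.g., 4 flat clusters.
labels = fcluster(tree, t=4, criterion="maxclust")
print(labels)
```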
Hierarchical Clustering: Phage λ Data

π^H_1 = {cI}
π^H_2 = {cro, cII, cIII, N, Q, xis, int, O}
π^H_3 = {S, R}
π^H_4 = {A, D, E, Z, G, T, B, C, U, V, H, M, L, K, I, J}

(Data courtesy of Osterhout et al., BMC Microbiology 2007)

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 34 / 98
Clustering Summary Using clustering results, one can potentially determine which genes produce proteins with similar functions. Clustering results do not shed light on how these genes and their protein products interact. Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 35 / 98
Bayesian Networks Given expression data, E, learning techniques allow one to infer the network connectivity that best matches E. Bayesian networks are a promising tool for learning connectivity. A Bayesian network represents a joint probability distribution. It is represented with a directed acyclic graph, G, whose vertices correspond to random variables, X_1, ..., X_n, for the gene expression levels. Connections represent dependencies between the random variables. Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 36 / 98
Dependence P(X, Y) is the joint distribution over two variables X and Y. X and Y are independent if P(X, Y) = P(X)P(Y) for all values of X and Y (equivalently, P(X | Y) = P(X)). When X and Y are dependent, the value of Y gives information about X. Correlation is a sufficient but not a necessary condition for dependence. When X and Y are dependent, this is represented in the Bayesian network by an arc between them. If the arc is directed from X to Y, X is a parent of Y. Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 37 / 98
Markov Assumption

Associated with each X_i is a conditional distribution, θ, given its parents.
Graph G encodes the Markov assumption: each variable X_i is independent of its non-descendants given its parents in G.
This is known as conditional independence, and graph G implies a set of conditional independence assumptions, Ind(G).
Using the Markov assumption, the joint PDF can be decomposed:

P(X_1, ..., X_n) = ∏_{i=1}^{n} P(X_i | Pa(X_i))

where Pa(X_i) denotes the parents of X_i.

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 38 / 98
Simple Bayesian Network

Network: A → B → C

P(A):       A=0: 0.5,  A=1: 0.5
P(B=1 | A): A=0: 0.1,  A=1: 0.2   (P(B=0 | A) = 1 − P(B=1 | A))
P(C=1 | B): B=0: 0.75, B=1: 0.35  (P(C=0 | B) = 1 − P(C=1 | B))

Ind(G) = {A ⊥⊥ C | B}

P(A, B, C) = P(A) P(B|A) P(C|B)
P(A=1, B=0, C=0) = 0.5 × 0.8 × 0.25 = 0.1

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 39 / 98
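The factored joint can be evaluated directly from the tables above; a small sketch (the dictionary encoding is just one convenient representation):

```python
# CPTs from the slide for the chain A -> B -> C.
P_A = {0: 0.5, 1: 0.5}
P_B1_given_A = {0: 0.1, 1: 0.2}    # P(B=1 | A)
P_C1_given_B = {0: 0.75, 1: 0.35}  # P(C=1 | B)

def joint(a, b, c):
    """P(A, B, C) = P(A) P(B|A) P(C|B), using the Markov assumption."""
    pb = P_B1_given_A[a] if b == 1 else 1 - P_B1_given_A[a]
    pc = P_C1_given_B[b] if c == 1 else 1 - P_C1_given_B[b]
    return P_A[a] * pb * pc

print(joint(1, 0, 0))  # 0.5 * 0.8 * 0.25 = 0.1, as on the slide
```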
Another Bayesian Network

Network: A → B, E → B, A → D, B → C

P(E):          E=0: 0.6,  E=1: 0.4
P(A):          A=0: 0.35, A=1: 0.65
P(B=1 | A, E): (A,E)=(0,0): 0.1, (0,1): 0.8, (1,0): 0.7, (1,1): 0.4
P(D=1 | A):    A=0: 0.1,  A=1: 0.8
P(C=1 | B):    B=0: 0.75, B=1: 0.35

A is a common cause of B and D. If A is not measured, it is a hidden common cause.

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 40 / 98
Another Bayesian Network

[Same network and CPTs as above]

Ind(G) = {A ⊥⊥ C | B, A ⊥⊥ E, B ⊥⊥ D | A, C ⊥⊥ D | A, C ⊥⊥ D | B, C ⊥⊥ E | B, D ⊥⊥ E}

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 40 / 98
Another Bayesian Network

[Same network and CPTs as above]

P(A, B, C, D, E) = P(A) P(B|A,E) P(C|B) P(D|A) P(E)

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 40 / 98
Equivalence Classes of Bayesian Networks More than one graph can imply the same set of independences. Graphs X → Y and X ← Y both have Ind(G) = ∅. G and G′ are equivalent if Ind(G) = Ind(G′). Equivalent graphs have the same underlying undirected graph but may disagree on the direction of some edges. An equivalence class is represented by a partially directed acyclic graph (PDAG) where edges can be: X → Y, X ← Y, or X — Y. Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 41 / 98
Equivalent Bayesian Networks

[Figure: equivalent Bayesian network structures over A, B, and C]

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 42 / 98
Equivalent Bayesian Networks

[Figure: equivalent Bayesian network structures over A, B, C, D, and E]

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 43 / 98
Learning Bayesian Networks

Given a training set of experimental data, E, find a network ⟨G, θ⟩ that best matches E.
Evaluate using a Bayesian scoring metric:

P(G | E) = P(E | G) P(G) / P(E)

Score(G : E) = log P(G | E) = log P(E | G) + log P(G) + C

where C = −log P(E) is constant and P(E | G) = ∫ P(E | G, θ) P(θ | G) dθ is the marginal likelihood.
The choice of priors P(G) and P(θ | G) influences the score.

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 44 / 98
Learning Bayesian Networks (cont) Given priors and data, learning amounts to finding the structure G that maximizes the score. This problem is NP-hard, so heuristics such as greedy random search are used. For example, beginning with some initial network, a greedy random search selects an edge to add, delete, or reverse. It then computes this network's score and, if it is better than the previous network's, keeps the change. This process repeats until no improvement is found. Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 45 / 98
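A sketch of one such search; `score` stands in for any Bayesian scoring metric (its implementation is assumed here), and the acyclicity check keeps every accepted network a DAG:

```python
import random

def greedy_search(variables, score, n_steps=1000, seed=0):
    """Greedy random structure search over sets of directed edges."""
    rng = random.Random(seed)
    edges = set()              # start from the empty network
    best = score(edges)
    for _ in range(n_steps):
        x, y = rng.sample(variables, 2)
        proposal = set(edges)
        if (x, y) in proposal:
            proposal.remove((x, y))        # delete ...
            if rng.random() < 0.5:
                proposal.add((y, x))       # ... or reverse the edge
        else:
            proposal.add((x, y))           # add a new edge
        if is_acyclic(proposal) and score(proposal) > best:
            edges, best = proposal, score(proposal)  # keep the improvement
    return edges

def is_acyclic(edges):
    """Kahn-style check: repeatedly peel off nodes with no incoming edge."""
    nodes = {v for e in edges for v in e}
    es = set(edges)
    while nodes:
        sources = [n for n in nodes if all(t != n for (_, t) in es)]
        if not sources:
            return False  # every remaining node lies on a cycle
        nodes -= set(sources)
        es = {(s, t) for (s, t) in es if s in nodes and t in nodes}
    return True
```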
Efficient Learning Algorithms The number of graphs is super-exponential in the number of variables. The sparse candidate algorithm identifies a small number of candidate parents for each gene based on local statistics. A pitfall is that early choices can overly restrict the search space. Adapting the candidate sets during the search can help. Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 46 / 98
Applying Bayesian Networks to Expression Data By learning a Bayesian network, one can answer questions like which genes depend on which other genes. The expression level of each gene is modeled as a random variable. Need to define a local probability model for each variable. Discretize gene expression into 3 categories: significantly lower than, similar to, or significantly greater than control. Discretizing can lose information, but more levels can be used when the experimental data provide more resolution. The control expression level is either determined experimentally or taken as the average expression level. The meaning of "significantly" is defined by setting a threshold on the ratio between the measured expression and the control. Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 47 / 98
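A sketch of the three-level discretization; the ratio threshold of 2.0 is an arbitrary placeholder for whatever cutoff defines "significantly":

```python
def discretize_vs_control(expr, control, threshold=2.0):
    """Return -1, 0, or +1: significantly lower than, similar to, or
    significantly greater than the control expression level."""
    ratio = expr / control
    if ratio >= threshold:
        return +1
    if ratio <= 1.0 / threshold:
        return -1
    return 0

print(discretize_vs_control(50.0, 20.0))  # ratio 2.5  -> +1
print(discretize_vs_control(15.0, 20.0))  # ratio 0.75 ->  0
```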
Phage λ Discretized Expression Data

cIII N cII cro cI | Probability
 0   0  0   0  0  | 0.05
 0   0  0   0  1  | 0.18
 0   0  0   1  0  | 0.06
 0   0  0   1  1  | 0.10
 0   0  1   0  0  | 0.00
 0   0  1   0  1  | 0.04
 0   0  1   1  0  | 0.02
 0   0  1   1  1  | 0.02
 0   1  0   0  0  | 0.01
 0   1  0   0  1  | 0.05
 0   1  0   1  0  | 0.05
 0   1  0   1  1  | 0.02
 0   1  1   0  0  | 0.03
 0   1  1   0  1  | 0.00
 0   1  1   1  0  | 0.03
 0   1  1   1  1  | 0.00
 1   0  0   0  0  | 0.00
 1   0  0   0  1  | 0.02
 1   0  0   1  0  | 0.01
 1   0  0   1  1  | 0.00
 1   0  1   0  0  | 0.01
 1   0  1   0  1  | 0.01
 1   0  1   1  0  | 0.01
 1   0  1   1  1  | 0.00
 1   1  0   0  0  | 0.01
 1   1  0   0  1  | 0.01
 1   1  0   1  0  | 0.01
 1   1  0   1  1  | 0.00
 1   1  1   0  0  | 0.20
 1   1  1   0  1  | 0.02
 1   1  1   1  0  | 0.02
 1   1  1   1  1  | 0.00

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 48 / 98
Phage λ Bayesian Network

Network: N → cIII, {N, cIII} → cII, cII → cI, cI → cro

P(N):               N=0: 0.54, N=1: 0.46
P(cIII=1 | N):      N=0: 0.12, N=1: 0.58
P(cII=1 | N, cIII): (N,cIII)=(0,0): 0.18, (0,1): 0.48, (1,0): 0.34, (1,1): 0.87
P(cI=1 | cII):      cII=0: 0.66, cII=1: 0.24
P(cro=1 | cI):      cI=0: 0.43, cI=1: 0.33

P(cIII, N, cII, cI, cro) = P(N) P(cIII|N) P(cII|N, cIII) P(cI|cII) P(cro|cI)

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 49 / 98
Comparison

cIII N cII cro cI | Orig | BN
 0   0  0   0  0  | 0.05 | 0.08
 0   0  0   0  1  | 0.18 | 0.17
 0   0  0   1  0  | 0.06 | 0.08
 0   0  0   1  1  | 0.10 | 0.17
 0   0  1   0  0  | 0.00 | 0.04
 0   0  1   0  1  | 0.04 | 0.01
 0   0  1   1  0  | 0.02 | 0.04
 0   0  1   1  1  | 0.02 | 0.01
 0   1  0   0  0  | 0.01 | 0.03
 0   1  0   0  1  | 0.05 | 0.06
 0   1  0   1  0  | 0.05 | 0.03
 0   1  0   1  1  | 0.02 | 0.06
 0   1  1   0  0  | 0.03 | 0.03
 0   1  1   0  1  | 0.00 | 0.01
 0   1  1   1  0  | 0.03 | 0.03
 0   1  1   1  1  | 0.00 | 0.01
 1   0  0   0  0  | 0.00 | 0.01
 1   0  0   0  1  | 0.02 | 0.01
 1   0  0   1  0  | 0.01 | 0.01
 1   0  0   1  1  | 0.00 | 0.01
 1   0  1   0  0  | 0.01 | 0.01
 1   0  1   0  1  | 0.01 | 0.01
 1   0  1   1  0  | 0.01 | 0.01
 1   0  1   1  1  | 0.00 | 0.01
 1   1  0   0  0  | 0.01 | 0.01
 1   1  0   0  1  | 0.01 | 0.01
 1   1  0   1  0  | 0.01 | 0.01
 1   1  0   1  1  | 0.00 | 0.01
 1   1  1   0  0  | 0.20 | 0.10
 1   1  1   0  1  | 0.02 | 0.04
 1   1  1   1  0  | 0.02 | 0.10
 1   1  1   1  1  | 0.00 | 0.04

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 50 / 98
Phage λ Bayesian Network

[Same network and CPTs as on slide 49]

P(cIII, N, cII, cI, cro) = P(N) P(cIII|N) P(cII|N, cIII) P(cI|cII) P(cro|cI)

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 51 / 98
Phage λ Bayesian Network

Network: cIII → N, {N, cIII} → cII, cII → cI, cI → cro

P(cIII):            cIII=0: 0.58, cIII=1: 0.42
P(N=1 | cIII):      cIII=0: 0.29, cIII=1: 0.80
P(cII=1 | N, cIII): (N,cIII)=(0,0): 0.18, (0,1): 0.48, (1,0): 0.34, (1,1): 0.87
P(cI=1 | cII):      cII=0: 0.66, cII=1: 0.24
P(cro=1 | cI):      cI=0: 0.43, cI=1: 0.33

P(cIII, N, cII, cI, cro) = P(cIII) P(N|cIII) P(cII|N, cIII) P(cI|cII) P(cro|cI)

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 52 / 98
Finding Features

A difficulty is that data exist for thousands of genes but often only a few dozen samples; on the positive side, the networks are typically sparse.
A set of plausible networks needs to be considered.
One may characterize features common to a set of networks:
  Markov relations: Is Y in the Markov blanket of X?
  Order relations: Is X an ancestor of Y? (or a cause?)
Confidence is the likelihood that a feature is actually true:

confidence(f) = (1/m) ∑_{i=1}^{m} f(G_i)

where m is the number of potential networks considered, G_i is a potential network, and f(G_i) is 1 if f is a feature of G_i and 0 otherwise.
A bootstrap method, which considers multiple subsets of the experimental data, can be used to generate the potential networks.

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 53 / 98
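A sketch of the confidence estimate; `learn` stands in for any structure-learning procedure and `feature` for a predicate such as a Markov or order relation (both are assumptions, not defined here):

```python
import random

def confidence(feature, data, learn, m=100, seed=0):
    """Fraction of m bootstrap-learned networks exhibiting the feature."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(m):
        # Resample the experiments with replacement and relearn a network.
        resample = [rng.choice(data) for _ in data]
        if feature(learn(resample)):
            hits += 1
    return hits / m
```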
Bayesian Networks Discussion Clustering approaches can only find correlations. Bayesian analysis can potentially discover causal relationships and interactions between genes. Probabilistic semantics are well suited to noisy biological systems. Can focus on extracting features rather than finding a single model. Can assist with experimental design. Bayesian networks, however, are limited to acyclic graphs. Most (if not all) genetic circuits include feedback control. Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 54 / 98
Dynamic Bayesian Networks

A Dynamic Bayesian Network (DBN) unrolls the cyclic graph T times.
The nodes in a DBN are random variables X_1^(t), ..., X_n^(t), where t ranges from 1 to T.
The joint PDF can be decomposed as follows:

P(X_1^(1), ..., X_n^(T)) = ∏_{t=1}^{T} ∏_{i=1}^{n} P(X_i^(t) | Pa(X_i^(t)))

DBNs require time-series experimental data.

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 55 / 98
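A sketch of evaluating the unrolled factorization over a discretized trajectory; the `prior` and `transition` callables stand in for the time-slice CPTs, which the model being evaluated must supply:

```python
def dbn_joint(trajectory, prior, transition):
    """P(X(1), ..., X(T)) for a DBN unrolled over the trajectory.

    trajectory: list of per-time-point states, t = 1..T.
    prior(state): probability of the initial time slice.
    transition(prev, cur): product of P(X_i(t) | Pa(X_i(t))) for one step.
    """
    p = prior(trajectory[0])
    for prev, cur in zip(trajectory, trajectory[1:]):
        p *= transition(prev, cur)
    return p
```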
DBN for the Phage λ Decision Circuit

P(N(1), cI(1), cII(1), cIII(1), cro(1), N(2), cI(2), cII(2), cIII(2), cro(2)) =
  P(N(1)) P(cI(1)) P(cII(1)) P(cIII(1)) P(cro(1))
  × P(N(2) | cro(1), cI(1))
  × P(cI(2) | cro(1), cI(1), cII(1))
  × P(cII(2) | N(1), cI(1), cIII(1))
  × P(cIII(2) | N(1), cI(1), cro(1))
  × P(cro(2) | cI(1), cro(1))

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 56 / 98
Causal Networks A Bayesian network represents correlative relationships, but ultimately we are interested in knowing causal relationships. In a causal network, parents are interpreted as immediate causes. The causal Markov assumption states that, given the values of a variable's immediate causes, it is independent of its earlier causes. Causal networks model not only the distribution of observations but also the effects of interventions. In causal networks, X → Y and Y → X are not equivalent. Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 57 / 98
Learning Causal Networks DBN approaches typically must perform an expensive global search, and they have difficulty learning networks with tight feedback. The method described here uses local analysis to efficiently learn networks with tight feedback. This method determines the likelihood that a gene’s expression increases in the next time point given the current gene expression levels. These likelihoods are then used to determine influences between the genes in the genetic circuit. Result is a directed graph representation of the genetic circuit. Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 58 / 98
Possible Genetic Circuit Models

[Figure: candidate genetic circuit models over CI, CII, CIII, Cro, and N]

Number of models = 3^(|S|^2), where |S| is the number of species.
For the five species CI, CII, CIII, Cro, and N, this is 3^25 ≈ 8.47 × 10^11 models.

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 59 / 98
Influence Vectors

[Figure: candidate genetic circuit over CI, CII, CIII, Cro, and N with one connection marked '?']

Over the species (CI, CII, CIII, Cro, N), a candidate influence vector for CIII is:

i = ⟨r, n, n, u, a⟩

where a = activation, r = repression, n = no connection, and u = unknown.

Act(i) = {N}
Rep(i) = {CI}
Par(i) = {N, CI}

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 60 / 98
Scoring Influence Vectors

Γ = {(⟨e, τ, ν⟩, ⟨e′, τ′, ν′⟩) | ⟨e, τ, ν⟩ ∈ E ∧ ⟨e′, τ′, ν′⟩ ∈ E ∧ (e = e′) ∧ (τ < τ′)
      ∧ ¬∃⟨e, τ″, ν″⟩ ∈ E. (τ < τ″) ∧ (τ″ < τ′)}

inc(s) = {(⟨e, τ, ν⟩, ⟨e′, τ′, ν′⟩) ∈ Γ | ν(s) < ν′(s)}

val(s) = {(⟨e, τ, ν⟩, ⟨e′, τ′, ν′⟩) ∈ Γ | ν(s) ∉ {L, H, −} ∧ ν′(s) ∉ {L, H, −}}

bin(b) = {(⟨e, τ, ν⟩, ⟨e′, τ′, ν′⟩) ∈ Γ | ∀s′ ∈ S. ν(s′) ∈ Φ_{b(s′)}(s′)}

P(inc(s) ∩ val(s) ∩ bin(b)) = |inc(s) ∩ val(s) ∩ bin(b)| / |Γ|

P(val(s) ∩ bin(b)) = |val(s) ∩ bin(b)| / |Γ|

P(inc(s) | val(s) ∩ bin(b)) = P(inc(s) ∩ val(s) ∩ bin(b)) / P(val(s) ∩ bin(b))
                            = |inc(s) ∩ val(s) ∩ bin(b)| / |val(s) ∩ bin(b)|

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 61 / 98
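A counting sketch of P(inc(s) | val(s) ∩ bin(b)); the data layout (per-experiment lists of (time, bin-assignment) rows) is an assumption for illustration:

```python
def successive_pairs(experiments):
    """Gamma: all pairs of successive rows within each experiment.

    experiments maps an experiment id to its (time, values) rows sorted
    by time; values maps each species to a bin level, or to 'L', 'H',
    or '-' when the measurement is unusable.
    """
    pairs = []
    for rows in experiments.values():
        pairs.extend(zip(rows, rows[1:]))
    return pairs

def prob_increase(pairs, s, b):
    """Estimate P(inc(s) | val(s) and bin(b)) by counting.

    b is a partial bin assignment such as {'CII': 0, 'CIII': 1};
    species not listed in b are unconstrained ('*').
    """
    usable = lambda x: x not in ('L', 'H', '-')
    sel = [(v, v2) for (t, v), (t2, v2) in pairs
           if usable(v[s]) and usable(v2[s])               # val(s)
           and all(v[sp] == lvl for sp, lvl in b.items())] # bin(b)
    if not sel:
        return None
    return sum(1 for v, v2 in sel if v[s] < v2[s]) / len(sel)
```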
Probability of Increase: Example

[Figure: N's probability of increase as a function of the CII and CIII bin levels, ⟨∗, i, j, ∗, ∗⟩ (courtesy of Barker (2007))]

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 62 / 98
Probability of Increase: Example

⟨CI, CII, CIII, Cro, N⟩ | P(N↑ | b(CII), b(CIII))
⟨∗, 0, 0, ∗, ∗⟩         | 40
⟨∗, 0, 1, ∗, ∗⟩         | 49
⟨∗, 0, 2, ∗, ∗⟩         | 70
⟨∗, 1, 0, ∗, ∗⟩         | 58
⟨∗, 1, 1, ∗, ∗⟩         | 42
⟨∗, 1, 2, ∗, ∗⟩         | 38
⟨∗, 2, 0, ∗, ∗⟩         | 66
⟨∗, 2, 1, ∗, ∗⟩         | 38
⟨∗, 2, 2, ∗, ∗⟩         | 26

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 63 / 98
Probability Ratios

To determine trends, a ratio is formed between two probabilities using two partial bin assignments, b and b′:

P(inc(s) | val(s) ∩ bin(b′)) / P(inc(s) | val(s) ∩ bin(b))
  = (|inc(s) ∩ val(s) ∩ bin(b′)| / |val(s) ∩ bin(b′)|) × (|val(s) ∩ bin(b)| / |inc(s) ∩ val(s) ∩ bin(b)|)

The partial bin assignment b′ is called the base:

b′(s) = ∗    if i(s) = 'n'
        0    if (i(s) = 'a' ∧ |Rep(i)| ≤ |Act(i)|) ∨ (i(s) = 'r' ∧ |Rep(i)| > |Act(i)|)
        n−1  otherwise

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 64 / 98
Probability Ratios: Example

⟨CI, CII, CIII, Cro, N⟩ | P(N↑ | b(CII), b(CIII)) | Ratio
⟨∗, 0, 0, ∗, ∗⟩         | 40                      | base
⟨∗, 0, 1, ∗, ∗⟩         | 49                      | 1.23
⟨∗, 0, 2, ∗, ∗⟩         | 70                      | 1.75
⟨∗, 1, 0, ∗, ∗⟩         | 58                      | 1.45
⟨∗, 1, 1, ∗, ∗⟩         | 42                      | 1.05
⟨∗, 1, 2, ∗, ∗⟩         | 38                      | 0.95
⟨∗, 2, 0, ∗, ∗⟩         | 66                      | 1.65
⟨∗, 2, 1, ∗, ∗⟩         | 38                      | 0.95
⟨∗, 2, 2, ∗, ∗⟩         | 26                      | 0.65

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 65 / 98
Scoring Influence Vectors

More activating influences (i.e., |Rep(i)| ≤ |Act(i)|):
  ratio scale 0 ... T_r ... 1 ... T_a ... ∞ partitioned as Against | Neutral | For

More repressing influences (i.e., |Rep(i)| > |Act(i)|):
  ratio scale 0 ... T_r ... 1 ... T_a ... ∞ partitioned as For | Neutral | Against

The final score is determined using the following equation:

score = (v_f − v_a) / (v_f + v_a + v_n)

A score greater than zero indicates support for the vector, while a negative score indicates a lack of support.

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 66 / 98
Scoring Influence Vectors: Example

⟨CI, CII, CIII, Cro, N⟩ | P(N↑ | b(CII), b(CIII)) | Ratio | Vote
⟨∗, 0, 0, ∗, ∗⟩         | 40                      | base  |
⟨∗, 0, 1, ∗, ∗⟩         | 49                      | 1.23  | v_a
⟨∗, 0, 2, ∗, ∗⟩         | 70                      | 1.75  | v_a
⟨∗, 1, 0, ∗, ∗⟩         | 58                      | 1.45  | v_a
⟨∗, 1, 1, ∗, ∗⟩         | 42                      | 1.05  | v_n
⟨∗, 1, 2, ∗, ∗⟩         | 38                      | 0.95  | v_n
⟨∗, 2, 0, ∗, ∗⟩         | 66                      | 1.65  | v_a
⟨∗, 2, 1, ∗, ∗⟩         | 38                      | 0.95  | v_n
⟨∗, 2, 2, ∗, ∗⟩         | 26                      | 0.65  | v_f

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 67 / 98
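A sketch tying ratios to votes and the score; the lecture does not state numeric values for T_a and T_r, so the defaults below are placeholders chosen to reproduce the votes in the table above (a net-repressing vector, since high ratios count against it):

```python
def score_influence_vector(ratios, more_activating, T_a=1.15, T_r=0.87):
    """Vote on each ratio against the T_r/T_a bands, then score.

    T_a and T_r are the neutral-band thresholds from the number line on
    the previous slide; their values here are assumed placeholders.
    """
    v_f = v_a = v_n = 0
    for r in ratios:
        if T_r <= r <= T_a:
            v_n += 1                        # neutral band
        elif (r > T_a) == more_activating:
            v_f += 1                        # vote for the vector
        else:
            v_a += 1                        # vote against the vector
    return (v_f - v_a) / (v_f + v_a + v_n)

# Ratios from the table above (the base row casts no vote):
ratios = [1.23, 1.75, 1.45, 1.05, 0.95, 1.65, 0.95, 0.65]
print(score_influence_vector(ratios, more_activating=False))
# v_f=1, v_a=4, v_n=3 -> score -0.375
```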
Scoring with a Control Set

When scoring an influence vector, i, for species s, the probability of increase can be influenced by the level of s itself.
Similarly, when comparing two influence vectors, i and i′, it is useful to control for the species in i′ when evaluating the score for i, and vice versa.
In both cases, the bins can be partitioned using a control set, G.
Now consider all assignments to species in Par(i) ∪ G.
The base bin assignment agrees with the values in b for each member of G:

b′(s) = b(s)  if s ∈ G
        ∗     if i(s) = 'n'
        0     if (i(s) = 'a' ∧ |Rep(i)| ≤ |Act(i)|) ∨ (i(s) = 'r' ∧ |Rep(i)| > |Act(i)|)
        n−1   otherwise

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 68 / 98
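A direct transcription of the case definition above; the dictionary encoding of i, b, and G is an assumption:

```python
def base_bin(i, G, b, n_levels):
    """Base bin assignment b' for influence vector i with control set G.

    i maps each species to 'a', 'r', 'n', or 'u'; b is the current
    partial bin assignment; n_levels is the number of bins n.
    """
    more_repressing = (sum(1 for s in i if i[s] == 'r')
                       > sum(1 for s in i if i[s] == 'a'))
    bp = {}
    for s in i:
        if s in G:
            bp[s] = b[s]            # agree with b on the control set
        elif i[s] == 'n':
            bp[s] = '*'             # unconstrained
        elif (i[s] == 'a' and not more_repressing) or \
             (i[s] == 'r' and more_repressing):
            bp[s] = 0               # lowest bin
        else:
            bp[s] = n_levels - 1    # highest bin
    return bp
```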
Scoring with a Control Set: Example

[Figure: N's probability of increase as a function of the CII and CIII bin levels, controlled for N's own level — ⟨∗, i, j, ∗, 0⟩ vs. ⟨∗, i, j, ∗, 2⟩ (courtesy of Barker (2007))]

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 69 / 98
Scoring with a Control Set: Example

⟨CI, CII, CIII, Cro, N⟩ | P(N↑ | b(CII), b(CIII), b(N)) | Ratio | Vote
⟨∗, 0, 0, ∗, 0⟩         | 40                            | base  |
⟨∗, 0, 1, ∗, 0⟩         | 58                            | 1.45  | v_a
⟨∗, 0, 2, ∗, 0⟩         | 83                            | 2.08  | v_a
⟨∗, 1, 0, ∗, 0⟩         | 67                            | 1.66  | v_a
⟨∗, 1, 1, ∗, 0⟩         | 55                            | 1.37  | v_a
⟨∗, 1, 2, ∗, 0⟩         | 59                            | 1.47  | v_a
⟨∗, 2, 0, ∗, 0⟩         | 100                           | 2.5   | v_a
⟨∗, 2, 1, ∗, 0⟩         | 44                            | 1.09  | v_n
⟨∗, 2, 2, ∗, 0⟩         | 36                            | 0.90  | v_n

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 70 / 98
Scoring with a Control Set: Example

⟨CI, CII, CIII, Cro, N⟩ | P(N↑ | b(CII), b(CIII), b(N)) | Ratio | Vote
⟨∗, 0, 0, ∗, 1⟩         | 55                            | base  |
⟨∗, 0, 1, ∗, 1⟩         | 40                            | 0.72  | v_f
⟨∗, 0, 2, ∗, 1⟩         | 50                            | 0.90  | v_n
⟨∗, 1, 0, ∗, 1⟩         | 54                            | 0.98  | v_n
⟨∗, 1, 1, ∗, 1⟩         | 37                            | 0.67  | v_f
⟨∗, 1, 2, ∗, 1⟩         | 41                            | 0.75  | v_n
⟨∗, 2, 0, ∗, 1⟩         | 0                             | 0.00  | v_f
⟨∗, 2, 1, ∗, 1⟩         | 42                            | 0.76  | v_n
⟨∗, 2, 2, ∗, 1⟩         | 27                            | 0.49  | v_f

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 71 / 98
Scoring with a Control Set: Example

⟨CI, CII, CIII, Cro, N⟩ | P(N↑ | b(CII), b(CIII), b(N)) | Ratio | Vote
⟨∗, 0, 0, ∗, 2⟩         | 27                            | base  |
⟨∗, 0, 1, ∗, 2⟩         | 22                            | 0.81  | v_n
⟨∗, 0, 2, ∗, 2⟩         | 50                            | 1.85  | v_a
⟨∗, 1, 0, ∗, 2⟩         | 30                            | 1.11  | v_n
⟨∗, 1, 1, ∗, 2⟩         | 28                            | 1.04  | v_n
⟨∗, 1, 2, ∗, 2⟩         | 28                            | 1.04  | v_n
⟨∗, 2, 0, ∗, 2⟩         | 100                           | 3.70  | v_a
⟨∗, 2, 1, ∗, 2⟩         | 30                            | 1.11  | v_n
⟨∗, 2, 2, ∗, 2⟩         | 24                            | 0.88  | v_n

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 72 / 98
Learning Influences Overview Select initial influence vector set. Combine influence vectors. Compete influence vectors. Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 73 / 98
Unknown Influence Vector

[Figure: candidate genetic circuit with all connections into CIII marked '?']

Over the species (CI, CII, CIII, Cro, N), CIII's initial influence vector is unknown everywhere except for CIII itself (no self-connection):

i = ⟨u, u, n, u, u⟩

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 74 / 98
Selecting Initial Influence Vectors: CI ⊣ CIII

[Figure: CIII's rising probability (%) versus CI level, shown for CIII at levels 0, 1, and 2]

         CI L0   CI L1   CI L2   L1/L0   L2/L0
CIII L0  19.0%   1.7%    1.0%    0.09    0.05
CIII L1  17.1%   2.6%    1.2%    0.15    0.07
CIII L2  11.6%   2.7%    1.1%    0.23    0.09

Each of the six ratios (L1/L0 and L2/L0 at each CIII level) is far below one, so each casts a vote for repression, stepping v_f for ⟨rnnnn⟩ (CI ⊣ CIII) from 0 up to 6:

CIII's Influence Vector Scores
Influence Vector | v_f | v_a | v_n
⟨rnnnn⟩ CI       |  6  |  0  |  0

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 75 / 98
Selecting Initial Influence Vectors: CII ⊣ CIII

[Figure: CIII's rising probability (%) versus CII level, for CIII at levels 0, 1, and 2]

         CII L0   CII L1   CII L2   L1/L0   L2/L0
CIII L0   3.1%    13.7%      −      4.32      −
CIII L1   4.4%     7.4%    12.6%    1.65    2.83
CIII L2  19.4%     5.5%     5.8%    0.28    0.35

CIII's Influence Vector Scores
Influence Vector | v_f | v_a | v_n
⟨rnnnn⟩ CI       |  6  |  0  |  0
⟨nrnnn⟩ CII      |  2  |  3  |  0

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 76 / 98
Selecting Initial Influence Vectors: Cro ⊣ CIII

[Figure: CIII's rising probability (%) versus Cro level, for CIII at levels 0, 1, and 2]

         Cro L0   Cro L1   Cro L2   L1/L0   L2/L0
CIII L0  11.55%   1.84%    1.47%    0.16    0.13
CIII L1  14.20%   4.74%    3.10%    0.33    0.22
CIII L2   9.70%   5.02%    4.17%    0.52    0.43

CIII's Influence Vector Scores
Influence Vector | v_f | v_a | v_n
⟨rnnnn⟩ CI       |  6  |  0  |  0
⟨nrnnn⟩ CII      |  2  |  3  |  0
⟨nnnrn⟩ Cro      |  6  |  0  |  0

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 77 / 98
Selecting Initial Influence Vectors: N ⊣ CIII

[Figure: CIII's rising probability (%) versus N level, for CIII at levels 0, 1, and 2]

         N L0    N L1    N L2    L1/L0   L2/L0
CIII L0  5.4%    2.9%    3.7%    0.53    0.68
CIII L1  9.3%    7.2%    6.7%    0.78    0.71
CIII L2  8.6%    6.4%    6.1%    0.75    0.71

CIII's Influence Vector Scores
Influence Vector | v_f | v_a | v_n
⟨rnnnn⟩ CI       |  6  |  0  |  0
⟨nrnnn⟩ CII      |  2  |  3  |  0
⟨nnnrn⟩ Cro      |  6  |  0  |  0
⟨nnnnr⟩ N        |  5  |  0  |  1

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 78 / 98
Selecting Initial Influence Vectors: Others

The activation vectors are scored against the same ratio tables, completing CIII's influence vector scores:

Influence Vector | v_f | v_a | v_n
⟨rnnnn⟩ CI       |  6  |  0  |  0
⟨nrnnn⟩ CII      |  2  |  3  |  0
⟨nnnrn⟩ Cro      |  6  |  0  |  0
⟨nnnnr⟩ N        |  5  |  0  |  1
⟨annnn⟩ CI       |  0  |  6  |  0
⟨nannn⟩ CII      |  3  |  2  |  0
⟨nnnan⟩ Cro      |  0  |  6  |  0
⟨nnnna⟩ N        |  0  |  5  |  1

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 79 / 98
Selecting Initial Influence Vectors: Scoring

score = (v_f − v_a) / (v_f + v_a + v_n)

Scores fall on the scale −1.0 ... 0 ... 0.75 ... 1.0: vectors scoring below 0.75 are discarded, and those scoring at or above 0.75 are kept.

CIII's Influence Vector Scores
Influence Vector | v_f | v_a | v_n | Score
⟨rnnnn⟩ CI       |  6  |  0  |  0  |  1.0
⟨nrnnn⟩ CII      |  2  |  3  |  0  | −0.2
⟨nnnrn⟩ Cro      |  6  |  0  |  0  |  1.0
⟨nnnnr⟩ N        |  5  |  0  |  1  |  0.833
⟨annnn⟩ CI       |  0  |  6  |  0  | −1.0
⟨nannn⟩ CII      |  3  |  2  |  0  |  0.2
⟨nnnan⟩ Cro      |  0  |  6  |  0  | −1.0
⟨nnnna⟩ N        |  0  |  5  |  1  | −0.833

Chris J. Myers (Lecture 2: Learning Models) Engineering Genetic Circuits 80 / 98