Sunday, Sep 26th 2010
Dynamic Deterministic Effects Propagation Networks - Learning signalling pathways from longitudinal data
Christian Bender
MGA Molecular Genome Analysis - Biostatistics and Modelling
Contents
1) Introduction and motivation:
   a) Model system: EGF-Receptor (ERBB) signalling network
   b) Goals and experimental setup
   c) Technology: Reverse Phase Protein Arrays
2) Network reconstruction framework: DDEPN
   a) Overview
   b) System state generation and state sequence optimisation
   c) Likelihood calculation
   d) Network structure search by genetic algorithm
3) Testing and application to longitudinal ERBB data set
4) Conclusions
Christian Bender Biostatistics - Molecular Genome Analysis
1) Model system: EGF-Receptor signalling
• Important cancer-related signalling pathway
• ERBB2 overexpressed in 25-30% of human breast tumours
• Stimulation by Epidermal Growth Factor (EGF) and Heregulin (HRG)
1) Goals and experimental setup
• Goal: analyse response of various cell lines to different stimuli
Experimental setup:
• HCC1954 cell line, ERBB2 positive
• 2 ligands: EGF and HRG, and the combination EGF+HRG
• 5 biological replicates, 3 technical replicates
• 10 time points between 0 and 60 minutes after stimulation
• Measurements of phosphoprotein abundance generated on Reverse Phase Protein Arrays
• 16 antibodies targeting phosphoproteins
1) Technology: Reverse phase protein arrays (RPPA)
[Figure: RPPA workflow. Cell line lysate; single lysate spot; primary antibody detects target protein; IR-dye labeled secondary antibody binds to primary antibody; visualisation/quantification]
2) Example plots of the measured data after EGF stimulation
2) Dynamic Deterministic Effects Propagation Networks (DDEPN)
• Aims for modelling approach:
  • Include time dependencies and perturbations (stimulatory/inhibitory)
  • Model activating and inhibiting edges
• Modelling the signal flow from stimulation down the signalling cascade:
  • Define protein status (active/passive) during this propagation
  • Estimate Gaussian distributions for active/passive state of each protein
  • Calculate model likelihood depending on activation states
  • Optimise to find best network
2) DDEPN framework
[Figure: DDEPN workflow. Network hypothesis (nodes S, A, B); signal propagation; matrix of reachable system states Γ; HMM; optimal state sequence Γ̂* over time points t1 to t4. Data X (proteins × time, replicate measurements); parameter estimation Θ̂; likelihood calculation p(X | Γ̂*, Θ̂); modify network hypothesis -> genetic algorithm]
2) Generation of reachable system states
• N nodes give rise to 2^N system states
• Depending on the network structure, some state vectors can never be reached => reduce to the states that are implied by the network
• Start at step 1: stimulus S is active.
• State of protein v_i in step k: if at least one activating parent is 1 and no inhibiting parent is 1 in step k-1, set to 1
• Example: state of protein B in step 2
[Figure: network hypothesis with nodes S, A, B]
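The propagation rule can be sketched in a few lines of Python. This is a minimal illustration, not the ddepn package code. The function name `reachable_states`, the edge dictionary format, and the example network S -> A -> B are hypothetical. Two further assumptions are made: stimuli stay active in every step, and propagation stops as soon as a state vector repeats.

```python
def reachable_states(edges, nodes, stimuli):
    """Return the reachable state matrix: one row per node, one column per step.

    edges maps (parent, child) to +1 (activating) or -1 (inhibiting).
    """
    # Step 1: only the stimuli are active.
    state = {v: int(v in stimuli) for v in nodes}
    states = [state]
    seen = {tuple(state[v] for v in nodes)}
    while True:
        prev = states[-1]
        nxt = {}
        for v in nodes:
            if v in stimuli:
                nxt[v] = 1  # assumption: stimuli stay switched on
                continue
            parents = [(p, sign) for (p, c), sign in edges.items() if c == v]
            activated = any(sign == 1 and prev[p] == 1 for p, sign in parents)
            inhibited = any(sign == -1 and prev[p] == 1 for p, sign in parents)
            # Active only with an active activator and no active inhibitor.
            nxt[v] = int(activated and not inhibited)
        key = tuple(nxt[v] for v in nodes)
        if key in seen:  # assumption: stop once a state vector repeats
            break
        seen.add(key)
        states.append(nxt)
    return [[s[v] for s in states] for v in nodes]


nodes = ["S", "A", "B"]
chain = {("S", "A"): 1, ("A", "B"): 1}
gamma = reachable_states(chain, nodes, stimuli={"S"})
```

For the chain S -> A -> B only 3 of the 2^3 = 8 possible state vectors are reachable, which is the reduction the slide refers to.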
2) Most likely system state series using an HMM
• We do not know which state is reached at which time point => find series of system states using an HMM
[Figure: HMM with transition matrix and model parameters, fitted by Viterbi training]
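The decoding step inside Viterbi training can be sketched as follows. This is a generic log-space Viterbi pass, not the ddepn implementation. The hidden states are taken to be the columns of the reachable state matrix and the emissions to be independent Gaussians per protein. All names (`viterbi`, `params`, `trans`) are hypothetical, and the series is assumed to start in the first reachable state. Full Viterbi training would alternate this step with re-estimating the Gaussian parameters and the transition matrix from the decoded path.

```python
import math


def log_normal(x, mu, sd):
    """Log density of a Gaussian with mean mu and standard deviation sd."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mu) ** 2 / (2 * sd * sd)


def viterbi(x, gamma, params, trans):
    """Most likely series of system states for one time course.

    x[i][t]      measurement of protein i at time point t
    gamma[i][s]  0/1 state of protein i in reachable system state s
    params[i][a] (mean, sd) of protein i when passive (a=0) or active (a=1)
    trans[r][s]  transition probability from system state r to s
    """
    n_prot, n_time, n_states = len(x), len(x[0]), len(gamma[0])

    def log_emit(s, t):
        return sum(log_normal(x[i][t], *params[i][gamma[i][s]])
                   for i in range(n_prot))

    def log_p(p):
        return math.log(p) if p > 0 else -math.inf

    score = [[-math.inf] * n_states for _ in range(n_time)]
    back = [[0] * n_states for _ in range(n_time)]
    score[0][0] = log_emit(0, 0)  # assumption: start in the first state
    for t in range(1, n_time):
        for s in range(n_states):
            best_r = max(range(n_states),
                         key=lambda r: score[t - 1][r] + log_p(trans[r][s]))
            back[t][s] = best_r
            score[t][s] = (score[t - 1][best_r] + log_p(trans[best_r][s])
                           + log_emit(s, t))
    # Backtrack from the best final state.
    path = [max(range(n_states), key=lambda s: score[-1][s])]
    for t in range(n_time - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]


gamma = [[1, 1, 1], [0, 1, 1], [0, 0, 1]]        # rows S, A, B
params = [[(0.0, 0.1), (1.0, 0.1)]] * 3          # toy (mean, sd) values
trans = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]]
x = [[1, 1, 1, 1], [0, 1, 1, 1], [0, 0, 1, 1]]   # toy measurements, 4 time points
path = viterbi(x, gamma, params, trans)
```

Expanding `path` into the corresponding columns of `gamma` gives a state matrix with one column per time point.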
2) Likelihood of network hypothesis given system state matrix
• Given a state sequence matrix Γ̂*, each data point follows one of two Gaussian distributions, one for the active and one for the passive state of the protein [equation not preserved]
• The full likelihood for a network hypothesis Φ is: [equation not preserved]
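A hedged sketch of how such a likelihood could be evaluated: for each protein, the measurements are split by the 0/1 entries of the state matrix, a Gaussian is fitted to each group, and the log densities of all data points are summed. The product form over proteins, time points and replicates is an assumption here, since the slide's formula is not available. The function name, the data layout and the `min_sd` floor are also hypothetical.

```python
import math


def network_log_likelihood(x, gamma_hat, min_sd=1e-3):
    """Log-likelihood of the data given a 0/1 state matrix.

    x[i][t]          list of replicate measurements of protein i at time t
    gamma_hat[i][t]  0 (passive) or 1 (active) for protein i at time t
    """
    total = 0.0
    for xi, gi in zip(x, gamma_hat):
        for state in (0, 1):
            values = [v for reps, g in zip(xi, gi) if g == state for v in reps]
            if not values:
                continue  # protein never takes this state
            mu = sum(values) / len(values)
            var = sum((v - mu) ** 2 for v in values) / len(values)
            sd = max(math.sqrt(var), min_sd)  # assumption: floor avoids sd = 0
            total += sum(
                -0.5 * math.log(2 * math.pi * sd * sd)
                - (v - mu) ** 2 / (2 * sd * sd)
                for v in values
            )
    return total


# Toy data: 2 proteins, 4 time points, 2 replicates each.
x = [
    [[1.0, 1.1], [0.9, 1.0], [1.0, 1.1], [0.9, 1.0]],
    [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [1.1, 1.0]],
]
good = [[1, 1, 1, 1], [0, 0, 1, 1]]
bad = [[1, 1, 1, 1], [0, 1, 0, 1]]
ll_good = network_log_likelihood(x, good)
ll_bad = network_log_likelihood(x, bad)
```

A state matrix that matches the switch-on time of the second protein yields tight Gaussians and a higher log-likelihood than one that mixes low and high measurements in the same group.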
2) Genetic algorithm for optimising a population of networks
1) Selection/Crossover proportional to network likelihoods ⇒ keep 'good' networks
2) Mutation introduces randomness in network evolution
[Figure: population P of networks Φ1 to Φ4 ranked by likelihood, p(Φ1) ≥ p(Φ2) ≥ p(Φ3) ≥ p(Φ4); selection and crossing over produce a new population P'; mutation alters individual networks in P'; repeat]
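The selection, crossover and mutation loop can be sketched generically. This is an illustration under stated assumptions, not the search used in the ddepn package. Networks are encoded as adjacency matrices with entries -1, 0, 1. Parents are drawn with weights proportional to exp(score), so `score_fn` is expected to return a log-likelihood. The best network is copied unchanged into the next generation, crossover is uniform, and self-loops are forbidden. The toy score counting mismatches against a known target stands in for the real likelihood.

```python
import math
import random


def evolve(population, score_fn, rng, n_generations=50, mutation_rate=0.05):
    """Evolve adjacency matrices (lists of lists with entries -1, 0, 1)."""
    population = [[row[:] for row in net] for net in population]
    for _ in range(n_generations):
        scores = [score_fn(net) for net in population]
        best = max(scores)
        # Selection weights proportional to the likelihood exp(score).
        weights = [math.exp(s - best) for s in scores]
        elite = population[scores.index(best)]
        children = [[row[:] for row in elite]]  # assumption: keep the best network
        while len(children) < len(population):
            mother, father = rng.choices(population, weights=weights, k=2)
            # Uniform crossover: each edge comes from one of the two parents.
            child = [[m if rng.random() < 0.5 else f for m, f in zip(mrow, frow)]
                     for mrow, frow in zip(mother, father)]
            for i, row in enumerate(child):
                for j in range(len(row)):
                    if i == j:
                        row[j] = 0  # assumption: no self-loops
                    elif rng.random() < mutation_rate:
                        row[j] = rng.choice((-1, 0, 1))  # mutation
            children.append(child)
        population = children
    scores = [score_fn(net) for net in population]
    best = max(scores)
    return population[scores.index(best)], best


# Toy score: negative number of entries that differ from a known target network.
target = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]


def toy_score(net):
    return -sum(a != b for r1, r2 in zip(net, target) for a, b in zip(r1, r2))


start = [[[0] * 3 for _ in range(3)] for _ in range(6)]
best_net, best_score = evolve(start, toy_score, random.Random(1))
```

Because the best network always survives, the best score in the population never decreases from one generation to the next.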
3) Testing: Increasing the number of perturbations
• nstim: single treatments, e.g. EGF
• cstim: combined treatments, e.g. EGF+HRG
• Substantial increase of AUC by inclusion of multiple stimuli
3) Testing: Comparison to related methods G1DBN and ebdbNet
3) Resulting network from ERBB data
• Identified 13 of 22 edges in agreement with current literature
• Edges show high support from the data (see edge numbers)
3) Summary and conclusions
• Reconstruction of signalling networks under external perturbations
• Model effect of both external stimulation and inhibition
• Model activating as well as inhibiting edges
• Good theoretical performance of the algorithm
• Successful reconstruction of ERBB signalling interactions from RPPA data
• R-package ddepn available on CRAN: http://cran.r-project.org/
Acknowledgements
MGA – Lab work
• Frauke Henjes
• Ulrike Korf
• Vivian Szabo
University Medicine Göttingen
• Tim Beißbarth, for supervision of my PhD thesis
MGA – Biostatistics & Modelling
• Anika Jöcker
ECCB10 Travel Fellowship
Cancer Genome Research - Division of Molecular Genetics
• Maria Fälth
• Marc Johannes
• Stephan Gade
Bonn-Aachen international center for IT
• Holger Fröhlich