Population Markov Chain Monte Carlo and Genetic Networks
Fujun Ye
MSc in Artificial Intelligence
Supervised by Dirk Husmeier
Outline
• Introduction
• MCMCMC
• MCMCMC for missing values
• Result evaluation (complete data)
• Result evaluation (missing values)
• Summary
Introduction
• Genetic networks
• Clustering and differential equations
• Bayesian networks
• MCMC
Genetic Network
[Figure: a small genetic network with genes A, B and F; arrows marked + denote activation and arrows marked − denote repression.]
Clustering
Differential Equations
• Advantage: provide a detailed, mechanistic understanding of the biological system
• Shortcomings: require more data than are usually available, and the data are noisy
Inferring Bayesian Networks from Expression Data
• A Bayesian network factorises the joint distribution over the nodes:
P(X_1, X_2, ..., X_n) = \prod_{i=1}^{n} P(X_i | Pa_G(X_i))
• Example (nodes A, B, C, D, E):
P(a, b, c, d, e) = P(e | d) P(d | b, c) P(c | a) P(b | a) P(a)
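The factorisation above can be sketched directly in code. A minimal illustration; the two-node network and all CPT values below are invented for the example:

```python
from itertools import product

def joint_prob(assignment, parents, cpd):
    # P(x_1..x_n) = prod_i P(x_i | Pa(x_i)): multiply each node's CPT
    # entry for its own value given its parents' values.
    p = 1.0
    for node, pa in parents.items():
        key = tuple(assignment[q] for q in pa)
        p *= cpd[node][key][assignment[node]]
    return p

# Invented two-node network A -> B with binary states.
parents = {"A": (), "B": ("A",)}
cpd = {
    "A": {(): {0: 0.6, 1: 0.4}},
    "B": {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.2, 1: 0.8}},
}

# Sanity check: the joint sums to 1 over all assignments.
total = sum(joint_prob({"A": a, "B": b}, parents, cpd)
            for a, b in product((0, 1), repeat=2))
```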
Problems
• The number of different network structures grows super-exponentially with the number of nodes
[Figure: two sketches of the posterior P(M | D) over structures M.] Where the data set is large, the optimal structure M' is well defined; where the data set is small, there are many networks which can explain the data fairly well.
MCMC
• MCMC samples networks from their posterior distribution
P(M_k | D) = P(D | M_k) P(M_k) / \sum_i P(D | M_i) P(M_i)
• The posterior probability of a feature f is then calculated from the samples:
P(f | D) = \sum_i P(M_i | D) P(f | M_i) / \sum_i P(M_i | D)
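With structures M_1, ..., M_T drawn from P(M | D), the sums above reduce to the fraction of sampled networks containing the feature. A minimal sketch; the sampled edge sets below are invented:

```python
def feature_posterior(samples, feature):
    # Monte Carlo estimate of P(f | D): the fraction of sampled
    # network structures (here, edge sets) that contain feature f.
    return sum(feature in m for m in samples) / len(samples)

# Invented example: four sampled edge sets; the edge A -> B
# appears in three of the four samples.
samples = [{("A", "B"), ("B", "C")}, {("A", "B")}, {("A", "C")}, {("A", "B")}]
p_edge = feature_posterior(samples, ("A", "B"))
```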
• Coincidence dependence
[Figure: three example networks over nodes 1–6, showing how different structures can induce coincidentally similar dependencies in the data.]
• Escape from local optima using traditional MCMC
[Figure: a multimodal posterior P(M | D) over structures M, in which a traditional MCMC chain gets trapped in a local optimum.]
• Small step size versus big step size
[Figure: the same posterior P(M | D) over M, comparing small and big proposal step sizes.]
Problems
• Huge search space and coincidence dependence: prescreening is important!
• Local optima: the traversal operator is important!
• Fixed step size: a varied step size is more reasonable
MCMCMC
• Metropolis-coupled Markov chain Monte Carlo (MCMCMC)
• Pre-processing method
• Traversal operators
• Algorithm
• MCMCMC for missing values
MCMCMC
[Figure: three coupled chains sampling the same posterior at different temperatures, 1 = T_1 < T_2 < T_3; the hotter chains see a flattened distribution.]
• For each chain, a local move from M to M' is accepted with probability
A(M, M') = min(1, [ P(D | M') P(M') / ( P(D | M) P(M) ) ]^{1/T} · Q(M | M') / Q(M' | M))
• Chain swap: exchange the networks of chains i and k,
S_a = (M_1, ..., M_i, ..., M_k, ..., M_m) at temperatures (T_1, ..., T_i, ..., T_k, ..., T_m)
S_b = (M_1, ..., M_k, ..., M_i, ..., M_m) at the same temperatures
• Swap acceptance probability
P_a = min{ 1, ( [P(D | M_k) P(M_k)]^{1/T_i} [P(D | M_i) P(M_i)]^{1/T_k} ) / ( [P(D | M_i) P(M_i)]^{1/T_i} [P(D | M_k) P(M_k)]^{1/T_k} ) }
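The swap acceptance probability collapses neatly in log space; a sketch (the function name and interface are assumptions, not thesis code):

```python
import math

def swap_accept_prob(log_post_i, log_post_k, T_i, T_k):
    # Swap acceptance for chains i and k: the fraction above collapses
    # in log space to min(1, exp((1/T_i - 1/T_k) * (L_k - L_i))),
    # where L_* = log[P(D|M_*) P(M_*)] is each chain's log posterior.
    log_ratio = (1.0 / T_i - 1.0 / T_k) * (log_post_k - log_post_i)
    return math.exp(min(0.0, log_ratio))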
Pre-processing Method
• The marginal likelihood integrates out the parameters:
P(D | M) = \int P(D | M, θ) P(θ | M) dθ
• Penalise complex models. With Dirichlet hyperparameters α_{n v π_n}, where α_{n π_n} = \sum_v α_{n v π_n} (and counts N_{n π_n} = \sum_v N_{n v π_n} analogously), the marginal likelihood is
p(D | M) = \prod_n \prod_{π_n} [ Γ(α_{n π_n}) / Γ(α_{n π_n} + N_{n π_n}) ] \prod_v [ Γ(α_{n v π_n} + N_{n v π_n}) / Γ(α_{n v π_n}) ]
The log likelihood is
log p(D | M) = \sum_n score(n, π_n, D)
where
score(n, π_n, D) = \sum_{π_n} [ log Γ(α_{n π_n}) − log Γ(α_{n π_n} + N_{n π_n}) ] + \sum_{π_n} \sum_v [ log Γ(α_{n v π_n} + N_{n v π_n}) − log Γ(α_{n v π_n}) ]
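The local score can be evaluated with log-gamma functions; a minimal sketch assuming a uniform Dirichlet pseudo-count alpha per cell (the thesis may use other BDe hyperparameters):

```python
from math import lgamma

def local_score(counts, alpha):
    # score(n, pi_n, D) for one node: counts[j][v] = N_{n v pi} for
    # parent configuration j and node value v; alpha is a uniform
    # Dirichlet pseudo-count per cell (an assumption here).
    s = 0.0
    for row in counts:
        a_j = alpha * len(row)          # alpha_{n pi} = sum_v alpha_{n v pi}
        s += lgamma(a_j) - lgamma(a_j + sum(row))
        for n_v in row:
            s += lgamma(alpha + n_v) - lgamma(alpha)
    return s
```

As a check, one observation of a binary node with alpha = 1 gives log(1/2), the marginal likelihood of a single symmetric coin flip.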
• Impose a maximum fan-in
• Enumerate all possible parent configurations for each node and delete the low-scoring ones
• Keep at most C parent configurations for each node and cardinality; the threshold is set as
θ = λ (scoresh − scoresl) / m + scoresl / m
[Figure: the distribution of score(n, π, D) across parent configurations π. When the data are quite sparse and noisy, the best configuration π' barely stands out; after the pre-screening method, only the high-scoring configurations remain.]
Traversal Operators
• Importance sampling: sample a parent configuration for node i from
P(π_{n_i} = π_j) = ( score(i, π_j, D) + C ) / \sum_{k=1}^{n} ( score(i, π_k, D) + C )
This gives the proposal ratio
Q(M_old | M_new) / Q(M_new | M_old) = [ ( score(i, π_{k_old}, D) + C ) / ( \sum_{k ≠ k_old} score(i, π_k, D) + (n − 1) C ) ] / [ ( score(i, π_{k_new}, D) + C ) / ( \sum_{k ≠ k_new} score(i, π_k, D) + (n − 1) C ) ]
and the likelihood ratio
P(D | M_new) / P(D | M_old) = exp( score(i, π_{k_new}, D) − score(i, π_{k_old}, D) )
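The importance-sampling proposal is a weighted draw over parent configurations; a minimal sketch (it assumes C is large enough that every smoothed score is positive, and the toy scores below are invented):

```python
import random

_rng = random.Random(1)

def sample_parent_config(scores, C, rng=_rng):
    # Draw configuration index j with probability
    # (score_j + C) / sum_k (score_k + C), as in the proposal above.
    # Assumes score_j + C > 0 for every j.
    weights = [s + C for s in scores]
    r = rng.random() * sum(weights)
    for j, w in enumerate(weights):
        r -= w
        if r <= 0.0:
            return j
    return len(scores) - 1
```

With scores [1.0, 3.0] and C = 0 the second configuration is proposed three times as often as the first.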
• DIN sampling: applied when the proposed network is loopy.
[Figure: a three-node example. The proposed model introduces a cycle among nodes 1, 2 and 3; DIN sampling repairs it in two steps (Step 1, Step 2) to give a legal network.]
The acceptance probability for a DIN move is
A(M_new, M_old) = min(1, [ P(D | M_new) P(M_new) Q(M_old | M_new) ] / [ P(D | M_old) P(M_old) Q(M_new | M_old) ])
with likelihood ratio
P(D | M_new) / P(D | M_old) = exp( \sum_{j=1}^{n} score(n_j, π_new(n_j), D) − \sum_{j=1}^{n} score(n_j, π_old(n_j), D) )
and proposal ratio
Q(M_old | M_new) / Q(M_new | M_old) = [ ( score(i, π_{k_old}, D) + C ) / ( \sum_{k ≠ k_old} score(i, π_k, D) + (n − 1) C ) ] / [ ( score(i, π_{k_new}, D) + C ) / ( \sum_{k ≠ k_new} score(i, π_k, D) + (n − 1) C ) ]
I simply use an approximation for the proposal ratio, since the exact proposal probability is quite time-consuming to calculate.
[Figure: comparison of the DIN proposal with traditional MCMC.]
Algorithm
• Initialisation
• Each iteration:
  • Move a step for every chain
  • Chain swap
• Keep the first (T = 1) chain only
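The steps above can be sketched generically. The interfaces (log_post, propose, init) and the symmetric proposal are assumptions for illustration, not the thesis code:

```python
import math
import random

def mc3(log_post, propose, init, temps, iters, seed=0):
    # Metropolis-coupled MCMC sketch:
    #   log_post(state) -> log[P(D|M) P(M)]
    #   propose(state, rng) -> candidate state (assumed symmetric)
    #   init() -> starting state
    rng = random.Random(seed)
    chains = [init() for _ in temps]
    samples = []
    for _ in range(iters):
        # 1) One Metropolis move per chain, tempered by 1/T.
        for c, T in enumerate(temps):
            cand = propose(chains[c], rng)
            log_a = (log_post(cand) - log_post(chains[c])) / T
            if math.log(rng.random()) < log_a:
                chains[c] = cand
        # 2) Attempt one swap between two randomly chosen chains.
        i, k = rng.sample(range(len(temps)), 2)
        log_a = (1.0 / temps[i] - 1.0 / temps[k]) * (
            log_post(chains[k]) - log_post(chains[i]))
        if math.log(rng.random()) < log_a:
            chains[i], chains[k] = chains[k], chains[i]
        # 3) Keep only the first (T = 1) chain's state.
        samples.append(chains[0])
    return samples
```

On a toy two-state target the cold chain quickly settles on the high-posterior state even when started in the wrong one.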
[Figure: flow chart of the algorithm. Each of the m chains M_1, ..., M_m (temperatures T >= 1, with T = 1 for the first chain) proposes a new network M_i' by importance sampling; illegal (loopy) proposals are repaired by DIN sampling. A move is accepted with probability A(M_i', M_i), and a chain swap S -> S' with probability P_a(S', S).]
MCMCMC for Missing Values
[Figure: an example data matrix over variables X1–X5 with missing entries marked "?", together with the corresponding completed data set in which the missing values have been imputed.]
[Figure: flow chart of the missing-values algorithm. Each chain i now carries a pair (M_i, D_i). As before, structure moves proposed by importance sampling (with DIN sampling for illegal, loopy proposals) are accepted with probability A(M_i', M_i | D_i), and chain swaps with probability P_a(S', S); in addition, the missing part of the data D_mi is resampled and accepted with probability A(D_i', D_i | G_i) given the observed data.]
• Proposal method before burn-in (add-one-smoothed counts):
Q(v | M, n) = (N_{n v} + 1) / \sum_v (N_{n v} + 1)
Q(v | M, n, m) = (N_{n v π_n m} + 1) / \sum_v (N_{n v π_n m} + 1)
Q(v_n, v_mis | M, n, m) = (N_{n v π_{n,mis} m} + 1) / \sum_{v_n, v_mis} (N_{n v π_{n,mis} m} + 1)
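The first of these proposals is just a draw from the add-one-smoothed frequencies of the values observed for that node; a minimal sketch (the parent-conditioned variants refine the counts in the same way):

```python
import random

_rng = random.Random(2)

def propose_missing_value(counts, rng=_rng):
    # Fill one missing entry of node n by drawing value v with
    # probability (N_nv + 1) / sum_v' (N_nv' + 1), where counts[v]
    # is the number of times value v was observed for this node.
    weights = [c + 1 for c in counts]
    r = rng.random() * sum(weights)
    for v, w in enumerate(weights):
        r -= w
        if r <= 0.0:
            return v
    return len(counts) - 1
```

With counts [9, 0] the first value is proposed with probability 10/11, so the second value remains reachable despite never having been observed.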
[Figure: an example data matrix over X1–X5 with missing entries marked "?", and one possible completion proposed for it.]
Acceptance probability
Accept(MissVal', MissVal) = min(1, [ Q(MissVal | MissVal') P(D' | M) ] / [ Q(MissVal' | MissVal) P(D | M) ])
• After burn-in, the proposals for the missing values are
Q(MissVal' | MissVal) = \prod_i (N_{i,new} + 1) / \sum_{j ∈ Ω(c_mis,i)} (N_{ij} + 1)
Q(MissVal | MissVal') = \prod_i (N'_{i,old} + 1) / \sum_{j ∈ Ω(c_mis,i)} (N'_{ij} + 1)
Acceptance probability
Accept(MissVal', MissVal) = min(1, [ Q(MissVal | MissVal') P(D' | M) ] / [ Q(MissVal' | MissVal) P(D | M) ])
Result Evaluation (complete data)
• ROC curve
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
complementary specificity = fp / (tn + fp) = 1 − specificity
where tp is the number of true positive edges, fn the number of false negative edges, fp the number of false positive edges, and tn the number of true negative edges.
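These quantities can be computed directly from edge sets; a minimal sketch (the three-node example below is invented):

```python
def edge_confusion(true_edges, predicted_edges, all_pairs):
    # ROC quantities for structure recovery: compare a predicted edge
    # set against the true network over all candidate directed pairs.
    tp = len(true_edges & predicted_edges)              # true positives
    fn = len(true_edges - predicted_edges)              # false negatives
    fp = len(predicted_edges - true_edges)              # false positives
    tn = len(all_pairs - true_edges - predicted_edges)  # true negatives
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity, 1.0 - specificity

# Invented example: three nodes, six candidate directed edges.
nodes = "ABC"
all_pairs = {(a, b) for a in nodes for b in nodes if a != b}
true_edges = {("A", "B"), ("B", "C")}
predicted = {("A", "B"), ("A", "C")}
result = edge_confusion(true_edges, predicted, all_pairs)
```

Varying the threshold on posterior edge probabilities and plotting sensitivity against complementary specificity traces out the ROC curve.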
• Model Genetic Network
• MCMCMC against order MCMC
• MCMCMC against structure MCMC
• MCMCMC against population MCMC
• Temperatures = [1, 1, 3, 9, 30]
• Keep at most 10 parent configurations for each node and cardinality
• 60000 iterations: 30000 burn-in, keeping the last 30000 samples
• Alarm Network
• Arabidopsis data
Result Evaluation (missing values)
• Model Genetic Network
• Before burn-in: 30000 burn-in, 30000 iterations
• After burn-in: 40000 iterations
• Temperatures = [1, 1, 3, 9, 12]
The ROC curves for noise = 0.2, data = 200, with different missing rates
• Temperatures = [1, 1, 3, 9, 12]
• 30000 burn-in and 30000 iterations
• Every 10 steps, keep one sample (before-burn-in algorithm)
• B cell Lymphoma data