Cross-organism prediction of drug hepatotoxicity by sparse group factor analysis
Tommi Suvitaival, Juuso A. Parkkinen, Seppo Virtanen, Samuel Kaski
July 19-20, 2013, CAMDA
Starting point
High-dimensional gene-expression data from 3 types of organisms (views): 1. Human in vitro, 2. Rat in vitro, 3. Rat in vivo
Sparse pathological data of rat in vivo
[Figure: gene-expression matrix (treatments × views 1, 2, 3) and pathology matrix (treatments × finding types), with each pathology entry marked Found or Not found]
Finding types: Ground glass appearance; Fibrosis; Mineralization; Hematopoiesis, extramedullary; Degeneration, hydropic; Vacuolization, nuclear; Change, acidophilic; Deposit, lipid; Atypia, nuclear; Degeneration, acidophilic, eosinophilic; Nodule, hepatodiaphragmatic; Proliferation, Kupffer cell; Cellular infiltration, mononuclear cell; Anisonucleosis; Change, basophilic; Proliferation; Edema; Degeneration, granular, eosinophilic; DEAD; Deposit, glycogen; Vacuolization, cytoplasmic; Swelling; Single cell necrosis; Hypertrophy; Microgranuloma; Change, eosinophilic; Cellular infiltration; Increased mitosis; Necrosis
Starting point (continued)
1. Can we replace the animal study with an in vitro assay?
2. Can we predict liver injury in humans using toxicogenomics data from animals?
[Figure: component activity, on a scale from 0 to 1, for the views Human in vitro, Rat in vitro and Rat in vivo across components]
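Group factor analysis (GFA)
Observed data ≈ Latent variables × Factor loadings
[Figure: observed data (treatments × variables of views 1, 2, 3) ≈ latent variables (treatments × components) × factor loadings (components × variables of views 1, 2, 3). Components are active in A) all views, B) a subset of views, C) a single view. Legend: real numbers, zero.]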
Making generalizations across organisms
[Figure: component activity, on a scale from 0 to 1, for the views Human in vitro, Rat in vitro and Rat in vivo across components]
Shared components
◮ associations between views
◮ cross-view prediction
GFA with sparsity (1)
[Figure: GFA compared with GFA with sparsity. In both, observed data (treatments × variables of views 1, 2, 3) ≈ latent variables (treatments × components) × factor loadings (components × variables of views 1, 2, 3), with components active in A) all views, B) a subset of views, C) a single view. Legend: real numbers, zero.]
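GFA with and without sparsity
[Figure: the factorization observed data ≈ latent variables × factor loadings, shown for GFA and for GFA with sparsity]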
GFA with sparsity (2)
[Figure: GFA compared with GFA with sparsity. In both, observed data (treatments × variables of views 1, 2, 3) ≈ latent variables (treatments × components) × factor loadings (components × variables of views 1, 2, 3), with components active in A) all views, B) a subset of views, C) a single view. Legend: real numbers, zero.]
$[X^{(1)}; X^{(2)}; X^{(3)}] \approx Z \times [W^{(1)}; W^{(2)}; W^{(3)}]$
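The factorization on this slide can be sketched numerically. The following is a minimal illustration, not the authors' implementation: the sizes, the noise level and the activity pattern are all assumed for the example. It builds one loading block per view, zeroes the rows of components that are inactive in a view, and checks that the column-wise concatenation of the views is approximated by Z times the concatenated loadings.

```python
import numpy as np

rng = np.random.default_rng(0)

N, K = 20, 3          # treatments (samples) and components; illustrative sizes
dims = [5, 4, 6]      # number of variables in each of the 3 views; illustrative

# Component-by-view activity (assumed pattern for the example):
# component 0 is active in A) all views, component 1 in B) a subset of views,
# component 2 in C) a single view.
activity = np.array([[1, 1, 1],
                     [1, 1, 0],
                     [0, 0, 1]])

Z = rng.normal(size=(N, K))   # latent variables, treatments x components

# Factor loadings per view; rows of inactive components are exactly zero.
W = [rng.normal(size=(K, d)) * activity[:, [m]] for m, d in enumerate(dims)]

# Each observed view is Z times its own loading block plus a little noise.
X = [Z @ W_m + 0.1 * rng.normal(size=(N, W_m.shape[1])) for W_m in W]

# Concatenating the views column-wise gives [X1 X2 X3] ≈ Z [W1 W2 W3].
X_all = np.hstack(X)
W_all = np.hstack(W)
```

A component whose loading rows are nonzero in several blocks of `W_all` is what the slides call a shared component.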
Sparsity – why
Sparsity in the model is encouraged due to
1. High dimensionality of the gene expression microarray data sets ⇒ Sparsity in terms of variables
2. Strong sparsity of the pathology data
3. Treatments heterogeneous by their effects ⇒ Sparsity in terms of samples
Sparsity – how
1. Sparsity in terms of variables ⇒ Spike-and-slab prior ∗ for factor loadings matrix W
2. Sparsity in terms of samples ⇒ Spike-and-slab prior for latent variables Z
∗ [Figure: probability density against value, with the value 0 marked on the axis]
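A spike-and-slab draw can be sketched as follows. This is an illustrative sampler only: the function name `sample_spike_and_slab`, the inclusion probability `pi` and the slab precision `alpha` are assumptions made for the example, since the slides do not state how the binary indicators are distributed.

```python
import numpy as np

def sample_spike_and_slab(shape, pi, alpha, rng):
    """Draw values that are exactly zero with probability 1 - pi (the spike)
    and Gaussian with precision alpha otherwise (the slab)."""
    H = rng.random(shape) < pi                                  # binary inclusion indicators
    slab = rng.normal(scale=1.0 / np.sqrt(alpha), size=shape)   # slab draws for every entry
    return np.where(H, slab, 0.0), H

rng = np.random.default_rng(1)

# Sparse factor loadings: most variables are switched off in each component.
W, H_w = sample_spike_and_slab((3, 1000), pi=0.2, alpha=1.0, rng=rng)

# Sparse latent variables: a treatment takes part in only some components.
Z, H_z = sample_spike_and_slab((119, 3), pi=0.5, alpha=1.0, rng=rng)
```

Unlike a purely continuous shrinkage prior, the spike produces exact zeros, which is what makes the sparsity pattern directly readable from W and Z.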
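GFA – model
[Figure: observed data (treatments × variables of views 1, 2, 3) ≈ latent variables (treatments × components) × factor loadings (components × variables of views 1, 2, 3). Components are active in A) all views, B) a subset of views, C) a single view. Legend: real numbers, zero.]
$[X^{(1)}; X^{(2)}; X^{(3)}] \approx Z \times [W^{(1)}; W^{(2)}; W^{(3)}]$
$x^{(m)}_{i\cdot} \sim \mathcal{N}\left(z_{i\cdot} W^{(m)}, \tau_m^{-1} I\right)$
$z_{i\cdot} \sim \mathcal{N}(0, I)$
$w^{(m)}_{k\cdot} \sim \mathcal{N}\left(0, \frac{1}{\alpha^{(m)}_k} I\right)$
$\alpha^{(m)}_k \sim \mathrm{Gamma}\left(a^{(\alpha)}, b^{(\alpha)}\right)$
$\tau_m \sim \mathrm{Gamma}\left(a^{(\tau)}, b^{(\tau)}\right)$
i = 1...N: samples, m = 1...M: views
[Figure: plate diagram with nodes $z_{i\cdot}$, $x^{(m)}_{i\cdot}$, $W^{(m)}$, $\alpha^{(m)}$, $\tau$ and hyperparameters $a^{(\alpha)}$, $b^{(\alpha)}$, $a^{(\tau)}$, $b^{(\tau)}$; plates over i = 1...N and m = 1...M]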
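A short worked step shows why components shared between views carry the associations between them. It uses only the Gaussian model on this slide, with $z_{i\cdot}$ treated as a row vector; the noise term $\varepsilon^{(m)}_{i\cdot}$ is notation introduced here for the derivation.

```latex
% Write each view as signal plus noise:
%   x^{(m)}_{i\cdot} = z_{i\cdot} W^{(m)} + \varepsilon^{(m)}_{i\cdot},
%   \varepsilon^{(m)}_{i\cdot} \sim \mathcal{N}(0, \tau_m^{-1} I),
%   z_{i\cdot} \sim \mathcal{N}(0, I).
\begin{align}
\operatorname{Cov}\left(x^{(m)}_{i\cdot}\right)
  &= W^{(m)\top} W^{(m)} + \tau_m^{-1} I, \\
\operatorname{Cov}\left(x^{(m)}_{i\cdot},\, x^{(m')}_{i\cdot}\right)
  &= W^{(m)\top} W^{(m')}
   = \sum_{k} w^{(m)\top}_{k\cdot}\, w^{(m')}_{k\cdot},
  \qquad m \neq m'.
\end{align}
```

Component k contributes to the cross-view covariance only when its loading rows are nonzero in both views, so a component active in a single view explains variation within that view without creating any dependence between views.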
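GFA with sparsity – model
[Figure: GFA compared with GFA with sparsity. In both, observed data (treatments × variables of views 1, 2, 3) ≈ latent variables (treatments × components) × factor loadings (components × variables of views 1, 2, 3), with components active in A) all views, B) a subset of views, C) a single view. Legend: real numbers, zero.]
$[X^{(1)}; X^{(2)}; X^{(3)}] \approx Z \times [W^{(1)}; W^{(2)}; W^{(3)}]$
GFA:
$x^{(m)}_{i\cdot} \sim \mathcal{N}\left(z_{i\cdot} W^{(m)}, \tau_m^{-1} I\right)$
$z_{i\cdot} \sim \mathcal{N}(0, I)$
$w^{(m)}_{k\cdot} \sim \mathcal{N}\left(0, \frac{1}{\alpha^{(m)}_k} I\right)$
GFA with sparsity:
$x^{(m)}_{i\cdot} \sim \mathcal{N}\left(z_{i\cdot} W^{(m)}, \left(\Lambda^{(m)}\right)^{-1}\right)$
$z_{ik} \sim H^{(z)}_{ik}\, \mathcal{N}\left(0, \frac{1}{\alpha^{(z)}_k}\right) + \left(1 - H^{(z)}_{ik}\right) \delta_0$
$W^{(m)}_{dk} \sim H^{(m)}_{dk}\, \mathcal{N}\left(0, \frac{1}{\alpha^{(m)}_{dk}}\right) + \left(1 - H^{(m)}_{dk}\right) \delta_0$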
Data representation – gene expression
◮ Treatments that occur in all 3 types of organism:
  ◮ 119 compounds
  ◮ dosage levels middle & high
  ◮ time points 8/9 h & 24 h
◮ Average differential expression over the replicates of each treatment
⇒ Treatment = sample for the model
⇒ Matching treatments between the 3 transcriptomic views $X^{\text{human}}_{\text{in vitro}}$, $X^{\text{rat}}_{\text{in vitro}}$ and $X^{\text{rat}}_{\text{in vivo}}$
[Figure: data matrix of treatments × views 1. Human in vitro, 2. Rat in vitro, 3. Rat in vivo]
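The averaging step can be sketched with pandas. The table layout, the column names (`compound`, `dose`, `time_h`, `gene`, `log_ratio`) and the values are hypothetical and chosen only to illustrate how replicates collapse into one row per treatment.

```python
import pandas as pd

# Hypothetical long-format table: one row per replicate measurement of a gene.
replicates = pd.DataFrame({
    "compound":  ["A", "A", "A", "A", "B", "B"],
    "dose":      ["high"] * 6,
    "time_h":    [24] * 6,
    "gene":      ["g1", "g1", "g2", "g2", "g1", "g2"],
    "log_ratio": [1.0, 3.0, -1.0, 1.0, 0.5, 0.25],   # differential expression
})

# One treatment = one (compound, dose, time point) combination.
treatment_keys = ["compound", "dose", "time_h"]

# Average over replicates, then pivot genes into columns:
# rows are treatments (the samples of the model), columns are genes.
X_view = (
    replicates
    .groupby(treatment_keys + ["gene"])["log_ratio"]
    .mean()
    .unstack("gene")
)
```

Building each transcriptomic view this way and keeping only the treatments whose index entries appear in all three tables gives the matched rows that the model requires.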
Data representation – histopathology of the liver
Grade-weighted count of each pathological finding type over the replicates of a treatment
⇒ Pathology view $Y^{\text{rat}}_{\text{in vivo}}$ with matching treatments to the 3 transcriptomic views
[Figure: pathology matrix of treatments × finding types, with each entry marked Found or Not found]
Finding types: Ground glass appearance; Fibrosis; Mineralization; Hematopoiesis, extramedullary; Degeneration, hydropic; Vacuolization, nuclear; Change, acidophilic; Deposit, lipid; Atypia, nuclear; Degeneration, acidophilic, eosinophilic; Nodule, hepatodiaphragmatic; Proliferation, Kupffer cell; Cellular infiltration, mononuclear cell; Anisonucleosis; Change, basophilic; Proliferation; Edema; Degeneration, granular, eosinophilic; DEAD; Deposit, glycogen; Vacuolization, cytoplasmic; Swelling; Single cell necrosis; Hypertrophy; Microgranuloma; Change, eosinophilic; Cellular infiltration; Increased mitosis; Necrosis
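A grade-weighted count can be sketched as below. The slides do not give the weights, so the `GRADE_WEIGHT` mapping, the grade labels and the example records are assumptions made purely for illustration.

```python
import pandas as pd

# Assumed numeric weight per severity grade; the actual weights are not stated.
GRADE_WEIGHT = {"minimal": 1, "slight": 2, "moderate": 3, "severe": 4}

# Hypothetical records: one row per finding observed in one replicate animal.
findings = pd.DataFrame({
    "treatment": ["T1", "T1", "T1", "T2"],
    "finding":   ["Necrosis", "Necrosis", "Hypertrophy", "Necrosis"],
    "grade":     ["slight", "severe", "minimal", "minimal"],
})
findings["weight"] = findings["grade"].map(GRADE_WEIGHT)

# Sum the weights over the replicates of each treatment; finding types that
# were never observed for a treatment become 0 ("Not found").
Y_view = (
    findings
    .groupby(["treatment", "finding"])["weight"]
    .sum()
    .unstack("finding", fill_value=0)
)
```

Most entries of such a matrix are zero, which is the strong sparsity of the pathology data that motivates the sparse priors.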
Results
Our tasks:
1. Predict liver damage of rats in vivo based on cell-level transcriptomic responses in the 3 types of model organisms
2. Test how well the transcriptomic cell-level responses generalize to known effects of the compounds on humans
Analysis: model organisms’ generalizability to organ level
Training: Learn associations between the views
◮ 3 transcriptomic views $X^{\text{human}}_{\text{in vitro}}$, $X^{\text{rat}}_{\text{in vitro}}$ and $X^{\text{rat}}_{\text{in vivo}}$
◮ Pathology view $Y^{\text{rat}}_{\text{in vivo}}$
Testing: Predict the pathological findings $Y^{\text{rat}}_{\text{in vivo}}$
◮ Given one of the transcriptomic views
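The testing step can be sketched as a cross-view prediction through the latent variables. This is a simplified illustration under stated assumptions, not the authors' inference procedure: it treats the loadings and the noise precision as known point estimates, uses the non-sparse Gaussian prior z ~ N(0, I), and the function name `predict_view` and the simulated data are hypothetical.

```python
import numpy as np

def predict_view(X_src, W_src, tau_src, W_tgt):
    """Predict a target view from one source view.

    With z ~ N(0, I) and x = z W + noise of precision tau, the posterior mean
    of z given x is tau * x W^T (I + tau * W W^T)^{-1}; the prediction is that
    mean multiplied by the target view's loadings.
    """
    K = W_src.shape[0]
    precision = np.eye(K) + tau_src * W_src @ W_src.T        # K x K posterior precision
    # Solve the linear system rather than forming the inverse explicitly.
    Z_mean = np.linalg.solve(precision, tau_src * W_src @ X_src.T).T
    return Z_mean @ W_tgt

# Simulated example with assumed sizes.
rng = np.random.default_rng(2)
N, K, D_x, D_y = 50, 2, 200, 8
Z = rng.normal(size=(N, K))
W_x = rng.normal(size=(K, D_x))      # loadings of a transcriptomic view
W_y = rng.normal(size=(K, D_y))      # loadings of the pathology view
tau = 10.0
X = Z @ W_x + rng.normal(scale=tau ** -0.5, size=(N, D_x))
Y = Z @ W_y                          # noise-free target, for checking only

Y_hat = predict_view(X, W_x, tau, W_y)
```

Only components with nonzero loadings in both the source and the target view contribute to `Y_hat`, which is why shared components are what make cross-view prediction possible.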
Analysis: model organisms’ generalizability to organ level (continued)
[Figure: Relative performance of the gene expression views at predicting pathological findings. Vertical axis: proportion of test samples predicted more accurately than by other views, from 0.0 to 1.0. Horizontal axis: Hypertrophy; Swelling; Nodule, hepatodiaphragmatic; Microgranuloma; Necrosis; Single cell necrosis; Cellular infiltration; Vacuolization, cytoplasmic. Legend: Human in vitro, Rat in vitro, Rat in vivo.]
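The quantity on the vertical axis of the figure can be sketched as follows. The exact definition used by the authors is not given on the slide, so this assumes that a view wins a test sample when its absolute error is strictly smaller than that of every other view; the function name `win_proportions` and the toy numbers are hypothetical.

```python
import numpy as np

def win_proportions(Y_true, predictions):
    """For each view, the proportion of test samples (per finding type) that
    the view predicts more accurately than every other view.

    predictions maps a view name to an array of shape (samples, findings).
    """
    names = list(predictions)
    # Absolute errors stacked as views x samples x findings.
    errors = np.stack([np.abs(predictions[n] - Y_true) for n in names])
    is_best = errors == errors.min(axis=0)
    # Ties are not counted as a win for any view.
    unique_win = is_best & (is_best.sum(axis=0) == 1)
    return {n: unique_win[i].mean(axis=0) for i, n in enumerate(names)}

# Toy example: 4 test samples, 1 finding type.
Y_true = np.array([[1.0], [0.0], [2.0], [1.0]])
predictions = {
    "human in vitro": np.array([[1.0], [0.5], [1.0], [0.0]]),
    "rat in vitro":   np.array([[0.0], [0.0], [1.5], [0.0]]),
    "rat in vivo":    np.array([[3.0], [1.0], [0.0], [0.0]]),
}
result = win_proportions(Y_true, predictions)
```

In the toy data the fourth sample is a three-way tie and counts for no view, so the proportions need not sum to one.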