SNeCT: Integrative cancer data analysis via large scale network constrained Tucker decomposition
Dongjin Choi and Lee Sael

Motivation
Q: How can we characterize cancer patients?
A: The Cancer Genome Atlas (TCGA) Pan-Cancer data provide rich data across 12 tumor types.
[Figure: the 12 tumor types]
Sources: Mary Goldman, UCSC Cancer Browser Workshop (2015); John N. Weinstein et al., Nat Genet 45(10), 1113-1120 (2013), doi:10.1038/ng.2764

Motivation
How can we provide an integrated analysis of multi-dimensional data?
The Pan-Cancer12 data are multi-platform: gene expression, DNA methylation, copy number variation, and mutation.
Source: Mary Goldman, UCSC Cancer Browser Workshop (2015)

Motivation
How can we build a combined model that exploits gene networks?
Gene association networks provide gene similarity information (e.g. common pathways).
Source: John N. Weinstein et al., Nat Genet 45(10), 1113-1120 (2013), doi:10.1038/ng.2764

Overview: Introduction, Problem definition, Proposed method, Experiments, Conclusion

Tensor
A tensor is a multi-dimensional array.
The Pan-Cancer12 data are represented as a 3-D tensor.
[Figure: 3-D tensor with axes Patients, Genes, and Observations; example entries 0.12, -0.3, 0.82]

Tensor Factorization
Given a tensor, decompose it into a core tensor and factor matrices whose product approximates the original tensor.
Two common models are CP decomposition and Tucker decomposition (HOSVD).
[Figure: a tensor $\mathcal{Y}$ approximated by a CP decomposition and by a Tucker decomposition]

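To make the relation between the two models concrete, here is a minimal NumPy sketch (not from the slides). It assumes hypothetical sizes and random factor matrices, and shows that a Tucker model whose core tensor is superdiagonal reproduces the CP model.

```python
import numpy as np

rng = np.random.default_rng(0)
J, K, L, rank = 4, 5, 3, 2          # hypothetical sizes, for illustration only

# Factor matrices: one row per index, one column per latent component.
B = rng.standard_normal((J, rank))
C = rng.standard_normal((K, rank))
D = rng.standard_normal((L, rank))

# CP: a sum of `rank` outer products of matching columns.
Y_cp = np.einsum("jq,kq,lq->jkl", B, C, D)


def tucker_reconstruct(G, B, C, D):
    # Tucker: the core G weights every combination (q, r, s) of columns.
    return np.einsum("qrs,jq,kr,ls->jkl", G, B, C, D)


# A superdiagonal core (ones where q == r == s) reduces Tucker to CP.
G_diag = np.zeros((rank, rank, rank))
idx = np.arange(rank)
G_diag[idx, idx, idx] = 1.0
Y_tucker = tucker_reconstruct(G_diag, B, C, D)

print(np.allclose(Y_cp, Y_tucker))
```

A dense core lets every column of one factor interact with every column of the others, which is what allows different ranks per mode.
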
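Tucker Decomposition
Tucker decomposition (Tucker, 1966) is a widely used tensor factorization method.
Given a tensor, Tucker decomposition factorizes it into the product of a core tensor $\mathcal{G}$ and orthogonal factor matrices $\mathbf{B}$, $\mathbf{C}$, $\mathbf{D}$:
$$\mathcal{Y} \approx \tilde{\mathcal{Y}} = \mathcal{G} \times_1 \mathbf{B} \times_2 \mathbf{C} \times_3 \mathbf{D} \quad \text{s.t.} \quad \mathbf{B}^\top \mathbf{B} = \mathbf{C}^\top \mathbf{C} = \mathbf{D}^\top \mathbf{D} = \mathbf{I}$$
Elementwise,
$$y_{jkl} \approx \mathcal{G} \times_1 \mathbf{b}_j \times_2 \mathbf{c}_k \times_3 \mathbf{d}_l$$
where $\mathbf{b}_j$ is the $j$-th row of $\mathbf{B}$, $\mathbf{c}_k$ is the $k$-th row of $\mathbf{C}$, and $\mathbf{d}_l$ is the $l$-th row of $\mathbf{D}$.
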
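The elementwise form can be checked numerically. The sketch below is an illustration under assumed (hypothetical) dimensions and ranks: it builds factor matrices with orthonormal columns via QR, evaluates one entry from the rows $\mathbf{b}_j$, $\mathbf{c}_k$, $\mathbf{d}_l$, and compares it with the full reconstruction. The symbol `G` stands for the core tensor.

```python
import numpy as np

rng = np.random.default_rng(1)
J, K, L = 6, 5, 4                 # tensor dimensions (hypothetical)
Q, R, S = 3, 2, 2                 # Tucker ranks (hypothetical)


def random_orthonormal(rows, cols):
    # Reduced QR of a random matrix yields orthonormal columns.
    q, _ = np.linalg.qr(rng.standard_normal((rows, cols)))
    return q


B = random_orthonormal(J, Q)
C = random_orthonormal(K, R)
D = random_orthonormal(L, S)
G = rng.standard_normal((Q, R, S))


def predict_entry(G, b_j, c_k, d_l):
    # G x1 b_j x2 c_k x3 d_l = sum_{q,r,s} G[q,r,s] * b_j[q] * c_k[r] * d_l[s]
    return np.einsum("qrs,q,r,s->", G, b_j, c_k, d_l)


# Full reconstruction and a single entry computed from three rows.
Y_tilde = np.einsum("qrs,jq,kr,ls->jkl", G, B, C, D)
j, k, l = 2, 3, 1
entry = predict_entry(G, B[j], C[k], D[l])

# With orthonormal columns, projecting back recovers the core exactly.
G_back = np.einsum("jkl,jq,kr,ls->qrs", Y_tilde, B, C, D)

print(np.isclose(entry, Y_tilde[j, k, l]), np.allclose(G_back, G))
```

The last step illustrates why the orthogonality constraint is convenient: the core is obtained from the tensor by multiplying each mode with the transposed factor matrix.
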
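Tucker Decomposition (cont.)
Formal problem definition
Given a 3-D tensor $\mathcal{Y} \in \mathbb{R}^{J \times K \times L}$ with observable entries $\{y_{jkl} \mid (j,k,l) \in \Omega_{\mathcal{Y}}\}$, the rank-$[Q, R, S]$ factorization of $\mathcal{Y}$ finds the core tensor $\mathcal{G}$ and factor matrices $\{\mathbf{B}, \mathbf{C}, \mathbf{D}\}$ that minimize the following loss function, where $S(\cdot)$ denotes the regularization term (not the rank $S$):
$$g(\mathcal{G}, \mathbf{B}, \mathbf{C}, \mathbf{D}) = \frac{1}{2} \left\| \mathcal{Y} - \tilde{\mathcal{Y}} \right\|_F^2 + \frac{\mu}{2} S(\mathcal{G}, \mathbf{B}, \mathbf{C}, \mathbf{D})$$
$$= \frac{1}{2} \sum_{(j,k,l) \in \Omega_{\mathcal{Y}}} \left( y_{jkl} - \mathcal{G} \times_1 \mathbf{b}_j \times_2 \mathbf{c}_k \times_3 \mathbf{d}_l \right)^2 + \frac{\mu}{2} S(\mathcal{G}, \mathbf{B}, \mathbf{C}, \mathbf{D})$$
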
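A minimal sketch of evaluating this loss over observed entries only follows. It assumes that the regularizer is the sum of squared Frobenius norms of the core tensor and factor matrices, which is how the later expansion of the loss reads; the function name `tucker_loss` and the toy data are hypothetical.

```python
import numpy as np


def tucker_loss(entries, G, B, C, D, mu):
    """Loss g over observed entries; `entries` is a list of (j, k, l, y)."""
    squared_error = 0.0
    for j, k, l, y in entries:
        y_hat = np.einsum("qrs,q,r,s->", G, B[j], C[k], D[l])
        squared_error += (y - y_hat) ** 2
    # Assumption: the regularizer is the sum of squared Frobenius norms.
    reg = sum(np.sum(M ** 2) for M in (G, B, C, D))
    return 0.5 * squared_error + 0.5 * mu * reg


rng = np.random.default_rng(2)
G = rng.standard_normal((2, 2, 2))
B, C, D = (rng.standard_normal((n, 2)) for n in (4, 3, 3))
full = np.einsum("qrs,jq,kr,ls->jkl", G, B, C, D)

# The index set Omega: only these positions count as observed.
observed = [(0, 0, 0), (1, 2, 1), (3, 1, 2)]
entries = [(j, k, l, full[j, k, l]) for j, k, l in observed]

loss_exact = tucker_loss(entries, G, B, C, D, mu=0.0)   # exact factors, no penalty
loss_reg = tucker_loss(entries, G, B, C, D, mu=0.1)     # same fit, with penalty
print(loss_exact, loss_reg)
```

Because the sum runs over $\Omega_{\mathcal{Y}}$ only, missing entries never contribute to the error, which is what makes the formulation usable on incomplete multi-platform data.
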
Scheme of SNeCT
[Figure: SNeCT pipeline. Input: the data tensor and a gene bionetwork. Lock-free parallel SGD factorizes the tensor into the factor matrices $\mathbf{B}$, $\mathbf{C}$, $\mathbf{D}$ while making related gene factors similar. Patient profiles are extracted and used for stratification (patient clustering), prediction (top-k search), and personalized subtype analysis of query patient data.]

Proposed methods
SNeCT (Scalable Network Constrained Tucker decomposition) enables integrative tensor factorization and analysis of tensor data under a network constraint.
Method 1: formulate an SGD-amenable objective function, then apply iterative SGD updates with a lock-free parallel scheme.
Method 2: personalized subtype analysis.

Proposed methods
Formulate an SGD-amenable objective function
Given the gene similarity matrix $\mathbf{Z} \in \mathbb{R}^{K \times K}$ with observable entries $\{z_{no} \mid (n,o) \in \Omega_{\mathbf{Z}}\}$, the network constraint is formulated so that similar genes have similar factors:
$$g_{\mathrm{net}}(\mathbf{C}, \mathbf{Z}) = \frac{1}{2} \sum_{(n,o) \in \Omega_{\mathbf{Z}}} z_{no} \sum_{m=1}^{R} \left( c_{nm} - c_{om} \right)^2 = \frac{1}{2} \sum_{(n,o) \in \Omega_{\mathbf{Z}}} z_{no} \left\| \mathbf{c}_n - \mathbf{c}_o \right\|_F^2$$

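As an illustration, the penalty can be computed directly from a list of observed similarity entries. The sketch below uses a hypothetical toy similarity matrix and also checks a standard identity: when the matrix is symmetric and every nonzero pair is listed in both orders, the penalty equals $\mathrm{tr}(\mathbf{C}^\top \mathbf{L} \mathbf{C})$ with $\mathbf{L}$ the graph Laplacian. The Laplacian view is an added observation, not something stated in the slides.

```python
import numpy as np


def network_penalty(C, edges):
    """0.5 * sum over observed (n, o, z_no) of z_no * ||c_n - c_o||^2."""
    return 0.5 * sum(z * np.sum((C[n] - C[o]) ** 2) for n, o, z in edges)


rng = np.random.default_rng(3)
K, R = 5, 2                        # genes and gene-mode rank (hypothetical)
C = rng.standard_normal((K, R))

# Hypothetical symmetric gene similarity matrix with zero diagonal.
Z = np.zeros((K, K))
for n, o, z in [(0, 1, 1.0), (1, 2, 0.5), (3, 4, 2.0)]:
    Z[n, o] = Z[o, n] = z

# Observed entries: every nonzero (n, o) pair, listed in both orders.
edges = [(n, o, Z[n, o]) for n in range(K) for o in range(K) if Z[n, o] != 0]

penalty = network_penalty(C, edges)

# Graph Laplacian form of the same quantity.
laplacian = np.diag(Z.sum(axis=1)) - Z
penalty_laplacian = np.trace(C.T @ laplacian @ C)

print(np.isclose(penalty, penalty_laplacian))
```

The per-pair form is the one that matters for SGD, since each observed $z_{no}$ can be sampled and applied on its own.
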
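Proposed methods
Formulate an SGD-amenable objective function
The Tucker loss is rewritten as a sum over observed entries, with $\tilde{y}_{jkl}$ the Tucker approximation of $y_{jkl}$:
$$g(\mathcal{G}, \mathbf{B}, \mathbf{C}, \mathbf{D}) = \frac{1}{2} \sum_{(j,k,l) \in \Omega_{\mathcal{Y}}} \left( y_{jkl} - \tilde{y}_{jkl} \right)^2 + \frac{\mu}{2} S(\mathcal{G}, \mathbf{B}, \mathbf{C}, \mathbf{D})$$
$$= \sum_{(j,k,l) \in \Omega_{\mathcal{Y}}} \left[ \frac{1}{2} \left( y_{jkl} - \tilde{y}_{jkl} \right)^2 + \frac{\mu}{2} \left( \frac{\|\mathcal{G}\|_F^2}{|\Omega_{\mathcal{Y}}|} + \frac{\|\mathbf{b}_j\|_F^2}{|\Omega_{\mathcal{Y}}^{j}|} + \frac{\|\mathbf{c}_k\|_F^2}{|\Omega_{\mathcal{Y}}^{k}|} + \frac{\|\mathbf{d}_l\|_F^2}{|\Omega_{\mathcal{Y}}^{l}|} \right) \right]$$
where $\Omega_{\mathcal{Y}}^{j}$ is the set of observed entries whose first index is $j$ (and similarly for $k$ and $l$).
The network constraint is
$$g_{\mathrm{net}}(\mathbf{C}, \mathbf{Z}) = \frac{1}{2} \sum_{(n,o) \in \Omega_{\mathbf{Z}}} z_{no} \left\| \mathbf{c}_n - \mathbf{c}_o \right\|_F^2$$
Integrate both into a single objective function:
$$g_{\mathrm{opt}} = g + \mu_{\mathrm{net}} \, g_{\mathrm{net}}$$
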
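Proposed methods
Calculate the gradients of $g_{\mathrm{opt}}$ with respect to the core tensor $\mathcal{G}$ and the factor matrices for a given data point $y_{\beta=(jkl)}$ or $z_{\gamma=(no)}$:
$$\left. \frac{\partial g_{\mathrm{opt}}}{\partial \mathbf{b}_j} \right|_{\beta} = -\left( y_{\beta} - \tilde{y}_{\beta} \right) \left( \mathcal{G} \times_2 \mathbf{c}_k \times_3 \mathbf{d}_l \right) + \frac{\mu}{|\Omega_{\mathcal{Y}}^{j}|} \mathbf{b}_j$$
$$\left. \frac{\partial g_{\mathrm{opt}}}{\partial \mathcal{G}} \right|_{\beta} = -\left( y_{\beta} - \tilde{y}_{\beta} \right) \times_1 \mathbf{b}_j^\top \times_2 \mathbf{c}_k^\top \times_3 \mathbf{d}_l^\top + \frac{\mu}{|\Omega_{\mathcal{Y}}|} \mathcal{G}$$
$$\left. \frac{\partial g_{\mathrm{opt}}}{\partial \mathbf{c}_n} \right|_{\gamma} = \mu_{\mathrm{net}} \, z_{\gamma} \left( \mathbf{c}_n - \mathbf{c}_o \right)$$
$\left. \frac{\partial g_{\mathrm{opt}}}{\partial \mathbf{c}_k} \right|_{\beta}$, $\left. \frac{\partial g_{\mathrm{opt}}}{\partial \mathbf{d}_l} \right|_{\beta}$, and $\left. \frac{\partial g_{\mathrm{opt}}}{\partial \mathbf{c}_o} \right|_{\gamma}$ are calculated symmetrically.
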
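The per-entry gradients can be sanity-checked against finite differences. The following sketch assumes the per-entry loss is the squared error plus the count-normalized L2 terms; the counts `n_all`, `n_j`, `n_k`, `n_l` (sizes of the observed index sets) and all numeric values are hypothetical.

```python
import numpy as np


def predict(G, b, c, d):
    return np.einsum("qrs,q,r,s->", G, b, c, d)


def entry_loss(y, G, b, c, d, mu, n_all, n_j, n_k, n_l):
    """Share of the Tucker loss contributed by one observed entry."""
    reg = np.sum(G ** 2) / n_all + b @ b / n_j + c @ c / n_k + d @ d / n_l
    return 0.5 * (y - predict(G, b, c, d)) ** 2 + 0.5 * mu * reg


def entry_gradients(y, G, b, c, d, mu, n_all, n_j, n_k, n_l):
    err = y - predict(G, b, c, d)
    grad_b = -err * np.einsum("qrs,r,s->q", G, c, d) + mu / n_j * b
    grad_c = -err * np.einsum("qrs,q,s->r", G, b, d) + mu / n_k * c
    grad_d = -err * np.einsum("qrs,q,r->s", G, b, c) + mu / n_l * d
    grad_G = -err * np.einsum("q,r,s->qrs", b, c, d) + mu / n_all * G
    return grad_b, grad_c, grad_d, grad_G


def network_gradients(z, c_n, c_o, mu_net):
    # Gradient for c_n; the gradient for c_o is its negative.
    grad_n = mu_net * z * (c_n - c_o)
    return grad_n, -grad_n


rng = np.random.default_rng(4)
G = rng.standard_normal((3, 2, 2))
b, c, d = rng.standard_normal(3), rng.standard_normal(2), rng.standard_normal(2)
args = dict(mu=0.1, n_all=50, n_j=5, n_k=7, n_l=10)   # hypothetical counts
grad_b, grad_c, grad_d, grad_G = entry_gradients(1.5, G, b, c, d, **args)

# Central finite differences along each coordinate of b.
eps = 1e-6
numeric_b = np.array([
    (entry_loss(1.5, G, b + eps * e, c, d, **args)
     - entry_loss(1.5, G, b - eps * e, c, d, **args)) / (2 * eps)
    for e in np.eye(3)
])
print(np.allclose(grad_b, numeric_b, atol=1e-6))
```

Each update touches only one row per factor matrix plus the core, which is the property the lock-free scheme relies on.
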
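Proposed methods
Parallel update with the calculated gradients

Algorithm SNeCT($\mathcal{Y}$, $\mathbf{Z}$, $\mu$, $\mu_{\mathrm{net}}$, $\theta$)   ($\theta$: learning rate)
  Initialize $\mathcal{G}$, $\mathbf{B}$, $\mathbf{C}$, $\mathbf{D}$ randomly
  repeat
    for all $y_{(jkl)=\beta} \in \mathcal{Y}$ and $z_{(no)=\gamma} \in \mathbf{Z}$, in random order, in parallel
      if $y_{jkl} \in \mathcal{Y}$ is picked then
        $\mathbf{b}_j \leftarrow \mathbf{b}_j - \theta \left. \frac{\partial g_{\mathrm{opt}}}{\partial \mathbf{b}_j} \right|_{\beta}$,  $\mathbf{c}_k \leftarrow \mathbf{c}_k - \theta \left. \frac{\partial g_{\mathrm{opt}}}{\partial \mathbf{c}_k} \right|_{\beta}$,  $\mathbf{d}_l \leftarrow \mathbf{d}_l - \theta \left. \frac{\partial g_{\mathrm{opt}}}{\partial \mathbf{d}_l} \right|_{\beta}$
        $\mathcal{G} \leftarrow \mathcal{G} - \theta \left. \frac{\partial g_{\mathrm{opt}}}{\partial \mathcal{G}} \right|_{\beta}$
      else if $z_{no} \in \mathbf{Z}$ is picked then
        $\mathbf{c}_n \leftarrow \mathbf{c}_n - \theta \left. \frac{\partial g_{\mathrm{opt}}}{\partial \mathbf{c}_n} \right|_{\gamma}$,  $\mathbf{c}_o \leftarrow \mathbf{c}_o - \theta \left. \frac{\partial g_{\mathrm{opt}}}{\partial \mathbf{c}_o} \right|_{\gamma}$
      end if
    end for
  until convergence condition satisfied
  Orthogonalize $\mathbf{B}$, $\mathbf{C}$, $\mathbf{D}$ by QR decomposition
  return $\mathcal{G}$, $\mathbf{B}$, $\mathbf{C}$, $\mathbf{D}$
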
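The loop above can be sketched in NumPy as follows. This is a single-threaded illustration, not the lock-free parallel implementation: the parallel scheme is replaced by a plain serial loop, a fixed number of epochs stands in for the convergence test, and the function name, initialization, toy data, and hyperparameters are all hypothetical. It also assumes that the triangular QR factors are absorbed into the core so the reconstruction is unchanged; the slide only names the QR step.

```python
import numpy as np


def predict(G, b, c, d):
    # G x1 b x2 c x3 d for a single (j, k, l) entry.
    return np.einsum("qrs,q,r,s->", G, b, c, d)


def objective(entries, edges, G, B, C, D, mu, mu_net):
    fit = sum((y - predict(G, B[j], C[k], D[l])) ** 2 for j, k, l, y in entries)
    reg = sum(np.sum(M ** 2) for M in (G, B, C, D))
    net = sum(z * np.sum((C[n] - C[o]) ** 2) for n, o, z in edges)
    return 0.5 * fit + 0.5 * mu * reg + 0.5 * mu_net * net


def snect_sgd_serial(entries, edges, shape, ranks, mu, mu_net, theta, epochs, seed=0):
    rng = np.random.default_rng(seed)
    G = rng.random(ranks)
    B, C, D = (rng.random((n, r)) for n, r in zip(shape, ranks))
    # |Omega| and the per-index counts used to normalize the L2 terms.
    n_all = len(entries)
    n_j, n_k, n_l = (np.zeros(n) for n in shape)
    for j, k, l, _ in entries:
        n_j[j] += 1
        n_k[k] += 1
        n_l[l] += 1
    samples = [("y", e) for e in entries] + [("z", e) for e in edges]
    history = [objective(entries, edges, G, B, C, D, mu, mu_net)]
    for _ in range(epochs):                      # stand-in for "until convergence"
        for idx in rng.permutation(len(samples)):
            kind, item = samples[idx]
            if kind == "y":                      # tensor entry picked
                j, k, l, y = item
                b, c, d = B[j].copy(), C[k].copy(), D[l].copy()
                err = y - predict(G, b, c, d)
                grad_G = -err * np.einsum("q,r,s->qrs", b, c, d) + mu / n_all * G
                B[j] -= theta * (-err * np.einsum("qrs,r,s->q", G, c, d) + mu / n_j[j] * b)
                C[k] -= theta * (-err * np.einsum("qrs,q,s->r", G, b, d) + mu / n_k[k] * c)
                D[l] -= theta * (-err * np.einsum("qrs,q,r->s", G, b, c) + mu / n_l[l] * d)
                G -= theta * grad_G
            else:                                # similarity entry picked
                n, o, z = item
                step = theta * mu_net * z * (C[n] - C[o])
                C[n] -= step
                C[o] += step
        history.append(objective(entries, edges, G, B, C, D, mu, mu_net))
    # Orthogonalize by QR; absorb the R factors into the core (assumption).
    B, R_B = np.linalg.qr(B)
    C, R_C = np.linalg.qr(C)
    D, R_D = np.linalg.qr(D)
    G = np.einsum("aq,br,cs,qrs->abc", R_B, R_C, R_D, G)
    return G, B, C, D, history


# Toy data: a random low-rank tensor with roughly 80% of entries observed.
rng = np.random.default_rng(42)
shape, ranks = (8, 6, 3), (2, 2, 2)
G_true = rng.random(ranks)
B_true, C_true, D_true = (rng.random((n, r)) for n, r in zip(shape, ranks))
Y = np.einsum("qrs,jq,kr,ls->jkl", G_true, B_true, C_true, D_true)
entries = [(j, k, l, Y[j, k, l])
           for j in range(8) for k in range(6) for l in range(3)
           if rng.random() < 0.8]
edges = [(0, 1, 1.0), (2, 3, 1.0)]               # hypothetical similar gene pairs

G, B, C, D, history = snect_sgd_serial(
    entries, edges, shape, ranks, mu=0.01, mu_net=0.1, theta=0.05, epochs=30)
print(history[0], history[-1])
```

In a parallel version, workers would run the inner loop body concurrently without locks; that works in practice when conflicting updates to the same rows are rare, though this sketch does not demonstrate it.
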
Experimental Settings
Factorize the data tensor with rank [78, 48, 5].
Stratification: cluster analysis and survival analysis.
Prediction: top-k similarity search on clinical features.
Personalized subtype analysis.
Performance: compare speed and convergence rate with a competitor (Narita et al., 2012).

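For the prediction setting, a top-k similarity search over patient profiles could look like the sketch below. It assumes that each patient is represented by a row of a profile matrix and that cosine similarity is the measure; the slides do not specify the similarity function, and the function name and toy profiles are hypothetical.

```python
import numpy as np


def top_k_similar(profiles, query_index, k):
    """Indices of the k rows most cosine-similar to row `query_index`."""
    unit = profiles / np.linalg.norm(profiles, axis=1, keepdims=True)
    scores = unit @ unit[query_index]
    scores[query_index] = -np.inf          # exclude the query itself
    return np.argsort(-scores)[:k]


# Toy patient profiles (one row per patient).
profiles = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.8, 0.3],
    [-1.0, 0.0],
])
neighbours = top_k_similar(profiles, query_index=0, k=2)
print(neighbours)
```

The clinical features of the returned neighbours can then be aggregated, for example by majority vote, to predict the query patient's features.
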
Stratification – Cluster Analysis
Number of patients per tumor type (rows) and cluster (columns):

Type     C1   C2   C3   C4   C5   C6   C7   C8   C9  C10  C11  C12  C13  Total
BLCA     16   32    2   19    0   22    3    0    0    0   32    0    0    126
BRCA     17    3  600  172    1   70    0    0    0    0   26    0    0    889
COAD      4    0    2    2    0   91  317    0    0    0    1    2    0    419
GBM       4    1    1    2    3    7    0    0  248    0    1    0    0    267
HNSC      0  242    1    6    0    1    0    0    0    0   60    0    0    310
KIRC     14    1    1    0  471    4    0    0    1    0    6    0    0    498
LAML      0    0    0    0    0    9    0    0    0  188    0    0    0    197
LUAD    302    2    2    7    1   12    0    0    0    0   29    0    0    457
LUSC     26   32    0   29    0    7    0    0    0    0  246    0    0    340
OV        0    0    1    3    0    1    1  348    0    0    0    0  131    485
READ      1    1    0    5    0    9  145    0    0    0    1    1    0    163
UCEC      3    1    3  117    1  348    1    0    0    0   10   13    2    499
Total   387  315  613  362  477  581  467  348  249  188  412   17  134   4550

Stratification – Survival Analysis
Survival curves for clustered patients
[Figure: survival curves; log-rank statistics: 1151, 1185, 409]