Transfer Learning and Applications in Computational Biology

  1. Transfer Learning and Applications in Computational Biology. Authors: Christian Widmer, Marius Kloft, Gunnar Rätsch, Gabriele Schweikert, Nico Görnitz. Affiliations: (1) Memorial Sloan-Kettering Cancer Center, NY, USA; (2) Technical University of Berlin, Germany; (3) New York University, NY, USA.

  2. Memorial Sloan-Kettering Cancer Center. Frequent words of abstracts from publications 1998-2004 [wordle.net]. © Gunnar Rätsch (cBio@MSKCC), Transfer Learning in Computational Biology, Courant Institute@NYU, February 7, 2013.

  3. Frequent words of abstracts from publications 2005-2012 [wordle.net].

  4. Learning About the Central Dogma of Biology. Goal: learn to predict what these processes accomplish. Given the DNA, ..., predict all gene products: f(DNA, ·) = RNA and g(RNA, ·) = protein. Estimating f and g amounts to cracking the codes of transcription, epigenetics, splicing, ...

  8. Learning About the Central Dogma of Biology. Three things will be crucial: (1) biological insights; (2) many observations of the system, (DNA, ·, RNA) for i = 1, ..., N; (3) empirical inference to estimate Θ, where f_Θ(DNA, ·) = RNA.
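The "empirical inference" step on slide 8 can be illustrated with a toy sketch. This is not the method used in the talk: the k-mer count features, the linear model, the squared loss, the function names (`kmer_counts`, `fit_theta`, `predict`), and the training pairs are all assumptions made for brevity, and the second input of f is ignored. The point is only that Θ is chosen to fit N observed (input, output) pairs.

```python
from itertools import product


def kmer_counts(seq, k=2):
    """Map a DNA string to a vector of k-mer counts (hypothetical feature map)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    vec = [0.0] * len(kmers)
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip k-mers containing non-ACGT symbols
            vec[j] += 1.0
    return vec


def predict(theta, seq, k=2):
    """Linear model: inner product of theta with the k-mer counts of seq."""
    return sum(t * p for t, p in zip(theta, kmer_counts(seq, k)))


def fit_theta(observations, k=2, lr=0.01, reg=0.01, epochs=500):
    """Estimate theta by gradient descent on L2-regularized mean squared error."""
    feats = [(kmer_counts(x, k), y) for x, y in observations]
    n = len(feats)
    theta = [0.0] * (4 ** k)
    for _ in range(epochs):
        grad = [reg * t for t in theta]  # gradient of the regularizer
        for phi, y in feats:
            err = sum(t * p for t, p in zip(theta, phi)) - y
            for j, p in enumerate(phi):
                if p:
                    grad[j] += err * p / n
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta


# Made-up observations: (DNA fragment, scalar readout).
observations = [
    ("GCGCGCGC", 1.0),
    ("GGCCGGCC", 1.0),
    ("ATATATAT", 0.0),
    ("AATTAATT", 0.0),
]
theta = fit_theta(observations)
print(predict(theta, "GCGCGCGC"), predict(theta, "ATATATAT"))
```

The learning rate is kept small relative to the feature magnitudes (sequences of length 8 give at most 7 counts per vector), which keeps the plain gradient descent stable in this toy setting.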

  9. Learning About the Central Dogma. Goal: estimate f to predict RNAs. Need: a good inference method, inputs (DNA, ·), and outputs (complete transcriptome). Challenges: (1) RNA only partially known; (2) factors only partially known; (3) improved inference methods.

  10. Recent Machine Learning Work. Develop fast, accurate and interpretable learning methods: (1) large-scale sequence classification [Rätsch et al., 2006a; Sonnenburg et al., 2010, 2007; Toussaint et al., 2010]; (2) analysis and explanation of learning results [Rätsch et al., 2006b; Sonnenburg et al., 2008; Zien et al., 2009]; (3) sequence segmentation and structure prediction [Rätsch et al., 2007; Schweikert et al., 2009; Zeller et al., 2008]; (4) transfer and multitask learning [Schweikert et al., 2008a; Widmer et al., 2011, 2012, 2010a; Widmer and Rätsch, 2011; Widmer et al., 2010c].

  11. Recent Machine Learning Work. [Figure: k-mer length (1 to 8) against position (−30 to 30).] [Figure: log-intensity (0 to 10) along a transcript.]

  14. Many algorithms are implemented in the Shogun toolbox (GPL, ≥ 1000 users).

  15. Roadmap: (1) motivation from computational biology [Figure: DNA annotated with TSS, TIS, donor and acceptor splice sites, stop, and polyA/cleavage]; (2) empirical comparison of domain adaptation algorithms; (3) algorithms for hierarchical multi-task learning; (4) algorithms for learning task relations; (5) fast(er) algorithms; (6) discussion and conclusion.

  16. A Core CompBio Problem: Gene Finding. [Figure: DNA (genic, intergenic), pre-mRNA (exon, intron), mRNA (5' UTR, 3' UTR, cap, polyA), protein.] Given a piece of DNA sequence: predict gene products including intermediate processing steps; predict signals used during processing; predict the correct corresponding label sequence.

  17. A Core CompBio Problem: Gene Finding. [Figure: signals along the gene. DNA and pre-mRNA: TSS, splice donor and acceptor sites, polyA/cleavage. mRNA: TIS, stop, cap, polyA. Protein.]
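The "label sequence" view of gene finding on the slides above can be made concrete with a small sketch. The label alphabet (`N` intergenic, `E` exon, `I` intron), the half-open coordinate convention, and the function names `label_gene` and `splice_sites` are assumptions for illustration; the talk's actual label set is not given here, and UTRs, coding regions, and the reverse strand are ignored.

```python
def label_gene(seq_len, exons):
    """Return one label per position: 'N' intergenic, 'E' exon, 'I' intron.

    exons: sorted, non-overlapping half-open (start, end) intervals of a
    single gene on the forward strand (hypothetical input format).
    """
    labels = ["N"] * seq_len
    if not exons:
        return "".join(labels)
    gene_start, gene_end = exons[0][0], exons[-1][1]
    # Everything inside the gene span is intronic unless covered by an exon.
    for pos in range(gene_start, gene_end):
        labels[pos] = "I"
    for start, end in exons:
        for pos in range(start, end):
            labels[pos] = "E"
    return "".join(labels)


def splice_sites(exons):
    """Return (donors, acceptors) as positions derived from exon boundaries."""
    donors = [end for _, end in exons[:-1]]         # first intronic position after an exon
    acceptors = [start for start, _ in exons[1:]]   # first exonic position after an intron
    return donors, acceptors


exons = [(2, 5), (8, 10)]
print(label_gene(12, exons))   # NNEEEIIIEENN
print(splice_sites(exons))     # ([5], [8])
```

A gene finder in this framing is a predictor that maps the raw DNA string to such a label string, with the signal positions (here only donor and acceptor sites) marking the transitions between labels.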
