Not Just a Black Box: Interpretable Deep Learning for Genomics

Presented by: Avanti Shrikumar


  1. Not Just a Black Box: Interpretable Deep Learning for Genomics. Presented by: Avanti Shrikumar

  2. With great power comes really poor interpretability… [Figure: power plotted against interpretability, with points labeled Deep Learning, Traditional machine learning, Classical statistics]

  3. With great power comes really poor interpretability… [Figure: power plotted against interpretability, with points labeled Deep Learning, Interpretable Deep Learning, Traditional machine learning, Classical statistics]

  4. Questions for the model • Which parts of the input are the most important for making a given prediction?

  5. Questions for the model • Which parts of the input are the most important for making a given prediction? • What are the recurring patterns in the input?

  6. Questions for the model • Which parts of the input are the most important for making a given prediction? • What are the recurring patterns in the input?

  7. How can we find the important parts of the input for a given prediction? Idea 1: perturbation [Figure: network diagram from inputs to output; yellow = inputs]

  14. How can we find the important parts of the input for a given prediction? Idea 1: perturbation. Drawbacks: 1) Computational efficiency: requires one forward prop for each perturbation [Figure: network diagram from inputs to output; yellow = inputs]

  15. How can we find the important parts of the input for a given prediction? Idea 1: perturbation. Drawbacks: 1) Computational efficiency: requires one forward prop for each perturbation 2) Saturation [Figure: network diagram from inputs to output; yellow = inputs]
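
The perturbation idea and its first drawback can be sketched in a few lines. This is a minimal illustration, not the presenter's code: `toy_model`, its weights, and the choice of replacing an input with `fill_value=0.0` are all assumptions made for the example. The loop makes the cost visible: one forward pass per perturbed input.

```python
import numpy as np

def toy_model(x):
    # Hypothetical stand-in for a trained network:
    # a fixed weighted sum squashed by a sigmoid.
    weights = np.array([2.0, -1.0, 0.0, 0.5])
    return 1.0 / (1.0 + np.exp(-np.dot(weights, x)))

def perturbation_scores(model, x, fill_value=0.0):
    """Score each input by the change in output when it is replaced by fill_value."""
    original_output = model(x)            # one forward pass for the unperturbed input
    scores = np.zeros(len(x))
    for idx in range(len(x)):             # one extra forward pass per perturbation
        perturbed = x.copy()
        perturbed[idx] = fill_value
        scores[idx] = original_output - model(perturbed)
    return scores

x = np.array([1.0, 1.0, 1.0, 1.0])
scores = perturbation_scores(toy_model, x)
print(scores)
```

For an input with N positions this needs N + 1 forward passes, which is the efficiency drawback the slide names.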

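  16. Saturation problem illustrated [Figure: network with inputs i1 and i2 and output y, next to a plot of y against i1 + i2 (x-axis ticks 0, 1, 2; y-axis tick 1)]

  17. Saturation problem illustrated [Figure: network with inputs i1 = 1 and i2 = 1 and output y = 1, next to a plot of y against i1 + i2 (x-axis ticks 0, 1, 2; y-axis tick 1)]

  18. Saturation problem illustrated [Figure: network with inputs i1 = 1 and i2 = 1 and output y = 1, with an added label 0, next to a plot of y against i1 + i2 (x-axis ticks 0, 1, 2; y-axis tick 1)]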
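
A small numeric sketch of the saturation problem follows. It assumes the plotted function is y = min(1, i1 + i2) for non-negative inputs; that form matches the values shown on the slides (y = 1 at i1 = i2 = 1) but is a reading of the figure, not something the slide text states.

```python
def saturating_unit(i1, i2):
    # Assumed form of the plotted function: output saturates at 1.
    return min(1.0, i1 + i2)

y_original = saturating_unit(1.0, 1.0)

# Perturb one input at a time by setting it to 0.
effect_of_i1 = y_original - saturating_unit(0.0, 1.0)
effect_of_i2 = y_original - saturating_unit(1.0, 0.0)

# Only removing both inputs changes the output.
effect_of_both = y_original - saturating_unit(0.0, 0.0)

print(effect_of_i1, effect_of_i2, effect_of_both)  # 0.0 0.0 1.0
```

Each single perturbation reports zero importance even though the two inputs together fully determine the output.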

  19. How can we find the important parts of the input for a given prediction? Idea 2: backpropagate importance [Figure: network diagram from inputs to output; yellow = inputs]

  22. How can we find the important parts of the input for a given prediction? Idea 2: backpropagate importance. Examples: - Gradients (Simonyan et al.) - Deconvolutional Networks (Zeiler & Fergus) - Guided Backpropagation (Springenberg et al.) - Layerwise Relevance Propagation (Bach et al.) - Integrated Gradients (Sundararajan et al.) [Figure: network diagram from inputs to output; yellow = inputs]

  23. How can we find the important parts of the input for a given prediction? Idea 2: backpropagate importance. Examples: - Gradients (Simonyan et al.) - Deconvolutional Networks (Zeiler & Fergus) - Guided Backpropagation (Springenberg et al.) - Layerwise Relevance Propagation (Bach et al.) - Integrated Gradients (Sundararajan et al.) - DeepLIFT (Learning Important FeaTures) - https://github.com/kundajelab/deeplift, ICML 2017 - With Peyton Greenside and Anshul Kundaje [Figure: network diagram from inputs to output; yellow = inputs]

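  24. Saturation revisited [Figure: network with inputs i1 and i2 and output y, next to a plot of y against i1 + i2 (x-axis ticks 0, 1, 2; y-axis tick 1)]

  25. Saturation revisited. When (i1 + i2) >= 1, the gradient is 0 [Figure: network with inputs i1 and i2 and output y, next to a plot of y against i1 + i2 (x-axis ticks 0, 1, 2; y-axis tick 1)]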

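  26. The DeepLIFT solution: difference from reference. Reference: i1 = 0 & i2 = 0 [Figure: network with inputs i1 and i2 and output y, next to a plot of y against i1 + i2 (x-axis ticks 0, 1, 2; y-axis tick 1)]

  27. The DeepLIFT solution: difference from reference. Reference: i1 = 0 & i2 = 0. y = 0 when (i1 + i2) = 0 (reference) [Figure: network with inputs i1 and i2 and output y, next to a plot of y against i1 + i2 (x-axis ticks 0, 1, 2; y-axis tick 1)]

  28. The DeepLIFT solution: difference from reference. Reference: i1 = 0 & i2 = 0. y = 0 when (i1 + i2) = 0 (reference). At (i1 + i2) = 2, the “difference from reference” is +1, NOT 0 [Figure: network with inputs i1 and i2 and output y, next to a plot of y against i1 + i2 (x-axis ticks 0, 1, 2; y-axis tick 1)]

  29. The DeepLIFT solution: difference from reference. Reference: i1 = 0 & i2 = 0. y = 0 when (i1 + i2) = 0 (reference). At (i1 + i2) = 2, the “difference from reference” is +1, NOT 0. DeepLIFT addresses other failure modes besides saturation (see paper) [Figure: network with inputs i1 and i2 and output y, next to a plot of y against i1 + i2 (x-axis ticks 0, 1, 2; y-axis tick 1)]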
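
The contrast between the gradient and the difference from reference can be checked numerically. This sketch is not the DeepLIFT library's implementation. It assumes the plotted function is y = min(1, i1 + i2), and the helper `difference_from_reference` is a hypothetical name; it splits the output difference across inputs in proportion to each input's difference from its reference, in the spirit of the rescale-style rule described in the DeepLIFT paper, and that split is only valid here because both inputs feed a single sum with unit weights.

```python
def saturating_unit(i1, i2):
    # Assumed form of the plotted function: output saturates at 1.
    return min(1.0, i1 + i2)

def numerical_gradient(f, i1, i2, eps=1e-4):
    # Central finite differences with respect to each input.
    d_i1 = (f(i1 + eps, i2) - f(i1 - eps, i2)) / (2 * eps)
    d_i2 = (f(i1, i2 + eps) - f(i1, i2 - eps)) / (2 * eps)
    return d_i1, d_i2

def difference_from_reference(f, inputs, reference):
    # Output difference relative to the reference input.
    delta_y = f(*inputs) - f(*reference)
    delta_inputs = [x - r for x, r in zip(inputs, reference)]
    delta_sum = sum(delta_inputs)
    if delta_sum == 0:
        return delta_y, [0.0 for _ in delta_inputs]
    # Share delta_y across inputs in proportion to their own differences.
    multiplier = delta_y / delta_sum
    return delta_y, [multiplier * d for d in delta_inputs]

gradients = numerical_gradient(saturating_unit, 1.0, 1.0)
delta_y, contributions = difference_from_reference(
    saturating_unit, inputs=(1.0, 1.0), reference=(0.0, 0.0)
)

print(gradients)       # (0.0, 0.0): the gradient sees nothing in the saturated region
print(delta_y)         # 1.0: the difference from reference is +1
print(contributions)   # [0.5, 0.5]: the contributions sum to delta_y
```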

  30. Reference matters! CIFAR10 model, class = “ship” [Figure: original image]

  31. Reference matters! CIFAR10 model, class = “ship” [Figure: panels Original, Reference, DeepLIFT scores]

  32. Reference matters! CIFAR10 model, class = “ship” [Figure: panels Original, Reference, DeepLIFT scores] Suggestions on how to pick a reference: - MNIST: all zeros (background)

  33. Reference matters! CIFAR10 model, class = “ship” [Figure: panels Original, Reference, DeepLIFT scores] Suggestions on how to pick a reference: - MNIST: all zeros (background) - Consider using a distribution of references - E.g. multiple references generated by shuffling a genomic sequence
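
One way to read the "distribution of references" suggestion is sketched below: generate several shuffled copies of a sequence and average the per-position scores obtained against each. The function names are hypothetical, and `mismatch_scorer` is only a placeholder for a real attribution method. A plain shuffle preserves base composition; other schemes, such as shuffles that preserve dinucleotide frequencies, are possible and not shown.

```python
import random

def shuffled_references(sequence, num_references=10, seed=0):
    """Return shuffled copies of the sequence (same bases, random order)."""
    rng = random.Random(seed)
    references = []
    for _ in range(num_references):
        bases = list(sequence)
        rng.shuffle(bases)
        references.append("".join(bases))
    return references

def average_scores(score_fn, sequence, references):
    """Average per-position scores over all references."""
    per_reference = [score_fn(sequence, ref) for ref in references]
    return [sum(column) / len(column) for column in zip(*per_reference)]

def mismatch_scorer(sequence, reference):
    # Hypothetical placeholder for an attribution method:
    # scores 1.0 wherever the sequence differs from the reference.
    return [1.0 if s != r else 0.0 for s, r in zip(sequence, reference)]

sequence = "ACGTGTAACTGATAATGCCGATATT"
references = shuffled_references(sequence, num_references=5)
averaged = average_scores(mismatch_scorer, sequence, references)
```

Averaging over several references reduces the dependence of the scores on any single, arbitrary shuffle.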

  34. E.g.: morphing an 8 to a 3 or a 6 [Figure: columns original, 8->3, 8->6; rows Guided Backprop, Integrated gradients, DeepLIFT]

  35. Example biological problem: understanding stem cell differentiation [Figure: fertilized egg giving rise to liver cells, cardiac cells, blood cells] Cell types are different because different genes are turned on

  36. Example biological problem: understanding stem cell differentiation [Figure: fertilized egg giving rise to liver cells, cardiac cells, blood cells] Cell types are different because different genes are turned on. How is cell-type-specific gene expression controlled?

  37. Example biological problem: understanding stem cell differentiation [Figure: fertilized egg giving rise to liver cells, cardiac cells, blood cells] Cell types are different because different genes are turned on. How is cell-type-specific gene expression controlled? Ans: “control elements” act like switches to turn genes on

  38. “Control Elements” are switches that turn genes [Figure: control element next to the DNA sequence of a gene]

  39. “Control Elements” are switches that turn genes. Sequences contain “DNA words” that controller proteins bind to: ACGTGTAACTGATAATGCCGATATT [Figure: control element next to the DNA sequence of a gene]

  40. “Control Elements” are switches that turn genes. Sequences contain “DNA words” that controller proteins bind to: ACGTGTAACTGATAATGCCGATATT. Controller proteins bind to DNA words [Figure: control element next to the DNA sequence of a gene]

  41. “Control Elements” are switches that turn genes. Control element + controller proteins loop over… [Figure: control element next to the DNA sequence of a gene]

  42. “Control Elements” are switches that turn genes. Control element + controller proteins loop over… …and activate nearby genes [Figure: control element next to the DNA sequence of a gene]

  43. 89%* of disease-associated mutations are outside genes! [Figure: controller proteins next to the DNA sequence of a gene] *Stranger et al., Genet., 2011

  44. 89%* of disease-associated mutations are outside genes! Control element has “DNA words” that controller proteins bind to: ACGTGTAACTGATAATGCCGATATT [Figure: controller proteins next to the DNA sequence of a gene] *Stranger et al., Genet., 2011

  47. 89%* of disease-associated mutations are outside genes! Control element has “DNA words” that controller proteins bind to: ACGTGTAACTGATAATGCCGATATT. Many positions in a control element are not essential for its function! [Figure: controller proteins next to the DNA sequence of a gene] *Stranger et al., Genet., 2011

  48. 89%* of disease-associated mutations are outside genes! Control element has “DNA words” that controller proteins bind to: ACGTGTAACTGATAATGCCGATATT. Many positions in a control element are not essential for its function! → Which positions in control elements matter? [Figure: controller proteins next to the DNA sequence of a gene] *Stranger et al., Genet., 2011

  49. Q: Which positions in control elements matter?

  50. Q: Which positions in control elements matter? Experimentally measure control elements in different tissues

  51. Q: Which positions in control elements matter? Experimentally measure control elements in different tissues. Predict tissue-specific activity of control elements from sequence using deep learning
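
The slides do not say how a DNA sequence is presented to the model. A common choice, assumed here, is one-hot encoding: each position becomes a length-4 vector over the bases A, C, G, T. The sketch below encodes the example sequence from the slides; `one_hot_encode` and the base ordering are illustrative choices, not taken from the talk.

```python
import numpy as np

BASES = "ACGT"  # assumed column order

def one_hot_encode(sequence):
    """Encode a DNA string as a (length, 4) array of 0s and 1s."""
    base_to_column = {base: col for col, base in enumerate(BASES)}
    encoded = np.zeros((len(sequence), len(BASES)), dtype=np.float32)
    for position, base in enumerate(sequence.upper()):
        # Characters outside A/C/G/T (for example N) are left as all zeros.
        if base in base_to_column:
            encoded[position, base_to_column[base]] = 1.0
    return encoded

encoded = one_hot_encode("ACGTGTAACTGATAATGCCGATATT")
print(encoded.shape)  # (25, 4)
```

With this representation, a per-position importance score from any of the methods above lines up directly with a position in the control element.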
