Not Just a Black Box: Interpretable Deep Learning for Genomics
Presented by: Avanti Shrikumar
With great power comes really poor interpretability…
[Figure: power (vertical axis) against interpretability (horizontal axis), with points for deep learning, interpretable deep learning, traditional machine learning, and classical statistics]
Questions for the model
• Which parts of the input are the most important for making a given prediction?
• What are the recurring patterns in the input?
How can we find the important parts of the input for a given prediction?
Idea 1: perturbation
[Figure: network diagram from inputs (yellow) to output]
Drawbacks:
1) Computational efficiency: requires one forward prop for each perturbation
2) Saturation
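A minimal sketch of the perturbation idea, assuming a model that maps a list of numbers to a single score. The names `perturbation_scores` and `toy_model`, and the choice of replacing each input with 0, are illustrative assumptions and not taken from the talk. It shows why the cost grows with the input: one forward pass per perturbed input, plus one for the original.

```python
def perturbation_scores(model, inputs, baseline=0.0):
    """Score each input by how much the output drops when it is replaced."""
    original = model(inputs)              # one forward pass for the original
    scores = []
    for k in range(len(inputs)):
        perturbed = list(inputs)
        perturbed[k] = baseline           # perturb a single input
        scores.append(original - model(perturbed))  # one forward pass each
    return scores


def toy_model(x):
    # Hypothetical linear model, used only to make the sketch runnable.
    return 2.0 * x[0] - 1.0 * x[1] + 0.0 * x[2]


scores = perturbation_scores(toy_model, [1.0, 1.0, 1.0])
print(scores)
```

For a genomic sequence the same loop would run once per position and substituted base, which is where the computational cost comes from.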
Saturation problem illustrated
[Figure: y plotted against i1 + i2, rising from 0 and flattening at y = 1. Annotated example: i1 = 1, i2 = 1, y = 1.]
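A small sketch of the saturation problem under perturbation. The formula y = min(1, i1 + i2) is an assumption chosen to match the plotted curve (rising, then flat at 1). With both inputs at 1, zeroing either one leaves the output unchanged, so perturbation assigns no importance to either.

```python
def y(i1, i2):
    # Assumed toy function: grows with i1 + i2, saturates at 1.
    return min(1.0, i1 + i2)


base = y(1.0, 1.0)                 # both inputs on
change_i1 = base - y(0.0, 1.0)     # perturb i1 only
change_i2 = base - y(1.0, 0.0)     # perturb i2 only
print(base, change_i1, change_i2)
```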
How can we find the important parts of the input for a given prediction?
Idea 2: backpropagate importance
[Figure: network diagram from inputs (yellow) to output]
Examples:
- Gradients (Simonyan et al.)
- Deconvolutional Networks (Zeiler & Fergus)
- Guided Backpropagation (Springenberg et al.)
- Layerwise Relevance Propagation (Bach et al.)
- Integrated Gradients (Sundararajan et al.)
- DeepLIFT (Learning Important FeaTures)
  - https://github.com/kundajelab/deeplift, ICML 2017
  - With Peyton Greenside and Anshul Kundaje
Saturation revisited
When (i1 + i2) >= 1, gradient is 0
[Figure: y plotted against i1 + i2, rising from 0 and flat at y = 1 once i1 + i2 reaches 1]
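A sketch of the vanishing gradient, again assuming y = min(1, i1 + i2) as the toy function. The gradient is estimated with a central finite difference (the helper `grad_i1` and step size are illustrative choices): it is about 1 on the rising part of the curve and 0 once the output has saturated.

```python
def y(i1, i2):
    # Assumed toy function: grows with i1 + i2, saturates at 1.
    return min(1.0, i1 + i2)


def grad_i1(i1, i2, eps=1e-4):
    """Central-difference estimate of dy/di1."""
    return (y(i1 + eps, i2) - y(i1 - eps, i2)) / (2 * eps)


grad_saturated = grad_i1(1.0, 1.0)    # i1 + i2 = 2, flat region
grad_linear = grad_i1(0.25, 0.25)     # i1 + i2 = 0.5, rising region
print(grad_saturated, grad_linear)
```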
The DeepLIFT solution: difference from reference
Reference: i1 = 0 & i2 = 0
y = 0 when (i1 + i2) = 0 (reference)
At (i1 + i2) = 2, the “difference from reference” is +1, NOT 0
[Figure: y plotted against i1 + i2, rising from 0 and flat at y = 1]
DeepLIFT addresses other failure modes besides saturation (see paper)
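A simplified sketch of the difference-from-reference idea on the same toy function, assumed here to be y = min(1, i1 + i2). It is not the DeepLIFT library's implementation: it only splits the output difference among inputs in proportion to each input's own difference from the reference, which is reasonable for this single sum-then-saturate unit. The function name is hypothetical.

```python
def y(i1, i2):
    # Assumed toy function: grows with i1 + i2, saturates at 1.
    return min(1.0, i1 + i2)


def difference_from_reference(inputs, reference):
    """Return the output difference and a per-input share of it."""
    delta_y = y(*inputs) - y(*reference)
    delta_inputs = [x - r for x, r in zip(inputs, reference)]
    delta_sum = sum(delta_inputs)
    if delta_sum == 0:
        return delta_y, [0.0 for _ in inputs]
    multiplier = delta_y / delta_sum   # output change per unit of input change
    return delta_y, [multiplier * d for d in delta_inputs]


delta_y, contributions = difference_from_reference((1.0, 1.0), (0.0, 0.0))
print(delta_y, contributions)
```

Unlike the gradient at the same point, the contributions are non-zero and add up to the output's difference from the reference.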
Reference matters!
CIFAR10 model, class = “ship”
[Figure: columns showing the original image, the reference, and the DeepLIFT scores]
Suggestions on how to pick a reference:
- MNIST: all zeros (background)
- Consider using a distribution of references
  - E.g. multiple references generated by shuffling a genomic sequence
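A sketch of generating multiple shuffled references for a genomic sequence and averaging over them. The plain base-by-base shuffle, the helper names, and `toy_score` (which just counts a hypothetical "DNA word", GATAA) are all assumptions for illustration; in practice the score would come from a trained model, and other shuffling schemes may be preferable.

```python
import random


def shuffled_references(sequence, n_refs=5, seed=0):
    """Return n_refs shuffles of the sequence (base composition preserved)."""
    rng = random.Random(seed)
    refs = []
    for _ in range(n_refs):
        bases = list(sequence)
        rng.shuffle(bases)
        refs.append("".join(bases))
    return refs


def toy_score(sequence):
    # Hypothetical stand-in for a model output: occurrences of one word.
    return sequence.count("GATAA")


seq = "ACGTGTAACTGATAATGCCGATATT"
refs = shuffled_references(seq)
mean_ref_score = sum(toy_score(r) for r in refs) / len(refs)
difference = toy_score(seq) - mean_ref_score   # averaged over references
print(refs, difference)
```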
E.g.: morphing 8 to a 3 or a 6
[Figure: panels labeled original, 8->3, and 8->6, compared across Backprop, Guided, Integrated gradients, and DeepLIFT]
Example biological problem: understanding stem cell differentiation
[Figure: a fertilized egg giving rise to liver cells, cardiac cells, and blood cells]
Cell types are different because different genes are turned on
How is cell-type-specific gene expression controlled?
Ans: “control elements” act like switches to turn genes on
“Control Elements” are switches that turn genes on
[Figure: DNA sequence of a gene with a nearby control element]
Sequences contain “DNA words” that controller proteins bind to: ACGTGTAACTGATAATGCCGATATT
Controller proteins bind to DNA words
Control element + controller proteins loop over…
…and activate nearby genes
89%* of disease-associated mutations are outside genes!
[Figure: DNA sequence of a gene with controller proteins bound at a control element]
Control element has “DNA words” that controller proteins bind to: ACGTGTAACTGATAATGCCGATATT
Many positions in a control element are not essential for its function!
→ Which positions in control elements matter?
*Stranger et al., Genet., 2011
Q: Which positions in control elements matter?
Experimentally measure control elements in different tissues
Predict tissue-specific activity of control elements from sequence using deep learning
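A sketch of how a DNA sequence might be turned into model input, assuming one-hot encoding with channel order A, C, G, T. The encoding, the function name, and the all-zero row for unknown bases are assumptions; the slides do not say how the sequence is represented.

```python
BASES = "ACGT"   # assumed channel order


def one_hot(sequence):
    """Encode each base as a 4-element row; unknown bases become all zeros."""
    rows = []
    for base in sequence.upper():
        rows.append([1 if base == b else 0 for b in BASES])
    return rows


encoded = one_hot("ACGTGTAACTGATAATGCCGATATT")
print(len(encoded), encoded[:4])
```

Per-position importance scores can then be read off against the rows of this encoding, one row per position in the control element.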