Support vector machine prediction of signal peptide cleavage site using a new class of kernels for strings


  1. Support vector machine prediction of signal peptide cleavage site using a new class of kernels for strings. Jean-Philippe Vert, Bioinformatics Center, Kyoto University, Japan

  2. Outline • 1. SVM and kernel methods • 2. New kernels for bioinformatics • 3. Example: signal peptide cleavage site prediction

  3. Part 1. SVM and kernel methods

  4. Support vector machines • Objects x to be classified are mapped to a feature space by a map Φ • The SVM finds the largest-margin separating hyperplane in the feature space
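
  As a concrete illustration (ours, not from the slides), a minimal scikit-learn sketch of a large-margin SVM on made-up 2-D vectors:

    import numpy as np
    from sklearn.svm import SVC

    # Toy 2-D points with labels -1 / +1 (made-up data).
    X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.2]])
    y = np.array([-1, -1, 1, 1])

    # A linear SVM finds the largest-margin separating hyperplane.
    clf = SVC(kernel="linear", C=1.0).fit(X, y)
    print(clf.support_vectors_)                   # the margin-defining points
    print(clf.predict([[0.1, 0.0], [1.1, 0.9]]))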

  5. The kernel trick • Implicit definition of x → Φ(x) through the kernel K(x, y) := ⟨Φ(x), Φ(y)⟩ • Simple kernels can represent complex Φ • For a given kernel, not only SVM but also clustering, PCA, ICA... are possible in the feature space: these are the kernel methods
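
  A small sketch of the kernel trick (our illustration, not the talk's): for the degree-2 polynomial kernel in 2-D, K(x, y) = ⟨x, y⟩² equals ⟨Φ(x), Φ(y)⟩ for the explicit map Φ(x) = (x₁², x₂², √2 x₁x₂), so the kernel can be evaluated without ever forming Φ:

    import numpy as np

    def K(x, y):
        # Kernel evaluation: no explicit feature map needed.
        return np.dot(x, y) ** 2

    def phi(x):
        # Explicit feature map, written out only to check the identity.
        return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2) * x[0] * x[1]])

    x, y = np.array([1.0, 2.0]), np.array([3.0, 0.5])
    assert np.isclose(K(x, y), np.dot(phi(x), phi(y)))  # same inner product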

  6. Kernel examples • "Classical" kernels: polynomial, Gaussian, sigmoid... but the objects x must be vectors • "Exotic" kernels for strings: ⋆ Fisher kernel (Jaakkola and Haussler 98) ⋆ Convolution kernels (Haussler 99, Watkins 99) ⋆ Kernel for translation initiation sites (Zien et al. 00) ⋆ String kernel (Lodhi et al. 00)

  7. Kernel engineering • Use prior knowledge to build the geometry of the feature space through K(·, ·)

  8. Part 2. New kernels for bioinformatics

  9. The problem • X a set of objects • p(x) a probability distribution on X • How to build K(x, y) from p(x)?

  10. Product kernel • K_prod(x, y) = p(x) p(y) • [figure: x and y mapped to the points p(x) and p(y) on a half-line starting at 0] • SVM = Bayesian classifier
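
  A minimal sketch (our notation) of why the product kernel gives a Bayes-like classifier: the feature map Φ(x) = p(x) is one-dimensional, so the Gram matrix has rank one and any SVM decision function reduces to thresholding p(x):

    import numpy as np

    p = {"AA": 0.5, "AB": 0.3, "BA": 0.15, "BB": 0.05}   # toy distribution
    objects = list(p)

    # K_prod(x, y) = p(x) p(y): an outer product, hence a rank-one Gram matrix.
    gram = np.array([[p[x] * p[y] for y in objects] for x in objects])
    print(np.linalg.matrix_rank(gram))   # -> 1: all geometry on a single axis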

  11. Diagonal kernel • K_diag(x, y) = p(x) δ(x, y) • [figure: x, y, z mapped to mutually orthogonal directions with lengths p(x), p(y), p(z)] • No learning
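
  The complementary sketch for the diagonal kernel (again our illustration): distinct objects are mutually orthogonal in the feature space, so nothing learned on one training point transfers to another, hence "no learning":

    import numpy as np

    p = {"AA": 0.5, "AB": 0.3, "BA": 0.15, "BB": 0.05}   # same toy distribution
    objects = list(p)

    # K_diag(x, y) = p(x) * delta(x, y): zero off the diagonal for distinct objects.
    gram = np.array([[p[x] * (x == y) for y in objects] for x in objects])
    print(gram)   # diagonal Gram matrix: no cross-object similarity to exploit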

  12. Interpolated kernel • If objects are composite, x = (x₁, x₂): K(x, y) = K_diag(x₁, y₁) K_prod(x₂, y₂) = p(x₁) δ(x₁, y₁) × p(x₂ | x₁) p(y₂ | y₁) • [figure: tree with internal nodes A*, B* and leaves AA, AB, BA, BB]

  13. General interpolated kernel • Composite objects x = (x₁, ..., xₙ) • A list of index subsets V = {I₁, ..., I_v}, where Iᵢ ⊂ {1, ..., n} • Interpolated kernel: K_V(x, y) = (1/|V|) Σ_{I∈V} K_diag(x_I, y_I) K_prod(x_{Iᶜ}, y_{Iᶜ})
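
  A direct transcription of this definition (our code; for simplicity p is taken to be a product distribution, so marginals over any index subset factorize):

    def p_subset(x, I, pj):
        # Marginal probability of the sub-tuple x_I under a product
        # distribution with per-position distributions pj[j].
        out = 1.0
        for j in I:
            out *= pj[j][x[j]]
        return out

    def k_interp(x, y, V, pj):
        # K_V(x, y) = (1/|V|) * sum over I in V of
        #             K_diag(x_I, y_I) * K_prod(x_Ic, y_Ic).
        n = len(x)
        total = 0.0
        for I in V:
            Ic = [j for j in range(n) if j not in I]
            k_diag = p_subset(x, I, pj) * all(x[j] == y[j] for j in I)
            k_prod = p_subset(x, Ic, pj) * p_subset(y, Ic, pj)
            total += k_diag * k_prod
        return total / len(V)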

  14. Rare common subparts • For a given p(x) and p(y), we have: K_V(x, y) = K_prod(x, y) × (1/|V|) Σ_{I∈V} δ(x_I, y_I) / p(x_I) • x and y get closer in the feature space when they share rare common subparts
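
  The same kernel in this reweighted form, continuing the sketch above; the two expressions agree, and the 1/p(x_I) factor is what boosts rare shared subparts:

    import numpy as np

    def k_interp_reweighted(x, y, V, pj):
        # K_V(x, y) = K_prod(x, y) * (1/|V|) * sum_I delta(x_I, y_I) / p(x_I).
        n = len(x)
        k_prod = p_subset(x, range(n), pj) * p_subset(y, range(n), pj)
        s = sum(all(x[j] == y[j] for j in I) / p_subset(x, I, pj) for I in V)
        return k_prod * s / len(V)

    # Toy check against the definition (pj, V, and the strings are made up).
    pj = [{"A": 0.5, "B": 0.5}, {"A": 0.9, "B": 0.1}]
    V = [(), (0,), (1,), (0, 1)]
    x, y = ("A", "B"), ("A", "A")
    assert np.isclose(k_interp(x, y, V, pj), k_interp_reweighted(x, y, V, pj))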

  15. Implementation • Factorization for particular choices of p(·) and V • Example: ⋆ V = P({1, ..., n}), the set of all subsets, so |V| = 2ⁿ ⋆ product distribution p(x) = Π_{j=1}^{n} p_j(x_j) ⋆ implementation in O(n), because the sum over subsets factorizes: Σ_{I∈V} (...) = Π_{i=1}^{n} (...)
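
  Under exactly these choices the subset sum collapses into a product over positions, K_V(x, y) = K_prod(x, y) × (1/2ⁿ) Π_j (1 + δ(x_j, y_j)/p_j(x_j)); a sketch continuing the code above, with a brute-force check over all 2ⁿ subsets:

    import itertools
    import numpy as np

    def k_interp_fast(x, y, pj):
        # O(n) evaluation: the sum over all subsets factorizes per position.
        n = len(x)
        k_prod = p_subset(x, range(n), pj) * p_subset(y, range(n), pj)
        factor = 1.0
        for j in range(n):
            factor *= 1.0 + (x[j] == y[j]) / pj[j][x[j]]
        return k_prod * factor / 2 ** n

    # Brute-force check on all 2^n subsets (toy distributions, made-up strings).
    pj = [{"A": 0.5, "B": 0.5}, {"A": 0.9, "B": 0.1}, {"A": 0.2, "B": 0.8}]
    n = len(pj)
    V = [I for r in range(n + 1) for I in itertools.combinations(range(n), r)]
    x, y = ("A", "B", "B"), ("A", "A", "B")
    assert np.isclose(k_interp(x, y, V, pj), k_interp_fast(x, y, pj))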

  16. Part 3. Application: SVM prediction of signal peptide cleavage site

  17. Secretory pathway • [figure: mRNA → nascent protein with signal peptide → ER → Golgi; destination labels: nucleus, chloroplast, mitochondrion, cell surface (secreted), peroxisome, lysosome, cytosol, plasma membrane]

  18. Signal peptides • Examples, with cleavage between positions -1 and +1: (1) MKANAKTIIAGMIALAISHTAMA EE... (leucine-binding protein); (2) MKQSTIALALLPLLFTPVTKA RT... (pre-alkaline phosphatase); (3) MKATKLVLGAVILGSTLLAG CS... (pre-lipoprotein) • 6-12 hydrophobic residues (highlighted in yellow on the slide) • Small uncharged residues at positions (-3, -1)

  19. Experiment • Challenge: classification of amino acid windows, positive if cleavage occurs between positions -1 and +1: [x₋₈, x₋₇, ..., x₋₁, x₁, x₂] • 1,418 positive examples, 65,216 negative examples • Computation of a weight matrix: SVM + K_prod (naive Bayes) vs. SVM + interpolated kernel
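
  An end-to-end sketch of this kind of experiment on synthetic data (our code; the real dataset and per-position model are only loosely imitated here): estimate per-position residue frequencies as p(x) = Π_j p_j(x_j), build a Gram matrix with the fast interpolated kernel from the previous sketches, and train an SVM on it:

    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    alphabet = list("ACDEFGHIKLMNPQRSTVWY")
    # 60 random 10-residue windows with fake +/-1 labels (synthetic stand-ins).
    windows = [tuple(rng.choice(alphabet, size=10)) for _ in range(60)]
    labels = rng.integers(0, 2, size=60) * 2 - 1

    # Per-position residue frequencies, lightly smoothed to avoid zeros.
    pj = [{a: max(sum(w[j] == a for w in windows), 1) / len(windows)
           for a in alphabet} for j in range(10)]

    gram = np.array([[k_interp_fast(x, y, pj) for y in windows] for x in windows])
    gram /= np.mean(np.diag(gram))      # rescale: raw values are tiny for n = 10

    clf = SVC(kernel="precomputed").fit(gram, labels)
    print(clf.score(gram, labels))      # training accuracy on the toy data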

  20. Result: ROC curves • [figure: ROC curves for the interpolated kernel and the product kernel (Bayes); x-axis: false positive (%), 0 to 24; y-axis: false negative (%), 40 to 100]

  21. Conclusion

  22. Conclusion • Another way to derive a kernel from a probability distribution • Useful when objects can be compared by comparing subparts • Encouraging result on a real-world application: "how to improve a weight-matrix-based classifier" • Future work: more application-specific kernels

  23. Acknowledgements • Minoru Kanehisa • Applied Biosystems for the travel grant
