Orthogonal grey simultaneous component analysis to distinguish common and distinctive information in coupled data


  1. Orthogonal grey simultaneous component analysis to distinguish common and distinctive information in coupled data • Martijn Schouteden, Katrijn Van Deun, Iven Van Mechelen

  2. Outline • Introduction – Coupled data – Research questions • Method – Simultaneous component method – Problem – Solution: DISCO-GSCA • Illustration – Results • Conclusion

  4. Introduction • Coupled data: data that consist of different data blocks, all of which contain information about the same entities – E.g. • Data blocks = GC/MS and LC/MS • Variables = E. coli metabolites • Objects = conditions • [Figure: the LC/MS block (conditions $1 \ldots I$ by metabolites $1 \ldots J_1$) and the GC/MS block (conditions $1 \ldots I$ by metabolites $1 \ldots J_2$) side by side, sharing the conditions; Smilde et al. (2005)]

  6. • Finding mechanisms that underlie the coupled data • RESEARCH QUESTIONS: which mechanisms are – common to both data blocks and – distinctive for a single data block? Which metabolome processes are measured by both separation techniques? Which processes are measured by just one of the two?

  7. Outline • Introduction – Coupled data – Research questions • Method – Simultaneous component method – Problem – Solution: DISCO-GSCA • Illustration – Results • Conclusion

  9. Simultaneous Component Analysis • Finding underlying mechanisms in – ONE data block: Principal Component Analysis (PCA; Jolliffe, 2002) – More data blocks: Simultaneous Component Analysis (SCA; Van Deun et al., 2009)

  10. Simultaneous Component Analysis • [Figure: the LC/MS block ($I \times J_1$) and the GC/MS block ($I \times J_2$) are concatenated column-wise into $X_{conc}$, of size $I \times (J_1 + J_2)$]

  13. Simultaneous Component Analysis • Model: $X_{conc} = T\,[P_{LC}' \mid P_{GC}'] + [E_{LC} \mid E_{GC}] = T P_{conc}' + E_{conc}$ • Data = Scores × Loadings + Error, with sizes $X_{conc}$: $I \times (J_1 + J_2)$, $T$: $I \times R$, $P_{conc}'$: $R \times (J_1 + J_2)$, $E_{conc}$: $I \times (J_1 + J_2)$

  14. Simultaneous Component Analysis • Objective: $\min_{T, P_{conc}} \|X_{conc} - T P_{conc}'\|^2$
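
The SCA objective on slide 14 can be illustrated with a minimal NumPy sketch. It assumes the blocks are already centered and scaled, and it assumes the scores are identified by the constraint $T'T = I$, in which case a truncated singular value decomposition of $X_{conc}$ gives a least-squares solution. The function name `sca`, the block names, and the random data are hypothetical and serve only as an illustration.

```python
import numpy as np

def sca(blocks, n_components):
    """Least-squares SCA of column-wise concatenated blocks via truncated SVD.

    blocks: list of arrays that share the same rows (objects).
    Returns T (I x R, orthonormal columns) and P_conc ((J1+J2) x R).
    """
    X_conc = np.hstack(blocks)                       # I x (J1 + J2)
    U, s, Vt = np.linalg.svd(X_conc, full_matrices=False)
    T = U[:, :n_components]                          # scores, T'T = I
    P_conc = Vt[:n_components].T * s[:n_components]  # loadings, scaled by singular values
    return T, P_conc

# Hypothetical data: 10 conditions, 4 LC/MS and 6 GC/MS variables
rng = np.random.default_rng(0)
X_lc = rng.normal(size=(10, 4))
X_gc = rng.normal(size=(10, 6))

T, P_conc = sca([X_lc, X_gc], n_components=3)
X_conc = np.hstack([X_lc, X_gc])
loss = np.sum((X_conc - T @ P_conc.T) ** 2)          # value of the SCA objective
```

The first four rows of `P_conc` correspond to $P_{LC}$ and the remaining six to $P_{GC}$, matching the partition $[P_{LC}' \mid P_{GC}']$.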

  15. • Distinctive mechanisms = simultaneous components that underlie only one data block • Common mechanisms = simultaneous components that underlie both data blocks

  17. • E.g., with a single component: $\hat{X}_{conc} = T P_{conc}' = T\,[P_{LC}' \mid P_{GC}'] = \begin{bmatrix} x \\ \vdots \\ x \end{bmatrix} \left[\begin{array}{ccc|ccc} 0 & \cdots & 0 & x & \cdots & x \end{array}\right]$, a distinctive component for GC/MS

  19. • E.g., with three components: $P_{conc}' = [P_{LC}' \mid P_{GC}'] = \left[\begin{array}{ccc|ccc} 0 & \cdots & 0 & x & \cdots & x \\ x & \cdots & x & 0 & \cdots & 0 \\ x & \cdots & x & x & \cdots & x \end{array}\right]$, where row 1 is a distinctive component (D1), row 2 is a distinctive component (D2), and row 3 is a common component (C)

  20. Problem • However, with the SC method, obtaining such a pattern is outside the user's control

  21. Problem • The pattern above serves as the target for $P_{conc}'$, $P_{LC}'$ and $P_{GC}'$
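
One way to see why the zero pattern is outside the user's control is rotational freedom: assuming orthonormal scores, any orthogonal matrix $Q$ gives $(TQ)(P_{conc}Q)' = T P_{conc}'$, so the fit cannot distinguish a solution with the desired zeros from a rotated one without them. The sketch below uses hypothetical sizes and random values purely to illustrate this point.

```python
import numpy as np

rng = np.random.default_rng(1)
I, J1, J2, R = 8, 3, 4, 2             # hypothetical sizes

T = np.linalg.qr(rng.normal(size=(I, R)))[0]   # orthonormal scores
P = rng.normal(size=(J1 + J2, R))
P[:J1, 0] = 0.0                       # component 1: zero loadings on the first block

# An arbitrary 2 x 2 rotation
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
T_rot, P_rot = T @ Q, P @ Q

same_fit = np.allclose(T @ P.T, T_rot @ P_rot.T)            # reconstruction unchanged
still_orthonormal = np.allclose(T_rot.T @ T_rot, np.eye(R)) # constraint still holds
zeros_survive = np.allclose(P_rot[:J1, 0], 0.0)             # distinctive pattern lost
```

Both solutions reproduce the data equally well, but only the unrotated one shows a distinctive component.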

  22. Solution: DISCO-GSCA • Predecessors: – DISCO-SCA (Schouteden et al., 2010) – Grey Component Analysis (GCA, Westerhuis et al., 2007)

  23. Solution: DISCO-GSCA • Impose the target structure with a certain strength, set by $\lambda$: $\min_{T, P_{conc}} \|X_{conc} - T P_{conc}'\|^2 + \lambda \|W \circ (P_{conc} - P^{target})\|^2$ subject to $T'T = I$

  25. Solution: DISCO-GSCA • The penalty term for three components, with $\circ$ the elementwise product and $x$ an unconstrained loading:
      $W \circ (P_{conc} - P^{target}) = \begin{bmatrix} 1 & 0 & 0 \\ \vdots & \vdots & \vdots \\ 1 & 0 & 0 \\ \hline 0 & 1 & 0 \\ \vdots & \vdots & \vdots \\ 0 & 1 & 0 \end{bmatrix} \circ \left( \begin{bmatrix} p_{11} & p_{12} & p_{13} \\ \vdots & \vdots & \vdots \\ p_{J_1 1} & p_{J_1 2} & p_{J_1 3} \\ \hline p_{(J_1+1)1} & p_{(J_1+1)2} & p_{(J_1+1)3} \\ \vdots & \vdots & \vdots \\ p_{(J_1+J_2)1} & p_{(J_1+J_2)2} & p_{(J_1+J_2)3} \end{bmatrix} - \begin{bmatrix} 0 & x & x \\ \vdots & \vdots & \vdots \\ 0 & x & x \\ \hline x & 0 & x \\ \vdots & \vdots & \vdots \\ x & 0 & x \end{bmatrix} \right)$
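
The penalized objective can be written out as a short NumPy sketch. The alternating least-squares routine below is an assumption made for illustration and is not necessarily the authors' algorithm: with $T'T = I$ the loading update has a closed form per element, and the score update is an orthogonal Procrustes step. All names (`disco_gsca_loss`, `disco_gsca_als`) and the random data are hypothetical; the free ($x$) entries of the target are set to 0, which is harmless because $W$ is 0 there.

```python
import numpy as np

def disco_gsca_loss(X, T, P, W, P_target, lam):
    """Fit term plus lambda times the weighted deviation from the target."""
    fit = np.sum((X - T @ P.T) ** 2)
    penalty = np.sum((W * (P - P_target)) ** 2)
    return fit + lam * penalty

def disco_gsca_als(X, W, P_target, lam, n_iter=200, seed=0):
    """Illustrative alternating least squares (an assumption, see lead-in)."""
    rng = np.random.default_rng(seed)
    R = W.shape[1]
    T = np.linalg.qr(rng.normal(size=(X.shape[0], R)))[0]   # random orthonormal start
    losses = []
    for _ in range(n_iter):
        # Loading step: with T'T = I the problem separates per loading
        P = (X.T @ T + lam * W**2 * P_target) / (1.0 + lam * W**2)
        # Score step: maximize tr(T' X P) subject to T'T = I (Procrustes)
        U, _, Vt = np.linalg.svd(X @ P, full_matrices=False)
        T = U @ Vt
        losses.append(disco_gsca_loss(X, T, P, W, P_target, lam))
    return T, P, losses

# Hypothetical example following the three-component target of slide 25
rng = np.random.default_rng(0)
I, J1, J2, R = 20, 3, 4, 3
X = rng.normal(size=(I, J1 + J2))
W = np.zeros((J1 + J2, R))
W[:J1, 0] = 1.0    # component 1: zero loadings targeted on the first block
W[J1:, 1] = 1.0    # component 2: zero loadings targeted on the second block
P_target = np.zeros_like(W)

T, P, losses = disco_gsca_als(X, W, P_target, lam=1000.0)
```

With a large $\lambda$ the penalized loadings are shrunk close to their target value of 0, while $\lambda = 0$ reduces the problem to ordinary SCA.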

  27. Solution: DISCO-GSCA • Model selection: 3 steps – FIRST: Select the number of simultaneous components • (SCA, Van Deun et al., 2009) – SECOND: characterize these components • i.e., how many of them are common/distinctive? • (DISCO-SCA, Schouteden et al., 2010) – THIRD: define λ • L-curve (Hansen, 1992)
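
For the third step, a rough sketch of picking $\lambda$ from an L-curve is given below. It assumes that the fit term and the penalty term have already been computed for a grid of $\lambda$ values, and it locates the corner with a simple heuristic: the point farthest from the chord between the curve's endpoints on a log-log scale. This heuristic is a stand-in chosen for brevity, not Hansen's curvature-based criterion, and the numbers are made up.

```python
import numpy as np

def l_curve_corner(fit, penalty):
    """Index of the point farthest from the chord joining the endpoints (log-log).

    fit, penalty: positive values, ordered along the lambda grid.
    """
    x = np.log10(np.asarray(fit, dtype=float))
    y = np.log10(np.asarray(penalty, dtype=float))
    p0 = np.array([x[0], y[0]])
    p1 = np.array([x[-1], y[-1]])
    chord = (p1 - p0) / np.linalg.norm(p1 - p0)       # unit vector along the chord
    rel = np.column_stack([x, y]) - p0
    # Perpendicular distance = |2-D cross product with the unit chord|
    dist = np.abs(rel[:, 0] * chord[1] - rel[:, 1] * chord[0])
    return int(np.argmax(dist))

# Made-up values: residual fit grows and penalty shrinks as lambda increases
lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]
fit = [10.0, 10.1, 10.5, 30.0, 100.0]
penalty = [50.0, 5.0, 0.5, 0.4, 0.35]

best = lambdas[l_curve_corner(fit, penalty)]
```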

  28. Outline • Introduction – Coupled data – Research questions • Method – Simultaneous component method – Problem – Solution: DISCO-GSCA • Illustration – Results • Conclusion

  30. • Data: E. coli • Model: – 5 simultaneous components – Target: • 1 common component • 2 distinctive components for GC/MS • 2 distinctive components for LC/MS
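
The target of slide 30 (1 common component, 2 distinctive for GC/MS, 2 distinctive for LC/MS) can be encoded as a weight matrix $W$ with ones where a loading should be zero. The sketch below is illustrative only: the function name `build_weights`, the component order, and the block sizes are assumptions, since the slide does not state the number of variables per block.

```python
import numpy as np

def build_weights(J_lc, J_gc, status):
    """Build W ((J_lc + J_gc) x R) from one status label per component.

    status entries: 'common', 'lc' (distinctive for LC/MS) or 'gc'
    (distinctive for GC/MS). Rows are ordered LC/MS first, then GC/MS.
    """
    W = np.zeros((J_lc + J_gc, len(status)))
    for r, s in enumerate(status):
        if s == "lc":
            W[J_lc:, r] = 1.0     # penalize GC/MS loadings toward zero
        elif s == "gc":
            W[:J_lc, r] = 1.0     # penalize LC/MS loadings toward zero
        elif s != "common":
            raise ValueError(f"unknown status: {s}")
    return W

# Five components as on the slide; block sizes 4 and 6 are hypothetical
status = ["common", "gc", "gc", "lc", "lc"]
W = build_weights(J_lc=4, J_gc=6, status=status)
P_target = np.zeros_like(W)       # free entries are ignored because W is 0 there
```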
