On the Use of NMF and curvHDR to Cluster Flow Cytometry Data

  1. On the Use of NMF and curvHDR to Cluster Flow Cytometry Data
     José M. Maisog 1,2, Andrea A. Barbo 2, George Luta 2
     1 Medical Numerics, Inc., Germantown, MD 20876
     2 Department of Biostatistics, Bioinformatics, and Biomathematics, Georgetown University Medical Center, Washington, DC 20057
     FlowCAP Summit, September 21-22, 2010

  5. Outline
     1. Non-Negative Matrix Factorization
     2. curvHDR
     3. Strategy for FlowCAP Challenge 2
     4. Discussion
     NMF and curvHDR September, 2010 2 / 21

  13. Non-Negative Matrix Factorization: NMF
      - A relatively new method of matrix decomposition [LS99]
      - Given Y, an M × N non-negative matrix
      - Find W and H such that Y ≈ W × H
      - W is M × k, H is k × N
      - W and H are non-negative
      - Must specify k (cf. k-means clustering)
      - Dimensionality reduction: k < M, N
      - There are multiple variations, e.g. different optimization criteria
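The shape constraints on this slide can be illustrated with a minimal sketch. The toy factors, the helper names `matmul` and `factor_sizes`, and the sizes M = 1000, N = 50, k = 5 are hypothetical values chosen for illustration; none of them come from the slides. Plain Python lists are used only to keep the example dependency-free.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def factor_sizes(M, N, k):
    """Return (entries in Y, entries in W and H together)."""
    return M * N, M * k + k * N


# Hypothetical toy factors: W is 4 x 2 and H is 2 x 3, all entries non-negative.
W = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0], [1.0, 1.0]]
H = [[2.0, 0.0, 1.0], [0.0, 3.0, 1.0]]

# Y is 4 x 3 and, by construction, exactly equal to W x H (k = 2 < M, N).
Y = matmul(W, H)

# With k much smaller than M and N, the factors hold far fewer numbers than Y.
full, factored = factor_sizes(M=1000, N=50, k=5)
```

In this toy case the product of two non-negative factors is itself non-negative, and storing W and H for the hypothetical 1000 × 50 matrix takes 5,250 numbers instead of 50,000.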

  17. Non-Negative Matrix Factorization: NMF: Algorithm
      [Figure: Y = W • H + E, with axes labeled Samples and Variables (e.g., genes). Based on a figure from [You09].]
      - Initialize W and H with random values.
      - Optimize so that $\sum_{i,j} \left(y_{ij} - (WH)_{ij}\right)^2$ is minimized.
      - The k rows of H define “metagenes”, while the i-th row of W represents the “metagene expression pattern of the corresponding sample” [Dev08]
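The two algorithm steps (random initialization, then minimizing the squared error) can be sketched as follows. The slides do not say which optimizer was used; this sketch assumes the multiplicative update rules of Lee and Seung for the squared-error criterion, which is one common choice among several. The function names, the toy matrix `Y`, the iteration count, and the `assign_clusters` helper (which labels each sample by its largest entry in the corresponding row of W) are illustrative assumptions, not the authors' implementation.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def squared_error(Y, W, H):
    """Sum over i, j of (y_ij - (WH)_ij)^2."""
    WH = matmul(W, H)
    return sum((y - wh) ** 2 for ry, rwh in zip(Y, WH) for y, wh in zip(ry, rwh))


def nmf(Y, k, n_iter=200, seed=0, eps=1e-9):
    """Assumed optimizer: multiplicative updates for the squared-error criterion."""
    rng = random.Random(seed)
    M, N = len(Y), len(Y[0])
    # Step 1: initialize W (M x k) and H (k x N) with positive random values.
    W = [[rng.random() + eps for _ in range(k)] for _ in range(M)]
    H = [[rng.random() + eps for _ in range(N)] for _ in range(k)]
    # Step 2: alternate the two updates; each keeps all entries non-negative.
    for _ in range(n_iter):
        Wt = transpose(W)
        num, den = matmul(Wt, Y), matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(rh, rn, rd)]
             for rh, rn, rd in zip(H, num, den)]
        Ht = transpose(H)
        num, den = matmul(Y, Ht), matmul(W, matmul(H, Ht))
        W = [[w * n / (d + eps) for w, n, d in zip(rw, rn, rd)]
             for rw, rn, rd in zip(W, num, den)]
    return W, H


def assign_clusters(W):
    """Hypothetical convention: label sample i by the largest entry in row i of W."""
    return [max(range(len(row)), key=row.__getitem__) for row in W]


# Hypothetical toy data: 4 samples (rows) by 3 variables (columns).
Y = [[2.0, 0.0, 1.0], [1.0, 1.5, 1.0], [0.0, 6.0, 2.0], [2.0, 3.0, 2.0]]
W0, H0 = nmf(Y, k=2, n_iter=0)    # the random starting point
W, H = nmf(Y, k=2, n_iter=200)    # same seed, after 200 update rounds
labels = assign_clusters(W)
```

Comparing `squared_error(Y, W, H)` with `squared_error(Y, W0, H0)` shows how much the updates reduce the criterion from the random start. Because the result depends on the random initialization, different seeds may give different factors and labels.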

  18. Non-Negative Matrix Factorization: NMF Results are “Sparse”
      NMF has decomposed the face data into discrete “parts.” (Lee and Seung, Nature 1999 Oct 21;401(6755):788-91)

  19. Non-Negative Matrix Factorization: PCA of Face Data
      Principal components are “holistic” rather than discrete “parts.” (Lee and Seung, Nature 1999 Oct 21;401(6755):788-91)

  20. Non-Negative Matrix Factorization: Comparison of Matrix Decomposition Methods

      | Method  | Constraints                          | Basis      | Encodings  |
      |---------|--------------------------------------|------------|------------|
      | PCA/SVD | components are orthogonal            | non-sparse | non-sparse |
      | ICA     | statistically independent components | sparse     | non-sparse |
      | NMF     | data and factors are non-negative    | sparse     | sparse     |

  28. curvHDR
      Unsupervised clustering with unknown number of clusters [NLW10]
      Algorithm:
      1. Remove excess boundary points and other debris.
      2. Obtain significant high negative curvature regions [DCKW08].
      3. Replace each of the significant curvature regions by its convex hull.
      4. Grow each convex hull by a factor G.
      5. Obtain a kernel density estimate for data within each grown region [DCKW08].
      6. The curvHDR gate is the union of the level-τ high density regions (HDRs).
      Currently only the 2D version is implemented, but a 3D version will be released soon.
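The last two steps of the algorithm (a kernel density estimate, then a level-τ high density region) can be sketched in one dimension. This is not the curvHDR implementation: it skips debris removal, curvature testing, convex hulls, and growth by G, and it works in 1D rather than 2D. It also assumes a convention the slides do not spell out, namely that the level-τ HDR keeps the highest-density points holding roughly 1 − τ of the sample, with the cutoff estimated as the empirical τ-quantile of the density values at the data points. The function names, the fixed bandwidth, and the sample are hypothetical.

```python
import math


def gaussian_kde(data, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)

    return density


def hdr_cutoff(data, density, tau):
    """Assumed cutoff: empirical tau-quantile of the density at the data points."""
    values = sorted(density(x) for x in data)
    index = min(int(math.floor(tau * len(values))), len(values) - 1)
    return values[index]


def in_hdr(data, bandwidth, tau):
    """Flag each point as inside (True) or outside (False) the estimated HDR."""
    density = gaussian_kde(data, bandwidth)
    cutoff = hdr_cutoff(data, density, tau)
    return [density(x) >= cutoff for x in data]


# Hypothetical 1D sample: a tight cluster near 0 plus two stray points.
sample = [-0.2, -0.1, 0.0, 0.1, 0.2, 0.05, -0.05, 0.15, 5.0, -6.0]
mask = in_hdr(sample, bandwidth=0.3, tau=0.2)
```

With τ = 0.2 the two stray points fall below the cutoff and the eight clustered points are retained, which matches the intuition that the gate keeps the dense core of each region.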

  29. curvHDR: Illustration
      (Naumann et al., BMC Bioinformatics 2010 Jan 22;11:44)
