
Optimization Challenges in Cell Identification Stefan Wild Argonne - PowerPoint PPT Presentation



  1. Optimization Challenges in Cell Identification Stefan Wild Argonne National Laboratory Mathematics and Computer Science Division Joint work with Sven Leyffer, Thanh Ngo, and Siwei Wang August 1, 2012

  2. Disconnect and OPT(f, c) = min_{x ∈ ℝⁿ} { f(x) : c(x) ≤ 0 }. Gap between science, formulated problem, and algorithmic solution ⋄ “Solving OPT(f, c) results in overfitting.” ⋄ “The solution to OPT(f, c) must be post-processed.” ⋄ “What is OPT(f, c)? I just have an algorithm that gives me the solution.” ⋄ “I can’t solve the science, but I can solve OPT(f, c).” ⋄ “I don’t know how to solve OPT(f, c) on a (large) cluster.” CScADS 12 1

  3. Disconnect and OPT(f, c) = min_{x ∈ ℝⁿ} { f(x) : c(x) ≤ 0 }. Gap between science, formulated problem, and algorithmic solution ⋄ “Solving OPT(f, c) results in overfitting.” ⋄ “The solution to OPT(f, c) must be post-processed.” ⋄ “What is OPT(f, c)? I just have an algorithm that gives me the solution.” ⋄ “I can’t solve the science, but I can solve OPT(f, c).” ⋄ “I don’t know how to solve OPT(f, c) on a (large) cluster.” I will not close this gap! ⋄ Initial examples of (nonlinear) continuous, discrete, and mixed numerical/mathematical optimization for data analysis (many other, and better, examples exist) ⋄ Experimental data

  4. Part 1: Elemental Maps

  5. Multi-Dim. Imaging in X-ray Fluorescence Microscopy. Science challenges in nano-medicine and theranostics: ⋄ Design new treatments and drugs for targeted drug delivery ⋄ Combine therapy and diagnostics by targeting nanoparticles at cancer ⋄ Extract an efficiency score from multiple sources of data (instruments): X-ray, fluorescent, and visible-light images

  6. Manually Finding Cells is Difficult*

  7. Manually Finding Cells is Difficult* red blood cells

  8. Manually Finding Cells is Difficult* algae cells

  9. Manually Finding Cells is Difficult* yeast cells

  10. Challenges and Goals. Accurate statistics/recognition of hundreds of cells and of elemental distributions within regions of interest: 1. Lack of manual annotations 2. Nonuniformity of cells/noise/background. A first task: data reduction ⋄ Raw energy-channel maps → elemental maps ⋄ People look at only a handful of “elements” rather than 2000 channels. X_{e,p} = number of photons arriving at location p, within a range of energies around e; X is a non-negative energy-channel × pixel matrix (think: 10³ × 10⁷)

  11. 2D (Channel-Pixel) Optimization Approaches (I). Unconstrained low-rank approximation: min { ‖X − W Hᵀ‖²_F : W ∈ ℝ^{m×k}, H ∈ ℝ^{n×k} } ⋄ k ≪ min(m, n) known ⋄ X̃ = Σ_{i=1}^{k} W_i H_iᵀ ⋄ W = channel basis ⋄ H = pixel basis ⋄ Solved by the SVD (unknown W and H): W₁, H₁ non-negative; W_i, H_i have mixed signs for i > 1

  12. 2D (Channel-Pixel) Optimization Approaches (I). Unconstrained low-rank approximation: min { ‖X − W Hᵀ‖²_F : W ∈ ℝ^{m×k}, H ∈ ℝ^{n×k} } ⋄ k ≪ min(m, n) known ⋄ X̃ = Σ_{i=1}^{k} W_i H_iᵀ ⋄ W = channel basis ⋄ H = pixel basis ⋄ Solved by the SVD (unknown W and H): W₁, H₁ non-negative; W_i, H_i have mixed signs for i > 1 [Plot: Avg, W₁, W₂, and −W₂ versus channel (0 to 1000), log-scale vertical axis from 10⁻⁴ to 10⁰]
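The unconstrained low-rank problem on this slide is solved exactly by a truncated SVD (Eckart-Young). A minimal numpy sketch on synthetic data; the sizes and variable names here are illustrative stand-ins, not the talk's actual data:

```python
import numpy as np

# Toy stand-in for the channel-by-pixel matrix X (real data: ~1e3 x 1e7).
rng = np.random.default_rng(0)
m, n, k = 60, 300, 3
X = rng.random((m, k)) @ rng.random((k, n))  # synthetic data of exact rank k

# Best rank-k approximation X ~ W H^T from the SVD.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
W = U[:, :k] * s[:k]   # channel basis, scaled by singular values
H = Vt[:k].T           # pixel basis
err = np.linalg.norm(X - W @ H.T, "fro")  # ~0 here, since rank(X) <= k
```

Because X has exact rank k, the residual vanishes; on real data one inspects the singular-value decay to pick k. Note the mixed signs in the later columns of W and H, which motivate the constrained formulation that follows.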

  13. 2D (Channel-Pixel) Optimization Approaches (II). Constrained approximation: min { ‖X − W Hᵀ‖²_F : W ∈ ℝ^{m×k}, H ∈ ℝ^{n×k}, W ≥ 0, H ≥ 0 }. Non-negative matrix factorization (NMF) ⋄ W = channel basis ⋄ H = pixel basis ⋄ Preserves structure and approximation ⋄ Multiplicative update algorithms: W_{i,j} ← W_{i,j} (XH)_{i,j} / (W(HᵀH))_{i,j}, H_{j,i} ← H_{j,i} (WᵀX)_{i,j} / ((WᵀW)Hᵀ)_{i,j} ⋄ Other formulations (e.g., nnz(W) ≤ θ)

  14. 2D (Channel-Pixel) Optimization Approaches (II). Constrained approximation: min { ‖X − W Hᵀ‖²_F : W ∈ ℝ^{m×k}, H ∈ ℝ^{n×k}, W ≥ 0, H ≥ 0 }. Non-negative matrix factorization (NMF) ⋄ W = channel basis ⋄ H = pixel basis ⋄ Preserves structure and approximation ⋄ Multiplicative update algorithms: W_{i,j} ← W_{i,j} (XH)_{i,j} / (W(HᵀH))_{i,j}, H_{j,i} ← H_{j,i} (WᵀX)_{i,j} / ((WᵀW)Hᵀ)_{i,j} ⋄ Other formulations (e.g., nnz(W) ≤ θ) [Figure: X ≈ W × Hᵀ, with channel-basis columns for P, Cu, Zn and their pixel maps]
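The multiplicative updates above can be sketched in a few lines of numpy. This is a generic Lee-Seung style implementation under the X ≈ WHᵀ convention, not the code used in the talk; the function name, iteration count, and epsilon guard are my own choices:

```python
import numpy as np

def nmf_multiplicative(X, k, iters=300, eps=1e-9, seed=0):
    """Multiplicative updates for min ||X - W H^T||_F^2 s.t. W, H >= 0.
    eps guards against division by zero; updates preserve non-negativity."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k))
    H = rng.random((n, k))
    for _ in range(iters):
        H *= (X.T @ W) / (H @ (W.T @ W) + eps)   # H_{j,i} update from the slide
        W *= (X @ H) / (W @ (H.T @ H) + eps)     # W_{i,j} update from the slide
    return W, H

# Non-negative synthetic data with exact rank-2 structure.
rng = np.random.default_rng(1)
X = rng.random((40, 2)) @ rng.random((2, 60))
W, H = nmf_multiplicative(X, k=2)
rel_err = np.linalg.norm(X - W @ H.T) / np.linalg.norm(X)
```

Rerunning with a different `seed` typically changes the factors found, which illustrates the initialization sensitivity discussed two slides later.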

  15. Revealing Latent Structure Through NMF ⋄ Non-negative output is compatible with intuitive psychological and physiological evidence ⋄ Reconstruction through additive combination of non-negative W_{i,j} yields a sparse, parts-based representation [Lee & Seung, Nature 1999]. Applications: ⋄ Natural language processing (sparsity helps!): bag-of-words, latent Dirichlet allocation, semantic role labeling, K-L divergence, . . . ⋄ Face recognition/image clustering: reveal noses, lips, eyes, . . . ⋄ DNA microarrays

  16. No Silver Bullet. Challenges/drawbacks of NMF ⋄ Unique parts-based representation only under specific conditions (e.g., a separable complete factorial family [Donoho et al. 2003]) ⋄ Initialization directly impacts the quality of the output ⋄ Challenging objective functions (nonlinear, nonconvex, . . . ) ⋄ Many local minima ⋄ The expert/modeler needs to specify goals: Sparse features? Accurate approximation? Labeled/semi-supervised data? Features corresponding to elements?

  17. Incorporating the Science: Basis Initialization ⋄ Gaussian distributions describe reference elements via an “element signature” ⋄ Gaussians at Kα₁, Kα₂, Kβ₁ for elements of interest

  18. Incorporating the Science: Basis Initialization ⋄ Gaussian distributions describe reference elements via an “element signature” ⋄ Gaussians at Kα₁, Kα₂, Kβ₁ for elements of interest
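One way to realize such an initialization is to build each column of W as a sum of Gaussians centered at the element's emission-line channels. A hedged sketch; the line positions, relative weights, and width below are illustrative placeholders, not calibrated Kα₁/Kα₂/Kβ₁ values for any real element:

```python
import numpy as np

def gaussian_signature(channels, lines, weights, sigma):
    """Element signature: a weighted sum of Gaussians at emission-line
    channels, normalized to sum to one (hypothetical helper)."""
    col = np.zeros_like(channels, dtype=float)
    for mu, a in zip(lines, weights):
        col += a * np.exp(-0.5 * ((channels - mu) / sigma) ** 2)
    return col / col.sum()

channels = np.arange(2000, dtype=float)
# Illustrative (not calibrated) line channels and weights for one element.
w_init = gaussian_signature(channels, lines=[800.0, 805.0, 890.0],
                            weights=[1.0, 0.5, 0.17], sigma=4.0)
```

Stacking one such column per element of interest gives a physically motivated starting W for the multiplicative updates.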

  19. Weight Image H_S Associated With the S Basis. Previous fitting: 1 hour. Square initialization (iter = 1000): 1.5 minutes. Gaussian initialization (iter = 100): 10 seconds

  20. Multi-Channel Images Corresponding to Chemical Elements [Figure: per-element maps for Ca, Cl, Cu, Fe, K, P, S, TFY, Zn] + Sufficient for many users/groups − An initial step toward the ultimate cell identification/classification goals − Neglects spatial attributes of pixels

  21. Part 2: Finding Cells

  22. Identifying Cells in Images ⋄ Cells have different sizes and shapes ⋄ Images are noisy and potentially large (O(10⁷) pixels). Zn map with more than 500 cells

  23. Graph Partitioning Approaches ⋄ Build an undirected graph G = (V, E) from the image: v ∈ V corresponds to a pixel or a small region; e_{uv} ∈ E connects u and v with weight w_{uv} ⋄ Connectivity: connect local pixels (k-nearest neighbors or r-neighborhood); w_{uv} large for pixels within a group, small for pixels in different groups. Goal: partition the graph into disjoint partitions

  24. Graph Partitioning Approaches ⋄ Build an undirected graph G = (V, E) from the image: v ∈ V corresponds to a pixel or a small region; e_{uv} ∈ E connects u and v with weight w_{uv} ⋄ Connectivity: connect local pixels (k-nearest neighbors or r-neighborhood); w_{uv} large for pixels within a group, small for pixels in different groups. Goal: partition the graph into disjoint partitions
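The r-neighborhood construction can be sketched directly. This dense version is only suitable for tiny images (an O(10⁷)-pixel map needs sparse storage); the Gaussian intensity-similarity weight, kernel width, and radius are assumptions chosen for illustration:

```python
import numpy as np

def pixel_graph(img, r=1, sigma=0.1):
    """Dense adjacency matrix for a small grayscale image: connect pixels
    within Chebyshev radius r, weighted by a Gaussian similarity of
    intensities, so w_uv is large for similar nearby pixels."""
    h, w = img.shape
    n = h * w
    W = np.zeros((n, n))
    for i in range(h):
        for j in range(w):
            u = i * w + j
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii, jj = i + di, j + dj
                    if (di, dj) != (0, 0) and 0 <= ii < h and 0 <= jj < w:
                        v = ii * w + jj
                        diff = img[i, j] - img[ii, jj]
                        W[u, v] = np.exp(-diff**2 / (2 * sigma**2))
    return W

# 2x3 toy image: left four pixels dark, right column bright.
img = np.array([[0.0, 0.0, 1.0],
                [0.0, 0.0, 1.0]])
W = pixel_graph(img)
```

Similar neighbors get weight near 1, while the dark-to-bright edges get weight near 0, which is exactly the structure the cut objectives on the next slides exploit.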

  25. Discrete Optimization and 2-way Graph Partitioning. Minimum-weight cut: min { Cut(A, Ā) = Σ_{u ∈ A, v ∈ Ā} w_{uv} : A ∪ Ā = V, A ∩ Ā = ∅, A ≠ ∅, Ā ≠ ∅ } + Efficient combinatorial algorithms exist − Often favors unbalanced cuts

  26. Discrete Optimization and 2-way Graph Partitioning. Minimum-weight cut: min { Cut(A, Ā) = Σ_{u ∈ A, v ∈ Ā} w_{uv} : A ∪ Ā = V, A ∩ Ā = ∅, A ≠ ∅, Ā ≠ ∅ } + Efficient combinatorial algorithms exist − Often favors unbalanced cuts. To obtain balanced cuts: RatioCut(A, Ā) = Cut(A, Ā)/|A| + Cut(A, Ā)/|Ā|; NormalizedCut(A, Ā) = Cut(A, Ā)/vol(A) + Cut(A, Ā)/vol(Ā) − Minimizing these objectives is hard
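The three objectives can be evaluated directly for a candidate bipartition, which makes the balancing terms concrete. A small helper (the function name is mine) on a 4-node path-like graph:

```python
import numpy as np

def cut_scores(W, mask):
    """Cut, RatioCut, and NormalizedCut of the bipartition (A, A-bar)
    given by a boolean mask over the vertices of adjacency matrix W."""
    A, B = mask, ~mask
    cut = W[np.ix_(A, B)].sum()              # total weight crossing the cut
    d = W.sum(axis=1)                        # vertex degrees
    ratio = cut / A.sum() + cut / B.sum()    # balance by cardinality
    ncut = cut / d[A].sum() + cut / d[B].sum()  # balance by volume
    return cut, ratio, ncut

# Edges: (0,1) weight 1, (2,3) weight 1, (0,2) weight 0.5.
W = np.array([[0.0, 1.0, 0.5, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.5, 0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
mask = np.array([True, True, False, False])  # A = {0, 1}
cut, ratio, ncut = cut_scores(W, mask)
```

Here only the weight-0.5 edge crosses the cut, and both partitions have equal size and volume, so the balancing denominators do not penalize this split.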

  27. Spectral Relaxations. Cut(A, Ā) = ½ zᵀLz, where z_i = 1 if i ∈ A, 0 otherwise. RatioCut(A, Ā) = zᵀLz / zᵀz, where z_i = √(|Ā|/|A|) if i ∈ A, and z_i = −√(|A|/|Ā|) otherwise. NormalizedCut(A, Ā) = zᵀLz / zᵀDz, where z_i = √(vol(Ā)/vol(A)) if i ∈ A, and z_i = −√(vol(A)/vol(Ā)) otherwise. L = D − W; W = adjacency matrix; D_{ii} = Σ_j w_{ij}

  28. Spectral Relaxations. Cut(A, Ā) = ½ zᵀLz, where z_i = 1 if i ∈ A, 0 otherwise. RatioCut(A, Ā) = zᵀLz / zᵀz, where z_i = √(|Ā|/|A|) if i ∈ A, and z_i = −√(|A|/|Ā|) otherwise. NormalizedCut(A, Ā) = zᵀLz / zᵀDz, where z_i = √(vol(Ā)/vol(A)) if i ∈ A, and z_i = −√(vol(A)/vol(Ā)) otherwise. L = D − W; W = adjacency matrix; D_{ii} = Σ_j w_{ij}. Relax z ∈ {0, 1} to have real values ⋄ Solve for the eigenvector associated with the 2nd-smallest eigenvalue: RatioCut → Lz = λz; NormalizedCut → Lz = λDz (a generalized eigenproblem) ⋄ Equivalently, take the eigenvector y of the normalized graph Laplacian I − D^{−1/2} W D^{−1/2}, then set z = D^{−1/2} y [Luxburg, “A tutorial on spectral clustering,” 2007]
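The normalized-Laplacian route on this slide can be sketched end to end. A dense toy implementation using a full eigendecomposition (names and the sign-thresholding rounding step are my choices; real image-scale graphs need sparse eigensolvers rather than `eigh`):

```python
import numpy as np

def spectral_bisect(W):
    """Relaxed NormalizedCut: take the eigenvector y for the 2nd-smallest
    eigenvalue of I - D^{-1/2} W D^{-1/2}, map back via z = D^{-1/2} y,
    and round the relaxed z by thresholding at zero."""
    d = W.sum(axis=1)
    d_isqrt = 1.0 / np.sqrt(d)
    L_norm = np.eye(len(d)) - (d_isqrt[:, None] * W) * d_isqrt[None, :]
    vals, vecs = np.linalg.eigh(L_norm)  # eigenvalues in ascending order
    y = vecs[:, 1]                       # 2nd-smallest eigenvalue's vector
    z = d_isqrt * y
    return z >= 0                        # boolean membership in A

# Two tightly connected triangles joined by one weak edge.
W = np.zeros((6, 6))
for group in (range(3), range(3, 6)):
    for i in group:
        for j in group:
            if i != j:
                W[i, j] = 1.0
W[2, 3] = W[3, 2] = 0.05
labels = spectral_bisect(W)
```

On this toy graph the relaxation cleanly separates the two triangles, cutting only the weak edge.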
