Deep Learning Theory with Application to Cancer Research


  1. Deep Learning Theory with Application to Cancer Research Leonie Zeune, Stephan van Gils, Guus van Dalum, Leon Terstappen, Christoph Brune Inverse Problems and Machine Learning, Pasadena, USA, Feb 9-11, 2018

  2. Deep Learning as a Black Box. MIT Technology Review Breakthrough Technologies: Deep Learning (2013), Reinforcement Learning (2017); Google Trends benchmark 2010-2017. April 2017: "No one really knows how the most advanced algorithms do what they do. That could be a problem." June 2017: "Artificial intelligence is a black box that thinks in ways we don't understand. That's thrilling and scary."

  3. Closing the Gap between Math and Machine Learning. Model-based (theory, mathematics): regularization, calculus of variations, partial differential equations, inverse problems, graphs/networks, optimization, uncertainty quantification. Data-based (application): machine learning, deep learning, big data, classification, segmentation, clustering, life sciences, biomedical imaging.

  4. Deeper Insights into Deep Inversion

  5. Deep Learning for Inverse Problems. Inverse problem: Ku = f.
◮ (u*, f*) available → supervised learning: learning variational networks; learning unrolled proximal schemes (LISTA, learned primal-dual).
◮ u* available → semi-supervised learning: min_θ [D(Ku, f) − log μ_θ(u)], with μ_θ(u) = P_θ(U = u).
◮ f* available → unsupervised learning: min_θ E_f [ || K(K_θ†(f)) − f || ]; related: autoencoders (AE), i.e. K^T K(u) ≈ u, and GANs.
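The supervised branch can be made concrete with a tiny LISTA-style sketch: classical ISTA for Ku = f with an ℓ1 prior iterates u ← soft(W_e f + S u, θ), where W_e = (1/L) K^T and S = I − (1/L) K^T K are fixed by K; LISTA keeps the iteration but learns W_e, the S matrices and the thresholds per layer from (u*, f*) pairs. A minimal numpy sketch of the forward pass only (all names hypothetical, no training loop):

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||.||_1 (the ISTA nonlinearity)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def lista_forward(f, We, S_layers, thetas):
    """Truncated, learned ISTA: one soft-thresholded affine step per layer.
    Classical ISTA fixes We = (1/L) K^T and S = I - (1/L) K^T K;
    LISTA would learn We, the S matrices and the thresholds from data."""
    u = soft_threshold(We @ f, thetas[0])
    for S, theta in zip(S_layers, thetas[1:]):
        u = soft_threshold(We @ f + S @ u, theta)
    return u
```

With K = I (so We = I and S = 0) the fixed point is the exact ℓ1 prox of f, which makes a quick sanity check of the unrolled scheme.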

  6. Challenges in Deep Learning. Mathematical / ML questions:
◮ Network architecture (structure/patterns): Which activation functions (nonlinearities, norms) should be used? What is the importance of depth (scale), width (fully connected) and convolution (diffusion)?
◮ Network as a generalized ODE (parameters/design): How can we add robustness to the learning of a network? Can deep learning be viewed as a metric learning problem? What are the statistical properties of images (patterns) captured by deep learning networks?
◮ Nonconvex network learning (optimization/learning): What is the optimal selection and amount of training data? How do we deal with the nonconvexity, and with the fact that many local minima share a similar performance?

  7. Cancer-ID Project. Cancer-ID aims to validate blood-based biomarkers for cancer.
◮ Cells dissociate from the primary tumor and invade the blood circulation.
◮ The circulating tumor cell (CTC) count has prognostic value for survival outcome.
◮ CTCs are rare cell events, challenging to detect.
◮ No overall CTC definition exists yet.

  8. Automatic and Platform Independent CTC Definition. Find and classify CTCs in various data sets!

  9. Semi- or Unsupervised Analysis of Structure and Scale? Idea: artefacts, intact cells and fragments of cells have different sizes and intensities. Can we detect that automatically? Scale information might help to improve classification results.

  10. Goal: find similarities between variational methods and deep learning. Lower-level task: denoising by nonlinear diffusion vs. denoising by a CNN autoencoder. High-level task: segmentation using nonlinear diffusion vs. classification using a CNN.

  11. Spectral Transformation and Filtering (Fourier, Wavelet): a more informative signal representation!
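As a reminder of the classical linear version of this idea, a short numpy sketch (illustrative values only): transform to the Fourier domain, suppress unwanted bands, transform back.

```python
import numpy as np

# Linear spectral filtering: a 3 Hz and a 40 Hz component over a 1 s window.
t = np.linspace(0.0, 1.0, 256, endpoint=False)
signal = np.sin(2 * np.pi * 3 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)

coeffs = np.fft.rfft(signal)                  # spectral representation
freqs = np.fft.rfftfreq(t.size, d=t[1] - t[0])
coeffs[freqs > 10.0] = 0.0                    # ideal low-pass filter H
lowpass = np.fft.irfft(coeffs, n=t.size)      # back to the signal domain
```

Since both frequencies fall on exact FFT bins here, the low-pass output recovers the 3 Hz component up to rounding; the nonlinear spectral TV transform on the following slides replaces the Fourier basis with TV eigenfunctions.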

  12. Spectral Analysis for TV Denoising. Forward total variation (TV) flow: u_t = −p for p ∈ ∂TV(u), u|_{t=0} = f^δ. Discrete case: solve in every time step the ROF problem [Rudin et al. 92]: min_u ½ || u − u^n ||₂² + α TV(u). Idea: solutions of the nonlinear eigenvalue problem λu ∈ ∂J(u) with J(u) = TV(u) are transformed to peaks in the spectral domain. [Gilboa, 2013], [Gilboa, 2014], [Horesh, Gilboa 15], [Burger et al., 2015], [Gilboa et al., 2015], [Burger et al., 2016]
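For intuition, here is a minimal explicit sketch of the TV flow in pure numpy. It steps u_t = div(∇u / |∇u|) with |∇u| smoothed by a small ε (a common regularization) and periodic boundaries, rather than solving the implicit ROF problem per step as on the slide; step size and ε are illustrative choices.

```python
import numpy as np

def tv_flow_step(u, dt=0.1, eps=1e-3):
    """One explicit step of regularized TV flow u_t = div(grad u / |grad u|_eps).
    Forward differences for the gradient, backward differences for the
    divergence, periodic boundaries via np.roll."""
    ux = np.roll(u, -1, axis=1) - u
    uy = np.roll(u, -1, axis=0) - u
    mag = np.sqrt(ux**2 + uy**2 + eps**2)     # smoothed gradient magnitude
    px, py = ux / mag, uy / mag               # p with |p| <= 1 (subgradient of TV)
    div = (px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0))
    return u + dt * div
```

The step conserves the mean and satisfies a maximum principle; iterating it shrinks structures at a rate proportional to perimeter over area, which is exactly what the spectral transform turns into peaks.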

  13. Spectral Transform. Spectral transform and response (acc. to [Gilboa 13/14]): φ(t) = t · u_tt, S(t) := || φ(t; x) ||_{L¹(Ω)}. Signal representation: f(x) = ∫₀^∞ φ(t; x) dt + f̄. Filtering: f_H(x) = ∫₀^∞ φ_H(t; x) dt + H(∞) f̄, with φ_H(t; x) = H(t) φ(t; x). A Parseval identity is also available. Example taken from [Burger et al., 2015].
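The peak behaviour can be checked numerically on the closed-form flow of an eigenfunction. If λf ∈ ∂TV(f) (e.g. f a disc indicator), the TV flow is u(t; x) = max(1 − λt, 0) f(x), so φ(t) = t · u_tt concentrates at t = 1/λ. A small numpy sketch that uses this closed form directly (λ and the signal are made-up values):

```python
import numpy as np

lam = 2.0
f = np.zeros(64)
f[24:40] = 1.0                              # indicator stand-in for an eigenfunction
ts = np.linspace(0.0, 1.0, 201)
dt = ts[1] - ts[0]

# Closed-form TV flow of an eigenfunction: u(t; x) = max(1 - lam*t, 0) * f(x).
U = np.maximum(1.0 - lam * ts[:, None], 0.0) * f[None, :]

u_tt = np.gradient(np.gradient(U, dt, axis=0), dt, axis=0)
phi = ts[:, None] * u_tt                    # phi(t; x) = t * u_tt(t; x)
S = np.abs(phi).sum(axis=1)                 # S(t) = ||phi(t; .)||_{L1}
t_peak = ts[np.argmax(S)]                   # expect the peak at t = 1/lam
```

Away from t = 1/λ the flow is linear in t, so φ vanishes there and all the spectral mass sits in the peak, as on the slide.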

  14. Variational Methods for Segmentation. Which variational models can be used to partition an image into two regions? Active contours without edges model (Chan-Vese): J_CV(c₁, c₂, C) = ∫_{Ω_in} (f(x) − c₁)² dx + ∫_{Ω_out} (f(x) − c₂)² dx + α · Length(C) → min over C, c₁, c₂. [Osher, Sethian 88], [Mumford, Shah 89], [Chan, Vese 01], [Ambrosio, Tortorelli 90]. Related to the level-set method (Hamilton-Jacobi).
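A small numpy sketch of the discretized energy for a binary labeling u (1 inside, 0 outside), with Length(C) replaced by a discrete anisotropic TV of u; this is a toy discretization for checking values, not the level-set evolution:

```python
import numpy as np

def chan_vese_energy(f, u, alpha):
    """J_CV for binary u: two-region fidelity plus alpha * discrete TV(u)."""
    c1 = f[u == 1].mean()                   # mean intensity inside
    c2 = f[u == 0].mean()                   # mean intensity outside
    fidelity = ((f - c1)**2 * u).sum() + ((f - c2)**2 * (1 - u)).sum()
    tv = np.abs(np.diff(u, axis=0)).sum() + np.abs(np.diff(u, axis=1)).sum()
    return fidelity + alpha * tv, (c1, c2)
```

For a piecewise-constant image labeled exactly, the fidelity term vanishes and only the length penalty remains; minimization alternates between updating (c₁, c₂) as region means and updating u.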

  15. Relation of Total Variation and Perimeter. Function space of bounded variation: BV(Ω) := { u ∈ L¹(Ω) | TV(u) < ∞ } with TV(u) := sup { ∫_Ω u ∇·ϕ dx | ϕ ∈ C₀^∞(Ω; ℝ²), ||ϕ||_∞ ≤ 1 }. Relation with the CV segmentation model: Length(C) = TV(u) with u(x) = 1 if x ∈ Ω_in ∪ C and u(x) = 0 if x ∈ Ω_out. TV-based formulation of the CV model: J_CV2(c₁, c₂, u) = ∫_Ω ((f(x) − c₁)² − (f(x) − c₂)²) u dx + α TV(u) → min over u ∈ BV(Ω) with u(x) ∈ {0, 1}, and c₁, c₂. For fixed c₁, c₂ this corresponds to ROF with a binary constraint ([Burger et al. 12]): min_{u ∈ BV(Ω), u(x) ∈ {0,1}} ½ || u(x) − r(x) ||₂² + α TV(u) with r(x) = (f(x) − c₂)² − (f(x) − c₁)² + ½.
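The identity Length(C) = TV(u) can be sanity-checked discretely. With the anisotropic discretization of TV (sum of absolute jumps across grid edges), the TV of an indicator equals the ℓ1 perimeter, which for an axis-aligned rectangle coincides with its boundary length; a short numpy sketch (shapes are made up):

```python
import numpy as np

def tv_aniso(u):
    """Discrete anisotropic TV: total absolute jump across grid edges."""
    return np.abs(np.diff(u, axis=0)).sum() + np.abs(np.diff(u, axis=1)).sum()

u = np.zeros((32, 32))
u[8:24, 10:20] = 1.0            # indicator of a 16 x 10 rectangle
perimeter = 2 * (16 + 10)       # = 52
```

For non-axis-aligned boundaries the anisotropic TV measures the ℓ1 perimeter instead of the Euclidean one; isotropic discretizations of TV remove that bias.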

  16. Scale Spaces for Segmentation. "Forward scale space" for filtering and segmentation:
◮ Nonlinear filtering (ROF): min_u ½ || u − f ||₂² + α TV(u), with scale parameter α.
◮ Nonlinear segmentation (CV): min_u ∫_Ω ((f − c₁)² − (f − c₂)²) u dx + α TV(u), with scale parameter α.
◮ "Inverse scale space" for filtering through Bregman iterations [Osher et al. 05]: u^{k+1} = argmin_{u ∈ BV(Ω)} ½ || u − f ||₂² + α (TV(u) − ⟨u, p^k⟩) with p^k ∈ ∂TV(u^k), p^0 = 0, and scale parameter k.
◮ How can we construct an "inverse scale space" for segmentation?
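The inverse-scale-space behaviour of the Bregman iteration is easy to see in a toy problem. Replacing TV by the ℓ1 norm (our substitution, so the inner minimization has a closed-form prox) gives u^{k+1} = prox_{αJ}(f + α p^k), p^{k+1} = p^k + (f − u^{k+1})/α, and large-magnitude components enter the reconstruction first:

```python
import numpy as np

def soft(x, tau):
    """Prox of tau * ||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def bregman_iteration(f, alpha, n_iter):
    """Bregman iterations u^{k+1} = argmin 1/2 ||u - f||^2 + alpha*(J(u) - <u, p^k>)
    with J = ||.||_1, in the equivalent 'add back the residual' form."""
    p = np.zeros_like(f)
    iterates = []
    for _ in range(n_iter):
        u = soft(f + alpha * p, alpha)
        p = p + (f - u) / alpha
        iterates.append(u)
    return iterates
```

On f = (3, 1, 0.2) with α = 2 the entry 3 is recovered exactly after two iterations, 1 after three, and 0.2 only around iteration eleven: coarse scales enter first, fine scales later, indexed by the scale parameter k.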

  17. Spectral Transform for Segmentation. Spectral transform and response: φ(t; x) = −u_t(x) (forward case), φ(t; x) = u_t(x) (inverse case); S(t) = || φ(t; x) ||_{L¹(Ω)}. Is S(t) = ⟨Φ(t), 2t p(t)⟩ better? ([Burger et al., 2015]). Segmentation representation: f_seg(x) = ∫₀^∞ φ(t; x) dt. Filtering: f_seg^H(x) = ∫₀^∞ φ_H(t; x) dt with φ_H(t; x) = H(t) φ(t; x).

  18. Detection of Different Sizes. Bregman-CV with α = 100, discs with fixed intensity and varying size; results shown over Bregman iterations 1 to 22.
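A test image of this kind is simple to generate; the sizes and positions below are made up, not those from the slides. Since a disc of radius r is a TV eigenfunction with eigenvalue λ = 2/r (perimeter over area), the radius sets the scale, i.e. the Bregman iteration at which each disc is detected:

```python
import numpy as np

def disc_image(shape, discs):
    """Binary image with discs given as (center_y, center_x, radius) tuples."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    img = np.zeros(shape)
    for cy, cx, r in discs:
        img[(yy - cy)**2 + (xx - cx)**2 <= r**2] = 1.0   # fixed intensity
    return img

# Three non-overlapping discs of varying size (hypothetical layout).
f = disc_image((128, 128), [(32, 32, 5), (32, 96, 10), (96, 64, 20)])
```

Running a multiscale segmentation such as Bregman-CV on f should then separate the discs by size, with the scale parameter (iteration index) playing the role of the spectral variable.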
