Learning the Sampling for MRI


  1. Learning the Sampling for MRI. Matthias J. Ehrhardt, Institute for Mathematical Innovation, University of Bath, UK. June 24, 2020. Joint work with: F. Sherry, M. Graves, G. Maierhofer, G. Williams, C.-B. Schönlieb (all Cambridge, UK), M. Benning (Queen Mary, UK), J.C. De los Reyes (EPN, Ecuador).

  2. Outline: 1) Motivation: $\min_x \frac{1}{2}\|SFx - y\|_2^2 + \lambda R(x)$. 2) Bilevel learning: $\min_{x,y} f(x, y)$ subject to $x = \arg\min_z g(z, y)$. 3) Learning the sampling pattern in MRI.

  3. Inverse problems: $Ax = y$, where $x$ is the desired solution, $y$ the observed data, and $A$ the mathematical model. Goal: recover $x$ given $y$.

  4. Inverse problems (continued). Hadamard (1902): we call an inverse problem $Ax = y$ well-posed if (1) a solution $x^*$ exists, (2) the solution $x^*$ is unique, and (3) $x^*$ depends continuously on the data $y$. Otherwise, it is called ill-posed. Most interesting problems are ill-posed.

  5. Example: Magnetic Resonance Imaging (MRI). Continuous model: the Fourier transform, $Ax(s) = \int_{\mathbb{R}^2} x(t) \exp(-i \langle s, t \rangle)\, dt$. Discrete model: $A = F \in \mathbb{C}^{N \times N}$. [Figure: MRI scanner and example image.]

  6. Example: Magnetic Resonance Imaging (MRI), with undersampling. Continuous model as before; discrete model: $A = SF \in \mathbb{C}^{n \times N}$ with $n < N$. The solution is not unique.
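
Concretely, the discrete model $A = SF$ is just a 2D FFT followed by subsampling. Below is a minimal NumPy sketch (not from the talk; the test image, the binary mask, and the 30% sampling rate are illustrative placeholders):

```python
import numpy as np

def forward(x, mask):
    """A = SF: 2D Fourier transform, then keep only the sampled entries."""
    return np.fft.fft2(x, norm="ortho")[mask]

def adjoint(y, mask):
    """A* = F*S*: zero-fill the unsampled entries, then inverse FFT."""
    k = np.zeros(mask.shape, dtype=complex)
    k[mask] = y
    return np.fft.ifft2(k, norm="ortho")

rng = np.random.default_rng(0)
x = rng.random((64, 64))              # placeholder "image"
mask = rng.random((64, 64)) < 0.3     # random 30% sampling pattern Omega
y = forward(x, mask)                  # undersampled k-space data
x_zf = adjoint(y, mask)               # zero-filled reconstruction F*S*y
```

Many images are consistent with the data $y$ (any $x$ whose unsampled Fourier coefficients differ gives the same $y$), which is exactly the non-uniqueness noted above.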

  7. How to solve inverse problems? Variational regularization (∼2000): approximate a solution $x^*$ of $Ax = y$ via
$$\hat{x} \in \arg\min_x \frac{1}{2}\|Ax - y\|_2^2 + \lambda R(x).$$
The regularizer $R$ penalizes unwanted features and ensures stability and uniqueness. The regularization parameter $\lambda \ge 0$ balances the two terms: if $\lambda = 0$, a solution of the original problem $Ax = y$ is recovered; as $\lambda \to \infty$, more and more weight is given to the regularizer $R$. Textbooks: Scherzer et al. 2008, Ito and Jin 2015, Benning and Burger 2018.

  8. Example: Regularizers. Tikhonov regularization (A. Tikhonov, ∼1960): $R(x) = \frac{1}{2}\|x\|_2^2$.
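
For Tikhonov regularization the variational problem above has a closed form via the normal equations, $\hat{x} = (A^* A + \lambda I)^{-1} A^* y$. A quick sketch with toy data (sizes and $\lambda$ are arbitrary choices, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((12, 8))      # toy forward operator
y = rng.standard_normal(12)           # toy data
lam = 0.1

# normal equations of 0.5*||Ax - y||^2 + lam*0.5*||x||^2:
# (A^T A + lam I) x = A^T y
x_hat = np.linalg.solve(A.T @ A + lam * np.eye(8), A.T @ y)
```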

  9. Example: Regularizers. Tikhonov regularization (∼1960): $R(x) = \frac{1}{2}\|x\|_2^2$. $H^1$ regularization (∼1960-1990?): $R(x) = \frac{1}{2}\|\nabla x\|_2^2$. Total variation (Rudin, Osher, Fatemi 1992): $R(x) = \|\nabla x\|_1$. Wavelet sparsity (∼1990): $R(x) = \|Wx\|_1$. Total generalized variation (Bredies, Kunisch, Pock 2010): $R(x) = \inf_v \|\nabla x - v\|_1 + \beta \|\nabla v\|_1$.
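
These regularizers are straightforward to evaluate on a discrete image using finite differences for $\nabla$. A small sketch (the forward-difference scheme and the random test image are my own illustrative choices):

```python
import numpy as np

def grad(x):
    """Forward-difference gradient with a zero row/column at the boundary."""
    dx = np.zeros_like(x); dx[:-1, :] = x[1:, :] - x[:-1, :]
    dy = np.zeros_like(x); dy[:, :-1] = x[:, 1:] - x[:, :-1]
    return dx, dy

x = np.random.default_rng(1).standard_normal((32, 32))   # toy "image"
dx, dy = grad(x)

R_tikhonov = 0.5 * np.sum(x ** 2)             # R(x) = 0.5*||x||_2^2
R_h1 = 0.5 * np.sum(dx ** 2 + dy ** 2)        # R(x) = 0.5*||grad x||_2^2
R_tv = np.sum(np.sqrt(dx ** 2 + dy ** 2))     # R(x) = ||grad x||_1 (isotropic)
```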

  10. Example: MRI reconstruction. Compressed sensing MRI (Lustig, Donoho, Pauly 2007): $A = S \circ F$, with Fourier transform $F$ and sampling $Sw = (w_i)_{i \in \Omega}$,
$$\hat{x} \in \arg\min_x \frac{1}{2}\|SFx - y\|_2^2 + \lambda \|\nabla x\|_1.$$

  11.-14. [Same model, built up with figures: a sampling pattern $S$, the zero-filled reconstruction $S^* y$, and reconstructions $\hat{x}$ for several values of $\lambda$ (0, $10^{-4}$, $10^{-3}$, 1).]

  15.-16. How to choose the sampling $S$? Is there an optimal sampling? Does the optimal sampling depend on $R$ and $\lambda$?
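
A hedged sketch of this reconstruction, reusing forward, adjoint, and grad from the sketches above. To keep it elementary, the nonsmooth TV term $\|\nabla x\|_1$ is replaced by the smoothed surrogate $\sum \sqrt{|\nabla x|^2 + \varepsilon^2}$ so that plain gradient descent applies; the talk itself solves the nonsmooth problem (e.g. with PDHG, slide 31). The step size, $\varepsilon$, and iteration count are ad-hoc:

```python
import numpy as np

def grad_T(px, py):
    """Adjoint of grad above (valid since px[-1, :] = 0 and py[:, -1] = 0)."""
    gx = np.zeros_like(px)
    gx[0, :] = -px[0, :]
    gx[1:, :] = px[:-1, :] - px[1:, :]
    gy = np.zeros_like(py)
    gy[:, 0] = -py[:, 0]
    gy[:, 1:] = py[:, :-1] - py[:, 1:]
    return gx + gy

def recon(y, mask, lam, n_iter=300, step=0.5, eps=1e-2):
    """Gradient descent on 0.5*||SFx - y||^2 + lam * smoothed TV."""
    x = adjoint(y, mask).real                   # start from zero-filled S*y
    for _ in range(n_iter):
        g_data = adjoint(forward(x, mask) - y, mask).real  # F*S*(SFx - y)
        dx, dy = grad(x)
        mag = np.sqrt(dx ** 2 + dy ** 2 + eps ** 2)
        x = x - step * (g_data + lam * grad_T(dx / mag, dy / mag))
    return x

# e.g.: x_hat = recon(y, mask, lam=1e-3), to compare with adjoint(y, mask)
```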

  17. Some important works on sampling for MRI. Uninformed:
◮ Cartesian, radial, variable density, ... (e.g. Lustig et al. 2007). ✓ simple to implement. ✗ not tailored to the application. ✗ not tailored to the regularizer / reconstruction method.
◮ Compressed sensing theory: random sampling, mostly uniform (e.g. Candès and Romberg 2007). ✓ mathematical guarantees. ✗ limited to a few sparsity-promoting regularizers, mostly $\ell^1$-type. ✗ specific yet uninformed class of recoverable signals: sparse.

  18. Some important works on sampling for MRI (continued). Learned:
◮ Largest Fourier coefficients of a training set (Knoll et al. 2011). ✓ simple to implement, computationally light. ✗ not tailored to the regularizer / reconstruction method.
◮ Greedy: iteratively select the "best" sample (Gözcü et al. 2018). ✓ adaptive to the dataset and to the regularizer / reconstruction method. ✗ only discrete values, e.g. cannot learn the regularization parameter. ✗ computationally heavy.

  19. Bilevel Learning

  20.-22. Bilevel learning for inverse problems (von Stackelberg 1934; Kunisch and Pock 2013; De los Reyes and Schönlieb 2013). Upper level (learning): given training pairs $(x_i^\dagger, y_i)_{i=1}^n$ with $y_i = A x_i^\dagger + \varepsilon_i$, solve
$$\min_{\lambda \ge 0,\, \hat{x}_i} \frac{1}{n} \sum_{i=1}^n \|\hat{x}_i - x_i^\dagger\|_2^2.$$
Lower level (solve the inverse problem): for each $i$,
$$\hat{x}_i = \arg\min_x \frac{1}{2}\|Ax - y_i\|_2^2 + \lambda R(x),$$
with $R$ smooth and strongly convex, so that the lower-level solution exists and is unique.

  23.-27. Bilevel learning: reduced formulation. Write the upper level as $\min_{\lambda \ge 0,\, \hat{x}} U(\hat{x})$ with $U(\hat{x}) = \|\hat{x} - x^\dagger\|_2^2$, and the lower level as $x_\lambda := \hat{x} = \arg\min_x L(x, \lambda)$ with $L(x, \lambda) = \frac{1}{2}\|Ax - y\|_2^2 + \lambda R(x)$. Since $L$ is smooth and strongly convex in $x$, the lower level is equivalent to the optimality condition $\partial_x L(x_\lambda, \lambda) = 0$. Reduced formulation: $\min_{\lambda \ge 0} U(x_\lambda) =: \tilde{U}(\lambda)$.

  28.-30. Differentiating the optimality condition $\partial_x L(x_\lambda, \lambda) = 0$ with respect to $\lambda$ gives
$$0 = \partial_x^2 L(x_\lambda, \lambda)\, \partial_\lambda x_\lambda + \partial_\lambda \partial_x L(x_\lambda, \lambda) \quad\Longleftrightarrow\quad \partial_\lambda x_\lambda = -B^{-1} A,$$
with $B := \partial_x^2 L(x_\lambda, \lambda)$ and $A := \partial_\lambda \partial_x L(x_\lambda, \lambda)$. By the chain rule,
$$\nabla \tilde{U}(\lambda) = (\partial_\lambda x_\lambda)^* \nabla U(x_\lambda) = -A^* B^{-1} \nabla U(x_\lambda) = -A^* w,$$
where $w$ solves $Bw = \nabla U(x_\lambda)$.
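
This formula is easy to sanity-check on a lower-level problem where everything is explicit. Below is a sketch (my own toy example, not from the talk) with Tikhonov regularization, $L(x, \lambda) = \frac{1}{2}\|Ax - y\|_2^2 + \frac{\lambda}{2}\|x\|_2^2$, so that $B = A^\top A + \lambda I$ and $\partial_\lambda \partial_x L(x_\lambda, \lambda) = x_\lambda$; the hypergradient is compared against a finite-difference derivative:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))     # forward operator (toy)
x_dag = rng.standard_normal(10)       # ground truth x_dagger
y = A @ x_dag + 0.1 * rng.standard_normal(20)

def x_lam(lam):
    """Lower-level solution, closed form for Tikhonov: (A^T A + lam I)^-1 A^T y."""
    return np.linalg.solve(A.T @ A + lam * np.eye(10), A.T @ y)

def U(x):
    """Upper-level loss 0.5*||x - x_dagger||^2, so grad U(x) = x - x_dagger."""
    return 0.5 * np.sum((x - x_dag) ** 2)

lam = 0.5
x = x_lam(lam)
B = A.T @ A + lam * np.eye(10)        # B := second x-derivative of L
w = np.linalg.solve(B, x - x_dag)     # solve B w = grad U(x_lam)
hypergrad = -x @ w                    # -A* w, with A = d_lam d_x L = x_lam here

h = 1e-6                              # finite-difference sanity check
fd = (U(x_lam(lam + h)) - U(x_lam(lam - h))) / (2 * h)
print(hypergrad, fd)                  # the two numbers should agree closely
```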

  31. Algorithm for bilevel learning. Upper level: $\min_{\lambda \ge 0,\, \hat{x}} U(\hat{x})$. Lower level: $x_\lambda := \arg\min_x L(x, \lambda)$. Reduced formulation: $\min_{\lambda \ge 0} U(x_\lambda) =: \tilde{U}(\lambda)$.
◮ Solve the reduced formulation via L-BFGS-B (Nocedal and Wright 2000).
◮ Compute gradients: given $\lambda$, (1) compute $x_\lambda$, e.g. via PDHG (Chambolle and Pock 2011); (2) solve $Bw = \nabla U(x_\lambda)$ with $B := \partial_x^2 L(x_\lambda, \lambda)$, e.g. via CG; (3) compute $\nabla \tilde{U}(\lambda) = -A^* w$ with $A := \partial_\lambda \partial_x L(x_\lambda, \lambda)$.
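
A minimal end-to-end sketch of this algorithm on the same Tikhonov toy problem as above, assuming SciPy is available: L-BFGS-B minimises the reduced objective $\tilde{U}(\lambda)$ subject to $\lambda \ge 0$, CG stands in for the linear solve in step (2), and the lower level is solved exactly in closed form rather than with PDHG:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.sparse.linalg import cg

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 15))     # forward operator (toy)
x_dag = rng.standard_normal(15)       # ground truth x_dagger
y = A @ x_dag + 0.05 * rng.standard_normal(30)

def lower_level(lam):
    """(1) Compute x_lam; exact solve here (the talk uses PDHG instead)."""
    return np.linalg.solve(A.T @ A + lam * np.eye(15), A.T @ y)

def U_and_grad(lam_vec):
    """Reduced objective and its gradient via steps (1)-(3) above."""
    lam = lam_vec[0]
    x = lower_level(lam)                  # (1) x_lam
    B = A.T @ A + lam * np.eye(15)        # B := second x-derivative of L
    w, _ = cg(B, x - x_dag)               # (2) solve B w = grad U(x_lam) via CG
    g = -x @ w                            # (3) hypergradient -A* w, A = x_lam
    return 0.5 * np.sum((x - x_dag) ** 2), np.array([g])

res = minimize(U_and_grad, x0=np.array([1.0]), jac=True,
               method="L-BFGS-B", bounds=[(0.0, None)])
print("learned regularization parameter:", res.x[0])
```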

  32. Learn sampling pattern in MRI
