Minimization for conditional simulation: relationship to optimal transport
Dean Oliver, Uni CIPR
28 May 2013
[Figure: initial uncertainty in reservoir model parameters is mapped, via data, to final uncertainty in reservoir model parameters (prior and posterior pdfs).]
Bayes' rule relates the prior pdf to the posterior pdf. For high model dimensions, Monte Carlo methods are generally needed for the integration.
[Figure: an initial ensemble of reservoir model parameters is mapped, via data, to an updated ensemble of reservoir model parameters.]
Ensembles of particles represent the prior pdf and the posterior pdf. Compute expectations by summing over samples.
Sampling from the conditional pdf
[Figure: prior pdf and conditional pdf.]
Samples need to be distributed correctly. MCMC? Rejection/acceptance? These scale poorly in high model/data dimensions.
Update the ensemble of samples
[Figure: scatter plots of the prior ensemble (left) and the updated posterior ensemble (right).]
Each sample from the prior is updated, not resampled from the posterior. How to update?
How to update samples?
"The Ensemble Kalman Filter (EnKF) [is] a continuous implementation of the Bayesian update rule" (Li and Caers, 2011).
"These methods use ... 'samples' that are drawn independently from the given initial distribution and assigned equal weights. ... When observations become available, Bayes' rule is applied either to individual samples ..." (Park and Xu, 2009)
Note: Bayes' rule explains how to update probabilities, but not how to update samples. (Particle filters do update the probability of each sample using Bayes' rule.)
Two transformations from prior to posterior pdf
[Figure: two panels showing prior samples carried to posterior samples by two different maps.]
$$ y = \mu_y + L_y \left[ L_y \Sigma_x L_y \right]^{-1/2} L_y (x - \mu_x) \qquad\qquad y = \mu_y + L_y L_x^{-1} (x - \mu_x) $$
Two equally valid transformations from prior to posterior for the linear-Gaussian problem. Bayes' rule is not sufficient to specify the transformation.
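As a hedged illustration (not part of the original slides; the moments and matrices below are invented for the demonstration), the following Python sketch builds both maps for a small 2-D example and checks that each sends draws from the prior Gaussian to the target Gaussian:

```python
# Sketch (illustrative values only): both transformations on this slide
# carry N(mu_x, Sigma_x) to N(mu_y, Sigma_y), yet pair prior and
# posterior samples differently.
import numpy as np
from scipy.linalg import sqrtm, inv

rng = np.random.default_rng(0)
mu_x, Sigma_x = np.zeros(2), np.array([[1.0, 0.5], [0.5, 1.0]])
mu_y, Sigma_y = np.array([1.0, -0.5]), np.array([[0.5, 0.1], [0.1, 0.3]])

L_x = sqrtm(Sigma_x).real   # symmetric square root Sigma_x^{1/2}
L_y = sqrtm(Sigma_y).real   # symmetric square root Sigma_y^{1/2}

x = rng.multivariate_normal(mu_x, Sigma_x, size=200_000)

# Map 1: y = mu_y + L_y [L_y Sigma_x L_y]^{-1/2} L_y (x - mu_x)
M1 = L_y @ inv(sqrtm(L_y @ Sigma_x @ L_y).real) @ L_y
y1 = mu_y + (x - mu_x) @ M1.T

# Map 2: y = mu_y + L_y L_x^{-1} (x - mu_x)
M2 = L_y @ inv(L_x)
y2 = mu_y + (x - mu_x) @ M2.T

# Both sample covariances match Sigma_y, so matching the posterior
# alone cannot distinguish the two maps.
print(np.cov(y1.T).round(2))
print(np.cov(y2.T).round(2))
```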
Why it matters
1. If the prior is Gaussian and the posterior is Gaussian, then any linear transformation of variables that obtains the correct posterior mean and covariance is OK. (Any version of EnKF or EnSRF.)
2. What transformation should be used when the observation operator is nonlinear?
   2.1 Randomized maximum likelihood (Kitanidis, 1995; Oliver et al., 1996)
   2.2 Optimal map (El Moselhy and Marzouk, 2012)
   2.3 Implicit filters (Chorin et al., 2010; Morzfeld et al., 2012)
   2.4 Continuous data assimilation or multiple data assimilation (Reich, 2011; Emerick and Reynolds, 2013)
   2.5 Ensemble-based iterative filters/smoothers
Monge's transport problem
Solve for the optimal transformation $s^*(\cdot)$ that minimizes
$$ s^* = \arg\min_s \int \| x - s(x) \|^2 \, p(x) \, dx \quad \text{such that} \quad q_s(x) = q(x) $$
Reminder:
◮ $x \sim p(x)$
◮ $s(x) = y \sim q_s(y)$, if $x \sim p(x)$
◮ $q(y)$ is the target pdf of the transformed variables
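For intuition, here is a minimal 1-D sketch (not from the slides; the two distributions are chosen arbitrarily): for scalar variables and squared-distance cost, the Monge-optimal map is the monotone rearrangement $s(x) = F_q^{-1}(F_p(x))$.

```python
# 1-D illustration: the monotone map F_q^{-1}(F_p(x)) pushes samples of
# p forward to q, and is optimal for the squared-distance cost.
import numpy as np
from scipy import stats

p = stats.norm(loc=0.0, scale=1.0)   # prior p(x), chosen for illustration
q = stats.norm(loc=1.0, scale=0.5)   # target q(y)

def s(x):
    return q.ppf(p.cdf(x))           # compose prior CDF with target quantile

x = p.rvs(size=50_000, random_state=0)
y = s(x)
print(y.mean().round(3), y.std().round(3))   # close to 1.0 and 0.5
```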
Explicitly (with Gaussian prior)
Let $X$ be a multivariate normal random variable with probability density $p$,
$$ p(x) = c_p \exp\left( -\tfrac{1}{2} (x - \mu)^T C_x^{-1} (x - \mu) \right), $$
and let $Y$ be a random vector defined by $s(X) = Y$. The probability density of $Y$ is
$$ q_s(y) = p\left( s^{-1}(y) \right) \det S^{-1}(y) = c_p \exp\left( -\tfrac{1}{2} (s^{-1}(y) - \mu)^T C_x^{-1} (s^{-1}(y) - \mu) \right) \det S^{-1}(y) $$
where $S^{-1}(y) = \nabla s^{-T}$.
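A quick numerical check of this change-of-variables formula (illustrative only; the linear map is my own choice): for $y = s(x) = 2x + 1$ applied to $x \sim N(0,1)$, the formula gives $q_s(y) = p\!\left((y-1)/2\right)\cdot\tfrac12$, which must coincide with the $N(1,4)$ density.

```python
# Verify q_s(y) = p(s^{-1}(y)) |det S^{-1}(y)| for the linear map
# s(x) = 2x + 1 applied to x ~ N(0, 1); the pushforward is N(1, 4).
import numpy as np
from scipy import stats

grid = np.linspace(-5.0, 7.0, 25)
q_s = stats.norm.pdf((grid - 1.0) / 2.0) / 2.0   # p(s^{-1}(y)) * |ds^{-1}/dy|
print(np.allclose(q_s, stats.norm(loc=1.0, scale=2.0).pdf(grid)))  # True
```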
The posterior pdf, for a variable $Y$ with prior distribution $p(y)$ and observation relation $d_o = h(y) + \epsilon_d$ for $\epsilon_d \sim N(0, C_d)$, is
$$ q(y) = c_q \exp\left( -\tfrac{1}{2} (y - \mu)^T C_x^{-1} (y - \mu) - \tfrac{1}{2} (h(y) - d_o)^T C_d^{-1} (h(y) - d_o) \right). $$
If we wish to sample from the posterior, we must seek transformations such that $q_s(y) = q(y)$.
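The unnormalized log of this posterior is straightforward to evaluate; the minimal sketch below (function and argument names are my own) is the quantity the transformation methods on the following slides target.

```python
# Unnormalized log-posterior log q(y) + const for a Gaussian prior and
# Gaussian observation error with a (possibly nonlinear) operator h.
import numpy as np

def log_q(y, mu, C_x_inv, h, d_o, C_d_inv):
    r_prior = y - mu          # prior misfit
    r_data = h(y) - d_o       # data misfit
    return (-0.5 * r_prior @ C_x_inv @ r_prior
            - 0.5 * r_data @ C_d_inv @ r_data)
```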
Example 1
◮ Prior pdf: $p(x) = N(\mu_x, \Sigma_x)$
◮ Target pdf: $q(x) = N(\mu_y, \Sigma_y)$
$$ y = s(x) := \mu_y + L_y \left[ L_y \Sigma_x L_y \right]^{-1/2} L_y (x - \mu_x) $$
where $L_x = \Sigma_x^{1/2}$ and $L_y = \Sigma_y^{1/2}$. This map minimizes the expected value of the squared distance $\| X - Y \|^2$.
Olkin and Pukelsheim (1982); Knott and Smith (1984); Rüschendorf (1990)
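A hedged numerical check of the minimality claim (example matrices are my own): every map $y = \mu_y + L_y U L_x^{-1}(x - \mu_x)$ with orthogonal $U$ also sends $N(\mu_x, \Sigma_x)$ to $N(\mu_y, \Sigma_y)$, but the slide's choice should give the smallest expected squared displacement.

```python
# Compare E||X - Y||^2 for the optimal-transport map against random
# valid alternatives L_y U L_x^{-1} with U orthogonal (means set to 0).
import numpy as np
from scipy.linalg import sqrtm, inv
from scipy.stats import ortho_group

rng = np.random.default_rng(1)
Sigma_x = np.array([[1.0, 0.6], [0.6, 1.0]])
Sigma_y = np.array([[0.4, -0.1], [-0.1, 0.8]])
L_x, L_y = sqrtm(Sigma_x).real, sqrtm(Sigma_y).real

x = rng.multivariate_normal(np.zeros(2), Sigma_x, size=200_000)

def mean_sq_displacement(M):
    return float((np.linalg.norm(x @ M.T - x, axis=1) ** 2).mean())

M_opt = L_y @ inv(sqrtm(L_y @ Sigma_x @ L_y).real) @ L_y  # slide's map
print("optimal:", mean_sq_displacement(M_opt))
for i in range(3):                                        # competitors
    U = ortho_group.rvs(2, random_state=i)
    print("random :", mean_sq_displacement(L_y @ U @ inv(L_x)))
```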
Speculation: why would the minimum-distance solution be desirable?
Obtaining a transformation with optimal transport properties may be important only insofar as it simplifies the sampling problem, except that the 'natural' optimal transport solution might be more robust to deviations from ideality. Examples include those in which the samples from the prior are complex geological models, in which case making as small a change as possible might be beneficial.
Transformations between spaces of unequal dimensions
Consider the mapping from $X \in \mathbb{R}^N$ to $Y = s(X) \in \mathbb{R}^M$, where $M < N$, such that
$$ \int \| x - A s(x) \|^2_{\Sigma_x} \, p(x) \, dx $$
is minimized and the pdfs for $x$ and $y = s(x)$ are
$$ x \sim N(\mu_x, \Sigma_x), \qquad y \sim N(\mu_y, \Sigma_y). $$
Transformations between spaces of unequal dimensions (continued)
The transformation
$$ y = \arg\min_y (x - Ay)^T \Sigma_x^{-1} (x - Ay) = \left( A^T \Sigma_x^{-1} A \right)^{-1} A^T \Sigma_x^{-1} x $$
is a solution for the special case
$$ \Sigma_y = \left( A^T \Sigma_x^{-1} A \right)^{-1} \quad \text{and} \quad \mu_y = \left( A^T \Sigma_x^{-1} A \right)^{-1} A^T \Sigma_x^{-1} \mu_x. $$
Transformations between spaces of unequal dimensions
Note that if
$$ A = \begin{bmatrix} I \\ H \end{bmatrix} \quad \text{and} \quad \Sigma_{xd} = \begin{bmatrix} C_x & 0 \\ 0 & C_d \end{bmatrix}, $$
then
$$ y = \left( A^T \Sigma_{xd}^{-1} A \right)^{-1} A^T \Sigma_{xd}^{-1} \begin{bmatrix} x \\ d \end{bmatrix} = x + C_x H^T \left( H C_x H^T + C_d \right)^{-1} (d - Hx) $$
is the optimal transport transformation of $[x \; d]$ to $y$ for a particular cost function. (Note that this is the perturbed-observation form of the EnKF, or RML for a linear inverse problem.)
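A minimal numerical verification of this identity (dimensions and matrices invented for the test): the stacked weighted least-squares solution coincides with the Kalman-gain form on this slide.

```python
# Check that (A^T S^{-1} A)^{-1} A^T S^{-1} [x; d] equals the
# perturbed-observation update x + K (d - H x).
import numpy as np

rng = np.random.default_rng(2)
n, m = 4, 2
C_x = 2.0 * np.eye(n)
C_d = 0.5 * np.eye(m)
H = rng.normal(size=(m, n))
x = rng.normal(size=n)
d = rng.normal(size=m)

A = np.vstack([np.eye(n), H])                       # A = [I; H]
S = np.block([[C_x, np.zeros((n, m))],
              [np.zeros((m, n)), C_d]])             # Sigma_xd
S_inv = np.linalg.inv(S)
xd = np.concatenate([x, d])
y_ls = np.linalg.solve(A.T @ S_inv @ A, A.T @ S_inv @ xd)

K = C_x @ H.T @ np.linalg.inv(H @ C_x @ H.T + C_d)  # Kalman gain
y_kf = x + K @ (d - H @ x)
print(np.allclose(y_ls, y_kf))                      # True
```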
Nonlinear Transformations
Consider now the problem of determining an approximation of the optimal transformation $s^*$ that satisfies
$$ s^* = \arg\min_s \int \left\| \begin{bmatrix} x \\ d \end{bmatrix} - \begin{bmatrix} s(x, d) \\ h(s(x, d)) \end{bmatrix} \right\|^2_{\Sigma_{xd}} p(x, d) \, dx \, dd $$
subject to $y = s(x, d)$ being distributed as $q(y)$, for
$$ p(x, d) = c_p \exp\left( -\tfrac{1}{2} (x - \mu)^T C_x^{-1} (x - \mu) - \tfrac{1}{2} (d - d_o)^T C_d^{-1} (d - d_o) \right). $$
The prior is Gaussian but the data relationship is nonlinear.
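As a hedged sketch of how such a minimization can be carried out sample by sample, in the spirit of RML (the scalar toy problem and the operator $h$ below are invented for illustration):

```python
# For each prior draw x_j and perturbed observation d_j, minimize the
# stacked misfit over y; collecting the minimizers gives an approximate
# posterior ensemble (exact only when h is linear).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
mu, C_x = 0.0, 1.0               # scalar Gaussian prior
d_o, C_d = 2.0, 0.1              # observed datum and noise variance
h = lambda y: y + 0.2 * y**3     # hypothetical nonlinear operator

def objective(y, x_j, d_j):
    y = y[0]
    return (y - x_j)**2 / C_x + (h(y) - d_j)**2 / C_d

samples = []
for _ in range(500):
    x_j = rng.normal(mu, np.sqrt(C_x))    # draw from the prior
    d_j = rng.normal(d_o, np.sqrt(C_d))   # perturb the observation
    res = minimize(objective, x0=[x_j], args=(x_j, d_j))
    samples.append(res.x[0])

print(np.mean(samples).round(3), np.std(samples).round(3))
```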