Fast Bayesian optimal experimental design and its applications

Quan Long

Joint work with Chaouki Issaid, Mohammad Motamed (UNM), Marco Scavino, Raul Tempone and Suojin Wang (TAMU)

SRI Center for Uncertainty Quantification in Computational Science and Engineering, King Abdullah University of Science and Technology, KSA

January 9, 2015, SRI UQ Annual Meeting
Introduction

Experimental design is important when resources are limited; for example, the total cost of a single onshore oil well can be 1-1.5 million USD.
Introduction

We first consider a linear regression model:
\[ Y = X\theta + \epsilon. \]
The simple least-squares estimate and its covariance are
\[ \hat{\theta} = (X^T X)^{-1} X^T Y, \qquad \mathrm{Cov}(\hat{\theta}) = \Sigma = (X^T X)^{-1}. \]
We want $(X^T X)^{-1}$ to be as "small" as possible.
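As a concrete sketch (not from the slides; the design matrix, design points and parameters below are made up), the least-squares estimate and its covariance in NumPy, assuming unit noise variance as above:

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical 1-D design: regressors 1, xi, xi^2 evaluated at chosen
    # design points xi (the experimental setup).
    xi = np.linspace(0.0, 1.0, 20)
    X = np.column_stack([np.ones_like(xi), xi, xi**2])

    theta0 = np.array([1.0, -2.0, 0.5])          # "true" parameters
    y = X @ theta0 + rng.standard_normal(len(xi))  # unit-variance Gaussian noise

    # Least-squares estimate and covariance (noise variance = 1 assumed)
    XtX = X.T @ X
    theta_hat = np.linalg.solve(XtX, X.T @ y)
    Sigma = np.linalg.inv(XtX)                   # Cov(theta_hat) = (X^T X)^{-1}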
Introduction

Alphabetic optimality (a numerical sketch follows the list):
- A-optimality: minimize the trace of the covariance matrix, $\mathrm{tr}(\Sigma)$.
- c-optimality: minimize the variance $\beta^T \Sigma\, \beta$ of a predefined linear combination $\beta^T \theta$ of the parameters.
- D-optimality: minimize the determinant of the covariance matrix, $|\Sigma|$.
- E-optimality: minimize the maximum eigenvalue of the covariance matrix, $\lambda_{\max}(\Sigma)$.
- Entropy-based expected information gain in a Bayesian setting.
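Continuing the sketch above, the alphabetic criteria are simple matrix functionals of Σ (the vector beta is a hypothetical choice):

    # Alphabetic optimality criteria for Sigma = (X^T X)^{-1}
    A_crit = np.trace(Sigma)                   # A-optimality: trace
    D_crit = np.linalg.det(Sigma)              # D-optimality: determinant
    E_crit = np.linalg.eigvalsh(Sigma).max()   # E-optimality: largest eigenvalue

    beta = np.array([0.0, 1.0, 0.0])           # predefined linear combination
    c_crit = beta @ Sigma @ beta               # c-optimality: Var(beta^T theta_hat)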
Major Notations

- $p(\cdot)$: probability density function
- $\theta$: unknown parameter vector
- $\theta_0$: the $d$-dimensional vector of the "true" parameters used to generate the synthetic data
- $\xi$: the vector of control parameters, also known as the experimental setup
- $g$: the deterministic model
- $y_i$: the $i$-th observation vector
- $\bar{y} = \{y_i\}_{i=1}^{M}$: a set of observation vectors
- $\epsilon_i$: the additive independent and identically distributed (i.i.d.) measurement noise
Bayesian framework for experimental design and expected information gain

Prior of parameters: $p(\theta)$.

Posterior (post-experimental) of parameters, by Bayes' theorem:
\[ p(\theta \mid \bar{y}, \xi) = \frac{p(\bar{y} \mid \theta, \xi)\, p(\theta)}{p(\bar{y})}. \]

K-L divergence (information gain) between prior and posterior, measuring the usefulness of an experiment:
\[ D_{KL} := \int_{\Theta} \log\left( \frac{p(\theta \mid \bar{y}, \xi)}{p(\theta)} \right) p(\theta \mid \bar{y}, \xi)\, d\theta \]
(if $p(\theta \mid \bar{y}) = p(\theta)$, then $D_{KL} = 0$).

Expected information gain:
\[ I(\xi) = \int D_{KL}\, p(\bar{y} \mid \xi)\, d\bar{y}. \]
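Aside (not on the slides): when prior and posterior are both Gaussian, $D_{KL}$ has a closed form; this is the quantity that the Laplace approximations later in the talk reduce to. A minimal sketch:

    import numpy as np

    def kl_gaussian(mu0, S0, mu1, S1):
        """KL divergence D_KL( N(mu0, S0) || N(mu1, S1) ) in d dimensions."""
        d = len(mu0)
        S1_inv = np.linalg.inv(S1)
        dmu = mu1 - mu0
        return 0.5 * (np.trace(S1_inv @ S0) + dmu @ S1_inv @ dmu - d
                      + np.log(np.linalg.det(S1) / np.linalg.det(S0)))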
Double-loop Monte Carlo

The expected information gain can be rearranged as follows:
\[ I = \int_{\Theta} \int_{\mathcal{Y}} \log\left( \frac{p(\bar{y} \mid \theta)}{p(\bar{y})} \right) p(\bar{y} \mid \theta)\, d\bar{y}\, p(\theta)\, d\theta. \]
This integral can be evaluated using Monte Carlo sampling [Ryan, 2003], [Huan and Marzouk, 2011]:
\[ I_{DLMC} = \frac{1}{N_o} \sum_{i=1}^{N_o} \log\left( \frac{p(\bar{y}^i \mid \theta^i)}{p(\bar{y}^i)} \right), \]
where $\theta^i$ is drawn from $p(\theta)$ and $\bar{y}^i$ is drawn from $p(\bar{y} \mid \theta^i)$. The name "double loop" comes from the nested Monte Carlo sum used to evaluate the marginal density:
\[ p(\bar{y}^i) = \int_{\Theta} p(\bar{y}^i \mid \theta)\, p(\theta)\, d\theta \approx \frac{1}{N_i} \sum_{j=1}^{N_i} p(\bar{y}^i \mid \theta^j). \]
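A minimal sketch of the double-loop estimator (illustrative; the function and argument names are our own, and i.i.d. Gaussian noise with variance sigma2 per output component is assumed):

    import numpy as np

    def dlmc_eig(g, sample_prior, xi, M, sigma2, N_outer, N_inner):
        """Double-loop Monte Carlo estimate of the expected information gain I(xi).

        g(theta, xi) is the forward model, sample_prior() draws one theta from
        p(theta), and the M repeated observations have i.i.d. N(0, sigma2) noise.
        """
        def log_like(ybar, theta):
            # Unnormalized log p(ybar | theta); the Gaussian normalizing
            # constant cancels in the ratio p(ybar | theta) / p(ybar) below.
            r = ybar - g(theta, xi)
            return -0.5 * np.sum(r * r) / sigma2

        eig = 0.0
        for _ in range(N_outer):
            theta = sample_prior()                    # outer sample from the prior
            out = np.atleast_1d(g(theta, xi))
            ybar = out + np.sqrt(sigma2) * np.random.randn(M, out.size)
            ll = log_like(ybar, theta)
            # Inner loop: Monte Carlo estimate of the evidence p(ybar).
            inner = np.array([log_like(ybar, sample_prior())
                              for _ in range(N_inner)])
            m = inner.max()                           # stabilized log-mean-exp
            log_evidence = m + np.log(np.mean(np.exp(inner - m)))
            eig += ll - log_evidence
        return eig / N_outer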
Double-loop Monte Carlo

\[ \mathrm{Bias}(I_{DLMC}) = E(I_{DLMC} - I) = O\left(\frac{1}{N_i}\right), \qquad \mathrm{Var}(I_{DLMC}) = O\left(\frac{1}{N_o}\right). \]
Requiring the mean squared error to meet a tolerance,
\[ \mathrm{Var}(I_{DLMC}) + \mathrm{Bias}(I_{DLMC})^2 = tol^2, \]
forces $N_i = O(tol^{-1})$ and $N_o = O(tol^{-2})$, so the total work is
\[ N_o \times N_i = O\left(tol^{-3}\right). \]
Laplace approximation of I(ξ) for determined models

Laplace approximation:
\[ \int \exp[Mf(x)]\, dx = \sqrt{\frac{2\pi}{M\, |f''(x_0)|}}\, \exp[Mf(x_0)] \left( 1 + O\left(\frac{1}{M}\right) \right). \]
Hint: expand $f$ around its maximizer $x_0$ (where $f'(x_0) = 0$):
\[ f(x) = f(x_0) + \frac{1}{2} f''(x_0)(x - x_0)^2 + O\left(|x - x_0|^3\right). \]
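A quick numerical check of the formula (not from the slides), with the hypothetical choice f(x) = -cosh(x), whose maximum sits at x_0 = 0 with f(0) = -1 and f''(0) = -1:

    import numpy as np

    M = 50.0
    x = np.linspace(-4.0, 4.0, 200001)
    # Brute-force Riemann sum of the integral on the left-hand side
    quad = np.sum(np.exp(M * -np.cosh(x))) * (x[1] - x[0])
    # Leading-order Laplace term: sqrt(2*pi / (M * |f''(0)|)) * exp(M * f(0))
    laplace = np.sqrt(2.0 * np.pi / M) * np.exp(-M)
    print(quad, laplace, abs(quad - laplace) / quad)  # relative error ~ O(1/M)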
Laplace approximation of I(ξ) for determined models

Synthetic data model:
\[ y_i = g(\theta_0, \xi) + \epsilon_i, \quad i = 1, \dots, M. \]

[Figure 1: Posterior pdfs as M increases (M = 1, 5, 10): the posterior over θ becomes increasingly peaked.]
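The concentration effect in Figure 1 is easy to reproduce; a sketch with a made-up scalar model and a flat prior on a grid:

    import numpy as np

    rng = np.random.default_rng(0)
    theta_grid = np.linspace(0.5, 1.5, 501)
    theta0, xi, sigma = 1.0, 1.0, 0.2       # made-up truth, design, noise level

    def model(theta):                        # hypothetical scalar g(theta, xi)
        return theta**2 * xi

    for M in (1, 5, 10):
        data = model(theta0) + sigma * rng.standard_normal(M)
        loglike = np.array([-0.5 * np.sum((data - model(t))**2) / sigma**2
                            for t in theta_grid])
        post = np.exp(loglike - loglike.max())   # flat prior on the grid assumed
        post /= post.sum() * (theta_grid[1] - theta_grid[0])
        print(M, theta_grid[np.argmax(post)], post.max())  # peak narrows with M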
Laplace approximation of I(ξ) for determined models

A truncated Taylor expansion of $\log(p(\theta \mid \{y_i\}))$ leads to a normal distribution $N(\hat{\theta}, \Sigma)$.

Major result 1
\[ I = \int_{\Theta} \int_{\mathcal{Y}} \underbrace{\left[ -\frac{1}{2}\log\left((2\pi)^d\, |\Sigma|\right) - \frac{d}{2} - h(\hat{\theta}) - \frac{\mathrm{tr}(\Sigma\, H_p(\hat{\theta}))}{2} \right]}_{D_{KL}} p(\bar{y} \mid \theta_0)\, d\bar{y}\, p(\theta_0)\, d\theta_0 + O\left(\frac{1}{M}\right). \]

Q. Long, M. Scavino, R. Tempone, S. Wang: Fast estimation of expected information gains for Bayesian experimental designs based on Laplace approximations, Computer Methods in Applied Mechanics and Engineering 259 (2013) 24-39.
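Major Result 1 replaces the inner Monte Carlo loop with closed-form quantities. A sketch of the resulting single-loop estimator, under assumptions we state in the docstring (in particular, we read h as the log-prior at the mode and drop the O(1/M) trace term; the function and argument names are ours):

    import numpy as np

    def laplace_eig(jac_g, log_prior, sample_prior, M, Sigma_eps_inv, N):
        """Single-loop estimate of I based on Major Result 1 (a sketch).

        Assumptions: theta_hat is replaced by theta_0 (the shift is higher
        order); Sigma is taken in its Gauss-Newton form
            Sigma ~ (M * J^T Sigma_eps^{-1} J)^{-1},  J = jac_g(theta_0);
        h(theta_hat) is read as log p(theta_hat); the trace term is dropped.
        """
        acc = 0.0
        for _ in range(N):
            theta0 = sample_prior()
            d = theta0.size
            J = jac_g(theta0)
            Sigma = np.linalg.inv(M * (J.T @ Sigma_eps_inv @ J))
            _, logdet = np.linalg.slogdet(Sigma)
            acc += (-0.5 * (d * np.log(2 * np.pi) + logdet)
                    - 0.5 * d - log_prior(theta0))
        return acc / N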
Under-determined models

So far, the results are useful when the Laplace approximation can be applied, i.e., when a single dominant mode exists.

Question: what about cases where a non-informative manifold exists?

Example 1:
\[ g = (\theta_1^2 + \theta_2^2)^3\, \xi^2 + (\theta_1^2 + \theta_2^2) \exp\left[ -|0.2 - \xi| \right]. \]

Example 2: [Figure 2: A cantilever beam.]
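In Example 1 the model depends on θ only through θ_1² + θ_2², so every circle of constant radius is non-informative: the data cannot distinguish points on it. A quick check (illustrative):

    import numpy as np

    def g(theta1, theta2, xi):
        # Example 1 from the slide
        r2 = theta1**2 + theta2**2
        return r2**3 * xi**2 + r2 * np.exp(-np.abs(0.2 - xi))

    phi = np.linspace(0.0, 2.0 * np.pi, 9)
    vals = g(0.7 * np.cos(phi), 0.7 * np.sin(phi), xi=0.5)
    print(np.ptp(vals))   # ~0: g is constant on the circle theta_1^2 + theta_2^2 = 0.49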
The non-informative manifold

[Figure: the non-informative manifold T(θ_0) and the surrounding region Ω_M(θ_0) in the (θ_1, θ_2) plane, with the informative direction s crossing the manifold.]
The definition of non-informative manifold

The definition of the manifold and of a small region containing it²:
\[ T(\theta_0) := \left\{ \theta \in \Theta \subset \mathbb{R}^d : g(\theta) - g(\theta_0) = 0 \right\}, \]
\[ \Omega_M(\theta_0) := \left\{ \theta \in \mathbb{R}^d : \mathrm{dist}(\theta, T(\theta_0)) \le \ell_0 M^{-\alpha} \right\}. \]

² The volume of Ω_M(θ_0) contracts to zero at a rate slower than the inverse square root of the number of replicate experiments M, i.e., α ∈ (0, 0.5).
Local reparameterization

- The diffeomorphism mapping: $f : \Omega_{M,s,t} \to \Omega_M$.
- Cost function: $F(\theta) := \frac{1}{2}\, (g(\theta) - g(\theta_0))^T\, \Sigma_{\epsilon}^{-1}\, (g(\theta) - g(\theta_0))$.
- Hessian of $F$: $H(f(0,t)) = [U\; V]\, \Lambda\, [U\; V]^T$.
- Local coordinate $s$: $s = U^T (\theta - f(0,t))$.
- Prior weight function: $p(s,t) := p_{\Theta}(f(s,t))\, |J|$.
- Posterior weight function: $p(s,t \mid \bar{y}) := p_{\Theta}(f(s,t) \mid \bar{y})\, |J|$.

Due to Bayes' theorem, we have
\[ p(s,t \mid \bar{y}) = \frac{p(\bar{y} \mid s,t)\, p(s,t)}{p(\bar{y})} \quad \text{for } (s,t) \in \Omega_{M,s,t}. \]
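A sketch of how the split into informative and non-informative directions can be computed in practice (names are ours; the Gauss-Newton form of the Hessian follows from the cost function F above):

    import numpy as np

    def split_coordinates(J, Sigma_eps_inv, rel_tol=1e-10):
        """Split the local parameter directions at a point on the manifold.

        H = J^T Sigma_eps^{-1} J is the Gauss-Newton Hessian of F there.
        Eigenvectors with (numerically) nonzero eigenvalues form U, the
        informative directions carrying the coordinate s; the near-null
        eigenvectors form V, tangent to the non-informative manifold.
        """
        H = J.T @ Sigma_eps_inv @ J
        lam, W = np.linalg.eigh(H)            # eigenvalues in ascending order
        keep = lam > rel_tol * lam.max()
        U, V = W[:, keep], W[:, ~keep]
        return U, V, lam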
Change of coordinates for the K-L divergence (D_KL)

Approximated K-L divergence using the local coordinates $t$ and $s$:
\[ D_{KL}(\bar{y}) = \int_{T_t} \int_{[-\ell_0 M^{-\alpha},\, \ell_0 M^{-\alpha}]} \log\left( \frac{p(s,t \mid \bar{y})}{p(s,t)} \right) p(s \mid t, \bar{y})\, p(t \mid \bar{y})\, ds\, dt + O_P\left( e^{-M \ell_0 \delta} \right). \]
Laplace approximation for the conditional information gain

Gaussian approximations:
\[ \tilde{p}(s \mid t, \bar{y}) = \frac{1}{(\sqrt{2\pi})^r\, |\Sigma_{s|t}|^{1/2}} \exp\left[ -\frac{(s - \hat{s})^T\, \Sigma_{s|t}^{-1}\, (s - \hat{s})}{2} \right], \]
\[ \tilde{p}(s, t \mid \bar{y}) = p(\hat{s}, t \mid \bar{y}) \exp\left[ -\frac{(s - \hat{s})^T\, \Sigma_{s|t}^{-1}\, (s - \hat{s})}{2} \right], \]
\[ \tilde{p}(s, t) = p(\hat{s}, t) \exp\left[ \nabla \log p(\hat{s}, t)(s - \hat{s}) + \frac{(s - \hat{s})^T\, H_p(\hat{s}, t)\, (s - \hat{s})}{2} \right]. \]

The information gain $D_{KL}$ can be approximated by
\[ D_{KL} = \int_{T_t} \underbrace{\left[ \int_{[-\ell_0 M^{-\alpha},\, \ell_0 M^{-\alpha}]} \log\left( \frac{\tilde{p}(s,t \mid \bar{y})}{\tilde{p}(s,t)} \right) \tilde{p}(s \mid t, \bar{y})\, ds \right]}_{D_{s|t}} p(t \mid \bar{y})\, dt + O_P\left(\frac{1}{M}\right), \]
with
\[ \int_{T_t} D_{s|t}\, dt = \int_{T_t} \left[ -\log\left( p(\hat{s}, t)\, |\Sigma_{s|t}|^{1/2} \right) - \frac{r}{2}\log(2\pi) - \frac{r}{2} \right] dt + O_P\left(\frac{1}{M}\right). \]
Laplace approximation of the expected information gain for under-determined models

Major result 2
The expected information gain can be expressed as
\[ I = \int_{\Theta} \int_{\mathcal{Y}} \left[ \int_{T_t} \left( -\log\left( p(\hat{s}, t)\, |\Sigma_{s|t}|^{1/2} \right) - \frac{r}{2}\log(2\pi) - \frac{r}{2} \right) dt \right] p(\bar{y} \mid \theta_0)\, p(\theta_0)\, d\bar{y}\, d\theta_0 + O\left(\frac{1}{M}\right), \]
where the error $O\left(\frac{1}{M}\right)$ is dominated by the standard Laplace approximation in the $s$ direction.

Q. Long, M. Scavino, R. Tempone, S. Wang: A Laplace Method for Under-Determined Bayesian Optimal Experimental Designs. Computer Methods in Applied Mechanics and Engineering 285 (2015) 849-876.
Simplification of the integration over the manifold T_t

Approximation of the conditional covariance matrix (by the Woodbury formula):
\[ \Sigma_{s|t} = \tilde{\Sigma}_{s|t} + O_P\left( \frac{1}{M\sqrt{M}} \right), \]
\[ \tilde{\Sigma}_{s|t} = \frac{1}{M} \left[ U^T \left( J_g(f(\hat{s}, t)) \right)^T \Sigma_{\epsilon}^{-1}\, J_g(f(\hat{s}, t))\, U \right]^{-1}. \]
Note that $|\tilde{\Sigma}_{s|t}|$ is independent of $t$ for a given value of $s$.
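Combining the coordinate-split sketch from earlier with this simplification, the leading-order conditional covariance is a small r x r matrix (illustrative):

    import numpy as np

    def sigma_s_given_t(J, U, Sigma_eps_inv, M):
        """Leading-order conditional covariance in the informative coordinates:

            Sigma_tilde_{s|t} = (1/M) * [U^T J^T Sigma_eps^{-1} J U]^{-1},

        an r x r matrix whose determinant enters Major Result 2;
        here J stands for J_g(f(s_hat, t)).
        """
        A = U.T @ (J.T @ Sigma_eps_inv @ J) @ U
        return np.linalg.inv(A) / M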