
BAYESIAN CALIBRATION OF COMPUTER MODELS: Bayesian inference & Markov chain Monte Carlo, Gaussian processes, computer model calibration and prediction. Dave Higdon, Virginia Tech; Brian Williams & Jim Gattiker, Statistical Sciences Group.


  1. An Inverse Problem in Hydrology
  [Figure: observed tracer concentration versus time at the surrounding wells (Well.NW, Well.N, Well.NE, Well.W, Well.E, Well.SW, Well.S, Well.SE) and the 60 x 60 spatial domain.]
  Likelihood: $L(y \mid \eta(z)) \propto |\Sigma|^{-1/2} \exp\{-\tfrac{1}{2}(y - \eta(z))^{T} \Sigma^{-1} (y - \eta(z))\}$
  Markov random field prior: $\pi(z \mid \lambda_z) \propto \lambda_z^{m/2} \exp\{-\tfrac{\lambda_z}{2} z^{T} W z\}$
  Hyperprior: $\pi(\lambda_z) \propto \lambda_z^{a_z - 1} \exp\{-b_z \lambda_z\}$
  Posterior: $\pi(z, \lambda_z \mid y) \propto L(y \mid \eta(z)) \times \pi(z \mid \lambda_z) \times \pi(\lambda_z)$

  2. Posterior realizations of z with a MRF prior
  [Figure: six 60 x 60 images – the true permeability field, four MCMC realizations, and the posterior mean permeability field.]

  3. Bayesian inference – iid $N(\mu, 1)$ example
  Observed data $y$ are noisy versions of $\mu$: $y(s_i) = \mu + \epsilon_i$ with $\epsilon_i \overset{iid}{\sim} N(0, 1)$, $i = 1, \ldots, n$.
  Sampling model: $L(y \mid \mu) \propto \prod_{i=1}^{n} \exp\{-\tfrac{1}{2}(y_i - \mu)^2\}$
  Prior for $\mu$: $\pi(\mu) \propto N(0, 1/\lambda_\mu)$, with $\lambda_\mu$ small.
  Posterior density for $\mu$:
  $\pi(\mu \mid y) \propto L(y \mid \mu) \times \pi(\mu) \propto \prod_{i=1}^{n} \exp\{-\tfrac{1}{2}(y_i - \mu)^2\} \times \exp\{-\tfrac{1}{2}\lambda_\mu \mu^2\}$
  $\propto \exp\{-\tfrac{1}{2}[\, n(\mu - \bar{y}_n)^2 + \lambda_\mu (\mu - 0)^2 \,]\} \times f(y)$
  $\Rightarrow \mu \mid y \sim N\!\left(\dfrac{n\bar{y}_n + 0\cdot\lambda_\mu}{n + \lambda_\mu},\ \dfrac{1}{n + \lambda_\mu}\right) \to N\!\left(\bar{y}_n, \tfrac{1}{n}\right)$ as $\lambda_\mu \to 0$.

  4. Bayesian inference – iid $N(\mu, \lambda_y^{-1})$ example
  Observed data $y$ are noisy versions of $\mu$: $y(s_i) = \mu + \epsilon_i$ with $\epsilon_i \overset{iid}{\sim} N(0, \lambda_y^{-1})$, $i = 1, \ldots, n$.
  Sampling model: $L(y \mid \mu, \lambda_y) \propto \lambda_y^{n/2} \exp\{-\tfrac{\lambda_y}{2}\sum_{i=1}^{n}(y_i - \mu)^2\}$
  Priors for $\mu, \lambda_y$: $\pi(\mu, \lambda_y) = \pi(\mu) \times \pi(\lambda_y)$ with $\pi(\mu) \propto 1$ and $\pi(\lambda_y) \propto \lambda_y^{a_y - 1} e^{-b_y \lambda_y}$.
  Posterior density for $(\mu, \lambda_y)$:
  $\pi(\mu, \lambda_y \mid y) \propto L(y \mid \mu, \lambda_y) \times \pi(\mu) \times \pi(\lambda_y) \propto \lambda_y^{n/2} \exp\{-\tfrac{\lambda_y}{2}\sum_{i=1}^{n}(y_i - \mu)^2\} \times \lambda_y^{a_y - 1} \exp\{-b_y \lambda_y\}$
  $\pi(\mu, \lambda_y \mid y)$ is not so easy to recognize; it can be explored numerically or via Monte Carlo.

  5. Full conditional distributions for $\pi(\mu, \lambda_y \mid y)$
  $\pi(\mu, \lambda_y \mid y) \propto \lambda_y^{n/2} \exp\{-\tfrac{\lambda_y}{2}\sum_{i=1}^{n}(y_i - \mu)^2\} \times \lambda_y^{a_y - 1} \exp\{-b_y \lambda_y\}$
  Though $\pi(\mu, \lambda_y \mid y)$ is not of a simple form, its full conditional distributions are:
  $\pi(\mu \mid \lambda_y, y) \propto \exp\{-\tfrac{\lambda_y}{2}\sum_{i=1}^{n}(y_i - \mu)^2\} \;\Rightarrow\; \mu \mid \lambda_y, y \sim N\!\left(\bar{y}_n,\ \tfrac{1}{n\lambda_y}\right)$
  $\pi(\lambda_y \mid \mu, y) \propto \lambda_y^{a_y + n/2 - 1} \exp\{-\lambda_y [\,b_y + \tfrac{1}{2}\sum_{i=1}^{n}(y_i - \mu)^2\,]\} \;\Rightarrow\; \lambda_y \mid \mu, y \sim \Gamma\!\left(a_y + \tfrac{n}{2},\ b_y + \tfrac{1}{2}\sum_{i=1}^{n}(y_i - \mu)^2\right)$

  6. Markov chain Monte Carlo – Gibbs sampling
  Given full conditionals for $\pi(\mu, \lambda_y \mid y)$, one can use Markov chain Monte Carlo (MCMC) to obtain draws from the posterior. The Gibbs sampler is an MCMC scheme which iteratively replaces each parameter, in turn, by a draw from its full conditional:
  initialize parameters at $(\mu, \lambda_y)^0$
  for $t = 1, \ldots, n_{iter}$ {
      set $\mu$ = a draw from $N(\bar{y}_n,\ \tfrac{1}{n\lambda_y})$
      set $\lambda_y$ = a draw from $\Gamma(a_y + \tfrac{n}{2},\ b_y + \tfrac{1}{2}\sum_{i=1}^{n}(y_i - \mu)^2)$
  }
  (Be sure to use the newly updated $\mu$ when updating $\lambda_y$.)
  The draws $(\mu, \lambda_y)^1, \ldots, (\mu, \lambda_y)^{n_{iter}}$ are a dependent sample from $\pi(\mu, \lambda_y \mid y)$. In practice, an initial portion of the sample is discarded to remove the effect of the initialization values $(\mu^0, \lambda_y^0)$.
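  The pseudocode above translates directly into a few lines of NumPy. Here is a minimal sketch of this Gibbs sampler, assuming a small synthetic data set and vague hyperparameters $a_y$, $b_y$ (both illustrative, not taken from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(loc=1.0, scale=1.0, size=20)    # hypothetical observations
n, ybar = len(y), y.mean()
a_y, b_y = 1.0, 0.005                          # vague Gamma(a_y, b_y) prior on lambda_y

n_iter, burn = 1000, 100
mu, lam_y = 0.0, 1.0                           # initialization (mu, lambda_y)^0
draws = np.empty((n_iter, 2))
for t in range(n_iter):
    # full conditional: mu | lambda_y, y ~ N(ybar, 1/(n*lambda_y))
    mu = rng.normal(ybar, np.sqrt(1.0 / (n * lam_y)))
    # full conditional: lambda_y | mu, y ~ Gamma(a_y + n/2, rate = b_y + 0.5*sum((y-mu)^2)),
    # using the newly updated mu; numpy's gamma takes shape and scale = 1/rate
    rate = b_y + 0.5 * np.sum((y - mu) ** 2)
    lam_y = rng.gamma(a_y + 0.5 * n, 1.0 / rate)
    draws[t] = (mu, lam_y)

posterior = draws[burn:]                       # discard burn-in
print(posterior.mean(axis=0))                  # posterior means of (mu, lambda_y)
```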

  7. Gibbs sampling for $\pi(\mu, \lambda_y \mid y)$
  [Figure: path of the Gibbs sampler in the $(\lambda_y, \mu)$ plane.]

  8. Posterior summary for $\pi(\mu, \lambda_y \mid y)$
  [Figure: the data $y_i$, the posterior scatter of $(\lambda_y, \mu)$, trace plots of $\mu$ and $\lambda_y$ over 1000 iterations, and the fitted mean overlaid on the data.]

  9. Gibbs sampler: intuition
  Gibbs sampler for a bivariate normal density
  $\pi(z) = \pi(z_1, z_2) \propto \exp\!\left\{-\tfrac{1}{2}\, (z_1\; z_2) \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}^{-1} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}\right\}$
  Full conditionals of $\pi(z)$: $z_1 \mid z_2 \sim N(\rho z_2, 1 - \rho^2)$ and $z_2 \mid z_1 \sim N(\rho z_1, 1 - \rho^2)$.
  • initialize the chain with $z^0 = (z^0_1, z^0_2)$, e.g. $z^0 \sim N\!\left(\begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}1 & \rho\\ \rho & 1\end{pmatrix}\right)$
  • draw $z^1_1 \sim N(\rho z^0_2, 1 - \rho^2)$; now $(z^1_1, z^0_2)^T \sim \pi(z)$

  10. Gibbs sampler: intuition
  The Gibbs sampler gives $z^0, z^1, \ldots, z^T$, which can be treated as dependent draws from $\pi(z)$. If $z^0$ is not a draw from $\pi(z)$, the initial realizations will not have the correct distribution; in practice the first 100 or 1000 realizations are discarded. The draws can be used to make inference about $\pi(z)$:
  • Posterior mean of $z$ is estimated by $\begin{pmatrix}\hat{\mu}_1\\ \hat{\mu}_2\end{pmatrix} = \dfrac{1}{T}\sum_{k=1}^{T} \begin{pmatrix} z^k_1 \\ z^k_2 \end{pmatrix}$
  • Posterior probabilities: $P(z_1 > 1) \approx \dfrac{1}{T}\sum_{k=1}^{T} I[z^k_1 > 1]$, $\quad P(z_1 > z_2) \approx \dfrac{1}{T}\sum_{k=1}^{T} I[z^k_1 > z^k_2]$
  • 90% interval for $z_1$: $(z_1^{[5\%]}, z_1^{[95\%]})$, the empirical quantiles of the draws.

  11. Sampling of $\pi(\mu, \lambda_y \mid y)$ via Metropolis
  Initialize parameters at some setting $(\mu, \lambda_y)^0$. For $t = 1, \ldots, T$ {
      update $\mu \mid \lambda_y, y$:
      • generate a proposal $\mu^* \sim U[\mu - r_\mu, \mu + r_\mu]$
      • compute the acceptance probability $\alpha = \min\!\left\{1,\ \dfrac{\pi(\mu^*, \lambda_y \mid y)}{\pi(\mu, \lambda_y \mid y)}\right\}$
      • update $\mu$ to its new value: $\mu^{new} = \mu^*$ with probability $\alpha$, $\mu^{new} = \mu$ with probability $1 - \alpha$
      update $\lambda_y \mid \mu, y$ analogously
  }
  Here we ran $T = 1000$ scans, giving realizations $(\mu, \lambda_y)^1, \ldots, (\mu, \lambda_y)^T$ from the posterior; the first 100 were discarded as burn-in.
  Note: the proposal width $r_\mu$ is tuned so that $\mu^*$ is accepted about half the time, and likewise $r_{\lambda_y}$ for $\lambda^*_y$.
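  A minimal sketch of this random-walk Metropolis scheme, working on the log of the unnormalized posterior for numerical stability; the data and hyperparameters are again illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(1.0, 1.0, size=20)
a_y, b_y = 1.0, 0.005

def log_post(mu, lam_y):
    # log of the unnormalized posterior pi(mu, lambda_y | y)
    if lam_y <= 0:
        return -np.inf                          # reject proposals outside the support
    n = len(y)
    return (0.5 * n * np.log(lam_y) - 0.5 * lam_y * np.sum((y - mu) ** 2)
            + (a_y - 1) * np.log(lam_y) - b_y * lam_y)

T, r_mu, r_lam = 1000, 0.5, 0.5                 # proposal widths, tuned for ~50% acceptance
mu, lam_y = 0.0, 1.0
draws = np.empty((T, 2))
for t in range(T):
    # update mu | lambda_y, y with a uniform random-walk proposal
    mu_star = rng.uniform(mu - r_mu, mu + r_mu)
    if np.log(rng.uniform()) < log_post(mu_star, lam_y) - log_post(mu, lam_y):
        mu = mu_star
    # update lambda_y | mu, y analogously
    lam_star = rng.uniform(lam_y - r_lam, lam_y + r_lam)
    if np.log(rng.uniform()) < log_post(mu, lam_star) - log_post(mu, lam_y):
        lam_y = lam_star
    draws[t] = (mu, lam_y)

posterior = draws[100:]                         # discard burn-in
```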

  12. Metropolis sampling for $\pi(\mu, \lambda_y \mid y)$
  [Figure: path of the Metropolis sampler in the $(\lambda_y, \mu)$ plane.]

  13. Posterior summary for $\pi(\mu, \lambda_y \mid y)$
  [Figure: the data $y_i$, the posterior scatter of $(\lambda_y, \mu)$, trace plots of $\mu$ and $\lambda_y$ over 1000 iterations, and the fitted mean overlaid on the data.]

  14. Sampling from non-standard multivariate distributions
  Nick Metropolis – computing pioneer at Los Alamos National Laboratory – inventor of the Monte Carlo method and of Markov chain Monte Carlo: "Equation of State Calculations by Fast Computing Machines" (1953) by N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller and E. Teller, Journal of Chemical Physics. Originally implemented on the MANIAC1 computer at LANL.
  The algorithm constructs a Markov chain whose realizations are draws from the target (posterior) distribution; its steps are constructed to maintain detailed balance.

  15. Gibbs sampling and Metropolis for a bivariate normal density
  $\pi(z_1, z_2) \propto \exp\!\left\{-\tfrac{1}{2}\, (z_1\; z_2) \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}^{-1} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}\right\}$
  Sampling from the full conditionals (also called the heat bath):
  $z_1 \mid z_2 \sim N(\rho z_2, 1 - \rho^2)$, $\quad z_2 \mid z_1 \sim N(\rho z_1, 1 - \rho^2)$
  Metropolis updating:
  generate $z^*_1 \sim U[z_1 - r, z_1 + r]$
  calculate $\alpha = \min\!\left\{1,\ \dfrac{\pi(z^*_1, z_2)}{\pi(z_1, z_2)}\right\} = \min\!\left\{1,\ \dfrac{\pi(z^*_1 \mid z_2)}{\pi(z_1 \mid z_2)}\right\}$
  set $z^{new}_1 = z^*_1$ with probability $\alpha$, $z^{new}_1 = z_1$ with probability $1 - \alpha$

  16. Kernel basis representation for spatial processes z(s)
  Define $m$ basis functions $k_1(s), \ldots, k_m(s)$, where $k_j(s)$ is a normal density centered at spatial location $\omega_j$:
  $k_j(s) = \dfrac{1}{\sqrt{2\pi}} \exp\{-\tfrac{1}{2}(s - \omega_j)^2\}$
  Set $z(s) = \sum_{j=1}^{m} k_j(s)\, x_j$ where $x \sim N(0, I_m)$.
  One can represent $z = (z(s_1), \ldots, z(s_n))^T$ as $z = Kx$ where $K_{ij} = k_j(s_i)$.
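  A minimal sketch of constructing the kernel basis $K$ and a realization $z = Kx$; the knot and evaluation locations below are illustrative assumptions:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
s = np.linspace(-2, 12, 200)                  # locations at which z(s) is evaluated
omega = np.arange(0.0, 11.0, 2.0)             # m knot locations
K = norm.pdf(s[:, None] - omega[None, :])     # K_ij = k_j(s_i), Gaussian kernel at omega_j

x = rng.standard_normal(len(omega))           # knot weights x ~ N(0, I_m)
z = K @ x                                     # realization of the spatial process z(s)
```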

  17. x and k(s) determine the spatial process z(s)
  [Figure: the kernels $k_j(s)$ scaled by their weights $x_j$ (left) and the resulting process $z(s)$ (right).]
  Continuous representation: $z(s) = \sum_{j=1}^{m} k_j(s)\, x_j$ where $x \sim N(0, I_m)$.
  Discrete representation: for $z = (z(s_1), \ldots, z(s_n))^T$, $z = Kx$ where $K_{ij} = k_j(s_i)$.

  18. Formulation for the 1-d example
  Data $y = (y(s_1), \ldots, y(s_n))^T$ are observed at locations $s_1, \ldots, s_n$. Once the knot locations $\omega_j$, $j = 1, \ldots, m$, and the kernel choice $k(s)$ are specified, the remaining model formulation is trivial.
  Likelihood: $L(y \mid x, \lambda_y) \propto \lambda_y^{n/2} \exp\{-\tfrac{\lambda_y}{2}(y - Kx)^T (y - Kx)\}$ where $K_{ij} = k(\omega_j - s_i)$.
  Priors:
  $\pi(x \mid \lambda_x) \propto \lambda_x^{m/2} \exp\{-\tfrac{\lambda_x}{2} x^T x\}$
  $\pi(\lambda_x) \propto \lambda_x^{a_x - 1} \exp\{-b_x \lambda_x\}$, $\quad \pi(\lambda_y) \propto \lambda_y^{a_y - 1} \exp\{-b_y \lambda_y\}$
  Posterior:
  $\pi(x, \lambda_x, \lambda_y \mid y) \propto \lambda_y^{a_y + n/2 - 1} \exp\{-\lambda_y [\,b_y + \tfrac{1}{2}(y - Kx)^T (y - Kx)\,]\} \times \lambda_x^{a_x + m/2 - 1} \exp\{-\lambda_x [\,b_x + \tfrac{1}{2} x^T x\,]\}$

  19. Posterior and full conditionals
  Posterior:
  $\pi(x, \lambda_x, \lambda_y \mid y) \propto \lambda_y^{a_y + n/2 - 1} \exp\{-\lambda_y [\,b_y + \tfrac{1}{2}(y - Kx)^T (y - Kx)\,]\} \times \lambda_x^{a_x + m/2 - 1} \exp\{-\lambda_x [\,b_x + \tfrac{1}{2} x^T x\,]\}$
  Full conditionals:
  $\pi(x \mid \cdots) \propto \exp\{-\tfrac{1}{2}[\,\lambda_y x^T K^T K x - 2\lambda_y x^T K^T y + \lambda_x x^T x\,]\}$
  $\pi(\lambda_x \mid \cdots) \propto \lambda_x^{a_x + m/2 - 1} \exp\{-\lambda_x [\,b_x + \tfrac{1}{2} x^T x\,]\}$
  $\pi(\lambda_y \mid \cdots) \propto \lambda_y^{a_y + n/2 - 1} \exp\{-\lambda_y [\,b_y + \tfrac{1}{2}(y - Kx)^T (y - Kx)\,]\}$
  Gibbs sampler implementation:
  $x \mid \cdots \sim N\!\left((\lambda_y K^T K + \lambda_x I_m)^{-1} \lambda_y K^T y,\ (\lambda_y K^T K + \lambda_x I_m)^{-1}\right)$
  $\lambda_x \mid \cdots \sim \Gamma(a_x + \tfrac{m}{2},\ b_x + \tfrac{1}{2} x^T x)$
  $\lambda_y \mid \cdots \sim \Gamma(a_y + \tfrac{n}{2},\ b_y + \tfrac{1}{2}(y - Kx)^T (y - Kx))$
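  A minimal sketch of this three-step Gibbs sampler, using the knot layout, data locations, and hyperparameters of the 1-d example on the next slide; the observed y values are synthetic:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
s = np.array([0.05, 0.25, 0.52, 0.65, 0.91])        # n = 5 data locations
y = np.sin(2 * np.pi * s) + 0.25 * rng.standard_normal(len(s))  # synthetic data
omega = np.linspace(-0.3, 1.2, 6)                   # m = 6 knots
K = norm.pdf(s[:, None] - omega[None, :], scale=0.3)
n, m = K.shape
a_y, b_y = 10.0, 10.0 * 0.25**2                     # strong prior at lambda_y = 1/0.25^2
a_x, b_x = 1.0, 0.001

lam_x, lam_y = 1.0, 1.0 / 0.25**2
x = np.zeros(m)
for t in range(1000):
    # x | ... ~ N((lam_y K'K + lam_x I)^{-1} lam_y K'y, (lam_y K'K + lam_x I)^{-1})
    prec = lam_y * K.T @ K + lam_x * np.eye(m)
    cov = np.linalg.inv(prec)
    x = rng.multivariate_normal(cov @ (lam_y * K.T @ y), cov)
    # lambda_x | ... ~ Gamma(a_x + m/2, rate = b_x + 0.5 x'x)
    lam_x = rng.gamma(a_x + 0.5 * m, 1.0 / (b_x + 0.5 * x @ x))
    # lambda_y | ... ~ Gamma(a_y + n/2, rate = b_y + 0.5 (y - Kx)'(y - Kx))
    resid = y - K @ x
    lam_y = rng.gamma(a_y + 0.5 * n, 1.0 / (b_y + 0.5 * resid @ resid))
```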

  20. 1-d example
  $m = 6$ knots evenly spaced between $-0.3$ and $1.2$; $n = 5$ data points at $s = 0.05, 0.25, 0.52, 0.65, 0.91$; $k(s)$ is $N(0, \mathrm{sd} = 0.3)$.
  $a_y = 10$, $b_y = 10\cdot(0.25^2)$ $\Rightarrow$ strong prior at $\lambda_y = 1/0.25^2$; $a_x = 1$, $b_x = 0.001$.
  [Figure: the six basis functions $k_j(s)$, the data $y(s)$, and MCMC traces of the knot weights $x_j$ and of $\lambda_x, \lambda_y$ over 1000 iterations.]

  21. 1-d example
  From posterior realizations of the knot weights $x$, one can construct posterior realizations of the smooth fitted function $z(s) = \sum_{j=1}^{m} k_j(s)\, x_j$. Note that the strong prior on $\lambda_y$ is required since $n$ is small.
  [Figure: posterior realizations of $z(s)$, and the posterior mean with pointwise 90% bounds, overlaid on the data.]

  22. References
  • D. Gamerman and H. F. Lopes (2006) Markov Chain Monte Carlo, Chapman & Hall.
  • A. Gelman, J. B. Carlin, H. S. Stern and D. B. Rubin (1995) Bayesian Data Analysis, Chapman & Hall.
  • M. Schervish (1995) Theory of Statistics, Springer-Verlag.
  • J. Besag, P. J. Green, D. Higdon and K. Mengersen (1995) Bayesian computation and stochastic systems (with discussion), Statistical Science, 10, 3-66.
  • D. Higdon (2002) Space and space-time modeling using process convolutions, in Quantitative Methods for Current Environmental Issues (C. Anderson, V. Barnett, P. C. Chatwin and A. H. El-Shaarawi, eds), 37-56.
  • D. Higdon, H. Lee and C. Holloman (2003) Markov chain Monte Carlo approaches for inference in computationally intensive inverse problems, in Bayesian Statistics 7, Proceedings of the Seventh Valencia International Meeting (Bernardo, Bayarri, Berger, Dawid, Heckerman, Smith and West, eds).

  23. GAUSSIAN PROCESSES 1

  24. Gaussian process models for spatial phenomena
  [Figure: an example realization $z(s)$ of a Gaussian process over $s \in [0, 7]$.]
  A Gaussian process model on $s_1, \ldots, s_n$:
  $z = \begin{pmatrix} z(s_1) \\ \vdots \\ z(s_n) \end{pmatrix} \sim N\!\left(\begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix},\ \Sigma\right)$, with $\Sigma_{ij} = \exp\{-\|s_i - s_j\|^2\}$,
  where $\|s_i - s_j\|$ denotes the distance between locations $s_i$ and $s_j$. Then $z$ has density $\pi(z) = (2\pi)^{-n/2}\, |\Sigma|^{-1/2} \exp\{-\tfrac{1}{2} z^T \Sigma^{-1} z\}$.

  25. Realizations from $\pi(z) = (2\pi)^{-n/2}\, |\Sigma|^{-1/2} \exp\{-\tfrac{1}{2} z^T \Sigma^{-1} z\}$
  [Figure: three independent realizations $z(s)$ over $s \in [0, 7]$.]
  The model for $z(s)$ can be extended to continuous $s$.

  26. Generating multivariate normal realizations
  Independent normal draws $u \sim N(0, I_n)$ are standard in any computing package. A well-known property of normals: if $u \sim N(\mu, \Sigma)$, then $z = Ku \sim N(K\mu, K\Sigma K^T)$. Use this to construct correlated realizations from iid ones. To obtain $z \sim N(0, \Sigma)$:
  1. compute a square-root matrix $L$ such that $LL^T = \Sigma$;
  2. generate $u \sim N(0, I_n)$;
  3. set $z = Lu \sim N(0, L I_n L^T = \Sigma)$.
  • Any square-root matrix $L$ will do here.
  • Columns of $L$ are basis functions for representing realizations $z$.
  • $L$ need not be square – see over- or under-specified bases.
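  A minimal sketch of steps 1-3 with a Cholesky square root; the 20 equally spaced locations match the following slides, and the small diagonal jitter is added only for numerical stability:

```python
import numpy as np

rng = np.random.default_rng(4)
s = np.linspace(0, 10, 20)                             # 20 equally spaced locations
Sigma = np.exp(-(s[:, None] - s[None, :]) ** 2)        # Sigma_ij = exp{-||s_i - s_j||^2}
L = np.linalg.cholesky(Sigma + 1e-8 * np.eye(len(s)))  # square root: L L^T = Sigma
u = rng.standard_normal(len(s))                        # u ~ N(0, I_n)
z = L @ u                                              # z = L u ~ N(0, Sigma)
```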

  27. Standard Cholesky decomposition
  $z \sim N(0, \Sigma)$, $\Sigma = LL^T$, $z = Lu$ where $u \sim N(0, I_n)$ and $L$ is lower triangular.
  $\Sigma_{ij} = \exp\{-\|s_i - s_j\|^2\}$, with $s_1, \ldots, s_{20}$ equally spaced between 0 and 10.
  [Figure: the sparsity pattern of $L$ and its columns plotted as basis functions over $s$.]

  28. Cholesky decomposition with pivoting
  $z \sim N(0, \Sigma)$, $\Sigma = LL^T$, $z = Lu$ where $u \sim N(0, I_n)$ and $L$ is a permuted lower triangular matrix.
  $\Sigma_{ij} = \exp\{-\|s_i - s_j\|^2\}$, with $s_1, \ldots, s_{20}$ equally spaced between 0 and 10.
  [Figure: the pattern of the pivoted $L$ and its columns plotted as basis functions over $s$.]

  29. Singular value decomposition
  $z \sim N(0, \Sigma)$, $\Sigma = U\Lambda U^T = LL^T$ with $L = U\Lambda^{1/2}$, $z = Lu$ where $u \sim N(0, I_n)$.
  $\Sigma_{ij} = \exp\{-\|s_i - s_j\|^2\}$, with $s_1, \ldots, s_{20}$ equally spaced between 0 and 10.
  [Figure: the pattern of $L$ from the SVD and its columns plotted as basis functions over $s$.]

  30. Conditioning on some observations of z(s)
  [Figure: a GP realization $z(s)$ over $s \in [0, 7]$ with observations at $s_2$ and $s_5$.]
  We observe $z(s_2)$ and $z(s_5)$ – what do we now know about $\{z(s_1), z(s_3), z(s_4), z(s_6), z(s_7), z(s_8)\}$?
  Order the vector as $(z(s_2), z(s_5), z(s_1), z(s_3), \ldots, z(s_8))^T \sim N(0, \Sigma)$ with $\Sigma_{ij} = \exp\{-\|s_i - s_j\|^2\}$ (e.g. $0.3679 = e^{-1}$ for unit-distance pairs, with a small nugget on the diagonal), and partition $\Sigma$ into the observed and unobserved blocks.

  31. Conditioning on some observations of z(s)
  $\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} \sim N\!\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix},\ \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}\right) \;\Rightarrow\; z_2 \mid z_1 \sim N\!\left(\Sigma_{21}\Sigma_{11}^{-1} z_1,\ \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)$
  [Figure: the conditional mean and conditional realizations of $z(s)$ given the two observations.]
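  A minimal sketch of this conditional-normal rule applied to the GP conditioning example; the locations and the two observed values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)
s = np.arange(1.0, 9.0)                                  # s_1, ..., s_8
obs, rest = [1, 4], [0, 2, 3, 5, 6, 7]                   # indices of s_2, s_5 and the others
Sigma = np.exp(-(s[:, None] - s[None, :]) ** 2)
S11 = Sigma[np.ix_(obs, obs)]
S12 = Sigma[np.ix_(obs, rest)]
S22 = Sigma[np.ix_(rest, rest)]

z1 = np.array([0.5, -1.0])                               # observed z(s_2), z(s_5)
cond_mean = S12.T @ np.linalg.solve(S11, z1)             # Sigma_21 Sigma_11^{-1} z_1
cond_cov = S22 - S12.T @ np.linalg.solve(S11, S12)       # Sigma_22 - Sigma_21 Sigma_11^{-1} Sigma_12
cond_cov += 1e-9 * np.eye(len(rest))                     # jitter for numerical stability
draw = rng.multivariate_normal(cond_mean, cond_cov)      # a conditional realization
```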

  32. More examples with various covariance functions and spatial scales
  $\Sigma_{ij} = \exp\{-(\|s_i - s_j\|/\mathrm{scale})^2\}$ (Gaussian), $\quad \Sigma_{ij} = \exp\{-(\|s_i - s_j\|/\mathrm{scale})^1\}$ (exponential)
  [Figure: conditional realizations for Gaussian covariance (scale = 2, 3, 5), exponential covariance (scale = 1, 10, 20), and Brownian-motion-type covariance ($p = 1.5$; scale = 1, 3, 5), interpolating the six observed points.]

  33. More examples with various covariance functions and spatial scales
  Same covariance functions as on the previous slide.
  [Figure: the same conditional realizations extended over a wider range of $x$ ($-5$ to $20$), showing behaviour away from the observations.]

  34. A 2-d example, conditioning on the edge
  $\Sigma_{ij} = \exp\{-(\|s_i - s_j\|/5)^2\}$
  [Figure: an unconditional realization, the mean conditional on the points along the edge Y = 1, and two realizations conditional on those points, plotted as surfaces over (X, Y).]

  35. Soft conditioning (Bayes rule)
  [Figure: a GP realization $z(s)$ with noisy observations over $s \in [0, 7]$.]
  Observed data $y$ are a noisy version of $z$: $y(s_i) = z(s_i) + \epsilon(s_i)$ with $\epsilon(s_i) \overset{iid}{\sim} N(0, \sigma_y^2)$, $i = 1, \ldots, n$.
  Data model: $\Sigma_y = \sigma_y^2 I_n$; spatial process prior for $z(s)$: mean $\mu_z = 0$, covariance $\Sigma_z$.
  $L(y \mid z) \propto |\Sigma_y|^{-1/2}\exp\{-\tfrac{1}{2}(y - z)^T \Sigma_y^{-1}(y - z)\}$, $\quad \pi(z) \propto |\Sigma_z|^{-1/2}\exp\{-\tfrac{1}{2} z^T \Sigma_z^{-1} z\}$

  36. Soft conditioning (Bayes rule), continued
  Sampling model: $L(y \mid z) \propto |\Sigma_y|^{-1/2}\exp\{-\tfrac{1}{2}(y - z)^T \Sigma_y^{-1}(y - z)\}$; spatial prior: $\pi(z) \propto |\Sigma_z|^{-1/2}\exp\{-\tfrac{1}{2} z^T \Sigma_z^{-1} z\}$
  $\Rightarrow \pi(z \mid y) \propto L(y \mid z) \times \pi(z) \propto \exp\{-\tfrac{1}{2}[\, z^T(\Sigma_y^{-1} + \Sigma_z^{-1}) z - 2 z^T \Sigma_y^{-1} y + f(y)\,]\}$
  $\Rightarrow z \mid y \sim N(V \Sigma_y^{-1} y,\ V)$, where $V = (\Sigma_y^{-1} + \Sigma_z^{-1})^{-1}$
  [Figure: conditional realizations of $z(s)$ given the noisy observations.]
  $\pi(z \mid y)$ describes the updated uncertainty about $z$ given the observations.

  37. Updated predictions for unobserved z(s)'s
  [Figure: conditional realizations of $z(s)$.]
  Data locations: $y_d = (y(s_1), \ldots, y(s_n))^T$, $z_d = (z(s_1), \ldots, z(s_n))^T$.
  Prediction locations: $y^* = (y(s^*_1), \ldots, y(s^*_m))^T$, $z^* = (z(s^*_1), \ldots, z(s^*_m))^T$. Define $y = (y_d; y^*)$ and $z = (z_d; z^*)$.
  Data: $\mu_z = \begin{pmatrix} 0_n \\ 0_m \end{pmatrix}$, $\quad \Sigma_y = \begin{pmatrix} \sigma_y^2 I_n & 0 \\ 0 & \infty I_m \end{pmatrix}$, $\quad \Sigma_z$ = covariance rule applied to $(s, s^*)$.
  Define the generalized inverse $\Sigma_y^{-} = \begin{pmatrix} \tfrac{1}{\sigma_y^2} I_n & 0 \\ 0 & 0 \end{pmatrix}$.
  Now the posterior distribution for $z = (z_d, z^*)$ is $z \mid y \sim N(V \Sigma_y^{-} y,\ V)$, where $V = (\Sigma_y^{-} + \Sigma_z^{-1})^{-1}$.

  38. Updated predictions for unobserved z(s)'s – alternative: use the conditional normal rules
  [Figure: conditional realizations of $z(s)$.]
  Data locations: $y = (y(s_1), \ldots, y(s_n))^T = (z(s_1) + \epsilon(s_1), \ldots, z(s_n) + \epsilon(s_n))^T$; prediction locations: $z^* = (z(s^*_1), \ldots, z(s^*_m))^T$.
  Jointly: $\begin{pmatrix} y \\ z^* \end{pmatrix} \sim N\!\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix},\ \begin{pmatrix} \sigma_y^2 I_n & 0 \\ 0 & 0 \end{pmatrix} + \Sigma_z\right)$,
  where $\Sigma_z = \begin{pmatrix} \Sigma_z(s, s) & \Sigma_z(s, s^*) \\ \Sigma_z(s^*, s) & \Sigma_z(s^*, s^*) \end{pmatrix}$ is the $(n+m)\times(n+m)$ covariance rule applied to $(s, s^*)$.
  Therefore $z^* \mid y \sim N(\mu^*, \Sigma^*)$ where
  $\mu^* = \Sigma_z(s^*, s)\,[\sigma_y^2 I_n + \Sigma_z(s, s)]^{-1}\, y$
  $\Sigma^* = \Sigma_z(s^*, s^*) - \Sigma_z(s^*, s)\,[\sigma_y^2 I_n + \Sigma_z(s, s)]^{-1}\, \Sigma_z(s, s^*)$
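  A minimal sketch of these prediction formulas: posterior mean and covariance of $z$ at new locations given noisy observations, with a conditional realization. The data values, locations, and $\sigma_y$ are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(6)

def cov(a, b):
    # squared-exponential covariance rule, Sigma_z(a, b)_ij = exp{-(a_i - b_j)^2}
    return np.exp(-(a[:, None] - b[None, :]) ** 2)

s = np.array([0.5, 2.0, 3.5, 5.0, 6.5])                  # data locations
y = np.array([0.3, 1.1, -0.4, -1.2, 0.2])                # noisy observations
s_star = np.linspace(0, 7, 50)                           # prediction locations
sigma_y = 0.1

A = sigma_y**2 * np.eye(len(s)) + cov(s, s)              # sigma_y^2 I_n + Sigma_z(s, s)
mu_star = cov(s_star, s) @ np.linalg.solve(A, y)         # posterior mean at s_star
Sig_star = cov(s_star, s_star) - cov(s_star, s) @ np.linalg.solve(A, cov(s, s_star))
draw = rng.multivariate_normal(mu_star, Sig_star + 1e-8 * np.eye(len(s_star)))
```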

  39. Example: dioxin concentration at the Piazza Road Superfund site
  [Figure: posterior mean of $z^*$, the sampled data locations, and the pointwise posterior standard deviation over the site.]

  40. Bonus topic: constructing simultaneous intervals
  • generate a large sample of $m$-vectors $z^*$ from $\pi(z^* \mid y)$
  • compute the $m$-vector $\hat{z}^*$ that is the mean of the generated $z^*$'s
  • compute the $m$-vector $\hat{\sigma}$ that is the pointwise sd of the generated $z^*$'s
  • find the constant $a$ such that 80% of the generated $z^*$'s are completely contained within $\hat{z}^* \pm a\hat{\sigma}$ (see the sketch below)
  [Figure: posterior realizations and mean, the pointwise estimated sd, and the resulting simultaneous 80% credible interval.]
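  A minimal sketch of this recipe as a small helper function; the posterior draws fed to it here are placeholders standing in for output of an earlier sampler:

```python
import numpy as np

def simultaneous_interval(draws, coverage=0.80):
    # draws: (num_draws, m) matrix of posterior realizations of the m-vector z*
    zhat = draws.mean(axis=0)                     # pointwise posterior mean
    sd = draws.std(axis=0)                        # pointwise posterior sd
    # largest standardized deviation of each realization from the mean
    dev = np.max(np.abs(draws - zhat) / sd, axis=1)
    a = np.quantile(dev, coverage)                # 80% of draws lie inside zhat +/- a*sd
    return zhat - a * sd, zhat + a * sd

# usage with hypothetical draws
rng = np.random.default_rng(7)
draws = rng.multivariate_normal(np.zeros(10), np.eye(10), size=2000)
lower, upper = simultaneous_interval(draws)
```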

  41. References
  • B. Ripley (1989) Spatial Statistics, Wiley.
  • N. Cressie (1992) Statistics for Spatial Data, Wiley.
  • M. Stein (1999) Interpolation of Spatial Data: Some Theory for Kriging, Springer.

  42. GAUSSIAN PROCESSES 2

  43. Gaussian process models revisited
  Application: finding the resonance frequency of a rod of material.
  • for various driving frequencies, the acceleration of the rod is recorded
  • the true frequency-acceleration curve is smooth
  • we have noisy measurements of acceleration
  • estimate the resonance frequency
  • use a GP model for the frequency-acceleration curve
  • smoothness of the GP model is important here
  [Figure: scaled acceleration versus scaled frequency data.]

  44. Gaussian process model formulation
  Take the response $y$ to be acceleration and the spatial value $s$ to be frequency.
  Data: $y = (y_1, \ldots, y_n)^T$ at spatial locations $s_1, \ldots, s_n$.
  $z(s)$ is a mean-0 Gaussian process with covariance function $\mathrm{Cov}(z(s), z(s')) = \tfrac{1}{\lambda_z}\exp\{-\beta(s - s')^2\}$, where $\beta$ controls the strength of dependence.
  Take $z = (z(s_1), \ldots, z(s_n))^T$ to be $z(s)$ restricted to the data observations, and model the data as $y = z + \epsilon$, where $\epsilon \sim N(0, \tfrac{1}{\lambda_y} I_n)$.
  We want the posterior distribution for the frequency $s^\star$ where $z(s)$ is maximal.

  45. Reparameterizing the spatial dependence parameter β
  It is convenient to reparameterize $\beta$ as $\rho = \exp\{-\beta(1/2)^2\} \Leftrightarrow \beta = -4\log(\rho)$, so $\rho$ is the correlation between two points on $z(s)$ separated by $\tfrac{1}{2}$.
  Hence $z$ has spatial prior $z \mid \rho, \lambda_z \sim N(0, \tfrac{1}{\lambda_z} R(\rho; s))$, where $R(\rho; s)$ is the correlation matrix with elements $R_{ij} = \rho^{4(s_i - s_j)^2}$.
  The prior specification for $z(s)$ is completed by specifying priors for $\lambda_z$ and $\rho$:
  $\pi(\lambda_z) \propto \lambda_z^{a_z - 1}\exp\{-b_z\lambda_z\}$ – if $y$ is standardized, encourage $\lambda_z$ to be close to 1, e.g. $a_z = b_z = 5$;
  $\pi(\rho) \propto (1 - \rho)^{-.5}$ – encourages $\rho$ to be large if possible.

  46. Bayesian model formulation
  Likelihood: $L(y \mid z, \lambda_y) \propto \lambda_y^{n/2}\exp\{-\tfrac{\lambda_y}{2}(y - z)^T(y - z)\}$
  Priors:
  $\pi(z \mid \lambda_z, \rho) \propto \lambda_z^{n/2}\, |R(\rho; s)|^{-1/2}\exp\{-\tfrac{\lambda_z}{2} z^T R(\rho; s)^{-1} z\}$
  $\pi(\lambda_y) \propto \lambda_y^{a_y - 1} e^{-b_y\lambda_y}$, uninformative here – $a_y = 1$, $b_y = .005$
  $\pi(\lambda_z) \propto \lambda_z^{a_z - 1} e^{-b_z\lambda_z}$, fairly informative – $a_z = 5$, $b_z = 5$
  $\pi(\rho) \propto (1 - \rho)^{-.5}$
  Marginal likelihood (integrating out $z$): $L(y \mid \lambda_y, \lambda_z, \rho) \propto |\Lambda|^{1/2}\exp\{-\tfrac{1}{2} y^T \Lambda y\}$ where $\Lambda^{-1} = \tfrac{1}{\lambda_y} I_n + \tfrac{1}{\lambda_z} R(\rho; s)$
  Posterior: $\pi(\lambda_y, \lambda_z, \rho \mid y) \propto |\Lambda|^{1/2}\exp\{-\tfrac{1}{2} y^T \Lambda y\} \times \lambda_y^{a_y - 1} e^{-b_y\lambda_y} \times \lambda_z^{a_z - 1} e^{-b_z\lambda_z} \times (1 - \rho)^{-.5}$
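  A minimal sketch of evaluating the log of this marginal posterior (up to a constant), which is the quantity a Metropolis sampler for $(\lambda_y, \lambda_z, \rho)$ would call; the data and hyperparameter values are illustrative assumptions:

```python
import numpy as np

s = np.linspace(0, 1, 15)
y = np.sin(4 * np.pi * s) + 0.1 * np.random.default_rng(8).standard_normal(15)
a_y, b_y, a_z, b_z = 1.0, 0.005, 5.0, 5.0

def log_post(lam_y, lam_z, rho):
    # log of the unnormalized marginal posterior pi(lambda_y, lambda_z, rho | y)
    if lam_y <= 0 or lam_z <= 0 or not (0 < rho < 1):
        return -np.inf
    R = rho ** (4.0 * (s[:, None] - s[None, :]) ** 2)      # R_ij = rho^{4(s_i - s_j)^2}
    Lam_inv = np.eye(len(s)) / lam_y + R / lam_z            # Lambda^{-1}
    _, logdet = np.linalg.slogdet(Lam_inv)
    quad = y @ np.linalg.solve(Lam_inv, y)                  # y^T Lambda y
    return (-0.5 * logdet - 0.5 * quad                      # |Lambda|^{1/2} exp{-y' Lambda y / 2}
            + (a_y - 1) * np.log(lam_y) - b_y * lam_y       # prior on lambda_y
            + (a_z - 1) * np.log(lam_z) - b_z * lam_z       # prior on lambda_z
            - 0.5 * np.log(1.0 - rho))                      # prior on rho
```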

  47. Posterior simulation
  Use Metropolis to simulate from the posterior
  $\pi(\lambda_y, \lambda_z, \rho \mid y) \propto |\Lambda|^{1/2}\exp\{-\tfrac{1}{2} y^T \Lambda y\} \times \lambda_y^{a_y - 1} e^{-b_y\lambda_y} \times \lambda_z^{a_z - 1} e^{-b_z\lambda_z} \times (1 - \rho)^{-.5}$
  giving (after burn-in) $(\lambda_y, \lambda_z, \rho)^1, \ldots, (\lambda_y, \lambda_z, \rho)^T$.
  For any given realization $(\lambda_y, \lambda_z, \rho)^t$, one can generate $z^* = (z(s^*_1), \ldots, z(s^*_m))^T$ for any set of prediction locations $s^*_1, \ldots, s^*_m$. From the previous GP results, we know
  $\begin{pmatrix} z \\ z^* \end{pmatrix} \Big| \cdots \sim N\!\left(V\Sigma_y^{-}\begin{pmatrix} y \\ 0_m \end{pmatrix},\ V\right)$
  where $\Sigma_y^{-} = \begin{pmatrix} \lambda_y I_n & 0 \\ 0 & 0 \end{pmatrix}$ and $V^{-1} = \Sigma_y^{-} + \lambda_z R(\rho, (s, s^*))^{-1}$.
  Hence, one can generate corresponding $z^*$'s for each posterior realization on a fine grid around the apparent resonance frequency.

  48. Or use the conditional normal formula
  $\begin{pmatrix} y \\ z^* \end{pmatrix} \Big| \cdots \sim N\!\left(\begin{pmatrix} 0_n \\ 0_m \end{pmatrix},\ \begin{pmatrix} \lambda_y^{-1} I_n & 0 \\ 0 & 0 \end{pmatrix} + \lambda_z^{-1} R(\rho, (s, s^*))\right)$
  where $R(\rho, (s, s^*)) = \begin{pmatrix} R(\rho, (s, s)) & R(\rho, (s, s^*)) \\ R(\rho, (s^*, s)) & R(\rho, (s^*, s^*)) \end{pmatrix}$ is the $(n+m)\times(n+m)$ correlation rule applied to $(s, s^*)$.
  Therefore $z^* \mid y \sim N(\mu^*, \Sigma^*)$ where
  $\mu^* = \lambda_z^{-1} R(\rho, (s^*, s))\,[\lambda_y^{-1} I_n + \lambda_z^{-1} R(\rho, (s, s))]^{-1}\, y$
  $\Sigma^* = \lambda_z^{-1} R(\rho, (s^*, s^*)) - \lambda_z^{-1} R(\rho, (s^*, s))\,[\lambda_y^{-1} I_n + \lambda_z^{-1} R(\rho, (s, s))]^{-1}\, \lambda_z^{-1} R(\rho, (s, s^*))$

  49. MCMC output for $(\lambda_y, \lambda_z, \rho)$
  [Figure: trace plots of $\rho$, $\lambda_z$, and $\lambda_y$ over 3000 iterations.]

  50. Posterior realizations for $z(s)$ near the resonance frequency
  [Figure: posterior realizations of $z(s)$ near the resonance peak, overlaid on the scaled acceleration data.]

  51. Posterior for the resonance frequency
  [Figure: posterior distribution of the scaled resonance frequency, concentrated between roughly 0.43 and 0.52.]

  52. Gaussian processes for modeling complex computer simulators
  Data $y = (y_1, \ldots, y_n)^T$ at input settings (spatial locations) $S = \begin{pmatrix} s_1 \\ \vdots \\ s_n \end{pmatrix} = \begin{pmatrix} s_{11} & s_{12} & \cdots & s_{1p} \\ \vdots & & & \vdots \\ s_{n1} & s_{n2} & \cdots & s_{np} \end{pmatrix}$
  Model the responses $y$ as a (stochastic) function of $s$: $y(s) = z(s) + \epsilon(s)$.
  Vector form – restricting to the $n$ data points: $y = z + \epsilon$.

  53. Model the response as a Gaussian process
  $y(s) = z(s) + \epsilon$
  Likelihood: $L(y \mid z, \lambda_\epsilon) \propto \lambda_\epsilon^{n/2}\exp\{-\tfrac{\lambda_\epsilon}{2}(y - z)^T(y - z)\}$
  Priors:
  $\pi(z \mid \lambda_z, \beta) \propto \lambda_z^{n/2}\, |R(\beta)|^{-1/2}\exp\{-\tfrac{\lambda_z}{2} z^T R(\beta)^{-1} z\}$
  $\pi(\lambda_\epsilon) \propto \lambda_\epsilon^{a_\epsilon - 1} e^{-b_\epsilon\lambda_\epsilon}$, perhaps quite informative
  $\pi(\lambda_z) \propto \lambda_z^{a_z - 1} e^{-b_z\lambda_z}$, fairly informative if the data have been standardized
  $\pi(\rho) \propto \prod_{k=1}^{p}(1 - \rho_k)^{-.5}$
  Marginal likelihood (integrating out $z$): $L(y \mid \lambda_\epsilon, \lambda_z, \beta) \propto |\Lambda|^{1/2}\exp\{-\tfrac{1}{2} y^T\Lambda y\}$ where $\Lambda^{-1} = \tfrac{1}{\lambda_\epsilon} I_n + \tfrac{1}{\lambda_z} R(\beta)$

  54. GASP covariance model for z(s)
  $\mathrm{Cov}(z(s_i), z(s_j)) = \tfrac{1}{\lambda_z} R(\beta)_{ij} = \tfrac{1}{\lambda_z}\prod_{k=1}^{p}\exp\{-\beta_k(s_{ik} - s_{jk})^\alpha\}$
  • Typically $\alpha = 2$ $\Rightarrow$ $z(s)$ is smooth.
  • Separable covariance – a product of componentwise covariances.
  • Can handle a large number of covariates/inputs $p$.
  • Can allow for multiway interactions.
  • $\beta_k = 0$ $\Rightarrow$ input $k$ is "inactive" $\Rightarrow$ variable selection.
  • Reparameterize: $\rho_k = \exp\{-\beta_k d_0^\alpha\}$ – typically $d_0$ is a halfwidth. (A sketch of this covariance appears below.)
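  A minimal sketch of the separable GASP covariance and a draw from the resulting GP; the design and parameter values are illustrative assumptions, with $\beta_3 = 0$ showing an inactive input:

```python
import numpy as np

def gasp_cov(S, beta, lam_z, alpha=2.0):
    # S: (n, p) input design; beta: (p,) dependence parameters; lam_z: marginal precision
    diffs = np.abs(S[:, None, :] - S[None, :, :]) ** alpha   # |s_ik - s_jk|^alpha
    R = np.exp(-np.tensordot(diffs, beta, axes=([2], [0])))  # product over the p inputs
    return R / lam_z

rng = np.random.default_rng(9)
S = rng.uniform(size=(30, 3))                 # n = 30 runs, p = 3 inputs in [0, 1]
beta = np.array([4.0, 0.5, 0.0])              # beta_3 = 0: input 3 is "inactive"
C = gasp_cov(S, beta, lam_z=1.0)
z = rng.multivariate_normal(np.zeros(30), C + 1e-8 * np.eye(30))
```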

  55. Posterior distribution and MCMC
  $\pi(\lambda_\epsilon, \lambda_z, \rho \mid y) \propto |\Lambda_{\lambda,\rho}|^{1/2}\exp\{-\tfrac{1}{2} y^T\Lambda_{\lambda,\rho}\, y\} \times \lambda_\epsilon^{a_\epsilon - 1} e^{-b_\epsilon\lambda_\epsilon} \times \lambda_z^{a_z - 1} e^{-b_z\lambda_z} \times \prod_{k=1}^{p}(1 - \rho_k)^{-.5}$
  • MCMC implementation requires Metropolis updates.
  • Realizations of $z(s) \mid \lambda, \rho, y$ can be obtained post hoc: define $z^* = (z(s^*_1), \ldots, z(s^*_m))^T$ to be predictions at locations $s^*_1, \ldots, s^*_m$; then
  $\begin{pmatrix} z \\ z^* \end{pmatrix} \Big| \cdots \sim N\!\left(V\Sigma_y^{-}\begin{pmatrix} y \\ 0_m \end{pmatrix},\ V\right)$ where $\Sigma_y^{-} = \begin{pmatrix} \lambda_\epsilon I_n & 0 \\ 0 & 0 \end{pmatrix}$ and $V^{-1} = \Sigma_y^{-} + \lambda_z R(\rho, (s, s^*))^{-1}$.

  56. Example: solar collector code (Schonlau, Hamada and Welch, 1995)
  • $n = 98$ model runs, varying 6 independent variables.
  • Response is the increase in heat exchange effectiveness.
  • A Latin hypercube (LHC) design was used with 2-d space filling.
  [Figure: pairs plot of the six inputs – wind.vel, slot.width, Rey.num, admittance, plate.thickness, Nusselt.num – showing the space-filling design.]

  57. Example: solar collector code
  • Fit of the GASP model and predictions of 10 holdout points.
  • The two most active covariates (x4 and x5) are shown here.
  [Figure: MCMC traces of lame, lamz, and beta(1:p); predictions versus data for the holdout points; and the posterior mean surface over (x4, x5).]

  58. Example: solar collector code
  • Visualizing a 6-d response surface is difficult.
  • 1-d marginal effects are shown here.
  [Figure: estimated 1-d marginal effects of the response on each of the inputs x1-x6.]

  59. References
  • J. Sacks, W. J. Welch, T. J. Mitchell and H. P. Wynn (1989) Design and analysis of computer experiments, Statistical Science, 4, 409-435.

  60. COMPUTER MODEL CALIBRATION 1

  61. Inference combining a physics model with experimental data
  Data generated from the model: $\dfrac{d^2 z}{dt^2} = -1 - 0.3\dfrac{dz}{dt} + \epsilon$
  Simulation model: $\dfrac{d^2 z}{dt^2} = -1$; statistical model: $y(z) = \eta(z) + \delta(z) + \epsilon$
  Improved physics model: $\dfrac{d^2 z}{dt^2} = -1 - \theta\dfrac{dz}{dt} + \epsilon$; statistical model: $y(z) = \eta(z, \theta) + \delta(z) + \epsilon$
  [Figure: drop time versus drop height (floor) for the data, the simple simulator, and the improved simulator.]

  62. Accounting for limited simulator runs
  • Borrows from Kennedy and O'Hagan (2001).
  $x$: model or system inputs; $\theta$: calibration parameters; $\zeta(x)$: true physical system response given inputs $x$; $\eta(x, \theta)$: simulator response at $x$ and $\theta$.
  The simulator is run at a limited number of input settings: $\eta = (\eta(x^*_1, \theta^*_1), \ldots, \eta(x^*_m, \theta^*_m))^T$. Treat $\eta(\cdot, \cdot)$ as a random function and use a GP prior for $\eta(\cdot, \cdot)$.
  $y(x)$: experimental observation of the physical system; $e(x)$: observation error of the experimental data.
  $y(x) = \zeta(x) + e(x)$, $\quad y(x) = \eta(x, \theta) + e(x)$
  [Figure: data and simulations $y(x)$, $\eta(x, \theta)$ over $(x, \theta)$, and the model-run locations.]

  63. OA designs for simulator runs
  Example: $N = 16$, 3 factors each at 4 levels – an OA(16, 4^3) design and its 2-d projections.
  [Figure: the design points in $(x_1, x_2, x_3)$ and their pairwise projections.]
  The OA design ensures importance measures $R^2$ can be accurately estimated for low dimensions. The design can be spread out for building a response-surface emulator of $\eta(x)$.

  64. Gaussian process models for combining field data and complex computer simulators
  Field data $y = (y(x_1), \ldots, y(x_n))^T$ at input settings $\begin{pmatrix} x_{11} & \cdots & x_{1 p_x} \\ \vdots & & \vdots \\ x_{n1} & \cdots & x_{n p_x} \end{pmatrix}$.
  Simulation data $\eta = (\eta(x^*_1, \theta^*_1), \ldots, \eta(x^*_m, \theta^*_m))^T$ at input settings $x^*$ and parameters $\theta^*$: $\begin{pmatrix} x^*_{11} & \cdots & x^*_{1 p_x} & \theta^*_{11} & \cdots & \theta^*_{1 p_\theta} \\ \vdots & & & & & \vdots \\ x^*_{m1} & \cdots & x^*_{m p_x} & \theta^*_{m1} & \cdots & \theta^*_{m p_\theta} \end{pmatrix}$.
  Model the simulator response $\eta(x, \theta)$ as a Gaussian process:
  $y(x) = \eta(x, \theta) + \epsilon$, $\quad \eta(x, \theta) \sim GP(0, C_\eta(x, \theta))$, $\quad \epsilon \overset{iid}{\sim} N(0, 1/\lambda_\epsilon)$
  $C_\eta(x, \theta)$ depends on the $(p_x + p_\theta)$-vector $\rho_\eta$ and on $\lambda_\eta$.

  65. Vector form – restricting to the n field observations and m simulation runs
  $y = \eta(\theta) + \epsilon$, $\quad \eta \sim N_m(0_m, C_\eta(\rho_\eta, \lambda_\eta))$
  $\Rightarrow \begin{pmatrix} y \\ \eta \end{pmatrix} \sim N_{n+m}\!\left(\begin{pmatrix} 0_n \\ 0_m \end{pmatrix},\ C_{y\eta}\right)$, $\quad C_{y\eta} = C_\eta + \begin{pmatrix} 1/\lambda_\epsilon\, I_n & 0 \\ 0 & 1/\lambda_s\, I_m \end{pmatrix}$
  where $C_\eta = 1/\lambda_\eta\, R_\eta\!\left(\begin{pmatrix} x & \mathbf{1}\theta \\ x^* & \theta^* \end{pmatrix}; \rho_\eta\right)$ and the correlation matrix $R_\eta$ is given by
  $R_\eta((x, \theta), (x', \theta'); \rho_\eta) = \prod_{k=1}^{p_x}\rho_{\eta k}^{4(x_k - x'_k)^2} \times \prod_{k=1}^{p_\theta}\rho_{\eta(k + p_x)}^{4(\theta_k - \theta'_k)^2}$
  $\lambda_s$ is typically set to something large, like $10^6$, to stabilize matrix computations and to allow for numerical fluctuation in $\eta(x, \theta)$.
  Note: the covariance matrix $C_{y\eta}$ depends on $\theta$ through its "distance"-based correlation function $R_\eta((x, \theta), (x', \theta'); \rho_\eta)$. We use a 0 mean for $\eta(x, \theta)$; an alternative is to use a linear regression mean model.
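  A minimal sketch of assembling the joint covariance $C_{y\eta}$ for the stacked field and simulator data under this formulation, here for $p_x = p_\theta = 1$; all numeric values and the helper names are illustrative assumptions:

```python
import numpy as np

def R_eta(a, b, rho):
    # a, b: arrays whose rows are (x, theta); rho: length p_x + p_theta correlation parameters
    d2 = 4.0 * (a[:, None, :] - b[None, :, :]) ** 2           # 4(x_k - x'_k)^2 per input
    return np.prod(rho[None, None, :] ** d2, axis=2)          # separable product over inputs

def C_yeta(x, theta, x_star, theta_star, rho_eta, lam_eta, lam_eps, lam_s=1e6):
    n, m = len(x), len(x_star)
    top = np.column_stack([x, np.full(n, theta)])             # field rows: (x_i, theta)
    bot = np.column_stack([x_star, theta_star])               # simulator rows: (x*_j, theta*_j)
    all_rows = np.vstack([top, bot])
    C = R_eta(all_rows, all_rows, rho_eta) / lam_eta          # GP covariance on (x, theta)
    C[:n, :n] += np.eye(n) / lam_eps                          # observation error on field data
    C[n:, n:] += np.eye(m) / lam_s                            # small jitter on simulator output
    return C

# hypothetical usage
x = np.array([0.1, 0.5, 0.9]); theta = 0.4
x_star = np.linspace(0, 1, 8); theta_star = np.linspace(0, 1, 8)
C = C_yeta(x, theta, x_star, theta_star,
           rho_eta=np.array([0.3, 0.5]), lam_eta=1.0, lam_eps=25.0)
```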

  66. Likelihood
  $L(y, \eta \mid \lambda_\epsilon, \rho_\eta, \lambda_\eta, \lambda_s, \theta) \propto |C_{y\eta}|^{-1/2}\exp\left\{-\tfrac{1}{2}\begin{pmatrix} y \\ \eta \end{pmatrix}^T C_{y\eta}^{-1}\begin{pmatrix} y \\ \eta \end{pmatrix}\right\}$
  Priors:
  $\pi(\lambda_\epsilon) \propto \lambda_\epsilon^{a_\epsilon - 1} e^{-b_\epsilon\lambda_\epsilon}$ – perhaps well known from the observation process
  $\pi(\rho_\eta) \propto \prod_{k=1}^{p_x + p_\theta}(1 - \rho_{\eta k})^{-.5}$, where $\rho_{\eta k} = e^{-.5^2\beta_{\eta k}}$ is the correlation at distance $.5$, so $\rho_{\eta k} \sim \beta(1, .5)$
  $\pi(\lambda_\eta) \propto \lambda_\eta^{a_\eta - 1} e^{-b_\eta\lambda_\eta}$, $\quad \pi(\lambda_s) \propto \lambda_s^{a_s - 1} e^{-b_s\lambda_s}$, $\quad \pi(\theta) \propto I[\theta \in \mathcal{C}]$
  • One could fix $\rho_\eta, \lambda_\eta$ from a prior GASP run on the model output.
  • Many prefer to reparameterize $\rho$ as $\beta = -\log(\rho)/.5^2$ in the likelihood term.

  67. Posterior density
  $\pi(\lambda_\epsilon, \rho_\eta, \lambda_\eta, \lambda_s, \theta \mid y, \eta) \propto |C_{y\eta}|^{-1/2}\exp\left\{-\tfrac{1}{2}\begin{pmatrix} y \\ \eta \end{pmatrix}^T C_{y\eta}^{-1}\begin{pmatrix} y \\ \eta \end{pmatrix}\right\} \times \prod_{k=1}^{p_x + p_\theta}(1 - \rho_{\eta k})^{-.5} \times \lambda_\eta^{a_\eta - 1} e^{-b_\eta\lambda_\eta} \times \lambda_s^{a_s - 1} e^{-b_s\lambda_s} \times \lambda_\epsilon^{a_\epsilon - 1} e^{-b_\epsilon\lambda_\epsilon} \times I[\theta \in \mathcal{C}]$
  If $\rho_\eta$, $\lambda_\eta$, and $\lambda_s$ are fixed from a previous analysis of the simulator data, then
  $\pi(\lambda_\epsilon, \theta \mid y, \eta, \rho_\eta, \lambda_\eta, \lambda_s) \propto |C_{y\eta}|^{-1/2}\exp\left\{-\tfrac{1}{2}\begin{pmatrix} y \\ \eta \end{pmatrix}^T C_{y\eta}^{-1}\begin{pmatrix} y \\ \eta \end{pmatrix}\right\} \times \lambda_\epsilon^{a_\epsilon - 1} e^{-b_\epsilon\lambda_\epsilon} \times I[\theta \in \mathcal{C}]$

  68. Accounting for limited simulation runs
  [Figure: posterior realizations of $\eta(x, \theta^*)$, and the prior versus posterior uncertainty about the response and about $\theta^*$.]
  Again, standard Bayesian estimation gives:
  $\pi(\theta, \eta(\cdot, \cdot), \lambda_\epsilon, \rho_\eta, \lambda_\eta \mid y(x)) \propto L(y(x) \mid \eta(x, \theta), \lambda_\epsilon) \times \pi(\theta) \times \pi(\eta(\cdot, \cdot) \mid \lambda_\eta, \rho_\eta) \times \pi(\lambda_\epsilon) \times \pi(\rho_\eta) \times \pi(\lambda_\eta)$
  • Posterior means and quantiles shown.
  • Uncertainty in $\theta$, $\eta(\cdot, \cdot)$, and the nuisance parameters is incorporated into the forecast.
  • Gaussian process models for $\eta(\cdot, \cdot)$.

  69. Predicting a new outcome: $\zeta = \zeta(x') = \eta(x', \theta)$
  Given an MCMC realization $(\theta, \lambda_\epsilon, \rho_\eta, \lambda_\eta)$, a realization for $\zeta(x')$ can be produced using Bayes rule.
  Data: $v = \begin{pmatrix} y \\ \eta \\ \zeta \end{pmatrix}$, $\quad \Sigma_v^{-} = \begin{pmatrix} \lambda_\epsilon I_n & 0 & 0 \\ 0 & \lambda_s I_m & 0 \\ 0 & 0 & 0 \end{pmatrix}$; GP prior for $\eta(x, \theta)$: $\mu_z = 0$, $\quad C_\eta = \lambda_\eta^{-1} R_\eta\!\left(\begin{pmatrix} x & \mathbf{1}\theta \\ x^* & \theta^* \\ x' & \theta \end{pmatrix}; \rho_\eta\right)$
  Now the posterior distribution for $v = (y, \eta, \zeta)^T$ is $v \mid y, \eta \sim N(\mu_{v \mid y\eta} = V\Sigma_v^{-} v,\ V)$, where $V = (\Sigma_v^{-} + C_\eta^{-1})^{-1}$.
  Restricting to $\zeta$ we have $\zeta \mid y, \eta \sim N\!\left((\mu_{v \mid y\eta})_{n+m+1},\ V_{n+m+1,\, n+m+1}\right)$.
  Alternatively, one can apply the conditional normal formula to
  $\begin{pmatrix} y \\ \eta \\ \zeta \end{pmatrix} \sim N\!\left(\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},\ \begin{pmatrix} \lambda_\epsilon^{-1} I_n & 0 & 0 \\ 0 & \lambda_s^{-1} I_m & 0 \\ 0 & 0 & 0 \end{pmatrix} + C_\eta\right)$
  so that $\zeta \mid y, \eta \sim N\!\left(\Sigma_{21}\Sigma_{11}^{-1}\begin{pmatrix} y \\ \eta \end{pmatrix},\ \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)$.

  70. Accounting for model discrepancy
  • Borrows from Kennedy and O'Hagan (2001).
  $x$: model or system inputs; $\theta$: calibration parameters; $\zeta(x)$: true physical system response given inputs $x$; $\eta(x, \theta)$: simulator response at $x$ and $\theta$; $y(x)$: experimental observation of the physical system; $\delta(x)$: discrepancy between $\zeta(x)$ and $\eta(x, \theta)$, which may be decomposed into numerical error and bias; $e(x)$: observation error of the experimental data.
  $y(x) = \zeta(x) + e(x)$
  $y(x) = \eta(x, \theta) + \delta(x) + e(x)$
  $y(x) = \eta(x, \theta) + \delta_n(x) + \delta_b(x) + e(x)$
  [Figure: prior uncertainty about $y(x)$ and $\eta(x, \theta)$.]

  71. Accounting for model discrepancy
  Again, standard Bayesian estimation gives:
  $\pi(\theta, \eta, \delta \mid y(x)) \propto L(y(x) \mid \eta(x, \theta), \delta(x)) \times \pi(\theta) \times \pi(\eta) \times \pi(\delta)$
  • Posterior means and 90% CIs shown.
  • The posterior prediction for $\zeta(x)$ is obtained by computing the posterior distribution for $\eta(x, \theta) + \delta(x)$.
  • Uncertainty in $\theta$, $\eta(x, t)$, and $\delta(x)$ is incorporated into the forecast.
  • Gaussian process models for $\eta(x, t)$ and $\delta(x)$.
  [Figure: prior uncertainty, posterior model uncertainty, posterior model discrepancy $\delta(x)$, and the calibrated forecast for $\zeta(x)$.]

  72. Gaussian process models for combining field data and complex computer simulators
  Field data $y = (y(x_1), \ldots, y(x_n))^T$ at input settings $x_{i1}, \ldots, x_{i p_x}$; simulation data $\eta = (\eta(x^*_1, \theta^*_1), \ldots, \eta(x^*_m, \theta^*_m))^T$ at input settings $x^*$ and parameters $\theta^*$.
  Model the simulator response $\eta(x, \theta)$ as a Gaussian process:
  $y(x) = \eta(x, \theta) + \delta(x) + \epsilon$
  $\eta(x, \theta) \sim GP(0, C_\eta(x, \theta))$, $\quad \delta(x) \sim GP(0, C_\delta(x))$, $\quad \epsilon \overset{iid}{\sim} N(0, 1/\lambda_\epsilon)$
  $C_\eta(x, \theta)$ depends on the $(p_x + p_\theta)$-vector $\rho_\eta$ and on $\lambda_\eta$; $C_\delta(x)$ depends on the $p_x$-vector $\rho_\delta$ and on $\lambda_\delta$.

  73. Vector form – restricting to the n field observations and m simulation runs
  $y = \eta(\theta) + \delta + \epsilon$, $\quad \eta \sim N_m(0_m, C_\eta(\rho_\eta, \lambda_\eta))$
  $\begin{pmatrix} y \\ \eta \end{pmatrix} \sim N_{n+m}\!\left(\begin{pmatrix} 0_n \\ 0_m \end{pmatrix},\ C_{y\eta}\right)$, $\quad C_{y\eta} = C_\eta + \begin{pmatrix} C_\delta & 0 \\ 0 & 0 \end{pmatrix}$
  where
  $C_\eta = 1/\lambda_\eta\, R_\eta\!\left(\begin{pmatrix} x & \mathbf{1}\theta \\ x^* & \theta^* \end{pmatrix}; \rho_\eta\right) + 1/\lambda_s\, I_{m+n}$, $\quad C_\delta = 1/\lambda_\delta\, R_\delta(x; \rho_\delta) + 1/\lambda_\epsilon\, I_n$
  and the correlation matrices $R_\eta$ and $R_\delta$ are given by
  $R_\eta((x, \theta), (x', \theta'); \rho_\eta) = \prod_{k=1}^{p_x}\rho_{\eta k}^{4(x_k - x'_k)^2} \times \prod_{k=1}^{p_\theta}\rho_{\eta(k + p_x)}^{4(\theta_k - \theta'_k)^2}$, $\quad R_\delta(x, x'; \rho_\delta) = \prod_{k=1}^{p_x}\rho_{\delta k}^{4(x_k - x'_k)^2}$
  $\lambda_s$ is typically set to something large, like $10^6$, to stabilize matrix computations and to allow for numerical fluctuation in $\eta(x, \theta)$. We use a 0 mean for $\eta(x, \theta)$; an alternative is to use a linear regression mean model.

  74. Likelihood
  $L(y, \eta \mid \lambda_\epsilon, \rho_\eta, \lambda_\eta, \lambda_s, \rho_\delta, \lambda_\delta, \theta) \propto |C_{y\eta}|^{-1/2}\exp\left\{-\tfrac{1}{2}\begin{pmatrix} y \\ \eta \end{pmatrix}^T C_{y\eta}^{-1}\begin{pmatrix} y \\ \eta \end{pmatrix}\right\}$
  Priors:
  $\pi(\lambda_\epsilon) \propto \lambda_\epsilon^{a_\epsilon - 1} e^{-b_\epsilon\lambda_\epsilon}$ – perhaps well known from the observation process
  $\pi(\rho_\eta) \propto \prod_{k=1}^{p_x + p_\theta}(1 - \rho_{\eta k})^{-.5}$, where $\rho_{\eta k} = e^{-.5^2\beta_{\eta k}}$ is the correlation at distance $.5$, so $\rho_{\eta k} \sim \beta(1, .5)$
  $\pi(\lambda_\eta) \propto \lambda_\eta^{a_\eta - 1} e^{-b_\eta\lambda_\eta}$, $\quad \pi(\lambda_s) \propto \lambda_s^{a_s - 1} e^{-b_s\lambda_s}$
  $\pi(\rho_\delta) \propto \prod_{k=1}^{p_x}(1 - \rho_{\delta k})^{-.5}$, where $\rho_{\delta k} = e^{-.5^2\beta_{\delta k}}$
  $\pi(\lambda_\delta) \propto \lambda_\delta^{a_\delta - 1} e^{-b_\delta\lambda_\delta}$, $\quad \pi(\theta) \propto I[\theta \in \mathcal{C}]$
  • One could fix $\rho_\eta, \lambda_\eta$ from a prior GASP run on the model output.
  • Again, many choose to reparameterize the correlation parameters as $\beta = -\log(\rho)/.5^2$ in the likelihood term.

  75. Posterior density
  $\pi(\lambda_\epsilon, \rho_\eta, \lambda_\eta, \lambda_s, \rho_\delta, \lambda_\delta, \theta \mid y, \eta) \propto |C_{y\eta}|^{-1/2}\exp\left\{-\tfrac{1}{2}\begin{pmatrix} y \\ \eta \end{pmatrix}^T C_{y\eta}^{-1}\begin{pmatrix} y \\ \eta \end{pmatrix}\right\} \times \prod_{k=1}^{p_x + p_\theta}(1 - \rho_{\eta k})^{-.5} \times \lambda_\eta^{a_\eta - 1} e^{-b_\eta\lambda_\eta} \times \lambda_s^{a_s - 1} e^{-b_s\lambda_s} \times \prod_{k=1}^{p_x}(1 - \rho_{\delta k})^{-.5} \times \lambda_\delta^{a_\delta - 1} e^{-b_\delta\lambda_\delta} \times \lambda_\epsilon^{a_\epsilon - 1} e^{-b_\epsilon\lambda_\epsilon} \times I[\theta \in \mathcal{C}]$
  If $\rho_\eta$, $\lambda_\eta$, and $\lambda_s$ are fixed from a previous analysis of the simulator data, then
  $\pi(\lambda_\epsilon, \rho_\delta, \lambda_\delta, \theta \mid y, \eta, \rho_\eta, \lambda_\eta, \lambda_s) \propto |C_{y\eta}|^{-1/2}\exp\left\{-\tfrac{1}{2}\begin{pmatrix} y \\ \eta \end{pmatrix}^T C_{y\eta}^{-1}\begin{pmatrix} y \\ \eta \end{pmatrix}\right\} \times \prod_{k=1}^{p_x}(1 - \rho_{\delta k})^{-.5} \times \lambda_\delta^{a_\delta - 1} e^{-b_\delta\lambda_\delta} \times \lambda_\epsilon^{a_\epsilon - 1} e^{-b_\epsilon\lambda_\epsilon} \times I[\theta \in \mathcal{C}]$

  76. Predicting a new outcome: $\zeta = \zeta(x') = \eta(x', \theta) + \delta(x')$
  $y = \eta(x, \theta) + \delta(x) + \epsilon(x)$, $\quad \eta = \eta(x^*, \theta^*) + \epsilon_s$ with $\epsilon_s$ small or 0, $\quad \zeta = \eta(x', \theta) + \delta(x')$ with $x'$ univariate or multivariate
  $\Rightarrow \begin{pmatrix} y \\ \eta \\ \zeta \end{pmatrix} \sim N_{n+m+1}\!\left(\begin{pmatrix} 0_n \\ 0_m \\ 0 \end{pmatrix},\ \begin{pmatrix} \lambda_\epsilon^{-1} I_n & 0 & 0 \\ 0 & \lambda_s^{-1} I_m & 0 \\ 0 & 0 & 0 \end{pmatrix} + C_\eta + C_\delta\right) \quad (1)$
  where $C_\eta = 1/\lambda_\eta\, R_\eta\!\left(\begin{pmatrix} x & \mathbf{1}\theta \\ x^* & \theta^* \\ x' & \theta \end{pmatrix}; \rho_\eta\right)$ and $C_\delta = 1/\lambda_\delta\, R_\delta\!\left(\begin{pmatrix} x \\ x' \end{pmatrix}; \rho_\delta\right)$ on indices $1, \ldots, n, n+m+1$, with zeros elsewhere.
  Given an MCMC realization $(\theta, \lambda_\epsilon, \rho_\eta, \lambda_\eta, \rho_\delta, \lambda_\delta)$, a realization for $\zeta(x')$ can be produced using (1) and the conditional normal formula:
  $\zeta \mid y, \eta \sim N\!\left(\Sigma_{21}\Sigma_{11}^{-1}\begin{pmatrix} y \\ \eta \end{pmatrix},\ \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right)$

  77. Accounting for model discrepancy
  Again, standard Bayesian estimation gives:
  $\pi(\theta, \eta, \delta \mid y(x)) \propto L(y(x) \mid \eta(x, \theta), \delta(x)) \times \pi(\theta) \times \pi(\eta) \times \pi(\delta)$
  • Posterior means and 90% CIs shown.
  • The posterior prediction for $\zeta(x)$ is obtained by computing the posterior distribution for $\eta(x, \theta) + \delta(x)$.
  • Uncertainty in $\theta$, $\eta(x, t)$, and $\delta(x)$ is incorporated into the forecast.
  • Gaussian process models for $\eta(x, t)$ and $\delta(x)$.
  [Figure: prior uncertainty, posterior model uncertainty, posterior model discrepancy $\delta(x)$, and the calibrated forecast for $\zeta(x)$.]

  78. References
  • T. Santner, B. J. Williams and W. I. Notz (2003) The Design and Analysis of Computer Experiments, Springer.
  • M. Kennedy and A. O'Hagan (2001) Bayesian calibration of computer models (with discussion), Journal of the Royal Statistical Society B, 63, 425-464.
  • J. Sacks, W. J. Welch, T. J. Mitchell and H. P. Wynn (1989) Design and analysis of computer experiments (with discussion), Statistical Science, 4, 409-423.
  • D. Higdon, M. Kennedy, J. Cavendish, J. Cafeo and R. D. Ryne (2004) Combining field observations and simulations for calibration and prediction, SIAM Journal of Scientific Computing, 26, 448-466.

  79. COMPUTER MODEL CALIBRATION 2 DEALING WITH MULTIVARIATE OUTPUT

  80. Carry out simulated implosions using Neddermeyer's model
  A sequence of runs is carried out at $m$ input settings $(x^*, \theta^*_1, \theta^*_2) = (m_e/m, s, u_0)$, varying
  $\begin{pmatrix} x^*_1 & \theta^*_{11} & \theta^*_{12} \\ \vdots & \vdots & \vdots \\ x^*_m & \theta^*_{m1} & \theta^*_{m2} \end{pmatrix}$
  over predefined ranges using an OA(32, 4^3)-based LH design.
  [Figure: simulated inner radius by time, and by time and angle $\phi$.]
  Each simulation produces an $n_\eta = 22 \cdot 26$ vector of radii, for 22 times $\times$ 26 angles.
