
Comparing and generating Latin Hypercube designs in Kriging models - PowerPoint PPT Presentation



  1. ENBIS-EMSE 2009 Conference 1/2/3 July, Saint-Etienne Comparing and generating Latin Hypercube designs in Kriging models Giovanni Pistone, Grazia Vicario Politecnico di Torino Department of Mathematics Corso Duca degli Abruzzi, 24 – 10129 Torino, Italy

  2. Outline
     - Background: Kriging models and Latin Hypercube Designs
     - Introduction
     - Ordinary kriging on a lattice: the correlation function
     - Ordinary kriging on a lattice: the output prediction
     - Classes of Latin Hypercube designs on lattices
     - Results and conclusions

  3. Background
     Official starts:
     - 1951: Kriging modelling (D.G. Krige)
     - 1979: Computer experiments (CEs) and Latin Hypercube designs (LHDs) (McKay et al.)
     - 1989: Model-based methods (Sacks J. et al.)
     - 1991: Bayesian prediction (Currin et al.)
     Standard modern book references:
     - M.J. Sasena. Flexibility and Efficiency Enhancements for Constrained Global Design Optimization with Kriging Approximations. PhD Thesis, University of Michigan, 2002.
     - T.J. Santner, B.J. Williams, and W.I. Notz. The Design and Analysis of Computer Experiments. Springer Series in Statistics. Springer-Verlag, New York, 2003.
     - K-T. Fang, R. Li, and A. Sudjianto. Design and Modeling for Computer Experiments. Computer Science and Data Analysis Series. Chapman & Hall/CRC, Boca Raton, FL, 2006.

  4. Introduction
     DoE: a protocol for designing physical experiments in order to achieve valid, correct and unprejudiced inferences.
     And for Computer Experiments?
     Ideal design strategy: to spread the points uniformly across the experimental region (space-filling designs):
     - Designs based on sampling methods
     - Designs based on measures of distance
     - Designs based on the uniform distribution

  5. Introduction
     [Figure: three scatter plots on axes x1 and x2, titled "Random sampling", "Stratified sampling" and "Latin-hypercube sampling"]
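The three sampling schemes named in the figure can be sketched in a few lines. This is only an illustration on the unit square, not the authors' code; the function names and the grid sizes are assumptions introduced here.

```python
import random

def random_sampling(n, rng):
    """n points drawn independently and uniformly in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]

def stratified_sampling(k, rng):
    """One uniform point in each cell of a k x k grid (k*k points in total)."""
    return [((i + rng.random()) / k, (j + rng.random()) / k)
            for i in range(k) for j in range(k)]

def latin_hypercube_sampling(n, rng):
    """n points such that each of the n equal-width strata of x1, and each
    of the n strata of x2, contains exactly one point."""
    perm = list(range(n))
    rng.shuffle(perm)  # random pairing of x1 strata with x2 strata
    return [((i + rng.random()) / n, (perm[i] + rng.random()) / n)
            for i in range(n)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
lhs_points = latin_hypercube_sampling(5, rng)
strat_points = stratified_sampling(3, rng)
rand_points = random_sampling(7, rng)
```

The Latin-hypercube variant needs only n points to cover every stratum of each single variable, whereas full stratification of the square needs k*k points.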

  6. Introduction
     A problem of interest: the design of the experiment, i.e. the choice of a training set that performs well with respect to a statistical index (e.g. the Mean Squared Prediction Error, MSPE):
     - How to build a predictor
     - How to evaluate the efficiency of the prediction
     - How to choose the points of the design

  7. Ordinary kriging on a lattice: the correlation function
     The underlying model is a parametric model of Gaussian type:
     $$Y(x) = f'(x)\,\beta + Z(x)$$
     - $f'(x)$: known regression function
     - $\beta$: unknown regression coefficients
     - $Z(x)$: Gaussian random field with zero mean and stationary covariance over a design space $\mathcal{X}_d \subset \mathbb{R}^d$, i.e.
     $$\operatorname{cov}[Z(x_i), Z(x_j)] = \sigma_Z^2\, R(x_i - x_j)$$
     where $\sigma_Z^2$ is the field variance and $R$ is the Stationary Correlation Function (SCF), depending only on the displacement vector $h$:
     $$R(x_i - x_j) = R(h), \qquad R(h) = R(-h), \qquad \sigma_Z^2 R(0) \equiv \operatorname{var}[Z(x)] = \sigma_Z^2 > 0$$
     If the space of locations is a lattice, the model is an algebraic statistical model.

  8. Ordinary kriging on a lattice: the correlation function
     Choice of the correlation function: Exponential Correlation Function
     $$R(h; \theta) = \prod_{s=1}^{d} \exp\{-\theta_s |h_s|^p\} = \exp\Big\{-\sum_{s=1}^{d} \theta_s |h_s|^p\Big\}$$
     - $\theta_s$, $s = 1, 2, \dots, d$, are positive scale parameters
     - $p$ is between 0 and 2
     Assumptions in this paper:
     - $\theta_s = \theta$, $\forall s = 1, 2, \dots, d$: the correlation depends only on the distance $\|h\|$ between any pair of points $x$ and $x + h$
     - $p = 1$
     - $\sigma_Y^2 = 1$
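As a small numerical illustration of the exponential correlation function, the sketch below evaluates the general form and the special case assumed in the paper (common θ, p = 1), where the correlation reduces to t raised to the Manhattan norm of h with t = exp(−θ). The function names are hypothetical and chosen only for this example.

```python
import math

def exp_correlation(h, theta, p=1.0):
    """General form: R(h; theta) = exp(-sum_s theta_s * |h_s|**p)."""
    return math.exp(-sum(th * abs(hs) ** p for th, hs in zip(theta, h)))

def exp_correlation_common_theta(h, theta):
    """Special case theta_s = theta and p = 1: R(h) = t**||h||_1, t = exp(-theta)."""
    t = math.exp(-theta)
    return t ** sum(abs(hs) for hs in h)

h = (1, -2)
general = exp_correlation(h, (0.5, 0.5))        # same theta on both axes
special = exp_correlation_common_theta(h, 0.5)  # t ** 3 with t = exp(-0.5)
```

Both calls describe the same quantity; the second form makes explicit that only the Manhattan length of the displacement matters under the stated assumptions.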

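  9. Ordinary kriging on a lattice: the correlation function
     Assumptions:
     - the Gaussian field is defined on a regular rectangular lattice $\mathcal{X}_d = \{1, \dots, l\}^d$
     - Manhattan distance: $\|x - y\|_1 = \sum_{s=1}^{d} |x_s - y_s|$
     [Figure: lattice points $(i_1, i_2)$, $(j_1, i_2)$ and $(j_1, j_2)$, illustrating the Manhattan distance between $(i_1, i_2)$ and $(j_1, j_2)$]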

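  10. Ordinary kriging on a lattice: the correlation function
      One single factor (levels $1, \dots, l$): distance matrix $D_1$ and correlation matrix $\Gamma_1$, with $t = \exp(-\theta)$ and powers taken entrywise:
      $$D_1 = \begin{pmatrix} 0 & 1 & 2 & \cdots & l-1 \\ 1 & 0 & 1 & \cdots & l-2 \\ 2 & 1 & 0 & \cdots & l-3 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ l-1 & l-2 & l-3 & \cdots & 0 \end{pmatrix}, \qquad \Gamma_1 = t^{D_1} = \exp(-\theta D_1) = \begin{pmatrix} 1 & t & t^2 & \cdots & t^{l-1} \\ t & 1 & t & \cdots & t^{l-2} \\ t^2 & t & 1 & \cdots & t^{l-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ t^{l-1} & t^{l-2} & t^{l-3} & \cdots & 1 \end{pmatrix}$$
      $d$ factors: the distance matrix is built blockwise from $D_{d-1}$, and the correlation matrix is a Kronecker product:
      $$D_d = \begin{pmatrix} D_{d-1} & D_{d-1}+1 & D_{d-1}+2 & \cdots & D_{d-1}+l-1 \\ D_{d-1}+1 & D_{d-1} & D_{d-1}+1 & \cdots & D_{d-1}+l-2 \\ D_{d-1}+2 & D_{d-1}+1 & D_{d-1} & \cdots & D_{d-1}+l-3 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ D_{d-1}+l-1 & D_{d-1}+l-2 & D_{d-1}+l-3 & \cdots & D_{d-1} \end{pmatrix}, \qquad \Gamma_d = \Gamma_{d-1} \otimes \Gamma_1$$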
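The Kronecker construction can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' implementation: it uses exact rational arithmetic with an arbitrary value t = 1/2, orders the lattice points lexicographically, and the helper names `gamma_1`, `kron` and `gamma_d` are introduced here.

```python
from fractions import Fraction
from itertools import product

def gamma_1(l, t):
    """Single-factor correlation matrix with entries t**|i - j|."""
    return [[t ** abs(i - j) for j in range(l)] for i in range(l)]

def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def gamma_d(l, d, t):
    """Gamma_d = Gamma_{d-1} (x) Gamma_1, built by repeated Kronecker products."""
    g1 = gamma_1(l, t)
    g = g1
    for _ in range(d - 1):
        g = kron(g, g1)
    return g

t = Fraction(1, 2)                             # assumed value of t = exp(-theta)
l, d = 3, 2
points = list(product(range(1, l + 1), repeat=d))  # lexicographic lattice order
g = gamma_d(l, d, t)

# Each entry should equal t raised to the Manhattan distance of the two points.
matches = all(
    g[a][b] == t ** sum(abs(u - v) for u, v in zip(points[a], points[b]))
    for a in range(len(points)) for b in range(len(points))
)
```

With lexicographic ordering of the points, every entry of the Kronecker product equals t to the power of the Manhattan distance, which is why the full covariance matrix never has to be assembled entry by entry.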

  11. Ordinary kriging on a lattice: the output prediction
      Kriging is a linear method of spatial interpolation: the random variable $Y(x_0)$ is predicted with a linear (affine) combination of the observed random variables $Y(x_1), \dots, Y(x_n)$ in the training set $x_1, \dots, x_n$:
      $$\hat{Y}(x_0) = a_0 + \sum_{i=1}^{n} a_i Y(x_i)$$
      The weights in the linear combination are evaluated according to a statistical model on the joint distribution of $Y_0, Y_1, \dots, Y_n$:
      $$(Y_0, Y_1, \dots, Y_n)' \sim N_{1+n}\left[ \begin{pmatrix} f_0' \\ F \end{pmatrix} \beta,\ \sigma_Z^2\, \Sigma \right], \qquad \Sigma = \begin{pmatrix} 1 & r_0' \\ r_0 & R \end{pmatrix}$$
      Ordinary Kriging model: $Y(x) = \beta + Z(x)$
      The kriging model can be considered an empirical Bayesian approach to computer experiments.

  12. Ordinary kriging on a lattice: the output prediction
      Assume $\beta$ and $\Gamma$ unknown.
      A Linear Predictor (LP) $\hat{Y}(x_0) = a_0 + \sum_{i=1}^{n} a_i Y(x_i)$ is unbiased iff
      $$E[\hat{Y}(x_0)] = a_0 + \beta \sum_{i=1}^{n} a_i \equiv \beta, \quad \text{i.e.} \quad a_0 = 0 \ \text{and} \ \sum_{i=1}^{n} a_i = 1,$$
      and it is the Best (BLUP) if it minimizes the Mean Squared Prediction Error (MSPE):
      $$\operatorname{MSPE}[\hat{Y}_0] = 1 - r_0' R^{-1} r_0 + c_0' \left( u_n' R^{-1} u_n \right)^{-1} c_0, \qquad c_0 = 1 - u_n' R^{-1} r_0$$
      The unknown value of the correlation is estimated from the set of training points and plugged into the formula of the estimator.
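The MSPE formula can be evaluated exactly for a given design when t is a rational number. The following sketch assumes the exponential correlation with common θ and p = 1, unit variance, an arbitrary t = 1/2, and takes u_n to be the vector of ones; the helper names `solve`, `corr` and `mspe` are hypothetical.

```python
from fractions import Fraction

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination (exact for Fraction entries)."""
    n = len(a)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        pivot_value = m[col][col]
        m[col] = [v / pivot_value for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]

def corr(x, y, t):
    """Correlation t**(Manhattan distance) between lattice points x and y."""
    return t ** sum(abs(a - b) for a, b in zip(x, y))

def mspe(x0, design, t):
    """MSPE = 1 - r0' R^-1 r0 + c0^2 / (u' R^-1 u), with c0 = 1 - u' R^-1 r0."""
    n = len(design)
    R = [[corr(xi, xj, t) for xj in design] for xi in design]
    r0 = [corr(x0, xi, t) for xi in design]
    rinv_r0 = solve(R, r0)                    # R^-1 r0
    rinv_u = solve(R, [Fraction(1)] * n)      # R^-1 u
    c0 = 1 - sum(rinv_r0)                     # u' R^-1 r0 is the sum of entries
    quad = sum(a * b for a, b in zip(r0, rinv_r0))
    return 1 - quad + c0 * c0 / sum(rinv_u)

t = Fraction(1, 2)                            # assumed value of t = exp(-theta)
design = [(1, 3), (2, 4), (3, 1), (4, 2)]     # an LH design with l = 4, d = 2
at_training_point = mspe((1, 3), design, t)
at_new_point = mspe((1, 1), design, t)
```

At a training point the predictor interpolates, so the MSPE vanishes there, while at an untried lattice point it is strictly positive.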

  13. Classes of Latin Hypercube designs on lattices
      Step 1. Permutations of the $l$ integers (number of levels) and construction of the $l \times (l!)^{d-1}$ matrix containing all the LH designs with $d$ factors.
      Example: the 24 possible LH designs for $d = 2$ factors, each with $l = 4$ levels. Each column is one design; an entry "ij" denotes the training point $(i, j)$.

      LH design         1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
      Training points  11 11 11 14 14 11 11 11 13 13 13 14 14 13 13 13 12 12 12 14 14 12 12 12
                       22 22 24 21 21 24 23 23 21 21 24 23 23 24 22 22 23 23 24 22 22 24 21 21
                       33 34 32 32 33 33 34 32 32 34 31 31 32 32 34 31 31 34 33 33 31 31 34 33
                       44 43 43 43 42 42 42 44 44 42 42 42 41 41 41 44 44 41 41 41 43 43 43 44
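Step 1 can be reproduced by brute-force enumeration. The sketch below is one possible way to do it, with a hypothetical function name; the order in which it lists the designs is that of `itertools.permutations` and is not claimed to match the numbering 1 to 24 used on the slide.

```python
from itertools import permutations, product

def lh_designs(l, d):
    """All LH designs with d factors and l levels: the first coordinate runs
    through 1..l and each remaining coordinate is a permutation of 1..l."""
    perms = list(permutations(range(1, l + 1)))
    designs = []
    for cols in product(perms, repeat=d - 1):
        designs.append([(i,) + tuple(c[i - 1] for c in cols)
                        for i in range(1, l + 1)])
    return designs

designs = lh_designs(4, 2)
count = len(designs)  # (l!)**(d - 1) designs in total
```

For d = 2 and l = 4 the enumeration yields 4! = 24 designs, in agreement with the table, and the count grows as (l!)^(d−1) for more factors.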

  14. Classes of Latin Hypercube designs on lattices
      Step 2. Construction of the distance matrix between any pair of points in the lattice.
      Step 3. Implementation of the Kronecker product between any pair of matrices, so that the covariance matrix between any pair of points of the lattice can be computed.
      Example: covariance sub-matrix of the 11th LH design (lattice points (1,3), (2,4), (3,1) and (4,2)), with $t = \exp(-\theta)$:
      $$\begin{pmatrix} 1 & t^2 & t^4 & t^4 \\ t^2 & 1 & t^4 & t^4 \\ t^4 & t^4 & 1 & t^2 \\ t^4 & t^4 & t^2 & 1 \end{pmatrix}$$
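The exponents in the example sub-matrix are simply the Manhattan distances between the four training points, which the short sketch below recomputes. The variable and function names are introduced only for this illustration.

```python
def manhattan(x, y):
    """Manhattan distance between two lattice points."""
    return sum(abs(a - b) for a, b in zip(x, y))

design_11 = [(1, 3), (2, 4), (3, 1), (4, 2)]  # points of the 11th LH design

# Entry (i, j) of the covariance sub-matrix is t**exponents[i][j].
exponents = [[manhattan(x, y) for y in design_11] for x in design_11]

def render(e):
    """Text form of t**e, writing 1 on the diagonal."""
    return "1" if e == 0 else f"t^{e}"

rows = [" ".join(f"{render(e):>3}" for e in row) for row in exponents]
for line in rows:
    print(line)
```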

  15. Classes of Latin Hypercube designs on lattices
      Step 4. Computation of the statistical index chosen for the comparison: Total Mean Squared Prediction Error (TMSPE), Entropy, the Minimax Distance and Maxmin Distance, ...
      Step 5. Clustering of the LHs according to the value of the index computed in the previous step. Both the TMSPE and the determinant of the covariance matrix are rational functions of the parameter t. The rational functions are computed exactly with symbolic software. Designs with the same function are in the same cluster.
      For computing the predictor variances in closed form: CoCoA (Computations in Commutative Algebra), see http://cocoa.dima.unige.it
      Other computations related to the exponential model for covariances: software R, see http://www.R-project.org/
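The clustering idea in Step 5 can be imitated without a computer algebra system when the index is the determinant of the covariance sub-matrix. The sketch below is an assumption-laden stand-in for the CoCoA computation, not the authors' procedure: it uses only the determinant (not the TMSPE), expands it with the Leibniz formula as an exact polynomial in t, and groups the 24 designs for l = 4, d = 2 by that polynomial. All names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def sign(perm):
    """Sign of a permutation, from the parity of its inversion count."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det_polynomial(design):
    """Determinant of the matrix (t**distance) as a polynomial in t, returned
    as a sorted tuple of (exponent, coefficient) pairs. Each Leibniz term is
    a signed monomial t**(sum of the selected distances)."""
    n = len(design)
    poly = Counter()
    for perm in permutations(range(n)):
        exponent = sum(manhattan(design[i], design[perm[i]]) for i in range(n))
        poly[exponent] += sign(perm)
    return tuple(sorted((e, c) for e, c in poly.items() if c != 0))

# Group all LH designs with l = 4 levels and d = 2 factors by determinant.
clusters = defaultdict(list)
for cols in permutations(range(1, 5)):
    design = tuple((i + 1, cols[i]) for i in range(4))
    clusters[det_polynomial(design)].append(design)

design_11 = ((1, 3), (2, 4), (3, 1), (4, 2))
poly_11 = det_polynomial(design_11)  # 1 - 2t^4 - 3t^8 + 8t^10 - 4t^12
```

Designs that share a key in `clusters` have identical determinants for every value of t. The grouping obtained this way need not coincide with the classes based on the TMSPE.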

  16. Classes of Latin Hypercube designs on lattices
      [Figure: the LH designs belonging to Class 1, Class 2, Class 3 and Class 4]

  17. Classes of Latin Hypercube designs on lattices
      [Figure: the LH designs belonging to Class 5, Class 6 and Class 7]

  18. Results and conclusions
      Comments
      - Class 6 is the best one (it consists of U-designs according to B. Tang (1993)). These designs are also tilted $2^2$.
      - Classes 3, 4, 7 are essentially equivalent and worse than class 6
      - Classes 4 and 5 are essentially equivalent to the cyclic designs (Bates et al. (1996)), highly recommended for Fourier regression models
      - Class 2 is second worst
      - Classes 1 and 4 consist of regular fractions $4^{2-1}$
      - Class 3 contains regular fractions $2^{4-2}$ (pseudofactors)
      - An LH design is an orthogonal array with strength 1 and vice versa

  19. Results and conclusions
      l = 3 levels, d = 2 variables
      - Each cluster has a different performance with respect to the TMSPE criterion
      - The worst cases are the two diagonal LHDs (dashed line)
