Bayesian inference & process convolution models
Dave Higdon, Statistical Sciences Group, LANL
MOVING AVERAGE SPATIAL MODELS
Kernel basis representation for spatial processes z(s)

Define m basis functions k_1(s), ..., k_m(s).

[figure: the m basis functions plotted over s from −2 to 12]

Here k_j(s) is a normal density centered at spatial location ω_j:

  k_j(s) = (1/√(2π)) exp{ −½ (s − ω_j)² }

Set z(s) = Σ_{j=1}^m k_j(s) x_j, where x ∼ N(0, I_m).

Can represent z = (z(s_1), ..., z(s_n))^T as z = Kx, where K_ij = k_j(s_i).
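A minimal numpy sketch of this construction, for concreteness; the knot count, grid, and seed below are illustrative choices, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

m = 15                                  # number of knots / basis functions
omega = np.linspace(-2, 12, m)          # knot locations omega_j
s = np.linspace(0, 10, 200)             # locations at which to evaluate z(s)

# K[i, j] = k_j(s_i): standard normal density centered at omega_j
K = np.exp(-0.5 * (s[:, None] - omega[None, :]) ** 2) / np.sqrt(2 * np.pi)

x = rng.standard_normal(m)              # x ~ N(0, I_m)
z = K @ x                               # z = Kx: one realization of z(s)
```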
x and k(s) determine the spatial process z(s)

[figure: scaled kernels k_j(s)x_j (left) and the resulting process z(s) (right)]

Continuous representation: z(s) = Σ_{j=1}^m k_j(s) x_j, where x ∼ N(0, I_m).

Discrete representation: for z = (z(s_1), ..., z(s_n))^T, z = Kx, where K_ij = k_j(s_i).
Estimate z(s) by specifying k_j(s) and estimating x

[figure: fitted scaled basis functions k_j(s)x_j (left) and the fitted curve with data y(s) (right)]

Discrete representation: for z = (z(s_1), ..., z(s_n))^T, z = Kx, where K_ij = k_j(s_i)

⇒ standard regression model: y = Kx + ε
Formulation for the 1-d example

Data y = (y(s_1), ..., y(s_n))^T observed at locations s_1, ..., s_n. Once the knot locations ω_j, j = 1, ..., m, and the kernel choice k(s) are specified, the remaining model formulation is trivial:

Likelihood:
  L(y | x, λ_y) ∝ λ_y^{n/2} exp{ −½ λ_y (y − Kx)^T (y − Kx) },  where K_ij = k(ω_j − s_i).

Priors:
  π(x | λ_x) ∝ λ_x^{m/2} exp{ −½ λ_x x^T x }
  π(λ_x) ∝ λ_x^{a_x − 1} exp{ −b_x λ_x }
  π(λ_y) ∝ λ_y^{a_y − 1} exp{ −b_y λ_y }

Posterior:
  π(x, λ_x, λ_y | y) ∝ λ_y^{a_y + n/2 − 1} exp{ −λ_y [b_y + .5 (y − Kx)^T (y − Kx)] }
                      × λ_x^{a_x + m/2 − 1} exp{ −λ_x [b_x + .5 x^T x] }
Posterior and full conditionals

Posterior:
  π(x, λ_x, λ_y | y) ∝ λ_y^{a_y + n/2 − 1} exp{ −λ_y [b_y + .5 (y − Kx)^T (y − Kx)] }
                      × λ_x^{a_x + m/2 − 1} exp{ −λ_x [b_x + .5 x^T x] }

Full conditionals:
  π(x | ···) ∝ exp{ −½ [λ_y x^T K^T K x − 2 λ_y x^T K^T y + λ_x x^T x] }
  π(λ_x | ···) ∝ λ_x^{a_x + m/2 − 1} exp{ −λ_x [b_x + .5 x^T x] }
  π(λ_y | ···) ∝ λ_y^{a_y + n/2 − 1} exp{ −λ_y [b_y + .5 (y − Kx)^T (y − Kx)] }

Gibbs sampler implementation:
  x | ··· ∼ N( (λ_y K^T K + λ_x I_m)^{−1} λ_y K^T y, (λ_y K^T K + λ_x I_m)^{−1} )
  λ_x | ··· ∼ Γ( a_x + m/2, b_x + .5 x^T x )
  λ_y | ··· ∼ Γ( a_y + n/2, b_y + .5 (y − Kx)^T (y − Kx) )
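A minimal numpy sketch of this Gibbs sampler; the function name `gibbs`, the starting values, and the iteration count are implementation choices, and note that numpy's `gamma` takes a scale, so the rate parameters above are inverted:

```python
import numpy as np

def gibbs(y, K, a_x, b_x, a_y, b_y, n_iter=1000, seed=0):
    """Gibbs sampler for (x, lam_x, lam_y) under the full conditionals above."""
    rng = np.random.default_rng(seed)
    n, m = K.shape
    KtK, Kty = K.T @ K, K.T @ y
    x, lam_x, lam_y = np.zeros(m), 1.0, 1.0   # arbitrary starting values
    draws = {"x": [], "lam_x": [], "lam_y": []}
    for _ in range(n_iter):
        # x | ... ~ N(prec^{-1} lam_y K^T y, prec^{-1}), prec = lam_y K^T K + lam_x I
        prec = lam_y * KtK + lam_x * np.eye(m)
        L = np.linalg.cholesky(prec)
        x = np.linalg.solve(prec, lam_y * Kty) \
            + np.linalg.solve(L.T, rng.standard_normal(m))
        # lam_x | ... ~ Gamma(a_x + m/2, rate = b_x + .5 x^T x)
        lam_x = rng.gamma(a_x + m / 2, 1.0 / (b_x + 0.5 * x @ x))
        # lam_y | ... ~ Gamma(a_y + n/2, rate = b_y + .5 (y - Kx)^T (y - Kx))
        r = y - K @ x
        lam_y = rng.gamma(a_y + n / 2, 1.0 / (b_y + 0.5 * r @ r))
        draws["x"].append(x)
        draws["lam_x"].append(lam_x)
        draws["lam_y"].append(lam_y)
    return {k: np.asarray(v) for k, v in draws.items()}
```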
Gibbs sampler: intuition

Gibbs sampler for a bivariate normal density:

  π(z) = π(z_1, z_2) ∝ exp{ −½ (z_1, z_2) [ 1 ρ ; ρ 1 ]^{−1} (z_1, z_2)^T }

Full conditionals of π(z):
  z_1 | z_2 ∼ N(ρ z_2, 1 − ρ²)
  z_2 | z_1 ∼ N(ρ z_1, 1 − ρ²)

• initialize the chain with a draw z⁰ = (z⁰_1, z⁰_2)^T ∼ N(0, [ 1 ρ ; ρ 1 ])
• draw z¹_1 ∼ N(ρ z⁰_2, 1 − ρ²); now (z¹_1, z⁰_2)^T ∼ π(z)
• draw z¹_2 ∼ N(ρ z¹_1, 1 − ρ²); now z¹ = (z¹_1, z¹_2)^T ∼ π(z)

[figure: the moves from z⁰ to (z¹_1, z⁰_2) to z¹ traced in the (z_1, z_2) plane]
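A short numpy sketch of this two-step update; ρ, the deliberately bad starting point, and the number of draws are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
rho, T = 0.9, 5000                      # correlation and number of draws
sd = np.sqrt(1 - rho**2)                # conditional standard deviation

z = np.zeros((T, 2))
z1, z2 = -3.0, 3.0                      # arbitrary (bad) starting point z^0
for k in range(T):
    z1 = rng.normal(rho * z2, sd)       # z1 | z2 ~ N(rho z2, 1 - rho^2)
    z2 = rng.normal(rho * z1, sd)       # z2 | z1 ~ N(rho z1, 1 - rho^2)
    z[k] = z1, z2
```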
Gibbs sampler: intuition

The Gibbs sampler gives z⁰, z¹, ..., z^T, which can be treated as dependent draws from π(z). If z⁰ is not a draw from π(z), then the initial realizations will not have the correct distribution. In practice, the first 100? 1000? realizations are discarded.

The draws can be used to make inference about π(z):

• Posterior mean of z is estimated by
    (μ̂_1, μ̂_2)^T,  with μ̂_i = (1/T) Σ_{k=1}^T z^k_i

• Posterior probabilities:
    P(z_1 > 1) ≈ (1/T) Σ_{k=1}^T I[z^k_1 > 1]
    P(z_1 > z_2) ≈ (1/T) Σ_{k=1}^T I[z^k_1 > z^k_2]

• 90% interval: (z_1^{[5%]}, z_1^{[95%]}), the empirical 5th and 95th percentiles of the draws z^k_1.

[figure: the draws scattered in the (z_1, z_2) plane]
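These summaries can be computed directly from the draws, here assuming the `z` array produced by the sketch above; the 500-draw burn-in is an arbitrary illustrative choice:

```python
import numpy as np

burn = 500                                 # discard early draws (burn-in)
zb = z[burn:]

mu_hat = zb.mean(axis=0)                   # estimate of the posterior mean
p1 = np.mean(zb[:, 0] > 1)                 # P(z1 > 1)
p2 = np.mean(zb[:, 0] > zb[:, 1])          # P(z1 > z2)
lo, hi = np.percentile(zb[:, 0], [5, 95])  # 90% interval for z1
```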
1-d example

m = 6 knots evenly spaced between −.3 and 1.2. n = 5 data points at s = .05, .25, .52, .65, .91. k(s) is N(0, sd = .3).
a_y = 10, b_y = 10 · (.25²) ⇒ strong prior at λ_y = 1/.25²; a_x = 1, b_x = .001.

[figure: basis functions k_j(s) and data y(s) over s ∈ [0, 1] (top); trace plots of the x_j and of λ_x, λ_y over 1000 iterations (bottom)]
1-d example

From posterior realizations of the knot weights x, one can construct posterior realizations of the smooth fitted function z(s) = Σ_{j=1}^m k_j(s) x_j, as sketched below. Note the strong prior on λ_y is required since n is small.

[figure: posterior realizations of z(s) (left); posterior mean and pointwise 90% bounds with the data (right)]
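A sketch of this step, reusing the hypothetical `gibbs` function from the earlier sketch together with the settings of this example; the data values in `y` are made up for illustration:

```python
import numpy as np

s_obs = np.array([0.05, 0.25, 0.52, 0.65, 0.91])
y = np.array([1.2, 0.6, -0.4, -0.9, 0.3])       # hypothetical data values
omega = np.linspace(-0.3, 1.2, 6)               # m = 6 knots
sd = 0.3                                        # kernel sd from the example

def kernel_matrix(s, omega, sd):
    d = (s[:, None] - omega[None, :]) / sd
    return np.exp(-0.5 * d**2) / (sd * np.sqrt(2 * np.pi))

draws = gibbs(y, kernel_matrix(s_obs, omega, sd),
              a_x=1, b_x=0.001, a_y=10, b_y=10 * 0.25**2)

s_grid = np.linspace(0, 1, 100)
Kg = kernel_matrix(s_grid, omega, sd)
z_post = draws["x"] @ Kg.T               # each row: one realization of z(s)
z_mean = z_post.mean(axis=0)             # posterior mean curve
z_lo, z_hi = np.percentile(z_post, [5, 95], axis=0)  # pointwise 90% bounds
```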
1-d example

m = 20 knots evenly spaced between −2 and 12. n = 18 data points evenly spaced between 0 and 10. k(s) is N(0, sd = 2).

[figure: data, and posterior mean with 80% CI for z(s) (top); posterior distributions for λ_y and λ_x (bottom)]
1-d example

m = 20 knots evenly spaced between −2 and 12. n = 18 data points evenly spaced between 0 and 10. k(s) is N(0, sd = 1).

[figure: data, and posterior mean with 80% CI for z(s) (top); posterior distributions for λ_y and λ_x (bottom)]
Basis representations for spatial processes z(s)

Represent z(s) at spatial locations s_1, ..., s_n:
  z = (z(s_1), ..., z(s_n))^T ∼ N(0, Σ_z).

Recall z = Kx, where K K^T = Σ_z and x ∼ N(0, I_n). This gives a discrete representation of z(s) at the locations s_1, ..., s_n:

  discrete:   z(s_i) = Σ_{j=1}^n K_ij x_j
  continuous: z(s)   = Σ_{j=1}^n k_j(s) x_j

[figure: discrete representation (left) vs. continuous representation (right) of the basis functions over s ∈ [0, 10]]

Columns of K give basis functions. Can use a subset of these basis functions to reduce dimensionality.
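A numpy sketch of the K K^T = Σ_z construction, assuming a Gaussian covariance for Σ_z (an illustrative choice); both a Cholesky factor and an SVD-based factor are valid choices of K, which is the comparison drawn on the next slide:

```python
import numpy as np

rng = np.random.default_rng(3)

n = 50
s = np.linspace(0, 10, n)
# illustrative Gaussian covariance for Sigma_z, plus jitter for stability
Sigma_z = np.exp(-0.5 * ((s[:, None] - s[None, :]) / 2.0) ** 2) + 1e-8 * np.eye(n)

K_chol = np.linalg.cholesky(Sigma_z)     # one K with K K^T = Sigma_z
z = K_chol @ rng.standard_normal(n)      # z ~ N(0, Sigma_z)

# the factorization is not unique: an SVD gives another valid K
U, d, _ = np.linalg.svd(Sigma_z)
K_svd = U * np.sqrt(d)                   # K_svd K_svd^T = Sigma_z as well
```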
[figure: for three decompositions of Σ_z — Cholesky (with pivoting), SVD, and kernels — the resulting basis functions (left) and the covariance they induce over locations s (right)]
How many basis kernels?

Define m basis functions k_1(s), ..., k_m(s).

[figure: basis functions (left) and implied covariance over locations s (right) for m = 20, m = 10, and m = 6 kernels]
Moving average specifications for spatial models z(s)

Take a smoothing kernel k(s) and a white noise process x = (x_1, ..., x_m) at spatial locations ω_1, ..., ω_m:

  x ∼ N(0, (1/λ_x) I_m)

The spatial process z(s) is constructed by convolving x with the smoothing kernel k(s):

  z(s) = Σ_{j=1}^m x_j k(ω_j − s)

⇒ z(s) is a Gaussian process with mean 0 and covariance

  Cov(z(s), z(s′)) = (1/λ_x) Σ_{j=1}^m k(ω_j − s) k(ω_j − s′)
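A quick Monte Carlo check of this covariance formula; the precision, kernel, knot grid, and sample size are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(4)
lam_x = 2.0                              # white-noise precision (illustrative)
omega = np.linspace(-2, 12, 20)          # knot locations

def k(u, sd=1.0):                        # smoothing kernel: normal density
    return np.exp(-0.5 * (u / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

s, s2 = 4.0, 5.0
cov_formula = np.sum(k(omega - s) * k(omega - s2)) / lam_x

# simulate many white-noise fields x ~ N(0, (1/lam_x) I_m) and convolve
X = rng.normal(0.0, 1.0 / np.sqrt(lam_x), size=(100_000, omega.size))
cov_mc = np.mean((X @ k(omega - s)) * (X @ k(omega - s2)))
# cov_mc should be close to cov_formula
```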
Example: constructing 1-d models for z(s)

m = 20 knot locations ω_1, ..., ω_m equally spaced between −2 and 12.

  x = (x_1, ..., x_m)^T ∼ N(0, I_m)
  z(s) = Σ_{k=1}^m k(ω_k − s) x_k

k(s) is a normal density with sd = 1, 1/2, and 1/4; the 4th frame uses k(s) = 1.4 exp{ −1.4 |s| }.

[figure: four frames of x and the resulting z(s) over s ∈ [0, 10], one per kernel]

General points (see the sketch below):
• smooth kernels required
• spacing depends on kernel width
  − knot spacing ≤ 1 sd for normal k(s)
• kernel width is equivalent to the scale parameter in GP models
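A numpy sketch of this construction: the same white-noise weights x convolved with each of the four kernels; plotting (e.g. with matplotlib) is omitted:

```python
import numpy as np

rng = np.random.default_rng(5)

omega = np.linspace(-2, 12, 20)          # m = 20 knot locations
x = rng.standard_normal(omega.size)      # x ~ N(0, I_m), shared across frames
s = np.linspace(0, 10, 400)

def normal_kernel(u, sd):
    return np.exp(-0.5 * (u / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

kernels = {
    "normal sd=1":      lambda u: normal_kernel(u, 1.0),
    "normal sd=1/2":    lambda u: normal_kernel(u, 0.5),
    "normal sd=1/4":    lambda u: normal_kernel(u, 0.25),
    "1.4 exp(-1.4|s|)": lambda u: 1.4 * np.exp(-1.4 * np.abs(u)),
}

# same x, different kernels: wider/smoother kernels give smoother z(s)
curves = {name: kern(omega[None, :] - s[:, None]) @ x
          for name, kern in kernels.items()}
```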