Gaussian Process Regression with Noisy Inputs
Dan Cervone, Harvard Statistics Department
March 3, 2015
Gaussian process regression: Introduction

A smooth response x over a surface S ⊂ R^p. For s_1, ..., s_n ∈ S,

    (x(s_1), ..., x(s_n))′ ∼ N(0, C(s_n, s_n)),   [C(s_n, s_n)]_ij = c(s_i, s_j),

where c is the covariance function.

Interpolation/prediction at unobserved locations in input space:
Observe x_n = (x(s_1) ... x(s_n))′. Predict x*_k = (x(s*_1) ... x(s*_k))′. Then

    x*_k | x_n ∼ N( C(s*_k, s_n) C(s_n, s_n)⁻¹ x_n ,  C(s*_k, s*_k) − C(s*_k, s_n) C(s_n, s_n)⁻¹ C(s_n, s*_k) ).
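As a quick illustration, here is a minimal Python sketch of these conditional-normal (kriging) formulas. The squared-exponential covariance, the design, the jitter term, and all variable names are assumptions for illustration, not taken from the slides.

```python
import numpy as np

def sq_exp_cov(a, b):
    # Squared-exponential covariance c(s1, s2) = exp(-(s1 - s2)^2)
    return np.exp(-(a[:, None] - b[None, :]) ** 2)

rng = np.random.default_rng(0)
s_n = np.sort(rng.uniform(0, 10, 15))       # observed locations s_1, ..., s_n
s_star = np.linspace(0, 10, 200)            # prediction locations s*_1, ..., s*_k

C_nn = sq_exp_cov(s_n, s_n) + 1e-10 * np.eye(len(s_n))   # small jitter for stability
x_n = rng.multivariate_normal(np.zeros(len(s_n)), C_nn)  # one draw of x_n

C_sn = sq_exp_cov(s_star, s_n)
C_ss = sq_exp_cov(s_star, s_star)
A = C_sn @ np.linalg.solve(C_nn, np.eye(len(s_n)))        # C(s*_k, s_n) C(s_n, s_n)^{-1}

pred_mean = A @ x_n              # conditional mean of x*_k given x_n
pred_cov = C_ss - A @ C_sn.T     # conditional covariance of x*_k given x_n
```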
Gaussian process regression: Example

[Figure: one-dimensional GP example; axes: location (s) from 0 to 10 vs. value x(s) from −2 to 2.]
GPs with noisy inputs: Scientific examples

Location error model: instead of observing x, we observe the process y(s) = x(s + u), where u ∼ g(u) are errors in the input space S.

Note: we observe s_n, y_n, but wish to predict x(s*).
Note: y is never a GP.

Location errors (e.g. geocoding error, map positional error) are a problem in many scientific domains:
Epidemiology [3, 10, 2].
Environmental sciences [1, 16].
Object tracking/computer vision [9, 15].
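A small simulation sketch of this observation model (the error scale sigma_u, the design, the covariance choice, and all names are illustrative assumptions): the latent GP is evaluated at the perturbed locations s_i + u_i, but the values are recorded against the reported locations s_i.

```python
import numpy as np

def sq_exp_cov(a, b):
    # Squared-exponential covariance, assumed for illustration
    return np.exp(-(a[:, None] - b[None, :]) ** 2)

rng = np.random.default_rng(1)
sigma_u = 0.5                                   # assumed location-error scale
s_n = np.sort(rng.uniform(0, 10, 15))           # reported locations
u_n = rng.normal(0.0, sigma_u, size=len(s_n))   # unobserved location errors u ~ g(u)

# x is a GP at the *true* locations s_n + u_n; the analyst only sees (s_i, y_i)
# with y_i = x(s_i + u_i).
true_locs = s_n + u_n
C_true = sq_exp_cov(true_locs, true_locs) + 1e-10 * np.eye(len(s_n))
y_n = rng.multivariate_normal(np.zeros(len(s_n)), C_true)

# Marginally over u, the finite-dimensional laws of y are Gaussian mixtures,
# which is why y itself is not a GP.
```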
Measurement error: GP location errors vs. errors-in-variables

GP input/location errors: y(s) = x(s + u) + ε.

Traditional errors-in-variables model [5]: x* = f_θ(x) + ε; in GP regression, x(s*) = f_{θ, s_n}(x_n) + ε.
Observe y = x + η, i.e. y_n = x_n + η_n. Common to assume η ⊥ x (classical) or η ⊥ y (Berkson).

GP input errors do not yield a traditional errors-in-variables regression problem:
The errors y(s) − x(s) depend on x(s).
The true regression function is unknown: x(s*) = f_{θ, s_n + u_n}(y_n) + ε.
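One quick way to see this difference is a Monte Carlo check (all settings are illustrative assumptions): for a stationary GP with c(d) = exp(−d²) and u ∼ N(0, σ_u²), the induced error y(s) − x(s) is correlated both with x(s) and with y(s), so it fits neither the classical nor the Berkson error model.

```python
import numpy as np

rng = np.random.default_rng(2)
sigma_u = 0.5
n_sim = 200_000

# Conditional on u, (x(s), x(s + u)) is bivariate normal with unit variances
# and correlation rho = exp(-u^2); we marginalize over u ~ N(0, sigma_u^2).
u = rng.normal(0.0, sigma_u, n_sim)
rho = np.exp(-u ** 2)
x_s = rng.normal(0.0, 1.0, n_sim)                                          # x(s)
x_su = rho * x_s + np.sqrt(1.0 - rho ** 2) * rng.normal(0.0, 1.0, n_sim)   # x(s + u)

err = x_su - x_s                        # induced "measurement error" y(s) - x(s)
print(np.corrcoef(err, x_s)[0, 1])      # correlated with x(s): not classical error
print(np.corrcoef(err, x_su)[0, 1])     # correlated with y(s): not Berkson error
```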
Measurement error: Methodology

Methods that properly account for noisy inputs are essential for reliable inference in this regime. We seek:
Optimal (MSE) point prediction, and interval predictions with correct coverage.
Consistent/efficient parameter estimation.
The location-error regime can actually deliver more precise predictions than the error-free regime.

We discuss three methods:
Ignoring location errors.
Kriging (BLUP), using moment properties of the error-induced process y.
MCMC on the space (x*_k, u_n).
Ignoring location errors: Sometimes, you can get lucky

Analyst just assumes y_n = x_n: "Kriging Ignoring Location Errors" (KILE) [6]:

    x̂_KILE(s*) = C(s*, s_n) C(s_n, s_n)⁻¹ y_n.

Parameter inference is based on assuming y_n = x_n ∼ N(0, C(s_n, s_n)).

Example: c(s_1, s_2) = exp(−(s_1 − s_2)²), and u ∼ N(0, σ_u²).

[Figure: sample GP paths, with the observed and predictive locations marked; axes: location (s) from 0 to 10 vs. x(s).]
[Figure: MSE of the KILE predictor as a function of log(σ_u²); MSE axis 0.00 to 1.40, log(σ_u²) from −4 to 4.]
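A Monte Carlo sketch of an MSE curve like the one in the figure (the design, seed, target location, and simulation size are assumptions): for each value of σ_u² we simulate the GP jointly at the perturbed locations and at s*, apply the KILE weights that ignore u, and average the squared prediction error.

```python
import numpy as np

def sq_exp_cov(a, b):
    return np.exp(-(a[:, None] - b[None, :]) ** 2)

def kile_mse(sigma_u, s_n, s_star, n_sim=4000, seed=3):
    """Monte Carlo MSE of the KILE predictor of x(s_star) under error scale sigma_u."""
    rng = np.random.default_rng(seed)
    n = len(s_n)
    C_nn = sq_exp_cov(s_n, s_n) + 1e-10 * np.eye(n)
    w = np.linalg.solve(C_nn, sq_exp_cov(np.array([s_star]), s_n).ravel())  # KILE weights
    sq_err = np.empty(n_sim)
    for i in range(n_sim):
        u = rng.normal(0.0, sigma_u, n)
        locs = np.concatenate([s_n + u, [s_star]])     # true sample locations plus target
        C = sq_exp_cov(locs, locs) + 1e-10 * np.eye(n + 1)
        z = rng.multivariate_normal(np.zeros(n + 1), C)
        y_n, x_star = z[:n], z[n]
        sq_err[i] = (x_star - w @ y_n) ** 2            # KILE ignores u entirely
    return sq_err.mean()

s_n = np.linspace(0, 10, 11)                           # assumed design
for log_var in [-4, -2, 0, 2, 4]:
    print(log_var, kile_mse(np.sqrt(np.exp(log_var)), s_n, s_star=5.5))
```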
Ignoring location errors: Sometimes, disaster strikes

Assuming a known covariance function, KILE is not a self-efficient procedure. Self-efficiency [12]: an estimator cannot be improved by removing/subsampling data.

Theorem. Assume the covariance function c and error model u ∼ g(u) satisfy regularity conditions. Let x̂ⁿ_KILE(s*) be the KILE estimator for x(s*) given y_n. Then for any s_n and s*, there exists s_{n+1} such that

    E[(x(s*) − x̂ⁿ⁺¹_KILE(s*))²] ≥ E[(x(s*) − x̂ⁿ_KILE(s*))²].
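A small numerical illustration of this phenomenon, reusing sq_exp_cov and kile_mse from the previous sketch (the design, σ_u, and the choice to place s_{n+1} at s* itself are assumptions, not the theorem's construction): with a large location-error scale, the added observation tends to increase the Monte Carlo MSE.

```python
# Continues the previous sketch: assumes sq_exp_cov and kile_mse are already defined.
import numpy as np

s_n = np.array([2.0, 4.0, 6.0, 8.0])
s_star, sigma_u = 5.0, 1.0

mse_n = kile_mse(sigma_u, s_n, s_star, n_sim=20000)
mse_n1 = kile_mse(sigma_u, np.append(s_n, s_star), s_star, n_sim=20000)  # add s_{n+1} at s*

# KILE puts (nearly) all weight on the new point, whose true location is s* + u,
# so with sigma_u this large the extra observation typically *raises* the MSE.
print(mse_n, mse_n1)
```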