Conjugate Bayesian Models for Massive Spatial Data Abhi Datta 1 , - PowerPoint PPT Presentation

Conjugate Bayesian Models for Massive Spatial Data Abhi Datta 1 , Sudipto Banerjee 2 and Andrew O. Finley 3 July 31, 2017 1 Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland. 2 Department of Biostatistics, Fielding School of Public Health, University of California, Los Angeles. 3 Departments of Forestry and Geography, Michigan State University, East Lansing, Michigan.

Case Study: Alaska Tanana Valley Forest Height Dataset Forest height and tree cover Forest fire history • Forest height (red lines) data from LiDAR at 5 × 10 6 locations • Knowledge of forest height is important for biomass assessment, carbon management etc 1

Case Study: Alaska Tanana Valley Forest Height Dataset Forest height and tree cover Forest fire history • Goal: High-resolution domainwide prediction maps of forest height • Covariates: Domainwide tree cover (grey) and forest fire history (red patches) in the last 20 years 1

Analyzing the data Models used: • Non-spatial regression: y FH ( s ) = β 0 + β tree x tree + β fire x fire + ǫ ( s ) Figure: Variogram of the residuals from non-spatial regression indicates strong spatial pattern 2

NNGP models • Collapsed NNGP: • y FH ( s ) = β 0 + β tree x tree + β fire x fire + w ( s ) + ǫ ( s ) • w ( s ) ∼ NNGP (0 , C ( · , · | σ 2 , φ )) • y FH ∼ N ( X β, ˜ C + τ 2 I ) where ˜ C is the NNGP covariance matrix derived from C • Response NNGP: • y FH ( s ) ∼ NNGP ( β 0 + β tree x tree + β fire x fire , Σ( · , · | σ 2 , φ, τ 2 )) • y FH ∼ N ( X β, ˜ Σ) where ˜ Σ is the NNGP covariance matrix derived from Σ = C + τ 2 I 3

NNGP models Non-spatial regression Collapsed NNGP Response NNGP CRPS 2.3 0.86 0.86 RMSPE 4.2 1.73 1.72 CP 93% 94% 94% CIW 16.3 6.6 6.6 Table: Model comparison metrics for the Tanana valley dataset • NNGP models perform significantly better than the non-spatial model • MCMC run time for the NNGP models: • Collapsed model: 319 hours • Response model: 38 hours • For massive spatial data, full Bayesian output for even NNGP models require substantial time 4

Another look at the response model • Original full GP model: y ( s ) ind ∼ N ( x ( s ) ′ β + w ( s ) , τ 2 ) • w ( s ) ∼ GP with a stationary covariance function C ( · , · | σ 2 , φ ) • Cov ( w ) = σ 2 R ( φ ) • Full GP model: y ∼ N ( X β, Σ) where Σ = σ 2 M • M = R ( φ ) + α I • α = τ 2 /σ 2 is the ratio of the noise to signal variance • Response NNGP model: y ∼ N ( X β, ˜ Σ) Σ = σ 2 ˜ • ˜ M where ˜ M is the NNGP approximation for M 5

Conjugate NNGP • y ∼ N ( X β, σ 2 ˜ M ) • If φ and α are known, M , and hence ˜ M , are known matrices • The model becomes a standard Bayesian linear model • Assume a Normal Inverse Gamma (NIG) prior for ( β, σ 2 ) ′ • ( β, σ 2 ) ′ ∼ NIG ( µ β , V β , a σ , b σ ), i.e., β | σ 2 ∼ N ( µ β , σ 2 V β ) and σ 2 ∼ IG ( a σ , b σ ) 6

Conjugate NNGP • y ∼ N ( X β, σ 2 ˜ M ), ˜ M is known Joint likelihood: N ( y | X β, σ 2 ˜ M ) × N ( β | µ β , σ 2 V β ) × IG ( σ 2 | a σ , b σ ) 7

Conjugate NNGP • y ∼ N ( X β, σ 2 ˜ M ), ˜ M is known Joint likelihood: N ( y | X β, σ 2 ˜ M ) × N ( β | µ β , σ 2 V β ) × IG ( σ 2 | a σ , b σ ) • Conjugate posterior distribution ( β, σ 2 ) | y ∼ NIG ( µ ∗ β , V ∗ β , a ∗ σ , b ∗ σ ) • Expressions for µ ∗ β , V ∗ β , a ∗ σ and b ∗ σ can be calculated in O ( n ) time 7

Conjugate NNGP • ( β, σ 2 ) | y ∼ NIG ( µ ∗ β , V ∗ β , a ∗ σ , b ∗ σ ) β , b ∗ • Marginal posterior: β | y ∼ MVt 2 a ∗ σ ( µ ∗ σ V ∗ β ) σ a ∗ • MVt k ( m , V ) is the multivariate t distribution with degrees of k , mean m and scale matrix V b ∗ • E ( β | y ) = µ ∗ σ − 1 V ∗ β , Var ( β | y ) = σ β a ∗ • Marginal posterior: σ 2 | y ∼ IG ( a ∗ σ , b ∗ σ ) b ∗ 2 • E ( σ 2 | y ) = σ − 1 , Var ( σ 2 | y ) = b ∗ σ σ σ − 1) 2 ( a ∗ a ∗ ( a ∗ σ − 2) • Exact posterior distributions of β and σ 2 are available 8

Predictive distributions σ ( m ( s ) , b ∗ • y ( s ) | y ∼ t 2 a ∗ σ v ( s )) σ a ∗ b ∗ • E ( y ( s ) | y ) = m ( s ), Var ( y ( s ) | y ) = σ − 1 v ( s ) σ a ∗ • m ( s ) and v ( s ) can be computed using O ( m ) flops • Exact posterior predictive distributions of y ( s ) | y for any s • No MCMC required for parameter estimation or prediction 9

Choosing α and φ • φ and α are chosen using K -fold cross validation over a grid of possible values • Unlike MCMC, cross-validation can be completely parallelized • Resolution of the grid for φ and α can be decided based on computing resources available • In practice, a reasonably coarse grid often suffices 10

Conjugate Bayesian Models for Massive Spatial Data Abhi Datta 1 , - PowerPoint PPT Presentation

Conjugate Bayesian Models for Massive Spatial Data Abhi Datta 1 , Sudipto Banerjee 2 and Andrew O. Finley 3 July 31, 2017 1 Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland. 2 Department

Massive Data Algorithmics Lecture 1: Introduction Massive Data Algorithmics Lecture 1:

Bayesian Spatial Analysis Dr. Jarad Niemi STAT 615 - Iowa State University November 9, 2017

STAT 209 Spatial Data I April 30, 2018 Colin Reimer Dawson 1 / 26 Spatial Data Projections

Choosing Priors Probability Intervals 18.05 Spring 2014 Conjugate priors A prior is conjugate

Resource 1: What is spatial? presentation notes Section Section text Notes 1. Spatial

Broadening the Study of Spatial Intelligence Mary Hegarty University of California, Santa

A Spatial Cloaking Framework A Spatial Cloaking Framework A Spatial Cloaking Framework A Spatial

Being Bayesian About Being Bayesian About Net work St ruct ure Net work St ruct ure A Bayesian

Outline Intro to RL and Bayesian Learning History of Bayesian RL Model-based Bayesian

Bayesian hierarchical models Bruno Nicenboim / Shravan Vasishth 2020-03-14 1 Bayesian

The FIFA Universe Massive scale, massive influence, massive corruption First, Some History.

Bayesian Learning 1 Outline MLE, MAP vs. Bayesian Learning Bayesian Linear Regression

Massive Data Algorithmics Lecture 10: Connected Components and MST Massive Data Algorithmics

Case Study: Bayesian Linear Regression and Sparse Bayesian Models Piyush Rai Dept. of CSE, IIT

Lecture 19 Spatial GLM + Point Reference Spatial Data Colin Rundel 11/09/2017 1 Spatial GLM

Lecture 19 Spatial GLM + Point Reference Spatial Data Colin Rundel 04/03/2017 1 Spatial GLM

Advanced Data Science on Spark Reza Zadeh @Reza_Zadeh | http://reza-zadeh.com Data Science Problem

A survey of f potential in insect vectors of f mountain pin ine proliferation decline phyt

When conifers meet Greg Jordan Tim Brodribb Pinaceae Podocarpaceae North versus South Equator

Congruent Numbers Definition (Congruent Number) An integer n is congruent if it is the area of

Symmetry and Geometry in String Theory Symmetry and Extra Dimensions String/M Theory Rich

Two Connections between Combinatorial and Differential Geometry John M. Sullivan Institut f

A Parameterization Method for Computing Normally Hyperbolic Invariant Tori Some Numerical

Classical negation Patrick D. Elliott November 16, 2020 Logic and Engineering of Natural