lecture 15
play

Lecture 15 GPs for GLMs + Spatial Data 3/20/2018 1 GPs and GLMs - PowerPoint PPT Presentation

Lecture 15 GPs for GLMs + Spatial Data 3/20/2018 1 GPs and GLMs 2 Bern ( ) Logistic Regression A typical logistic regression problem uses the following model, logit ( ) = there is no reason that the


  1. Lecture 15 GPs for GLMs + Spatial Data 3/20/2018 1

  2. GPs and GLMs 2

  3. ๐‘ง ๐‘— โˆผ Bern (๐‘ž ๐‘— ) Logistic Regression A typical logistic regression problem uses the following model, logit (๐‘ž ๐‘— ) = ๐˜ ๐œธ there is no reason that the linear equation above canโ€™t contain thing like random effects or GPs logit (๐‘ž ๐‘— ) = ๐˜ ๐œธ + ๐‘ฅ(๐ฒ) where ๐‘ฅ(๐ฒ) โˆผ ๐’ช(0, ฮฃ) 3 ๐‘ง ๐‘— โˆผ Bern (๐‘ž ๐‘— ) = ๐›พ 0 + ๐›พ 1 ๐‘ฆ ๐‘—1 + โ‹ฏ + ๐›พ ๐‘™ ๐‘ฆ ๐‘—๐‘™

  4. Logistic Regression A typical logistic regression problem uses the following model, logit (๐‘ž ๐‘— ) = ๐˜ ๐œธ there is no reason that the linear equation above canโ€™t contain thing like random effects or GPs logit (๐‘ž ๐‘— ) = ๐˜ ๐œธ + ๐‘ฅ(๐ฒ) where ๐‘ฅ(๐ฒ) โˆผ ๐’ช(0, ฮฃ) 3 ๐‘ง ๐‘— โˆผ Bern (๐‘ž ๐‘— ) = ๐›พ 0 + ๐›พ 1 ๐‘ฆ ๐‘—1 + โ‹ฏ + ๐›พ ๐‘™ ๐‘ฆ ๐‘—๐‘™ ๐‘ง ๐‘— โˆผ Bern (๐‘ž ๐‘— )

  5. A toy example 4 2 0.75 1 eta 0 p 0.50 โˆ’1 0.25 โˆ’2 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 x x 1.00 0.75 y 0.50 0.25 0.00 0.00 0.25 0.50 0.75 1.00 x

  6. Jags Model โˆ— Sigma[j,i] = Sigma[i,j] }โ€ l ~ dunif(3/0.5, 3/0.01) tau ~ dgamma(1, 2) sigma2 = 1/tau beta0 ~ dnorm(0, 1) } Sigma[i,i] = sigma2 for (i in 1:length(y)) { } } Sigma[i,j] = sigma2 * exp(- l * d[i,j])) logistic_model = โ€model{ for (j in (i+1):length(y)) { for (i in 1:(length(y)-1)) { eta ~ dmnorm(rep(0,N), inverse(Sigma)) loglik = sum(loglik_i) } loglik_i[i] = y[i] * log(p[i]) + (1-y[i]) * log(1-p[i]) y_hat[i] ~ dbern(p[i]) logit(p[i]) = beta0 + eta[i] y[i] ~ dbern(p[i]) for(i in 1:N) { 5

  7. Model Results - Diagnostics 6 0.5 0.0 (Intercept) โˆ’0.5 โˆ’1.0 โˆ’1.5 โˆ’2.0 20 estimate 15 phi 10 8 6 sigma.sq 4 2 0 2500 5000 7500 10000 .iteration

  8. Model Results - Fit 7 eta p 1.00 3 0.75 estimate 0.95 0 0.50 0.8 0.5 0.25 โˆ’3 0.00 โˆ’6 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 x

  9. Model vs glm g = glm (y~ poly (x,2), data=d, family=โ€binomialโ€) 8 eta p 2 0.75 0 type estimate 0.50 glm prediction โˆ’2 truth 0.25 โˆ’4 โˆ’6 0.00 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 x

  10. Count data - Polio cases Polio from the glarma package. This data set gives the monthly number of cases of poliomyelitis in the U.S. for the years 1970โ€“1983 as reported by the Center for Disease Control. 9 10 cases 5 0 1970 1975 1980 year

  11. Polio Model Model : where ๐‘ฅ(๐ฎ) โˆผ ๐’ช(0, ฮฃ) Priors : ๐œš โˆผ Unif (3 6, 3 1/12) 10 ๐‘ง ๐‘— โˆผ Pois (๐œ‡ ๐‘— ) log (๐œ‡ ๐‘— ) = ๐›พ 0 + ๐‘ฅ(๐ฎ) {๐šป} ๐‘—๐‘˜ = ๐œ 2 exp (โˆ’|๐œš ๐‘’ ๐‘—๐‘˜ |) ๐›พ 0 โˆผ ๐’ช(0, 1) 1/๐œ 2 โˆผ Gamma (2, 1)

  12. Model Results - Diagnostics 11 0.50 0.25 (Intercept) 0.00 โˆ’0.25 โˆ’0.50 30 estimate 20 phi 10 0 2.5 2.0 sigma.sq 1.5 1.0 0.5 0 250 500 750 1000 .iteration

  13. 12 Model Results - Fit lambda 15 10 estimate 0.95 0.8 0.5 5 0 1970 1975 1980 year

  14. Spatial data in R 13

  15. Analysis of geospatial data in R R has a rich package ecosystem for read/writing, manipulating, and analyzing geospatial data. Some core packages (CRAN - Spatial task view): โ€ข sp - core classes for handling spatial data, additional utility functions. โ€ข rgdal - R interface to gdal (Geospatial Data Abstraction Library) for reading and writing spatial data. โ€ข rgeos - R interface to geos (Geometry Engine Open Source) library for querying and manipulating spatial data. Reading and writing WKT. โ€ข sf - Combines the functionality of sp , rgdal , and rgeos into a single package based on tidy priciples. โ€ข lwgeom - additional functionality for sf using PostGISโ€™ liblwgeom. โ€ข raster - classes and tools for handling raster data. 14

  16. Installing sf This is the hardest part of using the sf package, difficulty comes from is dependence on several external libraries ( geos , gdal , and proj ). โ€ข Windows - installing from source works when Rtools is installed (system requirements are downloaded from rwinlib) โ€ข MacOS - install dependencies via homebrew: gdal2 , geos , proj . โ€ข Linux - Install development pacakages for GDAL (>= 2.0.0), GEOS (>= 3.3.0) and Proj.4 (>= 4.8.0) from your package manager of choice. More specific details are included in the repo readme on github. 15

  17. Simple Features 16 Point Linestring Polygon Polygon w/ Hole(s) Multipoint Multilinestring Multipolygon Multipolygon w/ Hole(s)

  18. Reading, writing, and converting simple features โ€ข sf See sf vignette #2. โ€ข st_as_sfc / as(x, โ€Spatialโ€) - sp โ€ข st_as_sfc / st_as_binary - WKB โ€ข st_as_sfc / st_as_wkt - WKT โ€ข st_read / st_write - Shapefile, GeoJSON, KML, โ€ฆ โ€ข rgeos 17 โ€ข maptools โ€ข readShapePoints / writeShapePoints - Shapefile w/ points โ€ข readShapeLines / writeShapeLines - Shapefile w/ lines โ€ข readShapePoly / writeShapePoly - Shapefile w/ polygons โ€ข readShapeSpatial / writeShapeSpatial - Shapefile โ€ข rgdal โ€ข readOGR / writeOGR - Shapefile, GeoJSON, KML, โ€ฆ โ€ข readWKT / writeWKT - Well Known Text

  19. Geospatial data in the real world 18

  20. Projections 19 Lat/Long (epsg:4326) Google / Web Mercator (epsg:3857) Lambert Conformal Conic: 2.0e+07 20 ยฐ N 40 ยฐ N 60 ยฐ N 80 ยฐ N 6e+06 1.0e+07 2e+06 โˆ’4e+06 0.0e+00 150 ยฐ W 100 ยฐ W 50 ยฐ W โˆ’2.0e+07 โˆ’1.0e+07 0.0e+00 โˆ’5e+06 0e+00 5e+06 Alberts Equal Area Robinson Mollweide 8e+06 10000000 1e+07 4e+06 5e+06 4000000 0e+00 0e+00 โˆ’2000000 โˆ’6e+06 โˆ’5e+06 0e+00 5e+06 โˆ’1.5e+07 โˆ’5.0e+06 0.0e+00 โˆ’1.5e+07 โˆ’5.0e+06

  21. Distance on a Sphere 20

  22. Dateline Want to fly from the Western most point in the US to the Eastern most point? 21 70 ยฐ N 65 ยฐ N 60 ยฐ N 55 ยฐ N 180 ยฐ 160 ยฐ W 140 ยฐ W

  23. 22 50 ยฐ N 51 ยฐ N 52 ยฐ N 53 ยฐ N 54 ยฐ N 55 ยฐ N 190 ยฐ W 185 ยฐ W 180 ยฐ 175 ยฐ W 170 ยฐ W

  24. 23

  25. Using sf 24

  26. Example data Stokes C~ 37169 37 241. Camden C~ 37029 37 2002. NC 2.11 7 0.0626 ## 456. 2001. NC 8 0.115 1.40 6 0.119 ## 264. Currituc~ 37053 37 2000. NC 4.43 ## 1.46 ## ## 10 0.0925 [degree]> ## # ## # ... with 90 more rows, and 1 more variable: geometry <sf_geometry 356. Hertford~ 37091 37 2005. NC 1.81 551. 2003. NC Northamp~ 37131 37 2004. NC 2.40 9 0.143 ## 444. Warren C~ 37185 37 5 0.0687 342. nc <dbl> 1 0.112 ## <dbl> <chr> <chr> <dbl> <chr> <chr> <dbl> ## Gates Co~ 37073 37 STATE_FIPS SQUARE_MIL FIPS AREA PERIMETER COUNTYP010 STATE COUNTY ## ## # A tibble: 100 x 9 tbl_df (nc) = st_read (โ€../data/gis/nc_counties/โ€, quiet=TRUE, stringsAsFactors=FALSE) 1.61 1994. NC Ashe Cou~ 37009 37 429. 1999. NC 1.43 4 0.0891 ## 539. Surry Co~ 37171 37 1998. NC 1.77 3 0.140 ## 236. Alleghan~ 37005 37 1996. NC 1.35 2 0.0616 ## 25 air = st_read (โ€../data/gis/airports/โ€, quiet=TRUE, stringsAsFactors=FALSE) hwy = st_read (โ€../data/gis/us_interstates/โ€, quiet=TRUE, stringsAsFactors=FALSE)

  27. tbl_df (air) 7 11. AIRPORT KCCR 8 ## MADIS~ 47 MC KELLAR~ JACK~ TN MKL 10. AIRPORT KMKL ## BUCHANAN ~ CONC~ CA COBB 13 COBB COUN~ ATLA~ GA RYY 8. AIRPORT KRYY 6 ## ST LU~ CCR 06 ST LUCIE ~ FORT~ FL CAD CNTL_TWR <chr>, geometry <sf_geometry [degree]> ## # LATITUDE <dbl>, LONGITUDE <dbl>, ELEV <dbl>, ACT_DATE <chr>, ## # ## # ... with 930 more rows, and 8 more variables: FIPS <chr>, TOT_ENP <dbl>, WEXFO~ 26 WEXFORD C~ CADI~ MI 15. AIRPORT KCAD CONTR~ ## 10 LOUDO~ 51 LEESBURG ~ LEES~ VA JYO 13. AIRPORT KJYO 9 ## 12 FPR ## # A tibble: 940 x 17 <chr> ## NEW L~ 09 GROTON-NE~ GROT~ CT GON 0. AIRPORT KGON 1 ## <chr> <chr> <chr> 3. AIRPORT K6S5 <chr> <chr> <chr> <dbl> <chr> ## STATE STATE_FIPS COUNTY AIRPT_NAME CITY IATA AIRPRTX010 FEATURE ICAO ## 2 6S5 7. AIRPORT KFPR ## 5 ## SAN D~ 06 GILLESPIE~ SAN ~ CA SEE 6. AIRPORT KSEE 4 KERN RAVALLI C~ HAMI~ MT 06 MOJAVE AI~ MOJA~ CA MHV 4. AIRPORT KMHV 3 ## RAVAL~ 30 26

  28. tbl_df (hwy) ## 85.3 137. MULTILINESTRING ((680741.7 ... ## 6 I124 1.73 2.79 MULTILINESTRING ((1201467 3... ## 7 I126 3.56 5.72 MULTILINESTRING ((1601502 3... 8 I129 ## 3.10 4.99 MULTILINESTRING ((217446 47... ## 9 I135 96.3 155. MULTILINESTRING ((96922.97 ... ## 10 I15 1436. 2311. MULTILINESTRING ((-882875.7... ## # ... with 223 more rows 5 I12 2.55 MULTILINESTRING ((-1013796 ... ## # A tibble: 233 x 4 3941. ## ROUTE_NUM DIST_MILES DIST_KM geometry ## <chr> <dbl> <dbl> <sf_geometry [m]> ## 1 I10 2449. MULTILINESTRING ((-1881200 ... 1.58 ## 2 I105 20.8 33.4 MULTILINESTRING ((-1910156 ... ## 3 I110 41.4 66.6 MULTILINESTRING ((1054139 3... ## 4 I115 27

Recommend


More recommend