Burglary in London: Insights from Statistical Heterogeneous Spatial - PowerPoint PPT Presentation

Burglary in London: Insights from Statistical Heterogeneous Spatial Point Processes Jan Povala with Seppo Virtanen and Mark Girolami Imperial College London February 19, 2020

Outline: Motivation, Modelling, Experiment

Motivation
◮ Model the occurrences of burglary as a spatial point pattern and provide short-term forecasts.
◮ Provide insights into the intensity of the process.

Two pillars of spatial statistics
To avoid biased results and faulty inferences, a reasonable spatial model needs to account for:
◮ Spatial dependence: the first law of geography, “everything is related to everything else, but near things are more related than distant things” (Tobler 1970).
◮ Spatial heterogeneity: phenomena observed on large domains tend to exhibit location-specific dynamics.

The data I
Figure: Intensity of burglary occurrences in London during the year 2015 (colour scale: burglary, 0 to 100).

The data II
Figure: Histogram of the location counts of burglary in London (panel title: crime counts histogram; x-axis: number of crimes in a cell; y-axis: count).

Cox Process
The Cox process is a natural choice for an environmentally-driven point process (Cox 1955, Diggle et al. 2013).
Definition: a Cox process $Y(x)$ is defined by two postulates:
1. $\Lambda(x)$ is a nonnegative-valued stochastic process;
2. conditional on the realisation $\lambda(x)$ of the process $\Lambda(x)$, the point process $Y(x)$ is an inhomogeneous Poisson process with intensity $\lambda(x)$.

Log-Gaussian Cox Process
◮ Cox process with intensity driven by a fixed component $X(x)^\top \beta$ and a latent function $f(x)$:
$$\Lambda(x) = \exp\left(X(x)^\top \beta + f(x)\right),$$
where $f(x) \sim \mathcal{GP}(0, k_\theta(\cdot, \cdot))$, $X(x)$ are socio-economic covariates, and $\beta$ are their coefficients.
◮ Discretised version of the model:
$$y_i \sim \mathrm{Poisson}\left(\exp\left(X(x_i)^\top \beta + f(x_i)\right)\right).$$
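The discretised LGCP can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation: it assumes a one-dimensional grid, a squared-exponential kernel for $k_\theta$, and illustrative values for the covariates, $\beta$, variance and lengthscale. All function names are hypothetical, and only the Python standard library is used.

```python
import math
import random


def sq_exp_kernel(xs, variance=1.0, lengthscale=1.0, jitter=1e-8):
    """Covariance matrix of a squared-exponential kernel (an assumed choice of k_theta)."""
    n = len(xs)
    return [[variance * math.exp(-0.5 * ((xs[i] - xs[j]) / lengthscale) ** 2)
             + (jitter if i == j else 0.0) for j in range(n)] for i in range(n)]


def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive-definite matrix."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def sample_poisson(rate, rng):
    """Knuth's multiplication method; adequate for the moderate rates used here."""
    limit, k, p = math.exp(-rate), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_lgcp(xs, covariates, beta, rng, variance=0.25, lengthscale=1.0):
    """Draw f ~ GP(0, k) on the grid, then y_i ~ Poisson(exp(X_i^T beta + f_i))."""
    L = cholesky(sq_exp_kernel(xs, variance, lengthscale))
    white = [rng.gauss(0.0, 1.0) for _ in xs]
    f = [sum(L[i][k] * white[k] for k in range(i + 1)) for i in range(len(xs))]
    counts = []
    for row, f_i in zip(covariates, f):
        log_intensity = sum(c * b for c, b in zip(row, beta)) + f_i
        counts.append(sample_poisson(math.exp(log_intensity), rng))
    return f, counts


# Illustrative inputs: 10 cells, an intercept and one made-up covariate.
rng = random.Random(0)
xs = [float(i) for i in range(10)]
covariates = [[1.0, x / 10.0] for x in xs]
f, counts = simulate_lgcp(xs, covariates, [0.5, 1.0], rng)
```

The Cholesky factorisation in this sketch costs cubic time in the number of cells, which is why fitting such a model on a city-wide grid becomes expensive.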

LGCP limitations
◮ Fitting this doubly-stochastic model at scale is challenging.
◮ Simplifying assumptions such as stationarity of $f$ may not be appropriate (see Figure 3).
Figure: Standard deviation of the GP (colour scale: standard deviation of $f$).

Common approaches to address spatial heterogeneity
◮ Mixture models with allocation that enforces spatial dependence (Green & Richardson 2002, Fernández & Green 2002, Hildeman et al. 2018).
◮ Regression coefficients modelled as a Gaussian process (Gelfand et al. 2003, Banerjee et al. 2015).
Both of these approaches have limited scalability.

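Our proposed model
$$y_n \mid z_n = k, \beta, X_n \sim \mathrm{Poisson}\left(\exp\left(X_n^\top \beta_k\right)\right)$$
$$z_n \mid \pi \sim \mathrm{Categorical}\left(\pi_{b[n]}\right)$$
$$\pi_b \mid \alpha \sim \mathrm{Dirichlet}(\alpha, \ldots, \alpha)$$
$$\beta_{k,j} \mid \sigma^2_{k,j} \sim \mathcal{N}\left(0, \sigma^2_{k,j}\right)$$
$$\sigma^2_{k,j} \sim \mathrm{InvGamma}(1, 0.01)$$
$$\alpha = 1/K.$$
Figure: graphical model of the above, with nodes $\alpha$, $\pi_b$, $z_n$, $X_n$, $y_n$, $\beta_{kj}$ and $\sigma_{kj}$, and plates $B$, $N$, $K$ and $J$.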
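To make the generative structure concrete, here is a minimal sketch that draws block-level mixture weights, cell allocations and counts. It is an illustration under stated assumptions: the coefficients $\beta_k$ are passed in as fixed values rather than drawn from their prior, the covariates and block assignment are made up, and every function name is hypothetical.

```python
import math
import random


def sample_poisson(rate, rng):
    """Knuth's multiplication method; adequate for the moderate rates used here."""
    limit, k, p = math.exp(-rate), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def sample_dirichlet(alpha, K, rng):
    """Symmetric Dirichlet(alpha, ..., alpha) draw via normalised Gamma variates."""
    g = [rng.gammavariate(alpha, 1.0) for _ in range(K)]
    total = sum(g)
    return [v / total for v in g]


def simulate_mixture(X, block_of, n_blocks, beta, rng):
    """X[n]: covariates of cell n; block_of[n]: index b[n]; beta[k]: coefficients of component k."""
    K = len(beta)
    alpha = 1.0 / K  # as on the slide
    pi = [sample_dirichlet(alpha, K, rng) for _ in range(n_blocks)]
    z, y = [], []
    for x_n, b in zip(X, block_of):
        # z_n | pi ~ Categorical(pi_{b[n]})
        k = rng.choices(range(K), weights=pi[b])[0]
        # y_n | z_n = k ~ Poisson(exp(X_n^T beta_k))
        rate = math.exp(sum(c * w for c, w in zip(x_n, beta[k])))
        z.append(k)
        y.append(sample_poisson(rate, rng))
    return pi, z, y


# Illustrative inputs: 6 cells in 2 blocks, K = 2 components, intercept plus one covariate.
rng = random.Random(1)
X = [[1.0, 0.2 * i] for i in range(6)]
block_of = [0, 0, 0, 1, 1, 1]
beta = [[0.0, 0.0], [1.0, 0.5]]
pi, z, y = simulate_mixture(X, block_of, 2, beta, rng)
```

Because the weights $\pi_b$ are shared by all cells in a block, cells in the same block tend to receive the same component, which is how the blocking introduces local structure without a Gaussian process.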

Inference
We use a Metropolis-within-Gibbs scheme (Geman & Geman 1984, Metropolis et al. 1953) with the following two steps:
1. We sample the regression coefficients $\beta_{k,j}$ jointly for all $k = 1, \ldots, K$ and $j = 1, \ldots, J$. The unnormalised density of the conditional distribution is given as
$$p(\beta \mid \alpha, X, y, z) \propto p(y \mid \beta, X, z)\, p(\beta). \qquad (1)$$
Equation 1 is sampled using the Hamiltonian Monte Carlo method (Duane et al. 1987).
2. The mixture allocation can be sampled directly, cell by cell:
$$p(z_n = k \mid z^{\bar{n}}, \alpha, X_n, \beta, y) \propto p(y_n \mid z_n = k, X_n, \beta_k)\, \frac{c^{\bar{n}}_{b[n]k} + \alpha}{K\alpha + \sum_{i=1}^{K} c^{\bar{n}}_{b[n]i}}, \qquad (2)$$
where $c^{\bar{n}}_{b[n]k}$ is the number of cells other than cell $n$ in the encompassing block $b[n]$ assigned to component $k$, and $z^{\bar{n}}$ is the allocation vector with the contribution of cell $n$ removed.
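The allocation step for a single cell can be sketched as follows. This is a minimal illustration of the conditional in Equation 2, assuming the Poisson likelihood of the proposed model; the function names and the toy inputs are hypothetical, and the weights are normalised in log space for numerical stability.

```python
import math
import random


def poisson_logpmf(y, log_rate):
    """log Poisson(y | rate), with the rate given on the log scale."""
    return y * log_rate - math.exp(log_rate) - math.lgamma(y + 1)


def allocation_probs(y_n, x_n, beta, block_counts, alpha):
    """Normalised p(z_n = k | ...) for one cell.

    block_counts[k] is the number of cells in block b[n], excluding cell n,
    that are currently assigned to component k.
    """
    K = len(beta)
    total = K * alpha + sum(block_counts)
    log_w = []
    for k in range(K):
        log_rate = sum(c * w for c, w in zip(x_n, beta[k]))
        log_prior = math.log((block_counts[k] + alpha) / total)
        log_w.append(poisson_logpmf(y_n, log_rate) + log_prior)
    m = max(log_w)
    w = [math.exp(v - m) for v in log_w]
    s = sum(w)
    return [v / s for v in w]


# Illustrative cell: count 3, intercept-only covariate, K = 2 components.
probs = allocation_probs(3, [1.0], [[0.0], [1.0]], block_counts=[3, 1], alpha=0.5)
rng = random.Random(0)
z_n = rng.choices(range(2), weights=probs)[0]  # one Gibbs draw for this cell
```

Each cell needs one likelihood evaluation per component, so a full sweep over the allocation vector costs on the order of N × K evaluations.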

London burglary experiment
◮ One-year point pattern aggregated to a grid with cell size 400 m × 400 m.
◮ Covariates $X(x)$ chosen based on criminological background.
◮ Number of mixture components, $K$, ranges from 1 to 8.
◮ The blocking structure is given by census areas (Middle Layer Super Output Areas, MSOAs).
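The aggregation of a point pattern to a regular grid can be sketched as below. This is an illustration only: it assumes the burglary locations are available as projected coordinates in metres, and the function name, origin and sample points are hypothetical.

```python
from collections import Counter


def aggregate_to_grid(points, cell_size=400.0, origin=(0.0, 0.0)):
    """Count points per square cell; points are (easting, northing) pairs in metres."""
    counts = Counter()
    for x, y in points:
        col = int((x - origin[0]) // cell_size)
        row = int((y - origin[1]) // cell_size)
        counts[(col, row)] += 1
    return counts


# Made-up locations; a point on a cell boundary falls into the higher-indexed cell.
points = [(10.0, 10.0), (399.9, 0.0), (400.0, 0.0), (850.0, 1210.0)]
grid = aggregate_to_grid(points)
```

The resulting per-cell counts play the role of the responses $y_n$, with one entry per grid cell.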

Evaluation
We evaluate the performance using these metrics:
◮ Watanabe-Akaike information criterion (Gelman et al. 2013):
$$\mathrm{WAIC} = -2 \sum_{n=1}^{N} \log\left(\frac{1}{S} \sum_{s=1}^{S} p\left(y_n \mid \theta^{(s)}\right)\right) + 2 \sum_{n=1}^{N} V_{s=1}^{S}\left[\log p\left(y_n \mid \theta^{(s)}\right)\right], \qquad (3)$$
where $V_{s=1}^{S}$ denotes the sample variance over the $S$ posterior draws.
◮ Energy score (Gneiting & Raftery 2007):
$$\text{Energy score} = \frac{1}{S} \sum_{s=1}^{S} \left\| y^{(s)} - \tilde{y} \right\|_2^{\gamma} - \frac{1}{2S^2} \sum_{i=1}^{S} \sum_{j=1}^{S} \left\| y^{(i)} - y^{(j)} \right\|_2^{\gamma}. \qquad (4)$$
◮ Predictive accuracy index (PAI): proportion of crimes occurring in marked hotspots divided by the proportion of the study region marked as hotspots (Chainey et al. 2008).
◮ Predictive efficiency index (PEI): number of crimes predicted by the model for a given area size divided by the maximum number of crimes for the given area size (Hunt 2016).
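The four metrics can be sketched in a few lines each. The code below is a minimal illustration under several assumptions that the slide does not state: the log-likelihoods are stored as a list of S posterior draws by N cells, $\tilde{y}$ is the observed count vector and $y^{(s)}$ are predictive samples, $\gamma$ defaults to 1, all cells have equal area, and hotspots are the n cells with the highest predicted intensity. All names are hypothetical.

```python
import math


def waic(loglik):
    """loglik[s][n] = log p(y_n | theta^(s)) for S posterior draws and N cells."""
    S, N = len(loglik), len(loglik[0])
    lppd, p_waic = 0.0, 0.0
    for n in range(N):
        col = [loglik[s][n] for s in range(S)]
        m = max(col)
        # log of the posterior-mean likelihood, via a stable log-sum-exp
        lppd += m + math.log(sum(math.exp(v - m) for v in col) / S)
        mean = sum(col) / S
        # sample variance of the log-likelihood across draws
        p_waic += sum((v - mean) ** 2 for v in col) / (S - 1)
    return -2.0 * lppd + 2.0 * p_waic


def energy_score(samples, observed, gamma=1.0):
    """Sample-based energy score with Euclidean norm raised to the power gamma."""
    S = len(samples)

    def dist(a, b):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

    term1 = sum(dist(s, observed) ** gamma for s in samples) / S
    term2 = sum(dist(a, b) ** gamma for a in samples for b in samples) / (2 * S * S)
    return term1 - term2


def _crimes_in_top_cells(predicted, actual, n_hotspots):
    order = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)
    return sum(actual[i] for i in order[:n_hotspots])


def pai(predicted, actual, n_hotspots):
    """Share of crimes captured by the hotspots divided by the share of area flagged."""
    hit = _crimes_in_top_cells(predicted, actual, n_hotspots)
    return (hit / sum(actual)) / (n_hotspots / len(actual))


def pei(predicted, actual, n_hotspots):
    """Crimes captured by the hotspots divided by the best achievable for that many cells."""
    hit = _crimes_in_top_cells(predicted, actual, n_hotspots)
    best = sum(sorted(actual, reverse=True)[:n_hotspots])
    return hit / best
```

For example, `pai([5, 1, 3, 0], [4, 0, 1, 5], 2)` flags the first and third cells as hotspots; they contain half of the crimes in half of the area, so the index equals 1, no better than flagging cells at random.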

Results
Figure: Evaluation of the performance of the proposed model, compared to LGCP (panels: WAIC and energy score, each plotted against K). Results are shown for model specifications 1 to 4. Training data: burglary 2015, test data: burglary 2016.

Hotspot performance metrics
Figure: PAI/PEI performance for the proposed and LGCP models, using specification 4 (panels: PAI and PEI, each plotted against n). For the SAM-GLM results, the colour of the line represents the number of components, from K = 1 to K = 7. Training data: burglary 2015, test data: burglary 2016.

Interpretation of results
To effectively compare the effects of a covariate across different mixture components, we consider a covariate importance measure, defined as
$$\mathrm{IMP}_{kj} = 1 - \frac{\sum_n I(z_n = k)\left(y_n - \hat{y}_n^{\tilde{\beta}}\right)^2}{\sum_n I(z_n = k)\left(y_n - \hat{y}_n^{\bar{\beta}_j}\right)^2}. \qquad (5)$$
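The arithmetic of the importance measure can be sketched as follows. The slide does not define the two predictions, so this sketch simply takes them as inputs: `y_hat_full` stands in for $\hat{y}_n^{\tilde{\beta}}$ and `y_hat_reduced` for $\hat{y}_n^{\bar{\beta}_j}$, on the assumption that the latter is a prediction computed with a modified coefficient for covariate $j$. The function name, argument names and numbers are hypothetical.

```python
def covariate_importance(y, z, y_hat_full, y_hat_reduced, k):
    """One minus the ratio of squared-error sums, restricted to cells with z_n = k."""
    num = sum((y_n - a) ** 2 for y_n, z_n, a in zip(y, z, y_hat_full) if z_n == k)
    den = sum((y_n - b) ** 2 for y_n, z_n, b in zip(y, z, y_hat_reduced) if z_n == k)
    return 1.0 - num / den


# Made-up counts, allocations and predictions for three cells.
y = [2, 4, 10]
z = [0, 0, 1]
imp = covariate_importance(y, z, [2, 3, 0], [0, 2, 0], k=0)
```

Under this reading, a value near 1 means the squared error grows sharply when covariate $j$ is altered, a value near 0 means the covariate makes little difference for component $k$, and small negative values are possible.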

Allocations 1
CovEffect, component 1:
log households            0.914 (0.003)   -
intercept                 0.887 (0.004)   +
log POIs (all)            0.275 (0.053)   -
occupation variation      0.152 (0.057)   +
accessibility             0.062 (0.047)   +
residential turnover      0.005 (0.040)   +
log house price           0.005 (0.043)   +
(Semi-)detached houses    0.002 (0.041)   +
ethnic heterogeneity     -0.007 (0.042)   -
Richmond and Bushy parks (A); Osterley Park and Kew botanic gardens (B); Heathrow airport (C); RAF Northolt base and nearby parks (D); parks near Harrow (E); green fields next to Edgware (F); Hyde Park, Regent’s Park, Hampstead Heath (G); Lee Valley (H); London City airport and the industrial zone in Barking (I); Rainham Marshes reserve (J); parks around Bromley (K).

Allocations 2
CovEffect, component 2:
intercept                 0.932 (0.002)   +
log households            0.881 (0.004)   +
log POIs (all)            0.221 (0.035)   +
accessibility             0.144 (0.045)   +
ethnic heterogeneity      0.086 (0.035)   +
occupation variation      0.014 (0.039)   -
log house price           0.011 (0.036)   +
(Semi-)detached houses    0.006 (0.035)   -
residential turnover      0.001 (0.034)   -
Clapham, Balham, and Forest Hill (L); Richmond (M); Southall (N); Ealing, Wembley, and Harrow (O); Chelsea and Kensington (P); Brent and Hampstead (Q); Edgware (R); East Barnet (S); Enfield (T); Haringey and Walthamstow (U); Stratford (V); Romford (W); Orpington (X); Purley (Y); and Twickenham (Z).

Allocations 3
CovEffect, component 3:
intercept                 0.924 (0.003)   +
log POIs (all)            0.720 (0.017)   +
log households            0.530 (0.025)   +
accessibility             0.508 (0.033)   +
ethnic heterogeneity      0.229 (0.051)   +
occupation variation      0.169 (0.057)   +
residential turnover      0.148 (0.040)   -
log house price           0.089 (0.046)   -
(Semi-)detached houses    0.013 (0.040)   +
Soho, Mayfair, Covent Garden, Marylebone, Fitzrovia, London Bridge, Shoreditch (1); Notting Hill and Holland Park (2); Earl’s Court and Fulham (3); Hackney (4); Brent Cross (5); Wembley (6); Twickenham (7); Sutton (8); Croydon (9).

Remarks
◮ The proposed approach allows for fast sampling and achieves performance comparable to LGCP. Drawing one posterior sample from the proposed model has $O(N \times K)$ time complexity, compared to $O(N^3)$ for LGCP.
◮ The model gives insights as to which covariate is important for each component.
◮ The allocation posterior is mostly determined by how well the $\beta$ coefficients explain the log intensity at a given location. The mixture allocation prior does not play a strong role.
◮ Label-switching, which hampers interpretation, is not present for $K \le 5$. It is harder to switch modes in higher dimensions.

Conclusions and further work
Conclusions:
◮ Using stationary GPs is not enough to effectively model point patterns in large urban domains.
◮ The blocking approach can significantly reduce computation time.
◮ More details can be found in the submitted arXiv paper: https://arxiv.org/pdf/1910.05212.pdf
Further work:
◮ Spatial dependence between the blocks.
◮ Non-blocking models such as a Gibbs distribution for mixture allocation.
