an unbiased logistic regression estimating function for
play

An unbiased logistic regression estimating function for spatial - PowerPoint PPT Presentation

An unbiased logistic regression estimating function for spatial Gibbs point processes Rasmus Waagepetersen Department of Mathematical Sciences Aalborg University joint work (in progress !) with Adrian Baddeley, Jean-Francois Coeurjolly and Ege


  1. An unbiased logistic regression estimating function for spatial Gibbs point processes Rasmus Waagepetersen Department of Mathematical Sciences Aalborg University joint work (in progress !) with Adrian Baddeley, Jean-Francois Coeurjolly and Ege Rubak 1 / 22

  2. Gibbs point process and conditional intensity Point process X : random point pattern. Conditional intensity λ ( u , X ): for small region A and u ∈ A , | A | λ ( u , X ) ≈ P ( X has a point in A | X \ A ) GNZ-formula: � � f ( u , X \ u ) = E [ f ( u , X ) λ ( u , X )] d u E u ∈ X for non-negative functions f . X observed in bounded region W . 2 / 22

  3. Parametric model λ θ ( u , X ) for conditional intensity. Strauss: λ θ ( u , X ) = βγ n R ( u , X ) , β > 0 , 0 < γ ≤ 1 u 1[ � u − v � ≤ R ]: number of neighboring points n R ( u , X ) = � v ∈ X \ within distance R from u . Log linear: λ θ ( u , X ) = exp[ θ · t ( u , X )] for some statistic t ( u , X ) E.g. (Strauss): t ( u , X ) = (1 , n R ( u , X )) 3 / 22

  4. Pseudo-likelihood Disjoint subdivision W = ∪ m i =1 C i in ‘cells’ C i . Random indicator variables: N i = 1[ X ∩ C i � = ∅ ] (presence/absence of points in C i ). P ( N i = 1 | X \ C i ) ≈ | C i | λ θ ( u i , X ), u i ∈ C i \ X . 4 / 22

  5. log binary pseudo-likelihood based on N i ’s takes logistic regression form: m � N i log[ | C i | λ θ ( u i , X )] + (1 − N i ) log[1 − | C i | λ θ ( u i , X )] ≈ i =1 m | C i | λ θ ( u i , X ) 1 � N i log 1 + | C i | λ θ ( u i , X ) + (1 − N i ) log 1 + | C i | λ θ ( u i , X ) i =1 Log-linear case | C i | λ θ ( u i , X ) exp( θ · t ( u i , X ) 1 + | C i | λ θ ( u i , X ) = exp( θ · t ( u i , X ) + | C i | − 1 5 / 22

  6. binary pseudo-likelihood converges to spatial point process pseudo-likelihood ( | C i | → 0): � � log λ θ ( u , X \ u ) − λ θ ( u , X ) d u W u ∈ X with score λ ′ θ ( X \ u ) � � λ ′ λ θ ( u , X \ u ) − θ ( u , X ) d u W u ∈ X (unbiased due to GNZ formula) 6 / 22

  7. Bias problems with pseudo-likelihood ◮ Binary pseudo-likelihood: biased due to approximation P ( N i = 1 | X \ C i ) ≈ | C i | λ θ ( u i , X ) ◮ spatial point process pseudo-likelihood: score requires numerical approximation λ ′ θ ( u , X \ u ) � � λ ′ λ θ ( u , X \ u ) − θ ( u , X ) d u ≈ W u ∈ X λ ′ θ ( u , X \ u ) � � λ ′ λ θ ( u , X \ u ) − θ ( v , X ) w ( v ) u ∈ X v ∈ Q for quadrature weights and points w ( v ) , v ∈ Q ⇒ bias. 7 / 22

  8. Unbiased ‘logistic regression’ estimating function ′ ′ θ ( u , X \ u ) ρ ( u ) λ λ θ ( u , X ) � � s ( θ ) = λ θ ( u , X \ u )[ λ θ ( u , X \ u ) + ρ ( u )] − λ θ ( u , X ) + ρ ( u ) u ∈ X u ∈ D D : dummy point process of intensity ρ ( · ) independent of X (random ‘quadrature points’). s ( θ ) derivative of ‘logistic regression’ λ θ ( u , X \ u ) 1 � � log λ θ ( u , X \ u ) + ρ ( u ) + log λ θ ( u , X ) + ρ ( u ) u ∈ X u ∈ D 8 / 22

  9. Advantages: ◮ unbiased by GNZ and Campbell formulae (later slide) ◮ formally logistic regression score - computation easy using glm with logistic link function. ◮ tractable asymptotic distribution of parameter estimate in the stationary case ◮ fast computation - parametric bootstrap feasible in inhomogeneous case 9 / 22

  10. Dummy point process Should be easy to simulate and mathematically tractable. Possibilities: Stratified: + + 1. Poisson process + + 2. binomial point process (fixed number + + + of independent points) + + 3. stratified binomial point process + + + ( stratrand() in spatstat ) + + + + 10 / 22

  11. Relation to pseudo-likelihood We can rewrite logistic regression score ′ ′ λ θ ( u , X \ u ) λ θ ( u , X \ u ) � � s ( θ ) = λ θ ( u , X \ u ) − λ θ ( u , X \ u ) + ρ ( u ) u ∈ X u ∈ ( X ∪ D ) By GNZ and Campbell: ′ λ θ ( u , X \ u ; θ ) � � ′ E λ θ ( u , X \ u ) + ρ ( u ) = E λ θ ( u , X ) d u . (1) W u ∈ ( X ∪ D ) Hence ′ θ ( u , X \ u ; θ ) λ � λ θ ( u , X \ u ) + ρ ( u ) u ∈ ( X ∪ D ) unbiased Monte Carlo approximation of last term in pseudo-likelihood score: λ ′ θ ( u , X \ u ) � � λ ′ λ θ ( u , X \ u ) − θ ( u , X ) d u W u ∈ X 11 / 22

  12. Decomposition of logistic regression score ′ ′ ρ ( u ) λ θ ( u , X \ u ) θ ( u , X ) λ � � s ( θ ) = λ θ ( u , X \ u )[ λ θ ( u , X \ u ) + ρ ( u )] − λ θ ( u , X ) + ρ ( u ) u ∈ X u ∈ D ′ ′ ρ ( u ) λ θ ( u , X \ u ) ρ ( u ) λ θ ( u , X ) � � = λ θ ( u , X \ u )[ λ θ ( u , X \ u ) + ρ ( u )] − λ θ ( u , X ) + ρ ( u ) d u W u ∈ X ′ ′ ρ ( u ) λ θ ( u , X ) λ θ ( u , X ) � � + λ θ ( u , X ) + ρ ( u ) d u − λ θ ( u , X ) + ρ ( u ) W u ∈ D = T 1 + T 2 ′ ′ ρ ( u ) λ θ ( u , X ) θ ( u , X ) � λ � λ θ ( u , X ) + ρ ( u ) d u = E [ λ θ ( u , X ) + ρ ( u ) | X ] ⇒ E [ T 2 | X ] = 0 W u ∈ D 12 / 22

  13. By GNZ formula for X ′ ′ θ ( u , X \ u ) ρ ( u ) λ ρ ( u ) λ θ ( u , X ) � � λ θ ( u , X \ u )[ λ θ ( u , X \ u ) + ρ ( u )] = E E λ θ ( u , X ) + ρ ( u ) d u W u ∈ X so E T 1 = ′ ′ θ ( u , X \ u ) ρ ( u ) λ ρ ( u ) λ θ ( u , X ) � � E [ λ θ ( u , X \ u )[ λ θ ( u , X \ u ) + ρ ( u )] − λ θ ( u , X ) + ρ ( u ) d u ] = 0 W u ∈ X 13 / 22

  14. T 1 only depends on X and E [ T 2 | X ] = 0 ⇒ T 1 and T 2 uncorrelated: C ov [ T 1 , T 2 ] = EC ov [ T 1 , T 2 | X ] + C ov [ E [ T 1 | X ] , E [ T 2 | X ]] = 0 14 / 22

  15. Approximate distribution of parameter estimate Parameter estimate ˆ θ solution of s ( θ ) = 0 First order Taylor approximation: s ( θ ) ≈ S (ˆ θ − θ ) ⇔ ˆ θ ≈ θ + S − 1 s ( θ ) where S = − E [ d d θ s ( θ )] 15 / 22

  16. Approximate distribution of parameter estimate Parameter estimate ˆ θ solution of s ( θ ) = 0 First order Taylor approximation: s ( θ ) ≈ S (ˆ θ − θ ) ⇔ ˆ θ ≈ θ + S − 1 s ( θ ) where S = − E [ d d θ s ( θ )] Thus θ ≈ S − 1 V ar s ( θ ) S − 1 = S − 1 V ar T 1 S − 1 + S − 1 V ar T 2 S − 1 = Σ 1 +Σ 2 V ar ˆ NB: T 1 depends only on X while T 2 involves both X and D . V ar T 2 → 0 if ρ ( · ) → ∞ (dense D ) Hence Σ 2 extra variance due to D . 16 / 22

  17. Asymptotic normality Restrict attention to stationary X and increasing observation window W . T 1 asymptotically N (0 , Σ 1 ) by CLT for Gibbs point process innovations (Coeurjolly et al. , 2012). T 2 | X asymptotically normal N (0 , Σ 2 ) by CLT for independent but not identically distributed random variables. NB: Σ 2 does not depend on X ! 17 / 22

  18. Asymptotic normality Restrict attention to stationary X and increasing observation window W . T 1 asymptotically N (0 , Σ 1 ) by CLT for Gibbs point process innovations (Coeurjolly et al. , 2012). T 2 | X asymptotically normal N (0 , Σ 2 ) by CLT for independent but not identically distributed random variables. NB: Σ 2 does not depend on X ! Case of stratified points: m ′ ′ θ ( U i , X ) ρ ( u ) λ θ ( u , X ) λ � � T 2 = − [ λ θ ( U i , X ) + ρ ( U i ) − λ θ ( u , X ) + ρ ( u ) d u ] C i i =1 where W = ∪ i C i and U i uniform on C i . 18 / 22

  19. Asymptotic normality Restrict attention to stationary X and increasing observation window W . T 1 asymptotically N (0 , Σ 1 ) by CLT for Gibbs point process innovations (Coeurjolly et al. , 2012). T 2 | X asymptotically normal N (0 , Σ 2 ) by CLT for independent but not identically distributed random variables. NB: Σ 2 does not depend on X ! Case of stratified points: m ′ ′ θ ( U i , X ) ρ ( u ) λ θ ( u , X ) λ � � T 2 = − [ λ θ ( U i , X ) + ρ ( U i ) − λ θ ( u , X ) + ρ ( u ) d u ] C i i =1 where W = ∪ i C i and U i uniform on C i . Conclusion ˆ θ ≈ θ + S − 1 T 1 + S − 1 T 2 ≈ N ( θ, S − 1 [Σ 1 + Σ 2 ] S − 1 ) 19 / 22

  20. Preliminary simulation results: Strauss process Strauss process with β = 200 γ = 0 . 5 R = 0 . 05 on unit square. ◮ Compare numerically approximated pseudo-likelihood (implementation in spatstat ) with logistic regression score. ◮ variance decomposition Mean number of Poisson quadrature/dummy-points: 625 , 25000 , 10000 , 40000. 20 / 22

  21. Bias and variance Mean and std. dev. of parameter estimates and proportion of variance due to D in % for logistic regression score. Numbers in [] are with spatstat implementation of pseudo-likelihood. β = 200 γ = 0 . 5 E ˆ sd ˆ ρ β β % E ˆ γ sd ˆ γ % 625 204.2 [168.9] 37.8 13 0.502 [0.62] 0.114 5.8 2500 203.4 [188.9] 35.9 4 0.502 [0.55] 0.112 1.5 100 2 203.4 [202.1] 35.3 1 0.502 [0.51] 0.111 0.4 200 2 203.3 [205.5] 35.2 0 0.502 [0.505] 0.111 0.1 Bias small and does not depend on intensity of dummy points for logistic regression score. Even smaller variance contributions from D if stratified dummy points used ! Bias problems with spatstat ! 21 / 22

  22. To be done/work in progress: ◮ further simulation studies ◮ applications to data examples ◮ implementation of estimation procedure and computation of asymptotic covariance matrix in spatstat is on the way ! Thanks for your attention ! 22 / 22

Recommend


More recommend