Small area estimation of proportions of Arsenic affected wells in Bangladesh
By Sanghamitra Pal, West Bengal State University, India
(Joint work with Prof. Partha Lahiri)
Sanghamitra Pal, SAE 2013, Bangkok, Sept 2013

Agenda
• Problem Statement
• Proposed Solution
• Simulation Results
• Conclusion
• References

Problem Statement

Arsenic – a Health Hazard
• Arsenic (As): a toxic metalloid, widespread in groundwater in many countries
• As-affected regions: India (especially Bengal), Bangladesh, Nepal, Thailand, China, Mongolia, Tibet, Viet Nam, Laos, Cambodia, Myanmar, various South American countries, and areas in North America and Western Australia
• Negative health impacts are related to its concentration in food or water

As Level Limits
• WHO guideline for the maximum level of As in drinking water: 10 µg/L for safe water
• Different countries have adopted different standards for As
• Bangladesh: 50 µg/L

Data Map
• In 1997 the British Geological Survey (BGS) carried out the project “Survey on Arsenic affected wells in Bangladesh”
• A sample of 3540 wells was surveyed to measure arsenic levels
• Here we estimate the district-wise proportion of wells with an arsenic level below the threshold value

Data: BGS Survey on As of Bangladesh

Sample_ID  Latitude  Longitude  Yr_Const  Well type  Well depth (m)  Owner  Division    District    As (µg/L)
S-98-00    22.87     90.78      1992      Shallow    10.7            --     Chittagong  Lakshmipur  13
S-98-01    23.02     90.87      1971      HP         7.2             --     Dhaka       Faridpur    256

[Figure: Map showing the distribution of As in Mandari]

Problem & proposed solution
• District-wise proportion of arsenic affected wells
• A problem of small area estimation
• Districts: small areas (number of districts = 61)
• Normal/Normal model
• Beta-Binomial model
• Benchmarking (number of divisions = 7)

Problem
• $y_{ij}$ = arsenic level for well $j$ in the $i$th district; $t$ = threshold value
• Indicator: $I(y_{ij} \le t) = 1$ when well $j$ is within the threshold, $i = 1, \dots, m$, where $m$ = number of districts
• Population proportion: $\pi_i = \dfrac{\#\{\text{wells} < t \text{ in the population}\}}{N_i}$
• Sample proportion: $p_i = \dfrac{\#\{\text{wells} < t \text{ in the sample}\}}{n_i}$
• $N_i$ = population size for the $i$th district; $n_i$ = sample size for the $i$th district
• Covariate: $x_i$ = coverage (persons per water source) in district $i$

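The sample proportion defined on this slide can be sketched in a few lines. This is only an illustration: the function name, the (district, level) record layout and the toy values are assumptions, not the BGS data format.

```python
from collections import defaultdict

def sample_proportions(records, t):
    """Return {district: share of sampled wells with arsenic level below t}."""
    below = defaultdict(int)   # wells with level < t, per district
    total = defaultdict(int)   # sample size n_i, per district
    for district, level in records:
        total[district] += 1
        if level < t:
            below[district] += 1
    return {d: below[d] / total[d] for d in total}

# Toy records (hypothetical values), threshold t = 50 µg/L
records = [("A", 13.0), ("A", 256.0), ("A", 40.0), ("B", 5.0), ("B", 49.0)]
props = sample_proportions(records, t=50.0)
```

With the Bangladesh standard as threshold, district "A" has two of three sampled wells below 50 µg/L and district "B" has both.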
The Fay-Herriot Model (FH Model)
Sampling model:
$$p_i \mid \pi_i \overset{ind}{\sim} N(\pi_i, D_i)$$
Linking model:
$$\pi_i \overset{ind}{\sim} N(x_i'\beta, A)$$
Linear mixed model:
$$p_i = \pi_i + e_i = x_i'\beta + V_i + e_i, \qquad e_i \sim N(0, D_i), \quad V_i \sim N(0, A)$$
$D_i$: sampling variance (known)
$A$: model variance (unknown)
(Fay-Herriot, 1979)

Small area estimation
Fay-Herriot (FH) Model (1979)
An empirical Bayes estimator of $\pi_i$ is given by
$$\hat{\pi}_i^{EB} = (1 - \hat{B}_i)\, p_i + \hat{B}_i\, \hat{\mu}_i$$
$$\hat{B}_i = \frac{D_i}{\hat{A} + D_i}, \qquad D_i = \frac{p\,q}{n_i} \ \text{(Morris, 1983)}, \qquad p = \frac{\sum_{j=1}^{m} N_j p_j}{\sum_{j=1}^{m} N_j}, \quad q = 1 - p$$
$$\hat{\mu}_i = x_i^T \hat{\beta}, \qquad \hat{\beta} = (\hat{\beta}_0, \hat{\beta}_1)^T$$
$$\hat{\beta} = (X^T \hat{V}^{-1} X)^{-1} X^T \hat{V}^{-1} \mathbf{p}$$
$$\mathbf{p} = (p_1, \dots, p_m)^T, \qquad V = \mathrm{diag}(A + D_1, \dots, A + D_m)$$
$\hat{A}, \hat{\beta}_0, \hat{\beta}_1$ are obtained from REML.

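A minimal sketch of the shrinkage step on this slide, for an intercept plus one covariate. It assumes a value of the model variance A is already available (the slides obtain it by REML, which is not implemented here); the function name and the toy inputs are hypothetical.

```python
def fh_eb(p, x, D, A):
    """EB estimates under the FH model with design rows (1, x_i).

    A is taken as given; estimating it by REML is outside this sketch.
    """
    w = [1.0 / (A + d) for d in D]            # diagonal of V^{-1}
    # Entries of X^T V^{-1} X (2x2) and X^T V^{-1} p (2x1)
    s0 = sum(w)
    s1 = sum(wi * xi for wi, xi in zip(w, x))
    s2 = sum(wi * xi * xi for wi, xi in zip(w, x))
    t0 = sum(wi * pi for wi, pi in zip(w, p))
    t1 = sum(wi * xi * pi for wi, xi, pi in zip(w, x, p))
    det = s0 * s2 - s1 * s1
    b0 = (s2 * t0 - s1 * t1) / det            # weighted least squares intercept
    b1 = (s0 * t1 - s1 * t0) / det            # weighted least squares slope
    estimates = []
    for pi, xi, d in zip(p, x, D):
        B = d / (A + d)                       # shrinkage factor B_i
        mu = b0 + b1 * xi                     # synthetic part x_i^T beta
        estimates.append((1.0 - B) * pi + B * mu)
    return (b0, b1), estimates

# Toy inputs: direct estimates that lie exactly on a line in x
x = [0.0, 1.0, 2.0, 3.0]
p = [0.2, 0.3, 0.4, 0.5]
D = [0.01, 0.02, 0.01, 0.03]
(b0, b1), eb = fh_eb(p, x, D, A=0.005)
```

Because the toy direct estimates are exactly linear in x, the regression reproduces them and shrinkage leaves them unchanged, which makes the arithmetic easy to check by hand.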
Fay-Herriot Model (Contd.)
MSE estimation: Datta-Lahiri (2000), Prasad-Rao (1990)
$$mse(\hat{\pi}_i^{EB}) = g_{1i}(\hat{A}) + g_{2i}(\hat{A}) + 2\, g_{3i}(\hat{A})$$
where
$$g_{1i}(A) = (1 - B_i)\, D_i$$
$$g_{2i}(A) = B_i^2\, Var(x_i^T \hat{\beta}) = B_i^2\, x_i^T \left( \sum_{j=1}^{m} \frac{x_j x_j^T}{A + D_j} \right)^{-1} x_i$$
$$g_{3i}(A) = \frac{D_i^2}{(A + D_i)^3} \cdot \frac{2}{\sum_{j=1}^{m} (A + D_j)^{-2}}$$

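The three MSE components can be evaluated directly once A is fixed. The sketch below assumes design rows (1, x_i) and a given A; the function name and the toy inputs are illustrative only.

```python
def fh_mse(x, D, A):
    """Return g1 + g2 + 2*g3 for each area, with design rows (1, x_i)."""
    w = [1.0 / (A + d) for d in D]
    # Entries of sum_j x_j x_j^T / (A + D_j), a symmetric 2x2 matrix
    s0 = sum(w)
    s1 = sum(wi * xi for wi, xi in zip(w, x))
    s2 = sum(wi * xi * xi for wi, xi in zip(w, x))
    det = s0 * s2 - s1 * s1
    avar_A = 2.0 / sum(wi * wi for wi in w)   # 2 / sum_j (A + D_j)^(-2)
    out = []
    for xi, d in zip(x, D):
        B = d / (A + d)
        g1 = (1.0 - B) * d
        # Quadratic form x_i^T M^{-1} x_i using the explicit 2x2 inverse
        quad = (s2 - 2.0 * s1 * xi + s0 * xi * xi) / det
        g2 = B * B * quad
        g3 = d * d / (A + d) ** 3 * avar_A
        out.append(g1 + g2 + 2.0 * g3)
    return out

# Two symmetric toy areas, chosen so each component equals 0.5
mse = fh_mse(x=[-1.0, 1.0], D=[1.0, 1.0], A=1.0)
```

For this toy case g1 = g2 = g3 = 0.5 in both areas, so each estimated MSE is 0.5 + 0.5 + 2(0.5) = 2.0.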
Arc-Sine Transformation
• Apply the FH model above to the transformed data
• Back-transform to get a CI for the population proportion
$$y_i = \sqrt{n_i}\, \sin^{-1}(2 p_i - 1)$$
$$\theta_i = \sqrt{n_i}\, \sin^{-1}(2 \pi_i - 1)$$

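A small sketch of the transformation and its inverse. It assumes the variance-stabilizing form with a square-root sample-size factor, sqrt(n) * asin(2p - 1); the ±1.96 interval on the transformed scale is illustrative only (it presumes unit sampling variance) and is not the EB interval of the slides. Function names are hypothetical.

```python
import math

def arcsine(p, n):
    """Transform a proportion p observed on n units."""
    return math.sqrt(n) * math.asin(2.0 * p - 1.0)

def back_transform(theta, n):
    """Map a value on the transformed scale back to a proportion."""
    # Clip to the range of asin so the back-transformation stays monotone
    z = max(-math.pi / 2.0, min(math.pi / 2.0, theta / math.sqrt(n)))
    return (math.sin(z) + 1.0) / 2.0

y = arcsine(0.3, 40)
lower = back_transform(y - 1.96, 40)   # illustrative endpoints only
upper = back_transform(y + 1.96, 40)
```

Back-transforming the endpoints rather than the standard error keeps the interval inside [0, 1], which is the practical reason for working on the transformed scale.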
Benchmarking

Benchmarking
• Seven divisions (large areas) in Bangladesh
• Use the division-level data for benchmarking

Benchmarking with Divisions
With the FH model, define for division $j = 1, 2, \dots, 7$ (where $d_j$ = number of districts in division $j$):
$$W_{kj} = \frac{N_k}{\sum_{i=1}^{d_j} N_i}, \qquad p_j = \sum_{k=1}^{d_j} W_{kj}\, p_k, \qquad se(p_j) = \sqrt{\sum_{k=1}^{d_j} W_{kj}^2\, \frac{p_k q_k}{n_k}}$$
$$l_j = p_j - 1.96\, se(p_j), \qquad u_j = p_j + 1.96\, se(p_j)$$
Benchmarked confidence intervals for district $i$ in division $j$:
$$\left( \frac{l_j\, \hat{\pi}_{i,lower}}{\sum_{k=1}^{d_j} W_{kj}\, \hat{\pi}_{k,lower}}, \ \ \frac{u_j\, \hat{\pi}_{i,upper}}{\sum_{k=1}^{d_j} W_{kj}\, \hat{\pi}_{k,upper}} \right)$$
where
$$\hat{\pi}_{i,lower} = \hat{\pi}_i^{EB} - 1.96\, se(\hat{\pi}_i^{EB}), \qquad \hat{\pi}_{i,upper} = \hat{\pi}_i^{EB} + 1.96\, se(\hat{\pi}_i^{EB})$$

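The ratio adjustment on this slide rescales the district bounds within a division so that their population-weighted mean matches the division-level bound. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
def benchmark_bounds(bounds, N, target):
    """Rescale district bounds so their N-weighted mean equals `target`.

    bounds: lower (or upper) limits for the districts of one division
    N:      district population sizes N_k
    target: division-level limit l_j (or u_j)
    """
    total = sum(N)
    W = [n / total for n in N]                         # weights W_kj
    weighted = sum(w * b for w, b in zip(W, bounds))   # sum_k W_kj * bound_k
    return [b * target / weighted for b in bounds]

# Toy division with three districts (hypothetical values)
N = [100, 300, 600]
lower = [0.2, 0.4, 0.3]
bench_lower = benchmark_bounds(lower, N, target=0.25)
check = sum(n / sum(N) * b for n, b in zip(N, bench_lower))
```

The same function would be applied once to the lower limits with l_j and once to the upper limits with u_j; `check` recovers the target, which is the benchmarking constraint.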
Approximate Bayesian method: Beta-Binomial Model
Beta-Binomial:
$$u_i \mid \pi_i \sim Bin(n_i, \pi_i)$$
$$\pi_i \sim Beta[\mu_i, \ \gamma \mu_i (1 - \mu_i)]$$
$$\mu_i = \frac{\exp(b_0 + b_1 x_i)}{1 + \exp(b_0 + b_1 x_i)}$$
(Lohr-Rao, 2009)
Approximate Bayesian method:
$$\pi_i \mid data \sim Beta\left( \text{mean} = \hat{\pi}_i^{EB} = (1 - \hat{B}_i)\, p_i + \hat{B}_i\, \hat{\mu}_i, \ \ \text{variance} = \nu_i \right)$$

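The logistic mean function of the Beta-Binomial model is easy to evaluate; the sketch below uses a hypothetical function name and arbitrary coefficient values.

```python
import math

def logistic_mean(x, b0, b1):
    """mu = exp(b0 + b1*x) / (1 + exp(b0 + b1*x)), written in an equivalent form."""
    z = b0 + b1 * x
    return 1.0 / (1.0 + math.exp(-z))

mu_at_zero = logistic_mean(0.0, 0.0, 1.0)
mu_example = logistic_mean(2.0, 0.1, 0.3)   # arbitrary illustrative coefficients
```

The logistic form keeps the prior mean between 0 and 1 for any value of the covariate, unlike the linear mean of the FH model.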
Approximate Bayesian (Contd.)
$$\text{variance} = \nu_i = \hat{C}_i\, \hat{\pi}_i^{EB} (1 - \hat{\pi}_i^{EB}) - \frac{m-1}{m} \sum_{j=1}^{m} \left\{ \hat{C}_i(-j)\, \hat{\pi}_i^{EB}(-j) \left[ 1 - \hat{\pi}_i^{EB}(-j) \right] - \hat{C}_i\, \hat{\pi}_i^{EB} (1 - \hat{\pi}_i^{EB}) \right\} + \frac{m-1}{m} \sum_{j=1}^{m} \left[ \hat{\pi}_i^{EB}(-j) - \hat{\pi}_i^{EB} \right]^2$$
(Rao, 2003)
$\hat{\pi}_i^{EB}(-j)$ are calculated with the Bayesian jackknife formula (delete-one).

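The jackknife variance on this slide combines a leading term, a bias correction and a spread term. The sketch below assumes the full-sample and delete-one quantities for one area have already been computed; the function name, argument names and numbers are hypothetical.

```python
def jackknife_variance(C, pi, C_del, pi_del):
    """Jackknife variance for one area.

    C, pi:          full-sample C_i and EB estimate
    C_del, pi_del:  lists of their delete-one versions, one entry per deleted area
    """
    m = len(pi_del)
    lead = C * pi * (1.0 - pi)
    # Bias correction: average change in the leading term across deletions
    bias = sum(c * q * (1.0 - q) - lead for c, q in zip(C_del, pi_del))
    # Spread of the delete-one estimates around the full-sample estimate
    spread = sum((q - pi) ** 2 for q in pi_del)
    return lead - (m - 1) / m * bias + (m - 1) / m * spread

# Toy case with m = 2 (made-up numbers)
nu = jackknife_variance(C=0.1, pi=0.5, C_del=[0.1, 0.1], pi_del=[0.4, 0.6])
```

In the toy case the leading term is 0.025, the bias sum is -0.002 and the spread sum is 0.02, giving 0.025 + 0.001 + 0.010 = 0.036.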
Approximate Bayesian (Contd.)
Confidence Interval with Beta-Binomial
• Calculate the shape parameters of the Beta distribution, then find the CI
• Calculate the benchmarked estimates, proceeding as above

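The first bullet can be sketched by moment matching: given a mean and a variance, solve for the two Beta shape parameters. This is a standard identity rather than anything specific to the slides, and the function name is hypothetical.

```python
def beta_shapes(mean, variance):
    """Shape parameters (a, b) of a Beta distribution with the given mean and variance."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if not 0.0 < variance < mean * (1.0 - mean):
        raise ValueError("variance must be positive and below mean * (1 - mean)")
    k = mean * (1.0 - mean) / variance - 1.0   # equals a + b
    return mean * k, (1.0 - mean) * k

a, b = beta_shapes(0.3, 0.01)
```

Here a + b = 20, so a is about 6 and b about 14. The interval limits would then be lower and upper quantiles of Beta(a, b), for which a library routine such as `scipy.stats.beta.ppf` could be used.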
Simulation Results

Simulation
• Data source: BGS Survey in Bangladesh, 1997
• We adopt a design-based approach to assess the performance of the estimators
• Pseudo-population: generate $4n_i$ units for domain $i$ to get a population
• For simplicity only, we use SRSWR to draw the samples

Population ($N_i = 4n_i$)
⇓ SRSWR
Sample ($n_i$)

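One way the simulation step could look in code. The slide does not say how the 4n_i population units are generated, so replicating each sampled value four times is an assumption made here purely for illustration; the function names, seed and values are also hypothetical.

```python
import random

def pseudo_population(sample_values, factor=4):
    """Assumed construction: repeat the sampled values `factor` times."""
    return list(sample_values) * factor

def srswr(population, n, rng):
    """Simple random sample of size n drawn with replacement."""
    return rng.choices(population, k=n)

rng = random.Random(2013)                          # fixed seed for repeatability
district_sample = [13.0, 256.0, 40.0, 7.5, 61.0]   # toy arsenic levels
population = pseudo_population(district_sample)    # N_i = 4 * n_i
replicate = srswr(population, len(district_sample), rng)
```

Repeating the last line inside a loop would produce the replicated samples from which coverage and interval length are computed.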
Simulation – Comparison Criterion
• ACP: Actual Coverage Percentage (the closer to 95, the better)
• AL: Average Length of the CI (the smaller, the better)
• ACP, ACV and AL are all calculated from replicated samples (1000 samples)

Methods compared:
(1) CI_Normal: CI where MSE estimation is by the Datta-Lahiri (REML) method
(2) CI_Normal_Bench: benchmarking on CI_Normal
(3) Arc_Sine: arc-sine transformation
(4) Beta: Beta-Binomial model
(5) Beta_Bench: benchmarking on Beta

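The two criteria can be computed from the replicated intervals as follows. This is a sketch with a hypothetical function name and made-up intervals; the slides use 1000 replicates per district.

```python
def acp_and_al(intervals, truth):
    """Actual coverage percentage and average length over replicated intervals."""
    covered = sum(1 for lo, hi in intervals if lo <= truth <= hi)
    acp = 100.0 * covered / len(intervals)
    al = sum(hi - lo for lo, hi in intervals) / len(intervals)
    return acp, al

# Four toy intervals for one district whose true proportion is 0.25
intervals = [(0.10, 0.30), (0.20, 0.40), (0.35, 0.50), (0.00, 0.25)]
acp, al = acp_and_al(intervals, truth=0.25)
```

Three of the four toy intervals contain 0.25, so ACP is 75 and the average length is 0.2.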
Results – Summary of ACP values

Summary   n_i   CI_Normal   CI_Normal_Bench   Arc_Sine   Beta   Beta_Bench
Min       15    47          59                45         37     76
1st Qu.   43    88          92                67         76     88
Median    53    95          97                91         89     92
Mean      57    89          91                81         84     90
3rd Qu.   76    100         98                96         94     95
Max       110   100         99                99         99     99

[Figure: Box plots of ACP values under different methods]

Results
[Figure legend: Red = CI_Normal; Green = CI_Normal_Bench; Yellow = Arc-Sine; Black = Beta; Blue = Beta_Bench]