


  1. Poverty Mapping in the World Bank: Our Work and Lessons Learned

Qinghua Zhao
Development Research Group
The World Bank
(202) 473-1273
Qzhao@worldbank.org

Believing in PovMap? Why?

$$\ln \tilde{y}_{ch} = x_{ch}' \tilde{\beta} + \tilde{\eta}_c + \tilde{\varepsilon}_{ch}$$

• $x_{ch}' \tilde{\beta}$: people with the same characteristics have the same income
• $\tilde{\eta}_c$: earnings differ by location
• $\tilde{\varepsilon}_{ch}$: even people with the same characteristics may earn differently

How to Verify the Result?

• Not an easy job.
• Few existing statistical data sets can be used for verification:
  - not detailed enough
  - no standard error given
  - data collected differently
• An oversampled survey could provide a solution, but it is very costly.
• Kenya: great story on the sub-district dimension

Is the Solution Robust?

• Not necessarily. It depends on the implementation.
• A good model does give stable results.
• Example: Chinese agriculture census and national census

Possible factors:
  - it is the matched variables that matter
  - variation in the variables helps
  - basic living conditions are always important

What Makes a Good Model?

• Matched variables between survey and census
• Variables on living conditions (wall material, toilet, roof, kitchen, …)
• Interaction terms (may not be meaningful)
• Variables with significant deviation
• Correct weighting

Rethinking the income model with sub-population

• The well-being of the handicapped population
• Whose income model? Everybody or the handicapped?
• The result still overestimates well-being, but what can we do?

  2. Using the Results from Poverty Mapping

• Similar problem to the 2SLS estimate. Adjustment needed.
• Spatial analysis
• Deforestation vs. poverty
• Poverty and crime

Part II: Software Tools for Poverty Mapping

Typical Simulation Model

• The fully specified simulation model is defined as follows:

$$\ln \tilde{y}_{ch} = x_{ch}' \tilde{\beta} + \tilde{\eta}_c + \tilde{\varepsilon}_{ch}$$

• where $\tilde{\beta} \sim N(\hat{\beta}, \hat{\Sigma}_{\beta})$
• $\tilde{\eta}_c$ is a random variable (normally distributed or T-distributed)
• $\tilde{\varepsilon}_{ch}$ is a random variable (normally distributed or T-distributed)
• $\tilde{\alpha} \sim N(\hat{\alpha}, \hat{\Sigma}_{\alpha})$
• The variance of $\tilde{\varepsilon}_{ch}$ is

$$\tilde{\sigma}^2_{\varepsilon,ch} = \frac{AB}{1+B} + \frac{1}{2}\,\widehat{\mathrm{Var}}(r)\,\frac{AB(1-B)}{(1+B)^3}$$

  where $B = \exp(Z_{ch}^T \tilde{\alpha})$, and $Z_{ch}$ is a function of $y_{ch}$ and $X_{ch}$.

Challenges

• Storage
  - large dataset (about 80M per 1 million households with 20 variables)
• Speed
  - computing poverty measurements such as the Gini index and quintiles
  - random number generation

Basic Framework of Poverty Mapping

[Diagram. Labels: Survey; Estimator; Estimated result (B A C H); census (nObs*nVars); Evaluator; estimated y (nObs*nSim); Measurement Computation; result]

Design Goals

• Highest speed
• Least memory usage
• Database connectivity
• Parallel processing
• Flexibility
• Zero installation
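The "Typical Simulation Model" slide above can be illustrated with a minimal sketch of a single simulation pass. It is not PovMap's code. It assumes that the draws of the coefficient vectors (`beta`, `alpha`) from their multivariate normal distributions have already been made elsewhere, that both random effects are normal rather than T-distributed, and that `A`, `var_r`, and `sd_eta` are supplied by the estimation step. All function and parameter names are hypothetical.

```python
import math
import random


def idiosyncratic_variance(z, alpha, A, var_r):
    """Household-level variance from the slide's formula:
    A*B/(1+B) + 0.5*Var(r)*A*B*(1-B)/(1+B)^3, with B = exp(z' alpha)."""
    B = math.exp(sum(zi * ai for zi, ai in zip(z, alpha)))
    return A * B / (1 + B) + 0.5 * var_r * A * B * (1 - B) / (1 + B) ** 3


def simulate_once(x_rows, z_rows, cluster_ids, beta, alpha, sd_eta, A, var_r, rng):
    """One simulated ln(y) per census household, given already-drawn beta and alpha.

    Assumption: cluster and household effects are both drawn from normal
    distributions; the slides also allow T-distributed effects.
    """
    # One location effect per cluster, shared by every household in it.
    eta = {c: rng.gauss(0.0, sd_eta) for c in sorted(set(cluster_ids))}
    ln_y = []
    for x, z, c in zip(x_rows, z_rows, cluster_ids):
        # Guard against a tiny negative variance from the approximation.
        sigma2 = max(idiosyncratic_variance(z, alpha, A, var_r), 0.0)
        eps = rng.gauss(0.0, math.sqrt(sigma2))
        xb = sum(xi * bi for xi, bi in zip(x, beta))
        ln_y.append(xb + eta[c] + eps)
    return ln_y


# Hypothetical toy data: three households in two clusters.
rng = random.Random(12345678)
draw = simulate_once(
    x_rows=[[1.0, 2.0], [1.0, 3.0], [1.0, 1.0]],
    z_rows=[[0.1], [0.2], [0.3]],
    cluster_ids=[0, 0, 1],
    beta=[4.0, 0.3],
    alpha=[1.0],
    sd_eta=0.1,
    A=1.05,
    var_r=0.5,
    rng=rng,
)
```

Repeating the pass nSim times, each with fresh parameter draws, would fill the nObs*nSim block of estimated y shown in the framework diagram.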

  3. Approach 1: Complicated SAS Macro

[Diagram. Labels: SAS Program; survey; Estimator; Estimated result (B A C H); census (nObs*nVars); Evaluator; estimated y (nObs*nSim); Measurement Computation; result]

Performance: 3 hours per 1M

Approach 2: Distributed Computing Model

[Diagram. Labels: Server Side: survey, Estimator, Estimated result. Client Side: dispatcher; Computer 1, Computer 2, Computer 3, …, Computer n; census; Evaluator; Measurement Computation; result (B A C H)]

Approach 3: Basic Structure of PovMap

[Diagram. Labels: Step 1: Data Prep. (census, Model Specification, packer, packed census; nObs*nVars, nObs*nlRecL). Step 2: PovMap.exe (survey, Estimator, Simulation Configuration, Estimated result (B A C H), Evaluator, Measurement Computation, result)]

Why Pack the Intermediate Data?

Three memory modes:
  1. N*8*(m+2)
  2. N*(lrecl+16)
  3. N*16

[Diagram label: 8 bits]

Approach 2a: Packer Written in SAS Macro

[Diagram. Labels: census (SAS format); Model Specification; packer (SAS macro, SAS); packed census; survey; Estimator; Simulation Configuration; Estimated result (B A C H); PovMap.exe; Evaluator; Measurement Computation; result]

Approach 2b: Packer Not Dependent on SAS

[Diagram. Labels: census (SAS format, Stata, dBase, ASCII); Model Specification; packer (PovMapPacker, DataPrep, FileAux); packed census; survey; Estimator; Simulation Configuration; Estimated result (B A C H); PovMap.exe; Evaluator; Measurement Computation; result]
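The three memory-mode formulas above can be turned into a small calculator. This sketch makes several assumptions not stated in the slides: N is the number of census observations, m the number of variables, lrecl the packed record length in bytes, the formulas give bytes, and the tool picks the lowest-numbered mode that fits the memory budget. The function names and the example values m=20 and lrecl=31 are hypothetical.

```python
def memory_bytes(n_obs, n_vars, lrecl):
    """Bytes needed by each of the three memory modes listed on the slide."""
    return {
        1: n_obs * 8 * (n_vars + 2),   # mode 1: N*8*(m+2)
        2: n_obs * (lrecl + 16),       # mode 2: N*(lrecl+16)
        3: n_obs * 16,                 # mode 3: N*16
    }


def pick_mode(n_obs, n_vars, lrecl, memory_mb):
    """Return the lowest-numbered mode that fits in memory_mb, or None.

    Assumption: lower-numbered modes are preferred when they fit; the
    slides do not spell out the selection rule.
    """
    budget = memory_mb * 1024 * 1024
    for mode, need in sorted(memory_bytes(n_obs, n_vars, lrecl).items()):
        if need <= budget:
            return mode
    return None


# Hypothetical inputs: 166625 observations, 20 variables, 31-byte packed records.
for mb in (128, 20, 5):
    print(mb, "MB ->", pick_mode(166625, 20, 31, mb))
```

Packing matters because mode 2 scales with the packed record length rather than with 8 bytes per variable, so a compact record lets a larger census fit in the same budget.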

  4. Modeling Specification

    srvdata=HLANDS_SURV.dta
    lhs=ltexpae
    rhs=lsize youth mf1550 adultread food animal i218 sch_inc_201
        sch_inc_218 n_rooms hh_size_m married_m hdschyr_m hdschyr_m2
        food_m rvalue rain_metre slopedum
    arhs= _Yhat_ sch_inc_218 tree_m food_m rain_m_sq hdschyr_m
    Cluster=CLUSTER
    sWeight=PERSWEIG
    cendata=hlands_census.dta
    cWeight=n_ad_eq
    cKeyVar=cluster
    dataout=hlands.pda
    LOCERR=YES
    end

Simulation Specification

    nSim=100
    CDist=n
    HDist=n
    PovLine=195
    memorysize=128
    MinImpute=auto
    maximpute=auto
    abound=none
    bbound=.99
    cbound=auto
    hbound=auto
    seed=12345678
    INDICES=FGT0 FGT1 FGT2 GE00 GE05 GE10 GE15 GE20 ATK2 GINI Dist:20
    ydump=1
    Simulation=0 3 5

Unsolved Problems

• parameterize the error term
• outlier and trimming
• sensitivity analysis
• awkward structure
• inflexible

Performance: What Determines the Speed

  - memory size
  - indicator
  - random number generation

    nObs     Size(M)   Random    Memory   Mode   Indicator   Time(sec)
    166625   5.12      T(8),N    128      1      HC+GINI     41
    166625   5.12      T(8),N    20       2      HC+GINI     65
    166625   5.12      T(8),N    5        3      HC+GINI     73
    166625   5.12      N,N       5        3      HC+GINI     58
    1.8m     74        N,N       128      2      HC only     647
    1.8m     74        N,N       128      2      HC+GINI     961

Complications…

• small survey into big survey
• census collected with a fixed ratio
• survey into a subset of census
• sensitivity analysis
• variance decomposition
• income modeled by simultaneous equations
• income estimated by maximum likelihood model
• better estimation of the cluster effect
• better estimation of the idiosyncratic effect

Solutions: New Structure for More Flexibility

• Modular design for easy expansion.
• A single simulator can't satisfy all the different needs; users can build their own simulator with a C++ compiler.
• Multithreading between disk I/O and numeric computation.
• Random number generation triggered by data events.
• Flexible functional form and looping.
• Adding a free C++ compiler.
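The INDICES entry in the simulation specification above names the FGT family and the Gini index, and PovLine=195 sets the poverty line. As a rough illustration of the "measurement computation" step, the sketch below computes weighted FGT and Gini values for one simulated income vector and then summarizes an index across simulations. It uses the textbook definitions and is not taken from PovMap. Reporting the mean across simulations as the point estimate and the standard deviation as its standard error is an assumption here, and all names and data are hypothetical.

```python
import statistics


def fgt(incomes, weights, poverty_line, a):
    """Weighted Foster-Greer-Thorbecke index.

    a=0 gives the headcount ratio, a=1 the poverty gap, a=2 the severity.
    """
    total_w = sum(weights)
    shortfall = sum(
        w * ((poverty_line - y) / poverty_line) ** a
        for y, w in zip(incomes, weights)
        if y < poverty_line
    )
    return shortfall / total_w


def gini(incomes, weights):
    """Weighted Gini index from the area under the Lorenz curve.

    Assumes non-negative incomes with a positive weighted total.
    """
    pairs = sorted(zip(incomes, weights))
    total_w = sum(w for _, w in pairs)
    total_y = sum(y * w for y, w in pairs)
    cum_y = 0.0
    area = 0.0
    for y, w in pairs:
        prev = cum_y
        cum_y += y * w
        area += w * (prev + cum_y) / 2.0   # trapezoid under the unscaled curve
    return 1.0 - 2.0 * area / (total_w * total_y)


def summarize(values_by_simulation):
    """Mean and standard deviation of one index across simulations."""
    return statistics.mean(values_by_simulation), statistics.stdev(values_by_simulation)


# Hypothetical example: two simulated income vectors for three households.
sims = [[150.0, 250.0, 400.0], [210.0, 180.0, 390.0]]
weights = [2.0, 1.0, 1.0]
headcounts = [fgt(s, weights, 195, 0) for s in sims]
estimate, std_error = summarize(headcounts)
```

The Gini index needs a sort of the full income vector for every simulation, which is one plausible reason the HC+GINI runs in the performance table take longer than the HC-only run.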
