
What is ecological inference (EI)? eiPack: Tools for R × C Ecological Inference and Higher-Dimension Data Management

eiPack: Tools for R × C Ecological Inference and Higher-Dimension Data Management. Goal of ecological inference (EI): infer individual-level behavior from aggregate data. Unit of analysis: contingency table with observed marginals.


  1. eiPack: Tools for R × C Ecological Inference and Higher-Dimension Data Management
     Olivia Lau, Ryan T. Moore, Michael Kellermann
     Department of Government
     Institute for Quantitative Social Science
     Harvard University
     Vienna, Austria, 16 June 2006

     What is ecological inference (EI)?
     - Goal: infer individual-level behavior from aggregate data
     - Unit of analysis: contingency table with observed marginals

                col 1    col 2    col 3
       row 1    N_11i    N_12i    N_13i    N_1·i
       row 2    N_21i    N_22i    N_23i    N_2·i
       row 3    N_31i    N_32i    N_33i    N_3·i
                N_·1i    N_·2i    N_·3i    N_i

     - eiPack methods estimate unobserved internal cells (or functions thereof)

     eiPack
     - Other packages focus on 2 × 2 inference (e.g., eco, MCMCpack)
     - eiPack: R × C inference

     Olivia Lau, Ryan T. Moore, Michael Kellermann. eiPack: R × C Ecological Inference and Data Management
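To see why the internal cells cannot simply be read off the marginals, here is a minimal Python sketch (not eiPack code). The two 2 × 3 tables and the helper `marginals` are hypothetical, with counts invented purely for illustration: both tables produce identical row totals, column totals, and unit total, yet their internal cells differ.

```python
def marginals(table):
    """Return (row totals, column totals, unit total) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return row_totals, col_totals, sum(row_totals)


# Two made-up internal tables for the same hypothetical unit.
table_a = [[30, 10, 10],
           [20, 20, 10]]
table_b = [[40, 5, 5],
           [10, 25, 15]]

# Only the marginals are observed in ecological data, and they coincide.
print(marginals(table_a))
print(marginals(table_b))
print(marginals(table_a) == marginals(table_b))
```

Any method for EI therefore has to add information beyond a single unit's marginals, whether deterministic limits or modelling assumptions across units.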

  2. eiPack
     - Other packages focus on 2 × 2 inference (e.g., eco, MCMCpack)
     - eiPack: R × C inference
     - eiPack methods:
        - Method of bounds
        - Ecological regression
        - Multinomial-Dirichlet model
     - eiPack data: senc
        - Individual-level party affiliation
        - Black, White, and Native American voters
        - 8 counties (212 precincts) in SE North Carolina
        - Cell counts known

     The models implemented in eiPack share:
     - A common input syntax of the form: cbind(col1, ..., colC) ~ cbind(row1, ..., rowR)
     - Functions to calculate proportions of some subset of columns
     - Appropriate print, summary, and plot functions

  3. Method of bounds
     - Quantity of interest: proportion of row members in each column for each unit
     - Observed row and column marginals determine upper and lower bounds
     - Row thresholds implemented for extreme case analysis

     Output:
       $white.dem
            lower   upper
       18   0.519   0.559
       25   0.450   0.469
       28   0.392   0.487

     [Figure: bounds on Proportion Democratic (y-axis, 0.0 to 1.0), labeled by precinct number, for precincts at least 90% White]
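The bounds follow from simple counting, which the following Python sketch illustrates (it is not eiPack's implementation). The function names, the example units, and the reading of the row threshold as "row members make up at least this share of the unit" are assumptions made for illustration.

```python
def cell_bounds(row_total, col_total, unit_total):
    """Bounds on the share of one row's members that fall in one column."""
    if row_total <= 0:
        raise ValueError("row_total must be positive")
    others = unit_total - row_total
    # Lower bound: all other rows absorb as much of the column as they can.
    low_count = max(0, col_total - others)
    # Upper bound: this row absorbs as much of the column as it can.
    high_count = min(row_total, col_total)
    return low_count / row_total, high_count / row_total


def bounds_above_threshold(units, threshold=0.9):
    """Keep only units in which the row makes up at least `threshold` of the unit."""
    kept = {}
    for name, (row_total, col_total, unit_total) in units.items():
        if row_total / unit_total >= threshold:
            kept[name] = cell_bounds(row_total, col_total, unit_total)
    return kept


# Hypothetical units: (row total, column total, unit total).
units = {"A": (95, 50, 100), "B": (60, 50, 100)}
print(bounds_above_threshold(units))
print(cell_bounds(60, 50, 100))
```

In unit A the row is 95% of the unit, so the bounds are narrow (45/95 to 50/95). In unit B the row is only 60% of the unit and the bounds widen to 1/6 and 5/6, which is why extreme case analysis restricts attention to nearly homogeneous units.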

  4. Method of bounds
     [Figure: Proportion Democratic (y-axis, 0.0 to 1.0), labeled by precinct number with point markers, for precincts at least 90% White]

     Ecological regression
     - Express data as proportions of row totals
     - Regress each column on all row proportions (C regressions)
     - Coefficients estimate cell proportions
     - eiPack: frequentist and Bayesian regression
     - lambda functions calculate shares of a subset of columns, e.g. "among Blacks, Dem. share of 2-party registration"
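As a rough illustration of the regression step, the Python sketch below fits each column share on all row shares by ordinary least squares without an intercept. This is a Goodman-style setup assumed here, not a transcription of eiPack's code. The data are synthetic and noise-free, generated from a made-up table of true cell proportions, and `column_share` is a hypothetical stand-in for the kind of quantity the lambda functions report.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(m[r][i]))
        m[i], m[pivot] = m[pivot], m[i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n + 1):
                m[r][c] -= f * m[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x


def ecological_regression(row_props, col_props):
    """Return beta[r][c]: estimated share of row r's members in column c."""
    n_rows, n_cols = len(row_props[0]), len(col_props[0])
    # Normal equations X'X beta_c = X'y_c, one right-hand side per column.
    xtx = [[sum(x[j] * x[k] for x in row_props) for k in range(n_rows)]
           for j in range(n_rows)]
    beta = [[0.0] * n_cols for _ in range(n_rows)]
    for c in range(n_cols):
        xty = [sum(x[j] * y[c] for x, y in zip(row_props, col_props))
               for j in range(n_rows)]
        for r, coef in enumerate(solve(xtx, xty)):
            beta[r][c] = coef
    return beta


def column_share(beta_row, cols, among):
    """Share of columns `cols` within the column subset `among`, for one row."""
    return sum(beta_row[c] for c in cols) / sum(beta_row[c] for c in among)


# Synthetic example: 2 rows, 3 columns, made-up true cell proportions.
true_beta = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
row_props = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8], [0.7, 0.3]]
col_props = [[sum(x[r] * true_beta[r][c] for r in range(2)) for c in range(3)]
             for x in row_props]

beta_hat = ecological_regression(row_props, col_props)
print(beta_hat)
# Row 0's share of column 0 among columns 0 and 1 (a two-column share).
print(column_share(beta_hat[0], [0], [0, 1]))
```

With real data the coefficients are not constrained to lie in [0, 1], so estimates of cell proportions can fall outside the logically possible range.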

  5. Ecological regression
     [Figure: density plots of Proportion Democratic and Proportion Republican; x-axis from −0.2 to 1.0, y-axis Density]

     Multinomial-Dirichlet (MD) model
     - Express data as counts
     - Fit hierarchical Bayesian model
        - Level 1: column marginals ~ Multinomial, independent (⊥⊥) across units
        - Level 2: rows of cell fractions ~ Dirichlet, independent (⊥⊥) across rows and units
        - Level 3: Dirichlet parameters ~ Gamma, i.i.d.
     - lambda and density.plot functions

     [Figure: Proportion of White Democrats (y-axis, 0.0 to 1.0) against Proportion White in precinct (x-axis, 0.0 to 1.0)]
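The three levels can be read as a generative recipe. The Python sketch below simulates one unit forward through that hierarchy; it is not the MCMC sampler that fits the model, and it is not eiPack code. Several details are assumptions not stated on the slide: the Dirichlet parameters are shared across units, the Gamma is parameterized by shape and rate with illustrative values 4 and 2, and each column probability is the row-fraction-weighted sum of the cell fractions.

```python
import random


def draw_alphas(n_rows, n_cols, rng, shape=4.0, rate=2.0):
    """Level 3: i.i.d. Gamma draws for the Dirichlet parameters."""
    # random.gammavariate takes a scale, so scale = 1 / rate.
    return [[rng.gammavariate(shape, 1.0 / rate) for _ in range(n_cols)]
            for _ in range(n_rows)]


def dirichlet(alphas, rng):
    """Draw from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def simulate_unit(row_fractions, n_total, alphas, rng):
    """Simulate cell fractions and observed column counts for one unit."""
    n_cols = len(alphas[0])
    # Level 2: each row of cell fractions is an independent Dirichlet draw.
    beta = [dirichlet(row_alphas, rng) for row_alphas in alphas]
    # Column probabilities: mix the rows of beta by the unit's row fractions.
    theta = [sum(x * b[c] for x, b in zip(row_fractions, beta))
             for c in range(n_cols)]
    # Level 1: column marginals are a multinomial draw of size n_total.
    counts = [0] * n_cols
    for c in rng.choices(range(n_cols), weights=theta, k=n_total):
        counts[c] += 1
    return beta, theta, counts


rng = random.Random(0)
alphas = draw_alphas(n_rows=2, n_cols=3, rng=rng)
beta, theta, counts = simulate_unit([0.7, 0.3], 500, alphas, rng)
print(counts)
```

Fitting the model runs in the opposite direction: given the observed column counts and row fractions for many units, the sampler draws the unobserved cell fractions and Dirichlet parameters from their posterior.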
