
The Aggregate Association Index

Eric J. Beh, School of Mathematical and Physical Sciences, University of Newcastle, Australia. COMPSTAT 2010, Paris, France, August 24.


  1. The Aggregate Association Index. Eric J. Beh, School of Mathematical and Physical Sciences, University of Newcastle, Australia. COMPSTAT 2010, Paris, France – August 24

  2. The 2 x 2 Contingency Table. Cross-classify a sample of size n according to two dichotomous variables:

     |       | Column 1        | Column 2        | Total          |
     |-------|-----------------|-----------------|----------------|
     | Row 1 | $p_{11}$        | $p_{12}$        | $p_{1\bullet}$ |
     | Row 2 | $p_{21}$        | $p_{22}$        | $p_{2\bullet}$ |
     | Total | $p_{\bullet 1}$ | $p_{\bullet 2}$ | 1              |

     “Let us blot out the contents of the table, leaving only the marginal frequencies . . . [they] by themselves supply no information on . . . the proportionality of the frequencies in the body of the table . . . ” – Fisher (1935)

     Define $P_1 = p_{11} / p_{1\bullet}$ and

     $$X^2(P_1 \mid p_{1\bullet}, p_{\bullet 1}) = n \left( \frac{P_1 - p_{\bullet 1}}{p_{2\bullet}} \right)^2 \frac{p_{1\bullet}\, p_{2\bullet}}{p_{\bullet 1}\, p_{\bullet 2}}$$
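
The definition of $X^2(P_1 \mid p_{1\bullet}, p_{\bullet 1})$ above can be sketched as a small function. This is only an illustration: the function name `chi2_given_margins` and its argument names are assumptions, and the margins used (n = 30, row 1 total 13, column 1 total 12) are illustrative values chosen to match the twin example later in the slides.

```python
def chi2_given_margins(P1, n, p1_dot, p_dot1):
    """X^2(P1 | p1., p.1): Pearson chi-squared as a function of P1 = p11 / p1."""
    p2_dot = 1.0 - p1_dot   # row 2 marginal proportion
    p_dot2 = 1.0 - p_dot1   # column 2 marginal proportion
    scaled_gap = (P1 - p_dot1) / p2_dot
    return n * scaled_gap ** 2 * (p1_dot * p2_dot) / (p_dot1 * p_dot2)


# Illustrative margins (assumed): n = 30, row 1 total 13, column 1 total 12.
n, p1_dot, p_dot1 = 30, 13 / 30, 12 / 30

# At P1 = p.1 the rows have identical profiles, so the statistic vanishes.
at_independence = chi2_given_margins(p_dot1, n, p1_dot, p_dot1)

# One possible interior value: n11 = 10 out of a row total of 13.
at_n11_10 = chi2_given_margins(10 / 13, n, p1_dot, p_dot1)

print(at_independence, round(at_n11_10, 3))
```

Because the margins are fixed, the statistic depends on the unknown cell only through $P_1$, and it is a parabola in $P_1$ with its minimum (zero) at $P_1 = p_{\bullet 1}$.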

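  3. Bounds of $P_1$. Duncan & Davis (1953) bounds:

     $$L_1 = \max\left(0, \frac{n_{\bullet 1} - n_{2\bullet}}{n_{1\bullet}}\right) \le P_1 \le \min\left(\frac{n_{\bullet 1}}{n_{1\bullet}}, 1\right) = U_1$$

     100(1 – α)% confidence bounds:

     $$L^*_\alpha = p_{\bullet 1} - p_{2\bullet} \sqrt{\frac{\chi^2_\alpha}{n} \left(\frac{p_{\bullet 1}\, p_{\bullet 2}}{p_{1\bullet}\, p_{2\bullet}}\right)} < P_1 < p_{\bullet 1} + p_{2\bullet} \sqrt{\frac{\chi^2_\alpha}{n} \left(\frac{p_{\bullet 1}\, p_{\bullet 2}}{p_{1\bullet}\, p_{2\bullet}}\right)} = U^*_\alpha$$

     $$L_\alpha = \max(0, L^*_\alpha) < P_1 < \min(1, U^*_\alpha) = U_\alpha$$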
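
A minimal sketch of both sets of bounds, assuming the margins n = 30, row totals 13 and 17, and column 1 total 12 (illustrative values consistent with the twin example in the slides). The function names are hypothetical. The critical value $\chi^2_\alpha$ on 1 degree of freedom is obtained here as the square of a standard normal quantile, which is a convenience of this sketch rather than something the slides specify.

```python
import math
from statistics import NormalDist


def duncan_davis_bounds(n1_dot, n2_dot, n_dot1):
    """Bounds on P1 implied by the marginal counts alone."""
    lower = max(0.0, (n_dot1 - n2_dot) / n1_dot)
    upper = min(n_dot1 / n1_dot, 1.0)
    return lower, upper


def confidence_bounds(n, p1_dot, p_dot1, alpha=0.05):
    """Values of P1 for which X^2(P1) stays below the critical value."""
    p2_dot, p_dot2 = 1.0 - p1_dot, 1.0 - p_dot1
    # For 1 degree of freedom, chi2_alpha equals the squared z quantile.
    chi2_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2
    half_width = p2_dot * math.sqrt(
        chi2_alpha / n * (p_dot1 * p_dot2) / (p1_dot * p2_dot)
    )
    return max(0.0, p_dot1 - half_width), min(1.0, p_dot1 + half_width)


# Assumed margins: n = 30, row totals 13 and 17, column 1 total 12.
L1, U1 = duncan_davis_bounds(13, 17, 12)
L_a, U_a = confidence_bounds(30, 13 / 30, 12 / 30, alpha=0.05)
print(round(L1, 4), round(U1, 4), round(L_a, 4), round(U_a, 4))
```

With these margins the upper Duncan & Davis bound is 12/13, which agrees with the range $0 \le P_1 \le 0.9231$ quoted for the twin data.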

  4. Aggregate Association Index (AAI). [Figure: the chi-squared statistic $X^2(P_1)$ plotted against $P_1$ from 0.0 to 1.0, with a horizontal line at $\chi^2_\alpha$, the points $p_1^*$, $L_1$, $L_\alpha$, $U_\alpha$ and $U_1$ marked on the horizontal axis, and the region above the line labelled "Statistically significant association".] If the area under $X^2(P_1)$ but above $\chi^2_\alpha$ is large, then there may be evidence to suggest that there is a significant association (at the α level of significance) between the two dichotomous variables.

  5. Aggregate Association Index (AAI). [Figure: the same plot of $X^2(P_1)$ against $P_1$, with $\chi^2_\alpha$, $L_1$, $L_\alpha$, $U_\alpha$ and $U_1$ marked.] The index is

     $$A_\alpha = 100 \left( 1 - \frac{\chi^2_\alpha \left[ (L_\alpha - L_1) + (U_1 - U_\alpha) \right] + \int_{L_\alpha}^{U_\alpha} X^2(P_1 \mid p_{1\bullet}, p_{\bullet 1})\, dP_1}{\int_{L_1}^{U_1} X^2(P_1 \mid p_{1\bullet}, p_{\bullet 1})\, dP_1} \right)$$
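
The index can be evaluated without numerical integration, because $X^2(P_1)$ is a quadratic in $P_1$ and so has a closed-form antiderivative. The sketch below rests on that observation; the function name `aai` and its signature are assumptions, and it presumes $L_1 \le L_\alpha$ and $U_\alpha \le U_1$, as the formula above does. The margins in the usage line (n = 30, row 1 total 13, column 1 total 12) are those implied by the twin example in the slides.

```python
import math
from statistics import NormalDist


def aai(n, n1_dot, n_dot1, alpha=0.05):
    """Aggregate association index A_alpha from the margins of a 2 x 2 table."""
    n2_dot, n_dot2 = n - n1_dot, n - n_dot1
    p1_dot, p2_dot = n1_dot / n, n2_dot / n
    p_dot1, p_dot2 = n_dot1 / n, n_dot2 / n

    # X^2(P1) = c * (P1 - p.1)^2, so its integral has a closed form.
    c = n * (p1_dot * p2_dot) / (p_dot1 * p_dot2) / p2_dot ** 2

    def integral(a, b):
        return c / 3.0 * ((b - p_dot1) ** 3 - (a - p_dot1) ** 3)

    # Duncan & Davis bounds on P1.
    L1 = max(0.0, (n_dot1 - n2_dot) / n1_dot)
    U1 = min(n_dot1 / n1_dot, 1.0)

    # Confidence bounds: the points where X^2(P1) meets chi2_alpha (1 df).
    chi2_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2
    half_width = math.sqrt(chi2_alpha / c)
    L_a = max(0.0, p_dot1 - half_width)
    U_a = min(1.0, p_dot1 + half_width)

    numerator = chi2_alpha * ((L_a - L1) + (U1 - U_a)) + integral(L_a, U_a)
    return 100.0 * (1.0 - numerator / integral(L1, U1))


A_005 = aai(30, 13, 12, alpha=0.05)
print(round(A_005, 2))
```

For these margins the result agrees with the value $A_{0.05} = 61.83$ reported for the twin data.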

  6. Example – Fisher’s Twin Data. Fisher's data concern 30 criminal twins, classified according to whether they are monozygotic or dizygotic twins and according to whether their same-sex twin has been convicted of a criminal offence. The Pearson chi-squared statistic is 13.032, with p-value = 0.0003, so there is evidence of a strong association between the two variables. The product moment correlation is 0.6591, indicating a positive association.
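
The three summary figures can be checked with a few lines of arithmetic. The cell counts are not in the transcript (the table was presumably an image), so the counts below are an assumption: they are the values usually quoted for Fisher's twin data, and they are consistent with the margins implied elsewhere in the slides. The p-value uses the fact that a chi-squared variable on 1 degree of freedom is a squared standard normal.

```python
import math

# Assumed cell counts (the table itself is missing from the transcript).
n11, n12 = 10, 3    # monozygotic: twin convicted, twin not convicted
n21, n22 = 2, 15    # dizygotic:   twin convicted, twin not convicted

n = n11 + n12 + n21 + n22
row1, row2 = n11 + n12, n21 + n22
col1, col2 = n11 + n21, n12 + n22

# Pearson chi-squared for a 2 x 2 table (no continuity correction).
det = n11 * n22 - n12 * n21
chi2 = n * det ** 2 / (row1 * row2 * col1 * col2)

# Upper-tail probability of a chi-squared variable with 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi2 / 2.0))

# Product moment (phi) correlation; its sign gives the direction.
phi = det / math.sqrt(row1 * row2 * col1 * col2)

print(round(chi2, 3), round(p_value, 4), round(phi, 4))
```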

  7. Example – Fisher’s Twin Data. But, as Fisher (1935) did, suppose we “blot out” the cells of the table. Question: What information do the margins provide in understanding the extent to which the variables are associated? We shall calculate the aggregate association index.

  8. Example – Fisher’s Twin Data. $A_{0.05} = 61.83$. If we consider the 5% level of significance, the margins provide strong evidence that there may exist a significant association between twin type and conviction status, where

     $$X^2(P_1) = 30 \left( \frac{30 P_1 - 12}{17} \right)^2 \frac{221}{216}, \qquad 0 \le P_1 \le 0.9231$$

  9. Direction of the Association. The index splits into a positive part $A_\alpha^+$ and a negative part $A_\alpha^-$, with $A_\alpha = A_\alpha^+ + A_\alpha^-$.

  10. Fisher’s Twin Data (. . . revisited). $A_{0.05} = 61.83$, with $A_{0.05}^+ = 46.43$ and $A_{0.05}^- = 15.40$. Therefore, based solely on the marginal information, we can determine that the variables are three times more likely to be positively associated than negatively associated.
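
The transcript does not show how the two directional parts are defined. The sketch below assumes the natural split at $P_1 = p_{\bullet 1}$: the negative part is the area between $X^2(P_1)$ and $\chi^2_\alpha$ over $[L_1, L_\alpha]$, the positive part is the same area over $[U_\alpha, U_1]$, and both are expressed as a percentage of the total area under $X^2(P_1)$. The function name `aai_direction` is hypothetical, and the margins are those implied by the twin example.

```python
import math
from statistics import NormalDist


def aai_direction(n, n1_dot, n_dot1, alpha=0.05):
    """Return (A_plus, A_minus), assuming the split is made at P1 = p.1."""
    n2_dot, n_dot2 = n - n1_dot, n - n_dot1
    p1_dot, p2_dot = n1_dot / n, n2_dot / n
    p_dot1, p_dot2 = n_dot1 / n, n_dot2 / n

    # X^2(P1) = c * (P1 - p.1)^2
    c = n * (p1_dot * p2_dot) / (p_dot1 * p_dot2) / p2_dot ** 2
    chi2_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2

    L1 = max(0.0, (n_dot1 - n2_dot) / n1_dot)
    U1 = min(n_dot1 / n1_dot, 1.0)
    half_width = math.sqrt(chi2_alpha / c)
    L_a = max(0.0, p_dot1 - half_width)
    U_a = min(1.0, p_dot1 + half_width)

    def integral(a, b):
        return c / 3.0 * ((b - p_dot1) ** 3 - (a - p_dot1) ** 3)

    def excess_area(a, b):
        # Area between X^2(P1) and the critical value over [a, b].
        return integral(a, b) - chi2_alpha * (b - a)

    total = integral(L1, U1)
    a_minus = 100.0 * excess_area(L1, L_a) / total   # P1 below p.1
    a_plus = 100.0 * excess_area(U_a, U1) / total    # P1 above p.1
    return a_plus, a_minus


A_plus, A_minus = aai_direction(30, 13, 12, alpha=0.05)
print(round(A_plus, 2), round(A_minus, 2), round(A_plus / A_minus, 2))
```

Under this assumed split the two parts sum to the overall index, and their ratio is about 3, which is the basis of the "three times more likely" statement.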

  11. Discussion. The index provides an indication of the extent to which two dichotomous variables are statistically significantly associated, given only the marginal information. The index is not meant to infer the individual-level correlation of the variables, but to provide a measure reflecting how likely the two variables may be associated. Further issues: investigate the applicability of the index for G (>1) 2x2 tables, including incorporating covariate information (ecological inference); the index has links with the correspondence analysis of aggregate data; link with Fisher’s exact test.
