conditional vs marginal estimators

Conditional vs. marginal estimators of within-pair regression effects in individually-matched case-control studies and twin cohorts - PowerPoint PPT Presentation


  1. Conditional vs. marginal estimators of within-pair regression effects in individually-matched case-control studies and twin cohorts
     Lyle C Gurrin
     Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health
     ViCBiostat, Melbourne, November 2014
     Sections: Acknowledgements, Background, Models for paired data, History, Binary data, Estimation of regression parameters, Simulation, Conclusions, References

  2. Acknowledgements No. 1
     Jack and Jill...
     John Carlin, Jonathan Sterne, John Hopper and Gillian Dite

  3. Acknowledgements No. 2
     Collaborators on the more recent work on binary data...
     Martin Hazelton, Fizz Williamson, Sabria Khan

  4. Outline
     - Background and motivation
     - Regression models for continuously-valued paired exposure and outcome data
     - History
     - Conditional estimators
     - Binary data (both exposure and outcome)
     - Estimators of within-pair effect
     - Simulation results
     - Extensions

  5. Paired data
     - Twins provide naturally matched pairs for studies of human health, although paired data goes beyond just twins.
     - We can exploit within-pair comparisons of data to avoid confounding of associations between outcomes and exposures by shared factors.
     - Specific assumptions about shared factors allow the determination of genetic and environmental contributions to disease risk.

  6. Twin Studies - 1
     - Tradition of focussing on genetic hypotheses
     - Decompose variation in a quantitative trait
     - Compare within-pair correlation of DZ with MZ ($\tfrac{1}{2}$ under the additive genetic model)
     - Classical Twin Model assumes that variation attributable to common or shared environment is the same for DZ and MZ twins
     - Lower DZ than MZ within-pair correlation provides evidence that a trait is determined by genetic factors

  7. Twin Studies - 2
     - Can the twin context provide greater insight on associations?
     - Cardiovascular risk (blood pressure) with birthweight
     - Cancer risk (breast density) with physical measures (height, weight, BMI)
     - Ideally we would like to separate the effect of shared and individual factors (e.g. maternal versus placental)
     - When can a regression relationship be said to have a genetic basis?

  8. Individual twin regression
     Exposure variable $x_{ij}$ and binary outcome $y_{ij}$ for $i = 1, \ldots, n$ and $j = 1, 2$. A cross-sectional or individual-level regression model might propose that
     $$E(y_{i1}) = \alpha + \beta x_{i1}$$
     $$E(y_{i2}) = \alpha + \beta x_{i2}$$

  9. Individual twin regression
     $$E(y_{i1}) = \alpha + \beta x_{i1}$$
     $$E(y_{i2}) = \alpha + \beta x_{i2}$$
     If we take the difference between the two equations we get
     $$E(y_{i1} - y_{i2}) = \beta (x_{i1} - x_{i2})$$
     If we take the average of the two equations we get
     $$E(\bar{y}_i) = \alpha + \beta \bar{x}_i$$

  10. Between- and within-pair regression
      These are special cases of a more general model
      $$E(y_{i1}) = \beta_0 + \beta_w (x_{i1} - \bar{x}_i) + \beta_b \bar{x}_i$$
      $$E(y_{i2}) = \beta_0 + \beta_w (x_{i2} - \bar{x}_i) + \beta_b \bar{x}_i$$
      where $\bar{x}_i = (x_{i1} + x_{i2})/2$. Since
      $$x_{i1} - \bar{x}_i = (x_{i1} - x_{i2})/2$$
      $$x_{i2} - \bar{x}_i = (x_{i2} - x_{i1})/2 = -(x_{i1} - x_{i2})/2$$
      we can re-write the multivariable between- and within-pair model as...

  11. Between- and within-pair regression
      These are special cases of a more general model
      $$E(y_{i1}) = \beta_0 + \beta_w (x_{i1} - x_{i2})/2 + \beta_b \bar{x}_i$$
      $$E(y_{i2}) = \beta_0 + \beta_w (x_{i2} - x_{i1})/2 + \beta_b \bar{x}_i$$
      Univariate regressions on the within-pair differences (of $y_{i1} - y_{i2}$ on $x_{i1} - x_{i2}$) and on the within-pair means (of $\bar{y}_i$ on $\bar{x}_i$) yield estimates of $\beta_w$ and $\beta_b$ respectively.
      Simultaneous estimation of $\beta_w$ and $\beta_b$ from the multivariable model generates the same point estimates under OLS and GLS (but the standard errors will differ).
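The equivalence between the univariate and multivariable fits can be checked numerically. The following is a minimal sketch, not the author's code: the sample size, the parameter values, and the way the paired exposure is simulated (a shared component plus individual noise) are all assumptions made for illustration. The within-pair regression is fitted through the origin, matching the difference equation, which has no intercept.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500                                   # number of pairs (illustrative)
beta0, beta_w, beta_b = 1.0, 0.5, 2.0     # illustrative "true" values

# Paired exposure: shared pair-level component plus individual noise (assumed).
x = rng.normal(size=(n, 1)) + rng.normal(size=(n, 2))
x_bar = x.mean(axis=1, keepdims=True)     # pair means, shape (n, 1)

# Outcome generated from the between- and within-pair model.
y = beta0 + beta_w * (x - x_bar) + beta_b * x_bar + rng.normal(size=(n, 2))

# Univariate within-pair regression: differences, no intercept.
dx = x[:, 0] - x[:, 1]
dy = y[:, 0] - y[:, 1]
bw_diff = (dx @ dy) / (dx @ dx)

# Univariate between-pair regression: pair means, with intercept.
bb_mean = np.polyfit(x_bar.ravel(), y.mean(axis=1), 1)[0]

# Multivariable OLS on individual-level rows: [1, x_ij - x_bar_i, x_bar_i].
X = np.column_stack([
    np.ones(2 * n),
    (x - x_bar).ravel(),
    np.repeat(x_bar.ravel(), 2),
])
coef, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)

print("within :", bw_diff, coef[1])
print("between:", bb_mean, coef[2])
```

The two within-pair estimates agree, as do the two between-pair estimates, because the within-pair deviations are orthogonal to both the intercept and the pair means in balanced pairs.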

  12. Interpretation - 1
      - $\beta_w$ is the expected change in the outcome $y$ for a unit change in the deviation of the exposure $x$ from the pair mean, holding this pair mean constant.
      - $\beta_b$ is the expected change in the outcome $y$ for a unit change in the pair mean $\bar{x}$, holding the within-pair deviation (difference) constant.

  13. Illustration of between- and within-pair effects
      [Figure: Between- and within-pair regression effects. Horizontal axis: Exposure (Height in Centimetres); vertical axis: Outcome (Percent Mammographic Density).]

  14. Interpretation - 2
      What we are postulating is
      - A model for the expected value of the outcome $y$ can be improved by using data from pairs.
      - A good way to do this is to relate the expected value of $y_{i1}$ not just to $x_{i1}$ (the twin's own exposure value) but also to their co-twin's exposure value $x_{i2}$.
      - The expected difference in outcome $y$ when comparing two $x$ values may depend on whether we are comparing (i) co-twins with each other within-pair; or (ii) unrelated twins between pairs.

  15. Twin – Co-Twin regression
      The multivariable model re-expressed
      $$E(y_{i1}) = \beta_0 + \beta_t x_{i1} + \beta_c x_{i2}$$
      $$E(y_{i2}) = \beta_0 + \beta_t x_{i2} + \beta_c x_{i1}$$
      where, substituting $\bar{x}_i = (x_{i1} + x_{i2})/2$ into $E(y_{i1}) = \beta_0 + \beta_w (x_{i1} - \bar{x}_i) + \beta_b \bar{x}_i$ and collecting terms,
      $$\beta_t = (\beta_w + \beta_b)/2$$
      $$\beta_c = (\beta_b - \beta_w)/2$$
      from which we can see that $\beta_c = 0$ ($E(y_{i1})$ does not depend on $x_{i2}$ and vice versa) is equivalent to $\beta_w = \beta_b$ (between- and within-pair regression effects are the same).

  16. Individual twin regression
      Recall the individual-level regression model
      $$E(y_{i1}) = \alpha + \beta x_{i1}$$
      $$E(y_{i2}) = \alpha + \beta x_{i2}$$
      If the multivariable between- and within-pair regression model is correct, then fitting the individual-level regression (again, by either OLS or GLS) produces an estimate of $\beta$ that is a weighted average of the corresponding estimates of $\beta_w$ and $\beta_b$, with weights that depend on $\rho_x$ and $\rho_y$, the observed within-pair correlations of $x$ and $y$ respectively.
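For the OLS case, the weighted-average result is an exact algebraic identity that can be verified on simulated data. The sketch below is illustrative only: the data-generating values are assumptions, and the weight shown (the within-pair share of the total sum of squares of $x$) applies to OLS. GLS weights, which also involve the within-pair correlation of the outcome, are not covered here.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400                                   # number of pairs (illustrative)

# Simulated pairs with different within- and between-pair effects (assumed values).
x = rng.normal(size=(n, 1)) + rng.normal(size=(n, 2))
x_bar = x.mean(axis=1, keepdims=True)
y = 1.0 + 0.5 * (x - x_bar) + 2.0 * x_bar + rng.normal(size=(n, 2))
y_bar = y.mean(axis=1, keepdims=True)

# Within-pair deviations, and between-pair deviations repeated for both members.
xw, yw = x - x_bar, y - y_bar
xb = np.repeat(x_bar - x.mean(), 2, axis=1)
yb = np.repeat(y_bar - y.mean(), 2, axis=1)

ss_w = np.sum(xw ** 2)                    # within-pair sum of squares of x
ss_b = np.sum(xb ** 2)                    # between-pair sum of squares of x

bw_hat = np.sum(xw * yw) / ss_w           # within-pair OLS slope
bb_hat = np.sum(xb * yb) / ss_b           # between-pair OLS slope

# Individual-level (pooled) OLS slope, ignoring the pairing.
pooled = np.polyfit(x.ravel(), y.ravel(), 1)[0]

# Weighted average with weight equal to the within-pair share of variation in x.
weight_w = ss_w / (ss_w + ss_b)
weighted = weight_w * bw_hat + (1 - weight_w) * bb_hat

print(pooled, weighted, weight_w)
```

The pooled slope always falls between the within- and between-pair estimates, and it moves toward the between-pair estimate as the within-pair correlation of $x$ increases, since more of the variation in $x$ is then between pairs.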

  17. Weighted average estimates of $\beta$
      This result first came to my attention through biostatistics via the seminal paper by Neuhaus & Kalbfleisch (1998) in Biometrics.
      Neuhaus & Kalbfleisch (1998), however, quote Scott & Holt (1982) in J. Amer. Stat. Assoc., a paper on two-stage sample surveys.
      Scott & Holt (1982) in turn trace the result back to Maddala (1971) in Econometrica, so we're now in economics where the interest at the time was "pooling cross section and time series data".

  18. Weighted average estimates of $\beta$
      There's more: Maddala (1971) references this
      Wallace TD & Hussain A (1969). The use of error components models in combining cross section with time series data. Econometrica, 37, 55–72,
      which in turn refers to this
      Hildreth C (1950). Combining Cross-Section Data and Time Series. Cowles Commission Discussion Paper: Statistics No. 347, May 15, 1950.

  19. Weighted average estimates of $\beta$
      This led me to propose Gurrin's Law: One can always find a reference to the between- and within-cluster "beta is weighted average" result published before one was born regardless of how old one is!
