Balancing Factors for Stepped Wedge Designs

Robert Lew 1, Hongsheng Wu 1,2, Christopher Miller 1,3, Bo Kim 1,3, Mark Bauer 1,3

1 VA Boston Healthcare System, 2 Wentworth Institute of Technology, 3 Harvard Medical School

Design of BHIP-CCM trial
• This trial has a pre-post design that will implement a team-oriented psychiatric patient management system at 9 sites (Veterans Affairs hospitals), 3 sites per period over 3 periods.
• We might assign sites A B C D E F G H I to periods as follows:

Period 1   Period 2   Period 3
G H I      A B C      D E F

Time trends: what could possibly go wrong?
• Suppose each of 9 sites is either Urban (U) or Rural (R). If every urban site starts in Period 1, any time trend is confounded with the urban/rural difference.

Period 1   Period 2   Period 3
G H I      A B C      D E F
U U U      R R R      R R R

• Permute sites to balance both U and R across the periods.

Period 1   Period 2   Period 3
C H I      A D G      B E F
U R R      U R R      U R R

Imbalance score for times -1, 0, and 1
• Recode Periods 1, 2, 3 as times -1, 0, 1. In the unbalanced assignment, the average time for U is -3/3 = -1; for R it is (0+0+0+1+1+1)/6 = 0.5.

Period -1   Period 0   Period 1
G H I       A B C      D E F
U U U       R R R      R R R

• In the balanced assignment, the average time for both U and R across the periods is zero.

Period -1   Period 0   Period 1
C H I       A D G      B E F
U R R       U R R      U R R
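
The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch: the function names `average_time` and `imbalance` are introduced here for illustration, and the U/R labels are read off the two layouts above position by position.

```python
from fractions import Fraction

# Time codes for the nine slots: three sites in each of periods -1, 0, 1.
TIMES = [-1, -1, -1, 0, 0, 0, 1, 1, 1]

def average_time(times, labels):
    """Return {category: average time code of the sites in that category}."""
    groups = {}
    for t, label in zip(times, labels):
        groups.setdefault(label, []).append(t)
    return {label: Fraction(sum(ts), len(ts)) for label, ts in groups.items()}

def imbalance(times, labels):
    """Sum of absolute average times over the categories of one factor."""
    return sum(abs(m) for m in average_time(times, labels).values())

# U/R labels in slot order, copied from the two layouts on the slide.
unbalanced = list("UUURRRRRR")
balanced = list("URRURRURR")

print(average_time(TIMES, unbalanced))
print(imbalance(TIMES, unbalanced))
print(imbalance(TIMES, balanced))
```

For the unbalanced layout the averages are -1 for U and 1/2 for R, so the score is 1.5; for the balanced layout both averages are zero and the score is 0.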
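
Try many site permutations and pick the best
• Of 20000 random permutations of sites A through I, only 1680 gave distinct assignments. Since 9!/(3! 3! 3!) = 1680 is the number of ways to split 9 sites into 3 ordered periods of 3, the sample covered every possible assignment.
• Restricted the selection to the 34 best-balanced site assignments (restricted randomization).
• 'Best' means the lowest overall imbalance score: the sum of the absolute category imbalance scores over every category of every factor.
• Randomly chose 1 of these 34 best assignments to mute time-trend bias.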
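
The selection procedure can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it enumerates the assignments directly instead of sampling permutations, scores only the single Urban/Rural factor with G, H, I taken as the urban sites (as in the first example layout), and keeps every assignment tied at the minimum score. The slides do not say how the cutoff of 34 was reached with the real factors, so that number is not reproduced here.

```python
import random
from fractions import Fraction
from itertools import combinations

SITES = "ABCDEFGHI"
URBAN = set("GHI")  # assumption: urban sites as in the first example layout

def all_assignments(sites):
    """Yield every distinct split of the sites into periods -1, 0, 1 (3 each)."""
    for first in combinations(sites, 3):
        rest = [s for s in sites if s not in first]
        for second in combinations(rest, 3):
            third = [s for s in rest if s not in second]
            yield {
                **dict.fromkeys(first, -1),
                **dict.fromkeys(second, 0),
                **dict.fromkeys(third, 1),
            }

def score(assignment):
    """Absolute average time of urban sites plus that of rural sites."""
    urban = [t for s, t in assignment.items() if s in URBAN]
    rural = [t for s, t in assignment.items() if s not in URBAN]
    return abs(Fraction(sum(urban), len(urban))) + abs(Fraction(sum(rural), len(rural)))

assignments = list(all_assignments(SITES))
best_score = min(score(a) for a in assignments)
best = [a for a in assignments if score(a) == best_score]

# Restricted randomization: draw one assignment at random from the best set.
rng = random.Random(2024)  # fixed seed only so the sketch is reproducible
chosen = rng.choice(best)

print(len(assignments), best_score, len(best))
print(chosen)
```

Exact fractions are used so that ties at the minimum score are detected reliably. With several factors, `score` would sum the same quantity over every category of every factor.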

Minimize the Overall Imbalance Score
• We describe results for only 4 of many factors (characteristics):
  • Urban (U-urban, R-rural)
  • Academic affiliate (yes, no)
  • Staff experience (high, low)
  • Region of USA (A, B, C, D)
• Each factor has several categories; the 4 factors above have 10 categories in total.
• We must balance each of the categories over the 3 periods.
• Perfect balance is a score of zero; a positive score shows imbalance.
• The overall score sums the absolute imbalance over all categories.
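
One way to write the overall score in symbols is shown here. This is a reading of the slides, not a formula the authors state; the notation is introduced for illustration. Let $t_i \in \{-1, 0, 1\}$ be the time code of site $i$, let $c$ range over the categories of factor $f$, and let $n_{fc}$ be the number of sites in category $c$ of factor $f$.

```latex
S \;=\; \sum_{f} \sum_{c \in f} \left| \frac{1}{n_{fc}} \sum_{i \in c} t_i \right|
```

For the single Urban/Rural factor in the unbalanced example, $S = |-1| + |0.5| = 1.5$; in the balanced example, $S = 0 + 0 = 0$.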

Category scores for 4 factors and overall score
(mean ± sd of the imbalance score; N=1680 is all distinct assignments, N=34 is the best-balanced subset)

Category       N=1680         N=34
               mean ± sd      mean ± sd
Urban          0.20 ± 0.16    0.12 ± 0.13
Rural          0.25 ± 0.20    0.15 ± 0.16
Academic       0.25 ± 0.21    0.10 ± 0.14
Non-academic   0.20 ± 0.17    0.08 ± 0.11
High %Staff    0.16 ± 0.13    0.03 ± 0.07
Low %Staff     0.31 ± 0.26    0.07 ± 0.13
Region A       0.25 ± 0.21    0.14 ± 0.13
Region B       0.41 ± 0.35    0.14 ± 0.23
Region C       0.41 ± 0.35    0.11 ± 0.21
Region D       0.66 ± 0.47    0.04 ± 0.20
Overall        3.10 ± 0.91    0.99 ± 0.32

Related task: construct a "comparable" control group for our 9 sites that has face validity.
• Minimize 'distance' between study and control sites with respect to 10 factors:
  • AES Psychological Safety
  • AES Civility
  • AES QI/SR
  • %Rural Veterans
  • #Psychiatric Teams
  • #Patients
  • % PACT15 Patients
  • #High risk patients
  • #Phone Encounters
  • Geographic region
• Problems and the ad hoc solutions adopted:

Problem                               Ad hoc solution
Weight, rescale factors?              Equal weight and stdev = 1
Delete redundant factors?             Pearson r > 0.9
Use numbers or tertile categories?    Numbers
What distance metric?                 Absolute difference
Region of US                          MATCH
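
The ad hoc choices in the table can be sketched in code. Everything specific below is an assumption made for illustration: the site records are invented (they are not trial data), only two numeric factors are used, the function names are hypothetical, and controls are picked greedily without reuse. The redundancy check (Pearson r > 0.9) is left out to keep the sketch short.

```python
from statistics import stdev

# Hypothetical site records, invented for illustration only.
study = {"S1": {"region": "A", "pct_rural": 10.0, "n_patients": 900.0}}
candidates = {
    "C1": {"region": "A", "pct_rural": 12.0, "n_patients": 950.0},
    "C2": {"region": "A", "pct_rural": 40.0, "n_patients": 300.0},
    "C3": {"region": "A", "pct_rural": 9.0, "n_patients": 1000.0},
    "C4": {"region": "B", "pct_rural": 10.0, "n_patients": 900.0},
    "C5": {"region": "A", "pct_rural": 15.0, "n_patients": 700.0},
}
FACTORS = ["pct_rural", "n_patients"]

def rescale(sites, factors):
    """Divide each factor by its SD over all sites, so each has stdev = 1."""
    scaled = {name: dict(rec) for name, rec in sites.items()}
    for f in factors:
        sd = stdev([rec[f] for rec in sites.values()])
        for rec in scaled.values():
            rec[f] = rec[f] / sd
    return scaled

def distance(a, b, factors):
    """Equal-weight sum of absolute differences over the rescaled factors."""
    return sum(abs(a[f] - b[f]) for f in factors)

def match_controls(study, candidates, factors, k=3):
    """For each study site, pick the k nearest unused candidates in its region."""
    scaled = rescale({**study, **candidates}, factors)
    used, matches = set(), {}
    for s in study:
        pool = [
            c for c in candidates
            if c not in used and scaled[c]["region"] == scaled[s]["region"]
        ]
        pool.sort(key=lambda c: distance(scaled[s], scaled[c], factors))
        matches[s] = pool[:k]
        used.update(pool[:k])
    return matches

matches = match_controls(study, candidates, FACTORS)
print(matches)
```

Candidate C4 is identical to S1 on both numeric factors but is never selected, because region must match exactly. Changing the weights, the metric, or the rescaling in `distance` and `rescale` is a quick way to see whether the chosen controls are robust to those subjective choices.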

Chose 27 'control' sites to compare with 9 study sites matched on 10 factors
• VA Geographic Regions are large and irregular.

   Study site   Matched control sites
1. NY           Virginia, Virginia, NY
2. NY           Pennsylvania, MD, RI
3. NY           Pennsylvania, NC, NC
4. KY           Florida, KY, SC
5. Ohio         Illinois, Iowa, Iowa
6. Ohio         Ohio, WI, Minnesota
7. Kansas       Minn, Missouri, Kansas
8. Texas        Miss, Montana, Ariz
9. Texas        Louisiana, Ariz, Utah

• Potential report on Patient Mental Health Symptom Score:

             Our 9 sites   27 comparable sites   All VA but our 9
Mean score   13.2          15.3                  19.6

• Several control groups may clarify the apparent result.

Control (referent) group construction when we cannot randomize (enough)
• The time-trend example drew on ANOVA balanced incomplete block designs, the basic underlying statistical model for stepped wedge designs.
• The matched/balanced controls example drew on case/control studies. These catalog why many simple methods fail and offer remedies.
• Observational study design: Rubin/Rosenbaum propensity scores. Propensity matching often fails and makes matters worse.
• Causal models: factors may play different roles and are not merely simple covariates in a prediction model.

CONCLUSION: Protecting against bias relies heavily on the wisdom of context experts.
• What factors matter?
• Try to have at least surrogates for all major factors.
• Choose an assignment or a control group that has face validity.
• Start with conceptually simple criteria for comparability.
• Use fancy concepts (propensity) only when you understand their limitations.
• Construct several control groups.
• Check that comparability is robust against different choices for factor categories, factor weights, cutpoints, and other subjective choices.

THANK YOU!