Matching an Internet Panel Sample of Pregnant Women to a Probability Sample



  1. Matching an Internet Panel Sample of Pregnant Women to a Probability Sample
     Authors: Andrew Burkey (Abt SRBI), Carla L Black (CDC*), Charles DiSogra (Abt SRBI), John Sokolowski (Abt SRBI), Stacie M Greby (CDC*), Helen Ding (CDC*), K.P. Srinath (Abt SRBI), Sarah W Ball (Abt Assoc.), Sara MA Donahue (Abt Assoc.)
     Presented at: AAPOR, May 16, 2015, Hollywood, FL; DC-AAPOR, August 3, 2015, Washington, DC
     * Immunization Services Division, National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention

  2. Background
     - The Centers for Disease Control and Prevention (CDC) uses influenza vaccination coverage data to:
       - Monitor the impact of vaccination programs
       - Identify groups in need of vaccination services
     - Pregnant women (PW) are prioritized for influenza vaccination
       - Pregnant women are at increased risk of influenza-related severe illness and hospitalization
       - Infants younger than 6 months are at high risk of severe complications from the flu, but they are too young to be vaccinated
     Abt SRBI | pg 2

  3. Surveying a rare group: pregnant women
     - A probability-based survey could be costly and time-consuming
       - Low prevalence of PW: <4% of all women
     - Compromise: use opt-in Internet panels
     - A non-probability opt-in panel is fit for purpose
       - Can access large numbers of PW, with Black and Hispanic over-samples
         - Large sample: approximately 2,000 completes per 2-week period
       - Provides information quickly (all Web mode)
         - Short 2-week data collection period in November and in April
       - Is affordable for the task
     - Survey Sampling Inc. opt-in Internet panel used for sample since 2010

  4. Limitations of non-probability sample
     - PW are all panel volunteers who use the Internet
     - Due to Internet use and multi-survey exposure, PW in the survey may be better informed than PW in the general population
     - Bias in the estimate is unknown; this poses a threat to generalizability
     - Standard errors and confidence intervals cannot be used with non-probability samples (AAPOR Task Force Report, June 2013)

  5. Study Objective
     Objective
     - Explore ways to mitigate the shortcomings of a non-probability sample
       - Assess the bias
       - Assign a measure of precision to the estimate
     Method
     - Match cases from the Spring 2013 non-probability PW survey with PW cases from a 2013 national probability survey
     Assumption
     - A purposely matched opt-in sample can resemble a probability sample
       - Variance of the estimate approximates that from a probability sample, thus a confidence interval may be derived
       - The matched sample estimate may not be similar to the probability sample's estimate due to the existence of inherent bias

  6. Panel bias is assumed
     [Figure; labels: Matched panel sample, Probability sample]

  7. The comparison probability sample
     - 2013 National Health Interview Survey (NHIS)
       - In-person survey of the population, weighted to population totals
       - Data collection throughout the year
       - Has some of the same descriptive variables as the Panel sample
     - To best align with the panel sample, the NHIS sample was limited to women who:
       - Were asked questions about pregnancy
       - Were interviewed February to July 2013
       - Were pregnant any time between August 2012 and March 2013
       - Responded to questions about influenza vaccination status

  8. Matching variables
     - Variables available in both surveys whose distributions differ between the panel and NHIS (age, race, education) do not explain the difference in vaccination coverage between the samples
     - Variables used by other studies are not available in both surveys, e.g., political party affiliation, smoking status
     - One difference between the panel and NHIS is Internet use
       - Literature suggests that panel members are frequent Internet users
       - Frequent Internet users have attitudinal and behavioral differences compared to the general population

  9. Internet use: observed difference
     - A difference between the two samples:
       - Panel: all Internet users (online respondents, English only)
       - NHIS: Internet users and non-users (in-person, general population)
     - There may be other attitudinal, behavioral, and socio-political differences
     - Internet use can be defined for both samples
     - Internet use was thought to be the most logical available measure to define coverage differences between the two samples
     - Opt-in panelists have been described as more active Internet users

  10. Flu vaccination rate by Internet use
      Vaccination Rate for Panel and NHIS Samples (weighted*)

                           Total Panel   Total NHIS    NHIS Non-daily Internet   NHIS Daily Internet
      Vaccination rate     47.1%         34.9%         31.0%                     36.4%
      95% conf. interval   -             28.8 - 41.0   18.7 - 43.4               29.4 - 43.4

      * Both samples weighted to identical benchmarks
      - The Panel sample is missing non-Internet users
      - Daily Internet has a higher observed vaccination rate, as does the Panel
      - Difference by Internet use suggests a reason why the Panel rate is higher

  11. Matching strategy
      Internet Use Propensity
      - Leverage NHIS Internet use questions for matching
      - Develop a propensity score based on frequency of Internet usage
        - "Daily Internet" use vs. "Less than daily/No Internet" use

  12. Internet user propensity
      - Cases used in analysis: 2,035 panel cases + 394 NHIS cases
        - Panel cases: we assume 100% to be daily users (for our purpose)
        - NHIS cases: 71% daily users, 29% less than daily/non-users
      - Combined data used in a logistic regression model predicting likelihood of a "Less-than-Daily" user (i.e., propensity)

  13. Internet user propensity
      - An exploratory investigation was pursued for the propensity model
        - Eleven variables evaluated for prediction of Less-than-Daily user
        - Stepwise analysis found six as significant for use (i.e., p < .05 level)

      Effect                    DF   Chi-Square   p
      education level            3   96.36        <.0001
      race/ethnicity             3   36.80        <.0001
      type of phone used         3   20.88        <.001
      age                        5   16.93        <.001
      home ownership             1    7.01        <.01
      child ≤ 4 yrs old in HH    1    4.46        <.05

      p > .05: health insurance, marital status, region, income, employment status
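As a rough illustration of the propensity model described on slides 12 and 13, the sketch below fits a logistic regression on combined panel and NHIS rows and scores each case with its modeled probability of being a Less-than-Daily user. Everything in it is an assumption made for illustration: the data are synthetic, the two features (`college_degree`, `owns_home`) are hypothetical stand-ins for the six predictors, and plain gradient descent replaces whatever software and stepwise procedure the authors used.

```python
import math

def predict(weights, x):
    """Return the modeled probability of being a Less-than-Daily user."""
    z = weights[0] + sum(w * xj for w, xj in zip(weights[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent.

    Returns [intercept, coef_1, ..., coef_k].
    """
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict(weights, x) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights

# Synthetic toy data (NOT from the study). Features: [college_degree, owns_home].
panel_rows = [[1, 1], [1, 0], [1, 1], [0, 1], [1, 0], [0, 0]]
panel_labels = [0] * len(panel_rows)  # panel cases are assumed to be daily users
nhis_rows = [[1, 1], [0, 0], [0, 1], [1, 0], [0, 0], [0, 0], [1, 0]]
nhis_labels = [0, 1, 1, 0, 1, 0, 1]   # 1 = Less-than-Daily user

weights = fit_logistic(panel_rows + nhis_rows, panel_labels + nhis_labels)

# Propensity score for every case in the combined file.
propensity = [predict(weights, x) for x in panel_rows + nhis_rows]
```

Because every predictor here is categorical, many cases share exactly the same score, which is what makes matching on the score value possible.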

  14. Vaccination and Internet propensity score
      [Figure: Vaccination Rate and Quintiles of Internet Use Propensity. Vertical axis: vaccination rate (0% to 70%); horizontal axis: quintile, from 1st (High Use) to 5th (Low Use); legend: Panel (47%), NHIS (35%)]
      Panel = 2,035 cases; NHIS = 394 cases. Percents rounded to whole numbers.
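The quintile breakdown on slide 14 can be sketched as follows. This is a minimal, unweighted illustration on made-up scores; the function names are hypothetical, and the slides do not say whether the quintile cut points were computed on the combined file or per sample, or how ties were handled (this rank-based version can split tied scores across quintiles).

```python
def quintile_labels(scores):
    """Assign each score a quintile 1..5 by rank (1 = lowest propensity, i.e. High Use)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [0] * len(scores)
    for rank, i in enumerate(order):
        labels[i] = rank * 5 // len(scores) + 1
    return labels

def rate_by_quintile(scores, vaccinated):
    """Return the unweighted vaccination rate within each propensity quintile."""
    totals = {q: [0, 0] for q in range(1, 6)}  # quintile -> [vaccinated, cases]
    for q, v in zip(quintile_labels(scores), vaccinated):
        totals[q][0] += v
        totals[q][1] += 1
    return {q: (hits / n if n else None) for q, (hits, n) in totals.items()}

# Made-up propensity scores and vaccination flags (1 = vaccinated).
scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50]
vaccinated = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]

rates = rate_by_quintile(scores, vaccinated)
```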

  15. Matching rates
      - Matched cases from both samples on their propensity score value
      - 82% of NHIS cases matched to 63% of Panel cases

      Source   Matched            Did Not Match       Total Cases
               Count    Percent   Count    Percent    Count
      Panel    1,282    63%       753      37%        2,035
      NHIS       325    82%        69      18%          394
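One way to match cases "on their propensity score value" is to group cases from both samples by score and keep the groups that contain at least one case from each source. The sketch below does that on made-up scores. The rounding to three decimals and the function name `match_on_score` are assumptions; the slides do not state the matching tolerance or algorithm.

```python
from collections import defaultdict

def match_on_score(panel_scores, nhis_scores, decimals=3):
    """Group case indices by rounded score; keep groups present in both samples."""
    groups = defaultdict(lambda: {"panel": [], "nhis": []})
    for i, s in enumerate(panel_scores):
        groups[round(s, decimals)]["panel"].append(i)
    for i, s in enumerate(nhis_scores):
        groups[round(s, decimals)]["nhis"].append(i)
    return {k: g for k, g in groups.items() if g["panel"] and g["nhis"]}

# Made-up propensity scores for a handful of cases.
panel_scores = [0.10, 0.10, 0.25, 0.40]
nhis_scores = [0.10, 0.25, 0.25, 0.90]

matches = match_on_score(panel_scores, nhis_scores)

# Share of each sample that found a match.
panel_match_rate = sum(len(g["panel"]) for g in matches.values()) / len(panel_scores)
nhis_match_rate = sum(len(g["nhis"]) for g in matches.values()) / len(nhis_scores)
```

In this toy input, one group pairs a single NHIS case with two Panel cases and another pairs two NHIS cases with one Panel case, which is the many-to-many situation the ratio adjustment on the next slide addresses.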

  16. Ratio adjustment weights
      - Problem
        - Single NHIS case matched to multiple Panel cases
        - Multiple NHIS cases matched to a single Panel case
        - Multiple NHIS cases matched to multiple Panel cases
      - Solution
        - A ratio adjustment for each of the matched Panel cases:
          Ratio Adj. Weight = (number of NHIS cases in a match) / (number of Panel cases in same match)
      - Example: when 2 NHIS cases match 5 Panel cases, ratio adjustment weight = 2/5 = 0.40
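The ratio adjustment on this slide is simple enough to compute directly. The sketch below applies it to a few hypothetical match groups (the group sizes are invented; only the 2-to-5 example comes from the slide) and uses exact fractions so the weights are not affected by floating-point rounding.

```python
from fractions import Fraction

def ratio_adjustment_weight(n_nhis, n_panel):
    """Weight given to each Panel case in a match group."""
    return Fraction(n_nhis, n_panel)

# Hypothetical match groups as (NHIS cases, Panel cases) pairs.
match_groups = [(2, 5), (1, 3), (1, 1), (3, 2)]

# One weight per matched Panel case.
panel_weights = [
    ratio_adjustment_weight(n_nhis, n_panel)
    for n_nhis, n_panel in match_groups
    for _ in range(n_panel)
]

# The slide's example: 2 NHIS cases matched to 5 Panel cases.
example = float(ratio_adjustment_weight(2, 5))
```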

  17. Ratio adjustment weights
      - Ratio adjust 1,282 matched Panel cases
        - 47 (4%) had adj. wgt. = 1.00, e.g., 1/1, 2/2, 3/3, etc.
        - 1,216 (95%) had adj. wgt. < 1.00, e.g., 1/2, 1/3, 2/3, etc.
        - 19 (<2%) had adj. wgt. > 1.00, e.g., 2/1, 3/2, 4/3, etc.
      - Sum of ratio adjusted weighted Panel cases = 325
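The reported sum of 325 follows from the weight definition, assuming each matched NHIS case belongs to exactly one match group. The notation here is introduced for illustration and is not from the slides: for match group g, let n_g^NHIS and n_g^Panel be the numbers of NHIS and Panel cases in the group. Each Panel case in the group carries the weight n_g^NHIS / n_g^Panel, so summing over all matched Panel cases gives:

```latex
\sum_{g} n_g^{\text{Panel}} \cdot \frac{n_g^{\text{NHIS}}}{n_g^{\text{Panel}}}
  = \sum_{g} n_g^{\text{NHIS}}
  = 325
```

The weighted Panel total therefore equals the number of matched NHIS cases, and the three weight categories account for all matched Panel cases: 47 + 1,216 + 19 = 1,282.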
