Matching an Internet Panel Sample of Pregnant Women to a Probability Sample
Andrew Burkey (Abt SRBI), Carla L Black (CDC*), Charles DiSogra (Abt SRBI), John Sokolowski (Abt SRBI), Stacie M Greby (CDC*), Helen Ding (CDC*), K.P. Srinath (Abt SRBI), Sarah W Ball (Abt Assoc.), Sara MA Donahue (Abt Assoc.)
AAPOR – May 16, 2015 – Hollywood, FL
DC-AAPOR – August 3, 2015 – Washington, DC
* Immunization Services Division, National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention

Background
Centers for Disease Control and Prevention (CDC) uses influenza vaccination coverage data to:
• Monitor the impact of vaccination programs
• Identify groups in need of vaccination services
Pregnant women (PW) are prioritized for influenza vaccination
• Pregnant women are at increased risk of influenza-related severe illness and hospitalization
• Infants younger than 6 months are at high risk of severe complications from the flu, but they are too young to be vaccinated
Abt SRBI | pg 2

Surveying a rare group: pregnant women
Probability-based survey could be costly and time-consuming
• Low prevalence of PW: <4% of all women
Compromise: Use opt-in Internet panels
Non-probability opt-in panel is fit for purpose
• Can access large numbers of PW with Black, Hispanic over-samples
  • Large sample: approximately 2,000 completes per 2-week period
• Provide information quickly (all Web mode)
  • Short 2-week data collection period in November and in April
• Are affordable for the task
Survey Sampling Inc., opt-in Internet panel used for sample since 2010

Limitations of non-probability sample
• PW are all panel volunteers who use the Internet
• Due to Internet use and multi-survey exposure, PW in survey may be better informed than PW in general population
• Bias in estimate is unknown; poses threat to generalizability
• Standard errors and confidence intervals cannot be used with non-probability samples (AAPOR Task Force Report, June 2013)

Study Objective
Objective
• Explore ways to mitigate the shortcomings of non-probability sample
  • Assess the bias
  • Assign a measure of precision to the estimate
Method
• Match cases from Spring 2013 non-probability PW survey with PW cases from a 2013 national probability survey
Assumption
• Purposely matched opt-in sample can resemble a probability sample
• Variance of estimate approximates that from a probability sample, thus a confidence interval may be derived
• Matched sample estimate may not be similar to the probability sample’s estimate due to existence of inherent bias

Panel bias is assumed
[Figure labels: Matched panel sample; Probability sample]

The comparison probability sample
2013 National Health Interview Survey (NHIS)
• In-person survey of population weighted to population totals
• Data collection throughout the year
• Has some of the same descriptive variables as the Panel sample
To best align with panel sample, NHIS sample limited to women:
• Asked questions about pregnancy
• Interviewed February – July 2013
• Pregnant any time between August 2012 – March 2013
• Responded to questions about influenza vaccination status

Matching variables
• Variables in both surveys and with differences in the distribution between the panel and NHIS (age, race, education) don’t explain differences in vaccination coverage between the samples
• Variables used by other studies not available in both surveys, e.g., political party affiliation, smoking status
• One difference between the panel and NHIS is Internet use
  • Literature suggests that panel members are frequent Internet users
  • Frequent Internet users have attitudinal and behavioral differences compared to the general population

Internet use: observed difference
A difference between the two samples:
• Panel: All Internet users (online respondents – English only)
• NHIS: Internet users and non-users (in-person, general population)
There may be other attitude, behavioral, socio-political differences
Internet use can be defined for both samples
Internet use was thought to be the most logical available measure to define coverage differences between the two samples
Opt-in panelists have been described as more active Internet users

Flu Vaccination rate by Internet use
Vaccination Rate for Panel and NHIS Samples (weighted*)

                     Panel Total   NHIS Total    NHIS Non-daily Internet   NHIS Daily Internet
Vaccination Rate     47.1%         34.9%         31.0%                     36.4%
95% Conf. Interval   -             28.8 - 41.0   18.7 - 43.4               29.4 - 43.4

* Both samples weighted to identical benchmarks
• The Panel sample is missing non-Internet users
• Daily Internet has a higher observed vaccination rate, as does the Panel
• Difference by Internet use suggests a reason why Panel rate is higher

Matching strategy
Internet Use Propensity
• Leverage NHIS Internet use questions for matching
• Develop a propensity score based on frequency of Internet usage
  • “Daily Internet” Use vs. “Less than daily/No Internet” Use

Internet user propensity
Cases used in analysis: 2,035 panel cases + 394 NHIS cases
• Panel cases: We assume 100% to be daily users (for our purpose)
• NHIS cases: 71% daily users, 29% less than daily/non-users
Combined data used in a logistic regression model predicting likelihood of a “Less-than-Daily” user (i.e., propensity)

Internet user propensity
An exploratory investigation pursued for the propensity model
• Eleven variables evaluated for prediction of Less-than-Daily user
• Stepwise analysis found six as significant for use (i.e., p <.05 level)

Effect                    DF   Chi-Square   p
education level           3    96.36        <.0001
race/ethnicity            3    36.80        <.0001
type of phone used        3    20.88        <.001
age                       5    16.93        <.001
home ownership            1    7.01         <.01
child ≤ 4 yrs old in HH   1    4.46         <.05

p >.05 = health insurance, marital status, region, income, employment status
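
The propensity step can be illustrated with a minimal sketch. The slides do not name the software, the variable coding, or the fitted coefficients, so everything below is an assumption for illustration: a plain gradient-descent logistic regression, a single hypothetical dummy predictor (`low_education`), and toy data. The outcome is coded 1 for a "Less-than-Daily" user, so the predicted probability plays the role of the propensity score.

```python
import math

def predict(beta, x):
    """Predicted probability of being a less-than-daily user for one case."""
    z = beta[0] + sum(b * xj for b, xj in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit a logistic regression by batch gradient descent.

    rows   -- feature vectors (dummy-coded 0/1 values), one per case
    labels -- 0/1 outcomes (1 = less-than-daily Internet user)
    Returns [intercept, coef_1, ..., coef_k].
    """
    k = len(rows[0])
    n = len(rows)
    beta = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for x, y in zip(rows, labels):
            err = predict(beta, x) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        beta = [b - lr * g / n for b, g in zip(beta, grad)]
    return beta

# Toy pooled data (hypothetical): one dummy predictor, low_education.
rows = [[1]] * 4 + [[0]] * 6
labels = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

beta = fit_logistic(rows, labels)
score_low_edu = predict(beta, [1])    # propensity for low_education = 1
score_other = predict(beta, [0])      # propensity for low_education = 0
```

With a single dummy predictor the fitted propensities approach the observed group proportions (3 of 4 and 1 of 6 in the toy data); the study's model instead used the six significant effects listed above.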

Vaccination and Internet propensity score
[Figure: Vaccination Rate and Quintiles of Internet Use Propensity. Vaccination rate (y-axis, 0% to 70%) by propensity quintile (x-axis, 1st = High Use to 5th = Low Use), shown for Panel (47%) and NHIS (35%).]
Panel = 2,035 cases; NHIS = 394 cases. Percents rounded to whole numbers

Matching rates
• Matched cases from both samples on their propensity score value
• 82% of NHIS cases matched to 63% of Panel cases

Source   Matched           Did Not Match     Total Cases
         Count   Percent   Count   Percent   Count
Panel    1,282   63%       753     37%       2,035
NHIS     325     82%       69      18%       394
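
The slides say only that cases were matched "on their propensity score value"; the exact rule is not given. The sketch below shows one way such a match could work, under the assumption that two cases match when their scores agree after rounding. The function name, the `decimals` parameter, and the toy scores are all hypothetical.

```python
from collections import defaultdict

def match_on_score(panel_scores, nhis_scores, decimals=3):
    """Group cases whose rounded propensity scores agree.

    panel_scores, nhis_scores -- {case_id: propensity score}
    Returns {rounded_score: (panel_ids, nhis_ids)} for scores that
    occur in both samples; cases with no counterpart are left out.
    """
    groups = defaultdict(lambda: ([], []))
    for case_id, score in panel_scores.items():
        groups[round(score, decimals)][0].append(case_id)
    for case_id, score in nhis_scores.items():
        groups[round(score, decimals)][1].append(case_id)
    return {k: v for k, v in groups.items() if v[0] and v[1]}

# Toy scores (hypothetical).
panel = {"p1": 0.12, "p2": 0.12, "p3": 0.40, "p4": 0.75}
nhis = {"n1": 0.12, "n2": 0.40, "n3": 0.40, "n4": 0.90}

matches = match_on_score(panel, nhis)
matched_panel = sum(len(p) for p, _ in matches.values())
matched_nhis = sum(len(n) for _, n in matches.values())
```

A rule of this kind naturally produces one-to-many, many-to-one, and many-to-many matches, and leaves some cases in each sample unmatched.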

Ratio adjustment weights
Problem
• Single NHIS case matched to multiple Panel cases
• Multiple NHIS cases matched to a single Panel case
• Multiple NHIS cases matched to multiple Panel cases
Solution
A ratio adjustment for each of the matched Panel cases

Ratio Adj. Weight = (number of NHIS cases in a match) / (number of Panel cases in same match)

Example: When 2 NHIS cases match 5 Panel cases
Ratio adjustment weight = 2/5 = 0.40
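
The ratio adjustment can be sketched in a few lines. The function name and the representation of a match as a pair of ID lists are assumptions for illustration; the first toy group reproduces the 2/5 example from the slide.

```python
def ratio_adjustment_weights(match_groups):
    """Assign each matched Panel case the ratio
    (NHIS cases in its match) / (Panel cases in the same match).

    match_groups -- list of (panel_ids, nhis_ids) pairs, one per match
    Returns {panel_id: weight}.
    """
    weights = {}
    for panel_ids, nhis_ids in match_groups:
        w = len(nhis_ids) / len(panel_ids)
        for pid in panel_ids:
            weights[pid] = w
    return weights

# Toy matches (hypothetical IDs).
groups = [
    (["p1", "p2", "p3", "p4", "p5"], ["n1", "n2"]),  # 2 NHIS : 5 Panel
    (["p6"], ["n3", "n4"]),                          # 2 NHIS : 1 Panel
    (["p7", "p8"], ["n5", "n6"]),                    # 2 NHIS : 2 Panel
]

weights = ratio_adjustment_weights(groups)
total_weight = sum(weights.values())
```

Within each match the Panel weights add up to the number of NHIS cases in that match, so the weighted Panel total equals the number of matched NHIS cases (6 in the toy data).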

Ratio adjustment weights
Ratio adjust 1,282 matched Panel cases
• 47 (4%) had adj. wgt. =1.00, e.g., 1/1, 2/2, 3/3, etc.
• 1,216 (95%) had adj. wgt. <1.00, e.g., 1/2, 1/3, 2/3, etc.
• 19 (<2%) had adj. wgt. >1.00, e.g., 2/1, 3/2, 4/3, etc.
Sum of ratio adjusted weighted Panel cases = 325
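
A short derivation suggests why the weighted Panel total equals the matched NHIS count of 325. It assumes every matched case belongs to exactly one match group; the group index g and the counts n are notation introduced here, not taken from the slides.

```latex
\sum_{g} \sum_{i \in \text{Panel}_g} \frac{n^{\text{NHIS}}_g}{n^{\text{Panel}}_g}
  = \sum_{g} n^{\text{Panel}}_g \cdot \frac{n^{\text{NHIS}}_g}{n^{\text{Panel}}_g}
  = \sum_{g} n^{\text{NHIS}}_g
  = 325
```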