Arnošt Komárek
Dept. of Probability and Mathematical Statistics
Regression modelling of misclassified correlated interval-censored data
Workshop on Flexible Models for Longitudinal and Survival Data with Applications in Biostatistics
Warwick, July 27 – 29, 2015
Joint work with María José García-Zattera and Alejandro Jara
Pontificia Universidad Católica de Chile, Santiago de Chile
Outline
1 Misclassified interval-censored data.
2 Model for misclassified interval-censored data.
  a Misclassification model.
  b Event-time model.
3 Estimation and inference.
4 Simulation study.
5 Model comparison.
6 The Signal Tandmobiel® study.
7 Summary and conclusions.
Part I Misclassified interval-censored data
Motivating dataset: the Signal Tandmobiel® study
• Longitudinal dental study, Flanders (Belgium), 1996 – 2001.
• 2 315 boys and 2 153 girls followed from 7 until 12 years of age (primary school time).
• Annual dental examinations.
• Sixteen trained dental examiners.
• In general, each child was examined by a different examiner each year.
⇒ Clinical data.
• Data on oral hygiene and dietary habits.
Main aim
Model the relationship between time to caries experience (CE) and potential risk factors.
• Gender (boys vs. girls).
• Presence of sealants.
• Frequency of brushing (daily / not daily).
• Geographical location.
Caries experience (CE)
[Figure: stages of caries experience, with labels "Reversible" and "Caries (Irreversible)".]
Statistical modelling challenges
• CE is a progressive disease ⇒ we deal with a monotone 0/1 process.
• CE status is checked only at discrete occasions (visits/dental examinations) ⇒ interval censoring.
• Teeth in one mouth share a common environment, genetic predispositions, ... ⇒ dependence among the processes on different teeth in one mouth.
CE process & interval censoring
[Figure: true CE status $Y_{(i,j)}(t)$ against time $t$, stepping from 0 to 1 at $T_{(i,j)}$; recorded status 0, 0, 1, 1 at visits $v_{(i,1)}, v_{(i,2)}, v_{(i,3)}, v_{(i,4)}$.]
$T_{(i,j)} \in (v_{(i,2)}, v_{(i,3)}]$, $\boldsymbol{Y}_{(i,j)} = (0, 0, 1, 1)^\top$.
Summary of notation
• $T_{(i,j)}$: event (CE) time of tooth $j$ on subject $i$, $i = 1, \dots, N$, $j = 1, \dots, J$.
• $Y_{(i,j)}(t)$: 0/1 CE status of tooth $(i,j)$ at time $t$.
• $\boldsymbol{x}_{(i,j)}$: potential risk factors, covariates to explain $T_{(i,j)}$ $\bigl(Y_{(i,j)}(t)\bigr)$.
• $0 = v_{(i,0)} < v_{(i,1)} < v_{(i,2)} < \cdots < v_{(i,K_i)} < v_{(i,K_i+1)} = \infty$: visit times (of dental examinations) for subject $i$.
• $\boldsymbol{Y}_{(i,j)} = \bigl(Y_{(i,j,1)}, \dots, Y_{(i,j,K_i)}\bigr)^\top$: recorded 0/1 CE status of tooth $(i,j)$ at the performed visits.
Interval-censored data
Interest in: regression $T_{(i,j)} \sim \boldsymbol{x}_{(i,j)}$ ≡ $Y_{(i,j)}(t) \sim \boldsymbol{x}_{(i,j)}$.
Observed data:
• Monotone 0/1 sequence $\boldsymbol{Y}_{(i,j)} = \bigl(Y_{(i,j,1)}, \dots, Y_{(i,j,K_i)}\bigr)^\top$ together with the visit times $v_{(i,1)}, \dots, v_{(i,K_i)}$.
≡ Intervals $(L_{(i,j)}, U_{(i,j)}]$ such that $T_{(i,j)} \in (L_{(i,j)}, U_{(i,j)}]$ and $L_{(i,j)}, U_{(i,j)} \in \bigl\{0, v_{(i,1)}, \dots, v_{(i,K_i)}, \infty\bigr\}$.
$L_{(i,j)}$: the last visit time when $Y_{(i,j,*)} = 0$,
$U_{(i,j)}$: the first visit time when $Y_{(i,j,*)} = 1$.
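The mapping from a monotone recorded sequence to its censoring interval can be sketched in a few lines of Python. This is only an illustration: the function name `censoring_interval` and the list-based inputs are assumptions made here, not part of the talk.

```python
import math


def censoring_interval(y, visits):
    """Return (L, U) such that T lies in (L, U] for a monotone 0/1 sequence.

    y      : recorded 0/1 statuses, one per visit (hypothetical input format)
    visits : strictly increasing visit times, same length as y
    """
    if len(y) != len(visits):
        raise ValueError("y and visits must have the same length")
    # A correctly classified progressive process never goes from 1 back to 0.
    if any(a > b for a, b in zip(y, y[1:])):
        raise ValueError("sequence is not monotone")
    lower, upper = 0.0, math.inf
    for status, v in zip(y, visits):
        if status == 0:
            lower = v      # last visit with status 0 seen so far
        else:
            upper = v      # first visit with status 1
            break
    return lower, upper
```

For the sequence (0, 0, 1, 1) the function returns the second and third visit times. A sequence of zeros only gives a right-censored interval with upper limit infinity, and a sequence starting with 1 gives an interval with lower limit 0.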
Life is not so easy...
• Diagnosis of CE is difficult and somewhat subjective
  ⇒ misclassification in the recorded values $Y_{(i,j,1)}, \dots, Y_{(i,j,K_i)}$,
  ⇒ sensitivity and specificity of the diagnostic test for caries are not equal to one.
Misclassified correlated interval-censored data.
CE process & misclassified interval-censored data
[Figure: true CE status $Y_{(i,j)}(t)$ against time $t$, stepping from 0 to 1 at $T_{(i,j)}$; recorded status 0, 0, 0, 1 at visits $v_{(i,1)}, v_{(i,2)}, v_{(i,3)}, v_{(i,4)}$.]
$T_{(i,j)} \in (v_{(i,3)}, v_{(i,4)}]$ really?, $\boldsymbol{Y}_{(i,j)} = (0, 0, 0, 1)^\top$.
CE process & misclassified interval-censored data
[Figure: true CE status $Y_{(i,j)}(t)$ against time $t$, stepping from 0 to 1 at $T_{(i,j)}$; recorded status 0, 1, 0, 1 at visits $v_{(i,1)}, v_{(i,2)}, v_{(i,3)}, v_{(i,4)}$.]
$T_{(i,j)} \in$ ???, $\boldsymbol{Y}_{(i,j)} = (0, 1, 0, 1)^\top$.
Misclassified interval-censored data
Interest in: regression $T_{(i,j)} \sim \boldsymbol{x}_{(i,j)}$ ≡ $Y_{(i,j)}(t) \sim \boldsymbol{x}_{(i,j)}$.
Observed data:
• $T_{(i,j)}$ $\bigl(Y_{(i,j)}(t)\bigr)$ observed only indirectly through $\boldsymbol{Y}_{(i,j)} = \bigl(Y_{(i,j,1)}, \dots, Y_{(i,j,K_i)}\bigr)^\top$
  ⇒ a not necessarily monotone sequence of 0/1, possibly misclassified, CE status indicators from visits performed at times $v_{(i,1)}, \dots, v_{(i,K_i)}$.
Study design that leads to misclassified interval-censored data
• Longitudinal follow-up.
• Event status checked at pre-specified time points.
  – Assumption here: visit times independent of the event time.
• Occurrence of event is determined by a diagnostic test (with possibly imperfect sensitivity and/or specificity).
  – Frequent for many non-death events.
  – Nevertheless, data are mostly analyzed as if both sensitivity and specificity are equal to one and hence there is no event status misclassification.
• Follow-up is not scheduled to stop after the first positive result.
  – Frequent in longitudinal studies where the event is not the primary study outcome.
Principal questions
Using just the observed data $\boldsymbol{Y}_{(i,j)}$:
1 Can we do valid statistical inference on the time to event $T_{(i,j)}$ in the presence of event misclassification even if no external information is available on the magnitude of the misclassification?
  • No external information on the sensitivity/specificity values.
2 Can we evaluate the magnitude of misclassification?
  • Can we estimate the sensitivity/specificity of the event classification?
3 Do we get valid inference on the time to event $T_{(i,j)}$ if misclassification is ignored and it is assumed that $T_{(i,j)}$ lies in the first “possible” observed interval?
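To make the third question concrete, here is a minimal sketch of one reading of the "first possible interval" rule: the event is placed between the first visit with a recorded 1 and the visit immediately before it, and all later recordings are ignored. The function name `naive_interval` and this exact reading of the rule are assumptions made for illustration.

```python
import math


def naive_interval(y, visits):
    """Naive (L, U) that ignores misclassification.

    Assumed rule: U is the first visit with a recorded 1 and L is the visit
    just before it (0 if there is none); later recordings are not used.
    """
    for k, status in enumerate(y):
        if status == 1:
            lower = visits[k - 1] if k > 0 else 0.0
            return lower, visits[k]
    # No positive recording at all: treated as right-censored at the last visit.
    last = visits[-1] if visits else 0.0
    return last, math.inf
```

Under this rule the non-monotone sequence (0, 1, 0, 1) is assigned the interval between the first and second visits, although the recorded 0 at the third visit suggests that the recorded 1 at the second visit may be a false positive.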
Part II Modelling approach
Hierarchical model
Hierarchically specified model (likelihood) for the observed data $\boldsymbol{Y}_i = \bigl(\boldsymbol{Y}_{(i,1)}, \dots, \boldsymbol{Y}_{(i,J)}\bigr)$.
Start with a joint likelihood of the unobservable $\boldsymbol{T}_i$ and the observed $\boldsymbol{Y}_i$:
$p(\boldsymbol{Y}_i, \boldsymbol{T}_i) = p(\boldsymbol{Y}_i \mid \boldsymbol{T}_i)\, p(\boldsymbol{T}_i)$.
• $p(\boldsymbol{Y}_i \mid \boldsymbol{T}_i)$: (mis)classification model ⇒ visit times $v_{(i,1)}, \dots, v_{(i,K_i)}$ act as covariates here.
• $p(\boldsymbol{T}_i)$: survival model for (correlated) event times ⇒ risk factors $\boldsymbol{x}_{(i,1)}, \dots, \boldsymbol{x}_{(i,J)}$ act as covariates here.
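The factorization can be illustrated for a single tooth with a small Python sketch. It rests on simplifying assumptions that are not taken from the slides: one sensitivity `sens` and one specificity `spec` shared by all visits and examiners, recorded statuses that are conditionally independent given the event time, and no dependence among teeth. The function names and the `survival` argument (any survival function S(t) with S(0) = 1) are hypothetical.

```python
import math


def classification_prob(y, true_status, sens, spec):
    """P(recorded sequence | true statuses) under conditional independence."""
    p = 1.0
    for rec, tru in zip(y, true_status):
        if tru == 1:
            p *= sens if rec == 1 else 1.0 - sens   # true positive / false negative
        else:
            p *= spec if rec == 0 else 1.0 - spec   # true negative / false positive
    return p


def marginal_likelihood(y, visits, survival, sens, spec):
    """P(Y = y), obtained by summing p(y | T) p(T) over the visit intervals.

    p(y | T) depends on T only through the interval between consecutive
    visits that contains T, so the integral over T reduces to a finite sum.
    """
    grid = [0.0] + list(visits) + [math.inf]
    total = 0.0
    for k in range(1, len(grid)):
        # Probability mass of T in (grid[k-1], grid[k]] under the survival model.
        s_lo = survival(grid[k - 1])
        s_hi = 0.0 if math.isinf(grid[k]) else survival(grid[k])
        mass = s_lo - s_hi
        # With T in that interval, the true status at visit m is 1 iff m >= k.
        true_status = [1 if m >= k else 0 for m in range(1, len(visits) + 1)]
        total += mass * classification_prob(y, true_status, sens, spec)
    return total
```

With `sens = spec = 1` the sum collapses to the usual interval-censored contribution S(L) - S(U) for a monotone sequence and to zero for a non-monotone one. With imperfect classification, every recorded sequence, monotone or not, receives positive probability.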