An R package for analyzing truncated data
Carla Moreira 1 ∗, Jacobo de Uña-Álvarez 1 and Rosa M. Crujeiras 2
1 Department of Statistics and OR, University of Vigo
2 Department of Statistics and OR, University of Santiago de Compostela
∗ carlamgmm@gmail.com
Outline
1. Introduction
2. Algorithms for DTD
3. Package description
4. Conclusions
Moreira et al., useR! 2009, DTDA package
Motivation examples
- Astronomy
- Epidemiology
- Economy
- Survival Analysis
In these cases, we must apply specialized statistical models and methods because of the need to accommodate losses in the sample, such as grouping, censoring or truncation.
Truncation Scheme
[Figure: truncation scheme; the observational window spans t1 to t2]
Truncation Scheme
Let $X^*$ be the ultimate time of interest, with df $F$, and let $(U^*, V^*)$ be the pair of truncation times, with joint df $K$.
We observe $(U^*, X^*, V^*)$ if and only if $U^* \le X^* \le V^*$.
Let $(U_i, X_i, V_i)$, $i = 1, \dots, n$, be the observed data.
Under the assumption of independence between $X^*$ and $(U^*, V^*)$, the full likelihood is given by:
$$L_n(f, k) = \prod_{j=1}^{n} \frac{f_j k_j}{\sum_{i=1}^{n} F_i k_i}$$
Truncation Scheme
Where: $f = (f_1, f_2, \dots, f_n)$, $k = (k_1, k_2, \dots, k_n)$,
$$F_i = \sum_{m=1}^{n} f_m J_{im}$$
and $J_{im} = I[U_i \le X_m \le V_i]$, which equals 1 if $U_i \le X_m \le V_i$ and zero otherwise.
As noted by Shen (2008):
$$L_n(f, k) = \prod_{j=1}^{n} \frac{f_j}{F_j} \times \prod_{j=1}^{n} \frac{F_j k_j}{\sum_{i=1}^{n} F_i k_i} = L_1(f) \times L_2(f, k)$$
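The factorization of the full likelihood into a conditional part L_1(f) and a remainder L_2(f, k) can be checked numerically. The following is a minimal Python sketch, not code from the DTDA package: the function names, the toy triples and the probability masses f and k are all hypothetical values chosen for illustration.

```python
import math

def truncation_indicators(x, u, v):
    """J[i][m] = 1 if u[i] <= x[m] <= v[i], else 0."""
    n = len(x)
    return [[1.0 if u[i] <= x[m] <= v[i] else 0.0 for m in range(n)]
            for i in range(n)]

def log_likelihoods(f, k, J):
    """Return (log L_n, log L_1, log L_2) for masses f, k and indicators J."""
    n = len(f)
    # F_i = sum_m f_m J_im
    F = [sum(f[m] * J[i][m] for m in range(n)) for i in range(n)]
    denom = sum(F[i] * k[i] for i in range(n))
    log_full = sum(math.log(f[j] * k[j] / denom) for j in range(n))
    log_l1 = sum(math.log(f[j] / F[j]) for j in range(n))
    log_l2 = sum(math.log(F[j] * k[j] / denom) for j in range(n))
    return log_full, log_l1, log_l2

# Hypothetical toy data: three observed triples (U_i, X_i, V_i).
x = [1.0, 2.0, 3.0]
u = [0.0, 0.0, 2.0]
v = [2.0, 3.0, 4.0]
J = truncation_indicators(x, u, v)

# Arbitrary (hypothetical) probability masses, each summing to one.
f = [0.2, 0.5, 0.3]
k = [0.3, 0.3, 0.4]
log_full, log_l1, log_l2 = log_likelihoods(f, k, J)
```

On the log scale the product becomes a sum, so log_full should agree with log_l1 + log_l2 up to floating-point error for any positive f and k.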
Efron-Petrosian estimators
The conditional NPMLE of $F$ (Efron-Petrosian, 1999) is defined as the maximizer of $L_1(f)$. It satisfies
$$\frac{1}{\hat{f}_j} = \sum_{i=1}^{n} J_{ij} \times \frac{1}{\hat{F}_i}, \quad j = 1, \dots, n \tag{1}$$
where
$$\hat{F}_i = \sum_{m=1}^{n} \hat{f}_m J_{im}.$$
This equation was used by Efron and Petrosian (1999) to introduce the EM algorithm to compute $\hat{f}$.
EM algorithm from Efron and Petrosian (1999)
EP1. Compute the initial estimate $\hat{F}^{(0)}$ corresponding to $\hat{f}^{(0)} = (1/n, \dots, 1/n)$;
EP2. Apply (1) to get an improved estimator $\hat{f}^{(1)}$ and compute the $\hat{F}^{(1)}$ pertaining to $\hat{f}^{(1)}$;
EP3. Repeat Step EP2 until a convergence criterion is reached.
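Steps EP1 to EP3 can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the code of the efron.petrosian function: the function name, the tolerance, the iteration cap and the stopping rule (largest change in f between iterations) are hypothetical choices. The update 1/f_j = sum_i J_ij / F_i only determines f up to scale, so the sketch also assumes a renormalization to total mass one after each update.

```python
def efron_petrosian_sketch(x, u, v, tol=1e-8, max_iter=1000):
    """Iterate 1/f_j = sum_i J_ij / F_i, starting from uniform masses.

    Assumes u[i] <= x[i] <= v[i] for every i, so that every F_i is positive.
    """
    n = len(x)
    # J[i][m] = 1 if u[i] <= x[m] <= v[i], else 0
    J = [[1.0 if u[i] <= x[m] <= v[i] else 0.0 for m in range(n)]
         for i in range(n)]
    f = [1.0 / n] * n  # EP1: uniform initial masses
    for iteration in range(1, max_iter + 1):
        # F_i = sum_m f_m J_im, computed from the current f
        F = [sum(f[m] * J[i][m] for m in range(n)) for i in range(n)]
        # EP2: improved masses from the self-consistency equation
        f_new = [1.0 / sum(J[i][j] / F[i] for i in range(n))
                 for j in range(n)]
        total = sum(f_new)
        f_new = [w / total for w in f_new]  # assumption: renormalize to sum 1
        change = max(abs(a - b) for a, b in zip(f_new, f))
        f = f_new
        if change < tol:  # EP3: hypothetical convergence criterion
            break
    return f, iteration

# Hypothetical toy data: three observed triples (U_i, X_i, V_i).
f_hat, n_iter = efron_petrosian_sketch([1.0, 2.0, 3.0],
                                       [0.0, 0.0, 2.0],
                                       [2.0, 3.0, 4.0])
```

When every truncation interval covers every observation, all indicators equal one and the sketch returns the uniform masses 1/n after a single pass, as one would expect in the absence of truncation.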
Shen Estimator
Interchanging the roles of the $X_i$'s and the $(U_i, V_i)$'s:
$$L_n(f, k) = \prod_{j=1}^{n} \frac{k_j}{K_j} \times \prod_{j=1}^{n} \frac{K_j f_j}{\sum_{i=1}^{n} K_i f_i} = L_1(k) \times L_2(k, f)$$
where
$$K_i = \sum_{m=1}^{n} k_m I[U_m \le X_i \le V_m] = \sum_{m=1}^{n} k_m J_{mi}$$
and maximizing $L_1(k)$:
$$\frac{1}{\hat{k}_j} = \sum_{i=1}^{n} J_{ji} \frac{1}{\hat{K}_i}, \quad j = 1, \dots, n$$
with
$$\hat{K}_i = \sum_{m=1}^{n} \hat{k}_m J_{mi}.$$
Shen Estimator
Shen (2008) showed that the solutions are the unconditional NPMLEs of $F$ and $K$, respectively, and both estimators can be obtained by:
$$\hat{f}_j = \left[ \sum_{i=1}^{n} \frac{1}{\hat{K}_i} \right]^{-1} \frac{1}{\hat{K}_j}, \quad j = 1, \dots, n \tag{3}$$
$$\hat{k}_j = \left[ \sum_{i=1}^{n} \frac{1}{\hat{F}_i} \right]^{-1} \frac{1}{\hat{F}_j}, \quad j = 1, \dots, n \tag{4}$$
EM algorithm from Shen (2008)
S1. Compute the initial estimate $\hat{F}^{(0)}$ corresponding to $\hat{f}^{(0)} = (1/n, \dots, 1/n)$;
S2. Apply (4) to get the first step estimator $\hat{k}^{(1)}$ and compute the $\hat{K}^{(1)}$ pertaining to $\hat{k}^{(1)}$;
S3. Apply (3) to get the first step estimator $\hat{f}^{(1)}$ and its corresponding $\hat{F}^{(1)}$;
S4. Repeat Steps S2 and S3 until a convergence criterion is reached.
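Steps S1 to S4 alternate between the masses k of the truncation pairs and the masses f of the times of interest. The Python sketch below illustrates that alternation; it is not the code of the shen function. The function name, tolerance, iteration cap and stopping rule are hypothetical, and K_i is computed as the sum over m of k_m I[U_m <= X_i <= V_m], following the indicator definition given for the truncation scheme.

```python
def shen_sketch(x, u, v, tol=1e-8, max_iter=1000):
    """Alternate the updates k_j ~ 1/F_j and f_j ~ 1/K_j, each normalized.

    Assumes u[i] <= x[i] <= v[i] for every i, so that F_i and K_i are positive.
    """
    n = len(x)
    # J[i][m] = 1 if u[i] <= x[m] <= v[i], else 0
    J = [[1.0 if u[i] <= x[m] <= v[i] else 0.0 for m in range(n)]
         for i in range(n)]
    f = [1.0 / n] * n  # S1: uniform initial masses
    k = [1.0 / n] * n
    for iteration in range(1, max_iter + 1):
        # F_i = sum_m f_m I[U_i <= X_m <= V_i]
        F = [sum(f[m] * J[i][m] for m in range(n)) for i in range(n)]
        # S2: k_j = (sum_i 1/F_i)^(-1) * 1/F_j
        inv_f_sum = sum(1.0 / Fi for Fi in F)
        k = [(1.0 / F[j]) / inv_f_sum for j in range(n)]
        # K_i = sum_m k_m I[U_m <= X_i <= V_m]
        K = [sum(k[m] * J[m][i] for m in range(n)) for i in range(n)]
        # S3: f_j = (sum_i 1/K_i)^(-1) * 1/K_j
        inv_k_sum = sum(1.0 / Ki for Ki in K)
        f_new = [(1.0 / K[j]) / inv_k_sum for j in range(n)]
        change = max(abs(a - b) for a, b in zip(f_new, f))
        f = f_new
        if change < tol:  # S4: hypothetical convergence criterion
            break
    return f, k, iteration

# Hypothetical toy data: three observed triples (U_i, X_i, V_i).
f_hat, k_hat, n_iter = shen_sketch([1.0, 2.0, 3.0],
                                   [0.0, 0.0, 2.0],
                                   [2.0, 3.0, 4.0])
```

Both updates are normalized by construction, so f and k each sum to one after every pass and no separate renormalization step is needed.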
DTDA-package
- efron.petrosian(X,...)
- lynden(X,...)
- shen(X,...)
Three example data sets with X ∼ Unif(0,1) and:
Ex.1 U ∼ Unif(0,0.5), V ∼ Unif(0.5,1)
Ex.2 U ∼ Unif(0,0.25), V ∼ Unif(0.75,1)
Ex.3 U ∼ Unif(0,0.67), V ∼ Unif(0.33,1)
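Data sets of this kind can be generated by rejection: draw (U, X, V) independently and keep the triple only when U ≤ X ≤ V. The Python sketch below does this for the Ex.1 setting. It is an illustration only; the function name, the sample size and the seed are hypothetical, and the slides do not say how the original example data were produced.

```python
import random

def simulate_doubly_truncated(n, u_range, v_range, seed=0):
    """Return n observed triples (u, x, v) and the number of draws needed."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    sample = []
    draws = 0
    while len(sample) < n:
        draws += 1
        x = rng.uniform(0.0, 1.0)   # time of interest, X ~ Unif(0,1)
        u = rng.uniform(*u_range)   # left truncation time
        v = rng.uniform(*v_range)   # right truncation time
        if u <= x <= v:             # the triple is observed only in this case
            sample.append((u, x, v))
    return sample, draws

# Ex.1: U ~ Unif(0,0.5), V ~ Unif(0.5,1)
ex1, draws1 = simulate_doubly_truncated(2000, (0.0, 0.5), (0.5, 1.0))
retained_fraction = len(ex1) / draws1
```

For Ex.1 and Ex.2, where U ≤ V always holds, the probability of observing a triple is E[V] - E[U], that is 0.5 and 0.75, which corresponds to truncation proportions of 50% and 25%.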
efron.petrosian illustration under double truncation
Ex.1 (50% of truncation): efron.petrosian(X,U,V,...)
Output components: iter, f, FF, S, Sob, upperF, lowerF, upperS, lowerS
[Figure: two panels, "EP estimator" and "Survival", each plotted against "Time of interest"]
efron.petrosian illustration under left truncation
Ex.1: efron.petrosian(X,U,...)
[Figure: two panels, "EP estimator" and "Survival", each plotted against "Time of interest"]
lynden illustration under double truncation
Ex.2 (25% of truncation): lynden(X,U,V,...)
Output components: iter, NJ, f, FF, h, S, Sob, upperF, lowerF, upperS, lowerS
[Figure: two panels, "EP estimator" and "Survival", each plotted against "Time of interest"]
lynden illustration under right truncation
Ex.2: lynden(X,V,...)
[Figure: one panel, "Survival", plotted against "Time of interest"]
shen illustration under double truncation
Ex.3 (67% of truncation): shen(X,U,V,...)
Output components: iter, f, FF, S, Sob, k, fU, fV, upperF, lowerF, upperS, lowerS
[Figure: four panels, "Shen estimator", "Survival", "Marginal U" and "Marginal V", each plotted against "Time of interest"]
Summary
The DTDA package provides different algorithms for analyzing randomly truncated data; both one-sided and two-sided (i.e. doubly) truncated data are allowed.
This package incorporates the functions efron.petrosian, lynden and shen, which call the iterative methods introduced by Efron and Petrosian (1999) and Shen (2008).