Some algorithms to fit some reliability mixture models under censoring
Laurent Bordes (University of Pau), Didier Chauveau (University of Orléans)
COMPSTAT, August 22-27, 2010
Laurent Bordes, Fitting some reliability mixture models, 27 August 2010
Table of contents
1 Reliability mixture models
2 Some real data sets
3 Parametric EM-algorithm
4 Parametric stochastic EM-algorithm
5 Semiparametric stochastic EM-algorithm
Reliability mixture models: About lifetimes
The lifetime data are assumed to come from a finite mixture of $m$ component densities $f_j$, $j = 1, \dots, m$, where $f_j(\cdot) = f(\cdot \mid \xi_j) \in \mathcal{F}$, a parametric family indexed by a Euclidean parameter $\xi$. The lifetime density of an observation $X$ may be written
$$X \sim g(x \mid \theta) = \sum_{j=1}^{m} \lambda_j f(x \mid \xi_j),$$
where $\theta = (\lambda, \xi) = (\lambda_1, \dots, \lambda_m, \xi_1, \dots, \xi_m)$.
Latent variable representation: $X = Y_Z$ where $Z \sim \mathrm{Mult}(1, \lambda)$ and $(Y_Z \mid Z = j) \sim f(\cdot \mid \xi_j)$.
For references on the broad literature of mixture models, see McLachlan and Peel (2000).
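The latent variable representation can be illustrated with a minimal sketch that first draws the component label and then the lifetime. The exponential family, the parameter values and the helper name `sample_mixture` are assumptions chosen for illustration, not part of the slides.

```python
import random

def sample_mixture(n, lam, xi, rng):
    """Draw n lifetimes from sum_j lam[j] * Exp(rate=xi[j]) via X = Y_Z."""
    xs, zs = [], []
    for _ in range(n):
        # Z ~ Mult(1, lam): pick the component of origin.
        z = rng.choices(range(len(lam)), weights=lam)[0]
        # (Y_Z | Z = z) ~ f(. | xi_z); here an exponential with rate xi[z].
        xs.append(rng.expovariate(xi[z]))
        zs.append(z)
    return xs, zs

rng = random.Random(0)
# Hypothetical two-component example.
x, z = sample_mixture(5000, lam=[1 / 3, 2 / 3], xi=[1.0, 0.2], rng=rng)
```

In practice the labels `z` are not observed; they are returned here only to make the role of the missing data explicit.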
Reliability mixture models: Right censored data
The censoring process is described by a random variable $C$ with density function $q$, distribution function $Q$ and survival function $\bar{Q}$. In the right censoring setup the only available information is
$$T = \min(X, C), \qquad D = I(X \le C).$$
The $n$ lifetime data are $x = (x_1, \dots, x_n)$, iid from $g$, associated with $n$ censoring times $c = (c_1, \dots, c_n)$, iid copies of $C$. The observations are thus
$$(t, d) = ((t_1, d_1), \dots, (t_n, d_n)),$$
where $t_i = \min(x_i, c_i)$ and $d_i = I(x_i \le c_i)$.
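The construction of the observations $(t, d)$ from lifetimes and censoring times can be sketched as follows. The exponential distributions and rates used to generate `x` and `c` are hypothetical, and `right_censor` is an illustrative helper name.

```python
import random

def right_censor(x, c):
    """Return (t, d) with t_i = min(x_i, c_i) and d_i = I(x_i <= c_i)."""
    t = [min(xi, ci) for xi, ci in zip(x, c)]
    d = [1 if xi <= ci else 0 for xi, ci in zip(x, c)]
    return t, d

rng = random.Random(1)
x = [rng.expovariate(0.5) for _ in range(1000)]  # hypothetical lifetimes
c = [rng.expovariate(0.2) for _ in range(1000)]  # hypothetical censoring times
t, d = right_censor(x, c)
censoring_rate = 1 - sum(d) / len(d)  # fraction of censored observations
```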
Reliability mixture models: Complete data choice
The observed data $(t, d)$ depend on $x$, which comes from a finite mixture ⇒ missing data are naturally associated with it.
To these incomplete data are associated complete data, which correspond to the situation where the component of origin $z_i \in \{1, \dots, m\}$ of each individual lifetime $x_i$ is known.
The complete model at the level of $(X, Z)$ is given by $P_\theta(Z = z) = \lambda_z$ and $(X \mid Z = z) \sim f_z$.
With the right censoring process the complete data are $(t, d, z)$, where $z = (z_1, \dots, z_n)$.
Remark. As in Chauveau (1995), the complete data can be $(x, z)$ instead of $(t, d, z)$.
Reliability mixture models: Complete data pdf
Because we have
$$\begin{aligned}
f^c_\theta(T = t, D = 1, Z = z) &= P_\theta(Z = z)\, f_\theta(D = 1, T = t \mid Z = z) \\
&= \lambda_z\, f_\theta(C \ge X, X = t \mid z) \\
&= \lambda_z\, P_\theta(C \ge t)\, f_\theta(X = t \mid z) \\
&= \lambda_z\, f_z(t)\, \bar{Q}(t),
\end{aligned}$$
and similarly $f^c_\theta(t, 0, z) = \lambda_z \bar{F}_z(t)\, q(t)$, the complete data pdf is summarized by
$$f^c(t, d, z \mid \theta) = \left[\lambda_z f(t \mid \xi_z)\, \bar{Q}(t)\right]^{d} \left[\lambda_z \bar{F}(t \mid \xi_z)\, q(t)\right]^{1-d}.$$
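For completeness, the censored case ($d = 0$) can be worked out in the same way as the uncensored one. The sketch below assumes, as the uncensored derivation implicitly does, that $C$ is independent of $(X, Z)$.

```latex
\begin{aligned}
f^c_\theta(T = t, D = 0, Z = z)
  &= P_\theta(Z = z)\, f_\theta(D = 0, T = t \mid Z = z) \\
  &= \lambda_z\, f_\theta(X > C,\ C = t \mid z) \\
  &= \lambda_z\, P_\theta(X > t \mid z)\, q(t) \\
  &= \lambda_z\, \bar{F}_z(t)\, q(t).
\end{aligned}
```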
Some real data sets: Acute Myelogenous Leukemia survival data (Miller, 1997)
Group effect with two groups.
Sample size: 23
Censored lifetimes: 5

group            scale estimation
Maintained       63.3
Nonmaintained    25.1

Variables    Description
time         survival or censoring time
status       censoring status
x            maintenance chemotherapy given

[Figure: plot of the AML data; horizontal axis from 0 to 150.]
Some real data sets: Lifetimes of diesel engines fans (Nelson, 1982)
Time scale: 1000s of hours
Sample size: 70
Censored lifetimes: 12

Variables    Description
time         survival or censoring time
status       censoring status

[Figure: plot of the fan data; horizontal axis from 0 to 10000.]
[Figure: reliability estimation, R against time (1000 hours), comparing the Kaplan-Meier and 1-Weibull estimates.]
Parametric EM-algorithm: complete data = $(t, d, z)$
Usual missing data framework (Dempster, Laird and Rubin, 1977) ⇒ define an EM algorithm that generates a sequence $(\theta^k)_{k = 1, 2, \dots}$ (with arbitrary initial value $\theta^0$) by iteratively maximizing
$$Q(\theta \mid \theta^k) = E\left[\log f^c(t, d, Z \mid \theta) \mid t, d, \theta^k\right] = \sum_{i=1}^{n} E\left[\log f^c(t_i, d_i, Z_i \mid \theta) \mid t_i, d_i, \theta^k\right].$$
Calculation of $Q(\theta \mid \theta^k)$ requires calculation of the following posterior probabilities:
$$p^k_{ij} := P(Z_i = j \mid t_i, d_i, \theta^k) = \lambda^k_j \left[\frac{f(t_i \mid \xi^k_j)}{\sum_{\ell=1}^{m} \lambda^k_\ell f(t_i \mid \xi^k_\ell)}\right]^{d_i} \left[\frac{\bar{F}(t_i \mid \xi^k_j)}{\sum_{\ell=1}^{m} \lambda^k_\ell \bar{F}(t_i \mid \xi^k_\ell)}\right]^{1-d_i}. \qquad (1)$$
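Equation (1) can be sketched in code as a function that takes the component density and survival function as arguments. The function name `posterior_probs`, the exponential family and the numerical values are assumptions made for illustration.

```python
import math

def posterior_probs(t, d, lam, xi, pdf, surv):
    """p[i][j] = P(Z_i = j | t_i, d_i, theta), following Equation (1)."""
    p = []
    for ti, di in zip(t, d):
        # Uncensored points contribute a density, censored ones a survival term.
        g = pdf if di == 1 else surv
        w = [l * g(ti, x) for l, x in zip(lam, xi)]
        total = sum(w)
        p.append([wj / total for wj in w])
    return p

def exp_pdf(t, rate):
    return rate * math.exp(-rate * t)

def exp_surv(t, rate):
    return math.exp(-rate * t)

# Two hypothetical observations: one failure at 0.5, one censored at 4.0.
p = posterior_probs([0.5, 4.0], [1, 0], [1 / 3, 2 / 3], [1.0, 0.2],
                    exp_pdf, exp_surv)
```

Each row of `p` sums to one; the observation censored at 4.0 is attributed mostly to the component with the smaller rate, since long survival is more plausible under it.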
Parametric EM-algorithm: exponential lifetimes, complete data = $(t, d, z)$
EM algorithm: $\theta^k \to \theta^{k+1}$
1 E-step: Calculate the posterior probabilities $p^k_{ij}$ as in Equation (1), for all $i = 1, \dots, n$ and $j = 1, \dots, m$.
2 M-step: Set
$$\lambda^{k+1}_j = \frac{1}{n} \sum_{i=1}^{n} p^k_{ij} \quad \text{for } j = 1, \dots, m,$$
$$\xi^{k+1}_j = \frac{\sum_{i=1}^{n} p^k_{ij}\, d_i}{\sum_{i=1}^{n} p^k_{ij}\, t_i} \quad \text{for } j = 1, \dots, m.$$
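The two steps above can be sketched as a short loop. This is a minimal illustration, not the authors' implementation: the function name `em_exponential`, the simulated data, the exponential censoring distribution, the starting values and the fixed number of iterations are all assumptions.

```python
import math
import random

def em_exponential(t, d, lam, xi, n_iter=200):
    """EM for an m-component exponential mixture under right censoring."""
    n, m = len(t), len(lam)
    for _ in range(n_iter):
        # E-step: posterior probabilities p[i][j] of Equation (1).
        # xi[j] ** di gives the density if d_i = 1 and the survival if d_i = 0.
        p = []
        for ti, di in zip(t, d):
            w = [lam[j] * (xi[j] ** di) * math.exp(-xi[j] * ti) for j in range(m)]
            s = sum(w)
            p.append([wj / s for wj in w])
        # M-step: closed-form updates of the weights and rates.
        lam = [sum(p[i][j] for i in range(n)) / n for j in range(m)]
        xi = [sum(p[i][j] * d[i] for i in range(n)) /
              sum(p[i][j] * t[i] for i in range(n)) for j in range(m)]
    return lam, xi

# Hypothetical simulated data: two exponential components, exponential censoring.
rng = random.Random(2010)
n = 1000
x = [rng.expovariate(1.0 if rng.random() < 1 / 3 else 0.2) for _ in range(n)]
c = [rng.expovariate(0.1) for _ in range(n)]
t = [min(a, b) for a, b in zip(x, c)]
d = [1 if a <= b else 0 for a, b in zip(x, c)]

lam_hat, xi_hat = em_exponential(t, d, lam=[0.5, 0.5], xi=[2.0, 0.1])
```

With a single component the rate update reduces to $\sum_i d_i / \sum_i t_i$, the usual maximum likelihood estimate for censored exponential data, which gives a quick sanity check of the M-step.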
Parametric EM-algorithm: Simulation example
$$g(x) = \lambda_1 \xi_1 \exp(-\xi_1 x) + \lambda_2 \xi_2 \exp(-\xi_2 x), \qquad x > 0,$$
with $\xi_1 = 1$, $\xi_2 = 0.2$ and $\lambda_1 = 1/3$.
[Figure: two panels, "EM for RMM, n=200, 30% censored" and "EM for RMM, n=1000, 34.7% censored", each plotting the estimates of rate 1, rate 2 and lambda 1 against iterations.]
Parametric EM-algorithm: Application to AML data: be careful!
[Figure: four panels plotting the estimates against iterations: "Scale (100 iterations)", "Scale (500 iterations)", "Lambda (100 iterations)" and "Lambda (500 iterations)".]
Parametric EM-algorithm: complete data = $(x, z)$
Complete data pdf: $f^c(x, z) = \lambda_z f_z(x)$. Then
$$Q(\theta \mid \theta^k) = E\left[\log f^c(X, Z \mid \theta) \mid t, d, \theta^k\right] = \sum_{i=1}^{n} E\left[\log f^c(X_i, Z_i \mid \theta) \mid t_i, d_i, \theta^k\right].$$
Calculation of $Q(\theta \mid \theta^k)$ requires calculation of the following posterior pdf:
$$f^k_i(x, j) := f(X_i = x, Z_i = j \mid t_i, d_i, \theta^k) = \lambda^k_j \left[\frac{I(x = t_i)\, f(t_i \mid \xi^k_j)}{\sum_{\ell=1}^{m} \lambda^k_\ell f(t_i \mid \xi^k_\ell)}\right]^{d_i} \left[\frac{I(x > t_i)\, f(x \mid \xi^k_j)}{\sum_{\ell=1}^{m} \lambda^k_\ell \bar{F}(t_i \mid \xi^k_\ell)}\right]^{1-d_i}.$$
Parametric EM-algorithm: exponential lifetimes, complete data = $(x, z)$
EM algorithm: $\theta^k \to \theta^{k+1}$
1 E-step: Calculate the posterior probabilities $p^k_{ij}$ as in Equation (1), for all $i = 1, \dots, n$ and $j = 1, \dots, m$.
2 M-step: Set, for $j = 1, \dots, m$,
$$\lambda^{k+1}_j = \frac{1}{n} \sum_{i=1}^{n} p^k_{ij},$$
$$\xi^{k+1}_j = \frac{\sum_{i=1}^{n} p^k_{ij}}{\sum_{i=1}^{n} \left[ d_i\, t_i\, p^k_{ij} + (1 - d_i)\, \dfrac{\lambda^k_j (1 + \xi^k_j t_i) \exp(-\xi^k_j t_i)}{\xi^k_j \sum_{\ell=1}^{m} \lambda^k_\ell \exp(-\xi^k_\ell t_i)} \right]}.$$
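The censored term in the rate update is the conditional expectation of the unobserved lifetime restricted to component $j$. A short derivation, using the posterior pdf of $(X_i, Z_i)$ for $d_i = 0$ with exponential components (density $\xi e^{-\xi x}$, survival function $e^{-\xi x}$), could read:

```latex
\begin{aligned}
E\left[ I(Z_i = j)\, X_i \mid t_i, d_i = 0, \theta^k \right]
  &= \frac{\lambda^k_j \int_{t_i}^{\infty} x\, \xi^k_j\, e^{-\xi^k_j x}\, dx}
          {\sum_{\ell=1}^{m} \lambda^k_\ell\, e^{-\xi^k_\ell t_i}} \\
  &= \frac{\lambda^k_j \left( t_i + 1/\xi^k_j \right) e^{-\xi^k_j t_i}}
          {\sum_{\ell=1}^{m} \lambda^k_\ell\, e^{-\xi^k_\ell t_i}} \\
  &= \frac{\lambda^k_j \left( 1 + \xi^k_j t_i \right) e^{-\xi^k_j t_i}}
          {\xi^k_j \sum_{\ell=1}^{m} \lambda^k_\ell\, e^{-\xi^k_\ell t_i}} .
\end{aligned}
```

The second line uses integration by parts, $\int_t^\infty x\, \xi e^{-\xi x}\, dx = (t + 1/\xi)\, e^{-\xi t}$. For uncensored observations $X_i = t_i$ is known, so the corresponding term is simply $t_i\, p^k_{ij}$.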
Parametric EM-algorithm: Remarks about the parametric EM algorithms
+ Whatever the choice of complete data, the M-step for the $\lambda_j$'s always leads to an explicit formula.
− $Q(\theta \mid \theta^k)$ depends strongly on the choice of the underlying parametric family $\mathcal{F}$.
− Except for exponential lifetimes, explicit maximizers of $Q(\theta \mid \theta^k)$ are not available.
− Maximizing $Q(\theta \mid \theta^k)$ may be as complicated as maximizing the full likelihood function.