Nearest Neighbor and Kernel Survival Analysis: Nonasymptotic Error Bounds and Strong Consistency Rates

George H. Chen
Assistant Professor of Information Systems, Carnegie Mellon University
June 11, 2019
Survival Analysis

[Figure: table of example patients with binary features (gluten allergy, immunosuppressant, low resting heart rate, irregular heart beat, high BMI) and times of death (Day 2, Day 10, Day ≥ 6); each patient's features form the feature vector X, and the recorded time is the observed time Y]

When we stop collecting training data, not everyone has died!

Goal: Estimate S(t|x) = P(survive beyond time t | feature vector x)
Problem Setup

Model: Generate a data point (X, Y, δ) as follows:
1. Sample feature vector X ∼ P_X, a Borel probability measure on the feature space, which is a separable metric space with intrinsic dimension d.
2. Sample time of death T ∼ P_{T|X}, a continuous r.v. in time that is smooth w.r.t. the feature space (Hölder index α).
3. Sample time of censoring C ∼ P_{C|X}.
4. If death happens before censoring (T ≤ C): set Y = T, δ = 1. Otherwise: set Y = C, δ = 0.
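To make the model concrete, here is a minimal Python sketch of this data-generating process. The particular distribution choices (uniform features, exponential death and censoring times) are illustrative assumptions, not part of the setup above:

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_data(n, d=2):
    """Simulate n censored survival data points (X, Y, delta)."""
    # 1. Sample feature vectors X ~ P_X (here: uniform on the unit cube).
    X = rng.uniform(size=(n, d))
    # 2. Sample times of death T ~ P_{T|X} (here: exponential with a
    #    feature-dependent rate, so the true S(t|x) = exp(-(1 + sum(x)) t)).
    rate = 1.0 + X.sum(axis=1)
    T = rng.exponential(scale=1.0 / rate)
    # 3. Sample censoring times C ~ P_{C|X} (here: independent of X).
    C = rng.exponential(scale=2.0, size=n)
    # 4. Observe Y = min(T, C) and the event indicator delta = 1{T <= C}.
    Y = np.minimum(T, C)
    delta = (T <= C).astype(int)
    return X, Y, delta
```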
Estimator (Beran 1981): find the k training data points closest to x, then apply the Kaplan-Meier estimator to those k points to obtain Ŝ(t|x). The kernel variant is similar.
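A minimal sketch of the k-NN variant of Beran's estimator, assuming Euclidean features and no ties among observed times (ties have probability zero for continuous time distributions); the function name `knn_kaplan_meier` and the argument `t_grid` are introduced here for illustration:

```python
def knn_kaplan_meier(x, X, Y, delta, k, t_grid):
    """k-NN Kaplan-Meier estimate of S(t|x), evaluated on a grid of times."""
    # Step 1: find the k training points closest to x.
    idx = np.argsort(np.linalg.norm(X - x, axis=1))[:k]
    # Step 2: Kaplan-Meier product-limit estimator on the neighbors'
    # (Y, delta) pairs, sorted by observed time.
    order = np.argsort(Y[idx])
    Yk, dk = Y[idx][order], delta[idx][order]
    n_at_risk = k - np.arange(k)        # k, k-1, ..., 1 still at risk
    factors = 1.0 - dk / n_at_risk      # censored points contribute 1
    surv = np.concatenate(([1.0], np.cumprod(factors)))
    # S_hat(t|x) multiplies the factors of all neighbors with Y_i <= t.
    return surv[np.searchsorted(Yk, t_grid, side="right")]
```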
Error: sup_{t ∈ [0,τ]} |Ŝ(t|x) − S(t|x)| for a time horizon τ chosen so that enough of the n training data have Y values > τ.
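Putting the two sketches above together, one can evaluate this error in simulation, where the true S(t|x) is known; the sup over [0, τ] is approximated on a finite grid:

```python
n, d, tau = 2000, 2, 1.0
X, Y, delta = generate_data(n, d)
x = np.full(d, 0.5)                        # query point
t_grid = np.linspace(0.0, tau, 200)
k = int(round(n ** (2.0 / (2.0 + d))))     # k ~ n^{2a/(2a+d)} with a = 1
S_hat = knn_kaplan_meier(x, X, Y, delta, k, t_grid)
S_true = np.exp(-(1.0 + x.sum()) * t_grid) # true S(t|x) under the sketch
print("sup error over [0, tau]:", np.abs(S_hat - S_true).max())
```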
Theory (Informal)

The k-NN estimator with k = Θ̃(n^{2α/(2α+d)}) has strong consistency rate:

    sup_{t ∈ [0,τ]} |Ŝ(t|x) − S(t|x)| ≤ Õ(n^{−α/(2α+d)})

(Here Θ̃ and Õ suppress log factors.)
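As a back-of-the-envelope illustration of how the prescribed k and the rate scale, take a Lipschitz case α = 1 with intrinsic dimension d = 2 (these values are assumptions for the example), so k ~ √n and the error decays like n^{−1/4} up to log factors:

```python
a, d = 1.0, 2.0                          # Holder index, intrinsic dimension
for n in (10**3, 10**4, 10**5):
    k = n ** (2 * a / (2 * a + d))       # k = n^{1/2} for these a, d
    err = n ** (-a / (2 * a + d))        # n^{-1/4} here, ignoring logs
    print(f"n = {n:>6}:  k ≈ {k:7.0f},  rate ≈ {err:.3f}")
```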
If there is no censoring, the problem reduces to conditional CDF estimation → the error upper bound, up to a log factor, matches the conditional CDF estimation lower bound of Chagny & Roche (2014).

Proof ideas also give finite-sample rates for: