Part I: Variable Selection with Interval-Censored Responses

Variable Selection and the Assessment of Predictive Accuracy with Interval-Censored Responses
Richard Cook, Statistics and Actuarial Science, University of Waterloo
Statistical Issues in Biomarker and Drug Co-Development, Toronto, Ontario, November 8, 2014
Joint work with Ying Wu and Ker-Ai Lee

Outline: Part I, Variable Selection with Interval-Censored Responses

Prognostic Human Leukocyte Antigens in Psoriatic Arthritis
• The University of Toronto Psoriatic Arthritis Clinic is a tertiary referral clinic with 1300 patients, extensive longitudinal follow-up on disease progression, and collection of genetic and serum samples.
• Patients with psoriatic arthritis are classified as suffering from arthritis mutilans if they have 5 or more damaged joints.
• Patients are scheduled to be radiologically assessed every two years.
• The time to the development of arthritis mutilans is not observed exactly because it is subject to interval-censoring.
Immediate Goal
Interest lies in identifying HLA markers that predict onset of arthritis mutilans.

Joint Damage and Marker Values in Continuous Time
[Figure: total number of damaged joints (axis up to 10) and the marker of inflammation, ESR (axis up to 100), plotted against time since onset of psoriatic arthritis; HLA markers are recorded at clinic entry.]

Joint Damage and Marker Values in Continuous Time
[Figure: the same plot of damaged joints and ESR against time since onset of psoriatic arthritis, with clinic entry and the time T of arthritis mutilans marked on the time axis.]

Available Data Due to Intermittent Assessments
[Figure: the same plot, with values observed (marked X) only at the follow-up assessment times s_1, ..., s_6 after clinic entry; the time T of arthritis mutilans is marked on the time axis.]

Data for Response Model
[Figure: timeline from PsA onset showing the censoring interval with endpoints L and R; HLA data (X).]
Data for Assessment Process
• $Z(s_j)$ denotes the marker of inflammation at assessment time $s_j$
• $w_j = s_j - s_{j-1}$, $j = 1, 2, \ldots$, are waiting times
[Figure: timeline from PsA onset showing assessment times $s_1, \ldots, s_6$ with marker values $Z(s_1), \ldots, Z(s_6)$; HLA data (X).]

Semi-Parametric Estimates of Waiting Time Distributions
[Figure: cumulative probability (0 to 1) against time in years (0 to 40), one curve per waiting time: diagnosis to 1st X-ray, 1st to 2nd X-ray, and so on up to 9th to 10th X-ray.]

Estimate [1] of the Distribution of Time to Arthritis Mutilans
[Figure: Turnbull estimate of the cumulative probability of arthritis mutilans (0 to 1), with pointwise 95% confidence band, against years since diagnosis of PsA (0 to 40).]
[1] Turnbull BW (1976). The empirical distribution function with arbitrarily grouped, censored and truncated data. Journal of the Royal Statistical Society. Series B (Methodological), 38, 290–295.
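The Turnbull estimate can be illustrated with a short self-consistency iteration. The sketch below is a simplified version, not the code behind the figure: it assumes each observation is an interval (left, right] with left < right, places candidate mass on every cell between consecutive distinct endpoints rather than only on Turnbull's innermost intervals, and does not compute the confidence band. The function name `turnbull` and the toy data are illustrative.

```python
import numpy as np

def turnbull(left, right, tol=1e-8, max_iter=10_000):
    """Self-consistency (EM) estimate of cell masses for interval-censored data."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    # Candidate cells (t[j], t[j+1]] built from all distinct endpoints.
    t = np.unique(np.concatenate([left, right]))
    lo, hi = t[:-1], t[1:]
    # alpha[i, j] = 1 if cell j lies inside the censoring interval of subject i.
    alpha = ((lo[None, :] >= left[:, None]) &
             (hi[None, :] <= right[:, None])).astype(float)
    p = np.full(len(lo), 1.0 / len(lo))  # start from equal mass on every cell
    for _ in range(max_iter):
        w = alpha * p
        w /= w.sum(axis=1, keepdims=True)  # E-step: cell membership probabilities
        p_new = w.mean(axis=0)             # M-step: average the memberships
        converged = np.max(np.abs(p_new - p)) < tol
        p = p_new
        if converged:
            break
    return hi, p

# Toy data: two events in (0, 2], two right-censored at 2 (right endpoint = inf).
left = [0, 0, 2, 2]
right = [2, 2, np.inf, np.inf]
upper, mass = turnbull(left, right)
cdf = np.cumsum(mass)  # estimated cumulative probability at each cell's upper end
```

Right-censored observations are encoded with an infinite right endpoint, so the last cell collects the mass that is never observed to fail.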

Penalized Regression for Failure Time Data
• $\log L(\beta)$ is the log likelihood or log partial likelihood
• Consider a penalized “likelihood” function
$$\log L_{PEN}(\beta) = \log L(\beta) - \sum_{j=1}^{p} \pi_{\gamma,\lambda}(\beta_j) \qquad (1.1)$$
• $\pi_{\gamma,\lambda}(\cdot)$ is a penalty function
• $(\gamma, \lambda)$ are tuning parameters
• $\lambda = (\lambda_1, \ldots, \lambda_p)'$ if we use different penalties for each variable

Some Particular Penalty Functions
• The $L_2$ penalty $\pi_\lambda(|\beta|) = \lambda |\beta|^2$ gives ridge regression [2]
• The $L_1$ penalty $\pi_\lambda(|\beta|) = \lambda |\beta|$ yields the LASSO [3]
Smoothly Clipped Absolute Deviation (SCAD) Penalty
The smoothly clipped absolute deviation (SCAD) penalty [4] has the form defined in Fan and Li (2001).
Adaptive LASSO
The adaptive LASSO [5] has the penalty $\pi_\lambda(|\beta_j|) = \lambda |\beta_j| \tau_j$, with small weights $\tau_j$ chosen for large coefficients and large weights for small coefficients.
[2] Hoerl AE and Kennard RW (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1), 55–67.
[3] Tibshirani R (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
[4] Fan J and Li R (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360.
[5] Zou H (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1418–1429.
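To make the four penalties concrete, here is a minimal sketch that evaluates each one for a single coefficient. Two things are assumptions rather than content of the slides. The SCAD function uses the standard closed form from Fan and Li (2001), with the second tuning parameter written as `gamma` and defaulting to their suggested 3.7. The adaptive LASSO weights are taken as reciprocals of an initial estimate `beta_init`, which is one common choice.

```python
import numpy as np

def ridge_penalty(beta, lam):
    """L2 penalty: lam * |beta|^2."""
    return lam * np.abs(beta) ** 2

def lasso_penalty(beta, lam):
    """L1 penalty: lam * |beta|."""
    return lam * np.abs(beta)

def scad_penalty(beta, lam, gamma=3.7):
    """SCAD penalty (Fan and Li, 2001), requires gamma > 2.

    Linear up to lam, quadratic between lam and gamma*lam, constant beyond.
    """
    b = np.abs(np.asarray(beta, dtype=float))
    small = lam * b
    middle = (2 * gamma * lam * b - b**2 - lam**2) / (2 * (gamma - 1))
    large = lam**2 * (gamma + 1) / 2
    return np.where(b <= lam, small, np.where(b <= gamma * lam, middle, large))

def adaptive_lasso_penalty(beta, lam, beta_init, eps=1e-8):
    """Weighted L1 penalty with tau_j = 1 / |beta_init_j| (assumed weight choice)."""
    tau = 1.0 / (np.abs(beta_init) + eps)  # eps guards against division by zero
    return lam * np.abs(beta) * tau
```

Unlike the L1 penalty, SCAD stops growing once |β| exceeds γλ, so large coefficients are not shrunk further.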

Penalized Regression with Interval-Censored Data
• For individual $i$, $D_i = \{(L_i, R_i), X_i\}$, where $X_i$ is a $p \times 1$ covariate vector
• The data consist of $D = \{D_i, i = 1, 2, \ldots, m\}$
Observed Data Log-Likelihood
$$\log L \propto \sum_{i=1}^{m} \log\left[ F(L_i \mid X_i) - F(R_i \mid X_i) \right]$$
where $F(s \mid X)$ is the survivor function
Penalized Observed Data Log-Likelihood
$$\log L_{penalized} \propto \sum_{i=1}^{m} \log\left[ F(L_i \mid X_i) - F(R_i \mid X_i) \right] - \sum_{j=1}^{p} \pi_{\gamma,\lambda}(\beta_j)$$
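As a worked instance of the penalized observed data log-likelihood, the sketch below plugs in one specific survivor function. The slides do not fix the model at this point, so the exponential proportional hazards form, the L1 penalty, and the names `survivor` and `penalized_loglik` are all assumptions made for illustration.

```python
import numpy as np

def survivor(s, X, rho, beta):
    """Assumed survivor function: exp(-rho * s * exp(X beta))."""
    s = np.asarray(s, dtype=float)
    return np.exp(-rho * s * np.exp(X @ beta))

def penalized_loglik(L, R, X, rho, beta, lam):
    """Sum of log[F(L_i|X_i) - F(R_i|X_i)] minus an L1 penalty on beta."""
    prob = survivor(L, X, rho, beta) - survivor(R, X, rho, beta)
    return np.sum(np.log(prob)) - lam * np.sum(np.abs(beta))

# Toy data: one event in (0, 1], one right-censored at 1 (R = inf).
L = np.array([0.0, 1.0])
R = np.array([1.0, np.inf])
X = np.zeros((2, 1))
value = penalized_loglik(L, R, X, rho=1.0, beta=np.array([0.5]), lam=2.0)
```

A right-censored individual has R equal to infinity, so the survivor function at R is zero and the contribution reduces to log F(L | X).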

Penalized Regression with Interval-Censored Data
[Figure: time axis with breakpoints $b_0, b_1, b_2, b_3, \ldots, b_{K-1}, b_K$ marking the intervals $B_1, B_2, B_3, \ldots, B_K$.]
Breakpoints $0 = b_0 < \cdots < b_K = \infty$ define $B_k = [b_{k-1}, b_k)$, $k = 1, \ldots, K$.
If $I_k(u) = I(u \in B_k)$ and $S_k(u) = \int_0^u I(v \in B_k)\, dv$, then the piecewise constant hazard for individual $i$ is
$$h(u \mid X_i; \theta) = \prod_{k=1}^{K} \left( \rho_k \exp(X_i'\beta) \right)^{I_k(u)}$$
where $\theta = (\rho', \beta')'$, $\rho = (\rho_1, \ldots, \rho_K)'$ and $\beta = (\beta_1, \ldots, \beta_p)'$
Complete Data Log-Likelihood
$$\log L_c(\theta) = \sum_{i=1}^{m} \sum_{k=1}^{K} \left\{ I_k(u_i) \left[ \log(\rho_k) + X_i'\beta \right] - S_k(u_i)\, \rho_k \exp(X_i'\beta) \right\}$$
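The quantities $I_k(u)$ and $S_k(u)$ are easy to compute directly, which makes the complete data log-likelihood concrete. The following is a sketch that assumes the exact failure times u are known; the function names and the toy breakpoints are illustrative.

```python
import numpy as np

def indicators(u, breaks):
    """I_k(u) = 1 if u falls in B_k = [b_{k-1}, b_k), one column per interval."""
    b = np.asarray(breaks, dtype=float)
    u = np.asarray(u, dtype=float)[:, None]
    return ((u >= b[:-1]) & (u < b[1:])).astype(float)

def exposures(u, breaks):
    """S_k(u) = time spent in B_k before u, one column per interval."""
    b = np.asarray(breaks, dtype=float)
    u = np.asarray(u, dtype=float)[:, None]
    return np.clip(u - b[:-1], 0.0, b[1:] - b[:-1])

def complete_loglik(u, X, rho, beta, breaks):
    """Complete data log-likelihood under the piecewise constant hazard."""
    I = indicators(u, breaks)
    S = exposures(u, breaks)
    eta = (X @ beta)[:, None]  # linear predictor X_i' beta, one row per subject
    return np.sum(I * (np.log(rho) + eta) - S * rho * np.exp(eta))

# Toy example: two intervals [0, 2) and [2, inf), one failure at u = 3.
breaks = [0.0, 2.0, np.inf]
rho = np.array([0.5, 1.0])
u = np.array([3.0])
X = np.zeros((1, 1))
beta = np.array([0.0])
ll = complete_loglik(u, X, rho, beta, breaks)
```

In the toy example the subject spends 2 time units in the first interval and 1 in the second, so the log-likelihood is log(1.0) − (2 × 0.5 + 1 × 1.0).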

An EM Algorithm [6] with Penalized Regression
The Expectation Step
Take the conditional expectation of the penalized complete data log-likelihood
$$Q(\theta; \theta^{r-1}) = E\left[ \log L_c(\theta) \mid D; \theta^{r-1} \right] - \sum_{j=1}^{p} \pi_{\gamma,\lambda}(\beta_j)$$
If
$$\hat g^{r}_{ik} = E\left[ I_k(u_i) \mid D_i; \theta^{r-1} \right], \qquad \hat S^{r}_{ik} = E\left[ S_k(u_i) \mid D_i; \theta^{r-1} \right]$$
then
$$Q(\theta; \theta^{r-1}) = \sum_{i=1}^{m} \sum_{k=1}^{K} \left\{ \hat g^{r}_{ik} \left( \log(\rho_k) + X_i'\beta \right) - \hat S^{r}_{ik}\, \rho_k \exp(X_i'\beta) \right\} - \sum_{j=1}^{p} \pi_{\gamma,\lambda}(\beta_j)$$
[6] Dempster AP, Laird NM and Rubin DB (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39(1), 1–38.
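The two conditional expectations can be written out explicitly. The expressions below are a sketch, not taken from the slides: they assume the only information in $D_i$ about $u_i$ is that it lies in the censoring interval (a noninformative assessment process), and they use $F$ for the survivor function as in the observed data log-likelihood. The symbols $a_{ik}$, $c_{ik}$ and $f$ are introduced here for convenience.

```latex
% Survivor function implied by the piecewise constant hazard
F(s \mid X_i; \theta) = \exp\Big\{ -\sum_{k=1}^{K} S_k(s)\, \rho_k \exp(X_i'\beta) \Big\}

% Probability that the failure time falls in B_k, given it lies between L_i and R_i
\hat g^{\,r}_{ik}
  = \frac{F(a_{ik} \mid X_i; \theta^{r-1}) - F(c_{ik} \mid X_i; \theta^{r-1})}
         {F(L_i \mid X_i; \theta^{r-1}) - F(R_i \mid X_i; \theta^{r-1})},
  \qquad a_{ik} = \max(L_i, b_{k-1}), \quad c_{ik} = \min(R_i, b_k),

% taken as 0 when a_{ik} \ge c_{ik}, i.e. when B_k does not overlap the censoring interval

% Expected time at risk in B_k, with density f = -\partial F / \partial u
\hat S^{\,r}_{ik}
  = \frac{\int_{L_i}^{R_i} S_k(u)\, f(u \mid X_i; \theta^{r-1})\, du}
         {F(L_i \mid X_i; \theta^{r-1}) - F(R_i \mid X_i; \theta^{r-1})}
```

Because the hazard is constant within each $B_k$, the integral in the second expression splits into a sum of closed-form pieces, one per interval overlapping the censoring interval.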

Maximization Step
Let
• $Z_{ikj} = I(j = k)$, $j = 2, \ldots, K$, and $Z_{ik} = (1, Z_{ik2}, \ldots, Z_{ikK})'$
• $\alpha_1 = \log(\rho_1)$, $\alpha_j = \log(\rho_j) - \log(\rho_1)$, $j = 2, \ldots, K$
Then $Q(\theta; \theta^{r-1})$ is
$$\sum_{i=1}^{m} \sum_{k=1}^{K} \left\{ \hat g^{r}_{ik} \left( Z_{ik}'\alpha + X_i'\beta \right) - \hat S^{r}_{ik} \exp\left( Z_{ik}'\alpha + X_i'\beta \right) \right\} - \sum_{j=1}^{p} \pi_{\gamma,\lambda}(\beta_j)$$
With a pseudo dataset we can maximize $Q(\theta; \theta^{r-1})$ using standard software for penalized regression (e.g. glmnet(.), SIS(.)).
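One way to see why standard software applies: the unpenalized part of $Q$ has the form of a Poisson log-likelihood kernel with "response" $\hat g^{r}_{ik}$ and offset $\log \hat S^{r}_{ik}$, up to a term that does not involve $(\alpha, \beta)$. The sketch below builds such a pseudo dataset and checks the identity numerically. The layout (one row per $(i, k)$ pair with positive exposure) and the function names are assumptions, and the actual penalized fit is left to whatever routine is used.

```python
import numpy as np

def pseudo_dataset(g_hat, S_hat, X):
    """One row per (i, k) with S_hat > 0: design [Z_ik, X_i], response, offset."""
    m, K = g_hat.shape
    i_idx, k_idx = np.nonzero(S_hat > 0)
    Z = np.zeros((len(i_idx), K))
    Z[:, 0] = 1.0                                # intercept column for alpha_1
    later = k_idx > 0
    Z[np.nonzero(later)[0], k_idx[later]] = 1.0  # indicator column for interval k
    design = np.hstack([Z, X[i_idx]])
    return design, g_hat[i_idx, k_idx], np.log(S_hat[i_idx, k_idx])

def poisson_kernel(coef, design, y, offset):
    """Poisson log-likelihood kernel: sum of y * eta - exp(eta)."""
    eta = design @ coef + offset
    return np.sum(y * eta - np.exp(eta))

def q_direct(alpha, beta, g_hat, S_hat, X):
    """Unpenalized Q evaluated directly from its definition."""
    rho = np.exp(np.concatenate([[alpha[0]], alpha[0] + alpha[1:]]))
    eta = (X @ beta)[:, None]
    return np.sum(g_hat * (np.log(rho) + eta) - S_hat * rho * np.exp(eta))

# Toy E-step output for m = 2 subjects and K = 2 intervals.
g_hat = np.array([[0.3, 0.7], [1.0, 0.0]])
S_hat = np.array([[1.5, 0.4], [0.8, 0.0]])
X = np.array([[1.0], [-0.5]])
alpha, beta = np.array([-0.2, 0.6]), np.array([0.3])

design, y, offset = pseudo_dataset(g_hat, S_hat, X)
constant = np.sum(y * offset)  # does not depend on (alpha, beta)
lhs = poisson_kernel(np.concatenate([alpha, beta]), design, y, offset) - constant
rhs = q_direct(alpha, beta, g_hat, S_hat, X)
```

In a penalized fit only the $\beta$ columns of the design would carry the penalty; the $\alpha$ columns describe the baseline hazard and would be left unpenalized.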

Selection of Optimal Penalty $\lambda_{OPT}$
• The criterion for selecting the optimal $\lambda$ is similar to traditional cross-validation.
• We partition the dataset $T$ into $R$ subsamples $T_1, \ldots, T_R$.
• $T_r$ and $T - T_r$ are the $r$th testing and training sets.
• For a given $\lambda$, the cross-validation statistic is
$$\widehat{CV}(\lambda) = \sum_{r=1}^{R} \left[ \log L(\hat\theta_{-r}(\lambda)) - \log L_{-r}(\hat\theta_{-r}(\lambda)) \right].$$
• $L_{-r}$ is the observed likelihood for the $r$th training dataset.
• $\hat\theta_{-r}(\lambda)$ is the estimate for the $r$th training data.
• The optimal $\lambda$ maximizes $\widehat{CV}(\lambda)$.
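The cross-validation statistic can be sketched generically. With independent individuals, the difference between the full-data and training-data log-likelihoods at the same estimate is the log-likelihood contribution of the held-out fold. In the sketch below, `fit` and `loglik` are placeholders for the penalized estimator and the observed data log-likelihood; the toy Gaussian versions stand in for the interval-censored model and are assumptions made only so the example runs.

```python
import numpy as np

def cv_statistic(data, lam, fit, loglik, n_folds=5, seed=0):
    """CV(lam) = sum over folds of log L(theta_-r) - log L_-r(theta_-r)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(data)), n_folds)
    total = 0.0
    for test_idx in folds:
        train = np.delete(data, test_idx, axis=0)  # r-th training set
        theta = fit(train, lam)                    # estimate from training data only
        total += loglik(theta, data) - loglik(theta, train)
    return total

# Toy stand-ins: a shrunken mean and a Gaussian log-likelihood kernel.
def shrunken_mean(y, lam):
    return y.sum() / (len(y) + lam)

def gauss_loglik(theta, y):
    return -0.5 * np.sum((y - theta) ** 2)

y = np.random.default_rng(1).normal(loc=1.0, scale=1.0, size=40)
grid = [0.0, 0.5, 1.0, 5.0, 20.0]
scores = [cv_statistic(y, lam, shrunken_mean, gauss_loglik) for lam in grid]
lam_opt = grid[int(np.argmax(scores))]  # the optimal lambda maximizes the statistic
```

Fixing the seed keeps the same partition for every candidate value of λ, so the scores are compared on identical folds.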
