Background Models Regression Simulations Results Discussion
"nested" pLCM: Relax the LI and Non-interference Assumptions
• Direct evidence against LI: control measurements $(M_{i1}, \ldots, M_{iJ})'$
  • test cross-reactions (prevented in PERCH assays)
  • lab technician effects
  • heterogeneity in subjects' immunity levels
• Deviations from independence impact inference (cf. Pepe and Janes, 2007, Biostatistics; Albert et al., 2001, Biometrics)
• Modeling deviation from LI: model a cross-classified probability contingency table $P[M_{i1} = m_1, \ldots, M_{iJ} = m_J \mid I_i]$ for all $m = (m_1, \ldots, m_J)'$
  • Log-linear parameterization
  • Generalized linear mixed-effect models (GLMM)
  • Simplex factor model; similar to mixed-membership model (cf. Bhattacharya and Dunson, 2012, JASA)
  • PARAFAC decomposition (cf. Dunson and Xing, 2009, JASA)
Zhenke Wu (zhenkewu@umich.edu) 2019 TAMU 18 / 55
Nested Partially-Latent Class Models (npLCM; Wu and Zeger, 2016)
Example: 5 Pathogens, 2 Subclasses; BrS Data Only

Nested Partially-Latent Class Models (npLCM; Wu and Zeger, 2016)
Example: 5 Pathogens, 3 Subclasses; BrS Data Only
Encourage Few Subclasses: Stick-Breaking Prior
$V_j \sim \mathrm{Beta}(1, \alpha)$; Example: $K = 10$, $\alpha = 1$
• On average, the first several segments receive most of the weight
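The claim that the first few segments receive most of the weight can be checked by simulation. The sketch below is a minimal illustration, not the talk's code: it assumes truncation at $K$ segments by setting the last stick proportion to 1, and the function name `stick_breaking_weights` is hypothetical.

```python
import random

def stick_breaking_weights(K, alpha, rng):
    """Truncated stick-breaking weights: w_k = V_k * prod_{l<k} (1 - V_l)."""
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for k in range(K):
        # Assumption: truncate at K by giving the last segment what is left (V_K = 1).
        v = 1.0 if k == K - 1 else rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights

rng = random.Random(2019)
draws = [stick_breaking_weights(K=10, alpha=1.0, rng=rng) for _ in range(5000)]
# Monte Carlo average of each segment's weight across the prior draws
mean_weights = [sum(w[k] for w in draws) / len(draws) for k in range(10)]
print([round(m, 3) for m in mean_weights])
```

Before truncation, the prior mean of the $k$-th weight is $\frac{1}{1+\alpha}\left(\frac{\alpha}{1+\alpha}\right)^{k-1}$, so with $\alpha = 1$ the expected weights halve from one segment to the next.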
npLCM: Likelihood and Prior (BrS Data Only)
• Likelihood:
$$P_0(M_i = m) = \sum_{k=1}^{K} \nu_k \prod_{j=1}^{J} \{\psi_k^{(j)}\}^{m_j} \{1 - \psi_k^{(j)}\}^{1 - m_j},$$
$$P_1(M_i = m) = \sum_{j=1}^{J} \pi_j \sum_{k=1}^{K} \eta_k \{\theta_k^{(j)}\}^{m_j} \{1 - \theta_k^{(j)}\}^{1 - m_j} \prod_{\ell \neq j} \{\psi_k^{(\ell)}\}^{m_\ell} \{1 - \psi_k^{(\ell)}\}^{1 - m_\ell}.$$
• Prior:
$$\pi \sim \mathrm{Dirichlet}(.5, \ldots, .5),$$
$$\psi_k^{(j)} \sim \mathrm{Beta}(1, 1), \quad \theta_k^{(j)} \sim \mathrm{Beta}(c_{1kj}, c_{2kj}), \quad j = 1, \ldots, J; \; k = 1, \ldots, \infty,$$
$$Z_{i'} \mid I^L_{i'} = j \sim \sum_{k=1}^{\infty} U_k \prod_{\ell < k} [1 - U_\ell] \, \delta_k, \quad U_k \sim \mathrm{Beta}(1, \alpha_0), \quad \text{for all cases},$$
$$Z_i \sim \sum_{k=1}^{\infty} V_k \prod_{\ell < k} [1 - V_\ell] \, \delta_k, \quad V_k \sim \mathrm{Beta}(1, \alpha_0), \quad \text{for all controls},$$
$$\alpha_0 \sim \mathrm{Gamma}(0.25, 0.25).$$
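The two probability mass functions can be evaluated directly for small $J$ and $K$. The following is a sketch with made-up parameter values (none come from the talk); `control_pmf` and `case_pmf` are hypothetical names, and `psi[k][j]`, `theta[k][j]` stand for $\psi_k^{(j)}$, $\theta_k^{(j)}$.

```python
from itertools import product

def product_bernoulli(m, s):
    """prod_j s_j^{m_j} (1 - s_j)^{1 - m_j} for a binary pattern m."""
    p = 1.0
    for mj, sj in zip(m, s):
        p *= sj if mj == 1 else 1.0 - sj
    return p

def control_pmf(m, nu, psi):
    """P_0(m): mixture over subclasses k of product-Bernoulli(psi[k])."""
    return sum(nu_k * product_bernoulli(m, psi_k) for nu_k, psi_k in zip(nu, psi))

def case_pmf(m, pi, eta, theta, psi):
    """P_1(m): mixture over causes j and subclasses k."""
    J = len(m)
    total = 0.0
    for j in range(J):
        for k in range(len(eta)):
            # TPR for the causative pathogen j, FPRs for all other pathogens
            s = [theta[k][l] if l == j else psi[k][l] for l in range(J)]
            total += pi[j] * eta[k] * product_bernoulli(m, s)
    return total

# Illustrative values only (J = 3 pathogens, K = 2 subclasses)
nu = [0.7, 0.3]
eta = [0.4, 0.6]
pi = [0.5, 0.3, 0.2]
psi = [[0.10, 0.20, 0.05], [0.40, 0.50, 0.30]]
theta = [[0.90, 0.85, 0.80], [0.95, 0.90, 0.85]]

patterns = list(product([0, 1], repeat=3))
total_controls = sum(control_pmf(m, nu, psi) for m in patterns)
total_cases = sum(case_pmf(m, pi, eta, theta, psi) for m in patterns)
print(total_controls, total_cases)  # each sums to 1 up to floating-point rounding
```

Enumerating all $2^J$ patterns is only feasible for small $J$; it is used here purely as a sanity check that both tables are proper distributions.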
Estimation Bias if Ignoring Local Dependence (LD)
Simulation: LD truth (npLCM) estimated by working LI models (pLCM)
[Figure (a): pairwise odds ratios (log scale) among measurements A to E, shown for cases and for controls, under scenarios (I: weak LD) and (II: strong LD).]
[Figure: percent relative asymptotic bias (PRAB) plotted against the cases' first subclass weight ($\eta_1$), for scenarios I and II. Legend: Marginal; Class A; Class B; Class C; Class D; Class E; Controls: Marginal.]
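Why subclass mixing produces non-unit pairwise odds ratios can be seen in a two-measurement toy calculation. This is a sketch under assumed values, not a reproduction of the simulation on the slide; `pairwise_log_odds_ratio` is a hypothetical helper.

```python
import math

def pairwise_log_odds_ratio(nu, psi, a, b):
    """log OR between measurements a and b under sum_k nu_k * product-Bernoulli(psi[k])."""
    def cell(x, y):
        # Joint probability of (M_a = x, M_b = y); other measurements marginalize out.
        return sum(
            w * (p[a] if x else 1 - p[a]) * (p[b] if y else 1 - p[b])
            for w, p in zip(nu, psi)
        )
    return math.log(cell(1, 1) * cell(0, 0) / (cell(1, 0) * cell(0, 1)))

# Illustrative positive rates for two subclasses (made-up numbers)
psi_toy = [[0.5, 0.5], [0.05, 0.05]]

log_or_single = pairwise_log_odds_ratio([1.0, 0.0], psi_toy, 0, 1)  # one subclass: LI holds
log_or_mixed = pairwise_log_odds_ratio([0.5, 0.5], psi_toy, 0, 1)   # two subclasses: LD
print(log_or_single, log_or_mixed)
```

With all weight on one subclass the log odds ratio is zero; once two subclasses with different positive rates are mixed, the marginal association becomes positive even though measurements are independent within each subclass.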
So Far: A General Framework
Nested Partially Latent Class Models (npLCM)
For simplicity, we assume "single-pathogen causes", or a single relevant feature per cluster, or, more visually, "one row of green boxes per disease class".
npLCM Framework (no Covariates)
Three components of the likelihood function:
a. Cause-specific case fractions (CSCF): $\pi = (\pi_1, \ldots, \pi_L)^\top = \{\pi_\ell = P(I = \ell \mid Y = 1), \ell = 1, \ldots, L\} \in S_{L-1}$;
b. $P_{1\ell} = \{P_{1\ell}(m)\} = \{P(M = m \mid I = \ell, Y = 1)\}$: a table of probabilities of making $J$ binary observations $M = m$ in a case class $\ell \neq 0$;
c. $P_0 = \{P_0(m)\} = \{P(M = m \mid I = 0, Y = 0)\}$: the same probability table as above but for controls.
Cases' disease classes are unobserved, so the distribution of their measurements is a weighted finite-mixture model: $P_1 = \sum_{\ell=1}^{L} \pi_\ell P_{1\ell}$.
The likelihood:
$$L = L_1 \cdot L_0 = \prod_{i: Y_i = 1} \sum_{\ell=1}^{L} \pi_\ell \cdot P_{1\ell}(M_i; \Theta, \Psi, \eta) \times \prod_{i': Y_{i'} = 0} P_0(M_{i'}; \Psi, \nu)$$
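The likelihood above is a product of a mixture term per case and a single-table term per control, which is easiest to handle on the log scale. The sketch below assumes the class-specific tables are supplied as plain Python callables; the function name `log_likelihood` and the toy product-Bernoulli tables are illustrative assumptions.

```python
import math

def log_likelihood(case_data, control_data, pi, case_class_pmfs, control_pmf):
    """log L = sum_{cases} log sum_l pi_l P_1l(M_i) + sum_{controls} log P_0(M_i')."""
    ll_cases = sum(
        math.log(sum(p * pmf(m) for p, pmf in zip(pi, case_class_pmfs)))
        for m in case_data
    )
    ll_controls = sum(math.log(control_pmf(m)) for m in control_data)
    return ll_cases + ll_controls

def product_bernoulli(m, s):
    p = 1.0
    for mj, sj in zip(m, s):
        p *= sj if mj == 1 else 1.0 - sj
    return p

# Toy setting with J = 2 measurements and L = 2 causes (made-up numbers)
theta_toy, psi_toy, pi_toy = [0.9, 0.8], [0.1, 0.2], [0.6, 0.4]
class_pmfs = [
    lambda m: product_bernoulli(m, [theta_toy[0], psi_toy[1]]),  # cause 1
    lambda m: product_bernoulli(m, [psi_toy[0], theta_toy[1]]),  # cause 2
]
p0 = lambda m: product_bernoulli(m, psi_toy)

cases = [(1, 0), (0, 1), (1, 1)]
controls = [(0, 0), (0, 1)]
print(log_likelihood(cases, controls, pi_toy, class_pmfs, p0))
```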
Special Case: pLCM (Wu et al., 2016)
Setting $\eta_1 = 1$ and $\nu_1 = 1$
Control model for multivariate binary data $\{M_i : Y_i = 0\}$:
1. $P_0(m) = \prod_{j=1}^{J} \{\psi_j\}^{m_j} \{1 - \psi_j\}^{1 - m_j} = \Pi(m; \psi)$
1a. $\Pi(m; s) = \prod_{j=1}^{J} \{s_j\}^{m_j} \{1 - s_j\}^{1 - m_j}$ is the probability mass function of a product Bernoulli distribution with success probabilities $s = (s_1, \ldots, s_J)^\top$, $0 \le s_j \le 1$
1b. Parameters $\psi = (\psi_1, \ldots, \psi_J)^\top$ represent the positive rates absent disease, referred to as "false positive rates" (FPRs).
Local Independence: $M_{ij} \perp M_{ij'} \mid I = 0$
Special Case: pLCM (Wu et al., 2016)
Model for the multivariate binary data in case class $\ell \neq 0$
2. $P_{1\ell}(m)$ is a product of the probabilities of measurements made
2a. on the causative pathogen $\ell$: $P(M_\ell \mid I = \ell, Y = 1, \theta) = \{\theta_\ell\}^{M_\ell} \{1 - \theta_\ell\}^{1 - M_\ell}$, where $\theta = (\theta_1, \ldots, \theta_J)^\top$ are "true positive rates" (TPRs), larger than FPRs.
2b. on the non-causative pathogens: $P(M_{i[-\ell]} \mid I_i = \ell, Y_i = 1, \psi_{[-\ell]}) = \Pi(M_{[-\ell]}; \psi_{[-\ell]})$, where $a_{[-\ell]}$ represents all but the $\ell$-th element in a vector $a$.
2c. Under the single-pathogen-cause assumption, pLCM uses $J$ TPRs $\theta$ for $L = J$ causes and $J$ FPRs $\psi$.
2a-2b. Local Independence (LI): $M_{ij} \perp M_{ij'} \mid I = \ell \neq 0$
2a-2b. Non-interference: disease-causing pathogen(s) are more frequently detected among cases than controls ($\theta_\ell > \psi_\ell$), and the non-causative pathogens are observed at the same rates among cases as in controls
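Items 2a and 2b together say that the class-$\ell$ table is a product Bernoulli whose success probability is the TPR at position $\ell$ and the FPR elsewhere. A small sketch makes the non-interference property concrete; the values of `theta` and `psi` are made up, and `plcm_case_class_pmf` and `marginal_positive_rate` are hypothetical names (causes are indexed from 0 in the code).

```python
from itertools import product

def product_bernoulli(m, s):
    """Pi(m; s) = prod_j s_j^{m_j} (1 - s_j)^{1 - m_j}."""
    p = 1.0
    for mj, sj in zip(m, s):
        p *= sj if mj == 1 else 1.0 - sj
    return p

def plcm_case_class_pmf(m, cause, theta, psi):
    """P_1l(m) under pLCM: TPR for the causative pathogen, FPRs for the rest."""
    s = [theta[j] if j == cause else psi[j] for j in range(len(m))]
    return product_bernoulli(m, s)

def marginal_positive_rate(j, cause, theta, psi):
    """P(M_j = 1 | I = cause), obtained by summing the table over all patterns."""
    J = len(theta)
    return sum(
        plcm_case_class_pmf(m, cause, theta, psi)
        for m in product([0, 1], repeat=J)
        if m[j] == 1
    )

theta = [0.9, 0.8, 0.7]  # illustrative TPRs
psi = [0.1, 0.2, 0.3]    # illustrative FPRs

# In case class 0, pathogen 0 is detected at its TPR; pathogens 1 and 2 at control FPRs.
print([marginal_positive_rate(j, 0, theta, psi) for j in range(3)])
```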
Regression Analysis in nested PLCM
In large-scale disease etiology studies:
• Data: case-control diagnostic tests, multivariate binary observations
• Scientific problem: estimate cause-specific case fractions (CSCF); think "pie chart" for cases
• Statistical problem: use the nested PLCM to estimate the mixing distribution among the cases
• Motivation for regression analyses: CSCFs may vary by season, a child's age, HIV status, and disease severity
Data (with Covariates)
• $D = \{(M_i, Y_i, X_i Y_i, W_i), i = 1, \ldots, N\}$
• $M_i = (M_{i1}, \ldots, M_{iJ})^\top$: binary measurements; indicate the presence or absence of $J$ pathogens for subject $i = 1, \ldots, N$.
• $Y_i$: case (1) or control (0).
• $X_i = (X_{i1}, \ldots, X_{ip})^\top$: covariates that may influence case $i$'s etiologic fractions
• $W_i = (W_{i1}, \ldots, W_{iq})^\top$: shared by cases and controls; possibly different from $X_i$; may influence the control distribution $[M_i \mid W_i, Y_i = 0]$. For example, healthy controls do not have disease severity information (which can be included in $X_i$).
• Continuous covariates: the first $p_1$ and $q_1$ elements of $X_i$ and $W_i$, respectively.
Motivating Application Again: PERCH Study
Data: 494 cases and 944 controls from one site
Goal a.: Estimate CSCFs at all covariate values, and assign cause-specific probabilities to each case
Goal b.: Quantify overall cause-specific disease burdens in a population, i.e., overall CSCFs $\pi^* = (\pi^*_1, \ldots, \pi^*_L)^\top$ as an empirical average of the stratum-specific CSCFs (by $X$); of policy interest (vaccine/antibiotics development and manufacture)
Model:
• $J = 7$: noisy presence/absence of 2 bacteria and 5 viruses in the nose
• Causes: seven single-pathogen causes plus a "Not Specified" (NoS) cause; so $L = J + 1$
• $X_i$: enrollment date, age (< or > 1 year), disease severity for cases (severe or very severe), HIV status (+/-)
• $W_i$: $X_i$ minus "disease severity".
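Goal b describes $\pi^*$ as an empirical average of stratum-specific CSCFs. One way to read this, sketched below, is a case-count-weighted average over strata; the weighting by number of cases is an assumption of this illustration, and the function name `overall_cscf` and all numbers are hypothetical.

```python
def overall_cscf(stratum_cscf, stratum_case_counts):
    """Average stratum-specific CSCF vectors, weighting each stratum by its case count."""
    total = sum(stratum_case_counts)
    n_causes = len(stratum_cscf[0])
    return [
        sum(n * pi[l] for pi, n in zip(stratum_cscf, stratum_case_counts)) / total
        for l in range(n_causes)
    ]

# Two strata, two causes; made-up fractions and counts for illustration only
stratum_cscf = [[0.6, 0.4], [0.2, 0.8]]
stratum_case_counts = [3, 1]
pi_star = overall_cscf(stratum_cscf, stratum_case_counts)
print(pi_star)
```

Because each stratum vector sums to one and the weights are non-negative, the averaged vector is again a valid set of fractions.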
PERCH Data: Sparsely-Populated Strata
Table: The observed count (frequency) of cases and controls by age, disease severity and HIV status (1: yes; 0: no). The marginal fractions among cases and controls for each covariate are shown at the bottom. Regression results will be shown for the first two strata.

            age ≥ 1   very severe (VS)   HIV positive   # cases (%)        # controls (%)
                      (case-only)                       total: 524 (100)   total: 964 (100)
            0         0                  0              208 (39.7)         545 (56.5)
            1         0                  0              72 (13.7)          278 (28.8)
            0         1                  0              116 (22.1)         -
            1         1                  0              33 (6.3)           -
            0         0                  1              37 (7.1)           85 (8.8)
            1         0                  1              24 (4.5)           51 (5.3)
            0         1                  1              25 (4.8)           -
            1         1                  1              3 (0.6)            -
case:       25.2%     34.5%              17.0%
control:    34.3%     -                  14.1%
Current Methods Fall Short
• Fully-stratified analysis: fit an npLCM to the case-control data in each covariate stratum. Like pLCM, the npLCM is partially identified in each stratum, necessitating multiple sets of independent informative priors across multiple strata. Two primary issues:
  Gap 1a. Unstable CSCF estimates due to sparsely-populated strata.
  Gap 1b. Informative TPR priors are often elicited for a case population and rarely for each stratum; reusing independent prior distributions of the TPRs across all the strata will lead to overly optimistic posterior uncertainty in $\pi^*$, hampering policy decisions.
The Rest of Talk
More focus on model formulation; inference done by ‘baker’
Extend the npLCM to perform regression analysis in case-control disease etiology studies that
(a) incorporates controls to estimate the CSCFs ($\pi$),
(b) specifies parsimonious functional dependence of $\pi$ upon covariates such as additivity, and
(c) correctly assesses the posterior uncertainty of the CSCF functions and the overall CSCFs $\pi^*$ by applying the TPR priors just once.
Now: how to incorporate covariates, and into which quantities?
Regression Extension for $P_0$ and $P_1$: letting $\pi_\ell$, $\nu_k$, $\eta_k$ depend on covariates
Roadmap
Let three sets of parameters in an npLCM (pg. 17) depend on the observed covariates:
1x. Etiology regression function among cases, $\{\pi_\ell(x), \ell \neq 0\}$, which is of primary scientific interest
2x. Conditional probability of measurements $m$ given covariates $w$ in controls: $P_0(m; w) = [M = m \mid W = w, I = 0]$
3x. Same as 2x, but in the case class $\ell$: $P_{1\ell}(m; w) = [M = m \mid W = w, I = \ell]$, $\ell = 1, \ldots, L$
Note: keep the specifications for the TPRs and FPRs $(\Theta, \Psi)$ as in the original npLCM.
Etiology Regression $\pi_\ell(X)$
$\pi_\ell(X)$ is the primary target of inference.
1. Recall that $I_i = \ell$ represents case $i$'s disease being caused by pathogen $\ell$.
2. This event occurs with probability $\pi_{i\ell}$, which depends upon covariates.
3. Over-parameterized multinomial logistic regression: $\pi_{i\ell} = \pi_\ell(X_i) = \exp\{\phi_\ell(X_i)\} / \sum_{\ell'=1}^{L} \exp\{\phi_{\ell'}(X_i)\}$, $\ell = 1, \ldots, L$, where $\phi_\ell(X_i) - \phi_L(X_i)$ is the log odds of case $i$ being in disease class $\ell$ relative to $L$: $\log(\pi_{i\ell} / \pi_{iL})$.
4. Without specifying a baseline category, we treat all the disease classes symmetrically, which simplifies prior specification.
5. Additive models for $\phi_\ell(x; \Gamma^\pi_\ell) = \sum_{j=1}^{p_1} f^\pi_{\ell j}(x_j; \beta^\pi_{\ell j}) + \tilde{x}^\top \gamma^\pi_\ell$
5a. Use a B-spline basis expansion to approximate $f^\pi_{\ell j}(\cdot)$ and use P-splines for estimating smooth functions.
5b. $\tilde{x}$ is the subvector of the predictors $x$; $\Gamma^\pi_\ell = (\beta^\pi_{\ell j}, \gamma^\pi_\ell)$.
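Item 3 is a softmax over the class-specific predictors, and "over-parameterized" means that adding the same constant to every $\phi_\ell$ leaves the fractions unchanged. The sketch below illustrates both facts; the linear form of `linear_phi` and its coefficients are placeholders for the additive spline model in item 5, not the talk's specification.

```python
import math

def etiology_fractions(phi):
    """pi_l = exp(phi_l) / sum_l' exp(phi_l'), computed with a max shift for stability."""
    shift = max(phi)
    expd = [math.exp(v - shift) for v in phi]
    total = sum(expd)
    return [e / total for e in expd]

def linear_phi(x, intercepts, slopes):
    """Hypothetical stand-in for phi_l(x): one intercept and one slope per class."""
    return [a + b * x for a, b in zip(intercepts, slopes)]

# Made-up coefficients for L = 3 classes and a single scalar covariate x
intercepts = [0.2, -1.0, 0.5]
slopes = [0.3, 0.0, -0.4]

phi = linear_phi(1.0, intercepts, slopes)
pi_x = etiology_fractions(phi)
pi_shifted = etiology_fractions([v + 10.0 for v in phi])  # same constant added to all classes

print(pi_x)
print(math.log(pi_x[0] / pi_x[2]), phi[0] - phi[2])  # log odds relative to class L
```

Only differences such as $\phi_\ell - \phi_L$ are identified by the data, which is why the symmetric parameterization needs priors (or constraints) to pin down the individual $\phi_\ell$.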
$P_0$: Multivariate binary regression for controls
Desirable properties
Model Specification:
• Model space large enough for complex conditional dependence of $M$ given covariates $W$
• Upward compatibility, or reproducibility (invariant parameter interpretation with increasing dimensions or complex patterns of missing responses)
Estimation:
• Adaptivity: regularization to adapt to the difficulty of the problem, e.g., model residual dependence $[M \mid W, I = 0]$ only if necessary; model the effect of covariates only if necessary
Let $P_0$ depend on $W_i$

Regression model for controls
• The pmf for controls' measurements:
\[ \Pr(M_i = m \mid W_i, I_i = 0) = \sum_{k=1}^{K} \nu_k(W_i)\, \Pi(m; \Psi_k), \quad \Psi_k = (\psi_k^{(1)}, \ldots, \psi_k^{(J)})' \]
• The vector $(\nu_1(W_i), \ldots, \nu_K(W_i))$ lies in a $(K-1)$-simplex
• $\Pi(m; s) = \prod_{j=1}^{J} \{s_j\}^{m_j} (1 - s_j)^{1 - m_j}$
• An equivalent generative process:
  sample subclass indicator: $Z_i \mid W_i \sim \text{Categorical}_K(\nu(W_i))$
  generate measurements: $M_{ij} \mid Z_i = k \sim \text{Bernoulli}(\psi_k^{(j)})$, independently for $j = 1, \ldots, J$.
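The mixture pmf and its generative process can be sketched as follows, assuming $K = 2$ subclasses, $J = 3$ measurements, and fixed weights $\nu$ at a single covariate value (all numbers and function names are hypothetical). The sketch checks that the pmf sums to one over all $2^J$ patterns and draws one control by the two-step process.

```python
import itertools
import random

def bernoulli_product(m, s):
    """Pi(m; s): product over j of s_j^m_j * (1 - s_j)^(1 - m_j)."""
    prob = 1.0
    for m_j, s_j in zip(m, s):
        prob *= s_j if m_j == 1 else 1.0 - s_j
    return prob

def control_pmf(m, nu, psi):
    """Mixture pmf: sum over subclasses k of nu_k * Pi(m; Psi_k)."""
    return sum(nu_k * bernoulli_product(m, psi_k) for nu_k, psi_k in zip(nu, psi))

def sample_control(nu, psi, rng):
    """Draw the subclass Z, then independent Bernoulli measurements given Z."""
    z = rng.choices(range(len(nu)), weights=nu, k=1)[0]
    m = tuple(1 if rng.random() < p else 0 for p in psi[z])
    return z, m

# Hypothetical weights nu_k(W_i) and subclass-specific positive rates Psi_k.
nu = [0.7, 0.3]
psi = [[0.1, 0.2, 0.05], [0.6, 0.5, 0.4]]

patterns = list(itertools.product([0, 1], repeat=3))
total_mass = sum(control_pmf(m, nu, psi) for m in patterns)

# Marginal rates and one pairwise joint, computed from the mixture pmf.
marg_1 = sum(control_pmf(m, nu, psi) for m in patterns if m[0] == 1)
marg_2 = sum(control_pmf(m, nu, psi) for m in patterns if m[1] == 1)
joint_12 = sum(control_pmf(m, nu, psi) for m in patterns if m[0] == 1 and m[1] == 1)

rng = random.Random(2019)
z, m = sample_control(nu, psi, rng)
```

With these values the joint probability of the first two measurements being positive differs from the product of their marginals, which is how mixing over subclasses induces dependence among measurements that are independent within a subclass.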
Let $P_0$ depend on $W_i$

Regression model for controls

Stick-breaking parametrization of weight functions $\nu_k(W_i) = P(Z_i = k \mid W_i)$ by
\[ h_k(W_i; \Gamma^\nu_k) = \begin{cases} \underbrace{g(\alpha^\nu_{ik}) \prod_{s<k} \{1 - g(\alpha^\nu_{is})\}}_{\text{stick } k}, & \text{if } k < K, \\ \prod_{s<k} \{1 - g(\alpha^\nu_{is})\}, & \text{if } k = K, \end{cases} \]
where $g(\cdot) = 1/(1 + \exp\{-(\cdot)\})$. We specify $\alpha^\nu_{ik}$ via additive models:
\[ \alpha^\nu_{ik} = \mu_{k0} + \sum_{j=1}^{q_1} f_{kj}(W_{ij}; \beta^\nu_{kj}) + \tilde{W}_i^\top \gamma^\nu_k, \quad k = 1, \ldots, K-1. \]
Expand the smooth functions by B-spline bases with coefficients $\beta^\nu_{kj}$; $\tilde{w}$ is a subvector of covariates $w$.
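A small sketch of the stick-breaking map from $K-1$ linear predictors $\alpha^\nu_{i1}, \ldots, \alpha^\nu_{i,K-1}$ to $K$ weights, assuming the predictors have already been evaluated at a given $W_i$ (the values of `alpha` and the function names are hypothetical). The last weight takes whatever stick length remains, so the weights always sum to one.

```python
import math

def expit(a):
    """Logistic function g(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))

def stick_breaking_weights(alpha):
    """Map K-1 predictors alpha_1..alpha_{K-1} to K weights on the simplex."""
    weights = []
    remaining = 1.0  # stick length left: prod over s < k of {1 - g(alpha_s)}
    for a in alpha:
        v = expit(a)
        weights.append(v * remaining)  # weight for k < K
        remaining *= 1.0 - v
    weights.append(remaining)  # weight for k = K is the leftover stick
    return weights

# Hypothetical alpha_ik values for K = 4 subclasses.
alpha = [0.0, 1.0, -2.0]
nu = stick_breaking_weights(alpha)
```

A large positive $\alpha^\nu_{ik}$ makes $g(\alpha^\nu_{ik})$ close to one, so subclass $k$ takes almost all of the remaining stick and later subclasses receive little weight.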
Adaptivity

Considerations → Proposed Model
• Prevent overfitting when the regression is easy, and improve interpretability
• We a priori place substantial probabilities on models with the following two features:
  a) Few subclasses with effective weights (in the sense that $\nu_k(\cdot)$ is bounded away from 0 and 1): a novel additive half-Cauchy prior for $\mu_{k0}$.
  b) Smooth weight regression curves $\nu_k(\cdot)$: by Bayesian Penalized-Splines (P-Splines) combined with mixture priors on spline coefficients to sensitively distinguish constant $\alpha^\nu_k(\cdot)$ from flexible smooth curves
On Consideration a): "Uniform Shrinkage over Simplex" for $\nu_k(W)$

Proposed Model
• We let $\mu_{k0} = \sum_{j=1}^{k} \mu^*_{j0}$, with $\mu^*_{j0} > 0$, so that $\mu_{k0}$ is large for a large $k$.
• $\mu_{k0}$ increases with $k$, making the stick-breaking a priori more likely to stop for a large $k$
• We specify the prior distributions for $\mu^*_{j0}$ to be heavy-tailed: $\mu^*_{j0} \sim \text{Cauchy}^+(0, s_j)$, $j = 1, \ldots, K$
• A large $s_k$ produces a large $\mu^*_{k0}$ and helps stop the stick-breaking at class $k$.
• Encourages using a small number of effective classes ($< K$) to approximate the observed $2^J$ probability contingency table in finite samples
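The additive half-Cauchy construction can be illustrated with a short simulation, assuming unit scales $s_j = 1$ and drawing each increment by the half-Cauchy inverse CDF, $s \tan(\pi u / 2)$ for uniform $u$ (the scales, seed, and function names are hypothetical). It is a sketch of the prior only and does not reproduce the talk's posterior inference.

```python
import math
import random

def expit(a):
    """Logistic function g(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))

def sample_intercepts(scales, rng):
    """Draw mu*_j0 ~ Cauchy+(0, s_j) and return the cumulative sums mu_k0."""
    mu = []
    running = 0.0
    for s in scales:
        u = rng.random()
        running += s * math.tan(math.pi * u / 2.0)  # half-Cauchy quantile at u
        mu.append(running)
    return mu

rng = random.Random(41)
scales = [1.0, 1.0, 1.0, 1.0]  # hypothetical hyperparameters s_j
mu = sample_intercepts(scales, rng)

# With no covariate effects, g(mu_k0) is the fraction of the remaining stick
# that subclass k takes; it is at least 1/2 and nondecreasing in k.
break_fractions = [expit(m) for m in mu]
```

Because the increments are positive, the fraction of the remaining stick taken at step $k$ can only grow with $k$, and a single heavy-tailed draw is enough to push it close to one and effectively end the stick-breaking there.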
[Figure: Inference of $\nu_k(x)$ at three hyperparameter values $s_j$. Simulation with a single continuous covariate; lines show the truth and posterior samples. X-axis: covariate values. Y-axis: weight, 0 to 1.]
Let $P_1$ depend on $X$ and $W$

Subclass Weight Regression: For Cases

The pmf for cases' measurements:
\[ \Pr(M_i = m) = \sum_{\ell=1}^{L} \pi_{i\ell} \sum_{k=1}^{K} \eta_{ik}\, \Pi(m; p_{k\ell}) \]
• $p_{k\ell} = \{p^{(j)}_{k\ell}, j = 1, \ldots, J\}$ are positive rates for $J$ measurements in subclass $k$ of disease class $\ell$:
\[ p^{(j)}_{k\ell} = \left\{\theta^{(j)}_k\right\}^{I\{j = \ell\}} \cdot \left\{\psi^{(j)}_k\right\}^{1 - I\{j = \ell\}} \]
• $p^{(j)}_{k\ell}$ equals the TPR $\theta^{(j)}_k$ for a causative pathogen and the FPR $\psi^{(j)}_k$ otherwise
• Subclass weight regression $\eta_k(W)$ is also specified via stick-breaking: $\eta_{ik} = h_k(W_i; \Gamma^\eta_k)$, $k = 1, \ldots, K-1$
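A sketch of the case pmf, assuming for illustration that each of the $L$ disease classes corresponds to one of the $J$ measured pathogens ($L = J = 3$), with $K = 2$ subclasses and fixed $\pi_{i\ell}$ and $\eta_{ik}$ at one covariate value. All numbers and function names are hypothetical.

```python
import itertools

def bernoulli_product(m, s):
    """Pi(m; s): product over j of s_j^m_j * (1 - s_j)^(1 - m_j)."""
    prob = 1.0
    for m_j, s_j in zip(m, s):
        prob *= s_j if m_j == 1 else 1.0 - s_j
    return prob

def case_positive_rates(theta_k, psi_k, cause):
    """p_kl^(j): TPR theta_k^(j) if j is the causative pathogen, else FPR psi_k^(j)."""
    return [theta_k[j] if j == cause else psi_k[j] for j in range(len(psi_k))]

def case_pmf(m, pi, eta, theta, psi):
    """Sum over disease classes l and subclasses k of pi_l * eta_k * Pi(m; p_kl)."""
    total = 0.0
    for cause, pi_l in enumerate(pi):
        for k, eta_k in enumerate(eta):
            p_kl = case_positive_rates(theta[k], psi[k], cause)
            total += pi_l * eta_k * bernoulli_product(m, p_kl)
    return total

# Hypothetical etiology probabilities, subclass weights, TPRs, and FPRs.
pi = [0.5, 0.3, 0.2]
eta = [0.6, 0.4]
theta = [[0.9, 0.8, 0.85], [0.95, 0.7, 0.9]]
psi = [[0.1, 0.2, 0.05], [0.6, 0.5, 0.4]]

patterns = list(itertools.product([0, 1], repeat=3))
total_mass = sum(case_pmf(m, pi, eta, theta, psi) for m in patterns)
rates_cause_2 = case_positive_rates(theta[0], psi[0], 1)
```

The FPRs $\psi^{(j)}_k$ are the same parameters that appear in the control model, so control data inform the non-causative entries of $p_{k\ell}$ while the TPRs enter only through the causative pathogen.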