Biostatistics: Logistic regression
Burkhardt Seifert & Alois Tschopp
Biostatistics Unit, University of Zurich
Master of Science in Medical Biology
Logistic regression

Of great importance for medical research.

So far: "ordinary" regression
  explains an "outcome" variable $y$ through explanatory variables $x_1, \dots, x_k$
  quantitative outcome variable $y$ (normally distributed)
  relation usually assumed to be linear

New with logistic regression: the outcome $y$ is binary
Examples

A. $y$ = patient survives ($y = 0$) or dies ($y = 1$)
   $x_1$ = therapy ($x_1$ = A, B; nominal)
   $x_2$ = age (in years; continuous)
   $x_3, \dots$ = laboratory parameters

B. Case-control study (epidemiology)
   $y$ = case ($y = 1$) or control ($y = 0$)
   $x_1$ = exposed ($x_1 = 1$) or not ($x_1 = 0$)
   $x_2, \dots$ = confounders

With a single independent variable $x$, the statistical analysis can also use:
   Mann-Whitney test (or unpaired t-test) for a continuous $x$
   Fisher's exact test (or $\chi^2$ test) for a categorical $x$
Int. J. Cancer: 121, 1764–1770 (2007) © 2007 Wiley-Liss, Inc.

Consistent expression of the stem cell renewal factor BMI-1 in primary and metastatic melanoma

Daniela Mihic-Probst¹*, Ariana Kuster¹, Sandra Kilgus¹, Beata Bode-Lesniewska¹, Barbara Ingold-Heppner¹, Carly Leung¹, Martina Storz¹, Burkhardt Seifert², Silvia Marino³, Peter Schraml¹, Reinhard Dummer⁴ and Holger Moch¹

¹ Department of Pathology, Institute of Surgical Pathology, University Hospital Zurich, Zurich, Switzerland
² Department of Biostatistics, University of Zurich, Zurich, Switzerland
³ Institute of Pathology, Barts and the London, Queen Mary School of Medicine and Dentistry, London, United Kingdom
⁴ Department of Dermatology, University Hospital Zurich, Zurich, Switzerland

Stem cell-like cells have recently been identified in melanoma cell lines, but their relevance for melanoma pathogenesis is controversial. To characterize the stem cell signature of melanoma, expression of stem cell markers BMI-1 and nestin was studied in 64 cutaneous melanomas, 165 melanoma metastases as well as 53 melanoma cell lines. Stem cell renewal factor BMI-1 is a transcriptional repressor of the Ink4a/Arf locus encoding p16ink4a and p14Arf. Increased nuclear BMI-1 expression was detectable in 41 of 64 (64%) primary melanomas, 117 of 165 melanoma metastases (71%) and 15 of 53 (28%) melanoma cell lines. High nestin expression was observed in 14 of 56 primary melanomas (25%), 84 of 165 melanoma metastases (50%) and 21 of 53 melanoma cell lines (40%). There was a significant correlation between BMI-1 and nestin expression in cell lines (p = 0.001) and metastases (p = 0.02). These data indicate that cells in primary melanomas and their metastases may have stem cell properties. Cell lines obtained from melanoma metastases showed a significant higher BMI-1 expression compared to cell lines from primary melanoma (p = 0.001). Further, primary melanoma lacking lymphatic metastases at presentation (pN0, n = 40) was less frequently BMI-1 positive than melanomas presenting with lymphatic metastases (pN1; n = 24; 52% versus 83%; p = 0.01). Therefore, BMI-1 expression appears to induce a metastatic tendency. Because BMI-1 functions as a transcriptional repressor of the Ink4a/Arf locus, p16ink4a and p14Arf expression was also analyzed. A high BMI-1/low p16ink4a expression pattern was a significant predictor of metastasis by means of logistic regression analysis (p = 0.005). This suggests that BMI-1 mediated repression of p16ink4a may contribute to an increased aggressive behavior of stem cell-like melanoma cells.
Statistics

BMI-1, p16ink4a, p14Arf and nestin expression in primary melanoma were compared between different patient groups using the Mann-Whitney test. Correlations between BMI-1, p16ink4a, p14Arf, nestin and Breslow tumor thickness were analyzed using Spearman's rank correlation. Differences in tumor-specific survival between groups were calculated by log rank test. A logistic regression was performed to evaluate the predictive power of BMI-1 and p16ink4a expression in primary malignant melanoma for lymph node metastasis. p-Values below 0.05 were considered as significant. SPSS 12.0.1 for Windows (SPSS) was used for statistical analyses.

TABLE II – RELATIVE RISK OF LYMPH NODE METASTASIS ACCORDING TO BMI-1 AND P16INK4A EXPRESSION LEVELS IN PRIMARY MELANOMA

                                       n       Univariate OR      p-value   Multivariate OR    p-value
p16ink4a low vs. high¹                 35/29   3.0 (1.0–8.6)²     0.04      2.7 (0.89–8.1)     0.08
BMI-1 high vs. low¹                    41/23   4.5 (1.3–15.6)     0.02      4.1 (1.2–14.6)     0.03
p16ink4a low/BMI-1 high vs. others¹    22/42   3.2 (1.4–7.3)      0.005
Odds ratio (OR)

Example: identification of risk factors for lymph node metastases in prostate cancer (Brown, 1980)

$n = 52$ patients
$y$ = nodal metastases (0 = none, 1 = metastases)
$x$ = age, phosphatase, X-ray result, tumor size, tumor grade.
The first two $x$-variables are continuous, the rest binary.

Contingency table for the relation between nodal metastases and X-ray result:

                               X-ray result
                               x = 0   x = 1   total
no nodal metastases (y = 0)      28       4      32
nodal metastases (y = 1)          9      11      20
total                            37      15      52

sensitivity = 11/20 = 55%, specificity = 28/32 = 87.5%
$\chi^2$ test: $p = 0.001$
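The figures on this slide can be checked with a few lines of Python. This is a minimal sketch using only the standard library; it assumes the slide's $\chi^2$ test is the Pearson test without continuity correction (the slide does not say), and the variable names are illustrative.

```python
import math

# 2x2 table from the slide: rows are y (nodal metastases), columns are x (X-ray result)
a, b = 28, 4   # y = 0: x = 0, x = 1
c, d = 9, 11   # y = 1: x = 0, x = 1
n = a + b + c + d

sensitivity = d / (c + d)   # P(x = 1 | y = 1)
specificity = a / (a + b)   # P(x = 0 | y = 0)

# Pearson chi-square statistic for a 2x2 table (assumed: no continuity correction)
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Upper tail of the chi-square distribution with 1 degree of freedom,
# expressed through the complementary error function
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"sensitivity = {sensitivity:.3f}, specificity = {specificity:.3f}")
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

Under that assumption the statistic is about 10.8 on 1 degree of freedom, which is consistent with the reported $p = 0.001$.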
Relative risk (RR) or odds ratio (OR)?

         x = 0   x = 1   total
y = 0      28       4      32
y = 1       9      11      20
total      37      15      52

"Risk" is defined as $P(y = 1 \mid x) = p(x)$, so
$p(0) = 9/37 = 24\%$, $p(1) = 11/15 = 73\%$

$$RR = \frac{p(1)}{p(0)} = \frac{11 \times 37}{15 \times 9} = 3.0$$

RR is only valid for a representative sample.

From betting we know "odds":

$$\frac{P(y = 1 \mid x)}{P(y = 0 \mid x)} = \frac{p(x)}{1 - p(x)}$$
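As a small illustration, the risks, the relative risk and the odds can be computed directly from the table counts. The function names `risk` and `odds` are chosen here for illustration only.

```python
# Counts from the slide's 2x2 table, keyed by (y, x)
counts = {(0, 0): 28, (0, 1): 4, (1, 0): 9, (1, 1): 11}

def risk(x):
    """Estimated P(y = 1 | x): cases with this x divided by all patients with this x."""
    return counts[(1, x)] / (counts[(0, x)] + counts[(1, x)])

def odds(x):
    """Estimated odds P(y = 1 | x) / P(y = 0 | x)."""
    p = risk(x)
    return p / (1 - p)

rr = risk(1) / risk(0)

print(f"p(0) = {risk(0):.2f}, p(1) = {risk(1):.2f}, RR = {rr:.1f}")
print(f"odds(0) = {odds(0):.2f}, odds(1) = {odds(1):.2f}")
```

Note that the odds reduce to simple count ratios: 9/28 for $x = 0$ and 11/4 for $x = 1$.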
Relative risk (RR) or odds ratio (OR)?

In epidemiology the "odds ratio" is a measure for the relative risk:

         x = 0   x = 1
y = 0      28       4
y = 1       9      11

$$OR = \frac{P(y = 1 \mid x = 1) \,/\, \bigl(1 - P(y = 1 \mid x = 1)\bigr)}{P(y = 1 \mid x = 0) \,/\, \bigl(1 - P(y = 1 \mid x = 0)\bigr)} = \frac{28 \times 11}{9 \times 4} = 8.6$$

(cross-product of the two diagonals of the table)

The OR is also valid for case-control studies.

For rare diseases, OR and RR are nearly equal:

$$OR = \frac{p(1)}{1 - p(1)} \cdot \frac{1 - p(0)}{p(0)} \approx \frac{p(1)}{p(0)}$$
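The following sketch checks the cross-product formula against the definition via odds, and then illustrates the rare-disease approximation. The risks 0.001 and 0.003 in the second part are hypothetical values chosen for illustration; they do not come from the slides.

```python
def odds_ratio(p1, p0):
    """Odds ratio for risk p1 (exposed) versus risk p0 (unexposed)."""
    return (p1 / (1 - p1)) / (p0 / (1 - p0))

# Slide example: cross-product of the 2x2 table versus the definition via risks
or_cross_product = (28 * 11) / (9 * 4)
or_from_risks = odds_ratio(11 / 15, 9 / 37)
print(f"OR (cross-product) = {or_cross_product:.1f}, OR (from risks) = {or_from_risks:.1f}")

# Hypothetical rare disease: both risks are small, so 1 - p is close to 1
p0_rare, p1_rare = 0.001, 0.003
rr_rare = p1_rare / p0_rare
or_rare = odds_ratio(p1_rare, p0_rare)
print(f"rare disease: RR = {rr_rare:.3f}, OR = {or_rare:.3f}")
```

In the slide example the disease is not rare (20 of 52 patients), which is why OR (about 8.6) and RR (about 3.0) differ so much there.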
Modelling by means of logistic regression

What is fundamental for a (simple) regression?

Model: $y_i = f(x_i, \beta) + \varepsilon_i \quad (i = 1, \dots, n)$

where:
  $f$ = pre-specified function, e.g. linear: $f(x_i, \beta_0, \beta_1) = \beta_0 + \beta_1 x_i$
  regression function $f(x, \beta)$ = conditional expectation of $y$, given the value $x$, i.e. $E(y \mid x) = f(x, \beta)$

"Outcome" is a binary event: "success" ($y = 1$), "failure" ($y = 0$)
  probability of success: $p = P(y = 1)$
  $E(y) = 0 \times P(y = 0) + 1 \times P(y = 1) = p$
Why not use ordinary regression?

Example:
  $y$ = presence of nodal metastases
  $x$ = phosphatase (logarithmised)

[Figure: observed $y$ (0 or 1) against phosphatase (log scale, 0.3 to 1.5); vertical axis P(nodal metastases), from −0.2 to 1.0]

Regression = conditional mean of $y$ given $x$ → $E(y \mid x)$. Thus:

$$E(y \mid x) = P(y = 1 \mid x) = p(x)$$

A probability is modelled, so it lies between 0 and 1.
→ It is plausible to model $p(x)$ as a distribution function.
[Figure: P(nodal metastases), from 0.0 to 1.0, against phosphatase (log scale, 0.3 to 1.5), with the observations grouped into five phosphatase classes]

class   phosphatase     y = 0   y = 1   OR vs. previous class
1       (not stated)       2       0
2       [0.41–0.57]       16       4    ∞
3       [0.58–0.79]       10       8    3.2
4       [0.80–1.09]        4       6    1.9
5       (not stated)       0       2    ∞

OR for [0.58–0.79] vs. [0.41–0.57]:

$$\frac{16 \times 8}{10 \times 4} = 3.2$$

OR for [0.80–1.09] vs. [0.41–0.57]:

$$\frac{16 \times 6}{4 \times 4} = \frac{16 \times 8}{10 \times 4} \times \frac{10 \times 6}{4 \times 8} = 3.2 \times 1.9 = 6$$

The OR for a change of more than one class is multiplicative.
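The multiplicative behaviour can be verified from the class counts on this slide. This is a minimal sketch; the helper `odds_ratio` and the 0-based class indexing are illustrative choices.

```python
import math

# Counts per phosphatase class (lowest to highest), as read from the slide
y0 = [2, 16, 10, 4, 0]   # no nodal metastases
y1 = [0, 4, 8, 6, 2]     # nodal metastases

def odds_ratio(i, j):
    """OR of class i versus class j (0-based indices).

    Returns math.inf when a zero cell in the denominator makes the OR unbounded.
    """
    numerator = y1[i] * y0[j]
    denominator = y0[i] * y1[j]
    return numerator / denominator if denominator else math.inf

or_3_vs_2 = odds_ratio(2, 1)   # [0.58-0.79] vs. [0.41-0.57]
or_4_vs_3 = odds_ratio(3, 2)   # [0.80-1.09] vs. [0.58-0.79]
or_4_vs_2 = odds_ratio(3, 1)   # [0.80-1.09] vs. [0.41-0.57]

print(f"OR 3 vs 2 = {or_3_vs_2:.1f}, OR 4 vs 3 = {or_4_vs_3:.3f}")
print(f"OR 4 vs 2 = {or_4_vs_2:.1f}, product = {or_3_vs_2 * or_4_vs_3:.1f}")
```

The two outermost comparisons involve an empty cell, which is why the slide lists their odds ratios as ∞.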
Which distribution function to use?

Assumption: the odds ratio for adjoining classes is constant
(similar to the assumption of a constant slope of the regression function in linear regression)

As the OR is multiplicative, log(OR) must be linear.

→ for the log-odds (logits):

$$\log\!\left(\frac{p(x)}{1 - p(x)}\right) = \beta_0 + \beta_1 x$$

(log = natural logarithm = $\log_e$)

→ $p(x)$ is the logistic distribution function:

$$p(x) = \frac{\exp(\beta_0 + \beta_1 x)}{1 + \exp(\beta_0 + \beta_1 x)}$$
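To make the two formulas concrete, here is a minimal sketch of the logit and its inverse, applied to the binary X-ray variable from the earlier contingency table. With a single binary $x$ the model has two parameters for two observed proportions, so the coefficients that reproduce the observed risks can be written down directly; no fitting routine is used here. The function names are illustrative.

```python
import math

def logit(p):
    """Log-odds of a probability p."""
    return math.log(p / (1 - p))

def logistic(z):
    """Logistic distribution function, the inverse of logit."""
    return math.exp(z) / (1 + math.exp(z))

# Observed risks from the X-ray table: p(0) = 9/37, p(1) = 11/15
p0, p1 = 9 / 37, 11 / 15

beta0 = logit(p0)               # log-odds at x = 0
beta1 = logit(p1) - logit(p0)   # log odds ratio for x = 1 vs. x = 0

print(f"beta0 = {beta0:.3f}, beta1 = {beta1:.3f}, exp(beta1) = {math.exp(beta1):.1f}")
print(f"p(0) = {logistic(beta0):.3f}, p(1) = {logistic(beta0 + beta1):.3f}")
```

Here `exp(beta1)` recovers the odds ratio of about 8.6 from the contingency table, which is the usual reading of a logistic regression coefficient.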
♣ Linearity of the logit-transformation

Assumption: the OR for $x = x_0 + c$ vs. $x = x_0$ is constant in $x_0$; call it $OR(c)$.

OR multiplicative → $OR(c) = OR(1)^c$

Is $g(x) = \log\!\left(\dfrac{p(x)}{1 - p(x)}\right)$ linear?

$OR(c)$: true OR for "$x = c$" vs. $x = 0$

$$\log(OR(c)) = g(c) - g(0)$$

Logarithmise $OR(x) = OR(1)^x$:

$$g(x) - g(0) = \log(OR(1)) \, x$$
$$g(x) = g(0) + \log(OR(1)) \, x$$
$$g(x) = \beta_0 + \beta_1 x \quad \text{with } \beta_0 = g(0) \text{ and } \beta_1 = \log(OR(1))$$
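The converse direction is easy to check numerically: if the logit is linear, the OR for a shift of $c$ does not depend on the starting point $x_0$ and equals $OR(1)^c$. The coefficients below are hypothetical values chosen for illustration only.

```python
import math

# Hypothetical coefficients, not estimated from any data in the slides
beta0, beta1 = -1.0, 0.7

def p(x):
    """Logistic model: P(y = 1 | x)."""
    z = beta0 + beta1 * x
    return math.exp(z) / (1 + math.exp(z))

def odds(x):
    return p(x) / (1 - p(x))

def odds_ratio(c, x0):
    """OR for x = x0 + c versus x = x0."""
    return odds(x0 + c) / odds(x0)

or1 = odds_ratio(1, 0.0)   # equals exp(beta1)

# The OR depends only on the shift c, not on the starting point x0
for x0 in (-2.0, 0.0, 1.5):
    for c in (1, 2, 3):
        assert math.isclose(odds_ratio(c, x0), or1 ** c, rel_tol=1e-9)

print(f"OR(1) = {or1:.4f}, exp(beta1) = {math.exp(beta1):.4f}")
```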