

  1. Lecture 10: Introduction to Logistic Regression. Ani Manichaikul, amanicha@jhsph.edu. 2 May 2007.

  2. Logistic Regression
     - Regression for a response variable that follows a binomial distribution
     - Recall the “binomial model” and the binomial distribution

  3. Binomial Model
     - n independent trials (e.g., coin tosses)
     - p = probability of success on each trial (e.g., p = 1/2 = Pr of Heads)
     - Y = number of successes out of n trials (e.g., Y = number of heads)

  4. Binomial Distribution
     - P(Y = y) = C(n, y) p^y (1 - p)^(n - y), where C(n, y) is the binomial coefficient "n choose y"
     - [Example: plot of a binomial distribution]
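As a quick check of the formula above, here is a minimal Python sketch using the slide's coin-toss setup; the specific values n = 10 and y = 6 are illustrative choices, not from the slides, and scipy is assumed to be available.

```python
from math import comb
from scipy.stats import binom

# Coin-toss example: n independent trials, p = 1/2 = Pr(Heads)
n, p = 10, 0.5          # illustrative values; the slide does not fix n
y = 6                   # probability of exactly 6 heads

# Direct use of the formula P(Y = y) = C(n, y) * p^y * (1 - p)^(n - y)
p_direct = comb(n, y) * p**y * (1 - p)**(n - y)

# Same quantity from scipy's binomial distribution
p_scipy = binom.pmf(y, n, p)

print(p_direct, p_scipy)   # both ~0.205
```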

  5. Why can’t we use regular regression (SLR or MLR)?

  6. Cannot use Linear Regression
     - The response, Y, is NOT normally distributed
     - The variability of Y is NOT constant, since the variance, Var(Y) = pq, depends on the expected response, E(Y) = p
     - The predicted/fitted values must be such that the corresponding probabilities are between 0 and 1

  7. Example
     - Consider a phase I clinical trial in which 35 independent patients are given a new medication for pain relief. Of the 35 patients, 22 report “significant” relief one hour after medication.
     - Question: How effective is the drug?

  8. Model
     - Y = # patients who get relief
     - n = 35 patients (trials)
     - p = probability of relief for any patient
     - The truth we seek in the population: how effective is the drug? What is p?
     - Get the best estimate of p given the data
     - Determine the margin of error: a range of plausible values for p

  9. Maximum Likelihood Method
     - The method of maximum likelihood estimation chooses values for parameter estimates which make the observed data “maximally likely” under the specified model

  10. Maximum Likelihood
     - For the binomial model, we have observed Y = y and P(Y = y) = C(n, y) p^y (1 - p)^(n - y)
     - So for this example: P(Y = 22) = C(35, 22) p^22 (1 - p)^13

  11. Maximum Likelihood
     - So, estimate p by choosing the value of p which makes the observed data “maximally likely”
     - i.e., choose the p that makes the value of Pr(Y = 22) maximal
     - The ML estimate is y/n = 22/35 = 0.63, the estimated proportion of patients who will experience relief
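The slide states the result y/n without derivation; as a minimal sketch of the standard calculus argument (written in LaTeX, using the data from this example), the log-likelihood is maximized exactly at 22/35:

```latex
\ell(p) = \log P(Y = 22) = \text{const} + 22\log p + 13\log(1 - p)

\frac{d\ell}{dp} = \frac{22}{p} - \frac{13}{1 - p} = 0
\quad\Longrightarrow\quad 22(1 - p) = 13p
\quad\Longrightarrow\quad \hat{p} = \frac{22}{35} \approx 0.63
```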

  12. Maximum Likelihood
     - [Figure: the likelihood function Pr(Y = 22 of 35) plotted against p = Pr(Event); the curve peaks at the MLE, p = 0.63]
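Since the original likelihood plot does not survive the text export, here is a minimal Python sketch (assuming numpy and scipy) that recomputes the curve on a grid of p values and confirms the maximum near 0.63.

```python
import numpy as np
from scipy.stats import binom

n, y = 35, 22                              # observed data from the trial
p_grid = np.linspace(0.001, 0.999, 999)    # candidate values of p

likelihood = binom.pmf(y, n, p_grid)       # Pr(Y = 22) as a function of p
p_hat = p_grid[np.argmax(likelihood)]

print(round(p_hat, 3))                     # ~0.629, i.e. 22/35
```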

  13. Confidence Interval for p
     - Variance of p̂: Var(p̂) = p(1 - p)/n = pq/n
     - “Standard Error” of p̂: sqrt(pq/n)
     - Estimate of “Standard Error” of p̂: sqrt(p̂ q̂ / n)

  14. Confidence Interval for p
     - 95% Confidence Interval for the ‘true’ proportion, p:
       p̂ ± 1.96 sqrt(p̂ q̂ / n) = 0.63 ± 1.96 sqrt((0.63)(0.37)/35)
       = (0.63 - 1.96(0.082), 0.63 + 1.96(0.082)) = (0.47, 0.79)
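A minimal Python sketch of the same Wald-type interval, assuming only numpy; it reproduces the slide's standard error of about 0.082 and the (0.47, 0.79) interval.

```python
import numpy as np

y, n = 22, 35
p_hat = y / n                             # ~0.63: estimated proportion with relief
se = np.sqrt(p_hat * (1 - p_hat) / n)     # estimated standard error, ~0.082

lower = p_hat - 1.96 * se
upper = p_hat + 1.96 * se
print(round(se, 3), (round(lower, 2), round(upper, 2)))   # 0.082 (0.47, 0.79)
```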

  15. Conclusion
     - Based upon our clinical trial, in which 22 of 35 patients experienced relief, we estimate that 63% of persons who receive the new drug will experience relief within 1 hour (95% CI: 47% to 79%)

  16. Conclusion
     - Whether 63% (47% to 79%) represents an ‘effective’ drug will depend on many things, especially on the science of the problem:
       - Sore throat pain?
       - Arthritis pain?
       - Accidentally cut your leg off pain?

  17. Aside: Probabilities and Odds
     - The odds of an event are defined as:
       odds(Y = 1) = P(Y = 1) / P(Y = 0) = P(Y = 1) / [1 - P(Y = 1)] = p / (1 - p)

  18. Probabilities and Odds
     - We can go back and forth between odds and probabilities:
     - Odds = p / (1 - p)
     - p = odds / (odds + 1)
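A minimal Python sketch of these two conversions; the helper names are hypothetical, and only the formulas come from the slide.

```python
def prob_to_odds(p: float) -> float:
    """Convert a probability p to odds = p / (1 - p)."""
    return p / (1 - p)

def odds_to_prob(odds: float) -> float:
    """Convert odds back to a probability p = odds / (odds + 1)."""
    return odds / (odds + 1)

# Round-trip check with the trial's estimated relief probability
print(round(prob_to_odds(0.63), 2))              # ~1.70
print(odds_to_prob(prob_to_odds(0.63)))          # 0.63 again
```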

  19. Odds Ratio
     - We saw that an odds ratio (OR) can be helpful for comparisons. Recall the Vitamin A trial:
       OR = odds(Death | Vit. A) / odds(Death | No Vit. A)

  20. Odds Ratio
     - The OR here describes the benefits of Vitamin A therapy. We saw for this example that:
     - OR = 0.59
     - An estimated 40% reduction in mortality
     - The OR is a building block for logistic regression

  21. Logistic Regression
     - Suppose we want to ask whether the new drug is better than a placebo, and have the following observed data:

       Relief?   Drug   Placebo
       No          13        20
       Yes         22        15
       Total       35        35

  22. Confidence Intervals for p
     - [Figure: 95% confidence intervals for p in the Placebo and Drug groups, plotted on the probability scale from 0 to 1]

  23. Odds Ratio
     - OR = odds(Relief | Drug) / odds(Relief | Placebo)
          = {P(Relief | Drug) / [1 - P(Relief | Drug)]} / {P(Relief | Placebo) / [1 - P(Relief | Placebo)]}
          = [0.63 / (1 - 0.63)] / [0.43 / (1 - 0.43)]
          = 2.26
     - (Here 0.63 = 22/35 and 0.43 = 15/35 are the estimated relief proportions in the two groups.)
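A minimal Python sketch that recomputes the odds ratio straight from the 2x2 table on slide 21; nothing beyond the standard interpreter is assumed.

```python
# 2x2 table from slide 21: relief counts for the Drug and Placebo groups
relief_drug, no_relief_drug = 22, 13
relief_placebo, no_relief_placebo = 15, 20

odds_drug = relief_drug / no_relief_drug            # 22/13 ~ 1.69
odds_placebo = relief_placebo / no_relief_placebo   # 15/20 = 0.75

odds_ratio = odds_drug / odds_placebo
print(round(odds_ratio, 2))                         # 2.26
```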

  24. Confidence Interval for OR
     - The CI uses Woolf’s method for the standard error of log(ÔR):
       se( log(ÔR) ) = sqrt(1/22 + 1/13 + 1/15 + 1/20) = 0.489
     - Then find (L, U) = log(ÔR) ± 1.96 se( log(ÔR) ) and take (e^L, e^U)
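A minimal Python sketch of Woolf's interval, assuming numpy; it reproduces the 0.489 standard error and, after rounding, the (0.86, 5.9) interval quoted on the next slide.

```python
import numpy as np

a, b, c, d = 22, 13, 15, 20             # relief / no-relief counts for Drug, then Placebo
log_or = np.log((a / b) / (c / d))      # log odds ratio, ~0.814

se = np.sqrt(1/a + 1/b + 1/c + 1/d)     # Woolf's standard error, ~0.489
lower = np.exp(log_or - 1.96 * se)
upper = np.exp(log_or + 1.96 * se)

print(round(se, 3), (round(lower, 3), round(upper, 3)))   # 0.489 (0.865, 5.883)
```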

  25. Interpretation
     - OR = 2.26
     - 95% CI: (0.86, 5.9)
     - The Drug is an estimated 2 ¼ times better than the placebo.
     - But could the difference be due to chance alone?

  26. Logistic Regression
     - Can we set up a model for this similar to what we’ve done in ANOVA and regression?
     - Idea: model the log odds of the event (in this example, relief) as a function of predictor variables

  27. Model
     - log[ odds(Relief | Tx) ] = log[ P(relief | Tx) / P(no relief | Tx) ] = β0 + β1 Tx
     - where: Tx = 0 if Placebo, 1 if Drug

  28. Then…
     - log( odds(Relief | Drug) ) = β0 + β1
     - log( odds(Relief | Placebo) ) = β0
     - log( odds(R | D) ) – log( odds(R | P) ) = β1

  29. And…
     - Thus: log[ odds(R | D) / odds(R | P) ] = β1
     - And: OR = exp(β1) = e^β1 !!
     - So: exp(β1) = odds ratio of relief for patients taking the Drug vs. patients taking the Placebo

  30. Logistic Regression

     Logit estimates                              Number of obs = 70
                                                  LR chi2(1)    = 2.83
                                                  Prob > chi2   = 0.0926
     Log likelihood = -46.99169                   Pseudo R2     = 0.0292
     ------------------------------------------------------------------------------
                y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
     -------------+----------------------------------------------------------------
             drug |   .8137752   .4889211     1.66   0.096    -.1444926    1.772043
            _cons |  -.2876821    .341565    -0.84   0.400    -.9571372    .3817731
     ------------------------------------------------------------------------------

     Estimates: log( odds(relief) ) = β̂0 + β̂1(Drug) = -0.288 + 0.814(Drug)
     Therefore: OR = exp(0.814) = 2.26 !
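The output above is from Stata; as a cross-check, here is a minimal Python sketch (assuming numpy and statsmodels are available) that rebuilds the 70 individual observations from the table on slide 21 and fits the same logistic regression.

```python
import numpy as np
import statsmodels.api as sm

# Rebuild individual-level data from slide 21's 2x2 table:
# drug group: 22 relief, 13 no relief; placebo group: 15 relief, 20 no relief
y = np.array([1] * 22 + [0] * 13 + [1] * 15 + [0] * 20)
drug = np.array([1] * 35 + [0] * 35)

X = sm.add_constant(drug)              # columns: intercept, drug indicator
fit = sm.Logit(y, X).fit(disp=0)

print(fit.params)                      # ~[-0.288, 0.814], matching the Stata output
print(np.exp(fit.params[1]))           # OR ~2.26
```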

  31. It’s the same!
     - So, why go to all the trouble of setting up a linear model?
     - What if there is a biologic reason to expect that the rate of relief (and perhaps drug efficacy) is age dependent?

  32. Adding other variables
     - What if Pr(relief) = function of Drug or Placebo AND Age?
     - We could easily include age in a model such as:
       log( odds(relief) ) = β0 + β1 Drug + β2 Age

  33. Logistic Regression
     - As in MLR, we can include many additional covariates.
     - For a logistic regression model with p predictors:
       log( odds(Y = 1) ) = β0 + β1 X1 + ... + βp Xp
     - where: odds(Y = 1) = Pr(Y = 1) / [1 - Pr(Y = 1)] = Pr(Y = 1) / Pr(Y = 0)

  34. Logistic Regression
     - Thus: log[ Pr(Y = 1) / Pr(Y = 0) ] = β0 + β1 X1 + ... + βp Xp
     - But, why use log(odds)?

  35. Logistic regression
     - Linear regression might estimate anything in (-∞, +∞), not just a proportion in the range of 0 to 1.
     - Logistic regression is a way to estimate a proportion (between 0 and 1), as well as some related items

  36. Linear models for binary outcomes
     - We would like to use something like what we know from linear regression:
       Continuous outcome = β0 + β1 X1 + β2 X2 + …
     - How can we turn a proportion into a continuous outcome?

  37. Transforming a proportion
     - The odds are always positive: odds = p / (1 - p) is in [0, +∞)
     - The log odds is continuous: log odds = ln[ p / (1 - p) ] is in (-∞, +∞)

  38. Logit transformation

     Measure                                   Min    Max    Name
     Pr(Y = 1)                                  0      1     “probability”
     Pr(Y = 1) / [1 - Pr(Y = 1)]                0      ∞     “odds”
     log{ Pr(Y = 1) / [1 - Pr(Y = 1)] }        -∞      ∞     “log-odds” or “logit”

  39. Logit Function
     - Relates the log-odds (logit) to p = Pr(Y = 1)
     - [Figure: the logit function, with log-odds from -10 to 10 on the vertical axis plotted against the probability of success from 0 to 1]

  40. Key Relationships
     - Relating log-odds, probabilities, and parameters in logistic regression
     - Suppose the model: logit(p) = β0 + β1 X, i.e. log[ p / (1 - p) ] = β0 + β1 X
     - Take “anti-logs”: p / (1 - p) = exp(β0 + β1 X)

  41. Solve for p
     - p = (1 - p) · exp(β0 + β1 X)
     - p = exp(β0 + β1 X) - p · exp(β0 + β1 X)
     - p + p · exp(β0 + β1 X) = exp(β0 + β1 X)
     - p · [1 + exp(β0 + β1 X)] = exp(β0 + β1 X)
     - p = exp(β0 + β1 X) / [1 + exp(β0 + β1 X)]

  42. What’s the point?
     - We can determine the probability of success for a specific set of covariates, X, after running a logistic regression model.
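To make that concrete, here is a minimal Python sketch that plugs the fitted coefficients from slide 30 into the formula derived on slide 41; the predicted probabilities recover the observed relief proportions in each arm. The helper name is hypothetical.

```python
import numpy as np

def predicted_prob(x, b0=-0.288, b1=0.814):
    """p = exp(b0 + b1*x) / (1 + exp(b0 + b1*x)), using the fitted coefficients."""
    eta = b0 + b1 * x                  # the linear predictor (log-odds)
    return np.exp(eta) / (1 + np.exp(eta))

print(round(predicted_prob(1), 2))     # Drug:    ~0.63 (= 22/35)
print(round(predicted_prob(0), 2))     # Placebo: ~0.43 (= 15/35)
```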
