
Revisiting the Area under the ROC - Berry de Bruijn



  1. Revisiting the Area under the ROC. Berry de Bruijn, Institute for Information Technology, National Research Council, Canada

  2. Purpose • Take a look at the area under the ROC curve (AUC) from a different perspective… • give it an additional interpretation… • which might lead to options for extending the AUC. So, an old story with a new twist…

  3. Introduction: Tests and Classifiers. A 30-second tutorial.

  4. Introduction: Tests and Classifiers. Fresh or not fresh? The sniff test!

  5. Introduction: Classifiers. Sniff test: all subjects are sniffed and scored (0.722, 0.666, 0.305, 0.8, 0.879, 0.999, 0.801, 0.544, …), then rank ordered.

  6. Introduction: Classifiers. All subjects: rank order by score, then apply a threshold (here 0.6; scores such as 0.722, 0.801, 0.879 and 0.999 fall above it). Confusion matrix: eaten and fresh = TP; eaten and not fresh = FP; not eaten and fresh = FN; not eaten and not fresh = TN. Sensitivity = fresh shrimps eaten / all fresh shrimps = TP / (TP + FN); Specificity = non-fresh shrimps not eaten / all non-fresh shrimps = TN / (TN + FP).
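As a minimal sketch of the thresholding step on slide 6: the scores below are the ones shown on slide 5, but the fresh/not-fresh labels are hypothetical (the slides do not give them), and treating a score exactly at the threshold as "eaten" is also an assumption.

```python
# Scores from slide 5; the labels are HYPOTHETICAL, invented for illustration.
scores = [0.722, 0.666, 0.305, 0.8, 0.879, 0.999, 0.801, 0.544]
is_fresh = [True, False, False, True, True, True, False, True]


def confusion_counts(scores, labels, threshold):
    """Count TP, FP, FN, TN when every item scoring >= threshold is eaten."""
    tp = fp = fn = tn = 0
    for score, fresh in zip(scores, labels):
        eaten = score >= threshold  # assumption: ties go to "eaten"
        if eaten and fresh:
            tp += 1
        elif eaten and not fresh:
            fp += 1
        elif not eaten and fresh:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


tp, fp, fn, tn = confusion_counts(scores, is_fresh, threshold=0.6)
sensitivity = tp / (tp + fn)  # fresh shrimps eaten / all fresh shrimps
specificity = tn / (tn + fp)  # non-fresh shrimps not eaten / all non-fresh shrimps
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f}")
```

Moving the threshold up or down trades sensitivity against specificity, which is what the ROC curve traces out.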

  7. Introduction: Classifiers. All sensitivity/specificity pairs form the ROC curve.

  8. Introduction: Classifiers. All sensitivity/specificity pairs form the ROC curve. AUC = 0.9332 → one metric about the performance of the classifier or test.
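The step from sensitivity/specificity pairs to a single AUC value can be sketched as follows. This is an illustrative sketch, not the author's code: the labels are hypothetical, the scores are assumed to have no ties, and the curve is plotted in the usual way with 1 - specificity on the x-axis and sensitivity on the y-axis.

```python
# Scores from slide 5; the labels are HYPOTHETICAL, invented for illustration.
scores = [0.722, 0.666, 0.305, 0.8, 0.879, 0.999, 0.801, 0.544]
is_fresh = [True, False, False, True, True, True, False, True]


def roc_points(scores, labels):
    """Sweep the threshold down the ranked list; return (1 - spec, sens) points.

    Assumes untied scores and at least one positive and one negative.
    """
    pos = sum(1 for label in labels if label)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, label in ranked:
        if label:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points


def auc_trapezoid(points):
    """Area under a piecewise-linear curve given as ordered (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


auc = auc_trapezoid(roc_points(scores, is_fresh))
print(f"AUC = {auc:.4f}")
```

For untied scores this area equals the fraction of (positive, negative) pairs in which the positive item has the higher score, which is the usual probabilistic reading of the AUC.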

  9. The new part… Our classifier can be modeled as a stochastic process. • Model: sampling, without replacement, from a biased urn with marbles → the marbles do not have an equal chance of being drawn. • Distribution: Fisher's noncentral hypergeometric distribution. TP = f(k, Pos, Neg, bias).

  10. Statistical modeling. ‘cond.-vs.-poss.’ data. Observed: • 1054 cases: 171 positives, 883 negatives • AUC = 0.9332. Fisher noncentral hypergeometric distribution curve, TP = f(k, Pos, Neg, bias): • k = [0 .. 1054] • Pos = 171 • Neg = 883 • bias = 0.9332*
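A rough sketch of how a synthetic ROC curve could be generated from this urn model, in plain Python. Several things here are assumptions rather than the author's method: the curve is taken to plot the expected TP for each number of draws k, the bias is expressed as an odds ratio named `odds`, and the value 14.0 is purely illustrative, since the slides do not say how the starred bias value maps onto the distribution's parameter.

```python
import math


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def fisher_nchg_pmf(pos, neg, k, odds):
    """P(TP = x) after k draws under Fisher's noncentral hypergeometric law.

    Weights are C(pos, x) * C(neg, k - x) * odds**x, normalized over the support.
    """
    lo, hi = max(0, k - neg), min(k, pos)
    support = range(lo, hi + 1)
    log_w = [log_comb(pos, x) + log_comb(neg, k - x) + x * math.log(odds)
             for x in support]
    peak = max(log_w)  # subtract the peak to avoid overflow in exp()
    w = [math.exp(v - peak) for v in log_w]
    total = sum(w)
    return {x: wi / total for x, wi in zip(support, w)}


def expected_tp(pos, neg, k, odds):
    """Expected number of positives among the k top-ranked items."""
    return sum(x * p for x, p in fisher_nchg_pmf(pos, neg, k, odds).items())


def synthetic_auc(pos, neg, odds):
    """Trapezoidal area under the curve of expected (FP/neg, TP/pos) over k."""
    points = []
    for k in range(pos + neg + 1):
        tp = expected_tp(pos, neg, k, odds)
        points.append(((k - tp) / neg, tp / pos))
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Pos and Neg are from slide 10; odds=14.0 is a HYPOTHETICAL illustrative value.
print(f"synthetic AUC = {synthetic_auc(171, 883, odds=14.0):.4f}")
```

With `odds = 1` the urn is unbiased, the distribution reduces to the ordinary hypergeometric, and the synthetic curve is the diagonal with area 0.5; larger odds push the curve toward the top-left corner.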

  11. Statistical modeling. See the paper for actual and synthesized ROCs from other data sets.

  12. Conclusions • AUC + noncentral hypergeometric distribution = new(?) interpretation of AUC, with stronger theoretical support. • The additional statistical properties can be useful for comparing classifiers on the same data set. • Opens the door to extensions for multi-class classification and non-uniform populations. Tusen takk - Thank you

  13. Bonus features… ‘binormal’ approximation
