AUC: a Better Measure than Accuracy in Comparing Learning Algorithms

Authors: Charles X. Ling, Department of Computer Science, University of Western Ontario, Canada; Jin Huang, Department of Computer Science, University of Western Ontario, Canada; Harry Zhang, Faculty of Computer Science, University of New Brunswick, Canada

Presented by: William Elazmeh, Ottawa-Carleton Institute for Computer Science, Canada
Introduction
• The focus is the visualization of classifier performance
• Traditionally, performance = predictive accuracy
• Accuracy ignores the probability estimates behind a classification in favor of the class labels alone
• ROC curves show the trade-off between the false positive rate and the true positive rate
• The AUC of the ROC curve is a better measure than accuracy
• AUC serves as a criterion for comparing learning algorithms
• AUC replaces accuracy when comparing classifiers
• Experimental results show that AUC reveals a difference in performance between decision trees and Naive Bayes (Naive Bayes is significantly better)
Confusion Matrix and Evaluation Measures

                 Actual +   Actual -
  Predicted Y       T+         F+
  Predicted N       F-         T-

F+ Rate = F+ / (F+ + T-)
T+ Rate (Recall) = T+ / (T+ + F-)
Accuracy = (T+ + T-) / (|+| + |-|)
Precision = T+ / (T+ + F+)
F-Score = 2 × Precision × Recall / (Precision + Recall)
Error Rate = 1 - Accuracy
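To make the table concrete, here is a minimal Python sketch of these measures (illustrative code, not from the paper; the counts passed in are hypothetical):

def confusion_measures(tp, fp, fn, tn):
    """Evaluation measures derived from confusion-matrix counts."""
    return {
        "F+ rate": fp / (fp + tn),
        "Recall (T+ rate)": tp / (tp + fn),
        "Accuracy": (tp + tn) / (tp + fp + fn + tn),
        "Precision": tp / (tp + fp),
        "F-score": 2 * tp / (2 * tp + fp + fn),  # equals 2PR / (P + R)
        "Error rate": (fp + fn) / (tp + fp + fn + tn),
    }

# Hypothetical counts: 8 true positives, 2 false positives, etc.
print(confusion_measures(tp=8, fp=2, fn=2, tn=8))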
ROC Space

[Figure: the ROC space, plotting the true positive rate (y-axis) against the false positive rate (x-axis) for classifiers A-F. The diagonal holds the trivial classifiers, with the all-negative classifier at (0, 0) and the all-positive classifier at (1, 1).]
ROC Curves

[Figure: the ROC curve obtained from the 20 ranked examples below, plotting the true positive rate against the false positive rate as the score threshold is lowered.]

 #   Class   Score      #   Class   Score
 1     +     0.9       11     +     0.4
 2     +     0.8       12     -     0.39
 3     -     0.7       13     +     0.38
 4     +     0.6       14     -     0.37
 5     +     0.55      15     -     0.36
 6     +     0.54      16     -     0.35
 7     -     0.53      17     +     0.34
 8     -     0.52      18     -     0.33
 9     +     0.51      19     +     0.30
10     -     0.505     20     -     0.1
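The curve itself is easy to reconstruct from the table: walk down the ranked list, treat each score as a threshold, and record one (FP rate, TP rate) point per example. A minimal sketch (illustrative code, not from the paper):

# (class, score) pairs from the table, already sorted by descending score
data = [("+", 0.9), ("+", 0.8), ("-", 0.7), ("+", 0.6), ("+", 0.55),
        ("+", 0.54), ("-", 0.53), ("-", 0.52), ("+", 0.51), ("-", 0.505),
        ("+", 0.4), ("-", 0.39), ("+", 0.38), ("-", 0.37), ("-", 0.36),
        ("-", 0.35), ("+", 0.34), ("-", 0.33), ("+", 0.30), ("-", 0.1)]

pos = sum(1 for label, _ in data if label == "+")  # 10 positives
neg = len(data) - pos                              # 10 negatives

points = [(0.0, 0.0)]  # origin: threshold above every score
tp = fp = 0
for label, _ in data:
    if label == "+":
        tp += 1
    else:
        fp += 1
    points.append((fp / neg, tp / pos))  # one ROC point per threshold

print(points)  # traces the curve from (0, 0) to (1, 1)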
ROC Curves

[Figure: an ROC curve in the unit square, true positive rate vs. false positive rate.]
Comparing Classifier Performance with ROC

[Figure: ROC curves compared in the unit square, true positive rate vs. false positive rate.]
Choosing Between Classifiers with ROC

[Figure: ROC curves of candidate classifiers in the unit square, true positive rate vs. false positive rate.]
Area Under the Curve (AUC)

AUC = (Σ Rank(+) - |+| × (|+| + 1) / 2) / (|+| × |-|)

where:
• Σ Rank(+) is the sum of the ranks of all positive examples in the ranked list
• |+| is the number of positive examples in the dataset
• |-| is the number of negative examples in the dataset

Class Label   Rank   C1   C2   C3
     +         10     +    -    +
     +          9     +    +    +
     +          8     +    +    +
     +          7     +    +    -
     +          6     -    +    -
     -          5     +    -    +
     -          4     -    -    +
     -          3     -    -    -
     -          2     -    -    -
     -          1     -    +    -

Classifier   AUC                                           Error Rate
C1           ((5+7+8+9+10) - (5 × 6)/2) / (5 × 5) = 24/25      20%
C2           ((1+6+7+8+9) - (5 × 6)/2) / (5 × 5) = 16/25       20%
C3           ((4+5+8+9+10) - (5 × 6)/2) / (5 × 5) = 21/25      40%
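In code, the rank formula is a few lines. A minimal sketch (illustrative, not from the paper; rank_auc is a hypothetical name):

def rank_auc(ranked_labels):
    """AUC via the rank formula above.

    ranked_labels is ordered from rank 1 (lowest score) to rank n.
    """
    n_pos = ranked_labels.count("+")
    n_neg = ranked_labels.count("-")
    rank_sum = sum(rank for rank, label in enumerate(ranked_labels, start=1)
                   if label == "+")
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Classifier C1 from the table, listed from rank 1 up to rank 10
c1 = ["-", "-", "-", "-", "+", "-", "+", "+", "+", "+"]
print(rank_auc(c1))  # 0.96 = 24/25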
Comparing Evaluation Measures for Learning Algorithms
• Let Ψ represent the domain, and let f and g be two evaluation measures used to compare learning algorithms A and B
• Consistency: f and g are strictly consistent if there exist no a, b ∈ Ψ such that f(a) > f(b) and g(a) < g(b)
• Discriminancy: f is strictly more discriminating than g if there exist a, b ∈ Ψ such that f(a) > f(b) and g(a) = g(b), and there exist no a, b ∈ Ψ such that g(a) > g(b) and f(a) = f(b)
Consistency and Discriminancy

[Figure: the measures f and g applied over the domain Ψ. X marks a counterexample to consistency (f and g order a pair oppositely); Y marks a counterexample to discriminancy (one measure separates a pair that the other ties).]
Statistical Consistency and Discriminancy of Two Measures
• Let Ψ represent the domain, and let f and g be two evaluation measures used to compare learning algorithms A and B
• Degree of Consistency: let R = {(a, b) | a, b ∈ Ψ, f(a) > f(b), g(a) > g(b)} and S = {(a, b) | a, b ∈ Ψ, f(a) > f(b), g(a) < g(b)}. The degree of consistency of f and g is C = |R| / (|R| + |S|), where 0 ≤ C ≤ 1
• Degree of Discriminancy: let P = {(a, b) | a, b ∈ Ψ, f(a) > f(b), g(a) = g(b)} and Q = {(a, b) | a, b ∈ Ψ, g(a) > g(b), f(a) = f(b)}. The degree of discriminancy of f over g is D = |P| / |Q|
• The measure f is statistically consistent with and more discriminating than g if and only if C > 0.5 and D > 1. Intuitively, f is then a better measure than g
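Both degrees can be computed by enumerating ordered pairs. A minimal sketch (illustrative, not from the paper), applied to the three classifiers C1, C2, C3 from the AUC slide:

from itertools import combinations

def consistency_discriminancy(f_vals, g_vals):
    """Degree of consistency C and degree of discriminancy D of measure f
    versus measure g, given both measures' values on the same objects."""
    r = s = p = q = 0
    for i, j in combinations(range(len(f_vals)), 2):
        for a, b in ((i, j), (j, i)):  # consider both orderings of the pair
            df = f_vals[a] - f_vals[b]
            dg = g_vals[a] - g_vals[b]
            if df > 0 and dg > 0:
                r += 1  # the measures agree
            elif df > 0 and dg < 0:
                s += 1  # the measures disagree
            elif df > 0 and dg == 0:
                p += 1  # f discriminates, g ties
            elif df == 0 and dg > 0:
                q += 1  # g discriminates, f ties
    c = r / (r + s) if (r + s) else float("nan")
    d = p / q if q else float("inf")
    return c, d

# AUC and accuracy of C1, C2, C3: AUC separates C1 and C2 where accuracy
# ties them (one pair in P), and the measures disagree on C2 vs. C3
# (one pair in S), so this toy domain gives C = 0.5 and D = infinity.
print(consistency_discriminancy([24/25, 16/25, 21/25], [0.8, 0.8, 0.6]))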
For AUC and Accuracy, Formally
• In the domain Ψ, let R = {(a, b) | a, b ∈ Ψ, AUC(a) > AUC(b), acc(a) > acc(b)} and S = {(a, b) | a, b ∈ Ψ, AUC(a) < AUC(b), acc(a) > acc(b)}. Then |R| / (|R| + |S|) > 0.5, i.e., |R| > |S|
• In the domain Ψ, let P = {(a, b) | a, b ∈ Ψ, AUC(a) > AUC(b), acc(a) = acc(b)} and Q = {(a, b) | a, b ∈ Ψ, acc(a) > acc(b), AUC(a) = AUC(b)}. Then |P| > |Q|
• Experimental results verify the above formal results for balanced and unbalanced datasets
• Experimental results show that the Naive Bayes classifier is significantly better than decision trees (in AUC)
AUC and Accuracy: Experimental Results (balanced datasets)

Statistical Consistency
 #    AUC(a) > AUC(b)      AUC(a) > AUC(b)        C
      & acc(a) > acc(b)    & acc(a) < acc(b)
 4             9                     0           1.0
 6           113                     1           0.991
 8          1459                    34           0.977
10         19742                   766           0.963
12        273600                 13997           0.951
14       3864673                237303           0.942
16      55370122               3868959           0.935

Statistical Discriminancy
 #    AUC(a) > AUC(b)      acc(a) > acc(b)        D
      & acc(a) = acc(b)    & AUC(a) = AUC(b)
 4             5                     0           NA
 6            62                     4           15.5
 8           762                    52           14.7
10          9416                   618           15.2
12        120374                  7369           16.3
14       1578566                 89828           17.6
16      21161143               1121120           18.9

(# denotes the number of examples in the artificial dataset.)
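The first rows of these tables can be reproduced by brute force. A sketch under stated assumptions (the slides do not give the exact setup; assumed here: a balanced dataset of # examples, every possible ranking enumerated, and accuracy obtained by classifying the top half of each ranking as positive):

from itertools import combinations
from fractions import Fraction

def enumerate_measures(n):
    """All (AUC, accuracy) pairs over every possible ranking of a
    balanced dataset with n examples (n/2 positive, n/2 negative)."""
    half = n // 2
    results = []
    # A ranking is fixed by choosing the rank positions of the positives
    for pos_ranks in combinations(range(1, n + 1), half):
        auc = Fraction(sum(pos_ranks) - half * (half + 1) // 2, half * half)
        # Classify the top half as positive; balance forces TN = TP
        tp = sum(1 for rank in pos_ranks if rank > half)
        acc = Fraction(2 * tp, n)
        results.append((auc, acc))
    return results

def count_pairs(results):
    """Count the pairs defining consistency (R, S) and discriminancy (P, Q)."""
    r = s = p = q = 0
    for a, b in combinations(results, 2):
        for (auc1, acc1), (auc2, acc2) in ((a, b), (b, a)):
            if auc1 > auc2 and acc1 > acc2:
                r += 1
            elif auc1 > auc2 and acc1 < acc2:
                s += 1
            elif auc1 > auc2 and acc1 == acc2:
                p += 1
            elif acc1 > acc2 and auc1 == auc2:
                q += 1
    return r, s, p, q

print(count_pairs(enumerate_measures(4)))  # (9, 0, 5, 0): matches the n = 4 rows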
AUC and Accuracy: Experimental Results (unbalanced datasets)

Statistical Consistency
 #    AUC(a) > AUC(b)      AUC(a) > AUC(b)        C
      & acc(a) > acc(b)    & acc(a) < acc(b)
 4             3                     0           1.0
 8           187                    10           0.949
12         12716                  1225           0.912
16        926884                114074           0.890

Statistical Discriminancy
 #    AUC(a) > AUC(b)      acc(a) > acc(b)        D
      & acc(a) = acc(b)    & AUC(a) = AUC(b)
 4             3                     0           NA
 8           159                    10           15.9
12          8986                   489           18.4
16        559751                 25969           21.6
Conclusions
• AUC is a better measure than accuracy, based on formal definitions of discriminancy and consistency
• This conclusion allows the re-evaluation of claims established in machine learning using accuracy; for example, the Naive Bayes classifier predicts significantly better than decision trees, contrary to the well-established conclusion, based on the accuracy measure, that the two are equivalent
• The paper recommends AUC over accuracy as a "single number" measure for evaluating and comparing classifiers