performance metrics for graph mining tasks

Performance Metrics for Graph Mining Tasks - PowerPoint PPT Presentation

  1. Performance Metrics for Graph Mining Tasks

  2. Outline
     • Introduction to Performance Metrics
     • Supervised Learning Performance Metrics
     • Unsupervised Learning Performance Metrics
     • Optimizing Metrics
     • Statistical Significance Techniques
     • Model Comparison

  4. Introduction to Performance Metrics
     A performance metric measures how well a data mining algorithm performs on a given dataset. For example, if we apply a classification algorithm to a dataset, we first check how many of the data points were classified correctly. This is a performance metric, and its formal name is "accuracy."
     Performance metrics also help us decide whether one algorithm is better or worse than another. For example, if classification algorithm A classifies 80% of the data points correctly and classification algorithm B classifies 90% correctly, we immediately see that B is doing better on this dataset. There are some intricacies to such comparisons that we will discuss in this chapter.

  6. Supervised Learning Performance Metrics
     Metrics that are applied when the ground truth is known (e.g., classification tasks).
     Outline:
     • 2X2 Confusion Matrix
     • Multi-level Confusion Matrix
     • Visual Metrics
     • Cross-validation

  7. 2X2 Confusion Matrix
     A 2X2 matrix is used to tabulate the results of a 2-class supervised learning problem. Entry (i,j) is the number of elements with actual class label i that were predicted to have class label j. "+" and "-" are the two class labels.
                        Predicted +             Predicted -             Row sum
     Actual +           f++ (True Positive)     f+- (False Negative)    C = f++ + f+-
     Actual -           f-+ (False Positive)    f-- (True Negative)     D = f-+ + f--
     Column sum         A = f++ + f-+           B = f+- + f--           T = f++ + f-+ + f+- + f--

  8. 2X2 Confusion Metrics Example
     Results from a classification algorithm:
     Vertex ID    Actual Class    Predicted Class
     1            +               +
     2            +               +
     3            +               +
     4            +               +
     5            +               -
     6            -               +
     7            -               +
     8            -               -
     Corresponding 2X2 matrix for the given table:
                  Predicted +    Predicted -
     Actual +     4              1              C = 5
     Actual -     2              1              D = 3
                  A = 6          B = 2          T = 8
     • True Positive = 4 (vertices 1-4)
     • False Negative = 1 (vertex 5: actual "+", predicted "-")
     • False Positive = 2 (vertices 6 and 7: actual "-", predicted "+")
     • True Negative = 1 (vertex 8)
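
The tabulation on this slide can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `confusion_2x2` and the list layout are illustrative assumptions, not part of the original deck, and the label lists are copied from the vertex table.

```python
from collections import Counter

# Actual and predicted labels for vertices 1..8, copied from the slide's table.
actual    = ["+", "+", "+", "+", "+", "-", "-", "-"]
predicted = ["+", "+", "+", "+", "-", "+", "+", "-"]

def confusion_2x2(actual, predicted, positive="+", negative="-"):
    """Return [[f++, f+-], [f-+, f--]]; rows are actual, columns are predicted."""
    counts = Counter(zip(actual, predicted))
    return [
        [counts[(positive, positive)], counts[(positive, negative)]],
        [counts[(negative, positive)], counts[(negative, negative)]],
    ]

matrix = confusion_2x2(actual, predicted)
print(matrix)  # [[4, 1], [2, 1]]
```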

  9. 2X2 Confusion Metrics Performance Metrics
     Walk through the different metrics using the preceding example:
     1. Accuracy is the proportion of correct predictions
     2. Error rate is the proportion of incorrect predictions
     3. Recall is the proportion of "+" data points predicted as "+"
     4. Precision is the proportion of data points predicted as "+" that are truly "+"
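
As a hedged illustration, the four definitions can be written directly against a matrix laid out as [[f++, f+-], [f-+, f--]]. The function name `two_class_metrics` is an assumption made for this sketch, and it does not guard against empty rows or columns (which would divide by zero).

```python
def two_class_metrics(matrix):
    """Compute the four metrics from [[f++, f+-], [f-+, f--]]."""
    (tp, fn), (fp, tn) = matrix
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,    # correct predictions / all predictions
        "error_rate": (fn + fp) / total,  # incorrect predictions / all predictions
        "recall": tp / (tp + fn),         # "+" points that were predicted "+"
        "precision": tp / (tp + fp),      # predicted "+" points that are truly "+"
    }

# Example matrix from the 8-vertex classification example.
metrics = two_class_metrics([[4, 1], [2, 1]])
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For the example matrix this gives accuracy 5/8, error rate 3/8, recall 4/5, and precision 4/6.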

  10. Multi-level Confusion Matrix
      An nXn matrix, where n is the number of classes and entry (i,j) represents the number of elements with class label i that were predicted to have class label j.

  11. Multi-level Confusion Matrix Example
                                        Predicted Class
                                        Class 1    Class 2    Class 3    Marginal Sum of Actual Values
      Actual Class   Class 1            2          1          1          4
                     Class 2            1          2          1          4
                     Class 3            1          2          3          6
      Marginal Sum of Predictions       4          5          5          T = 14

  12. Multi-level Confusion Matrix Conversion to 2X2
      Each cell of the multi-level matrix maps onto one entry of the 2X2 matrix for Class 1:
                                 Predicted Class
                                 Class 1      Class 2      Class 3
      Actual Class   Class 1     2 (f++)      1 (f+-)      1 (f+-)
                     Class 2     1 (f-+)      2 (f--)      1 (f--)
                     Class 3     1 (f-+)      2 (f--)      3 (f--)
      2X2 matrix specific to Class 1:
                                         Predicted Class
                                         Class 1 (+)    Not Class 1 (-)
      Actual Class   Class 1 (+)         2              2                  C = 4
                     Not Class 1 (-)     2              8                  D = 10
                                         A = 4          B = 10             T = 14
      We can now apply all the 2X2 metrics:
      • Accuracy = (2 + 8)/14 = 10/14
      • Error = (2 + 2)/14 = 4/14
      • Recall = 2/4
      • Precision = 2/4
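
The conversion can be sketched in Python as a one-vs-rest collapse. The helper name `one_vs_rest` and the zero-based class index are assumptions made for this illustration; the input is the 3X3 example matrix from the slides.

```python
def one_vs_rest(matrix, k):
    """Collapse an n x n confusion matrix into [[f++, f+-], [f-+, f--]] for class k."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]                               # class k predicted as class k
    fn = sum(matrix[k]) - tp                        # class k predicted as something else
    fp = sum(matrix[i][k] for i in range(n)) - tp   # other classes predicted as class k
    tn = total - tp - fn - fp                       # everything that does not involve class k
    return [[tp, fn], [fp, tn]]

multi = [[2, 1, 1],
         [1, 2, 1],
         [1, 2, 3]]

class1 = one_vs_rest(multi, 0)
print(class1)  # [[2, 2], [2, 8]]
```

Calling the same helper with index 1 or 2 produces the 2X2 matrices for Class 2 and Class 3.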

  13. Multi-level Confusion Matrix Performance Metrics
                                 Predicted Class
                                 Class 1    Class 2    Class 3
      Actual Class   Class 1     2          1          1
                     Class 2     1          2          1
                     Class 3     1          2          3
      1. Critical Success Index (or Threat Score): for each class L, the ratio of correct predictions for class L to the sum of vertices that belong to L and those predicted as L.
      2. Bias: for each class L, the ratio of the total number of points with class label L to the number of points predicted as L. Bias helps us understand whether a model is over- or under-predicting a class.
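
A minimal sketch of both metrics follows. It assumes the usual threat-score convention in which correctly classified vertices are counted once in the denominator (actual + predicted - correct); the slide text does not spell this out, so treat it as an assumption. Bias follows the slide's wording (actual count divided by predicted count). The function name `csi_and_bias` is illustrative.

```python
def csi_and_bias(matrix):
    """Return a list of (csi, bias) tuples, one per class, for an n x n confusion matrix."""
    n = len(matrix)
    results = []
    for k in range(n):
        actual_k = sum(matrix[k])                          # points whose true label is class k
        predicted_k = sum(matrix[i][k] for i in range(n))  # points predicted as class k
        correct_k = matrix[k][k]
        # Assumed convention: correct predictions are counted only once in the denominator.
        csi = correct_k / (actual_k + predicted_k - correct_k)
        # As worded on the slide: actual count over predicted count.
        bias = actual_k / predicted_k
        results.append((csi, bias))
    return results

scores = csi_and_bias([[2, 1, 1],
                       [1, 2, 1],
                       [1, 2, 3]])
for label, (csi, bias) in enumerate(scores, start=1):
    print(f"Class {label}: CSI = {csi:.3f}, Bias = {bias:.3f}")
```

With this definition of bias, a value below 1 means the class is predicted more often than it occurs (Class 2 here), and a value above 1 means it is predicted less often than it occurs (Class 3).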

  14. Confusion Metrics R-code
      library(PerformanceMetrics)
      # Load and print the example 2X2 confusion matrix
      data(M)
      M
      #      [,1] [,2]
      # [1,]    4    1
      # [2,]    2    1
      twoCrossConfusionMatrixMetrics(M)
      # Load and print the example multi-level confusion matrix
      data(MultiLevelM)
      MultiLevelM
      #      [,1] [,2] [,3]
      # [1,]    2    1    1
      # [2,]    1    2    1
      # [3,]    1    2    3
      multilevelConfusionMatrixMetrics(MultiLevelM)

  15. Visual Metrics
      Metrics that are plotted on a graph to obtain a visual picture of the performance of two-class classifiers.
      [Figure: ROC plot. x-axis: False Positive Rate (0 to 1); y-axis: True Positive Rate (0 to 1).]
      • (0,1): ideal
      • (1,1): predicts the +ve class all the time
      • (0,0): predicts the -ve class all the time
      • Diagonal from (0,0) to (1,1): AUC = 0.5
      Plot the performance of multiple models to decide which one performs best.
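
To place a classifier on such a plot, its confusion matrix has to be turned into a (False Positive Rate, True Positive Rate) pair. The sketch below uses the standard definitions TPR = f++ / (f++ + f+-) and FPR = f-+ / (f-+ + f--); the function name `roc_point` is an assumption for illustration.

```python
def roc_point(matrix):
    """Map [[f++, f+-], [f-+, f--]] to an (FPR, TPR) point on the ROC plot."""
    (tp, fn), (fp, tn) = matrix
    tpr = tp / (tp + fn)  # share of actual "+" points predicted "+"
    fpr = fp / (fp + tn)  # share of actual "-" points predicted "+"
    return fpr, tpr

# The 8-vertex example from the confusion matrix slides.
fpr, tpr = roc_point([[4, 1], [2, 1]])
print(f"FPR = {fpr:.3f}, TPR = {tpr:.3f}")

# The two trivial classifiers named on the slide, on a dataset with 5 "+" and 3 "-" points.
always_positive = roc_point([[5, 0], [3, 0]])
always_negative = roc_point([[0, 5], [0, 3]])
```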

  16. Understanding Model Performance based on ROC Plot
      [Figure: ROC plot divided into regions. x-axis: False Positive Rate (0 to 1); y-axis: True Positive Rate (0 to 1); diagonal: AUC = 0.5.]
      • Upper left: models that lie here have good performance. Note: this is where you aim to get the model.
      • Upper right: models that lie here are liberal.
        1. Will predict "+" with little evidence
        2. High false positives
      • Lower left: models that lie here are conservative.
        1. Will not predict "+" unless there is strong evidence
        2. Low false positives but high false negatives
      • Below the diagonal (lower right): models that lie in this area perform worse than random. Note: models here can be negated to move them to the upper left.
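
Negating a model means swapping its "+" and "-" predictions. A short sketch of the effect on its ROC coordinates, assuming points are stored as (FPR, TPR) pairs and using the hypothetical helper name `negate`: every true positive becomes a false negative and every false positive becomes a true negative, so the point is reflected through (0.5, 0.5) and a model below the diagonal ends up above it.

```python
def negate(point):
    """ROC coordinates of the model that predicts the opposite label every time."""
    fpr, tpr = point
    return 1.0 - fpr, 1.0 - tpr

# A hypothetical worse-than-random model: many false positives, few true positives.
bad_model = (0.8, 0.3)
flipped = negate(bad_model)
print(flipped)

# A model is better than random when it lies above the diagonal, i.e. TPR > FPR.
print(bad_model[1] > bad_model[0], flipped[1] > flipped[0])
```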

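  17. ROC Plot Example
      [Figure: ROC plot. x-axis: False Positive Rate (0 to 1); y-axis: True Positive Rate (0 to 1).]
      • M1 at (0.1, 0.8)
      • M2 at (0.5, 0.5)
      • M3 at (0.3, 0.5)
      M1's performance lies furthest in the upper-left direction (lowest false positive rate, highest true positive rate), and hence M1 is considered the best model.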
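
One simple way to make this comparison numeric is to rank the models by their distance to the ideal point (0, 1). This ranking rule is an assumption chosen for the sketch, not something the slide prescribes; the coordinates are read as (False Positive Rate, True Positive Rate).

```python
import math

# (FPR, TPR) coordinates of the three models from the slide.
models = {"M1": (0.1, 0.8), "M2": (0.5, 0.5), "M3": (0.3, 0.5)}

def distance_to_ideal(point):
    """Euclidean distance from an ROC point to the ideal classifier at (0, 1)."""
    fpr, tpr = point
    return math.hypot(fpr - 0.0, tpr - 1.0)

ranking = sorted(models, key=lambda name: distance_to_ideal(models[name]))
for name in ranking:
    print(name, round(distance_to_ideal(models[name]), 3))
```

M1 comes out closest to the ideal point, followed by M3 and then M2, which sits on the AUC = 0.5 diagonal.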

  18. Cross-validation
      Cross-validation, also called rotation estimation, is a way to analyze how a predictive data mining model will perform on an unknown dataset, i.e., how well the model generalizes.
      Strategy:
      1. Divide the dataset into two non-overlapping subsets
      2. One subset is called the "test" set and the other the "training" set
      3. Build the model using the "training" set
      4. Obtain predictions for the "test" set
      5. Use the "test" set predictions to calculate all the performance metrics
      Typically cross-validation is performed for multiple iterations, selecting a different non-overlapping test and training set each time.
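
The five steps can be sketched end to end on toy data. Everything here is illustrative: the dataset is made up, the "model" is a trivial majority-class predictor standing in for a real learner, and the helper name `split_train_test` is an assumption.

```python
import random
from collections import Counter

def split_train_test(points, test_fraction=1 / 3, seed=0):
    """Steps 1-2: shuffle and divide the data into non-overlapping training and test sets."""
    rng = random.Random(seed)
    shuffled = points[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Toy dataset of (vertex id, class label) pairs; purely hypothetical.
data = [(i, "+" if i % 3 else "-") for i in range(12)]
train, test = split_train_test(data)

# Step 3: "build" the model on the training set (here: remember the majority label).
majority_label = Counter(label for _, label in train).most_common(1)[0][0]

# Step 4: obtain predictions for the test set.
predictions = [majority_label for _ in test]

# Step 5: compute a performance metric from the test set predictions.
accuracy = sum(p == label for p, (_, label) in zip(predictions, test)) / len(test)
print(f"test accuracy: {accuracy:.2f}")
```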

  19. Types of Cross-validation
      • Hold-out: a random one third of the data is used as the test set and the remaining two thirds as the training set
      • k-fold: divide the data into k partitions; in each iteration use one partition as the test set and the remaining k-1 partitions for training
      • Leave-one-out: special case of k-fold where k = n, the number of data points, so each test set contains exactly one point
      Note: selection of data points is typically done in a stratified manner, i.e., the class distribution in the test set is similar to that in the training set.
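
A minimal sketch of k-fold partitioning over point indices is shown below. The generator name `k_fold_splits` is an assumption, and for brevity the sketch neither shuffles nor stratifies, both of which a real evaluation would normally do. Leave-one-out is obtained by making every fold a single point.

```python
def k_fold_splits(n_points, k):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    if not 2 <= k <= n_points:
        raise ValueError("k must be between 2 and the number of data points")
    indices = list(range(n_points))
    # Deal the indices into k folds round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

three_fold = list(k_fold_splits(6, 3))
for train, test in three_fold:
    print("train:", train, "test:", test)

# Leave-one-out: one fold per data point.
leave_one_out = list(k_fold_splits(5, 5))
```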

  21. Unsupervised Learning Performance Metrics
      Metrics that are applied when the ground truth is not always available (e.g., clustering tasks).
      Outline:
      • Evaluation Using Prior Knowledge
      • Evaluation Using Cluster Properties

  22. Evaluation Using Prior Knowledge
      One way to test the effectiveness of an unsupervised learning method is to take a dataset D with known class labels, strip the labels, and provide the unlabeled set as input to an unsupervised learning algorithm U. The resulting clusters are then compared with the prior knowledge (the known labels) to judge the performance of U.
      To evaluate performance:
      1. Contingency Table
      2. Ideal and Observed Matrices

  23. Contingency Table
                                 Cluster
                                 Same Cluster    Different Cluster
      Class   Same Class         u11             u10
              Different Class    u01             u00
      (A) To fill the table, initialize u11, u01, u10, and u00 to 0
      (B) Then, for each pair of points of the form (v,w):
          1. if v and w belong to the same class and the same cluster, increment u11
          2. if v and w belong to the same class but different clusters, increment u10
          3. if v and w belong to different classes but the same cluster, increment u01
          4. if v and w belong to different classes and different clusters, increment u00
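
Steps (A) and (B) translate almost line for line into Python. The sketch below assumes the class labels and cluster assignments are given as two parallel lists; the function name `pair_counts` and the five-point example are hypothetical.

```python
from itertools import combinations

def pair_counts(classes, clusters):
    """Fill the contingency table by visiting every unordered pair of points once."""
    counts = {"u11": 0, "u10": 0, "u01": 0, "u00": 0}  # step (A)
    for v, w in combinations(range(len(classes)), 2):  # step (B)
        same_class = classes[v] == classes[w]
        same_cluster = clusters[v] == clusters[w]
        # First digit: same class (1) or not (0); second digit: same cluster (1) or not (0).
        key = "u" + str(int(same_class)) + str(int(same_cluster))
        counts[key] += 1
    return counts

# Hypothetical example: five points, two classes, two clusters.
classes = ["a", "a", "a", "b", "b"]
clusters = [1, 1, 2, 2, 2]
table = pair_counts(classes, clusters)
print(table)  # {'u11': 2, 'u10': 2, 'u01': 2, 'u00': 4}
```

The four counts always sum to n(n-1)/2, the number of unordered pairs among n points.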

  24. Contingency Table Performance Metrics
      Example matrix:
                                 Cluster
                                 Same Cluster    Different Cluster
      Class   Same Class         9               4
              Different Class    3               12
      • Rand Statistic, also called the simple matching coefficient, is a measure in which placing a pair of points with the same class label in the same cluster and placing a pair of points with different class labels in different clusters are given equal importance, i.e., it accounts for both the specificity and the sensitivity of the clustering
      • Jaccard Coefficient can be used when placing a pair of points with the same class label in the same cluster is primarily important
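
Using the standard definitions, Rand = (u11 + u00) / (u11 + u10 + u01 + u00) and Jaccard = u11 / (u11 + u10 + u01), the example matrix can be scored as follows. The function names are illustrative assumptions.

```python
def rand_statistic(u11, u10, u01, u00):
    """Share of pairs on which the class labels and the clustering agree."""
    return (u11 + u00) / (u11 + u10 + u01 + u00)

def jaccard_coefficient(u11, u10, u01):
    """Like Rand, but ignores pairs that differ in both class and cluster (u00)."""
    return u11 / (u11 + u10 + u01)

# Values from the example matrix on the slide.
rand = rand_statistic(u11=9, u10=4, u01=3, u00=12)
jaccard = jaccard_coefficient(u11=9, u10=4, u01=3)
print(rand, jaccard)  # 0.75 0.5625
```

Because the Jaccard Coefficient leaves u00 out, it is not inflated by the many pairs that are trivially separated, which is why it suits cases where grouping same-class points together matters most.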
