Improving Bug Prediction Accuracy by Regularization and Hyperparameter Optimization
Haidar Osman, Mohammad Ghafari, Oscar Nierstrasz
[Diagram: bug prediction overview. Labels: Poisson Regression, Linear Regression, KNN, SVM; Class, Package; Wrappers, Filters; Change Metrics, Source Code Metrics, Organizational Metrics; Number of Bugs, Class (Buggy or Clean); Prediction Error, Confusion Matrix, Cost Effectiveness.]
[Diagram: Hyperparameters of KNN and SVM. Labels: Kernel, Complexity, # Neighbors, Search, Evaluation, Algorithm, Omega, Exponent, Gamma, Sigma.]
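[Figure: bar chart of the number of buggy and clean classes per project. Y-axis: Number of Classes (0 to 2000). Projects named: Mylyn, JDT Core, PDE UI. Bar labels as extracted: 1'620, 1'288, 798, 629, 195, 242, 209, 199, 129, 62.]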
[Figure: SVM prediction error (RMSE) per project, untuned vs. tuned (Tuned? No/Yes). Projects named: Mylyn, JDT Core, PDE UI. Axis and bar labels as extracted: 2.0, 1.5, 1.05, 1.0, 0.94, 0.8, 0.81, 0.65, 0.63, 0.53, 0.53, 0.5, 0.42, 0.41.]
[Figure: KNN (IBK) prediction error (RMSE) per project, untuned vs. tuned (Tuned? No/Yes). Projects: Eclipse JDT Core, Eclipse PDE UI, Equinox, Lucene, Mylyn. Axis and bar labels as extracted: 2.0, 1.5, 1.19, 1.03, 0.99, 1.0, 0.8, 0.79, 0.66, 0.62, 0.55, 0.52, 0.5, 0.43.]
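The tuned vs. untuned comparison for KNN can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data is synthetic, the only hyperparameter searched is the number of neighbors, the search is a plain grid search on a holdout split, and k = 1 stands in for the untuned baseline. The slides do not state which search procedure or tool the authors used.

```python
import math
import random

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def knn_predict(train_x, train_y, query, k):
    """Predict by averaging the targets of the k nearest training points."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], query))[:k]
    return sum(train_y[i] for i in nearest) / k

def evaluate_k(train_x, train_y, valid_x, valid_y, k):
    """RMSE of a k-nearest-neighbor regressor on the validation split."""
    predictions = [knn_predict(train_x, train_y, q, k) for q in valid_x]
    return rmse(valid_y, predictions)

def tune_k(train_x, train_y, valid_x, valid_y, candidates):
    """Grid search: score every candidate k and return the best one."""
    scores = {k: evaluate_k(train_x, train_y, valid_x, valid_y, k)
              for k in candidates}
    best_k = min(scores, key=scores.get)
    return best_k, scores

# Synthetic stand-in for (metrics, number of bugs); not the real dataset.
rng = random.Random(0)
xs = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(200)]
ys = [0.5 * a + rng.gauss(0, 1.0) for a, _ in xs]
train_x, valid_x = xs[:150], xs[150:]
train_y, valid_y = ys[:150], ys[150:]

default_k = 1  # assumed untuned baseline
best_k, scores = tune_k(train_x, train_y, valid_x, valid_y, range(1, 16))
print(f"untuned (k={default_k}) RMSE: {scores[default_k]:.3f}")
print(f"tuned   (k={best_k}) RMSE: {scores[best_k]:.3f}")
```

Because the same validation split is used both to pick k and to report the error, the tuned RMSE here is optimistic; a separate test split or nested cross-validation would be needed for a fair estimate.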
[Diagram: Filter Feature Selection. Labels: Filter, Train.]
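Filter feature selection scores each feature independently of the learning algorithm and keeps the best ones before training. The sketch below is one possible instance, assuming absolute Pearson correlation with the bug count as the scoring function. The metric names and values are hypothetical toy data, and the slides do not say which filter the authors used.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def filter_select(columns, target, top_n):
    """Rank features by |correlation with target| and keep the top_n names."""
    ranked = sorted(columns,
                    key=lambda name: abs(pearson(columns[name], target)),
                    reverse=True)
    return ranked[:top_n]

# Hypothetical per-class metrics and bug counts (illustrative only).
metrics = {
    "lines_of_code": [10, 50, 120, 200, 400, 800],
    "num_revisions": [1, 3, 2, 8, 12, 20],
    "name_length": [7, 7, 7, 7, 7, 7],   # constant, carries no signal
    "noise": [5, 1, 4, 2, 6, 3],
}
bugs = [0, 0, 1, 2, 3, 6]

selected = filter_select(metrics, bugs, top_n=2)
print("selected features:", selected)
```

The selected columns would then be passed to whatever model is trained next; the filter itself never consults that model.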
[Diagram: Wrapper Feature Selection. Labels: Train, subset.]
[Diagram: Embedded Feature Selection. Labels: Train; Lasso, Ridge, Elastic.]
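Embedded feature selection happens inside training itself: with an L1 penalty (Lasso), the coefficients of uninformative features are driven to exactly zero. The following is a minimal sketch of that effect, assuming a coordinate-descent solver, centered synthetic data with no intercept, and the objective (1/(2n))·||y − Xw||² + alpha·||w||₁. The solver, the data, and the value of alpha are illustrative choices, not taken from the slides.

```python
import random

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; clamp to 0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_coordinate_descent(rows, targets, alpha, iterations=200):
    """Minimize (1/(2n))*||y - Xw||^2 + alpha*||w||_1, one weight at a time."""
    n = len(rows)
    d = len(rows[0])
    weights = [0.0] * d
    for _ in range(iterations):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for row, y in zip(rows, targets):
                without_j = sum(weights[k] * row[k] for k in range(d) if k != j)
                rho += row[j] * (y - without_j)
                z += row[j] ** 2
            rho /= n
            z /= n
            weights[j] = soft_threshold(rho, alpha) / z if z > 0 else 0.0
    return weights

# Synthetic data: only feature 0 influences the target.
rng = random.Random(1)
rows = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(100)]
targets = [3.0 * r[0] + rng.gauss(0, 0.1) for r in rows]

unregularized = lasso_coordinate_descent(rows, targets, alpha=0.0)
regularized = lasso_coordinate_descent(rows, targets, alpha=0.5)
print("alpha=0.0:", [round(w, 3) for w in unregularized])
print("alpha=0.5:", [round(w, 3) for w in regularized])
```

Ridge would replace the L1 penalty with a squared L2 penalty, which shrinks coefficients without zeroing them, and the elastic net combines both penalties.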
[Figure: Linear Regression prediction error per project with regularization None, Ridge, Lasso, or Elastic. Projects: Eclipse JDT Core, Eclipse PDE UI, Equinox, Lucene, Mylyn. Axis and bar labels as extracted: 2.0, 1.5, 1.01, 1.01, 0.98, 0.96, 1.0, 0.92, 0.82, 0.81, 0.8, 0.59, 0.58, 0.58, 0.57, 0.52, 0.51, 0.51, 0.51, 0.5, 0.4, 0.38, 0.38, 0.38, 0.0.]
[Figure: Poisson Regression prediction error per project with regularization None, Ridge, Lasso, or ElasticNet. Projects named: Mylyn, JDT Core, PDE UI. Axis and bar labels as extracted: 2.0, 1.82, 1.5, 1.37, 1.02, 1.0, 0.92, 0.91, 0.91, 0.89, 0.86, 0.71, 0.69, 0.6, 0.6, 0.6, 0.59, 0.54, 0.54, 0.53, 0.5, 0.4, 0.4, 0.4, 0.0.]