Automatic Identification of Bug-fix Commits: The Case of GitHub Projects


  1. Automatic Identification of Bug-fix Commits: The Case of GitHub Projects. Yujuan Jiang, Rodrigo Morales, Bram Adams, Foutse Khomh

  2. • Case study projects • Approach • Research questions • Result (so far)

  3. Case Study Projects. Keywords: GitHub, C language

  4. Approach • Data Collection • Feature Extraction (Text & Source code) • Model Training • Evaluation

  5. Approach: Data Collection

  6. Approach: Feature Extraction • Textual Analysis: keywords • Code Analysis

  13. Approach: Feature Extraction 1) Textual Analysis: keywords + feature words. All words → stem + remove stop words → filter → feature words
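The textual-analysis pipeline on this slide could look roughly like the sketch below. The keyword list, the stop-word list, and the suffix-stripping `crude_stem` helper are all hypothetical stand-ins (the slides do not name the real lists or the stemmer used), and the "filter" step is left out because the slides do not say what it filters on.

```python
import re

# Hypothetical lists: the slides do not give the real keywords or stop words.
BUG_KEYWORDS = {"fix", "bug", "error", "crash", "leak"}
STOP_WORDS = {"a", "an", "the", "in", "on", "of", "to", "and", "for", "when", "is"}


def crude_stem(word):
    """Strip a few common suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def text_features(message):
    """Return (keyword hits, other stemmed feature words) for a commit message."""
    words = re.findall(r"[a-z]+", message.lower())
    # Remove stop words first, then stem what is left.
    stems = [crude_stem(w) for w in words if w not in STOP_WORDS]
    keywords = sorted(set(stems) & BUG_KEYWORDS)
    feature_words = sorted(set(stems) - BUG_KEYWORDS)
    return keywords, feature_words
```

For the message "Fixed a crash when parsing the index" this returns the keywords `['crash', 'fix']` and the feature words `['index', 'pars']`.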

  20. Approach: Feature Extraction 2) Source Code Analysis: Commits → Parser (patch parser + re script) → Commit Profile → Features (# of while loops, # of ifs, # of boolean ......)
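A minimal sketch of how a patch could be turned into a commit profile with regular expressions. It assumes unified-diff input and counts only the three constructs named on the slide; the `PATTERNS` table, the `commit_profile` function, and the sample patch are hypothetical, and the regexes approximate C syntax rather than parse it (a removed line that itself begins with `--` would be mistaken for a file header, for instance).

```python
import re

# Approximate regexes for a few C constructs; not a real C parser.
PATTERNS = {
    "while_loops": re.compile(r"\bwhile\s*\("),
    "ifs": re.compile(r"\bif\s*\("),
    "boolean_ops": re.compile(r"&&|\|\|"),
}

# Hypothetical patch used only for illustration.
SAMPLE_PATCH = "\n".join([
    "--- a/src/buf.c",
    "+++ b/src/buf.c",
    "@@ -10,4 +10,5 @@",
    " int n = 0;",
    "-while (p) {",
    "+while (p && n < max) {",
    "+    if (p->len == 0 || !p->data) break;",
    "     n++;",
])


def commit_profile(patch):
    """Count constructs on the added and removed lines of a unified diff."""
    profile = {f"{side}_{name}": 0
               for side in ("added", "removed") for name in PATTERNS}
    for line in patch.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file headers, not code changes
        if line.startswith("+"):
            side = "added"
        elif line.startswith("-"):
            side = "removed"
        else:
            continue  # context lines and hunk headers
        for name, pattern in PATTERNS.items():
            profile[f"{side}_{name}"] += len(pattern.findall(line[1:]))
    return profile
```

Keeping added and removed counts separate is a design choice of this sketch; the slides do not say whether the original features distinguish the two sides of a patch.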

  21. Approach: Feature Extraction

  25. Approach: Model Training. Black data: 300 manually labelled bug-fixing commits for each project. Grey data: unlabelled commits. Black data + grey data → LPU → white data (bottom k). Black data + white data → SVM, Random Forest
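The "bottom k" step can be illustrated with a small sketch. The slides do not describe how LPU scores the unlabelled commits, so this example swaps in a plain assumption: rank grey vectors by cosine similarity to the centroid of the black data and treat the k least similar ones as white data. All function names here are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pick_white(black, grey, k):
    """Indices of the k grey vectors least similar to the black centroid."""
    center = centroid(black)
    ranked = sorted(range(len(grey)), key=lambda i: cosine(grey[i], center))
    return ranked[:k]


def build_training_set(black, grey, k):
    """Label black data 1 and the bottom-k grey data 0."""
    white_idx = pick_white(black, grey, k)
    X = black + [grey[i] for i in white_idx]
    y = [1] * len(black) + [0] * len(white_idx)
    return X, y
```

The resulting `X` and `y` are what a second-stage classifier such as an SVM or a random forest would be trained on. A larger k yields more negative examples but raises the chance that some of them are unlabelled bug fixes.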

  26. Approach: Evaluation

  27. Research Questions • Does our classifier outperform the baseline keyword-based approach? • How does the parameter k affect the classifier? • Which metrics matter most for identifying bug-fixing commits? • Is the hybrid approach (the combination of LPU and SVM) more effective than a single-classifier approach? • Which combination of LPU options makes the classifier work best?
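For reference, a keyword-based baseline of the kind the first research question compares against might be as simple as the sketch below. The keyword list is hypothetical; the slides do not state which keywords the baseline uses.

```python
import re

# Hypothetical keyword list; the baseline's real keywords are not given.
BASELINE_KEYWORDS = ("fix", "bug", "defect", "error")
KEYWORD_RE = re.compile(r"\b(" + "|".join(BASELINE_KEYWORDS) + r")", re.IGNORECASE)


def is_bug_fix_baseline(message):
    """Flag a commit as a bug fix if a word in its message starts with a keyword."""
    return KEYWORD_RE.search(message) is not None
```

Such a rule accepts "Fixes #12: null deref" and rejects "Add prefix option", but it also accepts "Refactor error handling", which shows the kind of false positive that motivates a learned classifier.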

  28. Result (so far): recall • Libgit2: 76.95% • openFrameworks: 96.67%
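Recall here is the share of true bug-fixing commits that the classifier finds. A minimal sketch of the computation, assuming labels are encoded as 1 (bug fix) and 0 (other); the slides do not show how the reported figures were computed.

```python
def recall(y_true, y_pred):
    """Recall = true positives / (true positives + false negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0
```

With four true bug fixes of which three are predicted as such, `recall([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 1, 0])` gives 0.75. Recall ignores false positives, so it does not by itself show how many flagged commits are real bug fixes.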

  29. Result (so far): key features (Libgit2). [Figure: dot plot with one row per feature, in the order listed on the slide from X5, X6, X7, X22 through X38; horizontal axis from 0.000 to 0.030.]

  34. [Figure: key-features dot plot labelled LPU SVM, with one row per feature in the order listed on the slide from X5, X6, X7, X22 through X38; horizontal axis from 0.000 to 0.030.]
