Large-Scale Patch Recommendation at Alibaba
Xindong Zhang 1, Chenguang Zhu 2, Yi Li 3, Jianmei Guo 1, Lihua Liu 1, and Haobo Gu 1
1. Alibaba Group 2. University of Texas at Austin 3. Nanyang Technological University
Motivation
• 50% time: On average, 49.9% of software developers' time is spent on debugging.
• 50% cost: About half of development costs are associated with debugging and patching.
Automated patch recommendation can significantly reduce developers' debugging effort and overall development costs.
Challenges
• Diverse applications: a general approach is needed.
• Insufficient test cases: patch validation becomes difficult.
• Lack of patch labels: accurate patch mining is difficult.
• Practical requirements: recommendations must be highly responsive with a low false positive rate.
Solution
• Diverse applications: patches are mined from the internal codebase using generic features.
• Insufficient test cases: PRECFIX is independent of test cases and uses developers' feedback to validate and improve itself.
• Lack of patch labels: PRECFIX automatically mines bug and fix templates from historical changes.
• Practical requirements: PRECFIX guarantees high responsiveness (on the scale of milliseconds) and a low false positive rate (22% and lower).
PRECFIX
• Commit: the commit message contains fix intentions, e.g. "fix#723 NPE check" (author: Jack).
• 75% of bug-fixing commits follow the same pattern: delete the bug snippet and add the patch snippet.
• Clustering algorithm: DBSCAN
• Clustering strategy: both defect and patch snippets
• Optimization: Simhash-KDTree, API sequence
• Similarity comparison: Levenshtein + Jaccard
• 15 million commits, 30 million files
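The similarity comparison named above (Levenshtein plus Jaccard, applied to both the defect and the patch snippet) can be sketched as follows. This is a minimal illustration, not the PRECFIX implementation: the whitespace tokenization, the equal weighting of the two measures, the use of `min` to combine the defect and patch scores, and all names (`DefectPatchPair`, `snippet_similarity`, `pair_similarity`) are assumptions made for the example.

```python
from dataclasses import dataclass


def levenshtein(a, b):
    """Edit distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def levenshtein_similarity(a, b):
    """Edit distance normalized to [0, 1]; 1.0 means identical sequences."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def jaccard_similarity(a, b):
    """Size of the intersection over size of the union of the two token sets."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return 1.0 if not union else len(sa & sb) / len(union)


def snippet_similarity(a, b, weight=0.5):
    """Weighted mix of both measures over whitespace tokens (assumed weighting)."""
    ta, tb = a.split(), b.split()
    return (weight * levenshtein_similarity(ta, tb)
            + (1.0 - weight) * jaccard_similarity(ta, tb))


@dataclass(frozen=True)
class DefectPatchPair:
    defect: str  # snippet deleted by the bug-fixing commit
    patch: str   # snippet added by the bug-fixing commit


def pair_similarity(p, q):
    """Two pairs count as similar only if both their defects and patches are."""
    return min(snippet_similarity(p.defect, q.defect),
               snippet_similarity(p.patch, q.patch))


if __name__ == "__main__":
    p = DefectPatchPair("return user.getName();",
                        "if (user != null) return user.getName();")
    q = DefectPatchPair("return order.getId();",
                        "if (order != null) return order.getId();")
    print(pair_similarity(p, q))
```

A density-based clusterer such as DBSCAN would then consume `1 - pair_similarity(p, q)` as a distance. Taking the minimum of the two scores is one way to require that defect and patch both match; an average would be a softer alternative.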
Patch Category
• API Modification: 40%
• Validation Check: 26%
• API Wrap: 14%
Results
• EFFECTIVENESS (22%): The false positive rate is 22% in patch discovery, and it is expected to fall gradually through feedback on discovered patches and contributions of new patches.
• USER STUDY (10/12): The majority (10/12) of the interviewed developers acknowledged the value of the patches, and all of them would like to see Precfix adopted in practice.
• EFFICIENCY (5 hours): Offline patch discovery takes about 5 hours (extracting pairs, clustering, and extracting templates take 22, 270, and 5 min, respectively). Online patch recommendation is made within milliseconds.
• DEPLOYMENT (1 year): Precfix has been deployed at Alibaba for about one year so far. Every week, it recommends about 400 patches to developers on average and receives about two to three false positive reports.