An Empirical Study of the Fault-Proneness of Clone Mutation and Clone Migration
Shuai Xie, Foutse Khomh, Ying Zou
Department of Electrical and Computing Engineering
Clone Genealogies
[Figure: a clone group tracked across Revision i, Revision i+1, and Revision i+2. Legend: Type-1 Clone, Clone Group]
Clone Mutation
Clone A, Revision 270153 (Revision i, Type-1 clone group):
    catch( NullPointerException npe ){
        throw new BuildException( "Could not load the version information." );
    }
Clone A, Revision 270159 (Revision i+1, Type-2 clone group):
    catch( NullPointerException npe ){
        throw new TaskException( "Could not load the version information." );
    }
Clone Mutation Categories
• G<1>, G<2>, G<3>
• G<1,2>, G<2,3>, G<1,3>
• G<1,2,3>
[Figure: clone groups across Revision i, Revision i+1, and Revision i+2. Legend: Clone Group, Type-1 Clone, Type-2 Clone, Type-3 Clone]
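The category labels can be illustrated with a minimal sketch. It assumes that G<i,j> denotes a genealogy whose clone group took exactly the listed clone types over its revisions, which is how the labels read on the slide but is not spelled out there. The function names and the list-of-integers input are hypothetical.

```python
def mutation_category(types_per_revision):
    """Return a label such as 'G<1,2>' for the clone types seen in one genealogy.

    Assumption: the label lists every distinct clone type (1, 2, or 3) that
    the clone group had in any revision of the genealogy.
    """
    seen = sorted(set(types_per_revision))
    if not seen or any(t not in (1, 2, 3) for t in seen):
        raise ValueError("clone types must be 1, 2, or 3")
    return "G<" + ",".join(str(t) for t in seen) + ">"


def has_mutation(types_per_revision):
    """Assumption: a genealogy mutates when its clone type changes at least once."""
    return len(set(types_per_revision)) > 1
```

Under this reading, the BuildException/TaskException example (Type-1 at Revision i, Type-2 at Revision i+1) falls into G<1,2>.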
Clone Migration
• Revision i: File A, Revision: 268748
  Path: /ant/core/trunk/src/main/org/apache/tools/ant/ProjectHelper.java
• Revision i+1: File A, Revision: 270158
  Path: /ant/core/trunk/proposal/myrmidon/src/main/org/apache/tools/ant/ProjectHelper.java
[Figure: directories D1 and D2 at Revision i; directories D1, D2, and D3 at Revision i+1. Legend: Directory, Type-1 Clone]
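One way to quantify how far apart clones sit in the directory tree is sketched below. The slides do not define the distance measure, so this sketch assumes a simple one: the number of directory steps from one file's directory up to the deepest common ancestor and down to the other file's directory. The function names are hypothetical.

```python
from itertools import combinations
from pathlib import PurePosixPath
from statistics import median


def directory_distance(path_a, path_b):
    """Directory steps between two files via their deepest common ancestor.

    Assumption: this tree distance stands in for the study's own measure,
    which the slides do not specify.
    """
    dir_a = PurePosixPath(path_a).parent.parts
    dir_b = PurePosixPath(path_b).parent.parts
    common = 0
    for a, b in zip(dir_a, dir_b):
        if a != b:
            break
        common += 1
    return (len(dir_a) - common) + (len(dir_b) - common)


def median_distance(paths):
    """Median pairwise directory distance among the clones of one clone group."""
    pairs = list(combinations(paths, 2))
    if not pairs:
        return 0  # a single clone has no pairwise distance
    return median(directory_distance(a, b) for a, b in pairs)
```

With the two ProjectHelper.java paths from the slide, the shared prefix is /ant/core/trunk, so this measure counts 6 steps up from the first directory and 8 steps down to the second.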
Clone Migration Patterns
• Tracked per clone group:
  – Evolution of Median Distance Among Clones in Clone Group
  – Evolution of Size of Clone Group
• Evolution Trends: Constant, Increase, Decrease, Wave Increase, Wave Decrease
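The five trends can be illustrated with a small classifier over a per-revision series (for example, the median distance or the group size). The exact trend definitions are not given on the slide, so the rules below are assumptions: a series that only rises is an Increase, one that only falls is a Decrease, and a series that moves in both directions is a wave named after its net change. The "Wave Stable" fallback for a zero net change borrows a pattern name that appears on the later results slides.

```python
def classify_trend(values):
    """Classify a per-revision series into one of the evolution trends.

    Assumed rules (not taken from the slides):
      no change at all            -> Constant
      only upward steps           -> Increase
      only downward steps         -> Decrease
      both, net change positive   -> Wave Increase
      both, net change negative   -> Wave Decrease
      both, net change zero       -> Wave Stable
    """
    deltas = [b - a for a, b in zip(values, values[1:])]
    went_up = any(d > 0 for d in deltas)
    went_down = any(d < 0 for d in deltas)
    if not went_up and not went_down:
        return "Constant"
    if went_up and not went_down:
        return "Increase"
    if went_down and not went_up:
        return "Decrease"
    net = values[-1] - values[0]
    if net > 0:
        return "Wave Increase"
    if net < 0:
        return "Wave Decrease"
    return "Wave Stable"
```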
Clone Migration Patterns
Each pattern combines the Evolution of Median Distance Among Clones in Clone Group with the Evolution of Size of Clone Group:
• Constant (Migration? No)
• High Density Strong Up (Migration? Yes)
We also define 8 other clone migration patterns.
Research Questions
• RQ1: Do clone mutation and clone migration occur frequently in software systems?
• RQ2: Are some clone mutations more fault-prone than others?
• RQ3: Are some clone migrations more fault-prone than others?
Subject Systems
Systems    # LOC   # Revisions   # Genealogies
(logo)     1.6M    109K          1.7K
(logo)     2.3M    1.0M          23K
ArgoUML    3.1M    18K           15.6K
Approach Overview
Mine the SVN
• Mine the SVN using J-Rex
  – Identify fault-fixing changes
  – Extract snapshots of all changed files
• Remove the test files
Clone Detection
• Perform clone detection once on all files
• Use NiCad to detect clones
• NiCad parameters:
Clone Type   Similarity   Blind-rename
Type-1       100%         No
Type-2       100%         Yes
Type-3       80%          Yes
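The effect of the similarity threshold and the blind-rename option can be illustrated with a toy classifier. This is not NiCad and does not reproduce its algorithm: the regular-expression tokenizer, the partial keyword list, and the use of Python's difflib ratio as the similarity measure are all stand-in assumptions chosen only to show how the three parameter rows differ.

```python
import re
from difflib import SequenceMatcher

# Partial keyword list, for illustration only (assumption).
JAVA_KEYWORDS = {"catch", "throw", "new", "try", "if", "else", "return", "for", "while"}
IDENTIFIER = re.compile(r"[A-Za-z_]\w*")
# String literals, identifiers, or any other single non-space character.
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|[A-Za-z_]\w*|\S')


def tokens(code, blind_rename=False):
    """Tokenize code; with blind_rename, replace every identifier by 'ID'."""
    out = []
    for tok in TOKEN.findall(code):
        is_identifier = IDENTIFIER.fullmatch(tok) and tok not in JAVA_KEYWORDS
        out.append("ID" if blind_rename and is_identifier else tok)
    return out


def clone_type(a, b, type3_threshold=0.8):
    """Toy classification mirroring the parameter table above."""
    if tokens(a) == tokens(b):
        return "Type-1"  # 100% similar without renaming
    renamed_a, renamed_b = tokens(a, True), tokens(b, True)
    if renamed_a == renamed_b:
        return "Type-2"  # 100% similar after blind renaming
    # Stand-in similarity: difflib ratio over blind-renamed tokens.
    if SequenceMatcher(None, renamed_a, renamed_b).ratio() >= type3_threshold:
        return "Type-3"
    return None
```

Applied to the two catch blocks from the Clone Mutation slide, the only differing token is BuildException versus TaskException, so the pair matches only after blind renaming and the sketch reports Type-2.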
Build Clone Genealogies
• Build clone genealogies from the existing clone groups
• Use diff to track changes to each clone
• Connect clone groups that share clones
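The linking step can be sketched as a connected-components problem. The sketch assumes each clone already carries a stable identifier across revisions (the role the slide assigns to diff-based tracking) and that two clone groups in consecutive revisions belong to the same genealogy when they share at least one clone. The input layout and function name are hypothetical.

```python
def build_genealogies(revisions):
    """Group clone groups into genealogies.

    revisions: list ordered by revision; each entry is a list of clone groups,
    and each clone group is a set of clone identifiers (assumed stable across
    revisions). Returns a sorted list of genealogies, each a list of
    (revision_index, group_index) pairs.
    """
    nodes = [(r, g) for r, groups in enumerate(revisions) for g in range(len(groups))]
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    # Connect groups in consecutive revisions that share at least one clone.
    for r in range(len(revisions) - 1):
        for i, current in enumerate(revisions[r]):
            for j, following in enumerate(revisions[r + 1]):
                if current & following:
                    parent[find((r, i))] = find((r + 1, j))

    genealogies = {}
    for n in nodes:
        genealogies.setdefault(find(n), []).append(n)
    return sorted(genealogies.values())
```

Union-find keeps the sketch short; a real pipeline would also need to handle clones that are renamed, split, or deleted between revisions.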
RQ1: Do clone mutation and clone migration occur frequently in software systems?
[Figure: Proportion of Different Clone Mutation Categories. Y-axis: Percentage %. X-axis: G<1>, G<2>, G<3>, G<1,2>, G<1,3>, G<2,3>, G<1,2,3>, Total Mutation, grouped under "No Clone Mutation" and "Clone Mutation Categories". Series: JBoss, Apache-Ant, ArgoUML]
Clone mutation affects a substantial number of clones.
RQ1: Do clone mutation and clone migration occur frequently in software systems?
[Figure: Proportion of Different Clone Migration Patterns. Y-axis: Percentage %. X-axis: Constant, Wave Stable, High Density Strong Up, Low Density Strong Up, High Density Wave Up, Low Density Wave Up, High Density Strong Down, Low Density Strong Down, High Density Wave Down, Low Density Wave Down, Total Migration, grouped under "No Clone Migration" and "Clone Migration Patterns". Series: JBoss, Apache-Ant, ArgoUML]
Clone migration affects a substantial number of clones.
RQ2: Are some clone mutations more fault-prone than others?
[Figure: Odds Ratio for Different Categories of Clone Genealogies. Y-axis: Odds Ratio (0 to 10; values above the axis range are annotated 40 and 12). X-axis: G<1>, G<2>, G<3>, G<1,2>, G<1,3>, G<2,3>, G<1,2,3>, grouped under "No Mutation" and "Clone Mutation Categories". Series: JBoss, Apache-Ant, ArgoUML]
Clone mutations from Type-1 to Type-2 and from Type-1 to Type-3 increase fault-proneness.
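For readers unfamiliar with the metric, an odds ratio compares the odds of a fault in one category with the odds in a control group. The sketch below uses the standard 2x2 formula; treating genealogies without mutation as the control group is an assumption based on the "No Mutation" label in the chart, and the counts in the usage note are invented purely for illustration.

```python
def odds_ratio(faulty_exposed, clean_exposed, faulty_control, clean_control):
    """Odds ratio from a 2x2 contingency table.

    exposed: genealogies in the category under study (e.g. one mutation category)
    control: genealogies in the reference category (assumed: no mutation)
    A value above 1 means faults are more likely in the exposed category.
    """
    if clean_exposed == 0 or faulty_control == 0:
        raise ZeroDivisionError("odds ratio undefined when a denominator cell is zero")
    return (faulty_exposed * clean_control) / (clean_exposed * faulty_control)
```

With made-up counts of 30 faulty and 70 fault-free genealogies in a category, against 10 faulty and 90 fault-free in the control group, `odds_ratio(30, 70, 10, 90)` evaluates to (30 × 90) / (70 × 10), roughly 3.86.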
RQ3: Are some clone migrations more fault-prone than others?
[Figure: Odds Ratio of Different Migration Patterns. Y-axis: Odds Ratio (0 to 10; a value above the axis range is annotated 52). X-axis: Constant, Wave Stable, High Density Strong Up, Low Density Strong Up, High Density Wave Up, Low Density Wave Up, High Density Strong Down, Low Density Strong Down, High Density Wave Down, Low Density Wave Down, grouped under "No Clone Migration" and "Clone Migration Patterns". Series: JBoss ORs, Apache-Ant ORs, ArgoUML ORs]
Increasing the distance among cloned code segments makes clone genealogies more fault-prone.
Clone Mutation and Clone Migration
RQ1
• RQ1: Do clone mutation and clone migration occur frequently in software systems?
• Main Findings: Both clone mutation and clone migration affect a substantial number of clones in clone genealogies.
RQ2
• RQ2: Are some clone mutations more fault-prone than others?
• Main Findings: Clone mutations from Type-1 to Type-2 and from Type-1 to Type-3 in clone genealogies increase fault-proneness.
RQ3
• RQ3: Are some clone migrations more fault-prone than others?
• Main Findings: Increasing the distance among cloned code segments makes clone genealogies more fault-prone.