Mining Version Histories to Guide Software Changes by T. Zimmermann, P. Weißgerber, S. Diehl, A. Zeller in IEEE Transactions on Software Engineering, Vol. 31, No. 6, June 2005
The Idea Can we make similar suggestions for software changes?
Extending Eclipse Preferences
Extend the Eclipse IDE with a new preference.
Preferences are stored in a field fKeys[].
Extending Eclipse Preferences
What else do you need to change? Which of the 27,000 files? Which of the 20,000 classes? Which of the 200,000 methods?
Program analysis: fKeys[] and initDefaults() use the same variables.
Usage does not induce change.
Usage can be detected only within the source code, and Eclipse has 12,000 non-Java files.
Learning from History
Programmers who changed fKeys[] also changed …
From CVS to Transactions
The CVS archive for Eclipse has more than 47,000 transactions.
ROSE in a Nutshell
Changes -> Transactions -> Rules
Entity: a triple (c, i, p), where c is the syntactic category, i is the identifier, and p is the parent entity.
Example: (method, initDefaults(), (class, Comp, (file, Comp.java, …)))
Operations on entities: add_to, del_from, alter.
Transaction: the set of changes simultaneously submitted by a developer to a version archive.
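The entity triple and the notion of a transaction can be sketched as small data types. This is only an illustration of the definitions on the slide, not ROSE's actual data model: the class names `Entity` and `Change`, the `path` helper, and the sample `add_to` change are hypothetical, and the parent of the file entity (elided on the slide) is simply left empty.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    """A syntactic entity (c, i, p): category, identifier, parent entity."""
    category: str
    identifier: str
    parent: Optional["Entity"] = None

    def path(self) -> str:
        """Render the chain from the outermost known ancestor down to this entity."""
        prefix = self.parent.path() + " > " if self.parent else ""
        return f"{prefix}{self.category}:{self.identifier}"


@dataclass(frozen=True)
class Change:
    """One operation applied to an entity: add_to, del_from, or alter."""
    operation: str
    entity: Entity


# The example from the slide; the file's own parent is elided there, so it stays None.
file_ = Entity("file", "Comp.java")
class_ = Entity("class", "Comp", file_)
method = Entity("method", "initDefaults()", class_)

# A transaction is modelled as the set of changes submitted together (hypothetical content).
transaction = frozenset({Change("alter", method), Change("add_to", class_)})

print(method.path())
```

Frozen dataclasses are hashable, which is what allows changes to be collected in a set.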
Getting Syntactic Entities
Light-Weight Analysis with ROSE
ROSE analyzes C/C++, Java, Python, TeX, and Texinfo files.
We get modified methods, variables, and subsections.
Changes -> Transactions -> Rules
ROSE retrieves changes and transactions from CVS [Berliner’90].
CVS provides only file versioning, so per-file changes are grouped into transactions.
Files -> Transactions: sliding window approach [Fogel’02]. Two subsequent changes by the same author belong to the same transaction if they are at most 200 seconds apart.
Branches and merges in CVS: ROSE ignores changes that affect more than 30 entities.
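The sliding-window grouping described above might look roughly like the following. It is a minimal sketch that uses only the two criteria named on the slide (same author, at most 200 seconds between subsequent changes). The `FileChange` record and the function name are hypothetical, and counting distinct file paths stands in for counting affected entities.

```python
from collections import defaultdict
from typing import NamedTuple


class FileChange(NamedTuple):
    """A per-file CVS change (hypothetical record layout)."""
    author: str
    timestamp: int  # seconds
    path: str


WINDOW_SECONDS = 200  # gap allowed between two subsequent changes
MAX_ENTITIES = 30     # larger transactions are ignored


def group_into_transactions(changes):
    """Group per-file changes into transactions with a sliding time window."""
    by_author = defaultdict(list)
    for change in changes:
        by_author[change.author].append(change)

    transactions = []
    for author_changes in by_author.values():
        author_changes.sort(key=lambda c: c.timestamp)
        current = [author_changes[0]]
        for change in author_changes[1:]:
            # The window slides: each change is compared with the previous one.
            if change.timestamp - current[-1].timestamp <= WINDOW_SECONDS:
                current.append(change)
            else:
                transactions.append(current)
                current = [change]
        transactions.append(current)

    # Assumption: distinct file paths approximate the number of affected entities.
    return [t for t in transactions if len({c.path for c in t}) <= MAX_ENTITIES]


changes = [
    FileChange("alice", 0, "Comp.java"),
    FileChange("alice", 150, "plugin.properties"),
    FileChange("alice", 400, "README"),
    FileChange("bob", 10, "Other.java"),
]
grouped = group_into_transactions(changes)
print([[c.path for c in t] for t in grouped])
```

Because each change is compared with its predecessor rather than with the first change of the group, a transaction can span more than 200 seconds in total as long as no single gap exceeds the window.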
Changes -> Transactions -> Rules
Rules are mined from transactions with the Apriori algorithm [Agrawal’94].
The generated rules have the form antecedent(s) => consequent(s).
The rules have a probabilistic interpretation.
Evidence: support count (the number of transactions) and confidence (the strength of the correspondence).
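To make the two evidence measures concrete, the sketch below computes support count and confidence for a single rule by direct counting. It is not the Apriori algorithm itself, only the arithmetic behind a rule's evidence. The function names and the toy history are hypothetical.

```python
def support_count(transactions, items):
    """Number of transactions that contain every item in `items`."""
    items = frozenset(items)
    return sum(1 for t in transactions if items <= t)


def confidence(transactions, antecedent, consequent):
    """Fraction of transactions containing the antecedent that also contain the consequent."""
    base = support_count(transactions, antecedent)
    if base == 0:
        return 0.0
    both = support_count(transactions, frozenset(antecedent) | frozenset(consequent))
    return both / base


# Hypothetical toy history: each transaction is a set of changed entities.
history = [
    frozenset({"fKeys[]", "initDefaults()", "plugin.properties"}),
    frozenset({"fKeys[]", "initDefaults()"}),
    frozenset({"fKeys[]", "other()"}),
    frozenset({"initDefaults()"}),
]

rule_support = support_count(history, {"fKeys[]", "initDefaults()"})
rule_confidence = confidence(history, {"fKeys[]"}, {"initDefaults()"})
print(rule_support, round(rule_confidence, 2))
```

In this toy history the rule fKeys[] => initDefaults() is backed by two transactions, and two of the three transactions that changed fKeys[] also changed initDefaults().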
Evolutionary Coupling
Evolutionary Coupling
Support: How much evidence (= simultaneous changes) is there?
Confidence: How relevant is the coupling for the participants?
Applying Rules
The programmer performs a change, called “a situation”; ROSE suggests further changes by applying matching rules.
A rule matches when its antecedent equals the situation.
The suggestion is the union of the consequents of all matching rules.
The number of rules depends on the thresholds for support count and confidence.
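The rule application step can be sketched as follows, assuming rules have already been mined and carry their support count and confidence. The `Rule` record, the `suggest` function, the default thresholds, and the ranking by confidence are illustrative assumptions; the slide only fixes the matching criterion and the union of consequents.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """A mined rule with its evidence (hypothetical record layout)."""
    antecedent: frozenset
    consequent: frozenset
    support: int
    confidence: float


def suggest(rules, situation, min_support=1, min_confidence=0.5):
    """Union of the consequents of all rules whose antecedent equals the situation."""
    situation = frozenset(situation)
    best = {}
    for rule in rules:
        if rule.antecedent != situation:
            continue
        if rule.support < min_support or rule.confidence < min_confidence:
            continue
        for entity in rule.consequent - situation:
            score = (rule.confidence, rule.support)
            best[entity] = max(best.get(entity, score), score)
    # Assumption: rank suggestions by confidence, then by support count.
    return sorted(best, key=lambda e: best[e], reverse=True)


# Hypothetical rules; the numbers are made up for illustration.
rules = [
    Rule(frozenset({"fKeys[]"}), frozenset({"initDefaults()"}), 7, 1.0),
    Rule(frozenset({"fKeys[]"}), frozenset({"plugin.properties"}), 4, 0.57),
    Rule(frozenset({"other()"}), frozenset({"unrelated()"}), 9, 0.9),
    Rule(frozenset({"fKeys[]"}), frozenset({"rare()"}), 1, 0.1),
]
suggestions = suggest(rules, {"fKeys[]"})
print(suggestions)
```

Raising `min_support` or `min_confidence` shrinks the set of applicable rules, which is the trade-off between fewer, more reliable suggestions and broader coverage.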
Multi-Dimensional Rules
If something is added to the software, there is no way to predict the change based on history.
E.g., the developer adds a “Foo” constant to Comp.java.
ROSE can handle this in the “operation” dimension.
Examples of Rules
GCC: arrays that define the cost of different assembler operations for Intel CPUs.
The arrays have been altered 9 times; 9 out of 11 times, the change is triggered by a change in the type:
Examples of Rules
Python and C files: detecting evolutionary couplings across different programming languages.
It would require cross-language program analysis to detect this coupling.
Examples of Rules
POSTGRES documentation
ROSE Server and Client
The ROSE server determines coupling and rules.
The ROSE client guides the programmer along related changes.
Evaluation
How good are the rules at predicting changes?
Training period: ROSE infers rules from the past.
Evaluation period: ROSE applies the mined rules.
In the evaluation period, every transaction T is checked:
Navigation: given one change in T, does ROSE point to further changes in T?
Error prevention: given all but one change from T, does ROSE point to the missing change?
Closure: given all changes of T, does ROSE stay silent?
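The three checks can be read as ways of turning one evaluation transaction into queries, each pairing a situation with the entities a good recommender should return. The following is a minimal sketch of that construction; the function names and the tuple layout `(situation, expected)` are assumptions made for illustration.

```python
def navigation_queries(transaction):
    """One query per entity: given that single change, the rest of T is expected."""
    items = sorted(transaction)
    return [({e}, set(items) - {e}) for e in items]


def prevention_queries(transaction):
    """One query per entity: given all changes but one, the missing one is expected."""
    items = sorted(transaction)
    return [(set(items) - {e}, {e}) for e in items]


def closure_query(transaction):
    """Given every change of T, nothing further should be suggested."""
    return (set(transaction), set())


# Hypothetical evaluation transaction.
t = {"a", "b", "c"}
nav = navigation_queries(t)
prev = prevention_queries(t)
clo = closure_query(t)
print(nav[0], prev[0], clo)
```

A transaction with n entities yields n navigation queries, n prevention queries, and one closure query.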
Evaluating Additional Questions
Granularity: files and functions.
Maintenance: no additions or deletions.
Multiple dimensions: what is the benefit of add_to and del_from?
History: how much history? Usefulness over time? Quality of recommendations depending on the development cycle and releases?
Recent changes: relevance of old changes.
Projects Used for Evaluation
Precision vs. Recall
Recall: How many of the relevant entities are returned?
Precision: How many of the returned entities are relevant?
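For a single query, both measures reduce to set arithmetic on the suggested and the expected entities. The sketch below assumes the convention that precision is 1 when nothing is suggested and recall is 1 when nothing is expected; the function name and sample sets are hypothetical.

```python
def precision_recall(suggested, expected):
    """Return (precision, recall) for one query."""
    suggested, expected = set(suggested), set(expected)
    hits = suggested & expected
    # Assumed convention: an empty answer has precision 1, an empty expectation has recall 1.
    precision = len(hits) / len(suggested) if suggested else 1.0
    recall = len(hits) / len(expected) if expected else 1.0
    return precision, recall


# Hypothetical query: four suggestions, two of them among the three expected entities.
p, r = precision_recall({"a", "b", "x", "y"}, {"a", "b", "c"})
print(p, round(r, 2))
```

Here half of the suggestions are relevant (precision 0.5), and two of the three relevant entities are found (recall of about 0.67).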
Precision vs. Feedback / Support Count vs. Confidence
Results: Navigation, Prevention, Closure
Navigation, Prevention, Closure
Navigation: The programmer has changed one single entity. Can ROSE suggest other entities that should be changed?
Prevention: The programmer has changed all necessary entities but one. Does ROSE find the missing one?
Closure: The programmer has made all necessary changes. How often does ROSE still suggest a missing change?
Results for Fine Granularity
Results: Navigation
Given one initial item, ROSE makes predictions in 66 percent of all queries.
On average, the predictions contain 33 percent of all items changed.
For those queries for which ROSE makes recommendations, in 70 percent of the cases a correct location is within ROSE’s topmost three suggestions.
Results: Prevention and Closure
In 3 percent of the queries where one item is missing, ROSE issues a correct warning.
A warning predicts 75 percent of the items that need to be changed.
ROSE’s warnings about missing items should be taken seriously: only 2 percent of all transactions cause a false alarm.
Results for Coarse Granularity
Results for Maintenance
ROSE shows the best predictive power for changes to existing entities.
Threats to Validity
Kinds of version histories and software projects: 8 projects, 100,000 transactions.
Transactions do not record the order of changes (a CVS limitation).
Quality of transactions?
User studies?
Summary
For stable systems like GCC, ROSE gives precise suggestions (recommendations in 63% of transactions, 30% precision, and in 90% of all recommendations the three topmost suggestions contain a correct entity).
For rapidly changing systems like KOFFICE, the most useful suggestions are at the file level (because predicting new functions is out of reach for any approach).
The predictive power of ROSE is best during maintenance phases.
In about 2-7% of all erroneous transactions, ROSE correctly detects the missing change (only 2% of all transactions cause a false alarm).
ROSE detects coupling between non-program entities (e.g., docs, manuals, mappings).
Future Work
Taxonomies: identify patterns of changes.
Sequence rules: detect rules across multiple transactions.
Further data sources: log messages, bug databases.
Refactoring: ROSE does not recognize renamings of methods or files.
Program analysis: can improve the overall approach.
Rule presentation: visualization of rules can help.
Downloads
ROSE is publicly available as a plug-in for ECLIPSE.
For details and downloads, visit http://www.st.cs.uni-sb.de/softevo