A Degree-of-Knowledge Model to Capture Source Code Familiarity
Thomas Fritz, Jingwen Ou, Gail C. Murphy and Emerson Murphy-Hill
Presented by: Haifa Alharthi
Outline
• Problem statement
• Dataset
• Description of the main elements used in the model
• Description of the degree-of-knowledge model
• Determining the weightings needed in the degree-of-knowledge model
• Case studies
• Discussion and future work
Problem statement
• The size and high rate of change of source code make it difficult for software developers to keep up with who on the team knows about particular parts of the code.
• This lack of knowledge:
  • Complicates many activities, e.g., deciding who to ask when questions arise
  • Makes it difficult to know who can bring a new team member up to speed in a particular part of the code
• Existing approaches to this problem are based solely on authorship of code.
Dataset
• Data was gathered from two professional development sites:
• Site 1:
  • 7 professional developers
  • Developers' experience: 1-22 years (11.6 years on average)
  • Each worked on multiple streams of the code
• Site 2:
  • 2 professional developers
  • Build open source frameworks for Eclipse
  • Developers' experience: 3 and 5 years
Degree-of-Authorship (DOA)
• Three factors determine the DOA of a developer D for a code element:
  • First authorship (FA): whether D created the first version of the element
  • Number of deliveries (DL): subsequent changes to the element made by D after first authorship
  • Acceptances (AC): changes to the element made by developers other than D
(A sketch of extracting these factors from a change log follows.)
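Below is a minimal Python sketch of how the three factors might be extracted from an ordered change log. The input shape (a chronological list of (developer, element) change events) and every name here are illustrative assumptions, not details from the paper.

```python
from collections import defaultdict

def doa_factors(changes, developers):
    """changes: chronological list of (developer, element) change events.
    Returns {(developer, element): {'FA': ..., 'DL': ..., 'AC': ...}}."""
    first_author = {}                  # element -> developer of the first version
    total_changes = defaultdict(int)   # element -> changes after first authorship
    own_changes = defaultdict(int)     # (developer, element) -> own later changes
    for dev, elem in changes:
        if elem not in first_author:
            first_author[elem] = dev   # FA event: first version of the element
        else:
            total_changes[elem] += 1
            own_changes[(dev, elem)] += 1
    factors = {}
    for elem, author in first_author.items():
        for dev in developers:
            factors[(dev, elem)] = {
                'FA': 1 if author == dev else 0,  # first authorship
                'DL': own_changes[(dev, elem)],   # deliveries by this developer
                'AC': total_changes[elem] - own_changes[(dev, elem)],  # others' changes
            }
    return factors
```

Whether the initial authorship event should also count toward other developers' AC is left open here; the paper's exact bookkeeping may differ.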
Degree-of-Interest (DOI)
• The degree-of-interest (DOI) is a real value that represents the amount of interaction (selections and edits) a developer has had with a source code element.
  • A selection occurs when a developer touches a code element (e.g., opens a class).
  • An edit occurs when a keystroke is detected in an editor window.
  • A selection of an element contributes less to DOI than an edit of an element.
• A positive DOI value: the developer has been recently and frequently interacting with the element.
• A negative DOI value: the developer has interacted substantially with other elements since last interacting with this element.
(A minimal sketch of such a computation follows.)
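As a rough illustration of the decay behaviour described above, here is a minimal Python sketch in the spirit of Mylyn's interest model; the event weights and decay constant are made-up values, not the tool's.

```python
# Illustrative weights: an edit counts more than a selection, and every
# interaction with *other* elements slightly erodes this element's interest.
SELECT_WEIGHT = 1.0
EDIT_WEIGHT = 2.0
DECAY = 0.05

def doi(events, element):
    """events: chronological list of (kind, element), kind in {'select', 'edit'}."""
    value = 0.0
    for kind, elem in events:
        if elem == element:
            value += EDIT_WEIGHT if kind == 'edit' else SELECT_WEIGHT
        else:
            value -= DECAY  # interest decays while the developer works elsewhere
    return value  # positive: recent and frequent interaction; negative: displaced

# Example: two edits, then sustained work elsewhere drives the DOI negative.
events = [('edit', 'Foo'), ('edit', 'Foo')] + [('select', 'Bar')] * 100
print(doi(events, 'Foo'))  # 4.0 - 100 * 0.05 = -1.0
```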
Difference between authorship and interaction
• Site 1: On average, only 27% of elements with a positive DOI also had at least one first authorship or delivery event in the previous three months.
• Site 2: Only four elements (7%) with a positive DOI also had at least one first authorship or delivery event.
Degree-of-Knowledge Model
• The degree-of-knowledge model assigns a real value to each source code element (class, method, field) for each developer.
• Two components of degree-of-knowledge:
  • The developer's longer-term knowledge: represented by a degree-of-authorship (DOA) value
  • The developer's shorter-term knowledge: represented by a degree-of-interest (DOI) value
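One plausible general shape for this combination, written only as a sketch (whether the combination is strictly linear, and what the weightings are, is determined empirically on the following slides):

```latex
\mathit{DOK}_{d,e} \;=\; w_{A}\cdot \mathit{DOA}_{d,e} \;+\; w_{I}\cdot \mathit{DOI}_{d,e}
```

where d is a developer, e a source code element, and w_A, w_I are weightings to be fit.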
Determining DOK Weightings
• Appropriate weighting values were determined empirically:
1. Initial weighting values were determined from the data collected at Site 1.
2. These weightings were then tested at Site 2.
• Determining weightings that might apply across a broader range of development situations would require gathering data from many more projects.
Determining DOK Weightings: Method
• At time T3, for each developer:
  • Collected 40 random code elements with (DOI ≠ 0) or (FA > 0) or (DL > 0); a sketch of this sampling criterion follows.
  • Developers assessed their knowledge of those elements on a scale of 1 to 5.
• 246 ratings were collected; the ratings are ordinal.
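A minimal Python sketch of the sampling criterion stated above; the per-element dictionary representation is an assumption for illustration.

```python
import random

def sample_rated_elements(elements, k=40):
    """elements: assumed list of dicts with 'DOI', 'FA' and 'DL' keys."""
    eligible = [e for e in elements
                if e['DOI'] != 0 or e['FA'] > 0 or e['DL'] > 0]
    return random.sample(eligible, min(k, len(eligible)))
```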
Determining DOK Weightings: Analysis and Results
• Multiple linear regression analysis:
  • Independent variables: FA, DL, AC, DOI
  • Dependent variable: developer ratings
• Because DOI and AC can be substantially high, the natural logarithms of these values were used.
• The resulting DOK equation is a weighted linear combination of these variables; its general form is sketched below.
• The lack of significance of DOI might stem from the scarcity of elements with a positive DOI in the set of randomly chosen elements (only 7%).
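Since the fitted coefficients are not reproduced on this slide, only the general form implied by the regression setup is sketched here; the log transforms on AC and DOI follow the slide's description, and how non-positive DOI values were handled is not stated. The β values are the weightings reported in the paper.

```latex
\mathit{DOK} \;=\; \beta_0
  \;+\; \beta_{FA}\cdot \mathit{FA}
  \;+\; \beta_{DL}\cdot \mathit{DL}
  \;+\; \beta_{AC}\cdot \ln(\mathit{AC})
  \;+\; \beta_{DOI}\cdot \ln(\mathit{DOI})
```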
Determining DOK Weightings: Analysis and Results
• F-test: 19.6 with p < 0.000001
  • The F-test indicates that the independent variables (FA, DL, AC, DOI) are jointly significant in explaining the dependent variable (user rating).
• Goodness of fit: R² = 0.25
  • R² measures how much of the variance in the user ratings the model explains.
  • The R² value shows that the model does not predict the user rating completely.
• Each of the four variables contributes to the overall explanation of the user rating.
(A sketch of how such a regression could be run follows.)
Definitions from: http://www.analystforum.com/forums/cfa-forums/cfa-level-ii-forum/91311972
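For illustration, here is how such an analysis could be run with statsmodels; the file name, column names, and the log1p/clipping choices are assumptions, not the paper's exact procedure.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

ratings = pd.read_csv("ratings.csv")  # assumed columns: FA, DL, AC, DOI, rating
X = sm.add_constant(pd.DataFrame({
    "FA": ratings["FA"],
    "DL": ratings["DL"],
    "log_AC": np.log1p(ratings["AC"]),                  # tame large AC values
    "log_DOI": np.log1p(ratings["DOI"].clip(lower=0)),  # tame large DOI values
}))
model = sm.OLS(ratings["rating"], X).fit()
print(model.fvalue, model.f_pvalue)  # joint significance (the F-test above)
print(model.rsquared)                # goodness of fit (R²)
print(model.params)                  # the fitted DOK weightings
```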
Determining DOK Weightings: External Validity of the Model
• The weightings were tested at Site 2:
1. Each developer rated 40 random code elements on a scale of 1 to 5.
2. DOK values for each of the elements were computed using the weightings determined at Site 1.
3. The Spearman rank correlation coefficient statistic was applied:
  • A non-parametric statistic designed to measure the strength of association between ranked variables
  • 80 code elements from the two developers
• Results:
  • There is a statistically significant correlation, r_s = 0.3847 (p = 0.0004).
  • The model can predict DOK values with reasonable accuracy.
(A sketch of this correlation check follows.)
Spearman rank correlation: https://statistics.laerd.com/statistical-guides/spearmans-rank-order-correlation-statistical-guide.php
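The validation step amounts to a Spearman correlation between computed DOK values and the developers' self-ratings; a toy illustration with scipy follows (the numbers are made up, not the study's data).

```python
from scipy.stats import spearmanr

dok_values = [4.2, 1.1, 3.5, 0.4, 2.8]  # toy DOK values (Site 1 weightings)
self_ratings = [5, 1, 4, 2, 3]          # toy 1-5 developer self-ratings

r_s, p = spearmanr(dok_values, self_ratings)
print(f"r_s = {r_s:.4f}, p = {p:.4f}")  # the slide reports r_s = 0.3847, p = 0.0004
```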
Case Studies
• Goal: determine whether degree-of-knowledge (DOK) values can provide value to software developers
• Two case studies involved the seven developers at Site 1.
• A third case study involved three different developers at Site 2, with an average of 2.5 years of professional experience.
Case study 1: Finding Experts
• Problem: identify which team member knows the most about each part of the code base
• Method:
  • Compared the packages predicted by the DOK model to the developers' reported package assignments
• Results:
  • 55% of the results computed from DOK values were consistent with the developers' assignments.
  • However, the developer assignments were sometimes guesses.
  • All six developers stated that the knowledge map was reasonable.
[Figure: knowledge map in which each package is colored (or labelled) according to the developer with the highest DOK values for that package]
(A sketch of building such a map follows.)
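A minimal Python sketch of how such a knowledge map could be derived; aggregating per-element DOK by summation is an assumption, since the slide says only "highest DOK values for that package".

```python
from collections import defaultdict

def expert_map(dok, element_package):
    """dok: {(developer, element): value}; element_package: {element: package}.
    Returns {package: developer with the highest aggregated DOK}."""
    per_package = defaultdict(lambda: defaultdict(float))
    for (dev, elem), value in dok.items():
        per_package[element_package[elem]][dev] += value
    # label each package with the developer holding the most DOK in it
    return {pkg: max(devs, key=devs.get) for pkg, devs in per_package.items()}
```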
Case study 1: Finding Experts
Comparison to Expertise Recommenders
• The Expertise approach represents code familiarity based solely on authorship.
• Experts for each package are computed by summing all first authorship and delivery events from the last three months for a developer, over each class in the package (a sketch follows).
• Results:
  • In 49% of the cases, the DOK-based approach agreed with the developer assignments, whereas the Expertise approach agreed in only 24% of these cases.
  • DOK values can improve on existing approaches to finding experts.
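For contrast, here is a sketch of the authorship-only baseline as described above; the event representation and the 90-day approximation of "three months" are illustrative assumptions.

```python
from collections import defaultdict
from datetime import timedelta

def expertise_experts(events, class_package, now):
    """events: list of (developer, class, kind, timestamp), kind in {'FA', 'DL'}.
    Returns {package: developer with the most recent authorship events}."""
    cutoff = now - timedelta(days=90)  # "last three months", approximated
    counts = defaultdict(lambda: defaultdict(int))
    for dev, cls, kind, ts in events:
        if ts >= cutoff:               # only recent FA and DL events count
            counts[class_package[cls]][dev] += 1
    return {pkg: max(devs, key=devs.get) for pkg, devs in counts.items()}
```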
Case study 2: Onboarding
• Problem: a mentoring situation in which an experienced developer might use his or her DOK values to help a new team member become familiar with (onboard into) a part of the code base
• Method:
  • Three randomly chosen developers
  • For each developer, found the twenty elements with the highest DOK values and asked the developer to specify whether each element was likely to be helpful for a newcomer
• Results:
  • Only 3% of the elements were considered likely to be helpful for a newcomer.
  • The DOK values for the API elements were either very low or zero, as these elements were neither changing nor referred to frequently by the developers who authored them.
  • The elements with high DOK values were not considered helpful.
Case study 3: Identifying Changes of Interest
• Problem: investigated whether a developer's DOK values can be used to select changes of interest to the developer, based on the overlap between a source code change and the developer's DOK model
• Method:
  • A DOK model was computed for each of three developers from Site 2.
  • For each bug of interest, the developer specified whether they had read the bug or whether they would have wanted to be aware of it.
• Results:
  • The DOK model provided relevant information to developers in four out of six cases by recommending non-obvious bugs based on the developers' DOK values.
(A sketch of this overlap-based selection follows.)
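A minimal Python sketch of the overlap test: a bug is flagged for a developer when its changed elements include elements for which the developer's DOK is high. The threshold and data shapes are assumptions for illustration.

```python
def changes_of_interest(bugs, dok, developer, threshold=1.0):
    """bugs: {bug_id: set of changed elements}; dok: {(developer, element): value}."""
    flagged = []
    for bug_id, changed in bugs.items():
        overlap = [e for e in changed
                   if dok.get((developer, e), 0.0) > threshold]
        if overlap:  # the change touches code this developer knows well
            flagged.append((bug_id, overlap))
    return flagged
```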
Discussion and Future Work
• The DOK weighting experiment took place during a testing stage of the project:
  • Long-term studies are needed to better understand the impact of project phases on indicators such as DOK.
• The style of code ownership influences DOK values:
  • Studies of more teams are needed to determine how robust DOK values are to team and individual styles.
• Elements found using DOK are often one or two layers below the API elements:
  • Familiarity could be inferred from subclasses up to the API elements that are their supertypes, as a subclass author likely knows those API elements to some extent.
Questions
• How serious do you think the problem they are solving is? Give examples.
• Are there any other factors you would consider for learning the degree-of-knowledge (in addition to authorship and interactions)?
• What do you think of the method for learning the weights of the DOK model? Any alternatives?
• Beyond the case studies presented, what other cases could the DOK model be used for?