Crafting Code > Using forensic techniques for targeted refactoring > Riaan Cornelius
Who am I > More than a decade of software dev experience > Mobile app developer by day > Purveyor of strange topics by night > I’ve dabbled in AI, computer vision, robotics and even cooking > Please remember to rate my talk: http://www.devconf.co.za/rate
Why do we refactor? > As a developer, what is your job?
Why do we refactor? > Maintenance is expensive
The enemy of change > Complexity > If our job is to understand code, how do we make that job easier?
Some (potentially) useful tools > Static analysis > Complexity metrics > Code reviews > Tests
Tools I used > Git (specifically git log) > Code Maat > Python > D3.js (Javascript library)
Forget the tools > It’s not about the tools, but rather the techniques > These tools simplify some parsing, processing or visualisation > You can write your own scripts for any of these functions
Problems of scale > In large systems, how do you prioritise improvements?
The problem with complexity metrics > Complexity is only a problem if you need to deal with it
Offender profiling > You probably know something about offender profiling. > Hollywood loves it: • The Silence of the Lambs • Numb3rs • Criminal Minds • NCIS • Many more…
Offender profiling > There is one serious limitation: it only works in Hollywood
Geographic profiling > Based on statistics and psychology. > Same principle as a police officer sticking pins in a map
Applying geographical profiling to code > What if a hotspot analysis could narrow down areas of bad code?
Exploring the geography of code
Add a spatial component > Hopefully you all use a VCS. > We need to focus on areas with high developer activity
Add a spatial component > Extract the change history, then count revisions per file:
    git log --pretty=format:'[%h] %an %ad %s' --date=short --numstat > git.log
    maat.bat -l git.log -c git -a revisions > metric_data.csv
Combine complexity and effort
Profiling your codebase > Choose a timespan for your analysis > Get frequency data > Add complexity data > Merge complexity and effort > Visualise this data
Profiling your codebase > We’ll look at Hibernate ORM:
    git clone https://github.com/hibernate/hibernate-orm.git
Profiling your codebase > Choosing a timeframe > Don’t look at the whole life of the project > The timeframe you use depends on your development methodology • Between releases • Over iterations • Around significant events (reorganisation of code or teams)
Profiling your codebase > Generate a log:
    git log --pretty=format:'[%h] %an %ad %s' --date=short --numstat --before=2013-09-05 --after=2012-01-01 > hib_evo.log
Profiling your codebase > A summary of the changes shows some interesting things:
    prompt> maat -l hib_evo.log -c git -a summary
    statistic,value
    number-of-commits,1346
    number-of-entities,10193
    number-of-entities-changed,18258
    number-of-authors,89
Profiling your codebase > Analyzing change frequencies: > maat -l hib_evo.log -c git -a revisions > hib_freqs.csv
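If you would rather script the revision count yourself, it could look roughly like the sketch below. It assumes the log was produced with the `--pretty=format:'[%h] %an %ad %s' --numstat` options from the earlier slide. The function names and the `entity,revisions` header are made up for illustration and are not Code Maat's own output format. Renames and binary files are counted as-is.

```python
import re
from collections import Counter

# Commit header as produced by --pretty=format:'[%h] %an %ad %s'
HEADER = re.compile(r"^\[([0-9a-f]+)\] ")
# --numstat line: added<TAB>deleted<TAB>path ("-" is used for binary files)
NUMSTAT = re.compile(r"^(\d+|-)\t(\d+|-)\t(.+)$")


def count_revisions(log_lines):
    """Count how many commits touched each file in a git numstat log."""
    revisions = Counter()
    seen_in_commit = set()
    for raw in log_lines:
        line = raw.rstrip("\n")
        if HEADER.match(line):
            # A new commit starts: forget the files seen in the previous one.
            seen_in_commit = set()
            continue
        match = NUMSTAT.match(line)
        if match:
            path = match.group(3)
            if path not in seen_in_commit:
                seen_in_commit.add(path)
                revisions[path] += 1
    return revisions


def to_csv(revisions):
    """Render the counts as CSV text, most frequently changed files first."""
    lines = ["entity,revisions"]  # hypothetical header, chosen for this sketch
    for path, count in revisions.most_common():
        lines.append(f"{path},{count}")
    return "\n".join(lines)
```

Typical use: `print(to_csv(count_revisions(open("hib_evo.log"))))`.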
Profiling your codebase > Calculate complexity > Complexity by lines of code? > Bad metric, but no worse than others …
    cloc ./ --by-file --csv --quiet --report-file=hib_lines.csv
Profiling your codebase > Combine complexity and effort:
    python scripts/merge_comp_freqs.py hib_freqs.csv hib_lines.csv
    module,revisions,code
    build.gradle,79,402
    hibernate-core/.../persister/entity/AbstractEntityPersister.java,44,3983
    hibernate-core/.../cfg/Configuration.java,40,2673
    hibernate-core/.../internal/SessionImpl.java,39,2097
    hibernate-core/.../internal/SessionFactoryImpl.java,34,1384
    …
Profiling your codebase > Now we can finally get to the fun part: Visualisation > I’m using a sample D3.js circle-packing algorithm > Due to security restrictions in modern browsers, serve the page from a local web server:
    python -m SimpleHTTPServer 8888
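The merge step is a plain join on file name. The sketch below is not the `merge_comp_freqs.py` script from the talk, only a guess at what such a script might do. It assumes the frequency file has the path in the first column and the count in the second, and that the cloc file has `filename` and `code` columns with paths prefixed by `./`. All function names are hypothetical.

```python
import csv
import io


def read_revisions(fileobj):
    """Map path -> revision count from a two-column CSV with a header row."""
    reader = csv.reader(fileobj)
    next(reader, None)  # skip the header row
    return {row[0]: int(row[1]) for row in reader if len(row) >= 2}


def read_code_lines(fileobj):
    """Map path -> lines of code from a per-file cloc CSV report."""
    sizes = {}
    for row in csv.DictReader(fileobj):
        name = row.get("filename")
        if not name:
            continue  # skips the trailing SUM row, which has no file name
        if name.startswith("./"):
            name = name[2:]  # make paths comparable with the git log paths
        sizes[name] = int(row["code"])
    return sizes


def merge(revisions, sizes):
    """Join both maps and sort by revisions, then by size, descending."""
    rows = [(path, revs, sizes[path])
            for path, revs in revisions.items() if path in sizes]
    rows.sort(key=lambda r: (-r[1], -r[2]))
    return rows
```

Files that appear in only one of the two inputs (deleted files, or files cloc does not recognise) are dropped, which is usually what you want for a hotspot view.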
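D3's hierarchical layouts, circle packing included, expect nested data, so the flat module list has to become a tree first. A rough sketch of that conversion follows. The attribute names `size` (lines of code) and `weight` (revisions relative to the busiest file) are assumptions for this example. Use whatever names your visualisation script actually reads.

```python
import json


def build_hierarchy(rows):
    """Turn (path, revisions, code) rows into a nested dict for D3."""
    root = {"name": "root", "children": []}
    max_revs = max((revs for _, revs, _ in rows), default=1) or 1
    for path, revs, code in rows:
        node = root
        parts = path.split("/")
        # Walk or create one directory node per path segment.
        for part in parts[:-1]:
            child = next(
                (c for c in node["children"]
                 if c["name"] == part and "children" in c),
                None,
            )
            if child is None:
                child = {"name": part, "children": []}
                node["children"].append(child)
            node = child
        # The leaf is the file: circle size from code, colour from weight.
        node["children"].append(
            {"name": parts[-1], "size": code, "weight": revs / max_revs}
        )
    return root


def to_json(rows):
    """Serialise the hierarchy so the browser can load it."""
    return json.dumps(build_hierarchy(rows), indent=2)
```

Write the result to a file next to the HTML page and serve the directory locally. On Python 3 the equivalent of the command on the slide is `python -m http.server 8888`.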
Measuring complexity > Is there a simple option that is better than lines of code?
Measuring complexity >
    python scripts/complexity_analysis.py hibernate-core/src/main/java/org/hibernate/cfg/Configuration.java
    n, total, mean, sd, max
    3335, 8072, 2.42, 1.63, 14
Measuring complexity > You’ve already seen how to analyze a single revision. Now we want to:
    1. Take a range of revisions for a specific module.
    2. Calculate the indentation complexity of the module as it occurred in each revision.
    3. Output the results revision by revision for further analysis.
Measuring complexity >
    python scripts/git_complexity_trend.py --start ccc087b --end 46c962e --file hibernate-core/src/main/java/org/hibernate/cfg/Configuration.java
    rev, n, total, mean, sd
    e75b8a7, 3080, 7610, 2.47, 1.76
    23a6280, 3092, 7649, 2.47, 1.76
    8991100, 3100, 7658, 2.47, 1.76
    8373871, 3101, 7658, 2.47, 1.76
    …
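Indentation makes a cheap, language-neutral complexity proxy: the deeper a line is indented, the more nesting a reader has to keep in their head. The sketch below is not the `complexity_analysis.py` script itself. It assumes one level equals one tab or four spaces, ignores blank lines, and uses the population standard deviation, so its numbers may differ slightly from the ones on the slide.

```python
import math


def indentation_levels(text, spaces_per_level=4):
    """Return the logical indentation level of every non-blank line."""
    levels = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines carry no complexity
        expanded = line.replace("\t", " " * spaces_per_level)
        leading = len(expanded) - len(expanded.lstrip(" "))
        levels.append(leading // spaces_per_level)
    return levels


def complexity_stats(text, spaces_per_level=4):
    """Summarise indentation as n, total, mean, sd and max."""
    levels = indentation_levels(text, spaces_per_level)
    n = len(levels)
    if n == 0:
        return {"n": 0, "total": 0, "mean": 0.0, "sd": 0.0, "max": 0}
    total = sum(levels)
    mean = total / n
    # Population standard deviation (an assumption of this sketch).
    sd = math.sqrt(sum((x - mean) ** 2 for x in levels) / n)
    return {"n": n, "total": total, "mean": mean, "sd": sd, "max": max(levels)}
```

Typical use: `complexity_stats(open("Configuration.java").read())`. A high `max` with a low `mean` points at a few deeply nested islands, which are often the best refactoring targets.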
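The trend analysis repeats the single-file measurement for each historical version of the file. Here is a minimal sketch of that loop, assuming it runs inside the repository and that one indentation level is a tab or four spaces. It is not the `git_complexity_trend.py` script from the talk, and the function names are invented. Note that `start..end` in git excludes the start commit itself.

```python
import subprocess


def list_revisions(path, start, end):
    """Short hashes of commits in start..end that touched path, oldest first."""
    out = subprocess.check_output(
        ["git", "log", "--reverse", "--format=%h", f"{start}..{end}", "--", path],
        text=True,
    )
    return out.split()


def read_file_at(rev, path):
    """Return the content of path as it was in revision rev."""
    return subprocess.check_output(["git", "show", f"{rev}:{path}"], text=True)


def indentation_levels(text, spaces_per_level=4):
    """Logical indentation level of every non-blank line."""
    levels = []
    for line in text.splitlines():
        if line.strip():
            expanded = line.replace("\t", " " * spaces_per_level)
            leading = len(expanded) - len(expanded.lstrip(" "))
            levels.append(leading // spaces_per_level)
    return levels


def complexity_trend(revisions, path, read=read_file_at):
    """Return one (rev, n, total, mean) row per revision."""
    rows = []
    for rev in revisions:
        levels = indentation_levels(read(rev, path))
        n = len(levels)
        total = sum(levels)
        rows.append((rev, n, total, total / n if n else 0.0))
    return rows
```

Typical use: `complexity_trend(list_revisions(path, "ccc087b", "46c962e"), path)`. The `read` parameter exists only so the loop can be exercised without a repository.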
Visualising complexity trends
Going further
Resources > http://riaan.me/dc16 Twitter: @riaancornelius Please remember to rate my talk: http://www.devconf.co.za/rate
/* THANK YOU*/ Riaan Cornelius Entelect Software Riaan.Cornelius@Entelect.co.za 084 755 1866 http://www.devconf.co.za/