An Empirical Study about the Use of Optional Typing in Software Systems
Carlos Souza (carlosgsouza@gmail.com)
Eduardo Figueiredo (figueiredo@dcc.ufmg.br)
Java JavaScript
dynamically typed languages are becoming more popular
Untyped vs. Typed
What is the best type system?
"If someone claims to have the perfect programming language, he is either a fool or a salesman or both." (Bjarne Stroustrup)
How do programmers use different typing paradigms?
Outline: Introduction; Type Systems; Study Settings; Results and Discussion; Conclusion and Future Work
Type Systems
Typed Untyped
Typed: type checking
    String s = new File();
Typed: documentation
    int sum(Integer[] list)
Typed: tool integration
    int sum(Integer[] list)
    Documentation shown for Integer:
    public final class Integer extends Number implements Comparable<Integer>
    The Integer class wraps a value of the primitive type int in an object. An object of type Integer contains a single field whose type is int.
Untyped: simplicity
    def sum(v) {
        def result = 0
        v.each { result += it }
        return result
    }
Untyped: flexibility
    def byteValue(number) {
        return number.byteValue()
    }
    ...
    println byteValue(new Integer(1))
    println byteValue(new Double(1.0))
    println byteValue(new CarlosInt("1"))
Typed vs. Untyped
Typed: type checking, documentation, tool integration
Untyped: simplicity, flexibility
How do programmers use different typing paradigms?
Study Settings
A large scale empirical study with the goal of finding how programmers use optional typing
Fully typed:
    int sum(int a, int b) {
        int result = a+b
        return result
    }
Untyped:
    def sum(a,b) {
        def result = a+b
        return result
    }
Partially typed:
    int sum(a, int b) {
        def result = a+b
        return result
    }
Where Groovy programmers like to use types and where they don't
Q1: Do programmers use types more often in the interface of their modules?
Q2: Do programmers use types less often in test classes and scripts?
Q3: Does the experience of programmers with other languages influence their choice for typing their code?
Q4: Does the size, age or level of activity of a project have any influence on the usage of types?
Q5: In frequently changed code, do developers prefer typed or untyped declarations?
Dataset
6638 projects
4481 programmers
9.8M lines of code
1.5M declarations
60GB of data
Dataset: static code analyzer
For each declaration: kind, visibility, context, typed or untyped
Dataset → Relative Use of Types
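A minimal sketch of how a "relative use of types" figure could be derived from analyzer output. It assumes the metric is the fraction of typed declarations within a group; the slides do not give the exact formula. The record layout, the function name relative_type_usage and the sample values are hypothetical and are not taken from the study.

```python
from collections import defaultdict

# Hypothetical analyzer records: (kind, visibility, is_typed).
# These values are invented for illustration only.
declarations = [
    ("method", "public", True),
    ("method", "public", True),
    ("method", "private", False),
    ("field", "private", False),
    ("field", "public", True),
    ("local variable", "n/a", False),
]


def relative_type_usage(records, key_index):
    """Return the fraction of typed declarations per group.

    key_index selects the grouping column: 0 for kind, 1 for visibility.
    Assumption: relative usage = typed declarations / all declarations.
    """
    typed = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        group = record[key_index]
        total[group] += 1
        if record[2]:
            typed[group] += 1
    return {group: typed[group] / total[group] for group in total}


by_kind = relative_type_usage(declarations, 0)
by_visibility = relative_type_usage(declarations, 1)
```

Grouping by a column index keeps the same function usable for both the "by kind" and the "by visibility" views.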
Results and Discussion
Q1 Do programmers use types more often in the interface of their modules?
The interface of a module is composed of the fields, methods and constructors that are public or protected
The interface of a module does NOT include local variables or private declarations
[Figures: Relative Usage of Types by Declaration Kind; Relative Usage of Types by Declaration Visibility]
Are these results significantly different?
H0: There is no difference in how programmers type different kinds of declarations
ANOVA + Tukey Honestly Significant Differences
α = 0.001
confidence = 99.9%
ANOVA reported p = 0 (to the precision reported), which is below α
[Figures: Relative Usage of Types by Declaration Kind; Relative Usage of Types by Declaration Visibility]
Q1 Yes, programmers use types more often in the interface of their modules
programmers use types to document their modules
documentation is less necessary for local variables and for private methods and fields
Why are fields mostly untyped?
[Figure: Relative Usage of Types by Declaration Kind; legend: private, protected, public]
Why are constructors and protected declarations typed so often?
[Figures: Relative Usage of Types by Declaration Kind; Relative Usage of Types by Declaration Visibility]
constructors are important elements of a module definition
protected declarations establish complex contracts
Q2 Do programmers use types less often in test classes and scripts?
Q2.a Do programmers use types less often in test classes?
Factorial ANOVA + Tukey Honestly Significant Differences
Q2.a Yes, programmers use types less often in test classes
Q2.b Do programmers use types less often in scripts?
Q2.b Yes, programmers use types less often in scripts
maintainability is not a concern in these scenarios
Q3 Does the experience of programmers with other languages influence their choice for typing their code?
Programmer groups: static only; dynamic only; static and dynamic
Q3 Yes, programmers in the static-only group use types more often than the others
programmers get used to the lack of types
Q4 Does the size, age or level of activity of a project have any influence on the usage of types?
Spearman rank correlation between type usage and: age (days), size (LoC), activity (commits)
Do we see any influence on mature projects?
a project is considered mature if it has more than
100 days
2000 lines of code
100 commits
we have 223 mature projects
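The maturity criterion above can be written as a simple predicate. This is a sketch only: it assumes "more than" applies strictly to all three thresholds, and the Project class, its field names and the sample projects are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Project:
    # Hypothetical record; field names are not taken from the study's tooling.
    name: str
    age_days: int
    lines_of_code: int
    commits: int


def is_mature(project):
    """True if the project exceeds all three thresholds from the slide."""
    return (
        project.age_days > 100
        and project.lines_of_code > 2000
        and project.commits > 100
    )


# Invented sample projects, for illustration only.
projects = [
    Project("old-and-active", age_days=400, lines_of_code=12000, commits=350),
    Project("young", age_days=30, lines_of_code=5000, commits=200),
    Project("small", age_days=500, lines_of_code=800, commits=150),
]
mature = [p.name for p in projects if is_mature(p)]
```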
ANOVA reported p = 0.05
Q4 No, the size, age and level of activity have no influence on the usage of types
Q5 In frequently changed code, do developers prefer typed or untyped declarations?
Spearman rank correlation between the commits of a file and the type usage in that file
File                  Commits  Type Usage
Car.groovy            10       0.1
Wheel.groovy          15       0.2
Engine.groovy         16       0.21
Transmission.groovy   17       0.23
Radio.groovy          38       0.44
Spearman rank correlation = 1.0
File                  Commits  Type Usage
Car.groovy            10       0.9
Wheel.groovy          15       0.82
Engine.groovy         16       0.21
Transmission.groovy   17       0.14
Radio.groovy          38       0.04
Spearman rank correlation = -1.0
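The two example tables can be checked with a small, dependency-free sketch of the Spearman rank correlation, computed here as the Pearson correlation of the ranks. The function names are illustrative and this is not the tooling used in the study; the input numbers are the ones from the two tables.

```python
from math import sqrt


def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


commits = [10, 15, 16, 17, 38]
usage_rising = [0.1, 0.2, 0.21, 0.23, 0.44]    # first table
usage_falling = [0.9, 0.82, 0.21, 0.14, 0.04]  # second table

rho_rising = spearman(commits, usage_rising)
rho_falling = spearman(commits, usage_falling)
```

Because only the ordering matters, the first table gives a coefficient of 1.0 and the second gives -1.0, even though neither relationship is linear.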
12% 30%
Q5 In most cases, developers prefer untyped declarations in frequently changed code
Conclusion and Future Work
Groovy programmers use types more often in declarations that define the interface of modules
Types are used less frequently in test classes and script files
Programmers who have only developed in statically typed languages use types more often than other programmers
The size, age and level of activity of a project have no influence on how programmers use types
In frequently changed files, untyped declarations are more popular
Provides real data that can be used by developers of programming languages and tools
Helps in understanding the tradeoffs between different typing paradigms
Complements experimental studies with a different point of view
Raises new questions that can inspire other researchers
Future Work
Measuring the impact of compile, runtime and unit test error messages on maintenance time