What Java Developers (don’t) know about API compatibility, and why this matters
Jens Dietrich, May 2014
http://tinyurl.com/brokenapi
About the Author
● background (PhD): non-classical logic, very formal
● 1996-2003: no research, work in industry and development aid
● 2003-2012:
○ models, algorithms and tools to understand and improve the design and architecture of systems (xplrarc.massey.ac.nz)
○ exploring better ways to safely assemble systems from modules
Current Research Topics
● fast algorithms and effective data structures for the computation of transitive closure and CFL-reachability in sparse graphs (contract research for Oracle Labs)
● understanding circular dependencies between modules
● statistical oracles for performance regression testing
● API / component compatibility and evolution
Motivation
● relevant to industry
● under-researched topic
● trend towards more empirical studies in SE
● combination of software evolution and composition
Overview
● background
● puzzlers
● survey
● impact analysis and repository studies
● roadmap
Background
● applications are neither monolithic nor static
● they are composed using APIs provided by libraries; both the libraries and the applications themselves change
● when is the evolution of libraries safe?
● this depends on:
○ the compatibility of changes made
○ how applications are built and deployed
● consider the Java platform (language and JVM), but findings can be generalised
Building and Deploying Java Applications
1. build (compile and test) programs with all libraries, deploy together: mainstream, good tool support (Ant, Maven, Jenkins, ..)
2. build and upgrade libs only: facilitates 24/7 applications, approach used in OSGi, app servers, JNLP
Types of Compatibility
Program P uses API in library L
[diagram: library versions L1 and L2 (provider) and program P (consumer), connected by an API contract]
horizontal compatibility (h-comp(L,P)): L is compatible with P
vertical compatibility (v-comp(L1,L2)): if L1 is h-compatible with P and L1 is v-compatible with L2, then L2 is h-compatible with P, for all P
Compatibility in Java
P is source compatible with L if P can be compiled with L
P is binary compatible with L if P can be linked with L
vertical only: L1 is behavioural compatible with L2 if the observable behaviour of P linked with either L1 or L2 is the same (contextual w.r.t. all or some P)
Binary Compatibility (v-Version)
A change to a type is binary compatible with pre-existing binaries if pre-existing binaries that previously linked without error will continue to link without error. [JLS 7, sect 13.1]
Evolution Problem
Given a program P and two versions of a library L:
1. if P is source/binary compatible with L-v1, is it still source/binary compatible with L-v2 ?
2. Is L-v1 behavioural compatible with L-v2?
Evolution Problems as Deployment Puzzlers
● inspired by Bloch/Gafter’s famous book
● 18 problems (May 2014)
● slides: http://www.slideshare.net/JensDietrich/presentation-30367644
● code: https://bitbucket.org/jensdietrich/java-library-evolution-puzzlers
● discovery of bug in JLS-7
Structure of a Puzzler
● provide a program P consisting of an executable class <somepackage>.Main
● Main uses a class lib.<somepackage>.Foo defined in a library lib-1.0.jar
● there is a modified class lib.<somepackage>.Foo defined in a library lib-2.0.jar
Structure of a Puzzler (ctd)
● experiments:
○ compile P with lib-1.0.jar and then run it with lib-1.0.jar
○ compile P with lib-1.0.jar and then run it with lib-2.0.jar
○ compile P with lib-2.0.jar and then run it with lib-2.0.jar
● automated with build script
Source Compatibility !=> Binary Compatibility
lib version 1.0:
public class Foo {
    public static java.util.Collection getColl() {
        return new java.util.ArrayList();
    }
}
lib version 2.0:
public class Foo {
    public static java.util.List getColl() {
        return new java.util.ArrayList();
    }
}
client program using lib:
public class Main {
    public static void main(String[] args) {
        java.util.Collection coll = Foo.getColl();
        System.out.println(coll);
    }
}
specialising return type (strengthen postcondition!)
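The effect of this puzzler can be approximated inside a single program. The sketch below is not the author's puzzler code: FooV1 and FooV2 are hypothetical stand-ins for the two library versions, and a java.lang.invoke lookup plays the role of the linker, since it also requires the requested return type to match the declared one exactly.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

final class ReturnTypeLinkDemo {

    // Hypothetical stand-in for Foo in lib version 1.0.
    static class FooV1 {
        public static Collection<Object> getColl() { return new ArrayList<>(); }
    }

    // Hypothetical stand-in for Foo in lib version 2.0: return type specialised to List.
    static class FooV2 {
        public static List<Object> getColl() { return new ArrayList<>(); }
    }

    // Tries to resolve getColl() using the signature a client compiled against 1.0 refers to.
    static boolean resolves(Class<?> lib) {
        MethodType compiledAgainstV1 = MethodType.methodType(Collection.class);
        try {
            MethodHandles.lookup().findStatic(lib, "getColl", compiledAgainstV1);
            return true;
        } catch (NoSuchMethodException | IllegalAccessException e) {
            return false; // no method with exactly this name and type
        }
    }

    public static void main(String[] args) {
        System.out.println("resolves against 1.0: " + resolves(FooV1.class));
        System.out.println("resolves against 2.0: " + resolves(FooV2.class));
    }
}
```

The client source still compiles against either version, because a List can be assigned to a Collection variable; only the lookup by exact type fails.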
Binary Compatibility !=> Source Compatibility
lib version 1.0:
import java.util.ArrayList;
import java.util.List;
public class Foo {
    public static List<String> getList() {
        List<String> list = new ArrayList<String>();
        list.add("42");
        return list;
    }
}
lib version 2.0:
import java.util.ArrayList;
import java.util.List;
public class Foo {
    public static List<Integer> getList() {
        List<Integer> list = new ArrayList<Integer>();
        list.add(42);
        return list;
    }
}
client program using lib:
import java.util.List;
public class Main {
    public static void main(String[] args) {
        List<String> list = Foo.getList();
        System.out.println(list.size());
    }
}
Binary Compatibility !=> Behavioural Comp.
lib version 1.0:
import java.util.ArrayList;
import java.util.List;
public class Foo {
    public static List<String> getList() {
        List<String> list = new ArrayList<String>();
        list.add("42");
        return list;
    }
}
lib version 2.0:
import java.util.ArrayList;
import java.util.List;
public class Foo {
    public static List<Integer> getList() {
        List<Integer> list = new ArrayList<Integer>();
        list.add(42);
        return list;
    }
}
client program using lib:
import java.util.List;
public class Main {
    public static void main(String[] args) {
        List<String> list = Foo.getList();
        for (String s : list) {
            System.out.println(s);
        }
    }
}
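What the client observes after such an upgrade can be reproduced in one file. This is a minimal sketch, assuming an unchecked cast is an acceptable substitute for swapping the library jar; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureBehaviourDemo {

    // A list of Integers reached through a List<String> reference. At runtime the
    // element type is erased, so creating this reference does not fail.
    @SuppressWarnings("unchecked")
    static List<String> pollutedList() {
        List<Integer> ints = new ArrayList<>();
        ints.add(42);
        return (List<String>) (List<?>) ints;
    }

    // Only touches the list itself, never an element: works fine.
    static int sizeOnly() {
        return pollutedList().size();
    }

    // Reads elements as String: the compiler-inserted cast fails at runtime.
    static boolean iterationFails() {
        try {
            for (String s : pollutedList()) {
                System.out.println(s);
            }
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("size: " + sizeOnly());
        System.out.println("iteration throws ClassCastException: " + iterationFails());
    }
}
```

This mirrors the difference between the two client programs: calling size() never touches an element, while the for-each loop assigns each element to a String variable.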
How Bizarre ...
lib version 1.0:
import java.io.Serializable;
class Foo<T extends Serializable & Comparable> {
    public void foo(T t) {
        t.compareTo("");
        System.out.println(t);
    }
}
lib version 2.0:
import java.io.Serializable;
class Foo<T extends Comparable & Serializable> {
    public void foo(T t) {
        t.compareTo("");
        System.out.println(t);
    }
}
client program using lib:
public class Main implements java.io.Serializable {
    public static void main(String[] args) {
        Main m = new Main();
        new Foo().foo(m);
    }
}
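One way to see why swapping the two bounds matters is to inspect the erased signature of foo. The Java language erases a type variable to its leftmost bound, so the parameter type in the class file differs between the two versions. The sketch below uses reflection on two hypothetical stand-in classes (FooV1, FooV2); it is an illustration, not the author's analysis.

```java
import java.io.Serializable;
import java.lang.reflect.Method;

final class LeftmostBoundDemo {

    // Hypothetical stand-in for Foo in lib version 1.0.
    @SuppressWarnings("rawtypes")
    static class FooV1<T extends Serializable & Comparable> {
        public void foo(T t) { }
    }

    // Hypothetical stand-in for Foo in lib version 2.0: bounds swapped.
    @SuppressWarnings("rawtypes")
    static class FooV2<T extends Comparable & Serializable> {
        public void foo(T t) { }
    }

    // Returns the erased (runtime) parameter type of the method named foo.
    static Class<?> erasedParameterOfFoo(Class<?> c) {
        for (Method m : c.getDeclaredMethods()) {
            if (m.getName().equals("foo")) {
                return m.getParameterTypes()[0];
            }
        }
        throw new IllegalStateException("no foo method in " + c);
    }

    public static void main(String[] args) {
        System.out.println("1.0: foo(" + erasedParameterOfFoo(FooV1.class).getName() + ")");
        System.out.println("2.0: foo(" + erasedParameterOfFoo(FooV2.class).getName() + ")");
    }
}
```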
Constant Inlining
lib version 1.0:
public class Foo {
    public static final int MAGIC = 42;
}
lib version 2.0:
public class Foo {
    public static final int MAGIC = 43;
}
client program using lib:
public class Main {
    public static void main(String[] args) {
        System.out.println(Foo.MAGIC);
    }
}
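Inlining can be observed without a second jar. A primitive static final field with a constant initialiser is copied into the code that reads it, so reading it does not even initialise the declaring class. The sketch below relies on that side effect; the nested Foo class and the BOXED field are hypothetical additions for illustration.

```java
final class ConstantInliningDemo {

    // Set by Foo's static initialiser, i.e. when the JVM actually initialises Foo.
    static boolean fooInitialised = false;

    static class Foo {
        static { fooInitialised = true; }
        public static final int MAGIC = 42;      // compile-time constant: inlined by the compiler
        public static final Integer BOXED = 42;  // not a constant: read from the class at runtime
    }

    private static boolean[] observed;

    // Returns { Foo initialised after reading MAGIC?, Foo initialised after reading BOXED? }.
    // Computed once, because class initialisation cannot be repeated.
    static synchronized boolean[] observe() {
        if (observed == null) {
            int magic = Foo.MAGIC;
            boolean afterMagic = fooInitialised;
            Integer boxed = Foo.BOXED;
            boolean afterBoxed = fooInitialised;
            observed = new boolean[] { afterMagic, afterBoxed };
        }
        return observed;
    }

    public static void main(String[] args) {
        boolean[] r = observe();
        System.out.println("Foo initialised after reading MAGIC: " + r[0]);
        System.out.println("Foo initialised after reading BOXED: " + r[1]);
    }
}
```

Because the value of MAGIC lives in the client's own bytecode, replacing the library with a version that declares a different value has no effect until the client is recompiled.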
How Linking Works (for Methods)
● the linker uses the method descriptor
● method descriptors do not contain type parameters
● method descriptors do not contain throws clauses
● descriptors have to match exactly (no reasoning performed by the linker)
● types that are handled as compatible by the compiler (boxing/unboxing) are strictly different in byte code
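The points above can be made concrete by printing a few descriptors. The sketch uses java.lang.invoke.MethodType from the JDK to render descriptor strings; the class name DescriptorDemo and the helper descriptor are hypothetical.

```java
import java.lang.invoke.MethodType;
import java.util.Collection;
import java.util.List;

final class DescriptorDemo {

    // Renders the JVM descriptor for a method with the given return and parameter types.
    static String descriptor(Class<?> returnType, Class<?>... params) {
        return MethodType.methodType(returnType, params).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // Specialising the return type changes the descriptor.
        System.out.println(descriptor(Collection.class));
        System.out.println(descriptor(List.class));
        // int and Integer are interchangeable for the compiler, not for the linker.
        System.out.println(descriptor(void.class, int.class));
        System.out.println(descriptor(void.class, Integer.class));
    }
}
```

Type parameters cannot even be expressed here: List<String> and List<Integer> are both represented by List.class, which is why a change of type argument alone leaves the descriptor untouched.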
Why ?
● ensuring binary backwards compatibility is a major objective for JDK evolution
● bytecode kept stable, JVM innovation focuses on maintainability, stability and scalability
● language evolution focuses on programmer productivity
How Java Evolves
● language innovation by adding syntactic sugar
● ensure byte code compatibility, perhaps add features (invokedynamic), but don’t modify existing ones
● compiler magic:
○ generic types => erasure
○ inner classes => synthetic methods to bypass encapsulation
○ covariant return types => return type overloading
○ reference vs value types => auto-(un)-boxing
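Two of these compiler techniques are visible through reflection. The following sketch, with hypothetical classes Base and Sub, shows that a covariant override produces a second, compiler-generated method with the original return type, and that two differently parameterised lists share one runtime class.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;

final class CompilerMagicDemo {

    static class Base {
        public Object get() { return "base"; }
    }

    // Covariant override: the return type is narrowed from Object to String.
    static class Sub extends Base {
        @Override
        public String get() { return "sub"; }
    }

    // Counts methods named get declared in Sub; with bridgesOnly, only compiler-generated bridges.
    static int countGetMethods(boolean bridgesOnly) {
        int n = 0;
        for (Method m : Sub.class.getDeclaredMethods()) {
            if (m.getName().equals("get") && (!bridgesOnly || m.isBridge())) {
                n++;
            }
        }
        return n;
    }

    // Erasure: both lists have exactly the same runtime class.
    static boolean sameRuntimeClass() {
        return new ArrayList<String>().getClass() == new ArrayList<Integer>().getClass();
    }

    public static void main(String[] args) {
        System.out.println("get methods in Sub: " + countGetMethods(false));
        System.out.println("of which bridges: " + countGetMethods(true));
        System.out.println("same runtime class: " + sameRuntimeClass());
    }
}
```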
How do Developers Cope ?
from https://www.youtube.com/watch?v=M7FIvfx5J10 , Standard YouTube Licence
Study Repositories
1. Study program evolution in the qualitas corpus data set, find incompatible upgrades, and the impact this has on other programs.
2. Study Maven dependencies to find out whether incompatibilities are introduced by recursive dependency resolution.
Dietrich J., Jezek K., Brada P.: Broken Promises - An Empirical Study into Evolution Problems in Java Programs Caused by Library Upgrades. CSMR-WCRE'14.
Jezek K., Dietrich J.: On the Use of Static Analysis to Safeguard Recursive Dependency Resolution. Accepted for SEAA 2014.
CSMR’14 Study
● studied 111 programs, 661 versions (Qualitas Corpus)
● incompatible API upgrades are common in real-world libraries: 344/455 (75%) of upgrades in data set have incompatibilities
● commodity libraries (antlr, ant, hibernate, weka, colt, junit, jung) are affected
● many constants that are inlined are then changed
● a lot of potential, but only a few actual problems
SEAA’14 Study
● study on transitive dependencies of 1902 Maven modules in Qualitas Corpus
● 367 modules depend on multiple versions of other modules
● common issues include:
○ incompatibilities between multiple versions
○ redundant (unused) dependencies
○ missing dependencies
What do Developers Know ?
● developers we asked were puzzled about our puzzlers
● idea: turn this into a survey !
● problem: how to convince people to participate?
How not to do it
© Tan Yuen, http://www.gaolat.com/2013/10/vivo-pizza-free-pizzas-for-students.html
Challenges
● we are asking developers for a lot of time (> 1 hour)
● i.e., we are asking them for a lot of money
● but we can offer them some value
● they will learn something !
Recruiting Participants
● JavaWorld
● Melbourne JUG
● several Czech JUGs
● NZ Industry Partners (Orion, SolNet, Kiwiplan)
● ex-students
● old boys networks: Stuttgart Java User Group, German Telecom
Acknowledgements
Kamil Jezek (Uni Western Bohemia), Jeff Friesen, Athen O’Shea (Java World), Kon Soulianidis (Melbourne JUG), Gareth Cronin (Orion), Manfred Duchrow (MD Consulting), Jochen Hiller (German Telecom), ...
Survey Design
● hosted on SurveyMonkey
● 7 background questions (experience)
● full survey: 21 standard puzzlers with 2 questions each, and 4 constant inlining puzzlers with 1 question per puzzler - a total of 46 technical questions
● short survey: 9 standard and 4 constant inlining puzzlers, a total of 22 technical questions
Standard Puzzler Q1
Q1 Can the version of the client program compiled with lib-1.0.jar be executed with lib-2.0.jar ?
a) no, an error occurs
b) yes, but the behaviour of the program is different from the program version compiled and executed with lib-1.0.jar
c) yes, and the behaviour of the program is the same as the program version compiled and executed with lib-1.0.jar
Standard Puzzler Q2
Q2 Can the client program be compiled and then executed with lib-2.0.jar ?
a) no, compilation fails
b) yes, but the behaviour of the program is different from the program version compiled and executed with lib-1.0.jar
c) yes, and the behaviour of the program is the same as the program version compiled and executed with lib-1.0.jar
Defining Behaviour Change
● avoid references to JLS (most developers won't understand this)
● “a behaviour change is either a different console output or a situation where the execution of one program version throws an exception, but the other program version does not”
Inlining Puzzler
Only one question is asked:
Whether 42 or 43 is printed to the console when the program compiled with version 1.0 of the library is executed with version 2.0 of this library.
I.e., whether a constant is inlined or not.
Conducting the Survey
● open between 15 November and 31 December 2013
● 184 respondents started the short version
● 241 respondents started the full version
● we asked respondents doing the full survey whether they had already completed the short survey - 11 answered yes
● 414 unique respondents
● between 49 and 295 valid responses for technical questions
Level of Experience with Java
Years of Experience with Java
Occupation
Results: Standard Puzzlers, Q1
peaks: questions in short and full survey
Results: Standard Puzzlers, Q1
● 51% of answers were correct
● only 27% correct for simple example like specialising return type !!
● only 39% correct for boxing example (difference between int and Integer)
Results: Standard Puzzlers, Q2
● much better: 76% correct
● i.e., developers have a good understanding of how the compiler works
Inlining Puzzlers
Inlining Puzzlers (ctd)
● only 52% of answers correct
● likely to be lower:
○ result for Q4 misleading
○ many answers might be accidentally correct
Correct Answers by Year of Experience
Correct Answers by Level of Experience
Summary
Houston, we have a problem!
© Imagine Entertainment
also (I do love demotivational posters ..)
How Big is the Problem ?
● some answers in CSMR/SEAA studies
● similar studies by Steven Raemaekers, Delft
● can we find actual problems documented by developers?
● look in issue tracking systems !
A Simple Idea
● binary incompatibility results in linkage errors
● these errors are represented by Java classes (with rather unique, fully qualified names)
● these names appear in error logs / stack traces
● google for them in issue tracking systems !
● use the google query operators site: and inurl:
● less is more vs simple != academic
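The search idea can be sketched as a small query builder. The exact queries used in the study are not given on the slides, so the query shape, the site and URL values, and the selection of error classes below are all assumptions made for illustration.

```java
final class LinkageErrorQueries {

    // A few java.lang errors that signal failed linking; the selection is illustrative only.
    static final Class<?>[] LINKAGE_ERRORS = {
        NoSuchMethodError.class,
        NoSuchFieldError.class,
        NoClassDefFoundError.class,
        IncompatibleClassChangeError.class,
        AbstractMethodError.class
    };

    // Assumed query shape: site: restricts the host, inurl: restricts the URL path,
    // and the quoted fully qualified class name is the search term.
    static String query(String site, String urlPart, Class<?> error) {
        return "site:" + site + " inurl:" + urlPart + " \"" + error.getName() + "\"";
    }

    public static void main(String[] args) {
        for (Class<?> error : LINKAGE_ERRORS) {
            System.out.println(query("github.com", "issues", error));
        }
    }
}
```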
Interpreting Results
● 1,700 pages in the GitHub issue tracking system contain java.lang.NoSuchMethodError
● so what?
● might be duplicated issues, long discussions etc
● false positives and false negatives
● absolute values are meaningless
● baselining: but we can investigate occurrence relative to common Java errors and exceptions
Common Java Exceptions and Errors
● again, use google to establish this
● candidate set: Exception and Error subclasses in java.lang
● result:
○ NullPointerException
○ ClassCastException
○ StackOverflowError
○ OutOfMemoryError
Results
● problems related to binary compatibility are surprisingly common
● NoSuchMethodErrors are more common than StackOverflowErrors and OutOfMemoryErrors !
Roadmap: Engineering Solutions
● compatibility is complex, and compatibility contracts must be kept simple to be used
○ semantic versioning (semver.org)
○ treaty (RDF-based)
● better tools to enforce contracts:
○ patch ups: clirr and co - we can do better !!
○ smarter JVMs (linkers)
● combine both: use tools to compute semantic versions !
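The last point, computing semantic versions with tools, could look roughly like this once an analysis has classified the change between two library versions. The classification itself is the hard part and is not shown; the enum and method names are hypothetical, and only the bump rules follow semver.org.

```java
final class SemVerSketch {

    // Hypothetical outcome of a compatibility analysis between two library versions.
    enum Change { INCOMPATIBLE, COMPATIBLE_ADDITION, COMPATIBLE_FIX }

    // Computes the next version for a plain "major.minor.patch" string.
    static String next(String version, Change change) {
        String[] parts = version.split("\\.");
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        int patch = Integer.parseInt(parts[2]);
        switch (change) {
            case INCOMPATIBLE:
                return (major + 1) + ".0.0";               // breaking API change
            case COMPATIBLE_ADDITION:
                return major + "." + (minor + 1) + ".0";   // new, backwards compatible API
            default:
                return major + "." + minor + "." + (patch + 1); // compatible fix
        }
    }

    public static void main(String[] args) {
        System.out.println(next("1.4.2", Change.INCOMPATIBLE));
        System.out.println(next("1.4.2", Change.COMPATIBLE_ADDITION));
        System.out.println(next("1.4.2", Change.COMPATIBLE_FIX));
    }
}
```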
Status (May 2014)
● project started in Sept 2013
● CSMR and SEAA conference papers
● JLS bug discovered, reported and accepted
● survey paper submitted (ICSM)
● invited journal paper to generalise findings from CSMR paper (Elsevier IST)
● 2014 Hon project to port puzzlers to .NET
Future Plans
● build better semantic versioning tools (with P. Brada and K. Jezek): plugins for semantics, qos, .. (end of 2014)
● repository study: how meaningful are versioning schemes? (end of 2014)
● PhD starting to build smarter JVM linker based on Jikes or Maxine (starts July 2014)
● taxonomy of compatibility problems (2015)
● give some talks at JUGs !
QUESTIONS ?