27.6.2016 Yes We Scan Exploring Libraries · jQAssistant
http://jqassistant.org/yeswescanexploringlibraries/ 1/13
Yes We Scan – Exploring Libraries
24 September 2015
One of the talks I’m currently giving at conferences (e.g. BED-Con 2015, Java Forum Nord 2015, JDD Krakow 2015) is about
exploring 3rd party libraries which are only available as fully packaged artifacts like JAR files. The intro slides can be found
here.
The first part of the live demo shows how API changes between two different versions of the same library can be detected;
the approach is already described in one of my earlier blog posts.
Let’s now concentrate on the second part, which is about finding potential hotspots and structural problems. For this I’m
going to use the core libraries of well-known JPA implementations: Hibernate, OpenJPA and EclipseLink. This post is quite
long, so here’s what we’re going to cover:
Scanning artifacts
Basic statistics
Deprecations
Thrown exceptions
Type metrics
Method metrics
Package metrics
Prerequisites
All you need is a Java 7 runtime environment and the command line distribution of jQAssistant, which is available as a ZIP file.
In the following examples the variable JQASSISTANT_HOME points to the directory which is created after unpacking.
Scanning artifacts
First of all the libraries need to be scanned by jQAssistant, so they need a place on our hard disk or SSD – let’s copy
them all into a directory called “jpa”:
jpa/hibernate-core-3.6.6.Final.jar
jpa/hibernate-core-4.3.8.Final.jar
jpa/openjpa-2.3.0.jar
jpa/org.eclipse.persistence.core-2.6.0.jar
As you can see the Hibernate Core JAR is present in two different versions – we’re interested to see if we can detect some
changes between them.
Now let’s trigger the scanner:
$JQASSISTANT_HOME/bin/jqassistant.sh scan -f jpa
%JQASSISTANT_HOME%\bin\jqassistant.cmd scan -f jpa
We see some output like this:
Entering C:/Development/projects/YesWeScan/jpa
Entering /hibernate-core-3.6.6.Final.jar
Leaving /hibernate-core-3.6.6.Final.jar (2114 entries, 29843 ms)
Entering /hibernate-core-4.3.8.Final.jar
Leaving /hibernate-core-4.3.8.Final.jar (3710 entries, 41287 ms)
Entering /openjpa-2.3.0.jar
Leaving /openjpa-2.3.0.jar (2221 entries, 29522 ms)
Entering /org.eclipse.persistence.core-2.6.0.jar
Leaving /org.eclipse.persistence.core-2.6.0.jar (2276 entries, 29583 ms)
Leaving C:/Development/projects/YesWeScan/jpa (4 entries, 134318 ms)
(Don’t worry about the logged times: it will be much faster on your machine. At the time of writing this blog post I’m sitting on a
bus with my notebook in energy-saving mode…)
Now it’s time to start the integrated Neo4j server and execute some queries:
$JQASSISTANT_HOME/bin/jqassistant.sh server
%JQASSISTANT_HOME%\bin\jqassistant.cmd server
This makes the Neo4j browser available in our web browser at the URL http://localhost:7474.
Basic statistics
As a little warm-up let’s get some statistics on how many Java classes are contained in each of the scanned artifacts. Enter the
following query in the top area of the Neo4j browser and hit Ctrl-Enter:
match (a:Artifact)-[:CONTAINS]->(t:Type) return a.fileName, count(t)
The following result appears:
We can see that all JARs contain more or less the same number of types except Hibernate 4, which has grown significantly
over its predecessor. Let’s assume that they provide more or less the same functionality.
Getting back to the query we see that the label “Type” has been used. Didn’t we look for classes? Actually Java defines types
which can be classes, interfaces, enumerations or annotations. Therefore jQAssistant always adds two labels to a node
representing a scanned “.class” file: “Type” and a label which classifies the type further, i.e. “Class”, “Interface”, “Enum” or
“Annotation”. If we only wanted to know the number of class types per artifact the query would look like this:
match (a:Artifact)-[:CONTAINS]->(t:Type:Class) return a.fileName, count(t)
We can also find out which types are required by these libraries, i.e. which are referenced by contained types but not
available in the JARs:
match (a:Artifact)-[:REQUIRES]->(t:Type) return a.fileName, t.fqn
Note that for all types which are required by an artifact there’s no information available on whether it is a class, interface,
enumeration or annotation. Therefore these nodes only carry the label “Type” with the property “fqn” identifying the fully
qualified name.
Deprecations
As developers we might be interested in knowing if there are any deprecations on type or method level right before we
start using a newer version of a library. Let’s start with deprecated types per artifact. On code level these are identified by
“the presence of an annotation of type @java.lang.Deprecated”:
match
  (a:Artifact)-[:CONTAINS]->(t:Type),
  (t)-[:ANNOTATED_BY]->(:Annotation)-[:OF_TYPE]->(d:Type)
where
  d.fqn="java.lang.Deprecated"
return
  a.fileName, count(t)
There are two interesting facts to be noticed:
1. The number of deprecated types has doubled from Hibernate 3 to 4.
2. OpenJPA has no deprecated types – either the APIs didn’t change or the project does not work with deprecations.
Besides the statistics we are interested in the actual types that have been deprecated – for this we only need to change
the return clause of the last query:
... return a.fileName, t.fqn
Or for a more compact result:
... return a.fileName, collect(t.fqn)
Now let’s drill down to method level, again gathering some statistics first:
match
  (a:Artifact)-[:CONTAINS]->(t:Type)-[:DECLARES]->(m),
  (m)-[:ANNOTATED_BY]->(:Annotation)-[:OF_TYPE]->(d:Type)
where
  d.fqn="java.lang.Deprecated"
return
  a.fileName, count(m)
The result of the query reveals that OpenJPA actually works with deprecations but only on method level.
Again the return clause may be changed to return the actual methods and their declaring types:
... return a.fileName, t.fqn, m.signature
Thrown Exceptions
In this section we’re going to find out which kinds of exceptions (or better: Throwables) may be thrown by the libraries under
inspection. To do so we need to solve two little problems.
First of all we need to identify which types are actually exceptions. Usually we would try to find all types which directly or
indirectly inherit from “java.lang.Throwable”:
match (e:Type)-[:EXTENDS*]->(t:Type) where t.fqn = "java.lang.Throwable" return e.fqn
In our case this won’t work as the full inheritance hierarchy of throwables is not available in our database. As an example
take a method declared by a type within our libraries which throws a java.lang.IllegalArgumentException. There’s a node
representing IllegalArgumentException in the database but there’s no information available about its supertype.
For that we would have needed to scan the file rt.jar of the Java Runtime Environment as well. But the JRE is not the
only library which is missing; there could also be exception types coming from other libraries. So the appropriate solution
would be scanning all dependencies – but this would make the queries for our analysis more difficult because we would
need to add more filters.
Let’s take another approach which is a bit unsafe but sufficient for our case: exception types usually have “Exception” as
a suffix in their names, e.g. “IllegalArgumentException”. Let’s execute the following query:
match (e:Type) where e.fqn=~ ".*Exception" set e:Exception return e.fqn order by e.fqn
It might be confusing that the result contains the same exception type more than once (e.g. antlr.ANTLRException). The
explanation for that is that a node is created for each artifact which requires a specific exception type – obviously two of our
inspected libraries depend on ANTLR so we see two nodes.
If we inspect the query a bit further we notice that it contains a clause “set e:Exception” – we’re adding a label “Exception” to
each type which has been identified as an exception type according to our naming heuristic. The idea behind this is to make
further queries easier to write and read – from now on we can just use the label instead of filtering by type names:
match (e:Exception:Type) return e.fqn
Note: if we used the rule mechanism provided by jQAssistant this query would become a so-called concept.
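To get a feeling for how rough the naming heuristic is, here is a minimal Python sketch of the same pattern. It assumes that the `=~` operator has to match the whole string, which corresponds to `re.fullmatch`; the sample type names under `com.example` are made up for illustration and do not come from the scanned libraries.

```python
import re

# Same pattern as in the Cypher query. The regex has to match the whole
# name, so re.fullmatch is used instead of re.search.
EXCEPTION_NAME = re.compile(r".*Exception")


def looks_like_exception(fqn: str) -> bool:
    """Return True if the fully qualified name ends with 'Exception'."""
    return EXCEPTION_NAME.fullmatch(fqn) is not None


# Hypothetical sample of type names, chosen to show hits and misses.
samples = [
    "java.lang.IllegalArgumentException",  # flagged, and really an exception
    "java.lang.Error",                     # a Throwable, but not flagged
    "com.example.ExceptionHandler",        # not flagged: no matching suffix
    "com.example.NoException",             # flagged purely because of its name
]

flagged = [fqn for fqn in samples if looks_like_exception(fqn)]
print(flagged)
```

The sketch shows both weaknesses of the approach: throwables such as java.lang.Error slip through, while any type whose name happens to end with "Exception" gets the label.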
The first problem is solved, so let’s see what the second one is and how to get around it: there’s no information available in
the graph about the point at which an exception of a specific type is thrown. This is a limitation of the current byte code scanner
(shame on the author…); maybe it’s an interesting feature to be implemented in the future.
We can apply another heuristic to find methods which throw exceptions: usually an instance must be created before it can
be thrown. So we’ll just look for constructor invocations of exception types as the following query does:
match
  (e:Exception)-[:DECLARES]->(c:Constructor),
  (a:Artifact)-[:CONTAINS]->(t:Type)-[:DECLARES]->(m:Method)-[i:INVOKES]->(c)
where
  not (m:Constructor)
return
  a.fileName as file, e.fqn as exception, t.fqn as type, m.signature as method, i.lineNumber
order by
  exception, file, type, method
The screenshot only shows the first results of the query. By scrolling down in the browser we see that some unexpected
exception types are used by the implementations, e.g. java.lang.Exception, which usually is not considered to be good
practice.
Things are getting more interesting if we filter for exceptions that are provided by the JRE (hence the additional where
clause containing a regular expression):
match
  (e:Exception)-[:DECLARES]->(c:Constructor),
  (a:Artifact)-[:CONTAINS]->(t:Type)-[:DECLARES]->(m:Method)-[i:INVOKES]->(c)
where
  not (m:Constructor)
  and e.fqn =~ "java\\.lang\\..*"
return
  a.fileName as file, e.fqn as exception, t.fqn as type, m.signature as method, i.lineNumber
order by
  exception, file, type, method
The result shows that the libraries are also creating instances of “java.lang.NullPointerException” – did we expect that?
Type metrics
In this section we will start to gather some metrics that may help to identify hotspots in the JPA libraries. Let’s start with a
quite simple one:
match
  (a:Artifact)-[:CONTAINS]->(t:Type)-[:DECLARES]->(m:Method)
return
  a.fileName, t.fqn, count(m) as Methods
order by
  Methods desc
limit 10
The query returns types ordered descending by the number of methods they declare. The values are quite impressive!
But wait – interpretation of metrics always requires some knowledge about the context: at first glance the type
org.apache.openjpa.kernel.jpql.JPQL with 942 methods seems to be a hotspot. However, it’s most likely that the class has been
generated from a grammar representing the query language of JPA – so we’re not really interested in it (or we start a
discussion about allowed complexity in generated code).
The same probably holds for the second candidate, i.e. org.hibernate.internal.CoreMessageLogger_$logger. If we take a deeper
look at it (e.g. by another query or simply decompiling the class file) we see that it contains log messages and methods
which all look very similar.
The next entry in the result is the type org.eclipse.persistence.descriptors.ClassDescriptor with 455 methods – decompilation
reveals code that looks hand-crafted – therefore it should be treated as a problem.
Let’s have a look at another metric – the depth of inheritance hierarchies:
match h=(class:Class)-[:EXTENDS*]->(super:Type) return class.fqn, length(h) as Depth order by Depth desc
We see a maximum of 7 levels, mostly originating from Hibernate classes that seem to represent AST (abstract syntax tree)
structures – usually developers say that they start losing orientation at about 4 levels.
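The depth metric itself is simple: follow the EXTENDS relation upwards and count the steps. The following Python sketch illustrates this on a made-up hierarchy; the class names and the `extends` mapping are purely hypothetical, and the function is only an illustration of the idea, not of how the graph database evaluates the query.

```python
def inheritance_depth(type_name, extends):
    """Count the EXTENDS steps from type_name up to the last known supertype."""
    depth = 0
    seen = {type_name}
    current = type_name
    while current in extends:
        current = extends[current]
        if current in seen:  # guard against corrupt (cyclic) input data
            break
        seen.add(current)
        depth += 1
    return depth


# Hypothetical hierarchy: each key extends its value.
extends = {
    "LeafNode": "OperatorNode",
    "OperatorNode": "AbstractNode",
    "AbstractNode": "BaseAst",
}

depths = {name: inheritance_depth(name, extends) for name in extends}
# Deepest classes first, similar to "order by Depth desc".
for name, depth in sorted(depths.items(), key=lambda item: -item[1]):
    print(name, depth)
```

As in our scan, the count stops at the first supertype whose own parent is unknown, so the real depth may be larger if the hierarchy continues in a library that has not been scanned.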
Let’s switch to potential hotspots regarding incoming and outgoing dependencies of types. Both queries are similar except
that the direction of a relationship has to be switched:
match
  (a:Artifact)-[:CONTAINS]->(t:Type),
  (t)<-[d:DEPENDS_ON]-()
return
  a.fileName, t.fqn, count(d) as FanIn
order by
  FanIn desc
limit 10
The type with the highest fan-in is org.hibernate.HibernateException, i.e. lots of other types depend on it. So this class could
be a potential hotspot as changes to it could affect lots of other classes. As it is an exception which hopefully is quite stable
this is not necessarily a real problem. Looking at the next candidates we see session related types of Hibernate and
EclipseLink – this is a common problem of O/R mappers.
match
  (a:Artifact)-[:CONTAINS]->(t:Type),
  (t)-[d:DEPENDS_ON]->()
return
  a.fileName, t.fqn, count(d) as FanOut
order by
  FanOut desc
limit 10
This result shows the types with the highest fan-out, i.e. they depend on lots of other types and therefore can be treated as
sensitive to changes (i.e. potentially unstable). Interestingly, in the case of Hibernate 4 we can again observe session related
implementations as in the result before. Even if the types are not the same (actually SessionImpl implements
SessionImplementor) we get the impression that session handling is a fundamental part of unstable structures in
Hibernate.
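Both metrics are just two views on the same set of DEPENDS_ON edges, which a small Python sketch makes visible. The type names and edges below are invented for illustration; only the counting principle corresponds to the two queries above.

```python
from collections import Counter

# Hypothetical DEPENDS_ON edges: (dependent type, type it depends on).
depends_on = [
    ("Service", "Repository"),
    ("Service", "AppException"),
    ("Controller", "Service"),
    ("Controller", "AppException"),
    ("Repository", "AppException"),
]

# Fan-out counts outgoing edges per type, fan-in counts incoming edges.
fan_out = Counter(source for source, _ in depends_on)
fan_in = Counter(target for _, target in depends_on)

print("fan-in: ", fan_in.most_common())
print("fan-out:", fan_out.most_common())
```

In this toy data the exception type collects the highest fan-in without depending on anything itself, which mirrors the observation about HibernateException.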
Method metrics
A common metric is cyclomatic complexity: “It is a quantitative measure of the number of linearly independent paths
through a program’s source code.” as Wikipedia explains. As a rule of thumb: the higher the value the harder it is to read
and test the code as more variations must be considered.
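A common way to approximate the metric is to count decision points and add one. The following Python sketch does this for Python source code using the standard `ast` module. It is only meant to illustrate the counting idea: it is not the way jQAssistant derives its estimate from Java byte code, and the selection of node types is an assumption of this sketch.

```python
import ast

# Node types that open an additional path through the code.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def estimate_complexity(source: str) -> int:
    """Estimate cyclomatic complexity as 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two short-circuit decisions.
            complexity += len(node.values) - 1
    return complexity


# Made-up function used as input for the estimate.
SAMPLE = '''
def classify(n):
    if n < 0 and n % 2 == 0:
        return "negative even"
    for i in range(n):
        if i == 3:
            return "has three"
    return "other"
'''

print(estimate_complexity(SAMPLE))  # 1 + if + and + for + if = 5
```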
The Java scanner of jQAssistant gathers an estimation of cyclomatic complexity on method level so we can use it to find
potential hotspot methods:
match
  (a:Artifact)-[:CONTAINS]->(t:Type),
  (t)-[:DECLARES]->(m)
where
  has(m.cyclomaticComplexity)
return
  m.cyclomaticComplexity as CC, a.fileName, t.name + "#" + m.signature as Method
order by
  CC desc
limit 10
The values are very high – Checkstyle uses a limit of 10 by default. The good news is that the first candidates are again
generated classes (JPQL related stuff) but we can also see methods declared by AnnotationBinder and Configuration from
the Hibernate implementation there.
It is possible to aggregate cyclomatic complexity per type:
match
  (a:Artifact)-[:CONTAINS]->(t:Type),
  (t)-[:DECLARES]->(m)
where
  has(m.cyclomaticComplexity)
return
  sum(m.cyclomaticComplexity) as CC, a.fileName, t.name as Type
order by
  CC desc
limit 10
The result proves that working with JPQL obviously is not trivial but luckily the top candidates are most likely generated
from a grammar. The other types in the list are hotspots which are most likely hard to test. And there’s another interesting
detail: AbstractEntityPersister has already been detected during fan-out analysis – the Hibernate developers should have a
look at it as the situation became even worse with the newer release.
Package metrics
As the last part of the analysis let’s investigate if we can find something of interest on package level. Again let’s start with a simple
metric: the number of types which are contained per package:
match
  (a:Artifact)-[:CONTAINS]->(p:Package)-[:CONTAINS]->(t:Type)
return
  a.fileName, p.fqn, count(t) as types
order by
  types desc
limit 20
The result gives us the feeling that OpenJPA uses much bigger packages than the other libraries. We can verify this
assumption by performing aggregations, i.e. determining the maximum and average number of types per package in each
of the artifacts:
match
  (a:Artifact)-[:CONTAINS]->(p:Package)-[:CONTAINS]->(t:Type)
with
  a, p, count(t) as types
return
  a.fileName, max(types), avg(types)
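The query aggregates in two steps: the with clause counts the types per package, and the return clause then computes maximum and average of these counts per artifact. Here is a minimal Python sketch of the same two steps; the artifact, package and type names are invented for illustration.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical (artifact, package, type) rows as they could come out of a scan.
rows = [
    ("a.jar", "p.one", "A"),
    ("a.jar", "p.one", "B"),
    ("a.jar", "p.one", "C"),
    ("a.jar", "p.two", "D"),
    ("b.jar", "q.one", "E"),
    ("b.jar", "q.two", "F"),
]

# Step 1 (the "with" clause): count the types per artifact and package.
types_per_package = Counter((artifact, package) for artifact, package, _ in rows)

# Step 2 (the "return" clause): aggregate these counts per artifact.
counts_by_artifact = defaultdict(list)
for (artifact, _), count in types_per_package.items():
    counts_by_artifact[artifact].append(count)

summary = {
    artifact: (max(counts), mean(counts))
    for artifact, counts in counts_by_artifact.items()
}
print(summary)
```

Without the intermediate step the maximum and the average would be computed over single types instead of over packages, which is why the counting has to happen first.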
Similar to type level metrics it is also possible to determine fan-in and fan-out on package level. The information is not
directly available from the scanned data but can be inferred from type level:
match
  (p1:Package)-[:CONTAINS]->(t1:Type)-[:DEPENDS_ON]->(t2:Type)<-[:CONTAINS]-(p2:Package)
where
  p1 <> p2
create unique
  (p1)-[:DEPENDS_ON]->(p2)
return
  p1.fqn as package, count(distinct p2) as PackageDependencies
order by
  PackageDependencies desc
The result contains many packages from Hibernate but there’s none coming from OpenJPA. The latter is no real surprise as
OpenJPA puts many classes into one package (see the metric above) – so potential problems are hidden within the
packages.
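How the package level dependencies are derived from type level can be sketched in a few lines of Python. The type names are made up, and the helper `package_of` is an assumption of this sketch: it simply cuts off the last name segment, whereas the graph uses the CONTAINS relation between packages and types.

```python
# Hypothetical type-level dependencies as (source type, target type) pairs.
type_deps = [
    ("org.a.Foo", "org.b.Bar"),
    ("org.a.Foo", "org.a.Baz"),  # same package: no package dependency
    ("org.a.Baz", "org.b.Bar"),  # yields org.a -> org.b a second time
    ("org.b.Bar", "org.c.Qux"),
    ("org.a.Foo", "org.c.Qux"),
]


def package_of(fqn: str) -> str:
    """Derive the package from a fully qualified name of a top-level type."""
    return fqn.rpartition(".")[0]


# A set keeps each package pair only once, like "create unique" in the query.
package_deps = {
    (package_of(source), package_of(target))
    for source, target in type_deps
    if package_of(source) != package_of(target)
}

# Fan-out per package: number of distinct packages it depends on.
package_fan_out = {}
for source, _ in package_deps:
    package_fan_out[source] = package_fan_out.get(source, 0) + 1

print(sorted(package_deps))
print(package_fan_out)
```

The sketch also shows why big packages look good in this metric: every dependency between two types of the same package simply disappears.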
Looking at the last query we notice that it not only returns the number of outgoing dependencies per package but also
creates (i.e. materializes) DEPENDS_ON relations on this level. This information is useful to find out if there are cycles
between packages:
match
  (a:Artifact),
  (a)-[:CONTAINS]->(p1:Package),
  (a)-[:CONTAINS]->(p2:Package),
  (p1)-[:DEPENDS_ON]->(p2),
  path=shortestPath((p2)-[:DEPENDS_ON*]->(p1))
return
  a.fileName, p1.fqn as package, extract(p in nodes(path) | p.fqn) as Cycle
order by
  package
The query searches for all packages p1 which depend on a package p2 where a path exists traversing DEPENDS_ON
relations back to p1. The result contains p1 and all nodes which are extracted from the path.
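The idea behind the cycle query can be reproduced with a plain breadth-first search: for every dependency p1 -> p2 we look for the shortest way back from p2 to p1. The following Python sketch uses an invented package graph and assumes, as in the materialized relations, that no package depends on itself.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search, returns the nodes from start to goal or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for successor in sorted(graph.get(node, ())):
            if successor not in visited:
                visited.add(successor)
                queue.append(path + [successor])
    return None


# Hypothetical package dependencies: package -> packages it depends on.
graph = {"a": {"b"}, "b": {"c"}, "c": {"a"}, "d": {"a"}}

# For each edge p1 -> p2, a path from p2 back to p1 closes a cycle.
cycles = {}
for p1, targets in graph.items():
    for p2 in targets:
        way_back = shortest_path(graph, p2, p1)
        if way_back is not None:
            cycles[(p1, p2)] = way_back

packages_in_cycles = {p1 for p1, _ in cycles}
print(cycles)
print(sorted(packages_in_cycles))
```

Package d depends on a package that is part of a cycle but is not part of the cycle itself, so it is not reported.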
We can use these cycles as a metric: how many packages in a library are involved in cycles? Higher values may be used as
an indicator for structural problems which make it hard to determine the impact of changes:
match
  (a:Artifact),
  (a)-[:CONTAINS]->(p1:Package),
  (a)-[:CONTAINS]->(p2:Package),
  (p1)-[:DEPENDS_ON]->(p2),
  path=shortestPath((p2)-[:DEPENDS_ON*]->(p1))
return
  a.fileName, count(p1)
At first glance OpenJPA is the winner for that metric but we already know that this comes at the price of big packages. The
much more interesting result is the evolution of Hibernate.
Wrap up
jQAssistant allows us to gain insight into Java artifacts by simply scanning them and executing queries over the obtained
structural information on several levels. This comes at the price of writing queries on our own, but with all the flexibility to
apply filters according to our own needs and the possibility to enrich the data with our own concepts (e.g. Exception) to make
analysis easier.
About the author: Dirk Mahler (@dirkmahler)