Example application: source code analysis
[Figure: File Types in Alfresco Source; logarithmic axis from 1 to 10000]
125 file types; 8029 files; 4689 non-Java; 1112 svn revisions
Querying Software Artefacts
[Figure: diagram with the following labels]
Artefacts: source code, build scripts, version history, spreadsheets, databases, config files, web pages, bug reports
Components: parsers, software repository, query engine
Front ends: dashboard, IDE plugin, Excel add-in
Users: analyst, developer, manager
The problem
Design a query language and engine for accessing a vast repository of different types of source artefact
Libraries of queries: tailor the framework to different types of artefact
Tough problem!
Difficulties:
- does not scale
- efficient queries extremely hard to write
- specific to one kind of source artefact
Dozens of attempts in industry and academia since 1984: databases, Prolog, domain-specific query languages
18 man-years of research at the University of Oxford, 1996-2005, to discover the ingredients of a solution
15 man-years to implement an industrial product
3 patents pending, several more in pipeline
SemmleCode: the power of .QL
The query language .QL
Object-oriented, for creating libraries of queries
Recursive queries, as in logic programming
Familiar syntax to Java and SQL developers
On top of any traditional relational database
Syntax-highlighting, error-checking and auto-completion
How it works
[Figure: diagram with the following labels: .QL library, .QL query, java / jar, XML files, bytecode for search, Semmle optimiser, template for RDBMS, procedural SQL, RDBMS]
Demo
The source we shall explore:
• Alfresco: Enterprise Content Management
• Spring: Java/JEE Application Framework
• Builds on Tomcat, JBoss, …
Demo parts:
• out-of-the-box
• writing your own queries
• querying XML config files
Vital statistics:
• 50553 Java methods
• 6647 Java types
• 516 XML files
Using SemmleCode out-of-the-box
115 pre-packaged queries
Find common bug patterns: e.g. compareTo/equals, cloning, serialisation, internationalization
Compute metrics: 42 different metrics, including Robert Martin’s package metrics
Examine dependencies: e.g. cyclic package dependencies
Visualization: pie charts, bar charts, tables, graphs, warnings/errors
- easy navigation to source
- exportable for generating reports
Writing queries of your own: select
from Method m
where m.fromSource() and
      m.hasName("compareTo") and
      not m.getDeclaringType().getAMethod().hasName("equals")
select m, "missing equals?"
In general:
from <variable-declarations>
where <conditions>
select <results>
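The from-where-select shape can be read as "enumerate, filter, project". The following is a minimal Python sketch of that reading, not Semmle's implementation: the `Method` record, the toy `METHODS` table and the helper `methods_of` are all hypothetical stand-ins for the .QL library used on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Method:
    # Hypothetical, simplified stand-in for the .QL Method class.
    name: str
    declaring_type: str
    from_source: bool = True


# Toy data invented for illustration; not taken from Alfresco.
METHODS = [
    Method("compareTo", "Version"),
    Method("equals", "Version"),
    Method("compareTo", "Priority"),
    Method("compareTo", "LibraryKey", from_source=False),
]


def methods_of(type_name):
    """All methods declared by a type (plays the role of getAMethod())."""
    return [m for m in METHODS if m.declaring_type == type_name]


def missing_equals():
    """from Method m  where <conditions>  select m, "missing equals?" """
    return [
        (m, "missing equals?")                      # select
        for m in METHODS                            # from
        if m.from_source                            # where
        and m.name == "compareTo"
        and not any(o.name == "equals" for o in methods_of(m.declaring_type))
    ]


results = missing_equals()
```

In this toy data only `Priority` declares `compareTo` in source without a matching `equals`, so it is the single type reported.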
Writing queries of your own: aggregates
select sum (CompilationUnit cu | cu.fromSource() | cu.getNumberOfLinesOfCode())
In general:
agg( T1 x1, …, Tn xn | condition | expr )
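One way to read the general aggregate form is as a higher-order function: range over a domain, keep the values satisfying the condition, evaluate the expression, and combine. The sketch below models that reading in Python for a single variable; the function `agg`, the dictionary layout and the sample data are assumptions made for illustration only.

```python
def agg(combine, domain, condition, expr):
    """Model of agg(T x | condition | expr) for one variable x.

    combine   -- the aggregate, e.g. sum or max
    domain    -- all values of type T
    condition -- predicate selecting which x to include
    expr      -- value contributed by each selected x
    """
    return combine(expr(x) for x in domain if condition(x))


# Hypothetical compilation units; the numbers are invented.
COMPILATION_UNITS = [
    {"name": "A.java", "from_source": True, "lines_of_code": 120},
    {"name": "B.java", "from_source": True, "lines_of_code": 80},
    {"name": "C.class", "from_source": False, "lines_of_code": 999},
]

# select sum(CompilationUnit cu | cu.fromSource() | cu.getNumberOfLinesOfCode())
total = agg(
    sum,
    COMPILATION_UNITS,
    lambda cu: cu["from_source"],
    lambda cu: cu["lines_of_code"],
)

# The same shape works for other aggregates, here the largest source file.
largest = agg(
    max,
    COMPILATION_UNITS,
    lambda cu: cu["from_source"],
    lambda cu: cu["lines_of_code"],
)
```

Because the aggregate is an ordinary expression, it can appear inside another condition or expression, which is how nesting without group-by can be understood.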
Writing queries of your own: recursion
from RefType s, RefType t, RefType it
where it.hasName("PasswordInputTag") and
      it.hasSupertype*(s) and
      it.hasSupertype*(t) and
      t.hasSupertype(s)
select t, s
In general, can write recursive predicate definitions
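The `hasSupertype*` chaining in the query above denotes the reflexive transitive closure of `hasSupertype`. A minimal Python sketch of that closure follows, using a worklist; the `SUPERTYPE` table is hypothetical (the real supertypes of `PasswordInputTag` are not given on the slide), and the function names are invented for illustration.

```python
# Hypothetical direct-supertype relation; only the name PasswordInputTag
# comes from the slide, the hierarchy itself is invented.
SUPERTYPE = {
    "PasswordInputTag": {"InputTag"},
    "InputTag": {"BaseTag"},
    "BaseTag": {"Object"},
    "Object": set(),
}


def has_supertype_star(t):
    """Reflexive transitive closure: t itself plus every type reachable
    by following direct-supertype edges zero or more times."""
    reached = {t}
    frontier = [t]
    while frontier:
        current = frontier.pop()
        for parent in SUPERTYPE.get(current, ()):
            if parent not in reached:
                reached.add(parent)
                frontier.append(parent)
    return reached


def hierarchy_edges(root):
    """Pairs (t, s) with root.hasSupertype*(t), root.hasSupertype*(s)
    and t.hasSupertype(s): the edges of the hierarchy above root."""
    above = has_supertype_star(root)
    return sorted(
        (t, s)
        for t in above
        for s in SUPERTYPE.get(t, ())
        if s in above
    )


edges = hierarchy_edges("PasswordInputTag")
```

The result is the set of edges one would draw to display the inheritance hierarchy above the starting type.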
Queries in .QL
from-where-select: autocompletion, type checking, emptiness tests
aggregates: arbitrary nesting, no group-by needed
recursion: implicit with chaining, or explicit
Defining new classes in .QL
class ClassAttribute extends XMLAttribute {
  ClassAttribute() { this.getName() = "class" }
  string getClassName() { this.getValue() = result }
  RefType getType() { result.getQualifiedName() = this.getClassName() }
  predicate noType() { not exists(this.getType()) }
}
from ClassAttribute ca
where ca.noType() and
      ca.getClassName().matches("org.alfresco%")
select ca, ca.getClassName() + " not found"
Classes in .QL
classes are logical properties: the “constructor” specifies the characteristic property
methods: the body is a relation between this, result and parameters; more than one result is allowed
predicates: methods without a result; the body is a relation between this and parameters
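The three points above can be modelled in Python as a rough sketch: a class becomes a boolean test on existing values, a method becomes a generator that may yield zero, one or many results, and a predicate becomes a method-like test with no result. The `XMLAttribute` record, the `KNOWN_TYPES` set and all function names are hypothetical; this models the ideas only and is not how .QL is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class XMLAttribute:
    # Hypothetical, simplified stand-in for the .QL XMLAttribute class.
    name: str
    value: str


# Invented set of qualified type names known to the database.
KNOWN_TYPES = {"org.alfresco.repo.Foo", "org.alfresco.repo.Bar"}


def is_class_attribute(a):
    """The "constructor": the characteristic property of ClassAttribute."""
    return a.name == "class"


def get_type(a):
    """A method is a relation between this and result, so it is modelled
    as a generator: it may yield no result, one, or several."""
    for qualified_name in KNOWN_TYPES:
        if qualified_name == a.value:
            yield qualified_name


def no_type(a):
    """A predicate: relates this (and parameters) without a result."""
    return not any(True for _ in get_type(a))


ATTRIBUTES = [
    XMLAttribute("class", "org.alfresco.repo.Foo"),
    XMLAttribute("class", "org.alfresco.repo.Missing"),
    XMLAttribute("id", "org.alfresco.repo.Missing"),
]

# from ClassAttribute ca where ca.noType() and the name starts with org.alfresco
not_found = [
    a.value + " not found"
    for a in ATTRIBUTES
    if is_class_attribute(a) and no_type(a) and a.value.startswith("org.alfresco")
]
```

Note that no object is ever constructed for `ClassAttribute`: membership is decided purely by testing the characteristic property on attributes that already exist.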
The key points of .QL
classes are predicates
inheritance is implication
nondeterministic expressions
recursion with super-simple semantics
syntax familiar to SQL and Java programmers
designed for creating libraries of queries
excellent error checking and IDE integration
Concluding remarks
Couldn’t you use LINQ instead of .QL?
Different design goals: ORM versus libraries of queries
LINQ does not provide recursion
LINQ cannot do the optimisations across multiple queries that are key to efficiency in .QL
“Fortunately, there is light in the darkness. Based on decades of programming language research, the brilliant team at Semmle has created an elegant, industrial strength object-oriented query language called .QL with full support for recursive queries and aggregation… .QL has all the requisites to become a runaway success.”
(Erik Meijer, Creator of LINQ, Microsoft)
Too good to be true?
Jeff Ullman, 1991:
It is not possible for a query language to be seriously logical and seriously object-oriented at the same time.
Key breakthroughs are Semmle’s proprietary technology:
- design of .QL
- optimisations on “bytecode for search”
Wrapping up
Java is not enough: source code analysis tools must process a multitude of artefacts
libraries of queries: a means to achieve such heterogeneous tools
.QL: object-oriented queries over trees and graphs, made fast and easy