Spotting automatically cross-language relations, by Federico Tomassetti (me), Giuseppe Rizzo, Marco Torchiano

Automatically Spotting Cross-language Relations

DESCRIPTION

An algorithm (with code on GitHub) to identify cross-language relations. Welcome to polyglot software development!

Page 1: Automatically Spotting Cross-language Relations

Spotting automatically

cross-language relations

Federico Tomassetti (me)

Giuseppe Rizzo

Marco Torchiano

Page 2: Automatically Spotting Cross-language Relations

data.sql

CREATE TABLE Persons (
    ID int,
    FirstName varchar(255),
    LastName varchar(255),
    City varchar(255)
);

Person.java

String query = "select ID, FirstName, LastName, " +
               "City " +
               "from " + dbName + ".Persons";
try {
    ...
    while (rs.next()) {
        int id = rs.getInt("ID");
        String firstName = rs.getString("FirstName");
        String lastName = rs.getString("LastName");
        String city = rs.getString("City");
    }
} catch (SQLException e) {
    ......
}

Page 3: Automatically Spotting Cross-language Relations

Same data.sql and Person.java as on Page 2; only the catch block changes:

} catch (SQLException e) {
    (Hopefully it does not happen)
}

Page 4: Automatically Spotting Cross-language Relations

…the system as a whole works, sometimes

Page 5: Automatically Spotting Cross-language Relations

If we could automatically identify cross-language relations, we could:

• Recognize them: I am aware that this ID is related to something else

• Support refactoring: if I change one, the others are updated

• Validate them: see broken relations as errors

• Navigate them: click to see the other side of the relation

Page 9: Automatically Spotting Cross-language Relations
Page 10: Automatically Spotting Cross-language Relations

CodeModels

ASTs

Page 11: Automatically Spotting Cross-language Relations

Embedded AST (image to be taken from the paper)

Page 12: Automatically Spotting Cross-language Relations

index.html

<ul id="types">
  <li ng-repeat="t in types" ng-class="{'selected': t.id == type}">
    <a ng-href="#/{{t.id}}">{{t.title}}</a>
  </li>
</ul>

app.js

var types = [
  { id: 'sliding-puzzle', title: 'Sliding puzzle' },
  { id: 'word-search-puzzle', title: 'Word search puzzle' }
];

app.controller('slidingAdvancedCtrl', function($scope) {
  $scope.puzzles = [
    { src: './img/misko.jpg', title: 'Miško Hevery', rows: 4, cols: 4 },
    { src: './img/igor.jpg', title: 'Igor Minár', rows: 3, cols: 3 },
    { src: './img/vojta.jpg', title: 'Vojta Jína', rows: 4, cols: 3 }
  ];
});

<div ng-repeat="puzzle in puzzles">
  <h2>{{puzzle.title}}</h2>
</div>

Page 14: Automatically Spotting Cross-language Relations

Context of a node:

all the descendants

+

the siblings and their descendants
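The definition of context above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class, its `value` field and the helper names are hypothetical and are not taken from CodeModels or from the authors' code.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Set


@dataclass(eq=False)
class Node:
    """Hypothetical AST node: an optional value (ID, string, number) plus children."""
    value: Optional[str] = None
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child: "Node") -> "Node":
        """Attach `child` under this node and return it."""
        child.parent = self
        self.children.append(child)
        return child


def descendants(node: Node) -> Iterator[Node]:
    """Yield every node strictly below `node`."""
    for child in node.children:
        yield child
        yield from descendants(child)


def context(node: Node) -> List[Node]:
    """Descendants of `node`, plus its siblings and the siblings' descendants."""
    result = list(descendants(node))
    if node.parent is not None:
        for sibling in node.parent.children:
            if sibling is not node:
                result.append(sibling)
                result.extend(descendants(sibling))
    return result


def context_values(node: Node) -> Set[str]:
    """Values carried by the nodes in the context of `node`."""
    return {n.value for n in context(node) if n.value is not None}
```

With this definition a root node has only its descendants as context, while a leaf has only its siblings and their subtrees.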

Page 16: Automatically Spotting Cross-language Relations

How to compare contexts:

1) Take all the values in the context (IDs, strings, numbers)

+

2) Employ different metrics

Some metrics we use:

• Number of shared values

• Min and max number of different values

• Tversky Index

$$TV(X, Y) = \frac{|X \cap Y|}{|X \cap Y| + \alpha |X - Y| + \beta |Y - X|}$$

• Jaro, Jaccard, tf-idf and others
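The set-based metrics on this page can be sketched as follows, treating each context as a set of values. The function names and the default weights `alpha = beta = 0.5` are assumptions made for illustration, and `min_max_different` is only one possible reading of "min and max number of different values". The two example sets are taken by hand from the data.sql and Person.java snippets earlier in the deck.

```python
def shared_values(x, y):
    """Number of values that appear in both contexts."""
    return len(set(x) & set(y))


def min_max_different(x, y):
    """Smaller and larger count of values found in only one of the contexts."""
    x, y = set(x), set(y)
    only_x, only_y = len(x - y), len(y - x)
    return min(only_x, only_y), max(only_x, only_y)


def tversky(x, y, alpha=0.5, beta=0.5):
    """Tversky index: |X∩Y| / (|X∩Y| + alpha*|X-Y| + beta*|Y-X|)."""
    x, y = set(x), set(y)
    shared = len(x & y)
    denominator = shared + alpha * len(x - y) + beta * len(y - x)
    return shared / denominator if denominator else 0.0


def jaccard(x, y):
    """Jaccard similarity, i.e. the Tversky index with alpha = beta = 1."""
    return tversky(x, y, alpha=1.0, beta=1.0)


# Values picked by hand from the SQL and Java snippets (illustrative only).
sql_context = {"Persons", "ID", "FirstName", "LastName", "City"}
java_context = {"ID", "FirstName", "LastName", "City", "query", "rs"}

print(shared_values(sql_context, java_context))      # 4
print(min_max_different(sql_context, java_context))  # (1, 2)
print(round(tversky(sql_context, java_context), 3))  # 0.727
print(round(jaccard(sql_context, java_context), 3))  # 0.571
```

Setting `alpha` and `beta` to different values makes the index asymmetric, which may be useful when one context is expected to be much larger than the other.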

Page 17: Automatically Spotting Cross-language Relations

How to combine those metrics: Random Tree tells us

We built a golden set of 1200 candidate relations (around 140 real relations; the others just share the same ID)

We train it on the golden set

Random Tree finds the best way to combine those metrics to decide whether a pair is related or not

Output of Random Tree: a rule to decide whether two nodes with the same ID are connected
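To make the idea concrete, here is a minimal pure-Python sketch of a tree learner that looks at a random subset of the metrics at each split. It assumes that this is what "Random Tree" refers to, which the slides do not spell out. It is not the authors' implementation, and the six training rows (number of shared values, Tversky index) are invented for illustration and do not come from the golden set.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels, features):
    """Return (impurity, feature, threshold) of the best binary split, or None."""
    best = None
    for f in features:
        for threshold in sorted({row[f] for row in rows}):
            left = [l for row, l in zip(rows, labels) if row[f] <= threshold]
            right = [l for row, l in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    return best


def build_tree(rows, labels, k, rng, depth=0, max_depth=5):
    """Grow a tree, considering k randomly chosen features at each node."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority  # leaf: the predicted label
    split = best_split(rows, labels, rng.sample(range(len(rows[0])), k))
    if split is None:
        return majority
    _, f, threshold = split
    left = [i for i, row in enumerate(rows) if row[f] <= threshold]
    right = [i for i, row in enumerate(rows) if row[f] > threshold]

    def grow(indices):
        return build_tree([rows[i] for i in indices], [labels[i] for i in indices],
                          k, rng, depth + 1, max_depth)

    return (f, threshold, grow(left), grow(right))


def predict(tree, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(tree, tuple):
        f, threshold, left, right = tree
        tree = left if row[f] <= threshold else right
    return tree


# Invented candidate pairs: (shared values, Tversky index) -> related or not.
rows = [(4, 0.73), (5, 0.80), (3, 0.65), (1, 0.10), (1, 0.05), (2, 0.20)]
labels = [True, True, True, False, False, False]

tree = build_tree(rows, labels, k=1, rng=random.Random(0))
print(predict(tree, (4, 0.70)))  # True
print(predict(tree, (1, 0.08)))  # False
```

The learned tree is the "rule": a nested set of threshold tests on the metrics that ends in a related / not related decision.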

Page 18: Automatically Spotting Cross-language Relations

How to evaluate it?

10-fold cross-validation
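A minimal sketch of 10-fold cross-validation with precision and recall is shown below. The threshold classifier and the 20 toy data points are placeholders invented for this example so that the block runs on its own. They are not the classifier or the dataset used by the authors, and the perfect score on this toy data says nothing about the real results.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def precision_recall(predicted, actual):
    """Precision and recall for the positive (True) class."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def cross_validate(rows, labels, train, predict, k=10, seed=0):
    """Predict every item with a model trained on the other k-1 folds."""
    predicted = [None] * len(rows)
    for fold in k_fold_indices(len(rows), k, seed):
        held_out = set(fold)
        train_idx = [i for i in range(len(rows)) if i not in held_out]
        model = train([rows[i] for i in train_idx], [labels[i] for i in train_idx])
        for i in fold:
            predicted[i] = predict(model, rows[i])
    return precision_recall(predicted, labels)


def train_threshold(rows, labels):
    """Placeholder learner: midpoint between the class means of feature 0."""
    pos = [r[0] for r, l in zip(rows, labels) if l]
    neg = [r[0] for r, l in zip(rows, labels) if not l]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_threshold(threshold, row):
    return row[0] > threshold


# Toy data: one similarity score per candidate pair, well separated by class.
rows = [(0.03 * i,) for i in range(10)] + [(0.7 + 0.03 * i,) for i in range(10)]
labels = [False] * 10 + [True] * 10

print(cross_validate(rows, labels, train_threshold, predict_threshold))  # (1.0, 1.0)
```

Each item is scored by a model that never saw it during training, so the resulting precision and recall estimate performance on unseen candidate pairs.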

Page 19: Automatically Spotting Cross-language Relations

What we have

• A tool that automatically spots cross-language relations with precision and recall > 90% (on a first in-house dataset)

What now?

• We want to build a larger golden set

• We want to integrate support in editors

Code available at:

https://github.com/orgs/CrossLanguageProject

Page 20: Automatically Spotting Cross-language Relations

Code available at:

https://github.com/orgs/CrossLanguageProject

www.slideshare.net/FTomassetti

Spotting Automatically

Cross-Language Relations

Federico Tomassetti, Giuseppe Rizzo, Marco Torchiano

CSMR 2014, Antwerpen, Belgium

Preprint at:

http://www.di.unito.it/~rizzo/publications/Tomassetti_Rizzo-CSMRWCRE2014.pdf