Application of Graph Theory to OO Software Engineering Alexander Chatzigeorgiou, Nikolaos Tsantalis, George Stephanides Department of Applied Informatics University of Macedonia Thessaloniki, Greece WISER 2006, May 20, 2006, Shanghai, China


Application of Graph Theory

to OO Software Engineering

Alexander Chatzigeorgiou, Nikolaos Tsantalis, George Stephanides

Department of Applied Informatics

University of Macedonia

Thessaloniki, Greece

WISER 2006, May 20, 2006, Shanghai, China

Motivation

• Application of Graph Theory to SE is not new:

Planning: network diagrams (CPM, PERT)

Analysis: DFDs, FSMs, Petri Nets

Design: everything is essentially a graph

Testing: McCabe's complexity measure

. . .

• Graph Theory is suitable for object-oriented SE:

Class diagrams can be perfectly mapped to graphs

System Representation

Identification of "God" classes

• Goal: to identify heavily loaded classes of an OO design

• such "God" classes imply a poor model

• Inspiration comes from the Web (HITS algorithm)

HITS

[Figure: the page MyHumbleHomePage shown twice. Linked with pages such as Eat Anything, Car Loans, Anti-Wrinkle, Super Cars, Mykonos and AlternativeMusic, its relative importance is low. Linked with SEI, IEEE TSE, ACM SIGSOFT, ICSE, GoF and NASA, its relative importance is high.]

Identification of "God" classes

• OO system: directed graph G = (V, E)

• classes → vertices

• associations → edges

• Each edge is annotated with an integer m_{p,q}, the number of distinct messages sent in the direction from p to q.

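[Figure: the two HITS update steps. The authority weight α_p of class p is computed from the hub weights h_{q1}, h_{q2}, h_{q3} of the classes that send messages to p, each weighted by m_{q,p}. The hub weight h_p is computed from the authority weights α_{q1}, α_{q2}, α_{q3} of the classes that receive messages from p, each weighted by m_{p,q}.]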

Identification of "God" classes

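[Figure: example system with four classes (1 to 4) exchanging ten messages (message1 to message10); each edge carries the number of messages sent in that direction.]

$$
\begin{aligned}
a_1 &= 0h_1 + 2h_2 + 1h_3 + 0h_4 \\
a_2 &= 1h_1 + 0h_2 + 1h_3 + 0h_4 \\
a_3 &= 1h_1 + 1h_2 + 0h_3 + 2h_4 \\
a_4 &= 0h_1 + 0h_2 + 1h_3 + 0h_4
\end{aligned}
\qquad
\begin{aligned}
h_1 &= 0a_1 + 1a_2 + 1a_3 + 0a_4 \\
h_2 &= 2a_1 + 0a_2 + 1a_3 + 0a_4 \\
h_3 &= 1a_1 + 1a_2 + 0a_3 + 1a_4 \\
h_4 &= 0a_1 + 0a_2 + 2a_3 + 0a_4
\end{aligned}
$$

$$
a = A^T h, \qquad h = A a, \qquad
A = \begin{bmatrix}
0 & 1 & 1 & 0 \\
2 & 0 & 1 & 0 \\
1 & 1 & 0 & 1 \\
0 & 0 & 2 & 0
\end{bmatrix}
$$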

Identification of "God" classes

• Using theorems from Linear Algebra, the authority and hub weights can be obtained as the principal eigenvectors of A^T A and A A^T, respectively

[Figure: class diagram with the classes Oven, Button, Door, Timer, Light, Power Tube and Beeper, numbered 1 to 7, exchanging the messages add60sec, countDown, setTimeZero, expired, turnOn, turnOff, beep, cook, cancel, doorOpen, doorClose and isOpen.]

Identification of "God" classes

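[Figure: the corresponding directed graph on vertices 1 to 7, with edge weights 1, 2 and 3.]

$$
A = \begin{bmatrix}
0 & 0 & 2 & 0 & 0 & 0 & 0 \\
0 & 0 & 2 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 3 & 2 & 2 & 1 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
$$

$$
a_n^T = [\,0,\ 0.229,\ 0,\ 0.688,\ 0.459,\ 0.459,\ 0.229\,], \qquad
h_n^T = [\,0,\ 0,\ 1,\ 0,\ 0,\ 0,\ 0\,]
$$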
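The authority and hub weights of this example can be reproduced with a short power iteration. The sketch below is only an illustration, not the authors' tool: the function names `normalize` and `hits` are hypothetical, the matrix `A_oven` is our reading of the 7 x 7 matrix A on this slide, and the iteration count is an arbitrary choice.

```python
import math

def normalize(v):
    """Scale v to unit Euclidean length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def hits(A, iterations=100):
    """Power iteration for HITS on a weighted adjacency matrix A.

    A[p][q] is the number of messages sent from class p to class q.
    Returns (authority, hub) weight vectors of unit length.
    """
    n = len(A)
    a = [1.0] * n
    h = [1.0] * n
    for _ in range(iterations):
        # authority update: a = A^T h
        a = normalize([sum(A[q][p] * h[q] for q in range(n)) for p in range(n)])
        # hub update: h = A a
        h = normalize([sum(A[p][q] * a[q] for q in range(n)) for p in range(n)])
    return a, h

# Assumed reading of the slide's matrix A (rows and columns are classes 1..7).
A_oven = [
    [0, 0, 2, 0, 0, 0, 0],
    [0, 0, 2, 0, 0, 0, 0],
    [0, 1, 0, 3, 2, 2, 1],
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

authority, hub = hits(A_oven)
print([round(x, 3) for x in authority])
print([round(x, 3) for x in hub])
```

Under these assumptions the hub weight concentrates entirely on class 3, the only class that sends messages to several others, which is what flags it as a candidate "God" class.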

Clustering

• Goal: to partition the system into strongly communicating classes

• might imply relevance of functionality

• might imply possible reusable components

• Spectral graph partitioning employs the degree matrix D (a diagonal matrix containing the degrees of the vertices) and the Laplacian matrix, defined as L = D - A, where A is the adjacency matrix

• the smallest eigenvalue of L is always zero

Clustering

• The properties of the eigenvector x_2 associated with the second smallest eigenvalue λ_2 have been explored by M. Fiedler

• Clustering a graph G into two sub-graphs according to the positive and negative entries of the Fiedler vector corresponds to a partition that minimizes the weight of the cut set.

[Figure: example graph with edge weights 11, 6, 7, 1, 9, 6 and 7.]

Clustering

[Figure: three alternative partitions of the same graph, with cut-set weights 17, 18 and 1. The partition with cut-set weight 1 is the one provided by the Fiedler vector.]
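The Laplacian and the sign-based split can be sketched in a few lines. This is a minimal illustration under stated assumptions: the six-vertex graph (two tightly connected triangles joined by one light edge) is hypothetical and is not the graph of the figure, the function names are ours, and the Fiedler vector is approximated by a shifted power iteration rather than by a library eigensolver.

```python
import math

def laplacian(n, weighted_edges):
    """Build L = D - A for an undirected weighted graph on vertices 0..n-1."""
    L = [[0.0] * n for _ in range(n)]
    for u, v, w in weighted_edges:
        L[u][u] += w
        L[v][v] += w
        L[u][v] -= w
        L[v][u] -= w
    return L

def fiedler_vector(L, iterations=200):
    """Approximate the eigenvector of the second smallest eigenvalue of L.

    Power iteration on (shift*I - L), with the all-ones eigenvector
    (eigenvalue 0 of L) projected out at every step.
    """
    n = len(L)
    shift = 2.0 * max(L[i][i] for i in range(n))  # Gershgorin bound on eigenvalues of L
    x = [float(i) for i in range(n)]              # arbitrary non-constant start vector
    for _ in range(iterations):
        mean = sum(x) / n
        x = [xi - mean for xi in x]               # remove the all-ones component
        x = [shift * x[i] - sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in x))
        x = [v / norm for v in x]
    return x

def cut_weight(weighted_edges, part):
    """Total weight of the edges with exactly one endpoint in `part`."""
    return sum(w for u, v, w in weighted_edges if (u in part) != (v in part))

# Hypothetical graph: triangles {0,1,2} and {3,4,5} joined by a weight-1 edge.
edges = [(0, 1, 5), (1, 2, 5), (0, 2, 5),
         (3, 4, 5), (4, 5, 5), (3, 5, 5),
         (2, 3, 1)]

x2 = fiedler_vector(laplacian(6, edges))
negative_side = {i for i, v in enumerate(x2) if v < 0}
print(sorted(negative_side), cut_weight(edges, negative_side))
```

For this toy graph the split follows the two triangles and cuts only the light edge. Applying the same split again to each side gives the iterative partitioning described on the following slides.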

Clustering

• Application to OO systems: edges are undirected, and the weight of an edge is the total number of messages exchanged in both directions

• Partitioning is performed iteratively

• When to stop? When a resulting graph is less cohesive than its parent graph

Clustering

[Figure: class diagram of an example system with the classes BusinessLogic, Entity1, Entity2, MainFrame, InputForm, Confirmation, DB, Connection, Statement and Result, and the corresponding weighted undirected graph on vertices 1 to 10.]

Clustering

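[Figure: the weighted graph on vertices 1 to 10, partitioned iteratively into three clusters labelled DB, Logic and GUI.]

First split (10 vertices, DB cluster), entry magnitudes of the Fiedler vector:

$$
|x_2|^T = [\,0.446,\ 0.359,\ 0.317,\ 0.359,\ 0.152,\ 0.152,\ 0.108,\ 0.313,\ 0.388,\ 0.366\,]
$$

Second split (6 vertices, Logic and GUI clusters), entry magnitudes of the Fiedler vector:

$$
|x_2|^T = [\,0.491,\ 0.491,\ 0.192,\ 0.285,\ 0.480,\ 0.410\,]
$$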

Design Pattern Detection

• Design patterns (descriptions of communicating classes) form solutions to common problems

• According to Parnas, software engineering deals with multi-version projects

• Multiple Versions + Large Number of Components =

Complicated and messy architecture

• Patterns impose structure

• Consequently, the identification of implemented patterns

• is useful for understanding an existing design

• enables further improvements

Design Pattern Detection

[Figure: overview of the detection approach. The class diagram (UML) of the system under study and that of the sought design pattern are each mapped to a graph representation and then to a set of matrices (plus further annotations). The two sets of matrices are then matched.]

Design Pattern Detection

• Classical pattern matching algorithms fail since patterns often differ from the standard representation

[Figure: System Segment 1 (vertices A, B, C), System Segment 2 (vertices 1, 2) and the Pattern (vertices a, b).]

Design Pattern Detection

• Exploiting recent research on graph similarity [Blondel2004] it is possible to measure the degree of similarity between two vertices
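A minimal sketch of the vertex similarity iteration, as we understand it from [Blondel2004]: starting from a matrix of ones, S is repeatedly replaced by B S A^T + B^T S A and normalized with the Frobenius norm, and the result is read after an even number of steps. The function names and the two small graphs below are hypothetical. The detection approach on these slides works on several matrices per system, which this sketch does not attempt.

```python
import math

def transpose(X):
    return [list(row) for row in zip(*X)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def similarity_matrix(A, B, iterations=20):
    """Similarity scores between the vertices of two directed graphs.

    A and B are adjacency matrices. S[i][j] is the similarity between
    vertex i of graph B and vertex j of graph A. `iterations` should be even.
    """
    S = [[1.0] * len(A) for _ in range(len(B))]
    At, Bt = transpose(A), transpose(B)
    for _ in range(iterations):
        forward = matmul(matmul(B, S), At)    # contribution of matching children
        backward = matmul(matmul(Bt, S), A)   # contribution of matching parents
        S = [[f + b for f, b in zip(fr, br)] for fr, br in zip(forward, backward)]
        norm = math.sqrt(sum(v * v for row in S for v in row))
        if norm == 0:                         # graphs without edges: nothing to compare
            return S
        S = [[v / norm for v in row] for row in S]
    return S

# Hypothetical pattern: vertex 0 points to vertex 1.
pattern = [[0, 1],
           [0, 0]]
# Hypothetical system segment: vertex 0 points to vertices 1 and 2.
system = [[0, 1, 1],
          [0, 0, 0],
          [0, 0, 0]]

S = similarity_matrix(pattern, system)
for row in S:
    print([round(v, 3) for v in row])
```

In this toy case system vertex 0 is similar only to pattern vertex 0 (both are sources), while system vertices 1 and 2 are similar only to pattern vertex 1 (all are targets).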

Design Pattern Detection

[Figure: generalization graphs and association graphs of the two system segments (vertices A, B, C and 1, 2) and of the pattern (vertices a, b), annotated with the vertex similarity scores 1, 1, 0, 0, 0.5, 0.5, 1 and 1.]

Design Pattern Detection

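[Figure: System Segment 1 (vertices A, B, C), System Segment 2 (vertices 1, 2) and the Pattern (vertices a, b), with a 2 x 2 similarity matrix between pattern vertices a, b and system vertices 1, 2 whose entries are 0.5 and 0.]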

Design Pattern Detection

• Experimental Results:

JHotDraw v5.1 (172 classes)

JRefactory 2.6.24 (572 classes)

JUnit 3.7 (99 classes)

Design Pattern Detection

TP = true positives, FN = false negatives

Design Pattern       JHotDraw v5.1    JRefactory v2.6.24    JUnit v3.7
                     TP    FN         TP    FN              TP    FN
Adapter*/Command     18    0          7     0               1     0
Composite            1     0          0     0               1     0
Decorator            3     0          1     0               1     0
Factory Method       2     1          1     3               0     0
Observer             5     0          0     0               4     0
Prototype            1     0          0     0               0     0
Singleton            2     0          12    0               0     0
State/Strategy       22    1          11    1               3     0
Template Method      5     0          17    0               1     0
Visitor              1     0          2     0               0     0

Scale-Freeness of OO Systems

• Popular topic: investigating whether certain systems (technological, biological, social, etc.) are scale-free

• A scale-free phenomenon shows up statistically in the form of a power law.

• For a network, the probability P(k) that a node connects with k other nodes is P(k) ~ k^{-γ}

Scale-Freeness of OO Systems

• Naturally, research has also focused on OO systems

• Scale-freeness is usually graphically detected, since the relationship of P(k) vs. k, plotted on a log-log scale, appears as a line with slope -γ

[Figure: log-log plot of cumulative frequency versus k for JUnit, JHotDraw and JRefactory.]

Scale-Freeness of OO Systems

[Figure: five panels, (a) to (e), including a log-log plot of cumulative frequency versus vertex degree.]

Degree Sequence = {16, 8, 8, 4, 4, 4, 4, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1}
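The graphical check can be sketched for this degree sequence. The code below is an illustration only: the function names are hypothetical, and fitting a least-squares line to the log-log points is one simple way to read off a slope, not necessarily the procedure used for the plots. Note that it fits the cumulative frequency (number of vertices with degree at least k), not P(k) itself.

```python
import math
from collections import Counter

def cumulative_frequency(degrees):
    """Map each occurring degree k to the number of vertices with degree >= k."""
    counts = Counter(degrees)
    running = 0
    result = {}
    for k in sorted(counts, reverse=True):
        running += counts[k]
        result[k] = running
    return dict(sorted(result.items()))

def loglog_slope(points):
    """Least-squares slope of log10(y) against log10(x) for (x, y) pairs."""
    xs = [math.log10(x) for x, _ in points]
    ys = [math.log10(y) for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Degree sequence from the slide: one 16, two 8s, four 4s, eight 2s, sixteen 1s.
degrees = [16] + [8] * 2 + [4] * 4 + [2] * 8 + [1] * 16

cumulative = cumulative_frequency(degrees)
slope = loglog_slope(list(cumulative.items()))
print(cumulative)
print(round(slope, 2))
```

The cumulative counts fall on a nearly straight line in log-log coordinates, which is the visual signature of a power law that the slides refer to.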

Scale-Freeness of OO Systems

• Recently, in [Li2005], a structural metric has been proposed to evaluate the scale-freeness of a network.

• For an undirected, simple and connected graph g = (V, E), with d_i the degree of vertex i:

$$
s(g) = \sum_{(i,j) \in E} d_i \, d_j
$$

• The metric value is maximized when high-degree nodes ("hubs") are connected to other high-degree nodes.

• Among all graphs having the same degree sequence, one graph attains the maximum value s_max of the metric s(g) and another attains the minimum value s_min. Thus:

$$
S(g) = \frac{s(g) - s_{min}}{s_{max} - s_{min}}
$$
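A short sketch of the metric on a toy case. The function names and the two trees are hypothetical. Both trees have the degree sequence {3, 2, 2, 1, 1, 1}, and for this small sequence they are the only connected simple graphs, so their values can serve as s_max and s_min. Computing s_max and s_min for larger systems is a separate problem that this sketch does not address.

```python
def s_metric(edges):
    """s(g): sum over all edges (i, j) of d_i * d_j, with d the vertex degree."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sum(degree[u] * degree[v] for u, v in edges)

def normalized_s(s, s_min, s_max):
    """S(g) = (s(g) - s_min) / (s_max - s_min)."""
    return (s - s_min) / (s_max - s_min)

# Tree A: the degree-3 hub is adjacent to both degree-2 vertices.
tree_a = [("hub", "a"), ("hub", "b"), ("hub", "l1"), ("a", "l2"), ("b", "l3")]
# Tree B: the hub is adjacent to only one degree-2 vertex.
tree_b = [("hub", "a"), ("hub", "l1"), ("hub", "l2"), ("a", "b"), ("b", "l3")]

s_a = s_metric(tree_a)
s_b = s_metric(tree_b)
print(s_a, s_b)
print(normalized_s(s_a, s_b, s_a), normalized_s(s_b, s_b, s_a))
```

Tree A scores higher because its high-degree vertices are connected to each other, which is the property the metric rewards.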

Scale-Freeness of OO Systems

Given such a metric, it is possible:

• to validate whether a given OO system is scale-free

• to assess whether an optimization increases scale-freeness

• to evaluate the evolution of systems in terms of scale-freeness

[Figure: scale-free metric S (vertical axis, 0.3 to 0.65) across successive versions of JUnit, JHotDraw and JRefactory.]

Conclusions

• Graph Theory has been widely applied in several CS fields

• It can provide a powerful "tool" for analyzing OO systems

• quantification of properties

• identification of structures

• Graph Theory is important for CS curricula


Thank you for your attention

WISER 2006, May 20, 2006, Shanghai, China