Introduction Modeling Alchemy Training Inference Case Study Conclusion

A Goal Driven Framework for Software Project Data Analytics

George Chatzikonstantinou¹, Kostas Kontogiannis¹, Ioanna-Maria Attarian²

¹ National Technical University of Athens, Greece
² IBM Toronto Laboratory, Canada

CAiSE’13, Valencia, Spain

MINISTRY OF EDUCATION & RELIGIOUS AFFAIRS, CULTURE & SPORTS

Problem Description (Software Development Analytics)

Software engineering is a data-rich/data-intensive activity

Large collections of project-related information are stored in specialized repositories

How can those data be leveraged to help managers identify possible risks in order to better plan a software project?

[Figure: Software Project Data → ? → draw conclusions about the project (e.g. budget overruns, schedule delays)]

Quantitative Approaches

[Figure: Software Project Data → cost = f(x1, x2, …, xn) → draw conclusions about the project (e.g. budget overruns, schedule delays)]

Most software analytics models are based on numerical formulas (e.g. COCOMO II by B. Boehm et al.)

Such approaches fail to take into account:

experience captured from past similar projects

contextual information that leads to different views of analysis

qualitative assessment of project data

The Proposed Approach

[Figure: Software Project Data and Past Project Data feed the Project Analytics Model, which is used to draw conclusions about the project (e.g. budget overruns, schedule delays)]

Uses qualitative models that can capture different views of analysis

Allows for past cases to be used for training the models

Can yield results even with incomplete or partial data

Steps of the approach: i) modeling, ii) training, iii) inference

Modeling Project Analytics

Project Analytics are modeled in terms of AND/OR Goal Trees

used extensively in requirements engineering (RE)

a visual notation with well-defined semantics

Advantages of the selected notation:

can capture the views of different stakeholders

can capture various dependency types

is extensible and customizable for different project types and organizations

Modeling Project Analytics (Example & Semantics)

[Figure: goal model with two root nodes, a (Low Effort) and b (High Software Product Complexity)]

Each root node corresponds to a desired state/risk

[Figure: goal tree. a (Low Effort) is AND-decomposed into c (Clarity of Project Team Roles and Responsibilities) and d (High Level of Experience and Knowledge); d is OR-decomposed into e (Application Domain Experience and Knowledge) and f (Platform Experience and Knowledge); b (High Software Product Complexity) is a second root node]

Nodes are reduced to simpler ones with:

AND-decompositions

Sat(c) ∧ Sat(d) → Sat(a)

OR-decompositions

Sat(e) → Sat(d)
Sat(f) → Sat(d)

Sat(a): goal node a is satisfied
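A minimal sketch of how the AND/OR rules for nodes a, c, d, e and f could be evaluated over Boolean facts. It assumes, beyond what the slide states, that a decomposition is the only way to satisfy a non-leaf goal; the `Goal` class and the `satisfied` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Goal:
    name: str
    kind: str = "LEAF"                      # "AND", "OR" or "LEAF"
    children: list = field(default_factory=list)

def satisfied(goal, facts):
    """Return True if `goal` is satisfied, given the set of satisfied leaf names."""
    if goal.kind == "LEAF":
        return goal.name in facts
    results = [satisfied(child, facts) for child in goal.children]
    # AND needs every child satisfied, OR needs at least one
    return all(results) if goal.kind == "AND" else any(results)

# The example tree: a = AND(c, d), d = OR(e, f)
c, e, f = Goal("c"), Goal("e"), Goal("f")
d = Goal("d", "OR", [e, f])
a = Goal("a", "AND", [c, d])
```

Under this reading, `satisfied(a, {"c", "e"})` holds because c is a fact and d follows from e, while `satisfied(a, {"c"})` does not.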

[Figure: the goal tree extended with node g (Support by Technical People), linked to d by ++S / ++D contribution links, and with −−S / −−D contribution links from b (High Software Product Complexity) to a (Low Effort)]

Dependencies are depicted as contribution links:

++S(g, d)   p1 : Sat(g) → Sat(d)
++D(g, d)   p2 : ¬Sat(g) → ¬Sat(d)
−−S(b, a)   p3 : Sat(b) → ¬Sat(a)
−−D(b, a)   p4 : ¬Sat(b) → Sat(a)
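The four link types can be read as a small lookup table: each type names the source value that triggers the link and the target value it then pushes toward. The sketch below encodes that reading as a strict, unweighted rule, which is an illustrative simplification; `implied_target` is a hypothetical helper and `--S` / `--D` are ASCII spellings of −−S / −−D.

```python
# (source value that triggers the link, target value it then implies)
LINK_SEMANTICS = {
    "++S": (True, True),     # Sat(src)  -> Sat(dst)
    "++D": (False, False),   # !Sat(src) -> !Sat(dst)
    "--S": (True, False),    # Sat(src)  -> !Sat(dst)
    "--D": (False, True),    # !Sat(src) -> Sat(dst)
}

def implied_target(kind, source_satisfied):
    """Return the target value the link implies, or None if the link does not fire."""
    trigger, outcome = LINK_SEMANTICS[kind]
    return outcome if source_satisfied == trigger else None
```

For example, `implied_target("--S", True)` yields `False` (rule p3: a satisfied b pushes a toward unsatisfied), while `implied_target("--S", False)` yields `None` because p3 says nothing when b is not satisfied.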

[Figure: the goal tree extended with nodes h (Requirements Controllability) and i (Development Schedule Constraints), each linked to a (Low Effort) by a conditional contribution: −−S {PDR} from h and −−S {PSS} from i]

PSS: Strict Schedule Compliance
PDR: Disciplined Requirements Management

Multiple views are modeled using conditional contributions

−−S(h, a){PDR}   if policy PDR holds   q1 : Sat(h) → ¬Sat(a)
−−S(i, a){PSS}   if policy PSS holds   q2 : Sat(i) → ¬Sat(a)
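One way to picture conditional contributions is as links carrying an optional policy tag, with a model view obtained by keeping only the links whose policy is active. The sketch below assumes that representation; `Link` and `active_links` are hypothetical names, and `--S` / `++S` are ASCII spellings of the link types.

```python
from typing import NamedTuple, Optional

class Link(NamedTuple):
    kind: str                      # "++S", "++D", "--S" or "--D"
    src: str
    dst: str
    policy: Optional[str] = None   # None marks an unconditional link

def active_links(links, active_policies):
    """Keep unconditional links plus those whose policy is currently active."""
    return [l for l in links if l.policy is None or l.policy in active_policies]

LINKS = [
    Link("--S", "h", "a", "PDR"),  # q1, only under Disciplined Requirements Management
    Link("--S", "i", "a", "PSS"),  # q2, only under Strict Schedule Compliance
    Link("++S", "g", "d"),         # unconditional
]
```

With only PDR active, `active_links(LINKS, {"PDR"})` keeps the h → a link and the unconditional g → d link and drops the i → a link.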

Leaf Nodes

[Figure: the complete goal tree from the previous slides]

There are nodes in the model that have zero in-degree (leaves)

Leaf nodes in the model are facts and should be:

either added as input by the user

or obtained from the available repositories

Learning/Inference Engine

Having expressed Project Analytics models as rules, we need an inference engine to make deductions

Alchemy (http://alchemy.cs.washington.edu/)

A statistical learning and probabilistic inference engine based on Markov Logic Networks (MLNs).

Markov Logic

A probabilistic logic which combines first-order logic (FOL) and Markov networks, enabling uncertain inference.

An assignment may hold with a non-zero probability even if some of the formulas in the underlying KB are violated.

Weights on formulas reflect the strength of the corresponding constraint.
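To make the role of weights concrete, here is a brute-force sketch of the standard Markov logic distribution over a handful of ground atoms, where each truth assignment gets a probability proportional to exp(sum of the weights of the formulas it satisfies). This is not how Alchemy computes probabilities internally; the function names and the weight 1.5 are made up for illustration.

```python
import itertools
import math

def world_probabilities(atoms, weighted_formulas):
    """Enumerate all truth assignments and return {assignment tuple: probability}.

    weighted_formulas is a list of (weight, fn), where fn(world) -> bool and
    world maps each atom name to its truth value.
    """
    scores = {}
    for values in itertools.product([False, True], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        total = sum(w for w, holds in weighted_formulas if holds(world))
        scores[values] = math.exp(total)
    z = sum(scores.values())                 # normalisation constant
    return {values: s / z for values, s in scores.items()}

def marginal(atoms, weighted_formulas, query):
    """Probability that the atom `query` is true."""
    idx = atoms.index(query)
    probs = world_probabilities(atoms, weighted_formulas)
    return sum(p for values, p in probs.items() if values[idx])

# One soft rule, Sat(g) -> Sat(d), with a hypothetical weight of 1.5
ATOMS = ["g", "d"]
FORMULAS = [(1.5, lambda w: (not w["g"]) or w["d"])]
```

The assignment g = true, d = false violates the rule yet still receives a non-zero probability; a larger weight would shrink that probability further.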

Alchemy as a Learning Engine

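[Figure: training pipeline with the labels Project Analytics Goal Model, MLN Rules Generation, Training Interpretations, Alchemy, and the output PAG Model with Weights on Contributions. Example goal model: a (Low Effort) is AND-decomposed into c (Clarity of Project Team Roles and Responsibilities) and d (High Level of Experience and Knowledge); g (Support by Technical People) is linked to d by ++S / ++D; i (Development Schedule Constraints) is linked to a by −−S {PSS}]

MLN rules generated from the example goal model:

Sat(c) ∧ Sat(d) → Sat(a).

p1 : Sat(g) → Sat(d)

p2 : ¬Sat(g) → ¬Sat(d)

q1 : Sat(i) → ¬Sat(a)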
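The rule-generation step could be sketched as a small text generator. The output format follows our understanding of Alchemy's `.mln` conventions (`^` for conjunction, `=>` for implication, `!` for negation, a trailing period marking a hard formula, constants starting with an upper-case letter); treat these as assumptions to check against the Alchemy manual. `implication` and `mln_program` are hypothetical helpers, not part of Alchemy.

```python
def literal(goal, positive):
    """Render one Sat literal; goal names become upper-case constants."""
    return ("" if positive else "!") + f"Sat({goal.upper()})"

def implication(body, head):
    """body: list of (goal, positive) pairs; head: one (goal, positive) pair."""
    lhs = " ^ ".join(literal(g, pos) for g, pos in body)
    return f"{lhs} => {literal(*head)}"

def mln_program(hard, soft):
    """Return the MLN text: predicate declaration, hard rules, then soft rules."""
    lines = ["Sat(goal)"]                                    # predicate declaration
    lines += [implication(b, h) + "." for b, h in hard]      # period = hard formula
    lines += [implication(b, h) for b, h in soft]            # weight left to be learned
    return "\n".join(lines)

HARD = [([("c", True), ("d", True)], ("a", True))]           # AND-decomposition of a
SOFT = [
    ([("g", True)], ("d", True)),                            # p1
    ([("g", False)], ("d", False)),                          # p2
    ([("i", True)], ("a", False)),                           # q1
]
```

Calling `print(mln_program(HARD, SOFT))` emits one line per rule, with only the AND-decomposition terminated by a period.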

[Figure: the training pipeline, now with Past Project Data passing through Ground Atoms Generation into Alchemy, alongside MLN Rules Generation from the Project Analytics Goal Model; the output is the PAG Model with Weights on Contributions]

Ground atoms generated from past projects Pr1 … Prn:

Pr1 : Sat(c), Sat(g), Sat(i)

Pr2 : Sat(c), !Sat(g), Sat(i)

...

Prn : Sat(c), Sat(g), Sat(i)
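A sketch of ground-atom generation for a single past project, assuming the project's leaf observations are already available as Booleans. The `!` prefix for false atoms mirrors the notation used for Pr2; the upper-case constants and the helper name `ground_atoms` are assumptions for illustration.

```python
def ground_atoms(observations):
    """Turn {goal name: bool} into evidence lines, one ground atom per line."""
    return [
        ("" if value else "!") + f"Sat({goal.upper()})"
        for goal, value in sorted(observations.items())
    ]

# Hypothetical record shaped like project Pr2
PR2 = {"c": True, "g": False, "i": True}
```

`ground_atoms(PR2)` returns `["Sat(C)", "!Sat(G)", "Sat(I)"]`, which could then be written out as one training interpretation.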

[Figure: result of training. The goal model now carries learned weights on its contribution links: ++S, p1 / ++D, p2 from g (Support by Technical People) to d (High Level of Experience and Knowledge), and −−S, q1 {PSS} from i (Development Schedule Constraints) to a (Low Effort)]

Alchemy as an Inference Engine

[Figure: inference pipeline. Current Project Data passes through Ground Atoms Generation; the PAG Model with Weights on Contributions and the Active Policies Set pass through MLN Rules Generation; Alchemy outputs the Project Analytics Satisfaction Probabilities]

MLN rules generated for inference:

Sat(c) ∧ Sat(d) → Sat(a).

p1 : Sat(g) → Sat(d)

p2 : ¬Sat(g) → ¬Sat(d)

Sat(i) ∧ Uses(PSS) → Sat(a’).

q1 : Sat(a’) → ¬Sat(a)

Current Project Data:

Sat(c), Sat(i)

Active Policies:

Uses(PDR)
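For this evidence, the satisfaction probability of a (Low Effort) can be illustrated by brute-force enumeration over the unobserved atoms g, d and a. The sketch simplifies the conditional link by switching rule q1 on or off with the policy set, rather than using the auxiliary atom a’, and the weights are invented for illustration; it does not reproduce Alchemy's inference algorithm.

```python
import itertools
import math

def prob_low_effort(active_policies, w_p1=1.0, w_p2=1.0, w_q1=2.0):
    """P(Sat(a) | Sat(c), Sat(i)) by enumerating g, d, a; weights are hypothetical."""
    c, i = True, True                                  # current project data
    numerator = denominator = 0.0
    for g, d, a in itertools.product([False, True], repeat=3):
        if c and d and not a:                          # hard rule: Sat(c) ∧ Sat(d) → Sat(a)
            continue
        score = 0.0
        if (not g) or d:                               # p1: Sat(g) → Sat(d)
            score += w_p1
        if g or (not d):                               # p2: ¬Sat(g) → ¬Sat(d)
            score += w_p2
        if "PSS" in active_policies and ((not i) or (not a)):
            score += w_q1                              # q1: Sat(i) → ¬Sat(a), only under PSS
        weight = math.exp(score)
        denominator += weight
        if a:
            numerator += weight
    return numerator / denominator
```

With the active policy PDR, the PSS-conditioned link stays inactive, so Sat(i) has no effect on a; calling `prob_low_effort({"PSS"})` instead lowers the probability of Low Effort under these made-up weights.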

Calculate Satisfaction Probability

Dataset

The ISBSG Dataset

ISBSG (http://www.isbsg.org/)

A non-profit organization that maintains and exploits a repository of historical data related to software projects.

The ISBSG Dataset in numbers

data for 5,000 software projects

submitted from 24 countries

covers 15 major industry types (e.g. banking, insurance)

over 100 features for each project

PAG Modeling

Compiling the PAG Model

We considered information from the following sources:

assertions from related literature

existing standards and tools (e.g. ISO 9126, COCOMO II)

data available from ISBSG

The PAG model of the case study has:

3 root goals: “High Effort”, “Low Cost”, “High Product Quality”

96 nodes (50 leaf nodes)

12 OR-decompositions / 10 AND-decompositions

25 contribution links (12 conditional)

Evaluation

Correctness

Objective              Correct   False Positives (FP)   False Negatives (FN)
High Effort            73.6 %    11.8 %                 14.6 %
Low Cost               67.9 %    14.5 %                 17.6 %
High Product Quality   60.6 %    11.4 %                 28.0 %

Stability

[Figure: stability plot. x-axis: # of Errors (0 to 22); y-axis: Probability of an objective to be true (0.4 to 1); series: Low Cost, High Effort, High Product Quality]

Policy Variability

Model View   Low Cost   High Effort   High Product Quality
# 1          21.57 %    99.04 %       49.57 %
# 2          99.00 %    77.69 %       50.76 %
# 3          19.13 %    98.99 %       87.00 %
# 4          20.13 %    99.04 %       83.59 %
# 5          19.13 %    99.00 %       99.00 %

Conclusion & Future Work

The proposed approach:

uses qualitative models that can capture different views of analysis

allows for past cases to be used for training the models

allows for reasoning under uncertainty or partial information

Future work:

compilation of goal models that relate to specific standards (e.g. SMART, SCRUM)

increase the expressiveness of PAG models

Acknowledgements

This research has been co-financed by the European Union (European Social Fund, ESF) and Greek national funds through the Operational Program “Education and Lifelong Learning” of the National Strategic Reference Framework (NSRF) - Research Funding Program: Heracleitus II. Investing in knowledge society through the European Social Fund.