Introduction to FLOSS Data Sources
Master on Free Software

Daniel Izquierdo
GSyC/Libresoft

18 November 2011


(cc) 2011 Daniel Izquierdo. Some rights reserved. This document is distributed under the Creative Commons Attribution-ShareAlike 3.0 licence, available at http://creativecommons.org/licenses/by-sa/3.0/


Index

1 Introduction

2 Metrics

3 Conclusions

4 References


Data sources

- Source code management system
- Mailing lists
- Bug tracking system
- Source code


What type of metrics can we retrieve from them?


SCM: main attributes

Per commit:

- Owner of the change: committer or author
- Date of commit
- Files touched
- Message left by the committer or author
- Lines involved in the changes
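As a rough illustration, the per-commit attributes listed above can be recovered from the text printed by `git log --numstat`. The sketch below is only one possible approach: the `@@@` record marker, the `|` field separator and the sample log are assumptions made for this example, and it parses a hard-coded string instead of calling git.

```python
from dataclasses import dataclass, field

# Hypothetical output of:
#   git log --numstat --pretty=format:'@@@%H|%an|%cn|%aI|%s'
SAMPLE_LOG = """@@@a1b2c3|Alice|Alice|2011-11-02T10:15:00+01:00|Fix crash in parser
3\t1\tsrc/parser.c
10\t0\ttests/test_parser.c

@@@d4e5f6|Bob|Alice|2011-11-05T18:40:00+01:00|Add logo
-\t-\tdoc/logo.png
"""


@dataclass
class Commit:
    commit_id: str
    author: str
    committer: str
    date: str
    message: str
    files: list = field(default_factory=list)  # (path, added, removed)


def to_int(count):
    # git prints "-" instead of line counts for binary files
    return 0 if count == "-" else int(count)


def parse_log(text):
    commits = []
    for line in text.splitlines():
        if line.startswith("@@@"):
            # Header line: one commit per record marker
            commit_id, author, committer, date, message = line[3:].split("|", 4)
            commits.append(Commit(commit_id, author, committer, date, message))
        elif line.strip() and commits:
            # numstat line: added<TAB>removed<TAB>path
            added, removed, path = line.split("\t", 2)
            commits[-1].files.append((path, to_int(added), to_int(removed)))
    return commits


commits = parse_log(SAMPLE_LOG)
for commit in commits:
    print(commit.author, commit.committer, commit.date, len(commit.files))
```

Note that this simple separator would break on author names containing `|`; a real tool would need a safer delimiter.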


SCM: main metrics

Regarding the size of the project or community:

- Number of commits
- Number of committers/authors
- Number of files touched
- Number of lines touched
- Usual programming language used (based on the file path)
- Others...
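A minimal sketch of how these size metrics could be computed once commits are available as simple records. The record layout and the extension-to-language table are assumptions for this example; real tools rely on much longer tables.

```python
import os
from collections import Counter

# Hypothetical commit records: (author, [(path, lines_added, lines_removed), ...])
COMMITS = [
    ("alice", [("src/main.c", 10, 2), ("src/util.c", 5, 0)]),
    ("bob", [("src/main.c", 1, 1)]),
    ("alice", [("tools/build.py", 20, 0)]),
]

# Assumed mapping from file extension to programming language
LANGUAGES = {".c": "C", ".h": "C", ".py": "Python", ".java": "Java"}


def size_metrics(commits):
    authors, files, lines = set(), set(), 0
    languages = Counter()
    for author, changes in commits:
        authors.add(author)
        for path, added, removed in changes:
            files.add(path)
            lines += added + removed
            extension = os.path.splitext(path)[1]
            languages[LANGUAGES.get(extension, "unknown")] += 1
    return {
        "commits": len(commits),
        "authors": len(authors),
        "files_touched": len(files),
        "lines_touched": lines,
        # Language of most file changes, guessed from the file path only
        "main_language": languages.most_common(1)[0][0] if languages else None,
    }


metrics = size_metrics(COMMITS)
print(metrics)
```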


Workload adequacy:

- Average number of commits per committer/author
- Average number of files/lines touched by committer/author
- Territoriality: number of files handled by only one developer
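The following sketch shows one way to compute the average number of commits per author and the territorial files, assuming commits are available as (author, paths) pairs. The data is invented for illustration.

```python
from collections import defaultdict

# Hypothetical data: (author, [paths touched in that commit])
COMMITS = [
    ("alice", ["src/main.c", "src/util.c"]),
    ("bob", ["src/main.c"]),
    ("alice", ["doc/README"]),
    ("carol", ["po/es.po"]),
]


def commits_per_author(commits):
    """Average number of commits per author (assumes a non-empty list)."""
    authors = {author for author, _ in commits}
    return len(commits) / len(authors)


def territorial_files(commits):
    """Files that have been touched by exactly one developer."""
    developers_by_file = defaultdict(set)
    for author, paths in commits:
        for path in paths:
            developers_by_file[path].add(author)
    return sorted(p for p, devs in developers_by_file.items() if len(devs) == 1)


print(commits_per_author(COMMITS))
print(territorial_files(COMMITS))
```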


Distribution of effort:

- Distribution of commits per developer (generally following a 20%-80% distribution: a small share of the developers makes most of the commits)
- Distribution of modules or areas of the source code by developer
- Others...
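To check how close a project is to the 20%-80% pattern, one option is to measure the share of commits made by the most active fifth of the developers. This is a sketch under the assumption that the author of every commit is known; the sample project is invented.

```python
import math
from collections import Counter


def top_share(authors_of_commits, fraction=0.2):
    """Share of commits made by the most active `fraction` of developers.

    Assumes a non-empty list with one author name per commit.
    """
    counts = sorted(Counter(authors_of_commits).values(), reverse=True)
    top_n = max(1, math.ceil(len(counts) * fraction))
    return sum(counts[:top_n]) / sum(counts)


# Hypothetical project: 5 developers, 20 commits
authors = ["alice"] * 16 + ["bob", "carol", "dave", "erin"]
share = top_share(authors)
print(f"top 20% of developers made {share:.0%} of the commits")
```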


Social network analysis

- Creation of networks based on the type of action (e.g. two developers working on the same file, people working in the same programming language)
- Betweenness: interesting people with a high know-how of the community (usually found connecting two different networks of people)
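A possible sketch of both ideas: a network where two developers are linked when they changed the same file, and betweenness computed with Brandes' algorithm for unweighted graphs. The file-to-developers data is hypothetical, and the scores are left unnormalized.

```python
from collections import defaultdict, deque
from itertools import combinations

# Hypothetical data: file -> developers who changed it
FILE_AUTHORS = {
    "core/a.c": {"alice", "bob"},
    "core/b.c": {"bob", "carol"},
    "ui/x.c": {"carol", "dave"},
    "ui/y.c": {"dave", "erin"},
}


def build_network(file_authors):
    """Link two developers whenever they worked on the same file."""
    graph = defaultdict(set)
    for developers in file_authors.values():
        for a, b in combinations(sorted(developers), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def betweenness(graph):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        # Breadth-first search that counts shortest paths from `source`
        order, preds = [], defaultdict(list)
        sigma = dict.fromkeys(graph, 0)
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies, farthest nodes first
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected path was counted once from each endpoint
    return {v: s / 2 for v, s in score.items()}


network = build_network(FILE_AUTHORS)
scores = betweenness(network)
print(max(scores, key=scores.get), scores)
```

In this invented example the developers form a chain, so the person in the middle gets the highest score because every path between the two halves goes through them.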


Evolutionary studies

- Evolution in the number of new people coming to the community (regeneration of developers)
- Evolution in the number of fixing commits (data left by developers in the log message)
- Evolution in the number of commits (is the community growing in activity?)
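These three series can be sketched as monthly counters. The commit records below are invented, and the regular expression used to spot fixing commits is only an assumed heuristic; each project has its own conventions for log messages.

```python
import re
from collections import Counter

# Hypothetical commit records: (ISO date, author, log message)
COMMITS = [
    ("2011-09-03", "alice", "Initial import"),
    ("2011-09-20", "bob", "Fix bug #12 in the parser"),
    ("2011-10-01", "alice", "Add plugin API"),
    ("2011-10-15", "carol", "Fixes crash on empty input"),
    ("2011-11-02", "bob", "Update translations"),
]

# Assumed heuristic: a commit is "fixing" if its message mentions fix or bug
FIX_PATTERN = re.compile(r"\b(fix(es|ed)?|bug)\b", re.IGNORECASE)


def monthly_evolution(commits):
    seen = set()
    total, newcomers, fixes = Counter(), Counter(), Counter()
    for date, author, message in sorted(commits):
        month = date[:7]  # "YYYY-MM"
        total[month] += 1
        if author not in seen:
            # First commit ever made by this author
            seen.add(author)
            newcomers[month] += 1
        if FIX_PATTERN.search(message):
            fixes[month] += 1
    return total, newcomers, fixes


total, newcomers, fixes = monthly_evolution(COMMITS)
for month in sorted(total):
    print(month, total[month], newcomers[month], fixes[month])
```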


Mailing lists: main metrics

Size of the community:

- Number of unique posters in the mailing lists
- Number of users posting
- Number of developers posting
- Number of mailing lists (specific mailing lists for developers, users, per language, etc.)
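A small sketch of how posters could be counted from the `From:` headers of an archive. It assumes that developers can be told apart from users by matching addresses against those seen in the SCM, which is only an approximation since people often use several addresses. All headers and addresses are invented.

```python
from email.utils import parseaddr

# Hypothetical "From:" headers taken from a mailing list archive
FROM_HEADERS = [
    "Alice Example <alice@example.org>",
    "alice@example.org",
    "Bob <BOB@example.org>",
    "Carol User <carol@example.net>",
]

# Assumption: developers are the addresses that also appear in the SCM
COMMITTER_ADDRESSES = {"alice@example.org", "bob@example.org"}


def poster_counts(from_headers, committers):
    # parseaddr returns (real name, address); compare addresses in lowercase
    posters = {parseaddr(header)[1].lower() for header in from_headers}
    posters.discard("")  # headers that could not be parsed
    developers = posters & committers
    return {
        "posters": len(posters),
        "developers": len(developers),
        "users": len(posters - developers),
    }


counts = poster_counts(FROM_HEADERS, COMMITTER_ADDRESSES)
print(counts)
```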


Workload adequacy:

- How many developers are interacting with end users?
- Number of e-mails per developer / per user
- Number of e-mails per mailing list


Social network analysis:

- Similar to metrics found in the SCM
- Detection of important people not registered in the SCM (lawyers)


Evolutionary analysis:

- Evolution in the number of new people posting new e-mails
- Evolution in the general activity in the mailing lists


BTS: main metrics

Size metrics:

- Number of bugs
- Number of open bugs
- Number of closed bugs
- Number of developers fixing bugs
- Number of users reporting bugs
- Number of developers reporting bugs
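As an illustration, the size metrics above could be derived from exported bug records as follows. The record fields (`status`, `reporter`, `fixer`) are hypothetical, since every bug tracking system has its own schema, and developers are again assumed to be the accounts known from the SCM.

```python
# Hypothetical bug records exported from a bug tracking system
BUGS = [
    {"id": 1, "status": "closed", "reporter": "carol", "fixer": "alice"},
    {"id": 2, "status": "open", "reporter": "alice", "fixer": None},
    {"id": 3, "status": "closed", "reporter": "dave", "fixer": "bob"},
    {"id": 4, "status": "open", "reporter": "carol", "fixer": None},
]

# Assumption: developers are the accounts known from the SCM
DEVELOPERS = {"alice", "bob"}


def bts_size_metrics(bugs, developers):
    reporters = {bug["reporter"] for bug in bugs}
    fixers = {bug["fixer"] for bug in bugs if bug["fixer"]}
    return {
        "bugs": len(bugs),
        "open": sum(bug["status"] == "open" for bug in bugs),
        "closed": sum(bug["status"] == "closed" for bug in bugs),
        "developers_fixing": len(fixers),
        "developers_reporting": len(reporters & developers),
        "users_reporting": len(reporters - developers),
    }


bts_metrics = bts_size_metrics(BUGS, DEVELOPERS)
print(bts_metrics)
```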


Workload adequacy:

- Average number of bugs fixed per developer
- Average number of bugs remaining open per developer


Source code: main metrics

Size adequacy:

- Number of lines
- Number of files
- Types of programming languages
- Types of files (source code, translation, images, etc.)
- Number of lines per file
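A minimal sketch of these counts for a source tree, grouping files by extension as a stand-in for file type. The demonstration runs on a throwaway directory created for the example; line counts for binary files such as images are not meaningful and would need to be filtered out in practice.

```python
import os
import tempfile
from collections import Counter


def source_size(root):
    """Count files and lines per file extension under `root`."""
    files, lines = Counter(), Counter()
    for folder, _, names in os.walk(root):
        for name in names:
            extension = os.path.splitext(name)[1] or "(none)"
            files[extension] += 1
            # Read as bytes so that unusual encodings do not raise errors
            with open(os.path.join(folder, name), "rb") as handle:
                lines[extension] += sum(1 for _ in handle)
    return files, lines


# Small demonstration on a temporary directory
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "main.c"), "w") as f:
        f.write("int main(void) {\n    return 0;\n}\n")
    with open(os.path.join(root, "es.po"), "w") as f:
        f.write('msgid "hello"\nmsgstr "hola"\n')
    files, lines = source_size(root)

for extension in sorted(files):
    print(extension, files[extension], lines[extension] / files[extension])
```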


Static metrics:

- Fan-in/fan-out: fan-in is the number of functions or methods that call a given function or method; fan-out is the number of functions it calls (complexity, connascence).
- Length of code: size of the program, measured in SLOC (LOC or LLOC).
- Cyclomatic complexity: metric for the control complexity of software.
- Length of identifiers: average length of the identifiers used in a program. Supposedly, the longer the identifiers, the better for readability and maintenance of the code.
- Depth of conditional nesting: deeply nested statements are harder to grasp.
- Fog index: average length of words and sentences in documents.
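Two of these static metrics can be approximated for Python code with the standard `ast` module. This is a sketch, not a reference implementation: cyclomatic complexity is estimated as one plus the number of decision points, and the set of node types counted as decisions is a choice made for this example.

```python
import ast

# Node types counted as decision points (an assumption of this sketch)
BRANCHES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source):
    """Approximation: 1 + number of decision points in the code."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCHES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra paths
            complexity += len(node.values) - 1
    return complexity


def nesting_depth(node, depth=0):
    """Deepest nesting of if/for/while statements.

    Note: an elif is parsed as an if nested in the else branch,
    so elif chains count as extra depth here.
    """
    deepest = depth
    for child in ast.iter_child_nodes(node):
        nested = isinstance(child, (ast.If, ast.For, ast.While))
        deepest = max(deepest, nesting_depth(child, depth + 1 if nested else depth))
    return deepest


SAMPLE = '''
def classify(values):
    for v in values:
        if v > 0 and v % 2 == 0:
            print("positive even")
'''

print(cyclomatic_complexity(SAMPLE), nesting_depth(ast.parse(SAMPLE)))
```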


Dynamic metrics:

- Depth of inheritance tree: number of discrete levels in the tree of classes (OOP). The deeper the tree, the more complex the design.
- Method fan-in/fan-out: distinguishes the origin of calls to other methods (from within the object or from external methods).
- Weighted methods per class: number of methods included in a class, weighted by the complexity of each method. Overly complex classes are difficult to understand and maintain.
- Number of overriding operations: operations overridden in sub-classes. The higher this number, the less appropriate the super-class may be.
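For Python classes, some of these metrics can be approximated by inspecting the class definitions. The sketch below uses a small invented class hierarchy; the methods-per-class count is unweighted, that is, every method is assumed to have complexity 1.

```python
import inspect


# Hypothetical class hierarchy used only for illustration
class Shape:
    def area(self):
        return 0

    def describe(self):
        return "shape"


class Polygon(Shape):
    def sides(self):
        return 0


class Square(Polygon):
    def area(self):
        return 1

    def sides(self):
        return 4


def depth_of_inheritance(cls):
    """Longest path from cls up to a root class (object is not counted)."""
    parents = [base for base in cls.__bases__ if base is not object]
    if not parents:
        return 0
    return 1 + max(depth_of_inheritance(parent) for parent in parents)


def own_methods(cls):
    """Names of the methods defined directly in cls."""
    return {name for name, member in vars(cls).items() if inspect.isfunction(member)}


def overridden_methods(cls):
    """Methods defined in cls that replace a method of some ancestor."""
    inherited = set()
    for ancestor in cls.__mro__[1:]:
        if ancestor is not object:
            inherited |= own_methods(ancestor)
    return sorted(own_methods(cls) & inherited)


def methods_per_class(cls):
    """Unweighted variant: every method counts as complexity 1."""
    return len(own_methods(cls))


print(depth_of_inheritance(Square), overridden_methods(Square), methods_per_class(Square))
```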


Evolutionary studies:

- Evolution of the number of lines
- Clone detection (are parts of the source code being moved to other areas?)
- Evolution of the architecture
- Others...
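A very simple form of clone detection compares blocks of consecutive lines after normalizing whitespace. This is only a sketch of the idea: dedicated clone detectors work on tokens or syntax trees, and the window size and the sample files are assumptions made for this example.

```python
from collections import defaultdict


def find_clones(files, window=3):
    """Map each block of `window` normalized lines to the places where it occurs.

    Positions are indexes into the non-blank lines of each file.
    """
    seen = defaultdict(list)
    for name, text in files.items():
        # Collapse whitespace and drop blank lines before comparing
        lines = [" ".join(line.split()) for line in text.splitlines()]
        lines = [line for line in lines if line]
        for i in range(len(lines) - window + 1):
            seen[tuple(lines[i:i + window])].append((name, i))
    return {block: places for block, places in seen.items() if len(places) > 1}


# Hypothetical file contents
FILES = {
    "a.c": "int x = 0;\nx += 1;\nreturn x;\n",
    "b.c": "void f() {\n  int x = 0;\n  x  +=  1;\n  return x;\n}\n",
}

clones = find_clones(FILES)
for block, places in clones.items():
    print(places, block)
```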


Using metrics

Metrics provide objective results; however, general conclusions still have to be inferred from them.

Thus, human interpretation is needed.

Benchmarks could be created in order to have a comparison model.

With such a benchmark, you will be able to compare the current situation of the assessed project with others.
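One simple way to use such a benchmark is to locate the assessed project within the values observed in comparable projects. The sketch below computes a percentile rank; the benchmark values are invented, and the resulting number still needs human interpretation.

```python
def percentile_rank(value, benchmark):
    """Percentage of benchmark projects whose value is at or below `value`."""
    return 100 * sum(v <= value for v in benchmark) / len(benchmark)


# Hypothetical benchmark: commits per month in comparable projects
BENCHMARK = [12, 40, 55, 80, 120, 300, 450, 900]

rank = percentile_rank(100, BENCHMARK)
print(f"the assessed project is at the {rank:.0f}th percentile")
```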


References

- Producing Open Source Software, by Karl Fogel
- Tools and datasets for mining libre software repositories, by Gregorio Robles, Jesús M. Gonzalez-Barahona, Daniel Izquierdo-Cortazar and Israel Herraiz
- Metrics and Models in Software Quality Engineering, by Stephen H. Kan
