OpenOffice++: Improving the Quality of Open Source Software

Rudolf Ferenc, PhD; Árpád Beszédes, PhD; Tibor Gyimóthy, PhD

University of Szeged, FrontEndART Ltd., Hungary

Zsolt Bagoly, PhD; Dániel Darabos

MultiRacio Ltd., Hungary

GVOP-3.1.1.-2004-05-0345/3.0

Agenda

Introduction

The OpenOffice++ R&D project

Source code quality assurance today

Goal of the project

Columbus quality assurance tools

Tasks of the project

Results

Our previous open source activities

The official code-size and performance benchmark of GCC

http://www.inf.u-szeged.hu/csibe/

http://gcc.gnu.org/benchmarks/

Various compiler optimizations in GCC

http://www.inf.u-szeged.hu/gcc-arm/

http://www.inf.u-szeged.hu/symbian-gcc/

Monitoring the quality of Mozilla

http://www.frontendart.com/momo.php

Improvement of the JFFS2 filesystem

http://www.inf.u-szeged.hu/jffs2/

The OpenOffice++ R&D project

OpenOffice++: Improving the Quality of Open Source Software

Funded by EU-supported Hungarian national grant no. GVOP-3.1.1.-2004-05-0345/3.0 and by MultiRacio Ltd.

Participants

MultiRacio Ltd., Hungary

Department of Software Engineering,
University of Szeged, Hungary

Size & duration

0.54 M€ (0.4 M€ grant and 0.14 M€ by MultiRacio)

2004-11-01 to 2006-12-31

Similar projects

Software Quality Observatory for Open Source Software (SQO-OSS)

2.5 M€ EU FP6 fund (http://www.sqo-oss.eu)

Quality of Open Source Software (QUALOSS)

3 M€ EU FP6 fund (http://www.cetic.be)

Free / libre / open source software metrics and benchmarking study (FLOSSMETRICS)

0.58 M€ EU fund (http://www.flossmetrics.org/)

Vulnerability Discovery and Remediation, Open Source Hardening project

1.24 M$ US Department of Homeland Security fund

Quality decrease

Idealized and actual failure curves for software

Quality assurance today

In most cases today, quality assurance of source code means testing!

That's good!

But it is not enough!

See e.g. the ISO/IEC 9126 & CMMI standards

Tool-supported source code analysis is needed!

The quality assurance of OO.o source code is fundamental!

Goal of the OpenOffice++ project

Develop methods and tools for the quality assurance of OO.o C++ source code

Adapt the Columbus software maintenance methodology and tool set

FrontEndART Ltd. (spin-off company of the university)

Originally an R&D project supported by Nokia

Partners: Nokia, evoSoft, GraphiSoft, Nuance-Recognita,

Analyze the source code of OO.o

Static source code analysis

Fix the source code

Manual work

Columbus quality assurance tools

Robust static source code analyzers

C, C++, Java, C# languages

Build models of the source code (abstract semantic graphs, flow graphs) that conform to well-defined metamodels

Programming interface (API) in C++

Enables whole-system analysis

Source code analysis methodologies

Integration into compilation environments

Compiler wrapping

Analysis performed transparently in parallel with compilation
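
Compiler wrapping can be pictured as follows: the wrapper sees each compiler invocation and derives a matching analyzer invocation from it. This is a minimal sketch only. The executable name "analyzer" and the choice of which flags to forward are assumptions made for illustration, not the actual Columbus command line.

```cpp
#include <string>
#include <vector>

// True if 'text' ends with 'suffix'.
static bool endsWith(const std::string& text, const std::string& suffix) {
    return text.size() >= suffix.size() &&
           text.compare(text.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Builds an analyzer command line from the arguments of one compiler call.
// "analyzer" is a placeholder name (assumption), not a real tool name.
// Only joined flags such as -Ipath and -DNAME are recognized; the separated
// form "-I path" is not handled in this sketch.
std::vector<std::string> analyzerCommand(const std::vector<std::string>& compilerArgs) {
    std::vector<std::string> cmd{"analyzer"};
    for (const std::string& arg : compilerArgs) {
        // rfind(prefix, 0) == 0 is the usual pre-C++20 "starts with" test.
        bool isIncludeOrDefine = arg.rfind("-I", 0) == 0 || arg.rfind("-D", 0) == 0;
        bool isSource = endsWith(arg, ".cxx") || endsWith(arg, ".cpp");
        if (isIncludeOrDefine || isSource) {
            cmd.push_back(arg);  // forward what the analyzer needs to parse the file
        }
    }
    return cmd;
}
```

A real wrapper would then run the original compiler command unchanged and start the analyzer command next to it, so the build itself is not affected.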

http://www.frontendart.com

Reverse engineering

Creation of high-level documents from code

UML diagrams, XML, HTML, design pattern detection, call-graph, etc.

Source code auditing

Calculating metrics, detecting bugs & bad smells, checking conformance to coding guidelines

Monitoring system

Analysis results are continuously stored in a SQL database

Web browser access to the data through diagrams, charts and reports

Tasks of the project

Steps taken during the OOo++ project

1. Set up software quality indicators

62-page study (in Hungarian)

Tibor Gyimóthy, Rudolf Ferenc and István Siket.
Empirical validation of object-oriented metrics on open source software for fault prediction.
IEEE Transactions on Software Engineering, 31(10):897-910, 2005.

Main observations

Some metrics are good indicators of software quality

Some metrics are good predictors of bugs!

Tasks of the project - Metrics

2. Calculate Metrics

"You can't manage what you can't control, and you can't control what you don't measure."
(Tom DeMarco)

OpenOffice.org 2.0.4 source code

More than a decade of development

5 million lines of code

27 thousand classes

330 thousand functions

Tasks of the project - Bad smells

3. Detect bad smells

Bad smells indicate design problems and degrade software quality and maintainability

Bad smells in the code can be automatically detected

Detection was enhanced by machine learning

Machine learning database

Manual evaluation of the detected bad smells

We analyzed and categorized 6238 bad smells

Manual work by MultiRacio, which has 5 years of experience with the OO.o source code

We enhanced the detection algorithm with machine learning
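
To make the idea of automatic detection concrete, here is a minimal threshold-based sketch for two of the smell types, LongFunction and LongParameterList. It covers only the plain metric check, not the machine-learning refinement. The struct, the function name and both threshold values are assumptions chosen for illustration, not the values used in the project.

```cpp
#include <string>
#include <vector>

// Size metrics for one function, as a static analyzer might report them.
struct FunctionMetrics {
    std::string name;
    int lines;       // length of the function body in lines
    int parameters;  // number of formal parameters
};

// Hypothetical thresholds (assumption); real values would be tuned per project.
constexpr int kMaxLines = 100;
constexpr int kMaxParameters = 6;

// Returns the names of the smells that apply to one function.
std::vector<std::string> detectSmells(const FunctionMetrics& f) {
    std::vector<std::string> smells;
    if (f.lines > kMaxLines) {
        smells.push_back("LongFunction");
    }
    if (f.parameters > kMaxParameters) {
        smells.push_back("LongParameterList");
    }
    return smells;
}
```

A fixed threshold reports every candidate, including ones a maintainer would consider harmless. That is where the manual categorization into real and suspicious hits, and a classifier trained on it, can help.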

Tasks of the project - Bad smells

4. Identify coding errors

Detect coding constructs leading to errors

We manually analyzed the OO.o issue database

We looked for bugfixes from the past to find repeatedly occurring problems (4605 issues!)

We created new OO.o-specific rules

Tasks of the project - Bug detection

Memory problems

Dangerous constructs
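
Two of the memory-related rules in the rule list of this deck (release arrays of objects with 'delete[]', and give base classes a virtual destructor) can be illustrated with a short sketch. The class names and the destructor counter are made up for the example and do not come from the OO.o code base.

```cpp
#include <cstddef>

// Counts Circle destructor calls so the effect of each rule can be observed.
static int g_circlesDestroyed = 0;

class Shape {
public:
    // Virtual destructor: deleting through a Shape* also runs ~Circle.
    virtual ~Shape() {}
};

class Circle : public Shape {
public:
    ~Circle() override { ++g_circlesDestroyed; }
};

// Allocates an array of objects and releases it with delete[].
// Returns how many Circle destructors ran.
int destroyArray(std::size_t n) {
    int before = g_circlesDestroyed;
    Circle* circles = new Circle[n];
    delete[] circles;  // 'delete circles;' would be undefined behavior
    return g_circlesDestroyed - before;
}

// Deletes a derived object through a base-class pointer.
// Returns how many Circle destructors ran.
int destroyThroughBase() {
    int before = g_circlesDestroyed;
    Shape* shape = new Circle;
    delete shape;      // well-defined only because ~Shape is virtual
    return g_circlesDestroyed - before;
}
```

Without the virtual destructor, or with plain 'delete' on the array, the behavior is undefined. In practice this often shows up as leaked resources or memory corruption.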

Tasks of the project - Bug detection

Examples of serious problems
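
The original slide examples are not reproduced in this text, so here is a hedged sketch of code that follows two of the listed rules: initializer lists written in declaration order, and every 'case' branch ending in a jump statement. The class and enum names are invented for illustration.

```cpp
#include <string>

// Members are always initialized in declaration order, whatever order the
// initializer list is written in. Writing the list in the same order keeps
// the code consistent with what actually happens.
class Buffer {
public:
    explicit Buffer(int size)
        : size_(size),             // declared first, initialized first
          capacity_(size_ * 2) {}  // safe: size_ already has its value
    int size() const { return size_; }
    int capacity() const { return capacity_; }
private:
    int size_;
    int capacity_;
};

enum class Align { Left, Center, Right };

// Every nonempty case ends with 'break', including the last one, so no
// branch falls through into the next by accident.
std::string alignName(Align a) {
    std::string name;
    switch (a) {
        case Align::Left:
            name = "left";
            break;
        case Align::Center:
            name = "center";
            break;
        case Align::Right:
            name = "right";
            break;
    }
    return name;
}
```

If capacity_ were declared before size_, the same constructor would read size_ before it is initialized, which is the kind of defect the initializer-order rule is meant to catch.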

Tasks of the project - Evaluation

5. Evaluate the quality of OpenOffice.org source code

50-page study

The complexity of the code is surprisingly low, but

the high coupling between the classes may cause problems

Complexity index

Coupling index
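
The deck does not define how the coupling index is computed. As an illustration only, the sketch below counts, for one class, the distinct other classes it uses or is used by, in the spirit of the common "coupling between objects" metric. The graph representation and the function name are assumptions.

```cpp
#include <map>
#include <set>
#include <string>

// Maps each class to the set of classes it uses (calls, fields, parameters).
using DependencyGraph = std::map<std::string, std::set<std::string>>;

// Counts the distinct other classes that 'cls' is coupled to, in either
// direction: classes it uses plus classes that use it.
int couplingOf(const DependencyGraph& uses, const std::string& cls) {
    std::set<std::string> coupled;
    auto it = uses.find(cls);
    if (it != uses.end()) {
        coupled.insert(it->second.begin(), it->second.end());  // outgoing
    }
    for (const auto& entry : uses) {
        if (entry.second.count(cls) > 0) {
            coupled.insert(entry.first);                        // incoming
        }
    }
    coupled.erase(cls);  // a class is not counted as coupled to itself
    return static_cast<int>(coupled.size());
}
```

A class with a high count is affected by changes in many places, which is why high coupling is treated as a maintainability risk.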

Tasks of the project - Evaluation

We compared the source code quality of the main versions of OpenOffice.org during the project

1.1.4 / 1.1.5 / 2.0.0 / 2.0.1 / 2.0.2 / 2.0.3 / 2.0.4

The size increased a lot

Unfortunately, the number of problems in the code increased as well

Tasks of the project - Evaluation

The most problematic classes are the following

Tasks of the project - Bugfixing

6. Perform bugfixes

Phase I. (before machine learning)

Based on bad smells, 55 bugfixes in 38 files

Phase II. (improved algorithm based on machine learning)

Fixing the most problematic code segments

129 bugfixes in 58 files

Altogether 184 bugfixes

Tasks of the project - Bugfixing

The bugfixes were prepared for OpenOffice.org versions 1.1.5 to 2.0

Of the 184 bugfixes, only 9 were not compatible with the OpenOffice.org 2.1.0 code

5 of these 9 had already been fixed in the OpenOffice.org code; at the remaining 4 positions the code had changed significantly

We submitted the bugfixes (patches) to the OO.o developer community and most of them were accepted

The results of the analysis are open and publicly available

http://oopp.multiracio.com

http://www.frontendart.com/ooomo.php

We continue the analysis, and welcome your support! :-)

Questions?

Links

http://oopp.multiracio.com

http://www.inf.u-szeged.hu/opensource/

http://www.multiracio.com

http://www.frontendart.com

http://www.frontendart.com/ooomo.php

Detected bad smells by type

Type                   Occurrences  Files  Modules  Real hits  Suspicious
DataClass                      681    479       36        278         403
InappropriateIntimacy          484    458       55        181         303
LargeClassCode                 339    337       43        145         194
LargeClassData                 428    356       27        166         262
LargeClassObject               319    293       38        106         213
LazyClass                     1433    929       61        637         796
LongFunction                   610    477       63        203         407
LongParameterList              405    251       51        114         291
MiddleMan                      315    280       42        138         177
ShotgunSurgery                 644    290       37        214         430
SpeculativeGenerality          580    212       38        160         420
Altogether                    6238   4362      169       2342        3896

ID       No.  Description
GEN1003   21  Allocated arrays of objects created with 'new' should be released with 'delete[]'
GEN1004  273  Classes that use dynamically allocated memory should have a copy constructor, an assignment operator and a destructor
GEN1007  632  Elements of the constructor initializer list should be listed in the same order as they were declared
GEN1008  246  Destructor functions should be virtual in the case of base classes
GEN1023  214  Returning pointers to class internals like private member variables should be avoided
GEN7033  580  Each nonempty 'case' branch should end in a jump statement or a comment
GEN7034  435  The last 'case' label of a switch instruction should end with a 'break', 'return', 'continue' or 'throw'
