
[IEEE 11th European Conference on Software Maintenance and Reengineering (CSMR'07) - Amsterdam, The Netherlands (2007.03.21-2007.03.23)] 11th European Conference on Software Maintenance


Restructuring Variability in Software Product Lines using Concept Analysis of Product Configurations

Felix Loesch
Robert Bosch GmbH, Corporate Research CR/AEY
P.O. Box 300240, 70442 Stuttgart, Germany
[email protected]

Erhard Ploedereder
University of Stuttgart, Institute of Software Technology
Universitaetsstraße 38, 70569 Stuttgart, Germany
[email protected]

Abstract

The management of variability plays an important role in successful software product line engineering. As the set of products that is derived from the product line and their requirements are constantly changing, the variability in the product line needs to evolve as well. A typical problem in such an evolution scenario is that the number of variable features and variants will explode, and thus become unmanageable. One of the reasons for this explosion is that obsolete variable features are not removed. In order to address this problem, we present a new method for restructuring and simplifying the provided variability in a software product line. Our method is based on concept analysis. It analyzes the realized variability in a software product line, and constructs a lattice that provides a classification of the usage of variable features in real products derived from the product line. We show how this classification can be used to derive restructuring strategies for variability that solve the problem of variability explosion. The effectiveness of our method is demonstrated by presenting a case study of restructuring the variability in a large industrial software product line.

1 Introduction

Managing the variability of product line components that can be used in a large number of different products is hard because it is almost impossible to anticipate how these components will be used in products derived from the product line. In order to allow these components to be configured for as many products as possible, they are typically implemented using variable features, i.e., features that may or may not be included in configurations of the components. These variable features allow a product line engineer to configure the components for the requirements of different products. However, ongoing evolution of product line components due to changing customer requirements and advances in technology will increase the number of variable features. As a result, the number of variants that can be derived by including or excluding variable features in a product will explode, and thus become unmanageable. One of the reasons for this explosion is that obsolete variable features are not removed. For example, from a product line component offering only 10 variable features it is theoretically possible to derive $2^{10} = 1024$ different variants. This simple example shows that even a small number of variable features results in a combinatorial explosion of variants. Clearly, methods are needed that keep the number of variable features to the minimum needed to derive the different products from the product line [5].

In order to address this problem, we present a new method to restructure and simplify the provided variability in a software product line. Our method is based on concept analysis. First, a table is constructed that reflects the usage of variable features in product configurations. From the table, a concept lattice is derived that factors out which variable features are commonly used in product configurations and which variable features only appear in specific product configurations. The main contributions of our method are:

• An automated classification of the usage of variable features in product configurations based on the concept lattice (Section 3.3).

• A set of restructuring strategies to minimize the number of variable features based on this classification (Section 3.5).

We demonstrate that our method helps to solve the problem of variability explosion by providing a good starting point for interactive restructuring tools to minimize the number of variable features necessary to derive different products.

The remainder of this paper is organized as follows. Section 2 reviews the relevant parts of the theory of concept analysis. Section 3 explains how concept analysis is used to classify the usage of variable features in product configurations and to restructure variability. Section 4 presents a case study demonstrating the effectiveness of our method. Section 5 discusses related research. Finally, conclusions and future work are presented in Section 6.

11th European Conference on Software Maintenance and Reengineering (CSMR'07) 0-7695-2802-3/07 $20.00 © 2007

2 Formal Concept Analysis

Formal Concept Analysis (FCA) [8] is a mathematical method that provides a way to identify meaningful groupings of objects that have common attributes.

The starting point for concept analysis is a context $C = (\mathcal{O}, \mathcal{A}, T)$, consisting of a set of objects $\mathcal{O}$, a set of attributes $\mathcal{A}$, and a relation table $T$, stating which attributes are possessed by each object. For any set of objects $O \subseteq \mathcal{O}$, the set of common attributes is defined as $\sigma(O) = \{a \in \mathcal{A} \mid \forall o \in O : (o, a) \in T\}$. Similarly, for any set of attributes $A \subseteq \mathcal{A}$, the set of common objects is defined as $\tau(A) = \{o \in \mathcal{O} \mid \forall a \in A : (o, a) \in T\}$.

A tuple $c = (O, A)$ is called a concept if $A = \sigma(O)$ and $O = \tau(A)$, that is, $c$ is a maximal collection of objects that share a common set of attributes. A concept $c = (O, A)$ has extent $ext(c) = O$ and intent $int(c) = A$. Informally, a concept corresponds to a maximal rectangle of filled table cells in $T$ modulo column and row permutations.
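The definitions of σ, τ, and concepts can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the context (`TABLE`, `ATTRIBUTES`) and its object and attribute names are hypothetical, and the brute-force enumeration is exponential in the number of attributes, so it only suits tiny examples.

```python
from itertools import combinations

# Hypothetical context: each object maps to the attributes it possesses.
TABLE = {
    "o1": {"a", "b"},
    "o2": {"a", "c"},
    "o3": {"a", "b", "c"},
}
ATTRIBUTES = {"a", "b", "c", "d"}  # "d" is possessed by no object


def sigma(objects):
    """Common attributes of a set of objects."""
    return frozenset(a for a in ATTRIBUTES if all(a in TABLE[o] for o in objects))


def tau(attributes):
    """Common objects of a set of attributes."""
    return frozenset(o for o in TABLE if all(a in TABLE[o] for a in attributes))


def concepts():
    """Enumerate all concepts (O, A) by closing every attribute subset."""
    found = set()
    attrs = sorted(ATTRIBUTES)
    for size in range(len(attrs) + 1):
        for subset in combinations(attrs, size):
            extent = tau(subset)
            found.add((extent, sigma(extent)))
    return found


for extent, intent in sorted(concepts(), key=lambda c: -len(c[0])):
    print(sorted(extent), sorted(intent))
```

Every pair produced this way satisfies A = σ(O) and O = τ(A), because τ of any attribute set is already a closed extent.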

The set of all concepts that can be derived from the formal context forms a complete partial order, called the concept lattice $L(C)$. We define that a concept $(O_1, A_1)$ is a subconcept of a concept $(O_2, A_2)$, denoted by $(O_1, A_1) \le (O_2, A_2)$, if $O_1 \subseteq O_2$ (or, equivalently, $A_1 \supseteq A_2$). Conversely, the concept $(O_2, A_2)$ is a superconcept of $(O_1, A_1)$.

For two concepts $(O_1, A_1)$ and $(O_2, A_2)$ in the concept lattice, their infimum ($\sqcap$) or meet is defined as

\[ (O_1, A_1) \sqcap (O_2, A_2) = (O_1 \cap O_2,\ \sigma(O_1 \cap O_2)) \tag{1} \]

and their supremum ($\sqcup$) or join as

\[ (O_1, A_1) \sqcup (O_2, A_2) = (\tau(A_1 \cap A_2),\ A_1 \cap A_2) \tag{2} \]
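Equations (1) and (2) translate directly into code. The sketch below assumes a small hypothetical context and represents a concept as a pair of frozensets (extent, intent); the function names `meet` and `join` are illustrative.

```python
# Hypothetical context: each object maps to the attributes it possesses.
TABLE = {"o1": {"a", "b"}, "o2": {"a", "c"}, "o3": {"a", "b", "c"}}
ATTRIBUTES = {"a", "b", "c"}


def sigma(objects):
    """Common attributes of a set of objects."""
    return frozenset(a for a in ATTRIBUTES if all(a in TABLE[o] for o in objects))


def tau(attributes):
    """Common objects of a set of attributes."""
    return frozenset(o for o in TABLE if all(a in TABLE[o] for a in attributes))


def meet(c1, c2):
    """Infimum per equation (1): intersect the extents, then derive the intent."""
    common_objects = c1[0] & c2[0]
    return (common_objects, sigma(common_objects))


def join(c1, c2):
    """Supremum per equation (2): intersect the intents, then derive the extent."""
    common_attributes = c1[1] & c2[1]
    return (tau(common_attributes), common_attributes)


# Two concepts of the hypothetical context.
c_b = (tau({"b"}), sigma(tau({"b"})))  # ({o1, o3}, {a, b})
c_c = (tau({"c"}), sigma(tau({"c"})))  # ({o2, o3}, {a, c})
print(meet(c_b, c_c))
print(join(c_b, c_c))
```

Here the meet is the concept of the single object that has both b and c, and the join is the concept of the one attribute shared by both intents.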

In our figures, a concept $c$ is labeled with attribute $a \in \mathcal{A}$ if it is the largest concept with $a$ in its intent, and it is labeled with an object $o \in \mathcal{O}$ if it is the smallest concept with $o$ in its extent. The unique concept labeled with $a$ is denoted

\[ \mu(a) = \bigsqcup \{c \in L(C) \mid a \in int(c)\} \tag{3} \]

and the unique concept labeled with $o$ is denoted

\[ \gamma(o) = \bigsqcap \{c \in L(C) \mid o \in ext(c)\} \tag{4} \]

Hence, the attributes of object $o$ are those which appear at $\gamma(o)$ and above, and all objects that appear at and below $\mu(a)$ have attribute $a$. Consequently, suprema in the lattice indicate that certain objects have attributes in common, while infima show that certain attributes fit to common objects.

male female tall short intelligent
Paul x x x
George x x
Jenny x x
Michael x x x
Elisabeth x x

Figure 1. Relation table and concept lattice.

Figure 1 shows the relation table and associated concept

lattice for an example of properties of persons. The concept

#9 labeled with male is the supremum of all persons that

have male in their intent: Michael, Paul, and George are

below male in the lattice and the table indicates that these

(and no other) persons are indeed male.

A table and a lattice are alternate views on the same information but they provide different insights. There is yet another view: a set of implications. For any two attribute sets $A, B \subseteq \mathcal{A}$, we say $A$ implies $B$ (written $A \to B$), iff any object $o$ having all attributes in $A$ also has all attributes in $B$:

\[ A \to B \iff \forall o \in \mathcal{O} : (\forall a \in A : (o, a) \in T) \Rightarrow (\forall b \in B : (o, b) \in T) \]

For $B = \{b_1, \ldots, b_k\}$, $A \to B$ holds iff $A \to \{b_i\}$ for all $b_i \in B$. Implications show up in the lattice as follows: $A \to b$ holds iff $\bigsqcap \{\mu(a) \mid a \in A\} \le \mu(b)$. Informally, implications between attributes can be found along upward paths in the lattice. In the example of Figure 1, we have that $\mu(\text{intelligent}) \le \mu(\text{short})$, which can be read as intelligent → short, or "An intelligent person is also short". This implication shows that concept analysis operating on incomplete information needs human judgment to verify or reject derivative information. Often, some implications are known to hold a priori. Such background knowledge can easily be integrated into a given table. An implication $x \to y$ can be enforced by adding the entries in the $x$ column to the $y$ column, and will cause $\mu(x) \le \mu(y)$ in $L(C)$. For more details, we refer to Ganter and Wille [8].
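Both the implication test and the integration of background knowledge can be checked directly on the relation table. The following sketch assumes a hypothetical table that maps each object to its attribute set; `implies` and `enforce` are illustrative names, not functions of any FCA tool.

```python
def implies(table, premise, conclusion):
    """A -> B holds iff every object that has all of A also has all of B."""
    return all(conclusion <= attrs for attrs in table.values() if premise <= attrs)


def enforce(table, x, y):
    """Return a copy of the table in which x -> y holds.

    The entries of the x column are added to the y column.
    """
    return {
        obj: (attrs | {y}) if x in attrs else set(attrs)
        for obj, attrs in table.items()
    }


# Hypothetical relation table.
table = {"p1": {"x", "y"}, "p2": {"x"}, "p3": {"y", "z"}}

print(implies(table, {"x"}, {"y"}))                     # p2 has x but not y
print(implies(enforce(table, "x", "y"), {"x"}, {"y"}))  # holds after enforcing
```

An implication whose premise is satisfied by no object holds vacuously, which is one reason derived implications need review by an analyst.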


3 Using FCA for Restructuring Variability

In order to analyze the usage of variable features in product configurations via concept analysis, we first need to define the formal context (objects, attributes, relation table) and then interpret the resulting concept lattice accordingly.

3.1 Formal Context

Roughly speaking, the objects and attributes in our domain are product configurations and variable features, respectively, and the table that is constructed identifies for each product configuration which variable features it includes. Before we can define the objects and attributes more precisely, we need to introduce some terminology.

In what follows, S denotes a product line component, package, or subsystem. Thus, S is a generic parameter to our method. The reengineer can select the granularity of S depending on the number of variable features expected in S. Further, P denotes a product configuration for the analyzed product line.

A product configuration P consists of so-called configuration rules, each of which consists of the following three parts: scope, path, configuration label. For example, the following configuration rule

element \subsys B_SubSys_Comp_FeatureA

has scope = element, path = subsys and configuration label = B_SubSys_Comp_FeatureA. In the following definitions, BelongsTo(f, S) holds if file f belongs to S, and AttachedTo(l, f) holds if configuration label l is attached to file f.

A variable feature can be an optional, alternative, optional alternative, or or-feature [4]. In our case, all types of variable features are realized by configuration labels l. Thus a variable feature will be included in a product by including the corresponding configuration label l in the product configuration. As the CM system does not provide semantics for configuration labels, the different types of variable features are not distinguished in the CM system.

The attributes of our domain are variable features offered by S. As variable features are represented by configuration labels, the variable features offered by S are all configuration labels that are attached to a file that belongs to S.

Definition 1. Let S be a component, package, or subsystem. Then, we define the set of offered features as follows:

\[ \mathit{OfferedFeatures}(S) := \{\, l \mid \mathit{AttachedTo}(l, f) \wedge \mathit{BelongsTo}(f, S) \,\} \]

The objects of our domain are product configurations P in which the analyzed component, package, or subsystem S is used. As other subsystems, packages, and components are configured in P as well, we need to determine the features of S used in P.

Definition 2. The set of product configurations using S and the set of used features of S in P are defined as follows:

\[ \mathit{Configurations}(S) := \{\, P \mid P \text{ is a configuration using } S \,\} \]

\[ \mathit{UsedFeatures}(S, P) := \{\, l \mid l \in \mathit{OfferedFeatures}(S) \wedge l \in P \,\} \]

Using these definitions we can now define the formal context for our domain:

Definition 3. The formal context $C = (\mathcal{O}, \mathcal{A}, T)$ for concept analysis is defined as follows:

\[ C = (\mathit{Configurations}(S),\ \mathit{OfferedFeatures}(S),\ T), \qquad (P, l) \in T \iff l \in \mathit{UsedFeatures}(S, P) \]

Before we can obtain the relation table T , we first need

to determine OfferedFeatures(S) using one of the following

alternatives:

1. existing documentation. If the features of S have been

completely documented by the product line engineers,

then we can directly obtain OfferedFeatures(S) from

the documentation.

2. configuration management. If the features of S are

not documented, we can obtain OfferedFeatures(S) by

extracting all configuration labels that are attached to

files belonging to S from configuration management.

3. product configurations. If the features of S are not

documented and we have no access to the information

in the CM system, we can obtain OfferedFeatures(S)

by analyzing a large number of product configurations

P . If the path of a configuration rule in P matches

the path of S, then the configuration label represents a

feature offered by S.

Ideally, one will use alternative (1) if reliable and complete

documentation exists. Alternative (2) can be chosen if the

features are not documented but we can easily determine the

files belonging to S. In all other cases, one will fall back on

alternative (3). However, using alternative (3) we cannot

identify unused variable features as these features will not

show up in the product configurations.

In order to obtain the entries in the relation table T, we analyze a set of product configurations P that use some of the variable features offered by S. Ideally, the product configurations should use different subsets of OfferedFeatures(S). To obtain UsedFeatures(S, P), we parse the configuration rules in P and determine for each configuration rule if the configuration label l referenced by the configuration rule is in the set of OfferedFeatures(S). If it is in the set, we mark the entry (P, l) in T.
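The construction of the relation table can be sketched as follows. The sketch assumes that every configuration rule is a single line with three whitespace-separated fields (scope, path, configuration label), as in the example rule of this section; the configuration names and all labels other than B_SubSys_Comp_FeatureA are hypothetical.

```python
def parse_rule(line):
    """Split a configuration rule into (scope, path, label).

    Assumes three whitespace-separated fields per rule.
    """
    scope, path, label = line.split()
    return scope, path, label


def used_features(rules, offered):
    """UsedFeatures(S, P): labels referenced in P that are offered by S."""
    return {label for _, _, label in map(parse_rule, rules) if label in offered}


def relation_table(configurations, offered):
    """Map each product configuration to its row in the relation table T."""
    return {name: used_features(rules, offered) for name, rules in configurations.items()}


# Hypothetical input data.
OFFERED = {"B_SubSys_Comp_FeatureA", "B_SubSys_Comp_FeatureB"}
CONFIGURATIONS = {
    "P1": [r"element \subsys B_SubSys_Comp_FeatureA", r"element \other B_Other_FeatureX"],
    "P2": [r"element \subsys B_SubSys_Comp_FeatureA", r"element \subsys B_SubSys_Comp_FeatureB"],
}

for name, row in relation_table(CONFIGURATIONS, OFFERED).items():
    print(name, sorted(row))
```

Labels that belong to other subsystems are dropped because they are not in the offered set, which mirrors the membership test described above.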

3.2 Interpretation of the Concept Lattice

Concept analysis applied to the formal context described in the last section yields a lattice from which interesting facts about the relationships of variable features and product configurations can be derived. The following relationships can be derived fully automatically from the sparse representation of the lattice and presented to the analyst:

• The number of concepts in the lattice that have at least one variable feature attached represents the minimum number of variation points needed to derive the products described by the analyzed product configurations.

• A product configuration, $p$, contains all variable features at and above $\gamma(p)$ in the lattice.

• A variable feature, $l$, is used in all product configurations at and below $\mu(l)$ in the lattice.

• Variable features commonly used in a set of product configurations $\{p_1, p_2, \ldots, p_n\}$ can be identified at the supremum, i.e., the closest common superconcept of all $\gamma(p_i)$.

• Product configurations using a set of variable features $\{l_1, l_2, \ldots, l_n\}$ can be identified at the infimum, i.e., the closest common subconcept of all $\mu(l_i)$.

• A variable feature, $l$, is specific to exactly one product configuration, $p$, if $p$ is the only product configuration on all paths from $\mu(l)$ to the bottom element, i.e., $p$ is the only product configuration using variable feature $l$.

• If $\mu(l_1) < \mu(l_2)$ holds for two variable features $l_1$ and $l_2$, then $l_1$ is more specific than $l_2$ with respect to the given product configurations. Thus, the more general variable features can be found at the higher concepts in the lattice since they are used in many product configurations, while more specific variable features are in the lower region of the lattice.

• If $\gamma(p_1) < \gamma(p_2)$ holds for two product configurations $p_1$ and $p_2$, then product configuration $p_1$ is based on $p_2$ because $p_1$ includes all features in the intent of $\gamma(p_2)$. Thus, the lower a product configuration in a lattice, the more capable it is because it uses more features.
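The last two relationships can be tested without building the lattice, because the extent of μ(l) is the set of configurations using l and the intent of γ(p) is the set of features used by p. The sketch below relies on that observation; the table and the function names are hypothetical.

```python
# Hypothetical relation table: product configuration -> used variable features.
TABLE = {
    "P1": {"f1"},
    "P2": {"f1", "f2"},
    "P3": {"f1", "f2", "f3"},
}


def users(table, feature):
    """Extent of mu(feature): all configurations that use the feature."""
    return {p for p, features in table.items() if feature in features}


def more_specific(table, l1, l2):
    """mu(l1) < mu(l2): l1 is used in a proper subset of the users of l2."""
    return users(table, l1) < users(table, l2)


def based_on(table, p1, p2):
    """gamma(p1) < gamma(p2): p1 uses every feature of p2 and at least one more."""
    return table[p2] < table[p1]


print(more_specific(TABLE, "f3", "f1"))
print(based_on(TABLE, "P3", "P1"))
```

Python's `<` on sets is the proper-subset test, which matches the strict order used in the two bullets.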

3.3 Classification of Feature Usage

The interpretation of the concept lattice as described above gives insights into the relationships between variable features and product configurations. However, the analyst is primarily interested in restructuring and simplifying the provided variability of the component, package, or subsystem S. This section describes how the lattice can be used to automatically classify the usage of variable features and how we can derive strategies for restructuring and simplifying the existing variability of S using this classification.

Using the sparse representation of the lattice, we can classify the usage of variable features in product configurations as follows:

I Always used. Variable features appearing at the top concept ($\top$) in the lattice are used in every product configuration. These features are likely to be mandatory features, i.e., they need to be included in every product. As the variability provided by these features apparently is not needed for the current product configurations, these features should be merged. Merging these variable features will reduce the complexity of variability as the number of configuration labels is reduced, i.e., the number of variable features that need to be managed decreases.

II Never used. Variable features appearing at the bottom concept ($\bot$) are not used in any of the product configurations. These features are likely to be features that have become obsolete. As these variable features are obviously not needed anymore, they should be removed from the set of offered features. Obsolete variable features often arise when a feature used to have two alternatives and one of them has become the preferred alternative used in all product configurations.

III Only used mutually exclusively. Two variable features $l_1$ and $l_2$ that appear at different concepts and whose infimum is the bottom concept, i.e., for which $\mu(l_1) \sqcap \mu(l_2) = \bot$ holds, are only used mutually exclusively in the product configurations. These features are likely to be alternative features, i.e., they always have to be used mutually exclusively. Although this is only a hypothesis, the analyst may have additional background knowledge that these features indeed represent alternative features. If we consider the distance between $\mu(l_1)$ and $\mu(l_2)$, we can even classify the alternatives according to their usage, i.e., a large distance between $\mu(l_1)$ and $\mu(l_2)$ indicates that one alternative is used more often than the other. In the case that one of the alternatives has become the preferred alternative, it will show up at the top concept and the unused alternatives will appear at the bottom concept.

IV Only used in pairs. Two variable features $l_1$ and $l_2$ that appear at the same concept, i.e., for which $\mu(l_1) = \mu(l_2)$ holds, are features that are used together in all product configurations. These are likely to be features that cannot be used separately. Although this is only a hypothesis, the analyst may have additional background knowledge about the correct usage of these features. If the analyst knows that these features can only be used together, they should be merged to reduce the number of variable features and to make clear that they cannot be used separately.
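The four usage classes can be approximated directly on the relation table. The sketch below is not the authors' tool: it assumes a non-empty set of configurations and an empty extent for the bottom concept (no configuration uses every offered feature), so that μ(l1) ⊓ μ(l2) = ⊥ reduces to disjoint user sets. It also reports classes III and IV only for features outside classes I and II, which is a design choice of this sketch. All feature and configuration names are hypothetical.

```python
from itertools import combinations


def classify(table, offered):
    """Return (always, never, mutually_exclusive_pairs, paired) for the offered features."""
    configs = set(table)
    users = {l: frozenset(p for p in configs if l in table[p]) for l in offered}
    always = {l for l in offered if users[l] == configs}   # class I: at the top concept
    never = {l for l in offered if not users[l]}           # class II: at the bottom concept
    rest = sorted(offered - always - never)
    # Class III: the two features are never used in the same configuration.
    exclusive = {(a, b) for a, b in combinations(rest, 2) if not users[a] & users[b]}
    # Class IV: the two features are used by exactly the same configurations.
    paired = {(a, b) for a, b in combinations(rest, 2) if users[a] == users[b]}
    return always, never, exclusive, paired


# Hypothetical relation table and offered features.
TABLE = {
    "P1": {"base", "altA", "x", "y"},
    "P2": {"base", "altB"},
    "P3": {"base", "altA", "x", "y"},
}
OFFERED = {"base", "altA", "altB", "x", "y", "old"}

always, never, exclusive, paired = classify(TABLE, OFFERED)
print(sorted(always), sorted(never))
print(sorted(exclusive))
print(sorted(paired))
```

As in the text, classes III and IV only yield hypotheses that an analyst has to confirm before restructuring.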

3.4 Feature Constraints

Beyond the classification of features into usage classes, our approach also allows us to automatically derive feature constraints as described in [4].

Deriving feature constraints. Feature constraints of the form "feature A requires features B ∨ C" show up in the concept lattice as implications between attributes as described in Section 2, and can be automatically derived using the concept analysis tool ConExp [21]. More complicated feature constraints such as "feature A requires features B ∧ C" can be obtained by combining attributes B and C to a new attribute B ∧ C in the relation table, repeating concept analysis, and checking if there exists an implication between A and the combined attribute B ∧ C. Feature constraints of the form "feature A excludes feature B" can be detected by checking if A and B belong to feature usage class III. As features in this class are only used mutually exclusively, it is very likely that there exists a mutually exclusive constraint for these features. However, this is only a hypothesis and should be checked by an analyst.

Adding feature constraints. Feature constraints that are not reflected in the lattice, e.g., "variable feature x always requires variable feature y", can be integrated in the lattice via background implications, i.e., implications that are added to the formal context as described in Section 2.

Usage of feature constraints. The feature constraints can be used for two different purposes. First, they reduce the number of variants that can theoretically be configured. Second, they can be used to check product configurations for consistency. For example, one can automatically check product configurations for the inclusion of required features and the absence of mutually exclusive features.
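A consistency check of this kind can be sketched as follows. The representation of the constraints (a mapping from a feature to the set of features it requires, and a list of mutually exclusive pairs) is an assumption made for illustration; the feature names are hypothetical.

```python
def check(features, requires, excludes):
    """Return the constraints violated by one product configuration."""
    problems = []
    for feature, needed in requires.items():
        if feature in features and not needed <= features:
            problems.append(f"{feature} requires {sorted(needed - features)}")
    for a, b in excludes:
        if a in features and b in features:
            problems.append(f"{a} excludes {b}")
    return problems


# Hypothetical constraints.
REQUIRES = {"x": {"y"}}
EXCLUDES = [("altA", "altB")]

print(check({"x", "altA", "altB"}, REQUIRES, EXCLUDES))
print(check({"x", "y", "altA"}, REQUIRES, EXCLUDES))
```

An empty result means that the configuration includes all required features and no pair of mutually exclusive features.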

3.5 Restructuring Variability

We intend to construct an interactive tool for restructuring variability. The tool could be realized as an extension to ConExp [21]. It is already possible to view the lattice and the associated table in ConExp. One could easily imagine extensions that automatically classify features and display the results of this classification to the user. Using this classification, the tool can propose restructuring strategies to simplify the provided variability in a component, package, or subsystem S.

Usage Class               Restructuring Strategy
I    Always used          Merge variable features
II   Never used           Remove variable features
III  Mutually exclusive   Mark as alternative
IV   Used in pairs        Merge variable features

Table 1. Restructuring strategies for feature usage classes.

Table 1 summarizes the possible restructuring strategies for each usage class. In addition, the tool could analyze the impact of intended restructuring transformations by analyzing the realization of variable features and the mapping of these features to files in the configuration management. Specific transformations that the tool should support are:

• Merging variable features. The analyst may decide to merge two variable features $l_1$ and $l_2$ if the distinction between these variable features is not needed anymore. This requires unifying the sets of files that $l_1$ and $l_2$ are attached to and replacing the configuration labels $l_1$ and $l_2$ by a new common configuration label. The tool should check whether $l_1$ and $l_2$ overlap, i.e., have files in common. If $l_1$ and $l_2$ do overlap, the files that are shared need to be merged as well. Furthermore, the tool should automatically update the product configurations to remove the obsolete configuration labels after a successful merge of two features.

• Removing variable features. The analyst may decide to eliminate variable features that are not used in any of the product configurations. This requires removing the corresponding configuration labels from the set of offered features. If the files that these configuration labels are attached to are not needed anymore, they can be removed as well. If the features to be removed represent obsolete alternatives of other features, then in addition to removing the obsolete alternatives, the analyst may decide to merge the remaining "alternatives" with other features.

• Mark as alternative. If two variable features are only used mutually exclusively, and if the analyst knows that they represent alternative features, then they should be marked as alternative. The pairs of mutually exclusive features could be given to an automated checker for product configurations that ensures that only one of the alternatives is selected in each product configuration.
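The merge transformation can be sketched on two simple data structures: a mapping from configuration labels to the files they are attached to, and a mapping from product configurations to the labels they include. Both structures, the function name, and all label and file names are assumptions for illustration; the sketch only reports shared files and does not merge file contents.

```python
def merge_features(label_files, configurations, l1, l2, merged):
    """Replace labels l1 and l2 by one new label.

    Returns the updated label-to-files map, the updated product
    configurations, and the files shared by l1 and l2.
    """
    shared = label_files[l1] & label_files[l2]  # overlap that needs manual merging
    new_label_files = {l: set(f) for l, f in label_files.items() if l not in (l1, l2)}
    new_label_files[merged] = label_files[l1] | label_files[l2]
    new_configurations = {}
    for name, labels in configurations.items():
        kept = labels - {l1, l2}
        # A configuration that used either old label now uses the merged label.
        new_configurations[name] = (kept | {merged}) if labels & {l1, l2} else kept
    return new_label_files, new_configurations, shared


# Hypothetical input data.
LABEL_FILES = {"l1": {"a.c", "common.c"}, "l2": {"b.c", "common.c"}, "l3": {"z.c"}}
CONFIGURATIONS = {"P1": {"l1", "l2", "l3"}, "P2": {"l3"}}

files, configs, shared = merge_features(LABEL_FILES, CONFIGURATIONS, "l1", "l2", "l12")
print(files, configs, shared)
```

The function returns new mappings instead of modifying its inputs, so an interactive tool could show the analyst the effect of a merge before applying it.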


3.6 Filtering of the Concept Lattice

As the resulting lattice for product configurations and variable features may be too large and too detailed, and thus unmanageable due to its large number of concepts, it is reasonable to exclude some variable features from the examination. As a result of excluding some variable features, the lattice will become much smaller and interesting facts can be derived from the lattice more easily.

In this section, we describe how we can exclude some of the variable features to simplify the lattice without sacrificing the order of concepts in the lattice. The lattice changes when variable features are excluded from the examination. Fortunately, Ganter and Wille [8] have shown that the smaller lattice after the exclusion of variable features can be mapped to the original lattice (the smaller lattice is the result of a so-called subcontext).

Definition 4. Let $C = (\mathcal{O}, \mathcal{A}, T)$ be a context, $\mathcal{O}' \subseteq \mathcal{O}$, and $\mathcal{A}' \subseteq \mathcal{A}$. Then $C' = (\mathcal{O}', \mathcal{A}', T \cap (\mathcal{O}' \times \mathcal{A}'))$ is called a subcontext of $C$ and $C$ is called a supercontext of $C'$.

In our application of concept analysis, we only remove columns (one for each variable feature, assuming that variable features occur in columns of the relation table) but never rows (because we do not want to remove product configurations). Removing columns leads to a subcontext $(\mathcal{O}, \mathcal{A}', T')$ in which $T'$ is a reduced table of $T$. Fortunately, this modification remains transparent in the lattice. Every attribute extent of $(\mathcal{O}, \mathcal{A}', T \cap (\mathcal{O} \times \mathcal{A}'))$ is an attribute extent of $C = (\mathcal{O}, \mathcal{A}, T)$ and, since every concept extent is the intersection of attribute extents, we obtain:

Proposition. Let $C = (\mathcal{O}, \mathcal{A}, T)$ and $C' = (\mathcal{O}, \mathcal{A}', T')$, where $\mathcal{A}' \subseteq \mathcal{A}$ and $T' = T \cap (\mathcal{O} \times \mathcal{A}')$. Then every extent of $C'$ is an extent of $C$.
Proof. See [8].

According to this proposition, each extent within the subcontext will show up in the supercontext. This can be made plausible with the relation table: Removed columns will never change existing columns, so the maximal rectangles forming concepts will only shrink in horizontal direction (if variable features are listed in columns).

This proposition on the invariability of extents of subcontexts that only differ in the set of attributes results in a simple mapping of concepts from the subcontext to the supercontext (for a formal proof see [8]):

\[ (P, L) \mapsto (P, \sigma(P)) \]

where $\sigma$ is computed in the supercontext. The mapping is a $\sqcap$-preserving order-embedding, meaning that the partial order relationship is completely preserved. Consequently, the supercontext is basically a refinement of the subcontext. By this mapping, all concepts of the subcontext can be found in the supercontext.
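The proposition and the mapping can be checked on a small example. The sketch below enumerates extents by brute force (each extent is the set of common objects of some attribute subset), so it only suits tiny contexts; the table, the feature names, and the function names are hypothetical.

```python
from itertools import combinations


def extents(table, attributes):
    """All concept extents of the context restricted to the given attributes."""
    result = set()
    attrs = sorted(attributes)
    for size in range(len(attrs) + 1):
        for subset in combinations(attrs, size):
            result.add(frozenset(o for o in table if all(a in table[o] for a in subset)))
    return result


def embed(table, attributes, extent):
    """Map a subcontext extent P to the supercontext concept (P, sigma(P))."""
    intent = frozenset(a for a in attributes if all(a in table[o] for o in extent))
    return (extent, intent)


# Hypothetical relation table: product configuration -> used variable features.
TABLE = {"P1": {"f1", "f2"}, "P2": {"f1", "f3"}, "P3": {"f1", "f2", "f3"}}
ALL = {"f1", "f2", "f3"}
KEPT = {"f1", "f2"}  # the column for f3 is filtered out

small = extents(TABLE, KEPT)
large = extents(TABLE, ALL)
print(small <= large)  # every extent of the subcontext is an extent of the supercontext
print(embed(TABLE, ALL, frozenset({"P1", "P3"})))
```

In this example the filtered lattice has two concepts and the full lattice has four, and both extents of the filtered lattice reappear in the full one.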

3.7 Implementation

Our prototype tool to analyze product configurations is

based on the concept analysis tool ConExp [21]. In order

to parse product configurations, we implemented a frontend

for ConExp that extracts the set of used configuration labels

for a component, package, or subsystem S and transforms

this set into a formal context that is used by ConExp for the

calculation and the display of the concept lattice. The set

of offered features can be given to the tool in the form of

a simple list. In order to filter the lattice we extended the

filtering capabilities of ConExp. We are currently working

on an implementation for automated feature classification

and interactive restructuring of variability.

4 Case Study

In a case study, we analyzed (and later restructured) the

provided variability in a subsystem of an embedded industrial
software product line that is used in diesel engine control
units. A similar system to the one we analyzed is described
in [9]. All data presented in this case study has been

obtained from the real system. For reasons of intellectual

property protection, we abstracted the names of product

configurations and variable features. However, all lattices

shown in this case study are isomorphic to the original lat-

tices.

4.1 Case Study Setup

In integration with other subsystems, subsystem X per-

forms a number of important functions to control the en-

gine. Subsystem X offers 55 variable features that can be

used to configure the system according to desired criteria of

X’s behavior, e.g., number of features of type A and B, cal-

ibration strategy T or A, operation mode R. However, not

every combination of variable features that can be selected

will result in a working configuration in the sense that it can

be compiled and calculates correct values for the control of

the engine. Beyond CM-labels selecting features, no other

variability mechanisms are used for subsystem X.

We obtained a set of eleven product configurations

shown in Table 2 to analyze the usage of variable features

of subsystem X. For each product configuration, the total
number of used features in the product configuration, the
number of used features for subsystem X, and the number
of used features for component Y (a component of subsystem X)
are shown. We extracted the set of variable features

relevant for subsystem X from each product configuration.

The set of offered variable features for subsystem X was ob-

tained from existing documentation. We then used our pro-

totype tool to automatically construct a relation table from

11th European Conference on Software Maintenance and Reengineering (CSMR'07)0-7695-2802-3/07 $20.00 © 2007

Figure 2. Lattice for subsystem X. The larger a concept is, the more variable features it contains.

ID     # total   # subsystem X   # component Y
P1     438       41 / 55         18 / 22
P2     333       36 / 55         17 / 22
P3     424       37 / 55         18 / 22
P4     497       36 / 55         18 / 22
P5     411       34 / 55         18 / 22
P6     405       36 / 55         16 / 22
P7     368       36 / 55         16 / 22
P8     455       36 / 55         18 / 22
P9     457       35 / 55         17 / 22
P10    396       30 / 55         17 / 22
P11    480       32 / 55         17 / 22

Table 2. Analyzed product configurations and number of used features.

the available information and applied concept analysis to

obtain a concept lattice.

4.2 General Results

The resulting lattice for subsystem X is shown in Fig-

ure 2. The intents of the concepts (variable features) in the

lattice are omitted for readability reasons. However, the

size of each concept node in this picture is a linear func-

tion of the number of variable features attached to it. As

Figure 2 shows, there are a few concepts containing most

of the variable features of subsystem X. The lattice con-

tains 54 concepts. 21 of them introduce at least one new

variable feature, i.e., a variable feature is attached to these
nodes (more precisely, a concept c introduces a variable feature
if there exists a variable feature l for which μ(l) = c
holds). Concepts that introduce at least one variable feature

are half-filled with gray in the lattice. 33 of the concepts do

not introduce any new variable feature and merely combine

variable features included in several product configurations

(these concepts are shown in white in the lattice). Concepts

are half-filled with black if one or more product configurations
are attached to these nodes (more precisely, a concept c
is filled half-black if there exists a product configuration p
for which γ(p) = c holds).
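The two labeling functions can be illustrated with a short sketch. It assumes the standard definitions from formal concept analysis: μ(l) is the largest concept whose intent contains l, and γ(p) is the smallest concept whose extent contains p. The relation table and the helper names are invented for illustration.

```python
def extent_of(table, features):
    """Product configurations that use all of the given features."""
    return frozenset(p for p, used in table.items() if features <= used)


def intent_of(table, products, all_features):
    """Features shared by all of the given product configurations."""
    shared = set(all_features)
    for p in products:
        shared &= table[p]
    return frozenset(shared)


def mu(table, all_features, feature):
    """Attribute concept: the concept at which `feature` is attached."""
    extent = extent_of(table, {feature})
    return extent, intent_of(table, extent, all_features)


def gamma(table, all_features, product):
    """Object concept: the concept at which `product` is attached."""
    intent = frozenset(table[product])
    return extent_of(table, intent), intent


# Invented toy relation table; every used feature is in all_features.
all_features = {"f", "g", "h"}
table = {"P1": {"f", "g"}, "P2": {"f"}}

print(mu(table, all_features, "g"))      # concept introducing feature g
print(gamma(table, all_features, "P1"))  # concept to which P1 is attached
```

In this toy context μ(g) and γ(P1) are the same concept, so the corresponding node would be half-filled with gray and with black at the same time.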

The first interesting observation is that concepts with at-

tached variable features can be found in the upper region

and in the lower region of the lattice, while in the middle

region, there are almost no concepts which have variable

features attached. That is to say that there are a number of

variable features that are used in almost all product config-

urations (these are the features attached to concepts in the

upper region) and there are a few variable features that are

used in a small number of product configurations (these are

the features attached to concepts in the lower region). A

maintainer that needs to restructure and simplify the pro-

vided variability can use this information to determine can-

didate features that can be merged (these appear in the upper

region) because the variability provided by these features is

not needed to facilitate the product differences. Features

appearing at concepts that also have a product configura-

tion attached are features only used in a specific product

configuration. If these features are not intended to be used

in other product configurations in the future they can be re-

moved because the variability they provide is not needed.

The next interesting observation regards the capability

of product configurations. As we can see in the lattice most

product configurations appear toward the bottom of the lat-

tice. This means that the product configurations share a

large number of variable features and only differ slightly. If

we combine this information with the fact that subsystem X

offers 55 variable features, indicating that we could theoretically
configure 2^55 different variants, we obviously see that

the provided number of variable features by far outnumbers

the variable features needed to facilitate the differences in

the product configurations.
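To make the gap concrete, the following short calculation compares the theoretical number of variants with the number of analyzed product configurations. It assumes, as in the text, that each of the 55 variable features can be included or excluded independently; the variable names are illustrative.

```python
offered_features = 55          # variable features offered by subsystem X
analyzed_configurations = 11   # product configurations listed in Table 2

# Each feature is either included or excluded, so there are 2^55 subsets.
theoretical_variants = 2 ** offered_features

print(theoretical_variants)
print(theoretical_variants // analyzed_configurations)
```

Even if many of these combinations are not valid configurations, the space of possible variants is many orders of magnitude larger than the eleven configurations in use.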

Furthermore, the lattice reveals interesting relationships

between product configurations. For example, the concept

labeled with P4 and P8 indicates that these product config-

urations use exactly the same features of subsystem X.

We inspected the lattice manually and used the documen-

tation of subsystem X to assign meaning to the variable fea-

tures. Using the names of configuration labels helped to

build groups of variable features that are likely to be related

because of a common prefix in the name, e.g., FeatureA

and FeatureA SubfeatureB are likely to be related. How-

ever, the large number of concepts that do not introduce new

variable features made the analysis difficult, as it was very

hard to follow the lines in the lattice especially in the middle

region where a large number of concepts and lines interfere.

In order to solve these problems, we filtered the lattice

by excluding many variable features from our examination.

As shown in Section 3.6 the exclusion of attributes from the

examination does preserve the partial order of the lattice.

In addition, the relationships between variable features be-

came much clearer in the filtered lattice. In our case study,

we focused on the provided variability of component Y, a

component inside subsystem X. Thus, we removed all vari-

able features from the relation table that do not affect com-

ponent Y. We then applied concept analysis on the result-

ing subcontext. Figure 3 shows the concept lattice for the

reduced relation table after removing the variable features

that are not relevant for component Y. For example, the fact

that B2 T is used in product configurations P6 and P7 be-

comes now clearly visible in Figure 3 as B2 T and P6 and

P7 appear at concept #9, whereas this fact would be very

hard to see in Figure 2 even if the variable features were not

omitted from the Figure.

4.3 Feature Usage Classification

In a next step, we used the lattice in Figure 3 to classify

the usage of features as described in Section 3.3:

Class I: Always Used. Interestingly, 13 of 22 variable

features offered by component Y are used in every product

Figure 3. Lattice for component Y.

configuration because they appear at the top concept (con-

cept #0). Thus, the provided variability of component Y

is much higher than the required variability for the prod-

ucts that currently use the component. In order to reduce

the complexity of having to manage these unnecessary vari-

able features, all features appearing at concept #0 could

be merged. This would reduce the number of variable fea-

tures offered by component Y from 22 to 9 and the number

of variants that can be derived for component Y from 2^22
variants down to 2^9 variants. If the analyst expects that the
variability of these features is still needed for products that
should be derived from the product line in the near future,

it is also possible to merge only a subset of features appear-

ing at Concept #0. For example, if component Y should

still offer variability for multiple types of feature A and B

(represented by features A1, A2, B1, B2), but the function-

ality for specific operation mode R (represented by all fea-

tures ending with R) is expected to be a mandatory feature

for all future products, the analyst could decide to merge

the following pairs of features: Base and Base R, A1 and

A1 R, A2 and A2 R, B1 and B1 R, and B2 and B2 R. This

would still reduce the number of offered variable features

by 5 and the number of variants from 2^22 to 2^17.

Class II: Never used. In Figure 3, the variable feature

A2 CC appears at the bottom concept (concept #1) indi-

cating that it is never used in any of the product configura-

tions. By consulting the documentation we found out that

the domain engineers formerly intended to offer two alter-

natives for the calculation of correction factors of feature

A2, namely common correction (A2 CC) and separate cor-

rection (A2 SC). However, all current product configura-

tions use A2 SC and none uses A2 CC. Therefore, A2 CC

can be removed from the offered set of variable features and

A2 SC could be merged with A2.

Class III: Only used mutually exclusively. By closely

inspecting the lattice, the analyst can infer that A1 T and

A1 A (respectively A2 T and A2 A) are only used mutually

exclusively because μ(A1 T) ⊓ μ(A1 A) = ⊥ (respectively

μ(A2 T) ⊓ μ(A2 A) = ⊥). Plausibly, it is not possible

to combine A1 T with A2 A or A2 T with A1 A as there

exists no product configuration in which these features are

used together. Therefore, these features should be marked

as alternative.

Class IV: Only used in pairs. By closely inspecting

the lattice, the analyst can see that features A1 T and A2 T

appear together at concept #2. This indicates that these fea-

tures are likely to be features that cannot be used separately.

The same applies for A1 A and A2 A as they appear to-

gether at concept #13. Therefore, A1 T and A2 T should

be merged as well as A1 A and A2 A.
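The four classes can also be read directly off the relation table. The sketch below is an approximation of such a classification, not the method of Section 3.3 itself. It assumes that the bottom concept has an empty extent (as in Figure 3), so that "mutually exclusive" reduces to disjoint extents, and it treats "used in pairs" as identical extents. The product configurations are invented and only borrow feature names from the case study; the function name `classify` is hypothetical.

```python
from itertools import combinations


def classify(table, offered):
    """Classify offered features by their usage in product configurations."""
    products = frozenset(table)
    extent = {f: frozenset(p for p in products if f in table[p]) for f in offered}

    always = [f for f in offered if extent[f] == products]   # Class I
    never = [f for f in offered if not extent[f]]            # Class II
    rest = [f for f in offered if f not in always and f not in never]

    # Class III: no product configuration uses both features.
    exclusive = [(f, g) for f, g in combinations(rest, 2)
                 if not extent[f] & extent[g]]
    # Class IV: both features are used by exactly the same configurations.
    paired = [(f, g) for f, g in combinations(rest, 2)
              if extent[f] == extent[g]]
    return always, never, exclusive, paired


# Invented relation table for illustration only.
table = {
    "P1": {"Base", "A1 T", "A2 T"},
    "P2": {"Base", "A1 A", "A2 A"},
    "P3": {"Base", "A1 T", "A2 T"},
}
offered = ["Base", "A1 T", "A2 T", "A1 A", "A2 A", "A2 CC"]

always, never, exclusive, paired = classify(table, offered)
print("Class I:", always)
print("Class II:", never)
print("Class III:", exclusive)
print("Class IV:", paired)
```

Such a classification only yields candidates: whether a feature is actually merged, removed, or marked as alternative remains a decision of the analyst.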

4.4 Feature Constraints

The lattice shown in Figure 3 also reveals interesting

constraints of the form "feature A requires/excludes feature
B" as described in Section 3.4. Some of the constraints that
can be derived from the lattice are:

• {A2 T} → {A1 T, A2 T}: This can be read as "calibration
strategy T of feature A2 requires calibration
strategy T of A1 and A2".

• {A1 A, A2 A} → {B2 A}: This can be read as "calibration
strategy A of A1 and A2 requires calibration
strategy A of B2".

• {A1 A, A1 T} → {A1}: This can be interpreted as
"calibration strategy A or T of A1 requires the base feature
A1".

• Concept #7 shows that it is possible to combine cali-

bration strategy T for A1 (A1 T) and calibration strat-

egy T for A2 (A2 T) with calibration strategy A for B2

(B2 A). This combination is used in product configu-

rations P1, P3, and P9.

• {B2 T} likely excludes {A1 A, A2 A} as these fea-

tures belong to class III.
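Constraints of the "requires" form can be validated against the relation table and then used to check new product configurations. The following is a minimal sketch under stated assumptions: the relation table and the new configuration are invented, and the function names `holds` and `violated_rules` are hypothetical.

```python
def holds(table, premise, conclusion):
    """True if every configuration using all premise features also uses
    all conclusion features. A premise that no configuration satisfies
    holds vacuously."""
    return all(conclusion <= used for used in table.values() if premise <= used)


def violated_rules(configuration, rules):
    """Return the rules (premise, conclusion) that a configuration breaks."""
    return [(premise, conclusion) for premise, conclusion in rules
            if premise <= configuration and not conclusion <= configuration]


# Invented relation table for illustration only.
table = {
    "P1": {"A1", "A1 T", "A2 T", "B2 A"},
    "P2": {"A1", "A1 A", "A2 A", "B2 A"},
    "P3": {"A1", "A1 T", "A2 T"},
}

rules = [
    ({"A2 T"}, {"A1 T", "A2 T"}),
    ({"A1 A", "A2 A"}, {"B2 A"}),
]
print([holds(table, premise, conclusion) for premise, conclusion in rules])

# A new configuration that selects A2 T without A1 T breaks the first rule.
new_configuration = {"A1", "A2 T"}
print(violated_rules(new_configuration, rules))
```

Because the constraints are derived from the analyzed configurations only, a reported violation indicates a deviation from current usage and not necessarily an invalid product.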

5 Related Work

The need for managing the variability in a software

product line is generally recognized in the literature [2].

However, the main research on this topic has focused on

processes and the forward-engineering point of view, i.e.,

frameworks for managing the variability during domain en-

gineering [4, 10, 11, 17]. Only a small number of re-

searchers have addressed the problem of variability from

the point of view of reengineering.

The problem of variability restructuring was first identi-

fied by Parnas [15], who suggested that, when a large family

of products gets out of control, the first step must be to re-

duce the size of the program family.

Krone and Snelting [12] were the first to use concept

analysis to analyze programs in which the C preprocessor

(CPP) is used for variability management. The relation be-

tween code pieces and CPP symbols is extracted from a

source file, and the corresponding lattice visualizes interfer-

ences between CPP symbols. Our work differs from Snelt-

ing’s work in three ways. First, in our approach, we apply

concept analysis on the usage of variable features in product

configurations instead of analyzing the relationship of CPP

symbols and code lines. Second, the focus of our work is

on restructuring and simplifying the provided variability in

a software product line whereas the focus of Snelting’s work

is to simplify the source code. Third, the granularity of vari-

ability analyzed in Snelting’s work is very small whereas in

our work the variability is at the granularity level of fea-

tures. Our work goes beyond Snelting’s work by identify-

ing obsolete variability and providing restructuring strate-

gies for variability.

In [3] a method for refactoring product line architectures

is proposed. Based on two metrics measuring the utilization

of provided services and required services of optional and

variant components in a product line architecture, refactoring
strategies for these components are derived.

Recently, Deelstra et al. [6] presented a method for as-

sessing the provided variability in a software product line

at a high level of abstraction. Based on a model of variation
points and dependencies, they describe a process to

analyze the provided variability and compare it to the re-

quired variability that is needed to derive future products

from the product line. However, they do not describe how

to determine whether the provided variability is adequate to

the required variability to derive different products.

Formal concept analysis is a well-known method in software
engineering that has been applied to a variety of prob-

lems. Besides the inference of configuration structures from

source code [12], concept analysis has been used for the

evaluation of class hierarchies [18], for the recovery of com-

ponents [13, 16, 19, 20], for feature location [7], and for

reengineering object oriented systems [1].

6 Conclusions and Future Work

We have presented a method for restructuring variabil-

ity by analyzing the usage of variable features by a set of

product configurations. This method is based on concept
analysis and constructs a concept lattice in which relationships
between variable features and product configurations
are made explicit, and where the features used in multiple
product configurations are "factored out". We have shown

the technique to be capable of finding variable features used

in every product configuration that could be merged with

other features and variable features not used in any product

configuration that could be eliminated. The lattice thus not

only helped us to compare our mental model of the variabil-

ity with the actual use of variable features in real product

configurations, but also to obtain valuable information for

restructuring and simplifying the existing variability. We

have suggested how the lattice can be used as a formal basis

in interactive tools for restructuring the existing variability

of software product line components. The lattice also al-

lowed us to derive feature constraints that can be used to

automatically check product configurations for consistency.

The present paper has focused on analyzing the usage

of variable features in product configurations and the

derivation of feature constraints. We intend to implement

an interactive tool for restructuring variability based on

our technique, and verify its practicality by applying it

to large product line components. Beyond further case

studies and controlled experiments, we want to explore

how the results obtained by the method described in this

paper may be combined with our results of analyzing the

mapping of configuration labels to files in the configuration

management [14]. We believe that combining these results

will considerably ease the process of restructuring the

variability as the realization relationships of variable

features to the files implementing them are made explicit.

Acknowledgements. This work was supported by

Robert Bosch GmbH Corporate Research. We would like

to thank Sa Li for the development of the prototype to an-

alyze product configurations and all anonymous reviewers

that provided valuable feedback on earlier versions of this

paper.

References

[1] G. Arevalo. High-Level Views in Object Oriented Systems using Formal Concept Analysis. PhD thesis, University of Berne, January 2005.

[2] P. Clements and L. Northrop. Software Product Lines: Practices and Patterns. Addison-Wesley, 2001.

[3] M. Critchlow, K. Dodd, J. Chou, and A. van der Hoek. Refactoring product line architectures. In First International Workshop on Refactoring: Achievements, Challenges, and Effects (REFACE’03), pages 23–26, 2003.

[4] K. Czarnecki and C. H. P. Kim. Cardinality-based feature modeling and constraints: A progress report. In Proceedings of the International Workshop on Software Factories at OOPSLA 2005, 2005.

[5] S. Deelstra, M. Sinnema, and J. Bosch. Product derivation in software product families: a case study. Journal of Systems and Software, 74(2):173–194, 2005.

[6] S. Deelstra, M. Sinnema, J. Nijhuis, and J. Bosch. COSVAM: A technique for assessing software variability in software product families. In Proceedings of the 20th IEEE International Conference on Software Maintenance (ICSM’04), pages 458–462, September 2004.

[7] T. Eisenbarth, R. Koschke, and D. Simon. Locating features in source code. IEEE Transactions on Software Engineering, 29(3):210–224, 2003.

[8] B. Ganter and R. Wille. Formal Concept Analysis - Mathematical Foundations. Springer Verlag, 1996.

[9] C. Hammel, B. Boss, H. Jessen, A. Traub, C. Tischer, and H. Honninger. A common software architecture for diesel and gasoline engine control systems of the new generation EDC/MED17. In World Congress & Exhibition of Automotive Engineers (SAE 2003), number 2003-01-1048 in SAE Technical Papers, Detroit, MI, USA, March 2003.

[10] K. C. Kang, S. G. Cohen, J. A. Hess, W. E. Novak, and A. S. Peterson. Feature-oriented domain analysis (FODA) feasibility study. Technical Report CMU/SEI-90-TR-21, SEI, 1990.

[11] K. C. Kang, S. Kim, J. Lee, K. Kim, E. Shin, and M. Huh. FORM: A feature-oriented reuse method with domain-specific reference architectures. Annals of Software Engineering, 5:143–168, 1998.

[12] M. Krone and G. Snelting. On the inference of configuration structures from source code. In Proceedings of the 16th International Conference on Software Engineering (ICSE’94), pages 49–57, Los Alamitos, CA, USA, 1994.

[13] C. Lindig and G. Snelting. Assessing modular structure of legacy code based on mathematical concept analysis. In Proceedings of the 19th International Conference on Software Engineering (ICSE’97), pages 349–359, New York, NY, USA, 1997. ACM Press.

[14] F. Loesch. A formal method to identify variation points in product line assets. In Workshop Software-Reengineering (WSR’06), Bad Honnef, Germany, May 2006.

[15] D. L. Parnas. Software aging. In Proceedings of the 16th International Conference on Software Engineering (ICSE’94), Sorrento, Italy, 1994. IEEE Computer Society Press.

[16] M. Siff and T. W. Reps. Identifying modules via concept analysis. In Proceedings of the International Conference on Software Maintenance (ICSM’97), pages 170–179, Washington, DC, USA, 1997. IEEE Computer Society.

[17] M. Sinnema, S. Deelstra, J. Nijhuis, and J. Bosch. COVAMOF: A framework for modeling variability in software product families. In Proceedings of the Third International Conference on Software Product Lines (SPLC’04), pages 197–213, Boston, MA, USA, August 30-September 2 2004.

[18] G. Snelting and F. Tip. Reengineering class hierarchies using concept analysis. In Proceedings of the 6th ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE’98), pages 99–110, New York, NY, USA, 1998. ACM Press.

[19] P. Tonella. Concept analysis for module restructuring. IEEE Transactions on Software Engineering, 27(4):351–363, 2001.

[20] A. van Deursen and T. Kuipers. Identifying objects using cluster and concept analysis. In Proceedings of the 21st International Conference on Software Engineering (ICSE’99), pages 246–255, Los Alamitos, CA, USA, 1999.

[21] S. A. Yevtushenko. System of data analysis "concept explorer". In Proceedings of the 7th National Conference on Artificial Intelligence, pages 127–134, Russia, 2000.
