Restructuring Variability in Software Product Lines using Concept Analysis of Product Configurations
Felix Loesch
Robert Bosch GmbH, Corporate Research CR/AEY
P.O. Box 300240, 70442 Stuttgart, Germany
Erhard Ploedereder
University of Stuttgart, Institute of Software Technology
Universitaetsstraße 38, 70569 Stuttgart, Germany
Abstract
The management of variability plays an important role in successful software product line engineering. As the set of products that is derived from the product line and their requirements are constantly changing, the variability in the product line needs to evolve as well. A typical problem in such an evolution scenario is that the number of variable features and variants will explode, and thus become unmanageable. One of the reasons for this explosion is that obsolete variable features are not removed. In order to address this problem, we present a new method for restructuring and simplifying the provided variability in a software product line. Our method is based on concept analysis. It analyzes the realized variability in a software product line, and constructs a lattice that provides a classification of the usage of variable features in real products derived from the product line. We show how this classification can be used to derive restructuring strategies for variability that solve the problem of variability explosion. The effectiveness of our method is demonstrated by presenting a case study of restructuring the variability in a large industrial software product line.
1 Introduction
Managing the variability of product line components that
can be used in a large number of different products is hard
because it is almost impossible to anticipate how these com-
ponents will be used in products derived from the product
line. In order to allow these components to be configured
for as many products as possible, they are typically imple-
mented using variable features, i.e., features that may be
included or not in configurations of the components. These
variable features allow a product line engineer to configure
the components for the requirements of different products.
However, ongoing evolution of product line components
due to changing customer requirements and advances in
technology will increase the number of variable features.
As a result, the number of variants that can be derived by
including or excluding variable features in a product will
explode, and thus become unmanageable. One of the rea-
sons for this explosion is that obsolete variable features are
not removed. For example, from a product line component
offering only 10 variable features it is theoretically possible
to derive 2^10 = 1024 different variants. This simple example shows
that a small number of variable features already results in
a combinatorial explosion of variants. Clearly, methods are
needed that ensure the minimum number of variable fea-
tures needed to derive different products from the product
line [5].
In order to address this problem, we present a new
method to restructure and simplify the provided variability
in a software product line. Our method is based on concept
analysis. First, a table is constructed that reflects the us-
age of variable features in product configurations. From the
table, a concept lattice is derived that factors out which vari-
able features are commonly used in product configurations
and which variable features only appear in specific product
configurations. The main contributions of our method are:
• An automated classification of the usage of variable
features in product configurations based on the concept
lattice (Section 3.3).
• A set of restructuring strategies to minimize the num-
ber of variable features based on this classification
(Section 3.5).
We demonstrate that our method helps to solve the problem
of variability explosion by providing a good starting point
for interactive restructuring tools to minimize the number
of variable features necessary to derive different products.
The remainder of this paper is organized as follows. Sec-
tion 2 reviews the relevant parts of the theory of concept
11th European Conference on Software Maintenance and Reengineering (CSMR'07)0-7695-2802-3/07 $20.00 © 2007
analysis. Section 3 explains how concept analysis is used
to classify the usage of variable features in product config-
urations and to restructure variability. Section 4 presents a
case study demonstrating the effectiveness of our method.
Section 5 discusses related research. Finally, conclusions
and future work are presented in Section 6.
2 Formal Concept Analysis
Formal Concept Analysis (FCA) [8] is a mathematical
method that provides a way to identify meaningful group-
ings of objects that have common attributes.
The starting point for concept analysis is a context $\mathcal{C} = (\mathcal{O}, \mathcal{A}, \mathcal{T})$, consisting of a set of objects $\mathcal{O}$, a set of attributes $\mathcal{A}$, and a relation table $\mathcal{T}$, stating which attributes are possessed by each object. For any set of objects $O \subseteq \mathcal{O}$, the set of common attributes is defined as $\sigma(O) = \{a \in \mathcal{A} \mid \forall o \in O : (o, a) \in \mathcal{T}\}$. Similarly, for any set of attributes $A \subseteq \mathcal{A}$, the set of common objects is defined as $\tau(A) = \{o \in \mathcal{O} \mid \forall a \in A : (o, a) \in \mathcal{T}\}$.
A tuple $c = (O, A)$ is called a concept if $A = \sigma(O)$ and $O = \tau(A)$, that is, $c$ is a maximal collection of objects that share a common set of attributes. A concept $c = (O, A)$ has extent $\mathrm{ext}(c) = O$ and intent $\mathrm{int}(c) = A$. Informally, a concept corresponds to a maximal rectangle of filled table cells in $\mathcal{T}$ modulo column and row permutations.
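The definitions of σ, τ, and a concept can be illustrated with a minimal Python sketch. The toy context below (objects `p1` to `p3`, attributes `a` to `d`) and all function names are hypothetical and chosen only for illustration; the enumeration is a naive one that closes every subset of objects and is therefore only suitable for very small contexts.

```python
from itertools import combinations

# Hypothetical toy context: each object maps to the attributes it possesses.
TABLE = {"p1": {"a", "b"}, "p2": {"a", "b", "c"}, "p3": {"a"}}
ATTRIBUTES = {"a", "b", "c", "d"}  # "d" is possessed by no object


def sigma(objects, table=TABLE, attributes=ATTRIBUTES):
    """Common attributes of a set of objects."""
    return {a for a in attributes if all(a in table[o] for o in objects)}


def tau(attrs, table=TABLE):
    """Common objects of a set of attributes."""
    return {o for o, row in table.items() if set(attrs) <= row}


def is_concept(objects, attrs):
    """(objects, attrs) is a concept iff each set determines the other."""
    return sigma(objects) == set(attrs) and tau(attrs) == set(objects)


def all_concepts():
    """Naive enumeration: close every subset of objects (exponential)."""
    concepts = set()
    for r in range(len(TABLE) + 1):
        for subset in combinations(sorted(TABLE), r):
            intent = sigma(subset)
            concepts.add((frozenset(tau(intent)), frozenset(intent)))
    return concepts
```

In this toy context, `({"p1", "p2"}, {"a", "b"})` is a concept, whereas `({"p1"}, {"a", "b"})` is not, because `p2` also possesses both attributes and the rectangle is therefore not maximal.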
The set of all concepts that can be derived from the formal context forms a complete partial order, called the concept lattice $L(\mathcal{C})$. Thus, we define that a concept $(O_1, A_1)$ is a subconcept of concept $(O_2, A_2)$, denoted by $(O_1, A_1) \le (O_2, A_2)$, if $O_1 \subseteq O_2$ (or, equivalently, $A_1 \supseteq A_2$). Inversely, we define that the concept $(O_2, A_2)$ is a superconcept of $(O_1, A_1)$.
For two concepts $(O_1, A_1)$ and $(O_2, A_2)$ in the concept lattice, their infimum ($\sqcap$) or meet is defined as
$$(O_1, A_1) \sqcap (O_2, A_2) = (O_1 \cap O_2, \sigma(O_1 \cap O_2)) \tag{1}$$
and their supremum ($\sqcup$) or join as
$$(O_1, A_1) \sqcup (O_2, A_2) = (\tau(A_1 \cap A_2), A_1 \cap A_2) \tag{2}$$
In our figures, a concept $c$ is labeled with attribute $a \in \mathcal{A}$ if it is the largest concept with $a$ in its intent, and it is labeled with an object $o \in \mathcal{O}$ if it is the smallest concept with $o$ in its extent. The unique concept labeled with $a$ is denoted
$$\mu(a) = \bigsqcup \{c \in L(\mathcal{C}) \mid a \in \mathrm{int}(c)\} \tag{3}$$
and the unique concept labeled with $o$ is denoted
$$\gamma(o) = \bigsqcap \{c \in L(\mathcal{C}) \mid o \in \mathrm{ext}(c)\} \tag{4}$$
Hence, the attributes of object $o$ are those which appear at $\gamma(o)$ and above, and all objects that appear at and below $\mu(a)$ have attribute $a$. Consequently, suprema in the lattice indicate that certain objects have attributes in common, while infima show that certain attributes fit to common objects.

Figure 1. Relation table and concept lattice. (Objects: Paul, George, Jenny, Michael, Elisabeth. Attributes: male, female, tall, short, intelligent.)

Figure 1 shows the relation table and associated concept
lattice for an example of properties of persons. The concept
#9 labeled with male is the supremum of all persons that
have male in their intent: Michael, Paul, and George are
below male in the lattice and the table indicates that these
(and no other) persons are indeed male.
A table and a lattice are alternate views on the same in-
formation but they provide different insights. There is yet
another view: a set of implications. For any two attribute
sets $A, B \subseteq \mathcal{A}$, we say $A$ implies $B$ (written $A \to B$), iff any object $o$ having all attributes in $A$ also has all attributes in $B$:
$$A \to B \iff \forall o \in \mathcal{O} : (\forall a \in A : (o, a) \in \mathcal{T}) \Rightarrow (\forall b \in B : (o, b) \in \mathcal{T})$$
For $B = \{b_1, \ldots, b_k\}$, $A \to B$ holds iff $A \to \{b_i\}$ for all $b_i \in B$. Implications show up in the lattice as follows: $A \to b$ holds iff $\bigsqcap \{\mu(a) \mid a \in A\} \le \mu(b)$. Informally, implications between attributes can be found along upward paths in the lattice. In the example of Figure 1, we have that $\mu(\mathit{intelligent}) \le \mu(\mathit{short})$, which can be read as $\mathit{intelligent} \to \mathit{short}$, or "An intelligent person is also short". This implication shows that concept analysis operating on incomplete information needs human judgment to verify or reject derivative information. Often, some implications are known to hold a priori. Such background knowledge can easily be integrated into a given table. An implication $x \to y$ can be enforced by adding the entries in the $x$ column to the $y$ column, and will cause $\mu(x) \le \mu(y)$ in $L(\mathcal{C})$. For more details, we refer to Ganter and Wille [8].
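As a rough illustration of implications and background knowledge, the following Python sketch checks whether A → B holds in a relation table and enforces an implication x → y by copying the entries of the x column into the y column. The table contents and the function names are hypothetical and are not taken from Figure 1.

```python
def tau(attrs, table):
    """Objects that possess every attribute in attrs."""
    return {o for o, row in table.items() if set(attrs) <= row}


def implies(premise, conclusion, table):
    """A -> B holds iff every object having all of A also has all of B."""
    return tau(premise, table) <= tau(conclusion, table)


def enforce(x, y, table):
    """Return a new table in which every entry of column x is also set in column y."""
    return {o: (row | {y} if x in row else set(row)) for o, row in table.items()}


# Hypothetical relation table: object -> attributes it possesses.
table = {"p1": {"x"}, "p2": {"x", "y"}, "p3": {"y"}}
enforced = enforce("x", "y", table)
```

Here `implies({"x"}, {"y"}, table)` is false because `p1` has `x` but not `y`, while the same check on `enforced` succeeds. Note that an implication whose premise is possessed by no object holds vacuously, which is one reason why derived implications need to be reviewed by an analyst.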
3 Using FCA for Restructuring Variability
In order to analyze the usage of variable features in prod-
uct configurations via concept analysis, we first need to de-
fine the formal context (objects, attributes, relation table)
and then interpret the resulting concept lattice accordingly.
3.1 Formal Context
Roughly speaking, the objects and attributes in our do-
main are product configurations and variable features, re-
spectively, and the table that is constructed identifies for
each product configuration which variable features it in-
cludes. Before we can define the objects and attributes more
precisely, we need to introduce some terminology.
In what follows, S denotes a product line component, package, or subsystem. Thus, S is a generic parameter to our method. The reengineer can select the granularity of S depending on the number of variable features expected in S. Further, P denotes a product configuration for the analyzed product line.
A product configuration P consists of so-called configuration rules, each of which consists of the following three parts: scope, path, and configuration label. For example, the following configuration rule
element \subsys B_SubSys_Comp_FeatureA
has scope = element, path = subsys, and configuration label = B_SubSys_Comp_FeatureA. In the following definitions, BelongsTo(f, S) holds if file f belongs to S, and AttachedTo(l, f) determines if configuration label l is attached to file f.
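A configuration rule of this shape could be split into its three parts as sketched below. The sketch assumes that the three parts are separated by whitespace and that each rule occupies one line; the actual rule syntax of the CM system may be richer. The names `ConfigRule` and `parse_rule` are hypothetical, and the path token is kept exactly as written in the rule.

```python
from typing import NamedTuple


class ConfigRule(NamedTuple):
    scope: str
    path: str
    label: str


def parse_rule(line: str) -> ConfigRule:
    """Split one configuration rule into scope, path, and configuration label.

    Assumption: exactly three whitespace-separated fields per rule.
    """
    parts = line.split()
    if len(parts) != 3:
        raise ValueError(f"expected 'scope path label', got: {line!r}")
    return ConfigRule(*parts)


rule = parse_rule(r"element \subsys B_SubSys_Comp_FeatureA")
```

For the example rule, `rule.scope` is `element` and `rule.label` is `B_SubSys_Comp_FeatureA`.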
A variable feature can be an optional, alternative, optional alternative, or or-feature [4]. In our case, all types of
variable features are realized by configuration labels l. Thus
a variable feature will be included in a product by includ-
ing the corresponding configuration label l in the product
configuration. As the CM system does not provide seman-
tics for configuration labels, the different types of variable
features are not distinguished in the CM system.
The attributes of our domain are variable features offered
by S. As variable features are represented by configuration
labels, the variable features offered by S are all configura-
tion labels that are attached to a file that belongs to S.
Definition 1. Let S be a component, package, or subsystem. Then, we define the set of offered features as follows:
OfferedFeatures(S) := {l | ∃f : AttachedTo(l, f) ∧ BelongsTo(f, S)}
The objects of our domain are product configurations P in which the analyzed component, package, or subsystem S
is used. As other subsystems, packages, and components
are configured in P as well, we need to determine the fea-
tures of S used in P .
Definition 2. The set of product configurations using S and the set of used features of S in P are defined as follows:
Configurations(S) := {P | P is a configuration using S}
UsedFeatures(S, P) := {l | l ∈ OfferedFeatures(S) ∧ l ∈ P}
Using these definitions we can now define the formal
context for our domain:
Definition 3. The formal context C = (O, A, T) for concept analysis is defined as follows:
C = (Configurations(S), OfferedFeatures(S), T)
(P, l) ∈ T ⇔ l ∈ UsedFeatures(S, P)
Before we can obtain the relation table T , we first need
to determine OfferedFeatures(S) using one of the following
alternatives:
1. existing documentation. If the features of S have been
completely documented by the product line engineers,
then we can directly obtain OfferedFeatures(S) from
the documentation.
2. configuration management. If the features of S are
not documented, we can obtain OfferedFeatures(S) by
extracting all configuration labels that are attached to
files belonging to S from configuration management.
3. product configurations. If the features of S are not
documented and we have no access to the information
in the CM system, we can obtain OfferedFeatures(S)
by analyzing a large number of product configurations
P . If the path of a configuration rule in P matches
the path of S, then the configuration label represents a
feature offered by S.
Ideally, one will use alternative (1) if reliable and complete
documentation exists. Alternative (2) can be chosen if the
features are not documented but we can easily determine the
files belonging to S. In all other cases, one will fall back on
alternative (3). However, using alternative (3) we cannot
identify unused variable features as these features will not
show up in the product configurations.
In order to obtain the entries in the relation table T , we
analyze a set of product configurations P that use some
of the variable features offered by S. Ideally, the product
configurations should use different subsets of OfferedFeatures(S). To obtain UsedFeatures(S, P), we parse the configuration rules in P and determine for each configuration
rule if the configuration label l referenced by the configura-
tion rule is in the set of OfferedFeatures(S). If it is in the
set, we mark the entry (P, l) in T .
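Definitions 2 and 3 and the marking procedure can be sketched in a few lines of Python. The sketch assumes that each product configuration has already been reduced to the set of configuration labels it references; the feature and configuration names are hypothetical.

```python
def used_features(offered, config_labels):
    """UsedFeatures(S, P): labels of P that are offered by S."""
    return set(offered) & set(config_labels)


def relation_table(offered, configurations):
    """Relation table T as a set of marked (configuration, label) entries."""
    return {
        (name, label)
        for name, labels in configurations.items()
        for label in used_features(offered, labels)
    }


# Hypothetical data: features offered by S and labels referenced by two configurations.
offered = {"FeatureA", "FeatureB", "FeatureC"}
configurations = {
    "P1": {"FeatureA", "FeatureB", "OtherSubsystemFeature"},
    "P2": {"FeatureA"},
}
T = relation_table(offered, configurations)
```

Labels that belong to other subsystems, such as `OtherSubsystemFeature` above, are ignored, and an offered feature that no configuration references (here `FeatureC`) simply yields an empty column.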
3.2 Interpretation of the Concept Lattice
Concept analysis applied to the formal context described
in the last section yields a lattice from which interesting
facts about the relationships of variable features and product
configurations can be derived. The following relationships
can be fully automatically derived from the sparse represen-
tation of the lattice and presented to the analyst:
• The number of concepts in the lattice that have at least
one variable feature attached represents the minimum
number of variation points needed to derive the prod-
ucts described by the analyzed product configurations.
• A product configuration, p, contains all variable fea-
tures at and above γ(p) in the lattice.
• A variable feature, l, is used in all product configura-
tions at and below μ(l) in the lattice.
• Variable features commonly used in a set of product
configurations {p1, p2, . . . , pn} can be identified at the
supremum, i.e., the closest common superconcept of
all γ(pi).
• Product configurations using a set of variable features
{l1, l2, . . . , ln} can be identified at the infimum, i.e.,
the closest common subconcept of all μ(li).
• A variable feature, l, is specific to exactly one product
configuration, p, if p is the only product configuration
on all paths from μ(l) to the bottom element, i.e., p is
the only product configuration using variable feature l.
• If μ(l1) < μ(l2) holds for two variable features l1 and
l2, then l1 is more specific than l2 with respect to the
given product configurations. Thus, the more general
variable features can be found at the higher concepts
in the lattice since they are used in many product con-
figurations, while more specific variable features are in
the lower region of the lattice.
• If γ(p1) < γ(p2) holds for two product configurations
p1 and p2, then product configuration p1 is based on p2
because p1 includes all features in the intent of γ(p2). Thus, the lower a product configuration in a lattice, the
more capable it is because it uses more features.
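Several of these relationships can be read directly off the relation table without materializing the lattice, because the extent of μ(l) is the set of configurations using l and the intent of γ(p) is the set of features used by p. The following Python sketch relies on that observation; the function names and the toy table are hypothetical.

```python
def extent(feature, table):
    """Extent of mu(feature): all configurations that use the feature."""
    return frozenset(p for p, row in table.items() if feature in row)


def more_specific(l1, l2, table):
    """mu(l1) < mu(l2): l1 is used in a proper subset of the configurations using l2."""
    return extent(l1, table) < extent(l2, table)


def based_on(p1, p2, table):
    """gamma(p1) < gamma(p2): p1 uses a proper superset of the features of p2."""
    return table[p1] > table[p2]


def common_features(configs, table):
    """Intent of the supremum of all gamma(p): features shared by the configurations."""
    return set.intersection(*(set(table[p]) for p in configs))


def feature_concepts(table, features):
    """Number of concepts with at least one feature attached (distinct columns)."""
    return len({extent(l, table) for l in features})


# Hypothetical relation table: configuration -> used features.
table = {"P1": {"f", "g", "h"}, "P2": {"f", "g"}, "P3": {"f"}}
```

In this toy table, `h` is more specific than `g`, `P1` is based on `P2`, and the three features are attached to three different concepts.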
3.3 Classification of Feature Usage
The interpretation of the concept lattice as described
above gives insights into the relationships between variable
features and product configurations. However, the analyst
is primarily interested in restructuring and simplifying the
provided variability of the component, package, or subsys-
tem S. This section describes how the lattice can be used
to automatically classify the usage of variable features and
how we can derive strategies for restructuring and simplify-
ing the existing variability of S using this classification.
Using the sparse representation of the lattice, we can
classify the usage of variable features in product configu-
rations as follows:
I Always used. Variable features appearing at the top
concept (⊤) in the lattice are used in every product
configuration. These features are likely to be mandatory features, i.e., they need to be included in every
product. As the variability provided by these features
apparently is not needed for the current product con-
figurations, these features should be merged. Merging
these variable features will reduce the complexity of
variability as the number of configuration labels is re-
duced, i.e., the number of variable features that need to
be managed decreases.
II Never used. Variable features appearing at the bot-
tom concept (⊥) are not used in any of the product
configurations. These features are likely to be features
that have become obsolete. As these variable features
are obviously not needed anymore, they should be re-
moved from the set of offered features. Variable fea-
tures that become obsolete often occur if there used
to exist two alternatives of a feature and one of them
has become the preferred alternative used in all prod-
uct configurations.
III Only used mutually exclusively. Two variable fea-
tures l1 and l2 that appear at different concepts and
whose infimum is the bottom concept, i.e., for which
μ(l1) ⊓ μ(l2) = ⊥ holds, are only used mutually exclusively in the product configurations. These features
are likely to be alternative features, i.e., they always
have to be used mutually exclusively. Although this
is only a hypothesis, the analyst may have additional
background knowledge that these features indeed rep-
resent alternative features. If we consider the distance
between μ(l1) and μ(l2), we can even classify the al-
ternatives according to their usage, i.e., a large distance
between μ(l1) and μ(l2) indicates that one alternative
is used more often than the other. In the case that one
of the alternatives has become the preferred alternative
it will show up at the top concept and the unused alter-
natives will appear at the bottom concept.
IV Only used in pairs. Two variable features l1 and
l2 that appear at the same concept, i.e., for which
μ(l1) = μ(l2) holds are features that are used together
in all product configurations. These are likely to be
features that cannot be used separately. Although this
is only a hypothesis, the analyst may have additional
background knowledge about the correct usage of these
features. If the analyst knows that these features can
only be used together, they should be merged to reduce
the number of variable features and to make clear that
they cannot be used separately.
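The four usage classes can be approximated from the relation table alone, as in the following Python sketch. It makes two assumptions that go beyond the text: the bottom concept is assumed to have an empty extent, so that class III reduces to disjoint sets of using configurations, and features already placed in class I or II are excluded from the pairwise classes III and IV. All names and data are hypothetical.

```python
from itertools import combinations


def classify(table, offered):
    """Classify offered features by their usage in the given configurations."""
    configs = set(table)
    ext = {l: frozenset(p for p in configs if l in table[p]) for l in offered}
    always = {l for l in offered if ext[l] == configs}   # class I: at the top concept
    never = {l for l in offered if not ext[l]}           # class II: at the bottom concept
    rest = sorted(set(offered) - always - never)
    # Class III: never used together in any configuration.
    exclusive = [(a, b) for a, b in combinations(rest, 2) if not ext[a] & ext[b]]
    # Class IV: used in exactly the same configurations.
    paired = [(a, b) for a, b in combinations(rest, 2) if ext[a] == ext[b]]
    return {"always": always, "never": never, "exclusive": exclusive, "paired": paired}


# Hypothetical relation table and offered features.
table = {
    "P1": {"base", "x", "u", "v"},
    "P2": {"base", "y", "u", "v"},
    "P3": {"base", "x"},
}
offered = {"base", "x", "y", "u", "v", "old"}
result = classify(table, offered)
```

As in the text, classes III and IV only yield hypotheses: `x` and `y` are reported as mutually exclusive and `u` and `v` as paired for the analyzed configurations, and an analyst still has to confirm that this reflects the intended semantics.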
3.4 Feature Constraints
Beyond the classification of features into usage classes,
our approach also allows us to automatically derive feature
constraints as described in [4].
Deriving feature constraints. Feature constraints of
the form ”feature A requires features B ∨ C” show up in
the concept lattice as implications between attributes as de-
scribed in Section 2, and can be automatically derived us-
ing the concept analysis tool ConExp [21]. More compli-
cated feature constraints such as ”feature A requires fea-
tures B∧C” can be obtained by combining attributes B and
C to a new attribute B ∧ C in the relation table, repeating
concept analysis, and checking if there exists an implica-
tion between A and the combined attribute B ∧ C. Feature
constraints of the form ”feature A excludes feature B” can
be detected by checking if A and B belong to feature usage
class III. As features in this class are only used mutually
exclusively it is very likely that there exists a mutually ex-
clusive constraint for these features. However, this is only a
hypothesis and should be checked by an analyst.
Adding feature constraints. Feature constraints that
are not reflected in the lattice, e.g., ”variable feature x al-
ways requires variable feature y”, can be integrated in the
lattice via background implications, i.e., implications that
are added to the formal context as described in Section 2.
Usage of feature constraints. The feature constraints
can be used for two different purposes. First, they reduce
the number of variants that can theoretically be configured.
Second, they can be used to check product configurations
for consistency. For example, one can automatically check
product configurations for the inclusion of required features
and the absence of mutually exclusive features.
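A consistency check of this kind could look as follows in Python. The representation of constraints as pairs of feature names, the function name, and the example features are assumptions made for this sketch.

```python
def check_configuration(features, requires=(), excludes=()):
    """Return a list of violated constraints for one product configuration.

    requires: pairs (a, b) meaning "feature a requires feature b"
    excludes: pairs (a, b) meaning "feature a excludes feature b"
    """
    selected = set(features)
    violations = []
    for a, b in requires:
        if a in selected and b not in selected:
            violations.append(f"{a} requires {b}")
    for a, b in excludes:
        if a in selected and b in selected:
            violations.append(f"{a} excludes {b}")
    return violations


# Hypothetical constraints and configurations.
requires = [("A", "B")]
excludes = [("C", "D")]
ok = check_configuration({"A", "B", "C"}, requires, excludes)
bad = check_configuration({"A", "C", "D"}, requires, excludes)
```

A configuration is consistent with respect to the given constraints if the returned list is empty.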
3.5 Restructuring Variability
We intend to construct an interactive tool for restructur-
ing variability. The tool could be realized as an extension to
ConExp [21]. It is already possible to view the lattice and
the associated table in ConExp. One could easily imagine
extensions that automatically classify features and display
the results of this classification to the user. Using this classification, the tool can propose restructuring strategies to simplify the provided variability in a component, package, or subsystem S.

Usage Class               Restructuring Strategy
I   Always used           Merge variable features
II  Never used            Remove variable features
III Mutually exclusive    Mark as alternative
IV  Used in pairs         Merge variable features
Table 1. Restructuring strategies for feature usage classes.

Table 1 summarizes the possible restructuring strategies
for each usage class. In addition, the tool could analyze
the impact of intended restructuring transformations by an-
alyzing the realization of variable features and the mapping
of these features to files in the configuration management.
Specific transformations that the tool should support are:
• Merging variable features. The analyst may decide
to merge two variable features l1 and l2 if the distinction between these variable features is not needed anymore. This requires unifying the sets of files that l1 and l2 are attached to and replacing the configuration labels l1 and l2 by a new common configuration label. The tool should check whether l1 and l2 overlap, i.e.,
have files in common. If l1 and l2 do overlap, the files
that are shared need to be merged as well. Further-
more, the tool should automatically update the product
configurations to remove the obsolete configuration la-
bels after a successful merge of two features.
• Removing variable features. The analyst may decide
to eliminate variable features that are not used in any
of the product configurations. This requires removing
the corresponding configuration labels from the set of
offered features. If the files that these configuration la-
bels are attached to are not needed anymore, they can
be removed as well. If the features to be removed rep-
resent obsolete alternatives of other features, then in
addition to removing the obsolete alternatives, the an-
alyst may decide to merge the remaining ”alternatives”
with other features.
• Mark as alternative. If two variable features are
only used mutually exclusively, and if the analyst
knows that they represent alternative features, then
they should be marked as alternative. The pairs of
mutually exclusive features could be given to an auto-
mated checker for product configurations that ensures
that only one of the alternatives is selected in each
product configuration.
3.6 Filtering of the Concept Lattice
As the resulting lattice for product configurations and
variable features may be too large and too detailed to be
manageable because of its large number of concepts, it is reasonable
to exclude some variable features from the examination. As
a result of excluding some variable features, the lattice will
become much smaller and interesting facts can be derived
from the lattice more easily.
In this section, we describe how we can exclude some
of the variable features to simplify the lattice without sac-
rificing the order of concepts in the lattice. The lattice
changes when variable features are excluded from the ex-
amination. Fortunately, Ganter and Wille [8] have shown
that the smaller lattice after the exclusion of variable fea-
tures can be mapped to the original lattice (the smaller lat-
tice is the result of a so-called subcontext).
Definition 4. Let C = (O, A, T) be a context, O′ ⊆ O, and A′ ⊆ A. Then C′ = (O′, A′, T ∩ (O′ × A′)) is called a subcontext of C, and C is called a supercontext of C′.
In our application of concept analysis, we only remove
columns (one for each variable feature, assuming that vari-
able features occur in columns of the relation table) but
never rows (because we do not want to remove product
configurations). Removing columns leads to a subcontext
(O, A′, T′) in which T′ is a reduced table of T. Fortunately, this modification remains transparent in the lattice. Every attribute extent of (O, A′, T ∩ (O × A′)) is an attribute extent of C = (O, A, T) and, since every concept extent is the intersection of attribute extents, we obtain:
Proposition. Let C = (O, A, T) and C′ = (O, A′, T′), where A′ ⊆ A and T′ = T ∩ (O × A′). Then every extent of C′ is an extent of C.
Proof. See [8].
According to this proposition, each extent within the
subcontext will show up in the supercontext. This can be
made plausible with the relation table: Removed columns
will never change existing columns, so the maximal rectan-
gles forming concepts will only shrink in horizontal direc-
tion (if variable features are listed in columns).
This proposition on the invariability of extents of sub-
contexts that only differ in the set of attributes results in a
simple mapping of concepts from the subcontext to the su-
percontext (for a formal proof see [8]):
(P, L) ↦ (P, σ(P))
The mapping is a ⊓-preserving order-embedding, meaning
that the partial order relationship is completely preserved.
Consequently, the supercontext is basically a refinement of
the subcontext. By this mapping, all concepts of the sub-
context can be found in the supercontext.
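The proposition and the mapping can be checked on a small example. The following Python sketch removes columns from a hypothetical relation table, enumerates the extents of both contexts naively (exponential in the number of attributes, so only suitable for toy data), and maps an extent of the subcontext to its concept in the supercontext. All names and the table contents are assumptions made for illustration.

```python
from itertools import combinations


def tau(attrs, table):
    """Common objects of a set of attributes."""
    return frozenset(o for o, row in table.items() if set(attrs) <= row)


def sigma(objects, table, attributes):
    """Common attributes of a set of objects."""
    return frozenset(a for a in attributes if all(a in table[o] for o in objects))


def extents(table, attributes):
    """All concept extents: tau of every attribute subset (naive enumeration)."""
    attrs = sorted(attributes)
    return {tau(sub, table) for r in range(len(attrs) + 1) for sub in combinations(attrs, r)}


def remove_columns(table, kept):
    """Subcontext that keeps all rows but only the columns in kept."""
    return {o: row & set(kept) for o, row in table.items()}


def embed(ext, table, attributes):
    """Map an extent of the subcontext to its concept (P, sigma(P)) in the supercontext."""
    return (ext, sigma(ext, table, attributes))


# Hypothetical supercontext with three features; column "c" is filtered out.
attributes = {"a", "b", "c"}
table = {"P1": {"a", "b"}, "P2": {"b", "c"}, "P3": {"a", "c"}}
kept = {"a", "b"}
sub_extents = extents(remove_columns(table, kept), kept)
super_extents = extents(table, attributes)
```

In this example the filtered lattice has four extents, all of which reappear among the eight extents of the original lattice, so the filtered lattice can be seen as a coarser view of the original one.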
3.7 Implementation
Our prototype tool to analyze product configurations is
based on the concept analysis tool ConExp [21]. In order
to parse product configurations, we implemented a frontend
for ConExp that extracts the set of used configuration labels
for a component, package, or subsystem S and transforms
this set into a formal context that is used by ConExp for the
calculation and the display of the concept lattice. The set
of offered features can be given to the tool in the form of
a simple list. In order to filter the lattice we extended the
filtering capabilities of ConExp. We are currently working
on an implementation for automated feature classification
and interactive restructuring of variability.
4 Case Study
In a case study, we analyzed (and later restructured) the
provided variability in a subsystem of an embedded industrial software product line that is used in diesel engine control units. A similar system to the one we analyzed is described in [9]. All data presented in this case study has been
obtained from the real system. For reasons of intellectual
property protection, we abstracted the names of product
configurations and variable features. However, all lattices
shown in this case study are isomorphic to the original lat-
tices.
4.1 Case Study Setup
In integration with other subsystems, subsystem X per-
forms a number of important functions to control the en-
gine. Subsystem X offers 55 variable features that can be
used to configure the system according to desired criteria of
X’s behavior, e.g., number of features of type A and B, cal-
ibration strategy T or A, operation mode R. However, not
every combination of variable features that can be selected
will result in a working configuration in the sense that it can
be compiled and calculates correct values for the control of
the engine. Beyond CM-labels selecting features, no other
variability mechanisms are used for subsystem X.
We obtained a set of eleven product configurations
shown in Table 2 to analyze the usage of variable features
of subsystem X. For each product configuration the total
number of used features in the product configuration, the
number of used features for subsystem X, and the number
of used features for component Y (a component of subsys-
tem X) is shown. We extracted the set of variable features
relevant for subsystem X from each product configuration.
The set of offered variable features for subsystem X was obtained from existing documentation. We then used our prototype tool to automatically construct a relation table from the available information and applied concept analysis to obtain a concept lattice.

Figure 2. Lattice for subsystem X. The larger a concept is, the more variable features it contains.

ID    # total   # subsystem X   # component Y
P1    438       41 / 55         18 / 22
P2    333       36 / 55         17 / 22
P3    424       37 / 55         18 / 22
P4    497       36 / 55         18 / 22
P5    411       34 / 55         18 / 22
P6    405       36 / 55         16 / 22
P7    368       36 / 55         16 / 22
P8    455       36 / 55         18 / 22
P9    457       35 / 55         17 / 22
P10   396       30 / 55         17 / 22
P11   480       32 / 55         17 / 22
Table 2. Analyzed product configurations and number of used features.

4.2 General Results
The resulting lattice for subsystem X is shown in Fig-
ure 2. The intents of the concepts (variable features) in the
lattice are omitted for readability reasons. However, the
size of each concept node in this picture is a linear func-
tion of the number of variable features attached to it. As
Figure 2 shows, there are a few concepts containing most
of the variable features of subsystem X. The lattice con-
tains 54 concepts. 21 of them introduce at least one new
variable feature, i.e., to these nodes a variable feature is attached (more precisely, a concept c introduces a variable feature if there exists a variable feature l for which μ(l) = c holds). Concepts that introduce at least one variable feature
are half-filled with gray in the lattice. 33 of the concepts do
not introduce any new variable feature and merely combine
variable features included in several product configurations
(these concepts are shown in white in the lattice). Concepts
are half-filled with black if one or more product configurations are attached to these nodes (more precisely, a concept c is filled half-black if there exists a product configuration p for which γ(p) = c holds).
The first interesting observation is that concepts with at-
tached variable features can be found in the upper region
and in the lower region of the lattice, while in the middle
region, there are almost no concepts which have variable
features attached. That is to say that there are a number of
variable features that are used in almost all product config-
urations (these are the features attached to concepts in the
upper region) and there are a few variable features that are
used in a small number of product configurations (these are
the features attached to concepts in the lower region). A
maintainer that needs to restructure and simplify the pro-
vided variability can use this information to determine can-
didate features that can be merged (these appear in the upper
region) because the variability provided by these features is
not needed to facilitate the product differences. Features
appearing at concepts that also have a product configura-
tion attached are features only used in a specific product
configuration. If these features are not intended to be used
in other product configurations in the future they can be re-
moved because the variability they provide is not needed.
The next interesting observation regards the capability
of product configurations. As we can see in the lattice most
product configurations appear toward the bottom of the lat-
tice. This means that the product configurations share a
large number of variable features and only differ slightly. If
we combine this information with the fact that subsystem X
offers 55 variable features, indicating that we could
theoretically configure 2^55 different variants, we see that
the number of provided variable features by far exceeds
the number of variable features needed to facilitate the
differences in the product configurations.
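To give a sense of scale, and assuming that each of the 55 variable features can be switched on or off independently of the others (which is what the theoretical upper bound presumes), the number of configurable variants is:

```latex
\[
  2^{55} = 36\,028\,797\,018\,963\,968 \approx 3.6 \times 10^{16}
\]
```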
Furthermore, the lattice reveals interesting relationships
between product configurations. For example, the concept
labeled with P4 and P8 indicates that these product config-
urations use exactly the same features of subsystem X.
We inspected the lattice manually and used the documen-
tation of subsystem X to assign meaning to the variable fea-
tures. Using the names of configuration labels helped to
build groups of variable features that are likely to be related
because of a common prefix in the name, e.g., FeatureA
and FeatureA SubfeatureB are likely to be related. How-
ever, the large number of concepts that do not introduce new
variable features made the analysis difficult, as it was very
hard to follow the lines in the lattice especially in the middle
region where a large number of concepts and lines interfere.
In order to solve these problems, we filtered the lattice
by excluding many variable features from our examination.
As shown in Section 3.6 the exclusion of attributes from the
examination does preserve the partial order of the lattice.
In addition, the relationships between variable features be-
came much clearer in the filtered lattice. In our case study,
we focused on the provided variability of component Y, a
component inside subsystem X. Thus, we removed all vari-
able features from the relation table that do not affect com-
ponent Y. We then applied concept analysis on the result-
ing subcontext. Figure 3 shows the concept lattice for the
reduced relation table after removing the variable features
that are not relevant for component Y. For example, the fact
that B2 T is used in product configurations P6 and P7 now
becomes clearly visible in Figure 3, as B2 T, P6, and P7
all appear at concept #9, whereas this fact would be very
hard to see in Figure 2 even if the variable features were not
omitted from the figure.
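The filtering step can be sketched as a restriction of the relation table to a subset of features. The sketch below is a minimal illustration under the assumption that the table is a mapping from product configurations to feature sets; the context FULL and the feature names are hypothetical.

```python
from typing import Dict, Set

Context = Dict[str, Set[str]]  # product configuration -> features used


def restrict(ctx: Context, kept: Set[str]) -> Context:
    """Subcontext that keeps every configuration but only the `kept` features."""
    return {product: used & kept for product, used in ctx.items()}


def products_using(ctx: Context, feature: str) -> Set[str]:
    """Product configurations in which `feature` is used."""
    return {p for p, used in ctx.items() if feature in used}


# Hypothetical table: x* features belong to other components, y* to the
# component of interest.
FULL: Context = {
    "P1": {"x1", "y1", "y2"},
    "P2": {"x2", "y1"},
    "P3": {"x1", "x2", "y1", "y2"},
}
COMPONENT_FEATURES = {"y1", "y2"}
SUB = restrict(FULL, COMPONENT_FEATURES)
```

Because the set of products using a retained feature is the same in FULL and in SUB, the relative order of the retained features in the lattice is unaffected by the removal of the others.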
4.3 Feature Usage Classification
In a next step, we used the lattice in Figure 3 to classify
the usage of features as described in Section 3.3:
Figure 3. Lattice for component Y.
Class I: Always Used. Interestingly, 13 of 22 variable
features offered by component Y are used in every product
configuration because they appear at the top concept
(concept #0). Thus, the provided variability of component Y
is much higher than the required variability for the prod-
ucts that currently use the component. In order to reduce
the complexity of having to manage these unnecessary vari-
able features, all features appearing at concept #0 could
be merged. This would reduce the number of variable
features offered by component Y from 22 to 9 and the number
of variants that can be derived for component Y from 2^22
variants down to 2^9 variants. If the analyst expects that the
variability of these features is still needed for products that
should be derived from the product line in the near future,
it is also possible to merge only a subset of the features
appearing at concept #0. For example, if component Y should
still offer variability for multiple types of feature A and B
(represented by features A1, A2, B1, B2), but the function-
ality for specific operation mode R (represented by all fea-
tures ending with R) is expected to be a mandatory feature
for all future products, the analyst could decide to merge
the following pairs of features: Base and Base R, A1 and
A1 R, A2 and A2 R, B1 and B1 R, and B2 and B2 R. This
would still reduce the number of offered variable features
by 5 and the number of variants from 2^22 to 2^17.
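The variant counts above follow from treating each variable feature as an independent binary choice, which is an assumption of this worked example. Merging all 13 always-used features leaves 9 variable features, while merging only the 5 pairs leaves 17:

```latex
\begin{align*}
  2^{22}                 &= 4\,194\,304 \\
  2^{22 - 13} = 2^{9}    &= 512 \\
  2^{22 - 5}  = 2^{17}   &= 131\,072
\end{align*}
```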
Class II: Never used. In Figure 3, the variable feature
A2 CC appears at the bottom concept (concept #1) indi-
cating that it is never used in any of the product configura-
tions. By consulting the documentation we found out that
the domain engineers formerly intended to offer two alter-
natives for the calculation of correction factors of feature
A2, namely common correction (A2 CC) and separate cor-
rection (A2 SC). However, all current product configura-
tions use A2 SC and none uses A2 CC. Therefore, A2 CC
can be removed from the offered set of variable features and
A2 SC could be merged with A2.
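Classes I and II can be read off the relation table directly, without drawing the lattice. The following sketch assumes that the set of offered features is known separately, since a never-used feature does not occur in any configuration. All names and the toy data are hypothetical.

```python
from typing import Dict, Set, Tuple


def always_and_never_used(
    offered: Set[str], ctx: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str]]:
    """Return (features used in every configuration, features used in none)."""
    if not ctx:
        raise ValueError("need at least one product configuration")
    configs = list(ctx.values())
    always = set(offered).intersection(*configs)  # would sit at the top concept
    never = set(offered).difference(*configs)     # would sit at the bottom concept
    return always, never


# Hypothetical toy data: four offered features, three configurations.
OFFERED = {"base", "opt1", "opt2", "legacy"}
TOY = {
    "P1": {"base", "opt1"},
    "P2": {"base", "opt2"},
    "P3": {"base", "opt1", "opt2"},
}
always, never = always_and_never_used(OFFERED, TOY)
remaining = OFFERED - always - never  # features that still need to be variable
```

In the toy data, only two of the four offered features remain variable, so the theoretical number of variants would drop from 2**4 to 2**2.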
Class III: Only used mutually exclusively. By closely
inspecting the lattice, the analyst can infer that A1 T and
A1 A (respectively A2 T and A2 A) are only used mutually
exclusively because μ(A1 T) ⊓ μ(A1 A) = ⊥ (respectively
μ(A2 T) ⊓ μ(A2 A) = ⊥). Plausibly, it is not possible
to combine A1 T with A2 A or A2 T with A1 A as there
exists no product configuration in which these features are
used together. Therefore, these features should be marked
as alternative.
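A candidate list for class III can be computed by testing, for each pair of used features, whether the sets of products using them are disjoint. This is a sketch under the assumption that features never used at all have already been handled as class II; the toy context is hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def mutually_exclusive_pairs(ctx: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """Pairs of used features that never occur in the same configuration."""
    users: Dict[str, Set[str]] = {}
    for product, used in ctx.items():
        for feature in used:
            users.setdefault(feature, set()).add(product)
    return [
        (a, b)
        for a, b in combinations(sorted(users), 2)
        if users[a].isdisjoint(users[b])
    ]


# Hypothetical toy context with two alternative strategies and one option.
TOY = {
    "P1": {"base", "strat_t"},
    "P2": {"base", "strat_a"},
    "P3": {"base", "strat_t", "opt"},
}
PAIRS = mutually_exclusive_pairs(TOY)
```

Here the result contains ("strat_a", "strat_t") but also ("opt", "strat_a"), which is exclusive only by coincidence in the toy data. The output is therefore a list of candidates that still needs to be confirmed by the analyst.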
Class IV: Only used in pairs. By closely inspecting
the lattice, the analyst can see that features A1 T and A2 T
appear together at concept #2. This indicates that these fea-
tures are likely to be features that cannot be used separately.
The same applies for A1 A and A2 A as they appear to-
gether at concept #13. Therefore, A1 T and A2 T should
be merged as well as A1 A and A2 A.
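Features that appear together at one concept are exactly those used by the same set of product configurations, so class IV candidates can be found by grouping features on that set. The sketch below is illustrative; it skips groups used in every configuration, since those fall into class I, and all names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Set


def co_occurring_groups(ctx: Dict[str, Set[str]]) -> List[List[str]]:
    """Groups of features used by exactly the same, non-total set of products."""
    users: Dict[str, Set[str]] = defaultdict(set)
    for product, used in ctx.items():
        for feature in used:
            users[feature].add(product)

    by_products: Dict[FrozenSet[str], List[str]] = defaultdict(list)
    for feature, products in users.items():
        by_products[frozenset(products)].append(feature)

    total = len(ctx)
    groups = [
        sorted(features)
        for products, features in by_products.items()
        if len(features) > 1 and len(products) < total
    ]
    return sorted(groups)


# Hypothetical toy context.
TOY = {
    "P1": {"base", "core", "a_t", "b_t"},
    "P2": {"base", "core", "a_a", "b_a"},
    "P3": {"base", "core", "a_t", "b_t", "extra"},
}
GROUPS = co_occurring_groups(TOY)
```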
4.4 Feature Constraints
The lattice shown in Figure 3 also reveals interesting
constraints of the form “feature A requires/excludes feature
B” as described in Section 3.4. Some of the constraints that
can be derived from the lattice are:
• {A2 T} → {A1 T, A2 T}: This can be read as
“calibration strategy T of feature A2 requires calibration
strategy T of A1 and A2”.
• {A1 A, A2 A} → {B2 A}: This can be read as
“calibration strategy A of A1 and A2 requires calibration
strategy A of B2”.
• {A1 A, A1 T} → {A1}: This can be interpreted as
“calibration strategy A or T of A1 requires the base
feature A1”.
• Concept #7 shows that it is possible to combine cali-
bration strategy T for A1 (A1 T) and calibration strat-
egy T for A2 (A2 T) with calibration strategy A for B2
(B2 A). This combination is used in product configu-
rations P1, P3, and P9.
• {B2 T} likely excludes {A1 A, A2 A} as these fea-
tures belong to class III.
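Constraints of this form can be tested against the relation table and then used to check a new configuration. The following is a minimal sketch with hypothetical names and toy data; note that a "requires" constraint holds vacuously if no configuration uses its premise, so such results deserve a second look.

```python
from typing import Dict, Iterable, List, Set, Tuple

Context = Dict[str, Set[str]]      # product configuration -> features used
Rule = Tuple[Set[str], Set[str]]   # (left-hand side, right-hand side)


def requires(ctx: Context, premise: Set[str], conclusion: Set[str]) -> bool:
    """True if every configuration using all of `premise` also uses `conclusion`."""
    return all(conclusion <= used for used in ctx.values() if premise <= used)


def excludes(ctx: Context, left: Set[str], right: Set[str]) -> bool:
    """True if no configuration uses all of `left` together with all of `right`."""
    return not any(left <= used and right <= used for used in ctx.values())


def violations(
    config: Set[str], requires_rules: Iterable[Rule], excludes_rules: Iterable[Rule]
) -> List[str]:
    """Describe every constraint that the given configuration breaks."""
    problems = []
    for premise, conclusion in requires_rules:
        if premise <= config and not conclusion <= config:
            problems.append(f"{sorted(premise)} requires {sorted(conclusion)}")
    for left, right in excludes_rules:
        if left <= config and right <= config:
            problems.append(f"{sorted(left)} excludes {sorted(right)}")
    return problems


# Hypothetical toy context.
TOY: Context = {
    "P1": {"a1", "a1_t"},
    "P2": {"a1", "a1_a", "b2_a"},
    "P3": {"a1", "a1_t", "b2_a"},
}
```

For example, violations({"a1_t", "a1_a"}, [({"a1_t"}, {"a1"})], [({"a1_t"}, {"a1_a"})]) reports both a missing base feature and a forbidden combination.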
5 Related Work
The need for managing the variability in a software
product line is generally recognized in the literature [2].
However, the main research on this topic has focused on
processes and the forward-engineering point of view, i.e.,
frameworks for managing the variability during domain en-
gineering [4, 10, 11, 17]. Only a small number of re-
searchers have addressed the problem of variability from
the point of view of reengineering.
The problem of variability restructuring was first identi-
fied by Parnas [15], who suggested that, when a large family
of products gets out of control, the first step must be to re-
duce the size of the program family.
Krone and Snelting [12] were the first to use concept
analysis to analyze programs in which the C preprocessor
(CPP) is used for variability management. The relation be-
tween code pieces and CPP symbols is extracted from a
source file, and the corresponding lattice visualizes interfer-
ences between CPP symbols. Our work differs from Snelt-
ing’s work in three ways. First, in our approach, we apply
concept analysis on the usage of variable features in product
configurations instead of analyzing the relationship of CPP
symbols and code lines. Second, the focus of our work is
on restructuring and simplifying the provided variability in
a software product line whereas the focus of Snelting’s work
is to simplify the source code. Third, the granularity of vari-
ability analyzed in Snelting’s work is very small whereas in
our work the variability is at the granularity level of fea-
tures. Our work goes beyond Snelting’s work by identify-
ing obsolete variability and providing restructuring strate-
gies for variability.
In [3], a method for refactoring product line architectures
is proposed. Based on two metrics that measure the utilization
of provided and required services of optional and variant
components in a product line architecture, refactoring
strategies for these components are derived.
Recently, Deelstra et al. [6] presented a method for as-
sessing the provided variability in a software product line
at a high-level of abstraction. Based on a model of vari-
ation points and dependencies they describe a process to
analyze the provided variability and compare it to the re-
quired variability that is needed to derive future products
from the product line. However, they do not describe how
to determine whether the provided variability is adequate to
the required variability to derive different products.
Formal concept analysis is a well known method in soft-
ware engineering that has been applied to a variety of prob-
lems. Besides the inference of configuration structures from
source code [12], concept analysis has been used for the
evaluation of class hierarchies [18], for the recovery of com-
ponents [13, 16, 19, 20], for feature location [7], and for
reengineering object oriented systems [1].
6 Conclusions and Future Work
We have presented a method for restructuring variabil-
ity by analyzing the usage of variable features by a set of
product configurations. This method is based on concept
analysis and constructs a concept lattice in which
relationships between variable features and product configurations
are made explicit, and where the features used in multiple
product configurations are “factored out”. We have shown
the technique to be capable of finding variable features used
in every product configuration that could be merged with
other features and variable features not used in any product
configuration that could be eliminated. The lattice thus not
only helped us to compare our mental model of the variabil-
ity with the actual use of variable features in real product
configurations, but also to obtain valuable information for
restructuring and simplifying the existing variability. We
have suggested how the lattice can be used as a formal basis
in interactive tools for restructuring the existing variability
of software product line components. The lattice also al-
lowed us to derive feature constraints that can be used to
automatically check product configurations for consistency.
The present paper has focused on analyzing the usage
of variable features in product configurations and the
derivation of feature constraints. We intend to implement
an interactive tool for restructuring variability based on
our technique, and verify its practicality by applying it
to large product line components. Beyond further case
studies and controlled experiments, we want to explore
how the results obtained by the method described in this
paper may be combined with our results of analyzing the
mapping of configuration labels to files in the configuration
management [14]. We believe that combining these results
will considerably ease the process of restructuring the
variability as the realization relationships of variable
features to the files implementing them are made explicit.
Acknowledgements. This work was supported by
Robert Bosch GmbH Corporate Research. We would like
to thank Sa Li for the development of the prototype to an-
alyze product configurations and all anonymous reviewers
that provided valuable feedback on earlier versions of this
paper.
References
[1] G. Arevalo. High-Level Views in Object Oriented Systems using Formal Concept Analysis. PhD thesis, University of Berne, January 2005.
[2] P. Clements and L. Northrop. Software Product Lines: Practices and Patterns. Addison-Wesley, 2001.
[3] M. Critchlow, K. Dodd, J. Chou, and A. van der Hoek. Refactoring product line architectures. In First International Workshop on Refactoring: Achievements, Challenges, and Effects (REFACE’03), pages 23–26, 2003.
[4] K. Czarnecki and C. H. P. Kim. Cardinality-based feature modeling and constraints: A progress report. In Proceedings of the International Workshop on Software Factories At OOPSLA 2005, 2005.
[5] S. Deelstra, M. Sinnema, and J. Bosch. Product derivation in software product families: a case study. Journal of Systems and Software, 74(2):173–194, 2005.
[6] S. Deelstra, M. Sinnema, J. Nijhuis, and J. Bosch. COSVAM: A technique for assessing software variability in software product families. In Proceedings of the 20th IEEE International Conference on Software Maintenance (ICSM’04), pages 458–462, September 2004.
[7] T. Eisenbarth, R. Koschke, and D. Simon. Locating features in source code. IEEE Transactions on Software Engineering, 29(3):210–224, 2003.
[8] B. Ganter and R. Wille. Formal Concept Analysis - Mathematical Foundations. Springer Verlag, 1996.
[9] C. Hammel, B. Boss, H. Jessen, A. Traub, C. Tischer, and H. Honninger. A common software architecture for diesel and gasoline engine control systems of the new generation EDC/MED17. In World Congress & Exhibition of Automotive Engineers (SAE 2003), number 2003-01-1048 in SAE Technical Papers, Detroit, MI, USA, March 2003.
[10] K. C. Kang, S. G. Cohen, J. A. Hess, W. E. Novak, and A. S. Peterson. Feature-oriented domain analysis (FODA) feasibility study. Technical Report CMU/SEI-90-TR-21, SEI, 1990.
[11] K. C. Kang, S. Kim, J. Lee, K. Kim, E. Shin, and M. Huh. FORM: A feature-oriented reuse method with domain-specific reference architectures. Annals of Software Engineering, 5:143–168, 1998.
[12] M. Krone and G. Snelting. On the inference of configuration structures from source code. In Proceedings of the 16th International Conference on Software Engineering (ICSE’94), pages 49–57, Los Alamitos, CA, USA, 1994.
[13] C. Lindig and G. Snelting. Assessing modular structure of legacy code based on mathematical concept analysis. In Proceedings of the 19th International Conference on Software Engineering (ICSE’97), pages 349–359, New York, NY, USA, 1997. ACM Press.
[14] F. Loesch. A formal method to identify variation points in product line assets. In Workshop Software-Reengineering (WSR’06), Bad Honnef, Germany, May 2006.
[15] D. L. Parnas. Software aging. In Proceedings of the 16th International Conference on Software Engineering (ICSE’94), Sorrento, Italy, 1994. IEEE Computer Society Press.
[16] M. Siff and T. W. Reps. Identifying modules via concept analysis. In Proceedings of the International Conference on Software Maintenance (ICSM’97), pages 170–179, Washington, DC, USA, 1997. IEEE Computer Society.
[17] M. Sinnema, S. Deelstra, J. Nijhuis, and J. Bosch. COVAMOF: A framework for modeling variability in software product families. In Proceedings of the Third International Conference on Software Product Lines (SPLC’04), pages 197–213, Boston, MA, USA, August 30-September 2 2004.
[18] G. Snelting and F. Tip. Reengineering class hierarchies using concept analysis. In Proceedings of the 6th ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE’98), pages 99–110, New York, NY, USA, 1998. ACM Press.
[19] P. Tonella. Concept analysis for module restructuring. IEEE Transactions on Software Engineering, 27(4):351–363, 2001.
[20] A. van Deursen and T. Kuipers. Identifying objects using cluster and concept analysis. In Proceedings of the 21st International Conference on Software Engineering (ICSE’99), pages 246–255, Los Alamitos, CA, USA, 1999.
[21] S. A. Yevtushenko. System of data analysis “concept explorer”. In Proceedings of the 7th National Conference on Artificial Intelligence, pages 127–134, Russia, 2000.