
Software Reuse

CHARLES W. KRUEGER

School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213

Software reuse is the process of creating software systems from existing software rather than building software systems from scratch. This simple yet powerful vision was introduced in 1968. Software reuse has, however, failed to become a standard software engineering practice. In an attempt to understand why, researchers have renewed their interest in software reuse and in the obstacles to implementing it.

This paper surveys the different approaches to software reuse found in the research literature. It uses a taxonomy to describe and compare the different approaches and make generalizations about the field of software reuse. The taxonomy characterizes each reuse approach in terms of its reusable artifacts and the way these artifacts are abstracted, selected, specialized, and integrated.

Abstraction plays a central role in software reuse. Concise and expressive abstractions are essential if software artifacts are to be effectively reused. The effectiveness of a reuse technique can be evaluated in terms of cognitive distance, an intuitive gauge of the intellectual effort required to use the technique. Cognitive distance is reduced in two ways: (1) higher level abstractions in a reuse technique reduce the effort required to go from the initial concept of a software system to representations in the reuse technique, and (2) automation reduces the effort required to go from abstractions in a reuse technique to an executable implementation.

This survey will help answer the following questions: What is software reuse? Why reuse software? What are the different approaches to reusing software? How effective are the different approaches? What is required to implement a software reuse technology? Why is software reuse difficult? What are the open areas for research in software reuse?

Categories and Subject Descriptors: D.1.0 [Programming Techniques]: General; D.2.1 [Software Engineering]: Requirements/Specifications - languages, methodologies, tools; D.2.2 [Software Engineering]: Tools and Techniques - modules and interfaces, programmer workbench, software libraries; D.2.m [Software Engineering]: Miscellaneous - reusable software; D.3.2 [Programming Languages]: Language Classifications - specialized application languages, very high-level languages; D.3.4 [Programming Languages]: Processors; F.3.1 [Logics and Meanings of Programs]: Specifying and Verifying and Reasoning about Programs; H.3.1 [Information Storage and Retrieval]: Content Analysis and Indexing - abstracting methods, indexing methods; H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval - query formulation, retrieval models, search process, selection process; I.2.2 [Artificial Intelligence]: Automatic Programming - program synthesis, program transformation.

This work was supported in part by the Hillman Fellowship for Software Engineering, in part by ZTI-SOF of Siemens Corporation, Munich, Germany, and in part by the Avionics Lab, Wright Research and Development Center, Aeronautical Systems Division (AFSC), U.S. Air Force, Wright-Patterson AFB, OH 45433-6543 under contract F33615-90-C-1465, ARPA Order 7597.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
© ACM 0360-0300/92/0600-0131 $01.50


General Terms: Design, Economics, Languages

Additional Key Words and Phrases: Abstraction, cognitive distance, software reuse

CONTENTS

INTRODUCTION
1. ABSTRACTION
   1.1 Abstraction in Software Development
   1.2 Abstraction in Software Reuse
   1.3 Cognitive Distance
2. ORGANIZATION OF THE SURVEY
3. HIGH-LEVEL LANGUAGES
   3.1 Abstraction in High-Level Languages
   3.2 Selection
   3.3 Specialization
   3.4 Integration
   3.5 Appraisal of Reuse Techniques in High-Level Languages
4. DESIGN AND CODE SCAVENGING
   4.1 Abstraction in Design and Code Scavenging
   4.2 Selection
   4.3 Specialization
   4.4 Integration
   4.5 Appraisal of Reuse Techniques in Design and Code Scavenging
5. SOURCE CODE COMPONENTS
   5.1 Abstraction in Source Code Components
   5.2 Selection
   5.3 Specialization
   5.4 Integration
   5.5 Object-Oriented Variants to Component Reuse
   5.6 Appraisal of Reuse Techniques in Source Code Components
6. SOFTWARE SCHEMAS
   6.1 Abstraction in Software Schemas
   6.2 Selection
   6.3 Specialization
   6.4 Integration
   6.5 Appraisal of Reuse Techniques in Software Schemas
7. APPLICATION GENERATORS
   7.1 Abstraction in Application Generators
   7.2 Selection
   7.3 Specialization
   7.4 Integration
   7.5 Software Life Cycle with Application Generators
   7.6 Appraisal of Reuse Techniques in Application Generators
8. VERY HIGH-LEVEL LANGUAGES
   8.1 Abstraction in Very High-Level Languages
   8.2 Selection
   8.3 Specialization
   8.4 Integration
   8.5 Appraisal of Reuse Techniques in Very High-Level Languages
9. TRANSFORMATIONAL SYSTEMS
   9.1 Abstraction in Transformational Systems
   9.2 Selection
   9.3 Specialization
   9.4 Integration
   9.5 Appraisal of Reuse Techniques in Transformational Systems
10. SOFTWARE ARCHITECTURES
   10.1 Abstraction in Software Architectures
   10.2 Selection
   10.3 Specialization
   10.4 Integration
   10.5 Appraisal of Reuse Techniques in Software Architectures
11. SUMMARY
   11.1 Categories and Taxonomy
   11.2 Cognitive Distance
   11.3 General Conclusions
ACKNOWLEDGMENTS
REFERENCES

INTRODUCTION

The 1968 NATO Software Engineering Conference is generally considered the birthplace of the software engineering field [Naur and Randell 1968]. The conference focused on the software crisis: the problem of building large, reliable software systems in a controlled, cost-effective way (it was at this conference that the term software crisis originated). From the beginning, software reuse has been touted as a means for overcoming the software crisis. The seminal paper on software reuse was an invited paper at the conference: Mass Produced Software Components by McIlroy [1968]. McIlroy proposed a library of reusable components and automated techniques for customizing components to different degrees of precision and robustness. McIlroy felt that component libraries could be effectively used for numerical computation, I/O conversion, text processing, and dynamic storage allocation.

Twenty-three years later, many computer scientists still see software reuse as potentially a powerful means of improving the practice of software engineering [Boehm 1987; Brooks 1987; Standish 1984]. The advantage of amortizing software development efforts through reuse continues to be widely acknowledged, even though the tools, methods, languages, and overall understanding of software engineering have changed significantly since 1968 [Biggerstaff and Richter 1989].

ACM Computing Surveys, Vol. 24, No. 2, June 1992

In spite of its promise, software reuse has failed to become standard practice for software construction. In light of this failure, the computer science community has renewed its interest in understanding how and where reuse can be effective and why it has proven so difficult to bring the seemingly simple idea of software reuse to the forefront of software development technologies [Biggerstaff and Perlis 1989a, 1989b; Freeman 1987b; Tracz 1988].

Simply stated, software reuse is using existing software artifacts during the construction of a new software system. The types of artifacts that can be reused are not limited to source code fragments but rather may include design structures, module-level implementation structures, specifications, documentation, transformations, and so on [Freeman 1983].

There is great diversity in the software engineering technologies that involve some form of software reuse. However, there is a commonality among the techniques used. For example, software component libraries, application generators, source code compilers, and generic software templates all involve abstracting, selecting, specializing, and integrating software artifacts [Biggerstaff and Richter 1989]. In this survey, software engineering technologies are analyzed and contrasted in terms of their idiomatic reuse techniques, particularly along the four mentioned dimensions:

Abstraction. All approaches to software reuse use some form of abstraction for software artifacts. Abstraction is the essential feature in any reuse technique. Without abstractions, software developers would be forced to sift through a collection of reusable artifacts trying to figure out what each artifact did, when it could be reused, and how to reuse it.

Selection. Most reuse approaches help software developers locate, compare, and select reusable software artifacts. For example, classification and cataloging schemes can be used to organize a library of reusable artifacts and to guide software developers as they search for artifacts in the library [Horowitz and Munson 1989].

Specialization. With many reuse technologies, similar artifacts are merged into a single generalized (or generic) artifact. After selecting a generalized artifact for reuse, the software developer specializes it through parameters, transformations, constraints, or some other form of refinement. For example, a reusable stack implementation might be parameterized for the maximum stack depth. A programmer using this generalized stack would specialize it by providing a value for this parameter.

Integration. Reuse technologies typically have an integration framework. A software developer uses this framework to combine a collection of selected and specialized artifacts into a complete software system. A module interconnection language is an example of an integration framework [DeRemer and Kron 1976; Prieto-Diaz and Neighbors 1986]. With a module interconnection language, functions are exported from modules that implement them and imported into modules that use them. Modules are assembled into a system by interconnecting modules with the appropriate exports and imports.
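The export/import matching described for module interconnection languages can be sketched in a few lines. The following Python fragment is only an illustration under stated assumptions: the `Module` record and the `interconnect` function are hypothetical names introduced here, and they do not reproduce the notation of any module interconnection language cited above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Module:
    """A module described only by its interface (hypothetical record type)."""
    name: str
    exports: frozenset = field(default_factory=frozenset)
    imports: frozenset = field(default_factory=frozenset)


def interconnect(modules):
    """Map each (importer, function) pair to the module that exports the function.

    Raises ValueError when an imported function has no exporter or several.
    """
    exporters = {}
    for m in modules:
        for fn in m.exports:
            exporters.setdefault(fn, []).append(m.name)

    wiring = {}
    for m in modules:
        for fn in sorted(m.imports):
            providers = exporters.get(fn, [])
            if len(providers) != 1:
                raise ValueError(
                    f"{m.name} imports {fn!r}: expected one exporter, found {providers}"
                )
            wiring[(m.name, fn)] = providers[0]
    return wiring


# Three modules assembled into a system by matching imports to exports.
stack = Module("stack", exports=frozenset({"push", "pop"}))
parser = Module("parser", exports=frozenset({"parse"}),
                imports=frozenset({"push", "pop"}))
main = Module("main", imports=frozenset({"parse"}))

system = interconnect([stack, parser, main])
```

In this sketch the integration framework sees only module interfaces; the bodies that implement `push`, `pop`, and `parse` never appear, which is the sense in which an interface acts as an abstraction.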

From the software engineering perspective, software reuse pertains solely to the process of constructing software systems. Using existing sine routine source code during the construction of a program is considered an example of software reuse, but repeatedly invoking that sine routine during the execution of the program is not. Also excluded are repeated program executions and verbatim duplication of programs for the purpose of distribution.

The goal of this paper is to introduce research efforts in software reuse; it is primarily a survey of the software reuse literature. To provide a coherent perspective of the diverse literature, a uniform taxonomy is developed. Different reuse approaches are then characterized using this taxonomy. Some simple comparative analysis identifies the relative merits and drawbacks of the different reuse techniques and forms the basis for generalizations about the field of software reuse.

As a survey of research literature, this paper is not intended to be a guide for selecting or implementing a reuse technology in practice. Many of the systems described are research prototypes that have not been scaled up or validated in practical use.

1. ABSTRACTION

Because abstraction is such an important part of software reuse, it is used as the unifying theme in this survey. This choice also reflects the view that successful application of a reuse technique to a software engineering technology is inexorably tied to raising the level of abstraction for that technology. Since raising abstraction levels for software engineering technologies has proven to be quite difficult, the relation between abstraction and reuse provides us with the first clue to why there are so few successful reuse systems.

Others have noted the relationship between software reuse and abstraction [Booch 1987; Neighbors 1984; Parnas et al. 1989; Standish 1984; Wegner 1983]. According to Wegner [1983, p. 30], for example, "abstraction and reusability are two sides of the same coin." He states that every abstraction describes a related collection of reusable entities and that every related collection of reusable entities determines an abstraction.

1.1 Abstraction in Software Development

Computer scientists often use abstraction to help manage the intellectual complexity of software [Shaw 1984]. An abstraction for a software artifact is a succinct description that suppresses the details that are unimportant to a software developer and emphasizes the information that is important. For example, the abstraction provided by a high-level programming language allows a programmer to construct algorithms without having to worry about the details of hardware register allocation.

Software typically consists of several layers of abstraction built on top of raw hardware. The lowest-level software abstraction is object code, or machine code. Assembly language is a layer of abstraction above object code. A programming language (e.g., Ada¹) is a layer of abstraction above the assembly language. In modular languages such as Ada, the module specification can serve as a layer of abstraction above the implementation details in the module body.

These examples demonstrate that every software abstraction has two levels. The higher of the two levels is referred to as the abstraction specification. The lower, more detailed level is called the abstraction realization.² When abstractions are layered, the abstraction specification at one layer is the abstraction realization at the next higher layer. Figure 1 shows a hierarchy with two abstractions, L and M. Rep 1, Rep 2, and Rep 3 are three representations of the same software artifact, where Rep 1 is the most detailed (lowest level) representation. For abstraction L, Rep 2 is the abstraction specification, and Rep 1 is the abstraction realization. From the point of view of abstraction M, Rep 3 is the abstraction specification, and Rep 2 is the abstraction realization.

Figure 1. Two-level abstraction hierarchy.

¹ Ada is a registered trademark of the U.S. Government, Ada Joint Program Office.

² Implementation is also common terminology for the lowest level of detail in an abstraction. However, implementation is often associated with executable source code, which is not necessarily the case for all abstractions. For this reason, realization is used in this survey.

An abstraction has a hidden part, a variable part, and a fixed part. The hidden part consists of the details in the abstraction realization that are not visible in the abstraction specification. The variable and fixed parts are visible in the specification. The variable part represents the variant characteristics in the abstraction realization, whereas the fixed part represents invariant characteristics in the abstraction realization. Therefore, as illustrated in Figure 2, an abstraction specification with a variable part corresponds to a collection of alternate realizations. The variable part of an abstraction specification maps into the collection of possible realizations.

For example, in an abstraction for stacks, the fixed part of the abstraction expresses the invariant characteristics for all stack realizations, such as the last-in-first-out (LIFO) semantics. The invariant stack behavior does not depend on the type of elements stored in the stack, so the element type can be in the variable part of the abstraction. Then each different element type corresponds to a different stack realization.

The partitioning of an abstraction into variable, fixed, and hidden parts is not an innate property of the abstraction but rather an arbitrary decision made by the creator of the abstraction. The creator decides what information will be useful to users of the abstraction and puts it in the abstraction specification. The creator also decides which properties of the abstraction the user might want to vary and places them in the variable part of the abstraction specification. Continuing with the stack example, the value for the maximum stack depth can be placed in either the variable, fixed, or hidden part of the stack abstraction. If it is placed in the variable part, the user has the ability to choose the maximum stack depth (e.g., 10, 1000, unbounded). If the maximum stack depth is placed in the fixed part, the user knows the predefined value of maximum stack depth but cannot change it. If placed in the hidden part, the stack depth is totally removed from the concerns of the user.

Figure 2. Mapping from a variable abstraction specification.
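The three placements of the maximum stack depth can be made concrete with a small sketch. The Python classes below are hypothetical illustrations written for this passage (the class names and the value 1000 are assumptions, not taken from any system in the survey); each class realizes the same LIFO behavior but exposes the depth limit differently.

```python
class StackOverflow(Exception):
    """Raised when a push would exceed the maximum stack depth."""


class VariableDepthStack:
    """Maximum depth is in the variable part: the user chooses it."""

    def __init__(self, max_depth=None):  # None is taken to mean unbounded
        self._items = []
        self._max_depth = max_depth

    def push(self, item):
        if self._max_depth is not None and len(self._items) >= self._max_depth:
            raise StackOverflow(f"depth limit {self._max_depth} reached")
        self._items.append(item)

    def pop(self):
        return self._items.pop()


class FixedDepthStack(VariableDepthStack):
    """Maximum depth is in the fixed part: visible to users but not changeable."""

    MAX_DEPTH = 1000  # documented in the specification; illustrative value

    def __init__(self):
        super().__init__(max_depth=self.MAX_DEPTH)


class HiddenDepthStack:
    """Maximum depth is in the hidden part: the specification never mentions it."""

    def __init__(self):
        self._items = []  # storage grows as needed; any limit is a realization detail

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()
```

A user of `VariableDepthStack(max_depth=10)` specializes the abstraction, a user of `FixedDepthStack` can read the limit but not alter it, and a user of `HiddenDepthStack` has no depth concept to reason about at all.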

Abstraction specifications and realizations can take on many forms. They can be formal or informal, explicit or implicit. Consider the stack example written as a generic Ada package. The abstraction realization corresponds to an instantiation of the generic package with a particular stack element type. The abstraction specification, on the other hand, must be a combination of different descriptions because of Ada's limited expressiveness. The generic package can provide the syntactic specification for operations of the stack abstraction, but the semantic specification must be expressed outside of the Ada language. One possibility is to use a formal notation such as Hoare axioms. Another is to use an informal description such as English text.
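The split between syntactic and semantic specification is not specific to Ada. As a rough sketch in Python rather than Ada, the fragment below uses a `typing.Protocol` for the syntactic specification and a separate checking function for two informally stated stack axioms. The names `StackSpec`, `ListStack`, and `satisfies_lifo_axioms` are assumptions made for this example, and checking sample values is a test, not a proof in the sense of Hoare logic.

```python
from typing import Generic, List, Protocol, TypeVar

T = TypeVar("T")


class StackSpec(Protocol[T]):
    """Syntactic specification: operation names and signatures only."""

    def push(self, item: T) -> None: ...
    def pop(self) -> T: ...
    def is_empty(self) -> bool: ...


class ListStack(Generic[T]):
    """One realization of the stack abstraction, backed by a Python list."""

    def __init__(self) -> None:
        self._items: List[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()

    def is_empty(self) -> bool:
        return not self._items


def satisfies_lifo_axioms(make_stack, samples) -> bool:
    """Check consequences of two semantic axioms on a sequence of sample values.

    Axiom 1: after s.push(x), s is not empty.
    Axiom 2: s.push(x) followed by s.pop() returns x and restores s.
    Applied repeatedly, axiom 2 implies that values pop in reverse push order.
    """
    s = make_stack()
    for x in samples:
        s.push(x)
        if s.is_empty():
            return False  # violates axiom 1
    for x in reversed(samples):
        if s.pop() != x:
            return False  # violates the LIFO order implied by axiom 2
    return s.is_empty()
```

Nothing in `StackSpec` distinguishes a stack from a queue with the same operation names; only the semantic check does, which mirrors the point that the semantic specification lives outside the syntactic one.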

Semantic specifications are rarely derived from first principles. Abstraction creators implicitly assume that a certain amount of the abstraction specification is common knowledge among the users. As an extreme example, it might be sufficient simply to give "stack" as a semantic specification and assume that all users understand the stack abstraction without an explicit formal description.

In summary, an abstraction expresses a high-level, succinct, natural, and useful specification that corresponds to a less perspicuous realization level of representation. The abstraction specification typically describes "what" the abstraction does, whereas the abstraction realization describes "how" it is done. For an abstraction to be effective, its specification must express all of the information that is needed by the person who uses it [Shaw 1984]. This may include space/time characteristics, precision statistics, scalability limits, and other information not normally associated with specification techniques [McIlroy 1968].

1.2 Abstraction in Software Reuse

Abstraction plays a central and oftentimes limiting role in each of the other facets of software reuse:

Selection. Reusable artifacts must have concise abstractions so users can efficiently locate, understand, compare, and select the appropriate artifacts from a collection.

Specialization. A generalized reusable artifact is in fact an abstraction with a variable part. Specialization of a generalized artifact corresponds to choosing an abstraction realization from the variable part of an abstraction specification.

Integration. To integrate a reusable artifact into a software system effectively, the user must clearly understand the artifact's interface (i.e., those properties of the artifact that interact with other artifacts or the integration framework). An artifact interface is an abstraction in which the internal details of the artifact are suppressed.

1.3 Cognitive Distance

The effectiveness of abstractions in a software reuse technique can be evaluated in terms of the intellectual effort required to use them. Better abstractions mean that less effort is required from the user. To aid in evaluation, cognitive distance is introduced as an intuitive gauge for comparing abstractions.

Cognitive distance is defined as the amount of intellectual effort that must be expended by software developers in order to take a software system from one stage of development to another. From this definition, it should be clear that cognitive distance is not a formal measurement that can be expressed with numbers and units. Rather, it is an informal notion that relies on intuition about the relative effort required to accomplish various software development tasks.

For the creator of a software reuse technique, the goal is to minimize cognitive distance by (1) using fixed and variable abstractions that are both succinct and expressive, (2) maximizing the hidden part of the abstractions, and (3) using automated mappings from abstraction specification to abstraction realization (e.g., compilers). This can be summarized in an important truism about software reuse:

    For a software reuse technique to be effective, it must reduce the cognitive distance between the initial concept of a system and its final executable implementation.

This truism, like others that arise later in the survey, is an obvious and seemingly simple requirement on software reuse techniques that has proven difficult to satisfy in practice.



2. ORGANIZATION OF THE SURVEY

We partition the different approaches to software reuse into eight categories: high-level languages, design and code scavenging, source code components, software schemas, application generators, very high-level languages, transformational systems, and software architectures. They are presented in Sections 3 through 10, respectively, beginning with reuse techniques that rely on low-level abstractions such as assembly language patterns and progressing through higher level abstractions.

Each of the eight software reuse categories is discussed according to the following taxonomy:

Abstraction. What type of software artifacts are reused and what abstractions are used to describe the artifacts?

Selection. How are reusable artifacts selected for reuse?

Specialization. How are generalized artifacts specialized for reuse?

Integration. How are reusable artifacts integrated to create a complete software system?

Each section ends with a summary of the key taxonomic features for the category and a statement about its pros and cons. These summaries provide a convenient point of reference for comparing the different approaches to software reuse.

Although the categories and the taxonomy illustrate different approaches and issues in software reuse, they are not precise. The real software engineering technologies we will examine often have features that fit into several categories. For example, different features of the Ada language have relevance in four categories: High-Level Languages, Design and Code Scavenging, Source Code Components, and Program Schemas. Likewise, the relative importance of the four facets in the taxonomy differs among the different reuse categories. For example, application generators emphasize the specialization of a single highly abstracted artifact but typically do not require selecting or integrating artifacts.

These imprecisions, however, should not detract from our goal, which is to explore the interesting issues in software reuse.

3. HIGH-LEVEL LANGUAGES

High-level languages such as C, Ada, Lisp, Smalltalk, and ML have not been treated in the literature as examples of software reuse. From the perspective of software developers prior to the existence of these languages, however, the goals and achievements for high-level languages have strong parallels to the current-day aspirations of software reuse researchers. Given their marked level of acceptance and success (e.g., a factor of 5 speedup in writing code [Brooks 1975]), it is interesting to examine high-level language technology in terms of software reuse.

3.1 Abstraction in High-Level Languages

Before high-level languages were developed, software development was done mainly with assembly languages. As developers discovered the inherent complexity of building large systems in this model, they began to search for more effective ways to express computation. The common implementation patterns used in assembly language programming became the primitive constructs in the next generation, high-level languages. Examples include iteration, branching, arithmetic expressions, relational expressions, data declarations, and assignment.

The reusable artifacts in a high-level language are assembly language patterns. A high-level language is at a level of abstraction above the assembly language artifacts. That is, high-level language constructs are abstraction specifications, whereas the corresponding assembly language artifacts are abstraction realizations.

For example, consider a conditional statement in a high-level language:

    if <expression> then
        <statements1>
    else
        <statements2>
    endif


This reusable template is an abstraction specification that directly maps to an abstraction realization, namely the reusable assembly language pattern. Referring back to Figure 2, the variable parts in the abstraction specification are the <expression> slot and the two <statements> slots. The fixed part of the abstraction specification corresponds to the semantic description of the conditional statement, given in the language reference manual:

    The expression is evaluated. If true, execute statements1, otherwise statements2.

The hidden part of the abstraction includes all of the assembly language details such as temporary data in the evaluation of the expression.

Programmers use only the fixed and variable parts of the abstraction specification. They never see the actual assembly language artifacts that are reused because the mapping from specification to realization is fully automated by the compiler.
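The automated mapping from a conditional-statement specification to an assembly language pattern can be illustrated with a toy expansion function. This is a minimal sketch under loud assumptions: `compile_if` is a hypothetical helper, and the mnemonics (`CMP`, `JEQ`, `JMP`, `LOAD`, `GT`, `STORE`) and register `t0` belong to an invented pseudo-assembly, not to any real instruction set or compiler.

```python
import itertools

_label_ids = itertools.count(1)  # source of fresh label numbers (hidden detail)


def compile_if(condition_code, then_code, else_code):
    """Expand an if-then-else template into a flat pseudo-assembly pattern.

    The three arguments are the variable part: the filled slots, given as
    lists of already-compiled instructions. The labels, the temporary
    register, and the jumps form the hidden part; the caller never names them.
    """
    n = next(_label_ids)
    else_label, end_label = f"L{n}_else", f"L{n}_end"
    return [
        *condition_code,        # assumed to leave the Boolean result in t0
        "CMP t0, 0",
        f"JEQ {else_label}",    # condition false: skip the then-branch
        *then_code,
        f"JMP {end_label}",     # then-branch done: skip the else-branch
        f"{else_label}:",
        *else_code,
        f"{end_label}:",
    ]


program = compile_if(
    condition_code=["LOAD t0, x", "GT t0, 0"],
    then_code=["STORE sign, 1"],
    else_code=["STORE sign, -1"],
)
```

Because each slot is just a list of instructions, the output of one `compile_if` call can be passed as the `then_code` or `else_code` of another, so nested conditionals reuse the same pattern without the programmer ever choosing a label.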

3.2 Selection

Since there is a relatively small number of reusable artifacts (i.e., language constructs) in a high-level language, it is relatively simple for a programmer to select among them. The language reference manual and tutorial examples help a novice make the appropriate selections in a particular language. A programmer can typically master a high-level language in a matter of days or weeks.

3.3 Specialization

Most high-level language constructs are generalized constructs with parameterized slots. For example, conditional statements typically have parameterized slots for Boolean expressions and statements. Programmers specialize the generalized language constructs by recursively filling the parameterized slots with other language constructs of the appropriate type.
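Recursive slot filling with constructs "of the appropriate type" can be sketched as a small tree of typed nodes. The Python classes below (`Expr`, `Stmt`, `Var`, `Less`, `Assign`, `If`) and the `render` function are hypothetical names chosen for this illustration; they model a toy language, not the grammar of any language named in the survey.

```python
from dataclasses import dataclass
from typing import Tuple


class Expr: ...   # marker type for expression constructs
class Stmt: ...   # marker type for statement constructs


@dataclass(frozen=True)
class Var(Expr):
    name: str


@dataclass(frozen=True)
class Less(Expr):
    left: Expr
    right: Expr


@dataclass(frozen=True)
class Assign(Stmt):
    target: str
    value: Expr


@dataclass(frozen=True)
class If(Stmt):
    condition: Expr                  # slot for a Boolean expression
    then_branch: Tuple[Stmt, ...]    # slot for statements
    else_branch: Tuple[Stmt, ...]    # slot for statements

    def __post_init__(self):
        # Reject slot fillers of the wrong kind.
        if not isinstance(self.condition, Expr):
            raise TypeError("condition slot requires an expression")
        for s in (*self.then_branch, *self.else_branch):
            if not isinstance(s, Stmt):
                raise TypeError("statement slot requires a statement")


def render(node, indent=0):
    """Print a specialized construct by recursing through its filled slots."""
    pad = "    " * indent
    if isinstance(node, Var):
        return node.name
    if isinstance(node, Less):
        return f"{render(node.left)} < {render(node.right)}"
    if isinstance(node, Assign):
        return f"{pad}{node.target} := {render(node.value)}"
    if isinstance(node, If):
        lines = [f"{pad}if {render(node.condition)} then"]
        lines += [render(s, indent + 1) for s in node.then_branch]
        lines.append(f"{pad}else")
        lines += [render(s, indent + 1) for s in node.else_branch]
        lines.append(f"{pad}endif")
        return "\n".join(lines)
    raise TypeError(f"unknown construct: {node!r}")


smaller = If(Less(Var("a"), Var("b")),
             (Assign("m", Var("a")),),
             (Assign("m", Var("b")),))
```

Since `If` is itself a statement, an `If` node can fill the statement slot of another `If`, which is the recursion the paragraph describes.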

3.4 Integration

Programmers integrate statement-level constructs by recursive specialization as described in the previous paragraph. The larger, encapsulating constructs such as procedures, packages, or modules are integrated with a module interconnection language or with scope rules. At both the statement and module levels, the integration framework of a high-level language defines how individual constructs are composed to form a complete software system. That is, every language construct in a fully integrated program has a well-defined operational semantics that describes its effect on the run-time data and the run-time control flow [Loeckx and Sieber 1984]. (Module interconnection languages are addressed in more detail in Section 5.4.)

3.5 Appraisal of Reuse Techniques in High-Level Languages

From the perspective of today's software development technology, an analysis of high-level languages is almost trite. These languages are often the lowest level of abstraction used by software developers. The accomplishments, strengths, and weaknesses of high-level languages are well known. However, it is not widely recognized that high-level languages are examples of software reuse. Nor is it recognized that, in many ways, high-level language technology is a paragon of software reuse that researchers currently can only hope to emulate. For example, discovery of a new reuse technology that routinely offered a factor of 5 speedup in software development would be among the most significant software engineering achievements of the decade.

High-level language technology provides a good example of how abstraction impacts the effectiveness of software reuse techniques. Compared to using assembly languages, high-level languages are considerably more succinct and have a more natural form for expressing and reasoning about programs. In addition, programmers are largely unaware of the reuse that takes place. They use only the fixed and variable parts of the abstraction specification, while the mapping from specification to realization is fully automated by the compiler. Programmers do not have to examine or understand the internals of the compiler or the assembly language output, which completely isolates them from the details of the hidden and realization parts of the reuse abstraction.

The primary limitation of high-level languages as a reuse technology is the large amount of system design effort required prior to coding: high-level language programming comes late in the software development life cycle. Thus, there is a large cognitive distance between the informal requirements for a software system and its implementation in a high-level language. Table 1 summarizes the high-level language approach to software reuse.

Table 1. Reuse in High-Level Languages

Abstraction. The reusable artifacts in a high-level language are assembly language patterns. High-level language constructs serve as abstraction specifications for low-level assembly language patterns.

Selection. The semantics for each of a small number of constructs is described in a language manual. Experienced programmers easily commit the collection of constructs to memory, which facilitates selection.

Specialization. Language constructs are typically generalized with parameterized slots. Constructs are specialized by recursively filling the parameterized slots with constructs of the appropriate type.

Integration. Language constructs are integrated by recursive specialization, module interconnection rules, or scope rules. The operational semantics of the language defines the effect of integrating constructs on the run-time data and control flow.

Pros. Reusable assembly language patterns provide a factor of 5 speedup in development time compared to programming directly in assembly language because (1) high-level language constructs are succinct and natural, (2) compilers fully automate the mapping from abstraction specification to abstraction realization, and (3) programmers are completely isolated from compiler and assembly language details.

Cons. Due to the need for system design prior to coding, there is still a large cognitive distance between the informal requirements for a software system and its implementation. High-level languages are at a relatively low level of abstraction.

4. DESIGN AND CODE SCAVENGING

Many programmers adopt an ad hoc, although effective, approach to reusing software system designs and source code. They scavenge fragments from existing software systems and use them as part of new software development. Experienced software developers are known to gain great leverage from reusing previous designs [Biggerstaff and Richter 1989].

The goal of design and code scavenging is to reduce the cognitive effort, the number of keystrokes, and therefore the amount of time required to design, implement, and debug a new software system. Scavengers copy as much as possible from analogous systems that have already been designed, implemented, and debugged.

4.1 Abstraction in Design and Code Scavenging

In design and code scavenging, the abstractions used by a software developer are mostly informal, and they often exist only in the mind of the developer. The abstractions are concepts about design and programming that a software developer has learned from experience. Formal education, where students are taught practical design concepts, provides another source of abstraction for scavengers.

The reusable artifacts in scavenging are source code fragments. In code scavenging, a contiguous block of source code is copied from an existing system. In design scavenging, a large block of code is copied, but many of the internal details are deleted while the global template of the design is retained. There is, of course, a continuum between code scavenging with dense code blocks and design scavenging with sparse design templates.

A software developer creates an abstraction for an existing design or code fragment by remembering an abbreviated description. For example, the developer might remember the semantics of a stack and the location of some software that implements a stack. Then, if the developer needs a stack in a new software system, the existing implementation can be scavenged. In this case, the abstraction realization is the existing stack source code. The abstraction specification is the abbreviated description in the memory of the software developer.³

In code scavenging, the software developer is ultimately involved with all parts of the abstraction: the abstraction specification, the abstraction realization, and the mapping from specification to realization. Therefore, there is no hidden part of the abstraction. Initially the developer recalls an existing artifact from a mental abstraction specification. The mapping from specification to realization corresponds to the developer manually locating the existing artifact and modifying it for reuse. Modification and integration are performed at the realization level of the abstraction, which forces the developer to become intimately involved with details in the realization.

4.2 Selection

The abstractions a software developer has in his or her head can be thought of as a library of reusable designs and source code. While designing a new software system, the developer may notice similarities between the new system and abstractions in the library. The developer can directly reuse these abstractions in the new design, or the abstractions can be used as an index to locate existing source code for scavenging.

³ I am being informal and taking great liberties with the notion of abstraction in this section.

4.3 Specialization

A programmer specializes a scavenged code fragment by manually editing it. The fragment is edited to resolve differences between the original context from which it was scavenged and the new context in which it will be reused. For example, a scavenged stack implementation might push and pop integer elements, but a new implementation might need string elements. The programmer must thoroughly understand the lowest-level details in the code fragment to adapt it to its new context correctly.

4.4 Integration

Integrating a scavenged code fragment into a new context means that the programmer has to modify the fragment, the context, or both. For example, variable names in a scavenged code fragment may collide with existing variable names in the new system. This conflict can be resolved by modifying either the context or the fragment. Modifying the fragment corresponds to specialization, as described in the previous section. The issues for modifying the context are essentially the same.

4.5 Appraisal of Reuse Techniques in Design and Code Scavenging

In ideal cases of scavenging, the software developer is able to quickly find large fragments of high-quality source code that can be reused without significant modification. In these cases, the payoff is high. The developer goes directly from an informal abstraction of a design to a fully implemented source code fragment. That is, the cognitive distance between the initial concept of a design and its final executable implementation is small.

ACM Computing Surveys, Vol. 24, No. 2, June 1992

Table 2. Reuse in Design and Code Scavenging

Abstraction      The reusable artifacts in scavenging are source code fragments. The abstractions for these artifacts are informal concepts that a software developer has learned from design and programming experience.

Selection        When a programmer recognizes that part of a new application is similar to one previously written, a search for existing code may lead to code fragments that can be scavenged.

Specialization   A programmer specializes a scavenged code fragment by manually editing it. The programmer must thoroughly understand the lowest level details of the code fragment in order to adapt it correctly to its new context.

Integration      Integrating a scavenged code fragment into a new context means that the programmer has to modify the fragment, the context, or both.

Pros             In ideal cases of scavenging, a software developer is able to adapt large fragments of source code without significant modification. In these cases, cognitive distance is small.

Cons             In the worst cases, a software developer spends more time locating, understanding, modifying, and debugging a scavenged code fragment than the time required to develop the equivalent software from scratch.

In practice, the overall effectiveness of code scavenging is severely restricted by its informality. A programmer can only scavenge those code fragments he or she remembers or knows how to find. There is no systematic way to share fragments among many different programmers. Specialization and integration are inefficient because they require a thorough understanding and direct editing at the realization level (source code). The programmer may introduce errors while modifying the fragment, so the validation, testing, and debugging must be repeated each time a code fragment is scavenged for a new context.

These limitations lead to the second truism of software reuse:

For a software reuse technique to be effective, it must be easier to reuse the artifacts than it is to develop the software from scratch.

Efforts to extend the scope of code scavenging, especially across multiple programmers, often violate this simple constraint. For example, trying to scavenge unfamiliar code from a friend may result in spending more time locating, understanding, modifying, and debugging a code fragment than the time required to develop the equivalent software from scratch. Scavenging is inherently limited by the fact that developers are not isolated from the details in the abstraction realization level. They frequently work at the same level of abstraction while scavenging as they do when building software from scratch. Table 2 summarizes the design- and code-scavenging approach to software reuse.

5. SOURCE CODE COMPONENTS

McIlroy’s [1968] Mass Produced Software Components introduced the notion of software reuse by proposing an industry of off-the-shelf source code components. These components were to serve as building blocks in the construction of larger systems. Given a large enough collection of these components, software developers could ask the question “What mechanism shall we use?” rather than “What mechanism shall we build?”

The nature of a reusable component technology strongly depends on the language in which it is implemented. The components proposed by McIlroy in 1968 were assembly language or Fortran subroutines and therefore emphasized reusable functions. Examples of collections of reusable functions include statistics libraries such as SPSS and numerical analysis libraries such as IMSL. Modern languages have a richer collection of program units, such as modules, packages, subsystems, and classes. The type of reusable components that can be written in these languages is not limited to functions but can also include data-centered artifacts such as abstract data types [Deutsch 1989; EVB Software 1985; Ichbiah 1983; Parnas et al. 1989]. That is, these languages make a clear distinction between control abstraction and data abstraction [Goguen 1986, 1989]. As an example of reusable data-centered components, Booch [1987] defines reusable Ada packages for 11 abstract data types: stacks, lists, strings, queues, deques, rings, maps, sets, bags, trees, and graphs. He also gives reusable function-centered components for control abstractions such as sorting and searching.

The goal of off-the-shelf components is to reduce the cognitive effort, the number of keystrokes, and therefore the amount of time required to design, implement, and debug a new software system. The one-time cost of creating a reusable component is further amortized each time it is reused. Reusable components have proven fruitful in a few narrow areas, such as numerical analysis, but there has been less success with collections of general-purpose components.

5.1 Abstraction in Source Code Components

Component reuse is analogous to code scavenging in that programmers copy existing source code artifacts into new software systems. However, reusable component technologies typically use more systematic techniques such as catalogs and libraries of components. In addition, reusable components are created and stored specifically for the purpose of reuse, whereas with scavenging the reused artifacts are extracted from software that was not written with reuse in mind.

A major challenge for researchers trying to implement large libraries of reusable components is to find concise abstractions for the components. Without abstractions, the user of a component library must examine the source code to determine what each component does. Although the source code is appropriate for the abstraction realization level, it is unreasonable to expect a user to search through a large library of source code to find an appropriate component. The library implementor must provide abstraction specifications that succinctly describe component behavior. The library implementor must also provide classification and retrieval schemes so users can efficiently search for specific components.

It is interesting to note that the best-known successes with component reuse have been in application domains with application-specific, “one-word” abstractions to describe components. These abstractions are universally understood by all programmers in the application domain. For example, in numerical analysis libraries, reusable components for sine and matrix multiply do not require detailed abstraction specifications. The names alone serve as complete abstractions for the users based on their countless hours studying high school and college mathematics. Another example is Booch’s abstract data types (stacks, lists, strings, queues, deques, rings, maps, sets, bags, trees, and graphs), where the names serve as well-defined abstractions for most computer scientists. In both of these examples, the source code components are often accompanied by informal natural language descriptions, which provide more detailed abstraction specifications for the components.

As with code scavenging, the software developer reusing source code components is often involved with all parts of the abstraction. Initially the developer uses the abstraction specification to locate an appropriate component. The mapping from specification to realization corresponds to the developer locating the component in the library and (if necessary) modifying it for reuse. Modification, if needed, and integration are performed at the realization level of the abstraction, which might force the developer to become involved with details in the component implementation.

5.2 Selection

The third truism of software reuse emphasizes a characteristic problem with reusable components:

To select an artifact for reuse, you must know what it does.

When faced with a library of source code components, the user needs a level of abstraction that emphasizes what the components do as opposed to the source code that describes how it is done. Anna is an example of an Ada language extension for annotating components (i.e., Ada packages) with semantic descriptions [Luckham and von Henke 1984]. Although Ada maintains a clear separation between the package declaration and the package implementation, the package declaration provides only syntactic information, which is not sufficient for describing its behavior and intended use. With Anna, it is possible to “understand how to use a package without inspecting its implementation in the private part and body” [Luckham and von Henke 1984, p. 123].

Anna, which stands for Annotated Ada, is a language for formal annotations. Annotations are predicates (i.e., Boolean-valued expressions) that express constraints on Ada language constructs such as data objects, types, subtypes, subprograms, and exceptions. For example, the following is an Ada subtype declaration for the even integers. The first line is the Ada declaration, and the second is an Anna annotation; it is impossible to express this subtype semantics in pure Ada [Luckham and von Henke 1984]:

    subtype EVEN is INTEGER;
    --| where X : EVEN => X mod 2 = 0;

The next example is a simple Anna specification for the behavior of a counter package (also from [Luckham and von Henke 1984]). The two Anna annotations, which begin with --|, formally state that the value of a counter just after initialization will be 0 and that the value of the counter just after an increment will always be 1 more than its value just before the increment:

    package COUNTERS is
        type COUNTER is limited private;
        function VALUE (C : COUNTER) return NATURAL;
        procedure INITIALIZE (C : in out COUNTER);
        --| where out (VALUE (C) = 0);
        procedure INCREMENT (C : in out COUNTER);
        --| where out (VALUE (C) = VALUE (in C) + 1);
    private

    end COUNTERS;
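The two annotations on the counter package can be read as postconditions. The following is a minimal sketch in Python rather than Ada, chosen only for illustration; the class and method names are hypothetical, and runtime assertions stand in for Anna's formal annotations, which they only approximate.

```python
class Counter:
    """Hypothetical counter mirroring the COUNTERS package specification."""

    def __init__(self):
        self._value = 0
        self.initialize()

    def value(self):
        return self._value

    def initialize(self):
        self._value = 0
        # Postcondition, as in the annotation: out (VALUE (C) = 0)
        assert self.value() == 0

    def increment(self):
        before = self.value()  # plays the role of VALUE (in C)
        self._value += 1
        # Postcondition: out (VALUE (C) = VALUE (in C) + 1)
        assert self.value() == before + 1
```

A user of such a class can rely on the stated postconditions without reading the method bodies, which is the point the annotations make for the Ada package.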

In addition to component descriptions, a component library must provide software developers with techniques to locate components efficiently. Without such techniques, a developer must search through a large, monolithic collection of reusable components. This is analogous to searching for a book on a particular subject “in a public library that has no card catalog but has all books arranged alphabetically” [Embley and Woodfield 1987, p. 360]. This leads to the fourth and final truism of software reuse (which is a special case of the second):

To reuse a software artifact effectively, you must be able to find it faster than you could build it.

To address this problem, implementors of a component library can organize components with a classification scheme and then provide manual or automated retrieval techniques. The scheme for classification and retrieval is another level of abstraction over all of the components in the library. For example, the IMSL [1987] math library has a three-volume manual with components hierarchically classified by abstract computational or analytical capabilities. In addition, three different indexes serve as independent abstract interfaces to the library: a keyword index (KWIC), an ACM math software classification index, and an alphabetical index. The keyword and ACM classification indexes are both mappings from abstract descriptions (i.e., abstraction specifications) to reusable components in the library (i.e., abstraction realizations). About 1200 pages are used to describe and classify approximately 900 routines.
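The idea of an index as a mapping from abstract descriptions to components can be sketched as follows. This is a small Python illustration, not the IMSL indexing scheme; the catalog entries, routine names, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical catalog: routine name -> one-line abstraction specification.
CATALOG = {
    "sine": "trigonometric sine of an angle in radians",
    "matmul": "matrix multiply of two dense matrices",
    "eigvals": "eigenvalues of a dense square matrix",
}

def build_keyword_index(catalog):
    """Map each word of every description to the routines it describes."""
    index = defaultdict(set)
    for name, description in catalog.items():
        for word in description.lower().split():
            index[word].add(name)
    return index

def lookup(index, *keywords):
    """Return the routines whose descriptions contain all given keywords."""
    matches = [index.get(k.lower(), set()) for k in keywords]
    return sorted(set.intersection(*matches)) if matches else []

INDEX = build_keyword_index(CATALOG)
```

Here `lookup(INDEX, "dense", "square")` narrows the search to a single routine without the user reading any source code, which is the role the text assigns to classification and retrieval schemes.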

Automated tools for component selection may be necessary for component collections significantly larger than IMSL. An interesting small-scale example is the netlib system for automatically distributing mathematical software components via electronic mail [Dongarra and Grosse 1987]. Users’ queries and requests are drafted according to a simple syntax and sent to an Internet address. A server at that address parses incoming messages and returns responses to queries or requested software components.

The netlib components are all public domain, and the netlib service is provided free of charge. In December 1988, the netlib source code collection consisted of 51 different software packages (including Linpack, published ACM algorithms, Minpack, fft packages, and eigenvalue packages) and occupied about 75 MB of storage. Although this system is limited to mathematical software and is relatively unsophisticated in terms of software classification and retrieval, it elicits images of an international distribution system for a wide range of reusable software artifacts.

5.3 Specialization

As with code scavenging, programmers can specialize reusable components by directly editing the source code. This approach, however, has the same drawbacks that were noted with code scavenging:

● Editing source code forces the software developer to work at a low level of abstraction. The effort required to understand and modify the low-level details of a component offsets a significant amount of the effort saved in reusing the component.

● Editing source code may invalidate the correctness of the original component. This eliminates the ability to amortize validation and verification costs over the life of a reusable component.

Implementors of reusable components can offer a more efficient approach to specialization with generalized components and construction-time parameters. Software developers specialize these components by setting parameters rather than directly editing source code. Well-known examples of this approach include parameterized macro expansions and Ada generics [Ichbiah 1983]. Parameterized components are a form of abstraction. For example, the Ada generic package specification represents a set of possible package realizations, or instantiations. The generic parameters correspond to the variable part of the abstraction specification. Mendal [1986] gives an in-depth example of a generic Ada package for sorting in which parameterization determines the type of elements that are sorted, the indexing scheme into a collection of elements, the partial ordering among elements, and so on.
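Specialization by parameters rather than by editing can be illustrated with a small sketch. It is written in Python, not Ada, and is not Mendal's package; the function and parameter names are hypothetical, and a closure stands in for generic instantiation.

```python
from functools import cmp_to_key

def make_sorter(less_than, key=lambda item: item):
    """Instantiate a sorting component from construction-time parameters.

    less_than: ordering among sort keys (hypothetical parameter name).
    key: how to extract the sort key from each element.
    """
    def compare(a, b):
        ka, kb = key(a), key(b)
        if less_than(ka, kb):
            return -1
        if less_than(kb, ka):
            return 1
        return 0

    def sort(items):
        # The generic body is reused unchanged by every instantiation.
        return sorted(items, key=cmp_to_key(compare))

    return sort

# Two instantiations of the same component; no source code is edited.
sort_ints_descending = make_sorter(lambda a, b: a > b)
sort_names_by_length = make_sorter(lambda a, b: a < b, key=len)
```

Each instantiation fixes the variable part of the abstraction (the ordering and the key), while the sorting logic itself stays untouched and need not be understood by the developer.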

5.4 Integration

Module interconnection languages provide a framework for integrating reusable components [DeRemer and Kron 1976]. Integrating reusable components into a software system is essentially the same as integrating components built from scratch. Module interconnection languages are an integral part of many programming languages such as Modula-3, Standard ML, and Ada, which makes these languages particularly good candidates for implementing reusable components. Readers not familiar with module interconnection languages might be interested in a survey by Prieto-Diaz and Neighbors [1986].

Naming and binding conflicts can present problems for the software developer integrating a reusable component into the context of a new system. Names imported into and exported from the component may clash or be incorrectly bound in the new system.

Ada provides mechanisms to overcome some of these naming problems [Ichbiah 1983]. The dot notation can be used to disambiguate name clashes. For example, if packages P and Q both define functions named F, they can be uniquely identified as P.F and Q.F. Overloading allows name clashes to be automatically disambiguated by the compiler on the basis of differences in the type signatures. And synonyms can be used in situations in which dot notation is too cumbersome or overloading becomes confusing.
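Qualified names and synonyms have close analogues in other languages. The sketch below uses Python's standard `math` and `cmath` modules, which both export a function named `sqrt`; it illustrates qualification and aliasing only, since Python has no counterpart to Ada's compile-time overloading resolution. The variable names and the alias `complex_sqrt` are chosen here for illustration.

```python
import cmath
import math
from cmath import sqrt as complex_sqrt  # a synonym for a clashing name

# Both modules export a function named sqrt; qualification disambiguates.
real_root = math.sqrt(4.0)
complex_root = cmath.sqrt(-4.0)

# The synonym avoids repeating the qualified name.
aliased_root = complex_sqrt(-4.0)
```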

The UNIX⁴ pipe is another example of an integration framework for components [Kernighan 1984]. The reusable components in this paradigm are complete programs. Programs are integrated by sequentially “piping” the output from one program into the input of another program. For example, the UNIX command who is a program that outputs the list of persons logged onto a UNIX system, one per line. The UNIX command wc counts the number of linefeeds in an input string. Piping the output of who into wc produces a new program that counts the number of persons logged into a UNIX system:

    who | wc

where the | designates a pipe.

⁴ UNIX is a trademark of AT&T.
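The pipe as an integration framework can be mimicked in a few lines. This is a Python sketch with hypothetical stand-ins for the two programs; it models only the composition of one component's output with the next component's input, not UNIX processes or real login data.

```python
def who(logged_in):
    """Stand-in for the user-listing program: one line per logged-in user."""
    for user in logged_in:
        yield f"{user}\n"

def count_lines(lines):
    """Stand-in for the line-counting program: count linefeeds in the input."""
    return sum(line.count("\n") for line in lines)

def pipe(source, *stages):
    """Feed the output of each stage into the input of the next."""
    data = source
    for stage in stages:
        data = stage(data)
    return data

# Hypothetical session data; a real user-listing program would query the system.
users_online = pipe(who(["alice", "bob", "carol"]), count_lines)
```

Neither component knows about the other; integration consists only of connecting an output to an input, which is what makes complete programs reusable in this paradigm.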

5.5 Object-Oriented Variants to Component Reuse

Most of the abstraction, selection, specialization, and integration issues outlined earlier in this section also apply to object-oriented languages. The notion of inheritance from the object-oriented paradigm, however, offers a unique contribution to component reuse [Meyer 1989].

5.5.1 Inheritance and Subclasses

Class definitions in an object-oriented language such as Smalltalk are primarily a data abstraction mechanism. Inheritance and subclassing enhance the data abstraction mechanism by establishing a hierarchical relationship among classes and by allowing reuse to occur within the class hierarchy [Deutsch 1989; Liskov 1987]. For example, an indexed collection of items might be defined as a class (abstract data type) with a hidden storage representation and visible operations to store and fetch the ith item in the collection and to get the size of the collection. Subclasses of the indexed collection can reuse the storage representation and the fetch operation, while extending the type with new data and operations. For example, a programmer can define a sequence as a subclass of the indexed collection superclass, with an additional operator for concatenation. The programmer only has to write the concatenation operator for the sequence class definition since the data representation and fetch operation defined in indexed collection are reused.
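The indexed collection example translates directly into a short sketch. It is written in Python rather than Smalltalk, and the class and method names are hypothetical.

```python
class IndexedCollection:
    """Superclass: hidden storage with store, fetch, and size operations."""

    def __init__(self, items=()):
        self._items = list(items)  # hidden storage representation

    def store(self, i, item):
        self._items[i] = item

    def fetch(self, i):
        return self._items[i]

    def size(self):
        return len(self._items)


class Sequence(IndexedCollection):
    """Subclass: reuses storage, fetch, and size; adds only concatenation."""

    def concat(self, other):
        mine = [self.fetch(i) for i in range(self.size())]
        theirs = [other.fetch(i) for i in range(other.size())]
        return Sequence(mine + theirs)
```

The subclass contains one new operation, written entirely in terms of the inherited ones, so the storage representation is never restated.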

Subclassing, therefore, corresponds to specialization of a superclass. That is, the superclass is a template that can be extended by a programmer to create a subclass. Everything in the superclass template is reused in the subclass [Lieberherr and Riel 1988]. The effort required to produce a subclass is proportional to the dissimilarity between the subclass and the superclass [Deutsch 1983].

Liskov [1987] discusses different forms of inheritance and the effect that each form has on the encapsulation of classes. In cases in which encapsulation is compromised for a class, reusability is likewise compromised. When the implementation of a class depends on implementation details in a superclass, the external dependency makes it difficult to understand and reuse the class.

Organizing classes into a subtype hierarchy helps the software developer locate and select reusable classes. Similar classes in a hierarchy are grouped close together. The path from the root of a class hierarchy down to a particular class is a path from a very general abstraction for the class down to more specific abstractions for the class. A developer looking for a reusable class navigates through the hierarchy until the closest match is located [Liskov 1987].

5.5.2 Multiple Inheritance

Some languages offer multiple inheritance as an extension to the inheritance mechanism just described. Multiple inheritance allows a class to have two superclasses, inheriting the data representations and operations from both [Stefik and Bobrow 1986]. Multiple inheritance is therefore a mechanism for merging multiple abstract data types into a single type.

There are several possible forms of multiple inheritance. The simplest is a disjoint union of two or more superclasses, where the data representations and the operations from the superclasses do not interact in any way. In this case the new subclass reuses the properties of all of its parents without modifying their semantics. For example, the Smalltalk read-write-stream subclass is composed through multiple inheritance from the read-stream superclass and the write-stream superclass.

Another form of multiple inheritance is an intersecting union. In this case some of the data or operations from superclasses interact. For example, a stack_set class can be defined by multiple inheritance from a set superclass and a stack superclass. The stack_set maintains a stack and a set as (conceptually) a single data type. The semantics are defined so that the set always contains all objects in the stack, and as usual the set is an unordered collection containing no duplicates. Adding and removing items on the stack_set is allowed only through the push and pop operations. The insert and delete operations for set are disabled. The push and pop operations must be modified to update the set portion of the stack_set in addition to the original operations on the stack [Habermann et al. 1988].
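The intersecting union can be made concrete with a short sketch. It is a Python illustration under stated assumptions, not the implementation of Habermann et al.; the class and method names are hypothetical, and items are assumed to be hashable.

```python
class Stack:
    def __init__(self):
        self._stack = []

    def push(self, item):
        self._stack.append(item)

    def pop(self):
        return self._stack.pop()


class Set:
    def __init__(self):
        self._members = set()

    def insert(self, item):
        self._members.add(item)

    def delete(self, item):
        self._members.discard(item)

    def contains(self, item):
        return item in self._members


class StackSet(Stack, Set):
    """Stack and set merged so the set always holds the stack's contents."""

    def __init__(self):
        Stack.__init__(self)
        Set.__init__(self)

    def push(self, item):
        Stack.push(self, item)
        Set.insert(self, item)       # manually added set update

    def pop(self):
        item = Stack.pop(self)
        if item not in self._stack:  # another copy may remain on the stack
            Set.delete(self, item)
        return item

    def insert(self, item):
        raise NotImplementedError("use push on a StackSet")

    def delete(self, item):
        raise NotImplementedError("use pop on a StackSet")
```

Only `contains` is reused without change; `push` and `pop` had to be rewritten by hand and `insert` and `delete` disabled, which shows how interaction between the superclasses erodes the reuse that inheritance would otherwise provide.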

In this example, modifying the push and pop operations in the stack_set detracts from the reusability of those operations since the software developer must manually modify them. There has been some research using declarative specifications that allow the software developer to merge operations from multiple superclasses almost as easily as merging the data [Kaiser and Garlan 1987, 1989].

5.6 Appraisal of Reuse Techniques in Source Code Components

Compared to code scavenging, reusable component libraries can be considerably more effective since components are written, collected, and organized specifically for the purpose of reuse. The most successful reusable component systems, such as the IMSL math library, rely on concise abstractions from a particular application domain. One-word abstraction specifications such as sine often allow a software developer to go directly from an informal requirement to a fully implemented and tested source code component. Thus, the cognitive distance between the informal concept and its final executable implementation is very small.

For components that do not have simple abstractions, more general specification techniques are required. Unfortunately most programming languages do not come with good abstraction mechanisms for describing component behavior and for doing classification and retrieval. And although some progress has been made using programming language extensions such as Anna, these formal specification languages are not without problems. The primary drawback is that nontrivial specifications are difficult to create and understand; specifications can often be as opaque as source code [Horowitz and Munson 1989]. This offsets the benefits of using them as abstraction specifications for reusable components; that is, it makes the cognitive distance relatively high.

Reusable components can be specialized either by editing the source code directly or with mechanisms such as the Ada generic. Generics provide a level of abstraction that isolates the software developer from many implementation details. Generics can also reduce the risk of inadvertently introducing errors when the component is reused. Using a formal specification language such as Anna, an Ada generic package can be verified to be correct for all legal instantiations of that package. Thus the user does not have to reverify the reused component.

Ada generics can only be parameterized with language constructs such as data types, data objects, and functions. In Section 6 we will see how greater degrees of specialization are possible by parameterizing components with higher level abstractions, such as space/time performance tradeoffs.

Creating a relatively complete and practical library of reusable components is a formidable challenge. Library implementors must have the theory, foresight, and means to produce a collection of components from which software developers can select, specialize, and integrate to satisfy all possible software development requirements. This is currently possible to a limited degree for specific application domains that have a rich and thorough theoretical foundation, such as statistical analysis. General-purpose libraries, however, remain elusive for at least two reasons:

● The implementation characteristics and tradeoffs for data structures and computations are widely variable.

● Library size grows rapidly with respect to general-purpose component size.

Computer science provides general-purpose building blocks such as stacks and lists, but these do not always represent the dominant portion of the computation needed in large software systems [Booch 1987]. Yet even to construct an admittedly modest library of variations on 11 abstract data types (stacks, lists, strings, queues, deques, rings, maps, sets, bags, trees, and graphs) Booch [1987] had to create 501 parameterized components. In order to capture, to a reasonable degree, the tradeoffs in precision, granularity, range, and robustness, McIlroy [1968] envisioned a library with 300 different variations of the “lowly sine routine.” Projecting from these examples, it appears that a library providing a versatile foundation for building software systems must be populated with a very large number of components.

With general-purpose components, bigger is better. A software developer who constructs a software system by selecting, specializing, and integrating 5 1000-line components is typically going to be more effective than a developer using 1000 5-line components. Larger components are, however, more specific, and therefore, more of them are needed to populate a general-purpose library [Biggerstaff and Richter 1989].

For example, consider a programming language with 10 different types of statements. Ten different 1-line components can be built in this language, but using these 1-line components is equivalent to, and exactly as cost-effective as, programming directly in the source language. A library of components of length 6 might be six times more cost-effective to use, but there are 1,000,000 (10^6) possible components of length 6. In general there are 10^N candidate components for populating a library of N-line components in this model. It is therefore necessary for a library implementor to identify a small subset of large granularity components that can serve as building blocks for software systems. Table 3 summarizes the source code component approach to software reuse.
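The growth in this model is simple exponentiation, as a short calculation shows. The sketch is in Python; the function name is hypothetical, and the count assumes, as the model does, that any sequence of statement types is a distinct candidate component.

```python
STATEMENT_TYPES = 10  # statement kinds in the model language

def candidate_components(length, statement_types=STATEMENT_TYPES):
    """Number of distinct statement sequences of the given length."""
    return statement_types ** length

# Candidate counts for components of 1, 3, and 6 lines.
for n in (1, 3, 6):
    print(n, candidate_components(n))
```

Each added line multiplies the number of candidates by 10, while the payoff per component grows only linearly with its length, which is why a library cannot simply enumerate large components.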

6. SOFTWARE SCHEMAS

Software schemas are a formal extension to reusable software components. Reusable components often rely on ad hoc extensions to programming languages to implement reuse techniques such as specification, parameterization, classification, and verification. With software schemas, however, these mechanisms are an integral part of the technology.

Although the intent of software schemas is similar to that of reusable components, the emphasis is on reusing abstract algorithms and data structures rather than reusing source code. To the software developer using reusable schemas, the source code representation will be much less prominent than the abstract notation for describing what the schema is intended to do, under what conditions it can be reused, and how to go about reusing it.

The PARIS system is a representative schema technology [Katz et al. 1989]. The reusable schemas, or program templates, in the PARIS library are instantiated to produce source code. Each schema has a specification that includes the following:

● A formal semantics description of the schema (e.g., what the schema does),

● Assertions for correctly instantiating the schema (e.g., constraints on the variable parts of the schema), and

● Assertions for the valid use of instantiated schema (e.g., preconditions and postconditions for the source code).

Table 3. Reuse in Source Code Components

Abstraction: Currently, the best abstractions for reusable components are domain-specific concepts, such as those found in the numerical analysis and statistical analysis packages. Specification languages such as Anna are general-purpose mechanisms for writing abstraction specifications for components.

Selection: Selection is easier in domain-specific libraries because components can be classified, organized, and retrieved using well-defined properties of the domain. In a general-purpose component library, ease of component selection is dependent on the quality of the abstraction, classification, and retrieval schemes. Describing component behavior with a specification language can help software developers select among a small number of alternatives, but this approach is not suitable for a manual search in a large library.

Specialization: Generalized components with construction-time parameters represent the most effective approach to component specialization. Directly editing component source code is a less-controlled and less-efficient means of specialization. In some object-oriented languages, inheritance offers another means of reuse with superclass specialization.

Integration: Module interconnection languages typically provide the framework for integrating components. Naming and name binding are important module interconnection issues in component reuse since reusable components are constructed independently of any particular context.

Pros: Components are written, tested, and stored specifically for the purpose of reuse. Some application domains, such as numerical analysis, are particularly well suited to reusable component techniques since they have well-defined component abstractions. In these cases, cognitive distance is small.

Cons: Components without well-defined abstractions must be described with natural language or specification language descriptions. These descriptions can often be as difficult to understand as source code, thereby increasing the cognitive distance. Also, general-purpose component libraries will have to be very large, making them difficult to build and use.

A software developer begins an interaction with PARIS by giving a problem statement, that is, a formal set of computational requirements. PARIS then assists in finding a schema that can be instantiated to provably satisfy the problem statement.

6.1 Abstraction in Software Schemas

Knuth’s [1973] book, The Art of Computer Programming, provides a library of abstract descriptions for many basic computer science algorithms and data structures. Programmers can informally use these abstractions when writing source code. The goal of schema researchers is to capture abstractions formally at this level for their schema libraries [Standish 1984].

In the software schema approach, the algorithms and data structures captured by the schemas are the reusable artifacts. The abstraction specification for a schema is a formal exposition of the algorithm or data structure, whereas the abstraction realization corresponds to the source code produced when the schema is instantiated. The fixed part of the abstraction specification formally describes the invariant computation or data structure of the schema, for example, sorting with respect to a particular partial ordering or a queue of an unspecified element type. The variable part of the abstraction specification describes the range of options over which a schema can be instantiated, such as the overflow length for a queue.
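The split between fixed part, variable part, and realization can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: `QUEUE_SCHEMA` and `instantiate_queue` are hypothetical names, a text template stands in for a formal abstraction specification, and the two checks stand in for instantiation assertions. It is not how PARIS or any other surveyed system represents schemas.

```python
from string import Template

# Fixed part: the invariant structure of a bounded FIFO queue.
# Variable part: the class name, element type, and overflow length.
QUEUE_SCHEMA = Template('''\
class ${name}:
    """FIFO queue of ${elem_type} holding at most ${overflow_length} items."""

    def __init__(self):
        self._items = []

    def enqueue(self, item):
        if not isinstance(item, ${elem_type}):
            raise TypeError("expected ${elem_type}")
        if len(self._items) >= ${overflow_length}:
            raise OverflowError("queue is full")
        self._items.append(item)

    def dequeue(self):
        return self._items.pop(0)
''')


def instantiate_queue(name, elem_type, overflow_length):
    """Check the (hypothetical) instantiation constraints, then emit source code."""
    if not name.isidentifier():
        raise ValueError("name must be a valid identifier")
    if overflow_length <= 0:
        raise ValueError("overflow_length must be positive")
    return QUEUE_SCHEMA.substitute(
        name=name, elem_type=elem_type, overflow_length=overflow_length
    )


# The abstraction realization is the generated source code.
source = instantiate_queue("IntQueue", "int", 2)
namespace = {}
exec(source, namespace)  # load the generated class so it can be used
queue = namespace["IntQueue"]()
queue.enqueue(1)
queue.enqueue(2)
```

The developer in this sketch only supplies the three variable values; the generated class text plays the role of the abstraction realization and never needs to be read or edited.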

With schemas, the variable parts of the abstractions can be more general, or higher level, than is possible with parameterized components such as the Ada generic. Parameters in generics are limited to a subset of programming language constructs, whereas in schemas they can range over other software properties such as space/time efficiency tradeoffs. An even more complex type of parameterization is possible with nested schemas, where parameters in a schema can be instantiated with other schemas.

With software schemas, software developers select, specialize, and integrate schemas at the abstraction specification level. In addition, the mapping from the specification to the source code realization (i.e., schema instantiation) can be automated, in which case developers are isolated from the details in the abstraction realization level. This is analogous to high-level languages and in contrast to code scavenging and reusable components.

As with reusable components, a major challenge to implementors of software schemas is to find suitable abstractions and abstraction representations for the schemas. Rich and Waters [1989, p. 313] suggest that the limited success in component and schema reuse “stems not from any resistance to the idea, nor from any lack of trying, but rather from the difficulty of choosing an appropriate formalism for representing components.” Different algorithms and data structures have diverse properties, and this diversity must be expressible in schema representations. For example, matrix add is best defined by a step-by-step operational semantics description of how the computation is accomplished, whereas a deadlock-free scheduler is best described by a declarative set of constraints on what the computation must accomplish [Rich and Waters 1989].

In an attempt to provide a diverse base for describing reusable artifacts, the Plan Calculus⁵ in the Programmer’s Apprentice combines ideas from flowcharts, data abstraction, logical formalisms, and program transformations [Waters 1985]. A flowchart representation is used to describe algorithmic aspects of a schema such as control flow and data flow. Flowcharts can be annotated with logical preconditions and postconditions to capture the declarative aspects of the schemas. Empty boxes within a flowchart represent the variable part of the schema abstraction. A software developer instantiates such a schema by filling in the empty boxes with nested schemas.

⁵ Using the terminology of Rich and Waters [1989], plan corresponds to the notion of software schema in this paper.

PARIS uses temporal logic assertions and quantified logic predicates in schema preconditions, postconditions, and invariants [Katz et al. 1989]. The following example is a PARIS description for an insertion module that inserts an item into an ordered list. The APPLICABILITY CONDITIONS, or preconditions, state that the list is ordered before the insert. The RESULT ASSERTIONS, or postconditions, state that the list is ordered after the insertion and that the new item exists in the list.

    APPLICABILITY CONDITIONS (PRE)
        leq: D1 x D1 -> boolean    --less than or equal to
        forall n | n element_of node :
            n->next != NULL =>
                leq(n->data, n->next->data)

    RESULT ASSERTIONS (POST)
        leq: D1 x D1 -> boolean    --less than or equal to
        equal: D1 x D1 -> boolean
        forall n | n element_of node :
            n->next != NULL =>
                leq(n->data, n->next->data)
        exists n | n element_of node :
            equal(n->data, newdata)
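To show what these assertions require of an instantiated schema, the following Python sketch checks them at run time around a plain ordered insertion. It is an assumption-laden illustration: `is_ordered` and `checked_insert` are hypothetical names, a Python list stands in for the linked nodes, and run-time checks replace the proof that PARIS would attempt with a theorem prover.

```python
def is_ordered(items, leq):
    """forall n: n.next != NULL => leq(n.data, n.next.data)."""
    return all(leq(a, b) for a, b in zip(items, items[1:]))


def checked_insert(items, newdata, leq=lambda a, b: a <= b):
    """Insert newdata into an ordered list, checking PRE and POST."""
    # APPLICABILITY CONDITIONS (PRE): the list is ordered before the insert.
    assert is_ordered(items, leq), "precondition violated: list not ordered"

    # Body of the instantiated schema: skip every item that precedes newdata.
    position = 0
    while position < len(items) and leq(items[position], newdata):
        position += 1
    result = items[:position] + [newdata] + items[position:]

    # RESULT ASSERTIONS (POST): the list is still ordered ...
    assert is_ordered(result, leq), "postcondition violated: list not ordered"
    # ... and some node holds data equal to newdata.
    assert any(item == newdata for item in result), "postcondition violated"
    return result
```

For example, `checked_insert([1, 3, 5], 4)` yields `[1, 3, 4, 5]`, whereas calling it on `[3, 1]` fails the precondition before any insertion is attempted.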

In Goguen’s [1986] Library Interconnection Language (LIL), axiomatic theories with varying degrees of formality are attached to schema parameters, and views describe how schema instantiation satisfies the theories. The following example is a theory for partially ordered sets (POSET):

    theory POSET is
        types ELT        --elements in the set
        functions leq    --less than or equal to
        vars E1 E2 E3 : ELT
        axioms
            (E1 leq E1)
            (E1 leq E3 if E1 leq E2 and E2 leq E3)
            (E1 = E2 if E1 leq E2 and E2 leq E1)
    end POSET



A schema implementor can attach this theory to any schema that is a partially ordered set. The implementor uses a view to bind the type ELT and the function leq in the theory to the corresponding constructs in the schema.
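The three POSET axioms and the idea of a view can be illustrated with a short Python sketch. The names `satisfies_poset` and `divisibility_view` are hypothetical, and the check is exhaustive testing over a finite sample rather than the axiomatic reasoning that LIL theories are meant to support, so it can refute a binding but never prove one.

```python
from itertools import product


def satisfies_poset(elements, leq, equal=lambda a, b: a == b):
    """Test reflexivity, transitivity, and antisymmetry on a finite sample."""
    reflexive = all(leq(e, e) for e in elements)
    transitive = all(
        leq(a, c)
        for a, b, c in product(elements, repeat=3)
        if leq(a, b) and leq(b, c)
    )
    antisymmetric = all(
        equal(a, b)
        for a, b in product(elements, repeat=2)
        if leq(a, b) and leq(b, a)
    )
    return reflexive and transitive and antisymmetric


# A "view" binds the theory's ELT and leq to concrete constructs.
# Here ELT is bound to positive integers and leq to "a divides b".
divisibility_view = {"ELT": int, "leq": lambda a, b: b % a == 0}
sample = [1, 2, 3, 4, 6, 12]
view_is_plausible = satisfies_poset(sample, divisibility_view["leq"])
```

Divisibility is a useful test case because it is a genuine partial order: 2 and 3 are incomparable, yet all three axioms hold on the sample.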

Other examples of schema formalisms include Volpano and Kieburtz’s [1985, 1989] Software Templates with an ML-like functional notation and Embley and Woodfield’s [1987] abstract data type schemas with a regular expression notation.

6.2 Selection

Reusable hardware components, such as TTL integrated circuits, are widely used in hardware design. When lamenting the limited successes of software reuse, software engineers often draw the analogy between software artifacts and hardware components. The analogy is particularly relevant when comparing the techniques for selecting TTL components to the techniques for selecting software components and schemas.

In Texas Instruments’ [1985] The TTL Data Book, approximately 1000 MSI/LSI TTL components are classified in a four-level hierarchy. The selection branching at each level in the hierarchy varies between 2 and 12 (the average being 5.6). Following is an example classification path for a frequency divider component (SN74LS57), with the hierarchy level preceding the selection made at that level; moving to deeper levels in the hierarchy reduces the number of candidate components until only one remains:

(1) Counters

(2) Frequency dividers/multipliers

(3) 60-to-1 frequency divider

(4) Low-power Schottky technology
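As a rough consistency check on these figures, four levels with an average branching of 5.6 give about 5.6⁴ ≈ 983 leaves, in line with the roughly 1000 components cited. The Python sketch below computes that estimate and walks the single classification path listed above. The nested dictionary is a hypothetical excerpt that contains only that one path; it is not the data book’s actual index, and `select` is an illustrative helper name.

```python
# Estimate of the number of leaves in a four-level hierarchy
# with an average branching factor of 5.6 at each level.
levels = 4
average_branching = 5.6
estimated_leaves = average_branching ** levels  # roughly 983

# Hypothetical excerpt of the hierarchy: only the path from the text.
hierarchy = {
    "Counters": {
        "Frequency dividers/multipliers": {
            "60-to-1 frequency divider": {
                "Low-power Schottky technology": "SN74LS57",
            },
        },
    },
}


def select(tree, path):
    """Follow one selection per level; each step narrows the candidates."""
    node = tree
    for choice in path:
        node = node[choice]
    return node


component = select(hierarchy, [
    "Counters",
    "Frequency dividers/multipliers",
    "60-to-1 frequency divider",
    "Low-power Schottky technology",
])
```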

If the user finds the simple abstractions presented at the branches insufficient for making a choice, more detailed abstractions are available for the current candidates. For every component in the data book, there is a detailed exposition presenting different properties of the device at different levels of abstraction. Techniques for describing components include natural language descriptions, input/output tables to describe the device function, logic-level schematics (gates, inverters, flip-flops, etc.), transistor-level schematics, timing diagrams to describe temporal relationships, plots of linear characteristics such as delay time versus device temperature, and the range of normal operating conditions. The more detailed abstractions are typically relevant as the selection process proceeds to deeper levels in the hierarchy.

Analogous to The TTL Data Book, practical techniques for selecting reusable software components or schemas can use hierarchical classification and multiple forms of exposition. For example, the IMSL mathematical library is a small-scale software library that has successfully used these techniques. It is interesting to note that the IMSL library and The TTL Data Book both contain approximately 1000 components.

General-purpose software libraries might contain several orders of magnitude more candidates than the special-purpose IMSL library. In addition, software libraries will probably evolve rapidly, and therefore it may not be practical to maintain them with books or manuals [Biggerstaff and Richter 1989]. Clearly, reuse technologies with a large number of reusable artifacts must provide users with powerful techniques for selecting artifacts. These selection techniques must address three important subproblems (analogous to The TTL Data Book):

● Classification of the artifacts in the library

● Retrieval of artifacts from the library

● Exposition of information necessary to understand, compare, choose, and use the artifacts

The following sections address these issues in detail.

6.2.1 Classification

Prieto-Diaz describes a classification scheme for software artifacts [Prieto-Diaz 1989; Prieto-Diaz and Freeman 1987]. In this scheme, artifacts are classified by two top-level descriptors and three lower-level facets within each descriptor:

Functionality. Describes the function the component is intended to perform. The three lower-level facets of functionality, ordered by decreasing relevance to reusability, are

(1) Function. Operation performed.

(2) Object. Type of data object on which the operation is performed.

(3) Medium. Larger data structure in which the data object is located.

Environment. Describes the context for which the component was designed. The three lower-level facets of environment, ordered by decreasing relevance to reusability, are

(1) System type. Type of subsystem for which the component was designed.

(2) Functional area. Application-dependent activities.

(3) Setting. Application domain.

The following is an example of a classification for a file compression routine that was designed for a database management system:

Function: compress

Object: files

Medium: disk

System Type: file handler

Functional Area: DB management

Setting: catalog sales

The classification scheme allows only predefined values in the facets. The predefined values for a particular facet are related by a conceptual closeness graph. An automated evaluation system can show the user a collection of conceptually similar artifacts.
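One way to picture how a conceptual closeness graph could drive retrieval is sketched below in Python. Everything concrete here is assumed for illustration: the facet terms, edge weights, library entries, and the names `closeness` and `similar_artifacts` are hypothetical, and shortest-path distance is just one plausible reading of "conceptual closeness", not necessarily the measure used in Prieto-Diaz’s scheme.

```python
import heapq

# Hypothetical closeness graph for the Function facet (smaller weight = closer).
FUNCTION_CLOSENESS = {
    "compress": {"encode": 1, "archive": 2},
    "encode": {"compress": 1, "encrypt": 2},
    "archive": {"compress": 2},
    "encrypt": {"encode": 2},
}

# Hypothetical library entries: (artifact name, Function facet value).
LIBRARY = [
    ("pack_file", "compress"),
    ("base64_file", "encode"),
    ("cipher_file", "encrypt"),
    ("tar_directory", "archive"),
]


def closeness(graph, start):
    """Shortest-path distance from start to every reachable term (Dijkstra)."""
    distances = {start: 0}
    frontier = [(0, start)]
    while frontier:
        dist, term = heapq.heappop(frontier)
        if dist > distances.get(term, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in graph.get(term, {}).items():
            candidate = dist + weight
            if candidate < distances.get(neighbor, float("inf")):
                distances[neighbor] = candidate
                heapq.heappush(frontier, (candidate, neighbor))
    return distances


def similar_artifacts(query_function, max_distance):
    """List artifacts whose Function term lies within max_distance of the query."""
    if query_function not in FUNCTION_CLOSENESS:
        raise ValueError("only predefined facet values are allowed")
    distances = closeness(FUNCTION_CLOSENESS, query_function)
    hits = [
        (distances[function], name)
        for name, function in LIBRARY
        if distances.get(function, float("inf")) <= max_distance
    ]
    return [name for _, name in sorted(hits)]
```

A query for `compress` with a distance limit of 2 returns the compression, encoding, and archiving routines, closest first, and leaves out the more distant encryption routine.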

There are practical limitations to this particular classification scheme. For example, it uses function as the primary classification facet, which emphasizes the reuse of functional components and ignores reuse of abstract data types. With this approach, the insert operation for a queue would be classified close to the insert operation for a set and not encapsulated in the queue data type. This classification scheme does, however, demonstrate how classification techniques can be applied to software components, or schemas, analogous to hardware components.

6.2.2 Retrieval

PARIS is notable for its sophisticated assistance in locating software schemas [Katz et al. 1989]. When presented with a problem statement, PARIS attempts to find a schema that satisfies the problem statement. So that PARIS can match schemas to problem statements, each schema in the library includes the following information:

● Parameterization for the schema, which is a list of abstract entities that must be supplied in order to use the schema.

● A schema specification consisting of

    ● Assertions about the substitutions for the abstract entities,

    ● Preconditions for execution of an instantiation of the schema,

    ● Invariants for execution of an instantiation of the schema,

    ● Postconditions for execution of an instantiation of the schema,

    ● Temporal relations for the execution of an instantiation of the schema.

● Proof of correctness for the generic schema specification.

This information is a detailed abstraction specification for a schema. Compared to the syntactic specification of an Ada generic, the semantic assertions that accompany a PARIS schema provide a more precise definition of how to reuse an artifact correctly. That is, a PARIS schema has a narrower interface.

A problem statement is a set of computational requirements. Its form is similar to a schema specification in that there are preconditions and postconditions, but it contains no parameterized sections as do schemas. When presented with a problem statement, PARIS first constructs a list of preliminary candidate schemas by finding all schemas in the library such that

● The preconditions of the problem statement are a superset of the schema preconditions

● The postconditions of the problem statement are a subset of the schema postconditions

The user then selects schemas from the candidates and attempts to find an instantiation that will satisfy the schema assertions. Optionally, PARIS can attempt to instantiate a candidate schema automatically. A theorem prover is used to verify that a particular instantiation satisfies the problem statement and the assertions given for the schema.
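The preliminary candidate filter can be sketched in Python if conditions are simplified to sets of named atomic assertions, so that the superset and subset tests become plain set comparisons. This is an assumption made for illustration only: PARIS works with logic formulas rather than atom sets, and `Spec`, `preliminary_candidates`, the schema names, and the assertion names below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    """Preconditions and postconditions as sets of atomic assertion names."""
    pre: frozenset
    post: frozenset


def preliminary_candidates(problem, library):
    """Keep schemas that need no more than the problem guarantees (pre)
    and deliver at least what the problem demands (post)."""
    return [
        name
        for name, schema in library.items()
        if problem.pre >= schema.pre and problem.post <= schema.post
    ]


# Hypothetical schema library.
LIBRARY = {
    "ordered_insert": Spec(
        pre=frozenset({"list_ordered"}),
        post=frozenset({"list_ordered", "item_in_list"}),
    ),
    "append": Spec(pre=frozenset(), post=frozenset({"item_in_list"})),
    "sort": Spec(pre=frozenset(), post=frozenset({"list_ordered"})),
}

# Problem statement: given an ordered list, add an item and keep the order.
problem = Spec(
    pre=frozenset({"list_ordered"}),
    post=frozenset({"list_ordered", "item_in_list"}),
)
candidates = preliminary_candidates(problem, LIBRARY)
```

In this sketch only the ordered-insert schema survives the filter. The remaining steps, finding an instantiation and verifying it with a theorem prover, are not modeled here.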

With PARIS, users perform schema searches at the abstraction specification level and are therefore isolated from the implementation details at the abstraction realization level. PARIS does not, however, take advantage of the higher level abstractions understood by software developers. For example, a user cannot simply specify that a queue implementation is needed but rather must give an elaborate logic specification of the queue semantics. This monolithic, low-level abstraction representation is necessary for the theorem prover but is not ideal for human users.

Also, the state of the art for theorem provers is not sufficiently advanced for practical use. Finding a schema that satisfies a problem statement requires proof that the schema is logically consistent with the problem statement, which in general is well beyond the scope of current-day theorem-proving technology [Rich and Waters 1988].

6.2.3 Exposition

In The TTL Data Book, a diverse collection of device abstractions is presented to help users understand and select candidate components. Biggerstaff believes that finding an analogous, diverse collection of expositions for reusable software artifacts is the “fundamental operational problem that must be solved in the development of any reuse system” [Biggerstaff and Richter 1989, p. 6]. These abstractions must be useful to both the user and the tools for automated retrieval.

Following is a list of exposition techniques from The TTL Data Book (named at the start of each item) and some suggestions for analogous software counterparts:

Natural language descriptions. This is certainly useful for informal schema descriptions and search techniques such as keyword lookup. But machine understanding of natural language is beyond current capabilities, which limits automated searching techniques [Rich and Waters 1988].

Input/output tables to describe the device function. Logic assertions can be used to express the semantic function of a schema (e.g., preconditions and postconditions). They are not particularly useful for large, complex schemas due to the complexity of the corresponding logic expressions.

Logic-level schematics. Data flow and control flow diagrams such as those used in Plan Calculus resemble the graphical abstractions found in hardware logic-level schematics. These diagrams have isomorphisms to logic formalisms that are appropriate for machine manipulation.

Transistor-level schematics. This corresponds to the source code level of a software schema. Although this is at too low a level for most of the reuse process, it is occasionally useful in the final stages of selection, specialization, and integration. For example, with TTL chips, transistor-level schematics can be consulted to determine whether or not unused pins require external connections. Knuth’s WEB, a system for making source code more comprehensible, is one possible approach for presenting the schema implementation level [Bentley 1986].

Timing diagrams to describe temporal relationships. Temporal logic is suitable for expressing some of the time-dependent characteristics of software.

Plots of characteristics such as delay time versus device temperature. The characteristics of computations within a schema can be expressed in terms of the upper and lower bounds on time/space requirements as a function of input size.

Biggerstaff argues that hypertext systems can be used to manage a large collection of interrelated component or schema abstractions [Biggerstaff 1987; Biggerstaff and Richter 1989]. So-called “webs of information” allow the user to move easily between different abstract views of reusable artifacts.

Seer is an example of the hypertext approach [Latour and Johnson 1988]. It uses a graphical representation of the Booch [1987] taxonomy for reusable components. Each component in the library has a graphical information web that provides access to all of the available information on the component. This information includes formal, informal, syntactic, semantic, graphical, and textual descriptions in three different categories:

Specification information. What the component does.

Usage information. How to use the component correctly.

Implementation information. How the component is implemented.

All three types of information are useful in selecting schemas. In addition, usage information applies to schema specialization and integration, and implementation information is occasionally useful for detailed integration decisions.

6.3 Specialization

Schemas are typically specialized either by substituting language constructs, code fragments, specifications, or nested schemas into parameterized parts of a schema or by choosing from a predefined enumeration of options.

It can be argued that in either case, specialization is a continuation of the selection process. Both selection and specialization are convergent steps toward a single, complete instantiation. In both cases, the software developer may use a formal specification to make the convergent step. When the software developer specializes by substituting a nested schema for a schema parameter, specialization is exactly the same as selection, since the nested schema must be selected from the library before it can be inserted.

PARIS uses both parameter substitution and enumerated choices for specializing the variable parts in the schema abstractions. The variable part of the abstraction is separated into two sections:

● Nonprogram entities, such as abstract data types, subtypes, ranges, functions, and variables.

● Program sections, which can be handwritten source code fragments or nested schema instantiations.

The values substituted for abstract nonprogram entities must come from the concrete nonprogram entities listed in the user’s problem statement. The possible values may be constrained by the applicability conditions of the schema. Therefore, specialization of nonprogram entities in a schema corresponds to matching actual entities in the problem statement to abstract nonprogram entity parameters in the schema. It is truly a continuation of the selection process.

The instantiation of the program sections in a PARIS schema is best described as a substitution. Arbitrarily complex source code fragments are substituted into the program sections, restricted only by the assertions in the section conditions for the schema. If the source code fragment is derived by nested selection of another schema, however, it is a continuation of the selection process.

In LIL [Goguen 1986] and in Embley and Woodfield’s [1987] reusable abstract data types, a schema can have multiple implementations. A schema implementor can create radically different implementations that satisfy the same semantic interface but that exhibit different performance characteristics. The performance profile for the schema then becomes a parameter in the variable part of the schema abstraction. A software developer specializes such a schema by selecting from the enumerated performance options.

For example, consider a schema for an indexed sequential list. This schema maintains an ordered list of elements, so that (1) all elements of the list can be accessed in sequential order, and (2) elements can also be accessed individually with an index key. Sequential access is implemented with a linked list. The schema has two different implementations of the indexed access so a software developer reusing the schema can choose between two different space/time performance profiles. The first implementation provides faster access at the expense of storage space by maintaining an explicit index structure with pointers into the linked list. The second implementation exhibits slower access time and requires less space by omitting the index structure and doing indexed access with a linear search through the list. With this implementation, insertion and deletion are faster since there is no index to update when items are added to or removed from the list.
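A compact Python sketch of this indexed sequential list follows. It is a simplified illustration under stated assumptions: the class and the profile names `fast_access` and `low_space` are hypothetical, keys are assumed to be unique, deletion is omitted, and a dictionary stands in for the explicit index structure. In a schema system the profile would select between separately generated implementations rather than a run-time flag.

```python
class _Node:
    def __init__(self, key, value):
        self.key, self.value, self.next = key, value, None


class IndexedSequentialList:
    """Ordered linked list with keyed access (keys assumed unique).

    The `profile` parameter selects one of two implementations of
    indexed access with different space/time tradeoffs.
    """

    def __init__(self, profile="fast_access"):
        if profile not in ("fast_access", "low_space"):
            raise ValueError("unknown performance profile")
        self._head = None
        # The explicit index exists only in the fast-access implementation.
        self._index = {} if profile == "fast_access" else None

    def insert(self, key, value):
        node = _Node(key, value)
        if self._head is None or key < self._head.key:
            node.next, self._head = self._head, node
        else:
            prev = self._head
            while prev.next is not None and prev.next.key <= key:
                prev = prev.next
            node.next, prev.next = prev.next, node
        if self._index is not None:
            self._index[key] = node  # extra space and work on every insert

    def __iter__(self):
        """Sequential access in key order."""
        node = self._head
        while node is not None:
            yield node.key, node.value
            node = node.next

    def lookup(self, key):
        """Indexed access by key."""
        if self._index is not None:
            return self._index[key].value  # direct access through the index
        for candidate, value in self:      # linear search, no index
            if candidate == key:
                return value
        raise KeyError(key)
```

Both profiles expose the same operations and give the same answers; only the cost of `lookup` and `insert`, and the memory used, differ.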

This example illustrates how schemas with multiple implementations address an important problem in software reuse: the tradeoff between generality and performance. The implementor of a reusable artifact typically tries to generalize the artifact, to make it suitable for a broad range of applications. Increased generality often translates into increased functionality, and increased functionality often translates into performance compromises in the implementation. A schema can, however, offer multiple implementations with a range of functionality and performance options. A software developer can choose the implementation that most closely matches the required functionality and performance. The schema is conceptually one large, generalized artifact, but in reality there can be many specialized implementations that do not pay a performance penalty for unnecessary functionality.

ACM Computing Surveys, Vol. 24, No. 2, June 1992

In some cases, schema implementations may perform significantly worse than very specialized code. If necessary, performance “hot spots” can be identified with conventional monitoring techniques during the execution of the final system, and then the hot spots can be tuned, replaced, or hand coded [Booch 1987].

6.4 Integration

The simplest approach to schema integration is to use the module interconnection language of the schema implementation language. For example, if schema instantiation produces Ada package code, an instantiated schema can be treated as a conventional Ada package. Integration would be based on the syntactic integration rules of Ada.

More sophisticated schema integration techniques rely on semantic specifications. In addition to the syntactic constraints on composition provided by conventional module interconnection languages, formal or informal semantic specifications serve to narrow the component interfaces to include only correct syntactic and semantic composition. For schemas with complex interfaces, it is particularly important to have a clear description of both the syntactic and the semantic interface [Latour and Johnson 1988].

LIL (introduced in Section 6.1) is an example of a schema technology that uses semantic specifications to compose schemas [Goguen 1986]. Parameters in Ada generics are extended with semantic constraints that guarantee an instantiation to be semantically correct. For example, consider a generic Sort package, parameterized by the type of elements it sorts. In LIL, the implementor of this package could put a semantic constraint on the parameter so the element type satisfies the Partially Ordered Set theory. This would guarantee that an instantiated Sort package would always be able to make a semantically meaningful “greater than or equal to” comparison on the elements.
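The difference between a syntactic and a semantic constraint on the Sort parameter can be sketched in Python. Any two-argument function returning a boolean would satisfy a purely syntactic interface; the hypothetical `instantiate_sort` below additionally rejects comparisons that break the partial-order axioms on a set of witness values. This is run-time testing under assumed names, not the static guarantee LIL aims for, and the examples use total orders because Python's `sorted` expects a consistent comparison (a genuinely partial order would call for a topological sort instead).

```python
from functools import cmp_to_key
from itertools import product


def instantiate_sort(leq, witnesses):
    """Return a sort function only if leq behaves like a partial order
    on the supplied witness values (a test, not a proof)."""
    for a, b, c in product(witnesses, repeat=3):
        if not leq(a, a):
            raise ValueError("constraint violated: leq is not reflexive")
        if leq(a, b) and leq(b, a) and a != b:
            raise ValueError("constraint violated: leq is not antisymmetric")
        if leq(a, b) and leq(b, c) and not leq(a, c):
            raise ValueError("constraint violated: leq is not transitive")

    def compare(a, b):
        # Translate the leq relation into the -1/0/1 form sorted() expects.
        if leq(a, b) and leq(b, a):
            return 0
        return -1 if leq(a, b) else 1

    return lambda items: sorted(items, key=cmp_to_key(compare))


# Accepted: ">=" is a partial (indeed total) order on integers.
sort_descending = instantiate_sort(lambda a, b: a >= b, [1, 2, 3])
result = sort_descending([2, 3, 1])
```

A strict comparison such as `lambda a, b: a < b` has the right signature but is rejected, because it fails the reflexivity axiom on the witnesses.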

In LIL, a distinction is made between horizontal and vertical schema composition. In vertical composition, higher-level abstractions are created. For example, a package implementing a low-level List abstraction might be used to implement a higher-level abstraction such as a Sort algorithm. In horizontal composition, schemas are instantiated by specialization. For example, a generic Sort schema instantiated with a partially ordered set of Labels results in a package that is still at the Sort level of abstraction. Thus, horizontal composition corresponds to schema composition using nested schemas.

6.5 Appraisal of Reuse Techniques in Software Schemas

Compared to reusable source code components, reusable schemas place a greater emphasis on the abstract specification of algorithms and data structures and place less emphasis on the source code implementation. This shift in emphasis helps reduce the cognitive distance between the informal requirements of a system and its executable implementation by isolating the software developer from the source-code-level detail. In cases in which the mapping from the schema abstraction specification to the source code realization is fully automated, the cognitive distance is reduced by one level in the software development lifecycle: up from the level that describes how an algorithm or data structure is implemented to the level that specifies the semantics of what is implemented.

With TTL components, low abstraction-level workhorses such as the AND gates and INVERTERS offer leverage to hardware developers. By analogy, reusable software schemas for stacks, lists, trees, and so on offer leverage to software developers. It is the higher level TTL device abstractions, such as counters, multiplexers, and even CPUs, however, that give hardware designers the greatest leverage. These devices typically have a large ratio of internal gate counts to external pin count (i.e., large function-to-interface ratio). Similarly, software schemas with high-level abstractions and narrow interfaces allow system designers to select, specialize, and integrate powerful, high functionality artifacts with relative ease.

Unfortunately, with software systems we do not have many universal abstractions above the stack, list, tree, and so on level. Therefore, the semantics of higher level schema abstractions are often expressed with logic formalisms and specification languages. In a sense these formalisms become the programming language for software developers using the schemas. Thus, the challenge for implementors of a schema technology is to find abstraction formalisms that are natural, succinct, and expressive. Members of the PARIS project describe this challenge:

Our experience in building the PARIS prototype indicates that transforming abstract logic expressions into precise implementation definitions is quite difficult. It is a nontrivial task to construct a system that is able to provide a friendly interface to the user and at the same time produce correct and efficient input to the theorem-proving mechanism [Katz et al. 1989].

Table 4 summarizes the software schema approach to software reuse.

7. APPLICATION GENERATORS

Application generators operate like programming language compilers: input specifications are automatically translated into executable programs [Cleveland 1988]. Application generators differ from traditional compilers in that the input specifications are typically very high-level, special-purpose abstractions from a very narrow application domain.

By focusing on a narrow domain, the code expansion (ratio of input size to output size) in application generators can be one or more orders of magnitude greater than code expansion in programming language compilers. Levy [1986] reports that application generators have been used to create production quality commercial software at the equivalent rate of 2000 lines of source code per day.

Whereas software schemas reuse algorithm and data structure designs, application generators go one step further and reuse complete software system designs. In application generators, algorithms and data structures are automatically selected so the software developer can concentrate on what the system should do rather than how it is done. That is, application generators clearly separate the system specification from its implementation [Cleveland 1988]. At this level of abstraction, it is possible for even nonprogrammers familiar with concepts in an application domain to create software systems [Horowitz et al. 1985].

Table 4. Reuse in Software Schemas

Abstraction: The schema approach emphasizes the reuse of algorithms, abstract data types, and higher level abstractions. Formal semantic specifications are typically used to express schema abstractions. Examples include data flow, control flow, pre, post, and invariant logic expressions, regular expressions, temporal logic assertions, and axiomatic theories.

Selection: Classification and search schemes and multiple abstraction exposition forms, analogous to The TTL Data Book, offer promise as practical selection techniques for reusable software schemas. The issue of scale will make large schema libraries more difficult to use than the hardware analog. Automated assistance such as hypertext databases and theorem provers can help for schema selection.

Specialization: Schemas are typically specialized by filling in parameterized slots. Examples include insertion of arbitrary language constructs, choosing from an enumerated set of abstractions, or instantiating parameters with nested schemas. Schema specialization may involve characteristics not typically associated with parameterized software, such as explicit choices among run time performance alternatives.

Integration: Schema composition often involves both the semantic and syntactic schema interface. This results in a narrower interface when compared to syntactically typed languages such as Ada, so more semantic errors can be detected at design time.

Pros: When schemas are based on formal specifications, automated tools can be used for selection, specialization, and integration. This, in combination with the fact that software developers work at a higher level of abstraction, can reduce the cognitive distance between the requirements and the implementations of algorithms and data structures.

Cons: Formal specifications for schema abstractions can be large and complex. Even with automated tools it can be difficult for software developers to locate, understand, and use schemas. This complexity serves to increase the cognitive distance, thereby offsetting some of the advantages of using higher level abstractions.

Application generators are appropriate in application domains where

● Many similar software systems are written,

● One software system is modified or rewritten many times during its lifetime, or

● Many prototypes of a system are necessary to converge on a usable product.

In all of these cases, significant duplication and overlap result if the software systems are built from scratch. Application generators generalize and embody the commonalities, so they are implemented once when the application generator is built and then reused each time a software system is built using the generator.

7.1 Abstraction in Application Generators

The abstractions presented to the user of an application generator typically come directly from the corresponding application domain. The way these abstractions are presented is dependent on the application domain [Cleveland 1988]. Different applications need different models of data and computation in the generated programs, and these different models are amenable to different exposition techniques [Levy 1986]. Examples include

● Textual specification languages (appli-cation-oriented languages, fourth-generation languages, etc.)

* Templates


● Graphical diagrams

● Interactive menu-driven dialogs

● Structure-oriented interfaces

The reusable artifacts in an application generator are the global system architecture, major subsystems within this global architecture, and very specific data structures and algorithms. The abstraction specification for an application generator describes all parts of the application domain that can be modeled in a generated system. The abstraction realization level corresponds to the executable systems produced by the generator. The fixed part of the abstraction specification corresponds to those parts of the application the user cannot modify during the generation process; the variable part corresponds to those parts of the application the user can customize.

For example, with a parser generator the user specifies (in the variable part of the abstraction specification) the grammar of the language to be parsed. The generator might, however, have LR(1) as a fixed part of the abstraction specification, in which case the user cannot change the parsing algorithm. The combination of the fixed part and the range of the variable part for an application generator determines the range of systems that can be generated; it is referred to as the domain width or domain coverage of the generator [Cleveland 1988].
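The split between fixed and variable parts can be illustrated with a minimal sketch. It assumes a toy generator, `make_recognizer` (a hypothetical name, not taken from any system cited here), whose variable part is a grammar supplied by the user and whose fixed part is a backtracking top-down search. That search stands in for the LR(1) algorithm mentioned above purely to keep the example short.

```python
def make_recognizer(grammar, start):
    """Build a recognizer for `grammar` (the variable part).

    `grammar` maps each nonterminal to a list of alternatives; an alternative
    is a list of symbols. Any symbol that is not a key of `grammar` is a
    terminal. The search strategy below is the fixed part: callers cannot
    change it. It does not terminate on left-recursive grammars.
    """

    def match_symbol(symbol, tokens, pos):
        # Yield every input position reachable after matching `symbol` at `pos`.
        if symbol not in grammar:  # terminal symbol
            if pos < len(tokens) and tokens[pos] == symbol:
                yield pos + 1
            return
        for alternative in grammar[symbol]:
            yield from match_sequence(alternative, tokens, pos)

    def match_sequence(symbols, tokens, pos):
        # Match the symbols of one alternative left to right, with backtracking.
        if not symbols:
            yield pos
            return
        for middle in match_symbol(symbols[0], tokens, pos):
            yield from match_sequence(symbols[1:], tokens, middle)

    def recognize(tokens):
        tokens = list(tokens)
        return any(end == len(tokens) for end in match_symbol(start, tokens, 0))

    return recognize


# Two specializations of the same generator: only the grammar differs.
balanced = make_recognizer({"S": [["(", "S", ")", "S"], []]}, "S")
sums = make_recognizer({"E": [["n", "+", "E"], ["n"]]}, "E")
```

In this sketch the domain coverage is the set of non-left-recursive context-free grammars: anything outside that range cannot be generated, however the input specification is written.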

Abstraction is used effectively in application generators. The software developer works exclusively at a high abstraction level. Typically the hidden part of the abstraction covers all of the implementation details in the generator, and the user does not view, understand, or modify the generator output. This is analogous to conventional high-level language compilers where the user treats the input specification as the lowest level of abstraction for software development. The workings of the compiler and the object code produced are black boxes.

It is interesting to compare application generators to software schemas. Both approaches use diversely and highly parametrized reusable artifacts. With application generators, the product is a complete, executable system; with software schemas a smaller unit, such as a module or a subsystem, is produced. The primary distinction is that schemas are abstractions for software implementation artifacts such as algorithms and data structures, whereas application generators are abstractions for the semantics of a particular application domain. In comparing the effectiveness of the two approaches, an important question to ask is whether or not it is easier for the user to deal with application domain abstractions rather than computational abstractions. Successful reuse applications, such as the IMSL mathematical library, provide some evidence that users are comfortable dealing with domain abstractions.

7.2 Selection

Selecting an application generator to solve a particular problem means choosing an application generator whose domain coverage includes the requirements of the problem statement. For situations in which one or more application generators are known to exist, the selection process may be easy. Currently, however, there are only a small number of application generators in a few limited domains, and their availability is not always well known [Cleveland 1988]. Also, application generators are often highly specialized, limiting their domain coverage [Biggerstaff and Richter 1989]. As a result, application generators currently provide a very small coverage for software development in general.

As more and more application generators are built, we might expect to see libraries of application generators. Such a library would be similar to, but at a much higher level of abstraction than, the libraries of reusable components or schemas described in Section 6.2. Ideally, an application generator library would cover a broad enough range of applications to be considered a source of general computation for common applications. Classification, search, and exposition in this library could be done in terms of domain characteristics rather than algorithmic or data requirements. Initial steps in the direction of an application generator library are described in Section 10.

7.3 Specialization

Specialization is the primary task of a software developer using an application generator. A software developer specializes an application generator by providing an input specification. Implementors of application generators have used a wide variety of techniques for specialization. To demonstrate this diversity, the following sections describe several application generators and the abstractions used for specialization.

7.3.1 “Conventional” Application Generators

Application generators are sometimes associated with only a narrow collection of business applications. For example, Horowitz et al. [1985] in “A Survey of Application Generators” discuss only generators for business-oriented, data-intensive applications. Application generators from this domain often cover high-level report generation, data processing, and data display techniques. The abstractions presented in these systems include

● Database management based on a particular data model, such as hierarchical or relational. Specialization consists of specifying the database schema (types, fields, relations, etc.) for an application.

● Textual report generation, which is often described by a nonprocedural (declarative) language and is based on associative data retrieval.

● Graphical report generation, which uses a similar input specification as textual report generation but produces graphs, histograms, bar charts, scatter plots, pie charts, and so on.

● Database manipulation for batch or interactive modification, processing, and restructuring of the database contents. Specifications can include constraints on particular data items, constraints between data items, access control constraints, and the way of viewing the contents of the database.

The following example is a simple input specification to a report generator and a sample output from the generated system [Horowitz et al. 1985]:

    report
      file SALES
      title 'UNITS SOLD PER CUSTOMER'
      by CUSTOMER
      across YEAR
      sum (UNITS)
      rowtotal
      columntotal
      where YEAR in 1980...1982
    end report

            UNITS SOLD PER CUSTOMER

                 1980    1981    1982
    CUSTOMER    UNITS   UNITS   UNITS   TOTAL

    A            2344    3455    1234    7033
    B             234    1111    3221    4566
    C             412     555    1212    2179

    TOTAL        2990    5121    5667   13778
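To give a sense of the work the report generator hides, the following is a rough sketch of code that a developer might otherwise write by hand for this one report. The `SALES` records and the function name `units_report` are assumptions made for illustration; the record values are chosen so that the totals agree with the sample output, and the extra 1979 record shows the effect of the `where` clause.

```python
from collections import defaultdict

# Hypothetical SALES records as (CUSTOMER, YEAR, UNITS) tuples.
SALES = [
    ("A", 1980, 2344), ("A", 1981, 3455), ("A", 1982, 1234),
    ("B", 1980, 234), ("B", 1981, 1111), ("B", 1982, 3221),
    ("C", 1980, 412), ("C", 1981, 555), ("C", 1982, 1212),
    ("A", 1979, 999),  # outside the requested years, so it is filtered out
]


def units_report(records, first_year, last_year):
    """Cross-tabulate UNITS by CUSTOMER (rows) across YEAR (columns)."""
    cells = defaultdict(int)
    for customer, year, units in records:
        if first_year <= year <= last_year:   # where YEAR in first...last
            cells[(customer, year)] += units  # sum (UNITS)
    customers = sorted({customer for customer, _ in cells})
    years = sorted({year for _, year in cells})
    rows = {c: [cells.get((c, y), 0) for y in years] for c in customers}
    return {
        "years": years,
        "rows": rows,
        # rowtotal: one total per customer
        "row_totals": {c: sum(values) for c, values in rows.items()},
        # columntotal: one total per year
        "column_totals": [sum(rows[c][i] for c in customers)
                          for i in range(len(years))],
        "grand_total": sum(sum(values) for values in rows.values()),
    }


report = units_report(SALES, 1980, 1982)
```

Each clause of the ten-line specification corresponds to a line or two here, and a change such as adding a second summary column would require editing the code rather than the specification.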

7.3.2 Expert System Generators

Expert systems are software systems that embody and apply expert knowledge to solve problems in a particular domain. Whereas conventional software system development codifies algorithmic problem-solving knowledge, expert system development codifies the subjective problem-solving knowledge of experts [Hayes-Roth 1983]. By analogy, if application generators for conventional software systems reuse common algorithms and data structures, then application generators for expert systems can reuse common expert problem-solving methods [Krueger 1987]. This idea of generating expert systems is advocated by expert system development tools such as MOLE, SALT, and SEAR [Marcus 1988; McDermott 1986].

Experts in different areas may use similar problem-solving techniques even though their actual knowledge is quite different. For example, an auto mechanic and a medical doctor may use a similar diagnostic method. Expert system generators reuse implementations of these generic problem-solving methods. The fixed part of an expert system generator’s abstraction has a particular problem-solving method, whereas the variable part (by which it is specialized) consists of the expert knowledge from a particular area of expertise. The input specification is typically derived through an interactive dialog between the expert system generator and a domain expert. This process is referred to as knowledge acquisition.

For example, MOLE is an expert system generator that can be used for a wide variety of diagnostic tasks, including medical diagnosis, car repair, and production line troubleshooting [Eshelman 1988]. The fixed problem-solving method reused in MOLE consists of the following steps:

(1) Determine all hypotheses that explain the symptoms.

(2) If there is more than one hypothesis that explains a symptom, then identify what data are necessary to differentiate the hypotheses. Data for differentiation can serve the following roles:

● Rule out causality between a hypothesis and a symptom (i.e., hypothesis implies the symptom in some cases, but not this case).

● Rule out validity of the hypothesis (i.e., hypothesis implies the symptom, but hypothesis is not true in this case).

● Rate the relative likelihood of the hypotheses causing the symptom (i.e., certain hypotheses are more likely to cause symptoms).

● Rate the relative likelihood of the hypotheses (i.e., certain hypotheses are more likely to be true).

(3) Collect the data identified in step 2 and differentiate the hypotheses.

(4) If step 3 uncovers new symptoms, go to step 1.
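The four steps can be pictured with a short sketch. This is not MOLE's implementation: it is a simplified loop written for illustration, it models only the "rule out validity" role from step 2, and the names `diagnose`, `explains`, `ruled_out_by`, and `observe`, along with the car-repair knowledge, are all assumptions.

```python
def diagnose(symptoms, explains, ruled_out_by, observe):
    """Loosely follow the four steps of the fixed diagnostic method.

    explains:     hypothesis -> set of symptoms it can explain
    ruled_out_by: hypothesis -> datum whose presence rules the hypothesis out
    observe:      callable(datum) -> (is_present, revealed_symptoms)
    Returns a mapping from each symptom to its surviving hypotheses.
    """
    symptoms = set(symptoms)
    rejected = set()
    observed = {}  # cache, so each datum is collected only once
    while True:
        # Step 1: hypotheses that explain each symptom and are still viable.
        candidates = {
            symptom: {h for h, covered in explains.items()
                      if symptom in covered and h not in rejected}
            for symptom in symptoms
        }
        # Step 2: data needed wherever a symptom has competing hypotheses.
        needed = {
            ruled_out_by[h]
            for hypotheses in candidates.values() if len(hypotheses) > 1
            for h in hypotheses if h in ruled_out_by
        }
        # Step 3: collect the data and differentiate the hypotheses.
        newly_rejected, new_symptoms = set(), set()
        for datum in sorted(needed):
            if datum not in observed:
                observed[datum] = observe(datum)
            is_present, revealed = observed[datum]
            new_symptoms |= set(revealed) - symptoms
            if is_present:
                newly_rejected |= {h for h, d in ruled_out_by.items()
                                   if d == datum and h not in rejected}
        rejected |= newly_rejected
        # Step 4: new symptoms (or new rejections) send us back to step 1.
        symptoms |= new_symptoms
        if not new_symptoms and not newly_rejected:
            return candidates


# Variable part: hypothetical car-repair knowledge supplied by an expert.
explains = {
    "dead battery": {"engine will not start"},
    "empty fuel tank": {"engine will not start"},
}
ruled_out_by = {
    "dead battery": "headlights are bright",
    "empty fuel tank": "fuel gauge reads full",
}
facts = {
    "headlights are bright": (True, []),
    "fuel gauge reads full": (False, []),
}
result = diagnose({"engine will not start"}, explains, ruled_out_by,
                  facts.__getitem__)
```

Replacing the three dictionaries with medical knowledge would yield a different diagnostic system without touching `diagnose`, which is the reuse an expert system generator aims for.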

The input specification given to MOLE by a diagnostic expert is expressed in terms of domain-specific symptoms, hypotheses, and data. Each piece of knowledge in the input specification fits into one of the first two steps listed above. From the input specification, MOLE produces an executable expert system to diagnose problems in the expert’s domain.

SEAR provides a more general framework for creating and reusing problem-solving methods [McDermott 1988]. The eventual goal is to have a library of commonly used problem-solving methods that can be selected and integrated to produce a wide range of different expert system generators.

7.3.3 Parser and Compiler Generators

Parser generators and compiler-compilers are probably the best-known examples of application generators. These systems have grown out of the well-developed compiler theory and the ubiquitous need for compilers in computer science.

The primary abstractions used in parser generators are regular expressions for generating lexical analyzers and context-free grammars for generating parsers. For example, Lex [Lesk and Schmidt 1979] and Yacc [Johnson 1979] are a pair of generators for producing parsers. Lex deals with lexical analysis; Yacc addresses parsing.

Parser generators and compiler-compilers reuse the knowledge gained through years of research and design experience in the compiler domain, including the general theories behind lexical analysis, parsing, syntactic and semantic error analysis, and code generation. This knowledge is embedded in the design and implementation of a generic compiler framework that is reused each time a compiler is generated. The framework includes such things as the generic parsing algorithm and the parse tree data structures.

The specification language used as input to Lex is a regular expression language. An input specification describes the legal tokens (identifiers, keywords, constants, etc.) in the language to be compiled. For example, the following regular expression describes all tokens that have a letter for the first character, followed by zero or more letters, digits, or underscores:

    [a-zA-Z][a-zA-Z0-9_]*
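As a quick check of what this pattern accepts, the sketch below tries the expression `[a-zA-Z][a-zA-Z0-9_]*` with Python's `re` module rather than Lex; for a pattern this simple the two notations coincide. The helper name `is_identifier` is an assumption made for the example.

```python
import re

# A letter first, then zero or more letters, digits, or underscores.
IDENTIFIER = re.compile(r"[a-zA-Z][a-zA-Z0-9_]*")


def is_identifier(text):
    """Return True when the whole of `text` is one identifier token."""
    return IDENTIFIER.fullmatch(text) is not None
```

Under this pattern `count_1` and `x` are accepted, while `1st` and `_tmp` are rejected because they do not begin with a letter.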

The specification language used as input to Yacc is a context-free grammar. The input specification to Yacc defines the legal syntactic structure of the language to be compiled. Actions can be associated with productions in the grammar. Actions can do static semantic checking, code generation, and so on. Whenever a construct in the grammar is recognized by the Yacc-generated parser, the associated actions are executed. Actions are expressed in C and are incorporated into the generated parser.

The following is an example Yacc grammar rule for a conditional statement in an imperative language:

    stmt : IF '(' cond ')' stmt
         | IF '(' cond ')' stmt ELSE stmt
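To suggest what a generated parser does with such a rule and its actions, here is a hand-written sketch in Python. It uses recursive descent rather than the table-driven LR parsing that Yacc produces, and it makes several assumptions for brevity: a condition is a single token, any non-reserved token counts as a simple statement, and the "action" for each alternative builds a tuple.

```python
RESERVED = {"(", ")", "ELSE"}


def expect(tokens, pos, wanted):
    """Raise SyntaxError unless tokens[pos] is the wanted token."""
    if pos >= len(tokens) or tokens[pos] != wanted:
        raise SyntaxError(f"expected {wanted!r} at position {pos}")


def parse_stmt(tokens, pos=0):
    """Parse one statement at tokens[pos]; return (tree, next_position)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] == "IF":
        expect(tokens, pos + 1, "(")
        if pos + 2 >= len(tokens):
            raise SyntaxError("missing condition")
        cond = ("cond", tokens[pos + 2])  # assumption: one-token condition
        expect(tokens, pos + 3, ")")
        then_part, pos = parse_stmt(tokens, pos + 4)
        if pos < len(tokens) and tokens[pos] == "ELSE":
            else_part, pos = parse_stmt(tokens, pos + 1)
            # Action for: IF '(' cond ')' stmt ELSE stmt
            return ("if-else", cond, then_part, else_part), pos
        # Action for: IF '(' cond ')' stmt
        return ("if", cond, then_part), pos
    if tokens[pos] in RESERVED:
        raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
    # Assumption: every other token is a simple statement.
    return ("simple", tokens[pos]), pos + 1
```

The sketch attaches an `ELSE` to the nearest unmatched `IF`. Yacc reaches the same reading of this ambiguous rule by default, because it resolves the resulting shift/reduce conflict in favor of shifting.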

7.3.4 Structure-Oriented Editor Generators

Structure-oriented editors are a cross between text editors and compilers. A structure-oriented editor has knowledge of the syntax and semantics of a programming language and interactively assists users to construct programs in that language. Typically, structure-oriented editors store and manipulate programs in the form of abstract syntax trees (which are similar to parse trees) rather than text strings. This allows the editor to detect syntactic and semantic errors incrementally while a user is editing a program.

Structure-oriented editors for different languages have many similarities, including the abstract syntax tree management and the user interface. The primary differences between different structure-oriented editors are the syntax and semantics of languages. Structure editor generators capitalize on this by capturing the commonalities in the fixed part of an abstraction specification while allowing the language syntax and semantics to be expressed in a variable part of the abstraction specification. Gandalf [Staudt et al. 1986], Synthesizer Generator [Reps and Teitelbaum 1984], and Mentor [Donzeau-Gouge et al. 1984] are examples of structure editor generators.

Typically, four things are specified to generate an editor:

(1) Tokens in the language. Regular expressions are typically used to define tokens in the language, analogous to those shown for compiler-compilers.

(2) Abstract syntax of the language. Context-free grammars are typically used to describe the constructs in a language. The grammars specify constraints on the abstract syntax trees. The following is an example of two rules in an abstract syntax specification:

    IF ::= expression statements statements
    statements ::= STATEMENT_BLOCK | WHILE | CASE | IF | :=

(3) Concrete syntax of the language. The concrete syntax of a language describes how to map an abstract syntax tree into a textual representation for the user interface. This is referred to as prettyprinting or unparsing.

(4) Semantics of the language. The techniques for specifying language semantics vary among the different editor generator systems. The Cornell Synthesizer Generator uses attribute grammars to declare constraints among different constructs in the abstract syntax tree [Reps and Teitelbaum 1984]. Gandalf uses daemons, written in an imperative database manipulation language. Daemons are associated with items in the abstract syntax and are triggered by different editing activities [Staudt et al. 1986]. ACL is a semantics specification language that combines the strengths of attribute grammars and daemons [Kaiser 1989].
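The relationship between the abstract syntax (item 2) and the concrete syntax (item 3) can be sketched in a few lines. The tables `ARITY` and `UNPARSE`, the operator names, and the textual templates are all hypothetical; they do not reproduce the specification language of Gandalf, the Synthesizer Generator, or Mentor.

```python
# Abstract syntax: each operator fixes how many children a node must have.
ARITY = {"IF": 3, "WHILE": 2, "ASSIGN": 2}

# Concrete syntax: one unparsing template per operator.
UNPARSE = {
    "IF": "if ({0}) {1} else {2}",
    "WHILE": "while ({0}) {1}",
    "ASSIGN": "{0} := {1};",
}


def well_formed(node):
    """Check a tree against the abstract syntax; leaves are plain strings."""
    if isinstance(node, str):
        return True
    operator, *children = node
    return (operator in ARITY
            and len(children) == ARITY[operator]
            and all(well_formed(child) for child in children))


def unparse(node):
    """Map an abstract syntax tree to its textual representation."""
    if isinstance(node, str):
        return node
    operator, *children = node
    return UNPARSE[operator].format(*(unparse(child) for child in children))


tree = ("IF", "x > 0", ("ASSIGN", "y", "1"), ("ASSIGN", "y", "2"))
text = unparse(tree)
```

Because the editor stores `tree` rather than `text`, a structural error such as an `IF` node with one child can be rejected by `well_formed` as soon as it is attempted, without reparsing any text.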

7.3.5 Others

Cleveland reports on application generators for user interfaces, switching software in telephone systems, test drivers, hardware interprocess communication, CAD design process descriptions, and printed circuit board layouts [Cleveland 1988]. Levy describes application generators for designing telephone company “loop plants” and for allocating and managing construction and engineering work [Levy 1986]. Horowitz notes that spreadsheet generators like Visicalc are successful because they model a common type of application with a high-level, domain-specific abstraction [Horowitz and Munson 1989].

7.4 Integration

In cases in which an application generator can produce a complete executable system from one specification, integration is not an issue. Some application generators, however, produce subsystems that in turn require composition. For example, Lex and Yacc are used in combination to produce parsers.

Software developers using an application generator typically think in terms of the abstractions at the input specification level, not in terms of the software produced as output. Therefore, it may be inconvenient or impractical for the software developer to use a conventional module interconnection language to integrate the output from an application generator. It is better if composition is expressed in terms of the higher-level abstractions visible at the input specification level. For example, with Lex and Yacc the integration is done through the abstract concept of tokens, which are emitted by the Lex lexer and consumed by the Yacc parser.
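A small sketch can show how a token stream serves as the only interface between the two generated parts. It is written by hand in Python instead of being generated by Lex and Yacc, and the token names `NUMBER` and `PLUS`, the toy grammar, and the function names are assumptions for illustration.

```python
import re

# Lexical specification: token name and regular expression.
TOKEN_SPEC = [("NUMBER", r"\d+"), ("PLUS", r"\+"), ("SKIP", r"\s+")]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})"
                             for name, pattern in TOKEN_SPEC))


def lex(text):
    """Producer: emit (kind, text) tokens, dropping whitespace."""
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        pos = match.end()
        if match.lastgroup != "SKIP":
            yield match.lastgroup, match.group()


def parse_sum(tokens):
    """Consumer for the toy grammar  sum : NUMBER | sum PLUS NUMBER.

    The parser sees only token kinds and text, never the raw characters.
    """
    tokens = iter(tokens)
    kind, text = next(tokens, (None, None))
    if kind != "NUMBER":
        raise SyntaxError("expected NUMBER")
    total = int(text)
    for kind, text in tokens:
        if kind != "PLUS":
            raise SyntaxError("expected PLUS")
        kind, text = next(tokens, (None, None))
        if kind != "NUMBER":
            raise SyntaxError("expected NUMBER after PLUS")
        total += int(text)
    return total
```

A call such as `parse_sum(lex("1 + 20 + 300"))` composes the two parts. Either side can be regenerated from a changed specification as long as the agreed token kinds stay the same.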

7.5 Software Life Cycle with Application Generators

The software development paradigm with application generators is significantly different from the conventional software development life cycle. This section examines the software life cycle for application generators and its impact on cognitive distance.

7.5.1 Using Application Generators

The leverage offered by application generators can be shown by comparing conventional software development to software development with application generators [Horowitz et al. 1985]. A simple version of the conventional software development life cycle might consist of the following steps:

● System (or requirements) specification. The desired system functionality is expressed in terms of concepts (objects and operations) in the application domain. This phase is focused on what the system should do and avoids the issues of how the functionality is implemented.

● Architectural design. The high-level system organization is expressed in terms of the implementation technology. This includes subsystem decomposition into modules, module interconnection structure (such as control and data flow), and correspondence to the system specification. This phase identifies what the modules should do, not how they are implemented.

● Detailed design. Details are specified for the implementation of modules and their interconnection based on the architectural design. This phase begins to address how modules are implemented.

● Coding. Source code is written according to the detailed design. This phase expands in full detail how each module operates.

● Testing. The system is tested to verify that the “right system is built” and that the “system is built right” [Zave 1984]. That is, testing serves two purposes: (1) validate the correspondence between the system specifications and the intent of the design, and (2) verify that the code satisfies the system specifications.

Modifications made in any phase are propagated down through the other phases. This is true both of changes made during the initial system development and of changes made during the evolution (i.e., maintenance) of the system.

With application generators, much of the conventional life cycle is automated. Software developers construct or modify only the system specifications. The architectural design and detailed design are embodied as reusable artifacts in the application generator. Coding is automated. An application generator takes the system specifications as input and produces executable code as output. Software developers using an application generator do not have to test that the generated code satisfies the system specification, because (theoretically) the implementors of the application generator verify that the generator always produces code that is consistent with the input specifications. The software developers need only validate the match between the input specifications and the intentions of their design. With application generators, evolution of a system is carried out by modifying the system specifications. The application generator automatically propagates new specifications through the conventional life cycle phases in exactly the same way the original system is generated.

Thus, the powerful leverage afforded by application generators comes from specifying systems directly in terms of domain concepts, thereby avoiding the low-level details of implementing source code. The cognitive distance is very small between the informal requirements of a system and the specification of that system in terms of application-specific abstractions. Compared to conventional software development, application generators reduce the cognitive distance for the software developer by moving much of the development effort into the hidden and automated parts of the generator.
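The life cycle described in this section can be mimicked with a deliberately tiny generator. The sketch below is an assumption-laden toy, not a model of any cited system: the specification format (field name mapped to a type name and a required flag), the function names, and the choice of a record validator as the generated program are all invented for the example, and field names are assumed to be plain identifiers.

```python
def generate_validator(spec):
    """Generate Python source for validate(record) from a specification.

    `spec` maps a field name to (type_name, required). Type names must be
    Python built-ins such as "int" or "str".
    """
    lines = ["def validate(record):", "    errors = []"]
    for field, (type_name, required) in sorted(spec.items()):
        if required:
            lines.append(f"    if {field!r} not in record:")
            lines.append(f"        errors.append('missing {field}')")
        lines.append(f"    if {field!r} in record and "
                     f"not isinstance(record[{field!r}], {type_name}):")
        lines.append(f"        errors.append('{field} must be {type_name}')")
    lines.append("    return errors")
    return "\n".join(lines)


def build(spec):
    """Run the generator and load the generated code (the coding phase)."""
    namespace = {}
    exec(generate_validator(spec), namespace)
    return namespace["validate"]


# Initial development: the developer writes only the specification.
validate_v1 = build({"name": ("str", True)})

# Evolution: the specification is edited and the system is regenerated.
validate_v2 = build({"name": ("str", True), "age": ("int", False)})
```

The design and coding decisions live in `generate_validator` and are reused on every run, so maintenance amounts to the one-line change between the two specifications.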

7.5.2 Building Application Generators

Even in situations in which an application generator is not available, it may be advantageous to build an application generator to produce one software system [Cleveland 1988; Levy 1986]. It is not intuitively obvious how building an application generator to generate one software system can be economically advantageous. As described earlier, however, constructing an application generator is appropriate when many similar software systems are written, when one software system is modified or rewritten many times during its lifetime, or when many prototypes of a system are necessary to converge on a usable product. The last two cases suggest situations in which application generators could be created as part of the development of a single software product.

The primary economic justifications for the “up-front” expense of building an application generator are (1) the relative ease of using domain-specific specifications and (2) the ability to do rapid iterative prototyping. Taking these two factors into account, the cost of building an application generator is rapidly amortized during its life.

Levy [1986] presents economic models for determining the optimal distribution of effort in building a software system with an application generator. He shows that it is often most effective to spend most of the effort up-front constructing a powerful, highly productive application generator, thus requiring less time prototyping with the generator.

Levy [1986] refers to the process of building an application generator as metaprogramming. Metaprogramming requires expertise in both the application domain and in software system construction. Cleveland [1988] describes the process as follows:

● Recognizing an appropriate domain. Recognizing when the domain of interest is suitable for application generator technology. Domains that are amenable to generator technology will have implementations with recognizable patterns at the source code level, have natural control and data abstractions, contain automatable transformations, and have dispersed information that can be consolidated. Domains with a well-understood theory, such as compilers, are easiest for application generator construction.

● Defining domain boundaries. Deciding what aspects of the domain should be included in the generator (i.e., setting the domain coverage). Increasing the domain coverage allows the generator to handle more problems but typically makes the generator less efficient and harder to use.

● Defining underlying abstraction. Identifying the abstraction presented to the user of the application generator. Common abstractions include sets, directed graphs, trees, formal logic systems, and computational models such as finite-state machines and spreadsheets. The 80/20 rule, where not everything in a domain falls into a clean computational model, sometimes applies [Levy 1986]. In this case, escapes must be provided so that the 20% can be programmed directly into the implementation technology.

● Defining variant and invariant parts. Deciding what aspects of the abstraction are under control of the user and which parts are fixed.

● Defining specification input. Defining the way in which the user specifies each instance of a generated program. As mentioned earlier, options include textual specification languages (application-oriented languages, fourth-generation languages, etc.), templates, graphical diagrams, interactive menu-driven dialog, and structure-oriented editing.

● Defining products. Implementing the generator (“the easy part” [Cleveland 1988]). Creating the set of programs that will generate the final code from a concise input specification.


Stage [Cleveland 1988], SSAGS [Payton et al. 1982], and Draco [Neighbors 1989] are examples of systems that assist in building application generators. They rely on generator technology to build generators and are therefore application generator generators. Details of this approach are presented in Section 10, which deals with the reuse of system-level architectures.

7.6 Appraisal of Reuse Techniques in Application Generators

Application generators are practical and attractive when high-level abstractions from an application domain can be automatically mapped into executable software systems. This capitalizes on the users’ understanding of semantics from the application domain rather than requiring users to derive domain semantics repeatedly from first principles in conventional software development technology. By eliminating many of the design and implementation steps found in the conventional software life cycle, application generators significantly reduce cognitive distance.

Standards can be captured in the fixed part of the abstraction for an application generator. For example, users may prefer a standard user interface for all generated systems rather than having to learn a new interface for each system built. Also, in application domains where there are established standards, it is easier to implement the standard in a single application generator than to have to interpret and implement the standard repeatedly [Cleveland 1988].

A difficult challenge for implementors of application generators is defining the optimal domain coverage and the optimal distribution of domain concepts into the fixed and variable parts of the abstraction. Implementors must attempt to predict in advance the functionality and variability that software developers using the application generator will want.

A limited number of application generators are available, many of which have narrow domain coverage. As a result, application generators do not exist for most software development problems.



Table 5. Reuse in Application Generators

Abstraction: Abstractions come directly from the application domain. These high-level abstractions are mapped directly into executable source code by the generator.

Selection: Application generator libraries have not received much attention in the literature. The parallel between software schemas and application generators suggests, however, that library techniques could be used to select among a collection of application generators.

Specialization: Application generators are specialized by writing an input specification for the generator. Due to the diversity in application domain abstractions, the techniques used for specialization are also widely varied. Examples include grammars, regular expressions, finite-state machines, graphical languages, templates, interactive dialog, problem-solving methods, and constraints.

Integration: Application generators do not require integration techniques when a single executable system is generated. In cases in which a collection of generators produce a collection of subsystems, composition is best done in terms of domain abstractions.

Pros: Since high-level abstractions from an application domain are automatically mapped into executable software systems, most of the conventional software development life cycle is automated. This significantly reduces cognitive distance.

Cons: Because of the limited availability of application generators, many of which have narrow domain coverage, it is often difficult or impossible to find an application generator for a particular software development problem. It is difficult to build an application generator with appropriate functionality and performance for a broad range of applications.

Application generators offer both advantages and disadvantages with respect to the efficient execution of generated systems. One advantage is that complex optimizations can be designed and built once in the generator and incorporated into all generated systems. For major efficiency problems in an application domain, a significant optimization effort in the application generator can be amortized over all generated programs. Also, application generators can offer a range of functionality and performance options, similar to schemas with multiple implementations. On the negative side, increasing the generality of an application generator often requires compromises in time or space efficiency. Table 5 summarizes the application generator approach to software reuse.

8. VERY HIGH-LEVEL LANGUAGES

As the name suggests, very high-level languages (VHLLs) are an attempt at improving on the successes of conventional high-level languages (HLLs). High-level languages have constructs such as iteration and relational expressions for their abstraction specifications, and these specifications are compiled into assembly language realizations. Similarly, very high-level languages allow software developers to create executable systems using constructs that are considered high-level specifications relative to HLLs. Because of this, VHLLs are also known as executable specification languages.

Developing software with VHLLs is very much like developing software with HLLs. Both VHLLs and HLLs provide a syntax and a semantics for expressing general-purpose computation. Both approaches use compilers that accept source code programs as input, make tests for syntactic and static semantic correctness, and create executable programs if a certain threshold of correctness is met. HLL source code may be an order of magnitude more succinct than the corresponding assembly language implementation, and likewise VHLL source code may be an order of magnitude more succinct than corresponding HLL implementations.

Very high-level languages resemble application generators in that high-level abstraction specifications are automatically translated into executable systems.



Application generators, however, use application-specific abstractions, whereas VHLLs use application-independent, general-purpose abstractions. With application generators, generality is sacrificed in order to capture much higher level abstractions in the specification language. Therefore, VHLLs have the advantage of being generally applicable to software development, whereas application generators have the advantage of being more succinct and powerful in the relatively small number of cases in which they are applicable.

This distinction between VHLLs and application generators exemplifies the tradeoff between generality and leverage in software reuse technologies [Biggerstaff and Richter 1989]. Typically, the more general a reuse technology is, the more effort is required to implement systems with it. For example, the following technologies are listed by increasing leverage and decreasing generality: assembly language, HLLs, software schemas, VHLLs, and application generators. The goal of VHLL research is to maximize the leverage offered by higher levels of specification without sacrificing computational generality.

The primary concern in VHLLs is not efficiency in program execution but rather efficiency in implementing and modifying programs. The SETL language designers use the slogan “slow is beautiful” to emphasize this point [Kruchten et al. 1984]. Similarly, in the early days of their development, HLLs were often considered too slow to be of practical value [Naur and Randell 1968]. But the increased software development productivity offered by HLLs, in addition to compiler optimizations and faster hardware, quickly made HLLs the norm for software development. It is possible that VHLLs, which are considered too inefficient by today’s standards, will follow a similar course of evolution.

Looking ahead to Section 9, transformational systems use VHLLs as the basis for rapid prototyping and validation of software systems; a series of human-guided transformations is then applied to produce an efficiently executable system. This approach can be seen as an interim measure until it is possible to automate the transformation fully (i.e., compilation) into efficiently executable code.

8.1 Abstraction in Very High-Level Languages

VHLLs diverge from a trend we have seen thus far in the survey. The reuse approaches described in earlier sections all used abstractions that lie in the conventional software development life cycle. Higher level abstractions allow lower level artifacts to be reused. For example, HLL constructs are abstractions for assembly language code fragments; code scavenging and reusable components use abstractions for HLL system designs and code fragments; schemas are abstractions for algorithms and data structures; and application generators use abstractions from the application domain being modeled. Contrary to this trend, VHLLs typically have mathematical abstractions that are not widely used in the conventional software development life cycle. Implementors of a VHLL try to find a mathematical model that is both executable and effective for doing software development.

The reusable artifacts in VHLLs are analogous to those in HLLs. In HLLs, the reusable artifacts are embodied in the compiler mappings from language constructs to assembly language patterns. Similarly, VHLLs capture reusable HLL implementation patterns in their compilers [Cheng et al. 1984]. The VHLL constructs serve as abstraction specifications for reusable artifacts, whereas the output from the compiler corresponds to the abstraction realization level.

VHLL constructs have fixed and variable parts, similar to HLL constructs. For example, in the functional expression F(X), both F and X are variable parts of the function abstraction specification. Different functions can be substituted for F, and different expressions can be substituted for X. The fixed part of the function abstraction is given by the language semantics for functional evaluation: Apply function F to expression X. Although the programmer must understand the semantics, they are fixed and cannot be modified.

As with HLL compilers and application generators, VHLLs effectively use abstraction for software reuse. Software developers use only the fixed and variable parts of the abstraction specification. The mapping from specification to realization is completely automated by the compiler, which isolates software developers from the hidden and realization parts of the reuse abstraction. The following three examples demonstrate the type of abstractions that have been used in VHLLs.

8.1.1 SETL

SETL is a VHLL based on set theory [Doberkat et al. 1983; Dubinsky et al. 1989; Kruchten et al. 1984]. Therefore, sets are the most important abstraction, defining the character and semantics of the language. With a small number of primitives, all mathematical operations can be succinctly expressed in set-theoretic terms. The data types in the language are either atomic types (such as integers, reals, and Boolean values) or composite types from set theory. The two composite data types are sets and tuples:

Sets. An unordered finite collection of distinct SETL objects (atoms, tuples, and sets).

Tuples. An ordered finite sequence of SETL objects.

Sets and tuples can be nested heterogeneously and to any depth.

Sets and tuples are formed either by enumeration of their members, as in

{1,2,3},

or with set-formers, which have the general form

{⟨exp⟩ : x1, ..., xn | ⟨cond⟩}.

The ⟨exp⟩ designates the values in the set, as x1 through xn vary over all values satisfying the condition ⟨cond⟩. For example,

{2*x : x | 0 < x < 50}

is the set of even numbers between 0 and 100. The set-former is an example of the power of reuse in SETL. Through reusable code stubs, the SETL compiler or interpreter can generate arbitrarily complex data objects from a succinct set-former specification.

The usual set operations are defined in SETL: union, intersection, difference, and power set. Higher level operations such as set iterators are also provided in the language. For example, the following denotes the subset of S whose members satisfy the predicate C(x):

{x in S | C(x)}
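For readers more familiar with a mainstream HLL, the set-former and set-iterator notations have close analogues in Python set comprehensions. The following is only an illustrative sketch in Python, not SETL; it assumes that x ranges over the integers, and the names `evens`, `S`, and `C` are hypothetical choices made for the example.

```python
# Analogue of the SETL set-former {2*x : x | 0 < x < 50},
# assuming x ranges over the integers.
evens = {2 * x for x in range(1, 50)}

# Analogue of the set iterator {x in S | C(x)}.
# S and C are hypothetical example values, not taken from the survey.
S = {-3, -1, 0, 2, 5, 8}


def C(x):
    """Example predicate: true for strictly positive members."""
    return x > 0


subset = {x for x in S if C(x)}
```

In both cases the developer states only the expression and the condition; how the resulting collection is built and stored is left to the language implementation.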

Maps in SETL correspond to functions in set theory. A map is a set of length-2 tuples, where the first element in each tuple is in the domain of the function and the second element is in the range. There are several high-level retrieval operations that can be applied to maps. For example, if F is a map, then F(X) yields the second component of the unique pair in F whose first component is X. If F contains more than one tuple whose first component is X (this is legal), then F(X) is undefined, but F{X} yields the set of all second components whose first component is X. The notation F[S], where S is a set, returns the set of all elements in the combined ranges of every element in S.
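The three retrieval operations can be mimicked over a Python set of pairs. This is a minimal sketch rather than SETL itself: the function names `apply_single`, `apply_multi`, and `image` are hypothetical, and Python's `None` stands in (by assumption) for SETL's undefined value.

```python
def apply_single(F, x):
    """F(x): the unique second component paired with x, else None."""
    images = {b for (a, b) in F if a == x}
    # Undefined (None here) when x has no image or more than one image.
    return next(iter(images)) if len(images) == 1 else None


def apply_multi(F, x):
    """F{x}: the set of all second components whose first component is x."""
    return {b for (a, b) in F if a == x}


def image(F, S):
    """F[S]: the combined images of every element of the set S."""
    return {b for (a, b) in F if a in S}


# A multi-valued map: 2 is paired with both "b" and "c".
F = {(1, "a"), (2, "b"), (2, "c")}
```

With this map, `apply_single(F, 2)` is undefined because 2 has two images, while `apply_multi(F, 2)` collects both of them.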

Many common abstract data types can be implemented trivially in SETL. For example, SETL operations for insertion and deletion from the ends of tuples allow direct implementations of stacks and queues. The concatenation operation allows tuples to be used as lists.

8.1.2 PAISLey

In PAISLey, abstractions from set-based functional programming and communicating asynchronous processes are combined in a single VHLL [Zave and Schell 1986]. The set-based aspects of the language are similar to SETL and will not be discussed here.

With PAISLey, software developers can define a collection of asynchronous parallel processes. The processes communicate via exchange functions. An exchange function consists of a pair of function calls, one in each of two different processes. During execution, when the two processes both reach the evaluation of their half of the exchange function, the argument of one function is passed as the return value to the other and vice versa (i.e., they exchange their arguments). This simple extension to the functional semantics allows parallelism to be directly modeled in the language.
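The rendezvous behavior of an exchange function can be approximated with two Python threads. The sketch below is an assumption-laden illustration, not PAISLey: the class `Exchange`, its method `swap`, and the helper `run_pair` are hypothetical names, and the sketch supports exactly two participants identified as side 0 and side 1.

```python
import threading


class Exchange:
    """Two-party rendezvous: each caller passes a value and gets the other's."""

    def __init__(self):
        self._barrier = threading.Barrier(2)
        self._slots = [None, None]

    def swap(self, side, value):
        self._slots[side] = value
        self._barrier.wait()           # both arguments have been written
        other = self._slots[1 - side]
        self._barrier.wait()           # both sides have read; slots reusable
        return other


def run_pair(a, b):
    """Run two 'processes' that exchange a and b; return what each received."""
    ex = Exchange()
    results = {}

    def process(side, value):
        results[side] = ex.swap(side, value)

    threads = [
        threading.Thread(target=process, args=(0, a)),
        threading.Thread(target=process, args=(1, b)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results[0], results[1]
```

Calling `run_pair("ping", 42)` returns `(42, "ping")`: each side blocks until its partner arrives, and each receives the argument supplied by the other.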

Real-time constraints are another high-level abstraction in PAISLey. Timing constraints allow the static and dynamic execution of a system to be evaluated. Software developers can attach timing constraints to functions, expressed as lower bounds, upper bounds, distributions, or random values. Static inconsistencies and dynamic failures are detected and reported.

A powerful feature of the PAISLey execution environment allows incomplete programs to be executed. If an undeclared function is called during execution, the user is prompted for a value to use in place of the evaluation. Declared but undefined functions can be handled in several ways:

● In default mode, a distinguished value will always be returned, depending on the type of the return value declared. For example, 0 is returned by default for integers.

● In random mode, a pseudorandom value is returned.

● In interactive mode, the user is prompted for the return value.

8.1.3 MODEL

MODEL is a declarative constraint language, whose abstract model of computation is based on the simultaneous solution to a collection of constraint equations [Cheng et al. 1984; Prywes and Lock 1989]. Data flow and control flow are automatically determined by the compiler, not explicitly specified as in imperative HLLs.

A MODEL program contains data declarations and consistency equations. The MODEL compiler does data flow analysis to determine if the consistency equations can be uniquely satisfied, and to find the order of computations that will assign consistent values to all of the data objects.

For example, the first of the following equations specifies that the final value of an account balance must be equal to the initial value of balance minus the debit total plus the credit total for that account. The equation for credits specifies equality with the old value for A, and debits is equal to the old value of B minus 10% of credits:

new.bal = old.bal - debits + credits;
credits = old.A
debits = old.B - (.1 * credits)

The compiler detects that values for the debit total and the credit total must be computed before the new balance can be calculated and that credits must be computed before debits. The appropriate code is generated.

The MODEL compiler checks for completeness and consistency. Completeness requires that all variables used are also defined. Consistency requires that no circularities exist in the constraints and that there are no type mismatches.
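The ordering and checking behavior described above can be illustrated with a dependency graph and a topological sort. The sketch below is written in Python (3.9 or later, for the standard `graphlib` module) and is not the MODEL compiler's actual algorithm; the names `EQUATIONS`, `schedule`, and `solve`, and the underscore spellings of the variables, are assumptions made for the example. Type checking is omitted.

```python
from graphlib import CycleError, TopologicalSorter

# Each equation: target -> (variables it reads, function that computes it).
# The equations are deliberately listed out of dependency order.
EQUATIONS = {
    "new_bal": ({"old_bal", "debits", "credits"},
                lambda v: v["old_bal"] - v["debits"] + v["credits"]),
    "credits": ({"old_A"}, lambda v: v["old_A"]),
    "debits": ({"old_B", "credits"},
               lambda v: v["old_B"] - 0.1 * v["credits"]),
}


def schedule(equations, inputs):
    """Return an evaluation order after completeness and circularity checks."""
    defined = set(equations) | set(inputs)
    for target, (reads, _) in equations.items():
        missing = reads - defined
        if missing:  # completeness: every variable used must be defined
            raise ValueError(f"{target} uses undefined {sorted(missing)}")
    # Keep only dependencies on other equations; inputs are already known.
    graph = {t: reads & set(equations) for t, (reads, _) in equations.items()}
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as err:  # consistency: no circular constraints
        raise ValueError(f"circular constraints: {err.args[1]}") from err


def solve(equations, inputs):
    """Evaluate every equation in the order found by schedule()."""
    values = dict(inputs)
    for target in schedule(equations, inputs):
        values[target] = equations[target][1](values)
    return values
```

Given inputs for `old_bal`, `old_A`, and `old_B`, `schedule` orders the work as credits, then debits, then the new balance, regardless of the order in which the equations were written.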

8.2 Selection

With VHLLs, selection can take place on two different levels:

● A software developer selects a VHLL that is appropriate for a particular application. This is analogous to selecting an application generator.

● A software developer selects among the VHLL constructs during application programming. This is analogous to selecting constructs while programming with a high-level language.



The former is guided by the latter. That is, a software developer selecting a VHLL for a particular application makes the decision by comparing how easy it is to program the application with the different VHLLs. The language will have a very strong influence on how a problem is solved. For example, SETL and PAISLey encourage software developers to think about problems in terms of sets of objects and functions on those sets, whereas MODEL encourages thinking in terms of data objects with static interobject constraints. The best language for a particular application provides the smallest cognitive distance from initial concept to executable implementation.

8.3 Specialization

VHLLs have parameterized language constructs, or templates, that are specialized by recursively substituting other language constructs. For example, the functional template F(⟨x⟩) can be specialized by substituting another functional expression for ⟨x⟩:

F(G(2))

A less conventional type of specialization is found in a class of VHLLs known as wide-spectrum languages. Wide-spectrum languages contain both VHLL constructs and HLL constructs. The low-level constructs are present to accommodate efficiency concerns. Since low-level specifications are more difficult to read and modify, software developers should program at this level only as a refinement, or specialization, of stable high-level designs [Kruchten et al. 1984; Liu and Paige 1979].

SETL is an example of a wide-spectrum language. The following high-level SETL code can be specialized into a more efficient low-level form. This example creates the subset of positive integers and the subset of negative integers from a set S [Liu and Paige 1979]:

POS := {X element S | X > 0};
NEG := {X element S | X < 0};

Executing this code, however, requires two scans over set S. The following low-level SETL code is more difficult to read, but it requires only a single scan over set S during execution:

POS := nl; NEG := nl;
(forall X element S)
    if X > 0 then
        POS with X;
    elseif X < 0 then
        NEG with X;
end forall X;
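The same tradeoff between a succinct specification and a single-scan refinement can be seen in a conventional language. The following Python sketch is an analogue of the two SETL fragments, not a translation endorsed by the SETL designers; the function names `split_two_scans` and `split_one_scan` are hypothetical.

```python
def split_two_scans(S):
    """High-level form: two comprehensions, hence two passes over S."""
    pos = {x for x in S if x > 0}   # first scan
    neg = {x for x in S if x < 0}   # second scan
    return pos, neg


def split_one_scan(S):
    """Low-level form: one explicit loop, hence a single pass over S."""
    pos, neg = set(), set()
    for x in S:
        if x > 0:
            pos.add(x)
        elif x < 0:
            neg.add(x)
    return pos, neg
```

Both functions return the same pair of sets; the second is a specialization of the first that gives up some readability in exchange for one fewer traversal.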

SETL also allows software developers to specialize the way language constructs are implemented at run time (for the sake of efficiency). For example, the default run-time implementation of SETL sets is a hash table. For sets that are only subject to union and intersection operations, however, a bit string implementation is more efficient. Ideally, the compiler or interpreter would automatically make the optimal implementation choice, but in general this is beyond the scope of current technology. These types of specializations must be guided by software developers. Human-guided transformations will be discussed in more detail in Section 9.
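To make the representation choice concrete, the sketch below contrasts Python's built-in hash-based `set` with a bit string representation in which union and intersection become single bitwise operations. It is only an illustration under stated assumptions: the class `BitSet` is hypothetical, it has nothing to do with SETL's actual run-time library, and it assumes that set members are small non-negative integers.

```python
class BitSet:
    """A set of small non-negative integers stored as the bits of one int."""

    def __init__(self, members=(), bits=0):
        self.bits = bits
        for m in members:
            self.bits |= 1 << m        # bit m is set when m is a member

    def union(self, other):
        return BitSet(bits=self.bits | other.bits)

    def intersection(self, other):
        return BitSet(bits=self.bits & other.bits)

    def to_set(self):
        """Convert back to an ordinary (hash-based) Python set."""
        return {i for i in range(self.bits.bit_length())
                if (self.bits >> i) & 1}


a = BitSet({1, 3, 5})
b = BitSet({3, 4, 5})
```

The abstract behavior matches that of the hash-based representation, so a developer (or, ideally, a compiler) can switch representations without changing the meaning of the program.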

8.4 Integration

PAISLey and other side-effect-free functional languages simplify function integration because of encapsulated computation: Computation within a function is only influenced by its input parameters and return values from the functions it calls, and the only influence a function has is through its return value. Control flow and data flow are unified, which makes it easier for a software developer to reason about function integration.

MODEL and other declarative languages simplify integration even more by completely eliminating the issue of control and data flow at the programming level. In MODEL, the order in which the constraint equations are written is irrelevant since the compiler performs data flow analysis to determine a correct (and close to optimal) sequencing of operations [Tseng et al. 1986].



Just as source code components written with HLLs can be reused, software developers can also reuse VHLL components. Declarative languages like MODEL simplify the integration of reusable VHLL components. Two or more components can be merged without concern for the run-time ordering of computations since this is done automatically by the compiler. To merge components, a software developer only needs to specify the relationships (constraints) between data objects in the different components [Kaiser and Garlan 1987; Tseng et al. 1986].

8.5 Appraisal of Reuse Techniques in Very High-Level Languages

VHLLs use high-level mathematical abstractions suitable for general-purpose software development. The goal of VHLL implementors is to find abstractions that are more natural and expressive than the abstractions in HLLs. As a result, VHLL programs can be an order of magnitude more succinct than corresponding HLL programs. VHLLs are not, however, as powerful as application generators since application generators use domain-specific abstractions, which can be at a much higher level of abstraction.

Like HLLs and application generators, VHLLs have compilers to map abstraction specifications directly into executable realizations. This isolates the software developer from the hidden and realization parts of the abstraction, so the software developer is largely unaware of the reuse that takes place. The lowest level of detail that software developers work with is the set of high-level abstractions in the VHLL, so the cognitive distance between the initial concept of a system and the executable implementation is relatively small.

The first validated Ada compiler was built using the VHLL SETL [Kruchten et al. 1984]. The experiences of that project demonstrate the strengths and limitations of current-day VHLL technology. Although it was possible to prototype the compiler rapidly, it was far too inefficient for production use. Also, the SETL Ada compiler was built by the SETL language design group, so the project does not answer the question of whether or not the set-theoretic foundation of SETL is a suitable abstraction for the general population of software developers building a wide range of applications.

Zave notes that our notions of specification and implementation are floating rather than fixed [Zave and Schell 1986]. Although these two levels of program abstraction will always exist (with the specification level always at a higher level of abstraction than the implementation level), both levels have been gradually and simultaneously moving higher. For example, the implementation level of today resembles the specification level of 25 years ago. The goal of VHLL research is to continue this trend by making executable specification languages a viable alternative for implementing software systems. Table 6 summarizes the very high-level language approach to software reuse.

9. TRANSFORMATIONAL SYSTEMS

With transformational systems, software is developed in two phases:

(1) Software developers describe the semantic behavior of a software system using a high-level specification language.

(2) Software developers then apply transformations to the high-level specifications. The transformations are meant to enhance the efficiency of execution without changing the semantic behavior.

The two phases make a clear distinction between specifying what a software system does and the implementation issues of how it will be done [Zave 1984].

The first phase is equivalent to using a VHLL. Software developers create an executable system in a language that has a relatively small cognitive distance from the developer’s informal requirements for the system [Balzer 1989]. The second phase, however, is not present in the pure VHLL paradigm. The goal in this phase is to produce an executable system that satisfies the high-level specification and that also exhibits performance comparable to an implementation in a conventional HLL. The transformation phase can be thought of as an interactive, human-guided compilation. Human intervention is necessary because issues such as automatic algorithm and data structure selection are beyond current compiler technology.

Table 6. Reuse in Very High-Level Languages

Abstraction. The abstractions used in VHLLs can be viewed as specifications when compared to HLLs. For this reason VHLLs are sometimes referred to as executable specification languages. VHLLs typically have mathematical abstractions, such as set theory or constraint equations.

Selection. Selection can take place at two levels: (1) selecting a VHLL that is most appropriate for a particular application and (2) selecting the language constructs that best represent the application.

Specialization. VHLLs have parameterized language constructs that are specialized by recursively substituting other language constructs. Another form of specialization is found in wide-spectrum VHLLs, where the run-time representation of language constructs can be specialized to provide better performance.

Integration. Function integration in languages such as PAISLey is simplified by side-effect-free encapsulated computation. Control flow and data flow are unified and localized to function composition, function arguments, and return values. Declarative languages such as MODEL simplify integration further with order-independent specifications and compiler-generated control flow and data flow.

Pros. High-level mathematical abstractions are automatically mapped into executable systems. VHLLs are suitable for general-purpose software development. The VHLL abstractions can be more natural and expressive than HLL abstractions, and VHLL programs can be an order of magnitude more succinct than HLL programs. Thus the cognitive distance for application development can be significantly less using VHLLs rather than HLLs.

Cons. The run-time performance of systems written with VHLLs is typically poor. In wide-spectrum languages, performance can be improved by using low-level constructs, but this increases cognitive distance. It is also not clear whether high-level mathematical abstractions are suitable for software development by the average programmer.

A transformation is a mapping from programs to programs [Partsch and Steinbruggen 1983]. Application of a transformation $T$ to a program $P$ can be expressed as

$$P \times T \rightarrow P',$$

where $P'$ is the result of the transformation and is semantically equivalent to but more efficient than $P$ [Cheatham 1989]. A transformational development history is the sequence of transformations applied during the creation of a software system. A development history can be expressed as:

$$P_0 \times T_0 \rightarrow P_1$$
$$P_1 \times T_1 \rightarrow P_2$$
$$\vdots$$
$$P_N \times T_N \rightarrow P_{N+1}$$

or

$$((\cdots(P_0 \times T_0) \times T_1) \cdots \times T_N) \rightarrow P_{N+1},$$

where each $P_{i+1}$ is a more efficient but semantically equivalent implementation of $P_i$.

9.1 Abstraction in Transformational Systems

Transformational system programming languages often have higher level abstractions than VHLLs. These higher level abstractions often execute too inefficiently to be used directly in VHLLs. Transformational systems can, however, afford to use them because software developers apply transformations that enhance their efficiency.

Since the first phase of programming with transformational systems is similar to programming with VHLLs, this section concentrates primarily on the second phase: the transformations. The following three sections describe three different kinds of reusable artifacts in the transformation phase: prototypes, development histories, and transformations.

9.1.1 Prototypes

Prototyping and its associated benefits are not unique to transformational systems. Transformational systems are, however, unique in the way prototypes are reused. In conventional software development, a prototype serves to validate the requirements of a system and to identify the critical implementation issues, but typically the prototype is a throwaway system. In this case software developers reuse the informal experience gained in designing and implementing the prototype. With transformational systems, however, rather than being thrown away, the prototype can be reused for two other purposes: (1) as a formal specification of the system and (2) as the basis for applying a sequence of transformations that transform the formal specification directly into an efficiently executable implementation [Cheatham 1989].

9.1.2 Development Histories

One of the key benefits of the transformational approach is that software developers perform maintenance and evolution on the specification of the system, not on the implementation. This is advantageous since information in a specification is often localized and independent, whereas information at the implementation level is often widely dispersed and tightly interdependent for the sake of efficiency. It is much easier to understand and modify the information before it has been dispersed throughout an implementation [Balzer 1989].

When a software developer makes modifications at the specification level, transformations have to be applied again to produce a new efficient implementation. When modifications to the specification are small, significant portions of the previous development history can often be reused. PADDLE is a transformational system that stores the development history as a sequence of applied transformations [Balzer 1989; Fickas 1985]. All or part of a previous development history can be replayed after a specification is modified.

A problem with PADDLE is that it does not encode the rationale behind the sequence of transformations. A PADDLE development history is essentially a program that defines a sequence of transformations, but the design decisions that went into writing the program are not saved. As a result, PADDLE development histories are difficult to understand and modify.

Glitter is a transformational system that addresses this problem. The design decisions that a software developer makes while applying a sequence of transformations are explicitly encoded in rules [Fickas 1985]. These rules provide a level of abstraction for the development history. The rules are the abstraction specification, and the sequence of transformations corresponds to the abstraction realization. Rules make development histories easier to read, understand, reuse, and modify. Glitter can automatically map design decisions into an appropriate sequence of transformations. Thus, a software developer can specify what needs to be accomplished by the transformations, and Glitter will figure out how to do it. Therefore, Glitter rules isolate the software developer from the hidden and realization parts of the design history abstraction. Glitter rules are described in more detail later.

9.1.3 Transformations

A transformation is a mapping from syntactic patterns of code into functionally equivalent, more efficient patterns of code. A transformation has a match pattern, applicability conditions, and a substitution pattern. The match pattern is a template of code that defines what to search for in the initial program, and the substitution pattern defines a pattern of code that replaces the constructs in the transformed program. The applicability conditions define semantic constraints on when the transformation should be applied. For example, the following template provides an implementation for inserting an element in a queue [Cheatham 1989]; the first line is the match pattern, and the remainder of the template is the substitution pattern (there are no applicability conditions in this example):

Insert $element Into $queue ==>
begin
    $queue.count := $queue.count + 1;
    $queue.list[$queue.count] := $element;
end

$element and $queue are pattern variables that represent arbitrary syntactic code fragments. At transformation time, these code fragments from the initial program are inserted verbatim in the corresponding pattern variables in the substitution pattern.

Many transformations are reusable. For example, the following are common identity transformations on predicate specifications:

NOT NOT $predicate ==> $predicate

$predicate AND $predicate ==> $predicate
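The mechanics of match patterns, pattern variables, and substitution patterns can be sketched over a toy program representation. The Python code below is an illustrative assumption, not the notation or engine of any system cited here: programs are nested tuples, pattern variables are strings beginning with `$`, the names `match`, `substitute`, `apply_rules`, and `RULES` are hypothetical, and applicability conditions are omitted.

```python
def match(pattern, term, bindings):
    """Return extended bindings if term matches pattern, else None."""
    if isinstance(pattern, str) and pattern.startswith("$"):
        if pattern in bindings:        # repeated variable must match equally
            return bindings if bindings[pattern] == term else None
        return {**bindings, pattern: term}
    if (isinstance(pattern, tuple) and isinstance(term, tuple)
            and len(pattern) == len(term)):
        for p, t in zip(pattern, term):
            bindings = match(p, t, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == term else None


def substitute(template, bindings):
    """Insert the bound code fragments verbatim into the template."""
    if isinstance(template, str) and template.startswith("$"):
        return bindings[template]
    if isinstance(template, tuple):
        return tuple(substitute(t, bindings) for t in template)
    return template


def apply_rules(term, rules):
    """Rewrite bottom-up, repeating at each node until no rule matches."""
    if isinstance(term, tuple):
        term = tuple(apply_rules(t, rules) for t in term)
    for match_pattern, substitution_pattern in rules:
        bindings = match(match_pattern, term, {})
        if bindings is not None:
            return apply_rules(substitute(substitution_pattern, bindings), rules)
    return term


# The two identity transformations on predicates, as (match, substitution).
RULES = [
    (("NOT", ("NOT", "$predicate")), "$predicate"),
    (("AND", "$predicate", "$predicate"), "$predicate"),
]
```

For instance, `apply_rules(("AND", ("NOT", ("NOT", "a")), "a"), RULES)` first removes the double negation and then collapses the duplicated conjunct, yielding `"a"`. Because the rules are plain data, the same library can be reused across programs.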

9.2 Selection

Glitter uses expert system technology to help software developers select from a collection of reusable transformations. As described earlier, Glitter rules are abstractions that explicitly describe what a transformation or sequence of transformations will accomplish. More specifically, rules embody expert knowledge about how to accomplish goals during program transformation. Goals can be accomplished by either applying other rules or directly applying transformations.

Glitter rules have three parts:

Goal. The transformation issue that is addressed by the rule.

Strategies. A list of alternative ways to accomplish the goal. Each strategy may involve nested subgoals or the application of transformations.

Selection rationale. Guidelines on how to select among the different strategies.

For example, one goal in transforming a specification into an efficient implementation is to replace all references to past execution states with references to information in the current state. The strategies for doing this include

● Caching historical information until it is needed (eager evaluation)

● Computing the historical information from information in the current state (lazy evaluation)

The selection rationale for choosing among these alternatives includes

● Space/time tradeoffs for caching versus derivation

● Whether or not it is possible to rederive historical references from the current context
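The three-part structure of a rule can be pictured as a small data structure. The Python sketch below is purely hypothetical and does not reproduce Glitter's actual rule language: the classes `Strategy` and `Rule`, the `select` method, and the context keys `space_is_cheap` and `rederivable` are all assumptions introduced for illustration, with the selection rationale reduced to simple predicates.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Strategy:
    name: str
    applicable: Callable[[dict], bool]  # selection rationale as a predicate


@dataclass
class Rule:
    goal: str
    strategies: list

    def select(self, context: dict) -> Optional[str]:
        """Return the single applicable strategy, or None to ask the developer."""
        choices = [s.name for s in self.strategies if s.applicable(context)]
        return choices[0] if len(choices) == 1 else None


# Hypothetical encoding of the example goal and its two strategies.
remove_history = Rule(
    goal="replace references to past states with current-state information",
    strategies=[
        Strategy("cache (eager evaluation)",
                 lambda ctx: ctx["space_is_cheap"]),
        Strategy("rederive (lazy evaluation)",
                 lambda ctx: ctx["rederivable"] and not ctx["space_is_cheap"]),
    ],
)
```

When exactly one strategy is applicable the choice is automatic; when none or several apply, `select` returns `None`, which models the situations in which the system must fall back on the software developer.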

Glitter automates as much of the transformation process as possible. In situations in which strategies are incomplete or selection rationale fails, however, the system must ask the software developer for guidance. Fickas [1985] reports that Glitter can automate anywhere from 65% to 90% of the transformation selections and applications. Note that if Glitter could automate 100%, it would be a compiler for the specification language.

9.3 Specialization

After being selected, transformations are typically applied without the software developer having to modify or specialize them. Pattern variables in a transformation are a generalization, but the pattern variables are automatically specialized with existing code fragments in the program when a transformation is applied. Therefore, from the software developer’s point of view, specialization is in the hidden part of the transformation abstraction.



9.4 Integration

The integration of transformations is analogous to functional composition. For example, the application of two transformations to a program can be expressed as either the sequence of two mappings or the application of a composed mapping; that is,

$$P_0 \times T_0 \rightarrow P_1$$
$$P_1 \times T_1 \rightarrow P_{\mathit{final}}$$

or

$$P_0 \times (T_1 \circ T_0) \rightarrow P_{\mathit{final}}$$

The only time software developers need to be concerned about the composition of transformations is when the order of application is significant. For example, the order is significant when $P_A$ is not identical to $P_B$ in

$$P_0 \times (T_1 \circ T_0) \rightarrow P_A,$$
$$P_0 \times (T_0 \circ T_1) \rightarrow P_B.$$

$P_A$ and $P_B$ will be semantically equivalent since transformations always preserve semantics, but they may differ in terms of implementation characteristics.
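The effect of application order can be seen with two toy transformations on arithmetic expressions. The Python sketch below is an assumption made for illustration only: expressions are nested tuples, `t0` rewrites `x * 2` into `x + x`, `t1` rewrites `x + x` into `x << 1`, and the helper names `rewrite`, `compose`, and `evaluate` are hypothetical.

```python
import operator


def rewrite(node, rule):
    """Apply rule bottom-up to every node of a nested-tuple expression."""
    if isinstance(node, tuple):
        node = tuple(rewrite(child, rule) for child in node)
    return rule(node)


def t0(node):
    """x * 2  ==>  x + x"""
    if (isinstance(node, tuple) and len(node) == 3
            and node[0] == "mul" and node[2] == 2):
        return ("add", node[1], node[1])
    return node


def t1(node):
    """x + x  ==>  x << 1"""
    if (isinstance(node, tuple) and len(node) == 3
            and node[0] == "add" and node[1] == node[2]):
        return ("shl", node[1], 1)
    return node


def compose(*rules):
    """compose(t1, t0) applies t0 first and then t1, as in T1 o T0."""
    def composed(program):
        for rule in reversed(rules):
            program = rewrite(program, rule)
        return program
    return composed


def evaluate(node, env):
    """Evaluate an expression so that semantic equivalence can be checked."""
    if isinstance(node, str):
        return env[node]
    if isinstance(node, int):
        return node
    ops = {"mul": operator.mul, "add": operator.add, "shl": operator.lshift}
    op, left, right = node
    return ops[op](evaluate(left, env), evaluate(right, env))


P0 = ("mul", "x", 2)
PA = compose(t1, t0)(P0)   # t0 then t1
PB = compose(t0, t1)(P0)   # t1 then t0
```

Here `PA` is `("shl", "x", 1)` while `PB` is `("add", "x", "x")`: the two results compute the same value for every `x`, but they are different implementations because `t1` finds nothing to rewrite until `t0` has been applied.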

9.5 Appraisal of Reuse Techniques in Transformational Systems

The VHLL used in the first phase of a transformational system can have higher level abstractions than pure VHLLs such as SETL, PAISLey, and MODEL. The run-time performance of these higher level abstractions is very poor with the current VHLL compiler technology, but the human-guided transformations of the transformational approach produce more efficient implementations. Since transformational systems can use higher level abstractions than pure VHLLs, they exhibit a smaller cognitive distance between informal requirements of a software system and its written specification [Balzer 1989; Feather 1989].

The second phase in the transformational approach is essentially a human-guided compilation. Compared to VHLLs that use fully automated compilation, software developers expend additional effort to produce an efficiently executable system. By involving the software developer in the compilation process, transformational systems increase the cognitive distance in order to achieve better run-time performance.

Expert system technology has proven effective for reusing transformations. The rules in Glitter represent expert knowledge about applying transformations to achieve particular goals during the transformation phase. Glitter can be viewed as an interface to a library of reusable transformations. Software developers present high-level goals to Glitter, and Glitter helps locate the reusable transformations that satisfy those goals. The primary challenge for the transformational approach is to identify the types of optimizations that software developers use to implement efficient systems and to codify these optimizations into transformations and rules. Table 7 summarizes the transformational system approach to software reuse.

10. SOFTWARE ARCHITECTURES

Reusable software architectures are large-grain software frameworks and subsystems that capture the global structure of a software system design. This large-scale global structure represents a significant design and implementation effort that is reused as a whole. Reuse at this scale offers significant leverage in the development of software.

Examples of reusable software architectures include database subsystems that are specialized and reused in different applications; the framework for a compiler into which different lexical analyzers, parsers, and code generators can be inserted; rule-based and blackboard architectures for expert systems; adaptable user interface architectures; and even design frameworks for oscilloscopes [Garlan 1990; Krueger 1992; Lane 1990; Shaw 1991].

Software architectures are analogous to very large-scale software schemas. Software architectures, however, focus on subsystems and their interaction rather than data structures and algorithms [Shaw 1989]. Software architectures are also analogous to application generators in that large-scale system designs are reused. Application generators, however, are typically standalone systems with implicit architectures, whereas software architectures can often be explicitly specialized and integrated with other architectures to create many different composite architectures.

Table 7. Reuse in Transformational Systems

Abstraction. Transformations are reusable artifacts that describe how to map source code fragments into semantically equivalent, but more efficient, source code fragments. Rules are abstractions for expert knowledge about how to apply transformations to achieve higher level goals.

Selection. Reusable transformations can be stored in a library. Selection of transformations from this library can be enhanced by rule-based analysis that identifies sequences of transformations that satisfy high-level transformation goals.

Specialization. Transformations are typically applied without modification. Therefore, specialization is typically not an issue for transformation reuse.

Integration. The integration of transformations is implicit in the order in which they are applied. The issues are similar to functional composition.

Pros. Transformational systems use general-purpose, high-level programming abstractions. These abstractions can be at a higher level than with VHLLs because the human-guided transformations produce more efficient implementations than is possible with VHLL compilers. The higher level abstractions reduce cognitive distance.

Cons. Applying transformations requires time and effort from the software developer, thereby increasing the cognitive distance. As transformation systems improve, however, more of this effort is automated.

It is interesting to note that some of the more powerful reuse techniques discussed earlier implicitly capitalize on reuse at the software architecture level. For example, experienced software engineers scavenge previous designs with great leverage. Application generators reuse significant portions of system designs and implementations within each generated system. Transformational systems reuse the design of efficient implementations for high-level specifications. Prototyping and experience provide software developers with positive and negative forms of design reuse (i.e., designs that work versus those that do not).

10.1 Abstraction in Software Architectures

Draco is an example software architec-ture technology [Freeman 1987a; Neigh-bors 1984, 1989]. With Draco, each soft-ware architecture is encapsulated in itsown application generator. The outputfrom these “architecture generators” can

be used as building blocks for higher levelarchitecture generators. Therefore, Dracois an architecture generator generator.

In Draco, each software architecture has a domain language and a set of components that implement the domain language. The domain language corresponds to the abstraction specification for an architecture. It captures the relevant abstractions (i.e., objects and operations) for a software architecture in a particular domain. The design of domain languages is identical to the design of the abstraction specifications for conventional application generators, as described in Section 7.5.2:

Recognizing an appropriate domain

Defining domain boundaries (domain coverage)

Defining the underlying abstractions

Defining the variant and invariant parts of the abstraction

Defining the form of the language

The art of domain analysis and domain language design has become an independent area of research [Prieto-Diaz and Arango 1991]. Example domains range from relatively small areas, such as sets and stacks, to arbitrarily large application domains, such as navigation, process control, compilation, and data processing.


Draco components are written to implement each object and operation in a domain language. Components therefore correspond to the abstraction realization level for a software architecture. These components define the execution semantics of a domain language, making it executable.

Draco provides software developers with a recursive model of reuse. The components that implement a new domain language are written using other domain languages (possibly including the base language Lisp). For example, the implementation of a parser architecture might use an existing domain language for tree-structured data. A reusable domain language built on several recursive layers of domain languages represents a significant amount of reuse.

The software developer only deals with the top layer domain language in such a recursive reuse structure. The nested domain languages used to implement the top layer language are in the hidden part of the top layer abstraction.
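The layering described above can be illustrated with a small sketch. It is not Draco itself (whose base language is Lisp); the `Domain` class, the `tree` and `parser` domains, and every operation name below are hypothetical and chosen only to show a top layer language whose component is written in terms of a lower, hidden domain language.

```python
# Hypothetical sketch of recursive reuse: a domain language maps operation
# names to components, and a component may itself be written using the
# operations of a lower-level domain language.

class Domain:
    def __init__(self, name, uses=()):
        self.name = name
        self.uses = list(uses)   # lower-level domains, hidden from the user
        self.components = {}     # operation name -> implementing function

    def component(self, op):
        """Decorator that registers a function as the component for `op`."""
        def register(fn):
            self.components[op] = fn
            return fn
        return register

    def run(self, op, *args):
        return self.components[op](*args)

    def depth(self):
        """Count the domain-language layers at and beneath this domain."""
        return 1 + max((d.depth() for d in self.uses), default=0)


# Bottom layer: a tiny domain language for tree-structured data.
tree = Domain("tree")

@tree.component("node")
def make_node(label, *children):
    return (label, list(children))

@tree.component("leaves")
def leaves(node):
    label, children = node
    if not children:
        return [label]
    return [leaf for child in children for leaf in tree.run("leaves", child)]


# Top layer: a toy parser domain whose component is written with the tree domain.
parser = Domain("parser", uses=[tree])

@parser.component("parse_sum")
def parse_sum(text):
    # Parses "a+b+c" into a tree; callers never see the tree representation.
    terms = [tree.run("node", term.strip()) for term in text.split("+")]
    return tree.run("node", "+", *terms)
```

A user of the `parser` domain calls only `parser.run("parse_sum", ...)`; the `tree` domain stays in the hidden part of the abstraction, which is the point the paragraph makes.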

10.2 Selection

Once the domain language for a Draco software architecture is written and the components implemented, it is available for use (reuse) in building applications or new software architectures. As with libraries of other types of software artifacts, a large collection of software architectures must be accompanied by tools and techniques for selection (classification, searching, and exposition). In particular, a library system must help software developers find software architectures with the smallest cognitive distance between the architecture and the application requirements.

Analogous to application generators, when an appropriate software architecture does not exist for an application, it may still be advantageous to create a new reusable architecture. This up-front cost is justified in cases in which an application will be modified many times in its life or when many prototypes are needed to converge on a usable system. In addition, the new software architecture becomes available for reuse in subsequent application and architecture designs.

10.3 Specialization

Once a system is built using a Draco software architecture, it can be specialized for more efficient execution in two ways: source-to-source transformations and component refinement.

Source-to-source transformations, like those described in Section 9, are optimizations applied to programs written in a domain language. These transformations produce semantically equivalent but more efficient specifications in the same domain language. For example, in an algebraic domain, the following simple transformation converts exponentiation to multiplication for the special case of squares [Neighbors 1984]:

$X^2 ==> $X*$X
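As a rough illustration of how such a rule might be applied, the sketch below rewrites squares into multiplications over a small expression tree and checks that the result evaluates to the same value. The `Var`, `Num`, and `BinOp` node types and both function names are assumptions made for this example; they do not reflect Draco's own representation of domain-language programs.

```python
import operator
from dataclasses import dataclass

# Hypothetical expression tree for a tiny algebraic domain language.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Num:
    value: int

@dataclass(frozen=True)
class BinOp:
    op: str        # one of "^", "*", "+"
    left: object
    right: object

def square_to_multiply(expr):
    """Apply the rule X^2 ==> X*X bottom-up; other nodes are rebuilt unchanged."""
    if not isinstance(expr, BinOp):
        return expr
    left = square_to_multiply(expr.left)
    right = square_to_multiply(expr.right)
    if expr.op == "^" and right == Num(2):
        return BinOp("*", left, left)
    return BinOp(expr.op, left, right)

_OPERATORS = {"^": operator.pow, "*": operator.mul, "+": operator.add}

def evaluate(expr, env):
    """Evaluate an expression given variable bindings in `env`."""
    if isinstance(expr, Num):
        return expr.value
    if isinstance(expr, Var):
        return env[expr.name]
    return _OPERATORS[expr.op](evaluate(expr.left, env), evaluate(expr.right, env))
```

Both the input and the output of `square_to_multiply` are expressions in the same toy language, which mirrors the source-to-source character of the transformation.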

Component refinements are alternative implementations of the same domain language construct that have different performance characteristics. This is exactly like the multiple schema implementations in Section 6.3. For example, in an algebraic domain, the exponential operator in the domain language could be implemented by both the binary shift method and the Taylor expansion method. The software developer using this domain language would choose the component refinement with the best execution profile for a particular application.
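The idea of choosing among refinements can be sketched as follows. The two functions are one possible reading of the methods named above, not a reconstruction of Draco's components: "binary shift" is interpreted here as square-and-multiply over the bits of an integer exponent, and "Taylor expansion" as a truncated series for exp(y ln x). The names `power_binary`, `power_taylor`, `REFINEMENTS`, and `power` are all hypothetical.

```python
import math

def power_binary(base, exponent):
    """Square-and-multiply: exact, but limited to non-negative integer exponents."""
    if not isinstance(exponent, int) or exponent < 0:
        raise ValueError("binary method needs a non-negative integer exponent")
    result = 1
    while exponent:
        if exponent & 1:          # lowest bit set: fold in the current power
            result *= base
        base *= base
        exponent >>= 1            # shift to the next bit of the exponent
    return result

def power_taylor(base, exponent, terms=40):
    """Approximate base**exponent as exp(exponent * ln(base)) via a Taylor series."""
    if base <= 0:
        raise ValueError("this Taylor sketch assumes a positive base")
    y = exponent * math.log(base)
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= y / k             # term is now y**k / k!
        total += term
    return total

# Two refinements of the same domain-language operator.
REFINEMENTS = {"binary": power_binary, "taylor": power_taylor}

def power(base, exponent, refinement="binary"):
    """Dispatch to the refinement the developer selected."""
    return REFINEMENTS[refinement](base, exponent)
```

Here the developer would pick `"binary"` for exact integer powers and `"taylor"` when fractional exponents are needed and an approximation is acceptable; the operator seen at the domain-language level is the same in both cases.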

10.4 Integration

The primary difference between Draco and conventional application generators is that Draco allows an arbitrary collection of software architectures to be integrated into a single application, whereas conventional application generators are typically standalone. This flexibility adds considerable power to the Draco model. A reusable architecture can be used as a subsystem in many larger architectures rather than being limited to a single application generator.



Table 8. Reuse in Software Architectures

Abstraction: Architectural abstractions come directly from application domains. High-level domain abstractions are captured in the “domain language” for an architecture and are automatically mapped into executable implementations.

Selection: The analogy between software schemas and software architectures suggests that library techniques for doing classification, search, and exposition could be used to select among a collection of reusable software architectures.

Specialization: Software developers can specialize software architectures to produce implementations that match the performance requirements of an application. Specialization may be horizontal with source-to-source transformations or vertical with component refinements.

Integration: Integrating different domain languages into a single application requires a special-purpose module interconnection language. For example, the Draco module interconnection language checks for type representation consistency and for domain-specific semantic consistency in the architecture interfaces.

Pros: Application developers use high-level abstractions from an application domain to instantiate and compose software architectures. Therefore, like application generators, the cognitive distance is small. In addition, reusable architectures can be used either stand-alone to create end-user applications or as building blocks to create higher level software architectures.

Cons: Creating reusable software architectures is difficult, and many such architectures will be needed to populate a general-purpose library of architectures.

Special care must be taken in the Draco module interconnection language since multiple software architectures with a variety of domain languages may be integrated into a single application. For example, the interconnection language must accommodate different domain languages and even different component refinements with different representations of the same data object. When objects are passed in and out of functions, type representation consistency must be maintained. In addition to syntactic consistency, semantic consistency can be maintained by attaching preconditions and postconditions to the interface of a component refinement.
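A minimal sketch of these two kinds of checks is given below. It does not model the actual Draco module interconnection language; the `Refinement` class, the `connect` function, and the representation tags such as `"sorted_list"` are hypothetical names used only to show representation consistency being checked at composition time and preconditions and postconditions being checked at call time.

```python
class Refinement:
    """A component refinement with a declared representation and contracts."""

    def __init__(self, fn, takes, returns, pre=None, post=None):
        self.fn = fn
        self.takes = takes        # representation expected on input
        self.returns = returns    # representation produced on output
        self.pre = pre or (lambda value: True)
        self.post = post or (lambda value, result: True)

    def __call__(self, value):
        if not self.pre(value):
            raise ValueError("precondition violated")
        result = self.fn(value)
        if not self.post(value, result):
            raise ValueError("postcondition violated")
        return result


def connect(producer, consumer):
    """Compose two refinements only if their data representations agree."""
    if producer.returns != consumer.takes:
        raise TypeError(
            f"representation mismatch: {producer.returns} -> {consumer.takes}"
        )
    return Refinement(lambda value: consumer(producer(value)),
                      takes=producer.takes, returns=consumer.returns)


# Toy refinements: two ways to represent a collection, one consumer.
to_sorted = Refinement(sorted, takes="list", returns="sorted_list",
                       post=lambda value, result: len(result) == len(value))
as_hash_set = Refinement(set, takes="list", returns="hash_set")
smallest = Refinement(lambda items: items[0], takes="sorted_list", returns="item",
                      pre=lambda items: len(items) > 0)
```

In this sketch, `connect(to_sorted, smallest)` succeeds, while `connect(as_hash_set, smallest)` is rejected before anything runs because the consumer expects a different representation of the same data object.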

10.5 Appraisal of Reuse Techniques in Software Architectures

As with application generators, the leverage offered by software architectures comes from the small cognitive distance between informal concepts in an application domain and executable implementations. The software developer using a software architecture typically works at the abstraction specification level. The mapping from abstraction specification to abstraction realization is mostly automated (although in systems like Draco the software developer can apply transformations and choose among different component refinements to increase efficiency). This automation isolates the software developer from the hidden and realization parts of the abstraction.

Software architectures can be used as either standalone application generators to create end-user applications or as building blocks for creating higher level architectures. Therefore, in cases in which a reusable software architecture does not exist for a particular application, lower level software architectures can be composed to create the new architecture.

Similar to other reuse techniques, it will be difficult to build a large, general-purpose library of high-quality reusable software architectures [Neighbors 1989]. This is due to both the challenge of identifying and building appropriate software architectures and the difficulty of identifying and building effective library tools. Table 8 summarizes the software architecture approach to software reuse.

11. SUMMARY

In this survey, a broad range of software reuse techniques from the research literature was described and compared. The different approaches to software reuse were partitioned into categories and then compared using a uniform taxonomy.

11.1 Categories and Taxonomy

For this survey, the field of software reuse was partitioned into eight categories, each of which illustrates a characteristic set of issues and techniques in software reuse:

● High-level languages

● Design and code scavenging

● Source code components

● Software schemas

● Application generators

● Very high-level languages

● Transformational systems

● Software architectures

The following taxonomy for the categories allowed us to compare and contrast different reuse techniques as well as illustrate some of the fundamental issues in software reuse:

● Abstraction

● Selection

    - Classification

    - Retrieval

    - Exposition

● Specialization

● Integration

11.2 Cognitive Distance

The notion of cognitive distance was introduced as an intuitive gauge to compare the effectiveness of reuse techniques. Cognitive distance is informally defined as the amount of intellectual effort that must be expended by software developers in order to take a software system from one stage of development to another.

Software reuse techniques can reduce cognitive distance in two ways:

(1) Reduce the amount of intellectual effort required to go from the initial conceptualization of a system to a specification of the system in abstractions of the reuse technique.

(2) Reduce the amount of intellectual effort required to produce an executable system once the software developer has produced a specification in terms of abstractions of the reuse technique.

To reduce cognitive distance with the first method, implementors of a reuse technique must move the abstraction specifications in the technique closer to the abstractions used to reason about applications informally. For example, application generators use domain-specific abstractions for their programming model.

Following is an approximate ranking of the eight reuse categories, rated on how well they minimize cognitive distance by the first method (where one is best):

(1) Application generators

(2) Software architectures

(3) Transformational systems

(4) Very high-level languages

(5) Software schemas

(6) Source code components

(7) Code scavenging

(8) High-level languages

To reduce cognitive distance with the second method, implementors of a reuse technique must partially or fully automate the mapping from abstraction specifications to executable abstraction realizations. Cognitive distance is reduced or eliminated by minimizing the need for software developers to see and understand the hidden and realization parts of the abstractions. For example, very high-level languages use compilers or interpreters to map abstract specifications in the language directly into an executable form.

Following is an approximate ranking of the eight reuse categories, rated on how well they minimize cognitive distance by the second method (where one is best):

(1) High-level languages

(2) Very high-level languages



(3) Application generators

(4) Software architectures

(5) Transformational systems

(6) Software schemas

(7) Source code components

(8) Code scavenging

An ideal software reuse technology would use both approaches to reduce cognitive distance. That is, a software developer using this technology would quickly be able to select, specialize, and integrate abstraction specifications that satisfied a particular set of informal requirements, and the abstraction specifications would be automatically translated into an executable system. In addition, this ideal technology could be used in all application domains.

Of the eight reuse technologies examined in this report, the reusable software architectures probably come closest to this ideal description. A “perfected” form of this technology would have a complete library of useful software architectures, ranging from small-grained domains such as sets and stacks to large-grained application domains such as airline navigation control systems. The library system would allow software developers to locate the most appropriate reusable artifacts for their application easily. The software architectures would be specialized and integrated in terms of concise high-level abstraction specifications from the application domain. Executable systems would be automatically produced from these high-level specifications.

11.3 General Conclusions

● What is software reuse? Software reuse is the process of using existing software artifacts rather than building them from scratch. Typically, reuse involves the selection, specialization, and integration of artifacts, although different reuse techniques may emphasize or deemphasize certain of these.

● Why reuse software artifacts? The primary motivation to reuse software artifacts is to reduce the time and effort required to build software systems. The quality of software systems is enhanced by reusing quality software artifacts, which also reduces the time and effort required to maintain software systems.

● What is required to implement a software reuse technology? Creating a complex executable system with fewer keystrokes and less cognitive burden on software developers clearly implies a higher level of abstraction. For a software developer to select, specialize, and integrate reusable artifacts successfully, the reuse technology must provide natural, succinct, high-level abstractions that describe artifacts in terms of “what” they do rather than “how” they do it. The ability of a software developer to practice software reuse is limited primarily by the ability to reason in terms of these abstractions. In other words, there must be a small cognitive distance between informal reasoning and the abstract concepts defined by the reuse technology.

In areas of research that strive to raise the level of abstraction in software development, much of the leverage has come from software reuse. In fact, much of the software reuse literature is not about new techniques developed from first principles of reuse but rather about ongoing research projects that have discovered that reuse is responsible for the advantages offered by high-level abstractions in their systems.

● Why is software reuse difficult? Useful abstractions for large, complex, reusable software artifacts will typically be complex. In order to use these artifacts, software developers must either be familiar with the abstractions a priori or must take time to study and understand the abstractions. The latter case can defeat some or all of the gains in reusing an artifact. The former case is where we have seen significant successes in reuse. Examples include (1) math libraries used by developers who are familiar with the mathematical concepts, (2) abstract data type schemas such as stacks and queues used by developers who are familiar with the semantics, and (3) domain-specific application generators and domain languages used by developers who are familiar with concepts in a particular application domain.

The following truisms were included earlier in the survey. They are simple statements on reuse that have been difficult to satisfy in practice.

For a software reuse technique to be effective, it must reduce the cognitive distance between the initial concept of a system and its final executable implementation.

For a software reuse technique to be effective, it must be easier to reuse the artifacts than it is to develop the software from scratch.

To select an artifact for reuse, you must know what it does.

To reuse a software artifact effectively, you must be able to “find it” faster than you could “build it.”

● What are the open areas of research in software reuse? There is clearly much work that remains to be done. In general, the search for high-level abstractions for software artifacts is probably the most crucial, but this type of research is not new or unique to software reuse.

Library issues are also critical to many reuse technologies. These issues, however, lead right back to abstraction since classification, search, and exposition are heavily dependent on high-level abstractions for software artifacts in a library.

ACKNOWLEDGMENTS

Nico Habermann contributed greatly to this survey through his patient reading, insightful comments, and thoughtful discussion. John McDermott, Barbara Staudt Lerner, Peter Feiler, John Ockerbloom, and the anonymous reviewers all provided valuable feedback and detailed comments. Ellen Douglas gave many suggestions on improving the technical writing style (although I admittedly have not done justice to her efforts). My initial interest in the subject of software reuse was sparked in a discussion group headed by Mary Shaw at the Software Engineering Institute.

REFERENCES

BALZER, R. 1985. A 15 year perspective on automatic programming. IEEE Trans. Softw. Eng. SE-11, 11 (Nov.), 1257-1268.

BALZER, R. 1989. A 15 year perspective on automatic programming. In Frontier Series: Software Reusability: Volume II—Applications and Experience. Biggerstaff, T. J., and Perlis, A. J., Eds. ACM Press, New York, pp. 289-311, Chap. 14. Originally Balzer [1985].

BENTLEY, J. 1986. Programming pearls. Commun. ACM 29, 5 (May), 364-369.

BIGGERSTAFF, T. 1987. Hypermedia as a tool to aid large scale reuse. In Proceedings of the Workshop on Software Reuse (Oct.). Rocky Mountain Institute of Software Engineering, Boulder, Colo.

BIGGERSTAFF, T. J., AND PERLIS, A. J., EDS. 1989a. Frontier Series: Software Reusability: Volume I—Concepts and Models. ACM Press, New York.

BIGGERSTAFF, T. J., AND PERLIS, A. J., EDS. 1989b. Frontier Series: Software Reusability: Volume II—Applications and Experience. ACM Press, New York.

BIGGERSTAFF, T., AND RICHTER, C. 1987. Reusability framework, assessment, and directions. IEEE Softw. 4, 2 (Mar.), 41-49. Also in Tracz [1988] and Biggerstaff and Richter [1989].

BIGGERSTAFF, T., AND RICHTER, C. 1989. Reusability framework, assessment, and directions. In Frontier Series: Software Reusability: Volume I—Concepts and Models. Biggerstaff, T. J., and Perlis, A. J., Eds. ACM Press, New York, pp. 1-17, Chap. 1. Originally Biggerstaff and Richter [1987].

BOEHM, B. W. 1987. Improving software productivity. IEEE Comput. 20, 9 (Sept.), 43-57.

BOOCH, G. 1987. Software Components with Ada: Structures, Tools, and Subsystems. Benjamin/Cummings Publishing Company, Inc., Menlo Park, Calif.

BROOKS, F. P. JR. 1975. The Mythical Man-Month. Addison-Wesley, Reading, Mass.

BROOKS, F. P. 1987. No silver bullet: Essence and accidents of software engineering. IEEE Comput. 20, 4 (Apr.), 10-19.

CHEATHAM, T. E. JR. 1983. Reusability through program transformations. In Workshop on Reusability in Programming (Newport, R.I., Sept.). ITT Programming, Stratford, Conn., pp. 122-128. Also in Cheatham [1984, 1989].

CHEATHAM, T. E. JR. 1984. Reusability through program transformations. IEEE Trans. Softw. Eng. SE-10, 5 (Sept.), 589-594. Originally Cheatham [1983].

CHEATHAM, T. E. JR. 1989. Reusability through program transformations. In Frontier Series: Software Reusability: Volume I—Concepts and Models. Biggerstaff, T. J., and Perlis, A. J., Eds. ACM Press, New York, pp. 321-335, Chap. 13. Originally Cheatham [1983].

CHENG, T. T., LOCK, E. D., AND PRYWES, N. S. 1984. Use of very high level languages and program generation by management professionals. IEEE Trans. Softw. Eng. SE-10, 5 (Sept.), 552-563.



CLEAVELAND, J. C. 1988. Building application generators. IEEE Softw. 5, 4 (July), 25-33. Also in Prieto-Diaz and Arango [1991].

DEREMER, F., AND KRON, H. H. 1976. Programming-in-the-large versus programming-in-the-small. IEEE Trans. Softw. Eng. SE-2, 2 (June), 80-86.

DEUTSCH, P. L. 1983. Reusability in the Smalltalk-80 programming system. In Workshop on Reusability in Programming (Newport, R.I., Sept.). ITT Programming, Stratford, Conn., pp. 72-76. Also in Freeman [1987b].

DEUTSCH, L. P. 1989. Design reuse and frameworks in the Smalltalk-80 system. In Frontier Series: Software Reusability: Volume II—Applications and Experience. Biggerstaff, T. J., and Perlis, A. J., Eds. ACM Press, New York, pp. 57-71, Chap. 3.

DOBERKAT, E., DUBINSKY, E., AND SCHWARTZ, J. T. 1983. Reusability of design for complex programs: An experiment with the SETL optimizer. In Workshop on Reusability in Programming (Newport, R.I., Sept.). ITT Programming, Stratford, Conn., pp. 106-108.

DONGARRA, J. J., AND GROSSE, E. 1987. Distribution of mathematical software via electronic mail. Commun. ACM 30, 5 (May), 403-407.

DONZEAU-GOUGE, V., KAHN, G., LANG, B., AND MELESE, B. 1984. Document structure and modularity in Mentor. In Proceedings of the ACM Sigsoft/Sigplan Software Engineering Symposium on Practical Software Development Environments (Pittsburgh, Penn., Apr.). ACM SIGSOFT/SIGPLAN, New York, pp. 141-148.

DUBINSKY, E., FREUDENBERGER, S., SCHONBERG, E., AND SCHWARTZ, J. T. 1989. Reusability of design for large software systems: An experiment with the SETL optimizer. In Frontier Series: Software Reusability: Volume I—Concepts and Models. Biggerstaff, T. J., and Perlis, A. J., Eds. ACM Press, New York, pp. 275-293, Chap. 11.

EMBLEY, D. W., AND WOODFIELD, S. N. 1987. A knowledge structure for reusing abstract data types. In 9th International Conference on Software Engineering (Monterey, Calif., Mar.). IEEE Computer Society Press, Los Alamitos, Calif., pp. 360-365. Also in Tracz [1988].

ESHELMAN, L. 1988. MOLE: A knowledge-acquisition tool for cover-and-differentiate systems. In Kluwer International Series in Engineering and Computer Science. Automatic Knowledge Acquisition for Expert Systems. Kluwer Academic Publishers, Boston, Mass., pp. 37-80, Chap. 3.

EVB SOFTWARE 1985. An Object Oriented Design Handbook for Ada Software. EVB Software Engineering, Inc., Frederick, Maryland.

FEATHER, M. S. 1983. Reuse in the context of a transformation based methodology. In Workshop on Reusability in Programming (Newport, R.I., Sept.). ITT Programming, Stratford, Conn., pp. 50-58. Also in Freeman [1987b] and Feather [1989].

FEATHER, M. S. 1989. Reuse in the context of a transformation based methodology. In Frontier Series: Software Reusability: Volume I—Concepts and Models. Biggerstaff, T. J., and Perlis, A. J., Eds. ACM Press, New York, pp. 337-359, Chap. 14. Originally Feather [1983].

FICKAS, S. F. 1985. Automating the transformational development of software. IEEE Trans. Softw. Eng. SE-11, 11 (Nov.), 1268-1277.

FREEMAN, P. 1983. Reusable software engineering: Concepts and research directions. In Workshop on Reusability in Programming (Newport, R.I., Sept.). ITT Programming, Stratford, Conn., pp. 2-16. Also in Freeman [1987b].

FREEMAN, P. 1987a. A conceptual analysis of the Draco approach to constructing software systems. IEEE Trans. Softw. Eng. SE-13, 7 (July), 830-844. Also in Freeman [1987b].

FREEMAN, P., ED. 1987b. Tutorial: Software Reusability. IEEE Computer Society Press, Washington, D.C.

GARLAN, D. 1990. The role of formal reusable frameworks. In Proceedings of the ACM SIGSOFT International Workshop on Formal Methods in Software Development (Napa, Calif., May). ACM Press, New York, pp. 42-44.

GOGUEN, J. A. 1986. Reusing and interconnecting software components. IEEE Comput. 19, 2 (Feb.), 16-28. Also in Freeman [1987b] and Prieto-Diaz and Arango [1991].

GOGUEN, J. A. 1989. Principles of parameterized programming. In Frontier Series: Software Reusability: Volume I—Concepts and Models. Biggerstaff, T. J., and Perlis, A. J., Eds. ACM Press, New York, pp. 159-225, Chap. 7.

HABERMANN, A. N., KRUEGER, C., PIERCE, B., STAUDT, B., AND WENN, J. 1988. Programming with views. Tech. Rep. CMU-CS-87-177, Carnegie-Mellon University, Pittsburgh, Penn.

HAYES-ROTH, F., WATERMAN, D., AND LENAT, D., EDS. 1983. Teknowledge Series in Knowledge Engineering. Vol. 1, Building Expert Systems. Addison-Wesley, Reading, Mass.

HOROWITZ, E., AND MUNSON, J. B. 1983. An expansive view of reusable software. In Workshop on Reusability in Programming (Newport, R.I., Sept.). ITT Programming, Stratford, Conn., pp. 250-262. Also in Horowitz and Munson [1984], Freeman [1987b], and Horowitz and Munson [1989].

HOROWITZ, E., AND MUNSON, J. B. 1984. An expansive view of reusable software. IEEE Trans. Softw. Eng. SE-10, 5 (Sept.), 477-487. Originally Horowitz and Munson [1983].

HOROWITZ, E., AND MUNSON, J. B. 1989. An expansive view of reusable software. In Frontier Series: Software Reusability: Volume I—Concepts and Models. Biggerstaff, T. J., and Perlis, A. J., Eds. ACM Press, New York, pp. 19-41, Chap. 2.

HOROWITZ, E., KEMPER, A., AND NARASIMHAN, B. 1985. A survey of application generators. IEEE Softw. 2, 1 (Jan.), 40-54.



ICHBIAH, J. D. 1983. On the design of Ada. In Information Processing 83. Mason, R. E. A., Ed. IFIP, Elsevier Science Pub., New York, pp. 1-10.

IMSL 1987. IMSL Math/Library User’s Manual. 1.0 Edition, Houston, Tex.

JOHNSON, S. C. 1979. Yacc: Yet Another Compiler-Compiler in the UNIX Programmer’s Manual—Supplementary Documents. 7th ed. AT&T Bell Laboratories, Indianapolis, Ind.

KAISER, G. E. 1985. Semantics for structure editing environments. Ph.D. dissertation, Carnegie-Mellon Univ.

KAISER, G. E. 1989. Incremental dynamic semantics for language-based programming environments. ACM Trans. Program. Lang. Syst. 11, 2 (Apr.), 169-193. See also Kaiser [1985].

KAISER, G. E., AND GARLAN, D. 1987. Melding software systems from reusable building blocks. IEEE Softw. 4, 4 (July), 17-24. Also in Tracz [1988].

KAISER, G. E., AND GARLAN, D. 1989. Synthesizing programming environments from reusable features. In Frontier Series: Software Reusability: Volume II—Applications and Experience. Biggerstaff, T. J., and Perlis, A. J., Eds. ACM Press, New York, pp. 35-55, Chap. 2.

KATZ, S., RICHTER, C. A., AND THE, K. 1987. PARIS: A system for reusing partially interpreted schemas. In 9th International Conference on Software Engineering (Monterey, Calif., Mar.). IEEE Computer Society Press, Los Alamitos, Calif., pp. 377-385. Also in Tracz [1988] and Katz et al. [1989].

KATZ, S., RICHTER, C. A., AND THE, K. 1989. PARIS: A system for reusing partially interpreted schemas. In Frontier Series: Software Reusability: Volume I—Concepts and Models. Biggerstaff, T. J., and Perlis, A. J., Eds. ACM Press, New York, pp. 257-273, Chap. 10. Originally Katz et al. [1987].

KERNIGHAN, B. W. 1983. The Unix system and software reusability. In Workshop on Reusability in Programming (Newport, R.I., Sept.). ITT Programming, Stratford, Conn., pp. 235-239. Also in Kernighan [1984] and Freeman [1987b].

KERNIGHAN, B. W. 1984. The Unix system and software reusability. IEEE Trans. Softw. Eng. SE-10, 5 (Sept.), 513-518. Originally Kernighan [1983].

KNUTH, D. E. 1973. The Art of Computer Programming. Addison-Wesley, Reading, Mass.

KRUCHTEN, P., SCHONBERG, E., AND SCHWARTZ, J. 1984. Software prototyping using the SETL programming language. IEEE Softw. 1, 4 (Oct.), 66-75.

KRUEGER, C. W. 1987. Expert system engineering and its relation to conventional software engineering. Ph.D. area qualifier paper. Carnegie-Mellon Univ., Pittsburgh, Penn. Available from author on request.

KRUEGER, C. W. 1992. Application-specific object management architectures. Ph.D. dissertation, Carnegie-Mellon Univ.

LANE, T. G. 1990. User interface software structures. Ph.D. dissertation, Carnegie-Mellon Univ.

LATOUR, L., AND JOHNSON, E. 1988. Seer: A graphical retrieval system for reusable Ada software modules. In The 3rd International IEEE Conference on Ada Applications and Environments (Manchester, N.H., May). IEEE Computer Society Press, Los Alamitos, Calif., pp. 105-113.

LESK, M. E., AND SCHMIDT, E. 1979. Lex: A Lexical Analyzer Generator in the UNIX Programmer’s Manual—Supplementary Documents. 7th ed. AT&T Bell Laboratories, Indianapolis, Ind.

LEVY, L. S. 1986. A metaprogramming method and its economic justification. IEEE Trans. Softw. Eng. SE-12, 2 (Feb.), 272-277.

LIEBERHERR, K. J., AND RIEL, A. J. 1988. Demeter: A CASE study of software growth through parametrized classes. In The 10th International Conference on Software Engineering (Singapore, Apr.). IEEE Computer Society Press, Los Alamitos, Calif., pp. 254-264.

LISKOV, B. 1987. Data abstraction and hierarchy. In OOPSLA ’87, Addendum to the Proceedings (Orlando, Fla., Oct.). ACM Sigplan, New York, pp. 17-34.

LIU, S., AND PAIGE, R. 1979. Data structure choice/formal differentiation. Tech. Rep. NSO-15, Courant Institute of Mathematical Sciences, New York.

LOECKX, J., AND SIEBER, K. 1984. Wiley and Teubner Series in Computer Science: The Foundations of Program Verification. John Wiley, New York.

LUCKHAM, D. C., AND VON HENKE, F. W. 1984. An overview of Anna, a specification language for Ada. In 1984 Conference on Ada Applications and Environments (St. Paul, Minn., Oct.). IEEE Computer Society Press, Los Alamitos, Calif., pp. 116-127.

MARCUS, S., ED. 1988. Kluwer International Series in Engineering and Computer Science: Automatic Knowledge Acquisition for Expert Systems. Kluwer Academic Publishers, Boston, Mass.

MCDERMOTT, J. 1986. Making expert systems explicit. In Information Processing 86. IFIP, Elsevier Science Pub., New York, pp. 539-544.

MCDERMOTT, J. 1988. Preliminary steps toward a taxonomy of problem-solving methods. In Kluwer International Series in Engineering and Computer Science. Automatic Knowledge Acquisition for Expert Systems. Marcus, S., Ed. Kluwer Academic Publishers, Boston, Mass., pp. 225-256, Chap. 8.

MCILROY, M. D. 1968. Mass produced software components. In Software Engineering; Report on a Conference by the NATO Science Committee (Garmisch, Germany, Oct.). Naur, P., and Randell, B., Eds. NATO Scientific Affairs Division, Brussels, Belgium, pp. 138-150.



MENDAL, G. O. 1986. Designing for Ada reuse: A case study. In 2nd International Conference on Ada Applications and Environments (Miami Beach, Fla., Apr.). IEEE Computer Society Press, Los Alamitos, Calif., pp. 33-42.

MEYER, B. 1987. Reusability: The case for object-oriented design. IEEE Softw. 4, 2 (Mar.), 50-63. Also in Tracz [1988] and Meyer [1989].

MEYER, B. 1989. Reusability: The case for object-oriented design. In Frontier Series: Software Reusability: Volume II—Applications and Experience. Biggerstaff, T. J., and Perlis, A. J., Eds. ACM Press, New York, pp. 1-33, Chap. 1. Originally Meyer [1987].

NAUR, P., AND RANDELL, B., EDS. 1968. Software Engineering; Report on a Conference by the NATO Science Committee. NATO Scientific Affairs Division, Brussels, Belgium.

NEIGHBORS, J. M. 1983. The Draco approach to constructing software from reusable components. In Workshop on Reusability in Programming (Newport, R.I., Sept.). ITT Programming, Stratford, Conn., pp. 167-178. Also in Neighbors [1984] and Freeman [1987b].

NEIGHBORS, J. M. 1984. The Draco approach to constructing software from reusable components. IEEE Trans. Softw. Eng. SE-10, 5 (Sept.), 564-574. Originally Neighbors [1983].

NEIGHBORS, J. M. 1989. Draco: A method for engineering reusable software systems. In Frontier Series: Software Reusability: Volume I—Concepts and Models. Biggerstaff, T. J., and Perlis, A. J., Eds. ACM Press, New York, pp. 295-319, Chap. 12. Also in Prieto-Diaz and Arango [1991].

PARNAS, D. L., CLEMENTS, P. C., AND WEISS, D. M. 1983. Enhancing reusability with information hiding. In Workshop on Reusability in Programming (Newport, R. I., Sept.). ITT Programming, Stratford, Conn., pp. 240–247. Also in Freeman [1987b] and Parnas et al. [1989].

PARNAS, D. L., CLEMENTS, P. C., AND WEISS, D. M. 1989. Enhancing reusability with information hiding. In Frontier Series: Software Reusability: Volume I—Concepts and Models. Biggerstaff, T. J., and Perlis, A. J., Eds. ACM Press, New York, pp. 141–157, Chap. 6. Originally Parnas and Clements [1983].

PARTSCH, H., AND STEINBRUGGEN, R. 1983. Program transformation systems. ACM Comput. Surv. 15, 3 (Sept.), 199–236.

PAYTON, T., KELLER, S., PERKINS, J., ROWLAN, S., AND MARDINLY, S. 1982. SSAGS: A syntax and semantics analysis and generation system. In 6th International Computer Software and Applications Conference (COMPSAC 82) (Chicago, Ill., Nov.). IEEE Computer Society Press, Los Alamitos, Calif., pp. 424–432.

PRIETO-DIAZ, R. 1989. Classification of reusable modules. In Frontier Series: Software Reusability: Volume I—Concepts and Models. Biggerstaff, T. J., and Perlis, A. J., Eds. ACM Press, New York, pp. 99–123, Chap. 4.

PRIETO-DIAZ, R., AND ARANGO, G., EDS. 1991. Domain Analysis and Software System Modeling. IEEE Computer Society Press, Los Alamitos, Calif.

PRIETO-DIAZ, R., AND FREEMAN, P. 1987. Classifying software for reusability. IEEE Softw. 4, 1 (Jan.), 6–16. Also in Freeman [1987b].

PRIETO-DIAZ, R., AND NEIGHBORS, J. M. 1986. Module interconnection languages. J. Syst. Softw. 6, 4 (Nov.), 307–334. Also in Freeman [1987b].

PRYWES, N. S., AND LOCK, E. D. 1989. Use of the model equational language and program generator by management professionals. In Frontier Series: Software Reusability: Volume II—Applications and Experience. Biggerstaff, T. J., and Perlis, A. J., Eds. ACM Press, New York, pp. 103–129, Chap. 5.

REPS, T., AND TEITELBAUM, T. 1984. The synthesizer generator. In Proceedings of the ACM SIGSOFT/SIGPLAN Software Engineering Symposium on Practical Software Development Environments (Pittsburgh, Penn., Apr.). ACM SIGSOFT/SIGPLAN, New York, pp. 42–48.

RICH, C., AND WATERS, R. 1983. Formalizing reusable software components. In Workshop on Reusability in Programming (Newport, R. I., Sept.). ITT Programming, ITT, Stratford, Conn., pp. 152–159.

RICH, C., AND WATERS, R. C. 1988. Automatic programming: Myths and prospects. IEEE Comput. 21, 8 (Aug.), 40–51.

RICH, C., AND WATERS, R. C. 1989. Formalizing reusable software components in the programmer’s apprentice. In Frontier Series: Software Reusability: Volume II—Applications and Experience. Biggerstaff, T. J., and Perlis, A. J., Eds. ACM Press, New York, pp. 313–343, Chap. 15. Expanded version of Rich and Waters [1983].

SHAW, M. 1984. Abstraction techniques in modern programming languages. IEEE Softw. 1, 4 (Oct.), 10–26.

SHAW, M. 1989. Larger scale systems require higher-level abstractions. In Proceedings of the 5th International Workshop on Software Specification and Design (May). IEEE Computer Society Press, Los Alamitos, Calif., pp. 143–146.

SHAW, M. 1991. Heterogeneous design idioms for software architectures. In Proceedings of the 6th International Workshop on Software Specification and Design (Como, Italy, Oct.). IEEE Computer Society Press, Los Alamitos, Calif., pp. 1–8.

STANDISH, T. A. 1984. An essay on software reuse. IEEE Trans. Softw. Eng. SE-10, 5 (Sept.), 494–497.

STAUDT, B. J., KRUEGER, C. W., HABERMANN, A. N., AND AMBRIOLA, V. 1986. The GANDALF system reference manuals. Tech. Rep. CMU-CS-86-130, Carnegie-Mellon Univ., Pittsburgh, Penn.

STEFIK, M., AND BOBROW, D. G. 1986. Object-oriented programming: Themes and variations. The AI Mag. 16, 4, 40–62.

TEXAS INSTRUMENTS 1985. The TTL Data Book. Vol. 2. Texas Instruments, Dallas, Tex.

TRACZ, W., ED. 1988. Tutorial: Software Reuse: Emerging Technology. IEEE Computer Society Press, Los Alamitos, Calif.

TSENG, J. S., SZYMANSKI, B., SHI, Y., AND PRYWES, N. S. 1986. Real-time software life cycle with the model system. IEEE Trans. Softw. Eng. SE-12, 2 (Feb.), 358–373.

VOLPANO, D. M., AND KIEBURTZ, R. B. 1985. Software templates. In 8th International Conference on Software Engineering (London, UK, Aug.). IEEE Computer Society Press, Los Alamitos, Calif., pp. 55–60.

VOLPANO, D. M., AND KIEBURTZ, R. B. 1989. The templates approach to software reuse. In Frontier Series: Software Reusability: Volume I—Concepts and Models. Biggerstaff, T. J., and Perlis, A. J., Eds. ACM Press, New York, pp. 247–255, Chap. 9.

WATERS, R. C. 1985. The programmer’s apprentice: A session with KBEmacs. IEEE Trans. Softw. Eng. SE-11, 11 (Nov.), 1296–1320.

WEGNER, P. 1983. Varieties of reusability. In Workshop on Reusability in Programming (Newport, R. I., Sept.). ITT Programming, Stratford, Conn., pp. 30–44. Also in Freeman [1987b].

ZAVE, P. 1984. The operational versus the conventional approach to software development. Commun. ACM 27, 2 (Feb.), 104–118.

ZAVE, P., AND SCHELL, W. 1986. Salient features of an executable specification language and its environment. IEEE Trans. Softw. Eng. SE-12, 2 (Feb.), 312–325.

Received February 1990; final revision accepted October 1991.