Tools and Approaches for Developing Data-Intensive Web Applications: A Survey

PIERO FRATERNALI
Politecnico di Milano

The exponential growth and capillary diffusion of the Web are nurturing a novel generation of applications, characterized by a direct business-to-customer relationship. The development of such applications is a hybrid between traditional IS development and Hypermedia authoring, and challenges the existing tools and approaches for software production. This paper investigates the current situation of Web development tools, both in the commercial and research fields, by identifying and characterizing different categories of solutions, evaluating their adequacy to the requirements of Web application development, highlighting open problems, and exposing possible future trends.

Categories and Subject Descriptors: H.5.4 [Information Interfaces and Presentation]: Hypertext/Hypermedia; D.2.2 [Software Engineering]: Design Tools and Techniques

General Terms: Design, Experimentation, Languages, Reliability

Additional Key Words and Phrases: Application, development, HTML, Intranet, WWW

1. INTRODUCTION

Applications for the Internet in such domains as electronic commerce, digital libraries, and distance learning are characterized by an unprecedented mix of features that makes them radically different from previous applications of information technology [Myers et al. 1996]:

—Universal access by individuals with limited or no skills in the use of computer applications introduces the need for new man-machine interfaces capable of capturing the customer’s attention and facilitating access to information.

—Global availability of heterogeneous information sources requires the integrated management of structured and unstructured content, possibly stored in different systems (databases, file systems, multimedia storage devices) and distributed over multiple sites.

In recent years the World Wide Web has been elected as an ideal platform for developing Internet applications, thanks to its powerful communication paradigm based on multimediality and browsing, and to its open architectural standards, which facilitate the integration of different types of content and systems.

Author’s address: Dipartimento di Elettronica e Informazione, Politecnico di Milano, Piazza Leonardo da Vinci 32, Milano, I-20133, Italy.
Permission to make digital / hard copy of part or all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and / or a fee.
© 2000 ACM 0360-0300/99/0900–0227 $5.00

ACM Computing Surveys, Vol. 31, No. 3, September 1999


Modern Web applications are conveniently described as a hybrid between a hypermedia [Nielsen 1995] and an information system. As in hypermedia, i.e., applications commonly found in CD-ROMs, kiosks, information points, etc., information is accessed in an exploratory way rather than through “canned” interfaces, and the way in which it is navigated and presented is of great importance. As in information systems, the size and volatility of data and the distribution of applications require consolidated architectural solutions based on technologies such as database management systems and client-server computing.

Due to this hybrid nature, the development of a Web application must cope with a number of applicative requirements, such as:

—the necessity of handling both structured data (e.g., database records) and nonstructured data (e.g., multimedia items);

—the support of exploratory access through navigational interfaces;

—a high level of graphical quality;

—the customization and possibly dynamic adaptation of content structure, navigation primitives, and presentation styles;

—the support of proactive behavior, i.e., for recommendation and filtering.

These requirements add up and typically compete with the technical and managerial issues of every software- and data-intensive application, which obviously remain valid for large Web applications also:

—security, scalability, and availability;

—interoperability with legacy systems and data;

—ease of evolution and maintenance.

As has happened in the past with other emerging technologies such as databases and object-oriented programming languages, methodologies and software tools may greatly help in mastering the complexity of innovative applications by fostering a correct understanding and use of a new development paradigm, providing productivity advantages, and thus reducing the risk inherent in application development and migration.

The goal of this survey is to address the software engineering, architectural, and applicative issues of Web development (Section 2); classify and compare current development tools in light of such aspects (Sections 4 to 10); formulate evaluations and perspectives by examining the relationship between state-of-the-practice solutions and relevant research areas (Sections 11, 12, 13 and 14); and to summarize current choices for the developer on the basis of the characteristics of the application (Section 15).

CONTENTS

1. Introduction
2. Perspectives on Web Development
   2.1 Process
   2.2 Models, Languages, and Notation
   2.3 Reuse
   2.4 Architecture
   2.5 Usability
3. Tools for Web Development
4. Visual Editors and Site Managers
5. Web-enabled Hypermedia Authoring Tools
6. Web-DBPL Integrators
7. Web Form Editors, Report Writers, and Database Publishing Wizards
8. Multiparadigm Tools
9. Model-Driven Web Generators
10. Middleware, Search Engines, and Groupware
   10.1 Middleware
   10.2 Search Engines
   10.3 Groupware
11. Evaluation
12. Research Perspectives
13. Research Projects in Data-Intensive Web Development
   13.1 Araneus
   13.2 Autoweb
   13.3 Strudel
   13.4 Web Architect
   13.5 W3I3
14. Background Research
   14.1 Modeling Notation
   14.2 Processes
   14.3 Other Design Tools
15. Conclusions
APPENDIX
List of URLs of Reviewed Products (Alphabetic Order)

2. PERSPECTIVES ON WEB DEVELOPMENT

The development of a Web application is a multifaceted activity, involving not only technical questions, but also organizational, managerial, and even social and artistic issues.

2.1 Process

Although there is still no consensus on a general model of the lifecycle of a Web application,^1 a scheme of typical activities involved in constructing a Web application can be obtained by interpolating the lifecycle models of traditional information systems and the proposals for structured hypermedia design [Garzotto et al. 1995; Isakowitz et al. 1995; Nanard and Nanard 1995; Schwabe and Rossi 1995]. Figure 1 illustrates the lifecycle model used as a reference in this paper.

—Requirements analysis: The mission of the application is established by identifying prospective users and defining the nature of the information base. In addition to the customary requirement collection and feasibility assessment tasks, Web applications designed for universal access require special care in the identification of human-computer interaction requirements, in order to establish the interaction mode most suitable for each expected category of users, and for each type of output device that users are expected to use to connect to the application (ranging from hand-held personal communicators to high definition screens).

—Conceptualization: The application is represented through a set of abstract models that convey the main components of the envisioned solution. In the Web context, conceptualization has a different flavor with respect to the same activity in information system design, because the focus is on capturing objects and relationships as they will appear to users, rather than as they will be represented within the software system. Although the notation may be the same (e.g., the Entity Relationship Model [Chen 1976]), the schemas resulting from the conceptualization of a Web application and a database application usually differ.^2

—Prototyping and validation: Simplified versions of the applications are deployed to users for early feedback. The importance of prototyping is particularly emphasized in the Web context, as it is in hypermedia, because the intrinsic complexity of the interfaces requires a timely evaluation of the joint effectiveness of structure, navigation, and presentation [Nielsen 1996]. Typically, a prototype is built prior to design and on a simplified architecture, e.g., as a set of manually implemented pages containing samples of the application content, which emulate the desired appearance and behavior.

—Design: Conceptual schemas are transformed into a lower-level representation, closer to the needs of implementation, but still independent of the actual content of the information base. Typically, the structural view of the application is mapped to the schema of a storage repository, the navigational view into a set of access primitives over the content repository, and the presentational view into a set of content-independent visual specifications (styles). The latter activity, called visual design, is of great importance in Web application development and is emerging as an autonomous discipline [Sano 1996; Horton et al. 1996].

—Implementation: The repository is filled with new content prepared by domain experts and/or with existing data stored in legacy systems; the actual interfaces, pages in Web terminology, are constructed by embedding repository content and navigational commands into the appropriate presentation style. The mapping of design to implementation requires the choice of the network language in which the application is delivered (e.g., HTML, Java, ActiveX, or a mix thereof), and the decision on the time of “binding” between content and application pages, which can be offline or online.

—Evolution and maintenance: After delivery, changes in the requirements or bug fixes may require the revision of structure, navigation, presentation, or content. Changes are applied as high as possible in the development chain and propagated down to implementation.

[Figure 1. The lifecycle of a Web application. Stages shown: Requirement Analysis; Conceptualization; Design (structure, navigation, presentation); Implementation; Prototyping & Verification; Maintenance & Evolution.]

The process model described caters to a variety of actual processes, whose applicability depends on the specific development context, including the available tool support, financial and time constraints, application complexity, and change frequency. Generally speaking, with applications of limited size and fairly stable requirements, requirement analysis is directly followed by implementation, possibly after a number of iterations through prototyping. Conceptualization and design assume more relevance as the complexity and volatility of the application increase.

^1 Among the current proposals are those of Takahashi and Liang [1997]; Atzeni et al. [1998]; and Fraternali and Paolini [1998].

^2 A macroscopic difference is in the interpretation of relationships, in which database modeling represents semantic associations to be persistently recorded, whereas in Web modeling a navigation possibility is generally implied.

2.2 Models, Languages, and Notation

A Web application is characterized by three major design dimensions:

—Structure describes the organization of the information managed by the application, in terms of the pieces of content that constitute its information base and of their semantic relationships.

—Navigation concerns the facilities for accessing information and for moving across the application content.

—Presentation affects the way in which application content and navigation commands are presented to the user.

To support the representation of application features during the development lifecycle, languages with different levels of formality and abstraction can be used.

At the conceptual level, applications are described by means of high-level primitives which specify the structural, navigational, and presentational views in a way that abstracts from any architectural issue.

Structural modeling primitives describe the types of objects that constitute the information base and their semantic relationships, without commitment to any specific mechanism for storing, retrieving, and maintaining the actual instances of such object types. Examples of notation that can be used to express structural features include the best-known conceptual data models, like the entity-relationship model [Chen 1976] and various object models [Rumbaugh et al. 1991].

Navigation modeling primitives express the access paths to objects of the information base and the available inter- and intraobject navigation facilities, again without committing to any specific technique for implementing access and navigation. The vast range of notation and techniques proposed for the more general problem of human-computer interaction specification can be employed for navigation modeling: data models extended with built-in navigation semantics [Garzotto et al. 1993; Isakowitz et al. 1995; Fraternali and Paolini 1998] or explicitly annotated with behavioral specifications [Kesseler 1995; Schwabe and Rossi 1995]; first-order logic [Garg 1988]; Petri nets [Stotts and Furuta 1989]; finite state machines [Zheng and Pong 1992]; and formal grammars [Jacob 1982] are among the viable options.

Presentation modeling aims at representing the visual elements of the application interfaces in a way that abstracts from the particular language and device used in the delivery. Many different techniques can be used, with a varying degree of formality and rigor, ranging from simple storyboard specification [Madsen and Aiken 1993] to the use of software tools [Myers 1995] and formal methods. The independent specification of presentation, separate from structure and navigation, is particularly relevant in the Web context because the final rendering of the interface depends on the browser and display device, thus it may be necessary to map the same abstract presentation scheme to different designs and implementations.

At the design level, structured or semistructured data models are used to represent the features of an application in a way amenable to manipulation, query, and verification. Typically, design representations are maintained as relational or object-oriented schemas, to exploit the capabilities of database management systems, or as semistructured objects, to cope with information having partial or missing schemas [Florescu et al. 1998].

At the lowest degree of abstraction, applications are represented by means of implementation-level languages, notably network languages directly interpretable by the user’s browsers. At this stage, content, navigation, and presentation are directly embedded in the physical marked-up texts or programs that make up the application.
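As a rough illustration of how the three design dimensions can be kept apart until the implementation level, the following Python sketch models structure as a typed object, navigation as an access primitive over such objects, and presentation as a content-independent style. All names (`Artist`, `index_links`, `Style`, `render_index`) and the sample data are hypothetical and are not drawn from any of the surveyed tools or notations.

```python
from dataclasses import dataclass
from html import escape

# Structure: an object type of the information base (hypothetical example).
@dataclass(frozen=True)
class Artist:
    key: str
    name: str

# Navigation: an access primitive yielding (label, target) pairs,
# independent of how the links will eventually be rendered.
def index_links(artists):
    ordered = sorted(artists, key=lambda a: a.name)
    return [(a.name, f"artist/{a.key}.html") for a in ordered]

# Presentation: a content-independent style.
@dataclass(frozen=True)
class Style:
    heading_tag: str
    list_tag: str

# Implementation level: only here are the three views embedded in markup.
def render_index(title, links, style):
    items = "".join(
        f'<li><a href="{href}">{escape(label)}</a></li>' for label, href in links
    )
    return (
        f"<{style.heading_tag}>{escape(title)}</{style.heading_tag}>"
        f"<{style.list_tag}>{items}</{style.list_tag}>"
    )

artists = [Artist("b2", "Bosch"), Artist("a1", "Arcimboldo")]
page = render_index("Artists", index_links(artists), Style("h1", "ul"))
```

Swapping the `Style` instance changes only the markup, and replacing `index_links` changes only the access path, which is the kind of independence the conceptual and design levels aim for.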

2.3 Reuse

As in any other software process, reuse is an important aspect, related to the facility of building a novel application from existing artifacts [Biggerstaff and Perlis 1989]. Reuse may happen at all levels of the development process: conceptual schemas, design schemas, content, and physical application pages can be reused across applications.

The most common form of reuse on the Web, as in hypermedia, is content reuse, possibly facilitated by the presence of a structured repository, which enables the construction of different applications on top of the same information base. Reuse also occurs at the implementation level, where component-based libraries of self-contained elements (e.g., JavaBeans or ActiveX components) can be plugged into application pages to provide predefined functionalities (e.g., an electronic payment facility). Generation-based reuse focuses instead on reusing transformation techniques from design frameworks or partially instantiated implementations to full implementations (e.g., by generating application pages from page skeletons and database content).
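The example of generating application pages from page skeletons and database content can be sketched as follows. This is a minimal illustration using Python's standard `sqlite3` and `string.Template` modules; the `product` table, its columns, the skeleton markup, and the function name `generate_pages` are assumptions made for the example, not features of any surveyed product.

```python
import sqlite3
from string import Template

# Hypothetical page skeleton, reused for every row of the table.
SKELETON = Template("<html><body><h1>$title</h1><p>$price EUR</p></body></html>")

def generate_pages(conn):
    """Return {file name: HTML}, with one generated page per product row."""
    rows = conn.execute("SELECT id, title, price FROM product ORDER BY id")
    return {
        f"product_{pid}.html": SKELETON.substitute(title=title, price=f"{price:.2f}")
        for pid, title, price in rows
    }

# An in-memory database stands in for the content repository.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, title TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [(1, "Lamp", 19.5), (2, "Desk", 120.0)],
)
pages = generate_pages(conn)
```

The reusable artifact here is the transformation itself (skeleton plus query): adding rows to the table yields new pages without any further authoring.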

2.4 Architecture

Architecture is the subject of design and implementation and reflects the spatial arrangement of application data and the spatio-temporal distribution of computation.

The minimal spatial configuration of a Web application is the so-called two-tier architecture, shown in Figure 2, which closely resembles the traditional client-server paradigm. The difference from client-server is that in the two-tier solution clients (i.e., browsers) are thin, that is, lightweight applications responsible only for rendering the presentation. Application logic and data reside on the server side.

A more advanced configuration, called three- or multitier architecture (Figure 3), separates the application logic from data, introducing an additional distinction of responsibilities at the back-end side. The presence of one or more distinct application tiers enables the implementation of advanced architectures that integrate the traditional HTTP protocol and client-server application distribution protocols for better performance, scalability, reliability, and security.

An orthogonal architectural issue is the time of binding between the content of the information base and the application pages delivered to the client, which can be static, when pages are computed at application definition time and are immutable during application usage; or dynamic, when pages are created just-in-time from fresh content. Dynamicity has further nuances: it may involve only content (navigation and presentation are static), or extend to presentation and navigation.
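The difference between static and dynamic binding can be illustrated with a small Python sketch. The classes `StaticSite` and `DynamicSite`, the `render` function, and the sample content are hypothetical; the sketch only contrasts when pages are computed, and it covers the simplest nuance, in which content alone is dynamic.

```python
def render(item):
    """Embed one content item into a fixed presentation."""
    return f"<p>{item['name']}: {item['stock']} in stock</p>"

class StaticSite:
    """Static binding: pages are computed once, at definition time."""
    def __init__(self, content):
        self._pages = {key: render(item) for key, item in content.items()}

    def get(self, key):
        return self._pages[key]

class DynamicSite:
    """Dynamic binding: pages are computed at request time from current content."""
    def __init__(self, content):
        self._content = content

    def get(self, key):
        return render(self._content[key])

content = {"lamp": {"name": "Lamp", "stock": 3}}
static_site = StaticSite(content)
dynamic_site = DynamicSite(content)

# The information base changes after the application has been defined.
content["lamp"]["stock"] = 2
```

After the update, `static_site.get("lamp")` still serves the page computed at definition time, whereas `dynamic_site.get("lamp")` reflects the fresh content.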

2.5 Usability

From the customer’s perspective, usability is the most important quality of a Web application. Although a thorough discussion of the emerging Web usability engineering field is outside the scope of this paper (we refer the reader to Nielsen [1996]), it is possible to identify a set of general criteria to use in assessing Web usability:

—Degree of visual quality indicates the overall coherence of the presentation and navigation metaphors and the individual quality of the graphic resources.

—Degree of customization measures the facility of tailoring the application interface to individual users or user groups. At one end of the spectrum, applications may have a fixed interface; at the other end, content, presentation, and navigation may be individually personalized.

—Degree of adaptivity is proportional to the runtime flexibility of the interface; at one end, the application may be immutable; at the other, it may track the user’s activity and change accordingly [Fraternali and Paolini 1998].

—Degree of proactivity indicates the capability of an application to interact with the user on its own initiative. Applications can be passive, i.e., provide information only in response to a user’s requests, or proactive, i.e., able to send information to users upon the occurrence of relevant events.

[Figure 2. Two-tier architecture. Elements shown: Client; Server, hosting the Application Logic and the Database.]

3. TOOLS FOR WEB DEVELOPMENT

In response to the complexity factors of Web application development described above, many software vendors are providing instruments for supporting the construction of new applications or the migration of existing ones.

Notwithstanding their common goal, the available products greatly diverge in many aspects, reflecting different conceptions of the nature of a Web application and of its development process.

To help understand the state of the practice, we have grouped existing tools into six categories, which exhibit homogeneous features. This grouping is the result of a broad review, which has considered over forty products, listed in Appendix 1.

The individual categories are

(1) visual editors and site managers;

(2) Web-enabled hypermedia authoring tools;

(3) Web-DBPL integrators;

(4) Web form editors, report writers, and database publishing wizards;

(5) multiparadigm tools;

(6) model-driven application generators.

The order of presentation of the different categories reflects the increasing level of support that tools in each category offer to the structured development of Web applications.

Category 1 contains productivity tools that evolved directly from the HTML editor, which do not really support the development of large-scale Web-database applications, but are nonetheless interesting because they pioneered many concepts (like presentation styles and top-down site design) later integrated into more complex environments. The same limitation in the degree of support for structured development of large applications is shared by Category 2, which originates from a different application domain, offline hypermedia publishing, but recently added facilities for Web and database integration; the interest in these tools is motivated by their nonconventional (with respect to standard software engineering practice) approach to application design and specific focus on navigation and presentation. Category 3 is the first one that explicitly addresses the integration of Web and databases to achieve a higher level of scalability, and includes very powerful, yet basic, products. These products can be used to implement applications on top of large databases, although at the price of a substantial programming effort. Category 4 takes quite a different, but still database-centric, approach to Web development by addressing the migration of client/server, form-based applications. These tools aim at augmenting the implementor’s productivity in such tasks as form editing, report writing, and event-based programming; they also offer a higher level of support with respect to Category 3, but still concentrate only on the implementation phase. Category 5 contains a number of tools whose common feature is the integration of different development approaches and technologies, drawn from the previous four tool families. Finally, Category 6 includes those products (actually two products) that provide complete coverage of all the development activities, from conceptualization to implementation, by leveraging state-of-the-art software engineering techniques.

[Figure 3. Three-tier architecture. Elements shown: Client; Application Server, hosting the Application Logic; Data Server, hosting the Database.]

To complete the overview of Web application development, in Section 10 we briefly mention other classes of products that are not directly concerned with the specific focus of this paper on the design and maintenance of Web sites, but either cover fundamental architectural issues like distribution, reliability, performance, and security, or provide specialized facilities to users and site administrators, like advanced search functions and collaborative work support.

4. VISUAL EDITORS AND SITE MANAGERS

This class includes authoring and site management environments originally conceived to alleviate the complexity of writing HTML code and of maintaining the pages of a Web site in the file system. In a typical configuration these products bundle a WYSIWYG editor, which lets the user design sophisticated HTML pages without programming, and a visual site manager, which displays in a graphical way the content of a Web site and supports functions like page upload, deletion, and renaming, and broken link detection and repair. Among the many products in this category are Adobe SiteMill and PageMill, NetObject Inc.’s Fusion, SoftQuad’s HotMetal, Claris Home Page, Macromedia’s Backstage Designer, and Microsoft’s FrontPage.
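To make the notion of broken link detection concrete, the sketch below scans a site held in memory and reports internal links whose target page does not exist. It is only an assumption about how such a check could be written, using Python's standard `html.parser` module; it does not describe how any of the products named above implement the feature. The site is assumed to be flat (all pages in one directory), and external links are skipped rather than verified.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href values of all anchor tags in a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(v for name, v in attrs if name == "href" and v)

def broken_links(site):
    """site maps page paths to HTML text.

    Returns a sorted list of (page, missing target) pairs for internal links.
    """
    broken = []
    for path, html in site.items():
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            # Skip external URLs, in-page anchors, and mail links.
            if "://" in href or href.startswith(("#", "mailto:")):
                continue
            if href not in site:
                broken.append((path, href))
    return sorted(broken)

# Hypothetical two-page site with one dangling link.
site = {
    "index.html": '<a href="about.html">About</a> <a href="news.html">News</a>',
    "about.html": '<a href="index.html">Home</a> <a href="http://www.acm.org/">ACM</a>',
}
```

Calling `broken_links(site)` on this sample reports the link from `index.html` to the missing `news.html`; a repair function could then rewrite or remove the offending anchors.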

From a lifecycle perspective, these products address the implementation and maintenance of Web sites; implementation is supported by content production functions and maintenance by site-level management facilities. The most advanced products (e.g., NetObject’s Fusion and Microsoft FrontPage) also offer a rather rudimentary approach to design, whereby it is possible to separate the definition of the hierarchical structure of a site from the authoring of content.

Automation concentrates on content production by generating code from visual page designs. Support for code generation is particularly relevant in those tools (like Macromedia’s DreamWeaver, SoftQuad’s HotMetal Pro 5.0, Allaire’s HomeSite, and many more) that support the latest extensions of HTML, like Cascading Style Sheets (CSS) [World Wide Web Consortium 1998a] for content-independent presentation specification, and the Document Object Model (DOM) [World Wide Web Consortium 1998b], for the object-oriented representation of page elements and their manipulation via a scripting language.

Some tools are able to generate part of the navigation logic by automatically inserting navigation buttons into pages based on their position in the layout of the site (Figure 4). Development abstractions are basically at the implementation level and oriented towards structure and navigation (pages and links); some tools also provide presentation abstractions in the form of page templates or “themes,” i.e., groups of graphic resources that can be applied to multiple pages to obtain visual consistency (Figure 5).
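The idea of deriving navigation commands from a page's position in the site hierarchy can be sketched as follows. The representation of the hierarchy as a dictionary from each page to its ordered children, the link labels ("Up", "Previous", "Next"), and the function names are all assumptions made for illustration; the surveyed tools may compute and render navigation differently.

```python
def navigation_links(children, page):
    """Return (label, target) pairs for a page, given the site hierarchy.

    children maps each page name to the ordered list of its child pages.
    """
    parent = next((p for p, kids in children.items() if page in kids), None)
    links = []
    if parent is not None:
        links.append(("Up", parent))
        siblings = children[parent]
        i = siblings.index(page)
        if i > 0:
            links.append(("Previous", siblings[i - 1]))
        if i < len(siblings) - 1:
            links.append(("Next", siblings[i + 1]))
    # Children are linked by their own names.
    links.extend((child, child) for child in children.get(page, []))
    return links

def navigation_bar(children, page):
    """Render the links of a page as a simple HTML navigation bar."""
    return " | ".join(
        f'<a href="{target}.html">{label}</a>'
        for label, target in navigation_links(children, page)
    )

# Hypothetical three-level site layout.
site = {"home": ["products", "contacts"], "products": ["lamps", "desks"]}
```

Because the bar is computed from the hierarchy, moving a page in the layout changes its navigation commands without any manual editing of the page itself.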

Reuse happens mostly at the implementation level: content objects (e.g., multimedia, graphics, but also object-based components like JavaBeans and ActiveX controls) can be reused across applications.

The architecture originally supported by these tools is file-based and two-tier, and the binding between an application and its content is static. Recently, the most advanced tools have added functionalities for the dynamic connection to live database data, as discussed in Section 8.

Since development is conducted mostly by hand, the level of customization and the resulting visual quality of the final application are proportional to the effort spent and can be arbitrarily high; coherence of presentation and navigation metaphors is facilitated by visual templates and themes, although in this case customization becomes more expensive.

The features of these tools are shownin Table I: in summary, they are anexcellent solution for small- to medium-

sized applications, where publishinglarge information bases stored in aDBMS is not the major issue. The lackof a design-level schema of the applica-tion, independent of content, requiresthat the application features be definedinstance-by-instance, and is a major ob-stacle to scale-up and integration oflarge masses of data. These limitations,however, are going to become less criti-cal as these products become more inte-grated with Web-database tools.

5. WEB-ENABLED HYPERMEDIAAUTHORING TOOLS

Figure 4. Hierarchical site layout in NetObject’s Fusion.

Web-enabled hypermedia authoring tools share the same focus on authoring as visual HTML editors, but have a different origin because they were initially conceived for the development of offline hypermedia applications, and have been extended to support the generation of applications for the Web in recent times. The best known representatives of this class of products are Asymetrix’s Toolbook II Assistant/Instructor, Macromedia’s Director and Authorware, Formula Graphics’ Multimedia 97, Aimtech IconAuthor, and Allen Communication Quest. Most of these products have both Web export facilities and database connectivity, but these extensions are not always compatible, and the developer must often choose between the two possibilities. These products are characterized by the following features:

—The authoring metaphor used in the definition of the application. Examples of such metaphors are: flowchart (Quest, Authorware, IconAuthor), book (Toolbook), and movie making (Director).

—The way navigation is defined, which may require programming in a scripting language, or may be supported by visual programming and wizard tools coherent with the product’s authoring metaphor (e.g., in Director, objects’ synchronization is defined by editing the score for the cast members of a stage; see Figure 6).

—The type of database connectivity, which may range from support of an internal database, to an external database accessed via gateway software (typically ODBC or JDBC), to an external database accessed through the DBMS API.

—The type of Web connectivity, which may be achieved by means of a plug-in application extending a Web browser, or by exporting the hypermedia application into a network language. Web connectivity may affect database connectivity, because in the present version of most tools, Web export can be done only for applications not connected to a database.

—The export language may be HTML, Java, or a mix of both.

Figure 5. Graphic styles in NetObject’s Fusion.

Table I. Synopsis of Visual Editors and Site Managers

| Dimension | Features |
|---|---|
| Process: Lifecycle Coverage | Design (limited to hierarchical layout definitions); Implementation; Maintenance |
| Process: Automation | HTML generation from WYSIWYG page editing; Generation of commands for hierarchy navigation |
| Abstractions | Implementation-level: pages, links, presentation styles |
| Reuse | Plug-in components; Reusable presentation styles |
| Architecture | Two-tiers, based on file system; Static binding of content to pages |
| Usability | High graphical control through manual authoring; High coherence through use of presentation styles; Low customization, no adaptivity, no proactivity |

Basically, hypermedia tools are authoring assistants; their focus is restricted to the ad hoc implementation of long-lived applications published once and for all in CD-ROMs or information kiosks. As a consequence, they still have little provision for a structured development lifecycle, and offer limited support to application modeling, design, testing, and maintenance.

To aid the implementation of large applications, some tools separate authoring in-the-large (i.e., the definition of the overall application structure) from authoring in-the-small (i.e., the detailed definition of the multimedia content of individual information nodes) [GMP95]. This distinction fosters an elementary form of design which helps in reconfiguring the application structure and navigation, independent of content. For example, tools like Quest, Authorware, and IconAuthor distinguish between a “design mode,” where information nodes are placed on a flowchart, and an “edit mode,” where each individual node can be edited. A parallel can be established between authoring in-the-large in hypermedia authoring tools and the site layout view offered by some visual HTML editors: in both cases, design is done at the instance level, without the help of a content-independent application schema, and navigation is (in part) automatically generated from the abstract structure of the application.

Figure 6. The film-making authoring metaphor of Macromedia’s Director.

Reuse is pursued both at the design level by means of reusable substructures and application-independent layouts, and at the implementation level by leveraging libraries of scripts or multimedia components. Recent versions of most tools also offer interoperability with third-party components (for example, inclusion of Java applets and ActiveX components).

Regarding the development abstractions, hypermedia tools have a strong plus in the built-in facilities for explicitly modeling navigation paths orthogonal to application structure; for example, guided tours can be defined as sequential navigations based on associative properties of objects that may cut across the hierarchical structure of the application. Other navigational aids include application-wide search functions, glossaries, and access indexes.
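As a rough illustration of a navigation path that is orthogonal to application structure, the following Python sketch derives a guided tour from an associative property of the nodes rather than from their position in the hierarchy. The `Node` class, the `topic` attribute, and the sample data are hypothetical and are not taken from any of the tools discussed here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An information node placed somewhere in the application hierarchy."""
    title: str
    topic: str                                   # associative property used by the tour
    children: list = field(default_factory=list)


def walk(node):
    """Yield every node of the hierarchy in depth-first order."""
    yield node
    for child in node.children:
        yield from walk(child)


def guided_tour(root, topic):
    """Return (current, next) title pairs linking all nodes that share a topic."""
    stops = [n.title for n in walk(root) if n.topic == topic]
    return list(zip(stops, stops[1:]))


# Hypothetical hierarchy: the two "renaissance" nodes sit in different branches.
site = Node("Home", "general", [
    Node("Paintings", "general", [Node("Mona Lisa", "renaissance"),
                                  Node("Guernica", "modern")]),
    Node("Sculptures", "general", [Node("David", "renaissance")]),
])

tour = guided_tour(site, "renaissance")
print(tour)
```

The resulting tour connects nodes from two different branches, which is the sense in which such a path cuts across the hierarchy.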

Like visual HTML editors, hypermedia authoring tools

—do not provide a schema for distinguishing the type of objects constituting the application (besides the low-level distinction between objects of different media types). The main development abstractions are generic container objects (pages, icons, stages) whose definition intermixes structural, presentation, and behavioral aspects (and navigational features in those tools that do not support the separate specification of navigation);

—offer presentation abstractions in the form of page styles, i.e., sets of graphic resources applied uniformly to the entire application or to part of it.

The architecture of hypermedia tools shows substantial immaturity with respect to competitor solutions; all tools were originally conceived for offline, file-based publication, and later upgraded to external database connectivity and Web export. Most products offer database connectivity via external libraries using ODBC, but lose this feature after Web export. Moreover, translation from native proprietary format into a network language is still rudimentary, and the most advanced features (like those obtained by exploiting the tool’s scripting language) do not carry over to the Web version of an application.

Conversely, usability is the major strength of this category of products, which focus on delivering very sophisticated user interfaces exhibiting a degree of control over graphic accuracy and multimedia synchronization hardly available through other means. The inherent navigational design paradigm, coupled to very effective and well-established aids like guided tours and user-defined access indexes, which most tools offer for free or with little programming effort, contributes to the deployment of applications that are very effective and much closer to the kind of communication found in high quality, hand-developed Web sites.

Table II summarizes the Web-related features of hypermedia authoring tools.

6. WEB-DBPL INTEGRATORS

The common denominator of products in this category is the dynamic production of Web pages from information stored in a database by integrating databases and Web technology at the language level.

Different network languages (notably HTML and Java) are merged with a database programming language (DBPL) to obtain an intermediate application programming language capable of making the best of both Web and database technology. Web-DBPL integration is pursued in three distinct ways:

—HTML extensions add new tags to HTML to embed both generalized programming constructs and database-specific capabilities. The developer writes page templates, which intermix declarative HTML tags and executable code. Products in this category can be further distinguished based on the language paradigm used to extend HTML (imperative, object-based, object-oriented). Page templates must be translated into pure HTML; this task is normally executed at runtime by an interpreter. The performance of the different products varies, based on the process in which the interpreter is executed and on the way the connection to the database is managed. Examples of HTML extensions include Allaire Inc.’s Cold Fusion Web Database Construction Kit; Microsoft’s Active Server Pages (ASP) and Internet Database Connector (IDC); Vignette Corporation’s StoryServer; HAHT Software’s HahtSite; and Informix AppPages. In general, HTML extensions have a broader objective than the mere addition of database connectivity, and include primitives for overcoming HTML’s well-known limitations, like the absence of control flow instructions, modularization mechanisms, and stateful interaction.³

—DBPL extensions take the opposite route and add specialized functions to an existing DBPL or to a general-purpose programming language with a database interface, to enable the output of HTML code as a result of program execution. Developers write database applications that construct HTML output from data retrieved from the database. This approach requires database programming skills, but may yield better performance when DBPL programs are executed and optimized directly by the DBMS. Examples of DBPL extensions include Oracle’s PL/SQL Web Toolkit, Sybase’s PowerBuilder Web.PB class library, and Borland’s Web interface for the Delphi Client Server Suite.

—Java extensions add database connectivity to Java. The best-known effort in this direction is the JDBC open standard, which offers an ODBC-like architecture for Java applications that want to interact with a DBMS. Unlike HTML extensions, connectivity is added in the form of a library masking, under a uniform callable interface, the details of interacting with different proprietary DBMSs and SQL dialects.
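The difference between the first two integration styles can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the `<dbquery>` tag and the `#column#` placeholder syntax are invented for this sketch and do not reproduce the syntax of any product named above, and an in-memory SQLite database stands in for the DBMS.

```python
import re
import sqlite3
from html import escape


def make_db():
    """Create a small in-memory database standing in for the content repository."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE product (name TEXT, price REAL)")
    db.executemany("INSERT INTO product VALUES (?, ?)",
                   [("Lamp", 19.5), ("Desk", 120.0)])
    return db


# Style 1, HTML extension: a page template mixes HTML with a hypothetical
# <dbquery> tag; an interpreter turns the template into pure HTML at runtime.
TEMPLATE = ('<ul><dbquery sql="SELECT name, price FROM product ORDER BY name">'
            '<li>#name#: #price#</li></dbquery></ul>')


def interpret(template, db):
    """Expand each <dbquery> element into one copy of its body per result row."""
    def expand(match):
        sql, body = match.group(1), match.group(2)
        cursor = db.execute(sql)
        columns = [c[0] for c in cursor.description]
        out = []
        for row in cursor:
            text = body
            for col, value in zip(columns, row):
                text = text.replace(f"#{col}#", escape(str(value)))
            out.append(text)
        return "".join(out)
    return re.sub(r'<dbquery sql="([^"]*)">(.*?)</dbquery>', expand,
                  template, flags=re.S)


# Style 2, DBPL extension: a database program constructs the HTML itself.
def product_page(db):
    """Build the same page procedurally from the query result."""
    rows = db.execute("SELECT name, price FROM product ORDER BY name")
    items = [f"<li>{escape(name)}: {price}</li>" for name, price in rows]
    return "<ul>" + "".join(items) + "</ul>"


db = make_db()
page_from_template = interpret(TEMPLATE, db)
page_from_program = product_page(db)
print(page_from_template == page_from_program)
```

Both routes yield the same page; what changes is whether the developer writes markup with embedded queries or a program that emits markup. The third style is loosely mirrored by the `sqlite3` module itself, which follows the Python DB-API, a uniform callable interface that plays a role comparable to that of JDBC for Java.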

Web-DBPL integrators are tools for the programmer, who may use them to simplify the task of gluing together Web pages and database queries by hand; they are different from visual HTML editors and hypermedia authoring tools, which offer small-scale application development functions to nonprogrammers. Due to their focus on implementation languages, these products lack high-level abstractions for describing applications, and thus do not assist the developer in identifying the structure, navigation, and presentation of an application, which is directly implemented by manually writing the necessary HTML page templates or DBPL programs.

³ Stateful interaction is the capability of retaining the application status between two consecutive client’s requests; this is not supported by HTTP/1.0.

Table II. Synopsis of Web-Enabled Hypermedia Authoring Tools

| Dimension | Features |
|---|---|
| Process: Lifecycle Coverage | Design (support of authoring-in-the-large); Implementation |
| Process: Automation | HTML/Java generation from WYSIWYG authoring; Synchronization of multimedia objects; Generation of commands for navigation |
| Abstractions | Implementation-level authoring metaphors: pages, stages, flowcharts |
| Reuse | Script libraries, application skeletons, plug-in components, reusable presentation styles |
| Architecture | Monolithic; Two-tiers through Web export; Three-tiered through libraries for database connectivity; Static binding of content to pages |
| Usability | High graphical control through manual authoring; High coherence through use of presentation styles; Customization through user-defined access indexes; No adaptivity, no proactivity |

Reuse may be achieved either through the native constructs of the programming language, i.e., packages and modules, or through ad-hoc modularization mechanisms such as inclusion between HTML page templates.

The target architecture is database-centered and three-tiered; the most sophisticated products also permit multiple database tiers and distributed queries.

Not having any specific support for presentation and navigation modeling, these products have no direct impact on the usability and graphical quality of the user interface, which must be designed separately.

The major difference between Web-DBPL integrators and visual HTML editors and hypermedia authoring tools is that the former introduce a clear separation between the final application pages and the content repository, the schema of which acts as an implicit specification of the application structure.

This distinction has several consequences on the application-development process:

—it permits a more effective manipulation of the information base because changes to content can be made without affecting navigation and presentation;

—it exposes the different object types that constitute the application, although these types are not elicited from the structural, navigational, and presentation requirements of the application, but stem from the design of the underlying database;

—thanks to explicit object typing, the uniformity and coherence of the presentation are facilitated, because all objects of the same type may be easily visualized in the same way. As a counterpart, customization at the instance level must be explicitly programmed by identifying exceptional cases and treating them in an ad hoc manner.
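The last consequence listed above can be made concrete with a small Python sketch: one presentation rule per object type gives uniformity for free, while any instance-level customization has to be coded as an explicit exception. The type names, markup, and override table are hypothetical and serve only as an illustration.

```python
from html import escape

# One presentation rule per object type (hypothetical types and markup).
TYPE_TEMPLATES = {
    "artist": "<h1>{name}</h1><p>Born in {birthplace}</p>",
    "album": "<h2>{title}</h2><p>Released in {year}</p>",
}

# Instance-level customization: exceptional cases are listed one by one.
INSTANCE_OVERRIDES = {
    ("album", 2): "<h2 class='featured'>{title}</h2><p>Released in {year}</p>",
}


def render(obj_type, obj_id, attributes):
    """Render an object with its type template unless an override exists."""
    template = INSTANCE_OVERRIDES.get((obj_type, obj_id),
                                      TYPE_TEMPLATES[obj_type])
    safe = {key: escape(str(value)) for key, value in attributes.items()}
    return template.format(**safe)


albums = [(1, {"title": "First", "year": 1997}),
          (2, {"title": "Second", "year": 1999})]
pages = [render("album", album_id, attrs) for album_id, attrs in albums]
print(pages)
```

Changing the `album` entry in `TYPE_TEMPLATES` restyles every album at once, whereas each special case adds an entry that must be maintained by hand.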

Being database-driven, applications built with Web-DBPL integrators may leverage the reactive capabilities of the underlying DBMS (e.g., database triggers) to achieve adaptivity and proactivity. For example, an Oracle trigger could react to user-generated events and produce HTML pages adapted to the specific situation.
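A minimal sketch of this idea follows. It uses SQLite rather than Oracle, purely so that the example is self-contained, and the table names, the trigger, and the `home_page` function are assumptions made for illustration: a trigger maintains a visitor profile as requests are logged, and page generation reads that profile to adapt its output.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE page_view (visitor TEXT, category TEXT);
CREATE TABLE interest  (visitor TEXT, category TEXT, hits INTEGER,
                        PRIMARY KEY (visitor, category));
-- The trigger reacts to each logged request and updates the visitor profile.
CREATE TRIGGER track_interest AFTER INSERT ON page_view
BEGIN
    INSERT OR IGNORE INTO interest VALUES (NEW.visitor, NEW.category, 0);
    UPDATE interest SET hits = hits + 1
     WHERE visitor = NEW.visitor AND category = NEW.category;
END;
""")


def log_view(visitor, category):
    """Record a user-generated event; the trigger does the rest."""
    db.execute("INSERT INTO page_view VALUES (?, ?)", (visitor, category))


def home_page(visitor):
    """Generate a page adapted to the category the visitor requested most."""
    row = db.execute(
        "SELECT category FROM interest WHERE visitor = ? "
        "ORDER BY hits DESC, category LIMIT 1", (visitor,)).fetchone()
    highlight = row[0] if row else "general"
    return f"<h1>Welcome</h1><p>Recommended section: {highlight}</p>"


for category in ("jazz", "rock", "jazz"):
    log_view("ann", category)

print(home_page("ann"))
print(home_page("bob"))
```

The adaptation logic lives in the database, so every application that logs requests into `page_view` benefits from it without further code.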

In spite of the above positive side-effects, Web-DBPL integrators cannot be considered development tools in themselves, but are comparable to traditional client/server 4GL because they provide a high-level programming interface masking lower-level architectural details; as such they are often used to build more sophisticated products, like the ones reviewed next.

Table III summarizes the most important features of Web-DBPL integrators.

7. WEB FORM EDITORS, REPORT WRITERS, AND DATABASE PUBLISHING WIZARDS

This category collects a large number of products, all having the common characteristic of leveraging traditional database design concepts and development tools, to rapidly deploy both new and legacy data-intensive applications on the Web. A first classification can be obtained by considering the functions offered by the tools, ordered by complexity and sophistication:

—Database export: support is limited to the publication of basic database elements, i.e., tables and views, and is achieved by automatically mapping database content into a network language, notably HTML. The export can be either static or dynamic, the latter case requiring on-the-fly HTML generation, which is normally implemented by means of a Web-DBPL integrator. A certain level of personalization can be obtained by applying sets of preferences to drive the output of Web pages. A typical example is the export of tables and queries in Microsoft’s Access97.

—Report writing: more complex read-only applications can be obtained by defining custom reports, which can be exported to the Web either statically or dynamically. Dynamic publication requires a report interpreter working side by side with the Web server. Examples include Crystal Report Print Engine, Oracle’s Reports for the Web, and Microsoft’s Access97 report export facility.

—Form-based application development: assistance extends to the construction of interactive, form-based applications for accessing and updating data, and more generally for implementing business functions.
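The simplest of the three functions, database export, amounts to a mechanical mapping from a table or view to markup. The Python sketch below shows one possible shape of such a mapping over an in-memory SQLite database; the `export_table` function and its preference keys (`border`, `caption`) are hypothetical and do not correspond to the options of any product mentioned above.

```python
import sqlite3
from html import escape


def export_table(db, table, preferences=None):
    """Map one table or view to an HTML table.

    `table` is assumed to be a trusted identifier chosen by the developer,
    since it is interpolated into the query text.
    """
    prefs = {"border": 1, "caption": table, **(preferences or {})}
    cursor = db.execute(f'SELECT * FROM "{table}"')
    header = "".join(f"<th>{escape(col[0])}</th>" for col in cursor.description)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
        for row in cursor)
    return (f'<table border="{prefs["border"]}">'
            f'<caption>{escape(prefs["caption"])}</caption>'
            f"<tr>{header}</tr>{body}</table>")


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE book (title TEXT, year INTEGER)")
db.execute("INSERT INTO book VALUES ('Dune', 1965)")

# Static export: run once and store the result as a file on the Web server.
# Dynamic export: call the same function on every request instead.
html_page = export_table(db, "book", {"caption": "Books"})
print(html_page)
```

In this sketch the only difference between static and dynamic export is when the function is called, which is why the dynamic case needs a component that runs alongside the Web server.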

Form-based application development is where the highest number of products concentrate, and developers are faced with an impressive spectrum of alternatives. Among the reviewed products, we cite Microsoft’s Visual InterDev, Visual Basic 5, and Access97, Oracle Developer 2000, Inprise IntraBuilder, Sybase’s PowerBuilder, Apple’s WebObjects, NetDynamic’s Visual Studio, Asymetrix SuperCede Database Edition, and Allaire’s Cold Fusion Application Wizards. The most relevant dimensions characterizing the different products are as follows:

—The application delivery format can be pure HTML, a combination of HTML and Java, HTML plus proprietary component technology and scripting languages (e.g., client-side ActiveX objects, VBScript or JavaScript tags), or a totally proprietary format (e.g., ActiveX documents, Sybase’s PowerBuilder PBD files). The delivery format affects the client’s performance and the application’s portability, with minimal overhead and maximum portability in the pure HTML case, and maximum overhead and minimal portability when the application is deployed in a totally proprietary format interpreted by a browser’s plug-in application.

—The application development language(s), which may be different from the delivery language(s), to preserve existing software investments and programming skills and/or increase performance and portability. Overall, the spectrum of bindings between development and delivery languages is extremely varied, ranging from products offering a single development language and multiple output formats (e.g., Oracle’s Developer 2000 maps PL/SQL to HTML, Java, and Visual Basic) to products translating different input languages into the same output format (e.g., Microsoft’s Visual InterDev maps design-time ActiveX controls scripted in VBScript or JavaScript into pure HTML).

—The implementation paradigm, which can be 3GL programming, Rapid Application Development (RAD), wizard-based, or a mix thereof. Almost all tools offer RAD functionalities, letting programmers define an application by dragging and dropping components onto a form, and by visually customizing their properties. Behavior customization, as necessary for the definition of complex forms, can be added by writing code in the tool’s development language. A few products (e.g., Visual InterDev, IntraBuilder, NetDynamics, Cold Fusion) also include a simplified development path for nonprogrammers, based on “wizards” enabling the automatic generation of code from sets of preferences selected through a visual interface (Figure 7). Wizards are applied to obtain both components and complete applications; in the latter case they deliver “canned” database applications, in which the data structure and the interface logic are predefined (e.g., master-detail data entry and visualization, drill-down reports, and so on).

—The execution architecture: the way in which an application can be partitioned among the different tiers of the Web architecture is even more varied than the choice of development and delivery languages, because object distribution (as supported by languages like Java and architectures like CORBA and DCOM) permits the same functionality, e.g., a database driver, to be located either in the client or in the application server.

Table III. Synopsis of Web-DBPL Integrators

| Dimension | Features |
|---|---|
| Process: Lifecycle Coverage | Structural design (design of underlying database); Implementation; Content maintenance |
| Process: Automation | Web-database communication; Query shipment and result processing; Page formatting from query results |
| Abstractions | Design-level: database structures (tables, classes); Implementation-level: pages, interface objects |
| Reuse | Reusable modules: page templates, procedures, classes |
| Architecture | Three/multitier; Dynamic binding of content to pages |
| Usability | No specific support for control of graphical quality; Coherence facilitated by page templates (HTML extensions); Low customization, adaptivity and proactivity manually implementable through database triggers |

From a software engineering point of view, products in this class are to Web development what Integrated Development Environments (IDEs) and Rapid Application Development tools were to traditional application programming: instruments for boosting individual and team productivity in the implementation phase. The best tools also help process management, testing, and maintenance by leveraging standard software engineering techniques like source and version control, configuration management, and visual debugging.

With one exception, reviewed later, form-based application generators do not provide upper-CASE capabilities: they do not encompass conceptual modeling, nor support the model-driven design and implementation of applications. Typically, data structures, business logic, and interfaces are specified and designed separately and then implemented with the tool of choice.

Component-based reuse is also intrinsic to most products, which leverage the various object-level interoperability standards emerging on the Web. Components can be either client-side, in which case they encapsulate typical interface controls (e.g., menus, edit fields, grids, etc.) or server-side, in which case they typically wrap database access and data formatting primitives.

The development abstractions are interface-oriented, and drawn from the traditional client/server, database-centric world: query and data entry forms, input/output controls, record lists, and reports. The structure of the application is dictated by the forms that compose it, and is generally isomorphic to (a subset of) the database schema, especially when forms are automatically derived from database tables and views through database wizards. More complex patterns can be obtained by exploiting canonical structures like table lookups and master-details, or by hand-programming nonstandard combinations of objects. The navigation semantics is deeply embedded within the application, and hyperlinks are emulated by means of form-to-form connections. In some cases, navigation is automatically inferred by the tool from database structure, as in automatically generated master-details and drill-down applications, where database foreign key constraints are translated into HTML hyperlinks. Presentation is addressed as an independent issue only in those products that provide a notion of “presentation style” (e.g., models in Access97, themes in InterDev, master pages in IntraBuilder) which can be uniformly applied to all application pages to obtain a consistent look.
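The translation of foreign key constraints into hyperlinks, mentioned above, can be sketched as follows. The example reads the constraints from the SQLite catalog with `PRAGMA foreign_key_list` and renders each foreign key value as a link to the referenced record; the schema, the `/table/id` URL scheme, and the function names are assumptions made for this illustration and do not describe how any reviewed product works internally.

```python
import sqlite3
from html import escape

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders   (id INTEGER PRIMARY KEY, item TEXT,
                       customer_id INTEGER REFERENCES customer(id));
INSERT INTO customer VALUES (1, 'Rossi');
INSERT INTO orders VALUES (10, 'Lamp', 1), (11, 'Desk', 1);
""")


def foreign_keys(table):
    """Return (column, parent_table) pairs declared on a trusted table name."""
    rows = db.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
    # Each row is (id, seq, table, from, to, on_update, on_delete, match).
    return [(row[3], row[2]) for row in rows]


def detail_row(table, row_id):
    """Render one record, turning each foreign key value into a hyperlink."""
    cursor = db.execute(f'SELECT * FROM "{table}" WHERE id = ?', (row_id,))
    columns = [c[0] for c in cursor.description]
    record = dict(zip(columns, cursor.fetchone()))
    links = dict(foreign_keys(table))
    cells = []
    for column, value in record.items():
        text = escape(str(value))
        if column in links:                 # foreign key: emit a hyperlink
            text = f'<a href="/{links[column]}/{text}">{text}</a>'
        cells.append(f"<td>{text}</td>")
    return "<tr>" + "".join(cells) + "</tr>"


print(foreign_keys("orders"))
print(detail_row("orders", 10))
```

Because the links are derived from the schema alone, the navigation obtained this way is necessarily isomorphic to the database structure, which is the rigidity the text points out.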

On the architectural side, all products target multitier architectures; the most recent generation (e.g., NetDynamics Visual Studio) is tightly integrated with specialized middleware products, called application servers, which we review in Section 10.

Unlike Web-DBPL integrators, which are neutral products that can be used to shape any kind of application, form-based generators and database publishing wizards deliver complete software solutions, including user interface logic. Thus their underlying interaction model influences the user perception of application quality. The evaluation must consider the application context in which a tool is used, and consequently the kind of users addressed. In intrabusiness or business-to-business contexts, these products are essentially vehicles for Web-enabling traditional client-server applications. In this setting, interface quality is proportional to the similarity between the look and feel of original client-server applications and their Web counterparts, because the peculiarities of Web interactions (free browsing, associative links, multimediality and user-level customization) are less important than adherence to interface standards well-established among users. Conversely, if these products are targeted to user-oriented applications for the Internet, they must be compared to Web and hypermedia authoring tools, in which case their form-to-form navigation pattern and the limited graphic control imposed by wizards are a source of rigidity.

Figure 7. Datarange wizard in Microsoft’s Visual InterDev.

Table IV overviews the main features of Web form editors, report writers, and database publishing wizards.

8. MULTIPARADIGM TOOLS

The ongoing trend in the evolution of Web development tools is an increasing convergence between visual editing and site management products, Web-DBPL integrators, and client-server products (reviewed in Sections 4, 6, and 7, respectively).

Multiparadigm tools integrate solutions from the aforementioned categories into a unified development framework. The most typical configuration is one in which visual HTML editing and site administration are extended with external components, which provide database connectivity, or with enhanced HTML generation capabilities able to produce scripts for pulling content from databases, or with full-fledged database publication wizards.

Examples of such convergent tools are FrontPage98, which includes database objects for connecting to external data sources; Elemental’s Drumbeat, which combines sophisticated visual editing and assets management with database wizards comparable to those of specialized Web-database connectivity products; Lotus Domino and Domino Designer, which offer an integrated environment for Web-enabling Notes client-server applications and for developing Web sites that take advantage of Domino Server capabilities; and NetObject’s Fusion (version 3.0), which features the Fusion2Fusion extension for connecting NetObject’s visual page editor and Allaire’s HTML-SQL integrator.

Multiparadigm tools do not introduce novel approaches into Web development, but combine already established features to broaden their spectrum with respect to single-paradigm products. Thus they offer advantages (summarized in Table V) that are the sum of the strengths of their components (implementation productivity, top-down design, three-tier architecture, high graphic control), but do not overcome the major limitations of the products reviewed so far, i.e., lack of a high-level, content-independent representation of the site in terms of structure, navigation, and presentation.

9. MODEL-DRIVEN WEB GENERATORS

Model-driven Web generators are at the top of the proposed categories, because they provide the highest level of automation and lifecycle coverage, by applying conceptual modeling and code generation techniques to the development of Web applications.

Table IV. Synopsis of Web Form Editors, Report Writers, and Database Publishing Wizards

| Dimension | Features |
|---|---|
| Process: Lifecycle Coverage | Implementation; Testing; Maintenance |
| Process: Automation | Source code generation from RAD and wizard tools |
| Abstractions | Implementation-level: forms, reports, controls |
| Reuse | Plug-in components: client-side and server-side |
| Architecture | Multitier, integration with application servers; Dynamic binding of content to pages |
| Usability | Canned interfaces based on query/result display loop; Low customization, no adaptivity, no proactivity |

This category comprises a few commercial tools, which exhibit different conceptual models and code generation techniques. We review Hyperwave Server 4.0, and Oracle’s Web Development Suite.

Hyperwave [Hyperwave Information Management 1998] is an advanced document management environment which permits remote users to browse, annotate, and maintain documents distributed over the Web. Hyperwave has a very basic, yet powerful, high-level model of a Web application, which is thought of as a set of document collections organized hierarchically. Collections may contain subcollections and documents, and have different behaviors based upon their type. Documents in a collection can be linked and annotated with a number of metadata. Links are managed by the Hyperwave server, so that any hypertextual structure can be superimposed over a set of otherwise independent documents.
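The model just described can be approximated by a small data structure in which links are kept outside the documents they connect. The Python sketch below is an illustrative assumption, not Hyperwave’s actual data model or API: the class names, the metadata keys, and the `remove_document` behavior are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    name: str
    metadata: dict = field(default_factory=dict)    # e.g., author, keywords


@dataclass
class Collection:
    name: str
    members: list = field(default_factory=list)     # documents and subcollections


@dataclass
class LinkServer:
    """Stores links separately, so documents stay independent of each other."""
    links: set = field(default_factory=set)

    def add(self, source, target):
        self.links.add((source, target))

    def outgoing(self, name):
        return sorted(t for s, t in self.links if s == name)

    def incoming(self, name):
        return sorted(s for s, t in self.links if t == name)

    def remove_document(self, name):
        """In this sketch, removing a document drops every link touching it."""
        self.links = {(s, t) for s, t in self.links if name not in (s, t)}


# A hierarchy of collections with two otherwise unrelated documents.
report = Document("report", {"author": "staff"})
minutes = Document("minutes", {"author": "board"})
root = Collection("intranet", [Collection("finance", [report]),
                               Collection("meetings", [minutes])])

server = LinkServer()
server.add("minutes", "report")          # hypertext structure superimposed later
print(server.outgoing("minutes"), server.incoming("report"))
```

Keeping links in a separate store means that the same documents can carry different hypertextual structures, and that incoming links can be found without scanning document contents.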

From the description of collections, links, and metadata, Hyperwave generates a Web interface that enables both user-oriented and administrative functions like collection and link browsing, searching and notification, remote document management, fine-grain version and access control, collaborative work, and personal annotations. The generated interface has a default presentation, which can be personalized using a proprietary template language.

Although document-oriented, the Hyperwave server relies on database technology and on a multitier architecture to store metadata and manage links spanning multiple servers.

A more database-centric approach is taken by the Oracle Web Development Suite, which comprises Designer 2000 [Gwyer 1996; Barnes and Gwyer 1996], a CASE tool for generating Web applications from augmented entity-relationship diagrams.

Designer 2000 is an environment for business process and application modeling, integrated with software generators originally designed to target traditional client-server environments, namely Oracle Developer 2000 [Hoven 1997], and Visual Basic. The Web generator enables previous applications developed with Designer 2000 and deployed on LANs to be ported to the Web, as well as the delivery of novel applications directly on the Internet or on intranets. The Web generator takes its inputs from the Designer 2000 design repository and delivers PL/SQL code that runs within the Oracle Web Server to produce the desired HTML pages of the application. More specifically, three inputs drive the generation process:

—A Web-enhanced database design: database design diagrams specify the structure of the database in terms of tables, views, foreign key relationships, and integrity constraints. These constitute the schema of the future Web application. A few visual features can be specified in the schema: for example, column definitions can be supplemented with caption text and display format (e.g., a pop-up list). Moreover, some integrity constraints (e.g., valid ranges) can be attached to columns and tables, and the Web generator can be instructed to produce code for checking them on the server via PL/SQL or on the client via JavaScript.

Table V. Synopsis of Multiparadigm Tools

| Dimension | Multiparadigm tools |
|---|---|
| Process: Lifecycle Coverage | Design (limited to hierarchical layout definitions); implementation and maintenance |
| Process: Automation | HTML generation from WYSIWYG page editing; generation of commands for hierarchy navigation; Web-database communication; source code generation from RAD and wizard tools |
| Abstractions | Implementation-level: pages, links, presentation styles, interface objects, tables, queries |
| Reuse | Plug-in components, reusable presentation styles |
| Architecture | Three and multitier; static and dynamic binding of content to pages |
| Usability | High graphical control through manual authoring; coherence through use of templates and presentation styles; low customization, adaptivity and proactivity manually implementable through database triggers |

—The definition of applications and modules: modules correspond to basic application units; each module consists of a sequence of tables, linked by foreign key relationships (Figure 8). The order of tables in a module determines the sequence of HTML pages that will be produced for that module. Navigation is established by drawing links between modules: the designer may define which modules can be reached from a given module and introduce fictitious modules acting as hierarchical indexes over other modules.

—User preferences are parameters that can be set to govern the presentation of the generated application; they can be defined at the global, module, or component level. Examples of preferences are colors, headers and footers, background images, and help text.

From these inputs, the Web generator produces fixed-format Web pages; one set of related pages is generated for each module, and links between different modules are turned into hyperlinks between the HTML startup pages of modules.
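The generation scheme just described can be illustrated with a small Python sketch. It is not Designer 2000's actual PL/SQL output: the `Module` class, the file-naming scheme, and the sample modules are all hypothetical, and the sketch covers only the mapping from table order to page sequence and from module links to hyperlinks between startup pages.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """Hypothetical stand-in for a Designer 2000 module definition."""
    name: str
    tables: list                                   # ordered tables, one page each
    links_to: list = field(default_factory=list)   # names of reachable modules


def page_name(module_name, index):
    # Assumed naming scheme: <module>_<position>.html; position 0 is the startup page.
    return f"{module_name}_{index}.html"


def generate_site(modules):
    """Return {file name: HTML text}, one fixed-format page per table."""
    pages = {}
    for m in modules:
        for i, table in enumerate(m.tables):
            body = [f"<h1>{m.name}: {table}</h1>"]
            if i + 1 < len(m.tables):
                # The order of tables fixes the page sequence inside the module.
                body.append(f'<a href="{page_name(m.name, i + 1)}">Next</a>')
            if i == 0:
                # Module-to-module links become hyperlinks between startup pages.
                for target in m.links_to:
                    body.append(f'<a href="{page_name(target, 0)}">{target}</a>')
            pages[page_name(m.name, i)] = "\n".join(body)
    return pages


site = generate_site([
    Module("orders", ["ORDERS", "ORDER_LINES"], links_to=["customers"]),
    Module("customers", ["CUSTOMERS"]),
])
```

Because every page is derived from the same definitions, changing the order of tables in a module or redrawing a link between modules changes the generated navigation without any manual page editing.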

Figure 8. Module definition in Oracle Designer 2000.

Model-driven generators apply the fundamental principles of software engineering to Web development; unlike all other product categories, applications are first modeled at the conceptual level and then implemented through code generation. This approach and its benefits are comparable to the ones delivered by CASE tools for object-oriented application development, and range from reduced development effort to reverse engineering and multilevel reuse.

However, Hyperwave and Designer 2000 diverge in the underlying conceptual model, and both exhibit limits in the description of Web applications.

Hyperwave adopts a simplified hypermedia model, which is well suited to represent navigation, but lacks a proper structural model; as a consequence, the information base is reduced to a set of flat documents annotated with metadata.

Conversely, Designer 2000 draws the development abstractions from the database world, and adapts such concepts as entities and relationships to Web modeling by adding to them some navigation and presentation flavor. This attempt to twist a database conceptual model and apply it to a Web application is responsible for some limitations in the usability of the resulting applications, where the absence of proper modeling abstractions for navigation and presentation limits the exploitation of the communication capabilities of the Web.

The limitations of commercial Web generators are addressed by a few research prototypes, namely Araneus [Atzeni et al. 1997]; Autoweb [Fraternali and Paolini 1998]; Strudel [Fernandez et al. 1998]; WebArchitect [Takahashi and Liang 1997]; W3I3 [Ceri et al. 1998]; HSDL [Kesseler 1995]; RMC [Diaz et al. 1995]; and OOHDM [Schwabe and Rossi 1995], which are discussed in Section 12.

10. MIDDLEWARE, SEARCH ENGINES, AND GROUPWARE

Table VI. Synopsis of Model-Driven Generators (Oracle Designer 2000 and Hyperwave)

| Dimension | Designer 2000 | Hyperwave |
|---|---|---|
| Process: Lifecycle Coverage | Conceptualization (E/R); design (relational model); implementation; reverse engineering | Conceptualization (collection and link definition); implementation |
| Process: Automation | Relational schema generation from ER; generation of HTML from design models and presentation preferences | Relational metaschema generation; navigation generation; presentation generation |
| Abstractions | Conceptual-level: entity, relationships, attributes. Design-level: modules, tables, columns, constraints. Implementation-level: pages, links | Conceptual-level: collections, views, abstract links, documents, virtual collections. Implementation-level: pages, HTML links |
| Reuse | Module reuse; preferences reuse | Multiple views over the same document base |
| Architecture | Multitier, dynamic binding | Multitier, dynamic binding |
| Usability | Low graphical control of generated pages; high coherence through use of presentation preferences; low customization, no adaptivity, no proactivity | Low graphical control of generated pages; high coherence through use of presentation preferences; programmable customization, no adaptivity, proactivity through notification |

Besides support for conceptualization, design, and implementation, Web development requires tackling other issues which, although outside the focus of this paper, are critical for the delivery of effective applications: performance, availability, scalability, security, information retrieval, and support for collaborative administration and usage.

These needs are served by ad hoc tools or by specialized functions integrated into Web design products. In the sequel, we briefly review the most important features of three categories of products: application servers, search engines, and groupware and collaborative design tools.

10.1 Middleware

Industrial-strength data-intensive applications require not only proper design tools but also a solid architecture, good performance, availability, scalability, and security. These goals have prompted the extension of the original two-tier HTTP architecture to more sophisticated configurations, encompassing three, and even multiple, tiers. The key to multitiered applications is the capability to separate data, interfaces, and application logic, and to distribute each aspect to distinct network nodes. Such distribution leverages Internet-enabled application protocols (like the Corba Internet Inter-ORB Protocol, Corba IIOP [Object Management Group 1996], and Microsoft’s Distributed Component Object Model, DCOM [Microsoft Corp. 1996]) and the native remote procedure call capabilities of network languages (notably, Java’s Remote Method Invocation, RMI [Sun Microsystems 1995]).

The in-depth presentation of the alternative architectural and technological options for multitiered applications is outside the scope of this paper, and is thoroughly addressed by many authors (see, for example, the special issue of IEEE Internet Computing on Internet Architectures [Benda 1998] for a recent review of the state of the art, and Byte’s special report on networked components [Montgomery et al. 1997] for a comparison of Corba and Microsoft architectures).

However, in the tool market, several vendors are also addressing multitiered architectures by offering specific products, called application servers, which are middleware facilities either integrated into HTTP servers or working side by side with them. Examples of application servers are NetDynamics, Lotus Domino Server, Oracle 8i Java Server, Sybase’s Jaguar CTS, and Inprise’s VisiBroker for Java.

Application servers do not directly impact the client-side design process, but offer several extensions to the standard functions of HTTP engines, which can be used to support server-side development:

—Efficient execution of server-side programs that implement the business logic of the Web application. Products differ in the development languages supported, which commonly include C++, Java, and scripting languages like Perl, Visual Basic, and JavaScript. Efficiency is improved by replacing the Common Gateway Interface (CGI) between the Web server and the application programs with optimized protocols and architectures like FastCGI (http://www.fastcgi.com) and Java Servlets [Chang 1998].

—High-performance client-server communication. Such capability may be based on a comprehensive architecture for application distribution, on programming language features, or on proprietary remote procedure calls. Rather than using HTTP, enabled clients may bypass the Web server and directly connect to the application server using either open protocols like Corba/IIOP and Java RMI, or proprietary protocols like Microsoft’s DCOM, Domino’s Notes Remote Procedure Calls (NRPC), and Sybase’s Jaguar CTS remote procedure calls.

—Optimized and extensible connection to multiple external data sources. This function includes pooling connections to a database across multiple clients, caching result sets, and possibly extending the application server with plug-in gateways to heterogeneous data sources like Enterprise Resource Planning (ERP) and legacy systems (examples of built-in gateway components and programmable toolkits are NetDynamics 4.0 Platform Adapter Components (PACs) and Domino Enterprise Connection Services (DECS)).

—Flexible and dynamic load-balancing of client requests by means of automatic replication of server functions. High-volume incoming traffic may be dynamically routed to multiple instances of server-side functions, implemented either as separate threads of the application server or as external processes, possibly residing on remote hosts. Dynamic redirection also enables transparent failure-handling: a user’s request for an unavailable service can be routed to a replica of the same function, either dynamically spawned or statically instantiated at system configuration time.

—Implementation of a secure infrastructure for both data and messages. Security may be granted by a variety of means: user logging and authentication over both HTTP and non-HTTP connections (e.g., by supporting the X.509 open standard certificate format, or authentication via third-party directories), message and content encryption (e.g., by supporting the IETF Secure MIME standard and RSA/RC2 cryptography), and password quality testing.

—Transactionality, i.e., the capability of performing atomic and recoverable transactions over a single or multiple tiers. This feature, traditionally offered by distributed database servers and TP monitors, requires the application server to implement write-ahead logging and two-phase commit protocols.
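To make the last item more concrete, the following Python sketch shows the bare control flow of a two-phase commit coordinator. It is only an illustration under strong simplifying assumptions: the `Participant` class is hypothetical, an in-memory list stands in for the write-ahead log, and there is no durability, timeout handling, or crash recovery, all of which a real application server would need.

```python
class Participant:
    """Hypothetical resource manager (e.g., one tier's database)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote; a real participant would force a log record first.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants, log):
    """Coordinator: commit everywhere only if every participant votes yes."""
    log.append("begin")
    votes = [p.prepare() for p in participants]    # phase 1: collect votes
    decision = "commit" if all(votes) else "abort"
    log.append(decision)                           # record the decision before acting
    for p in participants:                         # phase 2: broadcast the decision
        if decision == "commit":
            p.commit()
        else:
            p.rollback()
    return decision


log_ok, log_bad = [], []
outcome_ok = two_phase_commit([Participant("db"), Participant("erp")], log_ok)
failing = [Participant("db"), Participant("erp", can_commit=False)]
outcome_bad = two_phase_commit(failing, log_bad)
```

Recording the decision before the second phase is what would let a recovering coordinator finish an interrupted transaction, provided the log were actually persistent.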

10.2 Search Engines

The proper design of structure and navigation normally results in Web sites with a self-evident organization, consequently reducing the need for full-text searches over the information base [Halasz 1988]. However, Web applications offered to the general public must also consider the needs of casual readers and readers with highly specific interests, for which a content search is the most effective interaction paradigm.

Designing the search functions for a data-intensive Web site is an orthogonal issue with respect to the design of structure, navigation, and presentation, and is supported either by ad hoc functions integrated into Web development tools, or by specialized products called search engines.

Examples of Web development tools that bundle integrated search functions are Hyperwave (which also comes with a separate commercial search engine) and Lotus Domino Designer. Among the numerous stand-alone search engines, we mention Verity Search97, Harvest, OpenText’s LiveLink Search and Spider, Altavista Search, QuarterDeck’s WebCompass, and Excite for Web Servers.

Search engines basically consist of two main components: a user interface and query processor, whereby users can pose queries and obtain a ranked list of pages whose content satisfies the query; and an indexing component (also called spider or crawler) which creates and maintains indexes over the data sources.
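The two components can be sketched in a few lines of Python. This is a toy illustration, not the design of any product named above: the page contents are hypothetical, the index is built from an in-memory dictionary rather than by crawling, and ranking is a plain count of term occurrences.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Indexing component: map each term to {page: occurrence count}."""
    index = defaultdict(Counter)
    for url, text in pages.items():
        for term in tokenize(text):
            index[term][url] += 1
    return index


def search(index, query):
    """Query processor: pages containing every query term, best score first."""
    terms = tokenize(query)
    if not terms:
        return []
    candidates = set(index.get(terms[0], {}))
    for term in terms[1:]:
        candidates &= set(index.get(term, {}))
    scores = {url: sum(index[t][url] for t in terms) for url in candidates}
    # Sort by descending score, then by URL to keep the order deterministic.
    return sorted(scores, key=lambda url: (-scores[url], url))


pages = {  # hypothetical crawled content
    "/catalog": "product catalog with product prices",
    "/about": "company history",
    "/offers": "special product offers",
}
index = build_index(pages)
results = search(index, "product")
```

Here the query is interpreted as a Boolean conjunction of keywords; the other query styles mentioned below (natural language, fuzzy) would replace the `search` function while leaving the index untouched.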

The available commercial products differ in a variety of dimensions: the kind of queries they support (keyword-based, Boolean, natural language, fuzzy); the customizability of the display of results; the flexibility of the index creation and maintenance process (e.g., the possibility of scheduling the updates of indexes differently for different data sources or to analyze different file formats); and the adherence to the so-called robot-exclusion standard, by which webmasters can deny access to crawlers at their sites.
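As an illustration of the robot-exclusion standard, the sketch below uses the `urllib.robotparser` module from the Python standard library to decide whether a crawler may fetch a page. The robots.txt content, the host name, and the crawler name `ExampleSpider` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content published by a webmaster.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_index(url, agent="ExampleSpider"):
    """Return True if the exclusion rules allow this crawler to fetch the URL."""
    return parser.can_fetch(agent, url)


allowed = may_index("http://www.example.com/catalog/index.html")
blocked = may_index("http://www.example.com/private/report.html")
```

A compliant indexing component would run such a check before every fetch; compliance is voluntary, which is why adherence is listed as a distinguishing feature among products.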

For a broad review of commercial search engines and a comparison of their features, the reader may refer to Lange [1997].

10.3 Groupware

Collaboration requirements affect Web development in several ways, ranging from the concurrent construction of a Web site by geographically distributed development teams, to the provision of limited content editing and interaction functions directly to end-users, to real-time interaction integrated into a Web site.

In the commercial market several tools offer groupware capabilities, including

—concurrent access control via content locking and checkin/checkout procedures;

—distributed file systems for concurrent content upload;

—offline collaboration tools like calendars, schedules, notification lists, and discussion groups;

—real-time collaboration facilities like virtual rooms, white boards, chats, and video conferencing;

—full-fledged workflow support, including offline and online collaboration facilities and workflow modeling and implementation.
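The first capability in the list, content locking with checkin/checkout, can be sketched as follows. The `ContentRepository` class and its in-memory storage are hypothetical; commercial tools add versioning, persistence, and lock expiry, none of which is modeled here.

```python
class CheckoutError(Exception):
    pass


class ContentRepository:
    """Hypothetical in-memory store with one exclusive lock per page."""

    def __init__(self):
        self.pages = {}     # path -> current content
        self.locks = {}     # path -> user holding the checkout

    def checkout(self, path, user):
        holder = self.locks.get(path)
        if holder is not None and holder != user:
            raise CheckoutError(f"{path} is checked out by {holder}")
        self.locks[path] = user
        return self.pages.get(path, "")

    def checkin(self, path, user, content):
        if self.locks.get(path) != user:
            raise CheckoutError(f"{user} has not checked out {path}")
        self.pages[path] = content
        del self.locks[path]    # release the lock so others can edit


repo = ContentRepository()
repo.checkout("/index.html", "alice")
try:
    repo.checkout("/index.html", "bob")
    conflict = False
except CheckoutError:
    conflict = True
repo.checkin("/index.html", "alice", "<h1>Home</h1>")
draft = repo.checkout("/index.html", "bob")
```

The second author is refused while the page is checked out and receives the updated content once it has been checked in, which is the behavior that prevents concurrent edits from overwriting each other.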

Examples of design tools supporting collaborative development are Hyperwave, TeamSite, and Lotus Domino Designer. Products for the definition of virtual workteams are reviewed in Wong [1998], and a comprehensive survey of Web-enabled workflow systems is contained in Miller et al. [1997].

11. EVALUATION

Table VII summarizes the features of the categories of Web development tools described in Sections 4 to 9, in light of the perspectives of Web development discussed in Section 2. To complete this picture, developers must also consider middleware products, which address the enhancement of performance and security and offer a platform for implementing advanced business logic, as well as search engines and groupware tools, which add specialized functions orthogonal to the design process.

The state of the practice summarized in Table VII can be better understood if compared with the current situations of mature software technologies, like object-oriented systems and databases (see Figure 9).

In these contexts, development tools cover the entire spectrum of the application lifecycle, and approach development from the right angle: they exploit well-established abstractions suited to the specific context (such as the conceptual and logical models for database design and the object-oriented notation for OO systems) and follow well-proven development guidelines, like those described in Ceri et al. [1993] for database design, and in Rumbaugh et al. [1991] for object-oriented design. Pictorially, mature tools and approaches are represented as “light cones” that illuminate the various phases of the development cycle without crossing the borders of their target application field.

The situation of Web development is different and typical of a not yet mature technology (it could be easily compared to the OO tool market in the eighties): most products limit their focus to implementation, with some provision for design (as represented by the shorter “light cone” in Figure 9); a few tools are trying to cover the lifecycle in a broader way, but do so by approaching development from an unnatural angle, typically using models, abstractions, and processes drawn from other contexts. The “slanted” approach of these tools is represented by the light cone covering all phases of development, but with an inclination due to its origin from the database area.⁴

12. RESEARCH PERSPECTIVES

Web application development has recently gained much attention not only in the commercial field but also in the research community, where several projects address the extension of the capabilities of site design tools in various directions, and propose innovative design processes and methodologies.

In this Section we briefly review a number of projects that have proposed solutions for overcoming some of the limitations currently experienced by state-of-the-practice commercial tools: Araneus [Atzeni et al. 1997; Atzeni et al. 1998; Atzeni et al. 1998]; Autoweb [Fraternali and Paolini 1998]; Strudel [Fernandez et al. 1998]; WebArchitect [Takahashi and Liang 1997]; and W3I3 [Ceri et al. 1998].

The common denominator in such efforts is the goal of supporting the activities of data-intensive Web design from conceptualization to maintenance by properly distinguishing between the different dimensions of Web design, organizing the development activities into a structured process, and providing tools for partially automating repetitive development tasks.

However, besides such commonalities, each project has its own special focus, and addresses only some of the issues left open among the currently available commercial tools.

Table VII. Synopsis of the Different Categories of Web Development Tools

| | Visual Editors | Hypermedia Tools | Web-DBPL Integrators | Form Editors | Multiparadigm Tools | Model-driven Generators |
|---|---|---|---|---|---|---|
| Lifecycle coverage | Implementation, hierarchical site design, link maintenance | Implementation, design (authoring in-the-large) | Implementation | Implementation, maintenance (debugging) | Implementation, hierarchical site design, link maintenance, debugging | Conceptualization, design, implementation, maintenance, reverse engineering |
| Automation | Generation of HTML | Generation of HTML/Java | Database connection, query shipment, result formatting | Generation of HTML/Java | Generation of HTML, database connection | Generation of design schemas, navigation commands, interfaces |
| Abstractions | Page, link, presentation style | Authoring metaphors | Table, page elements | Form, report, client-side and server-side control | Page, link, presentation style, form, table | Entity, relationship, module, table, column, collection, link |
| Reuse | Components, presentation styles | Libraries, skeletons, components, styles | Page templates, DBPL units | Client-side and server-side components | Components, presentation styles, templates | Modules, preferences, collections, links |
| Default architecture | 2-tiers, static | 2-tiers, static | 3-tiers, dynamic | 3-tiers, dynamic | 3-tiers, dynamic | 3-tiers, dynamic |
| Support to usability | Good graphic control and coherence (manual) | Very good graphic, navigation, synchronization control (manual) | Interface neutral, proactivity through triggers | Canned interfaces | Good graphic control and coherence (manual and with templates) | Predefined interfaces, low graphic control |

⁴ A similar phenomenon took place in the early days of object-orientation, when many proposals were put forth to adapt structured analysis/structured design concepts to object-oriented implementation.

In Table VIII we summarize the features of the various projects using the same categories as commercial tools (see Table VII), and in Table IX we underline the special focus and strong points of each project.

For completeness, we conclude with background research on Web development, which has taken advantage of contributions from several fields, among which hypermedia and databases are prominent.

13. RESEARCH PROJECTS IN DATA-INTENSIVE WEB DEVELOPMENT

13.1 Araneus

Araneus⁵ [Atzeni et al. 1997; Atzeni et al. 1998; Atzeni et al. 1998] is a project at Univ. di Roma Tre, which focuses on the definition and prototype implementation of an environment for managing unstructured and structured Web content in an integrated way (called a Web Base Management System, WBMS). The WBMS should allow designers to effectively deploy large Web sites, integrate structured and semistructured data, reorganize legacy Web sites, and Web-enable existing database applications.

On the modeling side, Araneus stresses the distinction among data structure, navigation, and presentation. In structure modeling, a further distinction is made between database and hypertext structure: the former is specified using the entity-relationship model, the latter using a notation that integrates structure and navigation specification called the Navigation Conceptual Model (NCM).

Conceptual modeling is followed by logical design, using the relational model for the structural part, and the Araneus Data Model (ADM) for navigation and page composition. ADM offers the notion of page scheme, a language-independent page description notation based on such elements as attributes, lists, link anchors, and forms. The use of ADM introduces page composition as an independent modeling task: the specification of data and page structure is orthogonal, and therefore different page schemes can be built for the same data.

Presentation is specified orthogonally to data definition and page composition, using an HTML template approach.

⁵ The project Web site is http://www.dia.uniroma3.it/Araneus.

Figure 9. A pictorial view of the state-of-the-practice of Web development tools.


The development process follows two tracks: database and hypertext. Database design and implementation are conducted in the usual way using the entity-relationship model and mapping it into relational structures. After that, the entity-relationship schema is transformed into an NCM schema; this shift requires several design activities. The next step, hypertext logical design, maps the NCM schema into several page schemes written in ADM. Finally, implementation requires writing page schemes as templates in the Penelope language [Atzeni et al. 1997], which specifies how physical pages are constructed from logical page schemes and content stored in a database, in a way similar to commercial template-based HTML-SQL integrators.
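The last step, building physical pages from a logical page scheme, a presentation template, and database content, can be illustrated with the Python sketch below. It does not use ADM or Penelope syntax, which are not reproduced in this survey: the `PageScheme` class, the `author` table, and the template are hypothetical stand-ins that only show how the three ingredients are kept separate and then combined.

```python
import sqlite3
from dataclasses import dataclass
from html import escape
from string import Template


@dataclass
class PageScheme:
    """Hypothetical logical page description: which data a page type shows."""
    name: str
    query: str          # SQL yielding one row per physical page
    attributes: tuple   # columns exposed to the template
    link_to: str        # page scheme reached through the link anchor


AUTHOR_PAGE = PageScheme(
    name="AuthorPage",
    query="SELECT id, name, affiliation FROM author ORDER BY id",
    attributes=("name", "affiliation"),
    link_to="PaperList",
)

# Presentation is kept apart from the page scheme, as an HTML template.
AUTHOR_TEMPLATE = Template(
    '<h1>$name</h1><p>$affiliation</p><a href="$link">Papers</a>'
)


def materialize(db, scheme, template):
    """Build {file name: HTML} by joining a page scheme with database rows."""
    pages = {}
    for row in db.execute(scheme.query):
        values = dict(zip(("id",) + scheme.attributes, row))
        fields = {a: escape(str(values[a])) for a in scheme.attributes}
        fields["link"] = f"{scheme.link_to}_{values['id']}.html"
        pages[f"{scheme.name}_{values['id']}.html"] = template.substitute(fields)
    return pages


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT, affiliation TEXT)")
db.execute("INSERT INTO author VALUES (1, 'P. Atzeni', 'Univ. di Roma Tre')")
pages = materialize(db, AUTHOR_PAGE, AUTHOR_TEMPLATE)
```

Since the page scheme and the template are separate objects, a second page scheme over the same `author` table, or a second template for the same scheme, can be added without touching the data.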

Araneus includes several tools to support automation of the aforementioned design tasks. A specific point in the Araneus project is the integration into the WBMS of languages and tools for querying and restructuring HTML data, so that the designer can deploy a new site, Web-enable a database application, and “reverse-engineer” a legacy HTML site, all within one system.

From an architectural viewpoint, Araneus offers both static and dynamic page generation.

The system does not currently support user profiling and personalization. Finally, no support is offered for event-based reactive processing, except for recomputation of materialized pages following database and schema updates.

Table VIII. Synopsis of Reviewed Research Projects in Data-Intensive Web Development

| Dimension | Feature | Araneus | Autoweb | Strudel | WebArchitect | W3I3 |
|---|---|---|---|---|---|---|
| Lifecycle Coverage | Conceptualization | Y | Y | N | Y | Y |
| | Logical design | Y | Y | Y | N | Y |
| | Prototyping | N | Y | N | N | Y |
| | Implementation | Y | Y | Y | Y | Y |
| | Maintenance | Y | Y | Y | Y | Y |
| | Restructuring | Y | N | Y | N | N |
| | Rev. engineering | Y | Y | Y | N | Y |
| Process Automation | Generation of supporting database | N | Y | N | N | Y |
| | Generation of mapping to legacy databases | Y | N | N | N | Y |
| | Generation of navigation structures | Y | Y | N | Y | Y |
| | Generation of HTML pages from templates | Y | N | Y | N | N |
| | Generation of HTML pages from abstract specs | N | Y | N | N | Y |
| Modelling Abstractions | Structure model | Y | Y | Y | Y | Y |
| | Derivation model | N | N | Y | N | Y |
| | Navigation model | Y | Y | N | Y | Y |
| | Page composition model | Y | N | N | N | Y |
| | Presentation model | Y | Y | Y | N | Y |
| Reuse & Components | Plug-in components | N | N | N | N | N |
| | Reusable presentation styles | N | Y | N | N | Y |
| | Application skeletons | N | Y | N | N | Y |
| Default Architecture | Two tiers static | N | N | Y | Y | N |
| | Three tiers static | Y | Y | N | N | Y |
| | Three tiers dynamic | Y | Y | N | N | Y |
| | Automatic caching | N | Y | N | N | Y |
| Support to Usability | Navigation uniformity | N | Y | N | Y | Y |
| | Presentation uniformity | Y | Y | Y | N | Y |
| | Usability guidelines | N | Y | N | N | Y |

13.2 Autoweb

Autoweb⁶ [Fraternali and Paolini 1998] is a project developed at Politecnico di Milano with the goal of applying a model-driven development process to the construction and maintenance of data-intensive Web sites.

Autoweb consists of three ingredients:

—A Web modeling notation called HDM-lite, which evolved from previous hypermedia and database conceptual models, specifically tailored to the Web.

—Two transformation techniques that address the mapping of conceptual schemas into relational database structures and the production of application pages (in HTML and Java) from data and metadata stored in the database.

—A set of design-time and runtime CASE tools that completely automate the design, implementation, and maintenance of a Web application.

The Autoweb conceptual model, called HDM-lite, includes primitives for the independent specification of structure, navigation, and presentation. Unlike in other projects, presentation is specified at an abstract level totally independent of the implementation language, and is automatically mapped to HTML (a prototype implementation of the mapping to Java has also been developed). This makes presentation styles reusable across applications, as well as across different object types within the same site.

HDM-lite presently supports neither the orthogonal composition of the page, whose content is inferred from the structure schema, nor a language for data derivation. An original feature of Autoweb is the storage into a relational DBMS not only of application data, but also of metadata about navigation and presentation schemas, which greatly enhances the possibility of quickly adapting the output pages to the user’s needs, even at runtime.

⁶ The project Web site is http://www.ing.unico.it/autoweb.

Table IX. Advanced Features of Reviewed Projects in Data-Intensive Web Development

| Project | Advanced features |
|---|---|
| Araneus | Querying and restructuring of HTML-based sites; full design methodology; orthogonal structure, navigation, composition and presentation modeling; tool support |
| Autoweb | Full design methodology; orthogonal structure, navigation, and presentation modeling; relational representation of both data and metadata; full CASE support; tool-supported usability guidelines |
| Strudel | Querying and restructuring of semistructured data; declarative site definition |
| WebArchitect | Full design methodology; scenario-based modeling; role-based specification of content; tool support |
| W3I3 | Full design methodology; orthogonal structure, derivation, navigation, composition and presentation modeling; user modeling and profiling; adaptive behavior through Web business rules |

The main focus of Autoweb is the automation of the development process. An Autoweb application is constructed starting from an HDM-lite schema, with a tool called Visual HDM Diagram Editor; presentation specification is assisted by the Visual Style Sheet Editor tool, which permits the designer to define presentation styles applicable to conceptual objects in a WYSIWYG manner. The conceptual model is automatically translated into the schema of a relational database for storing application data, and into a metaschema database containing a relational representation of the site’s structure, navigation, and presentation. The site is populated either by using an automatically constructed data entry application, produced by the Autoweb Data Entry Generator, or by manually defining a mapping between the automatically generated relational schema and the schema of a pre-existing database. As a last step, application pages are dynamically constructed from database content and metadata by the Autoweb Page Generator in such a way that all the prescriptions of the conceptual model are enforced.
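The translation of a conceptual model into a relational schema plus a metaschema can be sketched as follows. This is not the actual Autoweb mapping and does not use HDM-lite notation: the dictionary-based conceptual schema, the table names `meta_entity` and `meta_link`, and the rule that turns a link into a foreign key are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical conceptual schema: entity -> attributes, plus navigable links.
ENTITIES = {"Artist": ["name", "country"], "Album": ["title", "year"]}
LINKS = [("Artist", "Album")]   # an artist page links to its albums


def generate_ddl(entities, links):
    """Derive CREATE TABLE statements for application data."""
    ddl = []
    for entity, attrs in entities.items():
        cols = ["id INTEGER PRIMARY KEY"] + [f"{a} TEXT" for a in attrs]
        # Assumed rule: a link becomes a foreign key column on the target entity.
        cols += [f"{src.lower()}_id INTEGER REFERENCES {src}(id)"
                 for src, dst in links if dst == entity]
        ddl.append(f"CREATE TABLE {entity} ({', '.join(cols)})")
    return ddl


def build_databases(entities, links):
    db = sqlite3.connect(":memory:")
    for statement in generate_ddl(entities, links):
        db.execute(statement)
    # Metaschema: the site description itself is stored as relational data.
    db.execute("CREATE TABLE meta_entity (name TEXT, attribute TEXT)")
    db.execute("CREATE TABLE meta_link (source TEXT, target TEXT)")
    db.executemany("INSERT INTO meta_entity VALUES (?, ?)",
                   [(e, a) for e, attrs in entities.items() for a in attrs])
    db.executemany("INSERT INTO meta_link VALUES (?, ?)", links)
    return db


db = build_databases(ENTITIES, LINKS)
album_columns = [row[1] for row in db.execute("PRAGMA table_info(Album)")]
```

Keeping the site description in `meta_entity` and `meta_link` means that a page generator can read navigation rules with ordinary queries at request time, which is the property the text credits for quick adaptation of output pages.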

During the development process, reuse happens at two levels: partially instantiated application skeletons and abstract presentation specifications.

Autoweb has a three-tier architecture, featuring both static and dynamic page generation. Dynamic generation is optimized by a flexible caching system that caches and refreshes pages of selected types based on users’ requests.

A unique feature of Autoweb is the provision for interface design guidelines within the design tools: uniformity of navigation and presentation is enforced by the design and page generation tools at the level of both individual object types and of the entire application.

13.3 Strudel

Strudel⁷ is a project of AT&T Labs [Fernandez et al. 1998], which aims at experimenting with novel ways of developing Web sites based upon declarative specification of the site’s structure and content.

Strudel’s core idea is to describe both the schema and content of a site by means of a set of queries over a data model for semistructured information.

Content is represented using the Uniform Graph Model, a graph-based data model capable of describing objects with partial or missing schema. As a starting point of the construction of a site, external data sources, e.g., HTML files or relational databases, are translated by means of wrappers into the Strudel internal format. In this way it is possible to either restructure an existing HTML site or Web-enable a legacy data repository.

Then, the design of a Web site requires writing one or more queries over the internal representation of data, using the Strudel query language (StruQL). Such queries permit the designer to select the data to be included in the site, along with links and collections of objects for navigation. In this way Strudel separates description of content from definition of the structure and the navigation of the site. Presentation is added as a separate dimension by means of HTML templates; these mix HTML presentation tags and special-purpose tags, which are bound at HTML generation time to objects resulting from site definition queries. The templates determine rendering of the site definition queries in HTML.
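The pipeline of data graph, site definition query, and template can be mimicked in Python as shown below. The sketch does not reproduce StruQL or Strudel's template language: the edge-triple encoding, the `site_definition` function playing the role of a query, and the node names such as `YearPage(1998)` are hypothetical.

```python
from string import Template

# Data graph as (source, label, target) edges; objects need not share a schema.
DATA = [
    ("p1", "title", "Strudel overview"), ("p1", "year", "1998"),
    ("p2", "title", "Semistructured data"),          # no year: partial schema
    ("p3", "title", "Web site management"), ("p3", "year", "1998"),
]


def attr(graph, node, label):
    """Return the value of a labeled edge, or None when the edge is missing."""
    return next((t for s, lab, t in graph if s == node and lab == label), None)


def site_definition(graph):
    """Hypothetical 'query': select papers with a year, emit site-graph edges."""
    site = []
    for node in sorted({s for s, _, _ in graph}):
        year = attr(graph, node, "year")
        if year is None:
            continue                                  # selection step
        site.append((f"YearPage({year})", "paper", f"PaperPage({node})"))
        site.append((f"PaperPage({node})", "title", attr(graph, node, "title")))
    return site


ITEM = Template("<li>$title</li>")                    # presentation template


def render_year_page(site, year):
    papers = [t for s, lab, t in site
              if s == f"YearPage({year})" and lab == "paper"]
    items = [ITEM.substitute(title=attr(site, p, "title")) for p in papers]
    return f"<h1>{year}</h1><ul>{''.join(items)}</ul>"


site = site_definition(DATA)
html = render_year_page(site, "1998")
```

A different `site_definition` function over the same `DATA` would yield a different site, which is the sense in which the site is defined declaratively on top of the content.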

Specification of navigation is intertwined with structure and presentation because navigable links and index collections are specified together with the queries that define the site, and also because the structure of HTML pages and their links depend on templates that describe the presentation.

Presently, Strudel has a two-tier static architecture, in which queries are evaluated and transformed into HTML pages in advance. However, it would be possible to evaluate queries and render their results dynamically.

⁷ The project Web site is http://www.research.att.com/sw/tools/strudel.


The declarative definition of structure and content by means of queries opens the way to personalization: different sites or different versions of the same site can be built on top of the same content simply by changing the StruQL site definition queries.

13.4 WebArchitect

WebArchitect [Takahashi and Liang 1997] is a project aimed at developing methods and tools for the construction of Web-based information systems (WBIS). The authors propose a structured design process that goes through analysis, design, construction, and maintenance of a Web site.

Analysis includes both static and dynamic modeling. The former is conducted with the entity relationship model; the latter requires the identification of scenarios in the tradition of object-oriented modeling [Jacobson 1992]. During ER modeling, entities are classified according to the different roles they play in the definition of the site (agent, product, or event). Design is conducted in parallel with scenario analysis and aims at pinning down the structure and navigation schema of the Web site. Design results are represented using a variant of the Relation Management Data Model by Isakowitz et al. [1995], which incorporates the roles of the entities forming the Web site. WBIS implementation and maintenance are supported by WebArchitect and PilotBoat. The former tool supports definition of the structure and navigation of the site, as well as maintenance of metadata on the site’s resources; the latter is a client application permitting the user to browse an application based on metalinks, implementing the navigation semantics specified in WebArchitect. Metalinks are navigable connections stored outside the application objects, which are managed by extended HTTP engines supporting the special-purpose methods LINK and UNLINK.
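The idea of keeping navigable connections outside the objects they connect can be sketched in a few lines of Python. This is only an in-memory illustration and not an HTTP implementation: the class `MetalinkStore`, its method names, and the sample paths are hypothetical, and the methods only loosely mirror the LINK and UNLINK operations mentioned above.

```python
class MetalinkStore:
    """Keeps navigable connections outside the documents they connect."""

    def __init__(self):
        self._links = set()  # (source URL, destination URL) pairs

    def link(self, source, destination):
        """Loosely mirrors the LINK method: register a connection."""
        self._links.add((source, destination))

    def unlink(self, source, destination):
        """Loosely mirrors the UNLINK method: drop a connection if present."""
        self._links.discard((source, destination))

    def outgoing(self, source):
        """Destinations a browser could offer when showing `source`."""
        return sorted(d for s, d in self._links if s == source)


store = MetalinkStore()
store.link("/products/tv.html", "/events/launch.html")
store.link("/products/tv.html", "/agents/sales.html")
store.unlink("/products/tv.html", "/events/launch.html")
remaining = store.outgoing("/products/tv.html")
```

Because the documents themselves are never edited, navigation can be changed without touching page content, which is the property the metalink approach relies on.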

13.5 W3I3

W3I3 (WWW Intelligent Information Infrastructure)⁸ [Ceri et al. 1998] is a project of the W3I3 Consortium, a partnership of four European industries and one academic institution, funded by the European Community. W3I3’s goal is to advance Web modeling and development tools for data-intensive Web applications, with a special focus on user profiling, personalization, and reactive behavior.

W3I3 modeling distinguishes five perspectives (structure, derivation, navigation, page composition, and presentation) and includes models and languages for specifying a site under these perspectives (see http://webml.org).

The development process stresses the automatic generation of application pages from the conceptual model stored in a relational database, and content stored in external structured or semistructured data repositories.

The architecture is multitier: application content can be distributed across different data repositories integrated into a global view of the site obtained by mapping the conceptual schema into a relational representation.

A special feature of W3I3 is the integration of user modeling and business rules: users are explicitly modeled through demographic and psychographic variables, and business rules are used to map users or user groups to personal views of the site computed dynamically.
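A minimal Python sketch can make the mapping from user profiles to personal views concrete. It does not reproduce the W3I3 rule language: the profile fields, the rule encoding as ordered (condition, view) pairs, the first-match policy, and all names are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UserProfile:
    age: int        # a demographic variable (assumed)
    interest: str   # a psychographic variable (assumed)

# Each rule pairs a condition on the profile with the name of a site view.
# Rules are tried in order; the first match wins (an assumed policy).
BUSINESS_RULES = [
    (lambda u: u.age < 18, "junior_view"),
    (lambda u: u.interest == "technology", "tech_view"),
]

DEFAULT_VIEW = "standard_view"

def select_view(user, rules=BUSINESS_RULES):
    """Return the personal view assigned to `user` at request time."""
    for condition, view in rules:
        if condition(user):
            return view
    return DEFAULT_VIEW

view_a = select_view(UserProfile(age=16, interest="technology"))
view_b = select_view(UserProfile(age=40, interest="technology"))
view_c = select_view(UserProfile(age=40, interest="travel"))
```

Since the view is chosen when the request arrives, the same content base can serve different users differently without regenerating the site.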

Presently, W3I3 is in the implementation phase; a prototype version of both the design tools and the runtime environment supporting site personalization and the Web business rules has been constructed.

14. BACKGROUND RESEARCH

Web design tools and approaches owe much to the debate on hypermedia modeling and design, semistructured data modeling, and hypermedia development tools proposed prior to the advent of the Web. In the following sections, we review some of the most important contributions.

⁸ The project Web site is http://www.txt.iy/w3i3.

14.1 Modeling Notation

Historically, most of the modeling notation adopted by current Web development methodologies stems from the evolution and hybridization of previous conceptual models for database and hypermedia design.

The common ancestors of many subsequent proposals are the entity relationship model [Chen 1976], in the database field, and the Dexter model [Halasz and Schwartz 1994] in the hypermedia area.

The Dexter model originates from the effort to provide a uniform terminology for representing the different hypertext structuring primitives offered by hypertext construction systems; the core of the model is the layered representation of a hypertextual application at three levels: storage, within-component, and runtime. The storage level describes the network of nodes and links of the hypertext, without details on the inner structure and node content, which is the focus of the within-component level. The runtime level deals with the dynamics and presentation of the hypertext.

The modeling concepts available at the storage level are very basic: components describe pieces of information that constitute the hypertext, and can be either atomic or composite. Links are a special kind of component used to represent navigable paths. Although not conceived for hypertext design, the Dexter model advocates many concepts, such as the distinctions among the structure, navigation, and presentation of a hypertext, whose influence has been long-lasting.
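The storage-level concepts can be rendered as a small Python sketch. It is a simplification under stated assumptions, not the Dexter specification: links are reduced to a single source and target identifier, and the class names and the helper `reachable` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Component:
    """Base storage-level unit, identified by a unique id."""
    uid: str

@dataclass
class Atomic(Component):
    content: object = None  # opaque payload, left to the within-component level

@dataclass
class Composite(Component):
    children: List[Component] = field(default_factory=list)

@dataclass
class Link(Component):
    """A link is itself a component that connects other components."""
    source: str = ""
    target: str = ""

def reachable(links, start):
    """Ids reachable from `start` by following links (navigable paths)."""
    seen, frontier = set(), [start]
    while frontier:
        current = frontier.pop()
        for link in links:
            if link.source == current and link.target not in seen:
                seen.add(link.target)
                frontier.append(link.target)
    return seen

intro = Atomic("intro", "Welcome")
body = Atomic("body", "Details")
doc = Composite("doc", [intro, body])
links = [Link("l1", "intro", "body"), Link("l2", "body", "doc")]
targets = reachable(links, "intro")
```

Note that nothing in the sketch says how a component is displayed; that concern belongs to the runtime level.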

Several subsequent contributions arose from criticism of the Dexter model, and added more complex forms of hypertext organization and more powerful navigation primitives [Garzotto et al. 1993]; time and multimedia synchronization features [Hardman et al. 1994]; and formal semantics of navigation and a structured design process [Isakowitz et al. 1995]. Among these evolutions, HDM [Garzotto et al. 1993] and RMM [Isakowitz et al. 1995] have been particularly influential in the design of hypermedia applications.

HDM and its variants [Schwabe et al. 1992; Garzotto et al. 1993; Garzotto et al. 1991; Garzotto et al. 1993] shifted the focus from hypertext data models as a means to capture the structuring primitives of hypertext systems, to hypertext models as a means for capturing the semantics of a hypermedia application domain. HDM integrates features of the entity relationship model and the Dexter model, to obtain a notation for expressing the main abstractions of a hypermedia application, their internal structure and navigation, and application-wide navigation requirements. Web structure is expressed by means of entities, substructured into a tree of components. Navigation can be internal to entities (along part-of links), cross-entity (along generalized links), or noncontextual (using access indexes, called collections [Garzotto et al. 1994]).

RMM (Relationship Management Methodology) [Isakowitz et al. 1995] evolves HDM by embedding its hypermedia design concepts into a structured methodology, splitting the development process into seven distinct steps and giving guidelines for the tasks. RMM’s data model (called RMDM) structures domain entities into slices, and organizes navigation within and across entities using associative relationships and structural links.
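As a rough illustration of these notions, the following Python sketch groups the attributes of an entity into slices, connects slices of the same entity with structural links, and connects entities with associative relationships. The encoding, the class and function names, and the Faculty/Course example are assumptions and do not reproduce the RMDM notation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Entity:
    name: str
    slices: Dict[str, List[str]]  # slice name -> attribute names
    structural_links: List[Tuple[str, str]] = field(default_factory=list)

@dataclass
class Relationship:
    """Associative relationship enabling navigation across entities."""
    name: str
    source: str
    target: str

def navigation_options(entity, current_slice, relationships):
    """List where a reader can go from one slice of an entity."""
    within = [dst for src, dst in entity.structural_links if src == current_slice]
    across = [r.target for r in relationships if r.source == entity.name]
    return {"within": within, "across": across}

faculty = Entity(
    name="Faculty",
    slices={"general": ["name", "rank"], "biography": ["name", "bio"]},
    structural_links=[("general", "biography"), ("biography", "general")],
)
teaches = Relationship("teaches", source="Faculty", target="Course")
options = navigation_options(faculty, "general", [teaches])
```

The point of the split is that navigation inside an entity and navigation between entities are described by different constructs and can be designed separately.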

Most of the recent proposals for Web modeling are built on top of the entity relationship model and hypermedia design models (notably HDM and RMM), adapted to the specificity of the Web context: Araneus and WebArchitect draw from RMM and the entity-relationship model, while Autoweb and W3I3 have proposed a Web-specific evolution of concepts first proposed by HDM.

Finally, another source of inspiration to Web modeling comes from research on the representation and querying of semistructured data, i.e., data with partial or missing schema. The proposed data models, thoroughly reviewed in Florescu et al. [1998], express Web content by means of relations, labeled graphs, hypertrees, and logic. Araneus and Strudel are examples of Web content management systems based on semistructured data models and query languages.

14.2 Processes

Web development processes evolved in parallel with Web design notation, and have the same hybrid origin from the information system and hypermedia fields.

In hypermedia, the evolution from the creative definition of content to the content-independent organization of the structure of a hypermedia application is attributed to the work on HDM [Garzotto et al. 1993], which stresses the difference between authoring in the large, i.e., designing general structure and navigation, and authoring in the small, i.e., deciding the layout and synchronization aspects of specific component types.

HDM, however, did not prescribe a formal development lifecycle, which was first advocated by RMM [Isakowitz et al. 1995]. RMM proposes seven activities: entity relationship design, slice design, navigation design, conversion protocol design, interface design, behavior design, and implementation and testing. The first three activities provide a conceptualization of the hypermedia application domain in terms of entities, substructured into slices, and navigable relationships. Conversion protocol design is a technical activity that defines the transformations to be used for mapping the conceptual schema into implementation structures. In addition to defining the development lifecycle, RMM also gives guidelines for slice and navigation design, the two tasks most particular to hypermedia design.

OOHDM [Schwabe and Rossi 1995] takes inspiration from object-oriented modeling and simplifies the RMM lifecycle to only four steps: domain analysis, navigation design, abstract interface design, and implementation. In domain analysis, classical object-oriented techniques are used instead of the entity relationship model. Navigation design adds specific classes (e.g., node, link, index) to represent different forms of navigation. The same is done for presentation, which is described by means of classes (e.g., button, text field) added during interface design. Implementation then fleshes out the classes identified during design with code in the implementation language of choice.

In the Web context, most methodological proposals concentrate on visual design and usability criteria [Sano 1996], much as, in the hypermedia field, authoring guidelines were the first concern of development. The requirement of application scale-up drives the most recent contributions, like Araneus, Autoweb, WebArchitect, and W3I3, which organize the development process into activities drawn both from the above-mentioned hypermedia methodologies and from traditional object-oriented and database design methods.

14.3 Other Design Tools

Besides the projects described in Section 13, other research efforts have contributed to the development of prototypes of hypermedia and Web design tools.

RMC [Diaz et al. 1995] gives CASE support to the design and implementation phases of the RMM methodology. It consists of a diagram editor, which assists the input of RMDM schemas, and a code generator, which outputs HTML pages from data stored in an internal repository. Pages are produced offline in a precompiled fashion. RMC is not based on a database architecture, and thus does not provide facilities for scaling up and updating the information base.

HSDL [Kesseler 1995] has an architecture similar to RMC, but starts from an HDM-like design model, augmented to support a fine-grain personalization of navigation and presentation semantics. HSDL objects are annotated by means of programs in a scripting language, called expanders, which drive the production of HTML pages. Expanders can be attached both to schema elements and to instances, to provide uniformity and the handling of exceptions at the same time. Expander programming, although powerful, is a low-level task that resembles the programming of page templates in Web-DBPL integrators. The difference is that HSDL, like RMC, does not rely on database technology to store the application content, but has an internal repository.
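The interplay between schema-level and instance-level expanders can be sketched as follows. This Python fragment does not use HSDL's scripting language; the function names, the dictionaries, and the rule that an instance-level expander takes precedence over the schema-level one are assumptions made for illustration.

```python
def default_paper_expander(obj):
    """Schema-level expander: uniform rendering for every Paper object."""
    return f"<h1>{obj['title']}</h1>"

def featured_expander(obj):
    """Instance-level expander: an exception for one particular object."""
    return f'<h1 class="featured">{obj["title"]}</h1>'

SCHEMA_EXPANDERS = {"Paper": default_paper_expander}
INSTANCE_EXPANDERS = {"p1": featured_expander}

def expand(obj):
    """Prefer an instance-level expander, else fall back to the schema's."""
    expander = INSTANCE_EXPANDERS.get(obj["id"]) or SCHEMA_EXPANDERS[obj["type"]]
    return expander(obj)

objects = [
    {"id": "p1", "type": "Paper", "title": "HDM"},
    {"id": "p2", "type": "Paper", "title": "RMM"},
]
pages = {o["id"]: expand(o) for o in objects}
```

Here `p2` receives the uniform treatment defined for its type, while `p1` is rendered by its own expander, which is one way to obtain uniformity and exceptions together.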

15. CONCLUSIONS

Although the present situation in the market of Web development tools is characterized by a substantial overlap between the different classes of products (despite some difference in the level of maturity), an attempt can be made to identify a plausible choice, based on the application requirements. Table X summarizes the adequacy of each product category to a specific kind of application:

—Small-scale business-to-user applications: this category includes such applications as companies’ points of presence, training and education, infocenters, etc. Last-generation visual editors and site managers (Category 1) seem to be the appropriate solution, because they ensure the high level of visual accuracy and customization necessary for applications targeted to the general public, coupled with productivity tools, substantially reducing the development effort. This role could be undermined in the near future by hypermedia authoring tools (Category 2), which share the same focus on presentation quality and are even more effective in the design of navigational and multimedia interfaces. The current limit to the applicability of both these classes of solutions is the size and volatility of the information base; if this has to be kept in a database, then presently these tools do not provide adequate means to integrate databases and the Web for design, implementation, and maintenance. In this case, multiparadigm tools (Category 5) may be a better choice.

—Intrabusiness, or business-to-business applications: this category comprises all legacy information systems and EDI applications, and is characterized by a different kind of user, already trained to the transactional and form-based interaction paradigm. In the present situation, Web form editors and database publishing wizards (Category 4) and model-driven generators (Category 6) offer a powerful opportunity for migrating existing applications to Intranets. This technical advantage largely balances the limited exploitation of the communication capabilities of the Web. However, novel intra- and interbusiness applications are emerging (for example, hypermedia for technical documentation and computer-based training), which demand the integration of large masses of data, hypertextual multimedia interfaces, and deployment on the Web. The gradual introduction of these applications may promote the evolution of the interaction paradigm for conventional information systems as well.

Table X. A Match Between Categories of Web Development Tools and Types of Applications

Tool categories (columns): Visual editors; Hypermedia tools; Web-DBPL integrators; Form editors; Multiparadigm tools; Model-driven generators

Application types (rows) and their marks:
Small-scale business to customer: X X X
Business to business: X X X X
Large-scale business to customer: X X

—Large-scale business-to-user applications: this is the most challenging area, comprising such applications as electronic commerce, virtual libraries, and all sorts of Internet services. Presently, it seems that no specific product or class of products is fully addressing the analysis, design, implementation, and evolution of these kinds of applications, which require the same communication paradigm and interface quality as small-scale user-oriented applications, and the same performance and scalability as client-server database applications. In this scenario, neutral implementation-oriented products like Web-DBPL integrators (Category 3) and flexible, multiparadigm tools (Category 5) seem the most adequate choice, although development and maintenance with these tools still require a substantial coding effort. We are convinced that the best results can be obtained from these tools by using them in conjunction with a model-driven development approach, based on design notations and processes like the ones described in the previous section; after the supporting database has been designed, visual page designers and Web-DBPL integrators can be used to generate the application’s physical pages.

As several research projects demonstrate, large-scale Web applications have more dimensions than the mere structure of the underlying database, prompting a refocusing of the existing conceptual models and software development processes to achieve the level of maturity of a consolidated software field.

APPENDIX

LIST OF URLS OF REVIEWED PRODUCTS (ALPHABETIC ORDER)

1. Access97, Microsoft, http://www.microsoft.com/access/
2. Altavista Search, Digital, http://www.altavista.software.digital.com/search/index.htm
3. ASP, Microsoft, http://www.microsoft.com/iis/LearnAboutIIS/ActiveServer/default.asp
4. Authorware, Macromedia, http://www.macromedia.com/software/authorware
5. Backstage Designer, Macromedia, http://www.macromedia.com/software/backstage
6. Cold Fusion, Allaire Inc., http://www.allaire.com/products/ColdFusion/31
7. Crystal Report Print Engine, Seagate, http://www.crystalinc.com/crystalreports
8. Delphi Client/Server Suite, Inprise, http://www.inprise.com/delphi
9. Data Director 1.0, Informix, http://www.informix.com/products/tools/datadir
10. Designer 2000, Oracle, http://www.oracle.com/products/tools/des2k/collateral/wwwgen.pdf
11. Developer 2000, Oracle, http://www.oracle.com/products/tools/dev2k/index.html
12. Director, Macromedia, http://www.macromedia.com/director
13. Domino Designer R5, Lotus, http://www.lotus.com/home.nsf/tabs/r5preview4
14. Drumbeat 2.0, Elemental Software, http://www.drumbeat.com
15. Excite for Web Servers, Excite, http://www.excite.com/navigate/home.html
16. Formula Graphics97, Formula Graphics, http://www.formulagraphics.com
17. FrontPage98, Microsoft, http://www.microsoft.com/frontpage
18. Fusion, NetObjects Inc., http://www.netobjects.com/html/nof.html
19. HahtSite, HAHT Software, http://www.haht.com
20. Harvest, Harvest, http://harvest.transarc.com
21. Home Page, Claris, http://www.claris.com/products/claris/clarispage/clarispage.html
22. HotMetal Pro, SoftQuad, http://www.softquad.com/products/hotmetal
23. Iconauthor, Aimtech, http://www.aimtech.com/iconauthor


24. IDC, Microsoft, http://www.microsoft.com/msoffice/developer/access/articles/itk/Idcfaq.htm
25. IntraBuilder, Borland, http://www.borland.com/intrabuilder
26. Jaguar, Sybase, http://www.sybase.com/products/jaguar
27. LiveLink Search and Spider, OpenText, http://www.opentext.com/livelink
28. NetDynamics 4.0, NetDynamics, www.netdynamics.com/product/overview/nd400overview.html
29. Oracle 8i, Oracle, http://www.oracle.com/products/oracle8i
30. Perspecta, Perspecta, http://www.perspecta.com
31. Page Mill/Site Mill, Adobe, http://www.adobe.com
32. PL/SQL Web Development Toolkit, Oracle, http://www.oracle.com
33. PowerBuilder, Sybase, http://www.sybase.com/products/powerbuilder
34. Quest, Allen Communication, http://www.allencomm.com/p&s/software/quest
35. StoryServer, Vignette, http://www.vignette.com/PressKit/ProductOverview.pdf
36. SuperCede, Asymetrix, http://www.supercede.com/products/supercede/dbdetails.html
37. Teamsite, Interwoven, http://www.interwoven.com
38. Toolbook II Instr., Asymetrix, http://www.asymetrix.com/products/toolbook2/instructor
39. Toolbook Assistant, Asymetrix, http://www.asymetrix.com/products/toolbook2/assistant
40. Verity Search 97, Verity, http://www.verity.com
41. VisiBroker for Java, Inprise, http://www.inprise.com/visibroker
42. Visual Basic 5.0, Microsoft, http://www.microsoft.com/vbasic
43. Visual InterDev, Microsoft, http://www.microsoft.com/vinterdev
44. Visual JS, Netscape, http://developer.netscape.com/library/documentation/visualjs/vjs.html
45. WebCompass, Quarterdeck, http://arachnid.qdeck.com/qdeck/products/wc20
46. Web Data Blade, Illustra/Informix, http://www.informix.com
47. WebObjects, Apple, http://software.apple.com/webobjects

REFERENCES

ATZENI, P., MECCA, G., AND MERIALDO, P. 1998. Design and maintenance of data intensive Web sites. In Proceedings of the Sixth International Conference on Extending Database Technology (Valencia, Spain, Mar.), H.-J. Schek, F. Saltor, I. Ramos, and G. Alonso, Eds. 436–450.

MECCA, G., ATZENI, P., MASCI, A., SINDONI, G., AND MERIALDO, P. 1998. The Araneus Web-based management system. SIGMOD Rec. 27, 2, 544–546.

ATZENI, P., MECCA, G., AND MERIALDO, P. 1997. To weave the Web. In Proceedings of the 23rd International Conference on Very Large Databases (VLDB ’97, Athens, Greece, Aug.), M. Jarke, M. J. Carey, K. R. Dittrich, F. H. Lochovsky, P. Loucopoulos, and M. A. Jeusfeld, Eds. VLDB Endowment, Berkeley, CA, 206–215.

BACHIOCHI, D., BERSTENE, M., CHOUINARD, E., CONLAN, N., DANCHAK, M., FUREY, T., NELIGON, C., AND WAY, D. 1997. Usability studies and designing navigational aids for the World Wide Web. Comput. Netw. ISDN Syst. 29, 8-13, 1489–1496.

BARNES, H. AND GWYER, M. 1996. Designer/2000 Web enabling your applications, Oracle white paper. Tech. Rep. Oracle Corp., Redwood City, CA.

BATINI, C., CERI, S., AND NAVATHE, S. B. 1992. Conceptual Database Design: An Entity-Relationship Approach. Benjamin/Cummings series in computer science. Benjamin-Cummings Publ. Co., Inc., Redwood City, CA.

BENDA, M., Ed. 1998. Internet Architecture. IEEE Internet Comput. 2.

BIGGERSTAFF, T. J. AND PERLIS, A. J., Eds. 1989. Software Reusability: Vol. 1, Concepts and Models. ACM Press, New York, NY.

CERI, S., FRATERNALI, P., PARABOSCHI, S., AND POZZI, G. 1998. Consolidated specification of WWW intelligent information infrastructure (W3I3). Tech. Rep. Dip. di Elettronica e Informazione, Politecnico di Milano, Milan, Italy.

CHANG, P. I. 1998. Inside the Java Web server: An overview of Java Web server 1.0, Java servlets, and the JavaServer architecture. http://java.sun.com/features/1997/aug/jws1.html.

CHEN, P. P. 1976. The entity-relationship model: Toward a unified view of data. ACM Trans. Database Syst. 1, 1, 9–36.

DIAZ, A., ISAKOWITZ, T., MAIORANA, V., AND GILABERT, G. 1995. RMC: A tool to design WWW applications. In Proceedings of the Fourth International Conference on World Wide Web (Boston, MA), 11–14.

FERNÁNDEZ, M., FLORESCU, D., KANG, J., LEVY, A., AND SUCIU, D. 1998. Catching the boat with Strudel: Experiences with a Web-site management system. SIGMOD Rec. 27, 2, 414–425.

FLORESCU, D., LEVY, A., AND MENDELZON, A. 1998. Database techniques for the World-Wide Web: A survey. SIGMOD Rec. 27, 3, 59–74.

FRATERNALI, P. AND PAOLINI, P. 1998. A conceptual model and a tool environment for developing more scalable and dynamic Web applications. In Proceedings of the Sixth International Conference on Extending Database Technology (Valencia, Spain, Mar.), H.-J. Schek, F. Saltor, I. Ramos, and G. Alonso, Eds. 421–435.

GARG, P. K. 1988. Abstraction mechanisms in hypertext. Commun. ACM 31, 7 (July 1988), 862–870.

GARZOTTO, F., MAINETTI, L., AND PAOLINI, P. 1995. Hypermedia design, analysis, and evaluation issues. Commun. ACM 38, 8 (Aug. 1995), 74–86.

GARZOTTO, F., MAINETTI, L., AND PAOLINI, P. 1994. Adding multimedia collections to the Dexter model. In Proceedings of the 1994 ACM Conference on Hypermedia Technology (ECHT’94, Edinburgh, Scotland, Sept. 18–23), I. Ritchie and N. Guimarães, Eds. ACM Press, New York, NY, 70–80.

GARZOTTO, F., PAOLINI, P., AND SCHWABE, D. 1993. HDM—a model-based approach to hypertext application design. ACM Trans. Inf. Syst. 11, 1 (Jan. 1993), 1–26.

GARZOTTO, F. AND MAINETTI, L. 1993. HDM2: Extending the E-R approach to hypermedia application design. In Proceedings of the 12th International Conference on Entity Relationship Approach (ER’93, Dallas, TX), R. Elmasri, V. Kouramajian, and B. Thalheim, Eds. 178–189.

GARZOTTO, F., PAOLINI, P., AND SCHWABE, D. 1991. HDM—a model for the design of hypertext applications. In Proceedings of the 3rd Annual ACM Conference on Hypertext (San Antonio, TX, Dec. 15–18), J. J. Leggett, Ed. ACM Press, New York, NY, 313–328.

GWYER, M. 1996. Oracle Designer/2000 Web-Server generator (vers. 1.3.2). Oracle Corp., Redwood City, CA.

HALASZ, F. AND SCHWARTZ, M. 1994. The Dexter hypertext reference model. Commun. ACM 37, 2 (Feb. 1994), 30–39.

HALASZ, F. G. 1988. Reflections on NoteCards: Seven issues for the next generation of hypermedia systems. Commun. ACM 31, 7 (July 1988), 836–852.

HARDMAN, L., BULTERMAN, D. C. A., AND VAN ROSSUM, G. 1994. The Amsterdam hypermedia model: Adding time and context to the Dexter model. Commun. ACM 37, 2 (Feb. 1994), 50–62.

HORTON, W., TAYLOR, L., IGNACIO, A., AND HOFT, N. L. 1996. The Web Page Design Cookbook: All the Ingredients You Need to Create 5-Star Web Pages. John Wiley and Sons, Inc., New York, NY.

HOVEN, I. V. 1997. Deploying Developer/2000 applications on the Web, Oracle white paper. Oracle Corp., Redwood City, CA.

HYPERWAVE INFORMATION MANAGEMENT, 1998. Hyperwave User’s Guide, Version 4.0. Hyperwave Information Management.

ISAKOWITZ, T., STOHR, E. A., AND BALASUBRAMANIAN, P. 1995. RMM: a methodology for structured hypermedia design. Commun. ACM 38, 8 (Aug. 1995), 34–44.

JACOB, R. J. K. 1983. Using formal specifications in the design of a human-computer interface. Commun. ACM 26, 4 (Apr. 1983), 259–264. http://www.eecs.tufts.edu/~jacob/papers/cacm.txt; http://www.eecs.tufts.edu/~jacob/papers/cacm.ps.

JACOBSON, I. 1992. Object-Oriented Software Engineering. ACM Press, New York, NY.

KESSELER, M. 1995. A schema-based approach to HTML authoring. In Proceedings of the Fourth International Conference on The World Wide Web (Boston, MA),

LANGE, A. 1997. Sorting through search engines. Web Tech. M. 2, 6 (June). http://www.webreview.com/97/06/13/webtech/index.html.

MADSEN, K. H. AND AIKEN, P. H. 1993. Experiences using cooperative interactive storyboard prototyping. Commun. ACM 36, 6 (June 1993), 57–64.

MICROSOFT CORPORATION 1996. The distributed common object model. Microsoft Corp., Redmond, WA. http://msdn.microsoft.com/workshop/components/contents.htm.

MILLER, J., SHETH, A., KOCHUT, K., AND PALANISWAMI, D. 1997. The future of web-based workflows. In Proceedings of the International Workshop on Research Directions in Process Technology (Nancy, France).

MONTGOMERY, J. 1997. Distributing components. BYTE 22, 4, 93–98.

MYERS, B. A. 1995. User interface software tools. ACM Trans. Comput. Hum. Interact. 2, 1 (Mar. 1995), 64–103.

MYERS, B., HOLLAN, J., CRUZ, I., BRYSON, S., BULTERMAN, D., CATARCI, T., CITRIN, W., GLINERT, E., GRUDIN, J., AND IOANNIDIS, Y. 1996. Strategic directions in human-computer interaction. ACM Comput. Surv. 28, 4, 794–809.

NANARD, J. AND NANARD, M. 1995. Hypertext design environments and the hypertext design process. Commun. ACM 38, 8 (Aug. 1995), 49–56.

NIELSEN, J. 1990. Hypertext and Hypermedia. Academic Press Prof., Inc., San Diego, CA.

NIELSEN, J. 1996. Computer Science and Engineering Handbook. CRC Press, Inc., Boca Raton, FL.

OBJECT MANAGEMENT GROUP 1998. The common object request broker: Architecture and specification. Ver. 2.0. Tech. Rep. Object Management Group, Framingham, MA. http://www.omg.org/corba/corbiiop.htm.


RUMBAUGH, J., BLAHA, M., PREMERLANI, W., EDDY, F., LORENSEN, B., AND LORENSON, W. 1991. Object Oriented Modeling and Design. Prentice-Hall, Englewood Cliffs, NJ.

SANO, D. 1996. Designing Large-Scale Web Sites: A Visual Design Methodology. John Wiley and Sons, Inc., New York, NY.

SCHWABE, D. AND ROSSI, G. 1995. The object-oriented hypermedia design model. Commun. ACM 38, 8 (Aug. 1995), 45–46.

SCHWABE, D., CALOINI, A., GARZOTTO, F., AND PAOLINI, P. 1992. Hypertext development using a model-based approach. Softw. Pract. Exper. 22, 11 (Nov. 1992), 937–962.

STOTTS, P. D. AND FURUTA, R. 1989. Petri-net-based hypertext: Document structure with browsing semantics. ACM Trans. Inf. Syst. 7, 1 (Jan. 1989), 3–29.

SUN MICROSYSTEMS, 1995. Remote method invocation specification. Tech. Rep. Sun Microsystems, Inc., Mountain View, CA. http://java.sun.com/products/jdk/rmi/index.html.

TAKAHASHI, K. AND LIANG, E. 1997. Analysis and design of Web-based information systems. In Proceedings of the Sixth International Conference on the World Wide Web (Santa Clara, CA, Apr.),

WONG, W. 1998. Team-building on the fly. BYTE 23, 2 (Feb.). http://www.byte.com/art/9802/sec10/art1.htm.

WORLD WIDE WEB CONSORTIUM, 1998. Cascading style sheets: Level 2 specification. Tech. Rep. World Wide Web Consortium. http://w3c.org/TR/REC-CSS2

WORLD WIDE WEB CONSORTIUM, 1998. The document object model (DOM): Level 1 specification. Tech. Rep. World Wide Web Consortium. http://www.w3.org/TR/REC-DOM-Level-1/.

ZHENG, Y. AND PONG, M.-C. 1992. Using statecharts to model hypertext. In Proceedings of the ACM Conference on Hypertext (ECHT ’92, Milan, Italy, Nov. 30–Dec. 4), D. Lucarella, J. Nanard, M. Nanard, and P. Paolini, Eds. ACM Press, New York, NY, 242–250.

Received: March 1998; revised: August 1998; accepted: December 1998
