

SOFTWARE—PRACTICE AND EXPERIENCE
Softw. Pract. Exper. 2004; 34:315–338 (DOI: 10.1002/spe.566)

A ReScUE XML/EDI model

Eric Jui-Lin Lu∗,† and Chang-Chuan Wu

Department of Information Management, Chaoyang University of Technology, 168 Gifeng E. Rd., Wufeng, Taichung County, Taiwan 413, R.O.C.

SUMMARY

Since electronic data interchange (EDI) is one of the most important components in electronic commerce and since Extensible Markup Language (XML) provides Internet developers with a powerful vehicle for exchanging messages, XML/EDI has received much attention from almost all well-known international enterprises and is believed to be the next generation EDI. However, since XML allows developers to design their own elements and attributes, it is almost certain businesses will receive XML documents with unknown elements. Generally, human intervention is required to solve the problem. Therefore, it is indispensable to design an efficient scheme to resolve the unknown elements. In this paper, we propose a ReScUE XML/EDI model such that transformation templates will be generated automatically for documents with unknown elements, and the documents will be converted into expected format. This model increases the flexibility of XML/EDI systems and reduces human intervention. Copyright © 2003 John Wiley & Sons, Ltd.

KEY WORDS: EDI; XML/EDI; electronic commerce; metadata

INTRODUCTION

Electronic data interchange (EDI) is one of the most important components of electronic commerce because EDI helps automate and streamline businesses by eliminating clerical tasks, speeding up information transfers, reducing data errors, and eliminating unnecessary business processes. Until recently, however, EDI has been successfully used only in specific industries such as the automobile industry and some other large enterprises [1–3].

One of the major obstacles to widespread acceptance of EDI is the complexity of establishing a ‘common ground’ [4–6]. In the past, the common ground in EDI has been based on either standard transaction sets or prior agreement on the semantics of the message contents. On one hand, although EDI standards such as UN/EDIFACT and ANSI X.12 exist, the standardization process is very complex and may require months or years to complete [7,8]. This makes standardization inappropriate for today’s rapidly changing business environment. On the other hand, reaching an agreement on the format before communications involves extensive negotiations between trading partners.

∗Correspondence to: Eric Jui-Lin Lu, Department of Information Management, Chaoyang University of Technology, 168 Gifeng E. Rd., Wufeng, Taichung County, Taiwan 413, R.O.C.
†E-mail: [email protected]

Contract/grant sponsor: National Science Council, Taiwan, R.O.C.; contract/grant number: 90-2213-E-324-018

Published online 5 January 2004
Received 21 May 2002
Revised 10 July 2003
Accepted 11 July 2003

Additionally, the need for organizations, such as virtual enterprises or strategic alliances [9], to establish short-term relationships is growing due to the rapid changes in the global business environment. It is neither practical for businesses to wait for the completion of standards nor economically feasible for them to invest extensive human resources in the negotiation of message formats. To resolve these problems, several EDI models [7,10–14] were proposed. Basically, they all require at least two major components: one is the existence of a common data dictionary (e.g. a data repository or ontology), and the other is a process of mapping between terms derived from the common data dictionary.

In 1998, XML (Extensible Markup Language) was officially released by W3C. XML provides Internet developers with a new vehicle for exchanging messages. To take advantage of this new technology, the XML/EDI group proposed an XML/EDI framework [3,15,16]. The XML/EDI framework contains five main components: EDI, XML, Templates, Agent, and Global Repository. In the framework, reusable objects such as templates and agents are stored in a global repository and can be downloaded or referenced later. EDI messages in the XML format are retrieved along with a template (or a reference to a template), and then the message is processed by the receiver’s agent according to the template. The objective of the XML/EDI framework is to provide businesses with a smarter and cheaper system so that they can conduct transactions and reach a broader customer base worldwide. It is believed that the integration of XML and EDI will be the next generation EDI [6,17].

The XML/EDI framework is good for exchanging documents between a server (more specifically, a VAN-like server, because it is very much like a server provided by any VAN service provider) and many small or medium-sized enterprises, since all XML/EDI documents conform to an agreed set of standards. However, since XML allows users to freely design their own XML documents, a VAN-like server may receive XML documents with unknown elements from other VAN-like servers. Additionally, trading partners in different countries may encode their documents in different encodings (such as Big5 encoding in traditional Chinese [18]), which makes data exchanges even more difficult. To resolve this problem, several utilities such as Clio [19], BizTalk’s Mapper [20], and IBM’s XSLerator have been developed. Although these tools provide users with a nice visual interface to generate required transformation templates, human intervention is generally required for documents with unknown elements. So far, however, there are few, if any, proposed XML/EDI models providing an automatic resolution scheme for unknown elements.

In this paper, an XML/EDI model with Resolution Scheme for Unknown Elements (the ReScUE XML/EDI model) is proposed. In the model, each element has corresponding metadata. Each metadata entry contains its associated element name and terms from either pre-agreed terms or some standard data dictionaries such as Basic Semantic Repository (BSR) [8,10], ebXML, BizTalk, etc. Because it is unlikely that all standard dictionaries will be merged into one universal dictionary in the foreseeable future, the design of metadata allows association of one element name to several terms defined in BSR, ebXML, BizTalk, etc. and, thus, provides greater flexibility in the data interchanges. For each trading partner, a mapping file that contains mappings for elements and attributes used at both ends is created. Upon receiving documents, the agent at the receiving end first checks to see if all elements and attributes in the received document can be found in the mapping file. If yes, the document is in the expected format and can be processed further; otherwise, the agent will communicate with the trading partner’s agent to get the metadata and produce mappings for all the unknown elements in the document. After the agent has created mappings for all the unknown elements and saved them in the mapping repositories, a template generator will be invoked, and, based on these mappings, a transformation template will be created automatically. Finally, the document will be converted into the expected format by means of the transformation template. Note that we adopt the term ‘agent’ from the XML/EDI framework. Thus, the agents in the ReScUE model are simply computer programs and do not have to be the agents that are autonomous, intelligent, and mobile [21–23].
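The first check performed by the receiving agent, namely whether every element and attribute name in a received document already appears in the partner's mapping file, can be illustrated with a minimal Python sketch. The function name `unknown_names`, the sample document, and the set of known names are hypothetical and are not taken from the prototype.

```python
import xml.etree.ElementTree as ET

def unknown_names(document_xml, known_names):
    """Return the element and attribute names that have no recorded mapping."""
    root = ET.fromstring(document_xml)
    seen = set()
    for elem in root.iter():
        seen.add(elem.tag)           # element name
        seen.update(elem.attrib)     # attribute names of this element
    return seen - set(known_names)

# Hypothetical received document; 'known' stands in for the names
# recorded in the mapping file of this trading partner.
received = "<PurchaseOrder><Odate>2003/07/11</Odate><Qty unit='box'>3</Qty></PurchaseOrder>"
known = {"PurchaseOrder", "Odate"}

print(sorted(unknown_names(received, known)))  # ['Qty', 'unit']
```

An empty result would mean that the document can be processed directly; any remaining names are the candidates handed to the resolution scheme.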

As stated earlier, although all existing EDI models use one standard dictionary or ontology for data interchanges, the standardization process is very complex and time consuming. Additionally, in practice, many standard dictionaries exist and no one universal dictionary will emerge any time soon. Therefore, instead of relying on only one standard dictionary, the proposed model utilizes as many standard dictionaries as possible so that XML documents with unknown elements can still be interpreted, and human interventions can be minimized. These features make the proposed model superior to, and more flexible than, the other models and especially suitable for enterprises with short-term relationships to interchange messages.

A ReScUE XML/EDI prototype is also developed to study the feasibility of the ReScUE model. In the prototype, all metadata and mapping files are encoded in XML. The messages communicated between agents are defined in the KQML format [24]. The transformation templates are actually XSL Transformation (XSLT) files. Using XSLT allows effective transformation of XML documents from one format into another format.

RELATED WORKS

Traditional EDI utilizes a value-added network (VAN) to exchange standard formatted data between computer systems of trading partners. The most well-known standards include ANSI X.12, which is mainly used in North America, and UN/EDIFACT, which is used in the rest of the world. Generally, to send an EDI message such as a purchase order to a trading partner, the following tasks are performed: (1) the EDI software extracts information from a database and creates a flat file; (2) the flat file is transferred to a computer where the EDI translator resides and is translated into the EDI standard format; (3) the EDI message is transmitted to receivers through a dedicated phone line or a third-party value-added network (VAN) [1,11].

While traditional EDI deserves a lot of credit for improving productivity, it also has several significant shortcomings [1,4,25]. EDI standards are large and complex, thus building an EDI system requires high skills in EDI message formats. The rigid transaction sets are not well suited for today’s rapidly changing business environment because introducing a variation or addition to those standards is a slow and somewhat torturous process. Therefore, the EDI software usually carries a high price tag and needs to keep up with changes in EDI standards. Furthermore, companies transmitting EDI messages through either a dedicated phone line or a VAN have to pay expensive communication charges. These barriers limit wide acceptance of EDI, especially for small and medium-size enterprises (SMEs) [5].

New-EDI

In the past, some researchers have proposed several EDI models that attempt to solve some or all of the problems of traditional EDI mentioned above. A so-called ‘New-EDI’ model was proposed by Steel [8,10]. The flexibility of the New-EDI model is achieved in two parts: one is a common data dictionary, termed Basic Semantic Repository (BSR), and the other is a meta-message standard called ICSDEF (Interchange Structure Definition).

In the New-EDI model, prior to communications, both sender and receiver define their own ICSDEF messages, which describe the structure of the data to be sent or received, respectively, based on standard labels from BSR. Once the sender’s ICSDEF is defined, it is sent to the receiver. Upon receiving the sender’s ICSDEF, the receiver stores the sender’s ICSDEF locally and uses it to decode the incoming EDI data into individual data elements, and then the receiver is ready to re-assemble the data elements into the expected structure according to the receiver’s ICSDEF. Since both the sender’s and receiver’s ICSDEFs are defined using terms from BSR, the conversions between documents can be done automatically without human intervention. Unfortunately, ICSDEF has never been actually carried out and Steel shifted their research to Open-EDI in 1996.
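The decode and re-assemble idea can be pictured with a toy Python sketch. Since no concrete ICSDEF syntax is given here, each ICSDEF is assumed to be nothing more than an ordered list of shared labels, and the record is assumed to be a '+'-delimited string; the labels beginning with `BSR:` are placeholders and not real BSR entries.

```python
def decode(record, sender_icsdef, sep="+"):
    """Split an incoming record into data elements keyed by shared label."""
    values = record.split(sep)
    if len(values) != len(sender_icsdef):
        raise ValueError("record does not match the sender's structure")
    return dict(zip(sender_icsdef, values))

def reassemble(elements, receiver_icsdef, sep="+"):
    """Emit the data elements in the order the receiver expects."""
    return sep.join(elements[label] for label in receiver_icsdef)

# Placeholder structure definitions: same labels, different order.
sender_icsdef = ["BSR:OrderDate", "BSR:BuyerName", "BSR:Quantity"]
receiver_icsdef = ["BSR:BuyerName", "BSR:Quantity", "BSR:OrderDate"]

record = "2003/07/11+ACME+12"
print(reassemble(decode(record, sender_icsdef), receiver_icsdef))
# ACME+12+2003/07/11
```

The conversion needs no human input only because both lists draw their labels from the same dictionary, which is the property the New-EDI model relies on.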

A machine-negotiated ontology-based EDI

In [7], Lehmann proposed a machine-negotiated ontology-based EDI that uses a concept dictionary to store the semantic definition of the meaning of each EDI element. First, both trading partners negotiate by exchanging ontologies and synchronizing their concept dictionaries. When two computer systems reach an agreement on the common terms and concepts, the actual interchanges of transactions take place. When semantic conflicts occur, the computer systems at both sides negotiate until conflicts are resolved. Thus, whether message standards exist or not, there can be little or no human intervention in processing transactions. Unfortunately, the model requires storing a tremendous amount of information about the descriptive definition of terms for negotiation. Additionally, the process of negotiation is very complex and extremely difficult, if not impossible, to implement.

Distributed EDI models

Adam et al. [11] proposed a simple and straightforward model based on the concept of distributed databases. In the model, the senders construct EDI messages directly from their own databases, and the messages are then transmitted to the receivers. Upon receiving the messages, the receivers convert the messages directly into databases according to their profiles. Adam’s approach allows trading partners to exchange EDI messages in a database-to-database manner rather than the traditional application-to-application manner and thus to do business without requiring a tedious standardization process. To resolve semantic conflicts of EDI messages, Adam et al. assume the existence of a domain-specific ontology such as the one proposed by Lehmann, which makes this model suffer from the same problems discussed previously. Also, since the messages are directly extracted from and saved to databases, Adam’s approach highly depends on the operating environment and thus lacks the flexibility of [12].

In [12], Lu and Hwang utilized a layer of middleware to retrieve data from or save data to legacy systems or databases. For easy integration, Lu and Hwang’s model employs distributed object techniques such as CORBA and DCOM [26]. In the model, since the middleware communicates with the functions of legacy systems, instead of the databases, Lu and Hwang’s approach provides greater flexibility and ease of introducing EDI into a business that has its own business applications and thus reduces the entrance cost. However, Lu and Hwang’s model uses EDI standards as the format of exchanging messages and requires human interventions in the generation of the mapping tuples.

Open-EDI

The Open-EDI model is the foundation of the future architecture of CEFACT and X12. Other than the agreement of using a particular ‘scenario’, the Open-EDI model can exchange data without requiring prior agreement [13,27]. The scenarios specify the objects and methods to be used in an exchange and are stored in a repository. When a data exchange is to be performed, the Open-EDI software would obtain the appropriate scenario description from the repository. The Open-EDI software would then execute the exchange using the objects and methods specified in the scenario.

The Open-EDI uses Business Operational View (BOV) to describe the business aspects of the exchanges and Functional Service View (FSV) to describe the technical implementation of the exchanges. For example, in catalog purchasing, objects (such as purchase orders and sales orders) and various defined methods for buying and selling are described in the BOV. The FSV specifies details such as the EDI standard, the communications protocol, and the particular encryption method to be used. Sending a purchase order automatically invokes the objects and methods that have been predefined in the purchasing scenario.

One major drawback of the Open-EDI model is in the complexity of business process analysis. Additionally, since the FSV implementation details are left to software vendors, this may lead to situations where different implementations of the same scenario are not interoperable [27].

A context information exchanging model

An agent model was proposed by Lee [14] to exchange context information. The context information of a piece of data is defined as the metadata relating to its meaning, properties, and organizational knowledge [28]. For example, consider a trade price of a stock instrument which might be reported as 101.25. The context of this number includes information such as its currency, source, index status, latest sale price, precision, and accuracy. The focus of Lee’s work is to ensure that the values received are in the expected context and, thus, any exchanged message is constructed as a semantic value, encoded in XML, which contains a simple value and its associated context. When an agent sends out its request to a supplier, the request will go to a context agent that in turn transmits the request to the supplier and receives its corresponding response. Since the context specified in the response is generally different from the context specified in the request, the context agent converts simple values in the supplier’s context to the requester’s context and sends them back to the requester. To make conversion possible, all context information must be saved in an ontology server. However, Lee’s paper has little discussion of issues concerning the maintenance of the ontology server, such as ‘is there a standard format for all context information?’, ‘how to synchronize the context information among ontology servers?’, etc.
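A semantic value and its conversion between contexts might be sketched in Python as follows. This is only an illustration of the idea: the class `SemanticValue`, the `RATES` table standing in for the ontology server, the two context keys, and the conversion factor are all assumptions, and the factor is a made-up number rather than a real exchange rate.

```python
from dataclasses import dataclass
from decimal import Decimal

@dataclass
class SemanticValue:
    value: Decimal   # the simple value, e.g. 101.25
    context: dict    # its context, here only currency and precision

# Stand-in for the ontology server; the factor is invented for illustration.
RATES = {("USD", "TWD"): Decimal("30")}

def convert(sv, target_context):
    """Re-express a semantic value in the requester's context."""
    src, dst = sv.context["currency"], target_context["currency"]
    factor = Decimal(1) if src == dst else RATES[(src, dst)]
    # precision 2 -> quantum 0.01, precision 1 -> quantum 0.1, ...
    quantum = Decimal(1).scaleb(-target_context["precision"])
    return SemanticValue((sv.value * factor).quantize(quantum), dict(target_context))

supplier_value = SemanticValue(Decimal("101.25"), {"currency": "USD", "precision": 2})
requester_context = {"currency": "TWD", "precision": 1}
print(convert(supplier_value, requester_context).value)  # 3037.5
```

In this picture the context agent is simply the caller of `convert`, and the open maintenance questions noted above concern how a table such as `RATES` would be standardized and kept synchronized.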

The XML/EDI framework

XML, officially released by the W3C in 1998, is easily comprehensible to anyone who understands HTML [29]. XML is also a meta language that allows developers to design their own customized documents such as purchase orders. By combining the power of both XML and EDI, the XML/EDI group proposed an XML/EDI framework [3,15] in the summer of 1997 for building a smarter and cheaper system which allows users to exchange business documents in XML with global trading partners over the Internet. In addition to XML and EDI, the XML/EDI framework contains three other main components that are shown in Figure 1: Templates, Agents, and Global Repositories.

Figure 1. The XML/EDI framework. (Diagram labels: Client, Server, XML/EDI Documents, Agent, Templates Repository, Global Repository, Legacy System, ORACLE SQL.)

1. Templates. Templates are basically processing rules that instruct agents how to manipulate the received XML/EDI documents. Generally, templates are written in XSLT and are either globally referenced or transmitted along with XML documents. Since templates can be defined by developers, this gives developers more flexibility in system development and frees them from waiting for completion of complex and time-consuming standardization processes.

2. Agents. Agents are computer programs that are used to accomplish the required tasks. For example, agents can process XML/EDI messages in accordance with the instructions defined in templates, transform traditional EDI messages and XML/EDI messages back and forth, access the backend database systems, etc. Note that, as illustrated, agents in the XML/EDI framework might not be like the powerful agents that are autonomous, intelligent, and mobile [21–23].

3. Global repository. Since XML allows developers to design their own tags, many worry that each organization will design its own tags and their corresponding semantics as well as develop its own DTDs and programs. These impede interoperability and result in higher transaction and development costs. Therefore, a global repository is needed for both storing all reusable entities such as glossary, templates, programs, DTDs etc. and providing effective lookup services [16].

A typical XML/EDI document exchange procedure is as follows. When an XML/EDI message is received, the document is validated and then processed based on the rules defined in its associated template. The template might be either sent along with the document or referenced in the local or the global repository. Additionally, the agent might save the document into the backend database systems for further processing.

According to the previous discussions, it is obvious that the XML/EDI framework is good for exchanging documents between a VAN-like server and many small or medium-sized enterprises since all XML/EDI documents conform to an agreed set of standards. However, the framework is not suitable for document exchanges among VAN-like servers because they have in general not agreed on a standard set. Additionally, with the current development of BizTalk, ebXML, RosettaNet, Commerce XML, BizCodes, etc., it is unlikely that a universal XML/EDI standard will emerge any time soon. Therefore, it is indispensable to design and implement an XML/EDI system such that the problem of XML documents with unknown elements can be resolved and human intervention can be minimized. To interpret unknown elements, transformation templates should be generated automatically, and the received documents should be converted based on the transformation templates into an expected format.

Our observations

From the EDI models discussed, the following facts were observed.

1. There are three major approaches for building a common ground for EDI: a standard transaction set such as UN/EDIFACT and ANSI X.12, a standard structural definition such as ICSDEF and DTD, and a prior agreement on message formats. Standard structural definitions define the structure of the messages exchanged, but, unfortunately, the semantics is not included. Semantic mismatches may take place if trading partners use different DTDs and thus it is difficult to process documents with unknown elements. For example, the element name at the sender side is ‘Unit Price’, but it may be called ‘Uprice’ at the receiver side.

2. The semantic conflicts of elements can be resolved by the following methods: standardization, ontology-based reasoning, and referencing data dictionaries. The first method is rigid, less flexible, and time consuming, while ontology-based reasoning is usually complex and extremely difficult to implement. Thus, the former two do not seem to be suitable for developing low-cost EDI systems. As a result, referencing data dictionaries seems like an effective approach to maintain consistency in the semantics of elements.

3. The information describing elements such as context information or metadata can aid the processing of transactions.

THE ReScUE XML/EDI MODEL

The ReScUE XML/EDI model, as shown in Figure 2, consists of several key components which are described as follows.

Figure 2. An XML/EDI model. (Diagram labels: Internet, XML/EDI Message, Control Agent, Search Manager, Communication Manager, Template Generator, Templates Repository, Metadata Base, Mapping Repository, Legacy System, ORACLE SQL, and the standard dictionaries EDIFACT, X12, BSR, Bizcodes, Other Standards.)

1. Template repository. The template repository is composed of reusable templates for document presentation and transformation. There are two types of templates. Transformation templates are generated by the template generator automatically and used to convert the received EDI documents into a pre-defined format. These formatted EDI documents can then be converted into other presentation formats such as HTML or WML by layout templates. These templates may be retrieved from global repositories, created by developers, or generated by this model.

2. Metadata base. The metadata base contains a collection of metadata files. In the metadata base, it is ensured that, for every XML/EDI document used in an enterprise, there is a corresponding metadata file. Each metadata file includes the semantics of all the elements used in a document. Additionally, each metadata base must have a system metadata file that provides links between documents and their associated metadata files. To achieve semantic consistency, terms defined in standard data dictionaries such as BSR [8] and X.12/EDIFACT can be included in the metadata files for reference. For example, in Figure 4(a), an element type ‘Odate’ is defined in a metadata file ‘PO.xml’ and it is also a ‘2005=4’ defined in UN/EDIFACT. Also, <PurchaseOrder> defined in the system metadata file is actually an ‘ORDERS’ defined in UN/EDIFACT or an ‘850’ in X.12. Note that, with this design, an element type can easily be associated with several data dictionaries and be encoded in other encodings such as Big5 into the metadata files. Additionally, if an element value requires a special format, the required format can also be defined in the metadata file. For example, not only is the ‘Odate’ element defined as ‘2005=4’ in UN/EDIFACT, but also the format of ‘Odate’ is defined as (yyyy/MM/dd).

3. Mapping repository. The mapping repository consists of a collection of mapping files. For each trading partner, there is a corresponding mapping file in the repository. In the mapping file, all elements used in the company and their corresponding elements received from the trading partner are recorded. For example, the element ‘OrderList’ is used in the company and the element ‘PurchaseOrder’ is received from the trading partner. After exchanging metadata such as the one shown in Figure 4, it is realized that both elements are actually an ‘ORDERS’ in UN/EDIFACT. Therefore, a mapping such as the first <Mapping> element shown in Figure 5 is generated. It is assumed that the element name used in an enterprise is unique. With the design of mapping files, if a new document is created by a trading partner and a large percentage of element names used in the new document is recorded in the mapping file, the time required for resolving unknown elements can be significantly decreased without having to scan the whole metadata base.

4. Control agent. In the ReScUE model, the following modules are included and these modules are coordinated by a control agent.

• Lookup manager. When a document is received, the control agent will invoke the lookup manager to verify whether or not the document’s template exists in the template repository. If yes, the lookup manager returns the template; otherwise, the agent will further ask the lookup manager to search for the mappings of all the elements in the received document. If the requested mappings can be found in the mapping repository, the mapping is returned and reused; otherwise, the agent will attempt to generate new mappings based on the resolution scheme to be described later.

• Communication manager. If the requested templates or mappings cannot be found, the communication managers at both sides are responsible for exchanging the elements’ metadata. During communications, the communication managers at both ends need an agreed communication protocol and a message format to exchange metadata. Either KQML [24] or SOAP [20] can be employed.

• Template generator. According to the newly generated mappings in the mapping repository, the template generator is invoked to create a transformation template for converting the received document into a pre-defined format and to store the template in the template repository for later use.

Figure 3. A resolution scheme for unknown elements. (Flowchart: read an element and search for its mapping; if a mapping exists, reuse it; otherwise communicate with the trading partner and, if common data dictionaries are found, create and save a mapping, else notify developers; repeat until the end of file, then generate a new template.)
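The job of the template generator can be sketched in Python with the standard library alone. The sketch below is not the prototype's generator: the helper `xsl`, the function `generate_template`, and the choice of a dictionary keyed by the received element name with the locally expected name as value are assumptions made for illustration, and conversion of value formats is deliberately omitted.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"
ET.register_namespace("xsl", XSL_NS)

def xsl(tag):
    """Qualified name of an XSLT instruction element."""
    return f"{{{XSL_NS}}}{tag}"

def generate_template(mappings):
    """Build an XSLT 1.0 stylesheet that renames received elements."""
    sheet = ET.Element(xsl("stylesheet"), {"version": "1.0"})
    # Identity rule: copy anything that has no explicit mapping.
    identity = ET.SubElement(sheet, xsl("template"), {"match": "@*|node()"})
    copy = ET.SubElement(identity, xsl("copy"))
    ET.SubElement(copy, xsl("apply-templates"), {"select": "@*|node()"})
    # One renaming rule per mapping (received name -> expected name).
    for received, expected in mappings.items():
        rule = ET.SubElement(sheet, xsl("template"), {"match": received})
        out = ET.SubElement(rule, expected)
        ET.SubElement(out, xsl("apply-templates"), {"select": "@*|node()"})
    return ET.tostring(sheet, encoding="unicode")

print(generate_template({"PurchaseOrder": "OrderList", "Odate": "Date"}))
```

The element-name rules take precedence over the identity rule under the default XSLT priorities, so unmapped content passes through unchanged. The Python standard library has no XSLT processor, so applying the generated stylesheet is left to an external engine.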

A resolution scheme for unknown elements

The resolution scheme for unknown elements is pictured in Figure 3. When an unknown element isreceived, the agent will search for its mapping such as the one shown in Figure 5. If a mapping is found,the mapping is reused; otherwise, the agent will attempt to generate a new mapping by exchanging themetadata of the unknown element with its trading partner. If both parties agree on the semantics ofthe unknown element, the unknown element is resolved and a new mapping is created; otherwise, the

Copyright c© 2003 John Wiley & Sons, Ltd. Softw. Pract. Exper. 2004; 34:315–338

Page 10: A ReScUE XML/EDI model

324 E. J.-L. LU AND C.-C. WU

(a) Sender

System Metadata

    <Form>
        <Form_name>PurchaseOrder</Form_name>
        <common standard="EDIFACT" version="95A">0065=ORDERS</common>
        <common standard="X12" version="3050">850</common>
        <metadata_name>PO.xml</metadata_name>
    </Form>
    ...

PO.XML

    <element>
        <tagname>Odate</tagname>
        <common standard="EDIFACT" version="95A">2005=4</common>
        <datatype>datetime</datatype>
        <format>yyyy/MM/dd</format>
        <necessary/>
    </element>
    ...

(b) Receiver

System Metadata

    <Form>
        <Form_name>OrderList</Form_name>
        <common standard="EDIFACT" version="95A">0065=ORDERS</common>
        <metadata_name>OL.xml</metadata_name>
    </Form>
    ...

OL.XML

    <element>
        <tagname>Date</tagname>
        <common standard="EDIFACT" version="95A">2005=4</common>
        <datatype>datetime</datatype>
        <format>MMddyyyy</format>
        <necessary/>
    </element>
    ...

Figure 4. Example metadata.

unknown element will be left for developers to define its metadata. The above process will be repeated until all unknown elements are resolved.

For example, upon receiving a document, the receiver’s agent will first examine what kind of document it is. Suppose that the receiver’s agent reads the root element ‘<PurchaseOrder>’, which is unknown, and cannot find its mapping in the mapping repository. The receiver’s agent then asks the sender’s agent to identify what kind of document it is. After reading the sender’s metadata file (Figure 4(a)), the sender’s agent replies that ‘<PurchaseOrder>’ is actually an ‘ORDERS’ message defined in UN/EDIFACT. Since ‘ORDERS’ is also defined at the receiver side (Figure 4(b)), a new mapping, as shown in Figure 5, describing the relationship between ‘<PurchaseOrder>’ and ‘<OrderList>’, will be generated. Following the above procedure, all the unknown elements in the document can be resolved.

In the ReScUE scheme, when the unknown elements cannot be resolved, they will be left for developers to define their own metadata and add them to the metadata files manually. Since most elements used in business transactions are most likely defined in certain data dictionaries, and since the metadata allow inclusion of as many data dictionaries as possible for each element, human intervention will decrease gradually. Besides, for businesses with short-term relationships, the resolution scheme provides a feasible and flexible solution to set up a channel for transactions.


<Mapping>

<Tagname>OrderList</Tagname>

<To>PurchaseOrder</To>

</Mapping>

<Mapping>

<Tagname>Date</Tagname>

<To>Odate</To>

<Format>

<DataType>datetime</DataType>

<From>MMddyyyy</From>

<To>yyyy/MM/dd</To>

</Format>

</Mapping>

Figure 5. Example mappings.
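The matching step of the resolution scheme can be sketched in Java as follows. This is a minimal illustration rather than the prototype's code: the class `CommonGroundResolver`, its methods, and the string encoding of a `common` entry are assumptions made for the example. It only shows the idea that an unknown tag is resolved when its partner-supplied metadata shares a data-dictionary code with a locally defined tag.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: resolve an unknown tag through shared data-dictionary codes. */
final class CommonGroundResolver {

    /** Builds a comparable key for one <common> entry (standard, version, code). */
    static String common(String standard, String version, String code) {
        return standard + "|" + version + "|" + code;
    }

    /**
     * Returns the local tag whose metadata shares a <common> entry with the
     * entries reported by the trading partner, or null if nothing matches and
     * the element has to be left for a developer to define.
     */
    static String resolve(List<String> remoteCommons,
                          Map<String, List<String>> localMetadata) {
        for (Map.Entry<String, List<String>> local : localMetadata.entrySet()) {
            for (String entry : local.getValue()) {
                if (remoteCommons.contains(entry)) {
                    return local.getKey();
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Local metadata modeled on Figure 4: tag name -> its <common> entries.
        Map<String, List<String>> local = new LinkedHashMap<>();
        local.put("Odate", Arrays.asList(common("EDIFACT", "95A", "2005=4")));

        // What the trading partner reports about its 'Date' element.
        List<String> remote = Arrays.asList(common("EDIFACT", "95A", "2005=4"));

        System.out.println("Date -> " + resolve(remote, local));
    }
}
```

A successful match would then be recorded as a `Mapping` entry like those in Figure 5, so that later documents from the same partner reuse it without another metadata exchange.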

Analysis and discussion

In this section, the proposed scheme is further analyzed in the following aspects.

1. Less human intervention and high flexibility on document processing. The ReScUE model allows businesses to both design their own XML/EDI documents and exchange documents with various document schemas. While providing high flexibility on document processing, human intervention will be reduced gradually by establishing complete metadata containing terms from standard data dictionaries.

2. Resolve unknown elements with multiple standards. Any communication between two parties requires at least one common ground. However, it is unlikely that there will be one universal standard in the foreseeable future. Even if there is one standard finally developed in the future, its standardization process will be at least as complex as the current ones. Therefore, instead of creating one big standard, one of the major goals of this paper is to design a flexible model such that all existing standards can be utilized to resolve unknown elements.

3. Lower barrier for trading partners with short-term relationships. Due to the popularity of the Internet, a tremendous number of transactions are conducted through the Internet, and this results in a large number of trading partners with short-term relationships. It is neither practical for these businesses to wait for the completion of standards nor economically feasible for them to invest extensive human resources in the negotiation of standards [2]. The ReScUE model allows users to rapidly set up a channel for transactions with new trading partners. The proposed model, thus, is best suited for trading partners with short-term relationships.

4. Multilingual support. As stated earlier, an element can be easily associated with several terms defined in standard dictionaries. Also, with this scheme, an element, say ‘PO’, can be associated with any purchase order encoded in other languages such as Japanese, French, or Chinese. This feature is especially important in contemporary global commerce.

5. Miscellaneous conversions. The proposed model is also capable of allowing miscellaneous conversions. For example, one system may have all data of type ‘date’ in the form of ‘yyyy/MM/dd’ (assuming ‘y’ is the year, ‘M’ is the month, and ‘d’ is the day). When receiving an order with a date format of ‘MMddyyyy’, the proposed model is able to convert these data into a ‘yyyy/MM/dd’ format using the definition in the metadata base and the mappings generated from the scheme as shown in Figures 4 and 5.

DESIGN AND IMPLEMENTATION

Recently, several schema languages, such as XML Schema [30], XDR, and eDTD [31], have been proposed in an attempt to replace the DTD. These schema languages contain richer data types and support more functions (such as maximum or minimum values) than the DTD. However, it is believed that the DTD will not be phased out soon because it has a simple syntax and has been widely used since SGML was introduced. Therefore, although the proposed model will work with any schema language, the DTD is selected for implementation. The design of each component in our prototype is described in the following sections.

Template repository

The template repository consists of templates, and each template is actually an XSLT file. In the prototype, all XSLT template files are stored in one directory.

Metadata base

The ReScUE model imposes no restrictions on the format of the metadata files in the metadata base as long as the agent can read and extract values from them. In the prototype, the metadata files are designed in XML format and stored in one directory. The DTD of the metadata files is defined and shown in Figure 6. In the DTD, ‘tagname’ represents the tag name of the exchanged element, ‘common’ lists common standards the element refers to, ‘datatype’ and ‘format’ represent the data type and the format of the element, and ‘necessary’ indicates whether or not the element is mandatory. These metadata files provide fundamental information for agents to exchange and generate mappings. Since ‘common’ has a quantifier of ‘+’, it allows a tag to represent terms from more than one standard data dictionary. For example, a tag ‘PurchaseOrder’ is defined so that it represents both ‘0065=ORDERS’ in UN/EDIFACT and ‘850’ in X.12. Note that it is assumed that standard values exist for the attributes ‘standard’ and ‘version’.

Also, a system metadata file is required to provide links between each business document type and its corresponding metadata file, DTD file, XSLT file, and date of last modification. For example, as shown in Figure 7, ‘PO.xml’ is the metadata file of the document type <PurchaseOrder>. Also, this document’s DTD file and XSLT file are ‘PurchaseOrder.dtd’ and ‘PurchaseOrder.xsl’, respectively. Figure 8 is the DTD designed for the system metadata file.

Mapping repository

The mapping files in the mapping repository are also designed in XML format. The DTD designed for the mapping files is shown in Figure 9. ‘Mapping_Data’ is the root element. Each ‘Mapping’ indicates


<!ELEMENT metadata (element*)>

<!ELEMENT element (tagname, common+, datatype?, format?, necessary?)>

<!ELEMENT tagname (#PCDATA)>

<!ELEMENT common (#PCDATA)>

<!ATTLIST common standard CDATA #REQUIRED>

<!ATTLIST common version CDATA #REQUIRED>

<!ELEMENT datatype (#PCDATA)>

<!ELEMENT format (#PCDATA)>

<!ELEMENT necessary EMPTY>

Figure 6. The DTD of the metadata files.

Figure 7. A partial system metadata file.

<!ELEMENT Forms_metadata (Form*)>

<!ELEMENT Form (Form_name, common+, metadata_name, DTD_name, XSLT_name, Update)>

<!ELEMENT Form_name (#PCDATA)>

<!ELEMENT common (#PCDATA)>

<!ATTLIST common standard CDATA #REQUIRED>

<!ATTLIST common version CDATA #REQUIRED>

<!ELEMENT metadata_name (#PCDATA)>

<!ELEMENT DTD_name (#PCDATA)>

<!ELEMENT XSLT_name (#PCDATA)>

<!ELEMENT Update (#PCDATA)>

Figure 8. The DTD of the system metadata file.


<!ELEMENT Mapping_Data (Mapping*)>

<!ELEMENT Mapping (Tagname, To, Format?)>

<!ELEMENT Tagname (#PCDATA)>

<!ELEMENT To (#PCDATA)>

<!ELEMENT Format (DataType, From, To)>

<!ELEMENT DataType (#PCDATA)>

<!ELEMENT From (#PCDATA)>

Figure 9. The DTD of the mapping files.

<Quantity Unit="Box">100</Quantity>    =>    <Quantity>100</Quantity>
                                             <Unit>Box</Unit>

Figure 10. Attributes can be defined as elements.

<Mapping>

<Tagname>Quantity</Tagname>

<To>ItemQuantity</To>

</Mapping>

<Mapping>

<Tagname>Quantity/@Unit</Tagname>

<To>ItemUnit</To>

</Mapping>

Figure 11. Attribute mapping.

that the received element ‘Tagname’ will be mapped to the ‘To’ element. The ‘Format’ element is optional and, if present, the format of the element will be converted from the format specified in ‘From’ to the format specified in ‘To’.

Generally, attributes, such as the ‘Unit’ attribute in the ‘Quantity’ element as shown in Figure 10, provide important supplementary information for elements. However, attributes can be defined as elements without losing any information [32,33]. Thus, for simplicity, attributes in the received documents are converted to elements as shown in Figure 10. Consequently, the mappings would be produced as shown in Figure 11.
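The attribute-to-element conversion of Figure 10 could be written with the standard Java DOM API roughly as below. The class and method names (`AttributeFlattener`, `flatten`, `parse`) are hypothetical and not taken from the prototype, and the sketch assumes the element being flattened is not the document root, since the new elements are inserted as its siblings.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical sketch of turning attributes into sibling elements. */
final class AttributeFlattener {

    /** Parses an XML string, wrapping checked exceptions for brevity. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Cannot parse XML", e);
        }
    }

    /** Moves every attribute of 'element' into a new element placed right after it. */
    static void flatten(Element element) {
        Document doc = element.getOwnerDocument();
        Node parent = element.getParentNode();
        Node insertBefore = element.getNextSibling(); // null means "append at end"
        NamedNodeMap attrs = element.getAttributes();
        // Walk backwards: the attribute map is live and shrinks on removal.
        for (int i = attrs.getLength() - 1; i >= 0; i--) {
            Attr attr = (Attr) attrs.item(i);
            Element child = doc.createElement(attr.getName());
            child.setTextContent(attr.getValue());
            parent.insertBefore(child, insertBefore);
            insertBefore = child;
            element.removeAttributeNode(attr);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<Item><Quantity Unit=\"Box\">100</Quantity></Item>");
        Element quantity = (Element) doc.getElementsByTagName("Quantity").item(0);
        flatten(quantity);
        System.out.println(quantity.getNextSibling().getNodeName()); // the new Unit element
    }
}
```

After this normalization, attribute values can be mapped with the same `Tagname`/`To` pairs used for ordinary elements.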

Control agent

As described in the proposed model, the agents include a lookup manager, a communication manager, and a template generator. All of the agents are coordinated by a control agent in the prototype. The interface of the control agent is shown in Figure 12. In the figure, the agent on the left is called ‘CYUT’, and its trading partner, the one on the right, is called ‘IM’. When ‘CYUT’ is started, it will connect to a trading partner such as ‘IM’. When a document from ‘IM’ is received, ‘CYUT’ will start to communicate with ‘IM’. In the communications, the control agents at both ends will invoke


Figure 12. The communications of two agents at both sides.

the appropriate agents to accomplish their tasks, such as invoking lookup managers to search whether a template exists, invoking communication managers to obtain the metadata of unknown elements, and invoking template generators to create a new template. All messages indicating the progress of the communications are shown in the centered text area of the windows.

Lookup manager

The lookup manager is responsible for searching templates and mappings. In our prototype, the template file is named as the remote agent’s name plus the root element of the received XML document, such as ‘IM_OrderList’. If the named template file can be found in the template repository, the template file is used directly; otherwise, the agent will invoke the communication manager to exchange metadata and invoke the template generator to generate a new transformation template for the document. The mapping files are encoded in XML. Since a mapping file is created for each trading partner, the mapping filename simply takes the agent’s name, such as ‘IM’.

Communication manager

A communication protocol is needed by the communication managers for information exchange. Well-known communication protocols such as KQML [24], ACL (Agent Communication Language) [22], and SOAP (Simple Object Access Protocol) [20] can be utilized. All of the above protocols are relatively simple and designed to work with HTTP, SMTP, and other native Internet protocols. Although any of them can be implemented, KQML is chosen arbitrarily in our prototype.


(Performative
    :sender <word>
    :receiver <word>
    :reply-with <word>
    :in-reply-to <word>
    :language <word>
    :ontology <word>
    :content <expressions>)

Figure 13. KQML message format.

The Knowledge Query and Manipulation Language (KQML) is a language and protocol for exchanging information and knowledge [24]. This work is a part of the ARPA Knowledge Sharing Effort, whose goal is to develop techniques and methodologies for building large-scale knowledge bases that are sharable and reusable. KQML is both a message format and a message-handling protocol to support run-time knowledge sharing among agents. KQML can be used as a language for an application program to interact with an intelligent system or for two or more intelligent systems to share knowledge in support of cooperative problem solving.

KQML can be divided into three layers: the content layer, message layer, and communication layer. The content layer is the actual content of messages. The primary function of the message layer is to identify the protocol to be used to deliver messages and to supply a speech act or performative to the content of messages. The message layer also includes optional features for describing the content, such as the language and the ontology used to express the content. The communication layer encodes a set of features for messages, which describe the lower level communication parameters, such as the identity of the sender and the recipient as well as a unique identifier associated with the communication.

The message format of KQML is shown in Figure 13. The syntax of KQML is based on a balanced parenthesis list. The initial element of this list is the performative, and the remaining elements are the performative parameters as keyword/value pairs. The performative signifies whether the content is an assertion, a query, or a command.

In the prototype, KQML is simply used as a language for communications. Therefore, only five reserved performatives are defined for the communications, and they are listed in Table I. In the interest of space, only some examples are described in this paper. For the complete descriptions of all performatives and their content parameters, readers are advised to refer to [34].

If ‘CYUT’ receives a document with ‘OrderList’ as the root element from ‘IM’ and wishes to know what kind of document it is, ‘CYUT’ will ask a question by means of an ‘ask-one’ message. The question itself is actually the content of the message and is encoded as ‘(formtype :content (OrderList))’. The complete ‘ask-one’ message is shown on the left-hand side of Table II. The message indicates that the receiver of the message is ‘IM’, the message id is ‘CYUT_KOD’, and the content of the message is encoded in KQML. After receiving the message and retrieving the metadata of ‘OrderList’ as defined in Figure 4(b), ‘IM’ will reply with a ‘reply’ message indicating that ‘OrderList’ represents ‘0065=ORDERS’ defined in EDIFACT version 95A. The complete ‘reply’ message is shown on the right-hand side of Table II.


Table I. Five reserved performatives needed for communication.

Performative name    Meaning (S: Sender; R: Receiver)

ask-one              S wants one response from R regarding a question about the :content
reply                S replies to a request
next                 S wants next response from R regarding the prior request
sorry                S understands R’s message but cannot provide a more informative reply
tell                 S tells R some information

Table II. A communication example.

Agent name: CYUT                       Agent name: IM

(ask-one :sender CYUT                  (reply :sender IM
    :receiver IM                           :receiver CYUT
    :language KQML                         :language KQML
    :reply-with CYUT_KOD                   :in-reply-to CYUT_KOD
    :content (formtype                     :content (semantic
        :content (OrderList)))                 :standard EDIFACT
                                               :version 95A
                                               :content (0065=ORDERS)))
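To make the message layout of Table II concrete, the following Java sketch assembles the two messages as plain strings. It deliberately does not use the JATLite API; the class `KqmlMessages` and its method signatures are assumptions introduced for illustration, and the message id is written with an underscore on the assumption that a KQML parameter value is a single word.

```java
/** Hypothetical helper that formats the two KQML messages of Table II. */
final class KqmlMessages {

    /** Builds an 'ask-one' message asking what kind of document a root element denotes. */
    static String askOne(String sender, String receiver, String replyWith, String rootElement) {
        return "(ask-one :sender " + sender
                + " :receiver " + receiver
                + " :language KQML"
                + " :reply-with " + replyWith
                + " :content (formtype :content (" + rootElement + ")))";
    }

    /** Builds the matching 'reply' message naming the standard, version, and code. */
    static String reply(String sender, String receiver, String inReplyTo,
                        String standard, String version, String code) {
        return "(reply :sender " + sender
                + " :receiver " + receiver
                + " :language KQML"
                + " :in-reply-to " + inReplyTo
                + " :content (semantic :standard " + standard
                + " :version " + version
                + " :content (" + code + ")))";
    }

    public static void main(String[] args) {
        System.out.println(askOne("CYUT", "IM", "CYUT_KOD", "OrderList"));
        System.out.println(reply("IM", "CYUT", "CYUT_KOD", "EDIFACT", "95A", "0065=ORDERS"));
    }
}
```

The `:reply-with` value of the question and the `:in-reply-to` value of the answer are the same identifier, which is how the asking agent pairs a reply with its pending question.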

Template generator

The template generator plays a key role in converting XML documents with unknown elements into a pre-defined format by generating the required transformation templates. The three input parameters for the template generator are the received DTD, a target DTD, and a mapping file. Each transformation template is an XSLT file. In any DTD, an element type can have any of five content types: ‘ANY’, ‘EMPTY’, ‘(children)’, ‘(#PCDATA)’, and ‘Mixed’, where children may be in a choice or sequence list. Since the ‘ANY’ and ‘Mixed’ content types are impractical for structural documents, it is suggested that they be avoided in structural documents [30], and thus they are not considered in our prototype.

Algorithm

The algorithm for generating a transformation template is shown in Figure 14. This algorithm is divided into three major parts. One is to generate XSLT expressions to transform element names from received documents into corresponding element names defined in the target DTD. The other two parts are mainly to transform choice and sequence lists. For brevity, lowercase letters (such as ‘a’, ‘b’ and ‘c’) are used to represent the element type names in the received DTD, and their corresponding uppercase letters (‘A’, ‘B’ and ‘C’, respectively) are used to represent the element type names in the target DTD. Details of the algorithm are described as follows.


// "DTDfrom" means the received DTD;
// "DTDto" means the expected DTD;
// ToItems stores all DTDItem objects in the expected DTD
...
content = content + dumpDTD("/", DTDto_root.name, ToItems.get(DTDto_root.name));
content = content + appendtext;
...
Template_XSL.print(content);

// 1st parameter is the path in the received DTD
// 2nd parameter is the tag name in the expected DTD
// 3rd parameter is the DTDItem object of the tag name, e.g. (B,C,D) or (B|C|D) or (#PCDATA)
public static String dumpDTD(String from_path, String tag_name, DTDItem item)
{
    text = text + "<" + tag_name + ">";                 // append the <tag_name> expression
    from_tag = Mapping(tag_name);                       // find the mapped tag in the received DTD
    DTDfrom_path = SearchPath(from_path, from_tag);     // find the path from para. 1 to para. 2 in the received DTD
    if (item instanceof DTDChoice) {                    // item is a choice list, e.g. (B|C|D)
        items = ((DTDChoice) item).getItems();          // get subitems, e.g. B, C, D
        ... // append "<xsl:for-each ...>...</xsl:for-each>" expressions to the text string
        for (int j = 0; j < items.length; j++) {        // items = {B,C,D}
            if (items[j] instanceof DTDName) {
                ... // add "<xsl:template match...>...</xsl:template>" expressions at the end of the XSLT file
            } else {                                    // e.g. A(B|(C,D)): (C,D) is a DTDSequence, not a DTDName
                text = text + dumpDTD(from_path, "", items[j]);
            }
        }
    } else if (item instanceof DTDSequence) {           // item is a sequence list, e.g. (B,C,D)
        items = ((DTDSequence) item).getItems();        // get subitems, e.g. B, C, D
        for (int j = 0; j < items.length; j++) {
            if (items[j] instanceof DTDName) {
                if ("*".equals(items[j].cardinal) || "+".equals(items[j].cardinal)) {
                    ... // append "<xsl:for-each select...>...</xsl:for-each>" expressions
                } else {
                    text = text + dumpDTD(from_path, items[j].value, ToItems.get(items[j].value));
                }
            } else {                                    // e.g. A(B,(C|D)): (C|D) is a DTDChoice
                text = text + dumpDTD(from_path, "", items[j]);
            }
        }
    } else if (item instanceof DTDPCDATA) {             // item is PCDATA, e.g. A (#PCDATA)
        if (format) {                                   // format is set in Mapping(tag_name)
            ... // add format conversions here
        } else {
            if (!"".equals(from_tag)) {
                text = text + "<xsl:value-of select=\"" + from_path + "\"/>";
            }
        }
    }
    text = text + "</" + tag_name + ">";                // append the </tag_name> expression
    return text;
}

Figure 14. A recursive algorithm for generating transformation templates.

1. Element type names transformations. If the element type ‘A’ in the target DTD is mapped to the element type ‘a’ in the received DTD and the element type ‘A’ is of content type #PCDATA such as ‘<!ELEMENT A (#PCDATA)>’, ‘<A><xsl:value-of select="a"/></A>’ is generated in the template. With this template, elements of the form <a> . . . </a> in received EDI documents will be converted into <A> . . . </A>.

2. Choice lists transformations. If the element type ‘A’ in the target DTD is a choice list such as ‘<!ELEMENT A (B|C|D)>’, the following XSLT expression is generated:

    <A>
        <xsl:for-each select="/a">
            <xsl:apply-templates select="b|c|d"/>
        </xsl:for-each>
    </A>

Additionally, at the end of the XSLT file, the following expressions for b, c, and d are appended:

    <xsl:template match="b">
        <B>...</B>
    </xsl:template>
    <xsl:template match="c">
        <C>...</C>
    </xsl:template>
    <xsl:template match="d">
        <D>...</D>
    </xsl:template>

3. Sequence lists transformations. If the element type ‘A’ in the target DTD is a sequence list such as ‘<!ELEMENT A (B,C,D)>’, the following XSLT expression is generated:

    <A>
        <B>...</B>
        <C>...</C>
        <D>...</D>
    </A>


If there is any quantifier such as ‘*’ or ‘+’ attached to an element type, a loop is generated for the element type. For example, if the element type ‘A’ is of the form ‘<!ELEMENT A (B*,C+)>’, the expressions generated are

    <A>
        <xsl:for-each select="a/b">
            <B>...</B>
        </xsl:for-each>
        <xsl:for-each select="a/c">
            <C>...</C>
        </xsl:for-each>
    </A>

With recursion, either choice lists in a sequence list (e.g. ‘(B,(C|D))’) or sequence lists in a choice list (e.g. ‘(B|(C,D))’) can be processed correctly.
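The sequence-list case, including the loop emitted for ‘*’ and ‘+’, can be illustrated with a small Java generator. This is a simplified sketch and not the algorithm of Figure 14: the class `SequenceTemplateSketch` is hypothetical, it handles a flat sequence only, it assumes the received element names are the lowercase forms of the target names (the convention used in the text), and it leaves each child's own expansion as the placeholder ‘...’.

```java
import java.util.Locale;

/** Hypothetical sketch: emit the XSLT fragment for a flat sequence list. */
final class SequenceTemplateSketch {

    /**
     * @param target   target element name, e.g. "A"
     * @param children target child names with an optional quantifier, e.g. "B*", "C+", "D"
     */
    static String generate(String target, String... children) {
        String received = target.toLowerCase(Locale.ROOT);
        StringBuilder xsl = new StringBuilder("<" + target + ">");
        for (String child : children) {
            char last = child.charAt(child.length() - 1);
            boolean hasQuantifier = last == '*' || last == '+' || last == '?';
            boolean repeated = last == '*' || last == '+';
            String name = hasQuantifier ? child.substring(0, child.length() - 1) : child;
            String body = "<" + name + ">...</" + name + ">";
            if (repeated) {
                // Repeatable children are wrapped in a loop over the received elements.
                xsl.append("<xsl:for-each select=\"")
                   .append(received).append('/').append(name.toLowerCase(Locale.ROOT))
                   .append("\">").append(body).append("</xsl:for-each>");
            } else {
                xsl.append(body);
            }
        }
        return xsl.append("</").append(target).append(">").toString();
    }

    public static void main(String[] args) {
        System.out.println(generate("A", "B*", "C+"));
    }
}
```

A full generator would recurse into each child instead of writing ‘...’, which is what allows nested choice and sequence lists to be handled.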

Format conversions

The template generator can also generate templates which can convert data from one format to another. For example, an ‘ODate’ element type, which is of type ‘date’, can be converted from ‘MMddyyyy’ into ‘yyyy/MM/dd’ with the following expressions, where ‘TransformDate’ is a Java program for date conversion and ‘ParserDate’ is its function name.

    <Odate>
        <xsl:variable name="Date" select="OrderList/ODate"/>
        <xsl:value-of select="java:TransformDate.ParserDate($Date, 'MMddyyyy', 'yyyy/MM/dd')"/>
    </Odate>
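The source of the date-conversion program is not given in the paper. A helper with the same role could look like the sketch below, which uses `java.text.SimpleDateFormat`; the class name `DateFormatConverter` and the method `convert` are assumptions for illustration and are not the prototype's ‘TransformDate.ParserDate’.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

/** Hypothetical stand-in for a date-conversion helper such as 'TransformDate'. */
final class DateFormatConverter {

    /** Re-renders 'value' from 'fromPattern' into 'toPattern' (SimpleDateFormat patterns). */
    static String convert(String value, String fromPattern, String toPattern) {
        SimpleDateFormat in = new SimpleDateFormat(fromPattern);
        in.setLenient(false); // reject impossible dates such as month 13
        try {
            return new SimpleDateFormat(toPattern).format(in.parse(value));
        } catch (ParseException e) {
            throw new IllegalArgumentException(
                    "Cannot parse '" + value + "' as " + fromPattern, e);
        }
    }

    public static void main(String[] args) {
        // The format pair taken from the mapping in Figure 5.
        System.out.println(convert("01152004", "MMddyyyy", "yyyy/MM/dd")); // 2004/01/15
    }
}
```

Taking both patterns as parameters lets one helper serve every ‘Format’ entry in the mapping file, so no per-format code has to be written.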

Quantifier conflicts

When the element type names in the received DTD are converted into the target DTD, quantifier conflicts may exist. For example, when the element in the target DTD is ‘A+’ but the corresponding element in the received DTD is either ‘a*’ or ‘a?’, it is likely that the received EDI document may not have any data for the element ‘a’. Thus, a blank or default value is inserted for the element ‘A’ to resolve the problem. Similarly, when the element in the target DTD is ‘A?’ but the corresponding element in the received DTD is either ‘a*’ or ‘a+’, it is likely that the received EDI document may contain multiple values for the element ‘a’. In the prototype, the first value for the element ‘A’ is chosen. All of the possible conflicts and their corresponding resolutions are summarized in Table III. Note that, if the proposed resolution is not acceptable, developers can either modify the algorithm to use their preferred resolution or simply leave the received DTD intact and have the system report where conflicts occurred.

The template generator selects mappings from the mapping repository and, based on the received DTD and the target DTD, produces a template for transforming the received XML document into an expected format. An example transformation template generated by the template generator is shown in Figure 15(2). After the received document in Figure 15(1) is converted by the template, it is now in a


Table III. Quantifier conflicts in DTDs.

Target DTD    Received DTD                          Resolution

‘A*’          ‘a*’, ‘a?’, ‘a+’, or ‘a’              No conflicts
‘A+’          ‘a*’ or ‘a?’ with no value            Insert a blank or a default value
‘A?’          ‘a*’ or ‘a+’ with multiple values     Select first value
‘A’           ‘a*’ or ‘a?’ with no value            Insert a blank or a default value
              ‘a*’ or ‘a+’ with multiple values     Select first value

recognizable format as shown in Figure 15(3). From this point on, the received document has become a document that can be further processed by any existing program. For demonstration purposes, a layout template was also designed so that it will transform any newly converted XML document into HTML format for Internet browsers. The resulting HTML page is shown in Figure 15(4). Aided by the mechanism described, the ReScUE model can provide consistent interfaces for users and existing systems, which makes integration easy.
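The resolution rules of Table III can be expressed compactly in Java. The sketch below is an illustration under assumptions: the class `QuantifierResolver` is hypothetical, it works on the list of values actually received for an element rather than on the DTDs, and an empty string stands for an element with no quantifier (exactly one occurrence).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the quantifier-conflict resolutions in Table III. */
final class QuantifierResolver {

    /**
     * @param targetQuantifier "*", "+", "?" or "" (exactly one) from the target DTD
     * @param received         values found in the received document for the mapped element
     * @param defaultValue     blank or default value to insert when one is required
     */
    static List<String> resolve(String targetQuantifier, List<String> received,
                                String defaultValue) {
        List<String> out = new ArrayList<>(received);
        boolean needsOne = targetQuantifier.equals("+") || targetQuantifier.isEmpty();
        boolean atMostOne = targetQuantifier.equals("?") || targetQuantifier.isEmpty();
        if (needsOne && out.isEmpty()) {
            out.add(defaultValue);                    // insert a blank or a default value
        }
        if (atMostOne && out.size() > 1) {
            out = new ArrayList<>(out.subList(0, 1)); // select the first value
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> none = new ArrayList<>();
        System.out.println(resolve("+", none, "")); // one blank value is inserted
    }
}
```

A developer who prefers a different policy, for example rejecting the document instead of taking the first value, would only need to change the two branches above.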

Development environment

For platform independence, Java is chosen for developing the prototype. An XML parser is required for parsing and validating XML documents. A free XML parser, Xerces, provided by IBM, is used in our implementation. Since a DTD parser is required for processing DTDs, a Java DTD parser provided by Wutka (http://www.wutka.com/) is employed. Xalan for Java 2 is a free XSLT processor provided by IBM. Xalan is a utility for transforming XML documents into other formats such as HTML, plain text, PostScript, etc. JATLite, developed at Stanford and written in Java, is employed for communications in KQML.

CONCLUSIONS AND FUTURE WORKS

More and more businesses are using XML in delivering business messages over the Internet. However, since XML allows developers to freely design their own element names, businesses may receive XML documents with unknown elements, and a lot of human intervention will be called for as a result. In this paper, a ReScUE XML/EDI model is proposed to resolve this problem. The preceding discussion shows that the ReScUE model is capable of processing XML/EDI documents with unknown elements. Although human intervention cannot be totally eliminated, it is reduced to the cases in which an element cannot be resolved through any shared data dictionary.

Although all the XML documents in the prototype are stored and accessed in plain files, XML databases, such as eXcelon, Xindice, etc., can be used to access XML documents to increase efficiency. Additionally, the maintenance of the semantics of elements in the metadata base can be greatly simplified if global repositories provide an easy-to-use service for looking up the requested terms in each data dictionary [35].


Figure 15. The original document (1) is transformed by the newly created transformation template (2) into an expected format (3) which is then, after being processed by a layout template, shown in (4).

The prototype of the ReScUE XML/EDI model has been implemented and thoroughly tested in experimental environments. Although it works in all aspects, the employment of the ReScUE XML/EDI model in real-life situations requires further study. Additionally, in the ReScUE model, the metadata files are designed in XML. Recently, the development of Resource Description Framework (RDF) and RDF Schema is reaching maturity, and they seem like a good substitute for the metadata files used in the ReScUE model. Nevertheless, the complexity and flexibility of using RDF and RDF Schema require further investigation.


ACKNOWLEDGEMENT

This research was supported in part by the National Science Council, Taiwan, R.O.C., under contract No. 90-2213-E-324-018.

REFERENCES

1. Fu S, Chung J-Y, Dietrich W, Gottemukkala V, Cohen M, Chen S. A practical approach to Web-based Internet EDI. Proceedings 19th IEEE International Conference on Distributed Computing Systems Workshops, 1999. IEEE, 1999; 53–58.
2. Kalakota R, Whinston AB. Frontiers of Electronic Commerce. Addison-Wesley: Reading, MA, 1996.
3. Lu EJ-L, Tsai R-H. An empirical study of XML/EDI. Journal of Systems and Software 2001; 58:269–277.
4. Iacovou CL, Benbasat I, Dexter A. Electronic data interchange and small organizations: Adoption and impact of technology. MIS Quarterly 1995; 19(4):465–485.
5. Tuunainen VK. Opportunities of effective integration of EDI for small businesses in the automotive industry. Information and Management 1998; 34(6):361–375.
6. Meltzer B, Glushko R. XML and electronic commerce: Enabling the network economy. SIGMOD Record 1998; 27(4):21–24.
7. Lehmann F. Machine-negotiated ontology-based EDI Electronic Data Interchange. Electronic Commerce: Current Research Issues and Applications. Springer: Berlin, 1996; 27–45.
8. Steel K. The standardisation of flexible EDI messages. Electronic Commerce: Current Research Issues and Applications. Springer: Berlin, 1996; 13–26.
9. Fellner KJ, Turowski K. Component framework supporting inter-company cooperation. Third International Enterprise Distributed Object Computing Conference, 1999. IEEE, 1999; 164–171.
10. Steel K. Another approach to standardising EDI. Electronic Markets 1994; 9–10.
11. Adam NR, Adiwijava I, Atluri V, Yesha Y. EDI through a distributed information systems approach. Proceedings 31st IEEE Annual Hawaii International Conference on System Sciences, 1998. IEEE, 1998; 354–363.
12. Lu EJ-L, Hwang R-J. A distributed EDI model. Journal of Systems and Software 2001; 56:1–7.
13. Steel K. Open-EDI Reference Model Standard-CD. International Standards Organization/International Electrotechnical Commission JTC1/SC30, February 1996.
14. Lee MR. Context-dependent semantic values for E-negotiation. Second International Workshop, Advanced Issues of E-Commerce and Web-Based Information Systems, WECWIS, 2000. IEEE, 2000; 41–47.
15. Webber D. Introducing XML/EDI frameworks. Electronic Markets 1998; 8(1):38–41.
16. Kotok A. White Paper on global XML repositories for XML/EDI. The XML/EDI Group, February 1999.
17. Ogbuji U. XML: The future of EDI? SunWorld February 1999. http://www.javaworld.com.
18. Lu EJ-L, Wu C-C. The study of Chinese XML in the applications of electronic commerce. The Journal of Chaoyang University of Technology 2000; 5:201–216 (in Chinese).
19. Hernandez MA, Miller RJ, Haas LM. Clio: A semi-automatic tool for schema mapping. SIGMOD Record 2001; 30:607.
20. Travis BE. XML and SOAP Programming for BizTalk Servers. Microsoft Press, 2000; 464 pp.
21. Adam NY, Yasha Y. Strategic directions in electronic commerce and digital libraries: Toward a digital agora. ACM Computing Survey 1996; 28:818–835.
22. Ferber J. Multi-Agent Systems: An Introduction to Distributed Artificial Intelligence. Addison-Wesley Longman Inc., 1999.
23. Sprinkle J, van Buskirk CP, Karsai G. Modeling agent negotiation. Systems, Man, and Cybernetics, IEEE International Conference, vol. 1, 2000. IEEE, 2000; 454–459.
24. Labrou Y, Finin T. A proposal for a new KQML specification. Computer Science and Electrical Engineering Department, University of Maryland Baltimore County, February 1997.
25. Doheny S. An introduction to EDI. e-Marketing company. http://www.e-marketing.com.au/ [July 2000].
26. Saleh K, Probert R, Khanafer H. The distributed object computing paradigm: Concepts and applications. Journal of Systems and Software 1999; 47:125–131.
27. Rawlins MC. Future EDI: An overview of emerging technologies which might replace X12 and EDIFACT. The Journal of Electronic Commerce 1998. http://www.rawlinsecconsulting.com/pubpres/EDIFuture.pdf.
28. Sciore E, Siegel M, Rosenthal A. Using semantic values to facilitate interoperability among heterogeneous information systems. ACM Transactions on Database Systems 1994; 19:254–290.
29. Goldfarb CF, Prescod P. The XML Handbook. Prentice-Hall: Englewood Cliffs, NJ, 1998.
30. Gregory AT. XML schema design for business-to-business e-commerce. Graphic Communications Association XML Conferences, June 2000. IDE Alliance, 2000.


31. Dubray J-J, Nickull D, Malik A, Webber D. White Paper on enhanced DTD syntax for eBusiness [eDTD]. ebXML Initiative Technical Implementation Submission, February 2000.
32. Bourret R, Bornhovd C, Buchmann A. A generic load/extract utility for data transfer between XML documents and relational databases. Proceedings of the Second International Workshop on Advanced Issues of E-Commerce and Web-based Information Systems, 2000. ACM Press: New York, 2000; 134–143.
33. Shanmugasundaram J, Tufte K, He G. Relational databases for querying XML documents: Limitations and opportunities. Proceedings of the 25th VLDB Conference, 1999; 302–314.
34. Lu EJ-L, Wu C-C. ReScUE 1.0: Design and implementation. Technical Report CYUT-IM-TR-2001-014, Department of Information Management, Chaoyang University of Technology 2001.
35. Techniques & Methodologies Working Group (TMWG). Setting up a UN Repository for XML-EDI. CE-FACT/TMWG/N071, November 1998.
