
[IEEE Engineering Management Society Conference on Managing Projects in a Borderless World - New Delhi, India (17-18 Dec. 1993)] Proceedings of Engineering Management Society Conference


HYPERTEXT BASED DISTRIBUTED OBJECT MANAGEMENT FOR INFORMATION LOCATION AND RETRIEVAL

PRAVEEN KANT SHARMA, ERNET, COMP SCS DEPTT, IIT DELHI

SAVITA RAO, NCST, JUHU, BOMBAY

PC SAXENA & SK BOSE, COMP & SYS SCIENCE, JNU, DELHI

ABSTRACT

The present user community exchanges information through media such as electronic mail, computer newsgroups, information servers and so on. Older methods and technologies are not able to deal with large amounts of information. This paper discusses an efficient solution for managing information with the help of a hypertext/hypermedia approach.

Introduction

Information storage, location, retrieval and dissemination is a dynamic process. The size of information is getting larger and larger in the present world; such a changing scenario requires greater speed, accuracy, larger information storage devices, and quick information retrieval tools. But fulfilment of such a task is very difficult and time consuming. So far there is no foolproof system to store all the information of human knowledge and retrieve the same as and when required [13].

The whole field of information location, storage, and retrieval is becoming an increasingly important area, although the concepts involved are not very new to either libraries or any other query systems. The basic purpose of libraries is to acquire, preserve, make available to the user, and display their collection of documents. Each such institution (e.g. the library cell of an institution) is routinely involved in giving information to the user. However, locating and finding a particular piece of information, out of perhaps hundreds or thousands of items, or maybe more, is becoming an increasingly difficult task. As time passes, our collections grow even larger and the above mentioned services are becoming even more pressing. This paper describes how current technology can be used to open a market of information services that will allow users' workstations to act as librarians and information collection agents from a large number of sources.

The problems that are being addressed in the design of such a system include human interface issues, merging of information from many sources, finding applicable sources of information, and setting up a framework for the rapid proliferation of information servers. Accessing private, public, and group information with one user model implemented on personal workstations is attempted, to allow users access to many sources without learning specialised commands.

Distributed hypertext Object Management approach

Hypertext is an approach to information management. A hypertext system is constructed of nodes in a network, connected by links. Hypertext, at its most basic level, is a database that lets you connect screens of information using associative links. At its most sophisticated level, hypertext is a software environment for collaborative work, communication and knowledge acquisition. Some of the commonly used words in understanding the hypertext concept are:

1. Links

Links are the most fundamental unit of hypertext. Links are the labels that connect one node (document, article, topic) with another. A link can be unidirectional or bidirectional [1].

When a link is activated (for example, by selecting it with a mouse or arrow keys), it produces one of the following results:

• transfer to a new topic

• show a reference

• provide ancillary information, such as a footnote, definition, or annotation

• display an illustration, schematic, photograph, or video sequence

• display an index

• run another program

Links have the following significance: (a) they are easy to activate, and (b) they produce a fast response.
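The link concept above can be illustrated with a minimal Python sketch. The names `Link`, `LinkKind` and `activate` are hypothetical and do not come from the paper; the sketch only assumes that a link joins two named nodes, carries one of the listed activation results, and may be followed in one or both directions.

```python
from dataclasses import dataclass
from enum import Enum


class LinkKind(Enum):
    """Activation results listed in the text (member names are illustrative)."""
    TRANSFER = "transfer to a new topic"
    REFERENCE = "show a reference"
    ANCILLARY = "provide ancillary information"
    MEDIA = "display an illustration or video sequence"
    INDEX = "display an index"
    PROGRAM = "run another program"


@dataclass(frozen=True)
class Link:
    label: str
    source: str                  # name of the node the link starts from
    target: str                  # name of the node the link points to
    kind: LinkKind
    bidirectional: bool = False

    def activate(self, current: str) -> str:
        """Return the node reached when the link is activated from `current`."""
        if current == self.source:
            return self.target
        if self.bidirectional and current == self.target:
            return self.source
        raise ValueError(f"link {self.label!r} cannot be followed from {current!r}")


footnote = Link("fn1", "intro", "footnote-1", LinkKind.ANCILLARY, bidirectional=True)
print(footnote.activate("intro"))        # forward traversal
print(footnote.activate("footnote-1"))   # allowed only because the link is bidirectional
```

A unidirectional link raises an error when followed from its target, which is one simple way to model the unidirectional/bidirectional distinction.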


2. Nodes

A single document in a hypertext database is called a node. Each node in a hypertext system corresponds to one or more screen displays. Nodes are connected by links, in a variety of possible structures such as webs and hierarchies [5].

3. Document

There are no fixed guidelines for document size. Hypertext documents are usually written so that they are self-contained and do not depend upon the user's having viewed other documents. Continuity between documents is provided by links [12].

4. Hierarchies

The structure of a distributed hypertext database is a major factor that determines how easy it is to create, use, and update. In a hierarchy, each node has a parent (superordinate) and a child (subordinate) unless the node is a starting point (root) or an end point (leaf). Figure 1 shows both hierarchical and network structures [1].

Figure 1: Hierarchical Tree Structure (left) and Network Structure (right)

5. Browsing

Browsing means exploring a hypertext system. Hypertext systems offer a surprising and satisfying freedom to explore. With a little training in computer concepts and a little knowledge of the subject domain, hypertext users can casually traverse nodes and links looking for something of interest [1].

6. Search

Although browsing provides one means of finding things in a hypertext database, it works only for predefined links. A more general capability is keyword search, which finds a word or phrase provided as input, for all known links [1].

7. Indexing

Indexing makes order out of chaos. Although links provide the primary means of connecting information in hypertext systems, an important secondary capability is indexing. Indexing makes it possible to look up information alphabetically or to search for specific terms [14].

A network is a collection of hosts (computers) connected to each other with the help of various communication media such as coaxial cable, twisted pair, radio waves, fibre optics and so on. So the key hardware and software behind this new system is a physical network of machines, a database of document materials, and an approach to information management (hypertext) in which data is stored in a network of nodes connected by links. Nodes can contain text, graphics, audio, video as well as source code, or other forms of data.

Accessing and Managing of Information

For information owners to make their data available over a system, the system must be easy to start, inexpensive to operate, and profitable. In our discussion we find that the major problem is performing the various operations related to managing such a large volume of information, as described below:

• locating the site where the required information is available;

• after identifying the site, how to access that particular information;

• how quickly the requested information can be located;

• once the user has identified the information, how he can transfer that particular information to his host;

• whether this approach conforms to international or de facto standards such as CCITT, OSI/ISO, etc.;

• whether it can make use of existing databases;

• whether it can handle various formats of data or processed data;

• and so on.

The key ideas in such a system are that information services should be easily and freely distributed, that the power of current workstations can provide sophisticated tools as servers and consumers, and that electronic networks should be exploited to distribute the information base. These are issues which an implementer and designer has to keep in mind while implementing such tools. A hypertext based distributed object management system facilitates most of the requirements mentioned above [13].
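The keyword search and indexing capabilities listed among the hypertext terms in this section can be illustrated with a small inverted index. This is a minimal sketch, not the authors' implementation: the node texts and the names `build_index` and `search` are hypothetical, and words are split on whitespace only.

```python
from collections import defaultdict

# Hypothetical node texts keyed by node name (not taken from the paper).
nodes = {
    "links": "links connect one node with another",
    "nodes": "a node is a single document",
    "search": "keyword search finds a word in any node",
}


def build_index(texts):
    """Map each word to the sorted list of node names that contain it."""
    index = defaultdict(set)
    for name, text in texts.items():
        for word in text.lower().split():
            index[word].add(name)
    return {word: sorted(names) for word, names in index.items()}


index = build_index(nodes)


def search(term):
    """Keyword search: return every node containing the term, linked or not."""
    return index.get(term.lower(), [])


print(search("node"))       # nodes found by keyword search
print(sorted(index)[:3])    # alphabetical lookup over the index terms
```

Unlike browsing, the lookup does not depend on predefined links: any node containing the term is returned.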


Hypertext based Distributed Object Management Implementation Issues

This section discusses issues concerned with the implementation of various known database management, information storage and retrieval systems, and also of the proposed HtDOM system. Implementation issues are critical and depend on the required application. They arise when the system is put into regular use; the success or failure of the system may depend upon how the following issues are resolved.

Relational database management

The relational model, although simple and powerful, imposes some restrictions on the representation of data by organizing data into relations (tables). It presumes horizontal and vertical homogeneity in the data. User requests are formulated in terms of information content and do not reflect any complexities due to system oriented aspects. A relational data model is what the user sees, but it is not necessarily what will be implemented physically. The major objective of any database management system is to provide data independence. Relational database management techniques remove the details of storage structure and access strategy from the user interface [2].

The relational database management technique compares unfavourably with hierarchical, network, or inverted file techniques with inversion on multiple keys. Its efficiency is also relatively poor, because the storage of data is sequential, which takes up a lot of disk space and reduces efficiency. Relational database management does not explicitly include semantics as part of the data representation. Instead, the application programs interpret data semantics.

When a real-world entity cannot fit directly into the relational database management techniques, an artificial decomposition becomes essential.

Design Database Management

Engineering design management databases are useful in Computer Aided Design (CAD), Computer Aided Manufacturing (CAM), and Computer Aided Software Engineering (CASE) systems. In such systems, complex objects can be recursively partitioned into smaller objects. Furthermore, an object can have different representations at various levels of abstraction, and a record of an object's evolution (object versions) should be maintained. Traditional database techniques do not support the notion of complex objects, equivalent objects, or object versions [2].

Multimedia database management

A multimedia database includes not only text, but also images, graphics, digital audio, and video. Such data are typically stored as sequences of bytes with variable lengths, and segments of data are linked together for easy referencing. Access to data can be on the basis of the structure of a graphical item or by following links [2].

Hypertext based Distributed Object Management

A hypertext approach provides an easy technique for moving around within a large space of information. However, users can face two types of problems:

• not being able to find the desired information,

• getting disoriented.

The above mentioned techniques also cause problems when various groups of people work together on the same document at the same time. Fortunately, the linking structure of hypertext allows coordination of nodes (objects) written by multiple authors. The object-oriented approach furthermore handles the fast location of information. In addition, this method is quite efficient in managing the information base, and provides flexibility for data to be distributed. The details of the proposed method are given below.

Design of Hypertext based Distributed Object Management System

The overall design of the proposed model is as follows. It becomes nearly impossible to keep track of the various versions of a structure (old versions or alternatives). The proposed design assumes that the available data is in multimedia format, i.e. the information can be text, images, graphics, tables, audio, and video. The smallest unit of data is stored in the form of an object. A simple object has only a value, such as an integer or a string of characters, whereas a complex object has a hierarchical data structure and consists of a collection of instance variables, i.e. attributes. Instance variables can be denoted as nodes and their references as links. These variables can be local or distributed across machines.
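The object model just described can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the HtDOM implementation: the class names `SimpleObject`, `Reference` and `ComplexObject`, the `host` field used to mark a reference as remote, and the example host name are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class SimpleObject:
    value: Union[int, str]           # a simple object carries only a value


@dataclass
class Reference:
    """A link to another object; `host` is None when the target is local."""
    object_id: str
    host: Optional[str] = None

    @property
    def is_remote(self) -> bool:
        return self.host is not None


@dataclass
class ComplexObject:
    # Instance variables (nodes); each holds a value or a reference (link).
    attributes: Dict[str, Union[SimpleObject, Reference]] = field(default_factory=dict)

    def remote_attributes(self) -> List[str]:
        """Names of the instance variables stored on another machine."""
        return sorted(
            name for name, attr in self.attributes.items()
            if isinstance(attr, Reference) and attr.is_remote
        )


report = ComplexObject({
    "title": SimpleObject("Annual report"),
    "pages": SimpleObject(12),
    "cover_image": Reference("img-001"),                       # stored locally
    "video": Reference("vid-007", host="media.example.org"),   # stored elsewhere
})
print(report.remote_attributes())
```

Keeping references separate from values is what lets part of a complex object live on another machine without changing how the object is described.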

In theory one can distinguish three levels of a hypertext system (see Figure 2):

• Presentation level: user interface

• HtDOM abstract layer

• Database level: storage, shared data, and network access [3].

Presentation Level: User Interface and Hardware support for hypertext

The user interface deals with the presentation of the information in the HtDOM layer, including such issues as what commands should be made available to the user and how to show nodes and links [9]. The design of the user interface should be such that the learning time for getting accustomed to it is minimal. It should avoid errors by means of direct manipulation (which eliminates the possibility of errors from incorrectly typed commands), clear directions, help messages, and consistent system responses. Major issues in developing a hypertext system are the following:


• Proper display terminals, so that the required type of graphics, windows, and animations can be displayed. Examples of such display terminals are Enhanced Graphics Adapter (EGA), Video Graphics Array (VGA), SVGA, and CGA for Personal Computer (PC) class machines, and high-resolution (i.e. 1024*1024) monitors for mini-computers or workstations. The proposed system will be developed on top of GUI-based utilities such as MS Windows, X11R5, OpenLook, or Motif.

• Performance criteria: a hypertext system demands a much faster response time.

• Type of computer hardware and operating system, i.e. whether the base machine is 8, 16, or 32 bits, and whether the supported environment is interactive, multi-tasking, multiprogramming, or distributed.

• Database design: sequential, hierarchical, network, or object-oriented.

• Average document size.

• Total number of documents.

• Networking facilities.

• Type of graphics involved.

• Complexity of search for a particular record.

• Type of storage used: hard disk, SCSI drives, CD-ROM, optical disk, etc.

• Amount of workstation storage.

Figure 2: HtDOM Layered Architecture (layers from top to bottom: User Interface; Application Tools; HtDOM Abstract Layer; Host File System (database))

Middle level

This particular layer has to interwork with the physical level and application level functions. At this level the proposed system will determine the basic nature of its nodes and links, and where and how to maintain relations amongst them. At this level we have multiple buffers available, so that multiple connections can be handled easily, as in a networked environment. The storage system would have knowledge of the form of the nodes and links and would know which attributes are related to each other. The HtDOM layer is the best candidate for standardisation of import-export formats for hypertext, since the database level has to be heavily machine dependent in its storage format and the user interface level differs greatly from one hypertext system to the next. This leaves only the HtDOM layer, and since we do need the ability to transfer information from one hypertext system to another, we have to come up with an interchange format at this level.

Database Level

The database level is at the bottom of the three-level model and deals with all the traditional issues of information storage. It is necessary to store large amounts of information on various computer storage devices, and it may be necessary to keep some of the information stored on remote servers accessed through a network. No matter how the information is stored, it should be possible to retrieve a specified small chunk of it in a very short time [4].

Furthermore, the database level should handle traditional database issues, like multiuser access to the information and various security considerations, including backup. One major problem in a hypertext system is how to handle updates to the database. There are two issues which should be answered:

• should only one version of the database be kept, or should multiple versions of a document be available?

• how to inform readers about the updated databases?

To overcome this problem, the notion of Dynamic Folders has been introduced. A dynamic folder takes a question and returns an ordered list of possibly relevant documents. The question can be further refined by giving feedback as to how relevant the documents were. The results of a question can be seen as a cousin to the file folder in that they contain a list of documents.
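A dynamic folder of this kind can be sketched as follows. The paper does not say how relevance is computed, so the term-overlap score and the feedback rule (adding words from documents judged relevant to the question) are stand-in assumptions chosen for brevity; the document collection and the names `score`, `dynamic_folder` and `refine` are hypothetical.

```python
# Hypothetical document collection: id -> text.
DOCUMENTS = {
    "d1": "hypertext links and nodes",
    "d2": "relational database tables",
    "d3": "hypertext database retrieval",
}


def score(query_terms, text):
    """Count how many query terms occur in the document (illustrative ranking only)."""
    words = set(text.split())
    return sum(1 for term in query_terms if term in words)


def dynamic_folder(query_terms, documents=DOCUMENTS):
    """Return document ids (references, not copies), most relevant first."""
    ranked = sorted(
        documents,
        key=lambda doc_id: (-score(query_terms, documents[doc_id]), doc_id),
    )
    return [doc_id for doc_id in ranked if score(query_terms, documents[doc_id]) > 0]


def refine(query_terms, relevant_ids, documents=DOCUMENTS):
    """Relevance feedback: extend the question with words from documents judged relevant."""
    terms = set(query_terms)
    for doc_id in relevant_ids:
        terms.update(documents[doc_id].split())
    return terms


first = dynamic_folder({"hypertext"})
second = dynamic_folder(refine({"hypertext"}, ["d3"]))
print(first)
print(second)
```

The folder holds only document ids, so an answer is a list of references that can be resolved later rather than a set of copies.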


In reality, the answers to a question might not be a "copy" of a document, but a "reference" or pointer to a document. This capability becomes important when some of the questions take time to answer because the data might be far away or the question difficult to answer.

Hierarchical Data Model

In this data model there has to be at least one root node and its children, as in a hierarchical tree structure. A hierarchical tree structure is made up of nodes and branches, where a node is a collection of data attributes describing the entity at that node.

In this data model the directory node structure is as follows:

Figure 3: HtDOM node structure (labels recoverable from the figure: HtDOMItemList; type, name, path, host, port, plus; next; HtDOM; created, contents; HtDOMItemList)

In the directory node structure shown above, the "previous directory" function on the selected directory is equivalent to "move backward". If the user now selects anything except "move forward", then this new selection will replace directory N and all of its successors. Each directory from N on is released.

Once a directory is released, the directory data structure is removed from the stack and moved back to a "free list".
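The backward/forward behaviour and the free list described above can be sketched as a small navigation stack. This is a minimal sketch under assumptions: the class name `DirectoryStack` and its methods are hypothetical, and directories are represented by plain names rather than by the full directory data structure.

```python
class DirectoryStack:
    """Visited directories with a current position; released ones go to a free list."""

    def __init__(self):
        self.stack = []       # directories from the first to the newest visited
        self.current = -1     # index of the directory being shown
        self.free_list = []   # released directory structures, available for reuse

    def select(self, name):
        """Open a new directory, releasing every directory after the current one."""
        while len(self.stack) > self.current + 1:
            self.free_list.append(self.stack.pop())
        self.stack.append(name)
        self.current += 1

    def move_backward(self):
        if self.current > 0:
            self.current -= 1

    def move_forward(self):
        if self.current + 1 < len(self.stack):
            self.current += 1


nav = DirectoryStack()
for name in ("root", "a", "b"):
    nav.select(name)
nav.move_backward()     # now showing "a"; "b" is still reachable with move_forward
nav.select("c")         # a new selection replaces "b", which is released
print(nav.stack, nav.free_list)
```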

Distributed hypertext object based items are saved (cached) with the directory until either the directory is released or they become stale. A time field in the directory structure identifies the time at which the directory contents were last fetched. Whenever the contents are needed, the time is checked, and if some user-specified threshold of elapsed time is exceeded, then the directory contents are freed and re-requested from the server.
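The staleness check can be sketched as below. The class name `CachedDirectory`, the `fetch` callable standing in for a request to the server, and the injectable `clock` (used here so the example does not have to wait in real time) are assumptions made for illustration.

```python
import time


class CachedDirectory:
    def __init__(self, fetch, max_age_seconds, clock=time.monotonic):
        self._fetch = fetch                # callable that asks the server for the contents
        self._max_age = max_age_seconds    # user-specified staleness threshold
        self._clock = clock
        self._contents = None
        self._fetched_at = None            # time the contents were last fetched

    def contents(self):
        """Return cached contents, re-requesting them when they are stale."""
        now = self._clock()
        stale = self._fetched_at is None or now - self._fetched_at > self._max_age
        if stale:
            self._contents = self._fetch()   # old contents are dropped and re-requested
            self._fetched_at = now
        return self._contents


fake_now = [0.0]    # simulated clock, in seconds
calls = []          # times at which the "server" was contacted


def fetch():
    calls.append(fake_now[0])
    return ["item fetched at %.0f" % fake_now[0]]


directory = CachedDirectory(fetch, max_age_seconds=60, clock=lambda: fake_now[0])
directory.contents()        # first access fetches
fake_now[0] = 30.0
directory.contents()        # still fresh, served from the cache
fake_now[0] = 100.0
directory.contents()        # older than the threshold, fetched again
print(calls)
```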

An HtDOMItemList is simply a first/last pointer structure:

Figure 4: HtDOM Item structure (boxes labelled HtDOMItem, HtDOMItem)
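A first/last pointer structure of this kind can be sketched as a singly linked list. The class names follow the figure labels, and `next` appears among the labels of Figure 3, but the `name` field, the `append` method and the example item names are assumptions.

```python
class HtDOMItem:
    def __init__(self, name):
        self.name = name
        self.next = None      # following item in the list, or None at the end


class HtDOMItemList:
    def __init__(self):
        self.first = None     # head of the list
        self.last = None      # tail of the list

    def append(self, item):
        """Add an item at the end in constant time using the `last` pointer."""
        if self.first is None:
            self.first = item
        else:
            self.last.next = item
        self.last = item

    def __iter__(self):
        item = self.first
        while item is not None:
            yield item
            item = item.next


items = HtDOMItemList()
for name in ("readme", "papers", "images"):
    items.append(HtDOMItem(name))
print([item.name for item in items])
```

Keeping a `last` pointer avoids walking the whole list each time a directory entry is appended.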

In the HtDOM tree structure, each node is defined as an object, and each object has a widget class and an instance name. Indentation shows the parent-child relationships.

HtDOMSystem htdom
    Form statusForm
        Command quit
        MenuButton other
            SimpleMenu otherActionMenu
                type copy
                type unmarkAll
                type options
                type version
                type restart
        Command help
        Label status
    Form Document
        Label documentTitle
        Viewport directoryView
            vertical Scrollbar
            directory List
        Label bookmarkTitle
        Form bookmarkForm
    TransientShell optionsPanel
        Form optionsForm
            Box buttonBox
                Command done
                Command cancel
                Command help
            Form showForm
                Box showWhatDocument
                    Toggle showWhat
                    Label showWhatDocument
                Box appendBkDocument
                    Toggle appendBk
                    Label appendBkDocument
                Box loadBkBox
                    Toggle loadBk
                    Label loadBkDocument
                Box resetBox
                    Toggle reset
                    Label resetDocument
            Form bkSaveForm
                Toggle bkSaveDocument
                Label bkSave
            Form printCmdForm
                Label printCmdDocument
                Text printCmd
            Form imageCmdForm
                Label imageCmdDocument
                Text imageCmd
    TransientShell saveShell
        Form cdForm
            Label cdLabel
            Document cdPathName
            Label cdErrMessage
            Command ok
            Command cancel
            Command help
    TransientShell dupDocumentShell
        Form dupDocumentForm
            Label label1
            Label labelDocument
            Label label2
            Command ok
            Command cancel
            Command append
            Command help
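The convention used in the listing (one object per line, widget class followed by instance name, indentation for parent-child relationships) can be reproduced with a small recursive structure. This sketch is independent of any real toolkit; the `Widget` class is hypothetical and only a small subset of the listed widgets is shown.

```python
class Widget:
    def __init__(self, widget_class, name, children=()):
        self.widget_class = widget_class
        self.name = name
        self.children = list(children)

    def lines(self, depth=0):
        """Yield one 'Class name' line per widget, indented by tree depth."""
        yield "  " * depth + f"{self.widget_class} {self.name}"
        for child in self.children:
            yield from child.lines(depth + 1)


tree = Widget("HtDOMSystem", "htdom", [
    Widget("Form", "statusForm", [
        Widget("Command", "quit"),
        Widget("Command", "help"),
        Widget("Label", "status"),
    ]),
])
print("\n".join(tree.lines()))
```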

Conclusion

The primary objective of this paper is to understand the changing scenario in information technology, and to recognise that in this rapidly changing world keeping track of information is becoming a very difficult task for a single site or person. It is therefore proposed to adopt better techniques and technologies for managing large amounts of information, in which the end-user is not required to become familiar with several entirely different systems or with the internal configuration of the system; all such aspects are transparent to the user. Another important aspect is the implementation of such tools based on international standards, like OSI/ISO, CCITT and so on. Such a system gives the user versatile access to full-text documents through one easy-to-use interface. This system has shown that current technologies can be used to make profitable and convenient wide area information systems. Relational DBMSs were designed to optimise for environments with large numbers of users who issue short queries. They give very poor performance in the case of large queries. A multimedia based database, by contrast, may contain variable-length text, graphics, images, and audio and video data. In addition, those data are distributed and saved in the form of objects; therefore, for retrieval of such information we use a mixed approach combining all these techniques.

References

[1] Rao, Savita. Hypertext in Information Storage and Retrieval, A Technical Report submitted to SHPT School of Library Science, April 1993.

[2] Hurson, A.R., Pakzad, S.H. Object-Oriented Database Management Systems: Evolution and Performance Issues, IEEE Computer, February 1993, pp. 48-60.

[3] Salton, Gerald, McGill, Michael. Introduction to Modern Information Retrieval, McGraw-Hill, 1983.

[4] Franklin Davis et al. WAIS Interface Protocol Prototype Functional Specification, Thinking Machines. Available from Franklin Davis (fad@think.com).

[5] Woodhead, Nigel. Hypertext and Hypermedia: Theory and Applications, England, Sigma Press, 1990.

[6] Sieverts, E.G. Software for Information Storage and Retrieval Tested, Evaluated, and Compared: Part V - Personal Information Management, Hypertext and Relevance Ranking Programs, The Electronic Library, Vol.10, No.6 (December 1992), pp. 339-357.

[7] Nielsen, Jakob. Hypertext & Hypermedia, Boston: Academic Press, 1990.

[8] Rada, Roy. Converting a Textbook to Hypertext, ACM Transactions on Information Systems, Vol.10, No.3 (July 1992), pp. 294-313.

[9] Frisse, Mark. From Text to Hypertext, Byte, Vol.13, No.10 (October 1988), pp. 247-254.

[10] Jones, H. Developing and Distributing Hypertext Tools: Legal Inputs and Parameters, Proc. ACM Hypertext 1987 Conf., Chapel Hill, NC, pp. 13-15.

[11] Smith, John B. & Weiss, Stephen. Hypertext, Communications of the ACM, Vol.31, No.7 (July 1988), pp. 816-819.

[12] The Art of Navigating through Hypertext, Communications of the ACM, Vol.33, No.3 (March 1990), pp. 296-310.

[13] Ramaiah, C.K. Hypertext and Hypermedia: An Overview, DESIDOC Bulletin of Information Technology, Vol.12, No.6 (November 1988), pp. 3-13.

[14] Akscyn, R.M. et al. KMS: A Distributed Hypermedia System for Managing Knowledge in Organizations, Communications of the ACM, Vol.31, No.7 (July 1988), pp. 820-835.