DEVELOPMENT OF A MOBILE AGENT BASED WEBSEARCH IN AJANTA by Arvind Prakash A Plan B report submitted in partial fulfillment of the requirements for the degree of MS in Computer Science University of Minnesota 1999 Approved by ________________________________________ Chairperson of Supervisory Committee __________________________________________ __________________________________________ __________________________________________

Thesis - ajanta.cs.umn.eduajanta.cs.umn.edu/papers/ArvindPlanB.doc  · Web viewby. Arvind Prakash. A Plan B report submitted in partial fulfillment of the requirements for the degree


DEVELOPMENT OF A MOBILE AGENT BASED WEBSEARCH

IN AJANTA

by

Arvind Prakash

A Plan B report submitted in partial fulfillment of the

requirements for the degree of

MS in Computer Science

University of Minnesota

1999

Approved by _________________________________________________

Chairperson of Supervisory Committee

_________________________________________________

_________________________________________________

_________________________________________________

Program Authorized to Offer Degree _________________________________________________

Date _________________________________________________


University of Minnesota

Abstract

DEVELOPMENT OF A MOBILE AGENT BASED WEBSEARCH IN AJANTA

by Arvind Prakash

Chairperson of the Supervisory Committee: Dr. Anand Tripathi

Department of Computer Science.

Ajanta is a Java-based system for programming applications using mobile agents over the Internet. This report explains the design and implementation of a middleware system for performing Web Search. We extend the existing File Access system in Ajanta by adding this new primitive. The Web Search system lets the user perform full-text keyword searches on the files in a remote user’s web directory. It offers several options to narrow a search, and it presents the results in different views that give the user insight into the distribution of the keyword. We also implement another primitive to fetch the status of a remote file. A complete Graphical User Interface (GUI) for the File Access System has also been designed and developed. This GUI has been designed to be generic and easily extensible.


TABLE OF CONTENTS

1. INTRODUCTION
    OVERALL GOALS
    MOTIVATIONS
        Why Agent-based?
        Why not use the existent search utilities like Yahoo, Altavista etc?
    SALIENT FEATURES OF THE WEBSEARCH SYSTEM

2. BACKGROUND
    2.1. AN OVERVIEW OF AJANTA
    2.2. AGENT, AGENT SERVERS AND ITINERARY
    2.3. FILE ACCESS SYSTEM
        2.3.1 File server Architecture
        2.3.2 File System resource
        2.3.3 File Server Thread

3. AN AGENT-BASED WEB SEARCH SYSTEM
    3.1 DESIGN GOALS AND REQUIREMENTS
        3.1.1 Information Filtering
        3.1.2 Web search options
        3.1.3 Presentation Views at the Client side
        3.1.4 Security and Privacy
    3.2 DESIGN AND IMPLEMENTATION OVERVIEW
        3.2.1 Extensions to Existing File Access system
        3.2.2 Information Filtering
        3.2.3 Presentation Views

4. GUI FOR FILE ACCESS SYSTEM AND WEB SEARCH AGENT
    4.1 FUNCTIONAL REQUIREMENTS
    4.2 GUI DESCRIPTION AND SNAPSHOTS
        4.2.1 File Access System GUI
        4.2.2 Web Search GUI
    4.3 IMPLEMENTATION OVERVIEW

5. CONCLUSIONS AND FUTURE WORK

REFERENCES

APPENDIX
    RESULTS AND PRESENTATION VIEW SNAPSHOTS


LIST OF FIGURES

Figure 1: The Ajanta Server Architecture
Figure 2: File Access Server Architecture
Figure 3: Agent interaction in WebSearch System
Figure 4: Information filtering on either side
Figure 5: Main GUI for the File Access System
Figure 6: GUI for the Transfer Primitive
Figure 7: WebSearch choice being made in the main GUI
Figure 8: Server Choice drop down box
Figure 9: WebSearch GUI
Figure 10: Segregated View
Figure 11: Combined View
Figure 12: Directory Structure View
Figure 13: Abstract View


ACKNOWLEDGMENTS

I thank my advisor Dr. Anand Tripathi for giving me an opportunity to work on this interesting project. This would not have been possible without all his support, encouragement and suggestions. My gratitude goes to Neeran Karnik, whose invaluable work in creating this system provided me with a base to develop my project.

I enjoyed working in the Ajanta group and would like to thank my colleagues Ram Singh and Tanvir Ahmed for their support.

No words will be enough to thank my parents, who inculcated in me a passion to enjoy everything I do and a never-say-die attitude. I am immensely fortunate to be their son.

Two years of graduate life have been an interesting experience. I am thankful to all the friends I made here for their support on both professional and personal fronts.

Finally, my humble thanks to the gracious Almighty for giving me a chance to be part of his all encompassing, global project.


1. Introduction

Ajanta¹ is a Java-based framework for programming mobile agent based applications on the Internet. A mobile agent is a program that represents a user in a network, can migrate from node to node, and can make decisions autonomously on behalf of the user it represents. Its tasks are determined by the agent application, and can range from online shopping to real-time applications. After accomplishing their goals, agents may either terminate or return to their “source” to report the results to the user. Thus, applications can launch mobile agents into a network and allow them to roam it, either on a predetermined path or on one that the agents themselves determine based on dynamically gathered inputs.

Traditionally, applications in distributed systems have been structured using the client-server paradigm, in which the client and server communicate either through messages or through remote procedure calls (RPC). This model is synchronous in nature, as the client has to wait while the server processes the request. In the Remote Evaluation (REV) model, the client, instead of invoking a procedure, sends the procedure code to the server and requests the server to execute it and return the results. The mobile agent paradigm differs from RPC and REV mainly in that the agent carries its code as well as its data with it during migration.

The inherent advantage of this paradigm is its ability to provide increased asynchrony and autonomy in client-server interactions [1] by moving client code and computations to the remote server resources. The agent paradigm also provides other benefits: a client can decompose its tasks among multiple agents to obtain parallelism and fault tolerance. This makes the paradigm well suited to applications such as information searching, filtering and retrieval, and e-commerce on the World Wide Web.

Overall Goals

Information search and filtering applications often download large amounts of information over a network, process it, and generate comparatively small amounts of result data. If we write these applications using mobile agents, the agents can execute on the server machines and return only the results, thus avoiding network overload. For example, Web-based applications use the stateless HTTP protocol, which often necessitates several network connections for each application-level transaction. If mobile agents are used instead, the client does not have to maintain a network connection while its agents access and process information.

The prime goal of this project was thus to develop an agent-based Web Search middleware that extends the existing File Access system of Ajanta, and to develop a generic GUI for the File Access system. A primitive to return the status of a remote file was also developed. The status comprises file properties such as size, last modified date and permissions. The Web Search application launches an agent to search for one or more keywords on selected servers. This agent then autonomously performs searches on the web directories of these servers, filters the results as per user directives and brings back the filtered results. The GUI developed for the File Access system is designed to be generic, in the sense that new primitives can be added without changes to the main GUI. The status primitive is valuable for obtaining information about files before we decide to fetch them from a remote site.

¹ See http://www.cs.umn.edu/Ajanta

Motivations

There were some questions to be answered before we embarked on this project.

Why Agent-based?

There were two reasons for this.

As mentioned previously, we wanted to exploit the inherent asynchrony provided by the agent-based paradigm. In addition, we wanted to perform as much of the information search and filtering as possible at the server side.

Another reason, of course, was to exercise the Ajanta capabilities by building on the existing File Access system.

Why not use existing search utilities like Yahoo, Altavista etc.?

The Web Search is aimed at specific users and their web pages. It allows one to perform an exhaustive search of a specific user’s web pages, provided that user is running Ajanta’s File Access System server. For example, students at the University of Minnesota can find out whether the web pages of their instructors have been modified in the past few hours. This kind of search is not possible with the existing search utilities.

Salient Features of the WebSearch system

One agent can visit and search multiple servers for the same keyword(s).

Fast and efficient search as the filtering of search results is done at the server side.

Web Search returns the file names as URLs and fetches abstracts of files if specified.


The results are presented in different view formats and thus provide considerable insight into the distribution of the keyword in the remote user’s directory.

Security and privacy concerns are addressed.

It is GUI based and provides the user with an easy and handy interface both to enter parameters and to display the results.

2. Background

This section gives a brief overview of the Ajanta system. Section 2.2 furnishes some details about the basic components involved in the system. In section 2.3, we describe the existing File Access System in detail.

2.1. An overview of Ajanta

Ajanta provides a programming environment that facilitates the development of applications using the mobile agent methodology. Agents are active mobile objects [2], which encapsulate code and execution context along with data. The Ajanta system is implemented in Java and uses Java facilities such as object serialization, reflection and remote method invocation (RMI).

Two of the main requirements of a mobile agent system are security and robustness. We need to protect both the host and the agent from being tampered with. We also need to protect against malicious users and agents and against “denial of service” attacks. Robustness is a main concern as well, especially in a dynamic, unreliable medium like the Internet. Ajanta addresses these requirements by extending the security model provided in Java.

The first step in creating an agent-based application is to define the services that will be provided by host servers to visiting agents. The server then needs appropriate resources to implement these services. More importantly, a generic framework has to be created that allows the server to verify an agent’s identity, create an execution environment, grant the agent restricted access to its local resources and allow easy agent migration.

2.2. Agent, Agent Servers and Itinerary

In this section, we give a brief overview of the basic elements of the Ajanta system pertinent to this project.


Figure 1: The Ajanta Server Architecture

Ajanta provides implementations of a generic agent, defined by the Agent class, and a generic agent server, defined by the AgentServer class. These classes can be extended by applications to build their own specific servers and agents. Each agent is bound to its host environment object through an object reference named host. The generic agent server also makes it possible to control a visiting agent’s access to the host at any desired granularity of access control [4], using the agent’s credentials. An agent’s credentials are a signed certificate that contains information such as the names of the agent and its owner. The Ajanta architecture uses a location-independent global naming scheme based on URNs (Uniform Resource Names) [3] for referencing or communicating with agents, servers and any other resources.

There is a special class called ItinAgent, which has an itinerary specifying the servers to visit. The user can specify the task list in a request file, and the agent creates the itinerary from this task list file. When the agent is started, the start method of the ItinAgent class finds the first itinEntry in the itinerary and launches the agent to execute that entry. After the agent is done with a task, it moves on to subsequent tasks in order by looking up the itinerary, which it carries along as it migrates.

This is where a novel feature has been introduced in Ajanta. Agent itineraries can be shaped by what we call a “pattern of migration”. A pattern [6] separates the specification of an agent’s migration path from its computation tasks. The current patterns include sequence, set, selection, loop, split and split-join. In the set pattern, the order of the tasks is not important and the agent picks any one of the pending tasks. The selection pattern selects only one entry based on a user-specified directive, while the loop pattern loops through all tasks in sequence until a user-defined specification is satisfied. The split pattern results in the creation of child agents for the parallel traversal of its contained patterns. The split-join pattern is a specialization of the split pattern in which the child agents must report their results to some object (mostly the parent), and the parent can wait for one or all of the child agents to return. Sequence is the pattern pertinent to this project. It is the simplest of the patterns: the tasks are executed by the same agent in the order in which they were submitted.
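The sequence pattern can be illustrated with a minimal Java sketch. The class names ItinEntrySketch and SequenceItinerarySketch, their methods and the example URNs are hypothetical and are not part of the Ajanta API; the sketch only shows the idea of an ordered task list with a cursor that travels with the agent.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one itinerary entry: a server URN plus the task to run there.
final class ItinEntrySketch {
    final String serverUrn;
    final String task;

    ItinEntrySketch(String serverUrn, String task) {
        this.serverUrn = serverUrn;
        this.task = task;
    }
}

// Sequence pattern: entries are visited one at a time, in submission order.
final class SequenceItinerarySketch {
    private final List<ItinEntrySketch> entries = new ArrayList<>();
    private int next = 0; // index of the next entry; this state migrates with the agent

    void add(String serverUrn, String task) {
        entries.add(new ItinEntrySketch(serverUrn, task));
    }

    boolean hasNext() {
        return next < entries.size();
    }

    // Returns the next entry and advances the cursor.
    ItinEntrySketch nextEntry() {
        if (!hasNext()) {
            throw new IllegalStateException("itinerary exhausted");
        }
        return entries.get(next++);
    }
}
```

An agent following this sketch would call nextEntry() after finishing each task and migrate to the returned server until hasNext() is false.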

2.3. File Access System¹

The File Access System is a classic application (middleware) built on the Ajanta framework. It is designed to allow effective sharing of files over the Internet. Each host (user) runs a File Server, an extension of the agent server that provides restricted access to a portion of the host’s local files. Visiting agents can request files by name (URN), deposit files into the local file system, search files using keywords, etc.

2.3.1 File server Architecture

Figure 2: File Access Server Architecture

As the File Server extends the basic agent server class, it inherits basic agent hosting capabilities. In addition, it implements a FileSystem resource whose proxy is given to agents to access the files. When the file server starts executing, it creates an instance of the FileSystem resource and inserts it into a resource registry. FileSystem is a Java interface and is implemented by the FileSystemImpl class. The file system is actually a specific set of files that the user has made available to agents. The user configures the file server by specifying a directory that acts as the root of the file system and is thereby made accessible to agents. This is also where the index files for the search primitive are stored.

¹ This system was developed as part of Ram Singh’s Plan B research.

2.3.2 File System resource

The File System interface [5] defines the APIs available to an agent to perform the File System functions and primitives. FileSystemImpl implements each of these APIs. Here is a short description of each of them.

The fetchFile method for the Fetch primitive requests the file system to copy the contents of the requested file into the response object of the agent. An agent is able to fetch the file only if the owner of the agent has read permission on the file.

The transferFile method is an extension of the Fetch primitive. If the file to be transferred is large, the buffer might not be sufficient. In this case, it makes sense to transfer the file over a TCP connection. The parent of the agent starts a thread called ListenerThread, which listens on a port and saves the incoming file.

The depositFile method performs the reverse of the fetchFile operation. The agent carries the file with it in its fileBuf buffer to the specified destination and deposits the file there.

The search operation allows agents to perform full-text searches on the index in the root directory of the destination. We use the Glimpse utility (see section 3.2.1) for indexing and searching. If successful, the search operation returns a list of the filenames and the frequency of occurrence of the keyword. Based on the result of the search, the user may choose to fetch or transfer the file.
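The fetch and deposit primitives can be illustrated with a small in-memory sketch in Java. The interface FileSystemSketch, its method signatures and the owner-based permission table are assumptions made for illustration; the actual Ajanta FileSystem interface and FileSystemImpl class may look different.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified view of the resource interface; real signatures may differ.
interface FileSystemSketch {
    byte[] fetchFile(String owner, String fileName);
    void depositFile(String owner, String fileName, byte[] contents);
}

final class InMemoryFileSystemSketch implements FileSystemSketch {
    private final Map<String, byte[]> files = new HashMap<>();
    private final Map<String, Set<String>> readers = new HashMap<>();

    // Grants read permission on a file to an agent owner.
    void allowRead(String fileName, String owner) {
        readers.computeIfAbsent(fileName, k -> new HashSet<>()).add(owner);
    }

    // Fetch succeeds only if the agent's owner has read permission on the file.
    @Override
    public byte[] fetchFile(String owner, String fileName) {
        Set<String> allowed = readers.get(fileName);
        if (allowed == null || !allowed.contains(owner)) {
            throw new SecurityException(owner + " may not read " + fileName);
        }
        byte[] data = files.get(fileName);
        if (data == null) {
            throw new IllegalArgumentException("no such file: " + fileName);
        }
        return data.clone(); // copy into the caller's response, never share the buffer
    }

    // Deposit is the reverse of fetch: the agent hands over the bytes it carried.
    @Override
    public void depositFile(String owner, String fileName, byte[] contents) {
        files.put(fileName, contents.clone());
        allowRead(fileName, owner); // assumption: a depositor may read its own file
    }
}
```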

2.3.3 File Server Thread

When the agent is sent to perform a task at the remote site, it needs permissions to execute the task there. If these permissions were granted to the agent’s own thread, a malicious user could exploit them to damage the local system. This problem is solved by having a separate thread, called the File Server Thread, running under the protection domain of the agent server at the remote site. This thread services the task to be performed by the agent and has access to the underlying local files. The agent deposits the task into a buffer, which the File Server thread reads from. After the task has been performed, the File Server thread writes the results into a response buffer, which the agent reads from. This structure ensures that the agent never has direct access to the local system.
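The request and response buffers described above can be sketched with two blocking queues in Java. The class FileServerThreadSketch and its submit method are hypothetical names; the sketch models only the hand-off between the two threads and does not model Java protection domains.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

// Hypothetical servicing thread: only this thread runs the privileged task.
final class FileServerThreadSketch extends Thread {
    private final BlockingQueue<String> requests = new ArrayBlockingQueue<>(1);
    private final BlockingQueue<String> responses = new ArrayBlockingQueue<>(1);
    private final Function<String, String> privilegedTask;

    FileServerThreadSketch(Function<String, String> privilegedTask) {
        this.privilegedTask = privilegedTask;
        setDaemon(true); // do not keep the JVM alive for this sketch
    }

    // Called on the agent's thread: deposit a task, then wait for the response.
    String submit(String task) {
        try {
            requests.put(task);
            return responses.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    // Runs on the server's thread: read a task, perform it, write the result.
    @Override
    public void run() {
        try {
            while (true) {
                String task = requests.take();
                responses.put(privilegedTask.apply(task));
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The agent's thread only ever touches the two queues, so whatever access the task needs stays confined to the servicing thread.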


3. An Agent-Based Web Search System

In this section, we describe the design goals and requirements in detail, followed by the implementation overview. The File Access system was extended to support this Web Search system.

3.1 Design Goals and Requirements

The main goal of this Web Search system is to allow an agent to perform full-text keyword searches on the files in a remote user’s web directory. The system should allow a client to dispatch an agent to a remote user’s file access server, perform the search and return the file names as URLs (Uniform Resource Locators). The file server first constructs an index for the user’s web directory and stores the index in the shared global file system. The indexing is done using the Glimpse utility (developed at the University of Arizona, Tucson). Visiting agents can then search this web index with any combination of keywords permitted by Glimpse.

Figure 3: Agent interaction in WebSearch System

The Web Search system has the following requirements.

3.1.1 Information Filtering:

An important aspect that comes into play is the filtering of information. Sometimes the search string or keyword is so common that almost all files in the remote user’s directory contain it. To refine the granularity of the search and make it more effective, we introduce some filtering constraints in the Web Search system.


Filtering can be based on the frequency of the keywords in a file, on the last modified date of the file, or on both. The client can specify a frequency count that a file must exceed to be returned. For example, a client searching for the word “agent” can ask for only those files in which “agent” occurs more than 10 times. Such a filter is useful because almost all files in our web directory may contain the word “agent”, and it is reasonable to assume that only files containing it more than ten times are likely to hold information of interest. Similarly, another useful filter restricts the results to files that have been modified recently: the client specifies a last modified date, and only the files that have been modified since that date are returned. The filtering is done on the server side; the File Access system performs it based on the filtering arguments provided.
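The two filters can be expressed compactly in Java. This is a minimal sketch under the assumption that each hit already carries its keyword count and last modified date; the names SearchHit and HitFilterSketch are hypothetical and do not come from the File Access system.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hypothetical record of one search hit on the server side.
final class SearchHit {
    final String fileName;
    final long count;        // occurrences of the keyword in the file
    final Date lastModified; // last modified date of the file

    SearchHit(String fileName, long count, Date lastModified) {
        this.fileName = fileName;
        this.count = count;
        this.lastModified = lastModified;
    }
}

final class HitFilterSketch {
    // Keeps hits whose count exceeds minCount and, if a date is given,
    // whose last modified date is after modifiedSince.
    static List<SearchHit> filter(List<SearchHit> hits, long minCount, Date modifiedSince) {
        List<SearchHit> kept = new ArrayList<>();
        for (SearchHit h : hits) {
            boolean frequentEnough = h.count > minCount;
            boolean recentEnough = modifiedSince == null || h.lastModified.after(modifiedSince);
            if (frequentEnough && recentEnough) {
                kept.add(h);
            }
        }
        return kept;
    }
}
```

Passing null as modifiedSince disables the date filter, which mirrors a search with no if-modified-since date.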

3.1.2 Web search options:

The options provided to a user of the Web Search system are the following.

Keyword
The user enters the keyword to search for. Combinations of keywords are allowed, and the AND and OR operators follow the Glimpse syntax. In Glimpse, the AND operator is represented by ‘;’ (semicolon) and the OR operator by ‘,’ (comma). Therefore, to search for files containing both “agent” and “ajanta”, we can enter “agent ; ajanta”.

-i
This option specifies that the user wants a case-insensitive search.

Frequency count
This option specifies the frequency count filter. Only files in which the keywords occur more than this many times are returned. The default value is 10.

If-modified-since
This option specifies the last modified date filter. Only files that contain the keyword and were modified after this date are returned. By default, no date filter is applied.

Abstract
This option specifies whether the user wants the first 50 words of the file to be brought back. This works like a preview. If the user disables this option, then only the filenames and frequency are brought back. The default is no abstract.
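The options and their defaults can be gathered in one small Java holder, together with a helper that cuts an abstract from a text. The names WebSearchOptionsSketch and AbstractSketch are hypothetical, the default for the case-insensitive flag is an assumption (the text does not state one), and splitting words on whitespace is only one possible reading of “the first 50 words”.

```java
import java.util.Arrays;
import java.util.Date;

// Hypothetical holder for the user's search options.
final class WebSearchOptionsSketch {
    final String keyword;          // Glimpse syntax: ';' is AND, ',' is OR
    boolean ignoreCase = false;    // assumption: the default is not stated in the text
    long frequencyCount = 10;      // default stated in the text
    Date ifModifiedSince = null;   // default: no date filter
    boolean fetchAbstract = false; // default: no abstract

    WebSearchOptionsSketch(String keyword) {
        this.keyword = keyword;
    }
}

final class AbstractSketch {
    static final int ABSTRACT_WORDS = 50;

    // Returns at most the first 50 whitespace-separated words of the text.
    static String firstWords(String text) {
        String[] words = text.trim().split("\\s+");
        int n = Math.min(words.length, ABSTRACT_WORDS);
        return String.join(" ", Arrays.copyOf(words, n));
    }
}
```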

3.1.3 Presentation Views at the Client side:

The presentation views represent the client-side filtering needed to present the results to the user. The results are brought back by the agent in its response object. A Perl script is then executed on this output to convert it into an HTML file, which is opened in Netscape for the user to preview. Other scripts are run to format the data for the different views.

Segregated View:
In this view, the filenames and the frequency of the keywords appearing in the files are shown in decreasing order of frequency (higher frequency first), segregated by destination server. For example, if searches for the same keyword were made at two different destinations in order, this view sorts the results of each destination separately and keeps them apart.

Combined View:
This view combines the search results from all servers. It proves useful in cases where the location of the file is not an issue.

Directory Structure View:
This view presents the clustering of information with respect to the directory structure. With its help, the user can deduce where the information pertinent to him is stored. He can then fetch that particular directory or subdirectory and cache it, instead of downloading all possible hits.
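The ordering and grouping behind the three views can be sketched in Java. This is not the Perl-based implementation described above; the names ViewHit and ViewsSketch are hypothetical, and summing the counts per directory is an assumed way of showing how the keyword clusters.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical hit as seen on the client side.
final class ViewHit {
    final String server;
    final String path;
    final long count;

    ViewHit(String server, String path, long count) {
        this.server = server;
        this.path = path;
        this.count = count;
    }
}

final class ViewsSketch {
    private static final Comparator<ViewHit> BY_COUNT_DESC =
        Comparator.comparingLong((ViewHit h) -> h.count).reversed();

    // Combined view: all hits from all servers, highest frequency first.
    static List<ViewHit> combined(List<ViewHit> hits) {
        List<ViewHit> sorted = new ArrayList<>(hits);
        sorted.sort(BY_COUNT_DESC);
        return sorted;
    }

    // Segregated view: one list per server in visiting order, each sorted by frequency.
    static Map<String, List<ViewHit>> segregated(List<ViewHit> hits) {
        Map<String, List<ViewHit>> byServer = new LinkedHashMap<>();
        for (ViewHit h : hits) {
            byServer.computeIfAbsent(h.server, k -> new ArrayList<>()).add(h);
        }
        for (List<ViewHit> list : byServer.values()) {
            list.sort(BY_COUNT_DESC);
        }
        return byServer;
    }

    // Directory structure view: total keyword count per parent directory.
    static Map<String, Long> byDirectory(List<ViewHit> hits) {
        Map<String, Long> totals = new TreeMap<>();
        for (ViewHit h : hits) {
            int slash = h.path.lastIndexOf('/');
            String dir = slash < 0 ? "" : h.path.substring(0, slash);
            totals.merge(dir, h.count, Long::sum);
        }
        return totals;
    }
}
```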

3.1.4 Security and Privacy

Security and privacy are obviously important concerns as far as web search is concerned. Although a web directory is meant for others to view the information you provide, there are instances when you do not want the world to view a certain file in your web directory. Common examples are data files that support CGI scripts. The Web Search system returns only those files that have UNIX “world read” permission. As far as security is concerned, the Ajanta security model is designed to protect servers and agents against outside attacks.
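A “world read” check can be written with the standard java.nio.file API, as in the sketch below. The class WorldReadSketch is a hypothetical name, and this API postdates the original implementation, which may have obtained the permissions differently.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.util.Set;

final class WorldReadSketch {
    // True if the "others" class has read permission (the r in rw-r--r--).
    static boolean isWorldReadable(Set<PosixFilePermission> perms) {
        return perms.contains(PosixFilePermission.OTHERS_READ);
    }

    // Convenience overload for a file on a POSIX file system.
    static boolean isWorldReadable(Path file) {
        try {
            return isWorldReadable(Files.getPosixFilePermissions(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A server following this sketch would drop every hit for which isWorldReadable returns false before building its response.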

3.2 Design and Implementation overview

In this section, we cover the implementation of the above design requirements.


3.2.1 Extensions to Existing File Access system

We have added two primitives to the existing File Access system, viz. Web Search and File Status. Web Search extends the File Access system to search the web directories of the remote user, while File Status returns the status (file length, last modified date and UNIX permissions) of a specified file in the shared global file system.

Earlier, we mentioned the indexing done at the server side. This indexing is done using the Glimpse utility, an indexing and query system that allows quick searching for keywords through files. Glimpse (which stands for GLobal IMPlicit SEarch) supports Boolean queries, approximate matching and a limited form of regular expressions. To run glimpse, we first have to index all the files in the directory of interest (in our case, the .www directory) by running the glimpseindex command, the indexing program for glimpse. It provides three indexing options: a tiny index (2-3% of the total size of all files), a small index (7-8%) and a medium-size index (20-30%). Search times are normally better with larger indexes. The indexing covers all the files in the specified directory and, recursively, all its subdirectories.

Glimpse supports many options, and we use the following:

-i
This option specifies that the search is to be case-insensitive.

-c
This option displays only the count of matching records. Only files with count > zero are displayed.

-y
This option suppresses any prompts, so that the search can be done without interruptions.

-H
This option specifies the directory where the index can be found.

A typical example of a glimpse command would be

>glimpse -c -y -H /home/grad22/user1/root/.webindex "agent" -i

The glimpse index is stored under the root directory as a .webindex file.

FileStat primitive

This primitive was implemented to get the properties of a remote file. The only change made to the existing file system described in section 2.3 was in the File System class, where a FileStat API was defined. A getFileStat method was implemented in the FileSystemImpl class. This method deposits the request and the file name for the File Server thread, which performs the task. The API of the getFileStat method is as follows:

public StatResponse getFileStat(String fileName);
StatResponse(long len, Date lastmod, String perm);

The File Server thread returns the results in a StatResponse object, whose constructor is shown above. It carries the length, last modified date and permissions of the file, and is sent back with the agent as the result of the FileStat primitive.
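
As an illustration of what the File Server thread has to collect, the following is a minimal sketch, not the Ajanta implementation. It assumes the modern java.nio.file API (which was not available in 1999), and the class FileStatSketch and the helpers statOf and demo are hypothetical names introduced only for this example. Only the three StatResponse fields come from the API above.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Date;

// Stand-in for the report's StatResponse: length, last modified date, permissions.
class StatResponse {
    final long len;
    final Date lastmod;
    final String perm;

    StatResponse(long len, Date lastmod, String perm) {
        this.len = len;
        this.lastmod = lastmod;
        this.perm = perm;
    }
}

class FileStatSketch {
    // Hypothetical helper: gathers the three properties for one file.
    static StatResponse statOf(File file) {
        String perm;
        try {
            // Produces a UNIX-style string such as "rw-r--r--".
            perm = PosixFilePermissions.toString(Files.getPosixFilePermissions(file.toPath()));
        } catch (UnsupportedOperationException | IOException e) {
            perm = "unknown"; // file system without POSIX permissions, or unreadable file
        }
        return new StatResponse(file.length(), new Date(file.lastModified()), perm);
    }

    // Hypothetical demo: writes a 5-byte temporary file and returns its status.
    static StatResponse demo() {
        try {
            File tmp = File.createTempFile("stat", ".txt");
            tmp.deleteOnExit();
            Files.write(tmp.toPath(), "hello".getBytes(StandardCharsets.UTF_8));
            return statOf(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StatResponse r = demo();
        System.out.println(r.len + " bytes, modified " + r.lastmod + ", permissions " + r.perm);
    }
}
```

In this sketch the permission string falls back to "unknown" where POSIX permissions are not supported, which keeps the response well-formed on any platform.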

Web Search primitive

To implement the Web Search primitive, the API was first defined in the File System interface and the implementation was added to the FileSystemImpl class. The Web Search API is as follows:

public WebSearchResponse webSearch (String searchString, boolean ignoreCase, long freq, Date lastdate)

The search string, the case-insensitive search flag, the frequency count filter and the if-modified-since date are passed as arguments to the webSearch API. When the agent is launched, it carries an itinerary to the remote AgentServer. At the server side, this itinerary is executed in sequence (other patterns are possible), and the response is returned to the source. The agent deposits the web search task with the servicing thread at the remote server. This thread runs the glimpse command in the background, performs the requested filtering on the results and returns the response. The WebSearchResponse object brings back the file names and, for each file, the number of occurrences of the keyword.
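
The way the servicing thread might invoke glimpse and read back the counts can be sketched as follows. This is an illustration under stated assumptions, not the Ajanta code: the class GlimpseSketch and its methods are hypothetical, the options are placed before the pattern, and the output of the -c option is assumed to consist of lines of the form "path: count".

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GlimpseSketch {
    // Builds the argument list using the options described in the text (-c, -y, -i, -H).
    static List<String> buildCommand(String indexDir, String searchString, boolean ignoreCase) {
        List<String> command = new ArrayList<>();
        command.add("glimpse");
        command.add("-c");
        command.add("-y");
        if (ignoreCase) {
            command.add("-i");
        }
        command.add("-H");
        command.add(indexDir);
        command.add(searchString);
        return command;
    }

    // Parses lines assumed to look like "/path/to/file: 12"; other lines are skipped.
    static Map<String, Long> parseCounts(List<String> lines) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String line : lines) {
            int sep = line.lastIndexOf(':');
            if (sep < 0) {
                continue;
            }
            try {
                long count = Long.parseLong(line.substring(sep + 1).trim());
                counts.put(line.substring(0, sep).trim(), count);
            } catch (NumberFormatException e) {
                // Not a count line; ignore it.
            }
        }
        return counts;
    }

    // Runs the command and parses its standard output (requires glimpse to be installed).
    static Map<String, Long> run(List<String> command) throws IOException {
        Process process = new ProcessBuilder(command).start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return parseCounts(lines);
    }

    public static void main(String[] args) {
        System.out.println(buildCommand("/home/grad22/user1/root/.webindex", "agent", true));
        List<String> sample = new ArrayList<>();
        sample.add("/home/grad22/user1/.www/index.html: 12");
        System.out.println(parseCounts(sample));
    }
}
```

Separating command construction and output parsing from process execution lets the first two be exercised without a glimpse installation.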

3.2.2 Information Filtering

The filtering of information at the server side was discussed in section 3.1.1. Filtering is important because it narrows the search, letting us pinpoint the subset of files that we want. Here is a brief overview of the implementation of each filter.

Filter type              Server side    Client side
1. Keyword Count
2. If-modified-since
3. Privacy
4. Abstract
5. View Filtering

Figure 4: Information filtering on either side

Keyword count

The keyword count filter specifies a lower bound on the frequency of occurrence of the keyword in a file. The count can be specified by the user and defaults to 10. Glimpse itself returns all files with a matching keyword count greater than zero. The doWebSearch method in the FileServerThread class then selects the files whose frequency count is greater than the specified value. Only these files are written into the WebSearchResponse object.

If-modified-since date

The WebSearchRequest object has a field for the if-modified-since date. When the user specifies a date, it is compared with the last modified date of each result file. Only files modified after the specified date are written into the WebSearchResponse object. Again, this filtering is done by the doWebSearch method of the FileServerThread class, as Glimpse does not support it. If no date is specified, all files are fetched.

Privacy

As mentioned before, files that do not have “world read” permission are not fetched, to respect the privacy of the remote user. A permission check is made on each result file, and only those that are world readable are written to the WebSearchResponse object.
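
The three checks described so far (keyword count, if-modified-since and privacy) can be sketched as a single pass over the glimpse results. This is a minimal illustration, not the doWebSearch code: the Hit class and the applyFilters method are hypothetical, and each hit is assumed to already carry its last modified date and a world-readable flag.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hypothetical record of one glimpse hit plus the file properties the filters need.
class Hit {
    final String path;
    final long count;
    final Date lastModified;
    final boolean worldReadable;

    Hit(String path, long count, Date lastModified, boolean worldReadable) {
        this.path = path;
        this.count = count;
        this.lastModified = lastModified;
        this.worldReadable = worldReadable;
    }
}

class FilterSketch {
    // Keeps only the hits that pass all three filters; lastdate may be null (filter ignored).
    static List<Hit> applyFilters(List<Hit> hits, long freq, Date lastdate) {
        List<Hit> kept = new ArrayList<>();
        for (Hit hit : hits) {
            if (hit.count <= freq) {
                continue; // keyword count: the frequency must exceed the given value
            }
            if (lastdate != null && !hit.lastModified.after(lastdate)) {
                continue; // if-modified-since: the file must be newer than the given date
            }
            if (!hit.worldReadable) {
                continue; // privacy: skip files without world read permission
            }
            kept.add(hit);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("/www/index.html", 25, new Date(2000L), true));
        hits.add(new Hit("/www/notes.html", 5, new Date(2000L), true));
        for (Hit hit : applyFilters(hits, 20, null)) {
            System.out.println(hit.path + ": " + hit.count);
        }
    }
}
```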

Abstract

This filter is useful when a brief abstract of the file contents is to be retrieved. The WebSearchRequest object has a boolean flag that specifies whether the abstract is needed. If the flag is set, the abstract of each file is fetched. First the glimpse command is run and the result files are stored in a vector. Then, for each file that satisfies the other filters, the abstract is fetched by running a Perl script, which is also run by the File Server thread. The Perl script uses the lib-www module. The file is first converted to HTML and then the required number of words (50 by default) is extracted from it. The abstract is then written into the WebSearchResponse object and the results are sent back.
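
The word-extraction step can be illustrated in Java, although the system itself uses a Perl script for this. The sketch below is an analogue under assumptions, not a port of that script: the class AbstractSketch and the method firstWords are hypothetical, and markup is removed with a simple regular expression rather than an HTML parser.

```java
import java.util.Arrays;

class AbstractSketch {
    // Returns the first maxWords words of the text after crude removal of HTML tags.
    static String firstWords(String html, int maxWords) {
        String text = html.replaceAll("<[^>]*>", " ").trim();
        if (text.isEmpty()) {
            return "";
        }
        String[] words = text.split("\\s+");
        int n = Math.min(maxWords, words.length);
        return String.join(" ", Arrays.copyOfRange(words, 0, n));
    }

    public static void main(String[] args) {
        String page = "<html><body><h1>Ajanta</h1> mobile agent  system</body></html>";
        System.out.println(firstWords(page, 3));
    }
}
```

A regular expression is enough for a sketch, but it would mishandle scripts, comments and entities, which is one reason a dedicated module is a better fit for real pages.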


3.2.3 Presentation Views

The three views were introduced in section 3.1.3. When the agent returns with the response object, it calls three scripts. The first converts the response data into HTML for the browser to load. This HTML file has links to the other views. The data for the other two views, combined and directory structure, is pre-computed by the other two scripts. The directory structure script sorts the file paths so that files under the same directory or subdirectory are clustered together. Because the data is pre-computed, each view links to the others and the user can move between views at any time.
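
The clustering performed for the directory structure view can be sketched as follows. This is an illustration in Java of the idea only; the system uses scripts for this step, and the class DirectoryViewSketch and the method clusterByDirectory are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class DirectoryViewSketch {
    // Groups file paths by their parent directory; directories come out in sorted order.
    static Map<String, List<String>> clusterByDirectory(List<String> paths) {
        Map<String, List<String>> byDirectory = new TreeMap<>();
        for (String path : paths) {
            int slash = path.lastIndexOf('/');
            String directory = slash < 0 ? "" : path.substring(0, slash);
            String name = path.substring(slash + 1);
            byDirectory.computeIfAbsent(directory, d -> new ArrayList<>()).add(name);
        }
        return byDirectory;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList("/www/a/x.html", "/www/b/y.html", "/www/a/z.html");
        System.out.println(clusterByDirectory(paths));
    }
}
```

Because a sorted map is used, a subdirectory such as /www/a/sub sorts next to /www/a, so related directories stay adjacent in the rendered view.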

4 GUI for File Access System and Web Search Agent

Another important aspect of this project was the design and implementation of the GUI (graphical user interface). Section 4.1 defines the functional requirements for the user interface. Section 4.2 describes the UI in detail with some screen snapshots, and section 4.3 discusses the implementation.

4.1 Functional Requirements

The GUI was required to create an itinerary for the agent, to collect the parameters for the different methods to be called, and to present an easy-to-use form on which the user selects options. At the same time, we wanted the entire GUI to be generic: adding a new primitive should not require changing all the classes.

The GUI had to present the parameters for forming an itinerary in a manner that makes them easy to select. The inputs needed for the itinerary were the destination servers, the searches to be performed and the parameters needed by the agent.

The destination servers can be chosen from a list of active remote servers. The agent can perform more than one search, for the same or different keywords, in one trip. The agent visits these servers in sequence, brings the results back and displays them. Parameters such as the keyword for Web Search are typed into the GUI.

4.2 GUI Description and Snapshots

This section gives a full description of the graphical user interface created for the File Access System and the Web Search system. The idea was to make the GUI as generic as possible, so that only slight modifications are needed when a new primitive is added to the File Access System.


4.2.1 File Access System GUI

A GUI was designed for the File Access system. As shown in figure 5, it has a drop-down box that lists the tasks the File Access system server can perform. The GUI also has a text area that shows the tasks selected so far. The user can clear or edit the tasks at any time. Once all the tasks have been selected, the user hits the Go button to launch the agent.

We can see in figure 6 that when the transfer task is chosen, the corresponding GUI is loaded by reflection. This transfer GUI has fields for the file to be transferred and the local name of the file. A drop-down box (see figure 8) lets the user choose among the servers currently running the file server. This box is populated from a server file, which contains the list of users running the File server. Thus, each primitive has a corresponding GUI, which collects the input parameters needed for that primitive. These parameters are then passed back to the main GUI and entered into its text area.

Figure 5: Main GUI for the File Access System

Figure 6: GUI for the File Transfer primitive

4.2.2 Web Search GUI

As shown in figure 9, the Web Search GUI provides a field for each of the options. There is a text box in which the user enters the keyword to search for. The user can make the search case sensitive or insensitive, enter the frequency filter, specify whether an abstract is needed, and so on. Finally, the user chooses the remote server at which to make the search, and hits Ok to add the task to the task list in the main GUI window. To make another similar request, the user can edit the fields in the current request and add it to the task list. Otherwise, the user hits Close and control returns to the main window.

As we can see in figure 9, the Web Search GUI exposes the parameters that can be specified with this primitive. In this snapshot, the keyword to be searched is an “and” combination of “ajanta” and “agent”. The frequency count is set to 20, so only files in which the keywords occur more than 20 times will be returned. The checkbox shows that a case-insensitive search is being made. An abstract is requested, so the search will return the first 50 words of each file that satisfies the search. The if-modified-since field asks the user for the month, day, time and year to check against the files. These inputs are converted into a Date object and sent with the WebSearch request. Only files modified after this date will be returned. If the date field is left empty, this option is ignored and all files that satisfy the other options are returned. Finally, the server to which the web search agent is sent is selected from the drop-down list. On hitting the “Add” button, the web search task and its parameters are added to the main GUI’s text area.
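
The conversion of the if-modified-since inputs into a Date object might look like the following minimal sketch. The class DateFieldSketch, the method toDate and the choice of separate text fields for each component are assumptions made for illustration. An empty date yields null, standing for "ignore this filter".

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;

class DateFieldSketch {
    // Builds a Date from the text fields; returns null when the date is left empty.
    static Date toDate(String month, String day, String year, String hour, String minute) {
        if (month.trim().isEmpty() || day.trim().isEmpty() || year.trim().isEmpty()) {
            return null;
        }
        int h = hour.trim().isEmpty() ? 0 : Integer.parseInt(hour.trim());
        int min = minute.trim().isEmpty() ? 0 : Integer.parseInt(minute.trim());
        Calendar calendar = new GregorianCalendar(
                Integer.parseInt(year.trim()),
                Integer.parseInt(month.trim()) - 1, // Calendar months are zero-based
                Integer.parseInt(day.trim()),
                h,
                min);
        return calendar.getTime();
    }

    public static void main(String[] args) {
        System.out.println(toDate("5", "31", "1999", "10", "30"));
        System.out.println(toDate("", "", "", "", ""));
    }
}
```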

Figure 7: Web Search choice being made in the File Access System GUI

Figure 8: Server Choices in the Primitive GUI

4.3 Implementation Overview

The challenge in implementing the GUI was to meet the requirement that it be generic. We discuss the issues involved below.

The File Access System GUI needed a drop-down box showing all the provided primitives, such as fetch, deposit and search. The menu entries for the drop-down box are populated from a file that lists the primitives. When a choice is made in the drop-down box, the corresponding GUI class for the primitive is loaded. This GUI class opens another window, modal in nature. Any arguments specific to the primitive are entered in this window, and when the “Add” button is clicked, the entire task with its arguments is added to the task list in the main GUI window.

There are two important problems to consider: how the corresponding class is loaded when the primitive is chosen, and how the arguments are passed back to the main GUI window. The latter is easily solved by passing the main GUI window object to the primitive window. When the arguments are entered, the primitive window sets the values in the main GUI class through this handle. The former problem, loading the corresponding class at runtime, is solved by Java reflection. Java reflection [7] provides a secure way to instantiate objects given only the class name. In Java, each class loaded into the virtual machine is available reflectively to the programmer as an instance of the class Class. Given a method name in that class and the types of the parameters it takes, it is also possible to obtain a Method object corresponding to that method. Reflection is very useful here, because the name of the GUI class for the chosen primitive is known only at runtime. The corresponding constructor can be looked up from the class name and called.

Figure 9: Web Search GUI

The reflection code looks like this:

    // str contains the name of the class
    Class GuiDefinition = Class.forName(str);
    // ptypes holds the parameter types of the constructor; this returns the constructor that takes those types
    Constructor GuiConstructor = GuiDefinition.getConstructor(ptypes);
    // args holds the argument values that match ptypes; the object is created from the constructor
    Object object = GuiConstructor.newInstance(args);

This approach makes our GUI generic: to add a new primitive, all that has to be done is to implement its corresponding GUI class, place it in the class path, and create a new entry in the file from which the drop-down box reads. When the new primitive is selected at runtime, its class constructor is called and the object is instantiated.
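
A self-contained sketch of this pattern is given below. It is an illustration only: MainWindow, WebSearchGui and ReflectionSketch are hypothetical stand-ins for the actual GUI classes, and the primitive GUI here records a task instead of opening a modal window. It shows both parts of the design, namely loading the class by name and passing the main window as a handle through the constructor.

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.List;

// Hypothetical main window that collects the tasks added by primitive GUIs.
class MainWindow {
    final List<String> tasks = new ArrayList<>();

    void addTask(String task) {
        tasks.add(task);
    }
}

// Hypothetical primitive GUI; a real one would open a modal window for the arguments.
class WebSearchGui {
    private final MainWindow owner;

    public WebSearchGui(MainWindow owner) {
        this.owner = owner;
    }

    // Passes the entered arguments back to the main window through the handle.
    public void add(String keyword) {
        owner.addTask("websearch " + keyword);
    }
}

class ReflectionSketch {
    // Instantiates the GUI class whose name is known only at runtime.
    static Object loadGui(String className, MainWindow owner) {
        try {
            Class<?> guiDefinition = Class.forName(className);
            // Look up the constructor by its parameter types...
            Constructor<?> guiConstructor = guiDefinition.getConstructor(MainWindow.class);
            // ...and call it with the actual argument values.
            return guiConstructor.newInstance(owner);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load GUI class " + className, e);
        }
    }

    public static void main(String[] args) {
        MainWindow window = new MainWindow();
        Object gui = loadGui(WebSearchGui.class.getName(), window);
        ((WebSearchGui) gui).add("agent");
        System.out.println(window.tasks);
    }
}
```

In a setting like the one described, the class name passed to loadGui would come from the primitives file rather than from a class literal.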

5. Conclusions and Future Work

We successfully designed and implemented an agent-based Web Search system on Ajanta. The system uses agents to search for given keywords in remote users’ web directories and bring back the results. It can also bring back abstracts of files, and it shows the results in a browser for the user to view. Useful options and filters such as if-modified-since and keyword count have been added to make the search system more effective. We have also implemented a file status primitive that fetches the status of remote files. In addition, we have developed a graphical user interface (GUI) for the entire File Access System, including the Web Search system, that is generic and therefore extendable.

There are several areas where this system can be enhanced.

First and foremost, as this is a Web application, it should be accessible from the Web itself. We are planning to implement an applet that launches agents in the background and performs the search.

Currently, the File server runs for individual users and we search the index of each user. Extending this idea, we are trying to index all the web directories into one global index. The search can then be made on this global index, and user-specific views can be shown.

Also, to fetch an abstract, each result file is opened in turn and the first 50 words are extracted. This slows the agent considerably. We are planning to build the abstract files offline for each user, with a script run at regular intervals. Whenever an agent needs the abstract of a particular file, it can pick up the corresponding abstract file, saving precious time.


References


[1] Colin G. Harrison, David M. Chess, and Aaron Kershenbaum. Mobile Agents: Are they a good idea? Technical report, IBM Research Division, T.J.Watson Research Center, March 1995. Available at URL http://www.research.ibm.com/massdist/mobag.ps.

[2] Eric Jul, Henry Levy, Norman Hutchinson and Andrew Black. Fine-grained Mobility in the Emerald System. ACM Transactions on Computer Systems, 6(1):109-133, February 1988.

[3] Karen Sollins and Larry Masinter. RFC 1737: Functional Requirements for Uniform Resource Names. Available at URL http://www.cis.ohiostate.edu/htbin/rfc/rfc1737.html, December 1994.

[4] Anand Tripathi, Neeran Karnik, Manish Vora, Tanvir Ahmed, and Ram Singh. Mobile Agent Programming in Ajanta. In Proceedings of the 19th International Conference on Distributed Computing Systems, May 1999.

[5] Neeran Karnik. PhD Thesis. Available at URL http://www.cs.umn.edu/Ajanta/defense/index.html, October 1998.

[6] Anand Tripathi, Neeran Karnik, Manish Vora and Tanvir Ahmed. Ajanta - A System for Mobile Agent Programming. Technical report TR 98-016, Dept. of Computer Science, University of Minnesota, April 1998.

[7] JavaSoft. Java Reflection Documentation web page. Available at URL http://java.sun.com/docs/books/tutorial/reflect/index.html



Appendix

Results and Presentation View snapshots

1. Segregated View


2. Combined View

3. Directory Structure View

4. Abstract View
