1. INTRODUCTION TO PROJECT
Keyword search over large amounts of data is an important operation in a wide range of
domains. Felipe et al. recently extended its study to spatial databases, where keyword
search is becoming a fundamental building block for an increasing number of real-world
applications, and proposed the IR2-Tree. A main limitation of the IR2-Tree is that it
supports only exact keyword search. In practice, keyword search that retrieves approximate
string matches is required. Since exact match is a special case of approximate string match,
keyword search by approximate string matches clearly has a much larger pool of
applications. Approximate string search is necessary when users have a fuzzy search
condition, make a spelling error when submitting the query, or when the strings in the
database contain some degree of uncertainty or error. In the context of spatial databases,
approximate string search can be combined with any type of spatial query. In this work, we
focus on range queries and dub such queries Spatial Approximate String (SAS) queries. A
typical example in Euclidean space, depicting a common scenario in location-based services,
is: find all objects within a spatial range r (specified by a rectangular area) whose
description is similar to “theatre”. We denote SAS queries in Euclidean space as ESAS
queries. Similarly, Figure 2 extends SAS queries to road networks (referred to as RSAS
queries). Given a query point q and a network distance r on a road network, we want to
retrieve all objects within distance r of q whose description is similar to “theatre”, where
the distance between two points is the length of their shortest path.
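The ESAS query described above can be illustrated with a brute-force sketch. The text does not say how similarity is measured, so the sketch assumes edit (Levenshtein) distance with a threshold tau; the class EsasSketch, its fields and its methods are hypothetical names chosen for illustration and are not the indexed method of the project.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: a brute-force ESAS query over an in-memory list.
// All names here are hypothetical; edit distance is an assumed similarity measure.
class EsasSketch {

    // A spatial object: a point with a textual description.
    static class SpatialObject {
        final double x, y;
        final String description;
        SpatialObject(double x, double y, String description) {
            this.x = x;
            this.y = y;
            this.description = description;
        }
    }

    // Dynamic-programming edit (Levenshtein) distance using two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp; // reuse the rows
        }
        return prev[b.length()];
    }

    // Return the objects inside the rectangle [xMin,xMax] x [yMin,yMax] whose
    // description is within edit distance tau of the query keyword.
    static List<SpatialObject> query(List<SpatialObject> objects,
                                     double xMin, double yMin,
                                     double xMax, double yMax,
                                     String keyword, int tau) {
        List<SpatialObject> result = new ArrayList<>();
        for (SpatialObject o : objects) {
            boolean inRange = o.x >= xMin && o.x <= xMax
                           && o.y >= yMin && o.y <= yMax;
            if (inRange && editDistance(o.description, keyword) <= tau) {
                result.add(o);
            }
        }
        return result;
    }
}
```

Scanning every object is only practical for small data sets; it is shown here to make the query semantics concrete.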
2. LITERATURE SURVEY
2.1 Introduction
2.1.1 Theoretical Background
The main aim of the project is to provide privacy for Spatial Approximate String search, with
string searching that uses the compressed sensing technique. Security is provided for social
networking while data is transmitted from the user to the admin.
2.2 Technical Background:
JSP:
Java Server Pages is a simple yet powerful technology for creating and maintaining
dynamic-content web pages. Based on the Java programming language, Java Server Pages offers
proven portability, open standards, and a mature reusable component model. The Java Server
Pages architecture enables the separation of content generation from content presentation.
This separation not only eases maintenance headaches; it also allows web team members to
focus on their areas of expertise. Web page designers can concentrate on layout, and web
application designers on programming, with minimal concern about impacting each other’s
work.
Introduction
JSP technology enables you to mix regular static HTML with dynamically generated content
from servlets. Separating the static HTML from the dynamic content provides a number of benefits over
servlets alone.
JSP compared to Asp
JSP and ASP are fairly similar in the functionality that they provide. JSP may have a slightly
higher learning curve. Both allow embedded code in an HTML page and session variables. Whereas ASP
is tied to Microsoft platforms such as NT, JSP can operate on any platform that conforms to the J2EE
specification. JSP allows component reuse through JavaBeans and EJBs; ASP provides the use of
COM/ActiveX controls.
JSP compared to servlets
A servlet is a Java class that provides a special server-side service. It is hard to write
HTML code in servlets: you need many println statements to generate the HTML.
Description
JSP pages look like HTML, but they are compiled into Java servlets the first time they are invoked. The
resulting servlet is a combination of the HTML from the JSP file and the embedded dynamic content
specified by the special tags. That is not to say that a JSP must contain HTML. Some contain only
Java code; this is particularly useful when the JSP is responsible for a particular task such as
maintaining application flow.
Everything in a JSP can be broken into two categories:
1. Elements that are processed on the server.
2. Template data, that is, everything other than elements, which the engine processing the JSP ignores.
JSP Architecture
JSP is built on top of Sun’s servlet technology. A JSP is essentially an HTML page with special JSP tags embedded. These JSP tags can
contain Java code. The JSP file extension is .jsp rather than .htm or .html. The JSP engine parses the .jsp file and creates a Java servlet source file.
It then compiles the source file into a class file; this is done the first time the page is requested, which is why a JSP is usually slower the first time it is accessed.
On every later request the compiled servlet is executed directly and therefore returns faster.
Steps Required For a JSP Request
1. The user goes to a web site made using JSP and opens a JSP page. The web browser
makes the request via the internet.
2. The JSP request is sent to the web server.
3. The web server recognizes that the requested file is special (.jsp) and therefore passes
the JSP file to the JSP servlet engine.
4. If the JSP file is being called for the first time, it is parsed; otherwise go to step 7.
5. The next step is to generate a special servlet from the JSP file. All the HTML required is
converted to println statements.
6. The servlet source code is compiled into a class.
7. The servlet is instantiated, calling the init and service methods.
8. HTML from the servlet output is sent via the internet.
9. The HTML results are displayed on the user’s web browser.
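The conversion of static HTML into println statements, mentioned in the steps above, can be pictured with a toy translator. This is only a sketch under simplifying assumptions: it handles static template lines and nothing else, the class TemplateToPrintlnSketch is a hypothetical name, and real JSP engines are far more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: turns static template lines into the text of
// out.println(...) statements, as a generated servlet source might contain.
class TemplateToPrintlnSketch {

    static List<String> translate(List<String> templateLines) {
        List<String> statements = new ArrayList<>();
        for (String line : templateLines) {
            // Escape backslashes first, then double quotes, so that the
            // result is a valid Java string literal.
            String escaped = line.replace("\\", "\\\\").replace("\"", "\\\"");
            statements.add("out.println(\"" + escaped + "\");");
        }
        return statements;
    }
}
```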
Servlets are server-side Java programs that can be deployed on a web server. The Servlet interface
provides the basic framework for coding servlets.
Java Server Pages, SQL, HTML Forms and Databases
This section examines how to communicate with a database from Java. We have already seen
how to interface an HTML form and a JSP. Now we have to see how that JSP can talk to a database.
The objectives of this section are to understand how to:
1. Administratively register databases.
2. Connect a JSP to an Access database.
3. Insert records in a database using JSP.
4. Insert data from an HTML form in a database using JSP.
5. Delete records from a database based on criteria from an HTML form.
6. Retrieve data from a database using JSP (result sets).
7. Apply SQL operations like sort, create table, remove table, delete, and Access-based arithmetic
functions.
JSP Declarations
Declarations define page-level variables and methods. They are placed within the <%! and %> tags,
and each variable declaration ends with a semicolon.
Example:
<%!
int i = 0;
int j = 0;
int z = 0;
%>
JSP Scriptlets
Scriptlets consist of valid Java code snippets enclosed within the <% and %> JSP tags.
Example:
To accept the user name and display the name 10 times:
<%@ page import="java.util.*" %>
<HTML><BODY>
<%
// Read the user name from the request; the parameter name "name" is an assumption.
String name = request.getParameter("name");
// Print the name ten times, one per line.
for (int i = 0; i < 10; i++) {
out.println(name + "<BR>");
}
%>
</BODY></HTML>
JSP Expressions
Used to insert values directly into the output.
Example: <%= msg %>
JSP Implicit Objects
Implicit objects are predefined variables that can be used in JSP expressions and scriptlets.
Objects can be created:
1. Implicitly, by using directives.
2. Explicitly, by using standard actions.
3. Directly, by declaring objects within scriptlets.
Implicit objects include variables such as:
page: represents the current instance of the JSP page.
request: represents the HttpServletRequest object used to retrieve the request data.
response: represents the HttpServletResponse object used to write the HTML response
output.
JSP Actions
<jsp:getProperty>: Retrieves the property of the specified bean and directs it to the
output. Attributes used are name and property.
<jsp:setProperty>: Sets the property of the specified bean.
Attributes used are name, property, value and param.
<jsp:forward>: Forwards a request to a different page.
Attribute used is page.
<jsp:param>: Used as a sub-element of jsp:include and jsp:forward to pass additional request
parameters.
Attributes used are name and value.
<jsp:include>: Inserts a file into a particular JSP page.
Attributes used are page and flush.
Java Script
JavaScript is a general-purpose, prototype-based, object-oriented scripting language
developed by Netscape, announced jointly with Sun, and meant for the WWW. It is designed to be
embedded in diverse applications and systems without consuming much memory. JavaScript borrows
most of its syntax from Java but also inherits from Awk and Perl, with some indirect influence from
Self in its object prototype system.
JavaScript is dynamically typed: programs do not declare variable types, and the type of a
variable is unrestricted and can change at runtime. Source can be generated at runtime and evaluated
against an arbitrary scope. Typical implementations compile by translating source into a specified
byte code format, to check syntax and source consistency. Note that the ability to generate and
interpret programs at runtime implies the presence of a compiler at runtime.
JavaScript is a high-level scripting language that does not depend on or expose particular
machine representations or operating system services. It provides automatic storage management,
typically using a garbage collector.
Features
JavaScript is embedded into HTML documents and is executed within them.
JavaScript is browser dependent.
JavaScript is an interpreted language that can be interpreted by the browser at run
time.
JavaScript is a loosely typed language.
JavaScript is an object-based language.
JavaScript is an event-driven language and supports event handlers, for example to specify the
functionality of a button.
Advantages
JavaScript can be used for client-side applications.
JavaScript provides a means to work with multi-frame windows for presentation of the
web.
JavaScript provides basic data validation before data is sent to the server, e.g. login and
password checking, checking whether the values entered are correct, or checking whether all
fields in a form are filled. This reduces network traffic.
It creates interactive forms and client-side lookup tables.
Servlets
Servlets provide a Java(TM)-based solution used to address the problems currently
associated with doing server-side programming, including inextensible scripting solutions,
platform-specific APIs, and incomplete interfaces.
Servlets are objects that conform to a specific interface that can be plugged into a
Java-based server. Servlets are to the server-side what applets are to the client-side -- object
byte codes that can be dynamically loaded off the net. They differ from applets in that they
are faceless objects (without graphics or a GUI component). They serve as platform-
independent, dynamically-loadable, pluggable helper byte code objects on the server side that
can be used to dynamically extend server-side functionality.
Use Servlets instead of CGI Scripts
Servlets are an effective replacement for CGI scripts. They provide a way to generate
dynamic documents that is both easier to write and faster to run. Servlets also address the
problem of doing server-side programming with platform-specific APIs: they are developed
with the Java Servlet API, a standard Java extension. So use servlets to handle HTTP client
requests. For example, have servlets process data posted over HTTPS using an HTML form,
including purchase order or credit card data. A servlet like this could be part of an order-entry
and processing system, working with product and inventory databases, and perhaps an on-line
payment system.
Architecture of the Servlet Package
The javax.servlet package provides interfaces and classes for writing servlets. The architecture
of the package is described below.
The Servlet Interface
The central abstraction in the Servlet API is the Servlet interface. All servlets
implement this interface, either directly or, more commonly, by extending a class that
implements it, such as HttpServlet.
The Servlet interface declares, but does not implement, methods that manage the
servlet and its communications with clients. Servlet writers provide some or all of these
methods when developing a servlet.
Client Interaction
When a servlet accepts a call from a client, it receives two objects:
A ServletRequest, which encapsulates the communication from the client to the server.
A ServletResponse, which encapsulates the communication from the servlet back to the client.
ServletRequest and ServletResponse are interfaces defined by the javax.servlet package.
The ServletRequest Interface
The ServletRequest interface allows the servlet access to:
Information such as the names of the parameters passed in by the client, the protocol
(scheme) being used by the client, and the names of the remote host that made the request and
the server that received it.
The input stream, ServletInputStream. Servlets use the input stream to get data from clients that
use application protocols such as the HTTP POST and PUT methods.
Interfaces that extend ServletRequest interface allow the servlet to retrieve more protocol-
specific data. For example, the HttpServletRequest interface contains methods for accessing
HTTP-specific header information.
The ServletResponse Interface
The ServletResponse interface gives the servlet methods for replying to the client. It:
Allows the servlet to set the content length and MIME type of the reply.
Provides an output stream, ServletOutputStream, and a Writer through which the servlet can
send the reply data.
Interfaces that extend the ServletResponse interface give the servlet more protocol-specific
capabilities. For example, the HttpServletResponse interface contains methods that allow the
servlet to manipulate HTTP-specific header information.
Additional Capabilities of HTTP Servlets
The classes and interfaces described above make up a basic Servlet. HTTP servlets have
some additional objects that provide session-tracking capabilities. The servlet writer can use
these APIs to maintain state between the servlet and the client that persists across multiple
connections during some time period. HTTP servlets also have objects that provide cookies.
The servlet writer uses the cookie API to save data with the client and to retrieve this data.
The classes mentioned in the Architecture of the Servlet Package section are shown in the
example in bold:
SimpleServlet extends the HttpServlet class, which implements the Servlet interface.
SimpleServlet overrides the doGet method in the HttpServlet class. The doGet method is called
when a client makes a GET request (the default HTTP request method), and results in the
simple HTML page being returned to the client.
Within the doGet method,
o The user's request is represented by an HttpServletRequest object.
o The response to the user is represented by an HttpServletResponse object.
o Because text data is returned to the client, the reply is sent using the Writer object
obtained from the HttpServletResponse object.
Servlet Lifecycle
Each servlet has the same life cycle:
A server loads and initializes the servlet
The servlet handles zero or more client requests
The server removes the servlet
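The three lifecycle phases listed above can be simulated in plain Java. The sketch below deliberately does not use the real javax.servlet API: MiniServlet, LoggingServlet and runLifecycle are hypothetical names, and the "server" is just a method that calls init once, service zero or more times, and destroy once.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation of the servlet lifecycle; not the real Servlet API.
class LifecycleSketch {

    // Hypothetical stand-in for the Servlet interface.
    interface MiniServlet {
        void init();
        String service(String request);
        void destroy();
    }

    // A servlet that records each lifecycle event in a shared log.
    static class LoggingServlet implements MiniServlet {
        private final List<String> log;
        LoggingServlet(List<String> log) { this.log = log; }
        public void init() { log.add("init"); }
        public String service(String request) {
            log.add("service:" + request);
            return "Hello, " + request;
        }
        public void destroy() { log.add("destroy"); }
    }

    // Run one full lifecycle and return the order of events.
    static List<String> runLifecycle(List<String> requests) {
        List<String> log = new ArrayList<>();
        MiniServlet servlet = new LoggingServlet(log); // server loads the servlet
        servlet.init();                                // once, before any request
        for (String request : requests) {
            servlet.service(request);                  // zero or more requests
        }
        servlet.destroy();                             // once, on removal
        return log;
    }
}
```

For two requests "a" and "b", the recorded order is init, service:a, service:b, destroy.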
Initializing a Servlet
When a server loads a servlet, the server runs the servlet's init method. Initialization completes before
client requests are handled and before the servlet is destroyed.
Even though most servlets are run in multi-threaded servers, servlets have no concurrency issues
during servlet initialization.
The server calls the init method once, when the server loads the servlet, and will not call the init
method again unless the server is reloading the servlet. The server cannot reload a servlet until after
the server has destroyed the servlet by calling the destroy method.
The init Method
The init method provided by the HttpServlet class initializes the servlet and logs the
initialization. To do initialization specific to your servlet, override the init() method, following
these rules:
If an initialization error occurs that renders the servlet incapable of handling client requests,
throw an UnavailableException.
An example of this type of error is the inability to establish a required network
connection.
Do not call the System.exit method.
Initialization Parameters
The second version of the init method calls the getInitParameter method. This method
takes the parameter name as an argument and returns a String representation of the
parameter's value.
The specification of initialization parameters is server-specific. In the Java Web Server,
the parameters are specified when a servlet is added and then configured in the
Administration Tool. For an explanation of the Administration screen where this setup
is performed, see the Administration Tool: Adding Servlets online help document.
If, for some reason, you need to get the parameter names, use the getInitParameterNames
method.
Destroying a Servlet
Servlets run until the server destroys them, for example at the
request of a system administrator. When a server destroys a servlet, the server runs the
servlet's destroy method. The method is run once; the server will not run that servlet
again until after the server reloads and reinitializes the servlet.
When the destroy method runs, another thread might be running a service request. The
Handling Service Threads at Servlet Termination section shows you how to provide a
clean shutdown when there could be long-running threads still running service
requests.
Using the Destroy Method
The destroy method provided by the HttpServlet class destroys the servlet and logs the
destruction. To destroy any resources specific to your servlet, override the destroy
method. The destroy method should undo any initialization work and synchronize
persistent state with the current in-memory state.
The following example shows the destroy method that accompanies the init method
shown previously:
public class BookDBServlet extends GenericServlet {
private BookstoreDB books;
... // the init method
public void destroy() {
// Allow the database to be garbage collected
books = null;
}
}
A server calls the destroy method after all service calls have been completed, or a server-
specific number of seconds have passed, whichever comes first. If your servlet handles any
long-running operations, service methods might still be running when the server calls the
destroy method. You are responsible for making sure those threads complete. The destroy
method shown above expects all client interactions to be completed when the destroy method
is called, because the servlet has no long-running operations.
Servlet-client Interaction
Handling HTTP Clients
An HTTP Servlet handles client requests through its service method. The service
method supports standard HTTP client requests by dispatching each request to a method
designed to handle that request. For example, the service method calls the doGet method shown
earlier in the simple example servlet.
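The dispatching idea can be sketched in plain Java as follows. DispatchSketch and its methods are hypothetical and only mirror the idea behind the service method; they are not the HttpServlet implementation, and the fallback text for unsupported methods is an assumption made for the example.

```java
// Plain-Java sketch of dispatching a request to a handler chosen by its
// HTTP method. Hypothetical names; this is not the real HttpServlet class.
class DispatchSketch {

    // Route the request to the handler that matches its HTTP method.
    static String service(String httpMethod, String path) {
        switch (httpMethod) {
            case "GET":
                return doGet(path);
            case "POST":
                return doPost(path);
            default:
                // Assumed behaviour for this sketch: report the method as unsupported.
                return "unsupported method: " + httpMethod;
        }
    }

    static String doGet(String path) {
        return "GET handled for " + path;
    }

    static String doPost(String path) {
        return "POST handled for " + path;
    }
}
```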
Requests and Responses
Methods in the HttpServlet class that handle client requests take two arguments:
1. An HttpServletRequest object, which encapsulates the data from the client
2. An HttpServletResponse object, which encapsulates the response to the client
HttpServletRequest Objects
An HttpServletRequest object provides access to HTTP header data, such as any cookies
found in the request and the HTTP method with which the request was made. The
HttpServletRequest object also allows you to obtain the arguments that the client sent as part of
the request.
HttpServletResponse Objects
An HttpServletResponse object provides two ways of returning data to the user:
The getWriter method returns a Writer.
The getOutputStream method returns a ServletOutputStream.
Use the getWriter method to return text data to the user, and the getOutputStream method for binary
data.
HTML
HTML (HyperText Markup Language) is a language used to create hypertext documents that have
hyperlinks embedded in them. It consists of tags embedded in the text of a document. With HTML
we can build web pages or web documents. It is basically a formatting language and not a
programming language. The browser reading the document interprets the markup tags to help format
the document for subsequent display to a reader. HTML is a language for describing structured
documents, and it is platform independent. WWW (World Wide Web) pages are written using
HTML. HTML tags control, in part, the presentation of the WWW page when it is viewed with a web
browser. The browser interprets the HTML tags in the web document and displays it. Different browsers
show data differently.
Example code:
<HTML>
<HEAD>
<TITLE>this is an html title</TITLE>
</HEAD>
<BODY>
………
</BODY>
</HTML>
Advantages
An HTML document is small and hence easy to send over the net. It is small because it does not
include format information.
HTML documents are cross-platform compatible and device independent. We only need an HTML-capable
browser to view them. Font names, locations, etc. are not required.
Apache Tomcat
Introduction to Tomcat
Tomcat is the official Reference Implementation for the Java Servlet 2.2 and Java Server
Pages 1.1 technologies, which complement each other. Tomcat is a servlet container with a
JSP environment. A servlet container is a runtime shell that manages and invokes servlets on
behalf of users. Developed under the Apache license in an open and participatory
environment, Tomcat is intended to be a collaboration of the best-of-breed developers from
around the world.
Tomcat and Servlets
As mentioned above, Tomcat is the reference implementation for the Java Servlet 2.2
technology. It therefore conforms to the Servlet API Specification, Version 2.2, which
describes the programming environment that all servlet containers must provide.
The specification may be used to understand the web application directory structure and
deployment file, methods of mapping request URLs to servlets, container-managed security,
and the syntax of web.xml, the Web Application Deployment Descriptor.
Installation
Tomcat will operate under any Java Development Kit (JDK) environment that provides a
JDK 1.1 or JDK 1.2 compatible platform. The JDK is required so that your servlets, other
classes, and JSP pages can be compiled.
Once you have downloaded the required file, unzip it to a directory of your choice. (In the
Microsoft Lab 5 at UWI the file is extracted directly to the C drive, C:\.) A sub-directory
named jakarta-tomcat is created, and this is the root directory of the Tomcat hierarchy.
Tomcat 6.x
Implements the Servlet 2.5 and JSP 2.1 specifications.
Reduced garbage collection, improved performance and scalability.
Native Windows and UNIX wrappers for platform integration.
Faster JSP parsing.
Database Tables
This is the age of information technology, and data and databases play a key role in it. A
layperson these days needs no introduction to databases; whether it is a personal telephone
directory or a bank passbook, databases are omnipresent. In this section we learn about database
management systems in general, with an emphasis on the relational model of the DBMS.
The conventional data processing approach is to develop a program (or many programs) for each
application. This results in one or more data files for each application. Some of the data may be
common between files. However, one application may require the file to be organized on a particular
field, while another application may require the file to be organized on another field. A major
drawback of the conventional method is that the storage access methods are built into the program.
Therefore, though the same data may be required by two applications, the data will have to be stored
in two different places, because each application depends on the way the data is stored.
The conventional data file processing environment has several drawbacks. Some of them are
listed below:
Data Redundancy
Some data elements like name, address, identification code, are used in various applications. Since
data is required by multiple applications, it is stored in multiple data files. In most cases, there is a
repetition of data. This is referred to as data redundancy, and leads to various other problems.
Data Integrity Problems
Data redundancy is one reason for the problem of data integrity. Since the same data is stored in
different places, it is inevitable that some inconsistency will creep in.
Data Availability Constraints
When data is scattered in different files, the availability of information from a combination of
files is constrained to some extent.
Database Management System
A database management system (DBMS) consists of a collection of interrelated data and a set of
programs to access the data. The collection of data is usually referred to as the database. A Database
system is designed to maintain large volumes of data. Management of data involves:
Defining the structures for the storage of data
Providing the mechanisms for the manipulation of the data
Providing for the security of the data against unauthorized access
Users of the DBMS
Broadly, there are three types of DBMS users:
The application programmer
The end user
The database administrator (DBA)
The application programmer writes application programs that use the database. These programs
operate on the data in the database. These operations include retrieving information, inserting data,
deleting or changing data.
The end user interacts with the system either by invoking an application program or by writing their
queries in a database query language. The database query language allows the end user to perform all
the basic operations (retrieval, deletion, insertion and updating) on the data.
The DBA has to coordinate the functions of collecting information about the data to be stored,
designing and maintaining the database and its security. The database must be designed and
maintained to provide the right information at the right time to authorized people. These
responsibilities belong to the DBA and his staff.
Advantages Of a DBMS
The major advantage that the database approach has over the conventional approach is that a database
system provides centralized control of data. Most benefits accrue from this notion of centralized
control.
Redundancy Can Be Controlled
Unlike the conventional approach, each application does not have to maintain its own data files.
Centralized control of data by the DBA avoids unnecessary duplication of data and effectively
reduces the total amount of data storage required. It also eliminates the extra processing necessary to
trace the required data in a large mass of data present. Any redundancies that exist in the DBMS are
controlled and the system ensures that these multiple copies are consistent.
Inconsistency Can Be Avoided
Since redundancy is reduced, inconsistency can also be avoided to some extent. The DBMS guarantees that
the database is never inconsistent by ensuring that a change made to any entry is automatically applied to the
other entries as well. This process is known as propagating updates.
The data can be shared
A database allows the sharing of data under its control by any number of application programs or
users. Sharing of data does not merely imply that existing applications can share the data in the
database; it also means that new applications can be developed to operate on the same database.
Standards Can Be Enforced
Since there is centralized control of data, the database administrator can ensure that standards are maintained in
the representation of the stored data formats. This is particularly useful for data interchange, or migration of data
between two systems.
Security Restrictions Can Be Applied
The DBMS guarantees that only authorized persons can access the database. The DBA defines the
security checks to be carried out. Different checks can be applied to different operations on the same
data. For instance, a person may have the access rights to query on a file, but may not have the right to
delete or update that file. The DBMS allows such security checks to be established for each piece of
data in the database.
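A per-operation security check of this kind might look like the sketch below. It is a hypothetical in-memory illustration, not a real DBMS API: the class AccessSketch, the Operation enum and the user names are all assumptions made for the example.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of per-operation access checks; not a real DBMS API.
class AccessSketch {

    enum Operation { QUERY, INSERT, UPDATE, DELETE }

    // User name -> operations that user may perform.
    private final Map<String, Set<Operation>> grants = new HashMap<>();

    // Set (replacing any earlier grant) the operations a user may perform.
    void grant(String user, Operation first, Operation... rest) {
        grants.put(user, EnumSet.of(first, rest));
    }

    // An operation is allowed only if it was explicitly granted.
    boolean isAllowed(String user, Operation op) {
        Set<Operation> ops = grants.get(user);
        return ops != null && ops.contains(op);
    }
}
```

A user granted only QUERY can read the data but is refused DELETE and UPDATE, which matches the example in the text.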
Integrity Can Be Maintained
Centralized control can also ensure that adequate checks are incorporated in the DBMS to
provide data integrity. Data integrity means that the data contained in the database is both
accurate and consistent. Inconsistency between two entries can lead to integrity problems.
However, even if there is no redundancy, the data can still be inconsistent. For example a
student may have enrolled in 10 courses in a semester when the maximum number of courses
one can enroll in is 7. Another example could be that of a student enrolling in a course that is
not being offered that semester. Such problems can be avoided in a DBMS by establishing
certain integrity checks to be carried out whenever any update operation is done. These
checks can be specified at the database level, besides the application programs.
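The two enrollment examples above can be expressed as checks that run before an update is accepted. The following is a minimal sketch: EnrollmentSketch is a hypothetical class, and in a real DBMS such rules would normally be declared as constraints at the database level rather than coded by hand.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the two integrity rules from the text.
class EnrollmentSketch {

    static final int MAX_COURSES = 7; // limit taken from the example in the text

    private final Set<String> offered;                    // courses offered this semester
    private final Set<String> enrolled = new HashSet<>(); // one student's courses

    EnrollmentSketch(Set<String> offeredThisSemester) {
        this.offered = offeredThisSemester;
    }

    // Apply the update only if both integrity checks pass.
    boolean enroll(String course) {
        if (!offered.contains(course)) return false;      // course is not offered
        if (enrolled.size() >= MAX_COURSES) return false; // course limit reached
        return enrolled.add(course);                      // false if already enrolled
    }
}
```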
Data Independence
In non-database systems, the requirements of the application dictate the way in which the data is stored and the
access techniques. Moreover, knowledge of the organization of the data and of the access techniques is built into
the logic and code of the application. Such systems are data dependent. Consider this example: suppose a
university has an application that processes the student file. For performance reasons, the file is indexed on the
roll number. The application is aware of the existing index, and its internal structure is built around this
knowledge. Now suppose that, for some reason, the file is to be indexed on the registration
data. In this case it is impossible to change the structure of the stored data without affecting the application too.
Such an application is a data-dependent one.
Features Of RDBMS
The ability to create multiple relations and enter data into them
An interactive query language
Retrieval of information stored in more than one table
Database Design
Having identified all the data in the system, it is necessary to arrive at the logical database design.
Database design involves designing the conceptual model of the database. This model is independent
of the physical representation of data. Before actually implementing the database, the conceptual
model is designed using various techniques.
The requirements of all the users are taken into account to decide the actual data that needs
to be stored in the system. Once the conceptual model is designed, it can then be mapped to the
DBMS/RDBMS that is actually being used. Two of the widely used approaches are Entity-
relationship (E/R) Modeling and Normalization.
The E/R model is an object based model and is based on a perception of the real world that
is made up of a collection of objects or entities and the relationships among these. E/R modeling is
generally used as a top down approach for new systems.
Entity
An entity is an object, place, or event about which data can be stored in the system. A physical
object can be, for example, an employee, a customer, or machinery. An abstract object can be, for
example, a department or accounting. An event can be, for example, a registration or an application
form. A place can be, for example, a city or a state. Before a table is created, it is known as an
entity. In a diagram it is denoted by a rectangle.
Attribute
An attribute describes the entity. For example, the entity employee can have the attributes empno,
ename, sal, hiredate, etc. It is represented by a circle.
Relation
A “relation” is a two-dimensional table. It consists of “rows”, which represent records, and “columns”,
which show the attributes of the entity. A relation is also called a file; it consists of a number of
records, which are also called tuples. A record consists of a number of attributes, which are also
known as fields or domains.
In order for a relational structure to be useful and manageable, the relation tables must
first be “normalized”.
Some of the properties of a relation are:
No duplication - No two records are identical.
Unique key - Each relation has a unique key by which its records can be accessed.
Order - There is no significant order of data in the table.
For example, to find the names of all the employees whose grade is 20, we scan the employee
relation and note the grade of each record. Here the unique key is the employee number.
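The properties above can be illustrated with a minimal Java sketch. It is only an in-memory analogy, not how Oracle stores a relation: the class EmployeeRelation, its fields and the sample names are assumptions made for illustration. Records are keyed by the employee number, a second record with the same key is rejected, and the grade query is answered by scanning every record.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative in-memory stand-in for the employee relation (hypothetical names).
class EmployeeRelation {
    // One record (tuple) with three attributes.
    static final class Employee {
        final int empNo;
        final String name;
        final int grade;

        Employee(int empNo, String name, int grade) {
            this.empNo = empNo;
            this.name = name;
            this.grade = grade;
        }
    }

    // Records are keyed by the unique key empNo, so two records can never share a key.
    private final Map<Integer, Employee> rows = new LinkedHashMap<>();

    // Returns false (and stores nothing) when a record with the same key already exists.
    boolean insert(Employee e) {
        return rows.putIfAbsent(e.empNo, e) == null;
    }

    // Scans the whole relation, noting the grade of each record.
    List<String> namesWithGrade(int grade) {
        List<String> names = new ArrayList<>();
        for (Employee e : rows.values()) {
            if (e.grade == grade) {
                names.add(e.name);
            }
        }
        return names;
    }
}
```

The map is used only to make the unique key explicit; a real RDBMS would enforce it with a primary key constraint.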
Normalization
Normalization is a process of simplifying the relationship between data elements in a record. It is the
transformation of complex data stores into a set of smaller, stable data structures. Normalized
data structures are simpler, more stable and easier to maintain.
Purpose for Normalization
To permit simple retrieval of data in response to query and report requests.
To simplify the maintenance of the data through updates, insertions and deletions.
To reduce the need to restructure or reorganize data when new application requirements
arise.
Steps of Normalization
It consists of three basic steps:
First Normal Form, which decomposes all data groups into two-dimensional records.
Second Normal Form, which eliminates any relationships in which data elements do not
fully depend on the primary key of the record.
Third Normal Form, which eliminates any relationships that contain transitive
dependencies.
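As a rough illustration of the first two steps, the following Java sketch normalizes one un-normalized employee record that carries a repeating group of skills. The record layout (employee number, name, skills) and all method names are assumptions chosen for the example; they do not come from the project's schema.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example: normalizing (empNo, name, {skill, skill, ...}).
class NormalizationSketch {
    // Step 1 (1NF): remove the repeating group, giving one flat row per (empNo, skill).
    static List<String[]> toFirstNormalForm(String empNo, String name, List<String> skills) {
        List<String[]> rows = new ArrayList<>();
        for (String skill : skills) {
            rows.add(new String[] {empNo, name, skill});
        }
        return rows;
    }

    // Step 2 (2NF): the key of the 1NF rows is (empNo, skill), but name depends on
    // empNo alone, so it moves to its own relation keyed by empNo.
    static Map<String, String> employeeRelation(List<String[]> firstNormalFormRows) {
        Map<String, String> employees = new LinkedHashMap<>();
        for (String[] row : firstNormalFormRows) {
            employees.put(row[0], row[1]);
        }
        return employees;
    }

    // The remaining relation keeps only the key columns (empNo, skill).
    static List<String[]> employeeSkillRelation(List<String[]> firstNormalFormRows) {
        List<String[]> pairs = new ArrayList<>();
        for (String[] row : firstNormalFormRows) {
            pairs.add(new String[] {row[0], row[2]});
        }
        return pairs;
    }
}
```

The third step would be applied in the same way to any attribute that depends on another non-key attribute; this small example has none.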
Fig 3.2 Steps involved in the process of normalization
(Figure: User Views / Data Stores -> Un-normalized Relations -> First Normal Form -> Second Normal
Form -> Third Normal Form.
Step 1: Remove repeating groups, fix the record length and identify the primary key.
Step 2: Remove data items which are not dependent on the primary key.
Step 3: Remove transitive dependencies.)
ORACLE
Introduction
Oracle is a relational database management system, which organizes data in the form of tables.
Oracle is one of many database servers based on the RDBMS model, which manages a set of data and
attends to three specific things: data structures, data integrity and data manipulation. With Oracle
cooperative server technology we can realize the benefits of open, relational systems for all the
applications. Oracle makes efficient use of all system resources, on all hardware architectures, to
deliver unmatched performance, price performance and scalability. Any DBMS to be called an
RDBMS has to satisfy Dr. E. F. Codd's rules.
Oracle is a comprehensive operating environment that packs the power of a mainframe relational
database management system into the user's microcomputer. It provides a set of functional programs
that the user can use as tools to build structures and perform tasks. Because applications developed on
Oracle are completely portable to other versions of the program, the programmer can create a complex
application in a single-user environment and then move it to a multi-user platform. Users do not have
to be experts to appreciate Oracle, but the better a user understands the program, the more
productively and creatively he can use the tools it provides.
Relational Database Management System
Oracle the right tool
Oracle gives you security and control
Database management tools
An Oracle database can be described at two different levels:
Physical Structure
Logical Structure
Physical Structure
a) One or more data files
b) Two or more log files
c) One control file
Logical Structure
a) Table spaces
b) Segments
c) Extents
d) Data Blocks
The data files contain all user data in terms of tables, indexes and views. The log files contain the
information needed to recover the database, or to undo changes after a transaction (rollback).
The control files contain the physical data and media information needed to open and manage the data
files. If the control file is damaged, the server will not be able to open or use the database even if
the database itself is undamaged.
Features of Oracle
Oracle is portable:
The Oracle RDBMS is available on a wide range of platforms, ranging from PCs to
supercomputers, and as a multi-user NetWare Loadable Module (NLM) for Novell NetWare. If you
develop an application on one system, you can run the same application on other systems without any
modifications.
Oracle is Compatible:
Oracle commands can be used for communicating with IBM DB2, a mainframe
RDBMS that is different from Oracle; that is, Oracle is compatible with DB2. Oracle is a
high-performance, fault-tolerant DBMS which is specially designed for on-line transaction
processing and for handling large database applications.
Oracle Tools
Oracle is an RDBMS, which stores and displays data in the form of tables. A table
consists of rows and columns. A single row is called a record. Oracle is a modular system that
contains the Oracle Database (DB Manager) and several tools (functional programs).
Oracle tools do four major kinds of work:
Database management
Data access and manipulation
Programming
Connectivity.
Data Access and Manipulation Tools
These are the tools used for communicating with the database manager for data access and
manipulation. These tools can be used not only for access and manipulation but also to design or
use an application. Each tool provides a separate entry point and a unique approach to the Oracle
system. The tools are firmly based on ANSI standard SQL.
SQL*PLUS
SQL*Plus gives direct access to the Oracle RDBMS. You can use SQL commands to define,
control, manipulate and query data. All users, such as DBAs, high-level system developers and others,
can talk directly to the Oracle RDBMS.
Connectivity Tools
The connectivity tools help in connecting to Oracle databases over a network and to other
database systems. SQL*Plus allows accessing IBM DB2 (an IBM mainframe RDBMS) and
SQL/DS (Structured Query Language/Data System) databases directly, using the normal Oracle
commands without any modifications.
SQL
The name SQL stands for Structured Query Language. SQL is a data access language; like any other
language, it is used for communication. SQL communicates with the database manager. The database
manager could be Oracle, DB2, SQLBase, Ingres or any RDBMS that supports the SQL language.
These database systems understand SQL.
SQL is easy to learn. Although SQL is a computer programming language, it is
much simpler than traditional programming languages like COBOL, BASIC, FORTRAN or APL. This
is because SQL is a non-procedural language.
Features of SQL
SQL uses a free-form (non-mathematical syntax), English-like structure for its
commands.
SQL Processing Capabilities
SQL is composed of a data definition language, a data manipulation language and a data control
language. These three languages support the complete spectrum of relational data processing
activity. In fact, most SQL-based products provide all access to the data through SQL.
Data definition language: DDL allows creation, deletion and modification of data structures for
the database system. These structures include tables, databases, and indexes.
Ex: CREATE, DROP, ALTER.
Data manipulation language: These commands are used to manipulate the data in tables directly
or through views. There are four standard DML statements: SELECT, INSERT, UPDATE and DELETE.
Data control language: These commands are used to control usage and access of data. The most
commonly found ones are GRANT and REVOKE.
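The three sub-languages can be told apart by the leading keyword of a statement. The Java sketch below is a simplified classifier that follows the grouping used in this section; the class name SqlSublanguage and the sample table name are assumptions, and a real SQL parser would need to handle comments, hints and many more statement types.

```java
import java.util.Locale;

// Simplified classifier for the DDL / DML / DCL grouping described above.
class SqlSublanguage {
    enum Kind { DDL, DML, DCL, OTHER }

    // Classifies a statement by its first keyword only.
    static Kind classify(String sql) {
        String trimmed = sql.trim();
        if (trimmed.isEmpty()) {
            return Kind.OTHER;
        }
        String keyword = trimmed.split("\\s+")[0].toUpperCase(Locale.ROOT);
        switch (keyword) {
            case "CREATE":
            case "ALTER":
            case "DROP":
                return Kind.DDL; // defines or changes data structures
            case "SELECT":
            case "INSERT":
            case "UPDATE":
            case "DELETE":
                return Kind.DML; // reads or changes the data itself
            case "GRANT":
            case "REVOKE":
                return Kind.DCL; // controls access to the data
            default:
                return Kind.OTHER;
        }
    }
}
```

For example, classify("CREATE TABLE emp (empno NUMBER PRIMARY KEY)") falls under DDL, while classify("GRANT SELECT ON emp TO some_user") falls under DCL; emp and some_user are placeholder names.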
SQL Data Manipulation Statements
A transaction is a sequence of SQL statements that Oracle treats as a unit, so that all changes brought
about by the statements are made permanent or undone at the same time. To preserve the consistency of
the database, PL/SQL lets you use the COMMIT, ROLLBACK and SAVEPOINT statements. The COMMIT
statement makes permanent any changes made during the current transaction; until you commit your
changes, other users cannot see them. The ROLLBACK statement ends the current transaction and undoes
any changes made since the transaction began. The SAVEPOINT statement marks the current point in the
processing of a transaction.
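Since the project is written in Java, the same three statements can be reached from application code through the standard JDBC Connection interface. The sketch below is one possible arrangement, not the project's actual code: the class and method names are assumptions, and the two SQL statements are passed in by the caller. If the second statement fails, only the work after the savepoint is undone and the first change is still committed.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Savepoint;
import java.sql.Statement;

// Minimal sketch of commit, rollback and savepoint through JDBC (hypothetical names).
class TransactionSketch {
    static void runWithSavepoint(Connection conn, String firstSql, String secondSql)
            throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // start grouping statements into one transaction
        try (Statement st = conn.createStatement()) {
            st.executeUpdate(firstSql);
            // Mark the current point in the transaction.
            Savepoint afterFirst = conn.setSavepoint("after_first");
            try {
                st.executeUpdate(secondSql);
            } catch (SQLException e) {
                // Undo only the work done after the savepoint.
                conn.rollback(afterFirst);
            }
            conn.commit(); // make the surviving changes permanent and visible
        } catch (SQLException e) {
            conn.rollback(); // undo everything since the transaction began
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

Whether a partial result should be committed at all depends on the application; rolling back the whole transaction is the more conservative choice.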
3. SYSTEM ANALYSIS
3.1 EXISTING SYSTEM
Keyword search over a large amount of data is an important
operation in a wide range of domains. Felipe et al. have recently extended its
study to spatial databases, where keyword search becomes a fundamental
building block for an increasing number of real-world applications, and
proposed the IR2-Tree. A main limitation of the IR2-Tree is that it only supports
exact keyword search.
LIMITATIONS WITH EXISTING SYSTEM
An exact keyword is required to retrieve search results.
3.2 PROPOSED SYSTEM
For RSAS queries, the baseline spatial solution is based on Dijkstra's
algorithm. Given a query point q, the query range radius r, and a string
predicate, we expand from q on the road network using the Dijkstra algorithm
until we reach the points at distance r from q, and verify the string predicate
either in a post-processing step or on the intermediate results of the expansion.
We denote this approach as the Dijkstra solution. Its performance degrades
quickly when the query range enlarges and/or the data on the network increases.
This motivates us to find a novel method to avoid the unnecessary road network
expansions, by combining the pruning from both the spatial and the string
predicates simultaneously.
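The baseline Dijkstra solution described above can be sketched in Java as follows. This is a simplified illustration under stated assumptions, not the implementation evaluated in this work: objects are assumed to sit on network nodes rather than along edges, the string predicate is passed in as a java.util.function.Predicate, and the names DijkstraRangeSketch, addRoad and rangeQuery are hypothetical. The predicate is verified on the intermediate results of the expansion.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.function.Predicate;

// Baseline sketch: expand from q with Dijkstra up to network distance r.
class DijkstraRangeSketch {
    static final class Edge {
        final int to;
        final double length;
        Edge(int to, double length) { this.to = to; this.length = length; }
    }

    private static final class Entry {
        final int node;
        final double dist;
        Entry(int node, double dist) { this.node = node; this.dist = dist; }
    }

    // Adds an undirected road segment between nodes a and b.
    static void addRoad(Map<Integer, List<Edge>> graph, int a, int b, double length) {
        graph.computeIfAbsent(a, k -> new ArrayList<>()).add(new Edge(b, length));
        graph.computeIfAbsent(b, k -> new ArrayList<>()).add(new Edge(a, length));
    }

    // Returns the nodes within distance r of q whose description satisfies the predicate,
    // in order of increasing network distance.
    static List<Integer> rangeQuery(Map<Integer, List<Edge>> graph, Map<Integer, String> labels,
                                    int q, double r, Predicate<String> matches) {
        Map<Integer, Double> best = new HashMap<>();
        PriorityQueue<Entry> heap =
            new PriorityQueue<Entry>((a, b) -> Double.compare(a.dist, b.dist));
        best.put(q, 0.0);
        heap.add(new Entry(q, 0.0));
        List<Integer> result = new ArrayList<>();
        while (!heap.isEmpty()) {
            Entry top = heap.poll();
            if (top.dist > best.get(top.node)) {
                continue; // stale entry: a shorter path to this node was already found
            }
            String label = labels.get(top.node);
            if (label != null && matches.test(label)) {
                result.add(top.node); // predicate checked during the expansion
            }
            for (Edge e : graph.getOrDefault(top.node, Collections.<Edge>emptyList())) {
                double next = top.dist + e.length;
                // Never expand beyond the query range r.
                if (next <= r && next < best.getOrDefault(e.to, Double.POSITIVE_INFINITY)) {
                    best.put(e.to, next);
                    heap.add(new Entry(e.to, next));
                }
            }
        }
        return result;
    }
}
```

Every node within distance r is visited regardless of its description, which is the unnecessary expansion that the proposed method aims to avoid.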
We demonstrate the efficiency and effectiveness of our proposed methods
for SAS queries using a comprehensive experimental evaluation. For ESAS
queries, our experimental evaluation covers both synthetic and real data sets of
up to 10 million points and 6 dimensions. For RSAS queries, our evaluation is
based on two large, real road network datasets that contain up to 175,813
nodes, 179,179 edges, and 2 million points on the road network. In both cases,
our methods have significantly outperformed the respective baseline methods.
ADVANTAGES IN PROPOSED SYSTEM
o Relevant results can be retrieved even when the query keyword is not exact.
3.3. SYSTEM REQUIREMENTS
HARDWARE REQUIREMENTS:
Processor : Intel Pentium IV (3.00 GHz)
Memory : 512 MB
Hard disk : 100 GB
SOFTWARE REQUIREMENTS:
Operating system : Windows XP/7/8
Language : Java, HTML
Database : Oracle
After analyzing the requirements of the task to be performed, the next step is to analyze
the problem and understand its context. The first activity in this phase is studying the existing
system, and the other is understanding the requirements and domain of the new system. Both
activities are equally important, but the first serves as the basis for the functional
specifications and then for the successful design of the proposed system. Understanding the
properties and requirements of a new system is more difficult and requires creative thinking.
Understanding the existing running system is also difficult, and an improper understanding of
the present system can lead to a diversion from the solution.
3.4 SOFTWARE REQUIREMENT SPECIFICATION
SCOPE OF THE PROJECT
The software, Site Explorer is designed for management of web sites from a remote
location.
Purpose: The main purpose for preparing this document is to give a general insight into the
analysis and requirements of the existing system or situation and for determining the
operating characteristics of the system.
Scope: This document plays a vital role in the software development life cycle (SDLC), as it
describes the complete requirements of the system. It is meant for use by the developers and
will be the basis during the testing phase. Any changes made to the requirements in the future
will have to go through a formal change approval process.
DEVELOPERS RESPONSIBILITIES OVERVIEW:
The developer is responsible for:
Developing the system, which meets the SRS and solves all the requirements of the
system.
Demonstrating the system and installing the system at client's location after the
acceptance testing is successful.
Submitting the required user manual describing the system interfaces to work on it
and also the documents of the system.
Conducting any user training that might be needed for using the system.
Maintaining the system for a period of one year after installation.
3.4 FUNCTIONAL REQUIREMENTS
Functional Requirements refer to very important system requirements in a software
engineering process (or at micro level, a sub part of requirement engineering) such as
technical specifications, system design parameters and guidelines, data manipulation, data
processing and calculation modules etc.
Functional Requirements are in contrast to other software design requirements referred to as
Non-Functional Requirements which are primarily based on parameters of system
performance, software quality attributes, reliability and security, cost, constraints in
design/implementation etc.
The key goal of determining “functional requirements” in a software product design and
implementation is to capture the required behavior of a software system in terms of
functionality and the technology implementation of the business processes.
The Functional Requirement document (also called Functional Specifications or Functional
Requirement Specifications), defines the capabilities and functions that a System must be
able to perform successfully.
Functional Requirements should include:
Descriptions of data to be entered into the system
Descriptions of operations performed by each screen
Descriptions of work-flows performed by the system
Descriptions of system reports or other outputs
Descriptions of who can enter the data into the system
Descriptions of how the system meets applicable regulatory requirements
The functional specification is designed to be read by a general audience. Readers should
understand the system, but no particular technical knowledge should be required to
understand the document.
Examples of Functional Requirements
Functional requirements should include functions performed by specific screens, outlines of
work-flows performed by the system and other business or compliance requirements the
system must meet.
Interface requirements
Field accepts numeric data entry
Field only accepts dates before the current date
Screen can print on-screen data to the printer
Business Requirements
Data must be entered before a request can be approved
Clicking the Approve Button moves the request to the Approval Workflow
All personnel using the system will be trained according to internal training strategies
Regulatory/Compliance Requirements
The database will have a functional audit trail
The system will limit access to authorized users
The spreadsheet can secure data with electronic signatures
Security Requirements
Members of the Data Entry group can enter requests but not approve or delete requests
Members of the Managers group can enter or approve a request, but not delete
requests
Members of the Administrators group cannot enter or approve requests, but can delete
requests
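The three security requirements above amount to a small permission table. A minimal Java sketch of such a table is given below; the enum and class names are assumptions made for illustration and are not part of this project's design.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical permission table for the example security requirements.
class RequestPermissions {
    enum Group { DATA_ENTRY, MANAGERS, ADMINISTRATORS }
    enum Action { ENTER, APPROVE, DELETE }

    private static final Map<Group, Set<Action>> ALLOWED = new EnumMap<>(Group.class);

    static {
        // Data Entry: may enter, may not approve or delete.
        ALLOWED.put(Group.DATA_ENTRY, EnumSet.of(Action.ENTER));
        // Managers: may enter or approve, may not delete.
        ALLOWED.put(Group.MANAGERS, EnumSet.of(Action.ENTER, Action.APPROVE));
        // Administrators: may delete only.
        ALLOWED.put(Group.ADMINISTRATORS, EnumSet.of(Action.DELETE));
    }

    // True when members of the given group may perform the given action on a request.
    static boolean isAllowed(Group group, Action action) {
        return ALLOWED.get(group).contains(action);
    }
}
```

Writing the requirement as a table makes it easy to check each screen's behaviour against the functional specification.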
The functional specification describes what the system must do; how the system does it is
described in the Design Specification.
If a User Requirement Specification was written, all requirements outlined in the user
requirement specification should be addressed in the functional requirements.
3.5 NON FUNCTIONAL REQUIREMENTS
All the other requirements which do not form a part of the above specification are categorized
as Non-Functional Requirements.
A system may be required to present the user with a display of the number of records in a
database. This is a functional requirement.
How up-to-date this number needs to be is a non-functional requirement. If the
number needs to be updated in real time, the system architects must ensure that the system is
capable of updating the displayed record count within an acceptably short interval of the
number of records changing. Sufficient network bandwidth may also be a non-functional
requirement of a system.
Other examples:
Accessibility
Availability
Backup
Certification
Compliance
Configuration Management
Documentation
Disaster Recovery
Efficiency (resource consumption for given load)
Effectiveness (resulting performance in relation to effort)
Extensibility (adding features, and carry-forward of customizations at next major
version upgrade)
Failure Management
Interoperability
Maintainability
Modifiability
Open Source
Operability
Performance
Platform compatibility
Price
Portability
Quality (e.g. Faults Discovered, Faults Delivered, Fault Removal Efficacy)
Recoverability
Resilience
Resource constraints (processor speed, memory, disk space, network bandwidth etc.)
Response time
Robustness
Scalability (horizontal, vertical)
Security
Software, tools, standards etc.
Stability
Safety
Supportability
Testability
Usability by target user community
Accessibility is a general term used to describe the degree to which a product, device,
service, or environment is accessible by as many people as possible. Accessibility can be
viewed as the "ability to access" and possible benefit of some system or entity. Accessibility
is often used to focus on people with disabilities and their right of access to the system.
Availability is the degree to which a system, subsystem, or equipment is operable and in a
committable state at the start of a mission, when the mission is called for at an unknown, i.e.,
a random, time. Simply put, availability is the proportion of time a system is in a functioning
condition.
Expressed mathematically, availability is 1 minus the unavailability.
A backup or the process of backing up refers to making copies of data so that these
additional copies may be used to restore the original after a data loss event. These additional
copies are typically called "backups."
Certification refers to the confirmation of certain characteristics of an object, system, or
organization. This confirmation is often, but not always, provided by some form of external
review, education, or assessment
Compliance is the act of adhering to, and demonstrating adherence to, a standard or
regulation.
Configuration management (CM) is a field that focuses on establishing and maintaining
consistency of a system's or product's performance and its functional and physical attributes
with its requirements, design, and operational information throughout its life.
Documentation may refer to the process of providing evidence ("to document something")
or to the communicable material used to provide such documentation (i.e. a document).
Documentation may also (seldom) refer to tools aiming at identifying documents or to the
field of study devoted to the study of documents and bibliographies
Disaster recovery is the process, policies and procedures related to preparing for recovery or
continuation of technology infrastructure critical to an organization after a natural or human-
induced disaster.
Disaster recovery planning is a subset of a larger process known as business continuity
planning and should include planning for resumption of applications, data, hardware,
communications (such as networking) and other IT infrastructure
Extensibility (sometimes confused with forward compatibility) is a system design principle
where the implementation takes into consideration future growth. It is a systemic measure of
the ability to extend a system and the level of effort required to implement the extension.
Extensions can be through the addition of new functionality or through modification of
existing functionality. The central theme is to provide for change while minimizing impact to
existing system functions.
Interoperability is a property referring to the ability of diverse systems and organizations to
work together (inter-operate). The term is often used in a technical systems engineering
sense, or alternatively in a broad sense, taking into account social, political, and
organizational factors that impact system to system performance.
Maintainability is the ease with which a software product can be modified in order to:
correct defects
meet new requirements
make future maintenance easier, or
cope with a changed environment.
Open source describes practices in production and development that promote access to the
end product's source materials, typically their source code.
Operability is the ability to keep equipment, a system or a whole industrial installation in a
safe and reliable functioning condition, according to pre-defined operational requirements.
In a computing systems environment with multiple systems this includes the ability of
products, systems and business processes to work together to accomplish a common task.
Computer performance is characterized by the amount of useful work accomplished by a
computer system compared to the time and resources used.
Depending on the context, good computer performance may involve one or more of the
following:
Short response time for a given piece of work
High throughput (rate of processing work)
Low utilization of computing resource(s)
High availability of the computing system or application
Fast (or highly compact) data compression and decompression
High bandwidth / short data transmission time
Price in economics and business is the result of an exchange and from that trade we assign a
numerical monetary value to a good, service or asset
Portability is one of the key concepts of high-level programming. Portability is the ability of a
software codebase to reuse the existing code instead of creating new code when
moving software from one environment to another. When one is targeting several platforms
with the same application, portability is the key issue for development cost reduction.
Quality: The common element of the business definitions is that the quality of a product or
service refers to the perception of the degree to which the product or service meets the
customer's expectations. Quality has no specific meaning unless related to a specific function
and/or object. Quality is a perceptual, conditional and somewhat subjective attribute.
Reliability may be defined in several ways:
The idea that something is fit for purpose with respect to time;
The capacity of a device or system to perform as designed;
The resistance to failure of a device or system;
The ability of a device or system to perform a required function under stated
conditions for a specified period of time;
The probability that a functional unit will perform its required function for a specified
interval under stated conditions.
The ability of something to "fail well" (fail without catastrophic consequences).
Resilience is the ability to provide and maintain an acceptable level of service in the face of
faults and challenges to normal operation.
These services include:
supporting distributed processing
supporting networked storage
maintaining service of communication services such as
o video conferencing
o instant messaging
o online collaboration
access to applications and data as needed
Response time perceived by the end user is the interval between
(a) The instant at which an operator at a terminal enters a request for a response from
a computer and
(b) The instant at which the first character of the response is received at a terminal.
In a data system, the system response time is the interval between the receipt of the end of
transmission of an inquiry message and the beginning of the transmission of a response
message to the station originating the inquiry.
Robustness is the quality of being able to withstand stresses, pressures, or changes in
procedure or circumstance. A system or design may be said to be "robust" if it is capable of
coping well with variations (sometimes unpredictable variations) in its operating environment
with minimal damage, alteration or loss of functionality.
The concept of scalability applies to technology and business settings. Regardless of the
setting, the base concept is consistent - The ability for a business or technology to accept
increased volume without impacting the system.
In telecommunications and software engineering, scalability is a desirable property of a
system, a network, or a process, which indicates its ability to either handle growing amounts
of work in a graceful manner or to be readily enlarged.
Security is the degree of protection against danger, loss, and criminals.
Security has to be compared and contrasted with other related concepts: Safety, continuity,
reliability. The key difference between security and reliability is that security must take into
account the actions of people attempting to cause destruction.
Security as a state or condition is resistance to harm. From an objective perspective, it is a
structure's actual (conceptual and never fully knowable) degree of resistance to harm.
Stability means that most of the objects will remain stable over time and will not need changes.
Safety is the state of being "safe", the condition of being protected against physical, social,
spiritual, financial, political, emotional, occupational, psychological, educational or other
types or consequences of failure, damage, error, accidents, harm or any other event which
could be considered non-desirable. This can take the form of being protected from the event
or from exposure to something that causes health or economical losses. It can include
protection of people or of possessions
Supportability (also known as serviceability) is one of the aspects of RASU (Reliability,
Availability, Serviceability, and Usability). It refers to the ability of technical support
personnel to install, configure, and monitor products, identify exceptions or faults, debug or
isolate faults to root cause analysis, and provide hardware or software maintenance in pursuit
of solving a problem and restoring the product into service. Incorporating serviceability-
facilitating features typically results in more efficient product maintenance, reduces
operational costs and maintains business continuity.
Testability, a property applying to an empirical hypothesis, involves two components: (1) the
logical property that is variously described as contingency, defeasibility, which means that
counter examples to the hypothesis are logically possible, and (2) the practical feasibility of
observing a reproducible series of such counter examples if they do exist. In short, it refers to
the capability of equipment or a system to be tested.
Usability denotes the ease with which users can employ a tool or other human-made
object to achieve a particular goal. In human-computer interaction and computer science, usability
refers to the elegance and clarity with which the interaction with a computer program or a
web site is designed.
MODULES:
Implementation is the stage of the project when the theoretical design
is turned into a working system. Thus it can be considered the most
critical stage in achieving a successful new system and in giving the user
confidence that the new system will work and be effective.
The implementation stage involves careful planning, investigation of
the existing system and its constraints on implementation, designing of
methods to achieve changeover and evaluation of changeover methods.
1. User Module:
In this module, users are given authentication and security to access
the details presented in the ontology system. Before accessing or
searching the details, a user must have an account; otherwise the user must
register first.
2. Key:
The key of the common index is derived from the index word given by
the data owner and from the file. The module provides a secure index and a search
scheme that enable fast similarity search over the data. In such a context, it is very
critical not to sacrifice the confidentiality of the sensitive data while providing functionality.
We provided a rigorous security definition and proved the security of the
proposed scheme under the provided definition to ensure the confidentiality.
3. Edit Distance Pruning:
Computing edit distance exactly is a costly operation. Several
techniques have been proposed for quickly identifying candidate strings within a small
edit distance from a query string. All of them are based on q-grams and a q-gram
counting argument. For a string s, its q-grams are produced by sliding a window
of length q over the characters of s. To deal with the special case at the
beginning and the end of s, where the window has fewer than q characters, one may introduce
special characters, such as “#” and “$”, which are not in the alphabet Σ. This
conceptually extends s by prefixing it with q - 1 occurrences of “#” and suffixing it
with q - 1 occurrences of “$”. Hence, each q-gram for the string s has exactly q characters.
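The q-gram construction above, together with one common form of the counting argument, can be sketched in Java as follows. The count filter used here is the standard bound that two strings within edit distance tau share at least max(|s1|, |s2|) - 1 - (tau - 1) * q padded q-grams; whether this exact bound is the one used by the proposed method is an assumption, and the class and method names are hypothetical. Candidates that pass the filter are then verified with the exact edit distance.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of q-gram extraction, count-filter pruning and exact verification.
class QGramSketch {
    // Pads s with q-1 '#' and q-1 '$' characters, then slides a window of length q.
    static List<String> qGrams(String s, int q) {
        StringBuilder padded = new StringBuilder();
        for (int i = 0; i < q - 1; i++) padded.append('#');
        padded.append(s);
        for (int i = 0; i < q - 1; i++) padded.append('$');
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + q <= padded.length(); i++) {
            grams.add(padded.substring(i, i + q));
        }
        return grams;
    }

    // Size of the multiset intersection of the two q-gram lists.
    static int commonGrams(String a, String b, int q) {
        Map<String, Integer> counts = new HashMap<>();
        for (String g : qGrams(a, q)) counts.merge(g, 1, Integer::sum);
        int common = 0;
        for (String g : qGrams(b, q)) {
            Integer left = counts.get(g);
            if (left != null && left > 0) {
                counts.put(g, left - 1);
                common++;
            }
        }
        return common;
    }

    // Count filter: false means edit distance > tau is certain, so b can be pruned.
    static boolean passesCountFilter(String a, String b, int q, int tau) {
        int bound = Math.max(a.length(), b.length()) - 1 - (tau - 1) * q;
        return commonGrams(a, b, q) >= bound;
    }

    // Exact edit distance by dynamic programming, used only on surviving candidates.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

With q = 2, the string "theatre" yields the eight q-grams #t, th, he, ea, at, tr, re, e$. It shares five of them with "theater", so that pair passes the filter for tau = 2 and is pruned for tau = 1.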
4. Search:
We provide a specific application of the proposed similarity searchable
encryption scheme to clarify its mechanism. The server performs the search on the index
for each component and sends back the corresponding encrypted bit vectors that it
produces with the respective LIKE command. Finally, we illustrate the performance
of the proposed scheme with an empirical analysis on real data.
4. SYSTEM DESIGN
4.1 UML Diagrams:
UML is a method for describing the system architecture in detail, in the form of a blueprint.
UML represents a collection of best engineering practices that have proven successful
in the modeling of large and complex systems.
UML is a very important part of developing object-oriented software and of the
software development process.
UML uses mostly graphical notations to express the design of software projects.
Using the UML helps project teams communicate, explore potential designs, and
validate the architectural design of the software.
Definition:
UML is a general-purpose visual modeling language that is used to specify, visualize,
construct, and document the artifacts of the software system.
UML is a language:
It provides a vocabulary and rules for communication, focusing on the conceptual
and physical representation of a system. It is therefore a modeling language.
UML Specifying:
Specifying means building models that are precise, unambiguous and complete. In
particular, the UML addresses the specification of all the important analysis, design and
implementation decisions that must be made in developing and deploying a software-
intensive system.
UML Visualization:
The UML includes both graphical and textual representations. This makes it easy to
visualize the system and to understand it better.
UML Constructing:
UML models can be directly connected to a variety of programming languages and it
is sufficiently expressive and free from any ambiguity to permit the direct execution of
models.
UML Documenting:
UML provides a variety of documents in addition to raw executable code.
The use case view of a system encompasses the use cases that describe the behavior of the
system as seen by its end users, analysts, and testers.
The design view of a system encompasses the classes, interfaces, and collaborations
that form the vocabulary of the problem and its solution.
The process view of a system encompasses the threads and processes that form the
system's concurrency and synchronization mechanisms.
The implementation view of a system encompasses the components and files that are
used to assemble and release the physical system. The deployment view of a system
encompasses the nodes that form the system's hardware topology on which the system
executes.
Uses of UML:
The UML is intended primarily for software-intensive systems. It has been used
effectively for such domains as:
i) Enterprise Information Systems
ii) Banking and Financial Services
iii) Telecommunications
iv) Transportation
v) Defense/Aerospace
vi) Retail
vii) Medical Electronics
viii) Scientific Fields
ix) Distributed Web-based Services
Building blocks of UML:
The vocabulary of the UML encompasses 3 kinds of building blocks
Things
Relationships
Diagrams
Things:
Things are the abstractions that are first-class citizens in a model. Things are of four
types:
Structural Things, Behavioral Things, Grouping Things, Annotational Things
Relationships:
Relationships tie the things together. Relationships in the UML are
Dependency, Association, Generalization, Realization
UML Diagrams:
A diagram is the graphical presentation of a set of elements, most often rendered as a
connected graph of vertices (things) and arcs (relationships).
There are two types of diagrams, they are:
Structural and Behavioral Diagrams
Structural Diagrams:-
The UML's four structural diagrams exist to visualize, specify, construct and
document the static aspects of a system. One can view the static parts of a system using one of
the following diagrams. Structural diagrams consist of Class Diagram, Object Diagram,
Component Diagram, and Deployment Diagram.
Behavioral Diagrams:
The UML’s five behavioral diagrams are used to visualize, specify, construct, and
document the dynamic aspects of a system. The UML’s behavioral diagrams are roughly
organized around the major ways in which the dynamics of a system can be modeled.
Behavioral diagrams consist of
a) Use case Diagram b) Sequence Diagram
c) Collaboration Diagram d) State chart Diagram e) Activity Diagram
4.2 Use-Case diagram:
A use case is a set of scenarios that describe an interaction between a user and a
system. A use case diagram displays the relationships among actors and use cases. The two
main components of a use case diagram are use cases and actors.
An actor represents a user or another system that will interact with the system being
modeled. A use case is an external view of the system that represents some action the
user might perform in order to complete a task.
Fig 1: USECASE DIAGRAM
Contents:
Use cases
Actors
Dependency, Generalization, and association relationships
System boundary
4.3 Class Diagram:
Class diagrams are widely used to describe the types of objects in a system and their
relationships. Class diagrams model class structure and contents using design elements such
as classes, packages and objects. Class diagrams can be drawn from three different perspectives
when designing a system: conceptual, specification, and implementation. These perspectives
become evident as the diagram is created and help solidify the design. Class diagrams are
arguably the most used UML diagram type and are the main building block of any
object-oriented solution. A class diagram shows the classes in a system, the attributes and
operations of each class, and the relationships between classes. In most modeling tools a class
has three parts: the name at the top, attributes in the middle, and operations or methods at the
bottom. In large systems with many classes, related classes are grouped together to create class
diagrams. Different relationships between classes are shown by different types of arrows. The
class diagram for the system is shown below.
Fig 2: CLASS DIAGRAM
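The three-part layout of a class described above (name, attributes, operations) maps fairly directly onto code. The sketch below is only an illustration: the class SpatialObject and its members are hypothetical names chosen for this example and do not come from the diagram.

```python
class SpatialObject:  # name compartment
    """Hypothetical class: a point in the plane with a textual description."""

    def __init__(self, x, y, description):
        # attribute compartment
        self.x = x
        self.y = y
        self.description = description

    # operation compartment
    def in_rectangle(self, x_min, y_min, x_max, y_max):
        """Return True if the point lies inside the axis-aligned rectangle."""
        return x_min <= self.x <= x_max and y_min <= self.y <= y_max


obj = SpatialObject(2.0, 3.0, "theatre")
print(obj.in_rectangle(0, 0, 5, 5))
```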
4.4 Sequence Diagram
Sequence diagrams in UML show how objects interact with each other and the order in which
those interactions occur. It is important to note that they show the interactions for a particular
scenario. The processes are represented vertically and interactions are shown as arrows.
Fig 3: SEQUENCE DIAGRAM
4.6 Activity diagram:
Activity diagrams describe the workflow behavior of a system. Activity
diagrams are similar to state diagrams because activities are the state of doing something.
The diagrams describe the state of activities by showing the sequence of activities
performed. Activity diagrams can show activities that are conditional or parallel.
How to Draw: Activity Diagrams
Activity diagrams show the flow of activities through the system. Diagrams are read
from top to bottom and have branches and forks to describe conditions and parallel activities.
A fork is used when multiple activities are occurring at the same time. The diagram below
shows a fork after activity1. This indicates that both activity2 and activity3 are occurring at
the same time. After activity2 there is a branch. The branch describes what activities will
take place based on a set of conditions. All branches at some point are followed by a merge
to indicate the end of the conditional behavior started by that branch. After the merge all of
the parallel activities must be combined by a join before transitioning into the final activity
state.
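The fork, branch, merge and join described above can be mimicked in code. The following is a minimal sketch, assuming Python's standard concurrent.futures module for the parallel part. The function names activity1, activity2, activity3 follow the text, while run_workflow and the numeric condition are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def activity1():
    return "a1"


def activity2(value):
    # Branch: choose one of two alternatives based on a condition.
    if value > 0:
        result = "positive path"
    else:
        result = "non-positive path"
    # Merge: both alternatives continue from this single point.
    return result


def activity3():
    return "a3"


def run_workflow(value):
    trace = [activity1()]
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Fork: activity2 and activity3 are started concurrently.
        future2 = pool.submit(activity2, value)
        future3 = pool.submit(activity3)
        # Join: wait for both parallel activities before moving on.
        trace.append(future2.result())
        trace.append(future3.result())
    trace.append("final")
    return trace


print(run_workflow(1))
```

The final activity is appended only after both futures have returned, which corresponds to the join bar in the diagram.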
When to Use: Activity Diagrams
Activity diagrams should be used in conjunction with other modeling techniques such
as interaction diagrams and state diagrams. The main reason to use activity diagrams is to
model the workflow behind the system being designed. Activity Diagrams are also useful
for: analyzing a use case by describing what actions need to take place and when they should
occur; describing a complicated sequential algorithm; and modeling applications with parallel
processes.
Fig 4.1: ACTIVITY DIAGRAM FOR USER
Fig 4.2: ACTIVITY DIAGRAM FOR ADMIN
4.7 Data Flow Diagram
Fig 5: Context level (Users and Admin sign in to and sign out of the Spatial Approximate String Search system)
Fig 5.2: LEVEL 0 USER LEVEL DIAGRAM
Fig 5.4: LEVEL 1 ADMIN DIAGRAM
Fig 5.5: LEVEL 1 DFD DIAGRAM (flows: Register, Log in, Search files, View files, Update profile, Download files, Logout; data store: Database)
5. IMPLEMENTATION
Implementation is the stage of the project when the
theoretical design is turned into a working system. It can therefore be considered the
most critical stage in achieving a successful new system and in giving the user confidence
that the new system will work and be effective.
The implementation stage involves careful planning, investigation of the existing
system and its constraints on implementation, design of methods to achieve changeover,
and evaluation of changeover methods.
Implementation is the process of converting a new system design into operation. It is
the phase that focuses on user training, site preparation and file conversion for installing a
candidate system. The important factor to consider here is that the conversion
should not disrupt the functioning of the organization.
5.2 SAMPLE CODE:
6. TESTING
6.1 Introduction
The purpose of testing is to discover errors. Testing is the process of trying to
discover every conceivable fault or weakness in a work product. It provides a way to check
the functionality of components, sub-assemblies, assemblies and/or a finished product. It is the
process of exercising software with the intent of ensuring that the
software system meets its requirements and user expectations and does not fail in an
unacceptable manner. There are various types of tests. Each test type addresses a specific
testing requirement.
TYPES OF TESTS
Unit testing
Unit testing involves the design of test cases that validate that the internal program
logic is functioning properly, and that program inputs produce valid outputs. All decision
branches and internal code flow should be validated. It is the testing of individual software
units of the application. It is done after the completion of an individual unit, before
integration. This is structural testing that relies on knowledge of the unit's construction and is
invasive. Unit tests perform basic tests at component level and test a specific business
process, application, and/or system configuration. Unit tests ensure that each unique path of a
business process performs accurately to the documented specifications and contains clearly
defined inputs and expected results.
Fig 6.1: Testing (stages shown: unit testing, module testing, sub-system testing, system testing, acceptance testing; grouped as component testing, integration testing and user testing)
Integration testing
Integration tests are designed to test integrated software components to determine if
they actually run as one program. Testing is event driven and is more concerned with the
basic outcome of screens or fields. Integration tests demonstrate that although the
components were individually satisfactory, as shown by successful unit testing, the
combination of components is correct and consistent. Integration testing is specifically aimed
at exposing the problems that arise from the combination of components.
Functional test
Functional tests provide systematic demonstrations that the functions tested are available as
specified by the business and technical requirements, system documentation, and user
manuals.
Functional testing is centered on the following items:
Valid Input : identified classes of valid input must be accepted.
Invalid Input : identified classes of invalid input must be rejected.
Functions : identified functions must be exercised.
Output : identified classes of application outputs must be exercised.
Systems/Procedures: interfacing systems or procedures must be invoked.
Organization and preparation of functional tests is focused on requirements, key functions, or
special test cases. In addition, systematic coverage of identified business process
flows, data fields, predefined processes, and successive processes must be considered for
testing. Before functional testing is complete, additional tests are identified and the effective
value of current tests is determined.
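The valid-input and invalid-input items listed above can be turned into a small table-driven check. The sketch below is illustrative only: the function validate_keyword and its rule (1 to 32 characters, letters, digits or spaces) are assumptions made for this example, not requirements stated in this report.

```python
def validate_keyword(keyword):
    """Hypothetical rule: 1 to 32 characters, letters, digits or spaces only."""
    if not isinstance(keyword, str):
        return False
    stripped = keyword.strip()
    if not 1 <= len(stripped) <= 32:
        return False
    return all(ch.isalnum() or ch == " " for ch in stripped)


# Identified classes of valid and invalid input (example values are assumptions).
VALID_INPUTS = ["theatre", "city hall", "cafe 24"]
INVALID_INPUTS = ["", "   ", "x" * 33, "drop;table", None]


def run_functional_checks():
    """Return every input for which the function under test misbehaved."""
    failures = []
    for value in VALID_INPUTS:
        if not validate_keyword(value):
            failures.append(("rejected valid input", value))
    for value in INVALID_INPUTS:
        if validate_keyword(value):
            failures.append(("accepted invalid input", value))
    return failures


print(run_functional_checks())
```

An empty list means that every valid class was accepted and every invalid class was rejected.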
System Test
System testing ensures that the entire integrated software system meets requirements.
It tests a configuration to ensure known and predictable results. An example of system testing
is the configuration oriented system integration test. System testing is based on process
descriptions and flows, emphasizing pre-driven process links and integration points.
White Box Testing
White Box Testing is testing in which the software tester has knowledge of
the inner workings, structure and language of the software, or at least its purpose. It is
used to test areas that cannot be reached from a black box level.
Black Box Testing
Black Box Testing is testing the software without any knowledge of the inner workings,
structure or language of the module being tested. Black box tests, like most other kinds of tests,
must be written from a definitive source document, such as a specification or requirements
document. It is testing in which the software under test is treated as a black box: you cannot
“see” into it. The test provides inputs and responds to outputs without considering how the
software works.
6.2 Unit Testing:
Unit testing is usually conducted as part of a combined code and unit test phase of the
software lifecycle, although it is not uncommon for coding and unit testing to be conducted as
two distinct phases.
Test strategy and approach
Field testing will be performed manually and functional tests will be written in detail.
Test objectives
All field entries must work properly.
Pages must be activated from the identified link.
The entry screen, messages and responses must not be delayed.
Features to be tested
Verify that the entries are of the correct format
No duplicate entries should be allowed
All links should take the user to the correct page.
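The first two features above (correct entry format and no duplicate entries) could be covered by unit tests along the following lines. This is a sketch using Python's standard unittest module. The Registry class, its e-mail pattern and the case-insensitive duplicate rule are hypothetical stand-ins for the real entry fields.

```python
import re
import unittest


class Registry:
    """Hypothetical store of registered e-mail addresses."""

    EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

    def __init__(self):
        self._emails = set()

    def add(self, email):
        if not self.EMAIL_PATTERN.match(email):
            raise ValueError("invalid format")
        key = email.lower()  # assumption: duplicates are case-insensitive
        if key in self._emails:
            raise ValueError("duplicate entry")
        self._emails.add(key)


class RegistryTest(unittest.TestCase):
    def test_rejects_bad_format(self):
        with self.assertRaises(ValueError):
            Registry().add("not-an-email")

    def test_rejects_duplicates(self):
        registry = Registry()
        registry.add("user@example.com")
        with self.assertRaises(ValueError):
            registry.add("USER@example.com")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(RegistryTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```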
6.3 Integration Testing
Software integration testing is the incremental integration testing of two or more
integrated software components on a single platform to produce failures caused by interface
defects.
The task of the integration test is to check that components or software applications
(for example, components in a software system or, one step up, software applications at the
company level) interact without error.
Test Results: All the test cases mentioned above passed successfully. No defects
encountered.
6.4 Acceptance Testing
User Acceptance Testing is a critical phase of any project and requires significant
participation by the end user. It also ensures that the system meets the functional
requirements.
Test Results: All the test cases mentioned above passed successfully. No defects
encountered.
CONCLUSION
This paper presents a comprehensive study of spatial approximate string queries in both the
Euclidean space and road networks. We use the edit distance as the similarity measure
for the string predicate and focus on range queries as the spatial predicate. We also
address the problem of query selectivity estimation for queries in the Euclidean space. Future
work includes examining spatial approximate sub-string queries, designing methods that are
more update-friendly, and solving the selectivity estimation problem for RSAS queries.
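To make the two predicates concrete, the sketch below combines an edit-distance test with a rectangular range test. It is a brute-force scan written purely for illustration and is not the indexing approach studied in this work. The names edit_distance, esas_query and the threshold parameter tau are assumptions, as is the sample data.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with the standard dynamic programme."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def esas_query(objects, rect, keyword, tau):
    """Keep objects inside rect whose description is within tau edits of keyword."""
    x_min, y_min, x_max, y_max = rect
    return [
        (x, y, text)
        for x, y, text in objects
        if x_min <= x <= x_max and y_min <= y <= y_max
        and edit_distance(text, keyword) <= tau
    ]


# Hypothetical sample data: (x, y, description)
objects = [
    (1.0, 1.0, "theatre"),
    (2.0, 2.0, "theater"),
    (9.0, 9.0, "theatre"),
    (3.0, 1.0, "museum"),
]
print(esas_query(objects, (0, 0, 5, 5), "theatre", 2))
```

A scan like this touches every object, which is the cost that an index over both the spatial and the string dimension is meant to avoid.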
APPENDIX- A
REFERENCES
[1] S. Acharya, V. Poosala, and S. Ramaswamy. Selectivity estimation in
spatial databases. In SIGMOD, pages 13–24, 1999.
[2] S. Alsubaiee, A. Behm, and C. Li. Supporting location-based
approximate-keyword queries. In GIS, pages 61–70, 2010.
[3] A. Arasu, S. Chaudhuri, K. Ganjam, and R. Kaushik. Incorporating
string transformations in record matching. In SIGMOD, pages 1231–
1234, 2008.
[4] A. Arasu, V. Ganti, and R. Kaushik. Efficient exact set-similarity joins.
In VLDB, pages 918–929, 2006.
[5] N. Beckmann, H. P. Kriegel, R. Schneider, and B. Seeger. The R*-tree:
an efficient and robust access method for points and rectangles. In
SIGMOD, pages 322–331, 1990.
[6] A. Z. Broder, M. Charikar, A. M. Frieze, and M. Mitzenmacher. Minwise
independent permutations (extended abstract). In STOC, pages
327–336, 1998.
[7] X. Cao, G. Cong, and C. S. Jensen. Retrieving top-k prestige-based
relevant spatial web objects. Proc. VLDB Endow., 3:373–384, 2010.
[8] K. Chakrabarti, S. Chaudhuri, V. Ganti, and D. Xin. An efficient filter
for approximate membership checking. In SIGMOD, pages 805–818,
2008.
[9] S. Chaudhuri, K. Ganjam, V. Ganti, and R. Motwani. Robust and
efficient fuzzy match for online data cleaning. In SIGMOD, pages 313–
324, 2003.