SEMANTIC WEB APPLICATION:
ONTOLOGY-DRIVEN RECIPE QUERYING
A MASTER’S THESIS
in
Computer Engineering
Atılım University
by
GÜLER KALEM
JUNE 2005
SEMANTIC WEB APPLICATION:
ONTOLOGY-DRIVEN RECIPE QUERYING
A THESIS SUBMITTED TO
THE GRADUATE SCHOOL OF NATURAL AND APPLIED SCIENCES
OF
ATILIM UNIVERSITY
BY
GÜLER KALEM
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
IN
THE DEPARTMENT OF COMPUTER ENGINEERING
JUNE 2005
Approval of the Graduate School of Natural and Applied Sciences
_____________________
Prof. Dr. İbrahim Akman
Director
I certify that this thesis satisfies all the requirements as a thesis for the degree of Master of Science.
_____________________
Prof. Dr. İbrahim Akman
Head of Department
This is to certify that I have read this thesis and that in my opinion it is fully adequate, in scope and quality, as a thesis for the degree of Master of Science.
_____________________
Asst. Prof. Dr. Çiğdem Turhan
Supervisor
Examining Committee Members
Prof. Dr. Ali Yazıcı _____________________
Prof. Dr. İbrahim Akman _____________________
Assoc. Prof. Dr. Nazife Baykal _____________________
Asst. Prof. Dr. Çiğdem Turhan _____________________
Asst. Prof. Dr. Nevzat Sezer _____________________
ABSTRACT
SEMANTIC WEB APPLICATION:
ONTOLOGY-DRIVEN RECIPE QUERYING
Kalem, Güler
M.S., Computer Engineering Department
Supervisor: Asst. Prof. Dr. Çiğdem Turhan
June 2005, 102 pages
Currently, the information presented on the Internet has only static content that carries meaning in limited contexts, and these documents cannot be used effectively by different systems. However, presenting information with well-defined meaning will enable different computer systems to process and reason about it at the semantic level, whereas present systems process information only at the syntax level. The Semantic Web approach will drastically change the effectiveness of the Internet, enable the reuse of information, and increase its representative power. It will be possible to combine information from different locations and process it together, since it is defined in a standard way.
In this thesis, concepts such as representing knowledge with a Semantic Web
language, ontology processing, reasoning and querying on ontologies have been
implemented to realize a Semantic Web application: Ontology-driven Recipe
Querying.
As the domain, a Web-based application dealing with food recipes has been chosen. All the information and application logic have been moved into an OWL (Web Ontology Language) ontology file, which controls all the content and structure of the application and makes it possible to reason on the provided information to create new facts from already given logic statements.
In the application, the user can enter queries made up of arbitrary elements when searching for available food recipes, and the application is capable of responding meaningfully no matter how the queries are constructed.
Keywords: Semantic Web, ontology, ontology querying, ontology management,
ontology-driven knowledge management, knowledge representation, Internet.
ÖZ

SEMANTIC WEB APPLICATION:
ONTOLOGY-DRIVEN RECIPE QUERYING

Kalem, Güler
M.S., Computer Engineering Department
Supervisor: Asst. Prof. Dr. Çiğdem Turhan
June 2005, 102 pages

Today, all the information available on the Internet has only static content, and it is quite difficult for these documents to be used effectively by different systems. However, presenting information with a well-defined meaning will enable different computer systems to process it and to reason about it at the semantic level, whereas existing systems process information only at the syntax level. The Semantic Web approach will greatly increase the effectiveness of the Internet, enable the reuse of information, and increase the representative power of information. Since information is defined according to a standard, it will be possible to combine information from different locations and to process it together.

In this thesis, in order to realize a Semantic Web application, Ontology-Driven Recipe Querying, the concepts of representing knowledge with a Semantic Web language, ontology processing, and reasoning and querying over ontologies have been implemented.

As the domain, an Internet-based food-recipe application has been chosen. All the information and application logic have been moved into an OWL (Web Ontology Language) ontology file that controls the content and structure of the application, and this file makes it possible to reason over existing logical statements to derive new facts.

In the application, the user can query the available food recipes by entering ingredients of his or her own choosing. Moreover, the application returns meaningful answers no matter how the queries are constructed.

Keywords: Semantic Web, ontology, ontology querying, ontology management, ontology-driven knowledge management, knowledge representation, Internet.
ACKNOWLEDGMENTS
I express sincere appreciation to my supervisor Asst. Prof. Dr. Çiğdem Turhan for
sharing her knowledge with me and guiding me throughout my thesis. Without her
project proposal, support, encouragement, guidance and persistence this thesis would
never have happened.
I would also like to express my appreciation to the examining committee members Prof. Dr. Ali Yazıcı, Prof. Dr. İbrahim Akman, Assoc. Prof. Dr. Nazife Baykal and Asst. Prof. Dr. Nevzat Sezer for their valuable suggestions and comments.
In addition, I would like to thank my parents Nafiye and Recep Kalem and my sister
Ayşegül for their unlimited patience, support and love during the course of the study.
TABLE OF CONTENTS
ABSTRACT ............................................................................................................... iii
ÖZ.................................................................................................................................v
ACKNOWLEDGMENTS..........................................................................................vii
TABLE OF CONTENTS .........................................................................................viii
LIST OF FIGURES......................................................................................................x
CHAPTER
1. INTRODUCTION................................................................................................1
2. OVERVIEW OF THE WEB................................................................................4
2.1 Web Languages..............................................................................................5
2.2 Information Management on the Web............................................................6
2.3 Information Retrieval on the Web..................................................................6
3. SEMANTIC WEB................................................................................................9
3.1 Overview of the Semantic Web....................................................................10
3.2 Information Retrieval with Semantic Web...................................................13
3.3 Semantic Web Tools and Languages............................................................14
3.3.1 SGML (Standard Generalized Markup Language)...............................15
3.3.2 XML (eXtensible Markup Language)..................................................15
3.3.3 RDF (Resource Description Framework).............................................17
3.3.4 RDFS (RDF Schema)...........................................................................18
3.3.5 OIL (Ontology Inference Layer)..........................................................19
3.3.6 DAML+OIL (DARPA Agent Markup Language - OIL).....................20
3.3.7 OWL (Web Ontology Language).........................................................22
4. ONTOLOGY, ONTOLOGY EDITORS AND QUERY LANGUAGES.........26
4.1 Ontology Editors........................................................................................34
4.2 Ontology Management System..................................................................36
4.3 Ontology Query Languages........................................................................38
5. DESIGN OF THE SEMANTIC WEB APPLICATION: ONTOLOGY-
DRIVEN RECIPE QUERYING........................................................................41
5.1 Overview of the System...............................................................................42
5.2 System Domain............................................................................................43
5.3 System Specifications..................................................................................46
5.4 System Design.............................................................................................47
5.4.1 OWL Ontology Design........................................................................49
5.5 Technical Specification...............................................................................52
6. IMPLEMENTATION.......................................................................................54
6.1 OWL Query Server......................................................................................55
6.2 Web Interface..............................................................................................62
6.3 Implementing the Ontology with Protégé...................................................75
7. CONCLUSION.................................................................................................76
REFERENCES......................................................................................................79
APPENDICES
A. Ontology Editor Survey Results...................................................................86
B. Sample Queries.............................................................................................90
C. OWL Model..................................................................................................92
D. Class Hierarchy for foodReceipts Project....................................................98
LIST OF FIGURES
FIGURE
1. Structure of the System ....................................................................................... 42
2. Properties and Relations of the System .............................................................. 51
3. OWLQueryServer UML diagram ....................................................................... 57
4. RequestHandler UML diagram ........................................................................... 61
5. FoodEntry UML diagram ................................................................................... 62
6. Search Interface 1 ............................................................................................... 63
7. Sample Search (A) .............................................................................................. 64
8. Sample Search (B) .............................................................................................. 64
9. Sample Search (C) .............................................................................................. 65
10. Sample Search (D) .............................................................................................. 65
11. Ingredients and Recipe of Körili Pilav ............................................................... 66
12. Source of Körili Pilav ......................................................................................... 67
13. Search Interface 1 ............................................................................................... 68
14. Selecting Ingredients from Categorized Box ..................................................... 69
15. Sample Search (E) .............................................................................................. 70
16. Sample Search (F) .............................................................................................. 71
17. Sample Search (G) ............................................................................................. 72
18. Category of Pilavlar ........................................................................................... 73
19. Help Interface of the System .............................................................................. 74
20. Protégé Ontology Editor......................................................................................75
CHAPTER 1
INTRODUCTION
During the last fifteen years the Internet has stepped into our lives to stay. Today, almost everything in our lives is connected with the ‘Web’ in one way or another. The Internet has become one of the most important platforms for e-commerce, communication, entertainment, business, education and knowledge sharing. Looking at the different fields and platforms involved, it is not difficult to see that the Internet is not just a modern way of doing things; it is a ‘de facto’ reality that will not simply fade away but will continue to grow into the way we live. It is an entire concept surrounding our lives and reshaping our lifestyle.
While the Internet is changing our way of living, it is also changing and evolving within itself. A new phase is needed in which information on the Internet is given well-defined meaning, enabling computers and people to work in cooperation. Currently, the information presented on the Internet has only static content that carries meaning in some target environments or contexts. The Internet contains billions of such documents which, in general, cannot be used effectively by different systems. However, presenting information in a well-defined format using shared standards will enable computer systems to process it at the semantic level, whereas present systems process information only at the syntax level. Presenting information with well-defined meaning will enable different computer systems to process and reason about it. This approach will drastically change the effectiveness of the Internet, enable the reuse of information, and increase its representative power. It will be possible to combine information from different locations and process it together, since it is defined in a standard way.
“The Semantic Web” is a new way of representing information that allows it to be defined and presented at the semantic level, so that computer systems can process it more effectively. A possible realization of the process mentioned above, if not the only one, is to use Semantic Web languages that enable the semantic definition of information.
In this thesis, the concepts involved in the Semantic Web have been studied, and it has been shown how different solutions can be combined to realize such applications. Concepts such as representing knowledge with a Semantic Web language, ontology processing, and reasoning and querying on ontologies have been applied successfully in the developed application.

The main purpose of the thesis is to investigate the Semantic Web concept and gain a solid understanding of it, together with its difficulties, its problems and its applicability to real-world applications.
In developing the Semantic Web application, the following practical problems arise:
• to process data defined with the Semantic Web language OWL (Web Ontology Language) [34] [35] [37] [38] [50] [62] [73].
• to execute queries on OWL ontologies.
• to make use of the meaning of the information within applications.
• to combine and process, on a single system, information located on different systems.
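To make the first of these problems concrete: an OWL ontology is serialized as an RDF/XML document, so even a standard XML parser can extract its classes and properties at the syntactic level. The sketch below uses Python's standard library; the class and property names are hypothetical illustrations, not taken from the thesis ontology, and a real application would use a dedicated ontology API rather than raw XML parsing.

```python
import xml.etree.ElementTree as ET

# A minimal OWL/XML fragment (hypothetical vocabulary for illustration)
OWL_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="#Recipe"/>
  <owl:Class rdf:about="#Ingredient"/>
  <owl:ObjectProperty rdf:about="#hasIngredient">
    <rdfs:domain rdf:resource="#Recipe"/>
    <rdfs:range rdf:resource="#Ingredient"/>
  </owl:ObjectProperty>
</rdf:RDF>"""

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
OWL = "{http://www.w3.org/2002/07/owl#}"

root = ET.fromstring(OWL_XML)
# OWL classes are owl:Class elements; their URI sits in the rdf:about attribute
classes = [c.get(RDF + "about") for c in root.iter(OWL + "Class")]
properties = [p.get(RDF + "about") for p in root.iter(OWL + "ObjectProperty")]
print(classes)     # ['#Recipe', '#Ingredient']
print(properties)  # ['#hasIngredient']
```

Note that this only reads the syntax; extracting the semantics, such as inferred class membership, is exactly what requires the reasoning machinery discussed in later chapters.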
The implementation part of the thesis mostly deals with the problems mentioned in the above list. As the domain, a Web-based application dealing with food recipes has been chosen. Instead of building all the application logic into static standard HTML with a scripting language, all the information and application logic have been moved into an OWL ontology file. For a more effective system, as much of the data and application logic as possible should reside in the OWL Web ontology.
Specifically, the OWL ontology controls all the content and the structure of the application. It makes it possible to reason on the provided information and to create new facts from already given logic statements. In the application, functionality is provided so that an end-user can enter queries for food recipes through the Web interface. All the data needed for user queries is provided from the information stored in the ontology itself. The user can also enter queries made up of arbitrary elements when querying for available food recipes, and the application is able to respond meaningfully no matter how the queries are constructed.
This thesis is divided into chapters, each dealing with a specific part of the Semantic Web concept and the implemented application.

The following chapter presents background information about the concept of the Web, its general problems, and information management on the current Web. Then, in Chapter 3, the history of the Semantic Web, its domain, and Semantic Web tools and languages are presented. Chapter 4 is about ontologies, ontology editors and query languages. In Chapter 5, the system design is covered, and the system domain and specifications are explained. Chapter 6 presents the implemented application, discussing the user interface and system structure in detail. Finally, the conclusion chapter explains possible extensions to the thesis and future work on the Semantic Web subject.
CHAPTER 2
OVERVIEW OF THE WEB
At the very beginning, when the Web first emerged, a handful of computers were connected to each other in order to work together and share the necessary data between them [13] [14] [42]. Over time, the Web started to grow, and intranets and LANs came onto the scene. But the explosion of personal computers and mobile devices, together with major advances in telecommunications, was the actual trigger of the Web as we know it today.
The growth of the Web over the past few years has been impressive. It is a phenomenon that cannot be pinned down to any one period of time, because of its potential to change and to fit into our lives. The interaction between the Web and the way we live involves both sides equally: as the Web changes according to our needs, the ways human beings work, study and communicate with each other are likewise reshaped. This interaction still has great potential to move far beyond our imagination.
At the first stage of the Web, it was thought of as an exchange platform for documents and data, and a communication medium for work collaboration. It was meant to be a big network of workstations where programs and databases could share their knowledge and work together in collaboration. But with the enormous explosion of media programs, video games, films, music, pictures, etc., the present Web is used almost exclusively by humans and not by machines. The content is mainly targeted at human consumption. Information meant to be processed by computing systems is generally defined by custom standards, which is a handicap for a broader and more extended use of the provided information.
Specifically, the main problem with the present Web is that the information is, in most cases, written only for human consumption. Machines cannot understand the meaning of online information. Enormous numbers of pictures, drawings and movies of all media types, along with information presented in natural-language form, populate the actual Web. As a result, this meaningless information is of no use at all to machines, because they cannot process the data in a context-aware way; they can only present the data to the user in a specific format.

On the other hand, finding the right piece of information is often a nightmare on the present Web. Search results are in most cases imprecise, often yielding matches to thousands of pages. Searching is a difficult task for humans: it takes too much time and has several limitations. Moreover, users face the task of reading all the retrieved documents in order to extract the information actually desired. Today’s search engines are not context-aware; rather, they perform searches with text-matching methods. A related problem is that the maintenance of Web sources has become very difficult. The burden on users to maintain consistency is often overwhelming, and this has resulted in a vast number of sites containing inconsistent and contradictory information.
2.1 Web Languages
Many languages are used to publish data on the current Web [13] [15], among them HTML, PHP, JSP and ASP, as well as media-oriented Web languages such as Flash. However, these scripting and markup languages are only meant to handle the business logic of applications and the visual presentation of the information they deal with. A markup language such as HTML does not care what the information is; it only controls the layout and appearance of the given information. Server-side Web scripting languages such as PHP are generally targeted at the dynamic behavior of Web applications and their business logic. All of the above languages share a common shortcoming: they neither provide nor process semantic meaning bound to information. They treat data as plain text without any meaning; that is, such Web languages are not “aware” of the information they are dealing with.
2.2 Information Management on the Web
The incredible progress of the Web is a direct consequence of a big explosion of all kinds of online Web documents. Information storage and collection on the Web works as follows: the information is generally stored in large databases kept on servers, and the programs running on the servers generate the requested Web documents “on the fly”, based on the data needed at a given state. Most of these dynamically generated online documents are made only for human consumption, and it is impossible for machines to understand their meaning. Such Web documents are difficult to reuse and to make available to other parties, because they are not permanent but are generated for specific requests, without any well-defined meaning.
2.3 Information Retrieval on the Web
Information retrieval on the Web [15] refers to the act of recovering information from the vast number of online Web documents: getting the desired documents and presenting them to the user. This is the classic and most widely used way of obtaining information from the Web. With this approach, the user does not extract information from a document directly; instead, the user merely picks out some documents from all those available on the Web. The user obtains a document or a set of documents and has to analyze them to find the desired information, if it exists. In effect, only a fraction of the computational power of the machines involved is used to fetch the desired information: the computing systems are responsible only for transferring the document and presenting it to the user. No processing power is used to retrieve directly relevant information through context-aware processes and methods.
The problems associated with retrieving quality information from the Internet are many. We can consider the Internet as a connected undirected graph with many nodes, where the connections are the edges. In this perspective, the nodes are distributed across the world without regard for cultures or time zones, and the idea of a connected undirected graph captures the idea of the Internet elegantly. One problem with traversing the Internet to retrieve information is that the data, like the nodes themselves, is spread across the whole world. The Internet is also changing very fast and the data is volatile: every six months the Internet’s nodes and connections double, in a topology that is not predefined. The data is redundant and stored in an unstructured way, with data on the Web duplicated in many instances across mirror sites, and the quality of the data is often poor. The volume of data to be searched on the Web is growing at an exponential rate. Not all the data is in the same language, because the Web is a reflection of the real world in that it is multicultural and multilingual. New media types are appearing at a fast rate, particularly where audio-visual or multimedia files are concerned. And the content of many Web pages is created dynamically, on demand.
On the Web, unstructured markup languages make it difficult for humans, and even more difficult for machines, to locate and acquire the desired information. The current methods for retrieving information on the Web are browsing and keyword-based searching. Both methods have several limitations.
Browsing: Browsing the Web refers to the act of retrieving a Web document by means of its URI (Uniform Resource Identifier) and displaying it in the local client browser to view its content. The user often has to traverse from link to link in order to reach the desired information, if it is ever reached.
Anybody familiar with the Web knows the drawbacks of looking for information by means of browsing:
• It is very time-consuming.
• It is not always possible to reach the desired information, even though it exists somewhere on the Web.
• It is also very easy to get lost and disoriented following all the links the user might find relevant, suffering from what is called the “lost-in-hyperspace” syndrome.
Keyword Searching: Keyword searching is an easier way to retrieve information than browsing Web documents through links. Keyword searching on the Web refers to the act of looking for information using some words to guide the search. The keywords the user wants to search for are entered into a search engine, which performs the search on the Web cache it has stored and indexed locally. Beforehand, search engines continually traverse all the links available on the Web, caching and indexing every Web document they reach. A search engine searches this reduced copy of the Web by following the links and trying to match the input words with the words found in its index tables. When a match occurs, the links pointed to by the index tables are returned to the user.
Keyword searching is more useful than mere browsing, since the user does not need to know the exact URI of the desired Web document; however, this approach still has some disadvantages:
• The user must be aware of the available search engines and choose the one that best fits his or her needs.
• The keywords entered are those the user considers most relevant to the information sought, which is a very subjective decision.
• The entered keywords have to exactly match the words present in the Web documents; even a slight variation is not tolerated.
• Keyword searching normally returns vast numbers of useless document references/links that the user has to filter by hand.
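The index-table matching and the exact-match limitation described above can be sketched with a toy inverted index, the core data structure behind keyword search engines. The documents and vocabulary here are invented for illustration, and real engines add ranking, stemming and much more.

```python
# Build a toy inverted index: word -> set of document ids
docs = {
    1: "rice pilav recipe with butter and curry",
    2: "chocolate cake recipe",
    3: "curried rice with vegetables",
}
index = {}
for doc_id, text in docs.items():
    for word in text.split():
        index.setdefault(word, set()).add(doc_id)

def search(query):
    """Return ids of documents containing every query word (exact match only)."""
    results = None
    for word in query.lower().split():
        postings = index.get(word, set())
        results = postings if results is None else results & postings
    return sorted(results or [])

print(search("rice recipe"))   # [1]
print(search("curried rice"))  # [3]  ('curry' in document 1 is not matched)
```

Because matching is purely lexical, “curry” and “curried” are unrelated terms to this index; relating them is precisely the kind of context-aware behavior a semantic system aims to provide.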
“Although search engines index much of the Web's content, they have little
ability to select the pages that a user really wants or needs” [66].
CHAPTER 3
SEMANTIC WEB
The Web has dramatically changed the accessibility of electronically available information. Today, the Web contains about 3 billion static documents, accessed by over 500 million users from all around the world [5] [12] [67]. With this huge amount of data, and since the information content is presented primarily in natural language, it has become increasingly difficult to find, access, present and maintain relevant information. A wide gap has thus opened between the information available to tools and the information maintained in human-readable form.
In response to this problem, many new research initiatives and commercial enterprises have been set up to enrich available information with machine-processable semantics. One example of such recent research is the Semantic Web, which aims to provide intelligent access to heterogeneous, distributed information, enabling software products (agents) to mediate between user needs and the information resources available. This support is essential for “bringing the Web to its full potential.” Tim Berners-Lee [66], Director of the World Wide Web Consortium and the inventor of the World Wide Web, foresees a number of ways in which developers can use self-descriptions and other techniques so that context-understanding programs can selectively find what users want. Berners-Lee referred to the future of the current Web as the Semantic Web, an “extended Web of machine-readable information and automated services that amplify the Web far beyond current capabilities”.
The explicit representation of the semantics underlying data, programs, Web documents and all kinds of information-related Web resources will enable a knowledge-based Web that provides a qualitatively new level of service and a new way of processing data. Computing systems and automated services will improve in their capacity to assist humans in achieving their goals by “understanding” more of the information presented on the Web, and thus providing more accurate filtering, categorizing and searching of the information sources available. This process will ultimately lead to an extremely knowledgeable system featuring various specialized reasoning services, thus extending the representational power of the available information. As Berners-Lee summarized [5] [67]: “The first step is putting data on the Web in a form that machines can naturally understand, or converting it to that form. This creates what I call a Semantic Web - a Web of data that can be processed directly or indirectly by machines”.
3.1 Overview of the Semantic Web
The purpose of the new phase in Web technology is to make machines capable of understanding the semantics of the information presented on the Web: to be able to “read” and “understand” the Web as a human being does. For this purpose, many different approaches have been formulated by a large number of researchers, organizations and universities; most of these methods are explained in detail in this thesis.
The Semantic Web is not a separate Web [2] [11]; rather, it can be seen as an extension of the current Web toward meaning. The main difference between the Semantic Web and the Web is that the Semantic Web is supposed to provide machine-accessible meaning for its constructs, whereas on the Web this meaning is provided by external mechanisms. To determine the meaning of a collection of documents, it is necessary to use only the meaning determined by the formal language specifications of the Semantic Web, currently the RDF (Resource Description Framework) model theory and the OWL model theory.
The Semantic Web aims for meaningful and machine-understandable Web
resources, whose information can then be shared and processed both by automated
tools, such as search engines, and by human beings [5] [9]. The consumers of Web
resources, whether automated tools or human beings are referred to, as agents. This
sharing of information between different agents requires semantic mark-up, for
example, an annotation of the Web page with information on its content that is
understood by the agents searching the Web. This kind of an annotation will be given
in some standardized, expressive language (which, e.g., provides predicate logic and
some form of quantification) and will make use of certain terms or classes (like
“Human”, “Plant”, etc.). To make sure that different agents have a common
understanding of these terms, we need ontologies in which these terms are described,
and which thus establish a joint terminology between the agents. Basically, Web
ontology is a collection of definitions of concepts and the shared understanding that
comes from the fact that all the agents interpret the concepts with respect to the same
ontology. Using the same standards will enable the reuse of the defined information.
That is, the information is not annotated for a specific system, however the
annotation relies on some shared standards which makes it possible to be recognized
by different computer systems.
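The role of a shared ontology can be illustrated with a small sketch. All names below (the ontology, the terms, the URL) are invented for illustration; the point is only that two independent agents recover the same meaning because both resolve an annotation against one common vocabulary.

```python
# Toy shared vocabulary (ontology): term -> agreed meaning.
SHARED_ONTOLOGY = {
    "Human": "an individual person",
    "Plant": "a living organism that performs photosynthesis",
}

# A semantic annotation attached to a hypothetical Web resource.
annotation = {"term": "Plant", "resource": "http://example.org/page1"}

def interpret(agent, annotation, ontology):
    """Resolve the annotation's term against the shared vocabulary."""
    meaning = ontology[annotation["term"]]
    return (agent, annotation["resource"], meaning)

# Both agents arrive at the same meaning because they commit to one ontology.
a = interpret("agent-A", annotation, SHARED_ONTOLOGY)
b = interpret("agent-B", annotation, SHARED_ONTOLOGY)
print(a[2] == b[2])  # True
```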
What the Semantic Web is NOT
The Semantic Web is not Artificial Intelligence: The concept of machine-
understandable documents does not imply some magical artificial intelligence which
allows machines to comprehend human words and fully understand them as human
beings do [16]. The Semantic Web only denotes a machine's ability to solve a well-
defined problem by performing well-defined operations on existing well-defined
data. Instead of asking machines to decipher people's language, it involves asking
people to make the extra effort so that machines are able to process the data in
some specific way.
Even though it is simple to define information with languages such as RDF, at a
level with the power of a Semantic Web these languages will be complete languages,
capable of expressing paradoxes and tautologies. It will also be possible to phrase
questions whose answers would normally require a machine to search the entire Web
and take an unpredictable amount of time to find. This should not discourage us from
making these languages complete. Each mechanical application relying on such
languages will use a schema to restrict its use to an intentionally limited language.
However, when links are made between Webs relying on such languages, the result
will be the expression of a vast amount of information. Clearly, because the Semantic
Web must be able to include all kinds of data to represent the world, the languages
must be fully expressive.
A Semantic Web will not require every application to use expressions of arbitrary
complexity: Even though the languages used to define information allow expressions
of arbitrary complexity and computability, applications which generate semantically
defined information will in practice be limited to generating simple expressions such
as access control lists, privacy preferences, and search criteria.
A Semantic Web will not require proof generation to be useful: proof validation
will be enough: Although access control on Web sites involves validation of a
previously prepared proof, there is no requirement for sites to answer an arbitrary
question or to find and construct a valid proof. It is well known that searching for an
answer to an arbitrary question and generating a proof for it is typically an
intractable process, as are many other real-world problems, and a Semantic Web
language does not require this (unsolvable) problem to be solved in order to be
useful.
A Semantic Web is not an exact rerun of a previous failed experiment: Until now,
other concerns have been raised against the Semantic Web concept, such as its
relation to Knowledge Representation systems. Such systems have tried, more or
less, to achieve results similar to those the Semantic Web concept is aiming for;
KIF [97] and CYC [98] [99] are some examples of such Knowledge Representation
systems. However, the success or failure of such systems should not be a threshold or
limit for the Semantic Web concept/project. A more constructive approach would be
to feed the Semantic Web with their design experience, and the Semantic Web may
in turn provide a source of data for reasoning engines developed in similar projects,
such as those that utilize Knowledge Representation systems.
3.2 Information Retrieval with the Semantic Web
Machine to Human: The addition of semantic annotations to Web documents
would improve information retrieval in various ways yet unimagined. As Tim Bray
said, search engines "do the equivalent of going through the library, reading every
book, and allowing us to look things up based on the words found in some text" [66].
If more descriptive metadata were available, one would not, as when using Web
search engines, have to rely on the popularity of a resource as an assurance of its
relevancy. How can we be sure that frequently accessed results for a query are
actually relevant to it? We cannot be sure that such relations always hold.
Librarians, who often act as human mediators between the complex relations of
structured information and the often unformulated queries of the information seeker,
know that information retrieval is often incomplete even when information is well
organized. When information is organized badly or not at all, the consequence is a
failure to retrieve it.
Human to Machine: Tim Berners-Lee discussed in [42] how content-aware
“agents” using semantic information could be used to conduct research into
everyday tasks such as investigating health care provider options, prescription
treatments, or available appointment times. Each of these tasks is now usually
conducted by a human researcher assigned to it. If one is left with the task of
planning a trip, he/she must investigate the best price for an airplane ticket (even
though some of this information is already collected) and
match the information about available flights with available times from a personal
calendar. This sort of research is conducted daily and one takes for granted the
mental and representational systems needed to ask a question, investigate an answer,
pull related information together, select the information which is relevant to the
inquiry and initiate another set of actions based on this selection.
Researchers on artificial-intelligence [68] [69] have been working on methods to
automate these kinds of tasks and processes for many years. Such researchers have
developed several approaches that in the future may be applicable to the Semantic
Web.
3.3 Semantic Web Tools and Languages
During the last few years, several ontology languages [4] [17] [21] [71] [72] have
been developed. All of these languages are based on XML [23] syntax, such as XOL
[25] (Ontology Exchange Language), SHOE [26] (Simple HTML Ontology
Extension) which was previously based on HTML, and OML (Ontology Markup
Language), whereas RDF [27, 28, 29] (Resource Description Framework) and RDFS
[30] (RDF Schema) are languages created by W3C (World Wide Web Consortium)
group members. Two additional languages are being built on top of the union of RDF
and RDF Schema with the objective of improving its features; these are OIL
(Ontology Inference Layer) and DAML+OIL [32] (Darpa Agent Markup Language).
Semantic Web languages such as XML, RDF, RDFS, DAML+OIL, OWL [33,
34, 35] are used to organize, integrate and navigate the Web; at the same time
allowing content documents to be linked and grouped in a logical and relevant
manner. With the information environment that these standards can create, users can
search and browse information resources in an intuitive way with the help of content-
aware machines/computing systems.
All of these languages oriented toward creating the Semantic Web are structured
languages, and with this feature they can carry meaning besides giving structure to
the text. They also differ in their characteristics. Some of them are relatively new,
and the newer languages aim to progress beyond the previous ones, evolving and
improving their characteristics to support the Semantic Web concept.
The semantic power reached is at different levels: some languages provide
meaning to the text/information; others go further and also make assertions and the
inference of knowledge and facts possible.
Some important languages in chronological order [17] [18] [20]:
• Standard Generalized Markup Language (SGML)
• eXtensible Markup Language (XML)
• Resource Description Framework (RDF)
• Darpa Agent Markup Language - Ontology Inference Layer (DAML+OIL)
• Web Ontology Language (OWL)
In the context of the Semantic Web, a major effort is devoted to the realization of
machine-processable semantic meaning, expressed in meta-models such as RDF,
OIL, OWL and DAML+OIL, and based on shared ontologies. Still, these approaches
rely on common, mergeable ontologies to which existing information sources can be
related by proper annotation. This is an extremely important development, but its
success will heavily rely on wide standardization, acceptance of the different
languages and adoption of common ontologies or schemas.
In the Semantic Web, all the necessary information resources (data, documents
and programs) will be made available along with various kinds of descriptive
information and annotations, i.e., metadata. Clearly defined knowledge about the
meaning, usage, accessibility or quality of Web resources will considerably facilitate
automated processing of all the available Web content/services. The Semantic Web
will allow both human beings and machines to query the Internet as if it were a huge
database.
To allow the realization of such a concept, besides the Web languages, different
tools also have to be developed in order to infer information from the Web. Inference
does not only depend on the languages but also on the different tools that are
currently being developed around the languages.
3.3.1 SGML (Standard Generalized Markup Language)
SGML is a system for organizing and tagging the elements of a document. It was
developed and standardized by the International Organization for Standardization
(ISO) in 1986 [70].
3.3.2 XML (eXtensible Markup Language)
XML [14] [19] is a meta-language for defining application-specific markup
tags, and it is the universal format, also proposed by the W3C, for structuring
documents and data on the Web. The main contribution of XML is
providing a common and communicable syntax for Web documents. XML itself is
not an ontology language, but XML Schemas [24], which define the structure,
constraints and semantics of XML documents, can be used to specify ontologies.
However, since XML Schema was created for the verification of XML documents,
and its modeling primitives are application oriented rather than concept oriented,
XML Schema will not be considered an ontology language here.
The only reasonable interpretation is that XML code contains named entities with
sub-entities and values; that is, every XML document forms an ordered, labeled tree,
which is both XML's strength and its weakness. It is possible to
encode all kinds of data structures in an unambiguous syntax, but XML does not
specify the data's use and semantics. The parties that use XML for their data
exchange must agree beforehand on the vocabulary, its use and its meaning.
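This limitation can be made concrete with a small sketch. The two documents below are invented: they encode the same fact with different tag vocabularies, and XML parsing recovers each labeled tree faithfully, but nothing in XML itself tells a machine that the two tags mean the same thing.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents stating the same fact in different vocabularies.
doc_a = "<article><author>Zeynep</author></article>"
doc_b = "<paper><writer>Zeynep</writer></paper>"

# XML gives us the ordered, labeled tree of each document...
tree_a = ET.fromstring(doc_a)
tree_b = ET.fromstring(doc_b)

print(tree_a.find("author").text)  # Zeynep
print(tree_b.find("writer").text)  # Zeynep
# ...but carries no semantics relating <author> to <writer>; the parties
# must agree on the vocabulary beforehand.
```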
Why Meta Data Is Not Enough: XML metadata is a form of description of
available data within some document or information. It describes the purpose or
meaning of raw data via a text format to more easily enable information exchange,
interoperability, and application/platform independence [5]. As a description, the
general rule is accepted as “more is better.” Metadata increases the usability and
granularity of the defined data. The way to think about the current state of metadata
is that words (or labels) are attached to data values in order to describe them. While
the evolution of metadata will not follow natural-language description, the analogy
is a good one: words alone are not enough. The motivation for
providing richer data description is to move data processing from being static and
mechanistic to being dynamic and adaptive.
For example, we may be enabling our systems to respond in real time to a
location-aware cell phone customer who is walking in a store outlet. If a system
could match consumers’ needs or past buying habits to current sale merchandise, the
revenue would increase. Additionally, the computers should be able to support that
sale with just-in-time inventory by automating the supply chain with its partners. The
general rule is: The more computers understand, the more effectively they can handle
complex tasks.
All the possible ways a semantically aware computing system can drive new
business and decrease operating costs have not yet been invented. However, to
get there, we must push beyond simple metadata modeling to knowledge modeling
and standard knowledge processing. There are three emerging steps beyond simple
metadata: semantic levels, rule languages, and inference engines. These are the
backbone of the Semantic Web.
3.3.3 RDF (Resource Description Framework)
RDF, also proposed by the W3C [14] [19], is a framework for the encoding,
exchange and reuse of structured metadata; it provides a standard form for
representing metadata in XML. The RDF data model consists of three object
types:
Resources: All things being described by RDF expressions are called resources.
A resource could be an entire Web document, such as the well-known HTML
document "http://www.w3.org/Overview.html". A resource may be a
part of a Web page; e.g. a specific element within the document source of an HTML
or XML Web document. A resource may also be a large collection of Web
documents; e.g. an entire Web site. A resource could also be a Web object not
directly presented on the Web; e.g. a printed book.
Properties: A property is a specific aspect, characteristic, attribute, or relation
used to describe a resource. Each property has a specific meaning and defines its
permitted values, the types of resources it can describe, and its relationships with
other properties. The RDF specification does not address how the characteristics of
properties are expressed; for such information, one should refer to the RDF Schema
specification.
Statements: A specific resource together with a named property and the value of
that property for that resource is defined as an RDF statement. These three individual
parts of a statement are called the subject, the predicate, and the object of that
statement respectively. The object of a statement (i.e., the property value) can be
another resource or a literal value; that is, a resource (specified by some URI), a
simple string, or any other primitive data type defined by XML. In RDF terms, a
literal may contain XML markup, which is, however, not further evaluated by the
RDF processor.
RDF does not have any specific mechanisms to define relationships between
these object types, but the RDF Schema (RDFS) Specification Language does.
Although RDFS is not primarily intended for ontology specification, it can be
used directly to describe ontologies. RDFS provides a standard set of modeling
primitives for defining ontology (class, resource, property, “is a” and “element-of”
relationships etc.) and a standard way to encode them into XML. But, since axioms
cannot be defined directly, RDFS has rather limited expressive power. Also, the
relationship between ontologies and RDF(S) is much closer than that between
ontologies and XML.
Basically, the RDF data model consists of statements about resources, encoded as
object-attribute-value triples. The objects are resources, the attributes are properties
and the values are resources or strings. For example, to state that “Zeynep” is the
author of the article at a specific URL (Uniform Resource Locator), one would use
the triple: http://www.somewhere.com/#article, has author, “Zeynep”. Attributes,
such as “has author” in the previous example, are called properties.
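The triple model described above can be sketched in a few lines of code. The URL and the author name mirror the example in the text; the second triple (author “Ali”) is invented here purely to show that a predicate may have several values.

```python
# Minimal sketch of the RDF data model: statements as
# (subject, predicate, object) triples.
triples = [
    ("http://www.somewhere.com/#article", "has author", "Zeynep"),
    ("http://www.somewhere.com/#article", "has author", "Ali"),
]

def objects(triples, subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]

print(objects(triples, "http://www.somewhere.com/#article", "has author"))
# ['Zeynep', 'Ali']
```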
3.3.4 RDFS (RDF Schema)
The important feature of RDFS where ontologies are concerned is that it
expresses class-level relations that describe acceptable instance-level relations.
RDF Schema is a language layered on top of the RDF language. This layered
approach has been presented by the W3C organization and Tim Berners-Lee as the
“Semantic Web Stack” of layers of different languages or concepts all related to each
other [30] [71] [72]. The base layer of the stack consists of the concepts of universal
identification (URI) and a universal character set (Unicode). Above those concepts,
the XML syntax (elements, attributes, and angle brackets) is layered, together with
namespaces to avoid vocabulary conflicts, so that names are only required to be
unique within their local domain. The layer above XML consists of the triple-based
assertions of the RDF model and syntax discussed in the previous section. If a triple
is used to denote a class, a class property, and a value, it becomes possible to create
class hierarchies for the classification and description of different objects. This is the
goal of RDF Schema.
The data model expressed by RDF Schema is the same data model used by
object-oriented paradigms, e.g. programming languages like Java. The data
model for RDF Schema allows creating classes of some information within a
domain. A class is defined as a group of things with distinct features and with some
common characteristics. In object-oriented programming (OOP), a class is defined as
a template or a type definition for an object (instance) composed of characteristics
(also called data members or fields) and behaviors (also called methods or functions).
An object is a single instance of a specific class. Object-oriented languages also
allow classes to inherit characteristics and behaviors from a parent class (also called
a super class). All these concepts are quite similar to the model used by
RDF Schema.
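The class hierarchies RDF Schema makes possible can be sketched as follows. This is a hedged illustration in the spirit of rdfs:subClassOf, not RDFS itself; the class names are invented, and the check follows subclass links transitively the way an RDFS-aware reasoner would.

```python
# Toy class hierarchy: child class -> parent class (invented names).
subclass_of = {
    "WhiteWine": "Wine",
    "Wine": "Beverage",
}

def is_subclass(cls, ancestor, hierarchy):
    """Follow subclass links transitively, as an RDFS reasoner would."""
    while cls in hierarchy:
        cls = hierarchy[cls]
        if cls == ancestor:
            return True
    return False

print(is_subclass("WhiteWine", "Beverage", subclass_of))  # True
```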
The ontology layer resides above RDF Schema. Above ontologies, logic rules
can be added about the things defined in an ontology. A rule language will make it
possible to infer new knowledge and make decisions. Additionally, the rules layer
provides a standard way to query and filter data from RDF. The rules layer is a sort
of an “introductory logic” capability, while the actual logic framework will be
“advanced logic.” The logic framework allows formal logic proofs to be shared.
Lastly, with such proofs, it will be possible to establish a trust layer for levels of
application-to-application trust. This “Web of trust” forms the third and final Web in
Tim Berners-Lee’s three-part vision expressed as collaborative Web, Semantic Web
and Web of trust.
3.3.5 OIL (Ontology Inference Layer)
OIL was developed in the OnToKnowledge Project [17] [19] [41], and is both a
representation and an exchange language for creating Web ontologies. The
language combines primitives from frame-based languages with the formal
semantics and reasoning services of description logics. To enable the use of OIL
on the Web, it is based on the W3C standards XML and RDF(S). The ontology
description is divided into three different layers: object level (concrete instances),
20
first meta-level (ontological definitions) and second meta-level (describing features
of the ontology). The OIL ontology language provides definitions for classes and
class relations, and a limited set of axioms enabling the representation of different
classes and their properties. Relations (also called slots) are treated like first-class
citizens, and can be represented in different hierarchies. Although it has some
limitations, OIL can provide precise semantic meaning which will enable reasoning
systems to process the defined information effectively.
As mentioned in the above paragraph, OIL is built on top of RDF(S), and has the
following layers: Core OIL groups the OIL elements/primitives that have a direct
mapping to RDF(S) elements/primitives; Standard OIL is the complete OIL model
with all its features, using more primitives than the ones defined in RDF(S); Instance
OIL adds instances of different concepts, classes and roles to the previous model;
and Heavy OIL has been designed as the layer for future extensions of the OIL
language. OILEd, Protégé-2000, and WebODE are some powerful ontology editors
that can be used to author OIL ontologies (as well as other Web ontologies). Another
feature of OIL is that its syntax can also be expressed in an ASCII form that is not
XML compliant.
3.3.6 DAML+OIL (DARPA Agent Markup Language - OIL)
DAML and OIL are XML- and Web-based languages that support the
development of the Semantic Web.
DAML+OIL [11] is a descriptive semantic markup language for Web resources
which is built on top of earlier defined languages such as RDF and RDF Schema, and
extends these languages with richer modeling primitives enabling reasoning systems
to process it more effectively. DAML+OIL was developed by the Defense Advanced
Research Projects Agency (DARPA) [20] under the DARPA Agent Markup
Language (DAML) Program.
With DAML+OIL it is possible to use description logic to describe data, making
the information more expressive and powerful and enabling it to be processed by
reasoning systems. In this way, not only the explicitly given data but also new facts
and conclusions about that data become available. DAML+OIL is well suited to this
purpose because of the expressiveness of its description logic; in effect, it is a
description logic language disguised in an XML format.
DAML extends RDFS in the following ways:
• Support of XML Schema data types such as dates, integers, decimals, etc.,
rather than just string literals.
• Restrictions on properties, such as cardinality constraints.
• Definition of classes by enumeration of their instances.
• Definition of classes in terms of other classes and properties. To enable
definition from other classes, different expressions have been defined, such
as unionOf, intersectionOf, complementOf, hasClass and hasValue, some of
which have their roots in classic set theory.
• Ontology and instance mappings (sameClassAs, samePropertyAs,
sameIndividualAs, differentIndividualFrom), permitting translation between
ontologies.
• Additional hints to reasoning systems, such as disjointWith, inverseOf,
TransitiveProperty and UnambiguousProperty.
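The set-theoretic roots of the class constructors mentioned above can be shown directly. The sketch below maps unionOf, intersectionOf and complementOf onto ordinary set operations; the instances and class extensions are invented for illustration.

```python
# Invented class extensions (sets of instances).
red_things = {"cherry", "rose", "merlot"}
wines = {"merlot", "chardonnay"}
universe = red_things | wines | {"water"}

union_of = red_things | wines          # cf. unionOf
intersection_of = red_things & wines   # cf. intersectionOf
complement_of = universe - wines       # cf. complementOf (relative to universe)

print(sorted(intersection_of))   # ['merlot']
print("water" in complement_of)  # True
```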
DAML is not completely developed yet. Even though it was once the ontology
language recommended by the World Wide Web Consortium, a new project, the
Web Ontology Language (OWL), has been developed to replace DAML. The OWL
project has removed some of the requirements specified for the DAML language, as
rules, queries and services are still under development.
Description Logics
Description logics (DLs) [9] [39] are a family of knowledge representation
languages that can be used to represent the knowledge of an application domain and
are very well suited to giving structure to information. Description Logics form a
subset of First-Order Logic that is non-functional and does not allow explicit
variables; they trade expressiveness for greater decidability when processed by
inference procedures. Description Logics differ from predecessors such as semantic
networks and frames in that they are equipped with a formal, logic-based semantics.
High quality Web ontologies are necessary for the Semantic Web to be
successful, and their construction, integration, and evolution is greatly dependent on
the availability of a well-defined semantics and powerful reasoning systems. Since
DLs provide these aspects, they should be ideal candidates for creating and
developing ontology languages. That much was already clear ten years ago, but at
that time, there was a fundamental mismatch between the expressive power and the
efficiency of reasoning that DL systems provided, and the expressivity and the large
knowledge bases that ontologists needed. Through the basic research in DLs in the
last 10 to 15 years, the gap between the needs of ontologists and the systems that DL
researchers provide has finally become narrow enough to build stable bridges.
3.3.7 OWL (Web Ontology Language)
OWL Ontology: Ontology is a term borrowed from philosophy which refers to
the science of describing the kinds of entities in the world and how they are related to
each other. An ontology created with OWL may include descriptions of classes, their
instances and properties. Given such an ontology, the formal semantics of OWL
specifies how to derive its logical meaning not given explicitly, i.e. facts that are not
present in the ontology, but derived by the semantics. These derivations may be
based on a single OWL document or multiple distributed documents that have been
combined with OWL mechanisms allowing such extendable ontologies. The Web
Ontology Language is developed and produced by the W3C Web Ontology Working
Group (WebOnt).
The Web Ontology Language OWL [11] [38] is a semantic markup language for
publishing, extending and sharing ontologies through the Web. OWL is developed as
a vocabulary extension of RDF and is derived from the DAML+OIL Web ontology
language, adding some extra features and discarding some of the specifications
intended for DAML+OIL; it is a revision of DAML+OIL that incorporates lessons
learned from the design and application of that language.
OWL can be used to explicitly represent the exact semantics of classes within
some domain and the relationships between those classes (and instances). OWL has
more expressive semantic power than XML, RDF, and RDFS, and thus goes beyond
these languages’ ability to represent machine-readable content on the Web.
In comparing OWL to XML and XML Schema, two points must be
mentioned:
• An ontology differs from an XML Schema in that an ontology is a
knowledge representation, not a message format. Most industry-backed Web
standards consist of a combination of message formats and protocol
specifications. These formats have been given an operational semantics,
such as, "Upon receipt of this PurchaseOrder message,
transfer Amount dollars from AccountFrom to AccountTo and ship the
product purchased." That is, each step in the semantics is precisely
defined. However, this kind of specification is not designed to support
reasoning outside the transaction context; it is fixed on the well-defined steps.
For example, in general there is no mechanism to conclude
that because the Product is a type of Chardonnay it must also be a white
wine. This kind of reasoning and conclusion is essential in the Semantic Web.
• One advantage of OWL ontologies is the availability of different tools that
can reason about them (for example Racer, which reasons over OWL
ontologies and derives new facts from given statements). Such tools
provide generic support that is not specific to a particular domain, which
would definitely be the case if one were to build a system to reason about a
specific industry-standard XML Schema. Developing a useful reasoning
system is not a simple task to accomplish; developing an ontology is much
more tractable and feasible.
The OWL language provides three increasingly expressive sublanguages
designed for different users in specific communities.
OWL Lite: OWL Lite is targeted at users who need only simple constraint
features and classification hierarchies. For example, even though OWL Lite supports
cardinality constraints, the cardinality values are restricted: only the values 0 and 1
are allowed. It is much simpler to provide tool support for OWL Lite than for its
more expressive relatives, which will allow easy migration to OWL Lite from the
other ontology languages in use.
OWL DL: OWL DL supports users who want maximum expressiveness without
losing the computational completeness (all entailments are guaranteed to be
computed) and decidability (all computations will finish in finite time) of reasoning
systems. OWL DL includes all OWL language constructs, with restrictions such as
type separation (a class cannot also be an individual or property; a property cannot
also be an individual or class) that keep definitions distinct. It is named
OWL DL because of its correspondence to Description Logics [39], a field of research
that has studied a decidable fragment of first-order logic. OWL DL was designed to
have desirable computational properties for reasoning systems.
OWL Full: OWL Full is targeted at users who want maximum expressiveness
and the syntactic freedom of RDF, with no computational guarantees. Decidability
and completeness properties are not preserved as they are in OWL DL, and type
separation is not as strict. For example, in OWL Full a defined class can be treated
simultaneously as a collection of different individuals and as an individual in its
own right. Another important difference from OWL DL is that in OWL Full an
owl:DatatypeProperty can be marked as an owl:InverseFunctionalProperty. OWL
Full allows an ontology to augment the meaning of the pre-defined (RDF or OWL)
vocabulary. It is unlikely that any reasoning software will be able to support every
feature of OWL Full.
Each of the sublanguages mentioned above is an extension of its simpler
predecessor, both in what can be legally expressed in the ontology and in what can
be validly concluded from it. The following relations hold, but their inverses do
not:
• Every legal OWL Lite ontology is a legal OWL DL ontology.
• Every legal OWL DL ontology is a legal OWL Full ontology.
• Every valid OWL Lite conclusion is a valid OWL DL conclusion.
• Every valid OWL DL conclusion is a valid OWL Full conclusion.
Ontology developers should consider which of these species best suits their needs
when choosing an OWL sublanguage. The choice between OWL Lite and OWL DL
depends on whether users need the more expressive restriction constructs provided
by OWL DL. Reasoning systems for OWL Lite will have desirable computational
properties, while reasoners for OWL DL will be subject to higher worst-case
complexity because of its greater expressiveness. The choice between OWL DL and
OWL Full mainly depends on the extent to which users require the meta-modeling
facilities of RDF Schema (i.e., defining classes of classes). Reasoning support is less
predictable for OWL Full than for OWL DL.
Moreover, OWL makes an open-world assumption; that is, descriptions of
resources are not confined to a single file or scope. While class C1 may be defined
originally in ontology O1, it can also be extended in other ontologies. The
consequences of these additional propositions about C1 are monotonic: new
information cannot retract previous information. New information from reasoning
can be contradictory, but facts and entailments can only be added, never deleted.
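This monotonic behaviour can be sketched in a few lines. The triples and ontology names below are invented; the point is only that extending a class in another ontology grows the knowledge base without ever deleting a previously asserted fact.

```python
# A toy knowledge base of asserted triples (invented).
facts = {("C1", "definedIn", "O1")}

def extend(kb, new_fact):
    """Monotonic update: the result always contains every old fact."""
    return kb | {new_fact}

# Another ontology adds a proposition about C1; nothing is retracted.
extended = extend(facts, ("C1", "extendedIn", "O2"))
print(facts <= extended)  # True: old facts survive every extension
```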
It is the responsibility of the ontology designer to take the possibility of such
contradictions into consideration. It is expected that tool support will help detect
such cases.
In order to write an ontology that can be interpreted unambiguously and used by
software agents, a syntax and formal semantics for OWL is required. In addition,
OWL is a vocabulary extension of RDF [37].
CHAPTER 4
ONTOLOGY, ONTOLOGY EDITORS AND
QUERY LANGUAGES
In short, an ontology [44] [45] [47] is a conceptualization-based specification of
classes or, in other words, “things”. An ontology is a detailed description of certain
concepts and the relations among them, where the concepts are defined within a
specific domain. The usage of an ontology is consistent with this definition, since it
is broken into simpler sets of such concept definitions and relations when being
processed. Even though the word “ontology” has its origin in philosophy, it is
understood here in a completely different sense.
Ontologies are designed for the purpose of defining knowledge and reusing and
sharing it effectively. An ontology is a formal definition of an ontological
commitment, so that different parties can participate while relying on the same
definitions and vocabularies; it is a set of definitions written using a formal
vocabulary. This is the main approach used to specify a conceptualization,
because it has properties that enable AI processing systems to share knowledge
among themselves. In other words, an ontological commitment is a kind of
agreement among different domain specifications to use a specific vocabulary
when defining concepts. Processing systems are built so that they can
participate in such commitments; that is, they can be "connected" to some
ontology without any conflicts with respect to the definitions and the
vocabularies used, and can share knowledge with other systems.
Given a specific domain, an ontology defined for that domain is the basis for
representing knowledge about that domain. An ontology enables the definition of
a vocabulary in order to express the knowledge of some domain. Without the
definition of such a vocabulary it is not possible to share knowledge among
different systems/agents; simply said, there would be no common ground for such
systems to exist and share knowledge. A domain is a specific area of a subject
or an area of knowledge, such as medicine, economy, or a specific field of
research. Ontologies are used by systems/agents such as databases, application
programs, or anything else that needs to share knowledge. Ontologies are built
up of basic concepts and the different relations between them. The definitions
of these concepts and relations are computer usable, so that computing systems
can process them.
Simply said, defining an ontology is similar to defining a data set with all
its properties so that other programs can use this data. Different computing
systems, such as domain-independent applications and software agents, use
ontologies and knowledge bases built on top of a set of ontologies.
Class definitions are the most common approach when defining some domain in an
ontology; they are suitable for defining and describing the different concepts
within a domain. For example, a class defining a pizza represents all the
different pizza instances that exist: any pizza is an instance of the class
defining and describing a pizza. Classes can have inheritance relations between
them, enabling the definition of more specific classes from a given class;
defining more general classes is also possible. For example, we can have
subclasses of the class pizza such as "spicy pizza" and "non-spicy pizza",
where the class pizza is a superclass of these two classes.
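The pizza hierarchy above can be sketched in a few lines; the class and instance names are illustrative only, and the dictionaries stand in for real ontology class definitions.

```python
# Toy class hierarchy for the pizza example; names are illustrative only.
subclass_of = {
    "SpicyPizza": "Pizza",
    "NonSpicyPizza": "Pizza",
    "Pizza": "Food",
}

# Individuals and their (most specific) class.
instance_of = {"margherita1": "NonSpicyPizza", "diavola7": "SpicyPizza"}

def is_a(cls, ancestor):
    # Walk the superclass chain upward until the ancestor (or the top) is hit.
    while cls is not None:
        if cls == ancestor:
            return True
        cls = subclass_of.get(cls)
    return False

def is_instance(individual, cls):
    # An individual is an instance of its class and of all its superclasses.
    return is_a(instance_of[individual], cls)

print(is_instance("diavola7", "Pizza"))  # True: every spicy pizza is a pizza
```

The point of the sketch is that instance membership follows the subclass chain: an instance of "spicy pizza" is automatically an instance of pizza and of food.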
An ontology supports software agents and, in general, all computer systems
that need to share and reuse domain knowledge. The important features of an
ontology are listed below [47]:
• Ability to reuse domain knowledge.
• Making domain assumptions explicit.
• Separation of operational knowledge and domain knowledge.
• Sharing the formal definitions and vocabularies when describing some
concept.
• Analysis of domain knowledge.
There are many contradicting definitions of ontologies, especially in the AI
world. An ontology is not directly a knowledge base, although there is a thin
line between the definitions of these two concepts. Definitions of knowledge
for a domain, the classes, and the instances of these classes constitute a
knowledge base. An ontology, on the other hand, is not much concerned with the
individual instances. For example, for an ontology the number of spicy pizzas
is not important; rather, the definition of a pizza is essential. The
definition of knowledge is what ontologies are more concerned about.
What can ontologies be used for?
Below is a list of major use cases of different ontologies identified by the Web
Ontology Working Group at W3C [16] [33] [47].
• Controlled vocabulary
• Web site or document organization and navigation support
• Browsing support
• Search support (semantic search)
• Generalization or specialization of search
• Sense "disambiguation" support
• Consistency checking (use of restrictions)
• Auto-completion
• Interoperability support (information/process integration)
• Support validation and verification testing
• Configuration support
• Support for structured, comparative, and customized search.
How are ontologies different from relational databases?
Although databases and ontologies have some similarities, they differ in many
important aspects. First of all, an ontology is not storage for data but a
defining model for the data, whereas a relational database is a data
repository. An ontology can be used as a filter or a framework to access and
manipulate data, while a database can be used to store the different data
instances defined by the ontology. Another important difference is querying.
When making queries against a relational database, the returned data will be
the same data stored previously, just matching some conditions. However, when
making a query against an ontology together with some reasoning process, the
returned data can include inferred data which was not stored previously but
generated from facts represented by the ontology. In ontologies, queries can
also be made over specific relations, which is not possible with ordinary
relational databases.
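The difference can be sketched with a toy example (illustrative names, subClassOf triples only): querying the asserted triples returns only stored data, while the same query after a small reasoning step (the transitive closure of subClassOf) also returns inferred triples.

```python
# Asserted triples: what a "database-style" query can return.
asserted = {
    ("SpicyPizza", "subClassOf", "Pizza"),
    ("Pizza", "subClassOf", "Food"),
}

def infer(triples):
    # A tiny reasoning step: transitive closure of subClassOf.
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        for a, _, b in list(closed):
            for c, _, d in list(closed):
                if b == c and (a, "subClassOf", d) not in closed:
                    closed.add((a, "subClassOf", d))
                    changed = True
    return closed

# Query: "all superclasses of SpicyPizza".
stored = {t for t in asserted if t[0] == "SpicyPizza"}
entailed = {t for t in infer(asserted) if t[0] == "SpicyPizza"}
print(entailed - stored)  # an inferred triple that was never stored
```

The triple ("SpicyPizza", "subClassOf", "Food") was never asserted, yet the ontology-style query returns it; a relational database would return only the stored row.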
How are ontologies different from object-oriented modeling?
An ontology is also different from the object-oriented paradigm, even though
they have a lot in common, especially when it comes to modeling real life with
class definitions. First of all, the whole concept of ontologies has its
theoretical roots in logic; because of this, ontologies allow reasoning systems
to perform automated reasoning on the knowledge represented by the ontology.
Another important difference is the definition of properties. In an ontology,
properties are treated as first-class citizens, while in the object-oriented
paradigm this is not true: in the object-oriented world, properties are
internal to class definitions. In an ontology it is possible to define multiple
inheritance, which is generally not the case in the object-oriented paradigm,
where single inheritance between classes is often enforced to avoid overlapping
method signatures defined in different superclasses participating in a
multiple-inheritance relationship.
Ontologies allow property inheritance (properties themselves can form
hierarchies), which is not possible with object-oriented modeling. While
ontologies allow user-defined relations between different classes,
object-oriented modeling restricts relations to the class/subclass concept.
However, because of the wide acceptance and use of object-oriented modeling and
UML, they are accepted as practical specifications when modeling ontologies.
But because of the lack of logic capabilities in the object-oriented modeling
approach, these two concepts cannot be fully combined productively as they are
defined today. Currently there is an on-going effort to add logic capability to
object-oriented modeling, represented by OCL (the Object Constraint Language).
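The first-class treatment of properties can be sketched as follows; the property names and the simple sub-property inheritance rule are illustrative assumptions, not any particular language's semantics.

```python
# Properties as first-class objects: each has its own identity, domain,
# range, and position in a property hierarchy, independent of any class.
# All names here are illustrative only.
properties = {
    "hasIngredient": {"domain": "Recipe", "range": "Ingredient",
                      "subPropertyOf": None},
    "hasMainIngredient": {"domain": "Recipe", "range": "Ingredient",
                          "subPropertyOf": "hasIngredient"},
}

def entails(fact):
    # Property inheritance: asserting a sub-property also entails every
    # super-property along the chain.
    s, p, o = fact
    derived = {fact}
    while properties[p]["subPropertyOf"]:
        p = properties[p]["subPropertyOf"]
        derived.add((s, p, o))
    return derived

print(sorted(entails(("pizza1", "hasMainIngredient", "dough"))))
```

In object-oriented code a property exists only inside one class definition; here the property carries its own metadata and participates in inheritance on its own.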
Some important aspects of ontologies are explained below [46]:
Kinds of Ontologies: Ontologies may differ with respect to different aspects
such as their implementation, content, level of description and the structure of
knowledge modeling.
Level of description: Ontologies can be built in several ways, and the same
knowledge domain can be described in different ways. There is no unique
perception of a knowledge domain that results in a single specific description;
it depends entirely on the practitioners. The vocabularies, terms, and
taxonomies used can be given distinguishing properties, and these properties
make it possible to define new concepts that have named relationships with
other concepts.
Conceptual scope: The scope and purpose of the concepts can also differ
between ontologies. The clearest difference is between ontologies modeling
specific fields of knowledge, such as medicine, and more high-level ontologies
describing the basic concepts and relationships used when domain knowledge is
expressed in natural language.
Instantiation: All ontologies have a terminological component, which is
analogous to the relationship between an XML document and its schema. This
terminological component defines the vocabulary and structure of the domain the
ontology is intended to model. The second part, called the asserted part,
populates the ontology with individual instances that are created on the ground
established by the vocabulary and structure of the ontology. This second part
can be separated from the ontology implementation and maintained in a knowledge
base whose access is controlled by the ontology itself. However, whether an
instance is treated as an individual or as a concept is entirely determined by
the specific way the ontology is defined.
Building Ontologies: An ontology can be built in several ways, depending on
the practitioners and the domain to be modeled. Below is a list of the
different ontology-building steps.
1. Acquiring domain knowledge: Assembling all the information resources that
will define the consistent terminology used to formally describe the things
in a given domain. This information, with its concepts and relations, must be
collected so that it can be described in a chosen language.
2. Organization of the ontology: Designing the overall conceptual structure of
the domain. This involves identifying the domain's specific concepts and
properties, the relationships between the concepts, and the concepts that
have individual instances.
3. Building detailed descriptions for the ontology: Adding concepts,
properties, relations and individuals according to the needs of the domain
being modeled.
4. Ontology verification: Checking for inconsistencies among the ontology
elements, such as their syntactic, logical, and semantic properties. This can
also be based on automatic classification that defines new concepts from
existing concepts, class relations, and properties.
5. Ontology Commitment: Final verification of the ontology and later
commitment of the ontology by deploying it into a target environment.
Why are ontologies important in computing?
Building systems that rely on ontologies shows great potential to make
software more efficient, adaptive, and intelligent. It is one of the most
promising areas of Web technology and may enable the next breakthrough in the
Web. It is not yet widely deployed, but it has already been accepted by some
industries. For example, parts of the medical industry are heavily using
ontologies and contributing to their development; the medical community has
produced Protégé [50], a powerful editor for the management and development of
ontologies. However, ontologies are still not used by the majority of
mainstream users, because applying them to different software systems dealing
with knowledge is not a straightforward process, and there is no standard way
of doing things. It is only a matter of time before more techniques gain
attention, as experience accumulates from the different subject fields using
ontologies as an information representation technology.
The Semantic Web community, however, has completely changed the vision of the
ontology landscape to make it a more widely applied technology. It puts great
effort into developing standard semantic markup languages based on XML,
ontology management systems, and various ontology management tools, in order
to make it easier to adopt ontologies and integrate them into computer
systems. The use of ontologies is newly being discovered in important
applications that deal heavily with information and with the integration of
different processes with information. Ontologies are slowly making their way
into the software world as their usefulness becomes clearer over time.
Ontology Tools
Effective and efficient work with the Semantic Web must be supported by
advanced tools that expose the full power of this technology. In particular,
the following elements are needed:
• Ontology editors to easily create and manipulate ontologies.
• Annotation tools to link information sources together with different
structures.
• Reasoning services to enable advanced query services and to map between
ontologies with different terminologies.
• Ontology library systems and Ontology Environments to create and reuse
ontologies. Such systems should in general allow merging different
ontologies sharing the same terminology.
Inference engines can be used to reason about ontologies and the instances
defined by those ontologies and create new knowledge from existing knowledge.
Inference engines are similar to SQL (Structured Query Language) query engines
running against databases, but they provide stronger support for rules that
cannot be represented in the relational databases known today.
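The forward-chaining idea behind such engines can be sketched in a few lines; the rule and fact names below are illustrative and do not reflect any engine's actual syntax.

```python
# Toy forward-chaining: a rule derives new facts from existing facts until
# a fixed point is reached. All names are illustrative only.
facts = {("margherita", "type", "NonSpicyPizza"),
         ("NonSpicyPizza", "subClassOf", "Pizza")}

def rule_type_propagation(fs):
    # Rule: if x has type C and C is a subclass of D, then x also has type D.
    new = set()
    for x, p, c in fs:
        if p == "type":
            for c2, p2, d in fs:
                if p2 == "subClassOf" and c2 == c:
                    new.add((x, "type", d))
    return new

# Apply the rule repeatedly until no new facts are derived (fixed point).
while True:
    derived = rule_type_propagation(facts) - facts
    if not derived:
        break
    facts |= derived

print(("margherita", "type", "Pizza") in facts)  # True: derived, not asserted
```

This is the essential difference from an SQL engine: the answer set grows through rule application rather than being limited to the stored rows.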
An example inference engine is Ontobroker [57], which is now a commercial
product. Ontobroker can automatically derive new concepts in a given concept
hierarchy when reasoning over the concepts of an ontology. Another well-known
inference engine is Racer [49], which can be used to implement
industrial-strength projects that make use of ontologies created with OWL/RDF.
Ontology Libraries and Environments
If we assume that we have access to various well defined ontologies, creating a
new ontology is only a matter of merging the existing ontologies and adding new
concepts. Instead of building the ontologies from scratch it will be possible to reuse
the existing ontologies. In order to do this mainly two types of tools are needed.
1. Tools to store and access existing ontologies.
2. Tools to manipulate and manage existing ontologies.
Creating and managing ontologies in a way that makes them reusable is far from
easy. This is why ontology libraries are important: an ontology library makes
it easy to re-organize, group, and merge ontologies so that they can be reused,
managed, and integrated with existing systems.
In order to support ontology reuse, a system must support the following
properties:
• Ontology reuse by identification, versioning and open storage to enable
access to ontologies.
• Ontology reuse by providing support for specific task oriented fields to easily
adapt the stored ontologies.
• Ontology reuse by constructing ontologies that fully support the available
standards: providing access to high-level ontologies and standard
representation languages is an important issue when reuse is to be exploited
to its full potential.
Some examples of existing ontology library systems are: WebOnto [74] [84],
Ontolingua [75] [85], DAML Ontology library system [76], SHOE [26] [86],
Ontology Server [77], IEEE Standard Upper Ontology [78], Sesame [79],
OntoServer [80], and ONIONS [81]. ONIONS has been implemented in several
medical ontology library systems [83]; it is a methodology that enables the
integration of existing ontologies. Comparisons and detailed descriptions of
these library systems can be found in the article by Ding and Fensel [82].
4.1 Ontology Editors
Most of the existing ontology editors [46] are sufficiently general-purpose to
allow the construction of ontologies targeting a specific domain. Some of these
tools lack useful ontology export capabilities because they use an
object-oriented specification language to model information in a domain.
Independent tools to convert between different specifications, such as UML and
DAML+OIL, are currently being developed.
Tools for ontology design and management:
Today, there are more than 90 tools available for ontology development from
both non-commercial organizations and commercial software vendors [47] [87].
Most of them are tools for designing and editing ontology files. Some of them may
provide certain capabilities for analyzing, modifying, and maintaining ontologies
over time, in addition to the editing capabilities. One of the more popular editing
tools is Protégé, developed by the Stanford University School of Medicine [88].
Other tools are SemTalk [89], OilEd [90], Unicorn [91], Jena [92], and Snobase [93],
to name a few.
Some of the available tools can be integrated with each other, enabling a more
complete development environment. For example, the ontology editor Protégé can
communicate with an inference engine to perform reasoning and consistency
checking on the ontology being built. A detailed survey of different ontology
editors is provided in Appendix A.
Protégé: Protégé is a free, open-source, integrated, and platform-independent
system for the development and maintenance of ontologies [19] [50]. The current
version of the tool is 3.0, and it was developed by Stanford Medical
Informatics. Protégé has a frame-based knowledge model which is completely
compatible with OKBC (the Open Knowledge Base Connectivity protocol), enabling
interoperability with other knowledge-representation systems. Protégé provides
a development environment supported by a number of third-party plug-ins
targeted at the specific needs of particular knowledge domains. It is also an
ontology development platform which can easily be extended with various
graphical components such as graphs and tables, media such as sound, images,
and video, and various storage formats such as OWL, RDF, XML, and HTML.
Ontolingua: The Ontolingua system [46] [75] provides users with the ability to
manage, share, and reuse ontologies stored on a remote ontology server. The
system was developed at the Knowledge Systems Laboratory at Stanford
University in the early 90s [19]. Ontolingua supports a wide range of
translations, while most ontology editors support only a limited range; it can
easily import and export constructed ontologies in newer languages like
DAML+OIL and OWL.
WebOnto: WebOnto [74] is a Web-based tool for browsing, editing, and managing
ontologies constructed in OCML. It was developed at the Knowledge Media
Institute at the Open University as part of several European research projects
in the late 90s [19]. It is basically a Java-based client application connected
to a specific Web server that has access to ontologies constructed with OCML.
WebODE: WebODE is a workbench for managing ontologies on the Web. It was
developed by the Ontology and Knowledge Reuse Group at the Technical
University of Madrid [19]. It is built on a three-tier architecture: the user
interface, the application server, and the database management system. The
main elements of the WebODE knowledge model are concepts, groups of concepts,
relations, constants, and instances of specific definitions.
OntoEdit: OntoEdit was developed by the Knowledge Management Group of the
University of Karlsruhe [19]. It is an ontology design and management tool
whose knowledge model is related to frame-based languages, and it supports
multilingual development.
OilEd: OilEd [90] is a development environment for ontologies constructed with
the OIL and DAML+OIL languages. It can be integrated with a reasoner (FaCT)
and can extend the expressiveness of frame-based tools. OilEd is a simple tool
intended mainly for demonstrations; it does not aim to provide the full range
of ontology services and flexibility.
4.2 Ontology Management System
An ontology management system is to ontologies what a database management
system (DBMS) is to relational databases [47]. A DBMS allows an application to
access data stored in a database via a standard interface. The techniques for
storing and structuring the data are left to the DBMS itself, so the
application does not have to consider these issues: the DBMS allows the
application to access data stored in the database with a query language (SQL),
taking care of everything related to data storage, indexing, and data file
management. An ontology management system allows access to ontologies in a
similar way. An application making queries on an ontology through an ontology
management system does not have to worry about how the underlying processes
handle data storage and structuring. Ontology editing capabilities are not
central to an ontology management system, although some systems may provide
capabilities to edit ontologies programmatically through a programming
interface. Where such editing capabilities are not provided, developers can
use a graphical editing environment such as Protégé.
Snobase (Semantic Network Ontology Base) Ontology Management System
Snobase [47] [93] is an ontology management system that can load ontology
files locally or through any URL (Uniform Resource Locator) for files stored
on some Web server. It is possible to create, modify, and store locally
created ontologies. With Snobase, queries can be run against the loaded
ontology through a well-defined programming interface. Applications are
allowed to access ontologies written in standard ontology languages such as
RDF, DAML+OIL, and OWL. The system makes use of persistent storage for
ontologies, a built-in inference engine, a local ontology directory, and
source connectors to application programs. Snobase is a Java package providing
capabilities similar to JDBC (Java Database Connectivity), and it returns
query results similar to the result sets returned from queries made against a
relational database. Snobase currently supports a variant of the OWL Query
Language (OWL-QL) [94] for making queries against an ontology model loaded
into its persistent storage; OWL-QL is the ontological equivalent of SQL for
the Snobase ontology management system.
Jena Semantic Web Framework
Jena [92] is a Java framework for building Semantic Web applications
programmatically. Jena provides a programmatic environment for RDF, RDFS, and
OWL ontologies, including a rule-based inference engine. Given an ontology and
a model, Jena's inference engine can perform reasoning to derive additional
statements that the model does not express explicitly. Jena provides several
reasoner types to work with different types of ontologies.
Some important capabilities of Jena are listed below:
• Provides an RDF Application Programming Interface (API).
• Reads and writes RDF in RDF/XML, N3, and N-Triples.
• Provides an OWL API.
• Provides both in-memory and persistent-storage ontology models.
• Provides support for RDQL, a query language for RDF.
Jena is an open-source project whose development started at the HP Labs
Semantic Web Program.
4.3 Ontology Query Languages
A query can be thought of as an assertion or a restrictive statement whose
results are to be returned [7]. RDF, at the logic level, is sufficient to
express such assertions. In practice, however, a query engine has specific
algorithms and indices available with which to work, and can therefore answer
only specific sorts of queries. It is possible to develop a query language in
either of the following ways:
• allowing query types that can be expressed succinctly and answered with
mathematically less complicated algorithms, or
• allowing only certain constrained queries that have certain computability
properties.
For example, SQL is a query language which has both of the above properties.
It is important that a query language targeted at ontologies can be defined in
terms of RDF logic. For example, to query against an ontology, the assertion
could have the form "x is the author of p1" for some x. To ask for a list of
all authors, it would be asserted that all members of the matching set are
authors and that all authors are in the set, and so on.
In practice, the different algorithms and mathematical foundations behind the
various search engines and local logical systems suggest that there will be
different forms of query agents, each capable of answering different forms of
queries. A useful step would be to restrict queries in some common way, so
that specifications for query engines and languages could be derived from
them. The experience gained from the query languages currently in use will
enable such common specifications, making it possible to chain different
search engines together and let them perform inference through intermediate
query engines.
OWL-QL (OWL Query Language)
OWL-QL [47] [94] is a query language and protocol supporting agent-to-agent
query-answering dialogues using knowledge represented in the OWL language. It
precisely specifies the semantic relationship among a query, a query answer,
and the ontologies used to produce that answer. It also supports a dialogue
with the query engine, so that the query engine can use automated reasoning
methods to derive answers to queries. The query engine may need extra
information from the querying agent in order to produce the answer, so a
dialogue can take place between the two parties; this is why OWL-QL has the
properties of a protocol. In this setting, the set of answers to a query may
be of unpredictable size and may require an unpredictable amount of time to
compute, since the domain is not fully restricted: multiple knowledge bases
can be involved in the dialogue between query agents. The following quote is
from the OWL-QL specification: "an OWL-QL query contains a query pattern that
is a collection of
OWL sentences in which some literals and/or URI-refs have been replaced by
variables. A query answer provides bindings of terms to some of these variables such
that the conjunction of the answer sentences – produced by applying the bindings to
the query pattern and considering the remaining variables in the query pattern to be
existentially quantified – is entailed by a knowledge base (KB) called the answer
KB”.
OWL-QL is relatively simple yet expressive. To make a query, a querying agent
simply describes what is being searched for, indicating the variables and
their matching concepts for the answer queried. An advantage of the OWL-QL
query language is that the underlying mechanism is easily adaptable to
different ontology representation languages.
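The quoted idea of a query pattern with variables and answer bindings can be sketched as follows; the knowledge base, the `?x` variable syntax, and the matching procedure are illustrative assumptions, not actual OWL-QL syntax.

```python
# Toy query-pattern matcher: a pattern is a triple in which some positions
# are variables ("?x"); an answer is a binding of variables to terms such
# that the bound pattern is a fact of the knowledge base.
kb = {("p1", "author", "Alice"), ("p2", "author", "Bob"),
      ("p1", "year", "2004")}

def match(pattern, bindings, fact):
    # Extend the current bindings so that pattern matches fact, or fail.
    b = dict(bindings)
    for term, value in zip(pattern, fact):
        if term.startswith("?"):            # a variable position
            if b.setdefault(term, value) != value:
                return None                 # conflicting binding
        elif term != value:
            return None                     # constant mismatch
    return b

def answers(patterns, kb):
    # Conjunctive matching: every triple pattern must match some KB fact
    # under one consistent set of bindings.
    results = [{}]
    for pat in patterns:
        results = [b2 for b in results for fact in kb
                   if (b2 := match(pat, b, fact)) is not None]
    return results

# "?x is the author of p1" -- ask for all authors of p1.
print(answers([("p1", "author", "?x")], kb))  # [{'?x': 'Alice'}]
```

A full OWL-QL engine would additionally apply reasoning so that the pattern may be matched against entailed sentences, not only asserted ones.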
RDQL (RDF Data Query Language)
RDQL [95] was developed by HP and submitted to W3C as a possible
recommendation for a query language. It is an implementation of an SQL-like
query language designed for RDF. RDQL treats RDF as data and provides querying
with constraints on the triple patterns exposed by the RDF model. RDQL is
intended to be used at a higher level than the RDF API itself: queries can be
written in a more declarative and intuitive way, describing the answers
expected from the query-answering system.
RQL (A declarative query language for RDF)
RQL [10] is a typed query language based on a functional approach. It is
defined by a set of basic queries and iterators that can be used to build new
queries through functional composition. In addition, RQL supports generalized
path expressions, featuring variables on labels for both nodes (i.e., classes)
and edges (i.e., properties). The smooth combination of RQL schema and data
path expressions is a key feature for satisfying the needs of several Semantic
Web applications, such as knowledge portals and e-marketplaces. The online
documentation presents the complete RQL syntax, formal semantics, and type
inference rules of the language.
CHAPTER 5
DESIGN OF THE SEMANTIC WEB APPLICATION:
ONTOLOGY-DRIVEN RECIPE QUERYING
This chapter goes through the design and specifications of the Semantic Web
application implemented for this thesis. Specifications regarding the choice
of domain, services, user facilities, etc. are discussed in detail.
As the Internet has gone through a period of rapid growth, the need for
applications that make use of both machine- and human-consumable data has
emerged as a promising candidate for a new breakthrough in information
presentation and processing. Various kinds of technologies have been
developed, along with different standards and techniques proposed by the
communities making use of the Web. The Semantic Web project is moving toward
becoming an important actor in the mainstream: applications, tools, and
Semantic Web languages are constantly being developed, creating a solid
foundation for future Semantic Web developments, and a valuable pool of
experience is being gained from the effort spent on these developments.
The purpose of this thesis project is to explore the potential advantages of
Semantic Web ontologies and to demonstrate how different technologies can be
combined to create applications primarily based on ontologies. Technologies
directly related to the Semantic Web have been used along with more general
Web-targeted technologies not directly related to Web ontologies or the
Semantic Web as a whole. The choice of the different technologies (tools,
ontology language, programming platform, etc.) will be discussed in detail at
the end of this chapter.
5.1 Overview of the System
As explained in the previous chapters, Web ontologies bring several benefits
in the Semantic Web context. The Web application developed for this thesis
project makes use of Semantic Web technologies to demonstrate these benefits.
The overall structure of the system is illustrated in Figure 1 below.
Figure - 1: Structure of the System
The application is mainly a Web-based interface for accessing and querying
content stored in an OWL ontology, which can be located on the local system or
on any Web server on the Web. The targeted content domain is food recipes,
whose data has been collected from various recipe Web sites.
The ontology being processed is intended not only for retrieving data on
various food recipes but also for structuring the general view and behavior of
the Web interface, such as categorizing the recipes and displaying them under
a navigation menu. All the information and Web content presented at the front
end of the application is extracted from the OWL ontology.
The actual ontology processing is done by a separate OWL server implemented
for this project; the Web interface retrieves the necessary data from this
server through a TCP connection. Loading and creating a model from the OWL
ontology, recipe category extraction, recipe querying, and recipe content
extraction are all done by the OWL server working in the background of the
application. As mentioned before, the Web interface is a separate module
interacting with the OWL server only to accept user input and present the data
returned from the server.
The OWL server implements these processing tasks using the Snobase ontology
management system. The server creates a persistent model of the ontology and
loads it into system memory for fast access.
A separate ontology editor has been used to construct the OWL ontology.
Ontology construction and management have been kept external to the project
implementation because powerful editors are already available.
The Web interface is intuitive and easy to use, allowing users to browse
through the recipes and providing search capabilities so that a recipe matching
a certain specification can be found easily.
5.2 System Domain
Food recipes have been selected as the information domain for the project
because of several interesting properties. Recipe information is published on a
large number of Web sites, which makes it easy to find data related to recipes
and use it when constructing the ontology.
Food recipes provide a useful context and structure to create a system relying on
structured data. It is easy to classify the existing data and present this classification
with an ontology. Concepts such as classes, subclasses, properties and relations can
easily be applied and demonstrated within this domain.
The poor quality of existing food recipe Web sites and portals was also one of
the main reasons for choosing this domain. The difficulty of finding relevant
information is one of the general problems in this domain, and such problems
have been kept in mind while developing this application.
The constructed ontology contains a large number of food recipes whose data has
been collected from existing Web sites that publish them. Because recipes share
a common information structure, it has been easy to define a common format for
storing them in the constructed ontology file. Properties such as cooking time,
preparation time, preparation instructions and vegetarian status are common to
all food recipes. In addition, the ingredients used to prepare food are shared
across the domain, allowing them to be reusable definitions instead of being
distinct to each and every recipe.
Some of the Web sites used to obtain information on published food recipes are
listed below:
• http://www.afiyetolsun.net
• http://www.yemekport.com
• http://www.yemek.arsivi.com
• http://www.1de1.com
• http://www.emels.homestead.com
• http://www.mutfak1.homestead.com
• http://www.damaktadi.8m.com
• http://www.gezinet.net/yiyecekicecek/Restoranlar/gurme/gurmeanasayfa.asp
• http://www.mutfakrehberi.com.tr
• http://gulseminintenceresi.8m.com
• http://www.geocities.com/Hollywood/2944/index1.html#Turkce
None of the Web sites listed above, nor the many other sites not mentioned
here, offer powerful features for extracting the relevant recipe. In general,
they lack search facilities that would spare users from browsing all the
recipes to find the desired one.
No Web sites could be found that publish their recipe content in any form of
ontology. All the available data was marked up with HTML (HyperText Markup
Language), which makes it almost impossible for other systems to access and
use the data effectively.
No automated process could be used to provide the data for the constructed
ontology. Information was extracted from these sites by copying and pasting
from the pages where it was presented visually. Developing an automated system
to extract information from the HTML markup is out of the scope of this thesis.
Moreover, such a system would most probably be unsuccessful and a waste of time
and effort, since the markup used by the different sites is not identical, so
no general pattern can be constructed for extracting the necessary information
and storing it in ontologies. This is a separate subject with no direct
relation to the Semantic Web.
Before designing the system, the different Web sites were investigated in
detail. The main focus was on how quickly users can reach the relevant content.
None of the mentioned Web sites has advanced search capabilities, except for a
few that allow recipe search based on ingredients. Most of the sites only
present the recipes in different categories, so users can only browse and
search through them manually. Some sites provide keyword search, but only the
recipe titles are used for keyword matching.
The content stored at these Web sites is not structured in a way that makes it
accessible to other systems. That is, the content is not reusable and serves
only users with plenty of time and patience to seek it out. The only way of
retrieving the stored information for other purposes is to copy and paste it
manually.
Even the Web sites that provide ingredient-based search are not powerful
enough to return the most relevant recipes. A recipe can be classified in
various ways and has different properties and relations, yet none of these are
considered when searching by ingredient, although they provide useful
information for retrieving exactly what is desired. For example, a search
mechanism for food recipes could consider the preparation time given for a
recipe. Whether a food is vegetarian could also be a useful criterion, as could
the food category, the level of difficulty, the country or region of origin,
etc. However, none of these properties is used on the Web sites visited. As
mentioned above, the only search mechanism other than direct keyword matching
is the ingredient-based one, which has been implemented on only a very few Web
sites.
5.3 System Specifications
Storage and representation of information
The domain information is represented with an ontology. All the data related
to food recipes, including classifications, properties and relations, is
stored in the ontology file.
Ontology language
The ontology language used to construct the ontology is OWL, since it currently
offers greater representational power than the other ontology languages. The
sublanguage chosen is OWL DL, because class-related constructs such as
subClassOf and disjointWith were used in constructing the ontology.
Ontology processing
Ontology processing is being performed on a separate server application making
use of the Ontology Management System Snobase. Ontology processing is
implemented by making use of the API provided by Snobase and some classes
implemented in order to create communication with the Web client.
Web interface
The Web interface is a Web-based application that handles the visual
presentation of the recipe contents and provides navigation through the different
categories of food recipes. It provides an easy to use search interface allowing the
users to construct queries with different criteria.
Application development platform
The OWL server responsible of ontology processing has been developed using
the popular object oriented programming language Java. The Snobase API is
provided as Java package. The Web interface is separated from server application
itself although it could have been developed using the same platform. The Web
interface is implemented using the widely used server side scripting language PHP
which can be installed and run on any Web server with the proper configuration.
5.4 System Design
The Ontology-driven recipe querying application developed for this thesis is built
up in three main parts; the OWL ontology, OWL server and the Web interface to
interact with the system. The ontology constructed is the only information resource
used for the application. All data such as the text representing the recipe, recipe
category names etc. are stored in the ontology file constructed. Even the link names
appearing on the menu displayed on the Web interface are stored and retrieved from
the ontology file as shown in the Figure-1.
The OWL server is a Java-based client-server application that acts as a bridge
between the constructed ontology and the Web interface. Given a URL as a
parameter, it loads the ontology file from any location accessible on the Web
and creates an internal model on which queries can be executed. The server
creates a persistent model of the loaded ontology in a local directory, so that
once the ontology has been loaded it no longer has to rely on the network
connection during execution.
After loading the OWL ontology, the server is ready to accept requests from the
Web interface. The communication between the server and the Web interface is
based on a simple ad-hoc protocol that lets the two parts exchange information.
Whenever the server receives a request, it validates the format of the request,
performs the requested task and sends the results back to the requesting Web
interface. Four main types of requests can be made from the Web interface.
• Request for recipe categories.
• Request for all recipes under a specific category, given the resource id of
the category.
• Request for a food recipe given a specific resource id for the recipe.
• Perform a search given some query.
The Web interface is a PHP (PHP Hypertext Preprocessor) application that
handles only the user interaction with the OWL ontology server. It does not
perform any data processing other than making requests to the OWL server and
presenting the responses in an HTML-formatted page. It is responsible for
accepting user input, creating a request message and sending it to the OWL
server; when the server returns the corresponding response, the Web interface
simply displays it to the user. The different parts of the interface, such as
the category-based navigation menu, the selectable ingredients and the food
recipes being displayed, are all retrieved from the OWL server dynamically on
each page request. Whenever the category structure or the content of the
ontology file is modified, the changes are reflected in the Web interface and
made available to the user without any modification to the code of the PHP Web
application.
Communication between the different parts of the system is based on common
network techniques. The server fetches the ontology stored on a Web server
using the HTTP protocol, whereas the communication between the OWL server and
the Web interface is a TCP/IP socket connection implementing an ad-hoc
protocol. Each of the three parts of the system can be located on any machine
with access to the Internet.
5.4.1 OWL Ontology Design
The Web ontology file has been constructed to reflect the food recipe domain in
detail. However, the ontology has not been made too fine-grained, since it
would then be difficult to provide specific information for all of the details
built into it. For example, information such as country and region of origin
has been discarded.
The constructed ontology takes advantage of concepts such as class hierarchies,
class relations and properties. In order to model the information domain
realistically, proper information description methods have been used together
with the powerful logic descriptors provided by OWL DL and OWL Full. The
following part explains the different classes that have been defined, together
with their relations and properties.
Please refer to Appendix D for part of the "Class Hierarchy for foodReceipts
Project", which is automatically generated as an HTML file.
OWL Classes:
Food
The Food class is the superclass of all the classes defined for specific food
types, such as "Çorba" and "Kebap". It is the base class for all the food
instances defined in the ontology, and is defined as disjoint with all classes
other than those inheriting from it.
Ingredients
The Ingredients class (<owl:Class rdf:ID="Ingredients">) is the superclass of
the classes defined for each type of ingredient; every specific ingredient
class inherits from it. It is defined as disjoint with all classes except those
directly inheriting from it.
Difficulty Level
The DifficultyLevel class is an enumerated class with three different instances
defined. These are Easy, Normal and Difficult. This class is also disjoint with other
classes defined in the domain.
Preparation Time
The PreparationTime class is an enumerated class with 17 pre-defined instances
representing preparation times from 10 to 90 minutes at suitable intervals.
Instances of this class are used to state the time taken to prepare a food.
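As a sketch, an enumerated class of this kind might be written in OWL (1.0) RDF/XML syntax as follows; the instance names here are illustrative and not taken from the actual ontology file:

```xml
<owl:Class rdf:ID="PreparationTime">
  <owl:oneOf rdf:parseType="Collection">
    <PreparationTime rdf:ID="Minutes10"/>
    <PreparationTime rdf:ID="Minutes15"/>
    <PreparationTime rdf:ID="Minutes20"/>
    <!-- ... further instances, up to Minutes90 ... -->
  </owl:oneOf>
</owl:Class>
```

The owl:oneOf construct closes the class: a reasoner will treat exactly these individuals, and no others, as its members.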
Properties and Relations:
All properties and relations are shown in Figure-2 including domain and range
relationships, and explained one by one in the following paragraphs.
Figure - 2: Properties and Relations of the System
HasCalory
The HasCalory datatype property (owl:DatatypeProperty) is a relation between
instances of class Food and a string representing the calorie value of the food
instance.
HasDifficultyLevel
The HasDifficultyLevel object property (owl:ObjectProperty) is a relation
between instances of class Food and an instance of the enumerated class
DifficultyLevel.
HasIngredient
The HasIngredient object property (owl:ObjectProperty) is a relation between
instances of type Food and instances of type Ingredient. Through this property
a Food instance can participate in several relations with different Ingredient
instances.
HasPreparationTime
The HasPreparationTime object property is a relation between instances of type
Food and instances of type PreparationTime. This property is used to attach a
PreparationTime instance to the food instance.
HasWebSourceURL
HasWebSourceURL is a datatype property (owl:DatatypeProperty) relating
instances of class Food to a literal representing the URL of the Web source the
recipe data has been retrieved from.
IsIngredientOf
The IsIngredientOf object property is defined as the inverse (owl:inverseOf) of
the object property HasIngredient; its domain and range are the reverse of
those of HasIngredient.
HasReceipt
This is the datatype property that binds a recipe to an instance of type Food.
The range is a string (XMLLiteral) containing lightly HTML-formatted text
representing the recipe of a particular food instance.
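To illustrate how these properties combine on a single individual, the following hypothetical Food instance is sketched in OWL RDF/XML; every identifier except the property names and the Easy instance is invented for this example:

```xml
<Food rdf:ID="SampleSoup">
  <rdfs:label>Sample Soup</rdfs:label>
  <HasCalory>250</HasCalory>
  <HasDifficultyLevel rdf:resource="#Easy"/>
  <HasPreparationTime rdf:resource="#Minutes30"/>
  <HasIngredient rdf:resource="#Onion"/>
  <HasIngredient rdf:resource="#Lentil"/>
  <HasWebSourceURL>http://example.org/sample-soup</HasWebSourceURL>
  <HasReceipt rdf:parseType="Literal"><p>Chop the onion, add the lentils ...</p></HasReceipt>
</Food>
```

The rdf:parseType="Literal" attribute on HasReceipt is what makes the lightly HTML-formatted recipe text an XMLLiteral value.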
5.5 Technical Specification
Protégé Ontology Editor
Protégé is developed by Stanford Medical Informatics. It has been selected as
the ontology editor for the thesis because it suits the project in many ways.
It is a widely used, free ontology editor, especially for constructing and
maintaining OWL ontologies. OWL differs from other ontology languages in that
it supports a richer set of operators, such as AND, OR and negation. Protégé
supports these advanced features of the OWL language and provides all the
functionality needed to maintain OWL ontologies. Because of its wide
popularity, support for it is easy to obtain.
Plenty of tutorials, documentation and support forums are available for the
editor. The functionality of Protégé can be extended with various plugins
available on its home page, and different wizards are provided to ease the
work. Its internal logical model allows it to interact with external reasoning
services such as Racer, for example to compute inferred types and to check the
consistency of the ontology being constructed. Please refer to Appendix A for
a detailed survey of different ontology editors.
Racer (Renamed ABox and Concept Expression Reasoner)
Racer is a Semantic Web inference engine for Web ontologies. It currently
supports a wide range of inference services for ontologies specified in the
ontology language OWL. These services are made available through a
network-based API so that different agents can use them; for example, Protégé
can interact with the reasoner and use its reasoning services.
For the thesis, Racer has been used to check the consistency of the developed
ontology and to compute derived types. For example, the constructed ontology
contains a class named NonVegetarianFood for identifying the non-vegetarian
food instances. Its type definition asserts that instances of this class
contain some ingredient of type Meat. Racer's reasoning service can then
classify all Food instances having ingredients of type Meat under this class,
even though these instances have not been created directly with the type
NonVegetarianFood.
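The NonVegetarianFood class described above corresponds to the standard OWL DL pattern of a defined class with an existential restriction; the exact definition in the thesis ontology may differ, but a sketch could look like this:

```xml
<owl:Class rdf:ID="NonVegetarianFood">
  <owl:equivalentClass>
    <owl:Class>
      <owl:intersectionOf rdf:parseType="Collection">
        <owl:Class rdf:about="#Food"/>
        <owl:Restriction>
          <owl:onProperty rdf:resource="#HasIngredient"/>
          <owl:someValuesFrom rdf:resource="#Meat"/>
        </owl:Restriction>
      </owl:intersectionOf>
    </owl:Class>
  </owl:equivalentClass>
</owl:Class>
```

Because the class is stated with owl:equivalentClass rather than a mere subclass axiom, a reasoner can classify individuals into it automatically instead of only checking declared members.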
CHAPTER 6
IMPLEMENTATION
The application consists of three main parts: the OWL ontology, the OWL query
server and the Web interface, each implemented with different technologies and
development platforms. The OWL query server is a Java application built on the
ontology processing API provided by the Snobase ontology management system. The
server loads a given ontology and creates an internal model so that it can be
processed. It provides a network interface allowing client systems to connect
and pass requests through a simple protocol; the server processes each query,
runs the necessary queries on the loaded ontology model and returns the answer
to the client system through the same network interface. In this case, the
client system is the Web interface.
The client Web interface sends requests to the OWL server and displays the
results in human-readable form. The requests differ depending on the
information needed from the ontology: a user may construct a query to search
for a food recipe, or click a particular link to view the list of recipes under
a category, etc. The form of each request reflects the corresponding
functionality provided by the interface.
The OWL ontology described in the previous chapter has been developed with the
Protégé ontology editor (version 3), with Racer used to check consistency
throughout development. The detailed steps of constructing the ontology are not
given in this chapter, since the procedure is straightforward with the Protégé
editor; however, some screenshots of the editor environment are provided to
give a visual view of the ontology design discussed in chapter 5.
6.1 OWL Query Server
As mentioned above, the server is a Java application providing a network
interface, implemented with simple TCP socket connections, to accept requests
from the Web interface. The main process of the server listens for incoming
requests and passes each one to a separate thread, which processes it, creates
a response and sends the response back to the client process. Each request
received from the Web client is handled by a separate RequestHandler thread,
which creates a response appropriate to the type of request and sends it back
to the client.
The server responds differently to the different types of requests made by the
client process. A simple protocol governs the communication between client and
server; the request headers are as follows.
• GET_MENU: When the server receives a request with this header, it queries
the loaded ontology model and retrieves all the existing food categories;
that is, it retrieves all the subclasses of the class Food and returns them
to the client. The client interface makes this request in order to build the
navigation menu of the available food categories.
• GET_CATEGORY: This header accompanies a request containing a food category
ID, e.g. “Çorbalar”. The server queries the ontology model and retrieves all
the instances under the given category. (Each subclass of class Food is
considered a food category.)
• GET_FOOD: This request header is used to retrieve a particular food from
the loaded ontology. Given a food instance id, the server retrieves the
information of the food with that id and returns it to the client.
• GET_INGREDIENTS: When a request with this header is made, the server
returns all instances of class Ingredient to the client. The client uses
this information to build a search environment in which the user can select
among the different ingredients to construct an ingredient-based query.
• GET_RESULT: This request header is attached to a query string containing
the query constructed by the user. The server processes this string, performs
a search and returns the results to the client.
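The header-based dispatch can be sketched as follows. The handler bodies are stubs with invented names, and it is assumed here that every parameterised request wraps its payload in the same `{food}` delimiters used for GET_FOOD; only the headers themselves come from the protocol described above:

```java
// Hypothetical sketch of the header-based request dispatch; the
// handler methods are stubs returning placeholder payloads.
public class RequestDispatcher {
    private static final String DELIM = "{food}";

    public String dispatch(String request) {
        if (request.startsWith("GET_MENU")) return menu();
        if (request.startsWith("GET_CATEGORY")) return category(extractId(request));
        if (request.startsWith("GET_FOOD")) return food(extractId(request));
        if (request.startsWith("GET_INGREDIENTS")) return ingredients();
        if (request.startsWith("GET_RESULT")) return search(extractId(request));
        return "ERROR: unknown request header";
    }

    // Pulls the payload out of e.g. "GET_FOOD{food}someId{food}".
    static String extractId(String request) {
        int start = request.indexOf(DELIM) + DELIM.length();
        int end = request.indexOf(DELIM, start);
        return request.substring(start, end);
    }

    private String menu() { return "Corbalar,Kebaplar"; }             // stub
    private String category(String id) { return "recipes-in:" + id; } // stub
    private String food(String id) { return "recipe-for:" + id; }     // stub
    private String ingredients() { return "Onion,Lentil"; }           // stub
    private String search(String q) { return "results-for:" + q; }    // stub
}
```

Keeping the dispatch in one place like this means a new request type only needs a new header check and a new handler method.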
Details of the internal process are discussed in the following section, where
each class is explained in detail. The server is made up of three classes:
OWLQueryServer, RequestHandler and FoodEntry.
Class: OWLQueryServer
This is the main class of the server. OWLQueryServer starts by loading the OWL
ontology file from a Web server at a given URL and creating an OWL model. The
OWL model API, provided by Snobase, represents the ontology in system memory
and enables RDQL queries to be run against the model.
Initially, after loading the model, a query is executed to retrieve all the
food instances represented in the OWL model. These instances are stored in a
Hashtable as FoodEntry objects. This hash table is an internal cache of all the
available food entries, used for fast access to the information of each food
instance: instead of executing a query against the model every time information
about a food instance is needed, it is much easier to look it up in the hash
table. Having all the food instances in a collection also simplifies search
operations.
After retrieving the food instances, a similar step is performed for the
ingredient instances represented in the OWL model. This time, instead of a hash
table, a string containing all the ingredients is created. The Web interface
uses this string to build a list of ingredients from which the user can select
when constructing an ingredient-based query. Instead of querying the model on
each client request, it is much faster to create this string beforehand and
simply send it whenever the client requests the available ingredients.
Finally, the main process of OWLQueryServer creates a server socket and waits
for incoming client requests. Whenever it receives a request, it hands the
socket connection over to a RequestHandler object and continues its loop,
waiting for new requests. The RequestHandler reads in the request string,
processes it, returns a response to the client and then finishes its job. A
RequestHandler object is created for each request and terminates after handling
the request and sending back the response. The UML diagram of OWLQueryServer is
illustrated in Figure-3.
Figure - 3: OWLQueryServer UML diagram
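The accept-loop structure just described, one short-lived handler thread per request, can be sketched without any Snobase dependencies as follows; the echo-style handler merely stands in for the real RequestHandler, and all names here are invented:

```java
import java.io.*;
import java.net.*;

// Hypothetical sketch of the accept loop: the main thread only accepts
// connections; each request is handed to a short-lived handler thread.
public class MiniQueryServer {
    private final ServerSocket server;

    public MiniQueryServer() throws IOException {
        server = new ServerSocket(0); // bind to any free port
    }

    public int port() { return server.getLocalPort(); }

    public void start() {
        Thread acceptor = new Thread(() -> {
            try {
                while (!server.isClosed()) {
                    Socket client = server.accept();
                    new Thread(() -> handle(client)).start(); // one thread per request
                }
            } catch (IOException closed) {
                // server socket closed; accept loop ends
            }
        });
        acceptor.setDaemon(true);
        acceptor.start();
    }

    // Stand-in for the real RequestHandler: read one request line,
    // write one response line, then the connection is finished.
    private void handle(Socket client) {
        try (Socket c = client;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(c.getInputStream()));
             PrintWriter out = new PrintWriter(c.getOutputStream(), true)) {
            String request = in.readLine();
            out.println("HANDLED:" + request);
        } catch (IOException ignored) {
        }
    }
}
```

Because each handler thread dies after one request-response exchange, the main accept loop is never blocked by a slow client.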
Class: RequestHandler
The RequestHandler class is where the actual processing is done. It processes a
single request and then finishes its execution. Depending on the request
received, it invokes the appropriate method to generate the response and sends
it back to the client via the socket connection. For example, on receiving a
request of the form “GET_FOOD{food}foodInstanceID{food}”, it retrieves the food
instance with id “foodInstanceID” from the previously constructed hash table
and sends the information back to the client.
For some requests it generates a query and executes it against the OWL model
previously created by the main server class OWLQueryServer. For example, when a
request is made for the names of the food instances under a specific category,
an RDF query is created with the parameters provided by the request string and
executed against the model. The response contains the label of each instance
instead of the raw OWL id, because the label is intended for humans: instead of
returning “#someFood”, the server returns the label, which probably has the
form “Some Food” and is more suitable for presentation to a human user.
As an example, the sample query below returns all the elements of type Food;
Sample query:
// Variable definition
RDFVariable X1 = model.createRDFVariable("?X1");
// Query statement
RDFStatement queryStatement = model.createRDFStatement(X1,
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    "http://localhost/localOntologies/foodReceipts.owl#Food");
// Executing the query against the OWL model
RDFResultSet resultSet = model.select(queryStatement);
This query will return all the elements (matching X1) of type Food.
For more queries, please refer to Appendix B, which contains sample queries
(Java code) illustrating how they are constructed and executed against the OWL
model.
The most important requests the RequestHandler deals with are search requests.
A search can be made using different criteria together with a set of specified
ingredients. Criteria such as preparation time, vegetarian status and calorie
amount are all optional elements of the query a user can construct through the
search interface provided by the Web client; each recipe search request must,
however, contain a set of specified ingredients.
The different elements/restrictions of a recipe search that can be constructed by a
user through the Web interface are listed below.
• The ingredients the recipes searched for should contain.
• Minimum value for food calorie.
• Maximum value for food calorie.
• Minimum value for preparation time.
• Maximum value for preparation time.
• Whether the food is vegetarian or not.
• A specific category (sub class of Food).
• The difficulty level of the recipe being searched for.
When a search operation starts, the RequestHandler iterates over the previously
constructed Hashtable (searchtable) storing all the food instances as FoodEntry
objects. The search is carried out in several steps in order to return the most
relevant results to the user requesting recipes.
Initially, when a request is received, the RequestHandler selects all the food
entries matching the specified criteria. It starts with the ingredients
criterion and creates a set of food entries containing those ingredients. It
then moves on to the other restrictions, performing a selective iteration over
the previously created set, and continues until all the specified restrictions
have been checked. Finally, the set of food entries (recipes) meeting all the
criteria is sent back to the client.
If the above procedure results in an empty set, a different procedure is
started. When no suitable recipes are found by the first procedure, the
RequestHandler does not simply return a NO_RESULT response; instead, it
performs a new search over different combinations of the given set of
ingredients.
First, one ingredient is removed from the specified set, and a search identical
to the procedure above is performed: the recipes matching the remaining
ingredients are selected, followed by the same selective filtering on the other
restrictions. This is repeated for every possible removal of a single
ingredient. If the resulting set contains any elements, it is sent back to the
client.
If removing a single ingredient does not return any results, the procedure
continues with two ingredients removed from the ingredient set, and the search
is performed for every ingredient set resulting from the removal of two
ingredients. The combination yielding the maximum number of recipes is selected
and its results are sent back to the client.
When none of the procedures above returns any results, the restrictions other
than the ingredients are dropped; that is, the procedures are repeated without
considering the extra restrictions such as preparation time and calorie amount.
Together, these search procedures return the most relevant results to the user
requesting recipes through the search interface: instead of seeing “Number of
results: 0” for a query, the user will most probably receive a result relevant
to the original query he/she has made.
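The staged relaxation described above can be sketched as follows. This is a simplified model rather than the actual implementation: FoodEntry is reduced to a name plus an ingredient set, all non-ingredient restrictions are collapsed into a single predicate, and the final stage is shown only for the full ingredient set:

```java
import java.util.*;
import java.util.function.Predicate;

// Simplified sketch of the staged search relaxation.
public class RecipeSearch {

    public static final class Entry {
        public final String name;
        public final Set<String> ingredients;
        public Entry(String name, Set<String> ingredients) {
            this.name = name;
            this.ingredients = ingredients;
        }
    }

    public static List<Entry> search(Collection<Entry> all,
                                     Set<String> wanted,
                                     Predicate<Entry> extras) {
        // Stage 1: require every ingredient and every extra restriction.
        List<Entry> hits = filter(all, wanted, extras);
        if (!hits.isEmpty()) return hits;

        // Stage 2: retry with one, then two, ingredients removed,
        // keeping the combination that yields the most recipes.
        for (int drop = 1; drop <= 2 && drop < wanted.size(); drop++) {
            List<Entry> best = Collections.emptyList();
            for (Set<String> subset : subsetsDropping(wanted, drop)) {
                List<Entry> r = filter(all, subset, extras);
                if (r.size() > best.size()) best = r;
            }
            if (!best.isEmpty()) return best;
        }

        // Stage 3: give up on the extra restrictions (the thesis repeats
        // the ingredient-dropping stages here as well).
        return filter(all, wanted, e -> true);
    }

    static List<Entry> filter(Collection<Entry> all, Set<String> wanted,
                              Predicate<Entry> extras) {
        List<Entry> out = new ArrayList<>();
        for (Entry e : all)
            if (e.ingredients.containsAll(wanted) && extras.test(e)) out.add(e);
        return out;
    }

    // Every subset of 'wanted' with exactly 'drop' (1 or 2) elements removed.
    static List<Set<String>> subsetsDropping(Set<String> wanted, int drop) {
        List<String> items = new ArrayList<>(wanted);
        List<Set<String>> out = new ArrayList<>();
        if (drop == 1) {
            for (String s : items) {
                Set<String> sub = new HashSet<>(wanted);
                sub.remove(s);
                out.add(sub);
            }
        } else { // drop == 2
            for (int i = 0; i < items.size(); i++)
                for (int j = i + 1; j < items.size(); j++) {
                    Set<String> sub = new HashSet<>(wanted);
                    sub.remove(items.get(i));
                    sub.remove(items.get(j));
                    out.add(sub);
                }
        }
        return out;
    }
}
```

The `drop < wanted.size()` guard keeps the ingredient set from ever becoming empty, which would otherwise match every recipe.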
When the results are viewed, a message describing the search that was actually
performed is displayed, so the user is informed of exactly what has been done
behind the scenes. The UML diagram of class RequestHandler is illustrated in
Figure-4.
Figure - 4: RequestHandler UML diagram
Class: FoodEntry
The FoodEntry class represents a food instance with all its data, such as id,
label, preparation time and calorie value. Instances of this class are used to
build a collection of the OWL food instances for easy manipulation. As
mentioned before, the main process in OWLQueryServer initially creates a
Hashtable (searchtable) containing all the food instances represented in the
OWL model, each stored as a FoodEntry object.
The FoodEntry class implements methods that make it easy to use the data of
each food instance. Proper getter and setter methods are implemented, together
with some extra methods that simplify searching. One such method is
containsIngredients(ingredients), which checks whether the given ingredients
exist in the recipe of the food instance on which it is called. The UML diagram
of class FoodEntry is illustrated in Figure-5.
Figure - 5: FoodEntry UML diagram
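A minimal sketch of FoodEntry along these lines is shown below; the fields follow the description above, but the exact signatures in the thesis code are not known and are invented here:

```java
import java.util.*;

// Hypothetical sketch of FoodEntry; fields follow the description in this
// chapter, but the exact signatures are invented for this example.
public class FoodEntry {
    private final String id;               // raw OWL instance id
    private String label;                  // human-readable name
    private String preparationTime;
    private String calory;
    private final Set<String> ingredients = new HashSet<>();

    public FoodEntry(String id) { this.id = id; }

    public String getId() { return id; }
    public String getLabel() { return label; }
    public void setLabel(String label) { this.label = label; }
    public String getPreparationTime() { return preparationTime; }
    public void setPreparationTime(String t) { this.preparationTime = t; }
    public String getCalory() { return calory; }
    public void setCalory(String c) { this.calory = c; }
    public void addIngredient(String ingredientId) { ingredients.add(ingredientId); }

    // True when every ingredient in the given collection occurs in this
    // recipe; this is the check the search procedure relies on.
    public boolean containsIngredients(Collection<String> wanted) {
        return ingredients.containsAll(wanted);
    }
}
```

Backing the ingredients with a Set makes containsIngredients a simple containsAll check regardless of how many ingredients the recipe has.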
6.2 Web Interface
The Web interface is a PHP Web application providing an HTML user interface. It
communicates with the OWLQueryServer over a network interface based on socket
connections, creating different requests and processing the responses
accordingly. Some requests are made to build the navigation menu over the
different classes/categories of food instances represented in the OWL ontology;
such a request string looks like GET_MENU, and the response is a string of
category OWL ids, each followed by the instance ids belonging to that
category/food class.
As the user navigates through the different categories, the Web interface makes
the appropriate requests to the server in order to populate the page with the
content the user has asked for. For example, when the user selects the link for
a food instance listed under a specific category, the interface immediately
requests that food entry and displays the response sent back by the
OWLQueryServer.
The most important part of the interface is the search page, where different
ingredients can be selected or added together with the different restrictions on the
query being constructed. The user interface provides a search pane where the user
can directly enter the ingredients and supply values for the optional extra restrictions
such as calorie amount, preparation time, difficulty level, etc., as shown in Figure-6.
Additionally, two different search interfaces, which allow the user either to type the
ingredients or to select them from a categorized box, together with some example
searches, are shown in the following figures.
Figure - 6: Search Interface 1
Figure - 7: Sample Search (A)
Figure - 8: Sample Search (B)
Figure - 9: Sample Search (C)
Figure - 10: Sample Search (D)
When a search is done and the user clicks on one of the results, e.g. ‘Körili Pilav’,
its ingredients and recipe can be viewed as shown in Figure-11.
Figure - 11: Ingredients and Recipe of ‘Körili Pilav’
When the user clicks on the link labeled ‘Kaynak’ at the bottom of each recipe, it
is possible to view the source of that recipe in a new Web page, as illustrated in
Figure-12.
Figure - 12: Source of ‘Körili Pilav’
The other search interface, shown in Figure-13, lets users select from among the
ingredients defined in the OWL ontology file.
Figure - 13: Search Interface 2
This search interface provides the functionality to make selections among the
ingredients defined in the OWL model itself when the user creates the set of
ingredients that the search should be based on. The user can select the ingredients
from an easy-to-use list and view the selected ingredients in a separate list in
preparation for the query. Figure-14 illustrates the search mode in which users select
from among the ingredients defined in the OWL ontology.
Please refer to Appendix - C for a portion of the model that is automatically
generated from the source code of the application.
Figure - 14: Selecting Ingredients from Categorized Box
Figure - 15: Sample Search (E)
Figure - 16: Sample Search (F)
Figure - 17: Sample Search (G)
When the user clicks on the “Pilavlar” link on the left side of the window, the
list of Pilavlar can be viewed as shown in Figure-18.
Figure - 18: Category of ‘Pilavlar’
A help interface is also available to the user when the help button is clicked as
illustrated in Figure-19.
Figure - 19: Help Interface of the System
In order to make a comparison, some searches were done with three popular
search engines, with the following results. When a search is done with the
ingredients “pirinç soğan tavuk”, Google returns 6,020 results, Yahoo returns
14,600 results and Altavista returns 14,500 results. When a search is done with
the ingredients “pirinç soğan tavuk patates bezelye domates”, Google returns
1,150 results, Yahoo returns 3,490 results and Altavista returns 3,500 results.
Working through such a huge number of results is a very hard and time-consuming
task. Additionally, when these links were analyzed, many of them turned out to
point to the same website, which is another drawback of the search engines. These
searches clearly show how difficult it is to find an effective result with ordinary
search engines.
6-3 Implementing the Ontology with Protégé
After designing the OWL ontology it was a straightforward process to construct
the ontology file with the help of the powerful tools and wizards the Protégé
development environment provides. It allows creating and populating classes and
concepts by simple clicks while providing a visual overview of the created classes
and instances with class hierarchies and instance diagrams.
Creating properties and relations (objectProperties, dataProperties) is done
similarly by defining the name of the property and specifying the domain-range
relation of the constituent elements/definitions.
While implementing the ontology, the reasoning services provided by the Racer
reasoning system have been used continuously to assist and guide the
implementation process. They have mostly been used to check ontology consistency
while adding constraints and assertions for the different classes and relations being
created. The Protégé development environment is illustrated in Figure-20.
Figure - 20: Protégé Ontology Editor
CHAPTER 7
CONCLUSION
The aim of this thesis was to develop a querying system for recipes using a new
technology called the Semantic Web. The Semantic Web element in this work is the
knowledge base, and the contents of the knowledge base comprise not only the
system data but also the data schema (i.e. metadata). The knowledge base is available
through the Internet and therefore accessible in an OWL format by other systems and
agents that will be able to understand and process its contents. It is also possible to
augment this knowledge base with some complex description logics expressing
additional constraints to the data, which is normally not possible when working with
standard database systems.
In this application the important aspects of the Semantic Web have been
implemented to represent the knowledge domain. The relations between the different
pieces of information available and their properties have been represented in the
knowledge base to make it possible to use the available information in a machine-
consumable manner and process it via standard reasoning systems such as Racer.
The application has been designed in three independent parts: the Semantic Web
ontology, ontology processing server and the user interface to the system. The
knowledge represented in the Web ontology (knowledge base) has been defined so
that it is possible to make reasoning over the information being processed. To
achieve this ability, the logic descriptors available in OWL have been utilized. Such
descriptions and relation definitions could not have been implemented with an
ordinary relational database.
The ontology server of the system makes use of advanced tools and techniques
from the world of the Semantic Web. The Ontology Management System has been
incorporated into the server part, where the main ontology processing is performed.
Querying and reasoning on the Semantic Web are performed successfully by the
system.
The Web-based user interface makes it possible for the user to interact with the
ontology processing unit in a human-understandable way. Users can construct
queries and search for food recipes as they are used to doing on normal Web pages.
They do not have to construct RDF queries; instead, the processing unit of the server
converts their input into actual RDF queries, so the user does not have to worry
about how exactly the information is extracted.
The Semantic Web is the future of the Web. It is already being used by some
industries, such as the medical industry, that deal with complex data sets. As
different technologies and tools are developed and experience accumulates, I
strongly believe that the Semantic Web will become the main technology for dealing
with complex data sets and information related to other concepts in complex ways.
All the tools and languages that have been used in this thesis are relatively new and
still under development. For example, OWL only became a W3C Recommendation
on February 10, 2004, and Protégé - the ontology editor - still has many bugs. As the
availability of different technologies increases, I believe that migration to the
Semantic Web will speed up and that it will become the main approach to
representing and processing information.
Several extensions and improvements can be added to the application. First of all,
the information domain could have been described with richer properties and
relations. For example, when creating a representational model for a food recipe,
additional properties could have been included in the model, such as the origin of the
food, vitamins contained, quantities of the ingredients, type of food, and medical
properties of each ingredient (e.g., which diseases it is good for). Besides these
extra properties, an image of the food could also be added to the ontology model.
Additionally, in this thesis, when the user makes a search and the result is null, a
new search is done automatically by taking out one of the ingredients that the user
entered, and this process repeats until one or more results are returned. If this is not
enough, new searches are done automatically again by ignoring properties such as
calorie, difficulty level, preparation time, food type and vegetarian. As an extension,
before doing these new searches automatically, the user could be asked some
questions in order to make his/her own selection.
One of the most important extensions that can be made to this application is
Natural Language Processing. Currently, the query-construction form provided on
the search page of the Web interface only offers the restrictions in selection boxes.
In the Ingredients field, only ingredient elements are supposed to be added; the
server interprets each element as an ingredient. With natural language processing it
would be possible to create queries based on natural language instead of using a
form with different elements. For example, it could be possible to enter a sentence
like “a soup with tomato and onion” and construct ontology queries out of the
elements exposed by the sentence. In the given example, the returned results would
be instances that are a “soup” and contain “tomato” and “onion” as ingredients.
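As a toy illustration of this extension, a first step could be simple keyword spotting against vocabularies drawn from the ontology. The category and ingredient word lists below are placeholders invented for the sketch; in the application they would come from the OWL model itself, and real natural language processing would of course go far beyond token matching.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy sketch of the proposed natural-language extension: spot known
// category and ingredient words in a sentence and turn them into query
// elements. The vocabularies here are placeholders.
public class NaiveQueryParser {
    private static final Set<String> CATEGORIES =
        Set.of("soup", "salad", "pilaf");
    private static final Set<String> INGREDIENTS =
        Set.of("tomato", "onion", "rice", "chicken");

    public static List<String> extractCategories(String sentence) {
        return matches(sentence, CATEGORIES);
    }

    public static List<String> extractIngredients(String sentence) {
        return matches(sentence, INGREDIENTS);
    }

    // Tokenize on non-word characters and keep vocabulary hits in order,
    // without duplicates.
    private static List<String> matches(String sentence, Set<String> vocabulary) {
        List<String> found = new ArrayList<>();
        for (String token : sentence.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (vocabulary.contains(token) && !found.contains(token)) {
                found.add(token);
            }
        }
        return found;
    }
}
```

For “a soup with tomato and onion”, the extracted category and ingredients would then be handed to the existing query-construction code instead of form fields.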
The concept of adding semantic properties to information promises advanced and
sophisticated applications when being combined with other traditional Artificial
Intelligence fields such as Natural Language Processing. I believe that computer
systems which can process and understand sentences or information intended for
human consumption will have a revolutionary effect on our relationship with
computers and the way we make use of the computers and the Internet in general.
APPENDIX A
Ontology Editor Survey Results
The following table shows the ontology editor survey results, comparing only the
editors that are explained in the context of this thesis.
This table is taken from:
http://xml.com/2002/11/06/Ontology_Editor_Survey.html
Copyright © 2002 Michael Denny
NOTE: We frequently elected to retain the words of the software provider in these
tool descriptions. Consequently, the following alternative terms may be used with
roughly the same meaning:
Concept: class, category, type, term, entity, set and thing.
Instance: individual, resource, extension, description, object and entity.
Relation: relationship, property, function, role, slot, attribute, association,
criterion, constraint of, feature and predicate.
1
Tool Version
Release Date
Source Modeling Features/Limitat
ions
Web Support &
{Use}
Import/Export
Formats
Graph View Consistency Checks
Multi-user
Support
Merging
Lexical Support
Comments More Information
The software tool for editing ontologies
Identifier of latest software release
The date latest version became available
The organization producing or supplying the software tool
The representational and logical qualities that can be expressed in the built ontology
Support for Web-compliant ontologies (e.g., URIs), and {use of the software over the Web (e.g., browser client)}
Other languages the built ontology can be serialized in
The extent to which the built ontology can be created, debugged, edited and/or compared directly in graphic form
The degree to which the syntactic, referential and/or logical correctness of the ontology can be verified automatically
Features that allow and facilitate concurrent development of the built ontology
Support for easily comparing and merging independent built ontologies
Capabilities for lexical referencing of ontology elements (e.g., synonyms) and processing lexical content (e.g., searching/filtering ontology terms)
Pertinent information about methodology, availability and support, additional features, etc.
Product or project Web site
OilEd 3.4 4/12/02 University of Manchester Information Management Group
DAML constraint axioms; same-class-as; limited XML Schema datatypes; creation metadata; allows arbitrary expressions as fillers and in constraint axioms; explicit use of quantifiers; one-of lists of individuals; no hierarchical property view.
RDF URIs; limited namespaces; very limited XML Schema
RDFS; SHIQ
Browsing Graphviz files of class subsumption only.
Subsumption and satisfiability (FaCT)
No No Limited synonyms
None http://oiled.man.ac.uk/
OntoEdit 2.5.2 8/6/02 Ontoprise GmbH
F-Logic axioms on classes and relations; algebraic properties of relations; creation of metadata; limited DAML property constraints and datatypes; no class combinations, equivalent instances.
Resource URIs
RDFS; F-Logic; DAML+OIL (limited); RDB
Yes, via plug-in
Yes, via OntoBroker
Transaction locking at the object and whole subtree levels.
Yes Multiple lexicons via plug-in
Free and commercial (Professional) versions are available, with continuing development of the commercial version.
http://www.ontoprise.de/com/ontoedit.htm
87
2
Tool Version
Release Date
Source Modeling Features/Limitat
ions
Web Support &
{Use}
Import/Export
Formats
Graph View Consistency Checks
Multi-user
Support
Merging
Lexical Support
Comments More Information
The software tool for editing ontologies
Identifier of latest software release
The date latest version became available
The organization producing or supplying the software tool
The representational and logical qualities that can be expressed in the built ontology
Support for Web-compliant ontologies (e.g., URIs), and {use of the software over the Web (e.g., browser client)}
Other languages the built ontology can be serialized in
The extent to which the built ontology can be created, debugged, edited and/or compared directly in graphic form
The degree to which the syntactic, referential and/or logical correctness of the ontology can be verified automatically
Features that allow and facilitate concurrent development of the built ontology
Support for easily comparing and merging independent built ontologies
Capabilities for lexical referencing of ontology elements (e.g., synonyms) and processing lexical content (e.g., searching/filtering ontology terms)
Pertinent information about methodology, availability and support, additional features, etc.
Product or project Web site
Ontolingua with Chimaera
1.0.649; 0.1.42
11/214/01; 7/24/02
Stanford Knowledge Systems Lab
OKBC model with full KIF axioms.
{Web access to service.}
Import & Export: DAML+OIL; KIF; OKBC; Loom; Prolog; Ontolingua; CLIPS. Import only: Classic; Ocelot; Protégé.
No Elaborate with Chimaera; Theorem proving (via JTP)
Write-only locking; user access levels.
Semi-automated via Chimaera
Search for terms in all loaded ontologies.
Online service only (at http://www-ksl-svc.stanford.edu); Chimaera is being enhanced under DARPA funding in 2002.
http://www.ksl.stanford.edu/software/ontolingua/ http://xml.com/www.ksl.stanford.edu/software/chimaera
Protégé-2000
1.7; 1.8 beta
4/10/02 ; 10/22/02
Stanford Medical Informatics
Multiple inheritance concept and relation hierarchies (but single class for instance); meta-classes; instances specification support; constraint axioms ala Prolog, F-Logic, OIL and general axiom language (PAL) via plug-ins.
Limited namespaces; {can run as applet; access through servlets}
RDF(S); XML Schema; RDB schema via Data Genie plug-in; (DAML+OIL backend due 4Q'02 from SRI)
Browsing classes & global properties via GraphViz plug-in; nested graph views with editing via Jambalaya plug-in.
Plug-ins for adding & checking constraint axioms: PAL; FaCT.
No, but features under development.
Semi-automated via Anchor-PROMPT.
WordNet plug-in; wildcard string matching (API only).
Support for CommonKADS methodology.
http://protege.stanford.edu/index.html
88
3
Tool Version
Release Date
Source Modeling Features/Limitat
ions
Web Support &
{Use}
Import/Export
Formats
Graph View Consistency Checks
Multi-user
Support
Merging
Lexical Support
Comments More Information
The software tool for editing ontologies
Identifier of latest software release
The date latest version became available
The organization producing or supplying the software tool
The representational and logical qualities that can be expressed in the built ontology
Support for Web-compliant ontologies (e.g., URIs), and {use of the software over the Web (e.g., browser client)}
Other languages the built ontology can be serialized in
The extent to which the built ontology can be created, debugged, edited and/or compared directly in graphic form
The degree to which the syntactic, referential and/or logical correctness of the ontology can be verified automatically
Features that allow and facilitate concurrent development of the built ontology
Support for easily comparing and merging independent built ontologies
Capabilities for lexical referencing of ontology elements (e.g., synonyms) and processing lexical content (e.g., searching/filtering ontology terms)
Pertinent information about methodology, availability and support, additional features, etc.
Product or project Web site
WebODE 2.0.6 7/10/02 Technical University of Madrid UPM
Concepts (class and instance), attributes and relations of taxonomies; disjoint and exhaustive class partitions; part-of and ad-hoc binary relations; properties of relations; constants; axioms; and multiple inheritance. Inference engine for subset of OKBC primitives and axioms.
URIs as imported terms; {browser client}
DAML+OIL; RDFS; X-CARIN; FLogic; Prolog; XML
Native graph view with editing of classes, relations, partitions, meta-properties, etc.
Type and cardinality constraints; disjoint classes and loops, taxonomy style (OntoClean), etc.
Yes, with synchronization; authentication and access restrictions per user groups.
Unsupervised (ODEMerge methodology) using synonym and hyperonym tables; custom dictionaries and merging rules.
Synonyms and abbreviations; (EuroWordNet support under development)
Supports Methontology methodology (Fern‡ndez-L—pez et al, 1999); offered as online service; successor to ODE; ontology storage in RDB.
http://delicias.dia.fi.upm.es/webODE/
WebOnto
2.3 5/1/02 Knowledge Media Institute of Open University (UK )
Multiple inheritance and exact coverings; meta-classes; class level support for prolog-like inference.
{Web service deployment site}
Import: RDF; Export: RDFS, GXL, Ontolingua, OIL
Native graph view of class relationships.
For OCML code
Global write-only locking with change notification.
No; No; Online service only.
http://kmi.open.ac.uk/projects/webonto/
APPENDIX B
The following queries (Java code) show how the knowledge base can be queried for
the required information.
Sample query 1:
The query below returns all elements (matched by X1) of type
someIngredientType, i.e., it fetches the ingredients of type "someIngredientType".
// Variable definition
RDFVariable X1 = model.createRDFVariable("?X1");
// Execute the query against the OWL model
queryStatement = model.createRDFStatement(
    X1,
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    "http://localhost/localOntologies/foodReceipts.owl#someIngredientType");
Sample query 2:
Variable X1 will match the value of the HasDifficultyLevel property for the
instance "someFoodInstance".
RDFVariable X1 = model.createRDFVariable("?X1");
queryStatement = model.createRDFStatement(
    "http://localhost/localOntologies/foodReceipts.owl#someFoodInstance",
    "http://localhost/localOntologies/foodReceipts.owl#HasDifficultyLevel",
    X1);
Sample query 3:
Variable X1 will match the value of the HasCalory property for the instance
"someFoodInstance".
RDFVariable X1 = model.createRDFVariable("?X1");
queryStatement = model.createRDFStatement(
    "http://localhost/localOntologies/foodReceipts.owl#someFoodInstance",
    "http://localhost/localOntologies/foodReceipts.owl#HasCalory",
    X1);
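The createRDFStatement calls above build triple patterns in which the RDFVariable acts as a wildcard slot. As a self-contained illustration of this pattern-matching behaviour, the sketch below mimics it in plain Java; it is not the RDF API used in the thesis, and all names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of the triple-pattern matching behind the sample queries.
// A pattern slot holding null plays the role of the ?X1 variable.
public class TripleMatchSketch {
    static final String NS = "http://localhost/localOntologies/foodReceipts.owl#";
    static final String TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

    public record Triple(String s, String p, String o) {}

    // Return every value the open (null) position would be bound to,
    // given fixed values for the other positions.
    public static List<String> match(List<Triple> kb, String s, String p, String o) {
        List<String> bindings = new ArrayList<>();
        for (Triple t : kb) {
            if ((s == null || t.s().equals(s))
                    && (p == null || t.p().equals(p))
                    && (o == null || t.o().equals(o))) {
                // Collect the component that was left open (the "variable").
                bindings.add(s == null ? t.s() : (o == null ? t.o() : t.p()));
            }
        }
        return bindings;
    }

    public static void main(String[] args) {
        List<Triple> kb = List.of(
                new Triple(NS + "kiyma", TYPE, NS + "Meat"),
                new Triple(NS + "havuc", TYPE, NS + "Vegetables"),
                new Triple(NS + "analiKizli", NS + "HasDifficultyLevel", NS + "Normal"));
        // Analogue of sample query 1: ?X1 rdf:type Meat
        System.out.println(match(kb, null, TYPE, NS + "Meat"));
        // Analogue of sample query 2: analiKizli HasDifficultyLevel ?X1
        System.out.println(match(kb, NS + "analiKizli", NS + "HasDifficultyLevel", null));
    }
}
```

Sample query 3 follows the same shape as query 2, with HasCalory in the property position.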
APPENDIX C
Part of the OWL model that is automatically generated from the source code of
the Semantic Web Application: Ontology-driven Recipe Querying. It is also
available at the following link: http://localhost/localOntologies/foodReceipts.owl
<?xml version="1.0" ?>
<rdf:RDF xmlns="http://localhost/localOntologies/foodReceipts.owl#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xml:base="http://localhost/localOntologies/foodReceipts.owl">
<owl:Ontology rdf:about="" />
<owl:Class rdf:ID="Ingredients">
<rdfs:subClassOf>
<owl:Restriction>
<owl:someValuesFrom>
<owl:Class rdf:ID="Food" />
</owl:someValuesFrom>
<owl:onProperty>
<owl:ObjectProperty rdf:ID="IsIngredientOf" />
</owl:onProperty>
</owl:Restriction>
</rdfs:subClassOf>
<rdfs:subClassOf rdf:resource="http://www.w3.org/2002/07/owl#Thing" />
<owl:disjointWith>
<owl:Class rdf:ID="DifficultyLevel" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:ID="PreparationTime" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:about="#Food" />
</owl:disjointWith>
</owl:Class>
<owl:Class rdf:about="#DifficultyLevel">
<owl:equivalentClass>
<owl:Class>
<owl:oneOf rdf:parseType="Collection">
<DifficultyLevel rdf:ID="Easy" />
<DifficultyLevel rdf:ID="Normal" />
<DifficultyLevel rdf:ID="Difficult" />
</owl:oneOf>
</owl:Class>
</owl:equivalentClass>
<owl:disjointWith rdf:resource="#Ingredients" />
<owl:disjointWith>
<owl:Class rdf:about="#Food" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:about="#PreparationTime" />
</owl:disjointWith>
</owl:Class>
<owl:Class rdf:about="#PreparationTime">
<owl:disjointWith rdf:resource="#DifficultyLevel" />
<owl:disjointWith rdf:resource="#Ingredients" />
<owl:equivalentClass>
<owl:Class>
<owl:oneOf rdf:parseType="Collection">
<PreparationTime rdf:ID="min-10" />
<PreparationTime rdf:ID="min-15" />
<PreparationTime rdf:ID="min-20" />
<PreparationTime rdf:ID="min-25" />
<PreparationTime rdf:ID="min-30" />
<PreparationTime rdf:ID="min-35" />
<PreparationTime rdf:ID="min-40" />
<PreparationTime rdf:ID="min-45" />
<PreparationTime rdf:ID="min-50" />
<PreparationTime rdf:ID="min-55" />
<PreparationTime rdf:ID="min-60" />
<PreparationTime rdf:ID="min-65" />
<PreparationTime rdf:ID="min-70" />
<PreparationTime rdf:ID="min-75" />
<PreparationTime rdf:ID="min-80" />
<PreparationTime rdf:ID="min-85" />
<PreparationTime rdf:ID="min-90" />
</owl:oneOf>
</owl:Class>
</owl:equivalentClass>
<owl:disjointWith>
<owl:Class rdf:about="#Food" />
</owl:disjointWith>
</owl:Class>
<owl:Class rdf:ID="Etler">
<owl:disjointWith>
<owl:Class rdf:ID="SiviYag" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:ID="Bakliyat" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:ID="KatiYag" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:ID="Baharat" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:ID="Diger" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:ID="Un" />
</owl:disjointWith>
...
<owl:Class rdf:ID="NonVegetarianFood">
<owl:equivalentClass>
<owl:Class>
<owl:intersectionOf rdf:parseType="Collection">
<owl:Restriction>
<owl:someValuesFrom rdf:resource="#Etler" />
<owl:onProperty>
<owl:ObjectProperty rdf:ID="HasIngredient" />
</owl:onProperty>
</owl:Restriction>
<owl:Class rdf:about="#Food" />
</owl:intersectionOf>
</owl:Class>
</owl:equivalentClass>
</owl:Class>
<owl:Class rdf:about="#SiviYag">
<owl:disjointWith>
<owl:Class rdf:about="#Sebzeler" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:about="#Baharat" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:about="#KatiYag" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:about="#SutUrunleri" />
</owl:disjointWith>
<owl:disjointWith rdf:resource="#Etler" />
<owl:disjointWith>
<owl:Class rdf:about="#Bakliyat" />
</owl:disjointWith>
<rdfs:subClassOf rdf:resource="#Ingredients" />
<rdfs:label>Sıvı Yağ</rdfs:label>
<owl:disjointWith>
<owl:Class rdf:about="#Diger" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:about="#Un" />
</owl:disjointWith>
</owl:Class>
<owl:Class rdf:about="#Kebap">
<owl:disjointWith>
<owl:Class rdf:about="#Kofte" />
</owl:disjointWith>
<rdfs:label>Kebaplar</rdfs:label>
<owl:disjointWith rdf:resource="#Tatli" />
<owl:disjointWith>
<owl:Class rdf:about="#HamurIsi" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:about="#Dolma" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:about="#Corba" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:about="#Pilav" />
</owl:disjointWith>
<owl:disjointWith rdf:resource="#Sos" />
<owl:disjointWith>
<owl:Class rdf:about="#TavukYemegi" />
</owl:disjointWith>
</owl:Class>
<owl:Class rdf:about="#Dolma">
<rdfs:subClassOf>
<owl:Class rdf:about="#Food" />
</rdfs:subClassOf>
<owl:disjointWith>
<owl:Class rdf:about="#SebzeYemegi" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:about="#Corba" />
</owl:disjointWith>
<owl:disjointWith rdf:resource="#Tatli" />
<owl:disjointWith>
<owl:Class rdf:about="#Kofte" />
</owl:disjointWith>
<owl:disjointWith rdf:resource="#Kebap" />
<rdfs:label>Dolmalar</rdfs:label>
<owl:disjointWith>
<owl:Class rdf:about="#Pilav" />
</owl:disjointWith>
<owl:disjointWith rdf:resource="#Sos" />
</owl:Class>
<owl:Class rdf:about="#Sebzeler">
<rdfs:label>Sebzeler</rdfs:label>
<owl:disjointWith>
<owl:Class rdf:about="#Bakliyat" />
</owl:disjointWith>
<rdfs:subClassOf rdf:resource="#Ingredients" />
<owl:disjointWith>
<owl:Class rdf:about="#SutUrunleri" />
</owl:disjointWith>
<owl:disjointWith rdf:resource="#SiviYag" />
<owl:disjointWith rdf:resource="#Etler" />
<owl:disjointWith>
<owl:Class rdf:about="#KatiYag" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:about="#Baharat" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:about="#Un" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:about="#Diger" />
</owl:disjointWith>
</owl:Class>
<owl:Class rdf:about="#Kofte">
<owl:disjointWith>
<owl:Class rdf:about="#Corba" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:about="#Pilav" />
</owl:disjointWith>
<owl:disjointWith>
<owl:Class rdf:about="#EtYemegi" />
</owl:disjointWith>
<owl:disjointWith rdf:resource="#Sos" />
<rdfs:subClassOf>
<owl:Class rdf:about="#Food" />
</rdfs:subClassOf>
<owl:disjointWith>
<owl:Class rdf:about="#SebzeYemegi" />
</owl:disjointWith>
</owl:Class>
...
<rdfs:subClassOf>
<owl:Class rdf:about="#Food" />
</rdfs:subClassOf>
<owl:disjointWith rdf:resource="#Kofte" />
<owl:disjointWith rdf:resource="#Tatli" />
</owl:Class>
<owl:Class rdf:about="#Pilav">
<owl:disjointWith rdf:resource="#EtYemegi" />
<owl:disjointWith rdf:resource="#Kebap" />
<owl:disjointWith rdf:resource="#Dolma" />
<owl:disjointWith rdf:resource="#Sos" />
<owl:disjointWith rdf:resource="#Kofte" />
<owl:disjointWith rdf:resource="#Tatli" />
</owl:Class>
<owl:Class rdf:about="#KatiYag">
<owl:disjointWith rdf:resource="#SutUrunleri" />
<owl:disjointWith rdf:resource="#Un" />
<owl:disjointWith rdf:resource="#Etler" />
<owl:disjointWith rdf:resource="#SiviYag" />
<rdfs:label>Katı Yağ</rdfs:label>
<rdfs:subClassOf rdf:resource="#Ingredients" />
<owl:disjointWith>
<owl:Class rdf:about="#Bakliyat" />
</owl:disjointWith>
<owl:disjointWith rdf:resource="#Diger" />
<owl:disjointWith>
<owl:Class rdf:about="#Baharat" />
</owl:disjointWith>
<owl:disjointWith rdf:resource="#Sebzeler" />
</owl:Class>
...
<BR xmlns="http://localhost/localOntologies/foodReceipts.owl#" />
1 çay kaşığı dövülmüş pul biber
<BR xmlns="http://localhost/localOntologies/foodReceipts.owl#" />
1 yemek kaşığı tereyağ
<BR xmlns="http://localhost/localOntologies/foodReceipts.owl#" />
2 yemek kaşığı sıvıyağ
<BR xmlns="http://localhost/localOntologies/foodReceipts.owl#" />
Bir tutam maydanoz
<BR xmlns="http://localhost/localOntologies/foodReceipts.owl#" />
<BR xmlns="http://localhost/localOntologies/foodReceipts.owl#" />
<B xmlns="http://localhost/localOntologies/foodReceipts.owl#">TARİF</B>
<BR xmlns="http://localhost/localOntologies/foodReceipts.owl#" />
Haşlanmış patatesler ile havuç sıcak sıcak soyulur ve salata yapar gibi doğranır. Tuzlanarak derin bir servis tabağına yerleştirilir. Yoğurt, mayonez ve tuzla dövülmüş sarımsak iyice karıştırılır. Bu karışıma nane ve pul biber eklenerek patates ve havuçların üzerine dökülür, fakat karıştırılmaz. Diğer yanda eritilen yağın içine sıvıyağ ve salça konarak pişirilir. Yoğurtlu karışımın üzerine şekilli olarak kaşıkla dökülür. Maydanozla süslenip hemen servis yapılır.
<BR xmlns="http://localhost/localOntologies/foodReceipts.owl#" />
<BR xmlns="http://localhost/localOntologies/foodReceipts.owl#" />
<B xmlns="http://localhost/localOntologies/foodReceipts.owl#">AFİYET OLSUN !</B>
</receipt>
<HasIngredient>
<KatiYag rdf:ID="tereyagi">
<rdfs:label>Tereyağı tereyağ</rdfs:label>
<IsIngredientOf>
<Kofte rdf:ID="analiKizli">
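The enumerated classes in the listing above (e.g. DifficultyLevel, defined through owl:oneOf) can be inspected without an RDF toolkit: the JDK's namespace-aware DOM parser is enough to pull out the member IDs. The sketch below is hypothetical and works over a reduced, inlined fragment of the generated file; it is not part of the thesis implementation.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Extracts the rdf:ID of every DifficultyLevel member from an OWL/RDF fragment.
public class OneOfReader {
    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String ONT = "http://localhost/localOntologies/foodReceipts.owl#";

    // Reduced fragment of the generated file, inlined for a self-contained demo.
    public static final String SAMPLE =
        "<rdf:RDF xmlns:rdf=\"" + RDF + "\" xmlns=\"" + ONT + "\">"
        + "<DifficultyLevel rdf:ID=\"Easy\"/>"
        + "<DifficultyLevel rdf:ID=\"Normal\"/>"
        + "<DifficultyLevel rdf:ID=\"Difficult\"/>"
        + "</rdf:RDF>";

    public static List<String> memberIds(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true); // required to resolve the rdf: prefix
            Document doc = f.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            // Elements in the file's default namespace, e.g. <DifficultyLevel ...>
            NodeList levels = doc.getElementsByTagNameNS(ONT, "DifficultyLevel");
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < levels.getLength(); i++) {
                ids.add(((Element) levels.item(i)).getAttributeNS(RDF, "ID"));
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalStateException("unparsable fragment", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(memberIds(SAMPLE)); // [Easy, Normal, Difficult]
    }
}
```

The same loop, pointed at the PreparationTime enumeration, would recover the min-10 through min-90 members.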
APPENDIX D
Class Hierarchy for foodReceipts Project
Part of the class hierarchy of the Semantic Web Application: Ontology-driven
Recipe Querying project, automatically generated as an HTML file by the Protégé
ontology editor.
• owl:Thing
o ExternalResource
o Food
▪ Corba
Instances : AndalozCorbasi
▪ Dolma
▪ EtYemegi
▪ HamurIsi
▪ Kebap
▪ Kofte
Instances : dalyanKofte
▪ Pilav
▪ Salata
▪ SebzeYemegi
▪ Sos
▪ Tatli
▪ TavukYemegi
o Ingredients
▪ Butter
Instances : margarin
▪ Corn
Instances : pirinc
▪ Flavour
Instances : un
▪ Meat
Instances : kiyma
▪ Other
Instances : salca, yumurta
▪ Spice
Instances : karabiber, tuz, kirmizibiber
▪ Vegetables
Instances : havuc, maydanoz, domates, bezelye, sogan
o owl:DataRange
Instances : xsd:int
o owl:Nothing
o rdf:List
Instances : rdf:List ()
o rdf:Property
Instances : owl:onProperty, owl:allValuesFrom, owl:hasValue, owl:maxCardinality, owl:minCardinality, owl:cardinality, owl:someValuesFrom, owl:differentFrom, owl:sameAs, rdf:value, rdfs:member, rdf:first, rdf:rest, rdf:object, rdf:predicate, rdf:subject, rdf:type, owl:oneOf
▪ owl:DatatypeProperty
Instances : rdfs:label, owl:versionInfo, rdfs:comment, owl:backwardCompatibleWith, owl:incompatibleWith, owl:priorVersion, receipt
▪ owl:ObjectProperty
Instances : rdfs:isDefinedBy, rdfs:seeAlso, :TO, :FROM, IsIngredientOf, HasIngredient, HasReceipt, IsReceiptOf
o rdf:Statement
o rdfs:Class
Instances : rdfs:Container, rdfs:Alt, rdfs:Bag, rdfs:Seq, rdf:Statement, owl:DataRange
▪ owl:Class
Instances : owl:Class, rdfs:Class, owl:Thing, rdf:Property, owl:DatatypeProperty, owl:ObjectProperty, :DIRECTED-BINARY-RELATION, owl:Ontology, owl:Nothing, rdf:List, rdfs:Literal, Receipt, Flavour, Ingredients, Food, Other, Butter, Spice, Corn, Meat, Vegetables, Corba, Salata, Kebap, SebzeYemegi, Tatli, TavukYemegi, Kofte, Dolma, HamurIsi, Sos, EtYemegi, Pilav
o rdfs:Container
▪ rdfs:Alt
▪ rdfs:Bag
▪ rdfs:Seq
o rdfs:Literal
o Receipt
Instances : MALZEMELER 2 çorba kaşığı haşlanmış pirinç 1 çorba kaşığı margarin 2 çorba kaşığı un 1 çorba kaşığı salça 1 tutam maydanoz tuz karabiber HAZIRLANIŞI Tencereye margarini koyup eritin. Unu ilave ederek bir süre kavurun, salçayı ekleyerek karıştırmaya devam edin. 5 bardak su katıp kaynatın. Kaynayan suyun içine haşlanmış pirinci, tuzu, karabiberi ve kıyılmış maydanozu ilave ederek karıştırın. AFIYET OLSUN!, MALZEMELER 1 kilo az yağlı kıyma 2 adet soğan 2 yumurta Bayatlatılmış yarım ekmek içi Tuz, karabiber 1 kutu bezelye 2 adet havuç 2 adet domates 1 çay fincanı un HAZIRLANIŞI Kıyma, soğan, biber, 2 yumurta, bayat ekmek iyice yoğurulur el yardımıyla 2 cm kalınlığında kare şeklinde açılır. İçine haşlanmış bezelye, 1-2 dakika haşlanıp kabukları soyulmuş ve ufak küpler halinde doğranmış havuç konur ve rulo şeklinde sarılır. Dikdörtgen şeklindeki bir payreks kaba konur ve 170 derecedeki fırında 40 dakika kadar pişirilir. Piştikten sonra altına biriken su alınır. Rendelenmiş domates hafifçe pişirilip üzerine köftenin suyu ve un ilave edilir. Boza kıvamında bir sos elde edilir. Servis yapılacağı zaman köftenin üzerine bu sos dökülür. Arzuya göre haşlanmış yumurta ile garnitür yapılabilir ve sıcak olarak servis edilir. AFIYET OLSUN!
o :DIRECTED-BINARY-RELATION
o :SYSTEM-CLASS
▪ owl:Ontology
Instances : DefaultOntology
▪ :ANNOTATION
▪ :INSTANCE-ANNOTATION
▪ :CONSTRAINT
▪ :PAL-CONSTRAINT
▪ :META-CLASS
▪ :CLASS
▪ :OWL-ALL-DIFFERENT
▪ :OWL-ANONYMOUS-ROOT
▪ HasIngredient Ingredients
▪ Food
▪ Corba
Instances : AndalozCorbasi
▪ Dolma
▪ EtYemegi
▪ HamurIsi
▪ Kebap
▪ Kofte
Instances : dalyanKofte
▪ Pilav
▪ Salata
▪ SebzeYemegi
▪ Sos
▪ Tatli
▪ TavukYemegi
▪ HasReceipt Receipt
▪ Food
▪ Corba
Instances : AndalozCorbasi
▪ Dolma
▪ EtYemegi
▪ HamurIsi
▪ Kebap
▪ Kofte
Instances : dalyanKofte
▪ Pilav
▪ Salata
▪ SebzeYemegi
▪ Sos
▪ Tatli
▪ TavukYemegi
▪ IsIngredientOf Food
▪ Ingredients
▪ Butter