
Computer Networks 42 (2003) 551–556

www.elsevier.com/locate/comnet

1389-1286/03/$ - see front matter © 2003 Elsevier Science B.V. All rights reserved.
doi:10.1016/S1389-1286(03)00222-6

Guest Editorial

The Semantic Web: an evolution for a revolution

The Web is an indisputable success. It has revolutionised the publication and dissemination of information. However, accessing and interpreting that information requires human intervention. Discovering an expert on post-impressionist art, for example, is likely to require combining information spread across several different Web resources covering art collections, artist biographies and art history. A professor who wrote a book on Van Gogh could be inferred to be an expert (as authorship implies expertise) if it is known that Van Gogh is a post-impressionist, even if no mention of this is made on the professor's Web site. A published professor would more likely be considered an expert than a high school student whose essay on Gauguin (another post-impressionist) is also published as a Web page. To search for and link information, a person or some application must interpret the content of a Web resource: the person or application must determine what the content is about. To infer new information that is not explicitly asserted requires reasoning with knowledge that is embedded in the application or in the head of the person reading the Web page. This makes the Web today a place where humans do the processing, or the processing is hard-wired into applications.

The vision of the Semantic Web, as proposed by Tim Berners-Lee et al. [1], is to evolve the current Web into one where information and services are understandable and usable by computers as well as humans: to create a "Web for machines". The automated processing of Web content requires, at its heart, that explicit machine-processable semantics be associated with Web resources as metadata so that the content can be interpreted and combined. The Semantic Web does not replace the Web; it sits on top of the Web as an integrating descriptive fabric. Such an environment forms a platform for search engines, information brokers and, ultimately, "intelligent" agents.

1. The components of the Semantic Web

The Semantic Web makes huge demands on many areas of computing, and in particular requires the confluence of distributed systems, data and knowledge management, and artificial intelligence. Current efforts concentrate on three main areas:

• Specifying the languages that will form the fabric of the Semantic Web;

• Specifying and developing the architectural components and tools forming the infrastructure of the Semantic Web;

• Prototyping applications using the languages and the components, and defining the content necessary.

All of these areas are developing in parallel and yet are interdependent. Consequently, the development has been a maelstrom of research coupled concurrently with standards activity in the W3C, and early experiments and prototypes running alongside commercial developments.

1.1. The language fabric

The first steps have been made in defining the shared languages needed to describe the metadata and the knowledge that will make up the fabric of the Semantic Web. The DARPA Agent Markup Language (DAML) initiative [2] and the EU OntoWeb Thematic Network [3] have been instrumental in laying the foundations; these efforts have been taken up by the World Wide Web Consortium (W3C).

The semantic languages are built upon those already present in the Web (Fig. 1). URIs and namespaces provide a means of identifying resources. XML provides the common syntax for machine-understandable statements, and all the other languages are encoded in XML. A common misconception is that XML tags provide explicit semantics. They do not, as the tags are only labels interpretable by applications or by whatever people choose them to mean. The Resource Description Framework (RDF) family of languages [4] provides a data model for asserting metadata statements that annotate Web resources without requiring write access to those resources. In keeping with the democratic nature of the Web, multiple, and potentially conflicting, statements can be made about the same resource: for example, asserting that Van Gogh is a post-impressionist painter and that Van Gogh painted "The Sunflowers".

Fig. 1. Language layers and technologies.

Shared ontologies supply the vocabulary of terms used by metadata, in order that applications and people share a common language and a common understanding of what the terms mean (their semantics). Ontologies form shared models of knowledge. For example, post-impressionism is a kind of art movement; a painting is a kind of art artefact; artists paint paintings; a painting can have a style. Languages for specifying ontologies include RDF Schema [5], DAML+OIL [6], OWL [7], and Topic Maps [8]. Logical axioms in the DAML+OIL and OWL languages support the inference of new metadata and new knowledge from that explicitly stated; for example, the statement that Van Gogh paints a painting that has the style post-impressionist implies he is a post-impressionist artist. Other languages such as RuleML [9] support other forms of logical deduction.
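The inference example above can be sketched as plain triples plus one hand-written rule. This is an illustrative toy, not the DAML+OIL or OWL semantics; all names (`paints`, `hasStyle`, and the triples themselves) are invented for the example:

```python
# Toy triple store: each statement is a (subject, predicate, object) tuple,
# mirroring the RDF data model described above. Names are illustrative only.
triples = {
    ("VanGogh", "paints", "TheSunflowers"),
    ("TheSunflowers", "hasStyle", "post-impressionist"),
    ("post-impressionism", "isA", "ArtMovement"),
}

def infer_artist_styles(triples):
    """If an artist paints a painting that has style S, infer that the
    artist has style S (the editorial's example rule)."""
    inferred = set()
    for (artist, p1, painting) in triples:
        if p1 != "paints":
            continue
        for (subj, p2, style) in triples:
            if subj == painting and p2 == "hasStyle":
                inferred.add((artist, "hasStyle", style))
    return inferred

print(infer_artist_styles(triples))
# {('VanGogh', 'hasStyle', 'post-impressionist')}
```

The derived triple was never explicitly asserted; it follows from combining two independent statements, which is exactly the kind of processing the Semantic Web aims to automate.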

Proof and trust models and languages are the most embryonic. Proof is the provision of explanation: why was certain knowledge inferred? Trust is an attribution of metadata statements: who made those statements? Assertions about post-impressionism by a professor are more trusted than those of a high school student.
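The idea of trust as attribution can be sketched by carrying the asserter's identity alongside each statement. The sources, statements and weights below are all invented for illustration; real trust models for the Semantic Web remain embryonic, as noted above:

```python
# Each metadata statement is paired with the identity of whoever asserted it.
# Conflicting statements are resolved by preferring the most trusted source.
# All names and scores here are hypothetical.
assertions = [
    (("Gauguin", "hasStyle", "post-impressionist"), "professor"),
    (("Gauguin", "hasStyle", "cubist"), "high_school_essay"),
]

trust = {"professor": 0.9, "high_school_essay": 0.3}  # illustrative weights

def most_trusted(assertions, trust):
    """Return the statement made by the most trusted asserter."""
    return max(assertions, key=lambda a: trust[a[1]])[0]

print(most_trusted(assertions, trust))
# ('Gauguin', 'hasStyle', 'post-impressionist')
```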

1.2. Infrastructure

The minimal components include annotation mechanisms; repositories for annotations and ontologies with associated query and lifecycle management; and inference engines that are resilient, reliable and perform well. Languages for querying RDF annotations need to be defined and implemented. We need tools to acquire metadata and ontologies (manually and automatically), to describe resources with metadata, and to support versioning, update, security, view management and so on. Methods for achieving scalability and robustness need to be developed.
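The querying requirement above can be illustrated with a minimal triple-pattern matcher, in the spirit of the RDF query languages the section calls for. `None` acts as a wildcard; the data and predicate names are invented for the example, and a real query language would add joins, paths and schema awareness:

```python
# Minimal sketch of triple-pattern querying over RDF-style annotations.
# None in any position of the pattern matches anything.
triples = {
    ("VanGogh", "type", "Artist"),
    ("VanGogh", "paints", "TheSunflowers"),
    ("Gauguin", "type", "Artist"),
}

def match(triples, pattern):
    """Return all triples matching a (subject, predicate, object) pattern."""
    return {
        t for t in triples
        if all(p is None or p == v for p, v in zip(pattern, t))
    }

# "Who are the artists?" -> match anything with type Artist.
print(sorted(match(triples, (None, "type", "Artist"))))
# [('Gauguin', 'type', 'Artist'), ('VanGogh', 'type', 'Artist')]
```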

1.3. Applications

The search for "the killer application" for the Semantic Web is a perennial occupation. However, instead of thinking in terms of killer applications we should think in terms of killer functionality. The applications are those that exist already (information search, integration, discovery, exchange) but improved through the provision of semantics. The association of simple metadata with a Web resource, with simple queries over that metadata, would give a small but not insignificant improvement in information integration [10]. More ambitious visions of the Semantic Web propose an environment where software agents are able to dynamically discover, interrogate and interoperate with resources, building and disbanding virtual problem solving environments, discovering new facts, and performing sophisticated tasks on behalf of humans [1]. This revolution will only occur once the evolution of the Web to the Semantic Web reaches a critical mass, as was the case with the Web itself.

Two application areas that have come to the fore are: knowledge management and portals, and Web services.

Web services are Web-accessible programs and devices, the latest generation of distributed computing, that will transform the Web from a collection of information into a distributed device of computation. Web services and the Semantic Web have a symbiotic relationship. The Semantic Web infrastructure of ontology services, metadata annotators, reasoning engines and so on will be delivered as Web services. In turn, Web services need semantics-driven descriptions for discovery, negotiation and composition.

Knowledge management moves the Web from a document-oriented view of information to a knowledge-item view: the idea of the Web as a large knowledge base rather than a document collection. Knowledge portals provide views onto domain-specific information to enable their users to find relevant information. Ontologies and knowledge bases form the backbone of such systems.

2. In this issue

In 2002, the 11th International World Wide Web Conference in Waikiki, Hawaii, USA, had its first dedicated Semantic Web track. Most of the papers in this issue are extended versions of papers originally published in that track. They range over the three areas of fabric, infrastructure and applications.

We start and end with applications. First we have an example of a pragmatic application from the knowledge management area. TAP: A Semantic Web Platform by Guha and McCool describes a Semantic Search application that builds on simple tools that make the Web a giant distributed database (a "Data Web"). Local, independently-managed knowledge bases are aggregated to form selected centres of knowledge useful for particular applications, using a set of protocols and conventions that create a coherent whole out of independently-produced bits of information. The authors emphasise the creation of global agreements on vocabularies and the need for scalable and deployable query systems.


We then have three papers on infrastructure and tools. To make the Semantic Web we must lower the barriers to entry: semantics must emerge without encumbering users. In CREAM: CREAting Metadata for the Semantic Web, Handschuh and Staab present a framework for creating metadata for existing Web pages and for gathering metadata while creating content for new pages. They demonstrate the OntoMat ontology-driven annotation tool for Web resources. Fillies, Wood-Albrecht and Weichhardt reinforce the point that metadata must be easy for users to create, and should be incidentally captured as a by-product of their regular activities if it is to be realistically obtained. In Pragmatic Applications of the Semantic Web using SemTalk they present a graphical editor to create RDF-like schemas and workflows, capturing knowledge as a natural by-product of daily work with Microsoft Office products. The authors show examples from real applications in a Swiss bank, a German health insurance company and a German financial institution. The end result of efficient and effective annotation systems such as CREAM and SemTalk is that large volumes of RDF descriptions are appearing and being stored in repositories. Yet we still lack correspondingly efficient and effective query languages for RDF. In Querying the Semantic Web with RQL, Karvounarakis et al. propose a query language for RDF, and demonstrate its effectiveness over an implementation of an RDF database.

Three papers then take up the Semantic Web services application theme. Florescu, Grünhagen and Kossmann, in XL: An XML Programming Language for Web Service Specification and Composition, continue the language theme by presenting XL, an XML programming language for the implementation of Web services. This work uses the languages at the "bottom" of the language stack. Moving up to the use of ontology languages, in Semantic Web Support for the Business-to-Business E-Commerce Pre-Contractual Lifecycle, Trastour, Bartolini and Preist show how the reasoning capabilities of the Semantic Web language DAML+OIL can be used to infer matchmaking possibilities between choices of services, and to support negotiation and service level agreements. Finally, moving further up the stack, Narayanan and McIlraith in Analysis and Simulation of Web Services use the DAML-S DAML+OIL ontology to describe the capabilities of Web services, but go one step further by encoding service descriptions in a Petri net formalism and by providing decision procedures for Web service simulation, verification and composition. This shows the different kinds of inference systems needed by Web services and the Semantic Web.

3. The first steps on the road

The evolution of the Web towards the revolutionary possibilities of the Semantic Web has barely begun. Despite the range of disciplines and technologies needed to achieve the Semantic Web, the AI community appears the most engaged. There are many prospects for AI techniques: Semantic Web services are recast as variations of agent negotiation, planning and rule-based systems; machine learning is used for emergent metadata and ontologies; ontology development, evolution and merging draw upon knowledge acquisition and knowledge representation. However, viewing the Web as just an application of current AI technologies that happen to be at hand is a mistake. This is an interdisciplinary research challenge, encompassing distributed computing, databases, digital libraries, hypermedia, and so on.

The Web was successful because it scaled, by challenging assumptions on link consistency and completeness, and because it was simple. To bring Semantic Web technologies to the Web means making a similar stand. What are the challenges?

• The Web is vast, so solutions have to scale. Reasoning engines must perform quickly and robustly.

• The Web is here: we have a legacy, so we will have a mixed environment where some resources are "semantic" and some are just "Web". We must have a clear and achievable migration path from non-semantic to semantic.

• The Web is democratic: all are knowledge acquisition experts and all are knowledge modellers. The barriers of admission must be low enough for most users to participate to the degree that is appropriate for them.

• The Web grows from the bottom. Most people wrote their first HTML by editing a third party's. The Semantic Web will arise from fragments of metadata and ontologies being copied in a similar way. For example, new concepts for ontologies will be produced "just in time" by annotators; and rather than a few large, complex, consistent ontologies shared by many users, there will be many small ontological components.

• The Web is volatile and changeable: resources appear and disappear, resources change, and ontologies change. What if a piece of metadata is grounded on a term in an ontology that no longer exists?

• The Web is dirty: there is no way to ensure consistency or to tell whether information is trustworthy, and provenance is unknown. However, tolerance of error does not necessarily mean one should be oblivious to it.

• The Web is heterogeneous: no one solution or technology will be adopted; no one ontology will prevail; no one set of metadata will apply to a resource. Agreements are difficult, and mappings and translations will be commonplace.

To achieve scalability of technologies and management we should recognise that there will not be a Semantic Web; there will be many semantic webs. High-end semantic web applications will be comparatively few, frequently within intranets, forming islands of quality semantic webs for specific communities, with high quality annotation, large and good quality ontologies, and sound and complete reasoning. These islands will float in a sea of low-grade and volatile metadata, with ontologies of variable quality and doubtful longevity and provenance.

It is not yet clear how current applications will adapt and what new ones will appear when semantics are prevalent and can be assumed to exist. Some spin-offs have already borne fruit; for example, the molecular biology community has adopted DAML+OIL as an interchange language for their ontologies, independent of its role in the Semantic Web [11]. Every step on the way to the full vision is itself a revolution, and the evolutionary journey itself is stimulating and worthwhile. For more information go to http://www.semanticweb.org/.

Acknowledgements

I would like to thank all the authors and the paper reviewers for their efforts in helping me produce such a high quality issue. I should also like to thank the reviewers from the WWW2002 conference for their efforts at the beginning. Finally, I would like to thank Raphael Volz for his comments on this editorial and Harry Rudin for his support throughout the editing process.

References

[1] T. Berners-Lee, J. Hendler, O. Lassila, The Semantic Web, Scientific American, May 2001.
[2] DARPA Agent Markup Language initiative. Available at <http://www.daml.org>.
[3] The EU OntoWeb Thematic Network. Available at <http://www.ontoweb.org>.
[4] The Resource Description Framework. Available at <http://www.w3.org/RDF/>.
[5] RDF Schema. Available at <http://www.w3.org/TR/rdf-schema/>.
[6] I. Horrocks, DAML+OIL: a reasonable web ontology language, in: Proceedings of EDBT 2002, March 2002.
[7] The Web Ontology Language OWL. Available at <http://www.w3.org/2001/sw/WebOnt/>.
[8] J. Park, S. Hunting, D.C. Engelbart, XML Topic Maps: Creating and Using Topic Maps for the Web, first ed., Addison-Wesley Professional, Reading, MA, 2002.
[9] RuleML, The Rule Markup Initiative. Available at <http://www.dfki.uni-kl.de/ruleml/>.
[10] B. McBride, Four steps towards the widespread adoption of a semantic web, in: Proceedings of the 1st International Semantic Web Conference (ISWC2002), June 2002, Lecture Notes in Computer Science 2342, Springer, Berlin, 2002, pp. 419–422.
[11] Global Open Biological Ontologies. Available at <http://www.geneontology.org/doc/gobo.html>.

Carole Goble
Department of Computer Science
University of Manchester
Oxford Road
M13 9PL Manchester, UK
E-mail address: [email protected]


Carole Goble is a Professor in the Department of Computer Science in the University of Manchester. Her research interests are centred on the accessibility of information, primarily through the use of ontologies for the representation and classification of metadata. She works in many application areas, and in particular Life Sciences. The Information Management Group that she co-leads is renowned for its work on ontology languages (OIL, DAML+OIL, OWL), reasoning systems (FaCT) and their practical application to real problems. Her work on the application of ontologies to biology and bioinformatics has been especially influential. She currently has a leading role in two major international initiatives: the Semantic Web and the Grid. She has combined these into the Semantic Grid, co-chairing the Semantic Grid Research Group in the Global Grid Forum standards organisation and directing a major UK BioGrid research pilot, myGrid. She chaired the first Semantic Web track of the World Wide Web Conference in 2002, on which this special issue is based. She serves on many boards and programme committees including the OntoWeb Thematic Network executive management board, the international Semantic Web Science Association and the EU/NSF joint ad hoc committee on Semantic Web Services. She is an Editor-in-Chief of the new Elsevier Journal of Web Semantics and is a founder of a start-up company, Network Inference, specialising in technologies for the Semantic Web.