This presentation provides a top-down introduction to semantics and Web 3.0. It is intended for the busy executive or developer who wants to understand quickly why this new technological wave is relevant. For a “one slide presentation”, see the first slide only. For a general introduction, see only the slides of the first section. The following slides cover semantic technologies, architectures and applications.
1
Semantics and Web 3.0
Alberto Ciaramella Intellisemantic
http://www.intellisemantic.com
May 2009
2
This presentation
This presentation provides a top-down introduction to semantics and Web 3.0
It is intended for the busy executive or developer who wants to understand quickly why this new technological wave is relevant
For a “one slide presentation”, see the first slide only
For a general introduction, see only the slides of the first section
3
An overview
Semantics in a slide
Web 1.0, Web 2.0, Web 3.0
Architectures and Applications
Knowledge Representations
4
Semantics in a slide
Semantics adds meaning to data, so that it can be understood not only by people but also by machines, which can then better support people; hence the slogan: “let the machine do most of the work”
To achieve that, semantic solutions integrate knowledge data bases into the usual information architectures, simulating some kind of intelligence; hence the slogan: “a semantic application is an application with common sense”
Even the simplest semantic extensions provide a clearly perceivable benefit to the end user; hence the slogan: “a little semantics goes a long way”
5
Short history
The “semantic web” vision was conceived in the ’90s
The Scientific American paper of May 2001 by T. Berners-Lee, J. Hendler and O. Lassila disseminated the idea to a larger public, although most of the visions of this paper have yet to be achieved
In these years W3C carried out a significant standardization activity, so it is now possible to rely on a solid and agreed background
The first adoption cases were in enterprise applications
  they are deployed in more controlled environments
  the general public may not be fully aware of them
  this is the most mature area today
Some interesting cases are now beginning to appear on the web, mostly as betas, and they are also named “Web 3.0”
  This term suggests that semantic applications are more feature-rich than Web 2.0 solutions, whilst it obscures the technical reasons why
6
Web 1.0, Web 2.0 and Web 3.0 (1)
Web 1.0 is the web as it originated in the ’90s, i.e. an infrastructure providing an unprecedented level of access to documents, suitably exposed to the network by information professionals
Web 2.0 is a collaborative web; Web 2.0 solutions help users submit new content, through blogs and wikis, and interact, through social networks; in summary, Web 2.0 is a read-write web, whilst Web 1.0 is a read-only web
Web 2.0 applications appeared in the ’00s, i.e. in the second decade of the web, hence the name
The counterpart of Web 2.0 for the enterprise is Enterprise 2.0
7
Web 1.0, Web 2.0 and Web 3.0 (2)
Web 2.0 applications produce an increasing amount of documents and connections to deal with, and semantics as a technology can help in dealing with them more intelligently and efficiently
This new wave of connected and intelligent applications was recently named Web 3.0, i.e. a read-write-execute web
  Remember the slogan “let the machine do most of the work”
The technology behind this evolution is semantics, which started as a research dream but now seems to be becoming more pervasive
An interesting example of evolution from Web 2.0 to Web 3.0 is Wikipedia, which is developed in a cooperative Web 2.0 environment and is now also available as data within the Linked Open Data (LOD) Web 3.0 initiative
8
Semantic solutions
A semantic architecture is a usual Web or enterprise architecture augmented with a knowledge representation.
Knowledge representations are a key point in semantic architectures and solutions
Knowledge representations (e.g. vocabularies, taxonomies) have a long tradition, which predates IT
Semantics today has formalized and generalized the concepts of knowledge representation, and provides tools for easing their development and standards for easing their integration into specific applications
9
Knowledge representations (1)
Knowledge representations can be represented in graphic form: nodes represent concepts or entities, and arcs represent relationships between nodes
They differ in the generality of the graphical representation and in the kind of relationships allowed, as detailed in the following

Kind of knowledge representation    Graphical representation    Relationships allowed
Controlled vocabulary               Disconnected dots           No
Taxonomy                            Graph                       Is-a
Thesaurus                           Network                     Is-a, related-to
Ontology                            Network                     Any
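The ordering of these four kinds, from the fewest to the most relationships allowed, can be sketched in a few lines of Python. This is only an illustration: the dictionary and the function name `simplest_kind` are assumptions introduced here, not part of any standard.

```python
# Hypothetical mapping from each kind of representation to the relationships
# it allows, listed from the simplest kind to the most general one.
ALLOWED_RELATIONS = {
    "controlled vocabulary": set(),       # isolated terms, no arcs at all
    "taxonomy": {"is-a"},
    "thesaurus": {"is-a", "related-to"},
    "ontology": None,                     # None stands for "any relationship"
}


def simplest_kind(relations_used):
    """Return the simplest kind whose allowed relationships cover those used."""
    used = set(relations_used)
    for kind, allowed in ALLOWED_RELATIONS.items():
        if allowed is None or used <= allowed:
            return kind
    raise AssertionError("unreachable: an ontology allows any relationship")


print(simplest_kind(["is-a"]))
print(simplest_kind(["is-a", "part-of"]))
```

A domain that only needs is-a arcs is served by a taxonomy, while any other relationship (here the made-up `part-of`) calls for an ontology.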
10
Knowledge representations (2)
A knowledge representation can be characterized by:
  its kind, as summarized in the previous slide, from controlled vocabularies to ontologies
  its lower or higher complexity (number of nodes, number of arcs, number of allowed relationships)
Use the simplest representation for the domain to be represented and for the problem to be solved!
Some examples:
  a taxonomy is generally enough for Information Retrieval, although it is necessary to verify that the knowledge granularity is adequate for the application
  an ontology is generally required for reasoning; in this case it is necessary to verify that the relationships defined are adequate
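As a minimal sketch of why a taxonomy is often enough for Information Retrieval, the Python below expands a query concept with its narrower concepts before matching documents. The toy taxonomy, the documents and the function names are all hypothetical, and the word matching is deliberately naive.

```python
# Hypothetical toy taxonomy: child concept -> parent concept (is-a arcs).
IS_A = {
    "espresso": "coffee",
    "cappuccino": "coffee",
    "coffee": "beverage",
    "tea": "beverage",
}


def narrower_terms(concept):
    """Return the concept plus every concept below it in the taxonomy."""
    found = {concept}
    changed = True
    while changed:
        changed = False
        for child, parent in IS_A.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def search(query, documents):
    """Return ids of documents mentioning the query concept or a narrower one."""
    terms = narrower_terms(query)
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if terms & set(text.lower().split())
    )


docs = {1: "An espresso machine", 2: "Green tea leaves", 3: "Office chairs"}
print(search("coffee", docs))
print(search("beverage", docs))
```

The granularity remark applies directly: if the taxonomy stopped at "beverage", a query for "coffee" could not be told apart from a query for "tea".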
11
Knowledge representations (3)
Moreover, a knowledge representation is characterized by:
  its level of generalisation/detail, from upper ontologies, describing entities like “abstract” and “concrete”, to topic-specific taxonomies, e.g. for scientific terms
    The same application can integrate different kinds of ontologies
  the kind of entities it describes: these entities can be concepts, but can also be lexical terms (WordNet is a significant example of lexical knowledge representation)
    A lexical-based ontology is more difficult to generalize
  the knowledge representation language: RDF and OWL are specifically designed for the semantic web, while knowledge representations developed independently of the semantic web have to be suitably converted
12
Semantic applications
Semantic applications are very diversified
  Most of them are deployed in the enterprise: they are technically “simpler” and their costs are easier to justify
  Large-scale semantic web applications are technically more challenging and also imply the development of new business models and mindsets, but more recently:
    The “open data” paradigm is becoming more and more popular
    Semantics is becoming a “new entry” in web marketing
Semantic applications can be classified according to the problem solved and to the kind of semantic technology used
We have to remember that “a little semantics goes a long way”, since even the less complex semantic applications provide interesting advantages over the previous state of the art
13
“A little semantics goes a long way”
“A little semantics goes a long way” for these reasons:
  It adds some meaning to your data
  It adds a layer of relationships to your data and documents
  It facilitates the reuse of third-party knowledge in your application
“A little semantics goes a long way” also means that:
  Many useful semantic applications use only one specific semantic technology
  Almost all semantic applications also rely on other technologies, e.g. natural language processing, in order to provide the whole solution
14
Examples of Semantic applications (1)
Unstructured data access and integration:
  classify documents by the most significant topics described in a knowledge base
  extract metadata from documents and web pages in order to simplify later searches
    this evolution can also spark a new “semantic wave” in online advertising
  extract facts and relationships from documents, in order to produce structured data
15
Examples of Semantic applications (2)
Structured data access and integration:
  for making schema evolution more agile
  for simplifying the integration of different data bases
  for inferring new information from existing information, including reasoning
Structured and unstructured data:
  for unifying their access, e.g. in Business Intelligence
Social and interest networks:
  for aggregating people and interests more intelligently
Application integration:
  for supporting the dynamic location of web services
  for supporting the dynamic construction of business processes
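Inferring new information from existing structured data can be illustrated with a small forward-chaining sketch in Python. It applies two RDFS-style rules (subclass transitivity and type inheritance) until nothing new appears; the plain-string predicate names and the sample facts are assumptions made for the example, and a real reasoner is far more capable.

```python
def infer(triples):
    """Apply two RDFS-style rules repeatedly until no new triple is derived."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                # Rule 1: subClassOf is transitive.
                if p == "subClassOf" and p2 == "subClassOf" and o == s2:
                    new.add((s, "subClassOf", o2))
                # Rule 2: an instance of a class is an instance of its superclasses.
                if p == "type" and p2 == "subClassOf" and o == s2:
                    new.add((s, "type", o2))
        if new <= facts:          # fixed point reached
            return facts
        facts |= new


# Hypothetical stored facts.
stored = {
    ("rex", "type", "Dog"),
    ("Dog", "subClassOf", "Mammal"),
    ("Mammal", "subClassOf", "Animal"),
}
inferred = infer(stored) - stored
print(sorted(inferred))
```

None of the three derived triples was stored explicitly; they follow from the knowledge layer alone.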
16
Semantics and other IT disciplines
Semantics is an evolution of different IT disciplines, bringing new paradigms that provide agility, scalability and robustness; among these we can mention:
  For data bases: semantics follows a paradigm different from the usual RDB, i.e. it implements a graph data base in order to provide better agility
  For artificial intelligence: semantics follows a new paradigm, i.e. the open world assumption, in order to provide scalability and robustness
  For application integration: semantics extends first-generation SOA in order to provide better agility
Semantics also makes it possible to combine approaches that have been separate until now, for example searching both structured and unstructured sources
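The difference between the closed world assumption of a usual data base and the open world assumption can be sketched as follows. Both function names and the sample statements are hypothetical; the point is only that an open-world answer has a third outcome, "unknown".

```python
def ask_closed_world(facts, statement):
    """Data-base style answer: a statement that is not stored counts as false."""
    return statement in facts


def ask_open_world(facts, negated_facts, statement):
    """Open-world answer: True, False, or None when the knowledge base cannot tell."""
    if statement in facts:
        return True
    if statement in negated_facts:
        return False
    return None  # unknown: missing information is not treated as false


# Hypothetical knowledge: one asserted fact and one explicitly denied fact.
facts = {("alice", "knows", "bob")}
negated = {("alice", "knows", "carol")}

question = ("alice", "knows", "dave")
print(ask_closed_world(facts, question))
print(ask_open_world(facts, negated, question))
```

Because an unknown statement is not forced to be false, new sources can be merged in later without contradicting earlier answers, which is one reading of the scalability and robustness claim.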
17
The dark side of semantics
Do not overlook “the dark side of semantics”!
This expression means that the problem itself may not be so complicated, but the surrounding environment can add unexpected implementation difficulties
Semantics is in fact not a magic bullet against garbage data or inconsistent knowledge bases, although it can help in discovering these issues
Semantic applications also have to cope with ordinary IT problems, such as different document formats, and have to solve them efficiently before adding semantics
Solutions to these “interface” problems in many cases imply the use of other IT technologies, such as entity extraction and parsing
18
Some final remarks
Web 2.0 solutions do not replace Web 1.0; in any case they constitute a new opportunity and the fastest growing area; it is likely that in the next years the same will happen between Web 3.0 and Web 2.0 solutions
Semantics is a family of technologies, Web 3.0 is a family of applications
Semantics predates Web 3.0; present achievements in semantics (i.e. “some semantics”) are the most significant technology enablers for Web 3.0
Semantic technologies were deployed first in the enterprise; now the time of the web is coming
Semantic technologies will become pervasive in IT, and an IT manager or professional has to be aware of the different technology choices and business opportunities
19
A technical overview
The dimensions of semantics
The technical facet
The architectural facet
The application facet
20
The dimensions of semantics
To set a framework, we distinguish three independent dimensions:
  technology
  application
  architecture
Each of these dimensions includes different aspects
[Figure: three axes labelled technology, application and architecture]
Intellisemantic 20
21
Technology
Different layers of standards:
  those at the bottom are well established, others are emerging or planned
  each level of standard can be correlated to applications of increasing complexity:
    Information retrieval
    Information extraction
    Question answering
    Reasoning
    Web of trust
22
Architecture
A semantic architecture includes a knowledge data base.
Semantic architectures hence are characterized by:
  what kind of knowledge data base (KDB) is used
  where this KDB is used, i.e. in the presentation, in the business logic, in the service integration layer or in the data base layer
[Figure: layers labelled Presentation, Business Logic, Service integration and Data base, plus Knowledge data base and Knowledge development tools]
23
The application facets
The environment facet: semantics is splitting between
  semantic web
  semantic enterprise (intranet, extranet, internet)
The application goal facet:
  for better use of unstructured information
  for structured data integration
  for application integration
The industry facet: in different industries
  different reasons for adopting semantics
  different levels of awareness
  different adoption cases
24
Technologies and recommendations
Overview
Main recommendations
25
Overview
W3C initially set a general framework composed of an ordered set of layers, suitable for semantic tasks of increasing complexity, from describing facts, to reasoning about these facts, to verifying their trust
W3C carried out its activities starting from the bottom layer. So far the W3C activities have:
  produced significant recommendations for the lower layers
    this is a significant achievement for semantics, which thus relies on solid grounds
  added some flexibility to the original framework, since it was found that
    not all the layers are needed in all cases
    some layers can have different implementations
26
Some remarks
The layered framework of W3C recommendations is a significant achievement, but beware that, in a real implementation:
  not all the layers are needed in order to define it as “semantic”
  semantic layers are only a part of the whole solution, which can also require other complementary technologies, such as natural language processing and statistical methods
The recommendations used must be complemented with other views, as presented in the slide “The dimensions of semantics”
The following slides provide an overview of the major recommendations; the W3C web site provides more detailed information
27
Most significant recommendations (1)
RDF (Resource Description Framework), for coding triples.
  A triple describes a statement, which is formed by a thing, a corresponding property and a corresponding value, e.g. this presentation (thing) has a title (property) which is “Semantics and Web 3.0” (value).
  A triple can also be visualized as an arc (property) connecting a source node (thing) to a destination node (value).
  Any knowledge data base can be represented as a set of triples, and so it can be coded in RDF
  RDF is not the only representation for triples, but it was the first formalized by W3C and it is widely used as an interchange format
RDFS is the schema language for RDF
  It adds information about the meaning of data coded in RDF
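The thing / property / value structure of a triple can be shown with a plain Python data structure. This is a sketch, not RDF syntax: the `ex:` identifiers are hypothetical stand-ins for the full URIs that real RDF would use, and the helper `values` is introduced only for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # the "thing"
    predicate: str  # the "property"
    obj: str        # the "value"


# A knowledge base is simply a set of triples (identifiers are hypothetical).
graph = {
    Triple("ex:thisPresentation", "ex:title", "Semantics and Web 3.0"),
    Triple("ex:thisPresentation", "ex:author", "ex:AlbertoCiaramella"),
}


def values(graph, subject, predicate):
    """Follow the arcs labelled `predicate` that leave the node `subject`."""
    return {t.obj for t in graph if t.subject == subject and t.predicate == predicate}


print(values(graph, "ex:thisPresentation", "ex:title"))
```

Each triple is one arc of the graph: the subject is the source node, the predicate labels the arc, and the object is the destination node.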
28
Most significant recommendations (2)
SPARQL (SPARQL Protocol and RDF Query Language) is a query language for an RDF-coded data base
OWL (Web Ontology Language) builds on RDF and RDFS and extends them with richer notions about classes and properties (class disjointness and cardinality restrictions, to mention just two), in order to enable more powerful reasoning than with RDF + RDFS. Different OWL dialects have been defined, which, in increasing order of allowed expressions, are:
  OWL Lite
  OWL DL (Description Logic), with some restrictions in comparison to OWL Full, in order to ensure decidability
  OWL Full, which is the unrestricted OWL dialect; decidability is not assured in this case
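The core idea behind a SPARQL query, matching a triple pattern that contains variables against the stored triples, can be sketched in pure Python. This is not SPARQL itself and covers only a single pattern; the function `match`, the `?`-prefixed variable convention borrowed from SPARQL, and the sample data are assumptions made for the example.

```python
def match(graph, pattern):
    """Yield variable bindings for one triple pattern.

    Terms starting with '?' are variables; every other term must match exactly.
    """
    for triple in graph:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice in the pattern must bind to the same value.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield bindings


# Hypothetical triples; real RDF would use full URIs.
graph = [
    ("ex:doc", "ex:title", "Semantics and Web 3.0"),
    ("ex:doc", "ex:author", "ex:alberto"),
]

# Roughly what "SELECT ?o WHERE { ?s ex:title ?o }" asks for.
titles = [b["?o"] for b in match(graph, ("?s", "ex:title", "?o"))]
print(titles)
```

A full query engine combines several such patterns by joining their bindings on shared variables.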
29
Other significant recommendations
Microformats are a collection of formats for embedding metadata within XHTML and HTML web pages
RDFa (i.e. RDF in attributes) is a proposed set of extensions to XML, including XHTML, that allows the inclusion of metadata (e.g. the author, the date) in documents
eRDF (i.e. embedded RDF) is similar to RDFa, but it is meant only for XHTML documents
GRDDL (Gleaning Resource Descriptions from Dialects of Languages) is a W3C recommendation for extracting RDF out of XHTML using XSLT
All these recommendations emerged in the last 2 years, to enable applications with “some semantics” inside
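To give a feeling for metadata embedded in page markup, the sketch below collects RDFa-style `property` attributes from a small HTML fragment using the Python standard library. It is a simplification, not an RDFa processor: it ignores subjects, prefixes and nesting, and the sample page and the class name `PropertyExtractor` are hypothetical.

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Collect (property, value) pairs from RDFa-style `property` attributes."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._pending = None  # property whose value is the upcoming element text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop is None:
            return
        if "content" in attrs:
            # The value is given explicitly in the `content` attribute.
            self.pairs.append((prop, attrs["content"]))
        else:
            # Otherwise the value is the text inside the element.
            self._pending = prop

    def handle_data(self, data):
        if self._pending and data.strip():
            self.pairs.append((self._pending, data.strip()))
            self._pending = None


# Hypothetical page fragment with two embedded metadata statements.
page = (
    '<div><h1 property="dc:title">Semantics and Web 3.0</h1>'
    '<span property="dc:date" content="2009-05"></span></div>'
)
parser = PropertyExtractor()
parser.feed(page)
parser.close()
print(parser.pairs)
```

The page stays readable for people, while the attributes carry the same information in a form a program can pick up.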
30
Architectures
Definition, Motivations, Structure
Knowledge data base development
Triple stores
Semantic middleware
Semantic services
31
Definition and motivations
A semantic architecture includes a knowledge base as a clearly identified layer
Motivations:
  to increase business agility
  to facilitate the integration of different applications and data
  to provide improved user interfaces
  to provide new functions
32
Structure
A semantic architecture can be described as the usual four layered architecture (presentation, business logic, service integration, data) to which a knowledge layer is superimposed
The knowledge layer itself can affect only the presentation, or the business logic, or the service integration, or the data layer; this gives a first classification of architectures
Intellisemantic, Politecnico di Torino 32
33
Structure
presentation
Business logic
Service integration
Data
Knowledge development tool
Knowledge data base
34
Developing the application knowledge base
Understand application requirements
  what is the application domain?
  what is the typical document structure?
  how is it possible to benefit from metadata?
Identify the most appropriate resources
  many good quality resources are available
  an application requiring the development of a totally new knowledge base is not so common
Modify or merge them if appropriate
Validate and test the final solution
35
How to look for knowledge data bases
Identify them from directories, such as http://www.schemaweb.info, http://www.daml.org/ontologies
Identify them from aggregators (in this case they are generally not free), such as http://www.taxonomywarehouse.com
Identify them from “paper directories”, such as the book “Ontological Engineering”, by Corcho et al.
Identify some of them from specific sites, such as http://www.eclass-online.com
Identify them through suitable search engines, such as Swoogle
36
Most common tools and building blocks in semantics
knowledge base development environments: some of them have been developed by the academic community and are free, such as Protégé, others are commercial off-the-shelf (OTS) products
semantic stores, i.e. RDF triple store managers: some of them were developed directly for semantics, others are semantic extensions of the usual relational data bases, such as Oracle
reasoners, for example Pellet
semantic middleware and web services
semantic search engines, to be further distinguished into different categories, such as
  a) searching for knowledge data bases
  b) improving search and navigation for sites and documents
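What a triple store manager does at its simplest can be sketched as an in-memory structure with two indexes. The class `TinyTripleStore` and its methods are hypothetical and only illustrate the idea; real stores add persistence, more index combinations and a query language.

```python
from collections import defaultdict


class TinyTripleStore:
    """In-memory triple store indexed by subject and by predicate."""

    def __init__(self):
        self._by_subject = defaultdict(set)
        self._by_predicate = defaultdict(set)

    def add(self, s, p, o):
        triple = (s, p, o)
        self._by_subject[s].add(triple)
        self._by_predicate[p].add(triple)

    def find(self, s=None, p=None, o=None):
        """Return the triples matching the given terms; None matches anything."""
        if s is not None:
            candidates = self._by_subject.get(s, set())
        elif p is not None:
            candidates = self._by_predicate.get(p, set())
        else:
            candidates = set().union(*self._by_subject.values())
        return {
            t for t in candidates
            if (p is None or t[1] == p) and (o is None or t[2] == o)
        }


store = TinyTripleStore()
store.add("ex:doc", "ex:title", "Semantics and Web 3.0")
store.add("ex:doc", "ex:topic", "ex:SemanticWeb")
store.add("ex:other", "ex:topic", "ex:SemanticWeb")
print(sorted(store.find(p="ex:topic")))
```

Adding a new kind of property needs no schema change, only new triples, which is the agility argument made for graph data bases.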
37
Applications
Enterprise applications
Web applications
38
Enterprise semantics motivations: the user point of view
(+) lower costs, from reducing the time overhead
  for document retrieval and information extraction
  more generally, for accomplishing routine tasks
(+) better quality of results and more revenue
  in intelligence, by improving the relevance of the documents found
  in eCommerce, by increasing the percentage of items found and sold
(+) a more agile enterprise, i.e. improved time to market
  a well identified and maintained knowledge layer can speed up IT system updates motivated by internal procedural changes or by compliance with external rules
(+) new functions and business opportunities as well
  most of them are related to the “long tail” and Web 2.0
39
Enterprise semantic systems: the developer and integrator point of view
(+) are easier to upgrade
  by simply updating the knowledge layer
(+) are easier to maintain
  by focusing on the knowledge layer
(-) require some more initial design effort
  since they imply a more structured design
40
Enterprise semantic systems: the manager and analyst point of view
(-/+) require training your staff in a new paradigm
  but this can be repaid by subsequent projects
(-/+) require a careful analysis, given the novelty of the field
  in order to identify what can be considered development with the technology of today, and what has yet to be considered a research challenge
(-/+) require the identification and solution of additional technical challenges in the application
  e.g. properly handling different kinds of document formats can affect a semantic application, although it is not semantics
(-/+) require the identification and solution of additional managerial challenges in the application
  e.g. some organisations have difficulty accepting even the best knowledge management systems, for cultural reasons
41
Enterprise semantic applications (1)
Extend the usual enterprise solutions, such as:
  Content Management Systems (CMS)
  Knowledge Management Systems (KMS)
  Customer Relationship Management (CRM)
  Business Intelligence Systems (BI)
  Human Resource Management Systems
  IT and enterprise governance, including security
For improving functionalities in:
42
Enterprise semantic applications (2)
For improving the aforementioned solutions in specific functionalities, such as:
  Unstructured data (document classification and search, document analysis, facts extraction)
  Structured data (merging different data bases, reasoning)
  Application integration
Solutions and semantically extended functionalities can be combined in a matrix, showing also the best
43
Semantic web applications
Semantic web applications were the original “dream”, which is a far-reaching effort for reasons such as:
  The scale of information available on the web
  New business models to be developed for the use of “better” information on the web
Something has in any case been moving since 2008, although we have to distinguish between:
  “My Semantic web” solutions, i.e. a semantically enhanced portal or service
  Semantic “web on the large” solutions, including tools and initiatives for this evolution
This distinction is not always so sharp, however.
44
“My Semantic web” applications
Extensions of Web 2.0 solutions, such as:
  Semantic social networks, such as Twine
  Semantic mashups, such as Tripit
  Semantic wikis, such as Metaweb
  Semantic blogs, such as Zemanta
  Depending on the case, they aggregate more information than usual Web 2.0 solutions, increasing user stickiness, or do more automatic work for the benefit of the user
Semantic technology enabled portals, such as those provided by Elsevier, BBC, Harper
  Typically delivered by media companies, they classify and reaggregate the portal information in order to improve the findability of their free or, better, of their for-sale documents
45
Semantic Web on the large: main driving forces today (1)
Open world:
  Open data: expose and interconnect semanticised web resources, in order to obtain a web of open data. This is the direction followed by the Linked Open Data (LOD) project, whose interconnected resources are continuously increasing and include, among others, DBpedia, which is the RDF conversion of Wikipedia
  Open semantic web services, such as http://www.opencalais.com
46
Semantic Web on the large: main driving forces today (2)
Semantic marketing:
  Semantic positioning: support web masters in producing resources provided with semantic metadata, so as to improve their findability. An initiative of this kind is carried out by SearchMonkey, by Yahoo, and will also affect web marketing
  Semantic contextual advertising: solutions using semantics for associating advertisements with the most suitable web pages, for example http://www.peer39.com and http://www.adpepper.com
47
Semantic web on the large: other directions
Semantic search engines, to be further distinguished into:
  Vertical specialized engines, such as Hakia
  Search engines for resources expressed as triples, such as those available with the Linked Open Data (LOD) initiative
  Engines supporting natural language, such as Powerset
Semantic browsers
48
References
http://www.w3.org/2001/sw the W3C Semantic Web Activity site, with access to specifications, activities, tutorials and best implementation cases
http://www.semanticuniverse.com a portal for semantics
“Ontological Engineering”, by A. Gómez-Pérez, M. Fernández-López, O. Corcho, Springer, 2004
“Adaptive Information”, by J. Pollock, R. Hodgson, Wiley, 2004
“Semantics in Business Systems”, by D. McComb, Morgan Kaufmann Publishers, 2004
“A Semantic Web Primer”, by G. Antoniou, F. van Harmelen, MIT Press, 2008
“Semantic Web for the Working Ontologist”, by D. Allemang, J. Hendler, Morgan Kaufmann Publishers, 2008
“Semantic Web for Dummies”, by J. Pollock, Wiley, 2009
49
Acknowledgments
I acknowledge my colleagues of IntelliSemantic for their useful feedback, discussions and insights.
I also acknowledge the managers and professionals who attended the IntelliSemantic one-day workshop on semantics, whose key points are summarized in the second part of this presentation, and the colleagues who attended my recent talks about semantics at MilanIn and BAIA events, whose key points are detailed in the first part of this presentation: all the feedback I received contributed to improving the quality of this presentation.
For further comments, please do not hesitate to contact me at [email protected]
50
Licence
This work is licenced under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported Licence
To view a copy of this licence visit: http://creativecommons.org/licenses/by-nc-sa/3.0/