SEMANTIC KNOWLEDGE ACQUISITION OF INFORMATION FOR SYNTACTIC WEB
By G. Nagarajan and K. K. Thyagharajan
International Journal of Web & Semantic Technology
Presentation by Raef Mchaymech
Outline
1. Introduction
2. The Problem
3. The Proposed Solution
4. The Proposed Architecture
5. Conclusion
6. Critique
The First Problem
• Search engines return:
  – Billions of results, both informative and non-informative
So What Is the Problem Now?
The introduction of the Semantic Web in 2000 encouraged researchers to create the concept of semantic search engines.
Semantic search engines are indeed widely adopted by developers and engineers.
Querying the Semantic Web with semantic search engines returns the expected results.
Current Solutions
[Diagram: semantic search engines query the Semantic Web and return the expected result]
The Second Problem
• There are not enough resources to search:
  – Searches and queries are very domain-dependent
  – E.g.:
    · DBpedia to search Wikipedia
    · LinkedMDB to search IMDB
Current Solutions vs. Proposed Solution
[Diagram comparing the two approaches. Labels: Semantic Web, Current Web, Write ontologies, Transformation]
The Proposed Architecture
[Diagram components: WWW, Web Crawler, List of URLs, Filtering, Conversion to XML, Conversion to RDF/OWL, Ontology Repository]
About the Crawler
A Web crawler starts with a list of URLs to visit, called the seeds. As the crawler visits these URLs, it identifies all the hyperlinks in the page and adds them to the list of URLs to visit. URLs from the list are recursively visited according to a set of policies.
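The seed-and-frontier loop described above can be sketched as follows. This is a minimal illustration, not the crawler from the paper: the `fetch` callable, the `max_pages` limit, the breadth-first visiting policy, and the in-memory `PAGES` dictionary are all assumptions made so the example runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seeds, fetch, max_pages=100):
    """Breadth-first crawl; returns URLs in the order they were visited.

    `fetch` is a hypothetical callable mapping a URL to its HTML text,
    or to None when the page is unavailable.
    """
    frontier = deque(seeds)   # URLs still to visit
    seen = set(seeds)         # URLs already queued, to avoid revisiting
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)
            if absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return visited


# Toy in-memory "web" (hypothetical pages) standing in for real HTTP requests.
PAGES = {
    "http://example.org/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.org/a": '<a href="/b">B</a>',
    "http://example.org/b": '<a href="/">home</a>',
}

order = crawl(["http://example.org/"], PAGES.get)
```

Swapping the deque for a priority queue, or filtering URLs before they are queued, is where a different set of visiting policies would plug in.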
HTML to XML Conversion
The conversion is based on Natural Language Processing (NLP), specifically on Named Entity Recognition (NER).
NER can classify an entity as: Person Name, Organization Name, Location…
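To make the idea concrete, here is a small sketch of entity recognition feeding an XML output. It is not the paper's method: it assumes a tiny hand-written lexicon (`LEXICON`) in place of a real NER model, and the element names (`document`, `Person`, `Organization`, `Location`) are chosen only for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical lexicon: surface form -> entity class. A real system would
# rely on a trained NER model or a much larger lexicon and pattern repository.
LEXICON = {
    "Tim Berners-Lee": "Person",
    "W3C": "Organization",
    "Geneva": "Location",
}


def recognize(text):
    """Return (start, end, surface, label) tuples for lexicon matches, sorted by position."""
    matches = []
    for surface, label in LEXICON.items():
        for m in re.finditer(re.escape(surface), text):
            matches.append((m.start(), m.end(), surface, label))
    return sorted(matches)


def to_xml(text):
    """Emit one XML element per recognized entity, named after its class."""
    root = ET.Element("document")
    for _, _, surface, label in recognize(text):
        ET.SubElement(root, label).text = surface
    return ET.tostring(root, encoding="unicode")


xml_text = to_xml("Tim Berners-Lee founded the W3C after working in Geneva.")
```

The entity classes become element names, so the resulting XML carries the classification that plain HTML text lacks.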
The Proposed Framework
HTML to XML Conversion
[Diagram components: HTML Web Page, HTML Document Preprocessing, Entity Recognition, Lexicon & Pattern Repository, Domain Hierarchy, Corresponding XML file]
XML to RDFS/OWL Conversion
• Two main definitions are needed:
  – RDFS, which provides the rules of the web page
  – OWL, which defines the conceptual ontology of the web page
• Two techniques should be used:
  – Syntactic Analysis
  – Semantic Analysis
Syntactic Analysis
• A simple mapping between XML elements and OWL elements
• For more rules, please refer to the paper
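A rough sketch of what such an element-to-element mapping might look like is given below. The two rules it applies are assumptions made for illustration and are not taken from the paper: an element with children becomes an `owl:Class`, and a leaf element becomes an `owl:DatatypeProperty` whose domain is its parent. The `ex` prefix and the sample `Movie` document are likewise hypothetical.

```python
import xml.etree.ElementTree as ET


def xml_to_owl(xml_text, prefix="ex"):
    """Map an XML tree to OWL declarations written as Turtle-style strings.

    Assumed rules, for illustration only:
      - an element with child elements (or the root) becomes an owl:Class
      - a leaf element becomes an owl:DatatypeProperty whose
        rdfs:domain is the class of its parent element
    """
    root = ET.fromstring(xml_text)
    statements = []
    seen = set()

    def visit(element, parent_tag):
        children = list(element)
        if children or parent_tag is None:
            line = f"{prefix}:{element.tag} a owl:Class ."
        else:
            line = (f"{prefix}:{element.tag} a owl:DatatypeProperty ; "
                    f"rdfs:domain {prefix}:{parent_tag} .")
        if line not in seen:          # skip repeated declarations
            seen.add(line)
            statements.append(line)
        for child in children:
            visit(child, element.tag)

    visit(root, None)
    return statements


owl_lines = xml_to_owl("<Movie><title>Metropolis</title><year>1927</year></Movie>")
```

Because every element is mapped mechanically, the output is only as clean as the XML it starts from.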
Semantic Analysis
Strongly based on NLP techniques:
• The analyzer identifies nouns, verbs, etc.
• A probability reasoner is used to separate concepts and relations
• Relationships also include is-a and part-of
• The TBox and ABox are used to define logic and rules:
  – The TBox provides the classes and properties
  – The ABox provides instances
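The split between TBox and ABox can be illustrated with a small data-structure sketch. The class and field names below (`TBox`, `ABox`, `subclass_of`, `instance_of`, `types_of`) and the sample individuals are hypothetical; the sketch only shows how is-a links in the terminology let an instance's types be derived, and it assumes the class hierarchy has no cycles.

```python
from dataclasses import dataclass, field


@dataclass
class TBox:
    """Terminology: classes, is-a links between them, and properties."""
    subclass_of: dict = field(default_factory=dict)  # class -> parent class
    properties: set = field(default_factory=set)

    def ancestors(self, cls):
        """Follow is-a links upward from `cls` (assumes no cycles)."""
        result = []
        while cls in self.subclass_of:
            cls = self.subclass_of[cls]
            result.append(cls)
        return result


@dataclass
class ABox:
    """Assertions: which individual is an instance of which class."""
    instance_of: dict = field(default_factory=dict)  # individual -> class


def types_of(individual, tbox, abox):
    """Asserted class of the individual plus every class reachable via is-a."""
    direct = abox.instance_of[individual]
    return [direct] + tbox.ancestors(direct)


# Hypothetical sample knowledge base.
tbox = TBox(subclass_of={"Director": "Person", "Person": "Agent"},
            properties={"directed"})
abox = ABox(instance_of={"Fritz_Lang": "Director"})
inferred = types_of("Fritz_Lang", tbox, abox)
```

Keeping the two boxes separate means new instances can be added to the ABox without touching the class definitions in the TBox.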
Conclusion
• Intelligent information retrieval system
• Applies the concept of reusability:
  – The authors reused the HTML pages
  – They converted them to ontologies
Critique
• The English is a complete disaster
• The authors did not show any real example:
  – They did not convert syntactic pages to semantic ones
  – An example of the conversion from HTML to XML is provided, but not from XML to OWL
• The mapping from XSD elements to OWL is inefficient and error-prone: irrelevant elements could easily be inserted into the ontology
• The authors did not benefit from the expressive power of ontologies (restrictions, types of properties…)
• They wrote exactly: [quoted passage missing from the extracted slide]
• They described the architecture of the syntactic/semantic conversion, but no search engine was designed
• No evaluation at all: speed of the solution, amount of resource consumption…