Extended WordNet

Shrikrishna R. Parab
M.Tech Part 1, Dept. of Computer Science and Technology, Goa University


Recap

•Studied the approach used to build BabelNet 2.0

•BabelNet is built automatically by integrating WordNet with Wikipedia

•Using the same approach, we can integrate the Konkani WordNet with the Konkani Wikipedia

•This would improve the accuracy of the WordNet by enriching its glosses and adding more examples

•Integration with Wikipedia also lets us illustrate entries with images

Outline

•Introduction

•Why??

•RDF Models

•LEMON model

•Examples

•Conclusion

•Reference paper

Resource Description Framework (RDF)

•RDF is a standard model for data interchange on the Web

•It is a general framework for describing website metadata, or "information about the information" on the website.

•It has features that facilitate data merging even if the underlying schemas differ

•It also supports the evolution of schemas over time without requiring all the data consumers to be changed

•RDF extends the linking structure of the Web to use URIs to name the relationship between things as well as the two ends of the link

•RDF was designed to allow developers to build search engines that rely on the metadata and to allow Internet users to share website information more readily.

•This linking structure forms a directed, labeled graph, where the edges represent the named link between two resources, represented by the graph nodes.
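To make the graph view of RDF concrete, here is a minimal Python sketch (standard library only) that stores statements as (subject, predicate, object) triples and reads them as labeled edges. The example.org URIs and the helper names `outgoing_edges` and `merge` are illustrative assumptions, not part of any RDF library, and the merge shown ignores blank nodes.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# All URIs below are hypothetical, chosen only for illustration.
EX = "http://example.org/"

triples = {
    (EX + "dog", EX + "hyponymOf", EX + "animal"),
    (EX + "cat", EX + "hyponymOf", EX + "animal"),
    (EX + "dog", EX + "gloss", "a domesticated canine"),
}


def outgoing_edges(graph, node):
    """Return the labeled edges (predicate, object) that leave `node`."""
    return sorted((p, o) for s, p, o in graph if s == node)


def merge(graph_a, graph_b):
    """Merge two graphs as a set union of triples (blank nodes ignored)."""
    return graph_a | graph_b


# A second source adds a statement about a node the first source already names.
more = {(EX + "animal", EX + "gloss", "a living organism")}
merged = merge(triples, more)

for predicate, obj in outgoing_edges(merged, EX + "dog"):
    print(predicate, "->", obj)
```

Because both sources name the same node with the same URI, the union connects their statements without any schema alignment step, which is the data-merging property the slide refers to.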

RDF representation of WordNet?

The choice of RDF is based on several reasons:

•First, it is a standard for the Web, focused on the description of resources

•Second, RDF has a natural network structure and is ideal for representing data and metadata with that structure.

•Third, RDF is extensible.

•Fourth, the schema describing the structure of this representation can be accessed in the same way as the data.

Is it sufficient to use RDF as a format for WordNet?

•Some of the major requirements for WordNet are:
  • Relations between entities
  • Notion of class, for words as well as synsets
  • Notion of hierarchy of classes, e.g. an adjective word as a subclass of word
  • Notion of instance and type, meaning that an entity has a type of some kind

•All these requirements are met by RDF as a modelling language

•RDF also offers other useful features, such as hierarchies among properties, and comments.
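The requirements above (relations, classes, class hierarchy, instance and type) can be sketched with a handful of triples. The snippet below is a minimal illustration in plain Python; the `ex:` names are hypothetical, and the `rdf:type` and `rdfs:subClassOf` strings stand in for the real RDF and RDF Schema properties. It applies the usual subclass rule: an instance of a class is also an instance of every superclass.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Hypothetical WordNet-like statements.
triples = {
    ("ex:AdjectiveWord", SUBCLASS_OF, "ex:Word"),      # hierarchy of classes
    ("ex:happy", RDF_TYPE, "ex:AdjectiveWord"),        # instance and type
    ("ex:happy", "ex:inSynset", "ex:synset-glad"),     # relation between entities
}


def superclasses(graph, cls):
    """All classes reachable from `cls` via rdfs:subClassOf (transitive)."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in graph:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                stack.append(o)
    return found


def types_of(graph, node):
    """Direct types of `node` plus the types inferred from the class hierarchy."""
    direct = {o for s, p, o in graph if s == node and p == RDF_TYPE}
    inferred = set()
    for cls in direct:
        inferred |= superclasses(graph, cls)
    return direct | inferred


print(sorted(types_of(triples, "ex:happy")))
```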

First representation of WordNet in RDF (Melnik, 2001)

•It consisted of a set of nouns, the glossary, and the hyponym and similar-to relations.

•In this schema the synsets are classified as nouns, verbs, adjectives, adverbs and adjective satellites.

•All of them are subclasses of the Lexical Concept class

•The lexical relations defined are antonymy, similarity and hyponymy, along with a glossary definition.

•One drawback of this schema is that it does not take polysemy into account

•For example, there is no way to discover that “power” has several meanings unless all the data is searched
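The polysemy drawback can be illustrated with a small Python sketch. It assumes, hypothetically, that each lexical concept carries its word forms as plain string labels (the `ex:` identifiers and the `ex:wordForm` property are invented for this example and are not taken from the 2001 schema). Since "power" is only a label and not a node, the only way to collect its meanings is to look at every statement.

```python
# Hypothetical data in a "words as labels" style: the word is a string
# attached to each concept, not a node of its own.
triples = [
    ("ex:concept-1", "ex:wordForm", "power"),
    ("ex:concept-1", "ex:wordForm", "ability"),
    ("ex:concept-2", "ex:wordForm", "power"),
    ("ex:concept-2", "ex:wordForm", "force"),
    ("ex:concept-3", "ex:wordForm", "animal"),
]


def senses_by_scan(graph, word):
    """Find every concept labeled with `word` by scanning all triples."""
    return sorted(s for s, p, o in graph if p == "ex:wordForm" and o == word)


meanings = senses_by_scan(triples, "power")
print(len(meanings), "meanings found:", meanings)
```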

RDF Representation (Gangemi, 2004)

•In this version, there are three layers:
  • Word layer
  • Word Sense layer
  • SynSet layer

•The first layer is composed of a set of nodes which are subclasses of class “Word”

•Words are represented by nodes in the graph and are not just labels. This allows the polysemy inherent in WordNet to be represented correctly.

•The Word Sense layer is the link between a Word and a SynSet.

•The SynSet layer is composed of “NounSynSet”, “AdjectiveSynSet”, “AdjectiveSatelliteSynSet”, “VerbSynSet” and “AdverbSynset”, which are subclasses of SynSet.

•The lexical relations are located in the second and third layers.

•Examples are the antonym and seeAlso relations.
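A minimal Python sketch of the three-layer idea follows. All identifiers (`ex:word-power`, `ex:sense`, `ex:inSynset`, and so on) are hypothetical and chosen only for illustration; they are not the property names of the actual schema. Because each word is a node of its own, its senses are simply its outgoing edges, so polysemy can be read off without scanning unrelated data.

```python
from collections import defaultdict

# (word, word sense, synset) rows; every identifier here is hypothetical.
rows = [
    ("ex:word-power", "ex:sense-power-1", "ex:synset-ability"),
    ("ex:word-power", "ex:sense-power-2", "ex:synset-force"),
    ("ex:word-force", "ex:sense-force-1", "ex:synset-force"),
]

# Build the three layers: Word -> WordSense -> SynSet.
triples = set()
for word, sense, synset in rows:
    triples.add((word, "ex:sense", sense))
    triples.add((sense, "ex:inSynset", synset))

# Index outgoing edges by subject so a node's edges can be read directly.
outgoing = defaultdict(list)
for s, p, o in triples:
    outgoing[s].append((p, o))


def synsets_of(word):
    """Follow Word -> WordSense -> SynSet starting from a word node."""
    senses = [o for p, o in outgoing.get(word, []) if p == "ex:sense"]
    return sorted(
        o
        for sense in senses
        for p, o in outgoing.get(sense, [])
        if p == "ex:inSynset"
    )


print(synsets_of("ex:word-power"))
```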

The Lemon Model (2011): Lexicon Model for Ontologies

•Lemon is a proposed model for representing lexicons and machine-readable dictionaries and linking them to the Semantic Web

•Lemon is an RDF model for representing lexical information relative to ontologies.

•The lemon model consists of a core path defined as:
  • Ontology Entity: The ontology entity that describes the meaning of the concept in a language-independent manner
  • Lexical Sense: This object is used to attach all meaning-dependent properties of the word or term.
  • Lexical Entry: This represents the word or term itself.
  • Lexical Form: This object is used to describe a single form (e.g., plural, perfect, etc.) of an entry
  • Written Representation: The actual string that the lexical entry is realized as.

@base <http://www.example.org/lexicon> .
@prefix ontology: <http://www.example.org/ontology#> .
@prefix lemon: <http://www.monnetproject.eu/lemon#> .
# Assumed default namespace so that :myLexicon and :animal resolve
@prefix : <http://www.example.org/lexicon#> .

:myLexicon a lemon:Lexicon ;
    lemon:language "en" ;
    lemon:entry :animal .

:animal a lemon:LexicalEntry ;
    lemon:form [ lemon:writtenRep "animal"@en ] ;
    lemon:sense [ lemon:reference ontology:animal ] .
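The core path of the lemon example can be walked step by step. The Python sketch below restates the example's statements as plain tuples and follows two chains of properties from the lexicon. It is only an illustration: the blank-node labels `_:form1` and `_:sense1` and the helpers `objects` and `follow` are assumptions made here, and the `@en` language tag is dropped for simplicity.

```python
# The lemon example as (subject, predicate, object) tuples.
# "_:form1" and "_:sense1" are made-up labels for the two blank nodes.
triples = {
    (":myLexicon", "rdf:type", "lemon:Lexicon"),
    (":myLexicon", "lemon:language", "en"),
    (":myLexicon", "lemon:entry", ":animal"),
    (":animal", "rdf:type", "lemon:LexicalEntry"),
    (":animal", "lemon:form", "_:form1"),
    ("_:form1", "lemon:writtenRep", "animal"),
    (":animal", "lemon:sense", "_:sense1"),
    ("_:sense1", "lemon:reference", "ontology:animal"),
}


def objects(graph, subject, predicate):
    """All objects of the statements with the given subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def follow(graph, start, path):
    """Follow a chain of predicates from `start` and return the end nodes."""
    nodes = [start]
    for predicate in path:
        nodes = [o for n in nodes for o in objects(graph, n, predicate)]
    return nodes


# Lexicon -> entry -> form -> written representation
print(follow(triples, ":myLexicon", ["lemon:entry", "lemon:form", "lemon:writtenRep"]))
# Lexicon -> entry -> sense -> ontology entity
print(follow(triples, ":myLexicon", ["lemon:entry", "lemon:sense", "lemon:reference"]))
```

The first chain ends at the written string and the second at the ontology entity, which matches the two ends of the core path listed above.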

Conclusion

•The main advantage of RDF for representing WordNet is that it allows WordNet to be represented as a network in a natural, simple and lightweight way.

•Another advantage is the accessibility through the web, allowing different applications to consult the data.

•WordNet is now expressed in a standard way for the Semantic Web; this will permit the use of semiautomatic agents for more complex searches in the future.

Reference papers

• M. Ehrmann, F. Cecconi, D. Vannella, J. P. McCrae, P. Cimiano, and R. Navigli, “Representing multilingual data as linked data: the case of BabelNet 2.0,” in Proc. of LREC, 2014, vol. 14, pp. 401–408.

• P. Buitelaar, P. Cimiano, J. McCrae, E. Montiel-Ponsoda, and T. Declerck, “Ontology lexicalisation: The lemon perspective,” 2011.

• A. Graves and C. Gutierrez, “Data representations for WordNet: A case for RDF,” in GWC 2006–Proceedings of the 3rd International WORDNET Conference, 2006, pp. 165–169.

Thank you