Revelytix SICoP Presentation
DRM 3.0 with WordNet Senses in a Semantic Wiki
Michael Lang
February 6, 2007
Agenda
► Semantic Matching using WordNet
► Bootstrapping COI based vocabularies
  – With WordNet
► DRM 3.0 in a semantic Wiki
  – With WordNet integration
  – DRM implementation tool for the agencies
► Demonstration
DRM Mission
► Facilitate information sharing
► How can I know that a data element or service I have discovered is the one I really want?
  – Description
  – Context
► How can I describe and provide sufficient context for anyone to know they have found what they want?
► Knowledge model
  – Excel, ISO 11179, DRM 2.0 will not be sufficient
  – OWL
Semantic Matching
► MatchIT
  – Extracts terms from databases
  – Creates a MatchIT vocabulary based on the collection of terms
  – Uses WordNet to match terms in disparate systems
  – Uses WordNet to match terms to domain vocabularies (NIEM)
  – Attaches WordNet “senses” to the vocabulary terms
► MatchIT vocabularies can be exported as OWL models
  – With the WordNet senses and synsets
► MatchIT can use other knowledge bases to facilitate matching
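
The idea of matching terms through shared senses can be illustrated with a minimal sketch. It is not MatchIT's implementation: the sense inventory below is a hand-made stand-in for WordNet, and every sense identifier and function name is an assumption made for illustration only.

```python
# Toy sense inventory standing in for WordNet: term -> set of sense ids.
# All entries are illustrative assumptions, not real WordNet synset identifiers.
TOY_SENSES = {
    "amount": {"sense:quantity", "sense:sum_of_money"},
    "sum": {"sense:sum_of_money", "sense:total"},
    "article": {"sense:item", "sense:written_piece"},
    "product": {"sense:item", "sense:multiplication_result"},
    "person": {"sense:human"},
}


def shared_senses(term_a, term_b, senses=TOY_SENSES):
    """Return the senses two terms have in common (empty set if none)."""
    a = senses.get(term_a.lower(), set())
    b = senses.get(term_b.lower(), set())
    return a & b


def match_vocabularies(source_terms, target_terms, senses=TOY_SENSES):
    """Pair every source term with every target term that shares a sense."""
    matches = []
    for s in source_terms:
        for t in target_terms:
            common = shared_senses(s, t, senses)
            if common:
                # Keep the shared senses so they can be attached to the match.
                matches.append((s, t, sorted(common)))
    return matches


if __name__ == "__main__":
    for match in match_vocabularies(["Amount", "Article"], ["Sum", "Product", "Person"]):
        print(match)
```

Recording the shared senses alongside each match mirrors the slide's point that senses are attached to vocabulary terms, so a reviewer can see why two terms were paired.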
Bootstrapping COI Vocabularies
► MatchIT vocabularies are imported
  – Either as OWL classes for vocabulary development
  – Or as OWL individuals for DRM development
► These vocabularies can be enriched using Knoodl.com for community based development
  – Data dictionary
  – Vocabulary
  – Knowledge base
[Figure: MatchIT architecture diagram. Recoverable labels: Conceptual / Logical / Physical Data Models; Representations (Relational, XML, Domain [UML/ER], Ontologies [OWL/RDF]); Import / Export; Find Matches; Schema-level Match; Instance-level Match; Build; Metadata Access; Data/Content Access; Ontological Semantics Access; Data Harmonization Complete; OWL / RDF Model Complete; Enterprise Information Sources (RDMS, JDBC, File System, XML, Custom, Any Source); Semantic Ontology Platform (Domain Ontology, Lexicons, Onomasticons, Fact Repositories); Models & Files [versioned]; Search Index; Web Reporting; Knoodl.com; Third-Party Modeling Tool.]
MatchIT Vocabulary Manager
Collaborative OWL editor
Information Management
► Knoodl is a new kind of modeling tool for modeling the structure, semantics and knowledge of any domain
  – The modeling process is necessarily collaborative
  – The process is necessarily extensible and additive
  – Community of Interest (COI) based tool
  – OWL based
Knoodl.com is …
► An internet application where people can collaborate with others in their communities of interest to
  – Create, edit, share and find
  – Vocabularies / ontologies
► OWL Repository
  – Free, but licensing controlled by COIs
► Social Computing Paradigm
  – Users contribute content and benefit from the content
  – Vocabularies capture much of the institutional knowledge of an enterprise or community
  – Gain value over time
  – Used by people and machines
Knoodl.com
► Knoodl is a collaborative framework
► Interoperability depends on three groups of stakeholders contributing to the description and context of the services
  ► Business people
  ► Technical people
  ► Data people
► Knoodl provides the features for the business people to participate
FEADRM Person Harmonization Workgroup
Data Architecture Subcommittee Meeting
January 11, 2007
Gathering Information
We asked those on the workgroup to share their models of PERSON with us.
We received documents from the Department of the Interior (DOI), the Veterans’ Administration (VA), the Federal Aviation Administration (FAA), and the Environmental Protection Agency (EPA).
You can view them on CORE.gov at https://collab.core.gov/CommunityBrowser.aspx?id=10833
Analyzing the Data
We compared the entities and attributes from all the documentation. We created an Excel Workbook.
– The first sheet contains all the entities and attributes from each model.
– The second sheet contains a mapping of the entities from the other agencies to those of the Social Security Administration (SSA).
– The third sheet contains the entities, attributes, and their definitions from the SSA FEADRM Model.
The Excel document is named ‘Person Entities and Attributes from Various Feds’ and you can view it on CORE.gov at https://collab.core.gov/CommunityBrowser.aspx?id=11682
Observations
A data model should have a point of view; we should have a common one at the Federal level.
Everyone should be modeling business data rather than creating logical database models.
PERSON is probably the area in which most of the non-administrative sharable data resides. This is what we at SSA call “common shared.”
The definition of business concepts represented by entities at the “top” of the data model should not be in terms so rigorously tied to the business of any one agency.
Data that are “regulated” require formal agreement to be sharable.
PERSON cannot be addressed in a vacuum. The concepts of organization, party, and role should be addressed at the same time.
DRM 3.0
Communities of Interest (COI) Vision
Each COI will implement the 3 pillar framework strategy.
[Figure: the 3 pillar framework. Recoverable labels: Business & Data Goals drive; Governance; Data Strategy; Data Architecture (Structure); Information Sharing/Exchange (Services).]
The FEA Data Reference Model 2.0
Source: Expanding E-Government, Improved Service Delivery for the American People Using Information Technology, December 2005, pages 2-3. http://www.whitehouse.gov/omb/budintegration/expanding_egov_2005.pdf
NIEM 1.0 NIEM Roadmap Pilot
DRM 2.0 Implementation Metamodel
► Definitions:
  – Metamodel: Precise definitions of constructs and rules needed for abstraction, generalization, and semantic models.
  – Model: Relationships between the data and its metadata - W3C.
  – Metadata: Data about the data for: Discovery, Integration, and Execution.
  – Data: Structured e.g. Table, Semi-Structured e.g. Email, and Unstructured e.g. Paragraph.
Source: Professor Andreas Tolk, 2005.
The Revelytix Solution
► OWL MetaModel:
  owl:Class
  owl:Property
► DRM Model:
  Topic (owl class)
  Entity (owl class)
  Relationship (owl object property)
• Use existing MetaModel languages to model the FEA DRM – OWL
• Model the DRM in a collaborative environment - Knoodl
• Extend the DRM to model the type of information that will be created – JDBC metadata, WordNet synset and word data
DRM Implementation: Data Description Area
► Model JDBC Metadata to Data Description Area
DRM v2.0 Vocabulary
  Entity <owl:Class>
  Attribute <owl:Class>
  Relationship <owl:ObjectProperty>
MatchIT Data Dictionary Vocabulary
  Table <owl:Class>
  View <owl:Class>
  Column <owl:Class>
  ForeignKey <owl:ObjectProperty>
[Diagram links from the MatchIT vocabulary to the DRM vocabulary: <owl:subClassOf>, <owl:subClassOf>, <owl:subPropertyOf>]
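
The mapping on this slide can be written out as RDF triples. The sketch below prints Turtle from plain Python. It is a guess at the diagram, not the Revelytix model: the `drm:` and `mit:` prefixes and their example.org URIs are hypothetical, and the pairing of each MatchIT class with a DRM class (Table and View under Entity, Column under Attribute) is an assumption, since the flattened slide shows only the link labels. In standard RDF Schema the subclass and subproperty predicates are `rdfs:subClassOf` and `rdfs:subPropertyOf`, so the sketch uses those.

```python
# Namespace prefixes. The owl and rdfs URIs are the standard W3C ones;
# the drm and mit URIs are hypothetical placeholders.
PREFIXES = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "drm": "http://example.org/drm#",
    "mit": "http://example.org/matchit#",
}

# Assumed pairings (MatchIT term, DRM term); the slide does not spell them out.
SUBCLASS_LINKS = [("Table", "Entity"), ("View", "Entity"), ("Column", "Attribute")]
SUBPROPERTY_LINKS = [("ForeignKey", "Relationship")]


def to_turtle(subclass_links=SUBCLASS_LINKS, subproperty_links=SUBPROPERTY_LINKS):
    """Render the JDBC-metadata-to-DRM mapping as Turtle text."""
    lines = [f"@prefix {prefix}: <{uri}> ." for prefix, uri in PREFIXES.items()]
    for child, parent in subclass_links:
        lines.append(f"mit:{child} a owl:Class ; rdfs:subClassOf drm:{parent} .")
    for child, parent in subproperty_links:
        lines.append(
            f"mit:{child} a owl:ObjectProperty ; rdfs:subPropertyOf drm:{parent} ."
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_turtle())
```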
DRM Implementation: Data Context Area
► Model WordNet data to Data Context Area
DRM v2.0 Vocabulary
  Taxonomy <owl:Class>
  Topic <owl:Class>
  Relationship <owl:ObjectProperty>
MatchIT Data Dictionary Vocabulary
  Wordnet <owl:Class>
  Synset <owl:Class>
  Hypernym <owl:ObjectProperty>
  Hyponym <owl:ObjectProperty>
[Diagram links from the MatchIT vocabulary to the DRM vocabulary: <owl:subClassOf>, <owl:subClassOf>, <owl:subPropertyOf>]
SICoP Knowledge Reference Model
The point of this graph is that Increasing Metadata (from glossaries to ontologies) is highly correlated with Increasing Search Capability (from discovery to reasoning).
Demonstration
Contextualize (Interpret)
Automated term tokenization
Automated semantic linking using the default knowledge-base contained within MatchIT
[Figure: the term ArticleAmount is tokenized into Amount and Article, which are linked to knowledge-base concepts (Sum, Assets, Creation) by Synonym and Type-of relationships.]
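
Automated term tokenization can be sketched as splitting a compound identifier such as ArticleAmount into its component words before any semantic linking. The regular expression and the function name below are assumptions for illustration; the slides do not describe how MatchIT tokenizes terms.

```python
import re


def tokenize_term(term):
    """Split a schema term into word tokens.

    Splits on underscores, hyphens and whitespace first, then on
    camel-case boundaries, keeping acronyms such as "XML" together.
    """
    tokens = []
    for part in re.split(r"[_\s\-]+", term):
        # Acronym run | capitalized or lowercase word | digit run
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return tokens


if __name__ == "__main__":
    for term in ["ArticleAmount", "customer_id", "XMLFile"]:
        print(term, "->", tokenize_term(term))
```

Each resulting token can then be looked up in a knowledge base on its own, which is what allows a single column name to link to several concepts.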
Semantic Matching (Mediate)
► Relationships pre-established within the knowledge-base…
Identify the Target and the Source(s) and run the match.
[Figure: ArticleAmount and ProductShares, automatically linked by a specific % distance.]
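
One way to picture a match scored by distance is to count hops between concepts in a relationship graph and turn the count into a percentage. The slides do not define how MatchIT computes its % distance, so the formula below (100 divided by one plus the hop count) is purely an assumption, as are the toy links and the function names.

```python
from collections import deque

# Toy undirected concept graph; every link here is an illustrative assumption.
TOY_LINKS = {
    "article": {"product"},
    "product": {"article"},
    "amount": {"sum"},
    "sum": {"amount", "shares"},
    "shares": {"sum"},
}


def hop_distance(start, goal, links=TOY_LINKS):
    """Breadth-first search for the fewest hops between two concepts."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in links.get(node, ()):
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None  # the two concepts are not connected


def closeness_percent(start, goal, links=TOY_LINKS):
    """Hypothetical score: 100% for identical concepts, lower with each hop."""
    hops = hop_distance(start, goal, links)
    if hops is None:
        return 0.0
    return 100.0 / (1 + hops)


if __name__ == "__main__":
    print(closeness_percent("article", "product"))
    print(closeness_percent("amount", "shares"))
```

Under this scheme, adding links from a domain knowledge base shortens or creates paths between concepts, so pairs that previously scored zero can start to register as matches.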
Semantic Matching (Mediate)
Not all direct matches are the most relevant…
In many cases the most valuable matches are the distant matches.
By adding a domain knowledge-base these relationships become more obvious.
[Figure labels: Abstraction, Evidence]