Direction of Proposals for New Edition (E3) of ISO/IEC 11179
XMDR Working Group
Presentation to SC 32/WG 2 meeting
September 2005, Toronto, Canada
Page 2XMDR Presentation
Where have we been? Where are we now? … & where are we planning to go?
System manuals
Data dictionaries
11179 E1
11179 E3
Terminologies, ontologies, etc.
XML & related standards
Semantic grids
11179 E2
Semantics management for data
Semantics services (SSOA)
Complex semantics management
Data engineering/XML Data
Data Standards/Data Administration
XMDR Project
Improvements in Semantic Management Technology
Semantics Management
Code sets, 11179 (E1), 11179 (E2), 11179 (E3), …
20943 & 19763 & 20944 24707
The semantics challenge has evolved
Computer Era: 3rd Generation Languages
Challenge: Automated Data Processing – convert paper data systems to automated systems and improve processing. Coded data to save memory, disk & tape.
• Began to identify data with meaningful names
– Data naming methods were innovative and helpful
• Described data using unstructured text in manuals and/or with comments embedded in software
– Only visible & useful to programmers
• Text/documents were not computerized (remember typewriters, stencils, mimeographs, carbon paper?)
ISO/IEC JTC 1/SC 14 developed standard code sets (valid values)
• Focus: Nomenclature for data only
The semantics challenge has evolved
Computer Era: Early DBMS, 4th GL query systems, word processing
Challenge: Manage data – schema integration, eliminate “bit-twiddling”
• Document data in data dictionaries, software packages usually linked to a DBMS. Enforce “integrity constraints” (e.g., valid values). Use the “description” field to describe data
• Manage data life cycle
– Standard code sets (valid values) were useful, but difficult to manage – tended to be left behind by programming changes required to keep up with real world changes.
– Data naming methods failed to achieve interoperation of content between applications and between organizations, but remain useful as human friendly identifiers
SC 14 began to develop methodology for data element standardization
11179 Edition 1 - Part 3, written in text, had ~15 attributes for data elements (editor: Netherlands)
• Focus on standards for data elements
The semantics challenge has evolved
Computer Era: DBMS, query systems, word processing (continued)
Challenge: Manage data – DBMS schema integration, data quality
• Began to model data and processes. Modeling standards became useful: ERD, NIAM, UML
• Word processing began to capture text documents
• Keywords, glossaries, thesauri, and taxonomies became “machine readable”, but were treated as documents and were used manually
SC 14 developed methodology for data element standardization
• 11179 (E1) Parts 4 & 5 covered data definitions and names.
• Part 6 covered registration
• Part 2 WD suggested development of a global taxonomy, then changed to specify classification attributes (term, definition & identifier) in Part 2 (E1)
• All parts of 11179 were written in text.
• Focus on managing data elements and classification of data elements
The semantics challenge has evolved
Computer Era: Maturing Relational DBMS, Metadata Registries, XML, early WWW
Challenge: Manage metadata, use terminology for data integration, data interoperability, data provenance, XML schema integration
SC 14 -> SC 32/WG 2. Developed 11179 Edition 2:
• Broadened from data elements to management of all “administered items”.
• 11179 theme became “metadata registries”
• Part 3 was expressed as a metamodel.
• Part 3 included a “classification scheme region” (nodes & relationships) to improve semantics management
– Link terms in definitions and valid values to terms and definitions in vocabularies and terminologies
– Align concepts used in data with concepts used in text
– Use computers to create and manage terminologies, thesauri, taxonomies
• Part 2 (E2) restated the classification scheme region attributes from Part 3. (All Part 2 E1 attributes were included in Part 3 (E2)).
• Focus on semantics for data and text
The semantics challenge has evolved
Computer Era: WWW, Concept systems, XMDR
Challenge: Semantics management & semantics services
SC 32/WG 2 is developing 11179 (E3). Proposals are made to extend semantics management and semantics services for MDR
ISO/IEC 11179 MDR Standard Goals
• Used to record and link:
– Data elements
– Data element concepts
– Conceptual Domains
– Value Domains: e.g., enumerated value domains
– Classification Schemes
– …
• Goal:
– To record the unambiguous meaning of data
• Human understandable semantics: Current paradigm is natural language definitions
• For E3: Machine “understandable” semantics: formal definitions (and axioms). Machine “understandable” in the sense that a computer can make use of concept systems for processing
Advanced 11179 E3 Use Scenario
A user is concerned about a specific type of cancer
• Wants to discover any documents on the web (reliable and unreliable sources) about the disease, causes, treatment, victims, and researchers
• Wants to link concepts and individuals found in text to metadata and data in databases (where metadata/data relate to the concepts/individuals)
• Wants to find relevant information where the terms used for the concepts vary: by regions, disciplines, scientific nomenclature, vernacular usage, language, and names of individuals.
• Wants to find information that is related through generalization and specialization and other relationships.
• Note: No assumption of federation or central control over data and text generation. However, well managed concept systems and metadata (e.g., data definitions) help.
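The term-variation and generalization/specialization points in this scenario can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy concept system, the identifiers `TERMS` and `BROADER`, and the function names are hypothetical and are not part of ISO/IEC 11179 or the XMDR proposals. It only shows how a well-managed concept system could expand a user's query term into its synonyms and the terms of more specific concepts.

```python
from collections import defaultdict

# Hypothetical toy concept system: concept id -> terms that express it.
TERMS = {
    "neoplasm": {"neoplasm", "tumor", "tumour"},
    "carcinoma": {"carcinoma"},
    "adenocarcinoma": {"adenocarcinoma"},
}
# Hypothetical generalization links: narrower concept -> broader concept.
BROADER = {"carcinoma": "neoplasm", "adenocarcinoma": "carcinoma"}


def concept_for(term):
    """Return the concept id a term belongs to, or None if unknown."""
    needle = term.lower()
    for concept_id, terms in TERMS.items():
        if needle in terms:
            return concept_id
    return None


def with_specializations(concept_id):
    """Return the concept plus every concept reachable via narrower links."""
    children = defaultdict(set)
    for child, parent in BROADER.items():
        children[parent].add(child)
    seen, stack = set(), [concept_id]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children[current])
    return seen


def expand(term):
    """Expand a query term to synonyms and terms of narrower concepts."""
    concept_id = concept_for(term)
    if concept_id is None:
        return {term.lower()}  # unknown term: search for it as typed
    return set().union(*(TERMS[c] for c in with_specializations(concept_id)))


print(sorted(expand("Tumour")))
print(sorted(expand("carcinoma")))
```

A search over text or metadata would then match any of the expanded terms rather than only the term the user typed. No central control over the documents is assumed, only a shared concept system.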
Concept Use and Integration with 11179 Part 3, Edition 2
Object Class: Chemopreventive Agent
Property: NSC Number
Conceptual Domain: Agent
Data Element Concept: Chemopreventive Agent NSC Number
Data Element: Chemopreventive Agent Name
Value Domain: NSC Code
Context: caCORE
Representation: Code
Classification Schemes: caDSR Training
Valid Values: Cyclooxygenase Inhibitor, Doxercalciferol, Eflornithine, …, Ursodiol
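The example above can be sketched as a small data model. This is an illustration under stated assumptions, not the normative Part 3 metamodel: the class and field names are hypothetical, and the permissible values include only the four visible on the slide, since the slide elides the rest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElementConcept:
    """Meaning: an object class paired with a property (names are illustrative)."""
    object_class: str
    prop: str
    conceptual_domain: str


@dataclass(frozen=True)
class EnumeratedValueDomain:
    """Representation: a named, enumerated set of permissible values."""
    name: str
    representation: str
    permissible_values: frozenset

    def is_valid(self, value):
        """Integrity check: is the value one of the permissible values?"""
        return value in self.permissible_values


@dataclass(frozen=True)
class DataElement:
    """A data element concept bound to a value domain within a context."""
    name: str
    concept: DataElementConcept
    value_domain: EnumeratedValueDomain
    context: str


concept = DataElementConcept("Chemopreventive Agent", "NSC Number", "Agent")
domain = EnumeratedValueDomain(
    name="NSC Code",
    representation="Code",
    # Only the values shown on the slide; the full list is elided there.
    permissible_values=frozenset(
        {"Cyclooxygenase Inhibitor", "Doxercalciferol", "Eflornithine", "Ursodiol"}
    ),
)
element = DataElement("Chemopreventive Agent Name", concept, domain, "caCORE")

print(element.value_domain.is_valid("Ursodiol"))  # True
print(element.value_domain.is_valid("Aspirin"))   # False
```

Separating the data element concept (meaning) from the value domain (representation) is what allows the same concept to be registered once and reused with different representations.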
Semantic Management Extensions Goals for Edition 3
• Sharable data that can easily be identified, shared, integrated, and made interoperable across information systems and organizations (a continuing challenge)
– Unambiguous metadata characteristics to register semantic, syntactic and lexical information about data and text
• Human AND machine “understandable”
• Maintain backward compatibility with 11179 (E2) implementations.
• Registration and management of any semantic information useful for administering and managing the content of data and text
Semantic Management Extensions Goals for Edition 3
• Specify a disciplined way to manage linkage of concept systems (KOS) to administered items.
• Improve the linkage of concept systems to data and text
• Enable users to find correspondences between concepts in text and in data, where these are found in dispersed documents and databases. Concepts may be given linguistic expression with terms that vary by synonymy, discipline, region, language, etc.
• Registration of semantics to facilitate concept (and data) mapping, inference, aggregation
• Manage metadata for not only DBMS & XML schemas, but also for knowledge bases, concept systems, …
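One way to picture the linkage goals on this slide is an index from concept identifiers to the items that use them. The sketch below is hypothetical throughout: `ConceptLinkRegistry`, the `"data"`/`"text"` kinds, the concept identifier, and the document ids are invented for illustration and do not come from the E3 proposals.

```python
from collections import defaultdict


class ConceptLinkRegistry:
    """Hypothetical index linking concepts to data items and text documents."""

    def __init__(self):
        # concept id -> set of (kind, item id) pairs
        self._by_concept = defaultdict(set)

    def link(self, concept_id, kind, item_id):
        """Record that an item of the given kind ('data' or 'text') uses a concept."""
        self._by_concept[concept_id].add((kind, item_id))

    def correspondences(self, concept_id):
        """Return (data items, text items) that share the given concept."""
        linked = self._by_concept.get(concept_id, set())
        data_items = sorted(item for kind, item in linked if kind == "data")
        text_items = sorted(item for kind, item in linked if kind == "text")
        return data_items, text_items


registry = ConceptLinkRegistry()
registry.link("concept:chemopreventive-agent", "data", "Chemopreventive Agent Name")
registry.link("concept:chemopreventive-agent", "text", "doc-001")
registry.link("concept:chemopreventive-agent", "text", "doc-002")

print(registry.correspondences("concept:chemopreventive-agent"))
```

Because both data elements and documents point at the same concept identifier rather than at each other, correspondences can be found even when the documents and databases are dispersed and independently managed.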
Semantic Management Extensions Goals for Edition 3 (Continued)
• Manage both data life cycle and ontology life cycle
• Help to harmonize ontologies
• Manage metamodels, reference ontologies & local ontologies
• Restate Part 3 as an ontology and in Common Logic to enable use in semantics technologies (Semantic Web, inference engines, reasoners, …).
• Restate Part 3 using MOF
• 11179 registries provide support for ISO/IEC 19763.
• Specify semantics services for a semantics service oriented architecture. Enabler for semantic computing, semantic agents, semantic grids.
– Semantic services needed for semantic web and semantic grids to become part of ISO/IEC 20944.
XMDR Intentions
• We want to try to capture existing thesauri, terminologies, and ontologies as sources for the semantic specification of data elements to be used in databases, XML documents, messages, etc.
• We want to incorporate more formal semantic specifications (e.g., ontologies, formal statements (axioms, sentences, ...)) to permit more precise semantic specifications (cf. natural language definitions).
• We want to incorporate formal semantic specifications to facilitate machine processing of semantic specifications, e.g., by inference engines, agents, etc. Such machine processing of semantic specifications can be used in support of federated database access, web service identification and coordination, agent-based computations, etc.
• We want to provide a framework for the registration, harmonization, evolution and standardization of ontologies.
Conceptual vs. Information Centric Metadata Standards
Conceptual Level
Ontology Standards: OWL, KIF, CL, ...
Terminology Standards
Connections ???
Information Artifacts Metadata
OMG Standards: MOF, CWM, UML
Space of Metadata Standards
OMG Standards: MOF, UML, CWM
Terminology Standards
Ontology Standards: OWL, KIF, CL, XTM, ....
MMF & ISO/IEC 11179 Edition 3 Metadata Registry Standards
About information artifacts: data elements, schemas, UML models, ...
Conceptual models of the “real world”
ISO/IEC 11179 connects both conceptual models and information artifacts.
ISO/IEC 11179 Metadata Registry Standard
• Connects both:
– Conceptual models of the real world:
  • Concepts, data element concepts, classification schemes
  • Terminologies, taxonomies, ontologies
– Information Artifacts
  • Data elements, enumerated values, ...
  • UML models (e.g., in caDSR)