.
Daniel Schober, Martin Boeker, University Medical Center Freiburg
‘Ontology Simplification’: Buzzword or Real Need?
OBML 2010
.
State of the Art
• Transition from taxonomies to description logics
  – Increasing formal semantics & expressivity
• OWL DL applied widely
  – W3C standard for the ‘Semantic Web’
  – Multiple efforts with massive funding
  – Ontology libraries & best practice providers
.
Problem
• DL ontologies rarely used in application settings
  – Used in .ac & .edu, but not in .com projects
  – Only small data sets exploited
• Competency questions and use cases usually retro-fitted
• DL aspects rarely exploited
  – Complex LEGO-style definitions
  – RL profile expressivity
    • Cardinalities, disjoints, nested universal restrictions, …
Usability decreases as formality and expressive power increase
Usability is inversely proportional to complexity
.
Potential Reasons
• Inherent complexity of biodomain
  – Dealing with non-linear behaviors, non-classical physics
  – Far from the sensory-perceptible world (‘meso-level’)
• Complexity of DL
  – Set-theoretic approach counter-intuitive to object-orientation
  – Long class expressions, nested with multiple brackets
  – Hard-to-read syntax
• Reasoners can’t cope with expressivity on larger scale
  – Computationally not feasible
• DL (semantics), OWL (syntax)? … not really very robust
• Tools & editors? … very far from being robust
• Ontologists can’t keep up with frequent changes
• Steep learning curve on engineering & usage side
.
Hypothesis
‘Is advertising ‘simplifications’ a solution?’
• Increase end-user compliance
• Render DL ontologies human understandable
  … while not sacrificing reasoning capabilities
Approach
• Collect and review existing simplifications
  – Check if whole approach feasible?
• Introduce new simplification methods
• Create typology of simplification methods
• Raise general awareness
• Make methods accessible
• Test if they increase compliance
.
‘Understandable’?
• Ontology is understandable if its RUs are understandable
• RU is understandable if …
  – it is traceable and readily applicable by the user
  – its intended meaning can be grasped in a short time
• This is the case if …
  – it is labeled in line with user expectations
  – it is instantiated often
    • easy mapping to everyday language constructs
    • high everyday usage frequency
  – it resides in the ‘intuitive’ meso-level
    • i.e. neither too abstract nor too special
    • directly perceivable by human senses
  – it belongs to a traceable and intelligible top-level category
    • e.g. MaterialEntity vs. DependentContinuant
  – it has short logical definitions
    • built from simple RUs themselves
• Understanding can be facilitated by tools
  – Using principles of software ergonomics
  – Implementing simplification and normalisation strategies
.
Typology (naive start)
1. Syntax simplifications
2. Structural simplifications
3. Shortcuts and local approximate models
4. Views showing subsets of entities
5. Modularizations, partitions, slims
6. GUI simplifications / software ergonomics
Some examples …
.
Normalize syntax (1.a)
Normalizing equivalent constructs into simpler forms
  – Syntactical complexity reduction
    • E.g. via specialized language constructs
E.g. for disjointness
Simplify
<owl:Class rdf:about="#AdrenalineReceptor">
  <rdfs:subClassOf>
    <owl:Class>
      <owl:complementOf rdf:resource="#VoltageGatedReceptor"/>
    </owl:Class>
  </rdfs:subClassOf>
</owl:Class>
into
<owl:Class rdf:about="#AdrenalineReceptor">
  <owl:disjointWith rdf:resource="#VoltageGatedReceptor"/>
</owl:Class>
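The disjointness rewrite above can be sketched as a small pattern match over axioms. This is only an illustration under stated assumptions, not the authors' tooling: the tuple encoding of axioms, the function name normalize_disjointness, and the extra class Receptor are all made up for this example.

```python
# Hypothetical tuple encoding of axioms (an assumption for illustration):
#   ("SubClassOf", sub, sup)     sub is a subclass of sup
#   ("Not", cls)                 complement of the named class cls
#   ("DisjointClasses", a, b)    a and b share no instances

def normalize_disjointness(axioms):
    """Rewrite 'A SubClassOf (Not B)' into the dedicated DisjointClasses(A, B)."""
    result = []
    for axiom in axioms:
        is_complement_subclass = (
            axiom[0] == "SubClassOf"
            and isinstance(axiom[2], tuple)
            and axiom[2][0] == "Not"
            and isinstance(axiom[2][1], str)
        )
        if is_complement_subclass:
            result.append(("DisjointClasses", axiom[1], axiom[2][1]))
        else:
            # Anything that does not match the pattern is passed through untouched.
            result.append(axiom)
    return result


axioms = [
    ("SubClassOf", "AdrenalineReceptor", ("Not", "VoltageGatedReceptor")),
    ("SubClassOf", "AdrenalineReceptor", "Receptor"),  # hypothetical extra axiom
]
normalized = normalize_disjointness(axioms)
```

Only the first axiom matches the complement pattern; the second is left as it is, so the rewrite changes the form but not the set of statements.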
.
Conflate redundancy in restrictions (1.b)
Avoid redundancy in restrictions
Simplify
NeuralInflammation ⊑ Inflammation ⊓ ∃has-participant. CNS_Tissue ⊓ ∃has-participant. PNS_Tissue ⊓ ∃has-participant. Brain_Tissue
into
NeuralInflammation ⊑ ∃ has-participant. (CNS_Tissue ⊓ PNS_Tissue ⊓ Brain_Tissue)
.
Increase human readability (1.c)
Human readable syntax
• Omit logics-specific symbolism
Simplify
HeparinBiosynthesis ⊑ HeparinMetabolism ⊓ (Biosynthesis ⊓ ∃ acts_on. Heparin)
into Manchester OWL Syntax
HeparinBiosynthesis SubClassOf HeparinMetabolism AND (Biosynthesis AND acts_on SOME Heparin)
or Attempto Controlled English (CNL)
“Every HeparinBiosynthesis is a HeparinMetabolism. Every HeparinBiosynthesis is a Biosynthesis that acts_on a Heparin.”
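A keyword-based rendering of this kind can be sketched as a recursive walk over the class expression. The nested-tuple encoding and the function names render and render_subclass_axiom are assumptions for this example; the keyword spelling (AND, SOME) follows the slide and does not cover the full Manchester syntax.

```python
# Hypothetical expression encoding (an assumption for illustration):
#   "Name"                    named class
#   ("and", e1, e2, ...)      intersection of expressions
#   ("some", prop, filler)    existential restriction

def render(expr):
    """Render a class expression with keywords instead of logical symbols."""
    if isinstance(expr, str):
        return expr
    op = expr[0]
    if op == "and":
        return "(" + " AND ".join(render(e) for e in expr[1:]) + ")"
    if op == "some":
        return f"({expr[1]} SOME {render(expr[2])})"
    raise ValueError(f"unsupported constructor: {op}")


def render_subclass_axiom(sub, sup):
    """Render 'sub is a subclass of sup' as one readable line."""
    return f"{sub} SubClassOf {render(sup)}"


line = render_subclass_axiom(
    "HeparinBiosynthesis",
    ("and", "HeparinMetabolism", "Biosynthesis", ("some", "acts_on", "Heparin")),
)
# line == "HeparinBiosynthesis SubClassOf (HeparinMetabolism AND Biosynthesis AND (acts_on SOME Heparin))"
```

The brackets are emitted conservatively around every composite expression, which trades a little verbosity for an unambiguous reading.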
.
Simplify labels (2.c)
Naming Conventions
• Shorten long relation names
  – “Anatomic_Structure_Is_Physical_Part_Of”
• Remove redundancy
Simplify
Ovary ⊑ ∃Anatomic_Structure_Is_Physical_Part_Of. Reproductive_System
into
Ovary ⊑ ∃Is_Physical_Part_Of. Reproductive_System
• ‘Anatomic_Structure’-prefix is already specified via domain of relation
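The label shortening can be sketched as a prefix check against the label of the relation's domain class. The function name simplify_relation_label and the underscore-joined naming scheme are assumptions for this example; a real ontology may encode labels differently.

```python
def simplify_relation_label(label, domain_label):
    """Drop a leading domain-class prefix from a relation label, if present."""
    prefix = domain_label + "_"
    # Only strip when something meaningful remains after the prefix.
    if label.startswith(prefix) and len(label) > len(prefix):
        return label[len(prefix):]
    return label


short = simplify_relation_label(
    "Anatomic_Structure_Is_Physical_Part_Of", "Anatomic_Structure"
)
unchanged = simplify_relation_label("Is_Physical_Part_Of", "Anatomic_Structure")
# short == "Is_Physical_Part_Of"; unchanged is returned as given
```

Because the prefix is dropped only when it equals the declared domain, no information is lost: the domain axiom still states what the label no longer repeats.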
.
Conflate property chains (3.a)
Use Shortcuts
• Property chains (OWL 2) allow shortening expressions
  – Compress two triples into one
  – Conflate / fold expression over 2 or more properties
Simplify
GeneA transcribed_to GeneA_mRNA
GeneA_mRNA translated_to GeneA_Protein
into
GeneA_Protein product_of GeneA
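How a two-step property chain folds two triples into one can be sketched over plain subject-property-object tuples. The function apply_chain and the intermediate property name has_product are assumptions for this example: the chain itself runs from gene to protein, so the slide's product_of is obtained here by inverting the derived triple.

```python
def apply_chain(triples, first, second, shortcut):
    """For every 'x first y' and 'y second z', derive 'x shortcut z'."""
    second_step = {}
    for s, p, o in triples:
        if p == second:
            second_step.setdefault(s, []).append(o)
    derived = set()
    for s, p, o in triples:
        if p == first:
            for z in second_step.get(o, []):
                derived.add((s, shortcut, z))
    return derived


triples = [
    ("GeneA", "transcribed_to", "GeneA_mRNA"),
    ("GeneA_mRNA", "translated_to", "GeneA_Protein"),
]
forward = apply_chain(triples, "transcribed_to", "translated_to", "has_product")
# Read the shortcut in the inverse direction, as on the slide.
inverse = {(o, "product_of", s) for s, _, o in forward}
```

The original two triples stay available; the shortcut is an additional, derived statement that a view can show in their place.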
.
Simplified umbrella classes (3.b)
Allows for graceful evolution through temporary proximity models – which can later be untangled seamlessly
Goal model for diseases
PathologicalDisposition ⊑ ∃ inheresIn. PathologicalStructure
PathologicalDisposition ⊑ ∀ hasRealization. PathologicalProcess
PathologicalProcess ⊑ ∃ hasParticipant. PathologicalStructure
PathologicalProcess ⊑ ∃ realizationOf. PathologicalDisposition
Pre-coordination is labor-intensive due to combinatorial explosion
.
Simplified umbrella classes (3.b)
• A pragmatic proximity model can be introduced
  – Insert temporary umbrella class
  – Ignoring disposition / structure / process distinction
PathologicalEntity ≡ PathologicalStructure ⊔ PathologicalDisposition ⊔ PathologicalProcess
• Later gracefully evolve towards complex model
• All needed relations for …
  – Pathological Structures: part-of / located-in
  – Pathological Dispositions: inheres-in
  – Pathological Processes: has-participant / located-in
… can be captured via one super-relation has-locus
• Allows connecting from any PathologicalEntity to relevant location
  – but without commitment to granularity
  – still, the simplified model supports some inferences
• It can later be expanded – without rendering the simplification false
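The role of has-locus as a super-relation can be sketched with a simple sub-property lookup. The relation names come from the slide; the example entities (LungTumor, AsthmaDisposition, Pneumonia_Process, Lung, Bronchus) and the function names are hypothetical and only serve the illustration.

```python
# Each specific relation is declared a sub-relation of has-locus.
SUBPROPERTY_OF = {
    "part-of": "has-locus",
    "located-in": "has-locus",
    "inheres-in": "has-locus",
    "has-participant": "has-locus",
}


def with_super_relations(triples, subproperty_of=SUBPROPERTY_OF):
    """Add the super-relation triple for every asserted sub-relation triple."""
    inferred = set(triples)
    for s, p, o in triples:
        if p in subproperty_of:
            inferred.add((s, subproperty_of[p], o))
    return inferred


def loci(entity, triples):
    """All locations reachable from entity via has-locus."""
    return {
        o
        for s, p, o in with_super_relations(triples)
        if s == entity and p == "has-locus"
    }


facts = [
    ("LungTumor", "part-of", "Lung"),               # pathological structure
    ("AsthmaDisposition", "inheres-in", "Bronchus"),  # pathological disposition
    ("Pneumonia_Process", "has-participant", "Lung"),  # pathological process
]
lung_tumor_loci = loci("LungTumor", facts)
```

A query for has-locus works uniformly across structures, dispositions and processes, while the specific assertions remain in place for a later, finer-grained model.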
.
Discussion
• Typology in early stage
  – Re-structure into polyhierarchy of disjoint orthogonal branches
• Potential sortals ordering simplifications
  – By entity tackled
  – By persistence
  – By life cycle
    • kick-off, development, deployment/usage
  – By ergonomics (perceptual psychology)
  – By user role
  – By user background
    • mathematician, computer scientist, logician, philosopher, linguist, biologist,
.
Discussion
• Ease access to simplification methods
• Publish
  – OBO Foundry initiative
  – Ontology Engineering and Patterns Task Force (SWBPD-WG)
  – Ontology Design Pattern portals
• None currently addresses ‘simplifications’
• Rather seen as properties of general design patterns
• Introduce special ‘simplification pattern type’ or add additional descriptor to existing pattern types?
.
Conclusions
• Reason for limited impact of OWL DL
  – Performance problems
  – Inherent complexity
• Complexity can be coped with by simplifications
• Collection of >30 reviewed simplification methods
  – Put in Typology
  – Collection and Typology to be expanded
• Cross-talk with ODP community
• Compare user compliance pre- and post-simplification
  – Test how fast two codes/ontologies lead to desired result for same test task
• Feedback appreciated
.
Resources & Acknowledgements
Resources
• Find Simplifications & reviews on http://www.imbi.uni-freiburg.de/~schober/Simplifications/
Acknowledgements
• Martin Boeker
• Stefan Schulz
• Josef Ingenerf
• The DebugIT community
.
Normalize syntax (1.a)
E.g. for instance-assertions
Simplify
<rdf:Description rdf:ID="Beta_receptor_94">
  <rdf:type rdf:resource="#AdrenalineReceptor"/>
</rdf:Description>
into
<AdrenalineReceptor rdf:ID="Beta_receptor_94"/>
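This abbreviation can be sketched with Python's standard XML library. It is a minimal illustration, not a full RDF/XML processor: the namespace http://example.org/receptors# and the function name abbreviate_typed_nodes are hypothetical, the sketch assumes that "#AdrenalineReceptor" resolves into that namespace, and the ID is written with underscores because rdf:ID values must be valid XML names.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/receptors#"  # hypothetical ontology namespace

SOURCE = f"""<rdf:RDF xmlns:rdf="{RDF}">
  <rdf:Description rdf:ID="Beta_receptor_94">
    <rdf:type rdf:resource="#AdrenalineReceptor"/>
  </rdf:Description>
</rdf:RDF>"""


def abbreviate_typed_nodes(root, namespace):
    """Turn rdf:Description nodes with exactly one local rdf:type into typed nodes."""
    for node in root:
        if node.tag != f"{{{RDF}}}Description":
            continue
        types = node.findall(f"{{{RDF}}}type")
        if len(types) != 1:
            continue  # zero or several types: leave the long form alone
        resource = types[0].get(f"{{{RDF}}}resource", "")
        if not resource.startswith("#"):
            continue  # only local fragment references are handled here
        node.remove(types[0])
        node.tag = f"{{{namespace}}}{resource[1:]}"
    return root


root = abbreviate_typed_nodes(ET.fromstring(SOURCE), EX)
typed_node = root[0]
```

After the rewrite the element name itself carries the class, and the rdf:ID attribute is kept unchanged.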
.
Towards Simplification Methods
• Two types of simplifications
  1. Removing complexity
    • Prevents full exploitation of semantics
    • Format transformation into OWL Lite or SKOS
  2. Hiding complexity
    • Allows full exploitation of semantics
    • Views and excerpts of ontologies
• Define characteristics for ‘simplicity’ and ‘understanding’ for the following aspects
  – Individual cognitive abilities
  – Semantics & syntax
  – Software ergonomics
.
Simplification Collection and Review
.
Conflate redundancy in restrictions (1.b)
Avoid redundancy in restrictions
• Frequent source of modeling errors
  – E.g. below: each individual AdrenalineReceptor is simultaneously expressed in three different body parts
Simplify
AdrenalineReceptor ⊑ ∃Gene_Product_Expressed_In_Tissue. Lung ⊓ ∃Gene_Product_Expressed_In_Tissue. Brain ⊓ ∃Gene_Product_Expressed_In_Tissue. Muscle
into
AdrenalineReceptor ⊑ ∀ Expressed_In. (Lung ⊔ Brain ⊔ Muscle)
.
Conflate property chains (3.a)
Use Shortcuts
• Property chains (OWL 2) allow shortening expressions
  – Compress two triples into one
  – Conflate and fold expression over 2 or more properties
Simplify
Pneumonia outcome_of LungInflammation
LungInflammation treated_by Antibiotics
Tryptophan substrate_of IndolePhosphatase
IndolePhosphatase has_product TryptophanPhosphate
A is_son_of B and B is_brother_of C
into
Pneumonia improved_by Antibiotics
Tryptophan processed_to TryptophanPhosphate
A has_uncle C
• Two properties can be chained by a new property
• In particular views, shortcuts increase understanding
.
• To investigate
  – What complexities can be automatically detected and removed?
    • Parsers can unify / normalize and simplify syntax
  – ‘Guided simplification finder’ chooses appropriate simplifications based on user requirements?
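One kind of automatic detection can be sketched for the repeated-restriction case from the AdrenalineReceptor example: flag every property that occurs in more than one existential restriction of the same class. The pair encoding, the function name repeated_restrictions and the extra restriction on has_ligand are assumptions for this example; the sketch only reports candidates and leaves the choice of rewrite to the modeler.

```python
from collections import defaultdict


def repeated_restrictions(restrictions):
    """Group (property, filler) restrictions and keep properties used more than once."""
    fillers = defaultdict(list)
    for prop, filler in restrictions:
        fillers[prop].append(filler)
    return {prop: found for prop, found in fillers.items() if len(found) > 1}


# Existential restrictions asserted on AdrenalineReceptor.
adrenaline_receptor = [
    ("Gene_Product_Expressed_In_Tissue", "Lung"),
    ("Gene_Product_Expressed_In_Tissue", "Brain"),
    ("Gene_Product_Expressed_In_Tissue", "Muscle"),
    ("has_ligand", "Adrenaline"),  # hypothetical, occurs only once
]
candidates = repeated_restrictions(adrenaline_receptor)
```

Here only Gene_Product_Expressed_In_Tissue is reported, together with its three fillers, which is the input a guided simplification step would need.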