Semantics: How Semantic Technologies are
Transforming Information Systems
Semantic Arts, Inc.
Dave McComb
for Minneapolis DAMA January 18th 2006
January 18, 2006 © Semantic Arts, Inc 2005/2006. 2
Objectives
Semantics > Good Definitions
Exotic Terminology
Pursue this further
Discipline
Standards
Tools
Content
Infrastructure
Semantic Web
Semantic Technology
Semantic Methodology,
Design & Approach
Discipline
Standards
Tools
Content
Infrastructure
Part 1: Intro, Concepts and
Methods
Part 2: Semantic Metadata and
Annotated Data
Part 3: Semantic Web
Part 4: Demos
Semantic Concepts, Discipline and Methods
Discipline
Standards
Tools
Content
Infrastructure
Part 1: Intro, Concepts and
Methods
Semantics
The study of meaning (sometimes the study of the meaning of words)
Structure and Metadata
You can now deal with thousands, even millions of transactions, by knowing only a small amount of metadata
Drowning in Metadata
Thousands -> millions of bits of metadata
Meta metadata? XMI/MOF/CWM
Millions -> billions of instances in hundreds of databases
Commit to shared ontologies to get back to thousands / tens of thousands of concepts
Operative Semantics
Some of these fields are “known” to the system and cause overt changes in behavior
Others are more subtle
This one shows up on the detailed P&L reports
This one shows up in the AP list of bills to pay
This one shows up on the check
None of this is mentioned in the user manual or the online help text
Scale issues
Carver Mead
Flat Earth Schema
We need to get up out of the weeds
Higher level, business concepts
Discipline
Standards
Tools
Content
Infrastructure
Part 2: Semantic Metadata and
Annotated Data
Metadata and Annotated Data
Content: FOAF
Friend Of A Friend Ontology for contacts
Content: Dublin Core
So, how do we do this?
Business Vocabulary
Taxonomy
Ontology
Description Logic
Business Vocabulary
Not whether, but
– when:
• as you come across the terms, or up front?
– what source:
• source documents, interviews or existing systems?
– how:
• defining terms or concepts?
Business Vocabulary
Schema Jargon
Injured workers -- representatives
Information contained in the claim files and records of injured workers, under the provisions of this title, shall be deemed confidential and shall not be open to public inspection (other than to public employees in the performance of their official duties), but representatives of a claimant, be it an individual or an organization, may review a claim file or receive specific information therefrom upon the presentation of the signed authorization of the claimant.
Employers -- Representatives
Employers or their duly authorized representatives may review any files of their own injured workers in connection with any pending claims.
Claimant
A claimant may review his or her claim file if the director determines, pursuant to criteria adopted by rule, that the review is in the claimant's interest.
Patient
Except as otherwise provided by law, all treatment records shall remain confidential. Treatment records may be released only to the persons designated in this section, or to other persons designated in an informed written consent of the patient….[much more]
Child Victims
Information revealing the identity of child victims of sexual assault who are under age eighteen is confidential and not subject to public disclosure. Identifying information means the child victim's name, address, location, photograph, and in cases in which the child victim is a relative or stepchild of the alleged perpetrator, identification of the relationship between the child and the alleged perpetrator.
Dilbert’s Boss Understands This
“How to”
Sources
– Documents
– Existing systems
– Controlled Vocabularies
– Interviews
Techniques
– Distinctionary
– Concept -> Term
Documents
Information contained in the claim files and records of injured workers, under the provisions of this title, shall be deemed confidential and shall not be open to public inspection (other than to public employees in the performance of their official duties), but representatives of a claimant, be it an individual or an organization, may review a claim file or receive specific information therefrom upon the presentation of the signed authorization of the claimant.
Existing systems
Vocabulary Item:
“A variety of language unique to an individual”
Idiolect
Every System We Design or Buy…
… is another idiolect
Interviews
• Enumerate types
• Look for counter examples
• Look for similarities
• Synonyms
Warning:
Definitions are hard to get consensus on
And often not worth it
Example good Definition
Customer: Groups or individuals who have a business relationship with the organization--those who receive and use or are directly affected by the products and services of the organization. Customers include direct recipients of products and services, internal customers who produce services and products for final recipients, and other organizations and entities that interact with an organization to produce products and services.
Another Problem with Definitions
Homonym problem
– Same lexical word means different things
SUMO and WordNet
Concept
Avoids the generalized definition trap
Drastically speeds up discovery (have you ever tried to get a group of experts to agree on the meaning of a set of terms?)
Finesses the homonymy problem
Term or Terms
Process
Tease apart the facets of a given definition.
People will generally agree with the facets.
They won’t necessarily agree on the same combination of facets mapping to the base word you started with.
Ask: what could we call each bundle of facets that they care about?
e.g., mother
Key Concept: The Distinctionary
Is: a glossary
Is distinct from other glossaries: structurally, each definition first specifies the more general type of thing the word is, and then provides a way to distinguish this thing from others that are similar.
Example
Patient:
A patient is a role between a human being and a healthcare delivery institution.
It is different from other roles between a human and a healthcare delivery institution in that the human had been the recipient of the delivery of diagnostic or corrective health care services.
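A minimal sketch of how a distinctionary entry could be held as data, assuming each entry has exactly two parts: the more general type and the distinguishing feature. The class name `DistinctionaryEntry` and its fields are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DistinctionaryEntry:
    term: str
    genus: str        # the more general type of thing the term is
    differentia: str  # what distinguishes it from similar things

    def render(self) -> str:
        # General type first, then the distinguishing feature.
        return (f"{self.term} is a {self.genus}. "
                f"Unlike other kinds of {self.genus}, {self.differentia}.")


patient = DistinctionaryEntry(
    term="Patient",
    genus="role between a human being and a healthcare delivery institution",
    differentia=("the human has been the recipient of diagnostic "
                 "or corrective health care services"),
)
print(patient.render())
```

Forcing every entry through the same two fields makes it hard to write a definition that skips either the general type or the distinction.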
Taxonomies
Business Vocabulary
Taxonomy
Ontology
Description Logic
Taxonomy
“A taxonomy is a system for classifying and organizing large amounts of information”
Seth Earley www.earley.com
DMOZ
Home
– Gardening
– Personal Finance
– Cooking
• Baking
• Casseroles
• Camping
– Dutch Oven
Formal Taxonomy
Kingdom: Animalia
Phylum: Arthropoda, Chordata
Class: Mammalia
Order: Carnivora
Family: Felidae
Genus: Panthera, Ursus (bears)
Species: leo (lion), tigris (tiger)
isa? isa?
Subsumption v. Inheritance
Dynamic v. Static
Pension
-pensionAmt : int
+PaidToDate() : int
+Reserve() : int
Claim
-TimeLoss : bool
-ReturnToWork : Date
+ClaimMgr() : object
+DaysLost() : int
Ontology -- Frame based
Business Vocabulary
Taxonomy
Ontology
Description Logic
Ontology Definition
“A specification of a conceptualization”
Tom Gruber
Taxonomy : Ontology :: Tree : Network
Limits of Taxonomy
Disjointedness
Concept: A Small Ontology
GP (Genealogy Primitives)
Person
M/F
Spouse
Parent
Consider my family Database
Name | MName | FName | Sex | DoB | EyeColor
Dave | Naomi | John | M | 11/18/52 | Grey
Naomi | Betty | William | F | 12/20/15 | Hazel
John | Walter | Crete | M | 11/15/17 | Blue
Addie | Heidi | Dave | F | 12/1/88 | Blue
Tommy | Naomi | John | M | 4/3/54 | Blue
... | ... | ... | ... | ... | ...
What kinds of queries could I do?
Any view qualified by the attributes
– (show everyone born before 1/1/1990)
Some join based queries
– (show all of Dave’s children)
But it gets much more complex after that
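The two query styles can be sketched in plain Python. This is only an illustration: the rows are one reading of the flattened family table (Name, MName, FName, Sex, DoB, EyeColor), the 19xx century for the two-digit years is an assumption, and the helper names are hypothetical.

```python
from collections import namedtuple
from datetime import date

Person = namedtuple("Person", "name mname fname sex dob eye_color")

# Assumed reading of the slide's table; mname/fname = mother's/father's name.
people = [
    Person("Dave", "Naomi", "John", "M", date(1952, 11, 18), "Grey"),
    Person("Naomi", "Betty", "William", "F", date(1915, 12, 20), "Hazel"),
    Person("John", "Walter", "Crete", "M", date(1917, 11, 15), "Blue"),
    Person("Addie", "Heidi", "Dave", "F", date(1988, 12, 1), "Blue"),
    Person("Tommy", "Naomi", "John", "M", date(1954, 4, 3), "Blue"),
]

# A view qualified by an attribute.
born_before_1990 = [p.name for p in people if p.dob < date(1990, 1, 1)]


def children_of(name):
    # A join-style query: match a name against the parent columns.
    return [p.name for p in people if name in (p.mname, p.fname)]


def grandchildren_of(name):
    # One step further already needs a join on top of a join.
    return [g for child in children_of(name) for g in children_of(child)]


print(born_before_1990)
print(children_of("Dave"))
print(grandchildren_of("Naomi"))
```

Every new relationship (uncle, cousin, second cousin) needs another hand-written function of this kind, which is the complexity the slide points at.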
Committing to an Ontology
Name | MName | FName | Sex | DoB | EyeColor
Dave | Naomi | John | M | 11/18/52 | Grey
Naomi | Betty | William | F | 12/20/15 | Hazel
John | Walter | Crete | M | 11/15/17 | Blue
Addie | Heidi | Dave | F | 12/1/88 | Blue
Tommy | Naomi | John | M | 4/3/54 | Blue
... | ... | ... | ... | ... | ...
Column annotations: Person, Person, Gender, PersonSpouse
Concept: Committing and Sharing
GP (Genealogy Primitives)
Person
M/F
Spouse
Parent
GC (Genealogy Concepts)
Father…
Uncle…
Cousin…
Second Cousin, etc. …
My Family
Dave is male
Dave is Addie’s parent
Addie is female
Naomi is Dave’s parent
Naomi is Tom’s parent
(two arrows in the diagram, each labeled “Commits to”)
Key concept: queries/ inference can be executed using ontological definitions I’m not even aware of
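A rough sketch of the idea, assuming the family facts are stated only in the primitives (male, female, parent) and the derived concepts are written separately by someone else. The function names are hypothetical, and `grandparents` stands in for the slide's "etc." rather than being a concept it names.

```python
# "My Family": facts stated using only the Genealogy Primitives.
male = {"Dave"}
female = {"Addie"}
parent_of = {("Dave", "Addie"), ("Naomi", "Dave"), ("Naomi", "Tom")}


# "Genealogy Concepts": definitions built on the same primitives,
# written without any knowledge of this particular family.
def fathers(parent_of, male):
    return {(p, c) for (p, c) in parent_of if p in male}


def grandparents(parent_of):
    return {(g, c)
            for (g, p) in parent_of
            for (p2, c) in parent_of
            if p == p2}


print(fathers(parent_of, male))
print(grandparents(parent_of))
```

Because both sides commit to the same primitives, the family data never has to mention "father" or "grandparent" for those queries to work.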
Good Resource
Ontology Development 101: A Guide to creating your first ontology
Natalya Noy and Deborah McGuinness
http://www.ksl.stanford.edu/people/dlm/papers/ontology-tutorial-noy-mcguinness.pdf
Description Logics
Business Vocabulary
Taxonomy
Ontology
Description Logic
Description Logics
This is where the rigor comes in.
Three things that take some getting used to:
– Classes and Instances interchangeable
– Allowing the system to do some of the design work for you
– Open world logic
Plus some very strange terminology and symbology
Description Logics Points of Departure
As much as possible, minimize the number of concepts that have to be accepted axiomatically.
Emphasize formal definitions for all the rest.
DL Definitions
Classes and Instances
Database designers make an early design decision as to what is going to be metadata (classes, columns, etc.) and what is going to be instance data.
For ontologists, this is a continually moving target.
Additionally, properties (which could be equivalent to attributes or relationships) are “free floating” and can be attached to classes, but don’t “belong” to them in the same way as with database models.
Allowing the System to Do some Design
Declared
Inferred
Open World
In closed world (i.e., SQL), absence of information is assumed to be negation. If the query doesn’t find it, it doesn’t exist.
In open world (DL), things are assumed to be possible until proven otherwise.
In DL, classes are assumed to overlap unless they are explicitly declared to be disjoint.
Domain and range are used for reasoning, not constraining.
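The difference can be sketched with a toy fact store. This is an illustration only, not how a DL reasoner is implemented; the facts, the `parentOf` property and its `Person` domain are assumptions made up for the example.

```python
known_true = {("Dave", "parentOf", "Addie")}
known_false = set()  # explicit negations; none have been asserted


def closed_world(fact):
    # SQL-style: a fact that is not found is treated as false.
    return fact in known_true


def open_world(fact):
    # DL-style: a fact that is not found is merely unknown.
    if fact in known_true:
        return True
    if fact in known_false:
        return False
    return None


# Domain used for reasoning, not constraining: the assertion is not
# rejected for lacking a type; the type is inferred from it.
domain = {"parentOf": "Person"}


def infer_types(triples):
    return {(s, domain[p]) for (s, p, o) in triples if p in domain}


query = ("Dave", "parentOf", "Tommy")
print(closed_world(query), open_world(query))
print(infer_types(known_true))
```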
Motherhood
Sue is John’s biological mother
Sarah is John’s biological mother
Therefore?
George Washington’s mother
Other strange vocabulary
DL Term | English | Description | Meaning
Partial | Necessary | Primitive, or defined classes | If something is a member of this class then it is necessary to fulfill these conditions
Complete | Necessary & Sufficient | Derived or defined classes | If something fulfills these conditions, then it is a member of this class
TBox | Terms | Metadata | Reasoning in the ontology
ABox | Assertions | Instances | Reasoning over the data
Content
Business Vocabulary
Taxonomy
Ontology
Description Logic
Discipline
Standards
Tools
Content
Infrastructure
Semantic Web
Essence at each level
TCP/IP Global Physical Addressing
DNS/URL Global Logical Addressing
XML Universal Parsing
XSD Allowable Structure
RDF Assertions / Merging
RDFS Frames / Classes
OWL Inference / Reasoning
SWRL Rule Execution
March 2004
TCP/IP
Single model for communication
Globally unique physical addressing
216.239.37.99
DNS / URL
Logical address need not = physical address
Allows rehosting, migration, etc.
www.google.com -> DNS -> 216.239.37.99
XML
Uniform parsing rules, tools, etc.
Metadata (at least some of it) travels with the data.
XML
<book> “DaVinci Code”
<author> “Dan Brown” </author>
</book>
HTML/ XHTML
<h1> “DaVinci Code”
<p> “Dan Brown” </p>
</h1>
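As a small illustration of "uniform parsing rules", the book fragment can be read with Python's standard XML parser and no book-specific parsing code. The typographic quotes from the slide are dropped here for simplicity.

```python
import xml.etree.ElementTree as ET

doc = "<book>DaVinci Code<author>Dan Brown</author></book>"

root = ET.fromstring(doc)

# The tag names travel with the data, so the consumer can ask for
# "author" by name instead of relying on position or layout.
title = root.text.strip()
author = root.find("author").text.strip()

print(root.tag, "|", title, "|", author)
```

The same parser would read the XHTML version just as easily, but `h1` and `p` say how to display the text, not what it means.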
XSD
Rules about allowable XML combinations
Can verify XML validity
Primarily for creating XML, not consuming it
<xs:element name="sculpture">
<xs:annotation>
<xs:documentation>Comment describing your root element</xs:documentation>
</xs:annotation>
</xs:element>
RDF
Resource Description Framework
Subject/Predicate/Object
“Triple” and “Triple Store”
Make assertions
Merge identities
[proto truth]
“Triples”
Subject | Predicate | Object
A URI (URL) | A URI (URL) | A URI (URL) or Literal
Think instances
Subject/Predicate/Object
Dave McComb | wrote | Sem in Bus
RDF Triples from a Database
Order | CustomerID | OrderDate | TermCode
Order1 | Miller | 5/12/05 | Net10
Order2 | Molson | 5/12/05 | Net10
Order3 | Coors | 5/12/05 | Net10
Order4 | Budweiser | 5/12/05 | Net10
Order5 | Miller | 5/14/05 | Net10
Customer table:
Molson | Molson Ale
Coors | Rocky Mountain
Budweiser | Clydesdales
Miller | Miller Brewing
Resulting triples:
Order2 CustomerID Molson
Order2 OrderDate “5/12/05”
Order2 TermCode “Net10”
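The row-to-triple step can be sketched as follows: the row key becomes the subject, each column name a predicate, and each cell an object. Plain strings stand in for URIs here, and the function name `row_to_triples` is hypothetical.

```python
columns = ("CustomerID", "OrderDate", "TermCode")

orders = {
    "Order1": ("Miller", "5/12/05", "Net10"),
    "Order2": ("Molson", "5/12/05", "Net10"),
    "Order3": ("Coors", "5/12/05", "Net10"),
    "Order4": ("Budweiser", "5/12/05", "Net10"),
    "Order5": ("Miller", "5/14/05", "Net10"),
}


def row_to_triples(key, row):
    # One (subject, predicate, object) triple per column of the row.
    return [(key, column, value) for column, value in zip(columns, row)]


triples = [t for key, row in orders.items() for t in row_to_triples(key, row)]

for triple in row_to_triples("Order2", orders["Order2"]):
    print(triple)
```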
RDF Triples from a Document
<Order> Order2 <Special Labeling> “for winterfest” </Special Labeling></Order>
Order2 Special Labeling “for winterfest”
Simple Merge
Order2 CustomerID Molson
Order2 OrderDate “5/12/05”
Order2 TermCode “Net10”
Order2 Special Labeling “for winterfest”
[Graph: Order2 with edges CustomerID -> Molson, OrderDate -> “5/12/05”, TermCode -> “Net10”, Special Labeling -> “for winterfest”]
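A minimal sketch of why the merge is "simple": if triples are kept as a set, combining the database triples with the document triple is a set union, with no schema change needed. The `describe` helper is hypothetical.

```python
db_triples = {
    ("Order2", "CustomerID", "Molson"),
    ("Order2", "OrderDate", "5/12/05"),
    ("Order2", "TermCode", "Net10"),
}
doc_triples = {
    ("Order2", "Special Labeling", "for winterfest"),
}

# Triples about the same subject line up on their own.
merged = db_triples | doc_triples


def describe(subject, triples):
    # Collect everything asserted about one subject.
    return {p: o for (s, p, o) in triples if s == subject}


print(describe("Order2", merged))
```

A triple that appears in both sources collapses into one, which is the behavior a relational merge would need extra code to achieve.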
First Principles
Two things equal to the same thing are equal to each other.
MER1 & 2 and Spirit
MER2 is Opportunity
MER1 has APXS1
APXS1 has CalibrationSet1
Spirit is MER1
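A sketch of the inference this slide sets up, assuming each "is" statement is an identity assertion (two names for one thing) and each "has" is an ordinary property. The helper names are hypothetical.

```python
same_as = [("Spirit", "MER1"), ("MER2", "Opportunity")]
has = [("MER1", "APXS1"), ("APXS1", "CalibrationSet1")]


def equivalence_groups(pairs):
    # Group names that are asserted, directly or transitively, to be identical.
    groups = []
    for a, b in pairs:
        hit = [g for g in groups if a in g or b in g]
        merged = {a, b}.union(*hit)
        groups = [g for g in groups if g not in hit] + [merged]
    return groups


def aliases(name, groups):
    for group in groups:
        if name in group:
            return group
    return {name}


def parts_of(name, groups, has):
    # Anything asserted about one alias holds for all of them.
    names = aliases(name, groups)
    return {part for (owner, part) in has if owner in names}


groups = equivalence_groups(same_as)
print(parts_of("Spirit", groups, has))
```

Nothing in the data says "Spirit has APXS1"; it follows from the identity plus the assertion about MER1.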
Reification
Each Assertion (statement) has its own URI and can therefore be the Object of another Assertion
Statement 2715
Sushi sameAs RawFish
Dave thinks Stmt 2715
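One way to picture reification, assuming statements live in a dictionary keyed by their own identifier, so that an identifier can appear as the object of another statement. The key `stmt2716` and the helper `who_thinks` are hypothetical; only statement 2715 comes from the slide.

```python
statements = {
    "stmt2715": ("Sushi", "sameAs", "RawFish"),
    # A statement about a statement: its object is another statement's id.
    "stmt2716": ("Dave", "thinks", "stmt2715"),
}


def who_thinks(statement_id):
    # Provenance lookup: who holds the given statement to be true?
    return [s for (s, p, o) in statements.values()
            if p == "thinks" and o == statement_id]


subject, predicate, obj = statements["stmt2716"]
print(subject, predicate, statements[obj])
print(who_thinks("stmt2715"))
```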
Reification is Useful For
Veracity
Provenance
Security
RDFS
RDF Schema
Meta Data for RDF
Adds classes, properties, subclasses
RDFS adds Properties
Order hasProperty CustomerID
Order hasProperty OrderDate
Order hasProperty TermCode
RDFS Subtypes
Order subTypeOf Agreement
OWL
Web Ontology Language
Comes in three flavors
– OWL Lite
– OWL DL (Description Logics)
– OWL Full
Adds Reasoning
OWL DL
Necessary & sufficient
OWL DL
Person
Parent
Ancestor
SWRL
OWL + RuleML
Adds more complex reasoning and the ability to execute action
SWRL
If y is x’s parent, and z is y’s brother, then z is x’s uncle.
parent(?x,?y) ^ brother(?y,?z) -> uncle(?x,?z)
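The uncle rule can be mimicked in a few lines of Python. This is a sketch of the rule's logic only, not of a SWRL engine; the brother fact is an assumption added so that the rule has something to fire on.

```python
# parent(?x, ?y): y is x's parent
has_parent = {("Addie", "Dave")}
# brother(?y, ?z): z is y's brother (assumed fact, not stated in the slides)
has_brother = {("Dave", "Tom")}


def apply_uncle_rule(has_parent, has_brother):
    # Join the two body atoms on the shared variable ?y and
    # emit the head atom uncle(?x, ?z) for every match.
    return {(x, z)
            for (x, y) in has_parent
            for (y2, z) in has_brother
            if y == y2}


has_uncle = apply_uncle_rule(has_parent, has_brother)
print(has_uncle)
```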
Tools
That use this stack of standards
Discipline
Standards
Tools
Content
Infrastructure
Tool: Protege
Tool: AeroText
Infrastructure
Discipline
Standards
Tools
Content
Infrastructure
Infrastructure: Siderean
Infrastructure: Cerebra
Questions?
Recap
Semantics can dramatically reduce the complexity and increase the flexibility of your rule-based (or non-rule-based) systems.
To pursue further
Send an email to me at [email protected] for either a glossary of semantic terms or the “CIO’s Guide to Semantics” [I have a few bound copies]
Visit our web site for many interesting free white papers: www.semanticarts.com
Semantic Wiki: www.semanticwiki.com
Semantic Technology Conference: www.semantic-conference.com
Resources – Books
“Semantics in Business Systems,” print and audio
“Semantic Web Primer” Grigoris Antoniou
“The Semantic Web” Michael Daconta et al.
“Women, Fire, and Dangerous Things” George Lakoff
One last word
www.semanticarts.com
Semantic Arts, Inc.