CS6999 SWT Lecture 1
Introduction to the Semantic Web
Bruce Spencer
NRC-IIT Fredericton
Sept 12, 2002
12-Sep-02CS 6999 SW Semantic Web Techniques2
National Research Council
Research Institutes and Facilities across Canada
– 17 research institutes
– 4 innovation centres
– 3,500 employees; 1,000 guest workers
– National science facilities
S&T information for industry and the scientific community
– CISTI: Canada Institute for Scientific and Technical Information
Network of technology advisors supporting SMEs
– IRAP: Industrial Research Assistance Program
Institute for Information Technology
There are two aspects to IIT
– A mature research organization of ~80 people in Ottawa
– New labs being developed in four cities in New Brunswick and Nova Scotia involving ~60 new people
The whole organization is evolving to accommodate our new distributed nature
NRC’s plans for New Brunswick
What?
– NRC is building an e-business research team in New Brunswick
– E-business includes e-learning, e-government, e-health. Using information and communication technology to help us to educate, govern and take care of ourselves, to create wealth.
– New Brunswick and Canadian companies already have strengths in all three areas
– NB’s communications infrastructure and interested telco
– Bilingual workforce
NRC’s plans for New Brunswick
NRC will act locally, and think nationally and globally
– Will work with the New Brunswick community to develop clusters in e-business
– This is also NRC’s national lab in e-business
– NRC will build international links
Where?
– Main group (40 staff) in Fredericton, at UNBF
– Satellite in Saint John (6 staff), at E-Comm Centre, UNBSJ
– Satellite in Moncton (6 staff), at U. de Moncton
Bruce
MMath 83, BNR 83-86, Waterloo PhD 86-90, UNB prof 90-01, NRC 01-now
Automated reasoning
– data structures in theorem proving
– eliminate redundant searching
– smallest proofs
– deductive databases
Java in curriculum since 1997
Overview and Course Mindmap
Increasing demand for formalized knowledge on the Web: AI’s chance!
XML- & RDF-based markup languages provide a 'universal' storage/interchange format for such Web-distributed knowledge representation
Course introduces knowledge markup & resource semantics: we show how to marry AI representations (e.g., logics and frames) with XML & RDF [incl. RDF Schema]
[Course mindmap nodes: DTDs, XML, RDF[S], Namespaces, Stylesheets, CSS, XSLT, XQL, Queries, XML-QL, Transformations, Acquisition, Protégé, Agents, Frames, Rules, SHOE, HornML, RuleML, DAML, XQuery, TopicMaps, Ontobroker]
The Semantic Web Activity of the W3C
“The Semantic Web is a vision: the idea of having data on the Web defined and linked in a way that it can be used by machines not just for display purposes, but for
• automation,
• integration and
• reuse of data across various applications.”
(http://www.w3.org/2001/sw/Activity)
What your computer sees in HTML (presentation information)
<b>Joe’s Computer Store</b>
<br>
365 Yearly Drive
What your computer sees in XML (content description, ambiguous)
<location>
<name>Joe’s Computer Store</name>
<address>365 Yearly Drive</address>
</location>
What a computer could understand
<mail:address xmlns:mail="http://www.canadapost.ca">
<mail:name>Joe’s Computer Store </mail:name>
<mail:street> 365 Yearly Drive </mail:street>
</mail:address>
www.canadapost.ca could define address, name, street, …
Search engines could then identify mail addresses
Consider shopbots being able to find
– price, quantity, feature, model number, supplier, serial number, acquisition date
Assumes that namespaces will be used consistently
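How a program might pick out mail addresses by namespace can be sketched as follows. This is a minimal illustration using Python's standard xml.etree.ElementTree module; the function name find_mail_addresses and the sample document are assumptions made for the example, and the namespace URI is the one shown on the slide.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the slide; the prefix is only a local alias.
MAIL_NS = "http://www.canadapost.ca"

# Sample document (assumed for illustration), modelled on the slide's markup.
DOC = """
<mail:address xmlns:mail="http://www.canadapost.ca">
  <mail:name>Joe's Computer Store</mail:name>
  <mail:street>365 Yearly Drive</mail:street>
</mail:address>
"""


def find_mail_addresses(xml_text):
    """Return (name, street) pairs for every address element in the mail namespace."""
    root = ET.fromstring(xml_text)
    ns = {"m": MAIL_NS}
    results = []
    # iter() also visits the root element, so a top-level address is found too.
    for elem in root.iter("{%s}address" % MAIL_NS):
        name = elem.findtext("m:name", default="", namespaces=ns).strip()
        street = elem.findtext("m:street", default="", namespaces=ns).strip()
        results.append((name, street))
    return results


print(find_mail_addresses(DOC))
```

An address element that does not use the mail namespace is not matched, which is why consistent namespace use matters.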
Semantic Web
Semantics = meaning
Good idea: Dictionary
– Create a dictionary of terms
– Put it on the web
– Mark up web pages so that terms are linked to these dictionary entries
– This allows more precise matching
Better idea: Thesaurus
– has hierarchies of terms
– shades of meaning
Best idea: Ontology
– hierarchy of terms and logic conditions
Semantic Web
An agent-enabled resource
“information in machine-readable form, creating a revolution in new applications, environments and B2B commerce”
W3C Activity launched Feb 9, 2001
DAML: DARPA Agent Markup Language
– US Gov funding to define languages, tools
– 16 project teams
OIL is Ontology Inference Layer
– DAML+OIL is joint DARPA-EU
Knowledge Representation is a natural choice
• Lox is Smoked, Cured Salmon
• Gravalax is the intersection of Cured and Salmon, but not Smoked
• SmokedSalmon is the intersection of Smoked and Salmon
[Venn diagram: Smoked and Salmon, with regions labelled Gravalax and Lox]
A search for keywords Salmon and Cured should return pages that mention Gravalax, even if they don’t mention Salmon and Cured
A search for Salmon and Smoked will return smoked salmon; it should also return Lox, but not Gravalax
The Semantic Web is about having the Internet use common sense.
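The search behaviour described here can be sketched in a few lines of Python. This is only a toy illustration, not how a real ontology reasoner works: the class definitions follow the bullets on the salmon slides, while the page names, page contents and function names are made up for the example.

```python
# Each ontology class implies a set of more basic terms (from the slides' bullets).
# Gravalax is "not Smoked", so Smoked is simply not among its implied terms.
ONTOLOGY = {
    "SmokedSalmon": {"Smoked", "Salmon"},
    "Gravalax": {"Cured", "Salmon"},
    "Lox": {"Smoked", "Cured", "Salmon"},
}

# Hypothetical pages, each reduced to the set of terms it mentions.
PAGES = {
    "page_a": {"Gravalax", "recipe"},
    "page_b": {"Lox", "bagel"},
    "page_c": {"Smoked", "Salmon"},
}


def implied_terms(words):
    """Return the page's words plus every term implied by an ontology class it mentions."""
    terms = set(words)
    for cls, implied in ONTOLOGY.items():
        if cls in words:
            terms |= implied
    return terms


def search(query, pages):
    """Return the names of pages whose stated or implied terms cover the whole query."""
    wanted = set(query)
    return sorted(name for name, words in pages.items()
                  if wanted <= implied_terms(words))


print(search({"Salmon", "Cured"}, PAGES))   # pages about Gravalax and Lox
print(search({"Salmon", "Smoked"}, PAGES))  # Lox and plain smoked salmon, not Gravalax
```

A plain keyword search would miss page_a and page_b entirely, since neither mentions Salmon, Cured or Smoked literally.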
Tim Berners-Lee’s Semantic Web
RDF Resource Description Framework
Beginning of Knowledge Representation influence on Web
Akin to Frames, Entity/Relationship diagrams, or Object/Attribute/Value triples
RDF Example
<rdf:ProductSpecs about="http://www.lemoncomputers.ca/model_2300">
<specs:colour>yellow</specs:colour>
<specs:size>medium</specs:size>
</rdf:ProductSpecs>
[Graph: model_2300 has size medium and colour yellow]
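The example above amounts to two object/attribute/value triples. A minimal Python sketch of that view follows; the match helper is a hypothetical name introduced for illustration and is not part of any RDF library.

```python
# The slide's example as (object, attribute, value) triples.
TRIPLES = [
    ("model_2300", "colour", "yellow"),
    ("model_2300", "size", "medium"),
]


def match(triples, subject=None, attribute=None, value=None):
    """Return the triples that agree with every field given; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t[0] == subject)
        and (attribute is None or t[1] == attribute)
        and (value is None or t[2] == value)
    ]


print(match(TRIPLES, attribute="colour"))
print(match(TRIPLES, subject="model_2300"))
```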
RDF Class Hierarchy
All lemon laptops get packed in cardboard boxes
Allows one to customize existing taxonomies
– Example: palmtop computers still get packed in boxes
[Graph: lemon_palmtop_20000 is_a model_2300; model_2300 has size medium and colour yellow]
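Inheritance along is_a links can be sketched as a lookup that climbs the hierarchy until it finds a value. In this illustration the class name lemon_laptop, the packaging attribute, the link from model_2300 to lemon_laptop and the palmtop's own size are all assumptions added for the example; only the is_a link from lemon_palmtop_20000 to model_2300 and the size and colour values come from the slide.

```python
# is_a links: child -> parent. The second link is an assumption for illustration.
IS_A = {
    "lemon_palmtop_20000": "model_2300",
    "model_2300": "lemon_laptop",
}

# Attributes stated directly on each resource or class.
PROPERTIES = {
    "lemon_laptop": {"packaging": "cardboard box"},         # hypothetical class
    "model_2300": {"size": "medium", "colour": "yellow"},   # from the slide
    "lemon_palmtop_20000": {"size": "small"},               # assumed customization
}


def lookup(resource, attribute):
    """Return the nearest value for attribute, walking up the is_a chain."""
    seen = set()
    current = resource
    while current is not None and current not in seen:
        seen.add(current)  # guards against accidental cycles in IS_A
        own = PROPERTIES.get(current, {})
        if attribute in own:
            return own[attribute]
        current = IS_A.get(current)
    return None


print(lookup("lemon_palmtop_20000", "size"))       # the palmtop's own value
print(lookup("lemon_palmtop_20000", "packaging"))  # inherited from the laptop class
```

The palmtop overrides one attribute yet still inherits the packaging rule, which is the kind of customization of an existing taxonomy the slide describes.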
Tim Berners-Lee’s Semantic Web
Web Ontology Language (OWL): W3C
Previously known as DAML+OIL
– US: DARPA Agent Markup Language
– EU: Ontology Inference Layer (also Ontology Interchange Language)
Composed of a hierarchy with additional conditions
Based on Description logic, limited expressiveness
– Reasoning procedures are well-behaved
– Just enough power
Identifying Resources
URL/URI
– Uniform resource locator / identifier
– Information sources, goods and services
– Financial instruments: money, options, investments, stocks, etc.
“Where do you want to go today?” – becomes “What do you want to find?”
Ontology
Branch of philosophy dealing with the theory of being
Tarski’s assumption:
– individuals, relationships and functions
“A common vocabulary and agreed-upon meanings to describe a subject domain”
– What real-world objects do my tags refer to?
– How are these objects related?
Communication requires shared terms
– others can join in
Ontology Layer
Widens interoperability and interconversion
– knowledge representation
More meta-information
– Which attributes are transitive, symmetric
– Which relations between individuals are 1-1, 1-many, many-many
Communities exist
– DL, OIL, SHOE (Hendler)
– New W3C working group
Transitive, Subrole example
One wants to ask about modes of transportation from Sydney to Fredericton
“connected by Acadian Lines bus” is a role in a Nova Scotia taxonomy
“connected by SMT bus” is a role from a New Brunswick taxonomy
Both are subroles of “connected”
“connected” is transitive
Note that ontologies can be combined at runtime
Combining Rich Ontologies
Only these facts are explicit
– in separate ontologies
“Connected by bus”
– is superset
– is symmetric and transitive
Route from Sydney to Fredericton is inferred
[Diagram: Sydney, Truro and Amherst connected by Acadian Lines; Amherst, Sussex and Fredericton connected by SMT Lines]
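The inference on this slide can be sketched as a graph search over the combined ontologies. The sketch below is an illustration under assumptions: the exact city pairs are a guess at the diagram (Sydney, Truro, Amherst on Acadian Lines; Amherst, Sussex, Fredericton on SMT Lines), and the role names and function names are made up for the example.

```python
from collections import deque

# Explicit facts, kept in two separate ontologies (city pairs assumed from the diagram).
NS_ONTOLOGY = {"connected_by_acadian_lines": [("Sydney", "Truro"), ("Truro", "Amherst")]}
NB_ONTOLOGY = {"connected_by_smt_lines": [("Amherst", "Sussex"), ("Sussex", "Fredericton")]}

# Both bus roles are subroles of the shared superrole.
SUBROLE_OF = {
    "connected_by_acadian_lines": "connected_by_bus",
    "connected_by_smt_lines": "connected_by_bus",
}


def superrole_graph(ontologies, superrole):
    """Merge every subrole fact into one symmetric adjacency map for the superrole."""
    graph = {}
    for ontology in ontologies:
        for role, pairs in ontology.items():
            if SUBROLE_OF.get(role) != superrole:
                continue
            for a, b in pairs:
                graph.setdefault(a, set()).add(b)
                graph.setdefault(b, set()).add(a)  # the superrole is symmetric
    return graph


def find_route(graph, start, goal):
    """Breadth-first search; following chains of edges is what transitivity licenses."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


combined = superrole_graph([NS_ONTOLOGY, NB_ONTOLOGY], "connected_by_bus")
print(find_route(combined, "Sydney", "Fredericton"))
```

Neither ontology alone contains a route from Sydney to Fredericton; it appears only once both are merged under the shared superrole.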
Tim Berners-Lee’s Semantic Web
Logic Layer
Clausal logic encoded in XML
– RuleML, IBM CommonRules
Special cases of first-order logic
– Horn Clauses for if-then type reasoning and integrity constraints
Standard inference rules based on Resolution
– Various implementations: SQL, KIF, SLD (Prolog), XSB
– J-DREW reasoning tools in Java
Modus operandi: build tractable reasoning systems
– trade away expressiveness, gain efficiency
Logic Architecture Example
Contracting parties integrate e-businesses via rules
[Diagram: Seller E-Storefront and Buyer’s ShopBot, each with its own business rules (OPS5, Prolog), linked by Contract Rules Interchange]
Negotiation via rules
usualPrice:
price(per-unit, ?PO, $60) ← purchaseOrder(?PO, supplierCo, ?AnyBuyer) ∧ shippingDate(?PO, ?D) ∧ (?D 24April2001).
volumeDiscountPrice:
price(per-unit, ?PO, $55) ← purchaseOrder(?PO, supplierCo, ?AnyBuyer) ∧ quantityOrdered(?PO, ?Q) ∧ (?Q ≥ 1000) ∧ shippingDate(?PO, ?D) ∧ (?D 24April2001).
overrides(volumeDiscountPrice, usualPrice).
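How a prioritized rule set like this resolves a conflict can be sketched in Python. This is a simplified illustration, not RuleML or any rule engine's API. The comparison operators did not survive on the slide, so the sketch assumes a quantity of at least 1000 and a shipping date on or after 24 April 2001; the order fields and function names are likewise assumptions, and the override is keyed on the rule labels usualPrice and volumeDiscountPrice.

```python
from datetime import date

CUTOFF = date(2001, 4, 24)  # date from the slide; "on or after" is an assumption


def usual_price(order):
    """usualPrice: $60 per unit for orders placed with supplierCo."""
    if order["supplier"] == "supplierCo" and order["shipping_date"] >= CUTOFF:
        return 60
    return None


def volume_discount_price(order):
    """volumeDiscountPrice: $55 per unit when the quantity reaches 1000 (assumed >=)."""
    if (order["supplier"] == "supplierCo"
            and order["quantity"] >= 1000
            and order["shipping_date"] >= CUTOFF):
        return 55
    return None


RULES = {"usualPrice": usual_price, "volumeDiscountPrice": volume_discount_price}
OVERRIDES = [("volumeDiscountPrice", "usualPrice")]  # (winner, loser)


def price_per_unit(order):
    """Fire every rule, discard conclusions beaten by a higher-priority fired rule."""
    conclusions = {name: rule(order) for name, rule in RULES.items()}
    fired = {name for name, value in conclusions.items() if value is not None}
    beaten = {loser for winner, loser in OVERRIDES if winner in fired and loser in fired}
    prices = {conclusions[name] for name in fired - beaten}
    # Exactly one surviving price is a definite answer; otherwise none is concluded.
    return prices.pop() if len(prices) == 1 else None


big = {"supplier": "supplierCo", "quantity": 2000, "shipping_date": date(2001, 5, 1)}
small = {"supplier": "supplierCo", "quantity": 10, "shipping_date": date(2001, 5, 1)}
print(price_per_unit(big), price_per_unit(small))
```

For the large order both rules fire, and the overrides fact is what lets the discount price win instead of leaving two conflicting prices.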
Hot Research Topics:
Tools to create ontologies
– Ontolingua
– Protégé-2000 (Stanford)
– OilEd
– …
Tools to learn ontologies from a large corpus such as corporate data
– Merging / aligning two different ontologies from different sources on the same topic
Searching cum reasoning tools
– SHOE
Eventual Goal of these Efforts
Agents locate goods, services
– use ontologies
– unambiguous
– business rules
– expressive language but reasoning tractable
– combine from various sources
Gives rise to the need for trust, privacy and security
– e.g. semantic web project to determine eligibility of patients for a clinical trial