March 15, 2006

Dr. Douglas B. Lenat, 3721 Executive Center Drive, Suite 100, Austin, TX 78731

Email: Lenat@cyc.com

Phone: (512) 342-4001

2 July 2005

Applications of the Cyc Formal Ontology

Upper Ontology Symposium

A formal ontology has two parts:

(1) A set of terms (sort of like words)

(2) A set of axioms involving those terms (sort of like sentences built out of them)

The sentences are written in logic, not English, so computers can deeply understand them, not just store them. A computer can deduce the same sorts of things from them that you or I could.

A formal ontology: terms + axioms (in logic)

CYC: 300k terms, 3.2 million handcrafted axioms. Very general ones (“Upper Ontology”) all the way down to some domain-specific terms and axioms.

This afternoon’s talk: Formal Ontologies in general

This talk: Examples of how CYC is applied today

Application #1: Smarter searching

• Query: “Someone smiling”

• Caption: “A man helping his daughter take her first step”

find information by inference (+KB)

When you become happy, you smile.

You become happy when someone you love accomplishes a milestone.

Taking one’s first step is a milestone.

Parents love their children.

(implies
  (and
    (isa ?PARENT Person)
    (children ?PARENT ?CHILD))
  (loves ?PARENT ?CHILD))
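The chain of reasoning that links the caption to the query can be sketched as naive forward chaining. This is only an illustration in Python, not how the Cyc inference engine works: the four rules are flattened into hypothetical propositional strings, and every name below is an assumption made for the example.

```python
# Hypothetical, propositional stand-ins for the four rules on the slide.
# Each rule is (set of premises, conclusion); the strings are illustrative, not CycL.
RULES = [
    ({"parent_of(man, daughter)"}, "loves(man, daughter)"),
    ({"takes_first_step(daughter)"}, "accomplishes_milestone(daughter)"),
    ({"loves(man, daughter)", "accomplishes_milestone(daughter)"}, "happy(man)"),
    ({"happy(man)"}, "smiling(man)"),
]


def forward_chain(facts, rules):
    """Apply rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Facts read off the caption "A man helping his daughter take her first step".
caption_facts = {"parent_of(man, daughter)", "takes_first_step(daughter)"}
derived = forward_chain(caption_facts, RULES)

# The query "Someone smiling" matches if a smiling fact was derived.
print("smiling(man)" in derived)
```

Removing either caption fact breaks the chain, which is why a keyword match alone could not connect this caption to the query.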

Query: vets

Do you mean:

• vets (military veteran) • vets (veterinary surgeon)

Web Results 1-25

vets: 25,947 matches

1. Photographs of Cyclo-Vets @ work

2. Veterans National Archives

3. Recommended Vets for Hamster Owners

4. Sponsors on Vets On Line

5. Pops Place BBS Index Page

Slide callouts: “as fast as usual”, “a second later”

Choosing “vets (military veteran)” expands the query to:

(ex-serviceman OR "military veteran") OR vet OR veteran

AND NOT (veterinarian OR "veterinary surgeon" OR animal)

vets: 388,109 matches

1. Veterans News and Information Service - Military, Army, Navy, Marine Corps, Air Force, Coast Guard

2. Surf Point - Society & Issues: Military/Armed Forces: War Veterans

3. A Vet Remembers

4. Retail and Wholesale Merchants of Military/ Veteran Goods and Services

Choosing “vets (veterinary surgeon)” expands the query to:

veterinarian OR "veterinary surgeon" OR veterinary OR vet

AND NOT (ex-serviceman OR "military veteran" OR veteran)

vets: 153,060 matches

1. Veterinary Book List

2. Advice from The White Cross Veterinary Group

3. Welcome to the World of Eco-Vet

4. Animal Wellness International

5. The economy or management of animals

Three Improvements to Search

• Deep semantic search involving n axioms (slow)

• Add in OR and AND-NOT terms, to reduce the number of false negatives and false positives

• Suggest plausible appropriate follow-on queries

– For veterinarians: how to train to be a vet

– For veterans: benefits of reenlisting
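The second improvement, adding OR and AND-NOT terms once the user picks a sense, can be sketched as follows. This is a minimal illustration, not Cyc's implementation: the `SENSES` table is a hypothetical stand-in for terms that would come from the knowledge base, and the parenthesization differs slightly from the queries shown on the slides.

```python
# Hypothetical sense inventory; in the talk these terms are drawn from the Cyc KB.
SENSES = {
    "vets": {
        "military veteran": ["ex-serviceman", '"military veteran"', "vet", "veteran"],
        "veterinary surgeon": ["veterinarian", '"veterinary surgeon"', "veterinary", "vet"],
    }
}


def expand_query(word, chosen_sense):
    """Build a boolean query: OR the chosen sense's terms, AND NOT the other senses' terms."""
    senses = SENSES[word]
    include = senses[chosen_sense]
    exclude = []
    for sense, terms in senses.items():
        if sense == chosen_sense:
            continue
        for term in terms:
            # Never exclude a term that is also valid for the chosen sense (e.g. "vet").
            if term not in include and term not in exclude:
                exclude.append(term)
    return "(" + " OR ".join(include) + ") AND NOT (" + " OR ".join(exclude) + ")"


print(expand_query("vets", "veterinary surgeon"))
```

The OR terms recover pages that never use the literal query word (fewer false negatives), while the AND-NOT terms drop pages about the other sense (fewer false positives).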

Application #2: Deep Question-Answering

• Even 2-3 step reasoning is relatively deep

• Draw on knowledge from all levels of the Cyc ontology (upper, middle, and domain-specific)

• The following examples come from current DTO and AFRL programs transitioned to RDEC

What factors argue <for/against> the conclusion that <ETA> <performed> <the March 2004 Madrid attacks>?

For:

- ETA often executes attacks near national elections
- ETA has performed multi-target coordinated attacks
- Over the past 30 years, ETA performed 75% of all terrorist attacks in Spain
- Over the past 30 years, 98% of all terrorist attacks in Spain were performed by Spain-based groups, and ETA is a Spain-based group.

Against:

- ETA warns (a few minutes ahead of time) of attacks that would result in a high number of civilian casualties, to prevent them. There was no such warning prior to this attack.
- ETA generally takes responsibility for its attacks, and it did not do so this time.
- ETA has never been known to falsely deny responsibility for an attack, and it did deny responsibility for this attack.

murder of rafik hariri

A Typical Architecture: Formal Ontology + Inference Engines + Interfaces/APIs

Diagram components:

• Cyc Ontology & Knowledge Base

• Cyc Reasoning Modules

• Cyc API

• Knowledge Entry Tools

• User Interface (with Natural Language Dialog)

• Interface to External Data Sources

Outside the system:

• Knowledge Authors

• Knowledge Users

• Other Applications

• External Data Sources: Data Bases, Web Pages, Text Sources, Other KBs

Application #3: Semantic Data Base Integration (Virtual Joins)

• Similar to that last “deep question answering” application, but some of the information is outside the KB: in data bases, on websites, in other ontologies / knowledge bases, etc.

• Map the schema of each information source to Cyc, and have it call on those external sources as needed, to solve sub-sub-…-problems of the query

Data sources: OFAC, DB8, USGS, NARCL, FBI Most Wanted, CATS, CDE, DB4

DB4:

 SuspN         | YOB
---------------+------
 Qusay Hussein | 1966
 Uday Hussein  | 1964

DB8:

 Prenom | Surnom  | ann
--------+---------+-----
 Qusai  | Hussein | 30
 Odai   | Hussein |

Dates shown on the slide: Dec. 31, 1996; Sept. 9, 2003

Data Warehousing: a Quadratic Solution

A Solution that Scales Linearly

(…and, by the way, enables DB population/enrichment)

you! HAL CYC

CONCEPTS

QusayHusseinAl-Takriti

UdaiHusseinAl-Takriti

RULES

(age ?PERSON (YearsDuration ?AGE))

(birthDate ?PERSON ?BIRTH-DATE)
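The quadratic-versus-linear contrast can be made concrete by counting mappings. The sketch below assumes one translator per pair of sources for the warehousing approach and one mapping per source for the hub approach; the function names are illustrative, and the source list is the one shown on the slide.

```python
def pairwise_translators(n):
    """Translators needed when every pair of sources is mapped directly: n(n-1)/2."""
    return n * (n - 1) // 2


def hub_mappings(n):
    """Mappings needed when each source is mapped once to a shared ontology: n."""
    return n


sources = ["OFAC", "DB8", "USGS", "NARCL", "FBI Most Wanted", "CATS", "CDE", "DB4"]
n = len(sources)

print(pairwise_translators(n), hub_mappings(n))  # 28 8
```

Adding a ninth source costs eight new pairwise translators but only one new mapping to the hub.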

A very recent Cyc SKSI example

“What major US cities are particularly vulnerable to an anthrax attack?”

The answer is logically implied by data dispersed through several sources:

• USGS GNIS DB

• AMVA KB

• RAND R

• UN FAO DB

• DTRA CATS DB

“What major US cities are particularly vulnerable to an anthrax attack?”

“major US city”: ?C is a U.S. city with >1M population

“particularly vulnerable to an anthrax attack”:

– the current ambient temperature at ?C is above freezing, and

– ?C has more than 100 people for each hospital bed, and

– the number of anthrax host animals near ?C exceeds 100k

U.S. cities with population > 1 million

1-2 conjuncts in a CycL “Ask” expression

(and
  (isa ?C USCity)
  (> (NumberOfInhabitantsFn ?C) 1000000)
  (vulnerableToScriptedEventTypeUsing
    ?C
    DeployingABioAgentByInfectingAZoonoticHost
    Anthrax-Bacterium))
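The conjunctive structure of the question can be sketched as a filter in which each conjunct is answered by a different source. This is a toy illustration: the city names and all numbers are made-up placeholders, and the pairing of each dictionary with a real source is an assumption, apart from population coming from the USGS GNIS DB.

```python
# Made-up placeholder data; none of these values come from the talk.
CITY_POPULATION = {"CityA": 1_500_000, "CityB": 800_000, "CityC": 2_000_000}  # stand-in for USGS GNIS
AMBIENT_TEMP_C = {"CityA": 12.0, "CityB": 20.0, "CityC": -5.0}                # hypothetical weather source
PEOPLE_PER_HOSPITAL_BED = {"CityA": 120, "CityB": 150, "CityC": 130}          # hypothetical health source
HOST_ANIMALS_NEARBY = {"CityA": 250_000, "CityB": 400_000, "CityC": 300_000}  # hypothetical livestock source


def vulnerable_major_cities():
    """Return the cities satisfying every conjunct of the question."""
    return sorted(
        city
        for city, population in CITY_POPULATION.items()
        if population > 10**6                      # "major US city"
        and AMBIENT_TEMP_C[city] > 0               # above freezing
        and PEOPLE_PER_HOSPITAL_BED[city] > 100    # more than 100 people per hospital bed
        and HOST_ANIMALS_NEARBY[city] > 100_000    # host animals exceed 100k
    )


print(vulnerable_major_cities())  # ['CityA']
```

No single source holds the answer; it only appears once the four lookups are joined on the city.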

USGS GNIS DB: the Geographic Names Information System (GNIS) DB maintained by the US Geological Survey (USGS).

 state |       name        | type  |   county   | state_fips | primary_lat | primary_long | elevation | population |       status
-------+-------------------+-------+------------+------------+-------------+--------------+-----------+------------+--------------------
 TX    | Dallas            | ppl   | Dallas     |         48 |    32.78333 |        -96.8 |       463 |    1022830 | BGN 1978 1959
 MN    | Hennepin County   | civil | Hennepin   |         27 |    45.01667 |       -93.45 |         0 |    1032431 |
 CA    | Sacramento County | civil | Sacramento |          6 |    38.46667 |   -121.31667 |         0 |    1041219 |
 AZ    | Phoenix           | ppl   | Maricopa   |          4 |    33.44833 |   -112.07333 |      1072 |    1048949 | BGN 1931 1900 1897

So how do we explain to our system that:

• row 1 of that table is “about” the city of Dallas, TX

• the population field of that table contains the number of inhabitants of the city that that row is “about”

• here is exactly how to access tuples of that database

• that access will be fast, accurate, recent, complete

• the population field of that table contains the number of inhabitants of the city that that row is “about”

We provide the field encodings and decodings, some of which correspond to explicit fields like population, two-letter state codes, etc:

(fieldDecoding Usgs-Gnis-LS ?x
  (TheFieldCalled "population")
  (numberOfInhabitants (TheReferentOfTheRow Usgs-Gnis) ?x))
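The idea behind a field decoding, turning a raw column value into an assertion about whatever the row is “about”, can be sketched in Python. This is only an analogy and not the SKSI machinery: `FIELD_DECODINGS`, `referent_of_row` and the way a referent is named are all assumptions made for the example.

```python
# One row of the GNIS table shown earlier, as a plain dictionary.
ROW = {"state": "TX", "name": "Dallas", "type": "ppl", "population": 1022830}

# Hypothetical decoding table: database field name -> ontology predicate.
FIELD_DECODINGS = {"population": "numberOfInhabitants"}


def referent_of_row(row):
    """Assumption: a row is "about" the place identified by its name and state."""
    return f"{row['name']}-{row['state']}"


def decode_row(row):
    """Turn each mapped field into a (predicate, subject, value) assertion."""
    subject = referent_of_row(row)
    return [
        (predicate, subject, row[field])
        for field, predicate in FIELD_DECODINGS.items()
        if field in row
    ]


print(decode_row(ROW))  # [('numberOfInhabitants', 'Dallas-TX', 1022830)]
```

Fields without a decoding, such as `type` here, are simply left uninterpreted.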

• how to access tuples of that database

We provide all the information needed for a JDBC connection script. We assert, in the context (MappingMtFn Usgs-KS), all of these:

(passwordForSKS Usgs-KS "geografy")
(portNumberForSKS Usgs-KS 4032)
(serverOfSKS Usgs-KS "sksi.cyc.com")
(sqlProgramForSKS Usgs-KS PostgreSQL)
(structuredKnowledgeSourceName Usgs-KS "usgs")
(subProtocolForSKS Usgs-KS "postgresql")
(userNameForSKS Usgs-KS "sksi")
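As a rough sketch of how such assertions could be combined into a connection string, the snippet below assembles a PostgreSQL JDBC URL of the usual `jdbc:postgresql://host:port/database` form. Treating the structured knowledge source name as the database name is an assumption, and the dictionary is a hypothetical stand-in for looking the values up in the knowledge base.

```python
# Values copied from the assertions above; the dict itself is a hypothetical stand-in
# for querying the KB in the context (MappingMtFn Usgs-KS).
SKS_ASSERTIONS = {
    "serverOfSKS": "sksi.cyc.com",
    "portNumberForSKS": 4032,
    "subProtocolForSKS": "postgresql",
    "structuredKnowledgeSourceName": "usgs",  # assumed here to be the database name
    "userNameForSKS": "sksi",
}


def jdbc_url(sks):
    """Assemble a JDBC URL of the form jdbc:<subprotocol>://<host>:<port>/<database>."""
    return (
        f"jdbc:{sks['subProtocolForSKS']}://"
        f"{sks['serverOfSKS']}:{sks['portNumberForSKS']}/"
        f"{sks['structuredKnowledgeSourceName']}"
    )


print(jdbc_url(SKS_ASSERTIONS))  # jdbc:postgresql://sksi.cyc.com:4032/usgs
```

The user name and password would be passed to the driver separately rather than embedded in the URL.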

• that access will be fast, accurate, recent, complete

We provide meta-level assertions about the database, about each table of the database, about the completeness etc. of various kinds of data in the DB, etc.

We assert, in the context (MappingMtFn Usgs-KS):

(schemaCompleteExtentKnownForValueTypeInArg Usgs-Gnis-LS USCity numberOfInhabitants 1)

(resultSetCardinality Usgs-Gnis-PS
  (TheSet (PhysicalFieldFn Usgs-Gnis-PS "state"))
  TheEmptySet
  60.0)

(resultSetCardinality Usgs-Gnis-PS
  (TheSet
    (PhysicalFieldFn Usgs-Gnis-PS "primary_long")
    (PhysicalFieldFn Usgs-Gnis-PS "primary_lat")
    (PhysicalFieldFn Usgs-Gnis-PS "name"))
  (TheSet
    (PhysicalFieldFn Usgs-Gnis-PS "county")
    (PhysicalFieldFn Usgs-Gnis-PS "state"))
  530.36)

For the host-animal criterion (the number of anthrax host animals near ?C exceeds 100k):

Cyc knows that pullets are chickens, so don’t add those two numbers together!
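The double-counting problem can be sketched with a tiny subclass table. This is an illustration under stated assumptions: the counts are invented, the chicken figure is assumed to already include pullets, and `GENLS` is a hypothetical stand-in for Cyc's `genls` (subclass) knowledge.

```python
# Hypothetical subclass links: every Pullet is a Chicken.
GENLS = {"Pullet": "Chicken"}

# Invented counts as they might be reported by a livestock source.
COUNTS = {"Chicken": 90_000, "Pullet": 30_000, "Cattle": 20_000}


def ancestors(kind):
    """Yield every generalization of `kind` by following GENLS links upward."""
    while kind in GENLS:
        kind = GENLS[kind]
        yield kind


def total_without_double_counting(counts):
    """Sum the counts, skipping any kind whose generalization is already counted."""
    total = 0
    for kind, number in counts.items():
        if any(ancestor in counts for ancestor in ancestors(kind)):
            continue  # e.g. pullets are already included in the chicken count
        total += number
    return total


print(sum(COUNTS.values()))                   # naive sum: 140000
print(total_without_double_counting(COUNTS))  # 110000
```

With a 100k threshold, a naive sum inflated this way could push a city over the line when it should not be.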

A formal ontology: terms + axioms (in logic)

CYC: 300k terms, 3.2 million handcrafted axioms. Very general ones (“Upper Ontology”) all the way down to some domain-specific terms and axioms.

Three of the Current Applications of Cyc:

• Smarter searching (augment queries with OR and AND-NOT terms; suggest meaningful follow-up queries)

• Relatively deep Question-answering for analysts

• Semantic Knowledge Source Integration (SKSI): map external DBs, websites, ontologies,… to Cyc for it to call on

• 100’s more (that we know of) OpenCyc/ResearchCyc apps.

Research: Characterize our systems as agents, to interoperate
