Symbolic Supercomputer for Artificial Intelligence and Cognitive Science Research Kenneth D. Forbus Dedre Gentner Northwestern University

Symbolic Supercomputerfor

Artificial Intelligenceand

Cognitive ScienceResearch

Kenneth D. Forbus

Dedre Gentner

Northwestern University

Overview

• Why symbolic supercomputing?• Off-line experiments

– Work in progress: Large-scale corpus analysis

– Distributed experiments harness

• Interactive Cognitive Architecture experiments– Companion Cognitive Systems (DARPA)

– Explanation Agent

Off-line experiments

• Sensitivity Analysis– Every cognitive simulation has parameters

• Analyzing how performance depends on parameters important for understanding models

– Sensitivity analyses can be expensive• 1994 MAC/FAC simulations took weeks of CPU time

• 2000: 4.8 million SME runs in SEQL sensitivity analyses took 23 days (400 mhz PII), should be 4 days today.

• Corpora Analyses– Text

– Sketches

– Problems

Larger-scale simulations

• Goal: Increased use of automatically generated inputs– Reduce tailorability

– Increase # of stimuli generated and used.

• Processes – Analogical Encoding

– Conceptual problem solving

Symbolic models and parallelism

• Our approach is based on Gentner’s (1983) structure-mapping theory– Assumes parallel processing both within modules and

between modules– Currently emulate on serial processors

• Coarse-grained parallelism could provide important benefits– Continue to simulate within-module parallelism on

single CPUs– Exploit parallel processing between modules

• Incrementally update retrievals during reasoning• Incrementally construct generalizations during reasoning• Reason about domain, interactions, and self in parallel

Traditional Supercomputers ineffective for symbolic processing

• Optimized for– Floating-point processing

– Pipelined, with vector or grid model okay CPUs, low RAM, fast floating point

• Symbolic processing– Involves many pointer operations

– Some floating-point, but over irregular structures (graphs, sparse-vectors)

fast CPUs, high RAM, okay floating point

Optimizing a cluster for symbolic processing

1. Use the fastest CPU available.

2. Distribute the processing in large, functionally-organized units. – Avoid communication overhead– Data-parallel programming style poor fit for

clusters– Replicate knowledge base as needed

3. Organize memory to be as fast as possible.– Maximize RAM, cache– Avoid virtual memory

Why large memories are crucial

• If a program is going to know a lot, it has to put it somewhere

• Example: Subset of Cyc KB contents we use– 35,070 concepts, 8939 relations, and 3,917 functions

– 1,283,835 axioms, divided into 3,537 microtheories

– Added knowledge (DARPA HPKB, CPOF, RKF)• Military tasks, units, equipment

• Countries, international relationships, terrorist incidents

• Qualitative models, terrain, trafficability, visual representation conventions, developed by our group

– Takes roughly 495 MB of storage, due to indexing overhead

• May double in size as we learn by accumulating experiences

Mk2

• Hardware: Linux Networx– 5 year maintenance

contract

• 67 nodes– Dual 3.2Ghz Xeon CPUs

– 3GB RAM/node

– 80GB disk/node

• Allegro Common Lisp for Linux– Provides flexible

development environment

Mk2 Cluster Network

Node 3

Node 2

Node 1

Node 67

Master Host

Cisco PIXFirewall/Router …

Backplane S

ubnet

Frontplane S

ubnet

PublicInternet

Gigabit switched EthernetPacket

filtering, trusted

whitelist of hosts

One-command provisioning, P2P data distribution

system

Qualitative reasoning for

intelligent agents

(ONR AI Program)

ObjectiveCreate science base for intelligent

software agents that can

• Reason about the physical phenomena and systems in a human-like way

• Extend their knowledge incrementally, by communicating with human collaborators in natural language.

Technical Approach• Develop qualitative reasoning techniques for

solving problems under time pressure with partial, incomplete knowledge (“back of the envelope” reasoning)

• Explore the use of qualitative representations as part of the semantics for a natural language system.

• Develop techniques to assimilate controlled-language reports to extend an agent’s models of the physical world.

Knowledge Base(general knowledge + libraries of cases)

ExplanationAgent

New examples

Queries

Situationupdates

Model of ongoingsituation/system

Estimates,warnings

QP Theory in Natural Language Semantics• Idea: Qualitative Process theory can be

used as a framework for understanding NL descriptions of physical phenomena.

– Right level of abstraction– Consistent with human mental models – Support for compositionality

• Approach– Identify syntactic patterns corresponding

to QP theory concepts via corpus analysis

– Recast QP theory in terms of frames– Use controlled subset of English to

simplify parsing, focus on semantics

• Current status– NL system translates paragraph sized

texts about physical processes into formal representations

– Tested on a dozen examples

• Next steps– Expand range of texts handled– Develop knowledge assimilation

techniques to construct knowledge bases by reading multiple texts

(1) A pipe connects cylinder c1 to cylinder c2.

(2) Cylinder C1 contains 5 liters of water.

(3) Cylinder C2 contains 2 liters of water.

(4) Water flows from cylinder C1 to cylinder C2, because the pressure in

cylinder C1 is greater than the pressure in cylinder C2.

(5) The higher the pressure in cylinder C1, the higher the flowrate of the water.

(6) When the pressure in cylinder C2 increases, the flowrate of the water decreases.

Type:

(isa flow3606 Translation-Flow)

Participants:

(isa c1 Container) [QuantityFrame q3609] (isa c2 Container) [QuantityFrame q3603]

Conditions:

(> (pressure c1) (pressure c2))

Quantities:

[QuantityFrames q3608 and q3605]

Consequences:

(qprop (flowrate flow3606) (pressure c1)) (qprop- (flowrate flow3606) (pressure c2))

(I- (water c1) (flowrate flow3606)) (I+ (water c2) (flowrate flow3606))

C1 C2

The EA natural language system

QP FramesQP Frames

QP Theory

constraints

QP Theory

constraints

Patterns for QP-specific

constituents

Patterns for QP-specific

constituents

Only 15 out of ~100grammar rules are

QP-specific

ProcessRules

ProcessRules

Process Frame Construction

Process Frame Construction

KBKB

QRG-CEgrammarQRG-CEgrammar LexiconLexicon

FactsFacts

WSDDataWSDData

ParserParser Word-SenseDisambiguation

Word-SenseDisambiguation

FrameRulesFrameRules

MergeRulesMergeRules

Frame Construction

Frame Construction

Input text

Input text

Retrieval of semantic

information

Retrieval of semantic

information

1.2 million

fact subset of

Cyc

SvenKuehne’s

Ph.Dthesis

Corpus Analysis (in progress)

• Kuehne and Forbus (2002) used by-hand corpus analysis to identify syntactic patterns

– Four chapters of an introductory science book, 216 sentences total

– 43% of the material in physical explanatory text could be captured via QP theory.

• Do the syntactic patterns that we found for explanatory physical texts apply to everyday texts?

– If they do, what is their coverage?

– How many more patterns are there?

Looking for quantities

• 1999 volume of the New York Times, consisting of 6.4 million sentences • First stage used 30 word list for filtering (7.5 hours)

– ~172,000 sentences output

• Second stage used regular expressions (12 hours)– Derived from vocabulary and syntactic patterns from previous corpus analysis. – Result: ~19,000 sentences worth examining more closely

• Third stage uses modified version of our Explanation Agent NLU system (less than 2 days, 17 hours, on 3 nodes)– Previously, used Quantity and PhysicalQuantity– Generalized to the Cyc concept ScalarInterval,

• Subsumes temperament, monetary values, feeling attributes, formality/politeness of speech, plus others.

• 14,000+ quantities found.– 0.2% of the sentences mention a recognizable quantity– Lexicon limitations may have a strong effect here

• Expanding it via hand-labor (Cycorp) plus co-training is probably necessary• e.g., “intensification of the war effort”

Qualitative changes in the New York Times

• Starting point: Corpus of 6.4 million sentences • Filter using word list of 89 synonyms for

increases, 66 for decreases (~10 hours each)– 62,117 candidate sentences mentioning decreases – 195,452 candidate sentences mentioning increases – Around 4% of corpus– Contrast: 43% of the material in physical explanatory

text could be captured via QP theory.• Larger analysis only concerns qualitative proportionalities • Qualitative representations may play a smaller role in

understanding political texts versus physical texts.• Genre differences: newspapers versus explanatory material

– E.g., “(X i.e., Y)” common on web, not in newspapers

Dexp: Distributed Experiment Tool

• Provides support for running distributed experiments– Written in Common Lisp

– Uses sandbox to avoid configuration issues

• Experimenter divides computation into work units– Example: For N queries, find all of the solutions to

them

– Provides list of work units to dexp as a file, along with a startup file and code tree to use

– Gets back a set of files containing the results.

dexp Architecture

• Experiment Coordinator– Manage distribution and

execution of work units– Collect results

• Experiment pool nodes– Executes a work unit, returns

results.– Execution uses sandbox for

configuration control

• Load Balancer– Dynamically allocate nodes

for work units– Will balances demands from

multiple simultaneous experiments

Loadbalancer (*)

distributed experiment pool

n31n33

n34n15

n65n66

Coordinator

How dexp simplifies experiments: Example

• A experiment analyzing semantic translations in ResearchCyc KB consisted of ~1200 work units– Each consisted of a query to see how many examples in

the KB satisfied the semantic patterns given for verbs

• With 24 nodes, most of the experiment was completed in 34 minutes– Estimate: 11 hours on a single CPU, if no failures

• Five work units churned for 12 hours, failed to finish due to heap blow-out– Most of the results were available quickly– Much easier to diagnose what was going wrong, instead

of waiting for hours to hit a failure.

Companion Cognitive Systems

A new cognitive systems architecture

• Robust reasoning and learning– Companions will learn about

their domains, their users, and themselves.

• Longevity– Companions will operate

continuously over weeks and months at a time.

• Interactivity – Companions will be capable of

high bandwidth interaction with their human partners. This includes taking advice.

– Sketching is a majorinteraction modality

Central hypotheses• Analogical processing will

enable us to create systems with human-like learning and reasoning abilities– Able to handle relational

information– Able to incrementally adapt

and extend their knowledge– Able to apply what they learn

in one domain to other domains

• Using a cluster can make an analogical processing architecture fast enough to be used in interactive systems– Changes the kinds of

experiments that become feasible as well.

Colossus(DARPA, 5 nodes)

Mk2(ONR, 67 nodes)

Companions as Structure-Mapping Architecture

Psychological Bets

• Ubiquitous use of structure-mapping for reasoning and learning– SME for matching

– MAC/FAC for similarity-based retrieval

– SEQL for generalization

• Qualitative representations play central role– Part of visual structure in

spatial reasoning

– Representation of causal knowledge and arguments

Engineering Choices

• Distributed agent architecture using KQML

• Logic-based TMS for working memory

• No hardwired working-memory capacity limits

Companion Architecture Year One

FacilitatorSessionReasoner

MAC/FACDomainTickler

nuSketch System(sKEA or nuSketch

Battlespace)

RelationalConcept

Map

SessionManager

Cluster

User’s Windows box

Master nodeNode Node

w/Thomas Hinrichs, Jeff Usher, Matt Klenk, Greg

Dunham, Emmett

Tomai, Tom Ouyang,

Hyeonkyeong Kim, and

Brian Kyckelhahn

Bennett Mechanical Comprehension Test• Widely used standardized exam

for technicians • Used in cognitive psychology

as indicator of spatial ability • Difficulty lies in breadth of

situations, not narrow technical knowledge

• Best score to date: 10 correct out of a subset of 13 BMCT problems (77%). [P < 0.001]

Q: Which crane is more stable?

Example describes how

physical principles apply to a real-world situation

Analogies with

example provides causal models

needed for solution

Analogical inferences

are surmises,

not certainties

Suggesting visual/conceptual relations by analogy

MAC/FAC

Knowledge Base(including case libraries of examples)

CandidateInferenceExtraction

SuggestionsFiltering

109 candidat

es

184 candidat

es

189 candidat

es

109 candidat

es

Visual/Conceptual Relations: Experimental Results

• Ex1: Focused Tasking– 54 sketches (18 situations

drawn by three KEs) as case library for BMCT experiment

• Round Robin method: For each sketch, remove from library, remove its VCR answers, generate suggestions via analogy– Yielded “exam” of 181

VCR questions– Score = 74.25 (P << 10-5)– Coverage = 54%– Accuracy = 87%

• Ex2: Open tasking– 10 situations selected from

BMCT problems, covering larger range of phenomena (e.g., “a boat moving in water”, “a bicycle”)

– Each situation sketched by two graduate students, told to illustrate the principle(s) you think are important.

• Round Robin method– Yielded “exam” of 138

questions– Score = 21.75 (P < 10-7)– Coverage = 46%– Accuracy = 57%

Facilitator

Executive

nuSketch GUI RelationalConcept

Map

SessionManager

Cluster

User’sWindows box

MAC/FACDomain Model

Tickler

SessionReasoner

DialogueManager

HeadlessnuSketch

SEQLDomain

Generalizer

MAC/FACSelf Model

Tickler

SEQLSelf ModelGeneralizer

MAC/FACUser Model

Tickler

SEQLUser ModelGeneralizer

InteractiveExplanation

Interface

OfflineLearningOffline

LearningOfflineLearning

CompanionsArchitecture as of 9/05

Explanation Agent Prototype

• Use Companions Architecture as infrastructure• Incorporate other ONR advances

– EA NLU system (Sven Kuehne)

– Back of the envelope reasoning (Praveen Paritosh)

– Spatial prepositions model to link language and sketches (Kate Lockwood)

– Analogical Problem Solver (Tom Ouyang)

• Use for cognitive simulations – Natural language, sketching for stimulus input

Back of the Envelope Reasoning (Paritosh)

• Qualitative representations essential for framing the problems, supporting comparisons

• Analogical reasoning used to find similar situations for estimation models, construct qualitative representations via generalization over experience

How much

oxygen is left?

How longto repair

it?

Is anyone still alive in there?

Goal: Develop theories that enable software to reason quantitativelyin real-world situations

Estimate parameter directly

Create estimation model

Find modeling strategy

Find values forparameters

in model

Use known valueif available

Estimate basedon similar situation

Feel for numbers

Problem solving

http://www.time.com/time/daily/special/photo/sub/3.html

Back of the Envelope Reasoning Progress

• Implemented BoTE-Solver– Solves 13 problems to date

• Examples• How many K-8 school

teachers are in the USA?

• How much money is spent on newspapers in USA per year?

• What is the total annual gasoline consumption by cars in US?

• What is the annual cost of healthcare in USA?

• How much power can an adult human generate?

• Claim: There is a core collection of strategic knowledge, specifically, seven strategies that capture most of back of the envelope reasoning.

• Source: – Strategies in Bote-Solver

– Analysis of all problems (n=44) from Force and Pressure, Rotation and Mechanics, Heat and Astronomy from Clifford Swartz’s Back-of-the-Envelope Physics.

C1

CARVE: Using analogy to generate qualitative representations

Dimensional partitioning for each quantity(k-means clustering)

(isa Algeria (HighValueContextualizedFn Area AfricanCountries)..

Add these facts to original cases

Structural clustering using SEQL

C1

Input cases

Cj

Quantity 1

L2L1

S3S1

S2

Cases + structural limit points and distributional partitions

Analogical Estimation

• Analogical estimator: makes guesses for a numeric parameter based on analogy.

(GrossDomesticProduct Brazil ?x)– The value is known.

– Find an analogous case for which value is known.

– Find anything in the KB which might be a basis for an estimate.

• Hypothesis: Representations augmented with symbolic representation will lead to more accurate estimates.

Basketball Stats Domain

• Quantities (e.g., points per game, rebounds per game, assists per game, etc.)

• Causal relationships– Being taller helps being able to rebound and block– Power forwards are taller and are expected to shoot,

rebound and block– Being good at getting 3 point field goals means one is a

good shooter, so their free throw success rates will be higher.

• Case library– 15 players from different positions on field– 11 facts per player

(seasonThreePointsPercent JasonKidd 0.404) (qprop seasonThreePointPercent seasonFreeThrowPercent BasketballPlayers)

Results: Errors

0

10

20

30

40

50

60

70

80

Height

Assist

s

Free

thro

ws

Point

s

Reboun

ds

Three

point % All

Enriched mean % error

Raw mean % error

SpaceCase: Motivation

• Recent research points to role of non-geometric properties in spatial preposition use – Coventry 1994; Coventry &

Prat-Sala, 1999; Herskovitz, 1986; Feist & Gentner, 2003; Garrod et al., 1999; Coventry & Garrod, 2004; Carlson & van der Zee, 2005

• Spatial language can affect retrieval of pictures – Feist and Gentner, 2001

• Multimodal interfaces potentially useful for military needs– Language plus diagrams,

other spatial displays

• Software’s notion of similarity needs to be like their human partners– Including visual properties

– Including retrieval, for shared history

– Including shared language

Lockwood, K., Forbus, K., and Usher, J. SpaceCase: a Model of Spatial Preposition Use

Proceedings of CogSci-05, to appear

sKEA Sketching Interface

sKEA Sketching Interface

medium_curvatureground_supports_

figure

dish

firefly

firefly -> insect -> animate

functions as weak container

Sketch corpus crucial for model development

• Building a corpus of sketches– Gathering library of examples from literature

– Use sKEA to capture them in machine-understandable form

– Estimate: ~ 200 sketches will be needed to cover the set of prepositions and phenomena to be modeled

• Cluster will be used for – Regression testing

– Sensitivity analyses: How does performance depend on parameter values?

Problem-solving experiments

• Starting point: Pisan’s (1998) Thermodynamics Problem Solver– Solved 80% of the problems typically found in first four chapters

in engineering thermodynamics textbooks

– Used graphs and property tables

– Produced human-like solutions

• Generalize: Analogical Problem Solver– Focus on conceptual comprehension questions

– Declarative strategies now include analogical processing• when/what to retrieve, what candidate inferences to use, level of

effort in testing

– Experiment in progress: Can strategy variations explain novice/expert differences?

• Pilot results promising, should have full data by end of summer.

Questions?

Technology Transfer

The Whodunit Problem

• Goal: Generate plausible hypotheses about who performed an event.

• Formal version: Given some event E whose perpetrator is unknown, construct a small set of hypotheses {Hp} about the identity of the perpetrator of E. – Include explanations as to

why these are the likely ones

– Able to explain on demand why others are less likely.

Assumptions & Limitations

• Formal inputs. Structured descriptions, including relational information, expressed in CycL.

• Accurate inputs. • One-shot operation. No

incremental updates.• Passive operation.

Doesn’t generate differential diagnosis information

Method 1: Closest Exemplar

1. Use MAC/FAC to retrieve events similar to E.

2. For each similar event, remove it if it doesn't include a candidate inference about the perpetrator.

3. Iterate until enough hypotheses are generated.

4. (Optional) Generate explanations and expectations by analyzing the similarities and differences between each Hp and E.

Probe

SME

SME

SME

CVmatch

CVmatch

CVmatch

CVmatch

Memory pool

Output =memory

item+ SME results

Cheap, fast, non-structural

MAC/FAC models similarity based retrieval

• Scales to large memories• Accounts for psychological phenomena

• Memory pool = All cases concerning the 98 perpetrators, minus the test set.

Method 2: Closest Generalization

• Preprocessing:1. Partition case library

according to perpetrator.2. Use SEQL to construct

generalizations for each perpetrator.

• Generating hypotheses:

1. Given an incident E, pick the n closest generalizations, as determined by SME's structural evaluation score.

Exemplars…

GeneralizationsSME

NewExample

SEQL

SEQL models generalization• Assimilate new exemplars into a

generalization when close enough.• Models psychological data, used to

made successful predictions of human behavior.

• Recent extension: use probability to improve noise immunity

Whodunit Experiment

• Used 3,379 terrorist incidents from Cycorp’s Terrorist knowledge base– Between 6 and 158

propositions per case, 20 on average

• 98 perpetrators involved in at least 3 incidents in the TKB– Pick one incident at

random for test set, remove perpetrator

• Elaborate via inference– Add attributes (e.g., (CityInCountryFn Italy)) using genls hierarchy

• Three performance levels:– Best bet– Top 3: Best plus

plausible alternatives– Top Ten list: Foci for

additional collection, analysis

Whodunit Example

Whodunit Results

Correctness

0%

10%

20%

30%

40%

50%

60%

MAC/FAC SEQL SEQL+P

Top-10Top-3Correct

Adding probability yielded 5% improveme

nt

Pure retrieval

surprisingly good

Symbolic generalization adds valve for weaker criteria

Background Material

Basketball Stats Estimation by Analogy

Given: An estimation problem (seasonThreePointsPercent JasonKidd ?x) and a case library

Find the most similar player to JasonKidd in the case library for whom we know the value for seasonThreePointsPercent.

Use that as an estimate for the given problem.

Compare accuracy over the initial case library, and the case library enriched with representations from CARVE.

SpaceCase

KB

sKEAinput stimulus

inkprocessing

routines

Evidence Rules

Bayesianupdatingalgorithm

Spatial Preposition Label

Performance

• Labeling task (Feist & Gentner, 2003)– <figure> is in/on the

<ground>

• 36 total stimuli– {firefly, coin}

– {bowl, dish, plate, slab, rock, hand}

– {low, medium, high}

• Consistent on all 36 trials for values of parameters given

Modeling a spatial language/memory interaction

• Feist and Gentner (2001)• Use spatial preposition when

showing someone a situation

• Given novel stimulus, they are more likely to claim they have seen it before

• Use SpaceCase to confirm unsuitability of original stimuli for ON

• Retrieval via MAC/FAC– Initial sketch plus variants

stored as memory– Initial as probe retrieves

itself– Initial plus relation for

spatial preposition retrieves plus variant

“On”

initial sketch 0.363

plus variant 0.859

minus variant 0.2428

SpaceCase next steps

• Expand model – more prepositions

– more complex input

• Cross-linguistic modeling

auf

an

Documents

Symbolic Supercomputer for Artificial Intelligence and Cognitive Science Research Kenneth D. Forbus Dedre Gentner Northwestern University