From Guise Theory to Contextual Vocabulary Acquisition
William J. Rapaport
Department of Computer Science & Engineering, Department of Philosophy, Department of Linguistics, and Center for Cognitive Science
[email protected]


Page 1: From Guise Theory to Contextual Vocabulary Acquisition

From Guise Theory to Contextual Vocabulary Acquisition

William J. Rapaport

Department of Computer Science & Engineering, Department of Philosophy, Department of Linguistics,

and Center for Cognitive Science

[email protected]
http://www.cse.buffalo.edu/~rapaport

Page 2: From Guise Theory to Contextual Vocabulary Acquisition

Hector Anecdote #1

• Hector’s spirit:

– The Dipert-Rapaport Story

Page 3: From Guise Theory to Contextual Vocabulary Acquisition

Castañeda’s Guise Theory

• 2 goals:
  – Theory of the mechanism of reference
  – Theory of the world as it appears to us

• 2 principal sources:
  – Frege’s paradox of reference:
    • President of the US “=” Commander-in-Chief of Armed Forces
  – “Univocality” of language:
    • talk & thought about truth & reality = talk & thought about falsehood & fiction
      – treat objects of both kinds of talk/thought in the same way
      – Metaphysical Internalism:
        » talk/thought about world is internal to experience
        » (experienced) world must be understood from inside/1st-p. POV, not externally (God’s POV, Nagel’s View from Nowhere)
        » useful in AI & cognitive modeling

Page 4: From Guise Theory to Contextual Vocabulary Acquisition

Rapaport’s Meinongian Theory

• An ontology for epistemology
  – HNC: “phenomenological ontology”
  – ontology of the first-person, mental objects of thought

• Theory of univocal reference:
  – “How to Make the World Fit Our Language” (GPS, 1981)
    • formal languages: total semantic interpretation function
    • natural languages: partial semantic interpretation function
      – to give semantics for non-referring NPs:
        » either reform the syntax (Russell, Quine)
        » or expand the semantics, to turn the partial fn into a total fn
          * using Meinongian objects
          * use as semantics for SNePS semantic network KRRA system
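To make the partial-vs.-total contrast concrete, here is a minimal Python sketch (an illustration only, not Rapaport’s or SNePS’s actual formalism; the dictionary, function name, and example NPs are all invented): the interpretation function is defined on referring NPs and is made total by sending every other NP to a Meinongian object built from the properties it attributes.

```python
# A minimal sketch, assuming invented names: a partial interpretation
# function for NPs is "totalized" by mapping non-referring NPs to
# Meinongian objects (here, just structures of attributed properties).

EXISTING_REFERENTS = {"the president of the US": "object-42"}  # hypothetical domain

def interpret(np: str, attributed_properties: tuple) -> object:
    """Total semantic interpretation: real referent if there is one,
    otherwise a Meinongian object built from the NP's properties."""
    if np in EXISTING_REFERENTS:
        return EXISTING_REFERENTS[np]
    return ("meinongian-object", attributed_properties)

print(interpret("the president of the US", ("president",)))
print(interpret("the golden mountain", ("golden", "mountain")))
```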

Page 5: From Guise Theory to Contextual Vocabulary Acquisition

Extending the Theory

• Meinongian objects as structures of properties:
  – [[‘bachelor’]] = <Male, Marriage-eligible>

• … as structures of predicates, contexts, “co-texts”:
  – [[‘bachelor’]] = <‘…is male’, ‘…is marriage-eligible’>

• cf. LSA: “meaning as decontextualized summary of all experiences with a word”

• More generally:
  – [[w]]_{S,t} = < C(w) : C(w) is a co-text of w heard by S before t >
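A toy rendering of the formula above, assuming only what the slide states (the Reader class and its methods are invented for illustration): a reader S accumulates time-stamped co-texts, and [[w]]_{S,t} is the collection of co-texts of w heard before t.

```python
from dataclasses import dataclass, field

@dataclass
class Reader:
    """Toy model of a speaker/reader S accumulating time-stamped co-texts."""
    cotexts: dict = field(default_factory=dict)

    def hear(self, word: str, cotext: str, t: int) -> None:
        self.cotexts.setdefault(word, []).append((t, cotext))

    def meaning(self, word: str, t: int) -> list:
        """[[word]]_{S,t}: the co-texts of `word` heard by S before time t."""
        return [c for (when, c) in self.cotexts.get(word, []) if when < t]

s = Reader()
s.hear("brachet", "a white brachet is next to the hart", t=1)
s.hear("brachet", "the brachet bites the hart", t=2)
print(s.meaning("brachet", t=3))
```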

Page 6: From Guise Theory to Contextual Vocabulary Acquisition

Computational Contextual Vocabulary Acquisition

• Q: How to compute [[w]]_{S,t}

• A: consider the semantic network N_{S,t} representing the propositions that S believes at t
  – describe N_{S,t} from w’s point of view
    = figure out (compute) S’s meaning at t for unknown word w from context
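A rough sketch of “describing N_{S,t} from w’s point of view”, with a plain list of triples standing in for the SNePS network (this illustrates the idea only; it is not the SNePS implementation): collect everything asserted about w within a few links of its node.

```python
from collections import deque

# A plain triple store standing in for the belief network N_{S,t}.
EDGES = [
    ("brachet", "property", "white"),
    ("brachet", "next-to", "hart"),
    ("hart", "isa", "deer"),
]

def describe(word: str, edges, depth: int = 2) -> list:
    """Breadth-first walk from `word`'s node: collect everything said
    about it (and about its neighbors) within `depth` links."""
    seen, frontier, facts = {word}, deque([(word, 0)]), []
    while frontier:
        node, d = frontier.popleft()
        if d == depth:
            continue
        for subj, rel, obj in edges:
            if node in (subj, obj):
                if (subj, rel, obj) not in facts:
                    facts.append((subj, rel, obj))
                neighbor = obj if subj == node else subj
                if neighbor not in seen:
                    seen.add(neighbor)
                    frontier.append((neighbor, d + 1))
    return facts

print(describe("brachet", EDGES))
```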

Page 7: From Guise Theory to Contextual Vocabulary Acquisition

Contextual Vocabulary Acquisition

• Active, conscious acquisition of a meaning for a word, as it occurs in a text, by reasoning from “context”

• CVA = what you do when:
  – You’re reading
  – You come to an unfamiliar word
  – It’s important for understanding the passage
  – No one’s around to ask
  – Dictionary doesn’t help:
    • No dictionary
    • Too lazy to look it up :-)
    • Word not in dictionary
    • Definition of no use:
      – Too hard (& dictionary definitions are just more contexts!)
      – Not relevant to the context

• So, you “figure out” a meaning for the word “from context”
  – “figure out” = infer (compute) a hypothesis about what the word might mean in that text
  – “context” = ??

Page 8: From Guise Theory to Contextual Vocabulary Acquisition

What Does ‘Brachet’ Mean?
(From Malory’s Morte D’Arthur [page # in brackets])

1. There came a white hart running into the hall with a white brachet next to him, and thirty couples of black hounds came running after them. [66]

2. As the hart went by the sideboard, the white brachet bit him. [66]

3. The knight arose, took up the brachet and rode away with the brachet. [66]

4. A lady came in and cried aloud to King Arthur,“Sire, the brachet is mine”. [66]

5. There was the white brachet which bayed at him fast. [72]

18. The hart lay dead; a brachet was biting on his throat, and other hounds came behind. [86]

Page 9: From Guise Theory to Contextual Vocabulary Acquisition

What Is the “Context” for CVA?

• “context” ≠ textual context
  – surrounding words; “co-text” of word

• “context” = wide context =
  – “internalized” co-text …
    • ≈ reader’s interpretive mental model of textual “co-text”
  – … “integrated” with reader’s prior knowledge …
    • “world” knowledge
    • language knowledge
    • previous hypotheses about word’s meaning
    • but not including external sources (dictionary, humans)
  – … via belief revision
    • infer new beliefs from internalized co-text + prior knowledge
    • remove inconsistent beliefs

“Context” for CVA is in reader’s mind, not in the text

Pages 10–17: From Guise Theory to Contextual Vocabulary Acquisition

[Figure, built up incrementally across these slides: the reader’s prior-knowledge beliefs PK1–PK4 form a knowledge base; text sentences T1, T2, T3 are internalized into it as I(T1), I(T2), I(T3); inference over the resulting belief-revised (“B-R”) integrated KB yields new propositions P5, P6, P7. The final slide labels this B-R integrated KB “the reader’s mind”.]

Note: All “contextual” reasoning is done in this “context”: the belief-revised, integrated KB in the reader’s mind.

Page 18: From Guise Theory to Contextual Vocabulary Acquisition

Overview of CVA Project

• Background:
  – People do “incidental” CVA
    • Possibly best explanation of how we learn vocabulary
      – Given # of words a high-school grad knows (~45K) & # of years to learn them (~18) = ~2.5K words/year
      – But only taught ~10% in 12 school years
  – Students are taught “deliberate” CVA in order to improve their vocabulary

• CVA project: From Algorithm to Curriculum
  1. Implemented computational theory of how to figure out (compute) a meaning for an unfamiliar word from “wide context”
  2. Convert algorithms to an improved, teachable curriculum

Page 19: From Guise Theory to Contextual Vocabulary Acquisition

Computational CVA

• Implemented in SNePS (Shapiro 1979; Shapiro & Rapaport 1992)
  – Intensional, propositional, semantic-network, knowledge-representation, reasoning, & acting system
    • intensional: can represent fictional entities
    • propositional: can represent propositions from text & propositions about propositions
    • reasoning system: can make “relevant” inferences & revise beliefs
    • syntax = graph
    • semantics = Meinongian objects
    • indexed by node:
      – From any node, can describe rest of network
  – Serves as model of the reader (“Cassie”)
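For readers unfamiliar with term-based networks, a tiny illustrative sketch of the two features just listed, propositions as nodes and intensional (non-existence-committing) terms; the class names are invented and this is not SNePS’s actual data structure.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Term:
    """An entity-denoting node; nothing here asserts that it exists,
    so fictional entities are representable (intensionality)."""
    label: str

@dataclass(frozen=True)
class Prop:
    """A proposition is itself a node, so propositions about
    propositions are representable too."""
    relation: str
    args: tuple

b12, hart = Term("B12"), Term("hart")
p1 = Prop("isa", (b12, hart))        # B12 is a hart
p2 = Prop("in-story", (p1,))         # In the story, B12 is a hart
print(p2)
```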

Page 20: From Guise Theory to Contextual Vocabulary Acquisition

Computational CVA

• KB: SNePS representation of reader’s prior knowledge

• I/P: SNePS representation of word in its co-text

• Processing (“simulates”/“models”/is?! reading):
  – Uses logical inference, generalized inheritance, belief revision to reason about text integrated with reader’s prior knowledge

– N & V definition algorithms deductively search this “belief-revised, integrated” KB (the context) for slot fillers for definition frame…

• O/P: Definition frame
  – slots (features): classes, structure, actions, properties, etc.
  – fillers (values): info gleaned from context (= integrated KB)
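A minimal sketch of the output definition frame, with slot names chosen to match the sample output on the following slides (the class itself is invented for illustration, not the implemented SNePS structure):

```python
from dataclasses import dataclass, field

@dataclass
class DefinitionFrame:
    """Slots mirror the sample output on the following slides;
    fillers are gleaned from the belief-revised, integrated KB."""
    class_inclusions: set = field(default_factory=set)
    possible_actions: set = field(default_factory=set)
    possible_properties: set = field(default_factory=set)
    possibly_similar_items: set = field(default_factory=set)

frame = DefinitionFrame()
frame.class_inclusions.add("phys obj")
frame.possible_properties.add("white")
print(frame)
```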

Page 21: From Guise Theory to Contextual Vocabulary Acquisition

Cassie learns what “brachet” means:
Background info about: harts, animals, King Arthur, etc.
No info about: brachets
Input: formal-language (SNePS) version of simplified English

A hart runs into King Arthur’s hall.
  • In the story, B12 is a hart.
  • In the story, B13 is a hall.
  • In the story, B13 is King Arthur’s.
  • In the story, B12 runs into B13.

A white brachet is next to the hart.
  • In the story, B14 is a brachet.
  • In the story, B14 has the property “white”.
  • Therefore, brachets are physical objects.
    (deduced while reading; PK: Cassie believes that only physical objects have color)
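The deduction noted on the last line can be pictured as one forward-chaining step over a toy triple store (an illustration of the inference, not the SNePS rule syntax; the triples and relation names are invented):

```python
# Story facts read so far, as (subject, relation, value) triples.
kb = {("B14", "isa", "brachet"), ("B14", "property", "white")}
COLORS = {"white", "black", "red"}

# PK rule (as on the slide): only physical objects have color.
kb |= {(subj, "isa", "phys obj")
       for (subj, rel, val) in kb if rel == "property" and val in COLORS}

# B14 is both a brachet and (now) a physical object, so hypothesize
# that brachets in general are physical objects.
if {("B14", "isa", "phys obj"), ("B14", "isa", "brachet")} <= kb:
    kb.add(("brachet", "subclass", "phys obj"))

print(sorted(kb))
```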

Page 22: From Guise Theory to Contextual Vocabulary Acquisition

--> (defineNoun "brachet")

Definition of brachet:

Class Inclusions: phys obj,

Possible Properties: white,

Possibly Similar Items:

animal, mammal, deer,

horse, pony, dog,

I.e., a brachet is a physical object that can be white and that might be like an animal, mammal, deer, horse, pony, or dog

Page 23: From Guise Theory to Contextual Vocabulary Acquisition

A hart runs into King Arthur’s hall.
A white brachet is next to the hart.
The brachet bites the hart’s buttock.

[PK: Only animals bite]

--> (defineNoun "brachet")
Definition of brachet:
  Class Inclusions: animal,
  Possible Actions: bite buttock,
  Possible Properties: white,
  Possibly Similar Items: mammal, pony,

Page 24: From Guise Theory to Contextual Vocabulary Acquisition

A hart runs into King Arthur’s hall.
A white brachet is next to the hart.
The brachet bites the hart’s buttock.
The knight picks up the brachet.
The knight carries the brachet.

[PK: Only small things can be picked up/carried]

--> (defineNoun "brachet")
Definition of brachet:
  Class Inclusions: animal,
  Possible Actions: bite buttock,
  Possible Properties: small, white,
  Possibly Similar Items: mammal, pony,

Page 25: From Guise Theory to Contextual Vocabulary Acquisition

A hart runs into King Arthur’s hall.
A white brachet is next to the hart.
The brachet bites the hart’s buttock.
The knight picks up the brachet.
The knight carries the brachet.
The lady says that she wants the brachet.

[PK: Only valuable things are wanted]

--> (defineNoun "brachet")
Definition of brachet:
  Class Inclusions: animal,
  Possible Actions: bite buttock,
  Possible Properties: valuable, small, white,
  Possibly Similar Items: mammal, pony,

Page 26: From Guise Theory to Contextual Vocabulary Acquisition

A hart runs into King Arthur’s hall.
A white brachet is next to the hart.
The brachet bites the hart’s buttock.
The knight picks up the brachet.
The knight carries the brachet.
The lady says that she wants the brachet.
The brachet bays at Sir Tor.

[PK: Only hunting dogs bay]

--> (defineNoun "brachet")
Definition of brachet:
  Class Inclusions: hound, dog,
  Possible Actions: bite buttock, bay, hunt,
  Possible Properties: valuable, small, white,

I.e., a brachet is a hound (a kind of dog) that can bite, bay, and hunt, and that may be valuable, small, and white.

Page 27: From Guise Theory to Contextual Vocabulary Acquisition

General Comments

• Cassie’s behavior ≈ human protocols

• Cassie’s definition ≈ OED’s definition: A brachet is “a kind of hound which hunts by scent”

Page 28: From Guise Theory to Contextual Vocabulary Acquisition

Noun Algorithm

• Generate initial hypothesis by “syntactic manipulation”
  – Algebra: Solve an equation for unknown value X
  – Syntax: “Solve” a sentence for unknown word X
  – “A white brachet (X) is next to the hart”
    → X (a brachet) is something that is next to the hart and that can be white
  I.e., “define” node X in terms of immediately connected nodes

• Then find or infer from wide context:
  – Basic-level class memberships (e.g., “dog”, rather than “animal”)
    • else most-specific-level class memberships
    • else names of individuals
  – Properties of Xs (else, of individual Xs) (e.g., size, color, …)
  – Structure of Xs (else …) (part-whole, physical structure, …)
  – Acts that Xs perform (else …) or that can be done to/with Xs
  – Agents that do things to/with Xs
    … or to whom things can be done with Xs
    … or that own Xs
  – Possible synonyms, antonyms
  I.e., “define” word X in terms of some (but not all) other connected nodes
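A compressed sketch of the slot-filling search described above, over a toy triple-store KB (illustrative only; the relation names and kb format are invented, and the real algorithm also handles the “else” fallbacks and the synonym/antonym slots):

```python
def define_noun(word: str, kb) -> dict:
    """Collect definition-frame fillers for `word` from a toy
    integrated KB of (subject, relation, value) triples."""
    frame = {"class inclusions": set(), "properties": set(),
             "actions": set(), "agents acting on it": set()}
    for subj, rel, val in kb:
        if subj == word and rel == "subclass":
            frame["class inclusions"].add(val)
        elif subj == word and rel == "property":
            frame["properties"].add(val)
        elif subj == word and rel == "agent-of":
            frame["actions"].add(val)
        elif rel == "acts-on" and val == word:
            frame["agents acting on it"].add(subj)
    return frame

kb = [("brachet", "subclass", "hound"), ("brachet", "property", "white"),
      ("brachet", "agent-of", "bay"), ("knight", "acts-on", "brachet")]
print(define_noun("brachet", kb))
```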

Page 29: From Guise Theory to Contextual Vocabulary Acquisition

Verb Algorithm

• Generate initial hypothesis by syntactic/algebraic manipulation

• Then find or infer from wide context:
  – Class membership (e.g., Conceptual Dependency)
    • What kind of act is X-ing (e.g., walking is a kind of moving)
    • What kinds of acts are X-ings (e.g., sauntering is a kind of walking)
  – Properties/manners of X-ing (e.g., moving by foot, slow walking)
  – Transitivity/subcategorization information
    • Return class membership of agent, object, indirect object, instrument
  – Possible synonyms, antonyms
  – Causes & effects

• [Also: preliminary work on adjective algorithm]

Page 30: From Guise Theory to Contextual Vocabulary Acquisition

Belief Revision

• To revise definitions of words used inconsistently with current meaning hypothesis

• SNeBR (ATMS; Martins & Shapiro 1988, Johnson 2006, Fogel 2011):
  – If inference leads to a contradiction, then:
    1. SNeBR (asks user to) remove culprit(s)
    2. uses Relevance Logic to automatically remove consequences inferred from culprit
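A toy sketch of the two steps above: each derived belief carries its origin set of hypotheses, the culprit is supplied by hand (standing in for the user interaction), and everything inferred from it is dropped. This is an illustration in the ATMS spirit, not the actual SNeBR/Relevance Logic machinery.

```python
# Each derived belief records its "origin set": the hypotheses it was
# inferred from (roughly in the spirit of an ATMS).
beliefs = {
    "the brachet is a plant":       {"hyp: only plants are white"},   # from the culprit
    "the brachet photosynthesizes": {"hyp: only plants are white"},
    "the brachet is an animal":     {"hyp: only animals bite"},
}

def retract(culprit: str, beliefs: dict) -> dict:
    """Step 1: a culprit hypothesis has been chosen (here, passed in).
    Step 2: drop every belief whose origin set contains the culprit."""
    return {b: origins for b, origins in beliefs.items()
            if culprit not in origins}

beliefs = retract("hyp: only plants are white", beliefs)
print(beliefs)   # only the belief resting on the surviving hypothesis remains
```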

Page 31: From Guise Theory to Contextual Vocabulary Acquisition

A Computational Theory of CVA

1. A word does not have a unique meaning.
2. A word does not have a “correct” meaning.
   a) Author’s intended meaning for word doesn’t need to be known by reader in order for reader to understand word in context
   b) Even familiar/well-known words can acquire new meanings in new contexts.
   c) Neologisms are usually learned only from context
3. Every co-text can give some clue to a meaning for a word.
   • Generate initial hypothesis via syntactic/algebraic manipulation
4. But co-text must be integrated with reader’s prior knowledge
   a) Large co-text + large PK ⇒ more clues
   b) Lots of occurrences of word allow asymptotic approach to stable meaning hypothesis
5. CVA is computable
6. CVA is “open-ended” hypothesis generation
   a) CVA ≠ guess missing word (“cloze”); CVA ≠ word-sense disambiguation
   b) Some words are easier to compute meanings for than others (N < V < Adj/Adv)
7. CVA can improve general reading comprehension (through active reasoning)
8. CVA can & should be taught in schools

Page 32: From Guise Theory to Contextual Vocabulary Acquisition

From Algorithm to Curriculum

• State of the art in vocabulary learning from context:

  – Mauser 1984: “context” = definition!
  – Clarke & Nation 1980: a “strategy” (algorithm?):
    1. Determine part of speech of word
    2. Look at grammatical context
       • Who does what to whom?
    3. Look at surrounding textual context
       • Search for clues (as we do)
    4. Guess the word; check your guess

Page 33: From Guise Theory to Contextual Vocabulary Acquisition

CVA: From Algorithm to Curriculum

• “guess the word” = “then a miracle occurs”

• Surely, computer scientists can “be more explicit”!

• And so should teachers!

Page 34: From Guise Theory to Contextual Vocabulary Acquisition

From Algorithm to Curriculum (cont’d)

• We have an explicit, GOF (symbolic) AI theory of how to do CVA ⇒ Teachable!

• Goal:
  – Not: teach people to “think like computers”
  – But: explicate computable & teachable methods to hypothesize word meanings from context

• AI as computational psychology:
  – Devise computer programs that faithfully simulate (human) cognition
  – Can tell us something about the (human) mind
  – We are teaching a machine, to see if what we learn in teaching it can help us teach students better

Page 35: From Guise Theory to Contextual Vocabulary Acquisition

CVA as Computational Philosophy & Philosophical Computation

1. CVA & holistic semantic theories:
   – Semantic networks:
     • “Meaning” of a node is its location in the entire network
   – Holism:
     • Meaning of a word is its relationships to all other words in the language
   – Problems (Fodor & Lepore):
     • No 2 people ever share a belief
     • No 2 people ever mean the same thing
     • No 1 person ever means the same thing at different times
     • No one can ever change his/her mind
     • Nothing can be contradicted
     • Nothing can be translated
   – CVA offers principled way to restrict “entire network” to a useful subnetwork
     • That subnetwork can be shared across people, individuals, languages, …
     • Can also account for language/concept change
       – via “dynamic”/“incremental” semantics

Page 36: From Guise Theory to Contextual Vocabulary Acquisition

CVA as Computational Philosophy & Philosophical Computation

2. CVA and the Chinese Room
   – How might Searle-in-the-Room figure out a meaning for an unknown squiggle?
     • By CVA techniques!
   – Searle’s CR argument from semantics:
     1. Computer programs are purely syntactic
     2. Cognition is semantic
     3. Syntax alone does not suffice for semantics
     ∴ No purely syntactic computer program can exhibit semantic cognition
   – “Syntactic Semantics” (Rapaport 1985ff):
     • Syntax does suffice for the kind of semantics needed for NLU in the CR:
       1. All input—linguistic, perceptual, etc.—is encoded in a single network (or: in a single, real neural network: the brain!)
       2. Relations—including semantic ones—among nodes of such a network are manipulated syntactically
          – Hence computationally (CVA helps make this precise)

Page 37: From Guise Theory to Contextual Vocabulary Acquisition

Hector Anecdote #2

• Hector’s ghost– my dream