Psych 85-419: Introduction to Parallel Distributed Processing Michael Harm, Professor Anthony Cate, TA

Page 1: Psych 85-419: Introduction to Parallel Distributed Processing

Psych 85-419: Introduction to Parallel Distributed Processing

Michael Harm, Professor

Anthony Cate, TA

Page 2: Psych 85-419: Introduction to Parallel Distributed Processing

Course Objectives

• Solid background in the philosophical and computational underpinnings of modern connectionist (PDP) research

• Experience with the construction and analysis of PDP models

• Appreciation of the benefits (and limitations!) of PDP approaches to psychological research

Page 3: Psych 85-419: Introduction to Parallel Distributed Processing

By May, You All Should Be Able To:

• Recognize when a PDP model may be useful to your research

• Build a model of a phenomenon that interests you

• Understand the contributions of models you see in the literature

• … and/or critique them!

Page 4: Psych 85-419: Introduction to Parallel Distributed Processing

Course Will Be Geared Towards Two Communities

• Modelers who plan to use these techniques in their work

• Researchers who want to better understand these models and their implications, even if they don’t want to be modelers

• Straw poll: which group do you fall into?

Page 5: Psych 85-419: Introduction to Parallel Distributed Processing

Grading

• Four homeworks, each of which counts for 10% of your final grade

• One exam, worth 15% of your grade

• A project proposal, worth 5%

• A final project, worth 30% of your grade

• Class participation, worth 10% of your grade

• No final exam

Page 6: Psych 85-419: Introduction to Parallel Distributed Processing

Class Web Page

www.cnbc.cmu.edu/~mharm/courses/pdp_spring2001/

Watch for updates

Page 7: Psych 85-419: Introduction to Parallel Distributed Processing

What is Expected of You

• Readings assigned for each class. Read them!

• Come prepared with thoughtful questions

• Participate in class discussions

• Complete assignments on time

  – Come to us if you need help! Don’t wait until the last minute!

Page 8: Psych 85-419: Introduction to Parallel Distributed Processing

Overview of Class

• What is PDP, anyway? (That’s next)

• Processing and Constraint Satisfaction

• Simple learning and distributed representations

• Learning internal representations

• Unsupervised learning

• Psychological phenomena

  – Language, vision, higher level cognition

Page 9: Psych 85-419: Introduction to Parallel Distributed Processing

So What is PDP, Anyway?

• Start by describing more traditional approaches

• Why would one want a different approach?

• PDP defined

• A case study

• History of the approach

Page 10: Psych 85-419: Introduction to Parallel Distributed Processing

Traditional Approach to Studying Cognition

• The mind is like a computer

• There are rules, facts and propositions

• There is a logic engine that operates over these rules and propositions

  – Generates new propositions, new facts, new rules

• The Name of the Game: Identify the rules and propositions for a given phenomenon

Page 11: Psych 85-419: Introduction to Parallel Distributed Processing

Who Uses This Method (Implicitly or Otherwise)?

• Traditional AI, e.g. unification

  – if (not (married X)) -> (bachelor X)

  – (not (married JOHN)) implies JOHN is a bachelor

• Traditional linguistics (Chomsky, etc.)

• Philosophy of Mind (Fodor, etc.)

• Psychologists (some, at least)
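
The bachelor rule above can be sketched as a single rule-application step. This is only an illustration, assuming facts are stored as nested tuples and that a single uppercase letter such as X is a variable; the helper names `is_variable`, `match`, `substitute`, and `apply_rule` are made up for this sketch and do not come from any particular AI system.

```python
def is_variable(term):
    # Assumption of this sketch: a single uppercase letter is a variable.
    return isinstance(term, str) and len(term) == 1 and term.isupper()

def match(pattern, fact, bindings):
    """Return the extended variable bindings if pattern matches fact, else None."""
    if is_variable(pattern):
        if pattern in bindings:
            return bindings if bindings[pattern] == fact else None
        return {**bindings, pattern: fact}
    if isinstance(pattern, tuple) and isinstance(fact, tuple):
        if len(pattern) != len(fact):
            return None
        for sub_pattern, sub_fact in zip(pattern, fact):
            bindings = match(sub_pattern, sub_fact, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == fact else None

def substitute(template, bindings):
    """Replace variables in template with their bound values."""
    if is_variable(template):
        return bindings.get(template, template)
    if isinstance(template, tuple):
        return tuple(substitute(part, bindings) for part in template)
    return template

def apply_rule(condition, conclusion, facts):
    """Derive a new fact for every known fact that matches the condition."""
    derived = set()
    for fact in facts:
        bindings = match(condition, fact, {})
        if bindings is not None:
            derived.add(substitute(conclusion, bindings))
    return derived

facts = {("not", ("married", "JOHN"))}
derived = apply_rule(("not", ("married", "X")), ("bachelor", "X"), facts)
print(derived)  # {('bachelor', 'JOHN')}
```

The point of the sketch is that everything is explicit: the rule, the fact, and the engine that combines them are all separate symbolic objects.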

Page 12: Psych 85-419: Introduction to Parallel Distributed Processing

Why Would One Question This Approach?

• Descriptive versus explanatory

  – An equation for an ellipse describes planetary motion.

  – But planets do not compute the equation for an ellipse to decide where to go!

  – Has an air of Greek Mythology about it

• Creating theories to account for data, with no external validation

Page 13: Psych 85-419: Introduction to Parallel Distributed Processing

Why Would One Question This Approach (More)

• Doesn’t seem to be how the mind actually works

  – Robust to damage

  – Graded degradation in performance

  – Doesn’t seem to be a single “logic engine” shared across all domains

Page 14: Psych 85-419: Introduction to Parallel Distributed Processing

Why Would One Question This Approach (Yet More)

• No obvious link to neuroscience

  – Single cell recordings, systems neuroscience

  – Impairments that have different effects on cells

• Method is typically grounded in symbolic rules

  – What about phenomena that aren’t rule governed?

Page 15: Psych 85-419: Introduction to Parallel Distributed Processing

So, Fine. Now Will You Tell Us What PDP Is?

• The idea that cognition can arise through the interactions of simple processing units

  – Blind to the global task at hand

  – Output activity based on state and summed input

  – … kind of like neurons

• … and that this may be a good way to study cognition
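
A single processing unit of this kind can be sketched in a few lines. This is a minimal sketch, not the course's own simulator: the logistic squashing function, the `bias` term, and the names `unit_output` and `layer_output` are assumptions chosen for illustration.

```python
import math

def unit_output(inputs, weights, bias=0.0):
    """Activity of one unit: a squashed version of its summed, weighted input."""
    net_input = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Logistic function (an assumption here) keeps activity between 0 and 1.
    return 1.0 / (1.0 + math.exp(-net_input))

def layer_output(inputs, weight_rows, biases):
    """Each unit sees only its own weights and inputs, not the global task."""
    return [unit_output(inputs, weights, bias)
            for weights, bias in zip(weight_rows, biases)]

# Two units receiving the same three input activities.
activities = layer_output(
    inputs=[1.0, 0.0, 1.0],
    weight_rows=[[0.5, -0.3, 0.8], [-1.0, 0.2, 0.1]],
    biases=[0.0, 0.0],
)
print(activities)
```

Nothing in either function refers to the task the network is solving; any task-level behavior has to come from the pattern of weights across many such units.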

Page 16: Psych 85-419: Introduction to Parallel Distributed Processing

The Name of the Game

• Construct a model consisting of processing units and connections between them

  – Guided by theory, observation, hypothesis

• Explore the behavior of the model. Relate to behavioral data

• Use model to gain insights into causes of behavioral data

Page 17: Psych 85-419: Introduction to Parallel Distributed Processing

A Case Study: Frequency by Regularity in Reading

• Regular words are words whose spelling-to-sound correspondences are predictable from other words, like gave, save, wave, pave.

• Exception words are ones that violate the normal rules of pronunciation, like have, yacht, sergeant.

• Word frequency is how often a word is seen. Compare a high frequency word like the with a low frequency word like yacht.

Page 18: Psych 85-419: Introduction to Parallel Distributed Processing

Frequency by Regularity

[Figure: reaction time (y-axis, ticks from 560 to 660) by word frequency (x-axis: low, high), with separate lines for exception and regular words]

• Exception words affected by frequency

• Regular words not (more or less)

Page 19: Psych 85-419: Introduction to Parallel Distributed Processing

Traditional Account (Coltheart and colleagues)

• One cognitive module is responsible for reading exception words. It is frequency sensitive

• Another module can only read regular items. It is rule governed, frequency insensitive.

Page 20: Psych 85-419: Introduction to Parallel Distributed Processing

An Alternative Account, Part I: The Existence Proof

• Seidenberg & McClelland ’89 constructed a large-scale connectionist model of reading

• Mapped spelling patterns onto pronunciation

• Observed same frequency by regularity interaction

• Therefore, data does not necessitate separate systems for rules and exceptions

Page 21: Psych 85-419: Introduction to Parallel Distributed Processing

An Alternative Account, Part II: Analysis

• Plaut et al. ’96 analyzed a network that exhibited the frequency by regularity interaction

• Accounted for the effect through mathematical analysis of the network

• This is a different kind of theorizing.

  – Rooted in computational principles

  – Discovered, rather than designed

Page 22: Psych 85-419: Introduction to Parallel Distributed Processing

History I: The Age of Discovery

• McCulloch & Pitts (1943)

  – Networks of simple logic gates can compute any finite logic proposition

• Hebb (1949)

  – Clear definition of a learning rule for neurons

• Selfridge (1958)

  – Intelligent behavior from interactions of many agents

• And many others...

Page 23: Psych 85-419: Introduction to Parallel Distributed Processing

History II: The Cold Years

• Minsky & Papert ’69: Simple associators cannot solve problems that are not linearly separable

  – The XOR problem

• Many problems aren’t linearly separable

• Led to a scarcity of funding for such research. Meanwhile, these were the golden years of traditional artificial intelligence.
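
The XOR limitation can be illustrated with a small search. This is a sketch under stated assumptions: a "simple associator" is modeled here as one threshold unit with two weights and a bias, the weight grid is arbitrary, and the names `threshold_unit`, `find_weights`, and `xor_two_layer` are invented for the example. A failed grid search is not a proof; it only illustrates the known result that no such weights exist for XOR.

```python
from itertools import product

PATTERNS = [(0, 0), (0, 1), (1, 0), (1, 1)]

def threshold_unit(x, w1, w2, bias):
    """Fire (1) when the summed weighted input plus bias is positive."""
    return 1 if w1 * x[0] + w2 * x[1] + bias > 0 else 0

def find_weights(targets, grid):
    """Search the grid for one unit that reproduces all four target outputs."""
    for w1, w2, bias in product(grid, repeat=3):
        if all(threshold_unit(x, w1, w2, bias) == t
               for x, t in zip(PATTERNS, targets)):
            return (w1, w2, bias)
    return None

grid = [i / 2 for i in range(-8, 9)]  # -4.0 to 4.0 in steps of 0.5

and_weights = find_weights([0, 0, 0, 1], grid)  # linearly separable
or_weights = find_weights([0, 1, 1, 1], grid)   # linearly separable
xor_weights = find_weights([0, 1, 1, 0], grid)  # not linearly separable

def xor_two_layer(x):
    """Hand-wired network with two hidden units: OR and AND of the inputs."""
    h_or = threshold_unit(x, 1, 1, -0.5)
    h_and = threshold_unit(x, 1, 1, -1.5)
    # Output fires when OR is on but AND is off.
    return threshold_unit((h_or, h_and), 1, -1, -0.5)

print("AND found:", and_weights is not None)
print("OR found:", or_weights is not None)
print("XOR found:", xor_weights is not None)
print("Two-layer XOR:", [xor_two_layer(x) for x in PATTERNS])
```

Adding a layer of hidden units makes XOR easy to wire by hand; what was missing at the time was a way to learn such hidden weights.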

Page 24: Psych 85-419: Introduction to Parallel Distributed Processing

History III: Renaissance of the Mid ’80s

• Discovery of training algorithms that are more powerful than simple associators

  – Could compute problems that are not linearly separable

  – Resurgence in interest in use of these models for theory construction

Page 25: Psych 85-419: Introduction to Parallel Distributed Processing

History IV: The Counter Attack

• Pinker & Prince ’88 launched an attack on the PDP account of inflectional morphology

• Fodor & Pylyshyn ’88 attacked the connectionist enterprise as a whole

• Besner et al. and Coltheart et al. attacked the findings of the Seidenberg & McClelland ’89 model

• McCloskey: Networks are not theories!

Page 26: Psych 85-419: Introduction to Parallel Distributed Processing

Where We Are Today

[Figure: a scale running from low level to high level, listing classical conditioning and priming, reading, morphology, parsing sentences, and reasoning and creativity]

Page 27: Psych 85-419: Introduction to Parallel Distributed Processing

For Next Class

• Read PDP1, Chapter 2

• Optional: Read PDP1, Chapter 1