8
1 Connectionist Models of Language Deficits Fiona M Richardson Developmental Neurocognition Lab Birkbeck, University of London Overview Part One: Connectionism – Principles – Properties – What are connectionist models…really?!! Part Two: Models – Specific Language Impairment (SLI) – Deep Dyslexia Take home message Models Analogous simplified systems Many different types More manipulable than target system Improve our understanding Parameterise, develop, and test theories Try to explain why Controlled means of testing Compare with empirical data Generate predictions Models as Tools Complex tasks can be performed by simple units Content addressable memory Ability to learn Generalise from experience Resistant to damage Properties of Connectionist Models

Fee conist lang 2007 - Birkbeck, University of London€¦ · effects for regular verbs The English Past Tense ... processing resource common to regulars and irregulars The Model

Embed Size (px)

Citation preview

1

Connectionist Models of Language Deficits

Fiona M RichardsonDevelopmental Neurocognition Lab

Birkbeck, University of London

Overview

• Part One: Connectionism

– Principles

– Properties

– What are connectionist models…really?!!

• Part Two: Models

– Specific Language Impairment (SLI)

– Deep Dyslexia

• Take home message

Models

• Analogous simplified systems

• Many different types

• More manipulable than target system

• Improve our understanding

• Parameterise, develop, and test theories

• Try to explain why

• Controlled means of testing

• Compare with empirical data

• Generate predictions

Models as Tools

Complex tasks can be performed by simple units

Content addressable memory

Ability to learn

Generalise from experience

Resistant to damage

Properties of ConnectionistModels

2

Where’s the Knowledge?

Hippocampal neurons

(with glial cells shown in red)

Representation of a connectionist network

• neurons = units • connections = weights

Knowledge is stored in the weights

and is acquired through learning

A Simple Connectionist Model

Input

Layer

Output

Layer

1

1

0

1

1

0

0

1

10 5 2 6

9 6 5 8

8 13 2 13

6 2 8 10

Matrix

24 14 16 20

Output Threshold

Weights

Input

Layer

Input

Vector

Models learn by updating their weights

Creating Deficits

• “Lesioning”

• Degrading the signal

• Removal of connections or units

Focal Diffuse

• You can also…

– Add noise

– Change processing unit discriminability

The Effect of Damage

Dissociations in a distributed memory (Wood, 1977)

10 5 2 6

9 6 5 8

8 13 2 13

6 2 8 10

Matrix

24 14 16 20

Output Threshold

4

6

A correct but not BB correct but not A

1

1

1

0

1

A1

0

0

1

A’

undamaged

0

1

1

0

B0

1

0

1

B’1

1

0

1

A

damaged damaged

0

1

1

0

B1

1

0

1

A0

0

0

1

Ax0

1

1

0

B0

1

0

0

Bx

3

Matching model performance to patient data

“lesion” the model to match patient data

Dell et al. (1997)

How should we model disorders?

• Remember connectionist models learn

training epochs

err

or

Acquired disorders: end (or towards end) of learning

Developmental disorders: damage during learning

About SLI

SLI is defined as a developmental disorder of language that occurs in the absence of any cognitive impairment

Grammar

JSJB RH

R2 = 0.1884

0

5

10

15

20

80 90 100 110 120 130 140age in months

raw

sc

ore

• Expressive and receptive

language difficulties, particularly in the acquisition of grammar

• May also exhibit poor semantic knowledge, vocabulary and phonological skills

Vocabulary

JSJB

RH

R2 = 0.4535

0

20

40

60

80

100

120

140

160

80 90 100 110 120 130 140age in months

raw

sc

ore

Phonology

JS

JB

RH

R2 = 0.0913

0

5

10

15

20

25

30

35

40

80 90 100 110 120 130 140

age in months

item

s c

orr

ec

t

Semantics

JS

JBRH

R2 = 0.0785

0

5

10

15

20

25

30

35

40

45

80 90 100 110 120 130 140age in months

ite

ms

rec

all

ed

4

• Deficit in regular inflection in SLI and frequency effects for regular verbs

The English Past Tense

A “quasi-regular” domain

Regular: TALK - TALKED

Irregular: THINK - THOUGHT, HIT - HIT

Rule: WUG - WUGGED

– Ullman and Pierpoint (2006): developmental deficit to a system specialised for grammar (procedural memory system)

– Thomas (in press): same data can arise from a deficit to a processing resource common to regulars and irregulars

The Model

• Thomas (in press)

• Thomas and Karmiloff-Smith (2003)

SEMANTICS VERB STEM PHONOLOGY

PAST TENSE PHONOLOGY

HIDDEN UNITS

�� /send/

/sent/

Alter initial level of processing unit discriminability

Input to Processing Unit

-4.0 0.0 +4.0

0.0

0.2

0.4

0.6

0.8

1.0

Un

it O

utp

ut

T=1.0 (Normal)

T=4.0 (Low discrimination)

T=0.25 (High discrimination)

Unit activation function 'Temperatures' (T)

Transfer functions and category boundaries

Cliffs – sharp category boundary, good for rule-like distinctions

Slopes – broad category boundary, good for fine-grained distinctions

The Data

Thomas (in press)

0%

20%

40%

60%

80%

100%

Regular Irregular OG errors NR+ed

Condition

Resp

on

ses

Controls (9;10)

Normal Sim

gSLI (11;2)

Low discrim Sim

Multiple causality of developmental deficits

• Joanisse (2000): SLI pattern can be produced by another developmental manipulation

– Poor speech perception affects the use of phonological information in working memory, which in turn leads to poor syntactic comprehension

• Domain-specific (phonological) deficit

The Model

Morphology

Semantics

Phonology

Groups of artificial neurons

Connections between groups

g

5

The Data

matching pattern

Implications

• Developmental dissociations may emerge from alterations to domain-relevant properties of shared resources

• Similar deficits can be produced by both domain-general deficit and domain-specific deficit (though specific for phonology, not regulars)

• Shallice (1988, p248):

• “If modules exist then…double dissociations are a relatively reliable way of uncovering them. Double dissociations exist, therefore modules exist”

but…

Dissociations and modularity

Delusions about dissociations…?

• Shallice describes a number of non-modular systems

that could produce DDs.

Modules…?

• Shallice (1988, p249):

• “the idea that the existence of a double dissociation necessarily implies that the overall system has separate subcomponents can no longer be taken for granted”

6

“functional specialisation”

• Shallice (1988): “functional specialisation” is a more appropriate inference from dissociations in patients

• But the dimensions on which behavioural dissociations are based may not be a direct reflection of the function responsible for specialisation

• If so, the degree of specialisation may not be a useful guide to system architecture and functional organisation

Dissociations and Modules

• Sartori (1988): Components in a fully serial architecture

can only produce single dissociations . Some contribution of parallel organisation is required for

double dissociations.

X1

X2

X1

X2 X3X1 X2

A Model of Word ReadingPlaut & Shallice (1993)

32 orthographic units

98 semantic units

10 intermediate units

61 phonological units

10 clean-up units

10 intermediate units

10 clean-up units

Deep Dyslexia

• The hallmark: semantic errors

– i.e. reading CAT as “dog”

• Also…

– Visual errors: CAT -> cot

– Mixed errors: CAT -> rat

– Morphological: GOES -> go

“Double dissociation without modularity”

32 orthographic units

98 semantic units

10 intermediate units

61 phonological units

10 clean-up units

10 intermediate units

10 clean-up units

32 orthographic units32 orthographic units

98 semantic units98 semantic units

10 intermediate units10 intermediate units

61 phonological units61 phonological units

10 clean-up units10 clean-up units

10 intermediate units10 intermediate units

10 clean-up units10 clean-up units

– Only in cases where this route is inoperative as as in deep and phonological dyslexia, are strong semantic effects such as concreteness observed

• Patient CAV exhibited better performance on abstract words (partial reliance on semantic route)

• There is a double dissociation between concrete and abstract word reading

– Most researchers believe that skilled readers rely almost exclusively on the phonological route

Lesion location and behaviour

• Damage to connections between processing units

32 orthographic units

98 semantic units

10 intermediate units

61 phonological units

10 clean-up units

10 intermediate units

10 clean-up units

32 orthographic units32 orthographic units

98 semantic units98 semantic units

10 intermediate units10 intermediate units

61 phonological units61 phonological units

10 clean-up units10 clean-up units

10 intermediate units10 intermediate units

10 clean-up units10 clean-up units

7

Lesion severity and behaviour

• percentage of damage inflicted upon connections

32 orthographic units

98 semantic units

10 intermediate units

61 phonological units

10 clean-up units

10 intermediate units

10 clean-up units

32 orthographic units32 orthographic units

98 semantic units98 semantic units

10 intermediate units10 intermediate units

61 phonological units61 phonological units

10 clean-up units10 clean-up units

10 intermediate units10 intermediate units

10 clean-up units10 clean-up units

Functional specialisation

• Different parts of the network have a different function: pointers and valleys

• Abstract words are assumed to have fewer semantic features (sparser semantic neighbourhood)

• Concrete relies more on valleys, abstract more on pointers

Attractor space: pointers and valleys

Orthography Semantics

CAT

COT

BED

Implications: Plaut (1995)

• “…Both pathways are involved in processing both types of words. However, they make different contributions the course of this processing…”

– The direct pathway generates an initial approximation of the semantics

– These are refined by the clean-up pathway

• Functional specialisation: this exists in the network “but does not directly correspond to the observed behavioural effects under damage (abstract vs. concrete words)”

• “[Regarding Averaged vs. Rare lesions]… the occasional lesion of each type may produce effects that are exactly opposite to those produced by most quantitatively equivalent lesions

– The observation of a double dissociation does not even indicate functional

specialisation, as Shallice (1988) suggests, for how can the same portion of

a mechanism be “specialised” in two different ways?”

Conclusions

• Concrete-Abstract double dissociation appears to violate Sartori’s (1988) argument that double dissociations cannot arise from serial stages

• Clean-up pathway appears to follow Direct pathway in a serial fashion

• Either parallel processing permits this, or the two parts of network do not conform to independent stages (see McClelland, 1979)

Summary

• Connectionist models are tools for theory development

• Connectionist models are learning models

• Models can be damaged to produce deficits

– A controlled testing environment

– Connectionist models have pushed the boundaries of traditional cognitive neuropsychology

– Connectionist models themselves have become working theories

8

Take home message

Connectionist models provide a controlled environment for testing the effects of both acquired and developmental deficits. They are ideal for this practice because when

damaged they do not collapse, and they are models that can learn. Connectionist models are tools for theory development that have

pushed the boundaries of traditional neuropsychology

The End

Questions?