

12

PERCEPTION, PICTURE PROCESSING AND COMPUTERS

DR M. B. CLOWES*
M.R.C. PSYCHO-LINGUISTICS RESEARCH UNIT
UNIVERSITY OF OXFORD

INTRODUCTION

The ability to interpret and respond to the significant content of pictures is shared by a large proportion of the animal kingdom and a small but growing number of machines. The machines (usually digital computers) will classify simple shapes and printed letters and digits represented (by means of a suitable television scanner) as a matrix of 1s and 0s. Animals, and specifically man, will do this and more; that is, not only will a human subject correctly recognise letters written badly but also comment, 'That's a funny way to write a two', or 'It's a good portrait, but the eyes are a little too far apart'. No machine can as yet approach this sort of performance. We may well ask, therefore, what if anything we can learn from a study of 'machine perception' which could conceivably help us to understand human perception. The answer lies in the fact that any realistic account of the mechanism underlying perceptual or any other human skills will be complex. The virtue of a computer lies in its ability to capture in a definite form processes of indefinite complexity and subtlety, and moreover, to permit an evaluation of the efficacy of the proposed description by trying it out on the actual task.

This paper will be concerned with a description of a new method of interpreting pictures, now under development. The method arises directly out of earlier attempts (Clowes & Parks 1960, Clowes 1962) to develop letter recognition machines for commercial use. However, it now appears that the approach is capable of a wider interpretation in terms both of psychological observations and recent physiological evidence.

* Present address: Computing Research Section, C.S.I.R.O., Canberra, Australia.


HIERARCHIES OF DESCRIPTION

Psychologists concerned with analysing complex behaviour often have to select an appropriate level of description. For example, driving a motor car can be characterised in terms such as 'overtaking', 'giving way', 'filtering', or at a more detailed level in terms of 'changing from 2nd to 3rd gear', 'turning the wheel anti-clockwise', 'depressing the clutch pedal' or yet more detailed still in terms of the positions of the limbs as a function of time. Choosing the correct level of description might be vital to the assessment of the relative skills of two drivers, or the design of two different types of car. On the other hand perhaps all these levels of description should be employed simultaneously; if so, how are they related? This question forms the basis of a book Plans and the structure of behaviour (Miller, Galanter & Pribram 1960) in which it is shown that the problem arises not only for manual skills, but also in many cognitive processes and especially in language. It is in this latter case that perhaps the clearest expression of a hierarchy of description is evident: phonemes, syllables, words, phrases, clauses, sentences; and it is also the area in which the most progress has been made in answering these questions.

LINGUISTIC GRAMMARS

The problem is to formulate a reasonably precise set of rules by means of which we can take an utterance, e.g., 'the boy kicked the red ball' and, by applying the rules in the prescribed manner, can obtain a description of this utterance at a number of different levels.

    (1)  S1  = NP + VP1
         S2  = NP + VP2
    (2)  VP1 = V + NP
    (3)  VP2 = V + ANP
    (4)  NP  = T + N
         ANP = T + N1
    (5)  N1  = A + N
    (6)  V   = kicked/rolled/pushed/ . . .
         T   = the/a/every/ . . .
         N   = man/boy/ball/ . . .
         A   = red/young/striped/ . . .

FIG. 1. A set of ordered substitution rules specifying the grammar of a simple sentence.

Let us confine our attention to three levels: words, phrases, sentences. Fig. 1 shows a set of rules sub-divided into numbered groups (1, 2, 3, . . . , 6). We can regard each rule as defining a permissible substitution. Thus group 6 rules state that we may replace any of the words red, young, striped (i.e., any adjective) by the symbol A, and similarly for verbs, nouns and articles. So that the sentence

the boy kicked the red ball

becomes by application of group 6

(6) T N V T A N

Group 5 states that a pair of symbols A, N appearing contiguously and in that order can be replaced by N1 so that the string becomes

(5) T N V T N1

and so by application of successive rules, in descending order of their group number, we obtain

(4) NP V ANP
(3) NP VP2
(2) NP VP2
(1) S2

If we now combine these different strings of symbols into one diagram (Fig. 2) we have a description of the utterance at a number of levels in a fashion which indicates the relation between the levels.

    S2
      NP
        T    the
        N    boy
      VP2
        V    kicked
        ANP
          T    the
          N1
            A    red
            N    ball

FIG. 2. The structural analysis of a simple sentence as obtained by applying the rules of FIG. 1.
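The analysis just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the group numbering follows Fig. 1 as read here, each group is applied in a single left-to-right pass over contiguous pairs, and the names `LEXICON`, `PAIR_GROUPS`, `apply_lexical`, `apply_pairs` and `parse` are hypothetical, not taken from the paper.

```python
# Group 6: word-to-category substitutions (only the words listed in Fig. 1).
LEXICON = {
    "V": {"kicked", "rolled", "pushed"},
    "T": {"the", "a", "every"},
    "N": {"man", "boy", "ball"},
    "A": {"red", "young", "striped"},
}

# Groups 5 to 1: a contiguous pair of symbols is replaced by a single symbol.
PAIR_GROUPS = {
    5: {("A", "N"): "N1"},
    4: {("T", "N"): "NP", ("T", "N1"): "ANP"},
    3: {("V", "ANP"): "VP2"},
    2: {("V", "NP"): "VP1"},
    1: {("NP", "VP1"): "S1", ("NP", "VP2"): "S2"},
}


def apply_lexical(words):
    """Replace every word by its category symbol (group 6)."""
    symbols = []
    for word in words:
        for category, members in LEXICON.items():
            if word in members:
                symbols.append(category)
                break
        else:
            raise ValueError(f"word not in the grammar: {word}")
    return symbols


def apply_pairs(symbols, rules):
    """One left-to-right pass replacing contiguous pairs (assumed strategy)."""
    out, i = [], 0
    while i < len(symbols):
        pair = tuple(symbols[i:i + 2])
        if pair in rules:
            out.append(rules[pair])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def parse(sentence):
    """Return the string of symbols obtained after each group, keyed by group."""
    symbols = apply_lexical(sentence.split())
    levels = {6: symbols}
    for group in (5, 4, 3, 2, 1):
        symbols = apply_pairs(symbols, PAIR_GROUPS[group])
        levels[group] = symbols
    return levels


if __name__ == "__main__":
    for group, symbols in parse("the boy kicked the red ball").items():
        print(f"({group})", " ".join(symbols))
```

Keeping the intermediate strings in `levels` mirrors the point made in the text: the description of the utterance is the whole set of levels, not just the final label.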

The rules may be used to generate or analyse sentences. A typical generative sequence starting from the schema S2 would be S2 → NP VP2 → T N VP2 → the N VP2 → the boy VP2 → the boy V ANP ... → the boy kicked the red ball. The generative path can be diagrammed in tree form as in Fig. 2. This diagram, a so-called tree, indicates not only that the utterance is a sentence, but also what type of sentence it is (S2 rather than S1), as well as indicating what features (i.e., phrases like ANP, etc.) it had which caused the label S2 to be applicable. Let me hasten to add at this point that this is not presented as a serious piece of linguistics (there are grounds for believing it to be incorrect in several ways) but merely as an introduction to the notion of a set of rules (the grammar) and a procedure for applying them (a parsing algorithm) by which we can rigorously derive a structural description of a complex object like a sentence. A mechanism or mechanisms of this type undoubtedly underlie our ability to understand sentences.
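The generative use of the same rules can be sketched as a top-down expansion. The rule table below is the Fig. 1 grammar as read here; the function name `generate` and the `choose` parameter are hypothetical, introduced only for illustration.

```python
import random

# Non-terminal symbols and their expansions (right-hand sides of Fig. 1).
RULES = {
    "S1": ["NP", "VP1"],
    "S2": ["NP", "VP2"],
    "VP1": ["V", "NP"],
    "VP2": ["V", "ANP"],
    "NP": ["T", "N"],
    "ANP": ["T", "N1"],
    "N1": ["A", "N"],
}

# Word classes (only the words actually listed in Fig. 1).
WORDS = {
    "V": ["kicked", "rolled", "pushed"],
    "T": ["the", "a", "every"],
    "N": ["man", "boy", "ball"],
    "A": ["red", "young", "striped"],
}


def generate(symbol, choose=random.choice):
    """Expand `symbol` recursively into a list of words."""
    if symbol in WORDS:
        return [choose(WORDS[symbol])]
    words = []
    for part in RULES[symbol]:
        words.extend(generate(part, choose))
    return words


if __name__ == "__main__":
    print(" ".join(generate("S2")))
```

Passing a deterministic `choose` (for example, one that always takes the first option) makes the expansion repeatable, which is convenient when checking the rules by hand.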

PICTURE PROCESSING

Comparable mechanisms for processing pictures have not appeared to any significant extent either in Artificial Intelligence or in Psychology. Instead the emphasis has been on classifying pictures, i.e., upon the equivalent of deciding the type (S1 or S2) of a sentence. The possibility that a structural description of a pictorial object is necessary has only recently emerged in Artificial Intelligence (Kirsch 1964). It appears to be crucial to the automation of many picture processing tasks (e.g., interpreting bubble-chamber photographs, recognising fingerprints). That perceptual behaviour involves description as well as classification could be supported by innumerable examples beyond the two already quoted in the introduction to this paper. However, it seems to have received scant attention in recent psychological literature. This may well be because overtly descriptive (rather than classificatory) behaviour has a large verbal element. There is a strong temptation to avoid this 'complication' by designing experiments which merely require simple YES/NO responses and which have, therefore, no significant verbal component.

It follows from the foregoing remarks that our own objective is the development of a formal (i.e., rigorous) theory of picture processing which aims to provide a structural description of pictorial objects as well as a classification of them (wherever appropriate). The particular system I shall describe is by no means the only possible approach; both Ledley (1964) and Narasimhan (1964) have proposed alternatives. I shall illustrate the technique in terms of a limited class of pictures: the handwritten numerals. There are several reasons for choosing this class. (1) They are (hopefully) relatively simple objects; (2) at least two levels of description naturally occur, e.g., the symbol 7 is readily described as 'a horizontal stroke above a diagonal stroke. . .'; (3) psycholinguistic skills, e.g., reading (with which the Unit† is primarily concerned), involve pictorial objects of this type.

† MRC Psycholinguistics Research Unit, University of Oxford.

A PICTURE GRAMMAR FOR HANDWRITTEN NUMERALS: NOTATION

The substitution rules in the linguistic grammar (Fig. 1) contain on their right-hand side (typically) two elements, e.g., T, N or NP, VP, linked by a relational operator '+' meaning 'followed by'. This operator expresses the notion of contiguity as it arises in a string of symbols; contiguity in a geometrical sense is the more general relationship 'next to'. Thus a simple pictorial property like 'Edge' (i.e., boundary between black and white regions of the picture) might be defined as 'black area next to white area'. Specifically we might define a North Edge as 'white area above black area', which can be expressed in a rigorous (computable) form with the aid of a suitable notation. For example, in a pictorial notation the definition of North Edge for a picture composed only of black or white elements might take the form illustrated in Fig. 3.

FIG. 3.

This asserts that the point arrowed will be said to have the property N EDGE if it and the points to the left and the right of it are black (i.e., True) and the three points denoted '0' are white (i.e., False). If we introduce a Cartesian co-ordinate system we can write this definition as a Boolean expression:

$$
\begin{aligned}
\text{N EDGE}(x, y) ={} & \text{PICTURE}(x-1, y) \;\&\; \text{PICTURE}(x, y) \;\&\; \text{PICTURE}(x+1, y) \\
& \&\; \overline{\text{PICTURE}(x-1, y+2)} \;\&\; \overline{\text{PICTURE}(x, y+2)} \;\&\; \overline{\text{PICTURE}(x+1, y+2)}
\end{aligned}
\tag{1}
$$

where (x, y) is the 'arrowed' point, an overbar denotes negation, a black point has the value True and a white point the value False.

The assignment of a label (e.g., N EDGE) to a picture point involves a decision based upon the state of a number of neighbouring points. Fig. 3 is a picture of the decision criterion. The values of only 6 specified points are considered. All other points are ignored, for the purpose of assigning this label to the arrowed point.
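The N EDGE criterion can be written directly as a function of a black-and-white picture. The sketch below is a minimal illustration, not the paper's program. It assumes the picture is held as `grid[y][x]` with y increasing upward (so "above" means y + 2), and that points outside the picture count as white. The names `to_grid`, `n_edge` and `label_picture` are hypothetical.

```python
def to_grid(rows_top_to_bottom):
    """Turn text rows ('#' = black) into grid[y][x], with y increasing upward."""
    return [[ch == "#" for ch in row] for row in reversed(rows_top_to_bottom)]


def n_edge(grid, x, y):
    """True if (x, y) and its left and right neighbours are black and the
    three corresponding points two rows above are white."""

    def picture(i, j):
        # Assumption: anything outside the picture is treated as white.
        if 0 <= j < len(grid) and 0 <= i < len(grid[0]):
            return grid[j][i]
        return False

    return (
        picture(x - 1, y) and picture(x, y) and picture(x + 1, y)
        and not picture(x - 1, y + 2)
        and not picture(x, y + 2)
        and not picture(x + 1, y + 2)
    )


def label_picture(grid):
    """Apply the definition at every point, giving a new picture of labels."""
    return [[n_edge(grid, x, y) for x in range(len(grid[0]))]
            for y in range(len(grid))]


if __name__ == "__main__":
    grid = to_grid([
        ".....",
        ".....",
        ".###.",
        ".###.",
        ".###.",
    ])
    for row in reversed(label_picture(grid)):
        print("".join("N" if cell else "." for cell in row))
```

Only six points are consulted for each decision, as in Fig. 3; everything else in the picture is ignored when labelling a given point.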

IMPLEMENTATION

This notation is used in the picture analysis program with some small differences for reasons of economy and ease of programming. Fig. 4 shows a system of such definitions as they are input to the computer. The first two lines specify various program options relating to print out, etc., and also define the size (48 x 48) of the input picture. The third line is the definition for N EDGE given above (1), but specified in Reverse Polish (i.e., parenthesis free) form.† Repetitions of the label PICTURE have been suppressed and the label N EDGE appears at the end of the definition rather than preceding it. The x, y appearing on both sides of the expressions (1) above have also been omitted, it being understood that this rule is applied to the picture (Fig. 5(a)) for all values of (x, y) so as to derive another picture called (Z) N EDGE (Fig. 5(b)). The axes have been rotated so that x is measured down the

† The Reverse Polish form of the Boolean expression (A & B) v (C & D) is A, B, &, C, D, &, v.

FIG. 4. A partial 'picture grammar' for handwritten numerals. Each group of rules is preceded by the control instruction + CODE . . . . [Program listing not reproduced.]

FIG. 5(a). The picture to be analysed. The origin of coordinates is the cell immediately beneath the Z of Z PICTURE. The picture is defined to be only 48 x 48 elements hence in the x direction (down the page) we can only print lines 0 to 47. Each 'black' element is represented internally as a binary digit. For display purposes, however, the 'black' elements are numbered (on a scale of 8) in the horizontal direction. [Printout not reproduced.]

FIG. 5(b). The outcome of the first group of rules when applied to Fig. 5(a). Note that the 'origin cell' of the array Z N EDGE is at the absolute location x = 48, y = 48 as stated in the picture grammar. [Printout not reproduced.]

FIG. 5(c). The outcome of the second group of rules applied to the contents of Fig. 5(b). [Printout not reproduced.]

FIG. 5(d). The outcome of the third group of rules applied to the contents of Fig. 5(c). [Printout not reproduced.]

FIG. 5(e). 'Recognition' as obtained by applying the fourth group of rules to the contents of Fig. 5(d). Note that the array size is now 3 x 3 elements. In principle more than one figure could have been presented in the initial Z PICTURE. Their relative position and identity would have been reflected in this display. [Printout not reproduced.]

page and y across it, following the numbering appearing in the picture frame. Following the label (Z) N EDGE a further pair of coordinates (048, 48) specify the location in computer memory at which the resultant picture (Z) N EDGE is to be stored. The first group of definitions in the grammar expresses eight EDGE definitions corresponding to eight points of the compass, and a further definition (Z) BLOB which merely collects together the results of all the EDGE definitions in a single picture called (Z) BLOB. The next set of rules in the grammar contains concatenations of EDGES to define LINES having four orientations 00, 45, 90, 135 and line ENDS with eight orientations. These in turn are concatenated to form CURVES (combinations of appropriately oriented lines), LIMBS (combinations of lines and ends) and LINES (combinations of lines). Each set of rules in the picture grammar is separated by a line beginning + CODE. . . .
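A definition given in Reverse Polish form can be evaluated with a simple stack. The sketch below is illustrative only: the token format (a tuple of picture name and two offsets, or one of the strings NOT, AND, OR) is an assumption made here, not the program's input format, and `N_EDGE_RULE` is the N EDGE definition as best it can be read from the listing in Fig. 4. Offsets are taken as (down, across), and points outside a picture are assumed white.

```python
def eval_rpn(tokens, pictures, x, y):
    """Evaluate a Reverse Polish Boolean rule at the point (x, y).

    Operands are (picture_name, dx, dy) tuples read at (x + dx, y + dy);
    operators are the strings "NOT", "AND" and "OR".
    """
    stack = []
    for token in tokens:
        if token == "NOT":
            stack.append(not stack.pop())
        elif token in ("AND", "OR"):
            right, left = stack.pop(), stack.pop()
            stack.append((left and right) if token == "AND" else (left or right))
        else:
            name, dx, dy = token
            grid = pictures[name]
            i, j = x + dx, y + dy
            inside = 0 <= i < len(grid) and 0 <= j < len(grid[0])
            # Assumption: points outside the picture are white (False).
            stack.append(bool(inside and grid[i][j]))
    if len(stack) != 1:
        raise ValueError("malformed Reverse Polish expression")
    return stack[0]


# (A & B) v (C & D) written as A, B, &, C, D, &, v
FOOTNOTE_RULE = [("A", 0, 0), ("B", 0, 0), "AND",
                 ("C", 0, 0), ("D", 0, 0), "AND", "OR"]

# N EDGE: three white points in one row, three black points two rows below.
N_EDGE_RULE = [
    ("PICTURE", 0, 0), "NOT",
    ("PICTURE", 0, 1), "NOT", "AND",
    ("PICTURE", 0, 2), "NOT", "AND",
    ("PICTURE", 2, 0), "AND",
    ("PICTURE", 2, 1), "AND",
    ("PICTURE", 2, 2), "AND",
]

if __name__ == "__main__":
    picture = [[False, False, False],
               [True, True, True],
               [True, True, True]]
    print(eval_rpn(N_EDGE_RULE, {"PICTURE": picture}, 0, 0))
```

Because operands carry a picture name, the same evaluator would serve for later groups of rules, whose operands are the EDGE, LINE and END pictures rather than the input picture.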

The action of the program is to apply each set of rules to the outcome of the previous set starting with a picture stored in computer memory (cf. the linguistic grammar). The performance of the program is monitored by having it print out those portions of its memory which currently contain a picture after each set of rules has been applied. The area of memory is, in effect, a matrix of 192 x 96 elements. At any one stage only a portion of this is in use, the portions being allocated by the coordinates associated with the picture name given in the grammar, i.e., (Z) N EDGE (048, 48) is stored starting from location 48 (down), 48 (across). Thus the output for the first set of rules is as shown in Fig. 5(b).

The size of the initial picture (Z) PICTURE is 48 x 48 cells; the EDGE versions are, however, reduced in size to occupy only 24 x 24 cells. The necessity for this reduction in size is intimately connected with the form of picture grammar we have chosen to implement. A feature of the linguistic grammar was that substitutions were always defined for pairs of adjacent symbols; nowhere do we find a rule which calls for the replacement of symbols widely separated in the string being analysed. In spite of this immediate constituent or phrase structure restriction it is still possible to reflect with a single label (S) the structure of an extended utterance, because as we proceed with the analysis the string automatically shrinks by the very nature of the substitution operation. In the geometrical analysis we perform on a picture, the same immediate constituent limitation is imposed. If, however, we are to be able to label an extended property of a picture, e.g., a long line, with 'next to' concatenations, we need to draw together these local properties; hence this reduction in the size of the EDGE versions of the picture.
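The text does not say how a 48 x 48 picture is condensed to 24 x 24, so the following is purely an assumed scheme for illustration: each cell of the smaller picture is black if any cell of the corresponding 2 x 2 block is black. The function name `halve` is hypothetical.

```python
def halve(picture):
    """Reduce a picture to half its size in each direction.

    Assumed scheme (not stated in the paper): an output cell is black if
    any of the four cells in the corresponding 2 x 2 input block is black.
    """
    rows, cols = len(picture) // 2, len(picture[0]) // 2
    return [
        [
            any(picture[2 * r + dr][2 * c + dc] for dr in (0, 1) for dc in (0, 1))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


if __name__ == "__main__":
    # A horizontal run of 8 labelled cells in a 48 x 48 picture ...
    picture = [[False] * 48 for _ in range(48)]
    for col in range(10, 18):
        picture[20][col] = True
    small = halve(picture)
    # ... occupies only 4 cells of the 24 x 24 version, so a rule that looks
    # at immediate neighbours now spans twice the original distance.
    print(len(small), len(small[0]), sum(small[10]))
```

Whatever the actual mechanism, the effect the text relies on is the same: after each reduction, a purely local 'next to' rule covers a larger part of the original picture.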

We can summarise the outcome of this 'parsing' process in a different form by collecting together the contents of several pictures, e.g., ENDS and LINES (Fig. 5(c)), and combining them into a single picture with a suitable pictorial convention to indicate the type of picture from which the entry originates (Fig. 6(b)). The results of the next level of description (Fig. 5(d)) containing CURVES, LIMBS and LINES can similarly be displayed pictorially (Fig. 6(c)).


They show that the program has adequately labelled the main features of this '2' and for the purposes of recognition we can define each numeral as being some combination of features. For a '2' this definition will involve a curve, a diagonal and a horizontal; and the last group of rules (Fig. 4) contains such a definition. Comparable definitions for the other numerals can be written.
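A definition of this kind can be sketched as a set of required feature labels. This is a hedged illustration, not the rule given in Fig. 4: the feature names and the function `matches` are hypothetical, only the entry for '2' follows the text, and the sketch ignores any spatial relations between features that the actual rule may specify.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical feature names; only the '2' entry follows the text.
NUMERAL_DEFINITIONS: Dict[str, FrozenSet[str]] = {
    "2": frozenset({"CURVE", "DIAGONAL", "HORIZONTAL"}),
}


def matches(numeral: str, labels: Set[str]) -> bool:
    """True if every feature required for the numeral has been labelled."""
    return NUMERAL_DEFINITIONS[numeral] <= labels


found = {"CURVE", "DIAGONAL", "HORIZONTAL", "END"}
print(matches("2", found))                    # True
print(matches("2", {"CURVE", "HORIZONTAL"}))  # False
```

Definitions for the other numerals would be added as further entries in the same table.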

Definitions of this type do not, strictly speaking, constitute a part of the

grammar, and it is with the adequacy of the grammar that we are, at this

point, concerned. Fig. 6(c) shows that certain regions of the '2', e.g., the intersection of the diagonal and the horizontal, are unlabelled. This arises

from the adoption of 'lines' as syntactic categories: we are now studying a

grammar having the same form but concerned exclusively with EDGES.

[Figure 6: three panels, (a), (b) and (c), with a KEY giving the symbols used for the 00 LINE, 45 LINE, 90 LINE, CURV, END and LIMB entries.]

Fig. 6. A pictorial 'summary' of the results of the analysis. (a) The input picture Z PICTURE. (b) The contents of Fig. 5(c) superimposed and coded according to the convention given in the KEY. (c) The contents of Fig. 5(d) similarly encoded.

RELATION TO PHYSIOLOGY

It is possible to compare the organisation of this system with the organisation of the visual system, as revealed by microelectrode studies in the cat (Hubel & Wiesel 1962, 1965) and the frog (Lettvin, Maturana, McCulloch & Pitts 1959). Briefly the following points emerge:

(1) Cells in the visual cortex only respond to local properties of the visual scene, e.g., edges, line segments. This mirrors the immediate constituent constraint imposed for economic reasons in the picture grammar.

(2) The visual scene is portrayed many times over in the cortex in terms of the different picture properties, each portrayal being 'laid out' two-dimensionally and 'in register' with all other portrayals. This corresponds to the sets of arrays which are also preserved 'in register' in the computer.

(3) In addition to cells portraying the local properties of corresponding regions of the retinal image, there are also cells (so-called 'complex' cells; Hubel & Wiesel 1962) which indicate whether a particular local property occurs anywhere over a small area of the retinal image. Such cells are precisely what one would expect to find mediating the 'reduction' of scale (and resolution) injected into the picture grammar.

The detailed nature of the correspondence means that we can usefully consider the possible significance of discrepancies. For example, there are grounds for believing that the picture grammar is premature in beginning with EDGE definitions. In the visual system a considerable amount of processing, so-called 'lateral inhibition', precedes EDGE operations and is performed in the retina. Conversely we may ask of the physiologists what might correspond to certain features of our picture grammar which are designed to provide size invariance. This type of detailed cross-fertilisation is in many ways quite novel, at least with this subject matter.

The notation can also be used to express physiological results, e.g., the properties of a receptive field unit.

RELATION TO PSYCHOLOGY

Ultimately, the psychological relevance of this form of picture analysis will be in its ability to derive structural descriptions of pictorial objects which reflect our intuitions of their shape. To a limited extent the labelling already demonstrated achieves this: it will be more convincing, however, if we can, for example, show that the same grammar assigns adequate descriptions to objects which are generically different, e.g., fingerprints, chromosomes. It can already be shown, however, that by virtue of the way in which it computes picture labels, the system has some interesting weaknesses. Given a numerical value for the notion of local property, e.g., that no definition can refer to array elements separated by more than 10 units, the system is unable to judge the position of an object relative to some frame to better than 10 per cent (i.e., 1/10). This limitation appears in many forms of sensory judgment (Miller 1956). Again the system is unable to 'pick out' (i.e., uniquely label) one member of a large matrix of similar objects, e.g., a matrix of A's. This is a familiar experience. We overcome the limitation by 'tracing' or counting along a line with finger or pencil point. Significantly this use of a pointer changes the structure of the picture in a way which would also greatly help the computer system.

CONCLUSION

We can now return to a general remark made in the introduction. The essential characteristic of the system we have described is the volume of operations it involves. To calculate by hand the structural analysis illustrated earlier would occupy perhaps 2-3 hours for each numeral and would inevitably contain many mistakes. The validation of a picture grammar in terms of a whole alphabet and subsequently for other classes of picture would, therefore, be unthinkable except by computer. In that we already sense directions in which our approach requires further complications due to over-simplifications like assuming pictures to be only black or white, the claim that psychological theories are necessarily complex, and require a computer to test them, will need no further elaboration. In one way it is unfortunate that Artificial Intelligence has been so called, since it suggests that it is concerned with problems quite different from those traditionally studied by the biologist. In fact, however, both disciplines are concerned with the study of very complex systems and collaboration is becoming increasingly fruitful.

ACKNOWLEDGEMENTS

I would like to acknowledge with gratitude the use of the computing facilities at the Culham Fusion Research Laboratory, U.K.A.E.A., the assistance of Douglas Brand and Barry Astbury, and valuable discussions with John Marshall, Professor N. S. Sutherland and Professor R. C. Oldfield.

REFERENCES

Clowes, M. B. (1962). The use of multiple auto-correlation in character recognition. Optical Character Recognition. Fischer, Pollack, Radack & Stevens, Eds. Pp. 305-318. Baltimore: Spartan Books, Inc.

Clowes, M. B., & Parks, J. R. (1961). A new technique in character recognition. Comput. J., 4, 121-126.

Hubel, D. H., & Wiesel, T. N. (1962). Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. J. Physiol., 160, 106-154.

Hubel, D. H., & Wiesel, T. N. (1965). Receptive fields and functional architecture in two non-striate visual areas (18 & 19) of the cat. J. Neurophysiol., 28, 229-289.

Kirsch, R. A. (1964). Computer interpretation of English language text and picture patterns. Instn. elect. Engrs. Trans. on Electronic Computers, EC-13, 363-376.

Ledley, R. S. (1964). High speed automatic analysis of biomedical pictures. Science, 146, 216-223.

Lettvin, J. Y., Maturana, H. R., McCulloch, W. S., & Pitts, W. H. (1959). What the frog's eye tells the frog's brain. Proc. Inst. Radio Engrs., 47, 1940-1951.

Miller, G. A. (1956). The magical number seven. Psych. Rev., 63, 81.

Miller, G. A., Galanter, E., & Pribram, K. H. (1960). Plans and the Structure of Behavior. New York: Henry Holt and Company, Inc.

Narasimhan, R. (1964). Labelling schemata and syntactic descriptions of pictures. Inf. Control, 7, 151-179.
