AD-A009 939

SYNTAX, SEMANTICS, AND SPEECH

William A. Woods

Bolt Beranek and Newman, Incorporated

Prepared for:

Advanced Research Projects Agency

April 1975

DISTRIBUTED BY:

National Technical Information Service
U. S. DEPARTMENT OF COMMERCE

Unclassified

Security Classification

DOCUMENT CONTROL DATA - R&D
(Security classification of title, body of abstract and indexing annotation must be entered when the overall report is classified)

1. ORIGINATING ACTIVITY (Corporate author)

Bolt Beranek and Newman Inc.
50 Moulton Street
Cambridge, MA 02138

2a. REPORT SECURITY CLASSIFICATION

none

2b. GROUP

3. REPORT TITLE

SYNTAX, SEMANTICS, AND SPEECH

4. DESCRIPTIVE NOTES (Type of report and inclusive dates)

Technical Report

5. AUTHOR(S) (First name, middle initial, last name)

William A. Woods

6. REPORT DATE

April 1975

7a. TOTAL NO. OF PAGES

57

7b. NO. OF REFS

42

8a. CONTRACT OR GRANT NO.

N00014-75-C-0533

b. PROJECT NO.

c. Order No. 2904

d. Program Code No. 5D30

9a. ORIGINATOR'S REPORT NUMBER(S)

BBN Report No. 3067
A.I. Number 27

9b. OTHER REPORT NO(S) (Any other numbers that may be assigned this report)

10. DISTRIBUTION STATEMENT

Distribution of this document is unlimited. It may be released to the Clearinghouse, Department of Commerce for sale to the general public.

11. SUPPLEMENTARY NOTES

Reproduced by
NATIONAL TECHNICAL INFORMATION SERVICE
US Department of Commerce
Springfield, VA. 22151

12. SPONSORING MILITARY ACTIVITY

ONR
Department of the Navy
Arlington, Virginia 22217

13. ABSTRACT

Recently, speech understanding research has taken a direction which recognizes the importance of syntactic and semantic constraints as an essential part of the process which deciphers speech signals into sequences of sounds (see Newell et al. 1973). Consequently, it has become important for speech researchers to be acquainted with the work that has been done in the area of computational linguistics, attempting to construct computer programs to model the process of natural language understanding. This paper attempts to provide an introduction to the techniques and results which have come out of work in computational linguistics which have special relevance to the design of speech understanding systems. The paper was written for an audience with some understanding of the nature of speech signals and the difficulties of performing an acoustic and phonetic analysis of such signals but with little familiarity with the techniques for parsing and semantic interpretation of natural language or the ways in which such techniques could be used in a total speech understanding system. However, readers with interests in computational linguistics, linguistics, and artificial intelligence may also find the paper of interest.

This paper is not intended to be a survey. Rather, in it I will try to trace the development of what I think are several important ideas and trends in parsing and syntax and in semantic interpretation. I will attempt to convey a feeling for what I think the state of the art is, how it developed conceptually, and some of the new perspectives that the problems of speech understanding place on the processes of parsing and semantic interpretation.

DD FORM 1473 (1 NOV 65). REPLACES DD FORM 1473, 1 JAN 64, WHICH IS OBSOLETE FOR ARMY USE.

Unclassified

Security Classification

Unclassified

Security Classification

KEY WORDS

Grammars

Parsing

Parsing Algorithms

Semantic Interpretation

Semantic Networks

Semantics

Speech

Speech Recognition

Speech Understanding

Syntax

Security Classification

BBN Report No. 3067 A.I. Report No. 27

SYNTAX, SEMANTICS, AND SPEECH*

William A. Woods

April 1975

Sponsored by Advanced Research Projects Agency

ARPA Order No. 2904

This research was supported by the Advanced Research Projects Agency of the Department of Defense and was monitored by ONR under Contract No. N00014-75-C-0533.

The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the Advanced Research Projects Agency or the U.S. Government.

*To appear in: Speech Recognition; invited papers presented at the IEEE Symposium, D.R. Reddy (ed.), Academic Press (1975).

BBN Report No. 3067        Bolt Beranek and Newman Inc.

INTRODUCTION

Recently, speech understanding research has taken a direction which recognizes the importance of syntactic and semantic constraints as an essential part of the process which deciphers speech signals into sequences of sounds (see Newell et al. 1973). Consequently, it has become important for speech researchers to be acquainted with the work that has been done in the area of computational linguistics, attempting to construct computer programs to model the process of natural language understanding. This paper will attempt to provide an introduction to the techniques and results which have come out of work in computational linguistics which I think have special relevance to the design of speech understanding systems. The paper was written for an audience with some understanding of the nature of speech signals and the difficulties of performing an acoustic and phonetic analysis of such signals but with little familiarity with the techniques for parsing and semantic interpretation of natural language or the ways in which such techniques could be used in a total speech understanding system. However, readers with interests in computational linguistics, linguistics, and artificial intelligence may also find things of interest herein. For the reader with little or no background in the nature of speech production and the characteristics of speech signals, I suggest the papers by Denes and Pinson (1963) and by Jakobson, Fant and Halle (1967) as appropriate introductions. This paper should be readable however without such prior knowledge of speech characteristics.

This paper is not intended to be a survey. Rather, in it I will try to trace the development of what I think are several important ideas and trends in parsing and syntax and in semantic interpretation. I will attempt to convey a feeling for what I think the state of the art is, how it developed conceptually, and some of the new perspectives that the problems of speech understanding place on the processes of parsing and semantic interpretation.

Part 1. Syntactic Analysis

There are two parts to the problem of syntactic analysis -- one is a component of judgment or decision (whether a given string of words is a sentence or not) and the other is a component of representation or interpretation (deciding what the pieces of the sentence are and how they relate to each other). In speech understanding we will see that both of these are important.

Let me start with a mini-history describing what I think the current state of the art is, how it developed conceptually, and some of the new perspectives that the problems of speech understanding place on the evaluation of parsing techniques.

Phrase Structure Grammars

The field of linguistics was given a great stimulus when the two aspects of syntax (judgmental and structural) were combined in the formalism of phrase structure grammar. Prior to this development, largely due to Chomsky (e.g. Chomsky, 1965), the mechanism whereby a computer program could decide whether a given sequence of words was a grammatical sentence or not would have been difficult to imagine.

The principal component of a phrase structure grammar is a collection of "rewrite rules" such as the following:

S -> NP VP
NP -> DET N
VP -> V NP

Intuitively, the first rule indicates that a sentence can consist of a noun phrase followed by a verb phrase. Formally, it indicates that in the course of deriving a sentence, one can replace an occurrence of the symbol S in the string derived so far, with the sequence of two symbols NP VP. Similarly, one can replace the NP with the sequence DET N and the VP with the sequence V NP, ultimately deriving the sequence DET N V DET N, which is the sequence of syntactic word categories underlying a sentence such as

The man bit the dog
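The rewriting process just described can be sketched in a few lines of Python. This is only an illustrative sketch: the names RULES and derive are assumptions introduced here, and the choice to always rewrite the leftmost rewritable symbol is one arbitrary ordering among several possible ones.

```python
# The three rewrite rules from the text: S -> NP VP, NP -> DET N, VP -> V NP.
# RULES and derive are hypothetical names used only for this illustration.
RULES = {
    "S": ["NP", "VP"],
    "NP": ["DET", "N"],
    "VP": ["V", "NP"],
}


def derive(start="S"):
    """Return the successive working strings obtained by repeatedly
    rewriting the leftmost symbol for which a rule exists."""
    string = [start]
    steps = [list(string)]
    while any(sym in RULES for sym in string):
        # Position of the leftmost symbol that can still be rewritten.
        i = next(k for k, sym in enumerate(string) if sym in RULES)
        string[i:i + 1] = RULES[string[i]]
        steps.append(list(string))
    return steps


steps = derive()
for working_string in steps:
    print(" ".join(working_string))
```

The last line printed is the category sequence DET N V DET N that underlies "The man bit the dog".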

Parsers and Recognizers

The rewrite rules of a phrase structure grammar can be used to characterize the set of possible sequences of words which can be considered grammatical sentences, thereby formally representing the judgmental part of syntax. A formal algorithm for taking a grammar and deciding whether a sequence of words is a sentence with respect to that grammar is called an acceptor or a recognizer.

If in the course of deriving a sentence according to the rules we keep track of which symbols were rewritten into which sequences, one can construct a tree structure such as that represented in Figure 1 which gives a very nice representation of what the parts of the sentence are and how they are put together, thus achieving a structural representation of the sentence. An algorithm for constructing such a representation while accepting or recognizing a sentence is called a parser.

                  S
              /       \
           NP           VP
          /  \         /  \
       DET    N       V    NP
        |     |       |   /  \
       THE   MAN     BIT DET   N
                          |    |
                         THE  DOG

Figure 1: A Sample Phrase Structure Tree

Lexical Categories and Dictionaries

Notice that in Figure 1 and in the grammar rules there are two different kinds of names of nodes; there are "nonterminal" symbols like S, NP, and VP, which name whole phrase types, and there are other symbols which are essentially lexical word class names, like determiner, noun, and verb. This distinction between terminal and nonterminal symbols is formalized by dividing the vocabulary of special symbols of a phrase structure grammar into a terminal and nonterminal vocabulary. The initial symbol S, and all of the symbols which later get rewritten by phrase structure rules are in the nonterminal vocabulary. The derivation of a sentence stops when the string consists entirely of terminal symbols. In a simple view of phrase structure grammars, the terminal symbols would be the English words themselves, but this would result in a huge set of "singleton" rules such as:

DET -> the

(On the average there would be several such rules for each word in English). Instead, the syntactic word classes usually serve as the terminal vocabulary and the correspondence between syntactic word classes and the words themselves is taken care of by a dictionary.
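A minimal sketch of such a dictionary follows. The entries are assumptions chosen for illustration (in particular the extra verb and noun readings), and DICTIONARY and matches are hypothetical names; the point is only that one word may belong to several word classes, which is why a lookup table is more economical than singleton rules.

```python
# Hypothetical dictionary: each word maps to the set of syntactic word
# classes it can belong to. The entries are illustrative assumptions.
DICTIONARY = {
    "the": {"DET"},
    "man": {"N", "V"},   # assumed: "man" can also be used as a verb
    "dog": {"N", "V"},   # assumed: "dog" can also be used as a verb
    "bit": {"V", "N"},   # assumed: "bit" can also be used as a noun
}


def matches(sentence, pattern):
    """True if each word of the sentence can be assigned the word class
    at the corresponding position of the terminal string `pattern`."""
    words = sentence.lower().split()
    if len(words) != len(pattern):
        return False
    return all(cat in DICTIONARY.get(word, ())
               for word, cat in zip(words, pattern))


print(matches("The man bit the dog", ["DET", "N", "V", "DET", "N"]))
```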

Other Grammar Models

The kind of grammar described above is what is called a context free grammar. There are in fact many different types of grammars depending on the type of rules that are permitted and the way that they are applied. For each type of grammar, there is a corresponding class of languages which can be characterized by grammars of that type. [...] Whenever two formalisms generate the same class of languages, they are said to be generatively equivalent or equivalent in generative power, and if one formalism generates a superset of the class generated by another formalism, then that model is said to be stronger in generative power. A well-known hierarchy of successively more powerful phrase structure grammar models is known among formal language theorists as the Chomsky hierarchy. I introduce it here because I will occasionally want to refer to the various types of grammars in it.

The grammar models in the Chomsky hierarchy are known as type 0, type 1, type 2, and type 3 grammars. The context free grammar which we have just described is the type 2 grammar and is characterized by the fact that the left-hand sides of its rewrite rules consist of a single nonterminal symbol and the right-hand sides may be any nonempty string of terminal and nonterminal symbols. The type 3 grammars, also known as finite state grammars, are more restricted than the context free grammars and correspond in generative power to finite state machines. They are characterized by rewrite rules whose left-hand sides are single nonterminals and whose right-hand sides are either a single terminal symbol or a terminal symbol followed by a single nonterminal.

At the other end of the spectrum are the type 0 grammars, also known as general rewriting systems, which correspond in generative power to Turing machines. General rewriting systems are characterized by rewrite rules whose left-hand and right-hand sides can be arbitrary strings of terminal and nonterminal symbols subject only to the constraint that terminal symbols cannot be rewritten as some different terminal or nonterminal symbol. Type 1 grammars, also known as context sensitive grammars, are strictly less powerful than general rewriting systems and strictly more powerful than context free grammars. They are characterized by rewrite rules in which the left-hand sides specify not only a nonterminal symbol to be rewritten, but also a context of terminal and nonterminal symbols which must be present in order for the rule to be applied.

Figure 2 gives a summary of the types of rules for each class of grammars.

In the figure, the notation V is used to represent the union of the terminal and nonterminal vocabularies of the grammar (Vt and Vn), and the * operator is used to indicate the set of all possible strings which can be made from a given vocabulary (i.e. Vt* indicates the set of all possible terminal strings). The symbol e represents the empty string (i.e. the string with no symbols).

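TYPE 0: GENERAL REWRITING SYSTEM

    $\alpha \rightarrow \beta$, with $\alpha, \beta \in V^*$

TYPE 1: CONTEXT SENSITIVE

    $X \rightarrow \gamma \;/\; \alpha \,\_\, \beta$, with $X \in V_n$

TYPE 2: CONTEXT FREE

    $X \rightarrow \gamma$, with $X \in V_n$, $\gamma \in V^* - \{e\}$

TYPE 3: FINITE STATE

    $X \rightarrow aY$ or $X \rightarrow a$, with $X, Y \in V_n$, $a \in V_t$

Figure 2: Summary of the Chomsky Hierarchy of Phrase Structure Grammars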
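The rule shapes summarized in Figure 2 can be checked mechanically. The sketch below is an illustration under stated assumptions, not part of the original report: rules are represented as pairs of symbol sequences, NONTERMINALS is a hypothetical nonterminal vocabulary, and rule_type returns the most restrictive type whose rule shape a single rule fits. The type 1 test looks for a context split of the left-hand side of the form (context, X, context) and requires a nonempty replacement for X.

```python
# Hypothetical nonterminal vocabulary; every other symbol counts as terminal.
NONTERMINALS = {"S", "NP", "VP", "X", "Y"}


def is_nonterminal(sym):
    return sym in NONTERMINALS


def rule_type(lhs, rhs):
    """Return 3, 2, 1 or 0: the most restrictive rule shape of Figure 2
    that the rule lhs -> rhs satisfies."""
    lhs, rhs = tuple(lhs), tuple(rhs)
    if len(lhs) == 1 and is_nonterminal(lhs[0]) and rhs:
        # Type 3: X -> a  or  X -> a Y
        if len(rhs) == 1 and not is_nonterminal(rhs[0]):
            return 3
        if (len(rhs) == 2 and not is_nonterminal(rhs[0])
                and is_nonterminal(rhs[1])):
            return 3
        # Type 2: single nonterminal rewritten as any nonempty string.
        return 2
    # Type 1: lhs = alpha X beta, rhs = alpha gamma beta, gamma nonempty.
    for i, sym in enumerate(lhs):
        if not is_nonterminal(sym):
            continue
        alpha, beta = lhs[:i], lhs[i + 1:]
        if (len(rhs) > len(alpha) + len(beta)
                and rhs[:len(alpha)] == alpha
                and rhs[len(rhs) - len(beta):] == beta):
            return 1
    return 0


print(rule_type(["S"], ["NP", "VP"]))                 # context free shape
print(rule_type(["X"], ["a", "Y"]))                   # finite state shape
print(rule_type(["a", "X", "b"], ["a", "c", "d", "b"]))  # context sensitive shape
print(rule_type(["X", "Y"], ["Y"]))                   # erasing rule
```

A grammar as a whole would then be assigned the lowest number returned for any of its rules.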

Each of the grammars in the Chomsky hierarchy represents a restriction in generative power (with an attendant ease in parsing and recognition) over each of the classes of grammars with a lower number. Each higher number represents a special case of the classes with lower numbers. The principal difference between the context sensitive grammar and the general rewriting system is that the former is prohibited by the nature of its rules from erasing anything from the string as it proceeds (i.e. the right-hand sides of rules are always at least as long as the left-hand sides). This is not true for the general rewriting systems, which can be working with arbitrary amounts of intermediate "scratch" material which can be erased from a derivation without leaving a trace in the resulting string that is generated. This is what gives the general rewriting system its power, and also has the undesirable consequence that a recognition or parsing algorithm cannot be guaranteed to exist for general rewriting systems. For all of the other classes of grammars, it is possible to construct a recognizer which for an arbitrary string will say yes-or-no whether that string is generated by a given grammar. General rewriting systems are therefore not very desirable as machine models of language due to this inability to guarantee a recognition algorithm.

Derivations

For each of the type 1, 2, and 3 grammars, formal parsing algorithms can be devised which, given a grammar and a string, can answer the question whether the string is a sentence with respect to the grammar. This is done by attempting to discover a derivation of the string from the initial symbol of the grammar by means of the rewrite rules. A derivation is essentially a sequence of working strings starting with the initial symbol, each of which results from the preceding one by one application of a rewrite rule. A string is said to be generated by the grammar if there is a derivation in the grammar leading to it.

Figure 3 gives a sample derivation of the sentence in Figure 1.

SUMMARY OF DERIVATION

S => DET N V DET N

INTERMEDIATE STRINGS

S

NP VP

DET N VP

DET N V NP

DET N V DET N

Figure 3: A Sample Derivation

Notice however that there can be several distinct derivations for a single phrase structure tree corresponding to different orders of applying the rewrite rules. For example, if instead of expanding the subject noun phrase before the verb phrase one were to expand the verb phrase first, one of the derivations of Figure 4 would result. (Figure 4 compactly represents all of the possible derivations of this particular surface string, with the common initial parts of different derivations combined. Alternative choices for expanding a given string are indicated by the arrows, and individual derivations are terminated by underlining.)

S --> NP VP

NP VP --> DET N VP --> DET N V NP --> DET N V DET N
                                      -------------
NP VP --> NP V NP --> DET N V NP --> DET N V DET N
                                     -------------
          NP V NP --> NP V DET N --> DET N V DET N
                                     -------------

Figure 4: Alternative Derivations of the Sentence from Figure 3

Essentially all of the expansion that appears in the phrase structure tree could be done in any order and each different ordering would give a different derivation which corresponds to effectively the same parse. If we don't want to be swamped with alternative derivations of the same parse, then we need to include in our parsing algorithm some control strategy that will keep it from getting all of them. The typical control strategy that is used in text-based parsers (as opposed to speech) is to decide arbitrarily that the only derivations which will be considered will be those which expand at each step the leftmost nonterminal in the string. This effectively selects one canonical derivation for each possible parse tree. This makes the derivation shown in Figure 3 the canonical one, and the other two that are shown in Figure 4 are not found.
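The effect of the leftmost control strategy can be seen by enumerating derivations with and without it. The following is a small sketch under the assumption that the grammar is the three-rule grammar used earlier; RULES and derivations are names introduced only for this illustration.

```python
# Three-rule grammar from the text, with tuples so strings can be concatenated.
RULES = {"S": ("NP", "VP"), "NP": ("DET", "N"), "VP": ("V", "NP")}


def derivations(string, leftmost_only=False):
    """Yield every derivation (a list of working strings) that starts
    from `string`. With leftmost_only=True, only the leftmost
    rewritable symbol is ever expanded."""
    positions = [i for i, sym in enumerate(string) if sym in RULES]
    if not positions:
        yield [string]          # nothing left to rewrite: derivation ends
        return
    if leftmost_only:
        positions = positions[:1]
    for i in positions:
        successor = string[:i] + RULES[string[i]] + string[i + 1:]
        for rest in derivations(successor, leftmost_only):
            yield [string] + rest


all_derivations = list(derivations(("S",)))
canonical = list(derivations(("S",), leftmost_only=True))
print(len(all_derivations), "derivations in all,",
      len(canonical), "with the leftmost strategy")
```

Without the restriction the enumeration yields the three derivations of Figure 4; with it, only the derivation of Figure 3 remains.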

The Roots of Nondeterminism

The control strategy which we have just described is very simple to state in terms of a generative rule, but if one wants to use it for an analysis algorithm, it seems to suggest the following analysis strategy: as you start scanning along the string, as soon as you find a piece that matches the right-hand side of some rule, then you can collapse it into a single constituent. However, this strategy will not work in general, as we can illustrate with the grammar of Figure 5. This figure illustrates a very simple grammar for arithmetical expressions. In it, an expression (E) can be a term (T) plus a term, or can be just a single term. Likewise a term can be a factor (F) times a factor, or just a single factor, and factors can be any of the symbols a, b, or c. Figure 6 shows the structure that we would like to get as a parsing of the string "a + b * c".

E -> T + T
E -> T
T -> F * F
T -> F
F -> A, B, C

Figure 5: A Simple Grammar for Arithmetic Expressions

            E
         /  |  \
        T   +   T
        |     / | \
        F    F  *  F
        |    |     |
        A    B     C

Figure 6: A Parse Tree for "a + b * c"

The way that we have written the rules of the grammar forces on us the priority that the product comes first and then the sum. (A slightly more expanded grammar would include parentheses to enable one to express the other interpretation if that was what was intended.) Now suppose we look at this string of characters and the context free rules of Figure 5 and start doing reductions on the string wherever we could. We could reduce the a to an F and then to a T, then we'd have to go on to the + which can't reduce by itself. Then we could reduce the b to an F and then to a T and then we could reduce the T + T to a single E. After that we would reduce the c to an F and then to a T and after that we would be stuck because there is no rule which will reduce E * T to anything. The structure that we have built when we come to the impasse is shown in Figure 7.

           E
         / | \
        T  |  T          T
        |  |  |          |
        F  |  F          F
        |  |  |          |
        A  +  B    *     C

Figure 7: A Blocked Reverse Derivation from "a+b*c"

Essentially, in order to obtain the parse tree in Figure 6, it is necessary not to go ahead and reduce the second F to a T. Instead we must postpone that until reducing the c to an F and reducing the F * F to a single T, after which the T + T can be reduced to a single E.
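The blocked analysis can be reproduced with a short sketch of the "reduce as soon as anything matches" strategy. The code below is an illustration, not a description of any particular parser: the stack-based formulation, the order in which rules are tried, and the decision to apply E -> T only when a lone T remains at the end of the input are assumptions made to mirror the sequence of reductions described in the text.

```python
# Rules of Figure 5 (lower-case a, b, c as in the input string).
# E -> T is handled separately at the end of the input (an assumption
# made so that the trace follows the text).
RULES = [
    ("E", ("T", "+", "T")),
    ("T", ("F", "*", "F")),
    ("T", ("F",)),
    ("F", ("a",)), ("F", ("b",)), ("F", ("c",)),
]


def greedy_reduce(tokens):
    """Scan left to right, reducing the end of the stack whenever it
    matches the right-hand side of some rule. Never undoes a choice."""
    stack = []
    for tok in tokens:
        stack.append(tok)
        reduced = True
        while reduced:
            reduced = False
            for lhs, rhs in RULES:
                n = len(rhs)
                if tuple(stack[-n:]) == rhs:
                    stack[-n:] = [lhs]
                    reduced = True
                    break
    if stack == ["T"]:
        stack = ["E"]
    return stack


print(greedy_reduce(list("a+b")))     # a single E: the analysis succeeds
print(greedy_reduce(list("a+b*c")))   # E * T: the blocked configuration
```

The second call ends with the symbols E, *, and T on the stack, which is the impasse of Figure 7: no rule reduces E * T.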

Nondeterministic Algorithms

There are many applications in computer science, and especially in artificial intelligence and language processing, where systematic search in a space of alternative possible choices is required. A conceptual device for devising algorithms for such tasks is the notion of a nondeterministic algorithm or nondeterministic machine. By this, we do not refer to an algorithm whose behavior is unpredictable, but rather to an abstract algorithm in which there is a primitive choice operation which can make one of several choices. This algorithm is then simulated on a real machine by systematically considering all possible sequences of alternative choices of the abstract nondeterministic algorithm in turn. The nondeterministic machine is a conceptual device to enable the writer of a grammar or other such search algorithm to think of the machine as if it were magically making the right choices, freeing him from explicitly keeping track of the alternative choices and cycling through them. One says that a string is accepted by a nondeterministic algorithm if any of the alternative computation paths lead to a successful analysis.

The first fundamental idea that I would like you to remember is this notion of a nondeterministic algorithm as a device for coping with this type of search in a space of alternative possibilities.

Backtracking vs Parallel Search

There are two principal ways of writing simulators for nondeterministic programs -- one is called backtracking and its effect is that whenever the program is about to make a choice, it saves somewhere (usually on a pushdown stack) all of the information that is about to be destroyed by the choice so that the simulator can come back later, undo it, and try another choice. The program then parses like a deterministic parser until it encounters a blocked configuration such as the one in Figure 7, at which point it undoes the last choice made and tries the next possible alternative. If there is no other alternate choice, then it undoes the next to last choice, and so on until all possible choice sequences have been considered. Floyd (1967) gives an efficient general technique for implementing backtracking simulators for nondeterministic algorithms. In the case of Figure 7, the result of backtracking would be to undo the last reduction of F to T. Finding nothing else to do, the parser then would undo the reduction of c to F, then undo the reduction of T + T to E, and eventually would back up to the point where the b had been reduced to F, but that F had not been reduced to T. The parser could then go on to reduce the c to an F (a second time -- this was done before on the blocked path) and then reduce the F * F to T, which puts us on the right path for the correct analysis.

Backtracking algorithms do their search by systematically working on one path of the nondeterministic algorithm, saving enough to undo it later. The systematic way in which it walks through the space of possible choices is called "depth first". That is, after making one choice, it proceeds to make a choice that depends on that one, and another that depends on that, and so on, building up a stack of other untried alternatives at all "depths". It undoes choices only when it encounters a blocked configuration, it undoes only the most recently made choice, and it tries all possible choices at that "depth" before backing up to the next previous level on the stack. If the space of alternative choice sequences were laid out as a tree, the backtracking search would correspond to a left-first tree walk.
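A depth first backtracking simulation of this kind can be sketched compactly by letting the recursion stack of the host language play the role of the pushdown stack of saved choices. This is a minimal sketch under stated assumptions, not Floyd's technique: the shift and reduce formulation, the order in which alternatives are tried (reductions before shifting), and the names choices and parse are all introduced here for illustration.

```python
# Grammar of Figure 5, with lower-case a, b, c as in the input string.
RULES = [
    ("E", ("T", "+", "T")),
    ("E", ("T",)),
    ("T", ("F", "*", "F")),
    ("T", ("F",)),
    ("F", ("a",)), ("F", ("b",)), ("F", ("c",)),
]


def choices(stack, rest):
    """Enumerate the alternatives open in a configuration: every
    reduction that matches the end of the stack, then a shift."""
    for lhs, rhs in RULES:
        n = len(rhs)
        if tuple(stack[-n:]) == rhs:
            yield ("reduce %s -> %s" % (lhs, " ".join(rhs)),
                   stack[:-n] + [lhs], rest)
    if rest:
        yield ("shift " + rest[0], stack + [rest[0]], rest[1:])


def parse(stack, rest, trace=()):
    """Depth first search: follow one choice as far as possible and
    return to the most recent untried alternative when blocked."""
    if stack == ["E"] and not rest:
        return list(trace)
    for action, new_stack, new_rest in choices(stack, rest):
        result = parse(new_stack, new_rest, trace + (action,))
        if result is not None:
            return result
    return None  # blocked configuration: the caller tries its next choice


result = parse([], list("a+b*c"))
for action in result:
    print(action)
```

Because each configuration is built as a fresh list, "undoing" a choice here is simply returning from a recursive call; a parser that modifies one shared stack in place would have to save and restore the overwritten information explicitly, as described above.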

Another way of handling nondeterminism is by what I call independent alternatives. In such a program, every time that you are about to make a choice, you create an object for each of the possible choices. This object corresponds to a state or configuration of the hypothetical nondeterministic machine which you are simulating. In a real machine, a configuration is basically the contents of the program counter and the register contents; in the simulation of a nondeterministic machine there are many such configurations instead of just one. (This is similar to what goes on in a time sharing system.) For a nondeterministic finite state machine, the configuration is basically the state that you are in and the point in the input string that you have gotten to. In programming a system for handling independent alternatives, every time that you come to a choice point, you make up as many configurations as there are alternative choices, and you are now free to work on those configurations all in parallel (or "breadth first") or you can jump around from one to another (working on the ones that seem most likely of success before working on others for example). With multiple independent alternatives, you can pick up a configuration, determine its state, look where it is in the input, compute the next configurations which it could get to, and then go on to another configuration (which may or may not be one of the ones you just created). Just like a time-sharing system, you can run a lot of these configurations in pseudo parallel with varying priorities for service.
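For the finite state case, the independent alternatives scheme can be sketched as an agenda of (state, input position) configurations. The transition table below is a made-up example over word classes, and TRANSITIONS, FINAL and accepts are hypothetical names; the agenda is served first-in first-out, which gives the "breadth first" behaviour, but any other service order (for instance a priority queue) would fit the same loop.

```python
from collections import deque

# Hypothetical nondeterministic finite state machine over word classes.
# From q1, a noun may either end the noun phrase (q2) or be followed by
# another noun (stay in q1): this is the nondeterministic choice point.
TRANSITIONS = {
    "q0": {"DET": {"q1"}},
    "q1": {"N": {"q1", "q2"}},
    "q2": {"V": {"q3"}},
    "q3": {"DET": {"q4"}},
    "q4": {"N": {"q5"}},
}
FINAL = {"q5"}


def accepts(symbols, start="q0"):
    """Simulate the machine by keeping every configuration
    (state, position in the input) as an independent object."""
    agenda = deque([(start, 0)])
    seen = set()
    while agenda:
        state, pos = agenda.popleft()
        if (state, pos) in seen:
            continue
        seen.add((state, pos))
        if pos == len(symbols):
            if state in FINAL:
                return True
            continue
        # One new configuration per alternative choice.
        for nxt in TRANSITIONS.get(state, {}).get(symbols[pos], ()):
            agenda.append((nxt, pos + 1))
    return False


print(accepts("DET N V DET N".split()))
print(accepts("DET N N V DET N".split()))
print(accepts("DET V".split()))
```

Nothing is ever undone in this scheme: a configuration that leads nowhere is simply dropped from the agenda while the others continue.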

There is a tremendous advantage for speech understanding programs in implementing nondeterministic parsing algorithms as independent alternatives rather than backtracking. With independent alternatives, when one comes to a position where it is difficult to decide which of several alternatives is the best to follow, it is possible to follow several parsings in parallel, and to jump from one to another depending on which looks better at any given time. In the backtracking approach, once you have started down a long path into barren territory, you have to walk all the way back to the place where the next choice is in order to go back and consider an alternative [...]. For the example in Figure 7, there were two or three things that had to be undone before getting back to the right alternative choice that had to be made. In more complicated examples, the space that is wasted or the uninteresting parts of the space searched before one can get back to the correct place to make an alternative choice can be astronomical.

I will make a pitch then for a second fundamental idea which you should know about, namely this difference between systematic backtracking and the following of multiple independent alternatives.


Bottom Up, Top Down, Predictive, and Nonpredictive Parsing

The algorithm that we described above for finding a derivation of a given string by reversing the generative rules of the grammar is an algorithm that is referred to as "bottom-up". That is, we look into the input string or the current working string until we find something that matches the right-hand side of some grammar rule, and then reduce that matching portion by replacing it with the left-hand side of the rule. (I'm assuming a context free grammar here for simplicity.) We apply this process over and over again until we finally reduce the entire string to a single symbol. (At least the goal we are trying to achieve is such a reduction of the string into a single symbol.) Notice that in the statement we have just made, we have not specifically mentioned the systematic consideration of each of the possible rules that could have applied at each step and the different positions in the working string where rules could have been applied. It is exactly this freedom from consideration of detail that is achieved by thinking of the process as a nondeterministic algorithm. Of course the details need to be considered eventually in order to make the algorithm function on a real machine, but these considerations can be made separately and they can be done once and for all for a parsing system and not have to be redone separately for each grammar or version of a grammar which is written.
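The bottom-up procedure can be made concrete with a small sketch. This is only an illustration under stated assumptions: the toy grammar, the rule table `RULES`, and the function names are hypothetical and are not taken from the report. The nondeterministic choice of rule and position is simulated by keeping an agenda of every working string reachable by some reduction.

```python
# Toy grammar (an assumption for illustration, not the report's grammar).
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("DET", "N")),
    ("VP", ("V", "NP")),
    ("DET", ("the",)),
    ("N", ("man",)),
    ("N", ("dog",)),
    ("V", ("bit",)),
]


def reductions(form):
    """Yield every working string obtainable from `form` by one reduction."""
    for lhs, rhs in RULES:
        n = len(rhs)
        for i in range(len(form) - n + 1):
            if form[i:i + n] == rhs:
                # Replace the matched right-hand side by the left-hand side.
                yield form[:i] + (lhs,) + form[i + n:]


def bottom_up_accepts(words, goal="S"):
    """Simulate the nondeterministic reducer by exhaustive search."""
    start = tuple(words)
    seen = {start}
    agenda = [start]
    while agenda:
        form = agenda.pop()
        if form == (goal,):
            return True
        for nxt in reductions(form):
            if nxt not in seen:  # do not revisit a working string
                seen.add(nxt)
                agenda.append(nxt)
    return False


print(bottom_up_accepts("the man bit the dog".split()))
print(bottom_up_accepts("bit the dog".split()))
```

The search terminates here because no rule in the toy grammar lengthens the working string and the unary rules form no cycle; a grammar without those properties would need a different control strategy.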

There is another kind of parsing algorithm at the other extreme which is called "top-down". It gets this name because it starts by expanding the grammar rules "from the top" and only looks for comparison at the input string when a terminal symbol appears in the expansion. A simple version of a top-down parser makes use of a pushdown store into which the initial symbol of the grammar is placed before parsing begins. Subsequently the algorithm proceeds as follows: If the topmost symbol on the stack is a nonterminal, then some rule of the grammar with that nonterminal as its left-hand side is selected (another nondeterministic choice) and the topmost symbol of the pushdown stack is replaced with symbols from the right-hand side of the rule (so the leftmost symbol of the right-hand side is now the topmost symbol of the stack). If the topmost symbol of the stack is a terminal symbol, then it is compared with the next unused symbol of the input string. If they are the same then the topmost symbol of the stack is removed and the string is advanced. If they do not match then this configuration is blocked -- i.e. this path of the nondeterministic search is terminated. The string is accepted if the pushdown stack becomes empty at the same time that the last symbol of the input string is used. (Note again our use of the nondeterministic algorithm to simplify the explanation. In an actual parsing algorithm,


all possible choices of expanding the topmost nonterminal of the stack are pursued and the string is accepted if any of the alternative computation paths leads to the accepting criterion.) An example of a top-down analysis using a pushdown store is shown in Figure 8. (Here the rectangular enclosure represents the pushdown store, the arrows the steps in the analysis, and the plus sign indicates the consumption of a symbol from the input string by a given stack configuration.)

Figure 8: A Sample Top-down Predictive Analysis using a Pushdown Store (stack configurations starting from NP VP, consuming THE, MAN, BIT, THE, DOG in turn, and ending in ACCEPT SENTENCE)
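The pushdown-store procedure just described can be sketched as follows. This is a minimal illustration, not the report's implementation: the grammar `GRAMMAR` and the function name are assumptions chosen to match the symbols visible in the figure (NP, VP, DET, N, V). Each configuration is a pair of stack and input position, and the nondeterministic choice is simulated by keeping all configurations on an agenda.

```python
# Hypothetical toy grammar: each nonterminal maps to its alternative expansions.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("DET", "N")],
    "VP": [("V", "NP")],
    "DET": [("the",)],
    "N": [("man",), ("dog",)],
    "V": [("bit",)],
}


def top_down_accepts(words, start="S"):
    """Top-down recognition with a pushdown store, all choices pursued."""
    words = tuple(words)
    # A configuration is (stack, position); stack[0] is the topmost symbol.
    agenda = [((start,), 0)]
    while agenda:
        stack, pos = agenda.pop()
        if not stack:
            if pos == len(words):
                return True       # stack empty exactly when the input is used up
            continue              # blocked: input left over
        top, rest = stack[0], stack[1:]
        if top in GRAMMAR:
            # Nonterminal on top: one new configuration per applicable rule.
            for rhs in GRAMMAR[top]:
                agenda.append((rhs + rest, pos))
        elif pos < len(words) and words[pos] == top:
            # Terminal on top matches the next input symbol: consume it.
            agenda.append((rest, pos + 1))
        # Otherwise the configuration is blocked and simply dropped.
    return False


print(top_down_accepts("the man bit the dog".split()))
print(top_down_accepts("the man bit".split()))
```

The sketch assumes a grammar with no left recursion; with a left-recursive rule the agenda would grow without consuming input, which is the looping problem that the standard-form grammar of the next section avoids.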

The Harvard Predictive Analyzer

The original Harvard Predictive Analyzer (Kuno and Oettinger, 1963) does a slightly more optimized version of the top-down technique just described. It works with a grammar which has been transformed so that all of its rules have a terminal symbol as the first symbol of their right-hand sides. Thus at every step of the pushdown store analysis the algorithm consumes a symbol from the input string, and the number of steps in a given computation path of the nondeterministic machine is at most n, where n is the length of the input string. (Of course the number of steps


for the real computer in simulating the nondeterministic algorithm is much greater since it has to follow out all possible alternative computation paths.) An additional advantage of the special form of the context free rules used by the predictive analyzer (known as Greibach normal form, or standard form) is that it eliminates the possibility of infinite loops due to the symbol on top of the pushdown stack expanding into a string which eventually results in a new instance of the same symbol on top of the stack without advancing the input string. An algorithm due to Greibach (Greibach, 1967) which converts an arbitrary context free grammar into a standard form grammar finds and eliminates the possibility of such "left-recursion".

Predictive vs Nonpredictive Parsing

There has been a great deal of discussion in the parsing literature about the differences between top-down and bottom-up algorithms. An example is a paper by Griffiths and Petrick (1965) which characterizes several varieties of each type. However, in recent years there have been a number of parsing algorithms developed which don't fit into either of these broad categories, and I think that the classical distinction between top-down and bottom-up is becoming very fuzzy. The distinction which I think is more important -- a distinction which is correlated with the top-down bottom-up distinction for the two simple algorithms presented -- is the distinction between predictive and nonpredictive parsing. A predictive parser is one that will only look at a given point in the input string for things of a sort that it expects to see there, whereas a nonpredictive parser will find a given construction only as a function of the constituents which make it up, irrespective of whether such a constituent is compatible with an analysis of the symbols on either side of it in the input string. For example, an inherent feature of the top-down pushdown store algorithm which I presented above is that at each point in the analysis there exists on the stack a prediction of the types of phrases which are expected to occur to the right of the current point in the input string. As the algorithm operates, only those constituents will be looked for. Contrast this with the situation in the simple bottom-up algorithm. There, if the terminal symbols could be grouped together to form some constituent, then that alternative would be tried regardless of whether there is an analysis of the symbols to the left with which this constituent could combine.

The predictive parsing technique has an advantage for most parsing applications since it considerably reduces the number of applications of rules that have to be considered and the number of "accidental" constituents that are found (i.e. sequences of words that could make up a constituent in some other context but which are not a constituent of any complete analysis of the current string). For example, in a predictive analysis of "the man bit the dog", using the grammar of Figure 1, the parser looks for a noun phrase at the beginning of the sentence because the grammar says that sentences can begin with noun phrases. However, once it has found the subject noun phrase, it doesn't try to look for a noun phrase at the place that starts with "bit" because there is no grammar rule which would use a noun phrase at that point. In the bottom-up approach, all rules are attempted everywhere since there is no prediction. Not only does this result in more rules that have to be tried, but it also results in more spurious matches that don't lead to correct parsings.

[...] for parsing text [...] there is a great advantage to using the predictive algorithm because it follows fewer blind alleys. On the other hand, there is a problem in continuous speech understanding which reduces its advantage. In continuous speech understanding, there is a fairly high probability that your guess for the words at any given point in the string may be wrong. This is especially true for the first and last words of the sentence due to phonological effects at the beginnings and ends of utterances. If your guess of the first word is wrong, then all of your predictions later will be affected by it, and if it induced you to only look for those things that will be consistent with that wrong word, then you will never recover the right parse. A nonpredictive parser that goes up and down the string doing everything it can has a better chance of recovering from such errors. Specifically, it stands a better chance of finding most of the correct parse in spite of a wrong or missing word. It can then provide this information as a source for prediction as to what the missing word might be or what kind of word is required in a given region [...].

Another point then that I would like to make is this tradeoff between predictive and nonpredictive parsing algorithms for speech understanding. I don't want to make a strong case that one or the other is better; I want to give a feeling for what the tradeoffs are between the two algorithms. The predictive one will do a more selective search, and if one is confident that the things on which it is basing its predictions are right, then it is preferable. On the other hand, if there's a high chance that they're wrong, then the disadvantage is that the prediction may keep you from finding enough of the correct parse to be a useful source of information for error correction.


Well-formed Substring Tables

One thing that was found very early in the development of parsing algorithms, especially with the enumerative, top-down, predictive algorithms is that when alternative computation paths are done separately, duplicate work is done on the separate paths. For example, if two possible ways of analyzing the beginning of a sentence cause the analysis to split up into two different computations, the entire remaining analysis will be done twice, even though it may be the same in both cases. The "well-formed substring table" is a mechanism for saving the results of the analysis of a constituent on one path of a nondeterministic computation so that they can be used on other paths without redoing the computation. Whenever in the course of an analysis, a complete constituent is found, it is recorded in a table indexed by the type of constituent and the position where it begins. Whenever the algorithm is about to predict a constituent of a given type at a given position, it consults the well-formed substring table to see if such a constituent has already been found, and if so, then the results are used without recomputation.
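A well-formed substring table can be sketched as a dictionary keyed by constituent type and starting position, consulted before any prediction is expanded. The grammar, the `stats` counters, and the function names below are assumptions made for illustration; the sketch also assumes a grammar without left recursion.

```python
# Hypothetical grammar in which two VP rules share a common prefix (V NP).
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("DET", "N")],
    "VP": [("V", "NP"), ("V", "NP", "PP")],
    "PP": [("P", "NP")],
    "DET": [("the",)],
    "N": [("man",), ("dog",), ("park",)],
    "V": [("bit",)],
    "P": [("in",)],
}


def make_recognizer(words):
    words = tuple(words)
    table = {}  # (constituent type, start position) -> set of end positions
    stats = {"computed": 0, "reused": 0}

    def ends(cat, start):
        """Return every position at which a `cat` starting at `start` can end."""
        key = (cat, start)
        if key in table:              # already found on some other path
            stats["reused"] += 1
            return table[key]
        stats["computed"] += 1
        found = set()
        if cat not in GRAMMAR:        # terminal symbol: compare with the input
            if start < len(words) and words[start] == cat:
                found.add(start + 1)
        else:
            for rhs in GRAMMAR[cat]:
                positions = {start}
                for sym in rhs:
                    positions = {e for p in positions for e in ends(sym, p)}
                found |= positions
        table[key] = frozenset(found)
        return table[key]

    return ends, table, stats


words = "the man bit the dog".split()
ends, table, stats = make_recognizer(words)
print(len(words) in ends("S", 0))   # whether the whole string is an S
print(stats)                         # how often the table saved recomputation
```

In this example the second VP rule asks again for a V at position 2 and an NP at position 3, and both requests are answered from the table rather than reanalyzed.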

Table Oriented Parsing Algorithms

The use of the well-formed substring table is sufficiently useful that some parsing algorithms have been designed exclusively around that notion. Their central purpose is to fill in this table with entries saying there is a constituent of type x from position y to position z in the input. Their acceptance criterion for a string is finding in the table an entry indicating a constituent of the type of the initial symbol from the beginning to the end of the input. For example, an algorithm due to Younger (1966) fills in the entries in order of length of the resulting constituent (and [...] grammar rules whose right-hand sides consist of a


of length 2 and 1 will already have been made and any questions about the existence of such constituents can be answered by merely consulting the table. The constituents of length 1 are found by matching singleton terminal rules against the input string. When such an algorithm terminates, if there is an entry for the initial symbol from the beginning to the end of the input string, then the string is accepted by the parser, otherwise it is rejected.
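A table-filling recognizer in this style can be sketched as follows. The sketch assumes, as an illustration only, a grammar whose rules have either two nonterminals or a single terminal on the right-hand side; the rule tables `BINARY` and `LEXICAL` and the function names are hypothetical.

```python
# Hypothetical grammar: binary rules (lhs, left, right) and singleton terminal rules.
BINARY = [("S", "NP", "VP"), ("NP", "DET", "N"), ("VP", "V", "NP")]
LEXICAL = {"the": {"DET"}, "man": {"N"}, "dog": {"N"}, "bit": {"V"}}


def fill_table(words):
    """Fill the table with entries (type, start, end) in order of length."""
    n = len(words)
    table = set()
    # Constituents of length 1: match singleton terminal rules against the input.
    for i, word in enumerate(words):
        for cat in LEXICAL.get(word, ()):
            table.add((cat, i, i + 1))
    # Longer constituents: all shorter entries are already in the table.
    for length in range(2, n + 1):
        for start in range(n - length + 1):
            end = start + length
            for mid in range(start + 1, end):
                for lhs, left, right in BINARY:
                    if (left, start, mid) in table and (right, mid, end) in table:
                        table.add((lhs, start, end))
    return table


def accepts(words, initial="S"):
    """Accept if the initial symbol spans the input from beginning to end."""
    return (initial, 0, len(words)) in fill_table(words)


print(accepts("the man bit the dog".split()))
print(sorted(fill_table("the man bit the dog".split())))
```

Because entries are added strictly in order of length, every question about a shorter constituent is answered by a set lookup, which is the property the text attributes to this family of algorithms.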

Eliminating Redundancy

not ord can at

ell it

sin

rnis ana ide

to

m e n req a p

tna anu

tne

thi

una

res tna

req

sam

in to

er of

rel an

i c i e n

nas a ce on

neara lysis

ntify

the t i o n e

uires art ic t na

11 a

n it IliC t

erstd tr ic t t iao

u ire e par

tne

do 1 i

y o ear cy

di

e 0 or

f

th 1 e

a e t

ula

s

11

wi 1 ner

mil ion st

a a se

au

a 1 i i i

n a 1 ie au

sad

I t

S row

e g ft-

e,rl ne r c

to or

1 b

el o

n K

te ii i

ove

o ve

ot

ng ny r

van van

he ar b

b ar b fir

i er inc

ano

tne

e v re-, to

Th xt

ere r a

typ

ol e in t

an s w po in tage

tage er it

led eing

led

s t c , an

i via nica fou

sub ery

tn

tr is i par

nt s

nd o

e xc

nt er

t

i ic

a

wo

an a

ua 1 na

se u i

at

S i

0 i

ve

01 ess ta tn i n to

or

al nd 1 ou ra )

or. i to

I s oru

f i que

II i

to a i

ng ut i

r a

ai

iv

bl at

r

sp

el t

nd

oa ar,

r. e er

rs ML

cu it

Uli

sy on ga

gon e co e be la

tne ora 1 eecn em en nero

( K

i hi I ae

y ot ps i

1 t in pro

it t

i

re ia

u a m e stem to

in i

trim , 1

in p u t a t

used .

needed 3 t q U e

nary t under

ts ear by Ke men s same ri vat i ner pa n an a 1 one some

cess in o r e c o s im

x s o in nta 1 a s oper tne p r

n dill

t is

ions

Thi nav i

nee, ex t p

s t a n u

iy in ep t

could di sa

on of

r s i n g

na 1 ys

of tn sucn

g 13

ver f

por La e oi

epar t ate a o o i e m

e r e n t

cr 111

tnat s is ng be

Thi a r s 1 n

ing a tne

rife r

be d van t

a pa u e c n

is to e eri

o r a e r r e p e n

r o m t

nt

the ure i na it

ot

v ay s

cai 1

a par so tn en p u s na

g. ti pplic c n a i n est usea

age

rse w

n 1 q u e

be f ticai

ing i dent

ne er lor

se o

r o m t is g

final

n o r a e r t icular

at one t tnere s many

owever,

a 11 o n s ,

may be

of the

to nelp applies

hich we w n 1 c h

o u n d in things

s wrong on it ,

r o r , 1

s p e e c n

r d e r i n g

ne way oing to n g tne

In many cases, it may be important to be able to jump over and find the object noun phrase and then the verb phrase when you haven't found the subject yet. For example, in those cases where the subject wasn't findable because of a garbled word, a well understood verb phrase could be used to predict what kind of subject ought to be there. However, in other cases when you have found the subject first on one path, a computation path which finds the verb phrase and then comes back and works on the subject will find the same parsing over again. The solution that we have been using in the BBN system (Woods, 1974) -- the solution which I think has to be used -- is to put in appropriate checks at various choice points to ask whether the thing that is about to be produced has been found already on some other path and avoid creating a duplicate. When this is done at the level of


noun phrases, embedded clauses, etc., it tends to block the redundant generation of larger constituents before the duplication becomes unmanageable. It still carries with it the cost of the additional checking, but I think that this cost is essential in order to cope with the errors that will occur in speech.

Lexical Ambiguity

I've mentioned a number of things which make the parsing problem for speech understanding more difficult than traditional text parsing. Another difficulty is the ambiguity of word identification in the input sequence of sounds. The major source of lexical ambiguity in text parsing is the possibility of multiple syntactic categories for a given word. In a classical example of sentential ambiguity, "Time flies like an arrow," the word "time" has three possible syntactic categories (noun, verb, or adjective), "flies" can either be a verb or a noun, and "like" can either be a preposition or a verb. If we think of a parser receiving a sequence of these kinds of categories as input, there would be 3x2x2=12 strings of syntactic categories that you could get for this sentence. If you had to put each such sequence through the parser separately (apparently some early parsers did exactly that) you would be doing twelve separate parsings. Imagine what would happen with a sentence of say 20 words with an average ambiguity of 2 categories per word; you would have over 1,000,000 different possible such sequences. In speech understanding, this basic ambiguity is magnified by the inability to unambiguously determine the segmentation of speech sounds into word sequences. Clearly one doesn't want to run a parser on a separate enumeration of each possible sequence of syntactic categories.
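The counting argument can be checked with a few lines of code. The dictionary `CATEGORIES` is an assumption for illustration: the categories for "time", "flies", and "like" follow the text, while single categories for "an" and "arrow" are assumed.

```python
from itertools import product

# Categories per word; "an" and "arrow" are assumed to be unambiguous.
CATEGORIES = {
    "time": ["N", "V", "ADJ"],
    "flies": ["V", "N"],
    "like": ["PREP", "V"],
    "an": ["DET"],
    "arrow": ["N"],
}


def category_strings(words):
    """Enumerate every sequence of syntactic categories for the words."""
    return list(product(*(CATEGORIES[w] for w in words)))


strings = category_strings("time flies like an arrow".split())
print(len(strings))   # 3 * 2 * 2 * 1 * 1 = 12
print(2 ** 20)        # 20 words at 2 categories each: 1048576
```

Enumerating the sequences separately is exactly what the lattice representation of the next section avoids.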

Word Lattices

A technique that has been very effective for dealing with lexical ambiguity has been the use of a lattice of input symbols rather than a single string. A simple example of such a structure is illustrated in Figure 9.


Figure 9: A Sample Word Lattice (alternative categories over the words TIME: N, V, ADJ; FLIES: N, V; LIKE: PREP, V; AN: DET; ARROW: N)

Such a lattice compactly represents all of the possible alternative sequences of input symbols with the common parts of different sequences factored together so that processing on them needs to be done only once. With such an input, grammar rules are matched the same as before, except that as a rule is matched against the input, particular paths are selected through the word lattice which satisfy the match. This technique has a tremendous benefit in terms of the amount of computation required for parsing. When a particular rule is matched at a given point in the word lattice, all of the possible sequences of words in which the matching sequence occurs are effectively factored together so that the result of the reduction is effectively performed just once for an entire equivalence class of word sequences. This technique is very attractive for speech understanding because the possible alternative segmentations of the input signal into words lead to a lattice structure similar to that illustrated in Figure 9 (although of slightly more varied structure). Whereas the structure in Figure 9 is nothing more than a sequence of alternative syntactic categories, the structures for word lattices in speech understanding tend to have much more branching, and the individual branches leaving a given point do not all come together again at the same point. However, the same parsing algorithm runs on this more generalized input lattice and saves a tremendous amount of processing by avoiding the multiplication of combinatorial possibilities.


Chart Parsers

The concept of a word lattice for the input symbols and the well-formed substring table for parsing are closely related, and can be combined into a single data structure in a parsing algorithm. The structure of the well-formed substring table is essentially the same as that of the word lattice, extended with the constituents that can be found in any analysis of any path in the initial lattice. An example of such a lattice for the sentence "Time flies like an arrow" is shown in Figure 10.


Figure 10: An Example of a Well-formed Substring Lattice or Chart (segments labeled N, V, ADJ, PREP, DET, NP, PP, VP, and S over the words TIME FLIES LIKE AN ARROW)

Each labeled horizontal line in the figure between vertical strokes represents a segment added to the chart as a result of the application of some rule (or one of the initial entries in the word lattice). Both of these parsing algorithms (Kay's and Cocke's) select a particular order for walking the chart and adding new segments as a result of matching rules against the segments already in the chart, and both produce a very nice recognition algorithm that keeps a great deal of the common parts of different analyses merged together. The principal difference between Kay's parser and Cocke's is Kay's generalization of the method to handle general rewriting systems and an approximation to transformational grammars. For strictly context free grammars, both algorithms are effectively the same. In this paper, I will call all such parsers (both Cocke's and Kay's) and their derivatives "chart parsers". In particular, the usual implementation of the classical nonpredictive bottom-up parsing algorithm is a chart parser.
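A nonpredictive chart recognizer over a word lattice can be sketched in a few lines. Everything named here is an assumption for illustration: the rule list `RULES`, the lattice for "Time flies like an arrow", and the simple repeat-until-nothing-changes control, which is not the walking order used by Kay or Cocke.

```python
# Hypothetical grammar for the example sentence.
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("N",)),
    ("NP", ("ADJ", "N")),
    ("NP", ("DET", "N")),
    ("VP", ("V",)),
    ("VP", ("V", "NP")),
    ("VP", ("V", "PP")),
    ("PP", ("PREP", "NP")),
]

# Word lattice: segments (category, left position, right position).
LATTICE = [
    ("N", 0, 1), ("V", 0, 1), ("ADJ", 0, 1),   # TIME
    ("N", 1, 2), ("V", 1, 2),                  # FLIES
    ("PREP", 2, 3), ("V", 2, 3),               # LIKE
    ("DET", 3, 4),                             # AN
    ("N", 4, 5),                               # ARROW
]


def match_from(rhs, start, chart):
    """Return the end positions of all chart paths from `start` spelling `rhs`."""
    ends = {start}
    for sym in rhs:
        ends = {e for (cat, s, e) in chart if cat == sym and s in ends}
    return ends


def build_chart(lattice):
    """Add segments bottom-up until no rule produces anything new."""
    chart = set(lattice)
    changed = True
    while changed:
        changed = False
        starts = {s for (_, s, _) in chart}
        for lhs, rhs in RULES:
            for start in starts:
                for end in match_from(rhs, start, chart):
                    segment = (lhs, start, end)
                    if segment not in chart:
                        chart.add(segment)
                        changed = True
    return chart


chart = build_chart(LATTICE)
print(("S", 0, 5) in chart)   # the whole input is recognized as a sentence
print(("S", 0, 2) in chart)   # "time flies" alone also forms an S segment
```

Each segment is stored once no matter how many word sequences pass through it, which is the factoring the text describes. The segment spanning only the first two words is an example of a constituent found without regard to whether it fits into a complete analysis.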


Parsing versus Recognition

In order to be called a parser, an algorithm must not only calculate whether a string is accepted or not, as does a recognizer, but it must also keep a record of the derivation and provide one or more structural analyses of the sentence. In my description of most of the parsing algorithms so far, I have glossed over this distinction and only the recognition aspects have been discussed. In order to be a parser, an algorithm must keep track of and report what constituents were used as pieces of what higher constituents. This can be done conveniently for a chart parser by annotating each of the segments of the chart with a list of the constituents which formed it -- that is, by a list of the segments which were combined by some rule to produce the annotated segment. In general, there can be several ways to form a given segment from different sequences of constituents so the annotation must provide for several such constituent lists in order to represent all possible analyses.
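The annotation scheme can be sketched by letting the chart map each segment to a list of constituent lists. The grammar, the lattice, and all function names below are assumptions for illustration, and the fixed-point control loop is a simplification rather than any published chart-walking order.

```python
from math import prod

# Hypothetical grammar and word lattice for "Time flies like an arrow".
RULES = [
    ("S", ("NP", "VP")), ("NP", ("N",)), ("NP", ("ADJ", "N")),
    ("NP", ("DET", "N")), ("VP", ("V",)), ("VP", ("V", "NP")),
    ("VP", ("V", "PP")), ("PP", ("PREP", "NP")),
]
LATTICE = [
    ("N", 0, 1), ("V", 0, 1), ("ADJ", 0, 1), ("N", 1, 2), ("V", 1, 2),
    ("PREP", 2, 3), ("V", 2, 3), ("DET", 3, 4), ("N", 4, 5),
]


def paths(rhs, chart):
    """Return every contiguous sequence of chart segments spelling out `rhs`."""
    partial = [()]
    for sym in rhs:
        partial = [p + (seg,) for p in partial for seg in chart
                   if seg[0] == sym and (not p or p[-1][2] == seg[1])]
    return partial


def build_annotated_chart(lattice, rules=RULES):
    """Map each segment to the list of constituent lists that formed it."""
    chart = {seg: [] for seg in lattice}   # initial entries have no constituents
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            for path in paths(rhs, chart):
                segment = (lhs, path[0][1], path[-1][2])
                lists = chart.setdefault(segment, [])
                if path not in lists:      # a new way of forming this segment
                    lists.append(path)
                    changed = True
    return chart


def count_analyses(segment, chart):
    """Count the distinct structural analyses recorded for a segment."""
    lists = chart[segment]
    if not lists:
        return 1
    return sum(prod(count_analyses(c, chart) for c in path) for path in lists)


chart = build_annotated_chart(LATTICE)
print(chart[("S", 0, 5)])                  # two alternative constituent lists
print(count_analyses(("S", 0, 5), chart))
```

Under this assumed grammar the sentence segment carries two constituent lists, one with "time" as subject and one with "time flies" as subject, and each list points at segments shared with other analyses.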

Both Cocke's algorithm and Kay's are bottom-up, nonpredictive algorithms and share with other such algorithms the property of finding many accidental constituents that do not form part of any complete analysis. Such accidental constituents clutter up the chart. Figure 11 shows a chart for our example in which all such accidental segments have been removed. Such a "cleaned up" chart together with its constituent pointers provides a very compact representation of all of the possible alternative analyses of the input with the common parts of the different analyses merged together. Figure 12 shows the chart of Figure 11 with constituent pointers added for one particular parsing of the input, and Figure 13 shows the same chart with all constituent pointers included.


Figure 11: A Chart with Accidental Constituents Removed (over the words TIME FLIES LIKE AN ARROW)

Figure 12: A Chart Showing Constituent Pointers for One Parsing (over the words TIME FLIES LIKE AN ARROW)

Figure 13: A Chart Showing All Constituent Pointers for Two Parsings (over the words TIME FLIES LIKE AN ARROW)

In more typical cases, one cannot draw as nice a picture of the chart as in our example (or in particular it may not be a planar graph), but inside a computer, a table of positions, each with a set of associated segments (indicated by the constituent type and the position where the segment ends) suffices to handle the most general case for a recognizer. For a parser, the inclusion with each segment of a list of alternative constituent lists, each of which is a list of segments (where a segment is named by its left and right end points and the constituent label) suffices to produce a compact representation of all the possible parses.

Earley's Algorithm

There is another parsing algorithm for context free grammars due to Jay Earley (Earley, 1970), which can be thought of as a predictive chart parser. This algorithm combines the benefits of the systematic, lattice-oriented parsing of the well-formed substring or chart parser with the advantages of predictive analysis. Although the algorithm was developed in the context of parsing for computer programming languages, and is presented as such by Earley, the algorithm has many theoretical advantages for


parsing context free grammars in general, and an appreciation of its operation is important for the understanding of context free parsing. Earley's algorithm does not quite fit into either the top-down or the bottom-up models, or rather it seems to fit equally well into both. Starting from the beginning of the string, it begins to fill in a table (which Earley calls a state table) in which it records for each position in the input string, each rule of the grammar that has been partially matched up to that point or which might possibly match beginning at that point (i.e. each rule that would be consistent with what has been parsed so far to the left of that point). The table is organized into columns, one for each position in the input string, and the procedure for filling out a given column i+1 is as follows:

1. (transition) Make entries in the column for rules that appear in the preceding column and whose match can be continued by matching the input symbol associated with this column.

2. (prediction or "pushing") Make beginning entries in this column for every constituent which could be used to continue a match for a rule already in this column (each rule remembers the column in which its match was begun so that when the match is completed, the algorithm can return to the column where the constituent was wanted and continue the match of any and all rules which wanted it. This memory of the column in which a subconstituent match was begun replaces the use of a stack in most predictive algorithms, and gains a combinatorial benefit by not having to enumerate all of the different possible stacks which could sit above a given subcomputation. A given constituent in a given place will be looked for and found only once.)

3. (completion or "popping") For each rule whose match has just been completed in this column, go back to the column where that match was begun and pick up and continue the match of all rules which can use the constituent just formed.

In Earley's statement of the algorithm the progress of a rule match is recorded by a pair of numbers -- the rule number and the number of symbols in the right-hand side of the rule which have already been matched. An entry in the table consists of these two numbers plus the number indicating the column in which the rule match was begun. A sentence is accepted if, when the last column is filled out, it contains an entry for a rule whose left-hand side is S, whose match has been completed, and whose match was begun in column 0. (The algorithm begins by initializing column 0 to contain all of the rules whose left-hand sides are S.)
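The table-filling procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch, not Earley's own formulation: the toy grammar, the function name `recognize`, and the use of word categories as input symbols are all assumptions made for the example, and rules with empty right-hand sides are assumed not to occur. Each table entry is the triple described above: rule number, number of right-hand-side symbols matched, and the column in which the match was begun.

```python
# Hypothetical toy grammar: each rule is (left-hand side, right-hand side).
GRAMMAR = [
    ("S", ("NP", "VP")),
    ("NP", ("det", "noun")),
    ("NP", ("noun",)),
    ("VP", ("verb",)),
    ("VP", ("verb", "NP")),
]


def recognize(tokens, grammar=GRAMMAR, start="S"):
    """Return True if `tokens` (a list of word categories) is accepted."""
    nonterminals = {lhs for lhs, _ in grammar}
    # One column per input position, plus column 0 before any input.
    columns = [set() for _ in range(len(tokens) + 1)]
    # Column 0 starts with every rule whose left-hand side is the start symbol.
    columns[0] = {(r, 0, 0) for r, (lhs, _) in enumerate(grammar) if lhs == start}

    for i in range(len(tokens) + 1):
        agenda = list(columns[i])
        while agenda:
            rule, matched, begun = agenda.pop()
            lhs, rhs = grammar[rule]
            new_entries = []
            if matched == len(rhs):
                # Completion ("popping"): continue every rule in the column
                # where this match was begun that wanted this constituent.
                new_entries = [
                    (r, m + 1, b)
                    for (r, m, b) in columns[begun]
                    if m < len(grammar[r][1]) and grammar[r][1][m] == lhs
                ]
            elif rhs[matched] in nonterminals:
                # Prediction ("pushing"): start every rule that could build
                # the constituent wanted next.
                new_entries = [
                    (r, 0, i) for r, (l, _) in enumerate(grammar) if l == rhs[matched]
                ]
            elif i < len(tokens) and tokens[i] == rhs[matched]:
                # Transition: the next input symbol matches, so the advanced
                # entry goes into the following column.
                columns[i + 1].add((rule, matched + 1, begun))
            for entry in new_entries:
                if entry not in columns[i]:
                    columns[i].add(entry)
                    agenda.append(entry)

    # Accept if the last column holds a completed start rule begun in column 0.
    return any(
        grammar[r][0] == start and m == len(grammar[r][1]) and b == 0
        for (r, m, b) in columns[-1]
    )
```

In this sketch the transition step is carried out while the preceding column is being processed, which has the same effect as performing it first when a new column is filled. With the toy grammar, `recognize(["det", "noun", "verb"])` accepts, while `recognize(["det", "noun"])` does not.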


top- with succ pass top- will anal give kind pars algc that This thos anal pred sine evid to b

tarle down p the a

essive ing in down amoun ysis) n poin

of er , tn rithm would is b

e entr ysis iction e if ence, enefit

arsi ssum lye form pred t to nas

t, t bott e pr we w hav

ecau ies of is the

it m err

algor ng aig pticn labora at ion iction many dete

he sub om-up incipa ill ge e been se the which the s a mix predi

ay kee or cor

ithm is oritum be that it i tes its s "down" in

(which cycles of rmined t sequent a structur

1 differe t a subse produced predicti

are not a tring to ed bless ction is p us from rection.

fre caus s go et o ste

inci lef

he s naly e b nee t of by

on t t le

th ing made fin

quentl e of t ing to f rule P 2. dental t recu et of sis wi uildin being those

an ord echniq ast c e lef for on th

ding e

y thou he way build

s to be howeve

ly may rsion rules t li do a g as a

tnat entrie

inary ue has onsiste t. On speech e basis nough o

ght that a se loo

r, leap in o be Imos ny o for

s in char elim nt ce a und of

f an

of it

nten ked once ove

the use

t th tner

fca the

t p j nat with gain erst unre

an

as a starts ce and for by

such r wnat final data e same chart

rley 's chart

arser. ed all

some , this anding liable alysis

Transition Network Grammars

The presentation so far has been illustrated by two extremely simple sample grammars. When one begins to write a grammar for any appreciable subset of natural language, one finds that there are some verb phrases which consist of a verb alone, some with a verb plus an object noun phrase, some with a verb, an indirect object and a direct object, any of these three forms with a prepositional phrase added, any of the three forms with two prepositional phrases, etc. If one were to write each of these as a separate context free rule, as illustrated in Figure 14a, we find a very rapid proliferation (possibly infinite) of rules that share a lot of stuff in the right-hand sides. People who write grammars immediately find themselves falling into notations such as that illustrated in Figure 14b in which optional constituents, alternative constituent sequences, and repeatable constituents are indicated by some notation (usually parentheses for optionality, curly brackets or vertical strokes for alternative sequences, and the Kleene star operator (*) for repeatable constituents). These are usually thought of just as abbreviations for a set of ordinary context free rules, but the actual expansion of such notations into ordinary context free rules is a very bad way to implement them. Instead, one buys an advantage in parsing if he takes advantage of these primitive notions of optionality, alternatives, and repeatability. Transition network grammars provide a mechanism for doing this.


A basic transition network (BTN) is essentially a finite state transition diagram to which recursion has been added by fiat (see Woods, 1969, 1970, 1973a). The result is no longer a finite state device, but rather is formally equivalent to a pushdown store automaton or a context free grammar. The BTN is a labeled, directed graph whose nodes, which we call states, represent states which the grammar can be in in the course of generating (or analyzing) a sentence, and whose arcs represent transitions from state to state. The labels on the arcs indicate the input symbol or type of phrase which must be consumed from the input string in order to make the transition. It is the possibility of arcs (called PUSH arcs) labeled with the names of phrase constituents that provides the recursion which makes this model more than finite state. The grammar contains a start state for each of the types of constituents which can be called for on a PUSH arc, and distinguished states called final states which represent the completion of the analysis of some constituent. A PUSH arc can be taken if some string accepted by the start state associated with the label of that PUSH arc is consumed (or generated). There is a mechanical procedure presented in Woods (1969) for transforming any given context free grammar into an equivalent BTN and performing a number of optimizing transformations on the resulting BTN to produce a grammar which is more compact and more efficient for parsing than the original context free grammar. Essentially the BTN provides a way to factor a context free grammar into a finite state part and a recursive part so that as much of the grammar as possible can be expressed in the finite state part and optimized by the same techniques applicable to finite state grammars.

The set of notations used by linguists for representing alternative sequences and repeatable constituents in their grammar rules corresponds to the operations called "union" and "closure" in the theory of finite state automata, which together with the operation of concatenation are known to generate the finite state languages. Thus, the right-hand sides of grammar rules using these notations are merely notational variants of what is in automata theory called a "regular expression", and there exist formal procedures for translating such a representation into an equivalent transition diagram for a finite state machine. These same procedures can be used to translate a context free grammar using these notations into an equivalent BTN, such as the one illustrated in Figure 14c.
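A basic transition network of this kind can be sketched as a small table of states and arcs together with a recursive recognizer. The sketch below is an illustration only: the state names, the NP and PP sub-networks, and the functions `ends` and `accepts` are assumptions made for the example rather than the network of Figure 14c, and the networks are assumed to contain no left recursion. The VP network encodes a rule of the form VP -> V (NP (NP)) (PP)*, with final states playing the role of POP.

```python
# Start state for each constituent type that can be named on a PUSH arc.
NETWORKS = {"VP": "VP/0", "NP": "NP/0", "PP": "PP/0"}

# Arcs leaving each state: (label, next state). A label that names a network
# is a PUSH arc; any other label consumes one input symbol of that category.
ARCS = {
    "VP/0": [("verb", "VP/1")],
    "VP/1": [("NP", "VP/2"), ("PP", "VP/3")],
    "VP/2": [("NP", "VP/3"), ("PP", "VP/3")],
    "VP/3": [("PP", "VP/3")],  # closure: any number of further PPs
    "NP/0": [("det", "NP/1"), ("noun", "NP/2")],
    "NP/1": [("noun", "NP/2")],
    "NP/2": [],
    "PP/0": [("prep", "PP/1")],
    "PP/1": [("NP", "PP/2")],
    "PP/2": [],
}

# Final states: the analysis of the constituent may be completed here.
FINAL = {"VP/1", "VP/2", "VP/3", "NP/2", "PP/2"}


def ends(state, tokens, pos):
    """Return every input position at which the current constituent can be
    completed when the network is in `state` at position `pos`."""
    result = set()
    if state in FINAL:
        result.add(pos)
    for label, next_state in ARCS[state]:
        if label in NETWORKS:
            # PUSH arc: recognize a whole sub-constituent, then continue.
            for after in ends(NETWORKS[label], tokens, pos):
                result |= ends(next_state, tokens, after)
        elif pos < len(tokens) and tokens[pos] == label:
            # Ordinary arc: consume one input symbol.
            result |= ends(next_state, tokens, pos + 1)
    return result


def accepts(category, tokens):
    """True if the whole token list is one constituent of `category`."""
    return len(tokens) in ends(NETWORKS[category], tokens, 0)
```

Each alternative arc leaving a state is tried in turn, so the six separate verb phrase rules share a single verb arc and a single set of NP and PP arcs instead of being matched independently.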


Figure 14: Alternative Representations for Multiple Right-hand Sides of Grammar Rules

a. SEPARATE CONTEXT FREE GRAMMAR RULES

VP -> V
VP -> V NP
VP -> V NP NP
VP -> V PP
VP -> V NP PP
VP -> V NP NP PP

b. MERGED REPRESENTATION

VP -> V (NP (NP)) (PP)*

c. REPRESENTATION AS BASIC TRANSITION NETWORK (BTN)

(network diagram)


Thus the BTN formalism provides a realization for these notions of alternative sequences and repeatable constituents that is more efficient for a parser as well as being less redundant as a linguistic specification. Each of the arcs leaving a given state represents an alternative possible continuation of the string being generated (or of the analysis of a given string).

the cont be on e expr free gram gram part bTN by gram tree A pr is g

The tr mergin

ext fre perform ach in ess ions rules .

mars n mars w icular , grammar Earley ' mar com

gramm esentat iven in

ansit i g of e rule ed on diviau

were rtost

ave na nicn

harl s and s alg pared a r can ion of MOOdS

on n com

s , a ly o al

ex of

tura taKe ey 's the orit to t eas a v ( 19

etwor mon nd th nee o copy pa ride trie p 1 gen

ad v alg

nuraoe hm f he pa iiy b er sio 69) •

k gr part is p n su as

d i arsi eral anta orit r of or rsin e le n of

amraar s of ermi t en pa woul

n to ng al izat i ge o hm i pa-s

a pa g of ss by tari

ef f wh

s pa rts d b sepa go<-i ons r s a ing rsin

an fac

ey 's

ect i at w rsi n inst e ♦■

rate thms to t this natu ope

g of eq

tors alg

vely ould g o ead he ord for

rans ra

ral rati an

ui va oi

orit

pro be

pera o f s case inar con

i t io ergi algo ons opt i lent tour hm f

vides difle t ions epara

if y con text n net ng. r ithm requ

mized con

or f or b

for rent

to tely the

text free work

In for

ired bTN

text ive . T N ' s

Grammars for Natural English

each gramm ( ther vario recur On th most incap const sensi recog s t r u c u s e f u gramm possi sucn

In compar other, i

ars have e exist us types sion make e whole, natural

able of ructions tive gram nizer fo tural des 1 power ars and h b1e to ha grammars .

ing the t has b great formal

for fin s it un context grammar dealin and

mars ha r such criptio not a

ave t n e ve a pa

model een fo com put , mec ite st sui tab free

s for g wit disco

v e s u f cons

ns. G Ireaay undes

rsing

s of und t at ion hanic ate le fo gramm na tur h ce nt inu f i c i e t r u c t enera

pre i rabl algor

the hat al al mach r na ars al 1 rtai ous nt f ions 1 re sent e co ithm

Chom wher ad va opt i ines '. ura prov angu n k con

orraa , b writ

in nseq Tor

sky hi eas the ntages miz ing ) , tne 1 langu ide the age but inds o stituen 1 power ut prov ing sys

conte uence t the en

erar fin for

proc ab

age sim are

f c ts . to

ide terns xt nat t irr

chy with ite state

parsing edures of sence of analysis . plest and formally

oordinate Context

provide a no useful

add no sensitive it is not class of


Transformational Grammars


have show mode than serv lang tran gram plus orde inse Tran iden feat of tran Flgu pass the

Ther bee

n to 1. U cont

ed a uage sform mar b

a r of rt c sform tity ures the

sform re 1 ive s corre

e ar n p be e ne f ext 3 t gra

atio asic set cons onst atio of

asso sen

atio 5, ente spon

e a num reposed quivale ormalis free gr he veh mmar i nal gr ally co of tran tituent ituents nal rul consti

elated tence • nal rul whicn nee fro ding ac

ter o for

nt to m, ho ammar icle n tn ammar nsist s for ir s an at v

es ca t u c n t with

Per e is produ m the tive

f o na

the weve s ha for

e 1 of

s of atio d i ario n a s a the haps the ces "de

sent

ther tural oroi

r, wi s st

mos ast

Cho a co

nal r n ge us po Iso nd t words

the passi tne

sp s ence.

gra la

nary th c imul t o deca msky ntex ules nera siti test he ana si

ve t "su

true

mmar ngua con

onsi ated f th de .

t fr whi

1 m ons

co pres som

mple rans rfac ture

fo ge w text dera

11 e st

T A tr ee " ch c ove , in t ndit enc«; etim st form e st " t

rmal hich

f re bly ngui udy his ansf base •I., p

de he p ions

of es t exam atio ract hat

isms tnat have been

e grammar more power sties and of natural

is the ormational " grammar ermute the lete, and arse tree.

such as syntactic

he phrases pie of a n shown in ure,: for a underlies


Figure 15: A Sample Transformational Rule: The Passive Transformation

a. STATEMENT OF THE RULE

PASSIVE: NP (AUX) V NP => 4 2 BE+EN+3 BY+1
          1   2   3  4
CONDITION: 4 ≠ 1

b. EFFECT OF THE RULE ON TREES

[S NP(1) AUX(2) [VP V(3) NP(4)]] => [S NP(4) AUX(2) [VP BE EN V(3) BY NP(1)]]

The rule says that if you can analyze an intermediate phrase structure tree into a sequence consisting of a noun phrase, optionally an auxiliary verb, followed by a main verb and an object noun phrase, then you can transform the tree by moving the subject noun phrase (1) to the position of the object noun phrase (4), appending the word "by" on its left, moving the object noun phrase to subject position, and appending the morphemes "be" and "en" to the left of the main verb. This rule changes the tree structure corresponding to "Mary shot John" into that corresponding to "John was shot by Mary". (A later rule will move the "en" to the right of the next verb and a "postcyclic" rule will combine the two into a past participle.) The generation of a sentence by a transformational grammar consists of the generation of a deep structure tree by means of the context free base grammar and then transforming this tree through a series of intermediate structures into the surface structure tree by means of the transformational rules, which are usually ordered, marked as optional or obligatory, and applied cyclically to successive embedded clauses in complex sentences.

The transformational grammar appears capable of capturing the major syntactic facts about natural language, and a great deal of our current knowledge about the syntax of English has been discovered and codified in terms of this model. However, it is incredibly inefficient to parse with such a grammar and no parsing algorithm suitable for parsing any significant amount of text has ever been developed for this grammar model, although Stanley Petrick has spent considerable effort in this direction for a number of years and has probably the only working parsing algorithm for transformational grammars in existence (Petrick, 1965).

Augmented Transition Networks

In order to obtain a grammar formalism with the linguistic adequacy of a transformational grammar while preserving the efficiency of the various context free parsing algorithms, I have been developing and refining a model of grammar which I call an augmented transition network (ATN). Presentations of this model appear in Woods (1969, 1970, 1973a). Earlier attempts along similar lines were made by Thorne, Bratley, and Dewar (1968) and by Bobrow and Fraser (1969). An ATN consists of a basic transition network grammar augmented with a set of registers which are carried along with the state and which can hold arbitrary pieces of tree structure, and with arbitrary conditions and actions associated with the arcs of the grammar which can test and set the contents of these registers. As a parsing proceeds with an ATN grammar, the conditions and actions associated with the transitions can put pieces of the input string into registers, use the contents of registers to build larger structures, check whether two registers are equal, etc. It turns out that this model can construct the same kinds of structural descriptions as those of a transformational grammar and can do it in a much more economical way. The merging of common parts of alternative structures, which the network grammar provides, permits a very compact representation of quite large grammars, and


this lang (Woo spee of t Engl comb basi unde type cond a s tran diff adva with orde coul Une spee fun^ such unam such pred anal than

mod uage ds, ch he f ish inat s rsta s itio imil siti eren ntag whi

r to d o of t ch tion

wo bigu

wo icti ysis cor

el ha und

Kapla under ew 1 that

orial of nding of ns an ar w on n t ru e of ch on pred

ccur he 1m under word

rds ously rds a on an

sin rect

s se erst n, a stan ingu

ar pro

the pro

cont d ac ay, etwo les) the e ca ict to t port stan s su are fin

re a d a ce ones

rved andi nd N ding isti e a blem syn

ject ext tion but

rks wh

tran n fo the he r ant ding ch a aim

d in Imos re spur

as ng ash- , th call t a s. tact (Ba fre

s as sue (su

ich siti How type ight rol Is

s "a ost the

t al not ious

the ba systems Webber, 1 e transit y adequa 11 amena This mode ic compo tes, 197 e gramma sociated h grammar ch as m we disc

on networ the arcs

s of cons or left

es of a to predic " , "an", always u inputi

ways foun even lo matches

sis such 972, ion te ble 1 is nent »♦, rs with s lo ergi us&e k fo bac

titu of a

sy t th "of" nstr In t d as oked wou

for s as t

woods, network grammar to co being of t

Woods, can be the gr

se the ng com d previ rmalism kwards ents o given

ntactic ose pla

shoul essed a he BBN a resu for

Id be f

everal he LUNAH

1973b) grammar

s for ping wi

used he BBN 1974). augmen

ammar ru benefits mon par ously.

is th and forw r words word or

compon ces wher d occur nd diffi speech

It of sy during ound mor

natural system

For is one

natural th tne as the speech Other

ted by les in of the

ts of Another e ease ards in

which phrase. ent in e Rfflvll

since cult to system

ntactic lexical e often

The ATN formalism suggests a way of viewing a grammar as a map with various landmarks and recognizable locations that one encounters in the course of crossing a sentence from left to right. For speech understanding this perspective is beneficial, for example, in attempting to correlate various prosodic characteristics of sentences with such "geographical landmarks" within the structure of a sentence.

Let me conclude this presentation of syntactic techniques with a reiteration that I have not attempted to make a case that any one parsing technique or grammar formalism is uniformly better than others (indeed I do not believe there is a best one for all applications). Rather, I have attempted to give sufficient insight into the relative advantages and disadvantages to enable the reader to make appropriate choices for particular applications.


Part II. Semantics

Turning now to the subject of semantics, I should perhaps first make the point that the word "semantics" means different things to different people. There is a tradition in philosophy and logic that specifies the semantics of formal systems such as the propositional calculus in terms of a set of "truth conditions" for each possible expression in the system. These truth conditions are abstract entities which specify the situations or "possible worlds" in which the statement would be true. In linguistics, on the other hand, concern is usually devoted to finding a notation or representation in which to specify each of the different possible interpretations or "readings" which a natural language sentence can have and to procedures for determining whether a sentence is meaningful or "anomalous" (i.e. not meaningful). The linguist does not usually follow this up by providing a semantics in terms of truth conditions for his notation. In the field of programming languages in computer science, the semantics of a programming language is specified in terms of the computations which the machine is to perform as a result of a given expression. In specifying a formal semantics for such systems however, one usually takes recourse to defining the semantics by reducing it to another notation such as those of elementary arithmetic, whose semantics is presumably understood. In the fields of computational linguistics and artificial intelligence, the term is perhaps most misused. In some cases, it is taken to cover everything that isn't syntax -- i.e. everything that is not part of a grammar, while in others it is asserted to be no different in principle from syntax, and any basis for a distinction between the two is denied.

While I don't have the space here to go into a complete exposition of the different concerns of all of these different perspectives on semantics, I will try to give a brief synopsis of the distinctions.

Let us begin by considering what all of these different things which call themselves semantics have in common. According to my dictionary, semantics is "the scientific study of the relations between signs or symbols and what they denote or mean." This is the traditional use of the term and represents the common thread which links the different concerns discussed above. Notice that the term does not refer to the things denoted or the meanings, but to the relations between these things and the linguistic expressions which denote them. Thus, although it may be difficult to isolate exactly what part of a system is semantics, any system which understands sentences and carries out appropriate actions in response to them is somehow completing this connection, and therefore is applying semantic knowledge to this task. One of the common


ID13U

ling cove ling infe sine sema invo but good "sem proc subs term lite this boun cone that by i

ses o uisti rage uisti rence e for nt ic Ives also name

antic ess. titut inolo ratur

pap dary lusio

not t.)

f th cs a of

o fo ca

man inf

not som for in

I r e

gy e, er

be ns a

al

e te nd a the

rm a pabi y ta orma only e in thi

fere egre term is I w in twee bout ] wr

rm sem ptific

term nd mea lities sks in tion the d

ferenc s furt nces" t to

for so we ill u referr

ay that

iters

antics ial in

not ning,

of lang

to ma etermi e abou her in have say

sue 11 es se th ing t mbol refere who us

in th tellig only but to the sy uage ke an nation t that ferenc come

that h pr tablis e term o inf and

nt. ( e this

e field ence i to thi all of

stem. process

evalu of the object

e proce to be u I have ocesses hed in "seman

erences referen One mus term m

s of s t s re the

This ing, atio

ob . I ss, sed

no an s

tic th

t a t be ean

com o e lati ret mis th

n n Ject n ab term fOi

re

d. ome infe at ud wwa

the

put- xtei.d

on be rieva use a e us ecess

den sence s sue the e ally since

of rence cross then re ho same

ional the

tween 1 and rises e of arily oted, of a

h as ntire good the the

s' in

The concerns of the linguists and the philosophers in the areas of semantics are effectively two halves of the same process, both of which the fields of computational linguistics and speech understanding will have to cope with. In reducing the semantics of natural language sentences to some formal notation, the linguist has only completed half of the job if he does not go on and specify the semantics of the resulting formal system. It is at this point that the concerns of philosophers and logicians in specifying the semantics for formal systems takes over. Notice that specifications of the formal semantics of programming languages in terms of the notations of elementary arithmetic are satisfactory only to the extent that we understand fully what these notations themselves mean. This is also the case for specifying the semantics of natural language.

I hope the above presentation has alerted you to some of the different kinds of things to which the term semantics can refer, and I will attempt to make clear which one I am using in the remainder of this presentation. I should point out that in the field of computational linguistics we don't have nearly as good an understanding of semantics as we do of syntax. I cannot give you the same kind of evolution of ideas through successively more powerful models and techniques, all of which are well understood. Here instead, the mechanisms which we understand thoroughly are known to be inadequate for dealing with many aspects of the problem, and the techniques which hold promise of dealing with some of the more difficult problems are not yet sufficiently understood or tested for anyone to say whether they in fact solve the problem or not. In this area, then, we have many promising approaches, but few definite answers.


unae repr jfyst

spec- have part appl tech seraa natu othe seraa de ve spec unde most are natu Art i Carb Coll haph Sane art i Abel ( 197

«tia

rsta esen em en),

d icul iea niqu ntic ral r i ntic lope if ic rsta par bei

ral cles onel ins ael ewal cles son, 3).

t I r d i n g tatio that and

i rect ar, 1 in t

e ol s wn langu s tn netw

d by appl

nd ing t , th ng d langu

whi 1 and and

( I960 1 ( by N hun

will of

n and under

then s rel

will he bbN

sema ich I age qu e tec ork r Quilli icatio , see e deta one i age wi ch ma Coili warno

) , Hei 1971 ), eweil, t, Li

a 11 e m some

inLerp stands orae sp evance descri speec

ntic have

estion hnique eprese an (19 ns of

Nasn ils of n the 11 hav y be ns (19 ck (1 dorn (

«in Sim tu o

ndsay ,

pt b

reta na

ecif t

be t n un int ap

-ans of

ntat 68, tne

-web man

area e to of

74) , 97U )

197^ ogra ns ,

an

10 do h a s i c pi ticn that tural Ian i c t e c h r o speec wo techni derstandi erpretati plied ef wering a

"s^mant ions of 19 69). F

latter ber (197 y otner i of corapu be left

interest Collins

. P111 m o ) , Norman d (1972) Milks, wi a becker

ere xnci

wi guag ique h u ques ng s on feet ppli ic kno

or m tec

4 an nter tat i

to incl and rp

and , w nogr

in

13 pies 11 e (w S w nder whi

yste int

ivel cati inte wled ore hniq d 19 esti onal

th ude : Qui

(196 hum

oods au ,

6c

pr of

appl hetn hich stan ch m , o y t ons , rsec ge deta ue 75*) ng t sem

e r bru

Ilia 8) , elha

(1 Scha hank

ovid se

y t er t

I ding are One proc o s

an tion whic ils to . F hing anti ef er ce ( n ( Grp

rt ( 967) nk, and

e an mantic o any ext or think

In being

is the edural everai d the s" in h was on the speech or the s that cs for ences. 1973), 1969), pn and

1973), , and Colby, Colby

Procedural Semantics

It stand linguis since terms o Notice of mean present procedu wheneve all on r e p i- e s e

app on

ts i they f tn t na

ing exc

re i r s e h ntat

ears f irme

n spec can

e proc t the that ept by tself omeone as wh ion of

that

r g if yi def

edur not

elus mea

is s ca

en it ,

the round ng t ine t es tn ion o i ve ns of o m e t n rr ies it i

pro tna

ne s he se at th f pro qual i alte

ing a out

s no

gramm n th emant man t i e mac cedur ty o rnati bstra the p t be

ing lang e phiios ics ol cs of the hine is e shares f being ve repres ct which rocedure, ing exec

uage ophers their ir not to ca with t impos

entat i is ins but o

uted

theor or

syst ation rry he no sible ons . t a n t i t n e r w is

ists the

ems, s in out. tion

to The

ated ise, some

Although in ordinary natural language not every sentence is overtly dealing with procedures to be executed, it is possible nevertheless to use the notion of procedures as a means of specifying the truth conditions of declarative statements as well as the intended meaning of questions and commands. One thus picks up the semantic chain from the philosophers at the level of truth conditions and completes it to the level of formal specifications of procedures.


These real notion in te "proce and t applic natura notabl semant Hash-* of Wi questi alumin unders pyrami the am block techni rule d that s cf the effect a numb

can mach of

rms dura he atio 1 la e co ics ebbe nogr ons urn tand d o bigu or

ques rive yste «* •■

V- u

ivel er o

in t ines char of

1 se term n o ngua mput

ar r, 1 ad such in

s an n t ity a use

n an m , I 6 c u n y se f ot

urn and

acte mec

mant ha

f t ge u er s e 972, (197 as hi

d ca he by d bloc d in d si wil

1 Q uc

rve her

bs chara can be

rizing t hanical ics" in s since his tec nderstan ystems w the LU moods,

2). Th "what i gh alk rries ou block in etermini k in t the LUN

nee I am 1 use LU . I b as a for language

cteri there he tr proc

my 19 gai

hniqu ding hich NAH 1973b e fo s th ali t ins the

ng wh he c AH sy more NAH a hink mal m unde

zed by a uth edur 68 A ned e i has make sys

} an rmer e a roc

true corn ethe orne stem fam

s th the

odel rsta

by ncho cond es FIPS wid

n c been

us tem d th

un vera ks?" tion er," r th r). are ilia e pr

ru for

ndin

their red to itions is on paper

e cir ompute very

e of (woo

e bloc dersta ge co , whi s such

( incl ere is

Sine more

r with incipa les u what

g syst

oper phys of

e tha (Woo

culat r sy ef fee this

ds, ks wo nds a ncent le t

as uding a py

e th forma the

1 il sed is go ems.

atio ics. sen

t I ds, ion. stem tive

ty Kapl rid nd a rati he "Pu res

rami e se lize deta lust ther ing

ns on This

tences called 1968)

The s for

Two pe of an, & system nswers on of latter t the olving d on a mantic d and ils of ration e can on in

Semantics in LUNAR

The semantic framework of the LUNAR system consists of three parts -- a semantic notation in which to represent the meanings of the sentences, a specification of the semantics or meanings of this notation by means of LISP programs, and a procedure for assigning representations in the notation to input sentences. In LUNAR, the semantic notation (which I have referred to there as a query language) consists of an extended notational variant of the predicate calculus.

The query language contains essentially three kinds of constructions:

1) designators, which name or denote objects or classes of objects in the data base,

2) propositions, which correspond to statements that can be either true or false in the data base, and

3) commands, which initiate and carry out actions.

Designators come in two varieties -- individual specifiers and class specifiers. Individual specifiers correspond to proper nouns and variables. For example, S10046 is a designator for a particular sample, OLIV is a designator for a certain mineral (olivine), and X3 can be a variable denoting any type of object in the data base. Class specifiers are designators used to denote classes of individuals over which quantification can range. They consist of the name of an enumeration function for the class plus arguments. For example, (SEQ TYPECS) is a specification of the class of type C rocks (i.e. breccias) and (DATALINE S10046 OVERALL OLIV) is a specification of the set of lines of a table of chemical analyses which correspond to analyses of sample S10046 for the overall concentration of olivine.

Elementary propositions are formed from predicates with designators as arguments, and complex propositions are formed from these by use of the logical connectives AND, OR, and NOT and by quantification. For example, (CONTAIN S10046 OLIV) is a proposition formed by substituting designators as arguments to the predicate CONTAIN, and (AND (CONTAIN X3 OLIV) (NOT (CONTAIN X3 PLAG))) is a complex proposition corresponding to the assertion that X3 contains olivine but does not contain plagioclase. Elementary commands consist of the name of a command function plus arguments, and like propositions, complex commands can be constructed using logical connectives and quantification. TEST is a command function for testing the truth value of a proposition given as its argument. Thus, (TEST (CONTAIN S10046 OLIV)) will answer yes or no depending on whether sample S10046 contains olivine. Similarly PRINTOUT is a command function which prints out a representation for a designator given as its argument.

The format for a quantified proposition or command is

(FOR QUANT X / CLASS : PX ; QX)

where QUANT is a type of quantifier (EACH, EVERY, SOME, THE, numerical quantifiers, etc.), X is a variable of quantification, CLASS is a class specifier for the class of objects over which quantification is to range, PX specifies a restriction on the range, and QX is the proposition or command being quantified (both PX and QX may themselves be quantified expressions). For example, (FOR EVERY X1 / (SEQ TYPECS) : (CONTAIN X1 PLAG) ; (CONTAIN X1 OLIV)) is a quantified proposition corresponding to the statement that every type C rock that contains plagioclase also contains olivine. (FOR EVERY X2 / (DATALINE S10046 OVERALL OLIV) : T ; (PRINTOUT X2)) is a quantified command to print out all of the chemical analyses of S10046 for overall olivine concentrations. (For expository reasons, the notation has been slightly simplified here compared to that actually used in the LUNAR system, but the differences are minor.)


Semantics of the Notation

Having specified our semantic notation for representing the meanings, we must now specify the meanings of our notations. As mentioned before, we do this in LUNAR by relating the notations to procedures which can be executed. For each of the predicate names that can be used in specifying semantic representations, we will specify a procedure or subroutine which will determine the truth of the predicate for given values for the arguments. Similarly for each of the functions which can be used, we will specify a procedure which can compute the value of that function given the values of its arguments. For each of the class specifiers for the FOR function, we will require a subroutine which enumerates the members of the class. The FOR function itself is also defined by a subroutine as are the logical operators AND, OR and NOT and the basic command functions TEST and PRINTOUT. Thus any well formed expression in the query language is a composition of functions which have procedural definitions in the retrieval component and are therefore themselves well-defined procedures capable of execution on the data base. In fact, in the LUNAR system, the definition of all of these procedures is done in LISP and the notation of the query language is so chosen that its expressions are executable LISP programs. The totality of these function definitions and the data base on which they operate constitute the retrieval component of the system.
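The procedural view of the notation described above can be illustrated with a small sketch. It is written in Python rather than LISP, and everything in it is an assumption made for illustration: the toy data base, the membership of the TYPECS class, the tuple encoding of expressions, and the `evaluate` function are hypothetical and are not LUNAR's actual retrieval component. It only shows how each predicate, enumeration function, logical operator, and the FOR function can be given meaning by a procedure.

```python
# Invented toy data base: sample -> minerals found in it.
DATABASE = {
    "S10046": {"OLIV", "PLAG"},
    "S10047": {"PLAG"},
}
TYPE_C_ROCKS = ["S10046", "S10047"]  # hypothetical members of (SEQ TYPECS)


def contain(sample, mineral):
    """Procedure that defines the predicate CONTAIN."""
    return mineral in DATABASE.get(sample, set())


def seq(class_name):
    """Enumeration function used by class specifiers such as (SEQ TYPECS)."""
    return {"TYPECS": TYPE_C_ROCKS}[class_name]


PREDICATES = {"CONTAIN": contain}
ENUMERATORS = {"SEQ": seq}


def evaluate(expr, env=None):
    """Evaluate a query-language expression encoded as nested tuples."""
    env = env or {}
    if expr == "T":                       # the always-true restriction
        return True
    if isinstance(expr, str):             # variable or constant designator
        return env.get(expr, expr)
    op, *args = expr
    if op == "AND":
        return all(evaluate(a, env) for a in args)
    if op == "OR":
        return any(evaluate(a, env) for a in args)
    if op == "NOT":
        return not evaluate(args[0], env)
    if op == "TEST":                      # command: answer yes or no
        return "YES" if evaluate(args[0], env) else "NO"
    if op == "FOR":                       # (FOR QUANT X / CLASS : PX ; QX)
        quant, var, cls, restriction, body = args
        enum_name, *enum_args = cls
        members = ENUMERATORS[enum_name](*enum_args)
        results = (
            evaluate(body, {**env, var: member})
            for member in members
            if evaluate(restriction, {**env, var: member})
        )
        return all(results) if quant == "EVERY" else any(results)
    # Otherwise an elementary proposition: call the predicate's procedure.
    return PREDICATES[op](*(evaluate(a, env) for a in args))
```

With this toy data, `evaluate(("TEST", ("CONTAIN", "S10046", "OLIV")))` answers `"YES"`, and the quantified proposition "every type C rock that contains plagioclase also contains olivine" evaluates to false because the invented sample S10047 has plagioclase but no olivine. Only the EVERY and SOME-style quantifiers are sketched; the others mentioned in the text are omitted.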

It should be pointed out that by virtue of this definition of the primitive functions and predicates as LISP functions, the query language can be viewed simultaneously as a higher-level programming language and as an extension of the predicate calculus. This gives rise to two different possible types of inference for answering questions, corresponding to the philosophical distinction between intension and extension. First, because of its definition by means of procedures, a question such as "Does every sample contain silicon?" can be answered extensionally (that is by appeal to the individuals denoted by the class name "samples") by enumerating the individual samples and checking whether silicon has been found in each one. On the other hand, this same question could have been answered intensionally (that is by consideration of its meaning alone without reference to the objects denoted) by means of the application of inference rules to other (intensional) facts (such as the assertion "Every sample contains some amount of each element"). Thus the expressions in the query language are capable of either direct execution against the data base (extensional mode) or manipulation by mechanical inference algorithms or theorem provers (intensional mode). Only the former (extensional mode of inference) is actually used in the LUNAR system. This gives rise to some limitations (e.g. it is not possible to prove most assertions about infinite sets in extensional mode), but is very efficient for a variety of question-answering applications.

Semantic Interpretation

Having now specified the notation in which we will represent the meanings of English sentences in our system and making sure that we understand the nature of the meanings of the expressions in that notation, we are now left with the specification of the process whereby meanings are assigned to sentences. This process is referred to as semantic interpretation, and in LUNAR it is driven by a set of formal semantic interpretation rules. The semantic interpreter operates on a syntactic structure or fragment of one which has been constructed by the parser, assigning semantic expressions in the notation to the nodes of this structure to indicate the "meanings" of those constructions to the system. In LUNAR this procedure is such that the interpretation of nodes can be initiated in any order, but if the interpretation of a node requires the interpretation of a constituent node, then the interpretation of that constituent node is performed before the interpretation of the higher node is completed. Thus, it is possible to perform the entire semantic interpretation by calling for the interpretation of the top node (the sentence as a whole), and this is the normal mode in which the interpreter is operated in the LUNAR system.

Semantic Rules

In determining the meaning of a construction, two types of information are used -- syntactic information about sentence construction and semantic information about constituents. For example, in interpreting the meaning of the sentence, "S10046 contains silicon," it is both the syntactic structure of the sentence (subject = S10046; verb = "contain"; object = silicon) plus the semantic facts that S10046 is a sample and silicon is a chemical element that determine the interpretation (CONTAIN S10046 SILICON). (Note that the predicate CONTAIN here is the name of a procedure in the retrieval component and it is only by the "accident" of mnemonic design that its name happens to be the same as the English word "contain" in the sentence that we have interpreted.)

In LUNAR, this information about the semantic interpretations of syntactic structures is embodied in semantic rules consisting of patterns that determine whether a rule can apply and actions that specify how the semantic interpretation is to be constructed. An example of such a rule is given in Figure 16.

(S:SAMPLE-CONTAIN
    (S.NP (MEM 1 (SAMPLE)))
    (S.V (OR (EQU 1 HAVE)
             (EQU 1 CONTAIN)))
    (S.OBJ (MEM 1 (ELEMENT OXIDE ISOTOPE)))
    -> (PRED (CONTAIN (# 1 1) (# 3 1))))

Figure 16: A Sample Semantic Interpretation Rule

The name of the rule is S:SAMPLE-CONTAIN, and the left-hand side, or pattern part of the rule, consists of three templates which match fragments of syntactic structure. The first template requires that the sentence being interpreted have a subject noun phrase which is a member of the semantic class SAMPLE, the second requires that the verb be either "have" or "contain", and the third requires a direct object which is either a chemical element, an oxide or an isotope. The terms S.NP, S.V and S.OBJ name schemata for tree fragments which are used not only to test for the presence of their corresponding syntactic structures in the sentence, but also to associate reference numbers with selected nodes in the structure. These numbers are used for reference by the semantic conditions in the templates and for use in the right-hand side of the semantic rule. For example, the tree fragment S.NP locates the subject noun phrase of the sentence and associates the reference number 1 with that noun phrase.

The right-hand side, or action part, of the rule follows the right arrow and specifies that the interpretation of this node is to be a predicate formed by inserting the interpretations of two constituent nodes into the schema (CONTAIN (# 1 1) (# 3 1)), where the expressions (# m n) refer to the interpretation of the node with reference number n for template number m in the match of the left-hand side of the rule.
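The matching and construction steps for a rule of this kind can be sketched as follows. This is a simplified Python illustration, not LUNAR's rule interpreter: the sentence is assumed to be already reduced to a flat dictionary of subject, verb and object, the semantic class table is invented, and the interpretation of a constituent is taken to be the word itself. The names `apply_rule`, `SCHEMATA` and `SEMANTIC_CLASSES` are hypothetical.

```python
# Invented semantic class memberships for a few words.
SEMANTIC_CLASSES = {
    "S10046": {"SAMPLE"},
    "SILICON": {"ELEMENT"},
}


def mem(word, classes):
    """MEM condition: the word belongs to one of the semantic classes."""
    return bool(SEMANTIC_CLASSES.get(word, set()) & set(classes))


def equ(word, target):
    """EQU condition: the word is exactly the given one."""
    return word == target


# Which part of the (flattened) sentence each tree-fragment schema locates.
SCHEMATA = {"S.NP": "subject", "S.V": "verb", "S.OBJ": "object"}

RULE_SAMPLE_CONTAIN = {
    "name": "S:SAMPLE-CONTAIN",
    # Left-hand side: (schema, condition on the node numbered 1).
    "templates": [
        ("S.NP", lambda w: mem(w, ["SAMPLE"])),
        ("S.V", lambda w: equ(w, "HAVE") or equ(w, "CONTAIN")),
        ("S.OBJ", lambda w: mem(w, ["ELEMENT", "OXIDE", "ISOTOPE"])),
    ],
    # Right-hand side: (# m n) is encoded as ("#", m, n).
    "action": ("CONTAIN", ("#", 1, 1), ("#", 3, 1)),
}


def apply_rule(rule, sentence):
    """Return the interpretation built by the rule, or None if it fails."""
    matched = {}
    for m, (schema, condition) in enumerate(rule["templates"], start=1):
        node = sentence.get(SCHEMATA[schema])
        if node is None or not condition(node):
            return None                   # a template failed to match
        matched[(m, 1)] = node            # node 1 of template m

    def build(part):
        if isinstance(part, tuple) and part[0] == "#":
            return matched[(part[1], part[2])]
        if isinstance(part, tuple):
            return tuple(build(p) for p in part)
        return part

    return build(rule["action"])
```

Applied to a sentence with subject S10046, verb CONTAIN and object SILICON, the sketch yields `("CONTAIN", "S10046", "SILICON")`. If the subject is not a member of the class SAMPLE, the first template fails and no interpretation is produced, which is the sense in which the left-hand side also decides whether the construction is meaningful.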


Organization of Rules

The semantic rules for interpreting sentences are usually governed by the verb of the sentence. That is, out of the entire set of semantic rules, only a relatively small number of them can possibly apply to a given sentence because of the verb mentioned in the rule. Similarly the rules which interpret noun phrases are governed by the head noun of the noun phrase. For this reason, the semantic rules in LUNAR are indexed according to the heads of the constructions to which they could apply and recorded in the dictionary entry for the head words. Each rule then characterizes a syntactic/semantic environment in which a word can occur and specifies its interpretation in that environment. The templates of a verb rule thus describe the necessary and sufficient constituents and semantic restrictions in order for the verb to be meaningful. Nouns in noun phrases behave similarly. That is, the semantic rules not only specify the process of interpretation which assigns semantic representations, but their left-hand sides also specify the conditions under which given words and constructions are meaningful.

Semantic Rules in General

The above presentation is oversimplified in a number of respects for the sake of expository brevity. There is in general a greater variety of devices that are used in the semantic rules of the LUNAR system, and there are numerous details of operation that we will not consider here (such as the desired behavior when a template or a rule matches in several different ways). For more details on these issues, the reader is referred to Woods (1967) and to Woods, Kaplan, and Nash-Webber (1972). There are also many other interesting issues in the semantics of natural language which have not been explored in the LUNAR system or any other computer system which are currently more the domain of philosophers than computer scientists but which will eventually have to be handled by computer systems if they are to be facile at understanding human language. The diversity of these issues, however, is beyond the scope of this presentation.

In many question answering systems semantic interpretation rules are paired more directly with the syntactic rules of the grammar so that there is little or no template matching required (and consequently less latitude for producing semantic interpretations that are not in node-for-node correspondence with the syntactic structure). In still other systems, the semantics are not formalized in rules, but are simply embodied in arbitrary computer programs (and consequently totally unconstrained in what could be done theoretically but providing little or no theory or conceptual framework for what is going on). However, the kind of semantic rules that are used in LUNAR can be used as formal models to explain what is going on in the semantics of these other systems in which the semantics is either more restricted or less formalized.

Semantic Judgments

As in the case of syntax, semantics has both a judgmental and a structural aspect. That is, semantic information is used both to construct semantic representations of the meanings of the sentences and to reject anomalous or semantically ill-formed sentences. What we have described so far has mostly dealt with the structural aspect -- how to assign a semantic representation to a sentence and what representation to assign. This capability is necessary for any language understanding system at all, whether for text or for speech. In the judgmental component, however, there are a number of things which semantics can do which are particularly important for speech understanding. As we pointed out above, the pattern parts of the semantic interpretation rules can be used to specify what assemblages of syntactic structures and lexical words are meaningful.

In the next few sections, what I would like to do is briefly survey the uses of semantic information which have been made in various question answering systems, using the notion of semantic interpretation rules as presented above to unify the discussion. I shall no longer be directly concerned with the use of the rules for the assignment of semantic interpretations to sentences, but with the ancillary use of the information embodied in these rules for other purposes.

Semantic information is used in a number of text oriented language understanding systems to select the semantically meaningful parsings from among all of the possible parsings of a sentence. For example, in the context of airline flight schedules, in interpreting a sentence such as "Does American have a flight from some east coast city to Chicago" we can tell that the phrase "to Chicago" modifies flight and not city because we have semantic interpretation rules for flights to places, while we do not have any rules to interpret cities to places. In speech understanding, this ability to determine whether a given interpretation of a sentence is semantically meaningful is critical not only for choosing between alternative parsings, but also for choosing between alternative segmentations of the input signal into words, ...

In the next few sections I will discuss some of the techniques that have been used in various question answering systems to use semantic information for this judgmental role and discuss their advantages and limitations for speech understanding applications.

Semantic Selectional Restrictions

As we mentioned above, the attempt to characterize the difference between semantically well-formed sentences and those which are semantically anomalous has been a major concern of many linguistic semanticists (see e.g. Katz & Fodor, 1964). The device which is used in most such attempts is a notion of semantic selectional restrictions -- restrictions between the verbs of a sentence and semantic features of the arguments which they can sensibly take. For example, the restriction that verbs like "intend" require higher animate subjects is used to explain the oddness of sentences such as "the rock intends to sit there." This account assumes that the nouns of the language can be assigned to semantic classes such as "higher animate" and that there must be "semantic agreement" or at least no semantic disagreement between the verb of a sentence and the subjects, objects and other arguments which it can take. It is in this area of semantics that the misconceptions about the distinction between syntax and semantics arise, since there is usually no difference in principle between the implementation of such semantic restrictions to reject semantically anomalous sentences and implementation of syntactic restrictions such as number agreement to reject syntactically incorrect sentences. For sufficiently restricted and fixed domains of discourse, it is possible to implement such semantic selectional restrictions by subcategorizing the syntactic categories of the grammar with classes like 'animate noun' and 'color adjective' rather than simply noun or adjective. One thereby incorporates the testing of semantic selectional restrictions into the grammar and avoids the need for any special mechanism for testing semantic selectional restrictions.
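A selectional restriction check of the kind just described can be sketched in a few lines. The feature assignments and the `satisfies` function below are hypothetical, chosen only to mirror the "intend" example; no particular system's lexicon is being reproduced.

```python
# Invented semantic features for a few nouns.
NOUN_FEATURES = {
    "rock": {"inanimate"},
    "table": {"inanimate"},
    "student": {"animate", "higher-animate"},
}

# Invented restrictions: the feature each verb demands of its subject.
VERB_RESTRICTIONS = {
    "intend": {"subject": "higher-animate"},
    "like": {"subject": "animate"},
}


def satisfies(verb, subject):
    """True if the subject noun carries the feature the verb requires."""
    required = VERB_RESTRICTIONS.get(verb, {}).get("subject")
    if required is None:
        return True                       # the verb imposes no restriction
    return required in NOUN_FEATURES.get(subject, set())
```

Here `satisfies("intend", "rock")` is false and `satisfies("intend", "student")` is true. Note that `satisfies("like", "table")` is also false, so a system that treated this check as a hard filter would reject any sentence with "table" as the subject of "like", whether or not the sentence asserts that the table likes anything.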

The technique of semantically subcategorizing the syntactic categories of the grammar has been applied effectively in limited speech understanding applications. It has the advantage of being efficient in execution and easy to implement for sufficiently simple understanding tasks. However, one should understand its limitations. One of the major inadequacies is that the use of semantic selectional restrictions as prerequisites for grammaticality or semantic well-formedness is not quite correct. Rather, most such conditions are required only for a sentence to be true -- when the sentence is a question or when it asserts a negative possibility, then semantic selectional restrictions may be violated by perfectly reasonable sentences. A speech understanding system which contains such restrictions embedded in its grammar will fail to parse such inputs. (For example, in Terry Winograd's blocks world program the sentence "Can a table like blocks?" fails to parse since the system applies the selectional restriction that "like" requires an animate subject.) A speech understanding system which used such selectional restrictions as a prerequisite for acceptability of an interpretation of a speech signal would be unable to "hear" this sequence of words no matter how well articulated and how successful the acoustic and phonological analysis, but would rather insist on looking for some other interpretation of the signal.

An additional limitation of the semantic selectional restriction approach is that the necessary semantic information associated with a given argument to a verb is not necessarily associated with the lexical items in the noun phrase, but may be associated with the referent of the noun phrase instead. The association of such information with the dictionary entries for the words is really just an approximation (albeit a useful one for many applications) of what one really wants the semantic selectional restrictions to test.

A major practical difficulty with incorporating the semantic selectional restrictions into the syntactic categories of the grammar is the lack of extendability thus induced. If one wants to apply the system to a different domain of discourse or to extend the domain slightly, he has to redefine the categories of the grammar.

Semantic Screening

A somewhat more versatile technique for using semantic information to select an appropriate parsing is to apply semantic rules to the nodes of the syntactic tree structure as the nodes are built by the parser. If the node just constructed fails to have a semantic interpretation, then that computation path of the parser is rejected and the ... input. This ... have been continued further. This argument, however, neglects to count the cost of the semantic interpretation on uncompleted parsings which would not have been completed in any case for syntactic reasons. Whether semantic screening really provides an increase in efficiency depends on the relative costs of the extra or unnecessary semantic processing and the syntactic processing that is thereby eliminated. In many situations, it is more efficient to complete the syntactic analyses and then apply the semantic testing.

Another technique which is related to semantic screening is to apply tests not only of general semantic well-formedness but also tests of factuality in conjunction with the formation of a constituent. This is the case for example in Winograd's system when he makes his decision about "put the pyramid on the block in the corner" on the basis of whether there is a pyramid on a block in the current state of the world and not just on the basis of general information about whether pyramids can be on blocks. This technique can be very useful in some situations, but its exclusive and uncontrolled use would make it impossible to say things that were not already true or to ask about things that were not true.

Semantic Selection

A major inadequacy of semantic (and factual) screening and indeed of any application of semantic selectional restrictions as prerequisites for well-formedness is its inability to deal with sentences such as "I saw the man in the park with a telescope" in which there are many possible parsings which are all semantically possible, but are not equally plausible. Although it is possible that I was in a park which contained a telescope when I saw the man somewhere else, this is not the most likely interpretation in absence of specific information that would indicate this interpretation. Rather there is a kind of default interpretation that the telescope was probably used to see the man, and in the absence of reason to believe otherwise the man was probably in the park. What is required in general, rather than a mere rejection of semantically ill-formed interpretations, is a mechanism to select the most plausible interpretation from among a set of syntactically related alternatives. Although the solution of this problem in general is not at hand, a beginning has been made in a mechanism called selective modifier placement in the LUNAR parser (see Woods, 1973a). This mechanism uses information such as the fact that a telescope is an optical instrument and one can see with an optical instrument to prefer the alternative of "with a telescope" modifying "see", while in absence of semantic preference, the modifier "in the park" modifies the syntactically preceding noun phrase "man". The technique has not been systematically developed, however, and except for the placement of prepositional phrase modifiers, the use of semantic judgments in LUNAR to select among alternative parsings is not well developed.

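The preference behavior described for "I saw the man in the park with a telescope" can be sketched as a choice among candidate attachment points. This is only an illustration of the idea under stated assumptions: the `ISA` table, the `PREFERRED` set and the `place_modifier` function are hypothetical, and the actual selective modifier placement mechanism in the LUNAR parser is not reproduced here.

```python
# Invented class memberships for the objects of the prepositions.
ISA = {"telescope": "optical-instrument", "park": "place"}

# Invented semantic preferences: (head word, preposition, object class).
PREFERRED = {("see", "with", "optical-instrument")}


def place_modifier(prep, obj, candidates):
    """Choose the head that a prepositional phrase should modify.

    `candidates` lists the possible heads, nearest preceding one first.
    A semantically preferred head wins; otherwise the modifier attaches
    to the syntactically preceding candidate by default.
    """
    obj_class = ISA.get(obj)
    for head in candidates:
        if (head, prep, obj_class) in PREFERRED:
            return head
    return candidates[0]
```

For "with a telescope" and the candidates `["park", "man", "see"]` the sketch selects "see", while for "in the park" with candidates `["man", "see"]` no preference applies and the default "man" is chosen. Nothing is rejected outright; the less plausible attachments are simply not preferred.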
Semantic Prediction

All of the preceding techniques for making semantic judgments about completed syntactic constructions are of great importance for speech understanding. There are, however, situations in the course of understanding a speech utterance where one does not have a complete construction to work with and would like to make use of semantic information to guide the speech understander to look for words which might have been slightly garbled or to provide initial preferences among the words that are discovered on the basis of acoustic and lexical analyses alone. Given for example that we have found the words "sample" and "contain" in a speech signal, we would like to make use of our semantic information to predict that there should now occur a word which is a chemical element, an oxide or an isotope. This information is contained in our semantic rules (specifically it is in the left-hand sides of the rules). Similarly upon encountering the words "sample" and "contain" among a large number of other words in the initial word lattice, we would like to use the semantic information to notice that these two words are related and perhaps go together in the interpretation of the utterance. Both of these semantic roles make use, not of the logical or interpretative sense of semantics, but of a kind of associational semantics which studies the semantic relationships among words and concepts. There are a number of psychologists and psycholinguists as well as people in artificial intelligence, sociology and other fields who have been trying to model this aspect of semantics with various kinds of network structures. The initial impetus in this area was created by Ross Quillian (I960, 1969), but other researchers in this area of semantics include Abelson, Carbonell, Collins, Rumelhart and Norman, Schank, Simmons, and others (a sampling of most of these authors is given in Schank & Colby, 1973 and others are cited explicitly in this paper). The work of Fillmore (1968) has also been influential in this area of study, and recently, similar notions have been used at MIT as the basis for programs that analyze visual scenes (Winston, 1970). I will describe here some of the characteristics of semantic networks as Quillian visualized them which have direct application in speech understanding and which have been included in the BBN speech understanding system.
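The prediction from "sample" and "contain" to an expected element, oxide or isotope can be sketched by reading the left-hand sides of the semantic rules in the predictive direction. The rule table, the word classes and the function `predict_object_classes` are assumptions made for illustration; they are not the data structures of the BBN speech understanding system.

```python
# Left-hand-side information of one rule, flattened for illustration.
RULES = [
    {
        "name": "S:SAMPLE-CONTAIN",
        "subject": {"SAMPLE"},
        "verbs": {"HAVE", "CONTAIN"},
        "object": {"ELEMENT", "OXIDE", "ISOTOPE"},
    },
]

# Invented semantic classes of a few words.
WORD_CLASSES = {
    "sample": {"SAMPLE"},
    "silicon": {"ELEMENT"},
    "flight": {"FLIGHT"},
}


def predict_object_classes(subject_word, verb):
    """Semantic classes expected for the object, given subject and verb."""
    expected = set()
    subject_classes = WORD_CLASSES.get(subject_word, set())
    for rule in RULES:
        if verb.upper() in rule["verbs"] and subject_classes & rule["subject"]:
            expected |= rule["object"]
    return expected
```

Having found "sample" and "contain", the sketch predicts the classes ELEMENT, OXIDE and ISOTOPE, which a word matcher could then use to favor candidates such as "silicon". For a pair of words that no rule relates, the prediction is empty, which corresponds to the absence of any semantic connection between them.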

Quillian was not interested in the notions of semantics as characterizing truth. Indeed he denied (I think erroneously) the psychological relevance of such notions. Rather he viewed the "meaning" of a word as merely a collection of the concepts that are associated with it (without, however, giving any adequate explication of what was meant by a concept). I consider Quillian's original formulation and much of the work that it has stimulated to be inadequate in the respect that it doesn't give any attention to a specification of the semantics of the network notation itself, but that doesn't lessen the validity of many of the points that he and others of this school have raised.

Quillian was concerned with investigating the structure in which humans store information in their brains. Thus, the so called semantic networks are really attempts at finding structures and organizations for storing knowledge. His concern is not with having a notation in which to write down a list of facts, but rather with an overall memory structure in which the interrelationships among those facts, which humans use for retrieval of information and for construction of inferences, are explicitly and efficiently represented. The important thing for Quillian is not so much the structure of a particular concept, but the network of relations to other concepts that are established. In particular, Quillian sought to devise a mechanism and a structure which would account for the types of semantic associations which people make and the way these associations manifest themselves in human language understanding.

To give a flavor of the kind of network that Quillian had in mind, Figure 17 (taken from Quillian (1969)) gives an example of the concept associated with the lexical item "client". Each lexical item or word points to one or more "concepts" or nodes in the semantic net (corresponding to different senses of the word), each of which is merely an assemblage of pointers to other concept nodes in the network. In Figure 17 the identifiers PERSON, EMPLOY, and PROFESSIONAL stand for pointers to other concept nodes in the network. In Quillian's view, the meaning of a concept is the sum total of the collection of concept nodes to which it is connected -- no more and no less. While this gives very little leverage on solving any of the problems of semantic interpretation or characterization of truth conditions, it is a superb mechanism for accomplishing the semantic predictions and noticing the coincidences among semantically related words that are required for speech understanding. In particular, Quillian's notion of semantic intersection can play an important role in speech understanding.

[Figure not reproduced; surviving labels: DICTIONARY, QUILLIAN]

Figure 17: A Fragment of a Quillian Network

Semantic Intersection

Quillian developed the notion of semantic intersection as an attempt to account for the human capability to immediately identify the relationships between diverse things such as between 'plant' and 'alive' or (more subtly) between Madrid and Mexico, and to account for the tendency of people to accept an ambiguous term in a particular sense induced by the appropriateness to the context without noticing the other possible senses (a phenomenon called "foregrounding"). In foregrounding, the appropriate sense is somehow brought forward and made more accessible than the other senses due to the influence of the context. The mechanism which Quillian proposed to account for such phenomena, and which he believed was the principal process for accessing information from one's knowledge store, was a process which he called semantic intersection. Quillian assumed that in the brain, whenever a concept was brought into consideration in a discourse or whatever, it was somehow stimulated or "activated" and that this activation passed out in waves from the source of the stimulation to the concept nodes to which it was connected. When the activation waves from two different sources met at some node in the memory, a semantic intersection was detected, and a path through the semantic memory was thereby established which represented the semantic relationship between the two source concepts (e.g., Madrid is in Spain, which is like Mexico in language and culture). Similarly, such activations have some duration in time, and when an ambiguous word is encountered, the sense that people are likely to take is the sense which has semantic connections with concepts that are currently activated (as detected by the presence of semantic intersections).
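The wave-like spread of activation described above can be sketched as two breadth-first searches that advance in alternation until they meet. This is only a minimal illustration under stated assumptions, not Quillian's program: the toy network, its link structure, and the function name `semantic_intersection` are all hypothetical, and links are treated as unlabeled and undirected.

```python
from collections import deque

# Hypothetical toy network (illustrative links only, not Quillian's data).
NETWORK = {
    "Madrid": ["Spain"],
    "Spain": ["Madrid", "Spanish language", "Europe"],
    "Mexico": ["Spanish language", "North America"],
    "Spanish language": ["Spain", "Mexico"],
    "Europe": ["Spain"],
    "North America": ["Mexico"],
}


def semantic_intersection(network, source_a, source_b):
    """Spread activation one wave at a time from both sources.

    Returns the path from source_a to source_b through the first node
    reached by both activations, or None if the waves never meet.
    """
    if source_a == source_b:
        return [source_a]

    # For each source, record which node activated which (the "tags").
    parents = {source_a: {source_a: None}, source_b: {source_b: None}}
    frontiers = {source_a: deque([source_a]), source_b: deque([source_b])}

    def trace(origin, node):
        # Follow activation tags from a node back to the given source.
        path = []
        while node is not None:
            path.append(node)
            node = parents[origin][node]
        return path

    while frontiers[source_a] or frontiers[source_b]:
        for origin, other in ((source_a, source_b), (source_b, source_a)):
            next_wave = deque()
            for node in frontiers[origin]:
                for neighbor in network.get(node, []):
                    if neighbor in parents[origin]:
                        continue  # already activated from this source
                    parents[origin][neighbor] = node
                    if neighbor in parents[other]:
                        # The two waves meet here: an intersection.
                        path_a = trace(source_a, neighbor)[::-1]
                        path_b = trace(source_b, neighbor)
                        return path_a + path_b[1:]
                    next_wave.append(neighbor)
            frontiers[origin] = next_wave
    return None


print(semantic_intersection(NETWORK, "Madrid", "Mexico"))
```

With this toy network the path found runs from Madrid through Spain and the shared language node to Mexico, mirroring the example in the text. A fuller model would label the links and let activation decay over time, which is what the foregrounding account requires.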

In speech understanding, this foregrounding effect of semantic intersections can be used to influence the words that one hears in an otherwise ambiguous segment of speech, and can be used to detect the coincidences of semantically related words in a word lattice. Following connections through the semantic network can also be used to predict words that have not been detected in the signal but which are sufficiently likely that they should be looked for. For details on the use of such techniques in the understanding of continuous speech, the reader is again referred to Nash-Webber (1974, 1975). Notice that the information that we have in the pattern parts of the semantic interpretation rules of LUNAR is one type of information that we would like to have in such a semantic network. Notice also that whereas in LUNAR the information about associated semantic classes is available conveniently if one starts with the head of a construction, similar information in a semantic network format would be equally accessible from any of the concepts involved in the rule. This is one more instance of the importance of breaking a priori orderings of processing in speech understanding in favor of multiple, redundant ways of achieving the same result. In any given utterance, it could be one of the critical head words that is garbled, and one would like to be able nevertheless to find the semantic relationships among the arguments and use them to predict the missing head.

Other Aspects of Semantic Nets and Knowledge Representation

Another notion embedded in Quillian's conception of a semantic network (which also had rudimentary beginnings in Raphael's SIR system (Raphael, 1964)) is that information about a concept can be stored at several different levels up a chain of more and more inclusive concepts (Quillian called them superconcepts). For example, a canary is a bird, which is a type of animal, which in turn is a physical object. It may thus have certain properties which are stored directly at the level of canary (such as being yellow) but other properties that are common to a great many concepts and which are stored at the most general level of applicability. These would be automatically inherited by subconcepts (in the absence of contrary information) without having to be stored over and over again for each of the entities for which they are true.
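The inheritance scheme just described amounts to a lookup that climbs the superconcept chain until it finds the property. The following is a minimal sketch under stated assumptions: the tables `SUPERCONCEPT` and `PROPERTIES`, the function `lookup`, and every property other than the canary being yellow are hypothetical and chosen only for illustration.

```python
# Hypothetical miniature network. Only the canary/bird/animal/physical
# object chain and "yellow" come from the text; the rest is illustrative.
SUPERCONCEPT = {
    "canary": "bird",
    "penguin": "bird",
    "bird": "animal",
    "animal": "physical object",
}

PROPERTIES = {
    "canary": {"color": "yellow"},
    "penguin": {"can_fly": False},  # contrary information stored locally
    "bird": {"can_fly": True},
    "animal": {"alive": True},
    "physical object": {"has_mass": True},
}


def lookup(concept, prop):
    """Return (value, level) for the nearest level that stores prop.

    Climbs the superconcept chain from the concept upward, so a value
    stored on a subconcept overrides one stored higher up. Returns
    (None, None) if no level on the chain stores the property.
    """
    node = concept
    while node is not None:
        local = PROPERTIES.get(node, {})
        if prop in local:
            return local[prop], node
        node = SUPERCONCEPT.get(node)  # None once the chain is exhausted
    return None, None


print(lookup("canary", "color"))     # stored directly on canary
print(lookup("canary", "has_mass"))  # inherited from physical object
print(lookup("penguin", "can_fly"))  # local value overrides bird
```

Storing each property once at its most general level keeps the network small, at the cost of a longer climb for general properties of specific concepts.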

There is a tremendous amount of interest right now in various semantic network representations, what such structures should look like, how they should be used to do inferences, what kinds of things should be put into a network in response to understanding a sentence, etc. In particular, it is pointed out, notably by Schank and his students, that a great deal of what is understood in response to an input sentence comes from gratuitous assumptions that are made on the basis of knowledge already in memory and not specifically transmitted by the sentence. For example, Schank cites dialog pairs such as "Would you like an ice cream cone?" and "I just ate", in which the second utterance should be interpreted as giving a negative answer to the question. I think it should be apparent that when one attempts to understand spoken discourses and make judgments about the contextual appropriateness of a given interpretation of an utterance, the ability to make such semantic inferences using large amounts of semantic and factual knowledge (as well as pragmatic knowledge about what the speaker is likely to say in a given situation) will be of paramount importance. The inability to account for a given interpretation of an utterance by being able to relate it to what has been said before or to some aspect of the current context should raise the possibility that the utterance has been misheard. The ability to fully use this level of sophisticated inference as part of a speech understanding system, however, will probably have to await further developments in the ongoing studies in knowledge representation and mechanical inference. The techniques which exist today in these areas are either extremely limited or inordinately cumbersome.

CONCLUSION

I have attempted here to provide a perspective on some of the work that has been done in the areas of syntax and semantics for understanding natural language by machines and to call special attention to those techniques which have particular relevance to the problems of speech understanding.

I have tried to cover a range of different parsing algorithms and grammar models with emphasis on the advantages and disadvantages of the various features of these models for the particular types of problems that one will encounter in analyzing continuous speech, and I have tried to give my opinions as to the value of these different features. In particular, I have argued that the use of a predetermined order of finding things (such as left to right across the sentence) is potentially dangerous and perhaps to be avoided. I have pointed out that the ambiguity of syntactic word class (which is one of the major sources of ambiguity in English text) is greatly magnified in speech understanding by the inability to uniquely determine what the word at a given position is. Whereas in text parsing, one at least knows what the word is and therefore has an expectation of two or three possible syntactic categories for it, in speech understanding we may have a half dozen alternative possible words at a given point, each with one or more possible syntactic categories. Hence the combinatorial problems that arise from the multiplication of possible alternative analyses are much worse for speech. This is complicated by the fact that most of the techniques that have been developed in text parsing for minimizing the impact of these combinatorial possibilities require carefully designed sequences of looking for things which conflict with the above observation that such constrained orderings are sensitive to the errors in the lexical analysis of the input that are virtually inevitable in speech.
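The multiplication of alternatives mentioned above can be made concrete with a small calculation. The numbers below are illustrative assumptions only: a five-position utterance, three syntactic categories per known word for text, and six candidate words with one category each for speech (the "half dozen" of the text, taken as a lower bound). The function name `count_category_sequences` is hypothetical.

```python
from math import prod


def count_category_sequences(alternatives_per_position):
    """Number of distinct (word, category) sequences across the input.

    Each entry is the number of alternative (word, category) pairs at
    one position; alternatives at different positions multiply.
    """
    return prod(alternatives_per_position)


# Assumed illustrative figures, not measurements from any system.
text_case = count_category_sequences([3] * 5)    # one word, 3 categories
speech_case = count_category_sequences([6] * 5)  # 6 words, 1 category each

print(text_case, speech_case)
```

Under these assumptions the speech case has 32 times as many candidate sequences as the text case for an utterance of only five positions, and the ratio grows exponentially with length.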

The use of word lattices as input instead of sequences and the design of parsing algorithms around well-formed substring tables or charts appear to be viable methods for dealing with the combinatorial problem of speech understanding. The merging of common parts of different analyses permitted by transition network grammars is also helpful in this respect. In order to be able to correct errors, it will be essential to be able to come at a given parsing from several directions. Consequently checks will be necessary at appropriate points to avoid duplicating an analysis that has already been found.
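A well-formed substring table over a word lattice can be sketched as a set of (category, start, end) entries that is extended until no new entry can be added. This is a minimal illustration under stated assumptions, not the algorithm of any system cited here: the lattice, lexicon, binary rules, and the function `build_chart` are hypothetical, and the grammar is a toy context-free one rather than a transition network.

```python
from collections import defaultdict

# Hypothetical word lattice: competing (start, end, word) hypotheses.
LATTICE = [
    (0, 1, "list"), (0, 1, "less"),
    (1, 2, "the"), (1, 2, "a"),
    (2, 3, "samples"), (2, 3, "sample"),
]

# Hypothetical lexicon and binary rules, for illustration only.
LEXICON = {
    "list": {"V"}, "less": {"ADJ"},
    "the": {"DET"}, "a": {"DET"},
    "samples": {"N"}, "sample": {"N"},
}
RULES = {("DET", "N"): {"NP"}, ("V", "NP"): {"S"}}


def build_chart(lattice, lexicon, rules):
    """Return the set of (category, start, end) constituents found.

    Entries are processed in no fixed left-to-right order, and each new
    entry is combined with neighbors on both sides, so a constituent
    can be reached from several directions. The membership check keeps
    an analysis that has already been found from being duplicated.
    """
    chart = set()
    by_start = defaultdict(set)  # entries indexed by start position
    by_end = defaultdict(set)    # entries indexed by end position
    agenda = [(cat, s, e) for s, e, w in lattice for cat in lexicon.get(w, ())]

    while agenda:
        entry = agenda.pop()
        if entry in chart:
            continue  # already found by another route
        chart.add(entry)
        cat, start, end = entry
        by_start[start].add(entry)
        by_end[end].add(entry)
        # Combine with entries that begin where this one ends.
        for right_cat, _, right_end in list(by_start[end]):
            for parent in rules.get((cat, right_cat), ()):
                agenda.append((parent, start, right_end))
        # Combine with entries that end where this one begins.
        for left_cat, left_start, _ in list(by_end[start]):
            for parent in rules.get((left_cat, cat), ()):
                agenda.append((parent, left_start, end))
    return chart


chart = build_chart(LATTICE, LEXICON, RULES)
print(sorted(chart, key=lambda entry: (entry[1], entry[2], entry[0])))
```

In this toy case the lattice contains eight distinct word sequences, yet the table holds only six entries, because "the" and "a" share one DET entry and both noun hypotheses share one N entry. That sharing is the kind of merging of common parts that the text recommends.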

Another important role of syntax in a speech understanding system is the prediction of those places where small function words might occur in order to compensate for the unreliability of their identification by lexical analysis.

Although our understanding of semantics is not as well advanced as the understanding of syntax (which itself is far from complete), there are a number of semantic techniques that language understanding programs have used which can have great benefit in the construction of speech understanding machines. These include the use of procedural semantics for the specification of the operations which are to be carried out in response to the understanding of the sentence, the use of semantic selectional restrictions to rule out the unlikely interpretations of the speech signal, and the use of semantic associations as embodied in the Quillian semantic intersection technique to notice coincidences between semantically related words at different points in the input. One should be aware, however, of the limitations of some of these techniques and the need for continued research in the areas of both syntax and semantics in order to increase the range of things which such systems can understand and their abilities to choose correctly between alternative interpretations of a signal.

I think it is clear that in order to cover the scope of material that I have, it has been necessary to treat many issues rather shallowly and others not at all. Hopefully the references will provide additional details for the interested reader to follow up. I hope that I have given you some feeling for the issues and some of the things that are going on in computational linguistics, linguistics, psychology, and artificial intelligence relative to syntax and semantics and some of the ramifications of the speech understanding task for these areas. Given the different perspectives that the speech understanding task places on the roles of syntax and semantics in language understanding, I believe that the speech understanding problem can have almost as great an impact on research in syntax and semantics as these areas are now having on the problem of automatic speech recognition.

References

[1]

[2] Bobrow, D.G. and Fraser, J.B. (1969) "An Augmented State Transition Network Analysis Procedure," Proceedings International Joint Conference on Artificial Intelligence, Washington D.C., pp. 557-567.

[3] Bruce, B. (1973) "Case Structure Systems," Proceedings of Third International Joint Conference on Artificial Intelligence, Stanford University, Stanford, California, pp. 364-371.

[4] Carbonell, J.R. and Collins, A.M. (1973) "Natural Semantics in Artificial Intelligence," Proceedings of Third International Joint Conference on Artificial Intelligence, Stanford University, Stanford, California, pp. 344-351.

[5] Chomsky, N. (1965) Aspects of the Theory of Syntax, MIT Press, Cambridge, Mass.

[6] Collins, A.M. and Quillian, M.R. (1969) "Retrieval Time from Semantic Memory," Journal of Verbal Learning and Verbal Behavior, 8 (2), pp. 240-247.

[7] Collins, A.M. and Warnock, E.H. (1974) Semantic Networks, Report 2833, Bolt Beranek and Newman Inc., Cambridge, Mass.

[8] Denes, P.B. and Pinson, E.N. (1963) The Speech Chain, Bell Telephone Laboratories, Inc.

[9] Earley, J. (1970) "An Efficient Context-Free Parsing Algorithm," CACM 13 (2), pp. 94-102.

[10] Fillmore, C.J. (1968) "The Case for Case," in Bach, E. and Harms, R. (eds.) Universals in Linguistic Theory, Holt, Rinehart and Winston, New York.

[11] Floyd, R.W. (1967) "Nondeterministic Algorithms," JACM 14 (4), pp. 636-644.

[12] Green, C.C. and Raphael, B. (1968) "The Use of Theorem-Proving Techniques in Question-Answering Systems," Proc. 1968 ACM National Conference, pp. 169-181.

[13] Greibach, S.A. (1967) "A Simple Proof of the Standard-Form Theorem for Context-Free Grammars," in Mathematical Linguistics and Automatic Translation, Report NSF-18, Harvard University Computation Laboratory, Cambridge, Mass.

[14] Griffiths, T. and Petrick, S.R. (1965) "On the Relative Efficiencies of Context-Free Grammar Recognizers," CACM 8 (5), pp. 289-300.

[15] Hays, D.G. (1962) "Automatic Language-Data Processing," in Harold Borko (ed.), Computer Applications in the Behavioral Sciences, Prentice-Hall, Englewood Cliffs, New Jersey.

[16] Heidorn, G.E. (1972) Natural Language Inputs to a Simulation Programming System, Ph.D. Thesis, Yale University, New Haven, Conn.

[17] Jakobson, R., Fant, C.G., and Halle, M. (1967) Preliminaries to Speech Analysis, MIT Press, Cambridge, Mass.

[18] Katz, J.J. and Fodor, J.A. (1964) "The Structure of a Semantic Theory," in Fodor, J.A. and Katz, J.J. (eds.) The Structure of Language: Readings in the Philosophy of Language, Prentice-Hall, Englewood Cliffs, New Jersey, pp. 479-518.

[19] Kay, M. (1967) "Experiments with a Powerful Parser," Memorandum RM-5452-PR, The RAND Corporation, Santa Monica, California.

[20] Kuno, S. and Oettinger, A.G. (1963) "Multiple-Path Syntactic Analyzer," Information Processing 62, North-Holland, Amsterdam, pp. 306-312.

[21] Nash-Webber, B. (1974) "Semantic Support for a Speech Understanding System," Proceedings of IEEE Symposium on Speech Recognition, Carnegie-Mellon University, Pittsburgh, Penn., pp. 244-249.

[22] Nash-Webber, B. (1975) "The Role of Semantics in Automatic Speech Understanding", in Representation and Understanding, Bobrow, D.G. and Collins, A. (eds.) Academic Press, (in press).

[23] Newell, A. et al. (1973) Speech Understanding Systems: Final Report of a Study Group, North-Holland/American Elsevier, Amsterdam.

[24] Norman, D.A. and Rumelhart, D.E. (1973) "Active Semantic Networks as a Model of Human Memory," Proc. Third International Joint Conference on Artificial Intelligence, Stanford University, Stanford, California, pp. 450-463.

[25] Petrick, S.R. (1965) A Recognition Procedure for Transformational Grammars, Ph.D. Thesis, Department of Modern Languages, M.I.T., Cambridge, Mass.

[26] Quillian, M.R. (1968) "Semantic Memory," in Minsky, M.L. (ed.) Semantic Information Processing, MIT Press, Cambridge, Mass.

[27] Quillian, M.R. (1969) "The Teachable Language Comprehender: A Simulation Program and Theory of Language," CACM 12 (8), pp. 459-476.

[28] Raphael, B. (1964) "A Computer Program which 'Understands'," AFIPS Conference Proceedings, Vol. 26 (1964 FJCC) pp. 577-589.

[29] Sandewall, E. (1971) "A Programming Tool for Management of a Predicate-Calculus-Oriented Data Base," Proc. Second International Joint Conference on Artificial Intelligence, The British Computer Society, London, pp. 159-166.

[30] Schank, R.C. and Colby, K.M. (1973) Computer Models of Thought and Language, W.H. Freeman & Co., San Francisco, California.

[31] Thorne, J., Bratley, P. and Dewar, H. (1968) "The Syntactic Analysis of English by Machine," in D. Michie (ed.) Machine Intelligence 3, American Elsevier, New York, N.Y.

[32] Winograd, T. (1972) Understanding Natural Language, Academic Press, New York, N.Y.

[33] Winston, P.H. (1970) "Learning Structural Descriptions from Examples," MAC TR-76, MIT Project MAC, Cambridge, Mass.

[34] Woods, W.A. (1967) Semantics for a Question Answering System, Report NSF-19, Aiken Computation Laboratory, Harvard University, Cambridge, Mass. (NTIS number PB-176-548)

[35] Woods, W.A. (1968) "Procedural Semantics for a Question-Answering Machine", AFIPS Conference Proceedings, Vol. 33 (1968 FJCC), pp. 457-471.

[36] Woods, W.A. (1969) "Augmented Transition Networks for Natural Language Analysis," Report CS-1, Aiken Computation Laboratory, Harvard University, Cambridge, Mass.

[37] Woods, W.A. (1970) "Transition Network Grammars for Natural Language Analysis," CACM 13 (10), pp. 591-606.

[38] Woods, W.A. (1973a) "An Experimental Parsing System for Transition Network Grammars," in R. Rustin (ed.), Natural Language Processing, Algorithmics Press, New York.

[39] Woods, W.A. (1973b) "Progress in Natural Language Understanding: An Application to Lunar Geology," AFIPS Conference Proceedings, Vol. 42, (1973 National Computer Conference) pp. 441-450.

[40] Woods, W.A. (1974) "Motivation and Overview of BBN SPEECHLIS: An Experimental Prototype for Speech Understanding Research", Proc. IEEE Symposium on Speech Recognition, Carnegie-Mellon University, Pittsburgh, Penn., pp. 1-10.

[41] Woods, W.A., Kaplan, R.M. and Nash-Webber, B. (1972) "The Lunar Sciences Natural Language Information System: Final Report", BBN Report No. 2378, Bolt Beranek and Newman Inc., Cambridge, Mass. (NTIS number N72-28984).

[42] Younger, D.H. (1966) "Context-Free Language Processing in Time n^3," Proceedings 1966 Annual Symposium on Switching and Automata Theory, IEEE Conference Record 16 C 40, 1966, pp. 7-20.