

Collocations in Multilingual Natural Language Generation: Lexical Functions meet Lexical Functional Grammar

François Lareau   Mark Dras   Benjamin Börschinger   Robert Dale
Centre for Language Technology

Macquarie University
Sydney, Australia

Francois.Lareau|Mark.Dras|Benjamin.Borschinger|[email protected]

Abstract

In a collocation, the choice of one lexical item depends on the choice made for another. This poses a problem for simple approaches to lexicalisation in natural language generation systems. In the Meaning-Text framework, recurrent patterns of collocations have been characterised by lexical functions, which offer an elegant way of describing these relationships. Previous work has shown that using lexical functions in the context of multilingual natural language generation allows for a more efficient development of linguistic resources. We propose a way to encode lexical functions in the Lexical Functional Grammar framework.

1 Introduction

Natural Language Generation (NLG) is the generation of natural language text from some underlying representation: numerical data, knowledge bases, predicate logic, and so on. Multilingual Natural Language Generation (MNLG) is NLG where the output is in more than one language. Most NLG systems are in some way modular (see Reiter and Dale (2000) for a discussion of typical architectures); one advantage to modularity is the scope for separating language-independent components from those which are language-dependent, making it possible to add multilinguality with much less work than would be involved in building a new system from scratch (Bateman et al., 1999). Such claims have been made since the very first MNLG systems; the FoG system generating weather forecasts in English and French (Bourbeau et al., 1990) is a case in point.

Consequently, MNLG has been applied to a large number of text types: government statistics reports (Iordanskaja et al., 1992), technical instruction manuals (Paris et al., 1995), fairy tales (Callaway and Lester, 2002), museum tours (Callaway et al., 2005), medical terminology (Rassinoux et al., 2007), codes of practice (Evans et al., 2008), and so on.

Marcu et al. (2000), in reviewing some of the earlier work, comment that MNLG systems need to abstract as much as possible away from the individual language generated:

    If an [MNLG] system needs to develop language dependent knowledge bases, and language dependent algorithms for content selection, text planning, and sentence planning, it is difficult to justify its economic viability. However, if most of these components are language independent and/or much of the code can be reused, an [MNLG] system becomes a viable option.

Bateman et al. (1999) similarly emphasise the importance of reducing language dependence.

One kind of abstraction generalises across language-specific collocations: for example, we might note that heavy rain, strong wind or intense bombardment all refer to the intensification of some phenomenon, as similarly does the French pluie battante (‘beating rain’), but the particular intensifier used is determined by collocational appropriateness. These kinds of collocations are modeled within the Meaning-Text Theory (MTT) framework via lexical functions (LFs) (Mel’čuk, 1995); for some lexeme L, the above semantic notion of intensification or strength is represented by Magn(L).


MTT-based MNLG systems, from the early works of Heid and Raab (1989) and Iordanskaja et al. (1992) onwards, have used LFs to abstract away from the specific collocational phenomena of individual languages.
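Purely by way of illustration (and not part of any existing MTT resource), one can picture a single LF such as Magn as a per-language lookup from base lexemes to their preferred intensifiers; the entries in the following sketch are assumptions made up for the example.

```python
# Toy sketch: the LF Magn pictured as a per-language lookup table.
# The entries are illustrative assumptions, not an actual lexical resource.
MAGN = {
    "en": {"rain": "heavy", "wind": "strong", "bombardment": "intense"},
    "fr": {"pluie": "battante"},
}

def magn(lang: str, lexeme: str) -> str:
    """Return the collocationally appropriate intensifier for a base lexeme."""
    return MAGN[lang][lexeme]

print(magn("en", "rain"))   # heavy
print(magn("fr", "pluie"))  # battante
```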

In terms of resources developed and applications within the computational linguistics community, however, MTT has not been very prominent outside of NLG. In work that is more geared towards natural language understanding, other formalisms such as Lexical Functional Grammar (LFG) (Bresnan, 2001), Head-driven Phrase Structure Grammar (HPSG) (Pollard and Sag, 1994), Tree Adjoining Grammar (TAG) (Joshi and Schabes, 1997) and Combinatory Categorial Grammar (CCG) (Steedman, 2000) have received much more attention. Of course, all of these frameworks have also been applied in NLG. LFG, for example, has a sophisticated grammar development environment called Xerox Linguistic Environment (XLE) (Maxwell and Kaplan, 1993) for both parsing and generation, with wide-coverage grammars for a number of languages such as English and German (Butt et al., 2002), and advanced statistical models for tasks such as realisation ranking (Cahill et al., 2007).

The existence of a wide-coverage English LFG grammar for XLE convinced us to build our resources on this platform for the MNLG project described below. However, LFG comes from a tradition quite different from MTT, and has no concept that corresponds to MTT’s LFs. In this paper, we demonstrate that LFs cannot be straightforwardly introduced into the LFG formalism via direct manipulation of f-structures, and then show how glue semantics (Dalrymple, 2001) can incorporate them in an elegant way.

We first describe our MNLG system to provide a contextual background for the problem (§2), along with some basic notions on LFG (§3). We then describe LFs in more detail, and discuss how they have been used in other MNLG systems (§4), along with how they would fit into our system. We then return to LFG and glue semantics (§5), and present our proposal for incorporating LFs into LFG, using a running example (§6).

2 Our System

The context for this work is a project involving an MNLG system for generating commentary-style textual descriptions of Australian Football League (AFL) games, in both English and the Australian Aboriginal language Arrernte. A typical sentence in a human-authored commentary for a game might look as follows:

    Led by Brownlow medallist Adam Goodes and veteran Jude Bolton, the Swans kicked seven goals from 16 entries inside their forward 50 to open a 30-point advantage at the final change—to that point the largest lead of the match.

For the games we want to describe, there is a corresponding database which contains quantitative and other data regarding the game: who scored which goal when, from where, and so on. The system will use handwritten grammars in the LFG formalism—for English, there is an already-existing wide-coverage one developed for XLE as part of the ParGram project (Butt et al., 2002)—and the research around the grammar development will have a number of foci. In particular, we are interested in exploring how to handle morphologically rich non-configurational languages such as Arrernte, which are not usually tackled in the field of language technology; these exhibit a number of interesting and complicated phenomena, as outlined by, for example, Austin and Bresnan (1996) or Nordlinger and Bresnan (2011). Given the radical language differences between such languages and those which are more typically the focus of NLG projects, we are particularly interested in investigating the possible extent of language independence (see §4 for a discussion). LFs are an important facet of the semantic abstractions we require. For example, in the short text cited above, the expression kick a goal would be rendered in Arrernte as goal arrerneme, where arrerneme literally means ‘put’. We view kick and arrerneme as support verbs in these expressions.1 These collocations exhibit the same syntactic structure, and express the same meaning; they are instances of the same pattern of collocation, which is described in MTT by the LF Oper1(L). We will return to this example in §4.

1 Note that in AFL one can only score goals by kicking the ball, so in this context, the semantic contribution of kick is weak; we believe that for practical purposes it can be viewed as empty.


[Figure 1: System architecture. The pipeline runs from match statistics through the content selector (drawing on background knowledge and a user model) to facts, then a planner producing a plan, a deep realizer producing f-structures, XLE (using the grammar and a dictionary built by a dictionary compiler) producing morphemes, and an XFST morphology producing the final text. Components are divided into linguistic vs. non-linguistic and language-specific vs. language-independent knowledge.]

Our system more or less follows the “consensus architecture” of Reiter and Dale (2000), as schematised in Figure 1. Data is selected using domain-specific knowledge and statistical methods, to produce a collection of facts to be expressed. These facts are organised into a document plan, which is then passed to a deep realiser that produces one or more f-structures for each sentence of the text (cf. §3). This is then passed to XLE, which uses linguistic knowledge to produce ordered lexical items with attached morphemes. This is finally passed to a two-level morphology model compiled with the Xerox Finite-State Tool (XFST) system, which produces fully inflected text. The double-headed arrows in Figure 1 indicate non-deterministic output; a stochastic reranking technique is used to select between alternative results.

The module which is the focus of the present discussion is the deep realiser, which maps a semantic representation to one or more f-structures, for which in turn XLE will produce textual realisations. It is not our purpose in this paper to discuss the technical details of the implementation of this component; rather, we focus on the mapping between semantic representations and f-structures from a theoretical point of view. To explain the approach, and to motivate the need for glue semantics, we now present an outline of LFG.

3 Lexical Functional Grammar

LFG is a formalism for a non-derivational theory of linguistic structure that posits at least two levels of representation: c(onstituent)-structure and f(unctional)-structure.2 Mappings specify the relationship between the different levels of structure. C-structure is represented by phrase-structure trees, capturing hierarchical relationships between constituents and surface phenomena such as word order; while f-structure is represented by attribute–value matrices, describing more abstract functional relationships such as subject and object (indeed, syntactic dependencies). LFG assumes that these functional syntactic concepts are universally relevant across languages (Dalrymple, 2001, p. 3), and “so may be regarded as a major explanatory source of the relative invariance of f-structures across languages” (Bresnan, 2001, p. 98). As an illustration, the c- and f-structures for the sentence (1) below are given in Figure 2.

(1) Bradshaw kicked a beautiful goal.

The mapping between f- and c-structures is given by annotations on phrase structure and lexical rules as in (2) and (3) below.

(2) S → NP            VPall
        (↑SUBJ)=↓     ↑=↓

(3) kicked V (↑PRED)=‘kick〈(↑SUBJ),(↑OBJ)〉’
             (↑TENSE)=past

Lexical entries such as (3) are in fact only a different notation used for terminal nodes in c-structures; this could be written instead as in (4).

(4) V → kicked
        (↑PRED)=‘kick〈(↑SUBJ),(↑OBJ)〉’
        (↑TENSE)=past

2 See Dalrymple (2001) or Bresnan (2001) for extensive discussions of LFG; here, we provide only the basic essentials required to understand our treatment.


[Figure 2: c-structure (left) and f-structure (right) for Bradshaw kicked a beautiful goal. The c-structure is the phrase-structure tree S[fin] dominating the NP Bradshaw and a VP containing V (kicked) and the NP a beautiful goal. The f-structure is [PRED ‘kick〈1:Bradshaw, 2:goal〉’, TENSE past, SUBJ 1:[PRED ‘Bradshaw’, PERS 3, NUM sg], OBJ 2:[PRED ‘goal’, SPEC [DET [PRED ‘a’]], ADJUNCT {[PRED ‘beautiful’]}, PERS 3, NUM sg]].]

The symbol ↑ is a metavariable representing the f-structure of the parent of a node in the c-structure, and ↓ the f-structure of that node itself. These can be followed by a sequence of attributes that specify a path to an element in the f-structure. For example, in rule (2), the annotation on the NP means that the SUBJ of the f-structure corresponding to the node above it in the c-structure (S) is the f-structure corresponding to this NP (in plain English: this NP is the subject of the sentence). The annotation on the VPall node means that it shares the same f-structure as its mother (i.e., it is the head of S). In the lexical rule for kicked (3), the annotation (↑OBJ) inside the PRED attribute refers to the f-structure numbered 2 in Figure 2, which represents goal. A less common way of referring to elements of an f-structure, albeit one that is necessary in a number of cases, is “inside-out” function application, where the ↑ or ↓ follows an attribute sequence. An annotation such as (OBJ↑) refers to the f-structure of which the current one is the OBJ.

First-order predicate logic is often used as the fundamental meaning representation in LFG, although other more expressive representations are also possible, such as intensional logic or Discourse Representation Theory. The issue then is how to relate the core LFG structures above to this meaning representation. Dalrymple (2001, p. 217) notes that early work in LFG took the f-structure element PRED to represent the locus of the semantics, with the PRED in fact originally being referred to as the semantic form. If our meaning representation for (1) were as in (5a) (ignoring here tense and number), the mapping to the f-structure in the right of Figure 2 would be straightforward: the basic hierarchical structure of the f-structure is preserved, although predicativity is reversed in the case of the adjunct.

[Figure 3: ‘Quasi-’f-structure with variables. It is identical to the f-structure of Figure 2 except that the top-level PRED is the variable SV (with the same arguments 1:Bradshaw and 2:goal) and the adjunct’s PRED is the variable Good.]

(5) a. kick(bradshaw, beautiful(goal))

b. good(goal(bradshaw))

However, simple semantic forms like this cannot represent many aspects of semantics, such as scope of modifiers or quantification. More particularly for our purposes, if we start from a semantic representation that abstracts away from the collocationally determined use of beautiful to characterise the goal, or from the collocational use of kick to refer to the goal event,3 a suitable starting meaning representation might be as in (5b); what the mapping should look like is then much less clear.

A (quasi-)f-structure corresponding to this might be as in Figure 3. The top-level PRED could be a variable (SV) that would have to be instantiated to a collocationally appropriate support verb; alternatively, for some types of action, it could indicate that both the top-level PRED and its object could be realised as a single verb through some kind of structure merging.4

3 We might think of this as the action of “goaling”; another possible supporting verb would then be score.


The adjunct PRED could similarly be a variable (Good) whose value is a word that retains the desired semantics but is collocationally determined. However, it is not possible for f-structures to have variable predicates, or to be of indeterminate structure, because of their role in ensuring the LFG wellformedness conditions of Coherence and Completeness; in addition, the mapping between meaning representation and f-structure is less straightforward, with quite different hierarchical relations.

In §6, we show how the mapping of such abstract meaning representations to f-structures can be done in an elegant way using glue semantics. First, we present more formally the MTT notion of LFs that these quasi-f-structure “variables” of Figure 3 are trying to capture, and we then give a brief description of glue semantics.

4 Lexical Functions and MNLG

An important step in the NLG task of surface realisation is lexicalisation, where specific lexemes are chosen to express the content of a message. Most of the time, this can be achieved by mapping either concepts or language-specific meanings to lexemes in a straightforward way. For example, the concept RAIN can be mapped to the lexemes rain (English), pluie (French), lluvia (Spanish), and so on. Often, however, concepts are lexicalised in a different way depending on the lexemes they appear with, as with our example from §3 above. Consider for example the phrases strong preference, intense flavour, heavy rain and great risk. While the lexemes preference, flavour, rain and risk are chosen freely according to their meaning, the lexemes strong, intense, heavy and great are not. They have roughly the same meaning of intensification, but their choice is tied to the lexeme they modify. Such collocations pose a non-trivial problem for lexicalisation in NLG systems. Under the MTT framework these are modeled via LFs.

4 This kind of structure merging has not, to our knowledge, been implemented in LFG. In practice, it could be carried out by mapping from one f-structure to another, using the sort of mechanism found in XLE for use in machine translation. However, it would inelegantly add an unprincipled layer to the formalism.

ATTENTION [of X to Y]
   Magn                close/whole/complete/undivided ∼
   Func2               X’s ∼ is on Y
   nonFunc0            X’s ∼ wanders
   Oper12              X gives his/pays ∼ to Y
   Oper2               Y attracts/receives/enjoys X’s ∼
   Oper2+Magnquant-X   Y is the center of ∼ (of many Xs)
   IncepOper12         X turns his ∼ to Y
   IncepOper2          Y gets X’s ∼
   ContOper2           Y holds/keeps X’s ∼
   CausFunc2           Z draws/calls/brings X’s ∼ to Y
   LiquFunc2           Z diverts/distracts/draws X’s ∼ from Y

Figure 4: Dictionary entry for attention

Collocations are viewed as instances of recurrent patterns of semantico-syntactic mappings, described in terms of functions (in a mathematical sense) between lexemes. Hence, these four collocations can be described in terms of a function f such that f(preference)=strong, f(flavour)=intense, etc. Over the years, more than fifty basic recurrent functions of this type, and hundreds of complex ones, have been identified across languages and given names; the one discussed above has been called Magn. Detailed descriptions of these functions can be found elsewhere (Mel’čuk, 1995; Wanner, 1996; Kahane and Polguère, 2001; Apresjan et al., 2002).

The use of LFs allows the handling of lexicalisation in two steps. In the first step, unbound lexemes are chosen, and collocation patterns identified, while the actual value of the LF is only computed in the second step: INTENSE+RAIN → Magn(rain)+rain → heavy+rain.
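A minimal sketch of this two-step process, assuming a toy lexicon and LF dictionary invented purely for illustration, might look as follows.

```python
# Toy sketch of two-step lexicalisation with LFs; the lexicon and LF values
# below are illustrative assumptions only.
CONCEPT_TO_LEXEME = {"RAIN": "rain", "WIND": "wind"}   # free (base) lexeme choice
LF_VALUES = {("Magn", "rain"): "heavy",                # LF values stored in the dictionary
             ("Magn", "wind"): "strong"}

def lexicalise_intensified(concept: str) -> str:
    base = CONCEPT_TO_LEXEME[concept]      # step 1: choose the unbound lexeme
    pattern = ("Magn", base)               #         and identify the collocation pattern
    collocate = LF_VALUES[pattern]         # step 2: compute the actual value of the LF
    return f"{collocate} {base}"

print(lexicalise_intensified("RAIN"))      # heavy rain
print(lexicalise_intensified("WIND"))      # strong wind
```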

The values for these functions are stored in the dictionary; for example, the entry for attention must contain the information in Figure 4. We will not discuss each of these functions here; the key point is that LFs offer a very efficient way of describing a wide range of collocations.

Each pattern must be defined in the grammar, but this is done only once for all languages and domains. We discuss in §6 how this can be done in LFG. The fact that the patterns must be defined only once for all languages makes this technique a cost-efficient way of developing MNLG resources by sharing parts of the grammar across languages, as advocated notably by Bateman et al. (1991), Bateman et al. (1999) and Cahill et al. (2000).


This was the approach taken by Lareau and Wanner (2007) for the system MARQUIS, which generated air quality reports in eight European languages (Catalan, English, Finnish, French, German, Polish, Portuguese and Spanish). Their use of LFs was an important factor that contributed to the low number of language-specific rules they reported for the deeper modules of their grammars, making the addition of new languages in their framework relatively cheap. It is LFs such as these that we wish to incorporate into LFG.

5 Glue Semantics

In the context of LFG, there have been several approaches to developing a compositional notion of semantics derived from the f-structure; one that is well developed, and is the basis of our work, is glue semantics (Dalrymple, 2001; Andrews, 2010). We give only a brief summary here; for a full treatment, see Dalrymple (2001).5

Glue semantics is based on linear logic. This differs from classical logic in its resource-sensitivity, in that premises are treated as resources that can be kept track of. For example, consider the statements If you have $1, you can get an apple and You have $1. In classical logic, you can deduce that you can get an apple, but the original premises would still be true, i.e. you would still have $1, and you could still get an apple. In linear logic, these premises are resources that will be consumed in the process of deduction, and therefore not available for further proof;6 notationally, the implication above in linear logic is written $1 ⊸ apple.

This resource-sensitivity is particularly appropriate when we are concerned with the linguistic expression of semantic content: the contribution of each word and phrase to the meaning of a sentence is unique, and there should be no missing or redundant words in terms of the meaning to be expressed.
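The following toy sketch (our own illustration, not part of glue semantics itself) mimics this resource sensitivity: applying an implication consumes its antecedent, so it cannot be used again. For simplicity the implication is treated here as reusable, although in linear logic it too is a one-use resource (see footnote 6).

```python
# Toy sketch of resource-sensitive deduction: premises are consumed when used.
from collections import Counter

def apply_implication(resources: Counter, antecedent: str, consequent: str) -> None:
    """One linear-logic step: consume the antecedent, produce the consequent."""
    if resources[antecedent] < 1:
        raise ValueError(f"resource {antecedent!r} is not available")
    resources[antecedent] -= 1
    resources[consequent] += 1

resources = Counter({"$1": 1})
apply_implication(resources, "$1", "apple")
print(dict(resources))   # {'$1': 0, 'apple': 1}: the dollar is spent, we have an apple
# A second apply_implication(resources, "$1", "apple") would raise an error,
# because the $1 resource has already been consumed.
```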

We illustrate how this works by showing the more straightforward mapping between our example sentence (1) and its literal meaning representation (5a). The lexical item representing Bradshaw is given below in (6). The first line contains the wordform, its part of speech, and an annotation asserting that the f-structure corresponding to the N node immediately dominating the lexical item has an attribute PRED whose value is the semantic form ‘Bradshaw’. The second line (Bradshaw : ↑σ) contains what is termed the meaning constructor, as it gives instructions on how to construct meanings. These are pairs: the lefthand (meaning) side represents the meaning, and the righthand (glue) side represents a logical formula over semantic structures corresponding to those meanings.

5 It should be noted that the current version of XLE cannot directly handle glue semantics.

6 Somewhat counterintuitively, even the premise If you have $1, you can get an apple is consumed.

(6) Bradshaw N (↑PRED)=‘Bradshaw’
               Bradshaw : ↑σ

The present example is trivial: it should be read as ‘the semantic projection of the mother node is the meaning Bradshaw’. A σ subscript indicates the semantic projection of a node, so the notation ↑σ gives the corresponding element of the semantics via the projection of the mother node to the semantics. The element goal is similar:

(7) goal N (↑PRED)=‘goal’
           goal : ↑σ

The verb kick is transitive, so it will have the following form.

(8) kicked V (↑PRED)=‘kick〈(↑SUBJ),(↑OBJ)〉’
             (↑TENSE)=past
             λX.λY.kick(X,Y) : (↑SUBJ)σ ⊸ [(↑OBJ)σ ⊸ ↑σ]

In the meaning constructor, the semantics of the action of kicking is represented by a lambda term, on the left; the righthand glue side is given in terms of the linear logic implication operator ⊸. The first implication says that if (↑SUBJ)σ is available (i.e., if we have already built the semantic projection for the verb’s subject—in our example, Bradshaw), it will be consumed and will saturate the first variable of the lambda expression, to produce the new premise that follows the first ⊸ symbol, leaving us with λY.kick(Bradshaw,Y) : (↑OBJ)σ ⊸ ↑σ. This in turn consumes the semantic projection for the object (in our case, goal) to reduce the lambda term, and produces the semantic resource ↑σ, i.e., the semantic projection for the verb and its complements, kick(Bradshaw,goal).
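Informally (and purely as our own illustration, not standard glue notation), the meaning side of this derivation can be mimicked with curried functions, where the order of application mirrors the order of consumption dictated by the glue side.

```python
# Toy mirror of the glue derivation for (1): the meaning side of (8) is a
# curried two-place lambda; the SUBJ meaning is consumed first, then the OBJ.
kick = lambda x: lambda y: f"kick({x},{y})"   # (8): λX.λY.kick(X,Y)
bradshaw = "Bradshaw"                         # (6): SUBJ meaning
goal = "goal"                                 # (7): OBJ meaning

step1 = kick(bradshaw)   # consumes (↑SUBJ)σ, leaving λY.kick(Bradshaw,Y)
step2 = step1(goal)      # consumes (↑OBJ)σ, yielding the resource ↑σ
print(step2)             # kick(Bradshaw,goal)
```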


For the remaining two elements, we would have the following.

(9) beautiful A (↑PRED)=‘beautiful’
                λX.beautiful(X) : (ADJ ∈ ↑)σ ⊸ (ADJ ∈ ↑)σ

(10) a D (↑PRED)=‘a’
         λX.X : (DET ↑)σ ⊸ (DET ↑)σ

For beautiful in (9), the notation (ADJ ∈ ↑)σ differs in two ways from that introduced earlier. First, it uses an “inside-out” function to refer to the semantic structure of the phrase it modifies; and second, it uses set membership notation, as modifiers are typically represented by sets (as in the f-structure of Figure 2). The expression thus refers to the semantic structure corresponding to the f-structure in which ↑ appears as a member of the modifier set. In terms of the glue side, all modifiers have this structure: they take and return the same type of element. We treat the determiner in (10) similarly; further, it does not add any meaning element.7

All these combined together, then, give the literal semantics of (5a). Such mechanics are more complicated than is necessary for this simple example, which was used only for illustrative purposes here. However, they can equally well provide the more abstract semantics of (5b), as we show in §6.
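To make this composition concrete, the sketch below (again purely illustrative) combines the meaning sides of (6)–(10): the modifier takes and returns a meaning of the same type, and the determiner is the identity.

```python
# Toy composition of the meaning sides of (6)-(10) into the literal semantics (5a).
kick = lambda x: lambda y: f"kick({x},{y})"   # (8)
beautiful = lambda m: f"beautiful({m})"       # (9): modifier, same type in and out
det_a = lambda m: m                           # (10): contributes no meaning
bradshaw, goal = "bradshaw", "goal"           # (6), (7)

obj_meaning = det_a(beautiful(goal))          # beautiful(goal)
print(kick(bradshaw)(obj_meaning))            # kick(bradshaw,beautiful(goal))
```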

6 Adding Lexical Functions to LFG

There are a number of changes necessary to incorporate LFs, both in a less straightforward use of glue semantics and in other aspects of the definitions of lexical entries. The lexical entry for the proper noun Bradshaw still has the same simple meaning constructor as above in (6). By contrast, goal in (11) is a unary predicate: λX.goal(X), i.e., ‘X goals’, so to speak. However, in the construction under consideration here, its semantic predicativity is not echoed in syntax, since there is no verb to goal in standard English. This is precisely why a support verb is needed in the first place: kick ties the noun goal to its semantic argument Bradshaw. This is rendered in the lexical entry in (11) below with a meaning constructor that checks that there is a meaning available for the subject of the verb of which goal is the object.

7 The determiner could be considered as a quantifier, which would require a much more sophisticated treatment.

(11) goal N (↑PRED)=‘goal’
            λX.goal(X) : ((OBJ↑) SUBJ)σ ⊸ ↑σ

The lexeme kick serves only as a support verb to turn Bradshaw’s goal into a verbal expression, so that it forms a clause. It is a collocation of goal that, in the context of football match summaries, does not contribute to the meaning of the sentence in a significant way. Hence, X kicks a goal means nothing more than λX.goal(X), that is, the verb kick simply recopies its object’s meaning, with the constraint that its object is the lexeme goal in (12):

(12) kicked V (↑PRED)=‘kick〈(↑SUBJ),(↑OBJ)〉’
              (↑OBJ PRED)=c ‘goal’
              (↑TENSE)=past
              λX.X : (↑OBJ)σ ⊸ ↑σ

In this example, the second line is a constraining equation, which is LFG’s way of handling collocational constraints; it specifies that this rule can only be applied if the predicate of the object of kick is goal. And just as for the determiner in (10), the meaning side adds nothing to the overall semantics.
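Informally, the combined effect of (6), (11) and (12) can be pictured as follows (a toy sketch of the meaning sides only; the glue-side bookkeeping and the constraining equation are omitted).

```python
# Toy sketch of the meanings in (6), (11) and (12) for "Bradshaw kicked a goal".
bradshaw = "Bradshaw"                    # (6): Bradshaw : ↑σ
goal = lambda x: f"goal({x})"            # (11): λX.goal(X), fed by the support verb's SUBJ
kick = lambda obj_meaning: obj_meaning   # (12): λX.X, recopies its object's meaning

print(kick(goal(bradshaw)))              # goal(Bradshaw)
```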

We note here that the semantic description provided by glue semantics does not render obsolete the PRED function. It is still needed to encode purely syntactic information: the name of the lexeme and its sub-categorisation. The verb kick could control its own collocations, so we need to have access to the name of the lexeme.

Beautiful, in (1), could be replaced with spectacular or brilliant, for instance. In these kinds of texts, the semantic difference between these expressions is not significant. The adjectives beautiful, brilliant and spectacular, when they modify goal, merely denote a positive appreciation: λX.good(X).

(13) beautiful A (↑PRED)=‘beautiful’
                 ((ADJ ∈ ↑) PRED)=c ‘goal’
                 λX.good(X) : (ADJ ∈ ↑)σ ⊸ (ADJ ∈ ↑)σ

The second line is again an LFG constraining equation, which specifies that this rule can only be applied if beautiful modifies the lexeme goal; the semantic element λX.good(X) will be realised in other ways in different contexts.


The extra lines in the lexical entries are regular, and can be captured using templates, which are the XLE instantiation of LFG’s lexical rules. These in fact then correspond very closely to MTT’s LFs. For example, for the LF Oper1(L), which represents the use of support verbs in contexts such as that of kick in our examples, the following template could be defined:

(14) @OPER1(L) =
        (↑PRED)=‘%stem〈(↑SUBJ),(↑OBJ)〉’
        (↑OBJ PRED)=c ‘L’
        λX.X : (↑OBJ)σ ⊸ ↑σ

The constraining equation on the second line restricts the support verb to the particular lexical element with which it is invoked. The third line constructs the meaning by just passing along the meaning of the existing components with no additions. The template is then invoked in the dictionary:

(15) kick   V @(OPER1 goal)
     suffer V @(OPER1 loss)
     have   V @(OPER1 cold)

Such templates need only be described once for all languages. For example, the Arrernte dictionary contains the following entry for the expression goal arrerneme (literally ‘put (a) goal’):

(16) arrerneme V (OPER1 goal)

One problem with this approach is that collocations must be described in the collocate’s entry, which is not very elegant and obfuscates the lexicographer’s work. Indeed it is a lot easier, for example, to answer the question “how do you intensify smoker?” than “what lexemes can heavy intensify?”. However, this problem can easily be resolved by writing the dictionary in the format of Figure 5 (similar to the attention example in Figure 4), where all collocations are listed under their base headword, and using a compiler to build the corresponding XLE lexical entries.

goal [of X]

   Bon     beautiful/spectacular/brilliant ∼
   Oper1   X kicks/scores/gets/makes a ∼

Figure 5: Dictionary entry for goal
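As a rough illustration of the compiler step just described (the data format, function name and LF-to-category mapping below are our own assumptions, not the actual tool), entries keyed on the base headword can be expanded mechanically into XLE-style lexical entries that invoke the LF templates.

```python
# Toy sketch of a dictionary compiler: base-headword entries in the style of
# Figure 5 are expanded into XLE-style entries that invoke LF templates.
LEXICON = {
    "goal": {"Oper1": ["kick", "score", "get", "make"],
             "Bon":   ["beautiful", "spectacular", "brilliant"]},
}
LF_CATEGORY = {"Oper1": "V", "Bon": "A"}   # simplified assumption about parts of speech

def compile_xle_entries(lexicon):
    entries = []
    for base, lfs in lexicon.items():
        for lf, collocates in lfs.items():
            for word in collocates:
                entries.append(f"{word} {LF_CATEGORY[lf]} @({lf.upper()} {base})")
    return entries

for entry in compile_xle_entries(LEXICON):
    print(entry)   # e.g. "kick V @(OPER1 goal)"
```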

In the example we have considered so far, the English expression kick a goal and its Arrernte equivalent goal arrerneme have the same structure. However, this need not be the case. For example, consider the contrast between the following two sentences:

(17) John abandons the baby.

(18) John-le    ampe-Ø     ipmentye-Ø          iwe-me
     John-ERG   baby-NOM   abandonment-NOM     leave-N.PST
     ‘John abandons the baby’

Both sentences express the meaning abandon(John,baby), but in English, the predicate is expressed by a single verb, while in Arrernte it is expressed by a noun with a support verb. This construction corresponds to an LF called Labor12, which denotes a support verb that takes as its subject the first semantic argument of the base of the collocation (here, John), the second argument as its direct object (baby), and the base itself as its second object (ipmentye).8 The template for Labor12 would look like this:

(19) @LABOR12(L) =
        (↑PRED)=‘%stem〈(↑SUBJ),(↑OBJ),(↑OBJ2)〉’
        (↑OBJ2 PRED)=c ‘L’
        λX.X : (↑OBJ2)σ ⊸ ↑σ

And just as we did for goal in (11), we also need a specific entry for ipmentye that reflects its behaviour in this collocation, as well as an entry for iweme that says it is the Labor12 of ipmentye:

(20) ipmentye N (↑PRED)=‘ipmentye’
                λX.λY.abandon(X,Y) :
                   ((OBJ2↑) SUBJ)σ ⊸ [((OBJ2↑) OBJ)σ ⊸ ↑σ]

     iweme V @(LABOR12 ipmentye)

Hence, given the same meaning as input, the grammar produces different structures, as appropriate for the language being processed.
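A highly simplified sketch of this behaviour (illustrative only: the mini-lexicon, function names and the flattened word order ignore real Arrernte morphosyntax) is the following.

```python
# Toy sketch: the same predicate is realised as a plain verb in one language
# and as a noun plus Labor12 support verb in another.
REALISATION = {
    "en":  {"abandon": ("verb", "abandons")},
    "arr": {"abandon": ("labor12", ("ipmentye", "iweme"))},
}

def realise(lang: str, pred: str, arg1: str, arg2: str) -> str:
    kind, value = REALISATION[lang][pred]
    if kind == "verb":
        return f"{arg1} {value} {arg2}"        # e.g. John abandons the baby
    base, support = value                      # base noun + Labor12 support verb
    return f"{arg1} {arg2} {base} {support}"   # e.g. John-le ampe ipmentye iweme (simplified)

print(realise("en", "abandon", "John", "the baby"))
print(realise("arr", "abandon", "John-le", "ampe"))
```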

8 Since both ampe and ipmentye are in the nominative form, it is hard to determine which is the first and which is the second object, but this question is largely irrelevant here.


Of course, LFs have their limitations too. In the context of MNLG, there are two problems related to lexicalisation that are worth mentioning here. One is that languages sometimes diverge at the semantic level. For example, there is no direct equivalent to the verb teach in Arrernte; one has to say akaltye antheme, literally ‘give knowledge’. This is a collocation of the noun akaltye ‘knowledge’ that can be captured by an LF; but the problem here lies in the fact that the semantic input in Arrernte should be cause(X,know(Y,Z)), while in English it would be teach(X,Y,Z). This is different from the abandonment case discussed above: there, one language uses a straightforward realisation, while the other uses a light verb, but there is no need to decompose the meaning of these expressions to see that they are identical; they both have the same semantic representation. For teach∼akaltye antheme, the two languages do not conceptualise the world in the same way, and these conceptual/semantic differences must be dealt with early in the generation process; the deep realiser must produce different semantic representations depending on the language. At this stage, LFs are irrelevant because we are operating on concepts rather than at the lexical level, where LFs come into play.

Another limitation is that LFs are designed to describe recurrent patterns of collocations. Although most collocations found in languages are instances of a few common patterns, there are many that either express unusual meanings, or that exhibit a very peculiar syntactic or morphological structure. For example, the expression winning goal could be described as a collocation. However, the meaning expressed here by winning is very specific to this domain, and it cannot be reduced to a recurrent pattern across languages (beyond the equivalent expressions for winning goal). Ad hoc LFs can still be defined for such collocations, but their use will only be a viable solution in the context of an application within a restricted domain (such as ours).

7 Conclusion

We have proposed a technique for the description of collocations in LFG based on MTT’s concept of LFs, in order to solve the problem of complex lexicalisation in NLG. We showed that a direct treatment within LFG’s f-structure is not possible because it would require variable values for the attribute PRED, which is not allowed. Also, the semantics of support verbs in particular is tricky and cannot be captured satisfactorily with a PRED attribute. We proposed a treatment using glue semantics, which handles more elegantly the complex correspondence between the semantics and syntax of collocations. The rules that describe collocates use constraining equations so that they apply only in the context of the base of a collocation. We also showed how templates could be used in XLE to define recurrent patterns, effectively defining any given LF once for all languages. The result is an elegant way of describing collocations within the LFG framework. This technique simplifies the task of preparing resources for MNLG by sharing these patterns across languages.

Acknowledgments

We acknowledge the support of ARC grant DP1095443, and thank Mark Johnson for his feedback on the idea.

References

Avery Andrews. 2010. Propositional Glue and the Correspondence Architecture of LFG. Linguistics and Philosophy, 33:141–170.

Jury Apresjan, Igor Boguslavsky, Leonid Iomdin, and Leonid Tsinman. 2002. Lexical functions in actual NLP applications. In Computational Linguistics for the New Millennium: Divergence or Synergy?, pages 55–72. Peter Lang, Frankfurt.

Peter Austin and Joan Bresnan. 1996. Non-configurationality in Australian aboriginal languages. Natural Language and Linguistic Theory, 14(2):215–268.

John Bateman, Christian Matthiessen, Keizo Nanri, and Licheng Zeng. 1991. The re-use of linguistic resources across languages in multilingual generation components. In Proceedings of the 1991 International Joint Conference on Artificial Intelligence, volume 2, pages 966–971, Sydney.

John Bateman, Christian Matthiessen, and Licheng Zeng. 1999. Multilingual Natural Language Generation for Multilingual Software: A Functional Linguistic Approach. Applied Artificial Intelligence, 13(6):607–639.

Laurent Bourbeau, Denis Carcagno, Eli Goldberg, Richard Kittredge, and Alain Polguère. 1990. Bilingual Generation of Weather Forecasts in an Operations Environment. In Proceedings of the 13th International Conference on Computational Linguistics (COLING'90), pages 90–92.

Joan Bresnan. 2001. Lexical-Functional Syntax. Blackwell, Oxford, UK.

Miriam Butt, Helge Dyvik, Tracy Holloway King, Hiroshi Masuichi, and Christian Rohrer. 2002. The Parallel Grammar Project. In Proceedings of the COLING-2002 Workshop on Grammar Engineering and Evaluation, pages 1–7.

Lynn Cahill, Christie Doran, Roger Evans, Rodger Kibble, Chris Mellish, Daniel Paiva, Mike Reape, Donia Scott, and Neil Tipper. 2000. Enabling resource sharing in language generation: an abstract reference architecture. In Proceedings of the Second International Conference on Language Resources and Evaluation (LREC'00), Athens.

Aoife Cahill, Martin Forst, and Christian Rohrer. 2007. Stochastic realisation ranking for a free word order language. In Proceedings of the Eleventh European Workshop on Natural Language Generation (ENLG'07), pages 17–24, Schloss Dagstuhl, Germany.

Charles B. Callaway and James C. Lester. 2002. Narrative prose generation. Artificial Intelligence, 139(2):213–252.

Charles Callaway, Elena Not, Alessandra Novello, Cesare Rocchi, Oliviero Stock, and Massimo Zancanaro. 2005. Automatic Cinematography and Multilingual NLG for Generating Video Documentaries. Artificial Intelligence, 165(1):57–89.

Mary Dalrymple. 2001. Lexical Functional Grammar, volume 42 of Syntax and Semantics Series. Academic Press, New York.

Roger Evans, Paul Piwek, Lynne J. Cahill, and Neil Tipper. 2008. Natural language processing in CLIME, a multilingual legal advisory system. Natural Language Engineering, 14(1):101–132.

Ulrich Heid and Sybille Raab. 1989. Collocations in multilingual generation. In Proceedings of the Fourth Conference of the European Chapter of the Association for Computational Linguistics (EACL'89), pages 130–136.

Lidja Iordanskaja, Myunghee Kim, Richard Kittredge, Benoît Lavoie, and Alain Polguère. 1992. Generation of Extended Bilingual Statistical Reports. In Proceedings of the 15th International Conference on Computational Linguistics (COLING'92), pages 1019–1023, Nantes, France.

Aravind Joshi and Yves Schabes. 1997. Tree-Adjoining Grammars. In G. Rozenberg and A. Salomaa, editors, Handbook of Formal Languages, volume 3, pages 69–124. Springer, Berlin.

Sylvain Kahane and Alain Polguère. 2001. Formal foundation of lexical functions. In Proceedings of ACL 2001, Toulouse.

François Lareau and Leo Wanner. 2007. Towards a generic multilingual dependency grammar for text generation. In Proceedings of Grammar Engineering Across Frameworks (GEAF'07), pages 203–223, Palo Alto.

Daniel Marcu, Lynn Carlson, and Maki Watanabe. 2000. An Empirical Study in Multilingual Natural Language Generation: What Should A Text Planner Do? In Proceedings of the 1st International Conference on Natural Language Generation (INLG'00), pages 17–23.

John T. Maxwell and Ronald M. Kaplan. 1993. The Interface between Phrasal and Functional Constraints. Computational Linguistics, 19(4):571–590.

Igor Mel’čuk. 1995. The future of the lexicon in linguistic description and the explanatory combinatorial dictionary. In I.-H. Lee, editor, Linguistics in the Morning Calm, volume 3. Hanshin, Seoul.

Rachel Nordlinger and Joan Bresnan. 2011. Lexical-Functional Grammar: interactions between morphology and syntax. In R. Borsley and K. Börjars, editors, Non-Transformational Syntax: Formal and Explicit Models of Grammar. Wiley-Blackwell, Chichester.

Cécile Paris, Keith Vander Linden, Markus Fischer, Anthony Hartley, Lyn Pemberton, Richard Power, and Donia Scott. 1995. A Support Tool for Writing Multilingual Instructions. In Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI'95), pages 1398–1404, Montreal.

Carl Pollard and Ivan Sag. 1994. Head-Driven Phrase Structure Grammar. University of Chicago Press, Chicago.

Anne-Marie Rassinoux, Robert H. Baud, Jean-Marie Rodrigues, Christian Lovis, and Antoine Geissbuhler. 2007. Coupling Ontology Driven Semantic Representation with Multilingual Natural Language Generation for Tuning International Terminologies. In Proceedings of the 12th World Congress on Health (Medical) Informatics (MEDINFO'07), pages 555–559, Brisbane.

Ehud Reiter and Robert Dale. 2000. Building Natural Language Generation Systems. Cambridge University Press.

Mark Steedman. 2000. The Syntactic Process. MIT Press.

Leo Wanner, editor. 1996. Lexical Functions in Lexicography and Natural Language Processing, volume 31 of Studies in Language Companion Series. John Benjamins, Amsterdam/Philadelphia.
