Guest Editor’s Note

This special issue of Machine Translation presents the primary results of a research project at Carnegie Mellon University’s Center for Machine Translation. The system described herein adopts and extends the knowledge-based approach to machine translation and in the current implementation is therefore known as ‘KBMT-89.’ It is referred to in this way throughout this issue.

The issue has two separately bound parts. Part I contains four papers which address the project’s knowledge bases. The first paper, “Knowledge-Based Machine Translation” (Nirenburg), places KBMT-89 in historical context, presents an overview of the project and gives an account of the system architecture and the relations among its components. It may be taken as an introduction to this issue and is best read first. The rest of Part I contains descriptions of the system’s primary knowledge sources. “Knowledge Representation Support” (Nirenburg and Levin) explains the central role of interlingua texts in the system and describes its linguistic framework and grammar formalisms. “Analysis and Generation Grammars” (Gates, Takeda, Mitamura, Levin and Kee) presents an account of the system’s grammars. Likewise, “Lexicons” (Gates, Haberlach, Kaufmann, Kee, McCardell, Mitamura, Monarch, Morrisson, Nirenburg, Nyberg, Takeda and Zabludowski) introduces the concept, analysis and generation lexicons in KBMT-89.

The papers in Part II give accounts of the system’s processing modules. “Analysis” (Morrisson, Kee and Goodman) describes the mapping rule interpreter and parser. “Augmentation” (Brown) introduces the work on automatic and interactive disambiguation. “Generation” (Nyberg, McCardell, Gates and Nirenburg) provides a description of KBMT-89’s target-language generation module.

While the project aimed at producing a unified machine translation system, the system’s various components lend themselves to discrete accounts and explications. So, while the articles that make up this issue are inherently interconnected, they broach varying theoretical issues and thus, to the degree possible, have been shaped as independent entities. Nevertheless, the seven articles have been edited in such a way that they will be most profitably read in sequence. Note especially that the papers “Analysis and Generation Grammars” and “Lexicons” are intimately related and cover some material that is also touched on, but from a different perspective, in the papers “Analysis” and “Generation.” These and similar connections are pointed out by footnotes where appropriate. It is hoped that such congruencies will ease the reader’s course and offer a more enjoyable presentation.

The KBMT-89 research was sponsored in large part by IBM Japan’s Tokyo Research Laboratory¹ and took for its corpus 150 sentences from each of two instruction manuals for personal computers.²

Here is a list of the special issue’s contributors and their affiliations:

Ralf Brown            Center for Machine Translation, School of Computer Science, Carnegie Mellon University (CMT)
Donna M. Gates        CMT
Dawn Haberlach        Department of Linguistics, University of Pittsburgh
Todd Kaufmann         CMT
Marion R. Kee         CMT and Joint Program in Computational Linguistics, Carnegie Mellon University and University of Pittsburgh
Lori Levin            CMT, KBMT-89 Associate Project Director
Rita McCardell        Computer Science Department, University of Maryland Baltimore County; U.S. Department of Defense; and the Baltimore Orioles, Inc.
Teruko Mitamura       CMT and Department of Linguistics, University of Pittsburgh
Ira A. Monarch        CMT and Advanced Computer Tutoring, Inc.
Stephen Morrisson     CMT
Sergei Nirenburg      CMT, KBMT-89 Project Director
Eric Nyberg 3rd       CMT and Joint Program in Computational Linguistics, Carnegie Mellon University and University of Pittsburgh
Koichi Takeda         Tokyo Research Laboratory, IBM Japan
Margalit Zabludowski  CMT and Joint Program in Computational Linguistics, Carnegie Mellon University and University of Pittsburgh

¹ Some individual researchers were supported by other academic and/or governmental entities in the United States and Japan.

² “Guide to Operations Manual” for IBM Personal Computer XT. 1983. Boca Raton, FL: IBM Corporation; and “IBM maruti-suteishon 5560 shisutemu sousa gaido” (“IBM multi-station 5560 system operation guide”). 1986. Tokyo: IBM Corporation.

Several stylistic conventions have been adopted and extended for the current number. To represent the content of knowledge sources and programs, we use a typewriter font. This is generally reserved for rules of grammar, mapping rules, frames, traces and the like. Note that material in this typeface appears both as upper-and-lower-case and all-upper-case. This distinction has no computational significance: Before the system is loaded and the main dictionaries are compiled, all entries are given in lower case; but the Common Lisp code makes no distinction between lower case and upper case. In the interests of a more pleasing typography, we have tried to constrain the use of all-upper-case text to extended examples and traces.

A convention adopted during creation of the concept lexicon is reproduced as consistently as possible in the text, namely, the use of an asterisk before the name of a concept frame. Thus, for instance, *remove is a frame, while remove could be a slot or a facet.
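As an illustration only, the two conventions just described can be mimicked in a few lines of Python. This sketch is not code from KBMT-89, and the function names `normalize` and `is_concept_frame` are hypothetical; it merely assumes that names are compared without regard to case and that a leading asterisk marks a concept frame.

```python
def normalize(name: str) -> str:
    """Fold case so that 'remove' and 'REMOVE' are treated as the same name."""
    return name.upper()


def is_concept_frame(name: str) -> bool:
    """A leading asterisk marks a concept frame, as in *remove."""
    return name.startswith("*")


# Case carries no significance: both spellings name the same frame.
print(normalize("*remove") == normalize("*REMOVE"))

# The asterisk, not the case, distinguishes a frame from a slot or facet.
print(is_concept_frame("*remove"))
print(is_concept_frame("remove"))
```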

Thanks are due Amy Paynter for preparing several of the figures and Lyn Jones and Joey Monaco for proofreading support.

A special debt of gratitude is owed to W. Scott Bennett, University of Texas at Austin; Victor Raskin, Purdue University; and Allen Tucker, Bowdoin College, for their comments on an earlier version of the material in this issue.

There is a sense in which any large research project is greater than the sum of its parts. Yet the making-public of results is often necessarily a piecemeal affair, and this can tend to obscure the thrust and accomplishments of the endeavor. This is a source of dismay to researchers who do interesting work on a large scale. The attempt in this special issue is to convey as complete a picture as possible, albeit in the format of individual journal articles, of one such extensive and interesting enterprise.

Kenneth Goodman

Center for Machine Translation, Carnegie Mellon University

Department of Philosophy, University of Miami