
Words, Languages & Combinatorics III



WORDS, LANGUAGES & COMBINATORICS III




Proceedings of the

International Conference

WORDS, LANGUAGES & COMBINATORICS III

Kyoto, Japan, 14-18 March 2000

Editors

Masami Ito
Kyoto Sangyo University, Japan

Teruo Imaoka
Shimane University, Japan

World Scientific · New Jersey · London · Singapore · Hong Kong


Published by

World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: Suite 202, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

WORDS, LANGUAGES & COMBINATORICS III
Proceedings of the Third International Colloquium

Copyright © 2003 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN 981-02-4948-9

Printed in Singapore.


Preface

The Third International Colloquium on Words, Languages and Combinatorics was held at Kyoto Sangyo University from March 14 to 18, 2000. The colloquium was a continuation of the previous two International Colloquiums on Words, Languages and Combinatorics held in Kyoto in 1990 and 1992. The colloquium was organized under the sponsorship of the Institute of Computer Science at Kyoto Sangyo University and with the financial support of the Asahi Glass Foundation and the Japan Society for Promotion of Science. The program committee consisted of the following members:

J. Almeida (U. Porto, Portugal), J. Brzozowski (U. Waterloo, Canada), C. Calude (U. Auckland, New Zealand), J. Dassow (U. Magdeburg, Germany), K. Denecke (U. Potsdam, Germany), V. Diekert (U. Stuttgart, Germany), F. Gécseg (U. Szeged, Hungary), T. Hall (Monash U., Australia), T. Head (Binghamton U., USA), J. Howie (U. St Andrews, UK), T. Imaoka (Shimane U., Japan), M. Ito (Kyoto Sangyo U., Japan, chair), H. Jürgensen (U. Western Ontario, Canada & U. Potsdam, Germany), J. Karhumäki (U. Turku, Finland), M. Katsura (Kyoto Sangyo U., Japan), S. Marcus (U. Bucharest, Romania), J. Meakin (U. Nebraska, USA), M. Nivat (U. Paris VI, France), Gh. Păun (Inst. of Mathematics, Romania), J. Reif (Duke U., USA), N. Reilly (Simon Fraser U., Canada), G. Rozenberg (U. Leiden, Netherlands), J. Sakarovitch (ENST, France), B. Schein (U. Arkansas, USA), G. Thierrin (U. Western Ontario, Canada), P. Trotter (U. Tasmania, Australia), Do Long Van (Inst. of Mathematics, Vietnam), M. Volkov (Ural State U., Russia)

The topics of the colloquium were: (a) semigroups, especially free monoids and finite transformation semigroups, (b) codes and cryptography, (c) automata, (d) formal languages, (e) varieties of semigroups and languages, (f) word problems, (g) word- and term-rewriting systems, (h) ordered structures and categories, (i) combinatorics on words, (j) complexity and computability, (k) molecular computing, especially DNA computing, (l) quantum computing.

The number of participants was 92, from 19 different countries. There were 69 lectures (5 plenary lectures among them) during the sessions.

The colloquium was arranged by the conference committee consisting of the following members:

P. Dömösi (U. Debrecen, Hungary), Z. Ésik (U. Szeged, Hungary), U. Knauer (U. Oldenburg, Germany), Y. Kobayashi (Toho U., Japan), T. Imaoka (Shimane U., Japan, co-chair), B. Imreh (U. Szeged, Hungary), M. Ito (Kyoto Sangyo U., Japan, co-chair), M. Katsura (Kyoto Sangyo U., Japan), M. Kudlek (U. Hamburg, Germany), C. Nehaniv (U. Hertfordshire, UK), F. Otto (U. Kassel, Germany), K. Shoji (Shimane U., Japan), K.P. Shum (Chinese U. of Hong Kong, Hong Kong)


This volume contains papers based on lectures given at the colloquium. All papers have been refereed. The editors express their gratitude to all contributors of this volume including the referees.

The organizers would like to express their thanks to the Institute of Computer Science, the Asahi Glass Foundation, the Japan Society for Promotion of Science and the World Scientific Publishing Company for providing the conditions to host the colloquium. We are also grateful to Ms. Yuki Yasuda, Ms. Miyuki Endo, Ms. Tomomi Hirai, Ms. Chikage Totsuka, Mr. Kenji Fujii, Mr. Taro Nakamura, Mr. Jun-ichi Nakanishi, Mr. Tetsuya Hirose and Mr. Ryo Sugiura for their help in realizing the colloquium. Finally, we would like to express our appreciation for the assistance of Mr. Christopher Everett during the editing procedure.

March 2003

Masami Ito Department of Mathematics Kyoto Sangyo University

Teruo Imaoka Department of Mathematics Shimane University


Scientific Program

March 14, 2000

Plenary lecture

10.00-10.50 Gh. Păun (Institute of Mathematics of the Romanian Academy),

P systems: An early survey

Section A

Invited lectures

11.00-11.40 J. Gruska (Masaryk University), Quantum challenges in automata theory

11.40-12.20 D.L. Van (Hanoi Institute of Mathematics), A unified approach to the embedding problem for codes defined by binary relations

12.20-13.00 N.H. Lam (Hanoi Institute of Mathematics), Finite maximal solid codes

Contributed lectures

14.30-15.00 D.Y. Long & W.J. Jia (City University of Hong Kong), A new symmetric crypto-algorithm based on prefix codes

15.00-15.30 V. Brattka (FernUniversität Hagen), The emperor's new recursiveness

16.00-16.30 A. Yamamura (Communications Research Laboratory) & K. Kurosawa (Tokyo Institute of Technology), Key agreement protocol over a commutative group

16.30-17.00 G. Horváth, K. Inoue, A. Ito & Y. Wang (Yamaguchi University), Closure property of probabilistic Turing machines and alternating Turing machines with sublogarithmic space


Section B

Invited lectures

11.00-11.40 M. Steinby (University of Turku), Tree automata in term rewriting theory

11.40-12.20 G. Niemann & F. Otto (Universität Kassel), Some results on deterministic restarting automata

12.20-13.00 E. Csuhaj-Varjú (Computer and Automation Research Institute, Hungarian Academy of Sciences) & A. Salomaa (Turku Centre for Computer Science), Networks of Watson-Crick D0L systems

Contributed lectures

14.30-15.00 G. Horváth (Yamaguchi University), Cs. Nagylaki (University of Debrecen) & Z. Nagylaki (Hiroshima University), Visualization of cellular automata

15.00-15.30 H. Nishio, Cellular automata with polynomials over finite fields

16.00-16.30 R. Schott (Université Henri Poincaré) & J.-C. Spehner (Université de Haute Alsace), Two optimal parallel algorithms on the commutation class of a word

16.30-17.00 I. Inata (Toho University), Presentations of right unitary submonoids of monoids

March 15, 2000

Plenary lecture

10.00-10.50 J. Meakin (University of Nebraska), One-relator inverse monoids and rational subsets of one-relator groups


Section A

Invited lectures

11.00-11.40 L. Kari (University of Western Ontario), Computation in cells

11.40-12.20 S.W. Margolis (Bar-Ilan University), J.-E. Pin (Université Paris 7) & M.V. Volkov (Ural State University), Words guaranteeing minimal image

12.20-13.00 C. Campbell (University of St Andrews), The semigroup efficiency of groups and finite simple semigroups

Contributed lectures

14.30-15.00 T. Buchholz, A. Klein & M. Kutrib (University of Giessen), Iterative arrays with limited nondeterministic communication cell

15.00-15.30 T. Saito, Acts over right, left regular bands and semilattices types

16.00-16.30 G. Mashevitzky (Ben Gurion University of the Negev), On definability of weighted circulants by identity

16.30-17.00 M. Ćirić & T. Petković (University of Niš), Syntactic and semantic properties of semigroup identities

Section B

Invited lectures

11.00-11.40 J. Karhumäki (University of Turku), Remarks on language equations

11.40-12.20 A. Mateescu (University of Bucharest), Routes and trajectories

12.20-13.00 J. Dassow (Otto-von-Guericke-Universität Magdeburg), On the differentiation function of some language generating devices

Contributed lectures

14.30-15.00 M. Ogawa (NTT Communication Science Laboratories), Well-quasi-orders and regular ω-languages


15.00-15.30 J.A. Anderson (University of South Carolina) & W. Foryś (Jagiellonian University), Regular languages and semiretracts

16.00-16.30 K. Hashiguchi, Y. Wada & S. Jimbo (Okayama University), Regular binoid expressions and regular binoid languages

16.30-17.00 K. Shoji (Shimane University), On a proof of Okniński and Putcha's theorem

March 16, 2000

Plenary lecture

10.00-10.50 J. Shallit (University of Waterloo), Number theory and formal languages

Section A

Invited lectures

11.00-11.40 K. Denecke (University of Potsdam), Tree-hyper recognizers and tree-hyper grammars

11.40-12.20 Z. Ésik (University of Szeged) & W. Kuich (Technische Universität Wien), Inductive *-semirings

12.20-13.00 V. Diekert & C. Hagenah (Universität Stuttgart), A remark on equations with rational constraints in free groups

Contributed lectures

14.30-15.00 F. Bassino (Université de Marne-la-Vallée), A characterization of cubic simple beta-numbers

15.00-15.30 T. Poomsa-ard (Khon Kaen University), Hyperidentities in medial graph algebras

15.30-16.00 T. Petković, M. Ćirić & S. Bogdanović (University of Niš), Nonregular varieties of automata


Section B

Invited lectures

11.00-11.40 S. Marcus (Institute of Mathematics of the Romanian Academy), From infinite words to languages and back: an expected itinerary

11.40-12.20 C. Choffrut & S. Grigorieff (Université Paris 7), Rational relations on transfinite strings

12.20-13.00 T. Yokomori (Waseda University), On approximate learning of DFAs

Contributed lectures

14.30-15.00 S. Konstantinidis (Saint Mary’s University), Error-detecting proper- ties of languages

15.00-15.30 M. Ito (Kyoto Sangyo University) & Y. Kunimochi (Shizuoka Institute of Science and Technology), On CPN languages

15.30-16.00 S.V. Avgustinovich, D.G. Fon-Der-Flaass & A.E. Frid (Sobolev Institute of Mathematics), Arithmetical complexity of infinite words

March 17, 2000

Plenary lecture

10.00-10.50 J. Almeida (University of Porto) & A. Escada (University of Coimbra), Semidirect products with the pseudovariety of all finite groups

Section A

Invited lectures

11.00-11.40 K.P. Shum (Chinese University of Hong Kong), On super Hamiltonian semigroups

11.40-12.20 G. Sénizergues (Université Bordeaux I), The equivalence problem for a subclass of Q-algebraic series


12.20-13.00 M. Ozawa (Nagoya University), Computational equivalence between quantum circuits and quantum Turing machines

Contributed lectures

14.30-15.00 O. Carton (Université de Marne-la-Vallée), R-trivial languages of words on countable ordinals

15.00-15.30 N. Ruškuc (University of St Andrews), Some (easy?) questions concerning semigroup presentations

16.00-16.30 B. Steinberg (University of Porto), Polynomial closure and topology

16.30-17.00 H. Machida (Hitotsubashi University), Some properties of hyperoperations and hyperclones

17.00-17.30 R. Matsuda (Ibaraki University), Characterization of valuation rings and valuation semigroups by semistar-operations

Section B

Invited lectures

11.00-11.40 A.V. Kelarev & P.G. Trotter (University of Tasmania), A combinatorial property of automata, languages and syntactic monoids

11.40-12.20 J. Sakarovitch (ENST), Star height of rational languages: a new presentation for two old results

12.20-13.00 P. Dömösi (University of Debrecen) & M. Kudlek (Universität Hamburg), An improvement of iteration lemmata for context-free languages

Contributed lectures

14.30-15.00 P. Dömösi (University of Debrecen), M. Kudlek (Universität Hamburg) & S. Okawa (University of Aizu), A homomorphic characterization of recursively enumerable languages

15.00-15.30 B. Imreh (University of Szeged), M. Ito (Kyoto Sangyo University) & A. Pukler (István Széchenyi College), On commutative asynchronous automata


16.00-16.30 T. Imaoka (Shimane University), Some remarks on representations of orthodox *-semigroups

16.30-17.00 Z. Popović, S. Bogdanović, M. Ćirić & T. Petković (University of Niš), On finite generalized directable automata

17.00-17.30 C. Choffrut (Université Paris 7), S. Horváth (Eötvös Loránd University) & M. Ito (Kyoto Sangyo University), Monoids and languages of transfinite words

March 18, 2000

Plenary lecture

10.20-11.10 J.-E. Pin (Université Paris VII) & P. Weil (Université Bordeaux I and CNRS), Semidirect products of ordered semigroups

Section A

Invited lectures

11.20-12.00 A. Atanasiu, C. Martin-Vide & V. Mitrana, On the sentence valuations in a semiring - An approach to the study of synonymy

12.00-12.40 K. Auinger (Universität Wien), Join decompositions involving pseudovarieties of semigroups with commuting idempotents

Contributed lectures

14.10-14.40 T. Koshiba (Telecommunications Advancement Organization of Japan) & K. Hiraishi (JAIST), A note on finding one-variable patterns consistent with examples and counterexamples

14.40-15.10 M. Yasugi & M. Washihara (Kyoto Sangyo University), Rademacher functions and computability


Section B

Invited lectures

11.20-12.00 J.-E. Pin (Université Paris VII) & P. Weil (Université Bordeaux I and CNRS), Semidirect products of ordered semigroups - Applications to languages

12.00-12.40 C. Mauduit (Institut de Mathématiques de Luminy), Pseudorandom words

Contributed lectures

14.10-14.40 E. Moriya & T. Tada (Waseda University), Relation between the space complexity and the number of stack-head turns of pushdown automata

14.40-15.10 M. Mitrović, S. Bogdanović & M. Ćirić (University of Niš), Iteration of matrix decompositions


List of Speakers

Almeida, J. (University of Porto) e-mail: [email protected]
Anderson, J.A. (University of South Carolina) e-mail: [email protected]
Auinger, K. (Universität Wien) e-mail: [email protected]
Bassino, F. (Université de Marne-la-Vallée) e-mail: [email protected]
Brattka, V. (FernUniversität Hagen) e-mail: [email protected]
Campbell, C. (University of St Andrews) e-mail: cmc@st-andrews.ac.uk
Carton, O. (Université de Marne-la-Vallée) e-mail: [email protected]
Choffrut, C. (Université Paris 7) e-mail: [email protected]
Ćirić, M. (University of Niš) e-mail: ciricm@bankerinter.net
Csuhaj-Varjú, E. (Computer and Automation Res. Inst., Hung. Academy) e-mail: [email protected]
Dassow, J. (Otto-von-Guericke-Universität Magdeburg) e-mail: [email protected]
Denecke, K. (University of Potsdam) e-mail: [email protected]
Diekert, V. (Universität Stuttgart) e-mail: [email protected]
Dömösi, P. (University of Debrecen) e-mail: [email protected]
Ésik, Z. (University of Szeged) e-mail: [email protected]


Frid, A.E. (Sobolev Institute of Mathematics) e-mail: [email protected]
Gruska, J. (Masaryk University) e-mail: gruska@informatics.muni.cz
Hashiguchi, K. (Okayama University) e-mail: [email protected]
Horváth, S. (Eötvös Loránd University) e-mail: [email protected]
Imaoka, T. (Shimane University) e-mail: imaoka@math.shimane-u.ac.jp
Imreh, B. (University of Szeged) e-mail: imreh@inf.u-szeged.hu
Inata, I. (Toho University) e-mail: [email protected]
Inoue, K. (Yamaguchi University) e-mail: [email protected]
Ito, M. (Kyoto Sangyo University) e-mail: [email protected]
Karhumäki, J. (University of Turku) e-mail: [email protected]
Kari, L. (University of Western Ontario) e-mail: [email protected]
Kelarev, A.V. (University of Tasmania) e-mail: [email protected]
Konstantinidis, S. (Saint Mary's University) e-mail: S.Konstantinidis@StMarys.ca
Koshiba, T. (Secure Computing Laboratory, Fujitsu Laboratories Ltd.) e-mail: koshiba@yokohama.tao.go.jp
Kudlek, M. (Universität Hamburg) e-mail: [email protected]
Kutrib, M. (University of Giessen) e-mail: [email protected]
Lam, N.H. (Hanoi Institute of Mathematics) e-mail: [email protected]


Long, D.Y. (City University of Hong Kong) e-mail: [email protected]
Machida, H. (Hitotsubashi University) e-mail: [email protected]
Marcus, S. (Institute of Mathematics of the Romanian Academy) e-mail: [email protected]
Mashevitzky, G. (Ben Gurion University of the Negev) e-mail: [email protected]
Mateescu, A. (University of Bucharest) e-mail: [email protected]
Matsuda, R. (Ibaraki University) e-mail: [email protected]
Mauduit, C. (Institut de Mathématiques de Luminy) e-mail: [email protected]
Meakin, J. (University of Nebraska) e-mail: [email protected]
Mitrana, V. (University of Bucharest) e-mail: [email protected]
Mitrović, M. (University of Niš) e-mail: meli@junis.ni.ac.yu
Moriya, E. (Waseda University) e-mail: [email protected]
Nagylaki, Z. (Hiroshima University) e-mail: nagylaki@bigfoot.com
Nishio, H. (Kyoto, Japan) e-mail: [email protected]
Ogawa, M. (NTT Communication Science Laboratories) e-mail: [email protected]
Otto, F. (Universität Kassel) e-mail: otto@flower.theory.informatik.uni-kassel.de
Ozawa, M. (Tohoku University) e-mail: [email protected]
Păun, Gh. (Institute of Mathematics of the Romanian Academy) e-mail: gp@astor.urv.es


Petković, T. (University of Niš and TUCS) e-mail: [email protected]
Pin, J.-E. (Université Paris VII) e-mail: [email protected]
Poomsa-ard, T. (Khon Kaen University) e-mail: [email protected]
Popović, Z. (University of Niš) e-mail: [email protected]
Ruškuc, N. (University of St Andrews) e-mail: [email protected]
Saito, T. (Innoshima, Japan) e-mail: [email protected]
Sakarovitch, J. (ENST) e-mail: [email protected]
Schott, R. (Université Henri Poincaré) e-mail: [email protected]
Sénizergues, G. (Université Bordeaux I) e-mail: [email protected]
Shallit, J. (University of Waterloo) e-mail: shallit@graceland.math.uwaterloo.ca
Shoji, K. (Shimane University) e-mail: [email protected]
Shum, K.P. (Chinese University of Hong Kong) e-mail: [email protected]
Steinberg, B. (University of Porto) e-mail: [email protected]
Steinby, M. (University of Turku) e-mail: [email protected]
Van, D.L. (Hanoi Institute of Mathematics) e-mail: dlvan@thevinh.ncst.ac.vn
Volkov, M.V. (Ural State University) e-mail: [email protected]
Weil, P. (Université Bordeaux I and CNRS) e-mail: Weil@labri.u-bordeaux.fr


Yamamura, A. (Communications Research Laboratory) e-mail: [email protected]
Yasugi, M. (Kyoto Sangyo University) e-mail: yasugi@cc.kyoto-su.ac.jp
Yokomori, T. (Waseda University) e-mail: [email protected]


Table of Contents

Contributed Papers

Semidirect Products with the Pseudovariety of All Finite Groups . . . . 1 J. Almeida (Porto, Portugal) and A. Escada (Coimbra, Portugal)

On the Sentence Valuations in a Semiring . . . . . . . . . . . . . 22 A. Atanasiu (Bucharest, Romania), C. Martin-Vide (Tarragona, Spain) and V. Mitrana (Bucharest, Romania)

Join Decompositions of Pseudovarieties of the Form DH ∩ ECom . . . 40 K. Auinger (Wien, Austria)

Arithmetical Complexity of Infinite Words . . . . . . . . . . . . . 51 S. V. Avgustinovich (Novosibirsk, Russia), D. G. Fon-Der-Flaass (Novosibirsk, Russia) and A. E. Frid (Novosibirsk, Russia)

The Emperor’s New Recursiveness: The Epigraph of the Exponential Function in Two Models of Computability . . . . . . . . . . . . . 63

V. Brattka (Hagen, Germany)

Iterative Arrays with Limited Nondeterministic Communication Cell . 73 T. Buchholz (Giessen, Germany), A. Klein (Giessen, Germany) and M. Kutrib (Giessen, Germany)

R-Trivial Languages of Words on Countable Ordinals . . . . . . . . 88 O. Carton (Marne-la-Vallée, France)

The Theory of Rational Relations on Transfinite Strings . . . . . . . 103 C. Choffrut (Paris, France) and S. Grigorieff (Paris, France)

Networks of Watson-Crick D0L Systems . . . . . . . . . . . . . . 134 E. Csuhaj-Varjú (Budapest, Hungary) and A. Salomaa (Turku, Finland)

On the Differentiation Function of Some Language Generating Devices 151 J. Dassow (Magdeburg, Germany)


Visualization of Cellular Automata . . . . . . . . . . . . . . . . 162 M. Deminy (Debrecen, Hungary), G. Horváth (Debrecen, Hungary), Cs. Nagylaki (Debrecen, Hungary) and Z. Nagylaki (Debrecen, Hungary)

On a Class of Hypercodes . . . . . . . . . . . . . . . . . . . . 171 Do Long Van (Hanoi, Vietnam)

A Parsing Problem for Context-Sensitive Languages . . . . . . . . . 183 P. Dömösi (Debrecen, Hungary) and M. Ito (Kyoto, Japan)

An Improvement of Iteration Lemmata for Context-Free Languages . . 185 P. Dömösi (Debrecen, Hungary) and M. Kudlek (Hamburg, Germany)

Quantum Finite Automata . . . . . . . . . . . . . . . . . . . . 192 J. Gruska (Brno, Czech Republic) and R. Vollmar (Karlsruhe, Germany)

On Commutative Asynchronous Automata . . . . . . . . . . . . . 212 B. Imreh (Szeged, Hungary), M. Ito (Kyoto, Japan) and A. Pukler (Győr, Hungary)

Presentations of Right Unitary Submonoids of Monoids . . . . . . . 222 I. Inata (Funabashi, Japan)

A Combinatorial Property of Languages and Monoids . . . . . . . . 228 A. V. Kelarev (Hobart, Australia) and P. G. Trotter (Hobart, Australia)

Error-Detecting Properties of Languages . . . . . . . . . . . . . . 240 S. Konstantinidis (Halifax, Canada)

A Note on Finding One-Variable Patterns Consistent with Examples and Counterexamples . . . . . . . . . . . . . . . . . . . . . . 253

T. Koshiba (Kawasaki, Japan) and K. Hiraishi (Ishikawa, Japan)

On the Star Height of Rational Languages: A New Presentation for Two Old Results . . . . . . . . . . . . . . . . . . . . . . . . 266

S. Lombardy (Paris, France) and J. Sakarovitch (Paris, France)

Some Properties of Hyperoperations and Hyperclones . . . . . . . . 286

H. Machida (Kunitachi, Japan)


Words Guaranteeing Minimal Image . . . . . . . . . . . . . . . . 297 S. W. Margolis (Ramat Gan, Israel), J.-E. Pin (Paris, France) and M. V. Volkov (Ekaterinburg, Russia)

Power Semigroups and Polynomial Closure . . . . . . . . . . . . . 311 S. W. Margolis (Ramat Gan, Israel) and B. Steinberg (Porto, Portugal)

Routes and Trajectories . . . . . . . . . . . . . . . . . . . . . 323 A. Mateescu (Bucharest, Romania)

Characterization of Valuation Rings and Valuation Semigroups by Semistar-Operations . . . . . . . . . . . . . . . . . . . . . . 339 R. Matsuda (Mito, Japan)

Further Results on Restarting Automata . . . . . . . . . . . . . . 352 G. Niemann (Kassel, Germany) and F. Otto (Kassel, Germany)

Cellular Automata with Polynomials over Finite Fields . . . . . . . 370 H. Nishio (Kyoto, Japan)

Generalized Directable Automata . . . . . . . . . . . . . . . . . 378 Z. Popović (Niš, Serbia), S. Bogdanović (Niš, Serbia), T. Petković (Turku, Finland) and M. Ćirić (Niš, Serbia)

Acts over Right, Left Regular Bands and Semilattices Types . . . . . 396 T. Saito (Innoshima, Japan)

Two Optimal Parallel Algorithms on the Commutation Class of a Word . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403 R. Schott (Nancy, France) and J.-C. Spehner (Mulhouse, France)

A Proof of Okniński and Putcha's Theorem . . . . . . . . . . . . . 420 K. Shoji (Matsue, Japan)

Subdirect Product Structure of Left Clifford Semigroups . . . . . . . 428 K. P. Shum (Hong Kong, China), M. K. Sen (Calcutta, India) and Y. Q. Guo (Kunming, China)

Tree Automata in the Theory of Term Rewriting . . . . . . . . . . 434 M. Steinby (Turku, Finland)


Key Agreement Protocol Securer Than DLOG . . . . . . . . . . . 450 A. Yamamura (Tokyo, Japan) and K. Kurosawa (Hitachi, Japan)

A Note on Rademacher Functions and Computability . . . . . . . . 466 M. Yasugi (Kyoto, Japan) and M. Washihara (Kyoto, Japan)

Authors Index . . . . . . . . . . . . . . . . . . . . . . . . . 477


Semidirect products with the pseudovariety of all finite groups*

Jorge Almeida and Ana Escada

Abstract

This is a survey of recent results related to semidirect products of an arbitrary pseudovariety with the pseudovariety of all finite groups. The main flavour is the establishment of links between various operators on pseudovarieties, some obviously computable, others known not to be so. This not only leads to decidability results but does so in a sort of uniform way which has a structural tint even though the arguments are mostly syntactical.

1 Introduction

Many problems in computer science lead to decidability questions on pseudovarieties of finite semigroups. Often the problem involves some sort of decomposition process which in terms of pseudovarieties translates to the calculation of a semidirect product of pseudovarieties. When just two factors are concerned, the cases in which the second factor is the pseudovariety G of all finite groups or the pseudovariety D of all finite definite semigroups have attracted the most attention [18, 3, 22, 43, 46].

This paper is a survey of some recent work around the theme of the semidirect product with the pseudovariety G. It uses some powerful tools to deal with such semidirect products, particularly when the second factor is the pseudovariety G, to obtain syntactic proofs of equalities of the form V * G = EV, where EV denotes the pseudovariety consisting of all finite semigroups

*The authors gratefully acknowledge support by FCT through the Centro de Matemática da Universidade do Porto and the Centro de Matemática da Universidade de Coimbra, respectively, and by the FCT and POCTI approved project POCTI/32817/MAT/2000 which is comparticipated by the European Community Fund FEDER.


whose idempotents generate subsemigroups from V. Subpseudovarieties V of DS are considered, including all subpseudovarieties of LI, DA, DS itself, and J. The latter of these provides a new proof of a crucial step in a result of Henckell and Rhodes [23] which is their deduction from Ash's inevitability theorem [13] of the famous equality PG = BG between the pseudovariety generated by all power semigroups of finite groups and the pseudovariety of all finite semigroups in which regular elements have a unique inverse [33].

The arguments are of a syntactical/combinatorial nature. They consist in suitable formal manipulations of words in the enlarged signature with a pseudo-inversion operation which is never nested. Most proofs are only sketched here. See the full paper [6] for further details.

2 Generalities

We gather in this section the necessary notation and background for the remainder of the paper. The reader is referred to [32] for a basic introduction to finite semigroup theory and to [3] for a more comprehensive treatment based on methods which are closer to those adopted here. See also these references for any undefined terms.

By a pseudovariety we mean a class of finite semigroups which contains all homomorphic images, subsemigroups, and finite direct products of members of the class. The most active and successful area of finite semigroup theory is precisely the study of pseudovarieties, particularly some natural operations on them such as the semidirect product. Such operations are often obtained by applying some natural algebraic operator to semigroups from the argument pseudovarieties and closing up to the generated pseudovariety.

2.1 Various operators on pseudovarieties

Let V and W be pseudovarieties.

The semidirect product pseudovariety V * W is defined to be the pseudovariety generated by all semidirect products S * T with S ∈ V and T ∈ W. It turns out that rather than using general semidirect products one may use specifically the wreath product which, in a suitable context, is associative, and so the semidirect product of pseudovarieties is also associative.
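For the reader's convenience, the standard construction behind a semidirect product S * T of semigroups can be sketched as follows (this display is ours, not quoted from the paper): given a left action of T on S by endomorphisms, written t · s, the semidirect product is the set S × T with multiplication

```latex
% Standard definition, sketched for convenience (not quoted from the paper):
\[
  (s_1, t_1)\,(s_2, t_2) \;=\; \bigl( s_1\,(t_1 \cdot s_2),\; t_1 t_2 \bigr),
  \qquad s_1, s_2 \in S,\quad t_1, t_2 \in T.
\]
```

The wreath product mentioned above is the special case where S is replaced by a direct power of itself indexed by T, with T acting by translation on the index.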

A few other operators will play a role in this paper. The join V ∨ W is simply the pseudovariety generated by the class V ∪ W or, to use an algebraic operator, by all direct products S × T with S ∈ V and T ∈ W.

Denote by EV the class of all finite semigroups S whose idempotents


generate a subsemigroup which lies in V. Note that the operator E is idempotent.

The Mal'cev product V ⓜ W is the pseudovariety generated by all finite semigroups S for which there is a homomorphism φ : S → T with T ∈ W and φ⁻¹(e) ∈ V for every idempotent e ∈ T. As indicated below, the Mal'cev product has important links with the semidirect product.

The power operator P associates with V the pseudovariety PV generated by all power semigroups P(S) with S ∈ V. See [3] for an extensive study of this operator and [19, 20] for recent improvements and extensions.

Let S be a finite semigroup and D one of its regular D-classes. Let ∼ be the equivalence relation on the set of group elements of D generated by the identification of elements which are either R- or L-equivalent. A block of D is the Rees quotient of the subsemigroup of S generated by a ∼-class modulo the ideal consisting of the elements which do not lie in D. The blocks of S are the blocks of its regular D-classes. The block operator associates with V the class of all finite semigroups whose blocks lie in V, which can be shown to be a pseudovariety.

For a semigroup S, denote by E(S) the set of its idempotents. The local operator L is defined by letting LV consist of all finite semigroups S all of whose submonoids of the form eSe, with e ∈ E(S), lie in V. Note that L is also an idempotent operator.
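To make the sets E(S) and eSe concrete, here is a small brute-force sketch (the multiplication-table representation and all names are ours, not from the paper): a finite semigroup is given as a table over elements 0..n-1, and we enumerate its idempotents and local submonoids.

```python
# Illustrative sketch only: the representation (a multiplication table over
# elements 0..n-1) and the helper names are our own, not the paper's.

def idempotents(mul):
    """E(S): the elements e with e*e = e."""
    return {e for e in range(len(mul)) if mul[e][e] == e}

def local_submonoid(mul, e):
    """The local submonoid eSe = {e*s*e : s in S} at an idempotent e."""
    return {mul[mul[e][s]][e] for s in range(len(mul))}

# Example: the two-element semilattice {0, 1} under min.
MIN = [[0, 0],
       [0, 1]]

print(sorted(idempotents(MIN)))         # [0, 1]
print(sorted(local_submonoid(MIN, 0)))  # [0]    -- the trivial monoid
print(sorted(local_submonoid(MIN, 1)))  # [0, 1] -- all of S
```

So S lies in LV exactly when every such set eSe, as a monoid with identity e, belongs to V; for the semilattice above both local submonoids are commutative bands, as expected.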

The class DV is defined to consist of all finite semigroups whose regular D-classes are subsemigroups which lie in V. It is again easy to see that DV is a pseudovariety.

Among the above operators, which do not exhaust those of interest for the applications, some have explicit structural definitions while others involve taking the pseudovariety generated by a subclass which itself is defined explicitly in that sense. Note that membership in classes with such explicit structural definitions is relatively easy to test and, in particular, can be done algorithmically. Call a class of finite semigroups decidable if there is an algorithm to test membership in it. It is by no means obvious how to construct an algorithm for the pseudovariety generated by a decidable class. In fact, this task is not always possible. More precisely, the join [1] and the semidirect and Mal'cev products [37] of decidable pseudovarieties may not be decidable. Recently, Auinger and Steinberg [15] have announced that the power operator also fails to preserve decidability.

Thus, any connections which may be found between operators defined by generators and structurally defined operators are particularly useful and often translate in an elegant manner into algorithms for computing values of the former. We review below some such connections which are of interest for the specific topic of this paper. For this purpose, we need to introduce some further ideas and results.

2.2 Semidirect products with D

Tilson [46] introduced the notions of pseudovariety of (finite) categories and of pseudovariety of (finite) semigroupoids (categories without the requirement of local identities) and he showed that one is led to considering them by studying semidirect products of pseudovarieties of semigroups. In this context, semigroups are seen as semigroupoids by viewing elements as edges (or morphisms) at a virtual single vertex (or object). On the other hand, the edges of a semigroupoid with both ends at a particular vertex v, assuming there is at least one, form a semigroup which is called the local semigroup at v. See Tilson's work (and the recent continuation [41]) for precise definitions and results.

It is well known that pseudovarieties of semigroups may be defined by formal equalities between members of free profinite semigroups (this is basically Reiterman's theorem [2, 35]; see [10] for a presentation in this language), which are called pseudoidentities. For pseudovarieties of semigroupoids there is an analogous result where free profinite semigroupoids freely generated by finite graphs play the role of the free profinite semigroups on sets of generators [11, 26]. Thus, pseudoidentities for pseudovarieties of semigroupoids are written over finite graphs. See Theorems 2.4 and 2.6 for specific examples.

The global gV of a pseudovariety V of semigroups is the pseudovariety of semigroupoids generated by V. We say that V is local if gV is defined by pseudoidentities over 1-vertex graphs.

Given a set Σ of pseudoidentities, [Σ] denotes the class of all finite semigroups which satisfy all members of Σ. In such pseudoidentities we adopt the convention that e, f, ... stand for idempotents and 0 for a zero. So, for instance, a semigroup satisfies the pseudoidentity ex = xe if and only if its idempotents commute with all elements.

Consider some frequently encountered pseudovarieties of semigroups, where

Sl = {finite semilattices} = [x² = x, xy = yx]
B = {finite bands} = [x² = x]
O = {finite orthodox semigroups} = [(ef)² = ef]
Com = {finite commutative semigroups}
R = {finite R-trivial semigroups}
L = {finite L-trivial semigroups}
J = {finite J-trivial semigroups} = R ∩ L


D = {finite semigroups in which idempotents are right zeros}
K = {finite semigroups in which idempotents are left zeros}
N = {finite nilpotent semigroups} = K ∩ D = [e = 0]
G = {finite groups}
D_n = [y x₁⋯x_n = x₁⋯x_n]
K_n = [x₁⋯x_n y = x₁⋯x_n]
Ab = {finite Abelian groups}
LG = {finite left groups} = D₁ ∨ G = [ex = x]
RG = {finite right groups} = K₁ ∨ G = [xe = x]
CS = {finite simple semigroups}
CR = {finite completely regular semigroups} = {finite unions of groups}
A = {finite aperiodic semigroups}
S = {finite semigroups}
I = {singleton semigroups}

We warn the reader that the notation O is often found in the literature with a different meaning [24].

A pseudovariety of semigroups is said to be monoidal if it is generated by its monoids. The interest of the notion of a local pseudovariety of semigroups comes from the following result which is a simplified version of the so-called delay theorem.

Theorem 2.1 ([46]). A monoidal pseudovariety V of semigroups is local if and only if V * D = LV.

Theorem 2.2 ([17]). The pseudovariety Sl is local.

Theorem 2.3 ([18, 42]). The pseudovariety R is local (therefore so is L).

Theorem 2.4 ([30]). The pseudovariety J is not local; in fact,

gJ = [(xy)^ω x t (zt)^ω = (xy)^ω (zt)^ω]

where the pseudoidentity is written over the graph with two vertices, edges x and z from the first vertex to the second, and edges y and t in the opposite direction.

Theorem 2.5 ([43]). As pseudovarieties of monoids, nontrivial pseudova- rieties of groups are local.

Theorem 2.6 ([45]). The pseudovariety Com is not local; in fact,

gCom = [xyz = zyx]

where the pseudoidentity is written over the graph with two vertices, edges x and z from the first vertex to the second, and edge y in the opposite direction.

An orthogroup is an orthodox completely regular semigroup. (See Subsection 2.4 for a definition of the ω-power.)


Theorem 2.7 ([27]). Every monoidal pseudovariety of orthogroups which does not consist entirely of groups is local.

Theorem 2.8 ([28]). The pseudovariety DS is local.

Theorem 2.9 ([4]). The pseudovariety DA is local.

2.3 Semidirect products - * H vs other operators

It is a well-known result that, for a pseudovariety V,

V * G ⊆ V ⓜ G ⊆ EV   (1)

and the first inclusion is an equality if V is local and monoidal.

Theorem 2.10 ([42, 18]). R * G = ER.

Theorem 2.11 ([12]). Sl * G = ESl.

Theorem 2.12 ([16]). The equality V ⓜ G = EV holds for every nontrivial monoidal pseudovariety V of bands.

Combining this with Theorem 2.7, we deduce that the equality V * G = EV holds for every nontrivial monoidal pseudovariety of bands. In the same paper the authors of Theorem 2.12 claim to establish the following result, but their arguments are flawed. Based on results of Szendrei [44], P. Trotter has shown in unpublished work that the result is indeed true.

Theorem 2.13. CR * G = ECR.

The following is a combination of results of Margolis and Pin [31] and Henckell and Rhodes [23], the latter depending on a deep theorem of Ash [13] which is presented further below.

Theorem 2.14. PG = J * G = J ⓜ G = EJ = BG.

A pseudovariety of groups V is said to be arborescent if (V ∩ Ab) * V = V. Gildenhuys and Ribes [21] had shown that, for such pseudovarieties, their free profinite groups have Cayley graphs which are profinite trees (in a natural homological sense). Almeida and Weil [9] in turn showed that the converse also holds. The following result is a combination of parts of results of Steinberg [38, 39], who has done very extensive work on pseudovarieties of the form V * H and related pseudovarieties.

Theorem 2.15. For an arborescent pseudovariety H of groups, PH = J * H = J ⓜ H.


By further refining the study of the geometry of Cayley graphs of relatively free profinite groups, Auinger and Steinberg [14] have recently obtained a characterization of all pseudovarieties of groups for which the equality J * H = J ⓜ H holds.

Theorem 2.16 ([25]). The inequality J ⓜ H ≠ BH holds for every pseudovariety H ⊊ G closed under extension.

Note that BH ≠ EV for every pseudovariety H ⊊ G and every pseudovariety V, since G ⊆ EI ⊆ EV but G ⊄ BH.

The situation is, however, different from that of G. From a result of Karnofsky and Rhodes [29] it follows that

A * G ⊊ EA.

From work of Rhodes [36] it follows that V = A * G is also an example in which both inclusions in (1) are strict and in fact EV = EA.

For a pseudovariety H of groups, H̄ denotes the class of all finite semigroups all of whose subgroups lie in H, which is easily shown to be a pseudovariety.

Theorem 2.17 ([47]). For V = CS ∩ Ab̄, we have the following strict inclusions:

V * G ⊊ V ⓜ G ⊊ EV.

Among the various questions suggested by the above results, we consider in this paper the following problem.

Problem. For which pseudovarieties V do we have V * G = EV?

At this point, we add a couple of elementary observations related with the above problem.

1. The pseudovariety EV is the largest pseudovariety W such that W * G ⊆ EV.

2. We have EDO ⊆ EA [6].

3. If V ⊆ EA and (V ∩ A) * G = E(V ∩ A), then V * G = EV.

2.4 (ω - 1)-words

By an (ω - 1)-word we mean a term in a free unary semigroup where the unary operation is denoted (_)^{ω-1}. The height h(w) of an (ω - 1)-word w is 0 if w does not involve the operation (_)^{ω-1}, and is recursively defined by letting h((w)^{ω-1}) = h(w) + 1 and h(w₁w₂) = max{h(w₁), h(w₂)}.


The natural interpretation of the unary operation (_)^{ω-1} in a finite semigroup S associates with an element s the inverse s^{ω-1} of se in the maximal subgroup K of the subsemigroup generated by s, where s^ω = e is the idempotent of K.
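To make this interpretation concrete, here is a small illustration of our own (not part of the paper): computing s^ω and s^{ω-1} for a transformation s of a finite set, viewed as an element of the finite semigroup of all transformations under composition. The function names are ours.

```python
# Illustration (ours): s^ω and s^(ω-1) for a transformation s of {0,...,n-1}.
# The powers of s are eventually periodic; s^ω is the unique idempotent
# power, and s^(ω-1) is the inverse of s·s^ω in the cycle (the maximal
# subgroup of the subsemigroup generated by s).

def mul(f, g):
    """Compose transformations stored as tuples: apply f, then g."""
    return tuple(g[x] for x in f)

def omega_powers(s):
    """Return the pair (s^ω, s^(ω-1))."""
    powers, seen = [s], {s: 1}           # powers[j] = s^(j+1)
    while True:
        nxt = mul(powers[-1], s)
        if nxt in seen:                  # s^(k+1) = s^i: index i, period c
            i, c = seen[nxt], len(powers) + 1 - seen[nxt]
            break
        seen[nxt] = len(powers) + 1
        powers.append(nxt)
    n = i                                # smallest n >= i with n ≡ 0 (mod c)
    while n % c:
        n += 1
    m = n - 1 if n - 1 >= i else n - 1 + c   # m ≡ -1 (mod c), inside the cycle
    return powers[n - 1], powers[m - 1]

s = (1, 2, 3, 3, 1)                      # 0→1, 1→2, 2→3, 3→3, 4→1
e, t = omega_powers(s)
assert mul(e, e) == e                    # s^ω is idempotent
assert mul(t, s) == e                    # s^(ω-1) · s = s^ω
assert mul(mul(t, s), t) == t            # s^(ω-1) is a weak inverse of s
```

For a permutation the cycle starts immediately, so s^{ω-1} is the ordinary group inverse of s.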

The following constitute a Noetherian system of reduction rules which preserve equality in the free group:

a(wa)^{ω-1} → (w)^{ω-1},   (aw)^{ω-1}a → (w)^{ω-1},   (1)^{ω-1} → 1.

The system is not confluent since, for instance, the (ω - 1)-word (ab)^{ω-1}a(ca)^{ω-1} may be reduced to b^{ω-1}(ca)^{ω-1} and also to (ab)^{ω-1}c^{ω-1}, but both of these are irreducible (ω - 1)-words. For words that reduce to 1, there is a more convenient set of reduction rules.
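The non-confluence can be replayed mechanically. The following sketch of ours (the encoding and function names are our assumptions, not from the paper) represents a height-at-most-1 word as a sequence of letters and blocks (w)^{ω-1}, applies the three rules, and enumerates all irreducible results.

```python
# A small experiment (ours) with the reduction rules
#   a(wa)^(ω-1) -> (w)^(ω-1),  (aw)^(ω-1)a -> (w)^(ω-1),  (1)^(ω-1) -> 1
# on (ω-1)-words of height at most 1.
# Items: ('L', x) is a letter x; ('I', w) is the block (w)^(ω-1), w a plain word.

def steps(word):
    """Yield all one-step reducts of a height-<=1 (ω-1)-word."""
    for j, it in enumerate(word):
        if it[0] != 'I':
            continue
        w = it[1]
        if not w:                                         # (1)^(ω-1) -> 1
            yield word[:j] + word[j+1:]
            continue
        if j > 0 and word[j-1] == ('L', w[-1]):           # a(wa)^(ω-1) rule
            yield word[:j-1] + (('I', w[:-1]),) + word[j+1:]
        if j + 1 < len(word) and word[j+1] == ('L', w[0]):  # (aw)^(ω-1)a rule
            yield word[:j] + (('I', w[1:]),) + word[j+2:]

def normal_forms(word):
    reducts = set(steps(word))
    if not reducts:
        return {word}
    return set().union(*(normal_forms(r) for r in reducts))

# (ab)^(ω-1) a (ca)^(ω-1) has two distinct irreducible reducts,
# so the system is not confluent.
w = (('I', 'ab'), ('L', 'a'), ('I', 'ca'))
print(sorted(normal_forms(w)))
```

The two printed normal forms are exactly b^{ω-1}(ca)^{ω-1} and (ab)^{ω-1}c^{ω-1} from the text.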

Lemma 2.18. Let w be an (ω - 1)-word of height at most 1 which is equal to 1 in the free group. Then it is possible to reduce w to the empty word by applying a finite number of times rules of the form

u(vu)^{ω-1}v → 1   (2)

where u and v are possibly empty words.

Lemma 2.19 (Product Inverse Formula). In a finite semigroup, the following formula holds:

(a₁a₂⋯a_n)^{ω-1} = a′_n a′_{n-1} ⋯ a′_1,

where

a′_i = (a_{i+1} ⋯ a_n a_1 ⋯ a_i)^{ω-1} a_{i+1} ⋯ a_n a_1 ⋯ a_{i-1}.   (3)
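The formula can be checked numerically. The following script of ours (an illustration under our own encoding, not from the paper) verifies it over many tuples in the monoid of all transformations of a 4-element set.

```python
# A numerical sanity check (ours) of the Product Inverse Formula in the
# monoid of transformations of {0, 1, 2, 3} under composition.

from itertools import product

def mul(f, g):
    return tuple(g[x] for x in f)

def prod(elts):
    r = elts[0]
    for e in elts[1:]:
        r = mul(r, e)
    return r

def om1(s):
    """s^(ω-1) computed as a concrete power of s."""
    seen, p, k = {}, s, 1
    while p not in seen:
        seen[p], p, k = k, mul(p, s), k + 1
    i, c = seen[p], k - seen[p]          # index i, period c
    m = i
    while (m + 1) % c:                   # m >= i with m ≡ -1 (mod c)
        m += 1
    return prod([s] * m)

def a_prime(a, j):
    """a'_j as in (3), 0-based: rot is the cyclic rearrangement ending at a[j]."""
    rot = a[j+1:] + a[:j+1]
    head = om1(prod(rot))
    tail = rot[:-1]
    return head if not tail else mul(head, prod(tail))

elems = [(1, 0, 2, 2), (1, 2, 3, 0), (0, 0, 1, 3), (2, 2, 2, 2)]
count = 0
for n in (2, 3):
    for a in product(elems, repeat=n):
        lhs = om1(prod(list(a)))
        rhs = prod([a_prime(list(a), j) for j in reversed(range(n))])
        assert lhs == rhs
        count += 1
print("Product Inverse Formula checked on", count, "tuples")
```

In a group each a′_i collapses to a conjugated copy of a_i^{-1}, so the formula reduces to the usual rule for inverting a product.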

Let S be a finite semigroup and let s ∈ S. By a weak inverse of s we mean an element t ∈ S such that tst = t. Note that s^{ω-1} is the only power of s which is a weak inverse of s. Also, the element a′_i of Lemma 2.19 given by (3) is a weak inverse of a_i. A weak conjugate of s is an element of the form asb where one of a and b is a weak inverse of the other. The self-conjugate core D(S) is defined to be the smallest subsemigroup of S containing the idempotents which is closed under weak conjugation.
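Since D(S) is defined by a closure, it can be computed directly by saturation. The following sketch of ours (the realization of B₂ by matrix units and the variable names are our choices) does this for the 5-element semigroup B₂; by Theorem 2.22 below, the result is also the group kernel K(B₂).

```python
# A direct computation (ours) of the self-conjugate core D(S) from its
# definition, for B2 realized as 2x2 matrices over Z/2Z: close the set of
# idempotents under products and weak conjugation.

from itertools import product

def mul(a, b):
    """Product of 2x2 matrices over Z/2Z, stored as row-major 4-tuples."""
    return tuple(sum(a[2*i + k] * b[2*k + j] for k in range(2)) % 2
                 for i in range(2) for j in range(2))

# generate B2 from the matrix units E12 and E21
S = {(0, 1, 0, 0), (0, 0, 1, 0)}
while True:
    new = {mul(a, b) for a, b in product(S, repeat=2)} - S
    if not new:
        break
    S |= new
assert len(S) == 5

idem = {e for e in S if mul(e, e) == e}
# pairs (a, b) with b a weak inverse of a (bab = b)
weak = {(a, b) for a in S for b in S if mul(mul(b, a), b) == b}

D = set(idem)
while True:
    new = {mul(x, y) for x in D for y in D}
    new |= {mul(mul(a, s), b) for (a, b) in weak for s in D}   # a s b
    new |= {mul(mul(b, s), a) for (a, b) in weak for s in D}   # b s a
    if new <= D:
        break
    D |= new

print(len(D), D == idem)   # for B2, D(S) is exactly the set of idempotents
```

The same saturation works for any finite semigroup given by its multiplication.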

Corollary 2.20. Let S be a finite semigroup and let a_i ∈ S and u_i ∈ D(S) ∪ {1} (i = 1, …, n). Then the product

a₁u₁ ⋯ a_r u_r (a_{r+1} ⋯ a_n a_1 ⋯ a_r)^{ω-1} u_{r+1} a_{r+1} ⋯ u_n a_n

lies in D(S).


By a relational morphism τ : S → T between two semigroups we mean a relation with domain S which is a subsemigroup of S × T.

We shall call a graph what is usually called a directed multigraph, i.e., edges are directed and there may be several edges with the same end vertices. An edge-labeling of a graph by a semigroup S is a function which associates with each edge an element of the semigroup. An edge-labeling λ of a graph by a group is said to commute if, for every circuit (e₁, …, e_n) (which is turned into an oriented cycle if some of the edges in it are reversed), we have the equality (λe₁)^{ε₁} ⋯ (λe_n)^{ε_n} = 1, where ε_i = -1 or ε_i = 1 according to whether the edge e_i is reversed in the circuit or not. For a pseudovariety H of groups, an edge-labeling λ of a finite graph Γ by a finite semigroup S is said to be H-inevitable if, for every relational morphism τ : S → G into G ∈ H, there is a commuting edge-labeling of Γ by G which is τ-related with λ.

We may now formulate Ash's inevitability theorem [13] as follows, taking into account some observations in [8]. Denote by Ω_A S the free profinite semigroup freely generated by a set A.

Theorem 2.21. An edge-labeling λ of a finite graph Γ by a finite semigroup S is G-inevitable if and only if, for every (or for some) onto homomorphism η : Ω_A S → S, there is an edge-labeling μ of Γ by (ω - 1)-words of height at most 1 such that η ∘ μ = λ and μ commutes over G.

In the terminology of J. Rhodes, an element s of a finite semigroup S is called a type II element if, for every relational morphism τ : S → G into a finite group, (s, 1) ∈ τ. In other words, the 1-vertex, 1-edge graph labeled s is G-inevitable. The set of all type II elements of S is denoted K(S) and is called the group kernel of S. By Theorem 2.21, s ∈ K(S) if and only if there is some (ω - 1)-word of height at most 1 that is equal to 1 in the free group and evaluates to s in S. From this observation it is now easy to establish the following result, which was known in the 1980's as the type II conjecture and which was proposed by J. Rhodes.

Theorem 2.22 ([13]). For every finite semigroup S, K(S) = D(S).

Proof. Note that

aba = a ⟹ asb = a(ba)^{ω-1}sb,   e ∈ E(S) ⟹ e = e^ω.

Hence D(S) ⊆ K(S).

Conversely, if s ∈ S is a type II element, then it admits an expression as an (ω - 1)-word w of height at most 1 which evaluates to 1 in the free group.


By Lemma 2.18, the (ω - 1)-word w admits a factorization into factors of the form

a₁u₁ ⋯ a_r u_r (a_{r+1} ⋯ a_n a_1 ⋯ a_r)^{ω-1} u_{r+1} a_{r+1} ⋯ u_n a_n

with the u_i evaluating to 1 in the free group (i = 1, …, n). Thus, assuming inductively that all u_i ∈ D(S) ∪ {1}, by Corollary 2.20 we deduce that s ∈ D(S). □

2.5 Bases for semidirect products V * G

The following result combines Theorem 2.21 with the special case of semidirect products with G of what has come to be known as the basis theorem. It was proved by Almeida and Weil [11] as a combination of profinite techniques with Tilson's derived category theorem [46].

Theorem 2.23. Let V be a pseudovariety of semigroups and suppose gV admits a basis of pseudoidentities Σ involving only a bounded number of vertices. Then the semidirect product V * G is defined by the pseudoidentities of the form φp = φq, where p = q is a pseudoidentity from Σ over a finite graph Γ and φ is an edge-labeling of Γ by (ω - 1)-words of height at most 1 which commutes over G.

We present next a simple application of this result, for which a few further preliminaries are needed.

Denote by B₂ the 5-element multiplicative matrix semigroup consisting of the 2 × 2 matrices over Z/2Z with at most one nonzero entry. Observe that B₂ is locally a semilattice, i.e., B₂ ∈ LSl.
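The observation that B₂ ∈ LSl is easy to verify by machine. The following check is ours (an illustration, not the paper's argument): for every idempotent e of B₂, the local submonoid eB₂e is commutative and idempotent, i.e., a semilattice.

```python
# Verification (ours) that B2 is locally a semilattice: for each
# idempotent e, the local submonoid e·B2·e is commutative and idempotent.

def mul(a, b):
    """Product of 2x2 matrices over Z/2Z, stored as row-major 4-tuples."""
    return tuple(sum(a[2*i + k] * b[2*k + j] for k in range(2)) % 2
                 for i in range(2) for j in range(2))

# the 2x2 matrices over Z/2Z with at most one nonzero entry
B2 = [(0, 0, 0, 0)] + [tuple(int(n == m) for n in range(4)) for m in range(4)]
assert len(B2) == 5

for e in B2:
    if mul(e, e) != e:
        continue
    local = {mul(mul(e, s), e) for s in B2}               # e·B2·e
    assert all(mul(x, x) == x for x in local)             # idempotent
    assert all(mul(x, y) == mul(y, x) for x in local for y in local)
print("B2 is locally a semilattice (B2 in LSl)")
```

Each local submonoid here turns out to be a two-element semilattice {e, 0} (or {0} for e = 0).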

Let X be an alphabet and let X^{-1} be a disjoint set of formal inverses of the letters. For a word w over X (or, more generally, a member of Ω_X S) using all the letters, let ∼_w denote the equivalence relation on Y = X ∪ X^{-1} generated by the pairs (x^{-1}, y) such that xy is a factor of w. Then the most general X-labeled graph Γ_w with initial and final vertices (which may also be seen as an automaton) supporting w, in the sense that the word w may be read along the graph from the initial to the final vertex, is obtained as follows:

• Vertices(Γ_w) = Y/∼_w;

• Edges(Γ_w): x/∼_w --y--> z/∼_w if y ∼_w x and y^{-1} ∼_w z;

• initial vertex: x/∼_w, where x is the first letter of w;

• final vertex: x^{-1}/∼_w, where x is the last letter of w.
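The construction above amounts to a union-find computation over the letters and their formal inverses. Here is a sketch of ours (names and the encoding of x^{-1} as x' are our assumptions), applied both to a word collapsing to one vertex, as in the locality argument below, and to a word with two vertices.

```python
# A sketch (ours) of the most general graph Γ_w supporting a word w:
# classes of letters and formal inverses, merged via the pairs (x^-1, y)
# for every length-2 factor xy of w.  x' stands for x^-1.

def gamma(w):
    letters = sorted(set(w))
    Y = letters + [x + "'" for x in letters]
    parent = {y: y for y in Y}
    def find(y):
        while parent[y] != y:
            y = parent[y]
        return y
    def union(a, b):
        parent[find(a)] = find(b)
    for x, y in zip(w, w[1:]):                 # factors of length 2
        union(x + "'", y)
    vertices = {find(y) for y in Y}
    edges = {(find(x), x, find(x + "'")) for x in letters}
    return vertices, edges, find(w[0]), find(w[-1] + "'")

# For w = x^2 y x^2 (a finite stand-in for x^ω y x^ω) everything collapses
# to a single vertex, as in the locality argument for LV.
v, e, i, f = gamma("xxyxx")
print(len(v))   # -> 1

# For w = aba the graph has two vertices.
v2, e2, i2, f2 = gamma("aba")
print(len(v2))  # -> 2
```

The edge set records, for each letter y, an edge from the class of y to the class of y^{-1}.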


Based on results of Reilly [34] in the context of inverse semigroups, Almeida, Azevedo and Teixeira [5] have observed that a semigroup pseudoidentity u = v is valid in B₂ if and only if u and v use exactly the same variables and Γ_u = Γ_v. This in turn allowed them to prove the following result, which explains why the calculation of globals concentrates on pseudovarieties excluding B₂.

Theorem 2.24. If B₂ ∈ V and V = [Σ], then gV is defined by the pseudoidentities in Σ viewed over the most general graphs supporting them.

Using this result and Theorem 2.23, we may now proceed to compute some specific semidirect products with G.

Proposition 2.25. Let V be a pseudovariety containing Sl and suppose {u_i = v_i : i ∈ I} is a basis of pseudoidentities for V. Then LV is local and (LV) * G is defined by the pseudoidentities of the form

u_i(x^ω y₁ x^ω, …, x^ω y_n x^ω) = v_i(x^ω y₁ x^ω, …, x^ω y_n x^ω)   (4)

with i ∈ I, x a variable, and the y_j (ω - 1)-words of height at most 1 which are equal to 1 in groups.

Proof. Since V ⊇ Sl, the local semilattice B₂ belongs to LV. Note that LV is defined by the pseudoidentities of the form (4) where x, y₁, …, y_n are distinct variables and u_i, v_i depend on the same n variables. By Theorem 2.24, gLV is defined by the pseudoidentities of the form (4) viewed over the corresponding most general graph supporting both sides. But, for the variables x, y_k, the equivalence relation defining Γ identifies x with x^{-1} and y_k^{-1}, and also identifies y_k with x^{-1}. Since this holds for every k ∈ {1, …, n}, the whole graph Γ has only one vertex. Hence the pseudovariety LV is local. The remainder of the result follows from Theorem 2.23 by noting that no restriction needs to be imposed on x since it only appears in (4) as an ω-power. □

Note that LI is not local; it can easily be shown that its global is defined by a single pseudoidentity on a 2-vertex graph. Using this observation, it is also easy to construct a 2-vertex semigroupoid which fails that pseudoidentity but satisfies all 1-vertex pseudoidentities valid in gLI, i.e., whose local semigroups lie in LI.


Proposition 2.26 ([40]²). Let V be a pseudovariety containing Sl. Then EV is local.

Proof. Let {u_i = v_i : i ∈ I} be a basis of pseudoidentities for V. Then EV is defined by the pseudoidentities of the form

u_i(w₁, …, w_m) = v_i(w₁, …, w_m)   (5)

with i ∈ I, where each w_j is a product of ω-powers of variables. Since B₂ ∈ ESl ⊆ EV, it suffices to verify that the most general graphs over which such pseudoidentities may be written have only one vertex.

Now, the (symmetrized) relation generating the equivalence relation ∼_w which defines the most general graph Γ_w over which a product w = x₁^ω ⋯ x_n^ω may be read links all of the letters x₁, …, x_n and their formal inverses in a connected graph. Hence the graph Γ_w has only one vertex. Since V contains Sl, u_i and v_i must involve precisely the same variables. It follows that the most general graph in which the two sides of (5) may be read has only one vertex. By Theorem 2.24, this shows that gEV is defined by pseudoidentities over 1-vertex graphs, that is, EV is local. □

Again, in contrast, EI is not local.

Stiffler [42] has shown that, if V is monoidal, then D * V ⊆ V * D. This allows us to obtain some curious inclusions relating the operators L and _ * G under suitable hypotheses.

Proposition 2.27. Let V be a monoidal pseudovariety of semigroups such that V * G is local. Then we have the following inclusions:

a) L(V * G) * G = L(V * G);
b) (LV) * G ⊆ L(V * G).

Proof. (a) By [3, Exercise 10.2.4(a)], V * G is a monoidal pseudovariety. By Theorem 2.1, it follows that L(V * G) = V * G * D. Taking into account Stiffler's result, this leads to the following inclusions:

L(V * G) * G = V * G * D * G ⊆ V * G * G * D = V * G * D = L(V * G)

and the reverse inclusion is obvious.

(b) Indeed, we have LV ⊆ L(V * G), and so

(LV) * G ⊆ L(V * G) * G = L(V * G). □

²In [40] one also finds the statement that if V * G = EV and V is a monoidal pseudovariety of bands, then V * G is local (Proposition 10.1). However, in view of Theorems 2.7 and 2.12, the hypothesis V * G = EV is only stating that V is nontrivial, and therefore the result adds no new cases of locality to Proposition 2.26.

Note that the 6-element semigroup with zero given by the following presentation

S = ⟨e, f | e² = e, f² = f, fef = 0⟩

lies in LESl = L(Sl * G) but not in ELSl ⊇ (LSl) * G. Hence, at least for V = Sl, the inclusion of Proposition 2.27(b) is strict.

3 Some solutions of the equation V * G = EV

For the remainder of the paper, we concentrate on the equation V * G = EV. We have already mentioned some important solutions and also that the equation does not hold in full generality. A syntactical approach based on Theorem 2.23 allows us to find similarly flavoured proofs that the equation holds for many pseudovarieties contained in DS. We only sketch here some of those proofs. The details may be found in [6].

3.1 Locally trivial solutions

Consider the following pseudovarieties:

K′_n = [x₁ ⋯ x_n e = x₁ ⋯ x_n f]
LI′_n = [x₁ ⋯ x_n ef = x₁ ⋯ x_n f]
LI″_n = [ef x₁ ⋯ x_n = e x₁ ⋯ x_n]

Note that EI = [e = f] = (N ∩ Com) ∨ G, where the last equality is easily established and may be found in [3, Section 9.1]. It is also an easy exercise in the methods of [3] to show that K′_n = K_n ∨ EI = K_n ∨ N ∨ G.

The nilpotency index of a nilpotent semigroup S is the least positive integer n such that S satisfies the identity x₁ ⋯ x_n = 0. Let

U = [x² = 0, xy = yx].

From results of Almeida and Reilly [7] it is easy to deduce that U is the smallest pseudovariety of nilpotent semigroups with no bound on the nilpotency index of its members. Using Theorem 2.23, one may show that U * G = EU = EI. Also using the same theorem, one observes that

[x₁ ⋯ x_n = 0] * G ⊆ [e x₁ ⋯ x_n = x₁ ⋯ x_n e = x₁ ⋯ x_n]   (6)

(where the inclusion is actually an equality), and so the semidirect product with G of a nilpotent pseudovariety V with a bound on the nilpotency index of its members is not EV (which is equal to EI), since N is not contained in the right side of (6). This proves the first part of the following result. The other parts can be established similarly.

Theorem 3.1. Let V ⊆ ELI. Then V * G = EV if and only if one of the following conditions holds:

a) U ⊆ V ⊆ EI = K′₀, in which case EV = EI;

b) V ⊆ EK and V ⊄ K′_n for every n, in which case EV = EK;

c) the left-right dual of b), in which case EV = ED;

d) V is not contained in any of the pseudovarieties LI′_n and LI″_n, in which case EV = ELI.

3.2 Some solutions in DS

It is well known that DS is the largest pseudovariety that excludes the semigroup B₂. It has been the object of much attention not only because it and some of its subpseudovarieties, such as J, R and DA, are found in various applications in theoretical computer science, but also because they are particularly amenable to syntactical methods, that is, the investigation of their properties through the study of relatively free profinite semigroups and pseudoidentities. This is the case also for our equation. The following theorem summarizes the known results in this respect.

Theorem 3.2. Every pseudovariety in the following four intervals (indicated by a bold line in the original diagram of subpseudovarieties of DS, not reproduced here), together with DS, is a solution of the equation V * G = EV.

For the remainder of the paper, we sketch the proof of this result for the two extreme pseudovarieties in the above diagram, namely DS and J.


3.3 The case of DS

Let S be a finite semigroup. We say that v ∈ S is good if, for every x ∈ S,

((vx)^ω v)^{ω+1} = (vx)^ω v,

that is, (vx)^ω v is a group element. Observe that an idempotent is always good.

Lemma 3.3. Let S ∈ EDS. Then

i) if a, b ∈ S are good then so is ab;

ii) if v ∈ S is good and aba = a or bab = b, then avb is good.

Proof. i) Let x ∈ S and let e = (bxa)^ω. Then, by associativity, we have the following equalities:

(ab · x)^ω ab = aeb,   (a · bx)^ω a = ae,   (b · xa)^ω b = eb.

Since (bxa)^{ω-1}bx · aeb · xa(bxa)^{ω-1} = e, all these elements lie in the same D-class of S. (A sketch of their distribution in the corresponding egg-box picture is not reproduced here.)

From the hypothesis that both a and b are good it follows that ae and eb are group elements and therefore their H-classes contain idempotents. Since S ∈ EDS, the H-class of aeb must also contain an idempotent and, therefore, ab is good.

ii) Let x ∈ S. We consider here the case aba = a, the other case being handled similarly. We show that (avbx)^ω avb is a group element by a sequence of equalities in which, at each step, we either simply use associativity or apply one of the hypotheses to the relevant factors. At one of the steps we use the fact that, by i), since bav is a product of an idempotent and a good element, it is good:

(avbx)^ω avb = (abavbx)^ω abavb
= a · (bavbxa)^ω bav · b
= a((bavbxa)^ω bav)^{ω+1} b
= a(bavbxa)^ω bav((bavbxa)^ω bav)^ω b
= (abavbx)^ω abavb((avbxab)^ω avb)^ω
= (avbx)^ω avb((avbx)^ω avb)^ω
= ((avbx)^ω avb)^{ω+1}. □

Note that DS is defined by the pseudoidentity ((xy)^ω x)^{ω+1} = (xy)^ω x. We may now easily complete the proof of the equality DS * G = EDS.

Since DS is local, by Theorem 2.23 we obtain the following basis of pseudoidentities for DS * G:

((uv)^ω u)^{ω+1} = (uv)^ω u, with u, v (ω - 1)-words of height at most 1 such that G ⊨ u = v = 1.

Given S ∈ EDS, a homomorphism φ : Ω_A S → S, and a pseudoidentity u = 1 valid in G, we have φ(u) ∈ K(S) = D(S) (cf. Theorem 2.22), and so φ(u) is good, by induction on the construction of elements of D(S) using Lemma 3.3 for the induction steps. This shows that S verifies all the pseudoidentities in the above basis (even those without any requirement on v). Hence EDS ⊆ DS * G, and the reverse inclusion is valid for any pseudovariety in the place of DS.

3.4 The case of J

In the case of the pseudovariety J, the fact that it is not local makes things somewhat harder, which lets us show how the (ω - 1)-words of height at most 1 come in handy.

Let S be a finite semigroup and let s E S. Say that s is w-central if (st)" = (ts)" for all t E S. The following is the analog of Lemma 3.3. The proof is similar and is omitted.

Lemma 3.4. For S E EJ, the set of all w-central elements contains E ( S ) and is closed under multiplication and weak conjugation.

Part i) of the following proposition improves a result of Margolis and Pin [31], which characterizes EJ as the pseudovariety defined by the pseudoidentity (ex)^ω = (xe)^ω, in the sense that the proposition shows that EJ satisfies apparently much stronger pseudoidentities.


Proposition 3.5. Let w ∈ Ω_A S be such that G ⊨ w = 1. Then EJ satisfies the following pseudoidentities:

i) (wx)^ω = (xw)^ω;

ii) (wx)^ω w = (wx)^ω;

iii) w^{ω+1} = w^ω.

Proof. For i), note that, in a given S ∈ EJ, w evaluates to an element of K(S) and therefore to an element of D(S). Hence w is ω-central by Lemma 3.4. Part ii) is proved similarly, that is, by establishing first a suitable analog of Lemma 3.4. Part iii) follows from ii) by taking x = w. □

It is now a routine matter to prove the following corollary using Proposition 3.5 and associativity.

Corollary 3.6. Let u, v, w ∈ Ω_A S be such that G ⊨ uw = v = 1. Then EJ satisfies the pseudoidentities

(uvw)^ω uw = uw(uvw)^ω = (uvw)^ω.

Corollary 3.6 gives the first step in an induction procedure that yields, more generally, the following result for (ω - 1)-words of height at most 1. The proof is omitted.

Proposition 3.7. Let u, v be (ω - 1)-words of height at most 1 such that u = 1 holds in G and u reduces to v by applications of rules of the form (2). Then EJ satisfies the pseudoidentities v u^ω = u^ω = u^ω v.

With these tools at hand, one may now proceed to prove that J is a solution of our equation by establishing the following result.

Proposition 3.8. Let u₁, u₂, v₁, v₂ be (ω - 1)-words of height at most 1 such that

u₁u₂ = v₁v₂ = u₁v₂ = 1

in groups. Then EJ satisfies the pseudoidentity

(u₁u₂)^ω u₁v₂ (v₁v₂)^ω = (u₁u₂)^ω (v₁v₂)^ω.

We only sketch here the proof of Proposition 3.8. By Lemma 2.18, it is possible to reduce the (ω - 1)-word u₁v₂ to 1 by a finite sequence of applications of rules of the form

u(vu)^{ω-1}v → 1,   (7)

where u and v are possibly empty words. In such a reduction, part will be done entirely within factors descending from u₁ or from v₂, while some steps may require both a factor descending from u₁ and a factor descending from v₂. The former are handled directly using Proposition 3.7. For the latter, some care has to be taken.

By associativity, we may for instance assume that the word u is shortest possible, that the factor (vu)^{ω-1}v comes entirely from a descendant of v₂, and that this factor cannot be erased from that descendant of v₂ by application of rules of the form (7). The factor u in turn is a product u = yz, where y comes from a descendant of u₁ and z from a descendant of v₂. So there are descendants u′₁ of u₁ and v′₂ of v₂ under the reduction rules of the form (7) and factorizations u′₁ = u″₁ y and v′₂ = z(vu)^{ω-1}v v″₂, with |y| smallest possible. Now there is a shortest suffix t of v₂ from which v″₂ is obtained by application of rules of the form (7), and we let v₂ = st.

Because u was chosen to be shortest possible, and since v₁sv″₂ = v₁v₂ = 1 in G, it must be possible to reduce v₁s to an (ω - 1)-word of which y is a suffix. Moreover, such a reduction must come from the left factor v₁, that is, there is a factorization v₁ = x ȳ s′ such that ȳ reduces to y and s′s reduces to 1. This shows that the reduction yz(vu)^{ω-1}v → 1 that has to be performed in a descendant of u₁v₂ to reduce it to 1 may also be performed in a descendant of v₁v₂, and so again Proposition 3.7 allows us to do it using the right factor (v₁v₂)^ω. For the reader's benefit, the various factorizations are depicted in a picture in which the appearance in the second line of a factor below another in the first line means that it is a descendant of the latter.

(Diagram of the factorizations of u₁v₂ and v₁v₂ not reproduced.)

The equality J * G = EJ now follows from Theorem 2.23, taking into account the basis of pseudoidentities for gJ given by Theorem 2.4.

References

[1] D. Albert, R. Baldinger, and J. Rhodes, The identity problem for finite semigroups (the undecidability of), J. Symbolic Logic 57 (1992) 179-192.

[2] J. Almeida, The algebra of implicit operations, Algebra Universalis 26 (1989) 16-32.

[3] ___, Finite Semigroups and Universal Algebra, World Scientific, Singapore, 1995. English translation.


[4] ___, A syntactical proof of locality of DA, Int. J. Algebra and Computation 6 (1996) 165-177.

[5] J. Almeida, A. Azevedo, and L. Teixeira, On finitely based pseudovarieties of the forms V*D and V*Dn, J. Pure and Appl. Algebra 146 (2000) 1-15.

[6] J. Almeida and A. Escada, On the equation V*G=EV, J. Pure and Appl. Algebra. To appear.

[7] J. Almeida and N. R. Reilly, Generalized varieties of commutative semigroups, Semigroup Forum 30 (1984) 77-98.

[8] J. Almeida and B. Steinberg, On the decidability of iterated semidirect products and applications to complexity, Proc. London Math. Soc. 80 (2000) 50-74.

[9] J. Almeida and P. Weil, Reduced factorizations in free profinite groups and join decompositions of pseudovarieties, Int. J. Algebra and Computation 4 (1994) 375-403.

[10] ___, Relatively free profinite monoids: an introduction and examples, in Semigroups, Formal Languages and Groups, J. B. Fountain, ed., vol. 466, Dordrecht, 1995, Kluwer Academic Publ., 73-117.

[11] ___, Profinite categories and semidirect products, J. Pure and Appl. Algebra 123 (1998) 1-50.

[12] C. J. Ash, Finite semigroups with commuting idempotents, J. Austral. Math. Soc., Ser. A 43 (1987) 81-90.

[13] ___, Inevitable graphs: a proof of the type II conjecture and some related decision procedures, Int. J. Algebra and Computation 1 (1991) 127-146.

[14] K. Auinger and B. Steinberg, The geometry of profinite graphs with applications to free groups and finite monoids, Tech. Rep. CMUP 2001-06, 2001.

[15] K. Auinger and B. Steinberg, On the extension problem for partial permutations, Tech. Rep. CMUP 2001-08, 2001.

[16] J.-C. Birget, S. Margolis, and J. Rhodes, Semigroups whose idempotents form a subsemigroup, Bull. Austral. Math. Soc. 41 (1990) 161-184.

[17] J. A. Brzozowski and I. Simon, Characterizations of locally testable events, Discrete Math. 4 (1973) 243-271.

[18] S. Eilenberg, Automata, Languages and Machines, vol. B, Academic Press, New York, 1976.

[19] A. Escada, The power exponent of a pseudovariety, Semigroup Forum. To appear.

[20] ___, Contributions for the study of power operators over pseudovarieties of semigroups, Ph.D. thesis, Univ. Porto, 1999. In Portuguese.

[21] D. Gildenhuys and L. Ribes, Profinite groups and Boolean graphs, J. Pure and Appl. Algebra 12 (1978) 21-47.


[22] K. Henckell, S. Margolis, J.-E. Pin, and J. Rhodes, Ash's type II theorem, profinite topology and Malcev products. Part I, Int. J. Algebra and Computation 1 (1991) 411-436.

[23] K. Henckell and J. Rhodes, The theorem of Knast, the PG=BG and Type II Conjectures, in Monoids and Semigroups with Applications, J. Rhodes, ed., Singapore, 1991, World Scientific, 453-463.

[24] P. M. Higgins, Pseudovarieties generated by transformation semigroups, in Semigroups with Applications, including Semigroup Rings, S. Kublanovsky, A. Mikhalev, J. Ponizovskii, and P. Higgins, eds., St Petersburg, 1999, TPO "Severny Ochag", 85-94.

[25] P. M. Higgins and S. W. Margolis, Finite aperiodic semigroups with commuting idempotents and generalizations, Israel J. Math. 116 (2000) 367-380.

[26] P. R. Jones, Profinite categories, implicit operations and pseudovarieties of categories, J. Pure and Appl. Algebra 109 (1996) 61-95.

[27] P. R. Jones and M. B. Szendrei, Local varieties of completely regular monoids, J. Algebra 150 (1992) 1-27.

[28] P. R. Jones and P. G. Trotter, Locality of DS and associated varieties, J. Pure and Appl. Algebra 104 (1995) 275-301.

[29] J. Karnofsky and J. Rhodes, Decidability of complexity one-half for finite semigroups, Semigroup Forum 24 (1982) 55-66.

[30] R. Knast, Some theorems on graph congruences, RAIRO Inf. Théor. et Appl. 17 (1983) 331-342.

[31] S. W. Margolis and J.-E. Pin, Varieties of finite monoids and topology for the free monoid, in Proc. 1984 Marquette Semigroup Conference, Milwaukee, 1984, Marquette University, 113-129.

[32] J.-E. Pin, Varieties of Formal Languages, Plenum, London, 1986. English translation.

[33] J.-E. Pin, BG=PG: A success story, in Semigroups, Formal Languages and Groups, J. Fountain, ed., vol. 466, Dordrecht, 1995, Kluwer, 33-47.

[34] N. R. Reilly, Free combinatorial strict inverse semigroups, J. London Math. Soc. 39 (1989) 102-120.

[35] J. Reiterman, The Birkhoff theorem for finite algebras, Algebra Universalis 14 (1982) 1-10.

[36] J. Rhodes, Kernel systems - a global study of homomorphisms on finite semigroups, J. Algebra 49 (1977) 1-45.

[37] J. Rhodes, Undecidability, automata and pseudovarieties of finite semigroups, Int. J. Algebra and Computation 9 (1999) 455-473.

[38] B. Steinberg, Inevitable graphs and profinite topologies: some solutions to algorithmic problems in monoid and automata theory, stemming from group theory, Int. J. Algebra and Computation 11 (2001) 25-71.


[39] B. Steinberg, A note on the equation PH = J*H, Semigroup Forum. To appear.

[40] B. Steinberg, Semidirect products of categories and applications, J. Pure and Appl. Algebra 142 (1999) 153-182.

[41] B. Steinberg and B. Tilson, Categories as algebras II, Tech. Rep. CMUP 2000-4, 2000.

[42] P. Stiffler, Extension of the fundamental theorem of finite semigroups, Advances in Math. 11 (1973) 159-209.

[43] H. Straubing, Finite semigroup varieties of the form V * D, J. Pure and Appl. Algebra 36 (1985) 53-94.

[44] M. B. Szendrei, The bifree regular E-solid semigroups, Semigroup Forum 52 (1996) 61-82.

[45] D. Thérien and A. Weiss, Graph congruences and wreath products, J. Pure and Appl. Algebra 36 (1985) 205-215.

[46] B. Tilson, Categories as algebra: an essential ingredient in the theory of monoids, J. Pure and Appl. Algebra 48 (1987) 83-198.

[47] S. Zhang, An infinite order operator on the lattice of varieties of completely regular semigroups, Algebra Universalis 35 (1996) 485-505.


On the Sentence Valuations in a Semiring

Adrian ATANASIU*, Carlos MARTÍN-VIDE**, Victor MITRANA*

*University of Bucharest, Faculty of Mathematics Str. Academiei 14, 70109, Bucharest, Romania

e-mail: [email protected] e-mail: [email protected]

**Research Group in Mathematical Linguistics and Language Engineering Rovira i Virgili University

Pça. Imperial Tàrraco 1, 43005 Tarragona, Spain e-mail: [email protected]

Abstract. This paper proposes an algebraic way of sentence valuations in a semiring. Actually, throughout the paper only valuations in the ring of integers with the usual addition and multiplication are considered. These valuations take into consideration both words and their positions within the sentences. Two synonymy relations, with respect to a given valuation, are introduced. All sentences that are synonymous form a synonymy class which is actually a formal language. Some basic problems regarding the synonymy classes are formulated in the general setting but the results presented concern only very special valuations.

1 Introduction

A series of papers, see, e.g., [1], [2], [8], [9], and the references thereof, have dealt with homomorphisms h from a freely generated monoid M into the monoid ((0, ∞), ·, 1), so that the sum of all homomorphical images of generators of M equals 1, called

*Supported by the Dirección General de Enseñanza Superior e Investigación Científica, SB 97-00110508


Bernoulli homomorphisms (distributions, measures). Besides being homomorphisms, Bernoulli homomorphisms may be viewed as probability measures on the family of all languages over a given alphabet. Furthermore, they played an important role in developing the theory of codes [1]. Some authors discarded the homomorphism property keeping the probability measure property, as done in [8], [9], whilst others proceeded vice versa [6], calling them valuations.

These valuations were used in the study of unambiguity representations of languages. Along these lines, in [7] the equation system x_i = p_i which identifies a context-free grammar is transformed, via a valuation h, into a numerical equation system h(x_i) = h(p_i). Solving the former system one gets the context-free language L generated by the given grammar, while solving the latter system one gets exactly the valuation of L, h(L), defined as the sum of all h(w) with w ∈ L. Close relations between the unambiguity of the given context-free grammar and the value h(L) are discussed. Moreover, new characterizations of unambiguity in regular expressions based on the same concept of language valuation are proposed.

This way of assigning values to a sentence recalls also some devices introduced in the area of regulated rewriting: weighted grammars and automata, see, e.g., [4, 5, 10, 12, 13], where a given number in a group is associated with each computation step (derivation or configuration). A computation is valid iff the total value assigned to that computation, computed in accordance with the operation of the group considered, is the neutral element of that group.

A consistent extension to the basic paradigm of constraint satisfaction in parsing might make use of the penalty factors assigned to syntactic, semantic, and mapping constraints. Penalty factors, which may range from zero to one, are combined multiplicatively, leading to confidence scores which indicate a sort of level of constraint violation. This extension can be used to model distance effects if one takes into consideration the local distance between two consecutive constraint violations. This suggests to compute the confidence score depending also on the position of the constraint.

In this paper, we introduce a generalization of the aforementioned valuations in the following sense. The value of a sentence depends not only on its words but also on their positions within the sentence. Furthermore, the valuation is computed in a richer structure, that of a semiring, instead of a monoid. Moreover, we consider valuations that allow a finite set of values for each sentence. More precisely, each word in a given vocabulary has a finite set of values (attributes) and each position (a natural number) has just one attribute. For a given sentence, the value associated to a position occupied by a certain word a is obtained by considering two attributes: one is that of the position itself, the other being one among the attributes associated to the word a. Thus, we need an operation for computing the value of every position in the sentence and one operation for computing the value of the whole sentence.


The latter should be, in our opinion, an additive type one. What structure might be the most relevant one for our purposes? We have chosen a very common and widely investigated structure in semantics, that of a semiring. More precisely, all the results we present concern a particular semiring, namely the (semi)ring of integers.

Based on these valuations, two types of synonymy relations are defined. Two sentences are weakly synonymous, with respect to a given valuation, if they have a common value computed in accordance with the given valuation; they are strongly synonymous if they lead to a common value within any pair of contexts. Informally speaking, two sentences are weakly synonymous if they have a common meaning. However, if one adds the same contexts to two weakly synonymous sentences, one may get two new sentences that have no common meaning (the new sentences are not weakly synonymous anymore). This undesired feature is avoided by the definition of the strong synonymy relation.

We investigate the decidability of the finiteness problem of synonymy classes as well as the possibility of algorithmically deciding whether two given sentences are strongly synonymous (as we shall see, it is always decidable whether or not they are weakly synonymous). In our approach we consider two very special types of valuations depending on their position attributing function, that is, the polynomial (constant and linear) and (restricted) exponential functions, respectively.

2 Definitions and examples

A vocabulary is a finite nonempty set whose elements are called words; if V = {a_1, a_2, ..., a_n} is a vocabulary, then any sequence w = a_{i_1} a_{i_2} ... a_{i_k}, 1 ≤ i_j ≤ n, 1 ≤ j ≤ k, is called a sentence over V. The length of the aforementioned sentence w is denoted by lg(w) and equals k. The empty sentence is denoted by ε, lg(ε) = 0. As a rule, the words are denoted by small letters from the beginning of the Latin alphabet and the sentences are denoted by small letters from the end of the same alphabet, excepting the empty sentence. Moreover, (x)_U delivers the sentence obtained from x by removing all words not in U. The set of all sentences over V is denoted by V* and V+ = V* - {ε}. Any subset of V* is called a language.

A structure (A, +, ·, 0, 1) is called a semiring iff the following conditions are satisfied for all a, b, c ∈ A:

(i) (A, +, 0) is a commutative monoid,

(ii) (A, ·, 1) is a monoid,

(iii) a·(b + c) = a·b + a·c, (a + b)·c = a·c + b·c,

(iv) 0·a = a·0 = 0.


The semiring (A, +, ·, 0, 1) is said to be commutative iff (A, ·, 1) is a commutative monoid. For further notions we refer to [11]. If B and C are two subsets of A and q ∈ A, one defines

B·q = {pq | p ∈ B}, q·B = {qp | p ∈ B}, B + C = {p + r | p ∈ B, r ∈ C}, B·C = {pr | p ∈ B, r ∈ C}.

Let V be a vocabulary and (A, +, ·, 0, 1) be a commutative semiring. A valuation of V* in A is a pair of mappings

φ = (α, β), where

• α : V → 2^A_f (the word valuating function; α(a) is the set of all values (attributes) of a),

• β : ℕ → A (the position valuating function; β(n) is the value (attribute) of the position n).

Here 2^A_f denotes the set of all finite subsets of A. Given a valuation φ as above and a sentence x = a_1 a_2 ... a_n ∈ V*, a_i ∈ V, 1 ≤ i ≤ n, we define

val_φ(x) = Σ_{i=1}^{n} α(a_i)·β(i).

Moreover, val_φ(ε) always delivers 0, for any valuation φ. A valuation as above is deterministic iff card(α(a)) = 1 for all a ∈ V.
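For concreteness, the definition of val_φ can be sketched in Python, with α given as a dictionary mapping each word to its finite set of values and β as a function on positions. This is our own illustration, not part of the paper; it enumerates one value per word occurrence and collects all resulting sums:

```python
from itertools import product

def val(x, alpha, beta):
    """All values of the sentence x = a_1 ... a_n under the valuation
    phi = (alpha, beta): each value is a sum of alpha-attribute * beta(i)
    over positions i = 1..n, for one choice of attribute per occurrence."""
    if not x:
        return {0}                      # val_phi(epsilon) = {0}
    choices = [alpha[a] for a in x]     # attribute sets, left to right
    return {sum(t * beta(i + 1) for i, t in enumerate(pick))
            for pick in product(*choices)}
```

With α(a) = {1}, α(b) = {-1, 2} and β(n) = n, the sentence ab gets the value set {1·1 + (-1)·2, 1·1 + 2·2} = {-1, 5}, so a valuation is deterministic exactly when every returned set is a singleton.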

By our intuition, the beginning and the end of a sentence offer more information than its middle part. Even so, one may argue that the beginning is still predominant, but for our further results this makes no difference since for any string x as above we can consider

val_φ(x) = Σ_{i=1}^{n} α(a_i)·β(n - i + 1).

Two sentences x, y are weakly synonymous with respect to the valuation φ, written as x ~_φ y, iff val_φ(x) ∩ val_φ(y) ≠ ∅. One may easily notice that this relation is reflexive and symmetrical but not transitive. The weak synonymy class of x is defined as

[x]_φ = {y ∈ V* | x ~_φ y}.

Two sentences x, y are strongly synonymous with respect to the valuation φ, written as x ≈_φ y, iff val_φ(uxv) ∩ val_φ(uyv) ≠ ∅, for any pair of sentences u, v ∈ V*. Again,


this relation is reflexive and symmetrical but not transitive. The strong synonymy class of x is defined as

⟨x⟩_φ = {y ∈ V* | x ≈_φ y}.

Note that strong synonymy always implies weak synonymy but the converse does not hold.

Example 1. Let us consider the semiring Z[X] of all polynomials with only one indeterminate and coefficients in Z, together with addition and multiplication. We consider the valuation of {a, b, c, d}* in Z[X], φ = (α, β), defined as follows:

α(a) = 2X^2, α(b) = X^2 - 1, α(c) = 1, α(d) = 2X^2 - 1,

β(i) = X + 2, i ∈ ℕ.

It is easy to note that

val_φ(dacb) = val_φ(aba) = 5X^3 + 10X^2 - X - 2,

which implies dacb ~_φ aba.
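The equality in Example 1 can be verified mechanically. Below is a small Python sketch (our own illustration, not part of the paper) that represents polynomials in Z[X] as ascending coefficient lists and computes the deterministic valuation word by word:

```python
def padd(p, q):
    # coefficient-wise sum of polynomials given as ascending coefficient lists
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def pmul(p, q):
    # convolution product of two coefficient lists
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

# alpha values from Example 1, as coefficients of 1, X, X^2, ...
alpha = {'a': [0, 0, 2], 'b': [-1, 0, 1], 'c': [1], 'd': [-1, 0, 2]}
beta = [2, 1]                    # beta(i) = X + 2 for every position i

def val(x):
    # val_phi(x) = sum over positions of alpha(word) * beta(position)
    total = [0]
    for a in x:
        total = padd(total, pmul(alpha[a], beta))
    return total
```

Here val("dacb") and val("aba") both return [-2, -1, 10, 5], i.e., 5X^3 + 10X^2 - X - 2 read from the constant coefficient upwards, confirming dacb ~_φ aba.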

Example 2. Take V = {a, b, c} and the valuation of V* in (Q, +, ·, 0, 1), φ = (α, β), with

α(a) = {1/2}, α(b) = {1/3}, α(c) = {-1/6},

β(n) = 5, for all n ≥ 1.

The reader may easily verify that

[ε]_φ = {x | 3·lg((x)_a) + 2·lg((x)_b) = lg((x)_c)}.

Note that [ε]_φ is a context-free non-regular language. Moreover, both valuations are deterministic.
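Example 2 can likewise be checked mechanically; the sketch below (ours, using exact rational arithmetic to avoid rounding) tests membership in the synonymy class of the empty sentence:

```python
from fractions import Fraction as F

alpha = {'a': F(1, 2), 'b': F(1, 3), 'c': F(-1, 6)}   # deterministic values
beta = 5                                               # constant position value

def val(x):
    # deterministic valuation: a single rational number per sentence
    return sum(beta * alpha[w] for w in x)

def in_class(x):
    # x is in the weak synonymy class of epsilon iff val(x) = val(epsilon) = 0
    return val(x) == 0
```

For instance, aacccccc has 2 occurrences of a and 6 of c, and 3·2 + 2·0 = 6, so it lies in [ε]_φ, while ac does not.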

We proceed to investigate mainly the synonymy classes. A natural problem concerns the finiteness of these sets as well as the possibility to decide on this problem. As we shall see in the sequel, a closely related problem concerns the decidability status of the next problem: for a given value q, are there sentences whose valuation set contains q? Furthermore, we are concerned with the problem of finding appropriate devices (automata, grammars, etc.) which characterize the synonymy classes. Since the above definitions were given in a very general setting, we should restrict our investigation to particular valuations. To this end, in this paper we shall only consider the valuations in the ring of integers Z with the usual addition and multiplication. Of course, other semirings may be considered as well,


but we have chosen this semiring because it is the most natural and simple one. Even so, the problems we considered appeared to be very difficult. A similar investigation for other semirings remains to be done. The absolute value of an integer x is denoted by |x|. In the sequel, we shall focus our attention on valuations whose function β is either the constant polynomial, the linear polynomial, or the exponential function a^n.

We start with a lemma which will be useful in the sequel. Let x = a_1 a_2 ... a_n be a sentence in V* and φ = (α, β) be a valuation of V* in Z with

β(n) = C_0 n^k + C_1 n^{k-1} + ... + C_k.

Denote by φ_i = (α, n^i), 0 ≤ i ≤ k.

Lemma 1. val_φ(x) = Σ_{i=0}^{k} C_i · val_{φ_{k-i}}(x).

Proof. By a direct calculation one gets the desired equality. □

3 The constant polynomial

As [x]_φ = V*, for all x ∈ V*, providing that β is the null function, we shall consider only non-zero position valuating functions in the rest of this section. Note that the relations stated by Lemma 1 and relation (1), respectively, may be combined in

val_φ(xy) = val_φ(x) + val_φ(y). (2)

The next result is an immediate consequence of relation (2).

Proposition 1. Let φ = (α, β) be a valuation of V* in Z. Then, x ~_φ y iff x ≈_φ y.

Theorem 1. Let φ = (α, β) be a valuation of V* in Z. Then, the following problems are decidable:

1. Given q ∈ Z, are there sentences x ∈ V+ such that q ∈ val_φ(x)?
2. Given q ∈ Z, are there arbitrarily many sentences x such that q ∈ val_φ(x)?


Proof. Assume that β(n) = k, n ∈ ℕ, for some integer k. Moreover, we take a positive k; the case k < 0 may be treated similarly.

1. Obviously, there is no sentence whose valuation contains q if q is not a multiple of k. We distinguish two cases depending on the values of the words in V. Firstly, let us suppose that all values of the words in V are nonnegative; the reasoning is the same when all of them are negative. It follows that val_φ(x) contains only nonnegative integers, for each x ∈ V+, hence q has to be nonnegative. Clearly, if q = 0, the answer is affirmative if and only if there is a word in V that has a null value. For q > 0 it suffices to restrict our search to sentences of length at most q/k. Consequently, one can algorithmically find the answer in this case.

Now, let us consider that the set C = {p ≠ 0 | p ∈ α(a), a ∈ V} contains both negative and positive integers. We claim that there exists x ∈ V+ such that q ∈ val_φ(x) if and only if q is a multiple of kd, where d is the greatest common divisor of all integers in C. Obviously, if q ∈ val_φ(x) for some x ∈ V+, then q must be a multiple of kd. We prove now that for each multiple of kd there exists a sentence that contains it in its valuation set.

Let q_1, q_2, ..., q_n be all positive integers in C and p_1, p_2, ..., p_m be all negative integers in C. It is known that

d = Σ_{i=1}^{n} k_i q_i + Σ_{j=1}^{m} r_j p_j (3)

for some integers k_i, r_j, 1 ≤ i ≤ n, 1 ≤ j ≤ m. Moreover, one can choose either k_i, r_j ≥ 0 or k_i, r_j ≤ 0, for all 1 ≤ i ≤ n, 1 ≤ j ≤ m. The last claim requires a short discussion: we thought that we would find a reference for it but we were not able to find such a reference. In order to keep the proof easy to follow, we preferred to prove it in an appendix at the end of the paper.

Let q be an arbitrary multiple of kd; consider the sentence x given by the next algorithm:

Algorithm 1.
Procedure Find_synonymy_class_representative(q);
begin
  x := ε;
  if q > 0 then choose k_i, r_j ≥ 0 in (3)
  else choose k_i, r_j ≤ 0 in (3);
  endif;
  for i := 1 to n do
    choose a ∈ V with q_i ∈ α(a);
    x := x a^{k_i q/(kd)};
  endfor;
  for i := 1 to m do
    choose a ∈ V with p_i ∈ α(a);
    x := x a^{r_i q/(kd)};
  endfor;
  if q = 0 then
    choose a, b ∈ V such that q_1 ∈ α(a), p_1 ∈ α(b);
    x := a^{|p_1|} b^{q_1};
  endif;
end.

It is easy to notice that q ∈ val_φ(x), which concludes the reasoning of the first assertion.

2. The latter item follows from the first one as follows. Find, if any, a sentence x such that q ∈ val_φ(x). If the sentence x exists, detect a sentence y whose valuation contains the value 0. When no sentence y exists, only a finite number of sentences might have q in their valuation sets. Indeed, if there is no sentence y ∈ V+ with 0 ∈ val_φ(y), then all values of the words in V are either positive or negative. By the first part of this proof it follows that only a finite number of sentences might have q in their valuation set. If such a sentence exists, then by equation (2) all sentences xy^m, m ≥ 0, are in [x]_φ, which ends the proof. □
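The decision procedure behind the first item of Theorem 1 can be sketched as follows. This is a hypothetical Python rendering of ours, not the paper's: the function name, the dictionary encoding of α, and the bounded search in the one-sign case are our choices.

```python
from math import gcd
from functools import reduce

def representable(q, k, alpha):
    """Is there a nonempty sentence x with q in val_phi(x), when beta(n) = k?"""
    if q % k:
        return False                    # q must be a multiple of k
    target = q // k                     # need a multiset of word values summing to target
    vals = set().union(*alpha.values())
    if target == 0:
        # a single zero-valued word, or a positive/negative cancellation
        return 0 in vals or (any(v > 0 for v in vals) and any(v < 0 for v in vals))
    nz = [v for v in vals if v != 0]
    if any(v > 0 for v in nz) and any(v < 0 for v in nz):
        # mixed signs: exactly the multiples of d = gcd of the nonzero values
        d = reduce(gcd, (abs(v) for v in nz))
        return target % d == 0
    # all nonzero values share one sign: bounded dynamic programming
    if target * (1 if all(v > 0 for v in nz) else -1) < 0:
        return False
    reach = {0}
    for _ in range(abs(target)):        # sentences of length <= |q/k| suffice
        reach |= {s + v for s in reach for v in nz if abs(s + v) <= abs(target)}
    return target in reach
```

For example, with β(n) = 2 and a single word of value 3, q = 6 is reachable (one occurrence gives 2·3) but q = 4 is not; adding a word of value -2 makes every multiple of kd = 2·gcd(3, 2) = 2 reachable, as the theorem predicts.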

Theorem 2. Let φ = (α, β) be a valuation of V* in Z. The following problems are decidable:

1. Are two given sentences weakly/strongly synonymous?
2. Given a sentence x ∈ V*, are [x]_φ and ⟨x⟩_φ finite?

We recall now an operation on sentences that will turn out to be very useful for our investigation regarding the type of the languages [x]_φ. This operation, called shuffle, is a well-known operation in formal language theory and in parallel programming theory. We define this operation on sentences, recursively, as follows: for two strings x, y ∈ V* and two symbols a, b ∈ V we write

(i) x ш ε = ε ш x = x,

(ii) ax ш by = a(x ш by) ∪ b(ax ш y).

A shuffle of two strings is an arbitrary interleaving of the substrings of the original strings. We naturally extend this operation to languages:

L_1 ш L_2 = ∪_{x ∈ L_1, y ∈ L_2} x ш y.
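The recursive definition above can be transcribed directly; this Python sketch (ours, not part of the paper) enumerates the finite set of interleavings of two sentences and extends it to languages:

```python
def shuffle(x, y):
    """All interleavings of the sentences x and y (the shuffle operation)."""
    if not x:
        return {y}                      # x shuffle epsilon cases
    if not y:
        return {x}
    # ax shuffle by = a(x shuffle by) union b(ax shuffle y)
    return ({x[0] + w for w in shuffle(x[1:], y)} |
            {y[0] + w for w in shuffle(x, y[1:])})

def shuffle_lang(L1, L2):
    # the natural extension to (finite) languages
    return set().union(*(shuffle(x, y) for x in L1 for y in L2))
```

For instance, shuffle("ab", "c") yields {"abc", "acb", "cab"}: the letter c can be interleaved at any of three positions while a and b keep their relative order.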

The next theorem settles the position in the Chomsky hierarchy of the synonymy classes with respect to valuations whose position valuating function is a constant.


Theorem 3. Let φ = (α, β) be a valuation of V* in Z and x be a sentence in V*.

1. The language [x]_φ is context-free.
2. If φ is deterministic, one can decide whether or not [x]_φ is regular.

Proof. 1. The reader may construct a nondeterministic pushdown automaton that recognizes all sentences in [x]_φ. We prefer another proof, namely we use a slightly modified version of additive valence grammar. A right-linear additive valence grammar, see [3], is a construct G = (N, T, S, P, v), where N, T, S, P are the parameters of a right-linear grammar and v : P → Z is the valence function. The valence associated to a derivation

D : w_0 ⇒_{r_1} w_1 ⇒_{r_2} w_2 ⇒ ... ⇒_{r_m} w_m,

such that at each step w_{i-1} ⇒_{r_i} w_i, 1 ≤ i ≤ m, the applied rule is r_i, is

v(D) = Σ_{i=1}^{m} v(r_i).

The language generated by G with the valence q ∈ Z is the set

L(G, q) = {x ∈ T* | there exists a derivation S ⇒* x with v(S ⇒* x) = q}.

It is known [3] that all languages generated by right-linear additive valence grammars are context-free.

Now, given a valuation φ = (α, β) of V* in Z, with a constant function β, we construct a right-linear additive valence grammar G_φ = (N, V, S, P, v) as follows:

• N = {S} ∪ {(a, q) | a ∈ V, q ∈ α(a)}.
• For each nonterminal (a, q) ∈ N we have the following rules in P:

S → (a, q), with the valence q,
(a, q) → aS, with the valence 0,
(a, q) → ε, with the valence 0.

Obviously, the equality

[x]_φ = ∪_{t ∈ val_φ(x)} L(G_φ, t/β(1))

holds. Since the family of context-free languages is closed under union, the first assertion is completely proved.


2. Let φ = (α, β) be a deterministic valuation of V* in Z. Denote by V̄ = {a ∈ V | α(a) = {0}} and U = V \ V̄. It is easy to note that

[x]_φ = [(x)_U]_φ ш V̄*.

If card(U) ≤ 1, then [x]_φ is regular for any x ∈ V*. Indeed, if card(U) = 0, then [x]_φ = V*. If card(U) = 1, then all classes [(x)_U]_φ are finite, hence all languages [x]_φ are regular.

Let us suppose that card(U) ≥ 2. If α(a)·α(b) > 0 for all pairs (a, b) ∈ U × U, then [(x)_U]_φ is a finite set for all x ∈ V* (see the proof of Theorem 1), therefore [x]_φ is regular. If there exist a, b ∈ U such that α(a)·α(b) < 0, then the language [(x)_U]_φ is context-free but not regular. This language is

[(x)_U]_φ = {z ∈ U* | β(1)·Σ_{a ∈ U} lg((z)_a)·α(a) = val_φ(x)}.

Since shuffling a context-free non-regular language with a regular language, the languages being over disjoint vocabularies, gives a context-free non-regular language, it follows that [x]_φ is regular iff either card(U) ≤ 1 or α(a)·α(b) > 0 for all pairs (a, b) ∈ U × U, both conditions being decidable. □

Remark. In view of the last theorem, the decidability of the finiteness problem for synonymy classes follows directly from the finiteness problem for context-free languages. However, the proof of Theorem 1 offers a more easily testable condition and a less complex (time and memory) implementation.

4 The linear polynomial

In this section we shall consider only valuations whose position valuating functions are linear polynomials.

Theorem 4. Let φ = (α, β) be a valuation of V* in Z. Given a sentence x ∈ V*, one can decide the finiteness of [x]_φ.

Proof. Let φ = (α, β) be a valuation with β(n) = kn + p. By Lemma 1 and relation (1), one may write also

val_φ(xy) = val_φ(x) + val_φ(y) + k·lg(x)·val_{φ_0}(y). (4)

Assume that α(a) contains only positive integers, for all a ∈ V; the case when α(a) contains just negative integers may be treated analogously. We analyse what happens when k < 0; the reasoning may be carried over to the case k > 0 with minor


changes. Clearly, there exists n_0 ∈ ℕ such that val_φ(z) has only negative values (val_φ(z) < 0, for short), for all sentences z in V* longer than n_0.

Let z be such a sentence. We claim that val_φ(y) < val_φ(z), for all y ∈ V* such that lg(y) ≥ lg(z)·max{|r| : r ∈ val_φ(z)}. Due to the length of y, one infers that

val_φ(y) ≤ max{|r| : r ∈ val_φ(z)} · max{val_φ(w) | w is a subsentence of y of length lg(z)},

which is smaller than val_φ(z) because val_φ(w) is negative too, providing that w is a subsentence of y of length lg(z). Consequently, [z]_φ is finite for all z ∈ V*.

Let us consider now that there exist a, b ∈ V, possibly the same, such that α(a)·α(b) contains at least one negative integer. Take q_1 ∈ α(a), q_2 ∈ α(b) such that q_1·q_2 < 0; the sentence y = a^{|q_2|} b^{|q_1|} satisfies the relation 0 ∈ val_{φ_0}(y). Moreover, we claim that 0 ∈ val_{φ_1}(zz^R), for all z ∈ V* with 0 ∈ val_{φ_0}(z). Indeed, if z = z_1 z_2 ... z_m, z_i ∈ V, and t_i ∈ α(z_i), 1 ≤ i ≤ m, so that Σ_{i=1}^{m} t_i = 0, then (2m + 1)·Σ_{i=1}^{m} t_i ∈ val_{φ_1}(zz^R) holds. Note also that 0 ∈ val_{φ_0}(zz^R), too.

Now, as

val_φ(zz^R) = k·val_{φ_1}(zz^R) + p·val_{φ_0}(zz^R),

one gets 0 ∈ val_φ(zz^R). Due to the relation (4) one concludes that all sentences x(zz^R)^q, q ≥ 0, with z as above, belong to [x]_φ. □

As far as the position of the languages [x]_φ in the Chomsky hierarchy is concerned, we have:

Theorem 5. Let φ = (α, β) be a valuation of V* in Z. The language [x]_φ is context-sensitive, for any x ∈ V*.

Proof. Let us suppose that β(n) = kn + p. We give the proof for k > 0 only; the proof may be carried over to the case k < 0 with the appropriate changes. Take n_0 the minimal natural number such that kn + p > 0 for all n > n_0. For each x ∈ V* one constructs the phrase-structure grammar G_x which works according to the next nondeterministic procedure:

1. The grammar generates a sentential form X a_1 a_2 ... a_n Y Z, X, Y, Z being nonterminals, a_i ∈ V, 1 ≤ i ≤ n, and n > n_0.

2. If n_0 > 0, choose q ∈ val_φ(a_1 a_2 ... a_{n_0}) and transform the sentential form into either

X b_1 b_2 ... b_{n_0} a_{n_0+1} ... a_n Y (-1)^{|q|} Z, iff q < 0, or

X b_1 b_2 ... b_{n_0} a_{n_0+1} ... a_n Y 1^q Z, iff q ≥ 0,

b_1, b_2, ..., b_{n_0} being nonterminals.


3. While the current sentential form contains words in V and no trap nonterminal do

• Assume that the suffix of the current sentential form is Y c^q Z, for some c ∈ {-1, 1} and q ≥ 0.

• If cq ∈ [min(val_φ(x)), max(val_φ(x))], then
  - choose a word a_z in between X and Y,
  - transform a_z into a nonterminal b_z,
  - choose r ∈ α(a_z), and write either 1^{r(kz+p)}, if r ≥ 0, or (-1)^{|r|(kz+p)}, if r < 0, before Z,
  - remove iteratively all pairs of consecutive symbols -1, 1, or 1, -1, in between Y and Z.

• If cq < min(val_φ(x)), then look for a word a_z in between X and Y such that α(a_z) contains a positive integer; if no such position exists, then block the derivation by a trap nonterminal; otherwise
  - transform a_z into a nonterminal b_z,
  - choose r ∈ α(a_z), r > 0, and write 1^{r(kz+p)} before Z,
  - remove iteratively all pairs of consecutive symbols -1, 1, in between Y and Z.

• If cq > max(val_φ(x)), then look for a word a_z in between X and Y such that α(a_z) contains a negative integer; if no such position exists, then block the derivation by a trap nonterminal; otherwise
  - transform a_z into a nonterminal b_z,
  - choose r ∈ α(a_z), r < 0, and write (-1)^{|r|(kz+p)} before Z,
  - remove iteratively all pairs of consecutive symbols 1, -1, in between Y and Z.

4. If the current sentential form does not contain any trap nonterminal, check whether or not its suffix Y c^q Z satisfies the relation cq ∈ val_φ(x), q ≥ 0. In the affirmative, remove all symbols c and X, Y, Z, and rewrite all nonterminals b_i into a_i, 1 ≤ i ≤ n; otherwise block the derivation by a trap nonterminal.

Clearly,

[x]_φ = L(G_x) ∪ {y ∈ V* | x ~_φ y, lg(y) ≤ n_0}.


Note that the working space [14] of each z ∈ L(G_x) is bounded as follows:

WS(z, G_x) ≤ max(lg(z) + 3 + m_1, lg(z) + 3 + m_2 + 2·m_3·(k·lg(z) + p)),

where

m_1 = max{|s| : s ∈ val_φ(y), y ∈ V*, lg(y) ≤ n_0},
m_2 = max{|s| : s ∈ val_φ(x)},
m_3 = max{|s| : s ∈ α(a), a ∈ V}.

It follows that L(G_x) is context-sensitive, hence [x]_φ is context-sensitive, too. □

Note that, by relation (1) and Lemma 1, if lg(x) = lg(y), then x ~_φ y iff x ≈_φ y. We do not have any algorithm for deciding whether two sentences are strongly synonymous with respect to valuations whose position functions are arbitrary polynomials. However, we present below an algorithm for a large class of valuations.

Theorem 6. If φ = (α, β) is a valuation with β being a non-constant polynomial and there is a word a with exactly one non-zero value in α(a), then one can decide whether x ≈_φ y, for any sentences x, y.

Proof. Let φ = (α, β) be a valuation of V* in Z with β(n) = c_0 n^m + c_1 n^{m-1} + ... + c_m. Assume that x ≈_φ y; it follows that x ~_φ y as well as val_φ(x a^k) ∩ val_φ(y a^k) ≠ ∅, for all k ≥ 0, a being the word in V with just one attribute in α(a), which is not zero. By Lemma 1 and relation (1) one gets

val_φ(x a^k) = val_φ(x) + Σ_{i=0}^{m} c_i Σ_{j=1}^{k} (lg(x) + j)^{m-i} α(a). (5)

Analogously,

val_φ(y a^k) = val_φ(y) + Σ_{i=0}^{m} c_i Σ_{j=1}^{k} (lg(y) + j)^{m-i} α(a). (6)

Consequently, lg(x) = lg(y) must hold, otherwise {s_1 - s_2 | s_1 ∈ val_φ(x), s_2 ∈ val_φ(y)} would be infinite, a contradiction. Indeed, since α(a) is a non-null integer, if lg(y) > lg(x), then the relations (5) and (6) may be written as


val_φ(x a^k) = val_φ(x) + q_k and val_φ(y a^k) = val_φ(y) + q_k + t_k,

for some integers q_k, t_k, with the t_k pairwise distinct. One infers that t_k ∈ {s_1 - s_2 | s_1 ∈ val_φ(y), s_2 ∈ val_φ(x)} for all k ≥ 0, which is contradictory. Analogously, when lg(y) < lg(x).

In conclusion, x ≈_φ y iff (x ~_φ y) & (lg(x) = lg(y)), conditions that may be algorithmically checked. □

5 The exponential case a^n

The subject of investigation in this section is the class of valuations φ = (α, β) whose position valuating function β is a particular exponential function, namely β(n) = a^n, n ≥ 1, a ∈ Z \ {0, 1}.

Clearly,

val_φ(xy) = val_φ(x) + a^{lg(x)}·val_φ(y). (7)

Theorem 7. Let φ = (α, β) be a valuation of V* in Z, β(i) = a^i, i ∈ ℕ, a ∈ Z \ {0}. One can decide whether there exist sentences x ∈ V+ such that q ∈ val_φ(x), for a given q ∈ Z.

Proof. One can distinguish three cases: a = 1, |a| ≥ 2 and a = -1. If a = 1, we are dealing with a valuation whose position valuating function is constant; this situation has been treated in the proof of Theorem 1.

Let us analyse the case a = -1. Define d as being the greatest common divisor of all integers in the set {r - s | r ∈ α(b), s ∈ α(c)}, where b and c are words (might be the same) in V. Given an integer q, there exists x such that q ∈ val_φ(x) if and only if

q ≡ t (mod d), with t ∈ {0} ∪ ∪_{b ∈ V} {|s| : s ∈ α(b), s < 0} or d - t ∈ ∪_{b ∈ V} {s : s ∈ α(b), s > 0}.

The argument is similar to that used in the proof of Theorem 1; the reader may easily find out the slight modifications.

Assume now that |a| ≥ 2. It is easy to notice that there exists a sentence x ∈ V+ such that q ∈ val_φ(x) if and only if there exists a polynomial P whose coefficients are in the set C = {p ∈ α(b) | b ∈ V} such that P(a) = q/a. Suppose that x = b_1 b_2 ... b_m, b_i ∈ V, 1 ≤ i ≤ m. For

val_φ(x) = a(α(b_1) + a·α(b_2) + ... + a^{m-1}·α(b_m))

and a ≠ 0, it follows that the required polynomial P is an element in the set of polynomials

α(b_1) + X·α(b_2) + ... + X^{m-1}·α(b_m).

Let p = max({|l| : l ∈ C} ∪ {|q/a|}). The following algorithm decides, for any given |a| ≥ 2, whether there is a polynomial P with coefficients in C such that P - q/a has the zero a.


Algorithm 2.
Procedure ExistsPolynomial(φ, q);
begin
  C := {p ∈ α(b) | b ∈ V};
  D1 := {-q/a}; D := ∅;
  repeat
    D := D ∪ D1; R := ∅;
    for each i ∈ D do
      for each j ∈ C do
        if (i + j) mod a = 0 then R := R ∪ {(i + j) div a};
      endfor;
    endfor;
    if 0 ∈ R then "THE POLYNOMIAL EXISTS"; stop;
    else D1 := D1 ∪ R;
  until D = D1;
  "THE POLYNOMIAL DOES NOT EXIST";
end.
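Algorithm 2 transcribes directly into Python. The sketch below is our own rendering (function name and dictionary encoding of α are ours); it assumes |a| ≥ 2 and relies on exact integer division:

```python
def exists_polynomial(alpha, a, q):
    """Algorithm 2: is there a sentence x with q in val_phi(x), for
    beta(i) = a^i and |a| >= 2?  Equivalently: is there a polynomial P
    with coefficients in C = union of alpha(b) such that P(a) = q/a?"""
    if q % a:
        return False                  # q/a must be an integer
    C = set().union(*alpha.values())
    D1 = {-q // a}                    # start from -q/a, as in the paper
    D = set()
    while D != D1:                    # the repeat ... until D = D1 loop
        D = set(D1)
        R = set()
        for i in D:
            for j in C:
                if (i + j) % a == 0:  # only exact multiples of a survive
                    R.add((i + j) // a)
        if 0 in R:
            return True               # "THE POLYNOMIAL EXISTS"
        D1 = D | R
    return False                      # "THE POLYNOMIAL DOES NOT EXIST"
```

With a = 2 and a single word value 1, the reachable valuations are 2 + 4 + ... + 2^m, so q = 6 is accepted (the sentence bb) while q = 5 and q = 10 are rejected, matching the termination and correctness arguments that follow.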

In order to finish the proof, we need a reasoning for the correctness of the above algorithm.

Termination. We claim that a t each step when a number i + j div a, i E D and j E C, is added to R, this number is between -p and p. Indeed, initially the assertion is valid. Assume that at an arbitrary moment, when entering the repeat ... until loop, all elements of D are bounded by -p and p, respectively. For la1 2 2, every multiple of a of the form i + j , i E D, j E C, is in the interval [-2p, 2p], hence i + j div a is in [-p,p] . Consequently, either 0 E R, during the loop or D = D1 after this loop has been performed at most 2p times.

Correctness. Assume that the algorithm provides 0 in R at some step. This implies the existence of some k ≥ 1 and letters b_{i_1}, ..., b_{i_k} ∈ V such that

(···((-q/a + a(b_{i_k}))/a + a(b_{i_{k-1}}))/a + ···)/a + a(b_{i_1}) = 0,

or equivalently q/a ∈ a(b_{i_1})a^{k-1} + a(b_{i_2})a^{k-2} + ··· + a(b_{i_k}).

It follows that q ∈ val_φ(b_{i_k} b_{i_{k-1}} ··· b_{i_1}). Obviously, if the algorithm ends with D = D1, then there is no sentence y such that q ∈ val_φ(y).

It is worth mentioning here that the problem of deciding upon the strong synonymy between two given sentences can be algorithmically solved for the same class of valuations as that stated in Theorem 6.

Theorem 8. If φ = (a, p) is a valuation with p being an exponential function p(n) = a^n, whose base a is any integer distinct from 0 and -1, and there is a word


b with exactly one value in a(b), then one can decide whether x ≡_φ y, for any sentences x, y.

The proof is an immediate consequence of relation (7) and is left to the reader as an exercise.

6 Final remarks

We briefly discuss here some considerations that seem to be in order. In the present paper we have considered the semiring of integers with addition and multiplication. It appears to be of interest to replace it by other semirings (or other structures) having linguistic relevance.

Our approach tries to valuate all sentences over a vocabulary. A more natural approach might be the valuation of just those sentences which belong to a given language. An attractive class of languages seems to be the context-free one.

As one can easily notice, there are plenty of natural questions without answer; all of them remain to be further investigated. We provide below a list of a very few of them which appear to be more attractive from our point of view.

1. Is it decidable whether or not two given sentences are strongly synonymous with respect to valuations whose position function is an arbitrary polynomial or exponential function?

2. Can we decide the finiteness of strong synonymy classes in the arbitrary polynomial case? What about the same problem for both classes in the exponential case?

In our opinion, a natural direction of further work may consider this formalism as an algebraic backbone upon which other formalisms of semantical structure can be grafted.

References

[1] J. Berstel and D. Perrin, Theory of Codes, Pure and Applied Mathematics, Academic Press, 1985.

[2] J. Berstel and C. Reutenauer, Rational Series and Their Languages, EATCS Monographs on Theoretical Computer Science, vol. 12, Springer, Berlin, 1988.

[3] J. Dassow, Gh. Păun, Regulated Rewriting in Formal Language Theory, Springer-Verlag, 1989.

[4] J. Dassow, V. Mitrana, Finite automata over free generated groups, Intern. Journal of Algebra and Computation, 10, 6(2000), 725-737.

[5] S. A. Greibach, Remarks on blind and partially blind one-way multicounter machines, Theoret. Comp. Sci. 7(1978), 311-324.

[6] H. Fernau, Valuation of languages, with applications to fractal geometry, Theoret. Comput. Sci. 137 (1995) 177-217.

[7] H. Fernau, L. Staiger, Valuations and unambiguity of languages, with applications to fractal geometry, ICALP'94, LNCS 820, Springer, 11-22.

[8] G. Hansel and D. Perrin, Codes and Bernoulli partitions, Math. Systems Theory 16 (1983) 133-157.

[9] G. Hansel and D. Perrin, Rational probability measures, Theoret. Comput. Sci. 65 (1989) 171-188.

[10] O. H. Ibarra, S. K. Sahni, C. E. Kim, Finite automata with multiplication, Theoret. Comp. Sci., 2(1976), 271-294.

[11] W. Kuich and A. Salomaa, Semirings, Automata, Languages, EATCS Monographs on Theoretical Computer Science, vol. 5, Springer-Verlag, 1986.

[12] V. Mitrana, R. Stiebe, Extended finite automata over groups, Discrete Appl. Math., 108, 3(2001), 247-260.

[13] Gh. Păun, A new generative device: valence grammars, Rev. Roum. Math. Pures et Appl., 25, 6(1980), 911-924.

[14] A. Salomaa, Formal Languages, Academic Press, 1973.


Appendix

Let S be a finite set of positive integers; we denote by gcd(S) the greatest common divisor of all elements of S. We now proceed to prove the following claim, which is equivalent to the fact used in Theorem 1:

Claim 1 Let S be a finite set of positive integers. For any partition S1, S2 of S , with both sets S1, S2 nonempty, the following two conditions are satisfied:

1. There exist coefficients c(t), t ∈ S, such that

gcd(S) = Σ_{t ∈ S} c(t)·t

and c(t) ≥ 0 for all t ∈ S1, and c(t) ≤ 0 for all t ∈ S2.

2. There exist coefficients c(t), t ∈ S, such that

gcd(S) = Σ_{t ∈ S} c(t)·t

and c(t) ≤ 0 for all t ∈ S1, and c(t) ≥ 0 for all t ∈ S2.

Proof. We prove the claim by induction on the cardinality of S. Let us assume that S = {t1, t2}. It is known that gcd(S) = c·t1 + f·t2 for some integers c, f. Clearly, c ≥ 0 and f ≤ 0, or c ≤ 0 and f ≥ 0. Without loss of generality we may assume that f ≤ 0. If f = 0, then c = 1, gcd(S) = t1, and t2 = g·t1 for some g ≥ 1. Now we can write gcd(S) = (1 - 2g)·t1 + 2·t2 and we are done.

Suppose now that f < 0; then c·t1 = gcd(S) - f·t2. Since t2 = g·gcd(S) for some g ≥ 1, one gets c·t1 = gcd(S) - f·g·(c·t1 + f·t2), which implies gcd(S) = c(f·g + 1)·t1 + f²·g·t2.

Since f·g + 1 ≤ 0 and f²·g ≥ 0, the basic step of the induction is completely proved. We assume that the assertion holds for any set S of cardinality s, consider a

set S' of cardinality s + 1, and a partition of S' into two nonempty sets S'1 and S'2. Obviously, at least one of these sets, say S'2, contains at least two elements. Consider an arbitrary element of S'2, say l, and set S = S1 ∪ S2, where S1 = S'1 and S2 = S'2 \ {l}. By the induction hypothesis,

gcd(S') = gcd({gcd(S), l}) = c'·Σ_{t ∈ S} c(t)·t + f'·l

holds, where the coefficients c(t) can be chosen as stated above. Again, either c' ≥ 0 and f' ≤ 0, or c' ≤ 0 and f' ≥ 0. We now choose c(t) ≥ 0 for all t ∈ S1 and c(t) ≤ 0 for all t ∈ S2, which completes the proof.
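The two-element base case of the proof can be experimented with computationally. The following sketch (our own illustrative helper, not part of the paper) produces Bezout coefficients with a prescribed sign pattern, matching condition 1 (sign1 = +1) or condition 2 (sign1 = -1) of Claim 1 for S = {t1, t2}:

```python
def signed_bezout(t1, t2, sign1=1):
    """Coefficients (c, f) with c*t1 + f*t2 = gcd(t1, t2) for positive
    t1, t2, where c has the sign of `sign1` (or is 0) and f has the
    opposite sign (or is 0).  Our own illustrative helper."""
    # extended Euclidean algorithm
    old_r, r = t1, t2
    old_c, c = 1, 0
    old_f, f = 0, 1
    while r:
        qq = old_r // r
        old_r, r = r, old_r - qq * r
        old_c, c = c, old_c - qq * c
        old_f, f = f, old_f - qq * f
    d, cc, ff = old_r, old_c, old_f
    # shift along the kernel direction (t2/d, -t1/d) until the
    # requested sign pattern holds
    while not (sign1 * cc >= 0 and sign1 * ff <= 0):
        cc += sign1 * (t2 // d)
        ff -= sign1 * (t1 // d)
    return cc, ff

print(signed_bezout(6, 4, 1))   # (1, -1):  1*6 - 1*4 = 2
print(signed_bezout(6, 4, -1))  # (-1, 2): -1*6 + 2*4 = 2
```

The shift step mirrors the rewriting trick used in the proof above: adding a multiple of (t2/d, -t1/d) to the coefficient pair leaves the linear combination unchanged while flipping the sign pattern.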


JOIN DECOMPOSITIONS OF PSEUDOVARIETIES OF THE FORM DH ∩ ECom

KARL AUINGER
Institut für Mathematik, Universität Wien, Strudlhofgasse 4, A-1090 Wien, Austria
E-mail: Karl.Auinger@univie.ac.at

A constructive proof of the equation

DH ∩ ECom = (J ∩ ECom) ∨ H

is presented where H denotes any arborescent pseudovariety of groups. In addition, a larger class of pseudovarieties of groups is found for which that equation holds.

1 Introduction

The purpose of this paper is to present a constructive proof of the equation

DH ∩ ECom = (J ∩ ECom) ∨ H    (1)

where H is a "sufficiently nice" pseudovariety of groups. As usual, for a group pseudovariety H, DH denotes the pseudovariety of all (finite) semigroups all of whose regular 𝒟-classes are groups in H, while ECom is the class of all (finite) semigroups with commuting idempotents and J stands for the pseudovariety of all 𝒥-trivial semigroups. (In this paper, all semigroups except free semigroups A+ and free monoids A* are assumed to be finite.) A syntactic proof of the equation (1) has been found by Almeida and Weil [2] in the case where H is arborescent, which means that (H ∩ Ab) * H = H (where Ab is the pseudovariety of all abelian groups and * denotes the Mal'cev product, or, equivalently, the semidirect product of the involved pseudovarieties). Moreover, in the same paper one can find (in terms of a "unique factorisation property") a condition characterizing the set of all pseudovarieties H satisfying equation (1). From that condition it follows that this set is closed under taking joins (within the lattice of all pseudovarieties). The arguments there are based on a careful study of the free pro-H groups and some knowledge of the free pro-DS semigroups. (Here DS denotes the pseudovariety of all semigroups all of whose regular 𝒟-classes are subsemigroups.) The proof thereby obtained is not constructive in the sense that for a given S ∈ DH ∩ ECom it would effectively construct a semigroup C ∈ J ∩ ECom and a group H ∈ H such that S divides the direct product C × H. From the proof we only know that suitable C and H do exist.


In contrast, our approach will prove equation (1) for a larger class of pseudovarieties H which can be characterized by a certain condition (P) (see Definition 2.3), and the proof will be constructive. It is based on a discovery by Ash, Hall and Pin [5] which provides a convenient set of generators of the pseudovariety DH ∩ ECom. In that paper it is shown that each S ∈ DH ∩ ECom divides a precisely described finite direct product of transition semigroups of automata of a very special kind (these automata will be introduced in section 3); conversely, all such transition semigroups are in DH ∩ ECom. The main idea of our proof then will be, given an automaton 𝒜 of that kind, to consider a certain quotient automaton 𝒜/∼ (∼ essentially eliminates the non-trivial group sub-automata of 𝒜) whose transition semigroup is aperiodic (that is, it is a member of J ∩ ECom). Then we construct a suitable finite group H such that the transition semigroup M(𝒜) of 𝒜 divides the direct product of the transition semigroup M(𝒜/∼) of 𝒜/∼ and H. The group H is especially designed to outweigh the "loss in accuracy" which comes from going from 𝒜 to 𝒜/∼. The prerequisites for constructing such a group are developed in section 2 (without giving full proofs). An expanded version of the paper, containing full proofs and several refinements, will appear elsewhere.

For undefined notions in the theories of semigroups, pseudovarieties, automata, etc., the reader is referred to the books of Almeida [1] and Pin [9]; for background information about varieties of groups, the book of Neumann [8] is a good reference.

Throughout, for a word w ∈ A* (on a finite alphabet A), c(w) stands for the content of w, that is, the set of all letters occurring in w, while |w| denotes the length of the word w. For any finite set S, |S| stands for the number of elements of S. For any A-generated (semi)group S = (A) and any word w ∈ A+ we will write, if emphasis is necessary, w(S) to denote the evaluation of the word w in S.

2 Groups

Here we present an auxiliary result which will be used essentially in the next section in the proof of the main theorem. The result is about semigroup identities (not) being satisfied by certain group varieties. (However, the result holds - mutatis mutandis - for group identities as well.) The notation throughout this section will be as follows. For a finite alphabet A let (x_i)_{i≥1} be a sequence of letters of A and (u_i)_{i≥0}, (v_i)_{i≥0} two sequences of words in A* (some of them may be empty) such that

x_i ∉ c(u_{i-1}) ∪ c(v_{i-1}) ∪ c(u_i) ∪ c(v_i).


We will mainly be interested in identities of the form

u_0 x_1 u_1 ··· x_n u_n ≈ v_0 x_1 v_1 ··· x_n v_n.

Let U, V be varieties of groups and let U * V be the usual (Mal'cev) product of U and V , that is,

U * V = {G | G/N ∈ V for some normal subgroup N ∈ U}.

It is well known that * is an associative operation on the lattice of group varieties, and that U * V is generated, as a variety, by all possible semidirect products U * V with U ∈ U and V ∈ V (see Neumann [8]). Moreover, there is a well-known representation of the A-generated free object in U * V as a subgroup of a semidirect product of an appropriate member of U by an appropriate member of V. We use here the version presented in Theorem 10.2.1. More precisely, let F = F_A V be the A-generated free object in V and let A' = F × A. Then F acts on A' by

g(h, a) = (gh, a)  for all g, h ∈ F and all a ∈ A.

Consequently, if G = F_{A'} U is the A'-generated free object in U, then F acts on G by automorphisms on the left via

g[(h_1, a_1)^{±1} ··· (h_n, a_n)^{±1}] = (gh_1, a_1)^{±1} ··· (gh_n, a_n)^{±1}.

So we may form the semidirect product G * F subject to this action. Then the free object on A in U * V is isomorphic to the subgroup of G * F freely generated by the elements of the form ((1, a), a) where a ∈ A. For semigroup identities u ≈ v with u, v ∈ A* this means that if u = a_1 ··· a_n and v = b_1 ··· b_m, then U * V ⊨ u ≈ v if and only if (i) V ⊨ u ≈ v and (ii) U ⊨ (1, a_1)(a_1, a_2) ··· (a_1 ··· a_{n-1}, a_n) ≈ (1, b_1)(b_1, b_2) ··· (b_1 ··· b_{m-1}, b_m). In the latter identity the variables are of the form (u, a) with u ∈ A* and a ∈ A, and two such variables, say (u, a) and (v, b), are the same if and only if a and b are the same letter from A and V ⊨ u ≈ v.
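Whether a concrete finite group satisfies a given semigroup identity, as in criterion (i) above, can be checked by brute force over all variable assignments. A minimal sketch (the helper and the Z/2 example are ours, not from the paper):

```python
from itertools import product

def satisfies_identity(elements, op, u, v):
    """Brute-force check of a semigroup identity u ≈ v in a finite
    (semi)group: every assignment of elements to the variables must
    evaluate both sides to the same element."""
    variables = sorted(set(u) | set(v))
    def ev(word, env):
        acc = None
        for x in word:
            acc = env[x] if acc is None else op(acc, env[x])
        return acc
    return all(
        ev(u, dict(zip(variables, vals))) == ev(v, dict(zip(variables, vals)))
        for vals in product(elements, repeat=len(variables))
    )

# the group Z/2 is abelian, so it satisfies xy ≈ yx, but not x ≈ xx
z2, add = [0, 1], lambda a, b: (a + b) % 2
print(satisfies_identity(z2, add, "xy", "yx"))  # True
print(satisfies_identity(z2, add, "x", "xx"))   # False
```

Such a check is exponential in the number of variables, which is harmless for the small groups and short identities considered here.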

We are going to formulate the main result of this section. For each positive integer n let H_n be a group variety which does not satisfy any non-trivial semigroup identity u ≈ v with |u|, |v| ≤ n and u, v ∈ A*. For each n ≥ 0 let

V_n := H_{n+1} * H_n * ··· * H_2 * H_1

and for convenience put V_{-1} := T, the trivial variety. In the next results we assume that the words u_i, v_i and the letters x_i satisfy the conditions imposed at the beginning of this section. The first result can be proved by induction on n.


Lemma 2.1 The variety V_n does not satisfy any identity

u_0 x_1 u_1 ··· x_{n+1} u_{n+1} ≈ v_0 x_1 v_1 ··· x_t v_t

for 0 ≤ t ≤ n.

The main result of this section is an easy consequence of this lemma, again to be proved by induction on n.

Theorem 2.2 If V_n satisfies u_0 x_1 u_1 ··· x_n u_n ≈ v_0 x_1 v_1 ··· x_n v_n then V_{n-1} satisfies u_i ≈ v_i for all i.

The following property (P) of a pseudovariety H of groups will turn out to be crucial for our purpose.

Definition 2.3 A pseudovariety H of groups has the property (P) if for each G ∈ H and for each positive integer n there is a group F ∈ H such that

1. F does not satisfy any non-trivial semigroup identity u ≈ v for |u|, |v| ≤ n;

2. (F) * (G) ⊆ H.

Property (P) has already been pointed out to be of some interest. Here (···) denotes the pseudovariety generated by the group "···" and * is the Mal'cev (that is, semidirect) product of the involved pseudovarieties. Observe that a pseudovariety H enjoys property (P) if the following, seemingly weaker, condition holds: for each group G ∈ H there exists a prime p for which the wreath product Z_p ≀ G is in H. The following corollary is a consequence of Theorem 2.2, in the form in which we shall use it in the next section.

Corollary 2.4 Let H be a pseudovariety of groups satisfying the property (P) and let G = (A) be an A-generated group in H; then for each positive integer n there is an A-generated group G_n = (A) in H such that

1. for all u, v ∈ A+, if u(G_n) = v(G_n) then also u(G) = v(G);

2. whenever (u_0 x_1 u_1 ··· x_n u_n)(G_n) = (v_0 x_1 v_1 ··· x_n v_n)(G_n), then u_i(G) = v_i(G) for all i (with the assumptions on the words u_i, v_i and the letters x_i imposed at the beginning of this section).

Proof. Let G_0 = G ∈ H and n ∈ N; choose a group F_1 ∈ H not satisfying any identity u ≈ v with |u|, |v| ≤ 2 such that (F_1) * (G_0) ⊆ H. Notice that (F_1) * (G_0) is locally finite, that is, it is the finite trace of a locally finite variety. In particular, in (F_1) * (G_0) all finitely generated free objects exist. Let G_1 be the A-generated free object in (F_1) * (G_0). Suppose that F_{n-1} and G_{n-1} have already been constructed. Let F_n be in H such that F_n does not satisfy any non-trivial identity u ≈ v with |u|, |v| ≤ n + 1 and (F_n) * (G_{n-1}) ⊆ H.


Now let G_n be the A-generated free object in (F_n) * (G_{n-1}). By induction one can see that for all i ≤ n, G_i and (F_i) * ··· * (F_1) * (G_0) satisfy the same identities in |A| variables. Consequently, G_n is the A-generated free object in (F_n) * ··· * (F_1) * (G_0) and so, by Theorem 2.2, has the requested property.

Remark 2.5 Note that the property (P) is not a closure property in the sense that for each pseudovariety H there would exist a least pseudovariety V such that H ⊆ V and V has (P).

3 The main result

As mentioned in the introduction, a pseudovariety of groups is arborescent if (H ∩ Ab) * H = H. It has been shown in [2] that for H arborescent, free pro-H groups enjoy certain "unique factorisation conditions" (similar to the free group). These factorisation conditions were, in turn, one essential ingredient of the proof that such pseudovarieties satisfy the equation (1); the other ingredient was a sufficiently precise description of the implicit operations on DS.

This section gives a constructive proof of the join decomposition (1) which applies to a wider class of pseudovarieties, namely those which satisfy condition (P). As a preparation for this new proof, we first recall a result of Ash, Hall and Pin presenting a set of "nice" generators of the pseudovariety DH ∩ ECom. We require some definitions. Throughout, let A be a finite set of letters. A permutation automaton (or group automaton) 𝒜 = (Q, A, ·) on A consists of a finite non-empty set of states Q together with a labelling of some permutations of Q by the letters of A, denoted by a : q ↦ q·a. Here two different letters may label the same mapping, and the case |Q| = 1 is also included; in the latter case, each letter labels the identity mapping on Q. Let r ∈ N; for each i ≤ r let A_i ⊆ A and 𝒜_i = (Q_i, A_i, ·) be a permutation automaton such that Q_i ∩ Q_j = ∅ if j ≠ i. Moreover, let u_0, u_r ∈ A* and u_1, ..., u_{r-1} ∈ A+ be such that the first letter of u_i is not in A_i (1 ≤ i ≤ r) and the last letter of u_i is not in A_{i+1} (0 ≤ i ≤ r - 1). For each i choose p_i, q_i ∈ Q_i; as in [2], p. 393, let

𝒜 = 𝒜(u_0, p_1, 𝒜_1, q_1, ..., p_r, 𝒜_r, q_r, u_r)

be the automaton depicted in Figure 1 (with appropriately chosen states inside each path u_i).


Figure 1: The automaton 𝒜
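Transition semigroups, which drive the rest of the argument, can be computed mechanically for any finite complete automaton by closing the letter transformations under composition. A sketch (the helper and the two-letter example automaton are ours, not from the paper):

```python
def transition_semigroup(states, letters):
    """Transition semigroup of a complete deterministic automaton:
    close the letter transformations under composition (states acted
    on from the right).  `letters` maps each letter to a dict
    state -> state."""
    def compose(f, g):  # apply f first, then g
        return {q: g[f[q]] for q in states}
    elems = {tuple(sorted(f.items())) for f in letters.values()}
    frontier = [dict(e) for e in elems]
    while frontier:
        f = frontier.pop()
        for g in letters.values():
            h = compose(f, g)
            key = tuple(sorted(h.items()))
            if key not in elems:
                elems.add(key)
                frontier.append(h)
    return elems

# a two-state permutation automaton: 'a' swaps the states, 'b' fixes
# them; its transition semigroup is the two-element group Z/2
swap, ident = {0: 1, 1: 0}, {0: 0, 1: 1}
print(len(transition_semigroup([0, 1], {'a': swap, 'b': ident})))  # 2
```

Adding a non-injective letter (e.g. a constant map) immediately takes the transition semigroup outside the realm of permutation groups, which is exactly the distinction between the group sub-automata and the surrounding paths in Figure 1.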

A formal definition of this automaton has been given in [5] on p. 38. For convenience, let us call such an automaton good. Note that good automata have been characterized as those admitting a linear quasiorder on the set of states which is compatible with the action of the letters. Let n = |u_0 u_1 ··· u_r| be the length of the automaton. If all transition groups of the automata 𝒜_i belong to the group pseudovariety H, then the automaton 𝒜 will be called H-good. The transition semigroup M(𝒜) of an H-good automaton 𝒜 is in DH ∩ ECom (see [5], Corollary 2.3, or [2], Lemma 3.9). Moreover, the class of all transition semigroups of H-good automata generates the pseudovariety DH ∩ ECom. On the one hand, this follows from the proof of Theorem 3.8 in [2]: there it is shown that the class of all such transition semigroups does not satisfy more pseudoidentities than DH ∩ ECom itself; therefore, by Reiterman's Theorem the result follows. On the other hand, a constructive proof of this assertion has been presented in [5]. That is, given any S ∈ DH ∩ ECom, a precisely described finite set of H-good automata has been constructed such that S divides the direct product of the transformation semigroups of these automata (Proposition 3.5 in [5]). We shall discuss that construction in more detail. Crucial for it is an important lemma by Ash which holds, if appropriately reformulated, in the more general context of semigroups with commuting idempotents, and which has been an important step in Ash's famous proof that each semigroup with commuting idempotents divides an inverse semigroup (see [3], [4]). The version we shall need is the following (see Propositions 3.3 and 3.4 in [5]):

Proposition 3.1 Let S = (A) ∈ DG ∩ ECom; then there is a positive integer K = K(S) (depending on S only) such that each word w ∈ A+ admits, for some n ≥ 0, a factorization w = g_0 u_1 g_1 ··· u_n g_n such that

1. all g_i(S) are group elements (g_0, g_n may be empty),

2. for each i, the first letter of u_i is not in c(g_{i-1}) and the last letter of u_i is not in c(g_i),

3. |u_1 u_2 ··· u_{n-1} u_n| ≤ K.


This statement, and its more general version, is usually proved by the use of Ramsey's Theorem. For the case we are interested in, semigroups in DG ∩ ECom, we shall give an elementary proof, thereby getting a better bound for the number K. Recall that each S ∈ DG ∩ ECom is a semilattice of unipotent semigroups; that is, if η is the least semilattice congruence on S then S/η ≅ E(S) and each η-class S_e is an ideal extension of the group H_e (the group ℋ-class containing e) by a certain nilpotent semigroup N_e ∪ {0}. Then S_e = H_e ∪ N_e and the product of any |N_e| + 1 elements of S_e lies in H_e. Notice also that for any u, v ∈ A+ with c(u) = c(v) we have u(S) η v(S), that is, u(S) and v(S) are in the same subsemigroup S_e.

Corollary 3.2 Let S = (A) ∈ DG ∩ ECom and let

N = N(S) = max_{e ∈ E(S)} |N_e| + 1.

Then the number K(S) of Proposition 3.1 can be chosen to be less than N^{|A|}. More precisely, each w ∈ A+ admits a factorisation as in Proposition 3.1 such that |u_1 ··· u_n| ≤ N^{|c(w)|} - 1.

Proof. We only have to prove the existence of a factorisation satisfying conditions (1) and (3). Namely, if g(S) is a group element and if x ∈ c(g) then J_{x(S)} ≥ J_{g(S)}, and therefore (xg)(S) and (gx)(S) are group elements as well. Likewise, if g(S) and h(S) are group elements then so is (gh)(S). Therefore, each factorisation satisfying (1) and (3) can be reduced to a factorisation satisfying (1), (2) and (3). The proof now is by induction on the size |c(w)| of the content of w. If |c(w)| = 1 then the claim follows immediately from the definition of N. So let w ∈ A+ and suppose that the claim is true for all words v with |c(v)| < |c(w)|. We factorize w as

w = u_1 x_1 u_2 x_2 ··· u_k x_k u_{k+1}

such that for all i ≤ k, |c(u_i)| = |c(w)| - 1 and x_i is a letter not contained in c(u_i), that is, c(u_i x_i) = c(w); moreover, |c(u_{k+1})| < |c(w)| and u_{k+1} may be empty. If k ≥ N then we are done because in this case

(u_1 x_1)(u_2 x_2) ··· (u_k x_k)

is a product of N or more elements within the same semilattice congruence class S_e, so that this product lies in H_e. Since c(u_{k+1}) ⊆ c(u_1 x_1 ··· u_k x_k), the element w(S) itself is a group element. So we may assume that k ≤ N - 1. By the induction hypothesis, each u_i admits a factorization into group and non-group parts such that the accumulated length of the non-group parts is at most N^{|c(w)|-1} - 1. Thereby we have already found a factorisation of w into group and non-group elements: each u_i x_i admits


such a factorisation with the accumulated length of the non-group elements being at most N^{|c(w)|-1} - 1 + 1 = N^{|c(w)|-1}. Hence the accumulated length of the non-group elements in w is at most

N^{|c(w)|-1}(N - 1) + N^{|c(w)|-1} - 1 = N^{|c(w)|} - 1,

as required.

Remark 3.3 For the semigroup S = (A) ∈ DG ∩ ECom and for any α ⊆ A, α ≠ ∅, put N_α = {w(S) | w(S) is non-regular and c(w) = α} and let N'(S) = max_{∅≠α⊆A} |N_α| + 1. Then Corollary 3.2 still holds if N(S) is replaced with N'(S). The proof of Proposition 3.5 in [5] (combined with Corollary 3.2) now shows the following.

Corollary 3.4 Let S = (A) ∈ DG ∩ ECom; then S divides a direct product of transition semigroups of good automata on A such that

1. each of these automata has length at most |A|·N(S)^{|A|};

2. the incorporated group automata are of the form 𝒜_e = (H_e, A_e, ·) where each H_e is a maximal subgroup of S (which can be regarded as generated by a subset A_e of A); the action · is just the multiplication on the right. The transition group of each such automaton is isomorphic to the group H_e.

Remark 3.5 1. In item 1 we could restrict ourselves to good automata on A of length (precisely) |A|·N(S)^{|A|}; but then, as another factor in the direct product, we have to mention the least group ℋ-class of S which is A-generated and which may be needed for the division but which need not be representable as (a divisor of) the transition semigroup of any automaton described in item 2 of Corollary 3.4 and having positive length.

2. Observe that the number of distinct automata satisfying items 1 and 2 above, and also the size of each associated transition semigroup, are bounded by a primitive recursive function in the cardinality of S.

3. In the proof of Proposition 3.5 in [5] one can argue by induction on |c(w) ∪ c(w')| instead of |c(w)| + |c(w')|. This is the reason why the factor 2 which occurs in the sentence before Proposition 3.6 in [5] does not occur in the expression |A|·N(S)^{|A|}.

Now let 𝒜 = 𝒜(u_0, p_1, 𝒜_1, q_1, ..., p_r, 𝒜_r, q_r, u_r) be a good automaton with group automata 𝒜_i = (Q_i, A_i, ·). Let M(𝒜) be the transition semigroup of 𝒜. We intend to show that M(𝒜) divides the direct product C × H for some


C ∈ J ∩ ECom and some group H. The aperiodic semigroup C is obtained by "factoring the groups out of 𝒜": more precisely, consider the equivalence relation ∼ on the set of states of 𝒜 which identifies all states within each group automaton 𝒜_i. The resulting automaton, denoted by 𝒜/∼, is the following (as on p. 394 in [2]):


Figure 2: The automaton 𝒜/∼

The associated transition semigroup C = M(𝒜/∼) is in J ∩ ECom (see Corollary 2.3 in [5] or Lemma 3.9 in [2]).

Now we construct the group H. Thereby we shall use the ideas developed in section 2. Let H_1 be the (non-trivial, locally finite) variety of groups generated by the transition groups of the automata 𝒜_k (1 ≤ k ≤ r). Let n = |u_0 ··· u_r| be the length of 𝒜 and for each i ∈ {2, ..., n + 1} let H_i be a locally finite variety of groups not satisfying any non-trivial identity u ≈ v with |u|, |v| ≤ i where u, v ∈ A*. Let V_n = H_{n+1} * ··· * H_2 * H_1 and let H = (A) be the free group on A in V_n. Then we have, with the notation introduced above, the main result of the paper:

Theorem 3.6 The semigroup M(𝒜) divides the direct product C × H.

Proof. We show that the map (a, a) ↦ a, a ∈ A, extends to a morphism ⟨(a, a) | a ∈ A⟩ ⊆ C × H → M(𝒜). Let u, v ∈ A+ be such that

u(C) = v(C) and u(H) = v(H).

We have to show that u(M(𝒜)) = v(M(𝒜)); therefore, it suffices to show that for each state q of 𝒜, q·u = q·v. From u(C) = v(C) and from the definition of 𝒜/∼ we have the following: for each state p of 𝒜/∼, p·u = p·v. Consequently, for each state p of 𝒜, p·u is defined if and only if p·v is defined. Let q̄ = q∼; assume that q̄ is somewhere on the path u_i and that q̄·u = q̄·v is somewhere on u_j. Then i ≤ j and we may assume that i < j. Then there are factorisations of u and v, respectively, such that

u = u'_i g_{i+1} u_{i+1} g_{i+2} u_{i+2} ··· g_j u'_j

and

v = u'_i h_{i+1} u_{i+1} h_{i+2} u_{i+2} ··· h_j u'_j


where the words u_k are the words occurring in 𝒜 resp. 𝒜/∼, u'_i is a (possibly empty) suffix of u_i, u'_j is a (possibly empty) prefix of u_j, and for all possible l, the first letter of u_l [u'_j] is not in c(g_l) ∪ c(h_l) [c(g_j) ∪ c(h_j)] while the last letter of u_l [u'_i] is not in c(g_{l+1}) ∪ c(h_{l+1}) [c(g_{i+1}) ∪ c(h_{i+1})] (some of the words g_l, h_l may be empty, as well). By our assumption, u(H) = v(H), that is, V_n ⊨ u ≈ v. By Theorem 2.2, H_1 ⊨ g_l ≈ h_l for all l (notice that in Theorem 2.2 some of the segments u_k, v_k may be empty). In particular, for each l ∈ {i + 1, ..., j} and each group G = (A_l) in H_1 we have that g_l(G) = h_l(G). Let G_l = (A_l) be the transition group of the automaton 𝒜_l. By the above argument we have g_l(G_l) = h_l(G_l), that is, g_l(G_l) and h_l(G_l) are the same element of the group G_l and consequently, g_l and h_l act in the same way on Q_l. This applies to each l ∈ {i + 1, ..., j}. Therefore, q·u'_i g_{i+1} = q·u'_i h_{i+1}, and by induction we get:

q·u'_i g_{i+1} = q·u'_i h_{i+1} ⇒ q·u'_i g_{i+1} u_{i+1} = q·u'_i h_{i+1} u_{i+1}
⇒ q·u'_i g_{i+1} u_{i+1} g_{i+2} = q·u'_i h_{i+1} u_{i+1} h_{i+2} ⇒ ··· ⇒ q·u = q·v

and the theorem is proved.

Using Corollary 2.4 we get the next result in precisely the same way.

Corollary 3.7 Let H be a pseudovariety of groups satisfying condition (P) of Definition 2.3. Let M be the transition semigroup of an H-good automaton. Then M divides the direct product C × H for some C ∈ J ∩ ECom and some H ∈ H.

We have already remarked that, given an arbitrary semigroup S = (A) in DG ∩ ECom, we can find transition semigroups M_1, ..., M_k of good automata such that S divides ∏_{i=1}^{k} M_i and such that k as well as all |M_i| are bounded by a primitive recursive function in |S|. Moreover, the length of the involved automata is bounded by |A|·N(S)^{|A|}. Similarly as in [6], Corollary 3.1, it can be shown that the cardinality of the group H in Theorem 3.6 can be bounded by a primitive recursive function in the length of the involved automata and the cardinalities of the involved subgroups. Summing up, we get the next result (answering problem 25 in [1] for that particular join decomposition):

Corollary 3.8 The decomposition DG ∩ ECom = (J ∩ ECom) ∨ G is effective in the following sense: each S ∈ DG ∩ ECom divides a direct product C × H for some C ∈ J ∩ ECom and H ∈ G such that the cardinalities of C and H are bounded by primitive recursive functions in the cardinality of S. We remark that the latter result holds more generally for each arborescent pseudovariety H. What we actually need is that in property (P) (Definition 2.3) the cardinality of F is bounded by a primitive recursive function in n and


the cardinality of G.

References

1. J. Almeida, Finite Semigroups and Universal Algebra, World Scientific, Singapore, 1994.

2. J. Almeida and P. Weil, Reduced factorizations in free profinite groups and join decompositions of pseudovarieties, Int. J. Algebra Comput. 4 (1994) 375-403.

3. C. J. Ash, Finite idempotent-commuting semigroups, pp 13-23 in: S. M. Goberstein and P. M. Higgins (eds.), Semigroups and Their Applications, Reidel Publishing, Dordrecht, 1987.

4. C. J. Ash, Finite semigroups with commuting idempotents, J. Austral. Math. Soc. 43 (1987) 81-90.

5. C. J. Ash, T. E. Hall and J. E. Pin, On the varieties of languages associated with some varieties of finite monoids with commuting idempotents, Inform. Computation 86 (1990) 32-42.

6. K. Auinger, Semigroups with Central Idempotents, pp 25-33 in: J. C. Birget et al. (eds.), Algorithmic Problems in Groups and Semigroups, Birkhäuser, Boston Basel Berlin, 2000.

7. K. Auinger and B. Steinberg, The geometry of profinite graphs with applications to free groups and finite monoids, preprint.

8. H. Neumann, Varieties of Groups, Springer-Verlag, Berlin Heidelberg New York, 1967.

9. J. E. Pin, Varieties of Formal Languages, North Oxford, London and Plenum, New York, 1986.


Arithmetical Complexity of Infinite Words

S. V. Avgustinovich,* D. G. Fon-Der-Flaass,† A. E. Frid‡

Sobolev Institute of Mathematics,
pr. Koptyuga 4, Novosibirsk, Russia
Email: {avgust,flaass,frid}@math.nsc.ru

Abstract

We introduce a new notion of the arithmetical complexity of a word, that is, the number of words of a given length which occur in it in arithmetical progressions. The arithmetical complexity is related to the well-known function of subword complexity and cannot be less than it. However, our main results show that the behaviour of the arithmetical complexity is not determined only by the subword complexity growth: if the latter grows linearly, the arithmetical complexity can increase both linearly and exponentially. To prove it, we consider a family of D0L words with high arithmetical complexity and a family of Toeplitz words with low complexity. In particular, we find the arithmetical complexity of the Thue-Morse word and the paperfolding word.

1 Introduction

The famous 1927 theorem by Van der Waerden [9, 7] states that for each infinite word w = w_0 w_1 ··· w_n ··· on a finite alphabet Σ there exist arbitrarily long arithmetical progressions k, k + p, ..., k + np such that w_k = w_{k+p} = ··· = w_{k+np}.

In this paper we are interested in the following generalization of the problem: what can the words w_k w_{k+p} ··· w_{k+np} be in general for a given w and arbitrary k, p, and n? What are the properties of the arithmetical closure

*Supported in part by INTAS (grant 97-1001) and RFBR (grant 00-01-00916).
†Supported in part by Netherlandish-Russian grant NWO-047-008-006 and RFBR (grant 99-01-00581).
‡Supported in part by RFBR (grant 99-01-00531) and Federal Aim Program "Integration" (joint grant AO-110).


F_A(w), i.e., of the language consisting of all such words? In particular, we attempt to compute or estimate the arithmetical complexity f_w^A(n) of w, which is defined as the number of words of F_A(w) of length n.

The procedure of taking the arithmetical closure resembles what is called a decimation in the paper [6] by J. Justin and G. Pirillo. In the terms of that paper, taking an arithmetical progression is a blind decimation.

Note that the arithmetical closure can be defined not only for a word but also for a language, and the term "closure" is used precisely because the equality F_A(w) = F_A(F_A(w)) holds for all w. As for the arithmetical complexity f_w^A(n), it is somewhat similar to the usual subword complexity f_w(n), i.e., to the number of factors of w of length n. For example, both the subword and the arithmetical complexity of a periodic infinite word are ultimately constant, and both functions are bounded by (#Σ)^n. Clearly, for all w and n,

f_w(n) ≤ f_w^A(n) ≤ (#Σ)^n.

We try to show in this paper that arithmetical subwords and the function of arithmetical complexity are worth studying. In Section 2, after introducing the needed notions, we show that an arithmetical subsequence of a uniformly recurrent word is uniformly recurrent. Then we pass to studying arithmetical complexity and first show in Section 3 that it need not grow as slowly as the subword complexity. We describe a class of words, containing the Thue-Morse word, having linear subword complexity and arithmetical complexity equal to (#Σ)^n. On the other hand, the arithmetical complexity of a non-periodic word can grow linearly, and in Section 4 we demonstrate this with a family of Toeplitz words. Finally, we use the latter result to compute the arithmetical complexity of the paperfolding word, which turns out to be equal to 8n + 4 for all n ≥ 14.

2 Basic Notions

Let Σ be a finite alphabet. As usual, the set of all finite words on Σ is denoted by Σ*, the set of all non-empty finite words is denoted by Σ+, the set of all words of length n is denoted by Σ^n, and the set of all (right) infinite words is denoted by Σ^ω. For any t ∈ Σ+, the infinite word tt...t... is denoted by t^ω.

A (finite) word u is called a factor, or subword, of a (finite or infinite) word v if v = s_1 u s_2 for some words s_1 and s_2, which may be empty. Let us consider an infinite word w ∈ Σ^ω: w = w_0 w_1 ... w_n ...,


where w_i ∈ Σ. The set of all factors of w is denoted by F(w). The well-known function of subword complexity of the word w (or of the language F(w)) is the number of words in F(w) of length n; we denote it by f_w(n) = f_{F(w)}(n).

Let us call the infinite word w_k^p = w_k w_{k+p} w_{k+2p} ... w_{k+np} ... the arithmetical subsequence of w starting at position k and having difference p. A factor of some w_k^p is called an arithmetical subword of w, and the set F_A(w) of all arithmetical subwords of w is called its arithmetical closure:

F_A(w) = ⋃_{p ≥ 1, k ≥ 0} F(w_k^p) = {λ} ∪ { w_k w_{k+p} ... w_{k+np} | p ≥ 1; k, n ≥ 0 },

where λ denotes the empty word. In these terms, the Van der Waerden theorem can be stated as follows:

Theorem 1 (Van der Waerden 1927) For each infinite word w and positive integer n there exists a symbol a ∈ Σ such that a^n ∈ F_A(w).

In this paper, we are interested in the properties of the language F_A(w) and in particular in its subword complexity f_{F_A(w)}(n), which is also denoted by f_w^A(n) and called the arithmetical complexity of w.
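The definitions above translate directly into a brute-force computation. The sketch below is our own illustration (the function name is made up, not from the paper): it estimates f_w^A(n) from a finite prefix of w by enumerating the words w_k w_{k+p} ... w_{k+(n-1)p} for bounded differences p. Since only a prefix and finitely many differences are examined, the result is a lower bound on the true arithmetical complexity.

```python
def arith_complexity(prefix: str, n: int, max_p: int) -> int:
    """Number of distinct arithmetical subwords of length n found in the
    given finite prefix, using differences 1 <= p <= max_p (a lower
    bound on the arithmetical complexity f_w^A(n))."""
    words = set()
    for p in range(1, max_p + 1):
        # the last index used is k + (n-1)*p, which must stay in range
        for k in range(len(prefix) - (n - 1) * p):
            words.add(prefix[k : k + (n - 1) * p + 1 : p])
    return len(words)
```

For the periodic word (01)^ω this finds 4 words of each length n ≥ 2 (the two constant and the two alternating ones), illustrating that the arithmetical complexity of a periodic word is ultimately constant, while for the Thue-Morse word it already finds all 2^n binary words of small length n, in accordance with Section 3.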

Let v = w_k w_{k+1} ... w_{k+n}. Formally, an occurrence of v in w is the word v together with the number k of its first letter in w. Clearly, a word may have a finite or infinite number of occurrences in an infinite word w. If v is a prefix of w, we call its occurrence corresponding to k = 0 the prefix occurrence.

Recall that an infinite word w is called uniformly recurrent if each of its factors occurs in w infinitely many times with bounded gaps, i.e., if there exists a finite recurrence function R_w(n) such that each factor of w of length R_w(n) contains all factors of w of length n.
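For a concrete feel of the recurrence function, one can estimate R_w(n) from a finite prefix: find the least window length R such that every window of length R in the prefix contains every length-n factor seen in the prefix. This is only a prefix-based estimate and an illustration of ours (the function name is made up), but for the Thue-Morse word it already gives the exact small value R(1) = 3, since that word contains 00 and 11 but no 000 or 111.

```python
def recurrence_bound(w: str, n: int) -> int:
    """Least R such that every length-R window of the finite word w
    contains every length-n factor of w (a prefix-based estimate
    of the recurrence function R_w(n))."""
    factors = {w[i:i + n] for i in range(len(w) - n + 1)}
    for R in range(n, len(w) + 1):
        if all(factors <= {w[i + j:i + j + n] for j in range(R - n + 1)}
               for i in range(len(w) - R + 1)):
            return R
    return len(w)  # no smaller bound visible in this prefix
```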

The following lemma does not seem to be new, but since we did not find a reference to it, here it is given with a proof.

Lemma 1 An arithmetical subsequence of a uniformly recurrent word is uni- formly recurrent.

PROOF. Let us consider an arithmetical subsequence w_k^p of a uniformly recurrent word w. Since a word obtained from a uniformly recurrent word by erasing a finite prefix is uniformly recurrent, it is sufficient to consider k = 0 and to show that the prefix u' of w_0^p of length n' + 1 occurs in it once more with a gap bounded by a function of n' and p.

Let n = n'p and u = w_0 ... w_n. The word u' is an arithmetical subword of u. To find another occurrence of u' in w_0^p, we shall find an occurrence of u in w lying at a distance divisible by p from the prefix occurrence.


For an occurrence v̄ = w_k ... w_{k+n} of a word v in w, we define the function c(v̄) = k mod p. Our goal is to find an occurrence ū of u not equal to the prefix one and having c(ū) = 0.

Denote u = u_0. For all i = 1, ..., p, we define u_i inductively as the minimal prefix of w containing two occurrences of u_{i-1} (including the prefix occurrence): u_i = u_{i-1} r_i = l_i u_{i-1} for some l_i, r_i ∈ Σ+. Since w is uniformly recurrent, all u_i are well defined and |u_i| ≤ R_w(|u_{i-1}|) + 1: indeed, by definition, an occurrence of u_{i-1} is contained in each subword of w of length R_w(|u_{i-1}|), including w_1 ... w_{R_w(|u_{i-1}|)}, and this occurrence of u_{i-1} is not the prefix one. Thus, u_i is a prefix of w_0 ... w_{R_w(|u_{i-1}|)}.

For all j ≥ i ≥ 0, let us denote by u_i^j the last occurrence of u_i in the prefix occurrence of u_j. By the pigeonhole principle, at least two of the p + 1 numbers c(u_0^0) = 0, c(u_1^1), ..., c(u_p^p) are equal: c(u_i^i) = c(u_j^j) for some 0 ≤ i < j ≤ p. This implies the equalities c(u_k^i) = c(u_k^j) for all k ≤ i; in particular, 0 = c(u_0^i) = c(u_0^j). Since u = u_0, for the occurrence ū = u_0^j we have c(ū) = 0. Since i < j, ū is not the prefix occurrence of u in w, but the needed one.

Clearly, since |u_0| = n + 1 and |u_i| ≤ R_w(|u_{i-1}|) + 1 for all i, we have

|u_p| ≤ R_w(R_w(...(R_w(n + 1) + 1)...) + 1) + 1    (p applications of R_w).

Consequently, two occurrences of u' are contained in the prefix of w_0^p of length at most

(1/p)(|u_p| - 1) + 1 ≤ (1/p) R_w(R_w(...(R_w(n + 1) + 1)...) + 1) + 1.

Since this upper bound does not depend on the choice of u' (to consider another word, we just erase another prefix of w at the beginning of the proof), there is an occurrence of each factor of w_0^p of length n' + 1 in each of its factors of length (1/p) R_w(R_w(...(R_w(n + 1) + 1)...) + 1) + 1. So, this gives an estimate for the recurrence function:

R_{w_0^p}(n' + 1) ≤ (1/p) R_w(R_w(...(R_w(n'p + 1) + 1)...) + 1) + 1.

We have estimated the recurrence function of an arithmetical subsequence in terms of its difference and the recurrence function of the initial word. The lemma is proved. □


3 High Arithmetical Complexity

In this section we consider D0L words. Let φ : Σ* → Σ* be a morphism, i.e., a mapping satisfying φ(xy) = φ(x)φ(y) for all x, y ∈ Σ*. Clearly, a morphism is completely defined by the images of letters. If for some a ∈ Σ the image φ(a) starts with a, and |φ^i(a)| → ∞, then there exists a right infinite word, called the fixed point w of φ starting with a, defined by the equalities

w = φ(w) = lim_{n→∞} φ^n(a).

Fixed points of morphisms are also called D0L words and are widely used as examples of infinite words with given properties.

Here we consider only uniform morphisms, i.e., morphisms for which all the images of letters have the same length, denoted by m. Let Σ = Σ_q = {0, 1, ..., q-1}, and let a ⊕ b denote the symbol-by-symbol addition modulo q of the words a and b of equal length.

In this section, for every i ∈ {0, 1, ..., q-1} and j ≥ 0, the expression i^j denotes the word i...i of length j (and not the jth power of the number i). To distinguish it from the arithmetical subsequence w_k^j, the word w_k ... w_k of length j is denoted by (w_k)^j.

We say that a morphism φ is symmetric if for all i ∈ Σ we have φ(i) = φ(0) ⊕ i^m. Clearly, a symmetric morphism is determined by the image of 0. A D0L word is symmetric if it is a fixed point of a symmetric morphism starting with 0.

Theorem 2 Let φ be a symmetric morphism on Σ_q, where q is prime, and let its fixed point w be not ultimately periodic. Then for all n ≥ 0 we have f_w^A(n) = q^n.

PROOF. The words of the form 0^i 1 b_i, where |b_i| = n - i - 1, constitute a basis of Σ_q^n under the symbol-by-symbol addition modulo q. So, it is sufficient to prove, first, that the set of arithmetical subwords of w having length n is closed under this addition and, second, that the words 0^i 1 are contained in F_A(w) for all i ≥ 0.

First, let us consider two arithmetical subwords a and b of w of equal length and prove that a ⊕ b also belongs to F_A(w). Let a = w_k w_{k+p} ... w_{k+np} and b = w_{k'} w_{k'+p'} ... w_{k'+np'}. Let r be an integer such that k + np < m^r, where m is the length of the image of a symbol.

Note that for every i, j ∈ Σ we have φ(i ⊕ j) = φ(0) ⊕ i^m ⊕ j^m = φ(i) ⊕ j^m. Thus, φ^2(i) = φ(φ(0) ⊕ i^m) = φ^2(0) ⊕ i^{m^2}, etc.; by induction, we have φ^r(i) = φ^r(0) ⊕ i^{m^r} for all i. The symbol w_{k'm^r+k} is the (k+1)st


symbol of φ^r(w_{k'}) = φ^r(0) ⊕ (w_{k'})^{m^r}; thus, it can be obtained from the (k+1)st symbol of φ^r(0) (equal to w_k) by adding w_{k'} modulo q. This means that w_{k'm^r+k} = w_k ⊕ w_{k'}; analogously, w_{(k'+p')m^r+k+p} = w_{k+p} ⊕ w_{k'+p'}, etc. Thus, a ⊕ b = w_{k'm^r+k} w_{(k'+p')m^r+k+p} ... w_{(k'+np')m^r+k+np}; since the positions (k'+ip')m^r + k + ip form an arithmetical progression with difference p'm^r + p, we see that a ⊕ b ∈ F_A(w).

Now let us prove that 0^i 1 ∈ F_A(w) for all i ≥ 0. Since q is prime, it is equivalent to 0^i k ∈ F_A(w) for some k ∈ Σ_q, k ≠ 0: indeed, kc ≡ 1 (mod q) for some number c ∈ {0, 1, ..., q-1}, and adding 0^i k to itself c times as described above, we obtain 0^i 1.

Suppose the opposite: an arithmetical subsequence of the form 0^i can be prolonged in F_A(w) only by 0 if i is sufficiently large. Since the morphism is symmetric, so that 0 and any other symbol are interchangeable in F_A(w), and due to the Van der Waerden theorem, infinite arithmetical progressions of 0's do occur in w. Let us consider such a progression w_k^p = 0^ω. Since we can always pass from φ to some power φ^k of it without changing the fixed point w, without loss of generality we assume that the difference p of the progression is not greater than m. So, since we know w_k^p, starting from some point we know at least one symbol of each image of a letter. But since the morphism is symmetric, this is sufficient to determine the images of the letters themselves. Since the positions modulo m of the known symbols (those participating in w_k^p) change periodically, w is itself ultimately periodic. A contradiction. □

Remark 1 It is possible to characterize all ultimately periodic symmetric D0L words using the fact that they must contain arbitrarily long factors whose position modulo m depends on the occurrence. In other terms, each ultimately periodic symmetric D0L word is uncircular (an explicit definition of this notion can be found, e.g., in [5]). Using the criterion of circularity obtained in [5] and adding the condition that the morphism φ is symmetric, we can conclude that a fixed point of a symmetric morphism φ can be (ultimately) periodic only if the morphism has a very special structure. Namely, up to a cyclic renaming of symbols (of the form k → ck mod q for a fixed integer c), we must have

φ(0) = (0 1 ... (q - 1))^l 0

for some l > 0. The complete proof of this statement is not at all difficult but rather cumbersome and thus is omitted.

Remark 2 If we replace the condition of non-periodicity by that of the occurrence of 0^i 1 in F_A(w) for all i, Theorem 2 becomes true for alphabets of all cardinalities.


So, at least for the case when the cardinality q of the alphabet is prime (and in fact for arbitrary alphabets too), Theorem 2 covers most symmetric D0L words. In particular, it applies to the most famous one, the Thue-Morse word w_TM = 0110100110010110... (see [3]), which is the fixed point starting with 0 of the symmetric morphism φ_TM:

φ_TM(0) = 01,  φ_TM(1) = 10.

Since q = 2, the arithmetical complexity of the Thue-Morse word is 2^n.
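The symmetric morphism construction is easy to simulate. The sketch below is an illustration of ours (the names are made up): it builds a prefix of the fixed point of a symmetric morphism from the image of 0 and then checks, by brute force over a prefix, that the Thue-Morse word indeed realizes all 2^n arithmetical subwords of a small length n.

```python
def symmetric_fixed_point(image_of_zero: str, q: int, length: int) -> str:
    """Prefix of the fixed point, starting with 0, of the symmetric
    morphism phi over {0, ..., q-1} with phi(i) = phi(0) (+) i^m."""
    assert image_of_zero[0] == "0"

    def phi(word: str) -> str:
        # phi(a) is obtained from phi(0) by adding a to every symbol mod q
        return "".join(str((int(c) + int(a)) % q)
                       for a in word for c in image_of_zero)

    w = "0"
    while len(w) < length:
        w = phi(w)
    return w[:length]
```

With image_of_zero = "01" and q = 2 this reproduces the Thue-Morse word 0110100110010110...; with image_of_zero = "012" and q = 3 it gives the ternary symmetric word 012120201... .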

4 Low Arithmetical Complexity

In this section, we consider a subfamily of Toeplitz words and prove that each word from it has a linearly growing arithmetical complexity.

Let us consider an alphabet Σ and a symbol ? ∉ Σ called the gap. A pattern is a finite word t ∈ Σ(Σ ∪ {?})*. For an infinite word w ∈ (Σ ∪ {?})^ω and a pattern t we define an infinite word T_t(w) as the result of replacing all gaps in w, in order, by the successive symbols of the periodic infinite word t^ω.

Consider a sequence of patterns t_1, t_2, ..., t_n, ... and the corresponding sequence of infinite words

U_0 = ?^ω,  U_n = T_{t_n}(U_{n-1}) for n ≥ 1.

Clearly, this sequence has a limit U(t_1, ..., t_n, ...) ∈ Σ^ω. It is called the Toeplitz word generated by the sequence (t_1, ..., t_n, ...). In the particular case when all the t_i are equal to the same pattern t, we say that the Toeplitz word is generated by t and denote it by U(t).

The subword complexity of Toeplitz words generated by one pattern was found in [4] (see also [8]); it always grows polynomially and is linear in the case when the number of gaps in the pattern divides its length. All such words are uniformly recurrent.

In this paper we consider only patterns whose gaps constitute an arithmetical progression of prime difference dividing the length of the pattern; i.e., patterns of the form t = t_0 t_1 ... t_{ql-1} with t_i = ? if and only if i ≡ j (mod l),


where l is prime and j ∈ {1, ..., l-1}. The length of such a pattern t is ql, and the gaps constitute an arithmetical subsequence of t^ω starting at the jth symbol and having difference l. The set of all such patterns is denoted by T(l, q, j).

A very close notion of regular patterns, together with the corresponding Toeplitz words (generated, in general, by different patterns) and their subword complexity, was considered in [8].

Example 1 Let us consider the pattern t_pf = 0?2?. Clearly, t_pf ∈ T(2, 2, 1). We have

U_0 = ???????????????????????????? ...,
U_1 = T_{t_pf}(U_0) = 0?2?0?2?0?2?0?2?0?2?0?2?0?2? ...,
U_2 = T_{t_pf}(U_1) = 002?022?002?022?002?022?002? ...,
U_3 = T_{t_pf}(U_2) = 0020022?0022022?0020022?0022 ...,
...

The limit of this sequence is called the (canonical) paperfolding word and is denoted by U(t_pf):

U(t_pf) = 0020022000220220002002220022 ...

The subword complexity of U(t_pf) was found in [1] and is equal to 4n for n ≥ 7.
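The iterative construction of Example 1 can be carried out mechanically. The sketch below (our own illustration, not from the paper) starts from a word of gaps and repeatedly applies T_t, filling the gaps in order with the symbols of t^ω:

```python
def toeplitz(pattern: str, length: int, rounds: int = 32) -> str:
    """Prefix of the Toeplitz word U(pattern): repeatedly replace the
    gaps '?', in order, by the successive symbols of pattern^omega."""
    w = ["?"] * length
    for _ in range(rounds):
        j = 0
        for i, c in enumerate(w):
            if c == "?":
                w[i] = pattern[j % len(pattern)]
                j += 1
    return "".join(w)
```

With rounds large enough, every position of order below rounds gets filled and the prefix is exact; e.g. the first 28 symbols of U(0?2?) come out as 0020022000220220002002220022, and counting the distinct factors of length 7 in a long prefix recovers the value 4·7 = 28 of the subword complexity.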

By a result of [4] (and also, indirectly, of [8]), the properties of the paperfolding word are not unique: the subword complexity of every Toeplitz word generated by a pattern of T(l, q, j) is O(n). We prove that the same property holds for its arithmetical complexity:

Theorem 3 For every t ∈ T(l, q, j), the arithmetical complexity of U(t) grows linearly:

f_{U(t)}^A(n) = O(n).

PROOF. First of all, without loss of generality we may consider only canonical patterns, i.e., patterns all of whose symbols except gaps are distinct. Indeed, we can obtain any other pattern t' ∈ T(l, q, j) (and consequently the Toeplitz word U(t')) from a canonical pattern t ∈ T(l, q, j) (respectively, from U(t)) by identifying symbols. Thus, factors of arithmetical subsequences of U(t') are also obtained from factors of the corresponding arithmetical subsequences of U(t) by identifying symbols. Since each symbol of the canonical pattern t always maps to the same symbol of t', a word from F_A(U(t)) always maps to the same word of F_A(U(t')), but different words of F_A(U(t)) can give the



same word of F_A(U(t')). Thus, the Toeplitz word generated by the canonical pattern has the maximal arithmetical complexity: f_{U(t)}^A(n) ≥ f_{U(t')}^A(n).

Without loss of generality we consider the word U(t) generated by the pattern t = τ_0 τ_1 ... τ_{ql-1} such that the symbol τ_i is equal to i if i ≠ kl + j, 0 ≤ k < q, and to ? otherwise. Clearly, t is a canonical pattern from T(l, q, j).

Let us consider a finite or infinite word u = u_0 u_1 ... u_n ... with u_i ∈ Σ. We say that a position i (and the symbol u_i) has nth order if i = k·l^n + j·(l^n - 1)/(l - 1) for some integer k ≥ 0. This definition is introduced so that if u is a Toeplitz word U(t_1, t_2, ...) for t_i ∈ T(l, q, j), then i is of nth order if and only if u_i appears from a gap not earlier than in U_{n+1}. This is easy to prove by induction: its base is given by the fact that all positions are of order 0, and its step uses the fact that the symbols of (n+1)st order in u are exactly the symbols of 1st order in the arithmetical subsequence of u constituted by its symbols of nth order.
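For the paperfolding parameters l = 2, q = 2, j = 1 this order is easy to compute: a position i has nth order exactly when 2^n divides i + 1. A small sanity check of ours (not from the paper) on a generated prefix also confirms the self-similarity behind the induction step: the symbols of order ≥ 1 (the odd positions) of U(0?2?) again form U(0?2?), while the order-0 positions carry the non-gap pattern symbols 0 and 2 periodically.

```python
def order(i: int, l: int = 2) -> int:
    """Largest n such that position i has nth order when j = 1:
    equivalently, the number of times l divides i + 1."""
    n, m = 0, i + 1
    while m % l == 0:
        m //= l
        n += 1
    return n

def paperfolding(length: int, rounds: int = 20) -> str:
    """Prefix of U(0?2?) by iterated gap filling (cf. Example 1)."""
    w = ["?"] * length
    for _ in range(rounds):
        j = 0
        for i, c in enumerate(w):
            if c == "?":
                w[i] = "0?2?"[j % 4]
                j += 1
    return "".join(w)
```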

Let us choose an arithmetical subsequence v = v_0 v_1 ... v_n ..., where v_i ∈ Σ, of the word U(t) and study the set of factors of v. Let the difference of v be equal to p = mql + p' for some m ≥ 0 and p' ∈ {0, ..., ql - 1}; in other terms, let it be equal to p' modulo ql.

Suppose first that l divides p'. If the position of the first symbol of v in U(t) is equal to j modulo l, then v consists of symbols having 1st order in U(t), and thus is equal to another arithmetical subsequence of U(t) having smaller difference and consisting of appropriate symbols of 0th order. So, it does not contain factors which do not occur in subsequences of smaller differences. Otherwise, if the position of the first symbol of v in U(t) is not equal to j modulo l, then v contains no symbols having 1st order in U(t), and thus v is periodic with period not exceeding q. Such subsequences can add only a finite number of arithmetical subwords of each length.

Now consider the main case, where the difference p = mql + p' is not divisible by l. Since l is prime, this means that exactly one of each l consecutive symbols of v is of first order in U(t), exactly one of each l^2 symbols is of second order, etc.

For each n ≥ 0, let us consider the set S(n) of all factors of v of length l^n whose ((l^n - 1)/(l - 1))th symbol is of nth order in U(t), i.e., is situated at a position number k·l^n + j·(l^n - 1)/(l - 1) for some k. Clearly, for all n the set S(n) is not empty.

Let us show that the prefix of length l^n of a word v(n+1) ∈ S(n+1) belongs to S(n). Indeed, for all j' > i', the i'th and j'th symbols of a word from S(n+1) are situated at distance p(j' - i') in U(t). In particular, since its ((l^{n+1} - 1)/(l - 1))th symbol occupies the position number a(k, n) = k·l^{n+1} + j·(l^{n+1} - 1)/(l - 1), its ((l^n - 1)/(l - 1))th symbol is situated at the position number a(k, n) - p·l^n = (kl + j - p)·l^n + j·(l^n - 1)/(l - 1),


which is of nth order.

Thus, there exists an infinite word s such that for each n its prefix of length l^n belongs to S(n). By the construction of s, each of its factors is a factor of v. Vice versa, since v is uniformly recurrent according to Lemma 1, and s contains arbitrarily long factors of v, s contains all factors of v. So,

F(v) = F(s).

Let s_k be the kth symbol of s having nth order in it. In the initial Toeplitz word U(t), s_{k+ql} is situated at the distance p·q·l^{n+1} from s_k. This distance is divisible by q·l^{n+1}, and thus, provided the position of s_k in U(t) is not of (n+1)st order, s_k = s_{k+ql}. We see that the set of nth order (but not (n+1)st order) symbols of s is defined by a pattern t_n ∈ T(l, q, j). Consequently, s is a Toeplitz word: s = U(t_1, t_2, ..., t_n, ...).

Moreover, since the (k+1)st symbol of each t_i is equal to kp + c_i modulo ql for some c_i, each pattern t_i is uniquely determined by its first symbol (equal to c_i) and by p or, more precisely, by the remainder p' of p modulo ql. That is why we can denote

s = U(t_1, ..., t_n, ...) = U(p'; c_1, ..., c_n, ...) = U(p'; c),

where c is the sequence (c_1, ..., c_n, ...). By the definition of s = U(p'; c), each word equal to some prefix of s occurs in v in such a way that the symbols which had nth order in U(p'; c) correspond to positions of nth order in U(t), for all n. In particular, the position a_n = j·(l^n - 1)/(l - 1) (i.e., the first position of nth order) in s = U(p'; c) is occupied by the symbol c_n and corresponds to the position number d_n = k·q·l^{n+1} + c_n·l^n + j·(l^n - 1)/(l - 1) in U(t), where k ≥ 0.

What symbol occurs at the position number a_{n-1} = j·(l^{n-1} - 1)/(l - 1) in U(p'; c)? By the definition, it is equal to c_{n-1}. On the other hand, in U(t) it corresponds to the symbol at the position d_{n-1} = d_n - p(a_n - a_{n-1}) = d_n - p·j·l^{n-1}. Since p = mql + p', we have d_{n-1} = (kl - jm)·q·l^n + (j + l·c_n - j·p')·l^{n-1} + j·(l^{n-1} - 1)/(l - 1), and thus

j + l·c_n - j·p' ≡ c_{n-1} (mod ql).

We see that c_{n-1} is uniquely determined by c_n and p'; in Example 2 below we denote this fact by c_{n-1} = a(c_n, p'). Hence, since the sequence c is infinite, it is periodic and uniquely determined by c_1 and p'. It means that the sequence s = U(p'; c) and, consequently, the factorial language F(s) = F(v) depend only on p' and c_1: s = U(p'; c) = U(p'; c_1). Thus,


F_A(U(t)) = ⋃_{p', c_1} F(U(p'; c_1)) ∪ P,    (1)

where P is the set of factors of a finite number of periodic words, given by the values of p divisible by l.

We see that the arithmetical closure of U(t) is the union of a finite number of languages of factors of Toeplitz or periodic words. So, computing the arithmetical complexity is reduced to computing the subword complexity of several Toeplitz words. It follows from the results of [4] that the subword complexity of each of them is linear, so the arithmetical complexity of U(t) also grows linearly. □

Example 2 Let us find the arithmetical complexity of the paperfolding word U(t_pf) defined in Example 1 (note that the pattern t_pf is canonical). To do it, we find all the U(p'; c_1) and use Formula (1).

First of all, as proved above, c_{n-1} is uniquely determined by c_n and p' and does not depend on n or p: c_{n-1} = a(c_n, p') = j + l·c_n - j·p' (mod ql). Here j = 1 and q = l = 2, so we easily find a(0, 1) = a(2, 1) = 0 and a(0, 3) = a(2, 3) = 2. That is why if p' = 1, then the only possible c is (0, ..., 0, ...), and U(p'; c) = U(1; 0) = U(0?2?) = U(t_pf). Analogously, if p' = 3, then c = (2, ..., 2, ...), and U(p'; c) = U(t̄_pf), where t̄_pf = 2?0?.

The even (i.e., divisible by l = 2) values of the difference add the set P of factors of 0^ω, 2^ω, and (02)^ω, so that

F_A(U(t_pf)) = F(U(t_pf)) ∪ F(U(t̄_pf)) ∪ P.

It is not difficult to show (see [2]) that a word of length n ≥ 14 can belong to at most one of these three sets. It is also clear that f_{U(t_pf)}(n) = f_{U(t̄_pf)}(n) = 4n, so for n ≥ 14 we have

f^A_{U(t_pf)}(n) = f_{U(t_pf)}(n) + f_{U(t̄_pf)}(n) + 4 = 8n + 4.

The values of f^A_{U(t_pf)}(n) for n ≤ 14 can be found by manually comparing the three sets F(U(t_pf)), F(U(t̄_pf)), and P, and are given in the following table:

n        1   2   3   4    5    6    7    8    9    10   11   12   13    14
f^A(n)   2   4   8   16   24   32   44   52   64   76   86   96   106   116

We have found the arithmetical complexity of the paperfolding word.
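The small values in the table can be cross-checked by brute force for the first few n, where a finite search is exhaustive in practice. The sketch below is an illustration of ours (the function names are made up); for larger n the search over a finite prefix only yields lower bounds on the tabulated values.

```python
def paperfolding(length: int, rounds: int = 20) -> str:
    """Prefix of U(0?2?) by iterated gap filling (cf. Example 1)."""
    w = ["?"] * length
    for _ in range(rounds):
        j = 0
        for i, c in enumerate(w):
            if c == "?":
                w[i] = "0?2?"[j % 4]
                j += 1
    return "".join(w)

def arith_count(w: str, n: int, max_p: int) -> int:
    """Distinct arithmetical subwords of length n found in the finite
    word w, using differences 1 <= p <= max_p."""
    words = set()
    for p in range(1, max_p + 1):
        for k in range(len(w) - (n - 1) * p):
            words.add(w[k : k + (n - 1) * p + 1 : p])
    return len(words)
```

On a prefix of length 2048 with differences up to 64, the counts for n = 1, ..., 4 come out as 2, 4, 8, 16, matching the first columns of the table.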

5 Concluding remark

Like most new notions, arithmetical complexity poses a series of natural problems. As always, it would be interesting to investigate the case of low


complexity and, for instance, to characterize the set of infinite words whose arithmetical complexity grows linearly. Another direction is the examination of known families of infinite words, such as D0L words and Toeplitz words from wider classes, Sturmian words, etc., and finding their arithmetical complexity.

Acknowledgements

We thank Professor Masami Ito, who made it possible to present this work at the 3rd ICWLC, and the referee for careful reading and for checking the calculations in Example 2.

References

[1] J.-P. Allouche, The number of factors in a paperfolding sequence, Bull. Austral. Math. Soc. 46 (1992), 23-32.

[2] J.-P. Allouche, M. Bousquet-Mélou, Canonical positions for the factors in paperfolding sequences, Theoret. Comput. Sci. 129 (1994), 263-278.

[3] J.-P. Allouche, J. Shallit, The ubiquitous Prouhet-Thue-Morse sequence, in: C. Ding, T. Helleseth, and H. Niederreiter (eds.), Sequences and their Applications, Proc. of SETA'98, DMTCS, Springer (1999), 1-16.

[4] J. Cassaigne, J. Karhumäki, Toeplitz words, generalized periodicity and periodically iterated morphisms, European J. of Combinatorics 18 (1997), 497-510.

[5] A. E. Frid, On uniform DOL words, in: M. Morvan, C. Meinel, and D. Krob (eds.), STACS'98, LNCS 1373, Springer (1998), 544-554.

[6] J. Justin, G. Pirillo, Decimations and Sturmian words, RAIRO Informatique 31 (1997), 271-290.

[7] A. Khintchine, Three Pearls in Number Theory, Graylock Press, New York, 1948.

[8] M. Koskas, Complexité de suites de Toeplitz, Discrete Math. 183 (1998), 161-183.

[9] B. L. van der Waerden, Beweis einer Baudet'schen Vermutung, Nieuw Arch. Wisk. 15 (1927), 212-216.


THE EMPEROR'S NEW RECURSIVENESS: THE EPIGRAPH OF THE EXPONENTIAL FUNCTION IN TWO MODELS OF COMPUTABILITY

VASCO BRATTKA
Theoretische Informatik I, FernUniversität Hagen, D-58084 Hagen, Germany

E-mail: vasco.brattka@fernuni-hagen.de

In his book "The Emperor's New Mind" Roger Penrose implicitly defines some criteria which should be met by a reasonable notion of recursiveness for subsets of Euclidean space. We discuss two such notions with regard to Penrose's criteria: one originating from computable analysis, and the one introduced by Blum, Shub and Smale.

1 Introduction

In his book "The Emperor's New Mind" Roger Penrose raises the question whether the famous Mandelbrot set M ⊆ ℝ^2 can be considered as recursive in some well-defined sense. Throughout his discussion of this problem Penrose uses an intuitive notion of recursiveness and he complains about the lack of a mathematically precise meaning of this notion. On the one hand, he argues that it is insufficient to define recursiveness of a set as decidability with respect to computable points, since in this case even a simple set like the unit ball B := {(x, y) ∈ ℝ^2 : x^2 + y^2 ≤ 1} does not become recursive. Since Penrose is convinced that the unit ball should become recursive, we are led to introduce the following criterion.

Penrose’s first criterion. A reasonable notion of recursiveness for subsets of Euclidean space should make the closed unit ball recursive.

Figure 1. The closed unit ball B


On the other hand, Penrose argues that certain other ways to define recursiveness are also inappropriate, especially because they do not handle the border of the sets under consideration in the right way. This aspect is important since the complexity of sets is often inherent in their border, as in the case of Mandelbrot's set. For instance, a definition of recursiveness as decidability with respect to rational or algebraic numbers is insufficient, since in this case sets like the closed epigraph of the exponential function E := {(x, y) ∈ ℝ^2 : y ≥ e^x} would not be handled appropriately. The border of this set does not contain any algebraic point besides (0, 1), and thus the border is irrelevant to a decision procedure which is restricted to algebraic points. Of course, Penrose is convinced that a set as simply structured as the closed epigraph of the exponential function should be recursive. This motivates the second criterion.

Penrose’s second criterion. A reasonable notion of recursiveness for subsets of Euclidean space should make the closed epigraph of the exponential function recursive.


Figure 2. The closed epigraph E of the exponential function

Apparently, there are several similar conditions and Penrose’s criteria are by no means sufficient conditions for a reasonable notion of recursiveness. They are just necessary conditions; a notion of recursiveness which does not meet Penrose’s criteria would be highly suspicious since it could be doubted whether it reflects algorithmic complexity in the right way. Since Penrose did not present any notion which fulfills all his requirements, it seems as if there exists no suitable notion of recursiveness.

The aim of this paper is to compare two existing notions of recursiveness for subsets of Euclidean space and to find out which comes closest to Penrose's requirements. The first notion is based on computable analysis and has been developed and investigated by several authors. The basic idea of recursive analysis is to call a function f : ℝ^n → ℝ computable if there exists a Turing machine which transforms Cauchy sequences of rationals, rapidly converging to an input x, into Cauchy sequences of rationals, rapidly converging to the


output f(x). Moreover, a set A ⊆ ℝ^n is called recursive if its distance function d_A : ℝ^n → ℝ is computable.a Here d denotes the Euclidean metric. This notion of recursiveness straightforwardly generalizes the notion of recursiveness from classical computability theory (see Odifreddi 3): if we endow the natural numbers ℕ with the discrete metric, then the distance function of a subset A ⊆ ℕ is equal to its characteristic function, and computability of the characteristic function is equivalent to recursiveness of the set A. In Euclidean space the distance function is a "continuous substitute" for the characteristic function. Although recursiveness of subsets of Euclidean space in this sense does not correspond to the intuition of "decidability", it is a formal generalization of the classical notion of recursiveness. Especially, a subset A ⊆ ℕ considered as a subset of ℝ is recursive if and only if it is classically decidable. Finally, it is easy to prove that this notion of recursiveness meets Penrose's criteria.
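As a concrete illustration of ours (not from the paper): for the closed unit ball the distance function has the elementary closed form d_B(x, y) = max(0, sqrt(x^2 + y^2) - 1), built from operations that are computable in the above sense, which is essentially why B satisfies Penrose's first criterion in this model. A floating-point sketch (the rigorous notion would require outputting rapidly converging rational approximations instead of floats):

```python
import math

def dist_unit_ball(x: float, y: float) -> float:
    """Distance function d_B of the closed unit ball B in R^2:
    0 on the ball itself, Euclidean distance to the ball outside."""
    return max(0.0, math.hypot(x, y) - 1.0)
```

For instance, dist_unit_ball(2, 0) is 1, while every point inside or on the ball is mapped to 0, so the zero set of d_B is exactly B.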

As a second notion of recursiveness we will investigate the notion which has been developed by Blum, Shub and Smale.4,5 In their theory a function f : ℝ^n → ℝ is computable (we will call it algebraically computable in the following) if there exists a real random access machine which computes f. Such a machine uses real number registers, arbitrary constants, arithmetic operations, comparisons and equality tests. Moreover, a set A ⊆ ℝ^n is called recursive by Blum, Shub and Smale (we will call it algebraically recursive in the following) if its characteristic function is algebraically computable. If we restrict the class of constants appropriately (for instance to rational numbers), then a set A ⊆ ℕ considered as a subset of ℝ is algebraically recursive if and only if it is classically decidable. In this sense the notion of algebraic recursiveness is a second generalization of the classical notion of recursiveness. Obviously, the unit ball is algebraically recursive and hence Penrose's first criterion is met. Blum and Smale have proved that Mandelbrot's set is not algebraically recursive and hence it seems as if they have given an answer to Penrose's original question. But with a similar technique we will prove that the closed epigraph of the exponential function is not algebraically recursive, and hence it is highly questionable whether Blum and Smale's answer to Penrose's question is significant. If even a simple set like the epigraph of the exponential function is not algebraically recursive, we can conclude that algebraic non-recursiveness obviously does not reflect the intrinsic algorithmic complexity of a set.

a The idea of using distance functions to characterize "located" sets was first used in constructive analysis; see Bishop and Bridges.2


2 Recursive and Recursively Enumerable Sets

In this section we give the precise definitions of several classes of recursively enumerable and recursive sets and give a short survey of some elementary properties.

Let d : R^n × R^n → R be the Euclidean metric of R^n, defined by d(x, y) := √(Σ_{i=1}^n |x_i − y_i|²) for all x, y ∈ R^n. By B(x, r) := {y ∈ R^n : d(x, y) < r} we denote the open balls and by B̄(x, r) := {y ∈ R^n : d(x, y) ≤ r} the closed balls with respect to d. For each set A ⊆ R^n we denote by d_A : R^n → R the distance function of A, defined by d_A(x) := inf_{a∈A} d(x, a). Let α : N → R^n be some standard enumeration with range(α) = Q^n, defined for instance by α(⟨(i_1, j_1, k_1), ..., (i_n, j_n, k_n)⟩) := ((i_1 − j_1)/(k_1 + 1), ..., (i_n − j_n)/(k_n + 1)). Here, ⟨·⟩ : N² → N denotes Cantor's pairing function, defined by ⟨i, j⟩ := (i + j)(i + j + 1)/2 + j, which can inductively be extended to a function ⟨·⟩ : N^n → N. All these pairing functions are bijective and computable, as well as their inverses.
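As a small aside (not part of the paper), the pairing function ⟨i, j⟩ = (i + j)(i + j + 1)/2 + j and a computable inverse can be sketched in Python; the function names are ours:

```python
def pair(i, j):
    """Cantor's pairing function <i, j> = (i + j)(i + j + 1)/2 + j."""
    return (i + j) * (i + j + 1) // 2 + j

def unpair(n):
    """Computable inverse: recover (i, j) from n = pair(i, j)."""
    w = 0
    while (w + 1) * (w + 2) // 2 <= n:  # largest w = i + j with w(w+1)/2 <= n
        w += 1
    j = n - w * (w + 1) // 2
    return w - j, j
```

The pairing walks the diagonals i + j = 0, 1, 2, ..., which is why it is bijective; the inverse just locates the diagonal.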

We assume that the reader is familiar with the definition of computable real functions (see, for instance, Weihrauch,7 Pour-El and Richards,8 Ko9).

We briefly recall the ideas: a function f :⊆ R^n → R is called computable if there exists a Turing machine which transforms each Cauchy sequence (q_i)_{i∈N} of rational numbers q_i ∈ Q^n (encoded with respect to α) which rapidly converges to some x ∈ dom(f) into a Cauchy sequence (r_i)_{i∈N} of rational numbers r_i ∈ Q which rapidly converges to f(x). Here, rapid convergence means d(q_i, q_k) ≤ 2^{−k} for all i > k (and correspondingly for (r_i)_{i∈N}). Of course, a Turing machine which transforms an infinite sequence into an infinite sequence has to compute infinitely long, but in the long run the correct output sequence has to be produced. It is reasonable to assume one-way output tapes for such machines since otherwise the output after some finite time would be useless (because it could be replaced later).

Functions such as exp, sin, cos, ln and max are examples of computable functions. One of the basic observations of computable analysis is that computable functions are continuous. This is because approximations of the output are computed from approximations of the input, and therefore each approximation of the output has to depend on some approximation of the input. Computable functions of type f : N → R^n can be defined similarly and are called computable sequences.
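The sequence-to-sequence machine model can be made concrete for the exponential function. The following Python sketch is ours, not from the paper: exact rational arithmetic stands in for the tape encoding, the helper names are invented, and a simplifying range assumption |x| ≤ 1 replaces a general bound.

```python
from fractions import Fraction

def exp_approx(q: Fraction, eps: Fraction) -> Fraction:
    """Taylor series for e^q, truncated so the tail is provably <= eps.

    Assumes |q| <= 2; for degree n >= 3 the tail is dominated by a
    geometric series with ratio |q|/(n + 2) <= 2/5, so it is at most
    (5/3) < 2 times the next term.
    """
    assert abs(q) <= 2
    total = term = Fraction(1)
    n = 0
    while True:
        nxt = term * q / (n + 1)          # next Taylor term q^(n+1)/(n+1)!
        if n >= 3 and 2 * abs(nxt) <= eps:
            return total
        term = nxt
        total += term
        n += 1

def exp_cauchy(phi, k):
    """From a rapidly converging rational sequence phi for x
    (|phi(m) - x| <= 2^-m, with |x| <= 1 assumed), produce the k-th
    term of a rapidly converging rational sequence for e^x."""
    q = phi(k + 4)
    # |e^x - e^q| <= e^2 * |x - q| <= 8 * 2^-(k+4) = 2^-(k+1); the
    # Taylor truncation contributes at most another 2^-(k+1).
    return exp_approx(q, Fraction(1, 2 ** (k + 1)))
```

The error bookkeeping mirrors the definition of rapid convergence: the output r_k = exp_cauchy(phi, k) is within 2^{−k} of e^x.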

Now we are prepared to define the notions of recursively enumerable and recursive subsets in the sense of computable analysis (see Brattka and Weihrauch10 for a survey). These notions are explicitly defined for open or closed sets, respectively.


Definition 2.1 (Recursively enumerable open and closed sets)

1. An open subset A ⊆ R^n is called recursively enumerable (r.e. for short), if there is a computable function f : N → N² such that A = ∪_{(i,j)∈range(f)} B(α(i), 2^{−j}).

2. A closed subset A ⊆ R^n is called recursively enumerable (r.e. for short), if A = ∅ or there is a computable sequence f : N → R^n such that range(f) is dense in A.

3. An open (closed) set is called co-recursively enumerable (co-r.e. for short), if its complement A^c is r.e.

4. An open (closed) set is called recursive, if it is r.e. and co-r.e.

Recursively enumerable open sets have first been introduced and investigated by Lacombe.11 Definitions equivalent to the given ones have been investigated by several authors (see Weihrauch and Kreitz,12,13 Ko et al.,14,9,15 Ge and Nerode,16 Zhou,17 Zhong,18 Brattka19). The following characterization gives an impression of the stability of the definition of r.e. sets. For completeness we also mention the characterizations via semi-computable distance functions. These notions are not used any further in this paper, and the interested reader is referred to Brattka and Weihrauch10 for the definitions and proofs.

Lemma 2.2 (Characterization of r.e. closed sets) Let A ⊆ R^n be a closed set. Then the following equivalences hold:

1. A is recursively enumerable ⇔ {(i, j) ∈ N² : A ∩ B(α(i), 2^{−j}) ≠ ∅} is recursively enumerable ⇔ d_A : R^n → R is upper semi-computable,

2. A is co-recursively enumerable ⇔ {(i, j) ∈ N² : A ∩ B̄(α(i), 2^{−j}) = ∅} is recursively enumerable ⇔ d_A : R^n → R is lower semi-computable ⇔ A = f^{−1}{0} for some computable function f : R^n → R,

3. A is recursive ⇔ d_A : R^n → R is computable.

Using these characterizations and the fact that the exponential function is a computable function one can easily show that the notion of recursiveness of computable analysis fulfills Penrose's criteria.


Proposition 2.3 (Recursive sets and Penrose’s criteria)

1. The closed unit ball B := {(x, y) ∈ R² : x² + y² ≤ 1} is a recursive set.

2. The closed epigraph E := {(x, y) ∈ R² : y ≥ e^x} is a recursive set.

Proof.

1. We obtain d_B(x, y) = max{0, √(x² + y²) − 1} for the distance function d_B : R² → R. Thus, d_B is computable and B is recursive.

2. There exists a computable function f : N → R² such that range(f) = {(x, e^x + y) ∈ R² : x, y ∈ Q, y ≥ 0}, since the exponential function is computable. Since range(f) is dense in E, it follows that E is an r.e. closed set. The function g : R² → R with g(x, y) := max{0, e^x − y} is computable and E = g^{−1}{0}. Thus, E is a co-r.e. closed set. Altogether, E is a recursive closed set. □
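The two witnesses used in this proof are easy to write down numerically. The following Python sketch is ours; floating point only approximates the computable real functions involved, but it makes the formulas for d_B and the zero-test function g tangible:

```python
import math

def d_B(x, y):
    """Distance function of the closed unit ball B = {(x, y) : x^2 + y^2 <= 1}:
    d_B(x, y) = max{0, sqrt(x^2 + y^2) - 1}."""
    return max(0.0, math.hypot(x, y) - 1.0)

def g(x, y):
    """g(x, y) = max{0, e^x - y}; the closed epigraph E equals g^{-1}{0}."""
    return max(0.0, math.exp(x) - y)
```

Points inside B have distance 0, points outside have positive distance; likewise g vanishes exactly on E.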

More generally, the proof of 2. shows that the closed epigraph epi(f) = {(x, y) ∈ R² : y ≥ f(x)} of a computable function f : R → R is a recursive set.^b It is worth noticing that the notion of computability and the notion of recursiveness of computable analysis fit together very well: a continuous function f : R → R is computable if and only if its graph is recursive as a closed subset of R² (see Weihrauch20).

3 Algebraic Recursiveness

In this section we want to prove that the notion of algebraic recursiveness does not meet Penrose's second criterion. We start with the definition of algebraically r.e. sets as halting sets of real random access machines, as they have been used by Blum, Shub and Smale.4,5 These real random access machines use real number registers, arbitrary constants, arithmetic operations, comparisons and equality tests. We assume that the reader is familiar with the precise definitions. From the point of view of computable analysis, especially the comparisons and equality tests are problematic. From the point of view of classical computability theory, also the constants are suspicious since one can code an arbitrary function f : N → N in such a constant.

^b For a general discussion of computability properties of the epigraph, see Zheng et al.20,21


Definition 3.1 (Algebraically r.e. sets) Let A ⊆ R^n.

1. A is called algebraically r.e., if A is the halting set of some real random access machine.

2. A is called algebraically recursive, if A as well as its complement A^c is algebraically r.e.

If A is the halting set of a real random access machine which only uses rational constants, then we will say that A is algebraically r.e. with rational constants. Obviously, the unit ball B := {(x, y) ∈ R² : x² + y² ≤ 1} is an algebraically recursive set, even with rational constants. We just have to compute x² + y² and test whether x² + y² ≤ 1.

Proposition 3.2 The closed unit ball B := {(x, y) ∈ R² : x² + y² ≤ 1} is algebraically recursive with rational constants.
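A minimal sketch of such a machine's computation (ours, with the function name invented): exact rational arithmetic stands in for the real number registers, and only the rational constant 1 is used, together with one comparison.

```python
from fractions import Fraction

def in_unit_ball(x: Fraction, y: Fraction) -> bool:
    """BSS-style decision of the closed unit ball with rational constants:
    two multiplications, one addition, one comparison against 1."""
    return x * x + y * y <= 1
```

A genuine real random access machine would perform the same finitely many exact operations on arbitrary real inputs; Fractions model the rational-constant, rational-input case.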

One can easily prove that the open epigraph of the exponential function {(x, y) ∈ R² : y > e^x} is an r.e. open set and hence it is also algebraically r.e. (as is any other r.e. open set). On the other hand, we will show that the closed epigraph E of the exponential function is not algebraically recursive. Indeed, we will prove that it is not even algebraically r.e. The proof uses some standard techniques of Blum, Shub and Smale's theory, especially their Path Decomposition Theorem, which states that each algebraically r.e. set is a countable (disjoint) union of basic semi-algebraic sets (see Blum et al.5).

We recall some basic definitions and facts from real algebraic geometry (which can be found in Bochnak et al.22 and Marker et al.23). The class of semi-algebraic subsets of R^n is the smallest class of subsets of R^n which contains all sets {x ∈ R^n : p(x) > 0} with real polynomials p : R^n → R, and which is closed under finite intersection, finite union and complement. Each semi-algebraic set can be written as a finite union of basic semi-algebraic sets, which have the form

{x ∈ R^n : p_1(x) = 0, ..., p_i(x) = 0, q_1(x) > 0, ..., q_j(x) > 0},

where p_1, ..., p_i, q_1, ..., q_j : R^n → R are real polynomials. A (partial) function f :⊆ R^n → R is called semi-algebraic, if its graph

graph(f) := {(x, y) ∈ R^{n+1} : f(x) = y}

is a semi-algebraic set. Using the normal form given above, it is easy to show that each semi-algebraic function is algebraic, i.e. there exists some real polynomial p : R^{n+1} → R, p ≠ 0, such that p(x, f(x)) = 0 for all x ∈ dom(f). By the Theorem of Tarski-Seidenberg semi-algebraic sets are closed under projection, and one can conclude that the interior A°, the closure Ā and


hence the border ∂A = Ā \ A° of a semi-algebraic set A is semi-algebraic too (additionally, one uses the fact that the Euclidean metric is a semi-algebraic function). Correspondingly, one can see that the lower border

A_l := {(x, y) ∈ R² : (x, y) ∈ A and (∀z ∈ R)((x, z) ∈ A ⇒ z ≥ y)}

is semi-algebraic if A ⊆ R² is. By f|_U we will denote the restriction of a function f with dom(f|_U) = dom(f) ∩ U. Now we are prepared to prove the following result.

Proposition 3.3 The closed epigraph E := {(x, y) ∈ R² : y ≥ e^x} of the exponential function is not algebraically r.e.

Proof. Let E := {(x, y) ∈ R² : y ≥ f(x)} be the closed epigraph of the exponential function f : R → R. Let us assume that E is algebraically r.e. Then, by the Path Decomposition Theorem, E is a countable union of semi-algebraic sets A_i ⊆ R², i.e. E = ∪_{i=0}^∞ A_i. Since the closure of a semi-algebraic set is semi-algebraic too, we can assume w.l.o.g. that all sets A_i are closed. Especially, we obtain ∂E = ∪_{i=0}^∞ (∂E ∩ A_i), and since the border ∂E is a complete subspace of R² it follows by Baire's Category Theorem that there are some i ∈ N and a non-empty open set U ⊆ R² such that ∅ ≠ ∂E ∩ U ⊆ A_i. Since ∂E = graph(f) and f is continuous, there are some non-empty open intervals I, J ⊆ R such that f(I) ⊆ J and V := I × J ⊆ U. Hence graph(f|_I) = ∂E ∩ V = (A_i)_l ∩ V is semi-algebraic, since (A_i)_l and V are semi-algebraic. But using the Identity Theorem for real-analytic functions and the power series expansion of the exponential function, one can prove that f|_I is not algebraic. Contradiction! □

This proposition proves that algebraic recursiveness does not meet Penrose's second criterion. We will call a function f : R → R everywhere transcendental if f|_U is not algebraic for each non-empty open set U ⊆ R. The proof that the exponential function is everywhere transcendental can be found in basic texts on analysis (see, for instance, Erwe24). Besides the fact that the exponential function is everywhere transcendental and continuous, we have not used any specific properties of the exponential function in the previous proof. By symmetry we obtain the following general result.

Theorem 3.4 If f : R → R is an everywhere transcendental and continuous function, then neither the closed epigraph, nor the closed hypograph, nor the graph of f is algebraically r.e.

It is worth noticing that the notions of algebraic recursiveness and algebraic computability do not fit together in the same sense as the notions of recursiveness and computability of computable analysis. The square root


function f :⊆ R → R, x ↦ √x, is an example of a function which is not algebraically computable but whose graph is algebraically recursive. Hence, the algebraic non-recursiveness of the graph of the exponential function cannot simply be deduced from the fact that the exponential function is not algebraically computable.^c

4 Conclusion

We have seen that the notion of algebraic recursiveness does not meet Penrose's criteria, while the notion of recursiveness from computable analysis does. The latter notion describes recursiveness in terms of computability of the distance function d_A of a set A. In view of the fact that equality on the real numbers is undecidable, recursiveness in this sense is the best one could expect. Recursiveness implies "decidability up to the equality test on the real numbers": if we only could decide whether d_A(x) = 0, then we could decide whether x ∈ A or not.

An essential question remains open. We do not know whether the Mandelbrot set is a recursive closed set or not. It is easy to see that it is a co-r.e. closed set, but it is still a challenging open question to find out whether it is also an r.e. closed set or not!

Acknowledgements

The main result of this paper, Proposition 3.3, has been motivated by an inspiring discussion with Peter Hertling in Dagstuhl 1997. This work has been supported by DFG Grant BR 1807/4-1.

References

1. R. Penrose, The Emperor's New Mind. Concerning Computers, Minds and The Laws of Physics (Oxford University Press, New York, 1989).

2. E. Bishop and D.S. Bridges, Constructive Analysis (Springer, Berlin, 1985).

3. P. Odifreddi, Classical Recursion Theory (North-Holland, Amsterdam, 1989).

4. L. Blum, M. Shub, and S. Smale, On a theory of computation and complexity over the real numbers: NP-completeness, recursive functions and universal machines, Bull. Amer. Math. Soc. 21:1 (1989) 1-46.

5. L. Blum, F. Cucker, M. Shub, and S. Smale, Complexity and Real Computation (Springer, New York, 1998).

^c Over algebraically closed fields a function is algebraically computable if and only if its graph is algebraically recursive, see Ceola and Lecomte.25


6. L. Blum and S. Smale, The Gödel incompleteness theorem and decidability over a ring, in M.W. Hirsch et al. (eds.), From Topology to Computation: Proceedings of the Smalefest (Springer, New York, 1993) 321-339.

7. K. Weihrauch, Computable Analysis (Springer, Berlin, 2000).

8. M.B. Pour-El and J.I. Richards, Computability in Analysis and Physics (Springer, Berlin, 1989).

9. K.-I Ko, Complexity Theory of Real Functions (Birkhäuser, Boston, 1991).

10. V. Brattka and K. Weihrauch, Computability on subsets of Euclidean space I: Closed and compact subsets, Theoret. Comput. Sci. 219 (1999) 65-93.

11. D. Lacombe, Les ensembles récursivement ouverts ou fermés, et leurs applications à l'Analyse récursive, C.R. Acad. Sc. Paris 246 (1958) 28-31.

12. K. Weihrauch and C. Kreitz, Representations of the real numbers and of the open subsets of the set of real numbers, Ann. Pure Appl. Logic 35 (1987) 247-260.

13. C. Kreitz and K. Weihrauch, Compactness in constructive analysis revisited, Ann. Pure Appl. Logic 36 (1987) 29-38.

14. K.-I Ko and H. Friedman, Computational complexity of real functions, Theoret. Comput. Sci. 20 (1982) 323-352.

15. A. Chou and K.-I Ko, Computational complexity of two-dimensional regions, SIAM J. Comput. 24 (1995) 923-947.

16. X. Ge and A. Nerode, On extreme points of convex compact Turing located sets, in A. Nerode and Y.V. Matiyasevich (eds.), Logical Foundations of Computer Science, vol. 813 of LNCS (Springer, Berlin, 1994) 114-128.

17. Q. Zhou, Computable real-valued functions on recursive open and closed subsets of Euclidean space, Math. Logic Quart. 42 (1996) 379-409.

18. N. Zhong, Recursively enumerable subsets of R^q in two computing models: Blum-Shub-Smale machine and Turing machine, Theoret. Comput. Sci. 197 (1998) 79-94.

19. V. Brattka, Computable invariance, Theoret. Comput. Sci. 210 (1999) 3-20.

20. K. Weihrauch and X. Zheng, Computability on continuous, lower semi-continuous and upper semi-continuous real functions, Theoret. Comput. Sci. 234 (2000) 109-133.

21. X. Zheng, V. Brattka, and K. Weihrauch, Approaches to effective semi-continuity of real functions, Math. Logic Quart. 45:4 (1999) 481-496.

22. J. Bochnak, M. Coste, and M.-F. Roy, Géométrie algébrique réelle (Springer, Berlin, 1987).

23. D. Marker, M. Messmer, and A. Pillay, Model Theory of Fields (Springer, Berlin, 1996).

24. F. Erwe, Differential- und Integralrechnung (Bibliographisches Institut, Mannheim, 1973).

25. C. Ceola and P.B.A. Lecomte, Computability of a map and decidability of its graph in the model of Blum, Shub and Smale, Theoret. Comput. Sci. 194 (1998) 219-223.


ITERATIVE ARRAYS WITH LIMITED NONDETERMINISTIC COMMUNICATION CELL

T. BUCHHOLZ, A. KLEIN AND M. KUTRIB

Institute of Informatics, University of Giessen

Arndtstr. 2, D-35392 Giessen, Germany

E-mail: [email protected]

An iterative array is a line of interconnected interacting finite automata. One distinguished automaton, the communication cell, is connected to the outside world and fetches the input serially, symbol by symbol. In the literature this model is sometimes referred to as a cellular automaton with sequential input mode. We investigate iterative arrays with a nondeterministic communication cell; all the other cells are deterministic. The number of nondeterministic state transitions is regarded as a limited resource which depends on the length of the input. It is shown that the limit can be reduced by a constant factor without affecting the language accepting capabilities, but for sublogarithmic limits there exists an infinite hierarchy of properly included real-time language families. Finally we prove several closure properties of these families.

1 Introduction

Devices of interconnected parallel acting automata have been investigated extensively from a language theoretic point of view. The specification of such a system includes the type and specification of the single automata, the interconnection scheme (which sometimes implies a dimension of the system), a local and/or global transition function and the input and output modes. One-dimensional devices with nearest neighbor connections whose cells are deterministic finite automata are commonly called iterative arrays (IA) if the input mode is sequential to a distinguished communication cell.

Especially for practical reasons and for the design of systolic algorithms, a sequential input mode is more natural than the parallel input mode of so-called cellular automata. Various other types of acceptors have been investigated under this aspect (e.g. the iterative tree acceptors in [8]).

In connection with formal language recognition, IAs have been introduced in [7], where it was shown that the language families accepted by real-time IAs form a Boolean algebra not closed under concatenation and reversal. Moreover, there exists a context-free language that cannot be accepted by any d-dimensional IA in real-time. On the other hand, in [6] it is shown that for every context-free grammar a 2-dimensional linear-time IA parser exists. In [10] a real-time acceptor for prime numbers has been constructed. Pattern


manipulation is the main aspect in [1]. A characterization of various types of IAs by restricted Turing machines and several results, especially speed-up theorems, are given in [13,14,15].

Various generalizations of IAs have been considered. In [20] IAs are studied in which all the finite automata are additionally connected to the communication cell. Several more results concerning formal languages can be found e.g. in [21,22,23].

In some cases fully nondeterministic arrays have been studied, but up to now it is not known how the amount of nondeterminism influences the capabilities of the model. In terms of Turing machines, bounded nondeterminism has been introduced in [11]. Further results concerning cellular automata, Turing machines, pushdown automata and finite automata can be found e.g. in [3,5,16,17,18,19].

Here we introduce IAs with limited nondeterminism. We restrict the ability to perform nondeterministic transitions to the communication cell; all the other automata are deterministic. Moreover, we limit the number of allowed nondeterministic transitions depending on the length of the input.

The paper is organized as follows. In section 2 we define the basic notions and the model in question. Section 3 is devoted to the possibility of reducing the number of nondeterministic transitions by a constant factor. In section 4, by varying the amount of allowed nondeterminism, we prove an infinite hierarchy of properly included language families. Due to the results in section 3 we need sublogarithmic limits for the number of nondeterministic transitions in order to obtain the hierarchy. Finally, in section 5 several closure properties of the real-time acceptors with such limits are shown.

2 Model and Notions

We denote the rational numbers by Q, the integers by Z, the positive integers {1, 2, ...} by N, the set N ∪ {0} by N_0 and the powerset of a set S by 2^S. The empty word is denoted by ε and the reversal of a word w by w^R. We use ⊆ for inclusions and ⊂ if the inclusion is strict. For a function f we denote its i-fold composition by f^[i], i ∈ N, and define the set of mappings that grow strictly less than f by o(f) = {g : N_0 → N | lim_{n→∞} g(n)/f(n) = 0}. The set Ω(f) is defined according to Ω(f) = {g : N_0 → N | liminf_{n→∞} g(n)/f(n) > 0}. The identity function n ↦ n is denoted by id.

An iterative array with nondeterministic communication cell is an infinite linear array of finite automata, sometimes called cells, where each of them is


connected to both of its nearest neighbors (one to the left and one to the right). For convenience we identify the cells with the integers. Initially they are in the so-called quiescent state. The input is supplied sequentially to the distinguished communication cell at the origin. For this reason we have two local transition functions. The state transition of all cells but the communication cell depends on the current state of the cell itself and the current states of both of its neighbors. The state transition of the communication cell additionally depends on the current input symbol (or, if the whole input has been consumed, on a special end-of-input symbol). The finite automata work synchronously at discrete time steps. More formally:

Definition 1 An iterative array with nondeterministic communication cell (G-IA) is a system (S, δ, δ_nd, s_0, #, A, F), where

1. S is the finite, nonempty set of states,

2. A is the finite, nonempty set of input symbols,

3. F ⊆ S is the set of accepting states,

4. s_0 ∈ S is the quiescent state,

5. # ∉ A is the end-of-input symbol,

6. δ : S³ → S is the deterministic local transition function for non-communication cells satisfying δ(s_0, s_0, s_0) = s_0,

7. δ_nd : S³ × (A ∪ {#}) → 2^S is the nondeterministic local transition function for the communication cell satisfying ∀ s_1, s_2, s_3 ∈ S, a ∈ A ∪ {#} : δ_nd(s_1, s_2, s_3, a) ≠ ∅.

Let M be a G-IA (G for guessing). A configuration of M at some time t ≥ 0 is a description of its global state, which is actually a pair (w_t, c_t), where w_t ∈ A* is the remaining input sequence and c_t : Z → S is a mapping that maps the single cells to their current states. During its course of computation a G-IA steps nondeterministically through a sequence of configurations. The configuration (w_0, c_0) at time 0 is defined by the input word w_0 and the mapping c_0(i) = s_0, i ∈ Z, while subsequent configurations are chosen according to the global transition function Δ_nd:

Let (w_t, c_t), t ≥ 0, be a configuration; then the possible successor configurations (w_{t+1}, c_{t+1}) are as follows:

c_{t+1}(i) = δ(c_t(i − 1), c_t(i), c_t(i + 1)) for i ∈ Z \ {0},
c_{t+1}(0) ∈ δ_nd(c_t(−1), c_t(0), c_t(1), a),

where a = #, w_{t+1} = ε if w_t = ε, and a = a_1, w_{t+1} = a_2 ··· a_n if w_t = a_1 ··· a_n. Thus, the global transition function Δ_nd is induced by δ and δ_nd. The i-fold composition of Δ_nd is defined as follows: Δ_nd^[0]((w, c)) := {(w, c)} and Δ_nd^[i+1]((w, c)) := ∪_{(w', c') ∈ Δ_nd^[i]((w, c))} Δ_nd((w', c')).

If the state set is a Cartesian product of some smaller sets, S = S_1 × ... × S_k, we will use the notion register for the single parts of a state. The concatenation of one of the registers of all cells respectively forms a track.

A G-IA is deterministic if δ_nd(s_1, s_2, s_3, a) is a singleton for all states s_1, s_2, s_3 ∈ S and all input symbols a ∈ A ∪ {#}. Deterministic iterative arrays are denoted by IA.
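The synchronous update and the role of the communication cell can be made concrete for the deterministic case. The following Python sketch is ours, not from the paper: a finite dictionary stands in for the infinite array (only cells within distance t of the origin can have left the quiescent state by time t), and '#' plays the end-of-input symbol.

```python
from collections import defaultdict

def run_ia(delta, delta_comm, s0, accept, word, steps):
    """Simulate a deterministic iterative array for `steps` synchronous steps.

    delta(l, s, r)         -> next state of a non-communication cell
    delta_comm(l, s, r, a) -> next state of the communication cell on input a
    Cell 0 is the communication cell; after the input word is consumed
    it reads the end-of-input symbol '#'. Returns True if an accepting
    state appears in cell 0 within `steps` time steps.
    """
    cells = defaultdict(lambda: s0)
    for t in range(steps):
        a = word[t] if t < len(word) else '#'
        new = {}
        for i in range(-(t + 1), t + 2):
            l, s, r = cells[i - 1], cells[i], cells[i + 1]
            new[i] = delta_comm(l, s, r, a) if i == 0 else delta(l, s, r)
        cells.update(new)
        if cells[0] in accept:
            return True
    return False
```

For instance, with delta the identity on the middle state (which satisfies δ(s_0, s_0, s_0) = s_0) and delta_comm jumping to an accepting state upon reading a `b`, the array accepts exactly the words containing a `b`, in real time.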

Definition 2 Let M = (S, δ, δ_nd, s_0, #, A, F) be a G-IA.

1. A word w ∈ A* is accepted by M if there exists a time step i ∈ N such that c_i(0) ∈ F for some (w_i, c_i) ∈ Δ_nd^[i]((w, c_0)).

2. L(M) = {w ∈ A* | w is accepted by M} is the language accepted by M.

3. Let t : N_0 → N, t(n) ≥ n + 1, be a mapping and i_w be the minimal time step at which M accepts a w ∈ L(M) in some computation. If all w ∈ L(M) are accepted within i_w ≤ t(|w|) time steps, then L is said to be of time complexity t.

The family of all languages which can be accepted by a G-IA with time complexity t is denoted by L_t(G-IA). In the sequel we will use a corresponding notation for other types of acceptors. If t equals the function n + 1, acceptance is said to be in real-time and we write L_rt(G-IA). The linear-time languages L_lt(G-IA) are defined according to L_lt(G-IA) = ∪_{k∈Q, k≥1} L_{k·n}(G-IA).

There is a natural way to restrict the nondeterminism of the arrays. One can limit the number of allowed nondeterministic state transitions of the communication cell. For this reason a deterministic local transition function δ_d : S³ × (A ∪ {#}) → S for the communication cell is provided, and the global transition function induced by δ and δ_d is denoted by Δ_d. Let g : N_0 → N_0 be a mapping that gives the number of allowed nondeterministic transitions depending on the length of the input. The resulting system (S, δ, δ_nd, δ_d, s_0, #, A, F) is a gG-IA (g guess IA) if, starting with the initial configuration (w_0, c_0), the possible configurations at some time i are given by the global transformations


as follows:

Δ^i((w_0, c_0)) = {(w_0, c_0)}                                   if i = 0,
Δ^i((w_0, c_0)) = Δ_nd^[i]((w_0, c_0))                           if 1 ≤ i ≤ g(|w_0|),
Δ^i((w_0, c_0)) = ∪_{(w', c') ∈ Δ_nd^[g(|w_0|)]((w_0, c_0))} Δ_d^[i − g(|w_0|)]((w', c'))   otherwise.

Observe that in this definition the nondeterministic transitions have to be applied before the deterministic ones. This is not a serious restriction, since nondeterministic transitions for later time steps can be guessed and stored in advance (cf. the second part of the proof of Theorem 3). Up to now we have not required g to be effective. Of course, for almost all applications we will have to do so, but some of our general results can be developed without such a requirement.

3 Guess Reduction

This section is devoted to the reduction of the number of nondeterministic transitions. In the sequel we will make extensive use of the ability of IAs to simulate a pushdown storage [8,2] or a queue [4] on some track in real-time. The communication cell contains the symbol at the top of the stack or the queue. The left-to-right inclusion in the following theorem is not immediate, since there might be computation paths of the kgG-IA that cannot appear for the gG-IA. Therefore, the kgG-IA must be able to verify whether or not its communication cell has performed g(n) nondeterministic transitions.

Theorem 3 Let g : N_0 → N_0 be a mapping and k ∈ N be a constant. If t : N_0 → N, t(n) ≥ n + 1, is a mapping such that t(n) ≥ k·g(n) for almost all n ∈ N, then

L_t(gG-IA) = L_t(kgG-IA)

Proof. The crucial point in proving the inclusion L_t(gG-IA) ⊆ L_t(kgG-IA) is that a kgG-IA M' which is designated to simulate a given gG-IA M with the same time complexity must not simulate too many nondeterministic transitions of M. Therefore, the communication cell of M' is equipped with a pushdown storage. During its nondeterministic transitions M' can either simulate a nondeterministic step of M, whereby k − 1 specific symbols are pushed, or simulate a deterministic step of M, whereby one symbol is popped.


Once M' has decided to simulate a deterministic transition it has to do so for its remaining nondeterministic steps, whereby again one symbol is popped respectively. In order to accept the input, M' has to pop the last symbol from the stack exactly at time step k·g(n), which is its last nondeterministic one. Let m be the number of time steps at which symbols are pushed. Then we have m·(k − 1) = k·g(n) − m and, hence, m = g(n).

To see the other inclusion L_t(kgG-IA) ⊆ L_t(gG-IA) we use again a pushdown storage. The communication cell of a gG-IA M' simulating a kgG-IA M without any loss of time pushes k − 1 nondeterministically determined functions d : S³ × (A ∪ {#}) → S satisfying d(s_1, s_2, s_3, a) ∈ δ_nd(s_1, s_2, s_3, a) (here δ_nd denotes the nondeterministic transition function for the communication cell of M) during each of its nondeterministic transitions. Additionally, it simulates a nondeterministic transition of M. During the first deterministic transitions such a function is popped and applied to the states of the communication cell and its neighbors and the current input symbol, which yields the next state of the communication cell. Hence a nondeterministic transition of M is simulated deterministically. Altogether M' performs g(n) + (k − 1)·g(n) = k·g(n) nondeterministic transitions and accepts exactly the same language as M. □

A constant number of nondeterministic transitions does not increase the power of IAs. The principle of the proof is to simulate all finitely many choices on different tracks.

Theorem 4 Let t : N_0 → N, t(n) ≥ n + 1, be a mapping. If k ∈ N is a constant then

L_t(kG-IA) = L_t(IA)

The next corollary extends the previous results.

Corollary 5 Let g : N_0 → N_0 be a mapping and q ∈ Q, 0 < q ≤ 1, be a rational number such that g(n) = ⌊qn⌋ for almost all n ∈ N, then

L_rt(gG-IA) = L_rt(idG-IA)

4 Nondeterministic Hierarchy

Definition 6 Let L ⊆ A* be a language over an alphabet A and l ∈ N_0 be a constant.

1. Two words w and w' are l-equivalent with respect to L if

w w_1 ∈ L ⇔ w' w_1 ∈ L for all w_1 ∈ A^l.


2. N(n, l, L) denotes the number of l-equivalence classes of words of length n with respect to L (i.e. |w w_1| = n).
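For small parameters N(n, l, L) can be computed by brute force, which may help to fix the definition. The following Python sketch is ours; `member` is an invented stand-in that decides membership in L:

```python
from itertools import product

def num_classes(n, l, member, alphabet):
    """N(n, l, L): number of l-equivalence classes among the words w
    with |w w1| = n, i.e. |w| = n - l. Two words are identified when
    they behave identically under all extensions w1 of length l."""
    exts = [''.join(e) for e in product(alphabet, repeat=l)]
    signatures = {tuple(member(''.join(w) + e) for e in exts)
                  for w in product(alphabet, repeat=n - l)}
    return len(signatures)
```

For example, for the language of words ending in "ab" and l = 1, only the last letter of w matters, so there are two classes; for l = 2 membership depends only on the extension, so there is one class.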

Lemma 7 Let g : N_0 → N_0, g(n) ≤ n + 1, be a mapping. If L ∈ L_rt(gG-IA) then there exist constants p, q ∈ N such that N(n, l, L) ≤ p^l · q^{g(n)}.

Proof. Let M = (S, δ, δ_nd, δ_d, s_0, #, A, F) be a real-time gG-IA which accepts L. We define

q = max{|δ_nd(s_1, s_2, s_3, a)| : s_1, s_2, s_3 ∈ S ∧ a ∈ A}.

In order to determine an upper bound on the number of l-equivalence classes we consider the possible configurations of M after reading all but l input symbols. The remaining computation depends on the last l input symbols and the states of the cells −l − 1, ..., 0, ..., l + 1. For the 2l + 3 states there are |S|^{2l+3} different possibilities. Let p = |S|^5; then, due to |S|^{2l+3} = |S|^{2l}·|S|³ = (|S|²)^l·|S|³ ≤ (|S|²)^l·(|S|³)^l = (|S|²·|S|³)^l ≤ p^l, we have at most p^l different possibilities for each of at most q^{g(n)} different computation paths. Since the number of equivalence classes is not affected by the last l input symbols, in total there are at most p^l · q^{g(n)} classes. □

The following result does not follow for structural reasons, since there might be accepting computation paths of the fG-IA that cannot appear for the gG-IA. Therefore, the fG-IA must be able to verify whether or not its communication cell has performed g(n) nondeterministic transitions.

Theorem 8 Let f : N_0 → N_0, f(n) ≤ n/2, and g : N_0 → N_0, g(n) ≤ f(n), be two increasing mappings such that ∀ m, n ∈ N : f(m) = f(n) ⇒ g(m) = g(n). If L_g = {a^{g(n)} b^{f(n)−g(n)} | n ∈ N} belongs to the family L_lt(IA) then

L_rt(gG-IA) ⊆ L_rt(fG-IA)

Proof. Let M be a real-time gG-IA that accepts the language L. A real-time fG-IA M' which simulates M works as follows.

Since f 2 g M' can guess the time step g(n) and therefore simulate M directly. Additionally, M' has to verify that its guess was correct. Otherwise the computation must not be accepting.

It is known that deterministic linear-time IAs can be sped-up to (2 . id ) - time [14]. Thus, L, belongs to 22id(IA). Now M' simulates such an acceptor M" on an additional track. During the first g(n) time steps M' simulates M" under the assumption that M" fetches input symbols a. From the guessed

Page 105: Words, Languages & Combinatorics III

80

time step g ( n ) up to the last nondeterministic step f(n) M' simulates M" under the assumption that M" fetches input symbols b, respectively, and during the last n - f ( n ) time steps M' simulates M" without input.

altogether M' simulates at least 2 . i d time steps of M " . If M' guessed g ( n ) correctly it simulates M" for the input ~ g ( ~ ) b f ( ~ ) - g ( ~ ) and, hence, an accepting computation. On the other hand, if M' simulates an accepting computation then it guessed a time step t such that the input ~ ~ b f ( " ) - ~ belongs to L,. It follows t E { g ( m ) I f ( m ) = f ( n ) } and due to the assumption V m, n, E N : f ( m ) = f ( n ) ==+ g ( m ) = g ( n ) it holds t = g ( n ) . Therefore, M' can verify whether its guess was correct and,

The following situation may clarify the necessity of the condition V m, n, E N : f ( m ) = f ( n ) ==+ g(m) = g(n). Let m < n and f ( m ) = f ( n ) and g(m) < g ( n ) . Since c ~ g ( ~ ) b f ( " ) - g ( ~ ) belongs to L, the word ~ g ( ~ ) b f ( ~ ) - g ( ~ )

does. Consequently, for an input of length m the word c ~ g ( ~ ) b f ( ~ ) - g ( ~ ) would lead to an accepting computation but since g(m) < g ( n ) the time step g might be guessed wrong.

Now we are going to extend the previous result to a hierarchy of properly included language families.

Due to the condition f ( n ) 5

thus, accepts L in real-time.
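The verification mechanism can be replayed outside the array model: a guessed time step t is accepted exactly when a^t b^{f(n)−t} lies in L_g, and the side condition forces t = g(n). Below is a small sketch with hypothetical concrete choices f(n) = ⌊log2 n⌋ and g = ⌊log2⌋ ∘ f; these particular functions are not from the paper, but they satisfy f(m) = f(n) ⟹ g(m) = g(n).

```python
import math

f = lambda n: int(math.log2(n))             # f(n) = ⌊log2 n⌋
g = lambda n: int(math.log2(max(f(n), 1)))  # g = ⌊log2⌋ ∘ f, so f(m) = f(n) ⟹ g(m) = g(n)

def in_L_g(word):
    # Membership in L_g = { a^g(m) b^(f(m)-g(m)) | m ∈ ℕ }.
    x = len(word) - len(word.lstrip('a'))   # number of leading a's
    y = len(word) - x
    if word != 'a' * x + 'b' * y:
        return False
    # f(m) = x + y is only possible for m < 2^(x + y + 1).
    return any(g(m) == x and f(m) - g(m) == y for m in range(2, 2 ** (x + y + 2)))

n = 100
valid_guesses = [t for t in range(f(n) + 1) if in_L_g('a' * t + 'b' * (f(n) - t))]
print(valid_guesses == [g(n)])  # the only guess that verifies is t = g(n)
```

Every wrong guess t produces a word a^t b^{f(n)−t} outside L_g, so the simulated acceptor rejects exactly the mis-guessed computations, as in the proof.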

Theorem 9 Let f : ℕ0 → ℕ0 and g : ℕ0 → ℕ0 be two mappings which meet the conditions of Theorem 8. If additionally f ∈ o(log) and g ∈ o(f) then

ℒ_rt(gG-IA) ⊊ ℒ_rt(fG-IA).

Proof. We define a mapping h : ℕ0 → ℕ by h(n) = 2^{f(n)}. h is increasing since f is. Moreover, since f ∈ o(log), for all k ∈ ℚ, k ≥ 0, it holds lim_{n→∞} h(n)/n^k = lim_{n→∞} 2^{f(n)}/2^{k·log(n)} = 0 and therefore h ∈ o(n^k). Especially for k = ½ it follows that the mapping m(n) = max{ n′ ∈ ℕ0 | (h(n)+1)·(n′+1) ≤ n } is unbounded, and for large n we obtain m(n) > h(n). Now we define a language L that belongs to ℒ_rt(fG-IA) but does not belong to ℒ_rt(gG-IA):

L = { $^T w_1 $ w_2 $ ⋯ $ w_j $ y $ | ∃ n ∈ ℕ : j = h(n) ∧ w_i ∈ {0,1}^{m(n)}, 1 ≤ i ≤ j, ∧ y ∈ {0,1}^{m(n)} ∧ T = n − (h(n)+1)·(m(n)+1) ∧ ∃ 1 ≤ i′ ≤ j : w_{i′} = y^R }.

It follows that L is not empty (cf. Example 10). Assume now L ∈ ℒ_rt(gG-IA). Then by Lemma 7 there exist constants p, q ∈ ℕ such that N(n, m(n)+1, L) ≤ p^{m(n)+1}·q^{g(n)}. Since g ∈ o(f), for all k ∈ ℚ, k ≥ 0, it holds lim_{n→∞} 2^{k·g(n)}/2^{f(n)} = 0. Thus, we obtain 2^{k·g} ∈ o(2^f) = o(h). Therefore, for large n the number of equivalence classes is bounded as follows:

N(n, m(n)+1, L) ≤ p^{m(n)+1}·q^{g(n)} ≤ p^{2·m(n)}·q^{g(n)} = 2^{log(p)·2·m(n)}·2^{log(q)·g(n)}.

Let k be log(q); then 2^{log(q)·g(n)} = 2^{k·g(n)} ∈ o(h). Now we can find a constant n0 such that for all n ≥ n0: 2·log(p)·2^{k·g(n)} < ½·h(n). It follows

2^{log(p)·2·m(n)}·2^{log(q)·g(n)} < 2^{m(n)·h(n)·½}.

On the other hand, for every n ∈ ℕ and for every subset U = { w_1, …, w_{h(n)} } of {0,1}^{m(n)} let a word u be defined according to u = $^T w_1 $ ⋯ $ w_{h(n)} $, where T = n − (h(n)+1)·(m(n)+1). Then for all y ∈ {0,1}^{m(n)}:

y ∈ U ⟺ u·y^R·$ ∈ L.

Since there exist 2^{m(n)} different words w_i, there are C(2^{m(n)}, h(n)) different subsets U. For every pair U, V of different subsets one can find a w_i belonging to U∖V or to V∖U. It follows that u_U·w_i^R·$ ∈ L while u_V·w_i^R·$ ∉ L (or vice versa) and, hence,

N(n, m(n)+1, L) ≥ C(2^{m(n)}, h(n)) = [2^{m(n)}·(2^{m(n)}−1)·⋯·(2^{m(n)}−h(n)+1)] / h(n)! ≥ ((2^{m(n)} − h(n)) / h(n))^{h(n)}.

From m(n) > h(n) for large n it follows 2^{m(n)} − h(n) ≥ ½·2^{m(n)}. Thus

N(n, m(n)+1, L) ≥ (2^{m(n)−1}/h(n))^{h(n)} ≥ 2^{m(n)·h(n)·½}

for large n. From the contradiction we obtain L ∉ ℒ_rt(gG-IA).

It remains to show L ∈ ℒ_rt(fG-IA). An fG-IA M which accepts L has to check whether j = h(n), whether all the w_i are of the same length, whether T ≤ h(n) (from which it then follows that |w_i| = m(n)), and whether there exists an i′ such that w_{i′} = y^R. Accordingly, M performs four tasks in parallel.


For the first task M simulates a stack and pushes a symbol 1 at every nondeterministic transformation. After the last nondeterministic transformation the pushed string is handled as a binary counter which is decremented every time a new w_i appears in the input; the decrementation starts with w_2. The number of w_i's is accepted if the counter is 0 after reading the input, because the pushed string 1^{f(n)} is the binary number 2^{f(n)} − 1 = h(n) − 1.

For the second task M uses two more stacks. The subword w_1 is pushed onto one of them. When M fetches w_2 it pushes w_2 onto the second stack and pops w_1 from the first stack, whereby their lengths are compared symbol by symbol. This task is repeated up to w_j.

The third task uses another stack onto which the first T symbols $ of the input are pushed. Subsequently, for each subword w_i one of them is popped.

The last task is to find an i′ such that w_{i′} = y^R. Here the nondeterminism is used. During the first f(n) nondeterministic steps a binary string is guessed bit by bit and pushed onto a stack. From time f(n) on it is handled as a counter which is decremented for every subword w_i. If it is 0, the next word is pushed onto another stack. It will be popped and compared symbol by symbol when the word y appears in the input. Thus, the i′ is guessed during the nondeterministic transformations. □

At first glance the witness L for the proper inclusion seems to be rather complicated. But here is a natural example for a hierarchy:

Example 10 Let i ≥ 1 be a constant and f(n) = log^[i](n) and g(n) = log^[i+1](n). Then by Theorem 9 we have ℒ_rt(gG-IA) ⊊ ℒ_rt(fG-IA). Since ℒ_lt(IA) is identical to the family of linear-time cellular automata languages [22] and { a^n b^{2^n − n} | n ∈ ℕ } is acceptable by such devices, { a^{g(n)} b^{f(n)−g(n)} | n ∈ ℕ } ∈ ℒ_lt(IA) holds. Moreover, from g = log ∘ f follows ∀ m, n ∈ ℕ : f(m) = f(n) ⟹ g(m) = g(n). Thus, the conditions of Theorem 8 are met. Trivially, g is of order o(f). E.g. for i = 2 we obtain m(4) = 0, m(8) = 1, m(16) = 2, m(32) = 4, and $01$11$10$00$11$ ∈ L.
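The numbers in Example 10 can be checked mechanically. For i = 2 we have f(n) = log log n, hence h(n) = 2^{f(n)} = log n; the helper names below are ad hoc:

```python
import math

def h(n):
    # h(n) = 2^f(n) with f(n) = log2 log2 n, i.e. h(n) = log2 n for i = 2.
    return round(2 ** math.log2(math.log2(n)))

def m(n):
    # m(n) = max { n' ≥ 0 | (h(n) + 1) · (n' + 1) ≤ n }
    return n // (h(n) + 1) - 1

print([m(n) for n in (4, 8, 16, 32)])  # → [0, 1, 2, 4]
```

For n = 16 this gives h(16) = 4 blocks of length m(16) = 2 and padding T = 16 − 5·3 = 1, matching the sample word $01$11$10$00$11$.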

5 Closure Properties

Besides being interesting in their own right, closure properties are a powerful tool for relating families of languages. Our first results in this section deal with Boolean operations.

Lemma 11 Let g : ℕ0 → ℕ0 and t : ℕ0 → ℕ, t(n) ≥ n + 1, be two mappings. Then the family ℒ_t(gG-IA) is closed under union and intersection and trivially contains ℒ_t(IA).


Proof. Using the same two-channel technique as in [9] and [22] the assertion can easily be seen. Each cell consists of two registers in which acceptors for both languages are simulated in parallel. □

Now we turn to more language-specific closure properties. For some functions g the families ℒ_rt(gG-IA) are closed under concatenation and for some others they are not. At first we consider the closure under marked concatenation.
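The two-channel technique is, in essence, the classical product construction: each cell carries one register per simulated acceptor and updates both in parallel. A finite-state sketch of this idea (the transition functions and names here are illustrative, not taken from [9] or [22]):

```python
def two_channel(delta1, delta2):
    # Product transition: a pair state holds one register per channel.
    return lambda pair, sym: (delta1(pair[0], sym), delta2(pair[1], sym))

def run(delta, start, accepting, word):
    state = start
    for sym in word:
        state = delta(state, sym)
    return accepting(state)

# Channel 1 tracks the parity of a's, channel 2 the length modulo 3.
d1 = lambda s, c: s ^ (c == 'a')
d2 = lambda s, c: (s + 1) % 3
d = two_channel(d1, d2)

intersection = lambda p: p[0] == 0 and p[1] == 0   # both channels accept
union = lambda p: p[0] == 0 or p[1] == 0           # at least one accepts

print(run(d, (0, 0), intersection, 'aabbbb'))  # even a's and |w| ≡ 0 (mod 3) → True
```

Only the acceptance condition on the pair of registers distinguishes union from intersection, which is why both closures come from the same construction.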

Lemma 12 Let g : ℕ0 → ℕ0 be an increasing mapping such that the language { a^{g(m)} b^{m−g(m)} | m ∈ ℕ } belongs to ℒ_rt(IA). Then the family ℒ_rt(gG-IA) is closed under marked concatenation.

Proof. Let L1 resp. L2 be formal languages over the alphabets A1 resp. A2 which are acceptable in real-time by the gG-IAs M1 resp. M2. Let L denote the marked concatenation of L1 and L2, i.e.,

L = { w1·c·w2 | w1 ∈ L1 and w2 ∈ L2 }

where c ∉ A1 ∪ A2 is a marking symbol. A gG-IA M that accepts L in real-time works as follows. A1*·c·A2* is a regular language and, therefore, belongs trivially to ℒ_rt(gG-IA). Since ℒ_rt(gG-IA) is closed under intersection (cf. Lemma 11) it is sufficient to consider inputs of the form A1*·c·A2* only. Let w = w1·c·w2 with w1 ∈ A1*, w2 ∈ A2*, and n1 = |w1|, n2 = |w2|.

Now the idea is as follows: on input w the array M simulates the behavior of M1 (on input w1) until reading the marking symbol c, and subsequently the behavior of M2 (on input w2). M accepts w iff both simulations are accepting.

The simulation of M1 can be performed directly since g is monotonically increasing and therefore g(n) ≥ g(n1). But the time step g(n1) has to be guessed and verified. In order to perform this task an acceptor for the language L′ = { a^{g(m)} b^{m−g(m)} | m ∈ ℕ } is simulated on an additional track in parallel. Thereby an input symbol a is assumed for each nondeterministic step (up to the guessed time g(n1)) and an input symbol b for each deterministic step (up to the end of the simulation at time n1).

So the numbers x resp. y of simulated nondeterministic resp. deterministic transitions correspond to a word a^x·b^y belonging to L′ iff there exists an m ∈ ℕ such that x = g(m) and y = m − g(m), i.e., iff n1 = x + y = g(m) + m − g(m) = m.

The simulation of M2 is performed similarly. However, a problem would arise with the nondeterministic transitions if g(n) < n1 + 1 + g(n2). Therefore, during its nondeterministic transitions M uses a queue into which it pipes nondeterministically chosen local transition functions corresponding to a possible nondeterministic transition of M2 (cf. the proof of Theorem 3). During the simulation of the nondeterministic transitions of M2 these functions are successively extracted from the queue and applied to the communication cell. □

The assumptions of the lemma can essentially be weakened. Let h be a homomorphism such that h(x) = a for x ≠ b and h(b) = b. Then instead of requiring L = { a^{g(m)} b^{m−g(m)} | m ∈ ℕ } to be acceptable in real-time by some iterative array it is sufficient to require that some language L′ with h(L′) = L belongs to ℒ_rt(IA).

By 𝒢 we denote the set of functions g : ℕ → ℕ0, g(n) ≤ n, such that there exists a language L′ ∈ ℒ_rt(IA) whose image under h is { a^{g(m)} b^{m−g(m)} | m ∈ ℕ }.

So in fact any family ℒ_rt(gG-IA) where g ∈ 𝒢 is closed under marked concatenation.

The usage of a marking symbol can be omitted if the limiting function g allows a gG-IA to determine a possible concatenation point on its own (for instance nondeterministically by using a binary counter). Hence, we obtain the following corollary.

Corollary 13 Let g : ℕ0 → ℕ0 be a mapping with g ∈ Ω(log). If ℒ_rt(gG-IA) is closed under marked concatenation then it is closed under concatenation.

On the other hand, there exist functions g for which ℒ_rt(gG-IA) is not closed under concatenation. The proof essentially follows an idea presented in [7] to show that the family ℒ_rt(IA) is not closed under concatenation.

Theorem 14 Let g : ℕ0 → ℕ0, g ∈ o(log log), be a mapping. Then ℒ_rt(gG-IA) is not closed under concatenation.

Proof. Let A be the alphabet consisting of the four symbols 0, 1, a, and b. Further let L1 = A* and denote by L2 the language of palindromes over A, i.e. the set of all words w over A which are identical to their reversals w^R. As has been shown in [7], L1 as well as L2 belong to ℒ_rt(IA) and thus to ℒ_rt(gG-IA).

Consider now the concatenation L = L1·L2 and assume contrarily that L belongs to ℒ_rt(gG-IA), too. Then let W_n = { 0w1 | w ∈ {a,b}^n } for n ∈ ℕ and define for each subset U = { w_1, …, w_k } of W_n the word u as

u = λ if U = ∅ and u = u_k otherwise,

where the u_1, …, u_k are recursively defined by

u_0 = λ, u_{i+1} = u_i·w_{i+1}^R·w_{i+1}, 0 ≤ i ≤ k − 1.

One easily sees that for all w ∈ W_n it holds w ∈ U iff u·w ∈ L. Therefore (choosing k = n) there are especially at least C(2^n, n) different n-equivalence classes with respect to L in the set of words of length n·2^n over A. Hence, using the assumption on g we can work out a contradiction to Lemma 7 for a sufficiently large n. So L is not acceptable in real-time by a gG-IA, i.e. ℒ_rt(gG-IA) is not closed under concatenation. □

Note that one can additionally show that for g ∈ o(log log) the corresponding family ℒ_rt(gG-IA) is not closed under marked iteration, although it might be closed under marked concatenation.

Theorem 15 Let g : ℕ0 → ℕ0, g ∈ o(log), be a mapping. Then the family ℒ_rt(gG-IA) is not closed under reversal.

Proof. Consider the language L consisting of all marked concatenations of binary sequences of equal length where the first sequence occurs at least twice, i.e.

L = { w_1 $ ⋯ w_k $ | k ≥ 2 ∧ ∃ m ∈ ℕ : w_i ∈ {0,1}^m, 1 ≤ i ≤ k, ∧ ∃ 2 ≤ j ≤ k : w_1 = w_j }.

We are going to show that L belongs to ℒ_rt(IA) ⊆ ℒ_rt(gG-IA), but L^R ∉ ℒ_rt(gG-IA).

An iterative array M that accepts L in real-time works as follows. The communication cell is equipped with a queue through which symbols can be piped in first-in-first-out manner. At the beginning of the computation M stores its input symbols to the queue until the first symbol $ appears.

Afterwards at every time step one symbol is extracted from the queue and compared to the current input symbol. At the same time step it is stored in the queue again. Thus, the symbols of w1 circulate through the queue and w1 is compared with all the wi, 2 5 i 5 k , serially.
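Abstracting from the cellular details, the circulation of w1 through the queue can be sketched as follows (an illustrative model of the comparison scheme only, not the full real-time array):

```python
from collections import deque

def first_block_repeats(word):
    # Accepts w1$w2$...wk$ iff k ≥ 2, all blocks have equal length,
    # and w1 = wj for some j ≥ 2, by circulating w1 through a queue.
    if not word.endswith('$'):
        return False
    blocks = word[:-1].split('$')
    if len(blocks) < 2 or len(set(map(len, blocks))) != 1:
        return False
    queue = deque(blocks[0])          # store w1 symbol by symbol
    found = False
    for block in blocks[1:]:
        match = True
        for sym in block:
            front = queue.popleft()   # extract a symbol of w1,
            match &= (front == sym)   # compare it with the current input symbol,
            queue.append(front)       # and store it in the queue again
        found |= match
    return found

print(first_block_repeats('101$001$101$'))  # w3 = w1 → True
```

Because all blocks have the same length, w1 is back at the front of the queue exactly when the next block begins, which is what makes the serial comparison work in real time.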

It remains to show that L^R does not belong to ℒ_rt(gG-IA). Let us assume that L^R is acceptable by some gG-IA in real-time, and consider the equivalence classes N((m+1)², m+1, L^R). For every pair of different subsets {x_1, …, x_m} and {y_1, …, y_m} of the set {0,1}^m there are words $x_1⋯$x_m and $y_1⋯$y_m which belong to different such (m+1)-equivalence classes: w.l.o.g. let x_1 ∉ {y_1, …, y_m}; then $x_1⋯$x_m$x_1 belongs to L^R whereas $y_1⋯$y_m$x_1 does not. Hence, there are at least C(2^m, m) such (m+1)-equivalence classes. Since g ∈ o(log) we obtain a contradiction to Lemma 7 for a sufficiently large m, which concludes the proof. □

The last two results deal with the closure under homomorphisms.

Theorem 16 Let g : ℕ0 → ℕ0 be a mapping. If ℒ_rt(gG-IA) ⊆ ℒ_rt(idG-IA) then ℒ_rt(gG-IA) is closed under ε-free homomorphism iff ℒ_rt(gG-IA) = ℒ_rt(idG-IA).

Proof. One can show that the family ℒ_rt(idG-IA) coincides with the closure of ℒ_rt(IA) under ε-free homomorphisms and forms an AFL which is closed under intersection and reversal. Consequently ℒ_rt(idG-IA) is closed under ε-free homomorphisms, too, implying the closure of ℒ_rt(gG-IA) under ε-free homomorphism if ℒ_rt(gG-IA) = ℒ_rt(idG-IA) holds.

On the other hand, since ℒ_rt(IA) ⊆ ℒ_rt(gG-IA) it follows that the closure of ℒ_rt(IA) under ε-free homomorphisms (which is ℒ_rt(idG-IA)) is contained in the closure of ℒ_rt(gG-IA). If the latter closure is ℒ_rt(gG-IA) itself then it follows ℒ_rt(idG-IA) ⊆ ℒ_rt(gG-IA) ⊆ ℒ_rt(idG-IA), i.e. ℒ_rt(gG-IA) = ℒ_rt(idG-IA). □

Corollary 17 Let g : ℕ0 → ℕ0 be a mapping. If ℒ_rt(gG-IA) ⊊ ℒ_rt(idG-IA) then ℒ_rt(gG-IA) is not closed under ε-free homomorphism, homomorphism, ε-free substitution and substitution.

By Theorem 9 such functions exist.

References

1. Beyer, W. T. Recognition of topological invariants by iterative arrays. Technical Report TR-66, MIT, Cambridge, Proj. MAC, 1969.

2. Buchholz, Th. and Kutrib, M. Some relations between massively parallel arrays. Parallel Comput. 23 (1997), 1643-1662.

3. Buchholz, Th., Klein, A., and Kutrib, M. One guess one-way cellular arrays. In: Proc. Int. Symp. on Mathematical Foundations of Computer Science (MFCS). LNCS 1450, Springer, 1998, 807-815.

4. Buchholz, Th., Klein, A., and Kutrib, M. Iterative arrays with limited nondeterministic communication cell. IFIG Research Report 9901, Institute of Informatics, University of Giessen, Giessen, 1999.

5. Buss, J. and Goldsmith, J. Nondeterminism within P. SIAM J. Comput. 22 (1993), 560-572.

6. Chang, J. H., Ibarra, O. H., and Palis, M. A. Parallel parsing on a one-way array of finite-state machines. IEEE Trans. Comput. C-36 (1987), 64-75.

7. Cole, S. N. Real-time computation by n-dimensional iterative arrays of finite-state machines. IEEE Trans. Comput. C-18 (1969), 349-365.

8. Culik II, K. and Yu, S. Iterative tree automata. Theoret. Comput. Sci. 32 (1984), 227-247.

9. Dyer, C. R. One-way bounded cellular automata. Inform. Control 44 (1980), 261-281.

10. Fischer, P. C. Generation of primes by a one-dimensional real-time iterative array. J. Assoc. Comput. Mach. 12 (1965), 388-394.

11. Fischer, P. C. and Kintala, C. M. R. Real-time computations with restricted nondeterminism. Math. Systems Theory 12 (1979), 219-231.

12. Hromkovic, J. et al. Measures of nondeterminism in finite automata. In: Proc. Int. Conf. on Automata, Languages, and Programming (ICALP). LNCS 1853, Springer, 2000, 199-210.

13. Ibarra, O. H. and Jiang, T. On one-way cellular arrays. SIAM J. Comput. 16 (1987), 1135-1154.

14. Ibarra, O. H. and Palis, M. A. Some results concerning linear iterative (systolic) arrays. J. Parallel and Distributed Comput. 2 (1985), 182-218.

15. Ibarra, O. H. and Palis, M. A. Two-dimensional iterative arrays: Characterizations and applications. Theoret. Comput. Sci. 57 (1988), 47-86.

16. Kintala, C. M. and Fischer, P. C. Refining nondeterminism in relativized complexity classes. SIAM J. Comput. 13 (1984), 329-337.

17. Kintala, C. M. and Wotschke, D. Amounts of nondeterminism in finite automata. Acta Inf. 13 (1980), 199-204.

18. Salomaa, K. and Yu, S. Limited nondeterminism for pushdown automata. Bulletin of the EATCS 50 (1993), 186-193.

19. Salomaa, K. and Yu, S. Measures of nondeterminism for pushdown automata. J. Comput. System Sci. 49 (1994), 362-374.

20. Seiferas, J. I. Iterative arrays with direct central control. Acta Inf. 8 (1977), 177-192.

21. Seiferas, J. I. Linear-time computation by nondeterministic multidimensional iterative arrays. SIAM J. Comput. 6 (1977), 487-504.

22. Smith III, A. R. Real-time language recognition by one-dimensional cellular automata. J. Comput. System Sci. 6 (1972), 233-253.

23. Terrier, V. On real time one-way cellular array. Theoret. Comput. Sci. 141 (1995), 331-335.


R-TRIVIAL LANGUAGES OF WORDS ON COUNTABLE ORDINALS

OLIVIER CARTON
Institut Gaspard Monge, Université de Marne-la-Vallée,
F-77454 Marne-la-Vallée Cedex 2, France
Email: Olivier.Carton@univ-mlv.fr
Url: http://www-igm.univ-mlv.fr/~carton/

Following the recently proved variety theorem for transfinite words we give, in this paper, three instances of correspondence between varieties of finite ω1-semigroups and varieties of ω1-languages. We first characterize the class of languages which are recognized by automata in which overlapping limit transitions end in the same state. It turns out that the corresponding variety of ω1-semigroups is defined by an equation which has a topological interpretation in the case of infinite words: it characterizes languages of infinite words in the class Δ2 = Π2 ∩ Σ2 of the Borel hierarchy. This result is used to prove that an ω1-language is recognized by an extensive automaton if and only if its syntactic ω1-semigroup is R-trivial and satisfies the Δ2-equation. This result extends Eilenberg's result concerning R-trivial semigroups and extensive automata. We finally characterize ω1-languages recognized by extensive automata whose limit transitions are trivial.

1 Introduction

Finite semigroups are the algebraic counterpart of automata. The first deep result using semigroup recognition is due to Schützenberger [14]. He proved that the syntactic semigroup of a recognizable language L is finite and aperiodic (i.e. group-free) if and only if L is star-free, i.e., it belongs to the smallest class of languages containing the letters and closed under product and finite boolean operations. The idea of using algebraic properties of syntactic semigroups to classify recognizable languages was developed by Eilenberg, who showed that there exists a one-to-one correspondence between varieties of finite semigroups (classes of semigroups closed under taking sub-semigroups, quotients and finite direct products) and certain classes of languages, the varieties of languages. This theorem is known as the variety theorem. Since that time the theory of varieties of recognizable languages has been widely developed. For instance, it has been shown by Eilenberg that a language is recognized by an extensive automaton if and only if its syntactic semigroup is R-trivial (see Chap. 10 of Eilenberg's book). Furthermore, such languages can be described by very special rational expressions.

Automata on infinite words were introduced by Büchi [6]. A few years later, Büchi extended this notion to ordinals. The challenge was then to extend the algebraic approach to infinite words in a first step and to ordinals in a second step. For infinite words, there is now a rather satisfying theory culminating in the works of Wilke [15] and of Perrin and Pin. The counterpart of this theory for ordinal words was a bit slower to develop. Wojciechowski [16] defined rational expressions and proved that they are equivalent to automata. The algebraic theory was first settled for ordinals less than ω^ω and later extended to countable ordinals in [5]. The key algebraic notion is that of an ω1-semigroup, which extends the notion of an ω-semigroup introduced in [15].

Roughly speaking, an ω1-semigroup is a structure in which the product of any sequence of a countable number of elements is possible. The variety theorem is also extended to words on countable ordinals in [5].

In this paper, we give three instances of correspondence between varieties of finite ω1-semigroups and varieties of ω1-languages. We first characterize the class of languages which are recognized by automata in which overlapping limit transitions end in the same state. It turns out that the corresponding variety of ω1-semigroups is defined by an equation which has a topological interpretation in the case of infinite words. This equation characterizes languages of infinite words in the class Δ2 = Π2 ∩ Σ2 of the Borel hierarchy [15]. We use this result to characterize ω1-languages recognized by extensive automata: an ω1-language is recognized by an extensive automaton if and only if its syntactic ω1-semigroup is R-trivial and satisfies the Δ2-equation. This result extends Eilenberg's result concerning R-trivial semigroups and extensive automata. We finally characterize ω1-languages recognized by extensive automata whose limit transitions are trivial.

The paper is organized as follows. Basic definitions of words, automata, rational expressions and ω1-semigroups are recalled in Section 2. The three instances of correspondence are given in Section 3.

2 Notation and Basic Definitions

This section is devoted to basic notation and definitions on ordinals, words, rational expressions, automata and wl-semigroups.

2.1 Ordinals

We refer the reader to [13] for a complete introduction to the theory of ordinals. An ordinal is an isomorphism class of well-founded linear orderings. In this paper, ordinals are usually denoted by lower-case Greek letters like α, β, γ. An ordinal α is said to be a successor if α = β + 1 for some ordinal β. An ordinal is either 0, a successor ordinal or a limit ordinal. As usual, we identify the linear order on ordinals with the membership relation: an ordinal α is then identified with the set of ordinals smaller than α. In this paper, we mainly use ordinals to index sequences. Let α be an ordinal. A sequence x of length α (or an α-sequence) of elements from a set E is a function which maps any ordinal γ smaller than α to an element of E. A sequence x is usually denoted by x = (x_γ)_{γ<α}. In this paper, we only use countable ordinals, except for ω1, which denotes the first uncountable ordinal.

2.2 Words

Let A be a finite set called the alphabet, whose elements are called letters. For an ordinal α, an α-sequence of letters is also called a word of length α or an α-word over A. The sequence of length 0, which has no element, is called the empty word and is denoted by λ. The length of a word x is denoted by |x|. For an ordinal α, A^α denotes the set of all words of length α. The set of all words of countable length over A is denoted by A^♮. A subset of A^♮ is called a language or an ω1-language.

Let (x_γ)_{γ<α} be an α-sequence of words. The word obtained by concatenating the words of the sequence (x_γ)_{γ<α} is denoted by ∏_{γ<α} x_γ. Its length is the sum Σ_{γ<α} |x_γ|. Conversely, a factorization of a word x is a sequence (x_γ)_{γ<α} of words such that x = ∏_{γ<α} x_γ.

2.3 Rational expressions

Rational expressions for transfinite words have been defined by Wojciechowski [16]. They extend the usual rational expressions for finite words with two further iterations.

Rational expressions for transfinite words are well-formed expressions built from the constants (a)_{a∈A}, which denote the letters of the alphabet, and the constant λ, which denotes the empty word. The operators used in the expressions are the binary operators + and ·, which respectively denote union and concatenation, and the unary operators *, ω and ♮, which respectively denote the finite, the omega and the ordinal iteration. The omega and ordinal iterations are defined as follows:

X^ω = { x_0 x_1 x_2 ⋯ | x_i ∈ X }  and  X^♮ = { ∏_{γ<α} x_γ | α < ω1 and x_γ ∈ X }.

As usual, the braces and the dot are usually omitted in rational expressions. Thus, the expression ({a} + {a}·{b}^ω)^♮ is written (a + ab^ω)^♮.


2.4 Automata

Büchi automata on transfinite words are a generalization of the usual (Kleene) automata on finite words, with a second transition function for limit ordinals. States reached at limit points depend only on the states reached before.

Definition 1 An automaton A is a 5-tuple (Q, A, E, I, F) where Q is the finite set of states, A the finite alphabet, E ⊆ (Q × A × Q) ∪ (𝒫(Q) × Q) the set of transitions, I ⊆ Q the set of initial states and F ⊆ Q the set of final states.

Transitions are either of the form (q, a, q′) or of the form (P, q) where P is a subset of Q. A transition of the former kind is called a successor transition and is denoted by p →^a q. A transition of the latter kind is called a limit transition and is denoted by P → q.

We now explain how these automata are used to define languages. Before describing a path in an automaton, we define the cofinal set of a sequence at some limit point.

Definition 2 Let c be an α-sequence of states and let β ≤ α be a limit ordinal. We denote by cof_β(c) the set of states q such that for any δ < β there exists δ < γ < β such that c_γ = q.

This definition allows us to define the notion of a path in an automaton.

Definition 3 Let A = (Q, A, E, I, F) be an automaton. A path c labeled by a word x = (a_γ)_{γ<α} of length α from p to q in A is an (α+1)-sequence of states such that:

1. q_0 = p and q_α = q;

2. for any β < α, q_β →^{a_β} q_{β+1} is a successor transition of A;

3. for any limit ordinal β ≤ α, cof_β(c) → q_β is a limit transition of A.

The word x = (a_γ)_{γ<α} is called a label of the path c. We denote the existence of such a path c by

c : p →^x q.

The path is successful if and only if p is initial (p ∈ I) and q is final (q ∈ F). A word is recognized by the automaton if and only if it is the label of a successful path. We illustrate the definition of a path with the following example.


Figure 1. An automaton recognizing X = (aA*)^♮(aA* + λ)

Example 4 Consider the automaton pictured in Figure 1. The successor transitions are pictured like a labeled graph. This automaton recognizes the ω1-language of words having an a at the first position and at all limit positions. This ω1-language is denoted by the rational expression (aA*)^♮(aA* + λ).

The ω1-language recognized by the automaton of Figure 1 can thus be denoted by a rational expression. Indeed, automata and rational expressions have been proved to be equivalent [16]: any ω1-language recognized by an automaton can be denoted by a rational expression and, conversely, any ω1-language denoted by a rational expression can be recognized by an automaton. Such an ω1-language is said to be rational.

An automaton is deterministic if and only if it satisfies the following three conditions. It has only one initial state i. For any state p ∈ Q and any letter a ∈ A there is at most one state q such that p →^a q is a successor transition of A. For any subset P ⊆ Q there is at most one state q such that P → q is a limit transition of A. The last two conditions imply that for any state q, any word labels at most one path starting at q. The automaton of Figure 1 is deterministic.

2.5 ω1-semigroups

The notion of an ω1-semigroup has been introduced in [5]. It is a generalization of the ω-semigroups introduced in [15]. Roughly speaking, an ω1-semigroup is a set S equipped with a product which maps any sequence of countable length over S to an element of S. This product satisfies a generalization of associativity called ω1-associativity.

Definition 5 An ω1-semigroup is a set S equipped with a function φ : S^♮ → S, called an ω1-product, which satisfies the following properties:

1. For any element s ∈ S, φ(s) = s.


An ω1-semigroup is not an algebra in the usual sense, since the product does not have a finite arity. Even if the ω1-semigroup is finite, the description of the product is not finite, since the product of any sequence of countable length must be given. However, we will see later that the product of a finite ω1-semigroup can be described in a finite way. Up to this question of infinite arity, the notion of ω1-semigroups fits into the general framework of universal algebra and the following notions are self-understanding: morphism of ω1-semigroups, quotient of ω1-semigroups, sub-ω1-semigroup, congruence of ω1-semigroups. An ω1-language X is recognized by a morphism ψ from A^♮ to S if and only if there is a subset P of S such that X = ψ^{-1}(P). It was proved in [5] that an ω1-language is rational if and only if it is recognized by a finite ω1-semigroup.

The ω1-product φ of an ω1-semigroup naturally induces an inner product defined by x·y = φ(xy) and a function x ↦ x^τ defined by x^τ = φ(x^ω). The ω1-associativity satisfied by φ implies that the inner product is associative and that the function τ satisfies

s·(t·s)^τ = (s·t)^τ and (s^n)^τ = s^τ for any n > 0.

A function satisfying these two identities is said to be compatible with the semigroup structure of S given by the inner product. When the ω1-semigroup S is finite, the inner product and the function x ↦ x^τ completely describe the ω1-product φ. Therefore, a finite ω1-semigroup can be viewed as a finite semigroup equipped with an additional compatible function τ.

In the sequel, we identify a sequence over an ω₁-semigroup and the product of its elements. As in semigroup theory, the product of a sequence of elements is denoted by mere concatenation and the application of φ is omitted. Following this convention, the application of τ is then denoted by ω, since x^τ = φ(x^ω).

From now on, we will not distinguish between a finite ω₁-semigroup and a finite semigroup with an additional compatible function ω. We point out that both notions do not coincide anymore when the ω₁-semigroups considered are not finite.

In a finite ω₁-semigroup, there is an integer π such that s^π is an idempotent for any element s of S. This integer is generally denoted by ω in the literature, but ω already denotes the first limit ordinal in our context.
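To make this finite description concrete, here is a small Python sketch (ours, not from the paper) that computes the exponent π for a semigroup of transformations of {0, 1, 2}, generated as in a transition semigroup:

```python
# Sketch (not from the paper): compute the exponent pi of a finite
# semigroup, i.e. the least integer such that s^pi is idempotent for
# every element s.  Elements are transformations of {0,...,n-1},
# written as tuples and composed left to right: (s * t)(q) = t(s(q)).
def compose(s, t):
    return tuple(t[q] for q in s)

def generate(gens):
    """Close a set of transformations under composition."""
    elems, frontier = set(gens), set(gens)
    while frontier:
        new = {compose(s, t) for s in elems for t in frontier} | \
              {compose(s, t) for s in frontier for t in elems}
        frontier = new - elems
        elems |= frontier
    return elems

def power(s, n):
    r = s
    for _ in range(n - 1):
        r = compose(r, s)
    return r

def exponent(elems):
    """Least pi >= 1 such that power(s, pi) is idempotent for all s."""
    pi = 1
    while not all(compose(power(s, pi), power(s, pi)) == power(s, pi)
                  for s in elems):
        pi += 1
    return pi

# generators: a cyclic shift (period 3) and a collapsing map
shift, collapse = (1, 2, 0), (0, 0, 1)
S = generate([shift, collapse])
print(exponent(S))  # -> 6 (the semigroup contains elements of period 2 and 3)
```

The generators here are illustrative; any finite set of transformations works, and the loop terminates because some power of every element of a finite semigroup is idempotent.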


3 Instances of correspondences

The variety theorem for finite words is due to Eilenberg [8]. He showed that there exists a one-to-one correspondence between varieties of finite semigroups (classes of semigroups closed under taking sub-semigroups, quotients and finite direct products) and certain classes of languages called the varieties of languages. A classical example is the theorem of Schützenberger [14], which states that a set of finite words is star-free if and only if its syntactic semigroup is aperiodic. The variety theorem was extended to words on countable ordinals in [5]. The result of Schützenberger was first extended to ordinal words of length less than ω^ω in [4], and then to all words of countable length in [3].

In the following two subsections, we give three instances of correspondences between varieties of finite ω₁-semigroups and varieties of ω₁-languages. The first example is a characterization of ω₁-languages recognized by automata in which overlapping limit transitions end in the same state. This class of ω₁-languages is characterized by an equation which has already been used in the case of languages of infinite words. This equation will be needed in the second example, which is a characterization of ω₁-languages recognized by extensive automata. The third example is a refinement of the second one: it is a characterization of ω₁-languages recognized by extensive automata in which limit transitions have a very special form.

3.1 Δ₂-languages

Our first example characterizes ω₁-languages recognized by automata in which overlapping limit transitions end in the same state. An ω₁-language can be recognized by such an automaton iff its syntactic semigroup satisfies an equation which, in the framework of infinite words, characterizes languages in the class Δ₂ of the Borel hierarchy (see Thm 6.2 in [15]). This justifies the terminology.

Theorem 6 For an ω₁-language X, the following conditions are equivalent.

1. X is recognized by a deterministic and complete automaton such that if P → q and P′ → q′ are two limit transitions with P ∩ P′ ≠ ∅, then q = q′.

2. The syntactic ω₁-semigroup of X satisfies the equation (xy^π)^π y^ω = (xy^π)^ω.

The automaton pictured in Figure 1 fulfills the property of the theorem. Note that the determinism and the completeness of the automaton are really necessary. In Thm 6.2 of [15], the Δ₂-languages are characterized by the equation


(x^π y^π)^π x^ω = (x^π y^π)^π y^ω. The following lemma shows that this equation is equivalent to the equation given in the previous theorem. It also provides a useful property of ω₁-semigroups satisfying this equation.

Lemma 7 For any finite ω₁-semigroup S, the following propositions are equivalent.

1. S satisfies the equation (xy^π)^π y^ω = (xy^π)^ω.

2. S satisfies the equation (x^π y^π)^π x^ω = (x^π y^π)^π y^ω.

3. Any elements s, s′, x, x′ ∈ S such that s R s′, sx = s and s′x′ = s′ satisfy s x^ω = s′ x′^ω.

The third statement of the lemma means that if the ω₁-semigroup S satisfies the equation (xy^π)^π y^ω = (xy^π)^ω, the value of s x^ω for s and x in S such that sx = s only depends on the R-class of s and not on x.
Proof We first prove that (1) and (2) are equivalent.

Suppose that (1) holds. We aim to show that (x^π y^π)^π x^ω = (x^π y^π)^π y^ω. Denote by e and f the two idempotents x^π and y^π. Since (ef)^π e = (efe)^π and e e^ω = e^ω, one has on the one hand that (ef)^π e^ω = (efe)^π e^ω. We are in a position to apply equation (xy^π)^π y^ω = (xy^π)^ω with ef in the role of x and e in the role of y^π. This yields (ef)^π e^ω = (efe)^ω. Observing that (efe)^ω = e(fe)^ω = (ef)^ω, one has the equality (ef)^π e^ω = (ef)^ω. Applying again (xy^π)^π y^ω = (xy^π)^ω with e in the role of x and f in the role of y^π, one has on the other hand that (ef)^π f^ω = (ef)^ω. Combining the two equalities, one gets the equality (x^π y^π)^π x^ω = (x^π y^π)^π y^ω as required.

Conversely, suppose that (2) holds. We aim to show that (xy^π)^π y^ω = (xy^π)^ω. Denote by e and f the two idempotents (xy^π)^π and y^π. One has ef = e and therefore (ef)^π = e. Applying equation (x^π y^π)^π x^ω = (x^π y^π)^π y^ω with e in the role of x^π and f in the role of y^π, one gets e e^ω = e f^ω, which is (xy^π)^ω = (xy^π)^π y^ω.

We now prove that (1) and (3) are equivalent. We suppose that s R s′, sx = s and s′x′ = s′, and we aim to show that

s x^ω = s′ x′^ω. Since s R s′, there are two elements y and y′ such that sy = s′ and s′y′ = s. Define the two idempotents e and e′ by e = (y x′^π y′ x^π)^π and e′ = (y′ x^π y x′^π)^π. Applying equation (xy^π)^π y^ω = (xy^π)^ω with y x′^π y′ in the role of x and x^π in the role of y^π, one obtains e x^ω = e^ω. Symmetrically, one also has e′ x′^ω = e′^ω. By definition of e and e′, one has se = s and s′e′ = s′. This yields s x^ω = s e^ω and s′ x′^ω = s′ e′^ω. The equality s e^ω = s′ e′^ω follows finally from the definition of e and e′.

Finally, suppose that (3) holds. We aim to show that (xy^π)^π y^ω = (xy^π)^ω. Let f be the idempotent y^π. Then, one has (xf)^π f = (xf)^π and (xf)^π (xf)^π = (xf)^π. We are in a position


to apply (3) with f in the role of x and (xf)^π in the role of s, s′ and x′. This yields (xf)^π f^ω = (xf)^π (xf)^ω = (xf)^ω as required. □
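As a sanity check on the lemma, the following Python sketch (ours, not from the paper) builds the R-trivial semigroup of all extensive maps on the chain {0, 1, 2} — the increasing functions used again in Lemma 11 below — equips it with the ω-operation s^ω = s^π s₀, and verifies by brute force that equations (1) and (2) hold together:

```python
# Brute-force check (ours) of Lemma 7 on a concrete finite omega_1-semigroup:
# extensive maps on {0,1,2} (q <= s(q)) with the operation s^omega = s^pi * s0.
from itertools import product

Q = range(3)

def compose(s, t):              # (s * t)(q) = t(s(q))
    return tuple(t[q] for q in s)

def power(s, n):
    r = s
    for _ in range(n - 1):
        r = compose(r, s)
    return r

# the R-trivial semigroup of all extensive maps on the chain {0,1,2}
S = [s for s in product(Q, repeat=3) if all(q <= s[q] for q in Q)]

pi = 1                          # exponent of S
while not all(compose(power(s, pi), power(s, pi)) == power(s, pi) for s in S):
    pi += 1

s0 = (2, 2, 2)                  # an arbitrary fixed element of S
omega = {s: compose(power(s, pi), s0) for s in S}

# equation (1): (x y^pi)^pi y^omega = (x y^pi)^omega
eq1 = all(compose(power(compose(x, power(y, pi)), pi), omega[y])
          == omega[compose(x, power(y, pi))]
          for x in S for y in S)

# equation (2): (x^pi y^pi)^pi x^omega = (x^pi y^pi)^pi y^omega
eq2 = all(compose(power(compose(power(x, pi), power(y, pi)), pi), omega[x])
          == compose(power(compose(power(x, pi), power(y, pi)), pi), omega[y])
          for x in S for y in S)

print(eq1, eq2)  # -> True True: the two equations hold together
```

The choice of s0 is immaterial here; any fixed element of this R-trivial semigroup yields a compatible ω-operation satisfying both equations.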

We now come to the proof of the previous theorem. From any complete deterministic automaton with no transitions P → q and P′ → q′ such that P ∩ P′ ≠ ∅ and q ≠ q′, we define an ω₁-semigroup satisfying the equation (xy^π)^π y^ω = (xy^π)^ω and recognizing the same ω₁-language. Conversely, from any ω₁-semigroup satisfying the equation (xy^π)^π y^ω = (xy^π)^ω, we construct an automaton with no transitions P → q and P′ → q′ such that P ∩ P′ ≠ ∅ and q ≠ q′.
Proof Let A be a deterministic and complete automaton. For any state q and any word x, there is a unique path q →^x q′ labeled by x. This defines a right action of A^♦ on Q defined by q · x = q′. This right action defines a semigroup S and a morphism π of semigroups from A^♦ into S. If for any limit transitions P → p and P′ → p′ of A such that P ∩ P′ ≠ ∅ one has p = p′, this semigroup S can naturally be endowed with a structure of ω₁-semigroup.

The compatible operation ω is defined as follows. Let q be a state and s be an element of S. Define the sequence (q_n)_{n≥0} of states by q_n = q · s^n. Let P be the set of states which appear infinitely often in the sequence (q_n)_{n≥0}, that is P = {q′ | ∀n ≥ 0 ∃k ≥ n, q′ = q_k}. Define q · s^ω = p, where P′ → p is a limit transition of A such that P ⊆ P′. Note that the state p does not depend on the choice of the transition P′ → p. Indeed, if P′ → p and P″ → p′ are two limit transitions such that P ⊆ P′ and P ⊆ P″, then P′ ∩ P″ ≠ ∅ and p = p′. It is pure routine to check that (s^n)^ω = s^ω and s(ts)^ω = (st)^ω, and that the ω₁-semigroup satisfies the equation (xy^π)^π y^ω = (xy^π)^ω. The morphism π from A^♦ to S is then a morphism of ω₁-semigroups. By construction, it recognizes the set X.

Conversely, let ψ : A^♦ → S be a morphism into a finite ω₁-semigroup satisfying the equation (xy^π)^π y^ω = (xy^π)^ω. We construct a deterministic and complete automaton whose state set is S¹, that is, S with a neutral element 1 added. The initial state is the neutral element 1. The successor transitions are the transitions of the form s →^a s ψ(a) for s ∈ S¹ and a ∈ A. The limit transitions are the transitions of the form P → r where P is a subset {s_1, …, s_n} such that s_i R s_j for any 1 ≤ i, j ≤ n. By the third statement of the previous lemma, there is a unique element r in S such that, for any x_i satisfying s_i x_i = s_i, one has s_i x_i^ω = r. We define then the limit transition P → r. By construction, for any word x, there is a path 1 →^x s where s = ψ(x), and the automaton recognizes the set X. □
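For finite words, the right action used in the first half of the proof is just the transition semigroup of the automaton. A minimal sketch (our own encoding: states are integers, a letter's action is a tuple):

```python
# Sketch (ours; finite words and successor transitions only) of the right
# action q . x in the proof: the transition semigroup of a deterministic
# complete automaton, with each letter's action written as a tuple.
def transition_semigroup(delta):
    """delta: dict letter -> tuple giving the action on the states.
    Returns the set of actions of all nonempty words."""
    def compose(s, t):
        return tuple(t[q] for q in s)
    elems = set(delta.values())
    frontier = set(elems)
    while frontier:
        new = {compose(s, delta[a]) for s in frontier for a in delta}
        frontier = new - elems
        elems |= frontier
    return elems

# two states: 'a' swaps them, 'b' sends both to state 1
S = transition_semigroup({'a': (1, 0), 'b': (1, 1)})
print(len(S))  # -> 4 distinct actions
```

Each word acts on states by composing the actions of its letters; extending this to limit ordinals is exactly what the compatible operation ω above accomplishes.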


3.2 R-trivial languages

An extensive automaton is an automaton in which any cycling path is trivial: if there are paths from p to q and from q to p in such an automaton, then the states p and q are equal. These automata capture the notion of a process which cannot return to a previously visited state. The progress of such a process is always increasing.

It has been shown by Eilenberg that a language of finite words is recognized by an extensive automaton if and only if its syntactic semigroup is R-trivial (see Chap. 10 in [8]). We extend here this result. We first define extensive automata for ordinal words and characterize the ω₁-languages recognized by these automata. We also characterize the ω₁-languages recognized by a subclass of these automata. We first extend the notion of extensive automata (see p. 93 in [10]).

Definition 8 An automaton A = (Q, A, E, I, F) is said to be extensive if its set of states is equipped with a total ordering ≤ such that:

1. for any successor transition p →^a q, it holds that p ≤ q,

2. for any limit transition P → q and any p ∈ P, it holds that p ≤ q.

It follows directly from the definition that any path in an extensive automaton is an increasing sequence of states. Therefore any cycling path must be trivial. Conversely, if any cycling path is trivial, the accessibility relation defines a partial ordering on the states. This partial ordering can be embedded into a total ordering which obviously meets the requirements of the previous definition.
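This remark amounts to a cycle check. A sketch (hypothetical input format) that recovers a total ordering witnessing extensiveness via a topological sort, using only the non-loop edges p → q with p ≠ q:

```python
# Sketch (hypothetical input format): an automaton is extensive iff its
# non-loop transitions form an acyclic graph; a topological sort then
# produces the required total ordering of the states.
from graphlib import TopologicalSorter, CycleError

def extensive_ordering(states, succ_edges, limit_edges):
    """succ_edges: triples (p, a, q); limit_edges: pairs (P, q).
    Returns a total ordering witnessing extensiveness, or None."""
    ts = TopologicalSorter({q: set() for q in states})
    for p, _a, q in succ_edges:
        if p != q:
            ts.add(q, p)        # p must come before q in the ordering
    for P, q in limit_edges:
        for p in P:
            if p != q:
                ts.add(q, p)
    try:
        return list(ts.static_order())
    except CycleError:          # some cycling path is non-trivial
        return None

order = extensive_ordering({0, 1, 2},
                           [(0, 'a', 0), (0, 'b', 1), (1, 'a', 2)],
                           [({1}, 2), ({2}, 2)])
print(order)  # -> [0, 1, 2]
```

Loops p → p are skipped because trivial cycling paths are allowed in an extensive automaton.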

Since any cycling path is trivial in an extensive automaton, the cofinal set of a path at some limit point is reduced to a single state. Therefore, we can always assume that all the limit transitions are of the form P → q where P = {p} is a singleton.

If the automaton is deterministic and extensive, two limit transitions P → q and P′ → q′ such that P ∩ P′ ≠ ∅ satisfy q = q′. Since the automaton is extensive, it can be assumed that P = {p} and P′ = {p′}. If P ∩ P′ ≠ ∅, one has p = p′ and P = P′. If furthermore the automaton is deterministic, one has then q = q′. This shows that any ω₁-language recognized by a deterministic extensive automaton is a Δ₂ ω₁-language.

In [8], it is proved that a language of finite words is recognized by an extensive automaton if and only if its syntactic semigroup is R-trivial. It is also proved that such a language can be denoted by a rational expression of a very special form. More precisely, it is shown that it is equal to a finite


union of products B_0* a_0 B_1* a_1 ⋯ B_{n-1}* a_{n-1} B_n* where B_k ⊆ A and a_k ∉ B_k (see Prop 3.2 and Thm 3.3 in [10]). The following theorem extends these results to words on countable ordinals.

Theorem 9 Let X be an ω₁-language. The following propositions are equivalent.

1. X is recognized by a deterministic extensive automaton.

2. X is recognized by an R-trivial ω₁-semigroup which satisfies the equation (xy^π)^π y^ω = (xy^π)^ω.

3. X is equal to a finite union of products L_1 ⋯ L_n, where each L_k for k < n is either B_k* a_k with B_k ⊆ A and a_k ∉ B_k, or B_k^ω with B_k ⊆ A, or B_k^♦ a_k with B_k ⊆ A and a_k ∉ B_k, and L_n is either B_n*, B_n^ω or B_n^♦ with B_n ⊆ A.

We have already mentioned that all limit transitions of an extensive automaton are of the form {p} → q. If a path uses a transition {p} → q with p ≠ q, there is a gap in the path: the path remains in the state p and then jumps to the state q at some limit point. The following theorem characterizes ω₁-languages recognized by extensive automata in which transitions of the form {p} → q with p ≠ q are not allowed; all limit transitions are of the form {p} → p. This theorem is therefore a refinement of the previous one.

Theorem 10 Let X be an ω₁-language. The following propositions are equivalent.

1. X is recognized by a deterministic extensive automaton all of whose limit transitions are of the form {p} → p.

2. X is recognized by an ω₁-semigroup which satisfies the equation x^ω = x^π.

3. X is equal to a finite union of products B_0^♦ a_0 B_1^♦ a_1 ⋯ B_{n-1}^♦ a_{n-1} B_n^♦ where B_k ⊆ A and a_k ∉ B_k.

Note first that an ω₁-semigroup satisfying the equation x^ω = x^π is R-trivial and satisfies the Δ₂-equation (xy^π)^π y^ω = (xy^π)^ω. Indeed, one has (xy)^π x = x(yx)^π = x(yx)^ω = (xy)^ω = (xy)^π.

The statements of the previous two theorems are very close. We will prove them simultaneously. The proof is split into several lemmas. This first lemma


shows that an R-trivial semigroup can be naturally endowed with a structure of ω₁-semigroup. Furthermore, this ω₁-semigroup satisfies the required equation. Note that in general there is no canonical way to endow a finite semigroup with a structure of ω₁-semigroup. It can even be shown that some semigroups, like the finite groups, cannot be endowed with a structure of ω₁-semigroup.

Lemma 11 Let S be an R-trivial semigroup and let s_0 be a fixed element of S. The function s ↦ s^π s_0 endows S with a structure of ω₁-semigroup satisfying the equation (se)^π e^ω = (se)^ω. Furthermore, if s_0 is a neutral element, then S satisfies the equation s^ω = s^π.

Proof Suppose that the semigroup S is R-trivial. It satisfies the equation (xy)^π x = (xy)^π. One has (s^n)^π s_0 = s^π s_0 and s(ts)^π s_0 = (st)^π s s_0 = (st)^π s_0, and the function s ↦ s^π s_0 is compatible.

One verifies that (se)^π e^ω = (se)^π e^π s_0 = (se)^π s_0 = (se)^ω. The last statement is obvious. □

If ≤ is an ordering on a set Q, the set S of all increasing functions from Q to Q, endowed with composition as product, is an R-trivial semigroup. Therefore, the previous lemma can be applied to any subsemigroup of the semigroup of all increasing functions.

It is not required in Theorems 9 and 10 that the automaton be complete. It is just supposed that it is deterministic and extensive. Without loss of generality, it can always be assumed that the automaton is complete, as stated in the following lemma.

Lemma 12 A deterministic and extensive automaton can be made complete.

Proof Let A be a deterministic and extensive automaton and let ≤ be the ordering on Q. If A is not complete, a new state q_0 such that q ≤ q_0 for all q is added, with the following transitions. If there is no transition labeled by a letter a outgoing from a state q, the transition q →^a q_0 is added. If for a state p there is no limit transition of the form {p} → q, a transition {p} → q_0 is added. Moreover, the transitions q_0 →^a q_0 for a in A and the transition {q_0} → q_0 are added. The new automaton is complete and it is still deterministic and extensive. □
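The completion can be sketched as follows (assumed data representation: successor transitions as a dict (state, letter) → state, limit transitions {p} → q as a dict p → q):

```python
# Sketch (assumed encoding) of the completion in Lemma 12: a fresh maximal
# sink state q0 absorbs every missing successor and limit transition,
# keeping the automaton deterministic and extensive.
def complete(states, alphabet, succ, limit):
    q0 = max(states) + 1             # above every state, so q <= q0 holds
    succ, limit = dict(succ), dict(limit)
    for q in list(states) + [q0]:
        for a in alphabet:
            succ.setdefault((q, a), q0)   # missing letter: go to the sink
        limit.setdefault(q, q0)           # missing limit transition: sink
    return q0, succ, limit

q0, succ, limit = complete({0, 1}, 'ab', {(0, 'a'): 1}, {1: 1})
print(q0, succ[(0, 'b')], limit[0], limit[1])  # -> 2 2 2 1
```

Because q0 is larger than every existing state, the original total ordering extends to the completed automaton.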

The following shows that if the ω₁-language X is recognized by a deterministic and extensive automaton, then it is recognized by an R-trivial ω₁-semigroup which satisfies the Δ₂-equation.


Lemma 13 Let X be an ω₁-language recognized by a deterministic extensive automaton A. Then X is recognized by an R-trivial ω₁-semigroup satisfying the equation (xy^π)^π y^ω = (xy^π)^ω. Furthermore, if all the limit transitions of the automaton are of the form {p} → p, the ω₁-semigroup satisfies the equation x^ω = x^π.

Proof By the previous lemma, it may be assumed that the automaton is complete. Let ≤ be the ordering on Q. Let S be the semigroup of increasing functions from Q to Q and let s_0 be an element of S such that if {p} → q is a limit transition of A, then p · s_0 = q. We claim that the ω₁-semigroup S with the operation ω defined by s^ω = s^π s_0 recognizes X. Let ψ : A^♦ → S be the morphism defined by ψ(a) = s_a, where each s_a is a function of S such that p · s_a = q whenever p →^a q is a successor transition of A. We claim that for any word x and any state q, p · ψ(x) = q iff p →^x q in A. The proof is by induction on the length of x. If |x| is a successor ordinal, the result follows directly from the definition of the functions s_a. Otherwise, let α be the limit ordinal |x|. Let x = ∏_{n<ω} x_n be a factorization such that s = ψ(x_0) and e = ψ(x_n) for n ≥ 1. Let y_n = x_0 ⋯ x_n. Since the sequence q_n = p · ψ(y_n) is increasing, there is an integer n_0 such that q_n = q_{n_0} for n ≥ n_0. Then q is the unique state such that {q_{n_0}} → q is a limit transition of A. By the induction hypothesis, one has q_n = p · ψ(y_n) = p · s e^n for n ≥ n_0. Since ψ(x) = s e^ω = s e^π s_0 and q = q_{n_0} · s_0, one has p · x = p · ψ(x).

Furthermore, if all limit transitions of A are of the form {p} → p, the element s_0 can be chosen equal to the neutral element. Then, the ω₁-semigroup satisfies the equation x^ω = x^π. □

Lemma 14 Let X be an ω₁-language recognized by a deterministic extensive automaton A. Then, it can be denoted by a regular expression as described in Theorems 9 and 10.

Proof For any states p and q, denote by A_{p,q} the subalphabet {a | p →^a q} and by p̄ the unique state such that {p} → p̄ is a limit transition of A. Let S be the set of increasing sequences of states defined by

S = {(q_1, …, q_n) | q_1 ∈ I and q_1 < q_2 < ⋯ < q_n and q_n ∈ F}.

The set X recognized by A is then equal to the finite union

X = ∪_{s∈S} X_s.

For any sequence s = (q_1, …, q_n) of S, the set X_s = {x | q_1 →^x q_n} is the set of words which label a path from q_1 to q_n going through the states of s. Since


the automaton A is extensive, the states must be visited in the ordering of s. It suffices to prove that each X_s is a finite union of products of the required form. We define the sets L_k, for k < n, as follows:

L_k = A_{q_k,q_k}^♦ A_{q_k,q_{k+1}}   if q̄_k = q_k,
L_k = A_{q_k,q_k}* A_{q_k,q_{k+1}} ∪ A_{q_k,q_k}^ω   if q̄_k = q_{k+1},
L_k = A_{q_k,q_k}* A_{q_k,q_{k+1}}   otherwise,

and L_n by

L_n = A_{q_n,q_n}^♦   if q̄_n = q_n,
L_n = A_{q_n,q_n}*   otherwise.

It is clear that X_s is equal to the product L_1 ⋯ L_n. Since the automaton A is deterministic, the subalphabets A_{q_k,q_k} and A_{q_k,q_{k+1}} are disjoint. Therefore, each L_k is equal to a finite union of ω₁-languages of the required form. □

Acknowledgment

The author would like to thank Jean-Éric Pin for his highly relevant comments on a preliminary version of this paper. He would also like to thank the anonymous referee for his very helpful suggestions.

References

1. J. Almeida. Finite Semigroups and Universal Algebra, volume 3 of Series in Algebra. World Scientific, 1994.

2. N. Bedon. Automata, semigroups and recognizability of words on ordinals. Int. J. Alg. Comput., 8(1):1–21, 1998.

3. N. Bedon. Logic over words on denumerable ordinals. Technical Report 2000-04, Institut Gaspard Monge, 2000.

4. N. Bedon. Star-free sets of words on ordinals. Inform. Comput., 166:93–111, 2001.

5. N. Bedon and O. Carton. An Eilenberg theorem for words on countable ordinals. In C. L. Lucchesi and A. V. Moura, editors, LATIN'98, volume 1380 of Lect. Notes in Comput. Sci., pages 53–64. Springer, 1998.

6. J. R. Büchi. On a decision method in the restricted second-order arithmetic. In Proc. Int. Congress Logic, Methodology and Philosophy of Science, Berkeley 1960, pages 1–11. Stanford University Press, 1962.

7. J. R. Büchi. Transfinite automata recursions and weak second order theory of ordinals. In Proc. Int. Congress Logic, Methodology and Philosophy of Science, Jerusalem 1964, pages 2–23. North Holland, 1965.

8. S. Eilenberg. Automata, Languages and Machines, volume B. Academic Press, 1976.

9. D. Perrin and J.-É. Pin. Semigroups and automata on infinite words. In J. Fountain and V. A. R. Gould, editors, NATO Advanced Study Institute: Semigroups, Formal Languages and Groups, pages 49–72. Kluwer Academic Publishers, 1995.

10. J.-É. Pin. Varieties of Formal Languages. North Oxford, London and Plenum, New York, 1986. Translation of Variétés de langages formels.

11. J.-É. Pin. Syntactic semigroups. In Handbook of Formal Languages, volume 1, pages 679–746. Springer, 1997.

12. J.-É. Pin. Positive varieties and infinite words. In C. L. Lucchesi and A. V. Moura, editors, LATIN'98, volume 1380 of Lect. Notes in Comput. Sci., pages 76–87. Springer, 1998.

13. J. G. Rosenstein. Linear Orderings. Academic Press, New York, 1982.

14. M. P. Schützenberger. On finite monoids having only trivial subgroups. Inform. Control, 8:190–194, 1965.

15. T. Wilke. An algebraic theory for regular languages of finite and infinite words. Int. J. Alg. Comput., 3(4):447–489, 1993.

16. J. Wojciechowski. Finite automata on transfinite sequences and regular expressions. Fundamenta Informaticae, 8(3–4):379–396, 1985.


THE THEORY OF RATIONAL RELATIONS ON TRANSFINITE STRINGS

CHRISTIAN CHOFFRUT
LIAFA, Université Paris 7
2, pl. Jussieu, 75251 Paris Cedex 05, France
cc@liafa.jussieu.fr

SERGE GRIGORIEFF
LIAFA, Université Paris 7
2, pl. Jussieu, 75251 Paris Cedex 05, France
seg@liafa.jussieu.fr

1 Introduction

In order to explain the purpose of this paper, we briefly recall how the theory of "rationality" developed over the last forty years.

The theory of sets of finite strings recognized by finite automata, also known as regular or rational sets, was developed in the fifties. It rapidly extended in two directions. Indeed, by solving the decidability problem of the second order monadic theory of one successor, Büchi was led naturally to introduce the notion of finite automata working on infinite strings. He further extended this result to the monadic theory of all denumerable ordinals, and by doing so he again modified the original notion of finite automata to suit his new purpose, [9]. At this point, the equivalence between the notions of recognizability (by "finite automata"), rationality (by "rational expressions") and definability (by "monadic second order logics") was achieved as far as strings of denumerable lengths were concerned.

In the late sixties, Elgot and Mezei wrote a historical paper on rational relations [15] which was a successful attempt to construct the theory of relations between free monoids that could be recognized by so-called n-tape automata. Though hard to read, it contained the basic results of the theory. In the mid eighties Gire and Nivat showed that the theory of rational relations on finite strings carries over to


infinite strings, [16]. More recently, Wilke made a breakthrough by giving an algebraic characterization of the rational infinite strings via the notion of "right binoids" (now known as "Wilke algebras"), [25]. This construct happens to be the natural extension of finite monoids for infinite strings. Elaborating on this notion, Bedon [2] showed that rational transfinite strings of countable length recognized by finite automata can also be recognized by finite algebras (the ω₁-algebras).

In the present paper we draw the theory further by showing that almost everything carries over to transfinite strings. The proofs of most elementary properties are mere paraphrases of those for finite strings, so we state them here for self-containment but we leave the simple verifications to the reader. For a few results completely new proof techniques are required, like the equivalent of the "second factorization theorem" (in Eilenberg's terminology [14, p. 248]), cf. Thm. 27. Finally, some properties no longer hold beyond the length ω^ω, like the uniformization problem, cf. Thm. 33 and §5.4.

Observe finally that our properties do not cover all those of [16] since the concatenation of strings is only partially defined. In some sense our general framework seems to be more natural. For example, when dealing with infinite strings, the property that the family of relations recognized by automata is closed under concatenation needs some intricate case analysis ([16, pp. 107–110]) while a purely formal proof works for transfinite strings. Also, the notion of direct product of rational subsets (the "recognizable relations") is completely clarified now with the ω₁-algebras of Wilke.

2 Preliminaries

Here we present the basic facts on ordinals, transfinite strings and finite automata for transfinite strings, thus extending the elementary definitions usual for finite strings.

2.1 Ordinals

In this paper we shall deal with ordinals less than the first non-denumerable ordinal ω₁. We refer the interested reader to the standard


textbooks, e.g., [24] and [23], for a thorough exposition of the material on ordinals. We recall that an ordinal is prime if it cannot be expressed as the sum of two smaller ordinals; these ordinals are exactly the powers of ω. The (unique) Cantor normal form of an ordinal 0 < α < ω₁ is the sum

α = ω^{λ_n} a_n + ⋯ + ω^{λ_1} a_1 + ω^{λ_0} a_0

where 0 ≤ n < ω, λ_n > ⋯ > λ_1 > λ_0, 0 ≤ a_i < ω for all i ≤ n and 0 < a_n. The ordinal λ_n is the degree and the ordinal λ_0 is the type of α. The degree and the type of 0 are equal to 0.
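For ordinals below ω^ω the exponents are themselves natural numbers, so the Cantor normal form is a finite list of terms, and the degree, the type and ordinal addition can be illustrated directly (a sketch of ours; {exponent: coefficient} is an assumed encoding):

```python
# Illustration (ours) for ordinals below omega^omega: the Cantor normal
# form is a finite set of terms omega^e * c, encoded as {exponent: coefficient}.
def degree(cnf):
    return max(cnf) if cnf else 0    # largest exponent

def ord_type(cnf):
    return min(cnf) if cnf else 0    # smallest exponent

def ord_add(a, b):
    """Ordinal addition: terms of a with exponent below deg(b) are absorbed."""
    if not b:
        return dict(a)
    d = max(b)
    result = {e: c for e, c in a.items() if e > d}
    result.update(b)
    if d in a:
        result[d] = a[d] + b[d]
    return result

alpha = {3: 2, 1: 5, 0: 4}           # omega^3 . 2 + omega . 5 + 4
print(degree(alpha), ord_type(alpha))       # -> 3 0
print(ord_add({0: 1}, {1: 1}))              # 1 + omega = omega: {1: 1}
print(ord_add({1: 1, 0: 1}, {1: 1}))        # (omega + 1) + omega: {1: 2}
```

The absorption rule in `ord_add` is exactly why ordinal addition is non-commutative: 1 + ω = ω while ω + 1 > ω.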

A few elementary definitions on sequences indexed by ordinals will be used in the sequel. We review them here.

DEFINITION 1. Let α be a limit ordinal. An increasing sequence (β_γ)_{γ<λ} is cofinal to α if β_γ is less than α for all γ < λ and if for all β < α there exists 0 ≤ γ < λ such that β < β_γ < α.

We shall also make use of the following well-known fact: if α is a countable ordinal then every sequence (β_γ)_{γ<λ} which is cofinal to α contains an ω-subsequence (β_{γ_i})_{i<ω} which is also cofinal to α.

DEFINITION 2. A (strictly) increasing sequence (β_γ)_{γ<λ} is continuous if for all limit ordinals γ < λ, sup_{ξ<γ} β_ξ = β_γ or, equivalently, β_γ is a limit ordinal and the sequence (β_ξ)_{ξ<γ} is cofinal to β_γ.

By posing α = lim_{γ<λ} β_γ if λ is a limit ordinal, or α = β_{λ−1} if it is a successor ordinal, the previous property is equivalent to saying that the set of all ordinals less than α is precisely the union of all the semi-open intervals [β_γ, β_{γ+1}[ for all 0 ≤ γ + 1 < λ. E.g., with λ = ω + 1, the sequence β_i = i for all i < ω and β_ω = ω + 1 is not continuous, nor is the sequence β_i = i for all i < ω and β_ω = ω × 2.

2.2 Transfinite strings

Given a finite alphabet A, a string is a mapping u from some α < ω₁ into A. Equivalently, u is a sequence of elements of A indexed by an ordinal α. We denote by u_β the element indexed by β < α in this sequence. The ordinal α is the length of u, denoted by |u|. The collection of all


strings is denoted by A^{<ω₁} and we will call them, improperly, transfinite, though they might also be of finite length. The empty string, denoted by 1, is the string of length 0 and is the unit of A^{<ω₁} as a monoid. By extension, the degree of a string u is the degree of its length. For a ∈ A, |u|_a denotes the length in the letter a of the string u, i.e., the ordinal of the subsequence indexed by the set of positions β < α for which u_β = a. The set of strings is partially ordered by the "prefix relation": u ≤ v if there exists w such that v = uw.

2.3 Continuous mappings

From now on and unless otherwise stated, the term "increasing", when applied to strings, is to be understood relative to the prefix ordering.

The string u is the limit of the increasing sequence (u_γ)_{γ<λ} of prefixes of u if for all prefixes v of u there exists an index γ such that v is a prefix of u_γ. E.g., lim_{i<ω}(ab)^i = lim_{i<ω}(ab)^i a = (ab)^ω. Let h : A^{<ω₁} → B^{<ω₁} be increasing with respect to the prefix ordering. We say that h is continuous if it commutes with limits of strings, i.e., if for all limit ordinals λ and all strings u = lim_{γ<λ} u_γ ∈ A^{<ω₁}, the equality h(u) = lim_{γ<λ} h(u_γ) holds.

Any mapping h from A into B^{<ω₁} can be extended in a unique way into an increasing (relative to the prefix ordering) and continuous (in the above sense) function from A^{<ω₁} into B^{<ω₁}. Thus, all morphisms will be determined by the images of the letters. This can be proved by transfinite induction by setting h(ua) = h(u)h(a) for all u ∈ A^{<ω₁} and a ∈ A, and h(u) = sup{h(v) | v is a prefix of u} if the length of u is a limit ordinal. This latter string is indeed well determined: it is the string of length α = sup{|h(v)| : v is a prefix of u} whose prefix of length β < α is the prefix of length β of every string h(v) where v is a prefix of u and β ≤ |h(v)| < α.

Similarly, every mapping h of A into the direct product B_1^{<ω₁} × B_2^{<ω₁} can be extended to a mapping of A^{<ω₁} into B_1^{<ω₁} × B_2^{<ω₁} by posing h(u) = ((π_1 h)(u), (π_2 h)(u)).


1. If β < α then q_{β+1} satisfies (q_β, u_β, q_{β+1}) ∈ E_A.

2. If β ≤ α is a limit ordinal then (P, q_β) ∈ E_ℓ where P ⊆ Q is the set of persistent elements in the run (q_γ)_{γ<β}.

Observe that a run with label u has length |u| + 1. The run is successful if q_0 ∈ Q₋ and q_α ∈ Q₊.

DEFINITION 5. 1) A subset of A^{<ω₁} is Büchi recognizable if it is the set of labels of successful runs of some Büchi automaton. We denote by Büchi(A^{<ω₁}) the family of Büchi recognizable subsets of A^{<ω₁}.

2) We let Büchi(A^{<α}) = {X ∩ A^{<α} | X ∈ Büchi(A^{<ω₁})}.

A word of caution. The way we introduced the relation E_ℓ is a slight modification of Büchi's original treatment of limit cases (for which the set of persistent states itself is considered as the limit state). However, this does not change the family of recognized subsets of A^{<ω₁}, as the reader may easily verify.

REMARK 6. The family of Büchi recognizable languages is easily seen to be closed under union and intersection. Closure under complementation is a difficult result (Büchi, 1965, [8], cf. also [9]).

The next result ensures that, given an automaton, the set of strings that label some path visiting a specified subset of states is rational. It will be used later in Theorem 27.

LEMMA 7. Let A be a Büchi automaton, Q its set of states and V ⊆ Q a fixed subset. For all p, q ∈ Q, the set X_{p,V,q} of strings that label some run from p to q which visits exactly the states in V is Büchi recognizable.

Proof. If A = (Q, A, Q₋, Q₊, E_A, E_ℓ) then X_{p,V,q} is recognized by the automaton B = (Q × 2^Q, A, {(p, ∅)}, {(q, V)}, F_A, F_ℓ), where the transition sets F_A, F_ℓ satisfy the following conditions:

1. for all q_1, q_2 ∈ Q, W ⊆ Q, a ∈ A we have ((q_1, W), a, (q_2, W ∪ {q_2})) ∈ F_A if and only if (q_1, a, q_2) ∈ E_A


3 Finite automata on transfinite strings

There are two different ways of defining a finite automaton on transfinite strings. Both are due to Büchi [9]. The first one was extensively studied by Choueka and considers strings of length less than ω^{n+1} for a given n, see [13]. The second one deals with strings of arbitrary countable lengths and was investigated (and actually extended to uncountable ordinals) by Wojciechowski [27]. Both formalisms are equivalent when restricted to strings of length less than ω^{n+1} for some n < ω (cf. Theorem 17), so we shall use whichever one is more amenable depending on the question under investigation.

3.1 Büchi automata on transfinite strings

Here we recall precisely the above-mentioned definitions of finite automata for transfinite strings, starting with what is nowadays known as Büchi's automaton. The main new point is the definition of limit transitions, which relies on the notion of cofinal sequences, as defined in paragraph 2.1.

DEFINITION 3. Let Q be a finite set, let α be a limit ordinal and let (q_β)_{β<α} be a sequence of elements of Q. An element q ∈ Q is persistent in the sequence if {β < α | q = q_β} is cofinal to α. In other words, for some increasing sequence (β_i)_{i<ω} cofinal to α we have q = q_{β_i} for all i < ω.

In the more familiar context of Büchi automata on ω-words, the states that are "infinitely repeated" in some infinite path are the persistent ones.

DEFINITION 4. A Büchi automaton is a tuple A = (Q, A, Q₋, Q₊, E_A, E_ℓ)

where Q is the (finite) set of states, A is the finite input alphabet, Q₋ ⊆ Q is the set of initial states, Q₊ ⊆ Q is the set of final states, E_A ⊆ Q × A × Q is the set of successor transitions and E_ℓ ⊆ 2^Q × Q is the set of limit transitions.

If u is a string of length α, a run with label u is a sequence (q_β)_{β≤α} of elements of Q which satisfies the following inductive conditions (recall u_β is the letter at position β in u, for all β < α):

– (q_β, u_β, q_{β+1}) ∈ E_A for all β < α;
– if λ ≤ α is a limit ordinal then (P, q_λ) ∈ E_ℓ, where P is the set of persistent states in (q_β)_{β<λ}.

A run is successful if q₀ ∈ Q₋ and q_α ∈ Q₊; a subset of A^{<ω₁} is Büchi recognizable if it is the set of labels of successful runs of some Büchi automaton.



Proof of Lemma 7 (continued):

2. for all {q₁, …, q_k} ⊆ Q, W ⊆ Q, r ∈ Q we have (S, (r, W ∪ {r})) ∈ F_ℓ if and only if ({q₁, …, q_k}, r) ∈ E_ℓ, where S = {(q₁, W), …, (q_k, W)}. □
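The visited-set product in the proof of Lemma 7 has an exact finite-word analogue which can be explored by breadth-first search. A sketch under our own conventions (the two-state automaton and all names are hypothetical; only words up to a length bound are enumerated):

```python
from collections import deque

def strings_visiting_exactly(delta, p, V, q, max_len):
    """Finite-word analogue of Lemma 7: enumerate the words of length
    <= max_len labelling a run of the NFA delta (a set of triples
    (state, letter, state)) from p to q whose set of states entered
    along the way is exactly V.  We explore the product automaton
    with states (state, visited-set), starting from (p, emptyset),
    exactly as in the proof."""
    V = frozenset(V)
    found = set()
    queue = deque([(p, frozenset(), "")])
    while queue:
        state, visited, word = queue.popleft()
        if state == q and visited == V:
            found.add(word)
        if len(word) == max_len:
            continue
        for (s, a, t) in delta:
            if s == state:
                # the product transition accumulates the target state
                queue.append((t, visited | {t}, word + a))
    return found

# Hypothetical two-state automaton: 1 -a-> 2, 2 -b-> 1
delta = {(1, "a", 2), (2, "b", 1)}
print(strings_visiting_exactly(delta, 1, {1, 2}, 1, 4))  # {'ab', 'abab'}
```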

3.2 Choueka automata on transfinite strings

Büchi's notion of transition at a limit step α involves all possible sequences cofinal to α. In case α < ω^{n+1} is a limit ordinal of type k (with 0 < k ≤ n) then α = β + ω^k for some β, and there is a canonical sequence cofinal to α, namely (β + ω^{k-1}·m)_{m<ω}. This is the one considered at limit steps by the so-called Choueka automata, which deal with strings of length less than ω^{n+1} for some n < ω. They require a special notion of limit, known as "Choueka-continuity" in the literature, see [13]. Given a set Q, we let

[Q]^0 = Q ,  [Q]^k = 2^{[Q]^{k-1}} \ {∅} if k > 0 ,  [Q]^{≤n} = ⋃_{0≤k≤n} [Q]^k .
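As a concrete check of this hierarchy, here is a small sketch (ours, not from the paper) computing [Q]^k and [Q]^{≤n} for a finite Q, with sets encoded as frozensets:

```python
from itertools import combinations

def nonempty_subsets(s):
    """All nonempty subsets of the finite set s, as frozensets."""
    items = list(s)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def level(Q, k):
    """[Q]^0 = Q;  [Q]^k = nonempty subsets of [Q]^(k-1) for k > 0."""
    current = set(Q)
    for _ in range(k):
        current = set(nonempty_subsets(current))
    return current

def up_to(Q, n):
    """[Q]^{<=n} = union of the [Q]^k for 0 <= k <= n."""
    return set().union(*(level(Q, k) for k in range(n + 1)))

Q = {"p", "q"}
print(len(level(Q, 1)))  # 3: the nonempty subsets of a 2-element set
print(len(level(Q, 2)))  # 7 = 2^3 - 1
print(len(up_to(Q, 2)))  # 12 = 2 + 3 + 7, the levels being disjoint
```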


DEFINITION 8. A transfinite sequence s of length α + 1 < ω^{n+1} is Choueka-continuous over the set Q if

– s₀ ∈ Q and s_{β+1} ∈ Q for all β < α,
– s_β = { e ∈ [Q]^{≤i} | ∃^∞k  e = s_{γ+ω^i·k} } for all β ≤ α of the form β = γ + ω^{i+1} (i.e. β is a limit of type i + 1).

In particular, s maps α + 1 into [Q]^{≤τ} (and not into Q), where τ is the type of α.

DEFINITION 9. A Choueka automaton is a quintuple A = (Q, A, E, Q₋, Q₊)

where Q is the (finite) set of states, A is the input alphabet, Q₋ ⊆ Q is the set of initial states, Q₊ ⊆ [Q]^{≤n} is the set of final states and E ⊆ [Q]^{≤n} × A × Q is the set of transitions.

The notion of run with label u = (u_β)_{β<α}, as defined for Büchi automata, extends naturally to Choueka automata. Indeed, a run is a Choueka-continuous sequence (q_β)_{β≤α} ∈ ([Q]^{≤n})^{α+1} satisfying (q_β, u_β, q_{β+1}) ∈ E for all β < α. Observe that it is clear from Definitions 8 and 9 that q_β ∈ [Q]^k if β is of type k.


A run is successful if its first state q₀ is an initial state and its last state q_α is a final state. A subset of A^{<ω^{n+1}} is Choueka recognizable if it is the set of labels of successful runs of some Choueka automaton.

REMARK 10. The family of Choueka recognizable languages is easily seen to be closed under union and intersection. Closure under complementation also holds (Choueka, [13]).

3.3 Rational sets of transfinite strings

A fundamental result is that Kleene's theorem can be extended to transfinite strings for both notions of automata. To that purpose, one has to consider two new operations on sets of strings.

DEFINITION 11. 1) The ω-power of a set X, denoted X^ω, is the set of strings obtained by concatenating ω-sequences of strings in X.
2) The ω₁-iteration of a set X is the set X^{<ω₁} = ⋃_{α<ω₁} X^α, where X^α is the set of strings obtained by concatenating α-sequences of strings in X (in particular, X^0 = {1}).
3) The n-trace-ω-power of a set X is n-trace(X^ω) = X^ω ∩ A^{<ω^{n+1}}.

REMARK 12. 1) X^{<ω₁} is the closure of X ∪ {1} under ω-power, i.e. it is the smallest set Y which contains X ∪ {1} and is closed under ω-power: X ∪ {1} ⊆ Y = Y^ω. It is also the closure of X ∪ {1} under Kleene-star and ω-power, i.e. the smallest set Y such that X ∪ {1} ⊆ Y = Y* = Y^ω. In fact, since (X ∪ {1})^ω ⊇ X*, such a Y necessarily satisfies Y = Y², and an easy induction over the ordinals less than ω₁ (using the equalities Y = Y² and Y = Y^ω for the respective successor and limit cases) shows that Y = Y^α for all α < ω₁, hence Y = Y^{<ω₁}.
2) Observe that there is no need for a similar closure operation relative to the n-trace-ω-power. In fact, an easy induction over i shows that the i-th iterate of the n-trace-ω-power of X ∪ {1} is (⋃_{α<ω^i} X^α) ∩ A^{<ω^{n+1}}.

All these iterates coincide for i ≥ n + 1; therefore the (n + 1)-th iterate of the n-trace-ω-power of X ∪ {1} is closed under n-trace-ω-power.

DEFINITION 13. 1) [Wojciechowski, 1985 [27]] Rat(A^{<ω₁}) denotes the least family of subsets of A^{<ω₁} that contains the empty set and the singleton sets consisting of a letter, and is closed under set union, concatenation, Kleene-star, ω-power and ω₁-iteration.



2) [Choueka, 1978 [13]] Rat(A^{<ω^{n+1}}) denotes the least family of subsets of A^{<ω^{n+1}} that contains the empty set and the singleton sets consisting of a letter, and is closed under set union, concatenation, Kleene-star, and the n-trace-ω-power.

The following can be proven with the same structural induction technique as for rational subsets of free monoids. Recall that a rational substitution of C into A^{<ω₁} is a mapping σ that assigns a rational subset of A^{<ω₁} to each c ∈ C. Given an ordinal α, one extends σ to C^{<ω₁} by setting σ((u_β)_{β<α}) = ∏_{β<α} σ(u_β).

PROPOSITION 14. If σ : C → Rat(A^{<ω₁}) is a rational substitution and X ∈ Rat(C^{<ω₁}) then σ(X) = ⋃{σ(x) | x ∈ X} is in Rat(A^{<ω₁}). An analogous property holds with ω^{n+1} in place of ω₁.
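On finite words the substitution and its extension to languages can be sketched directly (our own toy encoding, not from the paper; languages are finite sets of strings):

```python
from itertools import product

def substitute_word(sigma, word):
    """sigma(u) for a finite word u: the concatenation product of
    the languages sigma(c) assigned to the letters c of u."""
    result = {""}
    for c in word:
        result = {x + y for x, y in product(result, sigma[c])}
    return result

def substitute_lang(sigma, X):
    """sigma(X) = union of sigma(u) for u in X."""
    return set().union(*(substitute_word(sigma, u) for u in X)) if X else set()

sigma = {"a": {"0", "01"}, "b": {"1"}}
print(sorted(substitute_word(sigma, "ab")))          # ['01', '011']
print(sorted(substitute_lang(sigma, {"a", "ab"})))   # ['0', '01', '011']
```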

3.4 Kleene-type results

As for finite strings, there is an equivalence for subsets of transfinite strings between recognizability via some type of finite automaton and expressibility via some type of operations. We state these results for transfinite lengths less than ω^{n+1} for some n < ω and for arbitrary lengths less than ω₁.

THEOREM 15 (Choueka, 1974, [13]; cf. also Bedon, 1996, [1]). Rat(A^{<ω^{n+1}}) is exactly the family of Choueka recognizable subsets of A^{<ω^{n+1}}.

THEOREM 16 (Wojciechowski, 1985, [27]). Rat(A^{<ω₁}) is exactly the family of Büchi recognizable subsets of A^{<ω₁}.

It is easy to check that A^{<ω^{n+1}} (considered as a subset of A^{<ω₁}) is Büchi recognizable. Now, as a corollary of the above theorems, we see that Choueka recognizability is a special case of Büchi recognizability. This is proven in [1], but it can more easily be seen by arguing on the lengths of the strings.

THEOREM 17 ([13],[1]). Let X ⊆ A^{<ω^{n+1}}. The following conditions are equivalent:

1) X is Büchi recognizable
2) X is Choueka recognizable
3) X = Y ∩ A^{<ω^{n+1}} for some Büchi recognizable set Y ⊆ A^{<ω₁}


4 Rational relations on transfinite strings

4.1 Two-tape Büchi and Choueka automata

The idea of the Büchi and Choueka automata described above extends to two-tape automata.

DEFINITION 18. A two-tape Büchi automaton on transfinite strings is a construct A = (Q, A, B, Q₋, Q₊, E_A, E_B, E_ℓ) where (1) Q, Q₋, Q₊ are as in Definition 4, A and B are finite alphabets, and

(2) E_A ⊆ Q × A × Q ,  (3) E_B ⊆ Q × B × Q ,  (4) E_ℓ ⊆ 2^Q × Q .


The elements of (2), (3) and (4) are respectively called the A-, B- and limit transitions. The notions of run, label and successful run are straightforward extensions of the corresponding notions for one-tape automata, the only difference being that labels are pairs of strings rather than single strings. The relation in A^{<ω₁} × B^{<ω₁} defined by the two-tape Büchi automaton is the set of pairs (u, v) that are the labels of some successful run, and is said to be Büchi recognizable.

DEFINITION 19. A two-tape Choueka automaton on transfinite strings is a construct A = (Q, A, B, Q₋, Q₊, E) where Q, Q₋, Q₊ are as in Definition 9, A and B are finite alphabets, and where

E ⊆ [Q]^{≤n} × ((A × {1}) ∪ ({1} × B)) × Q

The relation in A^{<ω^{n+1}} × B^{<ω^{n+1}} defined by the two-tape Choueka automaton is said to be Choueka recognizable.

4.2 Rational relations

The operations needed for defining rational relations are those introduced for subsets, extended in the usual way to pairs of strings. For instance, Definition 11 obviously extends from strings to pairs of strings, which leads to the following extension of Definition 13.


DEFINITION 20. 1) Rat(A^{<ω₁} × B^{<ω₁}) denotes the least family of subsets of A^{<ω₁} × B^{<ω₁} that contains the empty set and the singleton sets {(a, 1)}, {(1, b)} for a ∈ A, b ∈ B, and is closed under set union, concatenation, Kleene-star, ω-power and ω₁-iteration.
2) Rat(A^{<ω^{n+1}} × B^{<ω^{n+1}}) denotes the least family of subsets of A^{<ω^{n+1}} × B^{<ω^{n+1}} that contains the empty set and the singleton sets {(a, 1)}, {(1, b)} for a ∈ A, b ∈ B, and is closed under set union, concatenation, Kleene-star, and the n-trace-ω-power.

The closure under rational substitution seen in Proposition 14 extends easily to relations.

PROPOSITION 21. 1) If σ : C → Rat(A^{<ω₁}) and τ : C → Rat(B^{<ω₁}) are rational substitutions and X ∈ Rat(C^{<ω₁}) then (σ, τ)(X) = ⋃_{x∈X} σ(x) × τ(x) is in Rat(A^{<ω₁} × B^{<ω₁}).
2) If σ : C → Rat(A^{<ω^{n+1}}) and τ : C → Rat(B^{<ω^{n+1}}) are rational substitutions and X ∈ Rat(C^{<ω^{n+1}}) then (σ, τ)(X) = ⋃_{x∈X} σ(x) × τ(x) is in Rat(A^{<ω^{n+1}} × B^{<ω^{n+1}}).

4.3 The first factorization theorem

We recall that a morphism φ : A* → B* is alphabetic whenever φ(A) ⊆ B ∪ {1} holds, and that it is strictly alphabetic whenever φ(A) ⊆ B holds. These notions make sense when applied to morphisms from A^{<ω₁} into B^{<ω₁}.

Proposition 21 admits a converse, which is given normalized forms in Proposition 22 below and in Theorems 27 and 28. The first form we consider is the transfinite extension of a result on finite strings first observed by Nivat, 1968, [22], which is called the first factorization theorem in Eilenberg [14], p. 240. Its proof is a paraphrase of the proof for finite strings, cf., e.g., [6, Thm. III.4.1.1], and consists in considering the pairs (a, 1), (1, b) as letters of a new alphabet C and introducing the projections from C onto A ∪ {1} and B ∪ {1}.

PROPOSITION 22. The following conditions are equivalent:
1) R ∈ Rat(A^{<ω₁} × B^{<ω₁})
2) there exist a finite alphabet C, a rational subset K of C^{<ω₁} and two alphabetic morphisms π_A : C → A ∪ {1}, π_B : C → B ∪ {1} such that R = {(π_A(x), π_B(x)) | x ∈ K}. Moreover, one can suppose

(*)  ∀c ∈ C (π_A(c) is empty ⟹ π_B(c) is not empty)

An analogous equivalence holds with ω^{n+1} in place of ω₁.

As a corollary, we get the relational version of Theorem 17.

THEOREM 23. Let R ⊆ A^{<ω^{n+1}} × B^{<ω^{n+1}}. The following conditions are equivalent:

1) R is Büchi recognizable
2) R is Choueka recognizable
3) R = S ∩ (A^{<ω^{n+1}} × B^{<ω^{n+1}}) for some Büchi recognizable relation S.

Proof. 1) ⟹ 2). Let K, π_A, π_B be as in 2) of Proposition 22 with condition (*). Clearly, |π_A(x)| + |π_B(x)| ≥ |x| for all x ∈ C^{<ω₁}. Since R ⊆ A^{<ω^{n+1}} × B^{<ω^{n+1}}, we see that K ⊆ C^{<ω^{n+1}}. We conclude by Theorem 17 and Proposition 22 applied to ω^{n+1}.
2) ⟹ 3). If R is Choueka recognizable then, by Proposition 22 with condition (*) applied to ω^{n+1} and Theorem 17, there exists K = K' ∩ C^{<ω^{n+1}}, where K' is Büchi recognizable, such that R = {(π_A(x), π_B(x)) | x ∈ K}. Then R = S ∩ (A^{<ω^{n+1}} × B^{<ω^{n+1}}) where S = {(π_A(x), π_B(x)) | x ∈ K'}.
3) ⟹ 1). Let S = {(π_A(x), π_B(x)) | x ∈ K} for some K ∈ Rat(C^{<ω₁}) with condition (*). Then R = {(π_A(x), π_B(x)) | x ∈ K ∩ C^{<ω^{n+1}}}. □

Another corollary of the first factorization theorem is the closure under composition property. The proof is also a paraphrase of the same property for finite strings, [6, Thm III.4.4.1]. We reproduce it here for the sake of completeness.

PROPOSITION 24 (Closure under composition). If R ∈ Rat(A^{<ω₁} × B^{<ω₁}) and S ∈ Rat(B^{<ω₁} × C^{<ω₁}) then R ∘ S ∈ Rat(A^{<ω₁} × C^{<ω₁}). The same holds with ω^{n+1} in place of ω₁, for all 0 ≤ n < ω.

Proof. The proof goes exactly as in the finite length case, see [6]. Denote by π_A^{AB} the projection of A ∪ B onto A and by π_{AB}^{ABC} the projection


of A ∪ B ∪ C onto A ∪ B (and similarly for the other sub-alphabets). Let R = {(π_A^{AB}(x), π_B^{AB}(x)) | x ∈ K} and S = {(π_B^{BC}(x), π_C^{BC}(x)) | x ∈ L}, where K ⊆ (A ∪ B)^{<ω₁} and L ⊆ (B ∪ C)^{<ω₁} are rational languages. Now, the composition R ∘ S can be written R ∘ S = {(π_A^{ABC}(x), π_C^{ABC}(x)) | x ∈ M}, where

M = (π_{AB}^{ABC})^{-1}(K) ∩ (π_{BC}^{ABC})^{-1}(L).

The inclusion from right to left is straightforward. As for the left to right inclusion, observe that if A, B, C are disjoint and u ∈ (A ∪ B)^{<ω₁}, v ∈ (B ∪ C)^{<ω₁} are such that π_B^{AB}(u) = π_B^{BC}(v), then there exists w ∈ (A ∪ B ∪ C)^{<ω₁} such that π_{AB}^{ABC}(w) = u and π_{BC}^{ABC}(w) = v. Finally, observe that M is rational, first because π^{-1} commutes with set union, concatenation, Kleene-star, ω-power and ω₁-iteration, and second because the family of rational sets of transfinite strings is closed under intersection. □
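The key step of this proof, building a common witness w for the two projections when A, B, C are disjoint, can be executed directly on finite words. A sketch with our own hypothetical helper names:

```python
def merge(u, v, A, B, C):
    """Given u over A∪B and v over B∪C (A, B, C pairwise disjoint)
    with the same projection onto B, build a witness w over A∪B∪C
    whose projections onto A∪B and B∪C are u and v, as in the proof
    of Proposition 24: interleave the A-blocks of u and the C-blocks
    of v around the common B-letters."""
    w, i, j = [], 0, 0
    while i < len(u) or j < len(v):
        while i < len(u) and u[i] in A:   # flush a block of A-letters
            w.append(u[i]); i += 1
        while j < len(v) and v[j] in C:   # flush a block of C-letters
            w.append(v[j]); j += 1
        if i < len(u) and j < len(v):     # next letter of each is in B
            assert u[i] == v[j], "B-projections must agree"
            w.append(u[i]); i += 1; j += 1
        else:
            break
    return "".join(w)

def proj(w, X):
    """Projection of the word w onto the sub-alphabet X."""
    return "".join(c for c in w if c in X)

w = merge("abba", "bcbc", set("a"), set("b"), set("c"))
print(w)  # abcbac
```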

4.4 Rational Büchi transducers

Here we modify the notion of finite transducer so as to transform it into a Büchi automaton on A in which each transition is equipped with an output in Rat(B^{<ω₁}). In other words, as for finite and infinite strings, there is an alternative definition where the third components of the transitions are rational subsets of B^{<ω₁}.

DEFINITION 25. A Büchi transducer on transfinite strings is a construct T = (Q, A, B, Q₋, E_A, E_ℓ, F) where Q, A, B, Q₋ are as in Definition 4, E_A is a finite subset of Q × A × Rat(B^{<ω₁}) × Q, E_ℓ is a subset of 2^Q × Q and F is a mapping from Q to Rat(B^{<ω₁}). A run of T on a transfinite input string (u_η)_{η<λ} ∈ A^{<ω₁} is a pair of sequences ((q_η)_{η≤λ}, (X_η)_{η≤λ}) where

1) q₀ ∈ Q₋ and for all η < λ we have (q_η, u_η, X_η, q_{η+1}) ∈ E_A
2) if η ≤ λ is limit then (P, q_η) ∈ E_ℓ, where P is the set of persistent states in (q_ζ)_{ζ<η}
3) X_λ = F(q_λ)

The output of the run is the concatenation product X = ∏_{η≤λ} X_η. The transducer T defines the relation R ⊆ A^{<ω₁} × B^{<ω₁} which associates with every transfinite input string u the union of the outputs of all runs on u.


4.5 Eilenberg's second factorization theorem: the case < ω₁

The following technical result happens to be the crux of the transfinite extension of Eilenberg's second factorization theorem.

LEMMA 26. Let Q be a finite set, let α and λ be limit ordinals and let (q_β)_{β<α} be a sequence of elements of Q. Let (β_η)_{η<λ} be an increasing continuous (see 2.1) sequence of ordinals which is cofinal to α. Consider the sequence (Q_η)_{η<λ} of subsets of Q where Q_η = {q_γ | β_η ≤ γ < β_{η+1}}. Then the set of persistent elements in (q_β)_{β<α} is the union of the sets which are persistent elements in the sequence (Q_η)_{η<λ}.

Proof. Denote by {Q^{(1)}, …, Q^{(k)}} the collection of persistent elements in the sequence (Q_η)_{η<λ} and set P = Q^{(1)} ∪ … ∪ Q^{(k)}.

1) Let q be some persistent element in the sequence (q_β)_{β<α}, say q = q_{γ_i} for some sequence (γ_i)_{i<ω} cofinal to α. Since (β_η)_{η<λ} is cofinal to α, for every i < ω there exists η such that γ_i < β_η. Since (β_η)_{η<λ} is continuous, the least such η is necessarily a successor ordinal; we denote it by η_i + 1. Thus, γ_i ∈ [β_{η_i}, β_{η_i+1}[. Observe that the sequence (η_i)_{i<ω} is necessarily cofinal to λ: in fact, if η < λ then η ≤ η_i for all i < ω such that β_η < γ_i. Since q = q_{γ_i} we have q ∈ Q_{η_i} for all i. Since Q is finite, there exists an infinite set I ⊆ ω such that all the Q_{η_i}, i ∈ I, are equal. Let Q' be their common value. The sequence (η_i)_{i∈I} being cofinal to λ (as is (η_i)_{i<ω}), we see that Q' is a persistent element in the sequence (Q_η)_{η<λ}. Thus, with the above notations, Q' is among {Q^{(1)}, …, Q^{(k)}}. Since q ∈ Q' we conclude that q ∈ P.

2) Conversely, let q ∈ Q^{(j)} with 1 ≤ j ≤ k. Then Q^{(j)} = Q_{η_i} for some increasing sequence (η_i)_{i<ω} cofinal to λ. Choose an arbitrary element ε_i ∈ [β_{η_i}, β_{η_i+1}[ with q_{ε_i} = q. Then the sequence (ε_i)_{i<ω} is cofinal to α, and thus q is persistent in (q_β)_{β<α}. □

The equivalence of properties 1 and 4 in the next theorem is a second normalized form for a converse of Proposition 21, introducing strictly alphabetic morphisms, i.e. morphisms φ : C^{<ω₁} → D^{<ω₁} satisfying φ(C) ⊆ D. It is the transfinite version of Eilenberg's second factorization theorem ([14], 1974, p. 248).


THEOREM 27. Given a relation R ⊆ A^{<ω₁} × B^{<ω₁}, the following properties are equivalent:

1) R is rational
2) R is defined by some 2-tape Büchi automaton
3) R is defined by some Büchi transducer
4) there exist finite alphabets C, D, a rational subset K ⊆ C^{<ω₁}D, a strictly alphabetic morphism φ : C → A and a rational substitution σ : (C ∪ D) → Rat(B^{<ω₁}) such that R = ⋃_{d∈D, xd∈K} {φ(x)} × σ(xd)

Proof. With no loss of generality we can suppose that A and B are disjoint.
1) ⟺ 2). As in the finite string case, there is a one-to-one correspondence between 2-tape Büchi automata on alphabets A, B and one-tape Büchi automata working on the alphabet (A × {1}) ∪ ({1} × B), the relation associated to the 2-tape Büchi automaton being defined as in condition 2 of Proposition 22. Thus, 1) ⟺ 2) is a mere reformulation of that last proposition.
3) ⟹ 4). Given the Büchi transducer T = (Q, A, B, Q₋, E_A, E_ℓ, F), let C, D be disjoint finite sets of new symbols such that C is in one-to-one correspondence with the set of quadruples (q, a, X, p) of E_A, and D is in one-to-one correspondence with the set of pairs (q, X) ∈ Q × Rat(B^{<ω₁}) such that F(q) = X. Denote by [q, a, X, p] or [q, X], whichever applies, the element of C or D in this correspondence. Define a strictly alphabetic morphism φ : C → A and a rational substitution σ : (C ∪ D) → Rat(B^{<ω₁}) as follows: φ([q, a, X, p]) = a, σ([q, a, X, p]) = σ([q, X]) = X. Transform T into a one-tape Büchi automaton

A = (Q ∪ {q₊}, C ∪ D, Q₋, {q₊}, E_{C∪D}, E_ℓ)

by defining
– (q, [q, a, X, p], p) ∈ E_{C∪D} if and only if (q, a, X, p) ∈ E_A
– (q, [q, X], q₊) ∈ E_{C∪D} if and only if F(q) = X.
Clearly, A recognizes a subset K of C^{<ω₁}D such that R = ⋃_{d∈D, xd∈K} {φ(x)} × σ(xd).
4) ⟹ 1). This is a direct consequence of Proposition 21.
2) ⟹ 3). This is the last implication that remains to be proved; the rest of this paragraph is devoted to its proof. Using condition 2 of Proposition 22, let R = {(π_A(x), π_B(x)) | x ∈ K} where K ∈ Rat((A ∪ B)^{<ω₁}) is recognized by the (one-tape) automaton A =


(Q, A ∪ B, Q₋, Q₊, E_{A∪B}, E_ℓ). We shall associate to A a Büchi transducer T which, in a first approach, follows the same construction as for finite strings.

Preliminary analysis

In a run of A, the transitions can be divided into those labeled by letters of A and those labeled by letters of B. Thus, by tracking the A-transitions, we may view a run of A as starting with a (possibly empty) sequence of B-transitions, followed by one A-transition, followed by a (possibly empty) sequence of B-transitions, followed by one A-transition, and so on; this sequence may be followed by a last sequence of B-transitions. This grouping process is indeed possible in the transfinite case.

Claim: Every transfinite string w over A ∪ B of length λ in the sub-alphabet A (i.e. |w|_A = λ) can be factored in a unique way as

w = (∏_{η<λ} v_η a_η) v_λ

with a_η ∈ A and v_η ∈ B^{<ω₁}. Indeed, let π_A(w) = (a_η)_{η<λ}. For η ≤ λ consider the shortest prefix w_η of w containing all occurrences a_ζ with ζ < η. Then w = w_η z holds for some transfinite string z, and v_η is exactly the longest prefix of z not containing an occurrence of a letter of A.

Consider an A-run (q_β)_{β≤α} on input w = (∏_{η<λ} v_η a_η) v_λ ∈ (A ∪ B)^{<ω₁}, where α = |w| and λ = |w|_A. Let β_η = |∏_{ζ<η} v_ζ a_ζ| = Σ_{ζ<η} (|v_ζ| + 1), so that q_{β_η} is the state reached after processing the η first occurrences of letters of A. A direct, naive extension to the transfinite of the intuition used for finite strings would define T so that

1. the A-run q_{β_η} → q_{β_η+|v_η|} → q_{β_{η+1}}, reading v_η then a_η, is replaced by a single T-transition from q_{β_η} to q_{β_{η+1}} reading a_η, whose output X_η is the rational set consisting of all strings v' ∈ B^{<ω₁} such that there is an A-run from q_{β_η} to q_{β_η+|v_η|} with label v';

2. the last part q_{β_λ} → q_α of the A-run, reading v_λ, is replaced by a last T-output X_λ = F(q_{β_λ}), equal to the rational set consisting of all strings v' ∈ B^{<ω₁} such that there is an A-run from q_{β_λ} to q_α with label v'.


However, we also have to consider limit transitions, which require a memorization of the set of persistent states in the run. This leads to the following modification of assertion 1 above:

3. the T-output X_η is reduced to the set of all strings v' ∈ B^{<ω₁} such that there is an A-run from q_{β_η} to q_{β_η+|v_η|} with label v' and the two runs from q_{β_η} to q_{β_η+|v_η|} with labels v_η and v' visit exactly the same set of states, namely Q_η = {q_β | β_η ≤ β < β_{η+1}}. That X_η is indeed rational is ensured by Lemma 7. Also, T-states will have two components: one for the current A-state and the other to memorize the set Q_η. Since that last set is known at the end of the run on input v_η, it will be stored in the second component of the state (q_{β_{η+1}}, V_{η+1}) reached by T after processing letter a_η. For limit η we shall put V_η = ∅.

Construction of the Büchi transducer

This leads to the following construction of

T = (Q × 2^Q, A, B, Q₋ × {∅}, E'_A, E'_ℓ, F)

Let A' be the automaton obtained from A by deleting all A-transitions. As in Lemma 7, for all q, r ∈ Q and V ⊆ Q we denote by X_{q,V,r} the set of strings v ∈ B^{<ω₁} such that there is an A'-run from state q to state r which visits exactly the states in V.
INITIAL OR SUCCESSOR TRANSITIONS. We set ((q, W), a, X, (p, V)) ∈ E'_A if and only if

X = ⋃ { X_{q,V,r} | r is such that (r, a, p) ∈ E_{A∪B} }

LIMIT TRANSITIONS. For all S = {(q₁, V₁), …, (q_k, V_k)} we set (S, (q, ∅)) ∈ E'_ℓ if (P, q) ∈ E_ℓ, where P = V₁ ∪ … ∪ V_k.
FINAL TRANSITIONS. We set F((q, V)) = X if and only if X is the family of labels of runs of A' going from state q to some state in Q₊.

Let us verify that R is exactly the relation associated to the transducer T. Suppose (u, v) ∈ R. Then there exists w ∈ (A ∪ B)^{<ω₁} such that (u, v) = (π_A(w), π_B(w)), and there exists an accepting A-run (q_β)_{β≤α} on input w with α = |w|. We keep the notations of the preliminary analysis and show that

1. the sequence ((q_{β_η}, V_η))_{η≤λ} is a run of T on input u;
2. v_η ∈ X_η for all η ≤ λ.


The preliminary analysis and the very definition of T show that v_η ∈ X_η for η < λ. Also, since the A-run on w is accepting, q_α ∈ Q₊, so that X_λ ≠ ∅ and v_λ ∈ X_λ. This proves assertion 2 above.

As for assertion 1, the case of initial and successor steps is clear from the preliminary analysis and the very definition of T. Concerning limit transitions, first observe that η is a limit ordinal if and only if β_η = |∏_{γ<η} v_γ a_γ| is a limit ordinal. In particular, the sequence (β_η)_{η≤λ} is continuous. Recall that V_{η+1} = Q_η for all η and V_η = ∅ for limit η. Suppose θ ≤ λ is limit. The family of persistent elements in the sequence (Q_η)_{η<θ} is exactly that of persistent elements in the sequence (V_η)_{η<θ}, augmented with the empty set (which appears at limit steps) in case θ has type ≥ 2, i.e. is a limit of limit ordinals. Thus, these families have the same union. Lemma 26 ensures that this union is exactly the set of persistent elements of the sequence (q_β)_{β<β_θ}. The definitions of the limit transitions of A and T now show that (q_{β_θ}, ∅) is a valid limit state of T for the run on u. This proves assertion 1 above. Thus, (u, v) is in the relation associated to T.

Conversely, suppose (u, v) is in the relation associated to T, with u = (a_η)_{η<λ} and v = ∏_{η≤λ} v_η, where v_η is in the output X_η of T relative to the transition on letter a_η if η < λ, or in the very last output given by the function F if η = λ. Similar arguments allow one to construct an accepting A-run on input w = (∏_{η<λ} v_η a_η) v_λ, which shows that (u, v) ∈ R. □

4.6 Eilenberg's second factorization theorem: the case < ω^{n+1}

Büchi two-tape automata and Büchi transducers have obvious Choueka counterparts in which limit transitions are treated in the Choueka way. In the notion of Choueka transducer there is no need to restrict the output rational sets to Rat(B^{<ω^{n+1}}); such outputs can be taken in Rat(B^{<ω₁}).


Using Theorem 17 and Proposition 22, we now prove that Theorem 27 implies its Choueka version as a corollary.

THEOREM 28. 1) Given a relation R ⊆ A^{<ω^{n+1}} × B^{<ω₁}, the following properties are equivalent:

1) R is rational
2) R is defined by some Choueka transducer
3) there exist finite alphabets C, D, a rational subset K ⊆ C^{<ω^{n+1}}D, a strictly alphabetic morphism φ : C → A and a rational substitution σ : (C ∪ D) → Rat(B^{<ω₁}) such that R = ⋃_{d∈D, xd∈K} {φ(x)} × σ(xd)

2) In case R ⊆ A^{<ω^{n+1}} × B^{<ω^{n+1}}, the above properties are also equivalent to

4) R is defined by some 2-tape Choueka automaton

Moreover, in that case, in condition 3) we can suppose σ to have range in Rat(B^{<ω^{n+1}}).

Proof. 1) Implications 2) ⟹ 3) ⟹ 1) go as the corresponding implications 3) ⟹ 4) ⟹ 1) in Theorem 27. For implication 3) ⟹ 2), on input u ∈ A^{<ω^{n+1}} the wanted Choueka transducer nondeterministically guesses some string xd ∈ C^{<ω^{n+1}}D, outputs σ(xd) and checks whether xd ∈ K and u = φ(x) (which can be done step by step since φ is strictly alphabetic). For implication 1) ⟹ 3), suppose 1). Using the analogous implication in Theorem 27, let C, D, φ : C → A, σ : (C ∪ D) → Rat(B^{<ω₁}) and K ∈ Rat(C^{<ω₁}D) be such that R = ⋃_{d∈D, xd∈K} {φ(x)} × σ(xd). Up to a restriction of the alphabet C, we can suppose that σ(xd) ≠ ∅ for all xd ∈ K. Since φ is strictly alphabetic and φ(x) ∈ A^{<ω^{n+1}} whenever xd ∈ K, we see that K ⊆ C^{<ω^{n+1}}D. Now, using Theorem 17, we get the desired conclusion.

2) Obvious. □

4.7 Recognizable relations

We recall that an ω₁-Wilke algebra is a semigroup S equipped with an additional unary operation x ↦ x^ω, subject to the two axioms

1. for all x, y, x(yx)^ω = (xy)^ω holds


2. for all x and all integers n < ω, (x^n)^ω = x^ω holds.

A subset K ⊆ A^{<ω₁} is recognizable if there exist a morphism φ (relative to the structure of ω₁-Wilke algebras) from A^{<ω₁} onto a finite ω₁-Wilke algebra S and a subset T ⊆ S such that K = φ^{-1}(T). The notion of recognizability extends to relations: R ⊆ A^{<ω₁} × B^{<ω₁} is recognizable if there exist a morphism φ in the category of ω₁-Wilke algebras from A^{<ω₁} × B^{<ω₁} onto a finite ω₁-Wilke algebra S and a subset T ⊆ S such that R = φ^{-1}(T). It is clear again that the traditional properties of recognizable sets and relations on finite strings extend to transfinite strings.

PROPOSITION 29. 1) Recognizable relations on A^{<ω₁} × B^{<ω₁} form a Boolean algebra.
2) If φ : A* → B* is a monoid morphism and K ⊆ B^{<ω₁} is recognizable, then φ^{-1}(K) is a recognizable subset of A^{<ω₁}.
3) K ⊆ A^{<ω₁} is recognizable if and only if it is rational.
4) A rational relation R is recognizable if and only if there exist an integer p and rational sets X₁, …, X_p ∈ Rat(A^{<ω₁}) and Y₁, …, Y_p ∈ Rat(B^{<ω₁}) such that R = ⋃_{1≤i≤p} X_i × Y_i.

5 Uniformization

Uniformizing a rational relation in Rat(A^{<ω₁} × B^{<ω₁}) consists in finding a function whose graph is a rational relation and whose domain coincides with that of the given relation. In the finite case, Eilenberg proved that this can be achieved as follows:

– For relations recognizable by synchronous automata, just choose for each u ∈ A* in the domain of the relation the minimal v ∈ B* that is associated with u (relative to the length-lexicographic ordering induced by some prescribed order on the alphabet B).

– The passage from synchronous to general rational relations uses Eilenberg's second factorization theorem.
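For a finite relation Eilenberg's recipe is a few lines; a sketch (ours, not from the paper) uniformizing a relation given as a set of pairs by the length-lexicographic minimum with a prescribed order on B:

```python
def length_lex_key(s, order):
    """Length-lexicographic key: compare by length first, then letter
    by letter according to the prescribed order on the alphabet B."""
    return (len(s), [order.index(c) for c in s])

def uniformize(R, order):
    """For each u in the domain, keep the length-lex minimal v with
    (u, v) in R; return the graph of the resulting function."""
    best = {}
    for u, v in R:
        if u not in best or length_lex_key(v, order) < length_lex_key(best[u], order):
            best[u] = v
    return set(best.items())

R = {("x", "ba"), ("x", "ab"), ("x", "b"), ("y", "a")}
print(sorted(uniformize(R, "ab")))  # [('x', 'b'), ('y', 'a')]
```

The shortest candidate wins regardless of letters, which is exactly why this ordering is well-founded on finite words and fails on infinite ones, as the text goes on to explain.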

However this does not carry over to infinite strings, let alone to transfinite strings. The reason is that the lexicographic order is no longer well-founded on infinite strings of any fixed length. As in [10], we will use a "greedy ordering" on the runs on a given input in order to "rationally" choose a second component associated with a given input.

5.1 The greedy ordering on Choueka-continuous sequences

Recall that runs on inputs of length α (α < ω^{n+1}) are sequences in ([Q]^{≤t})^{α+1} (where t is the type of α) which are Choueka-continuous over Q (cf. Definition 8).

We fix the finite set Q and some finite total orderings <⁰, <¹, …, <ⁿ on Q and its successive power sets [Q]¹, …, [Q]ⁿ. The purpose of this subsection is to define a total ordering on the set of all Choueka-continuous sequences of fixed length α + 1 < ω^{n+1} such that every set of runs associated with a given input possesses a minimal element.

We now detail the inductive construction of an operation which associates to every ordinal α < ω^{n+1} a total ordering (which we shall call "greedy") on the set Choueka(Q, α + 1) of Choueka-continuous sequences over Q of length α + 1. The definition is first given for ordinals of the form ω^i + 1 and proceeds by induction.

INITIAL CASE: α + 1 = 1. Choueka(Q, α + 1) is just Q and we let ≼₁^{greedy} be <⁰.

INDUCTIVE STEP: FROM 1 TO ω + 1 AND FROM ω^i + 1 TO ω^{i+1} + 1 (i > 0). Let i ≥ 0. To any (ω^{i+1} + 1)-Choueka-continuous sequence ξ we associate a sequence φ(ξ) whose components are defined as follows.

The first component is the last element of the sequence ξ, i.e. U = ξ(ω^{i+1}), which lies in the set [Q]^{i+1}. When ξ varies, such elements can be compared via <^{i+1}.

The second component β₋₁ is the greatest ordinal β ∈ {ω^i·k | k < ω} such that ξ(β) ∉ U (recall that U is the set of values infinitely often repeated in the sequence (ξ(ω^i), ξ(ω^i·2), ξ(ω^i·3), …), so that such a greatest β does indeed exist).

Let U = {U₀, …, U_{m−1}}. For 0 ≤ j < m and 0 ≤ p < ω the component β_{mp+j} is the smallest ordinal β ∈ {ω^i·k | k < ω} such that β_{mp+j−1} < β_{mp+j} and ξ(β) = U_j. Such a β does indeed exist since U_j is infinitely often repeated in the sequence (ξ(ω^i), ξ(ω^i·2), ξ(ω^i·3), …).

For −1 ≤ k < ω, letting t be such that β_{k+1} = β_k + ω^i·t, the (1+2k)-th component ξ↾[β_k, β_{k+1}] is in Choueka(Q, ω^i·t + 1) if i > 0, or in Choueka(Q, t) (i.e. Q^t) if i = 0. When ξ varies among strings for which φ(ξ) has fixed components β₋₁, …, β_{k+1}, such (1+2k)-th components can be compared via the lexicographic t-power of ≼^{greedy}_{ω^i+1} if i > 0, or of ≼₁^{greedy} if i = 0.

The greedy ordering ≼^{greedy}_{ω^{i+1}+1} is defined from ≼^{greedy}_{ω^i+1} (in case i > 0) or from ≼₁^{greedy}, i.e. <⁰ (in case i = 0), as follows. To compare two different (ω^{i+1} + 1)-Choueka-continuous sequences η, ξ, we consider the sequences φ(ξ) and φ(η), look at the first component on which they differ and compare ξ, η according to these components. Though the (1+2k)-th component of φ(ξ) lies in a set depending on ξ, such a comparison really makes sense. In fact, if η, ξ cannot be compared via their first 2k components, then their (1+2k)-th components lie in the very same set Choueka(Q, γ + 1) (where γ is of the form ω^i·t for some t ≥ 0).

Clearly, ≼^{greedy}_{ω^{i+1}+1} is a total ordering on Choueka(Q, ω^{i+1} + 1).

INDUCTIVE STEP: THE GREEDY ORDERING ON Choueka(Q, α + 1).

Let α = ω^{i₁} + ω^{i₂} + … + ω^{i_m} where ω > i₁ ≥ i₂ ≥ … ≥ i_m ≥ 0 and 0 ≤ m < ω. For 0 < j ≤ m set α_j = ω^{i₁} + ω^{i₂} + … + ω^{i_j}.

To each (α + 1)-sequence ξ we associate the m-sequence

Θ(ξ) = (ξ↾[0, α₁], ξ↾[α₁, α₂], …, ξ↾[α_{m−1}, α_m])

of Choueka-continuous sequences of lengths ω^{i₁}, ω^{i₂}, …, ω^{i_m}.

We define the greedy ordering ≼^{greedy}_{α+1} on Choueka(Q, α + 1) as follows. To compare two different (α + 1)-Choueka-continuous sequences η, ξ, we consider the m-tuples Θ(η) and Θ(ξ), we look at the first component on which they differ, say it has rank j, and compare ξ, η according to the greedy ordering ≼^{greedy}_{ω^{i_j}+1} on this component.


5.2 Two properties of the greedy ordering

We first prove that for every n < ω the union of the greedy orderings ≼^{greedy}_{α+1} for α < ω^{n+1} (which is a partial ordering comparing strings having the same length < ω^{n+1}) is synchronous rational.

LEMMA 30. There exists a formula Φ(η, ξ) of second order monadic logic (in the language of order) which uniformly defines the relation

{(η, ξ) | ∃α < ω^{n+1} (η, ξ ∈ Choueka(Q, α + 1) and η ≼^{greedy}_{α+1} ξ)}

in any structure (δ, <) where δ ≥ ω^{n+1} (convention: as usual, strings over an alphabet with m letters are interpreted as t-tuples of (bounded) subsets of δ with t = ⌈log(m)⌉). In particular, this relation is rational and even synchronous rational, i.e. recognized by an automaton which reads its tapes in a synchronous way.

Proof. 1) First, we express types of ordinals (cf. 2.1) in second order monadic logic. Let Lim(α) ≡ ∀β < α ∃γ (β < γ < α) assert that α is a limit ordinal. Then

  Type_0(α) ≡ ¬Lim(α)        Type_{≥1}(α) ≡ Lim(α)

and more generally for n ≥ 1:

  Type_{≥n}(α) ≡ ∀β < α ∃γ (Type_{≥n−1}(γ) ∧ β < γ < α)

We may express the exact type by

  Type_n(α) ≡ Type_{≥n}(α) ∧ ¬Type_{≥n+1}(α)

If β < γ < ω^{n+1} then by relativizing in the above formulas all quantifiers to the interval [β,γ[, we can express the type of this interval.

2) Thus, the relations

  i ≤ n and γ = β + ω^i
  i ≤ n and ∃k < ω (γ = β + ω^i·k)
  i ≤ n and u = (ξ(ω^i), ξ(ω^i·2), ξ(ω^i·3), …)
  i ≤ n and u = (ξ↾[0,ω^i], ξ↾[ω^i,ω^i·2], …, ξ↾[ω^i·k,ω^i·(k+1)], …)

are also expressible.

3) The prefix and lexicographic orderings on transfinite sequences on some finite ordered alphabet are easy to express by second order formulas. If ≺ is an expressible ordering over strings in Choueka(Q, ω^i + 1) then so is the relation


β, γ ∈ {ω^i·k | k < ω} and ξ↾[β,γ] ≺_{[β,γ]-lex} η↾[β,γ], where ≺_{[β,γ]-lex} is the lexicographic extension of ≺ to strings indexed in [β, γ]. Consequently, for i ≤ n, the greedy orderings ≺^{ω^i+1}_greedy are expressible.

4) In order to express the greedy ordering ≺^{α+1}_greedy it remains to deal with the decomposition sequences of ordinals α < ω^{n+1}, namely

  α = ω^{i₁} + ω^{i₂} + … + ω^{i_m}

where ω > i₁ ≥ i₂ ≥ … ≥ i_m ≥ 0 and 0 ≤ m < ω. For β < γ let us say that [β,γ] is a block if it has order type ω^i for some i ≤ n and the type of β is at least i. Clearly, the blocks of α are the pieces of the decomposition sequence of α. Now, the relation

  [β,γ] is a block of α and u = ξ↾[β,γ]

is easy to express via types. □
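For ordinals below ω^ω this decomposition is just an unfolding of the Cantor normal form. A minimal illustrative sketch (the little-endian coefficient-list encoding of α is our own, purely for illustration):

```python
def decomposition_sequence(coeffs):
    """Unfold the Cantor normal form of an ordinal α < ω^ω into its
    decomposition sequence (i₁, i₂, …, i_m) with i₁ ≥ i₂ ≥ … ≥ i_m ≥ 0,
    so that α = ω^{i₁} + ω^{i₂} + … + ω^{i_m}.
    The encoding is little-endian: coeffs[i] is the coefficient of ω^i."""
    exps = []
    for i in range(len(coeffs) - 1, -1, -1):  # largest exponent first
        exps.extend([i] * coeffs[i])
    return exps

# α = ω²·2 + ω + 3 decomposes as ω² + ω² + ω + 1 + 1 + 1:
print(decomposition_sequence([3, 1, 2]))  # → [2, 2, 1, 0, 0, 0]
```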

Let α < ω^{n+1}. We consider the compact product topology on the product set ∏_{β≤α} [Q]^{τ(β)} (where τ(β) denotes the type of β, cf. §2.1). Since this set contains Choueka(Q, α + 1), it induces a topology on this last set (Caution: the induced topology is not compact!).

LEMMA 31. Every closed non-empty subset of Choueka(Q, α + 1) has a smallest element for the ≺^{α+1}_greedy ordering.

Proof. The way ≺^{α+1}_greedy is defined from the ≺^{ω^i+1}_greedy orderings makes it clear that we can reduce to the case where α = 0 or α = ω^i. The first case is trivial. For the second case we argue by induction on i, following the construction detailed in paragraph 5.1 of ≺^{ω^{i+1}+1}_greedy from ≺^{ω^i+1}_greedy in case i > 0, or from ≺_greedy = ≤₀ in case i = 0.

Let F be a non-empty closed subset of Choueka(Q, ω^{i+1} + 1). We want to minimize the sequence

  φ(ξ) = (U, β₋₁, ξ↾[0,β₋₁], β₀, ξ↾[β₋₁,β₀], β₁, ξ↾[β₀,β₁], …)

where ξ varies in F. Let V be the smallest value of the first component U = ξ(ω^{i+1}) of φ(ξ) when ξ varies in F. Restrict F to the subset F′ of strings ξ ∈ F with first component V. Clearly, F′ is still closed and non-empty.

The following argument is a variation of the one in our paper [10] p. 68 about uniformization. We inductively define a sequence of integers


(k₁, k₂, k₃, …) and a sequence of strings (ξ₁, ξ₂, ξ₃, …) as follows.

Let λ₋₁ be the smallest value of the second component β₋₁ of φ(ξ) when ξ varies in F′. We let k₋₁ be such that λ₋₁ = ω^i·k₋₁ and we define ξ₁ as the smallest value of ξ↾[0, ω^i·k₋₁] (with respect to the lexicographic k₋₁-power of ≺^{ω^i+1}_greedy) when ξ varies over sequences in F′ such that β₋₁ = λ₋₁.

For t ≥ 0 let λ_t be the smallest value of the component β_t of φ(ξ) for ξ ∈ F′ extending the concatenation of the strings ξ₁, …, ξ_{t−1}. We let k_t be such that λ_t = ω^i·k_t and we define ξ_t as the smallest value of ξ↾[ω^i·k_{t−1}, ω^i·k_t] (with respect to the lexicographic (k_t − k_{t−1})-power of ≺^{ω^i+1}_greedy) for ξ ∈ F′ extending the concatenation of the strings ξ₁, …, ξ_{t−1} and such that β_t = λ_t.

Let ξ be the concatenation of the sequence of strings ξ₁, ξ₂, ξ₃, … and of the element V. It is clear that ξ has length ω^{i+1} + 1 and that ξ(ω^i·k) ∈ V whenever k > k₋₁. Also, each element U_j ∈ V appears infinitely often in the sequence (ξ(ω^i), ξ(ω^i·2), …) since it appears in every string ξ_{mp+j} for p ≥ 0. This gives the Choueka-continuity condition for level ω^{i+1}. As for levels ω, …, ω^i, the Choueka-continuity conditions are inherited from the ξ_t's. Thus, ξ ∈ Choueka(Q, ω^{i+1} + 1). By its very construction ξ is the limit of strings in F, hence is in F. Also, by definition of the ≺^{ω^{i+1}+1}_greedy ordering, it is the smallest string in F. □


5.3 Uniformization of relations with domain bounded below ω^ω

We first state a simple lemma.

LEMMA 32. The set of accepting runs of a Choueka automaton on an input with length α < ω^ω is a closed subset of Choueka(Q, α + 1).

Proof. Suppose ξ ∈ Choueka(Q, α + 1) is the pointwise limit of runs r₀, r₁, … on input u. Then for every a < |u| there exists t such that ξ(a) = r_t(a) and ξ(a + 1) = r_t(a + 1). Since r_t is a run, the triple (ξ(a), u_a, ξ(a + 1)) is in the transition relation. Thus, ξ satisfies the initial and successor conditions for runs. Also, there exists t such that ξ(|u|) = r_t(|u|). Since r_t is a run, ξ(|u|) is a final state. Thus, ξ satisfies the final condition for runs.


Finally, since ξ is Choueka-continuous, it also satisfies the limit condition for runs. Thus, ξ is indeed an accepting run. □

THEOREM 33. Every rational transfinite relation R ⊆ A^{<ω^ω} × B^{<ω₁} can be uniformized by a rational relation.

Proof. Consider a Choueka transducer defining R. Without loss of generality, one can suppose that for each transition (q, X, r) the output X ∈ Rat(B^{<ω₁}) is uniquely determined by the state r. For each such non-empty X choose a witness string v_X ∈ X (there are only finitely many choices since the X's involved in transitions are finitely many). According to the previous lemma, given an input u ∈ A^α (where necessarily α < ω^{n+1} for some n < ω), the set of accepting runs on u is a closed subset of Choueka(Q, α + 1). Arguing in second order monadic logic, Lemmas 32, 31 and 30 allow us to definably, hence rationally, associate to each input u a uniquely determined run. Now, from the run we rationally pass to the transfinite sequence of output rational sets and also to the final output set. From this sequence and set we rationally get the transfinite sequence of witness strings and the final string, the concatenation of which gives a unique v such that (u, v) ∈ R. □

5.4 From ω^ω on

The following two technical properties are elementary; they can be found in [24] and [11] respectively.

LEMMA 34. Let 0 < k < ω. If α_i < ω^k for all i < ω then Σ_{i<ω} α_i ≤ ω^k.

LEMMA 35. For any two transfinite strings x, y, the equality y = xy holds if and only if x^ω is a prefix of y.

Given a limit ordinal λ < ω^ω we denote by W_λ the set of strings w ∈ {0,1}^λ such that for all non-trivial factorizations w = w₁w₂, the prefix w₁ contains finitely many occurrences of the symbol 1 and w₂ contains at least one occurrence of 1. This is equivalent to saying that the positions of the occurrences of 1 define an ω-sequence which is cofinal with λ. Consider the relation

CofinalSeq = {(0^λ, w) | λ is a limit ordinal < ω^ω and w ∈ W_λ}


It is easy to see that CofinalSeq = Rel(A) ∩ ({0}^{<ω^ω} × {0,1}^{<ω^ω}) where Rel(A) is the relation recognized by the following deterministic synchronous 2-tape automaton (where the final state (1) takes into account the case λ = μ + ω and w has a suffix 1^ω).

[Figure: the deterministic synchronous 2-tape automaton A, with transitions labelled 0/0 and 0/1.]

Extending Def. 16 to relations (and observing that Thm. 5 goes through for relations), let us write Rat(A^{<α} × B^{<α}) in place of {R ∩ (A^{<α} × B^{<α}) | R is Büchi recognizable}.

PROPOSITION 36. The relation CofinalSeq, which is in Rat(A^{<ω^ω} × B^{<ω^ω}), where A = B = {0,1}, is not uniformizable.

Proof. Let R be a rational relation on the alphabet {0,1} which, for every i ∈ ω, accepts some (0^{ω^i}, u_i) ∈ CofinalSeq. The proof consists in showing that for some integer i and some string v ≠ u_i both (0^{ω^i}, u_i) and (0^{ω^i}, v) belong to R. We assume the relation is recognized by some Büchi automaton to which any run refers, and we denote by π₁

and π₂ the projections of the labels of a run onto A^{<ω^ω} and B^{<ω^ω} respectively (the "first" and the "second" component). For all pairs (x, y) ∈ A^{<ω^ω} × B^{<ω^ω} let us denote by M(x,y) the Boolean Q × Q-matrix whose (q,r)-entry is 1 if and only if there is a run labelled by (x,y) which leads from state q to state r.
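The key property the proof exploits is that these matrices compose multiplicatively over the Boolean semiring, so only finitely many of them can occur along any factorization, which is what the pigeonhole steps below rely on. A sketch for finite synchronous labels (the two-state transition relation `delta` is a made-up toy, not the automaton A above):

```python
def bool_mat_mul(M, N):
    """Product of Boolean matrices over the semiring ({0,1}, or, and):
    (M·N)[q][r] = 1 iff M[q][s] = N[s][r] = 1 for some state s."""
    n = len(M)
    return [[int(any(M[q][s] and N[s][r] for s in range(n)))
             for r in range(n)] for q in range(n)]

# Toy synchronous 2-tape transition relation on states {0, 1} (our own
# made-up example): delta maps (state, (input letter, output letter))
# to the set of successor states.
delta = {(0, ('0', '0')): {0}, (0, ('0', '1')): {1}, (1, ('0', '1')): {1}}

def matrix_of_pair(x, y):
    """Boolean matrix M(x, y) for two equal-length finite strings: the
    (q, r)-entry is 1 iff some run labelled by (x, y) leads from q to r."""
    M = [[int(q == r) for r in range(2)] for q in range(2)]  # identity
    for pair in zip(x, y):
        step = [[int(r in delta.get((q, pair), set())) for r in range(2)]
                for q in range(2)]
        M = bool_mat_mul(M, step)
    return M

# Multiplicativity: M(xx', yy') = M(x, y) · M(x', y')
assert matrix_of_pair('000', '011') == bool_mat_mul(
    matrix_of_pair('00', '01'), matrix_of_pair('0', '1'))
```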

Some familiarity with run factorizations is necessary to understand the technique used in the proof. Given a run ρ in a 2-automaton and its label (ξ, η), it is not the case that each factorization of the label is the label of a factorization of the run, which is a big departure from the 1-automaton case. What can be guaranteed is the following, which could


be called a "run factorization driven by an output factorization". Let α be an ordinal and let η = ∏_{r<α} η_r be a factorization of η. Then there exists a (not necessarily unique) factorization of ρ into ρ = ∏_{r<α} ρ_r such that π₂(ρ_r) = η_r for all r < α. Apply this observation to some (0^{ω^i}, u) ∈ CofinalSeq. Since the length |u| is ω^i, the output label can be factored as u = ∏_{r<ω} 0^{t_r} 1. Lemma 34 shows in particular (and this will be used heavily below) that for some r < ω, t_r ≥ ω^{i−1} holds. The idea of the proof is to substitute some factor ρ′_r for the r-th factor ρ_r in the run driven by the output factorization, in such a way as to keep the input label and modify the output label (actually what we do is slightly different but the basic idea is there). This method is akin to that used in the famous "pumping lemma", but it is more elaborate in the sense that obtaining the factor that we plug in requires considering not a single run but rather an infinite collection of runs.

Let us now be more technical. For all i < ω, fix a successful run ρ_i labelled by (0^{ω^i}, u_i). We consider the following predicate:

the image by π₁ of some proper prefix of ρ_i equals 0^{ω^i}    (6)

(Observe, as an example, that no run in the above synchronous automaton recognizing CofinalSeq satisfies the predicate.) Case 1: there exist infinitely many runs ρ_i satisfying predicate (6). For every such run we choose an arbitrary factorization ρ_i = h_i b_i t_i ("head", "body" and "tail") such that π₁(h_i) = 0^{ω^i} and π₂(b_i) = 0^{ω^{i−1}} (the existence of such a b_i is guaranteed by Lemma 34). Since Q is finite there exist two integers k < j such that ρ_k and ρ_j satisfy predicate (6) and M(label(b_k)) = M(label(b_j)) holds. The two runs h_j b_j t_j and h_j b_k t_j are successful and we claim that they have the same input labels and two different output labels. Indeed, since |π₁(b_j)| = |π₁(b_k)| = |π₁(t_j)| = |π₁(t_k)| = 0 holds, we clearly have |π₁(h_j b_j t_j)| = |π₁(h_j b_k t_j)| and therefore π₁(h_j b_j t_j) = π₁(h_j b_k t_j). Concerning the output labels, the condition π₂(h_j b_j t_j) = π₂(h_j b_k t_j) would yield, after cancelling out the common prefix π₂(h_j b_k), the equality 0^{ω^{j−1}} π₂(t_j) = π₂(t_j), i.e. π₂(t_j) = 0^{ω^{j−1}} π₂(t_j). This implies, by Lemma 35, that π₂(t_j) has a prefix equal to 0^{ω^j}, a contradiction to the fact that letter 1 occurs


cofinally in u_j. Case 2: for all sufficiently large i, ρ_i does not satisfy predicate (6). For all sufficiently large i, choose a factorization ρ_i = h_i b_i t_i such that label(b_i) = (0^{r_i}, 0^{ω^{i−1}}) for some r_i < ω^i (this condition is guaranteed by Lemma 34). There exist two integers k, j such that i < k < j and M(label(b_k)) = M(label(b_j)) holds. Then the two runs ρ_j = h_j b_j t_j and h_j b_k t_j are successful and we claim that they have the same input labels and two different output labels. Indeed, |π₁(ρ_j)| = ω^j and |π₁(t_j)| ≠ 0 imply |π₁(t_j)| = ω^j and |π₁(h_j b_j)| < ω^j, thus |π₁(h_j b_k)| < ω^j and finally |π₁(h_j b_k t_j)| = ω^j, proving the equality of the two input labels (both equal to 0^{ω^j}). Concerning the output labels, the condition π₂(h_j b_j t_j) = π₂(h_j b_k t_j) would yield, by cancelling out the common prefix π₂(h_j) 0^{ω^{k−1}}, the equality π₂(t_j) = 0^{ω^{j−1}} π₂(t_j). By Lemma 35 this implies that π₂(t_j) has a prefix equal to 0^{ω^j}, a contradiction to the fact that letter 1 occurs cofinally in u_j. □

References

[1] N. Bedon. Finite automata and ordinals. Theoret. Comput. Sci., 156(1-2):119-144, 1996.

[2] N. Bedon. Automata, semigroups and recognizability of words on ordinals. Internat. J. Algebra Comput., 8(1):1-21, 1998.

[3] N. Bedon. Langages reconnaissables de mots indexés par des ordinaux. PhD thesis, Université de Marne-la-Vallée, 1998.

[4] N. Bedon and O. Carton. An Eilenberg theorem for words on countable ordinals. In Proceedings of Latin'98: Theoretical Informatics, number 1380 in LNCS, pages 53-64. Springer-Verlag, 1998.

[5] B. Bérard and C. Picaronny. Accepting Zeno words without making time stand still. In Proceedings of the 22nd Symp. Math. Found. Comp. Sci. (MFCS'97, Bratislava, Slovakia), number 1295 in LNCS, pages 149-158. Springer-Verlag, 1997.

[6] J. Berstel. Transductions and context-free languages. B. G. Teubner, 1979.

[7] A. Bès. Decision problems related to the elementary theory of ordinal multiplication. 1999. Submitted to Fundamenta Mathematicae.

[8] J. Büchi. Decision methods in the theory of ordinals. Bull. AMS, 71:767-770, 1965.

[9] J. Büchi and D. Siefkes. The monadic theory of ω₁. In Decidable Theories II, number 328 in Lecture Notes in Mathematics. Springer-Verlag, 1973.

[10] C. Choffrut and S. Grigorieff. Uniformization of rational relations. In J. Karhumäki, H. Maurer, G. Păun and G. Rozenberg, editors, Jewels are Forever, pages 59-71, 1999.

[11] C. Choffrut and S. Horváth. Transfinite equations in transfinite strings. Internat. J. Algebra Comput., 10(5):625-649, 2000.

[12] C. Choffrut and J. Karhumäki. Combinatorics of words. In G. Rozenberg and A. Salomaa, editors, Handbook of Formal Languages, volume 1, pages 329-438. World Scientific, 1997.

[13] Y. Choueka. Finite automata, definable sets and regular expressions over ω^n-tapes. J. Comput. System Sci., 17:81-97, 1978.

[14] S. Eilenberg. Automata, languages and machines, volume A. Academic Press, 1974.

[15] C. C. Elgot and J. E. Mezei. On relations defined by finite automata. IBM Journal, 10:47-68, 1965.

[16] F. Gire and M. Nivat. Relations rationnelles infinitaires. Calcolo, XXI:91-125, 1984.

[17] M. R. Hansen, P. K. Pandya, and C. Zhou. Finite divergence. Theoret. Comput. Sci., 138:113-139, 1995.

[18] H. Johnson. Rational equivalence relations. Theoret. Comput. Sci., 47:39-60, 1986.

[19] D. Klaua. Konstruktion ganzer, rationaler und reeller Ordinalzahlen und die diskontinuierliche Struktur der transfiniten reellen Zahlenräume. Akademie-Verlag, Berlin (DDR), 1961.

[20] D. Klaua. Allgemeine Mengenlehre. Akademie-Verlag, 1969.

[21] M. Lothaire. Combinatorics on Words, volume 17 of Encyclopedia of Mathematics and its Applications. Addison-Wesley, Gian-Carlo Rota edition, 1983.

[22] M. Nivat. Transductions des langages de Chomsky. Ann. Inst. Fourier, 18:339-456, 1968.

[23] J. G. Rosenstein. Linear Orderings. Academic Press, New York, 1982.

[24] W. Sierpiński. Cardinal and Ordinal Numbers. Warsaw: PWN, 1958.

[25] T. Wilke. An Eilenberg theorem for ∞-languages. In Proc. of 18th ICALP Conference, number 510 in LNCS, pages 588-599. Springer-Verlag, 1991.

[26] T. Wilke. An algebraic theory for regular languages of finite and infinite words. Internat. J. Algebra Comput., 3:447-489, 1993.

[27] J. Wojciechowski. Finite automata on transfinite sequences and regular expressions. Fundamenta Informaticae, 8(3-4):379-396, 1985.


Networks of Watson-Crick DOL systems

Erzsébet Csuhaj-Varjú*
Computer and Automation Research Institute
Hungarian Academy of Sciences
Kende u. 13-17
H-1111 Budapest
E-mail: [email protected]

Arto Salomaa
Turku Centre for Computer Science
Lemminkäisenkatu 14 A
FIN-20520 Turku
E-mail: [email protected]

Abstract

We introduce the notion of a network of Watson-Crick DOL systems, a distributed system of language determining devices motivated by Watson-Crick complementarity. In this paper we compare the behaviour of particular variants of these networks using different protocols for communication, we deal with the growth of the number of strings at the nodes under functioning, and we make observations concerning the decidability of the existence of a so-called black hole in the network.

1 Introduction

Watson-Crick complementarity is a fundamental concept in DNA computing. A notion, called the Watson-Crick DOL system, where the paradigm of complementarity is considered in the operational sense, was introduced and proposed for further investigations in [6].

*Research supported in part by Hungarian Scientific Research Fund "OTKA" Grant no. T 029615.


A Watson-Crick DOL system (a WDOL system, for short) is a DOL system over a so-called DNA-like alphabet Σ together with a mapping φ, called the mapping defining the trigger for complementarity transition. In a DNA-like alphabet each letter has a complementary letter and this relation is symmetric. φ is a mapping from the set of strings over the DNA-like alphabet Σ to {0,1} with the following property: the φ-value of the axiom is 0 and whenever the φ-value of a string is 1, then the φ-value of its complementary word must be 0. (The complementary word of a string is obtained by replacing each letter in the string with its complementary letter.) The derivation in a Watson-Crick DOL system is as follows: when the new string has been computed by applying the homomorphism of the DOL system, it is checked according to the trigger. If the φ-value of the obtained string is 0 (the string is a correct word), then the derivation continues in the usual manner. If the obtained string is an incorrect one, that is, its φ-value is equal to 1, then the string is changed for its complementary word and the derivation continues from this string.
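A derivation step of this kind is easy to make concrete. A minimal sketch, with barred letters rendered in upper case; the alphabet, morphism and majority-style trigger below are toy choices of ours, not taken from the paper:

```python
def complement(word, bar):
    """Replace each letter by its complementary letter; `bar` is an
    involutive dict pairing complementary letters."""
    return ''.join(bar[c] for c in word)

def wdol_step(word, g, phi, bar):
    """One WDOL derivation step: apply the morphism g letter by letter,
    then turn to the complementary word if the trigger phi rejects the
    result (phi = 1 means "incorrect")."""
    new = ''.join(g[c] for c in word)
    return complement(new, bar) if phi(new) == 1 else new

# Toy DNA-like alphabet {a, b, A, B}, where A and B play the barred letters.
bar = {'a': 'A', 'A': 'a', 'b': 'B', 'B': 'b'}
# Trigger: incorrect iff strictly more barred (upper-case) letters.
phi = lambda w: int(sum(c.isupper() for c in w) > sum(c.islower() for c in w))
g = {'a': 'AB', 'b': 'b', 'A': 'a', 'B': 'b'}

w = 'a'
for _ in range(3):
    w = wdol_step(w, g, phi, bar)
print(w)  # → ABbb (steps 1 and 2 triggered a complementarity transition)
```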

The idea behind the concept is the following: in the course of the computation or development things can go wrong to such an extent that it is worthwhile to continue with the complementary string, which is always available. This argument is general and does not necessarily refer to biology. Watson-Crick complementarity is viewed as an operation: together with or instead of a word w we consider its complementary word h_w(w).

These systems raise many interesting questions to study. An important problem is the stability of the system, that is, whether in the course of the computation a turn to the complement will take place or not. Another intriguing question is whether there are Watson-Crick DOL systems determining string sequences with properties different from the properties of string sequences of ordinary DOL systems. The first problem, in the case of so-called standard WDOL systems, has been shown to be algorithmically equivalent to a famous problem with open status, called Z_pos [7]. The second question is answered in [6] by exhibiting a Watson-Crick DOL system with a growth function that is not Z-rational.

In this paper we take a step further: we introduce networks of Watson-Crick DOL systems and study their behaviour. Networks are in the focus of interest in many areas of current computer science, from distributed architectures to nature-motivated computing. Our notion is a particular variant of a general paradigm, called networks of language processors, introduced in [2] and discussed in detail in [1].

A network of Watson-Crick DOL systems is a finite collection of Watson-Crick DOL systems over the same DNA-like alphabet and with the same trigger. These WDOL systems act on their own strings in a synchronized


manner and after each derivation step communicate some of the obtained words to each other. The condition for communication is determined by the trigger for complementarity transition. In this paper we study two variants of communication protocols: in the case of protocol (a), after performing a derivation step, the node keeps every obtained correct word and the complementary word of each obtained incorrect word (each corrected word) and sends a copy of each corrected word to every other node. In the case of protocol (b), as in the previous case, the node keeps all the correct words and the corrected ones (the complementary strings of the incorrect strings) but communicates a copy of each correct string to each other node. The two protocols realize different strategies: in the first case, if some error is detected, it is corrected but a note is sent about this fact to the others. In the second case, the nodes inform the other nodes about their correct strings and keep all information which refers to the correction of some error.

In this paper we discuss some basic properties of these systems. We compare the two protocols: namely, we show that in the case of a particular variant, networks of Watson-Crick DOL systems with protocol (a) determine the same sequence of string set collections at the nodes under computation as networks of Watson-Crick DOL systems working with protocol (b), and conversely. We describe the string set collections at the nodes under computation: we show that for both protocols there are networks of Watson-Crick DOL systems where the number of strings at the nodes cannot be calculated by any Z-rational function. We also make observations on the connection between the existence of an algorithm for deciding whether or not the network has a black hole (a node which never communicates any string) and the existence of an algorithm for solving the famous problem Z_pos.

2 Preliminaries and basic notions

Throughout the paper we assume that the reader is familiar with the basics of formal language theory. For further details and unexplained notions consult [4], [5] and [3].

The set of nonempty words over an alphabet Σ is denoted by Σ⁺; if the empty string λ is included, then we use the notation Σ*. A set of strings L ⊆ Σ* is said to be a language over Σ.

For a string w ∈ L, L ⊆ Σ*, we denote the length of w by |w|, and for a set of symbols U ⊆ Σ we denote by |w|_U the number of occurrences of letters of U in w.

For a finite language L, we denote the number of strings in L by card(L).

By a DOL system we mean a triple H = (Σ, g, w₀), where Σ is an alphabet, g is an endomorphism defined on Σ* and w₀ ∈ Σ* is the axiom. The word



sequence S(H) of H is defined as the sequence of words w₀, w₁, w₂, … where w_{i+1} = g(w_i) for i ≥ 0.

In the following we recall the basic notions concerning Watson-Crick DOL systems, introduced in [6] and studied in [8], [7] and [9].

By a DNA-like alphabet Σ we mean an alphabet with 2n letters, n ≥ 1, of the form Σ = {a₁, …, a_n, ā₁, …, ā_n}. Letters a_i and ā_i, 1 ≤ i ≤ n, are said to be complementary letters; we also call the non-barred symbols purines and the barred symbols pyrimidines. The terminology originates from the basic DNA alphabet {A, G, C, T}, where the letters A and G stand for purines and their complementary letters T and C for pyrimidines.

We denote by h_w the letter-to-letter endomorphism of a DNA-like alphabet Σ mapping each letter to its complementary letter. h_w is also called the Watson-Crick morphism.

A Watson-Crick DOL system (a WDOL system, for short) is a pair W = (H, φ), where H = (Σ, g, w₀) is a DOL system with a DNA-like alphabet Σ, morphism g and axiom w₀ ∈ Σ⁺, and φ : Σ* → {0,1} is a mapping such that φ(w₀) = φ(λ) = 0 and for every word u ∈ Σ* with φ(u) = 1 it holds that φ(h_w(u)) = 0.

The word sequence S(W) of a Watson-Crick DOL system W consists of the words w₀, w₁, w₂, …, where for each i ≥ 0

  w_{i+1} = g(w_i)        if φ(g(w_i)) = 0,
  w_{i+1} = h_w(g(w_i))   if φ(g(w_i)) = 1.

We can also say that w_i directly derives w_{i+1} in W, i ≥ 0, and we can use the notation w_i ⟹ w_{i+1}. The sequence of derivation steps w_i ⟹ w_{i+1}, i ≥ 0, is said to be the computation in W.

The condition φ(u) = 1 is said to be the trigger for complementarity transition. The intuitive meaning of the trigger is the following: if in the course of the computation a word u occurs such that φ(u) = 1, then u is considered as an "incorrect" word and it is replaced by its complementary word h_w(u). Moreover, φ is defined in such a way that h_w(u) has to be a "correct" word. Notice, however, that φ(w) = 0, w ∈ Σ*, does not imply φ(h_w(w)) = 1, that is, the complementary word of a "correct" word can be either "correct" or "incorrect."

In the following we shall also use this terminology: a word w ∈ Σ* is called correct according to φ if φ(w) = 0 and it is called incorrect otherwise. If it is clear from the context, then we may omit the reference to φ.

Obviously, various mappings φ can satisfy the conditions defining a trigger for complementarity transition. In the following we shall use a particular variant and we call the corresponding Watson-Crick DOL system standard. In this case a word w satisfies the trigger for turning to the complementary word (it is incorrect) if it has more occurrences of pyrimidines (barred letters) than purines (non-barred letters). Formally, consider a DNA-like alphabet Σ = {a₁, …, a_n, ā₁, …, ā_n}, n ≥ 1. Let Σ_PUR = {a₁, …, a_n} and Σ_PYR = {ā₁, …, ā_n}. Then we define φ : Σ* → {0,1} as follows: for w ∈ Σ*

  φ(w) = 1 if |w|_{Σ_PYR} > |w|_{Σ_PUR}, and φ(w) = 0 otherwise.

Standard systems have turned out to be very natural from the point of view of DOL systems.

An important notion concerning Watson-Crick DOL systems is the Watson-Crick road, introduced and discussed in detail in [9].

Let W = (H, φ) be a Watson-Crick DOL system, where H = (Σ, g, w₀). The Watson-Crick road of W is an infinite binary word α over {0,1} such that the ith bit of α is equal to 1 if and only if at the ith step of the computation in W a transition to the complementary word takes place, that is, φ(g(w_{i−1})) = 1, i ≥ 1, where w₀, w₁, w₂, … is the word sequence of W. In [9], among other things, it was shown that the Watson-Crick road of a standard Watson-Crick DOL system is not necessarily ultimately periodic.
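A finite prefix of the road can be computed alongside the word sequence. The sketch below uses a toy one-letter system of our own (upper case stands for the barred letter):

```python
def road_prefix(w0, g, phi, bar, steps):
    """First `steps` bits of the Watson-Crick road: bit i is 1 iff the
    ith derivation step performs a transition to the complementary word."""
    road, w = [], w0
    for _ in range(steps):
        new = ''.join(g[c] for c in w)
        if phi(new) == 1:
            road.append(1)
            w = ''.join(bar[c] for c in new)
        else:
            road.append(0)
            w = new
    return road

bar = {'a': 'A', 'A': 'a'}
phi = lambda w: int(sum(c.isupper() for c in w) > sum(c.islower() for c in w))
g = {'a': 'A', 'A': 'A'}  # every step produces the incorrect word 'A'
print(road_prefix('a', g, phi, bar, 4))  # → [1, 1, 1, 1]
```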

Properties of Watson-Crick DOL systems significantly differ from properties of DOL systems: for example, the class of growth functions of Watson-Crick DOL systems contains functions which are not Z-rational [6].

A sequence z(i) is said to be Z-rational if there is a square matrix M with integer entries such that for every i, i ≥ 1, z(i) equals the number in the upper right-hand corner of M^i.
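This definition can be checked mechanically for small cases; for instance, with the matrix below (our illustrative choice) the sequence read off the matrix powers is the Fibonacci sequence:

```python
def mat_mul(A, B):
    """Integer matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def z(M, i):
    """z(i) = the upper right-hand entry of M^i, i ≥ 1."""
    P = M
    for _ in range(i - 1):
        P = mat_mul(P, M)
    return P[0][-1]

M = [[1, 1], [1, 0]]
print([z(M, i) for i in range(1, 7)])  # → [1, 1, 2, 3, 5, 8]
```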

Finally, we recall a decision problem we shall refer to in the sequel: given a Z-rational sequence z(i), decide whether or not z(i) ≥ 0 holds for all i ≥ 0. The decidability status of this problem, called problem Z_pos, is open. For further details the reader can consult [10] and [5].

3 Networks of Watson-Crick DOL systems

A network of Watson-Crick DOL systems (an NWDOL system) is a finite collection of WDOL systems over the same DNA-like alphabet and with the same trigger for turning to the complement, where the component WDOL systems act in a synchronized manner by rewriting their own sets of strings in the WDOL manner and after each derivation step communicate some of the obtained words to each other. The condition for communication is determined by the trigger for complementarity transition.

Definition 3.1 By an N_rWDOL system (a network of Watson-Crick DOL systems) with r components or nodes, where r ≥ 1, we mean an (r + 2)-tuple

  Γ = (Σ, φ, (g₁, {A₁}), …, (g_r, {A_r})),

where

• Σ = {a₁, …, a_n, ā₁, …, ā_n}, n ≥ 1, is a DNA-like alphabet, the alphabet of the system,

• φ : Σ* → {0,1} is a mapping defining a trigger for complementarity transition, and

• (g_i, {A_i}), 1 ≤ i ≤ r, called the ith component or the ith node of Γ, is a pair where g_i is a DOL morphism over Σ and A_i is a correct nonempty word over Σ according to φ, called the axiom of the ith component.

If the number of components in the network is irrelevant, then we speak of an NWDOL system.

An NWDOL system is called standard if φ is defined in the same way as in the case of standard WDOL systems, namely, a word u ∈ Σ* is incorrect according to φ if it has more occurrences of pyrimidines (barred letters) than purines (non-barred letters), and it is correct otherwise.

Definition 3.2 For an N_rWDOL system Γ = (Σ, φ, (g₁, {A₁}), …, (g_r, {A_r})), r ≥ 1, the r-tuple (L₁, …, L_r), where L_i, 1 ≤ i ≤ r, is a finite set of correct strings over Σ according to φ, is called a state of Γ. L_i, 1 ≤ i ≤ r, is called the state of the ith component. ({A₁}, …, {A_r}) is said to be the initial state of Γ.

NWDOL systems change their states by direct derivation steps. A direct change of a state to another one means a rewriting step followed by communication according to the given protocol of the system. In the following we define two variants of communication protocols, representing different communication philosophies.

Definition 3.3 Let s₁ = (L₁, …, L_r) and s₂ = (L₁′, …, L_r′) be two states of an N_rWDOL system Γ = (Σ, φ, (g₁, {A₁}), …, (g_r, {A_r})), r ≥ 1. For 1 ≤ i ≤ r, let C_i denote the set of correct words and B_i the set of incorrect words in g_i(L_i), according to φ.

• We say that s₁ directly derives s₂ by protocol (a), written as s₁ ⟹_a s₂, if

  L_i′ = C_i ∪ ⋃_{1≤j≤r} h_w(B_j), 1 ≤ i ≤ r.

• We say that s₁ directly derives s₂ by protocol (b), written as s₁ ⟹_b s₂, if

  L_i′ = h_w(B_i) ∪ ⋃_{1≤j≤r} C_j, 1 ≤ i ≤ r.

Thus, in the case of both protocols, after applying a derivation step in the WDOL manner the node keeps the correct words and the corrected words (the complementary words of the incorrect ones); in the case of protocol (a) it sends a copy of every corrected word to each other node, while in the case of protocol (b) it communicates a copy of every correct word to each other node. The two protocols realize different communication strategies: in the first case the nodes inform each other about the correction of the detected errors, while in the second case the nodes inform each other about the obtained correct words.
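Both protocols can be summarized in one synchronized-step routine. A sketch of ours (barred letters are rendered in upper case; names such as `network_step` are not from the paper):

```python
def network_step(state, morphisms, phi, bar, protocol):
    """One synchronized step of an NWDOL system.
    state: one set of strings per node.  Protocol 'a': a node keeps its
    correct and corrected words and receives everybody's corrected words.
    Protocol 'b': a node keeps its corrected words and receives
    everybody's correct words."""
    comp = lambda w: ''.join(bar[c] for c in w)
    correct, corrected = [], []
    for g, L in zip(morphisms, state):
        new = {''.join(g[c] for c in w) for w in L}
        correct.append({w for w in new if phi(w) == 0})
        corrected.append({comp(w) for w in new if phi(w) == 1})
    all_corrected = set().union(*corrected)
    all_correct = set().union(*correct)
    if protocol == 'a':
        return [c | all_corrected for c in correct]
    return [b | all_correct for b in corrected]

# A two-node toy system with axioms 'a' and 'b' (upper case = barred):
bar = {'a': 'A', 'A': 'a', 'b': 'B', 'B': 'b'}
phi = lambda w: int(sum(c.isupper() for c in w) > sum(c.islower() for c in w))
gs = [{'a': 'A', 'A': 'a', 'b': 'b', 'B': 'b'},
      {'a': 'a', 'A': 'a', 'b': 'b', 'B': 'b'}]
print(network_step([{'a'}, {'b'}], gs, phi, bar, 'a'))
```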

Definition 3.4 Let Γ = (Σ, φ, (g₁, {A₁}), …, (g_r, {A_r})), r ≥ 1, be an N_rWDOL system with protocol (x), x ∈ {a, b}. The state sequence S(Γ) of Γ, S(Γ) = s(0), s(1), …, is defined as follows: s(0) = ({A₁}, …, {A_r}) and s(t) ⟹_x s(t + 1) for t ≥ 0.

We define two further important notions, both concerning communication in the network.

Definition 3.5 Let Γ = (Σ, φ, (g₁, {A₁}), …, (g_r, {A_r})), r ≥ 1, be an N_rWDOL system with protocol (x), x ∈ {a, b}, and let s(t) = (L₁(t), …, L_r(t)) be the state of Γ at derivation step t, where t ≥ 0.

We say that the ith component of the system is a black hole, 1 ≤ i ≤ r, if for every t ≥ 0

1. if x = a then g_i(L_i(t)) consists of correct words over Σ according to φ, and

2. if x = b then g_i(L_i(t)) consists of incorrect words over Σ according to φ.

Thus, for networks with more than one component, those components which never emit any string in the course of the derivation are called black holes of the network. If there is only one component in the network, the situation is the following: for protocol (a), the node is a black hole if no complementarity transition takes place at any computation step, and for protocol (b), the node is a black hole if a complementarity transition takes place at every derivation step.

The following notion, the Watson-Crick road of the system (or that of a component in the network), indicates whether communication takes place or not at a particular derivation step.

Definition 3.6 Let Γ = (Σ, φ, (g₁, {A₁}), …, (g_r, {A_r})), r ≥ 1, be an N_rWD0L system with protocol (x), x ∈ {a, b}. Let S(Γ) = s(0), s(1), s(2), … be the state sequence of Γ.

The Watson-Crick road of Γ is an infinite word α over the binary alphabet {0, 1}, where the ith bit of α is equal to 1 if and only if under derivation step s(i−1) ⇒_x s(i), i ≥ 1, the following holds: for r > 1, there is at least one component of Γ which communicates the copies (the copy) of at least one string to the other components (component); for r = 1, a complementarity transition takes place at the single component of the network if x = a, and no complementarity transition takes place at the component if x = b.

The Watson-Crick road of a component j, 1 ≤ j ≤ r, r > 1, in the network is an infinite word α over the binary alphabet {0, 1}, where the ith bit of α is equal to 1 if and only if under derivation step s(i−1) ⇒_x s(i) the jth component communicates the copies (the copy) of at least one string to the other components (component).

Notice that the Watson-Crick road of a component in the network can differ from the Watson-Crick road of the same component considered as a single Watson-Crick D0L system with its axiom.

Example 1 Let us consider a standard NWD0L system with two nodes over the alphabet Σ = {a₁, a₂, ā₁, ā₂}, where the morphisms g₁ and g₂ are defined as follows: g₁(a₁) = ā₁, g₁(a₂) = a₂, g₁(ā₁) = a₁, g₁(ā₂) = a₂, and g₂(a₁) = a₁, g₂(a₂) = a₂, g₂(ā₁) = a₁, g₂(ā₂) = a₂.

Let the first node have axiom a₁ and let the second node have axiom a₂.

Suppose that the network functions by protocol (a). Then the first few steps of the computation are as follows:

time | node 1 | node 2 | communication
t = 1 | a₁ | a₁, a₂ | a₁
t = 2 | a₁ | a₁, a₂ | a₁
t = 3 | a₁ | a₁, a₂ | a₁
t = 4 | a₁ | a₁, a₂ | a₁
… | … | … | …


We can observe that the second node never emits any string, it only accepts strings. Thus, this node is a black hole and its Watson-Crick road is the infinite string with 0 at each bit. Moreover, at each step of the computation communication takes place: the first node sends a copy of a₁ to the second node. Thus, the Watson-Crick road of this node is the infinite string with 1 at each bit, the same as the Watson-Crick road of the system itself.

The same state sequence can be obtained by a standard network with two nodes over the same alphabet, Σ = {a₁, a₂, ā₁, ā₂}, and axioms a₁ and a₂, with protocol (b), where the corresponding morphisms, g₁ and g₂, are given as follows:

g₁(a₁) = a₁, g₁(a₂) = ā₂, g₁(ā₁) = ā₁, g₁(ā₂) = ā₂, and g₂(a₁) = ā₁, g₂(a₂) = ā₂, g₂(ā₁) = ā₁, g₂(ā₂) = ā₂.

The second node is a black hole: it never emits any string during the computation, while the first node communicates a string to this node at every step. Comparing the definition of the morphisms with that of the morphisms of the previous network, we can notice that the two networks can be viewed as dual ones.
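The duality of the two networks can be checked mechanically. The following sketch simulates both networks of Example 1; the encoding (a₁ = 'a', a₂ = 'b', barred letters as uppercase) and the barred-majority reading of the standard trigger φ are our own assumptions, not taken from the paper.

```python
# Sketch: simulate an N_rWD0L network under protocol (a) or (b).
# Assumptions: a word is incorrect when its barred (uppercase) letters
# outnumber the unbarred ones; h_w exchanges each letter with its bar.

def h_w(word):
    return word.swapcase()

def incorrect(word):
    barred = sum(1 for c in word if c.isupper())
    return barred > len(word) - barred

def step(state, morphisms, protocol):
    """One derivation step; protocol 'a' broadcasts corrected words,
    protocol 'b' broadcasts correct words."""
    kept, sent = [], []
    for g, L in zip(morphisms, state):
        images = {''.join(g[c] for c in w) for w in L}
        correct = {w for w in images if not incorrect(w)}
        corrected = {h_w(w) for w in images if incorrect(w)}
        kept.append(correct | corrected)          # node keeps both kinds
        sent.append(corrected if protocol == 'a' else correct)
    return tuple(
        frozenset(K.union(*(S for j, S in enumerate(sent) if j != i)))
        for i, K in enumerate(kept))

def run(state, morphisms, protocol, steps):
    seq = [tuple(frozenset(L) for L in state)]
    for _ in range(steps):
        seq.append(step(seq[-1], morphisms, protocol))
    return seq

# protocol (a) network of Example 1 and its protocol (b) dual g'_i = h_w o g_i
g1 = {'a': 'A', 'b': 'b', 'A': 'a', 'B': 'b'}
g2 = {'a': 'a', 'b': 'b', 'A': 'a', 'B': 'b'}
g1d = {c: h_w(g1[c]) for c in g1}
g2d = {c: h_w(g2[c]) for c in g2}

seq_a = run(({'a'}, {'b'}), [g1, g2], 'a', 5)
seq_b = run(({'a'}, {'b'}), [g1d, g2d], 'b', 5)
assert seq_a == seq_b      # the two networks have the same state sequence
```

Both runs stabilize at the state ({a₁}, {a₁, a₂}), with the second node never emitting, as described in the example.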

4 Protocol (a) versus protocol (b)

It is an important question whether there are different networks which use different communication protocols but determine the same or almost the same dynamics of sets of strings, that is, whether a state sequence of a network using a certain protocol can be obtained by another network using another variant of the protocols.

In the following we show that there are particular variants of NWD0L systems with protocol (a) which determine the same state sequences as NWD0L systems of the same particular type with protocol (b), and conversely, supposing that the systems are defined over the same alphabet and have the same trigger for the complementarity transition.

Before turning to the result, we briefly discuss communication in NWD0L systems.

Let Γ = (Σ, φ, (g₁, {A₁}), …, (g_r, {A_r})), r ≥ 1, be an NWD0L system. Suppose that the ith node of the system is in state L_i(t) at some step t of the derivation, where 1 ≤ i ≤ r, t ≥ 0. Then

g_i(L_i(t)) = C_i(t) ∪ D_i(t) ∪ B_i(t),

where C_i(t) is the set of "absolutely correct" words (according to φ), that is, correct words with incorrect complementary words, D_i(t) is the set of words with unclear status according to the property of being correct or incorrect, that


is, correct words with correct complementary words, and B_i(t) is the set of incorrect words, by definition with correct complementary words. If the network functions with protocol (a), then L_i(t+1) = C_i(t) ∪ D_i(t) ∪ ⋃_{j=1}^{r} h_w(B_j(t)). It is easy to see that if we use the homomorphism g′_i = h_w ∘ g_i instead of g_i, then we obtain

g′_i(L_i(t)) = C′_i(t) ∪ D′_i(t) ∪ B′_i(t),

where C′_i(t) = h_w(B_i(t)) and B′_i(t) = h_w(C_i(t)); here C′_i(t) is the set of absolutely correct words and B′_i(t) is the set of incorrect words obtained from L_i(t). The set of words with unclear status, D′_i(t), is equal to h_w(D_i(t)). We can observe that the homomorphisms g_i and g′_i lead to almost dual networks. If D_j(t) (and clearly, D′_j(t)) is the empty set for each j, 1 ≤ j ≤ r, then the same state of node i of the NWD0L system can be obtained by using protocol (a) and homomorphism g_i as by using protocol (b) and homomorphism g′_i.
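The decomposition into C, D and B, and the way g′ = h_w ∘ g swaps the roles of C and B, can be illustrated numerically. In the sketch below the encoding (uppercase = barred) and the barred-majority reading of φ are our assumptions; the sample image set is invented for illustration.

```python
# Sketch: decompose an image set g_i(L_i(t)) into absolutely correct
# words C, unclear-status words D, and incorrect words B, and check
# that complementing the images (g' = h_w o g) exchanges C and B.

def h_w(word):                 # complementarity morphism, letterwise
    return word.swapcase()

def incorrect(word):           # assumed trigger: barred majority
    barred = sum(1 for c in word if c.isupper())
    return barred > len(word) - barred

def decompose(images):
    C = {w for w in images if not incorrect(w) and incorrect(h_w(w))}
    D = {w for w in images if not incorrect(w) and not incorrect(h_w(w))}
    B = {w for w in images if incorrect(w)}
    return C, D, B

images = {"aa", "aB", "AB"}                 # hypothetical image set
C, D, B = decompose(images)
Cp, Dp, Bp = decompose({h_w(w) for w in images})   # images under g'
assert Cp == {h_w(w) for w in B} and Bp == {h_w(w) for w in C}
assert Dp == {h_w(w) for w in D}
```

Here "aa" is absolutely correct, "aB" has unclear status (it and its complement are both correct under the assumed trigger), and "AB" is incorrect; after complementation the first and last exchange roles while the unclear word stays unclear.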

Definition 4.1 An N_rWD0L system Γ = (Σ, φ, (g₁, {A₁}), …, (g_r, {A_r})), r ≥ 1, with protocol (x), x ∈ {a, b}, is said to be strongly separating (s-type, for short) if for each state s(t) = (L₁(t), …, L_r(t)) of Γ, t ≥ 0, it holds that the complementary word of each correct word in g_i(L_i(t)) is an incorrect word according to φ.

Thus, these networks determine strings with clear status with respect to the property of being correct or incorrect: the complementary word of a correct string must be incorrect.

Theorem 4.1 For every s-type N_rWD0L system Γ with protocol (x), there exists an s-type N_rWD0L system Γ′ with protocol (y), where x ≠ y, x, y ∈ {a, b}, such that Γ and Γ′ have the same state sequences. Moreover, the Watson-Crick road of Γ is equal to the Watson-Crick road of Γ′, and for each i, 1 ≤ i ≤ r, the Watson-Crick roads of the ith components of the two systems coincide.

Proof. We first show that the statement holds for x = b and y = a. Let Γ = (Σ, φ, (g₁, {A₁}), …, (g_r, {A_r})), r ≥ 1, be an s-type N_rWD0L system with protocol (b). The simulating N_rWD0L system Γ′ = (Σ, φ, (g′₁, {A′₁}), …, (g′_r, {A′_r})) is constructed as follows: let A′_i = A_i and let g′_i(a) = h_w(g_i(a)) for every a ∈ Σ and i, 1 ≤ i ≤ r. It is obvious that if g_i(w) = u for some w ∈ Σ*, 1 ≤ i ≤ r, then g′_i(w) = h_w(u).

We show that the state sequences S(Γ) and S(Γ′) coincide, that is, for s(t) = (L₁(t), …, L_r(t)) being the state of Γ at step t and s′(t) = (L′₁(t), …, L′_r(t)) being the state of Γ′ at the same step, s(t) = s′(t) holds for t = 0, 1, 2, … .


We prove this statement by induction on the number of derivation steps. For t = 0 the statement obviously holds. Let us suppose that the statement holds for some t, t ≥ 0. We show that it is valid for t + 1, too. Let us consider s(t) = (L₁(t), …, L_r(t)). Then, by applying the morphisms g_i, 1 ≤ i ≤ r, we obtain g_i(L_i(t)) = C_i(t) ∪ B_i(t), where the strings of C_i(t) are the correct words (with incorrect complementary words) and the strings of B_i(t) are the incorrect words in g_i(L_i(t)). Then, by means of protocol (b), the set of strings at the ith component at the next state is

L_i(t+1) = h_w(B_i(t)) ∪ ⋃_{j=1}^{r} C_j(t).

We can easily observe that for L′_i(t) = L_i(t) it holds that g′_i(L′_i(t)) = C′_i(t) ∪ B′_i(t) with C′_i(t) = h_w(B_i(t)) and B′_i(t) = h_w(C_i(t)), where the strings of C′_i(t) are the correct words (with incorrect complementary words) in g′_i(L′_i(t)) and the strings of B′_i(t) are the incorrect ones. Then L′_i(t+1) = C′_i(t) ∪ ⋃_{j=1}^{r} h_w(B′_j(t)) = h_w(B_i(t)) ∪ ⋃_{j=1}^{r} C_j(t) = L_i(t+1) for each t = 0, 1, 2, … . Observe that h_w ∘ h_w is the identity.

By the above equalities, we can easily notice that L′_i(t+1), t = 0, 1, …, is obtained as the result of computation step t of Γ′ (which works with protocol (a)); thus, for each t = 0, 1, 2, …, state s(t) of Γ is equal to state s′(t) of Γ′. Moreover, Γ′ is an s-type NWD0L system.

Now we prove the statement for x = a and y = b. Suppose that Γ uses protocol (a) for functioning and Γ′, defined above, works with protocol (b). We first show that the state of the ith component of Γ, 1 ≤ i ≤ r, at the tth step of the derivation, where t ≥ 0, is equal to the state of the ith component of Γ′ at the same derivation step of the computation. As in the above case, we use induction on t. For t = 0 the statement is obvious. Let us suppose that the equality holds for some t, t ≥ 0, and let s(t) = (L₁(t), …, L_r(t)) be the state of Γ at the tth step of the computation. Then the next state of Γ can be calculated by using the equality L_i(t+1) = C_i(t) ∪ ⋃_{j=1}^{r} h_w(B_j(t)), where the sets of strings C_i(t) and B_i(t), 1 ≤ i ≤ r, are defined in the same way as above, namely, C_i(t) is the set of correct words (with incorrect complementary words) and B_i(t) is the set of incorrect words in g_i(L_i(t)). Since g′_i(L′_i(t)) = C′_i(t) ∪ B′_i(t) with C′_i(t) = h_w(C_i(t)) and B′_i(t) = h_w(B_i(t)), and L′_i(t) = L_i(t), for the state s′(t+1) = (L′₁(t+1), …, L′_r(t+1)) of Γ′ we have L′_i(t+1) = L_i(t+1), 1 ≤ i ≤ r. Thus, Γ′ and Γ have the same state sequences. Moreover, Γ′ is an s-type NWD0L system. From the above constructions we can see that Γ and Γ′ have the same Watson-Crick road; moreover, the Watson-Crick road of the ith component of Γ is equal to the Watson-Crick road of the ith component of Γ′, 1 ≤ i ≤ r.

Hence the result. □

The reader can notice that the NWD0L systems of Example 1 in the previous section were constructed according to the ideas of the above proof.


5 String population growth in NWD0L systems

Networks of Watson-Crick D0L systems determine collections of string sets changing in time. One measure describing the dynamics of these collections is the number of strings present in the network (at some nodes, or at a specific node) at a certain step of the computation.

Definition 5.1 Let Γ = (Σ, φ, (g₁, {A₁}), …, (g_r, {A_r})), r ≥ 1, be an N_rWD0L system with protocol (x), x ∈ {a, b}. Let s(t) = (L₁(t), …, L_r(t)), t = 0, 1, …, be the state of Γ at step t of the computation.

Then the function p : N → N defined by

p(t) = Σ_{i=1}^{r} card(L_i(t)), t ≥ 0,

is called the string population growth function of Γ.

The function p_i : N → N, where p_i(t) = card(L_i(t)) for t = 0, 1, …, is the string population growth function of node i of Γ, 1 ≤ i ≤ r.

Although the nodes can receive new strings by communication, the string population growth function of an NWD0L system is not necessarily monotonously increasing. The following simple example proves this statement.

Example 2 Let Γ = (Σ, φ, (g₁, {a₁}), (g₂, {ā₁}), (g₃, {a₂})) be a standard NWD0L system with protocol (a), where Σ = {a₁, a₂, ā₁, ā₂}. Let g₁(b) = a₁ for b ∈ Σ, g₂(a₂) = g₃(a₂) = a₁, g₂(a₁) = g₃(a₁) = ā₂, g₂(ā₁) = g₃(ā₁) = a₁, and g₂(ā₂) = g₃(ā₂) = a₂.

Then it is easy to see that for the state sequence s(t), t = 0, 1, 2, …, of the network the following holds: s(t) = ({a₁}, {a₁}, {a₁}) for t = 2k + 1, k ≥ 0, and s(t) = ({a₁, a₂}, {a₂}, {a₂}) for t = 2k, k ≥ 1.
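The oscillation of the string population can be replayed directly. The sketch below simulates the three-node network of Example 2 as read here (encoding a₁ = 'a', a₂ = 'b', barred = uppercase, and the barred-majority reading of φ are our assumptions).

```python
# Sketch: the string population of Example 2 is not monotone.
# Assumed trigger: a word is incorrect when barred (uppercase) letters
# outnumber unbarred ones; h_w exchanges a letter with its barred version.

def h_w(w):
    return w.swapcase()

def incorrect(w):
    barred = sum(1 for c in w if c.isupper())
    return barred > len(w) - barred

def step_a(state, morphisms):    # protocol (a): broadcast corrected words
    kept, sent = [], []
    for g, L in zip(morphisms, state):
        images = {''.join(g[c] for c in w) for w in L}
        corrected = {h_w(w) for w in images if incorrect(w)}
        kept.append({w for w in images if not incorrect(w)} | corrected)
        sent.append(corrected)
    return [K.union(*(S for j, S in enumerate(sent) if j != i))
            for i, K in enumerate(kept)]

g1 = {c: 'a' for c in 'abAB'}                   # g1(b) = a1 for every b
g23 = {'a': 'B', 'b': 'a', 'A': 'a', 'B': 'b'}  # g2 = g3 as read above
state = [{'a'}, {'A'}, {'b'}]                   # axioms a1, bar(a1), a2
population = []
for _ in range(8):
    population.append(sum(len(L) for L in state))
    state = step_a(state, [g1, g23, g23])
print(population)    # alternates between 3 and 4 after the start
```

The computed p(t) values alternate between 3 (odd steps) and 4 (even steps from t = 2 on), so p is not monotonously increasing.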

We continue with another example, which we shall use in the sequel.

Example 3 Let Γ = (Σ, φ, (g₁, {a₁}), (g₂, {a₁a₂ā₃})) be a standard NWD0L system with protocol (a), where Σ = {a₁, a₂, a₃, ā₁, ā₂, ā₃}.

Let g₁(b) = a₁ for b ∈ Σ and g₂(a_i) = a_i, 1 ≤ i ≤ 3. Let g₂(ā₁) = ā₁ā₂, g₂(ā₂) = ā₂, and g₂(ā₃) = ā₃³.

The first few steps of the derivation result in the following strings at the nodes:

time | node 1 | node 2 | communication
t = 1 | a₁, ā₁ā₂a₃³ | ā₁ā₂a₃³ | ā₁ā₂a₃³
t = 2 | a₁, a₁⁵ | ā₁ā₂²a₃³ | —
t = 3 | a₁, a₁⁵, a₁a₂³ā₃³ | a₁a₂³ā₃³ | a₁a₂³ā₃³
t = 4 | a₁, a₁⁵, a₁⁷, ā₁ā₂³a₃⁹ | ā₁ā₂³a₃⁹ | ā₁ā₂³a₃⁹


Then p(t) = 7 for 11 ≤ t ≤ 28.

Let us discuss the functioning of this network. First, we can observe that the first node never emits any string in the course of the computation, that is, it is a black hole. The growth of the string population is due to the second node, which at some derivation steps sends one string to the first node. The Watson-Crick D0L system of the second component, W₂ = ((Σ, g₂, a₁a₂ā₃), φ), was examined in detail in [6], [7] and [9]. In [9] it was proved that the Watson-Crick road of W₂ is not ultimately periodic; there is an exponentially growing sequence of 0s between the subwords 11. More precisely, after the first position, bit 1 occurs exactly in positions 3^{i+1} + i and 3^{i+1} + i + 1, for i ≥ 0. By the definition of the Watson-Crick road, this property implies that communication takes place only at the first step and at steps 3^{i+1} + i and 3^{i+1} + i + 1, i ≥ 0, in the course of the computation in Γ, when the number of strings is increased by 1. (The length of the string at the second node is monotonously increasing and the first node does not change the lengths of the strings it has, so any communication implies the increment of the number of strings at the first node.) The growth of the string population in the network is described by the function p(t), where

p(t+1) = p(t) + 1 if t = 0 or t = 3^{i+1} + i − 1 or t = 3^{i+1} + i, i ≥ 0,
p(t+1) = p(t) otherwise.

Now let us consider the function d(t), t ≥ 0, with d(t) = p(t+1) − p(t). It can take the values 1 or 0. (d(t) gives the tth bit of the Watson-Crick road of the second node.) If d(t) were a Z-rational function, then the 0s would occur in an ultimately periodic fashion among its values (Skolem-Mahler-Lech theorem, [11], p. 58, Lemma 9.10). But this is not the case; thus, d(t) is not a Z-rational function, which implies that p(t) is not Z-rational either. Moreover, we can easily observe that p₁(t), the string population growth function of the first component, is not Z-rational either, but p₂(t), the string population growth function of the second component, is a Z-rational function, since it equals 1 for each t, t ≥ 0.


Let us now modify the definition of Γ to Γ′ as follows: let g′₁(b) = ā₁ and g′₂(b) = g₂(b) for each b ∈ Σ. Let us suppose that Γ′ functions with protocol (b). Then, as in the previous case, the first node is a black hole. The second node behaves like a "dual" node of the second node of Γ: it issues a string at any step t of the computation with t ≠ 3^{i+1} + i and t ≠ 3^{i+1} + i + 1, where i ≥ 0. Then the growth of the string population of Γ′, p′(t), is as follows:

p′(t+1) = p′(t) if t = 3^{i+1} + i − 1 or t = 3^{i+1} + i, i ≥ 0,
p′(t+1) = p′(t) + 1 otherwise.

Analogously to the above case we can show that the function d′(t), t ≥ 0, with d′(t) = p′(t+1) − p′(t) is not a Z-rational function, which implies that p′(t) is not Z-rational.

This example leads to two important observations.

Theorem 5.1 For x ∈ {a, b} there exists an NWD0L system Γ with protocol (x) such that the string population growth function of Γ is not Z-rational.

Proof. The NWD0L systems Γ and Γ′ of Example 3 satisfy the conditions of the claim. □

Theorem 5.2 Let Γ = (Σ, φ, (g₁, {A₁}), …, (g_r, {A_r})), r ≥ 1, be an N_rWD0L system with protocol (x), x ∈ {a, b}, such that the components (g_i, {A_i}), i = 2, …, r, are black holes, and the Watson-Crick road of the component (g₁, {A₁}) is not ultimately periodic. Moreover, for each i, i = 2, …, r, let card(L_i(t)) < card(L_i(t+1)) if communication takes place at derivation step t, and card(L_i(t)) = card(L_i(t+1)) otherwise, where L_i(t) denotes the state of the ith node at derivation step t, t ≥ 0. Then the string population growth function p(t) of Γ is not a Z-rational function.

Proof. We give the proof for the case of protocol (a); the other case can be treated in a similar way.

If the components (g_i, {A_i}), i = 2, …, r, are black holes, then they do not contribute to the increment of the string population by communicating strings to other nodes. Thus, at any step of the computation the first node has exactly one string. The string population in the system increases whenever the first node issues a string, and this takes place exactly at those derivation steps at which the string obtained by the morphism g₁ has to be turned into its complementary word. Then the number of strings in the system in the course of the computation can be calculated as follows:

p(0) = r,


p(t+1) = p(t) if the tth bit of the Watson-Crick road of the component (g₁, {A₁}) is 0,
p(t+1) = p(t) + r − 1 otherwise.

Analogously to the considerations used in Example 3, we can show that p(t) is not a Z-rational function. Again, we consider the function d(t) = p(t+1) − p(t). This function assumes only two values: 0 and r − 1. If it were Z-rational, then the 0s would occur in an ultimately periodic fashion among its values (Skolem-Mahler-Lech theorem, [11], p. 58, Lemma 9.10). But this is not the case if the Watson-Crick road of the component is not ultimately periodic. □

6 Remarks on black holes

Communication in networks of Watson-Crick D0L systems raises a lot of intriguing questions. Among them a particularly interesting problem is whether or not a given network contains a black hole, that is, a node which never emits any string in the course of the computation. For networks working with protocol (a), this question is strongly connected with the problem of stability of WD0L systems. A WD0L system W is stable if the complementarity transition never takes place in its word sequence S(W). In [7] it was shown that any algorithm solving the stability of a given standard Watson-Crick D0L system can be converted to an algorithm solving the problem Z_pos, and conversely.

For NWD0L systems, if a component i, 1 ≤ i ≤ r, is a black hole in an NWD0L system Γ = (Σ, φ, (g₁, {A₁}), …, (g_r, {A_r})), r ≥ 1, working with protocol (a), then for all strings u ∈ L_i(t), t ≥ 1, where L_i(t) is the state of the ith node at the tth step of the computation, it holds that W_i = ((Σ, g_i, u), φ) is a stable WD0L system, that is, φ(g_i^k(u)) = 0 for k ≥ 1. For protocol (b), if node i is a black hole, then L_i(t) contains only "unstable" strings, that is, φ(g_i(u)) = 1 for each string u in L_i(t).

As a direct consequence of the statement concerning the stability of Watson-Crick D0L systems above, we can state the following result.

Theorem 6.1 Any algorithm for deciding whether a standard NWD0L system working with protocol (a) contains a black hole can be converted to an algorithm for solving problem Z_pos.

Proof. Let us assume that we have an algorithm A for deciding the existence of a black hole. Let us apply A to a network where there is only


one component in the system. Then A solves the stability of an arbitrary standard WD0L system and hence, by the earlier result of [7], it can be converted to settle problem Z_pos. □

References

[1] E. Csuhaj-Varjú: Networks of Language Processors. EATCS Bulletin 63 (1997), 120-134. Also in: Gh. Păun, G. Rozenberg, A. Salomaa (eds.), Current Trends in Theoretical Computer Science, World Scientific, Singapore, 2001, 771-790.

[2] E. Csuhaj-Varjú, A. Salomaa: Networks of parallel language processors. In: New Trends in Formal Languages. Control, Cooperation, and Combinatorics (Gh. Păun, A. Salomaa, eds.), LNCS 1218, Springer-Verlag, Berlin-Heidelberg-New York, 1997, 299-318.

[3] Handbook of Formal Languages. Vol. I-III (G. Rozenberg, A. Salomaa, eds.), Springer-Verlag, Berlin-Heidelberg-New York, 1997.

[4] A. Salomaa: Formal Languages. Academic Press, New York, 1973.

[5] G. Rozenberg, A. Salomaa: The Mathematical Theory of L Systems. Academic Press, New York-London, 1980.

[6] V. Mihalache, A. Salomaa: Watson-Crick D0L systems. EATCS Bulletin 62 (1997), 160-175.

[7] V. Mihalache, A. Salomaa: Language-theoretic aspects of DNA complementarity. Theoretical Computer Science 250(1-2) (2001), 163-178.

[8] A. Salomaa: Turing, Watson-Crick and Lindenmayer. Aspects of DNA Complementarity. In: Unconventional Models of Computation (C.S. Calude, J. Casti, M.J. Dinneen, eds.), Springer-Verlag, Singapore-Berlin-Heidelberg-New York, 1998, 94-107.

[9] A. Salomaa: Watson-Crick Walks and Roads on D0L Graphs. Acta Cybernetica 14(1) (1999), 179-192.

[10] W. Kuich, A. Salomaa: Semirings, Automata, Languages. EATCS Monographs on Theoretical Computer Science, Springer-Verlag, Berlin-Heidelberg-New York-Tokyo, 1986.


[11] A. Salomaa, M. Soittola: Automata-Theoretic Aspects of Formal Power Series. Texts and Monographs in Computer Science, Springer-Verlag, Berlin-Heidelberg-New York, 1978.


On the Differentiation Function of some Language Generating Devices

Jürgen Dassow
Otto-von-Guericke-Universität Magdeburg
Fakultät für Informatik
PSF 4120, D-39016 Magdeburg

Abstract: The differentiation function of a language generating device counts the number of words which can be generated by the device in a given number of steps. In this paper we summarize results on the differentiation function of deterministic tabled Lindenmayer systems, evolutionary grammars and context-free grammars. We present sharp upper bounds for the differentiation function, prove closure under some algebraic operations, relate this function to other functions studied in formal language theory, and consider decision problems for the differentiation function.

1 Introduction and Definitions

In biology, differentiation means the evolution of a variety of organisms which are modifications of the species from which they originate. The differentiation function gives the numerical size of the variety at a certain moment.

Lindenmayer systems form a description of the development of (lower) organisms on the basis of formal language theory. The organisms are represented by words, and a derivation step corresponds to a step of the development, such as division of cells or changes of the states of cells. Therefore it is natural to define the differentiation function of a Lindenmayer system, which gives the number of words that can be obtained in a certain number of derivation steps from an axiom (representing the basic organism), in order to reflect the biological differentiation.

Analogously, the differentiation function of an evolutionary grammar which describes an evolutionary process on the basis of formal language theory gives the number of words (representing DNA sequences) which can be obtained in a certain number of derivation steps (representing mutations) from a set of axioms.

In order to formalize this idea we introduce the notion of a language generating device.

N denotes the set of natural numbers including zero. By #(M) we designate the cardinality of a set M. Given an alphabet V, V* denotes the set of all words over V including the empty word λ. We set V⁺ = V* \ {λ}. By |w| we designate the length of a word w. A morphism h : V* → V* is a mapping with h(w₁w₂) = h(w₁)h(w₂) for all w₁, w₂ ∈ V*.

A language generating device is a construct D = (V, ⟹, A) where
- V is an alphabet,


- ⟹ ⊆ V* × V* is a binary relation over V*,
- A ⊆ V* is a finite subset of V*.

For n ≥ 0, the language L_n(D) generated by D in n steps is defined inductively as follows:

L₀(D) = A, L_n(D) = {w | v ⟹ w for some v ∈ L_{n−1}(D)} for n ≥ 1.

The language generated by D is defined as

L(D) = ⋃_{i ≥ 0} L_i(D).

The differentiation function d_D of D is defined as

d_D : N → N with d_D(n) = #(L_n(D)).

The aim of this paper is to summarize results on the differentiation function of (deterministic tabled) Lindenmayer systems and (context-free) evolutionary grammars. In addition, we also present results on the differentiation function of context-free grammars, where the definition has to be changed slightly because of the distinction between nonterminals and terminals. We present sharp upper bounds for the differentiation function, prove closure under some algebraic operations, relate this function to growth and structure functions which have already been studied in formal language theory, and consider the decidability of the equality and boundedness of the differentiation functions of given systems or grammars. Mostly, we only present ideas of the proofs or give partial proofs. For complete proofs we refer to [2, 3, 4].

2 Deterministic Tabled Lindenmayer Systems

A deterministic tabled Lindenmayer system without interaction (DT0L system for short) is a triple G = (V, P, w) where
- V is an alphabet,
- P = {h₁, h₂, …, h_n} is a set of n morphisms h_i : V* → V*, and
- w is a non-empty word over V.

Intuitively, V encodes cells, the morphisms h_i, 1 ≤ i ≤ n, describe the developments of the cells in a certain environment (e.g. h(a) = aa for a cell a describes the division of a cell a into two cells of the same type), and w corresponds to the organism from which the development starts.

The derivation relation of a DT0L system G is defined as follows: for two words v and v′, the relation v ⟹ v′ holds if and only if there is an integer i, 1 ≤ i ≤ n, such that v′ = h_i(v). Thus one derivation step corresponds to the application of one of the morphisms.

Thus any deterministic tabled Lindenmayer system G corresponds to a language generating device D(G) = (V, ⟹, {w}). We denote the associated languages and differentiation function by L_n(G) and d_G instead of L_n(D(G)) and d_{D(G)}, respectively.


By definition, d_G(n) counts the number of different words which can be derived from the start word w in exactly n steps. In biological terms, d_G gives the number of different organisms which can be obtained after n steps using the given sets of developmental rules.

We now present two examples. First we consider the DT0L system

G = ({a, b, c}, {h₁, h₂}, a)

with

h₁(a) = abc, h₁(b) = b, h₁(c) = c,
h₂(a) = ab²c, h₂(b) = b, h₂(c) = c.

Then L_n(G) = {ab^{i₁}cb^{i₂}c … b^{i_n}c | i_j ∈ {1, 2}} and hence d_G(n) = 2ⁿ for n ≥ 1.

Second, for k ≥ 1, we consider the DT0L system

G′ = ({a, b, a₁, a₂, …, a_k}, {h′₁, h′₂}, a₁)

with

h′₁(a) = ab, h′₂(a) = ab², h′_i(b) = b, h′_i(a_k) = a for i ∈ {1, 2},
h′_i(a_j) = a_{j+1} for i ∈ {1, 2}, j ∈ {1, 2, …, k−1}.

Then

L_n(G′) = {a_{n+1}} for 0 ≤ n ≤ k−1,
L_k(G′) = {a},
L_n(G′) = {ab^{i₁}b^{i₂} … b^{i_{n−k}} | i_j ∈ {1, 2}} = {ab^i | n−k ≤ i ≤ 2(n−k)} for n ≥ k+1,

and

d_{G′}(n) = 1 for 0 ≤ n ≤ k−1,
d_{G′}(n) = n − k + 1 for n ≥ k.
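Both examples can be re-checked by brute force. In the sketch below the single-character encodings are our own, and for G′ we fix k = 2, with 'x' and 'y' standing in for the letters a₁ and a₂.

```python
# Sketch: compute d_G(0), ..., d_G(steps) of a DT0L system by
# exhaustively applying every table (morphism) to every word of a level.

def d_sequence(morphisms, axiom, steps):
    level, seq = {axiom}, []
    for _ in range(steps + 1):
        seq.append(len(level))
        level = {''.join(h[c] for c in w) for w in level for h in morphisms}
    return seq

# first example: h1(a) = abc, h2(a) = abbc, both fix b and c
h1 = {'a': 'abc', 'b': 'b', 'c': 'c'}
h2 = {'a': 'abbc', 'b': 'b', 'c': 'c'}
assert d_sequence([h1, h2], 'a', 5) == [1, 2, 4, 8, 16, 32]      # 2^n for n >= 1

# second example with k = 2: x -> y -> a, then a -> ab or abb
h1p = {'a': 'ab', 'b': 'b', 'x': 'y', 'y': 'a'}
h2p = {'a': 'abb', 'b': 'b', 'x': 'y', 'y': 'a'}
assert d_sequence([h1p, h2p], 'x', 6) == [1, 1, 1, 2, 3, 4, 5]   # n - k + 1 for n >= k
```

The computed sequences match 2ⁿ and n − k + 1 exactly as claimed.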

The following theorem gives an upper bound for the differentiation function of DTOL systems and shows that it is sharp.

Theorem 1 i) For any DT0L system G with r morphisms, d_G(n) ≤ rⁿ for n ≥ 0.

ii) For any natural number r ≥ 1, there is a DT0L system G with r morphisms such that d_G(n) = rⁿ for n ≥ 0.

Proof. The first statement follows easily since from any word we can derive at most r different words. The second statement follows by a generalisation of the construction given in the first example above. □


Theorem 2 Let f and g be differentiation functions of DT0L systems, and let k ≥ 1 be a natural number. Then

- f + g, defined by (f + g)(0) = 1 and (f + g)(n) = f(n) + g(n) for n ≥ 1,
- f · g, defined by (f · g)(n) = f(n) · g(n) for n ≥ 0,
- f[k], defined by f[k](n) = f(⌊n/k⌋) for n ≥ 0,¹

are differentiation functions of DT0L systems, too.

Proof. We only give the proof for the multiplication of functions and mention that also in the other cases the proofs are constructive.

Let

G₁ = (V₁, {h₁, h₂, …, h_n}, w₁) and G₂ = (V₂, {g₁, g₂, …, g_m}, w₂)

be two DT0L systems, where we assume without loss of generality that V₁ ∩ V₂ = ∅. We consider the DT0L system

G = (V₁ ∪ V₂, {f_{i,j} | 1 ≤ i ≤ n, 1 ≤ j ≤ m}, w₁w₂)

where, for 1 ≤ i ≤ n, 1 ≤ j ≤ m, the morphisms f_{i,j} are defined by

f_{i,j}(a) = h_i(a) for a ∈ V₁ and f_{i,j}(a) = g_j(a) for a ∈ V₂.

It is easy to see that L_n(G) = L_n(G₁) · L_n(G₂) for n ≥ 0. Hence d_G(n) = d_{G₁}(n) · d_{G₂}(n) for n ≥ 0. □
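The product construction can be verified numerically. The two component systems below are our own small examples (the two DT0L examples from the beginning of this section, with the second one renamed onto a disjoint alphabet).

```python
# Sketch: verify d_G = d_{G1} . d_{G2} for the product construction,
# where each f_{i,j} acts as h_i on V1 and as g_j on V2.

def d_sequence(morphisms, axiom, steps):
    level, seq = {axiom}, []
    for _ in range(steps + 1):
        seq.append(len(level))
        level = {''.join(h[c] for c in w) for w in level for h in morphisms}
    return seq

# G1 over {a,b,c}; G2 over the disjoint alphabet {d,e,x,y}
h1 = {'a': 'abc', 'b': 'b', 'c': 'c'}
h2 = {'a': 'abbc', 'b': 'b', 'c': 'c'}
g1 = {'d': 'de', 'e': 'e', 'x': 'y', 'y': 'd'}
g2 = {'d': 'dee', 'e': 'e', 'x': 'y', 'y': 'd'}

# the n*m morphisms f_{i,j}: merge one table of G1 with one table of G2
fs = [{**h, **g} for h in (h1, h2) for g in (g1, g2)]

d1 = d_sequence([h1, h2], 'a', 5)
d2 = d_sequence([g1, g2], 'x', 5)
dG = d_sequence(fs, 'ax', 5)
assert dG == [u * v for u, v in zip(d1, d2)]   # pointwise product
```

Because the alphabets are disjoint, every word of L_n(G) splits uniquely into a V₁-part and a V₂-part, so the counts multiply exactly.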

Corollary 3 Let p(x) = a_m x^m + a_{m−1} x^{m−1} + … + a₁x + a₀ be a polynomial such that all coefficients a_i, 0 ≤ i ≤ m, are integers and a_m > 0. Then there is a constant c ≥ 0 and a DT0L system G such that

d_G(n) = p(n) for n ≥ c.

Proof. We prove the statement by induction on the degree m of the polynomial. For m = 0, i.e. p(x) = a₀, the statement follows easily by considering

G = ({a, b₁, b₂, …, b_{a₀}}, {h₁, h₂, …, h_{a₀}}, a)

with h_i(a) = b_i and h_i(b_j) = b_j for 1 ≤ i ≤ a₀, 1 ≤ j ≤ a₀.

Let p(x) = a_{m+1}x^{m+1} + a_m x^m + … + a₁x + a₀. Assume that a₀ ≥ 0. Then we set q(x) = a_{m+1}x^m + a_m x^{m−1} + … + a₂x + a₁. By induction, there is a DT0L system G such that d_G(n) = q(n) for n ≥ c. Then we construct a DT0L system H with the differentiation function d_G(n) · n + a₀. Obviously, d_H(n) = p(n) for n ≥ c.

If a₀ < 0, then we construct q(x) = a_{m+1}x^m + a_m x^{m−1} + … + a₂x + (a₁ − 1), for which some G with d_G(n) = q(n) for n ≥ c exists. Moreover, by the above example there is a DT0L system H with d_H(n) = n + a₀ for n ≥ −a₀ + 1. Thus there is a system F with d_F(n) = d_G(n) · n + d_H(n) and therefore d_F(n) = q(n) · n + n + a₀ = p(n) for n ≥ c′ for some c′ ≥ 0. □

¹ ⌊u⌋ denotes the largest integer n with n ≤ u.


A D0L system is a DT0L system with exactly one morphism. The D0L system H = (V, h, w) generates in n steps the word hⁿ(w) only. Therefore with a D0L system H we can associate the growth function

g_H : N → N given by g_H(n) = |hⁿ(w)|.

The following theorem relates growth function and differentiation function to each other.

Theorem 4 Let H be a D0L system with a non-erasing morphism. Then there is a DT0L system G such that d_G = g_H. □

We note that there are growth functions of DOL systems which do not belong to the types of functions given in Theorem 1 ii) and Corollary 3 (see [5, 81).

Theorem 5 For two given DT0L systems, it is undecidable whether or not their differentiation functions are equal.

The proof can be given by a reduction from the Post Correspondence Problem.

3 Evolutionary Grammars

A (context-free) evolutionary grammar is a sixtuple G = (V, C, I, T, D, A) where
- V is an alphabet,
- C, I, T, and D are finite subsets of V*,
- A is a finite subset of V*.

The derivation relation of an evolutionary grammar G is defined as follows: for two words v and v′, the relation v ⟹ v′ holds if and only if one of the following conditions holds:
- v = v₁xv₂, v′ = v₁v₂, x ∈ C,
- v = v₁xv₂, v′ = v₁x^R v₂, x ∈ I,²
- v = v₁xv₂v₃, v′ = v₁v₂xv₃, x ∈ T,
- v = v₁v₂xv₃, v′ = v₁xv₂v₃, x ∈ T,
- v = v₁xv₂, v′ = v₁xxv₂, x ∈ D.

Intuitively, V encodes the DNA molecules, and the sets C, I, T and D correspond to (large-scale) mutations which can occur in the evolution: C is the set of sequences which can be deleted, I is the set of sequences which can be reversed, T is the set of sequences which can be shifted (translocated) in the DNA strand, and D is the set of sequences which can be duplicated; A is a finite set of DNA sequences from which the evolution starts. Thus, in biological terms, d_G gives the number of DNA sequences which can be obtained from a set of start sequences by a given set of mutations.

Again, any evolutionary grammar G corresponds to a language generating device D(G) = (V, ⟹, A), and we denote the associated languages and differentiation function by L_n(G) and d_G instead of L_n(D(G)) and d_{D(G)}, respectively.

We give two examples.

² The reversal x^R of a word x ∈ V* is inductively defined by λ^R = λ, x^R = x for x ∈ V, and (x₁x₂)^R = x₂^R x₁^R for x₁, x₂ ∈ V*.


First, let

G = ({a, b}, ∅, ∅, ∅, {a^2ba, a^2ba^2}, {a^2ba^2})

be a context-free evolutionary grammar where only duplications are allowed. Then

L_n(G) = { a^2 b a^{i_1+2} b a^{i_2+2} ... b a^{i_n+2} b a^2 | i_j ∈ {1, 2} }  and  d_G(n) = 2^n

for n ≥ 0. Second, we consider the evolutionary grammar

G' = ({a, b}, ∅, {aa}, {b}, {aa}, {aab})

for which

L_n(G') = { a^r b a^s | r + s = 2i, 1 ≤ i ≤ n } ∪ { a^{2n+2} b }

and

d_{G'}(n) = 3 + 5 + ... + (2n + 1) + 1 = (n + 1)^2

hold for n ≥ 0.

Again we start with upper bounds.

Theorem 6 i) For any evolutionary grammar G, there are constants c_1 and c_2 such that d_G(n) ≤ c_1 · c_2^n for n ≥ 0.

ii) For any evolutionary grammar G = (V, C, I, T, ∅, A) with an empty set of duplications, there is a constant c such that d_G(n) ≤ c for n ≥ 0.

iii) For any natural number c ≥ 1, there is an evolutionary grammar G such that d_G(n) = c^n.

Proof. i) For a given context-free evolutionary grammar G = (V, C, I, T, D, A) we set r = max{ |z| | z ∈ A } and s = max{ |y| | y ∈ D }. Then v ⟹ u implies |u| ≤ |v| + s. Thus, for n ≥ 0, |z| ≤ r + n·s for z ∈ L_n(G). The number of words whose length is at most r + n·s can be bounded in the required form.

ii) follows from the fact that evolutionary grammars without duplications only generate finite languages.

iii) can be shown by a generalization of the construction in the first example above. □

Theorem 7 Let f and g be differentiation functions of evolutionary grammars. Then the functions
- f + g, defined by (f + g)(n) = f(n) + g(n) for n ≥ 0,
- f*, defined by f*(n) = Σ_{i=0}^{n} f(i) for n ≥ 0,
are differentiation functions of evolutionary grammars, too.

Proof. We give the proof for f* only. Let G = (V, C, I, T, D, A) be an arbitrary evolutionary grammar and let a be a letter not contained in V. Then we consider the evolutionary grammar

H = (V ∪ {a}, C, I, T, D ∪ {a}, {a}A).


It is easy to show that

L_n(H) = ∪_{i=0}^{n} a^{i+1} L_{n-i}(G),

from which d_H = f* follows. □

It is an open problem whether the set of all differentiation functions of evolutionary grammars is closed under product.

For evolutionary grammars we have only the following weaker form of Corollary 3.

Lemma 8 For any natural number m ≥ 1, there is an evolutionary grammar G such that d_G(n) = Θ(n^m).

Proof. The statement follows from Theorem 7 and the fact that (n^m)* = Θ(n^{m+1}). □

4 Context-Free Grammars

Obviously, by the biological motivation, the differentiation function of DTOL systems and evolutionary grammars is of interest and importance. This does not hold for the differentiation function of grammars defined with a linguistic motivation. However, we shall see that there are some results which give hints to an interest in such a function for linguistically motivated grammars, too. We restrict ourselves to context-free grammars in this paper.

A context-free grammar is a quadruple G = (N, T, P, S) where
- N and T are disjoint alphabets,
- S ∈ N, and
- P = {(A_1, w_1), (A_2, w_2), ..., (A_r, w_r)} is a finite set of pairs with A_i ∈ N and w_i ∈ (N ∪ T)* for 1 ≤ i ≤ r.

A context-free grammar is called linear if all elements of P are of the form (A, wBv) or (A, w) with A, B ∈ N and w, v ∈ T*. It is called regular if all elements of P are of the form (A, wB) or (A, w) with A, B ∈ N and w ∈ T*.

The derivation relation of a context-free grammar G is defined as follows: for two words v and v', the relation v ⟹ v' holds if and only if v = v_1 A v_2, v' = v_1 w v_2 and (A, w) ∈ P. We say that a derivation step is leftmost if v_1 ∈ T* holds.

Since in the theory of linguistically motivated grammars one is only interested in words over the terminal alphabet T, we modify the concept of a differentiation function of a context-free grammar as follows: Let v ⟹^n v' if there are words v_0, v_1, ..., v_n such that v_i ⟹ v_{i+1} for 0 ≤ i ≤ n − 1, v = v_0 and v' = v_n. Then we set

L_n(G) = { z | S ⟹^n z },

L'_n(G) = L_n(G) ∩ T*,

d_G(n) = #(L'_n(G)).

A context-free grammar G is unambiguous if, for any word w ∈ L(G), there is exactly one derivation S ⟹* w in which every derivation step is leftmost.


First we consider the context-free grammar

G = ({S}, {a, b, c}, {(S, Sbc), (S, Sb^2c), (S, abc), (S, ab^2c)}, S).

Then

L_n(G) = { S b^{i_n} c b^{i_{n-1}} c ... b^{i_1} c | i_j ∈ {1, 2} } ∪ { a b^{i_n} c b^{i_{n-1}} c ... b^{i_1} c | i_j ∈ {1, 2} },

L'_n(G) = { a b^{i_n} c b^{i_{n-1}} c ... b^{i_1} c | i_j ∈ {1, 2} },

d_G(0) = 0 and d_G(n) = 2^n for n ≥ 1.
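The values d_G(n) = 2^n for this grammar can be checked mechanically; the sketch below (our own helper with hypothetical names, not part of the paper) expands all sentential forms for n steps and counts the terminal ones:

```python
def diff_fn(rules, nonterminals, start, n):
    """d_G(n): number of terminal words derivable from `start`
    in exactly n derivation steps of a context-free grammar."""
    forms = {start}
    for _ in range(n):
        nxt = set()
        for w in forms:
            for i, sym in enumerate(w):
                for lhs, rhs in rules:
                    if sym == lhs:            # rewrite one occurrence of lhs
                        nxt.add(w[:i] + rhs + w[i + 1:])
        forms = nxt
    return sum(1 for w in forms if all(c not in nonterminals for c in w))
```

With rules = [("S", "Sbc"), ("S", "Sbbc"), ("S", "abc"), ("S", "abbc")] one gets diff_fn(rules, {"S"}, "S", n) = 0, 2, 4, 8 for n = 0, 1, 2, 3, as claimed above.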

As a second example we consider the context-free grammar

G' = ({S, A}, {a, b}, {(S, S), (S, aS), (S, bA), (A, aA), (A, λ)}, S)

for which

L'_n(G') = { a^r b a^s | r + s = i, 0 ≤ i ≤ n − 2 }

and

d_{G'}(0) = 0 and d_{G'}(n) = Σ_{i=0}^{n-1} i = n(n − 1)/2 for n ≥ 1

hold.

Concerning the upper bounds we have essentially the same situation as for DTOL systems.

Theorem 9 i) For any context-free grammar G, there is a constant c such that d_G(n) ≤ c^n for n ≥ 0.

ii) For any natural number c ≥ 1, there is a regular grammar G such that d_G(n) = c^n for n ≥ 1. □

Theorem 10 Let f and g be differentiation functions of context-free grammars. Then the functions

- f + g, defined by (f + g)(n) = f(n) + g(n) for n ≥ 0,
- f[k], defined by f[k](n) = f(⌊n/k⌋) for n ≥ 0,
- f*, defined by f*(n) = Σ_{i=0}^{n} f(i) for n ≥ 0,
are differentiation functions of context-free grammars, too.

Proof. We only prove the second statement. Let G = (N, T, P, S) be a context-free grammar with the differentiation function d_G = f. Then we construct the context-free grammar G' = (N', T, P', S'_0) as follows. With every nonterminal A ∈ N we associate the set N_A = {A_1, A_2, ..., A_{k-1}}. Then we set

N' = {S'_0, S'_1, ..., S'_{k-1}} ∪ N ∪ ∪_{A ∈ N} N_A,

P' = {(S'_0, S'_1), (S'_1, S'_2), ..., (S'_{k-2}, S'_{k-1})} ∪ {(S'_0, S), (S'_1, S), ..., (S'_{k-1}, S)}
     ∪ ∪_{A ∈ N} {(A, A_1), (A_1, A_2), ..., (A_{k-2}, A_{k-1})} ∪ {(A_{k-1}, w) | (A, w) ∈ P}.


Intuitively, besides a start phase which ensures that S'_0 ⟹^j S for 1 ≤ j < k, any derivation step in G is obtained by k derivation steps in G'. Thus, for n ≥ 1, kn ≤ i < k(n + 1) and z ∈ T*, S'_0 ⟹^i z holds if and only if S ⟹^n z. This implies the statement. □

Again, it is an open problem whether the set of differentiation functions of context-free grammars is closed with respect to product.

For a language L, we define the structure function s_L by

s_L : ℕ → ℕ and s_L(n) = #({ z | z ∈ L, |z| = n }).

For facts on the structure function we refer to [1, 6, 7, 9]. The following theorem gives a connection between structure functions of context-free languages and differentiation functions of context-free grammars.

Theorem 11 i) For any context-free language L, there is a context-free grammar G such that d_G = s_L.

ii) For any unambiguous context-free grammar G, there is a context-free language L such that s_L = d_G.

iii) If the differentiation function of a context-free grammar G is bounded by a constant, then the structure function of L(G) is bounded by a constant, too.

Proof. i) The proof can be given by the use of the Greibach normal form for context-free grammars.

ii) Let G = (N, T, P, S) be a context-free grammar. Let P = {p_1, p_2, ..., p_n}. We define the morphism h : (N ∪ T)* → N* by h(a) = λ for a ∈ T and h(A) = A for A ∈ N, and set

G' = (N, { [i] | 1 ≤ i ≤ n }, { (A, [i]h(w)) | p_i = (A, w) }, S).

Obviously, if

S ⟹_{p_{i_1}} v_1 ⟹_{p_{i_2}} v_2 ⟹ ... ⟹_{p_{i_n}} v_n = w    (1)

is a leftmost derivation of a terminal word consisting of n steps (below the arrows we have given the applied elements of P), then

S ⟹ [i_1]v'_1 ⟹ [i_1][i_2]v'_2 ⟹ ... ⟹ [i_1][i_2]...[i_{n-1}]v'_{n-1} ⟹ [i_1][i_2]...[i_{n-1}][i_n]    (2)

is a terminating derivation in G'. Further, by the assumption of unambiguity, there is no other leftmost derivation of w. Thus any derivation of w in G differs from (1) only in the order in which the rules are applied. Hence any derivation of w corresponds to a derivation of type (2) and produces [i_1][i_2]...[i_n]. On the other hand, any word [i_1][i_2]...[i_n] ∈ L(G') of length n is associated with the word w ∈ L(G) which is derived by the leftmost derivation where the productions p_{i_1}, p_{i_2}, ..., p_{i_n} are applied in succession. This implies a one-to-one correspondence between the set of words w which can be generated in G in n steps and the set of words of length n which can be generated in G', i.e.

#({ w | w ∈ L(G'), |w| = n }) = #({ w | S ⟹^n w, w ∈ T* }).


Therefore s_{L(G')} = d_G. □

From the first two statements of the preceding theorem it follows that the sets of differentiation functions of unambiguous context-free grammars and of structure functions of context-free languages coincide.

We close this section with two (un)decidability results.

Theorem 12 For two given linear grammars, it is undecidable whether or not their differentiation functions are equal. □

Theorem 13 Given a context-free grammar G and a natural number c ≥ 1, it is decidable whether or not the differentiation function of G is bounded by the constant c, i.e., d_G(n) ≤ c for n ≥ 1.

Proof. Let G = (N, T, P, S) be a context-free grammar. We introduce a new symbol x ∉ N ∪ T, and define the context-free grammar H = (N, T ∪ {x}, P', S) with the set of rules P' = { A → αx : A → α ∈ P }. Obviously, a word w ∈ T* can be derived in G in n steps iff H generates a word w' with π_T(w') = w and π_{{x}}(w') = x^n, where π_Y denotes the projection morphism onto the alphabet Y.

Similarly to the proof of the decidability of k-slenderness for matrix languages (see [10]), one shows that the language

L^{[≥k]}(H) = { w_1# w_2# ... w_k# : π_T(w_i) ∈ L(G) (1 ≤ i ≤ k), π_T(w_i) ≠ π_T(w_j), π_{{x}}(w_i) = π_{{x}}(w_j) (1 ≤ i < j ≤ k) }

is a matrix language, and a matrix grammar generating L^{[≥k]}(H) can be constructed effectively. Clearly, the differentiation function of G is bounded by the constant c iff L^{[≥c+1]}(H) is empty, which is decidable. □

It is an open problem whether or not Theorem 13 holds for DTOL systems and evolutionary grammars, too. Moreover, for all devices the decidability status of the question whether or not the differentiation function is bounded at all is not known.

The most interesting open problem in the area of differentiation functions is the characterization of the set of differentiation functions of the devices of a given type. Here we have presented only upper and lower bounds and some special classes of functions which occur as differentiation functions (see Corollary 3, Theorems 4 and 11 i)).

References

[1] A. BERTONI, M. GOLDWURM and N. SABADINI, The complexity of computing the number of strings of a given length in context-free languages. Theor. Comp. Sci. 86 (1991) 325-342.

[2] J. DASSOW, Eine neue Funktion für Lindenmayer-Systeme. EIK 12 (1976) 515-521.

[3] J. DASSOW, Numerical parameters of evolutionary grammars. In: J. KARHUMÄKI, H. MAURER, GH. PĂUN and G. ROZENBERG (eds.), Jewels are Forever, Springer-Verlag, Berlin, 1999, 171-181.

[4] J. DASSOW, V. MITRANA, GH. PĂUN and R. STIEBE, On functions and languages associated with context-free grammars. Submitted.

[5] G. HERMAN and G. ROZENBERG, Developmental Systems and Languages. North-Holland, Amsterdam, 1975.

[6] T. KATAYAMA, M. OKAMOTO and H. ENOMOTO, Characterization of structure-generating functions of regular sets and the DOL systems. Inform. Control 36 (1978) 85-101.

[7] W. KUICH and R.K. SHYAMASUNDAR, The structure generating function of some families of languages. Inform. Control 32 (1976) 85-92.

[8] G. ROZENBERG and A. SALOMAA, The Mathematical Theory of L Systems. Academic Press, 1980.

[9] A. SALOMAA and M. SOITTOLA, Automata-Theoretic Aspects of Formal Power Series. Springer-Verlag, 1978.

[10] R. STIEBE, Slender matrix languages. In: G. ROZENBERG and W. THOMAS (eds.), Developments in Language Theory, World Scientific, Singapore, 2000, 375-385.


Visualization of Cellular Automata

Mária Demény, Géza Horváth, Csaba Nagylaki and Zoltán Nagylaki

1. Abstract

A cellular automaton is a special sort of automaton, heavily studied in automata theory [2]. A cellular automaton consists of automata placed next to each other, in a line, in a grid, or even in higher dimensional arrangements. These automata differ from the other well-known basic automata in two respects: firstly, they have no tape; secondly, the transition function is different, in that the new state of an automaton is determined by its current state and the current states of its direct neighbouring automata. These finite state machines work synchronously.

In this paper we show a visualization of cellular automata in a three dimensional environment. We represent the automata with their properties. We discuss the aspects of designing and developing an engine which simulates the work of the automata. This engine applies the automata's transition rules in each step. We analyze how the engine works and how the simulation and visualization are accomplished.

The software requires as input the number and form of the cellular automata, their initial states and the transition rules. The output consists of the states the automata passed through during the execution. Every state can be assigned a shape of any colour and size supported by the three-dimensional environment. For any cellular automaton, the most appropriate assignment can be chosen for a better understanding of the result of the run.

This work was supported by the Hungarian National Science Foundation (Grant Nos. T019392 and T030140).


2. Introduction

2.1. Cellular Automata

A cellular automaton is a discrete dynamical system. Space, time, and the states of the system are discrete. Each point in a regular spatial lattice, called a cell, can have any one of a finite number of states. The states of the cells in the lattice are updated according to a local rule. That is, the state of a cell at a given time depends only on its own state one time step previously, and the states of its nearby neighbours at the previous time step. All cells on the lattice are updated synchronously. Thus the state of the entire lattice advances in discrete time steps.

To define a cellular automaton precisely, we have to give the following parameters of the automata:
- Dimension. We will examine the one, two and three dimensional cases.
- Size of the automata. In the one dimensional case it is the number of cells in the line of automata. In the two dimensional case it is the width and length of the grid of automata. In the three dimensional case it is the width, length and height of the automata.
- Set of states. The set of states is usually an alphabet, but sometimes this alphabet can be a set of numbers.
- Set of rules. Rules are transition functions, which give one state for each combination of the states of a cell and its nearby neighbours. They are usually specified in the form of a rule table.
- Initial state. We have to give the initial state for each cell of the automata.

2.2. Life

"Life" originally began as an experiment to determine if a simple system of rules could create a universal computer [1] [5] [6]. The concept "universal computer" was invented by Alan Turing and denotes a machine that is capable of emulating any kind of information processing by implementing a small set of simple operations. The inventor of Life, John Conway, sought to create as simple a "universe" as possible that was capable of computation. What he found after two years of experimentation was a system consisting of a rectangular grid where each square could be in one of two states: on or off. He thought of the squares as cells, alive or dead. The rules of the system are


very simple: a cell survives if it has two or three living neighbours, and a new cell is created on a "dead" square if it has exactly three living neighbours. It is an example of a two dimensional cellular automaton. In 1982 Stephen Wolfram set out to create an even simpler, one-dimensional system. The main advantage of a one-dimensional automaton is that changes over time can be illustrated in a single, two-dimensional image and that each cell only has two neighbours.
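The two rules of Life fit in a few lines of code. The following sketch is ours, not the paper's JavaScript/VRML engine; it represents a configuration simply as the set of coordinates of the living cells:

```python
from collections import Counter

def life_step(alive):
    """One synchronous generation of Conway's Life on an unbounded grid.
    `alive` is a set of (x, y) pairs of living cells."""
    # count, for every square, how many living neighbours it has
    counts = Counter((x + dx, y + dy)
                     for (x, y) in alive
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0))
    # a cell survives with 2 or 3 living neighbours; a cell is born with exactly 3
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in alive)}
```

For example, the "blinker" {(0,0), (1,0), (2,0)} oscillates with period two under life_step.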

3. Visualization of Cellular Automata

To examine how cellular automata work, it is very important to see them in action. To visualize the automata's states we have two possibilities, described in the following. We created a universal viewer which can show the states of all automata, whether one, two or three dimensional, while they are working, in these two ways.

3.1. Higher dimensional viewing

As we have seen in Wolfram's one dimensional system, the history of a one dimensional automaton can be shown as a two dimensional picture. In this case the automaton is a line of cells, and we draw each line under the previous one as the automaton's states change. When we have two dimensional automata, we have to draw the two dimensional grids next to each other, and finally we receive a three dimensional image. If we use this representation method then we can see all the previous states of the automata, and finally these states together form a higher dimensional still image.
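This stacking of successive generations can be sketched for Wolfram's one-dimensional automata as follows (an illustrative Python fragment, not the engine itself; rule numbering is Wolfram's, and the cells outside the line are assumed dead):

```python
def eca_history(rule, cells, steps):
    """History of a 1-D elementary cellular automaton: row t of the
    result is the configuration after t steps, so the whole history
    forms a two-dimensional picture."""
    rows = [cells]
    for _ in range(steps):
        padded = [0] + cells + [0]            # dead border cells
        # neighbourhood (left, centre, right) indexes a bit of `rule`
        cells = [(rule >> (4 * padded[i] + 2 * padded[i + 1] + padded[i + 2])) & 1
                 for i in range(len(cells))]
        rows.append(cells)
    return rows
```

For instance, eca_history(90, [0, 0, 1, 0, 0], 2) returns [[0,0,1,0,0], [0,1,0,1,0], [1,0,0,0,1]], the beginning of the well-known Sierpinski-triangle pattern of rule 90.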

3.2. Same dimensional viewing

For investigating how Conway's universe works, we have to show his two dimensional automata in two dimensions, and only the latest states at each time. In this case we receive a two dimensional moving picture, but we can see only the latest states of the automata. Similarly, we can see a one dimensional moving image (a line) in the one dimensional case, and a three dimensional moving image in the three dimensional case.


4. The engine

Our engine is implemented in a VRML environment. VRML is a very popular three dimensional Virtual Reality Modeling Language for Internet users [3] [4]. This language has the indispensable means for representing a real 3D environment. It includes shapes, geometry, textures, lights, sound sources, transformations, interpolators, etc. These nodes can be grouped, embedded and linked to each other to form sophisticated objects. This makes it possible to create an arbitrary static 3D world. Additionally, it contains objects for representing the dynamic aspects of the world: translators, sensors, timers and an event processing mechanism, so that 'movement' and 'change' of the world can be realized. VRML as a universal 3D environment supports all those requirements which are posed by the problem of visualization of cellular automata, even in the three dimensional case. Unfortunately, VRML lacks some basic, important programming facilities, such as variables, functions, etc., but fortunately we can use JavaScript functions in our VRML program, so we have these means, which are important for computing the states of the automata.

The main concept of our program is to give a universal viewer. The program contains the following main parts:
- The computing function, which is given in JavaScript; it computes the states of the automata by applying the rules and gives the results to the visualization part.
- The visualization part, which is written in VRML; it shows the automata's work in the three dimensional environment.

4.1. Input parameters

There are declaration parts in the program; these contain all the data of the automata and the information for the visualization too. These data are the input parameters of the program.

- The states of the automata. The states of the automata can be natural numbers. They are listed together with the shapes which represent the states in the visualization part, therefore this input will be described in the visualization part.
- The dimension of the automata. It is a number, which can be 1, 2 or 3. This value must be assigned to the variable "dim".
- The size of the automata. These are three natural numbers; we have to give the width, length and height of the automaton in the variables "x_num", "y_num" and "z_num".

The declaration part for the automata consists of the following:


- The initial states of the automata. We have to give the initial state of each cell in the "ca" variable. The "ca" variable is an array; we must list the states in the order of Z, Y and X increasingly.
- The rule table. We have to give the number of rules in the variable "rule_num", and the rules of the automata in the variable "rule". Rules contain the states of cells, and must be given in the following form:

  center Xleft Xright Yleft Yright Zleft Zright newstate

Naturally, in the two dimensional case the automata have no neighbours in the Z direction, thus we do not give the Z neighbours' states. Similarly, in the one dimensional case we do not specify the Y neighbours' states either. There are two values with special roles. Firstly, we can use -2 instead of some input state. The -2 means that this state should not be considered; it is a universal one. When applying a rule to an automaton, the state of the automaton corresponding to a -2 valued state of the rule is not compared; that automaton can be in any state and the rule can still be applied. Secondly, the automata are expanded with one cell in each direction. All the cells in this frame have the state -1 permanently. Therefore, in those rules which are intended to be applied to the automata on the border, -1 is supposed to be used in the external direction.
- The number of steps. We have to give it in the variable "steps_num"; it specifies how many times the rules are applied.

These input parameters must be given to compute the states of the cells of the automaton.
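For the one dimensional case, the rule format above, including the -2 wildcard and the -1 border frame, can be sketched like this (an illustration in Python rather than the engine's JavaScript; the helper names and the first-match-wins convention are our assumptions):

```python
BORDER = -1      # permanent state of the frame cells around the automata
ANY = -2         # wildcard: this position of the rule matches any state

def matches(pattern, state):
    return pattern == ANY or pattern == state

def apply_rules_1d(cells, rules):
    """One synchronous update; each rule is (center, x_left, x_right, new_state),
    the 1-D instance of the 'center Xleft Xright ... newstate' format."""
    framed = [BORDER] + list(cells) + [BORDER]
    out = list(cells)
    for i in range(1, len(framed) - 1):
        c, l, r = framed[i], framed[i - 1], framed[i + 1]
        for center, xl, xr, new in rules:
            if matches(center, c) and matches(xl, l) and matches(xr, r):
                out[i - 1] = new
                break                         # first matching rule wins
    return out
```

With rules = [(ANY, 1, ANY, 1), (ANY, ANY, ANY, 0)] a cell becomes 1 exactly if its left neighbour was 1, so a single 1 travels to the right: [1,0,0] -> [0,1,0] -> [0,0,1].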

The declaration part for the visualization contains:
- The cells' shapes. This is a list of strings; "cell_shape" is the variable's name. Each string is a valid VRML source of a shape. These are the shapes which can be connected to the states of the automata; this is how a cell is visualized, the specification of how it appears on the screen. There are some predefined shapes; naturally, defining new shapes is also possible, but it requires basic knowledge of VRML. Each shape's size is supposed to fit in the unit cube. If the shape's X, Y or Z size is bigger than one, then the "cell_size_x", "cell_size_y" and "cell_size_z" variables must be set accordingly to form a bounding box around the shape. Since these shapes are specified in VRML, arbitrary VRML objects can be used.
- The connection table, which assigns one shape to each state of the automata. The variable is called "connect", and contains pairs of numbers, where the first one is the state and the second one is the index into the "cell_shape" list. Naturally, every state must be listed, but the same shape can be assigned to different states. The -1 value has a special role: it is the void shape. If a state is connected to -1 then this state will not be visualized by a shape, namely its space remains empty.
- The number of these number pairs in "connect" has to be assigned to the "connect_num" variable.

The declaration part for the technical input parameters contains:
- In the case of two dimensional automata we have to give x_num zeros to the "old_row" variable.
- In the case of three dimensional automata we have to give x_num*y_num zeros to the "old_plane" variable too.

These parameters are fields of the "SCRIPT" node, which contains the JavaScript part of the engine too.

4.2. Dashboard

For controlling the visualization a dashboard is created. The execution and visualization of the automata is conducted by the dashboard. - Gap. It specifies the X, Y and Z space between the cells. - Step. In the case of higher dimensional viewing, we can set the X, Y and Z space between the new and the previous shapes of states of cells. - Moving / Staying switch button. It switches the type of viewing. Moving is Higher dimensional viewing Staying is Same dimensional viewing - Continuous / Step by step switch button. There are two modes of play, this button switches between them. In the continuous play the automata's work is played as an animation, the shapes of new states are displayed after each other as the time passes. In step b y step mode we can see each frame of the animation individually. - Play button. It starts the animation. In the case of continuous play, a real animation starts, in the case of step by step play the rules are applied and the shapes of new states are displayed. - Stop button. It has effect during continuous play only, it stops the anima- tion. - Speed. It also has effect during continuous play only it specifies the delay in seconds for displaying the next shapes of states of automata, namely the next frame of animation. - CA Step. It specifies how often the program visualize the states of automa- ton. Firstly, several states can be skipped. Secondly, it makes possible to visualize the states of automata after given steps every time. This parameter takes effect in both play modes. - Follow mode On / Off switch button. It has effect in the case of higher di- mensional viewing. If it is set on then the shapes of new states are centered on the screen. Namely, the viewer slides paralelly to the shapes of states in

Page 193: Words, Languages & Combinatorics III

168

each step. - Redraw. The last visualized states are redrawn. It is useful a t the begining, when we set the gap. - Board Off. Removes the dashboard from the screen. It increases the screen’s region for viewing and reduces the load of the VRML-browser. - Board On. Put the dashboard to the screen.

5. Plans

- In our program the parameters are specified in the program file. As an enhancement, the input could be read from files. Several file formats for specifying automata rules and states are in use; adding filters to the program for these formats is planned.
- Sometimes the transition function is not specified by a rule table but by formulas; supporting such specifications is also planned.
- The automata's states are visualized in the forward direction. Moving in the backward direction, namely the re-visualization of previous steps, can be made possible. In the case of reversible automata this feature can be added relatively easily.
- Support for partial cellular automata is also planned.

6. References

1. ftp://alife.santafe.edu/pub/topics/cas/txt/general.txt
2. Andreas Ehrencrona's Cellular automata homepage, http://cgi.student.nada.kth.se/cgi-bin/d95-aeh/get/lifeeng
3. Carey, R., Bell, G., Marrin, C.: ISO/IEC 14772-1:1997, Virtual Reality Modeling Language (VRML97), San Diego Supercomputing (SDSC), 1997. http://www.vrml.org/Specifications/VRML97
4. San Diego Supercomputing Group, The Virtual Reality Modeling Language Version 2.0, ISO/IEC CD 14772, August 4, 1996. http://vrml.sgi.com/moving-worlds/spec/part1/
5. Morita, K.: Cellular automata and artificial life - Computation and life in reversible cellular automata -, Proc. of the 6th Summer School on Complex Systems, Santiago, 1-40, 1998.
6. Morita, K. and Harao, M.: Computation universality of one dimensional reversible (injective) cellular automata, Trans. IEICE Japan, E72, 758-762, 1989.


Picture 1. The dashboard

Picture 2. One dimensional automaton, higher dimensional viewing


Picture 3. and Picture 4. Two dimensional automata, higher dimensional viewing


ON A CLASS OF HYPERCODES

by Do Long Van¹

Institute of Mathematics P.O.Box 631 Bo Ho, 10 000 Hanoi, Vietnam

Abstract. In this note we consider a special class of hypercodes whose elements are called supercodes. Characterizations of supercodes, maximal supercodes are established. Embedding a supercode in a maximal supercode is considered.

1. Preliminaries. Let A throughout denote an alphabet, i.e. a non-empty finite set of symbols called letters. We denote by A* the free monoid generated by A, whose elements are called words over A. The empty word is denoted by 1 and A+ = A* − {1}. The number of all occurrences of letters in a word u is the length of u, denoted by |u|. Any set of words is a language. A non-empty language X is a code if for any positive integers n, m ≥ 1 and for any x_1, ..., x_n, y_1, ..., y_m ∈ X,

x_1 ... x_n = y_1 ... y_m ⟹ n = m and x_i = y_i for all i.

For further details and background of the theory of codes we refer to [BP, S]. Let u, v ∈ A*. We say that u is a subword of v if, for some n ≥ 1, u = u_1...u_n, v = x_0 u_1 x_1 ... u_n x_n with u_1, ..., u_n, x_0, x_1, ..., x_n ∈ A*. If x_0 x_1 ... x_n ≠ 1 then u is called a proper subword of v. A subset X ⊆ A+ is a hypercode if no word in X is a proper subword of another word in it. Hypercodes have been considered by many authors [T1, ST, Va1, T2, S], and they have some interesting properties; in particular one has

Proposition 1.1 (see [S]) Every hypercode is finite.

As has been observed by several authors, many classes of codes can be defined by a binary relation (see [IJST, S]). Given a binary relation ≺ on A*, a subset X ⊆ A* is an independent set w.r.t. the relation ≺ if no two elements of X are in this relation. We say that a class C of codes is defined by ≺ if these codes are exactly the independent sets w.r.t. ≺. Then we denote the class C by C_≺. Very often, the relation ≺ characterizes some property α of words. In this case, instead of ≺ we write ≺_α, and also C_α

¹e-mail: dlvan@thevinh.ncst.ac.vn


stands for C_{≺_α}. It is obvious that the class C_h of hypercodes is defined by the relation ≺_h given by

u ≺_h v ⟺ ∃n ≥ 1 : u = u_1 u_2 ... u_n ∧ v = x_0 u_1 x_1 u_2 ... u_n x_n with x_0 x_1 ... x_n ≠ 1.

Let ≺ be a binary relation on A* and u, v ∈ A*. We say that u depends on v if either u ≺ v or v ≺ u holds. Otherwise, u is independent of v. These notions can be extended to subsets of words in a standard way. Namely, a word u is dependent on a subset X if it depends on some word in X. Otherwise, u is independent of X. For brevity, the following notations will be used in the sequel:

u ≺ X ⟺ ∃v ∈ X : u ≺ v ;    X ≺ u ⟺ ∃v ∈ X : v ≺ u.

Next, an element u of X is minimal in X if there is no word v in X such that v ≺ u. When X is finite, by max X we denote the maximal length of words in X.

Now, for every subset X ⊆ A* we denote by D_X, I_X, L_X and R_X the sets of words dependent on X, independent of X, non-minimal in I_X and minimal in I_X, respectively. In notation:

D_X = { u ∈ A* | u ≺ X ∨ X ≺ u } ;

I_X = A* − D_X ;

L_X = { u ∈ I_X | I_X ≺ u } ;

R_X = I_X − L_X.

When there is no risk of confusion, for brevity we write simply D, I, L, R instead.

The relation ≺ is said to be length-increasing if for any u, v ∈ A*, u ≺ v implies |u| < |v|. We denote by ≼ the reflexive closure of ≺, i.e. for any u, v ∈ A*, u ≼ v if and only if u ≺ v or u = v.

The following result has been proved in [Van]

Proposition 1.2 Let ≺ be a length-increasing transitive binary relation on A* which defines the class C_≺ of codes. Then, for any code X in C_≺, we have

(i) R_X is a maximal code in C_≺ which contains X.

(ii) R_X and I_X are regular iff so are D_X and L_X.

(iii) If moreover the relation ≺ satisfies the condition

(*) ∃k ≥ 1 ∀u, v ∈ A+ : (|v| ≥ |u| + k) ∧ (u ≺ v) ⟹ ∃w : (|w| ≥ |u|) ∧ (u ≺ w) ∧ (w ≺ v),

then the finiteness of X implies the finiteness of R_X, and

max R_X ≤ max X + k − 1.


2. Supercodes. Given u, v ∈ A*, u is called a permusubword (a proper permusubword) of v if u is a subword (a proper subword, resp.) of a permutation of v.

Definition 2.1 A non-empty subset X of A+ is called a supercode if no word in X is a proper permusubword of another word in X . In other words, the class C,, of supercodes over A is defined by the relation +sp on A* given by

u ≺_sp v ⟺ ∃v' ∈ π(v) : u ≺_h v',

where π(v) denotes the set of all permutations of v.

Thus, every supercode is a hypercode. The converse is not true, as shown in the following example.

Example 2.2 (1) Every uniform code over A is a supercode over A;
(2) Consider the language X = {ab², ba³b} over A = {a, b}. Since ab² is not a proper subword of ba³b, X is a hypercode. But X is not a supercode because ab² is a proper subword of a³b², a permutation of ba³b;
(3) The language X = {a²b, ab³} is a supercode because a²b is not a proper subword of any word in π(ab³) = {ab³, bab², b²ab, b³a}.

Given u, v ∈ A*, we call v a superword (a proper superword) of u if u is a subword (a proper subword, resp.) of v. Next, v is a permusuperword (a proper permusuperword) of u if there exists u' ∈ π(u) such that u' is a subword (a proper subword, resp.) of v. Let A = {a_1, a_2, ..., a_k}. For every word u over A we denote by p(u) the Parikh vector of u, namely

p(u) = (|u|_{a_1}, |u|_{a_2}, ..., |u|_{a_k}),

where |u|_{a_i} denotes the number of occurrences of a_i in u. The following fact is useful in the sequel.

Fact 2.3 For any u, v ∈ A⁺, the following conditions are equivalent:
(i) u is a permusubword (a proper permusubword, resp.) of v;
(ii) v is a permusuperword (a proper permusuperword, resp.) of u;
(iii) p(u) ≤ p(v) (p(u) < p(v), resp.).

Proof. (i) ⟺ (iii): Let u be a permusubword of v. By definition, there exists v' ∈ π(v) such that u is a subword of v'. Then we have p(u) ≤ p(v') = p(v). Conversely, let p(u) ≤ p(v). We shall prove by induction on |v| that u is a permusubword of v. If |v| = 1 then u = v ∈ A, and the assertion is trivial. Let |v| = n + 1 and suppose that the assertion is true for all v' with |v'| = n. If p(u) = p(v) then u is a permutation, and therefore a permusubword, of v. Let now p(u) < p(v). There exists then a ∈ A such that |u|_a < |v|_a. Then there is v' ∈ π(v) such that v' = v''a. We have p(u) ≤ p(v''). By the induction hypothesis, u is a permusubword of v''. Hence u is a permusubword of v' and therefore of v.
(ii) ⟺ (iii): The argument is similar. □

Let A = {a_1, a_2, ..., a_k}. Denote by V^k the set of all k-vectors of non-negative integers. For any subset X ⊆ A* we denote by p(X) the set of all Parikh vectors of the words in X, p(X) = {p ∈ V^k | p = p(u) for some u ∈ X}. Also we put π(X) = ⋃_{u ∈ X} π(u).

Proposition 2.4 For any non-empty subset X ⊆ A⁺ the following assertions are equivalent:

(i) X is a supercode;
(ii) π(X) is a supercode;
(iii) p(X) is an independent set w.r.t. the relation < on V^k.

Proof. (i) ⟺ (iii): By definition, X is a supercode iff it is an independent set w.r.t. the relation ≺_sp. The latter is equivalent to the fact that ∀u, v ∈ X : p(u) ≮ p(v), which in turn is equivalent to the fact that p(X) is an independent set w.r.t. the relation < on V^k.
(iii) ⟹ (ii): Let p(X) be an independent set w.r.t. <. Since p(X) = p(π(X)), by the above, π(X) is a supercode.
(ii) ⟹ (i): Evident. □
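Condition (iii) of Proposition 2.4 is directly checkable by machine. The following Python sketch (ours, not part of the paper) tests the supercode property via Parikh vectors, using Fact 2.3: u is a proper permusubword of v iff p(u) < p(v) componentwise.

```python
def parikh(word, alphabet):
    """Parikh vector p(u): the number of occurrences of each letter."""
    return tuple(word.count(a) for a in alphabet)

def strictly_less(p, q):
    """p < q on V^k: p <= q componentwise and p != q."""
    return p != q and all(x <= y for x, y in zip(p, q))

def is_supercode(words, alphabet):
    """X is a supercode iff p(X) is independent w.r.t. < (Proposition 2.4)."""
    vecs = [parikh(w, alphabet) for w in words]
    return not any(strictly_less(p, q) or strictly_less(q, p)
                   for i, p in enumerate(vecs) for q in vecs[i + 1:])

A = ("a", "b")
print(is_supercode(["aab", "abbb"], A))   # Example 2.2(3): True
print(is_supercode(["abb", "baaab"], A))  # Example 2.2(2): False
```

On the languages of Example 2.2 this reproduces the claims made there: p(a²b) = (2,1) and p(ab³) = (1,3) are incomparable, while p(ab²) = (1,2) < p(ba³b) = (3,2).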

3. Maximal supercodes. A supercode X over A is said to be maximal if X is not properly included in any supercode over A.

Proposition 3.1 For any subset X ⊆ A⁺:

(i) If X is a maximal supercode then π(X) = X;
(ii) If X is a maximal supercode then p(X) is a maximal independent set w.r.t. < on V^k. Conversely, if p(X) is a maximal independent set w.r.t. < and π(X) = X, then X is a maximal supercode;
(iii) Every maximal supercode is a maximal hypercode.

Proof. (i) Let X be a maximal supercode. If π(X) ≠ X then, by Proposition 2.4, π(X) is a supercode containing X strictly, a contradiction with the maximality of X.
(ii) Let X be a maximal supercode. By Proposition 2.4, p(X) is an independent set w.r.t. <. If it is not maximal then there exists p ∉ p(X) such that p(X) ∪ {p} is still an independent set w.r.t. <. Choose u to be any word with p(u) = p (such a word always exists). Then p(X ∪ {u}) = p(X) ∪ {p}. Again by Proposition 2.4 this implies that X ∪ {u} is still a supercode, a contradiction with the maximality of the supercode X. Conversely, let p(X) be a maximal independent set w.r.t. < on V^k and π(X) = X. By Proposition 2.4, X is a supercode. Suppose X is not a maximal supercode. Then there exists a word u not in X, and therefore not in π(X), such that X ∪ {u} is still a supercode. Because u ∉ π(X), p = p(u) is not in p(X). Again by Proposition 2.4, p(X ∪ {u}) = p(X) ∪ {p} is still an independent set w.r.t. <, a contradiction.
(iii) Let X be a maximal supercode which is not a maximal hypercode. Then there is a word u not in X such that X ∪ {u} is still a hypercode. By (i), π(X) = X. Thus Y = π(X) ∪ {u} is a hypercode. If Y is not a supercode then either p(u) < p(v) or p(v) < p(u) for some v ∈ π(X). By Fact 2.3, u must be a proper permusubword or a proper permusuperword of v. This means that there exists v' ∈ π(v) such that u is either a proper subword or a proper superword of v'. But v' is in Y too, which contradicts the fact that Y is a hypercode. Hence Y, and therefore the set X ∪ {u}, must be a supercode, a contradiction. Thus X is a maximal hypercode, as was required to prove. □

The converse of (i) in Proposition 3.1 is not true. In other words, the condition π(X) = X is not sufficient for a supercode X to be maximal.

Example 3.2 Let X = {ab², bab, b²a, b⁴}. Clearly p(X) = {(1,2), (0,4)}, which is an independent set w.r.t. <. By Proposition 2.4, X is a supercode. Obviously, π(X) = X. But X is not a maximal supercode because X ∪ {a²}, for example, is also a supercode.

There exist maximal hypercodes which are not supercodes.

Example 3.3 Consider the language X = {a², ab, b²a, b³}. It is easy to see that no word in X is a proper subword of another word in it. So X is a hypercode. Moreover, it is a maximal hypercode. Indeed, let u be a non-empty word not in X, and let Y = X ∪ {u}. If |u|_a ≥ 2 then a² is a proper subword of u, i.e. Y is no longer a hypercode. Similarly, if |u|_b ≥ 3 then b³ is a proper subword of u and Y is no longer a hypercode either. Let now |u|_a < 2 and |u|_b < 3. If |u|_a = 0 or |u|_b = 0 then u is a proper subword of b³ or of a², respectively, which implies that Y is not a hypercode. It remains only the case 0 < |u|_a < 2 and 0 < |u|_b < 3, i.e. |u|_a = 1 and |u|_b = 1, 2. This implies u ∈ π(ab) or u ∈ π(ab²). Because u ∉ X, it follows that u ∈ {ba, ab², bab}. But ba is a proper subword of b²a, while ab is a proper subword of ab² and of bab. Thus, in all cases Y is not a hypercode. Hence X is a maximal hypercode. The code X is not a supercode because the word ab in X is a proper permusubword of the word b²a in X.
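Example 3.3 can also be checked mechanically. The sketch below is ours, with "subword" taken in the scattered (subsequence) sense used for hypercodes; it verifies that X = {a², ab, b²a, b³} is a hypercode while ab is a proper permusubword of b²a, so X is not a supercode.

```python
def is_subword(u, v):
    """u is a (scattered) subword of v: u arises by deleting letters of v."""
    it = iter(v)
    return all(c in it for c in u)

def is_hypercode(words):
    """No word of X is a proper subword of another word of X."""
    return not any(u != v and len(u) < len(v) and is_subword(u, v)
                   for u in words for v in words)

def proper_permusubword(u, v):
    """Fact 2.3 over A = {a, b}: u proper permusubword of v iff p(u) < p(v)."""
    pu = (u.count("a"), u.count("b"))
    pv = (v.count("a"), v.count("b"))
    return pu != pv and all(x <= y for x, y in zip(pu, pv))

X = ["aa", "ab", "bba", "bbb"]
print(is_hypercode(X))                   # True
print(proper_permusubword("ab", "bba"))  # True: hence X is not a supercode
```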

We have however

Page 201: Words, Languages & Combinatorics III

176

Proposition 3.4 A maximal hypercode X is a maximal supercode iff π(X) = X.

Proof. (⟹) Obvious by Proposition 3.1(i). (⟸) Let X be a maximal hypercode with π(X) = X. Being a hypercode, no word in X is a proper subword of another word in X. Moreover, since π(X) = X, no word in X can be a proper permusubword of another word in X, i.e. X is a supercode. The maximality of X as a supercode is then evident. □

For any set X we denote by P(X) the family of all subsets of X. Recall that a substitution is a mapping f from B into P(C*), where B and C are alphabets. When f(b) is a singleton for all b ∈ B it induces a homomorphism from B* into C*. Let # be a new letter not in A. Put A_# = A ∪ {#}. Let us consider the substitutions S_1, S_2 and the homomorphism h defined as follows:

S_1 : A → P(A_#*) with S_1(a) = {a, #} for all a ∈ A;
S_2 : A_# → P(A*) with S_2(#) = A⁺ and S_2(a) = {a} for all a ∈ A;
h : A_#* → A* with h(#) = 1 and h(a) = a for all a ∈ A.

In fact, the substitution S_1 will be used to mark the occurrences of letters to be deleted from a word. The homomorphism h realizes the deletion by replacing # with the empty word. The inverse homomorphism h⁻¹ "chooses" in a word the positions where the words of A⁺ are inserted, while S_2 realizes the insertions by replacing # with A⁺. Denote by A^[n] the set of all the words in A* whose length is less than or equal to n.

Proposition 3.5 For any supercode X over A with maxX = n, there exists a maximal supercode Y containing X with maxY = maxX, which can be computed by the formulas:

Y = Z - π(S_2(h⁻¹(Z) ∩ (A_#*{#}A_#*) ∩ A_#^[n]) ∩ A^[n]),

where

Z = A^[n] - h(S_1(π(X)) ∩ (A_#*{#}A_#*)) - π(S_2(h⁻¹(X) ∩ (A_#*{#}A_#*) ∩ A_#^[n]) ∩ A^[n]).

Proof. By Fact 2.3, u ≺_sp v ⟺ p(u) < p(v), which shows that ≺_sp is a length-increasing transitive relation on A*. Then, by Proposition 1.2(i), R_X is a maximal supercode containing X. Now we compute R_X step by step:

{u ∈ A* | u ≺_sp X} = {u ∈ A* | u ≺_h π(X)} = h(S_1(π(X)) ∩ (A_#*{#}A_#*));
{u ∈ A* | X ≺_sp u} = π({u ∈ A* | X ≺_h u}) = π(S_2(h⁻¹(X) ∩ (A_#*{#}A_#*)));

D_X = {u ∈ A* | u ≺_sp X ∨ X ≺_sp u}
    = h(S_1(π(X)) ∩ (A_#*{#}A_#*)) ∪ π(S_2(h⁻¹(X) ∩ (A_#*{#}A_#*)));

I_X = A* - D_X
    = A* - h(S_1(π(X)) ∩ (A_#*{#}A_#*)) - π(S_2(h⁻¹(X) ∩ (A_#*{#}A_#*)));

L_X = {u ∈ I_X | I_X ≺_sp u} = π(S_2(h⁻¹(I_X) ∩ (A_#*{#}A_#*)));

R_X = I_X - L_X = I_X - π(S_2(h⁻¹(I_X) ∩ (A_#*{#}A_#*))).

It is easy to see that ≺_sp satisfies the condition (*) in Proposition 1.2 with k = 1. Therefore maxR_X = maxX = n. So the expressions of R_X and I_X established above may be restricted to A^[n] and A_#^[n] instead of A* and A_#*, respectively. Setting Z = I_X and Y = R_X we obtain the formulas required for Y and Z. □

Example 3.6 Let us consider the supercode X = {a², ab²} over the alphabet A = {a, b}. Since maxX = 3, we may compute Y by the formulas in Proposition 3.5 with n = 3. We shall do it now step by step.

π(X) = {a², ab², bab, b²a};
S_1(π(X)) ∩ (A_#*{#}A_#*) = {#a, a#, ##} ∪ {#b², a#b, ab#, ##b, #b#, a##, ###} ∪ {#ab, b#b, ba#, ##b, #a#, b##, ###} ∪ {#ba, b#a, b²#, ##a, #b#, b##, ###};
h(S_1(π(X)) ∩ (A_#*{#}A_#*)) = {a, 1, b², ab, b, ba};
Z = A^[3] - h(S_1(π(X)) ∩ (A_#*{#}A_#*)) = {1, a, b, a², ab, ba, b², a³, a²b, aba, ba², ab², bab, b²a, b³} - {1, a, b, ab, ba, b²} = {a², a³, a²b, aba, ba², ab², bab, b²a, b³};
h⁻¹(Z) ∩ (A_#*{#}A_#*) ∩ A_#^[3] = {#a², a#a, a²#};
S_2(h⁻¹(Z) ∩ (A_#*{#}A_#*) ∩ A_#^[3]) ∩ A^[3] = {a³, ba², aba, a²b};
π(S_2(h⁻¹(Z) ∩ (A_#*{#}A_#*) ∩ A_#^[3]) ∩ A^[3]) = {a³, ba², aba, a²b};
Y = Z - π(S_2(h⁻¹(Z) ∩ (A_#*{#}A_#*) ∩ A_#^[3]) ∩ A^[3]) = {a², a³, a²b, aba, ba², ab², bab, b²a, b³} - {a³, ba², aba, a²b} = {a², ab², bab, b²a, b³}.
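As a cross-check of Example 3.6, R_X of Proposition 1.2 can also be computed by brute force over A^[n]; the restriction to A^[n] is valid since ≺_sp satisfies (*) with k = 1. This sketch is ours, while the paper's point is that the closed formulas of Proposition 3.5 avoid such enumeration.

```python
from itertools import product

def parikh(w):
    return (w.count("a"), w.count("b"))

def sp_less(u, v):
    """u ≺_sp v iff p(u) < p(v) componentwise (Fact 2.3)."""
    pu, pv = parikh(u), parikh(v)
    return pu != pv and all(x <= y for x, y in zip(pu, pv))

def max_supercode(X, n):
    """R_X of Proposition 1.2, restricted to A^[n]."""
    words = ["".join(t) for l in range(1, n + 1)
             for t in product("ab", repeat=l)]
    # I_X: the words independent of X
    I = [u for u in words
         if not any(sp_less(u, x) or sp_less(x, u) for x in X)]
    # R_X: the minimal words of I_X w.r.t. ≺_sp
    return sorted(u for u in I if not any(sp_less(v, u) for v in I))

print(max_supercode(["aa", "abb"], 3))
# ['aa', 'abb', 'bab', 'bba', 'bbb'], i.e. Y = {a², ab², bab, b²a, b³}
```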

4. Supercodes over two-letter alphabets. Let us fix a two-letter alphabet A = {a, b}. On V² we introduce the relation ≺_av defined by

u ≺_av v ⟺ p_1(u) > p_1(v) ∧ p_2(u) < p_2(v),

where p_i(u) denotes the i-th component of u. For simplicity, in this section we write ≺ instead of ≺_av. A finite sequence (which may be empty) S: u_1, u_2, ..., u_n of elements of V² is a chain if

(i) u_1 ≺ u_2 ≺ ... ≺ u_n.

The chain S is full if

(ii) ∀i, 1 ≤ i ≤ n - 1, ∄v : u_i ≺ v ≺ u_{i+1}.

If the full chain S satisfies moreover the condition

(iii) p_2(u_1) = p_1(u_n) = 0,

then it is said to be complete. A non-empty finite subset U of V² is complete if it can be arranged to become a complete chain. For 1 ≤ i < j ≤ n we denote by [u_i, u_j] the subsequence u_i, u_{i+1}, ..., u_j of the sequence S.

Proposition 4.1 Let X be a non-empty finite subset of A⁺. If X is a maximal supercode then p(X) is complete. Conversely, if p(X) is complete then π(X) is a maximal supercode.

Proof. Let X be a maximal supercode, |X| = n. By Proposition 3.1(ii), p(X) is a maximal independent set w.r.t. < on V². So, for any different u, v in p(X), p_1(u) ≠ p_1(v) and p_2(u) ≠ p_2(v). Arrange p(X) to become a sequence u_1, u_2, ..., u_n such that p_1(u_1) > p_1(u_2) > ... > p_1(u_n). We must have p_2(u_1) < p_2(u_2) < ... < p_2(u_n). That is, u_1 ≺ u_2 ≺ ... ≺ u_n. If p_2(u_1) ≠ 0 then, choosing u to be any 2-vector with p_1(u) > p_1(u_1) and p_2(u) = 0, the set p(X) ∪ {u} is still an independent set w.r.t. <, a contradiction. Thus p_2(u_1) = 0. Similarly we have p_1(u_n) = 0. Now if there exists u such that u_i ≺ u ≺ u_{i+1} for some i, 1 ≤ i ≤ n - 1, then p(X) ∪ {u} is an independent set w.r.t. <, which contradicts again the maximality of p(X). Thus the sequence u_1, u_2, ..., u_n is a complete chain and, therefore, the set p(X) is complete. Conversely, since, as is easily verified, every complete set is a maximal independent set w.r.t. <, the completeness of p(X) = p(π(X)) implies, again by Proposition 3.1(ii), that π(X) is a maximal supercode. □

Example 4.2 For any n ≥ 1, the sequence

(n, 0), (n-1, 2), ..., (n-i, 2i), ..., (0, 2n)

is obviously a complete chain. Therefore, the set V_n = {(n, 0), (n-1, 2), ..., (0, 2n)} is complete. With n = 3, for example, V_3 = {(3,0), (2,2), (1,4), (0,6)}. By Proposition 3.1(ii) it follows that the set

X = π({a³, a²b², ab⁴, b⁶})
  = {a³, a²b², abab, ab²a, ba²b, baba, b²a², ab⁴, bab³, b²ab², b³ab, b⁴a, b⁶}

is a maximal supercode.
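The chain conditions (i)-(iii) above are easy to test directly. The following sketch is ours; each vector of V² is written as a pair (p_1, p_2).

```python
def is_chain(seq):
    """(i): u1 ≺ u2 ≺ ... ≺ un, where (p, q) ≺ (p', q') iff p > p', q < q'."""
    return all(p > p2 and q < q2 for (p, q), (p2, q2) in zip(seq, seq[1:]))

def is_full(seq):
    """(ii): no vector fits strictly between consecutive elements, which
    for this order means one of the two components moves by exactly 1."""
    return is_chain(seq) and all(p - p2 == 1 or q2 - q == 1
                                 for (p, q), (p2, q2) in zip(seq, seq[1:]))

def is_complete(seq):
    """(iii): a full chain with p2(u1) = 0 and p1(un) = 0."""
    return is_full(seq) and seq[0][1] == 0 and seq[-1][0] == 0

print(is_complete([(3, 0), (2, 2), (1, 4), (0, 6)]))  # V_3 of Example 4.2: True
print(is_complete([(2, 0), (0, 3)]))                  # False: (1, 1) fits between
```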

Page 204: Words, Languages & Combinatorics III

179

By Proposition 4.1, in order to characterize the maximal supercodes over A = {a, b} we may characterize the complete sets instead. For this we first consider some transformations on complete chains. Let S: u_1, u_2, ..., u_n be a complete chain.

(T.1) (extension). It consists in doing consecutively the following:
• Add on the left of S a 2-vector u with p_1(u) > p_1(u_1);
• Delete from S all the u_i's with p_2(u_i) ≤ p_2(u);
• If u_{i_0} is the first among the u_i's remaining, then insert between u and u_{i_0} any chain such that [u, u_{i_0}] is a full chain;
• If there is no such u_{i_0}, then add on the right of u any chain ending with a v, p_1(v) = 0, and such that [u, v] is a full chain;
• Add on the left of u any chain beginning with a v, p_2(v) = 0, and such that [v, u] is a full chain.

(T.2) (replacement). The following steps will be done successively:
• Replace some element u_i in S by an element u with p_1(u) = p_1(u_i);
• If p_2(u) < p_2(u_i), then delete all the u_j's on the left of u with p_2(u_j) ≥ p_2(u);
• If u_{j_0} is the last among the u_j's remaining, then insert between u_{j_0} and u any chain such that [u_{j_0}, u] is a full chain;
• If there is no such u_{j_0}, then add on the left of u any chain beginning with a v, p_2(v) = 0, and such that [v, u] is a full chain;
• If i < n then insert between u and u_{i+1} any chain such that [u, u_{i+1}] is a full chain;
• If p_2(u) > p_2(u_i), then delete all the u_j's on the right of u with p_2(u_j) ≤ p_2(u);
• If u_{j_0} is the first among the u_j's remaining, then insert between u and u_{j_0} any chain such that [u, u_{j_0}] is a full chain;
• If there is no such u_{j_0}, then add on the right of u any chain ending with a v, p_1(v) = 0, and such that [u, v] is a full chain;
• If i > 1 then insert between u_{i-1} and u any chain such that [u_{i-1}, u] is a full chain;
• If i = 1 then add on the left of u any chain beginning with a v, p_2(v) = 0, and such that [v, u] is a full chain.

(T.3) (insertion). This consists of the following successive steps:
• For some i, 1 ≤ i ≤ n - 1, insert between u_i and u_{i+1} an element u with p_1(u_{i+1}) < p_1(u) < p_1(u_i);
• If p_2(u) ≤ p_2(u_i), then delete all the u_j's on the left of u with p_2(u_j) ≥ p_2(u);
• If u_{j_0} is the last among the u_j's remaining, then insert between u_{j_0} and u any chain such that [u_{j_0}, u] is a full chain;
• If there is no such u_{j_0}, then add on the left of u any chain beginning with a v, p_2(v) = 0, and such that [v, u] is a full chain;
• Insert between u and u_{i+1} any chain such that [u, u_{i+1}] is a full chain;
• If p_2(u) ≥ p_2(u_{i+1}), then delete all the u_j's on the right of u with p_2(u_j) ≤ p_2(u);
• If u_{j_0} is the first among the u_j's remaining, then insert between u and u_{j_0} any chain such that [u, u_{j_0}] is a full chain;
• If there is no such u_{j_0}, then add on the right of u any chain ending with a v, p_1(v) = 0, and such that [u, v] is a full chain;
• Insert between u_i and u any chain such that [u_i, u] becomes a full chain.

Proposition 4.3
(i) The transformations (T.1)-(T.3) preserve the completeness of a chain;
(ii) Any complete chain can be obtained from another one by a finite number of applications of the transformations (T.1)-(T.3);
(iii) Every chain S can be embedded in a complete chain by a finite number of applications of the transformations (T.1)-(T.3).

Proof. (i) Easily seen from the definitions of (T.1)-(T.3).
(ii) Let S: u_1, u_2, ..., u_n and S': v_1, v_2, ..., v_m be two complete chains. To obtain S' from S we can proceed as follows. According as p_1(v_1) is greater than, equal to, or less than p_1(u_1), we apply to S the transformation (T.1), (T.2) or (T.3) with u = v_1. In any case we obtain a complete chain commencing with v_1. Suppose S^(k), 1 ≤ k ≤ m - 1, has been constructed, which is a complete chain commencing with v_1, ..., v_k. Let S^(k): v_1, ..., v_k, w_{k+1}, ..., w_s. We construct S^(k+1) as follows. If p_1(v_{k+1}) > p_1(w_{k+1}) then, since p_1(v_{k+1}) < p_1(v_k), we may apply (T.3) to insert v_{k+1} between v_k and w_{k+1}. Because S' is complete, in the chain obtained v_{k+1} must be next to v_k. If p_1(v_{k+1}) = p_1(w_{k+1}), then we may apply (T.2) to replace w_{k+1} by v_{k+1}. Again by the completeness of S', in the chain obtained v_{k+1} must be next to v_k. Let now p_1(v_{k+1}) < p_1(w_{k+1}). There exists then an integer t ≥ 1 such that p_1(w_{k+t+1}) ≤ p_1(v_{k+1}) < p_1(w_{k+t}). If p_2(v_{k+1}) > p_2(w_{k+1}) then it follows that v_k ≺ w_{k+1} ≺ v_{k+1}, a contradiction with the completeness of S'. So we have p_2(v_{k+1}) ≤ p_2(w_{k+1}). According as p_1(w_{k+t+1}) = p_1(v_{k+1}) or p_1(w_{k+t+1}) < p_1(v_{k+1}), we may apply (T.2) or (T.3) to replace w_{k+t+1} by v_{k+1}, or to insert v_{k+1} between w_{k+t} and w_{k+t+1}. Because p_2(v_{k+1}) ≤ p_2(w_{k+1}), w_{k+1} will be deleted and, in the chain obtained, v_{k+1} must be next to v_k. Thus, in any case, the chain obtained is complete and commences with v_1, v_2, ..., v_{k+1}. We take this chain to be S^(k+1). As p_1(v_m) = 0, S^(m) must coincide with S'.
(iii) Given a chain S: v_1, v_2, ..., v_n, choose S' to be any complete chain. Similarly as above, we may apply to S' appropriate transformations (T.1)-(T.3) to "enter" v_1, v_2, ..., v_n consecutively. Notice that entering v_{i+1}, i ≥ 1, does not delete any of v_1, ..., v_i, which have been entered in the previous steps. □

Example 4.4 Consider the chain S: (5,2), (3,4), (1,7). We try to embed S in a complete chain by using (T.1)-(T.3). For this, we choose an arbitrary complete chain S', say S': (2,0), (1,2), (0,4), and manipulate like this:

• Applying (T.1) to S' with u = (5,2) we obtain step by step the following sequences:

(5,2), (0,4);
(5,2), (2,3), (0,4);
(6,0), (5,2), (2,3), (0,4).

• Applying (T.3) to the last chain with u = (3,4) we obtain successively:

(6,0), (5,2), (3,4), (2,3), (0,4);
(6,0), (5,2), (3,4);
(6,0), (5,2), (3,4), (1,5), (0,6);
(6,0), (5,2), (4,3), (3,4), (1,5), (0,6).

• Applying (T.2) to the last chain with u = (1,7) we obtain:

(6,0), (5,2), (4,3), (3,4), (1,7), (0,6);
(6,0), (5,2), (4,3), (3,4), (1,7);
(6,0), (5,2), (4,3), (3,4), (1,7), (0,8);
(6,0), (5,2), (4,3), (3,4), (2,6), (1,7), (0,8).

The last chain is a complete chain containing S.
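A simple greedy variant of the embedding in Proposition 4.3(iii) can also be programmed directly: pad the chain at both ends and fill each gap with unit diagonal steps (p-1, q+1). This sketch is ours and generally produces a different (but equally valid) completion than the transformation sequence above.

```python
def complete_chain(chain):
    """Embed a chain of V^2 (first components strictly decreasing, second
    strictly increasing) into a complete chain, greedily."""
    s = list(chain)
    if s[0][1] != 0:                    # pad on the left so that p2(u1) = 0
        s.insert(0, (s[0][0] + 1, 0))
    if s[-1][0] != 0:                   # pad on the right so that p1(un) = 0
        s.append((0, s[-1][1] + 1))
    out = [s[0]]
    for (p, q), (p2, q2) in zip(s, s[1:]):
        # unit diagonal steps keep every consecutive pair a full chain
        for k in range(1, min(p - p2, q2 - q)):
            out.append((p - k, q + k))
        out.append((p2, q2))
    return out

print(complete_chain([(5, 2), (3, 4), (1, 7)]))   # the chain S of Example 4.4
# [(6, 0), (5, 2), (4, 3), (3, 4), (2, 5), (1, 7), (0, 8)]
```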

As a consequence of Proposition 4.3 we have

Proposition 4.5 Let A be a two-letter alphabet.
(i) There exists a procedure to generate all the maximal supercodes over A starting from an arbitrary given maximal supercode;
(ii) There is an algorithm allowing to construct, for every supercode X over A, a maximal supercode Y containing X.

Proof. (i) Let X be a given maximal supercode. Compute first p(X), which is a complete set. Arrange p(X) to become a complete chain S. By Proposition 4.3(ii), every possible complete chain, hence every complete set, can be obtained from S by a finite number of applications of the transformations (T.1)-(T.3). The inverse images of all such sets w.r.t. the morphism p give all the possible maximal supercodes.
(ii) p(X) is an independent set w.r.t. <. So it can be arranged to become a chain S. By Proposition 4.3(iii), we can construct a complete chain S' containing S. Let U be the complete set corresponding to S'. Put Y = p⁻¹(U). Evidently Y contains X and p(Y) = U. By Proposition 4.1, Y is a maximal supercode. □

Example 4.6 Let X = {b²a²bab, a³ba²b, b⁴ab³}. Since p(X) = {(3,4), (5,2), (1,7)} is an independent set w.r.t. < on V², by Proposition 2.4, X is a supercode over A = {a, b}. The corresponding chain of p(X) is

S: (5,2), (3,4), (1,7).

As has been shown in Example 4.4, the sequence

S': (6,0), (5,2), (4,3), (3,4), (2,6), (1,7), (0,8)

is a complete chain containing S. The corresponding complete set of S' is

U = {(6,0), (5,2), (4,3), (3,4), (2,6), (1,7), (0,8)}.

So Y = p⁻¹(U) is a maximal supercode containing X. More explicitly, Y = π(Z) with Z = {a⁶, a⁵b², a⁴b³, a³b⁴, a²b⁶, ab⁷, b⁸}.

References

[BP] J. Berstel, D. Perrin, Theory of Codes, Academic Press, Orlando, 1985.
[HT] T. Head, G. Thierrin, Hypercodes in deterministic and slender 0L languages, Inform. and Control 45 (1980), 251-262.
[IJST] M. Ito, H. Jürgensen, H. J. Shyr, G. Thierrin, Outfix and infix codes and related classes of languages, J. Comput. and System Sci. 43 (1991), 484-508.
[S] H. J. Shyr, Free Monoids and Languages, Hon Min Book Company, Taichung, 1991.
[ST] H. J. Shyr, G. Thierrin, Hypercodes, Inform. and Control 24 (1974), 45-54.
[T1] G. Thierrin, The syntactic monoid of a hypercode, Semigroup Forum 6 (1973), 227-231.
[T2] G. Thierrin, Hypercodes, right-convex languages and their syntactic monoids, Proc. Amer. Math. Soc. 83 (1981), 255-258.
[Val] E. Valkema, Syntaktische Monoide und Hypercodes, Semigroup Forum 13 (1976/77), 119-126.
[Van] D. L. Van, The embedding problem for codes defined by binary relations, Hanoi Institute of Mathematics, Preprint 98/A22, 1998.


A Parsing Problem for Context-Sensitive Languages

- A Correction -

Pál Dömösi¹ and Masami Ito²
¹Institute of Mathematics and Informatics, University of Debrecen,
Egyetem tér 1, H-4032 Debrecen, Hungary
email: domosi@math.klte.hu

²Faculty of Science, Kyoto Sangyo University, Kyoto 603-1555, Japan
email: ito@ksuvx0.kyoto-su.ac.jp

Let L ⊆ X* be a language over a nonempty finite alphabet X. For a word u ∈ X* over X, we denote the length of u by |u|. Moreover, Sub(L) means the set {p ∈ X* : ∃u, q, r ∈ X*, u = qpr ∈ L}. In this note, we will prove Theorem 1 using the following two lemmas.

Lemma 1 ([2, page 84]) Every context-sensitive language is recursive.

Lemma 2 ([2, page 89]) Let X be a nonempty finite alphabet. Let L' ⊆ X* be a type-0 (recursively enumerable) language. Moreover, let a, b ∉ X, where a ≠ b. Then there is a context-sensitive language L such that (i) L consists of words of the form aⁱbp where i ≥ 0 and p ∈ L', and (ii) for every p ∈ L', there is an i ≥ 0 such that aⁱbp ∈ L.

Theorem 1 Let L be a language and let f_L : N → N be a function such that for any p ∈ Sub(L) there exists a pair q, r with qpr ∈ L and |qr| ≤ f_L(|p|). Then there is a context-sensitive language L for which no recursive function f_L has this property.

Proof. Let X = {c, d, ...} and let M be a recursively enumerable set of positive integers that is not recursive. Moreover, let L' = {cⁿd : n ∈ M}. Now let L be a context-sensitive language over {a, b, c, d, ...} defined as in Lemma 2, and let f_L : N → N be a function as stated in Theorem 1. Suppose f_L is recursive. Then for any positive integer k we can construct the language L_k = {aᵐbcᵏd : m ≤ f_L(k + 2)}. If k ∈ M, then, by Lemma 2 and the definition of f_L, L ∩ L_k ≠ ∅. Conversely, if L ∩ L_k ≠ ∅, then bcᵏd ∈ Sub(L), which implies k ∈ M. Consequently, for a given positive integer k, k ∈ M if and only if L ∩ L_k ≠ ∅. On the other hand, by Lemma 1, L is recursive. Thus it is decidable whether L ∩ L_k is empty for a given positive integer k. Therefore, M is recursive, a contradiction. Hence f_L is not recursive. This completes the proof of Theorem 1.

Acknowledgement Prof. F. Otto at the University of Kassel pointed out the mistake in our proof of Theorem 1 in [1]. Moreover, he had a proof different from our correction. We would like to express our gratitude for his indication and suggestion.

References

[1] P. Dömösi and M. Ito, Characterization of languages by lengths of their subwords, in Semigroups (edited by K. P. Shum et al.), Springer, Singapore, 1998, 117-129.

[2] A. Salomaa, Formal Languages, Academic Press, New York, London, 1973.


An Improvement of Iteration Lemmata for Context-free Languages¹

Pál DÖMÖSI
Institute of Mathematics and Informatics, L. Kossuth University,
Egyetem tér 1, H-4032 Debrecen, Hungary
e-mail: [email protected]

and

Manfred KUDLEK
Fachbereich Informatik, Universität Hamburg,
Vogt-Kölln-Str. 30, D-22527 Hamburg, Germany
[email protected]

Abstract: An improvement of iteration lemmata is given for context-free languages.

1. Introduction

In this paper we give an improvement of iteration lemmata for context-free languages in [1, 2, 6, 7, 10, 8, 12, 11]. For all notions and notations not defined here, see [8] and [9, 11, 13, 14]. An alphabet is a finite nonempty set. The elements of an alphabet are called letters. A word over an alphabet X is a finite string consisting of letters of X. For any alphabet X, let X* denote the free monoid generated by X, i.e. the set of all words over X including the empty word λ, and let X⁺ = X* \ {λ}.

The length of a word w, in symbols |w|, means the number of letters in w when each letter is counted as many times as it occurs. Therefore, w has |w| positions. By definition, |λ| = 0. If u and v are words over an alphabet X, then their catenation uv is also a word over X. Especially, for any word

¹This work was supported by DAAD and the Hungarian National Science Foundation (Grant Nos. T019392 and T030140).


uvw, we say that v is a subword of uvw. Let w be a word. We put w⁰ = λ and wⁿ = wⁿ⁻¹w (n > 0). Thus wᵏ (k ≥ 0) is the k-th power of w.

A (generative unrestricted, or simply, unrestricted) grammar is an ordered quadruple G = (V, X, S, P) where V and X are disjoint alphabets, S ∈ V, and P is a finite set of ordered pairs (W, Z) such that Z is a word over the alphabet V ∪ X and W is a word over V ∪ X containing at least one letter from V. The elements of V are called variables and those of X terminals. S is called the start symbol. Elements (W, Z) of P are called productions and are written W → Z. If W → Z ∈ P implies W ∈ V then G is called context-free. A word Q over V ∪ X derives directly a word R, in symbols Q ⇒ R, if and only if there are words Q₁, Q₂, Q₃, R₁ such that Q = Q₂Q₁Q₃, R = Q₂R₁Q₃ and Q₁ → R₁ belongs to P.

The language L(G) generated by a grammar G = (V, X, S, P) is the set L(G) = {w | w ∈ X* and S ⇒* w}, where ⇒* denotes the reflexive and transitive closure of ⇒. L ⊆ X* is a context-free language if we have L = L(G) for some context-free grammar G.
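As a small illustration of these definitions (ours, not from the paper): for the context-free grammar G = ({S}, {a, b}, S, {S → aSb, S → ab}) we have L(G) = {aⁿbⁿ : n ≥ 1}, and the words of L(G) up to a given length can be enumerated by applying the derivation relation ⇒ breadth-first.

```python
from collections import deque

def generate(productions, start, max_len):
    """Enumerate the words of L(G) of length <= max_len by breadth-first
    application of => to sentential forms.  By convention here, variables
    are upper-case letters and terminals are lower-case letters."""
    seen, words, queue = {start}, set(), deque([start])
    while queue:
        form = queue.popleft()
        if form.islower():               # no variables left: a word of L(G)
            words.add(form)
            continue
        for i, sym in enumerate(form):   # rewrite each variable occurrence
            if sym.isupper():
                for rhs in productions.get(sym, []):
                    new = form[:i] + rhs + form[i + 1:]
                    if len(new) <= max_len and new not in seen:
                        seen.add(new)
                        queue.append(new)
    return sorted(words, key=len)

P = {"S": ["aSb", "ab"]}
print(generate(P, "S", 6))   # ['ab', 'aabb', 'aaabbb']
```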

2. Excluded Positions in Derivation Trees

For any word z ∈ X* and positive integer k ≤ |z|, we will speak about the k-th position of z. Moreover, if z = a₁...aₙ with a₁, ..., aₙ ∈ X, then we say that aₖ is in the k-th position of z. In addition, sometimes we will distinguish excluded and non-excluded positions of z. Finally, if aₖ, ..., a_{k+ℓ} are in excluded positions of z, then we also say that aₖ...a_{k+ℓ} consists of excluded positions.

Given a context-free grammar G in Chomsky normal form, let T_z be a derivation tree for some z ∈ L(G). We say that a subpath of T_z is external if its initial node is the root of the tree and its terminal node is either the first or the last position of z. In the same sense, we will speak about the external subpaths of a given subtree of T_z. An intermediate node of T_z is said to be a branch point if each of its children has an excluded descendant. On the other hand, define a node to be free if none of its children has an excluded descendant. (Recall that G is in Chomsky normal form. Thus every node in T_z has not more than two children.) Of course, the leaves of T_z are neither branch points nor free nodes.

A subpath of T_z is distinguished if
a) its initial node is either a branch point or the root of the tree, and its terminal node is either a branch point or a single excluded position (i.e. the left and right neighbours are non-excluded positions);
b) none of its intermediate nodes is a branch point;
c) if it has no intermediate node then its initial node is the root of the tree and simultaneously not a branch point.


For a context-free grammar G in Chomsky normal form and a word z ∈ L(G), let T_z be a derivation tree with z = z₀w₁z₁...wₙzₙ, where w₁, ..., wₙ denote (possibly empty) words consisting of excluded positions, and z₀, z₁, ..., zₙ denote (possibly empty) words having no excluded positions.

A derivation tree T_z is called minimal if all of its subpaths with the following properties:

a) the terminal node is either a branch point, a non-excluded position, or a single excluded position;

b) no other node is a branch point;

have no two non-terminal nodes with the same (non-terminal) label.

We start with the following

Lemma 2.1. Let T_z be a minimal derivation tree for z ∈ L(G), and consider an arbitrary distinguished subpath p. Then the free children of the intermediate nodes on p have not more than 2^(|V|-1) - 1 non-excluded descendants.

Proof. Consider the subpath p' containing all nodes of p apart from its initial node if the initial node of p is a branch point. Otherwise (if the initial node of p is not a branch point, and then it is the root of the tree) let p' = p. Since T_z is minimal, p' is not reducible. Consider the maximal derivation subtree T_{z'} of T_z having the initial node of p' as its root. Omitting all of the descendants of the terminal node of p' (and p) from T_{z'}, we get a subtree T_{z''} containing no path with distinct nodes having the same nonterminal label. Therefore, the subtree T_{z''} has not more than 2^(|V|-1) leaves, where one of the leaves is the terminal node of p' (and p). □

Lemma 2.2. Let k be the number of the words in {w₁, ..., wₙ} consisting of two or more letters. Suppose that T_z is a minimal derivation tree. Then T_z has not more than |w₁ ... wₙ| + n + k - 1 distinguished paths.

Proof.
a) Consider any block w with |w| ≥ 2 consisting only of excluded positions. Then this w contributes at most |w| distinguished paths.
This is proved by induction on |w|. If |w| = 2 then, clearly, w contributes at most 2 distinguished paths.
Assume that w₁w₂ with |w₁w₂| = n contributes at most n distinguished paths. Adding a new excluded position x gives w₁xw₂.
If w₁ = λ then the path from x has to be joined to the left external paths belonging to w₁w₂. This can be done either above the highest branch point, or within some distinguished left external path. But then no distinguished path below the join point can have left free children. Thus at most 1 new distinguished path can be added, the join point becoming a branch point.
Symmetrically, if w₂ = λ.
If w₁ ≠ λ and w₂ ≠ λ, then x must be joined to some interior path, without left and right free children, or to a left or to a right external distinguished path. Again, at most 1 new distinguished path can be added, the join point becoming a branch point.
b) Now consider the highest branch points coming from the blocks wᵢ with |wᵢ| ≥ 2 as 'excluded positions'. With the same argument which has been used for the number of distinguished paths in [6, 7] one gets at most 2(k + (n - k)) - 1 = 2n - 1 distinguished paths. Adding those contributed by the k blocks w_{i_j} with |w_{i_j}| ≥ 2 gives at most

2n - 1 + Σ_{j=1}^{k} |w_{i_j}| = |w_{i_1} ... w_{i_k}| + 2(n - k) + 2k - 1 = |w₁ ... wₙ| + n + k - 1

distinguished paths. (n - k is the number of blocks wᵢ with |wᵢ| = 1.) □

Theorem 2.3. Given a context-free grammar G = (V, Σ, S, P) in Chomsky normal form and a word z = z_0 w_1 z_1 ... w_n z_n ∈ L(G) with λ ∉ {w_1, ..., w_n}, let k be the number of the words in {w_1, ..., w_n} consisting of two or more letters. If T_z is a minimal derivation tree for z, then the following holds:

|z| ≤ (2^{|V|-1} - 1)(|w_1 ... w_n| + n + k - 1) - n - k + 1.

Proof. Let T_z be a minimal derivation tree of the word z = z_0 w_1 z_1 ... w_n z_n. Exclude positions in z such that w_1, ..., w_n are (possibly empty) words consisting of excluded positions, and z_1, ..., z_n are (possibly empty) words having no excluded positions. Then, by Lemma 2.2, T_z has at most |w_1 ... w_n| + n + k - 1 distinguished paths. On the other hand, using Lemma 2.1, for every distinguished subpath p of T_z, the free children of the intermediate nodes in p have at most 2^{|V|-1} - 1 excluded descendants. Therefore, we have at most (2^{|V|-1} - 1)(|w_1 ... w_n| + n + k - 1) non-excluded (leaf) descendants of all free children.

Now consider an arbitrary non-excluded position. If it is a descendant of a branch point, consider the last branch point with this property. It has two children, and both of them have an excluded descendant. Therefore, one of them is an intermediate point of a distinguished path of which the considered non-excluded position is also a descendant. On the other hand, if the considered non-excluded position is not a descendant of any branch point, then the root of the tree is not a branch point. Hence, whether or not the considered non-excluded position is a descendant of a branch point, it is a descendant of a free node.

Since there are at most (2^{|V|-1} - 1)(|w_1 ... w_n| + n + k - 1) non-excluded (leaf) descendants of all free children, it follows immediately that

|z| ≤ |w_1 ... w_n| + (2^{|V|-1} - 1)(|w_1 ... w_n| + n + k - 1) = 2^{|V|-1}(|w_1 ... w_n| + n + k - 1) - n - k + 1. □
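As a quick numerical illustration (our own sketch; the function name and the example grammar parameters are not from the paper), the bound of Theorem 2.3 can be evaluated directly:

```python
# Sketch: evaluate the length bound of Theorem 2.3 for a grammar in Chomsky
# normal form with |V| nonterminals and nonempty excluded blocks w_1..w_n.
# Function name and the example values are illustrative, not from the paper.

def theorem_2_3_bound(num_nonterminals: int, blocks: list) -> int:
    assert all(blocks), "every w_i must be nonempty"
    n = len(blocks)                              # number of blocks
    k = sum(1 for w in blocks if len(w) >= 2)    # blocks with two or more letters
    total = sum(len(w) for w in blocks)          # |w_1 ... w_n|
    return (2 ** (num_nonterminals - 1) - 1) * (total + n + k - 1) - n - k + 1

# |V| = 3, blocks "ab" and "c": n = 2, k = 1, |w_1 w_2| = 3
print(theorem_2_3_bound(3, ["ab", "c"]))  # 13
```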

3. Context-Free Languages

Now we show an improvement of Theorem 1.7 in [7].

Theorem 3.1. For a context-free grammar G = (V, Σ, S, P) in Chomsky normal form and a word z = z_0 w_1 z_1 ... w_n z_n ∈ L(G) with λ ∉ {w_1, ..., w_n} and |z| > (2^{|V|-1} - 1)(|w_1 ... w_n| + n + k - 1) - n - k + 1, let k be the number of the words in {w_1, ..., w_n} consisting of two or more letters. There are words u, v, w, x, y with z = uvwxy, and integers s, t with 0 ≤ s < t ≤ n,

u = z_0 w_1 z_1 ... w_{s-1} z_{s-1} w_s z_s', v = z_s'', w = z_s''' w_{s+1} z_{s+1} ... w_{t-1} z_{t-1} w_t z_t',
x = z_t'', y = z_t''' w_{t+1} z_{t+1} ... w_n z_n (z_s = z_s' z_s'' z_s''', z_t = z_t' z_t'' z_t'''),

|vwx| ≤ 2 · ((2^{|V|-1} - 1)(|w_1 ... w_n| + n + k - 1) - n - k + 1), |vx| > 0, and u v^i w x^i y ∈ L(G) for every nonnegative integer i.

Proof. We consider the following cases.

Case I. Suppose that T_z, a derivation tree for z, is not minimal, and denote by p one of its reducible subpaths. Thus, there exist distinct nodes in p having the same (nonterminal) label, say A; moreover, there are two strings of terminals, v and x, and two nonterminals, B and C, such that the derivation A ⇒ BC ⇒* vAx is represented in T_z. B and C cannot both dominate the lower A, therefore |vx| > 0. On the other hand, since there exists no intermediate branch point of the distinguished paths, we have that neither v nor x contains an excluded position. (Of course, the free children of the nodes of this path do not have excluded descendants.) In other words, we obtain, for an appropriate pair s, t of integers,

u = z_0 w_1 z_1 ... w_{s-1} z_{s-1} w_s z_s', v = z_s'', w = z_s''' w_{s+1} z_{s+1} ... w_{t-1} z_{t-1} w_t z_t', x = z_t'', y = z_t''' w_{t+1} z_{t+1} ... w_n z_n, |vx| > 0, and u v^i w x^i y ∈ L(G) for all i ≥ 0. In addition, by the derivation discussed above, A ⇒ BC ⇒* vAx, we may assume the existence of derivations B ⇒* z', C ⇒* z'' with z'z'' = vwx such that the derivation subtrees T_z', T_z'' are minimal. Therefore, we get |vwx| ≤ 2 · ((2^{|V|-1} - 1)(|w_1 ... w_n| + n + k - 1) - n - k + 1).

Case II. Suppose that T_z is minimal with respect to z_0, w_1, z_1, ..., w_n, z_n. Then, by Theorem 2.3, |z| ≤ (2^{|V|-1} - 1)(|w_1 ... w_n| + n + k - 1) - n - k + 1, contradicting our conditions. □
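The pumping conclusion u v^i w x^i y ∈ L(G) can be illustrated on the standard toy language {a^m b^m | m ≥ 1} (our own example; the decomposition is hand-picked, not computed from a derivation tree):

```python
# Toy illustration of the pumping conclusion of Theorem 3.1 on the language
# L = { a^m b^m : m >= 1 }.  The decomposition z = u v w x y below is
# hand-picked for this language, not derived from an actual derivation tree.

def in_L(s: str) -> bool:
    m = len(s) // 2
    return m >= 1 and s == "a" * m + "b" * m

u, v, w, x, y = "a", "a", "ab", "b", "b"   # z = uvwxy = "aaabbb", |vx| > 0
assert in_L(u + v + w + x + y)
for i in range(6):                          # pump v and x in parallel
    assert in_L(u + v * i + w + x * i + y)
print("u v^i w x^i y stays in L for i = 0..5")
```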

References

[1] Bader, C., Moura, A.: A Generalization of Ogden's Lemma. JACM 29, no. 2 (1982), 404-407.

[2] Bar-Hillel, Y., Perles, M., Shamir, E.: On Formal Properties of Simple Phrase Structure Grammars. Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung 14 (1961), 143-172.

[3] Berstel, J., Boasson, L.: Context-free Languages. In Handbook of Theoretical Computer Science, Vol. B: Formal Models and Semantics, van Leeuwen, J., ed., Elsevier/MIT, 1994, 60-102.

[4] Dömösi, P., Ito, M.: On Subwords of Languages. RIMS Proceedings, Kyoto Univ., 910 (1995), 1-4.

[5] Dömösi, P., Ito, M.: Characterization of Languages by Lengths of their Subwords. Proc. Int. Conf. on Semigroups and their Related Topics (Inst. of Math., Yunnan Univ., China), Monograph Series, Springer-Verlag, Singapore, to appear.

[6] Dömösi, P., Ito, M., Katsura, M., Nehaniv, C.: A New Pumping Property of Context-free Languages. Combinatorics, Complexity and Logic (Proc. Int. Conf. DMTCS'96), ed. D. S. Bridges et al., Springer-Verlag, Singapore, 1996, 187-193.

[7] Dömösi, P., Kudlek, M.: Some New Iteration Lemmata for Context-free and Linear Indexed Languages (accepted for Publicationes Mathematicae, No. 60).

[8] Harrison, M. A.: Introduction to Formal Language Theory. Addison-Wesley, Reading, Massachusetts, 1978.

[9] Hopcroft, J. E., Ullman, J. D.: Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading, Massachusetts, 1979.

[10] Horváth, S.: A Comparison of Iteration Conditions on Formal Languages. In Algebra, Combinatorics and Logic in Computer Science, Vol. II, pp. 453-464, Colloquia Mathematica Societatis János Bolyai, 42, North-Holland, 1986.

[11] Nijholt, A.: An Annotated Bibliography of Pumping. Bull. EATCS 17 (June 1982), 34-52.

[12] Ogden, W.: A Helpful Result for Proving Inherent Ambiguity. Math. Syst. Theory 2 (1968), 191-194.

[13] Révész, Gy. E.: Introduction to Formal Languages. McGraw-Hill, New York, 1983.

[14] Salomaa, A.: Formal Languages. Academic Press, New York, London, 1973.


QUANTUM FINITE AUTOMATA

Jozef Gruska* and Roland Vollmar

Faculty of Informatics, Masaryk University, Botanická 68a, 602 00 Brno, Czech Republic
Fakultät für Informatik, Universität Karlsruhe, Am Fasanengarten 5, 76128 Karlsruhe, Germany

Abstract. Various quantum versions of the most basic models of classical finite automata have already been introduced, and various modes of their computations have started to be investigated. In this paper we overview basic models, approaches, techniques and results in this promising area of quantum automata, which is expected to play an important role also in theoretical computer science. We also summarize some open problems and research directions to pursue in this area.

1 Introduction

Once an understanding has emerged that the foundations of computing have to be based on the laws and limitations of quantum mechanics, it has become natural to turn attention also to various quantum models of automata.

1.1 Goals of the research

The research in the area of quantum computation models has several interrelated external and internal goals:

• To get an insight into the power of different quantum computing models and modes, using language/automata-theoretic methods.

• To discover very simple models of computation at which one can prove a large (or huge) difference in power between the quantum and classical versions of automata.

• To determine borderlines between algorithmic decidability and undecidability for key algorithmic problems.

• To explore how much quantumness is needed, and how pure it has to be, in order to have (quantum) models of computation that are more powerful than classical ones.

• To develop quantum automata (networks, algorithms) design and analysis methodologies.

• To explore mutual relations between different quantum computation models and modes.

• To discover, in a transparent and elegant form, limitations of quantum computations and communications.

• To explore how many quantum resources are needed to have quantum models (provably) more powerful than classical ones.

* The paper has been written during the first author's stay with the University of Karlsruhe, Department of Informatics, in summer 2000. Support of the grants GAČR 201/98/0369, CEZ:J07/98:143300001 and VEGA 117654120 is acknowledged.

The main models of quantum automata are natural quantum modifications of the main classical models of automata.

1. Quantum (one-tape) Turing machines (QTM). They are used to explore, at the most general level of sequential computation, the potential and limitations of quantum computing. Using this model the main computational complexity classes are defined. (QTM can be seen as the main quantum abstraction of human computational processes.)

2. Quantum cellular automata (QCA). They are used to model and to explore, on a very general and basic level of parallel computation, the potential and limitations of quantum computing. (QCA can be seen as a very basic quantum abstraction of computation by nature.)

3. Quantum finite automata (QFA). They are considered to be the simplest model of quantum processors, with "finite" quantum memory, that models well the most basic mode of quantum computing - a quantum action is performed on each classical input. Their classical variants are usually denoted by QFA or 1FA.

4. Almost finite quantum automata. These are again modifications of the classical models. The needs for introducing them are the same as in the classical case: attempts to generalize or simplify the main models, to get models for which one could extend results, or for which one could get results not obtainable or not known to be true for more general models. In addition, the classical model of push-down automata has a very strong motivation in connection with recursive programming. One natural approach to the design of such models is to add additional tapes (of a special access). The main models considered so far are: quantum multi-tape finite automata, quantum counter finite automata, and quantum pushdown automata.

2 Classical Reversible Finite Automata

The very basic definition of a reversible finite automaton A is that of a deterministic finite automaton at which, for each state q and any input a, there is at most one state q' such that under the input a the automaton A gets from the state q' into the state q.

A special case are the totally reversible finite automata, also called group automata, at which for each state q and any input symbol a there is exactly one state q' such that under the input a the automaton A gets from the state q' into the state q.
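Both predecessor conditions are easy to check mechanically. The following sketch (our own illustration; the representation and function names are not from the paper) tests them for a DFA given as a transition table:

```python
# Sketch: check reversibility (at most one predecessor per (state, symbol))
# and the group-automaton property (exactly one predecessor) for a DFA whose
# transitions are given as delta[(state, symbol)] = next_state.
# Representation and names are ours, purely for illustration.

from collections import Counter

def is_reversible(delta) -> bool:
    preds = Counter((q_next, a) for (_, a), q_next in delta.items())
    return all(c <= 1 for c in preds.values())

def is_group_automaton(delta, states, alphabet) -> bool:
    preds = Counter((q_next, a) for (_, a), q_next in delta.items())
    return all(preds[(q, a)] == 1 for q in states for a in alphabet)

# Each letter acts as a permutation of {0, 1}, so this is a group automaton:
delta = {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}
print(is_reversible(delta), is_group_automaton(delta, [0, 1], "ab"))  # True True
```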

The power of reversible finite automata as acceptors depends on how many input and output states we allow - only one, or many.

In none of these cases is the language 0*1* acceptable by a reversible (one-way) finite automaton. (However, this language can be accepted by reversible two-way automata and also by reversible push-down automata.)

Concerning acceptance, the most interesting case seems to be the one at which several input and also output states are allowed. Let us denote such a model by RFA.

A detailed study of the power of RFA was done by Pin (1987). He gives the following characterizations of the languages L accepted by RFA.

Theorem 2.1 (Pin, 1987) If L is a regular language, then the following conditions are equivalent.

1. L is accepted by a reversible finite automaton (with a set of initial states and a set of final states).

2. L = K ∩ Σ*, where K is a subset of the free group F(Σ) consisting of a finite union of left cosets of finitely generated subgroups of F(Σ).

3. The idempotents of the syntactic monoid M(L) of L commute and, for every x, u, y ∈ Σ*, xu*y ∈ L implies xy ∈ L.

4. The idempotents of M(L) commute and, for every s, t, e ∈ M(L) such that e is idempotent, set ∈ P implies st ∈ P, where P is the image of L in M(L).

5. The idempotents of M(L) commute and L is closed in the free-group topology.

Languages accepted by group automata are also well understood. They are exactly the languages whose syntactic monoids are groups.

As pointed out by K. Paschen, for some languages there are two minimal reversible automata that are not isomorphic. This result also indicates that reversible finite automata are far from being as easy to handle as ordinary finite automata.

A method for designing a circuit out of (reversible) Fredkin gates that implements a given RFA was developed by Morita (1990).


3 Abstract Approaches to Quantum (Finite State) Sequential Machines

Several general approaches have been developed by Gudder (2000). The first of them is the concept of the quantum transition machine (QTRM) M = (H, |q_0⟩, U), where H is a Hilbert space, |q_0⟩ is an (initial) state and U is a unitary transformation. Closely related is the concept of the quantum sequential machine (QSM) M = (Q, q_0, δ), where Q is the set of states, q_0 ∈ Q and δ : Q × Q → C is a transition mapping such that the following well-formedness condition

Σ_{q''∈Q} δ(q_1, q'') δ*(q_2, q'') = 1 if q_1 = q_2, and 0 otherwise,

is satisfied. This condition guarantees unitarity of the corresponding evolution operator

U|q⟩ = Σ_{q'∈Q} δ(q, q') |q'⟩.

At this approach the corresponding Hilbert space has the basis {|q⟩ | q ∈ Q} and the initial quantum state is |q_0⟩.

The above concepts of quantum machines can naturally be extended in two ways. We can consider a set Q_t of terminating states in the case of a QSM, and a subspace of terminating states H_t in the case of a QTRM. A computation is considered as terminating if it gets into a terminating state. The second way of generalization is to consider the transition mapping as being dependent also on input symbols. This can then naturally be combined with the case of having terminating states.

There is a close relation between the above two concepts of quantum sequential machines. Indeed, to each quantum sequential machine we can construct an equivalent quantum transition machine by considering l_2(Q) as the corresponding Hilbert space and the evolution operator defined above as the unitary operator. Conversely, suppose that M = (H, q_0, U) is a quantum transition machine. Let B be an orthonormal basis for H that includes the state q_0 and let us define δ(q, q') = ⟨Uq|q'⟩. (That is, a quantum transition machine can be seen as a quantum sequential machine at which the computational basis is left unspecified.)
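The well-formedness condition is just orthonormality of the rows of the transition table, which is simple to verify numerically. A minimal sketch (our own; the Hadamard-like example machine is not from the paper):

```python
# Sketch: verify the well-formedness condition for a QSM transition mapping
# delta : Q x Q -> C, stored as a dict of dicts.  The condition says the
# vectors (delta(q, .)) are orthonormal, i.e. the evolution operator is
# unitary.  The example machine below is ours, not from the paper.

from math import isclose, sqrt

def well_formed(delta, Q) -> bool:
    for q1 in Q:
        for q2 in Q:
            ip = sum(delta[q1][q] * delta[q2][q].conjugate() for q in Q)
            target = 1.0 if q1 == q2 else 0.0
            if not (isclose(ip.real, target, abs_tol=1e-9)
                    and isclose(ip.imag, 0.0, abs_tol=1e-9)):
                return False
    return True

# A Hadamard-like transition mapping on two states:
h = 1 / sqrt(2)
delta = {0: {0: h, 1: h}, 1: {0: h, 1: -h}}
print(well_formed(delta, [0, 1]))  # True
```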

3.1 Quantum finite automata - a general scheme

Several nonequivalent models of quantum finite automata have been introduced. Most of them have the following basic components:

The input has the form #w_1 ... w_n$ or, shortly, #w$, where w ∈ Σ*, |w| = n, Σ is an input alphabet and #, $ are endmarkers.

The set of states Q = Q_a ∪ Q_r ∪ Q_n is composed of the accepting states Q_a, the rejecting states Q_r, and the nonhalting states Q_n.


A configuration is a pair (q, i) - a state and a position on the input tape. The set of configurations has the form C(Q, w) = {(q, i) | q ∈ Q, 0 ≤ i ≤ |w| + 1}. This is used to introduce the corresponding Hilbert space l_2(C(Q, w)).

Transition mappings δ are defined as follows:

δ : (q, i) → Σ_{q'∈Q, 0≤j≤n+1} α_{q',j} (q', j),

where the α_{q',j} are complex numbers such that Σ_{q',j} |α_{q',j}|² = 1.

Measurements. So far mainly projection measurements have been considered. They are projections into one of the subspaces

E_a = l({(q, i) | q ∈ Q_a}), E_r = l({(q, i) | q ∈ Q_r}), E_n = l({(q, i) | q ∈ Q_n}).

In addition, the following measurement modes have been considered:

• MM-mode (many-measurements mode - a measurement is done after each input symbol is read and the corresponding unitary operator is applied).

• MO-mode (measurement-once mode - a measurement is done only at the end of the computation).

4 One-way quantum finite automata

The very basic model is that of the one-way quantum finite automaton (1QFA). In this model a state is a superposition of the basis states that correspond to "heads positioned on the same square of the tape", and at each evolution step "all heads move" one square to the right. A formal description of 1QFA follows.

Definition 4.1 A one-way (real-time) quantum finite automaton A is given by: Σ - the input alphabet; Q - the set of states; q_0 - the initial state; Q_a ⊆ Q, Q_r ⊆ Q - the sets of accepting and rejecting states; and the transition function

δ : Q × Γ × Q → C_[0,1],

where Γ = Σ ∪ {#, $}, #, $ are endmarkers, and C_[0,1] is the set of complex numbers whose absolute value is not larger than 1.

The evolution (computation) of A is performed in the Hilbert space l_2(Q), with the basis states {|q⟩ | q ∈ Q}, using the unitary operators V_σ, σ ∈ Γ, defined by

V_σ|q⟩ = Σ_{q'∈Q} δ(q, σ, q')|q'⟩.

For an input w = w_1 w_2 ... w_n the whole evolution is then given by the composed mapping

V_$ V_{w_n} ... V_{w_1} V_#,

unless some measurements are performed.
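Under the assumptions of Definition 4.1, an MO-mode run just multiplies the state vector by one unitary per symbol and measures at the end. A minimal simulation sketch (our own toy automaton; a full instance would also apply the endmarker operators V_# and V_$):

```python
# Sketch of an MO-1QFA run: apply one unitary V_sigma per input symbol to the
# state vector, then project onto the accepting states.  The two-state
# "parity" automaton below is our own toy example (endmarker steps omitted).

def mo_1qfa_accept_prob(V, n, q0, accepting, word):
    state = [0.0] * n
    state[q0] = 1.0
    for sigma in word:
        m = V[sigma]                       # n x n unitary for this symbol
        state = [sum(m[i][j] * state[j] for j in range(n)) for i in range(n)]
    return sum(abs(state[q]) ** 2 for q in accepting)

# Letter 'a' swaps the two basis states, so the acceptance probability is 1
# exactly when the number of a's is odd.
V = {"a": [[0.0, 1.0], [1.0, 0.0]]}
print(mo_1qfa_accept_prob(V, 2, 0, {1}, "aaa"))  # 1.0
```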

4.1 Acceptance probabilities for QFA

There are various approaches to defining the acceptance of words and languages by quantum finite automata.

A 1QFA A accepts (rejects) a word w of length n with probability p if p is the sum of the probabilities p_i that w is accepted (rejected) after i symbols of w have been scanned, for i = 1, ..., n.

A 1QFA A accepts a language L with probability 1/2 + ε, ε > 0, if A accepts (rejects) any x ∈ L (x ∉ L) with probability at least 1/2 + ε.

If there is an ε such that A accepts L with probability at least 1/2 + ε, then A is said to accept L with bounded error probability.

A language L is accepted by A with unbounded error probability if x ∈ L (x ∉ L) is accepted (rejected) with probability at least 1/2.

Acceptance with bounded error probability is considered to be the main and the most realistic mode, because it is robust with respect to small errors. Acceptance with respect to unbounded errors can lead to unrealistic conclusions.

Finally, let us denote by BMO (BMM) the family of languages accepted by MO-1QFA (MM-1QFA) with bounded error, and by UMO (UMM) the family of languages accepted by MO-1QFA (MM-1QFA) with unbounded error.

4.2 Example - hierarchies of languages

We present now a simple, but tricky, example, due to Ambainis and Freivalds (1998), of an interesting 1QFA A:

States: Q = {q_0, q_1, q_2, q_a, q_r}, Q_a = {q_a}, Q_r = {q_r}.

Transitions:

V_0|q_1⟩ = (1 - p)|q_1⟩ + √(p(1-p))|q_2⟩ + √p |q_r⟩,
V_0|q_2⟩ = √(p(1-p))|q_1⟩ + p|q_2⟩ - √(1-p)|q_r⟩,
V_1|q_1⟩ = |q_r⟩, V_1|q_2⟩ = |q_2⟩, V_$|q_1⟩ = |q_r⟩, V_$|q_2⟩ = |q_a⟩.

The remaining transitions are defined arbitrarily so as to satisfy the unitarity condition for the mappings V_σ.

The automaton A can quite well recognize the following languages L_n, where L_1 = 0* and, for n > 1,

L_n = {x_1* x_2* ... x_n* | x_{2i-1} = 0, x_{2i} = 1}.


Indeed, as shown by Ambainis and Freivalds (1998, for the case n = 2) and by Kikusts and Raščevskis (2000, for the general case), the language L_n can be accepted by the automaton A with probability p, where p³ + p = 1.
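The equation p³ + p = 1 pins down the acceptance probability numerically; a quick bisection (our own sketch) recovers the constant 0.68... quoted in Theorem 4.6 below:

```python
# Sketch: solve p^3 + p = 1 by bisection; the root is the acceptance
# probability of the automaton above (approximately 0.6823).

def solve_p(lo=0.0, hi=1.0, iters=100):
    for _ in range(iters):
        mid = (lo + hi) / 2
        if mid ** 3 + mid < 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

p = solve_p()
print(round(p, 4))  # 0.6823
```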

In addition, {L_i}_{i=1}^∞ represents a sequence of languages such that each next of them can be recognized only with a probability smaller than the previous ones, and these probabilities converge, from above, to 1/2. More exactly, the following holds.

Theorem 4.2 (Kikusts and Raščevskis, 2000)

1. The language L_n can be recognized with probability 1/2 + c/n for a constant c, but cannot be recognized with probability greater than 1/2 + c'/n for a suitable constant c' > c.

2. If we put n_1 = 2 and, for k > 1, n_k = 8 n_{k-1}² + 1, and we define p_k = 1/2 + 1/(8 n_k), then for every k > 1 the language L_{n_k} can be recognized by an MM-1QFA with probability p_k, but cannot be recognized by an MM-1QFA with probability p_{k-1}.

Remark 4.3 Actually, the above theorem was first shown, by Ambainis et al. (1999), for the sequence of languages L_n = a_1* a_2* ... a_n* over the increasingly large alphabets {a_1, a_2, ..., a_n}. In the classical case it is usually straightforward to transform a many-letter alphabet result of such a type to the two-letter alphabet case. In the quantum case no straightforward techniques to do that are known.

Remark 4.4 The above result holds only for the MM-mode of computation on 1QFA. In the case of the MO-mode it holds, similarly as in the classical case, that once a language can be accepted with a probability p > 1/2, then it can also be accepted with any probability 1 ≥ p' > p.

4.3 Limitations on the acceptance probability

The following is the very basic result used to show some limitations on the languages that can be accepted by MM-1QFA.

Theorem 4.5 (Ambainis and Freivalds, 1998) Let L be a regular language and A a minimal DFA for L with the transition function δ and a set of accepting states. Let there be in A states q_1, q_2 and an input word w such that: (a) q_1 ≠ q_2; (b) δ(q_1, w) = δ(q_2, w) = q_2; (c) q_2 is neither an "all-accepting" nor an "all-rejecting" state. Then L cannot be accepted by an MM-1QFA with probability at least 7/9 + ε, for any ε > 0.

The minimal automaton for the language 0*1* clearly contains the above trouble-making construction, and therefore the following holds.


Theorem 4.6 (Ambainis and Freivalds, 1998) There is a regular language that can be recognized by an MM-1QFA with probability 0.68..., but neither by an MM-1QFA with probability at least 7/9 + ε nor by an RFA.

In addition, the following holds.

Theorem 4.7 (Ambainis and Freivalds, 1998) Let L be a regular language and A its minimal automaton, with n states. If A does not contain the "forbidden construction" of Theorem 4.5, then L can be recognized by an RFA with O(2^n) states.

The basic idea behind the lower bound proofs of Theorem 4.5 is the fact that the minimal DFA for the language L_n contains n - 1 of the "forbidden constructions", and each one decreases the probability with which the language can be accepted by an MM-1QFA.

4.4 Characterizations

In the case of MM-1QFA a nice characterization is known only for languages accepted with high probability.

Theorem 4.8 (Ambainis and Freivalds, 1998) A language can be recognized by an MM-1QFA with probability 7/9 + ε, ε > 0, if and only if it is accepted by an RFA.

Another quite nice characterization is known for languages accepted by MO-1QFA.

Theorem 4.9 (Brodsky and Pippenger, 1999) The class BMO is exactly the class of group languages¹ and is therefore a proper subclass of the class of regular languages.

¹ That is, the class of languages accepted by group finite automata or, equivalently, the class of regular languages whose syntactic monoids are groups.

4.5 Closure properties

Let us first summarize the closure properties of the class BMO. It holds:

Theorem 4.10 (Brodsky and Pippenger, 1999) The class BMO is closed under Boolean operations, inverse homomorphisms and word quotients, but not under homomorphisms.

On the other hand, the class UMO also contains non-regular languages, for example the language {w ∈ {0,1}* | |w|_0 = |w|_1}. The class UMO is likewise closed under Boolean operations, inverse homomorphisms and word quotients, but not under homomorphisms (Moore and Crutchfield, 1997). However, no precise characterization of this class is known yet.

Less is known about the classes BMM and UMM. Both of them are closed under complement, inverse homomorphisms and word quotients, and it is known that the class UMM is not closed under homomorphisms.

Moreover, it has been shown by Valdats (2000) that the class of languages accepted by 1QFA is not closed under union nor, actually, under any binary Boolean operation. Namely, he showed that the languages L_1 = (aa)*bb*a(b*ab*a)*b* ∪ (aa)* and L_2 = aL_1 can be accepted by 1QFA with probability 2/3, but their union is not acceptable by a 1QFA. In addition, Valdats (2000) has shown that the above example represents a border case in the following sense: if two languages L_1 and L_2 can be accepted by 1QFA with probabilities p_1 and p_2 such that 1/p_1 + 1/p_2 < 3, then their union is accepted by a 1QFA with probability p_1 p_2 / (p_1 + p_2 - p_1 p_2).

4.6 Succinctness results

Even if the recognition power of 1QFA is not impressive, quite a different situation holds for their descriptional power. However, even from this point of view the comparison between quantum and classical automata is not conclusive. For some languages we can have an exponentially more succinct description using 1QFA than using 1FA, but in other cases the situation can be the reverse. Here are the main results in this area.

Theorem 4.11 (Ambainis and Freivalds, 1998)

1. For any prime p, any DFA recognizing the language L_p = {a^i | i is divisible by p} has to have at least p states, but there is an MM-1QFA with O(lg p) states recognizing L_p.

2. For any integer n, each DFA recognizing the language L_n = {0^n}, containing a single string, has to have n states, but there is an MM-1QFA with O(lg n) states recognizing L_n.

The proof of the first part of Theorem 4.11 actually contains a method for accepting the language L_p using a 1QFA with only O(lg p) states. An interesting analysis of this method was performed by Berzina et al. (2000). They show that even using only a 7-qubit computer it is possible to recognize the language L_1223, as a special case of the language L_p, with probability 0.651. This is quite surprising, because the method involved seems to require working with very large primes.
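The O(lg p) construction is based on single-qubit rotations. The following sketch (ours, showing only one rotation rather than the full O(lg p) combination) illustrates why a rotation by the angle 2π/p per letter separates multiples of p from non-multiples:

```python
# Sketch of the rotation idea behind Theorem 4.11(1): a one-qubit MO-QFA whose
# unitary rotates the state by the angle 2*pi/p per input letter accepts a^i
# with probability cos(2*pi*i/p)^2, which equals 1 exactly when p divides i.
# (The full O(lg p)-state construction combines several such rotations.)

from math import cos, pi

def accept_prob(i: int, p: int) -> float:
    return cos(2 * pi * i / p) ** 2

p = 7
probs = [accept_prob(i, p) for i in range(2 * p)]
print([i for i, pr in enumerate(probs) if pr > 0.999])  # [0, 7]
```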

Theorem 4.12 (Ambainis et al. 1998, Nayak, 1999) For each integer n there is a DFA of size O(n) recognizing the language L_n = {w0 | w ∈ {0,1}*, |w| ≤ n}, but each MM-1QFA recognizing L_n with probability greater than 1/2 has to have 2^{Ω(n)} states.


We can conclude that in some cases, it seems due to quantum parallelism, 1QFA can be much smaller than their classical counterparts, while in other cases, it seems due to the requirement of unitarity (reversibility), it is just the opposite. The following result also indicates this: by Ambainis and Freivalds (1998), there is, for any integer n, a language that can be accepted by a DFA with O(n) states but for which each accepting RFA has to have Ω(2^n) states.

For a more detailed treatment of the problem of succinctness in quantum computing see [16].

4.7 Lower bounds methods

As usual, it is far from trivial to show such sharp bounds on succinctness as those presented in the theorems above. So far three methods have been used to do that.

• Classical computing methods. In the case of the last theorem we can argue as follows. Since the language L_n is finite, there is an RFA accepting it. It is easy to see that each RFA for L_n has to have Ω(2^n) states. Indeed, due to the reversibility requirement, each state has to encode the whole input that brings the automaton to that state. Since there are 2^n possible inputs, the total number of states has to be Ω(2^n).

• Probabilistic computing methods. These are methods used to show lower bounds for randomized computations, especially methods used to show lower bounds for probabilistic automata.

• Random access coding method. The above idea does not seem to apply easily to 1QFA for L_n, for at least two reasons. A 1QFA can accept an input, with a certain probability, without reading the whole input. In addition, it is not clear in which sense particular states encode the history of a computation, because at a given moment the automaton can be not in a particular state but in a superposition of states. However, an interesting modification of the above idea works in combination with the new, purely quantum, ideas of the so-called random access coding and serial coding.

One of the basic results of quantum information theory, the so-called Holevo theorem, says that no more than n bits of information can be encoded and later faithfully retrieved from n qubits. However, quite surprisingly, if we relax, in a reasonable sense, this strong requirement of perfect retrieval of all encoded bits, then we can encode m bits into n < m qubits in such a way that each single bit (but not all of them) can be retrieved with quite high probability. This idea has been formalized as follows.

Definition 4.13 An m →_p n random access coding is a mapping f : {0,1}^m × R → C^{2^n} such that for any 1 ≤ i ≤ m there is a quantum measurement O_i producing values 0 or 1 and such that for any w ∈ {0,1}^m, Pr(O_i(f(w, r)) = w_i) ≥ p, where R is a set of random bits; f is called the encoding function and the O_i are said to be the decoding observations. In the case that each observable O_i can depend on the string w_{i+1} ... w_m, we talk about serial coding.
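A concrete instance, assuming the definition above, is the well-known 2 → 1 random access code over a single qubit: the two bits are encoded as a real unit vector at one of four angles, bit 0 is decoded in the computational basis and bit 1 in the basis rotated by π/4, and each bit is recovered with probability cos²(π/8) ≈ 0.85. A sketch (the angles and names are the standard textbook choice, not from this paper):

```python
# Sketch of the classic 2 -> 1 random access code: bits (b0, b1) are encoded
# as a real qubit state at the angle below; bit 0 is decoded by measuring in
# the computational basis, bit 1 in the basis rotated by pi/4.  Every bit is
# recovered with probability cos(pi/8)^2 ~ 0.854.

from math import cos, pi

ANGLE = {(0, 0): pi/8, (1, 0): 3*pi/8, (1, 1): 5*pi/8, (0, 1): 7*pi/8}

def decode_prob(bits, i):
    theta = ANGLE[bits]
    basis = 0 if i == 0 else pi/4         # measurement basis for bit i
    p0 = cos(theta - basis) ** 2          # probability of outcome 0
    return p0 if bits[i] == 0 else 1 - p0

assert all(abs(decode_prob(b, i) - cos(pi/8) ** 2) < 1e-12
           for b in ANGLE for i in (0, 1))
print(round(cos(pi/8) ** 2, 3))  # 0.854
```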

For an m →_p n random access coding, Ambainis et al. (1998) showed the lower bound n ≥ (1 - S(p))m, where S(p) denotes the Shannon (binary) entropy.

• Entropy method. The conceptual framework behind this entropy proof method is very different and requires a more detailed presentation.

Since each computation process of a 1QFA can be seen as a sequence of unitary operations V_σ and of the standard accepting/rejecting/nonterminating measurements, such a computation process can, and should, be seen as producing a sequence of mixed states. When we then consider the computation of a 1QFA for L_n on a random binary input string, one can show easily that the quantum entropy of the mixed states being produced during the computation can only increase, for any symbol σ read and the corresponding unitary operation V_σ, and therefore also for the standard measurement. In addition, for certain languages one can show that this entropy increase is bounded from below for the processing of each symbol. Moreover, the total information capacity of a QFA can be bounded in terms of the number of its states, and this way a lower bound can be obtained on the number of states of an automaton recognizing the language.

If we now have a restricted 1QFA A_n which accepts L_n with probability p > 1/2 and with the set of states Q, then on the basis of the following lemma one can see that after reading k input symbols the resulting mixed state, with density matrix ρ_k, satisfies (where S(p) denotes the Shannon entropy)

S(ρ_k) ≥ (1 - S(p))k.

Lemma 4.14 Let ρ_0 and ρ_1 be two density matrices and ρ = ½(ρ_0 + ρ_1). If O is a measurement with outcomes 0 and 1 such that a measurement of ρ_b yields b with probability p, then

S(ρ) ≥ ½(S(ρ_0) + S(ρ_1)) + 1 - S(p).

From that we get the lower bound for A_n: |Q| ≥ 2^{Ω((1-S(p))n)}. As already discussed above, once a 1QFA B_n is given for L_n, with a set of states Q, we can construct an equivalent restricted 1QFA for L_n with O(n|Q|) states. This leads to the overall lower bound for the number of states |Q| of a 1QFA recognizing L_n:

|Q| ≥ 2^{(1-S(p))n - lg n - O(1)}.
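Numerically, a bound of this shape grows exponentially in n for any fixed acceptance probability p > 1/2. A small sketch (ours; the binary Shannon entropy S is written H below, and the constants are illustrative only):

```python
# Numeric sketch of the entropy lower bound: with acceptance probability p,
# the bound has the shape 2^((1 - H(p)) n - lg n - O(1)) states, where H is
# the binary Shannon entropy.  Constants here are illustrative only.

from math import log2

def binary_entropy(p: float) -> float:
    return -p * log2(p) - (1 - p) * log2(1 - p)

def state_lower_bound(n: int, p: float) -> float:
    return 2 ** ((1 - binary_entropy(p)) * n - log2(n))

# For fixed p = 0.9 the bound is already astronomical at n = 100:
print(state_lower_bound(100, 0.9) > 1e10)  # True
```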


Theorem 4.12 can be strengthened to hold also for a more general class of one-way quantum finite automata, the so-called enhanced one-way real-time quantum automata (e1QFA). Their main new feature is that each time a new symbol σ is read, an arbitrary sequence of unitary operations and orthogonal measurements, depending on σ, is performed; in short, a superoperator is applied to the density matrix representing the current mixed state. The above model is of importance for several reasons. First of all, it is a very natural generalization of the model of 1QFA and, secondly, it is in line with the recent concentration on density matrices and superoperators (as operators that are applied to density matrices). It has been shown by Nayak (1999) that the lower bound 2^{Ω(n)} holds also for e1QFA recognizing the language L_n. The proof goes basically along the same line of reasoning as in the former case. The key new fact to be used is that an application of a measurement (and thereby also of a superoperator) increases entropy at least by the additive term 1 - S(p).

5 Two-way quantum finite automata

The second very basic model is that of the two-way quantum finite automaton (2QFA), in which a state is a superposition of basis states that can correspond to the head being positioned on different squares of the input tape. Formally, 2QFA are defined as follows [28].

A two-way quantum finite automaton A is specified again by an alphabet Σ, a finite set of states Q, an initial state q0, sets Q_a ⊆ Q and Q_r ⊆ Q such that Q_a ∩ Q_r = ∅, and the transition function

δ : Q × Γ × Q × {←, ↓, →} → C,

where Γ = Σ ∪ {#, $} is the tape alphabet of A and # and $ are endmarkers not in Σ, which satisfies the following conditions (of well-formedness) for any q1, q2 ∈ Q, σ, σ1, σ2 ∈ Γ, d ∈ {←, ↓, →}:

1. Local probability and orthogonality condition.

2. Separability condition I.


3. Separability condition II.

Σ_{q'} δ*(q1, σ1, q', →) δ(q2, σ2, q', ←) = 0.

The above conditions are not easy to verify. Fortunately, there is a simpler concept of 2QFA that is equally powerful.

Definition 5.1 A 2QFA A = (Σ, Q, q0, Q_a, Q_r, δ) is simple, or unidirectional, if for each σ ∈ Γ there is a unitary operator V_σ defined on the Hilbert space l_2(Q) and, in addition, a function D : Q → {←, ↓, →} such that for each q ∈ Q, σ ∈ Γ,

δ(q, σ, q', d) = ⟨q'|V_σ|q⟩ if D(q') = d, and 0 otherwise.   (1)

It is straightforward to verify that if we rewrite the well-formedness conditions using the relation (1), then a simple 2QFA A satisfies the well-formedness conditions if and only if

Σ_{q'} ⟨q'|V_σ|q1⟩* ⟨q'|V_σ|q2⟩ = δ_{q1,q2}

for each σ ∈ Γ, which holds if and only if every operator V_σ is unitary.
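The unitarity criterion can be checked mechanically. The sketch below (an illustration, not part of the original text) tests the condition V†V = I with NumPy, contrasting a unitary operator with a merely stochastic one:

```python
import numpy as np

def is_unitary(v, tol=1e-12):
    """A simple 2QFA is well formed iff each V_sigma satisfies V† V = I."""
    v = np.asarray(v, dtype=complex)
    return np.allclose(v.conj().T @ v, np.eye(v.shape[0]), atol=tol)

# Unitary example: a Hadamard-type operator on a two-state space.
hadamard = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

# Non-unitary example: a doubly stochastic (classical probabilistic) matrix.
stochastic = np.array([[0.5, 0.5], [0.5, 0.5]])
```

The stochastic matrix preserves probability (column sums are 1) but not the l_2-norm, which is exactly the distinction between probabilistic and quantum evolution.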

5.1 Power of 2QFA

There are two basic results concerning the power of 2QFA, both due to Kondacs and Watrous (1998).

Theorem 5.2 Each regular language can be recognized by a two-way classical reversible (and therefore also quantum) finite automaton.

The basic idea behind the proof is to make a reversible simulation of a given DFA. This method leads in some cases to an exponential increase of the number of states, and it is not clear whether this is avoidable.

Theorem 5.3 2QFA can recognize, with respect to the bounded-error mode of acceptance, also non-regular languages, such as the context-free language {0^i 1^i | i > 0}, and non-context-free languages, such as {0^i 1^i 0^i | i > 0}.

It should not be difficult to construct a 2QFA recognizing the language L = {0^i 1^i | i > 0} from the following informal description of its behaviour. Figure 1 illustrates the basic trick of such a 2QFA A^(n) accepting strings from the language L (the integer n is here a parameter that controls the probability with which strings not in L are rejected).


Fig. 1. A QFA recognizing the language {0^i 1^i | i ≥ 1}:

Stage 1. The QFA keeps moving right, checking whether the input has the form 0^i 1^j.

Stage 2. At the right endmarker a superposition of new states is created and all states move left, arriving at the left endmarker simultaneously iff the input has the form 0^i 1^i.

Stage 3. After arriving at the left endmarker each state branches into a superposition of new states; if they arrive simultaneously, this superposition results in a single (accepting) state.

Stage 4. A measurement is performed.

Each computation of A^(n) consists of three phases. In the first phase any input word not of the form 0^i 1^j is rejected (this can be done actually by a classical reversible automaton). For words of the form 0^i 1^j the phase ends in a state with the head on the right endmarker $. As the first step of the second phase a superposition of n special states is formed. This way the computation in a sense "branches" into n parallel paths (actually into their superposition).

In the jth path, the head moves deterministically to the left endmarker according to the following rules. Each time the head is on a new cell and reads 0 (1), it remains stationary for j (n − j + 1) steps and then moves one cell left. Therefore, for an input of the form 0^u 1^v the jth head requires exactly (j + 1)u + (n − j + 2)v + 1 steps to reach the left endmarker. If j ≠ j', then

(j + 1)u + (n − j + 2)v + 1 = (j' + 1)u + (n − j' + 2)v + 1 if and only if u = v.

This implies that the heads of all n different computational paths reach the left endmarker at the same time if and only if u = v.
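The timing argument can be checked directly. The following sketch (illustrative, with hypothetical function names) computes the arrival times of the n paths and tests whether they coincide:

```python
def arrival_time(j, u, v, n):
    """Steps for the j-th computational path of A^(n) to reach the left
    endmarker on input 0^u 1^v: (j + 1)u + (n - j + 2)v + 1."""
    return (j + 1) * u + (n - j + 2) * v + 1

def paths_synchronize(u, v, n):
    """All n paths reach the left endmarker simultaneously iff u == v."""
    times = {arrival_time(j, u, v, n) for j in range(1, n + 1)}
    return len(times) == 1
```

For u = v the time (n + 3)u + 1 is independent of j, while for u ≠ v the difference between paths j and j' is (j − j')(u − v) ≠ 0, so all n arrival times are distinct.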

In the third phase, consisting of only one computational step and one measurement, each computation path splits again; this time the resulting superposition is obtained by an application of the QFT (Quantum Fourier Transform).

In the case u = v all these splittings occur simultaneously and the resulting superposition equals exactly |s_a⟩, where |s_a⟩ is the single accepting state. At that moment an observation is performed, using the measurement making a projection


into the subspace spanned either by accepting, rejecting or non-terminating configurations. In the case u = v, the result of such a measurement is "accept" with probability 1.
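The interference step can be illustrated numerically: when all n paths arrive in phase (the case u = v), the QFT maps the uniform superposition onto a single basis state. A sketch of just this linear-algebra trick (not of the automaton itself):

```python
import numpy as np

def qft_matrix(n):
    """n x n Quantum Fourier Transform: F[j, k] = omega^(j*k) / sqrt(n)."""
    omega = np.exp(2j * np.pi / n)
    j, k = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return omega ** (j * k) / np.sqrt(n)

n = 8
uniform = np.ones(n) / np.sqrt(n)   # n paths arriving in phase
result = qft_matrix(n) @ uniform    # constructive interference
# All amplitude is concentrated on the single basis state |0>.
```

If the paths arrived with differing phases, the amplitude would instead be spread over many basis states, and the subsequent measurement would accept only with small probability.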

In the case u ≠ v only one head arrives first at the left endmarker, and a measurement can then accept the string only with probability 1/n.

5.2 1.5-way quantum finite automata

A natural modification of the concept of 2QFA is a 2QFA whose head cannot move left: at each computation step the head either stays stationary or moves right. Such automata are called 1.5QFA.

It is an important open problem to determine whether 1.5QFA can recognize all regular languages with respect to the bounded-error mode of acceptance. The method used to show that all regular languages are accepted by 2QFA does not work for 1.5QFA, and neither does the method used to show that a regular language is not acceptable by a 1QFA.

There is, however, a result showing that such automata are quite powerful. It was shown by Amano and Iwama (1999), by a reduction from the halting problem for one-register machines, that the emptiness problem is undecidable for this type of automata, which is quite surprising because this problem is, in the classical case, decidable even for pushdown automata.

In addition, they have shown that 1.5QFA can accept the language {0^i 1 0^i | i > 0}, actually using a small modification of the method used to show that 2QFA can accept the language {0^i 1^i | i > 0}.

Let us now list some open problems for 1.5QFA.

1. Can 1.5QFA accept some languages accepted by 1QFA, but with larger probability?

2. Can some 1.5QFA have fewer states than any 1QFA recognizing the same language?

3. What is the power of 1.5QFA?

5.3 Two-way classical/quantum finite automata

The models of QFA considered so far have all been natural quantum versions of the classical models of automata. Of a different type is the model introduced in [4], called two-way finite automata with quantum and classical states (2QCFA). This model is more powerful than classical (probabilistic) 2FA and at the same time it seems to be more realistic, and really more "finite", than 2QFA (because 2QFA actually need quantum memory of size O(n) to process an input of size n).

Fig. 2. A model of a 2QCFA: the result of a measurement on the quantum register determines the action of the classical part of the automaton.

A 2QCFA A is defined similarly to a classical 2FA but, in addition, A has a fixed-size quantum register (which can be in a mixed state) upon which the automaton can perform either a unitary operation or a measurement. A 2QCFA has a classical initial state q0 and an initial quantum state.

The evolution of the classical part of the automaton and of the quantum state of the register is specified by a mapping Θ that assigns to each classical state q and tape symbol σ an action Θ(q, σ).

One possibility is that Θ(q, σ) = (q', d, U), where q' is a new state, d is the next movement of the head (to the left, no movement, or to the right), and U is a unitary operator to be performed on the current state of the quantum register.

The second possibility is that Θ(q, σ) = (M, m1, q1, d1, m2, q2, d2, ..., mk, qk, dk), where M is a measurement on the register state, m1, ..., mk are its possible classical outcomes, and for each measurement outcome a new state and a new movement of the head are determined. In such a case the state transition and the head movement are probabilistic.

It has been shown in [4] that 2QCFA with only one qubit of quantum memory are already more powerful than 2FA. Such 2QCFA can accept the language of palindromes over the alphabet {0, 1}, which cannot be accepted by probabilistic 2FA at all, and also the language {0^i 1^i | i ≥ 0} in polynomial time. The latter language can be accepted by probabilistic 2FA, but only in exponential time.

In the above model only projection measurements have been considered. It is not clear whether something could be gained, especially concerning the number of classical states, by considering also POVMs.

6 Quantum almost finite automata

Let us also briefly discuss the main modifications of the models of quantum finite automata discussed above.

Quantum finite multitape automata have been introduced by Ambainis et al. (1999) with the idea of showing that for such a model the quantum version of the automata is provably more powerful than the classical probabilistic one. Several


languages have been designed that are of increasing complexity when accepted by probabilistic classical versions of the automata. However, the final proof that such a quantum model is more powerful than the classical one is still missing.

For the case of two tapes only, it has been shown by Bonner et al. (2000b) that for such quantum automata the emptiness problem is undecidable. This has been shown actually even for a weaker model in which at each step at least one of the heads has to move right.

Quantum finite counter automata have been introduced by Kravtsev (1999) and studied also by Yamasaki et al. (1999). There are two major results concerning this model: it is provably more powerful than its classical probabilistic version (see Bonner et al., 2000a), and the emptiness problem for this model is undecidable (see Bonner et al., 2000b); this has been shown by a reduction from the Post correspondence problem.

Quantum pushdown automata were introduced first by Moore and Crutchfield (1997) and in a more elaborate way by Golovkins (2000). He has shown that the following languages can be recognized by reversible pushdown automata (RPDA): (a) L1 = {0, 1}*; (b) L2 = {w | |w|_0 = |w|_1, w ∈ {0, 1}*}. The following languages can be recognized by QPDA with a probability greater than 1/2: (a) L3 = {w | |w|_0 = |w|_1 = |w|_2, w ∈ {0, 1, 2}*}; (b) L4 = {w | |w|_0 = |w|_1 or |w|_0 = |w|_2}. The last language is known not to be recognizable by a DPDA. It is not clear whether this language is recognizable by a probabilistic PDA. In general, it is not yet known whether QPDA are more powerful than classical probabilistic pushdown automata.

In all these cases a nontrivial problem was to develop proper well-formedness conditions.


1. Leonard M. Adleman, Jonathan DeMarrais, and Ming-Deh A. Huang. Quantum computability. SIAM Journal on Computing, 26(5):1524-1540, 1997.

2. Masami Amano and Kazuo Iwama. Undecidability on quantum finite automata. In Proceedings of 31st ACM STOC, pages 368-375, 1999.

3. Andris Ambainis, Richard Bonner, Rusins Freivalds, and Arnolds Kikusts. Probabilities to accept languages by quantum finite automata. Technical report, quant-ph/9904066, 1999.

4. Andris Ambainis and John Watrous. Two-way finite automata with quan- tum and classical states. Technical report, quant-ph/9911009, 1999.

5. Andris Ambainis and Rusins Freivalds. 1-way quantum finite automata: strengths, weaknesses and generalizations. In Proceedings of 39th IEEE FOCS, pages 332-341, 1998. quant-ph/9802062.

6. Andris Ambainis, Ashwin Nayak, Amnon Ta-Shma, and Umesh Vazirani. Dense quantum coding and a lower bound for 1-way quantum finite automata. Technical report, quant-ph/9804043, 1998.

7. Andris Ambainis, Ashwin Nayak, Amnon Ta-Shma, and Umesh Vazirani. Dense quantum coding and a lower bound for 1-way quantum finite automata. Technical report, quant-ph/9804043, 1998.

8. Charles H. Bennett. Logical reversibility of computation. IBM Journal of Research and Development, 17:525-532, 1973.

9. Ethan Bernstein and Umesh Vazirani. Quantum complexity theory. SIAM Journal on Computing, 26(5):1411-1473, 1997.

10. Aija Berzina, Richard Bonner, and Rusins Freivalds. Parameters in the Ambainis-Freivalds algorithm. In Proceedings of the International Workshop on Quantum Computing and Learning, Sundbyholms Slott, Sweden, May 2000, pages 101-109, 2000.

11. Richard Bonner, Rusins Freivalds, and Renars Gailis. Undecidability of 2-tape quantum finite automata. In Proceedings of the International Workshop on Quantum Computation and Learning, Sundbyholms, May 27-29, 2000, pages 93-100, 2000.

12. Richard Bonner, Rusins Freivalds, and Maxim Kravtsev. Quantum versus probabilistic 1-way finite automata with counter. In Proceedings of the International Workshop on Quantum Computation and Learning, Sundbyholms, May 27-29, 2000, pages 80-88, 2000a.

13. Richard Bonner, Rusins Freivalds, and Madars Rikards. Undecidability of quantum finite 1-counter automaton. In Proceedings of the International Workshop on Quantum Computation and Learning, Sundbyholms, May 27-29, 2000, pages 65-71, 2000b.


14. Richard Bonner, Rusins Freivalds, and Maxim Kravtsev. Quantum versus probabilistic one-way finite automata with counter. In Proceedings of the International Workshop on Quantum Computing and Learning, Sundbyholms Slott, Sweden, May 2000, pages 80-88, 2000a.

15. Alex Brodsky and Nicholas Pippenger. Characterization of 1-way quantum finite automata. Technical report, quant-ph/9903014, 1999.

16. Lance Fortnow. One complexity theorist's view of quantum computing. Technical report, NEC Research Institute; to appear in CATS 2000 Proceedings and in ENTCS.

17. Marats Golovkins. On quantum pushdown automata. In Proceedings of the International Workshop on Quantum Computation and Learning, Sundbyholms, May 27-29, 2000, pages 41-51, 2000.

18. Jozef Gruska. Quantum computing. McGraw-Hill, 1999. See also additions and updatings of the book at http://www.mcgraw-hill.co.uk/gruska.

19. Jozef Gruska. Descriptional complexity issues in quantum computing. Journal of Automata, Languages and Combinatorics, 5:191-218, 2000.

20. Jozef Gruska. Mathematics unlimited, 2001 and beyond, chapter Quantum computing challenges, pages ?-?+37. Springer, 2000.

21. Stanley Gudder. Basic properties of quantum automata. Technical report, Department of Computer Science, University of Denver, 2000.

22. Arnolds Kikusts and Zigmars Rasscevskis. On the accepting probabilities of 1-way quantum finite automata. In Proceedings of the International Workshop on Quantum Computation and Learning, Sundbyholms, May 27-29, 2000, pages 72-79, 2000.

23. Attila Kondacs and John Watrous. On the power of quantum finite state automata. In Proceedings of 38th IEEE FOCS, pages 66-75, 1997.

24. Maksim Kravtsev. Quantum finite one-counter automata. Technical report, quant-ph/9905092, 1999.

25. Cristopher Moore and James P. Crutchfield. Quantum automata and quantum grammars. Technical report, Santa Fe Institute, 1997.

26. Kenichi Morita. A simple construction method of a reversible finite automaton out of Fredkin gates, and its related problems. Transactions of the IEICE, E73:978-984, 1990.

27. Ashwin Nayak. Optimal lower bounds for quantum automata and random access codes. In Proceedings of 40th IEEE FOCS, pages 369-376, 1999.

28. Jean-Eric Pin. On the languages accepted by finite reversible automata. In Proceedings of 14th ICALP, pages 237-249. LNCS 267, Springer-Verlag, 1987.


29. Daniel R. Simon. On the power of quantum computation. In Proceedings of 35th IEEE FOCS, pages 116-123, 1994. See also SIAM Journal on Computing, 26(5):1474-1483, 1997.

30. Maris Valdats. The class of languages recognizable by 1-way quantum automata is not closed under union. Technical report, quant-ph/000115, 2000.

31. John Watrous. On the power of 2-way quantum finite automata. Technical report, University of Wisconsin, 1997.


On commutative asynchronous automata *

B. Imreh† M. Ito‡ A. Pukler§

Abstract

The class of commutative asynchronous automata is investigated here. By characterizing the subdirectly irreducible members of this class, it is proved that every commutative asynchronous automaton can be embedded isomorphically into a quasi-direct power of a suitable two-state automaton. We also prove that the exact bound on the lengths of minimum-length directing words of n-state directable commutative asynchronous automata is equal to n − 1; moreover, it is ⌈log2(n)⌉ for the subclass containing all directable commutative asynchronous automata generated by one element.

1 Introduction

An automaton is asynchronous if for every input sign and state, the next state is stable for the input sign considered. Asynchronous automata have been studied from different aspects. We mention here only the papers [6] and [7], which deal with the decomposition of an arbitrary automaton into a serial composition of two automata having fewer states than the original one, where one of the two is asynchronous.

It is said that an automaton is commutative if for every pair of its input signs, the transition of the states is independent of the order of the signs of the

*This work has been supported by the Japanese Ministry of Education, Mombusho International Scientific Research Program, Joint Research 10044098, the Hungarian National Foundation for Scientific Research, Grant T030143, and the Ministry of Culture and Education of Hungary, Grant FKFP 0704/1997.

†Department of Informatics, University of Szeged, Árpád tér 2, H-6720 Szeged, Hungary
‡Department of Mathematics, Faculty of Science, Kyoto Sangyo University, Kyoto 603-8555, Japan
§Department of Computer Science, István Széchenyi College, Hédervári út 3, H-9026 Győr, Hungary


pair. Commutative automata have been studied from different points of view. Regarding the isomorphic and homomorphic representations of commutative automata, we mention the papers [2], [3], [4], [5], [8], [12], [13]. As far as directable commutative automata are concerned, we refer to the works [9] and [11].

In this paper, we deal with the intersection of these classes, namely the class of commutative asynchronous automata. After the preliminaries of Section 2, we present in Section 3 the description of the subdirectly irreducible members of this class, and as an application of this description, we characterize the isomorphically complete systems for this class with respect to the quasi-direct product. In Section 4, the directable commutative asynchronous automata are investigated. We give the exact bound for the lengths of the minimum-length directing words of directable commutative asynchronous automata. Finally, we consider a subclass of the previous class, namely the class of directable commutative asynchronous automata generated by one element, and also give the exact bound for the lengths of the shortest directing words of the members of this class.

2 Preliminaries

The cardinality of a set A is denoted by |A|. The diagonal relation on A is denoted by ω_A, i.e., ω_A = {(a, a) : a ∈ A}. Let X be a finite nonempty alphabet. The set of all finite words over X is denoted by X*, and X+ = X* \ {ε}, where ε denotes the empty word of X*. For any p ∈ X*, let alph(p) denote the set of the letters which occur in p, i.e., x ∈ alph(p) if and only if x occurs in p.

By an automaton we mean a triplet A = (A, X, δ), where A and X are finite nonempty sets, the set of states and the set of input signs, respectively, and δ : A × X → A is the transition function. An automaton can also be defined as an algebra A = (A, X) in which each input sign x is realized as the unary operation x^A : A → A, a ↦ δ(a, x). The transition function can be extended to A × X* in the usual way. Each word p ∈ X* then defines a unary operation p^A : A → A, a ↦ δ(a, p). If C ⊆ A and p ∈ X*, then let Cp^A = {cp^A : c ∈ C}. In what follows, if there is no danger of confusion, we write ap and Cp instead of ap^A and Cp^A, respectively. A state a* of A is called a dead state if a*x = a* holds for all x ∈ X.

Using the second definition of automata, notions such as subautomaton, generating element, congruence relation, and subdirectly irreducible automaton can be defined in the usual way. We may associate with any nontrivial subautomaton B of A a congruence relation σ_B, called the Rees congruence belonging to B, as follows. For every a, b ∈ A, let

a σ_B b if and only if a, b ∈ B or a = b.

An automaton A = (A, X) is commutative if axy = ayx holds for all a ∈ A and x, y ∈ X. Another particular class of automata is that of the asynchronous automata: A = (A, X) is asynchronous if for every a ∈ A and x ∈ X, axx = ax holds. For the sake of simplicity, let us denote by K the class of all commutative asynchronous automata. We use the notion of connectivity defined as follows. The automaton A = (A, X) is connected if for every pair of states a, b there are input words p, q ∈ X* such that ap = bq.

A word w ∈ X* is called a directing word of an automaton A = (A, X) if it takes A from every state into the same state, or in other words, if |Aw| = 1. An automaton is called directable if it has a directing word. Let A = (A, X) now be a directable automaton. Furthermore, let

d(A) = min{|w| : w is a directing word of A}.

Regarding the meaning of d(A), it gives the length of the shortest directing words of A. If w is a directing word of A and |w| = d(A), then w is called a minimum-length directing word of A. Now, for every positive integer n, take the maximum of the lengths of the shortest directing words of all directable automata of n states, i.e., let

d(n) = max{d(A) : A is a directable automaton of n states}.

The visual meaning of d(n) can be given as follows. For every directable automaton of n states, there exists a directing word whose length does not exceed d(n); moreover, there is a directable automaton of n states for which the length of the shortest directing word is equal to d(n).

Regarding d(n), Černý [1] has a famous conjecture which claims that d(n) ≤ (n − 1)². This conjecture has been neither proved nor disproved so far, and thus it remains an open problem of the theory of automata. On the other hand, considering the directable members of special classes of automata, sometimes a better bound than (n − 1)² can be given (see e.g. [9], [10], [11]). The question can be restricted to particular classes of automata as follows. Let M be an arbitrary class of automata. Furthermore, let

d_M(n) = max{d(A) : A ∈ M and A is directable}.

We shall present the values d_K(n) and d_{K*}(n), where K* is the subclass of K containing all commutative asynchronous automata generated by one element.


Let A_t = (A_t, X_t), t = 1, ..., k, be a system of automata. Moreover, let X be a finite nonempty alphabet, and φ a mapping of X into ∏_{t=1}^k X_t such that φ is given in the form φ(x) = (φ_1(x), ..., φ_k(x)). Then, the automaton A = (∏_{t=1}^k A_t, X) is called the quasi-direct product of A_t, t = 1, ..., k, where (a_1, ..., a_k)x^A = (a_1 φ_1(x)^{A_1}, ..., a_k φ_k(x)^{A_k}) holds for every (a_1, ..., a_k) ∈ ∏_{t=1}^k A_t and x ∈ X. In particular, the automaton A is called a quasi-direct power of B if A_1 = ... = A_k = B for some automaton B.
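Under a hypothetical dict-based encoding of automata (a pair of a state list and a transition table), the quasi-direct product can be sketched as follows; the names are illustrative, not from the paper:

```python
from itertools import product

def quasi_direct_product(automata, X, phi):
    """Quasi-direct product of automata A_1, ..., A_k over alphabet X.

    Each automaton is a pair (states, delta) with delta[(state, sign)] a
    state; phi[x] = (phi_1(x), ..., phi_k(x)) picks a component input
    sign for each factor.
    """
    states = list(product(*(a[0] for a in automata)))
    delta = {}
    for s in states:
        for x in X:
            delta[(s, x)] = tuple(a[1][(si, ci)]
                                  for a, si, ci in zip(automata, s, phi[x]))
    return states, delta

# A two-state automaton B: 0a = 0, 0b = 1, 1a = 1b = 1 (an elevator).
B = ([0, 1], {(0, 'a'): 0, (0, 'b'): 1, (1, 'a'): 1, (1, 'b'): 1})
states, delta = quasi_direct_product([B, B], ['x'], {'x': ('a', 'b')})
```

Here the single product letter 'x' acts as 'a' on the first factor and as 'b' on the second, which is exactly the freedom that distinguishes the quasi-direct product from the plain direct product.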

Then, the following statement can be easily proved by the definitions.

Lemma 1. If A = (A, X) can be embedded isomorphically into a direct product ∏_{i=1}^k A_i, where for every i, i = 1, ..., k, A_i = (A_i, X) can be embedded isomorphically into a quasi-direct product of the automata B_it, t = 1, ..., r_i, then A can be embedded isomorphically into a quasi-direct product of the automata B_it, t = 1, ..., r_i; i = 1, ..., k.

Now let M be an arbitrary class of automata. Furthermore, let Σ be a system of automata. It is said that Σ is isomorphically complete for M with respect to the quasi-direct product if for every automaton A ∈ M there are automata A_t ∈ Σ, t = 1, ..., k, such that A can be embedded isomorphically into a quasi-direct product of A_t, t = 1, ..., k.

3 Isomorphic representation

First of all, we prove the following obvious statement.

Lemma 2. If A = (A, X) ∈ K, then the transition graph of A cannot contain any directed cycle other than a loop.

Proof. On the contrary, let us suppose that a ∈ A, a ≠ ay, and ayp = a for some y ∈ X and p ∈ X*. Then, since A is commutative and asynchronous, ay = (ayp)y = ayyp = ayp = a, which is a contradiction.

Let X = X1 ∪ X2 be a finite nonempty alphabet, where X1 and X2 are disjoint sets. Let us define the automaton E_{X1,X2} = ({0, 1}, X1 ∪ X2) as follows. For every x1 ∈ X1 and x2 ∈ X2, let 0x1 = 0, 1x1 = 1x2 = 1, and 0x2 = 1. The automaton E_{X1,X2} is called the elevator over X1 and X2. Then, we have the following characterization of the subdirectly irreducible commutative asynchronous automata.

Theorem 1. An automaton A = (A, X) ∈ K with |A| ≥ 2 is subdirectly irreducible if and only if there are disjoint subsets X1 and X2 of X such that X1 ∪ X2 = X and A is isomorphic to E_{X1,X2}.
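Before the proof, a quick mechanical check that the elevator is indeed a member of K, i.e., commutative and asynchronous (an illustrative sketch; the encoding is a hypothetical one):

```python
# Elevator E_{X1,X2} over X1 = {'x'} and X2 = {'y'}: states {0, 1} with
# 0x = 0, 0y = 1, and 1x = 1y = 1.
X = {'x', 'y'}
delta = {(0, 'x'): 0, (0, 'y'): 1, (1, 'x'): 1, (1, 'y'): 1}

def run(state, word):
    """Apply a word of input signs to a state."""
    for sign in word:
        state = delta[(state, sign)]
    return state

# Commutative: a.xy == a.yx; asynchronous: a.xx == a.x.
commutative = all(run(a, x + y) == run(a, y + x)
                  for a in (0, 1) for x in X for y in X)
asynchronous = all(run(a, x + x) == run(a, x)
                   for a in (0, 1) for x in X)
```

Intuitively the elevator only ever moves "up" from 0 to 1 and then stays there, which is why no order of input signs and no repetition can ever matter.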


Proof. If |A| = 2, then the statement is obviously valid. Consequently, it is sufficient to prove that a commutative asynchronous automaton is subdirectly reducible if |A| > 2. For this purpose, let A = (A, X) be a commutative asynchronous automaton with |A| > 2.

By the commutativity of A, the automaton A is either connected or a disjoint union of its connected subautomata. In the latter case, it is easy to prove (see e.g. [3]) that A is subdirectly reducible. Now, let us suppose that A is connected, and define the following relation on A: let a ≤ b if and only if there is a word p ∈ X* such that ap = b. This relation is a partial ordering on A, since the transition graph of A is cycle-free. Since A is connected, there is a greatest element a* in (A, ≤), which is a dead state of A. Let us consider the partially ordered set (A \ {a*}, ≤). We distinguish two cases depending on the number of maximal elements of (A \ {a*}, ≤).

Case 1. (A \ {a*}, ≤) has at least two maximal elements. Let b1, b2 denote two different maximal elements in (A \ {a*}, ≤). Then, the states a*, b1 and a*, b2 constitute subautomata of A, and for the corresponding nontrivial Rees congruences σ_{{a*,b1}} and σ_{{a*,b2}}, we have σ_{{a*,b1}} ∩ σ_{{a*,b2}} = ω_A. Therefore, A is subdirectly reducible.

Case 2. (A \ {a*}, ≤) has one and only one maximal element, denoted by a1. Since A is connected and |A| ≥ 3, there is a maximal element in (A \ {a*, a1}, ≤), which is denoted by a2. Let us classify the elements of X as follows: X1 = {x ∈ X : a1x = a1} and X2 = {x ∈ X : a1x = a*}. Since a2 ≤ a1 and A is asynchronous, X1 ≠ ∅. Moreover, by a1 ≤ a*, we get that X2 ≠ ∅ as well. Now, let us define the equivalence relation ρ ⊆ A × A as follows. For any a, b ∈ A, let

a ρ b if and only if a, b ∈ {a1, a2} or a = b.

We prove that ρ is a congruence relation of A. For this purpose, let x ∈ X2 be an arbitrary input sign. Then, a2x = a* must hold. Indeed, in the opposite case, a2x ∈ {a1, a2}. If a2x = a1, then a2xx = a* ≠ a1, which is a contradiction since A is asynchronous. If a2x = a2, then let y ∈ X be an input sign for which a2y = a1; since a2 ≤ a1, such an input sign exists. Then, a1 = a2y = a2xy = a2yx = a1x = a*, which is a contradiction again. Thus, a2x = a1x = a* for all x ∈ X2. On the other hand, a2x ρ a1x for all x ∈ X1. If it is not so, then a2x = a* for some x ∈ X1. Now, let y ∈ X be an input sign for which a2y = a1; since a2 ≤ a1, such an input sign exists. In this case, a* = a2xy = a2yx = a1x = a1, which is a contradiction. Consequently, ρ is a nontrivial congruence relation of A.

A further nontrivial congruence of A is the Rees congruence σ_{{a*,a1}} belonging to the subautomaton {a*, a1}, and obviously, ρ ∩ σ_{{a*,a1}} = ω_A,


which results in the subdirect reducibility of A.

Now, by Theorem 1, we can characterize the isomorphically complete systems for K with respect to the quasi-direct product as follows.

Theorem 2. A system Σ of automata is isomorphically complete for K with respect to the quasi-direct product if and only if Σ contains an automaton A = (A, X) such that the elevator E_{{x},{y}} can be embedded isomorphically into a quasi-direct product of A with a single factor.

Proof. The necessity of the condition is obvious. To prove the sufficiency, let us suppose that E_{{x},{y}} can be embedded isomorphically into a quasi-direct product of A with a single factor for some A ∈ Σ. Then, it is easy to see that any elevator E_{X1,X2} can be embedded isomorphically into a quasi-direct product of A with a single factor. Now, let A' = (A', X) be an arbitrary commutative asynchronous automaton. By Theorem 1, A' can be embedded isomorphically into a direct product of some elevators; let us denote them by E_{X11,X12}, ..., E_{Xk1,Xk2}. On the other hand, every E_{Xt1,Xt2}, t = 1, ..., k, can be embedded isomorphically into a quasi-direct product of A with a single factor. Now, by Lemma 1, we obtain that A' can be embedded isomorphically into a quasi-direct power of A, and thus Σ is an isomorphically complete system for the class K with respect to the quasi-direct product.

4 Minimum-length directing words

Regarding the maximum of the lengths of the minimum-length directing words of commutative asynchronous automata of n states, the following statement is valid.

Theorem 3. d_K(n) = n − 1 for every integer n ≥ 1.

Proof. It is known (cf. [9], [11]) that the maximum of the lengths of the minimum-length directing words of directable commutative automata of n states is equal to n − 1. Therefore, d_K(n) ≤ n − 1. To prove that equality is possible, let n ≥ 1 be an arbitrary integer, and let us consider the automaton A_n = ({1, ..., n}, {x_1, ..., x_{n−1}}), where

jx_i = n if j = i, jx_i = j otherwise, and nx_i = n,

for all x_i ∈ {x_1, ..., x_{n−1}} and j ∈ {1, ..., n − 1}. It is obvious that A_n is a directable commutative asynchronous automaton and d(A_n) = n − 1. Consequently, d_K(n) = n − 1, which ends the proof of Theorem 3.
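The automaton A_n and the claim d(A_n) = n − 1 can be checked by a breadth-first search over subsets of states; the sketch below (hypothetical helper names, not from the paper) finds a minimum-length directing word:

```python
from collections import deque

def shortest_directing_word(states, signs, delta):
    """BFS on subsets of states; returns a minimum-length directing word
    (as a list of input signs), or None if the automaton is not directable."""
    start = frozenset(states)
    seen = {start: []}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        if len(s) == 1:
            return seen[s]          # BFS order guarantees minimality
        for x in signs:
            t = frozenset(delta[(q, x)] for q in s)
            if t not in seen:
                seen[t] = seen[s] + [x]
                queue.append(t)
    return None

# A_n from the proof: j.x_i = n if j = i, j otherwise; n.x_i = n.
n = 4
signs = list(range(1, n))
delta = {(j, i): (n if j in (i, n) else j)
         for j in range(1, n + 1) for i in signs}
word = shortest_directing_word(range(1, n + 1), signs, delta)
```

Since the sign x_i is the only one that collapses state i into n, every directing word of A_n must contain all n − 1 signs, which the search confirms.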


In what follows, we show that this bound decreases drastically if we restrict ourselves to automata generated by one element. To do this, we need the following observations.

Lemma 3. If A = (A, X) ∈ K and p ∈ X*, then ap = ax_{i1} ⋯ x_{il} for all a ∈ A, where {x_{i1}, ..., x_{il}} = alph(p).

Proof. Let p ∈ X* be an arbitrary word and let us suppose that x ∈ X occurs in p more than once. Then, there are words r, s, t ∈ X* such that p = rxsxt. By commutativity, ap = axxrst for all a ∈ A. On the other hand, A is asynchronous, and thus axx = ax for all a ∈ A. Therefore, ap = axrst for all a ∈ A, which yields the validity of Lemma 3.
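Lemma 3 says the action of a word depends only on its set of letters. This can be spot-checked on the automaton A_n from the proof of Theorem 3 (the encoding below is an illustrative assumption):

```python
n = 4  # states 1..n; input sign i sends state i to n and fixes the rest

def step(state, sign):
    return n if state in (sign, n) else state

def run(state, word):
    for sign in word:
        state = step(state, sign)
    return state

def first_occurrences(word):
    """alph(p), kept in order of first occurrence."""
    seen = []
    for sign in word:
        if sign not in seen:
            seen.append(sign)
    return seen

# Lemma 3: a.p = a.x_{i1}...x_{il} with {x_{i1}, ..., x_{il}} = alph(p).
lemma_holds = all(run(a, w) == run(a, first_occurrences(w))
                  for a in range(1, n + 1)
                  for w in ([1, 1, 2], [2, 1, 2, 1], [3, 3, 3, 2, 3, 1]))
```

In a commutative asynchronous automaton repeated letters can first be gathered together (commutativity) and then collapsed (asynchrony), which is exactly the two-step argument of the proof above.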

Lemma 4. If A = (A, X) ∈ K* is a directable automaton and w is its directing word, then A contains one and only one dead state, and w takes A from every state to the dead state.

Proof. Let a0 denote the generating element of A. Since w is a directing word, Aw = {a*} for some state a* ∈ A. We show that a* is a dead state of A. Indeed, let x ∈ X be arbitrary. Then, a*x = a0wx = a0xw = a'w = a*, where a' = a0x. Finally, let us observe that A cannot contain a further dead state. Indeed, if ā is a further dead state, then {a*, ā}w = {a*, ā}, which contradicts the fact that w is a directing word of A.

Theorem 4. If A = (A, X) ∈ K* is a directable automaton and w is one of its minimum-length directing words, then |w| ≤ |X| and |A| ≥ 2^{|w|}.

Proof. The inequality |w| ≤ |X| immediately follows from Lemma 3. In order to prove the lower bound for the number of states, as a consequence of Lemma 3, we may suppose that w = x_1 ··· x_k, where x_1, ..., x_k are pairwise different. We prove that |A| ≥ 2^k by induction on k. If k = 1, then the statement is obviously valid. Let k ≥ 1 be an arbitrary integer, and let us suppose that the statement is valid for k. Furthermore, let A = (A, X) be such a directable commutative asynchronous automaton generated by one element whose minimum-length directing word is w = x_1 ··· x_k x_{k+1}, where x_1, ..., x_{k+1} are pairwise different input signs of A. Let us denote the generating element of A by a_0.

We show that the automaton Ā = (Ā, {x_1, ..., x_{k+1}}) is also a directable commutative asynchronous automaton generated by a_0, and that w is a minimum-length directing word of Ā, where Ā = {a_0p : p ∈ {x_1, ..., x_{k+1}}*} and the action of each x_j on Ā is that of A restricted to Ā.

Obviously, w is a directing word of Ā, and thus we have to prove only the minimality of w. By Lemma 4, aw = a* for all a ∈ A, where a* is the unique dead state of A. Then āw = a* for all ā ∈ Ā, where a* is also the unique dead state of Ā. Let us suppose now that w is not a minimum-length directing word of Ā. Then there are pairwise different letters x_{i_1}, ..., x_{i_l} ∈ {x_1, ..., x_{k+1}} such that l < k + 1 and w̄ = x_{i_1} ··· x_{i_l} is also a directing word of Ā. By Lemma 4, āw̄ = a* for all ā ∈ Ā, since Ā has exactly one dead state, a*. Now we prove that w̄ is a directing word of A as well. Let a ∈ A be an arbitrary state. Then a = a_0p for some p ∈ X*. Furthermore, aw̄ = a_0pw̄ = a_0w̄p = a*p = a* since a* is a dead state of A. Consequently, Aw̄ = {a*}, which contradicts the fact that w is a minimum-length directing word of A. This yields that w is a minimum-length directing word of Ā as well.

Now, let us consider the automaton B = (B, {x_1, ..., x_k}) which is generated by a_0 in A under the input signs x_1, ..., x_k. Obviously, B is a commutative asynchronous automaton which is generated by a_0. We prove that B is a directable automaton and x_1 ··· x_k is a minimum-length directing word of B. For this purpose, let a_i ≠ a_j be two arbitrary states of B. Then it is sufficient to prove that

a_i x_1 ··· x_k = a_j x_1 ··· x_k.

Since B is generated by a_0, there are p, q ∈ {x_1, ..., x_k}* such that a_i = a_0p and a_j = a_0q. But then

a_i x_1 ··· x_k = a_0px_1 ··· x_k = a_0x_1 ··· x_k = a_0qx_1 ··· x_k = a_j x_1 ··· x_k.

Let us observe that x_1 ··· x_k must be a minimum-length directing word of B. Indeed, in the opposite case, if x_{i_1} ··· x_{i_l} were a minimum-length directing word of B, where x_{i_t} ∈ {x_1, ..., x_k}, t = 1, ..., l, and l < k, then x_{i_1} ··· x_{i_l} x_{k+1} would be a directing word of A, which is a contradiction.

Let C = {a x_{k+1} : a ∈ B}. Then C = (C, {x_1, ..., x_{k+1}}) is a directable subautomaton of A with the minimum-length directing word x_1 ··· x_k, and Ā = C ∪ B. Now, we show that C ∩ B = ∅. On the contrary, let us suppose that a_i ∈ C ∩ B. Then a_i = a_0p for some p ∈ {x_1, ..., x_k}* and a_i = a_j x_{k+1} = a_0qx_{k+1} for some a_j ∈ B and q ∈ {x_1, ..., x_k}*. Thus,

a* = a_0x_1 ··· x_k x_{k+1} = a_0x_{k+1}x_1 ··· x_k = a_0qx_{k+1}x_1 ··· x_k = a_0px_1 ··· x_k = a_0x_1 ··· x_k,

where a* denotes the dead state of A. Therefore, for any a_t ∈ B,

a_t x_1 ··· x_k = a_0sx_1 ··· x_k = a_0x_1 ··· x_k = a*, where s ∈ {x_1, ..., x_k}* is such that a_0s = a_t.

Moreover, for any c ∈ C,

c x_1 ··· x_k = a_0ux_{k+1}x_1 ··· x_k = a_0x_{k+1}x_1 ··· x_k = a_0x_1 ··· x_k = a*, where u ∈ {x_1, ..., x_k}* is such that c = a_0ux_{k+1}.

This yields that x_1 ··· x_k is a directing word of Ā, which contradicts the minimality of w = x_1 ··· x_{k+1}. Consequently, C ∩ B = ∅.

Now, since C is a directable commutative asynchronous automaton generated by a_0x_{k+1} and x_1 ··· x_k is a minimum-length directing word for it, we


obtain that |C| ≥ 2^k by the induction assumption. On the other hand, B is also a directable commutative asynchronous automaton generated by a_0 whose minimum-length directing word is x_1 ··· x_k, and thus, by the induction hypothesis again, |B| ≥ 2^k. The obtained inequalities, A ⊇ Ā = B ∪ C, and C ∩ B = ∅ result in |A| ≥ 2^{k+1}, which ends the proof of our statement.

Now, we are ready to prove the following statement.

Theorem 5. For every positive integer n, d_{K*}(n) = ⌊log₂(n)⌋.

Proof. Let n ≥ 1 be an arbitrary integer. If n = 1 or n = 2, then the statement is obviously valid. Now, let us suppose that n ≥ 3. Let A = (A, X) ∈ K* be an arbitrary directable automaton of n states. Assume that d(A) = k for some nonnegative integer k. Then, by Theorem 4, 2^k ≤ n, and thus k ≤ log₂(n), which results in k ≤ ⌊log₂(n)⌋. Consequently, d_{K*}(n) ≤ ⌊log₂(n)⌋. To prove that the equality is attained, we construct an automaton A ∈ K* such that

(1) |A| = n,

(2) d(A) = ⌊log₂(n)⌋.

To do this, let k = ⌊log₂(n)⌋ and r = n − 2^k. Let X = {x_1, ..., x_k} and Y = {y_1, ..., y_r} be two disjoint sets of input signs; in particular, if r = 0, then let Y = ∅. Let us denote by 0 the k-dimensional vector whose every component is equal to 0. Now, let us define the automaton

A = (A, X ∪ Y) = ({0, 1}^k ∪ {1, ..., r}, X ∪ Y)

as follows. For every x_j ∈ X, (i_1, ..., i_k) ∈ {0, 1}^k, and t ∈ {1, ..., r}, let

(i_1, ..., i_k)x_j = (i'_1, ..., i'_k) if i_j = 0, where i'_t = i_t for t = 1, ..., k, t ≠ j, and i'_j = 1,
(i_1, ..., i_k)x_j = (i_1, ..., i_k) otherwise,

and t x_j = 0 x_j; and for every y_l ∈ Y, (i_1, ..., i_k) ∈ {0, 1}^k, and t ∈ {1, ..., r}, let (i_1, ..., i_k)y_l = (i_1, ..., i_k) and

t y_l = 0 if t = r and l = r,
t y_l = s if t = r and l = s for some s ∈ {1, ..., r − 1},
t y_l = t if t ≠ r and l = t,
t y_l = 0 if t ≠ r and l ≠ t.

It is easy to see that r generates the automaton A, and that A is a commutative asynchronous automaton; in particular, if r = 0, then 0 generates A. Moreover, x_1 ··· x_k is a minimum-length directing word of A. Consequently, d(A) = k = ⌊log₂(n)⌋, which ends the proof of Theorem 5.
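The construction can be checked mechanically. The sketch below is ours (in particular, the y-transitions encode one reading of the garbled table in the source); it builds the n-state automaton and computes its shortest directing word by breadth-first search over subsets of states.

```python
from collections import deque
from math import floor, log2

def build(n):
    """The automaton of Theorem 5: 2^k bit-vectors plus r extra states."""
    k = floor(log2(n)); r = n - 2 ** k
    vectors = [tuple((i >> j) & 1 for j in range(k)) for i in range(2 ** k)]
    states = vectors + list(range(1, r + 1))
    zero = tuple([0] * k)
    def x(j):                    # x_j sets component j; extra states copy 0.x_j
        def f(s):
            if isinstance(s, tuple):
                return s[:j] + (1,) + s[j + 1:]
            return zero[:j] + (1,) + zero[j + 1:]
        return f
    def y(l):                    # y_l routes the extra states and fixes vectors
        def f(s):
            if isinstance(s, tuple):
                return s
            if s == r:
                return zero if l == r else l
            return s if l == s else zero
        return f
    letters = [x(j) for j in range(k)] + [y(l) for l in range(1, r + 1)]
    return states, letters, k

def directing_length(states, letters):
    """Length of a shortest directing word, by BFS over sets of states."""
    start = frozenset(states)
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        current, depth = queue.popleft()
        if len(current) == 1:
            return depth
        for f in letters:
            nxt = frozenset(f(s) for s in current)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))

for n in range(3, 13):
    states, letters, k = build(n)
    assert len(states) == n
    assert directing_length(states, letters) == k == floor(log2(n))
```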

References

[1] Černý, J., Poznámka k homogénnym experimentom s konečnými automatmi, Mat.-fyz. čas. SAV 14 (1964), 208-215.

[2] Ésik, Z., B. Imreh, Remarks on finite commutative automata, Acta Cybernetica 5 (1981), 143-146.

[3] Ésik, Z., B. Imreh, Subdirectly irreducible commutative automata, Acta Cybernetica 5 (1981), 251-260.

[4] Gécseg, F., On subdirect representations of finite commutative unoids, Acta Sci. Math. 36 (1974), 33-38.

[5] Gécseg, F., On ν₁-products of commutative automata, Acta Cybernetica 7 (1985), 55-59.

[6] Gerace, G. B., G. Gestri, Decomposition of Synchronous Sequential Machines into Synchronous and Asynchronous Submachines, Information and Control 11 (1968), 568-591.

[7] Gerace, G. B., G. Gestri, Decomposition of a Synchronous Machine into an Asynchronous Submachine driving a Synchronous One, Information and Control 12 (1968), 538-548.

[8] Imreh, B., On isomorphic representations of commutative automata with respect to α_i-products, Acta Cybernetica 5 (1980), 21-32.

[9] Imreh, B., M. Steinby, Some remarks on directable automata, Acta Cybernetica 12 (1995), 23-35.

[10] Pin, J. E., Sur un cas particulier de la conjecture de Černý, Automata, Languages and Programming, ICALP'79 (Proc. Coll., Udine 1979), LNCS 62, Springer-Verlag, Berlin 1979, 345-352.

[11] Rystsov, I., Exact linear bound for the length of reset words in commutative automata, Publicationes Mathematicae 48 (1996), 405-409.

[12] Yoeli, M., Subdirectly irreducible unary algebras, Amer. Math. Monthly 74 (1967), 957-960.

[13] Wenzel, G. H., Subdirect irreducibility and equational compactness in unary algebras ⟨A; f⟩, Arch. Math. (Basel) 21 (1970), 256-263.


Presentations of right unitary submonoids of monoids

ISAMU INATA
Department of Information Science, Toho University, Funabashi 274-8510, Japan

1 Introduction

In the case of groups, the index of a subgroup H of a group G is the number of different right cosets of H in G. This index is equal to the number of equivalence classes of the right congruence ρ_H = {(x, y) ∈ G × G | xy⁻¹ ∈ H} on G. Using this index, Reidemeister and Schreier showed that every subgroup of finite index in a finitely presented group is also finitely presented (see [5]). Several authors have considered generalizations of the above result to semigroups or monoids (see [1, 2, 3, 4, 6, 7]).

The purpose of this paper is to obtain a generalization of the Reidemeister-Schreier theorem for right unitary submonoids of monoids. In the rest of this section we give basic definitions and notation on presentations of monoids.

Let A be an alphabet and A* the free monoid on A. The empty word is denoted by λ. We set A⁺ = A* − {λ}. A (monoid) presentation is an ordered pair (A | R), where R ⊆ A* × A*. An element (u, v) in R is called a (defining) relation and is usually denoted by u = v. A monoid M is defined by a presentation (A | R) if M ≅ A*/η, where η is the congruence on A* generated by R. For any w_1, w_2 ∈ A*, we write w_1 ≡ w_2 if w_1 and w_2 are identical as words, and write w_1 =_R w_2 if w_1 and w_2 represent the same element in M, that is, w_1/η = w_2/η. For any subset S of M, set L(A, S) = {w ∈ A* | w/η ∈ S}.

A monoid is called finitely presented if it can be defined by a presentation (A | R) in which both A and R are finite.

2 The index of submonoids of monoids

A subset U of a monoid M is right (resp. left) unitary if for any u ∈ U and x ∈ M, ux ∈ U (resp. xu ∈ U) implies x ∈ U. A subset U of M which is both right and left unitary is unitary.

Let N be a right unitary submonoid of a monoid M. A right coset Nx of N is maximal if Nx ⊆ Ny for some y ∈ M implies Nx = Ny. The (right) index of N (in M) is the number of different maximal right cosets of N. Remark that we can define the left index of a left unitary submonoid of a monoid in the same


way, but even though a submonoid is unitary, the right index and the left index are not necessarily equal.

Proposition 2.1 Let N be a right unitary submonoid of a monoid M and {Nm_i | i ∈ I} the set of different maximal right cosets of N. Then,
(1) M = ∪_{i∈I} Nm_i.
(2) Nm_i ⊈ Nm_j for all i ≠ j.
(3) There is i ∈ I such that N = Nm_i, that is, N is itself maximal.

Proof. Clear.

Let N be a right unitary submonoid of a monoid M and {Nm_i | i ∈ I} the set of different maximal right cosets of N. Then a set {m_i | i ∈ I} is called a set of generalized right coset representatives of N. By the above proposition, we can choose m_i = 1 for some i ∈ I, where 1 is the identity element in M.

Proposition 2.2 Let {m_i | i ∈ I} with m_0 = 1 be a set of generalized right coset representatives of N. Then N ∩ Nm_i = ∅ for all i ≠ 0.

Proof. Clear from the unitarity of N.

3 Presentations of right unitary submonoids of monoids

Let M be a monoid defined by a presentation (A | R), φ : A* → M the natural surjection and N a right unitary submonoid of M. And let {u_i ∈ A* | i ∈ I} be a subset of A* such that {φ(u_i) | i ∈ I} is a set of generalized right coset representatives of N. We choose u_0 ≡ λ.

For any i ∈ I and a ∈ A, fix j ∈ I such that φ(u_j) is a generalized right coset representative of φ(u_i a). Then, for any i ∈ I and w ≡ a_1a_2 ··· a_r ∈ A⁺, there exist j_1, j_2, ..., j_{r+1} ∈ I such that j_1 = i and φ(u_{j_{k+1}}) is the fixed generalized right coset representative of φ(u_{j_k}a_k) for all k = 1, 2, ..., r. Such a j_{r+1} is denoted by i·w. Since for any i ∈ I and a ∈ A there is n ∈ L(A, N) = {w ∈ A* | φ(w) ∈ N} such that

u_i a =_R n u_{i·a},

we choose such an n and denote it by n_{i,a}. Using this notation, for any i ∈ I and w ≡ a_1a_2 ··· a_r ∈ A*, we have

u_i w =_R n_{i,a_1} n_{i·a_1,a_2} ··· n_{i·a_1a_2···a_{r−1},a_r} u_{i·w}.

The word n_{i,a_1} n_{i·a_1,a_2} ··· n_{i·a_1a_2···a_{r−1},a_r} is simply denoted by n(i, w).

Lemma 3.1 For any w ∈ A⁺, w ∈ L(A, N) if and only if 0·w = 0.


Proof. For any w ∈ A*, w =_R n(0, w)u_{0·w}. If w ∈ L(A, N), then n(0, w)u_{0·w} ∈ L(A, N). Since N is right unitary, u_{0·w} ∈ L(A, N), and hence 0·w = 0. Conversely, if 0·w = 0, then u_{0·w} ≡ u_0 ≡ λ. Thus w ∈ L(A, N).

Now we have,

Theorem 3.2 N is generated by the set

{φ(n_{i,a}) | i ∈ I, a ∈ A}.

Proof. For any w ≡ a_1a_2 ··· a_r ∈ L(A, N), we have 0·w = 0 by Lemma 3.1, and hence w =_R n(0, w)u_{0·w} ≡ n(0, w)u_0 ≡ n(0, w).

Corollary 3.3 Every right unitary submonoid of finite index in a finitely generated monoid is finitely generated.

Let M be a monoid, N a right unitary submonoid of M generated by a set X = {x_i ∈ M | i ∈ I}, and Y = {m_j ∈ M | j ∈ J} a set of right coset representatives of N. Then it is easy to show that M is generated by X ∪ Y. So Corollary 3.3 can be strengthened to

Corollary 3.4 Let M be a monoid and N a right unitary submonoid of M of finite index. Then M is finitely generated if and only if N is finitely generated.

Let B = {b_{i,a} | i ∈ I, a ∈ A} be a new alphabet and ψ : B* → A* the monoid homomorphism induced by the mapping b_{i,a} ↦ n_{i,a}. For i ∈ I and w ≡ a_1a_2 ··· a_r ∈ A⁺, the word b_{i,a_1} b_{i·a_1,a_2} ··· b_{i·a_1a_2···a_{r−1},a_r} is denoted by b(i, w). Define a mapping ϕ : L(A, N) → B* by

ϕ(λ) = λ, and ϕ(w) = b(0, w) (w ∈ A⁺).

Now we have,

Lemma 3.5 N is defined by the generators B and the relations

ϕ(n_{i,a}) = b_{i,a},  (2)
ϕ(w_1uw_2) = ϕ(w_1vw_2),  (3)

where i ∈ I, a ∈ A, and w_1, w_2 ∈ A*, u = v ∈ R are such that w_1uw_2 ∈ L(A, N).

Proof. The mapping ϕ is a reuniting mapping in the sense of [1]. So, by [1, Theorem 2.11], N is defined by the generators B and the relations (2), (3), and

ϕ(w_1w_2) = ϕ(w_1)ϕ(w_2) (w_1, w_2 ∈ L(A, N)).  (4)

For any w_1, w_2 ∈ L(A, N) we have 0·w_1 = 0 by Lemma 3.1, and hence b(0, w_1w_2) ≡ b(0, w_1)b(0, w_2). Thus ϕ(w_1w_2) and ϕ(w_1)ϕ(w_2) are exactly equal as words over B. In this way we can delete the relations of the form (4), and we have the desired relations.


4 Finite presentability of right unitary submonoids of monoids

In the previous section we obtained a presentation of a right unitary submonoid of a monoid defined by some presentation. But such a presentation may be infinite even though both the presentation of M and the index of N are finite. In this section we consider the following problem: when does a right unitary submonoid of a finitely presented monoid of finite index have a finite presentation?

With the notations in the previous section, we have,

Lemma 4.1 N is defined by the generators B and the relations

ϕ(n_{i,a}) = b_{i,a},  (5)
ϕ(u_{0·w_1}uw_2) = ϕ(u_{0·w_1}vw_2),  (6)

where i ∈ I, a ∈ A, and w_1, w_2 ∈ A*, u = v ∈ R are such that w_1uw_2 ∈ L(A, N).

Proof. To prove the assertion, it suffices to show that the relations (2) and (3) can be derived from the relations (5) and (6). The set of relations of the form (5) and (6) is denoted by R′. The relation (2) directly follows from the relation (5). Let ϕ(w_1uw_2) = ϕ(w_1vw_2) be a relation of the form (3), that is, w_1, w_2 ∈ A*, u = v ∈ R and w_1uw_2 ∈ L(A, N). Then we have

ϕ(w_1uw_2) =_{R′} ϕ(n(0, w_1)u_{0·w_1}uw_2)
 = ϕ(n(0, w_1))ϕ(u_{0·w_1}uw_2)
 =_{R′} ϕ(n(0, w_1))ϕ(u_{0·w_1}vw_2)
 =_{R′} ϕ(w_1vw_2).

This completes the proof of the lemma.

Now we introduce an automaton A = (I, A, δ, 0, 0) associated with our relations as follows:


(1) I is the set of states, (2) A is the input alphabet, (3) δ : I × A → 2^I is the transition function defined by

δ(i, a) = {j ∈ I | there exists n ∈ L(A, N) such that u_i a =_R n u_j},

(4) 0 is the initial state, (5) 0 is the terminal state.

Remark that A is non-deterministic, in general.

Proposition 4.2 With the above notations, w ∈ L(A, N) if and only if 0 ∈ δ(0, w).

Proof. It is immediate from Lemma 3.1.

We say that an automaton A′ is a deterministic choice of A if A′ is a deterministic subautomaton of A. And A′ is called cycle-free if there is no non-trivial directed cycle that does not contain 0.

Theorem 4.3 Let A be the automaton defined above. Assume that both (A | R) and I are finite. If there is a cycle-free deterministic choice of A, then N is finitely presented.

Proof. Since there is a cycle-free deterministic choice of A, for any i ∈ I and a ∈ A we can choose i·a ∈ I so that the resulting choice is cycle-free and deterministic, and we choose n_{i,a} ∈ L(A, N) such that u_i a =_R n_{i,a} u_{i·a}. Hence we obtain a presentation of N as given in Lemma 4.1. To show the theorem, it is enough to show that the relations of the form (6) are finite in number. Since the above choice is cycle-free, for any state i ∈ I there are at most a finite number of paths from i to 0. So, for any w_1 ∈ A* and (u, v) ∈ R, there are only a finite number of paths from 0·w_1u to 0. Thus the relations of the form (6) are finite in number. This completes the proof of the theorem.

Example 1. Let A = {a, b}, R = {aba = a, bab = b}, and let M be the monoid defined by the presentation (A | R). And let N be the submonoid of M generated by the set {φ(aᵐbⁿ), φ(bⁿaᵐ) | m, n ∈ ℕ₀, m + n even}. Then it is easy to see that N is a unitary submonoid of M and {λ, a, b} is a set of generalized right coset representatives of N. In our automaton A, the transition function δ is defined as

δ(λ, a) = a, δ(λ, b) = b, δ(a, a) = λ, δ(a, b) = λ, δ(b, a) = λ, δ(b, b) = λ.

It is clear that A is cycle-free and deterministic. Hence N is finitely presented by Theorem 4.3. In fact, put aa = e, ab = f, ba = g and bb = h; then N is defined by the generators {e, f, g, h} and the relations {fe = e, f² = f, eg = e, g² = g, gh = h, hf = h}.
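Relations of this kind can be checked by reducing words with the length-decreasing rules aba → a and bab → b, which are confluent here, so two words represent the same element of M exactly when they have the same normal form. A minimal sketch (ours; in particular, the reading hf = h of the garbled last relation is our reconstruction):

```python
def normal_form(w):
    """Reduce w by the rewriting rules aba -> a and bab -> b."""
    while True:
        if "aba" in w:
            w = w.replace("aba", "a", 1)
        elif "bab" in w:
            w = w.replace("bab", "b", 1)
        else:
            return w

def equal(u, v):
    """Equality in M = <a, b | aba = a, bab = b>."""
    return normal_form(u) == normal_form(v)

e, f, g, h = "aa", "ab", "ba", "bb"   # the generators of N in Example 1
assert equal(f + e, e)   # fe = e
assert equal(f + f, f)   # f^2 = f
assert equal(e + g, e)   # eg = e
assert equal(g + g, g)   # g^2 = g
assert equal(g + h, h)   # gh = h
assert equal(h + f, h)   # hf = h
```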


References

[1] C.M. Campbell, E.F. Robertson, N. Ruškuc and R.M. Thomas, Reidemeister-Schreier type rewriting for semigroups, Semigroup Forum 51 (1995), 47-62.

[2] C.M. Campbell, E.F. Robertson, N. Ruškuc and R.M. Thomas, On subsemigroups of finitely presented semigroups, J. Algebra 180 (1996), 1-21.

[3] C.M. Campbell, E.F. Robertson, N. Ruškuc and R.M. Thomas, Presentations for subsemigroups - applications to ideals of semigroups, J. Pure Appl. Algebra 124 (1998), 47-64.

[4] A. Jura, Determining ideals of a given finite index in a finitely presented semigroup, Demonstratio Math. 11 (1978), 813-827.

[5] W. Magnus, A. Karrass and D. Solitar, Combinatorial Group Theory, Interscience Publishers, New York, 1966.

[6] N. Ruškuc, On large subsemigroups and finiteness conditions of semigroups, Proc. London Math. Soc. 76 (1998), 383-405.

[7] N. Ruškuc, Presentations for subgroups of monoids, J. Algebra 220 (1999), 365-380.

[8] N. Ruškuc and R.M. Thomas, Syntactic and Rees indices of subsemigroups, J. Algebra 205 (1998), 435-450.


A combinatorial property of languages and monoids

A.V. KELAREV AND P.G. TROTTER

School of Mathematics and Physics, University of Tasmania, G.P.O. Box 252-37, Hobart, Tasmania 7001, Australia

Email: [email protected] [email protected]

In a 1976 paper [8], B.H. Neumann characterized center-by-finite groups

as being groups with a particular combinatorial property; a group is center-

by-finite if and only if every infinite sequence of its elements contains a pair of

elements that commute. The characterization was produced as an answer to

a question by Paul Erdos and has led to a series of papers by various authors

in which combinatorial properties of algebraic structures have been investi-

gated. A survey of this direction of research, by the first author, appears

in [6]. Our aim here is to investigate formal languages that satisfy particular

combinatorial properties (namely, permutational properties) with respect to

combinatorial and finiteness properties of their syntactic monoids.

Given an alphabet A, let A+ and A* denote respectively the free semi-

group and the free monoid generated by A. A subset L of A* is called a

language on A. The syntactic congruence induced by L is the congruence p~

on A* defined by

ρ_L = {(u, v) | aub ∈ L ⟺ avb ∈ L, for all a, b ∈ A*}.

The quotient monoid Syn(L) = A*/ρ_L is called the syntactic monoid of L (see [4]). It is well known that a language L is recognized by a finite state


automaton if and only if Syn(L) is finite. Furthermore, the property of a

language L being rational, or regular, is equivalent to Syn(L) being finite.

Let S, be the symmetric group on {1,2,. . . , n} for some positive in-

teger n. A semigroup S is said to be n-permutational if, for any elements

x_1, x_2, ..., x_n in S, there exists a non-identity permutation σ ∈ S_n such that

x_1x_2 ··· x_n = x_{σ(1)}x_{σ(2)} ··· x_{σ(n)}.

A semigroup is permutational if it is n-permutational for some n. This notion

generalizes commutativity and has been actively investigated (see [5] for references). In particular, by [1], a group is permutational if and only if it is

finite-by-abelian-by-finite. An important result of Restivo and Reutenauer [9]

states that a finitely generated periodic semigroup is permutational if and

only if it is finite; that is, a language on a finite alphabet is recognisable by

a finite state automaton if and only if its syntactic monoid is periodic and

permutational.

Because of the connection between a language L and its syntactic

monoid via the congruence ρ_L, it is natural to define L to be n-permutational

for some positive integer n if, for each word w ∈ L and each factorization w = uu_1u_2 ··· u_nv of w, there exists a non-identity permutation σ ∈ S_n such that uu_{σ(1)}u_{σ(2)} ··· u_{σ(n)}v ∈ L.

Define L to be permutational if it is n-permutational for some n.

In [2], permutational semigroups are called 'permutable semigroups'. A language L is defined in [2] to have the permutation property if, for some n and


for any words u, x_1, ..., x_n, v in A*, there exists a non-identity permutation σ ∈ S_n such that

ux_1 ··· x_nv ∈ L ⟺ ux_{σ(1)} ··· x_{σ(n)}v ∈ L.

It is clear that a language L with the permutation property is permutational.

However, with A = {a, b}, there is a language L over A that is permutational

but does not satisfy the permutation property. To see this, consider

L = A* \ {aba²b² ··· aⁿbⁿ | n ≥ 2}.

It is easy to verify that this language is n-permutational for each n ≥ 3.

However, with u = 1 = v and x_i = aⁱbⁱ for 1 ≤ i ≤ n, we get ux_1 ··· x_nv ∉ L; yet for any non-identity permutation σ ∈ S_n, ux_{σ(1)} ··· x_{σ(n)}v ∈ L. Hence L

does not have the permutation property.
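The claim can be checked by brute force for small n. In the sketch below (ours), in_L implements membership in L and all non-identity block permutations are tested.

```python
from itertools import permutations

def in_complement(w):
    """Is w of the form a b a^2 b^2 ... a^n b^n for some n >= 2?"""
    n, word = 2, ""
    while len(word) < len(w):
        word = "".join("a" * i + "b" * i for i in range(1, n + 1))
        if word == w:
            return True
        n += 1
    return False

def in_L(w):
    return not in_complement(w)

for n in range(3, 7):
    blocks = ["a" * i + "b" * i for i in range(1, n + 1)]
    # with u = v = 1 the unpermuted product x_1 ... x_n is not in L ...
    assert not in_L("".join(blocks))
    # ... while every non-identity block permutation lands back inside L
    for sigma in permutations(range(n)):
        if sigma != tuple(range(n)):
            assert in_L("".join(blocks[i] for i in sigma))
```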

We begin with a pair of easy deductions based on the definitions and

on the above mentioned result of [9].

THEOREM 1 For any language L, the monoid Syn(L) is n-permutational

only if L is n-permutational.

Proof. Suppose that w = uu_1u_2 ··· u_nv, for some u, u_i, v ∈ A*, 1 ≤ i ≤ n. Since Syn(L) = A*/ρ_L is n-permutational, there exists σ ∈ S_n \ {1} such that

(u_1u_2 ··· u_n, u_{σ(1)}u_{σ(2)} ··· u_{σ(n)}) ∈ ρ_L.

It follows that

uu_1u_2 ··· u_nv ∈ L ⟺ uu_{σ(1)}u_{σ(2)} ··· u_{σ(n)}v ∈ L.


COROLLARY 2 Every language that is recognized by a finite state automaton

is permutational.

Proof. Let L be a language over a finite alphabet that is recognized

by a finite state automaton. Then, since Syn(L) is finite, Syn(L) is permutational by [9], and so Theorem 1 completes the proof. □

Corollary 2 also follows from Theorem 4; we have included both versions to show the first, easier proof.

The next example shows that the severing of the connection between

languages and finite semigroups, as exists in Corollary 2, can result in a

non-permutational language. Moreover, Corollary 2 does not generalize to

context-free languages.

EXAMPLE 3 Let G be a context-free grammar in Chomsky Normal Form with alphabet A = {a, b}, non-terminal symbols V = {α, β, γ}, start-symbol γ, and productions

γ → γγ, γ → αγβ, α → a, β → b, γ → ab.

Clearly, G generates the language M+, where

M = {aⁿbⁿ | n ≥ 1}.

We show that this language is not permutational. Indeed, for any positive integer n the product

p = aba²b²a³b³ ··· aⁿbⁿ ∈ M⁺

can be factorized as

p = x_1x_2 ··· x_{n+1} = a(ba²)(b²a³) ··· (bⁿ⁻¹aⁿ)bⁿ,

where x_1 = a, x_2 = ba², ..., x_n = bⁿ⁻¹aⁿ, x_{n+1} = bⁿ. It is easily seen that, for any non-identity permutation σ,

x_{σ(1)}x_{σ(2)} ··· x_{σ(n+1)} ∉ M⁺.
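This "easily seen" step can be confirmed for small n. The sketch below is ours; membership in M⁺ is decided by checking that the maximal letter runs alternate a-run, b-run with matching lengths.

```python
import re
from itertools import permutations

def in_M_plus(w):
    """Membership in M+ = {a^m b^m | m >= 1}+: maximal runs must alternate
    a-run, b-run, ... with each a-run as long as the b-run that follows it."""
    runs = re.findall(r"a+|b+", w)
    if not runs or len(runs) % 2 == 1:
        return False
    for i in range(0, len(runs), 2):
        if runs[i][0] != "a" or runs[i + 1][0] != "b":
            return False
        if len(runs[i]) != len(runs[i + 1]):
            return False
    return True

for n in range(2, 6):
    blocks = ["a"] + ["b" * i + "a" * (i + 1) for i in range(1, n)] + ["b" * n]
    p = "".join(blocks)                   # p = a b a^2 b^2 ... a^n b^n
    assert in_M_plus(p)
    for sigma in permutations(range(len(blocks))):
        if sigma != tuple(range(len(blocks))):
            assert not in_M_plus("".join(blocks[i] for i in sigma))
```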

Given a recognizable language L, the result of Restivo and Reutenauer [9]

does not provide a formula for estimating the least n such that Syn(L) is

n-permutational. The next theorem gives us a bound for the least n such

that L is n-permutational.

THEOREM 4 Every language that is recognized by a (possibly non-deterministic

and incomplete) finite state automaton with k states is 2k-permutational.

Proof. Suppose a language L is recognized by a finite state automaton A = (S, X, φ, s_0, T) with |S| = k. Take any word w ∈ L and any factorization w = uu_1u_2 ··· u_{2k}v. For each integer 0 ≤ i ≤ 2k, consider a state

s_i ∈ φ(s_0, u_0u_1 ··· u_i)

reached along a fixed accepting computation on w, where we assume u = u_0. By the pigeonhole principle there exist 0 ≤ i_1 < i_2 < i_3 ≤ 2k such that s_{i_1} = s_{i_2} = s_{i_3}. It follows that

uu_1 ··· u_{i_1} u_{i_2+1} ··· u_{i_3} u_{i_1+1} ··· u_{i_2} u_{i_3+1} ··· u_{2k} v ∈ L.

Thus L is 2k-permutational. □
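The pigeonhole argument can be illustrated on a small DFA. The sketch below is ours, using a two-state parity automaton as an example: it finds three equal prefix states and swaps the two enclosed runs of blocks, which preserves acceptance.

```python
from itertools import combinations

# two-state DFA accepting the words over {a, b} with an even number of a's
delta = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}

def run(state, word):
    for c in word:
        state = delta[(state, c)]
    return state

def accepts(w):
    return run(0, w) == 0

def permuted_variant(u, blocks, v):
    """Find equal states s_{i1} = s_{i2} = s_{i3} among the prefix states
    and swap the two enclosed segments of blocks."""
    prefixes = [u]
    for b in blocks:
        prefixes.append(prefixes[-1] + b)
    states = [run(0, p) for p in prefixes]
    for i1, i2, i3 in combinations(range(len(states)), 3):
        if states[i1] == states[i2] == states[i3]:
            swapped = (blocks[:i1] + blocks[i2:i3]
                       + blocks[i1:i2] + blocks[i3:])
            return u + "".join(swapped) + v
    return None

u, v = "b", "ab"
blocks = ["a", "ab", "ba", "bb"]      # 2k = 4 middle blocks
w = u + "".join(blocks) + v
assert accepts(w)
w2 = permuted_variant(u, blocks, v)
assert w2 is not None and w2 != w and accepts(w2)
```

Both segments between equal prefix states map that state to itself, so exchanging them cannot change the final state.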

The following example shows that our bound is exact.

EXAMPLE 5 For each positive integer k, there exists a deterministic complete

finite state automaton with k states recognising a language which is not (2k-

1) -permutational.


Proof. Let S = {s_1, s_2, ..., s_k} be the set of states, X = {x_1, x_2, ..., x_{2k−1}} the input alphabet, s_1 the start state, and T = {s_1} the set of terminal states. Define the next-state function φ by the following rule.

Then the language L recognized by this automaton has a unique member of length 2k − 1, namely x_1x_2 ··· x_{2k−1}. Clearly, no non-identity permutation of the letters of x_1x_2 ··· x_{2k−1} belongs to L, and therefore L is not (2k − 1)-permutational. □

We now consider 2-permutational languages. In this case Theorem 1

can be strengthened.

PROPOSITION 6 A language is 2-permutational if and only if its syntactic monoid is 2-permutational (i.e. commutative).

Proof. The symmetric group S_2 has only one non-identity element. Therefore every 2-permutational language L satisfies the property

uu_1u_2v ∈ L ⟺ uu_2u_1v ∈ L, for all u, u_1, u_2, v ∈ A*.

Hence (u_1u_2, u_2u_1) ∈ ρ_L for all u_1, u_2 ∈ A*, and so Syn(L) is 2-permutational. The converse holds by Theorem 1. □

If a language is 2-permutational, then it does not follow that it is

recognizable.

THEOREM 7 There exists a 2-permutational language that has a commutative

infinite syntactic monoid.


Proof. Let A = {a} and L = {aᵖ | p is a prime number}. We see immediately that L is 2-permutational. Now take any aⁱ, aʲ ∈ A*, for some non-negative integers i, j. Suppose that (aⁱ, aʲ) ∈ ρ_L. Then aⁱaⁿ ∈ L if and only if aʲaⁿ ∈ L, for each n ≥ 0. This means that i + n is a prime number if and only if j + n is a prime number. Hence i = j, since there are arbitrarily long runs of consecutive composite numbers. Thus Syn(L) = A*/ρ_L ≅ {a}*, and the assertion follows. □
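The separation step, that any two distinct powers of a are distinguished by some right context aⁿ, can be checked computationally. A sketch (ours):

```python
def is_prime(m):
    return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))

def separating_context(i, j, bound=10_000):
    """Find n with exactly one of i + n, j + n prime, so a^i and a^j
    are not rho_L-equivalent for L = {a^p : p prime}."""
    for n in range(bound):
        if is_prime(i + n) != is_prime(j + n):
            return n
    return None

# every pair of distinct small exponents is separated by some context
for i in range(10):
    for j in range(i + 1, 10):
        assert separating_context(i, j) is not None
```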

For the next theorem we use the following definitions and notation. Let A = {a, b}, and let u, v ∈ A*. We say that the word ubav is obtained from uabv by an a ↦ b transition. The number of occurrences of the letter x in a word w is denoted by |w|_x. Suppose that w ∈ A⁺, where |w|_a = |w|_b = n. Then there is a (usually not unique) sequence of words from A⁺

aⁿbⁿ = w_1, w_2, ..., w_{m+1} = w  (1)

such that w_i ↦ w_{i+1} is an a ↦ b transition for each i, 1 ≤ i ≤ m. The length m of this sequence is determined uniquely. Define the shift of w from aⁿbⁿ to be s(w) = m; so s(w) is the number of a ↦ b transitions in sequence (1). The value s(w) is well defined; it is the same for all sequences of the form (1), because for a word w = x_1x_2 ··· x_{2n} it is equal to the number of pairs (i, j) such that 1 ≤ i < j ≤ 2n and x_i = b, x_j = a. Notice that the largest shift from aⁿbⁿ is s(bⁿaⁿ) = n².
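Since s(w) equals this inversion count, it is straightforward to compute. The sketch below (ours) also checks that each a ↦ b transition increases the shift by exactly one.

```python
def shift(w):
    """s(w): the number of pairs i < j with w[i] = 'b' and w[j] = 'a'."""
    return sum(1 for i in range(len(w)) for j in range(i + 1, len(w))
               if w[i] == "b" and w[j] == "a")

n = 4
assert shift("a" * n + "b" * n) == 0        # s(a^n b^n) = 0
assert shift("b" * n + "a" * n) == n * n    # the largest shift, s(b^n a^n) = n^2

w = "ababaabb"
for i in range(len(w) - 1):
    if w[i] == "a" and w[i + 1] == "b":     # one a -> b transition: uabv -> ubav
        assert shift(w[:i] + "ba" + w[i + 2:]) == shift(w) + 1
```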

If v, w ∈ A⁺ are such that |w|_a = |w|_b = |v|_a = |v|_b = n, then the shift of v from w is s(v) − s(w); this is the number of a ↦ b transitions minus the number of b ↦ a transitions in any sequence of such transitions from w to v.

Suppose that w = a_1a_2 ··· a_{2n}, where a_i ∈ A⁺ for each i, 1 ≤ i ≤ 2n, and that |w|_a = |w|_b. For π ∈ S_{2n}, define

w_π = a_{π(1)}a_{π(2)} ··· a_{π(2n)}.

LEMMA 8 Let w ∈ {a, b}⁺ be such that |w|_a = |w|_b = n ≥ 2, and w = uu_1u_2u_3v for some u, v ∈ {a, b}*, u_1, u_2, u_3 ∈ {a, b}⁺. Then one of the following two conditions holds:

(i) s(uu_{λ(1)}u_{λ(2)}u_{λ(3)}v) ≠ s(uu_{τ(1)}u_{τ(2)}u_{τ(3)}v) for some λ, τ ∈ S_3;

(ii) the pairs (|u_1|_a, |u_1|_b), (|u_2|_a, |u_2|_b), (|u_3|_a, |u_3|_b) are pairwise proportional.

Proof. We have the following shifts from w, one for each non-identity permutation of S_3:

(a) s(uu_2u_1u_3v) − s(w) = |u_1|_a|u_2|_b − |u_1|_b|u_2|_a,
(b) s(uu_1u_3u_2v) − s(w) = |u_2|_a|u_3|_b − |u_2|_b|u_3|_a,
(c) s(uu_3u_2u_1v) − s(w) = Σ_{1≤i<j≤3} (|u_i|_a|u_j|_b − |u_i|_b|u_j|_a),
(d) s(uu_2u_3u_1v) − s(w) = (|u_1|_a|u_2|_b − |u_1|_b|u_2|_a) + (|u_1|_a|u_3|_b − |u_1|_b|u_3|_a),
(e) s(uu_3u_1u_2v) − s(w) = (|u_2|_a|u_3|_b − |u_2|_b|u_3|_a) + (|u_1|_a|u_3|_b − |u_1|_b|u_3|_a).

Observe that

s(uu_{η(1)}u_{η(2)}u_{η(3)}v) − s(w) = Σ (|u_i|_a|u_j|_b − |u_i|_b|u_j|_a),  (2)

where the sum is over the pairs i < j whose order is inverted by η ∈ S_3. We assume that (i) is false, so all shifts which occur in (a) through (e) are equal. But then by (2) we get s(uu_{η(1)}u_{η(2)}u_{η(3)}v) = s(w), for all η ∈ S_3. The right hand side of each equation (a) through (e) has value 0, hence equations (a), (b), (c) yield

|u_1|_a|u_2|_b = |u_1|_b|u_2|_a, |u_2|_a|u_3|_b = |u_2|_b|u_3|_a, |u_1|_a|u_3|_b = |u_1|_b|u_3|_a.

Since u_1, u_2, u_3 ∈ A⁺, the result now follows. □

THEOREM 9 There exists a 3-permutational language with a non-permutational

syntactic monoid.

Proof. Let A = {a, b, d} and for any w ∈ A⁺ define w̄ to be the word obtained from w by deleting all occurrences of d. Put

L_0 = {w ∈ A⁺ | w̄ = aⁿbⁿ, 0 ≤ |w|_d ≤ n², for some n ≥ 2},

L_1 = {w ∈ A⁺ | w̄ = (aⁿbⁿ)_π for some π ∈ S_{2n} \ {1}, 0 ≤ |w|_d ≤ n², |w|_d ≠ s(w̄), for some n ≥ 2},

and then define L = L_0 ∪ L_1.

Let us verify that L is 3-permutational. Take any w ∈ L and any 3-factorization

w = uu_1u_2u_3v,  (3)

where u, v ∈ A*, u_1, u_2, u_3 ∈ A⁺, and let

w̄ = ū ū_1ū_2ū_3 v̄  (4)

be the induced 3-factorization of w̄. Consider four cases.

Case 1: w ∈ L_0 and condition (i) of Lemma 8 applies to (4).

Let λ, τ ∈ S_3 be selected as in condition (i) of the lemma. Choose permutations λ′, τ′ in S_{2n} whose application to w, followed by deletion of all letters d, produces the same result as the application of λ, τ to the blocks of w̄. Then the shifts of the d-free words obtained from w_{λ′} and w_{τ′} differ, and so at least one of these two shifts is not equal to |w|_d. We may assume that the shift for w_{λ′} is not equal to |w|_d. Then w_{λ′} ∈ L.

Case 2: w ∈ L_0 and condition (ii) of Lemma 8 applies to (4). Since w̄ = aⁿbⁿ and s(w̄) = 0, we see that the image of w̄ under any permutation of the blocks of the factorization (4) is the same word aⁿbⁿ. Therefore any 3-block permutation of w, based on the factorization (3), results only in permuting occurrences of d. Hence the image of w under any 3-permutation is in L.


Case 3: w ∈ L_1 and condition (i) of Lemma 8 applies to (4). This case

is similar to Case 1, and we omit the details.

Case 4: w ∈ L_1 and condition (ii) of Lemma 8 applies to (4). Then, as in Case 2, for each permutation of w based on the 3-factorization (3), the image of w is in L.

Note that in the above we assumed u_1, u_2, u_3 ∈ A⁺. This is for convenience only; if any of u_1, u_2, u_3 is empty, then the proof simplifies. Thus we see that L is 3-permutational.

Next, consider the syntactic congruence ρ_L. For sufficiently large n, let λ ∈ S_{2n} be a permutation corresponding to a non-identity r-permutation of the blocks of the r-factorization

v = aⁿbⁿ = aⁱ(ab)(ab²) ··· (abʳ)bʲ,

where i + r = (r + 1)r/2 + j = n; the blocks are ab, ab², ..., abʳ. Put w = v_λ. Clearly, s(w) > s(v). Hence wd^{s(w)} ∉ L, while vd^{s(w)} ∈ L, so (v, w) ∉ ρ_L. Since this holds for every non-identity permutation of the blocks, Syn(L) is not r-permutational


for any r.

COROLLARY 10 For any integer n ≥ 3, there exists an n-permutational regular language L such that Syn(L) is (n + 1)-permutational, but is not n-permutational.

Proof. Select a specific value of n, namely n = r, in the definition of L in the proof of Theorem 9. □

There are many open questions concerned with connections between permutational properties of languages and their syntactic monoids. The following difficult question specialises to a question of de Luca and Varricchio when restricted to the permutation property.

A language is said to be periodic if its syntactic monoid is periodic.

PROBLEM 11 Are all permutational periodic languages regular?

Given Corollary 2 and Example 3, it is natural to consider the following

PROBLEM 12 Find an algorithm to determine whether a context-free language is permutational.

1. M. CURZIO, P. LONGOBARDI, M. MAJ AND A.H. RHEMTULLA, A permutational property for groups, Arch. Math. 44 (1985), 385-389.

2. A. DE LUCA, S. VARRICCHIO, Regularity and finiteness conditions, "Handbook of Formal Languages", Vol. 1, Eds. G. Rozenberg, A. Salomaa, Springer-Verlag, Berlin, 1997, 747-810.

3. A. DE LUCA, S. VARRICCHIO, "Finiteness and Regularity in Semigroups and Formal Languages", Monographs in Theoretical Computer Science, Springer-Verlag, Berlin, 1998.

4. J.M. HOWIE, "Automata and Languages", Clarendon Press, Oxford, 1991.

5. J. JUSTIN, G. PIRILLO, On some questions and conjectures in combinatorial semigroup theory, Southeast Asian Bulletin of Mathematics 18 (1994), 91-104.

6. A.V. KELAREV, Combinatorial properties of sequences in groups and semigroups, "Combinatorics, Complexity and Logic", Discrete Mathematics and Theoretical Computer Science, Eds. D.S. Bridges, C.S. Calude, J. Gibbons, S. Reeves, I.H. Witten, Springer-Verlag, 1996, 289-298.

7. M. LOTHAIRE, "Combinatorics on Words", Addison-Wesley, Tokyo, 1982.

8. B.H. NEUMANN, A problem of Paul Erdős on groups, J. Austral. Math. Soc. 21 (1976), 467-472.

9. A. RESTIVO AND C. REUTENAUER, On the Burnside problem for semigroups, J. Algebra 89 (1984), 102-104.

ERROR-DETECTING PROPERTIES OF LANGUAGES*

Stavros Konstantinidis

Department of Mathematics and Computing Science, Saint Mary's University
Halifax, Nova Scotia B3H 3C3, Canada

s.konstantinidis@stmarys.ca

Abstract: In the context of storing/transmitting words of a language L using a noisy medium, the language property of error-detection is fundamental. It ensures that the medium cannot transform a word from L to another word of L. This paper defines some basic error-detecting properties of languages and obtains a few basic results on error-detection. Moreover, some error-detecting capabilities of uniform, solid, and shuffle codes are considered. It is shown that those codes provide certain error-detection either for free or when a simpler condition is satisfied.

Key words: error-detection, channel, code, regular language, solid code, shuffle code.

1. Introduction

Consider the problem of transmitting/storing words of a language L using a medium γ capable of introducing errors in the words of L. Let us call the words of L permissible words and the medium γ a channel. Now it is possible that a permissible word can be transformed to a non-permissible one after it is received/retrieved from the channel γ. In this context, the language property of error-detection is fundamental. Specifically, if the language L is error-detecting for the channel γ, then γ cannot transform a permissible word to another permissible word. As a consequence, when the channel returns a word w which is permissible, it is the case that w is the permissible word that was originally transmitted/stored into γ. On the other hand, if the returned word is not permissible, one can be sure that it has been corrupted by the channel and then take appropriate action, for example, request that the word be retransmitted.

The set of permissible words could be any subset of X*, where X is the alphabet used, or it could be the set K* that consists of all the messages (words) over a code K. In the latter case, when a permissible message is returned, it can be decoded uniquely and correctly. To keep the basic definitions general, we use the framework of P-channels (see [4]) restricted to the case of finite words. This channel model is very general and includes the case

* This work was supported by a research grant of the Natural Sciences and Engineering Research Council of Canada.

of SID-channels, which were presented in [3] and further extended in [6]. SID-channels are discrete channels represented by formal expressions that describe the type of errors permitted and the frequency of those errors. The basic error types are:

σ: substitution. It means that a symbol in a message can be replaced with another symbol (of the alphabet X).

ι: insertion. It means that a symbol (of the alphabet X) can be inserted in a message.

δ: deletion. It means that a symbol in a message can be deleted, i.e., replaced with the empty word.

We note that errors of type ι or δ are called synchronization errors, as they cause, or are caused by, loss of synchronization. Examples of SID-channel expressions are:

(1) σ(m, ℓ): represents the channel that permits at most m substitutions in any ℓ (or less) consecutive symbols of a message.

(2) ι(m, ℓ): represents the channel that permits at most m insertions in any ℓ (or less) consecutive symbols of a message.

(3) δ(m, ℓ): represents the channel that permits at most m deletions in any ℓ (or less) consecutive symbols of a message.

(4) ι∘δ(m, ℓ): represents the channel that permits a total of at most m insertions and deletions in any ℓ (or less) consecutive symbols of a message.

(5) σ∘ι∘δ(m, ℓ): represents the channel that permits a total of at most m substitutions, insertions, and deletions in any ℓ (or less) consecutive symbols of a message.

More generally, we use the expression τ(m, ℓ) to denote the channel that permits a total of at most m errors of type τ in any ℓ consecutive symbols of a message. In this case, we assume that m and ℓ are positive integers with m < ℓ. In this paper we ignore the distinction between the terms SID-channel and SID-channel expression. Moreover, we consider the following set of error types:

𝒯 = {σ, ι, δ, σ∘δ, σ∘ι, ι∘δ, σ∘ι∘δ}.
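As a concrete illustration of the windowed error bound (a Python sketch added here, not part of the paper; the function name is an assumption of this note), membership in (y)_γ for the pure-substitution channel γ = σ(m, ℓ) reduces to counting mismatches in every window of ℓ consecutive positions:

```python
def sub_channel_output(y, yp, m, l):
    """True iff yp can be received from y through sigma(m, l): the words have
    equal length and any l (or fewer) consecutive symbols carry at most m
    substitutions."""
    if len(y) != len(yp):
        return False
    mism = [1 if a != b else 0 for a, b in zip(y, yp)]
    w = min(l, len(mism))
    # slide a window of w consecutive positions and count substitutions in it
    return all(sum(mism[i:i + w]) <= m for i in range(len(mism) - w + 1))
```

For example, under σ(1, 3) the word 01000 is a possible output for 00000, while 01010 is not, since its two mismatches fall inside a window of three consecutive symbols.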

The paper is organized as follows. The next section gives some basic concepts about words, factorizations, and P-channels. Section 3 defines the basic error-detecting properties of languages, provides examples to illustrate these properties, and contains a few basic results on error-detection. For example, it is shown that the number of synchronization errors that a regular language can detect is bounded by the cardinality of its syntactic monoid. Section 4 discusses certain error-detecting capabilities of uniform, solid and shuffle codes. In particular, a necessary and sufficient condition is obtained for detecting the errors of the channel σ∘ι∘δ(1, ℓ) in the messages of a finite solid code. Finally, Section 5 contains a few concluding remarks.

2. Basic Background

For a set S, the notation |S| represents the cardinality of S. The set of positive integers is denoted by ℕ and ℕ₀ = ℕ ∪ {0}. An index set is a subset I of ℕ₀ such that I = {0, 1, ..., n − 1} for some n in ℕ₀. If n = 0, the corresponding index set is the empty set ∅. An alphabet, X, is a finite non-empty set of symbols. A word (over X) is a mapping w : I → X, where I is an index set. In this case, we write I_w to denote the index set of the word w. Moreover, as usual, we can denote w by juxtaposing its elements: w = w(0)w(1)···w(n − 1). The empty word, λ, is the unique word with I_λ = ∅. The length, |w|, of a word w is the number |I_w|. The set of all words over X is denoted by X* and X^+ = X* \ {λ}. A language is a subset of X*. We write minlen L to denote the length of a shortest word in the language L. On the other hand, if L is finite we write maxlen L to denote the length of a longest word in L. If all the words in L are of the same length, we say that L is a uniform code. In this case, we use the symbol len L to denote the length of the words in L. In the sequel, we fix an alphabet X that contains at least the two distinct symbols 0 and 1.

Let L be a subset of X*; then a factorization over L is a mapping φ : I → L, where I is an index set. As before, we write I_φ to indicate the index set of φ, and |φ| to denote the length of the factorization φ, which is equal to |I_φ|. For a factorization φ over L, we write [φ] to denote the word φ(0)φ(1)···φ(n − 1), where n = |φ|. If |φ| = 0 then [φ] = λ. For n ∈ ℕ₀ and w ∈ X*, the symbol w^n denotes the word [φ] such that |φ| = n and φ(i) = w for all i ∈ I_φ. Also, for W ⊆ X*, W^n denotes the set of words [φ] such that φ is a factorization over W with |φ| = n, and W^{≤n} = ∪_{i=0}^{n} W^i.

A code (over X) is a non-empty subset K of X^+ such that [φ] = [ψ] implies φ = ψ, for all factorizations φ and ψ over K. A message over K is a word [φ], where φ is a factorization over K. Then K* is the set of all messages over K and K^+ is the set of all non-empty messages.

A channel, γ, is a binary relation over X*, namely γ ⊆ X* × X*. For the elements of a channel γ, we prefer to write (y′|y) rather than (y′, y). Then (y′|y) ∈ γ means that the word y′ can be received from y through the channel γ. For a word y we define (y)_γ to be the set of all possible outputs of γ when y is used as input; that is, (y)_γ = {y′ ∈ X* : (y′|y) ∈ γ}.

More generally, for a set of words Y, we have (Y)_γ = ∪_{y∈Y} (y)_γ.

Definition 1 Let γ be a channel and let v be a factorization over Y ⊆ X*. A factorization v′ over (Y)_γ is γ-admissible for v if

I_{v′} = I_v and v′(i)···v′(i + k) ∈ (v(i)···v(i + k))_γ, for all i ∈ I_v and k ∈ ℕ₀ with i + k ∈ I_v.

Example 1 Consider the message y = 001100 and its factorization v over K = {00, 11} such that v = (00, 11, 00). Consider also a channel γ that allows at most one deletion in any 2 consecutive input symbols. As a result, y′ = 0100 is a possible output in (y)_γ if one deletes the symbols y(0) and y(2) in y. Then the factorization v′ of y′ over (K)_γ such that v′ = (0, 1, 00) is γ-admissible for v. On the other hand, for the same channel γ, and for K = {01, 10} and y = 0110, one has the following: v = (01, 10) is a factorization of y over K and v′ = (0, 0) is a factorization of y′ = 00 over (K)_γ such that v′(i) ∈ (v(i))_γ for i ∈ {0, 1}. But y′ ∉ (v(0)v(1))_γ, since the symbols y(1) and y(2) of y cannot both be deleted. Hence, v′ is not γ-admissible for v.

In the sequel, we consider only channels γ satisfying the following natural conditions.

(P1) Input factorizations arrive as γ-admissible output factorizations: If (y′|y) ∈ γ and v is a non-empty factorization of y over some subset Y of X^+, then there is a factorization v′ of y′ over (Y)_γ which is γ-admissible for v.

(P2) Error-free messages can be received independently of the context: If (y′|y) ∈ γ then (xy′z|xyz) ∈ γ, for all x, z ∈ X*.

(P3) Empty input can result into empty output: (λ|λ) ∈ γ.

Channels satisfying properties P1–P3 are called Pf-channels. They differ from the P-channels defined in [4] only in the finiteness type of the inputs and outputs; that is, Pf-channels allow only finite words to be used, as opposed to P-channels. Consequently, property P0 of P-channels is omitted here. We note that properties P2 and P3 imply (y|y) ∈ γ for all y ∈ X*. Moreover, every SID-channel is a Pf-channel.

We close this section with an example of how words can be affected by the errors of an SID-channel.

Example 2 Consider the word x = 0000000 and the SID-channel γ = ι∘δ(2, 5) that permits at most 2 insertions and deletions in any 5 consecutive symbols. Let y = 01000001 and let z = 0110000010. Observe that y can be obtained from x when γ deletes x(2), inserts a 1 between x(0) and x(1), and inserts a 1 at the end of x; all the errors occur at the same time. Hence, y ∈ (x)_γ. On the other hand, to obtain z from x using a minimum number of errors, one has to insert three 1s in the segment x(1)···x(5) of length 5. Hence, z ∉ (x)_γ.
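The minimum total number of insertions and deletions between two words, ignoring the window constraint, equals |u| + |v| − 2·lcs(u, v). The following sketch (an illustration added here, not from the paper) shows that both y and z of Example 2 lie three errors away from x; the difference lies only in where those errors must fall, which is exactly what the window bound of the channel constrains:

```python
def indel_distance(u, v):
    """Minimum number of single-symbol insertions/deletions turning u into v,
    computed from a longest common subsequence."""
    n, m = len(u), len(v)
    L = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            L[i + 1][j + 1] = L[i][j] + 1 if u[i] == v[j] else max(L[i][j + 1], L[i + 1][j])
    return n + m - 2 * L[n][m]

x, y, z = "0000000", "01000001", "0110000010"
assert indel_distance(x, y) == 3   # reachable under iota.delta(2,5)
assert indel_distance(x, z) == 3   # unreachable: the 3 errors crowd one 5-window
```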

3. Error Detection: Definitions, Examples and Basic Results

The classical theory of error-correcting codes deals with channels that permit substitution errors and considers primarily uniform codes. In that context, a uniform code K is said to be m-error-detecting if v1 ∈ (v2)_γ implies v1 = v2, for all codewords v1 and v2, where γ = σ(m, ℓ) and ℓ is the length of the words in K; see [1] or [8]. The notion of error-detection has been generalized in [4] to the case of P-channels, but no results are included there concerning error-detection. In this section we investigate the notion of (γ, *)-detecting code as defined in [4]. In many cases, this property can be studied in terms of the simpler notion of (γ, t)-detecting code, where t ∈ ℕ₀. The formal definitions are provided next.

Definition 2 Let γ be a Pf-channel and let t ∈ ℕ₀.

(i) A language L is error-detecting for γ, if

∀ w1, w2 ∈ L ∪ {λ}: w1 ∈ (w2)_γ → w1 = w2.

The symbol ED_γ denotes the class of languages that are error-detecting for γ.

(ii) A code K is (γ, *)-detecting, if the language K* is error-detecting for γ. The symbol ED_γ^* denotes the class of codes that are (γ, *)-detecting.

(iii) A code K is (γ, t)-detecting, if

∀ w1 ∈ K^{≤t}, ∀ w2 ∈ K*: w1 ∈ (w2)_γ → w1 = w2.

The symbol ED_γ^t denotes the class of codes that are (γ, t)-detecting.

In part (i) of Definition 2, the use of "w1, w2 ∈ L ∪ {λ}" as opposed to "w1, w2 ∈ L" is justified as follows. First, it should not be possible for the channel γ to return a non-empty word in L when nothing is sent into γ, i.e., when the input used is λ. That is, w1 ∈ (λ)_γ and w1 ∈ L ∪ {λ} implies w1 = λ. Similarly, the channel should not be capable of erasing completely a non-empty word of L. That is, λ ∈ (w2)_γ and w2 ∈ L ∪ {λ} implies w2 = λ. These observations do not eliminate from consideration channels that insert or delete symbols. Instead, they ensure that when an error-detecting language is used for γ, it is impossible that γ can erase or introduce an entire non-empty word of L.

Next we show a few examples of error-detecting codes. We also remark that every (γ, *)-correcting code is (γ, *)-detecting.¹

Example 3 Every uniform code K is error-detecting for the channel γ = ι(m, ℓ), provided len K > m. Indeed, as only insertions are permitted, z ∈ (v)_γ implies |v| ≤ |z|; therefore, λ ∈ (v)_γ and v ∈ K ∪ {λ} imply v = λ. On the other hand, as v ∈ (λ)_γ implies |v| ≤ m, one has that v ∈ (λ)_γ and v ∈ K ∪ {λ} imply v = λ. Now let v1 and v2 be codewords of K such that v1 ∈ (v2)_γ. As only insertions are permitted, one has that |v1| ≥ |v2|. In particular, |v1| = |v2| if and only if no insertion occurs in v2, if and only if v1 = v2. Hence, as K is uniform, v1 = v2. Analogously, one can verify that every uniform code K is error-detecting for δ(m, ℓ), provided len K > m.

Example 4 One can verify that the code K0 = {000, 111} is error-detecting for the channel γ = σ∘ι∘δ(1, 3). But K0 is not (γ, *)-detecting. Indeed, consider the messages w2 = (000)³ and w1 = (000)² such that w1 ≠ w2. Then w1 ∈ (w2)_γ, by deleting appropriately three symbols from w2.
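The deletion pattern used in Example 4 can be checked mechanically: a set of deletion positions is permitted by δ(1, 3) exactly when every window of 3 consecutive symbols of the input contains at most one of them. A Python sketch (the code and names are illustrative assumptions, not from the paper):

```python
def deletions_allowed(word, positions, m, l):
    """True iff deleting the given positions leaves at most m deletions in
    every window of l consecutive symbols of the input word."""
    hits = [1 if i in positions else 0 for i in range(len(word))]
    w = min(l, len(hits))
    return all(sum(hits[i:i + w]) <= m for i in range(len(hits) - w + 1))

w2 = "000" * 3                       # the message (000)^3
pos = {0, 3, 6}                      # delete one zero from each codeword
assert deletions_allowed(w2, pos, 1, 3)
w1 = "".join(c for i, c in enumerate(w2) if i not in pos)
assert w1 == "000" * 2               # (000)^2 is received: K0 is not (gamma,*)-detecting
```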

Example 5 Consider the code K1 = {v1, v2 | v1 = 00111, v2 = 0101011} and the channel γ = δ(1, 7). From the equalities (v1)_γ = {v1, 0111, 0011} and

(v2)_γ = {v2, 101011, 001011, 011011, 010011, 010111, 010101},

one verifies that K1 is error-detecting for γ. In addition, we claim that K1 is (γ, *)-detecting. Indeed, note first that λ ∉ (w)_γ and w ∉ (λ)_γ for all w ∈ K1^+. Now consider two messages w1 and w2 in K1^+ such that w1 ∈ (w2)_γ. Then w1 = [κ1] and w2 = [κ2] for some factorizations κ1 and κ2 over K1. By property P1 of the channel γ, there is a factorization ψ which is γ-admissible for κ2 such that [ψ] = w1 = [κ1] and ψ(i) ∈ (κ2(i))_γ for all i ∈ I_ψ = I_{κ2}. It is sufficient to show that ψ = κ1; then, as K1 is error-detecting for γ, κ1(i) ∈ (κ2(i))_γ implies κ1(i) = κ2(i) for all i in I_{κ1}. So consider the word κ1(0) of K1 which is a prefix of both [κ1] and [ψ]. If κ1(0) = v1 then ψ(0) = v1 or ψ(0) = 0011. The second case implies ψ(1) = 101011, which is impossible, as two deletions would occur in κ2(0)κ2(1) within a segment of length less than 7. Hence, ψ(0) = v1 as well. Similarly, one verifies that if κ1(0) = v2 then ψ(0) = v2 as well. Hence, ψ(0) = κ1(0) and ψ(1)ψ(2)··· = κ1(1)κ1(2)···. The same argument can be applied repeatedly to obtain ψ(i) = κ1(i) for all i in I_ψ.
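The single-deletion sets in Example 5 are small enough to enumerate. A hedged Python sketch (added here as an illustration) confirming the displayed equalities and the error-detection claim for δ(1, 7), where the codewords are short enough that at most one deletion applies per codeword:

```python
def one_deletions(w):
    """All words obtainable from w by deleting at most one symbol."""
    return {w} | {w[:i] + w[i + 1:] for i in range(len(w))}

v1, v2 = "00111", "0101011"
K1 = {v1, v2}
assert one_deletions(v1) == {v1, "0111", "0011"}
assert one_deletions(v2) == {v2, "101011", "001011", "011011",
                             "010011", "010111", "010101"}
# no codeword can be received when the other codeword is sent
assert all(u not in one_deletions(v) for u in K1 for v in K1 if u != v)
```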

¹ A code K is (γ, *)-correcting if (w1)_γ ∩ (w2)_γ ≠ ∅ implies w1 = w2, for all w1 and w2 in K*.

The following proposition gives certain relationships between the error-detecting properties given in Definition 2.

Proposition 1 For every t in ℕ₀ and for every Pf-channel γ, the following relationships are valid.

(i) ED_γ^{t+1} ⊆ ED_γ^t. (ii) ED_γ^* ⊆ ED_γ^t.

(iii) ED_γ^* = ∩_{t=0}^{∞} ED_γ^t.

Proof: Consider a code K which is (γ, t + 1)-detecting and the messages w1 ∈ K^{≤t} and w2 ∈ K* such that w1 ∈ (w2)_γ. Let v ∈ K. By property P2 of the channel γ, one has w1v ∈ (w2v)_γ. As w1v ∈ K^{≤t+1} and w2v ∈ K*, it follows that w1v = w2v. Hence w1 = w2 and the first inclusion is correct. Obviously, the second inclusion is correct as well. For the third relationship, one can easily verify that ED_γ^* ⊆ ED_γ^t for all t in ℕ₀. Hence, ED_γ^* ⊆ ∩_{t=0}^{∞} ED_γ^t. On the other hand, consider a code K in ∩_{t=0}^{∞} ED_γ^t and w1, w2 ∈ K* with w1 ∈ (w2)_γ. Then there is t ∈ ℕ₀ such that w1 ∈ K^t and, as K ∈ ED_γ^t, it follows that w1 = w2. Hence, K ∈ ED_γ^*. □

Next it is shown that the inclusion in Proposition 1(i) can be proper for every value of the parameter t.

Proposition 2 For every t in ℕ₀ there is an SID-channel γ such that ED_γ^{t+1} is properly contained in ED_γ^t.

Proof: For each t in ℕ₀ consider the SID-channel γ = γ(t) = δ(1, t + 2) and the code K = K(t) = {0^{t+2}}. First we show that K is (γ, t)-detecting and then that K is not (γ, t + 1)-detecting.

Let w1 ∈ K^m and w2 ∈ K^n such that w1 ∈ (w2)_γ, m ≤ t, and n ∈ ℕ₀. As only deletions are permitted, |w1| ≤ |w2|. If |w1| = |w2| then w1 = w2 as required. On the other hand, we show that the assumption |w1| < |w2| leads to a contradiction. Indeed, as |K| = 1, this assumption implies m + 1 ≤ n. Now as w2 consists of n codewords each of length t + 2, at most one symbol can be deleted in each codeword and, therefore, at most n deletions can occur in w2. Hence, |w1| ≥ |w2| − n, which together with m + 1 ≤ n imply

m(t + 2) ≥ n(t + 2) − n = n(t + 1) ≥ (m + 1)(t + 1) = m(t + 1) + t + 1 ⟹ t + 1 ≤ m.

The last inequality, however, contradicts m ≤ t. Now we show that K is not (γ, t + 1)-detecting. Let w1 = (0^{t+2})^{t+1} ∈ K^{≤t+1} and w2 = (0^{t+2})^{t+2} ∈ K*. Clearly w1 ≠ w2. On the other hand, one has that w1 ∈ (w2)_γ by deleting appropriately one zero in every t + 2 consecutive symbols of w2. □

The following result poses a certain restriction on the words of (γ, *)-detecting codes for SID-channels that involve insertions or deletions.

Proposition 3 Let K be a code and let γ = τ(m, ℓ) be an SID-channel with τ ∈ 𝒯 \ {σ}. If K is (γ, *)-detecting, then x^n ∉ K for all x ∈ X^{≤m} and for all n ∈ ℕ.

Proof: As τ ≠ σ, at least one of ι and δ occurs in τ. Assume that δ occurs in τ and that K is (γ, *)-detecting, but suppose x^n ∈ K for some x ∈ X^{≤m} ∩ X^+ and n ∈ ℕ. Let v = x^n. Note that both w2 = v^{nℓ} and w1 = v^{nℓ−1} are in K* and that w1 ≠ w2. We show that w1 ∈ (w2)_γ, which contradicts the fact that K is (γ, *)-detecting. Let y = x^{nℓ−1}, so that v^ℓ = xy. Then w2 = (v^ℓ)^n = (xy)^n = (xy)(xy)···(xy). Moreover, as |xy| = ℓ|v| = ℓn|x| ≥ ℓ, it is possible that γ deletes the prefix x in each of the n factors xy of w2. Hence, y^n ∈ (w2)_γ. But y^n = x^{(nℓ−1)n} = v^{nℓ−1} = w1. The case where only ι occurs in τ can be shown analogously. □
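The construction in the proof of Proposition 3 can be traced on a tiny instance. The sketch below (illustrative only; x = 0, n = 2, ℓ = 2 are values chosen here) checks the word identities v^ℓ = xy, w2 = (xy)^n, and y^n = v^{nℓ−1}:

```python
x, n, l = "0", 2, 2
v = x * n                      # v = x^n, the codeword assumed in the proof
y = x * (n * l - 1)            # y = x^{nl-1}
assert x + y == v * l          # v^l = xy
w2 = (x + y) * n               # w2 = (v^l)^n = (xy)^n, a message in K*
w1 = y * n                     # delete the prefix x of each of the n factors
assert w2 == v * (n * l)       # w2 = v^{nl}
assert w1 == v * (n * l - 1)   # y^n = v^{nl-1}, a different message over {v}
```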

The next proposition gives a certain bound on the number of insertion/deletion errors that a regular language can detect. Let L be a regular language, other than ∅ and {λ}, accepted by a minimal reduced deterministic finite automaton A; that is, an automaton with the smallest number of states such that every state q is reachable from the start state and can reach a final state. A loop of A is a sequence of states (q1, ..., qn), with n ≥ 2, such that q1 = qn and there is a computation of A from q1 to qn. Then n is called the length of the loop. Let minloop(L) be the minimum length of the loops of A, or ∞ if A has no loops. Let p(L) = min{minlen(L \ {λ}) − 1, minloop(L) − 2}.

Proposition 4 Let τ be an error type in 𝒯 \ {σ} and let L be a regular language other than ∅ and {λ}. If L is error-detecting for τ(m, ℓ) then m ≤ p(L).

Proof: As τ ≠ σ, at least one of δ and ι occurs in τ. Let γ = τ(m, ℓ). Assume that L is error-detecting for γ but suppose, for the sake of contradiction, that m > p(L). If p(L) = |w| − 1, for some nonempty word w in L, then m ≥ |w|, which implies that the channel can erase or introduce w depending on whether δ occurs in τ. That is, λ ∈ (w)_γ or w ∈ (λ)_γ, which is impossible. Now if p(L) = minloop(L) − 2 then there is a loop (q1, ..., qn) of length n = p(L) + 2. Let v be the word of length p(L) + 1 formed by the labels (symbols) in the path from q1 to qn. Then xy, xvy ∈ L for some words x and y. As |v| ≤ m, one has that xy ∈ (xvy)_γ or xvy ∈ (xy)_γ, depending on whether δ occurs in τ, which contradicts the assumption about L.

The symbol syn L denotes the syntactic monoid of the language L. It is well-known that a language L is regular if and only if syn L is finite (see [4] or [9]). Moreover, |syn L| ≥ minlen(L \ {λ}). Hence, the following obtains.

Corollary 1 Let τ be an error type in 𝒯 \ {σ} and let L be a regular language other than ∅ and {λ}. If L is error-detecting for τ(m, ℓ) then m < |syn L|. □

4. Error-detecting Uniform, Solid, and Shuffle Codes

In this section we consider certain error-detecting capabilities of some known classes of codes. There are cases where, due to the characteristics of the codes used, (γ, 1)-detection is sufficient to ensure (γ, *)-detection. On the other hand, for some classes of codes, (γ, 1)-detection is provided for free. The first result concerns the channel σ(m, ℓ) that involves only substitution errors. This result justifies the use of uniform codes for such channels.

Proposition 5 Let K be a uniform code and let γ be the channel σ(m, ℓ). Then K is (γ, *)-detecting if and only if it is (γ, 1)-detecting.

Proof: The 'only if' part follows immediately from Proposition 1(ii). Now assume that K is a uniform code of length n ∈ ℕ and that K is (γ, 1)-detecting. Let w1, w2 be messages in K* such that w1 ∈ (w2)_γ. Then there are factorizations κ1, κ2 over K such that [κ1] = w1 and [κ2] = w2. Property P1 implies that there is a factorization ψ which is γ-admissible for κ2 such that w1 = [ψ] and ψ(i) ∈ (κ2(i))_γ for all i ∈ I_ψ = I_{κ2}. As γ permits only substitutions, one has |ψ(i)| = n for all i ∈ I_{κ2}. Hence, |[ψ]| = n|κ2|. On the other hand, |w1| = n|κ1|; therefore, |κ1| = |κ2|, which implies ψ = κ1. Now as κ1(i) ∈ (κ2(i))_γ and K is (γ, 1)-detecting, it follows that κ1(i) = κ2(i) for all i ∈ I_{κ1}. Hence, w1 = w2. □
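Since substitutions preserve length, a length-preserving output of a message over a uniform code decomposes blockwise, so for codeword pairs the (γ, 1)-detection test is a window check on mismatches. A brute-force Python sketch under these assumptions (an illustration added here, not an algorithm from the paper):

```python
def sub_reachable(v1, v2, m, l):
    """True iff v1 can be received from v2 through sigma(m, l)."""
    if len(v1) != len(v2):
        return False
    mism = [1 if a != b else 0 for a, b in zip(v1, v2)]
    w = min(l, len(mism))
    return all(sum(mism[i:i + w]) <= m for i in range(len(mism) - w + 1))

def uniform_detecting(K, m, l):
    """Pairwise (gamma,1)-detection test for a uniform code K and sigma(m, l);
    by Proposition 5 this settles (gamma,*)-detection as well."""
    return all(not sub_reachable(u, v, m, l) for u in K for v in K if u != v)

assert uniform_detecting({"000", "111"}, 1, 3)       # 3 mismatches in one window
assert not uniform_detecting({"000", "100"}, 1, 3)   # one substitution suffices
```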

A similar statement follows about finite solid codes for the channel σ∘ι∘δ(1, ℓ). A language K is a solid code, if it is an infix and overlap-free language; that is, K ∩ (X*KX^+ ∪ X^+KX*) = ∅ and, for all u, v ∈ X^+ and x ∈ X*, vx, xu ∈ K implies x = λ. Some interesting decoding capabilities of solid codes are discussed in [4]. Recent results on solid codes can be found in [2] and [7].

The proof of the following proposition is based on a special property of the assumed type of solid codes. Let K be a code and let γ be a Pf-channel. A factorization ψ is said to be (γ, K)-corrupted, if it is γ-admissible for some factorization κ over K and κ ≠ ψ. Thus, [ψ] ∈ ([κ])_γ and there is at least one factor ψ(i) of ψ which is not equal to its corresponding factor κ(i) ∈ K. The property we need is the following.

P(γ, K): If ψ is a (γ, K)-corrupted factorization then [ψ] ∉ K*.

One can verify that every code satisfying P(γ, K) must be a (γ, *)-detecting code.

Proposition 6 Let γ be the channel σ∘ι∘δ(1, ℓ) and let K be a finite solid code with maxlen K ≤ ℓ. Then K is (γ, *)-detecting if and only if it is (γ, 1)-detecting.

Proof: The 'only if' part follows immediately from Proposition 1(ii). Now assume that K is (γ, 1)-detecting. We show that P(γ, K) holds. Let κ be a factorization over K and let ψ be γ-admissible for κ such that ψ ≠ κ. Then |κ| = |ψ| > 0. Now suppose that [ψ] ∈ K*; that is, [ψ] = [ρ] for some factorization ρ over K. If |ρ| = 0 then [ρ] = λ ∈ ([κ])_γ, which contradicts the fact that K is (γ, 1)-detecting. Hence |ρ| > 0.

Let k = |κ| = |ψ| and m = |ρ|. Then [ψ] = ψ(0)···ψ(k − 1) = ρ(0)···ρ(m − 1). As κ ≠ ψ, there is a minimum p ∈ I_ψ such that κ(p) ≠ ψ(p). Then [ψ] = κ(0)···κ(p − 1)ψ(p)···ψ(k − 1) and, as K is a prefix code, κ(i) = ρ(i) for all i < p. Hence, ψ(p)···ψ(k − 1) = ρ(p)···ρ(m − 1). Now, for all j in {p, p + 1, ..., k − 1} one has

ψ(j) = x_j y_j, if κ(j) = x_j a_j y_j with a_j ∈ X deleted;
ψ(j) = x_j a_j y_j, if κ(j) = x_j y_j with a_j ∈ X inserted, or if κ(j) = x_j b_j y_j with b_j ∈ X substituted with a_j ∈ X;
ψ(j) = κ(j), if no error occurs.

Of course, when j = p, ψ(j) ≠ κ(j). For the lengths of ρ(p) and ψ(p) we distinguish three cases, which all lead to contradictions due to the fact that K is a (γ, 1)-detecting solid code.

First, assume |ρ(p)| > |ψ(p)|. Then ρ(p) = ψ(p)···ψ(r)w, where p ≤ r and w is either equal to ψ(r + 1) or to a non-empty proper prefix of ψ(r + 1). The former case implies ρ(p) ∈ (K²K*)_γ ∩ K, which is impossible. Hence, 0 < |w| < |ψ(r + 1)| and ψ(r + 1) = ws with s ∈ X^+. The case ψ(r + 1) = κ(r + 1) is not possible, as otherwise w would be a proper suffix of ρ(p) and a proper prefix of κ(r + 1). Hence, ψ(r + 1) is of the form x_{r+1} y_{r+1} or x_{r+1} a_{r+1} y_{r+1}. If |w| ≤ |x_{r+1}|, the overlap-freeness of K is violated again. Hence, ws = x_{r+1} y_{r+1} or ws = x_{r+1} a_{r+1} y_{r+1}, and |w| > |x_{r+1}|. It follows then that ρ(p + 1) either is contained in y_{r+1} or it starts with a proper suffix of y_{r+1}.

Second, assume |ρ(p)| < |ψ(p)|. Then ψ(p) = ρ(p)s, where s ∈ X^+ and m > p. As K is an infix code, it must be |ρ(p)| > |x_p| and, therefore, |s| ≤ |y_p|. Then, however, ρ(p + 1) is either contained in y_p or it starts with a suffix of y_p. Finally, the case |ρ(p)| = |ψ(p)| is also impossible, as it violates the fact that K is (γ, 1)-detecting. □

The code K1 of Example 5 is a (γ, 1)-detecting solid code, where γ = σ∘ι∘δ(1, 7). Hence, Proposition 6 implies that K1 is (γ, *)-detecting as well.

Let us consider now the classes of shuffle codes, as they provide error-detecting capabilities for SID-channels that involve either insertions or deletions. A language K is a prefix-shuffle code of index n ∈ ℕ, if x0···x_{n−1} ∈ K and x0y0···x_{n−1}y_{n−1} ∈ K imply y0 = ··· = y_{n−1} = λ, for all words xi and yi in X*. Let PS_n be the class of prefix-shuffle codes of index n. Then PS_{n+1} ⊆ PS_n. The class SS_n of suffix-shuffle codes of index n is defined analogously: x0···x_{n−1} ∈ K and y0x0···y_{n−1}x_{n−1} ∈ K imply y0 = ··· = y_{n−1} = λ. Again, one has SS_{n+1} ⊆ SS_n. The class IS_n of infix-shuffle codes of index n consists of all codes K such that x0···x_{n−1} ∈ K and y0x0···y_{n−1}x_{n−1}y_n ∈ K imply y0 = ··· = y_{n−1} = y_n = λ for all xi and yj in X*. Then IS_{n+1} ⊆ IS_n. Finally, for the class OS_n of outfix-shuffle codes of index n, one has that x0···x_n ∈ K and x0y0···x_{n−1}y_{n−1}x_n ∈ K imply y0 = ··· = y_{n−1} = λ. Again, one has OS_{n+1} ⊆ OS_n. Moreover, for all n ∈ ℕ,

PS_{n+1} ∪ SS_{n+1} ⊆ IS_n ∩ OS_n and IS_n ∪ OS_n ⊆ PS_n ∩ SS_n.
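For small finite codes the prefix-shuffle property can be decided by brute force over all block decompositions. The recursive Python sketch below is an illustration added here (not an algorithm from the paper); a violation of index n means some u, w ∈ K with w ≠ u admit a decomposition u = x0···x_{n−1}, w = x0y0···x_{n−1}y_{n−1}:

```python
def violates_ps(u, w, n):
    """True iff u = x0...x_{n-1} and w = x0 y0 ... x_{n-1} y_{n-1} for some
    decomposition with a nonempty y_i (equivalently, with w != u)."""
    def embed(i, j, k):
        if k == 0:
            return i == len(u)          # trailing w[j:] is absorbed by y_{n-1}
        for e in range(i, len(u) + 1):  # choose the next block x = u[i:e]
            if w[j:j + e - i] == u[i:e]:
                # after the block x, skip an arbitrary y before the next block
                if any(embed(e, jp, k - 1) for jp in range(j + e - i, len(w) + 1)):
                    return True
        return False
    return u != w and embed(0, 0, n)

def is_prefix_shuffle(K, n):
    """True iff K is a prefix-shuffle code of index n."""
    return not any(violates_ps(u, w, n) for u in K for w in K)

assert is_prefix_shuffle({"000", "111"}, 2)    # K0 of Example 4
assert not is_prefix_shuffle({"0", "00"}, 2)   # 00 = 0.y0 with y0 = 0
```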

We refer the reader to [4] for further results on shuffle codes.

Proposition 7 Let m, ℓ ∈ ℕ with m < ℓ, and let K be a code with minlen K > m and maxlen K ≤ ℓ.

(i) If K is outfix-shuffle of index m then it is error-detecting for ι(m, ℓ) and for δ(m, ℓ).

(ii) If K is prefix-shuffle of index m + 1 then it is (γ, 1)-detecting, where γ = ι(m, ℓ).

Proof: (i) Let γ = δ(m, ℓ). Then, if z ∈ (x)_γ and |x| ≤ ℓ, at most m symbols can be deleted from x to obtain z. Observe that, if k is the number of symbols deleted, then x can be written in the form x0a0···x_{k−1}a_{k−1}x_k and z in the form x0···x_{k−1}x_k, where a0, ..., a_{k−1} ∈ X are the deleted symbols and x0, ..., x_k ∈ X*. From this observation and the fact OS_m ⊆ OS_k for k ≤ m, it follows easily that if K is outfix-shuffle of index m then it is error-detecting for δ(m, ℓ). Using a similar argument, one can show that K is also error-detecting for ι(m, ℓ).

(ii) Let K be prefix-shuffle of index m + 1 and let w1 ∈ K ∪ {λ} and w2 ∈ K* such that w1 ∈ (w2)_γ. As minlen K > m and γ permits at most m insertions in any ℓ or less consecutive symbols of w2, it follows that when one of w1 and w2 is empty they must both be empty. Now assume w1 ∈ K and w2 ∈ K^n for some n in ℕ. Then w2 = [κ] and w1 = [ψ], where κ is a factorization over K of length n and ψ is γ-admissible for κ. We show that w1 = κ(0). As ψ(0) ∈ (κ(0))_γ and |κ(0)| ≤ ℓ, at most m insertions can occur in κ(0). More specifically, let k be the number of insertions in κ(0) and let a0, ..., a_{k−1} ∈ X be the symbols inserted. Then 0 ≤ k ≤ m and ψ(0) = x0a0···x_{k−1}a_{k−1}x_k and κ(0) = x0···x_{k−1}x_k for some words x0, ..., x_{k−1}, x_k. Now [ψ] = ψ(0)s with s ∈ (κ(1)···κ(n − 1))_γ, for some s in X*, and w1 = x0a0···x_{k−1}a_{k−1}x_k s ∈ K. As K is prefix-shuffle of index m + 1, it is also prefix-shuffle of index k + 1 and, therefore, w1 = κ(0), which implies k = 0 and s = λ. Moreover, κ(1)···κ(n − 1) = λ implies n = 1 and w2 = κ(0). Hence, w1 = w2 as required. □

We note that a code satisfying the premises of Proposition 7 is not necessarily (γ, *)-detecting. For example, the code K0 of Example 4 is prefix-shuffle of index 2 and (γ, 1)-detecting, where γ = ι(1, 3). But K0 is not (γ, *)-detecting.

5. Discussion

In this paper, we have argued that error-detection is a fundamental language property when it comes to storing/communicating data. We have presented some initial results on error-detection at the general level of P- and SID-channels, and examined certain error-detecting capabilities of uniform, solid, and shuffle codes. Some potentially interesting questions that arise from this work are the following:

(1) With Proposition 4 in mind, what other bounds exist on the insertion- and deletion-detecting capabilities of languages?

(2) Is it possible to show that solid codes possess stronger error-detecting capabilities than the one shown in Proposition 6 for the SID-channel σ∘ι∘δ(1, ℓ)?

(3) How large is the intersection between certain shuffle codes and solid codes? In view of Proposition 6 and Proposition 7, it appears that codes in that intersection provide certain *-error-detecting capabilities for free.

A related concept which is desirable from a practical point of view is the property of error-detection with finite delay. This property allows the detection of errors in a word w by examining consecutive segments of w of bounded length, one at a time. Some initial results on this topic exist in [5].

References

[I] J. Duske, H. Jiirgensen: Codierungstheorie. BI Wissenschaftsverlag, Mannheim, 1977.

Page 277: Words, Languages & Combinatorics III

252

[2] H. Jiirgensen, M. Katsura, S. Konstantinidis: Maximal solid codes. Jour- nal of Automata, Languages and Combin., 6 (2001), 25-50.

[3] H. Jiirgensen, S. Konstantinidis: Error correction for channels with sub- stitutions, insertions, and deletions. In J.-Y. Chouinard, P. Fortier, T. A. Gulliver (editors) : Information Theory and Applications 2, Fourth Cana- dian Workshop on Information Theory. Lecture Notes in Computer Sci- ence 1133, 149-163, Berlin, 1996. Springer-Verlag.

[4] H. Jürgensen, S. Konstantinidis: Codes. In G. Rozenberg, A. Salomaa (editors): Handbook of Formal Languages, vol. I, 511-607, Berlin, 1997. Springer-Verlag.

[5] S. Konstantinidis: Error-detection with finite delay. In preparation.

[6] S. Konstantinidis: An algebra of discrete channels that involve combinations of three basic error types. Inform. and Comput., 167 (2001), 120-131.

[7] N. H. Lâm: Finite maximal solid codes. Theoret. Comput. Sci., 262 (2001), 333-347.

[8] S. Roman: Coding and Information Theory. Springer-Verlag, New York, 1992.

[9] H. J. Shyr: Free Monoids and Languages. Hon Min Book Company, Taichung, second ed., 1991.


A NOTE ON FINDING ONE-VARIABLE PATTERNS CONSISTENT WITH EXAMPLES AND COUNTEREXAMPLES

TAKESHI KOSHIBA
Secure Computing Laboratory, FUJITSU LABORATORIES Ltd., 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki 211-8588, Japan
E-mail: koshiba@acm.org

KUNIHIKO HIRAISHI
School of Information Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Tatsunokuchi, Ishikawa 923-1292, Japan
E-mail: hira@jaist.ac.jp

We consider the problem of finding one-variable patterns consistent with given positive examples and negative examples. We try to give some evidence that the pattern-finding problem is computationally difficult by exhibiting an NP-complete graph problem (called MCP) such that the pattern-finding problem is a subproblem of MCP. We also give sufficient conditions under which the pattern-finding problem is polynomial-time computable, and show that some of the conditions are related to solving word equations in one variable.

1 Introduction

Finding common patterns of strings is a classical problem in inductive inference. For example, when given the set of strings {01010, 12120, 3313310}, we expect that a common pattern like xx0 is induced. Angluin¹ has presented a theoretical framework for this problem, and the framework enables us to analyze the complexity of the problem more easily. A pattern is a string over constant symbols and variable symbols. A pattern generates a language by substituting non-null strings of constants for the variables of the pattern. For example, substituting 110 for x in the pattern xx0 generates the string 1101100. The language of the pattern xx0, denoted L(xx0), is the following set:

{000, 110, 00000, 01010, 10100, 11110, 0000000, ...}.

Angluin¹ showed that the membership problem, that is, whether, given a pattern p and a string w, w ∈ L(p) or not, is NP-complete, and suggested that the pattern-finding problem might be difficult. Jiang et al.⁵ showed that the inclusion problem, that is, whether, given a pair of patterns p and q, L(p) ⊆ L(q) or not, is undecidable. On the other hand, Angluin¹ presented a polynomial-time algorithm that finds, given a set of strings, one of the longest


one-variable patterns. The algorithm consists of three main steps:

Step 1. All possible patterns are partitioned into at most t^3 groups, where t is the input size.

Step 2. For each group G_i and each input string s, construct a finite automaton A_{i,s} of at most t^2 states, which is called a pattern automaton, to recognize all patterns for s which are in G_i.

Step 3. For each set G_i, construct the intersection automaton A_i = ⋂_s A_{i,s} to recognize all common patterns in G_i.

Angluin¹ pointed out that although the algorithm can be generalized to the k-variable case for k > 1 in a straightforward manner, the generalized algorithm does not seem to run in polynomial time in the case k > 1. Ko and Hua⁷ showed that the straightforward generalization to the two-variable case brings in an NP-complete subproblem.

Ko and Tzeng⁸ have studied another important problem of finding a common pattern: the problem of finding a pattern consistent with given positive and negative examples. They showed that the problem of finding a pattern consistent with given positive and negative examples (i.e. given two sets S and T of constant strings, determine whether there exists a pattern p such that S ⊆ L(p) and T ⊆ L̄(p)) is Σ₂ᵖ-complete, where L̄(p) is the complement language of L(p). Ko et al.⁹ also stated that the complexity of the problem is not settled in the k-variable case for k ≥ 1.

In this paper, we further investigate the modification of Angluin's algorithm for the one-variable pattern-finding problem from given positive and negative examples. We show that the modified algorithm meets a difficult, in fact NP-complete, problem. More precisely, in Step 3 of the modified algorithm, finding a pattern that is recognized by A_i and consistent with each negative example seems to be difficult. We show that the pattern-finding problem is a subproblem of a graph problem, and also show that the graph problem is NP-complete. Although this fact does not imply that the one-variable pattern-finding problem from given positive and negative examples is difficult, we can regard it as one aspect of the computational complexity of the problem. We also give sufficient conditions under which the one-variable pattern-finding problem from given positive and negative examples is efficiently computable.

2 Definitions

Σ is a finite alphabet containing at least two symbols. The set of all finite strings over Σ is denoted by Σ*. The set of all finite non-null strings over Σ is denoted by Σ+. The set of all strings of length k over Σ is denoted by Σ^k. A sample is a finite nonempty subset of Σ+, and each element of a sample is called an example.

A pattern is a finite non-null string over Σ ∪ {x}, where x is the variable symbol and is not in Σ. Let P_1 denote the set of all patterns. The length of a pattern p, denoted |p|, is the number of occurrences of symbols composing it. For each set A, let ||A|| denote the cardinality of A. The concatenation of two patterns p and q is denoted by pq. The pattern that is the k-fold concatenation of a pattern p is denoted by p^k.

Let f be a non-erasing homomorphism from P_1 to P_1 with respect to concatenation. If f is the identity function when restricted to Σ, then f is called a substitution. We use the notation [w/x] for the substitution which maps the variable symbol x to the string w and every other symbol to itself. For any pattern p and any substitution f, the substituted pattern f(p) is denoted by p[w/x].

If p is a pattern, the language of p, denoted L(p), is the set {s ∈ Σ+ : s = f(p) for some substitution f}. A pattern p is said to be descriptive of a sample S if S ⊆ L(p) and, for every pattern q such that S ⊆ L(q), L(q) is not a proper subset of L(p). That is, for a descriptive pattern p of S, L(p) is minimal in the set-containment ordering among all pattern languages containing S. For any pattern p and any string s, if there exists a substitution f such that f(p) = s, then we say that p generates s (by f). A pattern p is said to be consistent with a positive sample S and a negative sample T if S ⊆ L(p) and T ⊆ Σ* \ L(p).

3 Review of the One-Variable Pattern-Finding Problem

The difficulty of the pattern-finding problem in the case of general patterns lies in that of the membership problem (i.e. given a pattern p and a constant string s, determine whether s ∈ L(p)). The following shows the difficulty of the membership problem.

Proposition 1 (Angluin¹ (1980)) The problem of deciding whether s ∈ L(p), for any string s ∈ Σ* and for any pattern p, is NP-complete.

However, in the case of one-variable patterns, the membership problem is decidable in polynomial time. This suggests that finding a common pattern in one-variable case may be solvable in polynomial time. Actually, Angluin's algorithm runs in polynomial time to find a common pattern from a given positive sample.
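The polynomial-time membership test in the one-variable case can be sketched directly (this `member` helper is ours, not code from the paper): if the pattern has c constants and j occurrences of x, then only substitution lengths |w| with c + j·|w| = |s| are possible, and the first occurrence of x in the pattern then determines w.

```python
def member(pattern, s, var="x"):
    """Decide s in L(pattern) for a one-variable pattern (x maps to a
    non-empty string).  Polynomial time: at most |s| candidate lengths."""
    c = sum(1 for ch in pattern if ch != var)   # number of constants
    j = pattern.count(var)                       # occurrences of x
    if j == 0:
        return pattern == s
    for wlen in range(1, len(s) + 1):
        if c + j * wlen != len(s):
            continue
        # scan s against the pattern, fixing w at the first x
        w, pos, ok = None, 0, True
        for ch in pattern:
            if ch == var:
                piece = s[pos:pos + wlen]
                if w is None:
                    w = piece
                elif piece != w:
                    ok = False
                    break
                pos += wlen
            else:
                if pos >= len(s) or s[pos] != ch:
                    ok = False
                    break
                pos += 1
        if ok:
            return True
    return False

print(member("xx0", "1101100"))  # True: x = 110
print(member("xx0", "1101000"))  # False
```

With two or more variables this length argument no longer pins down the substitutions, which is one intuition behind Proposition 1.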


In this section, we review Angluin’s algorithm for finding a common one- variable pattern from a positive sample.

We first define pattern automata. Let s be a string and let w be a nonempty substring of s. Denote PA_1(s; w) = {p ∈ P_1 : s = p[w/x]}. We define a (one-variable) pattern automaton A(s; w) to recognize the set PA_1(s; w). The states of A(s; w) are ordered pairs (i, j) such that 0 ≤ i, 0 ≤ j, and i + j|w| ≤ |s|. The initial state is (0, 0). The final states are all states (i, j) such that j ≥ 1 and i + j|w| = |s|. The transition function δ is defined as follows. Let b ∈ Σ.

δ((i, j), b) = (i + 1, j) if the (1 + i + j|w|)-th symbol of s is b, and is undefined otherwise;

δ((i, j), x) = (i, j + 1) if w occurs in s beginning at position (1 + i + j|w|), and is undefined otherwise.

The state (i, j) signifies that, in the input string, i constant symbols and j occurrences of x have been read so far.
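As a concrete illustration, here is one way to materialize A(s; w) as a transition table (our own sketch of the construction just described, not code from the paper; positions are 0-based here, whereas the text counts symbols from 1, and the extra pruning used later for A_r(i, j, k) is not applied):

```python
def pattern_automaton(s, w):
    """Build the one-variable pattern automaton A(s; w) as a dict of
    transitions (i, j) --symbol--> (i', j').  State (i, j) means that i
    constants and j occurrences of x have been read."""
    delta, n, k = {}, len(s), len(w)
    states = [(i, j) for j in range(n // k + 1) for i in range(n + 1)
              if i + j * k <= n]
    for (i, j) in states:
        pos = i + j * k                      # next position in s (0-based)
        if pos < n:
            delta[(i, j), s[pos]] = (i + 1, j)     # read one constant
            if s.startswith(w, pos):
                delta[(i, j), "x"] = (i, j + 1)    # read one copy of w as x
    final = [(i, j) for (i, j) in states if j >= 1 and i + j * k == n]
    return delta, final

delta, final = pattern_automaton("1101100", "110")
print(delta[(0, 0), "x"])  # (0, 1): an occurrence of x can start at position 0
print(final)               # the accepting states (i, j) with i + 3j = 7
```

Accepting runs of this automaton spell out exactly the patterns p with s = p[w/x], as in the definition of PA_1(s; w).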

Let A_i = (Q_i, q_0, δ_i, F_i), for i = 1, 2, be two finite automata over the alphabet Σ with the same initial state q_0 = (0, 0), where Q_i ⊆ ℕ × ℕ is the set of states, δ_i is the transition function, and F_i is the set of final states of A_i. Then we define A_1 ⊆ A_2 if and only if Q_1 ⊆ Q_2, F_1 ⊆ F_2, and whenever δ_1 is defined, δ_2 is also defined and agrees with δ_1. A finite automaton A is called a one-variable pattern automaton if and only if A ⊆ A(s; w) for some string s and substring w.

Let A_i = (Q_i, (0, 0), δ_i, F_i), i = 1, 2, be two one-variable pattern automata. Then the intersection of the automata A_1 and A_2, denoted by A_1 ∩ A_2, is the finite automaton (Q_1 ∩ Q_2, (0, 0), δ, F_1 ∩ F_2), where δ(q, a) is defined to be δ_1(q, a) whenever δ_1(q, a) and δ_2(q, a) are both defined and equal, and is undefined otherwise.

Proposition 2 (Angluin¹ (1980)) If A and A′ are one-variable pattern automata, then A ∩ A′ is a one-variable pattern automaton, and L(A ∩ A′) = L(A) ∩ L(A′).

Next we discuss the partition of one-variable patterns into pairwise disjoint groups. For each one-variable pattern p, define π(p) to be the triple of nonnegative integers (i, j, k) such that the number of occurrences of constants in p is i, the number of occurrences of variables in p is j, and the position of the leftmost occurrence of x in p is k. Let PA(i, j, k) be the set of all patterns p in P_1 such that π(p) = (i, j, k).

Let us call a triple (i, j, k) feasible for s if 0 ≤ i ≤ |s|, 1 ≤ j ≤ |s|, 1 ≤ k ≤ i + 1, and j divides |s| − i. We say a triple (i, j, k) is feasible for a set S if it is feasible for all s in S. Let F be the set of all feasible triples for the given set S. We construct, for each string s and each triple (i, j, k) that is feasible for s, a pattern automaton A(s; w), where w is the unique string defined by the triple. Let an input sample S = {s_1, ..., s_m} be given, where each s_i ∈ Σ+ and m ≥ 2. Then each triple (i, j, k) in F defines m automata A_r(i, j, k), for r = 1, ..., m, as follows. Let w_r be the substring of s_r beginning at position k and of length (|s_r| − i)/j. To obtain A_r(i, j, k), take A(s_r; w_r) and remove any x-transition leaving a state (u, 0) with u < k − 1, remove the constant transition leaving the state (0, k − 1), and remove all final states except (i, j).

Proposition 3 (Angluin¹ (1980)) A_r(i, j, k) recognizes all patterns p in P_1 such that s_r ∈ L(p) and π(p) = (i, j, k). Consequently,

⋃_{(i,j,k) ∈ F} L( ⋂_{r=1}^{m} A_r(i, j, k) ) = { p ∈ P_1 : S ⊆ L(p) }.

The above observation gives us the following algorithm.

Angluin's One-Variable Pattern-Finding Algorithm
INPUT: S = {s_1, ..., s_m};
OUTPUT: a one-variable pattern p which is descriptive of S within P_1;
begin
  for each (i, j, k) in F do begin
    for r := 1 to m do
      construct automaton A_r(i, j, k);
    A(i, j, k) := ⋂_{r=1}^{m} A_r(i, j, k)
  end;
  sort F in descending order according to the value of i + j;
  for each (i, j, k) in sorted F do
    if ||L(A(i, j, k))|| ≠ 0 then
      output any p ∈ L(A(i, j, k)) and exit
end.


In the above algorithm, it is clear that the time complexity depends on two factors: one is the number of feasible triples, and the other is the amount of time to construct A(i, j, k). Let ℓ be the input size, that is, Σ_{r=1}^{m} |s_r|. Since, for each feasible triple (i, j, k) and each r, 1 ≤ r ≤ m, the automaton A_r(i, j, k) can be constructed in time O(|s_r|^2), and the intersection of automata can be constructed in linear time with respect to the size of the automata, the automaton A(i, j, k) can be constructed in time O(Σ_{r=1}^{m} |s_r|^2). Furthermore, using a theorem (on number theory) of Dirichlet, we can show that ||F|| is O(ℓ^2 log ℓ). Therefore, the above algorithm runs in time O(ℓ^4 log ℓ).
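The set F of feasible triples is straightforward to enumerate; this small sketch (ours, not from the paper) follows the definition above, adding only the observation that |s| − i > 0 is implied because the j ≥ 1 occurrences of x must be replaced by a non-empty string:

```python
def feasible_triples(sample):
    """All triples (i, j, k) feasible for every string s in the sample:
    0 <= i <= |s|, 1 <= j <= |s|, 1 <= k <= i + 1, and j divides |s| - i
    (with |s| - i > 0, since x is substituted by a non-empty string)."""
    def feasible_for(s):
        n = len(s)
        return {(i, j, k)
                for i in range(n)            # i < n, see the remark above
                for j in range(1, n + 1)
                if (n - i) % j == 0
                for k in range(1, i + 2)}
    sets = [feasible_for(s) for s in sample]
    return set.intersection(*sets)

F = feasible_triples(["01010", "10100"])
print(sorted(F)[:3])  # a few of the triples shared by both strings
```

The O(ℓ^2 log ℓ) bound on ||F|| quoted in the text comes from counting, for each i, the divisors j of |s| − i.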

We note that Angluin's algorithm guarantees the following property, which is useful in what follows.

Proposition 4 (Angluin¹ (1980)) For any feasible triple (i, j, k) for S, if there exists a descriptive pattern of S in L(A(i, j, k)), then all patterns in L(A(i, j, k)) are descriptive of S.

4 Finding Patterns from Positive and Negative Examples

In the general case, the pattern-finding problem from positive and negative examples is not easier than the pattern-finding problem from positive examples only. The following proposition supports this observation.

Proposition 5 (Ko and Tzeng⁸ (1991)) The problem of deciding whether there exists a pattern p which is consistent with a positive sample S and a negative sample T is Σ₂ᵖ-complete.

When the number of variables is fixed, whether the problem is efficiently solvable or not has been open. In this section, we extend Angluin's algorithm to deal with negative examples. The following is a straightforwardly extended algorithm to find a one-variable pattern consistent with positive and negative examples.

One-Variable Pattern-Finding Simple Algorithm from Positive & Negative Samples
INPUT: S = {s_1, ..., s_m}, T = {t_1, ..., t_n};
OUTPUT: a one-variable pattern p which is descriptive of S within P_1 and is consistent with S and T;
begin
  for each (i, j, k) in F do begin
    for r := 1 to m do
      construct the automaton A_r(i, j, k);
    A(i, j, k) := ⋂_{r=1}^{m} A_r(i, j, k)
  end;
  sort F in descending order according to the value of i + j;
  for each (i, j, k) in sorted F do
    if ||L(A(i, j, k))|| = 0 then
      continue { go to the end point of the loop body }
    else if ||L(A(i, j, k))|| = 1 then begin
      let p be the unique element of L(A(i, j, k));
      for r := 1 to n do
        check whether t_r ∈ L(p);
      if ∀t ∈ T [t ∉ L(p)] then
        output p and exit
      else
        continue { go to the end point of the loop body }
    end
    else begin
      for r := 1 to n do begin
        construct the automaton A_{t_r}(i, j, k) := A(i, j, k) ∩ A_{t_r}(i, j, k);
        E_r := { e | edge e appears in A(i, j, k) but not in A_{t_r}(i, j, k) }
      end;
      find a pattern p which goes through at least one edge in each E_r;  (*)
      output p and exit
    end;
  output "none"
end.

Obviously, the time complexity of the algorithm is determined by the time complexity of executing the step (*). We formulate this subtask as a decision problem, called the MCP problem, and show that MCP is an NP-complete problem.


Multiple Color Path (MCP) Problem: Given a directed acyclic graph G = (V, E), where V is a set of vertices and E is a set of edges, and given specified vertices s and t and subsets E_1, E_2, ..., E_k of E, find out whether there is a path from s to t which goes through at least one edge in each E_i.
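To make the problem concrete, here is a small brute-force decision procedure for MCP (our own sketch, not from the paper). It searches over pairs (vertex, set of constraints already satisfied), so it runs in time O(2^k · |E|) — exponential in k in general, but polynomial once k is a fixed constant, which is consistent with Theorem 6 and Theorem 7:

```python
from functools import lru_cache

def mcp(edges, s, t, constraints):
    """Decide MCP on a DAG.  `edges` maps a vertex to a list of
    (successor, edge_id) pairs; `constraints` is a list of sets of edge
    ids.  Memoized over (vertex, bitmask of satisfied constraints)."""
    full = (1 << len(constraints)) - 1

    def bits(e):
        # bitmask of the constraints satisfied by traversing edge e
        return sum(1 << c for c, E in enumerate(constraints) if e in E)

    @lru_cache(maxsize=None)
    def reach(v, mask):
        if v == t:
            return mask == full
        return any(reach(u, mask | bits(e)) for u, e in edges.get(v, []))

    return reach(s, 0)

# s --e0--> a --e1--> t, plus a shortcut s --e2--> t
edges = {"s": [("a", 0), ("t", 2)], "a": [("t", 1)]}
print(mcp(edges, "s", "t", [{0}, {1}]))  # True: the path s, a, t works
print(mcp(edges, "s", "t", [{2}, {1}]))  # False: no single path covers both
```

Acyclicity of G guarantees that the recursion terminates; the memoization key is exactly the state space that becomes polynomial for bounded k.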

Theorem 6 The MCP problem is NP-complete.

Proof: It is obvious that the MCP problem is in NP. We show that there is a polynomial-time reduction of 3SAT to MCP, where 3SAT = {φ | φ is a Boolean formula in conjunctive normal form (CNF) in which each clause contains exactly three literals, and φ is satisfiable}. Let φ be a Boolean formula in CNF with m variables x_1, x_2, ..., x_m and n clauses, and with three literals per clause. That is,

φ = ⋀_{i=1}^{n} (ℓ_{i,1} ∨ ℓ_{i,2} ∨ ℓ_{i,3}),

where ℓ_{i,j} (1 ≤ i ≤ n, 1 ≤ j ≤ 3) is either x_k or x̄_k for some k with 1 ≤ k ≤ m, and x̄_k denotes the negative literal of x_k. We first define a graph G = (V, E) and specified vertices s and t as follows:

and

E = E_1 ∪ E_2, where


and

E_2 =

Next we define path constraints, that is, subsets of E, as follows:

E_{i,j,T} = { f_{i,j,T} } ∪ { if ℓ_{i,j} = x_k then e_{k,F} else null },
E_{i,j,F} = { f_{i,j,F} } ∪ { if ℓ_{i,j} = x̄_k then e_{k,T} else null },

E_{C,i} = { if ℓ_{i,1} is positive then f_{i,1,T} else f_{i,1,F} }
        ∪ { if ℓ_{i,2} is positive then f_{i,2,T} else f_{i,2,F} }
        ∪ { if ℓ_{i,3} is positive then f_{i,3,T} else f_{i,3,F} },

for each i, j with 1 ≤ i ≤ n, 1 ≤ j ≤ 3. It is easy to see that this reduction is computable in polynomial time. We have only to show that φ is satisfiable if and only if there is a path in G from s to t which goes through at least one edge in each E_{i,j,T}, E_{i,j,F} and E_{C,i}. Due to the construction of G, every path from s to t goes through either e_{k,T} or e_{k,F}, for each k (1 ≤ k ≤ m). This fact corresponds to the truth assignments for the variables of φ: the path going through e_{k,T} means x_k = 1. Due to the setting of E_{i,j,T} and E_{i,j,F}, if x_k appears in the jth clause as a positive (resp., negative) literal, the value of the literal is determined to be 1 (resp., 0). If the path goes through e_{k,F}, then that means x_k = 0. Due to the setting of E_{i,j,T} and E_{i,j,F}, if x_k appears in the jth clause as a positive (resp., negative) literal, the value of the literal is determined to be 0 (resp., 1). Moreover, the setting of E_{C,i} guarantees the truth value of each clause. Therefore, we can say that φ is satisfiable if and only if there is a path from s to t which goes through at least one edge in each E_{i,j,T}, E_{i,j,F} and E_{C,i}. □

The above theorem suggests that the one-variable pattern-finding problem from positive and negative examples could be hard. However, it does not mean that the problem is settled negatively. In what follows, we consider sufficient conditions under which the one-variable pattern-finding problem from positive and negative examples is efficiently computable.

Theorem 7 Suppose that the number of negative examples is bounded by a constant. Then, the one-variable pattern-finding problem from positive and negative examples is polynomial-time computable.

Proof: In the case that the number of negative examples is bounded by a constant, it is the following subproblem of the MCP problem that we have to solve in order to execute the step (*) in the new algorithm: the case that k is bounded by a constant. It is easy to see that this subproblem is polynomial-time computable. Thus, the new algorithm finds a one-variable pattern from positive and negative examples in polynomial time. □

The pattern-finding problem is related to solving word equations. Let p_1 and p_2 be one-variable patterns accepted by some pattern automaton for a positive sample S. We can regard any solution w of the equation p_1 = p_2 as the substitution f = [w/x], since f(p_1) = f(p_2). Then S is a subset of the set of all substituted patterns f(p_1) such that f(p_1) = f(p_2). The following is well known in the literature on word equations.

Proposition 8 Let p_1 and p_2 be one-variable patterns. Any solution of p_1 = p_2 belongs to either

1. the set { (αβ)^i α : i ≥ 0 }, where |α| ≤ |p_1|, |β| ≤ |p_1|, α and β are uniquely determined, and αβ is a primitive word, or

2. some finite set whose elements are shorter than p_1 or of length |p_1|.

We call solutions in the former set long solutions and those in the latter set short ones. The analysis of the input S also yields some sufficient conditions under which the one-variable pattern-finding problem from positive and negative examples is efficiently computable.
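The structure described in Proposition 8 can be observed experimentally with a brute-force solver for one-variable word equations (a sketch of ours; the function name, signature and the length bound are assumptions for illustration):

```python
from itertools import product

def solutions(p1, p2, alphabet, max_len):
    """Brute-force the solutions w (non-empty, |w| <= max_len) of the
    one-variable word equation p1 = p2, where 'x' is the variable:
    w is a solution iff both substituted patterns coincide."""
    subst = lambda p, w: p.replace("x", w)
    return [w for n in range(1, max_len + 1)
              for w in map("".join, product(alphabet, repeat=n))
              if subst(p1, w) == subst(p2, w)]

# xa = ax has exactly the solutions a, aa, aaa, ...: the long solutions
# (alpha beta)^i alpha of Proposition 8, with alpha = a and beta empty.
print(solutions("xa", "ax", "ab", 3))  # ['a', 'aa', 'aaa']
```

Note that there is at most one long-solution family, and hence at most one long solution of each given length, which is the fact the proof of Theorem 9 below relies on.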


Theorem 9 Let S = {s_1, ..., s_m} be a positive sample and T = {t_1, ..., t_n} be a negative sample. Let s_min be an element of S of minimum length. If (1) there exist s_a, s_b ∈ S such that s_a ≠ s_b and |s_a| = |s_b| ≥ |s_min|^2, or (2) there exists s ∈ S such that |s| ≥ |s_min|^2 and ∀t ∈ T [|t| ≥ |s_min|^2], then the new algorithm finds a one-variable pattern from positive and negative examples in polynomial time.

Proof: First we assume that there exist s_a, s_b ∈ S with s_a ≠ s_b and |s_a| = |s_b| ≥ |s_min|^2. Then the length of any pattern p which generates s_min is at most |s_min|. Also the number of variable symbols in any pattern p which generates s_min is at most |s_min|. Since any pattern automaton A(i, j, k) appearing in the new algorithm recognizes patterns of the same length, if there is a one-variable pattern p such that (1) the number of variable symbols in p is exactly |s_min|, (2) the pattern p generates s_min, and (3) the pattern p is recognized by the pattern automaton, then the pattern automaton recognizes at most one one-variable pattern. So we only have to consider the case where the number of variable symbols in p is less than |s_min|. We assume that there are two distinct one-variable patterns p_1 and p_2 in L(A(i, j, k)). Since there exists s_a ∈ S such that |s_a| ≥ |s_min|^2, there exists a substitution f such that f = [w/x], f(p_1) = f(p_2), and |w| ≥ |s_min| + 1. This means that the equation p_1 = p_2 has solutions of length greater than |p_1|. Proposition 8 says that there exists at most one long solution of any given length. The existence of two distinct strings s_a and s_b such that |s_a| = |s_b| ≥ |s_min|^2, s_a = f_a(p_1) = f_a(p_2) for some substitution f_a, and s_b = f_b(p_1) = f_b(p_2) for some substitution f_b, means that there exist two distinct long solutions of the same length. This contradicts the fact that there exists at most one long solution of a given length. So we can say that ||L(A(i, j, k))|| ≤ 1. If every pattern automaton A(i, j, k) appearing in the new algorithm is either a single path or the null automaton, then the new algorithm always skips the step (*); hence it runs in polynomial time.

Next we assume that there exists s ∈ S such that |s| ≥ |s_min|^2 and ∀t ∈ T [|t| ≥ |s_min|^2]. By a similar argument, we may assume that there are two distinct one-variable patterns p_1 and p_2 in L(A(i, j, k)). This assumption and |s| ≥ |s_min|^2 imply the existence of long solutions for the equation p_1 = p_2. Since all elements of T are long, it is easy to see that A_t(i, j, k) is either the same as A(i, j, k), a single-path automaton, or the null automaton. This fact enables us to run the step (*) in polynomial time. Thus, we can say that the new algorithm runs in polynomial time. □


5 Conclusion

We considered the computational complexity of the one-variable pattern-finding problem from given positive and negative examples. We modified Angluin's algorithm, which efficiently solves the one-variable pattern-finding problem from positive examples only, in order to cope with negative examples. We showed that the modified algorithm involves a difficult problem (the MCP problem) and that this problem is NP-complete. Since the modified algorithm actually involves only a subproblem of the MCP problem, the NP-completeness of the MCP problem does not imply the difficulty of the one-variable pattern-finding problem from given positive and negative examples. We also showed some sufficient conditions under which the one-variable pattern-finding problem from given positive and negative examples is computable in polynomial time. Some of the conditions are obtained using properties of the word equation problem. Since the pattern-finding problem is related to the word equation problem, a more careful analysis of the word equation problem may settle the one-variable pattern-finding problem from given positive and negative examples affirmatively.

References

1. D. Angluin. Finding patterns common to a set of strings. Journal of Computer and System Sciences, 21(1):46-62, 1980.

2. W. Charatonik and L. Pacholski. Word equations with two variables. In Proceedings of the 2nd International Workshop on Word Equations and Related Topics, IWWERT'91, Lecture Notes in Computer Science 677, pages 43-56. Springer-Verlag, 1991.

3. M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, New York, 1979.

4. L. Ilie and W. Plandowski. Two-variable word equations. In Proceedings of the 17th Annual Symposium on Theoretical Aspects of Computer Science, STACS 2000, Lecture Notes in Computer Science 1770, pages 122-132. Springer-Verlag, 2000.

5. T. Jiang, A. Salomaa, K. Salomaa, and S. Yu. Decision problems for patterns. Journal of Computer and System Sciences, 50(1):53-63, 1995.

6. D. E. Knuth, J. H. Morris, and V. R. Pratt. Fast pattern matching in strings. SIAM Journal on Computing, 6(2):323-350, 1977.

7. K.-I. Ko and C.-M. Hua. A note on the two-variable pattern-finding problem. Journal of Computer and System Sciences, 34(1):75-86, 1987.

8. K.-I. Ko and W.-G. Tzeng. Three Σ₂ᵖ-complete problems in computational learning theory. Computational Complexity, 1(3):269-310, 1991.

9. K.-I. Ko, A. Marron, and W.-G. Tzeng. Learning string patterns and tree patterns from examples. In Proceedings of the 7th International Conference on Machine Learning, pages 384-391. Morgan Kaufmann, 1990.

10. M. Lothaire. Combinatorics on Words. Addison-Wesley, Reading, Massachusetts, 1983.

11. G. S. Makanin. The problem of solvability of equations in a free semigroup. Mathematics of the USSR Sbornik, 32(2):129-198, 1977.

12. S. E. Obono, P. Goralcik, and M. Maksimenko. Efficient solving of the word equations in one variable. In Proceedings of the 19th International Symposium on Mathematical Foundations of Computer Science, MFCS'94, Lecture Notes in Computer Science 841, pages 336-341. Springer-Verlag, 1994.


ON THE STAR HEIGHT OF RATIONAL LANGUAGES
A NEW PRESENTATION FOR TWO OLD RESULTS

SYLVAIN LOMBARDY AND JACQUES SAKAROVITCH
Laboratoire Traitement et Communication de l'Information, CNRS / ENST,
46, rue Barrault, 75634 Paris Cedex 13, France
E-mail: {lombardy, sakarovitch}@enst.fr

The star height of a rational language, introduced by Eggan in 1963, has proved to be the most puzzling invariant defined for rational languages. Here, we give a new proof of Eggan's theorem on the relationship between the cycle rank of an automaton and the star height of an expression that describes the language accepted by the automaton. We then present a new method for McNaughton's result on the star height of pure-group languages. It is based on the definition of a (finite) automaton which can be canonically associated with every (rational) language and which we call universal. In contrast with the minimal automaton, the universal automaton of a pure-group language has the property that it contains a subautomaton of minimal cycle rank that recognizes the language.

The star height of a rational language is the infimum of the star heights of the rational expressions that denote the language. The star height was defined in 1963 by Eggan, who basically proved two things and asked two questions.

Eggan showed first that the star height of a rational expression is related to another quantity that is defined on a finite automaton which produces the expression, a quantity which he called rank and which we call here loop complexity. He proved then that there are rational languages of arbitrarily large star height, provided that an arbitrarily large number of letters is available. And he stated the following two problems.

• Is the star height of a rational language computable?
• Do there exist, over a fixed finite alphabet, rational languages of arbitrarily large star height?

For a long time, the first one was considered one of the most difficult problems in the theory of automata; it was eventually solved (positively) by Hashiguchi¹⁰ in 1988.

The second problem, much easier, was solved in 1966 by Dejean and Schützenberger⁷, positively as well. Soon afterwards, in 1967, McNaughton published a paper¹², entitled "The loop complexity of pure-group languages", where he gave a conceptual proof of what Dejean and Schützenberger had established by means of combinatorial virtuosity (one of the "jewels" of formal language theory, cf.¹⁴). He proved that the loop complexity, and thus the star height, of a language whose syntactic monoid is a finite group is computable, and that this family contains languages of arbitrarily large loop complexity (the languages considered by Dejean and Schützenberger belong to that family).

The purpose of this communication is to give a new, and hopefully enlightening, presentation of Eggan's and McNaughton's results. We first give a new proof of Eggan's theorem, by describing an explicit correspondence between the computation that yields the loop complexity of an automaton and the computation of an expression that denotes the language accepted by the automaton. We then present a new method for McNaughton's result on the star height of pure-group languages; it is based on the definition of a (finite) automaton which can be canonically associated with every (rational) language and which we call universal. In contrast with the minimal automaton, the universal automaton of a pure-group language has the property that it contains a subautomaton of minimal cycle rank that recognizes the language.

In a forthcoming paper, we show how this method can be extended and the result generalized from pure-group languages to reversible languages¹¹.

We mostly use the classical terminology, notation and results for automata and languages (cf.’). We give explicit notes when we depart from the standard ones.

1 Eggan’s Theorem

1.1 Star height and loop complexity

Rational expressions (over A*) are the well-formed formulae built from the atomic formulae that are 0, 1 and the elements of A, using the binary operators + and · and the unary operator *.

The operator * is the one that "gives access to infinity". Hence the idea of measuring the complexity of an expression as the largest number of nested calls to that operator in the expression. This number is called the star height of the expression, denoted by h[E] and defined recursively by:

h[E] = 0                     if E = 0, E = 1 or E = a ∈ A,
h[E] = max(h[E′], h[E″])     if E = E′ + E″ or E = E′ · E″,
h[E] = 1 + h[F]              if E = F*.

Examples 1 i) h[(a + b)*] = 1; h[a*(ba*)*] = 2.
ii) h[a* + a*b(ba*b)*ba* + a*b(ba*b)*a(b + a(ba*b)*a)*a(ba*b)*ba*] = 3,
h[(a + b(ba*b)*b)*] = 3; h[a*b(ab*a + ba*b)*ba*] = 2.


These examples show that two equivalent expressions may have different star heights (the expressions in i) as well as those in ii) are equivalent). The following definition is then natural.

Definition 1 The star height of a rational language L of A*, denoted by h[L], is the minimum of the star heights of the expressions that denote the language L:

h[L] = min{ h[E] | E ∈ RatE A*, |E| = L }.

The star height of an expression also reflects a structural property of an automaton (more precisely, of the underlying graph of an automaton) which corresponds to that expression. In order to state it, we define the notion of a ball^b of a graph: a ball in a graph is a strongly connected component that contains at least one arc (cf. Figure 1).

Figure 1. An automaton, its strongly connected components, and its balls.

Definition 2 The loop complexity^c of a graph G is the integer lc(G) recursively defined by:

    lc(G) = 0                                      if G contains no ball (in particular, if G is empty);
    lc(G) = max{lc(P) | P ball of G}               if G is not a ball itself;
    lc(G) = 1 + min{lc(G \ {s}) | s vertex of G}   if G is a ball.

As Eggan showed, star height and loop complexity are the two faces of the same object:

Theorem 1 [8] The loop complexity of a trim automaton A is equal to the infimum of the star heights of the expressions (denoting |A|) that are obtained by the different possible runs of the McNaughton-Yamada algorithm on A.

^a We write |E| for the language denoted by the expression E. Similarly, we write |A| for the language accepted by the automaton A. RatE A* is the set of rational expressions over the alphabet A.
^b Like in a ball of wool.
^c Eggan calls it "cycle rank". McNaughton [12] calls loop complexity of a language the minimum cycle rank of an automaton that accepts the language. We have taken this terminology and made it parallel to star height, for "rank" is a word with already many different meanings.


There is an infimum "hidden" in the definition of the loop complexity, and the theorem states that it is equal to another infimum. It proves to be adequate to make this infimum more explicit and, for that purpose, to define the loop complexity, as well as the star height, relative to an order on the vertices of the graph (or on the states of the automaton). We shall then relate the two quantities more closely, showing that they are equal when they are taken relative to an order. The equality of the two minima will then follow obviously.

We use the following notation and convention. If ω is a total order on a set Q, we denote by ω̂ the largest element of Q for ω. If R is a subset of Q, we still denote by ω the trace of the order ω on R and, in such a context, ω̂ is the largest element of R for ω.

Definition 3 Let G be a graph and ω a total order on the set of vertices of G. The loop complexity of G relative to ω is the integer lc(G, ω) recursively defined by:

    (1) lc(G, ω) = 0                                if G contains no ball (in particular, if G is empty);
    (2) lc(G, ω) = max{lc(P, ω) | P ball of G}      if G is not a ball itself;
    (3) lc(G, ω) = 1 + lc(G \ {ω̂}, ω)               if G is a ball.

Property 1 For any graph G, lc(G) = min{lc(G, ω) | ω order on G}.

Proof. By induction on the number of vertices of G, the base being 0. We see first that, for any total order ω,

    lc(G) ≤ lc(G, ω)

which clearly holds if G contains no ball or is empty. If G is not a ball itself, it holds

    lc(G) = max{lc(P) | P ball of G} ≤ max{lc(P, ω) | P ball of G} = lc(G, ω)

since lc(P) ≤ lc(P, ω), as P has strictly fewer vertices than G. And if G is a ball, it holds

    lc(G) = 1 + min{lc(G \ {s}) | s vertex of G} ≤ 1 + lc(G \ {ω̂}) ≤ 1 + lc(G \ {ω̂}, ω) .

Conversely, the definition of the loop complexity of G amounts to the definition of a total order ω on the vertices of G such that lc(G) = lc(G, ω). If G contains no ball, any order makes the property hold. If G is not a ball


itself, let ω be any order such that its trace on every ball of G is the order that has been determined by the induction hypothesis. If G is a ball itself, let s be the vertex such that the loop complexity of G \ {s} is minimum. Let then ψ be the order on G \ {s}, determined by the induction hypothesis, such that lc(G \ {s}) = lc(G \ {s}, ψ). The order ω defined on G by ω̂ = s and the trace of ω on G \ {s} being equal to ψ is such that lc(G) = lc(G, ω). ∎

1.2 The state elimination algorithm

McNaughton-Yamada's algorithm is probably the best known algorithm for computing a rational expression that denotes the language accepted by an automaton. For our purpose however, it is convenient to use a variant of it, due to Brzozowski and McCluskey [2], which is completely equivalent^d. This algorithm has been described in [15] and in [16]. It uses generalized automata and proceeds by deleting state after state.

Let us call generalized an automaton A = (Q, A, E, I, T) in which the labels of the transitions are not letters anymore but expressions, that is, the elements of E are triples (p, e, q) with p and q in Q and e ∈ RatE A*. The label of a computation is, as usual, the product of the labels of the transitions that constitute this computation, and the language accepted by A is the union of the labels of the successful computations of A.

Starting from a (generalized) automaton A, the state elimination algorithm consists in building a generalized automaton C which can be called trivial: an initial state i, a final state t (distinct from i) and a single transition from i to t, labelled by a rational expression E which denotes the language accepted by A (cf. Figure 2).

Figure 2. The result of the state elimination algorithm.

The first phase consists in building a kind of "normalized" automaton B by adding to A = (Q, A, E, I, T) two distinct states i and t, a transition labelled by 1_{A*} from i to every initial state of A, and a transition labelled by 1_{A*} from every final state of A to t. The state i is the unique initial state of B, the state t its unique final state: B is equivalent to A. As A, and

^d This statement can be made precise and meaningful: an expression obtained by one algorithm can be transformed into an expression computed by the other by using the axiom E* = 1 + E E* (cf. [13]). Note that this axiom preserves star height.


then B, are finite, one can assume, after taking some finite unions on the labels of the transitions, that there is at most one transition from p to q for every pair (p, q) of states of B.

The second phase has as many steps as there are states in A. It consists in successively removing the states of B (but i and t) and updating the transitions in such a way that at every step an equivalent automaton is computed, whose labels are obtained from those of the preceding one by union, product and star.

More precisely, let q be an element of Q; let p_1, p_2, ..., p_l be the states of B which are the origin of a transition whose end is q, and K_1, K_2, ..., K_l the labels of these transitions; let r_1, r_2, ..., r_k be the states of B which are the end of a transition whose origin is q, and H_1, H_2, ..., H_k the labels of these transitions (some of the r_j may coincide with some of the p_h, but no p_h nor any r_j coincides with q). Let L be the label of the transition whose origin and end is q, if it exists; otherwise, we put L = 0 and thus L* = 1.

Let B' be the automaton obtained from B by removing q and all the transitions adjacent to y, and by adding, for every pair of states ( p h , r j ) ,

1 < h < 1 and 1 < j < k , the transition ( p h , I<h L' HJ , rj) (cf. Figure 3) . I<! L' H , Ii'l L' Hk

(a) Before the deletion of q (b) After the deletion of q

Figure 3. A step of the state elimination algorithm.

The automata B and B' are equivalent. By iterating this construction n times, ( n = 1 1 & 1 1 ) " , an automaton C is obtained that contains no states of the automaton A and which is of the required form.

^e The cardinal of a set Q is denoted by ||Q||.


1.3 The Eggan-Brzozowski index

In order to prove Theorem 1, we define the Eggan-Brzozowski index of an automaton A (EB index for short), which is at the same time a generalization and a refinement of the loop complexity. And as above for the loop complexity, we shall define the EB index of A, not absolutely, but relative to a total order ω on the set Q of the states of A, that order which is implicit in the state elimination algorithm.

If A is a generalized automaton, we first call EB index of a transition e of A, denoted i_EB(e), the star height of the label^f of e:

    i_EB(e) = h[|e|] .

If A = (Q, A, E, I, T) is a "classical" automaton over A:

    for every e ∈ E,  i_EB(e) = 0 .

We then call EB index of A relative to ω, and we write i_EB(A, ω), the integer defined by the following algorithm (called the EB algorithm), where we keep the above notation and convention for the order and the trace of an order on a subset:

• If A is not a ball:

    i_EB(A, ω) = max({i_EB(e) | e does not belong to a ball of A}
                     ∪ {i_EB(P, ω) | P is a ball of A})                    (4)

• If A is a ball:

    i_EB(A, ω) = 1 + max({i_EB(e) | e is adjacent to ω̂}, i_EB(A \ ω̂, ω))   (5)

If A is a "classical" automaton, (4) and (5) become respectively:

• If A is not a ball:

    i_EB(A, ω) = max{i_EB(P, ω) | P is a ball of A}                        (6)

• If A is a ball:

    i_EB(A, ω) = 1 + i_EB(A \ ω̂, ω)                                        (7)

to which the base of the recurrence has to be added:

• If A does not contain any ball, or is empty:

    i_EB(A, ω) = 0 .                                                       (8)

Since (8), (6) and (7) define the same induction as (1), (2) and (3), it directly follows from Property 1:

^f The label of the transition e is denoted by |e|.


Property 2  lc(A) = min{i_EB(A, ω) | ω order on Q}.

We denote by E_SE(A, ω) the rational expression obtained by running the state elimination algorithm on A with the order ω, that is, by deleting the states of A the smallest first. It should be noted that, once the order ω is fixed, the order of deletion of states in the state elimination algorithm is the reverse of the order of "deletion" of states in the computation of the EB index. Theorem 1 is then the consequence of the following.

Proposition 1 Let ω be a total order on the set of states of an automaton A. The EB index of A relative to ω is equal to the star height of the rational expression obtained by running the state elimination algorithm on A with the order ω, i.e.

    i_EB(A, ω) = h[E_SE(A, ω)] .

Proof. By induction on the number of states of A. By convention, the states i and t that have been added are larger than all the states of A in the order ω, and are not deleted in the state elimination algorithm.

The base of the induction is thus a generalized automaton with 3 states, as in Figure 4 a) or b). In case a), B contains no ball and it holds:

    i_EB(B, ω) = max(h[E], h[F], h[H]) = h[E + F · H] = h[E_SE(B, ω)] .

In case b), the unique state of B which is neither initial nor final is a ball whose EB index is 1 + h[G], and it holds:

    i_EB(B, ω) = max(h[E], h[F], h[H], 1 + h[G]) = h[E + F · G* · H] = h[E_SE(B, ω)] .

Figure 4. Base of the induction

Let now B be an automaton of the prescribed form with n + 2 states, q the smallest state in the order ω, and B' the automaton after the first step


of the state elimination algorithm applied to B, that is, after deletion of q. Since the adjacency relations (for the states other than q) are the same in B and in B', and as q is the smallest element in the order ω, the EB algorithm runs in the same way on B and on B', i.e. the succession of balls built in both cases is identical, up to the processing of q in B excluded. It remains to show that the computed values are identical as well.

Let P be the smallest ball of B that strictly contains q (if such a ball does not exist, let P = B) and let P' be "the image" of P in B'. Two cases are possible. If q is not the origin (and the end) of a loop (case a)), the transitions of P' are either identical to those of P or labelled by products F · H, where F and H are labels of transitions of P. It then comes:

    i_EB(P', ω) = max(max{i_EB(e) | e does not belong to a ball of P'},
                      max{i_EB(Q, ω) | Q is a ball of P'})
                = max(max{i_EB(e) | e does not belong to a ball of P},
                      max{i_EB(Q, ω) | Q is a ball of P})
                = i_EB(P, ω) .                                             (9)

If q is the origin (and the end) of a loop labelled by G (case b)), i.e. q is a ball of B by itself, the transitions of P' are either identical to those of P or labelled by products F · G* · H. It then comes, since i_EB({q}, ω) = 1 + h[G]:

    i_EB(P', ω) = max(max{i_EB(e) | e does not belong to a ball of P'},
                      max{i_EB(Q, ω) | Q is a ball of P'})
                = max(max{i_EB(e) | e does not belong to a ball of P},
                      (1 + h[G]), max{i_EB(Q, ω) | Q is a ball of P'})
                = max(max{i_EB(e) | e does not belong to a ball of P},
                      i_EB({q}, ω),
                      max{i_EB(Q, ω) | Q ball of P, different from {q}})
                = i_EB(P, ω) .                                             (9')

If P = B (and P' = B'), the equalities (9) and (9') become

    i_EB(B', ω) = i_EB(B, ω)                                               (10)

which yields the induction and then the proposition. Otherwise, and without any induction on the number of nested balls that contain q, (10) is obtained from (9) by noting that the transitions of B' are either identical to those of B or correspond to transitions that are adjacent to q.


In case a), the labels of these transitions are products of the labels of transitions of B; their index is obtained by taking a maximum, and (10) follows from the relation max(a, b, c) = max(a, max(b, c)).

In case b), the labels of these transitions are, as above, of the form F · G* · H, of index max(h[F], h[H], 1 + h[G]). The corresponding transition in B has label F (or H); it is processed by the EB algorithm when the index of the transition of label H (or F) and the one of the ball {q}, whose index is 1 + h[G], are already computed. The result, that is (10), follows then, for the same reason as above.

1.4 No rush to conclusion

After Theorem 1, which shows that the correspondence between automata and expressions can be carried on to a correspondence between loop complexity and star height, one could have thought that to the minimal automaton would correspond an expression of minimal star height. There is no such thing of course (or the star height of a language would not be mysterious anymore). The following example describes one of the simplest languages whose minimal automaton is not of minimal loop complexity.

Example 2 Let F2 and F3 be the languages of A* = {a, b}* consisting of words whose number of a's is congruent to the number of b's plus 1 modulo 2 and 3 respectively, and let F6 be their union:

    F2 = {f | |f|_a − |f|_b ≡ 1 mod 2} ,  F3 = {f | |f|_a − |f|_b ≡ 1 mod 3} ,
    and  F6 = {f | |f|_a − |f|_b ≡ 1, 3, 4 or 5 mod 6} .
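That F6 is indeed the union of F2 and F3 can be checked residue by residue modulo 6; a quick Python verification (ours, for illustration only):

```python
# Residues d = |f|_a - |f|_b modulo 6 satisfying each congruence condition.
F2 = {d for d in range(6) if d % 2 == 1}    # d = 1 (mod 2)
F3 = {d for d in range(6) if d % 3 == 1}    # d = 1 (mod 3)
assert F2 | F3 == {1, 3, 4, 5}              # the residues defining F6 (mod 6)
```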

The minimal automaton of F6 is the "double ring" of length 6, whose loop complexity is 3. The minimal automata of F2 and F3 have loop complexity 1 and 2, hence the star height of F6 is at most 2 (cf. Figure 5).

Figure 5. An automaton of minimal loop complexity (right) which is not the minimal automaton (left) for F6.


2 Conway’s universal automaton

The new interpretation of McNaughton's theorem we are aiming at makes use of a construction which is basically due to Conway.

Let A = (Q, M, E, I, T) be an automaton over a monoid M. For any state q of A, let us call "past of q (in A)" the set of labels of computations that go from an initial state of A to q, and let us denote it by Past_A(q); i.e.

    Past_A(q) = { m ∈ M | there exists i ∈ I such that i --m--> q in A } .

In a dual way, we call "future of q (in A)" the set of labels of computations that go from q to a final state of A, and we denote it by Fut_A(q); i.e.

    Fut_A(q) = { m ∈ M | there exists t ∈ T such that q --m--> t in A } .

For every q in Q it then obviously holds:

    [Past_A(q)] [Fut_A(q)] ⊆ |A| .                                (*)

Moreover, if one denotes by Trans_A(p, q) the set of labels of computations that go from p to q, it then holds:

    [Past_A(p)] [Trans_A(p, q)] [Fut_A(q)] ⊆ |A| .                (**)

It can also be observed that a state p of A is initial (resp. final) if and only if 1_M belongs to Past_A(p) (resp. to Fut_A(p)).

Hence every automaton, and in every automaton every state, induces a set of factorizations (this is how equations such as (*) or (**) will be called) of the subset accepted by the automaton. It is an idea essentially due to J. Conway [5], and one that proved to be extremely fruitful, to take the converse point of view, that is, to build an automaton from the factorizations of a subset (in any monoid).

More specifically, let K be any subset of a monoid M and let us call factorization of K a pair (L, R) of subsets of M such that

    L R ⊆ K

and (L, R) is maximal^g for that property in M × M. We denote by Q_K the set of factorizations of K. For every p, q in Q_K, the factor F_{p,q} of K is the maximal subset of M such that

    L_p F_{p,q} R_q ⊆ K

where p = (L_p, R_p) and q = (L_q, R_q) indeed.

^g Maximal in the order induced by the inclusion in M.


It is easy to verify, as was noted in [4], that if α: M → N is a surjective morphism that recognizes K, i.e. Kαα⁻¹ = K, and if (L, R) is a factorization and F a factor of K, then:

i) L = Lαα⁻¹, R = Rαα⁻¹, and F = Fαα⁻¹ ;
ii) (Lα, Rα) is a factorization and Fα is a factor of Kα ;

or, in other words, factorizations and factors are syntactic objects with respect to K. As a consequence, Q_K is finite if and only if K is recognizable.

In [5], the F_{p,q} are organized as a Q_K × Q_K-matrix, called the factor matrix of the language K, subset of A*. A further step consists in building an automaton over the alphabet A, which we call the universal automaton of K, denoted by U_K, and based on the factorizations and the factors of K:

    U_K = (Q_K, A, E_K, I_K, T_K) , where

    I_K = { p ∈ Q_K | 1_{A*} ∈ L_p } ,   T_K = { q ∈ Q_K | 1_{A*} ∈ R_q }

and

    E_K = { (p, a, q) ∈ Q_K × A × Q_K | a ∈ F_{p,q} } ,

and, obviously, |U_K| = K. What makes U_K universal is expressed in the following result.

Theorem 2 [13] If A = (Q, A, E, I, T) is any automaton that accepts K, then there exists an automaton morphism^h φ from A into U_K, and U_K is minimal for this property. Moreover, if A is minimal^i then φ is injective. ∎

In particular, U_K contains as a subautomaton every minimal automaton (deterministic or non-deterministic) that accepts K. The construction of the universal automaton by means of factorizations has been more or less given in [3] and in [6] (where [1] is also referred to) as well.

Example 3 Let K1 = A*abA* be the language of words that contain at least one factor ab. Easy computations show that K1 has 3 factorizations:

    u = (A*, A*abA*) ,  v = (A*aA*, A*bA*) ,  and  w = (A*abA*, A*) ,

which yields the universal automaton represented at Figure 6.

^h That is, φ is a mapping from the set of states of A into the set of states of U_K such that if (p, a, q) is a transition of A then (pφ, a, qφ) is a transition of U_K, and if p is an initial (resp. final) state of A then pφ is an initial (resp. final) state of U_K.
^i With respect to K: no state of A can be deleted without making |A| smaller, and no pair of states of A can be merged without making |A| larger.


Figure 6. The universal automaton of K1 = A*abA*.

Example 4 Let E3 be the language of A* = {a, b}* consisting of words whose number of a's is not congruent to the number of b's modulo 3:

    E3 = { f | |f|_a ≢ |f|_b mod 3 } .

The factorizations of E3 are best seen on its syntactic monoid Z/3Z, as represented at Figure 7. The universal automaton of E3 is then represented, in two ways, at Figure 8.
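Since factorizations and factors are syntactic objects, the factorizations of E3 can be enumerated by brute force inside the syntactic monoid Z/3Z. The following Python sketch is our own illustration; we restrict the enumeration to pairs with both components non-empty.

```python
from itertools import combinations

G = range(3)                    # the syntactic monoid Z/3Z of E3
P = {1, 2}                      # the image of E3 in Z/3Z

def subsets(s):
    s = list(s)
    return [set(c) for r in range(1, len(s) + 1)
            for c in combinations(s, r)]       # non-empty subsets only

# Pairs (L, R) with L + R contained in P ...
pairs = [(L, R) for L in subsets(G) for R in subsets(G)
         if all((l + r) % 3 in P for l in L for r in R)]
# ... that are maximal for componentwise inclusion: the factorizations.
facts = [(L, R) for (L, R) in pairs
         if not any(L <= L2 and R <= R2 and (L2, R2) != (L, R)
                    for (L2, R2) in pairs)]
```

On this example the enumeration yields the pairs whose first components are the singletons and the two-element subsets of Z/3Z, e.g. ({0}, {1, 2}) and ({1, 2}, {0}); these pairs are the states of the universal automaton of Figure 8.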

Figure 7. The factorizations of Es.

3 McNaughton’s Theorem

With the previous definitions, McNaughton's Theorem on pure-group languages becomes:

Theorem 3 The universal automaton of a pure-group language K contains a subautomaton of minimal loop complexity which recognizes K.

As the universal automaton of a rational language is finite, it is possible to enumerate its subautomata, to keep only those which recognize the language, and to distinguish among them those of minimal loop complexity. Hence:

Corollary 1 [12] The star height of a rational pure-group language is computable.


Figure 8. The universal automaton of E3. On subfigure (a), the transitions labelled by a are represented by solid arrows, the transitions labelled by b by dashed ones. Subfigure (b) is a simplification of the previous one, in preparation of Figure 9, which would be unreadable without these conventions. The arrows labelled by b are removed, for they are opposite to the ones labelled by a, which are drawn thicker inside the two levels. The solid and dashed arrows between the two levels are replaced by dotted arrows that can be considered as labelled by 1_{A*} and that play the role of the two previous ones: for instance, the dotted arrow between 0 and {2,0} represents the solid arrow between 0 and {0,1} and the dashed one between 0 and {1,2} in subfigure (a).

Example 5 (Example 2 continued) The universal automaton of

    F6 = { f | |f|_a − |f|_b ≡ 1, 3, 4 or 5 mod 6 }

is represented at Figure 9. Two of its balls form the automaton shown above, which accepts F6 with minimal loop complexity.

On the other hand, the same theorem yields directly what had been established by Dejean and Schützenberger by means of subtle and sophisticated combinatorial arguments:

Corollary 2 [7] Let W_q be the language of {a, b}* consisting of words whose number of a's is congruent to the number of b's modulo 2^q. Then the star height of W_q is q.

Proof. The syntactic monoid of W_q is the group Z/2^q Z and the image of W_q in this group is the identity. It is an immediate computation that the universal automaton of the identity of a group is the group itself and that (the Cayley graph of) Z/2^q Z has a loop complexity equal to q. ∎

The proof of Theorem 3 follows indeed the original proof by McNaughton. For any automaton B that accepts a language K, and in particular for one


of minimal loop complexity, there exists a morphism from B into U_K. If an (automaton) morphism preserved loop complexity or, at least, did not increase it, the theorem would follow immediately, and not only for group languages but for any language. But this is far from being the case. With that idea in mind, one has to consider morphisms of a special kind.

Figure 9. The universal automaton of F6, with the notation conventions of Figure 8.


3.1 Conformal morphisms

Definition 4 A morphism φ: B → A is said to be conformal^j if any computation in A is the image of (at least) one computation in B.

A morphism is not necessarily conformal, as shown by the example of Figure 10.


Figure 10. A non-conformal morphism (the horizontal map is the morphism).

The notion of conformal morphism allows one to relate the loop complexity of an automaton and that of its image by a morphism.

Theorem 4 [12] If φ: B → A is a conformal morphism, then the loop complexity of B is greater than or equal to the one of A: lc(B) ≥ lc(A).

We first state and prove a lemma.

Lemma 1 Let φ: B → A be a conformal morphism. For every ball P of A, there exists a ball Q of B such that the restriction of φ to Q is a conformal morphism from Q onto P.

Proof. This lemma (as the theorem) is indeed a statement on graphs and not on automata, that is, we can forget the labels on the transitions. But the proof will be simpler if we make use of the notion of automata, that is, of labelled graphs: not with the original labels, but with labels that are convenient for the proof. Every transition of A is considered as having a distinct label, and every state of A is considered as both initial and final.

The words of the language accepted by A (resp. by a subautomaton P of A) characterize the paths in the graph A (resp. in the graph P). The transitions of B are labelled in such a way that φ is a morphism, and every state of B is both initial and final.

^j McNaughton [12] calls them "pathwise", but his definition of morphism is slightly different from ours.


Let P be a ball of A and R = Pφ⁻¹. Let n = ||R|| and m = ||P|| be the numbers of states of R and of P respectively. Let w be a circuit (hence a word) that contains all the paths of P of length smaller than 2^{n+m}. The circuit w^n is a path in P which is lifted into a path in R (as φ is conformal). By the pigeon-hole principle, there exists a k such that a factor w^k is the label of a circuit in R; let Q be the ball in R, and then in B, which contains this circuit. By construction, Q accepts all the words of length smaller than 2^{n+m} of the language accepted by P; Q is thus equivalent^k to P, and then all the paths of P are lifted in Q: the restriction of φ from Q onto P is conformal. ∎

Proof of Theorem 4. By contradiction. Among all automata for which the proposition does not hold, let B be an automaton with minimal loop complexity d, and let c be the loop complexity of A: c > d.

If d = 0, the length of the paths in B is bounded and it is impossible for φ to be conformal; then d > 0.

By definition, there exists a ball P in A of loop complexity c and, by Lemma 1, a ball Q of B whose image by φ is P. This ball is of loop complexity at most d but it is as well, by minimality of d, of loop complexity at least d. There exists then a state q of Q such that

    lc(Q \ {q}) = d − 1 .

Let p = qφ, P' = P \ {p} and Q' = Q \ pφ⁻¹; it holds:

    lc(Q') ≤ lc(Q \ {q}) = d − 1   and   lc(P') ≥ c − 1 > d − 1 .

Any path of P' is a path of P which does not go through p; such a path is the image of a path of Q which does not go through any of the states in pφ⁻¹, that is, the image of a path of Q': φ is a conformal morphism from Q' onto P', a contradiction with the minimality of d. ∎

3.2 Proof of Theorem 3

In the sequel, K ⊆ A* is a pure-group language, α: A* → G is the syntactic morphism, P = Kα, and A_K = (G, A, δ, 1_G, P) is the minimal automaton of K. For w in A* and g in G, we write g · w for g (wα), the multiplication being taken in G.

Even in the case of a pure-group language K, the morphism φ from an automaton B (that accepts K) into the universal automaton U_K is not necessarily conformal. The proof of the theorem boils down to showing that nevertheless φ is conformal on those balls of B that are crucial for the loop complexity.

^k As two automata with n and m states respectively are equivalent if they coincide on all words of length smaller than 2^{n+m}. This is the argument which makes the use of automata instead of graphs powerful.


This goes via two properties of the balls of the universal automaton U_K of a pure-group language K that we establish first.

Lemma 2 The balls of U_K are deterministic and complete.

Proof. It follows from the definition of the universal automaton that if

    (L1, R1) --a--> (L2, R2)

is a transition of U_K, then L1 (aα) R2 ⊆ P, and then both L1 · a ⊆ L2 and (aα) R2 ⊆ R1 hold.

Let (L1, R1) and (L2, R2) be two states of U_K in a same ball. There exist u and v in A* such that L1 · u ⊆ L2 and L2 · v ⊆ L1. As G is a group, the action of every element is injective. Then ||L1|| ≤ ||L2|| ≤ ||L1||, hence ||L1|| = ||L2|| and L1 · u = L2. Which means that L2 is uniquely determined by L1 and u: the ball is deterministic.

On the other hand, if (L, R) is a factorization of P, then (L(uα), (uα)⁻¹R) is a factorization of P as well, for every u in A*, and there exists a transition labelled by u from the first one onto the second one. For every u, there exists v such that (uv)α = 1_G, and then a transition labelled by v from (L(uα), (uα)⁻¹R) onto (L, R). Then (L(uα), (uα)⁻¹R) belongs to the same ball as (L, R) and this ball is complete. ∎

Lemma 3 For every integer k, there exists a word w_k in A* whose image in G is 1_G and such that any computation of length k in any ball C of U_K is a subcomputation of any computation in C labelled by w_k.

Proof. Every word whose image in G is 1_G is the label of a circuit in every ball of U_K, and for every state as starting point. For every ball, and every state of this ball, one can build a circuit that contains all computations of length k in that ball. Let z be the product of the labels of all these circuits. One can choose for w_k a power z^n of z such that its image in G is 1_G. ∎

Proof of Theorem 3. Let B be an automaton of minimal loop complexity that accepts K and n the number of states of B. Let φ be a morphism from B into U_K.

Let g be in P, thus a final state of A_K, and let u_g be a word of A* that is mapped onto g by α. For every integer k, the word (w_k)^n u_g is in K and then is accepted by B. The block star lemma, applied to the factors w_k, yields a state p_k of B which is the starting point of a circuit that is labelled by a certain power of w_k. In other words, a computation with label (w_k)^n u_g can be


factorized as follows:

Let V_k be the ball of B that contains p_k, and thus this circuit. An infinite sequence of balls V_k is obtained in that way, in which at least one ball V appears infinitely often.

Let C be the ball of U_K which contains the image of V by φ. For every path c of C, there exist an integer k larger than the length of c, an integer l and a state p of V such that there exists a circuit of V of origin p and labelled by (w_k)^l. This same word (w_k)^l is the label of a circuit in C that goes through all computations of length smaller than or equal to k; in particular, it contains c itself. Hence c is the image of a computation in V. The ball C is then the image of V by φ and the restriction of φ to V is conformal. By Theorem 4, lc(V) ≥ lc(C) holds.

Let (L, R) be the factorization that is the image of p by φ, where p is the state defined above. As a power of w_k is in Past_B(p), 1_G is in Past_{U_K}((L, R)) and then 1_G is in L, that is, (L, R) is an initial state of U_K. In the same way, (w_k)^l u_g is in Fut_B(p) and g is in R. Every word u of A* such that uα = g is the label of a computation in C that starts from (L, R) (an initial state) and that ends at the state (Lg, g⁻¹R), which is a final state of U_K since 1_G ∈ g⁻¹R; hence u is accepted by C. The ball C is a subautomaton of U_K that accepts a language which contains gα⁻¹ and which is contained in K.

The same construction can be repeated for every g in P, and a set E of balls of U_K is obtained which accepts the whole language K. Every ball in E has a loop complexity that is smaller than or equal to the loop complexity of at least one ball of B. The loop complexity of E is thus at most equal to the loop complexity of B, which was supposed to be minimal. ∎

Acknowledgements

It is a pleasure to thank J. Brzozowski for the fruitful discussions we had during his last visit to Europe.

The second author gratefully thanks D. Wood for his invitation and hospitality in November 1999 at the Hong Kong University of Science and Technology. The friendly atmosphere and the stimulating discussions we had then led to the definition of the EB index.

The second author is also grateful to M. Ito, who invited him to the 3rd International Colloquium on Words, Languages and Combinatorics, and especially to H. Jürgensen, who gave him the opportunity to present this work at


the 2nd workshop on Descriptive Complexity of Automata, Grammars and Related Structures in July 2000, in London (Ontario), with the attendance of R. McNaughton.

References

1. A. Arnold, A. Dicky and M. Nivat, A note about minimal non-deterministic automata. Bull. of E.A.T.C.S. 47 (1992), 166-169.

2. J. A. Brzozowski and E. J. McCluskey, Signal flow graph techniques for sequential circuit state diagrams. IEEE Transactions on Electronic Computers 12 (1963), 67-76.

3. C. Carrez, On the minimalization of non-deterministic automaton. Tech. rep. du laboratoire de calcul de la Faculté des Sciences de l'Université de Lille, 1970.

4. O. Carton, Factorisations et morphismes. Unpublished manuscript.

5. J. H. Conway, Regular algebra and finite machines. Chapman and Hall, 1971.

6. B. Courcelle, D. Niwinski and A. Podelski, A geometrical view of the determinization and minimization of finite-state automata. Math. Systems Theory 24 (1991), 117-146.

7. F. Dejean and M. P. Schützenberger, On a question of Eggan. Inform. and Control 9 (1966), 23-25.

8. L. C. Eggan, Transition graphs and the star-height of regular events. Michigan Mathematical J. 10 (1963), 385-397.

9. S. Eilenberg, Automata, languages and machines, vol. A, Academic Press, 1974.

10. K. Hashiguchi, Algorithms for determining relative star height and star height. Inform. and Computation 78 (1988), 124-169.

11. S. Lombardy and J. Sakarovitch, Star height of reversible languages and universal automata, submitted.

12. R. McNaughton, The loop complexity of pure-group events. Inform. and Control 11 (1967), 167-176.

13. J. Sakarovitch, Éléments de théorie des automates, Vuibert, to appear; English translation to be published by Cambridge University Press.

14. A. Salomaa, Jewels of formal language theory. Computer Science Press, 1981.

15. D. Wood, Theory of computation. Wiley, 1987.

16. S. Yu, Regular languages, in Handbook of Formal Languages, vol. 1 (G. Rozenberg and A. Salomaa, Eds.), Elsevier, 1997.


Some Properties of Hyperoperations and Hyperclones

Hajime Machida

Hitotsubashi University, Kunitachi, Tokyo 186-8601 Japan
Email: machida@math.hit-u.ac.jp

Abstract

A hyperoperation on a finite set A is a mapping from A^n (n > 0) into the set of all non-empty subsets of A, and a hyperclone on A is a (hyper-)composition-closed subset of hyperoperations containing all selectors. In this paper we present the following basic properties of hyperoperations and hyperclones: (i) the existence of a normal form for hyperoperations, (ii) the existence of Sheffer hyperoperations and (iii) the fact that the lattice of all hyperclones on {0, 1} has the cardinality of the continuum.

1 Preliminaries

While the classical clone theory deals with "operations" and "clones", the theory discussed in this paper concerns "hyperoperations" and "hyperclones". Hyperoperations as well as hyperclones were introduced by I. G. Rosenberg [6] in 1996. Some special hyperalgebras such as hypergroups, hyperrings, etc. have been studied for years. Rosenberg's study of hyperclones is an attempt to establish a universal-algebra type theory for hyperalgebras.

In this paper we present some basic properties of hyperoperations and hyperclones.

First of all, we give definitions of some frequently used terms. Let A be a fixed finite set with |A| ≥ 2.

The term "operation" is used with the same meaning as function in the ordinary and traditional sense: thus, for n > 0, an (n-ary) operation f on A is a mapping from A^n into A. We denote by O_A^(n) the set of all n-ary operations on A and put

O_A = ∪_{n>0} O_A^(n).

For 1 ≤ i ≤ n, an operation pr_i^n ∈ O_A^(n) is the i-th n-ary projection if and only if pr_i^n(a_1, ..., a_n) = a_i for every (a_1, ..., a_n) ∈ A^n. We denote by J_A^(n) the set of all n-ary projections and let J_A = ∪_{n>0} J_A^(n). The (functional) composition f[g_1, ..., g_m] of f ∈ O_A^(m) with g_1, ..., g_m ∈ O_A^(n) is defined in the standard way as

f[g_1, ..., g_m](x_1, ..., x_n) = f(g_1(x_1, ..., x_n), ..., g_m(x_1, ..., x_n))

for every (x_1, ..., x_n) ∈ A^n. A clone C on A is a subset of O_A which contains the set J_A and is closed under composition. The set of all clones on A is denoted by L_A. It is well-known that the set L_A is a lattice with respect to the inclusion relation.

Next we define the "hyper"-counterparts of operation, composition and clone.

Let P_A be the set of all non-empty subsets of A. For n > 0, an (n-ary) hyperoperation h on A is a mapping from A^n into P_A. We denote by H_A^(n) the set of all n-ary hyperoperations on A and put

H_A = ∪_{n>0} H_A^(n).

In the theory of hyperclones, trivial hyperoperations called selectors play the role of the projections of the standard clone theory. For 1 ≤ i ≤ n, a hyperoperation e_i^n ∈ H_A^(n) is the i-th n-ary selector if and only if e_i^n(a_1, ..., a_n) = {a_i} for every (a_1, ..., a_n) ∈ A^n. We denote by J_{H,A}^(n) the set of all n-ary selectors and let J_{H,A} = ∪_{n>0} J_{H,A}^(n).

Composition is naturally generalized in the following way: For f ∈ H_A^(m) and g_1, g_2, ..., g_m ∈ H_A^(n), we define the operation h = f[g_1, g_2, ..., g_m] ∈ H_A^(n) by

h(x_1, x_2, ..., x_n) = ∪ { f(z_1, z_2, ..., z_m) | z_i ∈ g_i(x_1, x_2, ..., x_n) for all i = 1, 2, ..., m }

for every (x_1, x_2, ..., x_n) ∈ A^n. The operation h = f[g_1, g_2, ..., g_m] is called the (hyper-)composition of f with g_1, g_2, ..., g_m.
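To make the hyper-composition formula concrete, here is a small Python sketch (illustrative only; representing hyperoperations as functions returning frozensets is our own choice, not the paper's) that implements selectors and the composition rule above:

```python
from itertools import product

def selector(i, n):
    """The i-th n-ary selector e_i^n: maps (a_1, ..., a_n) to {a_i}."""
    return lambda *args: frozenset({args[i - 1]})

def hyper_compose(f, gs):
    """Hyper-composition h = f[g_1, ..., g_m]: the union of
    f(z_1, ..., z_m) over all choices z_i in g_i(x_1, ..., x_n)."""
    def h(*x):
        choices = [g(*x) for g in gs]
        return frozenset().union(*(f(*z) for z in product(*choices)))
    return h

# A sample binary hyperoperation: u(x1, x2) = {x1, x2}
# (it happens to be the union operator introduced in Section 2).
u = lambda x1, x2: frozenset({x1, x2})

h = hyper_compose(u, [selector(1, 2), selector(2, 2)])
print(sorted(h(0, 1)), sorted(h(0, 0)))  # [0, 1] [0]
```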


Now we are ready to define hyperclones. A hyperclone C on A is a subset of H_A which contains J_{H,A} and is closed under (hyper-)composition. The set of all hyperclones on A is denoted by L_{H,A}. It is known that the set L_{H,A} is a lattice with respect to the inclusion relation.

In Section 2 we establish a normal form for hyperoperations. For ordinary operations several types of normal forms are known. We exploit any one of such normal forms for ordinary operations to derive a normal form for hyperoperations. In Section 3 we discuss the problem of whether Sheffer hyperoperations exist. This is a problem posed by B. A. Romov [5]. We explicitly give ternary and quaternary Sheffer hyperoperations and also show the non-existence of binary Sheffer hyperoperations on a two-element set A. Finally, in Section 4 we consider the cardinality of the set of all hyperclones on A where A is a two-element set and show that it has the cardinality of the continuum. This answers Rosenberg's problem affirmatively.

This article is a summary of two papers [1] and [2]: the contents of Sections 2 and 3 with full proofs can be found in [2], and the contents of Section 4 with more details will appear in [1].

2 Normal Form

In this section we construct a normal form for hyperoperations in H_A.

Before we discuss the case of hyperoperations, we shall review normal forms for ordinary operations. When the base set A consists of two elements, e.g., A = {0, 1}, the operations on A are more commonly called Boolean functions, and it is well-known that there are several normal forms for them, such as the conjunctive normal form, the disjunctive normal form and the Galois normal form. For the case where the set A is a finite set with two or more elements, we have, for example, the following normal form ([3]):

f(x_1, ..., x_n) = ∨_{(a_1, ..., a_n) ∈ A^n} ( c_{f(a_1, ..., a_n)}(x_1) ∧ χ_{a_1}(x_1) ∧ ... ∧ χ_{a_n}(x_n) ).

Here, the operations c_a ∈ O_A^(1) and χ_a ∈ O_A^(1) for a ∈ A are defined as

c_a(x) = a,  χ_a(x) = 1 if x = a and χ_a(x) = 0 otherwise,

for every x ∈ A, and the operations ∧ ∈ O_A^(2) and ∨ ∈ O_A^(2) are any operations satisfying the laws

a ∧ 1 = a,  a ∧ 0 = 0  and  0 ∨ a = a ∨ 0 = a

for all a ∈ A.

In order to construct a normal form for hyperoperations we need one particular hyperoperation.

Definition 2.1 Let u ∈ H_A^(2) be defined as

u(x_1, x_2) = {x_1, x_2}

for every (x_1, x_2) ∈ A^2. We call u the union operator.

Definition 2.2 Let k be the number of elements in A, i.e., k = |A| (1 < k < ∞), and let n > 0. Denote by ⪯ the lexicographic order on the set A^n. For a hyperoperation h ∈ H_A^(n), a value vector v of h is a k^n-vector

v = (z_1, ..., z_{k^n})

where z_j ∈ h(a_j), 1 ≤ j ≤ k^n, for the j-th element a_j of A^n with respect to the order ⪯.

For n > 0, let (a_1, a_2, ..., a_{k^n}) be the sequence of all elements of A^n with respect to the order ⪯. For a hyperoperation h ∈ H_A^(n) and a value vector v = (z_1, ..., z_{k^n}) of h, we construct an (ordinary) operation h^v ∈ O_A^(n) in a natural way as h^v(a_j) = z_j for every j (1 ≤ j ≤ k^n).

With these tools, the union operator and value vectors, we can establish a normal form for hyperoperations.

Theorem 2.1 For h ∈ H_A^(n),

h(x_1, ..., x_n) = ∪_{v : value vector} NF(h^v(x_1, ..., x_n))

where h^v is the operation in O_A^(n) derived from h for the value vector v and NF(f) is a normal form of f for f ∈ O_A.

Example. Let A = {0, 1} and let h ∈ H_A^(2) be the hyperoperation satisfying

h(0,0) = {1},  h(0,1) = {0},  h(1,0) = {0,1}  and  h(1,1) = {1}.


There are two value vectors v_1, v_2 of h:

v_1 = (1, 0, 0, 1)  and  v_2 = (1, 0, 1, 1).

The corresponding ordinary operations are h^{v_1} and h^{v_2}, defined as

h^{v_1}(0,0) = 1,  h^{v_1}(0,1) = 0,  h^{v_1}(1,0) = 0,  h^{v_1}(1,1) = 1

and

h^{v_2}(0,0) = 1,  h^{v_2}(0,1) = 0,  h^{v_2}(1,0) = 1,  h^{v_2}(1,1) = 1.

Then a normal form of h is expressed as

h(x_1, x_2) = u(NF(h^{v_1}(x_1, x_2)), NF(h^{v_2}(x_1, x_2))).

As a consequence of Theorem 2.1, we have:

Corollary 2.2 H_A is generated by O_A ∪ {u}, i.e., H_A = [O_A ∪ {u}].
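The value vectors of the example can be enumerated mechanically. The following Python sketch (illustrative only) lists them for the example hyperoperation h:

```python
from itertools import product

# The example hyperoperation h on A = {0, 1}, indexed by the
# inputs of A^2 in lexicographic order.
h = {(0, 0): {1}, (0, 1): {0}, (1, 0): {0, 1}, (1, 1): {1}}

inputs = sorted(h)  # lexicographic order on A^2
# A value vector picks one element of h(a_j) for each input a_j:
value_vectors = [dict(zip(inputs, choice))
                 for choice in product(*(sorted(h[a]) for a in inputs))]

for v in value_vectors:
    print([v[a] for a in inputs])
# Two value vectors: [1, 0, 0, 1] and [1, 0, 1, 1]
```

By construction, the union of the chosen values over all value vectors recovers h(a) at every input a, which is exactly what Theorem 2.1 exploits.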

3 Sheffer Hyperoperations

A hyperoperation h is called Sheffer when all hyperoperations in H_A can be generated by h and selectors through finitely many applications of hyper-composition. In what follows, [h] denotes the hyperclone generated by {h} ∪ J_{H,A}.

Definition 3.1 A hyperoperation h ∈ H_A is a Sheffer hyperoperation if and only if H_A = [h].

In this section we show the existence of Sheffer hyperoperations by actually exhibiting them. Our examples are quaternary and ternary Sheffer hyperoperations. We then claim that binary Sheffer hyperoperations do not exist on a two-element set. This is one of the phenomena where the case of hyperoperations differs from the case of ordinary operations, as it is well-known that binary Sheffer operations exist in O_A^(2). We shall see another phenomenon in Section 4 where the "hyper" case differs from the ordinary case.

We shall adopt the convention of identifying a hyperoperation h ∈ H_A^(n) whose value is always a singleton with an ordinary operation f_h ∈ O_A^(n) in the obvious manner:

h(x_1, ..., x_n) = {f_h(x_1, ..., x_n)}

for every (x_1, ..., x_n) ∈ A^n.

3.1 Existence of Sheffer Hyperoperations

First we show the existence of a quaternary, i.e., 4-variable, Sheffer hyperoperation.

In the case of O_A it is known that the operation w ∈ O_A^(2) defined as

w(x_1, x_2) = 1 + max{x_1, x_2}

is a Sheffer operation, where + is taken modulo k (= |A|). This operation is called the Webb function. When |A| = 2 (the Boolean case), the operation w is identical to NOR(x_1, x_2), which is called the Sheffer function in the narrowest sense.
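For |A| = 2, the Sheffer property of NOR can be confirmed by brute force. The Python sketch below (illustrative only) computes the binary part of the clone generated by NOR and the two projections; since ordinary composition is superassociative, every binary member of the clone arises from binary subterms, and the closure indeed reaches all 16 binary Boolean functions:

```python
# Binary Boolean functions encoded as 4-tuples of values on the
# inputs (0,0), (0,1), (1,0), (1,1).
INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
NOR = tuple(int(not (x or y)) for x, y in INPUTS)   # (1, 0, 0, 0)
P1 = tuple(x for x, y in INPUTS)                    # first projection
P2 = tuple(y for x, y in INPUTS)                    # second projection

def compose(f, g1, g2):
    """(f[g1, g2])(x, y) = f(g1(x, y), g2(x, y))."""
    return tuple(f[2 * a + b] for a, b in zip(g1, g2))

clone = {NOR, P1, P2}
while True:
    new = {compose(f, g1, g2)
           for f in clone for g1 in clone for g2 in clone} - clone
    if not new:
        break
    clone |= new

print(len(clone))  # 16: NOR generates every binary Boolean function
```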

Definition 3.2 Let t ∈ H_A^(4) be defined as follows:

We shall show that this hyperoperation t is a Sheffer hyperoperation in the following way.

Lemma 3.1 The Webb function w is generated by t, i.e., w ∈ [t].

In fact, w is expressed as w = t[e_1^2, e_2^2, e_1^2, e_1^2]. Since w is Sheffer in O_A, Lemma 3.1 immediately implies:

Corollary 3.2 O_A ⊆ [t].

For the union operator u ∈ H_A^(2) defined in the previous section, it is readily verified that u = t[e_1^2, e_2^2, c_0[e_1^2], c_1[e_1^2]], and we have:

Lemma 3.3 The union operator u is generated by t, i.e., u ∈ [t].

Now we establish the main result of this subsection.


Theorem 3.4 The hyperoperation t is a Sheffer hyperoperation, i.e., H_A = [t].

Proof. This is clear from Corollary 2.2, Corollary 3.2 and Lemma 3.3. □

Next, we show that it is possible to modify t to obtain a ternary, i.e., 3-variable, Sheffer hyperoperation.

Definition 3.3 Let s ∈ H_A^(3) be defined as

s(x_1, x_2, x_3) = t(x_1, x_2, x_1, x_3)

for every (x_1, x_2, x_3) ∈ A^3, i.e., s = t[e_1^3, e_2^3, e_1^3, e_3^3].

Theorem 3.5 The hyperoperation s is a Sheffer hyperoperation, i.e., H_A = [s].

Proof. It suffices to show that w ∈ [s] and u ∈ [s]. The former is shown as w = s[e_1^2, e_2^2, e_1^2] and the latter is verified by u = s[e_1^2, e_2^2, b[e_1^2]], where b ∈ H_A^(1) is the hyperoperation defined as b(0) = {1} and b(x) = {0} if x ≠ 0. □

3.2 Nonexistence of a Binary Sheffer Hyperoperation on {0, 1}

We have shown the existence of ternary and quaternary Sheffer hyperoperations on any finite set A. In this subsection we claim the negative result that there does not exist a binary Sheffer hyperoperation on the two-element set A = {0, 1}.

Lemma 3.6 Let h ∈ H_A^(2). Assume that for every a ∈ A one of the following conditions is satisfied.

(i) In the multiplication table of h, the entry {a} appears in at most one row.

(ii) In the multiplication table of h, the entry {a} appears in at most one column.

Then h is not Sheffer.

For the proof of Lemma 3.6, see [2].

Proposition 3.7 Let A = {0, 1}. There does not exist a binary, i.e., 2-variable, Sheffer hyperoperation on A.


Sketch of the Proof. Suppose that h ∈ H_A^(2) is a binary Sheffer hyperoperation on A. First, by a simple argument, we see that h(i, i) = {i} for each i = 0, 1. Then, since at least one value of h must be the non-singleton {0, 1}, we may assume, w.l.o.g., that h(0,1) = {0,1}. Finally, consider the only remaining value h(1,0). For each case h(1,0) = {0}, {1} and {0,1}, we can check with the help of Lemma 3.6 that h cannot generate all hyperoperations on A. Hence h is not a Sheffer hyperoperation. □
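The final case analysis of the sketch can be replayed by machine. The Python sketch below (illustrative only) encodes binary hyperoperations on {0,1} as 4-tuples of bitmasks ({0} → 1, {1} → 2, {0,1} → 3), fixes h(0,0) = {0}, h(1,1) = {1} and h(0,1) = {0,1} as in the proof, and, for the three choices of h(1,0), closes {e_1^2, e_2^2, h} under composition of binary members. This yields only part of the binary fragment of [h] (hyper-composition is not superassociative), but by Proposition 3.7 even the full binary fragment misses some of the 3^4 = 81 binary hyperoperations, so each closure must stay below 81:

```python
# Inputs (x1, x2) indexed as 2*x1 + x2: (0,0), (0,1), (1,0), (1,1).
E1 = (1, 1, 2, 2)   # selector e_1^2: value {x1}
E2 = (1, 2, 1, 2)   # selector e_2^2: value {x2}

def compose(f, g1, g2):
    """Hyper-composition f[g1, g2] on bitmask-encoded hyperoperations."""
    out = []
    for m1, m2 in zip(g1, g2):
        acc = 0
        for z1 in (0, 1):
            if m1 >> z1 & 1:
                for z2 in (0, 1):
                    if m2 >> z2 & 1:
                        acc |= f[2 * z1 + z2]
        out.append(acc)
    return tuple(out)

def binary_closure(h):
    closed = {E1, E2, h}
    while True:
        new = {compose(f, g1, g2)
               for f in closed for g1 in closed for g2 in closed} - closed
        if not new:
            return closed
        closed |= new

# h(0,0) = {0}, h(0,1) = {0,1}, h(1,1) = {1}; h(1,0) ranges over
# {0}, {1}, {0,1}: the three cases of the sketch.
sizes = [len(binary_closure((1, 3, c, 2))) for c in (1, 2, 3)]
print(sizes, all(s < 81 for s in sizes))
```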

4 Rosenberg's Problem

In this section we fix A = {0, 1}. Hence, an (n-ary) hyperoperation h is a mapping from {0,1}^n into {{0}, {1}, {0,1}}.

In 1998, I. G. Rosenberg asked the following question ([7]): Is the lattice of hyperclones on {0,1} of continuum cardinality?

It should be noted that, for the case of ordinary clones, the cardinality of the lattice L_A of all clones on A is countable when |A| = 2 ([4]) and of the continuum when |A| ≥ 3 ([8]).

We answer Rosenberg's problem affirmatively. Thus the situation differs between the lattice of ordinary clones and that of hyperclones for a two-element set A.

The key to the solution of Rosenberg's problem lies in the following sequence of hyperoperations.

Definition 4.1 Let A = {0, 1}. For every n > 0, let h_n ∈ H_A^(n) be the n-variable hyperoperation on {0, 1} defined as follows:

h_n(x_1, ..., x_n) = {1} if x_1 + ... + x_n ≤ 1, and h_n(x_1, ..., x_n) = {0, 1} otherwise.

Let G denote the set of all such hyperoperations: G = { h_n | n > 0 }.

Note that h_1 is a constant hyperoperation: h_1(x_1) = {1} for every x_1 ∈ A.

NOTATION: For any n > 0, set G_n = G - {h_n}.
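A quick machine check of the h_n family; the definition coded here ({1} when at most one argument equals 1, {0,1} otherwise) is a reconstruction consistent with the facts used in the surrounding proofs, so treat it as an assumption:

```python
def h(n):
    """The hyperoperation h_n on {0,1} (reconstructed definition:
    {1} if at most one argument equals 1, {0,1} otherwise)."""
    def hn(*x):
        return frozenset({1}) if sum(x) <= 1 else frozenset({0, 1})
    return hn

# h_1 is the constant hyperoperation with value {1}:
assert h(1)(0) == h(1)(1) == frozenset({1})
# h_n is {1} on the all-zero tuple and on unit tuples ...
assert h(3)(0, 0, 0) == h(3)(1, 0, 0) == frozenset({1})
# ... and {0,1} as soon as two arguments equal 1:
assert h(3)(1, 1, 0) == frozenset({0, 1})
print("properties check out")
```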

The hyperoperation h_n and the set G_n satisfy the following property, which is essential to our discussion.


Lemma 4.1 For every n > 0, h_n ∉ [G_n].

Proof. Suppose that h_n ∈ [G_n]. Then there exist some m ≠ n and g_1, ..., g_m ∈ [G_n] such that

h_n = h_m[g_1, g_2, ..., g_m].   (1)

Case 1: Let m = 1. As noted above, h_1 is a constant hyperoperation and so the right-hand side of the equation (1) is constant. However, m ≠ n implies h_n ≠ h_1, and h_n is not a constant hyperoperation. Thus, the equation (1) does not hold.

Case 2: Let m > 1. We may assume without loss of generality that for any j = 1, 2, ..., m, g_j in (1) is either

(α)  g_j = e^n_{i_j}  (1 ≤ i_j ≤ n)   or
(β)  g_j = h_{l_j}[t_1, ..., t_p]  (l_j > 0, p > 0)

where e^n_{i_j} is a selector defined in Section 1. We shall say that g_j is of type (α) if g_j takes the form of the above (α), and that g_j is of type (β) if g_j takes the form of the above (β).

In the right-hand side of the equation (1), if there are two or more g_j's that are of type (β), it is clear that

h_m[g_1, g_2, ..., g_m](0, 0, ..., 0) = {0, 1}.

On the other hand, h_n(0, 0, ..., 0) = {1} holds, and so the equation (1) does not hold in this case.

Next, suppose that there is only one g_j (call it g_{j_0}) that is of type (β) and all the rest are of type (α). For brevity, we may write the equation (1) as h_n = h_m[e^n_{i_1}, ..., g_{j_0}, ...]. Then, for the tuple (1, 0, 0, ..., 0) ∈ {0, 1}^n, we have

h_n(1, 0, 0, ..., 0) = {1}

and

h_m[e^n_{i_1}, ..., g_{j_0}, ...](1, 0, 0, ..., 0) ⊇ h_m({1}, ..., {1}, ...) = {0, 1}.

This implies that h_n ≠ h_m[g_1, g_2, ..., g_m] in this case.

Finally, suppose that every g_j in the equation (1) is of type (α). Then we have

h_n = h_m[e^n_{i_1}, ..., e^n_{i_m}]  (1 ≤ i_1, ..., i_m ≤ n).


If m > n, there are some p, q satisfying 1 ≤ p < q ≤ m and i_p = i_q. Put i = i_p (= i_q). Then, for the tuple x_i = (0, ..., 0, 1, 0, ..., 0) where the i-th component is 1 and all the rest are 0, we have a contradiction:

h_n(x_i) = {1}  and  h_m[g_1, ..., g_m](x_i) = {0, 1}.

The remaining case where m < n can be handled similarly. We have checked for all possible cases that the equation (1) does not hold, and thus proved that h_n ∉ [G_n]. □

Corollary 4.2 Non-empty subsets of G generate mutually distinct hyperclones.

The proof is immediate from the above lemma.

As the set G is countable, the set of all non-empty subsets of G has the cardinality of the continuum. Therefore, Corollary 4.2 gives the affirmative solution to Rosenberg's problem:

Theorem 4.3 The lattice L_{H,{0,1}} of all hyperclones on {0, 1} has the cardinality of the continuum.

References

[1] Machida, H., Hyperclones on a two-element set, to appear in Multiple-Valued Logic - An International Journal.

[2] Machida, H., Normal form of hyperoperations and the existence of Sheffer hyperoperations, submitted.

[3] Pöschel, R., and Kalužnin, L. A. (1979). Funktionen- und Relationenalgebren, VEB Deutscher Verlag der Wissenschaften, Berlin.

[4] Post, E. L. (1941). The two-valued iterative systems of mathematical logic, Ann. Math. Studies, 5, Princeton Univ. Press.

[5] Romov, B. A. (1998). Hyperclones on a finite set, Multiple-Valued Logic - An International Journal, 3, 285-300.

[6] Rosenberg, I. G. (1996). An algebraic approach to hyperalgebras, Proc. 26th Int. Symp. Multiple-Valued Logic, Santiago de Compostela, IEEE, 203-207.

[7] Rosenberg, I. G. (1998). Multiple-valued hyperstructures, Proc. 28th Int. Symp. Multiple-Valued Logic, Fukuoka, IEEE, 326-333.

[8] Yanov, Yu. I. and Muchnik, A. A. (1959). Existence of k-valued closed classes without a finite basis (Russian), Dokl. Akad. Nauk., 127, 44-46.


Words guaranteeing minimal image

S. W. Margolis    J.-E. Pin    M. V. Volkov*

Abstract

Given a positive integer n and a finite alphabet A, a word w over A is said to guarantee minimal image if, for every homomorphism φ from the free monoid A* over A into the monoid of all transformations of an n-element set, the range of the transformation wφ has the minimum cardinality among the ranges of all transformations of the form vφ where v runs over A*. Although the existence of words guaranteeing minimal image is pretty obvious, the problem of their explicit description is very far from being trivial. Sauer and Stone in 1991 gave a recursive construction for such a word w, but the length of the word resulting from that construction was doubly exponential (as a function of n). We first show that some known results of automata theory immediately lead to an alternative construction which yields a simpler word that guarantees minimal image: it has exponential length; more precisely, its length is O(|A|^{(1/6)(n^3-n)}). Then, using a different approach, we find a word guaranteeing minimal image similar to that of Sauer and Stone but of length O(|A|^{(1/2)(n^2-n)}). On the other hand, we observe that the length of any word guaranteeing minimal image cannot be less than |A|^{n-1}.

Let X be a non-empty set. A transformation of the set X is an arbitrary function f whose domain is X and whose range (denoted by Im(f)) is a non-empty subset of X. The rank rk(f) of the function f is the cardinality of the set Im(f). Transformations of X form a monoid under the usual composition of functions; the monoid is called the full transformation monoid over X and

*This work was initiated when the third-named author was visiting Bar-Ilan University (Ramat Gan, Israel) with the support of the Department of Mathematics and Computer Science, Bar-Ilan University, of the Russian Education Ministry (through its Grant Center at St Petersburg State University, grant EOC-1.C-92) and of the Russian Basic Research Foundation. The work was also partially supported by INTAS through the Network project 991224 "Combinatorial and Geometric Theory of Groups and Semigroups and its Applications to Computer Science", by the Emmy Noether Research Institute for Mathematics and the Minerva Foundation of Germany, by the Excellency Center "Group Theoretic Methods in the study of Algebraic Varieties" of the Israel Science Foundation, and by the NSF.


is denoted by T(X). If the set X is finite with n elements, the monoid T(X) is also denoted by T_n.

Now let A be a finite set called an alphabet. The elements of A are called letters, and strings of letters are called words over A. The number of letters forming a word u is called the length of u and is denoted by ℓ(u). Words over A (including the empty word) form a monoid under the concatenation operation; the monoid is called the free monoid over the alphabet A and is denoted by A*.

Both words over a finite alphabet and transformations of a finite set are classical objects of combinatorics. On the other hand, their interaction is essentially the main subject of the theory of finite automata. One of the aims of the present paper is to demonstrate how certain quite well known facts about finite automata may be utilized to improve some recent combinatorial results concerned with words and transformations. Vice versa, we shall also apply certain purely combinatorial considerations to some questions which, as we intend to show, are rather natural from the automata viewpoint.

The combinatorial results we have in sight group around the notion of a word guaranteeing minimal image introduced by Sauer and Stone in [21]. To describe it, let us first fix a positive integer n (the size of the domain X of our transformations) and a finite alphabet A. Now suppose we have a mapping φ : A → T_n. It extends in a unique way to a homomorphism of the free monoid A* into T_n; we will denote the homomorphism by φ as well. Now, with each word u ∈ A*, we associate the transformation uφ. A word w ∈ A* is said to guarantee minimal image if the inequality

rk(wφ) ≤ rk(uφ)    (1)

holds for every word u ∈ A* and for every mapping φ : A → T_n.

Clearly, words guaranteeing minimal image exist [20, Proposition 2.3]. Indeed, for each mapping φ : A → T_n, there is a word w_φ such that

rk(w_φ φ) ≤ rk(uφ)    (2)

for all u ∈ A*. Since there are only finitely many mappings between the finite sets A and T_n and since the composition of transformations cannot increase the size of its image, we can concatenate all the words w_φ, getting an (apparently very long) word w satisfying (1).
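The objects in this argument are easy to experiment with. The following Python sketch (illustrative only; the alphabet, n, and the mapping φ are made-up examples) represents a transformation of {0, ..., n-1} as a tuple, extends φ to words, and finds by breadth-first search over the finite transition monoid a shortest word of minimal rank, i.e., a word w_φ as in (2):

```python
from collections import deque

def apply_word(word, phi, n):
    """Image of a word under the extension of phi: x -> x(w phi)."""
    t = tuple(range(n))                 # identity transformation
    for letter in word:
        f = phi[letter]
        t = tuple(f[x] for x in t)      # apply letters left to right
    return t

def rk(t):
    return len(set(t))

def minimal_rank_word(phi, n):
    """BFS over reachable transformations: shortest word of minimal rank."""
    best_word, best_rank = "", n
    seen = {tuple(range(n))}
    queue = deque([""])
    while queue:
        word = queue.popleft()
        for letter in phi:
            t = apply_word(word + letter, phi, n)
            if t not in seen:
                seen.add(t)
                queue.append(word + letter)
                if rk(t) < best_rank:
                    best_word, best_rank = word + letter, rk(t)
    return best_word, best_rank

# An assumed example with n = 3 and A = {a, b}:
phi = {"a": (1, 1, 2),    # 'a' collapses states 0 and 1
       "b": (1, 2, 0)}    # 'b' is a cyclic permutation
w, r = minimal_rank_word(phi, 3)
print(w, r)  # abba 1
```

Here "abba" first sends all three states through the collapsing letter, rotates the surviving pair back into the collapsed positions, and collapses again, reaching rank 1.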

Words guaranteeing minimal image have been proved to have some interesting algebraic applications. In [20] they were used to find identities in the full transformation monoids. Recently these words have been applied to studying the structure of the free profinite semigroup, see [2]. Of course, for application purposes, the pure existence statement is not sufficient, and one seeks an explicit construction.


The only construction of words guaranteeing minimal image known so far was due to Sauer and Stone [21, Corollary 3.5]. The construction makes an elegant use of recursion but results in very long words, such that, even over a two-element alphabet, it is hardly possible to write down the Sauer-Stone word that guarantees minimal image, say, in T_5.

To build a word guaranteeing minimal image in T_n, Sauer and Stone make use of an intermediate notion which is also of independent interest. Given a transformation f of a finite set X, we denote by df(f) its deficiency, that is, the difference |X| - rk(f). For a homomorphism φ : A* → T(X), we denote by df(φ) the maximum of the deficiencies df(vφ) where v runs over A*; in other words, df(φ) = df(w_φ φ) where w_φ is any word satisfying (2). Now we say that a word w ∈ A* witnesses for deficiency k (has property A_k in Sauer and Stone's terminology), provided that, for all homomorphisms φ : A* → T(X) where X is a finite set, df(wφ) ≥ k whenever df(φ) ≥ k. The following easy observation explains how the two properties under consideration relate:

Lemma 1. If a word w witnesses for deficiency k for all 0 ≤ k < n, then it guarantees minimal image in T_n.

Proof. Take an arbitrary homomorphism φ : A* → T_n and apply it to an arbitrary word v ∈ A*, thus obtaining a transformation vφ ∈ T_n. Suppose that rk(vφ) = r. Then 1 ≤ r ≤ n and

df(φ) ≥ df(vφ) = n - r,

whence df(wφ) ≥ n - r as w witnesses for deficiency n - r. Therefore

rk(wφ) = n - df(wφ) ≤ n - (n - r) = r = rk(vφ),

as the definition of a word guaranteeing minimal image requires. □

Since the cardinality of the set X is not fixed in the definition of a word which witnesses for deficiency k, it is not obvious that such a word should exist for every k. However, it is clear that if A = {a_1, ..., a_t}, then the product w_1 = a_1 ... a_t witnesses for deficiency 1. (Indeed, if df(φ) ≥ 1, then at least one of the letters a_1, ..., a_t should be evaluated at a transformation which is not a permutation, whence w_1 φ is not a permutation as well.) Using this observation as the induction basis, Sauer and Stone then proceed by defining

w_{k+1} = ∏_{v ∈ Q_k} w_k v    (3)

where Q_k denotes the set of all words v over A such that ℓ(v) ≤ 1 + 3·2^{k-2}. Their main results say that, for each k, the word w_k witnesses for deficiency


k [21, Theorem 3.3] and, given any n > 1, the word w_{n-1} guarantees minimal image in T_n [21, Corollary 3.5].

Using (3), it is rather easy to see that the growth of ℓ(w_k) as a function of k is doubly exponential; more precisely, it can be calculated that the leading monomial in the expansion of ℓ(w_k) as a polynomial of t (the size of the alphabet) equals t^{3·2^{k-2}+k-2} for all k ≥ 2. The reader may verify that applying that construction to produce a word over a 2-letter alphabet guaranteeing minimal image in T_5 results in a word of length 216 248; thus, we were not exaggerating when we said that it would be rather hard to write down this word! Sauer and Stone formulate in [21] the following open problem: for a given alphabet with t letters, determine for each positive integer k the length p_k(t) of the shortest word that witnesses for deficiency k. Obviously p_1(t) = t for any t; besides that, the only value of the function p_k(t) known so far is p_2(2) = 8: it is shown in [21, Corollary 3.4] that the word aba^2b^2ab witnesses for deficiency 2, and it can be checked that no shorter word does the job. We notice that the word over {a, b} with the same property obtained via (3) is much longer: its length is 24. This gap is large enough to suggest that there should be more economic constructions than (3). We are going to present two approaches to such constructions.

Our first approach is based on certain developments in finite automata theory which arose from numerous attempts to resolve a (still open) problem by Černý [4] on synchronizing automata. A finite automaton A may be thought of as a triple (X, A, φ) where X is a finite set (called the state set of A), A is another finite set (called the alphabet of A), and φ is a mapping which assigns a transformation of the set X to each letter a ∈ A. As above, φ extends to a homomorphism of the free monoid A* into T(X), so one may speak about words over A acting on the state set X via φ. With this convention, a synchronizing automaton is one such that there exists a word w ∈ A* whose action resets the automaton, that is, brings all its states to a particular one: x(wφ) = x'(wφ) for all x, x' ∈ X. Any word w with this property is said to be a reset word for the automaton. It is rather natural to ask how long such a word may be. We refer to the question of determining the length of the shortest reset word as the Černý problem. Černý conjectured in [4], that is, almost 40 years ago, that for any synchronizing automaton with n states there exists a reset word of length (n - 1)^2. Although confirmed in some special cases (cf. [5, 16, 9, 8, 11, 14], to mention a few of the most representative papers only), this conjecture still constitutes an open problem.

The second-named author has extended Černý's problem in the following way (see [17, 18]). Suppose that in the automaton A = (X, A, φ), the deficiency of φ is no less than k, where 1 ≤ k < |X|. Then the problem (which we shall refer to as the generalized Černý problem) is to determine the length


of the shortest word w ∈ A* verifying df(wφ) ≥ k. Clearly, the initial Černý problem corresponds to the case k = |X| - 1. (The second-named author also generalized the Černý conjecture in the following natural way: if df(φ) ≥ k, then there exists a word w ∈ A+ of length k^2 for which df(wφ) ≥ k. In [17, 18] he proved this generalized conjecture for k ≤ 3, but recently J. Kari [13] exhibited a counterexample in the case k = 4.)

A comparison between the generalized Černý problem and the aforementioned problem of determining the shortest word witnessing for deficiency k immediately reveals an obvious similarity between them. In fact, the only difference between the two situations in question is that in the former case we look for the shortest rank-decreasing word for a given homomorphism of deficiency ≥ k, while in the latter case we are interested in a word with the same properties but with respect to an arbitrary homomorphism of deficiency ≥ k. In the language of automata theory, we may alternatively describe this difference by saying that in the second situation we also look for the shortest word decreasing rank by k for an automaton, but in contrast with the generalized Černý problem situation, the automaton is a black box about which we only know that it admits a word of deficiency k. If we think of a real computational device as a composite made from many finite automata, each with a relatively small number of states, a reasonable construction for an input signal which would simultaneously reset all those automata, and which could be generated without analyzing the structure of each particular component of the device, might be of some practical interest.

As far as theoretical aspects are concerned, the connection just discussed leads to the following conclusion:

Theorem 2. For each k ≥ 3 and for each finite alphabet A, there exists a word over A of length |A|^{(1/6)k(k+1)(k+2)-1} + (1/6)k(k+1)(k+2) - 2 that witnesses for deficiency k.

Proof. We utilize a result by the second-named author [19]. This result, which is based on a combinatorial theorem by Frankl [10], yields the best approximation to the size of the shortest reset word known so far:

Proposition 3. Suppose that the automaton (X, A, φ) is such that the deficiency of the mapping φ is no less than k, where 3 ≤ k < |X|. Then there exists a word w ∈ A* of length (1/6)k(k+1)(k+2) - 1 verifying df(wφ) ≥ k. □

For brevity, let m = (1/6)k(k+1)(k+2) - 1. By a well known result of de Bruijn [7], there is a cyclic sequence over A, of length |A|^m, such that each word over A of length m appears as a factor of the sequence. Cut this cycle in an arbitrary place and make it a word u of the same length |A|^m. Since our cut goes through exactly m - 1 factors of length m, the word u still contains all but m - 1 words of length m as factors. Now let v be the prefix of u of length m - 1 and let w = uv. Note that the word w has length |A|^m + m - 1. Clearly, this procedure restores all those factors of length m that we destroyed by cutting the initial de Bruijn sequence, and therefore each word over A of length m appears as a factor in w. We note that there is an efficient procedure that, given A and m, builds de Bruijn sequences so, if necessary, the word w may be explicitly written.
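The cut-and-restore construction just described is easy to program. The sketch below (illustrative only; it uses the standard FKM algorithm for de Bruijn sequences, one possible choice of the "efficient procedure" mentioned above) builds w for the toy parameters |A| = 2 and m = 3:

```python
from itertools import product

def de_bruijn(alphabet, m):
    """FKM (Lyndon-word) algorithm: a linearization of a cyclic
    sequence containing every length-m word exactly once."""
    k = len(alphabet)
    a = [0] * (k * m)
    seq = []
    def db(t, p):
        if t > m:
            if m % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)
    db(1, 1)
    return ''.join(alphabet[i] for i in seq)

A, m = 'ab', 3
u = de_bruijn(A, m)     # |A|^m = 8 letters: the cut cycle
w = u + u[:m - 1]       # appending the length-(m-1) prefix restores
                        # the factors destroyed by the cut
print(len(w) == len(A)**m + m - 1)
print(all(''.join(p) in w for p in product(A, repeat=m)))
```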

By Proposition 3, for any finite set X and for any homomorphism φ : A* → T(X) with df(φ) ≥ k, there exists a word w_φ ∈ A* of length m such that df(w_φ φ) ≥ k. By the above construction of the word w, the word w_φ must appear as a factor in w, so df(wφ) ≥ k as well, and thus w witnesses for deficiency k. □

It should be mentioned that the natural idea used in the above proof (of "gluing together" individual reset words in order to produce a "universal" reset word) first appeared in a paper by Ito and Duske, cf. [12, Theorem 3.1].

Corollary 4. Over each finite alphabet A and for each n > 3, there exists a word of length |A|^((n^3 - n)/6 - 1) + (n^3 - n)/6 - 2 that guarantees minimal image in T_n.

Proof. As in the proof of Theorem 2, we construct a word w of length |A|^((n^3 - n)/6 - 1) + (n^3 - n)/6 - 2 that has every word of length (n^3 - n)/6 - 1 as a factor. Then of course w also has every word of length k(k+1)(k+2)/6 - 1, 1 ≤ k < n, as a factor and, as such, witnesses for deficiency k for all 1 ≤ k < n by Proposition 3. We may also assume that w witnesses for deficiency 0, as every word does so. The corollary now immediately follows from Lemma 1. □

Obviously, the constructions to which Theorem 2 and Corollary 4 refer are asymptotically (that is, for sufficiently large values of k and, respectively, n) more economical than the Sauer-Stone construction. Still, the length of the resulting words is exponential as a function of k. Can we do essentially better by finding some words of polynomial length doing the same job? The following result answers this question in the negative:

Theorem 5. Any word over a finite alphabet A guaranteeing minimal image in T_n contains every word over A of length n - 1 as a factor and has length at least |A|^(n-1) + n - 2.



Proof. We recall the construction of the minimal automaton of a language of the form A*wA*, where w ∈ A*. This construction can be readily obtained from the well-known construction of the minimal automaton of A*w, which is used, for instance, in pattern matching algorithms (implicitly in [15], and explicitly in [1, 3, 6]).

Given two words u and v of A*, we denote by overlap(u, v) the longest word z ∈ A* such that u = u'z and v = zv' for some u', v' ∈ A*. In other terms, overlap(u, v) is the longest suffix of u which is at the same time a prefix of v.

Figure 1: z = overlap(u, v)

Now given a word w = a_1 ... a_m ∈ A*, the minimal automaton of A*wA* is A(w) = (X, A, φ), with the set of states X = {a_1 ... a_i | 0 ≤ i ≤ m}, that is, the set of all prefixes of the word w, and the function φ : A → T(X) defined as follows: for all a ∈ A,

a_1 ... a_m (aφ) = a_1 ... a_m,   (4)
a_1 ... a_i (aφ) = overlap(a_1 ... a_i a, w)  for 0 ≤ i < m.   (5)

The initial state is the empty word, and the unique final state is the word w.
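The construction of A(w) is straightforward to implement. The sketch below (function names are ours) builds the transition function via (4) and (5) and illustrates Lemma 6 on a tiny example.

```python
def overlap(u, v):
    """Longest word that is simultaneously a suffix of u and a prefix of v."""
    for k in range(min(len(u), len(v)), -1, -1):
        if u[len(u) - k:] == v[:k]:
            return v[:k]
    return ""

def automaton(w, alphabet):
    """Minimal automaton of A*wA*: the states are the prefixes of w,
    the transitions follow (4) and (5)."""
    states = [w[:i] for i in range(len(w) + 1)]
    delta = {}
    for x in states:
        for a in alphabet:
            delta[x, a] = w if x == w else overlap(x + a, w)
    return states, delta

def act(delta, x, u):
    """State reached from x after reading the word u."""
    for a in u:
        x = delta[x, a]
    return x

states, delta = automaton("ab", "ab")
# Any word containing w = "ab" as a factor resets the automaton:
assert {act(delta, x, "aab") for x in states} == {"ab"}
# A word avoiding "ab" as a factor does not:
assert len({act(delta, x, "ba") for x in states}) > 1
```

This matches Lemma 6 below: a word is a reset word for A(w) exactly when it contains w as a factor.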

Lemma 6. The automaton A(w) is synchronizing, and u ∈ A* is a reset word for A(w) if and only if the word w is a factor of u.

Proof. Since the final state is stabilized by each letter, a reset word u of A(w) necessarily sends every state to the final state. In particular, it sends the initial state to the final state, and thus is accepted by A(w). It follows that w is a factor of u.

Conversely, if w is a factor of u, and x is a state, then w is a factor of xu. It follows that the word xu is accepted by A(w), whence x(uφ) = w. Thus u is a reset word. □

Now take an arbitrary word v ∈ A* of length n - 1 and consider the automaton A(v) = (X, A, φ). By Lemma 6, the mapping φ : A → T(X) = T_n verifies rk(vφ) = 1. By definition, any word w ∈ A* that guarantees minimal image in T_n should satisfy rk(wφ) ≤ rk(vφ), whence rk(wφ) = 1. Thus, w should be a reset word for the automaton A(v). By Lemma 6, w then has the word v as a factor.


Since there are |A|^(n-1) different words over A of length n - 1 and since a word of length m ≥ n - 1 has m - n + 2 factors of length n - 1, any word over A containing every word over A of length n - 1 as a factor has length at least |A|^(n-1) + n - 2. (This is, in fact, an exact bound; see the reasoning with the DeBruijn sequences in the proof of Theorem 2.) □

Another natural question concerns the behavior of the constructions for small values of k and for small sizes of the alphabet A. Here the Sauer-Stone construction is often better, as the following table shows. In the table, t denotes the size of the alphabet A, and we omit some of the summands in the second column to fit onto the page.

Table 1: The Sauer-Stone construction vs. Theorem 2

 k | The length of the word from the Sauer-Stone construction      | Theorem 2
 3 | t^7 + 4t^6 + 6t^5 + 10t^4 + 9t^3 + 7t^2 + 3t                 | t^9 + 8
 4 | t^14 + 5t^13 + 11t^12 + 21t^11 + 30t^10 + 37t^9 + ... + 4t   | t^19 + 18
 5 | t^27 + 6t^26 + 17t^25 + 38t^24 + 68t^23 + 105t^22 + ... + 5t | t^34 + 33
 6 | t^52 + 7t^51 + 24t^50 + 62t^49 + 130t^48 + ... + 6t          | t^55 + 54
 7 | t^101 + 8t^100 + 32t^99 + 94t^98 + 224t^97 + ... + 7t        | t^83 + 82

Using the values collected in this table, one can easily calculate that, for any t > 2, the Sauer-Stone construction produces shorter words than the construction based on Proposition 3 for k = 3, 4, 5, 6. The case t = 2 deserves some special attention. Here the following table, in which all words are meant to be over a two-letter alphabet, collects the necessary information:

Table 2: The case of a two-letter alphabet

 k | The length of the word from the Sauer-Stone construction | Theorem 2
 3 | 842                     | 520
 4 | 216 248                 | 524 306
 5 | 3 542 987 594           | 17 179 869 217
 6 | 237 765 870 667 058 360 | 36 028 797 018 964 022


We see that, for k = 4, 5, the Sauer-Stone construction over a two-letter alphabet is more economical than the one arising from Theorem 2. Moreover, we recall that Sauer and Stone have found a word of length 8 that witnesses for deficiency 2. Though this is not explicitly mentioned in [21], it is pretty obvious that starting a recursion analogous to (3) with that word, one obtains a sequence of words over a two-letter alphabet such that the (k - 1)th member of the sequence witnesses for deficiency k for each k ≥ 2 and is shorter than the word w_k arising from (3). A straightforward calculation shows that this produces a word of length 346 witnessing for deficiency 3, a word of length 89 768 witnessing for deficiency 4, a word of length 1 470 865 754 witnessing for deficiency 5, a word of length 98 708 129 987 190 440 witnessing for deficiency 6, etc. Comparing the data in Table 2 with these figures, we observe that the Sauer-Stone construction modified this way yields shorter words than the construction of Theorem 2 for k = 3, 4, 5.

Yet, having in mind the benchmark we mentioned above, that is, producing, over a two-letter alphabet, a word of reasonable size that guarantees minimal image in T_5, we cannot be satisfied with a word of length 89 768. A more important motivation for further efforts is provided by the crucial question of whether any "simultaneous" Černý word which resets all synchronizing automata with n states must indeed consist of all "individual" Černý words (one for each synchronizing automaton) somehow put together. We shall answer this question by exhibiting a better construction than the one which we got from the automata-theoretical approach. The behavior of this construction for small deficiencies/alphabet sizes will also be better than that of any of the constructions above.

Given a transformation f : X → X, we denote by Ker(f) its kernel, that is, the partition of the set X into rk(f) classes such that x, y ∈ X belong to the same class of the partition if and only if xf = yf. By a cross-section of a partition π of X we mean any subset of X having a singleton intersection with each π-class. We need an obvious and well-known lemma:

Lemma 7. Let f, g : X → X be two transformations of rank r. Then the product fg has rank r if and only if Im(f) is a cross-section of Ker(g). □
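Lemma 7 is easy to verify experimentally; the following minimal sketch (the example transformations are our own) checks both directions on a five-element set.

```python
def kernel(f, X):
    # Ker(f): partition of X into classes of points with equal image under f
    classes = {}
    for x in X:
        classes.setdefault(f(x), set()).add(x)
    return list(classes.values())

def is_cross_section(C, partition):
    # C meets every class of the partition in exactly one point
    return all(len(C & block) == 1 for block in partition)

X = range(5)
g = lambda x: [1, 1, 1, 4, 4][x]                  # rank 2
ker_g = kernel(g, X)                              # classes {0,1,2} and {3,4}
for table in ([0, 0, 3, 3, 3], [0, 0, 1, 1, 1]):  # two maps f of rank 2
    f = lambda x, t=table: t[x]
    im_f = {f(x) for x in X}
    rank_fg = len({g(f(x)) for x in X})
    # Lemma 7: rk(fg) = 2 exactly when Im(f) is a cross-section of Ker(g)
    assert (rank_fg == 2) == is_cross_section(im_f, ker_g)
```

The first map has image {0, 3}, a cross-section of Ker(g), and fg keeps rank 2; the second has image {0, 1} inside a single class, and fg collapses to rank 1.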

Let φ : A* → T(X) be a homomorphism and w ∈ A* a word with rk(wφ) = r. Suppose that there exists a word v ∈ A* such that rk(wvwφ) < r, and let v = a_1 a_2 ... a_m be a shortest word with this property. Setting, for 0 ≤ i ≤ m,

π_i = Ker((a_{m-i+1} ... a_m w)φ),   C_i = Im((w a_1 ... a_i)φ),

we have the following proposition:


Proposition 8.
(1) π_0, π_1, ..., π_{m-1} are pairwise distinct partitions of X into r parts.
(2) C_0, C_1, ..., C_{m-1} are pairwise distinct subsets of X of cardinality r.
(3) If i + j < m, then C_i is a cross-section of π_j.
(4) If i + j = m, then C_i is not a cross-section of π_j.

Proof. Let i < m. If π_i has fewer than r classes, then

rk((w a_{m-i+1} ... a_m w)φ) < r,

a contradiction with the choice of v. Similarly, the set C_i should consist of r elements. Thus, both (w a_1 ... a_i)φ, for 0 ≤ i ≤ m - 1, and (a_{j+1} ... a_m w)φ, for 1 ≤ j ≤ m, are transformations of rank r. If i < j and the set C_i is not a cross-section of the partition π_{m-j}, then by Lemma 7, the product

(w a_1 ... a_i)φ (a_{j+1} ... a_m w)φ = (w a_1 ... a_i a_{j+1} ... a_m w)φ

has rank < r, again a contradiction with the choice of v. Furthermore, by the same lemma, C_i cannot be a cross-section of π_{m-i} since rk(wvwφ) < r. In particular, if i < j, the set C_{m-j} is a cross-section of π_i, but not of π_j. It follows that the partitions π_i and π_j are different provided that i ≠ j. Similarly, all the sets C_i, for 0 ≤ i ≤ m - 1, are different. □

It is Proposition 8 that allows us to improve the Sauer-Stone construction. If we mimic the strategy of [21] and want to create a sequence of words witnessing for deficiency k by induction on k, then on each step we may assume that we have some word w of deficiency k and we seek a bound on the length of the shortest word v verifying df(wvwφ) > k for a given evaluation φ of deficiency > k. Proposition 8 shows that the length of such a minimal word is tightly related to the size of a specific combinatorial configuration involving subsets and partitions of an n-element set. According to a well-known method in combinatorics, we now convert this combinatorial problem into a problem of linear algebra.

Let X = {1, ..., n}. We identify each subset C ⊆ X with its characteristic vector (c_1, ..., c_n) in R^n, defined by

c_i = 1 if i ∈ C, and c_i = 0 otherwise.

The notation |C|, originally used to denote the number of elements of C, extends naturally to a linear form on R^n defined by |C| = c_1 + ... + c_n.


Finally, if C, D ⊆ X, then denoting by C · D the scalar product Σ c_i d_i, we observe that

C · D = |C ∩ D|.

It follows that a subset C of X is a cross-section of the partition {D_1, ..., D_r} if and only if C · D_s = 1 for all s = 1, ..., r.
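In this vector language the cross-section criterion becomes a handful of dot products. A small sketch (the sets below are our own examples):

```python
def char_vec(C, n):
    # characteristic vector of C as a subset of {1, ..., n}
    return [1 if i in C else 0 for i in range(1, n + 1)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

n = 5
partition = [{1, 2}, {3}, {4, 5}]                 # the classes D_1, D_2, D_3
C, D = {2, 3, 5}, {1, 2}
assert dot(char_vec(C, n), char_vec(D, n)) == len(C & D)   # C . D = |C n D|
# C is a cross-section: every dot product with a class equals 1
assert all(dot(char_vec(C, n), char_vec(Ds, n)) == 1 for Ds in partition)
# {1, 2, 3} is not: it meets the class {1, 2} twice
assert dot(char_vec({1, 2, 3}, n), char_vec({1, 2}, n)) == 2
```

This is exactly the test used, in linear-algebraic form, in the proof of Proposition 9 below.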

With this notation in hand, we can prove the following bound for the size of the combinatorial configuration arising in Proposition 8:

Proposition 9. If the partitions π_0, π_1, ..., π_{m-1} and the subsets C_0, C_1, ..., C_{m-1} of an n-element set satisfy the conditions (1)-(4) of Proposition 8, then m ≤ n - r + 1.

Proof. We first prove that the vectors C_0, C_1, ..., C_{m-1} are linearly independent. Otherwise, one of the C_j's is a linear combination of the preceding vectors C_0, C_1, ..., C_{j-1}, say

C_j = Σ_{0 ≤ i ≤ j-1} λ_i C_i.

It follows, since the map C ↦ |C| is linear, that

r = Σ_{0 ≤ i ≤ j-1} λ_i |C_i| = r Σ_{0 ≤ i ≤ j-1} λ_i,

whence Σ_{0 ≤ i ≤ j-1} λ_i = 1. Consider the partition π_{m-j} = {D_1, D_2, ..., D_r}. Since each of the sets C_0, C_1, ..., C_{j-1} is a cross-section of this partition, we obtain, for each s = 1, ..., r,

C_j · D_s = (Σ_{0 ≤ i ≤ j-1} λ_i C_i) · D_s = Σ_{0 ≤ i ≤ j-1} λ_i (C_i · D_s) = Σ_{0 ≤ i ≤ j-1} λ_i = 1,

whence C_j also is a cross-section of π_{m-j}, a contradiction.

Now let π_0 = {B_1, B_2, ..., B_r}. Since the B_i's are pairwise disjoint and non-empty, their characteristic vectors are linearly independent. Furthermore, since C_0, C_1, ..., C_{m-1} are cross-sections of π_0, the relation C_i · B_s = 1 holds for 0 ≤ i ≤ m - 1 and 1 ≤ s ≤ r. It follows in particular that

C_i · (B_s - B_t) = 0  for 1 ≤ s, t ≤ r.   (6)

Now, the vectors B_s - B_t, for 1 ≤ s, t ≤ r, generate a vector space of dimension r - 1, and the relation (6) shows that each C_i is orthogonal to this space. It follows that the rank of the family {C_i}_{0 ≤ i ≤ m-1} is at most n - r + 1, whence m ≤ n - r + 1. □


It is easy to see that the bound of Proposition 9 is exact. Applying Proposition 9 to the situation of Proposition 8 yields

Corollary 10. Let k be a positive integer and φ : A* → T(X) a homomorphism of deficiency > k. Then for any word w ∈ A* with df(wφ) = k, there exists a word v of length ≤ k + 1 such that df(wvwφ) > k.

Now suppose that A = {a_1, ..., a_t}. Let u_1 = a_1 ... a_t and define, inductively,

u_{k+1} = u_k ∏_{ℓ(v) ≤ k+1} (v u_k),   (7)

where the product runs over all words v ∈ A+ of length at most k + 1.

Theorem 11. For any positive integer k, the word u_k defined via (7) witnesses for deficiency k.

Proof. By induction on k. The case k = 1 is obvious. Suppose that u_k witnesses for deficiency k, and take any homomorphism φ : A* → T(X) of deficiency > k. We are to verify that df(u_{k+1}φ) > k. If already df(u_k φ) > k, we have nothing to prove. If df(u_k φ) = k, then by Corollary 10 there exists a word v of length ≤ k + 1 such that df(u_k v u_k φ) > k. Since by (7) the word u_k v u_k appears as a factor in u_{k+1}, we also have df(u_{k+1}φ) > k, as required. □

From Theorem 11 and Lemma 1 we obtain

Corollary 12. For each n > 1, the word u_{n-1} guarantees minimal image in T_n. □

A comparison between the definitions (3) and (7) shows that the word u_k is shorter than the Sauer-Stone word w_k (on the same alphabet) for each k ≥ 3. In fact, the leading monomial in the expansion of ℓ(u_k) as a polynomial of t = |A| equals t^(k(k+1)/2); this means that asymptotically the construction (7) is better than the construction from Theorem 2. Moreover, we see that the shortest word in A* that resets all synchronizing automata with a fixed number of states and with the input alphabet A need not consist of all shortest "individual" reset words somehow put together.

The following table exhibits some data about the size of words arising from (7) for small k and/or t. The data in the last column refer to a slight modification of the construction in the case when the alphabet consists of two letters; the modification is similar to the modification of the Sauer-Stone construction discussed above. Namely, we can make the word aba^2b^2ab play the role of u_2 and proceed by (7) for k ≥ 3.
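The lengths in Table 3 obey a simple recurrence: passing from u_k to u_{k+1} appends, for each of the N = t + t^2 + ... + t^(k+1) words v with 1 ≤ ℓ(v) ≤ k + 1, the block v u_k, so ℓ(u_{k+1}) = (N + 1) ℓ(u_k) + S, where S = Σ_{i=1}^{k+1} i t^i is the total length of those words. A short sketch of ours reproducing the two-letter columns of the table:

```python
def next_length(ell, k, t):
    # length of u_{k+1} from ell = length of u_k, by the recursion (7)
    n_words = sum(t ** i for i in range(1, k + 2))          # N
    total_len = sum(i * t ** i for i in range(1, k + 2))    # S
    return (n_words + 1) * ell + total_len

def lengths(t, k_max, start_k=1, start_len=None):
    # table of ell(u_k) for start_k <= k <= k_max
    ell = t if start_len is None else start_len
    table = {start_k: ell}
    for k in range(start_k, k_max):
        ell = next_length(ell, k, t)
        table[k + 1] = ell
    return table

std = lengths(2, 5)                          # standard start u_1 = ab
mod = lengths(2, 5, start_k=2, start_len=8)  # modified start u_2 = aba^2b^2ab
assert [std[k] for k in (1, 2, 3, 4, 5)] == [2, 24, 394, 12312, 775914]
assert [mod[k] for k in (2, 3, 4, 5)] == [8, 154, 4872, 307194]
```

The computed values agree with the |A| = 2 columns of Table 3.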

Table 3: The length of the words defined via (7)

 k | as a polynomial of t = |A|                              | |A| = 2        | |A| = 2, u_2 = aba^2b^2ab
 1 | t                                                       | 2              | -
 2 | t^3 + 3t^2 + 2t                                         | 24             | 8
 3 | t^6 + 4t^5 + 6t^4 + 9t^3 + 7t^2 + 3t                    | 394            | 154
 4 | t^10 + 5t^9 + 11t^8 + 20t^7 + 27t^6 + 29t^5 + ... + 4t  | 12 312         | 4 872
 5 | t^15 + 6t^14 + 17t^13 + 37t^12 + 64t^11 + ... + 5t      | 775 914        | 307 194
 6 | t^21 + 7t^20 + 24t^19 + 61t^18 + 125t^17 + ... + 6t     | 98 541 720     | 39 014 280
 7 | t^28 + 8t^27 + 32t^26 + 93t^25 + 218t^24 + ... + 7t     | 25 128 140 138 | 9 948 642 938

Viewing the data in Table 3 against the corresponding data in Tables 1 and 2 shows that the gain provided by the new construction is quite large

even for small deficiencies and alphabet sizes. As for our "benchmark", that is, a word over a two-letter alphabet that guarantees minimal image in T_5, Table 3 indicates that there is such a word of length 4 872. Though still too lengthy to be written down here, the word appears to be much closer to what may be called "a word of reasonable length", for its size is already well comparable with the size of the monoid T_5 itself (which is 3125).

References

[1] A. V. Aho, J. E. Hopcroft, and J. D. Ullman, The design and analysis of computer algorithms, Addison-Wesley, 1974.

[2] J. Almeida and M. V. Volkov, Profinite methods in finite semigroup theory, Centro de Matemática da Universidade do Porto, 2001, Preprint 2001-02.

[3] D. Beauquier, J. Berstel, and Ph. Chrétienne, Éléments d'algorithmique, Masson, 1994 [in French].

[4] J. Černý, Poznámka k homogénnym experimentom s konečnými automatmi, Mat.-Fyz. Čas. Slovensk. Akad. Vied. 14 (1964) 208-216 [in Slovak].

[5] J. Černý, A. Pirická, and B. Rosenauerová, On directable automata, Kybernetika, Praha 7 (1971) 289-298.

[6] M. Crochemore and W. Rytter, Text algorithms, Oxford University Press, 1994.

[7] N. G. DeBruijn, A combinatorial problem, Proc. Nederl. Akad. Weten- sch. 49 (1946) 758-764; Indagationes Math. 8 (1946) 461-467.


[8] L. Dubuc, Les automates circulaires biaisés vérifient la conjecture de Černý, RAIRO, Inform. Théor. Appl. 30 (1996) 495-505 [in French].

[9] D. Eppstein, Reset sequences for monotonic automata, SIAM J. Comput. 19 (1990) 500-510.

[10] P. Frankl, An extremal problem for two families of sets, Eur. J. Comb. 3 (1982) 125-127.

[11] W. Goehring, Minimal initializing word: A contribution to Černý conjecture, J. Autom. Lang. Comb. 2 (1997) 209-226.

[12] M. Ito and J. Duske, On cofinal and definite automata, Acta Cybernetica 6 (1983) 181-189.

[13] J. Kari, A counter example to a conjecture concerning synchronizing words in finite automata, EATCS Bulletin 73 (2001) 146.

[14] J. Kari, Synchronizing finite automata on Eulerian digraphs, Math. Foundations Comput. Sci., 26th Internat. Symp., Mariánské Lázně 2001, Lect. Notes Comput. Sci. 2136 (2001) 432-438.

[15] D. E. Knuth, J. H. Morris, Jr., and V. R. Pratt, Fast pattern matching in strings, SIAM J. Comput. 6 (1977) 323-350.

[16] J.-E. Pin, Sur un cas particulier de la conjecture de Černý, Automata, Languages, Programming; 5th Colloq., Udine 1978, Lect. Notes Comput. Sci. 62 (1978) 345-352 [in French].

[17] J.-E. Pin, Le problème de la synchronisation. Contribution à l'étude de la conjecture de Černý, Thèse 3e cycle, Paris, 1978 [in French].

[18] J.-E. Pin, Sur les mots synchronisants dans un automate fini, Elektron. Informationsverarbeitung und Kybernetik 14 (1978) 283-289 [in French].

[19] J.-E. Pin, On two combinatorial problems arising from automata theory, Ann. Discrete Math. 17 (1983) 535-548.

[20] R. Pöschel, M. V. Sapir, N. Sauer, M. G. Stone, and M. V. Volkov, Identities in full transformation semigroups, Algebra Universalis 31 (1994) 580-588.

[21] N. Sauer and M. G. Stone, Composing functions to reduce image size, Ars Combinatoria 31 (1991) 171-176.


Power Semigroups and Polynomial Closure

Stuart W. Margolis
Department of Computer Science
Bar-Ilan University
52900 Ramat Gan, Israel

Benjamin Steinberg
Faculdade de Ciências da Universidade do Porto
4099-002 Porto, Portugal*

Abstract

We show that the pseudovariety of semigroups which are locally block groups is precisely that generated by power semigroups of semigroups which are locally groups; that is, P(LG) = L(PG) (using that PG = BG). We will also show that this pseudovariety corresponds to the Boolean polynomial closure of the LG-languages, which is hence polynomial time decidable.

More generally, it is shown that if H is a pseudovariety of groups closed under semidirect product with the pseudovariety of p-groups for some prime p, then the pseudovariety of semigroups associated to the Boolean polynomial closure of the LH-languages is P(LH). The polynomial closure of the LH-languages is similarly characterized.

1 Introduction

A common approach to studying rational languages is to attempt to decompose them into simpler parts. Concatenation hierarchies allow this to be done in a natural way which, in addition, has applications to logic and circuit theory [8]. A concatenation hierarchy is built up from a base variety of languages V by taking, alternately, the polynomial closure and the Boolean polynomial closure of the previous half level of the hierarchy. The most famous example in the literature of such a hierarchy is the dot-depth hierarchy, introduced by Brzozowski [2], which starts off with the trivial +-variety, and whose union is the +-variety of star-free (aperiodic) languages.

*The second author was supported in part by NSF-NATO postdoctoral fellowship DGE9972697, and by FCT through Centro de Matemática da Universidade do Porto.


Pin and Margolis [6] also studied the group hierarchy which takes as its base the *-variety of all group languages.

In [13, 14], the author studied the levels one-half and one of the concatenation hierarchy associated to a pseudovariety of groups H. In particular, it was shown that if H is a pseudovariety of groups closed under semidirect product with the pseudovariety G_p of p-groups for some prime p, then

PH = BPol(H)

where BPol(H) is the pseudovariety corresponding to the Boolean polynomial closure of the H-languages [8]. A similar equality was shown to hold between the pseudovariety corresponding to the polynomial closure of the H-languages and an ordered analog of PH. All the aforementioned pseudovarieties were considered as pseudovarieties of monoids.

In this paper, we prove a semigroup analog of these results; here H is replaced by LH, the pseudovariety of semigroups whose submonoids are in H. We are then able to show that BPol(LH) = P(LH) and its ordered analog (provided, of course, H = G_p * H for some prime p). Special cases include: G, the pseudovariety of finite groups; G_p; and G_sol, the pseudovariety of finite solvable groups. For the case of G, we can characterize P(LG) as L(PG), the semigroups which are locally block groups; hence BPol(LG) has a polynomial time membership algorithm.

2 Preliminaries

As this paper extends the results of [14] to the semigroup context, it seems best to refer the reader there for basic notation and definitions; only monoids will be replaced throughout by semigroups. The reader is also referred to the general references [1, 3, 7, 8].

A semigroup S is a set with an associative multiplication. An ordered semigroup (S, ≤) is a semigroup S with a partial order ≤ compatible with the multiplication; that is to say, m ≤ n implies rm ≤ rn and mr ≤ nr. Any semigroup S can be viewed as an ordered semigroup with the equality relation as the ordering, and free semigroups will always be regarded this way.

An order ideal of an ordered semigroup (S, ≤) is a subset I such that y ∈ I and x ≤ y imply x ∈ I. We note that the collection of order ideals is closed under union and intersection. If X ⊆ S and s ∈ S, then s⁻¹X and Xs⁻¹ will denote, as usual, the, respectively, left and right quotients of X by s. If I is an order ideal, then so is any of its left or right quotients.


Morphisms of ordered semigroups are defined in the natural way. One can also define recognizability of a subset of an ordered semigroup; the only difference is that all subsets in the usual definition are now required to be order ideals.

A pseudovariety of (ordered) semigroups is a class of finite (ordered) semigroups closed under finite products (with the product order), subsemigroups (with the induced order), and images under (order-preserving) morphisms. Pseudovarieties of (ordered) monoids are defined similarly. An important example of such is J+ = [x ≤ 1] (finite ordered monoids with 1 as the greatest element). We use N for the pseudovariety of nilpotent semigroups (finite semigroups S such that S^n = 0 for some n > 0). We often identify a pseudovariety of semigroups with the pseudovariety of ordered semigroups which it generates.

If S is a semigroup, the power set P(S) is a semigroup under setwise multiplication. We use P'(S) for the subsemigroup consisting of the non-empty subsets of S. We note that the order ⊇ on P(S) is compatible with the multiplication. If U_1 = {0, 1} under multiplication, one can show that P(S) is a quotient of a subsemigroup of U_1 × P'(S).

If V is a pseudovariety of semigroups, we use PV to denote the pseudo- variety generated by semigroups of the form P ( S ) with S E V, and P'V+ to denote the pseudovarieties generated by ordered semigroups of the form (P'(S),>) with S E V. Suppose that V contains a non-trivial monoid M ; then {{l},M} P'(M) is isomorphic to U1. We the obtain from the previous paragraph the following statement:

(*) If V contains a non-trivial monoid, then PV is generated, as a pseudovariety of semigroups, by P'V+.

If V is a pseudovariety of (ordered) monoids, LV denotes the pseudovari- ety of (ordered) semigroups, all of whose submonoids are in V. For instance, LJ+ = I[xwyxw 5 x"] where z" is interpreted as the idempotent power of x.

If V is a pseudovariety of (ordered) semigroups, then EV is the pseudovariety of (ordered) semigroups whose idempotents generate a subsemigroup in V.

A relational morphism of (ordered) semigroups μ : S → T is a function μ : S → P'(T) such that s_1μ s_2μ ⊆ (s_1 s_2)μ for all s_1, s_2 ∈ S. Note that if S is an (ordered) semigroup and e ∈ T is an idempotent, then eμ⁻¹ is a subsemigroup of S (where μ⁻¹ is the inverse relation). If V, W are pseudovarieties of (ordered) semigroups, then the Mal'cev product V ⓜ W consists of all (ordered) semigroups S with a relational morphism φ : S → W ∈ W such that eφ⁻¹ ∈ V for each idempotent e of W. One can show that


V ⓜ W is generated by (ordered) semigroups S with a homomorphism φ : S → W ∈ W such that eφ⁻¹ ∈ V for each idempotent e of W.

If V_1 and V_2 are pseudovarieties of (ordered) semigroups, then V_1 * V_2 denotes the pseudovariety generated by semidirect products of (ordered) semigroups in V_1 with those in V_2. The semidirect product is an associative operation on pseudovarieties; see [1, 3, 14, 11] for more details. If V_1 and V_2 are pseudovarieties of groups, V_1 * V_2 can be shown to consist of all groups which are an extension of a group in V_1 by a group in V_2.

If A is an alphabet, we let Rec(A+) denote the recognizable subsets of A+. A class of recognizable languages is a correspondence C which associates to each alphabet A a set C(A+) ⊆ Rec(A+). If V is a pseudovariety of ordered semigroups, then one can define a class of recognizable languages, which we also denote by V, by letting V(A+) be the set of all languages of A+ recognized by a member of V. Then the following result, proved by Eilenberg [3] for semigroups and by Pin [7] in the version below, holds.

Proposition 2.1. Let V and W be pseudovarieties of ordered semigroups. Then V ⊆ W if and only if, for each finite alphabet A, V(A+) ⊆ W(A+).

This, of course, leaves the question as to which classes arise in this fashion. The answer is again due to Eilenberg [3] for semigroups and Pin [7] for ordered semigroups. A positive variety of languages is a class of recognizable languages V such that:

1. For every alphabet A, V(A+) is closed under finite unions and intersections;

2. If φ : A+ → B+ is a morphism, then L ∈ V(B+) implies Lφ⁻¹ ∈ V(A+);

3. If L ∈ V(A+) and a ∈ A, then a⁻¹L, La⁻¹ ∈ V(A+).

A variety of languages is a positive variety closed under complementation.

Proposition 2.2. If V is a pseudovariety of (ordered) semigroups, the class V is a (positive) variety.

If V is a (positive) variety of languages, then we associate to it the pseudovariety, also denoted by V, generated by syntactic (ordered) semigroups [7, 8, 14] of languages L ∈ V(A+) for some finite alphabet A. The reason for this abuse of notation is that the class of rational languages associated to the pseudovariety V obtained in this manner is the original (positive) variety.


3 Polynomials

If V is a pseudovariety of semigroups and A an alphabet, then a monomial over V in variables A is an expression

u_0 L_1 u_1 ... u_{n-1} L_n u_n

with the u_i ∈ A*, L_i ∈ V(A+), and u_0 non-empty if n = 0. A polynomial over V in variables A is a finite union of monomials (over V in variables A).

The class

Pol(V)(A+) = {polynomials over V in variables A}

is then a positive variety of languages [10]. We let BPol(V)(A+) be the closure of Pol(V)(A+) under finite Boolean operations. Then one can verify that BPol(V) is a variety of languages. One defines a hierarchy of (positive) varieties of languages as follows:

• V_0 = V;

• V_{n+1} = BPol(V_n).

The dot-depth hierarchy [2] comes from letting V_0 be the trivial pseudovariety.

We recall the following important theorem of Pin and Weil [10].

Theorem 3.1. Let V be a pseudovariety of ordered semigroups. Then Pol(V) = LJ+ ⓜ V.

We end this section with a technical lemma.

Lemma 3.2. Let V be a pseudovariety of semigroups containing N. Then every polynomial over V in variables A can be written as a finite union of monomials of the form L_0 a_1 ... a_n L_n with the a_i ∈ A and the L_i ∈ V(A+).

Proof. The hypotheses are equivalent to assuming that V contains all finite languages. It suffices to show that any monomial M = u_0 K_1 u_1 ... u_{n-1} K_n u_n with the u_i ∈ A* and K_i ∈ V(A+) can be so expressed. We induct on n, which we refer to as the degree of M. If n = 0, then by taking L_0 = {u_0} we are done; now assume n > 0. Observe that if w ∈ K_1, then

M = (u_0 w u_1) K_2 ... u_{n-1} K_n u_n ∪ u_0 (K_1 \ {w}) u_1 K_2 ... u_{n-1} K_n u_n.   (1)


Since V(A+) contains all finite languages, it follows that K_1 \ {w} ∈ V(A+). Since the first term in (1) has smaller degree, the above argument shows that we can remove a finite number of words from K_1. In particular, we may assume that every word in K_1 has length at least 5. Note that u⁻¹K_1 v⁻¹ ∈ V(A+) for all u, v ∈ A+. Since every word in K_1 is assumed to have length at least 5, it follows that

K_1 = ⋃_{u,v ∈ A²} u (u⁻¹K_1 v⁻¹) v

and so

M = ⋃_{u,v ∈ A²} (u_0 u)(u⁻¹K_1 v⁻¹)(v u_1) K_2 ... u_{n-1} K_n u_n.

Thus we may assume that u_0 and u_1 have length at least 2. Suppose u_0 = wa and u_1 = a'w' with a, a' ∈ A and w, w' ∈ A+. Then let L_0 = {w}, a_1 = a, L_1 = K_1, a_2 = a'. Now M' = w' K_2 u_2 ... u_{n-1} K_n u_n has smaller degree and hence can be expressed as a finite union of monomials of the desired form. But then M = L_0 a_1 L_1 a_2 M' can be written as a finite union of the monomials of the desired form. □

4 Counters

Suppose that we have a_1, ..., a_n ∈ A and L_0, ..., L_n ⊆ A+. Then, for 0 ≤ r < m, we define

(L_0 a_1 ... a_n L_n)_{r,m}

to consist of those words w ∈ A+ with exactly r factorizations, modulo m, of the form w_0 a_1 ... a_n w_n with w_i ∈ L_i for all i. Such a language is called a product with m-counter. A variety of languages is said to be closed under products with m-counter if L_0, ..., L_n ∈ V(A+) implies that (L_0 a_1 ... a_n L_n)_{r,m} ∈ V(A+). The following result is due to Weil [18].
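Membership in a product with m-counter can be checked by counting factorizations directly. A small brute-force sketch of ours, with languages given as predicates:

```python
def in_counter_product(word, langs, letters, r, m):
    """Test word in (L0 a1 L1 ... an Ln)_{r,m}: the number of
    factorizations word = w0 a1 w1 ... an wn with each wi in Li
    must be congruent to r modulo m.  langs = [L0, ..., Ln] as
    membership predicates, letters = [a1, ..., an]."""
    def count(w, j):
        # number of ways to factor w as wj a_{j+1} w_{j+1} ... an wn
        if j == len(letters):
            return 1 if langs[j](w) else 0
        return sum(count(w[i + 1:], j + 1)
                   for i in range(len(w))
                   if w[i] == letters[j] and langs[j](w[:i]))
    return count(word, 0) % m == r

aplus = lambda w: len(w) >= 1          # the language A+
# "aaa" has exactly one factorization w0.a.w1 with w0, w1 in A+,
# while "aaaa" has two
assert in_counter_product("aaa", [aplus, aplus], ["a"], 1, 2)
assert in_counter_product("aaaa", [aplus, aplus], ["a"], 0, 2)
```

With m = p prime and r = 1, this is exactly the kind of language used in the proof of Theorem 5.2 below.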

Theorem 4.1. Let V be a pseudovariety of semigroups. Then V is closed under products with p-counters, p a prime, if and only if V = LG_p ⓜ V.

5 The Power Operator and Polynomial Closure

We will need the following version [14, Proposition 5.1] of a well-known proposition (see, for instance, [8], which also references the original sources); a proof for the monoidal version can be found in [14], so we omit the proof.


If B and A are alphabets, a homomorphism φ : B+ → A+ is called a literal morphism if Bφ ⊆ A.

Proposition 5.1. Let L ∈ Rec(B+) be recognized by a semigroup S, with L = Pψ⁻¹, and let φ : B+ → A+ be a literal morphism. Then (P(S), ⊇) recognizes Lφ. If, in addition, Bφ = A, then (P'(S), ⊇) recognizes Lφ.

The proof idea for the next theorem is borrowed from [5].

Theorem 5.2. Let V be a pseudovariety of semigroups such that, for some prime p, LG_p ⓜ V = V. Then

LJ+ ⓜ V ⊆ P'V+ and BPol(V) ⊆ PV.

Proof. The second inequality follows immediately from the first by (*). To prove the first, since N ⊆ LG_p ⊆ V, it suffices, by Lemma 3.2, to consider a monomial over V in variables A of the form

L = L_0 a_1 ... a_n L_n

with L_0, ..., L_n ∈ V(A+) and a_1, ..., a_n ∈ A. Let B = A ∪ Ā with Ā a disjoint copy of A. We define a literal morphism φ : B+ → A+ such that Bφ = A by aφ = a and āφ = a, and show that L is the image of an element of V(B+). For each j, let K_j = L_jφ⁻¹. Then K_j ∈ V(B+) for each j. Let

K = (K_0 ā_1 ... ā_n K_n)_{1,p}.

By Theorem 4.1, K ∈ V(B+). We show Kφ = L. Clearly Kφ ⊆ L. For the converse, suppose u ∈ L. Then u = w_0 a_1 ... a_n w_n with each w_j ∈ L_j. Consider v = w_0 ā_1 w_1 ... w_{n-1} ā_n w_n. Then, since the w_j are in A+, v has exactly one factorization in K_0 ā_1 ... ā_n K_n, namely the one above; hence v ∈ K. But vφ = u, so Kφ = L. Thus, by the above proposition, L ∈ P'V+(A+). □

6 Semigroups which are Locally Groups

In this section, we characterize the operations we have been considering for pseudovarieties of semigroups which are locally groups.

Proposition 6.1. Let V₁, V₂ be pseudovarieties of (ordered) semigroups. Then LV₁ ⓜ LV₂ ⊆ L(LV₁ ⓜ V₂). In particular, if V₁ and V₂ are pseudovarieties of groups, LV₁ ⓜ LV₂ ⊆ L(V₁ * V₂).


Proof. It suffices to show that given a semigroup homomorphism φ : S → T such that T ∈ LV₂ and, for all idempotents e ∈ T, eφ⁻¹ ∈ LV₁, one has that S ∈ L(LV₁ ⓜ V₂). Let M ⊆ S be a submonoid; then Mφ ∈ V₂, being a monoid. If f ∈ Mφ is an idempotent, then fφ⁻¹ ∈ LV₁, whence fφ⁻¹ ∩ M ∈ LV₁. Thus M ∈ L(LV₁ ⓜ V₂).

Suppose now that V₁, V₂ are pseudovarieties of groups. Then if M ⊆ S is a monoid with identity e, we see that eφφ⁻¹ ∈ LV₁. Since eφφ⁻¹ contains all the idempotents of M (Mφ being a group), it follows that M is a group which is an extension of a group in V₁ by a group in V₂, whence M ∈ V₁ * V₂ as desired. □

We then obtain from Theorem 5.2:

Corollary 6.2. Let H be a pseudovariety of groups such that Gₚ * H = H for some prime p. Then

LJ⁺ ⓜ LH ⊆ P′(LH)⁺ and BPol(LH) ⊆ P(LH).

Proof. Proposition 6.1 shows that LGₚ ⓜ LH = LH, whence Theorem 5.2 applies to prove the result. □

To prove the converse, we need the following characterization of finite completely simple semigroups.

Lemma 6.3. A finite semigroup S is completely simple if and only if S ∈ LG and S² = S.

Proof. If S is completely simple, then clearly S² = S; also it is well known that any subsemigroup of a finite completely simple semigroup is completely simple, and that a completely simple monoid is a group.

The converse follows immediately from the Delay Theorem [15, 17], but we give an elementary proof here. Suppose that S ∈ LG and S² = S. We begin by showing that S is completely regular. Consider the natural map φ : S⁺ → S which evaluates each letter as itself; let, for s ∈ S, Lₛ = {w ∈ S⁺ | wφ = s}; Lₛ is rational, being recognized by S. Observe that S² = S implies Sⁿ = S for all n > 0, whence we can conclude that Lₛ is infinite. The Pumping Lemma then applies to show that there exist s₁, s₂, s₃ ∈ S such that s = s₁s₂ⁿs₃ for all n > 0. Thus, by choosing n carefully, we see that s = s₁es₃ with e an idempotent. Then s^{k+1} = s₁(es₃s₁e)^k s₃ for k > 0. Since S ∈ LG, it follows that for some m > 0, (es₃s₁e)^m = e, whence

s^{m+1} = s₁(es₃s₁e)^m s₃ = s₁es₃ = s.


Thus S is completely regular (and so every element is ℋ-equivalent to an idempotent).

Thus, to finish our proof, it suffices to show that all idempotents of S are 𝒥-equivalent. Let e, f ∈ S be idempotents. Then (efe)ⁿ = e for some n > 0 (since S ∈ LG), so e ∈ SfS. Dually, f ∈ SeS, so e 𝒥 f. The result follows. □

We now prove a theorem which implies the converse of Corollary 6.2.

Theorem 6.4. Let V ⊆ LG. Then P′V⁺ ⊆ LJ⁺ ⓜ V. Furthermore, if V contains a non-trivial monoid, then PV ⊆ BPol(V).

Proof. The second statement follows from the first by (*). It suffices to show that if S ∈ V, then (P′(S), ⊇) ∈ LJ⁺ ⓜ V. The identity map ψ : P′(S) → P′(S) gives rise to a relational morphism ψ : P′(S) ⟶ S; in fact, XψYψ = XY = (XY)ψ. Let e ∈ S be an idempotent. Then

eψ⁻¹ = {X ∈ P′(S) | e ∈ X}.

An idempotent of eψ⁻¹ is then a subsemigroup E ⊆ S with e ∈ E and E² = E. Lemma 6.3 shows that E is completely simple, so EeE = E. It follows that if Y ∈ eψ⁻¹, then EYE ⊇ EeE = E, whence the local monoid with identity E has E as its greatest element; we conclude that eψ⁻¹ ∈ LJ⁺. □

Since LH contains a non-trivial monoid whenever H is non-trivial, we immediately obtain the following theorem which is one of our main results.

Theorem 6.5. Let H be a pseudovariety of groups such that Gₚ * H = H for some prime p. Then Pol(LH) = P′(LH)⁺ and BPol(LH) = P(LH). In particular, these results hold for H any of G, Gₚ (p prime), or G_sol.

7 Locally Block Groups

A block group is a semigroup whose regular elements have unique inverses (equivalently, a semigroup which has no non-trivial right or left zero subsemigroup). The pseudovariety of such semigroups is denoted BG. We use D for the pseudovariety of semigroups whose idempotents are right zeros.

We now recall some important facts whose consequences we shall use without comment:

1. PG = J * G = BG = EJ [4];


2. L(EJ) = EJ * D [12, Proposition 10.2], [17, The Delay Theorem];

3. LG = G * D [15, 17];

4. If H is a pseudovariety of groups, then BPol(H) = J * H [9, 14];

5. For any pseudovariety of semigroups V, J * V is generated by semidirect products M * N with M ∈ J⁺ and N ∈ V [16, 14];

6. For a monoid M, M ∈ J⁺ if and only if M ∈ LJ⁺.

Proposition 7.1. Let H be a pseudovariety of groups. Then

P′(LH)⁺ ⊆ Pol(LH) ⊆ L(Pol(H)); P(LH) ⊆ BPol(LH) ⊆ L(BPol(H)).

Proof. The first containment of the first statement follows from Theorem 6.4. The second containment follows from Proposition 6.1, which shows that

Pol(LH) = LJ⁺ ⓜ LH ⊆ L(LJ⁺ ⓜ H) = L(Pol(H)).

The second statement follows from the first by (*). □

The following lemma will be of use. As its proof is identical to the unordered case [18, Lemma 2.2], we omit it.

Lemma 7.2. Let φ : S * T → T be a semidirect product projection from a semidirect product of (ordered) semigroups, and let e ∈ T be an idempotent. Then any submonoid of eφ⁻¹ (order) embeds in S.

Using our collection of facts and the above lemma, one deduces immediately:

Corollary 7.3. Let V be a pseudovariety of semigroups. Then

J⁺ * V ⊆ LJ⁺ ⓜ V = Pol(V); J * V ⊆ BPol(V).

We now show that for the case of G, all the pseudovarieties in question are the same.

Theorem 7.4. P(LG) = L(PG) = L(BG).


Proof. Proposition 7.1 shows that P(LG) ⊆ L(PG) (here we are using that PG = J * G = BPol(G)). For the other direction, using that PG = EJ, we see that

L(PG) = EJ * D = J * G * D = J * LG.

But, by Corollary 7.3, J * LG ⊆ BPol(LG). However, by Theorem 6.5, the righthand side is none other than P(LG). The result follows. □

It is clear that one can verify in polynomial time whether a semigroup is locally a block group, whence P(LG) = BPol(LG) has a polynomial time membership problem. Observe that we have also shown that L(BG) = J * LG. We note that an entirely similar argument would show that P′(LG)⁺ = Pol(LG) = L(P′G⁺) if one could show that EJ⁺ is local (the argument of [12, Proposition 10.2] fails because (B₂¹)⁺ ∉ EJ⁺).

References

[1] J. Almeida, Finite Semigroups and Universal Algebra, World Scientific, 1994.

[2] J. A. Brzozowski, Hierarchies of aperiodic languages, RAIRO Inform. Théor. 10 (1976), 33-49.

[3] S. Eilenberg, Automata, Languages and Machines, Academic Press, New York, Vol. A, 1974; Vol. B, 1976.

[4] K. Henckell, S. Margolis, J.-E. Pin, and J. Rhodes, Ash's type II theorem, profinite topology and Malcev products. Part I, Internat. J. Algebra and Comput. 1 (1991), 411-436.

[5] S. W. Margolis and J.-E. Pin, Varieties of finite monoids and topology for the free monoid, in "Proceedings of the 1984 Marquette Conference on Semigroups" (K. Byleen, P. Jones and F. Pastijn, eds.), Marquette University (1984), 113-130.

[6] S. W. Margolis and J.-E. Pin, Product of group languages, Proc. FCT Conf., Lecture Notes in Computer Science, Vol. 199 (Springer, Berlin, 1985), 285-299.

[7] J.-E. Pin, Eilenberg's theorem for positive varieties of languages, Russian Math. (Iz. VUZ) 39 (1995), 74-83.

[8] J.-E. Pin, Syntactic semigroups, Chap. 10 in Handbook of Language Theory, Vol. 1, G. Rozenberg and A. Salomaa (eds.), Springer Verlag, 1997, 679-746.

[9] J.-E. Pin, Bridges for concatenation hierarchies, in 25th ICALP, Berlin, 1998, pp. 431-442, Lecture Notes in Computer Science 1443, Springer Verlag.


[10] J.-E. Pin and P. Weil, Polynomial closure and unambiguous product, Theory Comput. Systems 30 (1997), 1-39.

[11] J.-E. Pin and P. Weil, Semidirect product of ordered semigroups, Comm. in Algebra, to appear.

[12] B. Steinberg, Semidirect products of categories and applications, J. Pure Appl. Algebra 142 (1999), 153-182.

[13] B. Steinberg, A note on the equation PH = J * H, Semigroup Forum, to appear.

[14] B. Steinberg, Polynomial closure and topology, Internat. J. Algebra and Comput. 10 (2000), 603-624.

[15] H. Straubing, Finite semigroup varieties of the form V * D, J. Pure Appl. Algebra 36 (1985), 53-94.

[16] H. Straubing and D. Thérien, Partially ordered finite monoids and a theorem of I. Simon, J. Algebra 119 (1988), 393-399.

[17] B. Tilson, Categories as algebra, J. Pure and Applied Algebra 48 (1987), 83-198.

[18] P. Weil, Closure of varieties of languages under products with counter, J. Comput. System Sci. 45 (1992), 316-339.


Routes and Trajectories

Alexandru Mateescu*

Abstract

This paper is an overview of some basic facts about routes and trajectories. We introduce and investigate a new operation of parallel composition of words. This operation can be used both for DNA computation and for parallel computation. For instance, the recombination of DNA sequences produces a new sequence starting from two parent sequences. The resulting sequence is formed by starting at the left end of one parent sequence, copying a substring, crossing over to some site in the other parent sequence, copying a substring, crossing back to some site in the first parent sequence and so on. The new method that we introduce is based on syntactic constraints on the crossover operation.

1 Introduction

We define and investigate new methods to define parallel composition of words and languages. These operations are suitable both for concurrency and for DNA computation. The operation of splicing on routes leads to new shuffle-like operations defined by syntactic constraints on the usual crossover operation. The constraints involve the general strategy to switch from one word to another. Once such a strategy is defined, the structure of the words that are operated on does not play any role.

Definition 1.1 A nondeterministic generalized sequential machine (gsm) is an ordered system G = (Σ, Δ, Q, δ, Q₀, F), where Σ and Δ are alphabets, Q is a finite set of states, Q₀ ⊆ Q is the initial set of states, F ⊆ Q is the final set of states, and δ : Q × Σ → 𝒫_fin(Q × Δ*) is the transition function.

If L is a language, then G(L) denotes the image of L under the gsm G.

*Faculty of Mathematics, University of Bucharest, Romania, email: alexmate@pcnet.ro


Definition 1.2 Assume that Σ = {a₁, a₂, …, aₙ} is an ordered alphabet. Let w ∈ Σ* be a word. The Parikh vector of w is Ψ(w) = (|w|_{a₁}, |w|_{a₂}, …, |w|_{aₙ}), where |w|_{aᵢ} means the number of occurrences of aᵢ in w.
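As a concrete illustration (the function name is ours, not from the paper), the Parikh vector of a word is just a tuple of letter counts:

```python
def parikh(w, alphabet):
    """Parikh vector of w over the ordered alphabet: occurrence counts per letter."""
    return tuple(w.count(a) for a in alphabet)
```

For example, over the ordered alphabet (a, b, c), the word aabca has Parikh vector (3, 1, 1).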

The operations are introduced using a uniform method based on the notion of route. A route defines how to skip from a word to another word during the parallel composition.

These operations lead in a natural way to a large class of semirings. The approach is very flexible: various concepts from the theory of concurrency and of DNA computation can be introduced and studied in this framework. For instance, we provide examples of applications to the fairness property and to the parallelization of languages. The reader is referred to the monograph [1] for the notion of fairness.

The application considered deals with the parallelization of non-context-free languages. The parallelization problem for a non-context-free language L consists in finding a representation of L as the shuffle over a set T of trajectories of two languages L₁ and L₂, such that each of the languages L₁, T and L₂ is context-free, or even regular. This problem is related to the problem of parallelization of algorithms for a parallel computer, a central topic in the theory of parallel computation.

2 Operations on routes and trajectories

In this section we introduce the notions of route and of splicing on routes. For more details on these notions, the reader is referred to [4].

Consider the alphabet V = {1, 1̄, 2, 2̄}. Elements in V are referred to as versors. Denote V₊ = {1, 2} and V₋ = {1̄, 2̄}, V₁ = {1, 1̄} and V₂ = {2, 2̄}.

Definition 2.1 A route is an element t ∈ V* and a trajectory is an element t′ ∈ V₊*.

Let Σ be an alphabet and let α, β be words over Σ. Assume that d ∈ V and t ∈ V*.

Definition 2.2 The splicing of α with β on the route dt, denoted α ⋈_{dt} β, is defined as follows:

if α = au and β = bv, where a, b ∈ Σ and u, v ∈ Σ*, then:

au ⋈_{dt} bv = a(u ⋈_t bv), if d = 1,
au ⋈_{dt} bv = u ⋈_t bv, if d = 1̄,
au ⋈_{dt} bv = b(au ⋈_t v), if d = 2,
au ⋈_{dt} bv = au ⋈_t v, if d = 2̄.

If α = au and β = λ, a ∈ Σ, u ∈ Σ*, then

au ⋈_{dt} λ = a(u ⋈_t λ), if d = 1,
au ⋈_{dt} λ = u ⋈_t λ, if d = 1̄,
au ⋈_{dt} λ = ∅, otherwise.

If α = λ and β = bv, b ∈ Σ, v ∈ Σ*, then

λ ⋈_{dt} bv = b(λ ⋈_t v), if d = 2,
λ ⋈_{dt} bv = λ ⋈_t v, if d = 2̄,
λ ⋈_{dt} bv = ∅, otherwise.

Finally,

λ ⋈_t λ = λ, if t = λ, and ∅ otherwise.

Remark 2.1 One can easily notice that, if |α| ≠ |t|_{V₁} or |β| ≠ |t|_{V₂}, then α ⋈_t β = ∅.
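The case analysis of the definition can be turned into a short recursive sketch. This is our own illustration, not code from the paper: we encode the barred versors 1̄ and 2̄ as the strings "1-" and "2-", a route as a list of such tokens, and the empty result ∅ as None:

```python
def splice(alpha, beta, route):
    """Splicing of alpha with beta on one route; None encodes the empty result."""
    if not route:
        return "" if alpha == "" == beta else None
    d, t = route[0], route[1:]
    if d == "1" and alpha:                  # keep a letter of the first word
        rest = splice(alpha[1:], beta, t)
        return None if rest is None else alpha[0] + rest
    if d == "1-" and alpha:                 # delete a letter of the first word
        return splice(alpha[1:], beta, t)
    if d == "2" and beta:                   # keep a letter of the second word
        rest = splice(alpha, beta[1:], t)
        return None if rest is None else beta[0] + rest
    if d == "2-" and beta:                  # delete a letter of the second word
        return splice(alpha, beta[1:], t)
    return None                             # versor does not match: empty result
```

Note that, in accordance with the remark above, the result is empty whenever the number of versors from V₁ (resp. V₂) differs from the length of the first (resp. second) word.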

The operation of splicing on a route is extended in a natural way to the operation of splicing on a set of routes as well as an operation between languages.

If T is a set of routes, the splicing of α with β on the set T of routes, denoted α ⋈_T β, is:

α ⋈_T β = ∪_{t ∈ T} α ⋈_t β.

The above operation is extended to languages. In the sequel we consider some particular cases of the operation of splicing on routes. This will show that splicing on routes is a very general operation with great expressive power.


2.1 Some binary operations that are particular cases of splicing on routes

Here we show that many customary binary operations on words and languages are particular cases of the operation of splicing on routes.

1. If T = {1, 1̄, 2, 2̄}* then ⋈_T is the crossover operation.

2. If T = 1*1̄*2̄*2* then ⋈_T is the simple splicing operation; see [7] for more details on this operation (the simple splicing of two words α and β is γ, where α = γ₁α₁, β = β₁γ₂ and γ = γ₁γ₂).

3. If T = {1ⁿ2̄ⁿ2ᵐ1̄ᵐ | n, m ≥ 0}* then ⋈_T is the equal-length crossover.

4. If T = {1, 2}* then ⋈_T = ш, the shuffle operation.

5. If T = (12)*(1* ∪ 2*) then ⋈_T = ш_l, the literal shuffle.

6. If T ⊆ {1, 2}* then ⋈_T is the shuffle on the set T of trajectories.

7. If T = 1*2* then ⋈_T = ·, the catenation operation.

8. If T = 2*1* then ⋈_T is the anti-catenation operation.

9. Define T = 1*2*1* and note that ⋈_T = ←, the insertion operation.

2.2 Some unary operations that are particular cases of splicing on routes

Let Σ be an alphabet. Assume that T is a set of routes such that T = T′2̄* with T′ ⊆ {1, 1̄}*.

Note that for all languages L, L₁, L₂ ⊆ Σ*, with L₁, L₂ nonempty, it follows that:

L ⋈_T L₁ = L ⋈_T L₂.

Therefore, in this case, the operation ⋈_T does not depend on the second argument.

Consequently, several well-known unary operations on words and languages are particular cases of the operation of splicing on routes. In the sequel we denote by ∇_T the unary operation defined by ⋈_T, in the case T ⊆ {1, 1̄, 2̄}*.


1. If T = 1*1̄*2̄* then ∇_T(L) = Pref(L), the prefixes of L.

2. If T = 1̄*1*2̄* then ∇_T(L) = Suf(L), the suffixes of L.

3. If T = 1̄*1*1̄*2̄* then ∇_T(L) = Sub(L), the subwords of L.

4. If T = {1, 1̄}*2̄* then ∇_T(L) = Scatt(L), the scattered subwords of L.

5. If T = {1ᵏ1̄ᵏ | k ≥ 0}2̄* then ∇_T(L) = ½(L).

6. If T = 1*1̄*1*2̄* then ∇_T(L) = L → Σ*, the deletion of arbitrary factors of L.

3 Splicing on routes of regular and context-free languages

This section is devoted to the operation of splicing on routes of regular and context-free languages. We consider the situations when the sets of routes are regular or context-free languages.

The following theorem states (see [4]) that the splicing of two regular languages over a regular set of routes is a regular language. The second part of this theorem involves context-free languages.

Theorem 3.1 Let L₁, L₂ and T, T ⊆ {1, 2, 1̄, 2̄}*, be three languages.

(i) If all three languages are regular languages, then L₁ ⋈_T L₂ is a regular language.

(ii) If two of the languages are regular languages and the third one is a context-free language, then L₁ ⋈_T L₂ is a context-free language.

In the sequel we use Theorem 3.1 to obtain some well-known closure properties, as well as some further closure properties, of regular and context-free languages under a number of operations.

Corollary 3.1 Several closure properties of the families of regular and context-free languages can be obtained from the above theorem. For instance, the family of regular languages is closed under the following operations: crossover, simple splicing, shuffle, literal shuffle, catenation, anti-catenation, insertion.

Moreover, the above operations applied to a context-free language and to a regular language produce a context-free language.


Remark 3.1 The conditions of the above theorem cannot be relaxed. For instance, if two of the languages are context-free and the third one is regular, then L₁ ⋈_T L₂ is not necessarily a context-free language. Assume that T is regular, T = {1, 2}*. It is known that there are context-free languages L₁, L₂ such that L₁ ⋈_T L₂, i.e., L₁ ш L₂, is not a context-free language.

For the other two cases, assume that T = {1ⁿ21²ⁿ | n ≥ 1}, L₁ = {aⁿbⁿcᵐ | n, m ≥ 1} and L₂ = {d}. Note that

(L₁ ⋈_T L₂) ∩ a⁺db⁺c⁺ = {aⁿdbⁿcⁿ | n ≥ 1}.

Hence, L₁ ⋈_T L₂ is not a context-free language. If T = {2ⁿ⁺¹12²ⁿ | n ≥ 1}, then L₂ ⋈_T L₁ is not a context-free language.

4 Trajectories

In this section we introduce the notions of trajectory and of shuffle on trajectories. The shuffle on trajectories is a special case of the operation of splicing on routes.

Remark 4.1 From now on we denote by "r" the versor "1" and by "u" the versor "2". In the remainder of this paper we will not use the versors 1̄ and 2̄; thus the alphabet V is V = {r, u}. Also, the operation ⋈_T is denoted by ш_T and is called shuffle on the set T of trajectories.

Definition 4.1 A trajectory is an element t, t ∈ V*.
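Over the two-versor alphabet {r, u} the operation is especially simple to compute: each r takes the next letter of the first word and each u takes the next letter of the second. The following sketch is our own illustration (names are ours, not from the paper), with None encoding the empty result, as required by Remark 2.1 when the letter counts do not match:

```python
def shuffle_on_trajectory(alpha, beta, t):
    """Shuffle alpha and beta along a single trajectory t over {'r', 'u'}."""
    if len(alpha) + len(beta) != len(t):
        return None                        # lengths must match the trajectory
    out, i, j = [], 0, 0
    for v in t:
        if v == "r":                       # take the next letter of alpha
            if i == len(alpha):
                return None
            out.append(alpha[i]); i += 1
        else:                              # v == "u": next letter of beta
            if j == len(beta):
                return None
            out.append(beta[j]); j += 1
    return "".join(out)

def shuffle_on_set(alpha, beta, T):
    """alpha shuffled with beta over a finite set T of trajectories."""
    return {w for t in T
            if (w := shuffle_on_trajectory(alpha, beta, t)) is not None}
```

For example, the trajectory set {rruu, uurr} yields exactly catenation and anti-catenation of the two words.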

The following theorem is a representation result for languages of the form L₁ ш_T L₂.

Theorem 4.1 For all languages L₁ and L₂, L₁, L₂ ⊆ Σ*, and for all sets T of trajectories, there exist a morphism φ and two letter-to-letter morphisms g and h, g : Σ → Σ₁ and h : Σ → Σ₂, where Σ₁ and Σ₂ are two copies of Σ, and a regular language R, such that

Consequently, we obtain the following:


Corollary 4.1 For all languages L₁ and L₂, L₁, L₂ ⊆ Σ*, and for all sets T of trajectories, there exist a gsm M and two letter-to-letter morphisms g and h such that

5 Some algebraic properties

This section is devoted to some important algebraic properties of the operation of shuffle on trajectories, which was introduced and studied in [6].

5.1 Completeness

A complete set T of trajectories has the property that, for each lattice point in the plane, i.e., for each point in the plane with nonnegative integer coordinates, there exists at least one trajectory in T that ends in this lattice point.

Definition 5.1 A set T of trajectories is complete iff α ш_T β ≠ ∅, for all α, β ∈ Σ*.

Definition 5.2 The balanced insertion is the following operation: if w = x₁x₂ with |x₁| = |x₂|, then w ←_b y = x₁yx₂.

Example 5.1 Shuffle, catenation, and insertion are complete sets of trajectories.

Noncomplete sets of trajectories are, for instance, balanced literal shuffle, balanced insertion, and all finite sets of trajectories; see [6] for more details about balanced literal shuffle and balanced insertion.

Remark 5.1 T is complete iff Ψ(T) = ℕ², i.e., the restriction of the Parikh mapping Ψ to T is a surjective mapping (Ψ|_T is surjective).

Proposition 5.1 If T is a set of trajectories such that T is a semilinear language with Ψ(T) effectively calculable, then it is decidable whether or not T is complete.

Corollary 5.1 If T is a context-free language or if T is a simple matrix language, then it is decidable whether or not T is complete.


5.2 Determinism

A deterministic set T of trajectories has the property that, for each lattice point in the plane, there exists at most one trajectory in T that ends in this lattice point.

Definition 5.3 A set T of trajectories is deterministic iff card(α ш_T β) ≤ 1, for all α, β ∈ Σ*.

Example 5.2 Catenation, balanced literal shuffle, and balanced insertion are deterministic sets of trajectories.

Nondeterministic sets of trajectories are, for instance, shuffle and insertion.

Remark 5.2 T is deterministic iff the restriction of the Parikh mapping Ψ to T is injective.
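For a finite set T the criterion of the remark is directly checkable: T is deterministic iff the Parikh vectors (|t|_r, |t|_u) are pairwise distinct. A small sketch (our own illustration, not from the paper):

```python
def is_deterministic(T):
    """Finite T is deterministic iff t -> (|t|_r, |t|_u) is injective on T."""
    vectors = [(t.count("r"), t.count("u")) for t in T]
    return len(set(vectors)) == len(vectors)
```

For instance, {ru, ur} is not deterministic (both trajectories end at the lattice point (1, 1)), whereas {r, ru, rru} is.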

Proposition 5.2 Let ℒ be a class of semilinear languages, effectively closed under catenation and under gsm mappings. If T ∈ ℒ, then it is decidable whether or not T is deterministic.

Corollary 5.2 If T is a context-free language or if T is a simple matrix language, then it is decidable whether or not T is deterministic.

Proposition 5.3 It is undecidable whether or not a context-sensitive set T of trajectories is deterministic.

5.3 Commutativity

The property of an operation to be commutative is a well-known algebraic property.

Definition 5.4 A set T of trajectories is referred to as commutative iff the operation ш_T is a commutative operation, i.e., α ш_T β = β ш_T α, for all α, β ∈ Σ*.

Example 5.3 Shuffle is a commutative set of trajectories, whereas, for instance, catenation and insertion are noncommutative sets of trajectories.

Notation. The morphism sym : {r, u}* → {r, u}* is defined by sym(u) = r and sym(r) = u.


Remark 5.3 T is commutative iff T = sym(T).
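For a finite set of trajectories the condition T = sym(T) can be tested directly; the following sketch (ours, not from the paper) represents trajectories as strings over {'r', 'u'}:

```python
def sym(t):
    """The morphism sym: swap the versors r and u in a trajectory."""
    return t.translate(str.maketrans("ru", "ur"))

def is_commutative(T):
    """A finite set T of trajectories is commutative iff T = sym(T)."""
    return {sym(t) for t in T} == set(T)
```

For example, {ru, ur} is commutative, while the singleton {rruu} (catenation of words of length two) is not.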

Proposition 5.4 Let T be a set of trajectories.

(i) If T is a regular language, then it is decidable whether or not T is commutative.

(ii) If T is a context-free language, then it is undecidable whether or not T is commutative.

Remark 5.4 No nonempty commutative set of trajectories is deterministic.

5.4 The unit element

Again, the existence of a unit element is an important algebraic property of an operation.

Definition 5.5 A set T of trajectories has a unit element iff the operation ш_T has a unit element, i.e., iff there exists a word 1 ∈ Σ* such that 1 ш_T α = α ш_T 1 = α, for all α ∈ Σ*.

Remark 5.5 T has a unit element iff λ is the unit element. Moreover, T has a unit element iff (r* ∪ u*) ⊆ T.

Note that the above property is decidable if T is a context-free language.

5.5 Associativity

We now start our discussion concerning associativity. After presenting a general characterization result (Proposition 5.5), we show that the property of associativity is preserved under certain transformations.

Definition 5.6 A set T of trajectories is associative iff the operation ш_T is associative, i.e.,

(α ш_T β) ш_T γ = α ш_T (β ш_T γ),

for all α, β, γ ∈ Σ*.

Example 5.4 The following sets of trajectories are associative:

(i) T = {r, u}*, the shuffle, ш.


(ii) T = {rⁱu²ʲrⁱ | i, j ≥ 0}*, the balanced insertion, ←_b.

(iii) T = r*u*, the catenation, ·.

(iv) T = u*r*, the anti-catenation.

Examples of nonassociative sets of trajectories are:

(i′) T = (ru)*(r* ∪ u*), the literal shuffle, ш_l.

(ii′) T = (ru)*, the balanced literal shuffle, ш_b.

(iii′) T = r*u*r*, the insertion, ←.

Definition 5.7 Let D be the set D = {x, y, z}. Define the substitutions σ and τ as follows:

σ, τ : V → 𝒫(D*), σ(r) = {x, y}, σ(u) = {z}, τ(r) = {x}, τ(u) = {y, z}.

Consider the morphisms φ and ψ:

φ, ψ : V → D*.

Proposition 5.5 Let T be a set of trajectories. The following conditions are equivalent:

(i) T is an associative set of trajectories.

(ii) σ(T) ∩ (φ(T) ш z*) = τ(T) ∩ (ψ(T) ш x*).

Proposition 5.6 Let T be a set of trajectories.

(i) If T is a regular language, then it is decidable whether or not T is associative.

(ii) If T is a context-free language, then it is undecidable whether or not T is associative.


Notation. Let 𝒜 be the family of all associative sets of trajectories.

Proposition 5.7 The family 𝒜 is an anti-AFL.

Proposition 5.8 If (Tᵢ)_{i∈I} is a family of sets of trajectories such that, for all i ∈ I, Tᵢ is an associative set of trajectories, then

T′ = ∩_{i∈I} Tᵢ

is an associative set of trajectories.

Definition 5.8 Let T be an arbitrary set of trajectories. The associative closure of T, denoted T̄, is

T̄ = ∩_{T ⊆ T′, T′ ∈ 𝒜} T′.

Observe that for all T, T ⊆ {r, u}*, T̄ is an associative set of trajectories and, moreover, T̄ is the smallest associative set of trajectories that contains T.

Example 5.5 One can easily verify that the associative closure of insertion is the shuffle. Similarly, the associative closure of balanced insertion is balanced insertion itself. This is of course also obvious because balanced insertion is associative.

Remark 5.6 The function T ↦ T̄, from 𝒫(V*) to 𝒫(V*), is a closure operator.

We now provide another characterization of associative sets of trajectories. This is useful in finding an alternative definition of the associative closure of a set of trajectories and also in proving some other properties related to associativity.

Definition 5.9 Let W be the alphabet W = {x, y, z} and consider the following four morphisms ρᵢ, 1 ≤ i ≤ 4, where

ρᵢ : W → V*, 1 ≤ i ≤ 4.


Next, we consider four partial operations on the set of trajectories, V*.

Definition 5.10 Let Oᵢ, 1 ≤ i ≤ 4, be the following partial operations on V*:

Oᵢ : V* × V* → V*, 1 ≤ i ≤ 4.

Let t, t′ be in V* and assume that |t| = n, |t|_r = p, |t|_u = q, |t′| = n′, |t′|_r = p′, |t′|_u = q′.

1. if n = P I , then

else, O₁(t, t′) is undefined.

2. if n = p', then

else, O₂(t, t′) is undefined.

3. if n = gr7 then 0 3 ( t 1 , t ) = P3(9cp'u (YPWtZQ)),

else, O₃(t, t′) is undefined.

4. if n = q17 then 0 4 ( t 1 , t ) = P 4 ( z p t ~ f ( Y p u J t Z q ) ) ,

else, O₄(t, t′) is undefined.

Definition 5.11 A set T of trajectories is stable under O-operations iff, for all t₁, t₂ ∈ T, whenever Oᵢ(t₁, t₂) is defined, it follows that Oᵢ(t₁, t₂) ∈ T, 1 ≤ i ≤ 4.


Proposition 5.9 Let T be a set of trajectories. The following assertions are equivalent:

(i) T is an associative set of trajectories.

(ii) T is stable under O-operations.

5.6 Distributivity

For each set of trajectories T, the operation ш_T is distributive over union, both on the right and on the left. Hence, we obtain the following important result:

Proposition 5.10 If T is an associative set of trajectories and if T has a unit element, then for any alphabet Σ,

(𝒫(Σ*), ∪, ш_T, ∅, {λ})

is a semiring.

6 Properties related to concurrency

This section is devoted to the study of some interrelations between these new operations and the theory of concurrency.

6.1 Fairness

Fairness is a property of the parallel composition of processes that, roughly speaking, says that each action of a process is performed without too much delay with respect to performing actions from the other process. That is, the parallel composition is "fair" with both processes that are performed.

Definition 6.1 Let T ⊆ {r, u}* be a set of trajectories and let n be an integer, n ≥ 1. T has the n-fairness property iff, for all t ∈ T and for all t′ such that t = t′t″ for some t″ ∈ {r, u}*, it follows that ||t′|_r − |t′|_u| ≤ n.


Example 6.1 The balanced literal shuffle (ш_b) has the n-fairness property for all n, n ≥ 1.

The following operations: shuffle (ш), catenation (·), insertion (←), balanced insertion (←_b) do not have the n-fairness property for any n, n ≥ 1.

Definition 6.2 Let n be a fixed number, n ≥ 1. Define the language Fₙ as:

Fₙ = {t ∈ V* | ||t′|_r − |t′|_u| ≤ n, for all t′ such that t = t′t″, t″ ∈ V*}.

Remark 6.1 Note that a set T of trajectories has the n-fairness property if and only if T ⊆ Fₙ.
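Since the n-fairness condition bounds every prefix of a trajectory, a single left-to-right scan suffices to test membership in Fₙ. A small sketch (our own illustration, not from the paper):

```python
def has_n_fairness(t, n):
    """Check that | |t'|_r - |t'|_u | <= n for every prefix t' of t."""
    diff = 0                        # running value of |t'|_r - |t'|_u
    for v in t:
        diff += 1 if v == "r" else -1
        if abs(diff) > n:
            return False
    return True
```

A set T has the n-fairness property iff every trajectory in T passes this test; for instance, ruru is 1-fair while rrrr is not.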

Proposition 6.1 For every n, n 2 1, the language Fn is a regular language.

Corollary 6.1 Let T be a set of trajectories. If T is a context-free language and if n is fixed, n ≥ 1, then it is decidable whether or not T has the n-fairness property.

Remark 6.2 The fairness property is not a property of the set Ψ(T), as one can observe in the case when T is the set

T = {rⁱuⁱ | i ≥ 1}.

Indeed, T does not have the n-fairness property for any n, n ≥ 1, despite the fact that Ψ(T) is the first diagonal.

Proposition 6.2 The fairness property is preserved in the transition to the commutative closure.

Proposition 6.3 The fairness property is not preserved by applying the associative closure.

6.2 On parallelization of languages using shuffle on trajecto- ries

The parallelization of a problem consists in decomposing the problem in subproblems, such that each subproblem can be solved by a processor, i.e., the subproblems are solved in parallel and, finally, the partial results are


collected and assembled in the answer of the initial problem by a processor. Solving problems in this way increases the time efficiency. It is known that not every problem can be parallelized. Also, no general methods are known for the parallelization of problems.

Here we formulate the problem in terms of languages and shuffle on trajectories. Also we present some examples.

Assume that L is a language. The parallelization of L consists in finding languages L₁, L₂ and T, T ⊆ V*, such that L = L₁ ш_T L₂ and, moreover, the complexity of L₁, L₂ and T is in some sense smaller than the complexity of L. In the sequel the complexity of a language L refers to the Chomsky class of L, i.e., regular languages are less complex than context-free languages, which are, in turn, less complex than context-sensitive languages.

It is easy to see that every language L, L ⊆ {a, b}*, can be written as L = a* ш_T b* for some set T of trajectories. However, this is not a parallelization of L, since the complexity of T is the same as the complexity of L.

In view of Corollary 3.1 there are non-context-free languages L such that L = L₁ ш_T L₂ for some context-free languages L₁, L₂ and T. Moreover, one of those three languages can even be a regular language. Note that this is a parallelization of L.

As a first example we consider the non-context-free language L ⊆ {a, b, c}*,

L = {w | |w|_a = |w|_b = |w|_c}.

Consider the languages: L₁ ⊆ {a, b}*, L₁ = {u | |u|_a = |u|_b}, L₂ = c* and T = {t | |t|_r = 2|t|_u}.

One can easily verify that L = L₁ ш_T L₂. Moreover, note that L₁ and T are context-free languages, whereas L₂ is a regular language. Hence this is a parallelization of L. As a consequence of Corollary 3.1, one cannot expect a significant improvement of this result, for instance to have only one context-free language and two regular languages in the decomposition of L.

Finding characterizations of those languages that have a parallelization remains a challenging problem.

7 Conclusion

The operations of splicing on routes and its special case, shuffle on trajectories, offer a new topic of research in the area of parallel composition of words. For the special case of infinite words the reader is referred to [5].


Acknowledgement. The author is grateful to Kyoto Sangyo University, Professor Masami Ito, and to Rovira i Virgili University, Tarragona, Professor Carlos Martín-Vide, for offering all conditions to write this paper. Also, many thanks to the anonymous referee for his valuable comments.

References

[1] N. Francez, Fairness, Springer-Verlag, Berlin, 1986.

[2] J. S. Golan, The Theory of Semirings with Applications in Mathematics and Theoretical Computer Science, Longman Scientific and Technical, Harlow, Essex, 1992.

[3] W. Kuich and A. Salomaa, Semirings, Automata, Languages, EATCS Monographs on Theoretical Computer Science, Springer-Verlag, Berlin, 1986.

[4] A. Mateescu, "Splicing on Routes: a Framework of DNA Computation", Unconventional Models of Computation, UMC'98, Auckland, New Zealand, C.S. Calude, J. Casti and M.J. Dinneen (eds.), Springer, 1998, 273-285.

[5] A. Mateescu and G.D. Mateescu, "Associative shuffle of infinite words", Structures in Logic and Computer Science, eds. J. Mycielski, G. Rozenberg and A. Salomaa, Lecture Notes in Computer Science (LNCS) 1261, Springer, 1997, 291-307.

[6] A. Mateescu, G. Rozenberg and A. Salomaa, "Shuffle on Trajectories: Syntactic Constraints", Theoretical Computer Science, TCS, 197, 1-2 (1998), 1-56, Fundamental Study.

[7] A. Mateescu, Gh. Păun, G. Rozenberg and A. Salomaa, "Simple splicing systems", Discrete Applied Mathematics, 84, 1-3 (1998), 145-163.

[8] Handbook of Formal Languages, eds. G. Rozenberg and A. Salomaa, Springer, 1997.

Characterization of valuation rings and valuation semigroups by semistar-operations

Ryûki Matsuda Ibaraki University

Mito, Japan Email: [email protected]

Let L be a commutative ring which coincides with its total quotient ring; that is, L = {a/b | a, b ∈ L and b is a non-zerodivisor of L}. Let Γ be a totally ordered abelian (additive) group. A mapping v of L onto Γ ∪ {∞} is called a valuation on L if v(ab) = v(a) + v(b) and v(a + b) ≥ inf{v(a), v(b)} for all elements a and b of L. The subring {a ∈ L | v(a) ≥ 0} of L is called the valuation ring associated to v. Valuation rings are among the most important notions and tools in commutative ring theory.
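As a concrete illustration of these axioms (my own example, not part of the talk), the 5-adic valuation on the rationals satisfies both conditions, and its valuation ring consists of the fractions whose denominator is prime to 5:

```python
from fractions import Fraction
from itertools import product
from math import inf

def vp(x, p=5):
    """The p-adic (here 5-adic) valuation on Q: the exponent of p in x."""
    if x == 0:
        return inf
    x = Fraction(x)
    num, den, k = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        k += 1
    while den % p == 0:
        den //= p
        k -= 1
    return k

# The two valuation axioms, checked on a sample of non-zero rationals:
samples = [Fraction(n, d) for n in range(-6, 7) if n != 0 for d in (1, 2, 5, 25)]
for a, b in product(samples, repeat=2):
    assert vp(a * b) == vp(a) + vp(b)
    assert vp(a + b) >= min(vp(a), vp(b))   # vp(0) = inf, so this covers a + b == 0 too

# The associated valuation ring {x | vp(x) >= 0}: 3/2 belongs to it, 1/5 does not.
assert vp(Fraction(3, 2)) >= 0 and vp(Fraction(1, 5)) < 0
```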

The aim of this talk is to characterize valuation rings and valuation semigroups by semistar-operations. In §0 we state relationships between commutative ring theory and commutative semigroups (explicitly, grading monoids). In §1 we are concerned with a commutative ring without zerodivisors (that is, an integral domain) and a grading monoid. The results of §1 appeared in [M6] and [M7]. In §2 we generalize the results for integral domains to commutative rings with zerodivisors. The results of §2 appeared in [M8].

§0. Commutative ring theory and commutative semigroups

In this section, we will state relationships between commutative semigroups and commutative ring theory.

Let G be a torsion-free abelian (additive) group, and let S be a subsemigroup of G which contains 0. Then S is called a grading monoid ([No]). We will call a grading monoid simply a g-monoid. For example, the direct sum Z0 ⊕ ... ⊕ Z0 of n copies of the non-negative integers Z0 is a g-monoid. Many terms in commutative ring theory may be defined analogously for S. For example, a non-empty subset I of S is called an ideal of S if S + I ⊆ I. Let I be an ideal of S with I ⊊ S. If s1 + s2 ∈ I (for s1, s2 ∈ S) implies s1 ∈ I or s2 ∈ I, then I is called a prime ideal of S. Let Γ be a totally ordered abelian (additive) group. A mapping v of a torsion-free abelian group G onto Γ is called a valuation on G if v(x + y) = v(x) + v(y) for all x, y ∈ G. The subsemigroup {x ∈ G | v(x) ≥ 0} of G is called the valuation semigroup of G associated to v. The maximum number n such that there exists a chain P1 ⊊ P2 ⊊ ... ⊊ Pn of prime ideals of S is called the dimension of S. If every ideal I of S is finitely generated, that is, I = ∪_i (S + s_i) for a finite number of elements s1, ..., sn of S, then S is called a Noetherian semigroup. Many propositions for commutative rings are known to hold for
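The ideal and prime-ideal conditions above can be experimented with on a finite window of the g-monoid Z0 (a sketch of my own; the window bound N is an artifact of the finite check, not of the theory):

```python
# A finite window {0, 1, ..., N-1} of the g-monoid S = Z0 = {0, 1, 2, ...}.
N = 40

def is_ideal(I, bound=N // 2):
    """S + I <= I, checked on sums that stay inside the window."""
    return all((s + i) in I for s in range(bound) for i in I if s + i < N)

def is_prime(I, bound=N // 2):
    """s1 + s2 in I implies s1 in I or s2 in I (checked inside the window)."""
    return all(s1 in I or s2 in I
               for s1 in range(bound) for s2 in range(bound)
               if (s1 + s2) in I)

P = set(range(1, N))    # the maximal ideal {1, 2, 3, ...} of Z0
I3 = set(range(3, N))   # the ideal {3, 4, 5, ...}

assert is_ideal(P) and is_prime(P)
assert is_ideal(I3) and not is_prime(I3)    # 1 + 2 lies in I3, but neither 1 nor 2 does
```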

S. For example, if S is a Noetherian semigroup, then every finitely generated extension g-monoid S[z1, ..., zn] = S + Σ_i Z0 z_i is also Noetherian [M4], and the integral closure of S is a Krull semigroup [M5]. Ideal theory of S is interesting in itself and important for semigroup rings. Let R be a commutative ring, and let S be a g-monoid. There arises the semigroup ring R[S] of S over R: R[S] = R[X; S] = {Σ_finite a_s X^s | a_s ∈ R, s ∈ S}. If S is the direct sum Z0 ⊕ ... ⊕ Z0 of n copies of Z0, then R[S] is isomorphic to the polynomial ring R[X1, ..., Xn] in n variables over R. Assume that the semigroup ring D[S] over a domain D is a Krull domain. Then D. F. Anderson [A] and Chouinard [C] showed that C(D[S]) ≅ C(D) ⊕ C(S), where C denotes the ideal class group. Thus they were able to make domains that have various ideal class groups. For another example, assume that D is integrally closed and S is integrally closed. Then we have (I1 ∩ ... ∩ In)^w = I1^w ∩ ... ∩ In^w for every finite number of finitely generated ideals I1, ..., In of D[S] if and only if (I1 ∩ ... ∩ In)^w = I1^w ∩ ... ∩ In^w for every finite number of finitely generated ideals I1, ..., In of D and (I1 ∩ ... ∩ In)^w = I1^w ∩ ... ∩ In^w for every finite number of finitely generated ideals I1, ..., In of S ([M3]), where w is the w-operation. For references on the ideal theory of S and R[S] we refer to [G].

§1. Valuation semigroups and valuation domains

Let D be an integral domain with quotient field K. Let F(D) be the set of non-zero fractional ideals of D. A mapping I ↦ I* of F(D) to F(D) is called a star-operation on D if for all a ∈ K − {0} and I, J ∈ F(D): (1) (a)* = (a); (2) (aI)* = aI*; (3) I ⊆ I*; (4) if I ⊆ J, then I* ⊆ J*; and (5) (I*)* = I*. Let Σ(D) be the set of star-operations on D.

Let F′(D) be the set of non-zero D-submodules of K. A mapping I ↦ I* of F′(D) to F′(D) is called a semistar-operation on D if for all a ∈ K − {0} and I, J ∈ F′(D): (1) (aI)* = aI*; (2) I ⊆ I*; (3) if I ⊆ J, then I* ⊆ J*; and (4) (I*)* = I*. Let Σ′(D) be the set of semistar-operations on D.

[H, Lemma 5.2] and [AA, Proposition 12] showed: Let V be a non-trivial valuation ring on a field, and M its maximal ideal. If M is principal, then |Σ(V)| = 1. If M is not principal, then |Σ(V)| = 2.

[OM, Theorem 48] showed: A domain D is a discrete valuation ring of dimension 1 if and only if |Σ′(D)| = 2.

[MS, Corollary 6] showed: Let D be an integrally closed quasi-local domain with dimension n. Then D is a valuation ring if and only if n + 1 ≤ |Σ′(D)| ≤ 2n + 1.

Let S be a g-monoid with quotient group G; G = {s − s′ | s, s′ ∈ S}. Let F(S) be the set of fractional ideals of S. A mapping I ↦ I* of F(S) to F(S) is called a star-operation on S if for all a ∈ G and I, J ∈ F(S): (1) (a)* = (a);

(2) (a + I)* = a + I*; (3) I ⊆ I*; (4) if I ⊆ J, then I* ⊆ J*; (5) (I*)* = I*. For example, if we set I^d = I for each I ∈ F(S), then d is a star-operation on S, which is called the d-operation on S. Let I^v be the intersection of the principal fractional ideals containing I; then v is a star-operation on S, which is called the v-operation on S. Let Σ(S) be the set of star-operations on S.

Let F′(S) be the set of non-empty subsets I of G such that S + I ⊆ I. A mapping I ↦ I* of F′(S) to F′(S) is called a semistar-operation on S if for all a ∈ G and I, J ∈ F′(S): (1) (a + I)* = a + I*; (2) I ⊆ I*; (3) if I ⊆ J, then I* ⊆ J*; (4) (I*)* = I*. For example, if we set I^{d′} = I for each I ∈ F′(S), then d′ is a semistar-operation on S, which is called the d′-operation on S. For each I ∈ F′(S), we set I^{v′} = I^v if I ∈ F(S), and set I^{v′} = G if I ∉ F(S). Then v′ is a semistar-operation on S, which is called the v′-operation on S. Let Σ′(S) be the set of semistar-operations on S.

Let V be a valuation semigroup. If its value group is discrete, V is called a discrete valuation semigroup. In this section, we will prove the following four Theorems:

Theorem 1. Let S be a g-monoid with dimension n. Then S is a discrete valuation semigroup if and only if |Σ′(S)| = n + 1.

Theorem 2. Let V be a valuation semigroup of dimension n, v its valuation and Γ its value group. Let M = Pn ⊋ Pn−1 ⊋ ... ⊋ P1 be the prime ideals of V, and let (0) ⊊ Hn−1 ⊊ ... ⊊ H1 ⊊ Γ be the convex subgroups of Γ. Let m be a positive integer such that n + 1 ≤ m ≤ 2n + 1. Then the following are equivalent:

(1) |Σ′(V)| = m.

(2) The maximal ideal of the g-monoid V_{P_i} = {s − t | s ∈ V, t ∈ V − P_i} is principal for exactly 2n + 1 − m of the i.

(3) The ordered abelian group Γ/H_i has a minimal positive element for exactly 2n + 1 − m of the i.

Theorem 3. Let D be a domain with dimension n. Then D is a discrete valuation ring if and only if |Σ′(D)| = n + 1.

Theorem 4. Let V be a valuation domain of dimension n, v its valuation and Γ its value group. Let M = Pn ⊋ Pn−1 ⊋ ... ⊋ P1 ⊋ (0) be the prime ideals of V, and let (0) ⊊ Hn−1 ⊊ ... ⊊ H1 ⊊ Γ be the convex subgroups of Γ. Let m be a positive integer such that n + 1 ≤ m ≤ 2n + 1. Then the following are equivalent:

(1) |Σ′(V)| = m.

(2) The maximal ideal of V_{P_i} is principal for exactly 2n + 1 − m of the i.

(3) Γ/H_i has a minimal positive element for exactly 2n + 1 − m of the i.

Lemma 1. (1) Let * be a semistar-operation on a g-monoid S. Then, for all I, J E F'(S) we have (I + J)* = (I* + J*)*.

(2) Let * be a semistar-operation on S. If R is an oversemigroup of S, that is, if R is a subsemigroup of the quotient group of S containing S, then R* is an oversemigroup of S.

(3) Let R be an oversemigroup of S, and let * ∈ Σ′(S). If we set J^{α(*)} = J* for each J ∈ F′(R), then α(*) is a semistar-operation on R.

(4) Let R be an oversemigroup of S, and let * ∈ Σ′(R). If we set I^{δ(*)} = (I + R)* for each I ∈ F′(S), then δ(*) is a semistar-operation on S.

(5) Let V be a valuation semigroup on a torsion-free abelian group G. Then we have F′(V) = F(V) ∪ {G}.

We call α(*) of Lemma 1(3) the ascent of * to R. We call δ(*) of Lemma 1(4) the descent of * to S.

Lemma 2. Let V be a valuation semigroup on a torsion-free abelian group with maximal ideal M.

(1) If M is principal, then |Σ(V)| = 1.

(2) If M is not principal, then |Σ(V)| = 2.

For the proof, assume that M = (p) is principal. Let I ∈ F(V), and let x ∉ I. Then we have I ⊆ (x + p) and x ∉ (x + p). It follows that I = I^v and Σ(V) = {d}. If M is not principal, we have v ≠ d. Let * be a star-operation on V, and let I ∈ F(V) be such that I ⊊ I*. Take an element b ∈ I* − I. Then we have (b) ⊆ I* ⊆ I^v ⊆ (b). Hence I^v = I* and * = v.

Lemma 3. A g-monoid S is a discrete valuation semigroup of dimension 1 if and only if |Σ′(S)| = 2.

For the proof, let G be the quotient group of S, and assume that S is a discrete valuation semigroup of dimension 1 with maximal ideal M. Let * be a semistar-operation on S. If S* = G, then * = e. If S* ⊊ G, we have S* = S. Lemma 2 implies that Σ′(S) = {d′, e}. Conversely, if |Σ′(S)| = 2, then Σ(S) = {d}. Suppose that there exists a valuation oversemigroup V such that S ⊊ V ⊊ G. Let d″ be the identity mapping on F′(V), and let * be the descent of d″ to S. Then Σ′(S) has at least three members d′, e, *; a contradiction. Then M is principal by Lemma 2.

An element a of the quotient group of S is called integral over S if na ∈ S for some positive integer n. The set S̄ of integral elements over S is called the integral closure of S.

Lemma 4. If S is a valuation semigroup of dimension n, then n + 1 ≤ |Σ′(S)| ≤ 2n + 1. If |Σ′(S)| < ∞, then S̄ is a valuation semigroup.

For the proof, let G be the quotient group of S, and let S be a valuation semigroup of dimension n. Let M = Pn ⊋ Pn−1 ⊋ ... ⊋ P1 be a chain of prime ideals of S. Let d′_i be the identity mapping on F′(V_{P_i}), and let *_i be the descent of d′_i to S. Then Σ′(S) contains at least n + 1 members d′, *_1, ..., *_{n−1}, e. Suppose that |Σ′(S)| ≤ 2k + 1 for k = 1, 2, ..., n − 1, and let * ∈ Σ′(V). If V* = G, then * = e. If V* = V, then * = d′ or * = v′. If V ⊊ V*, we have V* = V_{P_i} for some i. Then * is the descent to V of the semistar-operation α(*) on V_{P_i}, and hence |Σ′(S)| ≤ 2n + 1.

Suppose that S̄ is not a valuation semigroup. Then there exists an element u ∈ G such that u ∉ S̄ and −u ∉ S̄. Then we have S[u] ⊋ S[2u] ⊋ ... ⊋ S. It follows that |Σ′(S)| = ∞; a contradiction.

Lemma 5. Let V be a valuation semigroup with value group Γ. Let P be a prime ideal of V, and let H be the associated convex subgroup of Γ. Then Γ/H is the value group of V_P.

For the proof, let w be the composition of v and the canonical mapping Γ → Γ/H. Then w is a valuation on G (the quotient group of S) with value group Γ/H. The valuation semigroup of w is V_P.

Lemma 6. Let V be a discrete valuation semigroup of dimension n. Then we have |Σ′(V)| = n + 1.

For the proof, let M = Pn ⊋ Pn−1 ⊋ ... ⊋ P1 be a chain of prime ideals of V. Set U_i = V_{P_i} for each i. Then U_i is a discrete valuation semigroup of dimension i by Lemma 5. For each i, let d′_i be the identity mapping on F′(U_i), and let *_i be the descent of d′_i to V. We show that Σ′(V) = {e, *_1, ..., *_{n−1}, d′} by induction on n. If n = 1, the assertion holds by Lemma 3. Suppose that the assertion holds for each i < n. We have Σ′(U_i) = {α(e), α(*_1), ..., α(*_{i−1})} by the induction hypothesis, where α is the ascent mapping from V to U_i. Then Lemmas 1 and 2 complete the proof.

Lemma 7. Let S be a g-monoid of dimension n with |Σ′(S)| = n + 1. Then S is a discrete valuation semigroup.

Proof. There exists a chain of prime ideals of S: S ⊋ M = Pn ⊋ Pn−1 ⊋ ... ⊋ P1. Let d′_i be the identity mapping on F′(S_{P_i}) for each i, and let *_i be the descent of d′_i to S for each i. By the assumption we have Σ′(S) = {e, *_1, ..., *_{n−1}, d′}. Lemma 4 implies that S is a valuation semigroup. Then U_i = S_{P_i} is a valuation semigroup of dimension i for each i. Then Lemmas 2 and 3 imply that S is discrete, by induction on n.

Theorem 1 follows from Lemmas 6 and 7.

Let S be a g-monoid of dimension n with n + 1 ≤ |Σ′(S)| ≤ 2n + 1. Then S is not necessarily integrally closed. For example, let S = {0, 2, 3, 4, ...}. Then S is a 1-dimensional g-monoid with |Σ′(S)| = 3. But S is not integrally closed.
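The example can be verified directly (a small sketch of my own, truncating S to a finite window): 1 is integral over S, since 2·1 = 2 ∈ S, yet 1 ∉ S, and in fact every element of Z0 is integral over S.

```python
# S = {0, 2, 3, 4, ...}: Z0 with 1 removed, truncated to a finite window.
N = 30
S = {0} | set(range(2, N))

def is_integral_over(a, S, max_n=10):
    """a is integral over S if n*a lies in S for some positive integer n."""
    return any(n * a in S for n in range(1, max_n) if n * a < N)

# 1 is integral over S (2 * 1 = 2 is in S) although 1 is not in S,
# so S is not integrally closed; every a in Z0 is integral, so the closure is Z0.
assert is_integral_over(1, S) and 1 not in S
assert all(is_integral_over(a, S) for a in range(N // 2))
```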

Proof of Theorem 2. Assume that the maximal ideal of V_{P_i} is principal for exactly 2n + 1 − m of the i. We show that |Σ′(V)| = m by induction on n. Suppose that the assertion holds for n − 1. Set W = V_{P_{n−1}}.

The case that M is not principal: Since the assertion holds for W, and since 2n + 1 − m = 2(n − 1) + 1 − (m − 2), we have |Σ′(W)| = m − 2. Let Σ′(W) = {*_1, ..., *_{m−2}}. Then δ(*_1), ..., δ(*_{m−2}), d′, v′ are distinct from each other by Lemma 2. Let * be a semistar-operation on V. If V* = V, then * = d′ or * = v′. If V* ⊋ V, we have V* = V_{P_i} for some i. Then * is the descent of the ascent α(*) of * to V_{P_i}, and hence |Σ′(V)| = m.

The case that M is principal: Since the assertion holds for W, and since 2n − m = 2(n − 1) + 1 − (m − 1), we have |Σ′(W)| = m − 1. Let Σ′(W) = {*_1, ..., *_{m−1}}. Then δ(*_1), ..., δ(*_{m−1}), d′ are distinct from each other. Let * be a semistar-operation on V. If V* = V, then * = d′ by Lemma 2. If V* ⊋ V, we have V* = V_{P_i} for some i. Then * is the descent of the ascent α(*) to V_{P_i}, and hence |Σ′(V)| = m. Since the value group of V_{P_i} is Γ/H_i, it has a minimal positive element for exactly 2n + 1 − m of the i. The proof is complete.

The proof of Theorem 3 (resp. 4) is a ring version of that of Theorem 1 (resp. 2).

§2. Valuation rings

Let R be a commutative ring. A non-zerodivisor of R is called a regular element of R. Let I be an R-submodule of the total quotient ring L of R. The set of regular elements of L contained in I is denoted by Reg(I). If Reg(I) ≠ ∅, then I is called regular. If each regular ideal I of R is generated

by Reg(I), then R is called a Marot ring. Throughout the section A denotes a Marot ring and K its total quotient ring. The aim of this section is to generalize the results of §1 to Marot rings.

The total quotient ring of the ring R is denoted by q(R). Let L = q(R), and let F(R) be the set of non-zero fractional ideals of the ring R. If I ↦ I* is a mapping of F(R) into itself which satisfies the following conditions, it is called a star-operation on R: (1) (a)* = (a) for each regular element a of L; (2) (aI)* = aI* for each regular element a of L and each I ∈ F(R); (3) I ⊆ I* for each I ∈ F(R); (4) I ⊆ J implies I* ⊆ J* for each I and J of F(R); (5) (I*)* = I* for each I ∈ F(R).

Let F′(R) be the set of non-zero R-submodules of L. If I ↦ I* is a mapping of F′(R) into itself which satisfies the following conditions, it is called a semistar-operation on R: (1) (aI)* = aI* for each regular element a of L and each I ∈ F′(R); (2) I ⊆ I* for each I ∈ F′(R); (3) I ⊆ J implies I* ⊆ J* for each I and J of F′(R); (4) (I*)* = I* for each I ∈ F′(R).

The set of star-operations (resp. semistar-operations) on R is denoted by Σ(R) (resp. Σ′(R)). Let *, *′ be star-operations (resp. semistar-operations) on the ring R. If I* = I*′ for every regular member I of F(R) (resp. F′(R)), then * and *′ are called similar, denoted by * ∼ *′. If * and *′ are similar star-operations (resp. semistar-operations) on a domain, then * = *′. The relation ∼ is an equivalence relation on Σ(R) (resp. Σ′(R)), and the set of equivalence classes is denoted by Σ(R)/∼ (resp. Σ′(R)/∼). If D is a domain, then |Σ(D)| = |Σ(D)/∼| and |Σ′(D)| = |Σ′(D)/∼|. The equivalence class of * is denoted by [*].

A regular prime ideal is also called an r-prime ideal. The maximal number n such that there exists a chain of r-prime ideals of the ring R: (R ⊋) Pn ⊋ Pn−1 ⊋ ... ⊋ P1 is called the r-dimension of R.

Remark 1. |Σ′(R)/∼| = 1 if and only if R = q(R), or equivalently r-dim(R) = 0.

For each non-empty subset I of L = q(R), the subset {x ∈ L | xI ⊆ R} of L is denoted by I^{−1}. And (I^{−1})^{−1} is denoted by I^v for each I ∈ F(R). A fractional ideal I of R is called divisorial if I^v = I.

Remark 2. |Σ(A)/∼| = 1 if and only if every regular ideal of A is divisorial.

For, if I is a regular ideal of A, then I^v is the intersection of the regular principal fractional ideals which contain I.

A regular maximal ideal is also called an r-maximal ideal. If each finitely generated regular ideal is invertible, then the ring is called a Prüfer ring.

Theorem 1 ([M2, (6.5) Theorem]). Let A be integrally closed. Then every regular ideal of A is divisorial if and only if: A is a Prüfer ring, every r-maximal ideal of A is finitely generated, every r-prime ideal is contained in a unique maximal ideal, and every regular ideal has only a finite number of minimal prime ideals.

Lemma 1. If A is a valuation ring, then F'(A) = F(A) U {K} .

Lemma 2. Let A be a valuation ring with r-dimension > 0. Then A has a unique maximal ideal, namely A is a quasi-local ring.

For, the value group of A is non-zero, and the subset {x ∈ A | v(x) > 0} of A is the unique maximal ideal.

Remark 3. (1) If R is a valuation ring with value group 0, then r-dim(R) = 0.

(2) If r-dim(R) = 0, R need not be quasi-local.

For (2), let k be a field, and let R be the direct sum of n copies of k. Then r-dim(R) = 0, and R has n maximal ideals.

Let v be a valuation on a total quotient ring K, and let I be an ideal of its valuation ring. If v(x) = ∞ for each x ∈ I, then I is called an ∞-ideal. (0) is an ∞-ideal. There exists a unique maximal ∞-ideal, v^{−1}(∞).

The identity mapping on F(R) is a star-operation, called the d-operation. The mapping I ↦ I^v on F(R) is a star-operation, called the v-operation. The identity mapping on F′(R) is a semistar-operation, called the d′-operation. The mapping I ↦ I^v on F′(R) is a semistar-operation, called the v′-operation. The mapping I ↦ q(R) on F′(R) is a semistar-operation, called the e-operation.

Remark 4. Let A be a valuation ring with r-dimension > 0. If A has three distinct ∞-ideals, then d′, v′ and e are distinct.

For, if I is an ∞-ideal of A, then I^{−1} = K, and K^{−1} is the maximal ∞-ideal.

Theorem 2. Let V be a Marot valuation ring of r-dimension > 0, and let M be its maximal ideal.

(1) If M is principal, then |Σ(V)/∼| = 1.

(2) If M is not principal, then |Σ(V)/∼| = 2.

Proof. (1) Let M = xV, let I be a regular ideal of V, and let I′ = I^v. If y ∉ I is a regular element, then I is properly contained in yV. Hence I ⊆ xyV ⊊ yV. Thus I′ ⊆ yxV and y ∉ I′. We conclude that I = I′. Remark 2 implies that |Σ(V)/∼| = 1.

(2) Assume that I is a regular ideal of V which is not divisorial. Take a regular element a ∈ I^v − I. Then a^{−1}I ⊆ M. If a^{−1}I ⊊ M, take a regular element b ∈ M − a^{−1}I. Then I ⊆ abV ⊆ aM, and (a) ⊆ I^v ⊆ (ab) ⊆ aM ⊊ (a); a contradiction. Hence a^{−1}I = M, and hence |Σ(V)/∼| = 2.

Proposition 1. Let V be a Z-valued Marot valuation ring. Then |Σ′(V)/∼| = 2.

For the proof, let M = (p) be the maximal ideal of V. We have F′(V) = {K} ∪ {p^n V | n ∈ Z} ∪ {∞-ideals}, where K = q(V). Let * be a semistar-operation on V. If V* = K, then (p^n V)* = p^n V* = K. Hence * ∼ e. If V* = p^n V for some n, since (V*)* = V*, we have n = 0. It follows that * ∼ d′. Therefore |Σ′(V)/∼| = 2.

Remark 5. If |Σ′(A)/∼| = 2, then Σ′(A)/∼ = {[d′], [e]}.

Remark 6. Let V be a Z-valued Marot valuation ring. Then |Σ′(V)| need not be 2.

For example, let k be a field, w a Z-valued valuation on k, and W its valuation ring. Let M be a vector space over k with dim(M) > 1, and let K = k ⊕ M be the semidirect product of k and M ([Na]). For each element a + x (a ∈ k and x ∈ M) of K, we set v(a + x) = w(a). Then v is a Z-valued valuation on K. The valuation ring V of v equals W + M. We see that V is a Marot ring, M is an ∞-ideal of V, and d′, v′ and e are distinct by Remark 4. Hence |Σ′(V)| ≠ 2.

Let D be a Bezout domain, and M a D-module. Then the semidirect product D ⊕ M is a Marot ring ([M1, Remark 13]).

We may canonically define partial orders ≤ on Σ(R) and on Σ′(R). For a domain D, Anderson-Anderson [AA] considered the partial order ≤ on Σ(D). We define the partial order ≤ on Σ(R) (resp. Σ′(R)) by *_1 ≤ *_2 if and only if I^{*_1} ⊆ I^{*_2} for every I ∈ F(R) (resp. F′(R)).

Remark 7. (1) The partial order ≤ on Σ′(D) need not be a total order. (2) On Σ′(R), d′ is the smallest element and e is the greatest element. (3) On Σ(R), d is the smallest element. (4) On Σ(D), v is the greatest element.

Example for (1): Let D be a domain with overrings D1 and D2 which are incomparable. For each I ∈ F(D), set I^{*_1} = ID1 and I^{*_2} = ID2. Then *_1 and *_2 are incomparable.

Remark 8. If |Σ′(A)| = 2, then r-dim(A) > 0 need not hold.

For example, let k be a field with |k| = 2, and let M be a k-module with |M| = 2. Then the semidirect product A = k ⊕ M is a Marot ring, and |Σ′(A)| = 2. However, r-dim(A) = 0.

We note that each overring of a Marot ring is a Marot ring. And A is integrally closed if and only if A is the intersection of the set of valuation overrings of A ([HH, Corollary 2.4]).

Lemma 3. If |Σ′(A)/∼| = 2, then A is a Z-valued valuation ring.

Proof. A is a valuation ring. Since v′ is not similar to e, each regular fractional ideal of A is divisorial. By Theorem 1, the r-dimension of A is 1, and the maximal ideal M is principal, say M = (p). Suppose that A is not a Z-valued valuation ring. Let v be the valuation associated to A, and Γ its value group. There exists α ∈ Γ such that nv(p) < α for every n ∈ N. Choose x ∈ K such that v(x) = −α. We have x = a/b for a, b ∈ A with b regular. It follows that α ≤ v(b). Therefore ∩_n M^n is an r-prime ideal of A, and hence r-dim(A) > 1; a contradiction.

Proposition 1 and Lemma 3 imply the following.

Theorem 3. A is a Z-valued valuation ring if and only if |Σ′(A)/∼| = 2.

Let P be a prime ideal of the ring R. Then the overring {x ∈ q(R) | sx ∈ R for some s ∈ R − P} is denoted by R_{[P]}.

Lemma 4. Let A be a valuation ring of r-dimension n. Then n + 1 ≤ |Σ′(A)/∼|.

For the proof, let M = Pn ⊋ ... ⊋ P1 be the r-prime ideals of A. Then we have A_{[P_1]} ⊋ ... ⊋ A_{[P_{n−1}]} ⊋ A. If we set I^{*_i} = I A_{[P_i]} for each I ∈ F′(A), there arise semistar-operations *_1, ..., *_{n+1}.

Lemma 5. Let V be a Marot valuation ring of r-dimension 1. If the maximal ideal is principal, then V is Z-valued and |Σ′(V)/∼| = 2.

For, each regular ideal of V is divisorial by Theorem 2. The proof of Lemma 3 shows that V is Z-valued, and then |Σ′(V)/∼| = 2 by Proposition 1.

Lemma 6. Let V be a Marot valuation ring of r-dimension n + 1, and let W be its valuation overring of r-dimension n, for a positive integer n. Let * ∈ Σ′(V) be such that * is not similar to any member of δ_{W,V}(Σ′(W)). Then * is similar to d′ or v′.

Proof. If V* = V, then we have * ∼ d′ or * ∼ v′ by Theorem 2. Thus V* ⊋ V and W ⊆ V*. Let I be a regular ideal of V. Then IW ⊆ IV* ⊆ I*. Hence I* = (IW)*. It follows that * ∈ δ_{W,V}(Σ′(W)).

Proposition 2. Let V be a Marot valuation ring with r-dimension n. Then n + 1 ≤ |Σ′(V)/∼| ≤ 2n + 1.

Proof. We have n + 1 ≤ |Σ′(V)/∼| by Lemma 4. Assume that n = 1, and let M be the maximal ideal of V. If M is principal, then |Σ′(V)/∼| = 2 by Lemma 5. If M is not principal, then |Σ′(V)/∼| = 3 by Theorem 2(2). Then, repeating Lemma 6 step by step for a valuation ring V with r-dimension n, we have the required inequality.

We say that a ring R has property (U) if each regular ideal of R is a (set-theoretical) union of regular principal ideals. We say that a ring R has property (FU) if Reg(I) ⊆ ∪_1^n I_i implies I ⊆ ∪_1^n I_i for each family of a finite number of regular ideals I, I_1, ..., I_n. If R has property (U), then R has property (FU); and if R has property (FU), then R is a Marot ring. If R is a Marot ring, then R need not have property (FU); and if R has property (FU), then R need not have property (U) ([M1]).

Lemma 7 ([M2, (2.10) Lemma]). Let V_1, ..., V_n be valuation overrings of A such that A = V_1 ∩ ... ∩ V_n. Then, if A has property (U), A is a Prüfer ring.

Theorem 4. Assume that A has property (U) and is integrally closed. If |Σ′(A)/∼| < ∞, then A is a Prüfer ring with only finitely many r-prime ideals. Furthermore, if, in addition, A has a unique r-maximal ideal, then A is a valuation ring.

Proof. A can be written in the form A = ∩_{λ∈Λ} V_λ, where the V_λ are valuation overrings of A. Let d′_λ be the identity mapping of F′(V_λ), and let *_λ be the descent of d′_λ to A. Then *_{λ_1} and *_{λ_2} are not similar for λ_1 ≠ λ_2. It follows that |Λ| < ∞. Then A is a Prüfer ring by Lemma 7. Let {P_i | i} be the set of r-prime ideals of A. Set I^{*_i} = I A_{[P_i]} for each I ∈ F′(A). Then *_{i_1} and *_{i_2} are not similar for i_1 ≠ i_2. Hence A has only finitely many r-prime ideals.

Corollary 1. Assume that A has property (U), is integrally closed, and has a unique r-maximal ideal with r-dimension n. Then A is a valuation ring if and only if n + 1 ≤ |Σ′(A)/∼| ≤ 2n + 1.

Lemma 8. Let V be a Marot valuation ring with value group Γ. Let P be an r-prime ideal of V, and let H be the associated convex subgroup of Γ. Then Γ/H is the value group of V_{[P]}.

Proof. Let w be the composition of v and the canonical map Γ ∪ {∞} → Γ/H ∪ {∞}. Then w is a valuation on K with value group Γ/H. Let x ∈ K with w(x) ≥ 0. There exists an element a ∈ V such that v(x) − v(a) ∈ H. Then there exists an element s ∈ V − P such that v(x) − v(a) = v(s) or v(x) − v(a) = −v(s). Hence x ∈ V_{[P]}. The proof is complete.

Let V be a valuation ring. If the value group Γ of V is discrete, then V is called discrete.

Theorem 5. Let A be a ring with r-dimension n. Then A is a discrete valuation ring if and only if |Σ′(A)/∼| = n + 1.

Using Theorems 2 and 3, the proof is similar to that of Theorem 3 of §1.

Theorem 6. Let V be a Marot valuation ring of r-dimension n, and Γ its value group. Let M = Pn ⊋ Pn−1 ⊋ ... ⊋ P1 be the r-prime ideals of V, and let (0) ⊊ Hn−1 ⊊ ... ⊊ H1 ⊊ Γ be the convex subgroups of Γ. Let m be a positive integer such that n + 1 ≤ m ≤ 2n + 1. Then the following are equivalent:

(1) |Σ′(V)/∼| = m.

(2) The maximal ideal of V_{[P_i]} is principal for exactly 2n + 1 − m of the i.

(3) Γ/H_i has a minimal positive element for exactly 2n + 1 − m of the i.

Using Theorem 2, the proof is similar to that of Theorem 4 of §1.

REFERENCES

[A] D. F. Anderson, The divisor class group of a semigroup ring, Comm. Alg. 8 (1980), 467-476.

[AA] D. D. Anderson and D. F. Anderson, Examples of star operations on integral domains, Comm. Alg. 18 (1990), 1621-1643.

[C] L. Chouinard, Krull semigroups and divisor class groups, Can. J. Math. 33 (1981), 1459-1468.

[G] R. Gilmer, Commutative Semigroup Rings, The Univ. Chicago Press, 1984.

[H] W. Heinzer, Integral domains in which each non-zero ideal is divisorial, Mathematika 15 (1968), 164-170.

[HH] G. W. Hinkle and J. A. Huckaba, The generalized Kronecker function ring and the ring R(X), J. Reine Angew. Math. 292 (1978), 25-36.

[M1] R. Matsuda, On Marot rings, Proc. Japan Acad. 60 (1984), 134-137.

[M2] R. Matsuda, Generalizations of multiplicative ideal theory to commutative rings with zerodivisors, Bull. Fac. Sci., Ibaraki Univ. 17 (1985), 49-101.

[M3] R. Matsuda, Torsion-free abelian semigroup rings IX, Bull. Fac. Sci., Ibaraki Univ. 26 (1994), 1-12.

[M4] R. Matsuda, Some theorems for semigroups, Math. J. Ibaraki Univ. 30 (1998), 1-7.

[M5] R. Matsuda, The Mori-Nagata Theorem for semigroups, Math. Japon. 49 (1999), 17-19.

[M6] R. Matsuda, Note on the number of semistar-operations, Math. J. Ibaraki Univ. 31 (1999), 47-53.

[M7] R. Matsuda, Note on valuation rings and semistar-operations, Comm. Alg. 28 (2000), 2515-2519.

[M8] R. Matsuda, A note on the number of semistar-operations, II, Far East J. Math. Sci. 2 (2000), 159-172.

[MS] R. Matsuda and T. Sugatani, Semistar-operations on integral domains, II, Math. J. Toyama Univ. 18 (1995), 155-161.

[Na] M. Nagata, Local Rings, Interscience, 1962.

[No] D. Northcott, Lessons on Rings, Modules and Multiplicities, Cambridge Univ. Press, 1968.

[OM] A. Okabe and R. Matsuda, Semistar-operations on integral domains, Math. J. Toyama Univ. 17 (1994), 1-21.

FURTHER RESULTS ON RESTARTING AUTOMATA

GUNDULA NIEMANN FRIEDRICH OTTO

Fachbereich Mathematik/Informatik, Universität Kassel, D-34109 Kassel, Germany

E-mail: {niemann,otto}@theory.informatik.uni-kassel.de

Jančar et al. (1995) developed the restarting automaton as a formal model for certain syntactical aspects of natural languages. Here it is shown that, with respect to its expressive power, the use of nonterminal symbols by restarting automata corresponds to the language theoretical operation of intersection with regular languages. Further, we establish another characterization of the class of Church-Rosser languages by showing that it coincides with the class of languages accepted by the det-RRWW-automata, thus extending an earlier result presented at DLT'99. Finally, we show that the Gladkij language L_Gl is accepted by an RRWW-automaton, which implies that the class GCSL of growing context-sensitive languages is properly contained in the class L(RRWW).

1 Introduction

Jančar et al. presented the restarting automaton, which is a nondeterministic machine model processing strings that are stored in lists. These automata model certain elementary aspects of the syntactical analysis of natural languages.

A restarting automaton, or R-automaton for short, has a finite control, and it has a read/write-head with a finite look-ahead working on a list of symbols (or a tape). As originally defined, it can perform two kinds of operations: a move-right step, which shifts the read/write-window one position to the right and possibly changes the actual state, and a restart step, which deletes some letters in the read/write-window, places this window over the left end of the list (or tape), and puts the automaton back into its initial state.

In subsequent papers Jančar and his co-workers extended the restarting automaton by introducing rewrite-steps that, instead of simply deleting some letters, replace the contents of the read/write-window by some shorter string over the input alphabet. This is the so-called RW-automaton. Later the restarting automaton was allowed to use auxiliary symbols, so-called nonterminals, in the replacement operation, which leads to the so-called RWW-automaton. Finally the restarting operation was separated from the rewriting operation, which yields the RRW-automaton. Obviously, these variations can be combined, giving the so-called RRWW-automaton.


Since one can put various additional restrictions on each of these variants of the restarting automaton, a potentially very large family of automata and corresponding language classes is obtained. For example, various notions of monotonicity have been defined, and it has been shown that the monotonous and deterministic RW-automata accept the deterministic context-free languages, and that the monotonous RWW-automata accept the context-free languages. However, the various forms of the non-monotonous deterministic restarting automaton were not investigated in detail until it was shown in [11] that the deterministic RWW-automata accept the Church-Rosser languages.

Here we continue this work by investigating some classes of deterministic restarting automata and their relationship to each other and to the corresponding nondeterministic restarting automata. As a general result we will see that the use of nonterminals in the rewriting operation of a restarting automaton corresponds, on the part of the language accepted, to the operation of taking the intersection with a regular language. Secondly, we will show that by separating the restarting operation from the rewriting operation the descriptive power of the deterministic restarting automaton is not increased, that is, the deterministic RRWW-automata still accept the Church-Rosser languages. It should be pointed out that for the general case of nondeterministic restarting automata it is an open question whether or not this separation increases the power of the restarting automaton. Further we will see that the Gladkij language L_Gl = { w#w^R#w | w ∈ {a,b}* } is accepted by some RRWW-automaton, which proves that the class GCSL of growing context-sensitive languages is properly contained in L(RRWW), as L_Gl is not growing context-sensitive [1,3].

This paper is structured as follows. In Section 2 we provide the definition of the restarting automaton and its variants. In Section 3 we analyze the effect of using auxiliary symbols in the rewrite operation, and in Section 4 we consider the inclusion relations between the language classes defined by the various classes of deterministic restarting automata. In the following section we consider the Gladkij language. The paper closes with some characterizations of the language classes considered through certain types of prefix-rewriting systems.

2 The restarting automaton and some of its variants

In this section we do not follow the historical development of the restarting automaton, but instead we introduce the most general version first and present the other variants as certain restrictions thereof.


A restarting automaton with rewriting, RRWW-automaton for short, is described by a 9-tuple M = (Q, Σ, Γ, δ, q0, ¢, $, F, H), where Q is a finite set of states, Σ is a finite input alphabet, Γ is a finite tape alphabet containing Σ, q0 ∈ Q is the initial state, ¢, $ ∈ Γ \ Σ are the markers for the left and right border of the work space, respectively, F ⊆ Q is the set of accepting states, H ⊆ Q \ F is the set of rejecting states, and

δ : (Q \ (F ∪ H)) × (Γ^{k+1} ∪ Γ^{≤k}·{$}) → 2^{(Q × ({MVR} ∪ Γ^{≤k})) ∪ {RESTART}}

is the transition relation. Here Γ^{≤n} := ⋃_{i=0}^{n} Γ^i, 2^S denotes the set of subsets of the set S, and k ≥ 1 is the look-ahead of M. The look-ahead is implicitly given through the transition relation, but it is an important parameter of M.

The transition relation consists of three different types of transition steps:

1. A move-right step is of the form (q′, MVR) ∈ δ(q, u), where q ∈ Q \ (F ∪ H), q′ ∈ Q and u ∈ Γ^{k+1} ∪ Γ^{≤k}·{$}, u ≠ $, that is, if M is in state q and sees the string u in its read/write-window, then it shifts its read/write-window one position to the right and enters state q′, and if q′ ∈ F ∪ H, then it halts, either accepting or rejecting.

2. A rewrite-step is of the form (q′, v) ∈ δ(q, u), where q ∈ Q \ (F ∪ H), q′ ∈ Q, u ∈ Γ^{k+1} ∪ Γ^{≤k}·{$}, v ∈ Γ^{≤k}, and |v| < |u|, that is, the contents u of the read/write-window is replaced by the string v, which is strictly shorter than u, and the state q′ is entered. Further, the read/write-window is placed immediately to the right of the string v. In addition, if q′ ∈ F ∪ H, then M halts, either accepting or rejecting. However, some additional restrictions apply in that the border markers ¢ and $ must neither disappear from the tape nor may new occurrences of these markers be created. Further, the read/write-window must not move across the right border marker $, that is, if u is of the form u1$, then v is of the form v1$, and after execution of the rewrite operation the read/write-window just contains the string $.

3. A restart-step is of the form RESTART ∈ δ(q, u), where q ∈ Q \ (F ∪ H) and u ∈ Γ^{k+1} ∪ Γ^{≤k}·{$}, that is, if M is in state q seeing u in its read/write-window, it can move its read/write-window to the left end of the tape, so that the first symbol it sees is the left border marker ¢, and it reenters the initial state q0.

Obviously, each computation of M proceeds in cycles. Starting from an initial configuration q0¢w$, the head moves right, while MVR- and rewrite-steps are executed, until finally a RESTART-step takes M back into a configuration of the form q0¢w1$. It is required that in each such cycle exactly one rewrite-step is executed. By ⊢c_M we denote the execution of a complete cycle, that is, the above computation will be expressed as q0¢w$ ⊢c_M q0¢w1$.

An input w ∈ Σ* is accepted by M if there exists a computation of M which starts with the initial configuration q0¢w$ and which finally reaches a configuration containing an accepting state qa ∈ F. By L(M) we denote the language accepted by M.
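To illustrate the cycle discipline just described (scan right, perform exactly one length-reducing rewrite, restart; accept only in an accepting state), the following is a small invented sketch of a deterministic RW-style automaton for { a^n b^n | n ≥ 0 }. It is an illustration of the model, not a construction from the paper: the rewrite rule aabb ↦ ab is applied once per cycle, and the short tapes ε and ab are accepted directly.

```python
# Hypothetical sketch: a deterministic RW-style restarting automaton for
# { a^n b^n | n >= 0 }. Each iteration of the while-loop is one cycle:
# scan for the factor "aabb", perform the single length-reducing rewrite
# "aabb" -> "ab", then RESTART. If no rewrite is possible, the automaton
# halts: it accepts exactly the irreducible members "" and "ab".

def accepts_anbn(word: str) -> bool:
    tape = word
    while True:
        if tape in ("", "ab"):        # halt in an accepting state
            return True
        i = tape.find("aabb")         # MVR until the window shows "aabb"
        if i == -1:                   # no rewrite applicable: halt, reject
            return False
        tape = tape[:i + 1] + tape[i + 3:]   # rewrite "aabb" -> "ab", RESTART
```

Note that the rewrite is error-preserving: deleting one a and one b at the a/b border keeps a word inside (or outside) the language, which is why the automaton needs no auxiliary symbols here.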

The following lemma can easily be proved by standard techniques from automata theory.

Lemma 2.1. Each RRWW-automaton M is equivalent to an RRWW-automaton M′ that satisfies the following additional restrictions:

(a) M' enters an accepting or a rejecting state only when it sees the right border marker $ in its read/write-window.

(b) M' makes a restart-step only when it sees the right border marker $ in its read/write-window.

This lemma means that in each cycle, and also during the last part of a computation, the read/write-window moves all the way to the right before a RESTART is made, respectively, before the machine halts.

By placing certain restrictions on the transition relation of an RRWW- automaton we get various subclasses. Here we are interested in the following restrictions and the corresponding language classes:

• An RRWW-automaton is deterministic if its transition relation is a (partial) function

δ : (Q \ (F ∪ H)) × (Γ^{k+1} ∪ Γ^{≤k}·{$}) → ((Q × ({MVR} ∪ Γ^{≤k})) ∪ {RESTART}).

• An RWW-automaton is an RRWW-automaton that performs a RESTART immediately after each rewrite-step.

• An (R)RW-automaton is an (R)RWW-automaton for which the tape alphabet Γ coincides with the set Σ ∪ {¢, $}, that is, such an automaton has no auxiliary tape symbols.

By combining these restrictions we obtain eight classes of automata. By L(C) we denote the class of languages that are accepted by the automata of class C. From the above definitions we immediately obtain the inclusions presented in Figure 1 below.


Figure 1. (Inclusion diagram relating the classes L(RW), L(RRW), L(RWW), L(RRWW) and the corresponding deterministic classes L(det-RW), L(det-RRW), L(det-RWW), L(det-RRWW).)

It is known that the inclusions L(RW) ⊂ L(RRW), L(RW) ⊂ L(RWW) and L(RRW) ⊂ L(RRWW) are proper, while it is an open problem whether or not the inclusion L(RWW) ⊆ L(RRWW) is proper. In addition, it is known that the class GCSL of growing context-sensitive languages is contained in L(RWW), but again it is open whether or not this inclusion is proper [11].

On the other hand, it has been shown that the class CRL of Church-Rosser languages coincides with the class L(det-RWW) [11].

It is immediate that Lemma 2.1 also holds for RRW-automata, for de- terministic RRW-automata, and for deterministic RRWW-automata. Hence, we have two groups of automata: those that make a RESTART only at the right end of the tape, and those that make a RESTART immediately after performing a rewrite-step.


3 A language-theoretical equivalent to the use of nonterminals


In this section we will see that for each of the classes of restarting automata introduced the use of auxiliary tape symbols corresponds to the operation of intersecting the language accepted with a regular set.

Theorem 3.1. A language L is accepted by a (deterministic) RRWW-automaton if and only if there exist a (deterministic) RRW-automaton M1 and a regular language R such that L = L(M1) ∩ R holds.

Proof. Let L ∈ L(RRWW), that is, there exists an RRWW-automaton M = (Q, Σ, Γ, δ, q0, ¢, $, F, H) accepting L. Hence, for all w ∈ Σ*, w ∈ L


iff q0¢w$ ⊢*_M ¢u qa v$ for some u, v ∈ Γ* and qa ∈ F. Let R := Σ*, and let M1 denote the RRW-automaton M1 = (Q, Γ \ {¢, $}, Γ, δ, q0, ¢, $, F, H). Then for each w ∈ (Γ \ {¢, $})* the following statements are equivalent:

w ∈ L(M1) ∩ R iff w ∈ Σ* and w is accepted by M1
iff w ∈ Σ* and q0¢w$ ⊢*_{M1} ¢u qa v$ for some u, v ∈ Γ*, qa ∈ F
iff w ∈ L(M) = L.

Thus, L = L(M1) ∩ R.

Conversely, let M1 = (Q, Γ \ {¢, $}, Γ, δ, q0, ¢, $, F, H) be an RRW-automaton, and let R be a regular language over Γ. By Lemma 2.1 we can assume without loss of generality that M1 moves its read/write-window to the right end of the tape before making a RESTART-step and before accepting or rejecting. From M1 we construct an RRWW-automaton M as follows.

M behaves essentially like the RRW-automaton M1. However, while reading the tape contents from left to right it internally simulates a deterministic finite-state acceptor for R. When it makes a rewrite-step, it marks a letter a of the actual tape contents by replacing it by a copy ā as an integral part of the rewrite operation. In this way M indicates that it has read the tape. This marking of symbols necessitates an extension of the size of the look-ahead, as the RRW-automaton M1 may replace a string u by the empty string λ. When M's read/write-window reaches the right border marker $ without having encountered a marked letter, which means that M is still in the first cycle, then M halts and rejects if the tape contents read is not in R; otherwise it behaves like M1. When M encounters a marked symbol while reading the tape, then it ceases to simulate the finite-state acceptor for this cycle. Further, when rewriting a string u containing a marked symbol, then also a symbol will be marked in the string v substituted for u. Because of the markers M is an RRWW-automaton, and it is obvious that L(M) = L(M1) ∩ R.

The proof for the deterministic case is identical. □

For the RW-automaton the corresponding result holds, but its proof is more involved. This stems from the fact that an RWW-automaton performs a RESTART immediately after each rewrite-step. Thus, during the first cycle of its computation it will in general not see the input completely. Hence, it must check membership of the given input in the regular language R using a different strategy.

Theorem 3.2. A language L is accepted by a (deterministic) RWW-automaton if and only if there exist a (deterministic) RW-automaton M1 and a regular language R such that L = L(M1) ∩ R holds.

Proof. The proof that L can be written as L(M1) ∩ R is the same as for


Theorem 3.1. It remains to prove the converse implication. So let M1 be an RW-automaton, and let R be a regular language that is accepted by a deterministic finite-state acceptor D. From M1 and D we now construct an RWW-automaton M such that L(M) = L(M1) ∩ R holds. During the first cycle of a computation starting from the initial configuration q0¢w$, M1 will in general not see the input completely. As M must simulate M1, it will not see the complete input during its first cycle either. Thus, M will not simulate the finite-state acceptor D in one cycle, but it will simulate parts of D's computation in each cycle until it finally reaches the right border marker $. Accordingly M will operate as follows.

Starting from a configuration of the form q0¢w$, M will simulate the RW-automaton M1 and the finite-state acceptor D in parallel, while moving its read/write-window to the right. Assume that M reaches a configuration of the form ¢x(q,p)uay$, where u is the contents of M1's read/write-window, M1 is in state q, and D is in state p. If M1 performs a rewrite/restart-step next, replacing u by the shorter string v, then M performs a rewrite/restart-step, replacing the string ua by the string v[a,p′], where p′ = δ_D(p, u), that is, p′ is the state that D enters after reading the input xu. Observe that, as in the proof of the previous theorem, the look-ahead of M is larger than that of M1.

During a cycle of the form above M may encounter a tape symbol of the form [a,p], where a is a tape symbol of M1, and p is a state symbol of D. Then M continues the simulation of M1 as before, but the simulation of D now continues with state p, reading the symbol a.

Thus, during the course of a computation M's tape inscription may contain several occurrences of symbols of the form [a,p], where p is a state symbol of D. However, the right-most occurrence of a symbol of this form satisfies the following conditions:

(α) M has not yet seen the tape inscription to the right of this symbol.

(β) The state of D contained in this symbol is the actual state that D enters after reading the initial input w up to this position.

Finally, M accepts if M1 accepts and if the simulation of D is in a final state when M reaches the right border marker $. Observe that in case M reaches the marker $ in a cycle prior to the final cycle, it can decide whether or not the given input belongs to R. If it does not belong to R, then M rejects; otherwise M places a special mark on the last symbol before the $ to indicate that the simulation of D has been finished successfully. Now it should be clear that M is an RWW-automaton that accepts the language L(M1) ∩ R.


Again, if M1 is deterministic, then so is M. □

From the above characterizations and the fact that the inclusions L(RW) ⊂ L(RWW) and L(RRW) ⊂ L(RRWW) are proper, we obtain the following consequences.

Corollary 3.3.
(a) L(RWW) and L(RRWW) are closed under the operation of taking the intersection with a regular language.
(b) L(RW) and L(RRW) are not closed under this operation.
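To make the language equation L = L(M1) ∩ R from the two theorems concrete, here is a minimal invented sketch: a stand-in acceptor plays the role of M1 (here simply "equally many a's and b's") and a two-state DFA recognizes the regular set R of even-length words. All names and the particular languages are illustrative assumptions, not from the paper.

```python
# Illustrative only: the theorems say that auxiliary symbols are exactly as
# powerful as intersecting with a regular set R. Here we realize the equation
# L = L(M1) ∩ R directly with a hypothetical M1-acceptor and a DFA for R.

def dfa_accepts(word, delta, start, finals):
    """Run a deterministic finite-state acceptor on word."""
    state = start
    for ch in word:
        state = delta[(state, ch)]
    return state in finals

# R = { words over {a,b} of even length }, given as a DFA (state = parity)
delta = {(0, "a"): 1, (0, "b"): 1, (1, "a"): 0, (1, "b"): 0}

def m1_accepts(word):            # stand-in for the restarting automaton M1
    return word.count("a") == word.count("b")

def l_accepts(word):             # the combined language L = L(M1) ∩ R
    return m1_accepts(word) and dfa_accepts(word, delta, 0, {0})
```

In the actual construction of Theorem 3.2 the DFA is not run separately; its state is carried along the tape in auxiliary symbols [a, p], which is precisely what the nonterminals buy.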

4 Inclusions between the deterministic classes

Concerning the four nondeterministic and the four deterministic language classes defined by the various restarting automata considered here, we have the inclusions depicted in Figure 1. It is not known whether the inclusion L(RWW) ⊆ L(RRWW) is proper, but the remaining inclusions among the nondeterministic classes are known to be proper. So let us turn to the deterministic classes. In contrast to the open problem stated above we have the following equality result.

Theorem 4.1. L(det-RRWW) = L(det-RWW) = CRL.

Proof. Since L(det-RWW) ⊆ L(det-RRWW), and since CRL = L(det-RWW) [11], it remains to prove that L(det-RRWW) ⊆ CRL holds. Recall that a language is a Church-Rosser language if and only if it is accepted by some shrinking deterministic two-pushdown automaton (sDTPDA).

Let M = (Q, Σ, Γ, δ, q0, ¢, $, F, H) be a deterministic RRWW-automaton with read/write-window of size k (that is, look-ahead k−1), and let L denote the language accepted by M. By Lemma 2.1 we can make the following assumptions about M: after performing a rewrite-step M continues with MVR-steps until it scans the right delimiter $, and then it either (1.) halts and accepts, (2.) halts and rejects, or (3.) makes a RESTART. Accordingly we associate three subsets Q+(w), Q−(w), Qrs(w) of Q with each string w ∈ (Γ \ {¢, $})* as follows:

A state q ∈ Q is contained in Q+(w) (Q−(w), Qrs(w)) if, starting from the configuration ¢qw$, M makes only MVR-steps until it scans the $-symbol, and then halts and accepts (halts and rejects, respectively, restarts).

As M is deterministic, Q+(w), Q−(w), and Qrs(w) are pairwise disjoint. After performing a rewrite-step M is in a configuration of the form ¢xvqw$. Then by our assumption on M, q ∈ Q+(w) ∪ Q−(w) ∪ Qrs(w). If these three sets were known, then instead of actually performing the corresponding steps



of M, we could simply accept (if q ∈ Q+(w)), reject (if q ∈ Q−(w)) or make a RESTART (if q ∈ Qrs(w)).
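The tables Q+(w), Q−(w), Qrs(w) for all suffixes w can be computed in a single right-to-left pass, because M makes only MVR-steps in the scanned part. The following is a simplified invented sketch of that dynamic-programming idea: it assumes the MVR-behaviour is given by a one-symbol step function and a classification of each state at the $-symbol, which abstracts away the window of size k.

```python
# Hypothetical sketch of the right-to-left precomputation of the outcome
# classes Q+, Q-, Qrs. table[i][q] is the outcome ('+', '-', or 'rs') when
# the automaton, in state q, scans tape[i:] with MVR-steps only until $.

def classify_suffixes(tape, step, verdict_at_dollar, states):
    n = len(tape)
    table = [dict() for _ in range(n + 1)]
    for q in states:
        table[n][q] = verdict_at_dollar[q]       # behaviour at the $-symbol
    for i in range(n - 1, -1, -1):               # right-to-left pass
        for q in states:
            table[i][q] = table[i + 1][step(q, tape[i])]
    return table

# toy instance: the state tracks the parity of a's read so far
def step(q, ch):
    return 1 - q if ch == "a" else q
```

With such a table, Q+(tape[i:]) is simply { q : table[i][q] == "+" }, which is exactly the information stored on the extra tracks of the second pushdown below.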

Now we will simulate M by an sDTPDA M1 as in the proof of Lemma 3.6 of [11], but in addition we will keep track of the sets Q+(w), Q−(w), and Qrs(w) for each suffix w of the actual tape contents. In this way we can combine the rewrite-steps and the RESTART-steps. Accordingly M1's pushdown stores will have 2, respectively 4, tracks:

(Picture: stack 1 holds the left part of M's tape together with the states of M on a second track; stack 2 holds the right part of M's tape, delimited by $, together with the sets Q+(·), Q−(·), Qrs(·) on tracks 2 to 4.)

The sDTPDA M1 will work in two phases.

Phase 1 (Initialization): From right to left M1 prints encodings of the three sets Q+(w), Q−(w), and Qrs(w) underneath the first letter of w for each suffix w of the given input. This process is realized in two steps:

(a) First M1 shifts the tape contents z of stack 2 onto stack 1 (using an appropriate copy alphabet to make this process weight-reducing).

(b) Pushing z back onto stack 2, the corresponding sets Q+(·), Q−(·), and Qrs(·) are written on tracks 2 to 4.

Phase 2: M1 now simulates M step-by-step. To simplify the presentation we use a variant of the sDTPDA that, instead of only seeing the topmost symbol on each of its two pushdowns, sees the topmost k+1 symbols on each pushdown, that is, it uses pushdown windows of size k+1 (see [11]). We distinguish between the MVR-steps of M and the rewrite-steps of M.

(a) MVR-step of M: ¢uq1avw$ ⊢_M ¢uaq2vw$, where a ∈ Γ \ {¢, $} and |v| = k−1. For ¢uq1avw$ the corresponding configuration of M1 looks as follows:

(Picture: track 1 of the second pushdown shows a, v, w followed by $, with the sets Q+, Q+(vw), …, Q−, Q−(vw), …, Qrs, Qrs(vw), … on tracks 2 to 4.)

Here Q+ = Q+(avw), Q− = Q−(avw), and Qrs = Qrs(avw). In a single step M1 transforms this configuration into the following one, thus simulating the above MVR-step of M:


(Picture: the letter a, together with the state q1, has been moved onto the first pushdown, and the window of the second pushdown now covers v, w, ….)

Observe that on the second track of its first pushdown M1 stores the state it was in when it pushed the letter a onto that stack.

(b) Rewrite-step of M: ¢uq1vw$ ⊢_M ¢uyq2w$, where |v| = k > |y|. Here we distinguish between the following three cases:

Case 1. q2 ∈ Q+(w): Then M will accept. Accordingly, M1 enters an ERASE-state, erases the stack contents, and accepts.

Case 2. q2 ∈ Q−(w): Then M will reject. Accordingly, M1 enters an ERASE-state, erases the stack contents, and rejects.

Case 3. q2 ∈ Qrs(w): Then ¢uyq2w$ ⊢*_MVR ¢uywq3$ ⊢_RESTART q0¢uyw$ ⊢*_MVR ¢u1q4u2yw$, where u = u1u2, |u2| = k−1, if |u| ≥ k, or u2 := ¢u and no MVR-step is included here after the RESTART, if |u| < k. M1 will now simulate this sequence of M-steps by a single step as follows. The configuration of M1 corresponding to ¢uq1vw$ is the following:

(Picture: track 1 of the second pushdown shows v, w, … followed by $, with the sets Q+(vw), Q+(w), …, Q−(vw), Q−(w), …, Qrs(vw), Qrs(w), … on tracks 2 to 4.)

Here the sets Q+(·), Q−(·), and Qrs(·) underneath the letters of u2y on the second pushdown are computed from Q+(w), Q−(w), and Qrs(w) based on the transition function of M.

In case |u| < k, the first pushdown will only contain the bottom marker ⊥, and M1 will be in the state that corresponds to the initial state of M.


It is easily verified that L(M1) = L(M) = L. Obviously step 2(a) can be realized in a weight-reducing manner. As |y| < |v|, also step 2(b) can be realized in this manner. Thus, M1 is indeed an sDTPDA for L, which completes the proof of Theorem 4.1. □

The language L6 := { a^n b^n c | n ≥ 0 } ∪ { a^n b^{2n} d | n ≥ 0 } is known to belong to L(RRW) \ L(RW). Since L6 can obviously be accepted by an sDTPDA, we see that L6 ∈ L(det-RWW) \ L(det-RW). Further, we have the following negative result.

Lemma 4.2. L6 ∉ L(det-RRW).

Proof. Assume that L6 is accepted by a deterministic RRW-automaton M = (Q, Σ, Γ, δ, q0, ¢, $, F, H), where Σ = {a, b, c, d} and Γ = Σ ∪ {¢, $}. For n ≥ 0, given a^n b^n c as input, M performs an accepting computation of the following form:

q0¢a^n b^n c$ ⊢c_M q0¢w1$ ⊢c_M q0¢w2$ ⊢c_M … ⊢c_M q0¢wm$ ⊢*_M ¢u qa v$

for some qa ∈ F. Since w1, …, wm ∈ Σ*, and since M accepts starting from the initial configuration q0¢wi$, we see that w1, …, wm ∈ L6. If n is sufficiently large, then M cannot rewrite the tape contents ¢a^n b^n c$ into a string of the form ¢a^i b^{2i} d$ within a single cycle. Hence, w1 is of the form w1 = a^{n−j} b^{n−j} c for some j ≥ 1.

Now consider the input z := a^n b^{2n} d. Starting from the initial configuration q0¢a^n b^n b^n d$, M will perform the same rewrite-step, that is, q0¢a^n b^n b^n d$ ⊢*_M ¢a^{n−j} b^{n−j} q b^n d$ for some q ∈ Q. Following this rewrite-step M will either reject on encountering the symbol d, or it will make a RESTART, that is, q0¢a^n b^{2n} d$ ⊢c_M q0¢a^{n−j} b^{2n−j} d$. As a^{n−j} b^{2n−j} d ∉ L6, we see that in each case L(M) ≠ L6. Thus, L6 is not accepted by any deterministic RRW-automaton. □

The observations above show that the following inclusions are proper.

Corollary 4.3.
(a) L(det-RW) ⊂ L(det-RWW).
(b) L(det-RRW) ⊂ L(det-RRWW) = L(det-RWW).

However, it remains open whether or not the inclusion L(det-RW) ⊆ L(det-RRW) is proper. From Corollary 4.3 and the results of Section 3 we obtain the following closure and non-closure properties in analogy to Corollary 3.3.

Corollary 4.4.
(a) L(det-RWW) and L(det-RRWW) are closed under the operation of taking the intersection with a regular language.
(b) L(det-RW) and L(det-RRW) are not closed under this operation.


Since L6 ∈ L(RRW), Lemma 4.2 implies that the inclusion L(det-RRW) ⊂ L(RRW) is proper. Further, since CRL is properly contained in GCSL [2], also the inclusion L(det-RWW) ⊂ L(RWW) is proper.

Hence, in summary we have the situation depicted in Figure 2, where a question mark indicates that it is an open problem whether the corresponding inclusion is proper, and a language given as a label of an edge means that this language is an example showing that the corresponding inclusion is indeed a proper one. Here L2 := { c^n d^n | n ≥ 0 } ∪ { c^n d^m | m > 2n ≥ 0 } and L7 := { a^n b^m | 0 ≤ n ≤ m ≤ 2n } are taken from the literature.

Figure 2. (Inclusion diagram of the nondeterministic and deterministic classes with L(RRWW) at the top; edges are labelled by separating languages such as L2, L6, L7, or by question marks for open inclusions.)

5 The Gladkij language is in L(RRWW)

For the nondeterministic restarting automata we have the chain of inclusions GCSL ⊆ L(RWW) ⊆ L(RRWW) ⊆ CSL, where CSL denotes the class of context-sensitive languages. It is known that GCSL is properly contained in CSL, but it is open which of the intermediate inclusions are proper.

Let L_Gl denote the so-called Gladkij language [3], that is, L_Gl = { w#w^R#w | w ∈ {a,b}* }. It is known that L_Gl is a context-sensitive language that is not growing context-sensitive [1]. Here we will show that this language is accepted by some RRWW-automaton, thus separating the class GCSL from the class L(RRWW).

Theorem 5.1. L_Gl ∈ L(RRWW).

Proof. We will construct an RRWW-automaton M that accepts the language L_Gl. As by Corollary 3.3, L(RRWW) is closed under the operation of taking intersections with regular languages, we can restrict our attention to inputs of the form u#v#w, where u, v, w ∈ {a, b}*.

Let Σ := {a, b, #}, and let Γ := Σ ∪ {¢, $} ∪ { A_u, B_u, C_u | u ∈ {a, b}^2 }. Further, for M's look-ahead we choose the number 7. The action of M on an input of the form u#v#w is described by the following algorithm, where win always denotes the actual contents of M's read/write-window:

(1.) if win = ¢x#y#z$ then
  (* The window contains the tape inscription completely. *)
  if x#y#z ∈ L_Gl, or x ∈ {a, b, λ}, y = x and z = xC_u for some u ∈ {a, b}^2,
  then ACCEPT else REJECT;

(2.) repeat MVR until win ∈ Γ^2·#·Γ* or win ∈ {a, b, ¢}·{A_uB_u | u ∈ {a, b}^2}·Γ*;

(3.) if win = u2#v2y for some u2, v2 ∈ Γ^2 then
(3.1) begin if u2 ∉ {a, b}^2 or v2 ∉ {a, b}^2 or u2 ≠ v2^R then REJECT;
  (* Here u#v#w = u1u2#u2^R v1#w. *)
(3.2) nondeterministically goto (4.) or goto (5.);

(4.) REWRITE: u2#u2^R ↦ A_{u2}B_{u2};
(4.1) repeat MVR until win ∈ Γ*·$;
(4.2) if win ends in u2$ then RESTART else REJECT;
  (* A RESTART is performed if the tape contents was
     u1u2#u2^R v1#w1u2 (u1, v1, w1 ∈ {a, b}*, u2 ∈ {a, b}^2). *)

(5.) (* u2#u2^R has been discovered, but it is not yet rewritten. *)
(5.1) repeat MVR until win ∈ Γ*·$;
  if win ends in C_x$ for some x ∈ {a, b}^2
  then REWRITE: C_x$ ↦ $ else REJECT;
(5.2) RESTART;
  (* A RESTART is performed if the tape contents was
     u1u2#u2^R v1#w1C_x (u1, v1, w1 ∈ {a, b}*, x, u2 ∈ {a, b}^2),
     and the C_x has just been deleted. *)
(5.3) end;

(6.) if win = ¢·A_{u2}B_{u2}·w′ for some u2 ∈ {a, b}^2
  then nondeterministically goto (7.) or goto (8.);

(7.) repeat MVR until win ∈ Γ*·$;
(7.1) if win ends in u2$ then REWRITE: u2$ ↦ C_{u2}$ else REJECT;
(7.2) RESTART;
  (* A RESTART is performed if the tape contents was
     u1A_{u2}B_{u2}v1#wu2 (u1, v1, w ∈ {a, b}*, u2 ∈ {a, b}^2),
     and u2 has just been replaced by C_{u2}. *)

(8.) REWRITE: A_{u2}B_{u2} ↦ #;
(8.1) repeat MVR until win ∈ Γ*·$;
(8.2) if win ends in C_{u2}$ then RESTART else REJECT;
  (* A RESTART is performed if the tape contents was
     u1A_{u2}B_{u2}v1#wC_{u2} (u1, v1, w ∈ {a, b}*, u2 ∈ {a, b}^2),
     and in this cycle A_{u2}B_{u2} has been replaced by #. *)
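For reference, the target language itself admits a direct check. The following sketch (an illustrative helper, not part of the construction above) tests membership in L_Gl = { w#w^R#w | w ∈ {a,b}* }:

```python
# Direct membership test for the Gladkij language L_Gl = { w#w^R#w }.
# This is the specification the RRWW-automaton M implements, not M itself.

def in_gladkij(s: str) -> bool:
    parts = s.split("#")
    if len(parts) != 3:                 # exactly two occurrences of #
        return False
    x, y, z = parts
    return (set(x) <= {"a", "b"}       # letters from {a, b} only
            and y == x[::-1]            # middle factor is the reversal of x
            and z == x)                 # last factor equals x
```

Such a checker is handy for validating the example computations of M given next.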

In the following we give some example computations of M in order to illustrate how it works, before we turn to proving that indeed L(M) = L_Gl. In the description of these computations the number of the algorithm step being applied is indicated at each step.

Example 1. Consider the input abbb#bbba#abbb:

q0¢abbb#bbba#abbb$ ⊢*_(2.) ¢abbb#bbba#abbb$.

Now we can continue with either (4.) or (5.). However, (5.) will lead to rejection, so we continue with (4.):

¢abbb#bbba#abbb$ ⊢_(4.) ¢abA_{bb}B_{bb}ba#abbb$ ⊢*_(4.1) ¢abA_{bb}B_{bb}ba#abbb$ ⊢_(4.2) q0¢abA_{bb}B_{bb}ba#abbb$ ⊢*_(2.) ¢abA_{bb}B_{bb}ba#abbb$.

Now we can continue with either (7.) or (8.). However, it is easily seen that (8.) will lead to rejection, and so we continue with (7.):

¢abA_{bb}B_{bb}ba#abbb$ ⊢*_(7.) ¢abA_{bb}B_{bb}ba#abbb$ ⊢_(7.1) ¢abA_{bb}B_{bb}ba#abC_{bb}$ ⊢_(7.2) q0¢abA_{bb}B_{bb}ba#abC_{bb}$ ⊢*_(2.) ¢abA_{bb}B_{bb}ba#abC_{bb}$.

Again we can continue with either (7.) or (8.). This time, however, (7.) will lead to rejection, and we continue with (8.):

¢abA_{bb}B_{bb}ba#abC_{bb}$ ⊢_(8.) ¢ab#ba#abC_{bb}$ ⊢*_(8.1) ¢ab#ba#abC_{bb}$ ⊢_(8.2) q0¢ab#ba#abC_{bb}$ ⊢*_(2.) ¢ab#ba#abC_{bb}$.

Here we can continue with either (4.) or (5.). This time (4.) will lead to rejection, and we continue with (5.):

¢ab#ba#abC_{bb}$ ⊢*_(5.1) ¢ab#ba#ab$ ⊢_(5.2) q0¢ab#ba#ab$.

Continuing in this way we will finally obtain the configuration q0¢##C_{ab}$, which leads to acceptance. □

Example 2. Consider the input abbb#bbba#abba:

q0¢abbb#bbba#abba$ ⊢*_(2.) ¢abbb#bbba#abba$.

We can continue with either (4.) or (5.):

(1.) ¢abbb#bbba#abba$ ⊢_(4.) ¢abA_{bb}B_{bb}ba#abba$ ⊢*_(4.1) ¢abA_{bb}B_{bb}ba#abba$ ⊢_(4.2) REJECT.

(2.) ¢abbb#bbba#abba$ ⊢*_(5.1) REJECT.

Thus, this input cannot be accepted by M. □

Example 3. Consider the input abbb#baba#abbb:

q0¢abbb#baba#abbb$ ⊢*_(2.) ¢abbb#baba#abbb$ ⊢_(3.1) REJECT.

Thus, this input is not accepted either. □

Based on these examples we can easily complete the proof of the theorem. From Example 1 we see that each string w ∈ L_Gl is accepted by M. On the other hand, if w = x#y#z for some x, y, z ∈ {a, b}* such that w ∉ L_Gl, then x ≠ y^R or x ≠ z. In a computation it is checked whether or not x = y^R in step (3.1), and it is checked whether or not x = z in step (4.2). Hence, it follows that the language L(M) coincides with the Gladkij language L_Gl. □

As the Gladkij language does not belong to the class GCSL, we obtain the following consequence.

Corollary 5.2. GCSL is properly contained in the class L(RRWW).

Thus, we see that at least one of the following two inclusions is proper:

GCSL ⊆ L(RWW) ⊆ L(RRWW),

but at the moment it is not clear which one. In fact we would expect that the Gladkij language is not contained in the class L(RWW), which would show


that, in contrast to the situation in the deterministic case, the separation of the restart from the rewrite operation does increase the power of the nondeterministic restarting automaton.ᵃ Also we would like to point out that using the same technique as for the Gladkij language it can be shown that L(RRWW) contains some rather complicated languages [12]. Thus, it does not seem to be easy to separate this class from the class CSL of context-sensitive languages.

6 Restarting automata and prefix-rewriting systems

Instead of working directly with the various types of restarting automata, it may be easier to work with characterizations of the corresponding language classes through certain prefix-rewriting systems.

A prefix-rewriting system P on some alphabet Σ is a subset of Σ* × Σ*. Its elements are called (prefix-) rewrite rules, and usually they are written as (ℓ → r). By dom(P) we denote the set of all left-hand sides of rules of P.

A prefix-rewriting system P on Σ induces a prefix-reduction relation ⇒*_P on Σ*, which is the reflexive transitive closure of the single-step prefix-reduction relation ⇒_P := { (ℓz, rz) | (ℓ → r) ∈ P, z ∈ Σ* }. If u ⇒*_P v holds, then u is an ancestor of v and v is a descendant of u (mod P). By ∇*_P(v) we denote the set of all ancestors of v, and for L ⊆ Σ*, ∇*_P(L) := ⋃_{v ∈ L} ∇*_P(v).

Let v ∈ Σ*. If there exists some u ∈ Σ* such that v ⇒_P u, then v is called reducible mod P, otherwise it is irreducible mod P. By IRR(P) we denote the set of all irreducible strings mod P. Obviously, IRR(P) = Σ* \ dom(P)·Σ*, that is, for a system of the form P = ⋃_{i=1}^{n} { xu_i → xv_i | x ∈ R_i }, where R1, …, Rn are regular languages, IRR(P) is a regular language as well. Using prefix-rewriting systems of this form the languages accepted by RW-automata can be characterized as follows.

Theorem 6.1. [10] Let L ⊆ Σ*. Then L ∈ L(RW) if and only if there exist a prefix-rewriting system P of the form P = ⋃_{i=1}^{n} { xu_i → xv_i | x ∈ R_i }, where for i = 1, …, n, u_i, v_i ∈ Σ* or u_i, v_i ∈ Σ*·$ satisfying |u_i| > |v_i|, R_i ⊆ Σ* is a regular language, and $ is an additional symbol not in Σ, and a regular language R0 ⊆ Σ*·$ such that L·$ = ∇*_P(R0).

(Footnote: In cooperation with Tomasz Jurdziński and Krzysztof Loryś we have recently been able to show that, in contrast to our expectations, the Gladkij language is accepted by an RWW-automaton. Thus, the first of the two inclusions above is proper.)

For the class ℒ(det-RW) a corresponding characterization can be derived using confluent prefix-rewriting systems of the form described above. Here a prefix-rewriting system P is called confluent if any two strings u, v ∈ Σ* that have a common ancestor also have a common descendant. Sénizergues has investigated prefix-rewriting systems of the form above, which he called strict Rat-Fin controlled rewriting systems, and he has shown that for these systems confluence is unfortunately undecidable [13].
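Although confluence is undecidable for these controlled systems in general, for a fixed terminating system it can be explored by brute force. The sketch below (our own toy examples, not from the paper) checks the defining property on the finitely many descendants of one start word; it assumes the reduction terminates, e.g. because the rules are length-reducing.

```python
# Naive bounded confluence test for a prefix-rewriting system: any two
# strings with a common ancestor must also share a common descendant.

def prefix_reductions(word, rules):
    return {r + word[len(l):] for l, r in rules if word.startswith(l)}

def descendants(word, rules):
    # assumes the system terminates on `word` (e.g. length-reducing rules)
    seen, todo = {word}, [word]
    while todo:
        w = todo.pop()
        for v in prefix_reductions(w, rules):
            if v not in seen:
                seen.add(v)
                todo.append(v)
    return seen

def confluent_from(word, rules):
    """All descendants of `word` must pairwise share a descendant."""
    ds = list(descendants(word, rules))
    return all(descendants(u, rules) & descendants(v, rules)
               for u in ds for v in ds)

print(confluent_from("aaab", [("aa", "a")]))                 # True
print(confluent_from("ab", [("ab", "x"), ("ab", "y")]))      # False
```

The second system diverges immediately: `ab` rewrites to both `x` and `y`, and these two descendants can never be joined again.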

For RRW-automata similar characterizations can be obtained. Only the form of the prefix-rewriting systems used in the characterization is different.

Theorem 6.2. [10] Let L ⊆ Σ*. Then L belongs to ℒ(RRW) if and only if there exist a prefix-rewriting system P of the form P = ∪_{i=1}^{n} { xu_i y$ → xv_i y$ | x ∈ R_i^(1), y ∈ R_i^(2) }, where $ is an additional symbol not in Σ, and for i = 1, …, n, u_i, v_i ∈ Σ* satisfying |u_i| > |v_i|, and R_i^(1), R_i^(2) ⊆ Σ* are regular languages, and a regular language R_0 ⊆ Σ*·$ such that L·$ = ∇*_P(R_0).

Again the deterministic class ℒ(det-RRW) can be characterized in the same way by confluent prefix-rewriting systems. Further, together with Theorems 3.1 and 3.2 the results above yield corresponding characterizations for the classes of languages accepted by the (deterministic) (R)RWW-automata.

Corollary 6.3. Let L ⊆ Σ*. Then L ∈ ℒ(RWW) (L ∈ ℒ(det-RWW)) if and only if there exist an alphabet Γ containing Σ, a (confluent) prefix-rewriting system P of the form P = ∪_{i=1}^{n} { xu_i → xv_i | x ∈ R_i }, where for i = 1, …, n, u_i, v_i ∈ Γ* or u_i, v_i ∈ Γ*·$ satisfying |u_i| > |v_i|, R_i ⊆ Γ* is a regular language, and $ is an additional symbol not in Γ, and regular languages R, R_0 ⊆ Γ*·$ such that L·$ = ∇*_P(R_0) ∩ R.

Corollary 6.4. Let L ⊆ Σ*. Then L belongs to ℒ(RRWW) (ℒ(det-RRWW)) if and only if there exist an alphabet Γ containing Σ, a (confluent) prefix-rewriting system P := ∪_{i=1}^{n} { xu_i y$ → xv_i y$ | x ∈ R_i^(1), y ∈ R_i^(2) }, where $ is an additional symbol not in Γ, and for i = 1, …, n, u_i, v_i ∈ Γ* satisfying |u_i| > |v_i|, and R_i^(1), R_i^(2) ⊆ Γ* are regular languages, and regular languages R_0 and R such that L·$ = ∇*_P(R_0) ∩ R.

Acknowledgement. The authors want to express their thanks to Gerhard Buntrock for helpful discussions concerning the results presented in Section 3.



References

1. G. Buntrock. Wachsende kontext-sensitive Sprachen. Habilitationsschrift, Fakultät für Mathematik und Informatik, Universität Würzburg, July 1996.

2. G. Buntrock and F. Otto. Growing context-sensitive languages and Church-Rosser languages. Information and Computation, 141:1-36, 1998.

3. A.W. Gladkij. On the complexity of derivations for context-sensitive grammars. Algebra i Logika Sem., 3:29-44, 1964. In Russian.

4. P. Jančar, F. Mráz, M. Plátek, and J. Vogel. Restarting automata. In H. Reichel (ed.), Fundamentals of Computation Theory, Proc., Lect. Notes in Comp. Sci. 965, pp. 283-292. Springer, Berlin, 1995.

5. P. Jančar, F. Mráz, M. Plátek, and J. Vogel. On restarting automata with rewriting. In G. Păun and A. Salomaa (eds.), New Trends in Formal Languages, Lect. Notes in Comp. Sci. 1218, pp. 119-136. Springer, Berlin, 1997.

6. P. Jančar, F. Mráz, M. Plátek, and J. Vogel. Different types of monotonicity for restarting automata. In V. Arvind and R. Ramanujam (eds.), Foundations of Software Technology and Theoretical Computer Science, Proc., Lect. Notes in Comp. Sci. 1530, pp. 343-354. Springer, Berlin, 1998.

7. P. Jančar, F. Mráz, M. Plátek, and J. Vogel. On monotonic automata with a restart operation. Journal of Automata, Languages and Combinatorics, 4:287-311, 1999.

8. R. McNaughton, P. Narendran, and F. Otto. Church-Rosser Thue systems and formal languages. Journal of the Association for Computing Machinery, 35:324-344, 1988.

9. G. Niemann and F. Otto. The Church-Rosser languages are the deterministic variants of the growing context-sensitive languages. In M. Nivat (ed.), Foundations of Software Science and Computation Structures, Proc., Lect. Notes in Comp. Sci. 1378, pp. 243-257. Springer, Berlin, 1998.

10. G. Niemann and F. Otto. Restarting automata and prefix-rewriting systems. Mathematische Schriften Kassel 18/99, Universität Kassel, Dec. 1999.

11. G. Niemann and F. Otto. Restarting automata, Church-Rosser languages, and representations of r.e. languages. In G. Rozenberg and W. Thomas (eds.), Developments in Language Theory - Foundations, Applications, and Perspectives, Proc., pp. 103-114. World Scientific, Singapore, 2000.

12. G. Niemann and F. Otto. On the power of RRW-automata. In M. Ito, G. Păun and S. Yu (eds.), Words, Semigroups, and Transductions. Essays in Honour of Gabriel Thierrin, On the Occasion of His 80th Birthday, pp. 341-355. World Scientific, Singapore, 2001.

13. G. Sénizergues. Some decision problems about controlled rewriting systems. Theoretical Computer Science, 71:281-346, 1990.


Cellular Automata with Polynomials over Finite Fields *

Hidenosuke Nishio Iwakura Miyake-cho 204, Sakyo-ku, Kyoto

Email: YRAO5762@nifty.ne.jp

Abstract

Information transmission in cellular automata (CA) is studied using polynomials over finite fields. The state set is regarded as a finite field and the local function is expressed in terms of a polynomial over it. The information is expressed by an unknown variable X, which takes a value from the state set, and information transmission is discussed using polynomials in X. The idea is presented for the basic one-dimensional CA with neighborhood index {−1, 0, +1}, although it works for general CAs. We first give the algebraic framework for the extension of CA and then show some fundamental results on extended CAs.

1 Introduction

'What is information?' and 'How to study information?' are fundamental questions in the information sciences. Shannon's pioneering work originated the mathematical study of information, focusing on information transmission through noisy communication channels. He introduced the numerical measure entropy, based on probability theory. Also in the study of CA, information has been investigated from various points of view. J. von Neumann first proposed the self-reproducing 2-D CA with 29-state cells [vN66]. In his design the information is transmitted by means of many signals. The firing squad synchronization problem and other real-time computations have been solved by utilizing many signals which travel in CA spaces with various speeds and specific meanings [M+T99]. In this paper we are going to discuss another way of viewing information in CA, which is essentially different from signals. Our approach will be called algebraic [N99].

*This is an extended abstract and the full paper will be published elsewhere.


2 Definitions

The 1-D CA is defined as usual with the space Z (the set of integers), the neighborhood index N, the state set Q and the local function f, and is denoted CA = (Z, N, Q, f). Throughout this paper we assume the 1-D CA with N = {−1, 0, +1} and denote it simply as CA(Q, f).

2.1 State Set

Q is generally a finite set but is regarded as a finite field in our study. It might also be a ring or an integral domain, but for the sake of simplicity we first assume the structure of a field. Thus Q = GF(q), where q = p^n with prime p and positive integer n.

2.2 Local Function

Various ways of expressing the local function have been used by CA researchers. When Q is an arbitrary finite set, f is expressed as a table, or by listing the function values for all combinations of neighboring cell states. If Q is a field, however, we can express it in terms of a polynomial over GF(q). Note that the linear CA, where f is expressed as a linear combination of the variables, has been extensively investigated [G95].

Denote the cardinality of Q by |Q|, so |Q| = q = p^n. Then the local function f : Q × Q × Q → Q can be expressed as follows:

f(x, y, z) = a_0 + a_1 x + a_2 y + ⋯ + a_i x^h y^j z^k + ⋯
           + a_{q³−2} x^{q−1} y^{q−1} z^{q−2} + a_{q³−1} x^{q−1} y^{q−1} z^{q−1},
where a_i ∈ Q (0 ≤ i ≤ q³ − 1).   (1)

x, y and z indicate the states of the neighboring cells −1 (left), 0 (center) and +1 (right), respectively. There are q^{q³} local functions in all, and it will be seen that (1) is an adequate form for expressing them. As for the polynomial expression of functions from GF(q)^n to GF(q) and other topics on finite fields, see [L+N97].

Example 1. The binary set Q = {0, 1} = GF(2) and the function f(x, y, z) = yz + x.
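A quick tabulation of Example 1's rule over GF(2) on all 8 neighborhoods is easy to script. As a side observation of ours (not stated in the paper), encoding the table in Wolfram's numbering of elementary CA rules identifies f(x, y, z) = yz + x as rule 120.

```python
# Tabulate f(x, y, z) = yz + x over GF(2) and compute its elementary
# CA rule number (bit weight 2^(4x + 2y + z) for neighborhood (x, y, z)).
from itertools import product

def f(x, y, z):
    return (y * z + x) % 2

table = {(x, y, z): f(x, y, z) for x, y, z in product((0, 1), repeat=3)}
rule = sum(table[(x, y, z)] << (4 * x + 2 * y + z) for x, y, z in table)
print(rule)  # 120
```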


3 Information Function

3.1 Information Expressed by X

We are going to present an algebraic tool for studying information transmission in CA. Let X be a symbol different from those used in equation (1). It stands for an unknown state of a cell in CA and has been introduced to trace the information, which is essentially different from a signal.

Take Example 1 above. From the fact that f(0,0,0) = 0 and f(1,0,0) = 1, we may write f(X,0,0) = X (i). Similarly we write f(X,1,1) = X + 1 (mod 2) (ii), which comes from the fact that f(0,1,1) = 1 and f(1,1,1) = 0. Also we have f(0,X,0) = 0 (iii). From the information-related point of view, we claim: in case (i) the information X is transmitted without loss from the left to the center. In case (ii) it is also so, since from the output X + 1 we can uniquely restore the input value of X. In case (iii), however, the state information X of the center is lost.
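The three computations (i)-(iii) can be replayed mechanically with a tiny model of the ring Q[X] for Q = GF(2): represent a0 + a1·X as the pair (a0, a1) and identify X² with X. The representation and function names here are ours, a minimal sketch rather than the paper's formalism.

```python
# Q[X] over GF(2): elements are pairs (a0, a1) for a0 + a1*X, with X^2 = X.

def add(p, q):
    return ((p[0] + q[0]) % 2, (p[1] + q[1]) % 2)

def mul(p, q):
    a0, a1 = p
    b0, b1 = q
    # (a0 + a1 X)(b0 + b1 X) = a0 b0 + (a0 b1 + a1 b0) X + a1 b1 X^2,
    # and X^2 = X in Q[X]:
    return (a0 * b0 % 2, (a0 * b1 + a1 * b0 + a1 * b1) % 2)

def f(x, y, z):                       # Example 1's rule yz + x, lifted to Q[X]
    return add(mul(y, z), x)

ZERO, ONE, X = (0, 0), (1, 0), (0, 1)
print(f(X, ZERO, ZERO))   # (0, 1) = X:     the information passes through
print(f(X, ONE, ONE))     # (1, 1) = X + 1: X is still uniquely recoverable
print(f(ZERO, X, ZERO))   # (0, 0) = 0:     the information X is lost
```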

For generalizing the above argument, we consider another polynomial form, which we call the information function.

g(X) = a_0 + a_1 X + ⋯ + a_i X^i + ⋯ + a_{q−1} X^{q−1}, where a_i ∈ Q (0 ≤ i ≤ q − 1).   (2)

g is a function Q → Q and the set of such functions is denoted by Q[X]. Evidently |Q[X]| = q^q. Note that Q[X] ⊇ Q. An element of Q[X] \ Q is called informative, while an element of Q is called constant.

3.2 Ring Q[X]

We introduce two operations in Q[X], addition and multiplication, following the ring operations of Q. So we have the following equations.

pX = 0 and X^q − X = 0.

Consequently Q[X] becomes a (commutative) ring with identity. In fact it is isomorphic to the factor ring by (X^q − X), which is a reducible polynomial. Therefore Q[X] is not a field, and not even an integral domain. See Example 2 below: that ring is neither a field nor an integral domain, and is proved to be isomorphic to the direct sum of two copies of GF(2).
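That Q[X] fails to be an integral domain can be seen concretely over GF(2): since X² = X, the element X is idempotent, so X·(X + 1) = X² + X = X + X = 0, i.e. X and X + 1 are zero divisors. A two-line check, using the same pair representation (a0, a1) for a0 + a1·X as above (our own sketch):

```python
# Zero divisors in GF(2)[X]/(X^2 - X): X * (X + 1) = 0.

def mul(p, q):
    a0, a1 = p
    b0, b1 = q
    return (a0 * b0 % 2, (a0 * b1 + a1 * b0 + a1 * b1) % 2)

X, X1 = (0, 1), (1, 1)     # X and X + 1
print(mul(X, X1))          # (0, 0): a zero-divisor pair, so Q[X] is not a field
```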


Example 2. Q = GF(2). Q[X] = {0, 1, X, X + 1}.

Example 3. Appendix 1 lists all polynomials of Q[X] over GF(3). Each polynomial function is equivalently expressed by its coefficient vector (first column) and its value vector (third column).

4 Extension of CA

4.1 Defining CA(Q[X], f_X)

We extend a CA(Q, f) to its extended CA(Q[X], f_X), where the state set Q[X] is the set of polynomials over Q and the local function f_X is expressed by the same polynomial form f as in (Q, f). The variables x, y and z, however, range over Q[X] instead of Q. That is, f_X : Q[X]³ → Q[X].

4.2 Dynamics of CA(Q[X], f_X)

The global map F_X : C → C is defined as usual, where C = Q[X]^Z is the set of all configurations. The configuration at time t is defined by c_t = F_X^t(c_0) for the initial configuration c_0. The suffix X is often omitted when no confusion is expected.

Computer Simulation. Appendix 2 shows the dynamics of a finite cyclic-boundary CA(Q[X], f) over GF(3). The first simulation is done for an initial configuration containing X. The others show the dynamics with the substituted initial configurations.
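A small script suffices to reproduce such simulations. The sketch below models Q[X] for Q = GF(3) by coefficient triples (a0, a1, a2) modulo X³ − X, runs the cyclic CA of Appendix 2 with n = 6 and f(x, y, z) = xz + y (our reading of the appendix's rule), and checks the fact that makes the substituted runs meaningful: substituting a value for X commutes with running the CA. All names are ours.

```python
# Extended CA over Q = GF(3), cyclic boundary, states in Q[X] = GF(3)[X]/(X^3 - X).

def add(p, q):
    return tuple((x + y) % 3 for x, y in zip(p, q))

def mul(p, q):
    c = [0] * 5
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            c[i + j] += a * b
    # reduce modulo X^3 - X:  X^3 -> X,  X^4 -> X^2
    return (c[0] % 3, (c[1] + c[3]) % 3, (c[2] + c[4]) % 3)

def f(x, y, z):                        # local rule xz + y, lifted to Q[X]
    return add(mul(x, z), y)

def step(config):                      # one application of the global map
    n = len(config)
    return tuple(f(config[i - 1], config[i], config[(i + 1) % n])
                 for i in range(n))

def subst(g, a):                       # evaluate g(X) at X = a in GF(3)
    return (g[0] + g[1] * a + g[2] * a * a) % 3

X, one = (0, 1, 0), (1, 0, 0)
w = (X, one, one, one, one, one)       # the configuration X11111

c = w
for _ in range(3):
    c = step(c)

# Substituting X := 2 after 3 steps equals 3 scalar steps from 211111.
s = tuple(subst(g, 2) for g in w)
for _ in range(3):
    s = tuple((s[i - 1] * s[(i + 1) % len(s)] + s[i]) % 3 for i in range(len(s)))
print(tuple(subst(g, 2) for g in c) == s)  # True
```

The equality holds because evaluation at a ∈ GF(3) is a ring homomorphism Q[X] → Q (a³ = a in GF(3), so the reduction X³ → X is invisible to it), and the local rule is built only from ring operations.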

5 Results We present several results which will make clear the features of dynamics of the extended CA in contrast to those of the original CA.

Theorem 5.1

When CA(Q[X], f) starts with a constant configuration, its dynamics is the same as that of CA(Q, f). In other words, CA(Q, f) is embedded in CA(Q[X], f).


Theorem 5.2

CA(Q[X], f) is surjective (injective, reversible) iff CA(Q, f) is surjective (injective, reversible).

We define here substitution of configurations: for any configuration w ∈ Q[X]^Z and a ∈ Q, let w_a be the configuration obtained from w by substituting a for X in each cell state g(X).

When CA is finite (with cyclic or fixed boundary condition), its trajectory enters a cycle after a finite transient part. Denote the cycle length by φ(w) and the transient length by τ(w) when the CA starts with w. Note that when the number of cells of the CA is n, its configurations are words of length n.

Theorem 5.3

Let w be a word in Q[X]^n. Then we have,

6 Concluding Remarks

We have presented the results without proofs, which will appear elsewhere. The proofs were often conceived from the computer simulation of CA(Q[X], f), for which the author is greatly indebted to Takashi Saito.

References

[G95] M. Garzon, Models of Massive Parallelism, Analysis of Cellular Automata and Neural Networks, Springer, 1995.

[L+N97] R. Lidl and H. Niederreiter, Finite Fields, 2nd ed., Cambridge University Press, 1997.

[M+T99] J. Mazoyer and V. Terrier, Signals in one-dimensional cellular automata, Theoret. Comput. Sci., vol. 217, 53-80 (1999).

[N99] H. Nishio, Algebraic Studies of Information in Cellular Automata, Kyoto University, RIMS Kokyuroku, vol. 1106, 186-195 (1999).


[N00] H. Nishio, Global Dynamics of 1-D Extended Cellular Automata, Kyoto University, RIMS Kokyuroku, vol. 1166, 200-206 (2000).

[vN66] J. von Neumann, Theory of Self-reproducing Automata, Univ. of Illinois Press, 1966.


Appendix 1. Polynomials over GF(3)

g(X) = a_0 + a_1 X + a_2 X², with a_i ∈ GF(3). Each row gives the coefficient vector, the polynomial, the value vector (g(0), g(1), g(2)), and the size |g(Q)| of the image.

a0 a1 a2 | g(X)           | g(0) g(1) g(2) | |g(Q)|
 0  0  0 | 0              |  0    0    0   |  1
 0  0  1 | X²             |  0    1    1   |  2
 0  0  2 | 2X²            |  0    2    2   |  2
 0  1  0 | X              |  0    1    2   |  3
 0  1  1 | X + X²         |  0    2    0   |  2
 0  1  2 | X + 2X²        |  0    0    1   |  2
 0  2  0 | 2X             |  0    2    1   |  3
 0  2  1 | 2X + X²        |  0    0    2   |  2
 0  2  2 | 2X + 2X²       |  0    1    0   |  2
 1  0  0 | 1              |  1    1    1   |  1
 1  0  1 | 1 + X²         |  1    2    2   |  2
 1  0  2 | 1 + 2X²        |  1    0    0   |  2
 1  1  0 | 1 + X          |  1    2    0   |  3
 1  1  1 | 1 + X + X²     |  1    0    1   |  2
 1  1  2 | 1 + X + 2X²    |  1    1    2   |  2
 1  2  0 | 1 + 2X         |  1    0    2   |  3
 1  2  1 | 1 + 2X + X²    |  1    1    0   |  2
 1  2  2 | 1 + 2X + 2X²   |  1    2    1   |  2
 2  0  0 | 2              |  2    2    2   |  1
 2  0  1 | 2 + X²         |  2    0    0   |  2
 2  0  2 | 2 + 2X²        |  2    1    1   |  2
 2  1  0 | 2 + X          |  2    0    1   |  3
 2  1  1 | 2 + X + X²     |  2    1    2   |  2
 2  1  2 | 2 + X + 2X²    |  2    2    0   |  2
 2  2  0 | 2 + 2X         |  2    1    0   |  3
 2  2  1 | 2 + 2X + X²    |  2    2    1   |  2
 2  2  2 | 2 + 2X + 2X²   |  2    0    2   |  2


Appendix 2. Simulation of CA[X]: Q = GF(3), cyclic boundary, n = 6, f = xz + y.

(A) w = X11111 (rows: time; columns: cells 1 to 6)
 0 : ((0 1 0) (0 0 1) (0 0 1) (0 0 1) (0 0 1) (0 0 1))
 1 : ((0 1 1) (0 1 1) (0 0 2) (0 0 2) (0 0 2) (0 1 1))
 2 : ((1 0 2) (0 0 0) (0 2 1) (0 0 0) (0 2 1) (0 0 0))
 3 : ((1 0 2) (1 0 2) (0 2 1) (1 1 1) (0 2 1) (1 0 2))
 4 : ((0 0 0) (2 0 1) (1 2 0) (2 2 2) (1 2 0) (2 0 1))
 5 : ((2 0 1) (2 0 1) (2 2 2) (1 0 2) (2 2 2) (2 0 1))
 6 : ((1 0 2) (0 0 0) (0 2 1) (2 1 0) (0 2 1) (0 0 0))
 7 : ((1 0 2) (1 0 2) (0 2 1) (0 2 1) (0 2 1) (1 0 2))
 8 : ((0 0 0) (2 0 1) (1 2 0) (1 0 2) (1 2 0) (2 0 1))
 9 : ((2 0 1) (2 0 1) (2 2 2) (0 1 2) (2 2 2) (2 0 1))
10 : ((1 0 2) (0 0 0) (0 2 1) (1 2 0) (0 2 1) (0 0 0))
11 : ((1 0 2) (1 0 2) (0 2 1) (2 0 1) (0 2 1) (1 0 2))
12 : ((0 0 0) (2 0 1) (1 2 0) (0 1 2) (1 2 0) (2 0 1))
13 : ((2 0 1) (2 0 1) (2 2 2) (2 2 2) (2 2 2) (2 0 1))
14 : ((1 0 2) (0 0 0) (0 2 1) (0 0 0) (0 2 1) (0 0 0))
τ = 2, φ = 12

(B) w_0 = 011111
0 : ((0 0 0) (0 0 1) (0 0 1) (0 0 1) (0 0 1) (0 0 1))
1 : ((0 0 1) (0 0 1) (0 0 2) (0 0 2) (0 0 2) (0 0 1))
2 : ((0 0 2) (0 0 0) (0 0 1) (0 0 0) (0 0 1) (0 0 0))
3 : ((0 0 2) (0 0 2) (0 0 1) (0 0 1) (0 0 1) (0 0 2))
4 : ((0 0 0) (0 0 1) (0 0 0) (0 0 2) (0 0 0) (0 0 1))
5 : ((0 0 1) (0 0 1) (0 0 2) (0 0 2) (0 0 2) (0 0 1))
τ = 1, φ = 4

(C) w_1 = 111111
0 : ((0 0 1) (0 0 1) (0 0 1) (0 0 1) (0 0 1) (0 0 1))
1 : ((0 0 2) (0 0 2) (0 0 2) (0 0 2) (0 0 2) (0 0 2))
2 : ((0 0 0) (0 0 0) (0 0 0) (0 0 0) (0 0 0) (0 0 0))
3 : ((0 0 0) (0 0 0) (0 0 0) (0 0 0) (0 0 0) (0 0 0))
τ = 2, φ = 1

(D) w_2 = 211111
0 : ((0 0 2) (0 0 1) (0 0 1) (0 0 1) (0 0 1) (0 0 1))
1 : ((0 0 0) (0 0 0) (0 0 2) (0 0 2) (0 0 2) (0 0 0))
2 : ((0 0 0) (0 0 0) (0 0 2) (0 0 0) (0 0 2) (0 0 0))
3 : ((0 0 0) (0 0 0) (0 0 2) (0 0 1) (0 0 2) (0 0 0))
4 : ((0 0 0) (0 0 0) (0 0 2) (0 0 2) (0 0 2) (0 0 0))
τ = 1, φ = 3

The cell state is expressed by the coefficient vector. For example, coefficient vectors (0 1 0) and (0 1 1) mean X and X + X², respectively, as is seen in Appendix 1.


GENERALIZED DIRECTABLE AUTOMATA

ŽARKO POPOVIĆ AND STOJAN BOGDANOVIĆ
University of Niš, Faculty of Economics, Trg Kralja Aleksandra 11,
18000 Niš, Yugoslavia
E-mail: [email protected], [email protected]

TATJANA PETKOVIĆ
University of Turku, TUCS and Department of Mathematics,
FIN-20014 Turku, Finland
E-mail: [email protected]

MIROSLAV ĆIRIĆ
University of Niš, Faculty of Sciences and Mathematics, Ćirila i Metodija 6,
18000 Niš, Yugoslavia
E-mail: ciricm@bankerinter.net, [email protected]

In [16] the last three authors introduced the notion of generalized directable automata as a generalization of many already known types of automata. Algorithms for testing whether a finite automaton belongs to some of the important subclasses of the class of generalized directable automata were studied by the authors in [18]. In this paper structural properties of finite and infinite generalized directable automata are considered, tests for membership of a finite automaton in the pseudovarieties of generalized directable and locally directable automata are given, and the least generalized directing and locally directing congruences on a finite automaton are described.

1 Introduction and preliminaries

Directable automata were introduced in [6] and later studied by many authors (see, for example, [14], [13] or [4]), whereas trapped, trap-directable (or one-trapped), uniformly locally nilpotent, uniformly locally definite, uniformly locally directable, and uniformly locally trap-directable automata were introduced in [16] and, as was proved there, they form generalized varieties of automata properly contained in the generalized variety of all generalized directable automata, also introduced in [16]. Algorithms for testing whether a finite automaton belongs to the pseudovariety of all trapped, trap-directable or locally trap-directable automata were considered by the authors in [18]. Algorithms for constructing the least congruence on a finite automaton whose corresponding factor automaton belongs to the mentioned pseudovarieties were also given in [18]. More information about all these classes of automata can be found in [4].


This paper presents a deeper study of generalized directable automata. Some structural properties of generalized directable automata and their transition semigroups were given in [16]. However, finite generalized directable automata have some particular properties that are described in Section 2. Those properties give rise to an algorithm for testing whether a finite automaton is generalized directable. Since uniformly locally directable automata play an important role in the characterization of generalized directable automata, and finite locally directable automata are uniformly locally directable, in Section 2 special attention is devoted to testing finite automata for local directability. Directing congruences on automata were first considered in [14], where it was noted that every finite automaton has the least directing congruence, and an algorithm for finding this congruence was given in [13]. In Section 3 of this paper the existence of the least directing congruence on an arbitrary, not necessarily finite, generalized directable automaton is proved. It is shown that there are interesting mutual relations between the least directing, trapping and trap-directing congruences on a generalized directable automaton. Eventually, the least generalized directing congruence is characterized in Section 4. In addition, for an irregular pseudovariety of automata P, the least L(P)-congruence is described.

Let A be any set. Then Δ_A and ∇_A denote the diagonal (identity) relation and the universal relation on A, respectively. For two binary relations α and β on A, their product is the relation α·β defined by: (a, b) ∈ α·β if and only if (a, c) ∈ α and (c, b) ∈ β, for some c ∈ A. If α·β = β·α, we say that α and β commute.

Automata considered throughout this paper are automata without outputs in the sense of the definition from the book by F. Gécseg and I. Peák [11]. It is well known that automata without outputs, with input alphabet X, can be considered as unary algebras of type indexed by X, so notions such as congruence, homomorphism, generating set etc. have their usual algebraic meanings (see, for example, [5]). The state set and the input set of an automaton are not necessarily finite. In order to simplify notation, an automaton with state set A is also denoted by the same letter A. For any considered automaton A, its input alphabet is denoted by X, and the free monoid over X, the input monoid of A, is denoted by X*. Under the action of an input word u ∈ X*, the automaton A goes from a state a into the state denoted by au.

A state a ∈ A is called a trap of A if au = a for every word u ∈ X*. The set of all traps of A is denoted by Tr(A). A state a ∈ A is reversible if for every word v ∈ X* there exists a word u ∈ X* such that avu = a, and the set of all reversible states of A, called the reversible part of A, is


denoted by R(A). If it is nonempty, R(A) is a subautomaton of A. An automaton A is reversible if every one of its states is reversible. If for every a, b ∈ A there exists u ∈ X* such that b = au, then the automaton A is strongly connected. Equivalently, A is strongly connected if it does not have proper subautomata. On the other hand, A is connected if for every a, b ∈ A there exist u, v ∈ X* such that au = bv. The mergeability relation ρ_A on A is defined by (a, b) ∈ ρ_A if and only if au = bu, for some u ∈ X*. If (a, b) ∈ ρ_A, we say that a and b are mergeable; otherwise they are nonmergeable. For a state a ∈ A, by ⟨a⟩ we denote the monogenic subautomaton of A generated by a, i.e. the subautomaton ⟨a⟩ = { au | u ∈ X* }. The least subautomaton of an automaton A, if it exists, is called the kernel of A, and in this case it is the unique strongly connected subautomaton of A.

Let u ∈ X*. An automaton A is called u-trapped if au ∈ Tr(A) for every a ∈ A, and in this case u is a trapping word of A. If au = bu for every a, b ∈ A, then A is u-directable, u is a directing word of A, and the set of all directing words of A is denoted by DW(A). If A is u-directable and has a trap, or equivalently, if it is u-trapped and has a unique trap, then it is called u-trap-directable. Also, an automaton A is generalized u-directable if for every state a ∈ A and every word v ∈ X* we have auvu = au; then u is a generalized directing word, and the set of all generalized directing words is denoted by GDW(A). An automaton A is trapped (resp. directable, trap-directable, generalized directable) if there exists a word u ∈ X* such that A is u-trapped (resp. u-directable, u-trap-directable, generalized u-directable). It can be proved (see [19]) that a finite automaton is directable if and only if any two of its states are mergeable. For a word u ∈ X*, a state a ∈ A is a u-neck of A if bu = a for every b ∈ A, and it is a neck of A if it is a u-neck for some u ∈ X*. An automaton A is strongly directable if every one of its states is a neck, or equivalently, if it is both strongly connected and directable.
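For a finite automaton, a shortest directing word can be found by breadth-first search in the subset automaton, starting from the full state set; this is a standard construction, sketched here on a made-up Černý-style example (the transition table below is ours, purely for illustration).

```python
# BFS for a shortest directing word: a word u with |Au| = 1.
from collections import deque

def directing_word(states, alphabet, delta):
    start = frozenset(states)
    seen = {start: ""}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if len(cur) == 1:
            return seen[cur]           # a shortest directing word
        for a in alphabet:
            nxt = frozenset(delta[s, a] for s in cur)
            if nxt not in seen:
                seen[nxt] = seen[cur] + a
                queue.append(nxt)
    return None                        # the automaton is not directable

# Cerny-style 3-state example: 'a' cycles the states, 'b' merges 0 into 1.
delta = {(0, 'a'): 1, (1, 'a'): 2, (2, 'a'): 0,
         (0, 'b'): 1, (1, 'b'): 1, (2, 'b'): 2}
print(directing_word([0, 1, 2], 'ab', delta))   # baab
```

The word found, `baab`, has length 4 = (3 − 1)², the length of the shortest directing words of the classical Černý series.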

Let B be a subautomaton of an automaton A. If θ is a congruence relation on B, then the relation R(θ) defined by R(θ) = θ ∪ Δ_A is a congruence relation on A, called the Rees extension of θ (up to a congruence on A). In particular, the Rees congruence ϱ_B of a subautomaton B is the Rees extension R(∇_B). The factor automaton A/ϱ_B is denoted by A/B, and the automaton A is an extension of B by an automaton C (with a trap) if A/B ≅ C.

Let A and B be automata and let H be an automaton such that there exist homomorphisms φ from A onto H and ψ from B onto H. Then P = { (a, b) ∈ A × B | aφ = bψ } is a subdirect product of A and B, and any automaton isomorphic to P is called a pullback product of A and B with respect to H. By a parallel composition of automata A and B we mean any automaton isomorphic to a subautomaton of their direct product A × B.


An automaton A is a direct sum of its subautomata A_α, α ∈ Y, if A = ∪_{α∈Y} A_α and A_α ∩ A_β = ∅ for all α, β ∈ Y such that α ≠ β. The automata A_α, α ∈ Y, are direct summands of A. They determine a partition of A called a direct sum decomposition of A, and the corresponding equivalence relation is a congruence relation on A called a direct sum congruence. By the greatest direct sum decomposition of A we mean the decomposition corresponding to the least direct sum congruence on A. An automaton A is direct sum indecomposable if the universal relation ∇_A is the only direct sum congruence on A. More on direct sum decompositions can be found in [8]. Here we quote the following theorem from [8], which will be widely used.

Theorem 1 (Ćirić and Bogdanović [8]) Every automaton can be uniquely represented as a direct sum of direct sum indecomposable automata. This decomposition is the greatest direct sum decomposition of that automaton.

In this paper special attention is devoted to finite automata. Hence we will often use the following basic result describing the structure of an arbitrary finite automaton.

Theorem 2 (Kovačević, Ćirić, Petković and Bogdanović [15]) Every finite automaton can be uniquely represented as an extension of a direct sum of strongly connected automata by a trap-directable automaton.

If K is a class of automata, then an automaton is a locally K-automaton if every one of its monogenic subautomata belongs to K, and the class of all locally K-automata is denoted by L(K). In this way locally directable and locally trap-directable automata are defined. In particular, if every monogenic subautomaton of an automaton A is u-directable for some u ∈ X*, i.e. all monogenic subautomata of A are directable and have a common directing word u, then A is uniformly locally directable, u is a uniformly locally directing word of A, and the set of all such words is denoted by ULDW(A). Furthermore, a uniformly locally strongly directable automaton is an automaton whose every monogenic subautomaton is strongly connected and u-directable, for a fixed u ∈ X*.

By a generalized variety of automata we mean any class of automata closed under formation of subautomata, homomorphic images, finite direct products and direct powers, whereas by a pseudovariety of automata we mean any class of finite automata closed under formation of subautomata, homomorphic images and finite direct products. Equivalently, a class of automata is a pseudovariety if and only if it is the class of all finite members of some generalized variety (see [1]). As was proved in [16], the classes of directable, uniformly locally directable, generalized directable, trap-directable, uniformly locally trap-directable and trapped automata are generalized varieties, and


hence, the finite members of these classes form pseudovarieties.

A pseudovariety of automata is here defined to be irregular if it is contained in the pseudovariety of all finite directable automata. Otherwise it is called regular. Many interesting algebraic properties of irregular and regular pseudovarieties are described in [3] and [4]. Here we recall a result from [3] that will play an important role in the further work.

Theorem 3 (Bogdanović, Ćirić, Petković, Imreh and Steinby [3]) If P is an arbitrary pseudovariety of automata, then L(P) is also a pseudovariety of automata. Moreover, if P is an irregular pseudovariety of automata and A is a finite automaton, then A ∈ L(P) if and only if A is a direct sum of automata from P.

For undefined notions and notations we refer to [11], [5] and [12].

2 Testing for generalized and local directability

Generalized directable automata were introduced and studied by the last three authors in [16], where they proved that a generalized directable automaton can be characterized as an extension of a uniformly locally directable automaton by a trap-directable automaton. With the next theorem we give a more precise structural characterization of these automata.

Theorem 4 An automaton A is generalized directable if and only if it is an extension of a uniformly locally strongly directable automaton B by a trap-directable automaton C.

In that case we have

DW(C) · ULDW(B) ⊆ GDW(A) ⊆ DW(C) ∩ ULDW(B).

Proof. Let A be generalized directable. Consider arbitrary a ∈ A and u ∈ GDW(A). Then auvu = au, for every v ∈ X*, whence it follows that au ∈ R(A). Now, if we set B = R(A), we have that B is a subautomaton of A, and by au ∈ B, for every a ∈ A and u ∈ GDW(A), it follows that C = A/B is a trap-directable automaton and GDW(A) ⊆ DW(C).

Let D be an arbitrary monogenic subautomaton of B. Since B is reversible, D is strongly connected. Consider arbitrary a, b ∈ D and u ∈ GDW(A). Then au, b ∈ D, so auv = b, for some v ∈ X*, whence it follows that bu = auvu = au. Thus, D is directable and u ∈ DW(D), so we conclude that B is uniformly locally strongly directable and GDW(A) ⊆ ULDW(B).

Conversely, let A be represented as an extension of a uniformly locally strongly directable automaton B by a trap-directable automaton C. Consider arbitrary a ∈ A, p ∈ DW(C), q ∈ ULDW(B) and v ∈ X*, and set u = pq.


Then ap, apqvp ∈ D, for some strongly directable subautomaton D of B, whence auvu = (apqvp)q = (ap)q = au, because q ∈ DW(D). Therefore, A is a generalized directable automaton and DW(C) · ULDW(B) ⊆ GDW(A). ∎

Besides the characterization of arbitrary generalized directable automata given in Theorem 4, the following theorem contains further equivalents of that property for finite automata.

Theorem 5 The following conditions on a finite automaton A are equivalent:

(i) A is generalized directable;
(ii) every strongly connected subautomaton of A is directable;
(iii) every subautomaton of A contains a directable subautomaton;
(iv) (∀a ∈ A)(∃u ∈ X*)(∀v ∈ X*) auvu = au;
(v) (∀a ∈ A)(∃u ∈ X*)(∀v ∈ X*)(∃w ∈ X*) auvw = auw.

Proof. (i)⇒(ii). This implication is an immediate consequence of Theorem 4.

(ii)⇒(i). By Theorem 2, A is an extension of an automaton B by a trap-directable automaton C, where B is a direct sum of strongly connected automata B_i, i ∈ [1, n], and by the hypothesis B_i is a directable automaton, for every i ∈ [1, n]. Since DW(B_i) is an ideal of X*, for each i ∈ [1, n], and the intersection of any finite family of ideals is nonempty, there exists q ∈ ∩_{i=1}^{n} DW(B_i). Then the automaton B is uniformly locally strongly directable, and hence, by Theorem 4, A is a generalized directable automaton.

(ii)⇒(iii). This is an immediate consequence of Theorem 2.

(iii)⇒(iv). Consider an arbitrary a ∈ A. By the hypothesis, the monogenic subautomaton (a) contains a directable subautomaton B, and then there exists p ∈ X* such that ap ∈ B. Let u = pq, where q ∈ DW(B), and let v ∈ X* be an arbitrary word. Then, as in the proof of Theorem 4, we show that auvu = au. Thus (iv) holds.

(iv)⇒(v). By (iv), for every a ∈ A there exists u ∈ X* such that auvu = au for every v ∈ X*; taking v to be the empty word gives au² = au, so w = u satisfies auvw = au = au² = auw.

(v)⇒(ii). Take an arbitrary strongly connected subautomaton B of A and a, b ∈ B. By the hypothesis, there exists u ∈ X* such that for every v ∈ X* there exists w ∈ X* such that auvw = auw. Then au, bu ∈ B, and B is strongly connected, so there exists p ∈ X* such that aup = bu, and for that p there exists q ∈ X* such that aupq = auq, so auq = buq. Therefore, we have proved that a and b are mergeable, whence it follows that B is a directable automaton. ∎


Note that condition (v) means that for each a ∈ A there exists u ∈ X* such that au and any state from (au) are mergeable, whereas condition (iv) means that every state has, in some sense, its own generalized directing word.

Condition (ii) of Theorem 5 gives rise to an algorithm which tests a finite automaton A with n states and m input letters for generalized directability. The algorithm is a combination of two other algorithms. The first one is an algorithm for finding the strongly connected subautomata of a finite automaton. For that purpose we can use the algorithm given by the authors in [18], which works in time O(mn + n²), or adapt the algorithm from the paper by J. Demel, M. Demlová and V. Koubek [9] for finding the strongly connected components of a directed graph, which works in time O(mn). Immediately after an arbitrary strongly connected subautomaton is formed, it can be checked for directability, using, for example, an algorithm given by B. Imreh and M. Steinby in [13]. The total time required for checking all strongly connected subautomata for directability is bounded by O(mn²). Therefore, the total working time for the whole algorithm is bounded by O(mn²), which is the same bound as for the directability test given in [13].
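As an illustration, the test suggested by condition (ii) can be sketched in Python as follows. This is not the algorithm of [18] or [9], only a straightforward rendering of the same idea; the representation of the automaton as a dictionary delta[state][letter] is an assumption made here, and directability of a strongly connected subautomaton is tested via pairwise mergeability (backward reachability from singletons in the pair automaton).

```python
from collections import defaultdict, deque
from itertools import combinations

def sccs(nodes, succ):
    """Strongly connected components (Kosaraju); succ: node -> iterable of nodes."""
    order, seen = [], set()
    for s in nodes:
        if s in seen:
            continue
        seen.add(s)
        stack = [(s, iter(succ(s)))]
        while stack:
            v, it = stack[-1]
            advanced = False
            for w in it:
                if w not in seen:
                    seen.add(w)
                    stack.append((w, iter(succ(w))))
                    advanced = True
                    break
            if not advanced:
                order.append(v)
                stack.pop()
    pred = defaultdict(list)
    for v in nodes:
        for w in succ(v):
            pred[w].append(v)
    comps, seen = [], set()
    for s in reversed(order):
        if s in seen:
            continue
        comp, work = {s}, [s]
        seen.add(s)
        while work:
            v = work.pop()
            for w in pred[v]:
                if w not in seen:
                    seen.add(w)
                    comp.add(w)
                    work.append(w)
        comps.append(comp)
    return comps

def mergeable_pairs(states, delta, letters):
    """Pairs {a, b} from which some singleton is reachable in the pair automaton."""
    pred = defaultdict(list)
    for a, b in combinations(states, 2):
        for x in letters:
            pred[frozenset({delta[a][x], delta[b][x]})].append(frozenset({a, b}))
    seen = set(frozenset({s}) for s in states)
    queue = deque(seen)
    while queue:
        for p in pred[queue.popleft()]:
            if p not in seen:
                seen.add(p)
                queue.append(p)
    return seen

def is_generalized_directable(delta):
    """Theorem 5 (ii): every strongly connected subautomaton must be directable."""
    states = list(delta)
    letters = sorted({x for d in delta.values() for x in d})
    merge = mergeable_pairs(states, delta, letters)
    for comp in sccs(states, lambda v: delta[v].values()):
        closed = all(delta[a][x] in comp for a in comp for x in letters)
        if closed:  # comp is a strongly connected subautomaton
            # a finite automaton is directable iff every pair of states is mergeable
            if any(frozenset({a, b}) not in merge
                   for a, b in combinations(comp, 2)):
                return False
    return True
```

The pair-automaton step dominates, which matches the O(mn²) bound quoted above.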

Recall that an automaton A is called locally directable if every one of its monogenic subautomata is directable, and it is called uniformly locally directable if all its monogenic subautomata are directable and have a common directing word. In the general case, the class of uniformly locally directable automata is a proper subclass of the class of locally directable automata, as well as of the class of generalized directable automata. But finite uniformly locally directable automata and finite locally directable automata form the same class, and in the second part of this section we study several properties of automata from this class and give an algorithm for testing a finite automaton for local directability.
Theorem 6 The following conditions on a finite automaton A are equivalent:

(i) A is locally directable;
(ii) every monogenic subautomaton of A has the directable kernel;
(iii) A is a direct sum of directable automata;
(iv) every summand in the greatest direct sum decomposition of A has the directable kernel;
(v) (∀a ∈ A)(∃u ∈ X*)(∀v ∈ X*) avu = au.

Proof. Note first that, according to Theorem 2, a finite automaton is directable if and only if it has the directable kernel. This fact immediately implies the equivalences (i)⇔(ii) and (iii)⇔(iv). Since finite directable automata form an irregular pseudovariety, the equivalence (i)⇔(iii) follows from


Theorem 3. Finally, the claim (v) is just statement (i) written in symbols, i.e. (i)⇔(v) obviously holds. ∎

Using the previous theorem we can give an algorithm which tests a finite automaton A with n states and m input letters for local directability. This algorithm is a combination of three simpler algorithms. The first one computes the summands in the greatest direct sum decomposition of A, for example the algorithm given by the authors in [18], which works in time O(mn). Immediately after forming any of these summands, it can be checked whether this summand has a kernel, using one of the two algorithms for finding the strongly connected subautomata of A mentioned before, which can be done in time O(mn) or O(mn + n²). These algorithms have to be modified to check whether the considered summand has only one strongly connected subautomaton. If it is established that this summand has the kernel, this kernel can be immediately tested for directability using the mentioned algorithm from [13]. The total time needed for checking directability of all these kernels is bounded by O(mn²). Therefore, the whole algorithm can be realized in time O(mn²).
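The first step, computing the greatest direct sum decomposition, amounts to finding the connected components of the transition graph regarded as an undirected graph. A small union-find sketch of that step (not the actual routine of [18]; the dictionary representation of the automaton is again an assumption) might look like this:

```python
def direct_sum_decomposition(delta):
    """Summands of the greatest direct sum decomposition of a finite automaton:
    the connected components of the transition graph viewed as undirected."""
    parent = {s: s for s in delta}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path halving
            s = parent[s]
        return s

    # every transition s --x--> t forces s and t into the same summand
    for s, trans in delta.items():
        for t in trans.values():
            parent[find(s)] = find(t)

    comps = {}
    for s in delta:
        comps.setdefault(find(s), set()).add(s)
    return list(comps.values())
```

Each summand returned here would then be tested for a unique strongly connected subautomaton (its kernel) and for directability of that kernel, as described above.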

3 The least directing congruence

If K is a class of automata and A is an automaton, then a congruence relation θ on A is called a K-congruence if the related factor automaton A/θ belongs to K. According to M. Ćirić and S. Bogdanović [7, 2], the class K is closed under homomorphic images and finite subdirect products if and only if the partially ordered set Con_K(A) of all K-congruences on A is a sublattice of the congruence lattice Con(A), for every automaton A, or equivalently, if it is a filter of Con(A), for every automaton A. Therefore, if K is a generalized variety or a pseudovariety of automata and A is a finite automaton, then Con_K(A) is a finite lattice, so it has the least element, which is called the least K-congruence on A.

If θ is a congruence relation on an automaton A such that A/θ is a directable automaton, then θ is called a directing congruence on A. Recall from [13] that a congruence relation θ on a finite automaton A is directing if and only if any two states a, b ∈ A are θ-mergeable, by which we mean that there exists u ∈ X* such that (au, bu) ∈ θ. Since the class of all directable automata is a generalized variety, every finite automaton has the least directing congruence. An algorithm for finding the least directing congruence on a finite automaton was given by B. Imreh and M. Steinby in [13]. But in various theoretical considerations it is often of interest to describe such a congruence


through some logical formula, which is the main aim of this section.

Note that T. Petković and M. Steinby introduced in [17] the notion of a pair automaton of an automaton A. Here we will use a special subautomaton of this automaton, defined as follows. On the set

A^(2)_nm = { {a, b} | a, b ∈ A, a ≠ b, a and b are nonmergeable }

of all pairs of nonmergeable states of A we define transitions by

{a, b}x = {ax, bx},

for every x ∈ X. The transitions defined in this way are well defined, since if a pair {a, b} is nonmergeable, then the pairs {ax, bx}, for all x ∈ X, are nonmergeable as well. Then A^(2)_nm is an automaton which will be called the nonmergeable pair automaton of A. It plays an important role in the proof of the following theorem, which characterizes the least directing congruence on a finite automaton.
Theorem 7 Let A be an arbitrary finite automaton and let δ_A be the transitive closure of the relation ρ_A defined on A by

(a, b) ∈ ρ_A ⇔ a = b or (∀v ∈ X*)(∃u ∈ X*) {avu, bvu} = {a, b}.

Then δ_A is the least directing congruence on A.
Proof. It is evident that ρ_A is reflexive and symmetric. Let (a, b) ∈ ρ_A and w ∈ X*. Then for each v ∈ X* there exists u ∈ X* such that {a(wv)u, b(wv)u} = {a, b}, whence

{(aw)vuw, (bw)vuw} = {aw, bw},

so (aw, bw) ∈ ρ_A. Thus ρ_A is compatible. Being the transitive closure of a reflexive, symmetric and compatible relation, δ_A has the same properties and is transitive, so it is a congruence relation on A.

To prove that δ_A is a directing congruence, consider arbitrary a, b ∈ A. If aw = bw for some w ∈ X*, then clearly (aw, bw) ∈ δ_A, so a and b are δ_A-mergeable. Suppose now that aw ≠ bw, for every w ∈ X*. Then {a, b} is a state of the nonmergeable pair automaton A^(2)_nm of A, and by Theorem 2 there exists w ∈ X* such that {aw, bw} is a reversible state of A^(2)_nm. By this it follows that for each v ∈ X* there exists u ∈ X* such that

{awvu, bwvu} = {aw, bw}vu = {aw, bw},

so (aw, bw) ∈ ρ_A ⊆ δ_A. Therefore a and b are δ_A-mergeable, and by Lemma 5.3 of [13] we have that δ_A is a directing congruence on A.

It remains to prove that δ_A is contained in any directing congruence θ on A. Assume that (a, b) ∈ ρ_A. By the hypothesis and Lemma 5.3 of [13], a


and b are θ-mergeable, so there exists v ∈ X* such that (av, bv) ∈ θ. On the other hand, (a, b) ∈ ρ_A implies {avu, bvu} = {a, b}, for some u ∈ X*, and from (av, bv) ∈ θ it follows that (avu, bvu) ∈ θ, so we conclude that (a, b) ∈ θ. Thus ρ_A ⊆ θ, whence δ_A ⊆ θ, which was to be proved. ∎

Let us observe that a and b are distinct states of an automaton A such that for every v ∈ X* there exists u ∈ X* with {avu, bvu} = {a, b} if and only if {a, b} is a reversible state of the nonmergeable pair automaton A^(2)_nm. Therefore, the previous theorem has the following equivalent formulation:
Theorem 8 Let A be an arbitrary finite automaton and let δ_A be the transitive closure of the relation ρ_A on A defined by

(a, b) ∈ ρ_A ⇔ a = b or {a, b} ∈ R(A^(2)_nm).

Then δ_A is the least directing congruence on A.
Note that the mentioned algorithm by B. Imreh and M. Steinby [13] for finding the least directing congruence on a finite automaton is based on a similar result, given in terms of graphs.
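Theorem 8 can be turned directly into code: compute the nonmergeable pairs, take the reversible states of the pair automaton (the pairs whose strongly connected component is closed under every input letter), and close the resulting relation transitively with a union-find. The following Python sketch only illustrates the statement of Theorem 8 and is not the graph-based algorithm of [13]; the dictionary representation delta[state][letter] is an assumption.

```python
from collections import defaultdict, deque
from itertools import combinations

def least_directing_congruence(delta):
    """Partition of the state set induced by the least directing congruence,
    computed as in Theorem 8 from the nonmergeable pair automaton."""
    states = list(delta)
    letters = sorted({x for d in delta.values() for x in d})

    # 1. mergeable pairs: backward reachability from singletons in the pair graph
    pred = defaultdict(list)
    for a, b in combinations(states, 2):
        for x in letters:
            pred[frozenset({delta[a][x], delta[b][x]})].append(frozenset({a, b}))
    merge = set(frozenset({s}) for s in states)
    queue = deque(merge)
    while queue:
        for p in pred[queue.popleft()]:
            if p not in merge:
                merge.add(p)
                queue.append(p)
    nonmerg = [frozenset(p) for p in combinations(states, 2)
               if frozenset(p) not in merge]

    # 2. pair-automaton transitions; successors of nonmergeable pairs
    #    are nonmergeable again, so this is well defined
    def step(p, x):
        a, b = tuple(p)
        return frozenset({delta[a][x], delta[b][x]})

    index, low, on_stack, stack, comps = {}, {}, set(), [], []
    def scc(v):  # Tarjan, recursive for brevity
        index[v] = low[v] = len(index)
        stack.append(v); on_stack.add(v)
        for x in letters:
            w = step(v, x)
            if w not in index:
                scc(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            comp = set()
            while True:
                w = stack.pop(); on_stack.discard(w); comp.add(w)
                if w == v:
                    break
            comps.append(comp)
    for p in nonmerg:
        if p not in index:
            scc(p)

    # 3. reversible pairs = pairs in a closed SCC; union them transitively
    parent = {s: s for s in states}
    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]; s = parent[s]
        return s
    for comp in comps:
        if all(step(p, x) in comp for p in comp for x in letters):
            for p in comp:
                a, b = tuple(p)
                parent[find(a)] = find(b)
    classes = defaultdict(set)
    for s in states:
        classes[find(s)].add(s)
    return sorted(classes.values(), key=lambda c: min(c))
```

On a directable automaton the relation collapses to the diagonal, while on an automaton that merely permutes its states it identifies the states of each orbit.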

By Theorem 7 and Theorem 2 the following result holds:
Corollary 1 The least directing congruence on a finite automaton A is the Rees extension of the least directing congruence on the reversible part of A, i.e. δ_A = δ_{R(A)} ∪ Δ_A.

If A is an infinite automaton, then it does not necessarily have the least directing congruence. In the second part of this section we prove the existence of the least directing congruence on an arbitrary generalized directable automaton, even an infinite one, and we give a characterization of this congruence different from the one given for finite automata in Theorem 7.

First we introduce several notions and notations. If A is an arbitrary (not necessarily finite) automaton, then to each state a ∈ A we can associate a language G(a) ⊆ X* defined as follows:

G(a) = { u ∈ X* | (∀v ∈ X*) avu = a }.

The main properties of the languages so defined are described by the next lemma.
Lemma 1 Let A be an arbitrary automaton and a ∈ A. Then G(a) ≠ ∅ if and only if (a) is a strongly directable automaton.

In that case the following conditions hold:

(a) G(a) = { u ∈ X* | a is a u-neck of (a) };
(b) G(a) is a left ideal of X*;
(c) G(a)w ⊆ G(aw), for every w ∈ X*.



Proof. If G(a) ≠ ∅, then clearly (a) is a directable automaton. On the other hand, a is reversible, whence it follows that (a) is strongly connected. Thus (a) is strongly directable. Conversely, let (a) be strongly directable. Then a is a u-neck of (a), for some u ∈ X*, and then u ∈ G(a).

The assertion (a) is evident. Further, consider arbitrary u ∈ G(a) and w ∈ X*. Then avwu = a, for each v ∈ X*, so wu ∈ G(a). Thus G(a) is a left ideal of X*. Consider also arbitrary u ∈ G(a) and w ∈ X*. Then awvu = a, whence awvuw = aw, for every v ∈ X*. Hence uw ∈ G(aw). ∎

Now we are ready to describe the least directing congruence on a generalized directable automaton.
Theorem 9 Let A be an arbitrary generalized directable automaton and let σ_A be the transitive closure of the relation υ_A defined on A by

(a, b) ∈ υ_A ⇔ a = b or G(a) ∩ G(b) ≠ ∅.

Then σ_A is the least directing congruence on A.
Proof. The relation υ_A is clearly reflexive and symmetric. Consider a, b ∈ A, a ≠ b, such that (a, b) ∈ υ_A, and an arbitrary w ∈ X*. Then there exists u ∈ G(a) ∩ G(b), and by (c) of Lemma 1 we have that uw ∈ G(aw) ∩ G(bw), so (aw, bw) ∈ υ_A. Therefore υ_A is compatible, whence it follows that σ_A is a congruence relation.

To prove that σ_A is a directing congruence on A, consider an arbitrary u ∈ GDW(A) and a, b ∈ A. Then u ∈ G(au) ∩ G(bu), so (au, bu) ∈ υ_A ⊆ σ_A. Therefore A/σ_A is a u-directable automaton, so σ_A is a directing congruence on A.

Let θ be an arbitrary directing congruence on A. Suppose that (a, b) ∈ υ_A and a ≠ b. Then there exists u ∈ G(a) ∩ G(b). On the other hand, for an arbitrary v ∈ DW(A/θ) we have that (av, bv) ∈ θ, whence (avu, bvu) ∈ θ. Now from u ∈ G(a) ∩ G(b) it follows that

(a, b) = (avu, bvu) ∈ θ.

Therefore υ_A ⊆ θ, whence σ_A ⊆ θ, and we have proved that σ_A is the least directing congruence on A. ∎

The previous theorem can be equivalently formulated as follows:
Theorem 10 Let A be an arbitrary generalized directable automaton and let σ_A be the transitive closure of the relation υ_A on A defined by

(a, b) ∈ υ_A ⇔ a = b or (∃u ∈ X*)(∀v ∈ X*) avu = a & bvu = b.

Then σ_A is the least directing congruence on A.


As we see from Theorem 10, the condition which defines the relation υ_A is stronger than the one from Theorem 7 that defines the relation ρ_A.

A congruence relation θ on an automaton A is called a trapping congruence if the factor automaton A/θ is a trapped automaton, and it is called a trap-directing congruence if A/θ is a trap-directable automaton.

Let A be a generalized directable automaton. Then the relation τ_A defined on A by

(a, b) ∈ τ_A ⇔ a = b or (∀u, v ∈ X*)(∃p, q ∈ X*) aup = b & bvq = a

is the least trapping congruence on A. In other words, (a, b) ∈ τ_A if and only if either a = b or a and b belong to the same strongly connected subautomaton of A. Moreover, the relation ϑ_A on A defined by

(a, b) ∈ ϑ_A ⇔ a = b or (∀u, v ∈ X*)(∃p, q ∈ X*) aup = a & bvq = b

is the least trap-directing congruence on A. Equivalently, (a, b) ∈ ϑ_A if and only if either a = b or a, b ∈ R(A), i.e. ϑ_A is the Rees congruence of the subautomaton R(A) of A. As was proved in [18], the relations so defined are the least trapping and the least trap-directing congruences on an arbitrary finite automaton, and almost the same proofs can be given in the case when A is a generalized directable (not necessarily finite) automaton.

The next theorem describes certain relationships between the congruences σ_A, τ_A and ϑ_A on a generalized directable automaton.
Theorem 11 Let A be a generalized directable automaton. Then

σ_A · τ_A = τ_A · σ_A = ϑ_A.

Proof. Since σ_A ⊆ ϑ_A and τ_A ⊆ ϑ_A, we have σ_A · τ_A ⊆ ϑ_A and τ_A · σ_A ⊆ ϑ_A. Therefore, it remains to prove the opposite inclusions. For that reason, consider an arbitrary pair (a, b) ∈ ϑ_A. If a = b, then clearly (a, b) ∈ σ_A · τ_A and (a, b) ∈ τ_A · σ_A. Assume that a ≠ b. Then a, b ∈ R(A), so by Theorem 4, (a) and (b) are strongly directable automata, i.e. G(a) ≠ ∅ and G(b) ≠ ∅. Take arbitrary u ∈ G(a) and v ∈ G(b). Then by (b) and (c) of Lemma 1 we have that

uv ∈ X*G(b) ⊆ G(b) and uv ∈ G(a)v ⊆ G(av),

whence (a, av) ∈ τ_A and (av, b) ∈ υ_A ⊆ σ_A, and similarly,

vu ∈ X*G(a) ⊆ G(a) and vu ∈ G(b)u ⊆ G(bu),

which yields (a, bu) ∈ υ_A ⊆ σ_A and (bu, b) ∈ τ_A. Therefore (a, b) ∈ τ_A · σ_A and (a, b) ∈ σ_A · τ_A, so we have proved the assertion of the theorem. ∎


In the general case, the relation υ_A on a generalized directable automaton A is not necessarily transitive, i.e. υ_A ≠ σ_A. The next theorem gives interesting characterizations of the structure of generalized directable automata on which the relation υ_A is transitive.
Theorem 12 The following conditions on an automaton A are equivalent:

(i) A is generalized directable and υ_A is transitive;
(ii) A is generalized directable and υ_A ∩ τ_A = Δ_A;
(iii) A is a pullback product of a directable automaton and a trapped automaton (with respect to a trap-directable automaton);
(iv) A is a subdirect product of a directable automaton and a trapped automaton;
(v) A is a parallel composition of a directable automaton and a trapped automaton.

Proof. (i)⇒(ii). If υ_A is transitive, then υ_A = σ_A. Consider an arbitrary pair (a, b) ∈ υ_A ∩ τ_A. If a = b, then (a, b) ∈ Δ_A is trivially satisfied, so we can further assume that a ≠ b. From (a, b) ∈ τ_A it follows that a, b ∈ B, for some strongly connected subautomaton B of A, and then there exists w ∈ X* such that aw = b. On the other hand, (a, b) ∈ υ_A = σ_A implies that there exists u ∈ X* such that avu = a and bvu = b, for each v ∈ X*. Now a = awu = bu = b. Hence υ_A ∩ τ_A = Δ_A.

(ii)⇒(iii). By the general result proved for arbitrary universal algebras by I. Fleischer in [10], an automaton A is a pullback product of automata A₁ and A₂ with respect to an automaton A₃ if and only if there exists a pair of congruences θ₁ and θ₂ on A such that θ₁ ∩ θ₂ = Δ_A, θ₁ and θ₂ commute, and A/θ₁ ≅ A₁, A/θ₂ ≅ A₂ and A/θ₃ ≅ A₃, where θ₃ = θ₁ · θ₂ = θ₂ · θ₁. Since by Theorem 11 we have that σ_A · τ_A = τ_A · σ_A = ϑ_A, the equality υ_A ∩ τ_A = Δ_A implies that A is a pullback product of the directable automaton A/σ_A and the trapped automaton A/τ_A with respect to the trap-directable automaton A/ϑ_A.

(iii)⇒(iv) and (iv)⇒(v). These implications are evident.

(v)⇒(i). Let A ≤ B × C be a parallel composition of a directable automaton B and a trapped automaton C. Then B and C are generalized directable, and since generalized directable automata form a generalized variety, A is also a generalized directable automaton. Furthermore, it can be easily verified that

((b, c), (b′, c′)) ∈ υ_A ⇔ b = b′ & (c = c′ or c, c′ ∈ Tr(C)),

where Tr(C) denotes the set of traps of C, whence it follows that υ_A is transitive. ∎


4 The least generalized and locally directing congruences

A congruence relation θ on an automaton A is called a generalized directing congruence if the factor automaton A/θ is generalized directable, and it is called a locally directing congruence if A/θ is a locally directable automaton. In this section we describe the least generalized directing and the least locally directing congruences on a finite automaton, and give algorithms for finding them.

First we prove the following theorem:
Theorem 13 Let a finite automaton A be represented as an extension of an automaton B by a trap-directable automaton C, where B is a direct sum of strongly connected automata B_i, i ∈ [1, n]. For each i ∈ [1, n] let δ_i denote the least directing congruence on B_i. Then the relation γ_A defined on A by

(a, b) ∈ γ_A ⇔ a = b or (a, b) ∈ δ_i, for some i ∈ [1, n],

is the least generalized directing congruence on A.
Proof. It can be seen easily that γ_A is a congruence relation on A. As in the proof of Theorem 5 we obtain that there exists p ∈ ⋂_{i=1}^{n} DW(B_i/δ_i). Take also arbitrary q ∈ DW(C) and v ∈ X*, and set u = qp. Consider now any a ∈ A. Then aq ∈ B_i, for some i ∈ [1, n], and auvq = (aq)pvq ∈ B_i, which implies

(auvu, au) = ((auvq)p, (aq)p) ∈ δ_i.

Thus (auvu, au) ∈ γ_A, for every a ∈ A, whence it follows that A/γ_A is a generalized directable automaton, i.e. γ_A is a generalized directing congruence on A.

To prove that γ_A is the least generalized directing congruence on A, consider an arbitrary generalized directing congruence θ on A. Let φ be the natural homomorphism of A onto A/θ, and for any i ∈ [1, n], let φ_i denote the restriction of φ to B_i. Then for each i ∈ [1, n], B_iφ_i is a strongly connected subautomaton of A/θ, so by Theorem 5, B_iφ_i is a directable automaton. This means that ker φ_i is a directing congruence on B_i, whence δ_i ⊆ ker φ_i, and now we have that

γ_A = Δ_A ∪ ⋃_{i=1}^{n} δ_i ⊆ ker φ = θ.

So we have proved that γ_A is the least generalized directing congruence on A. ∎


Using the above theorem we can give an algorithm for finding the least generalized directing congruence on a finite automaton A with n states and m input letters. The algorithm consists of two parts. In the first one we compute the strongly connected subautomata of A. As we have mentioned in Section 2, we can use one of the algorithms given in [18] or [9]. They work in time O(mn + n²) and O(mn), respectively. In the second part of the algorithm we compute the least directing congruence on each strongly connected subautomaton of A. Here we can use the algorithm given by B. Imreh and M. Steinby in [13], which can be carried out in time O(mn² + n³). Therefore, the total time for realizing the whole algorithm is bounded by O(mn² + n³), the same as for the algorithm for computing the least directing congruence.

Before we describe the least locally directing congruence on a finite automaton, we give a more general result.
Theorem 14 Let P be an irregular pseudovariety of automata and let a finite automaton A be represented as a direct sum of direct sum indecomposable automata A_i, i ∈ [1, n]. For each i ∈ [1, n] let λ_{P,i} denote the least P-congruence on A_i. Then the relation λ_{P,A} defined on A by

(a, b) ∈ λ_{P,A} ⇔ (a, b) ∈ λ_{P,i} for some i ∈ [1, n],

is the least L(P)-congruence on A.
Proof. Evidently, λ_{P,A} is a congruence relation on A. Let φ be the natural homomorphism of A onto A′ = A/λ_{P,A}, and for each i ∈ [1, n] let φ_i denote the restriction of φ to A_i, and let A′_i = A_iφ_i. Then for every i ∈ [1, n] we have that

(a, b) ∈ ker φ_i ⇔ a, b ∈ A_i & (a, b) ∈ ker φ ⇔ a, b ∈ A_i & (a, b) ∈ λ_{P,A} ⇔ (a, b) ∈ λ_{P,i},

so ker φ_i = λ_{P,i}, and now we conclude that A′_i ≅ A_i/λ_{P,i} ∈ P, because λ_{P,i} is a P-congruence on A_i. On the other hand, if a′ ∈ A′_i ∩ A′_j, for some i, j ∈ [1, n], i ≠ j, then a′ = a_iφ_i = a_iφ and a′ = a_jφ_j = a_jφ, for some a_i ∈ A_i and a_j ∈ A_j, which yields (a_i, a_j) ∈ λ_{P,A}. But by the definition of λ_{P,A} it follows that a_i and a_j must belong to the same A_k, for some k ∈ [1, n], i.e. that i = k = j, which leads to a contradiction. Therefore, we conclude that A′_i ∩ A′_j = ∅ for i, j ∈ [1, n], i ≠ j, so A′ is a direct sum of the automata A′_i, i ∈ [1, n]. Using again Theorem 3 we obtain that A′ ∈ L(P), and hence λ_{P,A} is an L(P)-congruence on A.

To prove that λ_{P,A} is the least L(P)-congruence on A, consider an arbitrary L(P)-congruence θ on A. Let ψ be the natural homomorphism of A onto A″ = A/θ, and for each i ∈ [1, n] let ψ_i be the restriction of ψ to


A_i, and let A″_i = A_iψ = A_iψ_i. We are going to prove that A″_i is direct sum indecomposable, for every i ∈ [1, n]. Fix i ∈ [1, n] and consider A″_i. It is easy to see that the inverse homomorphic image Bψ_i⁻¹ of every direct summand B of A″_i is a direct summand of A_i, and since A_i is direct sum indecomposable, we conclude that so is A″_i. On the other hand, θ is an L(P)-congruence on A, whence A″ = A/θ ∈ L(P), and seeing that L(P) is a pseudovariety, we also have that A″_i ∈ L(P). According to Theorem 3, the automaton A″_i can be decomposed into a direct sum of automata from P, and since A″_i is direct sum indecomposable, we conclude that A″_i ∈ P. By this and by A″_i = A_iψ_i ≅ A_i/ker ψ_i it follows that ker ψ_i is a P-congruence on A_i, whence λ_{P,i} ⊆ ker ψ_i. Therefore λ_{P,i} ⊆ ker ψ_i for every i ∈ [1, n], and hence λ_{P,A} ⊆ ker ψ = θ. So we have proved that λ_{P,A} is the least L(P)-congruence on A. ∎

If we assume P to be the pseudovariety of all finite directable automata, then the following consequence is obtained:
Corollary 2 Let a finite automaton A be represented as a direct sum of direct sum indecomposable automata A_i, i ∈ [1, n]. For each i ∈ [1, n] let δ_i be the least directing congruence on A_i. Then the relation λ_A on A, defined by

(a, b) ∈ λ_A ⇔ (a, b) ∈ δ_i for some i ∈ [1, n],

is the least locally directing congruence on A.
In the case when P is assumed to be the pseudovariety of all finite trap-directable automata, Theorem 14 gives as a consequence Theorem 5 of [18], which characterizes the least locally trap-directing congruence on a finite automaton.

An algorithm for finding the least locally directing congruence on a finite automaton A with n states and m input letters, based on the previous results, can also be composed of two algorithms. The first one is the algorithm for finding the greatest direct sum decomposition of A, given by the authors in [18], which can be done in time O(mn). In the second one we compute the least directing congruence on every summand of this decomposition, using the mentioned algorithm from [13], and this takes time O(mn² + n³). Therefore, the whole algorithm can also be realized in time O(mn² + n³).

References

1. C. J. Ash, Pseudovarieties, generalized varieties and similarly described classes, J. Algebra 92 (1985), 104-115.


2. S. Bogdanović and M. Ćirić, A note on congruences on algebras, in: Proc. of II Math. Conf. in Priština 1996, Lj. D. Kočinac ed., Priština, 1997, pp. 67-72.

3. S. Bogdanović, M. Ćirić, B. Imreh, T. Petković and M. Steinby, Local properties of unary algebras, (to appear).

4. S. Bogdanović, B. Imreh, M. Ćirić and T. Petković, Directable automata and their generalizations - A survey, in: S. Crvenković and I. Dolinka (eds.), Proc. VIII Int. Conf. "Algebra and Logic" (Novi Sad, 1998), Novi Sad J. Math. 29 (2) (1999), 31-74.

5. S. Burris and H. P. Sankappanavar, A course in universal algebra, Springer-Verlag, New York, 1981.

6. J. Černý, Poznámka k homogénnym experimentom s konečnými automatmi, Mat.-fyz. čas. SAV 14 (1964), 208-215.

7. M. Ćirić and S. Bogdanović, Posets of C-congruences, Algebra Universalis 36 (1996), 423-424.

8. M. Ćirić and S. Bogdanović, Lattices of subautomata and direct sum decompositions of automata, Algebra Colloq. 6:1 (1999), 71-88.

9. J. Demel, M. Demlová and V. Koubek, Fast algorithms constructing minimal subalgebras, congruences, and ideals in a finite algebra, Theoretical Computer Science 36 (1985), 203-216.

10. I. Fleischer, A note on subdirect products, Acta Math. Acad. Sci. Hungar. 6 (1955), 463-465.

11. F. Gécseg and I. Peák, Algebraic Theory of Automata, Akadémiai Kiadó, Budapest, 1972.

12. J. M. Howie, Fundamentals of Semigroup Theory, London Mathematical Society Monographs. New Series, Oxford: Clarendon Press, 1995.

13. B. Imreh and M. Steinby, Some remarks on directable automata, Acta Cybernetica 12 (1995), 23-35.

14. M. Ito and J. Duske, On cofinal and definite automata, Acta Cybernetica 6 (1983), 181-189.

15. J. Kovačević, M. Ćirić, T. Petković and S. Bogdanović, Decompositions of automata and reversible states, in: Á. Ádám and P. Dömösi (eds.), Proceedings of the Ninth International Conference on Automata and Formal Languages, Publ. Math. Debrecen (to appear).

16. T. Petković, M. Ćirić and S. Bogdanović, Decompositions of automata and transition semigroups, Acta Cybernetica (Szeged) 13 (1998), 385-403.

17. T. Petković and M. Steinby, Piecewise directable automata, Journal of Automata, Languages and Combinatorics 6 (2001), 205-220.


18. Ž. Popović, S. Bogdanović, T. Petković and M. Ćirić, Trapped automata, in: Á. Ádám and P. Dömösi (eds.), Proceedings of the Ninth International Conference on Automata and Formal Languages, Publ. Math. Debrecen (to appear).

19. M. Steinby, On definite automata and related systems, Ann. Acad. Sci. Fenn., Ser. A I Math. 444, Helsinki, 1969.

20. G. Thierrin, Decompositions of locally transitive semiautomata, Utilitas Mathematica 2 (1972), 25-32.


Acts over Right, Left Regular Bands and Semilattice Types

Tatsuhiko Saito

Mukunoura, Innoshima 374, Hiroshima, Japan 722-2321

Email: [email protected]

Let S be a semigroup and let X be a non-empty set. Then X is called a right act over S, or simply an S-act, if there is a mapping X × S → X, (x, s) ↦ xs, with the property (xs)t = x(st).

A semigroup S is called a band if every element of S is an idempotent. A band S is called right regular (resp. left regular) if sts = st (resp. sts = ts) holds for all s, t ∈ S. A commutative band is called a semilattice.

An S-act X is said to be of right regular band type, or simply RRB-type, if xs² = xs and xsts = xst for all x ∈ X and all s, t ∈ S. A left regular band type (LRB-type) S-act and a semilattice type (SL-type) S-act are defined similarly.

When S is a free monoid, an RRB-type automaton, an LRB-type automaton and an SL-type automaton can be defined similarly. In this case, for an automaton A = (A, X, δ), where A is an alphabet, X is a set of states and δ is a mapping X × A → X, (x, a) ↦ xa, we can show that if xa² = xa and xaba = xab for all x ∈ X and a, b ∈ A, then xs² = xs and xsts = xst for all x ∈ X and s, t ∈ A*. This fact can be applied to LRB-type automata and SL-type automata as well.

Our purpose is to determine all S-acts which are of right regular band type, left regular band type and semilattice type, respectively.

To achieve this purpose, we obtain necessary and sufficient conditions, for any given set X and any semigroup S, in order that X be an S-act of RRB-type, LRB-type and SL-type, respectively (Theorems 1, 3, 5).

Further, we obtain more concrete results which make it possible actually to construct RRB-type, LRB-type and SL-type automata, respectively (Corollaries 2, 4, 5).

Let X be an S-act. It is well known that, defining a relation ρ on S by s ρ t if xs = xt for all x ∈ X, a transformation semigroup S/ρ on X can be obtained. Thus, from the above results, every right regular band, left regular band and semilattice can be obtained in the full transformation semigroup T(X), respectively.
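For a finite act given by the actions of its generators, the transformation semigroup S/ρ ⊆ T(X) can be computed by closing the generator transformations under composition. The following small sketch assumes the act is presented as a dictionary delta[state][generator]; the function name and representation are illustrative, not taken from the paper.

```python
def transition_semigroup(delta):
    """Elements of S/rho as transformation tuples on the sorted state set:
    close the generator transformations under composition."""
    states = sorted(delta)
    idx = {s: i for i, s in enumerate(states)}
    gens = {}
    for x in {l for d in delta.values() for l in d}:
        gens[x] = tuple(delta[s][x] for s in states)
    elems = set(gens.values())
    work = list(elems)
    while work:
        f = work.pop()
        for g in gens.values():
            h = tuple(g[idx[v]] for v in f)  # right action: act by f, then by g
            if h not in elems:
                elems.add(h)
                work.append(h)
    return elems
```

Each tuple lists the image of every state, so distinct tuples are exactly the ρ-classes of words acting on X.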

Let X be a set and let ρ be an equivalence on X. Then the ρ-class containing x ∈ X is denoted by xρ, and the partition of X determined by ρ is denoted by π(ρ). For a mapping φ : X → X, x ↦ xφ, let im(φ) = {xφ | x ∈


X}, ker(φ) = {(x, y) ∈ X × X | xφ = yφ} and fix(φ) = {x ∈ X | xφ = x}, which are called the image, the kernel and the set of fixed points of φ, respectively.

When X is an act over a semigroup S, im(s), ker(s) and fix(s) can be defined for s ∈ S, since s : x ↦ xs is a mapping of X into itself. Since ker(s) is an equivalence on X, xker(s) denotes the ker(s)-class containing x.

The symbols ∩ and ∪ denote the set-theoretic intersection and union, respectively, and ∧ and ∨ denote the lattice-theoretic meet and join, respectively. If X is an act over a free monoid S, then x1 = x for all x ∈ X, where 1 denotes the empty word in S. For a set X, |X| denotes the cardinality of X, and for a word s, |s| denotes the length of s.

Lemma 1. Let X be an act over a free monoid A*. Then
(1) If xaba = xab for every a, b ∈ A ∪ {1} and x ∈ X, then X is a right regular band type act over A*.
(2) If xaba = xba for every a, b ∈ A ∪ {1} and x ∈ X, then X is a left regular band type act over A*.
(3) If xa² = xa and xab = xba for every a, b ∈ A and x ∈ X, then X is a semilattice type act over A*.
Proof. Let x ∈ X and s, t ∈ A*. We show (1) by induction on |s| = n and |t| = m; (2) and (3) can be shown similarly. If n, m ≤ 1, then the assertion is true by the assumption. Suppose that xata = xat holds for a ∈ A and every t ∈ A* with |t| = k. If |t| = k + 1, then t = t′c for some c ∈ A and t′ ∈ A* with |t′| = k. Then we have xata = xat′ca = xat′aca = xat′ac = xat′c = xat. Thus xata = xat for every t ∈ A*. Suppose next that xsts = xst holds for every t ∈ A* and every s ∈ A* with |s| = k. If |s| = k + 1, then s = s′c for some c ∈ A and some s′ ∈ A* with |s′| = k. Then we have xsts = xs′cts′c = xs′cts′ = xs′ct = xst. Thus xsts = xst for all s, t ∈ A*. Therefore we also have xs² = xs1s = xs1 = xs. Consequently, X is a right regular band type act over A*.
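By Lemma 1(1), an automaton is of RRB-type as soon as the identity xaba = xab holds when a and b range over the letters together with the empty word (the case b = 1 gives xa² = xa). The finite condition can be checked directly; the following Python sketch assumes input letters are single-character strings and the automaton is given as delta[state][letter].

```python
def is_rrb_type(delta):
    """Check xaba = xab for all states x and a, b in A ∪ {1};
    by Lemma 1(1) this forces xs² = xs and xsts = xst for all words s, t."""
    letters = sorted({l for d in delta.values() for l in d})

    def run(x, word):
        for c in word:
            x = delta[x][c]
        return x

    for x in delta:
        for a in letters + ['']:          # '' plays the role of the empty word 1
            for b in letters + ['']:
                if run(x, a + b + a) != run(x, a + b):
                    return False
    return True
```

The point of the lemma is exactly that this check over generators, which is finite, already decides the property for all of A*.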

Lemma 2. Let X be a set and let φ : X → X, x ↦ xφ. Then the following are equivalent:

(1) (x, xφ) ∈ ker(φ) for every x ∈ X;
(2) im(φ) = fix(φ);
(3) im(φ) ∩ xker(φ) = {xφ} for every x ∈ X, which means that the set im(φ) ∩ xker(φ) has exactly one element, namely xφ.
Proof. (1) ⇒ (2) Let x ∈ im(φ). Then x = yφ for some y ∈ X. Since (y, yφ) ∈ ker(φ), xφ = (yφ)φ = yφ = x. Thus im(φ) ⊆ fix(φ). The reverse inclusion is clear.

(2) ⇒ (3) Let y ∈ im(φ) ∩ xker(φ). Then y = yφ = xφ, since (x, y) ∈ ker(φ) and y ∈ im(φ) = fix(φ).


(3) ⇒ (1) Straightforward.

Lemma 3. Let X and φ be as in Lemma 2. If there exist a subset Y of X and an equivalence ρ on X such that Y ∩ xρ = {xφ} for every x ∈ X, then Y = im(φ) = fix(φ) and ρ = ker(φ).

Proof. Let x ∈ im(φ). Then x = yφ for some y ∈ X. Since {x} = Y ∩ yρ, we have x ∈ Y. Thus im(φ) ⊆ Y. Let x ∈ Y. Since x ∈ Y ∩ xρ, x = xφ. Thus Y ⊆ fix(φ) ⊆ im(φ). Let (x, y) ∈ ρ. Since Y ∩ xρ = Y ∩ yρ, xφ = yφ. Thus (x, y) ∈ ker(φ). Let (x, y) ∈ ker(φ). Then xφ = yφ. Since (x, xφ) ∈ ρ and (y, yφ) ∈ ρ, we have (x, y) ∈ ρ.

Let (X, ≤) be an ordered set. A subset I of X is called an o-ideal if x ∈ I and y ≤ x imply y ∈ I. The set of o-ideals in (X, ≤) forms a lattice-ordered set under ∪ and ∩. For x ∈ X, let I(x) = {y ∈ X | y ≤ x}. Then I(x) is an o-ideal, called the principal ideal generated by x. If a subset Y of X has a minimum element, we denote it by min(Y).

Theorem 1. Let X and S be a non-empty set and a semigroup, respectively. Then X is a right regular band type S-act if and only if X is an ordered set under some order relation ≤, and for each s ∈ S, there exist a subset X_s of X and an equivalence ρ_s on X which satisfy the following conditions:

(1) |X_s ∩ xρ_s| = 1 for every x ∈ X and s ∈ S,
(2) each X_s is an o-ideal in (X, ≤),
(3) if X_s ∩ xρ_s = {y}, then y ∈ I(x),
(4) X_{st} = X_s ∩ X_t for every s, t ∈ S, and
(5) if (x, y) ∈ ρ_s, y ∈ X_s and (y, z) ∈ ρ_t, z ∈ X_t, then (x, z) ∈ ρ_{st}.

Proof. Suppose that X is a right regular band type S-act. Define a relation ≤ on X by y ≤ x iff y = xs for some s ∈ S. It is easy to see that ≤ is reflexive and transitive. Let y ≤ x and x ≤ y. Then y = xs and x = yt for some s, t ∈ S, so that x = xst and y = xs = xsts = xst = x. Thus (X, ≤) is an ordered set. Since (xs)s = xs² = xs for every x ∈ X and s ∈ S, by Lemma 2 we have that im(s) = fix(s) and im(s) ∩ x ker(s) = {xs}, and xs ∈ I(x), since xs ≤ x. Let x ∈ im(s) and let y ≤ x for x, y ∈ X, s ∈ S. Then y = xt for some t ∈ S, so that we have y = xt = xst = xsts ∈ im(s), since x ∈ im(s) = fix(s). Thus im(s) is an o-ideal. Let x ∈ im(st). Then, as x = xst = xsts, x ∈ im(s) ∩ im(t). Thus im(st) ⊆ im(s) ∩ im(t). The reverse inclusion is clear, since fix(s) ∩ fix(t) ⊆ fix(st). Thus im(st) = im(s) ∩ im(t). If (x, y) ∈ ker(s), y ∈ im(s) and (y, z) ∈ ker(t), z ∈ im(t), then xst = yt = z, and thus (x, z) = (x, xst) ∈ ker(st). For each s ∈ S, put im(s) = X_s and ker(s) = ρ_s. Then X_s and ρ_s satisfy the conditions (1)-(5).

Suppose conversely that, for each s ∈ S, the subset X_s of X and the equivalence ρ_s on X satisfy the conditions (1)-(5). Define the action of S on X


by xs = y if X_s ∩ xρ_s = {y}. Then xs ≤ x by the condition (3), and by Lemma 3 we have that X_s = im(s) = fix(s) and ρ_s = ker(s), so that (x, xs) ∈ ker(s) by Lemma 2, i.e., xs = (xs)s. Let xs = y and yt = z. Then y ∈ X_s, z ∈ X_t and z ≤ y, so that by the condition (2) z ∈ X_s. Thus (xs)t ∈ X_s ∩ X_t = X_{st}, and ((xs)t)s = (xs)t, since (xs)t ∈ X_s = fix(s). As (x, y) ∈ ρ_s, y ∈ X_s and (y, z) ∈ ρ_t, z ∈ X_t, by the condition (5) (x, z) ∈ ρ_{st}. Consequently z ∈ X_{st} ∩ xρ_{st}. Thus z = (xs)t = x(st), so that xs² = (xs)s = xs and xsts = ((xs)t)s = (xs)t = xst for all x ∈ X and every s, t ∈ S, as required.

The following result is useful for actually constructing right regular band type automata.

Corollary 2. Let X be an act over a free monoid S = A*. Then X is a right regular band type S-act if and only if X is an ordered set under some order relation ≤, and for each a ∈ A, there exists an o-ideal I_a in (X, ≤) with I_a ∩ I(x) ≠ ∅ for every x ∈ X.

Proof. Suppose that X is a right regular band type S-act. As is seen in the proof of Theorem 1, X is an ordered set under some order relation ≤, and im(s) is an o-ideal in (X, ≤) with im(s) ∩ I(x) ≠ ∅. For each a ∈ A, put im(a) = I_a. Then the proof is complete.

Suppose conversely that (X, ≤) is an ordered set and for each a ∈ A, there exists an o-ideal I_a in (X, ≤) with I_a ∩ I(x) ≠ ∅. Define the action of A on X by xa ∈ I_a ∩ I(x) if x ∉ I_a, and xa = x otherwise. Then I_a = im(a) = fix(a) and xa ≤ x. Put ker(a) = ρ_a. Let x ∈ X and a, b ∈ A. Then xab = (xa)b ≤ xa and xa ∈ I_a. Since I_a is an o-ideal, we have xab ∈ I_a = fix(a). Thus xab = (xab)a = xaba. By Lemma 1, X is a right regular band type S-act.
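Corollary 2 turns into a concrete recipe: choose an ordered set, pick one o-ideal per letter that meets every principal ideal, and let xa be a fixed choice of element of I_a ∩ I(x) when x ∉ I_a. The sketch below instantiates this on a hypothetical four-element chain of our own choosing and verifies the hypothesis xaba = xab of Lemma 1(1).

```python
# Hypothetical instance of the construction in Corollary 2 (our own toy
# data): X is the chain 0 < 1 < 2 < 3 and each letter gets an o-ideal
# (a down-set) meeting every principal ideal I(x) = {y : y <= x}.
X = [0, 1, 2, 3]
ideals = {'a': {0, 1, 2}, 'b': {0, 1}}

def action(x, a):
    """xa := x if x in I_a, otherwise a fixed choice in I_a ∩ I(x) (here: max)."""
    if x in ideals[a]:
        return x
    return max(y for y in ideals[a] if y <= x)

def run(x, word):
    for a in word:
        x = action(x, a)
    return x

# Hypothesis of Lemma 1(1): xaba = xab for all a, b in A ∪ {1} ('' plays 1).
ok = all(run(x, a + b + a) == run(x, a + b)
         for x in X for a in ['', 'a', 'b'] for b in ['', 'a', 'b'])
print(ok)   # True, so X is a right regular band type act over {a, b}*
```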

Theorem 3. Let X and S be as in Theorem 1. Then X is a left regular band type S-act if and only if, for each s ∈ S, there exist a subset X_s of X and an equivalence ρ_s on X which satisfy the following conditions:

(1) |X_s ∩ xρ_s| = 1 for every x ∈ X and s ∈ S,
(2) ρ_{st} = ρ_s ∨ ρ_t for every s, t ∈ S, and
(3) if x ∈ X_s, y ∈ X_t and (x, y) ∈ ρ_t, then y ∈ X_{st}.

Proof. Suppose that X is a left regular band type S-act. Since xs² = xs for every x ∈ X, s ∈ S, by Lemma 2 we have that im(s) = fix(s) and im(s) ∩ x ker(s) = {xs}. Let x, y ∈ X and s, t ∈ S. If (x, y) ∈ ker(s), then xst = yst, so that (x, y) ∈ ker(st). If (x, y) ∈ ker(t), then xst = xtst = ytst = yst, so that (x, y) ∈ ker(st). Consequently, ker(s) ∨ ker(t) ⊆ ker(st). If (x, y) ∈ ker(st), then xst = yst. Since (x, xs) ∈ ker(s) and (xs, xst) ∈ ker(t), (x, xst) ∈ ker(s) ∨ ker(t); similarly (y, yst) ∈ ker(s) ∨ ker(t), so that (x, y) ∈ ker(s) ∨ ker(t). Consequently ker(st) ⊆ ker(s) ∨ ker(t). Thus ker(st) = ker(s) ∨ ker(t). If x ∈ im(s), y ∈ im(t) and (x, y) ∈ ker(t),


then xs = x, since im(s) = fix(s), and similarly yt = y, and xt = yt, so that yst = ytst = xtst = xst = xt = yt = y. Thus y ∈ im(st). Put X_s = im(s) and ρ_s = ker(s). Then X_s and ρ_s satisfy the conditions (1)-(3).

Suppose conversely that, for each s ∈ S, the subset X_s of X and the equivalence ρ_s on X satisfy the conditions (1)-(3). Define the action of S on X by xs = y if X_s ∩ xρ_s = {y}. Then by Lemma 3 X_s = im(s) = fix(s) and ρ_s = ker(s). Let x ∈ X and s, t ∈ S. Then by Lemma 2 (x, xs) ∈ ρ_s and (xs, (xs)t) ∈ ρ_t, so that (x, (xs)t) ∈ ρ_s ∨ ρ_t = ρ_{st}. Since xs ∈ X_s, (xs)t ∈ X_t and (xs, (xs)t) ∈ ρ_t, by the condition (3) we have (xs)t ∈ X_{st}. Consequently, (xs)t ∈ X_{st} ∩ xρ_{st}, so that (xs)t = x(st). Thus X is an S-act. By Lemma 2 (xs)s = xs, so that xs² = xs, and since (x, xt) ∈ ρ_t ⊆ ρ_{st}, xst = x(st) = (xt)(st) = xtst. Therefore X is a left regular band type S-act.

Let X be a set. Then the set of equivalences on X forms a lattice-ordered set under ∩ and ∨. We consider here a special set R(X) of equivalences on X, that is, for each ρ ∈ R(X), there exists a subset M_ρ such that:

(1) M_ρ ∩ xρ ≠ ∅ for every ρ-class xρ,

(2) for every λ ∈ R(X), if (x, y) ∈ ρ ∨ λ, then (u, v) ∈ λ for every u ∈ M_ρ ∩ xρ and v ∈ M_ρ ∩ yρ, and

(3) if z ∉ M_ρ and z ∈ xρ, then there exists λ ∈ R(X) such that (z, v) ∉ λ for every v ∈ yρ even if (x, y) ∈ ρ ∨ λ.

In this case, M_ρ is called the join-mediating set of ρ in R(X), and R(X) is called a JM-set, i.e., every element in R(X) has a join-mediating set in R(X).

Example. Let X = {1, 2, 3, 4} and R(X) = {ρ_1, ρ_2} with π(ρ_1) = {1, 3} ∪ {2, 4} and π(ρ_2) = {1, 2, 3} ∪ {4}. Then {1, 2, 3} and {2, 4} are the join-mediating sets of ρ_1 and ρ_2, respectively, so that R(X) is a JM-set. If π(ρ_3) = {1, 2} ∪ {3, 4}, then R(X) ∪ {ρ_3} is not a JM-set, since (2, 4) ∉ ρ_3.
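Condition (2) of the definition can be verified directly on the example above. The sketch below (helper names are ours) recomputes the join of two equivalences as a transitive closure and checks which candidate sets mediate it; condition (1) is omitted for brevity.

```python
from itertools import product

def equiv(partition):
    """The set of pairs of the equivalence with the given classes."""
    return {(x, y) for cls in partition for x in cls for y in cls}

def join(p, q):
    """Transitive closure of p ∪ q (the lattice join of two equivalences)."""
    r = p | q
    changed = True
    while changed:
        changed = False
        for (x, y), (y2, z) in product(list(r), list(r)):
            if y == y2 and (x, z) not in r:
                r.add((x, z))
                changed = True
    return r

def mediates(M, p, R, X):
    """Condition (2) of the JM-set definition for the candidate set M."""
    return all((u, v) in lam
               for lam in R
               for x, y in join(p, lam)
               for u in M & {z for z in X if (x, z) in p}
               for v in M & {z for z in X if (y, z) in p})

X = {1, 2, 3, 4}
rho1 = equiv([{1, 3}, {2, 4}])
rho2 = equiv([{1, 2, 3}, {4}])
rho3 = equiv([{1, 2}, {3, 4}])

R = [rho1, rho2]
print(mediates({1, 2, 3}, rho1, R, X))         # True
print(mediates({2, 4}, rho2, R, X))            # True
# Adding rho3 destroys the property: (2, 4) would have to lie in rho3.
print(mediates({2, 4}, rho2, R + [rho3], X))   # False
```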

The following result is useful for actually constructing left regular band type automata.

Corollary 4. Let X be an act over a free monoid S = A*. Then X is a left regular band type S-act if and only if, for each a ∈ A, there exists an equivalence ρ_a on X such that R(A) = {ρ_a | a ∈ A} is a JM-set.

Proof. Suppose that X is a left regular band type S-act. For each s ∈ S, put ker(s) = ρ_s. From Theorem 3, we have ρ_{st} = ρ_s ∨ ρ_t. Since (xs)s = xs² = xs, (x, xs) ∈ ker(s) = ρ_s, so that by Lemma 2 im(s) = fix(s). Let s ∈ S and (x, y) ∈ ρ_s ∨ ρ_t for any t ∈ S. Then (x, y) ∈ ρ_{st}, so that xst = yst; since (x, xs), (y, ys) ∈ ρ_s, this gives (xs, ys) ∈ ρ_t. Thus im(s) is the join-mediating set of ρ_s in {ρ_t | t ∈ S}, so that {ρ_t | t ∈ S} is a JM-set. Therefore R(A) = {ρ_a | a ∈ A} is also a JM-set.


Suppose conversely that, for each a ∈ A, there exists an equivalence ρ_a on X such that R(A) = {ρ_a | a ∈ A} is a JM-set. For each ρ_a ∈ R(A), let M_{ρ_a} be the join-mediating set of ρ_a in R(A). Define the action of A on X by xa ∈ M_{ρ_a} ∩ xρ_a and ya = xa if y ∈ xρ_a. Since xa ∈ xρ_a, we have (xa)a = xa. By Lemma 2, im(a) = fix(a). If (x, y) ∈ ρ_a, then xa = ya. If (x, y) ∈ ker(a), then xρ_a = yρ_a, since xa ∈ xρ_a, ya ∈ yρ_a and xa = ya. Thus ρ_a = ker(a). Let x ∈ X and a, b ∈ A, and let xab = y. Since (x, xa) ∈ ker(a) = ρ_a and (xa, y) = (xa, xab) ∈ ρ_b, (x, y) ∈ ρ_a ∨ ρ_b. As M_{ρ_b} is the join-mediating set of ρ_b in R(A), (xb, yb) ∈ ρ_a = ker(a), so that xba = yba = ya = xaba for all x ∈ X, since y ∈ im(b) = fix(b). By Lemma 1, X is a left regular band type S-act.

Theorem 5. Let X and S be as in Theorem 1. Then X is a semilattice type S-act if and only if X is an ordered set under some order relation ≤, and for each s ∈ S, there exist a subset X_s of X and an equivalence ρ_s on X which satisfy the following conditions:

(1) |X_s ∩ xρ_s| = 1 for every x ∈ X,
(2) each X_s is an o-ideal,
(3) if X_s ∩ xρ_s = {y}, then y ∈ I(x),
(4) X_{st} = X_s ∩ X_t for every s, t ∈ S, and
(5) ρ_{st} = ρ_s ∨ ρ_t for every s, t ∈ S.

Proof. Suppose that X is a semilattice type S-act. Since a semilattice is both a right regular band and a left regular band, this direction follows from Theorems 1 and 3.

Suppose conversely that (X, ≤) is an ordered set, and for each s ∈ S, the subset X_s of X and the equivalence ρ_s on X satisfy the conditions (1)-(5). Define the action of S on X by xs = y if X_s ∩ xρ_s = {y}. Then by Lemma 3 we have that X_s = im(s) = fix(s) and ρ_s = ker(s). Let x ∈ X and s, t ∈ S. By the same argument as in the proof of Theorem 1, we obtain that xs = (xs)s, (xs)t = ((xs)t)s and (xs)t ∈ X_s ∩ X_t = X_{st}, and by the same argument as in the proof of Theorem 3, we obtain that (xt)s = ((xs)t)s and (x, (xs)t) ∈ ρ_s ∨ ρ_t = ρ_{st}. Thus (xs)t = ((xs)t)s = (xt)s and (xs)t ∈ X_{st} ∩ xρ_{st}, so that (xs)t = x(st), and we have xs² = xs and x(st) = x(ts).

The following result is useful for actually constructing semilattice type automata.

Corollary 6. Let X be an act over a free monoid S = A*. Then X is a semilattice type S-act if and only if X is an ordered set under some order relation ≤, and for each a ∈ A there exists an equivalence ρ_a which satisfies the following conditions:

(1) R(A) = {ρ_a | a ∈ A} is a JM-set,


(2) each ρ_a-class xρ_a has a minimum element min(xρ_a), and
(3) M_{ρ_a} = {min(xρ_a) | x ∈ X} is an o-ideal in (X, ≤) and the join-mediating set of ρ_a in R(A).

Proof. Suppose that X is a semilattice type S-act. From xs² = xs, by Lemma 2 we have im(s) ∩ x ker(s) = {xs}. Since a semilattice is a right regular band and a left regular band, we have that (X, ≤) is an ordered set, where y ≤ x iff y = xs for some s ∈ S, and that im(s) = fix(s) is an o-ideal in (X, ≤) and the join-mediating set of ker(s) in R(S) = {ker(t) | t ∈ S}. Since xs ≤ x, we have xs = min(x ker(s)), so that im(s) = {min(x ker(s)) | x ∈ X}, which is an o-ideal in (X, ≤) and the join-mediating set of ker(s) in R(S). Thus R(S) is a JM-set. For each a ∈ A, put ker(a) = ρ_a and R(A) = {ρ_a | a ∈ A}. Then im(a) = {min(xρ_a) | x ∈ X} is an o-ideal in (X, ≤) and the join-mediating set of ρ_a in R(A).

Suppose conversely that (X, ≤) is an ordered set and for each a ∈ A, the equivalence ρ_a satisfies the conditions (1)-(3). Define the action of A on X by xa = y if y = min(xρ_a). Then we have xa ∈ I(x). Since M_{ρ_a} ∩ xρ_a = {min(xρ_a)}, by Lemma 3 M_{ρ_a} = im(a) = fix(a) and ρ_a = ker(a). Let x ∈ X and a, b ∈ A. Then xa² = (xa)a = xa. Since xab ≤ xa ∈ im(a) = M_{ρ_a} and M_{ρ_a} is an o-ideal, we have xab ∈ M_{ρ_a} = fix(a), so that xaba = (xab)a = xab. Let xab = y. Then (x, y) ∈ ρ_a ∨ ρ_b. Since (x, xb) ∈ ρ_b, (y, yb) ∈ ρ_b and M_{ρ_b} is the join-mediating set of ρ_b in R(A), we have (xb, yb) ∈ ρ_a = ker(a), so that xba = yba = ya = xaba. Consequently, xab = xaba = xba for all x ∈ X and every a, b ∈ A. By Lemma 1, X is a semilattice type S-act.

References

[1] Howie, J. M., "Fundamentals of Semigroup Theory", Oxford Science Publications, Oxford, 1995.

[2] Kunze, M. and S. Crvenković, Maximal subsemilattices of the full transformation semigroup on a finite set, Dissertationes Mathematicae CCCXIII, Polish Academy of Sciences, 1991.

[3] Petrich, M., "Lectures in Semigroups", John Wiley and Sons, London, 1977.

[4] Saito, T. and M. Katsura, Maximal inverse subsemigroups of the full transformation semigroup, in "Semigroups with Applications" (ed. J. M. Howie, W. D. Munn and H. J. Weinert), World Scientific, 1991, 101-113.


Two Optimal Parallel Algorithms on the Commutation Class of a Word

Extended abstract

René Schott*          Jean-Claude Spehner†

Abstract

The free partially commutative monoid M(A, θ) defined by a set of commutation relations θ on an alphabet A can be viewed as a model for concurrent computing: indeed, the independence or the simultaneity of two actions can be interpreted by the commutation of the two letters that encode them. In this context, the commutation class C_θ(w) of a word w of the free monoid A* plays a crucial role. In this paper we present:
- A characterization of the minimal automaton A_θ(w) of C_θ(w) with the help of the new notion of θ-dissection.
- A parallel algorithm which computes the minimal automaton A_θ(w). This algorithm is optimal if the size of A is constant.
- An optimal parallel algorithm for testing if a word belongs to the commutation class C_θ(w).
Our approach differs completely from the methods (based on Foata's normal form) used by C. Cérin and A. Petit [2, 3] for solving similar problems. Under some assumptions the first algorithm achieves an optimal speedup. The second algorithm also achieves an optimal speedup and has a time complexity in O(log n) if the number of processors is in O(n), where n is the length of the word w; the total number of operations is in O(n) and does not depend on the size of the alphabet A, as it does for the classical sequential algorithm.

Keywords: Automaton, commutation class, optimal, parallel algorithm, partially commutative monoid.


*LORIA and IECN, Université Henri Poincaré, 54506 Vandoeuvre-lès-Nancy, France, e-mail: schott@loria.fr
†Laboratoire MAGE, Faculté des Sciences et Techniques, Université de Haute Alsace, 68093 Mulhouse, France, e-mail: [email protected]


1 Introduction

The free partially commutative monoid was introduced by P. Cartier and D. Foata [1] for the study of combinatorial problems in connection with word arrangements. It has particularly been investigated as a model for concurrent systems (see [4, 13]) since the pioneering work of A. Mazurkiewicz [9]. In this context the computation of the commutation class of an element w (i.e. all words equivalent to w) is of great interest since it gives all transactions equivalent to the initial one modulo the partial commutation relations. In other words, if a transaction is correct (i.e. no deadlock appears during its execution) then all elements of its commutation class are also correct. This paper is devoted to the design of
- an optimal parallel algorithm which computes the minimal automaton of the commutation class of a given word on a constant-size alphabet and achieves an optimal speedup under some assumptions,
- an optimal parallel algorithm for testing if a word belongs to this commutation class.
Our test algorithm is particularly original since its time complexity does not depend on the size of the alphabet on which the word is written. The notion of optimality of parallel algorithms used in this paper is defined as follows (see [7]): Given a computational problem Q, let the sequential time complexity of Q be T_seq(n), where n is the size of Q's data. This assumption means that there is an algorithm to solve Q whose running time is O(T_seq(n)). A parallel algorithm to solve Q will be called optimal if the total number of operations it uses is asymptotically the same as the sequential complexity of the problem, regardless of the running time T_par(n) of the parallel algorithm. The organization of the paper is as follows: Section 2 provides the basic notions on partial commutativity and gives a characterization of the minimal automaton of a commutation class with the help of the new notion of θ-dissection.
Section 3 focuses on the design of a parallel algorithm which constructs the partial automaton of a commutation class. Testing if a word belongs to a commutation class is the subject of Section 4. We mainly give sketches of proofs of our results. All details will be provided in the full version of this paper.


2 The partial minimal automaton of the commutation class of a word

Let A be a finite alphabet, A* the free monoid on A and θ a partial commutation relation on A. With (A, θ) we associate the smallest congruence (denoted ≡_θ) such that (a, b) ∈ θ ⟺ ab ≡_θ ba. Let w be an element of A*. The commutation class of w is the set C_θ(w) defined as follows:

C_θ(w) = {w′ ∈ A* | w′ ≡_θ w}.
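Since ≡_θ is generated by swaps of adjacent permutable letters, C_θ(w) can be computed by a plain breadth-first closure. The sketch below is a naive sequential baseline of our own (not one of the algorithms of this paper), using the word and relation of this paper's running example.

```python
from collections import deque

def commutation_class(w, theta):
    """All words equivalent to w modulo theta, obtained by closing {w}
    under swaps of adjacent permutable letters."""
    swaps = theta | {(b, a) for (a, b) in theta}
    seen, queue = {w}, deque([w])
    while queue:
        u = queue.popleft()
        for i in range(len(u) - 1):
            if (u[i], u[i + 1]) in swaps:
                v = u[:i] + u[i + 1] + u[i] + u[i + 2:]
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
    return seen

theta = {('a', 'b'), ('a', 'c'), ('a', 'd'), ('a', 'e'),
         ('b', 'd'), ('b', 'e'), ('c', 'd')}
C = commutation_class('abcdbe', theta)
print(len(C))
print('bacdbe' in C)   # True: a and b commute, so the first swap is allowed
```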

For each rational language L of A*, there exists a finite minimal automaton A(L) which recognizes L. If L is finite, A(L) admits a non-terminal state z such that, for each letter a ∈ A, z.a = z; by deleting the state z, we get the partial minimal automaton Aᵖ(L) of L. The partial minimal automaton of the class C_θ(w) is denoted A_θ(w).

Definition 1 A scattered subword (not necessarily a factor) m = a_{i_1} a_{i_2} ... a_{i_h} of w = a_0 a_1 ... a_{n−1} is called rigid relatively to θ if none of the pairs of letters

(a_{i_1}, a_{i_2}), (a_{i_2}, a_{i_3}), ..., (a_{i_{h−1}}, a_{i_h})

belongs to θ ∪ θ⁻¹, i.e. two consecutive letters of m are either equal or distinct and not permutable with respect to θ.

It is easy to prove that all words of C_θ(w) have the same rigid subwords.

Definition 2 i) For each strictly increasing sequence of integers σ = (i_1, ..., i_p) of the set {0, ..., n−1}, the strictly increasing sequence τ = (j_1, ..., j_q) such that {j_1, ..., j_q} = {0, ..., n−1} − {i_1, ..., i_p} is called the complementary sequence of σ for {0, ..., n−1}. By symmetry, σ is the complementary sequence of τ for {0, ..., n−1}. u = a_{i_1} ... a_{i_p} and v = a_{j_1} ... a_{j_q} are then subwords of w = a_0 ... a_{n−1}, and w is a shuffle of u and v. The word v is then said to be complementary of u with respect to w. A strictly increasing sequence σ admits a unique complementary sequence, but this is not true for words, since two distinct strictly increasing sequences σ = (i_1, ..., i_p) and σ′ = (i′_1, ..., i′_p) can define the same word u = a_{i_1} ... a_{i_p} = a_{i′_1} ... a_{i′_p} (see Example 1 below).

ii) Let π be a permutation of {0, ..., n−1} and w′ = a_{π(0)} ... a_{π(n−1)}. Every pair (i, j) of elements of {0, ..., n−1} such that i < j and π(j) < π(i) is called an inversion of the sequence (π(0), ..., π(n−1)) and also an inversion of w′ with respect to w.


iii) A pair (σ, τ) of strictly increasing complementary sequences of the set {0, ..., n−1} is called a θ-dissection of {0, ..., n−1} if, for each inversion (j, i) of the sequence στ = (i_1, ..., i_p, j_1, ..., j_q), the letters a_i and a_j are distinct and permutable for θ. If (σ, τ) is a θ-dissection of {0, ..., n−1}, the pair (u, v) of subwords u = a_{i_1} ... a_{i_p} and v = a_{j_1} ... a_{j_q} of w is called a θ-dissection of the word w.

Example 1 If w = abcdbe and θ = {(a, b), (a, c), (a, d), (a, e), (b, d), (b, e), (c, d)}, the sequences σ = (1, 2, 3) and τ = (0, 4, 5) are complementary for {0, ..., 5} and (σ, τ) is a θ-dissection of {0, ..., 5}, since the inversions (0, 1), (0, 2) and (0, 3) of the sequence στ = (1, 2, 3, 0, 4, 5) correspond to the pairs (a, b), (a, c) and (a, d) of θ. The pair of words (u, v) where u = bcd and v = abe is therefore a θ-dissection of w = abcdbe. (bc, adbe), (bcde, ab), (bcdbe, a) are also θ-dissections of w. The subword u = abe of w admits two complementary subwords v = cdb and v′ = bcd, which correspond to σ = (0, 1, 5), τ = (2, 3, 4) and σ′ = (0, 4, 5), τ′ = (1, 2, 3).
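The inversion condition of Definition 2 iii) is straightforward to test by machine. The sketch below re-derives two cases of Example 1 (the helper is_dissection is our own naming, not from the paper).

```python
w = 'abcdbe'
theta = {('a', 'b'), ('a', 'c'), ('a', 'd'), ('a', 'e'),
         ('b', 'd'), ('b', 'e'), ('c', 'd')}
perm = theta | {(b, a) for (a, b) in theta}   # theta ∪ theta^{-1}

def is_dissection(sigma, n):
    """Definition 2 iii): every inversion of the concatenated sequence
    sigma·tau must involve two distinct letters permutable for theta."""
    tau = [j for j in range(n) if j not in sigma]
    seq = list(sigma) + tau
    return all(w[seq[l]] != w[seq[k]] and (w[seq[l]], w[seq[k]]) in perm
               for k in range(len(seq))
               for l in range(k + 1, len(seq))
               if seq[l] < seq[k])           # (k, l) is an inversion

# (sigma, tau) = ((1,2,3), (0,4,5)), i.e. (u, v) = (bcd, abe), is a dissection:
print(is_dissection((1, 2, 3), 6))   # True
# whereas sigma = (2, 4) (u = cb) is not, since b and c do not permute:
print(is_dissection((2, 4), 6))      # False
```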

Figure 1: The graph of the partial minimal automaton A_θ(w) for w = abcdbe and θ = {(a, b), (a, c), (a, d), (a, e), (b, d), (b, e), (c, d)}. The states are denoted in accordance with Section 3.

Theorem 1 i) The function φ which associates the state s = 1.u with each θ-dissection (u, v) of w is bijective.

ii) If u = b_1 ... b_p, the letters a for which there exists a transition towards s relative to a are the letters b_i such that, if i ≠ p, b_i permutes with the letters b_{i+1}, ..., b_p.


iii) If v = c_1 ... c_q, the letters a for which there exists a transition issued from s relative to a are the letters c_i such that, if i ≠ 1, c_i permutes with the letters c_1, ..., c_{i−1}.

Proof sketch. The proof of the theorem is based on the following results:
- For each subword u of w there exists at most one subword v of w such that (u, v) is a θ-dissection of w.
- If u and v are subwords of w, then (u, v) is a θ-dissection of w if and only if uv belongs to C_θ(w).
- For each state s of A_θ(w), if u and v are words such that 1.u = s and s.v = f, then L(A_θ(w), 1, s) = C_θ(u) and L(A_θ(w), s, f) = C_θ(v). □

3 A parallel algorithm

In this section we design an optimal parallel algorithm which constructs the partial minimal automaton A_θ(w). We give an overview of how our algorithm works. The algorithm first constructs the partial automaton A_0 which recognizes only the word w. The transformation of A_0 into the automaton A_θ(w) is based, essentially, on the following simple transformation: if t, u and v are states and a, b are letters of A such that t = u.a, v = t.b and (a, b) ∈ θ ∪ θ⁻¹, then there exists also a state s such that s = u.b and s.a = v. If s does not exist already it has to be constructed, and the transitions s = u.b and s.a = v have to be created (if they do not exist). This transformation, called the permutation of the letters a and b at t, can in turn generate new permutations of some letters at u and v. If such permutations are realized in parallel, it may happen that they try to create the same state simultaneously. In order to avoid this possibility we associate an integer with each state. This integer does not depend on its creation procedure, and we distribute the states among the different processors.

3.1 The distribution of the states among the processors

Theorem 1 proves that each state s of A_θ(w) is in 1-1 correspondence with a θ-dissection (u, v) of w. If w = w[0]w[1]w[2]...w[n−1] (from now on arrays of letters are used for words), u has the form w[i_1]w[i_2]...w[i_k] where (i_1, i_2, ..., i_k) is a strictly increasing sequence of integers of {0, 1, ..., n−1}. It follows that we can identify the number 1 + 2^{i_1} + 2^{i_2} + ... + 2^{i_k} with the state s. In fact, if we put z = s − 1 and remove iteratively from z the smallest power of 2, we recover the sequence (i_1, ..., i_k). Every state is hence an element of the universe U = {1, ..., 2^n}.
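In a language with integers of arbitrary size this identification is simply a bitmask. The following sketch (helper names are ours) encodes a strictly increasing index sequence as 1 + Σ 2^i and decodes it back by stripping lowest set bits, exactly as described above.

```python
def state_number(indices):
    """Identify the subword w[i1]...w[ik] (i1 < ... < ik) with 1 + sum of 2^i."""
    return 1 + sum(1 << i for i in indices)

def indices_of(s):
    """Recover (i1, ..., ik) by stripping the lowest set bit of z = s - 1."""
    z, out = s - 1, []
    while z:
        low = z & -z                 # smallest power of 2 contained in z
        out.append(low.bit_length() - 1)
        z -= low
    return out

s = state_number([0, 2, 3])
print(s)                 # 14, i.e. 1 + 1 + 4 + 8
print(indices_of(14))    # [0, 2, 3]
```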


Let p be the number of processors which are available on the computer and r the largest odd number which is strictly less than p. If we suppose that p is a power of 2 (a frequent situation), r and p are mutually prime. We now split U into r parts U_1, ..., U_r of equal size (up to 1) such that, for each processor q of {1, ..., r}, U_q is the set of integers s such that 1 + s mod r = q. The processor q has in charge the treatment of all created states which belong to U_q, and stores in its local memory all data concerning these states. If a state s is created, the processor q = 1 + s mod r is activated for:
1) inserting s in a stack of its local memory,
2) affecting a number num[s] to the state s thanks to the following procedure:

insert(s, q);
{ if 0 < num[s] ≤ size then stack[num[s]] := s
  else { size := size + 1; stack[size] := s; num[s] := size }
}

3) testing if a state s of U_q has been created previously by the procedure:

exist(s, q);
{ if 0 < num[s] ≤ size and stack[num[s]] = s
  then exist(s, q) := true
  else exist(s, q) := false
}

The variable size, common to all these procedures, is stored in the local memory of the processor q and is not used by any other processor. For a given state s, all these procedures are executed by the same processor q. Therefore the simultaneous execution of several of these procedures for the same state s is not possible. Nevertheless, two distinct processors can execute these procedures simultaneously, since they then concern distinct states.

Remark 1 These procedures are executed in time O(1) and advantageously replace the use of an array of booleans. In fact, the time complexity of the initialization of an array of booleans for the universe U is in O(2^n). Here the initialization reduces to setting size = 0 for each processor (see [10], page 289). Its time complexity is therefore in O(1) and the total number of operations is in O(r).
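The insert/exist pair above is the classical constant-time-initialization trick: an entry of num is trusted only if it points into the valid prefix of stack and the stack entry points back. A sketch in Python follows; note that a dict stands in for the uninitialized array num, so the "garbage contents" aspect is only simulated.

```python
class SparseSet:
    """Membership structure with O(1) insert/lookup and no initialization
    cost over a huge universe: the mutual-pointer check between num and
    stack makes stale garbage in num harmless (cf. insert/exist above)."""
    def __init__(self):
        self.num = {}        # stands in for an uninitialized array num[s]
        self.stack = [None]  # stack[1..size]; index 0 unused
        self.size = 0

    def insert(self, s):
        if not self.exist(s):
            self.size += 1
            self.stack.append(s)
            self.num[s] = self.size

    def exist(self, s):
        e = self.num.get(s, 0)   # a real array read could return garbage here
        return 0 < e <= self.size and self.stack[e] == s

states = SparseSet()
states.insert(14)
print(states.exist(14))   # True
print(states.exist(99))   # False
```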

Remark 2 The partition used here is well-balanced for the subsets U_q of the universe U, but this is not necessarily the case for the created subsets of states which belong to the subsets U_q. If the word w has no particularities, such a splitting is adequate; otherwise, a size-balancing method has to be found for the sets S ∩ U_q.


3.2 The data structures associated with a state

Let s be a state, q = 1 + s mod r the processor associated with s and e = num[s]. The following data structures are used for s in the local memory of the processor q:
- an array transin[.., e] which contains the integers i such that there exists a state u with s = u + 2^i = u.w[i] (transitions towards s);
- the number nbin[e] of transitions towards s;
- an array numin[.., e] such that if h = numin[i, e] then transin[h, e] = i;
- an array transout[.., e] which contains the integers i such that there exists a state v with v = s + 2^i = s.w[i] (transitions issued from s);
- the number nbout[e] of transitions issued from s;
- an array numout[.., e] such that if h = numout[i, e] then transout[h, e] = i.
The procedures below are all based on the same idea, which is to avoid using arrays of booleans in order to realize the initialization in constant time.

insertin(i, e, q);
{ nbin[e] := nbin[e] + 1; numin[i, e] := nbin[e]; transin[nbin[e], e] := i }

existin(i, e, q);
{ if 0 < numin[i, e] ≤ nbin[e] and transin[numin[i, e], e] = i
  then existin(i, e, q) := true
  else existin(i, e, q) := false
}

The dual procedures insertout(i, e, q) and existout(i, e, q) are omitted.

Since, for each state s, it is always the same processor q which has access to the data concerning s, there is no concurrent access to the data.

3.3 Construction of the partial automaton which recognizes the word w

The procedure affect-first-states can be executed by any processor. It creates the states of the partial automaton which recognizes only the word w = w[0]...w[n−1].

affect-first-states;
{ for h ∈ {0, ..., n} pardo
  { s := 2^h; q := 1 + s mod r;
    insert(s, q);
    initialize-state(s, h, q)
  }
}


The processor q which is determined here for any state s created by the procedure affect-first-states initializes, in the procedure initialize-state, the management of the transitions towards s and of the transitions issued from s.

initialize-state(s, h, q);
{ e := num[s]; nbin[e] := 0; nbout[e] := 0;
  if h = 0 then insertout(0, e, q)
  else if h = n then insertin(n − 1, e, q)
  else { insertin(h − 1, e, q); insertout(h, e, q) }
}

3.4 Starting the construction of the automaton

The procedure construct-basic-automaton reduces to the call of the procedure which permutes two letters for all pairs of successive permutable letters of the word w, and can be executed by any processor.

construct-basic-automaton;
{ for h := 1 to n − 1 pardo
  { s := 2^h;
    if (w[h − 1], w[h]) ∈ θ ∪ θ⁻¹ then permute(s, h − 1, h)
  }
}

3.5 Permutation of two letters

The state t is given, as well as the indices i and j of the letters w[i] and w[j] such that there exist a transition towards t relative to the letter w[i] and a transition issued from t relative to the letter w[j]. The previous state u, the next state v and the diagonal state s which form a parallelogram with t are then respectively u = t − 2^i, v = t + 2^j and s = u + 2^j = v − 2^i. The permutation procedure can also be executed by any processor. It creates the state s (except if s already exists) and, by calling three procedures, a transition from u to s related to the letter w[j] (except if the transition already exists) and a transition from s to v corresponding to w[i] (if this transition does not exist). It therefore modifies the current automaton A_h in order to add to the recognized language L(A_h, 1, 2^n) all words of K_1 = L(A_h, 1, u)w[j]w[i]L(A_h, v, 2^n) if s is created, and one of the languages K_2 = L(A_h, 1, s)w[i]L(A_h, v, 2^n), K_3 = L(A_h, 1, u)w[j]L(A_h, s, 2^n), K_2 ∪ K_3 or ∅ otherwise.


permute(t, i, j);
{ u := t − 2^i;      (u is the state before t)
  v := t + 2^j;      (v is the state after t)
  s := u + 2^j;      (s is the diagonal state of t)
  q := 1 + s mod r;
  if exist(s, q) then init := 0 else { insert(s, q); init := 1 };
  pardo
  { previous(u, j, 1 + u mod r);
    next(v, i, 1 + v mod r);
    diagonal(s, i, j, init, q)
  }
}

Figure 2: The parallelogram associated with the procedure permute (vertices u = t − 2^i, t, v = t + 2^j and s = u + 2^j).

3.6 The treatment of the previous and the next states

The previous state u of t is already created, but the transition from u to s is not necessarily created. If the transition from u to s is created, the procedure permute is called for all the triples (u, h, j) where there currently exists a transition towards u relative to the letter w[h]. This procedure is necessarily executed by the processor q = 1 + u mod r affected to the state u.

previous(u, j, q);
{ e := num[u];
  if existout(j, e, q) = false then
  { insertout(j, e, q);
    if nbin[e] ≠ 0 then
    { for k := 1 to nbin[e] pardo
      { h := transin[k, e];
        if (w[h], w[j]) ∈ θ ∪ θ⁻¹ then permute(u, h, j)
      }
    }
  }
}


The treatment of the state "next" is the dual of the treatment of the state "previous".

3.7 The treatment of the diagonal state

This procedure is executed by the processor q = 1 + s mod r associated with s. If s is created, then we have to initialize nbin[e] and nbout[e] and to create the transitions to and from s relative to the letters w[j] and w[i]. If s already exists, the transition from u to s (resp. from s to v) is created if it does not exist. Possibly, there is nothing to do.

diagonal(s, i, j, init, q);
{ e := num[s];
  if init = 1 then { nbin[e] := 0; insertin(j, e, q);
                     nbout[e] := 0; insertout(i, e, q) }
  else { if existin(j, e, q) = false then insertin(j, e, q);
         if existout(i, e, q) = false then insertout(i, e, q)
  }
}

Definition 3 A_0 is the automaton determined by the procedure affect-first-states and A_h is the current automaton after executing the procedure permute h times. The procedure permute is executed only once for a triple (t, i, j) and there exist only a finite number of such triples. Thus our algorithm terminates. Let A_end be the automaton which is finally constructed by our algorithm.

It is easy to prove that:
• The automaton A_end is deterministic and monogeneous.
• The language recognized by the automaton A_end is L(A_end, 1, 2^n) = C_θ(w).
It follows that:

Theorem 2 The partial automaton constructed by our algorithm is isomorphic to the partial minimal automaton A_θ(w) of the commutation class C_θ(w).

Theorem 3 i) If Size(A_θ(w)) is the size of the partial minimal automaton A_θ(w) and S is its set of states, the total number of operations of our algorithm is in

O(Size(A_θ(w)) · card(A)) = O(card(S) · (card(A))²).


ii) If the alphabet A is of constant size, our algorithm is optimal. iii) If the alphabet A is of constant size and if the distribution of the states of S among the subsets U_1, ..., U_r is uniform (i.e. balanced), then our algorithm achieves an optimal speedup.

Proof. i) If there exist k(s) transitions towards a state s and l(s) transitions issued from s, the treatment of the state s in the procedures previous, next and diagonal requires O(k(s) · l(s)) operations. Since k(s) ≤ card(A) and Σ_{s∈S} l(s) = Size(A_θ(w)), the total number of operations is in O(Size(A_θ(w)) · card(A)) = O(card(S) · (card(A))²).

ii) If the alphabet A is of constant size, the total number of operations is in O(Size(A_θ(w))). But any algorithm which constructs a partial automaton recognizing C_θ(w) necessarily tests all the transitions issued from each state, and therefore the number of operations of such an algorithm is necessarily in O(Size(A_θ(w))); this proves that our algorithm is optimal in this case.

iii) If the distribution of the states of S is uniform (i.e. balanced) among the subsets U_1, ..., U_r, the r processors are load-balanced. In addition all procedures which are not affected to a processor can be distributed with priority on the processors r+1, ..., p and then uniformly among all processors. For every q ∈ {1, ..., p}, let T_q(n) be the total number of operations realized by the processor q during the execution of the algorithm for a word w of length n and let T_max(n) = max{T_q(n); q ∈ {1, ..., p}}. Since the processors are load-balanced, there exists a strictly positive constant c_1 (c_1 < 1) such that, for every q ∈ {1, ..., p}, T_q(n) ≥ c_1 · T_max(n). Therefore we get:

c_1 · p · T_max(n) ≤ Σ_{q=1}^{p} T_q(n) ≤ p · T_max(n).

Let T_par(n) and T_seq(n) be respectively the time complexity of our parallel algorithm and the time complexity of an optimal sequential algorithm which constructs the automaton A_θ(w) when w is of length n. Since our algorithm is optimal, there exist strictly positive constants c_2 and c_3 such that T_par(n) = c_2 · T_max(n) and T_seq(n) = c_3 · Σ_{q=1}^{p} T_q(n). Therefore the speedup Sp(n) = T_seq(n)/T_par(n) (see [7]) verifies c_1 · (c_3/c_2) · p ≤ Sp(n) ≤ (c_3/c_2) · p, and this proves that Sp(n) is in O(p) and is optimal. □

4 Testing if a word belongs to a commutation class

We want to test if a given word u = u[0] ... u[n - 1] belongs to the commutation class C_θ(w), i.e. if this word is recognized by the automaton A_θ(w).


An elementary sequential algorithm solves this problem in time O(n). We design a parallel algorithm which solves this problem in time O(log n) when the number of processors is in O(n). Moreover the total number of operations is in O(n) and does not depend on the size of the alphabet A. Hence our algorithm is optimal. We now give an overview of our algorithm. We first use a very simple test which verifies that, for every letter a ∈ A, the numbers of occurrences of a in the two words u and w are equal. Then we determine, for every i ∈ {0, ..., n - 1}, the value j = eta[i] ∈ {0, ..., n - 1} such that w[j] = u[i] and the numbers of occurrences of the letter w[j] = u[i] in the words w[0] ... w[j - 1] and u[0] ... u[i - 1] are equal. Since the states of A_θ(w) are identified with integers of the form 1 + 2^{i_1} + 2^{i_2} + ... + 2^{i_k}, we can determine, by a prefix sum calculation in O(log n) time, all the intermediate states which are necessary for recognizing the word u. In fact this computation is done on the universe U = {1, ..., 2^n}, and U is also the set of states of a partial automaton A_{θ1}(w_1) where w_1 is of length n and all letters of w_1 are distinct and two by two permutable. Now u ∈ C_θ(w) if and only if all these intermediate states are states of the automaton A_θ(w). Our algorithm uses three well-known procedures.

4.1 Known used procedures

The following procedures compute respectively the sum, the prefix sum and the maximum of the elements in an array. For details see [6, 7, 8]. The procedure somme(k, l, x[k..l], sum) (where k < l) computes the sum of the l - k + 1 elements of the array x[k..l] and puts the result in the variable sum. The procedure somme-prefix(k, l, x[k..l], sx[k..l]) (where k < l) computes, for each index i of {k, ..., l}, the prefix sum sx[i] = x[k] + ... + x[i]. The result is then in the array sx[k..l]. The procedure maximum(k, l, x[k..l], max) (where k < l) computes the maximum of the l - k + 1 elements of the array x[k..l] and puts the result in the variable max. All these procedures are optimal and have a time complexity in O(log(l - k + 1)) when the number of processors is in O(l - k + 1).
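These procedures can be pictured with the classical data-parallel scan of Hillis and Steele [6]; the Python sketch below is our own sequential simulation of somme-prefix (not the paper's PRAM code), performing the O(log n) doubling steps one after another:

```python
def somme_prefix(xs):
    """Inclusive prefix sums by pointer doubling: after the step with
    offset d, position i holds the sum of up to 2*d trailing elements."""
    sx = list(xs)
    d = 1
    while d < len(sx):
        # one 'parallel' step: every position i >= d adds the value d back
        sx = [sx[i] + (sx[i - d] if i >= d else 0) for i in range(len(sx))]
        d *= 2
    return sx
```

The procedure somme is then the last entry of the result, and maximum is obtained by replacing + with max in the same scheme.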

4.2 Letter occurrences in a word

An alphabetic order is given on the alphabet A: the array order is such that, for all a ∈ A, a is the order[a]-th letter of the alphabet A. Let v = v[0]v[1] ... v[n - 1] be a word of length n and for each letter a of the alphabet A let nocv[order[a]] be the number of occurrences of a in v.


The purpose of the procedure letter-occurrences given below is to determine, in time O(log n), the number of occurrences nocv[order[a]] of a in v simultaneously for every letter a in A. We choose a number base which is bigger than the number of occurrences of every letter in v: base = n - card(A) + 2 (we suppose here that every letter of A has at least one occurrence in v). Since card(A) ≤ n, we can precompute all the powers base^2, ..., base^{card(A)-1} of base in O(log n) time by an algorithm similar to somme-prefix. The value sum computed by this procedure is Σ_{k=1}^{card(A)} nocv[k] · base^{k-1}. The value sum permits us to determine simultaneously the number of occurrences of every letter of A in v.

letter-occurrences(v[0..n - 1], nocv[1..card(A)]);
{ base := n - card(A) + 2;
  for i := 0 to n - 1 pardo
  { k := order[v[i]]; x[i] := base^{k-1} }
  somme(0, n - 1, x[0..n - 1], sum);
  for k := 1 to card(A) pardo
  { divsum := sum div base^{k-1};
    nocv[k] := divsum mod base
    (divsum stands for the floor of sum/base^{k-1})
  }
}
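The base-b counting trick of letter-occurrences can be sketched sequentially in Python (our own illustration; the names order and base follow the text, while the dictionary-based interface is ours):

```python
def letter_occurrences(v, alphabet):
    # order[a] = 1-based rank of a in the alphabetic order on A
    order = {a: k + 1 for k, a in enumerate(sorted(alphabet))}
    # base exceeds every occurrence count (every letter of A occurs in v)
    base = len(v) - len(alphabet) + 2
    # each position i contributes base**(order[v[i]] - 1); the digits of
    # the total, written in base 'base', are the occurrence counts
    total = sum(base ** (order[a] - 1) for a in v)
    return {a: (total // base ** (order[a] - 1)) % base for a in alphabet}
```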

4.3 The first test

The purpose of the procedure first-test given below is to compare, for the two words w = w[0]w[1] ... w[n - 1] and u = u[0]u[1] ... u[n - 1] of length n and for each letter a of the alphabet A, the numbers of occurrences nocw[order[a]] and nocu[order[a]] of the letter a in w and u. Hence the procedure letter-occurrences is called for the words u and w. If these numbers are not two by two equal then the array idoc[1..card(A)] contains a zero and test1 ≠ card(A). In this case u is not in the commutation class C_θ(w).

first-test(u[0..n - 1], w[0..n - 1]);
{ letter-occurrences(u[0..n - 1], nocu[1..card(A)]);
  letter-occurrences(w[0..n - 1], nocw[1..card(A)]);
  for k := 1 to card(A) pardo
  { if nocu[k] = nocw[k] then idoc[k] := 1 else idoc[k] := 0 }
  somme(1, card(A), idoc[1..card(A)], test1);
  if test1 ≠ card(A) then write ('u does not belong to the class')
}


4.4 The reference word

Let z = z[0]z[1] ... z[n - 1] be the word of length n which satisfies the following conditions: order[z[0]] ≤ order[z[1]] ≤ ... ≤ order[z[n - 1]] for the alphabetic order on A, and for each letter a of A the number of occurrences of a in z is equal to the number of occurrences of a in w. We call z the reference word of w. By applying the procedure somme-prefix to the array nocw we obtain an array decal such that, for every letter a ∈ A with order[a] > 1, decal[order[a] - 1] is the index of the first occurrence of the letter a in z. Moreover if order[a] = 1, the index of the first occurrence of a in z is obviously 0. In the same procedure we compute a better value for the identifier base, which is equal to the maximum number of occurrences of a letter in w plus one. Similarly, as in the procedure letter-occurrences, base is used in the procedure reference-word below for determining the indices of the occurrences of each letter a of A.

reference-word(1, card(A), nocw[1..card(A)]);
{ somme-prefix(1, card(A), nocw[1..card(A)], decal[1..card(A)]);
  decal[0] := 0;
  maximum(1, card(A), nocw[1..card(A)], max);
  base := max + 1
}
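For illustration, the reference word z and the array decal can be computed sequentially as follows (a sketch under our own naming; the paper obtains decal and base with somme-prefix and maximum in parallel):

```python
from itertools import accumulate

def reference_word(w):
    """Return the reference word z of w, the array decal of first-occurrence
    indices, and the improved base (sequential sketch)."""
    letters = sorted(set(w))
    counts = [w.count(a) for a in letters]        # nocw in alphabetic order
    z = "".join(a * c for a, c in zip(letters, counts))
    # decal[k] = index in z of the first occurrence of the (k+1)-th letter
    decal = [0] + list(accumulate(counts))[:-1]
    base = max(counts) + 1                        # max occurrences plus one
    return z, decal, base
```

With a word having the letter counts of Example 2 (one a, two b, one each of c, d, e) this yields z = abbcde and base = 3, matching the text.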

4.5 Analysis of a word

The purpose of the procedure analyze-word given below is to determine, for a given word v = v[0]v[1] ... v[n - 1] of length n, the array phi such that: for each i of {0, ..., n - 1}, z[i] = v[phi[i]], and for every pair (i, j) such that z[i] = z[j] and i < j, phi[i] < phi[j]. The array phi associates, for every letter a ∈ A and for every admissible value of r, the r-th occurrence of a in v with the r-th occurrence of a in z. The array decal[0..card(A)] is used by this procedure.

analyze-word(0, n - 1, v[0..n - 1], phi[0..n - 1]);
{ for i := 0 to n - 1 pardo
  { k := order[v[i]]; x[i] := base^{k-1} }
  somme-prefix(0, n - 1, x[0..n - 1], sx[0..n - 1]);
  for i := 0 to n - 1 pardo
  { rx[i] := sx[i] div x[i]; r[i] := rx[i] mod base;
    phi[decal[order[v[i]] - 1] + r[i] - 1] := i
  }
}
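The effect of analyze-word can be mimicked sequentially; the sketch below (our own occurrence-matching code, not the prefix-sum implementation of the paper) computes phi so that z[i] = v[phi[i]] with matching occurrences kept in order. With w = abcdbe, a word consistent with the phiw values of Example 2, it reproduces those values.

```python
def analyze_word(v, z):
    # indices of the successive occurrences of each letter in v
    occurrences = {}
    for idx, a in enumerate(v):
        occurrences.setdefault(a, []).append(idx)
    taken = {a: 0 for a in occurrences}
    phi = []
    for a in z:          # z lists each letter's occurrences in order
        phi.append(occurrences[a][taken[a]])
        taken[a] += 1
    return phi
```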


4.6 The transformation of a word

The next procedure uses the procedure analyze-word for determining the arrays phiu and phiw and the array eta[0..n-1] which is such that eta[phiu[i]] = phiw[i] for every i ∈ {0, ..., n - 1}. Thus eta[i] = j if and only if there exists an integer r such that u[i] and w[j] are the r-th occurrences of a same letter of A in the words u and w.

transform-word(u[0..n - 1], w[0..n - 1], eta[0..n - 1]);
{ analyze-word(0, n - 1, w[0..n - 1], phiw[0..n - 1]);
  analyze-word(0, n - 1, u[0..n - 1], phiu[0..n - 1]);
  for i := 0 to n - 1 pardo eta[phiu[i]] := phiw[i]
}
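A sequential sketch of transform-word follows (our own code; w = abcdbe in the test is a word consistent with the phiw and eta values of Example 2):

```python
def transform_word(u, w):
    """eta[i] = j iff u[i] and w[j] are the r-th occurrence of the same
    letter in u and w respectively (sequential sketch)."""
    pos, seen = {}, {}
    for j, a in enumerate(w):
        r = seen.get(a, 0) + 1
        seen[a] = r
        pos[(a, r)] = j          # index of the r-th occurrence of a in w
    eta, seen = [], {}
    for a in u:
        r = seen.get(a, 0) + 1
        seen[a] = r
        eta.append(pos[(a, r)])
    return eta
```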

4.7 The second test

The array eta and the procedure somme-prefix allow us to determine all states of the automaton A_θ(w) which recognize the word u and all its left factors in the case where u belongs to the commutation class C_θ(w). In the opposite case, there exists an i ∈ {0, ..., n - 1} such that exist(sx[i] + 1) = false and u is not in the commutation class C_θ(w).

the-second-test(eta[0..n - 1]);
{ for i := 0 to n - 2 pardo x[i] := 2^{eta[i]};
  somme-prefix(0, n - 2, x[0..n - 2], sx[0..n - 2]);
  for i := 0 to n - 2 pardo
  { q := 1 + (sx[i] + 1) mod r;
    if exist(sx[i] + 1, q) then y[i] := 0 else y[i] := 1
  }
  somme(0, n - 2, y[0..n - 2], test2);
  if test2 = 0 then write ('u belongs to the class')
  else write ('u does not belong to the class')
}

Example 2 If w and θ are as in Example 1 and if u = dbceba, then z = abbcde, base = 3,

phiw[0] = 0, phiw[1] = 1, phiw[2] = 4, phiw[3] = 2, phiw[4] = 3 and phiw[5] = 5,
phiu[0] = 5, phiu[1] = 1, phiu[2] = 4, phiu[3] = 2, phiu[4] = 0, phiu[5] = 3.
Hence eta[0] = 3, eta[1] = 1, eta[2] = 2, eta[3] = 5 and eta[4] = 4. Since the states sx[0] + 1 = 9, sx[1] + 1 = 11, sx[2] + 1 = 15, sx[3] + 1 = 47 and sx[4] + 1 = 63 are states of the automaton A_θ(w), u = dbceba is accepted. But the word v = dbecba is not accepted since the state sx[2] + 1 = 43 is not a state of the automaton A_θ(w) (see Figure 1).
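The state computation of the-second-test can be replayed on Example 2; the sketch below (plain Python; the membership test exist(...) against A_θ(w) is abstracted away) accumulates the prefix sums 1 + 2^{eta[0]} + ... and returns the intermediate states:

```python
def intermediate_states(eta):
    # the-second-test only scans i = 0 .. n-2
    states, acc = [], 1
    for e in eta[:-1]:
        acc += 2 ** e        # next state: previous state plus 2**eta[i]
        states.append(acc)
    return states
```

For eta = [3, 1, 2, 5, 4, 0] (the word u = dbceba of Example 2) this yields the states 9, 11, 15, 47, 63 listed in the text, while the eta of v = dbecba produces the rejected state 43.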

Theorem 4 If the number of processors is in O(n), our algorithm tests if a word u belongs to the commutation class C_θ(w) in time O(log n). The total number of operations is in O(n) and the algorithm is optimal. Moreover if the distribution of the states of S among the subsets U_1, ..., U_r is uniform, then our algorithm achieves an optimal speedup.

Proof. For every transition i from a state s to a state t, t = s · w[i] = s + 2^i. Hence the determination of the states s_1 = 1 · u[0] = 1 + 2^{eta[0]}, s_2 = s_1 · u[1] = s_1 + 2^{eta[1]}, ..., s_{n-1} = s_{n-2} · u[n - 2] = s_{n-2} + 2^{eta[n-2]} reduces to a prefix-sum calculation. It follows that u ∈ C_θ(w) if and only if all the states s_1, s_2, ..., s_{n-1} belong to the automaton A_θ(w). If the number of processors is in O(n), the time complexities of the procedures somme, somme-prefix and maximum given in [6, 7, 8] are in O(log n) and the total number of operations is in O(n). Moreover, all our procedures have the same complexities. Our algorithm is therefore optimal. The proof of the optimal speedup achievement is the same as in Theorem 3. □

5 Conclusion

We have presented an optimal parallel algorithm for generating the commutation class of a word and an optimal parallel algorithm for testing if a word belongs to this commutation class. Our algorithms are efficient and easy to code. The notion of θ-dissection is original to the authors. Applications to parallel processing are the object of further studies.

Acknowledgments: The authors are grateful to V. Diekert and F. Otto for discussions on the contents of this paper, to M. Ito for his pertinent comments which led to a new proof of Theorem 1 and to an anonymous referee for many remarks and suggestions which permitted us to improve both the contents and the presentation of this paper.


References

[1] Cartier P. and Foata D., Problèmes combinatoires de commutation et de réarrangements, Lecture Notes in Math., 85, Springer Verlag, 1969.

[2] Cérin C., Automatic parallelization of programs with tools of trace theory, Proceedings of the 6th International Parallel Processing Symposium (IPPS), 1992, IEEE, 374-379.

[3] Cérin C. and Petit A., Speedup of recognizable languages, Proceedings of MFCS'93, Lecture Notes in Computer Science, 711, 332-341, Springer Verlag, 1993.

[4] Cori R. and Perrin D., Automates et commutations partielles, RAIRO Inf. Theor., 19, 1985, 21-32.

[5] Diekert V., Combinatorics of traces, Lecture Notes in Computer Science, 454, Springer Verlag, 1990.

[6] Hillis W.D. and Steele G.L., Jr., Data parallel algorithms, Communications of the ACM, 29, 12 (1986), 1170-1183.

[7] JáJá J., An introduction to parallel algorithms, Addison-Wesley Pub. Company, 1992.

[8] Ladner R.E. and Fischer M.J., Parallel prefix computation, Journal of the ACM, 27, 4 (1980), 831-838.

[9] Mazurkiewicz A., Concurrent program schemes and their interpretations, DAIMI Rept., PB 78, Aarhus University, 1977.

[10] Mehlhorn K., Data structures and algorithms, Volume 1, Springer Verlag, 1984.

[11] Métivier Y., An algorithm for computing asynchronous automata in the case of acyclic non-commutation graphs, Proc. ICALP'87, Lecture Notes in Computer Science, 372, 237-251.

[12] Schott R. and Spehner J.-C., Efficient generation of commutation classes, Journal of Computing and Information, 2, 1, 1996, 1110-1132. Special issue: Proceedings of the Eighth International Conference of Computing and Information (ICCI'96), Waterloo, Canada, June 19-22, 1996.

[13] Zielonka W., Notes on finite asynchronous automata and trace languages, RAIRO Inf. Theor., 21, 1987, 99-135.


A PROOF OF OKNINSKI AND PUTCHA’S THEOREM

KUNITAKA SHOJI DEPARTMENT OF MATHEMATICS, SHIMANE UNIVERSITY

MATSUE, SHIMANE 690-8504 JAPAN

Abstract. Okniński and Putcha proved that any finite semigroup S is an amalgamation base for all finite semigroups if the 𝒥-classes of S are linearly ordered and the semigroup algebra C[S] over C has a zero Jacobson radical. As a consequence they proved that every finite inverse semigroup U all of whose 𝒥-classes form a chain is an amalgamation base for finite semigroups. In this paper we give another proof of the result for finite inverse semigroups by making use of semigroup representations only.

1. INTRODUCTION AND PRELIMINARIES

A finite semigroup U is called an amalgamation base for finite semigroups if every amalgam [S, T; U] of finite semigroups S, T with U as a core is embeddable in a finite semigroup. Hall and Putcha [4] proved that if a finite semigroup S is an amalgamation base for all finite semigroups, then the 𝒥-classes of S are linearly ordered. Okniński and Putcha [7] proved that any finite semigroup U is an amalgamation base for all finite semigroups if all of the 𝒥-classes of U are linearly ordered and the semigroup algebra C[U] over C has a zero Jacobson radical. As a consequence, they obtained the following.

Okniński and Putcha's theorem (Corollary 10 of [7]). A finite inverse semigroup U whose 𝒥-classes are linearly ordered is an amalgamation base for finite semigroups.

The purpose of this paper is to give another proof of the theorem by using results and methods introduced in the paper [5]. Okniński and Putcha [7] used both representations of semigroups and linear representations of semigroups. Our proof uses only representations of semigroups.

Convention. Let T(X) denote the full transformation semigroup on a set X with composition being from right to left.

Let S be a semigroup. Then a left [resp. right] S-set is a set with an associative operation of S on the left [resp. right]. A left [resp. right] S-set X is faithful if for distinct s, t ∈ S, there exists x ∈ X with sx ≠ tx. Thus, given a faithful left [resp. right] S-set X, we obtain a canonical embedding of S into T(X) and vice versa.
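The right-to-left composition convention for T(X) can be made concrete with a small Python sketch (our own illustration, not from the paper), representing a transformation of X = {0, ..., m-1} as the tuple of its values:

```python
def compose(f, g):
    """(f g)(x) = f(g(x)): apply g first, then f (right-to-left)."""
    return tuple(f[g[x]] for x in range(len(g)))

f = (1, 1, 2)   # a non-injective transformation of {0, 1, 2}
g = (2, 0, 1)   # a permutation of {0, 1, 2}
```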

For undefined terms of semigroup theory, we refer readers to [1] and [5].


Result 1 (Lemma 1 of [7]). Let U be a finite semigroup and G_1, G_2 be two subgroups of U with an identity element in common which are isomorphic by an isomorphism φ : G_1 → G_2. Let S be a finite semigroup containing U as a subsemigroup. Then there exists a finite semigroup T such that S is a subsemigroup of T and there exists t ∈ T such that φ(g) = t⁻¹gt for all g ∈ G_1 (⊆ T).

Result 2 (cf. Lemma 1 and its corollary of [5]). Let U be a finite semigroup. Then the following are equivalent:

(1) U is an amalgamation base for finite semigroups;
(2) For any two embeddings φ_1, φ_2 of U into the full transformation semigroup T(X), there exist a finite set Y and two embeddings δ_1, δ_2 : T(X) → T(Y) such that Y contains X as a subset and δ_1φ_1 and δ_2φ_2 coincide on U;
(3) For any finite semigroups S, T containing U, any finite faithful left S-set X and any finite faithful left T-set Y, there exist a finite faithful left S-set X′ ⊇ X and a finite faithful left T-set Y′ ⊇ Y such that the U-sets X′, Y′ are U-isomorphic to each other.

2. PROOF OF OKNIŃSKI AND PUTCHA'S THEOREM

This section is devoted to a semigroup-theoretical proof of Okniński and Putcha's theorem.

We shall first prove several results which will be used later.

Lemma 1. Let U be a finite inverse semigroup. Let φ_1, φ_2 be embeddings of U into the full transformation semigroup T(X) such that |Y^(1)| = |Y^(2)|, where Y^(1) = ∪_{u∈U} φ_1(u)(X) and Y^(2) = ∪_{u∈U} φ_2(u)(X). Then any U-isomorphism between the right U-set T(X)φ_1(U) and the right U-set T(X)φ_2(U) extends to a U-isomorphism from the right φ_1(U)-set T(X) to the right φ_2(U)-set T(X).

Proof. Suppose that there exists a U-isomorphism θ from the right U-set T(X)φ_1(U) to the right U-set T(X)φ_2(U). Let f ∈ Map(Y^(1), X). Then there exists a unique f′ ∈ Map(Y^(2), X) such that θ(fφ_1(e)) = f′φ_2(e) for all e ∈ E_U. In fact, we define a mapping f′ ∈ Map(Y^(2), X) by f′(x) = θ(fφ_1(e))(x) if x ∈ φ_2(e)(X) for some e ∈ E_U, where E_U denotes the set of all idempotents of U. If x ∈ φ_2(e)(X) ∩ φ_2(e′)(X), where e, e′ ∈ E_U, then θ(fφ_1(e))(x) = θ(fφ_1(e))(φ_2(e′)(x)) = θ(fφ_1(e)φ_1(e′))(x) = θ(fφ_1(ee′))(x) and θ(fφ_1(e′))(x) = θ(fφ_1(e′))(φ_2(e)(x)) = θ(fφ_1(e′)φ_1(e))(x) = θ(fφ_1(e′e))(x). Thus f′ is well-defined and unique. So we obtain a mapping ξ : Map(Y^(1), X) → Map(Y^(2), X) with ξ(f) = f′.


For any f ∈ Map(Y^(1), X), let V^(1)(f) be

{h ∈ T(X) − T(X)φ_1(U) | hφ_1(u) = fφ_1(u) for all u ∈ U}.

Also, let V^(2)(f′) be

{h ∈ T(X) − T(X)φ_2(U) | hφ_2(u) = f′φ_2(u) for all u ∈ U}.

Actually, if f = u|_{Y^(1)} for some u ∈ U then, for any x ∈ Y^(2) with x ∈ φ_2(e′)(X) (e′ ∈ E_U), f′(x) = θ(fφ_1(e′))(x) = θ(φ_1(u)φ_1(e′))(x). Hence f′ = θ(φ_1(u))|_{Y^(2)}, and the converse is true. Note here that, for any e ∈ E_U, f = fφ_1(e)|_{Y^(1)} if and only if f′ = θ(fφ_1(e))|_{Y^(2)}. This implies that |V^(1)(f)| = |X|^{|X−Y^(1)|} − 1 = |X|^{|X−Y^(2)|} − 1 = |V^(2)(ξ(f))| if f = fφ_1(e)|_{Y^(1)} for some e ∈ E_U, and |V^(1)(f)| = |X|^{|X−Y^(1)|} = |X|^{|X−Y^(2)|} = |V^(2)(ξ(f))| if not. Hence

|V^(1)(f)| = |V^(2)(ξ(f))| for any f ∈ Map(Y^(1), X).

Thus, for each f there exists a bijection Ξ_f from V^(1)(f) to V^(2)(ξ(f)), and so there exists a bijection Ξ from T(X) − T(X)φ_1(U) to T(X) − T(X)φ_2(U).

Thus we obtain a bijection Θ : T(X) → T(X) by gluing θ and Ξ. We shall prove that Θ is a U-homomorphism of the right φ_1(U)-set T(X) to the right φ_2(U)-set T(X).

Let h ∈ T(X). For any u ∈ U, letting x ∈ X, we have

(Θ(h)φ_2(u))(x) = (Ξ(h)φ_2(u))(x) = Ξ(h)(φ_2(uu⁻¹)(φ_2(u)(x))) = θ(hφ_1(uu⁻¹))(φ_2(u)(x)) = θ(hφ_1(uu⁻¹u))(x) = θ(hφ_1(u))(x) = Θ(hφ_1(u))(x).

Hence Θ(hφ_1(u)) = Θ(h)φ_2(u). The lemma is proved. □

Lemma 2. Let U be a finite inverse semigroup which is a disjoint union of the one-idempotent semigroup {e} and I such that I is an ideal of U and I = IeI.

Then U is an amalgamation base for finite semigroups if the semigroup I is an amalgamation base for finite semigroups.

Proof. Suppose that there exist two embeddings φ_1, φ_2 of U into the full transformation semigroup T(X). Since I is an amalgamation base for finite semigroups, by Result 2 we can assume that φ_1|_I = φ_2|_I. Moreover, by Lemma 9 of [5], we may assume that φ_1(e) and φ_2(e) belong to a 𝒥-class of T(X), since by assumption there exists an ideal J of T(X) which contains φ_1(I), but neither φ_1(e) nor φ_2(e). Let Y = ∪_{u∈I} φ_1(u)(X), Z_1 = φ_1(e)(X) − Y and Z_2 = φ_2(e)(X) − Y. There exists a bijection ξ : φ_2(e)(X) → φ_1(e)(X) such that ξ(Z_2) = Z_1 and the restriction of ξ to Y is the identity mapping on Y.

Now we shall define a mapping θ : T(X)φ_1(e) ∪ T(X)φ_1(I) → T(X)φ_2(e) ∪ T(X)φ_2(I) as follows. For any f ∈ T(X)φ_1(e) ∪ T(X)φ_1(I),


θ(f) = f if f ∈ T(X)φ_1(I), and θ(f) = fξφ_2(e) if f ∈ T(X)φ_1(e) − T(X)φ_1(I).

Then it is clear that θ is bijective. Next we shall prove that θ is a U-homomorphism from the right φ_1(U)-set T(X)φ_1(e) ∪ T(X)φ_1(I) to the right φ_2(U)-set T(X)φ_2(e) ∪ T(X)φ_2(I).

Case 1: u = e and f ∈ T(X)φ_1(e) − T(X)φ_1(I). Then

θ(fφ_1(e)) = θ(f) = fξφ_2(e) = (fξφ_2(e))φ_2(e) = θ(f)φ_2(e).

Case 2: u = e and f ∈ T(X)φ_1(I). Then

θ(fφ_1(e)) = (fφ_1(e′))φ_1(e) (for some e′ ∈ E_I) = fφ_1(e′e) = fφ_2(e′e) = (fφ_2(e′))φ_2(e) = (fφ_1(e′))φ_2(e) = θ(f)φ_2(e).

Case 3: u ∈ I and f ∈ T(X)φ_1(e) − T(X)φ_1(I). Then θ(fφ_1(u)) = fφ_1(u) = fφ_1(eu) = fξφ_2(eu) = (fξφ_2(e))φ_2(u) = θ(f)φ_2(u).

Case 4: u ∈ I and f ∈ T(X)φ_1(I). Then θ(fφ_1(u)) = fφ_1(u) = fφ_2(u) = θ(f)φ_2(u).

In any case, it holds that θ(fφ_1(u)) = θ(f)φ_2(u).

Finally, let U^1 be the semigroup obtained from U by adjoining an identity element 1. We can extend φ_1, φ_2 by defining φ_1(1) = φ_2(1) to be the identity mapping on X. Then φ_1, φ_2 are regarded as homomorphisms from U^1 to T(X). Then by Lemma 1, we can get a U^1-isomorphism from the right φ_1(U)-set T(X) to the right φ_2(U)-set T(X) which is an extension of θ. By Result 2, the lemma is proved. □

Lemma 3. Let G be a finite group with an identity element e and I a finite (not necessarily inverse) regular semigroup. Let U be a finite regular semigroup which is a disjoint union of G and I such that I is an ideal of U and I = IeI. If there are embeddings φ_1, φ_2 of U into a finite semigroup S such that the restrictions of φ_1 and φ_2 to I ∪ {e} are equal, then there exists an embedding ξ of S into a finite semigroup T such that there exists t ∈ T with t⁻¹(ξφ_1(g))t = ξφ_2(g) for all g ∈ G, and t(ξφ_1(u)) = ξφ_1(eu), (ξφ_1(u))t = ξφ_1(ue) for all u of I.

Proof. By Result 1, there exists an embedding φ of S into a finite semigroup V such that there exists c ∈ V with c⁻¹(φφ_1(g))c = φφ_2(g) for all g ∈ G. Also we can assume that V is a finite full transformation semigroup. Then there is an ideal J of V which contains φφ_1(I) but does not contain φφ_1(e). Let T_1 be the full transformation semigroup on the set J. Regarding J as a left V-set, we define a mapping ρ : V → T_1 such that ρ(v)(a) = va for all a ∈ J. Then ρ is a homomorphism and the restriction of ρ to the ideal J is injective, since J is regular.

Next, let T_2 be the Rees factor semigroup of V by the ideal J. For any v ∈ V, let v̄ denote the element of T_2 containing v. Let T = T_1 × T_2 be the direct product of the semigroups T_1 and T_2. Then we define a mapping ξ : S → T by ξ(s) = (ρφ(s), φ(s)‾) for all s ∈ S. It is clear that ξ is an injective homomorphism. Let t = (ρφφ_1(e), c̄). Then t satisfies the property that t⁻¹(ξφ_1(g))t = ξφ_2(g) for all g ∈ G. Actually, t⁻¹(ξφ_1(g))t = (ρφφ_1(g), c̄⁻¹ φφ_1(g)‾ c̄) = (ρφφ_1(g), φφ_2(g)‾) = (ρφφ_2(g), φφ_2(g)‾) (since ρφφ_1(g) = ρφφ_2(g)) = ξφ_2(g).

Also, we have t(ξφ_1(u)) = (ρφφ_1(e), c̄)(ρφφ_1(u), 0̄) = (ρφφ_1(eu), 0̄) = ξφ_1(eu) for all u of I. Similarly, (ξφ_1(u))t = ξφ_1(ue) for all u of I. The lemma is proved. □

Lemma 4. Let G be a finite group with an identity element e and I a finite inverse semigroup. Let U be a finite inverse semigroup which is a disjoint union of G and I such that I is an ideal of U.

Then U is an amalgamation base for finite semigroups if the subsemigroup I ∪ {e} of U is an amalgamation base for finite semigroups.

Proof. Suppose that there exist two embeddings φ_1, φ_2 of U into the full transformation semigroup T(X). Since I ∪ {e} is an amalgamation base for finite semigroups, by Result 2 and Lemma 2 we can assume that φ_1|_{I∪{e}} = φ_2|_{I∪{e}}. By Lemma 3, we can assume that there exists t ∈ T(X) such that t⁻¹(φ_1(g))t = φ_2(g) for all g ∈ G, tφ_1(u) = φ_1(eu) and φ_1(u)t = φ_1(ue) for all u ∈ I.

Now we define a map θ : T(X)φ_1(U) → T(X)φ_2(U) as follows. For any f ∈ T(X)φ_1(U),

θ(f) = f if f ∈ T(X)φ_1(I), and θ(f) = ft if f ∈ T(X)φ_1(e) − T(X)φ_1(I).

To prove that θ is well-defined, we shall prove that ft ∈ T(X)φ_2(e) − T(X)φ_2(I) if f ∈ T(X)φ_1(e) − T(X)φ_1(I). Actually, this follows from the property that tφ_1(u) = φ_1(eu) and φ_1(u)t = φ_1(ue) for all u ∈ I.

Next we shall prove that θ is a U-homomorphism from the right φ_1(U)-set T(X)φ_1(U) to the right φ_2(U)-set T(X)φ_2(U).

Case 1: u ∈ I and f ∈ T(X)φ_1(I). Then θ(fφ_1(u)) = fφ_1(u) = fφ_2(u). On the other hand, θ(f)φ_2(u) = fφ_2(u). Hence θ(fφ_1(u)) = θ(f)φ_2(u).

Case 2: u ∈ I and f ∈ T(X)φ_1(e) − T(X)φ_1(I). Since tφ_1(u) = φ_1(eu), we have θ(fφ_1(u)) = fφ_1(u) = (fφ_1(e))φ_1(u) = f(tφ_1(u)) = f(tφ_2(u)) = (ft)φ_2(u) = θ(f)φ_2(u).

Case 3: u ∈ G and f ∈ T(X)φ_1(I). Then θ(fφ_1(u)) = fφ_1(u) = (fφ_1(e′))φ_1(u) (for some idempotent e′ of I) = fφ_1(e′u) = fφ_2(e′u) = (fφ_2(e′))φ_2(u) = (fφ_1(e′))φ_2(u) = fφ_2(u) = θ(f)φ_2(u).

Case 4: u ∈ G and f ∈ T(X)φ_1(e) − T(X)φ_1(I). Then θ(fφ_1(u)) = (fφ_1(u))t = f(φ_1(u)t) = f(tφ_2(u)) = (ft)φ_2(u) = θ(f)φ_2(u).


Consequently, θ is a U-isomorphism between the right φ_1(U)-set T(X)φ_1(U) and the right φ_2(U)-set T(X)φ_2(U). The lemma follows from Lemma 1 and Result 2. □

Proof of Okniński and Putcha's theorem. Let U be a finite inverse semigroup whose 𝒥-classes form a chain. Then there exists a chain of ideals U = U_1 ⊇ U_2 ⊇ ... ⊇ U_n such that U_n is a maximal subgroup and each U_i/U_{i+1} is a completely 0-simple inverse semigroup (1 ≤ i ≤ n − 1). Also, we can index the idempotents of U so that the n idempotents e_{i1} (1 ≤ i ≤ n) form a chain and, for each 1 ≤ i ≤ n, the r_i idempotents e_{ij} (1 ≤ j ≤ r_i) are 𝒟-related and U_i = U e_{i1} U. For each idempotent e_{i1} (1 ≤ i ≤ n), let G_{e_{i1}} denote the maximal subgroup containing e_{i1}. In particular, U_n = G_{e_{n1}}.

Since U_n is a finite group, by Lemma 3, U_n is an amalgamation base for finite semigroups. By Lemma 2 and Lemma 3, it suffices to prove that if U_2 ∪ G_{e_{11}} is an amalgamation base for finite semigroups, then so is U (= U_1). So we suppose that U_2 ∪ G_{e_{11}} is an amalgamation base for finite semigroups. By Result 2, there exist two embeddings φ_1, φ_2 of U into the full transformation semigroup T(X) such that φ_1 and φ_2 coincide on U_2 ∪ G_{e_{11}}. We shall prove that there exists a U-isomorphism θ from the right φ_1(U)-set T(X) to the right φ_2(U)-set T(X).

Let φ_1(e_{ij})(X) = X_{ij}, φ_2(e_{ij})(X) = Y_{ij} for all 1 ≤ i ≤ n and 1 ≤ j ≤ r_i.

Then X_{i1} = Y_{i1} for all 1 ≤ i ≤ n and X_{ij} = Y_{ij} for all 2 ≤ i ≤ n and 1 ≤ j ≤ r_i. Moreover, |X_{1j}| = |Y_{1j}| for all 1 ≤ j ≤ r_1 since φ_1(e_{1j}), φ_2(e_{1j}) are 𝒟-related to φ_1(e_{11}) (= φ_2(e_{11})).

Let X_i = ∪_{i≤k≤n, 1≤j≤r_k} X_{kj} and Y_i = ∪_{i≤k≤n, 1≤j≤r_k} Y_{kj}, where 1 ≤ i ≤ n.

Then X_i = Y_i for all 2 ≤ i ≤ n. Since U is an inverse semigroup, the idempotents e_{ij} commute with each other, and it holds that

X_{1p} ∩ X_{1q} ⊆ X_2 and Y_{1p} ∩ Y_{1q} ⊆ X_2 for all 1 ≤ p ≠ q ≤ r_1.

Hence X_1 − X_2 is a disjoint union of the subsets X_{1j} − X_2 (1 ≤ j ≤ r_1) and Y_1 − X_2 is a disjoint union of the subsets Y_{1j} − X_2 (1 ≤ j ≤ r_1).

Choose elements u_j ∈ e_{1j}Ue_{11} and u_j⁻¹ ∈ e_{11}Ue_{1j} such that e_{1j} = u_j u_j⁻¹ and e_{11} = u_j⁻¹ u_j (1 ≤ j ≤ r_1).

(1) We define a map γ : Y_1 − X_2 → X_1 − X_2 as follows:

γ(x) = (φ_1(u_j)φ_2(u_j⁻¹))(x) if x ∈ Y_{1j} − X_2.

Also we can define a map δ : X_1 − X_2 → Y_1 − X_2 by δ(x) = (φ_2(u_j)φ_1(u_j⁻¹))(x) if x ∈ X_{1j} − X_2.

We have φ_2(u_j)φ_1(u_j⁻¹)φ_1(u_j)φ_2(u_j⁻¹) = φ_2(u_j)φ_1(u_j⁻¹u_j)φ_2(u_j⁻¹) = φ_2(u_j)φ_2(e_{11})φ_2(u_j⁻¹) = φ_2(u_j u_j⁻¹) = φ_2(e_{1j}). Hence δ is an inverse mapping of γ, and so it follows that γ is bijective.

(2) We define a map θ : T(X)φ_1(U_1) → T(X)φ_2(U_1) as follows. For any f ∈ T(X)φ_1(U_1),


) if / 6

To prove that 0 is well-defined, it suffices to prove that /7 = / for / 6T(X)</>i(eip) n T(-X~)<£i(ei,) (p ^ q). Actually, this follows from two factsthat eipeiq € U2 and 0i(eipei,)(0i(Mp)^2(Wp *)) =02(eipelgWp)^2(Wp1)) = 02(eipei,) = <£i(eipei,). Since the inverse of 0 canbe constructed by using 8 instead of 7, we know that 6 is bijective.

(3) We shall prove that 0 is a {/-homomorphism of the right <^>i(t/)-setT(X)0i(l7) to the right (j>2(U)-set T(X)fa(U).

Let w 6 C/ and / € T(X)<f>i(U).Case 1 : / 6 r(X)01(£72)- Of course, /<Ai(«) € T(X)^i(£72). Also, there

exists an idempotent e G U% with / = f<f>i(e). Hence f<j>i(u) = f<t>\(e)<t>i(u)= f<h(e") = ffa(eu) = ffa(e)<j>2(u) = f f a ( u ) . Thus,

Case2: w € f/i -Z/2,/ € TW^^O-TW^i^). Then / =and f<j>i(u) = f(j>i(u)<j>i(eik) for some 1 < j, fc < ri. So we have

Case 3: u ∈ U2 and f ∈ T(X)φ1(U1) − T(X)φ1(U2). Then f = fφ1(e1j) for some 1 ≤ j ≤ r1, and a similar computation shows θ(f)φ2(u) = θ(fφ1(u)). Therefore θ(f)φ2(u) = θ(fφ1(u)) for all u ∈ U. We have proved that the

right φ1(U)-set T(X) is isomorphic to the right φ2(U)-set T(X). By Lemma 1 and Result 2, there exist a finite set Y and two embeddings

δ1, δ2 : T(X) → T(Y) such that Y contains X as a subset and δ1φ1 and δ2φ2 coincide on U. The proof of the theorem is complete. □

In a forthcoming paper we will improve the method used here to generalize the result from finite inverse semigroups to finite regular semigroups.


REFERENCES

[1] A. H. Clifford and G. B. Preston, Algebraic theory of semigroups, Amer. Math. Soc., Math. Surveys, No. 7, Providence, R.I., Vol. I (1961); Vol. II (1967).

[2] T. E. Hall, Representation extension and amalgamation for semigroups, Quart. J. Math. Oxford (2) 29 (1978), 309-334.

[3] T. E. Hall, Finite inverse semigroups and amalgamation, Semigroups and their applications, Reidel, 1987, pp. 51-56.

[4] T. E. Hall and M. S. Putcha, The potential J-relation and amalgamation bases for finite semigroups, Proc. Amer. Math. Soc. 95 (1985), 309-334.

[5] T. E. Hall and K. Shoji, Finite bands and amalgamation bases for finite semigroups, Communications in Algebra, to appear.

[6] B. H. Neumann, An essay on free products of groups with amalgamations, Trans. Roy. Soc. London Ser. A 246 (1954), 503-554.

[7] J. Okniński and M. S. Putcha, Embedding finite semigroup amalgams, J. Austral. Math. Soc. (Series A) 51 (1991), 489-496.


SUBDIRECT PRODUCT STRUCTURE OF LEFT CLIFFORD SEMIGROUPS

K.P. Shumᵃ
Department of Mathematics,

The Chinese University of Hong Kong, Shatin, N.T. Hong Kong.

E-mail: kpshum@math.cuhk.edu.hk

M.K. Sen Department of Pure Mathematics, The University of Calcutta, India

E-mail: senmk@cal3.vsnl.net.in

Y.Q. Guoᵇ
Institute of Mathematics,

Yunnan University, Kunming, Yunnan, China. E-mail: yqguo@ynu.edu.cn

In this paper, we study the subdirect product structure of left Clifford semigroups. It is shown that subdirect products of left groups with zero possibly adjoined are closely related to left Clifford semigroups. Subdirect products of a group and a left normal band are also considered.

1 Introduction

By a left Clifford semigroup (LC-semigroup for brevity), we mean a regular semigroup S in which eS ⊆ Se for every idempotent e ∈ S; that is, for any a ∈ S and e² = e ∈ S, eae = ea. Clearly a Clifford semigroup S, being a regular semigroup with eS = Se for all e² = e ∈ S, is LC. The structure of LC-semigroups has been investigated in⁵.

Recall that a semigroup is a band if it consists only of idempotents. Let E(S) be the set of all idempotents of a semigroup S. Clearly, E(S) ≠ ∅ if S is LC, since S is regular. If E(S) forms a band, then E(S) is called left normal if the identity efg = egf holds in E(S).
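The left normality identity can be checked mechanically on a small example. The following sketch (our own illustration in Python, not from the paper) builds the band I × Y, with I a two-element left zero band and Y a two-element semilattice, and verifies efg = egf by brute force:

```python
from itertools import product

# A small left normal band: I x Y with I = {0, 1} a left zero band (ij = i)
# and Y = {0, 1} the semilattice under min; multiplication is componentwise.
def mul(e, f):
    (i, p), (j, q) = e, f
    return (i, min(p, q))

B = list(product(range(2), range(2)))
assert all(mul(e, e) == e for e in B)              # every element is idempotent
assert all(mul(mul(e, f), g) == mul(mul(e, g), f)  # the identity efg = egf
           for e in B for f in B for g in B)
```

Note that I × Y is left normal but not left zero, since the second coordinate is commutative.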

ᵃThe research of K.P. Shum is partially supported by a UGC(HK) grant #2060126 (2000/2001). ᵇThe research of Y.Q. Guo is partially supported by a NSF, China grant #1976-1004 and a grant of Basic Research in Science, Yunnan Province, China, #96z-001. Keywords: Left groups, unitary subsemigroups, LC-semigroups, subdirect products. 2000 AMS Mathematics Subject Classification: 20M15.

For Clifford semigroups, M. Petrich proved in² that a semigroup is Clifford if and only if it is regular and is a subdirect product of groups with a


zero possibly adjoined. We call a semigroup S a left group if for all a, b ∈ S, there exists a unique x ∈ S such that xa = b. The concept of left group has also recently been extended to quasi left groups in⁴. In this paper, we prove that a semigroup S is regular and is a subdirect product of a left normal band and a group if and only if S is a strong LC-semigroup and E(S) is a unitary subsemigroup of S. Thus, the result of M. Petrich in² is extended from groups to left groups.

For the terminology and notation not given in this paper, the reader is referred to the text of M. Petrich¹.

2 Subdirect product of left groups

In this section, we study the class of LC-semigroups whose idempotents form a left normal band. The following theorem, which concerns subdirect products of groups, is well known for Clifford semigroups.

Theorem 2.1 [2, II.2.6] The following conditions on a semigroup S are equivalent.

(i) S is a Clifford semigroup.

(ii) S is a strong semilattice of groups.

(iii) S is regular and is a subdirect product of groups with zero possibly adjoined.

(iv) S is regular and its idempotents lie in the centre of S .

In order to extend this result to left Clifford semigroups, we need the following lemma.

Lemma 2.1 [2, II.2.4] If S = [Y; Sα, φα,β] is a strong semilattice of the semigroups Sα, then S is a subdirect product of the semigroups Sα with a zero possibly adjoined.

We now give a characterization theorem for LC-semigroups.

Theorem 2.2 A semigroup S is an LC-semigroup with E(S) a left normal band if and only if S is regular and is a subdirect product of left groups with zero possibly adjoined.

Proof: Suppose S is an LC-semigroup with E(S) a left normal band. Since S is LC, it is regular, and it is known⁵ that S is a semilattice Y of left groups Sα. Now for each Sα, there exist a left zero band Iα and a group Gα such that Sα = Iα × Gα. Hence S = ∪α∈Y (Iα × Gα). We show that S is a strong semilattice Y of the left groups Sα. Let 1α be the identity of the group Gα, for all α ∈ Y. Now for each i ∈ Iα, (i, 1α) ∈ Sα, and for each α ∈ Y, we can


fix iα ∈ Iα and denote the element (iα, 1α) by eα. Let α, β ∈ Y with α ≥ β. Then we define φα,β : Sα → Sβ by

aφα,β = aeβ for all a ∈ Sα.

Clearly, aeβ ∈ Sβ. Let aeβ = (jβ, aβ), where jβ ∈ Iβ and aβ ∈ Gβ. Now, take x = (jβ, 1β) ∈ Iβ × Gβ. Then xaeβ = (jβ, 1β)(jβ, aβ) = (jβ, aβ) = aeβ and xeβ = (jβ, 1β)(iβ, 1β) = (jβ, 1β) = x. Also let b = (iα, bα) ∈ Iα × Gα and y = (iα, 1α).

Then we have by = yb = b.

Thus, for a, b ∈ Sα, we have beβaeβ = beβxaeβ = byeβxaeβ. Now y, eβ, x are all idempotents of S. Since E(S) is a left normal band, we have yeβx = yxeβ. This leads to beβaeβ = byxeβaeβ = byxaeβ = byaeβ = baeβ = (ba)φα,β.

Hence, (bφα,β)(aφα,β) = (ba)φα,β. This shows that φα,β is a homomorphism. We can also show that for α, β, γ ∈ Y with α ≥ β ≥ γ we have φα,βφβ,γ = φα,γ, and that for any a ∈ Sα and b ∈ Sβ, ab = (aφα,αβ)(bφβ,αβ). We omit the details. Therefore S becomes a strong semilattice Y of the left groups Sα. Then by Lemma 2.1, S is a subdirect product of the semigroups Sα with a zero possibly adjoined.

Conversely, we assume that S is a regular semigroup and is a subdirect product of left groups Sα, α ∈ Y, with zero 0α possibly adjoined. Let (eα)α∈Y ∈ E(S) and (aα)α∈Y ∈ S. If eα = 0α or aα = 0α, then clearly eαaαeα = eαaα. Suppose eα ≠ 0α and aα ≠ 0α. Then, since Sα is a left group, every idempotent eα ≠ 0α in Sα is a right identity. Hence eαaαeα = eαaα. Thus, it follows that eS ⊆ Se for any idempotent e ∈ S; that is, S is an LC-semigroup.

Consider now (eα)α∈Y, (fα)α∈Y and (gα)α∈Y in E(S). For the case eα = 0α or fα = 0α or gα = 0α, it is trivial to see that eαfαgα = eαgαfα; for the case eα ≠ 0α, fα ≠ 0α and gα ≠ 0α, we also have eαfαgα = eαgαfα, since each E(Sα) is a left zero band. Hence E(S) is a left normal band. The proof is completed. □

3 Subdirect product of a left normal band and a group

A non-empty subset A of a semigroup S is called left unitary if for each s ∈ S and a ∈ A, as ∈ A implies that s ∈ A. Right unitary is defined dually. We call a subset of S unitary if it is both left and right unitary.

According to⁵, an LC-semigroup S is called strong if the usual Green's relation ℋ of S is also a congruence on S. Also, it has been shown in⁴ that


an LC-semigroup is a left orthogroup, that is, a completely regular semigroup in which the set of all its idempotents forms a left normal band.

In this section, we show that every strong LC-semigroup S with E(S) a unitary subset of S is a subdirect product of a left normal band and a group.

Theorem 3.1 Let S be a semigroup. Then the following conditions are equivalent:

1. S is a regular semigroup and is a subdirect product of a left normal band and a group.

2. S is a strong LC-semigroup in which yx = y implies that x² = x for all x, y ∈ S.

3. S is a strong LC-semigroup and E ( S ) is a unitary subsemigroup of S.

Proof: (i) ⇒ (ii). Suppose S is regular and a subdirect product of a left normal band L and a group G. Then it can easily be seen that for any (i, g), (j, h) ∈ S, (i, g) ℋS (j, h) implies i ℋL j, and hence i = j. Consequently, ℋ is a congruence on S. Now, let y = (i, g), x = (j, h) be elements of S such that yx = y, that is, (i, g)(j, h) = (i, g). Then we have ij = i and gh = g. Since G is a group, gh = g implies that h = e, where e is the group identity of G. Thus x = (j, h) = (j, e) and so x² = x. So (ii) is proved.
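The implication (i) ⇒ (ii) can be replayed by brute force on a small example (our own illustration, not from the paper): take the left normal band L = {0, 1} × {0, 1} (a left zero band times a semilattice) and the group G = Z3, and check in the product L × G that yx = y forces x² = x:

```python
from itertools import product

# L x G with L = {0,1} x {0,1} a left normal band and G = Z_3; multiplication
# is left zero on the first coordinate, min on the second, addition mod 3 on
# the group coordinate.
def times(a, b):
    (i, p, g), (j, q, h) = a, b
    return (i, min(p, q), (g + h) % 3)

S = list(product(range(2), range(2), range(3)))
for y in S:
    for x in S:
        if times(y, x) == y:          # yx = y ...
            assert times(x, x) == x   # ... forces x to be idempotent
```

Here L × G is the full direct product, which is a special case of a subdirect product; the check mirrors the argument that gh = g forces h = e in the group coordinate.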

(ii) ⇒ (iii). Suppose (ii) holds. Let x, y ∈ S be such that yx ∈ E(S) and y ∈ E(S). In order to show that E(S) is a left unitary set, we only need to prove that x ∈ E(S). Since y, yx ∈ E(S) and E(S) is a left normal band, we have

(yx)y(yx) = (yx)(yx)y = yxy.

Hence by (ii), we obtain that x² = x, i.e., x ∈ E(S). This means that E(S) is left unitary. Again, we can show that the left unitary band E(S) is also right unitary. Hence E(S) is unitary.

(iii) ⇒ (i). Suppose (iii) holds. Let ξ = {(a, b) ∈ S × S | ab′ ∈ E(S) for some b′ ∈ V(b)}, where V(b) is the set of all inverses of b. We first claim that if ab′ ∈ E(S) for some b′ ∈ V(b), then ab″ ∈ E(S) for any b″ ∈ V(b). Clearly, V(b) ≠ ∅ since S is a strong LC-semigroup and of course regular. Suppose ab′ ∈ E(S) and b″ ∈ V(b). Then (b′b)(ab′)(bb″) ∈ E(S) since E(S) is a band. This shows that b′ba(b′b)b″ ∈ E(S). However, because S is an LC-semigroup, b′ba(b′b) = b′ba. This leads to b′bab″ ∈ E(S). Since E(S) is unitary and b′b ∈ E(S), we have ab″ ∈ E(S). Our claim is therefore established.

We next prove that the relation ξ is an equivalence relation on S. Clearly, ξ is reflexive. For the symmetry of ξ, we consider (a, b) ∈ ξ such that ab′ ∈


E(S) for some b′ ∈ V(b). Let a′ ∈ V(a). What we need to show is ba′ ∈ E(S). Since S is an LC-semigroup, it is easy to see that (ab′)(ba′) = a(b′b)a′ ∈ E(S). Also, by our assumption (iii), E(S) is unitary, which implies ba′ ∈ E(S). Hence ξ is symmetric. To see the transitivity of ξ, we assume that aξb and bξc. Then by the definition of ξ, we have ab′ ∈ E(S) and bc′ ∈ E(S) for some b′ ∈ V(b) and c′ ∈ V(c). Hence ab′bc′ ∈ E(S). This implies that a′(ab′bc′)a ∈ E(S), that is, (a′ab′b)c′a ∈ E(S). But since E(S) is a unitary subset of S, we have c′a ∈ E(S). Then we obtain c(c′a)c′ ∈ E(S). This implies that (cc′)(ac′) ∈ E(S). However, because E(S) is unitary and cc′ ∈ E(S), it follows that ac′ ∈ E(S). In other words, we have aξc; that is, ξ is transitive. Hence ξ is indeed an equivalence relation on S.

We now show that ξ is a group congruence on S. For this purpose, we let a, b, c ∈ S be such that aξb. Then ab′ ∈ E(S) for some b′ ∈ V(b), and cab′c′ ∈ E(S), where c′ ∈ V(c). Now, as b′c′ ∈ V(cb), we have ca ξ cb. Also, we have a(cc′)b′ = a(a′acc′)b′ = a(a′acc′a′ab′) = aa′(acc′a′)ab′ ∈ E(S). This implies that ac ξ bc as well, by the definition of ξ. Thus we have verified that ξ is a congruence on S. Since ef ∈ E(S) for any e, f ∈ E(S), it follows that ξ is indeed a group congruence on S.

As ℋ is a congruence on S, the quotient semigroup S/ℋ is of course a left normal band, since S is a strong LC-semigroup. We now show that ξ ∩ ℋ is the identity relation. To this end, we let (a, b) ∈ ξ ∩ ℋ. Then we have ab′ ∈ E(S) for all b′ ∈ V(b). Also, there exist a′ ∈ V(a) and b′ ∈ V(b) such that a′a = b′b and aa′ = bb′. Dually, we have ba′ ∈ E(S). Since S is a strong LC-semigroup, we have b′(ba′)b ∈ E(S), that is, (b′b)(a′b) ∈ E(S). Hence a′b ∈ E(S). Similarly, we have b′a ∈ E(S). Thus, by noting that S is an LC-semigroup, we deduce that

a = aa′a = ab′b = ab′ba′ab′b = (ab′)(ba′)(ab′)b = (ab′)(ba′)b = a(b′b)a′b = aa′aa′b = aa′b = bb′b = b.

This shows that ξ ∩ ℋ is the identity relation. Consequently, S is a subdirect product of the left normal band S/ℋ and the group S/ξ. Of course, S, being an LC-semigroup, is always regular. Thus (i) is proved and the cycle of proofs is completed. □

As a generalization of left groups, Shum, Ren and Guo⁴ called a semigroup S a quasi left group if and only if S is quasi regular and E(S) is a left zero band. In closing this paper, we ask whether we can extend the results obtained in this paper to quasi left groups; for instance, under what conditions can a left Clifford quasi regular semigroup be described as a subdirect product of quasi left groups with zero possibly adjoined?


References

1. M. Petrich, Introduction to Semigroups, Charles E. Merrill Publishing Co., A Bell & Howell Co., Columbus, Ohio, USA (1973).

2. M. Petrich, Inverse Semigroups, John Wiley & Sons, New York, USA, (1984).

3. M. Petrich, Regular semigroups which are subdirect products of a band and a semilattice of groups, Glasgow Math. J. 14 (1973), 27-49.

4. K.P. Shum, X.M. Ren and Y.Q. Guo, On quasi left groups, Groups-Korea '94 (edited by Kim and Johnson), Walter de Gruyter & Co., Berlin-New York (1995), 285-288.

5. Zhu Pingyu, Guo Yuqi and K.P. Shum, Structure and characterizations of left Clifford semigroups, Science in China (Series A) 35 (1992), 792-805.


TREE AUTOMATA IN THE THEORY OF TERM REWRITING

MAGNUS STEINBY Turku Centre for Computer Science and

Department of Mathematics, University of Turku FIN-20014 Turku, Finland

E-mail: [email protected]

In this paper we survey several applications of tree automata and regular tree languages in the theory of term rewriting. We also consider some recently introduced types of tree automata by which the range of such applications can be extended beyond the linear case.

1 Introduction

That tree automata and term rewriting systems must be related in some way is suggested already by the fact that terms can be viewed as trees and conversely. Moreover, the theory of tree automata actually sprang from the unified theory of automata, languages, algebras and equations propounded by J.R. Büchi and J.B. Wright. It is also well-known that tree automata of various types can conveniently be defined as term rewriting systems. Here we shall consider some of the ways in which notions and results from the theory of tree automata have been used in the study of term rewriting systems. Such applications appeared rather late, but in recent years the development has been rapid, and it has strongly stimulated the theory of tree automata.

A fruitful line of research was opened by J.H. Gallier and R.V. Book [25] when they observed that for a left-linear term rewriting system (TRS) R the set Red(R) of reducible ground terms is a regular tree language. For example, it has been shown that if Red(R) is a regular set, then the TRS R can be linearized in a certain sense. It is also decidable whether Red(R) is a regular tree language for a given TRS R. Such results have been applied to problems of ground reducibility, sufficient completeness and inductive proofs.

The ways a TRS transforms regular tree languages have been investigated from various points of view. For example, it has been shown that under some conditions the descendants or the normal forms of the elements of a given regular tree language form regular sets, or that it can be decided whether all such descendants or normal forms belong to another given regular set.

That the congruence classes of any finite system of ground equations are regular tree languages perhaps explains why the use of tree automata has been so successful in the area of ground term rewriting. The ground


tree transducers introduced by M. Dauchet and S. Tison [15] have proved a very effective tool here. In a recent paper J. Engelfriet [18] puts ground tree transducers into a new perspective by defining derivation trees for ground TRSs and then gives novel proofs for many important results.

Since many applications of finite tree automata are essentially limited to linear or ground TRSs, tree automata with some capability to test the equality of subterms are needed for dealing with non-linear rules. An unlimited use of equality checking leads to automata with poor decidability properties, but there are good compromises such as the automata with equality and inequality tests limited to ’brother subtrees’ studied by B. Bogaert and S. Tison [4]. The sets of reducible and inductively reducible terms of certain non-linear TRSs are recognized by such automata. Another possibility is to limit the number of equality tests allowed along any path in the input tree. As a further example of such new notions we may mention the tree automata introduced for dealing with term rewriting problems which involve associative-commutative symbols.

It has been necessary to limit the scope of this survey to a - hopefully representative - selection of topics. Moreover, the ideas and results are presented, as far as possible without jeopardizing accuracy or readability, in a rather informal manner. However, in Section 2 some notions of the theories of term rewriting and tree automata are reviewed. Systematic treatments of term rewriting systems can be found, for example, in [1], [2], [6] and [17]. For tree automata and tree languages, [27] may be consulted. Some topics in the scope of this paper have been surveyed also in [29], [46], and [53]. The forthcoming book "Tree automata techniques and applications" [52] also contains much relevant material.

2 Preliminaries

In what follows, Σ is always a ranked alphabet, i.e., a finite set of operation symbols, and Σm denotes the set of m-ary symbols in Σ (m ≥ 0). If X is an alphabet of variables disjoint from Σ, the set TΣ(X) of Σ-terms over X is defined as the smallest set T such that X ∪ Σ0 ⊆ T, and f(t1, ..., tm) ∈ T whenever f ∈ Σm and t1, ..., tm ∈ T. The set of variables appearing in a term t is denoted var(t). A term in which no variable appears more than once is called linear, and a term with no variables at all is ground. Let TΣ denote the set of ground Σ-terms.
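These definitions translate directly into code. The following Python sketch (an illustration of ours; the encoding of terms as nested tuples is an assumption, not the paper's) implements var(t) and the linearity and groundness tests:

```python
# A term is ('var', x) or (f, t1, ..., tm); a ranked alphabet would map
# each symbol to its arity, e.g. {'f': 2, 'g': 1, 'c': 0}.

def variables(t):
    """var(t): the set of variables appearing in term t."""
    if t[0] == 'var':
        return {t[1]}
    vs = set()
    for s in t[1:]:
        vs |= variables(s)
    return vs

def is_ground(t):
    return not variables(t)

def is_linear(t):
    """A term is linear if no variable appears more than once."""
    occurrences = []
    def walk(u):
        if u[0] == 'var':
            occurrences.append(u[1])
        else:
            for s in u[1:]:
                walk(s)
    walk(t)
    return len(occurrences) == len(set(occurrences))

# f(x, f(c, y)) is linear; f(x, x) is not.
t1 = ('f', ('var', 'x'), ('f', ('c',), ('var', 'y')))
t2 = ('f', ('var', 'x'), ('var', 'x'))
assert is_linear(t1) and not is_linear(t2)
assert is_ground(('g', ('c',)))
```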

A Σ-context is a Σ-term over {ξ} in which the variable ξ appears exactly once. The set of Σ-contexts is denoted CΣ. If p ∈ CΣ and t ∈ TΣ, then p(t) is the ground Σ-term obtained from p by replacing the ξ by t.

The result tσ of applying a substitution σ : X → TΣ(X) to a ΣX-term t


is called an instance of t, and it is a ground instance if var(tσ) = ∅. Let Gr(t) be the set of ground instances of t.

A term rewriting system (TRS) is a finite set of rewrite rules l → r, where l, r ∈ TΣ(X) for some set X of variables, l ∉ X and var(r) ⊆ var(l). For the sake of simplicity, and without any essential loss of generality, we shall apply a TRS to ground terms only. Moreover, since any TRS will be over the given ranked alphabet Σ, this fact will usually not be mentioned. If R is a TRS, the rewrite relation ⇒R on TΣ is defined so that if s, t ∈ TΣ, then s ⇒R t iff there exist a Σ-context p, a rule l → r ∈ R and a substitution σ such that s = p(lσ) and t = p(rσ). A Σ-term s is reducible if s ⇒R t for some t, and otherwise it is irreducible. The sets of reducible and irreducible ground Σ-terms are denoted Red(R) and Irr(R), respectively.
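A one-step reducibility test follows the definition literally: a ground term s is in Red(R) iff some subterm of s matches a left-hand side. A minimal sketch, assuming the nested-tuple term encoding (names are ours); note that matching a non-linear pattern requires testing the equality of subterms, a point that becomes important later in the survey:

```python
def match(pattern, term, subst=None):
    """Try to extend subst so that pattern instantiated by subst equals term;
    return the substitution, or None on failure."""
    subst = dict(subst or {})
    if pattern[0] == 'var':
        x = pattern[1]
        if x in subst and subst[x] != term:
            return None          # non-linear pattern: repeated variable must match equal subterms
        subst[x] = term
        return subst
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p_i, t_i in zip(pattern[1:], term[1:]):
        subst = match(p_i, t_i, subst)
        if subst is None:
            return None
    return subst

def subterms(t):
    """All subterms of a ground term t (including t itself)."""
    yield t
    for s in t[1:]:
        yield from subterms(s)

def reducible(t, rules):
    """t is in Red(R) iff an instance of some left-hand side is a subterm of t."""
    return any(match(lhs, s) is not None for lhs, _ in rules for s in subterms(t))

R = [(('f', ('var', 'x'), ('var', 'x')), ('c',))]       # the rule f(x, x) -> c
assert reducible(('g', ('f', ('c',), ('c',))), R)       # contains the redex f(c, c)
assert not reducible(('f', ('c',), ('g', ('c',))), R)   # the two arguments differ
```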

By ⇒R* we denote the transitive reflexive closure of ⇒R. If s ⇒R* t and t ∈ Irr(R), then t is a normal form of s. Furthermore, let ⇐R and ⇐R* denote the converses of the relations ⇒R and ⇒R*, respectively. The equivalence ≡R on TΣ generated by ⇒R is also a congruence on the term algebra 𝒯Σ = (TΣ, Σ).

A TRS R is terminating if there are no infinite reduction sequences t0 ⇒R t1 ⇒R t2 ⇒R ..., and it is confluent if for any terms s, t, u ∈ TΣ such that s ⇒R* t and s ⇒R* u, there is a term v ∈ TΣ such that t ⇒R* v and u ⇒R* v, that is to say, if ⇐R* ∘ ⇒R* ⊆ ⇒R* ∘ ⇐R*. Similarly, R is locally confluent if ⇐R ∘ ⇒R ⊆ ⇒R* ∘ ⇐R*, and R is said to have the Church-Rosser property if ≡R ⊆ ⇒R* ∘ ⇐R*. It is well-known that a TRS has the Church-Rosser property iff it is confluent, and that a terminating locally confluent TRS is also confluent. A TRS is complete (or convergent) if it is terminating and confluent. If R is a complete TRS, then every Σ-term t has a unique normal form t↓ which is obtained by reducing t in an arbitrary way until an irreducible form is reached. Moreover, for any s, t ∈ TΣ, s ≡R t iff s↓ = t↓. Hence, the word problem "s ≡R t?" is decidable for a complete TRS (and for any system of identities which can be converted into a complete TRS).
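For a complete TRS the unique normal form can therefore be computed by rewriting in any order until an irreducible term is reached. A sketch for the standard complete system plus(0, y) → y, plus(s(x), y) → s(plus(x, y)) (our example, not the paper's), reducing innermost:

```python
# Normalization with the complete TRS  plus(0, y) -> y,  plus(s(x), y) -> s(plus(x, y)).
# Terms are nested tuples; since the TRS is terminating and confluent, the
# normal form is unique and any reduction strategy (here: innermost) finds it.

def normal_form(t):
    if t[0] in ('0', 's'):
        return (t[0],) + tuple(normal_form(s) for s in t[1:])
    # t = ('plus', a, b): normalize the arguments first, then apply a rule.
    a, b = normal_form(t[1]), normal_form(t[2])
    if a == ('0',):
        return b                                        # plus(0, y) -> y
    return ('s', normal_form(('plus', a[1], b)))        # plus(s(x), y) -> s(plus(x, y))

two = ('s', ('s', ('0',)))
three = ('s', two)
assert normal_form(('plus', two, three)) == ('s', ('s', ('s', ('s', ('s', ('0',))))))
```

Since normal forms are unique, the word problem s ≡R t? reduces to comparing normal_form(s) with normal_form(t).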

Terms can be represented by trees and, on the other hand, any term may be regarded as a syntactic representation of an ordered labeled tree. Hence we call ground Σ-terms also Σ-trees, and subsets of TΣ Σ-tree languages. One of the common types of finite tree automata recognizing exactly the regular tree languages can be defined as follows.

A (frontier-to-root, bottom-up) Σ-recognizer is a system 𝒜 = (A, Σ, P, F), where A is a finite set of nullary symbols, the states, such that A ∩ Σ = ∅, and P is a finite set of rules, each of the form

c → a with c ∈ Σ0 and a ∈ A, or

f(a1, ..., am) → a with m > 0, f ∈ Σm and a1, ..., am, a ∈ A,


and F ⊆ A is the set of final states. We regard 𝒜 as a TRS over Σ ∪ A, and the tree language recognized by it is defined as the Σ-tree language

T(𝒜) = {t ∈ TΣ | t ⇒𝒜* a for some a ∈ F}.

A Σ-tree language T ⊆ TΣ is regular, or recognizable, if T = T(𝒜) for some Σ-recognizer 𝒜. The set of all regular Σ-tree languages is denoted RecΣ.
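A deterministic Σ-recognizer is easy to simulate: evaluate the input tree bottom-up by the rules of P and accept iff the resulting state is final. The sketch below (our illustration; states and rules are ours) recognizes the set {g^{2n}(c) | n ≥ 0} of trees with an even number of g's:

```python
# A deterministic bottom-up recognizer: states 'even'/'odd' track the parity
# of the number of g's above the leaf c.  Rules of P are encoded as a table
# from (symbol, child states...) to the resulting state.
RULES = {('c',): 'even', ('g', 'even'): 'odd', ('g', 'odd'): 'even'}
FINAL = {'even'}

def run(t):
    """Evaluate the tree t (a nested tuple) bottom-up to a state."""
    key = (t[0],) + tuple(run(s) for s in t[1:])
    return RULES[key]

def accepts(t):
    return run(t) in FINAL

g2c = ('g', ('g', ('c',)))
assert accepts(('c',)) and accepts(g2c)
assert not accepts(('g', ('c',)))
```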

3 Reducibility, left-linearity and regular tree languages

The set Red(R) of reducible Σ-terms and its complement Irr(R), the set of normal forms of Σ-terms, are two important tree languages associated with any TRS R. A Σ-term t is reducible by R iff an instance of the left-hand side of some rule of R is a subterm of t. More generally, a Σ-term t is said to encompass a ΣX-term l if there exist a Σ-context p and a substitution σ such that t = p(lσ). The set of Σ-terms encompassing a given term l ∈ TΣ(X) is denoted Enc(l), and for any L ⊆ TΣ(X), let Enc(L) be the set of Σ-terms encompassing at least one term in L. If lhs(R) denotes the set of left-hand sides of the rules of R, we may write Red(R) = Enc(lhs(R)).

A TRS R is called left-linear if lhs(R) contains linear terms only. In 1985 Gallier and Book [25] noted that Red(R) is a regular tree language for any left-linear TRS R. (Recall that R is finite by our definition.) This actually follows from well-known closure properties of RecΣ, but it is also easy to construct directly a Σ-tree recognizer for Red(R) when R is a left-linear TRS. For example, if l is the linear term f(x, f(c, y)), where f ∈ Σ2, c ∈ Σ0 and x, y ∈ X, a tree recognizer for Enc(l) = {p(f(s, f(c, t))) | p ∈ CΣ, s, t ∈ TΣ} has simply to search for the pattern f(-, f(c, -)). Here linearity is essential, since otherwise the recognizer should be able to check the equality of arbitrarily large subtrees.
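The pattern search just described needs only a few states. A sketch of a deterministic bottom-up recognizer for Enc(f(x, f(c, y))) (our own construction; state names are illustrative):

```python
# Searching for the pattern f(-, f(c, -)) bottom-up:
#   'is_c'  -- the subtree is exactly c,
#   'fc'    -- the subtree has the form f(c, t),
#   'found' -- an instance of f(x, f(c, y)) occurs in the subtree.
def state(t):
    if t[0] == 'c':
        return 'is_c'
    a, b = state(t[1]), state(t[2])      # t = ('f', t1, t2)
    if 'found' in (a, b) or b == 'fc':
        return 'found'                   # pattern matched here or strictly below
    return 'fc' if a == 'is_c' else 'other'

def encompasses(t):
    """t in Enc(f(x, f(c, y)))?"""
    return state(t) == 'found'

assert encompasses(('f', ('c',), ('f', ('c',), ('c',))))
assert not encompasses(('f', ('f', ('c',), ('c',)), ('c',)))
```

Only a constant amount of information is propagated upward, which is exactly why linearity keeps the language regular.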

Not all regular tree languages are of the form Red(R) or Irr(R) for some left-linear TRS R. For example, if Σ consists of a nullary symbol c and a unary symbol g, the regular set {g^{2n}(c) | n ≥ 0} obviously cannot equal Red(R) or Irr(R) for any left-linear TRS R. In [22] Fülöp and Vágvölgyi introduce a class of one-state deterministic top-down tree automata which recognize exactly the sets Irr(R), where R is a left-linear TRS.

If a TRS R is not left-linear, the sets Red(R) and Irr(R) are not necessarily recognizable. For example, if f ∈ Σ2, c ∈ Σ0 and R = {f(x, x) → c}, then Red(R) = {p(f(t, t)) | p ∈ CΣ, t ∈ TΣ} is not a regular set. On the other hand, Kucherov [40] has shown that no new regular sets Red(R) can be obtained by allowing R to be non-left-linear. In fact, he proved that if Red(R) is a regular


tree language, there is a left-linear TRS L in which every rule is an instance of a rule of R such that Red(L) = Red(R).

This was proved in a different way also by Hofbauer and Huber [32]. They actually proved the following more general fact, where R and S represent sets of left-hand sides of rewrite rules:

Let R and S be finite sets of ΣX-terms. There is a finite set S* of linear instances of terms in S such that Red(R ∪ S*) = Red(R ∪ S) iff there exists a regular Σ-tree language T such that Red(S) \ Red(R) ⊆ T ⊆ Red(S).

They also make some remarks about the semantic effects of the linearization of the left-hand sides on the whole TRS. In particular, they show that if R is a ground convergent TRS, i.e., terminating and confluent on ground terms, then any left-linear instance R* is also ground convergent and induces the same equivalence ≡R on TΣ as R.

Now the question arises whether the regularity of Red(R) can be decided for any given TRS R. A positive answer was given independently by three pairs of authors: Kucherov and Tajine [41], Hofbauer and Huber [32], and Vágvölgyi and Gilleron [55]. The problem is essentially to prove that it can be decided whether each non-linear variable in a left-hand side of a rule can be replaced by finitely many ground terms without changing the set Red(R). Here results presented in [36] prove useful.

4 Ground reducibility and universal closures

A ΣX-term t is ground reducible by a TRS R if every ground instance of t is reducible by R, i.e., if Gr(t) ⊆ Red(R). Ground reducibility is important in inductive theorem proving and complete specifications of functions (cf. [6] or [17], for example). In [49] and [36] it was shown that the question whether a term is ground reducible by a given TRS is decidable but computationally hard. However, if t is linear and R is a left-linear TRS, then both Gr(t) and Red(R) are regular tree languages, and the inclusion Gr(t) ⊆ Red(R) can be decided using some well-known properties of the family RecΣ. In [39] a test based on this fact was presented.

Kucherov [40] also shows that if a linear term t is ground reducible by a TRS R and R* is a linearization of R, then t is ground reducible by R*, too. This implies, for example, that if R is a sufficiently complete specification of a function f(x1, ..., xn) such that Red(R) is regular, then f also has a sufficiently complete left-linear specification. Such matters are discussed more extensively by Hofbauer and Huber [32].

Deciding ground reducibility is usually based on test sets: T ⊆ TΣ is a


test set for a TRS R if a ΣX-term t is ground reducible by R iff all T-instances of t are reducible by R; tσ is a T-instance of t if xσ ∈ T for every x ∈ var(t). Hence the search for more efficient ground reducibility checking methods has largely meant finding smaller test sets. In their recent work Hofbauer and Huber [33] generalize the study of ground reducibility and found it solidly on the theory of tree automata and tree languages. The universal closure L∀ of a Σ-tree language L is defined as the set of all linear Σ-terms t such that Gr(t) ⊆ L. (Since only linear terms are considered, the problem of an unlimited number of variables is easily solved.) Note that a linear term t is ground reducible by a TRS R exactly in case t ∈ Red(R)∀. A set T of Σ-terms is a (ground) test set for the universal closure L∀ if a linear term t is in L∀ iff all T-instances of t are in L. Such test sets are characterized in terms of certain relations (akin to the syntactic congruences used for classifying tree languages; cf. Section 13 of [27] for references). It is also shown, for example, that for any regular L, optimal test sets effectively exist.
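Checking ground reducibility with a test set amounts to instantiating the variables of t by the elements of T in all possible ways and testing reducibility of each instance. A brute-force sketch (the toy TRS and the test set are chosen by us for illustration; we make no claim that this T is optimal):

```python
from itertools import product

# Toy setting: alphabet {c, g}, TRS R = {g(g(x)) -> x}, and T = {c, g(c)},
# the set of irreducible ground terms, used here as a test set.
def variables(t):
    if t[0] == 'var':
        return {t[1]}
    vs = set()
    for s in t[1:]:
        vs |= variables(s)
    return vs

def substitute(t, sigma):
    if t[0] == 'var':
        return sigma[t[1]]
    return (t[0],) + tuple(substitute(s, sigma) for s in t[1:])

def reducible(t):
    """Reducible by R = {g(g(x)) -> x}: some subterm starts with g(g(...))."""
    if t[0] == 'g' and t[1][0] == 'g':
        return True
    return any(reducible(s) for s in t[1:])

def ground_reducible_on(t, T):
    """Are all T-instances of t reducible?"""
    vs = sorted(variables(t))
    return all(reducible(substitute(t, dict(zip(vs, choice))))
               for choice in product(T, repeat=len(vs)))

T = [('c',), ('g', ('c',))]
# g(g(x)) is ground reducible; g(x) is not (its instance g(c) is irreducible).
assert ground_reducible_on(('g', ('g', ('var', 'x'))), T)
assert not ground_reducible_on(('g', ('var', 'x')), T)
```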

5 Coping with nonlinearity

As we already noted, the sets Red(R) and Irr(R) are not necessarily regular if the TRS R is not left-linear, and it is not always possible to replace R by a left-linear TRS. Moreover, for any non-linear ΣX-term t, the sets Gr(t) and Enc(t) are also non-regular (assuming again that Σ0 ≠ ∅). Hence, to extend the range of applications of tree automata, one obviously has to go beyond finite tree recognizers and regular tree languages.

Already in 1981 Mongy considered in his thesis (cf. [52]) tree automata by which any set of the form Red(R), Irr(R), Gr(t) or Enc(t) can be recognized. The power of these automata is due to the equality constraints on subtrees by which state transitions can be controlled. Unfortunately, they are too general for applications of the kind considered here. In particular, the emptiness problem is undecidable [52]. However, useful extensions of ordinary tree automata have been obtained by restricting either the form or the uses of the constraints.

In an automaton with constraints between brothers (AWCBB), defined by Bogaert and Tison [4], a transition rule is of the form

f(a1(x1), ..., am(xm)) →C a(f(x1, ..., xm)),

where f ∈ Σm, a1, ..., am and a are new unary symbols regarded as states, and C is a combination of conditions of the form i = j or i ≠ j, where 1 ≤ i < j ≤ m. The automaton operates in a bottom-up fashion and the


above rule can be used for a configuration transition

p(f(a1(t1), ..., am(tm))) ⇒ p(a(f(t1, ..., tm))),

where p ∈ CΣ and t1, ..., tm ∈ TΣ, assuming that ti = tj whenever C includes the constraint i = j, and ti ≠ tj whenever C includes the constraint i ≠ j (1 ≤ i < j ≤ m). These automata have many of the good properties of classical tree automata. For example, the emptiness and finiteness problems are decidable, and the family of the corresponding tree languages is closed under all Boolean operations and alphabetic tree homomorphisms. This means that by using AWCBB many applications of tree automata can be extended from left-linear systems and linear terms to cases in which all non-linearities appear in 'brother positions'. It should also be mentioned that Bogaert, Seynhaeve and Tison [3] have shown that it is decidable whether the tree language recognized by a given AWCBB is regular.
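The effect of brother constraints can be sketched by letting the bottom-up run test the constraint directly on the actual subtrees (a simplification of ours, not the formal model). The recognizer below accepts exactly the non-regular set Red(R) for R = {f(x, x) → c} over the alphabet {c, g, f}:

```python
# AWCBB-style recognition of Red({f(x, x) -> c}): a tree is accepted iff it
# contains a subterm f(t, t).  The equality constraint "1 = 2" between brother
# subtrees is checked where an f-node is evaluated.
def scan(t):
    if t[0] == 'c':
        return 'ok'
    if t[0] == 'g':
        return scan(t[1])                    # unary symbol: propagate the state
    s1, s2 = scan(t[1]), scan(t[2])          # t = ('f', t1, t2)
    if 'red' in (s1, s2):
        return 'red'                         # a redex was found below
    return 'red' if t[1] == t[2] else 'ok'   # brother constraint 1 = 2

def in_Red(t):
    return scan(t) == 'red'

assert in_Red(('f', ('c',), ('c',)))
assert not in_Red(('f', ('c',), ('g', ('c',))))
assert in_Red(('g', ('f', ('g', ('c',)), ('g', ('c',)))))
```

A genuine AWCBB keeps only a state per node; the point of the restriction to brothers is precisely that the equality test never has to cross to a distant part of the tree.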

In a reduction automaton (RA) the form of the constraints is not restricted, but the number of equality tests along each path of the input tree is limited. Dauchet, Caron and Coquidé [12] prove that the emptiness and finiteness problems are decidable for these automata, and that the family of tree languages recognized by RAs is closed under union and intersection. Moreover, they show that for any ΣX-term t, the set Enc(t) is recognized by an RA. Using these facts it is shown that the first-order theory of reduction is decidable. The language of this theory includes for each ΣX-term t a unary predicate symbol Enct; the atomic formulae are of the form Enct(w), where w ranges over TΣ, and Enct(w) is interpreted to be true for w = s (s ∈ TΣ) if s encompasses t. Since the fact that a ΣX-term t is ground reducible by a TRS R with lhs(R) = {l1, ..., ln} can be expressed by the formula

(∀w)(Enct(w) → (Encl1(w) ∨ ... ∨ Encln(w))),

a new proof for the decidability of ground reducibility is also obtained.

6 Tree automata for AC-theories

In many applications all terms are interpreted in domains belonging to some fixed class, and often this is a variety of algebras defined by a set of identities. Then two different terms may be equivalent in all models to be considered. Hence, a subterm is to be regarded as a potential reduct for a TRS whenever it is equivalent in the given equational theory with an instance of the left-hand side of a rule. On the other hand, recognizable and equational subsets of free algebras of equational theories were considered already by Mezei and Wright [45], and the topic is studied further by Courcelle [9,10], for example. Here

Page 466: Words, Languages & Combinatorics III

441

we review briefly some recent work in which tree automata of some new types are used.

The associative law f(x, f(y, z)) ≈ f(f(x, y), z) and the commutative law f(x, y) ≈ f(y, x) are the most common equational axioms. An AC-theory is obtained when some of the binary operations are associative and commutative. Let ≈_AC be the equational theory generated by a set of AC-axioms, and for any T ⊆ T_Σ, let AC(T) = {s ∈ T_Σ | s ≈_AC t for some t ∈ T}. Then AC(Gr(t)) is the set of ground AC-instances of a given ΣX-term t, and a rule l → r can be applied to a ground term s exactly in case s has a subterm belonging to AC(Gr(l)). It is easy to see that AC(Gr(t)) is a regular tree language for any linear term t. This means that ordinary tree automata can be used for dealing with linear AC-rewriting problems.
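Deciding s ≈_AC t for a single AC symbol amounts to flattening nested occurrences of that symbol and comparing argument multisets. The following is only an illustrative sketch; the tuple-based term representation and all function names are our own, not taken from the works cited here:

```python
# Decide s =_AC t for one AC symbol "f": flatten nested f's and
# compare the resulting argument multisets recursively.
# Terms are tuples ("f", t1, ..., tn) or leaf strings.

def flatten(t, ac="f"):
    """Collect the maximal list of non-f arguments under nested f's."""
    if isinstance(t, tuple) and t[0] == ac:
        args = []
        for s in t[1:]:
            if isinstance(s, tuple) and s[0] == ac:
                args.extend(flatten(s, ac))
            else:
                args.append(s)
        return args
    return [t]

def ac_key(t, ac="f"):
    """Canonical form: flatten under the AC symbol and sort arguments."""
    if not isinstance(t, tuple):
        return (t,)                      # leaf
    if t[0] == ac:
        return (ac,) + tuple(sorted(ac_key(s, ac) for s in flatten(t, ac)))
    return (t[0],) + tuple(ac_key(s, ac) for s in t[1:])

def ac_equal(s, t, ac="f"):
    return ac_key(s, ac) == ac_key(t, ac)

# f(a, f(b, c)) =_AC f(f(c, a), b):
s = ("f", "a", ("f", "b", "c"))
t = ("f", ("f", "c", "a"), "b")
print(ac_equal(s, t))          # True
print(ac_equal(("g", "a", "b"), ("g", "b", "a")))   # False: g is not AC here
```

Normalizing to such a canonical form is also the standard way to make AC-matching and the sets AC(Gr(t)) computable in practice for linear terms.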

An instance of the AC-complement problem consists of some Σ-terms t, s1, …, sn, and the question is whether there is a ground instance of t which is not a ground AC-instance of any si (i = 1, …, n). Since the question can be expressed in the form

“Gr(t) ∩ AC(Gr(s1) ∪ … ∪ Gr(sn))^c ≠ ∅?”,

it is clearly decidable when the terms s1, …, sn are linear; it is easy to check whether a recognizer for the complement set AC(Gr(s1) ∪ … ∪ Gr(sn))^c accepts a ground tree of the form tσ. The ground reducibility problem modulo an AC-theory for any left-linear TRS is similarly seen to be decidable.

Lugiez and Moysset [43] generalize AWCBB-recognizers so that equality tests modulo the AC-theory can be performed between brother subtrees with a common non-AC parent symbol, and using such recognizers, non-linearities in such brother positions can be allowed. The scope of applications is widened further by using the ‘conditional automata’ studied by Lugiez and Moysset [44]. Finally, Lugiez [42] defines an abstract class of constrained tree automata suited, e.g., for verifying the sufficient completeness of conditional definitions of functions. The theory is developed in some detail for multiset and arithmetic constraints.

7 Descendants, normal forms and equivalence classes

For any TRS R, any Σ-term t and any tree language T ⊆ T_Σ, let

• R*(t) = {s ∈ T_Σ | t →*_R s} and R*(T) = ∪_{t∈T} R*(t);
• R↓(t) = R*(t) ∩ Irr(R) and R↓(T) = ∪_{t∈T} R↓(t);
• R↔(t) = {s ∈ T_Σ | s ↔*_R t} and R↔(T) = ∪_{t∈T} R↔(t).
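For a ground TRS whose descendant set happens to be finite, the set of descendants of t and its normal forms can be computed by naive closure; in general one needs the tree-automata constructions discussed in this section. A hedged sketch (the term representation and all names are ours, and the loop only terminates when the descendant set is finite):

```python
# Naive computation of the descendants R*(t) and normal forms for a
# ground TRS, assuming the descendant set is finite.
# Terms are tuples ("f", sub1, ...) or constant strings.

def subst_at(t, pos, s):
    """Replace the subterm of t at position pos by s."""
    if not pos:
        return s
    i = pos[0]
    return t[:i] + (subst_at(t[i], pos[1:], s),) + t[i + 1:]

def positions(t, pos=()):
    yield pos
    if isinstance(t, tuple):
        for i in range(1, len(t)):
            yield from positions(t[i], pos + (i,))

def one_step(t, rules):
    """All terms reachable from t by one rewrite step."""
    for pos in positions(t):
        sub = t
        for i in pos:
            sub = sub[i]
        for lhs, rhs in rules:
            if sub == lhs:
                yield subst_at(t, pos, rhs)

def descendants(t, rules):
    seen, todo = {t}, [t]
    while todo:
        u = todo.pop()
        for v in one_step(u, rules):
            if v not in seen:
                seen.add(v)
                todo.append(v)
    return seen

R = [(("f", "a"), "a")]                 # single ground rule f(a) -> a
t = ("f", ("f", "a"))
D = descendants(t, R)                   # {f(f(a)), f(a), a}
NF = {u for u in D if not any(True for _ in one_step(u, R))}
print(len(D), NF)                       # 3 {'a'}
```

The point of the results below is precisely that such brute-force enumeration is avoidable for ground systems: the (possibly infinite) sets can be represented by tree automata.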


Many questions about R become decidable if these sets are (effectively) regular for every regular T. However, this is not always the case. A TRS R is said to preserve regularity if R*(T) is regular for every regular set T. As we shall see, this property and many other questions concerning the operators R*, R↓ and R↔ are undecidable, but we begin with some positive results.

A TRS is ground if its rules contain no variables. It follows from results by Brainerd [5] and Dauchet and Tison [15] that if R is a ground TRS, then R*(T), R↓(T) and R↔(T) are regular sets for every regular set T.

A TRS is called monadic if in every rule the left-hand side is linear and the right-hand side is of height ≤ 1. Gallier and Book [25] proved that if R is a monadic TRS with the Church-Rosser property, then R↔(t) ∈ Rec_Σ for every t ∈ T_Σ. Salomaa [50] has shown that any monadic TRS in which also the right-hand sides are linear preserves regularity, and in [8] and [30] this result is extended to semi-monadic and generalized semi-monadic systems, respectively. Gyenizse and Vágvölgyi [30] also consider the effect of the ranked alphabet of the regular sets on the regularity preserving property; it may be lost when the alphabet is extended. It should be noted that the papers [25], [50] and [8] are actually about the use of various types of tree pushdown automata.

We shall now mention some general undecidability results. For the sake of simplicity, we will sometimes overlook possibilities to strengthen a result by restrictions on Σ and R. For fuller treatments of these matters the reader is referred to [28] and [29].

Dauchet [11] has shown that given a TRS R, it is undecidable both

(1) whether R*(T) ∈ Rec_Σ for every T ∈ Rec_Σ, and
(2) whether R*(T) ∈ Rec_Σ for a given T ∈ Rec_Σ.

In [28] Gilleron proves the undecidability of the corresponding problems for R↓(T). Gilleron and Tison [29] show that these problems concerning R↓ are undecidable even for convergent TRSs. They also prove that given a TRS R, it is undecidable both whether R↔(T) ∈ Rec_Σ for all T ∈ Rec_Σ, and whether R↔(T) ∈ Rec_Σ for a given T ∈ Rec_Σ. That many such questions are, in fact, undecidable already for very special complete TRSs follows from Fülöp’s [19] undecidability results concerning the ranges of deterministic top-down tree transducers.

Let us now turn to some decidable problems. First we note that the regularity of R↓(T) is decidable in the special case T = T_Σ since R↓(T_Σ) is simply Irr(R). In [29] it was noted that for a regularity preserving TRS R, both the reachability problem “s ∈ R*(t)?”, where s, t ∈ T_Σ, and the inclusion problem “R*(T) ⊆ U?”, where T, U ∈ Rec_Σ, are decidable.


Many problems become decidable when some restricted form of rewriting is considered. For example, in [20] Fülöp et al. prove such results for ‘one-pass rewriting’. When the one-pass root-started rewriting mode is followed, rewriting begins from the root of the given term and proceeds towards the leaves as far as possible without rewriting anything that has already been modified and without leaving any intermediate nodes unrewritten. When no more rewriting is possible, a one-pass root-started normal form has been obtained. One-pass frontier-to-root rewriting is similar, but proceeds from the leaves towards the root. Let lrR*(T) and lrR↓(T) denote, respectively, the set of all one-pass root-started descendants and the set of all one-pass root-started normal forms of the terms in T (⊆ T_Σ). In [20] it is shown that for a left-linear TRS R the inclusion problems “lrR*(T) ⊆ U?” and “lrR↓(T) ⊆ U?”, where T, U ∈ Rec_Σ, as well as the corresponding problems for one-pass leaf-started rewriting, are decidable. It is to be noted that since R is not required to be right-linear, the sets lrR*(T) and lrR↓(T) (and their leaf-started counterparts) are not necessarily regular. In [51] Seynhaeve et al. introduce a general method for proving the decidability of such problems which also shows why right-linearity was not needed here. Let us explain briefly their idea.

Let R° be some operator on T_Σ, such as R* or R↓, associated with a given TRS R. To treat problems of the form “R°(T) ⊆ U?” or “R°(T) = U?”, where T, U ∈ Rec_Σ, a tree bimorphism (φ, S, ψ) is defined as follows. If R = {l1 → r1, …, lk → rk}, we enlarge Σ by k new symbols g1, …, gk so that for each i = 1, …, k, the rank of gi equals |var(li)|. If Γ denotes the new ranked alphabet, φ and ψ are tree homomorphisms (cf. [27]) from T_Γ to T_Σ which are defined so that

φ_m(f) = ψ_m(f) = f(x1, …, xm) for all f ∈ Σ_m, m ≥ 0;

φ_m(gi) = li(x1, …, xm) and ψ_m(gi) = ri(x1, …, xm), where var(li) = {x1, …, xm}, for i = 1, …, k.

Finally, S is a Γ-tree language such that R°(t) = ψ(φ^{-1}(t) ∩ S) for every t ∈ T_Σ. Whether such a set S exists depends naturally on the operator R° considered, but as shown in [51], in many cases of interest even a regular S can be found. Let us assume that such a (φ, S, ψ) with S ∈ Rec_Γ is given. The question “R°(T) = U?” with T, U ∈ Rec_Σ is then decidable if R is right-linear since

R°(T) = U ⟺ ψ(φ^{-1}(T) ∩ S) = U,


and both the inverse tree homomorphism φ^{-1} and the linear tree homomorphism ψ preserve regularity (cf. [27], for example). Moreover, since

R°(T) ⊆ U ⟺ φ^{-1}(T) ∩ S ⊆ ψ^{-1}(U),

the question “R°(T) ⊆ U?” is decidable even if R is not right-linear.

Let us finally mention the interesting idea, considered in [34] and [26], of using regular approximations of the sets R*(T) and R↓(T) for handling cases where the sets themselves are not regular.

8 Congruential tree languages and ground term rewriting systems

A Σ-tree language T is said to be congruential if it can be expressed as the union [t1]_E ∪ … ∪ [tk]_E of finitely many classes of the congruence generated by a finite system E of ground Σ-equations on the Σ-term algebra T_Σ. In particular, for any ground TRS R, all ↔*_R-classes and their finite unions are congruential sets. Since any congruence on T_Σ of finite index is finitely generated, regular tree languages are congruential, but also the converse holds, although the index of a finitely generated congruence on T_Σ is not necessarily finite. This was proved by Fülöp and Vágvölgyi [21], but the fact is implicit already in [5] and it is mentioned without proof in [38]. Moreover, the equality of the families of congruential and regular sets is effective. In fact, Vágvölgyi [54] presents an efficient algorithm based on results from [23] for constructing a deterministic tree recognizer for a given congruential tree language [t1]_E ∪ … ∪ [tk]_E, and in [24] Fülöp and Vágvölgyi show how the converse problem of finding a (minimal) equational representation for a given regular tree language can be solved. The effective equivalence of congruential and regular tree languages could explain why tree automata and regular tree languages are so useful in the theory of ground equational systems and ground term rewriting systems.
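The word problem underlying the classes [t]_E — given a finite set E of ground equations, decide whether two ground terms are congruent — can be solved by a congruence-closure fixpoint over all occurring subterms. The sketch below is our own naive version (quadratic propagation loop, union-find by path compression), not the efficient algorithm of [54]:

```python
# Congruence closure for ground equations: decide s =_E t.
# Terms are tuples ("f", sub1, ...) or constant strings.

def congruent(equations, s, t):
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        if parent[x] != x:
            parent[x] = find(parent[x])      # path compression
        return parent[x]

    def union(x, y):
        parent[find(x)] = find(y)

    terms = set()                            # all subterms occurring anywhere
    def collect(u):
        terms.add(u)
        if isinstance(u, tuple):
            for v in u[1:]:
                collect(v)
    for l, r in equations:
        collect(l); collect(r)
    collect(s); collect(t)

    for l, r in equations:
        union(l, r)

    # propagate: f(u1,...) ~ f(v1,...) whenever all ui ~ vi, until fixpoint
    tlist = [u for u in terms if isinstance(u, tuple)]
    changed = True
    while changed:
        changed = False
        for u in tlist:
            for v in tlist:
                if u[0] == v[0] and len(u) == len(v) and find(u) != find(v):
                    if all(find(a) == find(b) for a, b in zip(u[1:], v[1:])):
                        union(u, v)
                        changed = True
    return find(s) == find(t)

E = [("a", "b")]                             # the single ground equation a = b
print(congruent(E, ("f", "a"), ("f", "b")))  # True
print(congruent(E, ("f", "a"), "a"))         # False
```

Restricting attention to the subterms that actually occur suffices here, which is why the congruence generated by E, although of infinite support, is decidable.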

Let us call a Σ-recognizer, as defined in Section 2, from which the set of final states has been omitted, simply a Σ-automaton. A ground tree transducer (GTT) is then a pair (𝒜, ℬ) consisting of two Σ-automata 𝒜 = (A, Σ, P) and ℬ = (B, Σ, Q) which may share some states. The GTT-relation defined by (𝒜, ℬ) is the tree transformation

τ(𝒜, ℬ) = {(s, t) ∈ T_Σ × T_Σ | (∃r ∈ T_Σ(A ∩ B)) s →*_𝒜 r, t →*_ℬ r}.

Ground tree transducers were introduced by Dauchet and Tison [15]. The rewrite relation →*_R of any ground TRS R can be defined as a GTT-relation. Since the class of GTT-relations is effectively closed under forming


converse relations (trivial), compositions and reflexive, transitive closures, the confluence condition

←*_R ∘ →*_R ⊆ →*_R ∘ ←*_R

can be expressed as an inclusion between two GTT-relations. Now the inclusion problem of GTT-relations turns out to be decidable, and hence a new simple proof of the decidability of the confluence of ground TRSs is obtained. In fact, the result can be extended to concern some more general classes of TRSs. These results can be found in [15] and [14]. Of course, it should be mentioned that the decidability of the confluence of a ground TRS was shown also by Oyamaguchi [47]. In [16] ground tree transducers are used for proving an even stronger result, the decidability of the first-order theory of ground term rewriting.

In [18] Engelfriet introduces derivation trees for ground TRSs. A reduction sequence s →_R … →_R t of a ground TRS R is represented by a tree r ∈ T_{Σ∪{#}}, where # is a new binary symbol, so that λ(r) = s and ρ(r) = t for two given linear tree homomorphisms λ and ρ. The properties of these derivation trees are quite similar to those of the derivation trees of context-free grammars. In particular, for any ground TRS R, the set D_R of derivation trees is a regular tree language and the relation →*_R is the image of D_R under the yield-mapping η(r) = (λ(r), ρ(r)). Any GTT-relation can be defined in a similar manner, and elegant, rigorous proofs are obtained for many results concerning GTTs and ground TRSs.

Acknowledgements

This work was supported by the Academy of Finland under Grant SA 863038. I thank the participants of a TUCS seminar (1999-2000) for sharing with me the pains and pleasures of studying some of the new literature on tree automata and term rewriting, and especially Eija Jurvanen and Matti Ronka for their comments on this paper. Special thanks are due to Tatjana Petković for her generous help with the preparation of the typescript.

References

(LNCS = Lecture Notes in Computer Science)

[1] J. Avenhaus: Reduktionssysteme, Springer, Berlin 1995.
[2] F. Baader and T. Nipkow: Term Rewriting and All That, Cambridge University Press, Cambridge 1998.
[3] B. Bogaert, F. Seynhaeve and S. Tison: The recognizability problem for tree automata with comparisons between brothers. Foundations of Software Science and Computation Structures (Proc. Conf., 1999), LNCS 1578, Springer, Berlin 1999, 150-164.
[4] B. Bogaert and S. Tison: Equality and disequality constraints on direct subterms in tree automata. Theoretical Aspects of Computer Science, STACS'92 (Proc. Symp., 1992), LNCS 577, Springer, Berlin 1992, 161-171.
[5] W.S. Brainerd: Tree generating regular systems. Information and Control 14 (1969), 217-231.
[6] R. Bündgen: Termersetzungssysteme, Vieweg & Sohn, Wiesbaden 1998.
[7] H. Comon: An efficient method for handling initial algebras. Algebraic and Logic Programming (Proc. Intern. Workshop, 1988), LNCS 343, Springer, Berlin 1988, 108-118.
[8] J.L. Coquidé, M. Dauchet, R. Gilleron and S. Vágvölgyi: Bottom-up tree pushdown automata: classification and connection with rewrite systems. Theoretical Computer Science 127 (1994), 69-98.
[9] B. Courcelle: On recognizable sets and tree automata. In: H. Aït-Kaci and M. Nivat (eds.), Resolution of Equations in Algebraic Structures, Academic Press, Boston 1989, 93-126.
[10] B. Courcelle: Basic notions of universal algebra for language theory and graph grammars. Theoretical Computer Science 163 (1996), 1-54.
[11] M. Dauchet: Simulation of Turing machines by a regular rewrite rule. Theoretical Computer Science 103 (1992), 409-420.
[12] M. Dauchet, A.C. Caron and J.L. Coquidé: Automata for reduction properties solving. J. Symbolic Computation 20 (1995), 215-233.
[13] M. Dauchet and F. De Comite: A gap between linear and non linear term-rewriting systems. Rewriting Techniques and Applications, RTA-87 (Proc. Conf., 1987), LNCS 256, Springer, Berlin 1987, 95-104.
[14] M. Dauchet, T. Heuillard, P. Lescanne and S. Tison: Decidability of the confluence of finite ground term rewrite systems and other related term rewrite systems. Information and Computation 88 (1990), 187-201.
[15] M. Dauchet and S. Tison: Decidability of confluence for ground term rewriting systems. Fundamentals of Computation Theory, FCT'85 (Proc. Conf., 1985), LNCS 199, Springer, Berlin 1985, 80-89.
[16] M. Dauchet and S. Tison: The theory of ground rewrite systems is decidable. 5th IEEE Symposium on Logic in Computer Science (Proc. Symp., 1990), IEEE Computer Society Press, Los Alamitos, CA 1990, 242-248.
[17] N. Dershowitz and J.-P. Jouannaud: Rewrite systems. In: J. van Leeuwen (ed.), Handbook of Theoretical Computer Science, Vol. B, Elsevier Science Publishers B.V., Amsterdam 1990, 243-320.
[18] J. Engelfriet: Derivation trees of ground term rewriting systems. Information and Computation 152 (1999), 1-15.
[19] Z. Fülöp: Undecidable properties of deterministic top-down tree transducers. Theoretical Computer Science 134 (1994), 311-328.
[20] Z. Fülöp, E. Jurvanen, M. Steinby and S. Vágvölgyi: On one-pass term rewriting. Acta Cybernetica 14 (1999), 83-98.
[21] Z. Fülöp and S. Vágvölgyi: Congruential tree languages are the same as recognizable tree languages - A proof for a theorem of D. Kozen. Bulletin of the EATCS 39 (1989), 175-185.
[22] Z. Fülöp and S. Vágvölgyi: A characterization of irreducible sets modulo left-linear term rewriting systems by tree automata. Fundamenta Informaticae XIII (1990), 211-226.
[23] Z. Fülöp and S. Vágvölgyi: Ground term rewriting rules for the word problem of ground equations. Bulletin of the EATCS 45 (1991), 186-201.
[24] Z. Fülöp and S. Vágvölgyi: Minimal equational representations of recognizable tree languages. Acta Informatica 34 (1997), 59-84.
[25] J.H. Gallier and R.V. Book: Reductions in tree replacement systems. Theoretical Computer Science 37 (1985), 123-150.
[26] T. Genet: Decidable approximations of sets of descendants and sets of normal forms. Rewriting Techniques and Applications, RTA-98 (Proc. Conf., 1998), LNCS 1379, Springer, Berlin 1998, 151-165.
[27] F. Gécseg and M. Steinby: Tree languages. In: G. Rozenberg and A. Salomaa (eds.), Handbook of Formal Languages, Vol. 3, Springer, Berlin 1997, 1-87.
[28] R. Gilleron: Decision problems for term rewriting systems and recognizable tree languages. Theoretical Aspects of Computer Science, STACS 91 (Proc. Symp., 1991), LNCS 480, Springer, Berlin 1991, 148-159.
[29] R. Gilleron and S. Tison: Regular tree languages and rewrite systems. Fundamenta Informaticae 24 (1995), 157-175.
[30] P. Gyenizse and S. Vágvölgyi: Linear generalized semi-monadic rewrite systems effectively preserve recognizability. Theoretical Computer Science 194 (1998), 87-122.
[31] P. Gyenizse and S. Vágvölgyi: A property of left-linear rewrite systems preserving recognizability. Theoretical Computer Science 242 (2000), 477-498.
[32] D. Hofbauer and M. Huber: Linearizing term rewriting systems using test sets. J. Symbolic Computation 17 (1994), 91-129.
[33] D. Hofbauer and M. Huber: Test sets for the universal and existential closure of regular tree languages. Rewriting Techniques and Applications, RTA-99 (Proc. Conf., 1999), LNCS 1631, Springer, Berlin 1999, 205-219.
[34] F. Jacquemard: Decidable approximations of term rewriting systems. Rewriting Techniques and Applications, RTA-96 (Proc. Conf., 1996), LNCS 1103, Springer, Berlin 1996, 362-376.
[35] Y. Kaji, T. Fujiwara and T. Kasami: Solving a unification problem under constrained substitutions using tree automata. J. Symbolic Computation 23 (1997), 79-117.
[36] D. Kapur, P. Narendran and H. Zhang: On sufficient completeness and related properties of term rewriting systems. Acta Informatica 24 (1987), 395-415.
[37] E. Kounalis: Testing for inductive (co)-reducibility. Trees in Algebra and Programming, CAAP'90 (Proc. Coll., 1990), LNCS 431, Springer, Berlin 1990, 221-238.
[38] D. Kozen: Complexity of finitely presented algebras. 9th Annual Symposium on the Theory of Computing (Proc. Conf., 1977), 164-177.
[39] G.A. Kucherov: A new quasi-reducibility testing algorithm and its application to proofs by induction. Algebraic and Logic Programming (Proc. Conf., 1988), LNCS 343, Springer, Berlin 1988, 204-213.
[40] G.A. Kucherov: On the relationship between term rewriting systems and regular tree languages. Rewriting Techniques and Applications, RTA-91 (Proc. Conf., 1991), LNCS 488, Springer, Berlin 1991, 299-311.
[41] G. Kucherov and M. Tajine: Decidability of regularity and related properties of ground normal form languages. Information and Computation 118 (1995), 91-100.
[42] D. Lugiez: A good class of tree automata. Applications to inductive theorem proving. Automata, Languages and Programming, ICALP'98 (Proc. Conf., 1998), LNCS 1443, Springer, Berlin 1998, 409-420.
[43] D. Lugiez and J.L. Moysset: Complement problems and tree automata in AC-like theories. Theoretical Aspects of Computer Science, STACS 93 (Proc. Symp., 1993), LNCS 665, Springer, Berlin 1993, 515-524.
[44] D. Lugiez and J.L. Moysset: Tree automata help one to solve equational formulae in AC-theories. J. Symbolic Computation 18 (1994), 297-318.
[45] J. Mezei and J.B. Wright: Algebraic automata and context-free sets. Information and Control 11 (1967), 3-29.
[46] F. Otto: On the connections between rewriting and formal language theory. Rewriting Techniques and Applications, RTA-99 (Proc. Conf., 1999), LNCS 1631, Springer, Berlin 1999, 332-355.
[47] M. Oyamaguchi: The Church-Rosser property for ground term-rewriting systems is decidable. Theoretical Computer Science 49 (1987), 43-79.
[48] M. Oyamaguchi: Some results on decision problems for right-ground term-rewriting systems. Toyohashi Symposium on Theoretical Computer Science (Proc. Symp., 1990), 41-42.
[49] D. Plaisted: Semantic confluence tests and completion methods. Information and Control 65 (1985), 182-215.
[50] K. Salomaa: Deterministic tree pushdown automata and monadic tree rewriting systems. J. Computer and System Sciences 37 (1988), 367-394.
[51] F. Seynhaeve, S. Tison and M. Tommasi: Homomorphisms and concurrent term rewriting. Fundamentals of Computation Theory, FCT'99 (Proc. Conf., 1999), LNCS 1684, Springer, Berlin 1999, 475-487.
[52] H. Comon, M. Dauchet, R. Gilleron, F. Jacquemard, D. Lugiez, S. Tison and M. Tommasi: Tree Automata Techniques and Applications. http://www.grappa.univ-lille3.fr/tata/.
[53] S. Tison: Tree automata and term rewrite systems (Extended Abstract). Rewriting Techniques and Applications, RTA-2000 (Proc. Conf., 2000), LNCS 1833, Springer, Berlin 2000, 27-30.
[54] S. Vágvölgyi: A fast algorithm for constructing a tree automaton recognizing a congruential tree language. Theoretical Computer Science 115 (1993), 391-399.
[55] S. Vágvölgyi and R. Gilleron: For a rewrite system it is decidable whether the set of irreducible, ground terms is recognizable. Bulletin of the EATCS 48 (1992), 197-209.


Key agreement protocol securer than DLOG

Akihiro Yamamura* and Kaoru Kurosawa†

Abstract

Our goal is to propose a key agreement protocol that is secure even if the discrete logarithm problem can be efficiently solved in the underlying abelian group. The protocol is defined over a non-cyclic finite abelian group, whereas the Diffie-Hellman protocol is defined over a cyclic finite abelian group. We analyze the generic reductions of breaking the proposed protocol to the discrete logarithm problem and show that a large number of queries to the discrete logarithm oracle are required to break the proposed protocol in the generic algorithm model.

Key Words: Diffie-Hellman protocol, multiple discrete logarithm problem, generic algorithm, discrete logarithm oracle

1 Introduction

In 1976 Diffie and Hellman proposed a protocol over an insecure channel to establish a secret key. Since then, their scheme has been applied to numerous finite abelian groups like the multiplicative groups of finite fields and the groups of the rational points on elliptic curves and hyperelliptic curves ([3], [4] and [8]). However, the Diffie-Hellman key exchange protocol is inherently vulnerable to an adversary who can solve the discrete logarithm problem in the underlying group. The discrete logarithm problem is believed to be intractable in general; however, we cannot deny the existence of an efficient algorithm that solves the discrete logarithm problem. As a matter of fact, polynomial or subexponential time algorithms for the discrete logarithm problem have been discovered for several classes of finite abelian groups.

*Communications Research Laboratory, 4-2-1, Nukui-Kitamachi, Koganei, Tokyo, 184-8795 Japan. email: aki@crl.go.jp
†Ibaraki University, 4-12-1, Nakanarusawa, Hitachi, Ibaraki, 316-8511, Japan. email: kurosawa@cis.ibaraki.ac.jp


On the other hand, the quest for abelian groups appropriate to the Diffie-Hellman scheme made numerous classes of abelian groups available to protocol designers. Some groups have potentially richer structures than the multiplicative groups of finite fields, which are always cyclic. Several groups, for example, the multiplicative group of integers modulo a composite number, the group of the rational points on an elliptic curve or a hyperelliptic curve, and a commutative subgroup of the group of non-singular matrices over a finite ring, are not necessarily cyclic. For these groups, the discrete logarithm problem does not fully reflect the complexity of their algebraic structures. In fact, in [8], it is shown that Ω(p) queries to the group operation oracle are required to solve the multiple discrete logarithm problem (see Section 2.3 for the definition) in a non-cyclic group isomorphic to Z_p × Z_p in the generic algorithm model, whereas only Ω(√p) queries to the group operation oracle are required to solve the discrete logarithm problem for a group isomorphic to Z_{p^n} for any n ≥ 1. The results indicate that the multiple discrete logarithm problem is more difficult than the discrete logarithm problem.

This observation motivates us to invent a key agreement protocol over a non-cyclic group so that we can exploit its complicated algebraic structure to enhance the security. We construct a key agreement protocol whose security is based on the intractability of the multiple discrete logarithm problem over a non-cyclic abelian group. We employ generic algorithms and generic reductions to produce evidence that the proposed protocol cannot be broken by an adversary who can solve the discrete logarithm problem. We prove that breaking the proposed protocol requires Ω(√p) queries to the group operation oracles. Furthermore, we prove that breaking the proposed protocol requires Ω(√p) queries even if the adversary is allowed to call the discrete logarithm oracle, which is introduced in Section 3.2, in addition to the group operation oracle. Therefore there exists no probabilistic polynomial time algorithm that breaks the proposed protocol even if the discrete logarithm problem is efficiently solved. Hence, the proposed protocol has the novel feature that it is secure against an adversary who can solve the discrete logarithm problem.

Related works: A generic algorithm is a general purpose algorithm that does not make use of any property of the representation of the group elements. In [8] it is proved that the computational complexity of breaking the DH protocol is also Ω(√p). In [5], however, it is proved that solving the DLOG is strictly harder than breaking the DH protocol if p² | n.


2 Proposed key agreement protocol

We introduce a key agreement protocol that is defined over a general finite abelian group that is not necessarily cyclic. The protocol is called the Generalized Diffie-Hellman protocol, or simply the GDH protocol, in this paper. We prove that the GDH protocol is at least as secure as the Diffie-Hellman (DH for short) protocol. In Section 3, we produce stronger evidence that the GDH protocol is securer than the DH protocol.

Before defining the protocol, we recall some notation from group theory. Let G be a finite abelian (multiplicative) group. The subgroup generated by the element a is denoted by < a >, and similarly the subgroup generated by the elements a and b is denoted by < a, b >, that is:

< a > = {a^n | n ∈ Z}, < a, b > = {a^n b^m | n, m ∈ Z}.

For a ∈ G, |a| denotes the order of a, that is, the number of the elements in < a >. The order of a group G is denoted by |G|.
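These notions are easy to explore concretely. The sketch below enumerates < a > and < a, b > in the multiplicative group modulo n, a group of our own choosing for illustration; (Z/8Z)* is a handy non-cyclic example, since every element there has order at most 2:

```python
# Enumerate <a>, <a,b> and element orders in (Z/nZ)* (illustrative group).

def generated(n, *gens):
    """Closure of the generators under multiplication mod n."""
    sub = {1 % n}
    todo = list(gens)
    while todo:
        x = todo.pop()
        for y in list(sub):
            z = x * y % n
            if z not in sub:
                sub.add(z)
                todo.append(z)
    return sub

def order(n, a):
    return len(generated(n, a))       # |a| = size of <a>

n = 8                                  # (Z/8Z)* = {1, 3, 5, 7}
print(order(n, 3), order(n, 5))        # 2 2
H = generated(n, 3, 5)
print(sorted(H))                       # [1, 3, 5, 7]
print(max(order(n, x) for x in H) < len(H))   # True: no generator, non-cyclic
```

The last line checks non-cyclicity directly: no element of < 3, 5 > has order equal to |H|, which is the situation the Remark below (common prime divisors of |a| and |b|) is designed to exploit.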

2.1 Proposed protocol

GDH protocol: Let G be a finite abelian group and a, b elements of G. We choose public integers α, β, γ, δ such that each of α, β, γ, δ is relatively prime to both |a| and |b|.

step 1. Alice chooses integers i1, i2 randomly. She computes

(a^{αi1} b^{βi2}, a^{γi2} b^{δi1})

and sends it to Bob.

step 2. Bob chooses integers j1, j2 randomly. He computes

(a^{αj1} b^{βj2}, a^{γj2} b^{δj1})

and sends it to Alice.

step 3. Alice computes

(a^{αj1} b^{βj2})^{δi1} = a^{αδi1j1} b^{βδi1j2} and (a^{γj2} b^{δj1})^{βi2} = a^{βγi2j2} b^{βδi2j1}.

Then Alice computes a common key

K = a^{αδi1j1 + βγi2j2} b^{βδ(i1j2 + i2j1)}

by multiplying the two elements.


step 4. Bob computes

(a^{αi1} b^{βi2})^{δj1} = a^{αδi1j1} b^{βδi2j1} and (a^{γi2} b^{δi1})^{βj2} = a^{βγi2j2} b^{βδi1j2}.

Then Bob similarly computes the common key K.

Remark. Suppose that G is an abelian group and a and b are elements of G. If |a| and |b| are relatively prime, then the subgroup < a, b > is in fact cyclic. Therefore a necessary condition for < a, b > to be non-cyclic is that |a| and |b| have common prime divisors. We should note that a finite abelian group G is non-cyclic if and only if G contains a subgroup isomorphic to Z_p × Z_p for some prime p. Hence, we prefer to choose |a| and |b| which have common prime divisors so that the scheme can be based on the structure of a non-cyclic group.

2.2 Security compared with the Diffie-Hellman protocol

Breaking the protocol is equivalent to solving the following algorithmic problem. Suppose that G is a finite abelian group and that a, b ∈ G. Each of the parameters α, β, γ, δ is relatively prime to both |a| and |b|. The GDH problem in G with respect to a, b is defined by:

INPUT: (a^{αi1} b^{βi2}, a^{γi2} b^{δi1}) and (a^{αj1} b^{βj2}, a^{γj2} b^{δj1})

OUTPUT: K = a^{αδi1j1 + βγi2j2} b^{βδ(i1j2 + i2j1)}

where i1, i2, j1, j2 are randomly and independently chosen integers.

The following result guarantees that the GDH protocol is at least as secure as the DH protocol if the parameters are carefully chosen.

Theorem 2.1 Let α, β, γ, δ be integers. We suppose that each of them is relatively prime to |G|. If there exists an efficient algorithm that solves the GDH problem (with the parameters α, β, γ, δ) in an abelian group G for all a, b ∈ G, then there exists an efficient algorithm that solves the DH problem in G for all a ∈ G.

Proof. Suppose that there is an efficient algorithm that solves the GDH problem for all a and b. We construct an efficient algorithm that solves the DH problem, that is, an algorithm that computes a^{i1j1} for the inputs a^{i1} and


a^{j1}, where a is an element of G. Let b = 1 (the identity element of G). We should note that α, β, γ, δ are integers relatively prime to |a| since |a| divides |G|. By our assumption, we have an efficient algorithm to solve the GDH problem for a and b. Let i2 = j2 = 0. We input

(a^{αi1} b^{βi2}, a^{γi2} b^{δi1}) = (a^{αi1}, 1) = ((a^{i1})^α, 1)

and

(a^{αj1} b^{βj2}, a^{γj2} b^{δj1}) = (a^{αj1}, 1) = ((a^{j1})^α, 1)

to the algorithm that solves the GDH problem with respect to a and b. Then we obtain

a^{αδi1j1 + βγi2j2} b^{βδ(i1j2 + i2j1)} = a^{αδi1j1}.

We note that we can compute (a^{i1})^α and (a^{j1})^α because we are given a^{i1}, a^{j1} and α is public information. Since both α and δ are relatively prime to |a|, we can find an integer m such that (a^{αδ})^m = a. Then (a^{αδi1j1})^m = (a^{αδm})^{i1j1} = a^{i1j1}, and hence, the DH problem is efficiently solved. □
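The algebra of this reduction is easy to sanity-check numerically: with b = 1 and i2 = j2 = 0 the GDH output collapses to a^{αδi1j1}, and raising it to the m satisfying (a^{αδ})^m = a recovers the DH answer a^{i1j1}. A small sketch in the cyclic group Z_q* with all concrete numbers chosen by us:

```python
# Sanity check of the Theorem 2.1 reduction in Z_q* (q prime).
# With b = 1, i2 = j2 = 0, the GDH answer is a^(alpha*delta*i1*j1);
# exponentiation by m = (alpha*delta)^(-1) mod (q-1) yields a^(i1*j1).

q = 1019                          # prime; the order of a divides q - 1
a = 2
alpha, delta = 5, 7               # coprime to q - 1 = 2 * 509
i1, j1 = 123, 456

gdh_out = pow(a, alpha * delta * i1 * j1, q)

m = pow(alpha * delta, -1, q - 1)      # modular inverse (Python 3.8+)
dh_answer = pow(gdh_out, m, q)

print(dh_answer == pow(a, i1 * j1, q))   # True
```

Since |a| divides q - 1, inverting αδ modulo q - 1 is enough: a^{αδm} = a^{1 + k(q-1)} = a by Fermat's little theorem.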

2.3 MDLOG

Let G be a finite abelian group and a, b elements in G. We set H to be the subgroup of G generated by a and b. The multiple discrete logarithm problem (MDLOG for short) in the group H = < a, b > is the algorithmic problem defined by:

INPUT: An element g of H

OUTPUT: A pair (x, y) of non-negative integers such that g = a^x b^y.

Since H is generated by a and b, there exists at least one pair (x, y) of non-negative integers satisfying g = a^x b^y. Although such a pair is not necessarily unique in general, the output is uniquely determined if H is the direct product < a > × < b >. Clearly the GDH problem is reduced to the MDLOG problem, and hence, the GDH protocol can be broken if the MDLOG is efficiently solved. We should remark that the result in [8] indicates that solving MDLOG is essentially harder than solving DLOG. On the other hand, DLOG is evidently reduced to MDLOG. We shall summarize the relationships among MDLOG, DLOG, GDH and DH in Section 6.
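In a tiny instance MDLOG can be solved by exhaustive search, which also makes the uniqueness of the output visible when H = < a > × < b >. A sketch in additive Z_p × Z_p (our own representation and parameters):

```python
# Brute-force MDLOG in Z_p x Z_p: find (x, y) with g = a^x b^y.
p = 13
a, b = (1, 0), (0, 1)             # H = <a> x <b> = Z_p x Z_p

def power_prod(x, y):             # "a^x b^y", written additively
    return ((x * a[0] + y * b[0]) % p, (x * a[1] + y * b[1]) % p)

def mdlog(g):
    for x in range(p):
        for y in range(p):
            if power_prod(x, y) == g:
                return (x, y)

g = power_prod(9, 4)
print(mdlog(g))    # (9, 4)
```

Exhaustive search touches up to p² pairs here, in line with the intuition recalled above that MDLOG in Z_p × Z_p is harder than a single DLOG.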


3 Generic reduction of breaking the proposed protocol

We discuss the security of the GDH protocol from the point of view of the generic model. Our conclusion is that the GDH protocol with carefully chosen parameters is securer than the DH protocol in the generic model. To simplify the argument, we consider only the GDH protocol over a multiplicative group G isomorphic to Z_p × Z_p, where p is a large prime, in the rest of the paper. We shall show that the GDH protocol is secure even against an adversary who can solve the DLOG if we impose the following conditions on the parameters α, β, γ, δ:

α, β, γ, δ are relatively prime to p (1)

and

βδ/(αγ) is a quadratic nonresidue (mod p). (2)

We suppose that the conditions (1) and (2) are satisfied. The condition (1) is imposed to prevent α, β, γ, δ from collapsing the elements a, b ∈ G. On the other hand, the condition (2) seems rather artificial. We explain why the condition (2) is imposed in Section 4.
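For concrete parameters the two conditions are easy to test; the sketch below (our own code, assuming an odd prime p) uses Euler's criterion to decide quadratic residuosity for condition (2):

```python
# Check conditions (1) and (2) for candidate parameters (alpha, beta, gamma, delta) mod p.
def is_nonresidue(v, p):
    """Euler's criterion: v is a quadratic nonresidue mod an odd prime p iff v^((p-1)/2) = -1."""
    return pow(v, (p - 1) // 2, p) == p - 1

def params_ok(alpha, beta, gamma, delta, p):
    if any(v % p == 0 for v in (alpha, beta, gamma, delta)):   # condition (1) fails
        return False
    ratio = beta * delta * pow(alpha * gamma, -1, p) % p       # (beta*delta)/(alpha*gamma) mod p
    return is_nonresidue(ratio, p)                             # condition (2)

p = 103
assert params_ok(1, 1, 1, 5, p)        # 5 is a quadratic nonresidue mod 103
assert not params_ok(1, 1, 1, 4, p)    # 4 = 2^2 is always a residue
```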

3.1 Generic algorithms We briefly review generic algorithms and generic reductions. A generic algorithm is a general purpose algorithm which does not rely on any property of the representation of the group (see [8] and [5] for details). Let σ be a random mapping from Z_p to a set S of size p of binary strings. The generic algorithm is allowed to make calls to the group operation oracle that computes the functions add and inv defined, for x, y ∈ Z_p, by

add(σ(x), σ(y)) = σ(x + y) and inv(σ(x)) = σ(−x)

without any computational cost. A generic algorithm for the DLOG in the cyclic group Z_p takes (σ(1), σ(x)) as an input and outputs x, where x ∈ Z_p. We note that in [5] the Diffie-Hellman oracle is introduced to study the generic reduction of the DLOG to the DH.

Next let σ be a random mapping from Z_p × Z_p to a set S of size p² of binary strings. A generic algorithm for Z_p × Z_p is allowed to make calls to group operation oracles which compute the functions add and inv defined by

add(σ(x₁, x₂), σ(y₁, y₂)) = σ(x₁ + y₁, x₂ + y₂) and inv(σ(x₁, x₂)) = σ(−x₁, −x₂).
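Such an encoding σ together with its oracles can be modeled directly (a minimal sketch under our own naming; generic-model arguments are information-theoretic, but code makes the interface concrete):

```python
import secrets

# Sketch of a generic group oracle for Z_p x Z_p: elements are opaque random
# labels, and the only operations available are `add` and `inv` on labels.
class GenericGroupOracle:
    def __init__(self, p):
        self.p = p
        self._enc = {}   # (x1, x2) -> label
        self._dec = {}   # label -> (x1, x2)

    def encode(self, x1, x2):
        key = (x1 % self.p, x2 % self.p)
        if key not in self._enc:
            label = secrets.token_hex(8)     # fresh random encoding
            self._enc[key] = label
            self._dec[label] = key
        return self._enc[key]

    def add(self, s, t):
        (x1, x2), (y1, y2) = self._dec[s], self._dec[t]
        return self.encode(x1 + y1, x2 + y2)

    def inv(self, s):
        x1, x2 = self._dec[s]
        return self.encode(-x1, -x2)

oracle = GenericGroupOracle(101)
u = oracle.encode(3, 7)
assert oracle.add(u, oracle.inv(u)) == oracle.encode(0, 0)
```

An algorithm that sees only the labels can learn nothing about (x₁, x₂) beyond collisions among the labels it has requested, which is what the counting arguments below exploit.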


A generic algorithm for the MDLOG in Z_p × Z_p takes

(σ(1, 0), σ(0, 1), σ(x₁, x₂))

as an input and then outputs (x₁, x₂), where x₁, x₂ ∈ Z_p.

3.2 Main Theorem We now investigate the hardness of breaking the (proposed) GDH protocol compared with the DLOG problem in terms of the generic reduction.

First of all, a generic algorithm for the GDH problem runs as follows. Let p be a large prime. The group Z_p × Z_p is encoded by σ into a set S of binary strings. A generic algorithm for the GDH problem in Z_p × Z_p takes a list

(σ(1, 0), σ(0, 1), σ(αi₁, βi₂), σ(γi₂, δi₁), σ(αj₁, βj₂), σ(γj₂, δj₁))

as an input, computes by calling the group operation oracles, and then outputs

σ(αδi₁j₁ + βγi₂j₂, βδ(i₁j₂ + i₂j₁)).

In addition to the group operation oracles, we allow the generic algorithm to call the discrete logarithm oracle. A discrete logarithm oracle for Z_p × Z_p takes the pair

(σ(i₁, i₂), σ(j₁, j₂))

of the representations as an input and then outputs the integer n such that ni₁ ≡ j₁ (mod p) and ni₂ ≡ j₂ (mod p) without any computation cost if such an n exists. However, several plausible behaviors of the DLOG oracle are considered if there is no integer n such that ni₁ ≡ j₁ (mod p) and ni₂ ≡ j₂ (mod p). Let us call such an input illegal. We enumerate several plausible modes of the discrete logarithm oracle as follows.

Mode 1 An oracle does nothing to illegal inputs. The oracle provides an error message, and the computation proceeds to the next step.

Mode 2 An oracle provides wrong answers (for example, randomly chosen integers) to illegal inputs, while it returns correct answers if it is given legal inputs.

Mode 3 An oracle makes the entire computation stop without any output when illegal inputs are given.

In our study of the generic reduction of the GDH problem, we adopt Mode 1. Clearly, the other modes reduce the computational efficiency and the probability that the algorithm returns a correct answer.
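The three modes can be modeled as follows (our own sketch; the brute-force loop stands in for the oracle's cost-free answer, so only toy moduli are feasible):

```python
import random

# Sketch of the three DLOG-oracle modes for Z_p x Z_p.
def dlog_oracle(i, j, p, mode=1):
    """Return n with n*i1 = j1 and n*i2 = j2 (mod p), or handle illegal inputs per mode."""
    i1, i2 = i
    j1, j2 = j
    for n in range(p):                      # brute force stands in for the oracle's free answer
        if (n * i1 - j1) % p == 0 and (n * i2 - j2) % p == 0:
            return n
    # illegal input: no such n exists
    if mode == 1:
        return None                         # error message; the computation proceeds
    if mode == 2:
        return random.randrange(p)          # wrong answer on illegal inputs
    raise RuntimeError("oracle halts")      # Mode 3: stop the entire computation

p = 31
assert dlog_oracle((2, 3), (4, 6), p) == 2     # legal input: n = 2
assert dlog_oracle((1, 0), (0, 1), p) is None  # illegal input under Mode 1
```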


Theorem 3.1 Let A be a generic algorithm that solves the GDH problem in the group Z_p × Z_p where p is a prime. We suppose that the parameters α, β, γ, δ satisfy the conditions (1) and (2). Suppose that A makes at most R queries to the group operation oracle and at most L queries to the discrete logarithm oracle, respectively. Then the probability Q that the algorithm A returns the correct answer is at most

2L(R+6)(R+5)/p + (R+6)(R+5)/p + 4(R+6)/p²

where the probability is taken over i₁, i₂, j₁, j₂ and a representation σ. The expected number of queries to the discrete logarithm oracle is at least

pQ/(2(R+6)(R+5)) − 1/2 − 2/(p(R+5)). □

We postpone the proof until Section 3.3 and discuss here the consequences of Theorem 3.1. Let T denote the total running time of the algorithm A. Since T ≥ L + R, we have T ≥ L and T ≥ R. Suppose that Q is a constant. By Theorem 3.1, we have

Q ≤ (2T + 1)(T + 6)(T + 5)/p + 4(T + 6)/p².

Therefore, T is in Ω(p^(1/3)). This implies that there exists no probabilistic polynomial time algorithm that breaks the GDH protocol even if the DLOG is efficiently solved.

Now we suppose that the discrete logarithm oracle is not available. The expected number of queries to the group operation oracle for solving the GDH problem is derived from Theorem 3.1 by letting L = 0. An upper bound of the success probability Q is

(R+6)(R+5)/p + 4(R+6)/p²,

and hence, the expected number of queries to the group operation oracle is estimated as Ω(√p) if Q is a constant.

We also remark that the success probability grows in proportion to (2/p)(R+6)(R+5) as L grows in the bound given in Theorem 3.1. Since (2/p)(R+6)(R+5) is small provided that p is large enough, the DLOG oracle does not substantially help to break the GDH protocol.


3.3 Proof of Theorem 3.1 The following lemma is useful.

Lemma 3.1 ([7]) When given a non-zero polynomial F of total degree d in Z_p[X₁, X₂, ..., X_k] (p is a prime), the probability that F(x₁, x₂, ..., x_k) = 0 for independently and randomly chosen elements x₁, x₂, ..., x_k of Z_p is at most d/p.
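Lemma 3.1 is the Schwartz–Zippel bound. A quick exhaustive check for one degree-2 polynomial over a small Z_p (our own sketch, with an assumed toy prime):

```python
import itertools

# Exhaustive check of Lemma 3.1 (Schwartz-Zippel) for F = X1*Y1 - X2 over Z_p:
# total degree d = 2, so the fraction of zeros must be at most d/p.
p = 31
d = 2
zeros = sum(1 for x1, x2, y1 in itertools.product(range(p), repeat=3)
            if (x1 * y1 - x2) % p == 0)
fraction = zeros / p**3
assert fraction <= d / p   # here exactly 1/p: for each (x1, y1) one x2 works
```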

Proof of Theorem 3.1. In the proof, we simulate a generic algorithm by polynomials over Z_p. At the beginning, we have six pairs of polynomials

(F₁, H₁) = (1, 0), (F₂, H₂) = (0, 1), (F₃, H₃) = (αX₁, βX₂),
(F₄, H₄) = (γX₂, δX₁), (F₅, H₅) = (αY₁, βY₂), (F₆, H₆) = (γY₂, δY₁)

in Z_p[X₁, X₂, Y₁, Y₂]. Each pair corresponds to one of the representations (of the group elements)

σ(1, 0), σ(0, 1), σ(αi₁, βi₂), σ(γi₂, δi₁), σ(αj₁, βj₂), σ(γj₂, δj₁),

respectively. We compute polynomials of representations (of the group elements) as follows. When the multiplication oracle is called with the inputs corresponding to the pairs (Fk, Hk) and (Fl, Hl), we compute polynomials Fi and Hi by setting Fi = Fk + Fl and Hi = Hk + Hl where i > k, l. Similarly, when the inversion oracle is called with the input corresponding to the pair (Fk, Hk), we compute polynomials Fi = −Fk and Hi = −Hk where i > k. When the discrete logarithm oracle is called with the inputs corresponding to (Fk, Hk) and (Fl, Hl), it returns s (∈ Z_p) such that

sFk(i₁, i₂, j₁, j₂) = Fl(i₁, i₂, j₁, j₂)


and sHk(i₁, i₂, j₁, j₂) = Hl(i₁, i₂, j₁, j₂)

if such an s exists. In this case, we do not produce polynomials, but we get the information that i₁, i₂, j₁, j₂ satisfy the equations sFk = Fl and sHk = Hl. We suppose that a generic algorithm has a chance to return the correct answer only when we find non-trivial equations satisfied by i₁, i₂, j₁, j₂ in our simulation of the computation.

Before starting the proof, we discuss in more detail the behavior of the discrete logarithm oracle. When the algorithm calls the discrete logarithm oracle for inputs

σ(Fk(i₁, i₂, j₁, j₂), Hk(i₁, i₂, j₁, j₂))

and

σ(Fl(i₁, i₂, j₁, j₂), Hl(i₁, i₂, j₁, j₂)),

there are three possible events. The first possible event is that the inputs are illegal, that is, the second input is not a power of the first. The second event is that the inputs are legal but the polynomials Fk, Hk, Fl, Hl satisfy the condition FkHl ≡ HkFl (mod p) as polynomials over Z_p. The third event is that the inputs are legal and FkHl ≢ HkFl (mod p). We show that information on i₁, i₂, j₁, j₂ can be derived only in the last event. If the first event occurs, the discrete logarithm oracle does not return anything except for an error message. We have no chance to gain any information on i₁, i₂, j₁, j₂ other than that the second input is not a power of the first. We now discuss the second event. Let us suppose that

FkHl − FlHk ≡ 0 (mod p).

First we note that since Fk, Hk, Fl, Hl are polynomials of degree at most 1 over Z_p, they are units or irreducible polynomials. Since the polynomial ring Z_p[X₁, X₂, Y₁, Y₂] is a unique factorization domain, we have either uFk = Fl and uHk = Hl for some u ∈ Z_p, or uFk = Hk and uFl = Hl for some u ∈ Z_p. In the case that uFk = Fl and uHk = Hl, the discrete logarithm oracle returns u ∈ Z_p to the inputs σ(Fk, Hk) and σ(Fl, Hl), but we do not obtain any information on i₁, i₂, j₁, j₂ because the equations uFk = Fl and uHk = Hl are satisfied not only by i₁, i₂, j₁, j₂ but also by all x₁, x₂, y₁, y₂ ∈ Z_p. Next we suppose that uFk = Hk and uFl = Hl. By the definition of Fk and Hk, we can write

Fk = c₁ + c₃αX₁ + c₄γX₂ + c₅αY₁ + c₆γY₂ and Hk = c₂ + c₄δX₁ + c₃βX₂ + c₆δY₁ + c₅βY₂,

Page 485: Words, Languages & Combinatorics III

460

where c₁, c₂, c₃, c₄, c₅, c₆ ∈ Z_p. Since uFk = Hk, we have

uc₁ = c₂, uαc₃ = δc₄, uγc₄ = βc₃, uαc₅ = δc₆, uγc₆ = βc₅.

Because we are assuming the condition (2), the matrix

( uα  −δ )
( −β  uγ )

is non-singular. Hence, we have

c₃ ≡ c₄ ≡ c₅ ≡ c₆ ≡ 0 (mod p)

and so both Fk and Hk are constants. It follows that Fl and Hl are constants. Therefore, an oracle call with inputs σ(Fk, Hk) and σ(Fl, Hl) such that FkHl = FlHk does not provide any information on i₁, i₂, j₁, j₂. Consequently, we can obtain information on i₁, i₂, j₁, j₂ only when the third event occurs, and so we say that a discrete logarithm oracle query is meaningful if it is called in the third event; otherwise it is nonsensical.

We now find an upper bound of the probability that the algorithm A returns the correct answer. There are three possible cases for a generic algorithm to return the correct answer.
(Case 1) At least one discrete logarithm oracle query is meaningful.
(Case 2) All discrete logarithm oracle queries are nonsensical and there are (Fk, Hk) and (Fl, Hl) such that (Fk, Hk) ≠ (Fl, Hl) as polynomials over Z_p, but

Fk(i₁, i₂, j₁, j₂) = Fl(i₁, i₂, j₁, j₂) and Hk(i₁, i₂, j₁, j₂) = Hl(i₁, i₂, j₁, j₂).

(Case 3) All discrete logarithm oracle queries are nonsensical and we have

(Fk(i₁, i₂, j₁, j₂), Hk(i₁, i₂, j₁, j₂)) = (αδi₁j₁ + βγi₂j₂, βδ(i₁j₂ + i₂j₁))

for some (Fk, Hk).


We find an upper bound on the probability in each of (Case 1), (Case 2) and (Case 3).
(Case 1) The probability that a discrete logarithm oracle query is meaningful is bounded by the probability that, for some k and l with FlHk − FkHl ≠ 0, there is an s in Z_p satisfying

sFk(x₁, x₂, y₁, y₂) = Fl(x₁, x₂, y₁, y₂)

and

sHk(x₁, x₂, y₁, y₂) = Hl(x₁, x₂, y₁, y₂)

for randomly chosen x₁, x₂, y₁, y₂ in Z_p. Then the probability is bounded by an upper bound of the probability that for randomly chosen x₁, x₂, y₁, y₂ in Z_p we have

Fl(x₁, x₂, y₁, y₂)Hk(x₁, x₂, y₁, y₂) = Fk(x₁, x₂, y₁, y₂)Hl(x₁, x₂, y₁, y₂)

since we have

sFk(x₁, x₂, y₁, y₂)Hk(x₁, x₂, y₁, y₂) = Fl(x₁, x₂, y₁, y₂)Hk(x₁, x₂, y₁, y₂)

and

sFk(x₁, x₂, y₁, y₂)Hk(x₁, x₂, y₁, y₂) = Fk(x₁, x₂, y₁, y₂)Hl(x₁, x₂, y₁, y₂).

On the other hand, the probability that

Fl(x₁, x₂, y₁, y₂)Hk(x₁, x₂, y₁, y₂) = Fk(x₁, x₂, y₁, y₂)Hl(x₁, x₂, y₁, y₂)

for randomly chosen x₁, x₂, y₁, y₂ in Z_p is bounded by 2/p by Lemma 3.1, since the total degree of the polynomial FlHk − FkHl does not exceed two and FlHk − FkHl ≠ 0 as a polynomial. It follows that the probability that at least one discrete logarithm oracle query is meaningful is bounded by

L(R+6)(R+5) × 2/p.

(Case 2) Assume that (Fk, Hk) ≠ (Fl, Hl). There are two cases: (i) Fk ≠ Fl (we do not care whether Hk ≠ Hl or Hk = Hl) and (ii) Fk = Fl and Hk ≠ Hl. In the case (i), the probability that

Fk(i₁, i₂, j₁, j₂) = Fl(i₁, i₂, j₁, j₂) and Hk(i₁, i₂, j₁, j₂) = Hl(i₁, i₂, j₁, j₂)

for randomly chosen i₁, i₂, j₁, j₂ in Z_p is at most the probability that

Fk(i₁, i₂, j₁, j₂) = Fl(i₁, i₂, j₁, j₂),

which is at most 1/p by Lemma 3.1. Hence, the probability that this happens for some k and l is at most

(R+6)(R+5)/2 × 1/p

in the case (i). Similarly, the probability in the case (ii) is at most

(R+6)(R+5)/2 × 1/p.

Therefore the probability in (Case 2) is at most

(R+6)(R+5)/p.

(Case 3) By Lemma 3.1, an upper bound is

(R+6) × 4/p²

for the probability of the event that for randomly chosen x₁, x₂, y₁, y₂ in Z_p, we have

(Fk(x₁, x₂, y₁, y₂), Hk(x₁, x₂, y₁, y₂)) = (αδx₁y₁ + βγx₂y₂, βδ(x₁y₂ + x₂y₁))

for some (Fk, Hk), because the total degrees of the polynomials

Fk − (αδX₁Y₁ + βγX₂Y₂)

and

Hk − βδ(X₁Y₂ + X₂Y₁)

are two. We note that αδ ≠ 0, βγ ≠ 0 and βδ ≠ 0 (mod p) by the condition (1).

Consequently, the probability that a generic algorithm outputs the correct answer is at most

2L(R+6)(R+5)/p + (R+6)(R+5)/p + 4(R+6)/p².


4 Example Let p be a large prime. In Section 3, we have discussed the security of our GDH protocol over G = <a> × <b> such that |a| = |b| = p. That is, a^p = b^p = 1 and any g ∈ G is expressed as

g = a^x b^y

uniquely for some x and y. Theorem 3.1 implies that the GDH protocol over this G is secure in the generic algorithm model even if the DLOG is solvable.

In this section, we show an example of G = <a> × <b> such that |a| = |b| = p. Let q and r be large primes such that p | q − 1 and p | r − 1. Let n = qr. Let g₁ be a pth root of unity mod q and let g₂ be a pth root of unity mod r. For some c₁ ∈ Z_p^* and c₂ ∈ Z_p^*, choose a ∈ Z_n and b ∈ Z_n as follows:

a ≡ g₁ mod q,  a ≡ g₂^c₁ mod r,

b ≡ g₁^c₂ mod q,  b ≡ g₂ mod r.

Theorem 4.1 If c₁c₂ ≢ 1 mod p, then

<a, b> = <a> × <b>.

Proof. It is enough to show that

<a> ∩ <b> = {1}.

On the contrary, suppose that

1 ≠ d ∈ <a> ∩ <b>.

This implies that there exist x ∈ Z_p^* and y ∈ Z_p^* such that

d ≡ a^x ≡ b^y mod n.

Then we have

g₁^x ≡ g₁^(c₂y) mod q,  g₂^(c₁x) ≡ g₂^y mod r.

Therefore,

x ≡ c₂y mod p,  c₁x ≡ y mod p.

Page 489: Words, Languages & Combinatorics III

464

Hence,

c₁c₂xy ≡ xy mod p.

Now, since x ≢ 0 mod p and y ≢ 0 mod p, we have

c₁c₂ ≡ 1 mod p.

This is a contradiction.
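Theorem 4.1 can be checked numerically on toy parameters (our own sketch; sizes far below cryptographic): take p = 5, q = 11, r = 31, g₁ = 3 of order 5 mod 11, g₂ = 2 of order 5 mod 31, and c₁ = c₂ = 2, so that c₁c₂ = 4 ≢ 1 (mod 5).

```python
# Toy instance of the Section 4 construction: G = <a> x <b> inside (Z/n)^*, n = q*r.
def crt(a1, m1, a2, m2):
    """Chinese remainder: the unique x mod m1*m2 with x = a1 (mod m1), x = a2 (mod m2)."""
    return (a1 + m1 * ((a2 - a1) * pow(m1, -1, m2) % m2)) % (m1 * m2)

p, q, r = 5, 11, 31            # p divides both q - 1 and r - 1
n = q * r
g1, g2 = 3, 2                  # p-th roots of unity mod q and mod r
c1, c2 = 2, 2                  # c1*c2 = 4, which is not 1 mod 5

a = crt(g1, q, pow(g2, c1, r), r)
b = crt(pow(g1, c2, q), q, g2, r)

A = {pow(a, i, n) for i in range(p)}   # the subgroup <a>
B = {pow(b, j, n) for j in range(p)}   # the subgroup <b>
assert len(A) == len(B) == p           # |a| = |b| = p
assert A & B == {1}                    # trivial intersection, so <a, b> = <a> x <b>
```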

5 On the parameter condition The condition (2), claiming that βδ/(αγ) is a quadratic nonresidue (mod p), is essentially important. If βδ/(αγ) is a quadratic residue (mod p), that is, if βδ/(αγ) = u² for some u, then there exists an attack against the GDH protocol by using the DLOG oracle.

Attack against the GDH protocol in which βδ/(αγ) is a quadratic residue: We suppose that βδ/(αγ) = u² for some u. Then the matrix

( uα  −δ )
( −β  uγ )

is singular, and hence, the system of equations

uαs − δt = 0,  −βs + uγt = 0

has a nontrivial solution. Suppose that (s, t) = (c₃, c₄) is a nontrivial solution. We are given the group elements a, b, a^(αi₁)b^(βi₂), a^(γi₂)b^(δi₁), and so we can compute

(a^(αi₁)b^(βi₂))^c₃ (a^(γi₂)b^(δi₁))^c₄ = a^(c₃αi₁+c₄γi₂) b^(c₃βi₂+c₄δi₁).

By the definition of c₃, c₄, we have u(c₃αi₁ + c₄γi₂) = c₃βi₂ + c₄δi₁. Hence, we have obtained

(ab^u)^(c₃αi₁+c₄γi₂).

We compute ab" and then call the discrete logarithm oracle with the inputs (ab")C3ai1+C4~i2 and ah". The oracle returns hl = c3aZ1 + ~ 4 ~ 2 2 . We then do a similar process with another ($, ck) and obtain h: = c!pil + c iy22 . Then we may be able to obtain il and 22. Likewise the adversary can obtain jl

and j 2 .


References

[1] W. Diffie and M. E. Hellman, New directions in cryptography, IEEE Transactions on Information Theory, 22 (1976) 644-654.

[2] T. ElGamal, A public key cryptosystem and a signature scheme based on discrete logarithms, IEEE Transactions on Information Theory, 31 (1985) 469-472.

[3] N. Koblitz, Elliptic curve cryptosystems, Math. Comp. 48 (1987) 203-209.

[4] N. Koblitz, Hyperelliptic cryptosystems, J. Cryptology, 1 (1989) 139-150.

[5] U. M. Maurer and S. Wolf, Lower bounds on generic algorithms in groups, Advances in Cryptology (Eurocrypt'98), Lecture Notes in Computer Science, Vol. 1403, Springer-Verlag (1998) 72-84.

[6] V. Miller, Uses of elliptic curves in cryptography, Advances in Cryptology (Crypto'85), Lecture Notes in Computer Science, Vol. 218, Springer-Verlag (1986) 417-426.

[7] J. T. Schwartz, Fast probabilistic algorithms for verification of polynomial identities, J. ACM 27 (4) (1980) 701-717.

[8] V. Shoup, Lower bounds for discrete logarithms and related problems, Advances in Cryptology (Eurocrypt'97), Lecture Notes in Computer Science, Vol. 1233, Springer-Verlag (1997) 256-266.


A NOTE ON RADEMACHER FUNCTIONS AND COMPUTABILITY *

MARIKO YASUGI Faculty of Science, Kyoto Sangyo University,

Motoyama, Kamigamo, Kita-ku, Kyoto, 603-8555, Japan E-mail: [email protected]

MASAKO WASHIHARA Faculty of Science, Kyoto Sangyo University,

Motoyama, Kamigamo, Kita-ku, Kyoto, 603-8555, Japan E-mail: wasihara@cc.kyoto-su.ac.jp

We will speculate on some computational properties of the system of Rademacher functions {φₙ}. The n-th Rademacher function is a step function on the interval [0, 1), jumping at finitely many dyadic rationals of the form k/2ⁿ and assuming the values 1 and −1 alternatingly.

Keywords: Rademacher functions, computability problems of discontinuous functions, L^p[0, 1]-space, computability structure, Σ⁰₁-law of excluded middle, limiting recursion

1 Introduction

In [6], Pour-El and Richards proposed to treat computational aspects of some discontinuous functions by regarding them as points in some appropriate func- tion spaces.

It will then be of general interest to find examples of discontinuous functions which can be regarded as computable in the Pour-El and Richards approach. We are working on the integer part function [x] in [10] in this respect. It is not difficult to claim that it is a computable point in a function space.

It is also important to find out what sort of principle, besides recursive algorithm, is necessary in evaluating the value of such a function at a possible point of discontinuity. In [10], we are investigating this problem as well.

In this article, we report some facts on a sequence of discontinuous functions which is computable as a sequence of points in a function space.

Let {φₙ(x)} be the sequence of Rademacher functions; that is, for each n, φₙ(x) is defined on [0, 1), is discontinuous at the dyadic rational numbers of the form k/2ⁿ, and assumes the values 1 and −1 alternatingly.

*This work has been supported in part by Science Foundation No. 12440031.


For a real number x, we call a pair ({rₘ}, α) an information on x if {rₘ} is a sequence of rational numbers which converges to x and α is a function from natural numbers to natural numbers which serves as a modulus of convergence (of {rₘ} to x).

We will discuss computational aspects of this function system from two viewpoints. First, it is a computable sequence of points in the function space L^p[0, 1] (Section 2).

Next, we would like to see how one might evaluate the value φₙ(x) for a single computable number x (and for all n) (Section 3). It turns out that {φₙ(x)} has a "weak computation" in the following sense: given an information on x, say ({rₘ}, α), there is a program to output a sequence of rational numbers, say {sₙₘ}, which converges to φₙ(x). If x is a computable number, and its information ({rₘ}, α) is recursive, then the output {sₙₘ} is a recursive double sequence. In order to evaluate a modulus of convergence for {sₙₘ}, one has to apply the Σ⁰₁-law of excluded middle (denoted by Σ⁰₁-LEM); that is, we assume a formula of the form ∃xR(x) ∨ ∀x¬R(x) for a recursive R. There is a counter-example for a sequence of computable reals in [0, 1), say {xₘ}, for which the sequence of values {φₙ(xₘ)}ₘ is not computable, even for a single n. This shows that the Σ⁰₁-LEM cannot be replaced by a recursive excluded middle.

As a functional analysis treatment, we can show that {φₙ} is a computable sequence of elements in the Banach space L^p[0, 1] for any computable real number p such that 1 ≤ p < ∞. See also [6], [8], [9] and [10] for functional analysis approaches to discontinuous functions. For a quick review of computability properties in the function spaces, one can also refer to [12].

The mathematical significance of the Rademacher function system among various discontinuous functions is that it is a subsystem of the Walsh function system, and the latter plays an important role in analysis. (We have consulted [1], [2] and [13] for Rademacher and Walsh functions.) It is only a matter of routine to extend our discussion to the Walsh function system.

For some discontinuous functions, if one changes the topology of the domain of a function, then it is possible that it becomes continuous. For that reason, it will be worthwhile to investigate computability structures in abstract topological spaces and metric spaces. (See, for example, [4], [7] and [11].)

2 Rademacher functions and computability in a Banach space

Rademacher functions are step functions from [0, 1) to {−1, 1} defined below. Definition 2.1 (Rademacher functions) Let n denote 0, 1, 2, 3, .... Then the


nth Rademacher function φₙ(x) is defined as follows:

φ₀(x) = 1, x ∈ [0, 1);

φₙ(x) = 1 for x ∈ [2i/2ⁿ, (2i+1)/2ⁿ) and φₙ(x) = −1 for x ∈ [(2i+1)/2ⁿ, (2i+2)/2ⁿ),

where n ≥ 1 and i = 0, 1, 2, ..., 2ⁿ⁻¹ − 1. The sequence {φₙ(x)} will be called the system of Rademacher functions, or the Rademacher system.

or the Rademacher system. A Rademacher function &(x) is a step function which takes a value 1

or -1, and jumps at binary fractions & for k = 1 , 2 , . . . ,2" - 1. It is right continuous with left limit.

As a sequence of functions, {φₙ} is eventually constant at each binary point. Namely, let x be a binary point k/2ⁿ, where n is the first number with respect to which x can be expressed as such. Then k is an odd number, and φₙ(x) = −1. For any m > n, x = l/2ᵐ for an even l, and this implies that φₘ(x) = 1.
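These properties can be sanity-checked with a direct implementation (our own sketch following the definition above; exact rational arithmetic avoids rounding at the jump points):

```python
# Direct implementation of the n-th Rademacher function on [0, 1),
# following Definition 2.1 (right continuous, jumps at k/2^n).
from fractions import Fraction

def rademacher(n, x):
    """phi_n(x) for x in [0, 1): +1 on even dyadic subintervals, -1 on odd ones."""
    if n == 0:
        return 1
    k = int(Fraction(x) * 2**n)   # index of the subinterval [k/2^n, (k+1)/2^n) containing x
    return 1 if k % 2 == 0 else -1

# At the binary point x = 1/2: phi_1(1/2) = -1, while phi_m(1/2) = 1 for all m > 1.
assert rademacher(1, Fraction(1, 2)) == -1
assert all(rademacher(m, Fraction(1, 2)) == 1 for m in range(2, 8))
```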

We will show that the function system {φₙ} is endowed with some kind of computational attributes. Note A traditional computable real function (on a compact interval, say [0, 1]) is assumed to satisfy two conditions: it preserves sequential computability and it is uniformly continuous with a recursive modulus of continuity. Such a function is sometimes called G-computable, meaning Grzegorczyk-computable.

We will first show that {φₙ} is a computable sequence of points in a Banach space.

Let (X, ‖·‖) be a Banach space. According to Section 2 of Chapter 2 in [6], a family of sequences from X, say S, is called a computability structure of the space (X, ‖·‖) if it satisfies the following three properties: S is closed with respect to recursive linear combinations and effective limits, and the norms of a sequence from S form a computable real number sequence.

A sequence in S is called S-computable, or simply computable. A sequence in S is called an effective generating set if it is a generating set (in the classical sense) for the space X.

For L^p[0, 1], where p ≥ 1 is a computable real number, Pour-El and Richards, in Section 3 of Chapter 2 of [6], propose to define computable sequences as follows. A sequence {fₙ} from L^p[0, 1] is said to be L^p-computable if there exists a G-computable double sequence of functions {g_nk} which satisfies that ‖g_nk − fₙ‖_p converges to 0 as k → ∞, effectively in n and k, where ‖f‖_p = (∫₀¹ |f|^p)^(1/p). Effective convergence means convergence with a recursive modulus of convergence with respect to the norm ‖·‖_p.


Let S_p be the computability structure for L^p[0, 1] consisting of computable sequences as defined above. In [6], four kinds of effective generating sets for S_p are listed: the sequence of monomials {1, x, x², ...}, an enumeration of all piecewise linear functions with corners of rational coordinates, an enumeration of trigonometric polynomials and an enumeration of all step functions with rational values and rational jump points.

We will utilize the last generating set. Let us denote an effective enumeration of such functions by {e_l}. An e_l contains information on the number of (finitely many) jump points, the jump points, and the corresponding values.

It is a general practice that one regards a function defined on [0, 1) which is integrable over this interval as an L^p[0, 1] function, since it is equal to an L^p[0, 1] function almost everywhere. Then, for each n, φₙ can be regarded as an element of L^p[0, 1].

Now, (4") can be obtained as a recursive subsequence of { e l } (which we call a re-enumeration of { e l } ) , and hence it is an L P [ O , 11-computable sequence. Re-enumeration can be obtained by examining el for each 1 if the number of jump points el represents is 2" - 1, if the jump points are F , F , . ' . , F , . . ' , 7, and finally if the corresponding values are 1, -1,1, - 1 , . . . , 1 , -1.

We have thus obtained the
Theorem 1 (L^p[0, 1]-computability) Let p be a computable real number such that 1 ≤ p < ∞. The Rademacher function system {φₙ} is a computable sequence in the space L^p[0, 1].

Having shown that {φₙ} is a computable sequence of points in a function space, we then question how one might evaluate the values {φₙ(x)} for a computable x. We will observe this problem in the next section.

3 Computation within the Σ⁰₁-law of excluded middle

We will introduce a weak notion of (pointwise) computability of a function, and show that the Rademacher functions form a sequence of weak computability.

Let x be a real number, let {rₘ} be a sequence of rational numbers and let α be a number-theoretic function (from natural numbers to natural numbers). When {rₘ} converges to x, α is called a modulus of convergence (of {rₘ} to x) if the following holds:

m ≥ α(p) implies |x − rₘ| ≤ 1/2ᵖ.

The pair ({rₘ}, α) is then called an information on x.

Page 495: Words, Languages & Combinatorics III

470

Definition 3.1 (Weak computation) (1) We will temporarily call an algorithm to evaluate a function value f(x), say P, a weak computation if the following holds: given an information on x, say ({rₘ}, α), P outputs a sequence of rational numbers {sₘ} (from the information ({rₘ}, α)) which (classically) converges to f(x).

(2) The definition of weak computation P can be extended to a sequence of functions, say {fₙ(x)}, as follows: given an information ({rₘ}, α) on x, P outputs a sequence of rational numbers {sₙₘ} which converges to fₙ(x) as m tends to ∞, for each n.

Theorem 2 (Weak computation of {φₙ}) The Rademacher function system {φₙ} has a weak computation (cf. (2) of Definition 3.1).
Proof We will describe an algorithm P₀ which does the following job: given an information on x in [0, 1), say ({rₘ}, α), P₀ outputs a sequence of rational numbers {sₙₘ} which converges to {φₙ(x)}. P₀ is determined as a composition of several algorithms as below. (For simplicity, we assume x > 0. The amendment for the case x = 0 inclusive will be explained later.)
1 First, we define an algorithm P₂ which, given ({rₘ}, α), outputs an integer k (for each n) such that k/2ⁿ < x < (k+2)/2ⁿ holds.

By Proposition 0 in Chapter 0 of [6], there is a program P₁ which decides, for an information on x and a rational number s, that x < s (or x > s) when it is true.

In our case, at stage l, P₁ checks whether r_{α(p)} < k/2ⁿ − 1/2ᵖ holds for p ≤ l and k ≤ l. If the result is Yes, then x < k/2ⁿ has been determined. Similarly with the other direction, checking the inequality k/2ⁿ + 1/2ᵖ < r_{α(p)}.

As we have assumed x > 0, the desired k is a natural number. At stage l, keep computing "x < (k+2)/2ⁿ?" and "k/2ⁿ < x?" for k = 0, 1, 2, ..., 2ⁿ − 1 up to step l, applying P₁. As l increases, P₁ eventually hits a number k such that k/2ⁿ < x and x < (k+2)/2ⁿ. Then stop the process. P₂ denotes this whole process. Denote such a k by kₙ.

Although k depends on n, the algorithm P₂ applies uniformly to any n, and hence P₂ outputs the sequence {kₙ}ₙ.

2 There is an algorithm P₃ such that, on input {kₙ}, P₃ outputs {Mₙₚ}, a sequence of integers (natural numbers) for n, p = 1, 2, 3, ..., in the following way.

At each stage p, compute the inequality r_{α(p)} < (kₙ+1)/2ⁿ − 1/2ᵖ. If the answer is No, put Mₙₚ = kₙ + 1. If the answer hits Yes, then put Mₙₚ = kₙ. (If there is such a p, then Mₙq = kₙ for all q ≥ p.)
3 Next, there is an algorithm P₄ such that, given the input {Mₙₚ}, P₄ outputs a sequence of rational numbers {sₙₚ} such that sₙₚ = 1 if Mₙₚ is


even, and sₙₚ = −1 if Mₙₚ is odd.
Now let P₀ be the composition of Pᵢ for i = 2, 3, 4 in succession. In particular, if the information on x is computable, that is, if the sequence {rₘ} and the function α are recursive, then {Mₙₚ} and {sₙₚ} are recursive.
We have the following facts at our disposal.

Fact 1 {Mₙₚ} is eventually constant with respect to p for each n, and this constant number (the limit number) is either kₙ or kₙ + 1. Let us write the limit number as Kₙ.
Fact 2 From Fact 1, it can be derived that {sₙₚ} is eventually constant with respect to p for each n, and hence {sₙₚ} is a converging sequence with respect to p. The limit number sₙ is 1 if Kₙ is even and sₙ is −1 if Kₙ is odd.
Fact 3 Kₙ is kₙ if kₙ/2ⁿ < x < (kₙ+1)/2ⁿ and is kₙ + 1 if (kₙ+1)/2ⁿ ≤ x < (kₙ+2)/2ⁿ, and hence the limit of {sₙₚ}, say sₙ, is equal to φₙ(x).

We have thus proved the theorem.
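A drastically simplified sketch of such an evaluation (our own code, with the information ({rₘ}, α) modeled by a Python callable and α(p) = p assumed): it behaves correctly for non-dyadic x, while at dyadic points the stage-p guess may be wrong for small p, which is exactly where the Σ⁰₁-LEM enters.

```python
from fractions import Fraction

# Simplified sketch of the weak computation P0: the information on x is a
# function r(m) of rational approximations with |x - r(p)| <= 1/2**p.
def s(n, p, r):
    """Stage-p rational output s_np approximating phi_n(x)."""
    # r(p + n + 2) pins x down to within 1/2**(p + n + 2); use it to guess
    # the index k of the dyadic subinterval [k/2^n, (k+1)/2^n) containing x.
    approx = r(p + n + 2)
    k = int(approx * 2**n)        # tentative subinterval index (may be off at dyadic x)
    return 1 if k % 2 == 0 else -1

x = Fraction(1, 3)                               # a computable, non-dyadic point of [0, 1)
r = lambda p: round(x * 2**p) / Fraction(2**p)   # recursive information on x
# The outputs converge (classically) to phi_1(1/3) = 1, since 1/3 < 1/2.
assert s(1, 10, r) == 1
```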

Note We have shown only the fact that, from a given information on x, one can find a recursive sequence of rational numbers which classically converges to φₙ(x), but the convergence may not be effective. This does not assert, however, that the limit number is not computable. Indeed, φₙ(x) is an integer, hence a computable number! In fact, we can even show the following.

Proposition 3.1 (Computability with respect to n) If x is a computable real number and ({rₘ}, α) is a recursive information on x, then the sequence {φₙ(x)} is a computable sequence of real numbers (which consists of 1 and −1).

Proof Notice that what we must claim is a fact for each fixed x (and for all n). So, we study two cases concerning x.

1) x is a dyadic rational point, that is, x = l/2ʲ, where j ≥ 1 and l is odd. If n < j, then a k satisfying k/2ⁿ < x < (k+1)/2ⁿ can be computed. The value φₙ(x) can be determined by φₙ(x) = 1 if k is even and φₙ(x) = −1 if k is odd. If n = j, then φₙ(x) = −1. If n > j, then φₙ(x) = 1. Notice that the three cases above are recursive.

2) x is not a dyadic rational number. Then, by the algorithm P₂, one can find a kₙ satisfying kₙ/2ⁿ < x < (kₙ+1)/2ⁿ effectively in n. The value of φₙ(x) can be determined to be 1 or −1 according as kₙ is even or odd.

Amendment When x = 0 is included, one may suspect that one has to start with k = −1 in searching for a k such that k/2ⁿ < x < (k+2)/2ⁿ. We can avoid this complication by assuming that any computable number in the interval [0, 1)


can be effectively approximated by a computable sequence of positive rational numbers.

The reason why a modulus of convergence cannot be determined for the sequence {sₙₚ} in the proof of Theorem 2 lies in Fact 3. Namely, x < (kₙ+1)/2ⁿ (Case 1) and (kₙ+1)/2ⁿ ≤ x < (kₙ+2)/2ⁿ (Case 2). These cases can be alternatively expressed by the following conditions: r_{α(p)} < (kₙ+1)/2ⁿ − 1/2ᵖ for some p (Case 1), or r_{α(p)} ≥ (kₙ+1)/2ⁿ − 1/2ᵖ for all p (Case 2). Let us speculate on these cases.

Computation within the Σ⁰₁ excluded middle The logical structure in the analysis above is the following: two cases are distinguished in terms of a recursive predicate R(n, p), where Case 1 is expressed by ∃pR(n, p) and Case 2 is expressed by ∀p¬R(n, p). (R(n, p) represents the relation r_{α(p)} < (kₙ+1)/2ⁿ − 1/2ᵖ.) So, the computation of {φₙ(x)} is relative to the Σ⁰₁-LEM. (Notice that ∀p¬R(n, p) is a classical equivalent of ¬∃pR(n, p).)

Suppose that Case 1 holds. Then wait until r_{α(p)} < (k_n+1)/2^n − 1/2^p holds. Put c_n = μp(r_{α(p)} < (k_n+1)/2^n − 1/2^p), where μp is the least number operator. If it is true that Case 1 holds, then c_n can be computed. If we put β_n(q) = c_n, then β_n is recursive (a constant), and it serves as a modulus of convergence for {s_np} with respect to p.

Suppose next that Case 2 holds. Then β_n(q) = 1 will serve as a recursive modulus of convergence.

The computation can be carried out for each n in parallel, and hence, if we admit the Σ^0_1-LEM, then the computations of {φ_n(x)} can also be carried out in parallel.

We can thus claim that, given an information ({r_l}, α) on x, a pair of information ({s_nl}, β_n) is output within each case of the Σ^0_1-LEM.

Counter-example (Counter-example to effectivity). In order to certify that there is no effective way of deciding whether Case 1 holds or Case 2 holds, we will present a counter-example: there is a computable sequence of real numbers {x_m} for which the sequence of real numbers {φ_n(x_m)} is not computable even for a single n.

Suppose the Σ^0_1-LEM applied to R(n, p) above were decidable. Then, as was remarked earlier, the whole process of computing ({s_nl}, β_n) would become effective. It would therefore hold that, for any computable sequence of real numbers {x_m}, {φ_n(x_m)} would be computable, contradicting the counter-example.

We will give a counter-example for φ_1. The same method applies to any φ_n.

Let a be a recursive injection whose range is not recursive. Let {x_l} be the computable sequence of reals in Example 4, Chapter 0 of [6]. That is,

x_l = 2^{−m} if l = a(m) for some m, and x_l = 0 otherwise.

Consider the computable sequence of reals {y_l}, where y_l = 1/2 − x_l:

y_l = 1/2 − 2^{−m} if l = a(m) for some m, and y_l = 1/2 otherwise.

This implies that

φ_1(y_l) = 1 if l = a(m) for some m, and φ_1(y_l) = −1 otherwise.

Now, suppose {φ_1(y_l)} were a computable sequence. Then it can be shown, as in Example 4, Chapter 0 of [6], that the range of a would be recursive, yielding a contradiction. So {φ_1(y_l)} cannot be a computable sequence.
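The mechanism of the counter-example can be simulated with a toy injection (a sketch only: a genuine counter-example needs an injection with non-recursive range, which no program can exhibit; the bounded search and the exponent shift m+1, which keeps y_l inside [0, 1/2], are our assumptions):

```python
from fractions import Fraction

def phi1(x):
    # phi_1 on [0, 1): value 1 on [0, 1/2), -1 on [1/2, 1)
    return 1 if x < Fraction(1, 2) else -1

def y(l, a, m_bound=64):
    # y_l = 1/2 - 2**-(m+1) if l = a(m) for some m, else 1/2.
    # The bounded search stands in for an oracle; for an injection
    # with non-recursive range no such bound exists -- that is the point.
    for m in range(m_bound):
        if a(m) == l:
            return Fraction(1, 2) - Fraction(1, 2 ** (m + 1))
    return Fraction(1, 2)

a = lambda m: 3 * m + 1   # toy recursive injection (its range IS recursive)

# phi_1(y_l) = 1 exactly when l lies in the range of a, so a decision
# procedure for the sequence {phi_1(y_l)} would decide membership in
# range(a).
```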

Note. In the argument above, we did not question whether x belongs to the interval [0, 1). Here we assume that the universe of discourse is limited to the numbers in this interval.

4 Remarks

To conclude our discussion, we will give two remarks.

Remark 1. A computation within the Σ^0_1-LEM may alternatively be expressed as limiting computability in the following sense. In the computation of φ_n(x), the only process that is not effective is taking the limit of an integer sequence {M_np}. If we take the limit of {M_np} with respect to p, which exists and is either k_n or k_n + 1, then we will know which modulus of convergence, c_n or 1, should be the right one.

Taking the limit of a recursive function whose values are eventually constant was studied in [3]. A number-theoretic function whose values are given by the limits of a recursive function was called limiting recursive.

If we call a computation using limiting recursive functions a limiting computation, then we might say that {φ_n} is a limiting computable sequence of functions.


In fact, it is easy to show that {φ_n} is sequentially limiting computable in the sense that, given a computable sequence of real numbers {x_j}, there is a limiting computation of the double sequence of values {φ_n(x_j)}_{n,j}.
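Gold's notion can be illustrated by a small sketch (the guessing function g is a toy of our own; nothing here is taken from [3] beyond the definition of the limit):

```python
def limiting_value(g, n, stage):
    """Limiting recursion in the sense of Gold [3]: g is a total
    recursive guessing function whose values g(n, p) are eventually
    constant in p; the limiting recursive function is
    h(n) = lim_p g(n, p).  A finite run can only report the current
    guess -- no stage ever certifies that the guesses have stabilized."""
    return g(n, stage)

# toy guessing function: the guesses stabilize at stage n to n % 2
def g(n, p):
    return n % 2 if p >= n else 0
```

Running the guesser at a late stage returns the limit value, but an early stage may still return a wrong guess, which is exactly the sense in which a limiting computation is weaker than an effective one.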

Remark 2. A Rademacher function is a rather special function, since, between any two consecutive jump points, the value is constant. As was mentioned previously, we have taken up the Rademacher function system as a mathematically significant example of a function sequence with jump points. The foregoing argument goes through, however, for some functions which are not constant between two consecutive jump points and whose jump points are not confined to dyadic rational numbers. In such cases, a modulus of convergence is usually not a constant.

Consider, for example, a function sequence {f_n} on [0, 1) such that f_n jumps at k/2^n, k = 0, 1, ..., 2^n − 1.

Suppose {g_nk} is a computable sequence of functions on [0, 1), and suppose f_n(x) = g_nk(x) if x ∈ [k/2^n, (k+1)/2^n). Up to the definition of {M_np}, the argument goes through as before. If M_np = k_n, then put s_np = g_{n k_n}(r_{α(p)}); if M_np = k_n + 1, then put s_np = g_{n k_n+1}(t), where t = min{r_{α(p)} + 1/2^p, (k_n+2)/2^n − 1/2^p}. Then {s_np} will converge to f_n(x).

For the former case, the modulus of convergence is supplied by that of g_{n k_n}, and, for the latter case, it is supplied by that of g_{n k_n+1}.

Acknowledgments

The authors are grateful to Takakazu Mori for his valuable comments. We have extended Theorem 1 to p such that 1 ≤ p < ∞ from the case p = 1. We are also indebted to Susumu Hayashi for his comments on the Σ^0_1-LEM and limiting recursion.

References

1. Y. Endo, Walsh Analysis (in Japanese), Tokyo Denki Daigaku Press, 1993.

2. N.J. Fine, On the Walsh functions, Trans. Amer. Math. Soc., 65 (1949), 372-414.

3. E.M. Gold, Limiting recursion, JSL, 30-1 (1965), 28-48.

4. T. Mori, On the computability of Walsh functions, to appear in TCS, 2000.

5. T. Mori, Y. Tsujii and M. Yasugi, Computability structure on metric spaces, Combinatorics, Complexity and Logic (Proceedings of DMTCS'96), ed. by Bridges et al., Springer (1996), 351-362.

6. M.B. Pour-El and J.I. Richards, Computability in Analysis and Physics, Perspectives in Mathematical Logic, Springer-Verlag, 1989.

7. Y. Tsujii, M. Yasugi and T. Mori, Some properties of the effective uniform topological space, Proceedings of CCA2000 (2000), 421-440; revised version available at http://www.kyoto-su.ac.jp/~yasugi/Recent.

8. M. Washihara, Computability and Fréchet spaces, Mathematica Japonica, 42 (1995), 1-13.

9. M. Washihara and M. Yasugi, Computability and metrics in a Fréchet space, ibid., 43 (1996), 1-13.

10. M. Yasugi, V. Brattka and M. Washihara, Computability aspects of some discontinuous functions, to appear in SCMJ.

11. M. Yasugi, T. Mori and Y. Tsujii, Effective properties of sets and functions in metric spaces with computability structure, Theoretical Computer Science, 219 (1999), 467-486.

12. M. Yasugi and M. Washihara, Computability structures in analysis, Sugaku Expositions (AMS), 13 (2000), no. 2, 215-235.

13. K. Yoneda, On application of Walsh Fourier analysis to weakly stationary processes, Acta Math. Hungar., 76 (1997), 303-335.


Author Index

Almeida, J., 1
Atanasiu, A., 22
Auinger, K., 40
Avgustinovich, S.V., 51
Bogdanović, S., 378
Brattka, V., 63
Buchholz, T., 73
Carton, O., 88
Choffrut, C., 103
Ćirić, M., 378
Csuhaj-Varjú, E., 134
Dassow, J., 151
Demény, M., 162
Dömösi, P., 183, 185
Escada, A., 1
Fon-Der-Flaass, D.G., 51
Frid, A.E., 51
Grigorieff, S., 103
Gruska, J., 192
Guo, Y.Q., 428
Hiraishi, K., 253
Horváth, G., 162
Imreh, B., 212
Inata, I., 222
Ito, M., 183, 212
Kelarev, A.V., 228
Klein, A., 73
Konstantinidis, S., 240
Koshiba, T., 253
Kudlek, M., 185
Kurosawa, K., 450
Kutrib, M., 73
Lombardy, S., 266
Machida, H., 286
Margolis, S.W., 297, 311
Martin-Vide, C., 22
Mateescu, A., 323
Matsuda, R., 339
Mitrana, V., 22
Nagylaki, Cs., 162
Nagylaki, Z., 162
Niemann, G., 352
Nishio, H., 370
Otto, F., 352
Petković, T., 378
Pin, J.-E., 297
Popović, Z., 378
Pukler, A., 212
Saito, T., 396
Sakarovitch, J., 266
Salomaa, A., 134
Schott, R., 403
Sen, M.K., 428
Shoji, K., 420
Shum, K.P., 428
Spehner, J.-C., 403
Steinberg, B., 311
Steinby, M., 434
Trotter, P.G., 228
Van, Do Long, 171
Volkov, M.V., 297
Vollmar, R., 192
Washihara, M., 466
Yamamura, A., 450
Yasugi, M., 466
