ON THE EXPRESSIVE POWER OF SHUFFLE PRODUCT Antonio Restivo Università di Palermo

A very general problem:

Given a basis B of languages, and a set O of operations, characterize the family O(B) of languages expressible from the basis B by using the operations in O.

The Family REG of Regular Languages:

The basis: B = { {a} | a ∈ Σ } ∪ {ε}

The operations: O = {union, concatenation, (Kleene) star}

REG = O(B)

REG is also closed under all Boolean operations.

The Family SF of Star-Free Languages

The basis: B = { {a} | a ∈ Σ } ∪ {ε}

The operations: O = {Boolean operations, concatenation}

SF = O(B)

Shuffle Product

The shuffle of two words u and v is the set u ш v = {u1v1…unvn | n ≥ 0, u1…un = u, v1…vn = v}, where the factors ui and vi are (possibly empty) words.

ab ш ba = {abba, baab, abab, baba}

The shuffle of two languages L and K is the language

L ш K = ⋃ { u ш v | u ∈ L, v ∈ K }
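The two definitions above can be sketched in a few lines of Python. This is only an illustration for words and finite languages; the function names `shuffle` and `shuffle_languages` are hypothetical and do not come from the talk.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def shuffle(u: str, v: str) -> frozenset:
    """Return the set of all interleavings of the words u and v."""
    if not u:
        return frozenset({v})
    if not v:
        return frozenset({u})
    # The first letter of an interleaving comes either from u or from v.
    from_u = {u[0] + w for w in shuffle(u[1:], v)}
    from_v = {v[0] + w for w in shuffle(u, v[1:])}
    return frozenset(from_u | from_v)

def shuffle_languages(L, K) -> set:
    """Shuffle of two finite languages: the union of all word shuffles."""
    result = set()
    for u in L:
        for v in K:
            result |= shuffle(u, v)
    return result

print(sorted(shuffle("ab", "ba")))  # ['abab', 'abba', 'baab', 'baba']
```

The printed set agrees with the example ab ш ba given above. The recursion is exponential in the worst case, so it is meant for small experiments only.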

Expressive Power of the Shuffle

Very little is known about classes of languages closed under shuffle, and their study appears to be a difficult problem.

Such a study, apart from its theoretical interest, is also motivated by applications to the modeling of process algebras and to program verification.

The Family INT of Intermixed Languages

The basis: B = { {a} | a ∈ Σ } ∪ {ε}

The operations: O = {Boolean operations, concatenation, shuffle}

INT = O(B)

Theorem (Berstel, Boasson, Carton, Pin, R.) SF ⊊ INT ⊊ REG

[Figure: nested inclusion diagram, SF inside INT inside REG]

The Problem

Problem 1: Give a (decidable) characterization of the family INT.

Proposition INT is not a variety (in the sense of Eilenberg)

Remark: REG and SF are varieties

Periodicity

A language L ⊆ Σ* is aperiodic, or non-counting, if there exists an integer n ≥ 0 such that, for all x, y, z ∈ Σ*, one has

x y^n z ∈ L ⇔ x y^(n+1) z ∈ L.

Theorem (M.P. Schützenberger) A regular language L is aperiodic if and only if it is star-free.
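For experiments, aperiodicity of a regular language can be tested on its transition monoid. The sketch below assumes the language is given by its minimal complete DFA, with each letter encoded as a tuple mapping state s to its successor (an encoding chosen here for illustration). It relies on the standard fact that L is aperiodic exactly when every element m of that monoid satisfies m^n = m^(n+1) for some n. The function names are hypothetical.

```python
def transition_monoid(generators: dict) -> set:
    """Close the letter actions (tuples state -> state) under composition."""
    n = len(next(iter(generators.values())))
    identity = tuple(range(n))
    monoid = {identity}
    frontier = [identity]
    while frontier:
        m = frontier.pop()
        for g in generators.values():
            mg = tuple(g[m[s]] for s in range(n))  # apply m first, then g
            if mg not in monoid:
                monoid.add(mg)
                frontier.append(mg)
    return monoid

def is_aperiodic(generators: dict) -> bool:
    """True if every monoid element m satisfies m^k = m^(k+1) for some k."""
    n = len(next(iter(generators.values())))
    for m in transition_monoid(generators):
        seen = []
        power = m
        while power not in seen:
            seen.append(power)
            power = tuple(m[power[s]] for s in range(n))  # next power of m
        # The powers of m eventually cycle; aperiodic means cycle length 1.
        if seen.index(power) != len(seen) - 1:
            return False
    return True

# Minimal DFA of (ab)*: state 0 initial/final, state 1 after 'a', state 2 sink.
ab_star = {"a": (1, 2, 2), "b": (2, 0, 2)}
# Minimal DFA of (aa)*: two states swapped by 'a'.
aa_star = {"a": (1, 0)}
print(is_aperiodic(ab_star), is_aperiodic(aa_star))  # True False
```

The output is consistent with the theorem: (ab)* is star-free, while (aa)* counts modulo 2 and is not.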

Periodicity

The strict inclusion SF ⊊ INT implies that the shuffle of two star-free languages is, in general, not star-free:

«the shuffle creates periodicities»

Problem 2: Determine conditions under which the shuffle of two star-free languages is star-free.

Bounded Shuffle

Let k be a positive integer. The k-shuffle of two languages L1 and L2 is defined as follows:

L1 шk L2 = {u1v1…umvm | m ≤ k, u1…um ∈ L1, v1…vm ∈ L2}.

Any k-shuffle is called a bounded shuffle.
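A minimal sketch of the k-shuffle on two single words, assuming (as in the definition of the shuffle of words) that the factors ui and vi may be empty. Under that assumption, using exactly k blocks already covers every m ≤ k, since shorter decompositions can be padded with empty factors. The name `k_shuffle` is hypothetical.

```python
def k_shuffle(u: str, v: str, k: int) -> set:
    """Words u1 v1 ... uk vk with u1...uk = u and v1...vk = v (factors may be empty)."""
    if k == 0:
        # With no blocks left, both words must be fully consumed.
        return {""} if not u and not v else set()
    result = set()
    for i in range(len(u) + 1):        # first block of u is u[:i]
        for j in range(len(v) + 1):    # first block of v is v[:j]
            for rest in k_shuffle(u[i:], v[j:], k - 1):
                result.add(u[:i] + v[:j] + rest)
    return result

print(sorted(k_shuffle("aa", "bb", 2)))
# ['aabb', 'abab', 'abba', 'baab', 'bbaa']
```

The word baba belongs to aa ш bb but needs three blocks, so it appears only from k = 3 onwards. This illustrates how the bound k limits the number of alternations between the two words.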

Theorem (Castiglione, R.) SF is closed under bounded shuffle.

Corollary. The shuffle of a star-free language and a finite language is a star-free language.

Partial Commutations

Let Σ be an alphabet and let θ ⊆ Σ × Σ be a symmetric and reflexive relation, called a (partial) commutation.

Consider the congruence ~θ of Σ* generated by the set of pairs (ab, ba) with (a, b) ∈ θ.

If L ⊆ Σ* is a language, [L] denotes the closure of L by ~θ. L is closed by ~θ if L = [L].

The closed subsets of Σ* are called trace languages.

Partial Commutations

Let L1 and L2 be two languages over the alphabet Σ.

Let Σ1 and Σ2 be two disjoint copies of the alphabet Σ (coloured copies), and σi: Σ → Σi, for i = 1, 2, the corresponding bijections. Let L'1 (resp. L'2) be the subset of Σ1* (resp. Σ2*) corresponding to L1 (resp. L2) under the morphism σ1 (resp. σ2).

Let Δ = Σ1 ∪ Σ2, consider a partial commutation θ ⊆ Σ1 × Σ2, and let π: Δ* → Σ* be the morphism induced by σ1 and σ2 (delete colours). The θ-product of L1 and L2 is

L1 шθ L2 = π([L'1 L'2])


bacbcaabca ∈ L1, babcacbab ∈ L2

bacbcaabca · babcacbab (the first factor is read over Σ1, the second over Σ2)

θ = {(a,a), (a,b), (b,a), (b,b), (c,a), (c,b), (c,c)}

bbaabcbcaabcacacbab ∈ L1 шθ L2

The θ-product generalizes at the same time concatenation and shuffle:

If θ = ∅, then L1 шθ L2 = L1L2

If θ = Σ1 × Σ2, then L1 шθ L2 = L1 ш L2
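The θ-product of two single words can be sketched as follows. The sketch assumes that θ is read as a set of pairs (x, y), where x is a letter of the first word (first colour) and y a letter of the second word (second colour). Since letters of the same colour never commute, the closure of the coloured concatenation consists of the interleavings in which a letter y of the second word precedes a letter x of the first word only if (x, y) ∈ θ. The name `theta_product` is hypothetical.

```python
from functools import lru_cache

def theta_product(u: str, v: str, theta: frozenset) -> frozenset:
    """theta-product of the words u (first colour) and v (second colour), colours deleted."""

    @lru_cache(maxsize=None)
    def suffixes(i: int, j: int) -> frozenset:
        # All completions once u[:i] and v[:j] have been emitted.
        if i == len(u) and j == len(v):
            return frozenset({""})
        out = set()
        # A letter of v may always come next: it stays after earlier letters of u.
        if j < len(v):
            out |= {v[j] + w for w in suffixes(i, j + 1)}
        # A letter of u must commute with every letter of v already emitted.
        if i < len(u) and all((u[i], y) in theta for y in v[:j]):
            out |= {u[i] + w for w in suffixes(i + 1, j)}
        return frozenset(out)

    return suffixes(0, 0)

theta = frozenset({("a", "a"), ("a", "b"), ("b", "a"), ("b", "b"),
                   ("c", "a"), ("c", "b"), ("c", "c")})
words = theta_product("bacbcaabca", "babcacbab", theta)
print("bbaabcbcaabcacacbab" in words)  # True
```

With an empty θ the function returns only the concatenation uv, and with all pairs allowed it returns the full shuffle u ш v, matching the two limit cases stated above.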

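Partial Commutations

Given the partial commutation θ ⊆ Σ1 × Σ2, we define the partial commutation θ' ⊆ Σ × Σ as follows:

(a, b) ∈ θ' ⇔ (a, b) ∈ θ, where on the right-hand side a is read as its coloured copy in Σ1 and b as its coloured copy in Σ2.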

Theorem (Guaiana, R., Salemi) Let L1, L2 be languages over Σ, closed under θ'.

If L1, L2 ∈ SF, then L1 шθ L2 ∈ SF.

Corollary. The shuffle of two commutative star-free languages is star-free


If the internal commutation θ' (i.e., the commutation allowed inside each of the languages L1, L2) is the «same» as the external commutation θ (i.e., the commutations between the letters in L1 and the letters in L2), then the θ-product preserves star-freeness.

Unambiguous Star-Free Languages

A language L is the marked product of the languages L0, L1, …, Ln if

L = L0a1L1a2L2 … anLn,

for some letters a1, a2, …, an of Σ.

A marked product L = L0a1L1a2L2 … anLn is unambiguous if every word u of L admits a unique decomposition u = u0a1u1…anun, with u0 ∈ L0, …, un ∈ Ln.

The product {a,c}*a{ε}b{b,c}* is unambiguous.
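Unambiguity can be checked experimentally by enumerating decompositions. The sketch below is restricted, as an assumption for illustration, to marked products whose factors are of the form A* for subsets A of the alphabet; the factor {ε} is modelled as the star of the empty alphabet. The names `decompositions` and `all_words` are hypothetical.

```python
from itertools import product

def decompositions(word: str, alphabets: list, letters: str) -> list:
    """All ways to write word as u0 a1 u1 ... an un with ui in alphabets[i]*."""
    if not letters:
        return [(word,)] if set(word) <= set(alphabets[0]) else []
    found = []
    for i, c in enumerate(word):
        # Try position i as the marker a1, with u0 = word[:i].
        if c == letters[0] and set(word[:i]) <= set(alphabets[0]):
            for rest in decompositions(word[i + 1:], alphabets[1:], letters[1:]):
                found.append((word[:i],) + rest)
    return found

def all_words(alphabet: str, max_len: int):
    """Enumerate all words over the alphabet up to the given length."""
    for n in range(max_len + 1):
        for tup in product(alphabet, repeat=n):
            yield "".join(tup)

# {a,c}* a {ε} b {b,c}*: every word has at most one decomposition (lengths <= 6).
unambiguous = all(
    len(decompositions(w, ["ac", "", "bc"], "ab")) <= 1
    for w in all_words("abc", 6)
)
print(unambiguous)                                   # True
print(decompositions("aa", ["ab", "ab"], "a"))       # [('', 'a'), ('a', '')]
```

The second call shows an ambiguous marked product, {a,b}* a {a,b}*, in which the word aa has two decompositions.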

Unambiguous Star-Free Languages

SF is the smallest Boolean algebra of languages which is closed under marked product

The family USF of Unambiguous Star-Free languages is the smallest Boolean algebra of languages of Σ* containing the languages of the form A*, for A ⊆ Σ, which is closed under unambiguous marked product.

Unambiguous Star-Free Languages

FO: class of languages corresponding to formulas of first-order logic.

FOk: class of languages corresponding to formulas of first-order logic with k variables.

Theorem (McNaughton) SF = FO

Theorem (Immerman, Kozen) FO3 = FO

Theorem (Thérien, Wilke) FO2 = USF

Unambiguous Star-Free Languages

USF ⊊ SF ⊊ INT ⊊ REG

Theorem (Castiglione, R.) If L1, L2 ∈ USF, then L1 ш L2 ∈ SF.

[Figure: nested inclusion diagram, USF inside SF inside INT inside REG]

Cyclic Submonoids

The languages in the class USF correspond to regular expressions in which the star operation is restricted to subsets of the alphabet.

The simplest languages not in USF are the languages of the form L = u*, where u is a word of length 2.

Such languages are the cyclic submonoids of Σ*.

We here study the shuffle of cyclic submonoids.

Cyclic submonoids

Theorem (Berstel, Boasson, Carton, Pin, R.) If a word u contains at least two different letters, then u* ∈ INT.

A word u ∈ Σ* is primitive if the condition u = v^n, for some word v and integer n, implies u = v and n = 1.

Theorem (McNaughton, Papert) The language u* is star-free if and only if u is a primitive word.
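Primitivity is easy to test. The sketch below uses the classical observation that a nonempty word w is primitive exactly when its only occurrences inside ww are the prefix and the suffix. The function names are hypothetical, and the empty word is treated as non-primitive, in line with the definition above (ε = ε^n for every n).

```python
def is_primitive(w: str) -> bool:
    """True if w is not a proper power v^n with n >= 2."""
    if not w:
        return False
    # Search for w inside ww, skipping the trivial occurrence at position 0.
    return (w + w).find(w, 1) == len(w)

def primitive_root(w: str) -> str:
    """Shortest word v such that the nonempty word w is a power of v."""
    period = (w + w).find(w, 1)
    return w[:period]

print(is_primitive("aab"), is_primitive("ababab"), primitive_root("ababab"))
# True False ab
```

Combined with the theorem of McNaughton and Papert, this gives an immediate test for whether u* is star-free.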

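Cyclic submonoids

u = b, v = ab

b* ш (ab)* = (b + ab)* ∈ SF

u = aab, v = bba

((aab)* ш (bba)*) ∩ (ab)* = ((ab)^3)* ⇒ (aab)* ш (bba)* ∉ SF

(The implication holds because (ab)* ∈ SF, SF is closed under intersection, and ((ab)^3)* is not star-free since (ab)^3 is not primitive.)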
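Both examples can be checked on short words. The sketch below decides membership in u* ш v* by simulating a small nondeterministic automaton whose state records the current position inside a copy of u and inside a copy of v. This construction is an assumption chosen for illustration, not a method taken from the talk, and the function names are hypothetical. The checks are bounded and therefore do not prove the identities.

```python
import re
from itertools import product

def in_star_shuffle(w: str, u: str, v: str) -> bool:
    """True if w belongs to u* ш v* (u and v nonempty)."""
    states = {(0, 0)}  # (position inside current copy of u, same for v)
    for c in w:
        nxt = set()
        for p, q in states:
            if u[p] == c:                      # letter c continues a copy of u
                nxt.add(((p + 1) % len(u), q))
            if v[q] == c:                      # letter c continues a copy of v
                nxt.add((p, (q + 1) % len(v)))
        states = nxt
    return (0, 0) in states                    # all copies must be complete

def all_words(alphabet: str, max_len: int):
    for n in range(max_len + 1):
        for tup in product(alphabet, repeat=n):
            yield "".join(tup)

# b* ш (ab)* agrees with (b + ab)* on all words of length at most 8.
first_ok = all(
    in_star_shuffle(w, "b", "ab") == bool(re.fullmatch(r"(b|ab)*", w))
    for w in all_words("ab", 8)
)
# (ab)^n lies in (aab)* ш (bba)* exactly when 3 divides n (checked for n < 12).
second_ok = all(
    in_star_shuffle("ab" * n, "aab", "bba") == (n % 3 == 0)
    for n in range(12)
)
print(first_ok, second_ok)  # True True
```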

Problem 3: Characterize the pairs of primitive words u, v such that u* ш v* is a star-free language.

Combinatorics on Words

Theorem (Lyndon, Schützenberger) If u and v are distinct primitive words, then the word u^n v^m is primitive for all n, m ≥ 2.

Theorem (Shyr, Yu) If u and v are distinct primitive words, then there is at most one non-primitive word in the language u+v+.

Combinatorics on Words

Problem 3 is related to the search for the powers (non-primitive words) that appear in the language u+ ш v+.

Denote by Q the set of primitive words.

For u, v, w ∈ Q, let p(u,v,w) be the integer k such that

(u* ш v*) ∩ w* = (w^k)*

If (u* ш v*) ∩ w* = {ε}, then p(u,v,w) = 0.

Combinatorics on Words

For u, v ∈ Q, define the set of integers

P(u,v) = { p(u,v,w) | w ∈ Q }.

For instance, if u = a^10 b and v = b, then P(u,v) = {0, 1, 2, 5, 10}.
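The values p(u,v,w) can be explored with a bounded search. The sketch below looks for the least k ≥ 1 such that w^k ∈ u* ш v* and returns 0 if no such k is found up to `bound`. The bound is an assumption of this sketch, so a result of 0 is only evidence, not a proof, that the intersection is {ε}. The membership test simulates a small nondeterministic automaton, and all function names are hypothetical.

```python
def in_star_shuffle(w: str, u: str, v: str) -> bool:
    """True if w belongs to u* ш v* (u and v nonempty)."""
    states = {(0, 0)}  # positions inside the current copies of u and v
    for c in w:
        nxt = set()
        for p, q in states:
            if u[p] == c:
                nxt.add(((p + 1) % len(u), q))
            if v[q] == c:
                nxt.add((p, (q + 1) % len(v)))
        states = nxt
    return (0, 0) in states

def p(u: str, v: str, w: str, bound: int = 50) -> int:
    """Least k in 1..bound with w^k in u* ш v*, or 0 if none is found."""
    for k in range(1, bound + 1):
        if in_star_shuffle(w * k, u, v):
            return k
    return 0

u, v = "a" * 10 + "b", "b"
for w in ["b", "aaaaab", "aab", "ab", "a"]:
    print(w, p(u, v, w))
# b 1
# aaaaab 2
# aab 5
# ab 10
# a 0
```

These five primitive words w realise the five values in the set P(u,v) = {0, 1, 2, 5, 10} quoted above for u = a^10 b and v = b.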

Problem 4: Given two primitive words u, v, characterize the set P(u,v) in terms of the combinatorial properties of u and v.
