
Optimal parallel analysis and decomposition of partially occluded strings


Costas S. Iliopoulos a,b,1, James F. Reid a,c,*,2

a Algorithm Design Group, Department of Computer Science, King's College London, London WC2R 2LS, UK

b School of Computing, Curtin University of Technology, Perth, WA 6102, Australia

c Dipt. di Elettronica e Informatica, Università degli Studi di Padova, Via Gradenigo 6/A, Padova 35131, Italy

Received 4 April 1998; received in revised form 15 July 1999; accepted 13 September 1999

Abstract

This is a theoretical study of partially occluded one-dimensional images. Here, we consider "valid" images composed from a given set of objects, where some objects appearing in the image may be partially obstructed by others. A CRCW PRAM algorithm is presented here for validating a one-dimensional image $x$ of length $n$ over a set of $k$ objects of equal length in $O(\log\log n)$ time with linear work, where $k$ is a fixed integer. © 2000 Published by Elsevier Science B.V. All rights reserved.

Keywords: String algorithms; PRAM algorithms; Pattern recognition

1. Introduction

The study of repetitive structures in sequences (strings) plays a key role in information processing and more generally in computer science. This has led to a generalisation of notions concerning repetitions in sequences. One of these is the notion of a cover of a string. A substring $w$ of a string $x$ is called a cover of $x$ if $x$ can be constructed by concatenations and superpositions of $w$. It has been shown, see [5,16], that efficient solutions for string covering problems have applications to DNA sequencing by hybridization. As a result, many sequential and parallel algorithms have been developed concerning the covering of a string. Among the sequential algorithms, Apostolico et al. [1] solved the problem of computing the shortest cover of a given string. Similarly, Moore and Smyth [15] solved the problem of computing all the covers of a given string in linear time. In parallel computation, Iliopoulos and Park [11] gave an optimal $O(\log\log n)$ time algorithm with linear work for the shortest-cover and all-covers problems, see also [10].

www.elsevier.com/locate/parco

Parallel Computing 26 (2000) 483–494

* Corresponding author.

E-mail addresses: [email protected] (C.S. Iliopoulos), [email protected] (J.F. Reid).

1 Partially supported by EPSRC grants GR/F 00898, GR/L 19362 and GR/J 17844, NATO grant CRG 900293, and MRC grant G9115730.

2 Supported by a Marie Curie Fellowship of the European Commission Training and Mobility of Researchers (TMR) Programme.

0167-8191/00/$ - see front matter © 2000 Published by Elsevier Science B.V. All rights reserved.

PII: S0167-8191(99)00114-3

This paper focuses on an application of the string covering problem to image analysis. Decomposing occluded images is a classical problem in computer vision and its computational complexity is exponential. There are many artificial intelligence and neural network solutions to this problem, see for example [2,4,8,9,14]. Our approach has been to consider the problem bottom-up by considering images in one dimension, thus aiming to use previous results concerning strings. Here we will be validating occluded one-dimensional images which consist of a set of continuous objects where some may be partially obstructed by others. To validate occluded images one must distinguish whether an input image can be covered by the members of a finite set of objects of various lengths, where both the image and the objects range over identical alphabets. Recently Iliopoulos and Simpson [13] produced a linear sequential on-line algorithm for solving this problem.

This is a theoretical study on the analysis of images composed from a given set of objects, where some of the appearing objects may be partially occluded by other ones. Thus we restrict our attention to the occlusion problem by focusing only on discrete images and convex objects; the efficient solutions are based on the study of the repetitive structures of the input. The results and solutions presented here provide the foundations for practical solutions to this problem. For example, in a recent paper [12] efficient methods for the analysis and decomposition of discrete two-dimensional partially occluded images are presented. In the following, we will consider the family of one-dimensional images that we call valid images: given a set of objects (i.e. strings) $\{s_1, \ldots, s_k\}$ and a special "background" symbol denoted #, a string $x$ of length $n$ is a valid image if $x$ is iteratively obtained from a string $z = \#^n$ by substituting substrings of $z$ by some objects $s_i$, for some $i$. We will be focusing on designing algorithms for testing one-dimensional images for validity, under restricted sets of objects, e.g., objects of the same length, convex objects, etc. Additionally we will be investigating the decomposition of a given image into the set of (perhaps partially occluded) objects occurring in it.

The algorithm described in this paper uses the common concurrent-read concurrent-write parallel random access machine (CRCW PRAM) model of computation, where simultaneous access to a memory location for a read or write is allowed. Here we present a common CRCW PRAM algorithm for validating a one-dimensional image $x$ of size $n$ over a fixed number $k$ of objects $s_i$ of equal length $s = |s_i|$. The running time of the algorithm is $O(\log\log n)$ with linear work. We also describe how to extend the above algorithm for objects of different lengths.


The paper is organised as follows. In Section 2 we present the basic definitions for strings and we refer to previous results that we use in the sequel. In Section 3 we describe the data structure and the main techniques used. In Section 4 we describe the steps of the PRAM algorithm. Finally in Section 5 we present our conclusions and open problems.

2. Preliminaries

A string (or word) is a sequence of zero or more symbols drawn from an alphabet $\Sigma$, which consists of a finite number of symbols. The set of all strings over $\Sigma$ is denoted by $\Sigma^*$. The string of length zero is the empty string $\epsilon$ and a string $x$ of length $n > 0$ is represented by $x_1 x_2 \cdots x_n$, where $x_i \in \Sigma$ for $1 \le i \le n$. The string $xy$ is a concatenation of two strings $x$ and $y$. The concatenation of $k$ copies of $x$ is denoted by $x^k$. A string $w$ is said to be a substring of $x$ if and only if $x = uwv$ for $u, v \in \Sigma^*$. A string $w$ is a prefix of $x$ if and only if $x = wu$ for $u \in \Sigma^*$; if $u$ is not empty, then $w$ is a proper prefix of $x$. Similarly, $w$ is a suffix of $x$ if and only if $x = uw$ for $u \in \Sigma^*$; if $u$ is not empty, then $w$ is a proper suffix of $x$. Additionally $\mathrm{prefix}_k(x)$ denotes the first $k$ symbols of $x$ and $\mathrm{suffix}_k(x)$ denotes the last $k$ symbols of $x$. For two strings $x = x_1 \cdots x_n$ and $y = y_1 \cdots y_m$ such that $x_{n-i+1} \cdots x_n = y_1 \cdots y_i$ for some $i \ge 1$ (that is, such that $x$ has a suffix equal to a prefix of $y$), the string $x_1 \cdots x_n y_{i+1} \cdots y_m$ is said to be a superposition of $x$ and $y$. Alternatively, we may say that $x$ overlaps with $y$. A substring $w$ of $x$ is called a cover of $x$ if $x$ can be constructed by concatenations and superpositions of $w$. In the following definition we assume that $\Sigma$ is a finite alphabet of symbols and we denote by $\# \notin \Sigma$ a special symbol called the background symbol.
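The notions of superposition and cover can be illustrated with a short sequential sketch. It is only an illustration of the definitions above, not one of the cited cover algorithms; the function names `superpositions` and `is_cover` are hypothetical, and positions are 0-indexed as is usual in Python.

```python
def superpositions(x, y):
    """All superpositions of x and y: a suffix of x of length i >= 1
    must equal a prefix of y, and the result is x followed by y[i:]."""
    out = []
    for i in range(1, min(len(x), len(y)) + 1):
        if x[len(x) - i:] == y[:i]:
            out.append(x + y[i:])
    return out


def is_cover(w, x):
    """True if occurrences of w (adjacent or overlapping) cover all of x."""
    if not w or len(w) > len(x):
        return False
    covered_up_to = 0  # positions [0, covered_up_to) are covered so far
    for start in range(len(x) - len(w) + 1):
        if start > covered_up_to:
            return False  # a position was left uncovered
        if x.startswith(w, start):
            covered_up_to = start + len(w)
    return covered_up_to == len(x)
```

For instance, `aba` covers `ababa` through two overlapping occurrences, whereas it does not cover `abacaba` because the symbol `c` lies in no occurrence.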

Definition 2.1. Let $x$ be a string of length $n$ over an alphabet $\Sigma$ and let $O = \{s_1, s_2, \ldots, s_k\}$ be a set of strings, called the objects, also over $\Sigma$. Then $x$ is said to be a valid image if and only if $x = z_i$ for some $0 \le i \le n$, where

$$z_0 = \#^n, \qquad z_{i+1} = \mathrm{prefix}_p(z_i)\; s_m\; \mathrm{suffix}_q(z_i) \tag{1}$$

for some $m \in \{1, \ldots, k\}$ and $p, q \in \{0, \ldots, n-1\}$ such that $p + |s_m| + q = n$.

The equalities (1) are said to be the substitution rules and the sequence $z_0, \ldots, z_i$ is said to be the generating sequence of $x$. An example of such a generating sequence for a specific string is as follows. Let $O = \{s_1 = aba,\ s_2 = abc,\ s_3 = cde,\ s_4 = efb\}$. Then $x = aabababccefabacdeefb$ is a valid image over $O$ with generating sequence (Fig. 1):

$$z_0 = \#^{20},\quad z_1 = aba\#^{17},\quad z_2 = aba\#^{14}efb,\quad z_3 = aabc\#^{13}efb,$$

$$z_4 = aabc\#^{10}cdeefb,\quad z_5 = aababa\#^{8}cdeefb,\quad z_6 = aabababc\#^{6}cdeefb,$$

$$z_7 = aabababccde\#^{3}cdeefb,\quad z_8 = aabababccefb\#^{2}cdeefb,$$

$$z_9 = aabababccefabacdeefb.$$
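The substitution rules (1) can be replayed mechanically. The following minimal sketch reproduces the generating sequence above; the helper names `substitute` and `generate` are hypothetical, and the 0-indexed offsets $p$ in `STEPS` are read off by inspecting consecutive $z_i$ (they are not stated explicitly in the text).

```python
def substitute(z, obj, p):
    """One application of rule (1): prefix_p(z) + obj + suffix_q(z),
    where q = len(z) - p - len(obj)."""
    q = len(z) - p - len(obj)
    if p < 0 or q < 0:
        raise ValueError("object does not fit inside the image")
    return z[:p] + obj + z[len(z) - q:]


def generate(n, steps):
    """Return [z_0, z_1, ...] obtained from z_0 = '#' * n by the given steps."""
    z = "#" * n
    sequence = [z]
    for obj, p in steps:
        z = substitute(z, obj, p)
        sequence.append(z)
    return sequence


# (object, offset p) for z_1 .. z_9 of the example; offsets are inferred.
STEPS = [("aba", 0), ("efb", 17), ("abc", 1), ("cde", 14), ("aba", 3),
         ("abc", 5), ("cde", 8), ("efb", 9), ("aba", 11)]
SEQUENCE = generate(20, STEPS)
```

Each later substitution may overwrite part of an earlier object, which is exactly how partial occlusion arises in the final image.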


The occurrences of the (perhaps partially occluded) objects in $x$ are underlined. From this construction, it is obvious that the generating sequence of $x$ is not unique; for example, the following sequence

$$z_0 = \#^{20},\quad z_1 = abc\#^{17},\quad z_2 = aaba\#^{16},\quad z_3 = aababc\#^{14},$$

$$z_4 = aababc\#^{11}efb,\quad z_5 = aababc\#^{8}cdeefb,\quad z_6 = aabababc\#^{6}cdeefb,$$

$$z_7 = aabababccde\#^{3}cdeefb,\quad z_8 = aabababccefb\#^{2}cdeefb,$$

$$z_9 = aabababccefabacdeefb$$

also generates $x$ as a valid image over $O$. These two generating sequences show that the decomposition of $x$ into objects is not unique, due to the fact that some objects may be occluded by others.

From the above example we can see that there may exist many possible generating sequences for a given image, since its decomposition is not unique. We say that two generating sequences for a string $x$ of length $n$ are distinct if one generating sequence implies that an object $s_r$ occurs at position $i$ of $x$ for some $i \in \{1, \ldots, n\}$, and the other implies that either no object occurs at position $i$ of $x$ or a different object than $s_r$ occurs at $i$.

Lemma 2.1 [13]. Let $D(n, k)$ be the number of distinct generating sequences that produce $a^n$ over the set of objects $O = \{a, a^2, \ldots, a^k\}$. Then

$$D(n, k) = \Omega(c^n)$$

for some constant $c > 1$. □

Fig. 1. The string $x = aabababccefabacdeefb$ is a valid image over the set of objects $O = \{aba, abc, cde, efb\}$.

We conclude from Lemma 2.1 that the number of distinct generating sequences may be exponential in $n$. This fact complicates the design of an iterative algorithm for decomposing or even validating a one-dimensional image, since it is imperative not to inspect all possible generating sequences for a given image.
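For experimenting with small inputs, validity can be tested without enumerating generating sequences by running the substitution rules backwards: an object that matches every still-visible symbol of a window is "peeled off", and the image is valid when every non-background position has been peeled. The sketch below is a naive sequential reference check under that assumption. It is not the algorithm of this paper nor the on-line algorithm of [13], and the name `is_valid_image` is hypothetical.

```python
def is_valid_image(x, objects, background="#"):
    """Naive reverse-substitution validity check (quadratic time or worse)."""
    cells = list(x)  # None marks a position already explained by a peeled object
    remaining = sum(1 for c in cells if c != background)
    progress = True
    while remaining and progress:
        progress = False
        for s in objects:
            for p in range(len(cells) - len(s) + 1):
                window = cells[p:p + len(s)]
                if background in window:
                    continue  # an object can never lie under a background symbol
                fits = all(c is None or c == t for c, t in zip(window, s))
                if fits and any(c is not None for c in window):
                    for j in range(p, p + len(s)):
                        if cells[j] is not None:
                            cells[j] = None
                            remaining -= 1
                    progress = True
    return remaining == 0
```

Peeling is monotone: removing an object early never prevents a later removal, so the greedy order used here does not affect the answer.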

The definition of a valid image implies trivially that the objects are contained within the image $x$. That is, no object $s_i$ with $i \in \{1, \ldots, k\}$ is "cut" at $x[1]$ or $x[n]$. This leads to the following two basic facts:

Fact 1. If $x$ is a valid image over $O = \{s_1, \ldots, s_k\}$, then for some $i \in \{1, \ldots, k\}$ there exists a suffix $\bar{s}_i$ of $s_i$ that is also a suffix of $x$.

Fact 2. If $x$ is a valid image over $O = \{s_1, \ldots, s_k\}$, then for some $i \in \{1, \ldots, k\}$ there exists a prefix $\hat{s}_i$ of $s_i$ that is also a prefix of $x$.

This follows from the fact that one of the $s_i$'s must occur at $x[1]$ and (possibly) another must end at $x[n]$. By considering the problem of validating an image under the constraint that all the objects are of equal length, one can break down the validity of a given image into three families of representations of a valid image. These families will enable us to prove the theorems in the following sections needed for the correctness of the algorithm.

Theorem 2.1 [13]. Let $x$ be a string over $\Sigma$ that contains no background symbols #. Let $O = \{s_1, s_2, \ldots, s_k\}$ be a set of objects all of the same length. The string $x$ is a valid image over $O$ if and only if one of the following conditions holds:

$$x = \hat{s}_i u, \tag{2}$$

$$x = u \bar{s}_i, \tag{3}$$

$$x = u \tilde{s}_i v, \tag{4}$$

where $\hat{s}_i$ and $\bar{s}_i$ with $i \in \{1, \ldots, k\}$ denote a prefix and a suffix of $s_i$, respectively, $\tilde{s}_i$ is a substring of $s_i$, and $u$ and $v$ are valid images.

By combining Facts 1 and 2 with the above classification we can list what one can expect to follow prefixes, suffixes and substrings of any object. These properties will be useful in the design of our algorithm because they will easily detect invalid images.

(i) The occurrence of a proper prefix of an object in a valid image must be followed by a (not necessarily proper) prefix of an object (Theorem 2.1, (2), and Fact 2).

(ii) If an occurrence of a prefix of an object in an image is followed by an occurrence of a proper suffix of an object, then the image is not valid. In a valid image, the occurrence of a proper suffix of an object is always preceded by the suffix of an object.

(iii) The occurrence of a suffix of an object in a valid image can be followed by either a prefix of an object, a suffix of an object or a proper substring of an object (Theorem 2.1, (2)–(4)).

(iv) The occurrence of a substring of an object in a valid image is preceded by and followed by valid images (Theorem 2.1, (4)).


The algorithm presented in this paper uses the following known algorithms:

1. The optimal constant time algorithm in [6] for computing the minimum of $n$ integers whose values are in $[1, \ldots, n]$.

2. The optimal parallel algorithm for solving the chaining problem in [17]. This problem creates an ordered linked list of $k$ non-zero elements in an array of size $n$. Ragde solved this problem with an algorithm taking $O(\alpha(k))$ time with a total of $O(n)$ work (where $\alpha$ is a functional inverse of Ackermann's function).

3. The algorithm for solving the prefix matching problem by Gąsieniec and Park [7]. That is, given a pattern $P$ of length $m$ and a text $T$ of length $n$, find for all positions $i$ in $T$ the longest prefix of $P$ starting at $i$. It takes $O(\alpha(m))$ time for the search phase and $O(\log\log m)$ time for the preprocessing and $O(m)$ work.

4. The optimal pattern matching algorithm by Breslauer and Galil [3]. This algorithm finds all occurrences of a pattern $P$ of size $m$ in a text $T$ of size $n$ in $O(\log\log m)$ time and linear work.

3. Data structures and main techniques

Here we assume that $x$ is an image of length $n$ and $O = \{s_1, \ldots, s_k\}$ is a set of objects, where $|s_i| = s$ for all $i \in \{1, \ldots, k\}$. Both the image $x$ and the objects in $O$ are strings over the same alphabet $\Sigma$, and without loss of generality we assume that there are no background symbols # in $x$. If the input image has background symbols, then we decompose the image into a set $S$ of substrings that do not have # as a substring. Clearly, if all the substrings in $S$ are valid then so is the input image.
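The reduction to background-free inputs amounts to splitting the image at the background symbol. A minimal sketch, assuming the image is given as a Python string (the function name `background_free_segments` is hypothetical):

```python
def background_free_segments(image, background="#"):
    """Return the maximal substrings of the image that contain no background
    symbol; each one is then validated independently."""
    return [segment for segment in image.split(background) if segment]
```

Empty pieces, which arise from consecutive background symbols, are dropped because they need no validation.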

The algorithm presented here relies upon the computation of the occurrences in $x$ of the longest prefixes and suffixes of the objects in $O$, and in particular chains of longest prefixes and suffixes in $x$. In order to achieve this we need to maintain and update the following data structures.

We will maintain the list prefix-head$(i) := j$, for each position $i$ of $x$, to denote the index $j$ of the longest prefix of any $s_j \in O$ that occurs at position $i$ of $x$; otherwise prefix-head$(i)$ is marked empty. We define prefix-tail$(i) := i + \ell - 1$ if and only if prefix-head$(i) := j$, where $\ell$ is the length of the longest prefix of $s_j$ occurring at position $i$ of $x$. Similarly, we will maintain the list suffix-head$(i) := j$, for each position $i$ of $x$, to denote the index $j$ of the longest suffix of any $s_j \in O$ that occurs at position $i$ of $x$; otherwise suffix-head$(i)$ is marked empty. We also define suffix-tail$(i) := i + \ell - 1$ if and only if suffix-head$(i) := j$, where $\ell$ is the length of the longest suffix of $s_j$ occurring at position $i$ of $x$.
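These four lists can be computed naively for small inputs. The sketch below is a quadratic sequential stand-in, not the parallel preprocessing of Section 4.1. It assumes 0-indexed positions, `None` for "empty", and ties between objects broken in favour of the lowest index; the name `longest_match_lists` is hypothetical.

```python
def longest_match_lists(x, objects, kind="prefix"):
    """Return (head, tail): head[i] is the index j of the object whose longest
    prefix (or suffix, if kind == "suffix") starts at position i of x, and
    tail[i] = i + length - 1; both are None when nothing matches at i."""
    n = len(x)
    head = [None] * n
    tail = [None] * n
    for i in range(n):
        best = 0
        for j, s in enumerate(objects):
            if kind == "prefix":
                length = 0
                while (length < len(s) and i + length < n
                       and x[i + length] == s[length]):
                    length += 1
            else:
                # try suffixes of s from longest to shortest
                length = 0
                for cand in range(min(len(s), n - i), 0, -1):
                    if x.startswith(s[len(s) - cand:], i):
                        length = cand
                        break
            if length > best:
                best, head[i], tail[i] = length, j, i + length - 1
    return head, tail
```

On the running example, position 1 (0-indexed) starts a full occurrence of `aba`, so its prefix-head is object 0 and its prefix-tail is 3, while position 2 starts no prefix of any object.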

Let $C = \{c_1, c_2, \ldots, c_q\}$ be a subset of $\{1, \ldots, n\}$. We say that $C$ is a prefix-chain if and only if

(i) prefix-head$(c_i)$ is non-empty for all $i \in \{1, \ldots, q\}$.

(ii) prefix-tail$(c_i) \ge c_{i+1} - 1$ for all $i \in \{1, \ldots, q-1\}$.

Furthermore we define prefix-chain-head$(C) := c_1$ if and only if prefix-tail$(p) < c_1$, where $p$ is the nearest position to the left of $c_1$ in $x$ with prefix-head non-empty. We also define prefix-chain-tail$(C) = c_q$ if and only if prefix-tail$(c_q) < p$, where $p$ is the nearest position to the right of $c_q$ in $x$ with prefix-head non-empty. A prefix chain $C = \{c_1, c_2, \ldots, c_q\}$ is said to be maximal if and only if $c_1$ is a prefix-chain-head and $c_q$ is a prefix-chain-tail.

Let $C = \{c_1, c_2, \ldots, c_q\}$ be a subset of $\{1, \ldots, n\}$. We say that $C$ is a suffix-chain if and only if

(i) suffix-head$(c_i)$ is non-empty for all $i \in \{1, \ldots, q\}$.

(ii) suffix-tail$(c_i) \ge c_{i+1}$ for all $i \in \{1, \ldots, q-1\}$.

Furthermore we define the head of the chain suffix-chain-head$(C) = c_1$ if and only if suffix-tail$(p) < c_1$, where $p$ is the nearest position to the left of $c_1$ in $x$ with suffix-head non-empty. We also define the tail of the chain suffix-chain-tail$(C) = c_q$ if and only if suffix-tail$(c_q) < p$, where $p$ is the nearest position to the right of $c_q$ in $x$ with suffix-head non-empty. A suffix chain $C = \{c_1, c_2, \ldots, c_q\}$ is said to be maximal if and only if $c_1$ is a suffix-chain-head and $c_q$ is a suffix-chain-tail.

We now prove another less trivial decomposition of valid images using prefixes of some $s_i$ of $O$.

Lemma 3.1. Let $x$ be a valid image of length $n$ over the set of objects $O = \{s_1, s_2, \ldots, s_k\}$. If a prefix-chain-head occurs at position $q$ in $x$, then for some $i \in \{1, \ldots, k\}$, $x$ can be decomposed into

$$x = y \tilde{s}_i z, \tag{5}$$

where $y$ and $z$ are valid images, $z$ is a substring of $x$ starting at position $q$, and $\tilde{s}_i$ is either the empty string $\epsilon$, a suffix of $s_i$ or a substring of $s_i$ starting at position $q - |\tilde{s}_i|$ in $x$.

Proof. Since $x$ is a valid image over $O$ we can represent it according to Eq. (4) of Theorem 2.1: $x = y \tilde{s}_i z$, where $y$ and $z$ are also valid images over $O$. From Theorem 2.1, we know that $\tilde{s}_i$ can be the empty string $\epsilon$, a prefix or a suffix of some $s_i \in O$. Recall that prefix-chain-head$(C) := c_1$ if and only if prefix-tail$(p) < c_1$, where $p$ is the nearest position to the left of $c_1$ in $x$ with prefix-head$(p)$ non-empty. Clearly $\tilde{s}_i$ cannot be a prefix of some $s_i$, since this would contradict the fact that a prefix-chain-head occurs at position $q$; indeed it would occur at position $q - |\tilde{s}_i|$. □

We now make a more specific breakdown of a valid image $x$ by using its prefix chains and suffix chains. Let $C_p = \{c_1, \ldots, c_l\}$ be a maximal prefix-chain and let $C_s = \{\bar{c}_1, \ldots, \bar{c}_m\}$ be a maximal suffix-chain.

Fact 3. The maximal chains $C_p$ and $C_s$ compose a valid image $x[c_1 \ldots \bar{c}_m]$ if and only if prefix-chain-head$(C_p) \le$ suffix-chain-head$(C_s)$ and suffix-chain-head$(C_s) \le$ prefix-chain-tail$(C_p) + 1$.

It follows from these equations that $C_p C_s$ constitutes a valid image since it is covered by a finite number of prefixes and suffixes of possibly different $s_i$'s. By definition of the covers, if $C_p C_s$ is covered by prefixes and suffixes of any $s_i$, where $i \in \{1, \ldots, k\}$, then it is a valid image. Note that a suffix chain may be included inside a prefix-chain and still constitute a valid image.


A very important fact about our composition of valid sub-images into valid images is the following: if one has a linearly ordered sequence of prefix chains and suffix chains, then for each attempt of a possible concatenation of chains the validity of $x$ can be tested.

Theorem 3.1. Let $x$ be a valid image over a set of objects $O = \{s_1, \ldots, s_k\}$. Then the following two statements hold:

1. Let $C_p$ be a maximal prefix chain of length $\ell$ with a proper prefix as last element in the sequence, and $C_s$ be a maximal suffix chain of length $m$. Let suffix-chain-head$(C_s)$ be the nearest suffix to the right of prefix-chain-tail$(C_p)$; then suffix-chain-head$(C_s) \le$ prefix-chain-tail$(C_p) + 1$.

2. Let $D_t$ and $D_{t+1}$ be two consecutive valid sub-images, both substrings of $x$, such that

$$w := D_t\, c\, D_{t+1}, \qquad |c| \le |s_i| \qquad \text{and} \qquad x = ywz$$

for some strings $y, c, z$. The string $w$ is a valid image if and only if $c$ is a substring of an object $s_i$, where $i \in \{1, \ldots, k\}$.

Proof.

1. This fact derives immediately from item (i) following Theorem 2.1. We are assured that the tail of the prefix-chain is not a suffix and not a substring, thus ruling out items (ii) and (iv).

2. From the construction of the two chains it is clear that no prefix or suffix arises inside $x[j+1 \cdots k-1]$ by the definition of prefix and suffix chains. If $c_t$ is a substring of some $s_i$ with $i \in \{1, \ldots, k\}$, then again the image $D_t c_t D_{t+1}$ is valid because it is covered by a finite sequence of prefixes, suffixes and substrings of various $s_i$'s. □

We now investigate the possible upper bounds on the number of valid sub-images that can occur in a decomposition of a valid image.

Lemma 3.2. Let $x$ be a valid image over a set of objects $O$ and consider the decomposition

$$x = D_1 c_1 D_2 \cdots c_{t-1} D_t,$$

where $D_1, \ldots, D_t$ are valid images. Then the number of valid sub-images all occurring in the same decomposition of $x$ is bounded by $t = O(n/s)$, where $s = |s_i|$ for all $i \in \{1, \ldots, k\}$.

Proof. To prove that there are at most $O(n/s)$ substrings $c_j$ for $j \in \{1, \ldots, t-1\}$ we need to maximise $t$. This case is obtained when every block represents just one $s_i$ followed by a largest possible substring $\tilde{s}_i$ (any $s_i$ with both prefix and suffix cut off), hence of size $|s - 2|$, where $s = |s_i|$. □


4. A Parallel algorithm for validating one-dimensional images

This algorithm validates one-dimensional images with objects of equal length; it consists of a preprocessing stage together with three main steps:

1. Preprocessing stage. Here we are interested in knowing the longest prefix and suffix of any $s_i \in O$, where $i \in \{1, \ldots, k\}$, that occurs at position $j$ in $x$ for all $j \in \{1, \ldots, n\}$.

2. Computing chains of longest prefixes and suffixes. Here the aim is to build overlapping sequences of longest prefixes/suffixes that cover substrings of $x$.

3. Creating valid chains of valid sub-images in $x$. We create valid sub-images by combining the overlapping prefix and suffix chains computed previously, thus testing the validity of $x$.

4. Pattern matching stage. At this stage (if it has not yet been declared invalid), the image $x$ has been decomposed into a chain of valid sub-images linked by a number of substrings; the image is valid only if these are substrings of some $s_i$ for $i \in \{1, \ldots, k\}$ (see Theorem 3.1, (2) and Lemma 3.2).

4.1. Preprocessing stage

Before starting our validating procedure we need to perform some preprocessing on the given image $x$ and the objects $s_i$ for $i \in \{1, \ldots, k\}$. This preprocessing computes the longest prefix and suffix of any $s_i$ starting at position $j$ of $x$, where $j \in \{1, \ldots, n\}$. This longest prefix/suffix information will be distributed to $n$ processors aligned under each position of $x$. This task is carried out by using the algorithm of Gąsieniec and Park in [7] which solves the prefix matching problem. That is, given a pattern $P$ of length $m$ and a text $T$ of length $n$, find for all positions $i$ in $T$ the longest prefix of $P$ starting at $i$. It takes $O(\alpha(m))$ time for the search phase and $O(\log\log m)$ time for the preprocessing using $m$ processors, where $\alpha(m)$ is the inverse Ackermann function.

We are interested in using the prefix matching algorithm over $k$ patterns $s_i$ of length $s$ and a unique text $x$ of length $n$. This requires some consideration of the variable $k$; it is reasonable to assume for practical purposes that $k$ is a fixed constant and hence $O(1)$. By using just $s$ processors, the resulting running times for the preprocessing and the search would be $O(k \log\log s)$ and $O(k\,\alpha(s))$, respectively. To decrease these running times we need to use $ks$ processors so that the preprocessing of the $k$ patterns is done simultaneously, hence taking $O(\log\log s)$ time. Unfortunately this increase in the number of processors does not improve on the search time bounds. The only way to bring down the search phase to $O(\alpha(s))$ is to use $kn$ processors.

By reversing the image $x$ one can easily do the same as above for suffixes.
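To make the prefix matching problem concrete, the following sequential sketch solves it in linear time with the classical Z-function. This is a swapped-in technique, not the parallel algorithm of [7]. It assumes the separator character `sep` occurs in neither the pattern nor the text, and the function names are hypothetical. The suffix variant follows the reversal remark above and reports lengths by the position where the suffix ends.

```python
def z_function(t):
    """z[i] = length of the longest common prefix of t and t[i:]."""
    n = len(t)
    z = [0] * n
    if n:
        z[0] = n
    left = right = 0  # [left, right) is the rightmost match window found so far
    for i in range(1, n):
        if i < right:
            z[i] = min(right - i, z[i - left])
        while i + z[i] < n and t[i + z[i]] == t[z[i]]:
            z[i] += 1
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z


def longest_prefix_lengths(pattern, text, sep="\x00"):
    """lengths[i] = length of the longest prefix of pattern starting at text[i]."""
    z = z_function(pattern + sep + text)
    return z[len(pattern) + 1:]


def longest_suffix_lengths(pattern, text, sep="\x00"):
    """lengths[i] = length of the longest suffix of pattern ending at text[i],
    obtained by reversing both strings."""
    reversed_lengths = longest_prefix_lengths(pattern[::-1], text[::-1], sep)
    return reversed_lengths[::-1]
```

Running `longest_prefix_lengths` once per object and keeping the maximum at each position yields the prefix-head and prefix-tail lists in $O(k(n + s))$ sequential time.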

4.2. Computing the longest pre®x/su�x chains

We now describe the procedure to compute sequences of longest prefixes; in a similar way one can do the same for the suffixes. In effect the preprocessing stage computes the lists prefix-head, prefix-tail, suffix-head and suffix-tail.


In parallel, every processor that is associated with a longest prefix writes the value 1 to its respective location (index in $x$) in a $1 \times n$ array BB (common memory bulletin board), and every processor that is not associated with a prefix writes the value 0 to its respective location in the same array. Note that the processors marking the bulletin locations with 1 or 0 effectively re-create the list prefix-head. Every processor associated with a prefix-head (location marked 1) proceeds to find the location of the first 1 to its left in the array BB. This problem is also referred to as the chaining problem: for every 1 in the array of size $n$ find the first 1 to the right, where all the elements in the array are either 0 or 1. Ragde solved this problem in [17] with an algorithm that runs in $O(\alpha(n))$ time and $O(n)$ work. Consider all processors $p_i$ that are associated with a non-empty prefix-head$(i)$ (i.e. marked 1 in BB). Each such processor computes $i_j$, the nearest non-empty location to the left of BB$[i]$ in BB. If prefix-tail$(i_j) < i - 1$, then processor $p_i$ marks both prefix-chain-head$(i)$ and prefix-chain-tail$(i_j)$. The computation of the chain-heads and chain-tails requires constant time and a linear number of processors.
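A sequential sketch of this step is given below: a single left-to-right scan stands in for the parallel chaining algorithm of [17], and the chain boundaries are marked with the test prefix-tail$(i_j) < i - 1$ stated above. Treating the leftmost marked position as a chain head and the rightmost as a chain tail is an assumption of this sketch, as are the 0-indexed positions and the function names.

```python
def nearest_marked_to_left(marks):
    """left[i] = index of the nearest True strictly to the left of i, or None.
    Sequential stand-in for the chaining problem."""
    left = [None] * len(marks)
    last = None
    for i, marked in enumerate(marks):
        left[i] = last
        if marked:
            last = i
    return left


def prefix_chain_bounds(prefix_tail):
    """Return (heads, tails): positions that open and close maximal prefix
    chains. prefix_tail[i] is None where prefix-head(i) is empty."""
    marks = [t is not None for t in prefix_tail]
    left = nearest_marked_to_left(marks)
    marked = [i for i, m in enumerate(marks) if m]
    heads, tails = [], []
    for i in marked:
        p = left[i]
        if p is None:
            heads.append(i)          # assumed: leftmost prefix opens a chain
        elif prefix_tail[p] < i - 1:
            heads.append(i)          # gap between p's prefix and position i
            tails.append(p)
    if marked:
        tails.append(marked[-1])     # assumed: rightmost prefix closes a chain
    return heads, sorted(tails)
```

Each chain is then the run of marked positions between a head and the next tail; the same routine applies to the suffix lists after adjusting the comparison to the suffix-chain condition.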

4.3. Creating chains of valid sub-images inside x

After Step 2 we have a series of prefix-chain-heads for the longest prefix chains and suffix-chain-heads for the longest suffix chains, together with their respective chain-tails.

We now construct valid sub-images inside $x$ by combining the prefix and suffix chains. Every suffix-chain-head finds the nearest prefix-chain-head to its left in $x$. If the prefix-chain-tail of the prefix-chain is within the suffix chain, then we can create a valid sub-image starting at the prefix-chain-head and terminating at the suffix-chain-tail. These computations are easily reduced to the above mentioned chaining problem, except that whenever a next suffix head occurs outside the bounds of a prefix-chain, the input image is declared to be invalid and the algorithm stops, see Theorem 3.1, (1). These procedures all take constant time using $n$ processors.

4.4. Pattern matching on substrings of some si

The input to this part of the algorithm is a decomposition of $x$ according to valid pictures (sub-images) and substrings of the $s_i$'s, see Theorem 3.1, (2).

We apply concurrently the pattern matching algorithm of Breslauer and Galil [3] to at most $n/s$ substrings $c_i$, see Lemma 3.2. Each substring is of length at most $s - 2$ and there are at most $n/s$ of them in $x$. By using $s/\log\log s$ processors one can perform in parallel $n/s$ instances of the pattern matching algorithm, yielding a total running time of $O(\log\log s)$ and $n$ work.
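Sequentially, this last stage reduces to a substring test for each linking string. The sketch below uses Python's built-in substring search as a stand-in for the parallel pattern matching of [3]; the function name `gaps_are_object_substrings` and the representation of the linking substrings as a list of strings are assumptions.

```python
def gaps_are_object_substrings(gaps, objects):
    """True if every linking substring c_i occurs inside at least one object."""
    return all(any(gap in obj for obj in objects) for gap in gaps)
```

An empty linking string passes trivially, which matches the case where two valid sub-images are adjacent.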

This algorithm gives rise to the following theorem:

Theorem 4.1. Let $x$ be a one-dimensional image of length $n$ over the alphabet $\Sigma$, containing no background symbols (#'s). Let $O = \{s_1, \ldots, s_k\}$ be a set of distinct finite objects, each of length $m$, also over $\Sigma$. Then there exists a parallel algorithm for validating $x$ over $O$ taking $O(\log\log n)$ time and using $n$ work.


5. Conclusion and open problems

The algorithms presented here can be extended to handle variable length objects, see [13]. In a forthcoming paper [12], we present sequential algorithms for validating two-dimensional square and rectangular images. An interesting open practical problem is the validation of images with sets of objects that are concave or non-continuous; of particular interest is the variant of the problem with objects over $\Sigma \cup \{K\}$, where $K$ is a transparent symbol and this alphabet defines a set of strings with holes. Another interesting problem is the computation of the depth of an object in an image, i.e. the number of objects applied onto an object after the placement of an object in an image. Finally, approximate occlusion analysis is of practical importance and therefore all the above mentioned problems need to be extended to handle errors.

References

[1] A. Apostolico, M. Farach, C.S. Iliopoulos, Optimal superprimitivity testing for strings, Information Processing Letters 39 (1991) 17–20.

[2] W. Bischof, T. Caelli, Learning structural descriptions of patterns: a new technique for conditional clustering and rule generation, Pattern Recognition 27 (5) (1994) 689–699.

[3] D. Breslauer, Z. Galil, An optimal O(log log n) time parallel string matching algorithm, SIAM Journal on Computing 19 (6) (1990) 1051–1058.

[4] H. Bunke, U. Buhler, Applications of approximate string matching to 2D shape recognition, Pattern Recognition 26 (12) (1993) 1797–1812.

[5] A.M. Duval, W.F. Smyth, Reducing the size of hybridization chips for DNA sequence analysis, International Journal of Foundations of Computer Science 7 (1) (1996) 87–93.

[6] F. Fich, P. Ragde, A. Wigderson, Relations between concurrent-write models of parallel computation, SIAM Journal on Computing 17 (3) (1988) 606–627.

[7] L. Gąsieniec, K. Park, Work-time optimal parallel prefix matching, in: Proceedings of the Second European Symposium on Algorithms (ESA'94), Lecture Notes in Computer Science, vol. 855, Springer, Berlin, 1994, pp. 471–482.

[8] D. Geiger, B. Ladendorf, A. Yuille, Occlusions and binocular stereo, International Journal of Computer Vision 14 (1995) 211–226.

[9] K. Hoff, T. Hudson, D. Manocha, H. Zhang, Visibility culling using hierarchical occlusion maps, Computer Graphics Proceedings, Annual Conference Series, SIGGRAPH, 1997, pp. 77–88.

[10] C.S. Iliopoulos, M. Korda, Optimal parallel superprimitivity testing on square arrays, Parallel Processing Letters 6 (3) (1996) 299–308.

[11] C.S. Iliopoulos, K. Park, An optimal O(log log n) time algorithm for parallel superprimitivity testing, Journal of the Korea Information Science Society 21 (8) (1994) 1400–1404.

[12] C.S. Iliopoulos, J.F. Reid, Validating and decomposing partially occluded two-dimensional images (extended abstract), in: Proceedings of the Third Prague Stringology Club Workshop, PSCW98, Department of Computer Science and Engineering, Czech Technical University, Prague, Czech Republic, 1998.

[13] C.S. Iliopoulos, J. Simpson, On-line validation and analysis of occluded images, in: Proceedings of the Eighth Australasian Workshop on Combinatorial Algorithms (AWOCA'97), Research on Combinatorial Algorithms, Queensland University of Technology, 1997, pp. 25–36.

[14] L. Kuzmin, A. Petrov, Visual space geometry derived from occlusion axioms, Journal of Mathematical Imaging and Vision 6 (1996) 291–308.

[15] D.W.G. Moore, W.F. Smyth, Computing the covers of a string in linear time, in: Proceedings of the Fifth ACM-SIAM Symposium on Discrete Algorithms, 1994, pp. 511–515.

[16] P.A. Pevzner, R.J. Lipshutz, Towards DNA sequencing chips, in: Proceedings of the 19th International Symposium on Mathematical Foundations of Computer Science, Lecture Notes in Computer Science, vol. 841, Springer, Berlin, 1994, pp. 143–158.

[17] P. Ragde, The parallel simplicity of compaction and chaining, Journal of Algorithms 14 (1993) 371–380.
