Pattern Recognition, Vol. 24, No. 9, p. 917, 1991. Printed in Great Britain

0031-3203/91 $3.00 + .00 Pergamon Press plc

© 1991 Pattern Recognition Society

A NOTE ON EFFICIENT PARALLEL ALGORITHMS FOR THE COMPUTATION OF TWO-DIMENSIONAL IMAGE MOMENTS

YI PAN

Department of Computer Science, University of Pittsburgh, Pittsburgh, PA 15260, U.S.A.

(Received 9 October 1990; in revised form 5 March 1991; received for publication 14 March 1991)

An earlier paper(1) presented an efficient parallel algorithm for the computation of two-dimensional (2D) image moments. The algorithm is implemented on both a linear array and a 2D array. The idea of decomposing a 2D moment into many vertical and horizontal moments is very interesting, and it helps reduce the time complexity when the algorithm is implemented on a parallel computer array.

However, there is a flaw in the time analysis of the algorithms. The total time complexity of a parallel algorithm is determined by both the computation time and the communication time. The communication time is the time spent routing data from one processor to another through the communication links that connect the processors in the array. Although it varies with the technology, the time to route one data element from a processor to its neighbor is normally higher than that of a single arithmetic operation, such as an addition. Even in a fine-grain parallel computer array, the time to transfer one data element between neighboring processors is at least of the same order as that of an arithmetic operation.(2) Thus, the communication time can never be ignored unless all the computation involved in a parallel algorithm is local, which means that no data need be transmitted from one processor to another.

The correct time analysis is as follows. In the case of a linear array, since only neighboring processors are directly connected, remote data communication has to be implemented by passing data through the intermediate processors between the sending and receiving processors. Hence, to implement the cascade-partial-sum described in the algorithm,(1) the communication time can be computed as follows. In the first cycle, since a processor sends data only to a neighboring processor, the communication time is that for one unit-route. Here, one unit-route is defined as a data transfer between two directly connected processors.(2) In the second cycle, a processor communicates with a processor at a distance of three. Thus the communication time is that for three unit-routes. In general, in cycle i, the communication time is that for $2^i - 1$ unit-routes, and a total of log N cycles are needed to complete the cascade-partial-sum. Therefore, the total communication time is that for

$$\sum_{i=1}^{\log N} (2^i - 1) = (2^{\log N + 1} - 2) - \log N = 2N - \log N - 2$$

unit-routes, where logarithms are taken to base 2. The computation time remains the same.

In the case of a 2D processor array, the M processors on the same column work on one column of the image to produce the vertical moments. Using an analysis similar to the above, we can conclude that the communication time is O(M) unit-routes. Likewise, the communication time for the horizontal calculation is O(N). Therefore, the total time complexity of the 2D algorithm is O(M + N) instead of O(log N) as presented in the paper. As is well known, in a 2D processor array of size N × M, a lower bound on the total time complexity of the algorithm is Ω(N + M) because the communication diameter of the mesh is O(M + N).
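The linear-array count can be checked numerically. The following is a minimal sketch, assuming N is a power of two and taking the per-cycle cost of 2^i - 1 unit-routes directly from the analysis above; the function names are hypothetical and it does not model the data movement of the original algorithm.

```python
def log2_exact(n):
    """Return log2(n) for a power of two n >= 2; raise ValueError otherwise."""
    k = n.bit_length() - 1
    if n < 2 or (1 << k) != n:
        raise ValueError("n must be a power of two, at least 2")
    return k


def cascade_unit_routes(n):
    """Tally unit-routes cycle by cycle: cycle i costs 2**i - 1, i = 1..log2(n)."""
    return sum(2**i - 1 for i in range(1, log2_exact(n) + 1))


def closed_form(n):
    """Closed form of the same sum: 2N - log N - 2."""
    return 2 * n - log2_exact(n) - 2


# The cycle-by-cycle tally and the closed form agree for every tested size.
for n in (2, 8, 64, 1024):
    assert cascade_unit_routes(n) == closed_form(n)
    print(n, cascade_unit_routes(n))
```

For N = 8, for example, the three cycles cost 1 + 3 + 7 = 11 unit-routes, which matches 2(8) - 3 - 2 = 11. The count grows linearly in N, not logarithmically.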

In conclusion, the time analysis of the paper is incorrect, and the time complexities of the algorithms on both the linear and the 2D arrays are the same, namely O(N). Using the 2D array is therefore pointless: it increases the area-time complexity without reducing the total time complexity. Unless the array has global connections (such as those of a hypercube network), the time complexity cannot be reduced.
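The area-time comparison can be illustrated with a rough sketch. It rests on several assumptions that are not part of the note: area is taken to be proportional to the number of processors, computation time is left out, and the mesh cost is modeled by applying the linear-array closed form 2N - log N - 2 once per dimension (the note itself only states O(M) and O(N)). All function names are hypothetical.

```python
def log2_exact(n):
    """Return log2(n) for a power of two n >= 2; raise ValueError otherwise."""
    k = n.bit_length() - 1
    if n < 2 or (1 << k) != n:
        raise ValueError("size must be a power of two, at least 2")
    return k


def unit_routes(n):
    """Communication cost of one cascade-partial-sum over n processors in a line."""
    return 2 * n - log2_exact(n) - 2


def linear_array(n):
    """Return (processors, unit-routes, area-time product) for a linear array."""
    processors = n  # assumption: area is proportional to processor count
    routes = unit_routes(n)
    return processors, routes, processors * routes


def mesh(n, m):
    """Return (processors, unit-routes, area-time product) for an N x M mesh."""
    processors = n * m
    # Assumption: vertical pass over M processors, then horizontal pass over N.
    routes = unit_routes(m) + unit_routes(n)
    return processors, routes, processors * routes


for size in (8, 64, 512):
    print(size, linear_array(size), mesh(size, size))
```

Under these assumptions the mesh uses N times as many processors as the linear array while its communication time stays of the same order, so its area-time product is larger by roughly a factor of N.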

REFERENCES

1. K. Chen, Efficient parallel algorithms for the computation of two-dimensional image moments, Pattern Recognition 23, 109-119 (1990).

2. E. Dekel, D. Nassimi and S. Sahni, Parallel matrix and graph algorithms, SIAM J. Comput. 10, 657-675 (1981).
