ACAT2007, April 23-27, Amsterdam, Netherlands

ACAT2007, April 23-27, Amsterdam, Netherlands

Miroslav Morháč - Institute of Physics, Slovak Academy of Sciences, Bratislava, Slovakia

Error-free algorithms to solve special and general discrete systems of linear equations.

1. Introduction•the solution of linear algebraic equations is a very frequent task in numerical mathematics and experimental physics at all. •when solving such a set of linear equations one often meets the problem of ill-conditioned matrix of the set. •for large dense sets of linear equations, which are as a rule ill-conditioned, the stability of the solution cannot be guaranteed. •rounding and truncation errors, during the numerical computations, involved in obtaining the solution to the problem cannot be tolerated.

Illustrative example

Let us have ill-conditioned set of two linear equations

x + 3y = 4x + 3.00001y = 4.00001 (1)

which has the solution x = 1, y = 1.On the other hand let us consider the set

x + 3y = 4x + 2.99999y = 4.00002 (2)

This set has the solution x = 10, y = -2. Very small change in the coefficients (0.00002 and 0.00001) caused enormous changes in the solution. The inverse matrix of (1) contains elements of the order of It demonstrates its ill-conditionality. 510 .

•we are motivated by the fact that in scientific computations there are large classes of ill-conditioned problems and there are also numerically unstable algorithms•digital computer is a finite machine•it is capable of representing internally only a finite set of numbers•there exist difficulties associated with the attempts at approximating arithmetic in the field of real numbers (R,+,-) by using the finite set of so-called floating-point numbers F, more appropriately called the set of computer-representable numbers•there is no possibility of representing the continuum of real numbers in any detail.•a “practical” solution is to represent a real number by the closest computer-representable number, thereby introducing the rounding error•it is well known that computers can perform certain arithmetic operations exactly if the operands are integers•moreover there are many situations in physics when we work with integer input data. •processing of spectra (histograms) can serve as a good example•when solving ill-conditioned problems during the analysis of spectra it is not reasonable to leave the world of exact integers and thus introduce instability•it motivates us to consider integer arithmetic as a mean of avoiding rounding errors in the hope that certain ill-conditioned problems can be solved exactly•residue number system is an example of a number system with which we can do exact arithmetic•the aim of the contribution is to present error-free algorithms to solve ill-conditioned systems of linear equations

Negative numbers

Examples of associations in modulo arithmetic (M=17)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

0 1 2 3 4 5 6 7 8 -8 -7 -6 -5 -4 -3 -2 -1

Inverse numbers

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 10 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

1 9 6 13 7 3 5 15 2 12 14 10 4 11 8 16

The question is how can we proceed when the solution of ill-conditioned linear system of equations can be expressed as rational fractions of integers•we can increase the precision of floating point representation – it does not need to guarantee the correct result•we can work with long integers, it is many times very cumbersome•we can work with integers in finite rings using residue class or modulo arithmetic and at the end of calculation to convert the modulo representation into real numbers

D R

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

1 1

1

2

2

3

3

4

4

5

5

6

6

7

7

8

8

9

9

10

10

11

11

12

12

13

13

14

14

15

15

16

16

2 2

1

4

2

6

3

8

4

10

5

12

6

14

7

16

8

1

9

3

10

5

11

7

12

9

13

11

14

13

15

15

16

3 3

1

6

2

9

3

12

4

15

5

1

6

4

7

7

8

10

9

13

10

16

11

2

12

5

13

8

14

11

15

14

16

4 4

1

8

2

12

3

16

4

3

5

7

6

11

7

15

8

2

9

6

10

10

11

14

12

1

13

5

14

9

15

13

16

5 5

1

10

2

15

3

3

4

8

5

13

6

1

7

6

8

11

9

16

10

4

11

9

12

14

13

2

14

7

15

12

16

6 6

1

12

2

1

3

7

4

13

5

2

6

8

7

14

8

3

9

9

10

15

11

4

12

10

13

16

14

5

15

11

16

7 7

1

14

2

4

3

11

4

1

5

8

6

15

7

5

8

12

9

2

10

9

11

16

12

6

13

13

14

3

15

10

16

8 8

1

16

2

7

3

15

4

6

5

14

6

5

7

13

8

4

9

12

10

3

11

11

12

2

13

10

14

1

15

9

16

9 9

1

1

2

10

3

2

4

11

5

3

6

12

7

4

8

13

9

5

10

14

11

6

12

15

13

7

14

16

15

8

16

10 10

1

3

2

13

3

6

4

16

5

9

6

2

7

12

8

5

9

15

10

8

11

1

12

11

13

4

14

14

15

7

16

11 11

1

5

2

16

3

10

4

4

5

15

6

9

7

3

8

14

9

8

10

2

11

13

12

7

13

1

14

12

15

6

16

12 12

1

7

2

2

3

14

4

9

5

4

6

16

7

11

8

6

9

1

10

13

11

8

12

3

13

15

14

10

15

5

16

13 13

1

9

2

5

3

1

4

14

5

10

6

6

7

2

8

15

9

11

10

7

11

3

12

16

13

12

14

8

15

4

16

14 14

1

11

2

8

3

5

4

2

5

16

6

13

7

10

8

7

9

4

10

1

11

15

12

12

13

9

14

6

15

3

16

15 15

1

13

2

11

3

9

4

7

5

5

6

3

7

1

8

16

9

14

10

12

11

10

12

8

13

6

14

4

15

2

16

16 16

1

15

2

14

3

13

4

12

5

11

6

10

7

9

8

8

9

7

10

6

11

5

12

4

13

3

14

2

15

1

16

Rational fractions

11

0

( ) ( ) (mod ) 0,1, , 1N

kn

k

x n N X k M n N

•very frequently the advantageous property of factorization of the convolution of two signals to the product of their Fourier coefficients is utilized [1], [2].

•since the Fourier transform cannot be computed with absolute precision, the number theoretical transforms were defined with the aim to carry out the fast error-free convolution of data.

( ) ( ) ( ) , 0,1, , 1i i iF y n F h n F x n i N

(5)1

0

( ) ( ) (mod ) 0,1, , 1N

kn

n

X k x n M k N

2. Convolution systems

•in many cases the dynamic behavior of a linear system can be described by means of impulse response function h(n). •the periodical convolution of finite sets x(n) and h(n) is

1

0

( ) ( ) ( ), 0,1, , 1N

m

y n h n m x m n N

(3)

•these transforms are defined in the ring of integers using the operations carried out in modulo M arithmetic, where M is prime•direct and inverse Number Theoretical Transforms (NNT) can be defined

(4)

(6)

where 1. (mod ) 1a N N M

. (mod ) 1Nb M

2 1,qM M is prime and N must divide M-1. •Mersenne Transforms (Mersenne numbers q is prime, some Mersenne numbers are primes •Fermat Transforms (Fermat numbers 2

0 1 2 3 42 1, 3, 5, 17, 257, 65537)t

tM F F F F F F •Fermat transform is the most suitable, fast transform algorithm exists. The first 5 Fermat numbers are primes •there are many applications for error-free convolution calculations (filtration). •however, the problem of precision is much more critical in the inverse operation - deconvolution

2.1 Precise deconvolution using the Fermat number transform [3]

0 00 , 1 , 2 , 1

1 11 , 0 , 3 , 2

2 , 3 , 0 , 12 2

1 , 2 , 1 , 01 1

y xh h N h h

y xh h h h

h N h N h h Ny N x N

h N h N h hy N x N

L

L

M g MM M M M

L

L

H y x

Convolution system (3) can be written as

or

(7)

(8)

The solution of any regular system of linear equations (8) can be expressed as

0 1 20 1 2

1 mmM M M M

D x x x x xL

(9)

where M is chosen modulus (Fermat number) and m is a finite number

Substituting this expression into matrix equation (8) we obtain

0 10 1

1 mmM H M H M H

D y x x xL

(10)or

0 1

1 1 1mD H H H

M M M

y x x x 0L L

(11)

1 0

1D H ,

M y y x 1

1j j jH ,

M y y x 1 2j , , ,m. K

1jy

Where D is determinant of the matrix H. Now we can iteratively calculate

If the vector equals the zero vector the calculation can be finished.

(12)

1 0 1 1i i iR R R (mod M ) i , , ,N x h y K

To calculate modulo solution of (8) we can utilize the Fermat number transform. For transformed values it holds

(13)

x

MDD(mod M ) (mod M ) (mod M )

D D M M Mx x x x

Using inverse Fermat transform we obtain modulo M solution of vector

where

1

00

N

M M M Mi

D k M D , D , D H i (mod M )

x x

0 1 2 mD M M M x x x x xK KThen

(14)

(15)

(16)

Substituting (16) to (12) we get

1 1 2 mH M H M H y x x xK

j(mod M ) H jy x

or in general

1j j

j

H

M

y xy

Here to find particular solution we can apply again Fermat transform. Then we can proceed to the next iteration step

(17)

(18)

jx

0

10 1 1

kj

jj

x( i ) x ( i ) M , i , , ,ND

K

Final solution can be expressed

(19)

Exact algorithm to calculate determinant of the convolution matrix [3], [4]

The determinant of the convolution matrix of the system (8) can be calculated as follows

11

1 2 1 1220 1 2 1

0

1N

N N N kk kN

k

D h hW h W h W

L (20)

where 2j

NW e

and N is a power of 2. It follows from relation (20) that the determinant of such a matrix· can be expressed as the product of the coefficients of the Fourier transform of the response vector

h . Since in the deconvolution algorithm using the Fermat transform the exact integer value of the determinant is needed, the determinant cannot be calculated simply by multiplication of the Fourier coefficients of the response. The Fourier transform is used only formally. For details we refer to [3].

Let us illustrate the algorithm by an example with 8N . Applying DFT to vector h we obtain bit reversed vector

0

1

2 2 2 22

2 2 2 23

2 3 2 34

2 3 2 35

3 2 3 26

3 2 3 27

1 1 1 1 1 1 1 1

1 1 1 1 1 1 1 1

1 1 1 1

1 1 1 1

1 1

1 1

1 1

1 1

h

h

hW W W W

hW W W W

hW W W W W W

W W W W W W h

W W W W W W h

W W W W W W h

0 1 2 3 4 5 6 7

0 1 2 3 4 5 6 7

0 2 4 6 1 3 5 7

0 2 4 6 1 3 5 7

2 30 4 1 5 2 6 3 7

2 30 4 1 5 2 6 3 7

3 20 4 1 5 2 6 3 7

3 20 4 1 5 2 6

h h h h h h h h

h h h h h h h h

h h h h h h h h W

h h h h h h h h W

h h h h W h h W h h W



h h h h W h h W

3 7h h W

00

01

20 1 2

20 1 3

2 340 1 2 3

2 350 1 2 3

3 2 360 1 2 3

3 2 37

0 1 2 3

1 group

2 group

3 group

4 gro

a .Xb .Xc c W X

.c c W X

Xd d W d W d W

Xd d W d W d W.

Xd d W d W d WXd d W d W d W

up

The numbers in the first and the second group are real. In the third group we multiply the numbers 2 3X ,X

and in the fourth the numbers 4 5 6 7X ,X ,X ,X • The fact that 4 1W must be taken into account. Then

2 2 2 22 3 0 1 0 1 0 1X X c c W c c W c c

2 3 2 34 5 0 1 2 3 0 1 2 3X X d d W d W d W d d W d W d W

2 2 2 2 2 20 1 3 2 1 0 2 3 0 12 2d d d d d d d d W e e W

3 2 3 26 7 0 1 2 3 0 1 2 3X X d d W d W d W d d W d W d W

2 2 2 2 2 20 1 3 2 1 0 2 3 0 22 2d d d d d d d d W e e W

2 2 2 24 5 6 7 0 1 0 1 0 1X X X X e e W e e W e e

Generally, let us have the vectors 0 0 0X X Xn n n, , , ,K where for Xkn and 1Xk

n there is

1

2

2 1

1

0

kn

knk

n kn

kn

kn

j

j /

j /

X

XX

X

X

X

M

M

1

0

1

2 1

2

2

1

kn

kn

knk

n kn

kn

kn

j

j /

j /

X

X

XX

X

X

X

M

M

where n is the number of reduction 20 1 log 1n , , , N ; k K is the number of cyclic shifts with a

negative transition to the topmost position 10 1 2 2n nk , , ,N / ; j N / ; N K is the number of

components of the basic element of the group. Then for the reduced vector 01nX there holds

12

2 2

1

01

12

0 0 2 0 2 0 0

1

1 0 2 2 1 1 1

1 0 2 2 1 1 1

X

j

k ik k k kn n n n

i

n

j

i

n n n ni

X X j / X i X j i

X X j / X i X j i

M

M

Using the described procedure we can calculate the exact integer value of the determinant of the convolution

matrix which is used in the deconvolution algorithm.


Let us have the system

03 0 0 2 3

12 3 0 0 5

0 2 3 0 32

0 0 2 3 03

x

x

x

x

where the determinant 65D . We shall use the modulus 17M , which is the Fermat number and 4N . Then the transform matrices are as follows

1 1 1 1

1 4 16 13

1 16 1 16

1 13 16 4

T

and 1

1 1 1 1

1 13 16 413

1 16 1 16

1 4 16 13

T

Then

T th h mod M

1 1 1 1 3

1 4 16 13 2

1 16 1 16 0

1 13 16 4 0

mod M

5

11

1

12

It follows that

1 1 1 1 15 11 1 12T

, , , th mod M 7 14 110T

, , ,

We calculate the vector of the transformed output values

T m0 m0Y y

1 1 1 1 3

1 4 16 13 5

1 16 1 16 3

1 13 16 4 0

mod 17

11

3

1

14

Then there holds

0

11 7

3 14

1 1

14 10

X mod 17

9

8

1

4

Applying the inverse transform we obtain

0

1 1 1 1 9

1 13 16 4 813

1 16 1 16 1

1 4 16 13 4

x mod 17

14 14

15 15

8 8

6 6

D

D

mod 17

14

15

8

6

MD

D

mod 17

14

1514

865

6

mod 17

9

61

1065

16

where MD was calculated using relation (15). Let us suppose that the positive numbers are represented in the range 0 1 2, M / and the negative ones in the range 1 2 1 1M / ,M . Then we obtain the negative values for numbers greater than 1 2M / by subtracting the modulus. It follows that

0

8

61

765

1

x

1-st iteration step

Further we calculate

0 0

3 0 0 2 8 26

2 3 0 0 6 21 1

0 2 3 0 7 965 65

0 0 2 3 1 17

H ,

y x

1 0

195 26 13

325 2 191 1

195 9 1217

0 17 1

DM

m0y y y

m1 1y y mod M 13 19 12 1T

, , , mod 17 13 2 12 1T

, , ,

Further in the next iteration we obtain

T m1 m1Y y mod M 11 5 5 14T

, , ,

1 11 7 5 14 5 1 14 10T

, , , X mod 17 9 2 5 4T

, , ,

Using the inverse transform we calculate the vector 1x

11 1T x X mod M 5 3 2 16

T, , ,

Adjusting the elements greater than 1 2M / we obtain

1x 5 3 2 1T

, , ,

Then we calculate

1 1 13 19 12 1T

H , , , y x

1 12

113 19 12 1 13 19 12 1 0 0 0 0

17

T T T, , , , , , , , ,

M

y y

y

The zero vector 2y indicates the end of calculation. The resulting vector is then obtained by

0 1

1 117 8 6 7 1 17 5 3 2 1

65

T T, , , , , ,

D x x x

177 57 27 18

65

T, , ,

2-nd iteration step

2.2 Multidimensional error-free deconvolution using the Fermat number transform [5]

The periodical k-dimensional convolution of the finite sets 1 2, ,..., kx n n n and 1 2, ,..., kh n n n is defined by

1 2

1 1 1

1 2 1 2 1 1 2 20 0 0

, ,..., ... , ,... , ,...,k

N N N

k k k km m m

y n n n x m m m h n m n m n m

Analogously to one-dimensional case in [5] we have derived the error-free algorithm of k-dimensional deconvolution as well as algorithm to calculate the determinant of the k-dimensional convolution matrix.

2.3 Error-free deconvolution using polynomial algebra concept [6], [7]•polynomial algebra plays an important role in digital signal processing because convolutions can be expressed in terms of operations on polynomials ([2]). •polynomial algebra is the basis of one group of fast and error-free one- and multidimensional convolution algorithms.

Now suppose the elements of h m ,x l in [3] to be coefficients of the polynomials H z and X z of degree 1N . Hence we have

1

0

Nm

m

H z h m z

(21)

1

0

Nl

l

X z x l z

(22)

If we multiply H z by X z , the resulting polynomial will be of degree 2 2N . Thus

2 2

0

Nn

n

Y z H z X z y n z .

(24)

This means that the convolution of two sequences can be treated as the product of two polynomials. If the one-dimensional convolution defined by (1) is cyclic the indices n, m, l, are calculated modulo

N. This implies that mod 1NZ N and therefore the cyclic convolution can be considered as the multiplication of two polynomials

Y z H z X z (25)

where the powers of the polynomial variable z are calculated modulo N or by

mod 1NY z H z X z z .

Let us proceed to the multidimensional case of convolution. The k-dimensional cyclic convolution of the discrete input signal x and the response signal h is defined as

1

1

11

1 1 1 10 0

k

k

NN

k k k km m

y n , n , , h m , m x n m , ,n m

K K K K (26)

where 1 1 2 20 1 0 1 0 1k kn ,N , n ,N , , n ,N K and indices 1 1n m are calculated

modulo 1N the indices 2 2n m are calculated modulo 2N , etc. By considerations analogical with

those of the one-dimensional case the polynomial expression of the k-dimensional convolution is obtained in the form 1 2 1 2 1 2k k kY z ,z , ,z H z ,z , ,z X z ,z , ,z , K K K (27) where the powers of the variable 1z are calculated modulo 1N and the powers of the variable kz are

calculated modulo kN .

In the case of the convolution of the signals h, x we can compute the one-, resp. the k-dimensional output signal y using the relations (25), resp. (27). In the case of deconvolution, i.e. when we know the output signal y and the response signal h, the input signal can be formally expressed in polynomial form as 1X z Y z H z , (28)

re sp .

11 2 1 2 1 2k k kX z ,z , ,z Y z ,z , ,z H z ,z , ,z , K K K (29)

This implies that we have to execute operation of the inversion of the polynomials H z , resp.

1 2 kH z ,z , ,zK .

Let the vectors of the output signal, resp. the impulse response be

3 1 2 1T

y , , , ,

re sp .

1 4 2 0T

h , , , .

According to (5) there holds

2 3 23 2 1 4 2z z z z z X z .

From that we have

2 3

2

3 2

1 4 2

z z zX z .

z z

Multiplying the numerator and the denominator by the polynomial H z modulo 1Nz we obtain

2 3 2 2 3

2 2 2

3 2 1 4 2 3 9 4 5

1 4 2 1 4 2 5 12

z z z z z z z zX z .

z z z z z

In the following sections of the paper the product H z H z will be called the reduction of the polynomial H z with respect to the variable z . If we multiply the numerator and the denominator by the polynomial 25 12z , the resulting polynomial is

2 3 2 2 3

2 2

3 9 4 5 5 12 63 105 56 133

1195 12 5 12

z z z z z z zX z .

z z

Illustrative example for 1D deconvolution

Illustrative example for 2D deconvolution

Let the matrix of the two-dimensional output signal, resp. the two-dimensional impulse response be

3 2

1 4

,y ,

,

resp.

2 3

1 3

,h .

,

Using (27) for k = 2 we have

1 1 2 1 1 2 1 23 2 4 2 3 3z z z z z z X z ,z

or

1 1 2

1 2

1 1 2

3 2 4

2 3 3

z z zX z ,z .

z z z

By multiplying the numerator and the denominator by the polynomial 1 2H z , z i.e. by its reduction

with respect to the variable 2z , we get

1 1 2 1 1 2

1 2

1 1 2 1 1 2

3 2 4 2 3 3

2 3 3 2 3 3

z z z z z zX z ,z

z z z z z z

1 1 2

1

11 13 4 2

13 14

z z z

z

Now we multiply this result by the polynomial 113 14z

1 1 2 1

1 21 1

11 13 4 2 13 14

13 14 13 14

z z z zX z ,z

z z

1 1 239 15 24 30

27

z z z.

The resulting matrix x is then

39 241

15 3027

,x .

,

Let us assume that the dimensions of the convolution system 1 2 kN ,N , ,NK to be powers of 2. Then

the algorithm of inversion of the polynomial 1 2 kH z ,z , ,zK and its simultaneous multiplication by

the polynomial 1 2 kY z ,z , ,zK can be expressed as successive reductions of the response function

(autoconvolutions) and convolutions of 1 2 kY z ,z , ,zK with reduced response (crossconvolutions). The 1p th reduction of the response in the direction k is

11 2

1 2 1

111 1 2

1 11 2 1 1 2 1

0 0 0 0

2 2

kp

k

k k

N

NN Np p p p

k k k k k ki i i i

h l ,l , ,l , l h i ,i , ,i , i ,

K K K

1 2 1 1 kipk k kh m ,m , ,m ,m , K (30)

where 1 1 1 1mod m l i N ,

2 2 2 2mod m l i N ,

M 1 1 1 1mod k k k km l i N ,

12 2 mod p pk k k km l i N ,

1 1 2 2 1 1 10 1 0 1 0 1 0 1

2k

k k k p

Nl ,N , l ,N , ,l ,N , l , ,

K

and the number of the reduction 2 10 1 log kp , , , N . K •

Moreover, let 0

1 2 1 2k k kn i ,i , ,i y i ,i , ,i ,K K

where 1 1 2 20 1 0 1 0 1k ki ,N , i ,N , ,i ,N , K

Then the calculation of the 1p th reduction of the signal kn in the direction k is given by the

formula

11 2

1 2 1

111 1 2

11 2 1 1 2 1

0 0 0 0

2

kp

k

k k

N

NN Np p p

k k k k k ki i i i

n l ,l , ,l ,l n i ,i , ,i , i ,

K K K

1 2 1 1 kipk k kh m ,m , ,m ,m K (31)

where 1 1 1 1mod m l i N ,

2 2 2 2mod m l i N ,

M 1 1 1 1mod k k k km l i N ,

2 mod pk k k km l i N ,

1 1 2 2 1 10 1 0 1 0 1 0 1k k k kl ,N , l ,N , ,l ,N , l ,N K

and the number of the reduction 2 10 1 log kp , , , N . K

We repeat the algorithm (30), (31) until the number of reductions is 2 1log kN . • Then we continue

with the reductions of the response and the signal nk in the direction 1k (again using formulas (30), (31)). We proceed in this way until all the reductions of the response in all the directions are carried out, i.e., until the k-dimensional discrete signal of the response is reduced to the only non-zero element. It represents the value of the determinant of the system matrix kh hence

0 0 0rkD h , , , , K (32)

where

21

logk

ii

r N

The final solution is then given by relation

n

x kk D

3. Special linear systems•Toeplitz, Hilbert, Vandermonde

3.1 Error-free algorithm to solve integer Toeplitz system [8]

•symmetrical Toeplitz arises when solving overdetermined convolution system •we present an error-free form of Levison algorithm to solve integernonsymmetrical Toeplitz system. •it removes roundoff errors. Their accumulation can, when using classic algorithms, deteriorate or destroy the solution. •it retains the speed of original Levison [9] algorithms (proportional to ). 2N

0 1 2 1 1 1

1 0 1 2 2 2

1 2 3 0

N

N

N N N N N

a , a , a , , a x y

a , a , a , , a x y

a , a , a , , a x y

K

K

M M M M

K

1

1 2N

i j j ij

a x y , i , , ,N

K

(33)

or

(34)

For integer regular Toeplitz system one can write

1

1 2N

ji j i

j N

xa y , i , , ,N

D

K(35)

Levison derived the algorithm for fast solution of (33) using recursive procedure solving in each iteration step the system

1

1 2 1 2M

( M )i j j i

j

a v y , i , , ,M ; M , , ,N

K K(36)

In [8] we derived the algorithm for fast solution of integer Toeplitz system (35) using the following procedure

1 01

1 111

1

0 1 11

1

11 1 1

1

11 1 1

1

and

M( M )

M M j jj

M( M )

M M M j jj( M )

M M( M )

M M j M jj

M

M( M ) ( M )M M M M j j

j

M( M ) ( M )M M M j M j

j

D D a a b ,

y D a v

v ,a D a b

D

c a D a c

b a D a b

to determine quantities with the index M+1,

(37)

1 11 1 1

1 11 1 1

1 11 1 1

1

1

and

11 2

( M ) ( M ) ( M ) ( M )j j M M M j

M

( M ) ( M ) ( M ) ( M )j j M M M j

M

( M ) ( M ) ( M ) ( M )j j M M M j

M

v v D v b ,D

c c D c bD

b b D b c , j , , ,M ,D

K

to determine remaining components of the vectors v, c, b, and initial values1 1 1

1 0 1 1 1 1 1 1( ) ( ) ( )D a ,v y , c a ,b a .

(38)

(39)

From the above given formulas, one can see that all quantities are integers. After some iteration steps, however, their magnitudes increase rapidly, which complicates the realization of the calculation. To overcome this problem, we can proceed in several ways:

•to perform the calculation with these long integers what means to watch overflows; the calculation becomes more time consuming and inconvenient;

•to perform the calculation in floating point form, which means to give up the accuracy of the algorithm; for ill-conditioned systems, this can give rise to the well-known problems with stability of the algorithm; and

•to perform the calculation in modulo arithmetic; one does not need to watch overflows, and also the roundoff errors are removed.

When using modulo arithmetic, we carry out the calculation in different prime modulo classes , i = 1,2, . . . , r, so that it holds [10]

im

1 2

2 1 2 1

where P must satisfy the condition

2 1 y

where

1 1

1

r

N / N ( N ) / N

i

j

P m m m

P max N M( A ) ,N( N ) M( A ) M( ) ,

M( A ) max a , i N ,N ,

M( y ) max y , j ,N

K

(40)

The calculation in the given modulo class is independent of other modulo classes and thus this model of implementation is well suited for parallel computing. By inverse conversion from residual representation, employing Chinesse theorem [2], [9], one can calculate resulting vector x and determinant D.


1

2

3

1 1 2 1

3 1 1 3

2 3 1 1

, , x

, , x

, , x

p 1 2

n 1 2

a 0 3 2 0

a 0 1 2 0

y 1 3 1

b 1

c 3

v 1 1

T T

,

T T

T

T

T

T

a ,a , ,

a ,a , , ,

, ,

, ,

, ,

, , , D .

1 2 35 7 11m , m , m

Find the exact solution of the Toeplitz system

We initialize the vectors and variables

With respect to (40) we choose the moduli

Carrying out the first iteration step yields

b 5 3 1 b 7 0 1 b 11 4 1

c 5 0 3 c 7 5 0 c 11 5 4

v 5 2 1 v 7 2 6 v 11 2 6

5 4 7 4 11 4

T T T

T T T

T T T

(mod ) , , ; (mod ) , , ; (mod ) , , ;

(mod ) , , ; (mod ) , , ; (mod ) , , ;

(mod ) , , ; (mod ) , , ; (mod ) , , ;

D(mod ) ; D(mod ) ; D(mod ) .

b 5 3 1 0 b 7 0 1 1 b 11 4 1 4

c 5 0 3 4 c 7 5 0 4 c 11 5 4 0

v 5 1 3 2 v 7 2 3 3 v 11 5 3 4

5 3 7 2 11 1

T T T

T T T

T T T

(mod ) , , ; (mod ) , , ; (mod ) , , ;

(mod ) , , ; (mod ) , , ; (mod ) , , ;

(mod ) , , ; (mod ) , , ; (mod ) , , ;

D(mod ) ; D(mod ) ; D(mod ) .

v 16 3 367 23T

, , , D .

1x 16 3 18

23

T, , .

After the second iteration step (M=N-1=2) the calculation in this example is finished. The resulting values of vectors and determinant are then as follows

By conversion of the vector v and the determinant D from residue class code one obtains

Positive numbers are represented in the range <0,(P-1)/2> and negative ones in the range

<(P-1)/2+1,P-1>. Then the negative values are, for the values being greater than (P-1)/2, calculated by subtracting P. In our example the element v(3)=367-385=-18 (367>5.7.11/2). Then the final solution is

3.2 An algorithm to solve Hilbert system of linear equations exactly [11]

•typical example of extremely ill-conditioned matrices are Hilbert matrices. In literature one can find the examples of Hilbert matrices with the elements

11

1i , ja ; i, j ,Ni j

1

11i , j

i j

a ; i, j ,Nd

•we shall consider more general Hilbert systems with elements

1 2 31 1

2 22 3 4

3 3

3 4 5

1 1 1

1 1 1

1 1 1

, ,b b b

x y

, , x yb b b

x y

, ,b b b

then for N=3 we can write

or after exchanging the elements of the vector x

3 2 1 0 1 21 3 1 0 1 2 1

2 2 2 1 0 1 24 3 2 1 0 1

3 1 3 2 1 0 3

5 4 3 2 1 0

1 1 1 1 1 1

1 1 1 1 1 1

1 1 1 1 1 1

, , , ,b b b s s s

y x z R , R , R z

y , , x , , z R , R , R zb b b s s s

y x z R , R , R z

, , , ,b b b s s s

3 2 2 1 2 3 2 10 2 1 1 2 2 0 2 1 0 1 2 1 0 1 2

1 1 1 1 2 C C

D

D DD

s s s s s s s s s s s s s s s s D

This represents Toeplitz system of linear equations. Assuming that all quantities are integers the determinant of the matrix can be written in the form of rational fraction. For N=3 one obtains

where left upper index C denotes numerator and D denominator. Without being interested in the numerator for denominator of the determinant in general case one can write

1 2 1 1 2 11 2 1 0 1 2 1

D N N NN N N ND s s s s s s s

K L

Analogously to integer Toeplitz system we have converted Hilbert system to Toeplitz one composed of rational fractions and subsequently we have derived the error-free algorithm to solve it. In this case we introduce pairs of expressions of all quantities for both numerator and denominator, respectivelyFirst let us determine quantities with the index M+1,

11 11 1 1

1 1 111 1 1

11 11 1 1

1 1 111 1 1

11

C ( M )CM MM i jD ( M ) C ( M ) D ( M ) M

M i M M D D ( M )ji M M M j M j

C ( M )CM MM i jD ( M ) C ( M ) D ( M ) M

M i M M D D ( M )ji M M M M j j

D ( MM

bDb s ; b b ;

s D s b

cDc s ; c c ;

s D s c

v

1

1 1 11 11 1 11

11 1

1

11 1 1

10 1

MD

C ( M )i C C Mj) D ( M ) C ( M ) D ( M )i M M

M M MM D D D ( M )jM M M j j

ii

C ( M )CM MM i jD C D M

M i M M D D ( M )ji M M M j j

yvy D

c ; v v ;y D s v

s

bDD s ; D D ;

s D s b

1

1 111 1 11 10

1 1 11 1 1

0

1

1111 10

1 1

0

j

D ( M ) D C C ( M ) C ( M ) C ( M )M ij M M j M M jD ( M ) D C ( M )i

j M jj C D D ( M ) D ( M ) D ( M )M M j M M j

ii

j

D ( M ) D C CM ij M MD ( M ) D C ( M )i

j M jj CM

ii

sb D D b b c

b D ; b ;D D b b c

s

sc D D

c D ; cD

s

11 1

11 1 1

1 111 1 11 1 11

1 11 1 1 1

( M ) C ( M ) C ( M )j M M j

D D ( M ) D ( M ) D ( M )M j M M j

D ( M ) D C C ( M ) C ( M ) C ( M )D ( M )j M M j M M jD ( M ) D ( M ) C ( M )M

j j jD ( M ) C D D ( M ) D ( M ) D ( M )M M M j M M j

c c b;

D c c b

v D D v v bvv c ; b ;

c D D v v b

j

1 1 1 1 1 11 0 1 1 1 1 1 1 1 1 1 1

1 2

where

1 1 1D C D ( ) D C ( ) C D ( ) C ( ) D ( ) C ( )

, , ,M ,

D s ; D ; v y ; v y ; b s ; b ; c s ; c ;

L

Then we determine remaining components of the vectors v, c, b, and initial values

3.3 Error-free algorithm to solve Vandermonde system of linear equations [12]

Let us have Vandermonde system

1 1

2 1 2 3 22 2 2 2

3 1 2 3 3

1 1 1 11 2 3

1 1 1 1

N

N

N N N NN N N

y x

y a a a a x

y a a a a x

y a a a a x

A x

L

L

L

M M M M M M

L

1

1 1

N N

i jj i j

D ( a a )

1

11

N Nki

j j ,kki ,i j j i

x aP ( x ) B x

a a

(41)

It is well known that the determinant D of the matrix A is

and the elements of inverse matrix can be expressed using polynomial coefficients [12]

12 1

1

NN N

i Ni

P( x ) ( x a )(mod M ) x c x c x c (mod M )

L

Analogously to the algorithm given in [13] we determine coefficients of the polynomial

(42)

(43)

(44)

Vandermonde system arises when solving polynomial interpolation problem

Polynomials from (43) modulo M can be expressed

1

1

1 1

1

Nk

j ,kk

j N Nj

j i j ii ,i j i ,i j

z xP( x )

P ( x ) (mod M ) (mod M )( x a )

( a a ) ( a a )

1

1 1

1

1 1 1

j ,N

j ,N N j ,N j

j ,N i N i j ,N i j

z ,

z ( c z a )(mod M ),

z ( c z a )(mod M )

i ,N , j ,N .

M

1

11 1 2 11

2 1 2 2 221

1 2

1

0 0

0 0A B

0 0

where

, , ,N

, , ,NM

N , N , N ,NN

N

j j ii ,i j

z z zd

z z zd(mod M ) (mod M )

z z zd

d ( a a )(mod M )

LL

LL

M M MM M M

LL

Then one can derive

Resulting inverse matrix modulo M is

We can employ iterative scheme of error-free solution of linear equation system from section 2.3.

(45)

(46)

(47)


1

2

3

4

11 1 1 0

1 3 5 4 1

1 9 25 16 2

1 27 125 64 0

x, , ,

x, , ,

x, , ,

, , , x

The determinant of the matrix, according to (42), is -48. We shall use modulus M = 5. Then

DM = 2. Using the relations (45), (46), (47) yields inverse matrix in modulo class 5

BM

24 0 0 00 4 0 00 0 8 00 0 0 3

60 47 12 120 29 10 112 19 8 115 23 9 1

5

0 2 3 10 1 0 41 3 4 20 4 3 3

, , ,, , ,, , ,, , ,

, , ,, , ,, , ,, , ,

(mod )

, , ,, , ,, , ,, , ,

Then the procedure to calculate exact solution is as follows.

a. ym0 = y=[0,1,2,0]T .

b.

x = B ymM M 0(mod )

, , ,, , ,, , ,, , ,

(mod )5

0 2 3 10 1 0 41 3 4 20 4 3 3

0120

5

3110

c. x0 =DMxM (mod 5)=2.[3,1,1,0]T =[1,2,2,0]T.

d. x = [1,2,2,0] T.

e. We calculate vectors

y = Ax0'

0

1 1 1 11 3 5 41 9 25 161 27 125 64

1220

51769305

, , ,, , ,, , ,, , ,

y =115

04896

0

51769305

1133361

ym1 = [4,2,2,4]T.

f. Let j = 1.

g.

x = B ym1 M 1(mod )

, , ,, , ,, , ,, , ,

(mod )5

0 2 3 10 1 0 41 3 4 20 4 3 3

4224

5

4311

Suppose that positive numbers are represented in the range <0,(M-1)/2> and negative

numbers in the range <(M-1)/2+1,M-1>. Then negative values are, for numbers greater

than (M-1)/2, obtained by subtracting the modulus

x1 = [-1, -2, 1, 1]T.

h.

x =

1220

5

12

11

48

75

i. Calculate vectors

y1’ = Ax1 = [-1, 2, 22, 134]T

y2 = 5-1 [0, 3, -11, -39] T ym2 = [0, 2, 4, 1]T.

Now we continue in the next iteration (j = 2) in the step g.

g. x2 =BMym2 (mod 5) = [2,1,4,3]T .

After adjustement of elements greater than (M-1)/2 we have

x2 = [2,1,-1,-2]T .

h.

x =

48

75

5

21

12

4617

1845

2


y2’ = Ax2 = [0,-8,-46,-224]T ,

y3 = [0,1,7,37] T,

ym3 = [0,1,2,2]T.

We continue in the next iteration (j = 3) in the step g.

g. x3 =[0,-1,0,1]T .

h.

x =

4617

1845

5

01

01

4610818

80

3


y3’ = [0,1,7,37]T,

y4 = [0,0,0,0]T.

The zero vector y4 indicates the end of iterations. The result is

x = 1-48

4610818

80

.

4. General linear systems

4.1 Parallel error-free algorithm [14, 15]

1

11 1 2 1 3 1 1 1

2 1 2 2 2 3 2 2 2

1 2 3

1 2 or

Ax=y

N

m

, , , ,N

, , , ,N

N , N , N , N ,N N N

y( n ) a( n,m )x( m ), n , , ,N

a , a , a , , a x y

a , a , a , , a x y

a , a , a , , a x y

L

K

K

M M M M

K

1i(mod m ) , i ,rAx =y

Let us have system of linear equations, where all the elements of the matrix A and vector y are integers

Let us split the calculation into several residue classes satisfying.

1 2

2 1 2 1

where P must satisfy the condition

2 1 y

where

1

1

r

N / N ( N ) / N

i , j

i

P m m m

P max N M( A ) ,N( N ) M( A ) M( ) ,

M( A ) max a , i, j ,N ,

M( y ) max y , i ,N

K

(48)

Then in each residue class we can carry out separately the calculations.

Calculations in residue classes can be carried out in parallel

Gauss-Jordan elimination in residue class m can be described by the following procedure

Let

0, , (mod ); 1,2, , , 1, 2, , 1 .i j i ja a m i n j n K K

For 1, 2, ,k n K (k denotes elimination step) do:

1. search for such fi c f k that ,1 0, ,i ka c i

2 . 1

1 1, , 1 ,1 (mod ), 1,2, , 1 .k k k

i j i j ia a a m j n k

K (49)

3 . fo r 1,2, ,l n K a n d l i d o :

1 1, , 1 ,1 ,1 (mod ), 1,2, , 1 .k k k k

l j l j l ia a a a m j n k

K (5 0 )

The determinant in the residue class m is obtained as the modulo product of the elements 1,1k

ia (see (49))

1,1

1

(mod ) 1 (mod ),n

J km k

k

D m D a m

(51)

where J is the total number of exchanges of the elements kc which create an increasing sequence. Then,

the sought solution of (48) in the residue class m is given by

1(mod ) (mod ) (mod ).m m m m

Dx m x x m D x m

D D

(52)

It can be noticed that by executing each elimination step the columns of the matrix are shifted to the left (the contents of the rightmost column being irrelevant), so that after finishing the elimination the vector

(mod )x m is placed in the first column of the memory matrix (elements ,1, 1, 2, ,ia i n K ).

The described algorithm can be easily implemented in the residual processor, presented in block diagram in Fig. 2.

The residual processor contains the memory matrix M, the arithmetic units AUD , AU2 to AUn+2, and the unit denoted as ROMs. A jth column of the memory matrix is connected by its serial output to AUj•

The unit ROMs comprises two ROM memories, one with negative numbers and one with inverse

numbers in the residue class m. From the first column of the matrix M the values 1,1k

ia (see (49)) and

1,1k

la (see (50)) are fed through the e bit internal data bus IDB to the unit ROMs and are used as

addresses to find the appropriate inverse and negative numbers. The inverse and negative numbers are supposed to be written into each residual processor’s ROMs beforehand in correspondence to the ascending sequence of the addresses.

The arithmetic units AU2 to AUn+2 serve to perform the calculations (49), (50). i.e. they are designed to carry out the operations of the residue class addition and multiplication.

Finally. the arithmetic unit AUD performs the calculation of the determinant D modulo m according to (51).

The shift of the columns to the left in the elimination step is achieved by interconnec tions of the matrix columns and the arithmetic units by serial input and output buses SI, S O .

Conversion from residue class code can be done according to the following algorithm:

After the calculation of the solution of a linear equations set in the modular arithmetic is terminated the 1n element vectors

1(mod ), (mod ), , (mod ) , 1, 2, ,k k n kD m x m x m k rK K (53)

are prepared in the arithmetical units 2 3 n+2AU , AU , , AUK of the residual processors

.1 2 rRP , RP , , RPK The resulting integer values D, 1 2, , , nx x xK can be obtained by the inverse conversion

of the vectors (53) from the residue class code. We shall assume that the resulting values are formed in the floating point processor FFP in the control unit CU as the 1n -element vector of the floating point numbers.

Since we use the known algorithm (given e.g. in [16]) we describe it very briefly from mathematical point of view. We suppose we have a vector of moduli

0 1 2, , , rm m m K (54)

and a vector of the residue representation of the integer jx

,1 1 2mod , mod , , mod .j j j j rt x m x m x m K (55)

The elements of the vector (54) are written in the ROM memory and the elements of vector (55) in the arithmetical units 1AU j of the residual processors , 1, 2, ,RPk k r K . Further. let us denote

1 2, , ,k k k rm m m K

, , , ,, 1 , ,j k j k j k j kt t k t k t r K ,

, , ,mod , , mod .j k j k k j k rd t k m t k m K

If we define the vector , 1j kt to be

1, 1 , , mod ,j k j k j k k kt t d m

then the resulting sought number jx is given by

,1 1 ,2 2 ,3 1 ,1 2 3 .j j j j r j rx t m t m t m t r L (56)

Repeating the computation using this algorithm for 1,2, , 1j n K in the control unit CU we obtain the

resulting vector 1 2, , , nx x xK and determinant D . Let us illustrate the given algorithm by an example with

0 3,5,7 , and ,1 1, 2,5jt - see Table 1.

Table 1

0 1 3m 2 5m 3 7m operation

,1jt 2 5

,1jd 1 1 1 subtraction

,1 ,1j jt d 0 1 4 1

1*m 2 5 multiplication

,2jt 6

,2jd 2 2 subtraction

,2 ,2j jt d 0 4 1

2*m 3 multiplication

,3jt

Then, using (56), the number jx is

,1 1 ,2 2 ,31 2 3 1 3 2 5 5 82.j j j jx t m t m t

1

2

5

4.2 Sequential algorithm to solve general system of linear equations [17]

So far we have mentioned two basic algorithm classes to solve linear equations exactly•iterative algorithms (applied in section 2.1 – Fermat transform, section 3.3 – Vandermonde

system)•parallel algorithms (applied in section 3.1 – Toeplitz system, section 4.1 – general system of

linear equations)

In this section we want to demonstrate that sequential iterative algorithm can be applied also for general systems of linear integer equations. Let us define the following algorithm:

A lg o r i th m A

a . L e t m0y y.

b . C a lc u la te v e c to r xM

1m0x A yM mod M .

c . I t h o ld s

0x xM MD mod M .

d . L e t 0x x .

e . C a lc u la te v e c to r s 0

'0y Ax ,

1

1 'm0 0y y yD ,

M

1m1y y mod M .

If 1y equals zero vector the calculation is finished, if not continue in f.

f. 1j .

g. Calculate 1

m1x A yj mod M . h. For 0 1 1i , , ,N K (N is the size of vectors x, y ) calculate

jjx i x i x i M .

i. Calculate vectors '

j jy Ax ,

'

j jj+1

y yy ,

M

m,j+1 j+1y y mod M .

j. If for all 10 1 1 0ji , , ,N , y i , K finish the calculation. If not, increment j and repeat

the algorithm from point g. on. The resulting solution of system (1) is simply calculated as

x

x .D

(57)

(58)

Then determinant of the system of linear equations can be calculated using the formula

11

01

jN

jij

D a j, j a j,i k i ,

where the coefficient vectors jk are calculated by exact solution of sets

1 1 2 1jkj , j j ,A A j , , ,N , K

using the algorithm A

Next problem in the algorithm A is calculation of the inverse matrix in modulo arithmetic. Let us define iterative procedure where:

1

1 11 1 1

j , j j ,

j , j, j ,

A AA ,

A A

1 1

11

1 1 1j , j

j , j j ,

, j ,

B BA mod M ,

B B

Then one can derive

1111 1 1 1 1, , , j j , j j ,B A A A A mod M ,

11 1 1 1j , j , j j , ,B A A B mod M ,

11 1 1 1, j , , j j , jB B A A mod M ,

1 11 1 1 1j , j j , j j , , , jB A B B B mod M .

(59)

(60)

(61)

(63)

4.3 Sequential error-free algorithm to solve system of polynomial equations [18]

Sequential iterative algorithm can be successively applied also for polynomial systems of linear integer equations:

A x ys s s , 1 1 1 1 0p pi , j i , j i , j i , ja s a s a s a , L 1 1 1 1 0p p

i i i iy s y s y s y , L(64)

Assuming the coefficients being integers, the solution of the system given by (64) can be expressed as

1

0 1

11 1 1x x x x

kp p pks s s s s s s ,

D s

L (65)

and vectors 0 1ix ,i , , ,k K are:

0 10 1ix x x xm

i i ims M s M s M s , L (66)

where M is prime modulus, D is the determinant of matrix A and m, k are finite integers. Let us suppose for the moment that the determinant D s (polynomial) is known. We denote polynomial modulus as 1pP s s . The algorithms for the determinant and inverse matrix of polynomial system can be derived analogously to relations (59-63). For details we refer to [18]. New problem is the calculation of inverse polynomial. In what follows we outline this algorithm.

We have to find polynomial b s to polynomial a s so that

mod 1a s b s P s (mod M ) (67)

According to [2], the cyclic convolution using polynomial algebra concept is

1

0

p

l i l ii

c a b ,

(68)

where indices l i are calculated modulo p

From what is given above, it follows that solution of equations (6 7 ) represents the solution of linear system

00 1 1 1

1 0 2 1 0

1 2 0 01

ba , a p , a

a , a , a b(mod M ) ,

a p , a p , a b p

L

L

M M M MML

(69)

in residual class M.

Inverse polynomial b s can be obtained by successive reductions of polynomial a s or vector

a . For the r+1 -st reduction it holds

2 1

1 1

0

2 2 1

rN /ir r r r r

i

a l a i a m (mod M ),

where

11

2 2 0 12

r rr

Nm l i (mod N ); l , ,

and number of reduction 20 1r , , ,log N K . Let

0 1 0 0nT

, , , , L and

2 1

1

0

2 1 (

rN /ir r r r

i

n l n i a m mod M ),

where

2 0 1rm l i (mod N ); l ,n ,

and number of reduction 20 1 log 1r , , , N . K Resulting solution of polynomial inversion is

0

nb

t

t(mod M ),

a

where 2logt N .


Let us have linear system consisting of polynomials:

2

2

1 3 1

2 1 2x

s , s ss .

s, s

We use polynomial modulus 3 1P s s

and numerical modulus 5M . We assume that we know determinant 3 25 1D s s s s , determinant in residue class P s and residue class M

3mod 1 mod 5 2Dpm s D s s s , (70)

and inverse matrix in residue class P s and residue class M

2

1 3

2 2

3 4 2 3mod 1 mod 5

3 2 3

s , s sB s A s s .

s s , s s

We calculate x p s

2 2 2

3

2 2

3 4 2 3 11 5

3 2 3 2 0x p

s , s s s ss mod s mod .

s s , s s

Using Dpm s from (70), we get:

2 2

300

21 5

0 0x

s ss Dpm s mod s mod .

We initialize vectors and control variables

5 4 3 22

3 2

3 2

5 2 4 115 1

2 2 10 2 2T y

s s s s sss s D s s s s ,

s s s

0 0j , k ,

22 1

0x

ss .

We calculate vectors

3 22

00 2 3

1 3 2 2 12 1

2 1 0 4 2'00y x

s , s s s sss A s s ,

s, s s s

5 4 2

0

3 2

5 6

2 10 2'00T T y

s s ss s s M ,

s s

3 2 2

2 2

1 5 51

5 5 10 201

Ty

s mod s s s s ss ,

s s

2

2

4 45

3p,01 01y y

s ss s mod .

s

The vector 001y s , thus we set k = 1

3 1 501 p,01x ys B s s mod s mod

22

3

2 2 2

4 43 4 2 3mod 1 mod 5

3 2 3 3

s ss , s ss

s s , s s s

4 5mod 5

0 0

s.

N O T E . If any coefficient of the solution is greater than 1 2M / , it is considered negative and in accordance with rules of modular arithmetic its negative value is calculated by subtracting modulus M.

We continue with

2 2

12 1 2 5 15

00 0x

ss s ss .

Further, we calculate appropriate vectors

2

2 2

1 3

2 1 0 2'01 01y x

s , s s s ss a s s ,

s, s s

5 4 2

1

3

5 55

2 2'01T T y

s s s ss s s ,

s

3 1

502

Ty 0

s mod ss ,

p,02y 0s .

The vector p,02y 0s . The vector 0T s , therefore, we set control variables 1 0j , k .. Then we calculate

2 2

3

55

1 2 3p,10

TT y T

s s s ss ; s s mod .

s

Further we calculate next particular solution

2 2

3

2 2

3 4 2 31 5 5

3 23 2 3 310x

s ss s s ss mod s mod mod ,

s s s s

which we add to final solution

4 22

3

3

2 6 12 5 11

20 2 2x

s s s ss ss s .

s

Again we repeat and calculate

2

2

1 3 5

2 1 2 2'10 10y x

s , s s s ss A s s ,

s, s

05'10T T y 0s s s ,

11y 0s ,

p,11y 0s ,

The vector 11y s as well as vector T s equal zero vectors what means that we finish the

calculation. The resulting solution is

4 2

3 2 3

2 6 11 1

5 1 2 2x x

s s ss s .

D s s s s s

5. Volterra systems

Direct generalization of convolution systems for nonlinear systems

1 1 2 1 2 3

1 1 1 1 1 1

1 1 1 2 1 2 1 2 3 1 2 3 1 2 30 0 0 0 0 0

1 2 3

( ) ( ) ( ) ( , ) ( ) ( ) ( , , ) ( ) ( ) ( )

( ) ( ) ( ) , 0,1, , 1

N N N N N N

l l l l l l

y j h l x j l h l l x j l x j l h l l l x j l x j l x j l

y j y j y j j N

1 2

1 2

1 1 1 1

1 2 1 2 1 20 0 0

i

i

N N N Nll ll

i i i i il l l l

X( z ) x( l )z ; H ( z ,z , ,z ) h ( z ,z , ,z )z z z

L L L L

(71)

Several problems:1. Determination of Volterra kernels [19]2. Calculation of the output of the nonlinear Volterra filter [20]3. Determination of inverse kernels to the given nonlinear Volterra system [21]Let us again employ polynomial algebra

Then for the Volterra kernel of the i-th degree we can write

1 2 1 21 2

1 2

pi i i i

i ii

Y ( z z z ) Y ( z ,z , ,z )H ( z ,z , ,z )

X ( z )X ( z ) X ( z ) D

L LL

L (72)


Let N=2, i=2 and

3 4 1 22y x=T T

, ; ,

1 2 1 2 1 22 1 2

1 2 1 1 2 2

1 2 1 2 1 2 1 22 2 2

3 4 3 4 1 2 1 2

1 2 1 2 1 2 1 2 1 2 1 2

19 14 14 16 19 14 14 16

1 2 9

z z ( z z )( z )( z )H ( z ,z )

( z )( z ) ( z )( z )( z )( z )

z z z z z z z z

( )

Then according to (72)

The resulting kernel of the second degree is

2

19 141

14 169h

The product X(z)X(-z) is called the reduction of the polynomial X(z) with respect to the variable z.

Conclusion

In the contribution we have presented several error-free algorithms to solve one- and multidimensional convolution systems based on Fermat number theoretical transform. The main significance of the use of modulo arithmetic is eliminating rounding-off and truncation errors.

In the contribution we have derived the error-free algorithm to solve convolution systems that is based on polynomial algebra concept. This form of k-dimensional deconvolution algorithm permits to express it as a sequence of k-dimensional “autoconvolutions” of the response signal and “cross-convolutions” of the response and the output signal.

Further in the contribution we have extended the error-free algorithms to special linear systems, e.g. Toeplitz, Hilbert, Vandermonde systems. We continued with general systems of linear equations. In principle the algorithms proposed in the contribution can be divided into two groups

•iterative (using one modulus – one residue class)•parallel (using several moduli – several residue classes and Chinese theoremWe have extended the iterative algorithm also for the linear system of polynomial equations. At

the end of the contribution we outlined the possible extension of the application of error-free algorithms for nonlinear Volterra systems.

References:[1] N. Ahmed and K.K. Rao: Orthogonal Transforms for Digital Signal Processing. Springer-Verlag,

1975. [2] H. J. Nussbaumer: Fast Fourier Transform and Convolution Algorithms. Springer-Verlag, 1981.[3] M. Morháč: Precise Deconvolution Using the Fermat Number Transform. Computer and

Mathematics with Applications. Vol. 12A, No. 3, pp. 319-329, 1986. [4] Morháč M.: Error-free deconvolution based on cyclic determinant calculation approach.

Computer Mathematics, Vol. 49, pp.41-51, 1993, [5] M. Morháč: k-dimensional Error-free Deconvolution Using the Fermat Number Transform.

Computers and Mathematics with Applications. Vol.18, No.12, pp.1023-1032, 1989. [6] M. Morháč: Precise Multidimensional Deconvolution Using the Polynomial Algebra Concept.

International J. Computer Mathematics, Vol.32, pp.13-26, 1990. [7] M. Morháč, V. Matoušek: Exact algorithm of multidimensional circulant deconvolution. Applied

Mathematics and Computation, Vol. 164/1 (2005), 155-166.

[9] R. E. Blahut: Fast Algorithms for Digital Signal Processing, IBM Corporation, Owego, NY, 1985.

[10] M. Newman: Solving equations exactly, Nut. Bureau Standards 71B:171-179 (1967).

[8] M. Morháč: An error-free Levison algorithm to solve integer Toeplitz system. Applied Mathematics and Computation, Vol. 61, No.2-3, pp. 135-149, 1994.

[11] M. Morháč: An algorithm to solve Hilbert systems of linear equations precisely. Applied Mathematics and Computation, Vol. 73, No.2-3, Dec. 95, pp.209-229.

[13] W. H. Press et al.: Numerical Recipes, Cambridge University Press, Cambridge, 1986.

[12] M. Morháč: An iterative error-free algorithm to solve Vandermonde systems. Applied Mathematics and Computation. Vol. 117/1, 2001, pp.45-54.

[14] M. Morháč, R. Lórencz: A modular system for solving linear equations exactly. I. Architecture and numerical algorithms.Computers and Artificial Intelligence, Vol.11, No.4, 1992, pp.351-361.[15] R. Lórencz, M. Morháč: A modular system for solving linear equations exactly. II. Hardware realization.Computers and Artificial Intelligence, Vol.11, No.5, 1992, pp.497-507[16] R.T.Gregory, E.V.Krishnamurthy: Methods and applications of error-free computation. Springer-Verlag, New York, Berlin, Heidelberg, Tokyo 1984.[17] M. Morháč: One-Modulus Residue Arithmetic Algorithm to Solve Linear Equations Exactly. Mathematical and Computer Modelling, Vol.19, No.12, pp.95-105, 1994. [18] M. Morháč: An Error-Free Algorithm to Solve Linear System of Polynomial Equations. Mathematical and Computer Modelling, Vol.19, No.12, pp.85-93, 1994.[19] M. Morháč : Fast Error-free Algorithm for the Determination of Kernels of the Periodical Volterra Representation.Applied Mathematics and Computation, Vol.38, No.2, 1990, pp. 87-101. [20] M. Morháč: A Fast Algorithm of Nonlinear Volterra Filtering.IEEE Transactions on Signal Processing, Vol.39, No.10, 1991, pp. 2353-2356. [21] M. Morháč: Determination of Inverse Volterra Kernels in Nonlinear Discrete Systems. Nonlinear Analysis, Theory, Methods & Applications. Vol.15, No.3, pp.269-281, 1990.

Documents

ACAT2007, April 23-27, Amsterdam, Netherlands