BIT 7 (1967), 175-190

ON THE GENERALISATION OF BAIRSTOW'S METHOD

G. M. BIRTWISTLE and D. J. EVANS

Abstract. This paper examines the possibility of improving Bairstow's algorithm for finding the roots of polynomials with real coefficients. It discusses the removal of factors of any chosen order and the speeding up of the convergence rate. The case of multiple factors is also examined.

Introduction.

The problem of finding the roots of a given polynomial is of great scientific importance. This can be judged from the widespread literature on the subject, from which we select, as background material to accompany this paper, the accounts of Wilkinson [1] and Ostrowski [2]. Two of the most used processes are those of Newton-Raphson and Bairstow. These are in fact special cases of a single method for extracting factors of any permissible degree from a given polynomial. This paper examines the feasibility of extracting factors of order $m$ from a polynomial of degree $n$ with real coefficients ($m$ will be in the range $1 \le m \le \frac{1}{2}n$, as a factor of order $p > \frac{1}{2}n$ can be obtained by putting $m = n - p$, and with less work). The process of finding a factor of degree $m$, which is outlined in § 1, is henceforth called "Bairstow(m)", and its convergence is of second order for factors of unit multiplicity. Processes of different orders of convergence, either single-step or double-step, can be developed from Bairstow(m) quite simply, and the two of most interest have convergence rates of order three.

1. A Description of the Bairstow(m) Algorithm.

Let $f(x)$ be the polynomial of degree $n$ whose roots are being sought, and $a(x)$ be a non-exact factor of degree $m \le \frac{1}{2}n$. This section outlines the process by which $a(x)$ is transformed into an "exact" factor $\alpha(x)$ of multiplicity one for any $m$ in the quoted range (or indeed, less usefully, in the range $1 \le m < n$).

Division of $f(x)$ by $a(x)$ yields the equation

$$f(x) = a(x)g(x) + b(x), \tag{1}$$


where the degrees of the polynomials $f(x)$, $a(x)$, $g(x)$ and $b(x)$ are $n$, $m$, $n-m$ and $s$ ($0 \le s \le m-1$; if $b(x)$ vanishes identically, then $a(x)$ is an exact factor of $f(x)$). We take the coefficients of $x^i$ (for any permissible $i$) in (1) to be $f_i$, $a_i$, $g_i$ and $b_i$ respectively, and assume without loss of generality that $f_n = a_m = g_{n-m} = 1$. Partial differentiation of (1) with respect to each $a_i$ $(i = 0, 1, \ldots, m-1)$ in turn gives

$$x^i g(x) = -a(x)\frac{\partial g}{\partial a_i} - \sum_{j=0}^{m-1}\frac{\partial b_j}{\partial a_i}x^j \qquad (i = 0, 1, \ldots, m-1). \tag{2}$$

In words, (1) and (2) show that taking any polynomial $a(x)$ of degree $m$ and dividing it into $f(x)$ yields a quotient $g(x)$ and the $m$ quantities

$$b_0, b_1, \ldots, b_{m-1}$$

each of which is a function of $a_0, a_1, \ldots, a_{m-1}$. By multiplying $g(x)$ in turn by $1, x, x^2, \ldots, x^{m-1}$ and dividing the resulting polynomial by $a(x)$ we can obtain all the $m^2$ quantities $\partial b_j/\partial a_i$ $(i, j = 0, 1, \ldots, m-1)$.
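The division scheme just described can be sketched in Python as follows. This is a minimal illustration rather than the authors' program: the names `divide_monic` and `remainder_jacobian` are hypothetical, coefficients are stored lowest power first, and $a(x)$ is assumed monic as in the text. By (2), the remainder of $x^i g(x)$ on division by $a(x)$ is the negative of $\sum_j (\partial b_j/\partial a_i)x^j$, so the derivatives are read off with a change of sign.

```python
def divide_monic(f, a):
    """Divide f(x) by monic a(x); coefficients are listed lowest power first.

    Returns (quotient, remainder), the remainder padded to deg(a) entries.
    """
    work = [float(c) for c in f]
    m = len(a) - 1                      # degree of the trial factor
    quotient = [0.0] * max(len(work) - m, 0)
    for k in range(len(work) - 1, m - 1, -1):
        c = work[k]                     # current leading coefficient
        quotient[k - m] = c
        for j in range(m + 1):          # subtract c * x^(k-m) * a(x)
            work[k - m + j] -= c * a[j]
    head = work[:m]
    remainder = head + [0.0] * (m - len(head))
    return quotient, remainder


def remainder_jacobian(f, a):
    """Return g, b and J with J[j][i] = d b_j / d a_i, using equation (2)."""
    g, b = divide_monic(f, a)
    m = len(a) - 1
    J = [[0.0] * m for _ in range(m)]
    for i in range(m):
        shifted = [0.0] * i + g         # coefficients of x^i * g(x)
        _, rem = divide_monic(shifted, a)
        for j in range(m):
            J[j][i] = -rem[j]           # sign change from equation (2)
    return g, b, J
```

For instance, `remainder_jacobian([2, 3, 3, 3, 1], [0.9, 0.1, 1])` divides $x^4 + 3x^3 + 3x^2 + 3x + 2$ by the trial factor $x^2 + 0.1x + 0.9$ and returns the quotient, the two remainder coefficients and the $2 \times 2$ array of derivatives.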

We use these calculable $m^2 + m$ quantities involving the $b$'s, each a function of $a_0, a_1, \ldots, a_{m-1}$, to produce a better estimate for the exact factor $\alpha(x)$, which is taken in the form

$$\alpha(x) = \alpha_0 + \alpha_1 x + \alpha_2 x^2 + \cdots + \alpha_{m-1}x^{m-1} + x^m.$$

We now define $\delta_k$ by the relations

$$\delta_k = \alpha_k - a_k \qquad (k = 0, 1, \ldots, m-1) \tag{3}$$

and by Taylor's Theorem note that

$$b_j(\alpha) = b_j(a) + \sum_{k=0}^{m-1}\delta_k\frac{\partial b_j(a)}{\partial a_k} + \frac{1}{2}\sum_{r=0}^{m-1}\sum_{s=0}^{m-1}\delta_r\delta_s\frac{\partial^2 b_j(a)}{\partial a_r\,\partial a_s} + \cdots \qquad (j = 0, 1, \ldots, m-1). \tag{4}$$

(4) comprises a finite set of infinite expansions for which the solutions $\delta_k$ $(k = 0, 1, \ldots, m-1)$ express exactly the difference between the true $\alpha$-coefficients and the approximate $a$-coefficients. This set is obviously not solvable directly, so we seek a linear approximation to $\delta_k$ obtained from (4) by omitting all terms in $\delta$ of higher order than unity. Let the exact solution to this truncated set of equations be $\delta_k^*$ $(k = 0, 1, \ldots, m-1)$. We note that

$$b_j(\alpha_0, \alpha_1, \ldots, \alpha_{m-1}) = b_j(\alpha) = 0 \qquad (j = 0, 1, \ldots, m-1) \tag{5}$$

for the $\delta_k$ which we are seeking (in practice $|b_j(\alpha)| < \varepsilon$, a pre-assigned quantity). For brevity we put

$$b_j(a_0, a_1, \ldots, a_{m-1}) = b_j$$

and the $\delta_k^*$ are calculated from the equation

$$0 = b_j + \sum_{k=0}^{m-1}\delta_k^*\frac{\partial b_j}{\partial a_k} \qquad (j = 0, 1, \ldots, m-1). \tag{6}$$

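For a process of order $m$ this involves the solution of $m$ equations in the $m$ unknowns $\delta_k^*$, and as $m$ increases this becomes increasingly costly to compute. Substituting for $b_j$ from (6) into (4), in which the left hand side vanishes by (5), and rearranging yields

$$\sum_{k=0}^{m-1}\delta_k^*\frac{\partial b_j}{\partial a_k} = \sum_{k=0}^{m-1}\delta_k\frac{\partial b_j}{\partial a_k} + \frac{1}{2}\sum_{r=0}^{m-1}\sum_{s=0}^{m-1}\delta_r\delta_s\frac{\partial^2 b_j}{\partial a_r\,\partial a_s} + \cdots \tag{7}$$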

The true solution $\delta_k$ and our approximation $\delta_k^*$ will not in general be equal and so we must iterate towards the true solution. That this iteration process is of second order is demonstrated below. We have that

$$\sum_{k=0}^{m-1}\delta_k^*\frac{\partial b_j}{\partial a_k} = -b_j \qquad (j = 0, 1, \ldots, m-1).$$

Let $\Delta$ be the determinant with the number $\partial b_j/\partial a_k$ as the element in its $j$th row and $k$th column. Cramer's Rule gives us that

$$\Delta\,\delta_r^* = \Delta_r \tag{8}$$

where $\Delta_r$ is the determinant obtained from $\Delta$ by substituting in the $r$th column of $\Delta$ the numbers $-b_j$ for the already present elements $\partial b_j/\partial a_r$ $(j = 0, 1, \ldots, m-1)$. But from (4),

$$-b_j = \sum_{k=0}^{m-1}\delta_k\frac{\partial b_j}{\partial a_k} + O(\delta^2) \tag{9}$$

which means that we can expand $\Delta_r$ as the sum of $m$ determinants, each identical to $\Delta$ except in the $r$th column, where the elements are overwritten by terms like $\delta_k(\partial b_s/\partial a_k)$ in the $s$th row and $r$th column of the $k$th such determinant, and an additional determinant in which the elements of the $r$th column are replaced by terms of $O(\delta^2)$. The only non-vanishing members of this set of $m+1$ determinants are the two for which the $r$th column contains the elements

a) $\delta_r(\partial b_k/\partial a_r)$ $(k = 0, 1, \ldots, m-1)$
b) the $O(\delta^2)$ elements.

Clearly then,

$$\Delta\,\delta_k^* = \delta_k\Delta + O(\delta^2),$$

$$\delta_k^* = \delta_k + O(\delta^2) \tag{10}$$

so that our estimate for $\delta_k$ is correct to within a perturbation of $O(\delta^2)$, and the method will converge quadratically if the coefficients of the $O(\delta^2)$ terms are sufficiently small (i.e. of $O(1)$).
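One complete Bairstow(m) iteration, as derived above, might be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function names are hypothetical, coefficients are stored lowest power first, $a(x)$ is monic, and the linear system (6) is solved by a plain Gaussian elimination with partial pivoting chosen here only to keep the example self-contained.

```python
def divide_monic(f, a):
    """Quotient and remainder of f(x) / a(x), a monic, lowest power first."""
    work = [float(c) for c in f]
    m = len(a) - 1
    quotient = [0.0] * max(len(work) - m, 0)
    for k in range(len(work) - 1, m - 1, -1):
        c = work[k]
        quotient[k - m] = c
        for j in range(m + 1):
            work[k - m + j] -= c * a[j]
    head = work[:m]
    return quotient, head + [0.0] * (m - len(head))


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            t = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= t * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def bairstow_m(f, a, tol=1e-12, max_iter=50):
    """Refine the monic trial factor a(x) of f(x); return (a, g, iterations)."""
    a = [float(c) for c in a]
    m = len(a) - 1
    for it in range(max_iter):
        g, b = divide_monic(f, a)
        if max(abs(v) for v in b) < tol:        # |b_j| < epsilon, as in (5)
            return a, g, it
        J = [[0.0] * m for _ in range(m)]       # J[j][k] = d b_j / d a_k
        for k in range(m):
            _, rem = divide_monic([0.0] * k + g, a)
            for j in range(m):
                J[j][k] = -rem[j]               # from equation (2)
        delta = solve(J, [-v for v in b])       # equation (6)
        for k in range(m):
            a[k] += delta[k]
    raise RuntimeError("Bairstow(m) did not converge")
```

Starting from $x^2 + 0.1x + 0.9$ on $f(x) = (x^2 + 1)(x^2 + 3x + 2)$, `bairstow_m([2, 3, 3, 3, 1], [0.9, 0.1, 1])` settles on the factor $x^2 + 1$ and returns the deflated quotient with it.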

N.B. What is here called Bairstow(2) is the ordinary Bairstow process, and Bairstow(1) is in fact a slight improvement on Newton-Raphson. For consider Bairstow(1). Let $(x - a)$ be an approximate factor; then

$$f(x) = (x-a)g(x) + f(a)$$

by the Remainder Theorem. Differentiation with respect to $a$ yields

$$g(x) = (x-a)\frac{\partial g}{\partial a} + \frac{\partial f(a)}{\partial a}$$

and (6) shows that we calculate the increment in $a$ by

$$\delta^* = -\frac{f(a)}{f'(a)}.$$

By performing the calculations in this manner, the final iteration, for which $|f(a)| < \varepsilon$, will also produce without additional computation the deflated polynomial $g(x)$.
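The remark about obtaining $g(x)$ for free can be made concrete with a short sketch. The names `horner_divide` and `bairstow1` are hypothetical, and in this block the coefficients are listed highest power first, which is the natural order for Horner's scheme. Setting $x = a$ in the differentiated relation gives $g(a) = f'(a)$, so a second Horner pass over $g$ supplies the derivative.

```python
def horner_divide(f, a):
    """Synthetic division of f(x) by (x - a); coefficients highest power first.

    Returns (g, f(a)) where f(x) = (x - a) g(x) + f(a).
    """
    partial = []
    acc = 0.0
    for c in f:
        acc = acc * a + c
        partial.append(acc)
    return partial[:-1], partial[-1]


def bairstow1(f, a, eps=1e-12, max_iter=50):
    """Newton-type iteration that also returns the deflated polynomial g."""
    for _ in range(max_iter):
        g, fa = horner_divide(f, a)
        if abs(fa) < eps:
            return a, g                  # g comes out of the last division
        _, dfa = horner_divide(g, a)     # g(a) equals f'(a)
        a -= fa / dfa
    raise RuntimeError("Bairstow(1) did not converge")
```

With `f = [1, -6, 11, -6]`, that is $x^3 - 6x^2 + 11x - 6$, and the start $a = 0.5$, the iteration approaches the root 1 and the returned quotient approaches $x^2 - 5x + 6$.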

Suppose now that we have a factor $\alpha(x)$ of multiplicity $r > 1$ and degree $m$. Then

$$f(x) = \{\alpha(x)\}^r g(x), \text{ say}. \tag{11}$$

As before we would guess $a(x)$ as an approximation to $\alpha(x)$, but would not know prior to $r$ deflations with the factor $a(x)$ that it is in fact a factor of multiplicity $r$.

In the exact process we have

$$f(x) = \{a(x)\}^r g^*(x) + \sum_{j=0}^{m-1} b_j x^j. \tag{12}$$

Differentiation with respect to $a_k$ $(k = 0, 1, \ldots, m-1)$ yields

$$x^k\{a(x)\}^{r-1}g^*(x) = -\frac{1}{r}\{a(x)\}^r\frac{\partial g^*}{\partial a_k} - \frac{1}{r}\sum_{j=0}^{m-1}\frac{\partial b_j}{\partial a_k}x^j. \tag{13}$$

In other words, if we deflate $f(x)$ by dividing by $a(x)$ and obtain $g^*(x)$, then division of $x^k g^*(x)$ $(k = 0, 1, \ldots, m-1)$ by $a(x)$ yields the remainder as the polynomial

$$-\frac{1}{r}\sum_{j=0}^{m-1}\frac{\partial b_j}{\partial a_k}x^j. \tag{14}$$

But the Bairstow(m) process automatically treats (12) as

$$f(x) = a(x)p(x) + \sum_{j=0}^{m-1} b_j x^j \tag{15}$$

where $p(x) = \{a(x)\}^{r-1}g^*(x)$ and obtains

$$x^k p(x) = -a(x)\frac{\partial p}{\partial a_k} - \sum_{j=0}^{m-1}\frac{\partial b_j}{\partial a_k}x^j. \tag{16}$$

The Bairstow(m) process will then seek the incremental changes $\delta_k^{**}$ in the $a_k$ $(k = 0, 1, \ldots, m-1)$ as the solution of

$$-b_j = \sum_{k=0}^{m-1}\delta_k^{**}\frac{\partial b_j}{\partial a_k} \tag{17}$$

whereas the real approximate step is the solution of (by (14))

$$-b_j = \sum_{k=0}^{m-1}\delta_k^*\,\frac{1}{r}\,\frac{\partial b_j}{\partial a_k}$$

so that the calculated step $\delta^{**}$ is given as

$$\delta_k^{**} = \frac{1}{r}\delta_k^* \approx \frac{1}{r}\delta_k. \tag{18}$$

Thus we alter the coefficients $a_k$ in the approximate polynomial factor $a(x)$ by $r^{-1}\delta_k$, and a determinantal analysis quickly shows that the convergence is of order $\delta(1 - r^{-1})$ and not $\delta^2$ as before (when $r = 1$).
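The linear rate $1 - 1/r$ can be observed numerically in the simplest case $m = 1$. The sketch below is only an illustration: the test polynomial $(x-1)^3(x+2)$, the starting point and the function names are assumptions made for the example, and the ordinary Newton step stands in for Bairstow(1).

```python
def f(x):
    return (x - 1.0) ** 3 * (x + 2.0)       # root x = 1 of multiplicity r = 3


def df(x):
    return 3.0 * (x - 1.0) ** 2 * (x + 2.0) + (x - 1.0) ** 3


def newton_corrections(func, dfunc, x, steps):
    """Run plain Newton steps and record each correction h_n."""
    corrections = []
    for _ in range(steps):
        h = -func(x) / dfunc(x)
        corrections.append(h)
        x += h
    return x, corrections


x_final, h = newton_corrections(f, df, 2.0, 20)
# Ratios of successive corrections; these should drift towards 1 - 1/3.
ratios = [h[k + 1] / h[k] for k in range(len(h) - 1)]
```

After a few iterations the entries of `ratios` settle close to $2/3$, in line with (18) for $r = 3$.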

2. Calculation of the work done.

The division process for the calculation of $g(x)$ and $b(x)$ from the known $f(x)$ and $a(x)$ is essentially a matching process in the coefficients of $x$ and is well known in the cases $m = 1$ or 2. We do not reproduce a scheme for the general case when $a(x)$ is of order $m$, as the general points are brought out more simply if we consider $m$ having a particular value, say 4, and the results come out in a form that is easy to generalise. When $m = 4$, (1) reads

$$x^n + \sum_{s=0}^{n-1} f_s x^s = (x^4 + a_3x^3 + a_2x^2 + a_1x + a_0)\Bigl(x^{n-4} + \sum_{r=0}^{n-5} g_r x^r\Bigr) + b_3x^3 + b_2x^2 + b_1x + b_0. \tag{19}$$

Comparing coefficients,

$$\begin{aligned}
g_{n-5} &= f_{n-1} - a_3 \\
g_{n-6} &= f_{n-2} - a_3g_{n-5} - a_2 \\
g_{n-7} &= f_{n-3} - a_3g_{n-6} - a_2g_{n-5} - a_1 \\
g_{n-8} &= f_{n-4} - a_3g_{n-7} - a_2g_{n-6} - a_1g_{n-5} - a_0 \\
g_{n-9} &= f_{n-5} - a_3g_{n-8} - a_2g_{n-7} - a_1g_{n-6} - a_0g_{n-5} \\
&\;\;\vdots \\
g_0 &= f_4 - a_3g_1 - a_2g_2 - a_1g_3 - a_0g_4 \\
b_3 &= f_3 - a_3g_0 - a_2g_1 - a_1g_2 - a_0g_3 \\
b_2 &= f_2 - a_2g_0 - a_1g_1 - a_0g_2 \\
b_1 &= f_1 - a_1g_0 - a_0g_1 \\
b_0 &= f_0 - a_0g_0
\end{aligned} \tag{20}$$

Thus to calculate $g(x)$ and $b(x)$ we need $4n - 16$ multiplications (in general $mn - m^2$) and $4n - 12$ additions (in general $mn - m^2 + m$). After this step, Bairstow(m) demands the division of $x^i g(x)$ $(i = 0, 1, 2, \ldots, m-1)$ by $a(x)$, a process in which the quotients and remainders are clearly going to be closely linked. In the particular case when $m = 4$, we put

$$\begin{aligned}
g(x) &= a(x)p(x) + c_3x^3 + c_2x^2 + c_1x + c_0 \\
xg(x) &= a(x)q(x) + d_3x^3 + d_2x^2 + d_1x + d_0 \\
x^2g(x) &= a(x)r(x) + e_3x^3 + e_2x^2 + e_1x + e_0 \\
x^3g(x) &= a(x)s(x) + f_3x^3 + f_2x^2 + f_1x + f_0
\end{aligned} \tag{21}$$

and require only the coefficients in the remainder. We calculate $c_3$, $c_2$, $c_1$ and $c_0$ by the division process already outlined, and analysis shows that the remaining twelve coefficients are given by

$$\begin{array}{lll}
d_3 = c_2 - c_3a_3 & e_3 = d_2 - d_3a_3 & f_3 = e_2 - e_3a_3 \\
d_2 = c_1 - c_3a_2 & e_2 = d_1 - d_3a_2 & f_2 = e_1 - e_3a_2 \\
d_1 = c_0 - c_3a_1 & e_1 = d_0 - d_3a_1 & f_1 = e_0 - e_3a_1 \\
d_0 = -c_3a_0 & e_0 = -d_3a_0 & f_0 = -e_3a_0
\end{array} \tag{22}$$

The finding of the $c$'s requires $4n - 32$ multiplications and $4n - 28$ additions, to which must be added the twelve multiplications and nine additions necessary to calculate the $d$'s, $e$'s and $f$'s. This gives the total work so far as $8n - 36$ multiplications and $8n - 31$ additions.

In the general case the required remainders on dividing $x^i g(x)$ by $a(x)$ are calculated in the form $\sum_j c_{ij}x^j$ $(i, j = 0, 1, 2, \ldots, m-1)$. The computation of the $c_{0j}$'s requires $mn - 2m^2$ multiplications and $mn - 2m^2 + m$ additions. The remaining $c_{ij}$'s are calculated by

$$\begin{aligned}
c_{r,0} &= -c_{r-1,m-1}\,a_0 \\
c_{r,j} &= c_{r-1,j-1} - c_{r-1,m-1}\,a_j \qquad (1 \le j \le m-1)
\end{aligned} \tag{23}$$

where $r$ increases in steps of 1 from 1 until $m - 1$.

Thus for one Bairstow(m) iteration the amount of work required to calculate these $m^2$ constants and the $b$'s necessary to set up the basic equations from which the $\delta_k^*$ are calculated totals $2mn - 2m^2 - m$ multiplications and $2mn - 2m^2 + 1$ additions. The calculation of the $\delta_k^*$ from equation (6) requires a further $\frac{1}{3}m(m^2 + 3m - 1)$ multiplications and a similar number of additions, so that we may take the work done to be measured by the number of multiplications in one complete Bairstow(m) iteration, which is

$$2mn + \frac{m^3}{3} - m^2 - \frac{4m}{3}. \tag{24}$$
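The shift recurrences (22) and (23) can be checked against direct division with a few lines of Python. The sketch is illustrative only; `shifted_remainders` and `divide_monic` are hypothetical names, coefficients are stored lowest power first, and the particular $a(x)$ and $g(x)$ in the usage note are arbitrary test data.

```python
def divide_monic(f, a):
    """Quotient and remainder of f(x) / a(x), a monic, lowest power first."""
    work = [float(c) for c in f]
    m = len(a) - 1
    quotient = [0.0] * max(len(work) - m, 0)
    for k in range(len(work) - 1, m - 1, -1):
        c = work[k]
        quotient[k - m] = c
        for j in range(m + 1):
            work[k - m + j] -= c * a[j]
    head = work[:m]
    return quotient, head + [0.0] * (m - len(head))


def shifted_remainders(c0, a):
    """Rows c_r (r = 0..m-1): remainders of x^r g(x) / a(x), built by (23).

    c0 is the remainder of g(x) / a(x); no further division is performed.
    """
    m = len(a) - 1
    rows = [list(c0)]
    for _ in range(1, m):
        prev = rows[-1]
        top = prev[m - 1]                         # c_{r-1, m-1}
        row = [-top * a[0]]                       # c_{r, 0}
        row += [prev[j - 1] - top * a[j] for j in range(1, m)]
        rows.append(row)
    return rows
```

For example, with `a = [0.9, 0.1, 0.3, 1.0]` and `g = [1, 2, 3, 4, 5, 1]`, the rows returned by `shifted_remainders(divide_monic(g, a)[1], a)` agree with the remainders obtained by dividing $g$, $xg$ and $x^2g$ by $a(x)$ directly, at a cost of $m$ multiplications per extra row.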

Let us now examine the question of which Bairstow(m) method will extract a given number of roots from a polynomial for the least work. A guide to the work is furnished by the calculations above, as each Bairstow(m) converges as $O(\delta^2)$, and hence it is reasonable to expect each factor $a(x)$ to converge to the appropriate $\alpha(x)$ in the same number of iterations whatever $m$. Thus the work done in one cycle will be proportional to the work done in extracting the factor. Further, the number of multiplications per iteration is sensibly equal to the number of additions, and so the former number is taken as our guide to the work done.

Suppose we wish to extract $p = ms < n$ factors from $f(x)$. To do this we would obtain factors $a_1(x), a_2(x), \ldots, a_s(x)$, each of order $m$, from polynomials of degree $n, n-m, n-2m, \ldots, n-p+m$. To calculate the work done we take as our measure the total number of multiplications in one iteration for each of the $s$ extractions.

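This is quite clearly

$$\begin{aligned}
&\Bigl[2mn + \frac{m^3}{3} - m^2 - \frac{4m}{3}\Bigr] + \Bigl[2m(n-m) + \frac{m^3}{3} - m^2 - \frac{4m}{3}\Bigr] + \cdots + \Bigl[2m\bigl(n - (s-1)m\bigr) + \frac{m^3}{3} - m^2 - \frac{4m}{3}\Bigr] \\
&\qquad = p\Bigl\{2n + \frac{m^2 - 4}{3} - p\Bigr\}.
\end{aligned} \tag{25}$$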

If $n$ is divisible by 3 and 4, the work done in finding all the factors of order 1, 2, 3, and 4 by the first four Bairstow(m) methods is thus

$$\begin{array}{ll}
m = 1 & n^2 - n \\
m = 2 & n^2 - 4 \\
m = 3 & n^2 + \dfrac{5n}{3} - 14 \\
m = 4 & n^2 + 4n - 32
\end{array} \tag{26}$$

by our measure.
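The sum (25) and the tabulated values (26) can be cross-checked with exact rational arithmetic. The sketch below uses hypothetical function names. It also rests on one inference that the text does not state outright: the entries of (26) agree with (25) when $p = n - m$, that is, when the last factor of order $m$ is simply what remains after the final deflation and costs no further iterations.

```python
from fractions import Fraction


def work_per_iteration(m, n):
    """Multiplications in one Bairstow(m) iteration on degree n, equation (24)."""
    return 2 * m * n + Fraction(m ** 3, 3) - m * m - Fraction(4 * m, 3)


def total_work(m, n, p):
    """Sum (24) over the s = p / m extractions from degrees n, n-m, ..."""
    s = p // m
    return sum(work_per_iteration(m, n - i * m) for i in range(s))


def closed_form(m, n, p):
    """Right hand side of equation (25)."""
    return p * (2 * n + Fraction(m * m - 4, 3) - p)


n = 12
# One line per method: m, summed work, closed form with p = n - m.
table = [(m, total_work(m, n, n - m), closed_form(m, n, n - m))
         for m in (1, 2, 3, 4)]
```

For $n = 12$ the two columns of `table` coincide and reproduce $n^2 - n$, $n^2 - 4$, $n^2 + 5n/3 - 14$ and $n^2 + 4n - 32$ respectively.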

These results show that the higher the order of the Bairstow(m) method, the (slightly) less efficient is the scheme for finding the $\alpha(x)$ factors. This point holds, of course, only if we work with real numbers throughout, which Bairstow(2s) methods can do. If complex arithmetic is involved, each multiplication takes four times as long, and although the complex conjugate root is then obtained with little or no extra work (depending upon rounding errors), we can deduce that generally we will be quicker using Bairstow(2) or Bairstow(4) etc. Having obtained our factors we then have to find their roots. By direct methods this is possible for Bairstow(1), (2), (3) and (4) (the last by the method of the reducing cubic), but not when $m \ge 5$, so we now assert that the logical method to use when extracting the roots of a polynomial with real coefficients will be either Bairstow(2) or Bairstow(4). At first sight Bairstow(2) would be preferred, as it is easier to calculate the roots of a quadratic than of a quartic, but the overall speed clearly depends upon how well the appropriate factors are conditioned. Polynomials clearly exist for which the Bairstow(4) method will be quicker than Bairstow(2), e.g.

$$f(x) = (x - 5)(x + 1)^6(x + 5)$$

or, consider the polynomial with known close quadratic factors

$$\begin{aligned}
g(x) = {} & x^{12} + 3.1x^{11} + 6.08x^{10} + 10.22x^9 + 15.275x^8 + 18.43x^7 + 19.2925x^6 \\
& + 18.4325x^5 + 15.2825x^4 + 10.2275x^3 + 6.0725x^2 + 3.11x + 0.99.
\end{aligned}$$

Using starting approximations of $x^2 + x + 1$ (Bairstow(2)) and $x^4 + x^3 + x^2 + x + 1$ (Bairstow(4)), and working to an accuracy of 0.00005, Bairstow(2) took a total of 43 iterations to extract five factors and Bairstow(4) only 11.

In practice, however, polynomials with roots of high multiplicity are rare, and this gives us a hint that Bairstow(2) will probably be the best overall method.


3. Analysis of Rounding Errors inherent in the Deflation Process.

In floating point arithmetic, each number $x$ is usually represented in the form

$$x = a \cdot 2^b$$

where $b$ is the exponent and $a$, the mantissa, lies in the range $\frac{1}{2} \le |a| < 1$. Following Wilkinson [1], we use the notation $fl(x\,\theta\,y)$ to denote the computed result of performing floating point arithmetic on two numbers $x$ and $y$, where $\theta$ may be one of the four usual arithmetic operators. We assume that the following equations are exact mathematical relationships:

$$fl(x \pm y) \equiv (x \pm y)(1 + \varepsilon_1), \qquad fl(x \cdot y) \equiv xy(1 + \varepsilon_2), \qquad fl(x/y) \equiv (x/y)(1 + \varepsilon_3) \tag{27}$$

where $\varepsilon_1$, $\varepsilon_2$, $\varepsilon_3$ are in general all different rounding errors, each less than $2^{-t}$ in modulus, where $t$ is the number of binary digits allocated to the mantissa of the floating point number.
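The model (27) can be observed directly on a present-day machine. The sketch below assumes that Python floats are IEEE 754 double precision numbers rounded to nearest, for which $t = 53$; it uses exact rational arithmetic to measure the relative error $\varepsilon_2$ of one product. The function name and the sample operands are arbitrary choices for the illustration.

```python
from fractions import Fraction


def relative_rounding_error(x, y):
    """Exact relative error of the floating point product fl(x * y)."""
    exact = Fraction(x) * Fraction(y)      # the stored operands, multiplied exactly
    computed = Fraction(x * y)             # the rounded machine product
    return (computed - exact) / exact


eps2 = relative_rounding_error(0.1, 0.3)
bound = Fraction(1, 2 ** 53)               # 2^(-t) with t = 53 (assumed IEEE double)
```

Under the stated assumption `abs(eps2) <= bound` holds, which is the content of (27) for a single multiplication.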

In the factorization process described in section 1, when a factor of order $m$ has been determined, the process continues with the deflated polynomial until all the factors have been eliminated. Unless the process of deflation is stable, the roots determined at a later stage will be seriously in error.

We now consider the errors resulting from one stage of deflation, with a view to finding the limitations imposed on the process by the use of $t$-digit floating point arithmetic.

Since the process is performed in floating point arithmetic, then if we denote the coefficients of the polynomial and the factors respectively by $f_i$ $(0 \le i \le n)$, $g_i$ $(0 \le i \le n-m)$ and $a_i$ $(0 \le i \le m)$, we can obtain the coefficients of the deflated polynomial from the following relationship

$$\begin{aligned}
x^n + f_{n-1}x^{n-1} + \cdots + f_1x + f_0 &\equiv (x^m + a_{m-1}x^{m-1} + \cdots + a_1x + a_0)(x^{n-m} + g_{n-m-1}x^{n-m-1} + \cdots + g_1x + g_0) \\
&\equiv x^n + (g_{n-m-1} + a_{m-1}g_{n-m})x^{n-1} \\
&\quad + (g_{n-m-2} + a_{m-1}g_{n-m-1} + a_{m-2}g_{n-m})x^{n-2} + \cdots \\
&\quad + \Bigl(g_{n-m-r} + \sum_{q=1}^{r} a_{m-q}g_{n-m-r+q}\Bigr)x^{n-r} + \cdots + (a_2g_0 + a_1g_1 + a_0g_2)x^2 \\
&\quad + (a_1g_0 + a_0g_1)x + a_0g_0.
\end{aligned} \tag{28}$$

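Comparing coefficients we see that in general the coefficients of the deflated polynomial $g_i$ $(0 \le i \le n-m)$ are given in terms of recurrence relationships of the form

$$\begin{aligned}
g_{n-m-1} &\equiv fl_2(f_{n-1} - a_{m-1}g_{n-m}) = [f_{n-1} - a_{m-1}g_{n-m}(1 + \varepsilon_1)](1 + \varepsilon_2) \\
g_{n-m-r} &\equiv fl_2\Bigl(f_{n-r} - \sum_{q=1}^{r} a_{m-q}g_{n-m-r+q}\Bigr)
\end{aligned} \tag{29}$$

where we distinguish floating point operations of this kind, i.e. $fl_2(x\,\theta\,y)$, in which the sum is produced as a double precision number in the first instance and then rounded to single precision before use.

The error made in computing the coefficients of the deflated polynomial is thus, for $g_{n-m-r}$,

$$g_{n-m-r} - \Bigl(f_{n-r} - \sum_{q=1}^{r} a_{m-q}g_{n-m-r+q}\Bigr) = \varepsilon_2 f_{n-r} - \{(1 + \varepsilon_1)(1 + \varepsilon_2) - 1\}\sum_{q=1}^{r} a_{m-q}g_{n-m-r+q}.$$

Hence,

$$\Bigl|g_{n-m-r} - f_{n-r} + \sum_{q=1}^{r} a_{m-q}g_{n-m-r+q}\Bigr| < 2(2^{-t})\Bigl|\sum_{q=1}^{r} a_{m-q}g_{n-m-r+q}\Bigr| + |f_{n-r}|\,2^{-t}$$

and if we make the assumption

$$\Bigl|\sum_{q=1}^{r} a_{m-q}g_{n-m-r+q}\Bigr| \le |f_{n-r}| \tag{30}$$

we get the result

$$fl_2\Bigl(f_{n-r} - \sum_{q=1}^{r} a_{m-q}g_{n-m-r+q}\Bigr) \equiv -\sum_{q=1}^{r} a_{m-q}g_{n-m-r+q} + f_{n-r}(1 + E_r) \tag{31}$$

where

$$|E_r| < 3(2^{-t}).$$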

This result effectively states that the computed coefficients of the deflated polynomial are those that exact arithmetic would give from coefficients $f_{n-r}(1 + E_r)$ in place of $f_{n-r}$, i.e. the errors are bounded by a perturbation of $3(2^{-t})$ in the original coefficients. It is similar to one obtained by Wilkinson [1] for the Newton first order process. Hence, the upper bounds to the rounding errors incurred in any factorisation process can be made independent of the order $m$. We notice also that in order for the deflation process to be made as accurate as possible, it is essential to locate the factors in increasing order, independent of sign.


4. Speeding the Convergence rate.

It is quite easy to construct Bairstow(m) processes which converge at a faster rate than $O(\delta^2)$ for well conditioned factors. But, as Ostrowski [2] has pointed out, it is of no use obtaining a process which converges more rapidly if it involves more work to achieve the same overall convergence. His unit of work was the Horner, which is sensibly equal to the amount of work necessary to divide $f(x)$ by $a(x)$ when $f(x)$ is of order $n$ and $a(x)$ is of order 1.

The Ostrowski idea is first illustrated by Bairstow(1): we apply the two processes

$$x_{2n+1} = x_{2n} - \frac{f(x_{2n})}{f'(x_{2n})}, \qquad x_{2n+2} = x_{2n+1} - \frac{f(x_{2n+1})}{f'(x_{2n})} \tag{32}$$

as a single step and call the process Alternating Bairstow(1). The work done is clearly equivalent to three Horners: to calculate $f(x_{2n})$, $f(x_{2n+1})$ and $f'(x_{2n})$. The retention of $f'(x_{2n})$ in the second step means that the convergence rate of the second step is less than $O(\delta^2)$ but the work done is halved. The convergence rate of the whole step is now found.

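Let $X$ be the required root, and define $e_n$ by

$$e_n = X - x_n.$$

Then

$$e_{2n+2} = e_{2n+1} + \frac{f(x_{2n+1})}{f'(x_{2n})}$$

but

$$\frac{f(x_{2n+1})}{f'(x_{2n})} = \frac{-e_{2n+1}f'(X) + \frac{1}{2}e_{2n+1}^2f''(X) - \cdots}{f'(X) - e_{2n}f''(X) + \cdots}$$

so that, since the first step gives $e_{2n+1} = -\frac{1}{2}e_{2n}^2 f''(X)/f'(X) + \cdots$,

$$e_{2n+2} = -e_{2n+1}e_{2n}\frac{f''(X)}{f'(X)} + \cdots = \Bigl(\frac{f''(X)}{f'(X)}\Bigr)^2\frac{e_{2n}^3}{2!} + \cdots \tag{33}$$

and so this two step method has cubic convergence. The method can be extended by calculating several improvements to $x_{2n+1}$ by the first order scheme (with the retention of $f'(x_{2n})$), but these turn out to be less efficient than the one proposed by Ostrowski.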
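The alternating scheme (32) is short enough to sketch directly. The function name and the test equation $x^3 - 2x - 5 = 0$ are choices made for this illustration and do not come from the paper; callables are used for $f$ and $f'$ rather than Horner schemes to keep the block brief.

```python
def alternating_bairstow1(f, df, x, cycles):
    """Apply the two-step process (32) for a given number of full cycles."""
    for _ in range(cycles):
        slope = df(x)            # f'(x_2n), evaluated once per cycle
        x = x - f(x) / slope     # first half-step: ordinary Newton
        x = x - f(x) / slope     # second half-step reuses the old slope
    return x


def f(x):
    return x ** 3 - 2.0 * x - 5.0


def df(x):
    return 3.0 * x ** 2 - 2.0


root = alternating_bairstow1(f, df, 2.0, 3)
```

Each cycle costs two evaluations of $f$ and one of $f'$, i.e. three Horners, and from the start $x_0 = 2$ three cycles already bring the residual down to rounding level, as the cubic rate in (33) suggests.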

For Bairstow(1) the table below lists the convergence in the power of $\delta$ of the various methods, where $q$ is the number of steps applied with first order convergence after the initial $O(\delta^2)$ improvement, and $W$ denotes the work done in Horners.

$$\begin{array}{c|ccccccccc}
q \backslash W & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\
\hline
0 & 2 & & 4 & & 8 & & 16 & & 32 \\
1 & 2 & 3 & & 6 & 9 & & 18 & 27 & \\
2 & 2 & 3 & 4 & & 8 & 12 & 16 & & 32 \\
3 & 2 & 3 & 4 & 5 & & 10 & 15 & 20 & 25
\end{array}$$

The best rate of convergence is clearly secured by q = 1, but the saving is not dramatic as compared with q = 0.
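The entries of the table follow a simple bookkeeping rule, which can be sketched as below. The rule is this reader's interpretation of the table rather than something spelled out in the text: a fresh Newton step costs two Horners and doubles the order reached at the start of the cycle, while each of the $q$ following steps with the retained derivative costs one Horner and adds that starting order once more. The function name is hypothetical.

```python
def convergence_table(q, max_work=10):
    """Map work W (in Horners) to the order of convergence reached, for given q."""
    table = {}
    work = 0
    base = 1                       # order reached at the start of the cycle
    while work + 2 <= max_work:
        work += 2                  # full step: evaluate f and f'
        order = 2 * base
        table[work] = order
        for _ in range(q):         # q cheap steps with the old derivative
            work += 1
            order += base
            if work <= max_work:
                table[work] = order
        base = order
    return table
```

For example, `convergence_table(1)` gives `{2: 2, 3: 3, 5: 6, 6: 9, 8: 18, 9: 27}`, which is the row $q = 1$ of the table.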

Another way of achieving cubic convergence is to apply the formula

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} - \frac{f''(x_n)\{f(x_n)\}^2}{2\{f'(x_n)\}^3} \tag{34}$$

which requires three Horner schemes to evaluate each step. We note that if

$$f(x) = (x - a)g(x) + f(a)$$

then

$$g(x) = (x - a)\frac{\partial g}{\partial a} + \frac{\partial f(a)}{\partial a},$$

$$2\frac{\partial g}{\partial a} = (x - a)\frac{\partial^2 g}{\partial a^2} + \frac{\partial^2 f(a)}{\partial a^2}$$

so that all the terms on the right hand side of (34) are calculable. The incorporation of simplified schemes into a cubic does not produce anything to better itself, and we note that (34) and the alternating scheme with $q = 1$ are equivalent in terms of convergence to a well conditioned single root.
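A direct transcription of (34) is given below as a sketch. The function name and the test equation $x^3 - 2x - 5 = 0$ are assumptions for the illustration, and the derivatives are supplied as callables instead of being generated by repeated division as the text describes.

```python
def cubic_step(f, df, d2f, x):
    """One step of formula (34): Newton's correction plus a second order term."""
    fx, d1, d2 = f(x), df(x), d2f(x)
    return x - fx / d1 - d2 * fx * fx / (2.0 * d1 ** 3)


def f(x):
    return x ** 3 - 2.0 * x - 5.0


def df(x):
    return 3.0 * x ** 2 - 2.0


def d2f(x):
    return 6.0 * x


x = 2.0
for _ in range(3):                 # three steps, nine Horners in all
    x = cubic_step(f, df, d2f, x)
```

On this well conditioned simple root three steps reduce the residual to rounding level, which matches the behaviour of the alternating scheme with $q = 1$ at the same cost in Horners.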

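An extension of the alternating scheme of Ostrowski to Bairstow(2) is achieved by performing the usual step for a first improvement, and calculating the $\delta_k$ from

$$-\begin{pmatrix} \dfrac{\partial b_0}{\partial a_0} & \dfrac{\partial b_0}{\partial a_1} \\[2ex] \dfrac{\partial b_1}{\partial a_0} & \dfrac{\partial b_1}{\partial a_1} \end{pmatrix}\begin{pmatrix} \delta_0 \\ \delta_1 \end{pmatrix} = \begin{pmatrix} b_0 \\ b_1 \end{pmatrix}.$$

For the remaining $q$ steps only $b$ is calculated afresh. Calculating the work done in multiples of Bairstow(2) Horners means that we reproduce Table 1, where a Bairstow(2) Horner $\approx$ 2 Bairstow(1) Horners. Again then, there is a slight improvement when $q = 1$.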

The idea obviously lends itself to extension to Bairstow(m) for any integral q > 0, and clearly the best choice for q is always 1.

For multiple roots, the cubic Bairstow(1) converges quicker than the alternating Bairstow(1).

For the cubic, if $X$ is a root of multiplicity $r$, and $e_n = X - x_n$,

$$\begin{aligned}
e_{n+1} &= e_n - \frac{e_n^r f^{(r)}(X)/r!}{e_n^{r-1}f^{(r)}(X)/(r-1)!} - \frac{1}{2}\,\frac{\bigl(e_n^{r-2}f^{(r)}(X)/(r-2)!\bigr)\bigl(e_n^r f^{(r)}(X)/r!\bigr)^2}{\bigl(e_n^{r-1}f^{(r)}(X)/(r-1)!\bigr)^3} \\
&= e_n\Bigl(1 - \frac{1}{r} - \frac{r-1}{2r^2}\Bigr)
\end{aligned} \tag{35}$$

which is, in fact, slightly less efficient than the ordinary Bairstow(1) method.

As for the alternating Bairstow(1),

$$e_{2n+1} = e_{2n}\Bigl(1 - \frac{1}{r}\Bigr) \tag{36}$$

$$\begin{aligned}
e_{2n+2} &= e_{2n+1} - \frac{e_{2n+1}^r f^{(r)}(X)/r!}{e_{2n}^{r-1}f^{(r)}(X)/(r-1)!} \\
&= e_{2n+1}\Bigl[1 - \frac{1}{r}\Bigl(\frac{e_{2n+1}}{e_{2n}}\Bigr)^{r-1}\Bigr]
\end{aligned} \tag{37}$$

which is less efficient than either the cubic version or ordinary Bairstow(1), but again, only slightly so.
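The comparison can be put in numbers by reducing each error factor to a per-Horner basis. The sketch below is a reader's check, not part of the paper: it assumes the work figures used earlier in this section (two Horners for an ordinary Bairstow(1) step, three for a cubic step and three for a full alternating cycle), and the function names are hypothetical.

```python
def newton_factor(r):
    """Error factor per ordinary Bairstow(1) step at a root of multiplicity r."""
    return 1.0 - 1.0 / r


def cubic_factor(r):
    """Error factor per cubic step, equation (35)."""
    return 1.0 - 1.0 / r - (r - 1.0) / (2.0 * r * r)


def alternating_factor(r):
    """Error factor per full alternating cycle, from (36) and (37)."""
    first = 1.0 - 1.0 / r
    return first * (1.0 - (1.0 / r) * first ** (r - 1))


def per_horner(factor, horners):
    """Geometric mean error reduction per unit of work (smaller is better)."""
    return factor ** (1.0 / horners)


r = 3
rates = (per_horner(newton_factor(r), 2),
         per_horner(cubic_factor(r), 3),
         per_horner(alternating_factor(r), 3))
```

For $r = 3$ the three rates come out at roughly 0.816, 0.822 and 0.828, so the ordering stated in the text holds and the differences are indeed slight.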

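These conclusions carry over to Bairstow(m) for any $m$ and it seems worthwhile to quote the method for obtaining cubic Bairstow(2) convergence in full.

Let

$$f(x) = (x^2 + a_1x + a_0)g(x) + b_0 + b_1x$$

then

$$\begin{aligned}
-xg(x) &= (x^2 + a_1x + a_0)\frac{\partial g}{\partial a_1} + \frac{\partial b_0}{\partial a_1} + \frac{\partial b_1}{\partial a_1}x \\
-g(x) &= (x^2 + a_1x + a_0)\frac{\partial g}{\partial a_0} + \frac{\partial b_0}{\partial a_0} + \frac{\partial b_1}{\partial a_0}x
\end{aligned} \tag{38}$$

where the coefficients of the two polynomials $\partial g/\partial a_1$ and $\partial g/\partial a_0$ are related. In fact, if we take the coefficient of $x^j$ in $\partial g/\partial a_0$ to be given by

$$\begin{cases} 1 & j = n-4 \\ G_j & j = 0, 1, \ldots, n-5, \end{cases} \tag{39}$$

and put $C_1 = \partial b_1/\partial a_0$, $C_0 = \partial b_0/\partial a_0$, then the coefficient of $x^j$ in $\partial g/\partial a_1$ is given by

$$\begin{cases} 1 & j = n-3 \\ G_{j-1} & j = 1, 2, \ldots, n-4 \\ C_1 & j = 0 \end{cases} \tag{40}$$

and

$$\frac{\partial b_1}{\partial a_1} = C_0 - a_1C_1, \qquad \frac{\partial b_0}{\partial a_1} = -a_0C_1 \tag{41}$$

and so the whole of the computations so far have been carried out at a cost of slightly more than one division process by $a(x)$.

Further differentiation of (38) with respect to $a_0$ and $a_1$ yields the three distinct equations ($a(x) = x^2 + a_1x + a_0$):

$$\begin{aligned}
-2x\frac{\partial g}{\partial a_1} &= a(x)\frac{\partial^2 g}{\partial a_1^2} + \frac{\partial^2 b_1}{\partial a_1^2}x + \frac{\partial^2 b_0}{\partial a_1^2} \\
-x\frac{\partial g}{\partial a_0} - \frac{\partial g}{\partial a_1} &= a(x)\frac{\partial^2 g}{\partial a_0\,\partial a_1} + \frac{\partial^2 b_1}{\partial a_0\,\partial a_1}x + \frac{\partial^2 b_0}{\partial a_0\,\partial a_1} \\
-2\frac{\partial g}{\partial a_0} &= a(x)\frac{\partial^2 g}{\partial a_0^2} + \frac{\partial^2 b_1}{\partial a_0^2}x + \frac{\partial^2 b_0}{\partial a_0^2}
\end{aligned} \tag{42}$$

For brevity we denote the remainders by $w_1x + w_0$, $v_1x + v_0$ and $u_1x + u_0$ respectively.

The success of the scheme depends upon the fact that the coefficients of the polynomials on the left are similar. From (39), (40) the coefficients of $x^j$ in these three polynomials are given by

$$\begin{array}{lll}
\text{for } 2\dfrac{\partial g}{\partial a_0}: & 2 & j = n-4 \\
& 2G_j & j = 0, 1, \ldots, n-5 \\[1ex]
\text{for } x\dfrac{\partial g}{\partial a_0} + \dfrac{\partial g}{\partial a_1}: & 2 & j = n-3 \\
& 2G_{j-1} & j = 1, 2, \ldots, n-4 \\
& C_1 & j = 0 \\[1ex]
\text{for } 2x\dfrac{\partial g}{\partial a_1}: & 2 & j = n-2 \\
& 2G_{j-2} & j = 2, 3, \ldots, n-3 \\
& 2C_1 & j = 1 \\
& 0 & j = 0
\end{array} \tag{43}$$

We divide the polynomial $\partial g/\partial a_0$ by $a(x)$ and calculate the coefficients of the remainder by the usual division process. Analysis shows that the remaining four remainder coefficients are given without further division by

$$\begin{aligned}
v_1 &= u_0 - u_1a_1 \\
v_0 &= C_1 - u_1a_0 \\
w_1 &= C_1 - a_1v_1 + v_0 \\
w_0 &= -a_0v_1
\end{aligned} \tag{44}$$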

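These coefficients are then used to modify the usual Bairstow(2) method in the following way. Initially we seek changes $\delta_1$ and $\delta_0$ in $a_1$ and $a_0$ such that due to these increments $b_1$ and $b_0$ vanish. For

$$b_0(a_0 + \delta_0, a_1 + \delta_1) = b_0(a_0, a_1) + \delta_0\frac{\partial b_0}{\partial a_0} + \delta_1\frac{\partial b_0}{\partial a_1} + \frac{1}{2}(\delta_0^2u_0 + 2\delta_0\delta_1v_0 + \delta_1^2w_0) + \cdots \tag{45}$$

and similarly for $b_1$. We calculate $\delta_0^*$, $\delta_1^*$ by use of the matrix equation

$$\begin{pmatrix} \delta_0^* \\ \delta_1^* \end{pmatrix} = -\begin{pmatrix} \dfrac{\partial b_0}{\partial a_0} & \dfrac{\partial b_0}{\partial a_1} \\[2ex] \dfrac{\partial b_1}{\partial a_0} & \dfrac{\partial b_1}{\partial a_1} \end{pmatrix}^{-1}\begin{pmatrix} b_0 \\ b_1 \end{pmatrix} = B\begin{pmatrix} b_0 \\ b_1 \end{pmatrix} \tag{46}$$

and then modify the calculated $\delta_0^*$ and $\delta_1^*$ to give better approximations $\delta_0'$ and $\delta_1'$ to $\delta_0$ and $\delta_1$ by

$$\begin{pmatrix} \delta_0' \\ \delta_1' \end{pmatrix} = B\begin{pmatrix} b_0 + \frac{1}{2}(\delta_0^{*2}u_0 + 2\delta_0^*\delta_1^*v_0 + \delta_1^{*2}w_0) \\[1ex] b_1 + \frac{1}{2}(\delta_0^{*2}u_1 + 2\delta_0^*\delta_1^*v_1 + \delta_1^{*2}w_1) \end{pmatrix}. \tag{47}$$

The error term in (47) due to not solving (45) exactly is easily seen to be a term of $O(\delta^3)$, for in (47) we use $\delta^{*2}$ instead of $\delta^2(1 + O(\delta))^2$. This shows that the method outlined does indeed have cubic convergence.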

When the factors in the Bairstow(m) process are of multiplicity $r > 1$ we have shown that the convergence is no longer quadratic but $O(1 - 1/r)$. In other words, if $h_{i,n}$ is the correction to the coefficient $a_i$ from the $n$th iteration to find the "exact" $\alpha_i$, then

$$\frac{h_{i,n+1}}{h_{i,n}} \approx 1 - \frac{1}{r}.$$

Rall [4] has presented evidence in graphical form of this effect. Since $1 - 1/r \ge \frac{1}{2}$ for $r \ge 2$, we may use this as a test after allowing a few iterations for the ratio to settle down, and estimate $r$. Another procedure would be to add increments of $2h_{i,n}$ to the $a_i$ and observe the new rate of convergence; if this is not quadratic, we can try $3h_{i,n}$ etc. Rall warns "that this process can be regarded as successful as long as the successive corrections stay within a subspace of the space determined by the set obtained for the correcting factor r - 1, and decrease in norm more rapidly. If the norms of these corrections start increasing in value, or have components which lie outside the subspace previously determined, the situation may be corrected by returning to the use of a smaller value of r to eliminate the error outside of the subspace."

The same argument can be used to increase the convergence rate of cubic Bairstow(m) when a root of multiplicity r is encountered.
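The ratio test and the enlarged step can be sketched for $m = 1$ as follows. This is an illustration under assumptions, not Rall's procedure: the test polynomial $(x-1)^3(x+2)$, the number of plain and accelerated steps and the function name are all chosen for the example, and $r$ is estimated from the last observed ratio by inverting $h_{n+1}/h_n \approx 1 - 1/r$.

```python
def f(x):
    return (x - 1.0) ** 3 * (x + 2.0)       # triple root at x = 1


def df(x):
    return 3.0 * (x - 1.0) ** 2 * (x + 2.0) + (x - 1.0) ** 3


def newton_with_multiplicity_test(x, plain_steps=6, fast_steps=3):
    """Estimate r from the correction ratio, then take steps of size r * h."""
    corrections = []
    for _ in range(plain_steps):             # let the ratio settle down
        h = -f(x) / df(x)
        corrections.append(h)
        x += h
    ratio = corrections[-1] / corrections[-2]
    r = round(1.0 / (1.0 - ratio))           # invert ratio = 1 - 1/r
    for _ in range(fast_steps):
        slope = df(x)
        if slope == 0.0:                     # already at the root to rounding
            break
        x += r * (-f(x) / slope)             # increment r * h instead of h
    return x, r


x_est, r_est = newton_with_multiplicity_test(2.0)
```

From the start $x_0 = 2$ the estimate comes out as $r = 3$, and three enlarged steps then recover the root to well beyond the accuracy that dozens of plain steps would give.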

5. Conclusion.

In conclusion we point out that

(1) to ensure as much working as possible with real numbers, even Bairstow(m) methods should be used. Further, only $m = 2$ and $m = 4$ satisfy the criterion that the factors of order $m$ must themselves be solved quickly.

(2) Bairstow(2) involves less work, but the result of two applications will leave a deflated polynomial whose error bounds are larger than those of one Bairstow(4) reduction.

(3) the alternating method of Ostrowski [2] is as efficient as cubic Bairstow(m) for well conditioned factors, but very slightly worse for multiple roots. The recommendation of cubic Bairstow(2) agrees with the judgement of Salzer [3].

REFERENCES

1. J. Wilkinson, Rounding Errors in Algebraic Processes, H.M.S.O., 1963.
2. A. Ostrowski, Tech. Rep. 7, Stanford University, 1960.
3. H. Salzer, Num. Math. V 3, p. 120, 1961.
4. L. B. Rall, M.R.C. Tech. Summary Report No. 617, 1966.

UNIVERSITY OF SHEFFIELD

ENGLAND