Method of Successive Polynomial Substitutions for Computing Roots of Polynomials ()
1. Introduction
Numerous different techniques such as the bisection method, fixed-point iteration, Newton-Raphson method, are available for obtaining the roots of equations in general. To a lesser extent, there are solution techniques like Laguerre’s method, Müller’s method, and Bairstow’s method, specific to the roots of polynomials (Chapra and Canale [1], pp. 123-202). The Weierstrass-Durand-Kerner method, essentially due to Weierstrass [2] but about 70 years later independently rediscovered by Durand [3] and Kerner [4], is also a well-known root-finding algorithm.
Based on the work of Traub [5], Jenkins and Traub [6] introduced an algorithm for computing zeros of polynomials with complex coefficients, programmed as the CPOLY algorithm [7]. A faster version of the algorithm for real polynomials was also formulated [8] and later presented as the RPOLY algorithm [9]. Currently, RPOLY is regarded as the standard in black-box polynomial root-finders due to its robustness, accuracy, and efficiency.
The present approach is a generalized and preconditioned version of a recursive method first introduced in Beji [10] for obtaining the roots of cubic polynomials. The method proceeds by reducing the degree of the polynomial by one at each stage of successive polynomial substitutions till the second degree. All the roots of a given polynomial, whether real or complex, are obtained in a single run without requiring any initial guess. The entire approach and the routine implementing it are found to be quite robust and accurate for polynomials of any degree with real or complex coefficients. When the implementation simplicity and efficiency of the method are considered, a professional version of the given generic algorithm may be a worthwhile competitor of the Jenkins-Traub algorithm.
2. Method of Successive Polynomial Substitutions
A polynomial of
-degree can be written as
(1)
where
is the degree of polynomial and
’s are real or complex coefficients. Introducing a change of variable
and setting
give
(2)
where
is a real or complex constant to be determined. Dividing the entire equation by
results in
(3)
Following the approach introduced in Beji [10], we set
and solve for
to transform Equation (3) into
(4)
Redefining the coefficients as
, Equation (4) is rewritten as
(5)
Depending on n being even (+) or odd (−), the multiplication of the roots of (5) is obviously
; therefore, Equation (5) has at least one root whose magnitude is less than unity, unless all the roots are equal in magnitude. Leaving the special case aside we can reason as follows. If
is the root less than unity, then,
(6)
In view of (6), a very rough approximation is possible by neglecting the term with the highest power zn in (5) and writing
(7)
which reduces the degree of polynomial to be solved by one. But it is possible to do much better by successive polynomial submissions. First, multiply Equation (5) by z and write it as
(8)
Next, use (5) again to replace
on the right in (8):
(9a)
(9b)
We now have the term
on the left, and since
for
, neglecting
in (9b) would definitely result in a better approximation compared to (7), where
is neglected:
(10)
Repeating the substitutions as many times as desired according to a definite criterion eventually leads to a polynomial one degree less,
, than the original polynomial of degree
. By recalling that the sum of roots is equal to the opposite sign value of the coefficient of the second highest power term, the first root eliminated in the reduction process can easily be determined by a simple subtraction as
. Here,
is the coefficient of
term of the original polynomial of degree
and
is the coefficient of
term of the reduced polynomial of degree
. Afterwards, the same recursion process is applied to the reduced polynomial to reduce it one degree more and subsequently the second root is computed. The recursive process is repeated until a second-degree polynomial is obtained and the last two roots are computed from the well-known analytical expression, the quadratic formula. Afterwards, roots of the original polynomial can be obtained from
. But it turns out that this transformation is not necessary at all because the inverse transformation following successive substitutions yields a polynomial which is identical to the one that can be obtained without transformation. Therefore, the change of variable,
, should be viewed only as a formal justification process of the present approach.
2.1. A Demonstrative Application
To clarify the method further we apply it to a general cubic polynomial
without transformation. First, set
, divide by
and redefine its new coefficients as
, etc. so that
(11)
Neglecting the
term results in a zeroth-order approximation with
. For a better approximation multiply Equation (11) by x and replace the
term by
so that
(12)
We get a first-order approximation now if the term
is neglected,
(13)
The following second-degree polynomials listed in Table 1 are obtained for the first three approximations.
Table 1. Quadratic polynomials resulting from successive substitutions of a general cubic polynomial.
Degree Neglected |
Corresponding Quadratic Polynomial |
3rd |
|
4th |
|
5th |
|
Higher-order approximations are obtained in the same manner; however, the coefficients get larger and cause computational inaccuracies. To avoid this problem, the reduced polynomial should be divided by the coefficient of
term at each step. Note that this coefficient would never become zero as we must have a second-degree polynomial to get two more roots besides the one eliminated through the degree-reduction process. In other words, the fundamental theorem of algebra ensures a non-zero coefficient for the highest term of the reduced polynomial.
As a numerical example, we now consider a particular cubic polynomial
whose roots are obviously
,
, and
. Following the above outlined method and employing the suggested normalization at each step result in Table 2. Roots obtained from the last quadratic
are
and
and the process is clearly leading to
with roots
and
.
Table 2. Quadratic polynomials resulting from successive substitutions of a particular cubic polynomial.
Degree Neglected |
Corresponding Quadratic Polynomial |
3rd |
|
5th |
|
10th |
|
2.2. Formulation of Recursive Scheme
With the guidance of preceding examples, we can formulate the recursive scheme to be employed for determining coefficients of the reduced degree,
, polynomial. First, a single sweep is performed for introducing a second set of coefficients:
(14)
where bi’s are the normalized coefficients exemplified in Table 2. Then, a succession of sweeps till a definite convergence criterion is met:
(15)
The scheme above reduces
-degree polynomial to the
-degree polynomial and therefore can be used only for reducing a cubic polynomial
to a quadratic polynomial for obtaining all the roots. For higher degree polynomials
it is necessary to reduce the degree of polynomial more than once till a quadratic polynomial is obtained. The generalized version, which reduces the degrees successively from
to
, is for the initial sweep
(16)
and for the succession of sweeps till convergence
(17)
Note that the special case formulated by Equations (14) and (15) corresponds to
hence
to
, just one-degree reduction in the general scheme of (16) and (17). FORTRAN code given in the Appendix basically uses Equations (16) and (17). At any stage of degree-reduction the repeated sweeps continue until the absolute value of the last coefficient of the polynomial, bn, differs less than 10−14 from the one computed in the previous step. Checking only the last coefficient is sufficient as all the coefficients are computed interrelatedly and the convergence of a coefficient implies the convergence of all the coefficients. Computations are carried out entirely as complex numbers so that roots of polynomials in most general forms, including complex polynomial coefficients, can be computed.
3. Preconditioning
Virtually every numerical technique requires the satisfaction of a definite condition or conditions to prevent its failure. As it is obvious from Equations (8) and (9), for the present methodology this condition is
. By a simple shift, which may be called as the preconditioning procedure, this particular requirement can be handled and the scheme works virtually for all polynomials. Preconditioning is achieved by changing the independent variable, say x, to
and then specifying the shift θ in a way to set a specific coefficient of the polynomial to a desired value. In order to implement such a measure, it is first necessary to introduce a convenient way of formulating the intended shift.
3.1. Transformation of Polynomials via Taylor Series
A change of independent variable,
, transforms the general
degree polynomial
via the Taylor series expansion, into
:
(18)
where θ is a real or complex constant to be determined and primes indicate differentiation with respect to the independent variable. The above expression for instance is quite practical for implementing the so-called Tschirnhaus transformation [11] which makes the coefficient of
term zero; specifically,
. Incidentally, the transformation attempted by Tschirnhaus was more general as he stated in the first sentence of his article: “We have learned from DesCartes’ geometry by what method the second term might reliably be removed from a given equation; but on the question of removing multiple intermediate terms I have seen nothing hitherto in the analytic arts.” but was not as much successful as he claimed. Thus, according to Tschirnhaus himself, DesCartes was the first to introduce the method to remove the second term; that is,
term.
The use of (18) for easily obtaining the roots of quadratic equation
by removing the second term is now demonstrated. In the transformed expression,
, setting
gives
. Then,
or
. On the other hand, when
, the transformed polynomial becomes
hence
. Noting that
and that
we obtain the roots as
, which, in turn, can be shown to be identical with the well-known quadratic formula. Applications of (18) to cubic, quartic, and quintic equations can be found in Beji [12].
3.2. Preconditioning Measure
Clearly, coefficient
which multiplies
must be nonzero if the successive polynomial substitutions are to be carried out successfully as Equations (8) and (9) show. Equivalently, when the transformed polynomial is considered, the coefficient of
must never vanish to prevent a complete failure of the procedure. Thus, for ensuring a nonzero coefficient for
in the transformed polynomial,
is set to a constant,
, where
is the imaginary unit. Although in principle any constant would work, the value
is selected based on systematic trial computations with polynomials of differing degrees.
The trials revealed that this particular value typically results in fastest convergence rates and that unequal real and imaginary parts is essential for convergence in special cases such as
, where all the intermediate coefficients are zero. Accordingly,
(19)
Once the shift θ is determined according to (19), all the coefficients of transformed polynomial can easily be computed with the help of (18), and this is precisely the procedure followed for the code given in the Appendix.
4. Sample Computations
The methodology is implemented in the FORTRAN program given in the Appendix. Maximum degree of polynomials to be solved is arbitrarily set to
; likewise, the maximum number of allowed iterations is defined as
. In accord with the programming language and machine capacity, the computational precision for convergence at each degree-reduction is specified as
. The coefficients of polynomial are entered in complex number format in parenthesis as the code is written to accommodate the most general case possible. Results are given in 10 decimal places.
4.1. P3(x) = 6x3 − 17x2 − 5x + 6
Figure 1 shows the execution outcome of the code for a third-degree polynomial constructed as
with roots
,
, and
, which are all real. For this case, only 56 iterations or sweeps are needed to reduce the degree of polynomial from 3rd to 2nd with
precision and these 56 iterations are all the needed to compute three roots, which are exact to 10 decimal places. To check the accuracy of each computed root, the polynomial is evaluated too; the results are shown on the right.
Figure 1. Screenshot of code execution for
.
4.2. P4(x) = 3x4 − 2x3 + x2 + 4x + 5
A fourth-degree polynomial with arbitrarily selected real coefficients is considered and the code gives two pairs of complex conjugate zeros. Figure 2 shows the results which are quite accurate as revealed by the polynomial evaluations, all zero, given on the right. Reducing the degree of polynomial from 4th to 3rd requires 209 iterations and from 3rd to 2nd, 245 iterations; hence, altogether, obtaining all the roots takes 454 sweeps.
Figure 2. Screenshot of code execution for
.
4.3. P5(x) = (−2 + 3i)x5 + (5 + 5i)x4 + (0 − i)x3 + (7 + 0i)x2 + (1 − 2i)x
+ (−15 + 12i)
A fifth-degree polynomial with arbitrarily selected real and complex coefficients is considered. The computed roots, seen in Figure 3, are all complex but none complex conjugate because the polynomial has complex coefficients. Iteration numbers for computational sequences, from 5th down to 2nd degree are 218, 4507, and 60, summing up to 4785 sweeps altogether for the computational precision requirement
. Functional evaluations shown on the right are all zero to 10 decimal places, verifying that the computed roots are quite accurate.
Figure 3. Screenshot of code execution for P5(x) = (−2 + 3i)x5 + (5 + 5i)x4 + (0 − i)x3 + (7 + 0i)x2 + (1 – 2i)x + (−15 + 12i).
4.4. P7(x) = x7 − 6.01x6 + 12.54x5 − 8.545x4 − 5.505x3 + 12.545x2 −
8.035x + 2.01
A particularly difficult case used in Jenkins and Traub [8] is factored as
. Obviously, the repeated roots originating from the factor
and the very close magnitude roots from
may cause computational problems. Figure 4 shows the computed roots and corresponding polynomial evaluations to the 10 decimal places. For the five distinct zeros, the present algorithm works virtually perfectly well and definitely better than the Jenkins-Traub algorithm. Only the coincident roots,
and
, resulting from
could not be obtained with the accuracy of
despite the allowed maximum 10,000 iterations, simply because the code cannot decide which one of the roots to converge as both are the same. Note however that the close magnitude roots
and
are computed exactly to 10 decimal places of accuracy. Iteration numbers for computational sequences, from 7th down to 2nd degree are 58, 135, 108, 10,000, and 54, summing up to 10,355 sweeps altogether. The greatest portion of iterations, counting to 10,000 due to the repeated roots, could be reduced down to 5,198 by setting
without appreciably affecting the accuracy of the computed roots. Nevertheless, it is clear that repeated roots are the weakest spot of the present scheme and its source may be traced back to the paragraph between Equations (5) and (6).
![]()
Figure 4. Screenshot of code execution for P7(x) = x7 − 6.01x6 + 12.54x5 − 8.545x4 − 5.505x3 + 12.545x2 − 8.035x + 2.01.
4.5. P9(x) = (−2 + i)x9 + (1 + i)x8 + (3 − 2i)x7 + (5 + 0i)x6 + (−4 +
3i)x5 + (7 + 7i)x4 + (6 + 0i)x3 + (−3 + 0i)x2 + (2 + 2i)x + (10 + 10i)
A ninth-degree polynomial with arbitrarily selected real and complex coefficients is the last example treated. As in §4.3 the roots shown in Figure 5 are all complex but none complex conjugate because the polynomial has complex coefficients. The results are highly accurate as all the functional evaluations are zero to the 10 decimal places. Number of sweeps for sequences from 9th to 2nd degree are 369, 314, 875, 133, 108, 440, and 95, adding up to 2334 sweeps for the entire computations.
Figure 5. Screenshot of code execution for P9(x) = (−2 + i)x9 + (1 + i)x8 + (3 − 2i)x7 + (5 + 0i)x6 + (−4 + 3i)x5 + (7 + 7i)x4 + (6 + 0i)x3 + (−3 + 0i)x2 + (2 + 2i)x + (10 + 10i).
4.6. Some Special Cases
The most critical special case that could make the present scheme fail has been identified as
, and this pitfall is avoided by a shift that sets the corresponding term in the transformed polynomial to a constant. This preconditioning measure avoids quite a number of problems associated with a variety of special cases. In this section, we are going to enumerate some noteworthy special cases for which, despite unusual characteristics of polynomial, the code still performs well.
Polynomials like
may at first sight appear problematic since all the coefficients but the fist and the last, are zero. However, the scheme, in virtue of a transformation to set the coefficient of the second term to a constant, handles all these cases accurately. For instance, for
with 90 iterations, roots of
are computed as x1 = −1.0000000000 + 0.0000000000, x2 = 0.5000000000 − 0.8660254038i, x3 = 0.5000000000 + 0.8660254038i, which may be regarded exact. Likewise, it takes 157 + 96 = 253 iterations to compute all the roots of
as x1 = −0.7071067812 − 0.7071067812i, x2 = −0.7071067812 + 0.7071067812i, x3 = 0.7071067812 − 0.7071067812i, and x4 = 0.7071067812 + 0.7071067812i. Higher degree cases
,
, which are tested to the 10th degree, are solved with the same ease and accuracy without problem.
For the present formulation, as observed in §4.4 of the Jenkins-Traub test, repeated or coincident roots of the form
poses a potential problem since the procedure is not able to decide for a root to converge. Unsurprisingly, for
the code produces just barely acceptable results x1 = −1.0001999401 − 0.0000999701i, x2 = −0.9998500497 − 0.0000499752i,
after the maximum allowed 10,000 iterations. For
we obtain similar results after 10,000 + 10,000 = 20,000 iterations. For higher degrees we get acceptable results but clearly the repeated zeros are not easily handled. A remedy to this weakest spot should be implemented in the professional version of the code in an appropriate way.
Quite trivial case of
is problematic; again due to repeated roots, but can still be computed. For instance,
is solved after the maximum allowed 10,000 iterations as x1 = −0.0001999412 − 0.0000999698i, x2 = 0.0001499502 − 0.0000499782i, and
. The fourth-degree case
produces similar near zero values after 10,000 + 10,000 = 20,000 iterations. Nevertheless, relatively low performance of the method for this most primitive form,
, does not imply a serious defect at all since the solution is trivial.
The above screening of main special cases clearly shows that the repeated roots are the only special case that impair the accuracy of computations for the proposed root-finding algorithm. No case has been identified with a complete failure; though, naturally, there may remain some overlooked exceptional cases. Overall however, the methodology works remarkably well and produces quite accurate solutions for polynomials with real or complex coefficients.
5. Concluding Remarks
A new method of computing all the roots of a polynomial of any degree is introduced. The scheme proceeds by reducing the degree of the given polynomial by one at each stage of successive polynomial substitutions till reaching a quadratic. A simple shift via Taylor series expansion is introduced to implement a preconditioning measure for avoiding failure of the scheme due to the vanishing of the second highest-degree term. Computations are performed in the complex domain for the sake of generality and several demonstrative examples are included. Except for repeated roots, the code works virtually perfectly well with high efficiency and accuracy without any observed failure. A professionally restructured form of the present generic algorithm in different programming languages is expected to be very useful for practical uses as its performance proves to be quite competitive even against the Jenkins-Traub algorithm, which is the standard root-finder.
Appendix: FORTRAN Code for Implementing Method of Successive Polynomial Substitutions: SPS
use msimsl
parameter(nmax=100,imax=10000,eps=1.e-14)
double complex a(0:nmax),b(1:nmax),c(0:nmax),x(1:nmax),xb(1:nmax)
double complex dlt,poly,tht
c
write(*,10)
10 format(3x,31HEnter the degree of polynomial:,/)
read(*,*)n
c
write(*,20)
20 format(3x,37HStarting from the highest degree term,
&38H enter the coefficients of polynomial:,/)
c
do i=0,n
write(*,30)i
30 format(3x,1Ha,i2)
read(*,*)a(i)
c(i)=a(i)
enddo
c
c Divide all coefficients by a(0)
do i=1,n
a(i)=a(i)/a(0)
enddo
a(0)=dcmplx(1.,0.)
c
c Preconditioning of polynomial by setting a(1) to (n,n/2)
tht=(dcmplx(1.,1./2.)-a(1)/n)/a(0)
c
c Recompute polynomial coefficients for making a(1)=(n,n/2)
do j=n,1,-1
a(j)=a(j)*dfac(n-j)
do k=j-1,0,-1
a(j)=a(j)+a(k)*(dfac(n-k)/dfac(j-k))*tht**(j-k)
enddo; enddo
c
c Coefficients of transformed polynomial
do j=n,1,-1
a(j)=a(j)/dfac(n-j)
enddo
c
c
c Main Program to solve roots of transformed polynomial
c
do 50 j=1,n-2
c
c A single substitution
b(j)=a(j)*a(j)-a(j+1)
do i=j+1,n-1
b(i)=(a(i)*a(j)-a(i+1))/b(j)
enddo
b(n)=a(j)*a(n)/b(j)
c
c Repeat till convergence
c
k=0
c
40 xb(n)=b(n)
c
k=k+1
c
b(j)=b(j+1)-a(j)
do i=j+1,n-1
b(i)=(b(i+1)-a(i))/b(j)
enddo
b(n)=-a(n)/b(j)
c
if(abs(xb(n)-b(n)).gt.eps.and.k.lt.imax) then
goto 40
else; goto 99; endif
c
99 x(j)=-a(j)+b(j+1)
c
do i=j+1,n
a(i)=b(i)
enddo
c
dlt=cdsqrt(a(j+1)*a(j+1)-4.*a(j+2))
x(j+1)=(-a(j+1)+dlt)/2.
x(j+2)=(-a(j+1)-dlt)/2.1
c
50 continue
c
c Write the roots and corresponding values of the polynomial
c
write(*,100)
100 format(/,40x,4HRoot,30x,19HValue of Polynomial,/)
write(*,200)
200 format(8x,11HRoot Number,13x,4HReal,10x,9HImaginary,18x,4HReal,
&10x,9HImaginary,/)
c
do j=1,n
c
c Evaluate polynomial for roots
poly=c(n)
do i=n-1,0,-1
poly=poly+c(i)*(x(j)+tht)**float(n-i)
enddo
c
write(*,300)j,x(j)+tht,poly
300 format(12x,1Hx,i2,8x,2f16.10,9x,2f16.10)
c
enddo
c
write(*,400)
400 format(/)
c
stop
end
NOTES
1These three lines may be moved outside the do-loop below 50 continue after setting j = n − 1 to avoid unnecessary computations.