Translation:Demonstratio nova theorematis omnem functionem algebraicam rationalem integram unius variabilis in factores reales primi vel secundi gradus resolvi posse

New Proof of the Theorem that Every Integral Rational Algebraic Function of One Variable Can Be Resolved into Real Factors of the First or Second Degree (1799)
by Carl Friedrich Gauss, translated from Latin by Wikisource


1.

Any given algebraic equation can be reduced to the form

$$x^m + Ax^{m-1} + Bx^{m-2} + \text{etc.} + M = 0$$

such that $m$ is a positive integer. If we denote the first part of this equation by $X$ and assume that the equation $X = 0$ is satisfied by several unequal values of $x$, say by setting $x = \alpha$, $x = \beta$, $x = \gamma$, etc., then the function $X$ will be divisible by the product of the factors $x - \alpha$, $x - \beta$, $x - \gamma$, etc. Conversely, if the product of several simple factors $x - \alpha$, $x - \beta$, $x - \gamma$, etc. divides the function $X$, the equation $X = 0$ will be satisfied by setting $x$ equal to any of the quantities $\alpha$, $\beta$, $\gamma$, etc. Finally, if $X$ is equal to the product of $m$ such simple factors (whether they are all different or some of them are identical), then no other simple factors besides these can divide the function $X$. Therefore, an equation of degree $m$ cannot have more than $m$ roots; but it is evident that an equation of degree $m$ can have fewer roots, even if $X$ is resolvable into $m$ simple factors: if some of these factors are identical, then the number of different ways the equation can be satisfied will necessarily be less than $m$. Nevertheless, for the sake of elegance, geometers prefer to say that the equation has $m$ roots even in this case, but that some of them are equal to each other: a liberty they could certainly take.

2.

The things explained so far are sufficiently demonstrated in algebraic works and nowhere violate geometric rigor. However, analysts seem to have adopted too hastily, and without solid proof, the theorem on which almost the entire theory of equations is built: that a function such as $X$ can always be resolved into $m$ simple factors, or, what amounts to the same thing, that an equation of degree $m$ does indeed have $m$ roots. But since even in quadratic equations we often encounter cases that contradict this theorem, algebraists were forced to introduce a certain imaginary quantity whose square is $-1$, and then they acknowledged that if quantities of the form $a + b\sqrt{-1}$ are admitted on the same footing as real ones, the theorem holds not only for quadratic but also for cubic and biquadratic equations. However, it is by no means permissible to infer from this that any equation of the fifth or higher degree can be satisfied by admitting quantities of the form $a + b\sqrt{-1}$, or, as it is often expressed (although I would prefer a less slippery phrase), that the roots of such an equation can be reduced to the form $a + b\sqrt{-1}$. This theorem does not differ in substance from the one stated in the title of this paper, and the aim of this dissertation is to provide a new rigorous demonstration of it.

Moreover, since analysts discovered that there are infinitely many equations that have no roots at all unless quantities of the form $a + b\sqrt{-1}$ are admitted, these fictitious quantities, considered as a special kind of quantity and called imaginary to distinguish them from real ones, have been introduced into the whole of analysis; with what justification, I do not dispute here. I will complete my demonstration without any aid from imaginary quantities, although I would be allowed the same freedom that all recent analysts have used.

3.

Although what is presented in most elementary books as the proof of our theorem is so trivial and deviates so much from geometric rigor that it is scarcely worth mentioning, I will touch on it briefly so that nothing seems to be lacking. To demonstrate that the equation

$$x^m + Ax^{m-1} + Bx^{m-2} + \text{etc.} + M = 0$$

or $X = 0$ indeed has $m$ roots, they attempt to prove that $X$ can be resolved into $m$ simple factors. To achieve this, they assume $m$ simple factors $x - \alpha$, $x - \beta$, $x - \gamma$, etc., where $\alpha$, $\beta$, $\gamma$, etc. are still unknown, and they set the product of these equal to the function $X$. Then, by comparing coefficients, they deduce $m$ equations from which they claim that the unknowns $\alpha$, $\beta$, $\gamma$, etc. can be determined, since the number of these unknowns is also $m$. In particular, $m - 1$ unknowns can be eliminated, leading to an equation that contains only the unknown $\alpha$. Leaving aside other criticisms that could be made of this argument, let us simply ask how we can be certain that the final equation indeed has any root. Why could it not happen that no magnitude in the entire range of real and imaginary quantities satisfies either this final equation or the proposed one? However, experts will easily see that this final equation must necessarily be entirely identical to the proposed one if the calculation is properly conducted; namely, after eliminating the unknowns $\beta$, $\gamma$, etc., the equation

$$\alpha^m + A\alpha^{m-1} + B\alpha^{m-2} + \text{etc.} + M = 0$$

should emerge. There is no need to elaborate further on this reasoning.

Some authors, who seem to have perceived the weakness of this method, take it as an axiom that any equation indeed has roots, whether possible or impossible. However, they do not seem to have clearly explained what is meant by possible and impossible quantities. If possible quantities are meant to denote the same as real and impossible the same as imaginary, this axiom cannot be admitted without demonstration; it must be proved. Nevertheless, the terms do not seem to be intended in that sense; rather, the meaning of the axiom appears to be: ‘Although we are not yet certain that there necessarily exist $m$ real or imaginary quantities satisfying a given equation of degree $m$, we will assume it for a while; for if by chance it should happen that so many real and imaginary quantities cannot be found, then at least we have an escape, and we can say that the remaining ones are impossible.’ If someone prefers to use this phrase rather than simply saying that the equation in this case will not have so many roots, I have no objection; but if they then treat these impossible roots as if they were something true, and, for example, say that the sum of all the roots of the equation $x^m + Ax^{m-1} + \text{etc.} = 0$ is $-A$ even if some of them are impossible (an expression which properly means: even if some are missing), then I cannot approve of it. For impossible roots, accepted in this sense, are still roots, and then that axiom cannot be admitted without some kind of demonstration, as it is not unreasonable to ask whether equations can exist that do not even have impossible roots.[1]

4.

Before I review the demonstrations of our theorem by other geometers and point out what seems to me to be objectionable in each, I make the following observation: it is sufficient to show only that any equation of any degree

$$x^m + Ax^{m-1} + Bx^{m-2} + \text{etc.} + M = 0$$

or $X = 0$ (where the coefficients $A$, $B$, etc. are assumed to be real) can be satisfied in at least one way by a value of $x$ of the form $a + b\sqrt{-1}$. For it is evident that $X$ will then be divisible by the real quadratic factor $xx - 2ax + aa + bb$ if $b$ is not $= 0$, and by the real simple factor $x - a$ if $b = 0$. In both cases, the quotient will be real and of lower degree than $X$; and since, by the same reasoning, this quotient must in turn have a real factor of the first or second degree, it is clear that by continuing this operation, the function $X$ will eventually be resolved into real simple or double factors, or, if you prefer to use two simple imaginary factors instead of each real double factor, into $m$ simple factors.
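The reduction described here can be illustrated with a small sketch. The polynomial $x^4 + 4$ and its root $1 + \sqrt{-1}$ are illustrative choices that do not appear in the text, and the helper names `poly_divmod` and `real_quadratic_factor` are hypothetical. The sketch only checks that dividing by $xx - 2ax + aa + bb$ leaves a real quotient of lower degree and no remainder.

```python
from fractions import Fraction

def poly_divmod(num, den):
    """Long division of polynomials given as coefficient lists, highest degree first."""
    work = [Fraction(c) for c in num]
    quotient = []
    for i in range(len(num) - len(den) + 1):
        c = work[i] / den[0]
        quotient.append(c)
        for j, d in enumerate(den):
            work[i + j] -= c * d
    remainder = work[len(num) - len(den) + 1:]
    return quotient, remainder

def real_quadratic_factor(a, b):
    """Coefficients of xx - 2ax + aa + bb, the product of x - (a + bi) and x - (a - bi)."""
    return [1, -2 * a, a * a + b * b]

# Illustrative assumption (not from the text): X = x^4 + 4 vanishes at x = 1 + sqrt(-1).
X = [1, 0, 0, 0, 4]
a, b = 1, 1
root = complex(a, b)
value_at_root = sum(c * root ** (len(X) - 1 - k) for k, c in enumerate(X))

factor = real_quadratic_factor(a, b)          # x^2 - 2x + 2
quotient, remainder = poly_divmod(X, factor)  # quotient x^2 + 2x + 2, zero remainder
print(abs(value_at_root), quotient, remainder)
```

Exact rational arithmetic is used for the division so that a zero remainder is an exact statement rather than a rounding accident; the same step can then be repeated on the quotient.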

5.

The first proof of the theorem is owed to the illustrious geometer d’Alembert and can be found in Recherches sur le calcul integral, Histoire de l’Acad. de Berlin, Année 1746, p. 182 ff. The same proof is also in Bougainville, Traité du calcul integral, à Paris 1754, p. 47 ff. The main points of his method are as follows.

First, he shows that if any function $X$ of the variable $x$ becomes $= 0$ either for $x = 0$ or for $x = \infty$, and if it can acquire an infinitely small positive real value by assigning a real value to $x$, then this function can also obtain an infinitely small negative real value, either from a real value of $x$ or from an imaginary value of the form $p + q\sqrt{-1}$. Namely, denoting by $\Omega$ the infinitely small value of $X$ and by $\omega$ the corresponding value of $x$, he asserts that $\omega$ can be expressed by a highly convergent series $a\Omega^{\alpha} + b\Omega^{\beta} + c\Omega^{\gamma}$ etc., where the exponents $\alpha$, $\beta$, $\gamma$, etc. are continually increasing rational quantities, which therefore become positive at least from a certain distance from the beginning onward and make the terms in which they appear infinitely small. Now, if among all these exponents there is none that is a fraction with an even denominator, all the terms of the series become real for both positive and negative values of $\Omega$; but if some fractions with even denominators are found among these exponents, it is established that, for negative values of $\Omega$, the corresponding terms are contained in the form $p + q\sqrt{-1}$. However, owing to the infinite convergence of the series, in the former case it suffices to retain only the first (i.e., the largest) term; in the latter case, it is unnecessary to go beyond the term that first introduces an imaginary part.

Through similar reasoning, it can be shown that if $X$ can obtain a negative infinitely small value from a real value of $x$, then that function can acquire a positive real infinitely small value from a real value of $x$ or from an imaginary value of the form $p + q\sqrt{-1}$.

Hence he further concludes that a real finite value of $X$ can also be found, in the former case negative and in the latter case positive, which can be produced from an imaginary value of $x$ of the form $p + q\sqrt{-1}$.

From this, it follows that if $X$ is a function of $x$ that obtains a real value $V$ from a real value $v$ of $x$, and also attains a real value greater or smaller than $V$ by an infinitely small quantity from a real value of $x$, then it can also receive a real value smaller or greater than $V$ (respectively) by an infinitely small quantity, by assigning to $x$ a value of the form $p + q\sqrt{-1}$. This is easily derived from the above if $X$ is conceived to be replaced by $V + Y$ and $x$ by $v + y$.

Finally, d’Alembert asserts that if $X$ can traverse some whole interval between two real values $R$, $S$ (i.e., become equal to $R$, to $S$, and to all intermediate real values) when $x$ is assigned values always of the form $p + q\sqrt{-1}$, then the function $X$ can still be increased or decreased by any finite real quantity (according as $S > R$ or $S < R$), while $x$ always remains of the form $p + q\sqrt{-1}$. For if a real quantity $U$ were given (with $S$ supposed to lie between $U$ and $R$) to which $X$ could not be made equal by such a value of $x$, then there would necessarily be a maximum value of $X$ (when $S > R$; a minimum when $S < R$), say $T$, obtained from a value $p + q\sqrt{-1}$ of $x$, such that no value of $x$ of similar form could be assigned that would bring the function $X$ closer to $U$ by even the smallest excess. Now, if in the equation between $X$ and $x$, $p + q\sqrt{-1}$ is substituted everywhere for $x$, and then the real part and the part involving the factor $\sqrt{-1}$ are each set equal to zero, two equations will result (in which $p$, $q$, and $X$ occur mixed with constants) that, through elimination, can yield two others, one containing $p$, $X$, and constants, and the other, free of $p$, containing only $q$, $X$, and constants. Therefore, since $X$ has traversed all values from $R$ to $T$ by real values of $p$, $q$, then according to the above $X$ can still approach the value $U$ more closely by assigning to $p$, $q$ values such as $\alpha + \gamma\sqrt{-1}$, $\beta + \delta\sqrt{-1}$ respectively. From this, it follows that $x = \alpha - \delta + (\gamma + \beta)\sqrt{-1}$, i.e. $x$ is still of the form $p + q\sqrt{-1}$, contrary to the hypothesis.

Now, if $X$ is supposed to denote a function such as $x^m + Ax^{m-1} + Bx^{m-2} + \text{etc.} + M$, it is seen without difficulty that real values can be assigned to $x$ such that $X$ traverses some whole interval between two real values. Therefore, $x$ can also obtain some value of the form $p + q\sqrt{-1}$ for which $X$ becomes $= 0$. Q.E.D.[2]

6.

The objections that can be raised against d’Alembert’s demonstration mostly come down to the following.

1. d’Alembert raises no doubt about the existence of values of $x$ to which the given values of $X$ correspond, but assumes it and only investigates the form of these values.

Although this objection is in itself very serious, it pertains here only to the form of expression, which can easily be corrected to completely invalidate it.

2. The assertion that $\omega$ can always be expressed by such a series as he posits is certainly false if $X$ is supposed to represent any transcendental function whatever (as d’A. hints in several places). This is evident, for example, if $X = e^{1/x}$, or $x = \frac{1}{\log X}$. However, if we restrict the demonstration to the case where $X$ is an algebraic function of $x$ (which is sufficient for the present matter), the proposition is certainly true. Nevertheless, d’A. provided no evidence to support his assumption; the illustrious Bougainville assumes that $X$ is an algebraic function of $x$ and recommends Newton’s parallelogram for finding the series.

3. He uses infinitely small quantities more freely than can be justified with geometric rigor, or at least than would be granted by a careful analyst in our age (where they rightly face skepticism). He also did not explain sufficiently clearly the leap from an infinitely small value of $\Omega$ to a finite one. His conclusion that $\Omega$ can also obtain a finite value seems to be derived not so much from the possibility of an infinitely small value of $\Omega$ as from the fact that, when $\Omega$ denotes a very small quantity, the great convergence of the series ensures that the more terms of the series are taken, the more closely the true value of $\omega$ is approached, or the more accurately the equation expressing the relation between $\omega$ and $\Omega$, or $x$ and $X$, is satisfied. Furthermore, the entire argument seems too vague to draw any rigorous conclusions from it: it should be noted that there are series that, no matter how small a value is assigned to the quantity according to whose powers they progress, always diverge, so that if they are continued far enough, terms greater than any given quantity are reached[3]. This happens when the coefficients of the series constitute a hypergeometric progression. Therefore, it should necessarily have been demonstrated that such a hypergeometric series cannot arise in the present case.

However, it seems to me that d’A. did not rightly resort to infinite series here, and they are not suitable for establishing this fundamental theorem of the theory of equations.
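A small sketch may make the remark about always-divergent series concrete. It assumes that "hypergeometric progression" can be read in the older sense of factorial-like coefficients, so the series $\sum n!\,x^n$ is used as the example; this reading, the function name `first_term_exceeding`, and the sample values are assumptions, not taken from the text.

```python
from fractions import Fraction
from math import factorial

def first_term_exceeding(x, bound):
    """Return (n, term) for the smallest n with n! * |x|^n > bound, for nonzero x."""
    x = abs(Fraction(x))
    if x == 0:
        raise ValueError("x must be nonzero")
    n, term = 0, Fraction(1)        # term for n = 0 is 0! * x^0 = 1
    while term <= bound:
        n += 1
        term *= n * x               # n! x^n = (n-1)! x^(n-1) * n * x
    return n, term

# Hypothetical sample values: a small argument and a large bound.
n, term = first_term_exceeding(Fraction(1, 10), 10 ** 6)
print(n, float(term))
```

The loop always terminates, because the ratio of consecutive terms is $n\,x$, which eventually exceeds $1$ for any nonzero $x$; making $x$ smaller only delays the point at which the terms outgrow the bound.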

4. From the assumption that $X$ can attain the value $S$ but not the value $U$, it does not necessarily follow that there must be a value $T$ between $S$ and $U$ which $X$ can reach but not exceed. Another case remains: namely, it could be that there is a limit between $S$ and $U$ that can be approached as closely as desired by $X$, but which $X$ never actually reaches. From the arguments provided by d’A., it only follows that $X$ can always surpass any value it has reached by a finite quantity; for example, when it has become $= S$, it can still increase by some finite quantity $\Omega$, after which a new increment $\Omega'$ may occur, then another increase $\Omega''$, etc., so that no increment should be considered final, but there can always be a new one added. Although the multitude of possible increments is not limited by any boundaries, it could still be the case that if the increments $\Omega$, $\Omega'$, $\Omega''$, etc. continually decrease, the sum $S + \Omega + \Omega' + \Omega''$ etc. never reaches a certain limit, no matter how many terms are considered.

Although this case cannot occur when $X$ denotes an integral algebraic function of $x$, the method must nevertheless be regarded as incomplete so long as it is not demonstrated that the case cannot occur. However, when $X$ is a transcendental function or even a fractional algebraic function, this case can indeed occur, for example, whenever a certain value of $X$ corresponds to an infinitely large value of $x$. In that case, the d’Alembertian method seems not to be without many difficulties, and in some cases perhaps cannot be reduced to unquestionable principles at all.
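The gap in the argument can be seen in a minimal sketch, assuming a hypothetical starting value $S = 0$ and increments $\tfrac12, \tfrac14, \tfrac18, \dots$ chosen purely for illustration: every value reached can still be exceeded by a finite amount, yet the running total never reaches the limit $S + 1$.

```python
from fractions import Fraction

def partial_sums(start, increments):
    """Running totals start + inc_1, start + inc_1 + inc_2, ... as exact fractions."""
    total = Fraction(start)
    sums = []
    for inc in increments:
        total += inc
        sums.append(total)
    return sums

S = 0                                                      # hypothetical starting value
increments = [Fraction(1, 2 ** k) for k in range(1, 41)]   # continually decreasing increments
sums = partial_sums(S, increments)
limit = S + 1                                              # approached, never attained

print(float(sums[-1]), limit - sums[-1])
```

However many increments are added, the remaining distance to the limit stays positive (here it equals the last increment), which is exactly the case the objection says must be excluded by proof.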

For these reasons, I cannot consider the d’Alembertian demonstration satisfactory. Despite this, I do not believe that the true essence of the demonstration is in any way compromised by these objections, and I think that on the same foundation (though in a vastly different manner, and at least with greater circumspection) one can build not only a rigorous demonstration of our theorem, but also everything that can be desired concerning the theory of transcendental equations. I will discuss this matter more extensively on another occasion; see meanwhile below, article 24.

7.

After d’Alembert, Euler published his investigations on the same subject in Recherches sur les racines imaginaires des equations, Hist. de l’Acad. de Berlin A. 1749, p. 223 sqq. He presented two methods, and the essence of the first is summarized in the following.

First, Euler aims to demonstrate that if $m$ is any power of $2$, then the function $x^{2m} + Bx^{2m-2} + Cx^{2m-3} + \text{etc.} + M = X$ (where the coefficient of the second term is $= 0$) can always be resolved into two real factors, in which $x$ rises to $m$ dimensions. To achieve this, he assumes two factors:

$$x^m - ux^{m-1} + \alpha x^{m-2} + \beta x^{m-3} + \text{etc.}, \qquad x^m + ux^{m-1} + \lambda x^{m-2} + \mu x^{m-3} + \text{etc.},$$

where the coefficients $u$, $\alpha$, $\beta$, etc., $\lambda$, $\mu$, etc. are still unknown. Their product is set equal to the function $X$. The comparison of coefficients yields $2m - 1$ equations, and it only needs to be shown that the unknowns $u$, $\alpha$, $\beta$, etc., $\lambda$, $\mu$, etc. (whose number is also $2m - 1$) can be assigned real values satisfying these equations. Euler asserts that if $u$ is considered as known initially, such that the number of unknowns is one less than the number of equations, then by properly combining these using algebraic methods, all $\alpha$, $\beta$, etc., $\lambda$, $\mu$, etc. can be rationally determined, without any extraction of roots, by $u$ and the known coefficients $B$, $C$, etc. Furthermore, all $\alpha$, $\beta$, etc., $\lambda$, $\mu$, etc. can be eliminated, resulting in the equation $U = 0$, where $U$ is an integral function of $u$ and the known coefficients. It suffices here to know one property of this equation, namely, that the last term in $U$ (which does not involve the unknown $u$) must be negative. It follows that the equation must have at least one real root, meaning that $u$, and consequently $\alpha$, $\beta$, etc., $\lambda$, $\mu$, etc., can be determined in at least one real way. This property can be confirmed through the following reflections: when $x^m - ux^{m-1} + \alpha x^{m-2} + \text{etc.}$ is assumed to be a factor of the function $X$, $u$ will necessarily be the sum of $m$ roots of the equation $X = 0$, and thus it must have as many values as there are ways to choose $m$ out of $2m$ roots, which by the principles of the calculus of combinations is $\frac{2m \cdot (2m-1) \cdot (2m-2) \cdots (m+1)}{1 \cdot 2 \cdot 3 \cdots m}$. This number will always be oddly even, that is, twice an odd number (a not difficult demonstration is omitted here): if $2k$ is put for this number, then $k$ will be odd; the equation $U = 0$ will then be of degree $2k$. Now, since the second term is missing in the equation $X = 0$, the sum of all $2m$ roots will be $0$. It is clear that if the sum of any $m$ roots is $+p$, the sum of the remaining roots will be $-p$, i.e., if $+p$ is among the values of $u$, then $-p$ will also be among them.
Hence, Euler concludes that $U$ is the product of $k$ double factors of the form $uu - pp$, $uu - qq$, $uu - rr$, etc., where $+p$, $-p$, $+q$, $-q$, etc. denote all $2k$ roots of the equation $U = 0$. Therefore, because the number of these factors is odd, the last term in $U$ will be the square of the product $pqr$ etc., with a negative sign. Moreover, this product $pqr$ etc. can always be rationally determined from the coefficients $B$, $C$, etc., and will consequently be a real quantity. Thus, its square with a negative sign will certainly be a negative quantity. Q.E.D.
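The counting claim whose proof is omitted can at least be checked numerically. The sketch below reads "oddly even" as "twice an odd number" and tests the number of ways of choosing $m$ roots out of $2m$ for several powers of two; the function names are hypothetical, and a numerical check of this kind is of course no substitute for the omitted demonstration.

```python
from math import comb

def is_oddly_even(n):
    """True when n is twice an odd number."""
    return n % 2 == 0 and (n // 2) % 2 == 1

def count_values_of_u(m):
    """Number of ways to pick m roots out of 2m, i.e. the binomial coefficient C(2m, m)."""
    return comb(2 * m, m)

for exponent in range(1, 7):
    m = 2 ** exponent
    n = count_values_of_u(m)
    print(m, n, is_oddly_even(n))
```

For a value of $m$ that is not a power of two the property can fail: with $m = 3$ the count is $20$, which is four times an odd number, so the restriction to powers of two is doing real work in the argument.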

Since these two real factors of $X$ are of degree $m$, and $m$ is a power of the number $2$, each factor can again be resolved into two real factors of dimension $\tfrac{1}{2}m$ by the same reasoning. However, since repeated halving of the number $m$ necessarily leads eventually to two, it is evident that by the continuation of this operation, the function $X$ will ultimately be resolved into real second-degree factors.

If, on the other hand, a function is presented in which the second term is not lacking, say $x^{2m} + Ax^{2m-1} + Bx^{2m-2} + \text{etc.} + M$, where $2m$ still denotes a power of two, this will, by the substitution $x = y - \frac{A}{2m}$, be transformed into a similar function lacking a second term. Hence, it is easily concluded that such a function is also resolvable into real second-degree factors.
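The removal of the second term can be checked on an example. The sketch below assumes the substitution is a shift of the variable by the coefficient of the second term divided by the degree, which is the standard way to achieve this; the sample quartic and the helper name `substitute_shift` are illustrative assumptions.

```python
from fractions import Fraction

def substitute_shift(coeffs, s):
    """Coefficients of p(y + s), highest degree first, built up by Horner's scheme."""
    result = []
    for c in coeffs:
        times_y = result + [Fraction(0)]                    # result * y
        times_s = [Fraction(0)] + [s * r for r in result]   # result * s
        result = [p + q for p, q in zip(times_y, times_s)]
        result[-1] += c
    return result

# Hypothetical example of degree 4 with second coefficient A = 8.
X = [1, 8, 3, -5, 7]
degree = len(X) - 1
A = X[1]
shifted = substitute_shift(X, Fraction(-A, degree))   # x = y - A/degree
print(shifted)
```

The coefficient of $y^3$ in the result is zero, so the transformed function has the shape required by the preceding argument, and any real factorisation of it translates back by undoing the shift.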

Finally, for a given function of degree $n$, where $n$ is not a power of two: let the nearest higher power of two be denoted by $2m$, and multiply the given function by $2m - n$ arbitrary real simple factors. From the resolvability of the product into real second-degree factors, it is straightforwardly derived that the given function must also be resolvable into real factors of the second or first degree.

8.

Against this demonstration, one can object:

1. The rule by which E. concludes that from   equations,   unknowns     etc.,     etc. can all be rationally determined, is not general but often admits exceptions. For example, if in Art. 3 one attempts to express the remaining unknowns and coefficients rationally by considering some unknowns as known, they will easily find that this is impossible, and that the unknown quantities can only be determined by an equation of degree   Although it can be immediately seen a priori that this must necessarily happen, it could be rightly doubted whether, even in the present case, for certain values of   the situation is such that the unknowns     etc.,     etc. cannot be determined by an equation possibly of a degree greater than   For the case where the equation   is of the fourth degree, E. extracts rational values of the coefficients through   and the given coefficients; the same can indeed be done in all higher-degree equations, but it certainly requires a more extensive explanation. However, it seems worthwhile to delve more deeply and more generally into those formulas that rationally express     etc. through   etc.; I will undertake a more detailed discussion on this and other matters related to the theory of elimination (an argument by no means exhausted) on another occasion.

2. However, even if it is demonstrated that for an equation of any degree formulas can always be found that express $\alpha$, $\beta$, etc., $\lambda$, $\mu$, etc. rationally through $u$, $B$, $C$, etc., it is certain that for certain specific values of the coefficients $B$, $C$, etc., those formulas can become indeterminate, so that not only is it impossible to define those unknowns rationally from $u$, $B$, $C$, etc., but in some cases no real values of $\alpha$, $\beta$, etc., $\lambda$, $\mu$, etc. correspond to a real value of $u$. For the confirmation of this matter, for brevity, I refer the reader to E.’s dissertation itself, where on p. 236 the equation of the fourth degree is more extensively explained. Everyone will immediately see that the formulas for the coefficients $\alpha$, $\beta$ become indeterminate if $C = 0$ and the value $0$ is assumed for $u$, and that their values then cannot be assigned without extracting roots, and indeed are not even real, if the quantity $BB - 4D$ is negative. Although in this case it is easy to see that $u$ can still have other real values to which real values of $\alpha$, $\beta$ correspond, still, someone might fear that the solution of this difficulty (which E. did not touch at all) may require much more effort in higher-degree equations. Certainly, this matter should by no means be passed over in silence in an exact demonstration.
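The fourth-degree case can be worked out explicitly. The derivation below is a sketch under assumed notation: the equation is taken as $x^4 + Bx^2 + Cx + D = 0$ and the two quadratic factors as $x^2 - ux + \alpha$ and $x^2 + ux + \beta$. These symbols follow the pattern of article 7 but are a reconstruction, not a quotation of Euler's p. 236.

```latex
\begin{align*}
(x^2 - ux + \alpha)(x^2 + ux + \beta)
  &= x^4 + (\alpha + \beta - u^2)\,x^2 + u(\alpha - \beta)\,x + \alpha\beta, \\
\alpha + \beta &= B + u^2, \qquad u(\alpha - \beta) = C, \qquad \alpha\beta = D, \\
\alpha &= \tfrac12\Bigl(u^2 + B + \frac{C}{u}\Bigr), \qquad
\beta = \tfrac12\Bigl(u^2 + B - \frac{C}{u}\Bigr) \qquad (u \neq 0), \\
u = 0,\ C = 0:\quad \alpha + \beta &= B,\ \ \alpha\beta = D
  \;\Longrightarrow\; \alpha, \beta = \tfrac12\Bigl(B \pm \sqrt{B^2 - 4D}\Bigr).
\end{align*}
```

Under these assumptions the rational formulas contain the quotient $C/u$, which takes the indeterminate form $0/0$ when $C = 0$ and $u = 0$; the coefficients are then fixed only by a quadratic equation, whose roots are not real when $B^2 - 4D$ is negative.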

3. The illustrious E. tacitly assumes that the equation $X = 0$ has $2m$ roots, and he establishes that their sum is $= 0$ because the second term in $X$ is absent. My opinion on this assumption (which all authors use in this argument) was already declared in Art. 3 above. The proposition that the sum of all roots of an equation equals the first coefficient with its sign changed does not seem applicable to any equations except those which have roots. Since it is to be proved by this very demonstration that the equation $X = 0$ actually has roots, it does not seem permissible to assume their existence. Undoubtedly, those who have not yet seen through the fallacy of this paralogism will respond that what is demonstrated here is not that the equation $X = 0$ can be satisfied (for this is what having roots means), but only that it can be satisfied by values of $x$ of the form $a + b\sqrt{-1}$; the former is taken as an axiom. However, since no forms of quantities can be conceived other than the real and the imaginary $a + b\sqrt{-1}$, it is not clear enough how what is to be demonstrated differs from what is assumed as an axiom; indeed, even if it were possible to conceive other forms of quantities, such as $F$, $F'$, $F''$, etc., it should not be admitted without demonstration that every equation can be satisfied by some value of $x$, either real, or of the form $a + b\sqrt{-1}$, or of the form $F$, or $F'$, etc. Therefore, that axiom can have no other meaning than this: any equation can be satisfied either by a real value of the unknown, or by an imaginary value of the form $a + b\sqrt{-1}$, or perhaps by a value in some hitherto unknown form, or by a value that is not contained in any form whatsoever. But how such quantities, of which one cannot even form an idea (true shadows of shadows), can be added or multiplied is not understood with the clarity demanded in mathematics[4].

Now, I do not intend to render the conclusions that E. derived from his assumption at all suspect through these objections; rather, I am confident that they can be confirmed by a method neither difficult nor very different from the Eulerian one, in such a way that there should be no doubt left for anyone, even the slightest. I only criticize the form, which, although it can be of great utility in discovering new truths, seems to be not at all commendable in demonstrating before the public.

4. As for the demonstration of the assertion that the product   etc., can be rationally determined from the coefficients in   the illustrious E. has brought nothing at all. All that he explains on this matter in equations of fourth degree is as follows (where         are the roots of the proposed equation  ):

‘On m’objectera sans doute, que j’ai supposé ici, que la quantité   etait une quantité réelle, et que son quarré   était affirmatif; ce qui était encore douteux, vu que les racines         etant imaginaires, il pourrait bien arriver, que le quarré de la quantité   qui en est composée, fut négatif. Or je réponds à cela que ce cas ne saurait jamais avoir lieu; car quelque imaginaires que soient les racines         on sait pourtant, qu’il doit y avoir       [5];   ces quantites       étant réelles. Mais puisque       leur produit   est déterminable comme on sait, par les quantités       et sera par conséquent réel, tout comme nous avons vu, qu’il est effectivement   et   On reconnaı̂tra aisément de même, que dans les plus hautes équations cette même circonstance doit avoir lieu, et qu’on ne saurait me faire des objections de ce côté.’

However, E. did not add anywhere that the product   etc., can be rationally determined by     etc., although he seems to have always understood it implicitly, as without it the demonstration can have no force. Indeed, it is true in equations of the fourth degree that if the product   is expanded, it yields   However, it does not seem clear enough how, in all higher-degree equations, the product can be rationally determined by the coefficients. The distinguished de Foncenex, who first observed this (Miscell. phil. math. soc. Taurin. Vol. I, p. 117), rightly contends that without a rigorous demonstration of this proposition, the method loses all its force, and he admits that it seems quite difficult to him, describing the fruitless attempts he made in that direction[6]. However, this matter can be easily completed by the following method (of which I can only provide a summary here): Although it is not clear enough in equations of the fourth degree that the product   can be determined by the coefficients       it can be easily seen that the same product is also   as well as   and finally also   Therefore, the product   will be a quarter of the sum   which, if expanded, can be foreseen a priori to be a rational integral function of the roots         in which they all enter in the same way. Such functions can always be expressed rationally by the coefficients of the equation whose roots are         — The same is also evident if the product   is brought into this form:

 

The expanded product of this expression, involving all         in the same way, can be easily foreseen. Knowledgeable individuals will simultaneously gather how this can be applied to higher-degree equations. I reserve the complete exposition of the demonstration, which brevity does not permit me to include here, along with a more extensive discussion of functions involving multiple variables, for another occasion.
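The symmetry argument for the fourth degree can be checked numerically. The sketch assumes the equation is normalised as $x^4 + Bx^2 + Cx + D = 0$ (so the four roots sum to zero) and that the product in question is formed from the sums of one root with each of the other three; the sample roots and the helper names are hypothetical.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(roots, k):
    """Sum of all products of k distinct roots."""
    return sum(prod(c) for c in combinations(roots, k))

def product_of_pair_sums(roots, i):
    """Product of (r_i + r_j) over all j different from i."""
    return prod(roots[i] + r for j, r in enumerate(roots) if j != i)

# Hypothetical roots with sum zero, so the second term of the quartic is absent.
roots = [1, 2, 3, -6]
B = elementary_symmetric(roots, 2)
C = -elementary_symmetric(roots, 3)
D = elementary_symmetric(roots, 4)

# The same product, formed starting from each of the four roots in turn.
products = [product_of_pair_sums(roots, i) for i in range(4)]
print(B, C, D, products)
```

All four products coincide, so the product equals a quarter of their sum, which treats the four roots alike and is therefore expressible through the coefficients; under the assumed normalisation it comes out as $-C$.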

Now, I observe that in addition to these four objections, there are still some other aspects in the demonstration of E. that could be criticized, which I pass over in silence lest I seem to be an overly severe critic, especially since the foregoing seems to sufficiently demonstrate that the demonstration, in the form in which it is proposed by E., cannot be considered complete.

After this demonstration, E. presents another way to reduce the theorem for equations whose degree is not a binary power to the resolution of such equations. However, since this method teaches nothing for equations whose degree is a binary power and, moreover, is equally susceptible to all the aforementioned objections (except the fourth) as the initial general demonstration, there is no need to elaborate on it here.

9.

In the same paper, on page 263, the illustrious E. endeavored to further confirm our theorem by another method, the essence of which is as follows: given an equation $x^n + Ax^{n-1} + Bx^{n-2} + \text{etc.} = 0$, an analytic expression representing its roots explicitly has not so far been found for exponents $n > 4$; however, it seems certain (as E. asserts) that it can contain nothing else but arithmetic operations and root extractions, increasingly complicated as $n$ grows. If this is conceded, E. excellently demonstrates that, no matter how complicated the radical signs are among themselves, the value of the formula can always be represented in the form $M + N\sqrt{-1}$, where $M$, $N$ are real quantities.

Against this reasoning, one can object that, after so many great efforts by geometers, there remains little hope of ever reaching a general solution for algebraic equations. It becomes more and more likely that such a resolution is entirely impossible and contradictory. This should not seem too paradoxical, especially since what is commonly called the solution of an equation is properly nothing other than its reduction to pure equations. For the solution of pure equations is not taught but assumed, and if you express the root of the equation $x^m = H$ as $\sqrt[m]{H}$, you have not solved it, nor have you done more than if you were to invent some symbol to denote the root of the equation $x^n + Ax^{n-1} + \text{etc.} = 0$ and set the root equal to it. It is true that pure equations, due to the ease of finding their roots by approximation and the elegant connection that all their roots have with each other, excel above all others, and analysts are therefore not to be blamed for denoting these roots by a specific symbol. However, it does not follow from this that the root of any equation can be expressed by these symbols. In other words, it is assumed without sufficient reason that the solution of any equation can be reduced to the solution of pure equations. Perhaps it would not be so difficult to demonstrate the impossibility rigorously already for the fifth degree, about which I will present more extensive discussions elsewhere. Here, it suffices to note that the general solvability of equations, in the sense accepted here, is still highly doubtful, and therefore the demonstration, whose entire validity depends on that assumption, currently carries no weight.

10.

Later, the distinguished de Foncenex, having noticed a deficiency in Euler’s initial demonstration (see objection 4 in article 8) that he could not rectify, attempted another approach, which he presented in the aforementioned commentary on page 120[7]. This approach is as follows:

Suppose we have the equation   representing a function of degree   in an unknown   If   is an odd number, then it is clear that this equation has a real root. However, if   is even, the distinguished Foncenex attempts to prove in the following way that the equation has at least one root of the form   Let   where   is an odd number, and suppose that   is a divisor of the function   Then each value of   will be the sum of two roots of the equation   (with the sign changed). Therefore,   will have   values, and if   is assumed to be determined by the equation   (where   is a function involving   and known coefficients in  ), this will be of degree   It can be easily seen that   will be of the form   where   is an odd number. Now, unless   is odd, assume again that   is a divisor of   By similar reasoning,   will be determined by the equation   where   is a function of degree   in   Setting     will be of the form   where   is an odd number. Now, unless   is odd, assume again that   is a divisor of   and then   will be determined by the equation   which has degree   where   is of the form   an odd number. It is evident that in the series of equations       etc., the degree will be odd, and thus have a real root. For brevity, let us assume   so that the equation   has a real root   It can be easily understood that the same reasoning holds for any other value of   Then, the coefficient   and the coefficients in   (which can be easily seen to be integral functions of the coefficients in  ) or are asserted by de Foncenex to be rationally determinable from   and the coefficients of   and are therefore real. 
It follows that the roots of the equation   will be of the form   They will also satisfy the equation   i.e., this equation will have roots of the form   Finally, by similar reasoning, it follows that even   will be in the same form, and consequently, the root of the equation   will also satisfy the given equation   Hence, any equation will have at least one root in the form  

11.

Objections 1, 2, 3, which I made against the first demonstration of Euler (art. 8), have the same force against this method. However, there is a difference, so that the second objection, to which Euler’s demonstration was only liable in certain special cases, must now apply to all cases. Specifically, it can be a priori demonstrated that even if a formula is given expressing the coefficient   rationally in terms of   and the coefficients in   it must necessarily become indeterminate for multiple values of   likewise, a formula expressing the coefficient   in terms of   must become indeterminate for certain values of   and so on. This will be most clearly understood if we take the example of a quartic equation. Let us assume, therefore, that   and let the roots of the equation   be         Then it is clear that the equation   will be of the sixth degree, and its roots will be             The equation   will be of the fifteenth degree, and its values of   will be

 

Now, in this equation, since its degree is odd, it will have to have a root, and it will indeed have the real root   (which, with the sign of the first coefficient in   changed, is equal and therefore not only real but also rational, if the coefficients in   are rational). But it can be easily seen that if a formula is given that rationally expresses the value of   in terms of the corresponding value of   it must necessarily become indeterminate for   For this value is a root of the equation   and the three values of   corresponding to it will be, for example,     and   all of which can be irrational. Clearly, a rational formula could not produce an irrational value of   in this case, nor could it produce three distinct values. From this example, it is evident that the method of de Foncenex is by no means satisfactory, but if it is to be made complete from every aspect, a much deeper investigation into the theory of elimination is required.

12.

Finally, Lagrange dealt with our theorem in his work Sur la forme des racines imaginaires des équations, Nouv. Mém. de l’Acad. de Berlin 1772, p. 222 sqq. This great geometer especially endeavored to repair the deficiencies in Euler’s first demonstration, particularly addressing those aspects constituting objections two and four as outlined above (art. 8). He delved so deeply into these matters that nothing more could be desired, except that certain doubts may seem to remain in the preceding discussion of the theory of elimination (on which this entire investigation is based). However, he did not touch upon the third objection at all, and the entire inquiry is built on the assumption that the equation of degree $m$ indeed has $m$ roots.

Therefore, with careful consideration of what has been presented so far, I hope that experts will find a new demonstration of this most important theorem, derived from entirely different principles, to be not unwelcome. I now proceed to present it.

13.

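Lemma. Let $m$ denote any positive integer. Then the function $\sin\varphi \cdot x^m - \sin m\varphi \cdot r^{m-1}x + \sin(m-1)\varphi \cdot r^m$ will be divisible by $x^2 - 2\cos\varphi \cdot rx + r^2$.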

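Proof. For $m = 1$ the function becomes $= 0$ and hence it is divisible by any factor. For $m = 2$ the quotient becomes $\sin\varphi$, and for any larger value, it will be $\sin\varphi \cdot x^{m-2} + \sin 2\varphi \cdot rx^{m-3} + \sin 3\varphi \cdot r^2x^{m-4} + \text{etc.} + \sin(m-1)\varphi \cdot r^{m-2}$. It can be easily confirmed that by multiplying this function by $x^2 - 2\cos\varphi \cdot rx + r^2$ the product becomes equal to the given function.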
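The divisibility claimed in this lemma can be checked numerically. The following is a minimal sketch, assuming the lemma concerns the function sin φ · x^m − sin mφ · r^(m−1) x + sin (m−1)φ · r^m and the divisor x² − 2 cos φ · rx + r²; the helper names `poly_divmod` and `lemma_function` are hypothetical and chosen only for this illustration.

```python
import math

def poly_divmod(num, den):
    """Long division of polynomials given as coefficient lists, highest power first.
    Returns (quotient, remainder). Assumes den[0] != 0."""
    num = list(num)
    steps = len(num) - len(den) + 1
    quot = []
    for i in range(steps):
        c = num[i] / den[0]
        quot.append(c)
        for j, d in enumerate(den):
            num[i + j] -= c * d
    return quot, num[max(steps, 0):]

def lemma_function(m, r, phi):
    """Coefficients of sin(phi) x^m - sin(m phi) r^(m-1) x + sin((m-1) phi) r^m (assumed form)."""
    coeffs = [0.0] * (m + 1)
    coeffs[0] += math.sin(phi)                           # term in x^m
    coeffs[m - 1] += -math.sin(m * phi) * r ** (m - 1)   # term in x
    coeffs[m] += math.sin((m - 1) * phi) * r ** m        # constant term
    return coeffs

# Illustrative values of r and phi (arbitrary choices, not from the text).
r, phi = 1.7, 0.9
divisor = [1.0, -2 * r * math.cos(phi), r * r]
quotient, remainder = poly_divmod(lemma_function(5, r, phi), divisor)
print(quotient, remainder)
```

Exact real long division is used here rather than substitution of complex roots, in keeping with the article's aim of avoiding imaginary quantities. The remainder should vanish up to rounding error, and the quotient coefficients should agree with sin φ, sin 2φ · r, sin 3φ · r², and so on.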

14.

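Lemma. If the quantity $r$ and the angle $\varphi$ are determined in such a way that we have the equations

$$r^m\cos m\varphi + Ar^{m-1}\cos(m-1)\varphi + Br^{m-2}\cos(m-2)\varphi + \text{etc.} + Kr^2\cos 2\varphi + Lr\cos\varphi + M = 0 \qquad [1]$$

$$r^m\sin m\varphi + Ar^{m-1}\sin(m-1)\varphi + Br^{m-2}\sin(m-2)\varphi + \text{etc.} + Kr^2\sin 2\varphi + Lr\sin\varphi = 0 \qquad [2]$$

then the function $x^m + Ax^{m-1} + Bx^{m-2} + \text{etc.} + Kx^2 + Lx + M = X$ will be divisible by the double factor $x^2 - 2\cos\varphi \cdot rx + r^2$, provided $r\sin\varphi$ is not $= 0$; if $r\sin\varphi = 0$, then the same function will be divisible by the simple factor $x - r\cos\varphi$.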

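Proof. I. From the preceding article, all of the following quantities will be divisible by $x^2 - 2\cos\varphi \cdot rx + r^2$:

$$\begin{array}{lll}
\sin\varphi \cdot rx^m & -\,\sin m\varphi \cdot r^m x & +\,\sin(m-1)\varphi \cdot r^{m+1} \\
A\sin\varphi \cdot rx^{m-1} & -\,A\sin(m-1)\varphi \cdot r^{m-1}x & +\,A\sin(m-2)\varphi \cdot r^m \\
B\sin\varphi \cdot rx^{m-2} & -\,B\sin(m-2)\varphi \cdot r^{m-2}x & +\,B\sin(m-3)\varphi \cdot r^{m-1} \\
\text{etc.} & & \\
K\sin\varphi \cdot rx^2 & -\,K\sin 2\varphi \cdot r^2 x & +\,K\sin\varphi \cdot r^3 \\
L\sin\varphi \cdot rx & -\,L\sin\varphi \cdot rx & \\
M\sin\varphi \cdot r & & -\,M\sin\varphi \cdot r
\end{array}$$

Therefore, the sum of these quantities will also be divisible by $x^2 - 2\cos\varphi \cdot rx + r^2$. The terms of the first group (the first column) constitute the sum $\sin\varphi \cdot rX$; the sum of the second group is $= 0$ due to [2]; and it is easily seen that the sum of the third group also vanishes, if [1] is multiplied by $\sin\varphi$ and [2] by $\cos\varphi$ and the products are subtracted. Hence, it follows that the function $\sin\varphi \cdot rX$ is divisible by $x^2 - 2\cos\varphi \cdot rx + r^2$, and therefore, unless $r\sin\varphi = 0$, so is the function $X$. Q.E.P.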

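II. If $r\sin\varphi = 0$, then either $r = 0$ or $\sin\varphi = 0$. In the former case, $M = 0$ due to [1], and therefore $X$ is divisible by $x$, or $x - r\cos\varphi$; in the latter case, $\cos\varphi = \pm 1$, $\cos 2\varphi = +1$, $\cos 3\varphi = \pm 1$, and generally $\cos n\varphi = \cos\varphi^n$. Therefore, due to [1], $X = 0$ when $x = r\cos\varphi$, and hence the function $X$ is divisible by $x - r\cos\varphi$. Q.E.S.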
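Both cases of this lemma can be illustrated on a small example. The sketch below assumes that equations [1] and [2] are the cosine and sine sums r^m cos mφ + A r^(m−1) cos (m−1)φ + etc. + M = 0 and r^m sin mφ + A r^(m−1) sin (m−1)φ + etc. + L r sin φ = 0. The cubic x³ + x² − x + 15 = (x² − 2x + 5)(x + 3) is a hypothetical example chosen for this note, and the function names are likewise illustrative.

```python
import math

def poly_divmod(num, den):
    """Long division of polynomials (coefficients highest power first)."""
    num = list(num)
    steps = len(num) - len(den) + 1
    quot = []
    for i in range(steps):
        c = num[i] / den[0]
        quot.append(c)
        for j, d in enumerate(den):
            num[i + j] -= c * d
    return quot, num[max(steps, 0):]

def equation_sums(coeffs, r, phi):
    """Left-hand sides of the assumed equations [1] (cosines) and [2] (sines).
    coeffs lists 1, A, B, ..., L, M, highest power first."""
    m = len(coeffs) - 1
    eq1 = sum(c * r ** (m - i) * math.cos((m - i) * phi) for i, c in enumerate(coeffs))
    eq2 = sum(c * r ** (m - i) * math.sin((m - i) * phi) for i, c in enumerate(coeffs))
    return eq1, eq2

coeffs = [1, 1, -1, 15]  # hypothetical example: (x^2 - 2x + 5)(x + 3)

# Case I: r sin(phi) != 0. Here r = sqrt(5) and tan(phi) = 2 satisfy [1] and [2].
r, phi = math.sqrt(5), math.atan2(2, 1)
eq1, eq2 = equation_sums(coeffs, r, phi)
double_factor = [1, -2 * r * math.cos(phi), r * r]
quotient, remainder = poly_divmod(coeffs, double_factor)

# Case II: sin(phi) = 0. Here phi = 180 degrees and r = 3 give the simple factor x + 3.
r2, phi2 = 3.0, math.pi
eq1_simple, eq2_simple = equation_sums(coeffs, r2, phi2)
_, remainder_simple = poly_divmod(coeffs, [1, -r2 * math.cos(phi2)])
print(eq1, eq2, remainder, remainder_simple)
```

In both cases the two sums and the remainders come out as zero up to rounding error, which is all the lemma asserts; how to find suitable values of r and φ for an arbitrary function is the business of the later articles.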

15.

The preceding theorem is often demonstrated with the aid of imaginary quantities, see Euler, Introductio in Analysin Infinitorum, Vol. I, p. 110; I deemed it worthwhile to show how it can be equally easily derived without their assistance. It is already evident that for the proof of our theorem, nothing else is required than to show: Given any function $X$ of the form $x^m + Ax^{m-1} + Bx^{m-2} + \text{etc.} + Lx + M$, $r$ and $\varphi$ can be determined in such a way that equations [1] and [2] hold. From this, it will follow that $X$ has a real factor of the first or second degree; however, the division will necessarily produce a real quotient of a lower degree, which, for the same reason, will also have a factor of the first or second degree. By continuing this operation, $X$ will eventually be resolved into simple or double real factors. Thus, the goal of the following discussion is to prove that theorem.

16.

Imagine an infinite fixed plane (the plane of the table, Fig. 1), and on this, an infinite fixed straight line $GC$ passing through the fixed point $C$. Assume any length as the unit so that all lines can be expressed by numbers. At any point $P$ on the plane, with a distance $r$ from the center $C$ and an angle $GCP = \varphi$, erect a perpendicular equal to the value of the expression

$$r^m\sin m\varphi + Ar^{m-1}\sin(m-1)\varphi + \text{etc.} + Kr^2\sin 2\varphi + Lr\sin\varphi,$$

which, for brevity, I will always denote by $T$ in the following. I always consider the distance $r$ as positive, and for points on the other side of the axis, the angle $\varphi$ should be considered either as greater than two right angles or as negative (which here is equivalent). The ends of these perpendiculars (which should be taken above the plane for a positive value of $T$, below for a negative value of $T$, and on the plane itself when $T$ vanishes) will be on a continuous curved surface everywhere infinite, which, for brevity, I will call the first surface in the following. In exactly the same way, let another surface be constructed, whose height above any point on the plane is

$$r^m\cos m\varphi + Ar^{m-1}\cos(m-1)\varphi + \text{etc.} + Kr^2\cos 2\varphi + Lr\cos\varphi + M,$$

which I will denote by $U$ for brevity. This surface will also be continuous and everywhere infinite, and I will distinguish it from the former by the term second surface. Then it is evident that the whole matter revolves around proving that at least one point exists that lies simultaneously in the plane, on the first surface, and on the second surface.
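The two heights can be computed directly. What follows is a minimal sketch, assuming that the first surface has height T = r^m sin mφ + A r^(m−1) sin (m−1)φ + etc. + L r sin φ and the second has height U = r^m cos mφ + A r^(m−1) cos (m−1)φ + etc. + L r cos φ + M. The quartic used is only an illustrative choice, and the comparison with a complex evaluation of X is a modern cross-check that is not part of the argument of this article.

```python
import cmath
import math

def T(coeffs, r, phi):
    """Assumed height of the first surface (sum of sine terms).
    coeffs lists 1, A, B, ..., M, highest power first."""
    m = len(coeffs) - 1
    return sum(c * r ** (m - i) * math.sin((m - i) * phi) for i, c in enumerate(coeffs))

def U(coeffs, r, phi):
    """Assumed height of the second surface (sum of cosine terms, including M)."""
    m = len(coeffs) - 1
    return sum(c * r ** (m - i) * math.cos((m - i) * phi) for i, c in enumerate(coeffs))

def X(coeffs, z):
    """Evaluate the polynomial by Horner's rule; z may be complex."""
    value = 0
    for c in coeffs:
        value = value * z + c
    return value

coeffs = [1, 0, -2, 3, 10]   # illustrative quartic x^4 - 2x^2 + 3x + 10
r, phi = 1.5, 2.0            # an arbitrary point of the plane
z = cmath.rect(r, phi)       # the same point written as r(cos phi + i sin phi)
print(U(coeffs, r, phi), T(coeffs, r, phi), X(coeffs, z))
```

Under these assumptions U and T agree with the real and imaginary parts of X at the point r(cos φ + i sin φ), so a point lying on the plane and on both surfaces corresponds to a root of X = 0. The sketch also shows that T vanishes for φ = 0 whatever r may be, which is why the axis belongs to the intersection of the first surface with the plane.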

17.

It can be easily seen that the first surface lies partly above and partly below the plane; for the distance $r$ from the center can be taken so large that the remaining terms in $T$ become negligible compared to the first term $r^m\sin m\varphi$; this term, however, can be either positive or negative for a properly determined angle $\varphi$. Therefore, the fixed plane will necessarily intersect the first surface; I will call this intersection of the plane with the first surface the first line, which will be determined by the equation $T = 0$. For the same reason, the plane will intersect the second surface; the intersection will constitute a curve determined by the equation $U = 0$, which I will call the second line. Strictly speaking, each curve will consist of several branches that can be entirely separate, but each will be a continuous line. Indeed, the first line will always be of the kind that is called complex, and the axis $GC$ should be regarded as part of this curve; for any value assigned to $r$, $T$ will always be $= 0$ when $\varphi$ is either $= 0$ or $= 180^\circ$. However, it is better to consider the complex of all branches passing through all points where $T = 0$ as one curve (according to the usage generally accepted in higher geometry), and similarly for all branches passing through all points where $U = 0$. It is now evident that the problem has been reduced to proving that at least one point exists in the plane where some branch of the first line intersects some branch of the second line. To achieve this, it will be necessary to closely examine the nature of these lines.

18.

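First of all, I observe that both curves are algebraic, namely, if brought back to orthogonal coordinates, they are of order $m$. Starting with the abscissas from $C$, with $x$ toward $G$ and ordinates $y$ toward $P$, we have $x = r\cos\varphi$, $y = r\sin\varphi$, and thus, generally, for any $n$

$$r^n\sin n\varphi = nx^{n-1}y - \frac{n(n-1)(n-2)}{1\cdot 2\cdot 3}x^{n-3}y^3 + \frac{n(n-1)(n-2)(n-3)(n-4)}{1\cdot 2\cdot 3\cdot 4\cdot 5}x^{n-5}y^5 - \text{etc.}$$

$$r^n\cos n\varphi = x^n - \frac{n(n-1)}{1\cdot 2}x^{n-2}y^2 + \frac{n(n-1)(n-2)(n-3)}{1\cdot 2\cdot 3\cdot 4}x^{n-4}y^4 - \text{etc.}$$

Therefore, both $T$ and $U$ will consist of several terms of this kind $ax^{\alpha}y^{\beta}$, denoting $\alpha$, $\beta$ as positive integers whose sum is at most $m$. Moreover, it can be easily foreseen that all terms of $T$ involve the factor $y$, and therefore, the first line is composed of a line (whose equation is $y = 0$) and a curve of order $m - 1$. However, it is not necessary to consider this distinction here.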
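The passage from polar to orthogonal coordinates can be verified numerically. The sketch below assumes the expansions in question are the usual binomial ones, r^n sin nφ = n x^(n−1) y − C(n,3) x^(n−3) y³ + etc. and r^n cos nφ = x^n − C(n,2) x^(n−2) y² + etc., with x = r cos φ and y = r sin φ; the function name is hypothetical.

```python
import math

def sine_cosine_sums(n, x, y):
    """Return (r^n sin(n phi), r^n cos(n phi)) from the assumed binomial sums in x and y."""
    # Odd powers of y make up the sine sum, even powers the cosine sum;
    # signs alternate within each sum.
    sine_sum = sum((-1) ** (k // 2) * math.comb(n, k) * x ** (n - k) * y ** k
                   for k in range(1, n + 1, 2))
    cosine_sum = sum((-1) ** (k // 2) * math.comb(n, k) * x ** (n - k) * y ** k
                     for k in range(0, n + 1, 2))
    return sine_sum, cosine_sum

x, y = 0.8, -1.3                         # an arbitrary point of the plane
r, phi = math.hypot(x, y), math.atan2(y, x)
for n in range(1, 7):
    s, c = sine_cosine_sums(n, x, y)
    print(n, s - r ** n * math.sin(n * phi), c - r ** n * math.cos(n * phi))
```

Every term of the sine sum carries an odd power of y, which is the reason all terms of T contain the factor y and the axis y = 0 splits off from the first line.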

A matter of greater significance will be the investigation of whether the first and second lines have infinite branches and how many of each. At an infinite distance from the point $C$, the first line, whose equation is $T = 0$, will merge with the line whose equation is $\sin m\varphi = 0$. The latter exhibits $m$ straight lines intersecting at point $C$, where the first is the axis $GC$ and the others are inclined at angles $\frac{1}{m}180$, $\frac{2}{m}180$, $\frac{3}{m}180$ etc. degrees against it. Therefore, the first line has $2m$ infinite branches, which divide the circumference of a circle described with an infinitely large radius into $2m$ equal parts. The division occurs in such a way that the circumference is intersected by the first branch at the intersection of the circle and the axis, by the second at a distance of $\frac{1}{m}180^\circ$, by the third at a distance of $\frac{2}{m}180^\circ$, and so on.

Similarly, the second line at an infinite distance from the center will have an asymptote expressed by the equation $\cos m\varphi = 0$. This asymptote is a complex of $m$ straight lines intersecting at point $C$ at equal angles, such that the first forms an angle of $\frac{1}{m}90^\circ$ with the axis, the second an angle of $\frac{3}{m}90^\circ$, the third an angle of $\frac{5}{m}90^\circ$, and so on. Therefore, the second line will also have $2m$ infinite branches, each occupying the middle position between the two nearest branches of the first line. This arrangement causes the branches to intersect the circumference of a circle described with an infinitely large radius at points that are $\frac{1}{m}90^\circ$, $\frac{3}{m}90^\circ$, $\frac{5}{m}90^\circ$ etc. away from the axis.

However, it is evident that the axis itself always constitutes two infinite branches of the first line, namely the first and the $(m+1)$-th. This arrangement of the branches is clearly shown in Fig. 2, for the case $m = 4$, where the branches of the second line are represented with dotted lines to distinguish them from the branches of the first line. The same applies to Fig. 4[8]. Since these conclusions are of utmost importance, and infinitely large quantities may offend some readers, I will demonstrate them without the support of the infinite in the following article.
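The arrangement of the infinite branches can be tabulated for a given degree. This is a minimal sketch, assuming the asymptotic directions are the solutions of sin mφ = 0 for the first line and of cos mφ = 0 for the second; the function name is illustrative.

```python
def asymptote_angles(m):
    """Directions (in degrees from the axis) of the 2m infinite branches of each line,
    assuming they follow sin(m phi) = 0 and cos(m phi) = 0 respectively."""
    first_line = [180.0 * k / m for k in range(2 * m)]
    second_line = [90.0 * (2 * k + 1) / m for k in range(2 * m)]
    return first_line, second_line

first_line, second_line = asymptote_angles(4)
merged = sorted([(a, "first") for a in first_line] + [(a, "second") for a in second_line])
for angle, line in merged:
    print(f"{angle:6.1f}  {line}")
```

For m = 4 the list alternates between the two lines every 22.5 degrees, and the first and fifth entries of the first line (0 and 180 degrees) are the two halves of the axis.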

19.

Theorem. With all the conditions as stated above, a circle can be described from the center $C$, on whose circumference there are $2m$ points where $T = 0$ and an equal number of points where $U = 0$, arranged such that each latter point lies between two former points.

Denote the sum of all coefficients $A$, $B$ etc., up to $K$, $L$, $M$, each taken as positive, by $S$, and let $R$ be taken such that $R > S\sqrt{2}$ and $R > 1$[9]. Then I say that in a circle described with a radius $R$ the conditions stated in the theorem necessarily hold. Specifically, for simplicity, designate the point on its circumference that is $\frac{1}{m}45$ degrees away from its intersection with the left side of the axis, or for which $\varphi = \frac{1}{m}45^\circ$, by (1), and similarly, the point that is $\frac{3}{m}45^\circ$ away from this intersection, or for which $\varphi = \frac{3}{m}45^\circ$, by (3); and the point where $\varphi = \frac{5}{m}45^\circ$ by (5), and so on up to $(8m-1)$, which is $\frac{8m-1}{m}45$ degrees away from that intersection, if you always progress in the same direction (or $\frac{1}{m}45^\circ$ from the opposite side), so that a total of $4m$ points are on the circumference, spaced at equal intervals. Then one point will lie between $(8m-1)$ and (1) for which $T = 0$; similarly, there will be single points between (3) and (5); between (7) and (9); between (11) and (13), and so on, with a total of $2m$ points. Likewise, each point for which $U = 0$ will lie between (1) and (3); between (5) and (7); between (9) and (11), with the total count also $2m$. Finally, apart from these $4m$ points, there will be no other points in the entire circumference for which either $T$ or $U$ is $= 0$.

Proof. I. In the point (1), we have $m\varphi = 45^\circ$, and thus

$$T = R^{m-1}\left(R\sqrt{\tfrac{1}{2}} + A\sin(m-1)\varphi + \frac{B}{R}\sin(m-2)\varphi + \text{etc.} + \frac{L}{R^{m-2}}\sin\varphi\right).$$

However, the sum $A\sin(m-1)\varphi + \frac{B}{R}\sin(m-2)\varphi + \text{etc.}$ cannot exceed $S$ in absolute value. Therefore, it must necessarily be less than $R\sqrt{\tfrac{1}{2}}$. It follows that at this point, the value of $T$ is certainly positive. Hence, $T$ will have a positive value when $m\varphi$ lies between $45^\circ$ and $135^\circ$, i.e., from point (1) to (3), the value of $T$ will always be positive. By the same reasoning, $T$ will have a positive value from point (9) to (11) and generally from any point $(8k+1)$ to $(8k+3)$, where $k$ denotes any integer. Similarly, $T$ will have a negative value everywhere between (5) and (7), between (13) and (15), etc., and generally between $(8k+5)$ and $(8k+7)$, so it can never be $= 0$ in these intervals. But since the value is positive at (3) and negative at (5), it must be $= 0$ somewhere between (3) and (5); also somewhere between (7) and (9); between (11) and (13), etc., up to the interval between $(8m-1)$ and (1) inclusive, so that altogether at $2m$ points, $T = 0$. Q.E.D.

II. That no other points with this property exist beyond these $2m$ points can be understood as follows. Since there are none between (1) and (3), between (5) and (7), etc., more such points could exist only if at least two were in some interval between (3) and (5) or between (7) and (9), etc. Then necessarily in the same interval, $T$ would be either a maximum or minimum, and thus $\frac{dT}{d\varphi} = 0$. But $\frac{dT}{d\varphi} = mR^{m-2}\left(R^2\cos m\varphi + \frac{m-1}{m}AR\cos(m-1)\varphi + \frac{m-2}{m}B\cos(m-2)\varphi + \text{etc.}\right)$, and $\cos m\varphi$ between (3) and (5) is always negative and greater than $\sqrt{\tfrac{1}{2}}$ in absolute value. Hence, it is easily seen that in this entire interval, $\frac{dT}{d\varphi}$ is a negative quantity, and similarly, between (7) and (9) everywhere positive; between (11) and (13) negative, etc., so that $\frac{dT}{d\varphi} = 0$ cannot occur in any of these intervals. Therefore, etc. Q.E.S.

III. In a wholly similar manner, it is demonstrated that $U$ has a negative value everywhere between (3) and (5), between (11) and (13), etc., and generally between $(8k+3)$ and $(8k+5)$; positive, however, between (7) and (9), between (15) and (17), etc., and generally between $(8k+7)$ and $(8k+9)$. Hence, it immediately follows that $U = 0$ must occur somewhere between (1) and (3), between (5) and (7), etc., i.e., in $2m$ points. However, in none of these intervals can $\frac{dU}{d\varphi} = 0$ occur (which is easily proved similarly as above): therefore, more than those $2m$ points on the circumference of the circle will not be given, where $U = 0$. Q.E.T. et Q.

Moreover, the part of this theorem according to which more than $2m$ points do not exist where $T = 0$, nor more than $2m$ where $U = 0$, can also be demonstrated from the fact that the equations $T = 0$, $U = 0$ represent curves of the $m$-th order, which, according to higher geometry, cannot be cut by a circle in more than $2m$ points, a circle being a curve of the second order.
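The statement of the theorem can be tested on an example by sampling the circle. The sketch below assumes that T and U are the sine and cosine sums of article 16, that S is the sum of the absolute values of the coefficients after the leading one, and that the radius R need only exceed both S√2 and 1. The quartic, the sample count, and all function names are illustrative choices; counting sign changes on a fine grid is a numerical convenience and not the method of the proof.

```python
import math

def T(coeffs, r, phi):
    """Assumed sine sum; coeffs lists 1, A, ..., M, highest power first."""
    m = len(coeffs) - 1
    return sum(c * r ** (m - i) * math.sin((m - i) * phi) for i, c in enumerate(coeffs))

def U(coeffs, r, phi):
    """Assumed cosine sum, including the constant term M."""
    m = len(coeffs) - 1
    return sum(c * r ** (m - i) * math.cos((m - i) * phi) for i, c in enumerate(coeffs))

def sign_change_positions(f, samples=3600):
    """Sample indices after which f changes sign, going once around the circle.
    Angles are taken at half-steps so that no sample lies on the axis, where T = 0."""
    values = [f(2 * math.pi * (j + 0.5) / samples) for j in range(samples)]
    return [j for j in range(samples) if values[j] * values[(j + 1) % samples] < 0]

coeffs = [1, 0, -2, 3, 10]                 # illustrative quartic, m = 4
m = len(coeffs) - 1
S = sum(abs(c) for c in coeffs[1:])        # here S = 15
R = max(S * math.sqrt(2), 1.0) + 1.0       # any radius beyond both bounds will do

t_zeros = sign_change_positions(lambda phi: T(coeffs, R, phi))
u_zeros = sign_change_positions(lambda phi: U(coeffs, R, phi))
order = [name for _, name in sorted([(j, "T") for j in t_zeros] + [(j, "U") for j in u_zeros])]
print(len(t_zeros), len(u_zeros), order)
```

For this example the grid is far finer than the spacing of the 4m division points, so each zero on the circumference shows up as exactly one sign change, and the merged list alternates between T and U as the theorem requires.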

20.

If another circle with a radius greater than $R$ is described from the same center, then it will be divided in the same way: between points (3) and (5), there will be one point where $T = 0$, likewise between (7) and (9), etc. It will be easily observed that the less the radius of this circle differs from the radius $R$, the closer such points between (3) and (5) should be on the circumferences of both circles. The same will occur if a circle with a radius somewhat smaller than $R$ but greater than $S\sqrt{2}$ and $1$ is described. From this, it is easily understood that the circumference of the circle described with a radius $R$ is actually cut at the point between (3) and (5) where $T = 0$ by some branch of the first line; the same holds for the other points where $T = 0$. Similarly, it is evident that the circumference of this circle is cut at all $2m$ points where $U = 0$ by some branch of the second line. These conclusions can also be expressed in the following way: When a circle of the appropriate size is described from the center $C$, $2m$ branches of the first line and $2m$ branches of the second line will enter this, in such a way that the two nearest branches of the first line are separated by some branch of the second line. See Fig. 2, where the circle is now of finite size, and the numbers assigned to each branch are not to be confused with the numbers by which I designated specific limits in the previous article and in this for the sake of brevity.

21.

Now, from this relative arrangement of the branches entering the circle, the intersection of some branch of the first line with a branch of the second line within the circle can be deduced in various ways. I hardly know which method should be preferred to the others. The following seems very clear: Let us designate (Fig. 2) the point on the circumference of the circle where it is cut by the left side of the axis (which itself is one of the $2m$ branches of the first line) as $0$; the nearest point where a branch of the second line enters, as $1$; the next point to this, where the second branch of the first line enters, as $2$; and so on up to $4m - 1$, so that in any point marked with an even number, a branch of the first line enters the circle, whereas a branch of the second line enters at all points expressed by an odd number. It is well known from higher geometry that an algebraic curve (or each part of any algebraic curve if it happens to be composed of several) must either return into itself or extend infinitely on both sides, so if any branch of an algebraic curve enters a finite space, it must necessarily come out again somewhere from this space[10]. Hence, it is easily concluded that any point marked with an even number (or, for the sake of brevity, any even point) should be connected by a branch of the first line with another even point within the circle, and similarly, any point marked with an odd number should be connected with another similar point by a branch of the second line. Although the connection of these two points according to the nature of the function $X$ can be very different, so that it cannot be determined in general, it can be easily demonstrated that in any case, an intersection of the first line with the second line always occurs.

22.

The demonstration of this necessity seems most conveniently representable by reductio ad absurdum. Namely, let us assume that the connection of any two even points and any two odd points can be arranged in such a way that no intersection of a branch of the first line with a branch of the second line arises from it. Since the axis is a part of the first line, clearly point $0$ must be connected with point $2m$. Therefore, point $1$ cannot be connected with any point beyond the axis, i.e., with no point expressed by a number greater than $2m$; otherwise the connecting line would necessarily cut the axis. So, if $1$ is assumed to be connected with point $n$, then $n < 2m$. By similar reasoning, if $2$ is connected with $n'$, then $n' < n$, because otherwise, the branch $2 \ldots n'$ would necessarily cut the branch $1 \ldots n$. For the same reason, point $3$ will be connected with some point between $4$ and $n'$, and it is clear that if $3$, $4$, $5$ etc., are assumed to be connected with $n''$, $n'''$, $n''''$ etc., $n'''$ lies between $5$ and $n''$, $n''''$ between $6$ and $n'''$, etc. Hence, it is evident that, finally reaching some point $h$ connected with point $h + 2$, the branch entering the circle at point $h + 1$ will necessarily cut the branch connecting points $h$ and $h + 2$. However, since one of these two branches will belong to the first line and the other to the second, it is now clear that the assumption is contradictory, and therefore, an intersection of the first line with the second line must necessarily occur somewhere.
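The combinatorial core of this argument can be checked exhaustively for small degrees. The sketch below rests on assumptions that should be stated plainly: the entry points are taken to be numbered 0 to 4m − 1 around the circle, with even points belonging to the first line and odd points to the second, and each branch is modelled as a straight chord, so that only the pattern of which points are joined is tested and not the shape of the curves. The function names are hypothetical.

```python
def perfect_matchings(points):
    """Yield every way of joining the given points in pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in perfect_matchings(remaining):
            yield [(first, partner)] + tail

def chords_cross(a, b):
    """True if two chords with four distinct endpoints, labelled in cyclic order, cross:
    exactly one endpoint of b then lies strictly between the endpoints of a."""
    (p, q), (s, t) = sorted(a), sorted(b)
    return (p < s < q) != (p < t < q)

def every_pairing_forces_crossing(m):
    """Check that, however even points are paired among themselves and odd points
    among themselves, some even chord crosses some odd chord."""
    points = list(range(4 * m))
    evens, odds = points[0::2], points[1::2]
    for first_line in perfect_matchings(evens):
        for second_line in perfect_matchings(odds):
            if not any(chords_cross(a, b) for a in first_line for b in second_line):
                return False
    return True

for m in (1, 2, 3):
    print(m, every_pairing_forces_crossing(m))
```

The enumeration does not even impose the condition that point 0 is joined to point 2m along the axis; under the chord model a crossing between the two lines is forced in every case tried, which agrees with the conclusion of the reductio above.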

If this is combined with the preceding discussions, it follows from all that has been explained that the theorem that a rational algebraic function of one indeterminate can be resolved into factors of the first or second degree with real coefficients has been rigorously demonstrated.

23.

Moreover, it can be easily deduced from the same principles that not only one but at least $m$ intersections of the first line with the second line are given, although it is also possible for the first line to be cut by several branches of the second line at the same point, in which case the function $X$ will have multiple equal factors. However, since it suffices here to have demonstrated the necessity of one intersection, I do not dwell further on this matter for the sake of brevity. For the same reason, I do not pursue other properties of these lines here in more detail, such as the intersection always occurring at right angles, or if multiple branches of each curve coincide at the same point, the first line having as many branches as the second line, and these being alternately placed, intersecting at equal angles, etc.

Finally, I observe that it is not impossible for the preceding demonstration, which I built on geometric principles here, to be presented in a purely analytical form. However, I believed that the representation I explained here would be less abstract, and the essence of the proof could be put more clearly before the eyes than could be expected from an analytical demonstration.

As a bonus, I will suggest another method for proving our theorem, which, at first glance, will seem not only very different from the preceding demonstration but also from all the other demonstrations explained above, and yet it is fundamentally the same as the d’Alembertian method. I leave it to those familiar with the subject to compare it with the previous one and explore the parallelism between the two. It is attached solely for their benefit.

24.

Above the plane of Figure 4, relative to the axis   and the fixed point   I assume that the first and second surfaces are described in the same way as above. Take any point located on any branch of the first line, where   (for example, any point   lying on the axis), and unless   at this point, proceed from this point in the first line towards the direction where the absolute magnitude of   decreases. If, by chance, the absolute value of   decreases in both directions at the point   it is arbitrary where you proceed; but I will immediately explain what to do if   increases in both directions. It is clear that as long as you always progress in the first line, you will necessarily reach a point where   or one where the value of   becomes a minimum, for example, the point   In the former case, the sought point is found; in the latter, it can be demonstrated that in this point, multiple branches of the first line intersect (indeed, an equal number of branches), and their semiaxes are so arranged that if you deviate towards any of them (either here or there), the value of   will continue to decrease. (For the sake of brevity, I must suppress the demonstration of this theorem, which, although not more difficult, is more extensive.) In this branch, you can then progress again until   becomes   (as happens in Fig. 4 at  ) or again a minimum. Then, deviating again, you will necessarily reach a point where  

Against this demonstration, a doubt could be raised about whether it is possible that no matter how far you progress, and even though the value of   always decreases, these decrements continuously become slower, and nevertheless, that value never reaches a certain limit. This objection would correspond to the fourth in Article 6. But it would not be difficult to assign a limit, such that once you surpass it, the value of   must necessarily not only change more rapidly but also not decrease any longer, so that before reaching this limit, the value   must have necessarily been reached. However, I reserve the opportunity to elaborate more extensively on this and other points that I could only touch upon in this demonstration on another occasion.

I discovered the principles on which this demonstration is based in October 1797.

  1. I always understand the term imaginary quantity here to refer to a quantity of the form $a + b\sqrt{-1}$, as long as $b$ is not equal to $0$. In this sense, this expression has always been accepted by all geometers of the first order, and I consider those who wanted to call the quantity $a + b\sqrt{-1}$ imaginary only in the case where $a = 0$, and impossible when $a$ is not $= 0$, not worth listening to, as this distinction is neither necessary nor of any utility. If imaginary quantities are to be retained in analysis altogether (which seems more advisable than abolishing them, provided they are solidly established), then they must necessarily be regarded as equally possible as real quantities; hence, I would prefer to encompass real and imaginary quantities under the common designation of possible quantities: conversely, I would call a quantity impossible if it should satisfy conditions that cannot be satisfied even by admitting imaginary ones, yet in a way that this phrase means the same as saying that such a quantity does not exist in the entire range of magnitudes. From this standpoint, I would not concede the formation of a peculiar class of quantities. If someone says that an equilateral right-angled triangle is impossible, no one will deny it. But if he wants to consider such an impossible triangle as a new kind of triangles and apply properties of other triangles to it, would anyone take it seriously? This would be playing with words or rather abusing them. 
Although even eminent mathematicians have often applied truths that manifestly presuppose the possibility of certain quantities to cases where the possibility was still doubtful; and while I do not deny that such licenses usually pertain only to the form and semblance of reasoning, which the keen edge of true geometry can soon penetrate: yet it seems more advisable and more worthy of the sublimity of a science celebrated as the most perfect example of clarity and certainty, either to entirely prohibit such liberties, or at least to use them sparingly and only where those less practiced might find it difficult to perceive the matter without their aid, and where it could still be handled as rigorously, if perhaps less briefly. However, I do not deny that what I have said here against the abuse of impossibilities can be applied in some respects against imaginaries as well: yet I reserve the vindication of these and the fuller exposition of this whole matter for another occasion.
  2. It is worth noting that d’Alembert in his exposition of this demonstration used geometric considerations, viewing   as the abscissa and   as the ordinate of the curve (in the manner of all geometers of the first part of this century, among whom the notion of functions was less common). However, since all his reasoning, if you consider only their essence, relies on purely analytic principles, and imaginary curves and expressions of imaginary ordinates may seem harder and more likely to confuse the modern reader, I preferred to use a purely analytic form of representation here. I added this note to prevent anyone from suspecting that something essential had been changed by comparing d’Alembert’s demonstration itself with this concise exposition.
  3. By the way, on this occasion, I note incidentally that there are many series that initially seem to converge greatly, most notably those used by Euler in the latter part of Institutiones Calculi Differentialis Chapter VI. for approximating the sum of other series (for the remaining series on p. 475-478 can indeed converge), which, as far as I know, has not been noticed by anyone so far. Therefore, it would be highly desirable to clearly and rigorously demonstrate why such series, which converge very quickly at first, then more slowly, and finally more and more slowly, nevertheless provide an approximation to the true sum, as long as not too many terms are taken, and until such a sum can be safely considered exact.
  4. All of this will be much elucidated by another dissertation already sweating under the press, where in a completely different argument, but nevertheless analogous, I could have used a similar license with exactly the same right, as has been done here in equations by all analysts. Although the proofs of several truths could have been completed in a few words with the help of such fictions, which otherwise become very difficult and require the most subtle artifices, I preferred to abstain from them altogether and hoped to have satisfied a few if I followed the method of analysts.
  5. By mistake, Euler has   whence he afterwards also wrongly sets  
  6. An error seems to have crept into this explanation, namely on p. 118, line 5. Instead of the character $p$ ("on choisissait seulement celles où entrait $p$ etc."), one must necessarily read "une même racine quelconque de l’équation proposée," or something similar, as the former has no meaning.
  7. Explanations related to this commentary are found in the second volume of the same Miscellanea on p. 337. However, these are not relevant to the current discussion but pertain to the logarithms of negative quantities, which were discussed in the same work.
  8. Fig. 4 is constructed assuming   in which case readers less accustomed to general and abstract discussions may find it challenging to visualize the respective positions of both curves concretely. The length of the line   is assumed to be 10 (CN= 1.26255.)
  9. When $S > \sqrt{\tfrac{1}{2}}$, condition one implies condition two; when $S < \sqrt{\tfrac{1}{2}}$, condition two implies condition one.
  10. It seems to have been demonstrated quite well that an algebraic curve cannot suddenly break off anywhere (as happens, for example, in a transcendental curve whose equation is  ), nor lose itself, as it were, after infinite spirals at some point (like the logarithmic spiral), and as far as I know, no one has cast doubt on this matter. However, if someone demands a demonstration that is not subject to any doubts, I will undertake it on another occasion. In the present case, it is evident that if a branch, for example, 2, did not come out from the circle anywhere (Fig. 3), you could enter the circle between   and   then move around the whole branch (which should get lost in the space of the circle), and finally be able to exit between   and   again, so that you never intersect the first line on the entire path. This is absurd because at the point where you entered the circle, you had the first surface above you, and in the exit, below; therefore, you must have necessarily intersected the first surface itself somewhere, i.e., at a point on the first line. However, from this reasoning based on the principles of the geometry of position, which are no less valid than the principles of the geometry of magnitudes, it follows only that if you enter a branch of the first line in the circle, you can exit somewhere else from the circle, always remaining on the first line, and not that your path is a continuous line in the sense in which it is understood in higher geometry. But here it suffices that the path is a continuous line in the common sense, i.e., not interrupted anywhere but cohering everywhere.