1911 Encyclopædia Britannica, Volume 19 — Number, by George Ballard Mathews

NUMBER[1] (through Fr. nombre, from Lat. numerus; from a root seen in Gr. νέμειν to distribute), a word generally expressive of quantity, the fundamental meaning of which leads on analysis to some of the most difficult problems of higher mathematics.

1. The most elementary process of thought involves a distinction within an identity—the A and the not-A within the sphere throughout which these terms are intelligible. Again A may be a generic quality found in different modes Aa, Ab, Ac, &c.; for instance, colour in the modes, red, green, blue and so on. Thus the notions of “one,” “two,” and the vague “many” are fundamental, and must have impressed themselves on the human mind at a very early period: evidence of this is found in the grammatical distinction of singular, dual and plural which occurs in ancient languages of widely different races. A more definite idea of number seems to have been gradually acquired by realizing the equivalence, as regards plurality, of different concrete groups, such as the fingers of the right hand and those of the left. This led to the invention of a set of names which in the first instance did not suggest a numerical system, but denoted certain recognized forms of plurality, just as blue, red, green, &c., denote recognized forms of colour. Eventually the conception of the series of natural numbers became sufficiently clear to lead to a systematic terminology, and the science of arithmetic was thus rendered possible. But it is only in quite recent times that the notion of number has been submitted to a searching critical analysis: it is, in fact, one of the most characteristic results of modern mathematical research that the term number has been made at once more precise and more extensive.

2. Aggregates (also called manifolds or sets).—Let us assume the possibility of constructing or contemplating a permanent system of things such that (1) the system includes all objects to which a certain definite quality belongs; (2) no object without this quality belongs to the system; (3) each object of the system is permanently recognizable as the same thing, and as distinct from all other objects of the system. Such a collection is called an aggregate: the separate objects belonging to it are called its elements. An aggregate may consist of a single element.

It is further assumed that we can select, by a definite process, one or more elements of any aggregate $A$ at pleasure: these form another aggregate $B$. If any element of $A$ remains unselected, $B$ is said to be a part of $A$ (in symbols, $B < A$): if not, $B$ is identical with $A$. Every element of $A$ is a part of $A$. If $B < A$ and $C < B$, then $C < A$.

When a correspondence can be established between two aggregates $A$ and $B$ in such a way that to every element of $A$ corresponds one and only one element of $B$, and conversely, $A$ and $B$ are said to be equivalent, or to have the same power (or potency); in symbols, $A \sim B$. If $A \sim B$ and $B \sim C$, then $A \sim C$. It is possible for an aggregate to be equivalent to a part of itself: the aggregate is then said to be infinite. As an example, the aggregates $2, 4, 6, \ldots, 2n$, &c., and $1, 2, 3, \ldots, n$, &c., are equivalent, but the first is only a part of the second.

3. Order.—Suppose that when any two elements $a, b$ of an aggregate $A$ are taken there can be established, by a definite criterion, one or other of two alternative relations, symbolized by $a < b$ and $a > b$, subject to the following conditions: (1) If $a > b$, then $b < a$, and if $a < b$, then $b > a$; (2) If $a > b$ and $b > c$, then $a > c$. In this case the criterion is said to arrange the aggregate in order. An aggregate which can be arranged in order may be called ordinable. An ordinable aggregate may, in general, by the application of different criteria, be arranged in order in a variety of ways. According as $a < b$ or $a > b$ we shall speak of $a$ as anterior or posterior to $b$. These terms are chosen merely for convenience, and must not be taken to imply any meaning except what is involved in the definitions of the signs $<$ and $>$ for the particular criterion in question. The consideration of a succession of events in time will help to show that the assumptions made are not self-contradictory. An aggregate arranged in order by a definite criterion will be called an ordered aggregate. Let $a, b$ be any two elements of an ordered aggregate, and suppose $a < b$. All the elements $c$ (if any) such that $a < c < b$ are said to fall within the interval $(a, b)$. If an element $b$, posterior to $a$, can be found so that no element falls within the interval $(a, b)$, then $a$ is said to be isolated from all subsequent elements, and $b$ is said to be the element next after $a$. So if $b' < a$, and no element falls within the interval $(b', a)$, then $a$ is isolated from all preceding elements, and $b'$ is the element next before $a$. As will be seen presently, for any assigned element $a$, either, neither, or both of these cases may occur.

An aggregate $A$ is said to be well-ordered (or normally ordered) when, in addition to being ordered, it has the following properties: (1) $A$ has a first or lowest element $a$ which is anterior to all the rest; (2) if $B$ is any part of $A$, then $B$ has a first element. It follows from this that every part of a well-ordered aggregate is itself well-ordered. A well-ordered aggregate may or may not have a last element.

Two ordered aggregates $A, B$ are said to be similar ($A \simeq B$) when a one-one correspondence can be set up between their elements in such a way that if $a', b'$ are the elements of $B$ which correspond to any two elements $a, b$ of $A$, then $a' > b'$ or $a' < b'$ according as $a > b$ or $a < b$. For example, $(1, 3, 5, \ldots) \simeq (2, 4, 6, \ldots)$, because we can make the even number $2n$ correspond to the odd number $2n - 1$ and conversely.

Similar ordered aggregates are said to have the same order-type. Any definite order-type is said to be the ordinal number of every aggregate arranged according to that type. This somewhat vague definition will become clearer as we proceed.

4. The Natural Scale.—Let $a$ be any element of a well-ordered aggregate $A$. Then all the elements posterior to $a$ form an aggregate $A'$, which is a part of $A$ and, by definition, has a first element $a'$. This element $a'$ is different from $a$, and immediately succeeds it in the order of $A$. (It may happen, of course, that $A'$ does not exist; in this case $a$ is the last element of $A$.) Thus in a well-ordered aggregate every element except the last (if there be a last element) is succeeded by a definite next element. The ingenuity of man has developed a symbolism by means of which every symbol is associated with a definite next succeeding symbol, and in this way we have a set of visible or audible signs 1, 2, 3, &c. (or their verbal equivalents), representing an aggregate in which (1) there is a definite order, (2) there is a first term, (3) each term has one next following, and consequently there is no last term. Counting a set of objects means associating them in order with the first and subsequent members of this conventional aggregate. The process of counting may lead to three different results: (1) the set of objects may be finite in number, so that they are associated with a part of the conventional aggregate which has a last term; (2) the set of objects may have the same power as the conventional aggregate; (3) the set of objects may have a higher power than the conventional aggregate. Examples of (2) and (3) will be found further on. The order-type of 1, 2, 3, &c., and of similar aggregates will be denoted by $\omega$; this is the first and simplest member of a set of transfinite ordinal numbers to be considered later on. Any finite number such as 3 is used ordinally as representing the order-type of 1, 2, 3 or any similar aggregate, and cardinally as representing the power of 1, 2, 3 or any equivalent aggregate. For reasons that will appear, $\omega$ is only used in an ordinal sense. The aggregate 1, 2, 3, &c., in any of its written or spoken forms, may be called the natural scale, and denoted by $N$. It has already been shown that $N$ is infinite: this appears in a more elementary way from the fact that $(1, 2, 3, 4, \ldots) \sim (2, 3, 4, 5, \ldots)$, where each element of $N$ is made to correspond with the next following. Any aggregate which is equivalent to the natural scale or a part thereof is said to be countable.

5. Arithmetical Operations.—When the natural scale has once been obtained it is comparatively easy, although it requires a long process of induction, to define the arithmetical operations of addition, multiplication and involution, as applied to natural numbers. It can be proved that these operations are free from ambiguity and obey certain formal laws of commutation, &c., which will not be discussed here. Each of the three direct operations leads to an inverse problem which cannot be solved except under certain implied conditions. Let $a, b$ denote any two assigned natural numbers: then it is required to find natural numbers, $x, y, z$ such that

$$a + x = b, \quad ay = b, \quad z^a = b$$

respectively. The solutions, when they exist, are perfectly definite, and may be denoted by $b - a$, $b/a$ and $\sqrt[a]{b}$; but they are only possible in the first case when $b > a$, in the second when $b$ is a multiple of $a$, and in the third when $b$ is a perfect $a$th power. It is found to be possible, by the construction of certain elements, called respectively negative, fractional and irrational numbers, and zero, to remove all these restrictions.

6. There are certain properties, common to the aggregates with which we have next to deal, analogous to those possessed by the natural scale, and consequently justifying us in applying the term number to any one of their elements. They are stated here, once for all, to avoid repetition; the verification, in each case, will be, for the most part, left to the reader. Each of the aggregates in question ($A$, suppose) is an ordered aggregate. If $\alpha, \beta$ are any two elements of $A$, they may be combined by two definite operations, represented by $+$ and $\times$, so as to produce two definite elements of $A$ represented by $\alpha + \beta$ and $\alpha \times \beta$ (or $\alpha\beta$); these operations obey the formal laws satisfied by those of addition and multiplication. The aggregate $A$ contains one (and only one) element $\iota$, such that if $\alpha$ is any element of $A$ ($\iota$ included), then $\alpha + \iota > \alpha$, and $\alpha\iota = \alpha$. Thus $A$ contains the elements $\iota$, $\iota + \iota$, $\iota + \iota + \iota$, &c., or, as we may write them, $\iota, 2\iota, 3\iota, \ldots, m\iota, \ldots$ such that $m\iota + n\iota = (m + n)\iota$ and $m\iota \times n\iota = mn\iota$; also $(m + 1)\iota > m\iota$. We may express this by saying that $A$ contains an image of the natural scale. The element denoted by $\iota$ may be called the ground element of $A$.

7. Negative Numbers.—Let any two natural numbers $a, b$ be selected in a definite order $a, b$ (to be distinguished from $b, a$, in which the order is reversed). In this way we obtain from $N$ an aggregate of symbols $\{a, b\}$ which we shall call couples, or more precisely, if necessary, polar couples. This new aggregate may be arranged in order by means of the following rules:—

Two couples $\{a, b\}$, $\{a', b'\}$ are said to be equal if $a + b' = a' + b$. In other words $\{a, b\}$, $\{a', b'\}$ are then taken to be equivalent symbols for the same thing.

If $a + b' > a' + b$, we write $\{a, b\} > \{a', b'\}$; and if $a + b' < a' + b$ we write $\{a, b\} < \{a', b'\}$.

The rules for the addition and multiplication of couples are:

$$\{a, b\} + \{a', b'\} = \{a + a', b + b'\}, \quad \{a, b\} \times \{a', b'\} = \{aa' + bb', ab' + a'b\}.$$

The aggregate thus defined will be denoted by $\bar{N}$; it may be called the scale of relative integers.

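If $\iota$ denotes $\{2, 1\}$ or any equivalent couple, $\{a, b\} + \iota = \{a + 2, b + 1\} > \{a, b\}$ and $\{a, b\} \times \iota = \{2a + b, a + 2b\} = \{a, b\}$. Hence $\iota$ is the ground element of $\bar{N}$. By definition, $2\iota = \iota + \iota = \{4, 2\} = \{3, 1\}$: and hence by induction $m\iota = \{m + 1, 1\}$, where $m$ is any natural integer. Conversely every couple $\{a, b\}$ in which $a > b$ can be expressed by the symbol $(a - b)\iota$. In the same way, every couple $\{a, b\}$ in which $b > a$ can be expressed in the form $(b - a)\iota'$, where $\iota' = \{1, 2\}$.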
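The behaviour of polar couples can be checked mechanically. The following is a minimal sketch in Python, assuming the usual rules for such couples: $\{a, b\}$ is read informally as $a - b$, two couples are equal when $a + b' = a' + b$, and sums and products are formed as for differences. The class name `Couple` and the variable names are illustrative only.

```python
class Couple:
    """Polar couple {a, b} of natural numbers; informally it stands for a - b."""

    def __init__(self, a, b):
        if a < 1 or b < 1:
            raise ValueError("entries must be natural numbers (>= 1)")
        self.a, self.b = a, b

    def __eq__(self, other):
        # Assumed rule: {a, b} = {a', b'} when a + b' = a' + b.
        return self.a + other.b == other.a + self.b

    def __lt__(self, other):
        # Assumed rule: {a, b} < {a', b'} when a + b' < a' + b.
        return self.a + other.b < other.a + self.b

    def __add__(self, other):
        return Couple(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        return Couple(self.a * other.a + self.b * other.b,
                      self.a * other.b + other.a * self.b)

    def __repr__(self):
        return f"{{{self.a}, {self.b}}}"


iota = Couple(2, 1)        # ground element, plays the part of +1
iota_prime = Couple(1, 2)  # plays the part of -1
zero = Couple(1, 1)        # plays the part of 0
x = Couple(7, 3)           # plays the part of +4

print(x * iota == x)               # multiplying by the ground element changes nothing
print(x < x + iota)                # adding the ground element gives a posterior element
print(iota + iota_prime == zero)   # the two unit couples cancel
```

Because equality is defined on classes of equivalent symbols, `Couple(3, 1)` and `Couple(4, 2)` compare equal although their entries differ.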

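8. It follows as a formal consequence of the definitions that $\iota + \iota' = \{2, 1\} + \{1, 2\} = \{3, 3\} = \{1, 1\}$. It is convenient to denote $\{1, 1\}$ and its equivalent symbols by $0$, because

$$\{a, b\} + \{1, 1\} = \{a + 1, b + 1\} = \{a, b\}, \quad \{a, b\} \times \{1, 1\} = \{a + b, a + b\} = \{1, 1\};$$

hence $\iota + \iota' = 0$, and we can represent $\bar{N}$ by the scheme—

$$\ldots, 3\iota', 2\iota', \iota', 0, \iota, 2\iota, 3\iota, \ldots$$

in which each element is obtained from the next before it by the addition of $\iota$. With this notation the rules of operation may be written ($m$, $n$ denoting natural numbers)—

$$m\iota + n\iota = (m + n)\iota, \quad m\iota' + n\iota' = (m + n)\iota',$$
$$m\iota + n\iota' = (m - n)\iota \ \text{if } m > n, \quad m\iota + n\iota' = (n - m)\iota' \ \text{if } m < n, \quad m\iota + m\iota' = 0,$$
$$m\iota \times n\iota = m\iota' \times n\iota' = mn\iota, \quad m\iota \times n\iota' = mn\iota',$$

with the special rules for zero, that if $\alpha$ is any element of $\bar{N}$,

$$\alpha + 0 = \alpha, \quad \alpha \times 0 = 0.$$

To each element, $\alpha$, of $\bar{N}$ corresponds a definite element $\alpha'$ such that $\alpha + \alpha' = 0$; if $\alpha = 0$, then $\alpha' = 0$, but in every other case $\alpha, \alpha'$ are different and may be denoted by $m\iota, m\iota'$. The natural number $m$ is called the absolute value of $m\iota$ and $m\iota'$.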

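9. If $\alpha, \beta$ are any two elements of $\bar{N}$, the equation $\xi + \beta = \alpha$ is satisfied by putting $\xi = \alpha + \beta'$. Thus the symbol $\alpha - \beta$ is always interpretable as $\alpha + \beta'$, and we may say that within $\bar{N}$ subtraction is always possible; it is easily proved to be also free from ambiguity. On the other hand, $\alpha / \beta$ is intelligible only if the absolute value of $\alpha$ is a multiple of the absolute value of $\beta$.

The aggregate $\bar{N}$ has no first element and no last element. At the same time it is countable, as we see, for instance, by associating the elements $0, a\iota, b\iota'$ with the natural numbers $1, 2a, 2b + 1$ respectively, thus—

$$(0, \ \iota, \ \iota', \ 2\iota, \ 2\iota', \ 3\iota, \ldots)$$
$$(1, \ 2, \ 3, \ 4, \ 5, \ 6, \ldots)$$

It is usual to write $+a$ (or simply $a$) for $a\iota$ and $-a$ for $a\iota'$; that this should be possible without leading to confusion or ambiguity is certainly remarkable.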
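The countability of the relative integers may be illustrated by writing the correspondence out explicitly. The sketch below assumes one particular pairing, in which $0$ is matched with $1$, $+a$ with $2a$, and $-b$ with $2b + 1$; the function names are hypothetical.

```python
def relative_integer(k):
    """Relative integer paired with the natural number k (k >= 1)."""
    if k < 1:
        raise ValueError("k must be a natural number")
    if k == 1:
        return 0
    # Even k = 2a goes to +a; odd k = 2b + 1 goes to -b.
    return k // 2 if k % 2 == 0 else -(k // 2)


def position(z):
    """Natural number paired with the relative integer z (inverse of the map above)."""
    if z == 0:
        return 1
    return 2 * z if z > 0 else 2 * (-z) + 1


print([relative_integer(k) for k in range(1, 8)])  # [0, 1, -1, 2, -2, 3, -3]
```

Every relative integer occurs exactly once, which is what the one-one correspondence requires.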

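10. Fractional Numbers.—We will now derive from $N$ a different aggregate of couples $[a, b]$ subject to the following rules:

The symbols $[a, b]$, $[a', b']$ are equivalent if $ab' = a'b$. According as $ab'$ is greater or less than $a'b$ we regard $[a, b]$ as being greater or less than $[a', b']$. The formulae for addition and multiplication are

$$[a, b] + [a', b'] = [ab' + a'b, bb'], \quad [a, b] \times [a', b'] = [aa', bb'].$$

All the couples $[a, a]$ are equivalent to $[1, 1]$, and if we denote this by $\upsilon$ we have $[a, b] + \upsilon = [a + b, b] > [a, b]$, $[a, b] \times \upsilon = [a, b]$, so that $\upsilon$ is the ground element of the new aggregate.

Again $\upsilon + \upsilon = [1, 1] + [1, 1] = [2, 1]$, and by induction $m\upsilon = [m, 1]$. Moreover, if $a$ is a multiple of $b$, say $mb$, we may denote $[a, b]$ by $m\upsilon$.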

11. The new aggregate of couples will be denoted by $R$. It differs from $N$ and $\bar{N}$ in one very important respect, namely, that when its elements are arranged in order of magnitude (that is to say, by the rule above given) they are not isolated from each other. In fact if $\alpha = [a, b]$, and $\alpha' = [a', b']$, the element $[a + a', b + b']$ lies between $\alpha$ and $\alpha'$; hence it follows that between any two different elements of $R$ we can find as many other elements as we please. This property is expressed by saying that $R$ is in close order when its elements are arranged in order of magnitude. Strange as it appears at first sight, $R$ is a countable aggregate; a theorem first proved by G. Cantor. To see this, observe that every element of $R$ may be represented by a “reduced” couple $[a, b]$, in which $a, b$ are prime to each other. If $[a, b]$, $[c, d]$ are any two reduced couples, we will agree that $[a, b]$ is anterior to $[c, d]$ if either (1) $a + b < c + d$, or (2) $a + b = c + d$ but $a < c$. This gives a new criterion by which all the elements of $R$ can be arranged in the succession

$$[1, 1], [1, 2], [2, 1], [1, 3], [3, 1], [1, 4], [2, 3], [3, 2], [4, 1], \ldots$$

which is similar to the natural scale.
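The rearrangement of the fractional couples into a simple succession can be generated directly. The sketch below assumes that reduced couples are compared first by the sum of their two entries and then by the first entry; the generator name `reduced_couples` is illustrative.

```python
from itertools import islice
from math import gcd


def reduced_couples():
    """Yield every reduced couple (a, b) of natural numbers exactly once.

    Couples are listed by increasing a + b, and by increasing a within
    each sum; couples whose entries share a factor are skipped.
    """
    total = 2
    while True:
        for a in range(1, total):
            b = total - a
            if gcd(a, b) == 1:
                yield (a, b)
        total += 1


print(list(islice(reduced_couples(), 9)))
# [(1, 1), (1, 2), (2, 1), (1, 3), (3, 1), (1, 4), (2, 3), (3, 2), (4, 1)]
```

Since only finitely many couples share a given sum, every reduced couple is reached after a finite number of steps, so the succession has the order-type of the natural scale.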

The aggregate $R$, arranged in order of magnitude, agrees with $\bar{N}$ in having no least and no greatest element; for if $\alpha$ denotes any element $[a, b]$, then $[2a - 1, 2b] < \alpha < [2a + 1, 2b]$.

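12. The division of one element of $R$ by another is always possible; for by definition

$$[c, d] \times [ad, bc] = [acd, bcd] = [a, b],$$

and consequently $[a, b] \div [c, d]$ is always interpretable as $[ad, bc]$. As a particular case $[a, 1] \div [b, 1] = [a, b]$, so that every element of $R$ is expressible in one of the forms $m\upsilon$, $m\upsilon \div n\upsilon$. It is usual to omit the symbol $\upsilon$ altogether, and to represent the element $[a, b]$ by $a/b$, whether $a$ is a multiple of $b$ or not. Moreover, $a/1$ is written $a$, which may be done without confusion, because $a/1 + b/1 = (a + b)/1$, and $a/1 \times b/1 = ab/1$, by the rules given above.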

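13. Within the aggregate $R$ subtraction is not always practicable; but this limitation may be removed by constructing an aggregate $\bar{R}$ related to $R$ in the same way as $\bar{N}$ to $N$. This may be done in two ways which lead to equivalent results. We may either form symbols of the type $\{\alpha, \beta\}$, where $\alpha, \beta$ denote elements of $R$, and apply the rules of § 7; or else form symbols of the type $[\alpha, \beta]$, where $\alpha, \beta$ denote elements of $\bar{N}$, and apply the rules of § 10. The final result is that $\bar{R}$ contains a zero element, $0$, a ground element $\upsilon$, an element $\upsilon'$ such that $\upsilon + \upsilon' = 0$, and a set of elements representable by the symbols $\alpha\upsilon$, $\alpha\upsilon'$. In this notation the rules of operation are

$$\alpha\upsilon + \beta\upsilon = (\alpha + \beta)\upsilon;$$
$$\alpha\upsilon' + \beta\upsilon' = (\alpha + \beta)\upsilon';$$
$$\alpha\upsilon \times \beta\upsilon = \alpha\upsilon' \times \beta\upsilon' = \alpha\beta\upsilon;$$
$$\alpha\upsilon \times \beta\upsilon' = \alpha\beta\upsilon'.$$

Here $\alpha$ and $\beta$ denote any two elements of $R$. If $\alpha > \beta$, then $\alpha\upsilon + \beta\upsilon' = (\alpha - \beta)\upsilon$, and if $\alpha < \beta$, then $\alpha\upsilon + \beta\upsilon' = (\beta - \alpha)\upsilon'$. If $\alpha = \beta$, then $\alpha\upsilon + \beta\upsilon' = 0$.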

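14. When $\bar{R}$ is constructed by means of couples taken from $\bar{N}$, we must put $[m\iota, n\iota] = [m\iota', n\iota'] = \frac{m}{n}\upsilon$, $[m\iota, n\iota'] = [m\iota', n\iota] = \frac{m}{n}\upsilon'$, and $[0, \alpha] = 0$, if $\alpha$ is any element of $\bar{N}$ except $0$. The symbols $[0, 0]$ and $[\alpha, 0]$ are inadmissible; the first because it satisfies the definition of equality (§ 10) with every symbol $[\alpha, \beta]$, and is therefore indeterminate; the second because, according to the rule of addition,

$$[\alpha, 0] + [\beta, \gamma] = [\alpha\gamma, 0] = [\alpha, 0],$$

which is inconsistent with

$$[\alpha, 0] + \upsilon > [\alpha, 0].$$

In the same way, if $0$ denotes the zero element of $\bar{R}$, and $\xi$ any other element, the symbol $0/0$ is indeterminate, and $\xi/0$ inadmissible, because, by the formal rules of operation, $\xi/0 + \upsilon = \xi/0$, which conflicts with the definition of the ground element $\upsilon$. It is usual to write $+\frac{m}{n}$ (or simply $\frac{m}{n}$) for $\frac{m}{n}\upsilon$, and $-\frac{m}{n}$ for $\frac{m}{n}\upsilon'$. Each of these elements is said to have the absolute value $\frac{m}{n}$. The criterion for arranging the elements of $\bar{R}$ in order of magnitude is that, if $\alpha, \beta$ are any two elements of it, $\alpha > \beta$ when $\alpha - \beta$ is positive; that is to say, when it can be expressed in the form $\frac{m}{n}\upsilon$.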

15. The aggregate $\bar{R}$ is very important, because it is the simplest type of a field of rationality, or corpus. An algebraic corpus is an aggregate, such that its elements are representable by symbols $\alpha, \beta$, &c., which can be combined according to the laws of ordinary algebra; every algebraic expression obtained by combining a finite number of symbols, by means of a finite chain of rational operations, being capable of interpretation as representing a definite element of the aggregate, with the single exception that division by zero is inadmissible. Since, by the laws of algebra, $\alpha - \alpha = 0$, and $\alpha / \alpha = 1$, every algebraic field contains $\bar{R}$, or, more properly, an aggregate which is an image of $\bar{R}$.

16. Irrational Numbers.—Let $\alpha$ denote any element of $\bar{R}$; then $\alpha$ and all lesser elements form an aggregate, $A$ say; the remaining elements form another aggregate $A'$, which we shall call complementary to $A$, and we may write $\bar{R} = A + A'$. Now the essence of this separation of $\bar{R}$ into the parts $A$ and $A'$ may be expressed without any reference to $\alpha$ as follows:—

I. The aggregates $A, A'$ are complementary; that is, their elements, taken together, make up the whole of $\bar{R}$.

II. Every element of $A$ is less than every element of $A'$.

III. The aggregate $A'$ has no least element. (This condition is artificial, but saves a distinction of cases in what follows.)

Every separation $\bar{R} = A + A'$ which satisfies these conditions is called a cut (or section), and will be denoted by $(A, A')$. We have seen that every rational number $\alpha$ can be associated with a cut. Conversely, every cut $(A, A')$ in which $A$ has a last element $\alpha$ is perfectly definite, and specifies $\alpha$ without ambiguity. But there are other cuts in which $A$ has no last element. For instance, all the elements ($\alpha$) of $\bar{R}$ such that either $\alpha \le 0$, or $\alpha^2 < 2$, form an aggregate $A$, while those for which $\alpha > 0$ and $\alpha^2 > 2$, form the complementary aggregate $A'$. This separation is a cut in which $A$ has no last element; because if $p/q$ is any positive element of $A$, the element $(3p + 4q)/(2p + 3q)$ exceeds $p/q$, and also belongs to $A$. Every cut of this kind is said to define an irrational number. The justification of this is contained in the following propositions:—

(1) A cut is a definite concept, and the assemblage of cuts is an aggregate according to definition; the generic quality of the aggregate being the separation of $\bar{R}$ into two complementary parts, without altering the order of its elements.

(2) The aggregate of cuts may be arranged in order by the rule that $(A, A') < (B, B')$ if $A$ is a part of $B$.

(3) This criterion of arrangement preserves the order of magnitude of all rational numbers.

(4) Cuts may be combined according to the laws of algebra, and, when the cuts so combined are all rational, the results are in agreement with those derived from the rational theory.

As a partial illustration of proposition (4) let $(A, A')$, $(B, B')$ be any two cuts; and let $C'$ be the aggregate whose elements are obtained by forming all the values of $\alpha' + \beta'$, where $\alpha'$ is any element of $A'$ and $\beta'$ is any element of $B'$. Then if $C$ is the complement of $C'$, it can be proved that $(C, C')$ is a cut; this is said to be the sum of $(A, A')$ and $(B, B')$. The difference, product and quotient of two cuts may be defined in a similar way. If $n$ denotes the irrational cut chosen above for purposes of illustration, we shall have $n^2 = (C, C')$ where $C'$ comprises all the numbers $\alpha'\beta'$ obtained by multiplying any two elements $\alpha', \beta'$, which are rational and positive, and such that $\alpha'^2 > 2$, $\beta'^2 > 2$. Since $\alpha'^2\beta'^2 > 4$ it follows that $\alpha'\beta'$ is positive and greater than $2$; it can be proved conversely that every rational number which is greater than $2$ can be expressed in the form $\alpha'\beta'$. Hence $n^2 = 2$, so that the cut $n$ actually gives a real arithmetical meaning to the positive root of the equation $x^2 = 2$; in other words we may say that $n$ defines the irrational number $\sqrt{2}$. The theory of cuts, in fact, provides a logical basis for the treatment of all finite numerical irrationalities, and enables us to justify all arithmetical operations involving the use of such quantities.
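That the lower class of the cut defining $\sqrt{2}$ has no last element can be illustrated with exact rational arithmetic. The sketch below assumes the lower class consists of the rationals that are either not positive or have square less than 2, and uses one convenient map, $p/q \to (3p + 4q)/(2p + 3q)$, to pass from any positive member to a larger one; the function names are illustrative.

```python
from fractions import Fraction


def in_lower_class(x):
    """True if the rational x is not positive or has square less than 2."""
    return x <= 0 or x * x < 2


def larger_member(x):
    """For a positive member p/q of the lower class, return (3p + 4q)/(2p + 3q)."""
    p, q = x.numerator, x.denominator
    return Fraction(3 * p + 4 * q, 2 * p + 3 * q)


x = Fraction(1)
chain = [x]
for _ in range(5):
    x = larger_member(x)
    chain.append(x)

# Each term is larger than the one before and still has square less than 2.
for term in chain:
    print(term, in_lower_class(term))
```

Starting from 1 the chain begins 1, 7/5, 41/29, 239/169; however far it is continued, no term is the greatest member of the lower class.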

17. Since the aggregate of cuts (N say) has an order of magnitude, we may construct cuts in this aggregate. Thus if a is any element of N, and A is the aggregate which consists of a and all anterior elements of N, we may write N = A+A′, and (A, A′) is a cut in which A has a last element a. It is a remarkable fact that no other kind of cut in N is possible; in other words, every conceivable cut in N is defined by one of its own elements. This is expressed by saying that N is a continuous aggregate, and N itself is referred to as the numerical continuum of real numbers. The property of continuity must be carefully distinguished from that of close order (§ 11); a continuous aggregate is necessarily in close order, but the converse is not always true. The aggregate N is not countable.

18. Another way of treating irrationals is by means of sequences. A sequence is an unlimited succession of rational numbers

$$a_1, a_2, a_3, \ldots, a_m, a_{m+1}, \ldots$$

(in order-type ω) the elements of which can be assigned by a definite rule, such that when any rational number ε, however small, has been fixed, it is possible to find an integer $m$, so that for all positive integral values of $n$ the absolute value of $(a_{m+n} - a_m)$ is less than ε. Under these conditions the sequence may be taken to represent a definite number, which is, in fact, the limit of $a_m$ when $m$ increases without limit. Every rational number $a$ can be expressed as a sequence in the form $(a, a, a, \ldots)$, but this is only one of an infinite variety of such representations, for instance—

1 = (·9, ·99, ·999, . . .) =

and so on. The essential thing is that we have a mode of representation which can be applied to rational and irrational numbers alike, and provides a very convenient symbolism to express the results of arithmetical operations. Thus the rules for the sum and product of two sequences are given by the formulae

$$(a_m) + (b_m) = (a_m + b_m), \quad (a_m) \times (b_m) = (a_m b_m),$$

from which the rules for subtraction and division may be at once inferred. It has been proved that the method of sequences is ultimately equivalent to that of cuts. The advantage of the former lies in its convenient notation, that of the latter in giving a clear definition of an irrational number without having recourse to the notion of a limit.

19. Complex Numbers.—If α is an assigned number, rational or irrational, and n a natural number, it can be proved that there is a real number satisfying the equation $x^n = \alpha$, except when n is even and α is negative: in this case the equation is not satisfied by any real number whatever. To remove the difficulty we construct an aggregate of polar couples {x, y}, where x, y are any two real numbers, and define the addition and multiplication of such couples by the rules

$$\{x, y\} + \{x', y'\} = \{x + x', y + y'\}, \quad \{x, y\} \times \{x', y'\} = \{xx' - yy', xy' + x'y\}.$$

We also agree that {x, y} < {x′, y′}, if x<x′ or if x=x′ and y<y′. It follows that the aggregate has the ground element {1, 0}, which we may denote by σ; and that, if we write τ for the element {0, 1},

τ² = {−1, 0} = −σ.

Whenever m, n are rational, {m, n} = mσ+nτ, and we are thus justified in writing, if we like, xσ+yτ for {x, y} in all circumstances. A further simplification is gained by writing x instead of xσ, and regarding τ as a symbol which is such that τ² = −1, but in other respects obeys the ordinary laws of operation. It is usual to write i instead of τ; we thus have an aggregate J of complex numbers x+yi. In this aggregate, which includes the real continuum as part of itself, not only the four rational operations (excluding division by {0, 0}, the zero element), but also the extraction of roots, may be effected without any restriction. Moreover (as first proved by Gauss and Cauchy), if $a_0, a_1, \ldots, a_n$ are any assigned real or complex numbers, the equation

$$a_0 z^n + a_1 z^{n-1} + \ldots + a_{n-1} z + a_n = 0,$$

is always satisfied by precisely n real or complex values of z, with a proper convention as to multiple roots. Thus any algebraic function of any finite number of elements of J is also contained in J, which is, in this sense, a closed arithmetical field, just as N is when we restrict ourselves to rational operations. The power of J is the same as that of N.
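The arithmetic of these couples is easily tried out. The sketch below assumes the usual rules, namely that couples are added entry by entry and that $\{x, y\} \times \{x', y'\} = \{xx' - yy', xy' + x'y\}$; couples are represented as plain Python tuples, and the function names are illustrative.

```python
def add(u, v):
    """Sum of two couples {x, y} and {x', y'}."""
    return (u[0] + v[0], u[1] + v[1])


def mul(u, v):
    """Product of two couples: {x x' - y y', x y' + x' y}."""
    x, y = u
    x2, y2 = v
    return (x * x2 - y * y2, x * y2 + x2 * y)


sigma = (1, 0)  # ground element
tau = (0, 1)    # the couple usually written i

print(mul(tau, tau))        # (-1, 0), that is, -sigma
print(mul(sigma, (3, 4)))   # (3, 4): the ground element leaves a couple unchanged
print(mul((1, 2), (3, 4)))  # (-5, 10), agreeing with (1 + 2i)(3 + 4i)
```

The last product can be compared with Python's built-in complex type, where `(1 + 2j) * (3 + 4j)` gives `(-5+10j)`.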

20. Transfinite Numbers.—The theory of these numbers is quite recent, and mainly due to G. Cantor. The simplest of them, ω, has been already defined (§ 4) as the order-type of the natural scale. Now there is no logical difficulty in constructing a scheme

u1, u2, u3 . . . | v1,

indicating a well-ordered aggregate of type ω immediately followed by a distinct element v1: for example, we may think of all positive odd integers arranged in ascending order of magnitude and then think of the even number 2. A scheme of this kind is said to be of order-type (ω+1); and it will be convenient to speak of (ω+1) as the index of the scheme. Similarly we may form arrangements corresponding to the indices

ω+2, ω+3 . . . ω+n,

where n is any positive integer. The scheme

u1, u2, u3 . . . | v1, v2, v3 . . .

is associated with ω+ω = 2ω;

u11, u12, u13 . . . | u21, u22, u23 . . . | . . . | un1, un2, . . . | . . .

with ω.ω or ω²; and so on. Thus we may construct arrangements of aggregates corresponding to any index of the form

$$\varphi(\omega) = a\omega^n + b\omega^{n-1} + \ldots + k\omega + l,$$

where n, a, b, . . . l are all positive integers.

We are thus led to the construction of a scheme of symbols—

The symbols φ(ω) form a countable aggregate: so that we may, if we like (and in various ways), arrange the rows of block (II.) in a scheme of type ω: we thus have each element α succeeded in its row by (α+1), and the row containing φ(ω) succeeded by a definite next row. The same process may be applied to (III.), and we can form additional blocks (IV.), (V.), &c., with first elements &c. All the symbols in which ω occurs are called transfinite ordinal numbers.

21. The index of a finite set is a definite integer however the set may be arranged; we may take this index as also denoting the power of the set, and call it the number of things in the set. But the index of an infinite ordinable set depends upon the way in which its elements are arranged; for instance, ind. , but ind. . Or, to take another example, the scheme—

where each row is supposed to follow the one above it, gives a permutation of (1, 2, 3, . . . ), by which its index is changed from ω to ω². It has been proved that there is a permutation of the natural scale, of which the index is φ(ω), any assigned element of (II.); and that, if the index of any ordered aggregate is φ(ω), the aggregate is countable. Thus the power of all aggregates which can be associated with indices of the class (II.) is the same as that of the natural scale; this power may be denoted by a. Since a is associated with all aggregates of a particular power, independently of the arrangement of their elements, it is analogous to the integers, 1, 2, 3, &c., when used to denote powers of finite aggregates; for this reason it is called the least transfinite cardinal number.

22. There are aggregates which have a power greater than 𝑎: for instance, the arithmetical continuum of positive real numbers, the power of which is denoted by 𝑐. Another one is the aggregate of all those order-types which (like those in II. above) are the indices of aggregates of power 𝑎. The power of this aggregate is denoted by ℵ₁. According to Cantor’s theory it is the transfinite cardinal number next superior to 𝑎, which for the sake of uniformity is also denoted by ℵ₀. It has been conjectured that ℵ₁ = 𝑐, but this has neither been verified nor disproved. The discussion of the aleph-numbers is still in a controversial stage (November 1907) and the points in debate cannot be entered upon here.

23. Transfinite numbers, both ordinal and cardinal, may be combined by operations which are so far analogous to those of ordinary arithmetic that it is convenient to denote them by the same symbols. But the laws of operation are not entirely the same; for instance, 2ω and ω2 have different meanings: the first has been explained, the second is the index of the scheme (𝑎₁𝑏₁ | 𝑎₂𝑏₂ | 𝑎₃𝑏₃ | . . . | 𝑎ₙ𝑏ₙ | . . . ) or any similar arrangement. Again if 𝑛 is any positive integer, 𝑛𝑎=𝑎𝑛=𝑎. It should also be observed that according to Cantor’s principles of construction every ordinal number is succeeded by a definite next one; but that there are definite ordinal numbers (e.g. ω, ω²) which have no ordinal immediately preceding them.

24. Theory of Numbers.—The theory of numbers is that branch of mathematics which deals with the properties of the natural numbers. As Dirichlet observed long ago, the whole of the subject would be coextensive with mathematical analysis in general; but it is convenient to restrict it to certain fields where the appropriateness of the above definition is fairly obvious. Even so, the domain of the subject is becoming more and more comprehensive, as the methods of analysis become more systematic and more exact.

The first noteworthy classification of the natural numbers is into those which are prime and those which are composite. A prime number is one which is not exactly divisible by any number except itself and 1; all others are composite. The number of primes is infinite (Eucl. Elem. ix. 20), and consequently, if 𝑛 is an assigned number, however large, there is an infinite number (𝑎) of primes greater than 𝑛.

If $m$, $n$ are any two numbers, and $m > n$, we can always find a definite chain of positive integers $(q_1, r_1)$, $(q_2, r_2)$, &c., such that

$$m = q_1 n + r_1, \quad n = q_2 r_1 + r_2, \quad r_1 = q_3 r_2 + r_3, \ \ldots$$

with $n > r_1 > r_2 > r_3 \ldots$; the process by which they are calculated will be called residuation. Since there is only a finite number of positive integers less than $n$, the process must terminate with two equalities of the form

$$r_{h-2} = q_h r_{h-1} + r_h, \quad r_{h-1} = q_{h+1} r_h.$$

Hence we infer successively that $r_h$ is a divisor of $r_{h-1}, r_{h-2}, \ldots, r_1$, and finally of $m$ and $n$. Also $r_h$ is the greatest common factor of $m$, $n$: because any common factor must divide $r_1$, $r_2$, and so on down to $r_h$: and the highest factor of $r_h$ is $r_h$ itself. It will be convenient to write $r_h = \mathrm{dv}(m, n)$. If $r_h = 1$, the numbers $m$, $n$ are said to be prime to each other, or co-primes.
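The process of residuation is the familiar Euclidean algorithm, and may be sketched as follows. The function name `residuation` and the choice to return the whole chain of quotients and remainders are illustrative; the final pair in the chain has remainder zero, corresponding to the last equality, in which no remainder is written.

```python
def residuation(m, n):
    """Return the chain of (quotient, remainder) pairs and dv(m, n) for m > n >= 1."""
    if not (m > n >= 1):
        raise ValueError("expected natural numbers with m > n")
    chain = []
    a, b = m, n
    while b != 0:
        q, r = divmod(a, b)  # a = q * b + r with 0 <= r < b
        chain.append((q, r))
        a, b = b, r
    # The last non-zero remainder (or n itself, if n divides m) is the
    # greatest common factor.
    return chain, a


chain, dv = residuation(1071, 462)
print(chain)  # [(2, 147), (3, 21), (7, 0)]
print(dv)     # 21
```

Here 1071 = 2·462 + 147, 462 = 3·147 + 21 and 147 = 7·21, so that dv(1071, 462) = 21.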

25. The foregoing theorem of residuation is of the greatest importance; with the help of it we can prove three other fundamental propositions, namely:—

(1) If 𝑚, 𝑛 are any two natural numbers, we can always find two other natural numbers 𝑥, 𝑦 such that

dv(𝑚,𝑛)=𝑥𝑚−𝑦𝑛.

(2) If 𝑚, 𝑛 are prime to each other, and 𝑝 is a prime factor of 𝑚𝑛, then 𝑝 must be a factor of either 𝑚 or 𝑛.

(3) Every number may be uniquely expressed as a product of prime factors.

Hence if $n = p^{\alpha} q^{\beta} r^{\gamma} \ldots$ is the representation of any number $n$ as the product of powers of different primes, the divisors of $n$ are the terms of the product $(1 + p + p^2 + \ldots + p^{\alpha})(1 + q + \ldots + q^{\beta})(1 + r + \ldots + r^{\gamma}) \ldots$; their number is $(\alpha + 1)(\beta + 1)(\gamma + 1) \ldots$; and their sum is $\prod (p^{\alpha + 1} - 1) \div \prod (p - 1)$. This includes 1 and $n$ among the divisors of $n$.
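The two formulae for the number and the sum of the divisors can be verified numerically. The sketch below obtains the prime factorization by simple trial division, which is adequate only for small numbers; all function names are illustrative.

```python
def prime_factorization(n):
    """Return {prime: exponent} for a natural number n, by trial division."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def divisor_count(n):
    """Number of divisors: the product of (exponent + 1) over the prime factors."""
    count = 1
    for exponent in prime_factorization(n).values():
        count *= exponent + 1
    return count


def divisor_sum(n):
    """Sum of divisors: the product of (p^(exponent + 1) - 1) / (p - 1)."""
    total = 1
    for p, exponent in prime_factorization(n).items():
        total *= (p ** (exponent + 1) - 1) // (p - 1)
    return total


print(prime_factorization(360))  # {2: 3, 3: 2, 5: 1}
print(divisor_count(360))        # 24
print(divisor_sum(360))          # 1170
```

For 360 = 2³·3²·5 the count is 4·3·2 = 24 and the sum is 15·13·6 = 1170.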

26. Totients.—By the totient of $n$, which is denoted, after Euler, by $\varphi(n)$, we mean the number of integers prime to $n$, and not exceeding $n$. If $n = p^{\alpha}$, the numbers not exceeding $n$ and not prime to it are $p, 2p, \ldots, (p^{\alpha} - p), p^{\alpha}$, of which the number is $p^{\alpha - 1}$: hence $\varphi(p^{\alpha}) = p^{\alpha} - p^{\alpha - 1}$. If $m$, $n$ are prime to each other, $\varphi(mn) = \varphi(m)\varphi(n)$; and hence for the general case, if $n = p^{\alpha} q^{\beta} r^{\gamma} \ldots$, $\varphi(n) = \prod p^{\alpha - 1}(p - 1)$, where the product applies to all the different prime factors of $n$. If $d_1$, $d_2$, &c., are the different divisors of $n$,

φ(𝑑1)+φ(𝑑2)+ . . . =𝑛.

For example 15=φ(15)+φ(5)+φ(3)+φ(1)=8+4+2+1.
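Both the product formula for the totient and the theorem on the sum over divisors are easy to test. The sketch below computes the totient once by direct counting and once from the prime factors; the function names are illustrative.

```python
from math import gcd


def totient(n):
    """Number of integers from 1 to n that are prime to n, by direct counting."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def totient_from_primes(n):
    """Totient as the product of p^(a - 1) * (p - 1) over the prime powers p^a in n."""
    result, p, m = 1, 2, n
    while p * p <= m:
        if m % p == 0:
            power = 1
            while m % p == 0:
                m //= p
                power *= p          # power ends as p^a
            result *= (power // p) * (p - 1)
        p += 1
    if m > 1:                        # one prime factor left, to the first power
        result *= m - 1
    return result


divisors_of_15 = [d for d in range(1, 16) if 15 % d == 0]
print([totient(d) for d in divisors_of_15])       # [1, 2, 4, 8]
print(sum(totient(d) for d in divisors_of_15))    # 15
```

The printed values reproduce the example in the text, 15 = 8 + 4 + 2 + 1.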

27. Residues and congruences.—It will now be convenient to include in the term “number” both zero and negative integers. Two numbers 𝑎, 𝑏 are said to be congruent with respect to the modulus 𝑚, when (𝑎−𝑏) is divisible by 𝑚. This is expressed by the notation 𝑎≡𝑏 (mod 𝑚), which was invented by Gauss. The fundamental theorems relating to congruences are

If $a \equiv b$ and $c \equiv d \pmod{m}$, then $a \pm c \equiv b \pm d$, and $ac \equiv bd$.

If $ha \equiv hb \pmod{m}$, then $a \equiv b \pmod{m/d}$, where $d = \mathrm{dv}(h, m)$.

Thus the theory of congruences is very nearly, but not quite, similar to that of algebraic equations. With respect to a given modulus 𝑚 the scale of relative integers may be distributed into 𝑚 classes, any two elements of each class being congruent with respect to 𝑚. Among these will be φ(𝑚) classes containing numbers prime to 𝑚. By taking any one number from each class we obtain a complete system of residues to the modulus 𝑚. Supposing (as we shall always do) that 𝑚 is positive, the numbers 0, 1, 2, . . . (𝑚−1) form a system of least positive residues; according as 𝑚 is odd or even, 0, ±1, ±2, . . . ±1/2 (𝑚−1), or 0, ±1, ±2, . . . ±1/2(𝑚−2),1/2𝑚 form a system of absolutely least residues.
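The two standard systems of residues, and the rule for cancelling a common factor from a congruence, can be sketched as follows. The modulus is assumed positive, as in the text; the function names are illustrative.

```python
from math import gcd


def least_positive_residue(a, m):
    """Representative of a in 0, 1, ..., m - 1 (Python's % is non-negative for m > 0)."""
    return a % m


def absolutely_least_residue(a, m):
    """Representative of a of least absolute value; for even m the value m/2 is kept positive."""
    r = a % m
    return r - m if r > m // 2 else r


def reduced_modulus(h, m):
    """Modulus m/d, with d = dv(h, m), to which h*a = h*b (mod m) reduces after cancelling h."""
    return m // gcd(h, m)


print(sorted(absolutely_least_residue(a, 7) for a in range(7)))  # [-3, -2, -1, 0, 1, 2, 3]
print(sorted(absolutely_least_residue(a, 8) for a in range(8)))  # [-3, -2, -1, 0, 1, 2, 3, 4]
print(reduced_modulus(4, 6))  # 3: from 4*1 = 4*4 (mod 6) we may infer 1 = 4 (mod 3) only
```

The last line shows why congruences are not quite like equations: the factor 4 cannot be cancelled to the modulus 6, since 1 and 4 are not congruent to that modulus.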

28. The Theorems of Fermat and Wilson.—Let $r_1, r_2, \ldots, r_t$, where $t=\varphi(m)$, be a complete set of residues prime to the modulus $m$. Then if $x$ is any number prime to $m$, the residues $xr_1, xr_2, \ldots, xr_t$ also form a complete set prime to $m$ (§ 27). Consequently $xr_1\cdot xr_2\ldots xr_t\equiv r_1r_2\ldots r_t$, and dividing by $r_1r_2\ldots r_t$, which is prime to the modulus, we infer that

$$x^{\varphi(m)}\equiv 1 \pmod{m},$$

which is the general statement of Fermat’s theorem. If $m$ is a prime $p$, it becomes $x^{p-1}\equiv 1 \pmod{p}$.

For a prime modulus 𝑝 there will be among the set 𝑥, 2𝑥, 3𝑥, . . . (𝑝−1)𝑥 just one and no more that is congruent to 1: let this be 𝑥𝑦. If 𝑦≡𝑥, we must have 𝑥²−1=(𝑥−1)(𝑥+1)≡0, and hence 𝑥≡±1: consequently the residues 2, 3, 4, . . . (𝑝−2) can be arranged in ½(𝑝−3) pairs (𝑥, 𝑦) such that 𝑥𝑦≡1. Multiplying them all together, we conclude that 2·3·4 . . . (𝑝−2)≡1 and hence, since 1·(𝑝−1)≡−1,

(𝑝−1)!≡−1 (mod 𝑝).


which is Wilson’s theorem. It may be generalized, like that of Fermat, but the result is not very interesting. If 𝑚 is composite (𝑚−1)!+1 cannot be a multiple of 𝑚: because 𝑚 will have a prime factor 𝑝 which is less than 𝑚, so that (𝑚−1)!≡0 (mod 𝑝). Hence Wilson’s theorem is invertible: but it does not supply any practical test to decide whether a given number is prime.
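
Both theorems, and the converse of Wilson’s theorem, lend themselves to a direct numerical check. The sketch below assumes nothing beyond the statements above; the names `prime_residues`, `fermat_holds` and `wilson_test` are illustrative. As the text remarks, the factorial makes the Wilson test useless in practice for large numbers.

```python
from math import gcd, factorial


def prime_residues(m):
    """Least positive residues prime to the modulus m (m >= 2)."""
    return [r for r in range(1, m) if gcd(r, m) == 1]


def fermat_holds(m):
    """Check x^phi(m) = 1 (mod m) for every residue x prime to m."""
    residues = prime_residues(m)
    t = len(residues)               # t = phi(m)
    return all(pow(x, t, m) == 1 for x in residues)


def wilson_test(m):
    """True exactly when (m-1)! + 1 is a multiple of m."""
    return m > 1 and (factorial(m - 1) + 1) % m == 0


print(all(fermat_holds(m) for m in range(2, 60)))
print([m for m in range(2, 40) if wilson_test(m)])
```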

29. Exponents, Primitive Roots, Indices.—Let $p$ denote an odd prime, and $x$ any number prime to $p$. Among the powers $x, x^{2}, x^{3}, \ldots, x^{p-1}$ there is certainly one, namely $x^{p-1}$, which ≡1 (mod $p$); let $x^{e}$ be the lowest power of $x$ such that $x^{e}\equiv 1$. Then $e$ is said to be the exponent to which $x$ appertains (mod $p$): it is always a factor of $(p-1)$ and can only be 1 when $x\equiv 1$. The residues $x$ for which $e=p-1$ are said to be primitive roots of $p$. They always exist, their number is $\varphi(p-1)$, and they can be found by a methodical, though tedious, process of exhaustion. If $g$ is any one of them, the complete set may be represented by $g, g^{a}, g^{b}, \ldots$ &c., where $a$, $b$, &c., are the numbers less than $(p-1)$ and prime to it, other than 1. Every number $x$ which is prime to $p$ is congruent, mod $p$, to $g^{i}$, where $i$ is one of the numbers $1, 2, 3, \ldots, (p-1)$; this number $i$ is called the index of $x$ to the base $g$. Indices are analogous to logarithms: thus

$$\operatorname{ind}_g(xy)\equiv\operatorname{ind}_g x+\operatorname{ind}_g y,\qquad \operatorname{ind}_g(x^{h})\equiv h\operatorname{ind}_g x \pmod{p-1}.$$

Consequently tables of primitive roots and indices for different primes are of great value for arithmetical purposes. Jacobi’s Canon Arithmeticus gives a primitive root, and a table of numbers and indices for all primes less than 1000.
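
The "process of exhaustion" and the construction of a table of indices can be sketched in a few lines. The helper names `exponent_of`, `primitive_roots` and `index_table` are assumptions made for illustration, not notation from the text.

```python
def exponent_of(x, p):
    """Least e >= 1 with x^e = 1 (mod p); x must be prime to p."""
    e, y = 1, x % p
    while y != 1:
        y = y * x % p
        e += 1
    return e


def primitive_roots(p):
    """Residues whose exponent is p - 1, found by exhaustion."""
    return [g for g in range(1, p) if exponent_of(g, p) == p - 1]


def index_table(g, p):
    """Map each residue x prime to p to its index i (1 <= i <= p-1), g^i = x."""
    table, y = {}, 1
    for i in range(1, p):
        y = y * g % p
        table[y] = i
    return table


ind = index_table(2, 13)            # 2 is a primitive root of 13
# Indices behave like logarithms to the modulus p - 1:
print((ind[5] + ind[7]) % 12 == ind[5 * 7 % 13] % 12)
```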

For moduli of the forms $2p$, $p^{m}$, $2p^{m}$ there is an analogous theory (and also for 2 and 4); but for a composite modulus of other forms there are no primitive roots, and the nearest analogy is the representation of prime residues in the form $\alpha^{x}\beta^{y}\gamma^{z}\ldots$, where α, β, γ are selected prime residues, and 𝑥, 𝑦, 𝑧, . . . are indices of restricted range. For instance, all residues prime to 48 can be exhibited in the form $5^{x}\,7^{y}\,13^{z}$, where 𝑥=0, 1, 2, 3; 𝑦=0, 1; 𝑧=0, 1; the total number of distinct residues being 4·2·2=16=φ(48), as it should be.
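
The example for the modulus 48 can be confirmed by enumerating the products directly. This is a small sketch; `residues_from_bases` and `prime_residue_set` are illustrative names.

```python
from itertools import product
from math import gcd


def residues_from_bases(modulus, bases, ranges):
    """All residues base1^x * base2^y * ... with 0 <= exponent < range."""
    found = set()
    for exponents in product(*(range(r) for r in ranges)):
        value = 1
        for base, e in zip(bases, exponents):
            value = value * pow(base, e, modulus) % modulus
        found.add(value)
    return found


def prime_residue_set(modulus):
    """Least positive residues prime to the modulus."""
    return {r for r in range(1, modulus) if gcd(r, modulus) == 1}


generated = residues_from_bases(48, (5, 7, 13), (4, 2, 2))
print(len(generated), generated == prime_residue_set(48))
```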

30. Linear Congruences.—The congruence 𝑎′𝑥≡𝑏′ (mod 𝑚′) has no solution unless dv(𝑎′, 𝑚′) is a factor of 𝑏′. If this condition is satisfied, we may replace the given congruence by the equivalent one 𝑎𝑥≡𝑏 (mod 𝑚), where 𝑎 is prime to 𝑏 as well as to 𝑚. By residuation (§§ 24, 25) we can find integers ℎ, 𝑘 such that 𝑎ℎ−𝑚𝑘=1, and thence obtain 𝑥≡𝑏ℎ (mod 𝑚) as the complete solution of the given congruence. To the modulus 𝑚′ there are 𝑚′/𝑚 incongruent solutions. For example, 12𝑥≡30 (mod 21) reduces to 2𝑥≡5 (mod 7), whence 𝑥≡6 (mod 7), that is, 𝑥≡6, 13, 20 (mod 21). There is a theory of simultaneous linear congruences in any number of variables, first developed with precision by Smith. In any particular case, it is best to replace as many as possible of the given congruences by an equivalent set obtained by successively eliminating the variables 𝑥, 𝑦, 𝑧, . . . in order. An important problem is to find a number which has given residues with respect to a given set of moduli. When possible, the solution is of the form 𝑥≡𝑎 (mod 𝑚), where 𝑚 is the least common multiple of the moduli. Supposing that 𝑝 is a prime, and that we have a corresponding table of indices, the solution of 𝑎𝑥≡𝑏 (mod 𝑝) can be found by observing that ind 𝑥 ≡ ind 𝑏−ind 𝑎 (mod 𝑝−1).
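
The procedure for a single linear congruence can be sketched as below. Residuation is here assumed to be carried out by the extended Euclidean algorithm; `extended_gcd` and `solve_linear_congruence` are illustrative names.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def solve_linear_congruence(a, b, m):
    """All incongruent solutions x (mod m) of a*x = b (mod m), m > 0."""
    d, h, _ = extended_gcd(a % m, m)   # (a % m)*h + m*k = d
    if b % d:
        return []                      # d must be a factor of b
    reduced = m // d                   # the reduced modulus
    x0 = (b // d) * h % reduced
    return [x0 + k * reduced for k in range(d)]


print(solve_linear_congruence(12, 30, 21))  # the example of the text
```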

31. Quadratic Residues. Law of Reciprocity.—To an odd prime modulus 𝑝, the numbers 1, 4, 9, . . . (𝑝−1)² are congruent to ½(𝑝−1) residues only, because (𝑝−𝑥)²≡𝑥². Thus for 𝑝=5, we have 1, 4, 9, 16≡1, 4, 4, 1 respectively. There are therefore ½(𝑝−1) quadratic residues and ½(𝑝−1) quadratic non-residues prime to 𝑝; and there is a corresponding division of incongruent classes of integers with respect to 𝑝. The product of two residues or of two non-residues is a residue; that of a residue and a non-residue is a non-residue; and taking any primitive root as base the index of any number is even or odd according as the number is a residue or a non-residue. Gauss writes 𝑎R𝑝, 𝑎N𝑝 to denote that 𝑎 is a residue or non-residue of 𝑝 respectively.
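
The count of residues and the rule for products can be checked by squaring every residue. A minimal sketch, with the illustrative names `quadratic_residues` and `is_residue`:

```python
def quadratic_residues(p):
    """Distinct values of x^2 (mod p) for x = 1, ..., p-1, p an odd prime."""
    return sorted({x * x % p for x in range(1, p)})


def is_residue(a, p):
    """True when a is a quadratic residue of p (a prime to p)."""
    return a % p in quadratic_residues(p)


print(quadratic_residues(5))        # squares 1, 4, 9, 16 reduce to 1 and 4
# Residue times non-residue is a non-residue; two non-residues give a residue.
print(is_residue(2 * 3, 5), is_residue(4 * 2, 5))
```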

Given a table of indices, the solution of 𝑥²≡𝑎 (mod 𝑝), when possible, is found from 2 ind 𝑥≡ind 𝑎 (mod 𝑝−1), and the result may be written in the form 𝑥≡±𝑟 (mod 𝑝). But it is important to discuss the congruence 𝑥²≡𝑎 without assuming that we have a table of indices. It is sufficient to consider the case 𝑥²≡𝑞 (mod 𝑝), where 𝑞 is a positive prime less than 𝑝; and the question arises whether the quadratic character of 𝑞 with respect to 𝑝 can be deduced from that of 𝑝 with respect to 𝑞. The answer is contained in the following theorem, which is called the law of quadratic reciprocity (for real positive odd primes): if 𝑝, 𝑞 are each or one of them of the form 4𝑛+1, then 𝑝, 𝑞 are each of them a residue, or each a non-residue of the other; but if 𝑝, 𝑞 are each of the form 4𝑛+3, then according as 𝑝 is a residue or non-residue of 𝑞 we have 𝑞 a non-residue or a residue of 𝑝.

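Legendre introduced a symbol $\left(\frac{m}{q}\right)$ which denotes +1 or −1 according as 𝑚R𝑞 or 𝑚N𝑞 (𝑞 being a positive odd prime and 𝑚 any number prime to 𝑞); with its help we may express the law of reciprocity in the form

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right)=(-1)^{\frac{1}{4}(p-1)(q-1)}.$$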

This theorem was first stated by Legendre, who only partly proved it; the first complete proof, by induction, was published by Gauss, who also discovered five (or six) other more or less independent proofs of it. Many others have since been invented.

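There are two supplementary theorems relating to −1 and 2 respectively, which may be expressed in the form

$$\left(\frac{-1}{p}\right)=(-1)^{\frac{1}{2}(p-1)},\qquad \left(\frac{2}{p}\right)=(-1)^{\frac{1}{8}(p^{2}-1)},$$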

where 𝑝 is any positive odd prime.

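It follows from the definition that

$$\left(\frac{mn}{q}\right)=\left(\frac{m}{q}\right)\left(\frac{n}{q}\right),$$

and that $\left(\frac{m}{q}\right)=\left(\frac{m'}{q}\right)$ if $m\equiv m' \pmod{q}$. As a simple application of the law of reciprocity, let it be required to find the quadratic character of 11 with respect to 1907. We have

$$\left(\frac{11}{1907}\right)=-\left(\frac{1907}{11}\right)=-\left(\frac{4}{11}\right)=-1,$$

because 11 and 1907 are both of the form 4𝑛+3, while 1907≡4 (mod 11) and 4R11. Hence 11N1907.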

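Legendre’s symbol was extended by Jacobi in the following manner. Let P be any positive odd number, and let 𝑝, 𝑝′, 𝑝″, &c. be its (equal or unequal) prime factors, so that P=𝑝𝑝′𝑝″. . . . Then if Q is any number prime to P, we have a generalized symbol defined by

$$\left(\frac{Q}{P}\right)=\left(\frac{Q}{p}\right)\left(\frac{Q}{p'}\right)\left(\frac{Q}{p''}\right)\ldots$$

This symbol obeys the law that, if Q is odd and positive,

$$\left(\frac{P}{Q}\right)\left(\frac{Q}{P}\right)=(-1)^{\frac{1}{4}(P-1)(Q-1)},$$

with the supplementary laws

$$\left(\frac{-1}{P}\right)=(-1)^{\frac{1}{2}(P-1)},\qquad \left(\frac{2}{P}\right)=(-1)^{\frac{1}{8}(P^{2}-1)}.$$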

It is found convenient to add the conventions that

when Q and P are both odd; and that the value of the symbol is 0 when P, Q are not co-primes.
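
Because the generalized symbol obeys the reciprocity law and its two supplements, its value can be computed without factorizing P, much as a greatest common divisor is found by residuation. The following sketch shows the customary algorithm; the name `jacobi` is illustrative, and the value 0 for arguments that are not co-prime follows the convention just mentioned.

```python
def jacobi(q, p):
    """Generalized (Jacobi) symbol (q/p) for any integer q and odd p > 0."""
    if p <= 0 or p % 2 == 0:
        raise ValueError("p must be a positive odd number")
    q %= p
    sign = 1
    while q:
        # Remove factors of 2, using (2/p) = -1 when p = 3 or 5 (mod 8).
        while q % 2 == 0:
            q //= 2
            if p % 8 in (3, 5):
                sign = -sign
        # Reciprocity: the sign changes only when both are of the form 4n+3.
        q, p = p, q
        if q % 4 == 3 and p % 4 == 3:
            sign = -sign
        q %= p
    return sign if p == 1 else 0    # 0 when the arguments are not co-prime


# (2/15) = +1 although 2 is not a square to the modulus 15: for a composite
# P the value +1 is necessary but not sufficient for solubility.
print(jacobi(2, 15), any(x * x % 15 == 2 for x in range(15)))
```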

In order that the congruence 𝑥²≡𝑎 (mod 𝑚) may have a solution it is necessary and sufficient that 𝑎 be a residue of each distinct prime factor of 𝑚. If these conditions are all satisfied, and $m=2^{\kappa}p^{\lambda}q^{\mu}\ldots$, where 𝑝, 𝑞, &c., are the distinct odd prime factors of 𝑚, being 𝑡 in all, the number of incongruent solutions of the given congruence is $2^{t}$, $2^{t+1}$, or $2^{t+2}$, according as $\kappa<2$, $\kappa=2$, or $\kappa>2$ respectively. The actual solutions are best found by a process of exhaustion. It should be observed that $\left(\frac{a}{m}\right)=1$ is a necessary but not a sufficient condition for the possibility of the congruence.

32. Quadratic forms.—It will be observed that the solution of the linear congruence 𝑎𝑥≡𝑏 (mod 𝑚) leads to all the representations of 𝑏 in the form 𝑎𝑥+𝑚𝑦, where 𝑥, 𝑦 are integers. Many of the earliest researches in the theory of numbers deal with particular cases of the problem: given four numbers 𝑚, 𝑎, 𝑏, 𝑐, it is required to find all the integers 𝑥, 𝑦 (if there be any) which satisfy the equation 𝑎𝑥²+𝑏𝑥𝑦+𝑐𝑦²=𝑚. Fermat, for instance, discovered that every positive prime of the form 4𝑛+1 is uniquely expressible as the sum of two squares. There is a corresponding arithmetical theory for forms of any degree and any number of variables; only those of linear forms and binary quadratics are in any sense complete, as the difficulty of the problem increases very rapidly with the increase of the degree of the form considered or of the number of variables contained in it.

The form 𝑎𝑥²+𝑏𝑥𝑦+𝑐𝑦² will be denoted by (𝑎, 𝑏, 𝑐)(𝑥, 𝑦)² or more simply by (𝑎, 𝑏, 𝑐) when there is no need of specifying the variables. If 𝑘 is the greatest common factor of 𝑎, 𝑏, 𝑐, we may write (𝑎, 𝑏, 𝑐)=𝑘(𝑎′, 𝑏′, 𝑐′) where (𝑎′, 𝑏′, 𝑐′) is a primitive form, that is, one for which dv (𝑎′, 𝑏′, 𝑐′)=1. The other form is then said to be derived from (𝑎′, 𝑏′, 𝑐′) and to have a divisor 𝑘. For the present we shall concern ourselves only with primitive forms. Writing D=𝑏²−4𝑎𝑐, the invariant D is called the determinant of (𝑎, 𝑏, 𝑐), and there is a first classification of forms into definite forms for which D is negative, and indefinite forms for which D is positive. The case D=0 or a positive square is rejected, because in that case the form breaks up into the product of two linear factors. It will be observed that D≡0, 1 (mod 4) according as 𝑏 is even or odd; and that if 𝑘² is any odd square factor of D there will be forms of determinant D and divisor 𝑘.

If we write $x'=\alpha x+\beta y$, $y'=\gamma x+\delta y$, we have identically

$$(a, b, c)(x', y')^{2}=(a', b', c')(x, y)^{2},$$

where

$$a'=a\alpha^{2}+b\alpha\gamma+c\gamma^{2},$$

$$b'=2a\alpha\beta+b(\alpha\delta+\beta\gamma)+2c\gamma\delta,$$

$$c'=a\beta^{2}+b\beta\delta+c\delta^{2}.$$

Hence also

$$D'=b'^{2}-4a'c'=(\alpha\delta-\beta\gamma)^{2}(b^{2}-4ac)=(\alpha\delta-\beta\gamma)^{2}D.$$

Supposing that α, β, γ, δ are integers such that αδ−βγ=𝑛, a number different from zero, (𝑎, 𝑏, 𝑐) is said to be transformed into (𝑎′, 𝑏′, 𝑐′) by the substitution $\begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix}$ of the 𝑛th order. If 𝑛²=1, the two forms are said to be equivalent, and the equivalence is said to be proper or improper according as 𝑛=1 or 𝑛=−1. In the case of equivalence, not only are 𝑥′, 𝑦′ integers whenever 𝑥, 𝑦 are so, but conversely; hence every number representable by (𝑎, 𝑏, 𝑐) is representable by (𝑎′, 𝑏′, 𝑐′) and conversely. For the present we shall deal with proper equivalence only and write 𝑓~𝑓′ to indicate that the forms 𝑓, 𝑓′ are properly equivalent. Equivalent forms have the same divisor. A complete set of equivalent forms is said to form a class; classes of the same divisor are said to form an order, and of these the most important is the principal order, which consists of the primitive classes. It is a fundamental theorem that for a given determinant the number of classes is finite; this is proved by showing that every class must contain one at least of a certain finite number of so-called reduced forms, which can be found by definite rules of calculation.
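
The effect of a substitution on a form, and on its determinant, can be illustrated numerically. The sketch below assumes the usual convention that the substitution with rows (α, β), (γ, δ) replaces the variables of (𝑎, 𝑏, 𝑐) by α𝑥+β𝑦 and γ𝑥+δ𝑦; the names `transform` and `determinant` are illustrative.

```python
def transform(form, substitution):
    """Apply the substitution ((alpha, beta), (gamma, delta)) to (a, b, c)."""
    a, b, c = form
    (al, be), (ga, de) = substitution
    return (
        a * al * al + b * al * ga + c * ga * ga,
        2 * a * al * be + b * (al * de + be * ga) + 2 * c * ga * de,
        a * be * be + b * be * de + c * de * de,
    )


def determinant(form):
    """D = b^2 - 4ac in the notation of the text."""
    a, b, c = form
    return b * b - 4 * a * c


f = (3, 2, 12)                       # determinant -140
g = transform(f, ((2, 1), (5, 3)))   # alpha*delta - beta*gamma = 1
print(g, determinant(f), determinant(g))
```

A substitution of order 𝑛 multiplies the determinant by 𝑛², so only those with 𝑛²=1 preserve it.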

33. Method of Reduction.—This differs according as D is positive or negative, and will require some preliminary lemmas. Suppose that any complex quantity 𝑧=𝑥+𝑦𝑖 is represented in the usual way by a point (𝑥, 𝑦) referred to rectangular axes. Then by plotting off all the points corresponding to (α𝑧+β) / (γ𝑧 + δ), we obtain a complete set of properly equivalent points. These all lie on the same side of the axis of 𝑥, and there is precisely one of them and no more which satisfies the conditions: (i.) that it is not outside the area which is bounded by the lines 2𝑥=±1; (ii.) that it is not inside the circle 𝑥²+𝑦²=1; (iii.) that it is not on the line 2𝑥=1, or on the arcs of the circle 𝑥²+𝑦²=1 intercepted by 2𝑥=1 and 𝑥=0. This point will be called the reduced point equivalent to 𝑧. In the positive half-plane (𝑦>0) the aggregate of all reduced points occupies the interior and half the boundary of an area which will be called the fundamental triangle, because the areas equivalent to it, and finite, are all triangles bounded by circular arcs, and having angles ⅓π, ⅓π, 0, and the fundamental triangle may be considered as a special case when one vertex goes to infinity. The aggregate of equivalent triangles forms a kind of mosaic which fills up the whole of the positive half-plane. It will be convenient to denote the fundamental triangle (with its half-boundary, for which 𝑥<0) by ∇; for a reason which will appear later, the set of equivalent triangles will be said to make up the modular dissection of the positive half-plane.

Now let 𝑓′=(𝑎′, 𝑏′, 𝑐′) be any definite form with 𝑎′ positive and determinant −Δ. The root of 𝑎′𝑧²+𝑏′𝑧+𝑐′=0 which is represented by a point in the positive half-plane is

$$z=\frac{-b'+i\sqrt{\Delta}}{2a'},$$

and this is a reduced point if either



Cases (ii.) and (iii.) only occur when the representative point is on the boundary of ∇. A form whose representative point is reduced is said to be a reduced form. It follows from the geometrical theory that every form is equivalent to a reduced form, and that there are as many distinct classes of positive forms of determinant —∆ as there are reduced forms. The total number of reduced forms is limited, because in case (i.) we have , so that , while ; in case (ii.) , or else ; in case (iii.) , or else . With the help of these inequalities a complete set of reduced forms can be found by trial, and the number of classes determined. The latter cannot exceed 1/3∆; it is in general much less.
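
The search for reduced forms "by trial" is easily mechanized. The sketch below assumes the customary statement of the reduction conditions for a positive form (𝑎, 𝑏, 𝑐), namely −𝑎<𝑏≤𝑎<𝑐, or 0≤𝑏≤𝑎=𝑐, which corresponds to the fundamental triangle with the half-boundary described above; the name `reduced_primitive_forms` is illustrative.

```python
from math import gcd


def reduced_primitive_forms(delta):
    """Reduced positive primitive forms (a, b, c) with b^2 - 4ac = -delta."""
    forms = []
    a = 1
    while 3 * a * a <= delta:           # |b| <= a <= c forces 3a^2 <= delta
        for b in range(-a + 1, a + 1):  # -a < b <= a
            numerator = b * b + delta   # must equal 4ac
            if numerator % (4 * a):
                continue
            c = numerator // (4 * a)
            if c < a or (c == a and b < 0):
                continue                # outside the reduced region
            if gcd(gcd(a, abs(b)), c) == 1:
                forms.append((a, b, c))
        a += 1
    return forms


print(reduced_primitive_forms(140))
```

For Δ=140 the search returns six forms, so there are six primitive classes for that determinant.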

With an indefinite form (𝑎, 𝑏, 𝑐) we may associate the representative circle

𝑎(𝑥2+𝑦2)+𝑏𝑥+𝑐=0,

which cuts the axis of 𝑥 in two real points. The form is said to be reduced if this circle cuts ∇; the condition for this is , which can be expressed in the form , and it is hence clear that the absolute values of 𝑎, 𝑏, and therefore of 𝑐, are limited. As before, there are a limited number of reduced forms, but they are not all non-equivalent. In fact they arrange themselves, according to a law which is not very difficult to discover, in cycles or periods, each of which is associated with a particular class. The main result is the same as before: that the number of classes is finite, and that for each class we can find a representative form by a finite process of calculation.

34. Problem of Representation.—It is required to find out whether a given number 𝑚′ can be represented by the given form (𝑎′, 𝑏′, 𝑐′). One condition is clearly that the divisor of the form must be a factor of 𝑚′. Suppose this is the case; and let 𝑚, (𝑎, 𝑏, 𝑐) be the quotients of 𝑚′ and (𝑎′, 𝑏′, 𝑐′) by the divisor in question. Then we have now to discover whether 𝑚 can be represented by the primitive form (𝑎, 𝑏, 𝑐). First of all we will consider proper representations

$$m=a\alpha^{2}+b\alpha\gamma+c\gamma^{2},$$

where α, γ are co-primes. Determine integers β, δ such that αδ−βγ=1, and apply to (𝑎, 𝑏, 𝑐) the substitution $\begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix}$; the new form will be (𝑚, 𝑛, 𝑙), where

$$n=2a\alpha\beta+b(\alpha\delta+\beta\gamma)+2c\gamma\delta,\qquad l=a\beta^{2}+b\beta\delta+c\delta^{2}.$$

Consequently $n^{2}-4ml=D$, and D must be a quadratic residue of 𝑚. Unless this condition is satisfied, there is no proper representation of 𝑚 by any form of determinant D. Suppose, however, that $n^{2}\equiv D \pmod{4m}$ is soluble and that $n_1$, $n_2$, &c. are its roots. Taking any one of these, say $n_i$, we can find out whether $(m, n_i, l_i)$ and (𝑎, 𝑏, 𝑐) are equivalent; if they are, there is a substitution $\begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix}$ which converts the latter into the former, and then $m=a\alpha^{2}+b\alpha\gamma+c\gamma^{2}$. As to derived representations, if $m=ax^{2}+bxy+cy^{2}$ with dv(𝑥, 𝑦)=μ, then 𝑚 must have the square factor μ², and $m/\mu^{2}=a(x/\mu)^{2}+b(x/\mu)(y/\mu)+c(y/\mu)^{2}$; hence everything may be made to depend on proper representation by primitive forms.

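35. Automorphs. The Pellian Equation.—A primitive form (𝑎, 𝑏, 𝑐) is, by definition, equivalent to itself; but it may be so in more ways than one. In order that (𝑎, 𝑏, 𝑐) may be transformed into itself by the substitution $\begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix}$, it is necessary and sufficient that

$$\begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix}=\begin{pmatrix}\tfrac{1}{2}(t-bu) & -cu\\ au & \tfrac{1}{2}(t+bu)\end{pmatrix},$$

where (𝑡, 𝑢) is an integral solution of

$$t^{2}-Du^{2}=4.$$

If D is negative and −D>4, the only solutions are 𝑡=±2, 𝑢=0; D=−3 gives (𝑡, 𝑢)=(±2, 0), (±1, ±1); D=−4 gives (𝑡, 𝑢)=(±2, 0), (0, ±1). On the other hand, if D>0 the number of solutions is infinite and if (𝑡₁, 𝑢₁) is the solution for which 𝑡, 𝑢 have their least positive values, all the other positive solutions may be found from

$$\tfrac{1}{2}(t_n+u_n\sqrt{D})=\left\{\tfrac{1}{2}(t_1+u_1\sqrt{D})\right\}^{n}\qquad(n=2, 3, 4, \ldots).$$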

The substitutions by which (𝑎, 𝑏, 𝑐) is transformed into itself are called its automorphs. In the case when we have , , , and (T, U) any solution of

.

This is usually called the Pellian equation, though it should properly be associated with Fermat, who first perceived its importance. The minimum solution can be found by converting into a periodic continued fraction.
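
The continued-fraction method for the minimum solution can be sketched as follows. The Pellian equation is assumed here in its usual shape T²−NU²=1, with N a positive non-square integer; the letter N and the function name `pell_minimal_solution` are introduced for illustration only.

```python
from math import isqrt


def pell_minimal_solution(n):
    """Least positive (T, U) with T^2 - n*U^2 = 1, n a positive non-square."""
    a0 = isqrt(n)
    if a0 * a0 == n:
        raise ValueError("n must not be a perfect square")
    # Complete quotients (sqrt(n) + m) / d of the periodic continued fraction.
    m, d, a = 0, 1, a0
    # Successive convergents h/k of sqrt(n).
    h_prev, h = 1, a0
    k_prev, k = 0, 1
    while h * h - n * k * k != 1:
        m = d * a - m
        d = (n - m * m) // d
        a = (a0 + m) // d
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
    return h, k


print(pell_minimal_solution(7))
print(pell_minimal_solution(61))
```

The loop stops at the first convergent whose numerator and denominator satisfy the equation; since every positive solution occurs among the convergents, this is the minimum one.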

The form (𝑎, 𝑏, 𝑐) may be improperly equivalent to itself; in this case all its improper automorphs can be expressed in the form

where . In particular, if the form (𝑎, 𝑏, 𝑐) is improperly equivalent to itself. A form improperly equivalent to itself is said to be ambiguous.

36. Characters of a form or class. Genera.—Let (𝑎, 𝑏, 𝑐) be any primitive form; we have seen above (§ 32) that if α, β, γ, δ are any integers

$$4(a\alpha^{2}+b\alpha\gamma+c\gamma^{2})(a\beta^{2}+b\beta\delta+c\delta^{2})=b'^{2}-(\alpha\delta-\beta\gamma)^{2}D,$$

where $b'=2a\alpha\beta+b(\alpha\delta+\beta\gamma)+2c\gamma\delta$. Now the expressions in brackets on the left hand may denote any two numbers 𝑚, 𝑛 representable by the form (𝑎, 𝑏, 𝑐); the formula shows that 4𝑚𝑛 is a residue of D, and hence 𝑚𝑛 is a residue of every odd prime factor of D, and if 𝑝 is any such factor the symbols $\left(\frac{m}{p}\right)$ and $\left(\frac{n}{p}\right)$ will have the same value. Putting $f=(a, b, c)$, this common value is denoted by $\left(\frac{f}{p}\right)$ and called a quadratic character (or simply character) of 𝑓 with respect to 𝑝. Since 𝑎 is representable by 𝑓, the value is the same as $\left(\frac{a}{p}\right)$. For example, if D = −140, the scheme of characters for the six reduced primitive forms, and therefore for the classes they represent, is

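$$\begin{array}{l|cc}
 & \left(\frac{f}{5}\right) & \left(\frac{f}{7}\right)\\ \hline
(1, 0, 35),\ (4, \pm 2, 9) & + & +\\
(5, 0, 7),\ (3, \pm 2, 12) & - & -
\end{array}$$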

In certain cases there are supplementary characters of the type and , and the characters are discriminated according as an odd or even power of 𝑝 is contained in D; but in every case there are certain combinations of characters (in number one-half of all possible combinations) which form the total characters of actually existing classes. Classes which have the same total character are said to belong to the same genus. Each genus of the same order contains the same number of classes.

For any determinant D we have a principal primitive class for which all the characters are +; this is represented by the principal form (1, 0, −𝑛) or (1, 1, −𝑛) according as D is of the form 4𝑛 or 4𝑛+1. The corresponding genus is called the principal genus. Thus, when D=−140, it appears from the table above that in the primitive order there are two genera, each containing three classes; and the non-existent total characters are +, −; and −, +.

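37. Composition.—Considering X, Y as given lineo-linear functions of (𝑥, 𝑦), (𝑥′, 𝑦′) defined by the equations

$$X=p_0xx'+p_1xy'+p_2yx'+p_3yy',\qquad Y=q_0xx'+q_1xy'+q_2yx'+q_3yy',$$

we may have identically, in 𝑥, 𝑦, 𝑥′, 𝑦′,

$$AX^{2}+BXY+CY^{2}=(ax^{2}+bxy+cy^{2})(a'x'^{2}+b'x'y'+c'y'^{2}),$$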

and, this being so, the form (A, B, C) is said to be compounded of the two forms (𝑎, 𝑏, 𝑐), (𝑎′, 𝑏′, 𝑐′), the order of composition being indifferent. In order that two forms may admit of composition into a third, it is necessary and sufficient that their determinants be in the ratio of two squares. The most important case is that of two primitive forms φ, χ of the same determinant; these can be compounded into a form denoted by φχ or χφ which is also primitive and of the same determinant as φ or χ. If A, B, C are the classes to which φ, χ, φχ respectively belong, then any form of A compounded with any form of B gives rise to a form belonging to C. For this reason we write C=AB=BA, and speak of the multiplication or composition of classes. The principal class is usually denoted by 1, because when compounded with any other class A it gives this same class A.

The total number of primitive classes being finite, ℎ, say, the series A, A², A³, &c., must be recurring, and there will be a least exponent 𝑒 such that $A^{e}=1$. This exponent is a factor of ℎ, so that every class satisfies $A^{h}=1$. Composition is associative as well as commutative, that is to say, (AB)C=A(BC); hence the symbols $A_1, A_2, \ldots, A_h$ for the ℎ different classes define an Abelian group (see Groups) of order ℎ, which is representable by one or more base-classes $B_1, B_2, \ldots, B_i$ in such a way that each class A is enumerated once and only once by putting

$$A=B_1^{x}B_2^{y}\ldots B_i^{z}\qquad(x\le m,\ y\le n,\ \ldots)$$

with $B_1^{m}=B_2^{n}=\ldots=1$, and $mn\ldots=h$. Moreover, the bases may be so chosen that 𝑚 is a multiple of 𝑛, 𝑛 of the next corresponding index, and so on. The same thing may be said with regard to the symbols for the classes contained in the principal genus, because two forms of that genus compound into one of the same kind. If this latter group is cyclical, that is, if all the classes of the principal genus can be represented in the form $1, A, A^{2}, \ldots, A^{\nu-1}$, the determinant 𝖣 is said to be regular; if not, the determinant is irregular. It has been proved that certain specified classes of determinants are always irregular; but no complete criterion has been found, other than working out the whole set of primitive classes, and determining the group of the principal genus, for deciding whether a given determinant is irregular or not.

If 𝖠, 𝖡 are any two classes, the total character of 𝖠𝖡 is found by compounding the characters of 𝖠 and 𝖡. In particular, the class 𝖠², which is called the duplicate of 𝖠, always belongs to the principal genus. Gauss proved, conversely, that every class in the principal genus may be expressed as the duplicate of a class. An ambiguous class satisfies 𝖠²=1, that is, its duplicate is the principal class; and the converse of this is true. Hence if 𝖡₁, 𝖡₂, . . . 𝖡ᵢ are the base-classes for the whole composition-group, and $A=B_1^{x}B_2^{y}\ldots B_i^{z}$ (as above), $A^{2}=1$ if 2𝑥=0 or 𝑚, 2𝑦=0 or 𝑛, &c.; hence the number of ambiguous classes is $2^{i}$. As an example, when 𝖣=−1460, there are four ambiguous classes, represented by

(1, 0, 365), (2, 2, 183), (5, 0, 73), (10, 10, 39);


hence the composition-group must be dibasic, and in fact, if we put 𝖡₁, 𝖡₂ for the classes represented by (11, 6, 34) and (2, 2, 183), we have 𝖡₁¹⁰=𝖡₂²=1 and the 20 primitive classes are given by $B_1^{x}B_2^{y}$ (𝑥≤10, 𝑦≤2). In this case the determinant is regular and the classes in the principal genus are 1, 𝖡₁², 𝖡₁⁴, 𝖡₁⁶, 𝖡₁⁸.

38. On account of its historical interest, we may briefly consider the form 𝑥²+𝑦², for which 𝖣=−4. If 𝑝 is an odd prime of the form 4𝑛+1, the congruence 𝑚²≡−4 (mod 4𝑝) is soluble (§ 31); let one of its roots be 𝑚, and 𝑚²+4=4𝑙𝑝. Then (𝑝, 𝑚, 𝑙) is of determinant −4, and, since there is only one primitive class for this determinant, we must have (𝑝, 𝑚, 𝑙)~(1, 0, 1). By known rules we can actually find a substitution $\begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix}$ which converts the first form into the second; this being so, $\begin{pmatrix}\delta & -\beta\\ -\gamma & \alpha\end{pmatrix}$ will transform the second into the first, and we shall have 𝑝=γ²+δ², a representation of 𝑝 as the sum of two squares. This is unique, except that we may put 𝑝=(±γ)²+(±δ)². We also have 2=1²+1², while no prime 4𝑛+3 admits of such a representation.

The theory of composition for this determinant is expressed by the identity (𝑥²+𝑦²)(𝑥′²+𝑦′²)=(𝑥𝑥′±𝑦𝑦′)²+(𝑥𝑦′∓𝑦𝑥′)²; and by repeated application of this, and the previous theorem, we can show that if $N=2^{a}p^{b}q^{c}\ldots$, where 𝑝, 𝑞, . . . are odd primes of the form 4𝑛+1, we can find solutions of 𝖭=𝑥²+𝑦², and indeed distinct solutions. For instance 65=1²+8²=4²+7², and conversely two distinct representations 𝖭=𝑥²+𝑦²=𝑢²+𝑣² lead to the conclusion that 𝖭 is composite. This is a simple example of the application of the theory of forms to the difficult problem of deciding whether a given large number is prime or composite; an application first indicated by Gauss, though, in the present simple case, probably known to Fermat.
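
The representations of a number as a sum of two squares, and the composition identity, can be exhibited by direct search. In this sketch `two_square_representations` is an illustrative name, and representations differing only in sign or order are counted once.

```python
from math import isqrt


def two_square_representations(n):
    """Pairs (x, y) with 0 <= x <= y and x^2 + y^2 = n."""
    reps = []
    for x in range(isqrt(n // 2) + 1):   # x <= y means x^2 <= n/2
        rest = n - x * x
        y = isqrt(rest)
        if y * y == rest:
            reps.append((x, y))
    return reps


print(two_square_representations(65))   # two distinct representations
print(two_square_representations(13))   # a prime 4n+1: one representation
# Composition: 5 = 1^2 + 2^2 and 13 = 2^2 + 3^2 give 65 in two ways.
x, y, u, v = 1, 2, 2, 3
print((x * u + y * v, abs(x * v - y * u)), (abs(x * u - y * v), x * v + y * u))
```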

39. Number of classes. Class-number Relations.—It appears from Gauss’s posthumous papers that he solved the very difficult problem of finding a formula for ℎ(𝖣), the number of properly primitive classes for the determinant 𝖣. The first published solution, however, was that of P. G. L. Dirichlet; it depends on the consideration of series of the form $\sum(ax^{2}+bxy+cy^{2})^{-1-s}$ where 𝑠 is a positive quantity, ultimately made very small. L. Kronecker has shown the connexion of Dirichlet’s results with the theory of elliptic functions, and obtained more comprehensive formulae by taking (𝑎, 𝑏, 𝑐) as the standard type of a quadratic form, whereas Gauss, Dirichlet, and most of their successors, took (𝑎, 2𝑏, 𝑐) as the standard, calling (𝑏²−𝑎𝑐) its determinant. As a sample of the kind of formulae that are obtained, let 𝑝 be a prime of the form 4𝑛+3; then

,


where in the first formula Σα means the sum of all quadratic residues of 𝑝 contained in the series 1, 2, 3, . . . ½(𝑝−1) and Σβ is the sum of the remaining non-residues; while in the second formula (𝑡, 𝑢) is the least positive solution of 𝑡²−𝑝𝑢²=1, and the product extends to all values of 𝑏 in the set 1, 3, 5, . . . (4𝑝−1) of which 𝑝 is a non-residue. The remarkable fact will be noticed that the second formula gives a solution of the Pellian equation in a trigonometrical form.

Kronecker was the first to discover, in connexion with the complex multiplication of elliptic functions, the simplest instances of a very curious group of arithmetical formulae involving sums of class-numbers and other arithmetical functions; the theory of these relations has been greatly extended by A. Hurwitz. The simplest of all these theorems may be stated as follows. Let 𝖧(Δ) represent the number of classes for the determinant −Δ, with the convention that ½ and not 1 is to be reckoned for each class containing a reduced form of the type (𝑎, 0, 𝑎) and ⅓ for each class containing a reduced form (𝑎, 𝑎, 𝑎); then if 𝑛 is any positive integer,


where Φ(𝑛) means the sum of the divisors of 𝑛, and Ψ(𝑛) means the excess of the sum of those divisors of 𝑛 which are greater than √𝑛 over the sum of those divisors which are less than √𝑛. The formula is obtained by calculating in two different ways the number of reduced values of 𝑧 which satisfy the modular equation J(𝑛𝑧)=J(𝑧),

where J(𝒛) is the absolute invariant which, for the elliptic function 𝔭(𝑢; 𝑔₂, 𝑔₃) is 𝑔₂³÷(𝑔₂³−27𝑔₃²), and 𝑧 is the ratio of any two primitive periods taken so that the real part of 𝑖𝑧 is negative (see below, § 68). It should be added that there is a series of scattered papers by J. Liouville, which implicitly contain Kronecker’s class-number relations, obtained by a purely arithmetical process without any use of transcendents.

40. Bilinear Forms.—A bilinear form means an expression of the type $\sum a_{ik}x_iy_k$ (𝑖=1, 2, . . . 𝑚; 𝑘=1, 2, . . . 𝑛); the most important case is when 𝑚=𝑛, and only this will be considered here. The invariants of a form are its determinant $[a_{nn}]$ and the elementary factors thereof. Two bilinear forms are equivalent when each can be transformed into the other by linear integral substitutions 𝑥′=Σα𝑥, 𝑦′=Σβ𝑦. Every bilinear form is equivalent to a reduced form $\sum_{i=1}^{r}e_ix_iy_i$, and 𝑟=𝑛, unless $[a_{nn}]=0$. In order that two forms may be equivalent it is necessary and sufficient that their invariants should be the same. Moreover, if 𝑎∼𝑏 and 𝑐∼𝑑, and if the invariants of the forms 𝑎+λ𝑐, 𝑏+λ𝑑 are the same for all values of λ, we shall have 𝑎+λ𝑐∼𝑏+λ𝑑, and the transformation of one form to the other may be effected by a substitution which does not involve λ. The theory of bilinear forms practically includes that of quadratic forms, if we suppose $x_i$, $y_i$ to be cogredient variables. Kronecker has developed the case when 𝑛=2, and deduced various class-relations for quadratic forms in a manner resembling that of Liouville. So far as the bilinear forms are concerned, the main result is that the number of classes for the positive determinant 𝑎₁₁𝑎₂₂−𝑎₁₂𝑎₂₁=Δ is 12{Φ(Δ)+Ψ(Δ)}+2ε, where ε is 1 or 0 according as Δ is or is not a square, and the symbols Φ, Ψ have the meaning previously assigned to them (§ 39).

41. Higher Quadratic Forms.—The algebraic theory of quadratics is so complete that considerable advance has been made in the much more complicated arithmetical theory. Among the most important results relating to the general case of 𝑛 variables are the proof that the class-number is finite; the enumeration of the arithmetical invariants of a form; classification according to orders and genera, and proof that genera with specified characters exist; also the determination of all the rational transformations of a given form into itself. In connexion with a definite form there is the important conception of its weight; this is defined as the reciprocal of the number of its proper automorphs. Equivalent forms are of the same weight; this is defined to be the weight of their class. The weight of a genus or order is the sum of the weights of the classes contained in it; and expressions for the weight of a given genus have actually been obtained. For binary forms the sum of the weights of all the genera coincides with the expression denoted by H(Δ) in § 39. The complete discussion of a form requires the consideration of (𝑛−2) associated quadratics; one of these is the contravariant of the given form, each of the others contains more than 𝑛 variables. For certain quaternary and senary classes there are formulae analogous to the class-relations for binary forms referred to in § 39 (see Smith, Proc. R.S. xvi., or Collected Papers, i. 510).

Among the most interesting special applications of the theory are certain propositions relating to the representation of numbers as the sum of squares. In order that a number may be expressible as the sum of two squares it is necessary and sufficient for it to be of the form 𝖯𝖰², where 𝖯 has no square factor and no prime factor of the form 4𝑛+3. A number is expressible as the sum of three squares if, and only if, it is of the form 𝑚²𝑛 with 𝑛≡1, ±2, ±3 (mod 8); when 𝑚=1 and 𝑛≡3 (mod 8), all the squares are odd, and hence follows Fermat’s theorem that every number can be expressed as the sum of three triangular numbers (one or two of which may be 0). Another famous theorem of Fermat’s is that every number can be expressed as the sum of four squares; this was first proved by Lagrange, and Jacobi afterwards proved that the number of solutions of 𝑛=𝑥²+𝑦²+𝑧²+𝑡² is 8Φ(𝑛), if 𝑛 is odd, while if 𝑛 is even it is 24 times the sum of the odd factors of 𝑛. Explicit and finite, though more complicated, formulae have been obtained for the number of representations of 𝑛 as the sum of five, six, seven and eight squares respectively. As an example of the outstanding difficulties of this part of the subject may be mentioned the problem of finding all the integral (not merely rational) automorphs of a given form 𝑓. When 𝑓 is ternary, C. Hermite has shown that the solution depends on finding all the integral solutions of 𝖥(𝑥, 𝑦, 𝑧)+𝑡²=1, where 𝖥 is the contravariant of 𝑓.
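
Jacobi’s count of the solutions of 𝑛=𝑥²+𝑦²+𝑧²+𝑡² can be compared with a direct enumeration for small 𝑛. The sketch below counts solutions with regard to both order and sign, which is the sense in which the formula holds; the function names are illustrative.

```python
from math import isqrt


def count_four_square_solutions(n):
    """Number of integer solutions (x, y, z, t) of x^2+y^2+z^2+t^2 = n."""
    r = isqrt(n)
    count = 0
    for x in range(-r, r + 1):
        for y in range(-r, r + 1):
            for z in range(-r, r + 1):
                rest = n - x * x - y * y - z * z
                if rest < 0:
                    continue
                t = isqrt(rest)
                if t * t == rest:
                    count += 1 if t == 0 else 2   # t and -t
    return count


def jacobi_four_square_formula(n):
    """8 * (sum of divisors) for odd n; 24 * (sum of odd divisors) for even n."""
    divisor_sum = sum(d for d in range(1, n + 1) if n % d == 0)
    odd_divisor_sum = sum(d for d in range(1, n + 1, 2) if n % d == 0)
    return 8 * divisor_sum if n % 2 else 24 * odd_divisor_sum


print([(n, count_four_square_solutions(n)) for n in range(1, 9)])
```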

Thanks to the researches of Gauss, Eisenstein, Smith, Hermite and others, the theory of ternary quadratics is much less incomplete than that of quadratics with four or more variables. Thus methods of reduction have been found both for definite and for indefinite forms; so that it would be possible to draw up a table of representative forms, if the result were worth the labour. One specially important theorem is the solution of 𝑎𝑥²+𝑏𝑦²+𝑐𝑧²=0; this is always possible if −𝑏𝑐, −𝑐𝑎, −𝑎𝑏 are quadratic residues of 𝑎, 𝑏, 𝑐 respectively, and a formula can then be obtained which furnishes all the solutions.

42. Complex Numbers.—One of Gauss’s most important and far-reaching contributions to arithmetic was his introduction of complex integers 𝑎+𝑏𝑖, where 𝑎, 𝑏 are ordinary integers, and, as usual, 𝑖²=−1. In this theory there are four units ±1, ±𝑖; the numbers $i^{h}(a+bi)$ are said to be associated; 𝑎−𝑏𝑖 is the conjugate of 𝑎+𝑏𝑖 and we write 𝖭(𝑎±𝑏𝑖)=𝑎²+𝑏², the norm of 𝑎+𝑏𝑖, its conjugate, and associates. The most fundamental proposition in the theory is that the process of residuation (§ 24) is applicable; namely, if 𝑚, 𝑛 are any two complex integers and 𝖭(𝑚)>𝖭(𝑛), we can always find integers 𝑞, 𝑟 such that 𝑚=𝑞𝑛+𝑟 with 𝖭(𝑟)⩽½𝖭(𝑛). This may be proved analytically, but is obvious if we mark complex integers by points in a plane. Hence immediately follow propositions about resolutions into prime factors, greatest common measure, &c., analogous to those in the ordinary theory; it will only be necessary to indicate special points of difference.
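
The division step on which residuation rests can be sketched with exact integer arithmetic. Complex integers are assumed here to be stored as pairs (real part, imaginary part), and the quotient is taken as the lattice point nearest to 𝑚/𝑛; the names `norm`, `nearest_quotient` and `gaussian_divmod` are illustrative.

```python
def norm(z):
    """Norm a^2 + b^2 of the complex integer z = (a, b)."""
    return z[0] * z[0] + z[1] * z[1]


def nearest_quotient(x, y):
    """Integer nearest to x / y for y > 0 (ties rounded upwards)."""
    return (2 * x + y) // (2 * y)


def gaussian_divmod(m, n):
    """Return (q, r) with m = q*n + r and 2*N(r) <= N(n), n != 0."""
    a, b = m
    c, d = n
    nn = norm(n)
    # m / n = m * conjugate(n) / N(n)
    q = (nearest_quotient(a * c + b * d, nn), nearest_quotient(b * c - a * d, nn))
    r = (a - (q[0] * c - q[1] * d), b - (q[0] * d + q[1] * c))
    return q, r


q, r = gaussian_divmod((27, 23), (8, 1))
print(q, r, norm(r), norm((8, 1)))
```

Each co-ordinate of the exact quotient is missed by at most ½, so the norm of the remainder is at most (¼+¼)𝖭(𝑛), in agreement with the bound stated above.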

We have 2 = −𝑖(1+𝑖)², so that 2 is associated with a square; a real prime of the form 4𝑛+3 is still a prime, but one of the form 4𝑛+1 breaks up into two conjugate prime factors, for example 5 = (1−2𝑖)(1+2𝑖). An integer is even, semi-even, or odd according as it is divisible by (1+𝑖)², (1+𝑖) or is prime to (1+𝑖). Among four associated odd integers there is one and only one which ≡1 (mod 2+2𝑖); this is said to be primary; the conjugate of a primary number is primary, and the product of any number of primaries is primary. The conditions that 𝑎+𝑏𝑖 may be primary are 𝑏≡0 (mod 2), 𝑎+𝑏−1≡0 (mod 4). Every complex integer can be uniquely expressed in the form $i^{m}(1+i)^{n}a^{\alpha}b^{\beta}c^{\gamma}\ldots$, where 0⩽𝑚<4, and 𝑎, 𝑏, 𝑐, . . . are primary primes.

With respect to a complex modulus 𝑚, all complex integers may be distributed into 𝖭(𝑚) incongruous classes. If 𝑚=ℎ(𝑎+𝑏𝑖) where 𝑎, 𝑏 are co-primes, we may take as representatives of these classes the residues 𝑥+𝑦𝑖 where 𝑥=0, 1, 2, . . . {(𝑎²+𝑏²)ℎ−1}; 𝑦=0, 1, 2, . . . (ℎ−1). Thus when 𝑏=0 we may take 𝑥=0, 1, 2, . . . (ℎ−1); 𝑦=0, 1, 2, . . . (ℎ−1), giving the ℎ² residues of the real number ℎ; while if 𝑎+𝑏𝑖 is prime, 0, 1, 2, . . . (𝑎²+𝑏²−1) form a complete set of residues.

The number of residues of 𝑚 that are prime to 𝑚 is given by

𝜙(𝑚) = 𝖭(𝑚) ∏ (1 − 1/𝖭(𝑝)),

where the product extends to all prime factors of 𝑚. As an analogue to Fermat’s theorem we have, for any integer prime to the modulus,

𝑥^{𝜙(𝑚)}≡1 (mod 𝑚), 𝑥^{𝖭(𝑝)−1}≡1 (mod 𝑝)


according as 𝑚 is composite or prime. There are 𝜙{𝖭(𝑝)−1} primitive roots of the prime 𝑝; a primitive root in the real theory for a real prime 4𝑛+1 is also a primitive root in the new theory for each prime factor of (4𝑛+1), but if 𝑝=4𝑛+3 be a prime its primitive roots are necessarily complex.
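
These statements can be checked by brute force for small moduli. The Python sketch below is an illustration under stated assumptions, not the article's own method: complex integers are pairs, the residue system is the one described above for 𝑚=ℎ(𝑎+𝑏𝑖), and all function names are hypothetical.

```python
def gmul(z, w):
    """Product of two complex integers stored as (real, imaginary) pairs."""
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])


def gnorm(z):
    return z[0] * z[0] + z[1] * z[1]


def gmod(z, n):
    """A remainder of z modulo n with norm at most N(n)/2 (nearest-integer quotient)."""
    (a, b), (c, d) = z, n
    nn = gnorm(n)
    q_re = (2 * (a * c + b * d) + nn) // (2 * nn)
    q_im = (2 * (b * c - a * d) + nn) // (2 * nn)
    return (a - (q_re * c - q_im * d), b - (q_re * d + q_im * c))


def ggcd(z, n):
    while n != (0, 0):
        z, n = n, gmod(z, n)
    return z


def residues(h, a, b):
    """Representatives x+yi for the modulus h(a+bi), with a, b co-prime."""
    return [(x, y) for x in range((a * a + b * b) * h) for y in range(h)]


def count_prime_residues(h, a, b):
    """Number of residues of m = h(a+bi) that are prime to m, by direct count."""
    m = gmul((h, 0), (a, b))
    return sum(1 for z in residues(h, a, b) if gnorm(ggcd(z, m)) == 1)


def order_mod(x, m):
    """Least e > 0 with x^e ≡ 1 (mod m); x is assumed prime to m."""
    e, y = 1, gmod(x, m)
    while gmod((y[0] - 1, y[1]), m) != (0, 0):
        y = gmod(gmul(y, x), m)
        e += 1
    return e


if __name__ == "__main__":
    # m = 3(1+2i): N(m) = 45, prime factors of norms 9 and 5.
    print(count_prime_residues(3, 1, 2), 45 * (1 - 1 / 9) * (1 - 1 / 5))
    # Orders of the residues prime to the real prime 3, for which N(3) - 1 = 8.
    units = [z for z in residues(3, 1, 0) if gnorm(ggcd(z, (3, 0))) == 1]
    print({z: order_mod(z, (3, 0)) for z in units})
```

For the modulus 3 every order divides 8, and the residues of order exactly 8 all have a non-zero imaginary part, in agreement with the remark that the primitive roots of a prime 4𝑛+3 are necessarily complex.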

43. If 𝑝, 𝑞 are any two odd primes, we shall define the symbols (𝑝/𝑞)₂ and (𝑝/𝑞)₄ by the congruences

(𝑝/𝑞)₂ ≡ 𝑝^{½(𝖭(𝑞)−1)}, (𝑝/𝑞)₄ ≡ 𝑝^{¼(𝖭(𝑞)−1)} (mod 𝑞),


it being understood that the symbols stand for absolutely least residues. It follows that (𝑝/𝑞)₂ = 1 or −1 according as 𝑝 is a quadratic residue of 𝑞 or not; and that (𝑝/𝑞)₄ = 1 only if 𝑝 is a biquadratic residue of 𝑞. If 𝑝, 𝑞 are primary primes, we have two laws of reciprocity, expressed by the equations

(𝑝/𝑞)₂ = (𝑞/𝑝)₂, (𝑝/𝑞)₄ = (−1)^{¼(𝖭(𝑝)−1)·¼(𝖭(𝑞)−1)} (𝑞/𝑝)₄.

To these must be added the supplementary formulae

(𝑖/(𝑎+𝑏𝑖))₄ = 𝑖^{½(1−𝑎)}, ((1+𝑖)/(𝑎+𝑏𝑖))₄ = 𝑖^{¼(𝑎−𝑏−1−𝑏²)},

𝑎+𝑏𝑖 being a primary odd prime. In words, the law of biquadratic reciprocity for two primary odd primes may be expressed by saying that the biquadratic characters of each prime with respect to the other are identical, unless 𝑝 ≡ 𝑞 ≡ 3 + 2𝑖 (mod 4), in which case they are opposite. The law of biquadratic reciprocity was discovered by Gauss, who does not seem, however, to have obtained a complete proof of it. The first published proof is that of Eisenstein, which is very beautiful and simple, but involves the theory of lemniscate functions. A proof on the lines indicated in Gauss’s posthumous papers has been developed by Busche; this probably admits of simplification. Other demonstrations, for instance Jacobi’s, depend on cyclotomy (see below).
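
The verbal form of the law can be tried on small primary primes. The Python sketch below is illustrative only and rests on an assumption stated here and in the code: the biquadratic character of 𝑝 with respect to 𝑞 is taken to be the power of 𝑖 that is congruent to 𝑝 raised to the power ¼(𝖭(𝑞)−1), modulo 𝑞. Function names are hypothetical and complex integers are stored as pairs.

```python
def gmul(z, w):
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])


def gnorm(z):
    return z[0] * z[0] + z[1] * z[1]


def gmod(z, n):
    """A remainder of z modulo n, using the nearest-integer quotient."""
    (a, b), (c, d) = z, n
    nn = gnorm(n)
    q_re = (2 * (a * c + b * d) + nn) // (2 * nn)
    q_im = (2 * (b * c - a * d) + nn) // (2 * nn)
    return (a - (q_re * c - q_im * d), b - (q_re * d + q_im * c))


def biquadratic_character(p, q):
    """Exponent k in 0..3 with p^((N(q)-1)/4) ≡ i^k (mod q).

    Assumption: q is an odd complex prime and p is prime to q.
    """
    r = (1, 0)
    for _ in range((gnorm(q) - 1) // 4):
        r = gmod(gmul(r, p), q)
    for k, unit in enumerate([(1, 0), (0, 1), (-1, 0), (0, -1)]):
        if gmod((r[0] - unit[0], r[1] - unit[1]), q) == (0, 0):
            return k
    raise ValueError("q is not an odd prime that is prime to p")


if __name__ == "__main__":
    # 3+2i and -1+2i are both primary and both ≡ 3+2i (mod 4): opposite characters.
    print(biquadratic_character((3, 2), (-1, 2)), biquadratic_character((-1, 2), (3, 2)))
    # 3+2i and 1+4i: the second is ≡ 1 (mod 4), so the characters agree.
    print(biquadratic_character((3, 2), (1, 4)), biquadratic_character((1, 4), (3, 2)))
```

Two characters are "opposite" when their exponents differ by 2, that is, when one is −1 times the other.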

44. Algebraic Numbers.—The first extension of Gauss’s complex theory was made by E. E. Kummer, who considered complex numbers represented by rational integral functions of any roots of unity, thus including the ordinary theory and Gauss’s as special cases. He was soon faced by the difficulty that, in some cases, the law that an integer can be uniquely expressed as the product of prime factors appeared to break down. To see how this happens take the equation 𝜂²+𝜂+6=0, the roots of which are expressible as rational integral functions of 23rd roots of unity, and let 𝜂 be either of the roots. If we define 𝑎𝜂+𝑏 to be an integer, when 𝑎, 𝑏 are natural numbers, the product of any number of such integers is uniquely expressible in the form 𝑙𝜂+𝑚. Conversely every integer can be expressed as the product of a finite number of indecomposable integers 𝑎+𝑏𝜂, that is, integers which cannot be further resolved into factors of the same type. But this resolution is not necessarily unique: for instance 6=2·3=−𝜂(𝜂+1), where 2, 3, 𝜂, 𝜂+1 are all indecomposable and essentially distinct. To see the way in which Kummer surmounted the difficulty consider the congruence

𝑢²+𝑢+6≡0(mod 𝑝)


where 𝑝 is any prime, except 23. If −23𝖱𝑝 this has two distinct roots 𝑢₁, 𝑢₂; and we say that 𝑎𝜂+𝑏 is divisible by the ideal prime factor of 𝑝 corresponding to 𝑢₁, if 𝑎𝑢₁+𝑏≡0 (mod 𝑝). For instance, if 𝑝=2 we may put 𝑢₁=0, 𝑢₂=1 and there will be two ideal factors of 2, say 𝑝₁ and 𝑝₂ such that 𝑎𝜂+𝑏≡0 (mod 𝑝₁) if 𝑏≡0 (mod 2) and 𝑎𝜂+𝑏≡0 (mod 𝑝₂) if 𝑎+𝑏≡0 (mod 2). If both these congruences are satisfied, 𝑎≡𝑏≡0 (mod 2) and 𝑎𝜂+𝑏 is divisible by 2 in the ordinary sense. Moreover (𝑎𝜂+𝑏)(𝑐𝜂+𝑑)=(𝑏𝑐+𝑎𝑑−𝑎𝑐)𝜂+(𝑏𝑑−6𝑎𝑐) and if this product is divisible by 𝑝₁, 𝑏𝑑≡0 (mod 2), whence either 𝑎𝜂+𝑏 or 𝑐𝜂+𝑑 is divisible by 𝑝₁; while if the product is divisible by 𝑝₂ we have 𝑏𝑐+𝑎𝑑+𝑏𝑑−7𝑎𝑐≡0 (mod 2) which is equivalent to (𝑎+𝑏)(𝑐+𝑑)≡0 (mod 2), so that again either 𝑎𝜂+𝑏 or 𝑐𝜂+𝑑 is divisible by 𝑝₂. Hence we may properly speak of 𝑝₁ and 𝑝₂ as prime divisors. Similarly the congruence 𝑢²+𝑢+6≡0 (mod 3) defines two ideal prime factors of 3, and 𝑎𝜂+𝑏 is divisible by one or the other of these according as 𝑏≡0 (mod 3) or 2𝑎+𝑏≡0 (mod 3); we will call these prime factors 𝑝₃, 𝑝₄. With this notation we have (neglecting unit factors)

2=𝑝₁𝑝₂, 3=𝑝₃𝑝₄, 𝜂=𝑝₁𝑝₃, 1+𝜂=𝑝₂𝑝₄
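
The arithmetic of this example is easy to reproduce. In the Python sketch below (an illustration with hypothetical names, not part of the source) an integer 𝑎𝜂+𝑏 is stored as the pair (a, b); the product rule and the four divisibility tests are those given above, and the norm used is 6𝑎²−𝑎𝑏+𝑏².

```python
def mul(x, y):
    """(a*eta + b)(c*eta + d), reduced with eta^2 = -eta - 6."""
    (a, b), (c, d) = x, y
    return (b * c + a * d - a * c, b * d - 6 * a * c)


def norm(x):
    """Product of a*eta + b and its conjugate: 6a^2 - ab + b^2."""
    a, b = x
    return 6 * a * a - a * b + b * b


def ideal_factors(x):
    """Which of the ideal primes p1..p4 divide a*eta + b (tests from the text)."""
    a, b = x
    found = []
    if b % 2 == 0:
        found.append("p1")
    if (a + b) % 2 == 0:
        found.append("p2")
    if b % 3 == 0:
        found.append("p3")
    if (2 * a + b) % 3 == 0:
        found.append("p4")
    return found


if __name__ == "__main__":
    eta, eta_plus_1 = (1, 0), (1, 1)
    print(mul(eta, eta_plus_1))  # eta*(eta+1) as a pair (coefficient of eta, constant)
    for name, x in [("2", (0, 2)), ("3", (0, 3)), ("eta", eta), ("1+eta", eta_plus_1)]:
        print(name, norm(x), ideal_factors(x))
    # No integer of the form a*eta + b has norm 2 or 3 in this search range.
    print([(a, b) for a in range(-5, 6) for b in range(-5, 6) if norm((a, b)) in (2, 3)])
```

Since no element has norm 2 or 3, the integers 2, 3, 𝜂, 1+𝜂 (of norms 4, 9, 6, 6) cannot be split further, which is why the two resolutions of 6 are essentially distinct until the ideal factors are introduced.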


Real primes of which −23 is a non-quadratic residue are also primes in the field (𝜂); and the prime factors of any number 𝑎𝜂+𝑏, as well as the degree of their multiplicity, may be found by factorizing (6𝑎²−𝑎𝑏+𝑏²), the norm of (𝑎𝜂+𝑏). Finally every integer divisible by 𝑝₂ is expressible in the form ±2𝑚±(1+𝜂)𝑛 where 𝑚, 𝑛 are natural numbers (or zero); it is convenient to denote this fact by writing 𝑝₂=[2, 1+𝜂], and calling the aggregate 2𝑚+(1+𝜂)𝑛 a compound modulus with the base 2, 1+𝜂. This generalized idea of a modulus is very important and far-reaching; an aggregate is a modulus when, if 𝛼, 𝛽 are any two of its elements, 𝛼+𝛽 and 𝛼−𝛽 also belong to it. For arithmetical purposes those moduli are most useful which can be put into the form [𝛼₁, 𝛼₂, … 𝛼ₙ] which means the aggregate of all the quantities 𝑥₁𝛼₁+𝑥₂𝛼₂+…+𝑥ₙ𝛼ₙ obtained by assigning to (𝑥₁, 𝑥₂, … 𝑥ₙ), independently, the values 0, ±1, ±2, &c. Compound moduli may be multiplied together, or raised to powers, by rules which will be plain from the following example. We have
𝑝₂²=[4, 2(1+𝜂), (1+𝜂)²]=[4, 2+2𝜂, −5+𝜂]=[4, 12, −5+𝜂]
    =[4, −5+𝜂]=[4, 3+𝜂]
hence
𝑝₂³=𝑝₂²·𝑝₂=[4, 3+𝜂]×[2, 1+𝜂]=[8, 4+4𝜂, 6+2𝜂, 3+4𝜂+𝜂²]
    =[8, 4+4𝜂, 6+2𝜂, −3+3𝜂]=(𝜂−1)[𝜂+2, 𝜂−6, 3]=(𝜂−1)[1, 𝜂]
Hence every integer divisible by 𝑝₂³ is divisible by the actual integer (𝜂−1) and conversely; so that in a certain sense we may regard 𝑝₂ as a cube root. Similarly the cube of any other ideal prime is of the form (𝑎𝜂+𝑏)[1, 𝜂]. According to a principle which will be explained further on, all primes here considered may be arranged in three classes; one is that of the real primes, the others each contain ideal primes only. As we shall see presently all these results are intimately connected with the fact that for the determinant −23 there are three primitive classes, represented by (1, 1, 6), (2, 1, 3), (2, −1, 3) respectively.
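
The three classes named at the end of this paragraph can be recovered by listing reduced forms. The Python sketch below is an illustration under an assumption stated here and in the code: a symbol (𝑎, 𝑏, 𝑐) is read as the form 𝑎𝑥²+𝑏𝑥𝑦+𝑐𝑦² with 𝑏²−4𝑎𝑐=−23, and the usual reduction conditions for positive definite forms are used. The function name is hypothetical.

```python
from math import gcd


def reduced_forms(D):
    """Primitive reduced forms (a, b, c) with b^2 - 4ac = D, for D < 0.

    Assumed convention: the form is a*x^2 + b*x*y + c*y^2, reduced when
    -a < b <= a <= c, with b >= 0 whenever a == c.
    """
    forms = []
    a = 1
    while 3 * a * a <= -D:  # a reduced form always satisfies 3a^2 <= |D|
        for b in range(-a + 1, a + 1):
            if (b * b - D) % (4 * a) != 0:
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (a == c and b < 0):
                continue
            if gcd(gcd(a, abs(b)), c) == 1:
                forms.append((a, b, c))
        a += 1
    return forms


if __name__ == "__main__":
    print(reduced_forms(-23))
```

The list produced for −23 consists of the three forms quoted in the text, one for each class.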

45. Kummer’s definition of ideal primes sufficed for his particular purpose, and completely restored the validity of the fundamental theorems about factors and divisibility. His complex integers were more general than any previously considered and suggested a definition of an algebraic integer in general, which is as follows: if 𝑎₁, 𝑎₂, … 𝑎ₙ are ordinary integers (i.e. elements of N, § 7), and 𝜃 satisfies an equation of the form

𝜃ⁿ+𝑎₁𝜃ⁿ⁻¹+𝑎₂𝜃ⁿ⁻²+ . . . +𝑎ₙ₋₁𝜃+𝑎ₙ=0


𝜃 is said to be an algebraic integer. We may suppose this equation irreducible; 𝜃 is then said to be of the 𝑛th order. The 𝑛 roots 𝜃, 𝜃′, 𝜃″, . . . 𝜃⁽ⁿ⁻¹⁾ are all different, and are said to be conjugate. If the equation began with 𝑎₀𝜃ⁿ instead of 𝜃ⁿ, 𝜃 would still be an algebraic number; every algebraic number can be put into the form 𝜃/𝑚, where 𝑚 is a natural number and 𝜃 an algebraic integer.

Associated with 𝜃 we have a field (or corpus) Ω=𝖱(𝜃) consisting of all rational functions of 𝜃 with real rational coefficients; and in like manner we have the conjugate fields Ω′=𝖱(𝜃′), &c. The aggregate of integers contained in Ω is denoted by 𝔬.

Every element of Ω can be put into the form

𝜔=𝑐₀+𝑐₁𝜃+ . . . +𝑐ₙ₋₁𝜃ⁿ⁻¹


where 𝑐₀, 𝑐₁, … 𝑐ₙ₋₁ are real and rational. If these coefficients are all integral, 𝜔 is an integer; but the converse is not necessarily true. It is possible, however, to find a set of integers 𝜔₁, 𝜔₂, … 𝜔ₙ, belonging to Ω, such that every integer in Ω can be uniquely expressed in the form

𝜔=ℎ₁𝜔₁+ℎ₂𝜔₂+ . . . +ℎₙ𝜔ₙ

where ℎ₁, ℎ₂, … ℎₙ are elements of N which may be called the co-ordinates of 𝜔 with respect to the base 𝜔₁, 𝜔₂, … 𝜔ₙ. Thus 𝔬 is a modulus (§ 44), and we may write 𝔬=[𝜔₁, 𝜔₂, … 𝜔ₙ]. Having found one base, we can construct any number of equivalent bases by means of equations such as 𝜔′ᵢ=Σⱼ 𝑐ᵢⱼ𝜔ⱼ, where the rational integral coefficients 𝑐ᵢⱼ are such that the determinant |𝑐ᵢⱼ|=±1.

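If we write

√𝚫 = |𝜔ᵢ⁽ʲ⁾|  (𝑖 = 1, 2, … 𝑛; 𝑗 = 0, 1, … 𝑛−1),

the determinant formed from the elements of a base and their conjugates, 𝚫 is a rational integer called the discriminant of the field. Its value is the same whatever base is chosen.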

If 𝜔 is any integer in Ω, the product of 𝜔 and its conjugates is a rational integer called the norm of 𝜔, and written 𝖭(𝜔). By considering the equation satisfied by 𝜔 we see that 𝖭(𝜔)=𝜔𝜓 where 𝜓 is an integer in Ω. It follows from the definition that if 𝛼, 𝛽 are any two integers in Ω, then 𝖭(𝛼𝛽)=𝖭(𝛼)𝖭(𝛽); and that for an ordinary real integer 𝑚, we have 𝖭(𝑚)=𝑚ⁿ.


46. Ideals.—The extension of Kummer’s results to algebraic numbers in general was independently made by J. W. R. Dedekind and Kronecker; their methods differ mainly in matters of notation and machinery, each having special advantages of its own for particular purposes. Dedekind’s method is based upon the notion of an ideal, which is defined by the following properties:—

(i.) An ideal 𝔪 is an aggregate of integers in Ω.

(ii.) This aggregate is a modulus; that is to say, if 𝜇, 𝜇′ are any two elements of 𝔪 (the same or different) 𝜇−𝜇′ is contained in 𝔪. Hence also 𝔪 contains a zero element, and 𝜇+𝜇′ is an element of 𝔪.

(iii.) If 𝜇 is any element of 𝔪, and 𝜔 any element of 𝔬, then 𝜔𝜇 is an element of 𝔪. It is this property that makes the notion of an ideal more specific than that of a modulus.

It is clear that ideals exist; for instance, 𝔬 itself is an ideal. Again, all integers in Ω which are divisible by a given integer 𝛼 (in Ω) form an ideal; this is called a principal ideal, and is denoted by 𝔬𝛼. Every ideal can be represented by a base (§§ 44, 45), so that we may write 𝔪=[𝜇₁, 𝜇₂, … 𝜇ₙ], meaning that every element of 𝔪 can be uniquely expressed in the form Σℎᵢ𝜇ᵢ, where ℎᵢ is a rational integer. In other words, every ideal has a base (and therefore, of course, an infinite number of bases). If 𝔞, 𝔟 are any two ideals, and if we form the aggregate of all products obtained by multiplying each element of the first ideal by each element of the second, then this aggregate, together with all sums of such products, is an ideal which is called the product of 𝔞 and 𝔟 and written 𝔞𝔟 or 𝔟𝔞. In particular 𝔬𝔞=𝔞. This law of multiplication is associative as well as commutative. It is clear that every element of 𝔞𝔟 is contained in 𝔞: it can be proved that, conversely, if every element of 𝔠 is contained in 𝔞, there exists an ideal 𝔟 such that 𝔞𝔟=𝔠. In particular, if 𝛼 is any element of 𝔞, there is an ideal 𝔞′ such that 𝔬𝛼=𝔞𝔞′. A prime ideal is one which has no divisors except itself and 𝔬. It is a fundamental theorem that every ideal can be resolved into the product of a finite number of prime ideals, and that this resolution is unique. It is the decomposition of a principal ideal into the product of prime ideals that takes the place of the resolution of an integer into its prime factors in the ordinary theory. It may happen that all the ideals in Ω are principal ideals; in this case every resolution of an ideal into factors corresponds to the resolution of an integer into actual integral factors, and the introduction of ideals is unnecessary. But in every other case the introduction of ideals, or some equivalent notion, is indispensable. When two ideals have been resolved into their prime factors, their greatest common measure and least common multiple are determined by the ordinary rules. Every ideal may be expressed (in an infinite number of ways) as the greatest common measure of two principal ideals.

47. There is a theory of congruences with respect to an ideal modulus. Thus 𝛼≡𝛽 (mod 𝔪) means that 𝛼−𝛽 is an element of 𝔪. With respect to 𝔪, all the integers in Ω may be arranged in a finite number of incongruent classes. The number of these classes is called the norm of 𝔪, and written 𝖭(𝔪). The norm of a prime ideal 𝔭 is some power of a real prime 𝑝; if 𝖭(𝔭)=𝑝ᶠ, 𝔭 is said to be a prime ideal of degree 𝑓. If 𝔞, 𝔟 are any two ideals, then 𝖭(𝔞𝔟)=𝖭(𝔞)𝖭(𝔟). If 𝖭(𝔪)=𝑚, then 𝑚≡0 (mod 𝔪), and there is an ideal 𝔪′ such that 𝔬𝑚=𝔪𝔪′. The norm of a principal ideal 𝔬𝛼 is equal to the absolute value of 𝖭(𝛼) as defined in § 45.

The number of incongruent residues prime to 𝔪 is

𝜙(𝔪) = 𝖭(𝔪) ∏ (1 − 1/𝖭(𝔭)),

where the product extends to all prime factors 𝔭 of 𝔪. If 𝜔 is any element of 𝔬 prime to 𝔪,

𝜔^{𝜙(𝔪)} ≡ 1 (mod 𝔪).

Associated with a prime modulus 𝔭 for which 𝖭(𝔭)=𝑝ᶠ we have 𝜙(𝑝ᶠ−1) primitive roots, where 𝜙 has the meaning given to it in the ordinary theory. Hence follow the usual results about exponents, indices, solutions of linear congruences, and so on. For any modulus 𝔪 we have 𝖭(𝔪)=Σ𝜙(𝔡), where the sum extends to all the divisors 𝔡 of 𝔪.

48. Every element of 𝔬 which is not contained in any other ideal is an algebraic unit. If the 𝑛 conjugate fields consist of 𝑟₁ real and 2𝑟₂ imaginary fields, there is a system of units 𝜖₁, 𝜖₂, … 𝜖ᵣ, where 𝑟=𝑟₁+𝑟₂−1, such that every unit in Ω is expressible in the form 𝜌𝜖₁^{𝑎₁}𝜖₂^{𝑎₂} … 𝜖ᵣ^{𝑎ᵣ} where 𝜌 is a root of unity contained in Ω and 𝑎₁, 𝑎₂, … 𝑎ᵣ are natural numbers. This theorem is due to Dirichlet.

The norm of a unit is +1 or −1; and the determination of all the units contained in a given field is in fact the same as the solution of a Diophantine equation

𝖥(ℎ₁, ℎ₂, … ℎₙ) = ±1.

For a quadratic field the equation is of the form 𝑎𝑥²+𝑏𝑥𝑦+𝑐𝑦²=±1, and the theory of this is complete; but except for certain special cubic corpora little has been done towards solving the important problem of assigning a definite process by which, for a given field, a system of fundamental units may be calculated. The researches of Jacobi, Hermite, and Minkowski seem to show that a proper extension of the method of continued fractions is necessary.
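
For the quadratic case the method of continued fractions can be sketched briefly. The Python fragment below is an illustration, not the article's own procedure: it assumes the unit equation is taken in the form 𝑥²−𝑚𝑦²=±1, that is, it looks for units of the modulus [1, √𝑚] with 𝑚 a positive non-square, and the function name is hypothetical.

```python
from math import isqrt


def pell_fundamental(m):
    """Least x, y > 0 with x^2 - m*y^2 = +1 or -1, and that value.

    Walks through the convergents x/y of the continued fraction of sqrt(m).
    Assumption: m is a positive integer that is not a perfect square.
    """
    a0 = isqrt(m)
    if a0 * a0 == m:
        raise ValueError("m must not be a perfect square")
    P, Q, a = 0, 1, a0           # state of the continued-fraction expansion
    x_prev, x = 1, a0            # numerators of successive convergents
    y_prev, y = 0, 1             # denominators of successive convergents
    while x * x - m * y * y not in (1, -1):
        P = a * Q - P
        Q = (m - P * P) // Q
        a = (a0 + P) // Q
        x_prev, x = x, a * x + x_prev
        y_prev, y = y, a * y + y_prev
    return x, y, x * x - m * y * y


if __name__ == "__main__":
    for m in (2, 3, 7, 61):
        print(m, pell_fundamental(m))
```

When 𝑚≡1 (mod 4) the integers of the field form the larger modulus [1, ½(1+√𝑚)], and the unit found here may be only a power of a fundamental unit; for 𝑚=5 the sketch returns 2+√5, which is the cube of ½(1+√5).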


49. Ideal Classes.—If 𝔪 is any ideal, another ideal 𝔪′ can always be found such that 𝔪𝔪′ is a principal ideal; for instance, one such multiplier is the ideal 𝔪′ of § 47, for which 𝔪𝔪′=𝔬𝖭(𝔪). Two ideals 𝔪, 𝔫 are said to be equivalent (𝔪∼𝔫) or to belong to the same class, if there is an ideal 𝔪′ such that 𝔪𝔪′, 𝔫𝔪′ are both principal ideals. It can be proved that two ideals each equivalent to a third are equivalent to each other and that all ideals in Ω may be distributed into a finite number, ℎ, of ideal classes. The class which contains all principal ideals is called the principal class and denoted by 𝖮.

If 𝔪, 𝔫 are any two ideals belonging to the classes 𝖠, 𝖡 respectively, then 𝔪𝔫 belongs to a definite class which depends only upon 𝖠, 𝖡 and may be denoted by 𝖠𝖡 or 𝖡𝖠 indifferently. Thus the class-symbols form an Abelian group of order ℎ, of which the principal class 𝖮 is the unit element; and, mutatis mutandis, the theorems of § 37 about composition of classes still hold good.

The principal theorem with regard to the determination of ℎ is the following, which is Dedekind’s generalization of the corresponding one for quadratic fields, first obtained by Dirichlet. Let

𝜁(𝑠) = Σ 1/𝖭(𝔪)^𝑠,

where the sum extends to all ideals 𝔪 contained in 𝔬; this converges so long as the real quantity 𝑠 is positive and greater than 1. Then 𝜅 being a certain quantity which can be calculated when a fundamental system of units is known, we shall have

lim_{𝑠→1} (𝑠−1)𝜁(𝑠) = ℎ𝜅.

The expression for 𝜅 is rather complicated, and very peculiar; it may be written in the form

𝜅 = 2^{𝑟₁+𝑟₂}𝜋^{𝑟₂}𝖱 / (𝑤|√𝚫|),

where |√𝚫| means the absolute value of the square root of the discriminant of the field, 𝑟₁, 𝑟₂ (there being 𝑟₁ real and 2𝑟₂ imaginary conjugate fields) have the same meaning as in § 48, 𝑤 is the number of roots of unity in Ω, and 𝖱 is a determinant of the form |𝑙ᵢ(𝜖ⱼ)|, of order 𝑟₁+𝑟₂−1, with elements which are, in a certain special sense, “logarithms” of the fundamental units 𝜖₁, 𝜖₂, &c.

50. The discriminant 𝚫 enjoys some very remarkable properties. Its value is always different from ±1; there can be only a finite number of fields which have a given discriminant; and the rational prime factors of 𝚫 are precisely those rational primes which, in Ω, are divisible by the square (or some higher power) of a prime ideal. Consequently, every rational prime not contained in 𝚫 is resolvable, in Ω, into the product of distinct primes, each of which occurs only once. The presence of multiple prime factors in the discriminant was the principal difficulty in the way of extending Kummer’s method to all fields, and was overcome by the introduction of compound moduli—for this is the common characteristic of Dedekind’s and Kronecker’s procedure.


51. Normal Fields.—The special properties of a particular field Ω are closely connected with its relations to the conjugate fields Ω′, Ω″, … Ω⁽ⁿ⁻¹⁾. The most important case is when each of the conjugate fields is identical with Ω: the field is then said to be Galoisian or normal. The aggregate of all rational functions of 𝜃 and its conjugates is a normal field: hence every arithmetical field of order 𝑛 is either normal, or contained in a normal field of a higher order. The roots of an equation which defines a normal field are associated with a group of substitutions: if this is Abelian, the field is called Abelian; if it is cyclic, the field is called cyclic. A cyclotomic field is one the elements of which are all expressible as rational functions of roots of unity; in particular the complete cyclotomic field 𝖢ₘ, of order 𝜙(𝑚), is the aggregate of all rational functions of a primitive 𝑚th root of unity. To Kronecker is due the very remarkable theorem that all Abelian (including cyclic) fields are cyclotomic: the first published proof of this was given by Weber, and another is due to D. Hilbert.

Many important theorems concerning a normal field have been established by Hilbert. He shows that if Ω is a given normal field of order 𝑚, and 𝔭 any of its prime ideals, there is a finite series of associated fields Ω₁, Ω₂, … Ωₗ, of orders 𝑚₁, 𝑚₂, … 𝑚ₗ, such that 𝑚ᵢ is a factor of 𝑚ᵢ₋₁ (with 𝑚₀=𝑚 and 𝔭₀=𝔭), and that if 𝑟ᵢ=𝑚ᵢ₋₁/𝑚ᵢ, 𝔭ᵢ=𝔭ᵢ₋₁^{𝑟ᵢ}, a prime ideal in Ωᵢ. If Ωₗ is the last of this series, it is called the field of inertia (Trägheitskörper) for 𝔭: next after this comes another field of still lower order called the resolving field (Zerlegungskörper) for 𝔭, and in this field there is a prime of the first degree, 𝔭ₗ₊₁, such that 𝔭ₗ₊₁=𝔭ᵏ, where 𝑘=𝑚/𝑚ₗ. In the field of inertia 𝔭ₗ₊₁ remains a prime, but becomes of higher degree; in Ωₗ₋₁, which is called the branch-field (Verzweigungskörper) it becomes a power of a prime, and by going on in this way from the resolving field to Ω, we obtain (𝑙+2) representations for any prime ideal of the resolving field. By means of these theorems, Hilbert finds an expression for the exact power to which a rational prime 𝒑 occurs in the discriminant of Ω, and in other ways the structure of Ω becomes more evident. It may be observed that when 𝑚 is prime the whole series reduces to Ω and the rational field, and we conclude that every prime ideal in Ω is of the first or 𝑚th degree: this is the case, for instance, when 𝑚=2, and is one of the reasons why quadratic fields are comparatively so simple in character.

52. Quadratic Fields.—Let 𝑚 be an ordinary integer different from +1, and not divisible by any square: then if 𝑥, 𝑦 assume all ordinary rational values the expressions 𝑥+𝑦√𝑚 are the elements of a field which may be called Ω(√𝑚). It should be observed that √𝑚 means one definite root of 𝑥²−𝑚=0, it does not matter which: it is convenient, however, to agree that √𝑚 is positive when 𝑚 is positive, and 𝑖√𝑚 is negative when 𝑚 is negative. The principal results relating to Ω will now be stated, and will serve as illustrations of §§ 44-51.

In the notation previously used

𝔬=[1, ½(1+√𝑚)] or [1, √𝑚]


according as 𝑚≡1 (mod 4) or not. In the first case 𝚫=𝑚, in the second 𝚫=4𝑚. The field Ω is normal, and every ideal prime in it is of the first degree.

Let 𝒒 be any odd prime factor of 𝑚; then 𝒒=𝔮², where 𝔮 is the prime ideal [𝒒, ½(𝒒+√𝑚)] when 𝑚≡1 (mod 4) and in other cases [𝒒, √𝑚]. An odd prime 𝒑 of which 𝑚 is a quadratic residue is the product of two prime ideals 𝔭, 𝔭′, which may be written in the form [𝒑, ½(𝒂+√𝑚)], [𝒑, ½(𝒂−√𝑚)] or [𝒑, 𝒂+√𝑚], [𝒑, 𝒂−√𝑚], according as 𝑚≡1 (mod 4) or not: here 𝒂 is a root of 𝑥²≡𝑚 (mod 𝒑), taken so as to be odd in the first of the two cases. All other rational odd primes are primes in Ω. For the exceptional prime 2 there are four cases to consider: (i.) if 𝑚≡1 (mod 8), then 2=[2, ½(1+√𝑚)]×[2, ½(1−√𝑚)]: (ii.) if 𝑚≡5 (mod 8), then 2 is prime: (iii.) if 𝑚≡2 (mod 4), 2=[2, √𝑚]²: (iv.) if 𝑚≡3 (mod 4), 2=[2, 1+√𝑚]². Illustrations will be found in § 44 for the case 𝑚=−23.
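
The rules of this paragraph amount to a short case distinction. The Python sketch below is an illustration with hypothetical names; it assumes 𝑚 is a square-free integer other than 1 and 𝑝 a rational prime, and it reports only the type of resolution, not the ideals themselves.

```python
def split_type(m, p):
    """Behaviour of the rational prime p in the field of sqrt(m).

    Returns "ramified" (p is the square of a prime ideal), "split" (p is the
    product of two distinct prime ideals) or "inert" (p remains prime).
    Assumption: m is square-free and different from 1; p is prime.
    """
    if p == 2:
        if m % 4 in (2, 3):
            return "ramified"
        return "split" if m % 8 == 1 else "inert"
    if m % p == 0:
        return "ramified"
    # Euler's criterion: m is a quadratic residue of p when m^((p-1)/2) ≡ 1.
    return "split" if pow(m % p, (p - 1) // 2, p) == 1 else "inert"


if __name__ == "__main__":
    for p in (2, 3, 5, 23):
        print(p, split_type(-23, p))
```

For 𝑚=−23 the primes 2 and 3 are reported as split, in agreement with the factors 𝑝₁𝑝₂ and 𝑝₃𝑝₄ of § 44, while 23 is the square of a prime ideal.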

53. Normal Residues. Genera.—Hilbert has introduced a very convenient definition, and a corresponding symbol, which is a generalization of Legendre’s quadratic character. Let 𝒏, 𝑚 be rational integers, 𝑚 not a square, 𝑤 any rational prime; we write ((𝒏, 𝑚)/𝑤)=+1 if, to the modulus 𝑤, 𝒏 is congruent to the norm of an integer contained in Ω(√𝑚); in all other cases we put ((𝒏, 𝑚)/𝑤)=−1. This new symbol obeys a set of laws, among which may be especially noted ((𝒏, 𝒑)/𝒑)=((𝒑, 𝒏)/𝒑)=(𝒏/𝒑) and ((𝒏, 𝑚)/𝒑)=+1, whenever 𝒏, 𝑚 are prime to 𝒑.

Now let 𝒒₁, 𝒒₂, . . . 𝒒ₜ be the different rational prime factors of the discriminant of Ω(√𝑚); then with any rational integer 𝒂 we may associate the 𝑡 symbols