Translation:Theoria combinationis observationum erroribus minimis obnoxiae

Theory of the combination of observations which is subject to the least error (1821–1823)
by Carl Friedrich Gauss, translated from French by Wikisource

Based on the 1855 French translation by Joseph Bertrand.

4446897Theory of the combination of observations which is subject to the least error1821-1823Carl Friedrich Gauss

Part One edit

1. edit

No matter how careful one is with observations concerning the measurement of physical quantities, they are inevitably subject to errors of varying degrees. These errors, in most cases, are not simple but arise from several distinct sources that it is best to distinguish into two classes.

Some causes of errors depend, for each observation, on variable circumstances independent of the result obtained: the errors arising from these are called "irregular" or "random," and like the circumstances that produce them, their value is not amenable to calculation. Such are the errors that arise from the imperfection of our senses and all those due to irregular external causes, e.g. vibrations of the air that blur our vision. Some of the errors due to the inevitable imperfection of even the best instruments, e.g. the roughness of the inner part of a level, its lack of absolute rigidity, etc., belong to this same category.

On the other hand, there are other causes that produce an identical error in all observations of the same kind, or one whose magnitude depends only on circumstances that can be viewed as essentially connected to the observation. We will call errors of this category "constant" or "regular" errors.

Moreover, one can see that this distinction is to a certain extent relative, and has a broader or narrower sense depending on the meaning one attaches to the idea of observations of the same nature. E.g. if one indefinitely repeats the measurement of the same angle, the errors arising from imperfect division of the instrument belongs to the class of constant errors. If, on the other hand, one successively measures several different angles, the errors due to imperfect division will be considered random until a table of errors relative to each division has been formed.

2. edit

We exclude the consideration of regular errors from our discussion. It is up to the observer to carefully investigate the causes that can produce a constant error, to eliminate them if possible, or at least assess their effect in order to correct it for each observation, which will then give the same result as if the constant cause had not existed. It is quite different for irregular errors: by their nature, they resist any calculation, and they must be tolerated in observations. However, by skillfully combining results, their influence can be minimized as much as possible. The following investigation is devoted to this most important topic.

3. edit

Errors arising from a simple and determinate cause in observations of the same kind are confined within certain limits that could undoubtedly be assigned if the nature of this cause were perfectly known. In most cases, all errors between these extreme limits must be considered possible. A thorough knowledge of each cause would reveal whether all these errors have equal or unequal likelihood, and in the latter case, what the relative probability of each of them is. The same remark applies to the total error resulting from the combination of several simple errors. This error will also be confined between two limits, one being the sum of the upper limits, the other the sum of the lower limits corresponding to the simple errors. All errors between these limits will be possible, and each can result, in an infinite number of ways, from suitable values attributed to the partial errors. Nevertheless, it is possible to assign a larger or smaller likelihood for each result, from which the law of relative probability can be derived, provided that the laws of each of the simple errors are assumed to be known, and ignoring the analytical difficulties involved in collecting all of the combinations.

Of course, certain sources of error produce errors that cannot vary according to a continuous law, but are instead capable of a finite number of values, such as errors arising from the imperfect division of instruments (if indeed one wants to classify them among random errors), because the number of divisions in a given instrument is essentially finite. Nevertheless, if it is assumed that not all sources of error are of this type, then it is clear that the complex of all possible total errors will form a series subject to the law of continuity, or, at least, several distinct series, if it so happens that, upon arranging all possible values of the discontinuous errors in order of magnitude, the difference between a pair of consecutive terms is greater than the difference between the extreme limits of the errors subject to the law of continuity. In practice, such a case will almost never occur, unless the the instrument is subject to gross defects.

4. edit

Let $\varphi (x)$ denote the relative likelihood of an error $x{:}$ this means, due to the continuity of the errors, that $\varphi (x)\,{d}x$ is the probability that the error lies between the limits $x$ and $x+{d}x.$ In practice it is hardly possible, or perhaps impossible, to assign a form to the function $\varphi$ a priori. Nevertheless, several general characteristics that it must necessarily present can be established: $\varphi (x)$ is obviously a discontinuous function; it vanishes for all values of $x$ not between the extreme errors. For any value between these limits, the function is positive (excluding the case indicated at the end of the previous article); in most cases, errors of opposite signs will be equally possible, and thus we will have: $\varphi (x)=\varphi (-x)$ Finally, since small errors are more easily made than large ones, $\varphi (x)$ will generally have a maximum when $x=0$ and will continually decrease as $x$ increases.

In general, the integral $\int _{a}^{b}\varphi (x)\,{d}x$ expresses the probability that the unknown error falls between the limits $a$ and $b.$ It follows that the value of this integral taken between the extreme limits of the possible errors will always be $=1.$ And since $\varphi (x)$ is zero for values not between these limits, it is clear that in all cases

the value of the integral $\int _{-\infty }^{\infty }\varphi (x)\,{d}x$ will always be $=1.$

5. edit

Let us consider the integral $\int _{-\infty }^{\infty }x\,\varphi (x)\,{d}x$ and denote its value by $k.$ If the sources of error are such that there is no reason for two equal errors of opposite signs to have unequal likelihood, we will have $\varphi (x)=\varphi (-x),$ and consequently, $k=0.$ We conclude that if $k$ does not vanish and has e.g. a positive value, then there necessarily exists an error source that produces only positive errors or, at least, produces them more easily than negative errors. This quantity $k,$ which is the average of all possible errors, or the average value of $x,$ can conveniently be referred to as the "constant part of the error". Moreover, it is easily proven that the constant part of the total error is the sum of the constant parts of the simple errors of which it is composed.

If the quantity $k$ is assumed to be known and subtracted from the result of each observation, then, denoting the error of the corrected observation by $x'$ and the corresponding probability by $\varphi '(x'),$ we will have $x'=x-k,$ $\varphi '(x')=\varphi (x),$ and consequently, $\textstyle \int x'\varphi '(x')\,{d}x'=\int x\,\varphi (x)\,{d}x-k\int \varphi (x)\,{d}x=k-k=0,$ i.e. the errors of the corrected observations will have no constant part, which is clear in and of itself.

6. edit

The value of the integral $\int _{-\infty }^{\infty }x\varphi (x)\,{d}x,$ which is the average value of $x,$ reveals the presence or absence of a constant error, as well as the value of this error. Similarly, the integral $\int _{-\infty }^{\infty }x^{2}\varphi (x)\,{d}x,$ which is the average value of $x^{2},$ seems very suitable for defining and measuring, in a general manner, the uncertainty of a system of observations. Therefore, between two systems of observations of unequal precision, the one giving a smaller value to the integral $\int _{-\infty }^{\infty }x^{2}\varphi (x)\,{d}x$ should be considered preferable. If it is argued that this convention is arbitrary and seemingly unnecessary, then we readily agree. The question at hand is inherently vague and can only be delimited by a somewhat arbitrary principle. Determining a quantity through observation can be likened, somewhat accurately, to a game in which there is a loss to be feared and no gain to be expected; each error being likened to a loss incurred, the relative apprehension about such a game should be expressed by the probable loss, i.e., by the sum of the products of the various possible losses by their respective probabilities. But what loss should be likened to a specific error? This is not clear in itself; its determination depends partly on our whim. It is evident, first of all, that the loss should not be regarded as proportional to the error committed; for, in this hypothesis, a positive error representing a loss, the negative error should be regarded as a gain: on the contrary, the magnitude of the loss should be evaluated by a function of the error whose value is always positive. Among the infinite number of functions that fulfill this condition, it seems natural to choose the simplest one, which is undoubtedly the square of the error, and thus we are led to the principle proposed above.

Laplace considered the question in a similar manner, but adopted as a measure of loss the error itself, taken positively. This assumption, if we do not deceive ourselves, is no less arbitrary than ours: should we, indeed, consider a double error as more or less regrettable than a simple error repeated twice, and should we, consequently, assign it a double or more than double importance? This is a question that is not clear, and on which mathematical arguments have no bearing; each must resolve it according to their preference. Nevertheless, it cannot be denied that Laplace's assumption deviates from the law of continuity and is therefore less suitable for analytical study; ours, on the other hand, is recommended by the generality and simplicity of its consequences.

7. edit

Let us define $\int _{-\infty }^{\infty }\varphi (x)\,x^{2}\,{d}x=m^{2}{:}$ we will call $m$ the "mean error to be feared" or simply the "mean error" of the observation whose indefinite errors $x$ have a relative probability of $\phi x.$ We do not limit this designation to the immediate result of the observations, but rather extend it to any quantity that can be derived from them in any way. It is important not to confuse this mean error with the arithmetic mean of the errors, which is discussed in art. 5.

When comparing several systems of observations or several quantities resulting from observations that are not given the same precision, we will consider their relative "weight" to be inversely proportional to $m^{2},$ and their "precision" to be inversely proportional to $m.$ In order to represent the weights by numbers, we should take, as the unit, the weight of a certain arbitrarily chosen system of observations.

8. edit

If the errors of the observations have a constant part, subtracting it from each obtained result reduces the mean error, increases the weight and precision. Retaining the notation of art. 5, and letting $m'$ denote the mean error of the corrected observations, we have

${\begin{aligned}m'^{2}&=\int _{-\infty }^{\infty }x'^{2}\varphi '(x'){d}x=\int _{-\infty }^{\infty }(x-k)^{2}\varphi (x)\,{d}x\\[0.75ex]&=\int _{-\infty }^{\infty }x^{2}\varphi (x)\,{d}x-2k\int _{-\infty }^{\infty }x\,\varphi (x)\,{d}x+k^{2}\int _{-\infty }^{\infty }\varphi (x)\,{d}x\\[0.75ex]&=m^{2}-2k^{2}+k^{2}=m^{2}-k^{2}.\\\end{aligned}}$

If, instead of the constant part $k,$ another number $l$ were subtracted from each observation, the square of the mean error would become

$m^{2}-2kl+l^{2}=m'^{2}+(l-k)^{2}.$

9. edit

Let $\lambda$ be a determined coefficient and let $\mu$ the value of the integral

$\int _{-\lambda m}^{\lambda m}\varphi (x)\,{d}x.$

Then $\mu$ will be the probability that the error of a certain observation is less than $\lambda m$ in absolute value, and $1-\mu$ will be the probability that this error exceeds $\lambda m.$ If, for $\mu ={\tfrac {1}{2}},$ $\lambda m$ has the value $\rho ,$ it will be equally likely for the error to be smaller or larger than $\rho {:}$ thus, $\rho$ can be called the probable error. The relationship between $\lambda$ and $\mu$ depends on the nature of the function $\varphi ,$ which is unknown in most cases. However,i t is interesting to study this relationship in some particular cases.

I. If the extreme limits of the possible errors are $+a$ and $-a,$ and if, between these limits, all errors are equally probable, the function $\varphi (x)$ will be constant between these same limits, and, consequently, equal to ${\tfrac {1}{2a}}.$ Hence, we have $m=a\,{\sqrt {\tfrac {1}{3}}},$ and $\mu =\lambda \,{\sqrt {\tfrac {1}{3}}},$ so long as $\lambda$ is less than or equal to ${\sqrt {3}};$ finally $\rho =m{\sqrt {\tfrac {3}{4}}}=0{,}8660254m,$ and the probability that the error does not exceed the mean error is ${\sqrt {\tfrac {1}{3}}}=0{,}5773503.$

II. If as before $-a$ and $+a$ are the limits of possible errors, and if we assume that the probability of these errors decreases from the error $0$ onwards like the terms of an arithmetic progression, then we will have

\varphi (x)={\frac {a-x}{a^{2}}},

for values of

x

between

0

and

+a,

\varphi (x)={\frac {a+x}{a^{2}}},

for values of

x

between

0

and

-a.

From this, we deduce that $m=a\,{\sqrt {\tfrac {1}{6}}}$ and $\mu =\lambda \,{\sqrt {\tfrac {2}{3}}}-{\tfrac {1}{6}}\lambda ^{2},$ as long as $\lambda$ is between 0 and ${\sqrt {6}};$ $\lambda ={\sqrt {6}}-{\sqrt {6-6\mu }}$ as long as $\mu$ is between 0 and 1; and finally,

$\rho =m\left({\sqrt {6}}-{\sqrt {3}}\right)=0{,}7174389\,m.$

In this case, the probability that the error remains below the mean error is

$={\sqrt {\tfrac {2}{3}}}-{\tfrac {1}{6}}=0{,}6498299.$

III. If we assume the function $\varphi (x)$ to be proportional to $e^{-{\frac {x^{2}}{h^{2}}}},$ then it must be equal to

$\varphi (x)={\frac {e^{-{\frac {x^{2}}{h^{2}}}}}{h\,{\sqrt {\pi }}}};$

where $\pi$ denotes the semiperimeter of a circle of radius $1,$ from which we deduce

$m=h\,{\sqrt {\frac {1}{2}}},$

(see Disquisitiones generales circa seriem infinitam, art. 28). If we let $\Theta z$ denote the value of the integral

${\frac {2}{\sqrt {\pi }}}\int _{0}^{z}e^{-z^{2}}{d}z,$

then we have

$\mu =\Theta \left(\lambda \,{\sqrt {\frac {1}{2}}}\right).$

The following table gives some values of this quantity:

${\begin{array}{c|l}\lambda &{\phantom {0{,}68}}\mu \\\hline 0{,}6744897&0{,}5\\0{,}8416213&0{,}6\\1{,}0000000&0{,}6826895\\1{,}0364334&0{,}7\\1{,}2815517&0{,}8\\1{,}6448537&0{,}9\\2{,}5758293&0{,}99\\3{,}2918301&0{,}999\\3{,}8905940&0{,}9999\\\infty &1\end{array}}$

10. edit

Although the relationship between $\lambda$ and $\mu$ depends on the nature of the function $\varphi ,$ some general results can be established that apply to all cases where this function does not increase with the absolute value of the variable $x;$ then we have the following theorems:

\lambda

will not exceed

\mu {\sqrt {3}}

whenever

\mu

is less than

{\tfrac {2}{3}};

\lambda

will not exceed

{\frac {2}{3{\sqrt {1-\mu }}}}

whenever

\mu

exceeds

{\tfrac {2}{3}};

When $\mu ={\tfrac {2}{3}},$ the two limits coincide and $\lambda$ cannot exceed ${\sqrt {\tfrac {4}{3}}}.$

To prove this remarkable theorem, let $y$ be the value of the integral $\int _{-x}^{+x}\varphi (z)\,{d}z.$ Then $y$ will be the probability that an error is between $-x$ and $+x.$ Let us set

$x=\psi (y),$ ${d}\psi (y)=\psi '(y)\,{d}y,$ ${d}\psi '(y)=\psi ''(y)\,{d}y,$

then we have $\psi (0)=0,$ and

$\psi '(y)={\frac {1}{\varphi (x)+\varphi (-x)}};$

and by hypothesis $\psi '(y)$ is always increasing between $y=0$ and $y=1,$ or at least is not decreasing, or equivalently $\psi ''(y)$ is always positive, or at least not negative. Now we have

${d}y\,\psi '(y)=\psi '(y)\,{d}y+y\,\psi ''(y)\,{d}y,$

thus,

$y\,\psi '(y)-\psi (y)=\int _{0}^{y}y\,\psi ''(y)\,{d}y;$

Therefore, $y\,\psi '(y)-\psi (y)$ always has positive value, or at least this expression is never negative, and therefore

$1-{\frac {\psi (y)}{y\,\psi '(y)}}$

will always be positive and less than unity. Let $f$ be the value of this difference for $y=\mu ;$ since $\psi (\mu )=\lambda \,m,$ we have

$f=1-{\frac {\lambda \,m}{\mu \,\psi '(\mu )}},\quad$ or $\quad \psi '(\mu )={\frac {\lambda \,m}{(1-f)\mu }}.$

This being prepared, let's consider the function

${\frac {\lambda \,m}{(1-f)\mu }}(y-\mu f),$

which we set $={F}(y),$ and also ${d}{.}{F}(y)={F}'(y)\,{d}y.$ Then it is clear that

${\begin{aligned}{F}(\mu )&=\lambda \,m=\psi (\mu ),\\{F}'(\mu )&={\frac {\lambda \,m}{(1-f)\mu }}=\psi '(\mu ).\\\end{aligned}}$

Since $\psi '(y)$ is continually increasing with $y$ (or at least does not decrease, which should always be understood), and at the same time ${F}'(y)$ is constant, the difference

$\psi '(y)-{F}'(y)={\frac {{d}[\psi (y)-{F}(y)]}{{d}y}}$

will be positive for all values of $y$ greater than $\mu ,$ and negative for all values of $y$ smaller than $\mu .$ It follows that the difference $\psi (y)-{F}(y)$ is always positive, and consequently, $\psi (y)$ will certainly be greater than ${F}(y)$ in absolute value, as long as the function ${F}(y)$ is positive, i.e. between $y=\mu f$ and $y=1.$ The value of the integral

$\int _{\mu f}^{1}[{F}(y))^{2}\,{d}y$

will therefore be less than that of the integral

$\int _{\mu f}^{1}[\psi (y)]^{2}\,{d}y,$

and a fortiori less than

$\int _{0}^{1}[\psi (y)]^{2}\,{d}y,$

i.e., less than $m^{2}.$ Now the value of the first of these integrals is found to be

$={\frac {\lambda ^{2}m^{2}(1-\mu f)^{3}}{3\mu ^{2}(1-f)^{2}}}$ ;

and therefore $\lambda ^{2}$ is less than ${\frac {3\mu ^{2}(1-f)^{2}}{(1-\mu f)^{3}}},$ with $f$ being a number between $0$ and $1.$ If we consider $f$ as a variable, then this fraction, whose differential is

$=-{\frac {3\mu ^{2}(1-f)}{(1-\mu f)^{4}}}(2-3\mu +\mu f)\,{d}f\,,$

will be continually decreasing as $f$ increases from $0$ to $1$ so long as $\mu$ is less than ${\tfrac {2}{3}},$ and therefore its maximum value will be found when $f=0$ and will be $=3\mu ^{2},$ so that in this case, the coefficient $\lambda$ will certainly be less, or at least not greater than $\mu {\sqrt {3}}.$ Q.E.P. On the other hand, when $\mu$ is greater than ${\frac {2}{3}},$ the maximum value of the function will be found when $2-3\mu +\mu f=0,$ i.e. for $f=3-{\tfrac {2}{\mu }},$ and this maximum value will be $={\frac {4}{9(1-\mu )}},$ so in this case, the coefficient $\lambda$ will not be greater than ${\frac {2}{3{\sqrt {1-\mu }}}}.$ Q.E.S.

Thus e.g. for $\mu ={\tfrac {1}{2}}$ it is certain that $\lambda$ will not exceed ${\sqrt {\tfrac {3}{4}}},$ which means that the probable error cannot exceed $0{,}8660254m,$ to which it was found to be equal in the first example in art. 9. Furthermore, it is easily concluded from our theorem that $\mu$ is not less than $\lambda {\sqrt {\tfrac {1}{3}}}$ when $\lambda$ is less than ${\sqrt {\tfrac {4}{3}}},$ and on the other hand, it is not less than $1-{\tfrac {4}{9\lambda ^{2}}}$ when $\lambda$ is greater than ${\sqrt {\tfrac {4}{3}}}.$

11. edit

Since several of the problems discussed below involve the integral $\textstyle \int x^{4}\varphi (x)\,{d}x,$ it will be worthwhile for us to evaluate it in some special cases. Let us denote the value of the integral

$\int _{-\infty }^{\infty }x^{4}\varphi (x)\,{d}x$

by $n^{4}.$ I. When $\varphi (x)={\tfrac {1}{2a}}$ for values of $x$ between $-a$ and $+a,$ we have $n^{4}={\tfrac {a^{4}}{5}}={\tfrac {9}{5}}\,m^{4}.$

II. In the second case of art. 9, with $x$ still between $-a$ and $+a,$ we have $n^{4}={\tfrac {1}{15}}\,a^{4}={\tfrac {12}{5}}\,m^{4}.$

III. In the third case, where

$\varphi (x)={\frac {e^{-{\frac {x^{2}}{h^{2}}}}}{h{\sqrt {\pi }}}},$

we find, as explained in the commentary cited above, that $n^{4}={\tfrac {3}{4}}\,h^{4}=3\,m^{4}.$

It can also be demonstrated, with only the assumptions of the previous article, that the ratio ${\tfrac {n^{4}}{m^{4}}}$ is never less than ${\tfrac {9}{5}}.$

12. edit

Let $x,$ $x',$ $x'',$ etc. denote the errors made in observations of the same kind, and suppose that these errors are independent of each other. Let $\varphi (x)$ be the relative probability of error $x,$ and let $y$ be a rational function of variables $x,$ $x',$ $x'',$ etc. Then the multiple integral

(I)	$\int \varphi (x)\,\varphi (x')\,\varphi (x'')\ldots \,{d}x\,{d}x'\,{d}x''\ldots ,$

extended to all values of the variables $x,$ $x',$ $x'',$ etc. for which the value of $y$ falls between the given limits $0$ and $\eta ,$ represents the probability that the value of $y$ is between $0$ and $\eta .$ This integral is evidently a function of $\eta ,$ whose differential we set $=\psi (\eta )\,{d}\eta ,$ so that the integral in question is equal to $\int _{0}^{\eta }\psi (\eta )\,{d}\eta ,$ and therefore, $\psi (\eta )$ represents the relative probability of an arbitrary value of $y.$ Since $x$ can be regarded as a function of the variables $y,$ $x',$ $x'',$ etc., which we set

$=f(y,x',x'',\ldots ),$

the integral (I) will be

$=\int \varphi \left(f(y,x',x'',\ldots )\right){\frac {{d}f(y,x',x'',\ldots )}{{d}y}}\,\varphi (x')\,\varphi (x'')\ldots \,{d}y\,{d}x'\,{d}x''\ldots ,$

where $y$ takes values between $y=0$ and $y=\eta ,$ and the other variables take all values for which $f(y,x',x'',\ldots )$ is real. Hence we have

$\psi (y)=\int \varphi \left[f(y,x',x'',\ldots )\right]{\frac {{d}f(y,x',x'',\ldots )}{{d}y}}\,\varphi (x')\,\varphi (x'')\ldots \,{d}y\,{d}x'\,{d}x''\ldots ;$

the integration, where $y$ is to be regarded as a constant, being extended to all values of the variables $x',$ $x'',$ etc. for which $f(y,x',x'',\ldots )$ takes a real value.

13. edit

The previous integration would require knowledge of the function $\varphi ,$ which is unknown in most cases. Even if this function were known, the calculation would often exceed the capabilities of analysis. Therefore, it will be impossible to obtain the probability of each value of $y;$ but it is different if one asks only for the average value of $y,$ which will be given by the integral $\textstyle \int y\,\psi (y)\,{d}y,$ extended to all possible values of $y.$ And since it is evident that for all values which $y$ cannot attain, either due to the nature of the function (e.g. for negative values, if $y=xx+x'x'+x''x''+$ etc.), or because of the limits imposed on $x,$ $x',$ $x'',$ etc., one can assume that $\psi (y)=0,$ it is clear that the integration can be extended to all real values of $y$ from $-\infty$ to $+\infty .$

But the integral $\int y\,\psi (y)\,{d}y,$ taken between determinate limits $\eta$ and $\eta ',$ is equal to the integral

$\int y\,\varphi \left[f(y,x',x'',\ldots )\right]{\frac {{d}f(y,x',x'',\ldots )}{{d}y}}\varphi (x')\varphi (x'')\ldots {d}y\,{d}x'\,{d}x''\ldots ,$

,

taken from $y=\eta$ to $y=\eta '$ and extended to all values of the variables $x',$ $x'',$ etc. for which $f$ is real. This integral is therefore equal to the integral

$\int y\,\varphi (x)\,\varphi (x')\,\varphi (x'')\ldots \,{d}x\,{d}x'\,{d}x''\ldots ,$

in which $y$ is expressed as a function of $x,$ $x',$ $x'',$ etc., and the integration is extended to all values of the variables that leave $y$ between $\eta$ and $\eta '.$ Thus, the integral}}

$\int _{-\infty }^{\infty }y\,\psi (y)\,{d}y$

can be obtained from the integral

$\int y\,\varphi (x)\,\varphi (x')\,\varphi (x'')\ldots \,{d}x\,{d}x'\,{d}x''\ldots ,$

where the integration is extended to all real values of $x,$ $x',$ $x'',$ that is, from $x=-\infty$ to $x=+\infty ,$ $x'=-\infty$ to $x'=+\infty ,$ etc.

If the function $y$ reduces to a sum of terms of the form

$A\,x^{\alpha }{x'}^{\beta }{x''}^{\gamma }\ldots ,$

then the value of the integral

$\int y\,\psi (y)\,{d}y,$

extended to all values of $y,$ or equivalently the average value of $y,$ will be equal to a sum of terms of the form

$A\int _{-\infty }^{\infty }x^{\alpha }\varphi (x)\,{d}x\cdot \int _{-\infty }^{\infty }{x'}^{\beta }\varphi (x')\,{d}x'\cdot \int _{-\infty }^{\infty }{x''}^{\gamma }\varphi (x'')\,{d}x''\ldots ,$

that is, the average value of $y$ is equal to a sum of terms derived from those that make up $y,$ by replacing $x^{\alpha },$ $x'^{\beta },$ $x''^{\gamma },$ etc. with their average values. The proof of this important theorem could easily be derived from other considerations.

15. edit

Let us apply the theorem of the previous article to the case where

$y={\frac {x^{2}+{x'}^{2}+{x''}^{2}+\ldots }{\sigma }},$

and $\sigma$ denotes the number of terms in the numerator.

We immediately find that the average value of $y$ is equal to $m^{2},$ the letter $m$ having the same meaning as above. The true value of $y$ may be lower or higher than its average, just as the true value of $x^{2}$ may, in each case, be lower or higher than $m^{2};$ but the probability that by chance, the value of $y$ differs by a small amount from $m^{2}$ will approach certainty as $\sigma$ becomes larger. In order to clarify this, since it is not possible to determine this probability exactly, let us investigate the mean error to be feared when $y=m^{2}.$ It is clear from the principles of art. 6 that this error will be the square root of the average value of the function

${\left({\frac {{x}^{2}+{x'}^{2}+{x''}^{2}+\ldots }{\sigma }}-m^{2}\right)}^{2}.$

To find it, it suffices to observe that the average value of a term such as ${\frac {x^{4}}{\sigma ^{2}}}$ is equal to ${\frac {n^{4}}{\sigma ^{2}}}$ ( $n$ having the same meaning as in art. 11), and that the average value of a term such as ${\frac {2{x^{2}}{x'}^{2}}{\sigma ^{2}}}$ is equal to ${\frac {2m^{4}}{\sigma ^{2}}};$ therefore, the average value of this function will be

$={\frac {n^{4}-m^{4}}{\sigma }}.$

Since this last formula contains the quantity $n,$ if we only want to get an idea of the precision of this determination, it will suffice to adopt a certain hypothesis about the function $\varphi .$ E.g. if we take the third assumption of arts. 9 and 11, this error will be equal to $m^{2}{\sqrt {\tfrac {2}{\sigma }}}.$ Alternatively, we can obtain an approximate value of $n^{4}$ by means of the errors themselves, using the formula

${\frac {{x}^{4}+{x'}^{4}+{x''}^{4}+\ldots }{\sigma }}.$

In general, it can be stated that a precision twice as great in this determination will require a quadruple number of errors, meaning that the weight of the determination is proportional to the number $\sigma .$

Similarly, if the errors of the observations contain a constant part, we will deduce from their arithmetic mean a value of the constant part, and this value will be approached as the number of errors increases. In this determination, the mean error to be feared will be represented by ${\sqrt {\frac {m^{2}-k^{2}}{\sigma }}},$ where $k$ denotes the constant part, and $m$ denotes the mean error of the observations uncorrected for their constant error. It will be simply represented by ${\frac {m}{\sqrt {\sigma }}}$ if $m$ represents the mean error of the observations corrected for the constant part (see art 8).

16. edit

In the arts. 12-15, we assumed that the errors $x,$ $x',$ $x'',$ etc. belonged to the same type of observation, so that the probability of each of these errors was represented by the same function. However, it is clear that the general principles outlined in arts. 12-14 can be applied with equal ease in the more general case where the probabilities of the errors $x,$ $x',$ $x'',$ etc., are represented by different functions $\varphi (x),$ $\varphi '(x'),$ $\varphi ''(x'')$ etc., i.e. when these errors belong to observations of varying precision or uncertainty. Let $x$ denote the error of an observation with a mean error to be feared of $m,$ and let $x',$ $x'',$ etc. denote the errors of other observations with mean errors to be feared of $m',$ $m'',$ etc. Then the average value of the sum $x^{2}+{x'}^{2}+{x''}^{2}+$ etc. will be $m^{2}+{m'}^{2}+{m''}^{2}+$ etc. Now, if it is also known that the quantities $m,$ $m',$ $m'',$ etc. are respectively proportional to the numbers $1,$ $\mu ',$ $\mu '',$ etc., then the average value of the expression

${\frac {x^{2}+{x'}^{2}+{x''}^{2}+\ldots }{1+{\mu '}^{2}+{\mu ''}^{2}+\ldots }}$

will be $=m^{2}.$ However, if we adopt for $m^{2}$ the value that this expression will take, by substituting the errors $x,$ $x',$ $x'',$ etc., as chance offers them, then the mean error affecting this determination will become, just as in the preceding article,

${\frac {\sqrt {n^{4}+{n'}^{4}+{n''}^{4}+\ldots -m^{4}-{m'}^{4}-{m''}^{4}-\ldots }}{1+{\mu '}^{2}+{\mu ''}^{2}+\ldots }}$

where $n',$ $n'',$ etc., have the same meaning with respect to the second and third observation, as $n$ does with respect to the first; and if we can assume the numbers $n,$ $n',$ $n'',$ etc., proportional to $m,$ $m',$ $m'',$ etc., this mean error to be feared will be equal to

${\frac {\sqrt {(n^{4}-m^{4})(1+{\mu '}^{4}+{\mu ''}^{4}+\ldots )}}{1+{\mu '}^{2}+{\mu ''}^{2}+\ldots }}$ ;

But this method of determining an approximate value for $m$ is not the most advantageous. Consider the more general expression

$y={\frac {x^{2}+\alpha '{x'}^{2}+\alpha ''{x''}^{2}+\ldots }{1+\alpha '{\mu '}^{2}+\alpha ''{\mu ''}^{2}+\ldots }},$

whose average value will also be $m^{2},$ regardless of the coefficients $\alpha ',$ $\alpha '',$ etc. The mean error to be feared when substituting the value $m^{2}$ for a value of $y,$ as determined by the likelihoods of $x,$ $x',$ $x'',$ etc., will, according to the principles above, be given by the formula

${\frac {\sqrt {(n^{4}-m^{4})+{\alpha '}^{2}({n'}^{4}-{m'}^{4})+{\alpha ''}^{2}({n''}^{4}-{m''}^{4})+\ldots }}{1+\alpha '{\mu '}^{2}+\alpha ''{\mu ''}^{2}+\ldots }}.$

To minimize this error, we must set

${\begin{aligned}\alpha '&={\frac {n^{4}-m^{4}}{{n'}^{4}-{m'}^{4}}}\,{\mu '}^{2},\\\alpha ''&={\frac {n^{4}-m^{4}}{{n''}^{4}-{m''}^{4}}}\,{\mu ''}^{2},\\\,\cdots &\cdots \cdots \cdots \cdots \cdots \cdots \end{aligned}}$

These values cannot be evaluated until the exact ratios ${\tfrac {n}{m}},$ ${\tfrac {n'}{m'}},$ etc. are known. In the absence of exact knowledge^[1], it is safest to assume them equal to each other (see art. 11), in which case

${\begin{aligned}\alpha '&={\frac {1}{{\mu '}^{2}}},&\alpha ''&={\frac {1}{{\mu ''}^{2}}},\dots ,\end{aligned}}$

i.e. the coefficients $\alpha ',$ $\alpha '',$ etc., should be assumed equal to the relative weights of the various observations, taking the weight of the one corresponding to the error $x$ as the unit. With this assumption, let $\sigma$ denote, as above, the number of proposed errors. Then the average value of the expression

${\frac {x^{2}+\alpha '{x'}^{2}+\alpha ''{x''}^{2}+\ldots }{\sigma }}$

will be $=m^{2},$ and when we take, for the true value of $m^{2},$ the randomly determined value of this expression, the mean error to be feared will be

${\frac {\sqrt {n^{4}+{\alpha '}^{2}{n'}^{4}+{\alpha ''}^{2}{n''}^{4}+\ldots -\sigma m^{4}}}{\sigma }},$

and, finally, if we are allowed to assume that the quantities $n,$ $n',$ $n'',$ etc., are proportional to $m,$ $m',$ $m'',$ etc., this expression reduces to

${\sqrt {\frac {n^{4}-m^{4}}{\sigma }}},$

which is identical to what we found in the case where all observations were of the same type.

17. edit

When the value of a quantity, which depends on an unknown magnitude, is determined by an observation whose precision is not absolute, the result of this observation may provide an erroneous value for the unknown, but there is no room for discretion in this determination. But if several functions of the same unknown have been found by imperfect observations, we can obtain the value of the unknown either by any one of these observations, or by a combination of several observations, which can be carried out in infinitely many ways. The result will be subject, in all cases, to a possible error, and depending on the combination chosen, the mean error to be feared may be greater or smaller. The same applies if several observed quantities depend on multiple unknowns. Depending on whether the number of observations equals the number of unknowns, or is smaller or larger than this number, the problem will be determined, undetermined, or more than determined (at least in general), and in this third case, the observations can be combined in infinitely many ways to provide values for the unknowns. Among these combinations, the most advantageous ones must be chosen, i.e., those that provide values for which the mean error to be feared is as small as possible. This problem is certainly the most important one presented by the application of mathematics to natural philosophy.

In Theoria motus corporum coelestium we have shown how to find the most probable values of unknowns when the probability law of the observational errors is known, and since, in almost all cases, this law remains hypothetical by its nature, we have applied this theory to the highly plausible hypothesis that the probability of error $x$ is proportional to $e^{-h^{2}x^{2}}.$ Hence this method that I have followed, especially in astronomical calculations, and which most calculators now use under the name of Method of Least Squares.

Laplace later considered the question from another point of view, and showed that this principle is preferable to all others, regardless of the probability law of the errors, provided that the number of observations is very large. But when this number is limited, the question remains open; so that, if we reject our hypothetical law, the method of least squares would be preferable to others, for the sole reason that it leads to simpler calculations.

We therefore hope to please geometers by demonstrating in this Memoir that the method of least squares provides the most advantageous combination of observations, not only approximately, but also absolutely, regardless of the probability law of errors and regardless of the number of observations, provided that we adopt for the mean error, not Laplace's definition, but the one which we have given in arts. 5 and 6.

It is necessary to warn here that in the following investigations, only random errors reduced by their constant part will be considered. It is up to the observer to carefully eliminate the causes of constant errors. We reserve for another occasion the examination of the case where observations are affected by an unknown constant error, and we will address this issue in another Memoir.

18. edit

Problem. Let ${U}$ be a given function of the unknowns ${V},$ ${V'},$ ${V''},$ etc.; we ask for the mean error ${M}$ to be feared in determining the value of ${U}$ when, instead of the true values of ${V},$ ${V'},$ ${V''},$ etc., we take the values derived from independent observations; $m,$ $m',$ $m'',$ etc., being the mean errors corresponding to these various observations.

Solution. Let $e,$ $e',$ $e'',$ etc. denote the errors of the observed values ${V},$ ${V'},$ ${V''},$ etc.; the resulting error for the value of the function ${U}$ can be expressed by the linear function

$\lambda e+\lambda 'e'+\lambda ''e''+\ldots ={E},$

where $\lambda ,$ $\lambda ',$ $\lambda '',$ etc., represent the derivatives ${\frac {dU}{dV}},$ ${\frac {dU}{dV'}},$ ${\frac {dU}{dV''}},$ etc., when ${V},$ ${V'},$ ${V''},$ etc., are replaced by their true values.

This value of ${E}$ is evident if we assume the observations to be accurate enough so that the squares and products of the errors are negligible. It follows that the average value of ${E}$ is zero, since we assume that the errors of the observations have no constant part. Now the mean error ${M}$ to be feared in the value of ${U}$ will be the square root of the average value of ${E}^{2},$ or equivalently ${M}^{2}$ will be the average value of the sum

$\lambda ^{2}e^{2}+{\lambda '}^{2}{e'}^{2}+{\lambda ''}^{2}{e''}^{2}+\ldots +2\lambda \lambda 'ee'+2\lambda \lambda ''ee''+2\lambda '\lambda ''e'e''+\ldots ;$

but the average value of $\lambda ^{2}e^{2}$ is $\lambda ^{2}m^{2},$ that of ${\lambda '}^{2}{e'}^{2}$ is ${\lambda '}^{2}{m'}^{2}$ , etc., and finally the average values of the products $2\lambda \lambda 'ee'$ are all zero. Hence we find that

${M}={\sqrt {\lambda ^{2}m^{2}+{\lambda '}^{2}{m'}^{2}+{\lambda ''}^{2}{m''}^{2}+\ldots }}.$

It is good to add several remarks to this solution.

I. Since we neglect powers of errors higher than the first, we can, in our formula, take for $\lambda ,$ $\lambda ',$ $\lambda '',$ etc., the values of the differential coefficients ${\frac {dU}{dV}},$ etc., derived from the observed values ${V},$ ${V}',$ ${V}'',$ etc. Whenever ${U}$ is a linear function, this substitution is rigorously exact.

II. If instead of mean errors, one prefers to introduce weights $p,$ $p',$ $p'',$ etc. for the respective observations, with the unit being arbitrary, and ${P}$ being the weight of the value of ${U}.$ Then we will have

${P}={\frac {1}{{\frac {\lambda ^{2}}{p}}+{\frac {{\lambda '}^{2}}{p'}}+{\frac {{\lambda ''}^{2}}{p''}}+\ldots }}\cdot$

III. Let ${T}$ be another function of ${V},$ ${V}',$ ${V}'',$ etc., and let

${\frac {dT}{dV}}=\varkappa ,\;\;{\frac {dT}{{dV}'}}=\varkappa ',\;\;{\frac {dT}{{dV}''}}=\varkappa '',\;\dots$

The error in the determination of ${T},$ from the observed values ${V},$ ${V}',$ ${V}'',$ etc., will be

$\varkappa e+\varkappa 'e'+\varkappa ''e''+\ldots ={E}',$

and the mean error to be feared in this determination will be

${\sqrt {\varkappa ^{2}m^{2}+{\varkappa '}^{2}{m'}^{2}+{\varkappa ''}^{2}{m''}^{2}+\ldots \,}}.$

It is obvious that the errors ${E}$ and ${E'}$ will not be independent of each other, and the mean value of the product ${EE}'$ will not be $=0$ like the mean value of $ee',$ but instead it will be equal to

$\varkappa \lambda m^{2}+\varkappa '\lambda '{m'}^{2}+\varkappa ''\lambda ''{m''}^{2}+\ldots .$

IV. The problem includes the case where the values of the quantities ${V},$ ${V}',$ ${V}'',$ etc., are not immediately given by observation, but are deduced from any combinations of direct observations. For this extension to be legitimate, the determinations of these quantities must be independent, i.e., they must be provided by different observations. If this condition of independence is not fulfilled, the formula giving the value of ${M}$ would no longer be accurate. For example, if the same observation were used both in determining ${V}$ and in determining ${V}',$ the errors $e$ and $e'$ would no longer be independent, and the mean value of the product $ee'$ would no longer be zero. If, in this case, the relationship between ${V}$ and ${V'}$ and the results of the simple observations from which they derive is known, we can calculate the mean value of the product $ee',$ as indicated in remark III, and consequently correct the formula which gives ${M}.$

19. edit

Let ${V},$ ${V'},$ ${V''},$ etc., be functions of the unknowns $x,$ $y,$ $z,$ etc. Let $\pi$ be the number of these functions, and let $\rho$ be the number of unknowns. Suppose that observations have given, immediately or indirectly, $V={L},$ $V'={L'},$ $V''={L''},$ etc., and that these determinations are absolutely independent of each other. If $\rho$ is greater than $\pi ,$ then the determination of the unknowns is an indeterminate problem. If $\rho$ is equal to $\pi ,$ then each of the unknowns $x,$ $y,$ $z,$ etc., can be reduced to a function of ${V},$ ${V'},$ ${V''},$ etc., so that the values of the former can be deduced from the observed values of the latter, and the previous article will allow us to calculate the relative accuracy of these various determinations. If $\rho$ is less than $\pi ,$ then each unknown $x,$ $y,$ $z,$ etc., can be expressed in infinitely many ways as a function of ${V},$ ${V'},$ ${V''},$ etc., and, in general, these values will be different; they should coincide if the observations were, contrary to our assumptions, rigorously accurate. It is clear, moreover, that the various combinations will provide results whose accuracy will generally be different.

Moreover, if, in the second and third cases, the quantities ${V},$ ${V'},$ ${V''},$ etc., are such that $\pi -\rho +1$ of them, or more, can be regarded as functions of the others, the problem is more than determined relative to these latter functions and indeterminate relative to the unknowns $x,$ $y,$ $z,$ etc.; and we could not even determine these latter unknowns, even if the functions ${V},$ ${V'},$ ${V''},$ etc., were exactly known: but we exclude this case from our investigations.

If ${V},$ ${V'},$ ${V''},$ etc., are not linear functions of the unknowns, we can always assign them this form, by replacing the primitive unknowns with their difference from their approximate values, which we assume known; the mean errors to be feared in the determinations

${V}={L},\;\;{V}'={L}',\;\;{V}''={L}'',\ldots$

being respectively denoted by $m,$ $m',$ $m'',$ etc., and the weights of these determinations by $p,$ $p',$ $p'',$ etc., so that

$pm^{2}=p'{m'}^{2}=p''{m''}^{2}=\ldots .$

We will assume that both the ratios of the mean errors and the weights are known, one of which will be arbitrarily chosen. Finally, if we set

${\begin{aligned}\left({V}-{L}\right)\textstyle {\sqrt {p^{\phantom {\prime }}}}&=v,&\left({V}'-{L}'\right)\textstyle {\sqrt {p'}}&=v',\ldots ,\end{aligned}}$

then things will proceed as if immediate observations, equally precise and with mean error $m\textstyle {\sqrt {p^{\phantom {\prime }}}},$ had given

$v=0,\;\;v'=0,\;\;v''=0,\;\ldots .$

20. edit

Problem. Let $v,$ $v',$ $v'',$ etc., be the following linear functions of the unknowns $x,$ $y,$ $z,$ etc.,

(1)

\left\{{\begin{array}{l}{\begin{alignedat}{6}&v&{}={}&ax&{}+{}&by&{}+{}&cz&{}+{}&\ldots &{}+{}&l,\\&v'&{}={}&a'x&{}+{}&b'y&{}+{}&c'z&{}+{}&\ldots &{}+{}&l',\\&v''&{}={}&a''x&{}+{}&b''y&{}+{}&c''z&{}+{}&\ldots &{}+{}&l'',\end{alignedat}}\\\;\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdot \end{array}}\right.

Among all systems of coefficients $\varkappa ,$ $\varkappa ',$ $\varkappa ,$ etc., that identically satisfy

$\varkappa v+\varkappa 'v'+\varkappa ''v''+\ldots =x-k,$

$k$ being independent of $x,$ $y,$ $z,$ etc., find the one for which $\varkappa ^{2}+{\varkappa '}^{2}+{\varkappa ''}^{2}+\ldots$ obtains its minimum value.

Solution. — Let us set

(2)

\left\{{\begin{array}{l}{\begin{alignedat}{4}av&{}+{}&a'v'&{}+{}&a''v''&{}+{}&\ldots &=\xi ,\\bv&{}+{}&b'v'&{}+{}&b''v''&{}+{}&\ldots &=\eta ,\\cv&{}+{}&c'v'&{}+{}&c''v''&{}+{}&\ldots &=\zeta ,\end{alignedat}}\\\;\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdot \cdot \end{array}}\right.

$\xi ,$ $\eta ,$ $\zeta$ are linear functions of $x,$ $y,$ $z,$ and we have

(3)

\left\{{\begin{array}{l}{\begin{alignedat}{5}\xi &=x\Sigma a^{2}&{}+{}&y\Sigma ab&{}+{}&z\Sigma ac&{}+{}&\ldots &{}+{}&\Sigma al,\\[0.75ex]\eta &=x\Sigma ab&{}+{}&y\Sigma b^{2}&{}+{}&z\Sigma bc&{}+{}&\ldots &{}+{}&\Sigma bl,\\[0.75ex]\zeta &=x\Sigma ac&{}+{}&y\Sigma bc&{}+{}&z\Sigma c^{2}&{}+{}&\ldots &{}+{}&\Sigma cl,\end{alignedat}}\\\;\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdot \end{array}}\right.

where $\Sigma a^{2}$ denotes the sum $a^{2}+{a'}^{2}+{a''}^{2}+\ldots ,$ and similarly for the other sums.

The number of quantities $\xi ,$ $\eta ,$ $\zeta ,$ etc., is equal to the number of unknowns $x,$ $y,$ $z,$ etc., namely $\rho$ . Thus, by elimination, one can obtain an equation of the following form,^[2]

$x={A}+(\alpha \alpha )\,\xi +(\alpha \beta )\,\eta +(\alpha \gamma )\,\zeta +\ldots ,$

which will be identically satisfied if we replace $\xi ,$ $\eta ,$ $\zeta$ with their values from (3). Consequently, if we set

(4)

\left\{{\begin{alignedat}{8}&a&&(\alpha \alpha )&{}+{}&b&&(\alpha \beta )&{}+{}&c&&(\alpha \gamma )&{}+{}&\ldots &&=\alpha ,\\&a'&&(\alpha \alpha )&{}+{}&b'&&(\alpha \beta )&{}+{}&c'&&(\alpha \gamma )&{}+{}&\ldots &&=\alpha ',\\&a''&&(\alpha \alpha )&{}+{}&b''&&(\alpha \beta )&{}+{}&c''&&(\alpha \gamma )&{}+{}&\ldots &&=\alpha '',\\\end{alignedat}}\right.

then we will have identically

(5)	$\alpha v+\alpha 'v'+\alpha ''v''+\ldots =x-{A}.$

This equation shows that among the different systems of coefficients $\varkappa ,$ $\varkappa ',$ $\varkappa '',$ etc., we must consider the system

$\varkappa =\alpha ,\;\;\varkappa '=\alpha ',\;\;\varkappa ''=\alpha '',\;\ldots .$

Moreover, for any system, we will have identically

$(\varkappa -\alpha )\,v+(\varkappa '-\alpha ')\,v'+(\varkappa ''-\alpha '')\,v''+\ldots ={A}-k,$

and this equation, being identical, leads to the following:

${\begin{array}{l}{\begin{alignedat}{4}(\varkappa -\alpha )\,a&{}+{}&(\varkappa '-\alpha ')\,a'&{}+{}&(\varkappa ''-\alpha '')\,a''&{}+{}&\ldots &=0,\\(\varkappa -\alpha )\,b&{}+{}&(\varkappa '-\alpha ')\,b'&{}+{}&(\varkappa ''-\alpha '')\,b''&{}+{}&\ldots &=0,\\(\varkappa -\alpha )\,c&{}+{}&(\varkappa '-\alpha ')\,c'&{}+{}&(\varkappa ''-\alpha '')\,c''&{}+{}&\ldots &=0.\end{alignedat}}\\\;\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \end{array}}$

Adding these equations after multiplying them, respectively, by $(\alpha \alpha ),$ $(\alpha \beta ),$ $(\alpha \gamma ),$ etc., we will have, by virtue of the system (4),

$(\varkappa -\alpha )\,\alpha +(\varkappa '-\alpha ')\,\alpha '+(\varkappa ''-\alpha '')\,\alpha ''+\ldots =0,$

which is the same as

$\varkappa ^{2}+{\varkappa '}^{2}+\ldots =\alpha ^{2}+{\alpha '}^{2}+\ldots +{(\varkappa -\alpha )}^{2}+{(\varkappa '-\alpha ')}^{2}+\ldots ;$

thus, the sum

$\varkappa ^{2}+{\varkappa '}^{2}+{\varkappa ''}^{2}+\ldots$

will have its minimum value when $\varkappa =\alpha ,$ $\varkappa '=\alpha ',$ $\varkappa ''=\alpha '',$ etc. Q.E.I.

Moreover, this minimum value will be obtained as follows. Equation (5) shows that we have

${\begin{array}{l}{\begin{alignedat}{4}a\,\alpha &{}+{}&a'\alpha '&{}+{}&a''\alpha ''&{}+{}&\ldots &=1,\\b\,\alpha &{}+{}&b'\alpha '&{}+{}&b''\alpha ''&{}+{}&\ldots &=0,\\c\,\alpha &{}+{}&c'\alpha '&{}+{}&c''\alpha ''&{}+{}&\ldots &=0,\end{alignedat}}\\\;\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdot \end{array}}$

Let's multiply these equations, respectively, by $(\alpha \alpha ),$ $(\alpha \beta ),$ $(\alpha \gamma ),$ etc., and add them; considering the relations (4), we find

$\alpha ^{2}+{\alpha '}^{2}+{\alpha ''}^{2}+\ldots =(\alpha \alpha ).$

21. edit

When the observations have provided approximate equations $v=0,$ $v'=0,$ $v''=0,$ etc., it will be necessary, to determine the unknown $x,$ to choose a combination of the form

$\varkappa v+\varkappa 'v'+\varkappa ''v''+\ldots =0,$

such that the unknown $x$ acquires a coefficient equal to $1$ , and that the other unknowns are eliminated.

According to art. 18, the weight of this determination will be given by

${\frac {1}{\varkappa ^{2}+{\varkappa '}^{2}+{\varkappa ''}^{2}+\ldots }}\cdot$

According to the previous article, the most suitable determination will be obtained by taking $\varkappa =\alpha ,$ $\varkappa '=\alpha ',$ $\varkappa ''=\alpha '',$ etc. Then $x$ will have the value ${A},$ and it is clear the same value would be obtained (without knowing the multipliers $\alpha ,$ $\alpha ',$ $\alpha '',$ etc.), by performing elimination on the equations $\xi =0,$ $\eta =0,$ $\zeta =0,$ etc. The weight of this determination will be given $={\tfrac {1}{(\alpha \alpha )}},$ and the mean error to be feared will be

$=m{\sqrt {p^{\stackrel {}{}}}}\,(\alpha \alpha )=m'{\sqrt {p'}}\,(\alpha \alpha )=m''{\sqrt {p''}}\,(\alpha \alpha )=\ldots .$

A similar approach would lead to the most suitable values of the other unknowns $y,$ $z,$ etc., which would be those obtained by performing eliminating on the equations $\xi =0,$ $\eta =0,$ $\zeta =0,$ etc.

If we denote the sum $v^{2}+{v'}^{2}+{v''}^{2}+\ldots ,$ or equivalently

$p\,\left({V-L}\right)^{2}+p'\,\left({V'-L'}\right)^{2}+p''\,\left({V''-L''}\right)^{2}+\ldots ,$

by $\Omega$ , then it is clear that $2\xi ,$ $2\eta ,$ $2\zeta ,$ etc. will be the partial differential quotients of the function $\Omega ,$ i.e.

${\begin{aligned}2\,\xi &={\frac {{d}\Omega }{{d}x}},&2\,\eta &={\frac {{d}\Omega }{{d}y}},&2\,\zeta &={\frac {{d}\Omega }{{d}z}}.\ldots \end{aligned}}$

Therefore, the values of the unknowns that are deduced from the most suitable combination, and which we can call the most plausible values, are precisely those that minimize $\Omega$ . Now ${V-L}$ represents the difference between the observed value and the computed value. Thus, the most plausible values of the unknowns are those that minimize the sum of the squares of the differences between the calculated and observed values of the quantities ${V},$ ${V'},$ ${V''},$ etc., these squares being respectively multiplied by the weight of the observations. I had established this principle a long time ago through other considerations, in Theoria Motus Corporum Coelestium.

If one wants to assign the relative precision of each determination, it is necessary to deduce the values of $x,$ $y,$ $z,$ etc. from the equations (3), which gives them in the following form:

(7)

\left\{{\begin{alignedat}{5}x&={A}&{}+{}&(\alpha \alpha )&\,\xi &{}+{}(\alpha \beta )&\,\eta &{}+{}(\alpha \gamma )&\,\zeta +\ldots ,\\y&={B}&{}+{}&(\beta \alpha )&\,\xi &{}+{}(\beta \beta )&\,\eta &{}+{}(\beta \gamma )&\,\zeta +\ldots ,\\z&={C}&{}+{}&(\gamma \alpha )&\,\xi &{}+{}(\gamma \beta )&\,\eta &{}+{}(\gamma \gamma )&\,\zeta +\ldots .\end{alignedat}}\right.

Accordingly, the most plausible values of the unknowns $x,$ $y,$ $z,$ etc., will be ${A},$ ${B},$ ${C},$ etc. The weights of these determinations will be ${\tfrac {1}{(\alpha \alpha )}},$ ${\tfrac {1}{(\beta \beta )}},$ ${\tfrac {1}{(\gamma \gamma )}},$ etc. and the mean errors to be feared will be

for

x,\quad m{\sqrt {p^{\stackrel {}{}}}}\,(\alpha \alpha )=m'{\sqrt {p'}}(\alpha \alpha ),\dots ,

for

y,\quad m{\sqrt {p^{\stackrel {}{}}}}\,(\beta \beta )\,=m'{\sqrt {p'}}(\beta \beta ),\dots ,

for

z,\quad m{\sqrt {p^{\stackrel {}{}}}}\,(\gamma \gamma )\,=m'{\sqrt {p'}}(\gamma \gamma ),\dots ,

in agreement with the results obtained in Theoria Motus Corporum Coelestium.

22. edit

The case where there is only one unknown is the most frequent and simplest of all. In this case we have ${V}=x,$ ${V'}=x,$ ${V''}=x,$ etc. We will then have $a=\textstyle {\sqrt {p^{\phantom {\prime }}}},$ $a'=\textstyle {\sqrt {p'}},$ $a''=\textstyle {\sqrt {p''}}$ etc., $l=-{L}{\sqrt {p^{\stackrel {}{}}}},$ $l'=-{L'}{\sqrt {p'}},$ $l''=-{L''}{\sqrt {p''}}$ etc., and consequently,

$\xi =(p+p'+p''+\ldots )\,x-(p{L}+p'{L'}+p''{L''}+\ldots );$

Hence

${\begin{aligned}(\alpha \alpha )&={\frac {1}{p+p'+p''+\ldots }},\\[0.75ex]{A}&={\frac {p{L}+p'{L'}+p''{L''}+\ldots }{p+p'+p''+\ldots }}\cdot \end{aligned}}$

Therefore, if by several observations that do not have the same precision and whose respective weights are $p,$ $p',$ $p'',$ etc., we have found, for the same quantity, a first value ${L},$ a second ${L'},$ a third ${L''},$ etc., then the most plausible value will be

${\frac {p{L}+p'{L'}+p''{L''}+\ldots }{p+p'+p''+\ldots }},$

and the weight of this determination will be $=p+p'+p''+\ldots .$ If all observations are equally plausible, then the most probable value will be

${\frac {L+L'+L''+\ldots }{\pi }},$

i.e. the arithmetic mean of the observed values; taking the weight of an individual observation as the unit, the weight of the average will be $\pi .$

Part Two edit

23. edit

A number of investigations still remain to be discussed, through which the preceding theory will be clarified and extended.

Let us first investigate whether the elimination used to express the variables $x,$ $y,$ $z,$ etc., in terms of $\xi ,$ $\eta ,$ $\zeta ,$ etc., is always possible. Since the number of equations is equal to the number of unknowns, we know that this elimination will be possible if $\xi ,$ $\eta ,$ $\zeta ,$ etc. are independent of each other; otherwise, it is impossible.

Suppose, for a moment, that $\xi ,$ $\eta ,$ $\zeta ,$ etc. are not independent, but rather there exists between these quantities an identical equation

$0={F}\,\xi +{G}\,\eta +{H}\,\zeta +\ldots +{K};$

We will then have

${\begin{array}{l}{\begin{alignedat}{4}&{F}\Sigma a^{2}&{}+{}&{G}\Sigma ab&{}+{}&{H}\Sigma ac&{}+{}&\ldots &{}=0,\\[0.75ex]&{F}\Sigma ab&{}+{}&{G}\Sigma b^{2}&{}+{}&{H}\Sigma bc&{}+{}&\ldots &{}=0,\\[0.75ex]&{F}\Sigma ac&{}+{}&{G}\Sigma bc&{}+{}&{H}\Sigma c^{2}&{}+{}&\ldots &{}=0,\end{alignedat}}\\\;\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \\{\begin{alignedat}{4}&{F}\Sigma al\,\,&{}+{}&{G}\Sigma bl\,&{}+{}&{H}\Sigma cl\,&{}+{}&\ldots &{}=-{K}.\\\end{alignedat}}\end{array}}$

Let us set

(1)

\left\{{\begin{array}{l}{\begin{alignedat}{4}&a&{F}&{}+{}b&{G}{}+{}&c&{H}{}+{}\ldots &{}=\theta ,\\&a'&{F}&{}+{}b'&{G}{}+{}&c'&{H}{}+{}\ldots &{}=\theta ',\\&a''&{F}&{}+{}b''&{G}{}+{}&c''&{H}{}+{}\ldots &{}=\theta '',\end{alignedat}}\\\;\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \\\end{array}}\right.

from which it follows that

${\begin{array}{l}{\begin{alignedat}{4}a\,\theta &{}+{}a'&\theta '&{}+{}a''&\theta ''+\ldots &{}={}0,\\b\,\theta &{}+{}b'&\theta '&{}+{}b''&\theta ''+\ldots &{}={}0,\\c\,\theta &{}+{}c'&\theta '&{}+{}c''&\theta ''+\ldots &{}={}0,\end{alignedat}}\\\;\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdot \cdot \\{\begin{alignedat}{4}l\,\theta &{}+{}l'&\theta '&{}+{}l''&\theta ''+\ldots &{}={}-{K}.\\\end{alignedat}}\end{array}}$

Multiplying the equations (1) resp. by $\theta ,$ $\theta ',$ $\theta '',$ etc., and adding, we obtain

$\theta ^{2}+{\theta '}^{2}+{\theta ''}^{2}+\ldots =0,$

and this equation leads to $\theta =0,$ $\theta '=0,$ $\theta ''=0,$ etc. From this we conclude, first of all, ${K}=0.$ Secondly, the equations (1) show that the functions $v,$ $v',$ $v'',$ etc., are such that their values do not change when the variables $x,$ $y,$ $z,$ etc., increase or decrease proportionally to ${F},$ ${G},$ ${H},$ etc. respectively. It is clear that the same holds for the functions ${V},$ ${V}',$ ${V}'',$ etc.: but this can only happen in the case where it would be impossible to determine $x,$ $y,$ $z,$ etc. using the values of ${V},$ ${V}',$ ${V}'',$ etc., even if these were exactly known; but then the problem would be indeterminate by its nature, and we will exclude this case from our investigations.

24. edit

If $\beta ,$ $\beta ',$ $\beta '',$ etc. denote multipliers playing the same role relative to the unknown $y,$ as the multipliers $\alpha ,$ $\alpha ',$ $\alpha '',$ etc. relative to the unknown $x,$ i.e. so that we have

${\begin{array}{l}{\begin{alignedat}{4}&a&(\beta \alpha )&{}+{}b&(\beta \beta )&{}+{}c&(\beta \gamma )&{}+{}\ldots =\beta ,\\&a'&(\beta \alpha )&{}+{}b'&(\beta \beta )&{}+{}c'&(\beta \gamma )&{}+{}\ldots =\beta ',\\&a''&(\beta \alpha )&{}+{}b''&(\beta \beta )&{}+{}c''&(\beta \gamma )&{}+{}\ldots =\beta '',\end{alignedat}}\\\;\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots ,\end{array}}$

then we will identically have

$\beta v+\beta 'v'+\beta ''v''+\ldots =y-{B}.$

Let $\gamma ,$ $\gamma ',$ $\gamma '',$ etc. be the analogous multipliers relative to the variable $z,$ so that we have:

${\begin{array}{l}{\begin{alignedat}{4}&a&(\gamma \alpha )&{}+{}b&(\gamma \beta )&{}+{}c&(\gamma \gamma )&{}+{}\ldots =\gamma ,\\&a'&(\gamma \alpha )&{}+{}b'&(\gamma \beta )&{}+{}c'&(\gamma \gamma )&{}+{}\ldots =\gamma ',\\&a''&(\gamma \alpha )&{}+{}b''&(\gamma \beta )&{}+{}c''&(\gamma \gamma )&{}+{}\ldots =\gamma '',\end{alignedat}}\\\;\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \end{array}}$

and consequently,

$\gamma v+\gamma 'v'+\gamma ''v''+\ldots =z-{C}.$

In the same way as we found in art. 20 that

${\begin{aligned}\Sigma \alpha a&=1,&\Sigma \alpha b&=0,&\Sigma \alpha c&=0,\ldots ,&\Sigma \alpha l&=-{A},\end{aligned}}$

we will find here

${\begin{aligned}\Sigma \beta \,a&=0,&\Sigma \beta \,b&=1,&\Sigma \beta \,c&=0,\ldots ,&\Sigma \beta \,l&=-{B},\\[1ex]\Sigma \gamma \,a&=0,&\Sigma \gamma \,b&=0,&\Sigma \gamma \,c&=1,\ldots ,&\Sigma \gamma \,l&=-{C}\,;\\\end{aligned}}$

and so on.

We will also have, as in art. 20

${\begin{aligned}\Sigma \alpha ^{2}&=(\alpha \alpha ),&\Sigma \beta ^{2}&=(\beta \beta ),&\Sigma \gamma ^{2}&=(\gamma \gamma ),\ldots .\end{aligned}}$

If we multiply the values $\alpha ,$ $\alpha ',$ $\alpha '',$ etc. (art. 20. (4)), respectively, by $\beta ,$ $\beta ',$ $\beta '',$ etc., and add; we obtain

$\alpha \beta +\alpha '\beta '+\alpha ''\beta ''+\ldots =(\alpha \beta ),$ or $\Sigma \alpha \beta =(\alpha \beta ).$

If we multiply $\beta ,$ $\beta ',$ $\beta '',$ etc., respectively, by $\alpha ,$ $\alpha ',$ $\alpha '',$ etc., and add, we will find

$\alpha \beta +\alpha '\beta '+\alpha ''\beta ''+\ldots =(\beta \alpha ),$ and thus $(\alpha \beta )=(\beta \alpha ).$

In the same manner, we find

${\begin{aligned}(\alpha \gamma )&=(\gamma \alpha )=\Sigma \alpha \gamma ,&(\beta \gamma )&=(\gamma \beta )=\Sigma \beta \gamma ,\ldots .\end{aligned}}$

25. edit

Let $\lambda ,$ $\lambda ',$ $\lambda '',$ etc. denote the values taken by the functions $v,$ $v',$ $v'',$ etc., when $x,$ $y,$ $z,$ etc. are replaced by their most plausible values, ${A},$ ${B},$ ${C},$ etc., i.e.

${\begin{array}{l}{\begin{alignedat}{5}&a&{A}&{}+{}b&{B}&{}+{}c&{C}{}+{}\ldots {}+{}&l&{}={}&\lambda ,\\&a'&{A}&{}+{}b'&{B}&{}+{}c'&{C}{}+{}\ldots {}+{}&l'&{}={}&\lambda ',\\&a''&{A}&{}+{}b''&{B}&{}+{}c''&{C}{}+{}\ldots {}+{}&l''&{}={}&\lambda '',\end{alignedat}}\\\;\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \end{array}}$

If we set

$\lambda ^{2}+{\lambda '}^{2}+{\lambda ''}^{2}+\ldots ={M},$

so that ${M}$ is the value of the function $\Omega$ corresponding to the most plausible values of the variables, and therefore, as was shown in art. 20, the minimum value of $\Omega .$ Then the value of $a\lambda +a'\lambda '+a''\lambda ''+\ldots$ will be $\xi ,$ corresponding to $x={A},$ $y={B},$ $z={C},$ etc.\end{aligned},</math> and this value is zero, according to the way ${A},$ ${B},$ ${C},$ etc. have been obtained. Thus, we have

$\Sigma a\lambda =0;$

and similarly we would obtain

${\begin{aligned}\Sigma b\lambda &=0,&\Sigma c\lambda &=0,\ldots ,\end{aligned}}$

and

${\begin{aligned}\Sigma \alpha \lambda &=0,&\Sigma \beta \lambda &=0,&\Sigma \gamma \lambda &=0,\ldots .\end{aligned}}$

Finally, multiplying the values of $\lambda ,$ $\lambda ',$ $\lambda '',$ etc. respectively by $\lambda ,$ $\lambda ',$ $\lambda '',$ and adding, we get $l\lambda +l'\lambda '+l''\lambda ''+\ldots =\lambda ^{2}+{\lambda '}^{2}+{\lambda ''}^{2}+\ldots ,$ or

$\Sigma l\lambda ={M}.$

26. edit

Replacing $x,$ $y,$ $z,$ etc., with the expressions (7) from art. 21 in the equation $v=ax+by+cz+\ldots +l,$ we find, through the same reductions as before,

${\begin{array}{l}{\begin{alignedat}{5}&v&{}={}&\alpha &\xi {}+{}&\beta &\eta {}+{}&\gamma &\zeta &{}+{}\ldots +\lambda ,\\&v'&{}={}&\alpha '&\xi {}+{}&\beta '&\eta {}+{}&\gamma '&\zeta &{}+{}\ldots +\lambda ',\\&v''&{}={}&\alpha ''&\xi {}+{}&\beta ''&\eta {}+{}&\gamma ''&\zeta &{}+{}\ldots +\lambda '',\end{alignedat}}\\\,\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \end{array}}$

Multiplying either these equations or the equations (1) of art. 20, by $\lambda ,$ $\lambda ',$ $\lambda '',$ etc., and then adding, we obtain the identity

$\lambda v+\lambda 'v'+\lambda ''v''+\ldots ={M}.$

27. edit

The function $\Omega$ can take several forms, which are worth developing.

Let us square the equations (1) art. 20, and add them. Then we find

${\begin{aligned}\Omega &=x^{2}\Sigma a^{2}+y^{2}\Sigma b^{2}+z^{2}\Sigma c^{2}+\ldots +2\,xy\Sigma ab\\[1ex]&+2\,xz\Sigma ac+2\,yz\Sigma bc+\ldots +2\,x\Sigma al+2\,y\Sigma bl\\[1ex]&+2\,z\Sigma cl+\ldots +\Sigma l^{2}:\end{aligned}}$

this is the first form.

Next let us multiply the same equations by $v,$ $v',$ $v'',$ etc. respectively, and add. Then we obtain $\Omega =\xi x+\eta y+\zeta z+\ldots +lv+l'v'+l''v''+\ldots ;$ and replacing $v,$ $v',$ $v'',$ etc., with the values indicated in the previous article, we find that $\Omega =\xi x+\eta y+\zeta z+\ldots -{A}\xi -{B}\eta -{C}\zeta -\ldots +{M},$ or $\Omega =\xi \,(x-{A})+\eta \,(y-{B})+\zeta \,(z-{C})+\ldots +{M}:$ this is the second form.

Finally, replacing, in this second form, $x-{A},$ $y-{B},$ $z-{C},$ etc. by the expressions (7) art. 21, we obtain the 'third form':

${\begin{aligned}\Omega &=(\alpha \alpha )\,\xi ^{2}+(\beta \beta )\,\eta ^{2}+(\gamma \gamma )\,\zeta ^{2}+\ldots +2\,(\alpha \beta )\,\xi \eta \\&+2\,(\alpha \gamma )\,\xi \zeta +2\,(\beta \gamma )\,\eta \zeta +\ldots +{M}.\end{aligned}}$

We can also give a fourth form which results automatically from the third form and the formulas of the previous article:

\Omega ={(v-\lambda )}^{2}+{(v'-\lambda ')}^{2}+{(v''-\lambda '')}^{2}+\ldots +{M},

or

\Omega ={M}+\Sigma {(v-\lambda )}^{2}.

From this last form we clearly see that ${M}$ is the minimum value of $\Omega .$

28. edit

Let $e,$ $e',$ $e'',$ etc., be the errors made in the observations that gave ${V}={L},$ ${V}'={L}',$ ${V}''={L}'',$ etc. Then the true values of the functions ${V},$ ${V}',$ ${V}'',$ etc., will be ${L}-e,$ ${L}'-e',$ ${L}''-e'',$ etc. respectively, and the true values of $v,$ $v',$ $v'',$ etc., will be $-e{\sqrt {p^{\phantom {\prime }}}},$ $-e'{\sqrt {p'}},$ $-e''{\sqrt {p''}},$ etc. respectively. therefore, the true value of $x$ will be

${A}-\alpha e{\sqrt {p^{\begin{array}{l}\\\end{array}}}}-\alpha 'e'{\sqrt {p'}}-\alpha ''e''{\sqrt {p''}}-\ldots ,$

and the error made in the most suitable determination of the unknown $e,$ which we will denote by ${E}x,$ will be

$=\alpha e{\sqrt {p^{\begin{array}{l}\\\end{array}}}}+\alpha 'e'{\sqrt {p'}}+\alpha ''e''{\sqrt {p''}}+\ldots .$

Similarly, the error made in the most suitable determination of the value of $y$ will be

${E}y=\beta e{\sqrt {p^{\phantom {\prime }}}}+\beta 'e'{\sqrt {p'}}+\beta ''e''{\sqrt {p''}}+\ldots .$

The average value of the square ${({E}x)}^{2}$ will be

$m^{2}p\left(\alpha ^{2}+{\alpha '}^{2}+{\alpha ''}^{2}+\ldots \right)=m^{2}p\,(\alpha \alpha ).$

The average value of ${({E}y)}^{2}$ will similarly be $=m^{2}p\,(\beta \beta ),$ as shown above. We can also determine the average value of the product ${E}x\,.\,{E}y,$ which will be

$m^{2}p\,(\alpha \beta +\alpha '\beta '+\alpha ''\beta ''+\ldots )=m^{2}p\,(\alpha \beta ).$

These results can be stated more briefly as follows:

The average values of the squares ${({E}x)}^{2},$ ${({E}y)}^{2},$ etc., are respectively equal to the products of ${\tfrac {1}{2}}m^{2}p$ with the second-order partial differential quotients

${\begin{aligned}{\frac {{d}^{2}\Omega }{{d}\xi ^{2}}}&,&{\frac {{d}^{2}\Omega }{{d}\eta ^{2}}},\ldots ,&\end{aligned}}$

and the average value of a product such as ${E}x.{E}y$ is the product of ${\tfrac {1}{2}}m^{2}p$ with ${\frac {{d}^{2}\Omega }{{d}\xi \,{d}\eta }},$ where $\Omega$ is regarded as a function of $\xi ,$ $\eta ,$ $\zeta ,$ etc.}

29. edit

Let $t$ be a given linear function of the quantities $x,$ $y,$ $z,$ etc., i.e.

$t=fx+gy+hz+\ldots +k\,;$

the value of $t$ deduced from the most plausible values of $x,$ $y,$ $z,$ etc., will then be $=f\,{A}+g\,{B}+h\,{C}+\ldots +k,$ and we denote this by ${K}.$ The error thus committed will be

$=f\,{E}x+g\,{E}y+h\,{E}z+\ldots ,$

which we denote by ${E}t.$ The average value of this error will obviously be zero, meaning the error will not contain a constant part, but the average value of ${({E}t)}^{2},$ i.e., the sum

${\begin{aligned}f^{2}{({E}x)}^{2}&+2fg\,{E}x\,.\,{E}y+2fh\,{E}x\,.\,{E}z+\ldots \\&{\begin{aligned}+\;g^{2}{({E}y)}^{2}&+2gh\,{E}y\,.\,{E}z+\ldots \\&+h^{2}{({E}z)}^{2}+\ldots ,\end{aligned}}\end{aligned}}$

will, according to the preceding article, be equal to the product of $m^{2}p$ with the sum

${\begin{aligned}f^{2}(\alpha \alpha )&+2fg\,(\alpha \beta )+2fh\,(\alpha \gamma )+\ldots \\&{\begin{aligned}+\;g^{2}\,(\beta \beta )&+2gh\,(\beta \gamma )+\ldots \\&+h^{2}\,(\gamma \gamma )+\ldots ,\end{aligned}}\end{aligned}}$

i.e., the product of $m^{2}p$ with the value produced by the function $\Omega -{M}$ when we substitute ${\begin{aligned}\xi &=f,&\eta &=g,&\zeta &=h,\ldots .\end{aligned}}$

If we let $\omega$ denote this value of $\Omega -{M},$ then the mean error to be feared when we take $t={K}$ will be $m{\sqrt {p^{\phantom {\prime }}}}$ and the weight of this determination will be ${\tfrac {1}{\omega }}$ .

Since we have identically

$\Omega -{M}=(x-{A})\,\xi +(y-{B})\,\eta +(z-{C})\,\zeta +\ldots ,$

$\omega$ will be equal to the value of the expression $(x-{A})\,f+(y-{B})\,g+(z-{C})\,h+\ldots$ or the value produced by $(t={K})$ when we substitute for $x,$ $y,$ $z,$ etc. the values corresponding to $\xi =f,$ $\eta =g,$ $\zeta =h,$ etc..

Finally, observing that $t,$ expressed as a function of the quantities $\xi ,$ $\eta ,$ $\zeta ,$ etc., will have ${K}$ as its constant part, if we suppose that

$t={F}\,\xi +{G}\,\eta +{H}\,\zeta +\ldots +{K},$

then we will have

$\omega =f\,{F}+g\,{G}+h\,{H}+\ldots .$

30. edit

We have seen that the function $\Omega$ attains its absolute minimum ${M},$ when we substitute $x={A},$ $y={B},$ $z={C},$ etc. or, equivalently, $\xi =0,$ $\eta =0,$ $\zeta =0,$ etc. If we assign another value to one of the unknowns, e.g. $x={A}+\Delta ,$ while the other unknowns remain variable, $\Omega$ may acquire a relative minimum value, which can be obtained from the equations

${\begin{aligned}x&={A}+\Delta ,&{\frac {{d}\Omega }{{d}y}}&=0,&{\frac {{d}\Omega }{{d}z}}&=0,\ldots ,\end{aligned}}$

Therefore, we must have $\eta =0,$ $\zeta =0,$ etc., and since

$x={A}+(\alpha \alpha )\,\xi +(\alpha \beta )\,\eta +(\alpha \gamma )\,\zeta +\ldots ,$

we have

$\xi ={\frac {\Delta }{(\alpha \alpha )}}\cdot$

Likewise, we have

${\begin{aligned}y&={B}+{\frac {(\alpha \beta )}{(\alpha \alpha )}}\,\Delta ,&z&={C}+{\frac {(\alpha \gamma )}{(\alpha \alpha )}}\,\Delta ,\ldots ,\end{aligned}}$

and the relative minimum value of $\Omega$ will be

$=(\alpha \alpha )\,\xi ^{2}+{M}={M}+{\frac {\Delta ^{2}}{(\alpha \alpha )}}\cdot$

Reciprocally, we conclude that if $\Omega$ is not to exceed ${M}+\mu ^{2},$ then the value of $x$ must necessarily be between the limits ${A}-\mu {\sqrt {(\alpha \alpha )}}$ and ${A}+\mu {\sqrt {(\alpha \alpha )}}.$ It is important to note that $\mu {\sqrt {(\alpha \alpha )}}$ becomes equal to the mean error to be feared in the most plausible value of $x,$ if we set $\mu =m{\sqrt {p^{\phantom {\prime }}}};$ i.e., if $\mu$ is the mean error of observations whose weights are $=1$ .

More generally, let us find the smallest value of the function $\Omega$ that can correspond to a given value of $t,$ where $t$ denotes, as in the previous article, a linear expression $fx+gy+hz+\ldots +k$ whose most plausible value is ${K}$ . Let us denote by the prescribed value of $t$ by ${K}+\kappa .$ According to the theory of maxima and minima, the solution to the problem will be given by the equations

${\begin{aligned}{\frac {{d}\Omega }{{d}x}}&=\theta \,{\frac {{d}t}{{d}x}},\\{\frac {{d}\Omega }{{d}y}}&=\theta \,{\frac {{d}t}{{d}y}},\\{\frac {{d}\Omega }{{d}z}}&=\theta \,{\frac {{d}t}{{d}z}},\\\cdot \cdots &\cdots \cdots \cdot \cdot \end{aligned}}$

or $\xi =\theta \,f,$ $\eta =\theta \,g,$ $\zeta =\theta \,h,$ etc., where $\theta$ denotes an as yet undetermined multiplier. If, as in the previous article, we identically set,

$t={F}\,\xi +{G}\,\eta +{H}\,\zeta +\ldots +{K},$

then we will have

${K}+\kappa =\theta \,(f\,{F}+g\,{G}+h\,{H}+\ldots )+{K},$

or

$\theta ={\frac {\kappa }{\omega }},$

where $\omega$ has the same meaning as in the previous article.

Since $\Omega -{M}$ is a homogeneous function of the second degree with respect to the variables $\xi ,$ $\eta ,$ $\zeta ,$ etc., its value when $\xi =\theta \,f,$ $\eta =\theta \,g,$ $\zeta =\theta \,h,$ etc. will evidently be $=\theta ^{2}\omega ,$ and thus the minimum value of $\Omega ,$ when $t={K}+\kappa ,$ will be $={M}+\theta ^{2}\omega ={M}+{\tfrac {\kappa ^{2}}{\omega }}.$ Reciprocally, if $\Omega$ must remain less than a given value ${M}+\mu ^{2},$ the value of $t$ will necessarily be between the limits ${K}-\mu {\sqrt {\omega ^{\stackrel {}{}}}},$ ${K}+\mu {\sqrt {\omega ^{\stackrel {}{}}}},$ and $\mu {\sqrt {\omega ^{\stackrel {}{}}}}$ will be the mean error to be feared in the most plausible value of $t,$ if $\mu$ represents the mean error of observations whose weights are $=1$ .

31. edit

When the number of unknowns $x,$ $y,$ $z,$ etc. is quite large, the determination of the numerical values of ${A},$ ${B},$ ${C},$ etc. by ordinary elimination is quite tedious. For this reason we have indicated, in Theoria Motus Corporum Coelestium art. 182, and later developed, in Disquisitione de elementis ellipticis Palladis (Comm. recent. Soc. Gotting Vol. I), a method that simplifies this work as much as possible. Namely, the function $\Omega$ must be reduced to the following form:

${\frac {{u^{0}}^{2}}{{\mathfrak {A}}^{0}}}+{\frac {{u'}^{2}}{{\mathfrak {B}}'}}+{\frac {{u''}^{2}}{{\mathfrak {C}}''}}+{\frac {{u'''}^{2}}{{\mathfrak {D}}'''}}+\ldots +{M},$

where the divisors ${\mathfrak {A}}^{0},$ ${\mathfrak {B}}',$ ${\mathfrak {C}}'',$ ${\mathfrak {D}}''',$ etc., are determined quantities; $u^{0},$ $u',$ $u'',$ etc., are linear functions of $x,$ $y,$ $z,$ etc., such that the second $u'$ does not contain $x,$ the third $u''$ contains neither $x$ nor $y,$ the fourth contains neither $x,$ nor $y,$ nor $z,$ and so on, so that the last $u^{(\pi -1)}$ contains only the last of the unknowns $x,$ $y,$ $z,$ etc.; and finally, the coefficients of $x,$ $y,$ $z,$ etc., in $u^{0},$ $u',$ $u'',$ etc., are respectively equal to ${\mathfrak {A}}^{0},$ ${\mathfrak {B}}',$ ${\mathfrak {C}}'',$ etc. Then we set $u^{0}=0,$ $u'=0,$ $u''=0,$ $u'''=0,$ etc. and we will easily obtain the values of $x,$ $y,$ $z,$ etc. by solving these equations, starting with the last one. I do not believe it necessary to repeat the algorithm that leads to the transformation of the function $\Omega$ .

However, the elimination required to find the weights of these determinations requires even longer calculations. We have shown in the Theoria Motus Corporum Coelestium that the weight of the last unknown, (which appears by itself in $u^{(\pi -1)}),$ is equal to the last term in the series of divisors ${\mathfrak {A}}^{0},$ ${\mathfrak {B}}',$ ${\mathfrak {C}}'',$ etc. This is easily found; hence, several calculators, wanting to avoid cumbersome elimination, have had the idea, in the absence of another method, to repeat the indicated transformation by successively considering each unknown as the last one. Therefore, I hope that geometers will appreciate my indication of a new method for calculating the weights of determinations, which seems to leave nothing more to be desired on this point.

Setting

(1)

\left\{{\begin{array}{l}{\begin{alignedat}{6}&u^{0}&{}={}&{\mathfrak {A}}^{0}x&{}+{}&{\mathfrak {B}}^{0}y&{}+{}&{\mathfrak {C}}^{0}z&{}+{}&\ldots &{}+{}&{\mathfrak {L}}^{0},\\&u'&{}={}&&&{\mathfrak {B}}'\,y&{}+{}&{\mathfrak {C}}'\,z&{}+{}&\ldots &{}+{}&{\mathfrak {L}}'\,,\\&u''&{}={}&&&&&{\mathfrak {C}}''z&{}+{}&\ldots &{}+{}&{\mathfrak {L}}'',\end{alignedat}}\\\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \end{array}}\right.

we have identically

${\begin{aligned}{\frac {1}{2}}\,{d}\Omega &=\xi \,{d}x+\eta \,{d}y+\zeta \,{d}z+\ldots \\&=u^{0}\,{\frac {{d}u^{0}}{{\mathfrak {A}}^{0}}}+{\frac {u'\,{d}u'}{{\mathfrak {B}}'}}+{\frac {u''\,{d}u''}{{\mathfrak {C}}''}}+\ldots \\&=u^{0}\left({d}x+{\frac {{\mathfrak {B}}^{0}}{{\mathfrak {A}}^{0}}}\,{d}y+{\frac {{\mathfrak {C}}^{0}}{{\mathfrak {A}}^{0}}}\,{d}z+\ldots \right)\\&+u'\left({d}y+{\frac {{\mathfrak {C}}'}{{\mathfrak {B}}'}}\,{d}z+\ldots \right)\\&+u''\left(\,{d}z+\ldots \,\right)\\&+\ldots \ldots \ldots \ldots ;\end{aligned}}$

and from this we deduce:

(2)

\left\{{\begin{array}{l}{\begin{alignedat}{3}&\xi &{}={}&u^{0},\\[0.25em]&\eta &{}={}&{\frac {{\mathfrak {B}}^{0}}{{\mathfrak {A}}^{0}}}u^{0}&{}+{}&u',\\&\zeta &{}={}&{\frac {{\mathfrak {C}}^{0}}{{\mathfrak {A}}^{0}}}u^{0}&{}+{}&{\frac {{\mathfrak {C}}'}{{\mathfrak {B}}'}}u'&{}+{}&u'',\end{alignedat}}\\\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \end{array}}\right.

The values of $u^{0},$ $u',$ $u'',$ etc. deduced from these equations will be presented in the following form:

(3)	$\left\{{\begin{array}{l}{\begin{alignedat}{3}&u^{0}&{}={}&\xi ,\\&u'&{}={}&{A}'\xi &{}+{}&\eta ,\\&u''&{}={}&{A}''\xi &{}+{}&{B}''\eta &{}+{}&\zeta ,\end{alignedat}}\\\cdots \cdots \cdots \cdots \cdots \cdots \cdots \end{array}}\right.$

By taking the complete differential of the equation

$\Omega =\xi \,(x-{A})+\eta \,(y-{B})+\zeta \,(z-{C})+\ldots +{M},$

we obtain

${\frac {1}{2}}\,{d}\Omega =\xi \,{d}x+\eta \,{d}y+\zeta \,{d}z+\ldots ,$

and thus

${\frac {1}{2}}\,{d}\Omega =(x-{A})\,{d}\xi +(y-{B})\,{d}\eta +(z-{C})\,{d}\zeta +\ldots .$

This expression must be equivaleny to the one obtained from the equations (3),

${\frac {u^{0}}{{\mathfrak {A}}^{0}}}\,{d}\xi +{\frac {u'}{{\mathfrak {B}}'}}\,({A}'\,{d}\xi +{d}\eta )+{\frac {u''}{{\mathfrak {C}}''}}\,({A}''\,{d}\xi +{B}''\,{d}\eta +{d}\zeta )+\ldots ;$

and therefore we have

(4)

\left\{{\begin{array}{l}{\begin{alignedat}{6}x&{}={}&{\frac {u^{0}}{{\mathfrak {A}}^{0}}}&{}+{}&{A}'\,{\frac {u'}{{\mathfrak {B}}'}}&{}+{}&{A}''\,{\frac {u''}{{\mathfrak {C}}''}}&{}+{}&\ldots &{}+{}&{A},\\y&{}={}&&&{\frac {u'}{{\mathfrak {B}}'}}&{}+{}&{B}''\,{\frac {u''}{{\mathfrak {C}}''}}&{}+{}&\ldots &{}+{}&{B},\\z&{}={}&&&&&{\frac {u''}{{\mathfrak {C}}''}}&{}+{}&\ldots &{}+{}&{C},\end{alignedat}}\\\,\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \end{array}}\right.

By substituting in these expressions the values of $u^{0},$ $u',$ and $u'',$ etc. obtained from the equations (3), we will have performed the elimination. For the determination of the weights, we have

(5)

\left\{{\begin{array}{l}{\begin{alignedat}{6}&(\alpha \alpha )&{}={}&{\frac {1}{{\mathfrak {A}}^{0}}}&{}+{}&{\frac {{{A}'}^{2}}{{\mathfrak {B}}'}}&{}+{}&{\frac {{{A}''}^{2}}{{\mathfrak {C}}''}}&{}+{}&{\frac {{{A}'''}^{2}}{{\mathfrak {D}}'''}}&{}+{}&\ldots ,\\&(\beta \beta )&{}={}&&&{\frac {1}{{\mathfrak {B}}'}}&{}+{}&{\frac {{{B}''}^{2}}{{\mathfrak {C}}''}}&{}+{}&{\frac {{{B}'''}^{2}}{{\mathfrak {D}}'''}}&{}+{}&\ldots ,\\&(\gamma \gamma )&{}={}&&&&&{\frac {1}{{\mathfrak {C}}''}}&{}+{}&{\frac {{C'''}^{2}}{{\mathfrak {D}}'''}}&{}+{}&\ldots .\end{alignedat}}\\\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdot \end{array}}\right.

The simplicity of these formulas leaves nothing to be desired. Equally simple formulas could be found to express the other coefficients $(\alpha \beta ),$ $(\alpha \gamma ),$ and $(\beta \gamma ),$ etc.; however, as their use is less frequent, we will refrain from presenting them.

33. edit

The importance of the subject has prompted us to prepare everything for the calculation and to form explicit expressions for the coefficients ${A}',$ ${A}'',$ ${A}'''$ etc., ${B}'',$ ${B}''',$ etc. etc. This calculation can be approached in two ways. The first involves substituting the values of $u^{0},$ $u',$ and so forth, deduced from the equations (3) into the equations (2), and the second involves substituting the values $\xi ,$ $\eta ,$ $\zeta ,$ from the equations (2) into the equations (3). The first method leads to the following formulas:

${\begin{array}{l}{\begin{aligned}{\frac {{\mathfrak {B}}^{0}}{{\mathfrak {A}}^{0}}}&+{A}'=0,\\{\frac {{\mathfrak {C}}^{0}}{{\mathfrak {A}}^{0}}}&+{\frac {{\mathfrak {C}}'}{{\mathfrak {B}}'}}\,{A}'+{A}''=0,\\{\frac {{\mathfrak {D}}^{0}}{{\mathfrak {A}}^{0}}}&+{\frac {{\mathfrak {D}}'}{{\mathfrak {B}}'}}\,{A}'+{\frac {{\mathfrak {D}}''}{{\mathfrak {C}}''}}\,{A}''+{A}'''=0,\end{aligned}}\\\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \end{array}}$

These formulas will determine ${A}',$ ${A}'',$ and so on.

We will then have,

${\begin{array}{l}{\begin{aligned}&{\frac {{\mathfrak {C}}'}{{\mathfrak {B}}'}}+{B}''=0,\\&{\frac {{\mathfrak {D}}'}{{\mathfrak {B}}'}}+{\frac {{\mathfrak {D}}''}{{\mathfrak {C}}''}}\,{B}''+{B}'''=0,\end{aligned}}\\\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \end{array}}$

which will determine ${B}'',$ and so forth; then

${\begin{array}{l}{\frac {{\mathfrak {D}}''}{{\mathfrak {C}}''}}+{C}'''=0,\\\cdots \cdots \cdots \cdots \cdots \end{array}}$

which will determine ${C}'''$ etc., and so on.

The second method yields the following system:

${\mathfrak {A}}^{0}{A}'+{\mathfrak {B}}^{0}=0$

from which we deduce ${A}',$

${\begin{alignedat}{5}{\mathfrak {A}}^{0}&{A}''&{}+{}&{\mathfrak {B}}^{0}&{B}''&&{}+{}&{\mathfrak {C}}^{0}&{}={}&0,\\&&&{\mathfrak {B}}'&{B}''&&{}+{}&{\mathfrak {C}}'&{}={}&0,\end{alignedat}}$

from which we deduce ${B}''$ and ${A}'',$

${\begin{alignedat}{7}{\mathfrak {A}}^{0}&{A}'''&{}+{}&{\mathfrak {B}}^{0}&{B}'''&&{}+{}&{\mathfrak {C}}^{0}&{C}'''&&{}+{}&{\mathfrak {D}}^{0}&{}={}&0,\\&&&{\mathfrak {B}}'&{B}'''&&{}+{}&{\mathfrak {C}}'&{C}'''&&{}+{}&{\mathfrak {D}}'&{}={}&0,\\&&&&&&&{\mathfrak {C}}''&{C}'''&&{}+{}&{\mathfrak {D}}''&{}={}&0,\\\end{alignedat}}$

from which we deduce ${C}''',$ ${B}''',$ ${A}''',$ and so on.

Both systems of formulas offer nearly equal advantages when seeking the weights of the determinations of all unknowns $x,$ $y,$ and so forth; however, if only one of the quantities $(\alpha \alpha ),$ $(\beta \beta ),$ and so forth is required, the first system is much preferable.

Moreover, the combination of equations (1) and (4) yields the same formulas, and provides, in addition, a second way to obtain the most plausible values ${A},$ ${B},$ and so forth, which are

${\begin{array}{l}{\begin{alignedat}{6}{A}&{}={}&&{}-{}&{\frac {{\mathfrak {L}}^{0}}{{\mathfrak {A}}^{0}}}&{}-{}&{A}'{\frac {{\mathfrak {L}}'}{{\mathfrak {B}}'}}&{}-{}&{A}''{\frac {{\mathfrak {L}}''}{{\mathfrak {C}}''}}&{}-{}&{A}'''{\frac {{\mathfrak {L}}'''}{{\mathfrak {D}}'''}}&{}-{}&\ldots ,\\[0.5ex]{B}&{}={}&&&&&{}-{}{\frac {{\mathfrak {L}}'}{{\mathfrak {B}}'}}&{}-{}&{B}''{\frac {{\mathfrak {L}}''}{{\mathfrak {C}}''}}&{}-{}&{B}'''{\frac {{\mathfrak {L}}'''}{{\mathfrak {D}}'''}}&{}-{}&\ldots ,\\[0.5ex]{C}&{}={}&&&&&&&{}-{}{\frac {{\mathfrak {L}}''}{{\mathfrak {C}}''}}&{}-{}&{C}'''{\frac {{\mathfrak {L}}'''}{{\mathfrak {D}}'''}}&{}-{}&\ldots ,\end{alignedat}}\\\;\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \end{array}}$

The other calculation is identical to the ordinary calculation in which it is assumed $u^{0}=0,$ $u'=0,$ $u''=0,$ etc.

34. edit

The results obtained in art. 32 are only particular cases of a more general theorem which can be stated as follows:

Theorem If $t$ represents the following linear function of the unknowns $x,$ $y,$ $z,$ etc.,

$t=fx+gy+hz+\ldots +k,$

whose expression in terms of the variables $u^{0},$ $u',$ $u'',$ etc., is

$t=k^{0}u^{0}+k'u'+k''u''+\ldots +{K},$

then ${K}$ will be the most plausible value of $t,$ and the weight of this determination will be

${\frac {1}{{\mathfrak {A}}^{0}{k^{0}}^{2}+{\mathfrak {B}}'{k'}^{2}+{\mathfrak {C}}''{k''}^{2}+\ldots }}\cdot$

Proof. The first part of the theorem is obvious, since the most plausible value of $t$ must correspond to the values $u^{0}=0,$ $u'=0,$ $u''=0,$ etc.

To demonstrate the second part, let's note that we have

${\begin{alignedat}{6}{\frac {1}{2}}\,{d}\Omega &{}={}&\xi \,&{d}x&&{}+{}&\eta \,&{d}y&&{}+{}&\zeta \,&{d}z+\ldots ,\\{d}t&{}={}&f\,&{d}x&&{}+{}&g\,&{d}y&&{}+{}&h\,&{d}z+\ldots ,\end{alignedat}}$

and consequently, when

${\begin{aligned}\xi &=f,&\eta &=g,&\zeta &=h,\ldots ,\end{aligned}}$

we have

${d}\Omega =2\,{d}t,$

whatever the differentials ${d}x,$ ${d}y,$ ${d}z,$ etc. Hence, assuming always, ${\begin{aligned}\xi &=f,&\eta &=g,&\zeta &=h,\ldots ,\end{aligned}}$ we obtain

${\begin{aligned}&{\frac {u^{0}}{{\mathcal {A}}^{0}}}\,{d}u^{0}+{\frac {u'}{{\mathcal {B}}'}}\,{d}u'+{\frac {u''}{{\mathcal {C}}''}}\,{d}u''+\ldots \\[0.75ex]&{}={}k^{0}\,{d}u^{0}+k'\,{d}u'+k''\,{d}u''+\ldots .\end{aligned}}$

Now it is easily seen that if the differentials ${d}x,$ ${d}y,$ ${d}z,$ etc. are independent of each other, so will be ${d}u^{0},$ ${d}u',$ ${d}u'',$ etc., therefore, we will have,

${\begin{aligned}\xi &=f,&\eta &=g,&\zeta &=h,\ldots ,\\u^{0}&={\mathcal {A}}^{0}k^{0},&u'&={\mathcal {B}}'k',&u''&={\mathcal {C}}''k'',\ldots .\end{aligned}}$

Hence, the value of $\Omega$ corresponding to the same assumptions, will be

${\mathcal {A}}^{0}{k^{0}}^{2}+{\mathcal {B}}'{k'}^{2}+{\mathcal {C}}''{k''}^{2}+\ldots +{M}:$

which, by art. 29, demonstrates the truth of our theorem.

Moreover, if we wish to perform the transformation of the function $t$ without resorting to formulas (4) of art. 32, we immediately have the relations

${\begin{aligned}&f={\mathcal {A}}^{0}k^{0},\\&g={\mathcal {B}}^{0}k^{0}+{\mathcal {B}}'k',\\&h={\mathcal {C}}^{0}k^{0}+{\mathcal {C}}'k'+{\mathcal {C}}''k'',\\&\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdot \cdot \end{aligned}}$

which will allow us to determine $k^{0},$ $k',$ $k'',$ etc., and we will finally have

${K}=-{\mathcal {L}}^{0}k^{0}-{\mathcal {L}}'k'-{\mathcal {L}}''k''-\ldots .$

35. edit

We will particularly address the following problem, both because of its practical utility and the simplicity of the solution:

Find the changes that the most plausible values of the unknowns undergo by adding a new equation, and assign the weights of these new determinations.

Let us keep the previous notations. The primitive equations, reduced to have a weight of unity, will be ${\begin{aligned}v&=0,&v'&=0,&v''&=0,\ldots ;\end{aligned}}$ we will have $\Omega ={v}^{2}+{v'}^{2}+{v''}^{2}+\ldots ,$ and $\xi ,$ $\eta ,$ $\zeta ,$ etc., will be the partial derivatives

${\begin{aligned}{\frac {{d}\Omega ^{\ast }}{2\,{d}x}}&,&{\frac {{d}\Omega ^{\ast }}{2\,{d}y}}&,&{\frac {{d}\Omega ^{\ast }}{2\,{d}z}},\ldots .&\end{aligned}}$

Finally, by elimination, we will have

(1)

\left\{{\begin{array}{l}{\begin{alignedat}{5}x&{}={}{A}&&{}+{}&(\alpha \alpha )&\,\xi {}+{}&(\alpha \beta )&\,\eta {}+{}&(\alpha \gamma )&\,\zeta +\ldots ,\\y&{}={}{B}&&{}+{}&(\alpha \beta )&\,\xi {}+{}&(\beta \beta )&\,\eta {}+{}&(\beta \gamma )&\,\zeta +\ldots ,\\z&{}={}{C}&&{}+{}&(\alpha \gamma )&\,\xi {}+{}&(\beta \gamma )&\,\eta {}+{}&(\gamma \gamma )&\,\zeta +\ldots ,\end{alignedat}}\\\;\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdot \cdot \end{array}}\right.

Now suppose we have a new approximate equation $v^{\ast }=0$ (which we assume to have a weight equal to unity), and we seek the changes undergone by the most plausible values of ${A},$ ${B},$ ${C},$ etc., and of the coefficients $(\alpha \alpha ),$ $(\beta \beta ),$ etc..

Let us set

${\begin{array}{ccc}&\Omega +{v^{\ast }}^{2}=\Omega ^{\ast },&\\[0.75ex]{\dfrac {1}{2}}{\dfrac {{d}\Omega ^{\ast }}{{d}x}}=\xi ^{\ast },&{\dfrac {1}{2}}{\dfrac {{d}\Omega ^{\ast }}{{d}y}}=\eta ^{\ast },&{\dfrac {1}{2}}{\dfrac {{d}\Omega ^{\ast }}{{d}z}}=\zeta ^{\ast },\ldots ,\end{array}}$

and let

$x={A}^{\ast }+(\alpha \alpha ^{\ast })\,\xi ^{\ast }+(\alpha \beta ^{\ast })\,\eta ^{\ast }+(\alpha \gamma ^{\ast })\,\zeta ^{\ast }+\ldots$

be the result of the elimination. Finally, let

$v^{\ast }=fx+gy+hz+\ldots +k,$

which, taking into account the equations (1), becomes

$v^{\ast }={F}\,\xi +{G}\,\eta +{H}\,\zeta +\ldots +{K},$

and let

${F}\,f+{G}\,g+{H}\,h+\ldots =\omega \,;$

It is clear that ${K}$ will be the most plausible value of the function $v^{\ast },$ as resulting from the primitive equations, without considering the value $0,$ provided by the new observation, and ${\tfrac {1}{\omega }}$ will be the weight of this determination.

Now we have

$\xi ^{\ast }=\xi +fv^{\ast },\quad \eta ^{\ast }=\eta +gv^{\ast },\quad \zeta ^{\ast }=\zeta +hv^{\ast },\ldots$

and consequently,

${F}\,\xi ^{\ast }+{G}\,\eta ^{\ast }+{H}\,\zeta ^{\ast }+\ldots +{K}=v^{\ast }(1+{F}\,f+{G}\,g+{H}\,h+\ldots )\,;$

or

$v^{\ast }={\frac {{F}\,\xi ^{\ast }+{G}\,\eta ^{\ast }+{H}\,\zeta ^{\ast }+\ldots +{K}}{1+\omega }}\cdot$

Furthermore,

${\begin{aligned}x&={A}+(\alpha \alpha )\,\xi ^{\ast }+(\alpha \beta )\,\eta ^{\ast }+(\alpha \gamma )\,\zeta ^{\ast }+\ldots \\&\qquad \qquad {}-{}v^{\ast }{\big [}f\,(\alpha \alpha )+g\,(\alpha \beta )+h\,(\alpha \gamma )+\ldots {\big ]}\\[0.75ex]&={A}+(\alpha \alpha )\,\xi ^{\ast }+(\alpha \beta )\,\eta ^{\ast }+\ldots -{F}\,v^{\ast }\\&={A}+(\alpha \alpha )\,\xi ^{\ast }+(\alpha \beta )\,\eta ^{\ast }+\ldots -{\frac {F}{1+\omega }}({F}\xi ^{\ast }+{G}\eta ^{\ast }+{H}\zeta ^{\ast }+\ldots {K}).\end{aligned}}$

From this, we deduce,

${A}^{\ast }={A}-{\frac {FK}{1+\omega }},$

which will be the most plausible value of $x,$ deduced from all observations.

We will also have

$(\alpha \alpha ^{\ast })=(\alpha \alpha )-{\frac {{F}^{2}}{1+\omega }}\,;$

thus

${\frac {1}{(\alpha \alpha )-{\dfrac {{F}^{2}}{1+\omega }}}}$

will be the weight of this determination.

Similarly, for the most plausible value of $y,$ deduced from all observations, we find

${B}^{\ast }={B}-{\dfrac {GK}{1+\omega }};$

the weight of this determination will be

${\frac {1}{{\frac {1}{(\beta \beta )}}-{\dfrac {{G}^{2}}{1+\omega }}}}\,;$

and so on. Q.E.I.

Let us add some remarks.

I. After substituting the new values ${A}^{\ast },$ ${B}^{\ast },$ ${C}^{\ast },$ etc., the function $v^{\ast }$ will obtain the most plausible value

${K}-{\frac {K}{1+\omega }}\,\left({F}\,f+{G}\,g+{H}\,h+\ldots \right)={\frac {K}{1+\omega }},$

and since we have, identically,

$v^{\ast }={\frac {F}{1+\omega }}\,\xi ^{\ast }+{\frac {G}{1+\omega }}\,\eta ^{\ast }+{\frac {H}{1+\omega }}\,\zeta ^{\ast }+\ldots +{\frac {K}{1+\omega }},$

the weight of this determination, according to art. 29, will be

${\frac {1+\omega }{{F}\,f+{G}\,g+{H}\,h+\ldots }}={\frac {1}{\omega }}+1.$

These results could be deduced immediately from the rules explained at the end of art. 21. The original equations had, indeed, provided the determination $v^{\ast }={K},$ whose weight was ${\tfrac {1}{\omega }}.$ A new observation gives another determination $v^{\ast }=0,$ independent of the first, whose weight is $=1,$ and their combination produces the determination $v^{\ast }={\tfrac {K}{1+\omega }}$ with a weight of ${\tfrac {1}{\omega }}+1.$

II. It follows from the above that, for $x={A}^{\ast },$ $y={B}^{\ast },$ $z={C}^{\ast }$ etc. we must have $\xi ^{\ast }=0,$ $\eta ^{\ast }=0,$ $\zeta ^{\ast }=0$ etc., and consequently,

${\begin{aligned}\xi &=-{\frac {f\,{K}}{1+\omega }},&\eta &=-{\frac {g\,{K}}{1+\omega }},&\zeta &=-{\frac {h\,{K}}{1+\omega }},\ldots \\\end{aligned}}$

Furthermore, since

${\begin{aligned}\Omega &=\xi \,(x-{A})+\eta \,(y-{B})+\zeta \,(z-{C})+\ldots +{M},\\\Omega ^{\ast }&=\Omega +{v^{\ast }}^{2},\end{aligned}}$

we must have

${\begin{alignedat}{2}\Omega &={\frac {{K}^{2}}{{(1+\omega )}^{2}}}\,\left({F}\,f+{G}\,g+{H}\,h+\ldots \right)+{M}&&={M}+{\frac {\omega \,{K}^{2}}{{(1+\omega )}^{2}}},\end{alignedat}}$

and

${\begin{alignedat}{2}\Omega ^{\ast }={M}+{\frac {\omega \,{K}^{2}}{{(1+\omega )}^{2}}}+{\frac {{K}^{2}}{{(1+\omega )}^{2}}}&&={M}+{\frac {{K}^{2}}{1+\omega }}\cdot \end{alignedat}}$

III. Comparing these results with those of the art. 30, we see that here the function $\Omega$ has the smallest value it can obtain when subjected to the condition $v^{\ast }={\tfrac {K}{1+\omega }}.$

36. edit

We will give here the solution to the following problem, which is analogous to the previous one, but we will refrain from indicating the demonstration, which can be easily found, as in the previous article.

Find the changes in the most plausible values of the unknowns and the weights of the new determinations when changing the weight of one of the primitive observations.

Suppose that after completing the calculation, it is noticed that the weight which has been assigned to an observation is too strong or too weak, e.g. the first one which gave ${V}={L},$ and that it would be more accurate to assign it the weight $p^{\ast },$ instead of the weight $p.$ It is not necessary to then restart the calculation. Instead it is convenient to form the corrections using the following formulas.

The most plausible values of the unknowns will be corrected as follows:

${\begin{array}{l}{\begin{alignedat}{2}x&={A}-{\frac {(p^{\ast }-p)\,\alpha \lambda }{p+(p^{\ast }-p)(a\,\alpha +b\,\beta +c\,\gamma +\ldots )}},\\[0.5em]y&={B}-{\frac {(p^{\ast }-p)\,\beta \lambda }{p+(p^{\ast }-p)(a\,\alpha +b\,\beta +c\,\gamma +\ldots )}},\\[0.5em]z&={C}-{\frac {(p^{\ast }-p)\,\gamma \lambda }{p+(p^{\ast }-p)(a\,\alpha +b\,\beta +c\,\gamma +\ldots )}},\end{alignedat}}\\\;\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \end{array}}$

and the weights of these determinations will be found upon dividing unity by

${\begin{array}{l}{\begin{alignedat}{2}&(\alpha \alpha )-{\frac {(p^{\ast }-p)\,\alpha ^{2}}{p+(p^{\ast }-p)(a\,\alpha +b\,\beta +c\,\gamma +\ldots )}},\\[0.5em]&(\beta \beta )-{\frac {(p^{\ast }-p)\,\beta ^{2}}{p+(p^{\ast }-p)(a\,\alpha +b\,\beta +c\,\gamma +\ldots )}},\\[0.5em]&(\gamma \gamma )-{\frac {(p^{\ast }-p)\,\gamma ^{2}}{p+(p^{\ast }-p)(a\,\alpha +b\,\beta +c\,\gamma +\ldots )}},\end{alignedat}}\\\;\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \end{array}}$

respectively.

This solution applies in the case where, after completing the calculation, it is necessary to completely reject one of the observations, since this amounts to making $p^{\ast }=0$ ; similarly, $p^{\ast }=\infty$ will be suitable for the case where the equation ${V}={L},$ which in the calculation had been regarded as approximate, is in fact absolutely precise.

If, after completing the calculation, several new equations were to be added to those proposed, or if the weights assigned to several of them were incorrect, the calculation of the corrections becomes too complicated, and it is preferable to start over.

37. edit

In the arts. 15 and 16, we have given a method to approximate the accuracy of a system of observations; but this method assumes that the real errors encountered in a large number of observations are known exactly; however, this condition is rarely fulfilled, if ever.

If the quantities for which the observation provides approximate values depend on one or more unknowns, according to a given law, then the method of least squares allows us to find the most plausible values of these unknowns. If we then calculate the corresponding values of the observed quantities, they can be regarded as differing little from the true values, so that their differences with the observed values will represent the errors committed, with a certainty that will increase with the number observations. This is the procedure followed in practice by calculators, who have attempted, in complicated cases, to retrospectively evaluate the precision of the observations. Although sufficient in many cases, this method is theoretically inaccurate and can sometimes lead to serious errors; therefore, it is very important to treat the issue with more care.

In the following discussion, we retain the notation used in art. 19. The method in question consists of considering ${A},$ ${B},$ ${C},$ etc., as the true values of the unknowns $x,$ $y,$ $z,$ etc., and $\lambda ,$ $\lambda ',$ $\lambda '',$ etc., as those of the functions $v,$ $v',$ $v'',$ etc. If all observations have equal precision and their common weight $p=p'=p''=\ldots$ is taken to be unity, these same quantities, changed in sign, represent, under this assumption, the errors of the observations. Consequently, according to art. 15,

$m={\sqrt {\frac {{\lambda }^{2}+{\lambda '}^{2}+{\lambda ''}^{2}+\ldots }{\pi }}}={\sqrt {\frac {{M}^{\begin{array}{l}\\\end{array}}}{\pi \;}}}$

will be the mean error of the observations. If the observations do not have the same precision, then $-\lambda ,$ $-\lambda ',$ $-\lambda '',$ etc., represent the errors of the observations, respectively multiplied by the square roots of the weights, and the rules of art. 16 lead to the same formula,

${\sqrt {\frac {M}{\pi }}},$

which already expresses the mean error of these observations, when their weight is $=1$ . However, it is clear that an exact calculation would require replacing $\lambda ,$ $\lambda ',$ $\lambda '',$ etc. with the values of $v,$ $v',$ $v'',$ etc., deduced from the true values of the unknowns $x,$ $y,$ $z,$ etc., and replacing the quantity ${M}$ by the corresponding value of $\Omega .$ Although we cannot assign this latter value, we are nonetheless certain that it is greater than ${M}$ (which is its minimum possible value), and it would only reach this limit in the infinitely unlikely case where the true values of the unknowns coincide with the most plausible ones. We can therefore affirm, in general, that the mean error calculated by ordinary practice is smaller than the exact mean error, and consequently, that too much precision is attributed to the observations. Now let us see what a rigorous theory yields.

38. edit

First of all, we need to determine how the quantity ${M}$ depends on the true errors of the observations. As in art. 28, Let us denote these errors by $e,$ $e',$ $e'',$ etc., and let us set, for simplicity,

${\begin{alignedat}{3}e{\sqrt {p^{\phantom {\prime }}}}&=\varepsilon ,\quad &e'{\sqrt {p'}}&=\varepsilon ',\quad &e''{\sqrt {p''}}&=\varepsilon '',\ldots \end{alignedat}}$

and

$m{\sqrt {p^{\phantom {\prime }}}}=m'{\sqrt {p'}}=m''{\sqrt {p''}}=\ldots =\mu .$

Let ${A}-x^{0},$ ${B}-y^{0},$ ${C}-z^{0},$ etc., be the true values of the unknowns $x,$ $y,$ $z,$ etc., for which $\xi ,$ $\eta ,$ $\zeta ,$ etc., are, respectively, $-\xi ^{0},$ $-\eta ^{0},$ $-\zeta ^{0},$ etc. The corresponding values of $v,$ $v',$ $v',$ etc., will obviously be ${\begin{aligned}-\varepsilon &,&-\varepsilon '&,&-\varepsilon '',\ldots ;&\end{aligned}}$ so that we will have

${\begin{array}{l}{\begin{alignedat}{7}\xi ^{0}&{}={}&a\,&\varepsilon &&{}+{}&\,a'\,&\varepsilon '&&{}+{}&\,a''\,&\varepsilon ''&&{}+{}\ldots ,\\\eta ^{0}&{}={}&b\,&\varepsilon &&{}+{}&\,b'\,&\varepsilon '&&{}+{}&\,b''\,&\varepsilon ''&&{}+{}\ldots ,\\\zeta ^{0}&{}={}&c\,&\varepsilon &&{}+{}&\,c'\,&\varepsilon '&&{}+{}&\,c''\,&\varepsilon ''&&{}+{}\ldots ,\end{alignedat}}\\\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \\{\begin{alignedat}{7}x^{0}&{}={}&\alpha \,&\varepsilon &&{}+{}&\alpha '\,&\varepsilon '&&{}+{}&\alpha ''\,&\varepsilon ''&&{}+{}\ldots ,\\y^{0}&{}={}&\beta \,&\varepsilon &&{}+{}&\beta '\,&\varepsilon '&&{}+{}&\beta ''\,&\varepsilon ''&&{}+{}\ldots ,\\z^{0}&{}={}&\gamma \,&\varepsilon &&{}+{}&\gamma '\,&\varepsilon '&&{}+{}&\gamma ''\,&\varepsilon ''&&{}+{}\ldots ,\end{alignedat}}\\\cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \cdots \end{array}}$

Finally,

$\Omega ^{0}=\varepsilon ^{2}+{\varepsilon '}^{2}+{\varepsilon ''}^{2}+\ldots$

will be the value of the function $\Omega ,$ corresponding to the true values of the $x,$ $y,$ $z,$ etc. Since we also have identically

$\Omega ={M}+(x-{A})\,\xi +(y-{B})\,\eta +(z-{C})\,\zeta +\ldots ,$

we will also have

${M}=\Omega ^{0}-x^{0}\xi ^{0}-y^{0}\eta ^{0}-z^{0}\zeta ^{0}-\ldots .$

From this, it is clear that ${M}$ is a homogeneous function of the second degree of the errors $e,$ $e',$ $e'',$ etc.; for various values of the errors this function may become greater or smaller. However, the extent of the errors remains unknown to us, so it is good to carefully examine the function ${M}$ , and to first calculate its average value according to the elementary calculus of probability. We will obtain this average value by replacing the squares $e^{2},$ ${e'}^{2},$ etc. with $m^{2},$ ${m'}^{2},$ etc., and omitting the terms in $ee',$ $ee'',$ etc., whose average value is zero; or equivalently, by replacing each square $\varepsilon ^{2},$ ${\varepsilon '}^{2},$ ${\varepsilon ''}^{2},$ $\ldots$ , by $\mu ^{2},$ and neglecting $\varepsilon \varepsilon ',$ $\varepsilon \varepsilon '',$ $\ldots$ . Accordingly, the term $\Omega ^{0}$ will provide $\pi \mu ^{2}$ ; the term $-x^{0}\xi ^{0}$ will produce

$-(a\,\alpha +a'\,\alpha '+a''\,\alpha ''+\ldots )\,\mu ^{2}=-\mu ^{2};$

each of the other terms will also give $-\mu ^{2},$ so that the total average value will be $=(\pi -\rho )\,\mu ^{2},$ where $\pi$ denotes the number of observations, and $\rho$ denotes the number of unknowns. Due to errors offered by chance, the true value of ${M}$ may be greater or smaller than this average value, but the difference decrease as the number of observations increases, so that

${\sqrt {\frac {M}{\pi -\rho }}}$

can be regarded as an approximate value of $\mu .$ Consequently, the value of $\mu$ provided by the erroneous method we discussed in the previous article, must be increased by the ratio of ${\sqrt {\pi -\rho }}$ to ${\sqrt {\pi }}.$

39. edit

To clearly understand the extent to which it is permissible to consider the value of ${M}$ provided by the observations as equal to the exact value, we must seek the mean error to be feared when ${\tfrac {M}{\pi -\rho }}=\mu ^{2}.$ This mean error is the square root of the average value of the quantity

${\left({\frac {\Omega ^{0}-x^{0}\xi ^{0}-y^{0}\eta ^{0}-z^{0}\zeta ^{0}-\ldots -(\pi -\rho )\,\mu ^{2}}{\pi -\rho }}\right)}^{2},$

which we will write as:

${\begin{aligned}&{\left({\frac {\Omega ^{0}-x^{0}\xi ^{0}-y^{0}\eta ^{0}-z^{0}\zeta ^{0}-\ldots }{\pi -\rho }}\right)}^{2}\\-{\frac {2\mu ^{2}}{\pi -\rho }}&\left[\Omega ^{0}-x^{0}\xi ^{0}-y^{0}\eta ^{0}-z^{0}\zeta ^{0}-\ldots -(\pi -\rho )\,\mu ^{2}\right]-\mu ^{2},\end{aligned}}$

and since the average value of the second term is evidently zero, the question reduces to finding the average value of the function

$\Psi =\left(\Omega ^{0}-x^{0}\xi ^{0}-y^{0}\eta ^{0}-z^{0}\zeta ^{0}-\ldots \right)^{2}.$

If we denote this average value by ${N},$ then the mean error we seek will be ${\sqrt {{\frac {N}{{(\pi -\rho )}^{2}}}-\mu ^{4}}}.$

Expanding the function $\Psi ,$ we see that it is a homogeneous function of the errors $e,$ $e',$ $e'',$ etc., or equivalently, of the quantities $\varepsilon ,$ $\varepsilon ',$ $\varepsilon '',$ etc.; therefore, we will find the average value by:

1. Replacing the fourth powers $e^{4},$ ${e'}^{4},$ ${e''}^{4},$ etc., by their average values;

2. Replacing the products $e^{2}{e'}^{2},$ ${e}^{2}{e''}^{2},$ etc., by their average values, that is, by $m^{2}{m'}^{2},$ ${m'}^{2}{m''}^{2},$ etc.;

3. Neglecting products such as $e^{3}{e'},$ $e^{2}{e'}{e''},$ etc.. We will assume (see art. 16) that the average values of $e^{4},$ ${e'}^{4},$ ${e''}^{4},$ etc., are proportional to $m^{4},$ ${m'}^{4},$ ${m''}^{4},$ etc., so that the ratios of one to another are ${\tfrac {\nu ^{4}}{\mu ^{4}}},$ where $\nu ^{4}$ denotes the average value of the fourth powers of the errors for observations whose weight is $=1$ . Thus the previous rules could also be expressed as follows: Replace each fourth power $\varepsilon ^{4},$ ${\varepsilon '}^{4},$ ${\varepsilon ''}^{4},$ etc., by $\nu ^{4};$ each product $\varepsilon ^{2}{\varepsilon '}^{2},$ ${\varepsilon }^{2}{\varepsilon ''}^{2},$ etc., by $\mu ^{4},$ and neglect all terms such as $\varepsilon ^{3}{\varepsilon '}$ or $\varepsilon ^{2}{\varepsilon '}{\varepsilon ''},$ $\varepsilon \varepsilon '\varepsilon ''\varepsilon '''.$

These principles being understood, it is easy to see that:

I. The average value of ${\Omega ^{0}}^{2}$ is

$\pi \nu ^{4}+(\pi ^{2}-\pi )\,\mu ^{4}.$

II. The average value of the product $\varepsilon ^{2}\,x^{0}\,\xi ^{0}$ is

$a\,\alpha \nu ^{4}+(a'\alpha '+a''\alpha ''+\ldots )\,\mu ^{4}=a\,\alpha \,(\nu ^{4}-\mu ^{4})+\mu ^{4},$

because

$a\,\alpha +a'\alpha '+a''\alpha ''+\ldots =1.$

Similarly, the average value of ${\varepsilon '}^{2}\,x^{0}\,\xi ^{0}$ is

$a'\alpha '\,(\nu ^{4}-\mu ^{4})+\mu ^{4};$

the average value of ${\varepsilon ''}^{2}\,x^{0}\,\xi ^{0}$ is

$a''\alpha ''\,(\nu ^{4}-\mu ^{4})+\mu ^{4}\,;$

and so on. Thus the average value of the product

$(\varepsilon ^{2}+{\varepsilon '}^{2}+{\varepsilon ''}^{2}+\ldots )\,x^{0}\,\xi ^{0}\quad$ or $\quad \Omega ^{0}\,x^{0}\,\xi ^{0}$

will be

$\nu ^{4}-\mu ^{4}+\pi \mu ^{4}.$

The products $\Omega ^{0}\,y^{0}\,\eta ^{0}$ or $\Omega ^{0}\,z^{0}\,\zeta ^{0},$ etc., will have the same average value. Thus the product

$\Omega ^{0}\,\left(x^{0}\xi ^{0}+y^{0}\eta ^{0}+z^{0}\zeta ^{0}+\ldots \right)$

will have an average value of

$\rho \nu ^{4}+\rho \,(\pi -1)\,\mu ^{4}.$

III. To shorten the following developments, we will adopt the following notation. We give the character $\Sigma$ a more extended meaning than we have done so far, by making it designate the sum of similar but not identical terms arising from all permutations of the observations. According to this notation, we will have

${\begin{alignedat}{2}&x^{0}&{}={}&\Sigma \alpha \varepsilon ,\\[1ex]&{x^{0}}^{2}&{}={}&\Sigma \alpha ^{2}\varepsilon ^{2}+2\Sigma \alpha \alpha '\varepsilon \varepsilon '.\end{alignedat}}$

Calculating the average value of ${x^{0}}^{2}\,{\xi ^{0}}^{2}$ term by ter, we first have, for the average value of the product $\alpha ^{2}\,\varepsilon ^{2}\,{\xi ^{0}}^{2},$

${\begin{aligned}&a^{2}\alpha ^{2}\nu ^{4}+\alpha ^{2}\left({a'}^{2}+{a''}^{2}+\ldots \right)\mu ^{4}\\&=a^{2}\alpha ^{2}\,(\nu ^{4}-\mu ^{4})+\alpha ^{2}\mu ^{4}\Sigma a^{2}.\end{aligned}}$

Similarly, the average value of the product ${\alpha '}^{2}\,{\varepsilon '}^{2}\,{\xi ^{0}}^{2}$ is

${a'}^{2}{\alpha '}^{2}\,(\nu ^{4}-\mu ^{4})+{\alpha '}^{2}\mu ^{4}\Sigma a^{2}\,;$

and so on. Therefore, the average value of the product

${\xi ^{0}}^{2}\Sigma \alpha ^{2}\varepsilon ^{2}$

is

$(\nu ^{4}-\mu ^{4})\Sigma a^{2}\alpha ^{2}+\mu ^{4}\Sigma a^{2}\cdot \Sigma \alpha ^{2}.$

Now the average value of $\alpha \alpha '\varepsilon \varepsilon '{\xi ^{0}}^{2}$ is

$2\,\alpha \alpha 'aa'\mu ^{4}.$

The average value of $\alpha \alpha ''\varepsilon \varepsilon ''{\xi ^{0}}^{2}$ is

$2\,\alpha \alpha ''aa''\mu ^{4}\,;$

and so on. Hence, we easily conclude that the average value of the product

${\xi ^{0}}^{2}\Sigma \alpha \alpha '\varepsilon \varepsilon '$

is

$2\,\mu ^{4}\Sigma \alpha \,a\,\alpha 'a'=\mu ^{4}\left[{\left(\Sigma a\,\alpha \right)}^{2}-\Sigma a^{2}\alpha ^{2}\right]=\mu ^{4}\left(1-\Sigma a^{2}\alpha ^{2}\right).$

Thus, for the average value of the product ${x^{0}}^{2}\,{\xi ^{0}}^{2},$ we have

$\left(\nu ^{4}-3\,\mu ^{4}\right)\Sigma a^{2}\alpha ^{2}+2\,\mu ^{4}+\mu ^{4}\Sigma a^{2}\cdot \Sigma \alpha ^{2}.$

IV. Similarly, for the average value of the product $x^{0}y^{0}\xi ^{0}\eta ^{0},$ we find

$\nu ^{4}\Sigma ab\,\alpha \beta +\mu ^{4}\Sigma a\,\alpha \,b'\beta '+\mu ^{4}\Sigma ab\,\alpha '\beta '+\mu ^{4}\Sigma a\,\beta \,b'\alpha '.$

Now, we have

${\begin{array}{c}{\begin{alignedat}{4}&\Sigma a\,\alpha \,b'\beta '&{}={}&\Sigma a\,\alpha &{}\cdot {}&\Sigma b\,\beta &{}-{}&\Sigma a\,\alpha \,b\,\beta ,\\[1ex]&\Sigma a\,b\,\alpha '\beta '&{}={}&\Sigma a\,b&{}\cdot {}&\Sigma \alpha \,\beta &{}-{}&\Sigma a\,b\,\alpha \,\beta ,\\[1ex]&\Sigma a\,\beta \,b'\alpha '&{}={}&\Sigma a\,\beta &{}\cdot {}&\Sigma b\,\alpha &{}-{}&\Sigma a\,\beta \,b\,\alpha ,\end{alignedat}}\\[1ex]{\begin{aligned}\Sigma a\,\alpha &=1,&\Sigma b\,\beta &=1,&\Sigma a\,\beta &=0,&\Sigma b\,\alpha &=0\,,\end{aligned}}\end{array}}$

so this average value will be

$\left(\nu ^{4}-3\,\mu ^{4}\right)\Sigma ab\,\alpha \beta +\mu ^{4}\left(1+\Sigma ab\cdot \Sigma \alpha \beta \right).$

V. By a similar calculation, we find that the average value of $x^{0}z^{0}\xi ^{0}\zeta ^{0}$ is

$\left(\nu ^{4}-3\,\mu ^{4}\right)\Sigma ac\,\alpha \gamma +\mu ^{4}\left(1+\Sigma ac\cdot \Sigma \alpha \gamma \right);$

and so on. Adding up, we obtain the average value of the product

$x^{0}\xi ^{0}(x^{0}\xi ^{0}+y^{0}\eta ^{0}+z^{0}\zeta ^{0}+\ldots )\,;$

this value is

${\begin{aligned}(&\nu ^{4}-3\,\mu ^{4})\Sigma \left[a\,\alpha \,(a\,\alpha +b\,\beta +c\,\gamma +\ldots )\right]\\&+(\rho +1)\,\mu ^{4}+\mu ^{4}\left(\Sigma a^{2}\cdot \Sigma \alpha ^{2}+\Sigma ab\cdot \Sigma \alpha \beta +\Sigma ac\cdot \Sigma \alpha \gamma +\ldots \right)\\[0.75ex]&=(\nu ^{4}-3\,\mu ^{4})\Sigma \left[a\,\alpha \,(a\,\alpha +b\,\beta +c\,\gamma +\ldots )\right]+(\rho +2)\,\mu ^{4}.\end{aligned}}$

VI. Similarly, we find that

$(\nu ^{4}-3\,\mu ^{4})\Sigma \left[b\,\beta \,(a\,\alpha +b\,\beta +c\,\gamma +\ldots )\right]+(\rho +2)\,\mu ^{4}$

is the average value of the product

$y^{0}\eta ^{0}(x^{0}\xi ^{0}+y^{0}\eta ^{0}+z^{0}\zeta ^{0}+\ldots ),$

and

$(\nu ^{4}-3\,\mu ^{4})\Sigma \left[c\,\gamma \,(a\,\alpha +b\,\beta +c\,\gamma +\ldots )\right]+(\rho +2)\,\mu ^{4}$

is the average value of the product

$z^{0}\zeta ^{0}(x^{0}\xi ^{0}+y^{0}\eta ^{0}+z^{0}\zeta ^{0}+\ldots )\,;$

and so on.

Hence by addition we find the average value of the square

${(x^{0}\xi ^{0}+y^{0}\eta ^{0}+z^{0}\zeta ^{0}+\ldots )}^{2};$

which is

$(\nu ^{4}-3\,\mu ^{4})\Sigma \left[{(a\,\alpha +b\,\beta +c\,\gamma +\ldots )}^{2}\right]+(\rho ^{2}+2\,\rho )\,\mu ^{4}.$

VII. Finally, from all these preliminaries, we conclude that

${\begin{aligned}{N}&{}={}(\pi -2\,\rho )\,\nu ^{4}+\left(\pi ^{2}-\pi -2\,\pi \rho +4\,\rho +\rho ^{2}\right)\mu ^{4}\\&{}+{}\left(\nu ^{4}-3\,\mu ^{4}\right)\Sigma \left[{(a\,\alpha +b\,\beta +c\,\gamma +\ldots )}^{2}\right]\\[0.75ex]&{}={}(\pi -\rho )\left(\nu ^{4}-\mu ^{4}\right)+{(\pi -\rho )}^{2}\,\mu ^{4}\\&{}-{}\left(\nu ^{4}-3\,\mu ^{4}\right)\left[\rho -\Sigma \left(a\,\alpha +b\,\beta +c\,\gamma +\ldots \right)^{2}\right].\end{aligned}}$

Therefore, the mean error to be feared when

$\mu ^{2}={\frac {M}{\pi -\rho }},$

will be

${\sqrt {{\frac {\nu ^{4}-\mu ^{4}}{\pi -\rho }}-{\frac {\nu ^{4}-3\,\mu ^{4}}{{(\pi -\rho )}^{2}}}\left[\rho -\Sigma {(a\,\alpha +b\,\beta +c\,\gamma +\ldots )}^{2}\right]}}.$

40. edit

The quantity

$\Sigma \left[{(a\,\alpha +b\,\beta +c\,\gamma +\ldots )^{2}}\right]$

which occurs in the expression above, generally cannot be reduced to a simpler form. However, we can assign two limits between which its value must necessarily lie. First, It is easily deduced from the previous relations that

${\begin{aligned}&{(a\,\alpha +b\,\beta +c\,\gamma +\ldots )}^{2}+{(a\,\alpha '+b\,\beta '+c\,\gamma '+\ldots )}^{2}\\+\;&{(a\,\alpha ''+b\,\beta ''+c\,\gamma ''+\ldots )}^{2}+\ldots =a\,\alpha +b\,\beta +c\,\gamma +\ldots ;\end{aligned}}$

from which we conclude that

$a\,\alpha +b\,\beta +c\,\gamma +\ldots$

is a positive quantity smaller than unity, or at least not larger. The same will be true for the quantity

$a'\alpha '+b'\beta '+c'\gamma '+\ldots ,$

which is equal to the sum

${\begin{aligned}(a'\alpha +b'\beta &+c'\gamma +\ldots )^{2}+(a'\alpha '+b'\beta '+c'\gamma '+\ldots )^{2}\\&+{(a'\alpha ''+b'\beta ''+c'\gamma ''+\ldots )}^{2}+\ldots \end{aligned}}$

Similarly,

$a''\alpha ''+b''\beta ''+c''\gamma ''+\ldots$

will be smaller than unity; and so on. Therefore,

$\Sigma \left[{(a\,\alpha +b\,\beta +c\,\gamma +\ldots )^{2}}\right]$

must be smaller than $\pi .$ Second, we have

$\Sigma (a\,\alpha +b\,\beta +c\,\gamma +\ldots )=\rho ,$

since

${\begin{aligned}\Sigma a\,\alpha &=1,&\Sigma b\,\beta &=1,&\Sigma c\,\gamma &=1,\ldots ,\end{aligned}}$

from which it is easily deduced that

$\Sigma \left[{(a\,\alpha +b\,\beta +c\,\gamma +\ldots )^{2}}\right]$

is greater, or at least not smaller, than ${\frac {\rho ^{2}}{\pi }}.$ Therefore, the term

${\frac {\nu ^{4}-3\,\mu ^{4}}{{(\pi -\rho )}^{2}}}\left[\rho -\Sigma {(a\,\alpha +b\,\beta +c\,\gamma +\ldots )^{2}}\right]$

must necessarily lie between the limits

$-{\frac {\nu ^{4}-3\,\mu ^{4}}{\pi -\rho }}\quad$ and $\quad {\frac {\nu ^{4}-3\,\mu ^{4}}{\pi -\rho }}\,{\frac {\rho }{\pi }},$

or, between the broader limits

$-{\frac {\nu ^{4}-3\,\mu ^{4}}{\pi -\rho }}\quad$ and $\quad {\frac {\nu ^{4}-3\,\mu ^{4}}{\pi -\rho }}\cdot$

Thus, the square of the mean error to be feared for the value

$\mu ^{2}={\frac {M}{\pi -\rho }}$

lies between the limits

${\frac {2\,\nu ^{4}-4\,\mu ^{4}}{\pi -\rho }}\quad$ and $\quad {\frac {2\,\mu ^{4}}{\pi -\rho }}\,,$

so that a degree of precision as great as desired can be achieved, provided the number of observations is sufficiently large.

It is very remarkable that in hypothesis III of art. 9, on which we had formerly relied to establish the theory of least squares, the second term of the square of the average error completely disappears (since $\nu ^{4}-3\,\mu ^{4}=0$ ); and because, to find the approximate value $\mu$ of the average error of the observations, it is always necessary to treat the sum

${\lambda }^{2}+{\lambda '}^{2}+{\lambda ''}^{2}+\ldots ={M},$

as if it were equal to the sum of the squares of $\pi -\rho$ random errors, it follows that, in this hypothesis, the precision of this determination becomes equal to that which we found, in art. 15, for the determination from $\pi -\rho$ true errors.

↑ The exact determination of $\mu ',$ $\mu '',$ etc., is conceivable only in the case where, by the nature of the matter, the errors $x,$ $x',$ $x'',$ etc. proportional to $1,$ $\mu ',$ $\mu '',$ etc., are considered equally probable, or rather in the case where

$\varphi (x)=\mu '\varphi '(\mu 'x)=\mu ''\varphi ''(\mu ''x)\,\ldots$
↑ We will later explain the reasoning that led us to denote the coefficients of this formula by the notation $(\alpha \alpha ),$ $(\alpha \beta ),$ etc..

[1] The exact determination of $\mu ',$ $\mu '',$ etc., is conceivable only in the case where, by the nature of the matter, the errors $x,$ $x',$ $x'',$ etc. proportional to $1,$ $\mu ',$ $\mu '',$ etc., are considered equally probable, or rather in the case where

$\varphi (x)=\mu '\varphi '(\mu 'x)=\mu ''\varphi ''(\mu ''x)\,\ldots$

[2] We will later explain the reasoning that led us to denote the coefficients of this formula by the notation $(\alpha \alpha ),$ $(\alpha \beta ),$ etc..

[1]

[2]