The Sherman Principle in Rhetoric and its Restrictions
By Robert Edouard Moritz
Popular Science Monthly, Volume 63, October 1903




FIFTEEN years ago, Professor L. A. Sherman, of the University of Nebraska, while investigating the sentence-lengths used by early and modern English writers, noticed that in the works which he examined each author manifested an average sentence-length, which he inferred to be characteristic of the author. Consecutive hundreds of periods were averaged with respect to the number of words per sentence and the mean of five or more of these averages was taken to represent approximately the average sentence-length used by the author. It was found that the averages for separate hundreds generally varied by less than 20 per cent. from the total average of 500 to 1,000 sentences. The 2,225 periods of De Quincey's 'Opium-Eater' averaged 33.65 ± 6.64 words per sentence, where the number 6.64 indicates the largest number of words by which the averages of individual hundreds differ from the average 33.65. Similarly, 722 periods from Macaulay's 'Essay on History' yielded 23 ± 3.35 words per sentence, 750 periods from Channing's 'Self-Culture' 25.35 ± 1.45, 732 periods from Emerson's 'American Scholar' and the 'Divinity School Address' 20.71 ± 2.65 and 805 periods from Bartol's 'Radicalism' and 'Father Taylor' gave an average of 16.63 ± 2.35 words per sentence. These results led to the suspicion that stylists are 'subject to a rigid rhythmic law from which even by the widest range and variety of sentence length and form they may not escape.' Averages from other authors were made with similar results. A culminating test was furnished by actually counting the words in each of the 41,500 periods in the five volumes of Macaulay's 'History of England' with the resulting average of 23.43 ± 7.11. The conclusion was that writers who have achieved a style are governed by a constant sentence rhythm, which will generally be revealed by an examination of 300 periods.
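The averaging procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `hundreds_summary` and the sample word counts are hypothetical, and the sketch assumes that the number of words in each sentence has already been counted.

```python
from statistics import mean

def hundreds_summary(sentence_lengths, block=100):
    """Return (overall_average, largest_deviation).

    Each consecutive block of `block` sentences is averaged, the mean of
    those block averages is taken as the author's average, and the largest
    amount by which any block average differs from it is reported, which
    is how the text reads a figure such as 33.65 +/- 6.64.
    """
    # Assumption: only complete blocks are used, since the text speaks
    # of consecutive hundreds.
    n_blocks = len(sentence_lengths) // block
    if n_blocks == 0:
        raise ValueError("need at least one full block of sentences")
    block_averages = [
        mean(sentence_lengths[i * block:(i + 1) * block])
        for i in range(n_blocks)
    ]
    overall = mean(block_averages)
    largest_deviation = max(abs(a - overall) for a in block_averages)
    return overall, largest_deviation

# Hypothetical data: three "hundreds" averaging 20, 24 and 22 words.
sample = [20] * 100 + [24] * 100 + [22] * 100
average, deviation = hundreds_summary(sample)
print(average, deviation)
```

With real counts, the returned pair could be set beside the published figures for a given author.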

Encouraged by these results, Professor Sherman induced Mr. Gerwig, then a student at the University of Nebraska, to examine other stylistic peculiarities. This Mr. Gerwig did by determining the average number of predications per sentence and the percentage of simple sentences used by one hundred different authors. His conclusions are summed up in the following words: "A very little investigation served to convince me that the same remarkable uniformity which had been found in the average number of words used by any given author per sentence would also hold in regard to the number of finite verbs, or predications, found in each sentence. The results obtained convinced me also that there was a uniformity in the number of simple sentences per hundred of a given author." Mr. Gerwig expresses his conviction that the average number of predications and simple sentences in five hundred periods of any author who has achieved a style is approximately the average of his whole work. In particular he found that 'while Chaucer and Spenser put habitually over five main verbs in each sentence they wrote, and less than ten simple sentences in each hundred, Macaulay and Emerson used only a little over two verbs per sentence, and left over thirty-five verbs in each hundred simple.'

The theory which has grown out of these investigations has been most tersely stated by Mr. Hildreth, another student of Professor Sherman, who at the same time applies the theory to the Bacon-Shakespeare controversy. We read:

Ten years or more ago Professor Sherman, while investigating the course of stylistic evolution in English prose, made the discovery that authors indicate their individuality by constant sentence proportions, personal and peculiar to themselves. This was demonstrated especially with the number of words used per sentence in large averagings. It was found that De Quincey, Channing and Macaulay, if five hundred periods or more were taken, evinced this average invariably, and in the earliest as well as in the latest period of their authorship. This discovery led to the suspicion that good writers would be found constant in predication averages, in per cent. of simple sentences, and other stylistic details. Acting upon a suggestion to this effect, Mr. G. W. Gerwig, then a pupil of Professor Sherman, undertook an investigation that established the constancy of predication, as well as simple-sentence frequency, in given authors. . . . Professor Sherman and Mr. Gerwig have thus established by an examination of a great many authors, that writers are structurally consistent with themselves; that they possess a certain sentence-sense peculiarly their own. These investigators have established that by this instinct authors use a constant average sentence-length, and a certain number of predications per sentence, and that a given per cent. of their sentences will be simple sentences. . . . The work of these investigators covers a large amount of material and a wide field of literature. They have examined and compared the works of ancient and recent authors, early and late writings of the same author, and writings of the same author of different character, such as history and dialogue, poetry and prose.[1] The results thus far obtained are sufficient to show that it is not possible for a writer to escape from his stylistic peculiarities.

The principle once established, its application to cases of disputed authorship is very plain. If each author employs but one set of average sentence proportions such as sentence-length, predication average and simple sentence frequency, it is only necessary to determine these constants for a disputed work and compare them with those of its supposed author. If the two sets of constants manifest a striking difference it is conclusive evidence that the supposed author did not write the disputed work; if they are practically identical, the evidence is in favor of the supposed author, for it is highly improbable that two sets of three numbers each, taken at random, should happen to coincide.

Following this or some similar line of thought, Mr. Hildreth examined the prose in fifteen of Shakespeare's plays and of Bacon's 'Essays' and a portion of the 'New Atlantis.' To eliminate possible errors arising from careless or inconsistent punctuation, all the material was repunctuated according to modern principles. All inorganic and broken sentences were omitted. Then follow twelve pages of figures representing totals and specimen results, and then the summary.


             No. of      No. of Words    No. of Predications   Per Cent. of
             Sentences   per Sentence    per Sentence          Simple Sentences
Shakespeare  5,002       12.39           1.70                  39
Bacon        2,041       32.59           3.45                  14

The reader is left free to draw his own conclusion from these figures. The closing statement is that the numbers are not presented as proof conclusive, but only as contributory evidence in the controversy.

Without wishing to deny the general principle of sentence-rhythm, which, in honor of its discoverer, I shall refer to as the Sherman principle in rhetoric, I wish to point out certain limitations to this principle, which I think will invalidate the conclusion that must otherwise be drawn from the above summary. The Sherman principle has been established only for certain normal forms of composition, a fact which has been overlooked in the statement of the principle, as well as in its applications. What has been shown is that a writer uses definite sentence proportions while writing in a certain form of composition; it has not been shown that he uses the same proportions when he employs essentially different forms of composition, such as drama and description, criticism and correspondence. It is almost obvious that the sentence proportions of a philosophic discourse must differ from those employed in light fiction or the drama, yet this fact is not only overlooked, but directly denied in Mr. Hildreth's statement of the Sherman principle. To compare the sentence structure of dramatic compositions with the sentence structure of a heavy dissertation or description is to compare the oral utterances of a person engaged in deep contemplation or in vivid imagination of some sublime object with the commonplace talk of the drawing-room or the vernacular of the marketplace. Quite as plausible would it seem to assert that a man's average gait in walking is the same whether he is out for pleasure, on business, to escape from danger, or on a long journey.

The chief fact which apparently gives weight to the persistence of sentence proportions, regardless of the composition employed, is the instance of Macaulay's 'History of England,' for which the sentence constants are practically the same as those of his essays, notwithstanding that some parts of the 'History,' in particular the second volume, contain much dialogue. This anomaly is explained by the fact that taking the five volumes as a whole the essay style predominates to such an extent as practically to obliterate the disturbing effect of the dialogue portions. This is easily demonstrated. The average sentence-length of Macaulay's 'History' is 23.43, which differs but little from 23.65 of Machiavelli, 24.00 of Pitt and 23.00 of the 'Essay on History' by the same author. Now of the 41,500 periods of the 'History,' there are forty-five hundreds whose average is less than twenty words per sentence. These we may take to represent the dialogue portions of the work. The exact average of these 4,500 periods is 18.62 words per sentence, that is, 4.81 words per sentence less than the average for the entire 'History.' If we replace these sentences by others of normal length, we augment the total aggregate by 4,500 times 4.81 or 21,645 words. That is, if the portions of the 'History' which contain an excessive amount of dialogue were replaced by an equal number of sentences of normal length, the five volumes would contain 972,345 + 21,645, or 993,990 words. Dividing this number by 41,500 we obtain 23.95 words per sentence, a result not essentially different from the actual average, 23.43.

Variation of Sentence-length in Ten Authors.
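The replacement arithmetic for the 'History' can be retraced step by step. The sketch below is illustrative only; the variable names are hypothetical, and it takes 23.43 as the overall average, the figure given for the 41,500 periods at the outset and the one that the difference of 4.81 from 18.62 requires.

```python
TOTAL_PERIODS = 41_500     # periods counted in the 'History'
OVERALL_AVERAGE = 23.43    # words per sentence for the whole work
SHORT_PERIODS = 4_500      # forty-five hundreds averaging under twenty words
SHORT_AVERAGE = 18.62      # exact average of those 4,500 periods

# Word total implied by the overall average.
total_words = round(TOTAL_PERIODS * OVERALL_AVERAGE)

# Each short sentence falls this far below the overall average.
deficit_per_sentence = round(OVERALL_AVERAGE - SHORT_AVERAGE, 2)

# Words gained if the short sentences are brought up to normal length.
added_words = round(SHORT_PERIODS * deficit_per_sentence)

adjusted_total = total_words + added_words
adjusted_average = adjusted_total / TOTAL_PERIODS

print(total_words, added_words, adjusted_total, round(adjusted_average, 2))
```

The adjusted average exceeds the overall average by about half a word per sentence, which is the sense in which the dialogue portions are said to leave the total nearly undisturbed.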

But whether the presumption is for or against limitations of the Sherman principles is of little consequence in a matter so easily tested by experiment. I have prepared a table giving the approximate sentence-lengths for widely divergent forms of composition by the same author. The averages by hundreds, as well as the final average, have been given in order to show the variation in the averages of consecutive hundreds in each work.

It is needless to continue this table, for a mere inspection of the figures already given must once and for all settle the 'single set of constants' theory. In fact the question suggests itself, whether the number of different sets of constants which an author may employ is not limited merely by his versatility as a writer. So far as sentence length is concerned, this conjecture is fully corroborated by a partial examination of Goethe's works. The results are exhibited in the following table:

Variation in the Sentence-length of Goethe's Writings.

The above list includes romance, drama, allegory, criticism, biography, description, science and correspondence, but with the exception of 'Faust' and 'Reinecke Fuchs' the works are all in prose, so that the fact of variation is not disturbed even if we consider prose literature alone. There can be little doubt that a complete examination of Goethe's writings would furnish a chain of sentence-lengths varying by almost insensible gradations from five to thirty-five or forty words per sentence.

The conclusion from which there seems to be no escape is that the average sentence-length used by an author depends upon at least two factors, one of which is the author's sentence sense, the other the particular form of composition into which his thought is cast.

What is true of sentence-length holds equally true of predication averages and simple sentence percentages. Other things being equal, the shorter sentences will naturally contain the fewer predications, and a larger per cent. of simple sentences, the limits being single predications, on the one hand, and none but simple sentences, on the other. This general relation is fully made good by the facts. Macaulay, in his 'History of England,' uses 23.3 words per sentence and 2.3 finite verbs, which is almost exactly ten words to one verb. Nearly the same ratio obtains in More's 'Life of Richard III.' with an average of 3.65 verbs out of 36.5 words per sentence; Hooker's 'Ecclesiastical Polity,' with an average of 4.12 verbs and 40.9 words per sentence; Sidney's 'Defense of Poesie,' 3.98 verbs and 39.3 words per sentence; and Channing's 'Self-Culture' employs 2.57 verbs out of a total of 25.9 words per sentence. However, in very short sentences there is a tendency to diminish and in very long sentences to increase the ratio of the total number of words to the number of verbs per sentence. Thus Emerson in his 'Divinity School Address' uses 2.14 verbs and 18.0 words per sentence, while Hakluyt in the 'Voyages of the English Nation to America' uses but 4.44 verbs out of an average of 56.8 words per sentence.
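The ratio of words to finite verbs for the works just named can be tabulated directly from the quoted figures. The following is a minimal sketch; the helper name `words_per_verb` is hypothetical, and the numbers are those given in the paragraph above.

```python
# (work, finite verbs per sentence, words per sentence), as quoted in the text
works = [
    ("Macaulay, 'History of England'", 2.3, 23.3),
    ("More, 'Life of Richard III.'", 3.65, 36.5),
    ("Hooker, 'Ecclesiastical Polity'", 4.12, 40.9),
    ("Sidney, 'Defense of Poesie'", 3.98, 39.3),
    ("Channing, 'Self-Culture'", 2.57, 25.9),
    ("Emerson, 'Divinity School Address'", 2.14, 18.0),
    ("Hakluyt, 'Voyages'", 4.44, 56.8),
]

def words_per_verb(verbs, words):
    """Average number of words that accompany one finite verb."""
    return words / verbs

ratios = {title: words_per_verb(v, w) for title, v, w in works}

for title, ratio in ratios.items():
    print(f"{title}: {ratio:.2f} words per finite verb")
```

The first five works come out close to ten words per verb, while the shortest sentences (Emerson) fall below and the longest (Hakluyt) rise above that figure.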

A more striking though less obvious relation exists between predication averages and simple sentence percentages, which is all the more surprising, inasmuch as simple sentence percentages are the least constant of the sentence proportions thus far examined. For instance, Lyly's 'Euphues' furnishes for five consecutive hundreds 26, 14, 20, 15 and 8 simple sentences respectively. De Quincey's 'Opium-Eater' yields the numbers 10, 19, 15, 7 and 21 for consecutive hundreds, and Macaulay in his 'History of England' gives simple sentence percentages as widely divergent as 41 and 27, though each average is based upon 500 consecutive sentences. These are extreme cases, but even the average variation is high. An examination of fifty authors shows that the simple sentence percentages based upon an examination of 400 sentences differ on an average by nearly 6 per cent. (5.98 per cent.) from the averages based upon 500 sentences from each author, with extreme variations as high as 28.8 per cent. It seems quite plain, therefore, that several thousand sentences from each author would have to be examined to get anything like a constant simple sentence percentage.

Now Mr. Gerwig's tables[4] for predication averages and simple sentence percentages for prose works comprise averages of about 60,000 sentences taken from seventy-one different authors, exclusive of the complete averages for Macaulay's 'History.' These tables I utilized for a preliminary test[5] by employing the following device. I grouped together all the works whose predication averages fell between 1.50 and 2.00 per sentence. This group yielded an average of 1.86 predications per sentence and 53 simple sentences per hundred. Next I averaged the works which contained between 2.00 and 2.25 predications per sentence, and the average for this group was found to be 2.14 verbs per sentence and 39.1 simple sentences per hundred. Proceeding similarly by grouping the works whose predication averages fall between 2.25 and 2.50, between 2.50 and 2.75, and so on, we obtain the following table:

Index   Predications per      P           S (simple sentences   P√S
        Sentence (range)      (average)   per hundred)
1       1.50 to 2.00          1.86        53.0                  13.54
2       2.00 to 2.25          2.14        39.1                  13.38
3       2.25 to 2.50          2.34        32.9                  13.41
4       2.50 to 2.75          2.62        25.9                  13.33
5       2.75 to 3.00          2.88        23.2                  13.87
6       3.00 to 3.25          3.10        19.2                  13.59
7       3.25 to 3.50          3.39        15.9                  13.52
8       3.50 to 4.00          3.70        13.4                  13.55
9       4.00 to 4.50          4.84        8.3                   13.94
10      4.50 to 5.00          4.84        8.3                   13.94
11      5.00 to 5.50          5.38        6.7                   13.92
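The grouping device described before the table can be sketched as follows. This is a minimal illustration, not the author's own procedure in detail: the function name `group_by_predication` and the sample works are hypothetical, and the treatment of a work falling exactly on a boundary (it is placed in the higher group) is an assumption, since the text does not say.

```python
from statistics import mean

# Boundaries of the predication groups, as listed in the table.
BIN_EDGES = [1.50, 2.00, 2.25, 2.50, 2.75, 3.00,
             3.25, 3.50, 4.00, 4.50, 5.00, 5.50]

def group_by_predication(works):
    """Group (P, S) pairs by predication average and average each group.

    `works` holds one (predications per sentence, simple sentences per
    hundred) pair per work. Returns (lower, upper, mean_P, mean_S) for
    every group that contains at least one work.
    """
    groups = []
    for lower, upper in zip(BIN_EDGES, BIN_EDGES[1:]):
        # Assumption: half-open intervals, lower <= P < upper.
        members = [(p, s) for p, s in works if lower <= p < upper]
        if members:
            groups.append((
                lower,
                upper,
                mean(p for p, _ in members),
                mean(s for _, s in members),
            ))
    return groups

# Hypothetical works, for illustration only.
sample_works = [(1.8, 55.0), (1.9, 51.0), (2.1, 40.0), (2.6, 26.0)]
groups = group_by_predication(sample_works)
for lower, upper, p_avg, s_avg in groups:
    print(f"{lower:.2f} to {upper:.2f}: P = {p_avg:.2f}, S = {s_avg:.1f}")
```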

The numbers P, the predication averages, and S, the simple sentence percentages, aside from the general reciprocal relation which we should expect, manifest a more specific uniformity. The square-root of 53, the first number under S, multiplied by 1.86, the corresponding number under P, is 13 and a fraction, but so also is the square-root of 39.1, the second number under S, multiplied by 2.14, the corresponding number for P. Similarly for the third, fourth, fifth pairs of corresponding numbers. That is, we find

$$1.86\sqrt{53} = 13.54, \qquad 2.14\sqrt{39.1} = 13.38, \qquad 2.34\sqrt{32.9} = 13.41,$$

and so on through the list, the result in each case being 13 and a fraction. In short, we have quite uniformly

$$P\sqrt{S} = c,$$

where $c \approx 13.6$ is the arithmetic mean of the slightly varying values, all between 13 and 14, which are given in the last column of our table.
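The uniformity of the product of P and the square-root of S can be checked numerically for the first five pairs named in the text. The sketch below is illustrative; the variable names are hypothetical, and the (P, S) pairs are those of the first five rows of the table.

```python
from math import sqrt

# (P, S) for the first five rows of the table.
rows = [(1.86, 53.0), (2.14, 39.1), (2.34, 32.9), (2.62, 25.9), (2.88, 23.2)]

# Product of each predication average and the square-root of the
# corresponding simple sentence percentage.
products = [p * sqrt(s) for p, s in rows]

# How far apart the largest and smallest products lie.
spread = max(products) - min(products)

for (p, s), value in zip(rows, products):
    print(f"P = {p:.2f}, S = {s:.1f}: P * sqrt(S) = {value:.2f}")
```

While P rises by more than half and S falls by more than half over these rows, the product stays between 13 and 14.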

How nearly this equation fits our data may be best seen from the graphical representation, Fig. 1. The curve $P\sqrt{S} = c$, as well as the P's and S's from our table, have been plotted in rectangular coordinates by using the values of S for abscissas, and for ordinates ten times the corresponding values of P. The resulting points have been numbered to correspond with the index numbers in our table.

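The relation expressed by $P\sqrt{S} = c$ may be easily expressed in words. For if $P_1$ and $P_2$ represent two predication averages, and $S_1$, $S_2$ the corresponding simple sentence percentages, we have approximately

$$P_1\sqrt{S_1} = P_2\sqrt{S_2},$$

from which

$$\frac{P_1}{P_2} = \frac{\sqrt{S_2}}{\sqrt{S_1}},$$

that is: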

The predication averages of various works are approximately inversely proportional to the square-roots of their simple sentence percentages.

It will be interesting to test this law on some specific work, not included in the table from which the law has been deduced. Macaulay's 'History of England' is the only larger work whose sentence dimensions have been determined with reasonable accuracy. Here S was found to be 34.2, and substituting this in our formula, with $c \approx 13.6$, we find

$$P = \frac{c}{\sqrt{S}} \approx \frac{13.6}{\sqrt{34.2}} \approx 2.33,$$

which is nearly equal to the value 2.30 as determined by actually counting the finite verbs in 40,000 sentences.
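The test on Macaulay's 'History' can be repeated as a short calculation. The sketch below is only illustrative: the function name `predicted_predications` is hypothetical, and the constant 13.6 is an assumption of this sketch, taken as a rounded mean of the products in the last column of the table.

```python
from math import sqrt

def predicted_predications(simple_pct, c=13.6):
    """Predication average expected from a simple sentence percentage.

    Uses the inverse square-root relation P = c / sqrt(S). The default
    c = 13.6 is an assumed, rounded value of the constant.
    """
    if simple_pct <= 0:
        raise ValueError("simple sentence percentage must be positive")
    return c / sqrt(simple_pct)

# Macaulay's 'History of England': S = 34.2, counted P = 2.30.
macaulay_predicted = predicted_predications(34.2)
macaulay_counted = 2.30
difference = macaulay_predicted - macaulay_counted

print(round(macaulay_predicted, 2), round(difference, 2))
```

Because the relation is an inverse square-root, a work with one fourth the simple sentence percentage of another would be expected to show twice its predication average.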

There is, of course, no reason to infer that our formula will apply with equal accuracy to the sentence dimensions of every other work. Variations from it must occur. The only conclusion that is warranted is that when a reasonable number of works are selected whose predication frequency is nearly the same, and the average of these frequencies is taken, this average will bear a definite relation to the average of simple sentence ratios of the same works and that this relation is approximately expressed by our formula.

  1. This, Professor Sherman tells me, is an oversight. Neither he nor Mr. Gerwig thinks that the principle in question applies to poetry.
  2. The averages except the last are for 500 periods.
  3. The individual averages are for 500 periods each, the total for 2,500 periods.
  4. 'University (of Nebraska) Studies,' Vol. 2, No. 1.
  5. For a detailed discussion of this experiment, together with other matter of a more technical nature, see 'University Studies,' University of Nebraska, Vol. III., No. 3.