Page:Popular Science Monthly Volume 60.djvu/107


course, closer agreement of their diagrams, and this became so evident in the earlier stages of the investigation that the conclusion was soon reached that if a diagram be made representing a very large number of words from a given author, it would not differ sensibly from any other diagram representing an equally large number of words from the same author. Such a diagram would then reflect the persistent peculiarities of this author in the use of words of different lengths and might be called the characteristic curve of his composition. Curves similarly formed from anything that he had ever written could not differ materially from this, although curves of other authors might possibly, but would not probably, agree closely with his.

Thus, if this principle were established, the method might be useful as a means of identification of authorship, and it might be relied upon with great confidence to show that a certain author did not write a certain composition.
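The idea of a characteristic curve, and of comparing two such curves, can be illustrated with a minimal modern sketch. It is not the procedure used in the original investigation, which was carried out by hand. The function names, the rule that a word is an unbroken run of letters, the pooling of very long words into the last bin, and the sum of absolute differences used as a measure of disagreement are all assumptions made for illustration.

```python
import re
from collections import Counter


def characteristic_curve(text, max_len=15):
    """Return occurrences per 1,000 words for word lengths 1..max_len.

    Assumption: a "word" is any unbroken run of ASCII letters, so a
    hyphenated or apostrophized word is counted as its separate parts.
    Words longer than max_len are pooled into the last bin.
    """
    words = re.findall(r"[A-Za-z]+", text)
    if not words:
        raise ValueError("text contains no words")
    counts = Counter(min(len(w), max_len) for w in words)
    total = len(words)
    return [1000 * counts.get(n, 0) / total for n in range(1, max_len + 1)]


def curve_difference(curve_a, curve_b):
    """Sum of absolute differences between two curves of equal length.

    Assumption: this is only one hypothetical way to quantify how far
    two curves disagree; the article compares the diagrams by eye.
    """
    if len(curve_a) != len(curve_b):
        raise ValueError("curves must have the same number of bins")
    return sum(abs(a - b) for a, b in zip(curve_a, curve_b))
```

On the principle stated above, two large samples from the same author should give a small difference, while a large difference would count against common authorship. How small is small enough is not settled by this sketch.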

In the earlier application of the method many interesting facts were brought out, some of which are worth mentioning here, although a full account of the preliminary work was published in 'Science' of March 11, 1887. It was soon discovered that among writers of English the three-letter word occurred much more frequently than any other. Indeed, in the earlier investigation only one exception to this rule was found, and that was in the writings of John Stuart Mill, who uses two-letter words more often than any other. This was surprising at first, especially in view of the large average word-length of Mill's composition, which is considerably in excess of that of any other author thus far examined, but it is easily explained by the very frequent appearance of prepositional phrases, necessitating the use of such two-letter words as in, on, to, of, etc., to an extent unapproached by other writers. Mill's writings furnished an opportunity for comparing the curves representing two different periods of an author's life. A comparison of two groups of 5,000 words each from his 'Political Economy' and his 'Essay on Liberty' showed the presence of the same peculiarities in word choosing, and in every thousand of the ten examined the two-letter word was in excess. No other writer of English has been found to use two-letter words oftener than any other, but it is not at all improbable that there may be such.
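The check described for Mill, finding which word length is most frequent in each successive thousand words, can be sketched as follows. This is an illustrative reconstruction and not the original hand count. The function name, the definition of a word as a run of letters, the dropping of an incomplete final block, and the rule that ties go to the shorter length are all assumptions.

```python
import re
from collections import Counter


def modal_lengths_by_block(text, block_size=1000):
    """Return the most frequent word length in each full block of words.

    Assumptions: a "word" is an unbroken run of ASCII letters, a trailing
    block shorter than block_size is ignored, and ties are resolved in
    favor of the shorter word length.
    """
    words = re.findall(r"[A-Za-z]+", text)
    modes = []
    for start in range(0, len(words) - block_size + 1, block_size):
        block = words[start:start + block_size]
        counts = Counter(len(w) for w in block)
        top = max(counts.values())
        modes.append(min(n for n, c in counts.items() if c == top))
    return modes
```

Applied to ten blocks of a thousand words each, a result of 2 in every block would correspond to the behavior reported for Mill, and a result of 3 to the behavior reported for the other English writers examined.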

Through the interest of Mr. Edward Atkinson, it became possible to give a partial answer to the question: Can an author purposely avoid the peculiarities of style that belong to his normal composition? Mr. Atkinson, having addressed a body of college alumni on a certain topic, afterward gave what he meant to be the same address to a body of workingmen, but in the latter instance he made a special effort to use simple, short words and sentences of the simplest and plainest construction. Although relating to the same topic the two addresses 'read'