Talk:The Oregon Trail

Information about this edition
Edition: George N. Morang & Company, Limited, 1901
Source: The text was most likely copied from the transcription #1015 at Project Gutenberg, which does not indicate which edition of the book it comes from. It is currently (May 2018) being matched and corrected to reflect the Fourth Edition, 1901 (from Internet Archive). The preface, and several sentences within the text, vary slightly. This may be explained by the editorial note at the beginning of the Fourth Edition. It seems that later editions published in 1910 etc. failed to incorporate the edits of the Fourth Edition.
Contributor(s): Akme, Pete Forsyth
Level of progress:
Notes:
Proofreaders:

Finding changes in the text

Akme (and anybody else working on this), I've found a fairly reliable (if somewhat tedious) way to find the words that differ between the edition our text was copied from and the Fourth Edition (the scanned version). To do this, you'll need to enable the JavaScript options in your preferences that give you the "TemplateScript" menu on the left.

  1. Click "Edit" on the page
  2. Click the "OCR" button in the menu, which will run a fresh OCR pass on the scanned page (you may have to wait a few seconds for the result)
  3. Delete the initial lines (the page header and the blank line following it)
  4. Click the "Clean up & Lines" link in the left-hand menu (which will eliminate extra carriage returns, producing a cleaner diff)
  5. No need to click the "Publish changes" link, but for the sake of illustrating the process, I've done so here
  6. Instead, click "Show changes" which will produce a diff like the one above
  7. Some of the diffs will be inconsequential -- results of inaccurate OCR, or differences based on ligatures; ignore those
  8. Take note of any substantive changes (corrections to the spelling of words, or the addition or removal of words)
  9. Now edit the original text (the one imported from Gutenberg; if you did not click "publish changes" above, you can easily do this just by clicking the "Edit" tab -- perhaps in a new window to make it easier to compare the two views)
  10. Make the necessary corrections
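For anyone who would rather compare the two texts offline, the steps above can be roughly sketched in Python. This is only a sketch under stated assumptions, not what the "OCR" or "Clean up & Lines" tools actually do: it assumes you have pasted the Gutenberg-derived text and the fresh OCR text into two strings, and the function names `normalize` and `word_changes` are made up for illustration. It folds ligatures and line breaks (step 7's inconsequential differences) before diffing word by word. OCR errors will still show up and need to be ignored by eye.

```python
import difflib
import unicodedata


def normalize(text):
    """Split text into words after folding ligatures and line breaks.

    NFKC normalization turns typographic ligatures such as U+FB01
    into plain letters ("fi"); split() with no arguments collapses
    every run of whitespace, including line breaks.
    """
    return unicodedata.normalize("NFKC", text).split()


def word_changes(gutenberg_text, ocr_text):
    """Return (tag, old_words, new_words) for each word-level difference.

    tag is one of "replace", "delete" or "insert", as reported by
    difflib.SequenceMatcher.get_opcodes().
    """
    old = normalize(gutenberg_text)
    new = normalize(ocr_text)
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, " ".join(old[i1:i2]), " ".join(new[j1:j2])))
    return changes


if __name__ == "__main__":
    # Hypothetical sample strings, not quotations from either edition.
    old_text = "The \ufb01rst day we travelled ten miles"
    new_text = "The first day we traveled\nten miles"
    for change in word_changes(old_text, new_text):
        print(change)
```

Each tuple it prints is a candidate for step 8; anything that is obviously an OCR misread can be skipped.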

I hope that makes sense. Like I said, it's tedious. Perhaps we should instead match this text to an edition it actually seems to follow, which would make for a much easier process. However, it does seem that the Fourth Edition corrections were made purposefully, and in some cases were necessary. -Pete (talk) 18:46, 4 June 2018 (UTC)

@Xover: Any suggestions on how to proceed with this one? Though much work has gone into correcting individual pages, I wonder if the best thing might be to just nuke it and start fresh. -Pete (talk) 05:55, 23 February 2024 (UTC)
If so, however, it's probably best to manually mark off the illustration pages, which I think are all fine; at the very least, they shouldn't be divorced from the relevant file names. -Pete (talk) 07:59, 23 February 2024 (UTC)