OCR button causes pages to freeze

In the past, when I have used the OCR button on (e.g.) Page:Eleanor Gamble - The Applicability of Weber's Law to Smell.pdf/6, the "Page Body" field becomes temporarily greyed out and inaccessible while the script rearranges text on the page and then it eventually snaps to and lets me interact with the text again, reinserting a cursor. Now, it stays stuck in that greyed out step. Is anyone else experiencing this? I am using Firefox and just (re-)installed uMatrix but I told that (and all my other extensions) to allow media from other domains (mediawiki.org, wikimedia.org, wmflabs.org)... any guesses as to what's happening here or how to fix it? —Justin (koavf)TCM 17:06, 7 September 2019 (UTC)

@Koavf: See this thread in the archives. It's a problem with the OCR tool that we appear to be dependent on Phe to fix, and they have not yet been able to do anything about it. --Xover (talk) 18:39, 7 September 2019 (UTC)
See task T228594. However, nobody answers there. --Jan Kameníček (talk) 18:43, 7 September 2019 (UTC)
Thanks. Good to know it's not my browser, I guess. —Justin (koavf)TCM 18:58, 7 September 2019 (UTC)
That's why I have four OCR buttons on my edit toolbar: Tesseract 3 (the default here), Tesseract 4, Google Drive, Google Cloud Vision. Google Cloud Vision OCR is available as a gadget here. The other two can be put into interested users' common.js. Hrishikes (talk) 02:29, 8 September 2019 (UTC)
@Hrishikes: Is it s:mul:User:Putnik/TesseractOCR.js from s:mul:Wikisource:Tesseract OCR? —Justin (koavf)TCM 03:09, 8 September 2019 (UTC)
Yes, that is Tesseract 4. Hrishikes (talk) 03:11, 8 September 2019 (UTC)
@Hrishikes: I copied and pasted it to my common.js and purged several times but don't see a new button. How do you actually use it? —Justin (koavf)TCM 03:40, 8 September 2019 (UTC)
@Koavf: -- Not like that. Copy it from my common.js. But Google Drive OCR is better; it is faster and also removes the line breaks. You can copy it from my global.js. -- Hrishikes (talk) 03:46, 8 September 2019 (UTC)
@Hrishikes: Thanks. I'm going to avoid Google but I appreciate the tip. Very helpful. —Justin (koavf)TCM 03:51, 8 September 2019 (UTC)
It's working! (redlink at the moment, blue soon). —Justin (koavf)TCM 03:57, 8 September 2019 (UTC)
@Koavf: -- For doing it directly from WMF Labs: Tesseract 4, Google Cloud Vision, Google Drive. -- Hrishikes (talk) 04:22, 8 September 2019 (UTC)
@Hrishikes: Thanks very much for the tips too! I am just trying them. The Tesseract OCR has good results, but it is extremely slow. However it is much better to have it now than to wait until the default OCR button is repaired. Did I understand it right that there are two different Google gadgets (Google Drive and Google Cloud Vision)? Some time ago I enabled the Google OCR gadget in my preferences, but it gives quite poor results. Now I have copied something to my common.js from your page, hoping to try the other Google gadget, but as a result I have got two identical Google buttons giving identical results. How can I get the other one? --Jan Kameníček (talk) 15:42, 8 September 2019 (UTC)
@Jan.Kamenicek: -- My common.js has Google Cloud Vision. You had already got it as a gadget, so the two are identical. You can get the Google Drive OCR from my global.js at Meta (it is called indic ocr by the developer). Hrishikes (talk) 16:07, 8 September 2019 (UTC)
Thanks very much, I am going to try it! --Jan Kameníček (talk) 17:51, 8 September 2019 (UTC)
Unfortunately none of these is good with 2-column pages. There are lots and lots of 2-column scans out there, so maybe someone sometime will write a variant of one of these gadgets that is specific for that situation. Levana Taylor (talk) 16:58, 8 September 2019 (UTC)
@Levana Taylor: -- ABBYY FineReader is good for double columns. IA uses it. I have it offline. Hrishikes (talk) 17:15, 8 September 2019 (UTC)
Yep, I go down to the local university to use ABBYY fairly often. I’m slightly amazed you bought a copy, since it’s so expensive. The ABBYY output that IA provides isn’t too bad, but they didn’t check every page to make sure that columns were being recognized correctly and maybe 1 out of 30 is wrong; plus they didn’t retain italics and suchlike formatting in the output they give. There are other problems, but it’s still probably the best OCR you can find on the web for old texts. Levana Taylor (talk) 17:26, 8 September 2019 (UTC)
From time to time there are some wishlists where WM contributors can express what they need to develop. What about asking for a new OCR gadget that would cover the needs of Wikisource contributors? --Jan Kameníček (talk) 17:52, 8 September 2019 (UTC)
m:Wishlists. 2020 isn't open yet. —Justin (koavf)TCM 18:40, 8 September 2019 (UTC)
@Levana Taylor: You wouldn't happen to be able to provide me with an example of a "bad" OCR of a two column page and then a "good" OCR result? Ideally of the same page, but that's not very important. --Xover (talk) 02:41, 9 September 2019 (UTC)
For a bad OCR of a two column page, look at Page:Weird Tales v01n01 (1923-03).djvu/149. The OCR as often as not runs the text right across the column text.--Prosfilaes (talk) 03:11, 9 September 2019 (UTC)
Thanks. FWIW, in my (limited) experience, Tesseract 4 usually gets this right on half-way decent scans. --Xover (talk) 03:36, 9 September 2019 (UTC)
@Xover: You ask, you get. Here is Page 38 of Volume 3 of Once a Week run through all 4 gadgets: all 4 of them failed to notice that it was two columns. At the bottom, IA’s FineReader text, which did get it right. Levana Taylor (talk) 04:32, 9 September 2019 (UTC)
@Levana Taylor: Ah, thank you. That is most elucidating. I grabbed that page preview from Commons and ran it through Tesseract 4.1 locally, and added its output (modulo some post-processing for stuff like turning \n into actual newlines) to your samples page. I don't have older versions of Tesseract available for testing so I can't say whether the difference is due to improvements in 4.1 or there are other factors at play. --Xover (talk) 05:13, 9 September 2019 (UTC)

For those who maintain these kinds of tools, this is a promising Web interface. Not sure if the authors of that tool are interested in collaborating but we could reach out to them. —Justin (koavf)TCM 17:09, 9 September 2019 (UTC)

I've read this and every other post concerning the OCR problem (I reported the bug), and want to thank everyone for the comments and links. Being in a bind, I tried all the suggestions, but the results are very disheartening and I want to voice my opinions for whatever they are worth.
Of all the OCR software I tried, Phe's is the fastest and has the best reproduction of texts I am working on. (A lot of Spanish mixed in the English text).
  • It is fast and clean. It recognizes mdashes and the "é" character, but not the other accented characters. However, character replacements are consistent: seeing "4" in the Spanish text consistently means an "à".
  • I have no idea why anyone would use a web service to OCR a page.
  • I tried Project Naptha, but it only works in Chrome-based browsers, not Firefox.
  • Indic-OCR has no instructions. Do they want a single image or a complete book uploaded? I tried a single .jpg image from the Commons and their software crashed with error 505, alerting me that their server is either too busy or down.
  • I tried the LSTM-based Tesseract 4-alpha OCR service using the same image as above. It also crashed and reported "502 Bad Gateway".
  • Locally, when I copied the [[s:mul:User:Putnik/TesseractOCR.js]] script to my user namespace, it didn't work. But embedding a link in my Common.js pointing to his namespace's copy works. It produces the same quality as Phe's, but it is very, very slow. It took 1 minute and 40 seconds to OCR THIS PAGE. So, it clearly needs more work.
  • Also tried the Google OCR from Gadgets but it produces poor results.
So, Phe's OCR script must be fixed. It works for me in the current volume, but not in the other 3 volumes of the series. The problem was found but not fixed, and if it works intermittently, perhaps comparing its behaviour when it works and when it does not can lead to the solution. Proofreading ~2400 pages of text heavily infused with Spanish without a good working OCR is ridiculous. The software issue cannot be a great problem because it works in some but not in other volumes. I would have to check out older .djvu uploads as they relate to the wmf software versions. The wmf software changed and improved over time and Phe hasn't been around for over a year. — Ineuw (talk) 10:01, 14 September 2019 (UTC)

Proofread.js changes and Index files' installation dates.

A random comparison of Index:djvu files' installation dates with the working copy of the OCR gadget, indicates that the problem occurred after March 2019, at which time the Gadget still worked. From what I understand, OCR is part of Proofreading script, but both scripts are continually updated, and the last OCR edit was in January 2019. I was hoping to narrow down the time frame of changes by matching Index file installations and testing the functionality. This is the only thing I can contribute. — Ineuw (talk) 23:58, 14 September 2019 (UTC)

Index form script

I am seeing Bengali script in index form entries of these three indices: 1, 2, 3. Not sure if it is related to my browser/preferences only. Can anyone please check? Hrishikes (talk) 17:53, 8 September 2019 (UTC)

@Hrishikes: I, too, am seeing the incorrect script on certain index titles but in each of your samples only on certain revisions. I am not even logged-in. No other Index: page I have examined (so far - I haven't tried all that many!) seems similarly affected. 114.78.171.144 21:38, 8 September 2019 (UTC)
Thanks. Glad to hear from you after a long time. But what is the solution? Phabricator? Hrishikes (talk) 01:40, 9 September 2019 (UTC)
Very strange. Editing the page fixes it. Purging the page fixes it. It's not being done by the javascript that enhances Index:-page editing, nor the one that preloads data from Commons. I think this may be in Mediawiki somewhere, but I don't have any bright ideas as to where or how. It might be worth opening a ticket on it, but since a purge clears it it's unlikely to get a whole lot of effort put into it. --Xover (talk) 02:36, 9 September 2019 (UTC)

Template:Center not displaying

How can I fix Page:Eleanor Gamble - The Applicability of Weber's Law to Smell.pdf/56? —Justin (koavf)TCM 23:28, 15 September 2019 (UTC)

It's displaying now. Thanks. —Justin (koavf)TCM 23:57, 15 September 2019 (UTC)
(ec.) Declare the positional parameter explicitly, as in {{center|1 = all your text}}; the displayed error was saying that it was missing. This generally happens when an equals sign appears in the text, which is then interpreted as a named parameter, so the required parameter {{{1}}} is seen as missing. — billinghurst sDrewth 23:59, 15 September 2019 (UTC)
Nice. Thanks, bill. —Justin (koavf)TCM 02:13, 16 September 2019 (UTC)
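
Editorial illustration of the point above about equals signs. This is a deliberately simplified Python model of how a template call assigns its parameters, not MediaWiki's actual parser: it assumes no nested templates, links, or pipes inside values, and the function name parse_template_args and the sample text are hypothetical.

```python
def parse_template_args(body: str) -> dict[str, str]:
    """Simplified model of template parameter assignment.

    Assumption: no nested templates, links, or pipes inside values.
    """
    parts = body.split("|")
    args: dict[str, str] = {}
    position = 0
    for part in parts[1:]:  # parts[0] is the template name
        if "=" in part:
            # Anything before the first "=" is taken as a parameter name.
            name, value = part.split("=", 1)
            args[name.strip()] = value.strip()
        else:
            # No "=": the text becomes the next positional parameter.
            position += 1
            args[str(position)] = part
    return args


# Text containing "=" is swallowed as a named parameter, so {{{1}}} is missing.
broken = parse_template_args("center|Table 1: x = 2")
# Declaring "1 =" explicitly keeps the whole text as parameter 1.
fixed = parse_template_args("center|1 = Table 1: x = 2")
```

In this model, broken ends up with a parameter named "Table 1: x" and no parameter 1, while fixed has the full text as parameter 1, which matches the behaviour billinghurst describes.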

Can someone help me with this index page?

I'm having trouble here using a hanging indent with the leader box template and I'm not sure what template I'm supposed to use for the columns with a border between them. Abyssal (talk) 18:54, 16 September 2019 (UTC)

@Abyssal: You can handle the indent like this: {{dotted TOC line||{{em}}cinnabarinus|136}}. The best thing to do is ignore the columns with the border between them and instead create it as one single column. —Beleg Tâl (talk) 19:08, 16 September 2019 (UTC)

Circular redirects?

@ShakespeareFan00: Fixed, I think. Thanks. --Xover (talk) 10:13, 19 September 2019 (UTC)

Consistency throughout a magazine in fixed width?

In this article for OAW, I had to restrict the text to a fixed width of 500px because of the image that wraps around the text. There are other articles that do the same thing. If I’m setting a fixed width for some articles, should I be consistent throughout the magazine and do it for all of them? If so, what’s the most wiki-friendly way to do so? (I know about default layouts & I don’t think they’re the solution because they alter other things besides just width) Levana Taylor (talk) 19:39, 19 September 2019 (UTC)

@Levana Taylor: I am afraid even then it does not work perfectly. In my case the various paragraphs overlap (maybe due to font-size being a little larger than yours—I do not know for sure?) Perhaps reworking the {{overfloat image}} invocation into something a little more flexible—I am thinking perhaps {{img float|polygon=}} with all the paragraphs rejoined into a continuous text flow wrapping around the image…? 114.78.171.144 01:23, 22 September 2019 (UTC)
Thanks for the tip about using the polygon parameter. I haven’t ever done that before and will have to wait till Monday to be able to create an edited version of the image; so if you want to have a try, be my guest. Levana Taylor (talk) 03:51, 22 September 2019 (UTC)
@Levana Taylor: I gave it a go, not being entirely familiar with polygon outlines myself. Then I found Firefox has a point-and-click edit interface for this stuff anyway! Is the result acceptable for you? I was fairly harsh in removing some of your clever effects (I particularly liked text-align-last:justify and was rather sorry to realise it could be dispensed with!) to make the blocks join and attempted to force HTML to do the text flow itself. If I have overdone this please amend further! 114.78.171.144 09:43, 22 September 2019 (UTC)
Looks great! I just adjusted the margins around the upper text block.
Pleasingly, this page also looks good if the window width is increased beyond 500 px so that the text flows to the right of the top of the image. That means there’s no maximum width for the text. There’s merely a problem with the image being cut off if the window width is less than 500px, but no reason to force the text to be as wide as the image, so no minimum text width.
The same situation should prevail in almost all pages: The text width does not have to be fixed globally to accommodate certain fixed-width elements, like the image-positioning tables on page 2.43, the certificate on page 7.702, or the side-by-side letter and transcription on page 8.63. I just put them inside a 500px box and let them be cut off if the window is too narrow, while the rest of the text on the page can be wider or narrower than them. Levana Taylor (talk) 12:49, 22 September 2019 (UTC)

Problem solved (for now, at least) -- see above. However, I would still like to hear if any one has arguments in favor of setting global widths (minimum, maximum, or fixed) -- the arguments against are obvious, in that flexibility of text is one of the great advantages of a digital format. Levana Taylor (talk) 12:49, 22 September 2019 (UTC)

If I may throw in a final 2¢ worth: although the experiment has been going on for a long time now—born as it was prior to mediawiki user adjustable <img> CSS—consider {{FreedImg}} as an example of exploring the possibility of flexible-sized images… Now if polygon support were to be migrated into that template… and judicious application of CSS max-width: applied to the image… It should not be beyond possibility to construct a page such that it would balance (whatever that meant) image and text up to a certain (fixed) limit (say when the image is at its crispest; i.e. native resolution) and beyond that on ever-larger displays rewrap the text component dynamically around the best-possible image…
Come to think of it {{FI|imgstyle=}}… the hooks are already in place if somebody wants to give it a trial? The CSS would not be pretty though! 114.78.171.144 05:55, 23 September 2019 (UTC)

"Living authors"

Currently, when an author’s dates in Wikidata are "floruit," that is, properties Work period (start) and Work period (end), they are automatically added to the category "Living authors" here. Is there a way to suppress that? Levana Taylor (talk) 19:45, 25 September 2019 (UTC)

A little calculation: if someone died this year at age 100 and had their first work published at age 18, their first publication would have been in 1937. In round numbers, if someone’s "Work period (start)" is 1935 or earlier, they can’t possibly be alive. Levana Taylor (talk) 19:55, 25 September 2019 (UTC)
My view was either:
  • 2019-70-70 = 1879 (so 1880 to be really sure.) So if an author was born or working before 1880 they are highly unlikely to be alive now.
  • 1996-70 = 1926 - 70 = 1856 so anything published prior to the 1850's is almost certain not to have a living author ( or be in copyright still) (1996 due to URAA) ShakespeareFan00 (talk) 20:14, 25 September 2019 (UTC)
That is the calculation for "PD-old", yes -- if an author published before about 1850, they can't have lived past 1924, so, except for posthumous works, everything they wrote must be public domain automatically. That’s different from whether an author can be classified as a Living Author: there, what matters is whether they can have lived past 2019, not whether they can have lived past 1924. Levana Taylor (talk) 20:45, 25 September 2019 (UTC)
I think the page is categorized well if the floruit property is filled, but not with similar properties Work period (start) and Work period (end). Sometimes also only baptism date may be filled. The problem was discussed and partly solved at Template talk:Author#Living authors category and User talk:Samwilson#Living auhors category again. --Jan Kameníček (talk) 20:26, 25 September 2019 (UTC)
See also Module talk:Author/testcases. --Jan Kameníček (talk) 20:30, 25 September 2019 (UTC)
OK, so the problem is related to the also unresolved issue of "Work period (start)" and "Work period (end)" not yet being integrated into the software -- they will eventually be displayed as "fl. start-end" and at that time the living authors problem will also be solved. thanks for the pointer. Levana Taylor (talk) 20:37, 25 September 2019 (UTC)
Yes, I keep trying to find time to get this fixed, but the couple of times I've done so it's not been quick. Oh, and the test-items I was using have since been improved so I had to find more (you can see lots of the test cases are now failing, because their items have since been changed). Sam Wilson 00:44, 26 September 2019 (UTC)
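
Editorial illustration of Levana Taylor's back-of-envelope calculation earlier in this thread. It is a minimal Python sketch, not the logic of the Author module: the function names are hypothetical, and the maximum age of 100 and first publication at 18 are the assumptions stated in her comment.

```python
def earliest_plausible_start(current_year: int,
                             max_age: int = 100,
                             first_work_age: int = 18) -> int:
    """Earliest "Work period (start)" year compatible with still being alive.

    Assumes nobody outlives max_age and nobody publishes before first_work_age.
    """
    birth_year = current_year - max_age
    return birth_year + first_work_age


def could_be_living(work_period_start: int, current_year: int = 2019) -> bool:
    """True if an author with this work-period start could still be alive."""
    return work_period_start >= earliest_plausible_start(current_year)


cutoff_2019 = earliest_plausible_start(2019)  # 2019 - 100 + 18
```

With these assumptions the cutoff for 2019 is 1937, so an author whose work period starts in 1935 would not be placed in "Living authors", while one starting in 1950 still could be.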

Carrying forward a block center within a split footnote

Esme Shepherd (talk) 18:51, 27 September 2019 (UTC)

  • I have split footnotes with restricted text widths but at the moment these texts do not run continuously. That on the second page begins on a new line. I have tried using block centre/s and block center/e but I have probably got it wrong because this does not eliminate the problem.

For example, at present, I have:
First page:
Body:
<ref follow="p53">{{c|Note 22.<br>''The Bowl of Liberty''—}} {{block center/s|max-width=500px}} One of the ceremonies by which the battle of Platæa was annually commemorated was, to crown with wine a</ref>
Footer:
{{block center/e}}
Next page:
Header:
{{block center/s|max-width=500px}}
Body:
<ref follow="p53"> cup called the ''Bowl of Liberty'', which was afterwards poured forth in libation.{{block center/e}}</ref>

Can you point to the specific page you're working on? Londonjackbooks (talk) 19:00, 27 September 2019 (UTC)
@Esme Shepherd: fixed. Beleg Tâl (talk) 21:13, 27 September 2019 (UTC)
These particular pages are: The Siege of Valencia.pdf/68 and 69. There are several other instances within the notes here, e.g. pages 62 and 63, 63 and 64 etc. Esme Shepherd (talk) 21:51, 27 September 2019 (UTC)

  Comment As coding is nested, the coding starts and should finish in the ref, not outside of the ref. So ideally what should be occurring with the code is something like

<ref name="p53">
{{block center/s|max-width=500px}} All the sections of text on first page<noinclude>{{block center/e}}</noinclude>
</ref>

which opens and closes the block on the page itself but does not transclude the close; then, on the subsequent page(s), have

<ref follow="p53">
<noinclude>{{block center/s|max-width=500px}}</noinclude>all your subsequent sections of text follow ... {{block center/e}}
</ref>

which contains the open but does not transclude it, and does transclude the close. And yes, it does make the coding a little more complex. — billinghurst sDrewth 23:35, 27 September 2019 (UTC)

@Billinghurst, @Beleg Tâl: I'm seeing it still breaking. see Similar issue in other notes. Londonjackbooks (talk) 04:24, 28 September 2019 (UTC)

Truth be told, why are they being done as references? They are not references on these pages. We should be doing these using {{authority reference}}, seeing the note for where it spans pages. Should I be bot'ing a fix? — billinghurst sDrewth 11:16, 29 September 2019 (UTC)
I knew I was overlapping templates, which seemed wrong. I see the problem has been cured by omitting the block center/e in the footer of the first page, although the text on the second page looks wrong because it is inset. However, the end result now is perfect, so I have followed suit with all the other splits and published the result in Wikisource. All these references begin on earlier pages within the poem text. Esme Shepherd (talk) 11:40, 29 September 2019 (UTC)
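
Editorial illustration of the intent behind billinghurst's noinclude suggestion above. This Python sketch is a rough model only: it assumes that content inside noinclude is shown on the Page: view and dropped on transclusion, and it simply counts openers and closers. The real parser and the handling of ref follow are more involved (the thread reports the markup still breaking in practice), and the function names are hypothetical.

```python
import re

NOINCLUDE = re.compile(r"<noinclude>(.*?)</noinclude>", re.DOTALL)


def page_view(wikitext: str) -> str:
    """Page: namespace view: noinclude content is kept, the tags are dropped."""
    return NOINCLUDE.sub(r"\1", wikitext)


def transcluded_view(wikitext: str) -> str:
    """Transcluded view: noinclude content is omitted entirely."""
    return NOINCLUDE.sub("", wikitext)


def is_balanced(wikitext: str) -> bool:
    """True if openers and closers occur equally often (a count, not a nesting check)."""
    opens = wikitext.count("{{block center/s")
    closes = wikitext.count("{{block center/e}}")
    return opens == closes


# Footnote bodies following the suggested pattern (placeholder text).
first = ("{{block center/s|max-width=500px}}text on the first page"
         "<noinclude>{{block center/e}}</noinclude>")
second = ("<noinclude>{{block center/s|max-width=500px}}</noinclude>"
          "text on the second page{{block center/e}}")

each_page_ok = is_balanced(page_view(first)) and is_balanced(page_view(second))
joined_ok = is_balanced(transcluded_view(first) + transcluded_view(second))
```

Under this model each page is balanced when viewed on its own, and the joined footnote is balanced once transcluded, because the first page contributes only the opener and the second only the closer.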

PERMISSION REGARDING USE OF TEXT

Sir

I wish to make a request for using "The Bishop's Candlesticks" by Norman McKinnel for educational purposes. Kindly guide me through the procedure.

Regards, Richa [unsigned comment by 49.207.150.99 (talk)]

Most of our text here is public domain in the US, with a few freely licensed pieces. [The Bishop's Candlesticks] is a British work published in 1908 by an author who died in 1932; I don't believe it's under copyright anywhere in the world and therefore is free to do with as you wish. None of us can give you more permission than that, and this is not legal advice.--Prosfilaes (talk) 10:39, 28 September 2019 (UTC)
The link to the text at English Wikisource: The Bishop's Candlesticks. The licence is given at the bottom; as for Wikisource, you can download and use the work to any purpose you need to use it. --Jan Kameníček (talk) 17:22, 28 September 2019 (UTC)
If you utilise the epub download link on the left hand sidebar, I believe that you will find that the appropriate credits will be part of the download ("About this digital edition" which is the last page). — billinghurst sDrewth 11:20, 29 September 2019 (UTC)

A few problems with an old scan and request for best practices

I have finished with The Applicability of Weber's Law to Smell as far as a first pass. I have a few issues and would like some feedback:

  1. There are a couple of charts that should be converted to SVG (e.g. Page:Eleanor Gamble - The Applicability of Weber's Law to Smell.pdf/56). What is the easiest program to generate these? I'd prefer to not do them by hand.
    I’m not sure what you mean by "generate" an SVG image. Adobe Illustrator is one of the major programs to work with SVG (I go down to my local university to use their Adobe software). You can save an existing image as SVG, or, if you want to recreate it with clean lines, add a new layer, set it to semi-transparent so you see the old image, trace over the lines with Illustrator’s line-drawing tools, make the layer non-transparent, then save just that layer as SVG. Levana Taylor (talk) 20:17, 29 September 2019 (UTC)
    @Levana Taylor: Sorry, I meant a program that can easily generate SVG graphs from data. I looked for a few online and tried LibreOffice Calc but they seem a little cumbersome. Is there a program (like a spreadsheet program) that can take input data like a table and generate an SVG graph easily? —Justin (koavf)TCM 20:37, 29 September 2019 (UTC)
    Finding a program to output data in exactly that form is unlikely. And even if you did find it, it would probably take more programming to set it up just so than doing it by hand in Inkscape (which is a very, very good free vector graphics program).
    If you are willing to essentially write a program to output from data, rather than just do it as graphics, you probably want Octave (or MATLAB), Matplotlib (Python) or TikZ (LaTeX). But it won't be trivial to make this image with any of them. Inductiveloadtalk/contribs 10:33, 30 September 2019 (UTC)
  2. There are some tables that cross over pages and this is one of those times where print considerations of pagination don't really make sense for digital. E.g. see the text that splits across Page:Eleanor Gamble - The Applicability of Weber's Law to Smell.pdf/55 and Page:Eleanor Gamble - The Applicability of Weber's Law to Smell.pdf/56. On paper, it makes more sense to have the running text broken up with the tables but digitally with one long scroll, it's very jarring. How should I fix this? Should I have a way of inserting divs from different pages in different orders?
  3. I have several moderately complicated tables (e.g. Page:Eleanor Gamble - The Applicability of Weber's Law to Smell.pdf/49) and I have noticed two problems. One is that the little empty rows that I have at the top are too thick: can someone tell me how to make these rows slimmer? The other is that I have several lines broken up by cross-cutting rows. Between the CSS, HTML, and MediaWiki, I'm not sure how to fix this. Additionally, these tables are not centered.
    I have modified the table in this diff. Basically, overall table borders can be done right after {|, rather than on a per-cell basis. "margin:auto" is the secret sauce to centring a table on the page. Inductiveloadtalk/contribs 20:28, 29 September 2019 (UTC)
    Brilliant. Maybe I can use this to fix the other tables. Thanks kindly. —Justin (koavf)TCM 20:37, 29 September 2019 (UTC)
  4. Most importantly, the scan that I have is missing the last page. It can be found here: https://ia801902.us.archive.org/22/items/jstor-1412679/1412679.pdf. I have manually inserted it into the page itself but this is probably not optimal. I have also updated the scan at c:File:Eleanor Gamble - The Applicability of Weber's Law to Smell.pdf with an inserted page but again, this may not be the best practice. Any feedback is appreciated for this.

Thanks. —Justin (koavf)TCM 19:37, 29 September 2019 (UTC)

Nice work, @Koavf:
(1) It took me some pondering and looking, but now I see why you're looking to generate SVGs: you want to recreate charts like the one reproduced here, based on the original data. Correct? I would imagine it's possible to do so with a program like LibreOffice, but I'm not certain how. (I've successfully produced PNGs from spreadsheets, but not SVGs.) Regardless, I agree that this seems like an ideal way to approach these images (though others may disagree, preferring photographic preservation of the original image). -Pete (talk) 22:20, 29 September 2019 (UTC)
(2) I think it's fine to combine stuff originally presented on two pages into one thing. It's commonly done with images that span pages, for instance, with the map at the beginning of this Wikisource page. I would take a similar approach.
(3) (nothing to add here)
(4) My take: Explain any such decisions in the "notes" field of the header (or perhaps in a template on the work's talk page). The key question is, "what's the source document?" If the source document is a printed book, rather than a scan of a printed book, I think it's fine to reassemble the digital file to better reflect the original book. But it should be explained somewhere a critical reader is likely to see it. -Pete (talk) 22:25, 29 September 2019 (UTC)
@Peteforsyth: That's exactly correct: I want to recreate these data which are perfectly suited for SVG rather than raster graphics. We should not prioritize photographic reproduction of non-photographic images, just like how we don't prioritize a perfectly "typographic" reproduction but a reproduction of the textual content including what it semantically means, not just how it looks. That's to say nothing of how much more accessible SVG is than raster graphics. I monkeyed around in LibreOffice a little bit but couldn't reproduce what I wanted easily. I may need to go back to that. Do you have any feedback for my other issues above or bandwidth to validate any pages? —Justin (koavf)TCM 22:28, 29 September 2019 (UTC)
Yeah, I think we're in agreement on the best approach. Wish I knew better how to generate the charts in SVG, but I don't. I validated the page associated with that chart, and I'll see if I can do a few more. I'll try to watch this discussion too -- they're good questions, I'm curious what others have to say. -Pete (talk) 22:33, 29 September 2019 (UTC)
A very simple manual recreation of the graph in Inkscape is here: File:The Applicability of Weber's Law to Smell Table 04A.svg. I have used "Liberation Serif" as the font, as 1) I had it already and 2) it's one of the fonts MediaWiki provides. Took only a few minutes. Inductiveloadtalk/contribs 11:12, 30 September 2019 (UTC)
@Inductiveload: This is excellent. I've tried using Inkscape (and GIMP and Photoshop and Illustrator) before and it's just confusing to me but I guess I need to learn to make SVG some way other than by hand or via spreadsheet programs. This is very helpful--I'll try making the other two graphs. —Justin (koavf)TCM 17:22, 30 September 2019 (UTC)
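
Editorial illustration for the question above about generating SVG graphs from data. Rather than the Matplotlib, Octave, or TikZ routes that Inductiveload mentions, this minimal sketch uses only the Python standard library to write a plain line chart as SVG. The function name line_chart_svg, the layout values, and the sample points are all hypothetical; the data is placeholder data, not taken from Gamble's tables.

```python
from xml.sax.saxutils import escape


def line_chart_svg(points, width=400, height=300, margin=40, title=""):
    """Return an SVG document plotting (x, y) points as one connected line."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_span = (max(xs) - min(xs)) or 1  # avoid dividing by zero
    y_span = (max(ys) - min(ys)) or 1

    def to_pixels(x, y):
        px = margin + (x - min(xs)) / x_span * (width - 2 * margin)
        # SVG's y axis points down, so flip it.
        py = height - margin - (y - min(ys)) / y_span * (height - 2 * margin)
        return f"{px:.1f},{py:.1f}"

    path = " ".join(to_pixels(x, y) for x, y in points)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        f"<title>{escape(title)}</title>"
        f'<rect x="{margin}" y="{margin}" width="{width - 2 * margin}" '
        f'height="{height - 2 * margin}" fill="none" stroke="black"/>'
        f'<polyline points="{path}" fill="none" stroke="black" stroke-width="2"/>'
        "</svg>"
    )


# Placeholder data, NOT taken from the paper's tables.
sample = [(1, 2.0), (2, 3.5), (3, 3.0), (4, 5.0)]
svg_text = line_chart_svg(sample, title="Placeholder chart")
```

Writing svg_text to a file with a .svg extension gives something that can be opened and refined in Inkscape. Axis labels, tick marks, and fonts are left out here and would still need to be added by hand or in code.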

What to link to on Wikidata

Should a work here be linked to its work item on Wikidata, an item specifically for the Wikisource edition, or the item for the edition which was scanned? Levana Taylor (talk) 15:49, 1 October 2019 (UTC)

I think the edition which was scanned, and the Wikisource edition transcribed from that scan, can be considered to be the same edition. The work itself (as distinct from any edition of that work) is a separate item, linked to the edition-item using has edition and edition or translation of, and linked to the Wikisource {{versions}} or {{translations}} page. —Beleg Tâl (talk) 16:14, 1 October 2019 (UTC)
@Levana Taylor: d:Wikidata:WikiProject Books is the guidance, though unfortunately it is not black and white issue as I see it.

Where an edition exists, or should exist, then we have an edition item, and a Wikidata item. Typically we are talking about classical literary works (fiction or non-fiction), and we produce a VERSIONS page that will generally link to the WD item about the work (and usually the WP article). However, for court cases, the court judgment/output that we reproduce effectively has only one version/edition, so those I link directly to the WD item about the case, and fudge the element of the judgment being a separate component.

Then we come to entries and articles that are part of larger literary works and probably either never going to be more than one edition, or evolve through various editions, eg. Crockford Clerical Directory; Who's Who; etc. which don't fit into the model of WD very well, and no-one has a particular interest. With those I just try to maintain a uniform approach, and if someone dislikes what I do, they can bot-remedy it. — billinghurst sDrewth 22:04, 8 October 2019 (UTC)

Once a Week (magazine)/Series 1/Volume 5

What is the reason for the messed-up syntax on this page Once a Week (magazine)/Series 1/Volume 5? As far as I can see everything about both the page namespace and the main namespace are the same as for all the other volumes of the magazine that have no problems. Levana Taylor (talk) 22:31, 4 October 2019 (UTC)

@Levana Taylor: I am uncertain as to the technical reason for this problem occurring but it has some association with mixing Index: space references. You will note the first page is being drawn via Index:Once a Week Volume V.djvu and the latter two (which register no errors here) via Index:Once a Week Dec 1860 to June 61.pdf. You might investigate if an equivalent effect may be achieved which draws all pages via a single index (still using multiple <pages>s should be O.K.)
Alternately, one solution which works (but will likely draw criticism!) would be to substitute line 11:
<pages index="Once a Week Volume V.djvu" include=4 />
with, instead:
{{page|Once a Week Volume V.djvu/4|num=i}}
114.78.66.82 08:30, 5 October 2019 (UTC)
I corrected the <pages> to use all the same index but the problem’s still there. Levana Taylor (talk) 12:08, 5 October 2019 (UTC)
It is really strange. I have just tried to transclude Page:Once a Week Volume V.djvu/5, which is a very simple page where apparently nothing is wrong, and when I clicked "Show preview", the syntax was messed up too. --Jan Kameníček (talk) 13:52, 5 October 2019 (UTC)
This is due to using {{ditto}} in the pagelist on the index page. The pagelist text is included on the mainspace page by the ProofreadPage extension to indicate where the page breaks are. If it's got weird stuff in it, it breaks the HTML markup. Arguably, the extension is at fault for insufficient sanitisation, but the easy fix is not putting non-text into the pagelist in the first place. Inductiveloadtalk/contribs 14:18, 5 October 2019 (UTC)
OK, thanks a lot! Levana Taylor (talk) 17:21, 5 October 2019 (UTC)
@Inductiveload: Well spotted! I completely missed that. 114.78.66.82 21:13, 5 October 2019 (UTC)
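As a rough illustration of the fix described above, a small Python check can flag pagelist labels that contain wiki markup before they end up in the transcluded page. This is only a sketch: `find_markup_in_pagelist` is a hypothetical helper, the regular expressions are deliberately simplified, and it assumes the index page's `<pagelist ... />` tag has been copied out as plain wikitext.

```python
import re

# Attribute body of a <pagelist ... /> tag (simplified: first tag only).
PAGELIST_RE = re.compile(r"<pagelist\b(.*?)/>", re.DOTALL)
# Anything that is not plain text: templates, wikilinks, HTML-like tags.
MARKUP_RE = re.compile(r"\{\{.*?\}\}|\[\[.*?\]\]|<[^>]+>", re.DOTALL)


def find_markup_in_pagelist(index_wikitext):
    """Return the markup fragments found inside the pagelist tag, if any."""
    match = PAGELIST_RE.search(index_wikitext)
    if match is None:
        return []
    return MARKUP_RE.findall(match.group(1))
```

An empty result suggests the labels are plain text; anything returned (such as a `{{ditto}}` call) is a candidate for replacement with plain text.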

Entick v Carrington

I recently found the full text of Entick v Carrington ([1765] EWHC J98 (KB)) (decided in 1765) on [1], but I am not sure how to import the judgment, given that I am not really experienced with case law. Can anyone tell me how to do this? Many thanks. 廣九直通車 (talk) 02:55, 5 October 2019 (UTC)

You can find an overview of the process at Help:Beginner's guide to adding texts. The scan is available here. —Beleg Tâl (talk) 12:08, 8 October 2019 (UTC)
There is some example case law at Portal:Law of the United Kingdom and Ireland. It's a bit hard to find, because it's not linked from Portal:Law and Portal:Law of the United Kingdom doesn't exist. Campbell v Hall is probably a fairly good example for your case, it's from 1774, and similarly formatted. Inductiveloadtalk/contribs 21:55, 8 October 2019 (UTC)
@廣九直通車: It is pretty much going to be a copy and paste of the text, then some formatting, with appropriate references and sourcing. There isn't a lot we can do that is different. Then it is the curatorial aspects outside of main namespace. Listing on portal pages, wikidata item, possibly links from enWP, NO author page in this situation. There are plenty of examples below category:case law, especially from US, to use as comparisons. — billinghurst sDrewth 22:16, 8 October 2019 (UTC)

Linking to paragraphs

I'm working on this book, which -

  • numbers each paragraph, and
  • frequently makes reference to paragraph numbers (including those in other chapters).

I think it'd be cool and useful if I could format these references as hyperlinks, which would make navigation nicer...but how can I do it? Couldn't make much sense of the links and subpage documentation for this purpose. --Contrapunctus-1 (talk) 11:04, 8 October 2019 (UTC)

You could potentially use {{verse}} for this purpose. Or do it manually using {{anchor}} or {{anchor+}} —Beleg Tâl (talk) 12:05, 8 October 2019 (UTC)
(Edit conflict, lol) use {{anchor+}}, like so...
{{anchor+|Para1|1.}} A certain amount of elementary knowledge ....
You will then be able to wikilink to it just like a section header, i.e. [[Pagename#Para1]] Jarnsax (talk) 12:11, 8 October 2019 (UTC)
Thanks, Beleg Tâl and Jarnsax! I didn't realize I could search the Template namespace to look for answers :) Is there any reason you suggest {{anchor}} and {{anchor+}} over, say, {{numbered div}}? Contrapunctus-1 (talk) 03:41, 9 October 2019 (UTC)
{{numbered div}} only works in a particular context. {{Anchor}} works anywhere you want to use it—including mid-paragraph. For an equivalent example to yours that I'm part way through see Index:Fugue by Ebenezer Prout.djvu. Beeswaxcandle (talk) 05:43, 9 October 2019 (UTC)
Wow, that is an amazing amount of work! I tried using {{anchor+}}, but I don't like that it doesn't format the visible anchor text as a URL. If that were possible, it would indicate to users that they can link to this anchor. Maybe I should look into making my own template 🤔 Contrapunctus-1 (talk) 11:44, 9 October 2019 (UTC)
I would suggest that you not strive to format the anchor text as a URL. Firstly, it would indicate to users that they can click on it to link to something else, rather than that they can link to it as an anchor. Secondly, if the text is not formatted as a URL in the source material, then it is not good to make it one in the transcription (unless it's listed as acceptable usage by WS:ANN or WS:Links). —Beleg Tâl (talk) 13:13, 9 October 2019 (UTC)
Fully concur. Anchors are passive, wherever used. If there is a crossreference to add, you know that you have created anchors, and you know that you can easily code for them with wikilinks (as expressed above). — billinghurst sDrewth 13:49, 9 October 2019 (UTC)
I went through WS:ANN and WS:Links. Some way to copy the URL of an anchor seems to generally be considered useful (e.g. see the headings here or here, which have a link button on mouseover). In the original text, each paragraph is numbered, and I was merely proposing to format the numbers as links - fairly unobtrusive, and it doesn't seem all that different from references. I don't see it as changing the content, but making better use of the facilities afforded to us by the medium of the Web, so better ways to refer to the work are visible to users.
Nevertheless, I'll defer to the conventions agreed upon by the community, and proceed with {{anchor}}/{{anchor+}}. — Contrapunctus-1 (talk) 14:40, 9 October 2019 (UTC)
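For a book with many numbered paragraphs, the {{anchor+}} markup and the matching wikilinks suggested above could be generated rather than typed by hand. The Python below is a minimal sketch under stated assumptions: the `Para` prefix simply follows the example in this thread, the function names are hypothetical, and paragraphs are assumed to start with a number followed by a full stop.

```python
import re

# A paragraph that starts with its number, e.g. "12. Some text".
NUMBERED_PARA_RE = re.compile(r"^(\d+)\.\s+")


def add_anchor(paragraph, prefix="Para"):
    """Wrap a leading paragraph number in {{anchor+}}; other text is unchanged."""
    return NUMBERED_PARA_RE.sub(
        lambda m: "{{anchor+|%s%s|%s.}} " % (prefix, m.group(1), m.group(1)),
        paragraph,
        count=1,
    )


def link_to_paragraph(pagename, number, prefix="Para"):
    """Build a wikilink to a paragraph anchor, labelled with the number."""
    return "[[%s#%s%s|%s]]" % (pagename, prefix, number, number)
```

For example, `add_anchor("1. A certain amount of elementary knowledge")` yields the same markup as the hand-written line above, and `link_to_paragraph("Pagename", 1)` produces a cross-reference to it.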

An incorrect ISBN number?

Is it possible for publications to have an incorrect ISBN number? How would it be possible to track down such a book and get the correct number? It seems that the publisher printed the wrong ISBN number. — Ineuw (talk) 21:20, 8 October 2019 (UTC)

@Ineuw: It's not incredibly uncommon, especially with smaller publishers. A common mistake is not properly converting 10 digit ISBN to a 13 digit one, or vice versa, which gives you the wrong check digit. Try searching for the book on WorldCat by name or title, or using only the middle part of the ISBN. Giving us what book details you have might also help. Jarnsax (talk) 21:37, 8 October 2019 (UTC)
@Jarnsax: Much thanks, but I also just found out that the numbers were made up by the publisher. — Ineuw (talk) 04:27, 11 October 2019 (UTC)
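The check-digit problem mentioned above can be tested mechanically. The following Python sketch uses the standard ISBN-10 (modulus 11) and ISBN-13 (modulus 10, weights 1 and 3) check-digit rules; the function names are illustrative only, and input validation is left out for brevity.

```python
def isbn10_check_digit(first9):
    """Check character for the first nine digits of an ISBN-10 ('X' means 10)."""
    total = sum((10 - i) * int(d) for i, d in enumerate(first9))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def isbn13_check_digit(first12):
    """Check digit for the first twelve digits of an ISBN-13 (weights 1, 3, 1, 3, ...)."""
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def isbn10_to_isbn13(isbn10):
    """Prefix 978, drop the old check digit, and recompute the new one."""
    digits = isbn10.replace("-", "").replace(" ", "")
    body = "978" + digits[:9]
    return body + isbn13_check_digit(body)
```

A printed ISBN whose last character does not match the computed check digit was either mistyped or converted without recalculating the check digit. That would not catch numbers a publisher simply made up, which may still happen to pass the check.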

OCR content one page off

The OCR content of Index:Autobiography of an Androgyne 1918 book scan.djvu is shifted by one page. Is it possible to fix this? Kaldari (talk) 05:22, 9 October 2019 (UTC)

@Kaldari: The text layer in the original DjVu was offset (presumably a messup at IA). This can be fixed by dumping the hidden text of each page with djvutxt—which will generate sexpressions to recreate a hidden text page with all structure intact—and then reimporting it at the correct pages using djvused. However, as I didn't have a script for that particular operation set up I've instead regenerated it from the source .jp2 files (at full resolution and with new OCR) and uploaded the new version. If the old OCR was better I can look into preserving that instead, but in my experience that's generally not been the case. While I was at it I also rotated the two foldout pages (pp. 293–294) that were printed in landscape format, mainly because Tesseract does not handle rotated text very well (I think it thought they were in Cyrillic or something). If you would like them in the original configuration they can be rotated back. Please let me know if there are any issues. --Xover (talk) 07:42, 9 October 2019 (UTC)
@Xover: Thanks again! Kaldari (talk) 15:09, 9 October 2019 (UTC)
Which pages shifted? Are you referring to and the following blank page? — Ineuw (talk) 04:26, 11 October 2019 (UTC)
The updated version is, hopefully, correct. The original had every page's text layer shifted by -1 (text layer one page earlier than the actual page). The problem was in the original DjVu from IA, and not just a symptom of the MW bug triggered by some DjVu files that looks similar (offset OCR pages). --Xover (talk) 05:53, 11 October 2019 (UTC)
Thanks, the picture is clear. By the time I looked, it was fine. — Ineuw (talk) 07:42, 11 October 2019 (UTC)
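The dump-and-reimport approach described above (djvutxt to extract, djvused to reinsert at the right page) might look roughly like the Python sketch below. It only builds the command lines and does not run them. The helper name and the temporary file names are hypothetical, and the flags (`-page`, `-detail`, `-e`, `-s`, `set-txt`) are given as I understand them from the DjVuLibre manual pages, so they should be checked against the installed version before use.

```python
def shift_text_layer_commands(djvu_path, page_count, offset=1):
    """Build (not run) commands that move each page's hidden text `offset` pages later.

    Every page is dumped before anything is written back, so no text layer
    is overwritten before it has been read. The first `offset` pages keep
    their old text and need separate handling.
    """
    dump, restore = [], []
    for source in range(1, page_count - offset + 1):
        target = source + offset
        txt = "page%04d.txt" % source
        # djvutxt writes the hidden text, with its structure, as an s-expression.
        dump.append(["djvutxt", "-page=%d" % source, "-detail=char", djvu_path, txt])
        # djvused reads that file back in as the text layer of the target page.
        restore.append(["djvused", djvu_path, "-e",
                        "select %d; set-txt %s" % (target, txt), "-s"])
    return dump + restore
```

With `offset=1` this matches the case in this thread, where each text layer sat one page earlier than the page it belonged to. Each returned list could be passed to `subprocess.run` once the flags have been verified on a copy of the file.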

copyright and unknown death dates

What is the copyright status of works by authors whose death dates are unknown, and whose works were published in the mid-19th century? Is it Wikisource policy to act as if they died between 1919 and 1948 and use PD-70, that is, use an assumed minimum time since death? I don’t know what lawyers in Spain and Mexico would argue as to whether an author about whom nothing is known but their name is to be treated as anonymous, but I take it Wikisource would prefer to play it safe and use a generously-assumed date of death rather than the anonymity rule of going by publication date. Levana Taylor (talk) 08:05, 9 October 2019 (UTC)

When Commons discussed this a while back they landed on publication + 120 years as a reasonable assumption (though some argued for much more) based on some guidance from the US Copyright office. It seems a reasonable rule of thumb for us to adopt: while some authors will have lived for more than 50 years (50+70=120) after publishing something, in most instances that will not be the case.
On anonymous works I'm not sure we have a well-established standard for what the threshold is: it is entirely possible that in a legal sense a book that includes an author's name is anonymous when that author name cannot with any certainty be linked to any real world person (for example, if we cannot even determine whether the given author's name is a real name or a pseudonym). I haven't really looked into this, but I'm pretty sure the relevant laws tend to talk about knowing the identity of the physical or legal person and not about whether or not there is some text that looks like a name on the title page as published. On the other hand, if we're pretty sure that the name is a real person but we just don't know anything about them, then that would argue against treating it as anonymous. --Xover (talk) 08:23, 9 October 2019 (UTC)
Of the 1860s authors I’m dealing with, I know the death dates of about 350, and six of those are between 1920 and 1926 (thus, PD-80). That's different from publication+120. Still, if the committee of people who are smarter than me thinks that I can call 1860s publications with unknown author dates PD-old, that’s good enough for me. Levana Taylor (talk) 09:01, 9 October 2019 (UTC)
We only concern ourselves with US copyright; for other countries, we tell them that they cannot have a copy. With our criterion of pre-1923, the date of death is usually only pertinent for whether we host the scan at Commons or here. That said, I usually try to hunt down authors and get dates, which is why I watch new author creations, and churn away at the author maintenance categories. — billinghurst sDrewth 09:48, 9 October 2019 (UTC)
Yeah, that’s perfectly satisfactory for WS, just use {{PD-1923}}. It does leave me puzzled as to how to deal with making a corresponding wikidata item; it seems like there are no exactly suitable items defined yet for the "license" and "copyright determination method" properties. It's a question to ask over there, but honestly, I hardly expect answers any more from questions asked at WD -- there seem to be remarkably few people participating. Levana Taylor (talk) 11:05, 9 October 2019 (UTC)
{{PD-1923}} corresponds to the copyright determination method d:Q47246828 "published more than 95 years ago" alongside of jurisdiction=USA. —Beleg Tâl (talk) 13:23, 9 October 2019 (UTC)
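The publication + 120 years assumption discussed above reduces to simple arithmetic, sketched here in Python. The function name is hypothetical, and the 120-year figure is only the rule of thumb mentioned in this thread (50 assumed years of life after publication plus 70 years after death), not a legal test.

```python
def assumed_public_domain(publication_year, current_year, assumed_term=120):
    """True if more than `assumed_term` years have passed since publication.

    120 = 50 assumed years of life after publication + 70 years after death.
    """
    return current_year - publication_year > assumed_term
```

For an 1860s publication checked in 2019 the result is true with decades to spare, while a work published in 1900 would not yet qualify under this assumption.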

Could you please check formatting?

of Emergency Regulations Ordinance, 1922? It was my first time transcribing an ordinance of Hong Kong, based on File:The Hongkong Government Gazette 19220228 Emergency Regulations Ordinance.pdf. I copied the format from British and Canadian laws on ws.--Roy17 (talk) 19:12, 9 October 2019 (UTC)

@Roy17: Based on a cursory look the formatting seems fine. But since we have a scan already available, why did you not follow the normal proofreading process? See Help:Adding texts. --Xover (talk) 07:02, 10 October 2019 (UTC)

Copyright

I hope this is not too bothering a task for anybody!

Am I correct in supposing that all works published pre-1924 without a copyright renewal after that date can certainly be hosted on Wikisource?

I'm asking because I'd like to import W.B. Yeats' pre-1924 plays from Project Gutenberg, and, as I annoyingly don't know its copyright status, I don't want to painstakingly re-format the whole text, only to find that all (say) thirty-something pages have to go.

Can I ask how to find records of copyright renewals also, so for the future I don't have to bother anybody or worry myself unnecessarily? Thanks in advance, Orlando the Cat (talk) 05:24, 10 October 2019 (UTC)

@Orlando the Cat: Copyright, sadly, is not straightforward, so you'll rarely get a single blanket answer.
The closest thing we have is the pre-1924 rule of thumb: works published anywhere in the world before 1924 are in the public domain in the US due to expiration of any publication+95 years copyright term.
Works hosted on Wikisource must at a minimum be in the public domain in the US; but works hosted on Commons must also be in the public domain in their country of origin. In the case of Yeats, a lot of his output will have been first published in the UK. The copyright term in the UK is 70 years post mortem auctoris ("after the author's death", abbreviated pma), and so the UK copyright for Yeats' works will have expired in 1939 + 70 = 2009.
Pre-1924 works are not subject to renewal (they will have expired regardless) in the US. In the UK the terms have always been of fixed lengths not subject to renewals. I thus don't think there are any relevant renewal checks for Yeats' pre-1924 works.
However, works first published in the UK whose copyright had not yet expired there on 1 January 1996 will have had their US copyright restored by the URAA. Since the US copyright term is 95 years from publication, this means any of Yeats' works that were first published in the UK in 1924 or later are still protected by copyright in the US even if they have since expired in the UK. Such works can be hosted neither here nor at Commons.
But, of course, there are exceptions to the exceptions (told you copyright was tricky). The URAA restored US copyright after the fact, in 1996, for foreign works whose US copyright had expired due to failure to observe US formalities (notice, registration, renewal). But foreign works that were also published in the US within 30 days of the publication in their home country are, for US copyright purposes, considered to have been first published in the US. They are thus subject to ordinary US copyright rules, including the historical requirements for copyright notice, registration, and renewal; and, crucially, they are not affected by the URAA copyright restoration in 1996. That means that any of Yeats' work that was either actually first published in the US, or was first published elsewhere but published in the US within 30 days, can potentially have expired if it failed to print a copyright notice, failed to register the copyright, or failed to renew it. For works that fall into this category we will have to do detective work in copyright records to determine status. But it is often worthwhile to do so as a surprising number of works have failed to observe one or more of these formalities.
However, all that being said, please do not import anything from Project Gutenberg, ever. Gutenberg have a rather lax attitude to reproducing their sources, and will often conflate multiple editions without even bothering to document where the text comes from. And all new projects here really should be scan-backed now. If you want to add Yeats' works then please start by identifying good printed editions of the works, finding and uploading scans of these, setting up transcription projects for them, and then proofreading from these. Gutenberg etexts are plenty well enough hosted and mirrored on Project Gutenberg; we can and should do better than just be another mirror for them. If you need help please feel free to ask. Nothing in the process is actually very advanced, but it is a bit complicated before you've done it a few times. --Xover (talk) 06:57, 10 October 2019 (UTC)
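The two fixed-term rules of thumb in this reply (publication + 95 years in the US, and 70 years pma in the UK) can be written as a short Python sketch. The function names are made up for illustration, terms are assumed to run to the end of the calendar year, and the URAA restoration and the notice, registration, and renewal exceptions described above are deliberately ignored, so this is a first filter rather than a determination.

```python
def public_domain_in_us(publication_year, current_year):
    """Publication + 95 years, with the term running to the end of that year."""
    return publication_year + 95 < current_year


def public_domain_pma(death_year, current_year, term=70):
    """Life + `term` years, as in the UK (70 years pma)."""
    return death_year + term < current_year
```

In 2019 this gives the pre-1924 cutoff (1923 + 95 = 2018, which is before 2019), and for Yeats it reproduces the 1939 + 70 = 2009 calculation.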
@Xover: Thank you for your speedy, detailed and useful reply!
I've had a nagging suspicion that Project Gutenberg wasn't the best place to find reliable sources of texts (the playscripts are such a complex mess), and I only began checking the website once I joined Wikisource; I shall keep your advice doubly in mind. I believe, however, judging from the debates on Wikisource talk:Proofread of the Month, that the Internet Archive is the next-best place to find texts - am I correct?
It's a shame Yeats' plays are so difficult to find, otherwise I'd have tried to upload scans of them long ago; this is one of the reasons I've been wanting to put them online, in a trusted place. I'll have a quick look through existing Indexes, in case there are ones I missed.
I suppose I may have to give up my project, if things get too complex, but that's alright - something else will appear which will catch my attention.
Thanks again, Orlando the Cat (talk) 07:58, 10 October 2019 (UTC)
The Internet Archive is the easiest source of scans; HathiTrust is often more complete, but is harder to download from.--Prosfilaes (talk) 08:39, 10 October 2019 (UTC)
HathiTrust have some plays by Yeats available, see [2] . You can also try the Hathi download helper, I have quite a good experience with it, although the downloads sometimes take a very long time. --Jan Kameníček (talk) 11:27, 10 October 2019 (UTC)
@Orlando the Cat: I took a quick look and found a good scan of Yeats' own 1922 edition of his Irish plays at the Internet Archive (Internet Archive identifier: playsinprosevers00yeatuoft). I've uploaded that on Commons as File:Plays in Prose and Verse (1922).djvu. I've also set up a basic transcription project for it at Index:Plays in Prose and Verse (1922).djvu. Based on my own experience these are the biggest hurdles to getting started with such efforts. The next step is to proofread the pages of the book one by one, and when that is done, we transclude it into mainspace (sort of as if it were a template) using the <pages …> tag. For a work like this we would typically transclude the front matter onto Plays in Prose and Verse, and each play into subpages like those linked in the table of contents (i.e. Plays in Prose and Verse/Cathleen ni Holihan, Plays in Prose and Verse/The Pot of Broth, etc.). Since these are plays rather than chapters, we would also typically create redirects or entries on versions pages for the play title: Cathleen ni Holihan → Plays in Prose and Verse/Cathleen ni Holihan. Practice varies a bit for when to transclude, but I'd suggest transcluding each "chapter" (play) as and when it is done to start. I'll try to watch your progress and help out when needed; or you should feel free to ask for help here if you get stuck. Proofreading the pages is the big job here; everything else there's usually someone available to help out with. --Xover (talk) 14:11, 10 October 2019 (UTC)
Thank you all for your informative replies!
@Xover: I'll get to work on Plays in Prose and Verse at once - thanks again! Orlando the Cat (talk) 00:21, 11 October 2019 (UTC)
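The transclusion step outlined above (one mainspace subpage per play, plus a redirect from the play title) can be sketched in Python as plain string building. The helper names are hypothetical, and the page numbers in the usage note are placeholders rather than the real positions of the plays in the scan.

```python
def pages_tag(index, first, last):
    """Build the <pages> tag that transcludes a range of Page: namespace pages."""
    return '<pages index="%s" from=%d to=%d />' % (index, first, last)


def subpage_title(work, part):
    """Mainspace title for one play, as a subpage of the work."""
    return "%s/%s" % (work, part)


def redirect_text(target):
    """Wikitext for a redirect from the bare play title to its subpage."""
    return "#REDIRECT [[%s]]" % target
```

For instance, `pages_tag("Plays in Prose and Verse (1922).djvu", 15, 40)` would go on the subpage named by `subpage_title("Plays in Prose and Verse", "Cathleen ni Holihan")`, once the actual first and last scan pages of the play are known.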

Using Wiki source material for commercial purposes

Dear Wiki source help,

I've got a question concerning the commercial use of Wiki source material. I would like to make a graphic novel based on the text of Alice in Wonderland. My question is whether I can use the text from the novel found on the wiki source page and then publish it for commercial purposes?

Best regards,

Charles

Hi Charles. The texts listed at Alice's Adventures in Wonderland should all be in the public domain worldwide. Wikisource generally does not assert any additional copyright in the texts we host since they are mere transcriptions of the original (other terms apply for other parts of the website, but they are generally on the free side; do be aware of that though).
To the degree any independent copyright exists those parts are dual-licensed (you can pick either license depending your preference or needs) under the Creative Commons Attribution-ShareAlike 3.0 Unported License (CC BY-SA) and the GNU Free Documentation License (GFDL). The choice is up to you (and your lawyers), but I would usually assume CC BY-SA to be the easiest to work with. The "BY" bit refers to a requirement to attribute the source (i.e. the Wikisource contributors, typically by a link to the page here you got it from), and the "SA" bit refers to a requirement to publish derivative works under the same license.
But as I said, you generally do not need a license to use the text of Alice in Wonderland as the text itself is in the public domain since its copyright has expired. We would always appreciate acknowledgement of the work our contributors have performed, of course, but that is as a mere courtesy. --Xover (talk) 09:48, 11 October 2019 (UTC)

Hi Xover,

Thank you for your reply. I'll definitely refer to Wiki source and its volunteers when the text is done. By the way, I just realized something: Alice in Wonderland is an English text. Is the same true for an American text from the same period whose author died around the same time?

To be clear: I'm not a lawyer, and this ain't legal advice. Ultimately you'll have to consult a real lawyer to get those determinations!
That being said, Wikisource strives for all its texts to be freely and safely reusable, including for commercial purposes, and to that end expends quite a lot of energy on copyright determinations when potential issues are raised (we do not pre-vet contributed texts). When we do there are some rules of thumb regarding copyright we have found useful. The most applicable one based on your question is that anything published anywhere in the world before 1924 will be in the public domain in the US due to its term of copyright protection having expired (generally 95 years after publication). That guideline is derived from US copyright law, and Wikisource as such requires only that a work is freely licensed or in the public domain in the US (where the servers are hosted). For your purposes you may also need to care about the country in which the work was first published, and the copyright regime in the jurisdiction in which you yourself are located. If you intend to commercially publish a derivative work internationally, your publishing partner may have additional requirements. As a rule of thumb, most (but not all!) jurisdictions have copyright terms that expire 70 years after the death of the author (vs. 95 years after publication in the US). Depending on the work in question and the jurisdictions involved the answer to that question will vary. --Xover (talk) 11:23, 11 October 2019 (UTC)

Bishop of Hereford: John or Thomas?

It seems that the bishop John Tresnant, author of The Process of John Tresnant, Bishop of Hereford... is the same person as John Trevenant mentioned by various sources. He calls himself John too in the mentioned text. What confuses me is that Wikipedia and several other sources for some reason call him Thomas Trevenant, while I did not find any source mentioning both names and clearly confirming that John Trevenant also went by the name Thomas or vice versa. May I ask for help in finding out what the person's real name is? Is it possible that the sources that call him Thomas are wrong? --Jan Kameníček (talk) 20:32, 11 October 2019 (UTC)

BTW: The external links in this contribution of mine behave strangely. --Jan Kameníček (talk) 20:36, 11 October 2019 (UTC)

What a great puzzle! So far I've found that Henry Wright Phillott refers to him as John in chapter 10 of Hereford [3], but as Thomas in chapter 11 of the same work. So you're not the first person to be stymied by this :) —Beleg Tâl (talk) 03:55, 12 October 2019 (UTC)
As it seems that he called himself John, so I called his author page John Trevenant, but it would still be useful to find out whether the Thomas version is another name of him or a later mistake, copied by various sources since. --Jan Kameníček (talk) 18:32, 12 October 2019 (UTC)
@Jan.Kamenicek: The work that is probably best cited (authoritative?) is Le Neve's Fasti ecclesiae Anglicanae and for Hereford the detail says John Trefnant or Trevenant. No mention of Thomas. Also check Page:Fasti ecclesiae Anglicanae Vol.1 body of work.djvu/569. Church of England databases don't start until Protestant times. — billinghurst sDrewth 11:29, 13 October 2019 (UTC)
Thank you very much. It really seems that some confusion happened later and modern sources including Wikipedia keep copying it one from another... --Jan Kameníček (talk) 11:39, 13 October 2019 (UTC)
I agree. In fact, I have been unable to find any mention of "Thomas" that predates the 1888 work by Phillott that I linked above, so my current hypothesis is that Phillott himself is the cause of the confusion. The mistake proliferated widely, however; even appearing in "Ancient Diocese of Hereford," in Catholic Encyclopedia, (ed.) by Charles G. Herbermann and others, New York: The Encyclopaedia Press (1913) —Beleg Tâl (talk) 14:10, 13 October 2019 (UTC)
I imagine the mistake was due to conflation with Thomas Spofford, bishop of Hereford 1422-1448. —Beleg Tâl (talk) 14:15, 13 October 2019 (UTC)
Note that the enwp article is now at w:John Trevenant. According to Ealdgyth (courtesy ping) who created it (and a lot of the other enwp articles for bishops of that era), the "Thomas" in the article was a mistake (possibly even a cut&paste error). The source cited in the article (Fryde, et al.) uses "John". --Xover (talk) 14:53, 13 October 2019 (UTC)
Win for Wikisource! — billinghurst sDrewth 20:24, 13 October 2019 (UTC)