1926 Shakespeare

It's that time of year again. Of the Yale volumes entering the public domain, only Titus Andronicus has had its copy released on IA.

Could you please generate a DjVu from this copy, and upload it to Commons as File:Titus Andronicus (1926) Yale.djvu?

No rush, but this is likely the volume I will tackle myself this year, or at least be the first to tackle, as each year I work to complete at least one of them. Thank you. --EncycloPetey (talk) 05:54, 10 January 2022 (UTC)

@EncycloPetey: Index:Titus Andronicus (1926) Yale.djvu. I've done basic sanity checks, but not a full quality check. Xover (talk) 10:04, 10 January 2022 (UTC)
Thanks. --EncycloPetey (talk) 16:44, 10 January 2022 (UTC)

Shakespeare of Stratford (1926) is now available. This scan has been visually checked by me for completeness and correct sequence. The Commons file should be named File:Shakespeare of Stratford (1926) Yale.djvu. Note: this volume is in a completely different format from the rest of the series, because it covers the sources for information about the Bard. --EncycloPetey (talk) 21:51, 10 January 2022 (UTC)

@EncycloPetey: This copy appears to be missing p. v, the first page of the ToC covering chapters I.–XIX (pp. 1–26). I checked the raw scan images and it looks like it's missing from the copy, and not something that was fatfingered in the scanning. Xover (talk) 08:46, 11 January 2022 (UTC)
Hmm, yes, it's missing an entire page: the copyright page and p. v should be between the title page and p. vi. Xover (talk) 08:55, 11 January 2022 (UTC)
@EncycloPetey: I have multiple other scans from which I can patch in the missing pages, for both the 1926 first edition and the 1947 third reprint, but since we don't have the copyright page from this copy I can't tell which specific printing it is. Most of the other scans are also typical scribbled-on, over-crushed black&white Google scans, but I suppose that's something we might have to live with. I can also crib a somewhat better scan of the two pages from a books-to-borrow scan on IA, but that's from the 1947 reprint so that's only an option if we conclude our copy is the reprint. Xover (talk) 09:15, 11 January 2022 (UTC)
While I would prefer a 1st edition, if we can only get a reprint, then that's what we'll have to do. I suspect this is a reprint, mainly for the fact that the publication year does not appear on the title page in Roman numerals. This is only a guess, however. This is not a volume I intend to transcribe myself; it seems more like one for which you'd have the interest to complete. If that is indeed the case, then making the decision what to do would rest appropriately with you. I am disappointed with myself for having not noticed the missing page v. --EncycloPetey (talk) 13:05, 11 January 2022 (UTC)
@EncycloPetey: Aha! The first edition here does indeed have the roman year on the title page, which makes the IA copy a reprint. I'll look through the options and pick one. The one you'd found looks really nice (both the copy and the scan) except for the missing leaf, so I may land on patching it up from one of the other copies. And, yes, I was going to ask whether you had any particular designs on this one before grabbing it for myself! :-) Xover (talk) 13:24, 11 January 2022 (UTC)
@EncycloPetey: Index:Shakespeare of Stratford (1926) Yale.djvu and Shakespeare of Stratford. Xover (talk) 07:27, 27 January 2022 (UTC)
Wonderful! I've linked it to the WD data item, and added it to the "works about" section on Shakespeare's Author page. --EncycloPetey (talk) 18:18, 27 January 2022 (UTC)

No good deed goes... (part XIX)

((fewer bugs than it initially appeared, but the Gentium Plus font is wonky fer shur, but manageable(?); read at your leisure))

On the one hand, your efforts are noticed quickly.

On the other hand, &@%#%!!   or something like that.   ;-)

I thought my eyes were playing me tricks, a LAMBDA λ with a KORONIS on top λ᾽ . . . Gasp!

Started dumping the Unicode text (before "Hipp. 316 sq") in binary and delving into mysteries, and it really is a LAMBDA followed by a KORONIS character,

U+1FBD GREEK KORONIS character (᾽)
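(For anyone who wants to repeat the dumping experiment, a quick sketch in Python; the sample string is the lambda–koronis pair from the scan, and the helper name is just illustrative.)

```python
# Dump each character's code point, Unicode name, and combining class.
# This shows that U+1FBD GREEK KORONIS is NOT a combining character
# (its canonical combining class is 0).
import unicodedata

def dump(text):
    for ch in text:
        print(f"U+{ord(ch):04X}  "
              f"{unicodedata.name(ch, '<unnamed>')}  "
              f"combining={unicodedata.combining(ch)}")

dump("λ\u1FBD")
# U+03BB  GREEK SMALL LETTER LAMDA  combining=0
# U+1FBD  GREEK KORONIS  combining=0
```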

Then what font is (now) active:

font-family: GentiumPlus;
font-feature-settings: "cv78"on,"cv83"on;

Then found something purporting to describe those features and font:

Gentium - Font features

(on that page do a find for "Greek alternates" for cv78 & cv83 descriptions)


1) a KORONIS U+1FBD is *not* a combining character! It should not be playing king-of-the-hill. It *is* supposed to look like   , but as a single character by itself. So the font is doing something it shouldn't do, basically converting the KORONIS into a "U+0313 COMBINING COMMA ABOVE" " ̓ "

Note that there is such a thing as "0343 COMBINING GREEK KORONIS" but that is not in my text.

2) cv78: Porsonic circumflex does not work:

plain font:

Ἆ Ἦ ᾯ    ἆ ἦ ᾧ

cv78 with Gentium plus

Ἆ Ἦ ᾯ     ἆ ἦ ᾧ
showing that cv78 is not converting the
form ◌̃ aka COMBINING TILDE 0303
into the
Oh good grief!! This changed in just the last half-hour to now display as spec'd. I switched to an older tab and saw tilde, then refreshed and now see inverted breve. Your change was at "07:46, 7 January 2022". Time now is approx. "00:24 14 January 2022"   Damn you network caching!!!

3) cv83: Capital adscript iota (prosgegrammeni) does not work:

plain font:

ᾼ ᾜ ᾯ     ᾳ ᾔ ᾧ

cv83 with Gentium plus

ᾼ ᾜ ᾯ    ᾳ ᾔ ᾧ
showing that cv83 is not converting the
form ͺ aka 037A GREEK YPOGEGRAMMENI (not a combiner)
into the
Oh good grief again!! It changed to as spec'd, which means the display *now* agrees with the normal behaviour as seen in _other_ fonts. The Gentium doc shows as 'standard' a form I've never seen in other fonts, so cv83 is good and necessary. Thank you.

4) However, U+1FBD KORONIS is not even the Unicode-approved way to do a KORONIS mark, it's deprecated. Somewhere (I can't find this right now) it is advised to use a "U+2019 RIGHT SINGLE QUOTATION MARK (’)" instead. I only saw this because the TESSERACT transcription apparatus inserted this combination LAMBDA KORONIS.

If instead of LAMBDA KORONIS λ᾽   I substitute LAMBDA U+2019 λ’ we get the correct appearance. So the Gentium Plus font *is* wrong, but in this one regard it shouldn't matter. (But who knows what else might pop up?)
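A minimal sketch of that substitution in Python (the function name is just illustrative):

```python
# Replace the non-combining U+1FBD GREEK KORONIS, which Gentium Plus
# mis-renders as a combining mark, with the Unicode-recommended
# U+2019 RIGHT SINGLE QUOTATION MARK.
def fix_koronis(text: str) -> str:
    return text.replace("\u1FBD", "\u2019")

assert fix_koronis("λ\u1FBD") == "λ\u2019"
assert fix_koronis("ἀλλ\u1FBD ἐγώ") == "ἀλλ\u2019 ἐγώ"
```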


Getting all this (HTML/CSS/fonts/WP/templates/styles/etc./etc.) straight makes ancient Greek look easy, so I'll go back to that.

BTW: να πάθεις, να μάθεις - once it has happened to you, then you know.

Shenme (talk) 00:59, 14 January 2022 (UTC)

Found another example from TESSERACT (advanced mode with Eng and GRC selected) - δ᾽ is supposed to look like δ’. Bleh! Good there's a workaround for the font bug. Shenme (talk) 01:07, 14 January 2022 (UTC)
@Shenme: I'm not sufficiently familiar with polytonic Greek, or Unicode's guidelines for it, to tell off-hand whether Gentium's treatment of the koronis here is reasonable. I'll try to take a closer look when time allows; but in the mean time, if you're sure it's a bug in Gentium Plus you can contact SIL and report it. Presumably they are just treating it as a ligature-eligible form (but if so, I'm not sure we can disable just that ligature).
The reason for the change to the porsonic circumflex and ypogegrammeni you observed is either because the lang team deployed version 6.001 of the font, or because you installed it locally. The ULS webfont repo used to have an ancient version of Gentium that was updated in T298613 (many many thanks to Santhosh and KartikMistry!), but I didn't think that was deployed yet. The old version had several bugs / suboptimal behaviour that bit us, and had no support for the porsonic circumflex and capital adscript iota font features. Xover (talk) 07:07, 14 January 2022 (UTC)
Newer Gentium font should be available in ULS now. -- KartikMistry (talk) 11:07, 14 January 2022 (UTC)

Really it is sufficient to note that there are two different Unicode codes,
  • U+0343 COMBINING GREEK KORONIS
  • U+1FBD GREEK KORONIS (not a combining character)
and that the font is forcing the latter to act like the former, forcing combining action when it is not a combining character. If typing 'A' followed by ':' always got you 'Ä' you'd be negatively impressed, yes? But as I mentioned, using the different character U+2019 RIGHT SINGLE QUOTATION MARK works around that font bug.
As for posting a bug report to SIL, my question would be would they respond / update? I'm dubious since I keep finding font bugs that are decades old. (30 years wrong, yay Microsoft)
Heck, I didn't get finished typing my initial note before I found another bug in the font resolved to by "font-family: monospace". The browser says this is Microsoft's Consolas font, a 15 year old font.
There are two characters
  • (Ἇ) U+1F0F GREEK CAPITAL LETTER ALPHA WITH DASIA AND PERISPOMENI character
  • (Ι) U+0399 GREEK CAPITAL LETTER IOTA character
With a space between them they look like " Ἇ Ι ", which is correct. Placed together they look like " ἏÏ " (<--simulated) Somehow the font is magically inventing an umlaut/diaresis on the second character. Happens also for UPSILON. Here is that pair in the various fonts:
sans-serif font ἏΙ Ἇ Ι
Gentium font    ἏΙ Ἇ Ι
monospace font  ἏΙ  Ἇ Ι
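(One can verify the umlaut is purely a rendering artifact and not anything present in the text itself; a quick check in Python, using the pair from the example above:)

```python
import unicodedata

pair = "\u1F0F\u0399"   # Ἇ directly followed by Ι, nothing in between
names = [unicodedata.name(c) for c in pair]
print(names)
# ['GREEK CAPITAL LETTER ALPHA WITH DASIA AND PERISPOMENI',
#  'GREEK CAPITAL LETTER IOTA']

# No dialytika/diaeresis anywhere in the data: any umlaut you see on
# the iota is invented by the font's glyph shaping, not by the text.
assert not any("DIALYTIKA" in n for n in names)
```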
Hopefully they all work wonderfully on your system.
None of these is a major problem. But each of these is a cautionary tale. Network caching can betray you. 30 year old fonts can betray you. 15 year old fonts can betray you. A change that has upsides, also has downsides, for the very same population you were trying to serve well!
Thank you for persisting anyway. :-) Shenme (talk) 21:46, 14 January 2022 (UTC)

Ugh, I hadn't considered that this would happen. Indeed δ᾽ and λ᾽ should not combine the diacritic with the characters, and this is how Greek does words where the end of the word drops because of interaction with the following word. Think of it as like "I d' know" (for I don't know). Putting the apostrophe above the d is not correct. This sort of thing is fairly common in the Greek texts I transcribe. --EncycloPetey (talk) 22:11, 14 January 2022 (UTC)

@Shenme: I can't see what you did for the workaround, but this is a common thing, and it either means hunting down every instance of this and kludging the workaround, or making some other change. I can't see the difference between δ᾽ and δ’ which you used above. Can you explain? --EncycloPetey (talk) 22:15, 14 January 2022 (UTC)

@EncycloPetey: When Unicode put together the Greek Extended block they were determined to cover *all* the characters not already covered in the Greek and Coptic block. They went overboard, including code points they later regretted.
U+1FBD GREEK KORONIS character (᾽) was one of those characters. It was the 'intended' character for that isolated pause mark, à la ἀλλ’ ἀλλ’, and seen wrong here: ἀλλ᾽.
Then the Unicode people said, oh no, rather... we really want everyone to use a more normal 'apostrophe' character (so text searches work). Hence (somewhere it says) substitute an apostrophe, but *not* U+0027 ' APOSTROPHE, rather the typographic apostrophe U+2019 ’ RIGHT SINGLE QUOTATION MARK. So clear all this is...
    (see demo table I'm getting ready to submit bug to SIL)
So the workaround is to *not* use the Greek Extended char (as originally spec'd) but to substitute the newer recommendation U+2019 RIGHT SINGLE QUOTATION MARK.
So, yeah, let's go back and change everything that conformed to their original fiat. Only... let's not, not for old stuff anyway. I'm changing it as I work. My new Ancient Greek IME I need to submit to the ULS people uses U+2019.
As to why not the top variant? I've now done many pages across several works that employed polytonic Koine Greek, including Thayer's Lexicon (present on a lot of 'bible' sites with errors!), Scrivener's NT in Greek (present on a few 'bible' sites with errors), an academic text The New Testament in the original Greek - Introduction and Appendix (1882) having much Greek (!), and others. I have not seen anywhere the use of the pause apostrophe except in the to-the-right-side variant. So no (?) usages of the top variant in the 1800's? (I could have overlooked the crasis case, thinking it was smooth breathing mark.)
So the top variant violates the "match the original scan" goal that I'm obsessed with. To my mind, it is a much more jarring divergence than the tilde circumflex ever was (since, again, that tilde variant is what is seen in _so_ many texts).
Sorry for all the words, thanks for the attention, and I can't believe Koine Greek is so badly supported by 'Western' fonts. Shenme (talk) 23:32, 14 January 2022 (UTC)

Submitted a bug to SIL for the chaotic KORONIS behavior (Greek KORONIS U+1FBD improperly coerced to U+0343 COMBINING GREEK KORONIS), pointing them to User:Shenme/GentiumPlus for examples. Shenme (talk) 00:24, 15 January 2022 (UTC)

And... SIL say the bug is fixed in next release! Oooo, eek! Fixed and released! Shenme (talk) 23:42, 12 February 2022 (UTC)

Songs of the Soul


Hello Xover, may I ask you again to format the pages (pages 7–9) for me? That would be most important to me, because if it is not right I will make the whole book again. If you have time: in the table of contents, the captions for chapters II–IV are shifted to the right. Also, I do not know how to bring the chapters to the first page as in the other projects. Thank you very much! https://en.wikisource.org/wiki/Index:Songs_of_the_Soul(1923).pdf --Riquix (talk) 09:01, 22 January 2022 (UTC)

@Riquix: I've gone over the existing pages. For poetry you'll want to use {{ppoem}}. It's slightly complicated to get the hang of, but it is still in general the easiest way to deal with poetry here. Feel free to ask if you need help.
But from where did you get the PDF file you uploaded at File:Songs of the Soul(1923).pdf? It is lacking the requisite metadata (the file page should have a {{book}} template with all relevant fields filled in), and the file size does not match the scan of this copy available at the Internet Archive. Xover (talk) 10:37, 22 January 2022 (UTC)

I took it from here: https://archive.org/download/songsofsoul00swam I chose the PDF because I know the format, and it is also a larger file. You reach the overview via "Go to parent directory" at the top. I will work on the other things as usual, piece by piece. Thanks! --Riquix (talk) 12:43, 22 January 2022 (UTC)

@Riquix: The PDF at IA is 2.2MB. The one uploaded here is 32.84MB. So these are not the same file. Xover (talk) 14:05, 22 January 2022 (UTC)

I downloaded it and uploaded it again. I did not use a tool this time because it converted so badly in the first trials. Both have the "Dedication" at the beginning. --Riquix (talk) 15:42, 22 January 2022 (UTC)

I have downloaded it repeatedly, and after a while the file's info window shows: 34,432,290 bytes (35.3 MB on the volume). That should be correct. --Riquix (talk) 15:46, 22 January 2022 (UTC)

Is it OK how I connected the first three links? I would do the rest. Regards https://en.wikisource.org/wiki/Page:Songs_of_the_Soul(1923).pdf/9 --Riquix (talk) 08:40, 23 January 2022 (UTC)

@Riquix: No, you are linking to wikipages in the Page: namespace. The Page: and Index: namespaces are internal production namespaces; borrowing a theatrical term they are "backstage". Once proofreading is completed we transclude the content from the wikipages in the internal namespaces onto wikipages in the main presentation namespace (the main namespace has no prefix followed by a colon, but it is a distinct namespace all the same). Any links should be to the wikipage in the main namespace where the content will end up once done. The links I had put in the table of contents were to those destinations; they were just red because the proofreading was not complete and so the pages had not been created yet.
I am in the middle of something right now, so it'll have to be later today, but if you want me to I can go over and transclude the first few poems that are finished to illustrate how it will work and what it will look like.
PS. For links to wikipages that exist on this project, it's generally best to use internal links, like so: Page:Songs of the Soul(1923).pdf/9. Once you get used to it it's usually also a lot easier than using external links (the ones with a "http://" and a host name like "en.wikisource.org"). Both kinds work, so don't worry too much about it, but using internal links is a good habit to get into early. Xover (talk) 09:57, 23 January 2022 (UTC)
@Riquix: Ok, I've transcluded the first couple of poems at the correct locations. The links in the table of contents should be ok now. You can do the rest using the examples. Let me know if you run into trouble.
The PDF file is weird. The one at the Internet Archive is definitely just 2.2MB, so where your 32MB file comes from I have no idea. I have reuploaded the file directly from the Internet Archive so that we have a file of known provenance. At the same time I have added the required information template and moved it to Wikimedia Commons (Commons is the central media repository for all the Wikimedia projects, including Wikisource). Xover (talk) 14:17, 23 January 2022 (UTC)

Hello Xover, on the pages of the book the original scan is now not shown on the right side, so I cannot compare now. As for the file size change, this is apparently normal depending on the operating system. Unfortunately, I have not found any discussion of it in English. https://www.mactechnews.de/forum/discussion/PDFs-immer-groesser-als-Quelldatei-Normal-277803.html --Riquix (talk) 07:27, 24 January 2022 (UTC)

@Riquix: Oh, I see the file appears to be broken in some way such that Mediawiki is unable to process it. Strange, it looked fine when I opened it locally on my computer. I'll look into it and find some workaround.
Regarding the file size… The forum thread you linked to discusses the difference in final output file size as a result of using different PDF tools to produce it from the source material. That is, it discusses differences in file size for files that have been modified. Which is indeed my concern: the PDF file that the Internet Archive produced and the PDF file you uploaded are different sizes, so the file you uploaded has been modified in some way. Xover (talk) 12:44, 24 January 2022 (UTC)
@Riquix: Ok, I have no idea what the problem with the PDF file was. I am guessing something went wrong at the Internet Archive when they created it, but it could also be a software error in Mediawiki. To work around the problem I have downloaded the original scan images from IA and generated a DjVu (or w:de:DjVu in German) format file from it, and then migrated the index and all the associated pages there. The index is now at Index:Songs of the Soul (1923).djvu (to match the file name, File:Songs of the Soul (1923).djvu). It should work as expected now. Xover (talk) 14:41, 24 January 2022 (UTC)

Do you agree if I insert two spaces around the page number? From this [15] to this [ 15 ] https://en.wikisource.org/wiki/Page:Songs_of_the_Soul_(1923).djvu/21 --Riquix (talk) 09:12, 26 January 2022 (UTC)

@Riquix: Yes. The precise formatting of the header doesn't really matter because it doesn't get included when the text is transcluded to mainspace. How faithfully you want to reproduce its formatting is entirely up to you. Xover (talk) 10:47, 26 January 2022 (UTC)

Hallo Xover, Can we do it that way, I look at me on text and then as far as I can? Such as "{{ppoem|start=follow|" and "stanzas". I still have to look at it. Think so for a week, I should be done. Would contact you then. So I learn that then, and you have a hopefully to correct a little less. --Riquix (talk) 15:45, 30 January 2022 (UTC)

@Riquix: I didn't understand this message. Could you try to phrase it a different way? Xover (talk) 17:19, 30 January 2022 (UTC)

The changes (Such as "{{ppoem|start=follow|" and "stanzas") where you meant in the previous History note is that okay for you if I do the end. When I finish with the text.--Riquix (talk) 18:09, 30 January 2022 (UTC)

@Riquix: Hmm. Not sure I understand still. But let me try to address what I think it might be what you're concerned about:
There is no requirement that all the pages are 100% perfect on your first pass. So long as all the text itself has been corrected, and the formatting mostly correct, that's fine. The changes I have made, and the associated comments in the edit summary, are mainly intended as explanation and instruction so that you can learn what to do. It is easier (less work) to do everything at once, if you are able to do so. But if you are not it's ok too. It only means you'll have to go back and fix some things afterward. In this case, you would see some problem when you try to transclude the poems: there would be gaps between lines where there shouldn't be gaps, or there would not be gaps where there should be gaps. It is quite alright to go back and fix any such issues at that point. It is very common to see problems when transcluding and needing to go back to the page or pages involved to fix them.
Did that help at all? Xover (talk) 18:53, 30 January 2022 (UTC)

Hallo Xover, I have it as far as I could make it relatively sure, the edits. There are two things that are still open, where I'm not sure and do not want to do wrong, which then causes more work. One is the "Stanzas" because I can not English as well, and I'm not sure where it starts and ceases. The other one is part IV, I do not believe the page number is right (e.g., "Foreword" is page 95 but other number show page with "Part III"). I used translation program for writing, therefore probably sometimes the text does not clearly understand. --Riquix (talk) 08:52, 9 February 2022 (UTC)

Stewart's Textual Difficulties in Shakespeare

I set up the Index:Some Textual Difficulties in Shakespeare.djvu --EncycloPetey (talk) 01:57, 11 February 2022 (UTC)

NLS deletions

Per this discussion, please delete those index talk pages, as well. TE(æ)A,ea. (talk) 22:20, 22 February 2022 (UTC)

@TE(æ)A,ea. and @CalendulaAsteraceae: Those index talk pages are not associated with any of the indexes listed as duplicate, and their associated indexes still exist and are not apparently listed for deletion anywhere. What's the rationale for deleting them, and is it only the talk pages or is there an implicit assumption that the associated indexes and their Page: pages should also be deleted? Xover (talk) 07:55, 25 February 2022 (UTC)
Xover: The Index talk: and Page: pages which have been listed by CalendulaAsteraceae are all dependent on the Index: pages which were deleted, and are thus subject to speedy deletion under criterion M.4. Also, re: the most recent comment here, asking about the location, it is from here. TE(æ)A,ea. (talk) 15:03, 25 February 2022 (UTC)
@TE(æ)A,ea.: I must be being particularly dense again today. Index talk:Captain Barnwell.pdf, which is one of the talk pages listed at WS:PD, has an associated index at Index:Captain Barnwell.pdf and 8 associated Page: pages, all marked as validated. Similar is the case for the other talk pages listed there. What am I missing? Xover (talk) 17:09, 25 February 2022 (UTC)
@Xover: Possibly you were looking at a different list? I took the list of index pages which you deleted and replaced Index with Index talk and got the following list of index talk pages.
Extended content
CalendulaAsteraceae (talkcontribs) 00:52, 26 February 2022 (UTC)
@CalendulaAsteraceae: Oh. a dim bulb starts flickering I see now. Not "the index talk pages listed at WS:PD", but "the talk pages of the Index pages listed at PD". Duh! I got confused because at the top of the discussion you listed 8 actual index talk pages, and those have still-extant Index: pages and Page: pages. Apologies again for being so dense, but… dumb admin is dumb, needs spoonfeeding. Thank you both for the hand-holding! --Xover (talk) 07:53, 26 February 2022 (UTC)
@Xover: Glad I could help clear that up! Could I request that you also delete the Page-namespace pages of the Index pages listed at PD?
Relatedly, could you edit the use of {{category handler}} in {{sdelete}} so that adding "nocat = false" will work in other namespaces, like the Page and Talk namespaces? I understand the {{category handler}} documentation well enough to know it can be done, but not well enough to do it myself. —CalendulaAsteraceae (talkcontribs) 23:34, 26 February 2022 (UTC)
@CalendulaAsteraceae: Meh. So it seems I got exactly zero things right in that deletion. Grr! I'll take a look and see if there's any sane way to deal with them short of custom coding (in which case it'll be a bit before I have the spare cycles).
I've tweaked {{sdelete}} to categorise in all namespaces. nocat = false only bypasses the blacklist (regex of page names where cats should not be added, primarily for /Archive pages), but has no effect when it's in a namespace {{category handler}} doesn't know about (and it doesn't know about Page: and Index:). Xover (talk) 08:50, 27 February 2022 (UTC)
@CalendulaAsteraceae: Ok, I think I've got all of them now. Please let me know if I messed up anything else there. Xover (talk) 10:18, 27 February 2022 (UTC)
Thank you! Everything looks good to me. —CalendulaAsteraceae (talkcontribs) 22:33, 27 February 2022 (UTC)

Page end hyphen

The template {{peh}} is not working the way it should. In the Main namespace, it is transcluding as a double hyphen instead of as a single hyphen. I am not sure why. --EncycloPetey (talk) 19:31, 23 February 2022 (UTC)

@EncycloPetey: My quick sandbox test looked ok. On what page are you seeing this? Xover (talk) 20:23, 23 February 2022 (UTC)
On several pages. I noticed the issue when User:Yodin had made several hws/hwe replacements, and those where the word should have had a hyphen preserved in transclusion instead had a double hyphen. He has now reverted the affected pages, but you can see where the problem occurred by looking at their self-reversions Special:Contributions/Yodin. --EncycloPetey (talk) 20:28, 23 February 2022 (UTC)
@EncycloPetey: See User:Xover/sandbox. I am unable to reproduce this problem. Can you find a page where this is currently displaying incorrectly? Xover (talk) 21:29, 23 February 2022 (UTC)
The problem appears only under certain kinds of usage. On Test page 2 it occurs in the first call, where the pages are transcluded separately, but not in the second, where the pages tag is used. I noticed it first in the transclusion where you tried to replicate the problem, but it does not seem to happen there now. I am not sure why. Extra blank lines or carriage returns? A missing carriage return? Something that was cleared upon any edit? --EncycloPetey (talk) 22:39, 24 February 2022 (UTC)
@EncycloPetey: {{peh}} depends on the automatic hyphen handling that is provided by the Proofread Page extension to work sensibly. Or rather, {{peh}} works the same regardless of where it's used, but its behaviour makes no sense unless it's used with PRP's automatic hyphen handling. When you use direct transclusion with {{Page:…}} you completely bypass PRP so neither the automatic hyphen joining nor {{peh}} will work. That's one of the reasons why direct transclusion shouldn't be used.
Similarly, if pages are transcluded using separate instances of PRP's <pages … /> extension tag the automatic hyphen handling will not work since Mediawiki invokes the extension with only the content provided in that instance of the tag. Or put another way, each individual use of <pages … /> lives in its own little world and knows nothing about other content on the wikipage, including any other <pages … /> tags that may be present.
If there were issues with {{peh}} when used through a single <pages … /> tag then I'm unsure what might have caused it. I'm not aware of any code changes recently that seem likely to have caused something like this (not that I necessarily would pick up all such), but if it was a side effect of some change elsewhere in Mediawiki (not directly related to PRP) it's certainly possible. If that's the case, any edit to the mainspace page (the transcluding page) should force any cache to get regenerated in normal circumstances. Most of the time these kinds of changes also manage to invalidate the cache without any end-user action (there are both automatic dependency-based mechanisms and periodical scheduled jobs for this), but some such code changes can end up needing an edit (which is why we sometimes have to run a "touch edit" bot job; these days usually invisibly because the bot can "purge" the page instead of actually editing it). Xover (talk) 07:27, 25 February 2022 (UTC)
Gah! There are instances where direct transclusion is still used, such as Tables of Contents on Index pages. And I have seen editors split <pages … /> into multiple calls. I will have a better idea what to look for in future. --EncycloPetey (talk) 17:28, 25 February 2022 (UTC)

Big, bold red text about wikidata entities!!!

The fables, all 6 or 7 hundred of them, have been indexed. So (as suggested at the portal) I put them in the order of the index and started to link them to Wikidata via {{wdl}}, because they are a "bitch and her whelps" to find there; so, one big finding, pasted on the portal, and all is well, etc.

BUT!! I have big red letters within the 500 series. See Portal:Aesop's_Fables#Perry_501–584. Normally, I would just think to rethink it but this error has more entities behind it and is weird to have where it is, if it is indeed a software complaint of the usual kind.

So, 1) is that really a problem, or just about that one data item? 2) is it going to become a problem (if it isn't already)? 3) what can be done to collect these, eh, whelps?

This is nothing like herding cats, btw....--RaboKarbakian (talk) 00:17, 21 March 2022 (UTC)

@RaboKarbakian: I'm not sure why Perry 508, and only Perry 508, is throwing an error there. I see nothing obvious at The Trees Under the Protection of the Gods (Q105289012) to explain it, and if it was the total number of entities used on Portal:Aesop's Fables I would have expected lots more big bold red errors there. When I have the time I'll try to trace what {{wdl}} is doing with Q105289012 in more detail, just in case there's some weirdness that's not apparent at first glance, but so far I see no explanation for what's going on. Xover (talk) 06:11, 21 March 2022 (UTC)
Xover:Not entirely unrelated, but close: Bevis and Butthead Ben and Jerry fix a scan. Much more like Ben and Jerry, I guess.--RaboKarbakian (talk) 00:10, 22 March 2022 (UTC)

Xover I am sorry to bother you, but I am in need of some dialog, I think. The abuse of wdl was naive and accidental; I am not anxious to make the same mistake with {{nsl2}} (which worked very nicely for a simple %s/problem/hack/g). Several different options have been running through my mind. First, just using simple bracket links like I have been (in the 600s and 700s mostly). Second, continuing with the nsl2 template. Others: a table where the QNNNNNN is shown and linked to, with the row shared by the linked page name here. A Portal:Perry index where Portal:Perry index/001-099, &c. subpages exist. I am actually looking forward to some scripting, much more than wanting a specific format.... So, here is where you inject some guidance and direction, replace my naivety with knowledge, or tell me what to do and where to go if that suits. --RaboKarbakian (talk) 01:37, 8 April 2022 (UTC)

Xover I had some observations about the behavior of that page and where the warning happens. Adding an edition or translation to any of the wdl'ed items would make the warning move up, e.g. from 420-something to 410-something, and that would happen without touching the page. That got me thinking that perhaps the module is grabbing "everything" and then picking from that what it wants, and that just using the base module and asking for just the links might fix that page. Maybe I am wrong, but it would explain that behavior. --RaboKarbakian (talk) 01:14, 13 April 2022 (UTC)
@RaboKarbakian: On closer inspection it seems to be simply that the total number of wikidata for the entire page is tipping over a hard limit. But I still don't understand why only a single item is showing an error.
I think we're going to have to rethink the entire Aesop's portal for this. What are we trying to achieve with the {{wdl}} invocations there? Do we actually need to fetch data from Wikidata for most of those links? Are we just trying to document what the Q-number is for each of these? Would it make more sense to generate the entire portal from Wikidata by bot, that is, the page itself is just static wikicode and all the lookups happen offline? Xover (talk) 15:23, 24 May 2022 (UTC)
Xover rethought for sure! I have been thinking about it. It shows on the link where it has reached its limit. I watched it move as I added editions or versions to numbers lower than the "problem" number, and the warning would move up, meaning it would occur at a lesser number.
So, my (unconfirmed) thoughts about the "why" are this: the (or a, since I am not sure which one and I myself have used more than one) module pulls in all of the properties from the data item and then delivers the requested properties. So, if the problem is at fable 14 and I add a version to fable 8, the overload moves to fable 13.
{{wdl}} is a great thing. It is wonderful for linking within articles. I just used it for Yerkes Observatory, for instance. Right now, it points to en.wiki but in a potentially very near future, it might point to a collection of articles here (it has a very famous "largest refractor telescope"). But it is terrible for a table like the Perry index. But what would be good for the Perry index would also be good for {{wdl}}....
If all that is needed are the links to this or other wiki, then a "simple" module might be made that gets just those inter-wiki links. This module could never be used to build a versions or translations page, for instance, like the other module(s), but would be easier on all systems because of the single purpose.
About the linked Perry index. Yes, it is a very good thing to have all of the fables with their links. I can expound more on this, but I already have somewhat. Not only is the list good, but it should be on every wiki that hosts fables (like el.s, es.s, nl.s, la.s, fr.s....) and on wiki with articles about fables (en.wiki, fr.wiki, el.wiki, etc) with the same cascading way with its links. I am certain that there are other lists of things like the fables that could stand the same thorough treatment to make it easier for the wikis to work together on.
Thanks for getting back to this.--RaboKarbakian (talk) 15:50, 24 May 2022 (UTC)
@RaboKarbakian: Actually, I am a genius and everyone should bow down before me and worship my über leet coding skillz! :)
It turns out that while {{wdl}} doesn't fetch more than it needs from Wikidata, the underlying Lua library it uses does. By tweaking it to call the Wikidata access functions juuuust right, I think I've eliminated the current big red error message on this portal. We'll probably hit a similar limit at some point eventually, but this gives us some more breathing room.
So longer term we probably need a different approach for this portal in any case. Do we really need to lookup anything at all from Wikidata on that portal? Maybe the important bit is just having the Q-number alongside the link? Something like a custom template called as {{perry fable|503|Q7726564|The Cock and the Jewel}}, which just spits out a local link with the provided label (i.e. "Perry 503. The Cock and the Jewel") without ever touching Wikidata. We'd lose automatic linking to other projects when the local page doesn't exist, but performance would improve dramatically. Finding interwikis for each fable is best done from that particular fable's individual page, rather than the portal, in any case. Xover (talk) 16:56, 24 May 2022 (UTC)
Lovely tweaks! This is a thank you and a warning. The warning being that I am going to put the page completely back onto wdls, so we will know what the limit is, or isn't if the index doesn't reach it. I have no problem with whatever decision is made about how the index is handled; it's just that when I was trying to find which fable I had, I was looking at en.wiki, fr.s, the link that the Perry number goes to -- anywhere I could find it. The same fables have different characters, and they are also reflective of changes in domestic animals, i.e. what was a weasel in the 1600s is a cat now. Goats in Greece are sheep in England, etc. 2100 years of fables; if they were ever clear-cut, they aren't now. So, I will always vote for inter-wiki linking. I would have used it had it been available.
Thanks again! --RaboKarbakian (talk) 18:55, 24 May 2022 (UTC)

Good news and bad news!

Xover I have converted the portal to use only the template now. Bad news first:

The bad: The limits of the new and improved {{wdl}} are (still) unknown.

The good: The index at Portal:Aesop's Fables is (still) rendering without warning, and also in seconds (where it used to take tens of seconds!).

Such is the outcome of your very fine tweak....--RaboKarbakian (talk) 13:37, 25 May 2022 (UTC)

Goblin Market moved

I just checked, and you did not say to put sdelete on all of the moved pages, so I didn't... but everything got moved and, it looks great on a djvu! If I should do anything else, let me know. Thanks again!--RaboKarbakian (talk) 04:15, 27 March 2022 (UTC)

@RaboKarbakian: And the old redirects and index have now been deleted. Xover (talk) 08:35, 27 March 2022 (UTC)

Potential workflow for scanned microfilm documents

I ask this question here, rather than on Scan Lab, because I think you might be able to set up a process—but I’m not sure, so I’ll get to the point. I have scanned in some documents on microfilm, but they need to be processed before they can be uploaded. This was created as a result, but it took a long time to do by hand, and so I wonder if you can help. For the next file, there are ostensibly four pages of the text per TIFF file, which need to be cropped, color-inverted, and combined as a PDF (or DJVU, your choice). I was wondering if there is some way to go through the processes more quickly. I can upload the files if you have questions or want to test something out. Thanks in advance for looking into it. TE(æ)A,ea. (talk) 01:06, 1 April 2022 (UTC)

@TE(æ)A,ea.: How much it can be automated depends on the files. Cropping out a square area, or multiple squares, is not a problem so long as the coordinate offsets of the areas are consistent between image files. Very often this is not the case, and then it comes down to how much margin is present (i.e. how sloppy can you be with the rectangles and still avoid cutting out any of the content you want to retain). A solid black or solid white border can be automatically removed, but it has to be relatively uniform and it has to have sufficient contrast with the content to be preserved. Inverting color is not a problem. And once the images are otherwise done, I can obviously create a DjVu from them with my existing tools. Also, depending on the total number of files we'd need to figure out some way to transfer them that's practical. Xover (talk) 06:35, 1 April 2022 (UTC)
  • The offsets are not exactly the same, but for almost all of the files they are close enough not to really matter. The contrast is black-and-white (white originally; black once color-inversion is applied). I have uploaded one of the files here. This specific work only has 30 TIFF files, but future works which I may scan have more pages; for this one, I can just upload the images myself. (The names will be 169630001–169630030.) For the 30-image work, I put in effort to have four pages (or page-areas) per file; however, because that was quite difficult, and led in some cases to horizontal offsets, in the future I will probably only scan for two pages per image. TE(æ)A,ea. (talk) 14:29, 1 April 2022 (UTC)
    @TE(æ)A,ea.: I've looked into the example image a bit to see what's feasible.
    Color inversion is easy and fully automated, so it's just an extra step to throw in that the computer takes care of. Cropping the pages can be partially automated using a fancy algorithm (short version: it tries to find a point in the middle of the text, and then starts scanning outwards until it finds the edges), so long as the scanned images have sufficient contrast between the foreground (the text) and the background (everything else), not too much noise in the image (coffee stains and similar from ageing are a nightmare), and decent margins between the edge of the text and the edge of the sheet. It is preferable to have scans with either one or two pages per image. Three or more (i.e. four) will require an extra processing step that may need a lot of manual adjustment (crop offsets). Non-uniform images (i.e. if the first or last images have just one page but the rest have two) may also need manual processing.
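The "scan outwards from the text" cropping idea described above can be sketched in pure Python. This is an illustrative toy, not the actual tooling discussed here: it works on a nested list of grayscale values rather than real image buffers, and the function name and parameters are made up for the example.

```python
def find_text_bbox(pixels, threshold=128, margin=2):
    """Find the bounding box of the dark (text) pixels in a grayscale image.

    `pixels` is a list of rows of grayscale values (0 = black, 255 = white).
    Everything darker than `threshold` is treated as text; the resulting
    box is padded by `margin` pixels and clamped to the image bounds.
    Returns (left, top, right, bottom) suitable for a crop operation.
    """
    height, width = len(pixels), len(pixels[0])
    # Rows and columns that contain at least one "black enough" pixel.
    rows = [y for y in range(height) if any(v < threshold for v in pixels[y])]
    cols = [x for x in range(width)
            if any(pixels[y][x] < threshold for y in range(height))]
    if not rows:
        # Blank page: nothing to crop to, keep the whole image.
        return (0, 0, width, height)
    left, right = cols[0], cols[-1] + 1
    top, bottom = rows[0], rows[-1] + 1
    return (max(0, left - margin), max(0, top - margin),
            min(width, right + margin), min(height, bottom + margin))
```

As the comment thread notes, this only works when the text/background contrast is good and the page margins are clean; noise like coffee stains would be picked up as "text" and inflate the box.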
    One of the operations that will be needed (for several purposes) is what's usually called "thresholding". Since these images do not have perfect white (that is, pixels with RGB value 255,255,255) or letters that are perfect black (RGB 0,0,0), but rather they have a lot of shades of gray in between, we need to pick a shade of gray where everything darker than that is considered "black" and everything lighter than that is "white". Once we have that, the software can use logic like "every black pixel belongs to text" in order to straighten skewed pages, crop the edges, etc. This is also the same thing you need to do when converting a grayscale image to actual black & white (which gives very small files). If you look closely at the inverted version of File:169630011.tif you'll notice there are a lot of very dark gray pixels around and behind the text. Probably the ink on the next or previous page that has rubbed off or bled through. On these particular images I was able to just use a default threshold of 0.5 (midway between black and white) and get decent results; but if the darkest non-text parts of the page are significantly darker, or the lightest parts (faded) of the text are lighter, finding a working threshold gets difficult. If the functioning threshold varies between pages you essentially cannot automate this step any more (you need to manually pick the threshold for each page).
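The inversion and thresholding steps described above can be illustrated with two tiny pure-Python functions (assumptions: grayscale values 0–255 with 0 = black, and 128 standing in for the "0.5, midway between black and white" threshold mentioned; real tooling would do this on image buffers, not nested lists):

```python
def invert(pixels):
    """Invert a grayscale image, e.g. white-on-black microfilm
    to black-on-white text."""
    return [[255 - v for v in row] for row in pixels]

def binarize(pixels, threshold=128):
    """Thresholding: map every pixel to pure black (0) or pure white (255).
    Everything darker than `threshold` becomes black, the rest white."""
    return [[0 if v < threshold else 255 for v in row] for row in pixels]
```

Chaining the two (invert, then binarize) mirrors the microfilm workflow in this thread: near-black bleed-through pixels below the threshold simply vanish into the white background, which is exactly why the choice of threshold matters so much.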
    I don't have a batch downloader for Commons (or Wikisource) set up, so I'll have to put that together. Easiest is probably if you tag each image with a category (which can be a redlink) and I can work from that. I'll also need a reliable way to order the images correctly. My existing tools are set up to look for a numerical sequence after the filename and before the filename extension: canbeanything-0001.tif. But so long as an alphanumeric sort of the filenames puts them in the correct order it should be fine.
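The filename convention described above (a numerical sequence just before the extension, as in canbeanything-0001.tif) can be captured with a small sort key. The function name is hypothetical; it just shows the matching pattern:

```python
import re

def scan_sort_key(filename):
    """Sort key for scan filenames: use the numeric run immediately
    before the file extension if there is one, otherwise fall back
    to a plain alphanumeric sort. Files with a numeric suffix sort
    before files without one."""
    m = re.search(r"(\d+)\.[^.]+$", filename)
    if m:
        return (0, int(m.group(1)))
    return (1, filename)
```

This sorts "scan-2.tif" before "scan-10.tif" (numeric order), whereas a plain alphanumeric sort would reverse them unless the numbers are zero-padded.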
    Finally, figuring out bibliographic data, licensing details, etc. can be pretty time-consuming (compared to hitting a few buttons and letting the computer do its thing), so it'll help if you prepare the file description page using a template like this (doesn't have to be exactly that, but those are the things I consider important to include). Also, if there is any degree of uncertainty at all about copyright status, please let me know beforehand so we can figure it out first.
    In any case, the long and short is that this should be doable, and hopefully without my available time being too much of a bottleneck. If you upload the first batch I can try to run it through and we can see how it goes and whether you're happy with the results. Xover (talk) 10:03, 2 April 2022 (UTC)
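The "download all the files from a given category" idea mentioned above could be driven by the MediaWiki Action API. A minimal sketch that only builds the query URL (the category name is a placeholder; actually fetching the results, following `gcmcontinue` pagination, and downloading each file are left out):

```python
from urllib.parse import urlencode

API = "https://commons.wikimedia.org/w/api.php"

def category_files_url(category, gcmcontinue=None):
    """Build an API query URL that lists the files in a Commons category,
    including a direct download URL (iiprop=url) for each file."""
    params = {
        "action": "query",
        "format": "json",
        "generator": "categorymembers",   # iterate over category members...
        "gcmtitle": f"Category:{category}",
        "gcmtype": "file",                # ...but only File: pages
        "gcmlimit": "500",
        "prop": "imageinfo",
        "iiprop": "url",
    }
    if gcmcontinue:
        params["gcmcontinue"] = gcmcontinue
    return API + "?" + urlencode(params)
```

Tagging each image with a (possibly redlinked) category, as suggested, is what makes this approach workable: one query enumerates the whole batch.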
    • Thanks for your lengthy response! I’ve uploaded the images from this batch manually (as 169630001–169630030, as stated). Some images do have other than four pages per image. For other works (if I am able to scan more), I will go for two pages per image. I think (I hope) that the other pages will be fine for thresholding. For the source, I don’t have the reel on me, but I will be able to get it later. All of the works from this reel are from the late 1790s, so copyright shouldn’t really be an issue. Again, if I scan other works (which are longer), it would be easier to deal with off-site zipped uploads rather than dozens of (for me, manual) on-site TIF uploads. TE(æ)A,ea. (talk) 21:25, 4 April 2022 (UTC)
      @TE(æ)A,ea.: I'll have a go and see what we can do.
      Do you particularly want to keep the cover pages added by the microfilm company? We usually delete pages like that (think Google cover page etc.), and since they have a different geometry they'll need special-casing if included.
      A single-file download would be the easiest, yes, and I don't mind if it is on a site outside Wikimedia (just so long as it isn't too too spammy). Xover (talk) 07:22, 5 April 2022 (UTC)
      @TE(æ)A,ea.: Ok, the results are at File:A Letter on the Subject of the Cause (1797).djvu.
      Note that pp. 24–25 was missing from the scan, so I have replaced them with placeholders.
      Most of the time was spent in tweaking the workflow in order to automate it as much as possible (so it'll be faster in the future). A couple of findings / notes / musings:
      A single file download would be much easier than having to manually open a gazillion tabs and doing "Save as…". However, if the most convenient way to upload the images is to do it here—for example because you can then use one of the ready-made bulk upload tools like Pattypan or Sunflower—it should be possible for me to write a script to download all the files from a given category without too much effort. I checked the API and some related existing code, and it doesn't look like it'd be particularly difficult (and such a script would have other uses, so it wouldn't be only for this particular use case). Mostly it comes down to what's most convenient for you.
      The optimal format would be one page per image, followed by a single two-page spread per image, followed by exactly four pages per image. Odd numbers of pages per image will almost invariably lead to a need for manual interventions, and an inconsistent number of pages per image within a single batch / work will similarly need manual work. Which, in addition to being kinda tedious, means it'll take longer to process (because I don't often have the time slots needed to do it). Guessing about your source material and workflow, I think that probably means you should aim for exactly two pages per image (but if something else is better for you then please let me know and I'll see what can be done).
      Finding an automated process to churn through these scans and produce acceptable results was a bit challenging for a couple of reasons. One was variation in geometry and page placement within the image. Most of that was due to the images with a divergent number of pages, but there were also some issues caused by insufficient margins between the page and the edge of the image. The scans also had insufficient contrast between the page background (theoretically white, but in reality a shade of gray with lots of noise in the form of near-black pixels) and the text, and this varied from page to page. This made it hard to find good threshold values that would work, and would work for all the pages. This was compounded by the presence of various forms of noise (non-black pixels) in the image background. For this particular batch of images I was able to find the right combination of threshold values by simply dithering the images to black and white (i.e. no shades of gray), but this is something that will vary from scan to scan, so here's where I expect there will be some need for experimentation and tweaking in the future.
      In any case, the most critical factor is making sure all the images are as uniform as practically possible, along all dimensions (geometry, placement, contrast, margins, etc.). I have the tools to automate a vast number of image manipulation operations, but if I can't apply them identically to every image we'll soon end up where it would actually be faster to do it by hand in Photoshop or similar. But for images that are sufficiently uniform etc., processing anything up to something like a thousand pages shouldn't be a problem.
      Anyways… Have a look at the result and see if it looks ok. Xover (talk) 11:03, 7 April 2022 (UTC)
      • First, thanks! Second, there are no pp. 24–25 (a printer’s error), so no placeholders are needed. The workflow on my end produces the individual files. However, it is much faster for me to ZIP those files and upload them to an off-WMF Web-site; in the future, I will do that. Because the scans in this specific work (not the microfilm reel, but this specific scan) fit very neatly into four pages per image, I tried to stay with that; however, in doing so, I messed up the alignment, which made it take much longer than necessary. For future scans, I will stay with two pages per image, unless they fit so easily that it’s not necessary to mess with alignment. The inconsistency in number of pages/image was annoying, but not universal in this reel; I don’t think it will be too much of a problem. As you can see, the images are displayed on the reel side-by-side, which makes it easiest for me to scan in two pages per image. I placed the “Book” template form on all of the individual scanned images; the only thing I need is the microfilm information, which I can get the next time I see the reel. The work itself looks great! All that’s needed is to remove the placeholders, add the information template, and the file can be moved to Commons and the work on the index started. (The few other works I looked at didn’t have, or didn’t have as much of, the variable number of pages per image.) TE(æ)A,ea. (talk) 14:53, 7 April 2022 (UTC)
        @TE(æ)A,ea.: Placeholders are now removed. Xover (talk) 18:52, 7 April 2022 (UTC)
  • Xover: I have obtained another microfilm reel, and have scanned some multipage documents from the same. They are available at the following hyper-links (identified by their Crandall–Harwell numbers): 3150, 3151, 3152, 3152-1, 3153, 3154, 3155, 3156, 3157, 3157-6 (3157-1–5 are all single-page documents). Some of these were more well-scanned than others. There are a few (three or four) documents where one page takes up more than one scan image; are you able to stitch such images together? TE(æ)A,ea. (talk) 22:03, 25 April 2022 (UTC)
    • Xover: Have you downloaded, or could you download, the files, or are you not interested in working on them? I need to free up some file space, and I’ll delete the originals if you have these files. TE(æ)A,ea. (talk) 23:05, 3 May 2022 (UTC)
      @TE(æ)A,ea.: I have downloaded them, and am slowly making my way through them. It's just IRL that is eating up all my time lately. Sorry. Xover (talk) 17:30, 4 May 2022 (UTC)
      @TE(æ)A,ea.: Well, that took a "little" longer than expected (sorry!). Between IRL getting in the way and these scans being pretty pathological, the job ended up requiring a lot more manual tweaking than expected (on the worst ones I ended up essentially doing it manually in Photoshop). In any case:
      I'm not really happy with the results, but it was about the best I could squeeze out of them. You'll need to check these pretty carefully, both to see if I've made any mistakes (not at all unlikely given the amount of manual fiddling) and because the quality of some pages looked really iffy (in 3153 especially).
      If you add info templates and licenses we can rename them to something sensible locally before transferring them to Commons. If there are any problems that look fixable, let me know and I'll take a look (I still have all the files). Xover (talk) 11:26, 22 May 2022 (UTC)

Div span swapping in references..

I'm going to give up on trying to resolve most of these, for the simple reason that the actually stable repair would be for the Cite extension to properly support block-based references (so that I could rely on a defined and documented behaviour, as opposed to one that only appears to resolve the Linter concern). There is a Phabricator ticket requesting this support, but the lack of response to it suggests that it is not considered a priority currently.

ShakespeareFan00 (talk) 08:33, 8 April 2022 (UTC)

Well, I found a temporary tracking approach - Category:Pages_with_block_based_footnotes for some of them. It seems the DIV SPAN swap doesn't occur if you have a span immediately after the opening REF tag, hence the {{blockref}} template I've been using as a tracking entity.

This should also demonstrate that it isn't just a few rare instances where the ref tag needs to be able to cope with block/P/DIV etc level elements :( ShakespeareFan00 (talk) 18:03, 11 April 2022 (UTC)

Does ppoem support ULS for different languages and font choices?

Example: Page:15 decisive battles of the world Vol 1 (London).djvu/61

I am asking because these would be a {{lang block}} for Greek or polytonic under different circumstances, and I felt that ppoem should eventually have the ability to set up language tags and font choices appropriately.

Your thoughts? ShakespeareFan00 (talk) 18:41, 8 April 2022 (UTC)

@ShakespeareFan00: Hmm. Good point. A lang tag should probably be added at some point. @Inductiveload: fyi. Xover (talk) 20:09, 8 April 2022 (UTC)


Can you take a look at this template, because the AccessDate parameter doesn't seem to be working correctly, in that nothing is displayed?

I was attempting to clean up references here - https://en.wikisource.org/wiki/Wikisource:WikiProject_Open_Access/Programmatic_import_from_PubMed_Central/A_Collaborative_Epidemiological_Investigation_into_the_Criminal_Fake_Artesunate_Trade_in_South_East_Asia ? ShakespeareFan00 (talk) 08:56, 9 April 2022 (UTC)

@ShakespeareFan00: Hmm. Why do we even have this template? The point of the citation templates on enWP is to make all citations conform to the enWP house style; whereas on enWS we should be replicating whatever was used in the original. I don't think we should fix this template so much as put it out of use and delete it. Xover (talk) 11:29, 22 May 2022 (UTC)

Wikisource:WikiProject Open Access/Programmatic import from PubMed Central/Ebola and Marburg Hemorrhagic Fevers Neglected Tropical Diseases

When this was imported, the importer dutifully imported all the references which had a link in the page. However, it did not import the references that were given by a range in the text itself; I've encountered missing references which are in the PLOS original, but not in the version/edition as presented on Wikisource.

How many 'bad' or incomplete papers imported by this method does Wikisource have? ShakespeareFan00 (talk) 15:37, 9 April 2022 (UTC)

@ShakespeareFan00: Everything under Wikisource:WikiProject Open Access/Programmatic import from PubMed Central is experimental imports, and, as you'll note, not in mainspace. It is annoying that they keep turning up in various maintenance categories, but I don't think there is much we can do about it. I don't believe the community would support just deleting them. So meanwhile we should just try to do the minimal we can to keep them off the maintenance categories. Xover (talk) 13:26, 10 April 2022 (UTC)

Index:A Biographical Index of British and Irish Botanists.djvu

I've found that this has a better copy at https://archive.org/details/abiographicalin01boulgoog compared to the copy at https://archive.org/details/abiographicalin02boulgoog that's actually on Commons. And they seem to be the same edition.

As a temporary work-around, I've set the new source at Commons (I haven't uploaded the new file), so the hi-res scans (from the better version) can be used for proofreading.

If you wanted to compare the files in more depth and generate a new version with better scans locally I have no objections, provided the pagelist doesn't need a bulk move as I was planning to work on it. ShakespeareFan00 (talk) 09:20, 16 April 2022 (UTC)

Parallel texts

Hi, This page describes two kinds of bilingual books, (a) and (b), as if these are the only two ways that exist. How do you want to describe the third way, which is really different and new with computers?
BluePrawn (talk) 15:19, 26 April 2022 (UTC)

Template:“ ‘

Isn't Template:“‘ a duplicate of Template:“ ‘ ? unsigned comment by EncycloPetey (talk) 19:37, 7 May 2022‎ (UTC).

@EncycloPetey: Indeed it is. Thanks! --Xover (talk) 17:40, 7 May 2022 (UTC)

I need help

Thanks for the quick response to my CSD tag. I need another favor. I tried reverting this vandalism a few days ago but got blocked by an edit filter that seemed to think I'm a long-term abuser, which I'm absolutely not. I'm autoconfirmed now but I don't know if the filter will still think I'm vandalizing or not. There's also Index:Editing Wikipedia brochure EN.pdf, which needs to be restored to this revision. Coolperson177 (talk) 14:24, 12 May 2022 (UTC)

@Coolperson177: Reverts taken care of. Apologies for the auto-block; that filter is evidently being a little bit too aggressive.
@Inductiveload, @Billinghurst: Filter 42 seems to have triggered when Coolperson here tried to revert a vandalised page back to this before they were autoconfirmed. There are no obvious trigger strings in that diff so I'm guessing it's one of the ones in almatch3 that is hitting too broadly. Since this is an action:block filter it's probably worthwhile tracking it down and making it more conservative, but that ruleset made my head explode just looking at it. Xover (talk) 14:44, 12 May 2022 (UTC)
I think this would be the line with 0nsmatch, which will trigger on any use of "&oldid=" (i.e. linking to any permalink at any wiki) in mainspace. I have no idea if that's intentional, but it seems like a very blunt instrument to me. It dates back to the original import of the filter from en.wikivoyage.
I have removed that part of the filter, as such a filter cannot, in good conscience, be that broad. On the other hand, the filter is actually still being triggered correctly here and there by the LTA it's designed for. Inductiveloadtalk/contribs 17:05, 14 May 2022 (UTC)
Thanks. Yeah, blocking non-autoconfirmed users on any permanent link is too aggressive. But maybe this was originally a non-block filter? I seem to recall we promoted one to auto-blocking relatively recently. Xover (talk) 19:00, 14 May 2022 (UTC)
I don't think it was this one: #42 has been set to auto-block since it was imported in early 2021. Inductiveloadtalk/contribs 19:51, 14 May 2022 (UTC)

Transclusion checker

Good work on cleaning and gadget-ifying the transclusion checker! And the other gadgetery!

Just one thing that's probably not working quite right: it looks like it's sucking up the targets for the pages in a way that includes the "(page does not exist)" bit. For example, I see requests to https://en.wikisource.org/w/api.php?action=query&format=json&titles=Page:The Commentaries of Caesar.djvu/151 (page does not exist)|Page:The Commentaries of Caesar.djvu/152 (page does not exist)|Page:The Commentaries of Caesar.djvu/153 (page does not exist)|.... Which means you cannot see if the non-existent pages are being transcluded.

It would make sense for the extension to add the name of the target page to the link as an HTML data field (so <a href="..." data-page-name="Page:The Commentaries of Caesar.djvu/153" title="The Commentaries of Caesar.djvu/153 (page does not exist)">153</a>). Which might actually be more possible now that someone has changed the PHP to use the link generator (related: phab:T267617, which is about the page's numerical position-in-index). But until then, stripping the parenthesised text would be brutal and effective. Inductiveloadtalk/contribs 16:50, 14 May 2022 (UTC)

@Inductiveload: Fixed. Nice catch!
Yeah, page index, label, quality level, and associated wikipage (i.e. what's currently in @title) would all be nice to have as data attributes. Most of the time you're going to use quality level as a selector, so the existing class is fine, but occasionally you'll come at it from the opposite direction and need to get and use the numerical quality.
PS. So are you getting in some wiki'ing in the leadup to Eurovision, or hiding from it? :) Xover (talk) 18:14, 14 May 2022 (UTC)

use of deprecated pages tag

About this edit: I use pages syntax around plates and the like because it contains the page mark-up that tells e-readers to make a page break there. I can provide a link to the page break css if you require it.

Consider reverting that change?--RaboKarbakian (talk) 13:46, 16 May 2022 (UTC)

@RaboKarbakian: That diff link leads off in the weeds. Xover (talk) 16:32, 16 May 2022 (UTC)
Heh, I am very sorry to have wasted your time. This is the whole url: https://en.wikisource.org/w/index.php?title=St._Nicholas/Volume_40/Number_12&curid=3847330&diff=12322012&oldid=12322007 And the heh was about me trying fancysmancy diff templating in the wee hours of the day. Again, my sincere apologies about the abuse of your time.--RaboKarbakian (talk) 16:48, 16 May 2022 (UTC)
Further, I saw the page template you put in its place. I ended up thinking about the day, which was not so long ago, really, that I was shown the pages tag working. It was the most pleasant interaction I have had here, really. So, removal of the deprecated tag is not such a problem due to the page break template (probably, as I haven't investigated). Lack of such positive interaction with what I was sure were humans, especially since I asked what I am pretty sure was a non-human (that also hit on me) to stop following me and trying to chat--eh, positive interaction among people, has that been deprecated also? --RaboKarbakian (talk) 16:54, 16 May 2022 (UTC)
@RaboKarbakian: No worries about any wasted time. If something isn't clear it's always better to ask (some people will actually rather fume in silence over such things, and I have no idea why). If one of my edits doesn't make sense you should feel free to assume I was insufficiently caffeinated until we can establish otherwise.
Regarding the #tag:pages syntax, the only instance I'm aware of where it's needed is when you have discontiguous pages that must be joined without a line break (a newspaper article that's "continued on p. 42" and similar). Page breaks for paged media is ultimately up to the ereader in question, but, as you say, we give them a hint with {{pb}} and {{ppb}} and this seems to mostly work.
PS. If bots start hitting on you, you know they're getting close to taking over the world. I for one welcome our robot overlords! :) Xover (talk) 18:20, 16 May 2022 (UTC)

Vol. 41, a re-upload request; or my terrible mistake

It would seem that last December, I did not search Commons for the IA name, and went to the Scan Lab asking for a lesser version to be uploaded from Hathi.

What I would like to happen is:

  1. 41.1 Index:St. Nicholas - Volume 41, Part 1.djvu to use https://archive.org/details/stnicholasserial411dodg
  2. 41.2 Index:St. Nicholas - Volume 41, Part 2.djvu to use https://archive.org/details/stnicholasserial412dodg

Problems I can see with this are:

  1. quite a bit of 41.1 has been done and the page numbers will not match. I can work out the differences if it will help.
  2. 41.2 is broken and will need further repair. My experience is that finding the needed repairs requires using the interface here, which would mean a second upload. If avoiding the second upload is the best way to maintain your good spirits, I would be willing to try a different method to find the missing parts.
  3. (might be a problem) I should not be here with this, but at the Scan Lab or just keeping it to myself and just working with these problems I created.

I considered moving onto different volumes, but I am really looking forward to more skyscraper, tunnel, and aqueduct engineering. My other problem (kind of a fun and completely different task) is that I have not been able to determine what the name of this building is: Page:St. Nicholas - Volume 41, Part 2.djvu/455; maybe it no longer exists. Thanks for your time!--RaboKarbakian (talk) 14:28, 22 May 2022 (UTC)