Hello, and welcome to Wikisource! Thank you for joining the project. I hope you like the place and decide to stay. Here are a few good links for newcomers:

You may be interested in participating in

Add the code {{active projects}}, {{PotM}} or {{CotW}} to your page for current Wikisource projects.

You can put a brief description of your interests on your user page and contributions to another Wikimedia project, such as Wikipedia and Commons.

I hope you enjoy contributing to Wikisource, the library that is free for everyone to use! In discussions, please "sign" your comments using four tildes (~~~~); this will automatically produce your IP address (or username if you're logged in) and the date. If you need help, ask me on my talk page, or ask your question here (click edit) and place {{helpme}} before your question.

Again, welcome! Beeswaxcandle (talk) 06:57, 11 July 2011 (UTC)

The mysterious Header toggle button

When you are proofreading in the Page: namespace with your toolbar turned on in "my preferences" (Show edit toolbar (requires JavaScript)), you will see the header toggle button in your toolbar; clicking it toggles the header/footer on and off. In this space we put the relevant components for the tops and bottoms of pages, usually by means of the template {{RunningHeader}}; for example, {{RunningHeader|Stanhope|3|Stanhope}} produces a running header with "Stanhope" at the left and right and "3" in the centre.


Personally, I have my header/footer set to open in the Page: namespace, and I achieved this by activating that option in my preferences (Show header and footer fields when editing in the Page namespace.) billinghurst sDrewth 06:02, 28 September 2015 (UTC)

@Billinghurst: Thanks. I do have the header/footer fields enabled, but I am deliberately ignoring them for now, in favour of getting all the main text finished. Once that's done I plan to go over it to clean up what needs cleaning up, and first among them is trying to figure out how the header/footer stuff works. I've had a quick look at some of the featured content on here, and some of the documentation, but still don't feel like I have a good grasp of how they work and what they're used for. A deep dive in the style guide is probably in order at some point.
One question that's been bugging me though: is my use of the page status "Proofread" correct? Given the unfinished state of them (cf. above), would it be more correct to set the pages to "Needs proofreading"? I had trouble understanding the finer details of their use (culture shock, used to enwiki and its terminology, sorry) so I somewhat guessed when I decided how to use it. For reference, the understanding I landed on was that "Proofread" here actually means "The automatic text has been corrected for any obvious OCR-artefacts". If instead it has broader meaning—like "Rough Draft" vs. "Draft" vs. "Final", or steps in workflow like "Ready for wider community comment", or whatever—then I may well be using it wrong and would appreciate correction.
Anyways, I appreciate you (and @Prosfilaes:) taking the time to look at this stuff over on the relevant work's index. I still don't quite understand where you're coming from, but I'll try to digest it and read some docs, and follow up further there. In the mean time, it would really help if you could point me at the relevant policy-type stuff that deals with this. Things that talk about "What Wikisource is" and its general principles, etc. If I were to sum up my understanding of our positions (short and without nuance, so only for illustration of my confusion), I would say you and Prosfilaes seem to be arguing that the project aims to produce a completely new edition of the work, with whatever improvements and emendations we think are good for our reader, while I lean more towards preserving the original to as large a degree as possible while still taking advantage of the modern platform. That seems like a pretty fundamental difference in approach, if I've understood the positions correctly, and seems the sort of thing the project ought to have copious guidance on. i.e. not the How of things, but the What and the Why.
Cheers, --Xover (talk) 07:14, 28 September 2015 (UTC)
Wikisource:For Wikipedians!!

We try to be a straight-forward and uncomplicated community. Proofread means that a person has been through that page and they believe that the text/formatting/image(s) represents the scan (the author's intent); and validation means that a second person has separately reached the same conclusion (Help:Page status). Headers/footers take the iterative/compositorial "book" construction (which I addressed at the Index talk: page). As book compositors and publishers changed styles over the last two hundred years, it has not been our goal to be bound to each book's style, instead having components of our style (per guide).

I don't think that Prosfilaes and I are arguing in the way that you indicate, we wish to reproduce the body of the text as published, but as we are in a web world, there has to be flexibility in output, otherwise there is no point to what we are doing, and you can just read the work as a scan. We want to interlink works, we want to link to authors, and we wish to make this a usable resource through the wikis. We don't wish to convert a book to be an interpretative guide, so we have guidance like Wikisource:Annotations, but we do wish to encourage translations of foreign works, so we do have Wikisource:Translations.

We try to be reasonable and practicable. We are strong on principles, we are guided by rules. — billinghurst sDrewth 10:09, 28 September 2015 (UTC)

A "completely new edition" could mean a lot of things. We're changing the line breaks and the font and dynamically retypesetting it at will (computer typesetting always blows my mind when I think about what it was like pre-computer). There is some disagreement about the long-s, which I would argue is a sub-orthographic feature to be lumped in with the font, but {{ls}} generally works to placate both sides. I would argue that what I'm doing is a reprinting, just in a different form factor and in more modern fonts, not a completely new edition.--Prosfilaes (talk) 23:50, 28 September 2015 (UTC)

Page:The Plays of William Shakspeare (1778).djvu/345, {{refn}}, {{ss}}

I hope you do not mind but noting your comment "Note, issue with the refn template and the |follow argument!" on the first above I had a bit of a go at the page and you can see the result.

As I note in passing you were the person who imported {{refn}} from WikiPedia no doubt you have also become aware that that community has no use for (and thus their templates never support!) the follow parameter. I briefly considered adding said support but the result would be even uglier than directly coding {{#tag:ref so I demurred.

Finally {{ss}} formerly rammed the result into any following characters: e.g. "poſſeſſion" becomes "poſſeſſion" (with my luck those two look identical on your browser; please believe me on mine only the latter case looks quite "normal.") so I took the opportunity of amending that as well. AuFCL (talk) 01:02, 17 November 2015 (UTC)

Hi AuFCL, and thanks for helping out here. Very much appreciated!
I'm still planning on trying to fix the {{refn}}, but since the solution didn't jump out at me on a quick glance I just put it aside until I'm through proofreading the text. I'm expecting lots of little issues (the stuff around the page breaks not least of all) that I figured would be best tackled when I could more easily see how the whole thing will look.
As for {{ss}}, I'm primarily concerned with marking these ligatures, and I'm still undecided on whether to try to mark all the other ligatures in the text (ffi, etc.). Displaying them isn't that critical to me in the short term, and it seems there are quite a few issues that should be addressed before that bubbles to the top of the list. For one, an easy switch to let each user decide whether they want to see these or not (not sure what's actually in place currently), and a consensus default for non-logged in users (which is probably "off", but I haven't located a relevant discussion yet). For another the use of CSS-positioning to emulate a font feature is kinda gross, and any typographer would soundly trout you before running away screaming if they saw that, but at the same time using the CSS 3 Fonts module seems premature, or at least requires skin support in Vector. Anyways, I'm basically only concerned with having them marked so they can easily take advantage of any future solution, if and when one becomes available. In the mean time I'm perfectly happy with displaying just plain "ss".
Aanyways… Very happy to see someone take an interest in this text, for all sorts of reasons, so feel free to mess about as you think best, and I'll be sure to quibble where I have any objection. --Xover (talk) 14:05, 17 November 2015 (UTC)
Oh well. General encouragement and all that (you can probably tell I'm currently—today at least—pressed for time.)

If there is anything I can help with please let me know. AuFCL (talk) 20:07, 17 November 2015 (UTC)

May I sound you out upon a slightly subversive issue? No pressure if you choose not to proceed.

I am a little annoyed that certain persons have been pretending that there was a discussion and consensus in 2014 to deprecate {{ct}} when in fact to the best of my admittedly limited researches the last discussion took place in 2011 and only agreed to dispense with {{st}} (long since deleted and since replaced by an entirely unrelated template)—{{ct}} was only ever mentioned as a side issue without conclusion. Discussion is here in case this is new to you (Also please set me straight if I have misread the argument.)

What I am proposing is that {{ct}} be modified to display   (which I consider self-evidently merely representative of the typography and most certainly not a match to the page scan; on the other hand it is simple to generate using approved methods) in Page, Template and Index spaces; and as "ct" in all other name-spaces to keep the search-engine-side-of-the-argument people happy. (And there is always File:Latin_ligatures_ct_and_st.svg as backup or a reference point.)

If you consider this is a waste of time and I should drop the idea—or equally if you have any thoughts or a better idea—please say so!

Sincere apologies if any of this puts you on the spot if it is an issue in which you did not wish to become involved. AuFCL (talk) 08:01, 18 November 2015 (UTC)

@AuFCL: Well, I hadn't really planned on getting involved in any drama when I set out on my Wikisource-projects (I mostly edit on enwiki which has a seemingly endless supply of it), but I seem to have halfway stumbled into an area that is, for some reason I'm not quite grasping as yet, somewhat controversial.
To try to sum up my position: I very much care about maintaining as many aspects of the original work as is practically feasible by tagging it in a semantically structured way. I am, in the short term, not all that concerned with presentation and can happily live with whatever is the consensus default. When another editor insists on a change that is contrary to these goals I expect their position to be backed up by policy or community consensus, and I expect the discussions leading up to either to be easily findable and clear-cut.
As a concrete expression of these more general principles, I have searched in vain for any community discussion that deprecates typographical ligatures. I have found some that address technical limitations and problems with various specific limitations, but otherwise the discussions I have found suggest a consensus for preserving these to whatever degree is practical. The policies I've found that bear on this also point specifically to preserving such features wherever possible and to what extent it is possible to do so (without addressing the question directly).
Thus I am quite concerned that individual editors' personal preferences are being applied as if they were project-wide policy supported by community consensus, and would engage myself in any process or discussion aimed at resolving this dissonance.
Now, all that being said, I have some opinions, more or less well considered, but not necessarily ultimate answers. There may well be factors I am not aware of that make my current conclusions naive or unworkable in practice. If so I would very much like to be made aware of that. I also have some personal preferences in some aspects, such as that I do not like the visual presentation of your proposed solution (I want the actual ligature and not a symbolic representation of it) and neither am I particularly happy about your implementation of it (using math markup). Thus, depending on what other alternatives are on the table, I would likely then fall down in favour of using just plain "ct" until a proper solution could be implemented (which, I think, will be using the CSS 3 Fonts module's support for historical ligatures; see [1], [2], and [3]).
In any case, I am sympathetic to your position, even if I do not predict agreeing with all details of its proposed implementation. --Xover (talk) 19:23, 18 November 2015 (UTC)


What is the purpose in beginning each line with {{gap}} on this page? If the poem is set inside {{block center}}, then I can't see what function the use of {{gap}} serves. --EncycloPetey (talk) 18:22, 21 February 2017 (UTC)

No purpose. They're a leftover brainfart. Thanks for the catch! --Xover (talk) 18:33, 21 February 2017 (UTC)

You could try ...

Hi, just noticed the frustration with the passage with a running left margin quote mark. If you want to set off the passage, you could try {{quote}}. An example where I've used it is Page:A Dictionary of Music and Musicians vol 3.djvu/86. Beeswaxcandle (talk) 09:31, 8 August 2018 (UTC)

@Beeswaxcandle: Thanks for the tip. I'll keep it in mind for future needs. But in this case the problem was mainly that after pages upon pages of block prose quotation, Hazlitt's reproduction of North's translation of Plutarch begins intermixing prose and dialogue. Since we can't sanely reproduce the row of single quote marks in the left margin, and the dialogue stuff is too long for just pairs of single quote marks at start and end, it needs some other typographic way to set it off. I don't particularly like using italics that aren't present in the original, but in this case I think it's a reasonable compromise when we can't be fully semi-diplomatic. --Xover (talk) 10:50, 8 August 2018 (UTC)

Henry IV Part 1

This one doesn't have the cover in the scan. It's a (blank) library binding and not the book's cover. --EncycloPetey (talk) 01:23, 12 January 2019 (UTC)

Also, {{serif}} is a very old and unnecessary template. The serifs are added in the page layout when the content is transcribed. Specifying fonts or font-styles is usually frowned upon, as it is restrictive and not necessary. --EncycloPetey (talk) 01:26, 12 January 2019 (UTC)

@EncycloPetey: Thanks. But I think you'll have to explain the serifs thing. I add the explicit serif formatting to the headings and such that look horrible and anachronistic in sans-serif, but only to those places. How would transclusion into mainspace achieve the same thing? --Xover (talk) 08:48, 12 January 2019 (UTC)
Look at the transcluded versions; I have transcluded the Bibliography for instance. They have a layout applied to them that includes a serif-font preference. Applying that level of font-style in the Page namespace is redundant. --EncycloPetey (talk) 14:59, 12 January 2019 (UTC)
@EncycloPetey: Ok, I think I see. By specifying a layout when transcluding we're effectively also applying a set of styles, and since that style sets everything to serifs it is redundant to also do so in the Page: namespace? I'd not really considered the layouts as something other than an extra gadget for editors: are they active also for non-logged in users? Are the effects of the different layouts documented anywhere? --Xover (talk) 10:29, 13 January 2019 (UTC)
The only place I'm aware of is Help:Layout, but its explanations are minimal. As far as I am aware, the Layouts, when applied, have effects visible for everyone. --EncycloPetey (talk) 15:03, 13 January 2019 (UTC)
Hmm, and further, why are you specifying a 320px width for the text here? Especially considering neither {{fine block}} nor {{smaller block}} actually support a |width= parameter. --Xover (talk) 09:06, 12 January 2019 (UTC)
The width should have applied to the {{block center}} which wraps the font size. Templates like {{fine block}} or {{smaller block}} are simply there to adjust the size of the font; it's the enclosing template {{block center}} that sets the margins and placement of the text. Consider:

This text is enclosed with a width limitation in a "block center template"

This text is enclosed without a width limitation in a "block center template"

The applied width limits the length of the text lines, which is the format present in the original image description present in the text. Without the limitation, the lines of text will be as wide as the screen allows. What you noticed was that I had failed to add the wrapping template, but have done so now. --EncycloPetey (talk) 14:59, 12 January 2019 (UTC)
Hmm. I'm not sure I particularly agree with the implication that non-wrapped text is undesirable. But that aside… Doesn't the layout (cf. the point above) also apply things like max-width, margins and text-indent, and block center? Or, at least, could and should it not do those things by the same reasoning as for a serif font-family? What's the reasoning for applying one in the Page: namespace and the other in a layout in mainspace? And whence the particular pixel width, not to mention why give it in pixels? --Xover (talk) 10:29, 13 January 2019 (UTC)
The purpose here is to make the block of text slightly narrower than the rest of the text, which it is in the original (being an image caption) and to reduce the chance of being a single long line, which would look very odd in the layout. No, this isn't a style, it's layout. I've used pixels because that's what I always use; if there's a better option that works well with the applied layout, then I'm open to suggestions. --EncycloPetey (talk) 15:03, 13 January 2019 (UTC)
Addendum: I guess part of the reason for using pixels in this instance is that it immediately precedes an image it describes. Since that image has its width given in pixels, it makes sense to do the same with the caption, so that both are measured in the same system and can be easily coordinated. But what I said before is still true. I use pixels because that is the measurement system I'm most familiar with, and is more closely tied to display size. --EncycloPetey (talk) 15:46, 13 January 2019 (UTC)
Pixels are usually a poor choice precisely because they are tied to display size: every other option is relative and scales to some degree, but pixels fall down when faced with the wide variety of pixel resolutions of devices out there. Even images are best sized using percentages of their containing context. But, of course, that's all on a general basis, and the specific needs on enWS may affect the calculus. In any case, I was mostly just wondering whether there was a specific reason for it. --Xover (talk) 07:18, 14 January 2019 (UTC)

A suggestion on workflow: I found it easier to set up all the Notes pages (with anchors) in advance, and transcluded the Notes section, so that I could check the "Cf. n." links as I worked through the Text pages. If a link showed up red, or didn't take me to the right endnote, then I was able to make that correction in the Text. --EncycloPetey (talk) 15:42, 12 January 2019 (UTC)

Good idea. I like approaches that give you extra ways to detect errors "free". But we'll see how I actually go; right now I'm more concerned with figuring out the technical stuff like formatting and partial transclusion and so forth. Lots of new and, to me, relatively advanced stuff on this one! --Xover (talk) 10:29, 13 January 2019 (UTC)
I'm working on Macbeth now (and it will take a while). You can watch to see my workflow if that helps. I work in stages to keep things clear in my own head as I work when a work is as complicated as these are. --EncycloPetey (talk) 15:03, 13 January 2019 (UTC)
@EncycloPetey: Thanks, I'll keep an eye on that. And, I may perhaps be so bold as to presume I may ask you further stupid questions as they arise? :) --Xover (talk) 07:18, 14 January 2019 (UTC)
  • @EncycloPetey: I think I've figured out all the twisty niggling bits, and completed proofreading Index:Henry IV Part 1 (1917) Yale.djvu. Could you take a look just to make sure I didn't mess up anything big? --Xover (talk) 21:42, 8 February 2019 (UTC)
    I haven't looked thoroughly, but I did notice that the right-hand page headers were missing a space after the period. I've corrected them all for Act V, but not the earlier acts. I'll look through more things as I have time, and let you know if I spot anything. Do you plan to continue on to 2 Henry IV next? The file exists on Commons; I just hadn't set up the Index page. --EncycloPetey (talk) 21:50, 8 February 2019 (UTC)
    @EncycloPetey: Thanks. I do intend to move on to 2H4 next, yes. Regarding the right-hand headers, I'm not convinced there is actually a space there: the original typesetting (kerning) is too quirky and coarse to be able to say definitively, and act.scene is at least as common as act. scene and act, scene in general. But in any case, I've gone back over the earlier pages and inserted one in the interest of consistency. --Xover (talk) 09:28, 9 February 2019 (UTC)
    One formatting error I've noticed: You need to end formatting whenever a line is to be centered. Adjusted margins affect the centering, so: do this to avoid misaligned centering. --EncycloPetey (talk)
    @EncycloPetey: Thanks. I must admit that one is down to laziness rather than ignorance: the slight offset on centred lines seemed too small to bother with. Since it is apparently more noticeable than I thought I'll try to be more careful going forward. --Xover (talk) 19:44, 10 February 2019 (UTC)

Dash in Index pages

The dash (-) in Index pages is properly used for blank pages that lie outside the page numbering system.

See Index:Aeneid (Conington 1866).djvu. The opening pages are blank, and are outside the numbering system, and so are marked "-".

But pages iv and vi, which are blank, lie within the page numbering system, and so are numbered. The gray color indicates they are blank, so it is not necessary to double mark them with gray color and a dash. Doing so could also confuse people about the sequence and numbering of the pages, suggesting that pages had been omitted from the scan. This does sometimes happen with defective scans.

The "Errata" page is not numbered because it is a slip that was inserted into the volume, and was not a full page of the volume. The "page" following it is a scan of the back of the errata slip, which is also not a page, but the back of an inserted slip. Since it has no content and is not part of the page numbering scheme, it is best represented as a "dash".

Page 450 is another blank page, but it is clearly part of the page numbering scheme for the volume, since the previous page is numbered 449 and the following page is numbered 451.

With the Yale Shakespeare series, there is a common set of eight pages in the front matter, none of which bear a page number in any of the volumes, but library catalogs assign these pages Roman numerals beginning with the half-title page[1] and continuing up to the first numbered page of the volume, or at least the page that appears in the position of "page 1" even if it is not numbered and the following page is "page 2".

With this in mind, I've set "page i" at the half-title page in accordance with standard practice. The back of that page, even if it is blank, is "page ii". --EncycloPetey (talk) 20:30, 26 January 2019 (UTC)

  1. But usually omitting a sheet that bears a frontispiece.
@EncycloPetey: Ugh. Confusing! But thanks for clarifying. --Xover (talk) 07:10, 27 January 2019 (UTC)
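
The numbering rules described in this thread can be modelled in a few lines. This is only a rough Python sketch of the convention, not how the Index page itself is configured; the function names, the scan positions, and the example volume are all hypothetical. Scans outside the numbering (leading blanks, an inserted errata slip and its back) get a dash and do not advance any counter, while blank pages inside the numbering keep their number.

```python
def to_roman(n: int) -> str:
    """Convert a positive integer to a lower-case Roman numeral."""
    numerals = [
        (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
        (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
        (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
    ]
    out = []
    for value, symbol in numerals:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def page_labels(scan_count, roman_from, arabic_from, outside=()):
    """Label each scan position (1-based) of a hypothetical volume.

    roman_from  -- scan position of the half-title page ("page i")
    arabic_from -- scan position of "page 1"
    outside     -- positions that lie outside the numbering, such as an
                   inserted errata slip; they get "-" and are not counted
    """
    outside = set(outside)
    labels = {}
    roman = arabic = 0
    for pos in range(1, scan_count + 1):
        if pos in outside or pos < roman_from:
            labels[pos] = "-"              # outside the numbering system
        elif pos < arabic_from:
            roman += 1                     # counted even if the page is blank
            labels[pos] = to_roman(roman)
        else:
            arabic += 1                    # counted even if the page is blank
            labels[pos] = str(arabic)
    return labels


# Hypothetical scan: 2 leading blanks, 8 front-matter pages, then the text,
# with an errata slip (front and back) at scan positions 13 and 14.
labels = page_labels(16, roman_from=3, arabic_from=11, outside={13, 14})
```

In this made-up example the errata slip interrupts the scan sequence but not the page sequence, so the page after the slip continues from the page before it.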

Header formatting

I don't usually bother trying to match font size in the headers. If the page numbers are a bit smaller, then there's little point in going to the trouble, since the header isn't transcluded in the final copy anyway. Some people like to go to a lot of trouble formatting the header text, but as it's not going to show up anywhere but the Page namespace, it doesn't seem worth the effort in most situations.

However, italic text in the header is easily done. For the Yale Shakespeare, I've been italicizing, but not concerned with making bits of the header larger or smaller. --EncycloPetey (talk) 22:03, 27 January 2019 (UTC)

@EncycloPetey: Yeah, I've been starting to have second thoughts about the level of and which detail to reproduce in headers (and the main text for that matter). As you say the formatting here generates a bit too much cognitive overhead, which in turn (among other factors) leads me to drop the ball on stuff like the italics. My starting point was very close to diplomatic (paleography and minor textual, even punctuation, details are a major issue in my field), but experience here is slowly nudging me back down that scale. Hopefully I'll arrive at a sensible middle path eventually! :) --Xover (talk) 07:30, 28 January 2019 (UTC)
@EncycloPetey: By the way, you may find some use in this list. It's in a sandbox that may change at some point in the future, hence the permalink, but I have no immediate plans to reuse it. I may at some point update it to fill in some of the missing bits, so feel free to browse the most recent version of the page. --Xover (talk) 21:31, 6 February 2019 (UTC)

PageLayout and MarginNote templates...

Can I ask you to look into documenting these and implementing an lrpage and rlpage option for them?

The context being -;_or,_Chronicle_of_the_Kings_of_Norway_Vol_1.djvu/225&action=submit which has alternating margins. ShakespeareFan00 (talk) 19:18, 10 February 2019 (UTC)

@ShakespeareFan00: No promises as template syntax gives me as much of a headache as the next guy, but I can certainly take a look. However, I don't think I understand what it is you're asking. Template:MarginNote appears to have existing documentation; and judging by Page:The Heimskringla; or, Chronicle of the Kings of Norway Vol 1.djvu/225 you've already achieved the effect you're after using it and Template:PageLayout. What is it you're having trouble with and what would the lrpage and rlpage parameters do? --Xover (talk) 19:39, 10 February 2019 (UTC)

Page:The Heimskringla; or, Chronicle of the Kings of Norway Vol 1.djvu/225, which I've already been able to handle, has a right-hand margin. Page:The Heimskringla; or, Chronicle of the Kings of Norway Vol 1.djvu/226 has a left-hand margin.

In the Page namespace I can set up left- or right-hand margins per the header. I can't currently set up a {{MarginNote}} that automatically changes a left-hand margin note to a right-hand one when the page is transcluded (contrast the functionality of {{outside R}} and {{outside LR}} for sidenotes).

In the main namespace, by comparison, either a left-hand or a right-hand margin would be set up for an entire transcluded section, and at present the MarginNotes cannot be set to display differently depending on Page use (where the margin space for them alternates between pages) or mainspace use (where it would not).

lrpage and rlpage are shorthands that tell a template to behave differently in Page vs other namespaces:

  • lrpage means "left" in page namespace, "right" when transcluded
  • rlpage means "right" in page namespace, "left" when transcluded

the behaviour of the template changing accordingly.
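
As a rough illustration of the requested behaviour (a Python model only, not template code), the two shorthands amount to a lookup on the current namespace. The function name `resolve_side` is hypothetical, and the sketch assumes the namespace arrives as the string "Page" in the Page namespace and as an empty string in mainspace.

```python
def resolve_side(option: str, namespace: str) -> str:
    """Return the margin side ("left" or "right") a note should use.

    option    -- "left", "right", "lrpage" or "rlpage"
    namespace -- assumed to be "Page" in the Page namespace and ""
                 (empty) in mainspace, where the text is transcluded
    """
    in_page = namespace == "Page"
    if option == "lrpage":
        # left in the Page namespace, right when transcluded
        return "left" if in_page else "right"
    if option == "rlpage":
        # right in the Page namespace, left when transcluded
        return "right" if in_page else "left"
    if option in ("left", "right"):
        # plain values behave the same everywhere
        return option
    raise ValueError(f"unknown side option: {option!r}")
```

Under this model a page with a left-hand margin in the scan would pass `lrpage` (or `rlpage` for a right-hand one) and all notes would land on a single, consistent side once transcluded.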

(I can of course make an editorial decision to only use one 'margin' style, which is a reasonable medium term workaround.)

It is {{PageLayout}} that is undocumented, and some examples of how to use it in conjunction with MarginNote would in general be useful anyway.

There are also some other limitations of {{PageLayout}} and {{MarginNote}} that are not applicable to the current work. ShakespeareFan00 (talk) 20:25, 10 February 2019 (UTC)

See also -Template:MarginNote/sandbox for a related issue I solved, on Page:The Heimskringla; or, Chronicle of the Kings of Norway Vol 1.djvu/230 ShakespeareFan00 (talk)

Henry V

Is it a conscious choice on your part to indent all the text as poetry, or did you not notice that the prose passages of the current pages you're doing use a different indentation to indicate they are prose? --EncycloPetey (talk) 16:15, 18 February 2019 (UTC)

@EncycloPetey: I didn't notice. Or rather, provided we're thinking of the same passages, I just figured they were inconsistent typesetting. Any suggestions for how to handle them? --Xover (talk) 16:42, 18 February 2019 (UTC)
It's simply a tweak of the {{dent}} template to accommodate the different indentation. The series is consistent about giving the prose an extra 1em of indentation. If you didn't notice this before, I can handle the Henry IV plays, if you'll do Henry V. --EncycloPetey (talk) 16:46, 18 February 2019 (UTC)
@EncycloPetey: That would be much appreciated. Thanks, and I'll try to get this right on H5. However, looking into it now I see what I'd noticed before wasn't actually the prose passages, but rather Pistol's spontaneous verse lines interspersed in them. These look like they have -1em margin when you've not noticed the actual prose lines. Sigh. I can tell already that I'm going to struggle getting this right. So doubly thanks, it seems, as I would never have noticed this on my own! --Xover (talk) 16:58, 18 February 2019 (UTC)


It isn't necessary to label sections on most pages in the Appendices. In the body of the plays, there are separate parts on each page (body text, footnotes) that must be separated from each other in the transclusion, so all the sections have to be labelled. But in the Appendices, there is usually just one section per page, so labeling the sections is not needful. The only exception being instances where one Appendix concludes and another begins on the same page. --EncycloPetey (talk) 18:27, 22 February 2019 (UTC)

@EncycloPetey: Thanks. Yeah, I was actually aware of that (but very much appreciate you checking up!); but one of the previous plays had appendices that ended and started on the same page, so here I added the section markers prophylactically for consistency-slash-laziness. The extra markup should do no harm even if unused. --Xover (talk) 18:46, 22 February 2019 (UTC)
The problem I've found with using section markings when they're not needed is that proofreaders occasionally remove the section markup, either accidentally or without understanding its importance, so I avoid using them if they're not essential for transclusion. A single removed or altered section tag removes that page from the book. --EncycloPetey (talk) 22:57, 24 February 2019 (UTC)
Sure. But at a certain point there you're trading off robustness and convenience in a risk analysis with diminishing returns. The pages will be very rarely edited, and even more rarely by someone that messes up the section markers, and there are a whole host of other markup syntax errors that can mess up not just the one page but even all the following pages in the transclusion unit. --Xover (talk) 07:20, 25 February 2019 (UTC)
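
The failure mode discussed here can be seen in a simplified model of section extraction. This is a Python sketch under the assumption that a page marks sections with paired `<section begin="..."/>` and `<section end="..."/>` tags; it is not the actual transclusion code, and the function name and the sample page are hypothetical.

```python
import re


def extract_section(page_text: str, name: str) -> str:
    """Return the text between matching begin/end tags for one section.

    A simplified model: if either tag is missing or altered, nothing
    matches and an empty string is returned.
    """
    pattern = re.compile(
        r'<section\s+begin="?{0}"?\s*/>(.*?)<section\s+end="?{0}"?\s*/>'.format(
            re.escape(name)
        ),
        re.DOTALL,
    )
    return "".join(match.group(1) for match in pattern.finditer(page_text))


# Hypothetical page with a body section and a footnote section.
page = (
    '<section begin="text"/>Body of the play.<section end="text"/>'
    '<section begin="notes"/>Footnote.<section end="notes"/>'
)

# The same page after a proofreader has deleted one begin tag.
broken = page.replace('<section begin="text"/>', "")
```

With the intact page, asking for "text" yields the body; with the broken page the same request yields nothing, while "notes" is unaffected. A page with no section tags at all has nothing of this kind to break, which is the trade-off being weighed in the replies above.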

Final scans

I've set up half of the Yale Shakespeare comedies, and much more of the disambiguation and versions pages. At this point, there are only four volumes in public domain for which I haven't located a quality scan. For these I've located Google scans only, which are never great, and are occasionally awful.

  • Shakespeare's Sonnets - Google scan from the University of Minnesota that looks surprisingly clean.
  • Romeo and Juliet - I've found a Google scan from the University of California that looks OK on a first pass.
  • 2 Henry VI - I found a Google scan that was good enough for the Hathi Trust to host it. Again, looks reasonably clean.
  • The Tempest - The only Google scan I've found is a botched job with multiple "extra" pages, which may mean that pages are also missing or out of sequence.

I'm asking around to see whether someone can generate a DjVu for 2 Henry VI. If I find someone who can capably produce a usable file, then I'll see whether they can also do the Sonnets and Romeo and Juliet. I'll continue looking for a cleanly scanned Tempest, but might have to purchase a copy myself and find a nearby library with digitizing equipment. --EncycloPetey (talk) 22:55, 24 February 2019 (UTC)

2 Henry VI is up and started. You'll notice a few differences once you start proofreading:
  1. The scan resolution is lower, so some characters may be harder to distinguish; e.g. numbers such as "3" and "8" will look very similar.
  2. Different scan errors: "x" frequently becomes "r".
  3. Some hyphens will be missing from the text.
  4. Some quotes in the text layer will be curly quotes, and will need to be swapped to straight quotes, which doesn't happen with IA scans.
--EncycloPetey (talk) 02:35, 25 February 2019 (UTC)
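The curly-to-straight quote swap in point 4 lends itself to scripting. The following is only a minimal sketch in Python, assuming the text layer of a page is available as a plain string; the name `straighten_quotes` is hypothetical and not part of any existing Wikisource tool.

```python
# Hypothetical helper: map typographic (curly) quotes to straight quotes.
CURLY_TO_STRAIGHT = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark (also used as apostrophe)
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
}


def straighten_quotes(text):
    """Return a copy of text with curly quotes replaced by straight ones."""
    return text.translate(str.maketrans(CURLY_TO_STRAIGHT))
```

A proofreader would still need to check the result by eye, since this only normalizes the quote characters and does nothing about the other scan errors listed above.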
Thanks. Yeah, I've been subjected to Google scans before. Especially when combined with 18th-century typography and page layout (Edmond Malone uses multi-page nested footnotes, obsessively formatted details that can't really be approximated: they have to either be reproduced exactly or replaced entirely. Sigh.). I'm sure Google Books started out with the very best of intentions, but their incompetent scanner operators, low general quality level, and subsequent turn to lock up and lock down public domain works… Meanwhile their crappy job made it that much harder to get proper scanning projects funded, much less any organised way to make the scans available. Thank heavens for HathiTrust, but their approach to copyright is way way too conservative.
In any case, it will probably be a pain, but we'll get there eventually. Thanks for all your work and care getting this set up! --Xover (talk) 07:29, 25 February 2019 (UTC)
@EncycloPetey: I took a stab at The Tempest based on the page images at HathiTrust (Google scanned), mainly as an experiment to test out my tooling. I've manually cleaned up the page images (there were more interleaved pages than original pages!) and generated new OCR text. So far as I can tell it looks reasonable, except that I wasn't aggressive enough in removing interleaved pages at the start of the book so that it has an excessive number of blank pages there relative to the rest of the series. I didn't feel that was worth going back and redoing the DjVu over, but if you disagree I can do so (it's not all that much work). Take a look and let me know what you think?
PS. I now have some rudimentary tooling for working with DjVu files set up (including generating a DjVu with hidden text from just page images), so if you need anything done feel free to ask (or just ping me when you post a request at the Scriptorium). I claim no particular expertise, nor make any promises regarding quality or response time, but I'm happy to help when I can so never hesitate to ask. --Xover (talk) 10:18, 13 April 2019 (UTC)
My feeling is always that "if it's worth doing, it's worth doing right". Besides the extra blank pages before page "i", you've left two extra pages between "i" and the Title page, which puts the title on page "v" instead of page "iii", and also left two blank pages between the copyright and the Contents, both of which mean that the page numbering will be wrong for all of the front matter (relative to the original text). This has an impact on citations. There should also be the usual number of blank pages at the end of the book, since the "PRINTED IN..." really isn't the back cover of the book. I would recommend redoing the DjVu so that all of these issues are corrected and we have a proper reproduction of the original text. --EncycloPetey (talk) 14:25, 13 April 2019 (UTC)
@EncycloPetey: Hmm. Ok… I've removed the extraneous interleaved pages in the front matter. Since this scan is missing the outside cover I have omitted that page, but I've kept the inside (verso) cover plus one blank leaf between the cover and the series page. I have also kept the blank verso of the leaf with the series page. All other blank pages in the front matter have been deleted. For the end of the book there is little rhyme or reason to the scans, but it looks like the practice of the printer was that:
  • if the Printed in… page is included at all, it is always printed on the verso side of a leaf
  • if the last content page is a recto page, the Printed in… page is printed on its verso side
  • if the last page is a verso page, the Printed in… page is printed on the verso of its own leaf
  • there is usually at least one blank leaf between the Printed in… page and the back cover
So, for this work, I have added a separate leaf with the Printed in… page (since the last content page is a verso page), followed by one blank leaf (recto + verso pages). Neither side of the actual cover is included in the scan so I have left those pages out. It's very hard to tell from the scan, but it looks like this copy had one blank leaf inserted before the Printed in… page and two blank leaves after it; in addition to the blank recto of the Printed in… page itself and the original blank leaf following it. Without access to a physical copy or a better scan this is the best approximation we can get I think.
Incidentally, in looking into this I notice that for those editions where the Printed in… page appears on the verso of the last content page the pagelist tends to assign it a page number, which I do not think is warranted: that it appears following the last actual numbered page is mere happenstance, and when it appears on a separate sheet it is not numbered. --Xover (talk) 10:18, 14 April 2019 (UTC)
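The printer's practice described above can be written down as a small rule. This is only a sketch in Python of the rules as listed, not a description of any actual tooling; the function name `end_matter` and the page labels are hypothetical, and the default of one trailing blank leaf is an assumption taken from "usually at least one blank leaf".

```python
def end_matter(last_content_side, blank_leaves=1):
    """List the pages that follow the last content page, in order.

    last_content_side: "recto" or "verso", the side of the leaf on which
    the last content page is printed.
    blank_leaves: number of blank leaves between the Printed in... page
    and the back cover (assumed default: one).
    """
    if last_content_side not in ("recto", "verso"):
        raise ValueError("last_content_side must be 'recto' or 'verso'")
    pages = []
    if last_content_side == "verso":
        # The Printed in... page gets its own leaf, whose recto stays blank.
        pages.append("blank")
    # In both cases the Printed in... page ends up on a verso.
    pages.append("Printed in…")
    for _ in range(blank_leaves):
        pages.extend(["blank", "blank"])  # recto + verso of a blank leaf
    return pages
```

For The Tempest, where the last content page is a verso, this gives a blank recto, the Printed in… page, and then one blank leaf, which matches what was added to the file.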


You clearly know what you're doing here, from what I've seen. Interested in having mop privileges? —Beleg Tâl (talk) 12:17, 21 March 2019 (UTC)

@Beleg Tâl: Well, I would certainly be happy to help where I can. But 1) I've not had +sysop on any project before so I'm not familiar with the tools, and 2) I've as yet not run into any particular instances where I've needed that bit (except possibly interface admin, but I'm sure that could have been handled by proxy if needed). --Xover (talk) 15:12, 21 March 2019 (UTC)
I initially read that as a "yes" and posted the nom; now I'm not sure; is that a yes or no? If you read through Wikisource:Adminship it will tell you what tools are available and what you would be able/expected to do. The only tools of import are the ability to delete pages, and block users. In my opinion, in a small project like this, anyone who can be trusted with sysop should be offered sysop; you have clearly proven yourself trustworthy. —Beleg Tâl (talk) 15:18, 22 March 2019 (UTC)
@Beleg Tâl: It's a yes :) I'm just used to enwp's RfA and wanted to be up front about my experience and relative need for the tools in case that mattered here. --Xover (talk) 15:26, 22 March 2019 (UTC)
Great! I have re-posted the nom; if you could post at Wikisource:Administrators#Xover that you accept the nomination, then everything will be set for voting. —Beleg Tâl (talk) 15:29, 22 March 2019 (UTC)

Hi Xover, I have closed your nomination as successful, and assigned you the bit. I hope you enjoy the extra responsibility!

If you know any languages other than English, or have any special access, can you please update Wikisource:Administrators#Current administrators?

Hesperian 23:04, 31 March 2019 (UTC)

@Hesperian: Done. And thanks. --Xover (talk) 05:30, 1 April 2019 (UTC)

Nomination for Admin

I'm writing a story about your nomination for adminship. Can I get a quote for Wikisource:News? How do you feel about potentially being the first new admin on the project in more than a year? –MJLTalk 16:19, 31 March 2019 (UTC)

@MJL: While the attention is certainly flattering, I'm not sure that's particularly newsworthy. I probably won't be doing much admining, above some minor technical stuff, unless something urgent pops up that none of the experienced admins are available to handle. I am happy, though, to see your enthusiasm and efforts in the area of community building and cheerleading. enWS is a relatively small community where people are mostly plugging away at their own little projects. Anything that brings us closer together can only be good for the long-term health of the project. --Xover (talk) 05:38, 1 April 2019 (UTC)

Page number errors

Just bringing this to your attention. The page numbers in the Indices do need to be checked carefully. Numbers are more prone to scannos in this series than the text. --EncycloPetey (talk) 01:15, 7 April 2019 (UTC)

@EncycloPetey: Ugh. That's a pretty poor showing, alright. Thanks for letting me know! --Xover (talk) 04:55, 8 April 2019 (UTC)

Merchant of Venice

Note on page 21 is "get-up", though the scan has either blurred it to "get-ap", or else the printer made a mistake. Without holding a physical copy, I'm not sure if {{SIC}} is warranted. Sometimes an error like this is caused by the scanning process. --EncycloPetey (talk) 13:24, 22 April 2019 (UTC)

@EncycloPetey: You really think so? "get-up" appears out of style with the rest of the work, and is not a particularly good gloss on Via!. It's not a scan artefact as it is also present in the Google scan of the original 1923 printing. As such it is also unlikely to be a printing error: it is present on the original plates. If it is a typesetting error, it is a very strange one. How would the typesetter end up using either badly broken letters or a radically different font for just these two letters? And in any case, they do not even vaguely resemble the letterforms for a, o, u, or p elsewhere in the work. The most plausible explanation I can think of is that whatever it is, it comes from the author's manuscript. And the closest letterform for the first is a Greek alpha. The second, either a small italic rho with dasia or psili, or a very badly mangled phi. Which of course makes no sense either, but it is the best I can come up with. --Xover (talk) 15:05, 22 April 2019 (UTC)
I don't see a radically different font here. This is a smaller typeface, as all the notes are printed in, and italicized. This smaller italicized font does look different from the non-italicized font, but is itself consistent in the notes. This looks just like the "a" in the note for line 10 on the same page, and I suspect an "a" was mistakenly used for "u" and no one caught the mistake. The "p" looks just like the "p" in the note on page 20 for line 35 "stepfather". --EncycloPetey (talk) 15:13, 22 April 2019 (UTC)
@EncycloPetey: Aha! You're right, of course, I was comparing with the regular typeface in the notes. The italic letterforms are straightforwardly identical. Which makes this "get-ap". I still don't quite buy "get-up" as the intended meaning (too colloquial and informal), but taking your word on that, I would then say we SIC-mark get-up/get-ap here. --Xover (talk) 15:24, 22 April 2019 (UTC)
"Get-up" might be too colloquial in current English, but perhaps not 100 years ago. I've seen a number of glosses in this series that seem odd by the standards of more modern English, ~100 years since the original publication. --EncycloPetey (talk) 15:38, 22 April 2019 (UTC)
@EncycloPetey: Well, I must readily admit that late 19th- / early 20th-century is a particular blind spot, but the formulation, in the context, jibes badly with both modern usage and anything before the early 19th century. But as you say, there are a number of such odd glosses in this series. --Xover (talk) 16:12, 22 April 2019 (UTC)
@EncycloPetey: Hmm. Actually, I am informed elsewhere, that "get-ap" and "get ap" are widely attested early 20th century US forms of "giddy-up" that would be not entirely inappropriate in the context. cf. its use as an interjection in Wiktionary. I have never seen that form, but then I had not really seen the "get-up" form either, and of the two I am actually inclined to go with "get-ap" on the available evidence. --Xover (talk) 11:40, 23 April 2019 (UTC)

Template:Hanging indent inline

The template {{Hanging indent inline}}, which you have created, is really good! Would it also be possible to modify it somehow, so that it also works when it starts on one page and finishes on another? Something like template:hin/s, template:hin/m and template:hin/e. --Jan Kameníček (talk) 22:28, 22 May 2019 (UTC)

@Jan.Kamenicek: I can certainly take a look. There's nothing technical about the template as such that would prevent this, but we may run into HTML/CSS/browser limitations in terms of achieving the desired effect. --Xover (talk) 06:45, 23 May 2019 (UTC)
OK, thanks very much! --Jan Kameníček (talk) 13:55, 23 May 2019 (UTC)
Hi. Do you think you may have time to look at this issue? If not I can try to ask at Scriptorium. --Jan Kameníček (talk) 10:45, 30 June 2019 (UTC)
@Jan.Kamenicek: Done; see main template doc page. And my apologies: this entirely slipped my mind. Don't ever hesitate to remind me about stuff like this. I can pretty much guarantee I'll appreciate the reminder! --Xover (talk) 13:21, 30 June 2019 (UTC)
Perfect, thank you very much! --Jan Kameníček (talk) 13:36, 30 June 2019 (UTC)

Xover, this may be the wrong place to ask, but I noticed a Scriptorium reference from you in February about making {{hin}} work across page breaks. I have a template {{USStatPension}} which is strewn with subst references in order to allow it to be manually (and horribly) broken across page ends. Would you have any idea how I might go about amending the template to work natively across page breaks? Currently it leaves a trail of destruction with unclosed spans, divs and section tags which take an age to repair! Thanks. CharlesSpencer (talk) 16:56, 12 September 2019 (UTC)

@CharlesSpencer: No, sorry. That template is way too complex for that. The templates that are suitable for breaking across pages are of the form <start-tag attribute="value" style="this: that; the: other-thing;">… content …</end-tag>. Because all the content is between the two tags, and because both the start tag and end tag are atomic (we can guarantee that the entire tag will be on a single page), we can just separate it so the start bit is inserted by one template (/s), and the end bit by another (/e). {{USStatPension}} contains multiple tags and content, and can in principle be broken at any point in there. Incidentally, this is one reason why it's generally not a good solution to put content inside templates. --Xover (talk) 17:41, 12 September 2019 (UTC)
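To illustrate why the single-wrapper form splits cleanly, here is a minimal sketch in Python rather than template syntax. The function name `split_wrapper` and the sample tag and style are hypothetical; the point is only that the start and end tags are atomic, so each can be emitted by its own template (/s and /e) on different pages.

```python
def split_wrapper(tag, style):
    """Return the (start, end) halves of a one-tag wrapper template."""
    start = '<{0} style="{1}">'.format(tag, style)  # what a /s template emits
    end = "</{0}>".format(tag)                      # what a /e template emits
    return start, end


start, end = split_wrapper("span", "display: block;")

# Page one opens the wrapper, page two closes it.
page_one = start + "first half of the content "
page_two = "second half of the content" + end

# Transclusion concatenates the pages, so the tags pair up again.
transcluded = page_one + page_two
```

A template that emits several tags with content between them has no single point at which it can be cut this way, which is the problem with {{USStatPension}}.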
No worries - thanks for looking. You're quite right that it's sub-optimal, but with 1000+ pages of boiler plate Pension grants to proofread which only differ in date, name, amount and related personal data, some kind of automation is the only way... Maybe we mass-subst EVERYTHING once it's all been proofread (probably some time in 2031)!!! CharlesSpencer (talk) 10:24, 13 September 2019 (UTC)

Henry IV, Part 1

When validating Henry IV, Part 1, I have noticed that sometimes different attitudes are applied to placing tooltips in the text. E. g. at Page:Henry IV Part 1 (1917) Yale.djvu/97 there is a note saying "98–99 Cf. n." and at the same time there is also a tooltip on these lines inside the text. However, Page:Henry IV Part 1 (1917) Yale.djvu/77 has a similar note "200–203 Cf. n.", but there is no tooltip in the text. I think this should be unified in some way. --Jan Kameníček (talk) 08:57, 3 June 2019 (UTC)

@Jan.Kamenicek: Suggestions welcome. The ones missing a tooltip are ones where there was some kind of problem with adding one. In this example the line references cross a page boundary. In others there is another note that is attached to a word or phrase inside the range (and tooltips cannot be nested). And in other instances the line range given is just too long for a tooltip spanning all of them to make sense. I've not worried overly about this while proofreading as the tooltips are just an added convenience and not integral to the original text. But, of course, if there is a good general solution to this I certainly wouldn't be opposed to it!
Oh, and since I have the opportunity: thank you so much for all your effort validating these works. That is very much appreciated! --Xover (talk) 09:10, 3 June 2019 (UTC)
I see, OK. --Jan Kameníček (talk) 09:21, 3 June 2019 (UTC)


Much appreciated if you could use your technical skills here, as I'm only using tf/s as a quick method. If there's a more elegant solution, it would be appreciated.

Special:WhatLinksHere/Template:Tf/s being the list of affected pages.

Ultimately it would be nice to have a Module: that replaced the currently very large switch statement... so that more than 9 codes could be used, and so that adding additional short codes did not involve changing the ENTIRE template logic (as opposed to a Dataset page). ShakespeareFan00 (talk) 16:55, 21 June 2019 (UTC)


Well, I think that may be the first of a number of issues that will come crashing down..

I'm still not entirely happy, given that {{cl-act-h}} should really collapse to footnotes or be invisible if the margins get too narrow, such as on mobile, but the techniques I used here might prove useful elsewhere.

I'll leave it for a few days before I convert it back to conventional sidenotes, if you want to examine further. ShakespeareFan00 (talk) 11:01, 29 June 2019 (UTC)

@ShakespeareFan00: Hmm? What am I looking at? --Xover (talk) 14:48, 29 June 2019 (UTC)

See the source code portion of the page which represents Article 26.

To put a DIV into the flow of content (the {{cl-act-h}}) I would per document structuring conventions, end a <P>, place the sidenote and then start another paragraph.

Normally this results in a line break between the two P-based portions. Changing the P's to 'display:inline' caused them (at least in Firefox) to run on after each other, meaning that the two P-based sections collapse, layout-wise, into a continuous flow in the browser. However, the intervening sidenote can now be inserted as a DIV between the two P-based sections of content, WITHOUT a structuring concern arising as far as HTML5 is concerned (and without a Lint error arising).

The xtra_phack (the CSS "class" I am using in the userspace template) wasn't possible previously, because inline CSS (as implemented in MediaWiki) didn't allow selectors or classes in the way that the separate CSS stylesheets used by TemplateStyles permit. :)

What this means is that with the xtra_phack I can in effect build up the entire paragraph with the {{cl-act-h}} sidenotes using pseudo-paragraph P sections, and not have to fight MediaWiki's internal handling to do it. A relatively simple approach, but one that has possibilities, like the p-collapse template you helped me with previously.

Why does {{cl-act-h}} need to be a DIV? So it can have various properties that can only be applied to block-level elements (like a different line-height, or spacing). Ideally the P handling inside the 'sidenote' DIV should revert to being the conventional P style, but I wasn't sure how to author a specific rule for that, given that so far it seems to work as intended? (famous last words?)

Eventually I'd like to be able to replace the custom DIV approach with a proper <sidenote> tag, when Mediawiki or HTML gets an extension to support that.

ShakespeareFan00 (talk) 15:17, 29 June 2019 (UTC)

@ShakespeareFan00: Hmm. I think I understand. But my "Too complicated!" detector light is blinking here. Some formatting only makes sense or can only be applied to an element that is a block in the CSS box model, but pretty much any element can be set to display: block. What would happen to this equation if your sidenote was in a <span style="display: block">? --Xover (talk) 16:33, 29 June 2019 (UTC)
Then I get certain formatting not being applied IIRC. You are welcome to attempt a SPAN version of {{cl-act-h}} though. IIRC it was made as a DIV to get float and clear behaviour. ShakespeareFan00 (talk) 16:35, 29 June 2019 (UTC)
The other issue is that multiple {{Left sidenote}} in close proximity (in Page: namespace) will overlap in layout terms, due to float and clear behaviours. If you can solve that so it works more cleanly by adjusting the CSS that template uses, so much the better. ShakespeareFan00 (talk) 16:42, 29 June 2019 (UTC)
The above being down to what is put in here - seemingly. Getting it working in both Page: and Main namespace is hard.
{{Outside L}} etc. effectively use a DIV as well in Main space, but the span approach of {{left sidenote}} in Page:.
{{cl-act-h}} currently uses (ultimately) {{MarginNote}} (DIV based) with a lot of predefined parameters..

Someone I think needs to consolidate and rethink all the templates.. the various variants on {{cl-act-p}} as well ... ShakespeareFan00 (talk) 17:02, 29 June 2019 (UTC)

Also, by using the pseudo P_hack and cl-act-h, I managed to get the sidenotes to behave nicely even if there was a {{dropinitial}}, something that I had never got working fully previously.

ShakespeareFan00 (talk) 17:02, 29 June 2019 (UTC)

The defined layouts for {{left sidenote}} etc. in mainspace are buried here: MediaWiki:PageNumbers.js ShakespeareFan00 (talk) 17:27, 29 June 2019 (UTC)

Running headers...

Page:Miscellaneous Babylonian Inscriptions.djvu/35

Here a parser function is used in the header. What I was trying to figure out with the template I was trying to create was a more generic template which did the same thing as the current logic, but which could be generically applied, adjusting for the page numbers being on the left and right accordingly, based either on a supplied page number or (more complex) on a number calculated from the Page: name.

Such a template would also allow for the modified Title or Chapter title sub headings on odd or even pages respectively?

ShakespeareFan00 (talk) 15:34, 29 June 2019 (UTC)

@ShakespeareFan00: Funny. I was literally thinking about that this morning. Synchronicity! :)
My conclusion was that I don't really want to try dealing with this logic in Mediawiki template / parserfunction syntax, so it'll have to keep until I feel motivated to figure out how to do it in Lua. But, yeah, having to manually fiddle with the recto/verso differences is driving me slightly batty, so we should definitely try to come up with a replacement for {{rh}} that at least can deal with the most prevalent styles: laterality of the page number, and title vs. chapter. It's a pity we don't have any way to get the specific chapter title automatically so we could fully automate the headers.
I think I'll crib your parserfunctions version and see if that spurs me to do a bit of Lua fiddling. Thanks for the pointer! --Xover (talk) 16:16, 29 June 2019 (UTC)
My version didn't work... Please note that. ShakespeareFan00 (talk) 16:18, 29 June 2019 (UTC)
@ShakespeareFan00: {{rvh}}, for the very simplest of the cases. --Xover (talk) 09:52, 30 June 2019 (UTC)
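The recto/verso logic discussed above can be sketched as follows. This is written in Python purely for illustration (an on-wiki version would more likely be a Lua module) and is not the code behind {{rvh}}. The function name `running_header` is hypothetical, and the layout is an assumption: even (verso) pages carry the page number on the left with the work's title in the centre, odd (recto) pages carry the chapter title in the centre with the page number on the right.

```python
def running_header(page_number, title, chapter):
    """Build a three-cell {{rh}} call for the given printed page number.

    Assumed convention (differs between works): verso pages are even,
    recto pages are odd.
    """
    if page_number % 2 == 0:
        # Verso: page number on the left, work title in the centre.
        return "{{rh|%d|%s|}}" % (page_number, title)
    # Recto: chapter title in the centre, page number on the right.
    return "{{rh||%s|%d}}" % (chapter, page_number)
```

The chapter title still has to be supplied by hand, since, as noted above, there is no way to get it automatically.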

Table classes....

Had some frustrations earlier but {{table class/import}} and {{table class}} got written.

I used it here to reduce the number of calls "a little".

It would be nice to eventually only need one {{tc}} line for the style of table desired.. but my CSS tweaking skills aren't that advanced yet..

The rather laughable stylesheet is here : {{Table_class/tableclasses.css}} if you want to tweak further or document...

ShakespeareFan00 (talk) 16:43, 3 July 2019 (UTC)


I'll do the Wikilivres migrations for WS:CV at some point when I have a chance —Beleg Tâl (talk) 22:17, 5 July 2019 (UTC)

@Beleg Tâl: Thank you. Much appreciated! --Xover (talk) 07:57, 6 July 2019 (UTC)
@Beleg Tâl: In the interest of not dumping too much on individual contributors here, is it possible that Wikilivres would be interested / able to have a workflow for importing stuff that would otherwise create a backlog here? Ideally they'd be willing to monitor a queue here and tag works as transwikied once done (so we can delete them), or possibly have a noticeboard there where we could post such and monitor to see when the transwiki is done? I know next to nothing about Wikilivres so this may be a dumb question, but… --Xover (talk) 17:48, 22 July 2019 (UTC)
Wikilivres is extremely inactive compared to Wikisource, and all the editors there who would care about this are already editors at Wikisource, so I don't think that would have any significant benefit over our existing workflows. —Beleg Tâl (talk) 18:38, 22 July 2019 (UTC)
@Beleg Tâl: Ah, I should have guessed that was the case. Thanks. --Xover (talk) 18:54, 22 July 2019 (UTC)

@Beleg Tâl: Do we have a list of editors who also work on Wikilivres? And are those editors generally admins here? Would it be at all practical to immediately delete relevant works here, but to list them on a "To be moved to Wikilivres" backlog page from where they can be temporarily undeleted for move?

My question is motivated by two main concerns: 1) once we have determined that something violates copyright, we have a legal liability from knowingly hosting copyright violations (just having links to them in soft redirects is iffy enough in terms of being accused of contributory copyright infringement); and 2) to clear out the backlogs on the main noticeboards (WS:CV+WS:PD and maintenance categories). I'm looking for ways we can either streamline the pipeline that leads to closing out the issue here (without putting more pressure on our volunteers!), or to decouple the two tasks such that processing of CV/PD here does not have a dependency on transfer to Wikilivres (i.e. we can close out one without having to wait on the other). --Xover (talk) 05:54, 31 July 2019 (UTC)

Here is a list of admins on Wikilivres. Perhaps if you want to clear out this backlog a bit faster, you might consider asking for adminship at Wikilivres yourself? I personally do not think that the backlog here is a big deal; we have backlogs that stretch back ten years, and at least in this case someone will definitely get to it much sooner than that. In the meantime {{copyvio}} hides the content of the page, which should be good enough to counter accusations of knowingly hosting copyrighted material. —Beleg Tâl (talk) 13:00, 31 July 2019 (UTC)
Perhaps you could propose to the general community, your idea to have a separate move-to-wikilivres backlog? My concern with that idea, is that it will take far longer for the works to be moved, than when they are currently still active discussions at WS:CV. —Beleg Tâl (talk) 13:03, 31 July 2019 (UTC)
@Beleg Tâl: Any change would certainly need to be passed by the community, yes. I'm just sort of thinking out loud trying to find something that would be workable and a net positive overall, and preferably without significant downsides for any stakeholder.
I think it is important that we get the backlogs down to a manageable size so they do not overwhelm and confuse. There are issues which desperately need wide community input that are drowning in matters that are essentially resolved but just not implemented yet. So the idea regarding the Wikilivres stuff is to get them out of the backlog that requires discussion and into a list of things that just need to get done. I would imagine that would be easier for those involved with Wikilivres too: here's a list of stuff to import, without having to pick it out from a long copyright discussion. And at the same time they won't clutter up the main PD/CV backlogs.
The main downside that I can see is that it would require those Wikilivres importers that are not also admins on Wikisource to remember to tag the works for speedy once the import is done. Currently works are not removed from PD/CV until after they are deleted, so there is no chance they get forgotten about, but with the alternate setup you would have some potential for that happening.
PS. Just to be clear, this isn't some sort of subtle passive-aggressive way to nag you about the existing cases that need moving. I'm just trying to figure out a good alternative for the long term. When I start nagging someone about something it's usually blindingly obvious I'm nagging! :-) --Xover (talk) 17:31, 10 August 2019 (UTC)
Well your not-nagging worked because the four works in question have been migrated :) Now I just need to clean them up on Wikilivres. If you make this proposal for the change to a separate backlog, I'll probably stay neutral, since even though I think a separate backlog will simply pile up, I still think it's a good idea otherwise. —Beleg Tâl (talk) 16:22, 11 August 2019 (UTC)
@Beleg Tâl: Heh heh. Maybe I should start using subtle passive-aggressive nagging then. :-)
If your take on the idea is no worse than neutral, I may indeed propose it to the community. I figure that at worst, if it doesn't work well (e.g. if it leads to an ever increasing backlog, or turns out to be inefficient for other reasons), we can just go back to the old way of doing it. I'll try to think it over a bit, and wait until a few other proposals have run their course first, but at least for now I think it might be worthwhile to try. Thanks for providing input: very much appreciated! --Xover (talk) 17:38, 11 August 2019 (UTC)

United States v. $29,410.00.pdf

Hi. About [4] - is something wrong? I validated all of the pages, though I didn't realize the source wasn't listed. I'm about to board a plane so may not be able to respond. Thanks, --DannyS712 (talk) 10:51, 6 July 2019 (UTC)

@DannyS712: No big issue. But it looks like your script inserted the string "_empty_" in the source field, which I don't think is correct. --Xover (talk) 10:55, 6 July 2019 (UTC)
The script is only for changing the status of individual pages - I do the index page updates manually. I distinctly did not add an empty tag - maybe a bug? The only thing I changed was the status. --DannyS712 (talk) 10:58, 6 July 2019 (UTC)
@DannyS712: Hmm. Weird. Must be a MW bug or something. Thanks for clarifying, I'll keep an eye out for this behaviour to see if it's something we need to do something about. --Xover (talk) 11:07, 6 July 2019 (UTC)

Thank you!

Thank you so much for your redaction on c:File:The Philadelphia Negro A Social Study.djvu. I would have done it myself if I had known how! Do you have any resources for doing that sort of redaction on a DjVu file? Mathmitch7 (talk) 19:59, 15 July 2019 (UTC)

@Mathmitch7: I use the DjVu Libre tools, Tesseract, and GraphicsMagick with custom Perl scripts and some manual tweaking (i.e. the redacted page graphic was created with Pixelmator Pro). All this is arcane and hacky, and I wouldn't recommend it to anyone unless they're comfortable fiddling around at the bits and bytes level and want to invest a significant amount of time into it. If you do want to experiment with it I'd be happy to share my scripts and provide any advice I can, but this is definitely not end-user stuff.
I've toyed with the idea of making a web tool for end users that hides all the fiddly bits and makes available functions designed specifically for the tasks commonly needed on WS, but quite apart from it being a huge time investment, I'm not at all sure these tools can be cobbled together in a way that is robust enough that it'll be worth the effort. --Xover (talk) 07:23, 16 July 2019 (UTC)

Template:Subsec row/testcases

I'm giving up on this because I CANNOT see where the table is leaking something, or generating something unexpected. Perhaps you will have better luck? ShakespeareFan00 (talk) 23:45, 15 July 2019 (UTC)

@ShakespeareFan00: Where are you seeing the problems, and what are the problems you're seeing? If you just meant the lint errors on the testcases page, that wasn't the template code but the wikimarkup on the testcases page itself: it tried to wrap an HTML <table> inside a <code> element, which is not valid (<code> can only contain inline elements and <table> is a block-level element). --Xover (talk) 07:39, 16 July 2019 (UTC)
Thanks: BTW is there a block level element for source code? ShakespeareFan00 (talk) 08:42, 16 July 2019 (UTC)
@ShakespeareFan00: Nope, sorry. --Xover (talk) 10:35, 16 July 2019 (UTC)
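
For anyone wanting to spot this kind of nesting problem before saving, here is a minimal sketch in Python using the standard library's html.parser. It is only an illustration, not how MediaWiki's linter works: the two tag sets are small hand-picked subsets, and the names NestingChecker and find_bad_nesting are made up for this example.

```python
from html.parser import HTMLParser

# Illustrative subsets only (assumption): real HTML content models are larger.
INLINE_ONLY = {"code", "span", "em", "strong"}
BLOCK_LEVEL = {"table", "div", "p", "ul", "ol"}


class NestingChecker(HTMLParser):
    """Records (inline_parent, block_child) pairs found in a fragment."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open tags, outermost first
        self.problems = []   # (inline parent, block-level child) pairs

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_LEVEL:
            # Report the outermost inline-only ancestor, if there is one.
            for parent in self.stack:
                if parent in INLINE_ONLY:
                    self.problems.append((parent, tag))
                    break
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching open tag.
            while self.stack and self.stack.pop() != tag:
                pass


def find_bad_nesting(html):
    """Return a list of (inline_parent, block_child) pairs in the fragment."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.problems
```

Calling find_bad_nesting on a fragment that wraps a table in a code element reports the pair ("code", "table"), which is the situation described above. Void elements such as br are not treated specially here, so this is a rough check rather than a validator.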

Template:Np2 and Template:Numbered para 2

In the documentation for this it claims that you can use it for compound or nested situations. However that's not (strictly) possible, given that what is generated in the compound example is:-

<div style="margin-left:3em; "><p><span style="float:left; text-align:; margin-left:-3em; ">123<span style="display:inline-block; width:;">&#8288;</span></span>
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

Mediawiki then attempts to tidy this up, as

<div class="mw-parser-output"><div style="margin-left:3em;"><p><span style="float:left; text-align:; margin-left:-3em;">123<span style="display:inline-block; width:;">&#8288;</span></span>
</p><p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
</p><p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
<p class="mw-empty-elt"></p></div></div>

which I do not think was what was intended. I'm not sure how this could be easily repaired without breaking a number of pages.

In the compound example, what's being attempted, I think, is to put a P inside a P, which you can't do in HTML5. ShakespeareFan00 (talk) 11:50, 16 July 2019 (UTC)

Aside: The use of {{tl2}} in the documentation to generate the view of the example markup is also flawed: that template uses a SPAN, but in the documentation here it's being supplied with what is a block level element (given the paragraph break generated by Mediawiki).

I checked and Module:Message box is indeed expecting a SPAN element for the text of a message box. Again, not sure how this could be repaired easily. ShakespeareFan00 (talk) 12:04, 16 July 2019 (UTC)
@Beleg Tâl: You wouldn't happen to recall why you added a <p></p> wrapper there? I don't see what function it serves, given MW will add such as needed anyway. Perhaps it could simply be removed? --Xover (talk) 14:29, 16 July 2019 (UTC)
If MediaWiki will add one, then the manual one can be removed. The contents of the main text block do need to be in a <p> block because the whole point of the template is that its contents will behave the same as other paragraphs with the exception of the number in the margin. —Beleg Tâl (talk) 14:33, 16 July 2019 (UTC)
@Beleg Tâl: Ah, right, it has to be affected by whatever default styling is applied (now or in future) to other <p></p> elements or it'll look different. @ShakespeareFan00: If you set up a sandbox and some test cases you should be able to tell fairly quickly whether that <p></p> is needed: all the examples used in the documentation should end up in the page with a <p></p> wrapper added by Mediawiki even after it is removed from the code generated by the template directly. --Xover (talk) 14:41, 16 July 2019 (UTC)

Remaining Lint flagged templates

(The Stripped Chart cell templates are relatively easy to resolve, and effectively deprecated as nothing seems to be using them on Wikisource currently, other than the Chart documentation directly.)

It would be nice if an experienced contributor could examine the remaining examples, with a view to providing repairs to make them not generate lint warnings.

ShakespeareFan00 (talk) 11:50, 16 July 2019 (UTC)


I get really really annoyed when I have to make reverts like :

because the parser in Mediawiki decides it's cleverer than me and decides to mis-"tidy" the content generated when the updated version was transcluded. Perhaps you will have better luck in silencing the Lint warnings this generates, without the template failing to work correctly? Perhaps you can also hold down some of the back-end developers to actually fix the issues surrounding the illogical and inconsistent way in which <nowiki>\n</nowiki> in wiki-markup with and without templates is processed, because it's COMPLETELY AND TOTALLY UNREASONABLE to expect a contributor to have to recall every single illogical rule for how it is handled when trying to diagnose and resolve concerns with certain widely used templates. ONE CONSISTENT and DOCUMENTED whitespace handling rule seems to be beyond Mediawiki at times. ShakespeareFan00 (talk) 14:34, 17 July 2019 (UTC)

Strike out... I must have been really upset when I wrote that. ShakespeareFan00 (talk) 16:55, 17 July 2019 (UTC)

War and Peace

The copyrighted material has been properly removed from the file. Could you please restore the improperly removed revisions of non-copyrighted material? TE(æ)A,ea. (talk) 22:30, 19 July 2019 (UTC).

@TE(æ)A,ea.: I presume you're talking about this discussion and File:War and Peace.djvu?
Based on the information available to me (i.e. what was presented in that thread), nothing has been improperly removed and your restoration of the material on that file is restoring copyrighted material against community consensus. If you feel a mistake has been made in the redaction you need to be more specific about what it is, and if a mistake has been made I'll obviously be happy to correct it.
But if you simply disagree with the outcome of the discussion in total then you need to bring the matter up for a new community discussion (you can't just unilaterally overturn the previous discussion, even if you feel it was incorrect). The discussion was poorly attended so it's always possible it does not, or does not fully, reflect the community's consensus; or that information relevant to the copyright calculus was not brought forward. If you feel that is the case then a new community discussion is the way to determine it. --Xover (talk) 08:05, 20 July 2019 (UTC)
  • I did participate in the discussion, and, after having personally determined that the copyright claim was legitimate, I removed the copyrighted material from the work. Some time after this, you deleted all revisions and removed non-copyrighted material. The copyrighted material mentioned in the discussion has already been removed from the file; there is no need to further butcher the file. TE(æ)A,ea. (talk) 12:03, 20 July 2019 (UTC).
    @TE(æ)A,ea.: Ok, since I don’t think the thread is archived yet, the easiest way to resolve this is probably to reopen it (remove the {{closed}} template that wraps the discussion, and the {{section resolved}} template below it). Please then also explain why you disagree with Shakespearefan’s assessment there. I am currently on mobile so I won’t be able to participate until I’m back on my computer (probably tonight or tomorrow morning). —Xover (talk) 13:27, 20 July 2019 (UTC)
  • Note that I have now reopened the discussion at WS:CV (mostly to prevent archiving). --Xover (talk) 18:02, 20 July 2019 (UTC)
    • There is no reason to resume discussion; my action removed the copyrighted material in accordance with the determination of the discussion, after which I began to proofread the file. It was later that you changed the file, after I had done so. What I mean to say is that my action removed the basis of the discussion (the copyrighted material of the file), and there is therefore no reason to resume discussion. TE(æ)A,ea. (talk) 21:02, 20 July 2019 (UTC).
      • @TE(æ)A,ea.: Once a copyright concern has been raised on WS:CV no single editor gets to unilaterally decide what is or is not acceptable for hosting here: the community discussion needs to reach a conclusion. At the time I closed the discussion I believed that was the case, but pinged both you and ShakespeareFan00 to be certain. Only they responded, and they listed everything except pp. 11–706 as needing to be redacted, so I took that to be the consensus. Based on your objection now I see that there is not consensus on this point after all. Which means you need to explain in that discussion what parts of that page range you disagree with ShakespeareFan00 on, and what your reasoning is. If the consensus there is that your position is correct then so much the better; but the discussion does need to happen. --Xover (talk) 21:36, 20 July 2019 (UTC)

@TE(æ)A,ea.: There, that should be back to where you wanted the work. Please let me know if I missed anything! Apologies for the detour, and the bureaucratic process for correcting it. I must certainly take the blame for misreading the situation there, but it is also good advice for the future to not assume any such community discussion is settled until it has been formally closed and archived. --Xover (talk) 05:31, 25 July 2019 (UTC)

Do you like barnstars?

Thanks for grabbing File:The Poems of Henry Kendall (1920).djvu for me. I definitely expected that request to remain unanswered indefinitely. —Beleg Tâl (talk) 13:25, 20 July 2019 (UTC)

@Beleg Tâl: No problem. I have some basic tooling for that kind of thing set up now, so please don’t hesitate to ask if you need any other works. I can’t access the Hathi PDFs directly, but I can usually grab the individual images and generate a DjVu from those. PS. and, yes, I do like barnstars! :) —Xover (talk) 13:32, 20 July 2019 (UTC)


Could you please put the deleted page in my Wikisource /Sandbox? Yosakrai (talk) 18:51, 20 July 2019 (UTC)

@Yosakrai: For what purpose do you want it? --Xover (talk) 20:08, 20 July 2019 (UTC)
@Yosakrai: The work you refer to is copyrighted, and therefore cannot be hosted on Wikisource, not even in a user sandbox. —Beleg Tâl (talk) 17:35, 21 July 2019 (UTC)

@Xover @Beleg Tâl: I need only the information from the Reference. Could you post only the reference information? Yosakrai (talk) 11:07, 23 August 2019 (UTC)

@Xover @Beleg Tâl: I believe @Xover's opinion about Rangsitpol's works is correct and neutral. Please check the information from this link Yosakrai (talk) 11:07, 23 August 2019 (UTC)

@Xover: I support your opinion that all of Rangsitpol's works should be kept.

I asked for the reference for the one that you deleted so that I can use the information. I am interested in his political work, especially his Education Reform. Yosakrai (talk) 11:07, 23 August 2019 (UTC)

@Yosakrai: The source was: Bunnag, Sirikul (May 17, 1996). "Policy on English to be Test of Schools". Bangkok Post. You will have to contact the Bangkok Post to obtain a copy of the article's contents though. --Xover (talk) 11:25, 23 August 2019 (UTC)

Vol. 31 of Poet Lore

Thank you very much for all your effort. The copy you have uploaded is much better than the original PDF in several ways. Besides the added missing pages, the OCR layer shows much better, and what is more, the OCR button works here as well.

I have briefly compared the following OCRs: OCR layer in the PDF version, OCR layer in the DJVU version, text achieved by the OCR button in the DJVU version, text achieved from the Google gadget in the DJVU version and text from the Google gadget in the PDF version.

By far the worst was the original layer in the PDF. Google gadget texts were much better, although sometimes a part of a line appears at a completely wrong place. It is interesting, that the Google gadget produces results of similar quality with both PDF and DJVU, but not identical, although the text is identical. This gadget seems best in recognizing characters in foreign names, such as "Gülich". Otherwise it was worse than the OCR layer which you uploaded with the DJVU file. I think that its quality is comparable to the text produced by the OCR button with one exception: OCR button is able to recognize ends of paragraphs, which would make it the winner of my comparison, if it were more reliable :-( --Jan Kameníček (talk) 17:57, 24 July 2019 (UTC)

@Jan.Kamenicek: Thank you, that is extremely useful! It also roughly corresponds with my impression of relative quality. I'll see if I can do something about the missing paragraph breaks: all page features (characters, words, lines, paragraphs, and columns) are recognised and coded in the OCR process, so it is in the extraction into plain text that they go missing. I may be able to hack up something to work around that. The diacritics are possibly a harder nut to crack as they are in the control of the OCR engine. However I know it detects the diacritics (not sure about accuracy), it's just not applying them to the output. I'll dig a little to see if there is some way to tweak it to behave that way. --Xover (talk) 18:10, 24 July 2019 (UTC)
You are welcome :-) BTW, the OCR button, which worked at this work before, stopped working here as well. Really, strange... --Jan Kameníček (talk) 13:55, 26 July 2019 (UTC)
@Jan.Kamenicek: Hmm. A thought occurs to me… Since I have both lines and paragraphs marked up (let's leave aside accuracy for now), perhaps it would make sense to by default remove line breaks within a paragraph? I haven't really thought this through, but I know for most works and most pages I end up removing these anyway. Perhaps this would actually be better, and then it's the exceptions where we'd have to re-add line breaks. Possibly it'd depend on how accurately I can detect what constitutes a "paragraph"? Anyways, just thinking out loud and thought you might be interested. It's probably completely unworkable and has flaws I haven't thought of (like most of my ideas ;D). --Xover (talk) 15:24, 2 August 2019 (UTC)
Hm, that is a really interesting idea... Hope it will not lead to less thorough proofreading by contributors because removing the line breaks forces everybody to go through the text :-) --Jan Kameníček (talk) 18:13, 2 August 2019 (UTC)
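
The idea floated above, removing line breaks inside a paragraph by default, can be sketched in a few lines of Python. This is only an illustration under a strong assumption: that paragraphs in the extracted text are already separated by blank lines. The function name unwrap_paragraphs is hypothetical, and it makes no attempt to rejoin words hyphenated across a line end.

```python
import re


def unwrap_paragraphs(text: str) -> str:
    """Join the lines of each paragraph with single spaces.

    Assumption: paragraphs are separated by one or more blank lines.
    End-of-line hyphenation is deliberately left untouched.
    """
    paragraphs = re.split(r"\n\s*\n", text.strip())
    joined = []
    for para in paragraphs:
        # Drop empty fragments and surrounding whitespace on each line.
        lines = [line.strip() for line in para.splitlines() if line.strip()]
        joined.append(" ".join(lines))
    return "\n\n".join(joined)
```

How useful this is depends entirely on how accurately the paragraph boundaries were detected in the first place; works such as poetry, where the line breaks are meaningful, would be the exceptions that need them put back.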

Works Wikisource shouldn't host...

You recently expressed a concern about certain works apparently hosted under {{IEEPA}}, I left a comment which I will again state here:- In my view, Wikisource should pro-actively delete content which has originated with 'terrorists' or 'extremists' as applicable US law defines or proscribes. This should be done regardless of any copyright considerations. With that in mind IEEPA works should be subject to immediate deletion.

I would also suggest that Wikisource administrators also consider removing certain historical works, which have a highly sensitive relationship with objectionable or extreme views, Especially when that association or relationship is clearly identified in the historical context. ShakespeareFan00 (talk) 23:35, 24 July 2019 (UTC)

@ShakespeareFan00: I'm generally sympathetic to that view, but I think when you get into "censorship" territory—especially of historical works—you're goose-stepping around on eggshells down a slippery slope while wearing hobnailed boots. Pretty soon you start to have to decide whether to expunge the pervasive racism and misogyny of century-old purely literary works; or, to take the prime example, make Shakespeare fit for mixed company. We do have to draw a line somewhere, but I am much more comfortable if that is where something causes harm to the project (typically through creating a legal liability). --Xover (talk) 04:54, 25 July 2019 (UTC)
Thank you for the quick response on this, your concerns about where to apply a 'red-line' are reasonable. ShakespeareFan00 (talk) 06:47, 25 July 2019 (UTC)

Bowdlerized works

On the subject of Bowdlerized works, presumably if out of copyright, they could be transcribed on English Wikisource? ShakespeareFan00 (talk) 06:47, 25 July 2019 (UTC)

@ShakespeareFan00: I see no reason why not. The originals are looong out of copyright, and they are historically important works. We may even be able to find later critical editions of these (how meta!). --Xover (talk) 06:52, 25 July 2019 (UTC)


This is what supports {{cl-act-p}} through various templates, to do Commonwealth legislation (mostly Canadian and British so far).

At some point the most recent author had attempted to update it, which as far as I saw it had some issues (and why it was in effect deprecated from a LOT of my efforts.). However I am not prepared to see it completely abandoned.

Much as it may seem like hammering away at something that may be flawed, it would be appreciated if you could take a look through the various iterations of this, to see if it can't finally be rescued and actually made stable.

One consideration I was also going to ask for was for the Drop initial functionality inside the cl-act- formatting (and DIV), with an option so that the side-headings behaviour no longer needs the {{float left/s}} hack currently used. The other missing function is the ability to add "side quoting" (as used in Ruffhead). This is not essential, but would make the template family VERY powerful. ShakespeareFan00 (talk) 11:21, 16 August 2019 (UTC)

Side note - It will need a LOT of testcases, so don't worry if you don't have the time. ShakespeareFan00 (talk) 11:22, 16 August 2019 (UTC)
I reinstated the changes they'd made and the glitch came back - see here; somewhere a tag is not getting closed. It would be much appreciated if you are able to sanity check the relevant module to determine where. This template is very close to being viable, but needs someone with more expertise than me to give it the final push. ShakespeareFan00 (talk) 09:41, 17 August 2019 (UTC)
@ShakespeareFan00: I'll take a look when I can, but it's big and complex, and I don't really understand the context it's used in, so I don't want to get your hopes up that I'll be able to fix anything. --Xover (talk) 09:48, 17 August 2019 (UTC)
It's used for formatting legislation; that's why I'd linked the test cases. In terms of actual usage see: because it was not working I'd removed it from a lot of pages, as I couldn't figure out why it was unstable. The three linked Page:s are the sole location it's used at present, and could be removed. ShakespeareFan00 (talk) 09:53, 17 August 2019 (UTC)
See also test case 7a Page:Test_page#Testing_/s_/c_/e_version_(7a)_with_layout_specified ShakespeareFan00 (talk) 10:22, 17 August 2019 (UTC)
BEFORE this is put back into general use, all the test cases should (ideally; the drop initial in 5e is a known limitation) render without concern. As others have said, this template family may be 'too big to recover'. ShakespeareFan00 (talk) 10:22, 17 August 2019 (UTC)
A partial explanation of the fault is that it's misreading or mis-expanding the unnamed parameter. ShakespeareFan00 (talk) 12:43, 17 August 2019 (UTC)
On second thoughts don't bother, certain other contributors have made their views about template families like this one abundantly clear. ShakespeareFan00 (talk) 15:17, 17 August 2019 (UTC)
If you are still interested, good luck. ShakespeareFan00 (talk) 10:48, 23 August 2019 (UTC)

Module:Short title

The concern here is that the Module, although reasonably well thought out, is lacking the ability to disambiguate between different jurisdictions, or (currently) the ability to generate footnotes instead of sidenotes. The handling of the generation of Regnal style identifiers alongside the short-titles isn't as clear as it could be.

In addition, this template can only currently handle short titles for Commonwealth style legislation, and a list of jurisdictions would be needed somewhere as well. I'm also wondering if it's actually over complicating things in most situations.

The higher level templates also have the same issue (with respect to jurisdiction disambiguation):-

There is also currently no support for additional indexing (such as the No. X of Y. coding used in Indian Legislation, for example). A review of this code should be reasonably straightforward, compared to cl-act. However, updating the template is only part of the issue: there would also need to be a discussion about what disambiguation conventions the Module (and the higher level templates) needs to take into account. Maybe just doing everything as standard links (maybe with subst templates) would in fact be quicker moving forward? ShakespeareFan00 (talk) 11:04, 23 August 2019 (UTC)

Related is the issue of the template {{Statute_table/collective/entry}}, which for what it does is perhaps an over-complicated solution. ShakespeareFan00 (talk) 11:04, 23 August 2019 (UTC)
Thanks in advance for any reviewing or code improvement you can do on these. ShakespeareFan00 (talk) 11:04, 23 August 2019 (UTC)

Copyright for Pansies

Hi Xover

The UK copyright law was only extended from the author’s life plus 50 years to the author's life plus 70 years in the Copyright, Designs and Patents Act 1988; therefore, Pansies would have entered the public domain in 1980, 50 years after D. H. Lawrence’s death in 1930, and so is in the public domain in the US.

Lord Scantaethon (talk) 11:00, 27 August 2019 (UTC)

I have copied this message to Lord Scantaethon's talk page and replied there. --Xover (talk) 12:08, 27 August 2019 (UTC)

Nixon's phone call to the Moon

Hi Xover, I see that yours is the last edit to this page

In the History you talk about the transcripts you've found. In the official transcripts there isn't a second answer by Armstrong. I do not hear it in the audio either; there is a kind of "bwop" or microphone noise before Aldrin's answer. But I am not expert in editing a Wikisource page and changing the "Source" section, so I hope some other expert user could listen to the audio and make a revision. I wrote a comment in the "discussion" section, with all the necessary links:

P.S. Thank you for editing.

OrmenVilla (talk) 07:36, 1 September 2019 (UTC)

Wikisource:News (en): September 2019 Edition

Quick closer

What is this intriguing "quick closer 1.0.0" that I see so nonchalantly tagged in your edits to WS:PD? —Beleg Tâl (talk) 14:06, 8 September 2019 (UTC)

@Beleg Tâl: It's an adaptation of w:User:DannyS712/DiscussionCloser for WS:CV and :PD. See this and this. --Xover (talk) 15:21, 8 September 2019 (UTC)

Community Insights Survey

RMaung (WMF) 14:34, 9 September 2019 (UTC)

@RMaung (WMF): Your account is not attached on enws, so to anyone here it will show up as an unregistered account and your global user page will not show up. In other words, it looks like a phishing attempt. This is likely to also be the case for every project except the 88 listed here. --Xover (talk) 16:14, 9 September 2019 (UTC)

xwiki LTA

We could probably abusefilter the word "Glapz" though that will just delay, rather than prevent. Suggest that if the target is not an active local editor, and more a target for xwiki abuse, that we just soft protect the user talk page. Locally blocking the LTA on site is okay, and if you want to report to m:SRG, then feel welcome. If we are getting user: ns abuse, then I have a filter at meta that we can use here to prevent that sort of abuse, and we can tune it however we like with regards to access/use levels. — billinghurst sDrewth 06:37, 10 September 2019 (UTC)

@Billinghurst: Thanks. This one was pretty low-intensity (only one edit, self-reverted), so a surgical approach seems apt. I mostly just wasn't sure what was going on there: it smelled of LTA activity, but the edits themselves were just suspicious rather than obviously abusive. Since you had blocked the original account I figured you might have the context and be better able to judge the situation. --Xover (talk) 06:52, 10 September 2019 (UTC)
He is just an annoying arsehole who targets some general admins from other wikis, and gads about with versions of names, and stuff. I have some filters for him at meta, and generally just light protect pages. Occasionally they will do a sleeper account; however, it is pretty easy to write some simple filters to eliminate the worst of it. — billinghurst sDrewth 06:57, 10 September 2019 (UTC)
@Billinghurst: Hmm. Could we use abuse filters to flag edits for closer scrutiny and include a link to more info so admins not familiar with the case can better judge the appropriate action? Blacklisting or auto-blocking seems excessive (for this particular case) and risks catching legitimate uses, but a filter that tags possible instances of an LTA with a recommendation for admin action or a link to something like an enwp LTA page, would enable other admins to apply human judgement based on a fuller picture than we often see in a single edit here. I'm familiar with a few of the egregious LTAs on enwp, but don't follow it closely there or watch what happens at meta at all, so I often feel I have blinders on when they pop up here. --Xover (talk) 07:12, 10 September 2019 (UTC)


Any chance of doing some page-listing, for the remaining entries in this category? I can't necessarily because these works are still under copyright in the United Kingdom... ShakespeareFan00 (talk) 08:53, 12 September 2019 (UTC)

@ShakespeareFan00: It's already on the list, as one of the many backlogs on the project, but I'll take a look and see if we can't get this one down to zero. No promises though, since I see there are a few real doorstoppers in there.
Incidentally, I don't think you need worry overmuch about copyright here. For one thing several of them were CC0-licensed, and for anything that is within the project's copyright policy, the WMF will stand between any theoretical over-eager copyright plaintiff and its volunteers. And so long as you are not actually uploading works that may not be PD yet in the UK, the highly theoretical infraction of making a pagelist for it falls squarely within the limits of aquila non capit muscas. --Xover (talk) 17:52, 12 September 2019 (UTC)

Overfloat left x right

Hello. I have just found out that the seemingly mirror templates {{overfloat left}} and {{overfloat right}} are not the same as one would expect, with Overfloat right missing some parameters (width, padding). Do you think they might be unified? --Jan Kameníček (talk) 14:28, 12 September 2019 (UTC)

@Jan.Kamenicek: I don't really understand these templates so I'm not certain; but I've created a sandbox version at {{overfloat right/sandbox}} that you can try to see if it does the trick.
BTW, I very strongly recommend that you do not try to replicate left—right alternating side notes. In addition to being a complex nightmare to get working at all, I would say the result is actively wrong when transcluded. In mainspace the work is no longer in a paged medium, and the right margin, in particular, is no longer simply one page width away. And there is no logical reason why the notes should switch sides in random places in a continuously scrolling page. If the text requires side notes, at least stick with just left side notes. --Xover (talk) 18:10, 12 September 2019 (UTC)
I see. Yes, that sounds reasonable, so I will keep all the notes on the left. As a result, after your explanation I will not need the overfloat right template reworked, although it might be still useful for somebody else in the future. So I tried it in my sandbox and it seems that the parameter depth in combination with align:left influences the width of the sidenote instead of the depth. However, it is up to you if you want to bother with it, as I am not going to use it now. Thank you very much! --Jan Kameníček (talk) 20:05, 12 September 2019 (UTC)

Milton's Shakespeare

You've seen the news about the discovery of John Milton's hand-annotated copy of Shakespeare's First Folio? --EncycloPetey (talk) 17:37, 16 September 2019 (UTC)

@EncycloPetey: Whaat? No. /me heads off to Google… --Xover (talk) 17:42, 16 September 2019 (UTC)

Re: OCR fix

Hi. I saw your comment on Phabricator and just noticed that our OCR bug is the 3rd item on the Next Up list. Do you understand what was found to be the problem? I am just trying to understand the intermittent nature of a bug. — Ineuw (talk) 11:02, 17 September 2019 (UTC)

@Ineuw: So far as I know the problem has not been pinpointed yet. That it's on the "workboard" in Phabricator doesn't really mean anything more than that the issue exists on a list somewhere: I wouldn't make any assumptions on when or if anybody will get around to dealing with it.
The intermittent nature of the bug is partly why this is very hard to debug without direct access to the infrastructure (you need to be able to check the state of the system it's running on, add debugging code to the script itself, etc.). If it had been consistent in when it failed it would be much more likely that one could reason one's way to a conclusion based on externally observable phenomena.
However, I have drawn some conclusions…
The OCR system consists of a JavaScript-based Gadget here that provides the OCR button in the toolbar, and a server-side (CGI) program running on one of the WMF's servers. When you click the OCR button, the script running in your web browser contacts the server-side program, telling it what page of what file it cares about. The server-side program extracts that specific page from the file, saves it to a temporary image file (in TIFF format), runs Tesseract on it, and then returns the resulting text to the JavaScript. The JavaScript then inserts that into the textarea on the page.
What's actually visible from outside when this fails is an error message returned from the server-side program to the JavaScript. Since the JavaScript doesn't have any logic to deal with that error message specifically, it just dies in the middle (vs. showing you the error message, cleaning up after itself, etc.). The error message that is returned can be traced to a specific line of code in the server-side program, and the text of the error message, when viewed in combination with that line of code, suggests that the server-side program is failing when it is trying to open the temporary file containing the image. That this happens intermittently and not on all files suggests that this is not a broken installation of Tesseract (e.g. if they had updated their Tesseract installation recently and the new one was broken), and it is not due to a change in the server-side component of the OCR button (if that was the problem it would most likely fail always).
The likely causes then become problems with something internal to Mediawiki that affects the ability to reliably extract the page image from the file, or something wonky with the server hosting the program such as a network filesystem (think "file sharing" but for servers) that is behaving unreliably, or that runs out of space periodically, and so forth. For example, if you've noticed that some images fail to load on some pages occasionally lately; the server side OCR tool would be subject to similar failures when trying to access page images.
It is possible that there is something about the specific works it fails on that causes it, but it really doesn't look like that is the case. For example, I wondered if it could be related to the language of the work and the presence of diacritics from, say, Spanish; but that does not match my observations of which works it fails or works fine on.
In any case… I don't think there is much more that can be done from the outside. Further debugging needs to be done by either Phe or the WMF sysadmins (WMF software developers aren't familiar with Phe's code, and don't normally have access to it), and there is very little we can do to influence their priorities or allocation of resources. --Xover (talk) 12:50, 17 September 2019 (UTC)
Many thanks for the excellent explanation. I doubt that Phe will return anytime soon. Since there is nothing that can be done at the moment, I will pursue my theory and see if there is a correlation. Based on what you said about server issues, if I can find the correlation, perhaps that date may provide a clue for Phabricator. — Ineuw (talk) 13:25, 17 September 2019 (UTC)
My theory is a bust. OCR failed in Indexes created in 2011. — Ineuw (talk) 14:15, 17 September 2019 (UTC)

@Xover: Hi. I am revisiting this issue and wondering if you read the newer posts about the OCR issue on Maniphest? — Ineuw (talk) 01:37, 1 October 2019 (UTC)

@Ineuw: I have. What’d you have in mind? --Xover (talk) 07:22, 1 October 2019 (UTC)
I noticed that other WS sites have the same issue, and wondered if it will speed things up in tracking down the problem. — Ineuw (talk) 08:39, 1 October 2019 (UTC)
@Ineuw: Sadly, most likely not. We’re essentially just waiting for Phe to have time to look into it. It is conceivable that WMF sysadmins or developers may intercede at some point, but since they have not already done so it probably means they will not do so any time soon. From the WMF’s perspective this is just a minor problem with a non-core add-on gadget provided by a third-party/volunteer developer and only affecting a few smaller wikis out of the several hundred they host, and it only affects some users even on the Wikisources (not everyone uses this OCR gadget). WMF developers do not normally have access to the tool’s account (it’s tied to Phe), and while they can technically usurp it, that’s considered somewhat of a last resort. In other words, I think we just have to wait it out now. --Xover (talk) 08:56, 1 October 2019 (UTC)
Thanks for the enlightenment. I asked because of wanting to vary my activity. The lack of this tool restricts me. — Ineuw (talk) 10:44, 1 October 2019 (UTC)


Takes layouts that you want to test. Plug them in there, and if you have the gadget activated, it flows through the test layouts. Means we can have them there for testing, rather than in the full mix and risk breaking things for all users. — billinghurst sDrewth 14:08, 17 September 2019 (UTC)

and the levels are on the talk page, so nothing will get broken. It was done so that new additions by anyone hitting the [+] were at level 2, but no bother whichever way as it hasn't been utilised. — billinghurst sDrewth 14:11, 17 September 2019 (UTC)
@Billinghurst: Ah, I see. The issue there was that since MediaWiki talk:Common.js/layouts is transcluded onto MediaWiki talk:Common.js, any new thread on the latter (like my request to edit the PageNumbers.js-related parts of MediaWiki:Common.js) got grouped as sub-threads of the transcluded level 1 heading. If there's no issue I propose we leave it as I left it; and if that breaks anything we try some other approach than the original. Maybe split MediaWiki talk:Common.js into multiple sections like WS:S or something, with the transcluded section first so that new threads are not subsumed into it. Not a big deal, obviously, but it was a bit confusing until I tracked down the culprit. --Xover (talk) 15:15, 17 September 2019 (UTC)
Okay, sometimes I hate transclusions. [what idiot did that? 🤔] UGH!! and see how we change in time, I would never transclude a page like that these days. Thanks for the fix. — billinghurst sDrewth 21:56, 17 September 2019 (UTC)

Can I impose on you for more HathiTrust scans?

Specifically these:

If you're feeling especially sadistic and want to grab all of these and toss them in Commons for future reference, that would be amazing, but I only need volume 1 at the moment.

Also: how hard would it be for you to teach me your HathiTrust-scan-grabbing ways? —Beleg Tâl (talk) 17:26, 17 September 2019 (UTC)

@Beleg Tâl: never hesitate to impose. :)
The trick for grabbing them is simply that when they display the work to you they are displaying image files, not a PDF or something like that. The simplest approach is just to pick a page and save it. To make the amount of labour manageable, you open the image in a new tab or window to find the actual URL for it. The URLs are regular, so you can derive the URL for each page from that. And tools like wget and curl can download multiple files based on a pattern: "This URL, but replace this bit with each number from 1 to 452". Keep in mind that they detect automated downloads and will block the IP doing the downloading, so you need to rate-limit the downloads to something reasonable (one every couple of seconds, say). Once you have all the images saved you can generate a DjVu in whatever way you have available. I have some (pretty hacky) scripts to do this using graphicsmagick, tesseract, and djvulibre; but anything that will take images in and spit DjVu out should work.
As for the specific works you mention, I can certainly grab them for you but I may not be able to get to them for a while. I'll be a combination of busy / travelling / offline until Sunday; then I have a few medium-busy days; followed by travelling. I'll try to sneak it in when I can, but it may have to keep until early October some time. --Xover (talk) 19:11, 17 September 2019 (UTC)
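As a rough illustration of the "this URL, but replace this bit with each number" idea with rate-limiting, here is a minimal Python sketch. It is not the hacky scripts mentioned above, and the URL template, the `{page}` placeholder, the `.jpg` suffix, and the function names are all assumptions made for the example; the real image URL has to be found by opening a page image in a new tab as described.

```python
import time
import urllib.request
from pathlib import Path


def fetch_url(url):
    """Download one URL and return its raw bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def download_pages(template, first, last, dest_dir, delay=2.0,
                   fetch=fetch_url, sleep=time.sleep):
    """Save pages first..last, pausing `delay` seconds between requests.

    `template` is a hypothetical URL containing a {page} placeholder.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for page in range(first, last + 1):
        if page > first:
            sleep(delay)  # rate-limit so the download is not flagged as automated abuse
        target = dest / f"page-{page:04d}.jpg"  # assumed suffix; match the real format
        target.write_bytes(fetch(template.format(page=page)))
        saved.append(target)
    return saved


# Example call (placeholder URL, not a real endpoint):
# download_pages("https://example.org/image?seq={page}", 1, 452, "scans")
```

The zero-padded file names keep the pages in order when they are later fed to whatever tool builds the DjVu.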
Very much appreciated. No rush. Enjoy your travels. —Beleg Tâl (talk) 19:25, 17 September 2019 (UTC)
@Beleg Tâl: You're in luck. I managed to squeeze these in before the craziness takes over: c:Ten Minute Stories (1914).djvu and c:The European Magazine and London Review - Vol. 1.djvu. I didn't do a lot of quality control, but I did give the DjVu a quick look locally without spotting any obvious issues. You should probably do a proper check of them before you assume all is well, not least because we know MediaWiki sometimes chokes on DjVu files that look fine in the local viewer (e.g. when the text layer gets out of sync with the actual page). Please do let me know if there are any issues, but, as mentioned, it may be a while before I can get around to fixing whatever it is. --Xover (talk) 08:34, 18 September 2019 (UTC)

Reminder: Community Insights Survey

RMaung (WMF) 19:13, 20 September 2019 (UTC)

All the Year Round

Hi, just responding to your query in an edit summary. All the Year Round isn't something I've ever claimed ownership of. All I did was do the uploading and indexes for whatever volumes were available back in 2012, so that they'd be available to people wanting to proofread articles out of them (not that anyone seems to have done so, at least not transcluded...). If you want to change/rearrange/whatever, please feel free (not that anyone needs permission, but you know what I mean)! Inductiveloadtalk/contribs 09:18, 23 September 2019 (UTC)

@Inductiveload: Thanks for following up. I was mainly just wondering whether it was something you did on your own initiative and out of interest in the work, or whether you were just responding to someone's request for a bot upload. The context is that while working on the backlog of files with OCR issues I came across a few good scans of this series and figured I'd better upload them to fill the holes in what we already have. So the question was more "Is this something you want a ping for to look over?". I have no particular interest in the series as such, and the task is just sort of a digression while working a maint. backlog, so it would be better if somebody that cared about it specifically looked over what I did. --Xover (talk) 09:38, 23 September 2019 (UTC)
I can't remember if there was an underlying reason I specially uploaded them back then. Considering there appears to be no mainspace activity, I would hazard it was probably just a spin-off from populating Author:Charles Dickens, Jr. In the intervening 7 years, better and more scans have probably appeared, and filling the gaps will be appreciated: 1) everyone likes a nice blue volume list and 2) periodicals like this are, in theory, a relatively good source of interesting shorter works and historical interest pieces, and are generally poorly organised elsewhere (like the IA, Google Books, etc). So having them here allows at least a curated list, even if only as scans for the time being. Regardless, I have no specific vested interest, so go nuts! I'll be happy to give them a once-over (with no guaranteed timescale!). Inductiveloadtalk/contribs 09:48, 23 September 2019 (UTC)

Reminder: Community Insights Survey

RMaung (WMF) 17:04, 4 October 2019 (UTC)

Thai Education Institute that was founded between 1995-1997 due to Sukavich Rangsitpol's Education Reform 1995

  • Declaration of education The announcement of Fifty Industrial Community College on 18 June 1997

Thai Wikisource Declaration of education The announcement of Fifty Industrial Community College on 18 June 1997

Please I deleted the page Thai Wikipedia sources had the article. Please link both languages together for future information. 2405:9800:BC11:BD0D:BDE0:5842:E5FE:EB2E 23:29, 5 October 2019 (UTC)

I'm sorry, but I do not understand what it is you are requesting of me. --Xover (talk) 06:04, 6 October 2019 (UTC)