Warning Please do not post any new comments on this page.
This is a discussion archive first created in , although the comments contained were likely posted before and after this date.
See current discussion or the archives index.

This is missing roman numeral pages x and xi (first ToC) which need to be inserted as /16 and /17. They are available as TIFs at /16 and /17. I'll move the pages to make way for an update. @Mpaa: billinghurst sDrewth 09:09, 18 September 2022 (UTC)

@Billinghurst file is fixed. Mpaa (talk) 18:08, 19 September 2022 (UTC)
This section was archived on a request by: — billinghurst sDrewth 12:57, 25 September 2022 (UTC)

Book 2 of The History of Herodotus is not divided into chapters like the other books. This book is very basic, so I think dividing it into chapters will benefit many editors. Thanks in advance, פעמי-עליון (talk) 00:14, 5 September 2022 (UTC)

It's not scan-backed, so really, what would be better would be to find a scan and start working on that. But if you want to split it, I don't see a problem with that. JesseW (talk) 01:32, 6 September 2022 (UTC)

Tech News: 2022-36

23:22, 5 September 2022 (UTC)

The 2022 WMF Board Election (Please Vote! only 2 days left!)

Hi all. As you've probably seen, and promptly ignored, the voting period for the 2022 WMF Board of Trustees election is open. Despite fatigue with global notices and mass messages about the latest hare-brained "strategy process" or "branding initiative" or… I am here to URGE YOU TO VOTE! (quick link: SecurePoll)

This election elects two community-filled seats to the Wikimedia Foundation Board of Trustees. The Board has 16 seats—one for Jimbo, 7 appointed by the usual muddled processes, and 8 from the community (but that's including the Affiliates, with no on-wiki experience requirement)—and are the ones who ultimately decide what the WMF does. They hire and fire the CEO (or Director, or whatever), set the strategy and priorities, start "strategic initiatives", and so forth. So, that… thing… a little way back when they wanted to rebrand the "Wikimedia Foundation" to the "Wikipedia Foundation"…? That was an initiative from the Board where they hired hideously expensive external branding consultants to figure out a rebrand for the WMF (yes, that's what they spent donation money on).

There has been a growing frustration in the communities about this, and there's a dawning realisation that there's a disconnect between the WMF and the community that is poison to the long-term health of the movement. This election and the recent board changes are baby steps towards a better situation.

So I am asking every single Wikisourcerer to please vote! (quick link: SecurePoll)

The main overview page is m:Wikimedia Foundation elections/2022/Community Voting. Getting to know the candidates is best done by reading their responses to questions for the candidates. Actual voting happens through SecurePoll.

You need to make up your own mind who to vote for, of course, but if you find making up your mind on this hard I have two tips for you:

  • Kunal Mehta, better known as Legoktm, is a long-time community member that has also formerly been employed by the WMF (and knows where the bodies are buried 😎). He's been a primarily technical contributor for a long time, but has stayed active and engaged on-wiki, and understands the community and community processes well. When uploading large books to Commons was essentially completely broken for 9+ months last year, Legoktm was instrumental in finally getting the problem flagged as a priority as well as actually figuring out what was going on and fixing it. Of particular import to us is that Legoktm is fronting a few key issues (all of which are broad concerns in the community):
    • WMF prioritisation of resources needs to be driven bottom up by the needs of the community, rather than top down by a disconnected Board and external consultants; especially when it comes to where to expend technical resources
    • The Board desperately needs someone with deep technology understanding so they don't neglect fundamentals (did you know that for several years nobody at the WMF owned the multimedia stack, including all of Commons?). Like having a team churning out new features, but then not giving anybody responsibility for maintaining that feature and allocating no resources to it. Do you know why we don't have a decent {{cursive}} font on Wikisource? (T166138) Because the Language Team at the WMF, who added the {{blackletter}} support in the Universal Language Selector, have been so starved of resources that they've had to scope away responsibility for web fonts (in favour of core i18n and language support); the very feature they were originally created to develop!
    • Development resources need to be better prioritised to meet the community's needs, rather than some pie-in-the-sky new "initiative". For example, assigning developers to work on maintaining the core MediaWiki and related components—like Proofread Page which is almost entirely maintained by volunteers, who have trouble even getting patches approved because there's nobody to review them—instead of new feature development of little interest to existing communities.
    • And while I certainly don't agree with them on everything, I've interacted with them enough times over the years that I feel confident I can always go grab them on their talk page and expect a response (most WMF people have never edited and don't answer on their talk pages); raise an issue about Wikisource and expect them to understand our community's unique perspective and be able to take it into account; and, not least, discuss such issues openly and frankly on-wiki even when we disagree on something. I've tried contacting "community" Board members on-wiki before and gotten only crickets, so this is an important factor for me personally (you may disagree of course).
    • They're also fronting several other issues regarding transparency, collaboration, and involvement that you can read about in the above linked pages, but those are to a great degree shared with other candidates.
  • Mike Peel, going by Mike Peel on-wiki, is also a long-time on-wiki contributor and community member. And while more active on other projects, they have over 5k edits on English Wikisource and understand our workflows and needs well. I've only rarely interacted with them, but from all observations they are as responsive and willing to discuss as can be expected; with the possible exception that they have a lot of irons in the fire and can be a bit busy for that reason. Their answers to candidate questions indicate they care deeply about the community, prioritising their needs and involving them in decisions. When I mention them second, and write less, it's primarily because they are less technically oriented than Legoktm is and I believe we desperately need more deep technical understanding on the Board along with the community understanding. Otherwise Mike would probably have been my top choice.

I have nothing against the other candidates, and it's entirely possible that you'll prefer one of them for your personal priorities, but these two are the ones that stand out to me as the ones that would be the best choices to take care of the needs and priorities of Wikisource and the wider community. And whatever your preference I urge you, in the strongest possible terms, to go vote! All the other stuff the WMF spams us with can usually be ignored, but the Board elections actually matter and in the long term will have a huge impact on Wikisource and your day-to-day work here. For example, it is extremely unlikely that we will have any dedicated team or developers assigned to Wikisource and Proofread Page unless we manage to get voted in Board members who fight for bottom-up prioritisation of resources and closer alignment with community needs, and with the technical experience and insight to understand that development resources must be allocated to also maintain core technical components of the existing projects (like the Wikisource and Proofread Page extensions that are Wikisource).

And luckily, the voting is w:Single Transferable Vote, so you rank your candidates by preference; and if your preferred candidate can't win, your vote is transferred to your second choice instead of being effectively wasted.

Please vote! Here: SecurePoll Xover (talk) 07:32, 4 September 2022 (UTC)

I appreciate this post. In my view, if there's even a single charitable organization in the world that absolutely should be governed by democratic processes, it's the WMF. To this end it's vital to foster a culture of participation from the bottom up; I would hate to see community involvement in governance disappear due to apathy on our part. With about seven hours left in this election, I highly encourage everyone to vote, if only to support the principle of democracy within the Wikimedia community. There is not much way for us lowly contributors to influence the direction of the WMF without the exercise of these fundamental rights, and it's quite easy for rights to disappear when those who have them fail to use them. The only rights we truly possess are those we actively claim and exercise, so: Please vote! Shells-shells (talk) 17:02, 6 September 2022 (UTC)

Importing from Commons problems

I tried to import certain files being deleted on Commons; it doesn't seem to have worked, but pulled in a bunch of templates. If I've managed to break anything, I'm sorry.--Prosfilaes (talk) 23:57, 7 September 2022 (UTC)

@Prosfilaes: Ouch! Yeah, importing doesn't work on files (it operates on the description page, not the file itself), and it's a massive trap for the unwary: if you check the "include templates" option it can go hog wild. Case in point, your import of the file ended up importing and overwriting ~60 different templates and Lua modules. See Special:Log/import. A lot of these either had local modifications or were actually imported from another project (typically enwp), so this is going to take a bit to untangle. In the meantime we should expect lots of weird breakages and problems.
I'll try to take a look as soon as I can, but I'm travelling at the moment so it won't be any sooner than ~12 hours from now. Xover (talk) 06:12, 8 September 2022 (UTC)
Ack, sorry about my part in prompting this mess (I was the one who noticed the files didn't get copied over, and asked Prosfilaes to try it again). We really should add a big warning on Special:Import. :-O JesseW (talk) 12:21, 8 September 2022 (UTC)
Ain't nobody needs to apologize fer nuffink: Special:Import is an attractive nuisance!
In any case, the worst of it seems to have been untangled now. We were really lucky in that most of what was imported was either not in use here, or we had more recent edits than Commons, so we just polluted the revision history but didn't actually break much of anything. I've tweaked the interface messages for Special:Import so that you get some warning now, and added a page with some guidance at Help:Importing (improvements welcome!). The messed up revision histories are going to be a pain, but that can't really be helped. To fix them requires major surgery and lots of research that the issue just isn't worth. Xover (talk) 09:40, 10 September 2022 (UTC)

List of UK Admiralty Charts, 1967

In July 2021, there was a consensus on the English Wikipedia to transwiki w:Draft:List of UK Admiralty Charts, 1967 to Wikisource. This seems to have never been followed up on. * Pppery * it has begun... 17:16, 9 September 2022 (UTC)

@Pppery: That list doesn't appear to be in scope for enWS. Xover (talk) 20:20, 9 September 2022 (UTC)
The list is claimed to be taken directly from the official catalog(ue) of the Hydrographic Office, which should be in scope. Shells-shells (talk) 21:52, 9 September 2022 (UTC)
Yeah, looking at them, there does appear to be far more text and charts, there's a lot more than just Maps on there, so maybe? Reboot01 (talk) 23:21, 9 September 2022 (UTC)
Hmm. If it's taken directly from the published catalogue, rather than being compiled by a contributor, then it's possible to argue that it's in scope. But in this case, I think they mean that they've used that publication as a source from which they have compiled a list that does not otherwise exist in this form (I find no matching list in the original catalogue). And it's still mostly just a bunch of images of maps and tables of data, which makes it arguably out of scope here for much the same reason it's out of scope on enWP. It is also going to be literally impossible for us to reproduce this work properly due to the large tables and MediaWiki transclusion limits. Xover (talk) 05:10, 10 September 2022 (UTC)
maybe https://en.wikibooks.org/wiki/Main_Page is a home for the unloved list. not really interested in hosting english’s castoffs. --Slowking4Farmbrough's revenge 21:03, 10 September 2022 (UTC)

Tech News: 2022-37

01:49, 13 September 2022 (UTC)

Index:Gleichförmige Rotation und Lorentz-Kontraktion.djvu -- not actually scan backed

Back in 2018, it looks like someone misunderstood what "scan-backed" meant, and uploaded a German language (excerpt) and put their English translation in the Page namespace. This is Index:Gleichförmige Rotation und Lorentz-Kontraktion.djvu. Could someone help to untangle this? (probably via deletion, sadly) JesseW (talk) 23:27, 10 September 2022 (UTC)

@JesseW It's not just this single page; you can step through D.H's contributions to find many similar translated works. User translations are generally permissible; however, I don't think there's any specific policy governing the matter. WS:WWI#Translations states that "Wikisource also allows user-created wiki translations" without elaborating much further. See WS:Translations#Wikisource original translations for some of the community's thoughts on this, but note that it's not official policy. Shells-shells (talk) 04:42, 11 September 2022 (UTC)
That's correct. It's not real easy to read from the linked pages, but when I went digging into it a while back the summary is: user translations must be scan-backed, and the original language work must exist on the appropriate language Wikisource. They are then transcluded into a page in the Translation: namespace instead of in mainspace. See Translation:On Discoveries and Inventions for an example.
This is an area where policy and guidance really should be cleaned up, and probably rethought. I'm not convinced we should permit user translations (if you look through Translation: there's a lot of unverifiable crap there, and we tend to dump stuff there with no effort to actually clean it up; see WS:PD#Translation:Manshu for how hard it is to actually enforce such a de facto semi-expressed policy and what the result is when we don't manage that namespace properly). And if we do, I'm not sure a separate namespace for it makes sense (they use a dedicated header template in any case). And if we should permit them we need to design and try to get implemented better tools for it. And such better tools should probably address user annotations as well, if we are to permit them, since some of the technical issues overlap (again, see the deletion discussion linked above). Xover (talk) 05:31, 11 September 2022 (UTC)
Most of the translations related to Portal:Relativity and Wikisource:WikiProject Relativity have been created by me between 2010-2014, and since then I do mostly maintenance work on those articles (correcting errors, providing scans and transclusions etc.) without adding new ones. One of the reasons (besides lacking enough time) I stopped adding new translations, is the lack of clear policy or guideline (ws:Translations is still only a proposed policy), so it was never 100% clear to me whether those translation are fully acceptable on WS or not. --D.H (talk) 08:55, 11 September 2022 (UTC)
@D.H: User translations are definitely permitted on English Wikisource.
I am not, personally, convinced they should be, for various reasons, but they most definitely are currently (and have been for a very long time). So if that's the reason you've tagged a number of translated pages for speedy there's no need. Xover (talk) 19:24, 11 September 2022 (UTC)

Yes, I'm glad for the translations -- my concern was that they shouldn't (I think?) be in the Page namespace, but in the Translation namespace directly, and that the German originals should be (scan backed) in the German Wikisource. JesseW (talk) 00:56, 12 September 2022 (UTC)

Having the translation in page namespace provides many of the features of transcription in the page namespace, namely the ability to trivially compare the text to the original. Unfortunately the German Wikisource has some pretty strict rules about adding new texts--de:Wikisource_Diskussion:Projekte#Regeln_für_neue_Projekte--and it's not trivial for someone who just wants to translate a text to upload it to the German Wikisource first.--Prosfilaes (talk) 03:16, 12 September 2022 (UTC)
Good lord, remind me not to try and add works to German WS. Reboot01 (talk) 13:25, 13 September 2022 (UTC)
I am not a friend of WS user’s own translations, but as they are permitted here, using the page namespace for them instead of direct translations into translation namespace should be encouraged, as it makes it easier to check the quality of the translations or various changes to the translations. I suspect that practically no established contributors try to patrol additions or changes in translation namespace and the current difficulties with searching for original and comparing it with our translated text are the reason. --Jan Kameníček (talk) 15:36, 13 September 2022 (UTC)

It seems like the consensus is in favor of the practice of putting user translations in the Page namespace (and not requiring that the originals be first uploaded to the original language WikiSource). I'm fine with this, but we should update WS:Translations#Wikisource original translations to clarify this. Anyone want to volunteer? JesseW (talk) 17:20, 13 September 2022 (UTC)

Commons editing not being picked up by Wikisource

This is just an oddity and no longer a problem. I made a mistake on a link on a Commons page so that when Wikisource picked it up the link was sent to a destination that did not exist. I therefore edited the link in Commons; Commons then showed the corrected link and directed correctly to the page in question, however Wikisource still persisted in directing to the erroneous destination. It refused to change; I shut down my computer and re-entered but this did not help, so provisionally I did a redirect. Finally, I deleted the page in Wikisource that contained the link and then copied the coding back in to recreate it. This solved the problem but I'm wondering why Wikisource was working on data that no longer existed. Esme Shepherd (talk) 15:50, 14 September 2022 (UTC)

Are hws/hwe deprecated?

On a book I'm validating there's a word broken between two pages. It's at the bottom of this page: Page:Shinto,_the_Way_of_the_Gods_-_Aston_-_1905.djvu/122. There's no hws template here, just a regular old hyphen. Yet on the transcluded page (Shinto:_The_Way_of_the_Gods/Chapter_6, at boundary between pages 112 and 113) the word is merged as if the hws/hwe templates were used. I would have expected it to be clear- ing on the transcluded page. What's going on? Are word breaks across pages handled automatically now? Do I not need to use hws/hwe anymore? --Arbitan (talk) 21:06, 16 September 2022 (UTC)

Yep, word breaks are handled automatically now; it's part of the <pages /> logic. If you want to prevent them, use {{nop}}, I think. JesseW (talk) 02:10, 17 September 2022 (UTC)
Thanks! I don't know how I missed that. --Arbitan (talk) 07:01, 17 September 2022 (UTC)
@Arbitan, @JesseW: {{hws}}/{{hwe}} are not deprecated deprecated, they're just not needed anymore except in special circumstances (inside footnotes you still have to use them). If you want to preserve a hyphen in a hyphenated word that occurs at a page boundary you need to use {{peh}} (which is not an acronym for "page-end hyphen", but is rather onomatopoeic for the sound one makes when encountering such a conundrum for the first time: Peh!?) Xover (talk) 08:22, 17 September 2022 (UTC)
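For readers wondering what "handled automatically" means in practice, here is a minimal Python sketch of the idea only. It is a toy model, not the actual `<pages />` / ProofreadPage code: the function name `join_pages` and the rule "a trailing hyphen always marks a broken word" are assumptions made for illustration, and it does not model {{peh}}, {{nop}} or footnotes.

```python
def join_pages(pages):
    """Join page texts, merging a word that is hyphenated across a page boundary.

    Toy rule (an assumption, not the real ProofreadPage logic): if the text so
    far ends in a hyphen, drop the hyphen and append the next page directly;
    otherwise separate the pages with a single space.
    """
    result = ""
    for text in pages:
        text = text.strip()
        if not text:
            continue  # ignore empty pages
        if result.endswith("-"):
            # "clear-" + "ing" becomes "clearing"
            result = result[:-1] + text
        elif result:
            result += " " + text
        else:
            result = text
    return result


print(join_pages(["the clear-", "ing of the land"]))
```

Under this rule a genuinely hyphenated word at a page end would lose its hyphen as well, which is the case {{peh}} exists for.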

Live list of texts to validate, sorted by number of pages remaining?

Has someone already created a way to view a list of texts that need validation (I know that exists), but sorted by number of pages still needing validation? I think it'd involve going thru the (currently ~1 million) entries in Category:Proofread and grouping them by the basename (the part before the slash). That could be done from a dump, or using the query service I vaguely remember exists/existed, or something? JesseW (talk) 02:15, 6 September 2022 (UTC)

OK, I think I made it here: https://quarry.wmcloud.org/query/67152 -- JesseW (talk) 02:41, 6 September 2022 (UTC)
And there seem to be 4,817 texts in that state (many of them are single page works, and in some the rest are not even started). Interesting! JesseW (talk) 02:46, 6 September 2022 (UTC)
Your query is picking up a huge number of texts that have exactly one page in the proofread state, while having many others un-proofread or non-existent. For example, Index:2010_constitution_of_Angola.djvu is on your list, which has 1 validated page, 1 proofread page, 4 not-proofread pages, and 85 pages still to be added at all. — Dcsohl (talk) (contribs) 12:51, 6 September 2022 (UTC)
I've improved it now, please check it again. It now returns just 308 works whose Index is in Category:Index_Proofread and which have exactly one Page in Category:Proofread. JesseW (talk) 15:06, 6 September 2022 (UTC)

I've now copied the current results (200 works) onto User:JesseW/Works with only one unvalidated page; I plan to go thru them and either work on them, or add notes why not (i.e. handwritten, actually very long, I don't want to work on that subject, etc.) I'd be delighted for other people to contribute, if you are so moved. JesseW (talk) 00:48, 20 September 2022 (UTC)
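The grouping step described at the top of this thread (take the titles in Category:Proofread, group them by the part before the slash, sort by how many pages remain) can be sketched in a few lines of Python. This is only an illustration run on a hand-made list of titles; it is not the Quarry query linked above, and the function name `pages_remaining` is made up for the example.

```python
from collections import Counter


def pages_remaining(proofread_titles):
    """Count proofread-but-unvalidated pages per work, fewest first.

    Each title is expected to look like "Page:Some file.djvu/12"; the base
    name is everything before the final slash. Titles without a slash
    (single-image works) are counted under their full title.
    """
    counts = Counter(title.rsplit("/", 1)[0] for title in proofread_titles)
    # Sort by count, then by name so the order is stable.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))


titles = [
    "Page:A.djvu/1",
    "Page:A.djvu/2",
    "Page:B.djvu/7",
    "Page:C.jpg",
]
for base, remaining in pages_remaining(titles):
    print(remaining, base)
```

As Dcsohl's example shows, a count taken from Category:Proofread alone says nothing about pages that are not yet proofread or not yet created, so the result still has to be filtered by the state of the Index page.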

OCR and sidenotes

As anyone who has worked with OCR trying to parse sidenotes can attest, OCR doesn't seem capable of differentiating sidenotes from the main text, even though in many documents there is a clear "border" since most text is set to justified. So it just pushes the sidenote into the text, and you've got to go and separate it out.

For small documents, it's no big deal, but this can be annoying with large documents, such as legislative documents.

Are there any OCR solutions that are more effective at this? Even if this requires using something other than Wikisource's OCR and then copying the data back in. Supertrinko (talk) 03:30, 19 September 2022 (UTC)

When in editing mode, on the right next to "transcribe text" you can roll down OCR settings, there you choose "advanced options" and then you can select only a particular area of the page for OCR, e.g. without sidenotes. --Jan Kameníček (talk) 05:45, 19 September 2022 (UTC)
I didn't know this! Thank you very much! Supertrinko (talk) 20:45, 19 September 2022 (UTC)
See also phab:T294903 for bringing this into the native WS page editor. Inductiveloadtalk/contribs 21:48, 19 September 2022 (UTC)

Tech News: 2022-38

MediaWiki message delivery 22:16, 19 September 2022 (UTC)

Proofread status of pages bar no longer present on transcluded mainspace pages

On Songs and Sonnets (Coleman), for example, where there should be a yellow status bar across the top, under the title, it's just a few dots, presumably the border in which the bar was supposed to appear. I tested it on both Chrome and Safari, and it's true on both browsers. It seems to be a sitewide issue. Does anybody know what might be going on? @Inductiveload: PseudoSkull (talk) 16:01, 21 September 2022 (UTC)

@Tpt, @Soda: This sounds like something that would be caused by either a change in PRP, or a skin change that affects PRP, in this week's train. Any ideas? We're triggering various on-site stuff off the presence of that progress bar and that's currently broken too. Possibly related, my "Wikidata info" gadget is showing "Wikidata item not found." for a whole lot of pages now too. Might be incidental though. Xover (talk) 17:08, 21 September 2022 (UTC)
Sorry for that. It's indeed caused by this change. I made a mistake with the resource loading configuration. I have written a fix. It would be amazing if one of you could create a bug report on Phabricator about it. Edit: Done Tpt (talk) 19:13, 21 September 2022 (UTC)
I have written a quick fix before the patch deployment. Tpt (talk) 19:39, 21 September 2022 (UTC)
recently, i see also a large text box hovering over the footers field, and color dots. --Slowking4Farmbrough's revenge 00:52, 23 September 2022 (UTC)

Please could we extract Page:Dictionary of National Biography volume 46.djvu/88 and replace with the readable [9]/[10].

AND (UGH!) It seems as though there is a prolific set of replacements needed in Index:Dictionary_of_National_Biography_volume_46.djvu. Do you wish for a list, or is the pointer sufficient? Thanks. — billinghurst sDrewth 12:57, 25 September 2022 (UTC)

@Billinghurst I replaced up to page 298 (djvu 304); the IA replacement you linked has fewer pages (only up to p309). Quality is still very low even for some replaced pages. Please take a look as it is a very error-prone task. Mpaa (talk) 22:15, 25 September 2022 (UTC)

Tech News: 2022-39

MediaWiki message delivery 00:30, 27 September 2022 (UTC)

Hi. Wondering whether we could do something to prevent books that have been labelled in {{header}} with a year parameter that is a decade, e.g. "1820s", from being categorised into this cat. For example Beautiful and interesting account of the shepherd of Salisbury Plain. — billinghurst sDrewth 22:42, 23 September 2022 (UTC)

We could, but I don't see why; there's a lot of different similar formats used by other works in that category, like "c. 1816-1820" Bart'lemy fair or "1897-98" Hagar of the Pawnshop (Canadian Magazine, 1897-98) or "1840-1850" Jim Crow. Until there are some standards about how to format years that aren't simple years, I don't see any reason to pull out 1820s and not 1840-1850.--Prosfilaes (talk) 01:15, 24 September 2022 (UTC)
well- you could model a date accuracy format, based on day, month, year, decade, century, but it is hard. note the red flags on english for the zotero reference date generator. (i.e. they don’t fix the code to reflect the reference format, but flag it for others to conform to their style) --Slowking4Farmbrough's revenge 19:31, 27 September 2022 (UTC)
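To make the "date accuracy format" idea concrete, here is a rough Python sketch that sorts the year strings quoted in this thread into a precision class with a start and end year. It is purely illustrative: it is not how {{header}} or its categorisation works, the function name `classify_year` and the tuple layout are invented for the example, and it assumes two-digit range ends such as "1897-98" stay within the same century.

```python
import re


def classify_year(value):
    """Return (precision, start, end, circa) for a free-form year string."""
    text = value.strip()
    circa = False
    prefix = re.match(r"(?:c\.|ca\.|circa)\s*", text, re.IGNORECASE)
    if prefix:
        circa = True
        text = text[prefix.end():]

    if re.fullmatch(r"\d{4}", text):
        return ("year", int(text), int(text), circa)

    decade = re.fullmatch(r"(\d{3})0s", text)
    if decade:
        start = int(decade.group(1)) * 10
        return ("decade", start, start + 9, circa)

    span = re.fullmatch(r"(\d{4})\s*[-–]\s*(\d{2}|\d{4})", text)
    if span:
        start_text, end_text = span.groups()
        if len(end_text) == 2:
            # Assumption: "1897-98" means 1897 to 1898 (same century).
            end_text = start_text[:2] + end_text
        return ("range", int(start_text), int(end_text), circa)

    return ("unknown", None, None, circa)


for sample in ["1820s", "c. 1816-1820", "1897-98", "1840-1850"]:
    print(sample, classify_year(sample))
```

With something like this, a category rule could treat "1820s" and "1840-1850" the same way (both are non-exact), which is the consistency point raised above.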

Page indexing in non-content namespaces

I notice that page indexing by search engine robots is set to "allowed" for the "Page:" namespace, "User talk:", etc.

You can see this for any given page by clicking "page information" in the lefthand navigation, it will show you something like this.

In general, search engines like Google and Bing do not seem to rank Wikisource very highly. The reasons are surely complex, and I realize there is plenty of stuff in the main namespace that is not the highest quality from a web indexing perspective; but it seems that items in namespaces such as Page:, User talk:, Talk:, and several others are almost never going to be very useful if encountered outside the context of the Wikisource site as a whole, and therefore pretty useless to a search engine.

English Wikipedia has indexing of such namespaces turned off. I believe the facility for doing so is a page in the MediaWiki: namespace.

How do folks feel about the possibility of discouraging external robot indexing of several namespaces? -Pete (talk) 02:21, 24 September 2022 (UTC)

Encouraging preferential indexing of things outsiders are likely to want to see, like Index and Author pages, makes sense. HLHJ (talk) 22:37, 2 October 2022 (UTC)

Policy on reprints of public domain books with illustrations still in copyright

Would it be possible to upload scans of a later version of a public domain work if we omit the illustrations that are still under copyright? I have a copy of a book about Paul Bunyan from 1926, but the illustrations are from 1965. Could I upload a scan of this version of the book with the illustrations omitted, since the text is the same? Or would this still be considered to be poor form? SurprisedMewtwoFace (talk) 14:44, 29 September 2022 (UTC)

Why not use https://babel.hathitrust.org/cgi/pt?id=mdp.39015013096212 instead? I can upload that.--Prosfilaes (talk) 01:32, 1 October 2022 (UTC)
I've uploaded the Will Crawford illustrations to Commons:Category:Paul_Bunyan_and_his_great_blue_ox_(1926) and will upload the scans from https://archive.org/details/paul-bunyan-and-his-great-blue-ox-images as soon as IA finishes processing.
At File:Paul Bunyan and His Great Blue Ox (1926) by Wallace Wadsworth.djvu.--Prosfilaes (talk) 18:49, 1 October 2022 (UTC)
That one has a different illustrator "Will Crawford"--RaboKarbakian (talk) 20:28, 1 October 2022 (UTC)
Yes, but for our purposes, the other one is unillustrated.--Prosfilaes (talk) 04:20, 2 October 2022 (UTC)

Moving all chapter subpages using Roman numerals in titles to use Arabic numerals

I want to propose that we (finally) perform a mass moving of chapter subpages that use a roman numeral scheme (such as Tarzan the Untamed/Chapter II) to an Arabic numeral scheme (Tarzan the Untamed/Chapter 2). The roman numeral equivalents can stay as redirects.

I would apply this same procedure to anything else that uses roman numerals for incrementation, such as Volume/Issue numbers in periodicals, Part or Book numbers in books, "Poem" or "Letter" numbers or any other weird work-specific kind of chapter, etc.

We have been trying to transition away from using Roman numerals for a long time. It is very clearly not preferred and most of the works that still use Roman numerals in titles are older works that were originally added in the 2000s. In fact, the Wikisource help page Help:Subpages makes it clear that there is (at least loosely) a requirement not to use Roman numerals in page titling (emphasis mine):

Chapters, sections and so on should be numbered with Arabic numerals (i.e. 1, 2, 3; not Roman numerals) when such a numbering scheme exists in the original work.

There is no policy that I could find on this, though. I think there should be.

People who have been active here for a long time know exactly why Roman numeral titles are not good, but just a refresher for those who aren't in the know: Roman numerals add titling inconsistency to the site, make it harder to search through (Isn't it easier to search for "Chapter 23" than have to write "Chapter XXIII"?), make it more difficult to run scripts/bots through, etc.

I volunteer to participate in writing a PWB bot that can do this job, and if it gains enough support here, I'll take this to Bot requests. A past discussion from a couple years ago I found on this: Wikisource:Scriptorium/Archives/2019-08#Roman_Numerals_in_Page_titles.... We need to actually do it this time. Pinging @ShakespeareFan00, @Beeswaxcandle, @EncycloPetey: PseudoSkull (talk) 15:16, 8 September 2022 (UTC)
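As a rough illustration of the title computation such a bot would need, here is a minimal Python sketch. It only works out the proposed new title; it makes no edits and uses no Pywikibot calls. The list of subpage labels (Chapter, Part, Book, Volume) and the function names are assumptions for the example, not an agreed specification.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}
# Well-formed Roman numerals from 1 to 3999 (rejects things like "IIII").
ROMAN_RE = re.compile(r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})")
# Hypothetical label list: only these subpage kinds are converted.
SUBPAGE_RE = re.compile(r"^(Chapter|Part|Book|Volume) ([IVXLCDM]+)$")


def roman_to_int(numeral):
    """Convert a well-formed Roman numeral to an integer."""
    if not numeral or not ROMAN_RE.fullmatch(numeral):
        raise ValueError(f"not a well-formed Roman numeral: {numeral!r}")
    total = 0
    for index, char in enumerate(numeral):
        value = ROMAN_VALUES[char]
        following = numeral[index + 1 : index + 2]
        # A smaller value before a larger one is subtracted (IV = 4).
        if following and ROMAN_VALUES[following] > value:
            total -= value
        else:
            total += value
    return total


def arabic_title(title):
    """Return the proposed Arabic-numeral title, or None if no move applies."""
    base, slash, last = title.rpartition("/")
    if not slash:
        return None  # not a subpage, so the work's own title is left alone
    match = SUBPAGE_RE.match(last)
    if not match:
        return None
    label, numeral = match.groups()
    try:
        number = roman_to_int(numeral)
    except ValueError:
        return None
    return f"{base}/{label} {number}"


print(arabic_title("Tarzan the Untamed/Chapter II"))
```

Only the final subpage component is examined, so a work whose own title ends in a Roman numeral is not touched, and any label not in the list is skipped rather than guessed at. A real run would still need a human-reviewed list of moves before anything is renamed.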

It seems (from my not very knowledgeable perspective) that, for works where the original uses Roman numerals, it would be better for linking purposes to have both forms available. I have no strong opinion on which one should be the redirect, but it makes more sense to always use Arabic numerals as the page title, I suppose. I strongly support using a bot to ensure we at least have redirects in place for all of them, even if we don't move the existing page titles. JesseW (talk) 16:17, 8 September 2022 (UTC)
  Support: It is important that the chapter’s title is transcribed from the page namespace as it is including the Roman numerals, so it does no harm if the Roman numerals are changed into Arabic ones in the page name, while such a change has some advantages described in the proposal. Therefore I agree with the proposal. --Jan Kameníček (talk) 20:06, 8 September 2022 (UTC)
  Oppose Generalizing everything to "chapters, sections and so on" is a bad idea. There are situations where it is needlessly complicated to use Arabic numerals for subpage titles, such as in the Acts of plays. The standard notation for plays is to use Roman numerals for Acts and Arabic numerals for scenes. There are also classical works that might warrant investigation, for which the international standard for parts uses Roman numerals; however, I'm not familiar enough with those works to make any authoritative statement. Finally, there are works that contain Roman numerals as part of the title, and a sweeping statement does not make allowance for short stories or poems whose title is or ends in a Roman numeral. For chapters of novels, I agree with standardizing to Arabic numerals, but I disagree for plays, and am uncertain or opposed concerning some other forms of subdivision. --EncycloPetey (talk) 20:15, 8 September 2022 (UTC)
@EncycloPetey: Care would be taken to ensure that certain titles of works which more naturally included roman numerals (something like "The Life of Charles VI" for example) do not get changed. I don't know if I agree about the acts in plays bit, since keeping their roman numerals would still affect usability and consistent parseability. But, if I were to grant that this procedure would not affect Acts of plays (or perhaps also Scenes), would you at least agree with the moving of subpages like "/Chapter VI", or "/Part III"? PseudoSkull (talk) 21:05, 8 September 2022 (UTC)
I agree generally with Arabic numbering of numbered chapters, or doing so in works where it is convenient to number the chapters (though titled and not numbered in the original), and in general with Arabic numbering of volumes, issues, and parts for works where those are numbered in some fashion. But I do not agree with changing titles with Roman numerals into Arabic, nor forcing Arabic numbers onto acts of plays where convention is to use Roman numerals. I would also consider the potential for exceptions where, for example, a classical work uses Roman numerals and multiple notes and footnotes all keyed to those same Roman numerals internally. In such a work it would be a PITA to have to double the numbering system to display one and use the other over and over repeatedly. Those exceptions should be rare, however. I can think of only one or two volumes offhand (which we don't have here) that use such a system consistently throughout. --EncycloPetey (talk) 02:18, 29 September 2022 (UTC)
yeah, i am not a fan of the roman numeral phobia. already we cannot use roman numerals as footnotes, even when the source text does. will we now "correct" all texts with roman numerals in the toc and preface numbers? i don’t see the problem it is fixing. --Slowking4Farmbrough's revenge 20:59, 10 September 2022 (UTC)
No, we will leave the source text as is, but the wiki titles would contain Arabic rather than Roman numerals. And the problems it is fixing have been mentioned above. PseudoSkull (talk) 05:54, 11 September 2022 (UTC)
so you are going to move all mainspace chapter pages with roman numerals to arabic? is there a structural reason? it seems like micromanagement, with no added benefit. --Slowking4Farmbrough's revenge 20:54, 12 September 2022 (UTC)
@Billinghurst: I saw you make a move of this nature recently. Your opinion on this proposal? PseudoSkull (talk) 21:08, 11 September 2022 (UTC)
  Support. Cross-links are our value proposition; they are the thing that makes us different from and better than other transcription projects. If I'm transcribing a work of non-fiction that references other works e.g. a work of botany, I want to be able to add deep cross-links as I go; for example, link a reference to an article in a specific number of a specific volume of a journal. I want to be able to do that even if we haven't transcribed the article yet; but I can't do that if I can't predict what the link target will be. e.g. if it is impossible to know if it will be "Example Journal/Volume 9/Issue 1" or "Example Journal/Volume IX/Issue 1" or "Example Journal/IX/1" or etc. Hesperian 23:34, 11 September 2022 (UTC)
  •   Comment As Hesperian said, we should be having all new works with Arabic numbering for chapters. AND AND AND this is the exact reasoning that we expressed years ago for our deeplinking, and also for better chapter sorting. I would have a preference for all works, though can understand that there may be genuine exceptions to the rule. Noting that these should be well argued exceptions to the rule, not simply personal choice/fancy. What we do with older works is always an interesting conundrum. I haven't touched old works, I just concentrate on recent works. And to Slowking4 it is not a phobia, that is a misrepresentation, it is a standard style, and nothing wrong with having one. — billinghurst sDrewth 12:46, 13 September 2022 (UTC)
  • a standard style that does not admit technically possible alternatives, seems a little pushy? will you now move all the chapters with roman numbers taken from the toc? or edit war and block people who create them? it seems like a lot of work to get your ascii sort functionality. you will just drive others to have no numbers in chapter titles. the policies are written as norms, and enforced as iron laws. --Slowking4Farmbrough's revenge 15:21, 13 September 2022 (UTC)
  •   Comment ++ with respect to redirects of subpages, they are a bad idea, as when we disambiguate works they truly are a PITA or a true problem. If we are moving them, I would suggest that any moved subpages have {{dated soft redirect}}s — billinghurst sDrewth 13:08, 13 September 2022 (UTC)
    Could you say more about the problems that redirects of chapter pages have caused? I believe you, but for ease of linking, I'd love it if we had both styles (Arabic and Roman) as a matter of course, so I'd like to know more about what problems you've noted. JesseW (talk) 14:31, 13 September 2022 (UTC)
    Mostly it's a matter of having a lot of extra redirects to manage, and redirects do get stale and need updating over time (page moves). And if you create such redirects for these works you need to add them for all works for consistency, which is going to be a huge huge job, that needs manual intervention (because humans have created the input data so it is known not to be consistent). And then you have a literal doubling of the number of pages in mainspace. On the level of a single redirect it is true that redirects are cheap, but systematically at this scale they're far more costly than I think this convenience merits. Xover (talk) 04:10, 15 October 2022 (UTC)
  •   Support Per nom and Hesperian, with caveats per EncycloPetey and Billinghurst. I support cleanup of old sins to match our modern standard for this. I don't think it is a good idea to do this automatically and en masse because experience indicates this needs human attention to handle links and so forth. I support treating this as a "hard" policy, but with the possibility of commonsense exceptions for some (not all) of the edge cases enumerated by EncycloPetey (I think EP overstates the problem to make a point, but there are definitely some cases where an exception should be at least considered). I am also opposed to subpage redirects at scale per Billinghurst.
    As a practical matter, perhaps the way to get started would be to run a bot through to categorize these pages into a suitable backlog? Start with subpages matching /Chapter [IVXLCM]+ or /Vol(\.|ume)? [IVXLCM]+, and give them separate categories. Maybe we can modify {{header}} to detect roman numerals in its page name (not title) and add a tracking category? That would have the benefit of creating a way to catch these in future without draconian edit filters (although we could profitably warn users and link to some guidance when they're about to create such a page). --Xover (talk) 04:30, 15 October 2022 (UTC)
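The categorization pass suggested above could be prototyped offline before any bot touches the wiki. The Python sketch below applies the two patterns quoted in the comment to a plain list of page titles. The category names and the function names are hypothetical, and the lookahead `(?=/|$)` is an added assumption so that a name like "/Chapter Iago" is not mistaken for a numeral.

```python
import re

# The two patterns from the comment above, plus a lookahead (an assumption)
# requiring the numeral to be a whole subpage segment.
CHAPTER_RE = re.compile(r"/Chapter [IVXLCM]+(?=/|$)")
VOLUME_RE = re.compile(r"/Vol(\.|ume)? [IVXLCM]+(?=/|$)")

# Hypothetical tracking-category names, for illustration only.
CHAPTER_CAT = "Subpages with Roman numeral chapters"
VOLUME_CAT = "Subpages with Roman numeral volumes"


def classify(title):
    """Return the list of backlog categories that a page title falls into."""
    categories = []
    if CHAPTER_RE.search(title):
        categories.append(CHAPTER_CAT)
    if VOLUME_RE.search(title):
        categories.append(VOLUME_CAT)
    return categories


def build_backlog(titles):
    """Group titles by category; titles matching neither pattern are skipped."""
    backlog = {CHAPTER_CAT: [], VOLUME_CAT: []}
    for title in titles:
        for category in classify(title):
            backlog[category].append(title)
    return backlog


sample = [
    "Example Work/Chapter XII",
    "Example Work/Chapter 12",
    "Example Journal/Volume IX/Issue 1",
    "Example Journal/Vol. II",
    "Example Play/Chapter Iago",
]
for category, pages in build_backlog(sample).items():
    print(category, pages)
```

The character class `[IVXLCM]+` also accepts strings that are not valid numerals (for example "MIX"), so a backlog built this way would still want the human review the comment calls for.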

Work exports to PDF format list non-contributors as contributors

For as long as I can remember, PDF exports had a list at the very end that said "The following users contributed to this book:". In that list, there are several usernames of people who obviously did not work on the Wikisource transcription itself. A list I just got from a PDF export of Captain Midnight broadcast signal intrusion, besides my own username, was:

Who are these people and why do they always come up in PDF exports if they never worked on the transcription? Were they responsible for working on the export functionality, or is there some other reason they are listed as contributors?

If it's just a bug that they're listed as contributors, maybe we should try and take them out of the default list? PseudoSkull (talk) 15:10, 29 September 2022 (UTC)

I frankly have no idea at all, this is all news to me. I did some work on a couple of templates 15 years ago though... AzaToth (talk) 22:22, 29 September 2022 (UTC)
@PseudoSkull, @AzaToth: The listed usernames all uploaded new versions to c:File:PD-icon.svg, which is used in {{PD-ineligible}}. Presumably they're listed because the template is evaluated as part of the body text. Shells-shells (talk) 23:49, 29 September 2022 (UTC)
But why are only those users who uploaded an image listed, and not also the contributors to, e.g., the actual {{PD-ineligible}} template itself? Shells-shells (talk) 23:54, 29 September 2022 (UTC)
I do not understand why the uploaders are mentioned, as the c:File:PD-icon.svg itself is licensed as PD-ineligible and so there is no need to mention any "authors" of the image when using it. It is a real non-sense to list the uploaders of the image whenever the license tag is used. --Jan Kameníček (talk) 08:21, 30 September 2022 (UTC)
WS Export lists all contributors to exported pages and exported files. You're quite right to say that it really doesn't make sense for license images! But most images are part of the work proper, which is why that decision was made (many years ago). It's also wrong to not include template authors; I've just made task T319010 to track that. A workaround for the license images specifically would be to add the ws-noexport class to those images. That way they won't be included in the exported book. There's also task T276672 which is about removing the credits list completely, and replacing it with an online version (that'd still suffer the same problems, of course). — Sam Wilson 09:49, 30 September 2022 (UTC)
I think I've argued elsewhere that including the list of contributors in the export is pretty useless: the point of that listing isn't really bragging rights, but rather fulfilling the attribution requirement of CC BY, and 1) it's well established that a link back to the wikipage where the revision history serves as attribution is sufficient, and 2) I have trouble thinking of any exportable content on Wikisource that is actually separately copyrightable (and hence subject to that attribution requirement) except Wikisource translations which aren't actually exportable at the moment. I would argue we should make the default behaviour to omit the list and just rely on the already existing link back to the wikipage on which the "Download" was pressed. The "list all contributors to all pages associated with this work" (i.e. phetools/credits.py) thing will be independently useful, but isn't really needed for this. Xover (talk) 15:22, 30 September 2022 (UTC)
Excuse this, it really doesn't belong here but.... For me, it was great to see User:Rocket000 in a list. Adds the style to the template and then you paste it all about....--RaboKarbakian (talk) 17:19, 30 September 2022 (UTC)
@Xover Well, in my opinion, it is quite the opposite and including the list of contributors for export is very very useful. From my experience as an admin of pl ws (quite small ws) and conversations at community meetings, it appears that rewriters and proofreaders of our works appreciate being on this list very much. And it's not about bragging by any means. Rather, it's a motivational factor and a tiny element of appreciation. In small communities such as pl sources, the motivational element is very important to keep them alive. I believe that this list should not be removed and that there should be the scribes who worked on the text, both on the page and the main ns. @Samwilson please, make any limiting changes optional for this matter. Zdzislaw (talk) 12:05, 1 October 2022 (UTC)
@Zdzislaw: Personally, there's a bunch of other stuff that I'd like to fix with WS Export before getting to that stuff, so I don't think any change is going to happen soon (although if anyone wants to work on it, I'm very happy to help!). And I think you're right that the contributors list makes people feel good — certainly I've at times enjoyed coming to the end of reading something on my Kobo and seeing people listed there. But I do wonder if by adding all image and (potentially) template contributors, we're not actually making the list less useful. Would it be better to make it a list of 'proofreaders'? i.e. we could extract names of people who have changed the proofreading status of pages in the work. (Although, that'd exclude people who do proofreading but don't change the status… so perhaps needs more thought!) Sam Wilson 23:57, 1 October 2022 (UTC)
Why exclude people who do the image work for a text? Some people do mostly image work. And why should not those technical contributors that made the templates that made the proofreading work possible get credit? Inductiveload, for example, who has made many of our most core templates used across a large proportion of our texts (not that I think IL particularly cares about being credited here, but for the sake of argument…).
If you go that route you really do need to include everyone. Xover (talk) 04:41, 2 October 2022 (UTC)
@Samwilson I think that all the volunteers who edited the main and page NS, as well as those who prepared images, should be there, as before. Until now (before improving WsExport) technical items (copyright template images) were excluded from this list by adding the "link =" parameter to "[[File | ...]]" - now it doesn't work anymore. Many people in pl ws were also surprised that the list got so long, especially since in pl ws we do not treat it as fulfilling the attribution requirement of CC BY. In my opinion, adding template editors to this list would "blur" its motivational aspect even more (and yes, it actually will make the list less useful). It would be useful, however, if a new class ("ws-nocontrib") could be used to mark technical items for which we do not want to add editors to the lists. Zdzislaw (talk) 19:33, 2 October 2022 (UTC)
@Zdzislaw: Thanks, I didn't realise that unlinked images used to be excluded. Is that something we want to return to? I feel like it's a slightly arbitrary way to designate an image as being 'of the work'; perhaps as you say a new class should be added for the images that shouldn't be included. But what about template authors? How do we determine which templates are of the work and which are not? Sam Wilson 02:07, 5 October 2022 (UTC)
@Samwilson Difficult questions... I think it would be good to make it "backward compatible"... at least by using new exclusion classes (then local ws could decide for themselves). Regarding template contributors - also as an option. Zdzislaw (talk) 20:22, 17 October 2022 (UTC)
@Zdzislaw: I should clarify: by "bragging rights" I don't mean "bragging" in the sense with negative connotations. I mean only as distinct from the legal obligation of the license to provide attribution. And I certainly understand the needs of community building and of keeping a small volunteer community going, and wouldn't want to try to speak for plWS. But here on enWS, my impression is that the community in general (I'm sure there are exceptions) doesn't really care that much about the "bragging rights" aspect, and when balanced against the clutter of listing every contributor for every edit of every resource used in a download, would fall down on the side of leaving it out. We fulfil the legal obligation (such as it is) with the existing link back on-wiki, and if in future there is some on-wiki way to see all contributors to a particular text that can be added as a bonus. Xover (talk) 04:50, 2 October 2022 (UTC)

Can I add a link to Wikisource/Help in this page's header, without having to start a new topic?

Everything is in the title? — ineuw (talk) 10:27, 29 September 2022 (UTC)

@Ineuw: I just now saw this. I don't understand the question. And I don't understand why you posted it in the "Repairs and moves" section. Can you clarify? Xover (talk) 19:09, 15 October 2022 (UTC)
Of no great importance. I want to navigate directly from the Scriptorium to the Scriptorium/Help posts. Since the Scriptorium page header already has a link which starts a post in the Help page, I am proposing to add another link, perhaps added to the Help page in the sidebar. — ineuw (talk) 03:22, 16 October 2022 (UTC)
@Ineuw: The header on Wikisource:Scriptorium already has a (text) link to Wikisource:Scriptorium/Help. Not all that prominent, I grant, but that header is pretty busy as it is so adding more big visible stuff will probably not help. Xover (talk) 07:30, 16 October 2022 (UTC)
@Ineuw: PS. I moved the thread from the "Repairs (and moves)" section to where it will be more visible. Xover (talk) 07:31, 16 October 2022 (UTC)
@Xover: Thanks. That is the link I was looking for. "Not prominent" is being kind. Not noticing it after all these years is worse. — ineuw (talk) 05:19, 17 October 2022 (UTC)

Make last_initial optional in Author template

I'd like to suggest we make the |last_initial= parameter of {{author}} optional, because it can be derived from |lastname= (which itself can be pulled from Wikidata). Discussion here: Template_talk:Author#Move_last_initial_to_Module:Author? Sam Wilson 02:21, 29 September 2022 (UTC)
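To make the idea concrete, here is a small Python sketch of deriving the value from the last name while still allowing an explicit override. It is not the code of {{author}} or Module:Author (which would be written in Lua); the function name `derive_last_initial` is hypothetical, and the rule used (the first two letters of the last name, capitalized) is an assumption about how the value is normally filled in.

```python
def derive_last_initial(lastname, override=None):
    """Return the sort initials for an author.

    Assumed rule: first two letters of the last name, first one uppercase.
    An explicit override (the existing parameter) always wins, which covers
    names where the default would be wrong.
    """
    if override:
        return override
    # Ignore spaces, apostrophes and other non-letters when picking letters.
    letters = [ch for ch in lastname if ch.isalpha()]
    return "".join(letters[:2]).capitalize()


print(derive_last_initial("Shakespeare"))
print(derive_last_initial("de la Mare", override="Ma"))
```

Keeping the override is the design point: the parameter becomes optional, not removed, so editors can still set it for the exceptions.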

  Support, long-needed DRYification. PseudoSkull (talk) 15:13, 29 September 2022 (UTC)
  Support. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:21, 3 October 2022 (UTC)
  Oppose per discussion on Talk page. --EncycloPetey (talk) 14:50, 5 October 2022 (UTC)
Are there really enough exceptions? I have very rarely had to default the sorting to anything but the last word in the author's name. I think making it optional wouldn't cause any problems whatsoever that would be any different from problems that would come about by requiring it. I already preload the headers by the way through a Wikisource gadget, do you think I shouldn't do that either? Editors will know when the initials need to be different than the default if they weren't required at least just as much as if they were. PseudoSkull (talk) 18:18, 21 October 2022 (UTC)
  Support.--Jusjih (talk) 21:55, 28 October 2022 (UTC)