Wikisource:Administrators' noticeboard

Administrators' noticeboard
This is a discussion page for coordinating and discussing administrative tasks on Wikisource. Although its target audience is administrators, any user is welcome to leave a message or join the discussion here. This is also the place to report vandalism or request an administrator's help.
  • Please make your comments concise. Editors and administrators are less likely to pay attention to long diatribes.
  • This is not the place for general discussion. For that, see the community discussion page.
  • Administrators, please use template {{closed}} to identify completed discussions that can be archived.
Wikisource snapshot

No. of pages = 3,084,485
No. of articles = 795,073
No. of files = 22,281
No. of edits = 10,230,493


No. of pages in Main = 471,297
No. of pages in Page: = 2,230,442
No. validated in Page: = 438,364
No. proofread in Page: = 719,777
No. not proofread in Page: = 876,940
No. problematic in Page: = 31,233
No. of validated works = 4,026
No. of proofread only works = 2,777
No. of pages in Main with transclusions = 263,431
% transcluded pages in Main = 55.89


No. of users = 2,949,922
No. of active users = 451
No. of group:autopatrolled = 465
No. in group:sysop = 26
No. in group:bureaucrat = 2
No. in group:bot = 22


Checkuser requests

  • Wikisource:checkuser policy
  • At this point in time, English Wikisource has no checkusers, and requests need to be undertaken by stewards
    • it would be expected that requests on authentic users would be discussed on this wiki prior to progressing to stewards
    • requests by administrators for identification and blocking of IP ranges to manage spambots and longer term nuisance-only editing can be progressed directly to the stewards
    • requests for checkuser

Bureaucrat requests

Page (un)protection requests

Request protection of Main Page templates

According to the very first point under Wikisource:Protection_policy#Special_cases “The main page should always be protected…”, yet this edit took place today. Some care and attention, please? (Normally Phe-bot is the sole updater of Template:ALL TEXTS!) 114.73.248.245 17:57, 15 October 2018 (UTC)

Thanks. I have placed soft protection on the page. — billinghurst sDrewth 20:09, 15 October 2018 (UTC)
  Comment To fellow administrators, I have upped the protection on a couple of templates that won't need updating. I have a question about Template:Highlights: should this be sitting at semi/soft? If we are unlikely to change it, then we should be protecting it further. — billinghurst sDrewth 20:27, 15 October 2018 (UTC)

Other

Resource Loader issue needs outside guidance

The more I read up on this RL change and the subsequent actions needed (or taken?), the more I get the feeling some of my approach to site wide & gadget .js/.css organization over the months is going to be behind this week's latest problems. If that winds up being the case, then I'm truly, truly sorry for that. Let me try to document those steps and the reasoning behind them in hopes someone (@Krinkle:) can make sense of our current state and put us on the right path post RL change(s).

Originally, we not only had a ridiculous amount of scripting and .css definitions in our primary site-wide MediaWiki files to begin with, but those primary MediaWiki files also called a number of stand-alone .js/.css files unnecessarily, in addition to calls to various sub-scripts, on top of any User: selected gadgets being called -- some of which eventually became default loaded per consensus, etc.

A simple depiction of the key files mentioned minus any Gadgets basically went like this...

Over several months with help of other folks, I began to consolidate and/or eliminate as many scripting calls as I could -- creating optional Gadgets whenever possible -- and tried much the same for the .css class definitions. The rationale behind doing this can be found in several places, most importantly: Wikipedia. The premise to keep the MediaWiki site-wide files "lean" goes like this....

 /**
 * Keep code in MediaWiki:Common.js to a minimum as it is unconditionally
 * loaded for all users on every wiki page. If possible create a gadget that is
 * enabled by default instead of adding it here (since gadgets are fully
 * optimized ResourceLoader modules with possibility to add dependencies etc.)
 *
 * Since Common.js isn't a gadget, there is no place to declare its
 * dependencies, so we have to lazy load them with mw.loader.using on demand and
 * then execute the rest in the callback. In most cases these dependencies will
 * be loaded (or loading) already and the callback will not be delayed. In case a
 * dependency hasn't arrived yet it'll make sure those are loaded before this.
 */

The result of that effort as it stands today can be depicted basically like this....

The predominant change in order to move towards the previously cited rationale & approach is that the bulk of the scripting and class definitions now reside in the default-enabled Site gadget files, MediaWiki:Gadget-Site.js & MediaWiki:Gadget-Site.css. And by no means is the current state the desired final approach; it's been a work in progress as time allowed over several months.

Obviously, now with the recent change to Gadgets and ResourceLoader, either the existing rationale or my attempts (or both) are no longer in harmony -- if they ever were. In my view, we need someone like Krinkle (or maybe the collective minds of Wikitech-l?) to take the time and attention needed to come in here and straighten all this out -- one way or the other. My gut tells me THAT will resolve the reported loss of one thing or another post-RL change(s). Again, if I'm right about my actions exacerbating problems for others, I apologize and take full responsibility. -- George Orwell III (talk) 20:54, 8 August 2015 (UTC)

I've made a few minor changes in addition to yours that hopefully make things work a bit more like you intended. I'm happy to provide further guidance but that probably works better for a more specific need or question. Perhaps bring it up on Wikitech-l or on IRC so I can help you move forward with any unresolved issues. Krinkle (talk) 21:37, 20 August 2015 (UTC)

Interface administrators

Hi. Please see https://www.mediawiki.org/wiki/Topic:Unisfu5m161hs4zl. I do not remember if this was already discussed and how it is going to be addressed. Comments and suggestions welcome.   Comment As far as I am concerned I would trust any admin who feels skilled and confident enough to tackle such edits.— Mpaa (talk) 21:05, 29 October 2018 (UTC)

I can handle the technical aspects of it. However, it can take me a while to get around to tasks that take longer than a few minutes, so I don't want to create a false expectation of being able to handle time sensitive matters on my own. —Beleg Tâl (talk) 02:35, 30 October 2018 (UTC)


We should decide how to address the fact that EnWS has no m:interface administrators. I see basically the following options. Please add/amend as you feel appropriate.

Option A - Assign right on demand when needed

Option B - Assign right permanently to willing Admins, to be reviewed in the confirmation process

As I said above, I am for the simplest one.— Mpaa (talk) 21:28, 30 October 2018 (UTC)

Option C - Assign right permanently to selected Admins, after approval process, to be reviewed in the confirmation process

Option C sounds like you're being volunteered (based on the lack of the word 'willing'). ;) --Mukkakukaku (talk) 06:27, 31 October 2018 (UTC)

Option D - assign the rights to all the admins, who have already been vetted for community approval, and then whoever has the ability and desire can make use of it as they will and as needed. —Beleg Tâl (talk) 13:33, 31 October 2018 (UTC)

Option D would make the most sense for us. For anyone to get themselves to the point that we trust them with the admin tools just so that they can mess around in the interface, they would be playing a very long game. Beeswaxcandle (talk) 22:05, 2 November 2018 (UTC)
I agree with Beeswaxcandle, Option D, although I would also be fine with the right only going to admins who express an interest. BD2412 T 23:00, 2 November 2018 (UTC)
It is so rare I disagree with Beeswaxcandle but this must be one of those times. The whole point of this change is to prevent the ignorant from accidentally screwing up - insulting as the implications undoubtedly are! As such under the new regime trust is no longer enough; perhaps somebody ought to draw up some kind of eligibility examination…? 114.73.248.245 23:03, 2 November 2018 (UTC)
That hasn't been an issue for us yet, and accidental changes are easily reversed. If we had more users it would be more of a problem, but as it stands this kind of distinction is more cumbersome than helpful in my opinion. —Beleg Tâl (talk) 00:08, 3 November 2018 (UTC)
As much as I like the idea of making all existing admins interface admins, IA was separated from regular adminship specifically to reduce the attack surface (from hackers), and it would be pretty dangerous if the access fell into the wrong hands. I'd rather propose that existing admins request the right from a bureaucrat, to be granted at the bureaucrat's discretion and automatically removed if it goes unused for two months. Viztor (talk) 02:13, 10 August 2019 (UTC)
  •   Comment we discussed it when the rights were split, and it was agreed that it could be assigned on a needs basis. That has been done at least once for me with the temporary assignation of the IA rights. — billinghurst sDrewth 05:58, 10 August 2019 (UTC)
    Note that WMF Legal requires 2FA to be enabled for users who are to be assigned this right, so bureaucrats will have to verify this before doing so. MediaWiki's 2FA implementation is also sufficiently finicky that one may not want to enable it without proper consideration. --Xover (talk) 08:21, 10 August 2019 (UTC)
    What's wrong with the 2FA implementation? I haven't had any issues with it at all. —Beleg Tâl (talk) 22:17, 10 August 2019 (UTC)
    Ah, sorry, I should have been more clear. I am going on hearsay, mostly from admins on enwp (a crotchety bunch if ever there was one), and my own assessment of the documentation at meta. The main complaints are that the implementation in general is a little bit primitive (as is to be expected since WMF rolled their own instead of federating with one of the big providers), and that there is no way to regain access to your account if something goes wrong with the 2FA stuff (if your phone is stolen etc.) unless you happen to know one of the developers personally. None of these are in themselves showstoppers, and many people are using it entirely without issue. The phrasing sufficiently finicky that one may not want to enable it without proper consideration was not intended to discourage use, but merely to suggest that it is worthwhile actually giving it a little thought before requesting it be turned on. --Xover (talk) 17:52, 11 August 2019 (UTC)
    Okay, gotcha. As it happens, Wikimedia 2FA does include emergency access codes for use when your phone is unavailable. —Beleg Tâl (talk) 19:56, 11 August 2019 (UTC)

Formal requirements related to 2FA

Picking up this again…

I finally got so annoyed by our inability to fix even simple stuff that requires Interface Admin permissions that I hopped over to meta to figure out what the actual requirements are (versus the should stuff). As it turns out, the 2FA stuff is (surprise surprise) as half-baked as most such Papal bulls from the WMF: 2FA is required for intadmin, but there is no way for bureaucrats to actually check whether an account has that enabled. The result of this is that even on enwp (where they take this stuff really seriously) they do not actually try to verify that 2FA is enabled before they hand the permission out: they check that the user is in the right group so that they can turn on 2FA, remind the person in question of the requirement, but otherwise take it on faith (trust). There's a request in for the technical capability to verify 2FA (and I think Danny is even working on it), but it seems mostly everyone's waiting for 2FA to be enforced by the software.

Meanwhile, anyone with existing advanced permissions (i.e. +sysop) has the capability to enable 2FA, and anyone with a particular reason (e.g. that they need it to get Interface Administrator permission) can apply to be a "2FA Tester" and thus gain the ability to turn it on.

The net result is that our bureaucrats (ping Hesperian and Mpaa) can assign this permission so long as we somehow somewhere make at least a token effort to make sure those getting the bit have 2FA enabled. Whether that's an addition to, or footnote on, Wikisource:Adminship, or the bureaucrats asking/reminding the user when it comes up, or… whatever… I have no particular opinion on. Since the previous community discussions have been actively adverse to regulating this stuff in detail, and absent objections, I think "Whatever Hesperian and Mpaa agree on" is a reasonable enough summary of consensus.

I still think we should have an actual policy for Interface Administrators (or section on it in Wikisource:Adminship) and some facility for permanently assigning the permission (à la +sysop; but intadmin tasks are not one-and-done like +sysop tasks, they often require iterative changes over time and need to fit into an overall architecture), but so long as there is no appetite for that, we at least need something that we can point to and say "That's how we handle the 2FA requirement" if the WMF should ever come asking. --Xover (talk) 07:37, 10 February 2020 (UTC)

Judicious cleaning required from Special:UnusedFiles

I was just poking my head into Special:UnusedFiles. There are a significant number of images that utilise {{raw page scan}} that should be checked and, if truly unused, deleted, as the file has been transwiki'd to Commons. Note: always physically check their usage, as I have previously seen that the NOT USED assessment is not always accurate.

Checking and deleting process:

  • Use the Page:… link
  • At Page:… check that there is a Commons loaded image in place (and no use of template:raw image)
  • grab the new filename
  • click back to the local image, delete the image, noting "File transwiki'd" and paste in the new filename (preferred not mandatory)

If admins could do 10 to 20 a session, we should get through them in a month or so. — billinghurst sDrewth 09:26, 14 November 2019 (UTC)

  Comment TIP: when doing the image check you can even take the time to validate the proofread page with the image (very often sitting in proofread status). — billinghurst sDrewth 09:31, 14 November 2019 (UTC)
@Billinghurst: Considering the list is maxed out at 5k, there's no telling how many more of them there are (we have over 20k files tagged as raw page scans, possibly more that are untagged), so it's highly unlikely we'll get through that in a month. But it's certainly something we need to start chipping away at.
And we should possibly even start considering more drastic measures, like periodically bot-deleting anything in Category:Raw page scans for missing images that isn't used anywhere (including inbound links). The manual processing is tedious and time-consuming, and provides very little additional value compared to an automated approach (linking to the replacement image in the deletion log, mostly, and that has marginal value at best). We'd need to check closely whether the category contains files that could be caught as false positives in such a run, but barring such pitfalls automation may be both the best option and the only realistic way to ever clear out this backlog (we have plenty of other image-related backlogs where human attention is necessary).
Oh, PS, DannyS712 has a neat user script at User:DannyS712/Change status.js that makes cases such as this a lot quicker. I'm not sure they consider it ready for prime-time (I don't think it's been advertised anywhere), so caveat emptor, but I've been using it a good bit today and seen no problems. To use, add importScript('User:DannyS712/Change status.js'); to your common.js. --Xover (talk) 20:18, 14 November 2019 (UTC)
If you want to use the script, it adds a link to the function (next to the "move" function) that will, if the page is "not proofread" or "problematic", mark it as "proofread". If it is already "proofread", and the user can mark it as validated, it marks it as validated instead. Let me know if there are any questions. Thanks, --DannyS712 (talk) 20:34, 14 November 2019 (UTC)
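The status rules described in the comment above can be sketched in a few lines. This is only an illustration of the stated behaviour, not the code of User:DannyS712/Change status.js; the function name `next_status` and the `can_validate` flag are hypothetical, and leaving every other case unchanged is an assumption.

```python
def next_status(current: str, can_validate: bool) -> str:
    """Return the status a page would move to under the rules described above."""
    if current in ("not proofread", "problematic"):
        return "proofread"
    if current == "proofread" and can_validate:
        return "validated"
    # Assumption: any other case (already validated, or the user may not
    # validate) leaves the status as it is.
    return current


if __name__ == "__main__":
    for status in ("not proofread", "problematic", "proofread", "validated"):
        print(status, "->", next_status(status, can_validate=True))
```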
@Billinghurst: I agree with @Xover: that this kind of task should be automated. Mpaa (talk) 21:29, 15 November 2019 (UTC)
  Comment At this point I would think that the task is to start to chip away. I don't see that there is urgency in cleaning this space, so long as we start. So what if it takes three months; heck, I have works that I dip in and out of for years. As I said, I have seen multiple issues of the tool being wrong in the past; if we can demonstrate that this is no longer the issue, then maybe we can look to bot removal. I thought the admin review, and the process of validating, was beneficial.

P.S. Those quiescent admins, and those who find it hard to identify tasks to undertake are given a gift here! — billinghurst sDrewth 22:50, 14 November 2019 (UTC)

Having done about a hundred of these by hand… I'd say the realistic best case sustained rate here is something like 5 admins doing 5 files per day for 5 days a week. That's an aggregate rate of about 500 per month. If the number is 5k that means 10 months to get through it. If the number is 20k that's 40 months, or just shy of 3.5 years. I don't have sufficient data for an accurate estimate of net time, but assuming a range of 30-60 seconds per file, at 5k files that's an aggregate ~40–80 net admin-hours expended. At 20k files that's ~160–320 net admin hours. Assuming an 8 hour work day, that's one dedicated admin working flat out only on this for between one week (5k/30s) and 2 months (20k/60s). With no lunch break, by the way. That's a pretty high cost.
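For anyone who wants to rerun the estimate above with different inputs, here is a minimal sketch of the same arithmetic. The function names are made up for illustration, and the four-weeks-per-month figure is an assumption chosen to match the "about 500 per month" rate.

```python
def files_per_month(admins=5, files_per_day=5, days_per_week=5, weeks_per_month=4):
    """Aggregate manual processing rate; weeks_per_month=4 is an assumption."""
    return admins * files_per_day * days_per_week * weeks_per_month


def months_to_clear(total_files, monthly_rate):
    """Calendar months needed at a sustained monthly rate."""
    return total_files / monthly_rate


def admin_hours(total_files, seconds_per_file):
    """Net admin time in hours at a fixed handling time per file."""
    return total_files * seconds_per_file / 3600


if __name__ == "__main__":
    rate = files_per_month()
    for total in (5_000, 20_000):
        low, high = admin_hours(total, 30), admin_hours(total, 60)
        print(f"{total} files: {months_to_clear(total, rate):.0f} months, "
              f"{low:.0f}-{high:.0f} admin-hours")
```

Changing `admins` or `files_per_day` shows how sensitive the backlog estimate is to participation.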
On the upside, manual processing tags the logs for deleted files with a link to the replacement images. But that only matters if you're actually looking at the deleted file, and for these raw page scans that is essentially never going to happen. Having a human in the loop also helps guard against MediaWiki bugs in categorising etc., but while, yes, that does happen, it's been years since I've run into that kind of bug anywhere that would matter here. What usually happens is that counts and references fail to update properly when pages are deleted, so you get categories saying they have members, but in reality the relevant items have already been deleted; and these eventually get cleared out by periodic maintenance tasks.
In other words, doing this manually is expensive, carries a significant opportunity cost, and lacks a concomitant value. Automating it obviously carries risks (automatically deleting up to 5-20k files should never be done lightly). But with appropriate checks (for example, all files listed on Special:UnusedFiles that are also in category Raw page scans and that have no incoming links in WhatLinksHere), manual spot checks, and going in batches… the risk should be eminently manageable. --Xover (talk) 09:16, 16 November 2019 (UTC)
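The three conditions proposed above (listed as unused, in the raw page scans category, no incoming links) amount to a set intersection plus a filter. The sketch below works on plain in-memory data so the logic can be spot-checked; it does not query the wiki, and the function names and file names are hypothetical.

```python
from typing import Dict, Iterator, List, Set


def deletion_candidates(unused: Set[str],
                        raw_scans: Set[str],
                        inbound_links: Dict[str, List[str]]) -> List[str]:
    """Files that are unused, tagged as raw page scans, and not linked from anywhere."""
    return sorted(
        name for name in unused & raw_scans
        if not inbound_links.get(name)  # a missing or empty entry means no incoming links
    )


def in_batches(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield fixed-size batches so each run can be spot-checked before the next."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    unused = {"Example.djvu-1.png", "Example.djvu-2.png", "Logo.png"}
    raw_scans = {"Example.djvu-1.png", "Example.djvu-2.png"}
    links = {"Example.djvu-2.png": ["Page:Example.djvu/2"]}
    for batch in in_batches(deletion_candidates(unused, raw_scans, links), 50):
        print(batch)
```

Dumping each batch to a sandbox page before deleting keeps a human spot check in the loop.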
I can shortly work out a script that scans category Raw page scans and checks for the conditions for deletion (and, if they are met, deletes). If you are OK to test small batches, let me know. Mpaa (talk) 14:52, 16 November 2019 (UTC)
@Mpaa: Can you have it run up a list of files and dump it in a sandbox somewhere so we can spot check the logic? Maybe a hundred or so files that the script thinks should be deleted, and, if relevant, the ones it thinks shouldn't be. Better to find any holes in the logic before we start deleting stuff. --Xover (talk) 15:10, 16 November 2019 (UTC)
Here: res sandbox. Mpaa (talk) 16:36, 16 November 2019 (UTC)
@Mpaa: Excellent! I've spot-checked pages from most of the works represented in that list and found none incorrect. I'd have no objection to running that (in batches so it can be checked; there's bound to be some pathological edge case out there somewhere). --Xover (talk) 17:12, 16 November 2019 (UTC)
@Xover: I have done a small test batch of 45 pages as Mpaa. Mpaa (talk) 18:29, 17 November 2019 (UTC)
@Mpaa: Ok, I've spot-checked 2–3 files from each work in those 45, and find no real problems. The only issue I see is that the deletion log for File:A book of the west; being an introduction to Devon and Cornwall.djvu-453.png links to c:File:A book of the west; being an introduction to Devon and Cornwall.djvu instead of c:File:A Book of the West - ALMS HOUSES, S GERMANS.png; and ditto for File:A book of the west; being an introduction to Devon and Cornwall.djvu-223.png that points to c:File:A book of the west; being an introduction to Devon and Cornwall.djvu instead of c:File:A Book of the West - LAKEHEAD, KISTVAEN.png. --Xover (talk) 21:28, 17 November 2019 (UTC)
@Xover: Thanks, I fixed it and ran another ~40 pages. Mpaa (talk) 18:33, 18 November 2019 (UTC)
I'm wondering if we can set up a scrolling gallery that does nothing but compare our page image side-by-side with the comparable Commons file. An editor could scroll through and eyeball any differences fairly quickly. BD2412 T 22:52, 18 November 2019 (UTC)

Poking at this again…

I found no problems with Mpaa's test bot run, and we still have potentially ~20k files sitting there that it would be a waste of admin resources to process manually. Can we pull the trigger on a mass delete of these? If not, what are the concerns? --Xover (talk) 08:26, 18 February 2020 (UTC)

I ran about 15 pages to check that everything is still OK. Mpaa (talk) 21:47, 18 February 2020 (UTC)

Title blacklist updated to prevent invisible characters in page names

Based on this discussion I have added rules to the title blacklist to prevent the creation of pages with invisible Unicode characters in the name. Users hitting this rule should see the custom error message at MediaWiki:titleblacklist-invisible-characters-edit. --Xover (talk) 09:05, 5 December 2019 (UTC)

Should the “Emojis, etc. Very few characters outside the Basic Multilingual Plane are useful in titles” section be there? First, it’s clear that MediaWiki:titleblacklist-invisible-characters-edit is not a helpful message in that case. Secondly, there seem to be some quite useful characters there; the Mathematical Alphanumeric Symbols block exists basically just so we can include mathematical titles in a plain text format. As for the rest of them, if we allow Chinese, we should allow all of Chinese; if we do enough academic work, we’re going to have Hieroglyphics and ancient Chinese in article titles.--Prosfilaes (talk) 11:26, 5 December 2019 (UTC)
@Prosfilaes: In page names? Where we recently had community consensus to not even permit curly quotes (as part of the discussion to permit them in page content) because that’d be too fiddly? In any case, both the error message and the rules can be tweaked if needed. The current rules are an attempt to prevent stuff like WORD JOINER and friends that are invisible and cause issues for people trying to work with such pages. I (currently) have no particularly strong opinions on the issue above a vague inclination towards limiting page names to roughly ASCII (on enWS, of course, other language projects have different needs). --Xover (talk) 11:52, 5 December 2019 (UTC)
What’s magical about page names? Page names need to match the names of the works they contain. Curly quotes are a special case; note that there was no restriction on characters in pages, except for curly quotes. If an article title is “Čapek’s works in English” or “Injections from 𝕎 to 𝕁”, then why should the page name be any different?--Prosfilaes (talk) 13:55, 5 December 2019 (UTC)
Page names have technical and practical concerns (people's ability to enter them, display them, search for them, etc.) that mean we should constrain them in some fashions; and at the same time we already do stuff like drop “The” from page titles to facilitate automatic sorting (which I actually disagree with, but that’s neither here nor there). The characters we disallow in the current rules are also exceedingly rare in practice, and the blacklist can be overridden by any admin at need, so I don’t think it is a problem we should expend too much effort on until and unless we start seeing actual cases where it causes problems.
Č (U+010C: LATIN CAPITAL LETTER C WITH CARON) is in the Latin Extended-A block, which is a part of the Basic Multilingual Plane, which the current ruleset lets through. 𝕎 (U+1D54E: MATHEMATICAL DOUBLE-STRUCK CAPITAL W) and 𝕁 (U+1D541: … J) are part of the Mathematical Alphanumeric Symbols block (most common mathematical symbols are in the BMP; these are essentially font variations: bold, italic, Fraktur, etc.) of the Supplementary Multilingual Plane, which contains all the exotic and ancient stuff (Linear B, Coptic, Hieroglyphs, etc.) plus a good chunk of Emoji, so they would be disallowed by the current rules, but we can whitelist ranges if we discover that we need them.
That all being said, I am by no means married to the current rules, so whatever the consensus is is fine by me; and I would, of course, be happy to help implement whatever that consensus is if needed. Most of the examples you mention above (the extended maths stuff, hieroglyphics, etc.) are contained in distinct blocks (Emoji aren’t, and I believe Chinese is also split up in inconvenient ways when you want to handle everything, but most of the rest) that should be relatively straightforward to whitelist. --Xover (talk) 13:32, 6 December 2019 (UTC)
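To make the distinction in this thread concrete, the sketch below classifies the characters of a title the way the discussion does: invisible format characters such as WORD JOINER versus characters outside the Basic Multilingual Plane. It is an illustration in Python using the standard `unicodedata` module, not the actual title blacklist rules (which are regular expressions this page does not quote); the function name and the use of general category `Cf` as the test for "invisible" are assumptions.

```python
import unicodedata
from typing import List, Tuple


def title_problems(title: str) -> List[Tuple[str, str]]:
    """List (code point, reason) pairs for characters a rule set like the one discussed might reject."""
    problems = []
    for ch in title:
        code_point = f"U+{ord(ch):04X}"
        if unicodedata.category(ch) == "Cf":
            # Assumption: general category Cf (format) stands in for "invisible".
            problems.append((code_point, "invisible format character"))
        elif ord(ch) > 0xFFFF:
            # Code points above U+FFFF lie outside the Basic Multilingual Plane.
            problems.append((code_point, "outside BMP"))
    return problems


if __name__ == "__main__":
    samples = [
        "\u010Capek",                                # starts with U+010C, inside the BMP
        "Injections from \U0001D54E to \U0001D541",  # double-struck W and J
        "word\u2060joiner",                          # contains WORD JOINER
    ]
    for sample in samples:
        print(repr(sample), title_problems(sample))
```

Whitelisting a block, as suggested above, would amount to skipping the second check for code points inside that block's range.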
I agree that "it is {not} a problem we should expend too much effort on until and unless we start seeing actual cases where it causes problems." I think that we should remove the restriction; it is easy to deal with a few poorly named or spammish pages on patrol, and bad to frustrate innocent users with a misleading and likely irrelevant error message.--Prosfilaes (talk) 01:16, 10 December 2019 (UTC)

Request for an interface admin to edit MediaWiki:Gadget-ocr.js

Hi!

Since hOCR is currently buggy, please see mul:Wikisource:Scriptorium#Request_for_an_interface_admin_to_edit_MediaWiki:OCR.js : you can edit your local gadget to use the fallback OCR as the default one, just by commenting out the if condition in the hocr_callback() function. —Pols12 (talk) 12:58, 21 December 2019 (UTC)

Sockpuppets of User:Lupste

User:Lupste has been globally locked. Suspected sockpuppets of this account that have since appeared on Wikisource:

--EncycloPetey (talk) 20:15, 23 December 2019 (UTC)

Poems of Sappho and other works deleted as copyvio which have meanwhile slipped into PD

The Poems of Sappho were deleted in 2013 as copyvio and moved to Wikilivres (which has unfortunately disappeared). Looking at Author:Edwin Marion Cox the work was published in 1924 and so it should be in PD now. Is it possible to restore it? It would be also great if it were possible to find other deleted works of this kind which later slipped into the PD and restore them too. --Jan Kameníček (talk) 14:23, 15 February 2020 (UTC)

As I commented elsewhere, our copy of The Poems of Sappho was a tiny fragment of the whole work, and not the first part either. Wikisource standards have changed, and we wouldn't host such an extract these days. For that work, it would be advisable to start from scratch. --EncycloPetey (talk) 17:21, 15 February 2020 (UTC)

Speedy policy on raw OCR text?

There are currently 30 pages in Category:Speedy deletion requests, that are posts of raw OCR, requested for deletion by someone (@Ratte:) who wants to recreate them properly. This seems reasonable to me: posts of raw OCR are generally unhelpful, and doubly so if they are interrupting the workflow of someone who wants to proofread.

Does this fall within a speedy delete criterion? Maybe "process deletion"? If so, are edits needed to make that explicit?

(I'm not sure I would support a 'nuke them from orbit' approach to raw OCR posts; I'm just talking about situations like this where the pages are interrupting a workflow)

Hesperian 00:07, 4 March 2020 (UTC)

But is that the raw OCR or the text layer of the file? If it's the text layer in the file, then exactly the same text will show up upon recreation of the page. --EncycloPetey (talk) 00:58, 4 March 2020 (UTC)
In the past when I've deleted pages to restore the text layer, I've used M1-Process deletion. However, @EncycloPetey: is correct, this is the text layer, so there's no point. Beeswaxcandle (talk) 05:27, 4 March 2020 (UTC)
I have scripts that only trigger when creating a page. I could tweak them, sure, but on the rare occasion when I hit a text layer dump, it's easier to delete and re-create the pages. So I am open to the possibility that these pages are interfering with Ratte's workflow.
At this point I am inclined to action the deletions under M1, but leave the policy as it is.
Hesperian 23:40, 4 March 2020 (UTC)
@Ratte: Are you using special scripts, or just working from the existing text layer? Your input would be welcome. --01:13, 5 March 2020 (UTC)

  Comment These are "not proofread" pages without formatting; just delete them. What is the loss? How is it different from a new version being uploaded, or anything similar where we just man-handle the pages to meet needs? — billinghurst sDrewth 04:55, 5 March 2020 (UTC)

(None of the pings were successful; I've seen this discussion by chance.) This has been my usual practice since ru.wikisource: I nominate for SD non-proofread pages created by another user and then recreate them myself with proofreading. Why? Because there's no guarantee that the other user created them without losing text, that's all. Maybe it's the text layer of the file; maybe not, you cannot know. It's just for ease of work. EncycloPetey has rejected my nominations, so there's nothing left to discuss. Thanks. Ratte (talk) 11:32, 5 March 2020 (UTC)

There's no guarantee that recreating the text layer will have all the text either. All proofreading should be done by comparing the scan with the edited text, regardless of its source. You cannot rely on the text layer to have all the text, the correct punctuation, or anything. If you are simply using spellcheck, and not comparing against the original, then you are not proofreading. --EncycloPetey (talk) 01:27, 6 March 2020 (UTC)
Do my contributions lead to the conclusion that I am simply using spellcheck, and not comparing against the original? I just wanted the original raw material (without possible outside interference) for further proofreading. It’s sad that I couldn't get any understanding or help. Ratte (talk) 07:32, 6 March 2020 (UTC)
@Ratte: I feel your frustration. I also find such "not proofread" pages that are just raw OCR dumps a hindrance to proofreading, and see little if any value in them. But the issue is that we do not have a policy to directly address this specific issue. Nowhere that I have found (not even in help pages or style guidance) do we discourage or prohibit these, and there are long-standing community members that have the, at least occasional, practice of doing so (their motivation is incomprehensible to me, but that may just be my failing). Absent that it is not clear that administrators actually are permitted to delete such pages. We sometimes play fast and loose with such strictures when that seems to benefit the project, and the community has generally supported that, but there are limits to how far we can stretch that, and, speaking only for myself, this is an instance that would have given me pause (but I really wouldn't have batted an eye if someone had deleted them either). For that reason I would argue that we should have a policy addressing this, but that would take a community discussion and a formal proposal that may be felt to be more effort than the issue is worth. In any case, frustrating as it may feel, there's actually a reason why you didn't get the help you needed. Please do not be discouraged by this outcome and hopefully we'll do better the next time! --Xover (talk) 10:51, 6 March 2020 (UTC)
Thanks. It's just technical deletions — pages that need to be deleted to perform non-controversial technical tasks. I am surprised that an administrator’s decision is not enough for this. Ratte (talk) 12:03, 6 March 2020 (UTC)

National Library of Scotland contributors

We've recently had several contributors from NLS create accounts and start making contributions. Presumably this is while they are in lockdown and working remotely. They don't seem to have had any guidance on how to do things here, particularly in the area of basic layout. This means that the content is being validated, but the presentation is lacking. There's also no structure for transclusion from the Page: namespace to the Mainspace. As there are in the order of 15 to 20 NLS contributors, I'm asking for some help in assisting them to bring their valued contributions up to our standards. Beeswaxcandle (talk) 22:18, 27 March 2020 (UTC)

Further to the above, I have been working off-Wiki with the NLS to develop guidelines for their staff to make useful contributions here. There are now more than 50 contributors who will be doing the initial pass-through of the OCRed text. They will be followed up by a smaller group of validators who will ensure that the layout of the pages meets our requirements. Once works are fully validated, a small team of NLS people will do the transclusions. I'll be assisting with this last while they get used to the process. The primary goal of NLS is to get the text up, so that it can be used in their searchable datamarts. A by-product is that we benefit by gaining access to works that are not held anywhere else and we may well gain some more longer-term contributors once this project is completed. Beeswaxcandle (talk) 02:21, 3 April 2020 (UTC)

User:Billinghurst admin tool misuse

The user seems to be using their admin access to indiscriminately revert contributions of other volunteers they disagree with. Can someone stop it? Jura1 (talk) 11:54, 6 April 2020 (UTC)

You have been making trivial, though problematic changes to the means of transclusion in Page: ns, and the section tags, though not aligning with section parameters; and in ways that make it hard to undo and unpick or to go in and recast a transclusion, and having to check both the main and page namespaces, and then tens of trivial changes to tags in Page: ns. Easiest way to resolve this is to undo them all, and go back where I didn't get it right, and making sure that the main ns pages have transcluded. — billinghurst sDrewth 12:35, 6 April 2020 (UTC)
@Jura1: You are due my apologies for not having communicated prior to undertaking the action that I did. I had replied to your post in the other forum, and as I intimated there, unpicking (reverting) these transclusions and changes over pages; it becomes like pulling that loose thread of cotton, it just unravels, and you go looking for the scissors and have to work out where to cut. I thought that I had time to resolve the issue, and then come back to you about why and what I had done. I had set a short period of flood protection, and was doing that. My reading of your contributions indicated to me that you were not active, and clearly I presumed wrong. — billinghurst sDrewth 13:15, 6 April 2020 (UTC)
@Jura1: Please tell us all relevant edits and logs, or we cannot help.--Jusjih (talk) 00:22, 21 April 2020 (UTC)

Whitelisting request

I would like to draw the admins’ attention to a forgotten request at WS:Scriptorium#Spam whitelist request. Can somebody help there? --Jan Kameníček (talk) 23:31, 9 April 2020 (UTC)

What is the purpose of having the link? That question was asked, but not satisfactorily answered. "I want to add it" is not a reason to permit it. --EncycloPetey (talk) 23:50, 9 April 2020 (UTC)
I don't know the answer to that question, but I'd point out:
One might just as readily ask, "what's the purpose of denying a longtime, productive editor the ability to add a link to an article that clearly has wiki relevance?" I can understand why we might want to discourage linking to a self-publishing site like Medium, which publishes lots of stuff that is not up to Wikipedia's sourcing standards, and where publication generally is not sufficient for transcription on Wikisource. But there is worthwhile content published on Medium, and sometimes there is good reason to link to it. I trust Andy to make a good judgment about stuff like that. Is there no administrator who feels the same? -Pete (talk) 04:03, 10 April 2020 (UTC)
I'd also point out, @EncycloPetey: it does not appear to me that anybody did ask him why he wanted to have the link. Where was it asked? The thing I saw was an assumption that he wanted to transcribe the contents at the link, and a denial on the basis of that (clearly mistaken) assumption. If it's important to you to have an answer to that question, perhaps you should ask it. -Pete (talk) 04:06, 10 April 2020 (UTC)
Andy sort of bit Billinghurst when Billinghurst asked about it. It didn't encourage this administrator to ask more questions.--Prosfilaes (talk) 05:15, 10 April 2020 (UTC)
I agree that Andy’s way of communication is sometimes difficult to cope with. We can let him know that we do not like his form of messages but at the same time should not ignore their contents. --Jan Kameníček (talk) 08:15, 10 April 2020 (UTC)
I'll just note that there ain't a one of us that communicates optimally all of the time. And at the same time, two people communicating suboptimally tends to multiply the original problem. Doing better at this should be a constant goal for all of us. --Xover (talk) 10:01, 10 April 2020 (UTC)
Andy is an adult and knows well about communication, and the strengths and weaknesses of his approach. I don't think that we need to be saying anything specific that hasn't been said previously in other communities.

When administrators add exclusions to the whitelist, the reason should be documented so the community knows why the exemption exists, and whether it is permanent or temporary. Having an answer is not an unreasonable process, and it is easy to do. — billinghurst sDrewth 13:54, 11 April 2020 (UTC)

If the purpose of adding the link was to create a miniature external link farm, as you suggest, then the link should not be added. --EncycloPetey (talk) 07:03, 10 April 2020 (UTC)

Lua module error for multiple Author pages

Something has been broken. On multiple Author pages, instead of an image, we now get a "Lua error in module". I've spotted the error on multiple Author pages, but a "null edit" fixes the problem. --EncycloPetey (talk) 04:26, 11 April 2020 (UTC)

Addendum: I've made null edits to all authors linked from the Main page, as well as a few high-profile authors and Classical author pages to speed clean-up. --EncycloPetey (talk) 04:38, 11 April 2020 (UTC)

A single subpage within the module was created, and it existed for 5 minutes until the issue was identified. I am surprised that 1) it flowed through and 2) stayed flowed through. <shrug> — billinghurst sDrewth 13:43, 11 April 2020 (UTC)
It's been nearly 24 hours and we still have multiple broken Author pages. We may need to run a bot to do touch-edits on all the pages in the Author namespace to fix this. --EncycloPetey (talk) 23:03, 11 April 2020 (UTC)
I am not seeing any broken author pages. I suggest you purge your cache. — billinghurst sDrewth 23:32, 11 April 2020 (UTC)
It's not my cache. I'm visiting Author pages that I've never visited from this computer, and the error is not browser dependent. I've checked on three different machines running three different operating systems with Safari (on two of them), Chrome, and Firefox. The same pages produce an error on all three machines. The error does not show up on all pages, but appears more often on pages for high-profile authors. I spent 90 minutes last night manually making null edits for as many as I could, but I'm still finding pages with Lua errors even now. --EncycloPetey (talk) 23:44, 11 April 2020 (UTC)
I have opened one hundred Special:Random/Author logged in Firefox, and 100 logged out in Chrome. Not one has an error. Please see if you can purge Module:Wikidata/i18n, thanks. — billinghurst sDrewth 06:02, 12 April 2020 (UTC)
To purge, I would first have to re-create the page, which is probably a bad idea. I cannot purge a page that has been deleted. --EncycloPetey (talk) 16:23, 12 April 2020 (UTC)
It exists/existed in your system as that is the error that you are getting, so purging it in your cache should remove it. — billinghurst sDrewth 08:21, 15 April 2020 (UTC)
As I already indicated: I purged my cache and that did not fix the problem. --EncycloPetey (talk) 14:53, 15 April 2020 (UTC)
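Note on the "touch-edit all Author pages" idea raised above: a minimal sketch of how such a batch request could be assembled for the MediaWiki action API, assuming a server-side purge with link-table update is an acceptable stand-in for a null edit (that equivalence is an assumption, not something confirmed in this thread). The namespace number 102 for Author: and the batch size of 50 are also assumptions, and the helper name `build_purge_params` is hypothetical. The sketch only builds the POST body; it sends nothing.

```python
from urllib.parse import urlencode

# Standard MediaWiki API endpoint path for this wiki.
API_URL = "https://en.wikisource.org/w/api.php"
# Assumed namespace number for Author: on English Wikisource.
AUTHOR_NS = 102


def build_purge_params(namespace=AUTHOR_NS, batch_size=50, continue_from=None):
    """Build parameters for one batch of action=purge over a namespace.

    generator=allpages walks the namespace; forcelinkupdate asks the
    server to refresh link tables as well as the parser cache.
    """
    params = {
        "action": "purge",
        "format": "json",
        "generator": "allpages",
        "gapnamespace": namespace,
        "gaplimit": batch_size,
        "forcelinkupdate": 1,
    }
    # Resume from where the previous batch stopped, if given.
    if continue_from is not None:
        params["gapcontinue"] = continue_from
    return params


# Body for the first batch; purge requests must be sent as POST.
first_body = urlencode(build_purge_params())
```

A bot would POST `first_body` to `API_URL`, read the continuation value from the response, and repeat with `continue_from` set until no continuation is returned.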

Death of User:Dmitrismirnov

At special:permalink/10084886#User:Dmitrismirnov there is the announcement of the death of Dmitry, one of our long term and quietly achieving administrators. I have protected the page with a link to the Russian Wikinews item, notified stewards, removed from admin list, and closed him on the graph at Wikisource:Administrators/Archives. — billinghurst sDrewth 08:19, 15 April 2020 (UTC)

Rights removed. billinghurst sDrewth 07:51, 16 April 2020 (UTC)

Move request

I would like to ask to move the work Evening Songs (1919) and all its subpages to the correct title Evening Songs (1920). I apologize for the mistake. --Jan Kameníček (talk) 12:19, 17 April 2020 (UTC)

  Done. billinghurst sDrewth 16:11, 17 April 2020 (UTC)

Please move page and subpages

Please move the following page, and be sure to tick the box to move all subpages:

https://en.wikisource.org/wiki/Special:MovePage/Page:History_of_California,_Volume_1_(Bancroft).djvu

Change the number "1" to "3" (this is actually a scan of volume 3).

This should be an uncontroversial one. Thanks to user:James500 for pointing out the problem. -Pete (talk) 21:40, 2 June 2020 (UTC)

@Peteforsyth: The Mediawiki limit is 100 and this work has 200+ subpages (the limit is set based on what makes sense on enWP, not enWS where we routinely need to move 1000+ pages). This is an (admin)bot request more than an admin request, is what I'm saying. And I'm not set up for bot operations. Sorry. billinghurst and mpaa: are either of you able to help here? --Xover (talk) 06:04, 3 June 2020 (UTC)
Thanks for explaining...and, sorry to make such a dumb mistake! -Pete (talk) 06:08, 3 June 2020 (UTC)
@Xover: We found that we can use the hack of creating a root page in the Page: namespace, which allows us to move 100 pages. We can cheat by recreating the root page over and over and sequentially moving the remaining "n x 100" pages without redirect by an admin, and not requiring a bot. Sure, it is a hacky loophole in the system, though quite useful, so one we haven't bothered to have closed. Just need to remember to delete the root page at the end.

Can I say that I still do not think that we should be creating all these non-proofread pages for no real reason, and there are a whole heap of reasons to not create these pages (header/footer/page numbering/missing pages/wrong name/...). It is not especially productive compared to creating a page and proofreading it. The community stopped creating these pages by bots unless there was a clear and good reason to do so, and the current processing doesn't seem to fit within that space. This user seems to think that they know better and continues to undertake the creation process. — billinghurst sDrewth 06:44, 3 June 2020 (UTC)

Addendum. I sometimes think that we should stop standard users from moving Index: and Page: namespace pages with a filter. It is just painful to recover from that at times, and painful when we cannot get fresh access to the underlying scan. — billinghurst sDrewth 06:50, 3 June 2020 (UTC)
Thanks @Billinghurst:. Re: the latter part of your comment, please note that the first specific issue you called out (header) is one that I addressed via regex and manual attention prior to the Match & Split. The last work you brought this issue up with me was The Afro-American Press and Its Editors, which now, a few short months later, is 100% proofread, >50% validated, by the efforts of three dedicated contributors, and serving as source material for English Wikipedia pages as part of a campaign drive. I have tried on several occasions to engage with you in discussion about this, because I value your knowledge and your perspective; but several times, you have vanished from the discussion, sometimes stating that it's not worth talking to me, and other times simply vanishing. Either it's worth finding some common ground on process, or it's not. I am here to discuss if you would ever like to do so. I am open to adjusting my practices, but any discussion that starts from the obviously false premise that I'm an annoyance who does not do productive work and/or is not worth talking to is unlikely to go well. Please, think about your own role in this apparent disagreement, the shape of which I do not fully understand.
On your last comment, was it unhelpful for me to do the parts of maintenance that were available to me? I had thought that by doing those, I was reducing the impact on admins to help fix the problem. But if it would have been easier to simply report the problem here and let an admin take care of it all at one go, that makes sense to me. Just let me know if that's your preference, and I'll keep it in mind if a similar situation comes up. -Pete (talk) 07:04, 3 June 2020 (UTC)
@Billinghurst: Oh. I hadn't even considered whether we might be able to work around the limit like that! I guess that's why we pay you the big bucks! :)
@Peteforsyth: In light of that I'll try to get the moves done in a little while if nobody beats me to it.
This thread now is sliding into multiple side issues, so I'm going to bite my tongue and avoid commenting on them here. But I think we should make a point of addressing them in some suitable structure (user talk or a separate thread or something). --Xover (talk) 07:15, 3 June 2020 (UTC)
@Peteforsyth: Done. Please let me know if I messed anything up. --Xover (talk) 07:48, 3 June 2020 (UTC)
@Peteforsyth: I was referring to the recent application of over a hundred thousand Page: ns pages, without proofreading, and the other issues that ensue. Nothing to any of your undertakings. — billinghurst sDrewth 12:11, 3 June 2020 (UTC)
Not having thought particularly hard about it or heard any arguments in the practice's favour, I generally agree with you on the batch creation of pages containing nothing but raw OCR. But I don't think we have any articulated policy to hang that on (I'm sure there is second-order stuff that could justify preventing it, but that feels like reaching) when polite requests have no effect. I'd suggest having a community discussion on WS:S to settle the issue, but with the poor participation there lately (cf. the Lilypond thread as one recent example; and that's not by far the worst example) I fear that would just cause conflict without a firm consensus either way to offset it, and thus I'm hesitant to go that route. --Xover (talk) 14:21, 3 June 2020 (UTC)
Thank you @Xover:, that worked perfectly. Hope it wasn't too much hassle. I agree that it would be worthwhile to have some guidance about generating pages etc. For the situation you describe, in my experience the best approach is usually for a user with a clear vision to draft an essay describing what they see as best practice. That can be discussed as proposed policy, or it can sit on a wiki page unread, or it can attract discussion and incremental improvement. It can be a strong step toward greater clarity, but it requires real work on the part of the person with a strong opinion. I suppose I might be getting to the point where I could imagine taking a stab at something like that, but there are people with far more experience and far stronger opinions than me around here, so it hasn't occurred to me until now to do so.
@Billinghurst: Since you referred to "this user," and I'm the user who brought the issue up, and I'm the user who created all the pages, and I'm the user you've complained to about similar things in the past, I don't think I can be faulted for thinking you were talking about me. Even in hindsight it seems strange that it would be otherwise. Perhaps you were talking about James (who in this case did nothing besides identifying and reporting the problem); from your user page it appears he may have a similar story to tell about having a hard time finishing discussions with you; but I'll leave that to him. -Pete (talk) 16:59, 3 June 2020 (UTC)
I came here to address an issue to which I was pinged; please don't make people regret helping. If I am not answering many questions in many places at the time, it is due to RL. I don't have to justify how or where I spend my small amount of online time, grab some perspective.

We used to apply the layers by bot, and as I remember it, through community discussion we stopped; it was often problematic and there was the general feeling there was little gain in doing so. The problems outweighed the solutions. We especially wanted to be having proofread pages in preference to slapped down "not proofread" pages.

It is not a rule "to not do it" as there are times when it is of value, when people need a finding aid to transcribe. It is a practice to not do it unless there is a clear value; and it should be okay to have that as a conversation. It should be okay to ask a person to use a bot, per our guidance, if they are going to be applying over a hundred thousand pages. — billinghurst sDrewth 05:25, 4 June 2020 (UTC)

You are not addressing an issue on which you were pinged. You are making off topic comments that have nothing to do with the subject of this thread. James500 (talk) 06:10, 4 June 2020 (UTC)
I only asked if the file and index page needed to be moved, because I cannot request file moves on commons due to interface problems, and I do not know what the effect of moving an index page is. I could have moved all the sub pages myself if others had waited 24 hours. James500 (talk) 06:35, 4 June 2020 (UTC)
@James500: For this kind of move it is better to have an admin do it since they can suppress redirects at the old page names, and the old incorrect names would have been in the way in this case. On the other issues I think this is a conversation that should be had elsewhere (typically on WS:S for community discussions, or user talk pages for issues relevant to a particular user). --Xover (talk) 07:02, 4 June 2020 (UTC)
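Note for anyone repeating the batched move described earlier in this thread: a rough sketch of how the subpages could be split into groups that fit under the 100-page limit mentioned above. The page count of 250 is a made-up figure standing in for "200+", the `/1`, `/2`, ... subpage naming is assumed, and `plan_moves` is a hypothetical helper. It only plans the renames; the actual moves (and suppressing redirects) still need an admin.

```python
def plan_moves(old_root, new_root, page_count, limit=100):
    """Pair each old subpage title with its new title, then split the
    pairs into batches no larger than the per-move subpage limit."""
    pairs = [
        (f"{old_root}/{n}", f"{new_root}/{n}")
        for n in range(1, page_count + 1)
    ]
    return [pairs[i:i + limit] for i in range(0, len(pairs), limit)]


# Titles taken from the request above; only the volume number changes.
OLD = "Page:History of California, Volume 1 (Bancroft).djvu"
NEW = "Page:History of California, Volume 3 (Bancroft).djvu"

# 250 is an illustrative count, not the real number of subpages.
batches = plan_moves(OLD, NEW, page_count=250)
```

Each batch corresponds to one pass of the "recreate the root page, move with subpages" workaround, with the temporary root page deleted after the last pass.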

Page shift after page insertion

Hi,

I have a small page shift request after inserting some missing pages:

  • Notes: pages 180, 182 do not need shifting.

Thanks, Inductiveloadtalk/contribs 09:52, 4 June 2020 (UTC)

@Inductiveload: Done. --Xover (talk) 12:15, 4 June 2020 (UTC)