Scriptorium
The Scriptorium is Wikisource's community discussion page. Feel free to ask questions or leave comments. You may join any current discussion or start a new one; please see Wikisource:Scriptorium/Help. Project members can often be found in the #wikisource IRC channel webclient. For discussion related to the entire project (not just the English chapter), please discuss at the multilingual Wikisource. There are currently 371 active users here.

Announcements

Proposals

RFC: Allow curly quotes under some conditions

The following discussion is closed and will soon be archived:
WS:MOS has been updated with alt proposal below —Beleg Tâl (talk) 12:50, 30 September 2019 (UTC)

Per the discussion above, I would like to propose that the following change be made to Wikisource:Style guide. Replace:

  • Use typewriter quotation marks ("straight," not “curly”).

with:

  • Curly quotation marks are permitted only if they are used in the original work and are used consistently throughout the transcription. Otherwise straight quotation marks are recommended.

Please express whether you support or oppose this change. Kaldari (talk) 15:39, 14 August 2019 (UTC)

Support. Two suggested tweaks to the language:
Instead of the word "ensure" (which occurs twice), I suggest "unless one or more Wikisource users are committed to ensuring." It's impossible to ensure anything on a public wiki; and I can foresee unresolvable arguments about what it means to "ensure" this in any given case. But, the stated intention of a single wiki user can be a powerful thing, and is possible to define more clearly.
I find the parenthetical section slightly confusing. I think it means that a large number of contributors would make it harder to ensure consistency; but that isn't entirely clear, and that's not necessarily true. So instead, perhaps, "(e.g., because many contributors, without a clear or enforceable agreement on style conventions, are likely to contribute to this particular work.)" -Pete (talk) 17:09, 14 August 2019 (UTC)
@Peteforsyth: I've tweaked the wording to address your concerns. Kaldari (talk) 00:55, 15 August 2019 (UTC)
Thanks, that's an elegant solution. -Pete (talk) 01:04, 15 August 2019 (UTC)
Oppose. 1) Curly quotes should be allowed regardless of the style of quotes in the source scan, just like straight quotes. 2) I cannot tell if your wording intends to cover other systems of punctuation such as „lower quotation marks“ or «guillemets», which should never be replaced with upper quotation marks of either style. —Beleg Tâl (talk) 17:27, 14 August 2019 (UTC)
@Beleg Tâl: I've tweaked the wording to address your concerns. Kaldari (talk) 00:53, 15 August 2019 (UTC)
I agree with what I think Beleg Tâl is saying...if we're going to alter the policy, it should permit using other kinds of quotes (at least, if they are what the original uses). In fact, I think if several contributors agree that guillemets are the appropriate choice in a specific work, even if they weren't used in the original, there might be good reason for it, and it shouldn't be expressly disallowed. -Pete (talk) 01:04, 15 August 2019 (UTC)
@Kaldari: I appreciate the update. My concern #1 still applies so I am not comfortable supporting the proposal as it stands. —Beleg Tâl (talk) 13:24, 15 August 2019 (UTC)
Oppose. I do want curly quotes allowed but don't favor the proposed version of the proposal. I agree with both Beleg Tâl’s comments. I think that the only restriction should be something like, if a work already wholly or partially proofread uses straight quotes throughout, a user should not introduce curly quotes unless they're committed to changing the whole thing to curly. Levana Taylor (talk) 18:26, 14 August 2019 (UTC)
@Levana Taylor: I've tweaked the wording to address your concerns. Kaldari (talk) 00:53, 15 August 2019 (UTC)
I don’t think "all contributors to the transcription agree to use them consistently" is a practically workable condition to impose. What if (as is extremely likely) some early contributors can no longer be contacted? Someone who comes in later ought to be able to make a global change as long as they change the whole thing. Levana Taylor (talk) 02:22, 15 August 2019 (UTC)
@Levana Taylor: I've tweaked the wording again. Hope that sounds better. Kaldari (talk) 02:43, 15 August 2019 (UTC)
Yes! This is more like it. I can support this simplified version, which leaves it open whether consistency is to be achieved by consensus (e.g. on projects that have a discussion page), or (in smaller works) one person going through the whole thing. And I don't mind restricting use of curly quotes to works that use them in the original. Use of straight quotes is something of a special case -- might be found in modern documents (e.g. government work being added here), and it makes sense to keep that style, I guess. We still have to mention guillemets and German „goose feet“ … maybe the wording should be re-organized, more or less thus: straight quotes, guillemets, etc. should be kept as in the original. If the original has curly quotes, these may have straight quotes substituted for them, or curly quotes may be used; the latter should only be done if they are used consistently throughout the transcription. Levana Taylor (talk) 03:51, 15 August 2019 (UTC)
Comment. The previous voting discussion has shown that there are supporters of the change in favour of other than only straight quotes, but they prefer various solutions. For this reason it is probably not a good idea to pick one of them and vote simply for or against, it would be better to vote about all of them and choose the one with the biggest support. (BTW, the chaos accompanying this process, when somebody considered the discussion to be voting, while others were waiting for the voting to start, is a result of missing instructions similar to Wiktionary:Voting policy and Template:vote on hold.) --Jan Kameníček (talk) 18:57, 14 August 2019 (UTC)

Pinging other folks involved in the original discussion: @Prosfilaes, @EncycloPetey, @Billinghurst, @Xover: @Nizolan, @TE(æ)A,ea., @Koavf, @Beeswaxcandle:. Kaldari (talk) 10:03, 15 August 2019 (UTC)

Support. I have made two changes to the style guide in favour of this; the first, a more general approach favouring standardisation, and the second, a more specific approach similar to the desires of Jan Kameníček and Levana Taylor. TE(æ)A,ea. (talk) 11:51, 15 August 2019 (UTC).
Sorry to contribute to the nit-picking but though I'd like to support this I'm not sure on the current wording because there's a disproportion between the two parts: if curly quotes are only permitted under certain conditions then why are straight quotes merely recommended otherwise? What's the alternative? Also agree with BT above that exceptions for guillemets and other forms should be specified. —Nizolan (talk) 12:49, 15 August 2019 (UTC)
Oppose. We're having a vote where the text is being tweaked throughout the vote after some people have voted. We need to vote on a set text, not be adjusting it throughout the vote. You don't change the candidates or platforms once the vote has begun. --EncycloPetey (talk) 15:18, 15 August 2019 (UTC)
If the tweaking doesn't go on for too long, it's not very hard to check in with the early voters and find out whether/how it impacts their votes. I stated above that the tweaks were to my liking, other early voters could always comment/clarify as well. But I think we're in agreement that it's best to at least limit/minimize changes in order to have a coherent vote. -Pete (talk) 16:49, 15 August 2019 (UTC)
Weak Support, as it is better than forcing everybody to use only straight quotes, but I am not very happy with the specific expression “curly quotes” instead of "the same kind of quotation marks as the work presents". There are several kinds of "curly" quotes and I believe they should not be used interchangeably. If a work uses “” the contributors should not use ’ ’ or „“, although they are all curly. --Jan Kameníček (talk) 22:18, 15 August 2019 (UTC)
The current MOS guideline to use straight quotes only does not imply that users can use ' ' in place of "; neither would this updated guideline imply that users can use ’ ’ in place of “”. Curly quotes in this context means specifically “this” as opposed to "this", and ‘this’ as opposed to 'this'. It would still be wrong to use “this” in place of ‘this’, or in place of «this», or in place of literally anything except "this". —Beleg Tâl (talk) 23:50, 15 August 2019 (UTC)

Alt proposal: Allow curly quotes under any conditions

I propose that, rather than the above change to WS:MOS, we instead change

  • Use typewriter quotation marks ("straight," not “curly”).

to the following:

  • Use a consistent style of quotation marks ("straight" or “curly”) within a given work. It is recommended to use "straight" quotes in works where there are a large number of contributing editors, since consistent use of “curly” quotes may be difficult to achieve.

Beleg Tâl (talk) 19:48, 15 August 2019 (UTC)

  Support Beleg Tâl (talk) 19:48, 15 August 2019 (UTC)
  Support, simple is good. (@Beleg Tâl: there's a typo at the end, "consistant" -> "-ent" fixed) Nizolan (talk) 21:43, 15 August 2019 (UTC)
Support. This is generally similar to my second change to the style guide. TE(æ)A,ea. (talk) 21:47, 15 August 2019 (UTC).
I used your second change as a basis for wording my proposal. —Beleg Tâl (talk) 23:51, 15 August 2019 (UTC)
Oppose. I do not agree with allowing to use curly quotes even in cases when the original works use straight quotes. --Jan Kameníček (talk) 22:00, 15 August 2019 (UTC)
And yet you agree with allowing to use straight quotes even in cases when the original works use curly quotes. I think that if we allow straight quotes in place of curly, but do not allow curly in place of straight, then we may as well continue to disallow curly altogether. —Beleg Tâl (talk) 23:37, 15 August 2019 (UTC)
In my opinion, disallowing the use of curly quotes when a scan uses straight quotes is similar to: disallowing the use of 'a' when a scan uses 'ɑ'; disallowing the use of 'g' when a scan uses 'g'; disallowing the use of '$' when a scan uses ' '; &c. —Beleg Tâl (talk) 00:03, 16 August 2019 (UTC)
What if a scan uses German-style lower-level quotation marks, but those quotation marks are straight? There is no straight lower-level quotation mark to replace it with, and you would not allow the replacement of the straight quotation marks with „curly ones“. —Beleg Tâl (talk) 00:04, 16 August 2019 (UTC)
Hmm, I have a hard time imagining a case where this hypothetical problem becomes an actual problem. In most cases, there is no practical problem with one kind of quote...if it's an academic essay, for instance, I really don't see how a reader is done a disservice by encountering “curly quotes” where they expect "straight ones." In a few cases, like poetry, it might actually be significant. In those cases, I have more trust in the good judgment of my fellow Wikisourcers to find the proper solution, than I have in any policy. If a poem had straight quotes, and its appearance would be substantially altered by using curly quotes, it's hard for me to imagine a Wikisource editor who appreciates the poem using the policy to justify changing them to curly quotes. Your objection, Jan, seems to me rooted in worry about something that's very unlikely to happen. -Pete (talk) 00:52, 16 August 2019 (UTC)
Support. Nice, this is very similar to the original proposal, but slightly clearer, and uses more straightforward language. -Pete (talk) 00:52, 16 August 2019 (UTC)
  Support The more I work with epubs the more I want our exported books to look as nice as possible. —Sam Wilson 06:55, 16 August 2019 (UTC)
  Support Giving Wikisource editors some flexibility seems like a good thing to me. Kaldari (talk) 20:50, 16 August 2019 (UTC)
  Support I really like this. Cuts the Gordian knot, allows contributors to use their judgment; and if it doesn't address all the ins and outs and special cases, well, what wording could? Levana Taylor (talk) 23:13, 16 August 2019 (UTC)
  Support I'd like to have the option of using curly quotes, it makes the books look so nice. Prtksxna (talk) 05:22, 21 August 2019 (UTC)
Neutral While this codifies what we already have in place de facto, I have seen nothing that addresses the basic issue that some of us are unable to type these so-called curly quotes. So, how will consistency in works be ensured? If this goes ahead it is essential that on each work (on the Index Talk: page), a formatting note is provided indicating which mode of textual quote marks is being used AND that all contributors to the work actually read the note prior to contributing. Beeswaxcandle (talk) 23:32, 24 August 2019 (UTC)
I expect it will be the same as any other issue. We have consistency problems in PotM, and even cases where a second editor makes spot changes to format. We cannot control these things except to note that they have happened and chastise a person who has done so. The point of this vote, however, is to provide a standard against which such calls may be made. --EncycloPetey (talk) 02:41, 25 August 2019 (UTC)
As for typing them: they can be found among the special characters just above the editing window, although it may slow down the work.
Both points raised above are among the reasons why I proposed that it should be specifically stated that the curly quote rule does not apply to works where cooperation of more people can be expected. If three people out of four prefer curly quotes, it is no good if the fourth person is forced to use them too for "consistency reasons" even if s/he feels limited by them. It should be stated that curly quotes are allowed only if it can be assured that no contributor to the work opposes them or may oppose them (typically when one single person transcribes the whole work). --Jan Kameníček (talk) 07:48, 25 August 2019 (UTC)
I don't like the idea of it being entirely impossible to use curly quotes in large, extensive works. It should be a matter of judgement by people who are working on it at the early stages, at the time style is being established. They should discuss as much as possible and decide whether curly quotes are practical under the circumstances where the work is being done.
As for entry, there are several plugins for Firefox, and I think for Chrome too, to assist with quotes. Personally I use a combination of two methods: convert chunks of text, highlighting quotes, with MS Word; while typing, use the superb macro plugin ABCTajpu which has made my life easier in so many ways. Levana Taylor (talk) 16:35, 25 August 2019 (UTC)
I maintain a WikiEditor button for converting to curly quotes. Sam Wilson 01:18, 27 August 2019 (UTC)
Exactly, you need various plugins, highlight the quotes in Word..., which some people may dislike or be unable to do. I personally prefer working directly in the editing window using OCR buttons. Google button usually transcribes all quotes as straight, so if somebody wants to use curly ones, they have to be changed manually, which is time consuming.
I do not believe that all contributors transcribing entries of large encyclopedias will bother with discussing which kind of quotes to use (I have seen results of collective projects like Proofread of the Month, and unfortunately contributors here usually do not take care about consistency with much more important issues at all). However, let's suppose they will and imagine the following situation: 7 people start transcribing work, 4 prefer curly q., 3 straight. After quite lengthy discussion taking their time the three of them retreat (or some retreat and some leave the work) and they start using the curly q. After some time, other contributors come. Some of them do not use the curly quotes, but since quite a lot of work has been finished, they are notified that for consistency reasons they have to, and so they are forced to surrender to something they do not feel comfortable with (or they do not join). Imo this is not good. --Jan Kameníček (talk) 17:22, 25 August 2019 (UTC)
This is already how it goes for pretty much everything though. I don't like using the <poem> extension for example, and will push for explicit line breaks in transcribing poetry. If a project's original contributors have settled on using <poem>, I will have to either go along with that, or move on to some other project. —Beleg Tâl (talk) 18:18, 25 August 2019 (UTC)

As there is majority support and a sufficient length of time has passed, I will implement the measure in one day (24 hours), if there is no objection. TE(æ)A,ea. (talk) 21:19, 23 August 2019 (UTC).

I have undone this change. Do not give us arbitrary deadlines that have no requirement for a deadline. Typically our issues are discussed and open for extended periods. — billinghurst sDrewth 23:58, 24 August 2019 (UTC)
Yes, 24 hour notice is not remotely sufficient for closing a discussion. There are still editors commenting here. Be patient. I think if this discussion remains completely untouched for two full weeks we could consider it closed, though I would give it the full thirty days allotted by SpBot. —Beleg Tâl (talk) 22:22, 25 August 2019 (UTC)
  • I proposed the 24 hours not as a time period for the discussion, but a time period to notify in the place of further discussion, as the outcome is already obvious. I believed that one week’s time, as had already passed, was sufficient. TE(æ)A,ea. (talk) 23:31, 25 August 2019 (UTC).
    • There's no hurry. English Wikisource has existed for 15 years without curly quotes. Another few weeks won't hurt. Kaldari (talk) 15:04, 27 August 2019 (UTC)
  Support the alt proposal. I've just been alerted to this page, and not sure whether to add the comment here or in the lower discussion. The 1911 Encyclopædia Britannica Wikiproject is one that has long had a standard of curly quotes and apostrophes in its style guide, has a few current editors, and they are familiar with the style. Undoing the guide would be a needless waste of effort that could more usefully be put into proofreading (though I admit I'm guilty of not having done much of that lately). Raw scans will need fixing in any case. While I'm agnostic on what should be the default WS-wide, I'm strongly opposed to a new mandate for a long-established project. DavidBrooks (talk) 16:13, 28 August 2019 (UTC)
  •   Support I had actually decided to abstain on this, since it wasn't specified in sufficient detail for my peace of mind (I do like exhaustive detail on such matters). But after far more agonising than the issue strictly merits, I've come around: I just simply want the pretty curly quotes too much! And since we're taking the "let's just open the gates and deal with problems as and when they show up"-approach already, I think it makes the most sense to pick the variant that gives maximum flexibility to our contributors. I do urge everyone to be extra on guard for diverging practices going forward: the benefit of an ultra-conservative approach is it's easy to guarantee consistency, but cleaning up if chaos has been let run unchecked can be near impossible. Keeping an eye out for variations we don't want now can save us a whole mess of trouble later on. (also: shout-out to Jan and BWC, whose concerns I share but land on support anyway) --Xover (talk) 15:50, 30 August 2019 (UTC)
It is indeed a valid concern to worry about introducing a typographic feature which is not entirely straightforward to use correctly, lest it create mess. We should be thinking about software methods to help with proofreading, both checking at the time of adding text and going over existing pages. I have some ideas, but this isn't the place to detail them. Levana Taylor (talk) 16:08, 30 August 2019 (UTC)
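A check of the kind mentioned above, run when text is saved or over existing pages, could look roughly like the following. This is only a minimal sketch: the function names `quote_styles` and `is_consistent` are hypothetical, it looks only at the straight and curly marks discussed in this thread, and it does not know about wiki markup (for example, doubled apostrophes used for italics would count as straight quotes).

```python
# Hypothetical helper names; not an existing Wikisource tool.
STRAIGHT = {'"', "'"}
CURLY = {'\u201c', '\u201d', '\u2018', '\u2019'}  # “ ” ‘ ’


def quote_styles(text: str) -> set:
    """Return the set of quote styles ("straight", "curly") found in text."""
    styles = set()
    if any(ch in STRAIGHT for ch in text):
        styles.add("straight")
    if any(ch in CURLY for ch in text):
        styles.add("curly")
    return styles


def is_consistent(text: str) -> bool:
    """True if the text uses at most one of the two quote styles."""
    return len(quote_styles(text)) <= 1
```

A page flagged by `is_consistent` returning `False` would still need a human to decide which style the work has settled on.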
Further concern Having just advised someone on creating a link, I've realised a further concern with this proposal. When linking to works, to subpages, or to sections within a page we must have a single standard orthography. This is particularly so for apostrophes, but from time to time quote marks are also affected. Links within a work will be fine, but links from another work will require the editor to decide every time if normal marks or bent marks were used. A hypothetical example: a book I'm working on refers to Odysseus's Return in Joe Blogg's seminal work Greek Myths and Legends Reinterpreted for the Victorian Age. Do I link to Greek Myths … Age/Odysseus's Return or to Greek Myths … Age/Odysseus’s Return ? The best way around this is to insist that all Titles and Subtitles only use straight quote marks and apostrophes. The next problem comes at sections within a (sub)page where the section heading is a text item within the book. If Odysseus's Return is a section on Greek Myths … Age/Chapter 7 and has not yet been proofread, how will I know which type of mark to use in creating the deeplink? Beeswaxcandle (talk) 19:03, 31 August 2019 (UTC)
I would be okay with insisting on straight quotes in page titles. However, there are currently pages that use curly style in the page title (e.g. Henry Ford’s Own Story; complete list here) and so far they are doing okay. —Beleg Tâl (talk) 20:44, 31 August 2019 (UTC)
Agree with straight quotes in page titles. However, for the ones that exist, I would advise moving them and leaving redirects. It seems to me there is a tendency on this wiki to not leave redirects. That seems counterproductive to me, especially when a page has been up for a while; there's no way of knowing the various places, online and offline, where there might be incoming links. What's the harm in a redirect? -Pete (talk) 17:01, 3 September 2019 (UTC)
I think straight quotes ought to be mandatory in page titles. Agree with leaving redirects though. —Nizolan (talk) 17:50, 15 September 2019 (UTC)
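If straight marks were required in page titles as suggested above, a title could be normalised before a link or redirect is created. The following is a minimal sketch, assuming only the four curly marks discussed in this thread need mapping; `normalize_title` is a hypothetical name, not an existing MediaWiki or Wikisource function.

```python
# Map curly quotation marks and apostrophes to their straight equivalents.
_CURLY_TO_STRAIGHT = str.maketrans({
    '\u2018': "'",  # ‘ left single quotation mark
    '\u2019': "'",  # ’ right single quotation mark / apostrophe
    '\u201c': '"',  # “ left double quotation mark
    '\u201d': '"',  # ” right double quotation mark
})


def normalize_title(title: str) -> str:
    """Return the title with curly marks replaced by straight ones."""
    return title.translate(_CURLY_TO_STRAIGHT)
```

With this, both spellings of the hypothetical subpage name "Odysseus's Return" reduce to the same straight-apostrophe form, so a link target would not depend on which style the transcriber typed.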
  •   Support I think the question of whether to allow curly quotes is the wrong question entirely. Of course we should—we should be preferring them like PG does nowadays. Straight quotes are a hack introduced with early typewriters as a way to save room on the keyboard, and I would really prefer that we use the original, traditional typesetting instead. So I suggest, instead, that the right question to be asking here is how to make it easiest and most convenient to type smartquotes, since I see that looms large in people's minds, and rightly so. There are many options we could be discussing, including keyboard shortcuts (easy to add javascript so that ctrl-" adds a pair of quotes), or a bot to do automatic substitution (which would need to be validated), or other possibilities I haven't even thought of. Dcsohl (talk) 20:01, 8 September 2019 (UTC)
  •   Support; nice balance between aesthetics and consistency. Spangineer (háblame) 13:46, 9 September 2019 (UTC)
  • As there has been little discussion for some time, and there is large support in favour of this proposal, I shall change the relevant guideline, if there are no objections in the next twenty-four (24) hours. TE(æ)A,ea. (talk) 23:55, 25 September 2019 (UTC).
    And I say to you again, you do not determine consensus. Please leave this to others. — billinghurst sDrewth 00:00, 26 September 2019 (UTC)
    Okay, I will change the relevant guideline. I was about to post here asking that opposers join the discussion, because the few opposers are not discussing things. The discussion has finished as to whether or not we do this.--Prosfilaes (talk) 04:31, 26 September 2019 (UTC)
  This section is considered resolved, for the purposes of archiving. If you disagree, replace this template with your comment. —Beleg Tâl (talk) 12:50, 30 September 2019 (UTC)
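For reference, the automatic substitution suggested earlier in this thread could be sketched as below. This is a naive heuristic under stated assumptions, not the WikiEditor button or any existing bot: `curl_quotes` is a hypothetical name, a quote is treated as opening only when it starts the text or follows whitespace or an opening bracket, and everything else is treated as closing.

```python
import re


def curl_quotes(text: str) -> str:
    """Naively convert straight quotes to curly ones (heuristic sketch)."""
    # A double quote at the start of the text, or after whitespace or an
    # opening bracket, is treated as an opening quote.
    text = re.sub(r'(^|[\s(\[{])"', r'\1“', text)
    # Every remaining straight double quote is treated as closing.
    text = text.replace('"', '”')
    # Same rule for single quotes; an opening double quote may also precede.
    text = re.sub(r"(^|[\s(\[{“])'", r"\1‘", text)
    # Remaining single quotes are closing quotes or apostrophes.
    text = text.replace("'", "’")
    return text
```

The heuristic has known failure cases: a leading apostrophe as in 'tis is turned into an opening quote, and wiki markup that uses doubled apostrophes for italics or bold would be corrupted. As noted in the discussion, any such substitution would need to be validated by a proofreader.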

New speedy deletion criterion for person-based categories

Following on from a discussion at WS:PD#Speedy deletion of author based categories.

It is long established and in the main uncontroversial that English Wikisource does not use person-based categories (of the type "Works by John Smith", "Poetry by John Smith", etc.). Some previous discussions can be found at: 1, 2, and 3 (and the two following threads). However, absent a speedy deletion criterion specifically for these, admins have to rely on the provision for precedent-based deletions. In practice this means such categories must be brought to WS:PD to be rubber stamped, wait at least two weeks (because inertia and habit), and then hopefully someone will remember to process them. Eventually.

I therefore propose that we extend the deletion policy with a new G8 criterion as follows:

  • Person-based categories—Categories where the defining characteristic is person-based. This includes, but is not limited to, author-based categories like "Works by author name".

All deletions (modulo CU type concerns) are subject to community challenge in any case, and are clearly visible in the deletion log, so there is no particular benefit to the bureaucracy where there exists no significant uncertainty or controversy. --Xover (talk) 14:32, 15 July 2019 (UTC)

  Support, but I'd note that there is an exception discussed in link #2: namely, American presidential documents categorized by president. This is due to the fact that the administration of the executive branch is tied to who is the president at the time. There was no consensus as to the scope of this exception: what kinds of presidential documents it applies to, or whether other governments may have the same treatment, etc. —Beleg Tâl (talk) 14:42, 15 July 2019 (UTC)
  Oppose 2 weeks is not too long to wait. organization of subject of a work is useful, a migration to a stable ontology is necessary. Slowking4Rama's revenge 13:58, 30 July 2019 (UTC)
2 weeks is definitely too long to wait when a full bureaucratic procedure with a foregone conclusion could be replaced with a simple administrative action. —Beleg Tâl (talk) 14:32, 30 July 2019 (UTC)
Also it is worth pointing out that this proposal is not regarding whether such categories should be kept or deleted (since we have already established that they should be deleted), but only whether they should be posted to WS:PD before we delete them. —Beleg Tâl (talk) 18:51, 30 July 2019 (UTC)
And that strictly speaking, under current policy, they can be deleted a few days after a notice has been posted to WS:PD (no two week wait required, just that the discussion must have "started"). It's just that habit and inertia inevitably means that almost all cases will in practice suffer this 2+ week purely bureaucratic delay. I'm a big believer in process and the value of bureaucracy when properly deployed, but even I think this one is a pointless waste of volunteer time. We have issues that require actual discussion or other action that have sat open on the noticeboards for a year and a half; we should not waste those resources on filling out forms in triplicate for issues that are not controversial. Any deletion can be reviewed and overturned, if needed, by the community; let's save the cautious multiple-safeguards approach for stuff that might actually need it. --Xover (talk) 19:11, 30 July 2019 (UTC)
I always wait until there has been a full month of inactivity, since there are many editors who only edit occasionally, but that's just me. —Beleg Tâl (talk) 19:17, 30 July 2019 (UTC)
  Support --EncycloPetey (talk) 17:40, 30 July 2019 (UTC)
  Support --Jan Kameníček (talk) 19:38, 30 July 2019 (UTC)
  Support though if possible I'd like to see the exception Beleg Tâl specified firmed up a bit, i.e. perhaps a general exception for things like governments, ministries, and reigns which are "person-based" but serve an obviously different function to categories-by-author (noting on the UK side things like Category:Acts of the Parliament of Great Britain passed under George III). —Nizolan (talk) 00:44, 1 August 2019 (UTC)
  • Note Based on the discussion above I have added the above criterion with an additional limitation to exempt things like UK governments tied to a monarch's regnal period or the administrations of US presidents. I read the above as general support for this criterion—sufficient for adding it—but with some remaining uncertainty about the optimum phrasing. I'll therefore leave this discussion open for a while longer so that interested parties may object or suggest better wording. I'll also add that minor changes to the wording (that do not change the meaning) can easily be made later with a proposal at the policy talk page. And we can always bring bigger changes up here for reevaluation if it causes problems. --Xover (talk) 19:32, 11 August 2019 (UTC)

Deletion review

I long ago (2005) gathered together historical documents related to the life of Indigenous Australian warrior Yagan in Category:Yagan. This has always seemed to me a reasonable category, but it just got speedily deleted without so much as a how-d'-y'-do.

The examples given in this proposal were of the form "Works by John Smith", "Poetry by John Smith", etc. No other examples were given in the discussion. So I'm not sure if the community really intends that categories like this would be deleted. Can we review this please?

Hesperian 23:48, 2 September 2019 (UTC)

Hmm. I'm not going to express an opinion on "should" / "should not" for this, but I will note that based on my understanding of the discussions this would indeed be the intended effect. The defining characteristic of the category is that its members relate somehow to a specific person, and for such the consensus appeared to be that portals were better suited. But perhaps there is a distinction between Category:Yagan and Category:John Smith that I am not seeing? Or is it the specificity: Category:Foo by Person is bad, but Category:Person is acceptable? --Xover (talk) 03:58, 3 September 2019 (UTC)


As things stand:

  • I can gather together documents about the Battle of Borodino in Category:Battle of Borodino, because that's an event.
  • I can gather together documents about Fort Knox in Category:Fort Knox, because that's a place.
  • I can gather together documents about scissors in Category:Scissors, because they are objects.
  • I can gather together documents about intelligence in Category:Intelligence, because that's an abstract concept.
  • But I can't gather together documents about Yagan in Category:Yagan, because he was a person.

Can no-one see how bizarrely arbitrary this is??

And it hasn't even really been discussed, since the only examples given above are "Works by" categories, the deletion of which makes perfect sense. Hesperian 11:50, 3 September 2019 (UTC)

Fully agree with Hesperian, the speedy deletion is a misinterpretation of the guidance. The "category:works of ..." is to ensure that works of authors are added to author pages, and not categorised. There is no determination that it would relate to anything else. Categorisation has always existed for people, again our biggest issue is how to separate author categorisation from subject categorisation. — billinghurst sDrewth 12:39, 3 September 2019 (UTC)
Read the policy, it does not say "works by …", it says "person-based". —Beleg Tâl (talk) 12:56, 3 September 2019 (UTC)
Per our deletion policy (as updated according to the consensus in the above discussion), "Person-based categories" are now a criterion for speedy deletion. This "includes, but is not limited to, author-based categories", but "the defining characteristic is person-based". This was very explicit in the above proposal. My deletion of Category:Yagan was therefore 100% within our deletion policy. You can propose a reversion to the older version of the deletion policy, and a restoration of Category:Yagan (even though it is entirely redundant of Portal:Yagan), but I will have no part in it. —Beleg Tâl (talk) 12:53, 3 September 2019 (UTC)
Also: as things stood before the above discussion, I could gather together documents about Yagan in Category:Yagan, but couldn't gather together documents about Yazid III in Category:Yazid III, which is just as bizarrely arbitrary. —Beleg Tâl (talk) 13:03, 3 September 2019 (UTC)
(ec) It is my opinion that it is not a positive change. 0-100 in four seconds. I find the statement "It is long established and in the main uncontroversial that English Wikisource does not use person-based categories" to not be the case, especially as it has been the case since 2005. Something that was entirely in scope and I believe would have been kept in a PD, is now going to a speedy deletion and deleted without conversation. I find that inappropriate, and for that to have been implemented in four weeks is an example of poor implementation and poor policy. I am wondering where this community is going, and the lack of vision that this represents. — billinghurst sDrewth 13:14, 3 September 2019 (UTC)
It may also have simply flown under the radar. It is also just one category affected, and a completely redundant one at that (equally redundant to any Author-based categories). And the proposal to update the policy was done entirely by the books, and is a significant benefit to the community. —Beleg Tâl (talk) 13:30, 3 September 2019 (UTC)
And it has been long established and in the main uncontroversial that English Wikisource does not use categories for individuals who have pages in Author space; the fact that there existed one or two categories for an individual in Portal space is (to me) a minor detail and I would have also considered it long established and uncontroversial that these were also unwelcome. —Beleg Tâl (talk) 13:33, 3 September 2019 (UTC)


Of most concern to me in this new G8 is, what if Portal:Yagan did not exist? In that case, Category:Yagan would be the only way in which we had organised our material by topic, yet it would still be summarily deletable under this new G8.

I think a more coherent policy position might be:

We don't want to organise our material by both Author/Portal and Category. So it is fine to create a category for a topic if there is no corresponding Author/Portal page. But be aware that this is a stopgap -- once someone has created the Author/Portal page, the category may be deleted.

Note that this doesn't distinguish people from other topics. Category:Yagan is fine, but only until Portal:Yagan has been created. Even Category:Works by John Doe is fine, but only until Author:John Doe has been created.

I think the biggest problem with this position is the really big topics that would be better handled by a category than by an Author/Portal page e.g. War. In that case, I would say keep the category and ditch the portal, which would be unmaintainable. In a speedy criterion there would certainly need to be something to prevent deletion of categories that contained subcategories or a collection of portal/author pages.

Thoughts? Hesperian 22:50, 3 September 2019 (UTC)


Since the attitude to concerns raised here has been "I will have no part in it" followed by non-participation in the discussion, I have boldly replaced "person-based" with "author-based". I accept the new G8 was proposed, discussed and implemented in good faith, but subsequent objections have made it clear that there is no consensus for speedy deletion in the gap between "person-based" and "author-based".

To be clear: we may not agree on whether Category:Yagan should have been deleted, but I think we can all agree that the deletion was contentious, and speedy delete criteria are intended to capture non-contentious matters.

Hesperian 07:53, 6 September 2019 (UTC)

@Hesperian: I'm not going to revert that because I think at least temporarily going back to the status quo is prudent when a concern has been raised so soon after implementation. But I do object in principle to your approach here: whatever the problems with the new G8, it was properly discussed, consensus determined, and implemented. For you to unilaterally reverse it is not a good practice, no matter the merits of your concerns with it. The proper description of the thread above is, strictly speaking, not "absence of consensus" but rather "complaints after the fact" (possibly good, proper, and meritorious complaints, but still after the fact). So I am going to insist that this removal of the new criterion is a temporary measure while discussion is ongoing, and not the new status quo. If no new consensus is reached here then we revert back to what was previously decided. (To be clear, if you had suggested we should temporarily revert I would have supported that. It is your acting unilaterally with an apparent intent to change the status quo I object to.)
That being said I am absolutely open to being convinced of anything from the new criterion needing to be tweaked and to it needing to be dropped altogether. The reason I am not currently actively discussing is that I do not feel I sufficiently grasp the issue and am mulling it over. Your distinction between "person-based" and "author-based" has not been apparent to me prior to your latest comment, and I now suspect that that distinction is the crux of your objection; but I still do not grasp why you do not feel a portal would be sufficient. On the other hand, reasonably curated categories are cheap, and can conceivably be automatically applied to works included in a portal.
I also suspect, though I may of course be entirely mistaken, that what we are discussing here is not actually a speedy criterion, but rather a more fundamental issue of category and portal policy. I am not convinced the speedy criterion is a useful proxy for that debate, on the one hand, and that the former will resolve itself neatly if the latter is settled, on the other. --Xover (talk) 08:30, 6 September 2019 (UTC)
@Hesperian: "I will have no part in it" is me, not the community. I agree with Xover that it is necessary to establish a new consensus with the community to make a subsequent update to the deletion policy (in which discussion I will remain neutral). And like I said to TE(æ)A,ea.: three days is not remotely sufficient for closing a discussion. Be patient. —Beleg Tâl (talk) 12:20, 6 September 2019 (UTC)
  •   Comment There is definitely a long-established practice that we collect and curate works that relate to authors, and due to our strong preference to curate, we determined to not categorise, which would have a duplication and a confusion. It has not been the case for individuals who were not authors, and it should not be a requirement that we have to curate such pages, especially where a person may be mentioned on a page(s) though not be the focus of the pages. For instance, the page The Perth Gazette and Western Australian Journal/Volume 1/Number 28 would be considered for categorisation in "Category:Yagan" though would not particularly be the focus of a page and put onto a Portal: ns page. I would definitely not expect someone to have to make edits to a portal page to that target, though I would have no qualms with someone categorising. Where we have authors, we have wikilink'd back to author pages for that relevance. So it is my belief that these non-author categories should not be speedied, if there is a case for their deletion, then bring it to the community. I also believe that a proposer should be listing consequences of their suggested policy changes, not leaving it to the community. I find the above consensus to be a troubling "yes ... tick and flick" exercise by the community without an in-depth exploration of the consequences, approving a change to speedy deletion should be items that are completely non-controversial.

    The above deletion discussion started with the scope of a PD discussion about author categories, and then specifically addressed two author related categories. No examples were given of non-author categories that would have been wrapped up in the change of our guidance, nor that we were going to now speedy delete categories that have been existing for greater than 10 years. I have a strong belief that anything that has existed for over 10 years onsite should not be speedied, and that speedy deletions are only best applied to recent additions.

    Xover: You suggested the policy change, then summarily closed less than four weeks later, and implemented. May I suggest that is not the ideal practice either, as this is a change of policy where all person categories are deleted, not as indicated in the discussion that it was an existing process and the speedy being the only change. We are not a huge community, we don't have the same editing rates, or the diversity of eyes to analyse such situations, and that is traditionally why we have left discussions open for extended periods. — billinghurst sDrewth 10:55, 7 September 2019 (UTC)

    @Billinghurst: "Too quickly closed" is a fair complaint, although I don't entirely agree with that assessment. I agree there should be plenty of time for the community to ponder, scrutinise, discuss, and decide; and in fact was somewhat disappointed that the proposal did not garner wider participation and more discussion. I agree speedy criteria should have a firm basis, which broad participation in the proposal is the best way to ensure (and document!). But I also observe that community participation in such discussions is distressingly low in general, and by that yardstick the above was about the most I felt one could realistically hope for. When no further comments either way surfaced—not even any "Unsure" or "Wait, I need to think a bit more"—I felt that was sufficient to implement. If we want to have much longer timeframes to tease out every possible community comment then we should have specific guidance to that effect (and I do mean a specific number of weeks).
    I agree that speedy should be for uncontroversial things, but then my understanding was that this was uncontroversial. My intent in making the proposal was not to change practice regarding use of categories vs. portals, but rather to eliminate a pointless two-week wait and bureaucratic box-ticking for something that was a priori determined would be deleted. I do however disagree that speedy should not be applicable to, for example, decade old clear copyvio. The purpose of speedy deletions is to reduce bureaucracy and make maintenance more efficient—where possible—and to reduce the demands on the community's time and attention in formal discussions. Because, as you point out, such participation is perhaps our scarcest resource! The age of the material affected is entirely orthogonal to whether it falls within one of the speedy deletion criteria.
    "Uncontroversial" is a better distinction, but even there some nuance is needed. The policy that leads to the deletion (by whatever process) must be unambiguously decided: it must be uncontroversial that that was what the community decided. The issue itself, though, can still be plenty controversial: there are some contributors who would never see anything deleted, for any reason, and express their frustration with copyright law and our copyright policy in every copyright discussion they participate in (nevermind proposed deletions). That someone disagrees with the community's decision, once made, is not a valid reason for considering the implementation of that decision controversial.
    On the issue at hand, though, I (am starting to) see the person/author distinction, but I am having trouble understanding how a portal is any less suited for a person than for an author. To my mind the very same arguments for portal over category for authors apply equally to persons. Why wouldn't The Perth Gazette and Western Australian Journal/Volume 1/Number 28 go in the portal? Or is it the perceived relative amount of effort in curating the two approaches? Hesperian's more coherent policy position seems to suggest that that is the case.
    I don't think starting with a category but deleting it if a portal is created is a particularly rational approach, but as a proposal it does speak directly to the relationship between categories and portals. To me, the opposite end of the spectrum (that you also address) seems more elucidating: once a topic is sufficiently large, a portal becomes an awkward way to organise the information. In those cases I could see an argument for using both; the category for everything and the portal for the highlights. But that's an argument that will be relevant only rarely (relatively speaking) and only in the reverse order (only once the portal is "full" does the category come into play). Most person-related topics will not have too many relevant works for a portal.
    Or perhaps a different angle of attack would aid common understanding: Categories, Portals, and Author-pages overlap in various ways and in different degrees, and so we should establish some coherent guidance on the purpose of each, what to use each for, and how to distinguish between them in difficult cases. Perhaps in discussing what that guidance should be we would better understand the various perspectives than through the proxy of a speedy criterion? For example, do we want a portal about a person as a historical figure if that person is also an author? Is an Author: page and a Portal: the same thing except for inclusion criteria? Do the same layout rules and restrictions apply to both? --Xover (talk) 03:19, 9 September 2019 (UTC)
i am sad that admins persist in summarily deleting, for contentious issues that require a consensus. we need a standard of elevating issues on chat before deletion. and a standard of practice of how to organize ontologies of "subject of" and "depicts". i don’t care how- portals, categories, subsection, anything that can be linked from wikidata. but we need an organizational consensus, not deletion. Slowking4Rama's revenge 03:43, 13 September 2019 (UTC)
@Slowking4: But, but, but, but you do not understand the sysop perspective. They delete without consequence (for themselves, as from a sysop's perspective a deleted page may be viewed/restored and viewed without going through with restore. See? No consequence!) As for the plebs, tough! Them's oughta put in an application to be tiara'd like good little princesses… 114.78.171.144 06:09, 13 September 2019 (UTC)
114.78: I realise you're taking the piss here, but I actually agree that this is an important difference in perspective to take into account. One thing is that the consequences of deletion can in some (but not all!) cases appear smaller to those with the technical ability to view and restore deleted pages, but the perspective is also shifted when you have long backlogs of tasks that either can only be resolved (in practice) by deletion or where deletion is a fairly foregone conclusion. To have to conduct a formal analysis, formulate it cogently, and run a community discussion is a lot of effort. The relatively low community participation in those discussions means they have a tendency to deadlock, and if resolved are too local to support any kind of future precedent. When a lot of your tasks are dealing with that dynamic, you will naturally tend to develop a bias (big or small) toward more efficient resolutions like having speedy criteria for whatever the issue at hand is.
But when you spend a lot of time going through the maintenance backlogs you also gain the very real experience that tells you that a lot of stuff has been dumped here with no followup, attempts to format properly, or even giving minimal source or copyright information. There is literally no hope of these works being brought up to standard as they are, and would in any case be easier to recreate from scratch than fix in place, even if they aren't blatant copyright violations. While we certainly need to watch for and not get fooled by the previously mentioned bias, we also should let ourselves be guided by this experience. Sometimes the perspective of those who work the maintenance backlogs (which is not by any means limited to just admins!) gives them a better foundation for reasoning about an issue than those who work primarily on their own transcriptions (and sometimes not). --Xover (talk) 07:25, 13 September 2019 (UTC)
your "guided by experience" does not address the power dynamics of a summary standard of practice. when you undertake an action. no matter how reasonable or justified you may feel, while the community is feeling ill-used, then you might want to rethink your action, if you would presume to lead a community. we have a lot of ban-able admins. Slowking4Rama's revenge 11:44, 13 September 2019 (UTC)
@Xover, @Slowking4: My sincere apologies if my comment came across solely as micturient. When young fresh meat front up to gain the authority bit it is entirely reasonable they not realise they are actually signing up for a melange of teacher, executioner, judge and neat-freak. What is less excusable is that some of them never even learn of the damage they do to the parallel roles whilst obsessing over the matter of the moment. Ordinary users are watchers and judgers too and may take away quite unexpected conclusions from administrator actions. Looked at another way the spread of intelligence is (sadly) unrelated to the authority role granted. That there never seems to be a shortage of potential idiot actions does not mean it is a good idea to go down each and every rabbit-hole.
On the other hand the occasional well-reasoned explanation might even result in the next applicant putting their hand up and taking some pressure off the backlog slaves. If that flags me as both bitter and optimistic then just handle it. I have to. 114.78.171.144 22:06, 13 September 2019 (UTC)
@Slowking4: I have suggested above that the ontological discussion might be a better way to approach this issue than the speedy criterion. What are the ontological categories we need to handle, and what tool or structure of those we have available to us would be best to handle each? If we can figure out some guidance on that then what should be kept and what should be deleted will, hopefully, follow naturally. Perhaps you could flesh out your thoughts regarding "subject of" and "depicts" with that in mind? --Xover (talk) 07:25, 13 September 2019 (UTC)
we would need to group together all those works, which people seem to use categories for. we have categories on authors, we could start with a wikidata infobox at author pages. if the community wants portals for subjects, then we will need an infobox and migration from categories to portals. (this is different from how it is done on commons) you could then link on wikidata, and have some query function to aid search, we need some wayfinding to aid search of topics. Slowking4Rama's revenge 11:53, 13 September 2019 (UTC)

Further discussion needed (New speedy deletion criterion for person-based categories)

I am quite a bit concerned about this, and have unarchived it to prevent it lingering on unresolved.

We are now in a situation where the community has voted to implement a criterion for speedy deletion that allows any administrator to delete such matter at their own discretion with no a priori community approval (all admin actions are, of course, subject to a posteriori review by the community), but where at least two long-standing and very experienced contributors have objected to the core issue after the fact, and levelled criticisms at the formalities of the community decision process. Their objections are reasonable ones (in the "reasonable men may disagree" sense), and the criticisms of the process valid.

To make clear the procedural issues, the proposal described the issue as "in the main uncontroversial", which the objections have demonstrated was not entirely accurate, and it was closed after a mere four weeks (two weeks after the last comment), when an objection became apparent after six weeks. Additionally, relating to the core issue, those who disagree feel the examples provided in the proposal do not accurately reflect the criterion as it was implemented. These are all valid complaints and the responsibility for these deficiencies in the procedure falls to me (my apologies).

But, in any case, the core issue remains: we now have a speedy criterion that two very respected and experienced community members have valid and strongly held objections to.

The arguments of those who object are presented above under the "Deletion review" thread. I had hoped that the community would chime in on that discussion such that it would be possible to assess whether the community shares the concerns of those who have objected, or whether they still support the criterion as implemented.

But as that has not happened I would like to directly request that the community chime in to make clear their position on how to handle this.

  • Despite the criticisms, the original community vote was valid and concluded with support, so the default outcome, if no change is mandated here, is that the criterion as written will be implemented. It is currently temporarily suspended as a conservative measure since objections have been raised.
    • In particular, this means that if you do not express an opinion now you will in practical effect be reaffirming the original outcome!
  • Does the community feel that the concerns raised are serious enough to invalidate the previous vote and revert to the status quo ante?
  • Does the community feel we should proceed as per the existing vote and adjust course as necessary at a later date?
  • Alternately, does the community feel we should proceed as previously voted but with specific changes to the wording of the criterion?
    • For example, Hesperian has specifically proposed replacing "Person-based" with "Author-based" in the criterion.
  • Would the community prefer a new proposal, that better explains the issues, be made and a new vote held on that?
  • In essence: do you have any opinion or recommendation on how this disagreement should be handled such that we end up with the issue settled?
    • Not everyone needs to agree with the outcome, but everyone should preferably feel that the outcome was fairly arrived at!

Pinging previous participants in the vote/discussion (but everyone is, of course, encouraged to chime in): Beleg Tâl, Slowking4, EncycloPetey, Jan Kameníček, Nizolan, billinghurst, Hesperian.

This has dragged on unresolved and it's the kind of thing that has the potential to create conflicts and discord down the line so, despite the sheer amount of text and rehashing, please chime in and make your position clear! --Xover (talk) 07:22, 14 October 2019 (UTC)

  •   Comment The examples used of the purpose and solutions did not adequately represent the proposal. I don't believe that any long-held page that appears valid at a point in time should be speedy deleted with a change in policy, especially where it is unclear in the proposal that such pages were being incorporated. My understanding of our approach was that we would not build author category listing pages those to go. — billinghurst sDrewth 09:56, 14 October 2019 (UTC)
    @Billinghurst: It is not clear to me from this comment how you would prefer to resolve this issue. Could you make that explicit? --Xover (talk) 06:46, 15 October 2019 (UTC)
  • As I said before, I remain   Neutral regarding the proposed change from the current "person-based" deletion rationale to the proposed "author-based" rationale. —Beleg Tâl (talk) 15:19, 14 October 2019 (UTC)
    I voted for deletion of person-based categories and I hope that the vote also counts in this way. If somebody wishes only deletion of author-based categories instead, it should be suggested as an alternative rule. I admit it is my fault I did not protest when somebody changed the proposal without others expressing their consent clearly, but still: changing rules needs explicit consent, which is missing here.
    That said, I do not think that the idea of treating author-based categories differently from categories of other people is good.
    • Firstly, this can be a source of big confusion to many readers browsing categories: some people are included in the category tree and others are not, and an accidental visitor to Wikisource unfamiliar with our internal rules will not find the clue.
    • Secondly, it is not defined who is considered to be an author by this rule: A person who is author of a work at Wikisource? A person who is author of a work eligible to be added to Wikisource? A person who is author of a work in English or translated into English, although it won't become eligible for WS for decades? Or any person who is author of whatever in any language, which may but also may not be translated into English in the future? We have some definition in the Style guide which says that "... author ... is any person who has written any text that is included in Wikisource." However, too many contributors refuse to follow this definition and create author pages of people who have no work here, sometimes even authors who have never written anything in English and nothing by them has been translated into English so far (example). I am afraid the same will sooner or later happen with categories.
    • Let's say that we determine some line dividing authors and the rule will say which authors can have categories and which not. The rule could be: authors who have an author page cannot have a category, and vice versa (or any other definition). Again: an accidental visitor browsing categories will be confused, unable to find our internal clue why Alois Rašín can be included in the category tree and Karel Kramář not.
    To conclude, the best way is the simplest: forbid all person-based categories and organize people only in the author and portal namespaces, or alternatively allow categories for everybody. I am for the first of these two choices. Jan Kameníček (talk) 20:10, 14 October 2019 (UTC)
  • comment, i am concerned about increased use of speedy deletion, that has been abused elsewhere. i would prefer use of maintenance task flows in the open. i do not see a pressing problem. but maybe this is overblown, and the admin task flow here will not be abused. i raised my concern and got dismissed, which is fine with me.
  • what we really need is a consensus about how we structure our data with wikidata. (be it categories, portals or tags) we need a stable page, about work subjects, that can link to wikidata. we have a "works about" section for authors. but we need it for non-authors also.Slowking4Rama's revenge 14:09, 15 October 2019 (UTC)
    Your concerns were not dismissed, some editors merely disagreed with them. But in the interest of clarity, in view of your comment here and your original oppose vote to the proposal, do I understand correctly that your preferred resolution to this issue is to roll back to before this proposal and have no speedy deletion criterion for this at all? --Xover (talk) 14:41, 15 October 2019 (UTC)
yeah, apparently, i have out of consensus views of those who show up for process discussions. i just want some stable bibliographic metadata about "depicted people" and subjects. i am open to how to structure it, and what is the road map to get there. i do not care about rolling back a particular direction that i think is mistaken. (the problem with deletion is that it decreases the slim possibility of quality improvement, since it hides quality defects rather than making them more visible.) Slowking4Rama's revenge 15:43, 15 October 2019 (UTC)

Update to NopInserter Gadget

A while back, while debugging an unrelated issue, I found a bug in MediaWiki:Gadget-NopInserter.js that prevented it from displaying the intended visual indication of its operation. I also found that at the time of implementation there had been some differing preferences for what type of visual indicator be used and when prompting for confirmation was appropriate. I have therefore created an updated version in User:Xover/Gadget-NopInserter.js that fixes the bug and adds configuration options for whether to confirm addition of a {{nop}}, the style of visual indicator, and the duration of the indicator effect. To try it out you can add the following to your common.js (but disable the site-wide gadget in your preferences first!):

mw.config.set('userjs-nopinserter', {
	dontConfirmNopAddition: true,
	notificationStyle: "highlight",
	notificationTimeout: 1000
});

mw.loader.load('//en.wikisource.org/w/index.php?title=User:Xover/Gadget-NopInserter.js&action=raw&ctype=text/javascript');

It also fixes the bug that prevented the site-wide gadget from actually showing the outline-based highlight it was supposed to. And for good measure I added support for a notificationStyle using MediaWiki's bubble notifications (set notificationStyle: "message" to try it out). The weird double-negative construction of "dontConfirmNopAddition" is just because I've preserved the default behaviour of the site-wide gadget. If you remove everything except the mw.loader.load line (no setting of options) you will get the old default behaviour with just the bugfix.
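For anyone wanting to try the bubble-notification variant, a minimal sketch of the same common.js setup with notificationStyle switched to "message" might look like the following. The helper name nopInserterOptions is purely illustrative and not part of the gadget; the option names, values and URL are taken from the snippet above.

```javascript
// Hypothetical helper (not part of the gadget): builds the options object
// stored under the 'userjs-nopinserter' config key.
function nopInserterOptions(style) {
	return {
		dontConfirmNopAddition: true, // same value as in the snippet above
		notificationStyle: style,     // "highlight" or "message"
		notificationTimeout: 1000     // same value as in the snippet above
	};
}

// Only touch the mw object when actually running inside MediaWiki.
if (typeof mw !== 'undefined') {
	mw.config.set('userjs-nopinserter', nopInserterOptions('message'));
	mw.loader.load('//en.wikisource.org/w/index.php?title=User:Xover/Gadget-NopInserter.js&action=raw&ctype=text/javascript');
}
```

As before, the site-wide gadget should be disabled in preferences first so the two versions do not both run.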

The changes can be seen in this diff.

The changed version has had some limited testing and seems ready for wider testing. I therefore propose that we update MediaWiki:Gadget-NopInserter.js with this version. Note that since we do not have interface administrators locally, I will have to request this edit from the Stewards at meta, and they will require a community discussion to verify that this is indeed a change in line with community consensus. It would therefore be very helpful if as many as possible indicated whether or not you support this proposal. --Xover (talk) 12:37, 6 October 2019 (UTC)

I think local bureaucrats can set the "Interface administrators" bit. Mpaa (talk) 14:18, 6 October 2019 (UTC)
Bureaucrats have the technical ability to flip that bit, yes. But by WMF Legal-imposed policy it requires 2FA, and so can't just be assigned ad hoc like other local permissions. And since we have no permanent interface admins, nor any "list of people willing and able to make interface admin-edits", we don't actually have any functioning local way to request such changes; unless you yourself happen to have 2FA enabled for other reasons. Thus, asking the Stewards at Meta is actually the easiest option for getting such changes made currently. --Xover (talk) 05:55, 7 October 2019 (UTC)
OK, just a bit weird that it is 'technically' possible but not 'legally' possible without 2FA. If they want to be on the safe side, they should not allow it without 2FA.
  Support Anyhow, I am fine with the proposal. Mpaa (talk) 19:37, 7 October 2019 (UTC)

Proposed update to template:header

There has been a discussion about adding some parameters to the header to capture contributors to sections/subpages of works, and to create synonyms to have more generic parameters available. The proposed changes are at template talk:header#Action to be taken October 2019. Flagging this prior to implementation in case anyone sees problems or has major issues that should stop moving forward. — billinghurst sDrewth 21:43, 17 October 2019 (UTC)

  •   Support --Xover (talk) 08:52, 18 October 2019 (UTC)
  •   Support, I do miss the section translator in the template. --Jan Kameníček (talk) 17:17, 18 October 2019 (UTC)

Bot approval requests

Repairs (and moves)

Designated for requests related to the repair of works (and scans of works) presented on Wikisource

New pdf and index repairs for Once a Week volume 2

One of the pages of the existing file for Index:Once a Week, Series 1, Volume II Dec 1859 to June 1860.pdf is in the wrong place in the pdf. I tried to upload a repaired pdf but couldn’t figure out how to overwrite the existing file using chunked upload. I have stored the better pdf at the following filename on the Commons: c:File:Once a Week, S1 V2.pdf. Could someone please use that file to overwrite c:Once a Week, Series 1, Volume II Dec 1859 to June 1860.pdf and then delete my temporary storage file? Also, the index needs to be corrected as follows (numbers are the file pages, not the magazine numbering). The main namespace transclusion page numbers are already correct for the new numbering.

  • no change: new 1-88 = old 1-88
  • new 89-100 = old 90-101
  • new 101 = old 89
  • no change: new 102-635 = old 102-635
  • in the new version, there are 4 no-content pages after the last page; in the old one, there are 8.

Levana Taylor (talk) 20:14, 30 August 2019 (UTC)
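
The renumbering listed above can be written down as a small lookup, which may be useful to whoever moves the pages in the Page: namespace with a bot. This is only a sketch in Python: the function name `old_to_new` is hypothetical, and the ranges are copied from the list above. The trailing no-content pages after 635 are deliberately not covered.

```python
def old_to_new(old_page: int) -> int:
    """Map a file page number in the old PDF to its position in the new PDF.

    Ranges are taken from the list above. Pages past 635 (the trailing
    no-content pages) are not handled by this sketch.
    """
    if 1 <= old_page <= 88 or 102 <= old_page <= 635:
        return old_page        # unchanged
    if old_page == 89:
        return 101             # the misplaced page moves to position 101
    if 90 <= old_page <= 101:
        return old_page - 1    # these pages shift down by one
    raise ValueError(f"page {old_page} is outside the documented ranges")
```

Since some targets are occupied by pages that also need to move, a bot using such a mapping would probably have to move pages via temporary titles or in a carefully chosen order.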

@Levana Taylor: I've uploaded the new version over c:File:Once a Week, Series 1, Volume II Dec 1859 to June 1860.pdf and tagged c:File:Once a Week, S1 V2.pdf for deletion. But someone with AWB or similar tools must take care of the rejigging of pages in the Page: namespace here. --Xover (talk) 20:45, 30 August 2019 (UTC)
Thanks, that was fast! Levana Taylor (talk) 20:48, 30 August 2019 (UTC)

Once a Week pages shifted

Pages 76 to 88 of Index:Once a Week, Series 1, Volume II Dec 1859 to June 1860.pdf are shifted. Is there some automatic way to repair them? --Jan Kameníček (talk) 06:18, 9 September 2019 (UTC)

Done. Mpaa (talk) 20:04, 24 September 2019 (UTC)

File:Odes and Carmen Saeculare.djvu

Would it be possible for someone to fix the broken offset in this file? Much appreciated —Beleg Tâl (talk) 19:40, 26 September 2019 (UTC)

Done. Mpaa (talk) 20:31, 6 October 2019 (UTC)

Other discussions

Proofreading line by line

I tried to do some proofreading and I experienced two difficulties:

1. I need to jump back and forth between the text version and the scan, and usually up and down as well, since the lines aren't aligned.

2. I need to proofread a whole page in order to submit, which means that if I don't have 10+ minutes to carefully check the whole page, I had better not start.

My proposed solution is to have the scans broken down to lines, and then one will be able to proofread one line at a time. It will also allow for mobile users to contribute easily.

This approach was implemented by Haifa University in the Tikkoun Sofrim project.

I would like to know if there is any reason this wasn't done. If there is no such reason, I would like to start developing this tool and, as an MVP, simply upload a PDF with one line per page. Uziel302 (talk) 20:22, 19 September 2019 (UTC)

1) When proofreading, you should be able to drag the image to a position where the lines are aligned.
2) When I am only able to proofread part of a page, I will usually place a comment like <!-- PROOFREAD UP TO THIS LOCATION --> at the place where I proofread up to. Or you can just proofread part of it and make life easier for the next person who finishes the page.
3) If your tool requires that users chop up book scans line-by-line, I don't imagine it will gain much traction, but I'm certainly intrigued. —Beleg Tâl (talk) 20:33, 19 September 2019 (UTC)
I would be enthusiastic about a well-written proofreading gadget of the sort you describe; it would take quite a bit of programming though! I don’t think hand-uploading a line-by-line PDF is the solution. Too much work by far. Instead, image recognition has now advanced to the point that software should be able to pull apart an image into one-line image strips (and it could deal with multiple columns too -- not too hard to recognize them, or it could ask you how many columns). This would be far less of a challenge than OCR! Levana Taylor (talk) 20:39, 19 September 2019 (UTC)
At the very least, the DjVu and PDF files must have the OCR'd text locations available. Furthermore, if the scan is from the IA, this information is directly available in the djvu.xml file in terms of page, column, line and word co-ordinates, so you don't even need to pick apart the file to get at it. For example: a random book's XML. Inductiveloadtalk/contribs 21:29, 19 September 2019 (UTC)
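As a rough illustration of how line strips could be derived from such word co-ordinates, here is a minimal Python sketch. The XML fragment is made up for the example and only assumed to resemble the LINE/WORD nesting of a djvu.xml file; the `coords` attribute is assumed to hold four comma-separated integers (two x values and two y values), and the function name `line_boxes` is hypothetical. The bounding box is computed with min/max so that it does not depend on the order in which the edges are listed.

```python
import xml.etree.ElementTree as ET

# Made-up fragment, assumed to resemble the nesting of a djvu.xml file.
SAMPLE = """
<OBJECT>
  <HIDDENTEXT>
    <PAGECOLUMN><REGION><PARAGRAPH>
      <LINE>
        <WORD coords="100,250,180,200">Once</WORD>
        <WORD coords="200,252,230,205">a</WORD>
        <WORD coords="250,251,360,198">Week</WORD>
      </LINE>
      <LINE>
        <WORD coords="100,320,300,270">Volume</WORD>
      </LINE>
    </PARAGRAPH></REGION></PAGECOLUMN>
  </HIDDENTEXT>
</OBJECT>
"""

def line_boxes(xml_text):
    """Return (text, (x_min, y_min, x_max, y_max)) for every LINE element."""
    root = ET.fromstring(xml_text.strip())
    result = []
    for line in root.iter("LINE"):
        xs, ys, words = [], [], []
        for word in line.iter("WORD"):
            # Use only the first four numbers; any extra value is ignored.
            nums = [int(n) for n in word.get("coords").split(",")[:4]]
            xs += [nums[0], nums[2]]
            ys += [nums[1], nums[3]]
            words.append(word.text or "")
        if words:
            box = (min(xs), min(ys), max(xs), max(ys))
            result.append((" ".join(words), box))
    return result

for text, box in line_boxes(SAMPLE):
    print(box, text)
```

Each box could then be used to crop the page image into one strip per line, with the OCR text of that line shown next to it.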
I can imagine that proofreading line by line works well if the contributors focus on pure text. However, Wikisource tries to focus also on text formatting, and so it is much better to work with the whole page. Formatting each line separately would be too much work and I suppose that putting the formatted lines together would also be very difficult. --Jan Kameníček (talk) 21:33, 19 September 2019 (UTC)
What Jan said ^^^, and most importantly where hyphenation, formatting and wikilinks are concerned. FWIW, Trove has its newspapers line by line, and it works there because it replicates a work and is focused on text search; it works not so much for the reproduction of books and works. — billinghurst sDrewth 03:38, 20 September 2019 (UTC)
i kinda agree. final formatting is important, but it would be good to get something mobile friendly or small screen friendly, for a preliminary pass. just a gamified find and replace for typical scan errors would be nice. and we could AI it, for suggested replaces for a work. (and possibility of feedback to OCR) Slowking4Rama's revenge 20:38, 20 September 2019 (UTC)
Slowking4, it's kinda funny, I already run a gamified find and replace small-mobile-friendly tool, and it helped us correct over 14000 typos on Hebrew Wikipedia. I ran the scan for English Wikisource and imported the tool, attached screenshot. But it seems to be irrelevant to correct typos, since many typos are sic, thus written. The only right way to correct such typos would be by attaching relevant screenshot of the source, but that would be much more work on my side. You are more than welcome to help me with the tool on English Wikipedia: Wikipedia:Correct typos in one click. Uziel302 (talk) 13:25, 28 September 2019 (UTC)
We note original production errors with either the invisible {{sic}} or the visible {{SIC}}. If we are talking about OCR errors, there are some weird and wonderful ones, and I repeatedly build endless versions of my automated regex replacements, though I find that many are internal to a work. — billinghurst sDrewth 11:26, 29 September 2019 (UTC)
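For readers unfamiliar with the idea, automated regex replacements for OCR errors can be sketched as below. This is a minimal Python illustration and not anyone's actual replacement list: the patterns and the function name `apply_fixes` are hypothetical. As noted above, real lists tend to be specific to a work, and anything that is an original printing error should be marked with {{sic}} or {{SIC}} rather than replaced.

```python
import re

# Hypothetical OCR fixes; a real list would be built up per work.
OCR_FIXES = [
    (re.compile(r"\btbe\b"), "the"),      # "h" misread as "b"
    (re.compile(r"\bwbich\b"), "which"),
    (re.compile(r" ,"), ","),             # stray space before a comma
]

def apply_fixes(text):
    """Apply every (pattern, replacement) pair to the text, in order."""
    for pattern, replacement in OCR_FIXES:
        text = pattern.sub(replacement, text)
    return text

print(apply_fixes("tbe book , wbich was printed in 1859"))
```

Word boundaries (`\b`) keep the patterns from touching longer words that merely contain the same letters.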

Wikilinks

Am I the only one who finds this overkill? Mpaa (talk) 14:00, 22 September 2019 (UTC)

Yes, this is a real extreme. Imo, wikilinks should be made in two cases:
  1. To other texts to which the author clearly wanted to refer (e.g. to a book, journal article, encyclopaedia entry) and which are also present at Wikisource.
  2. To more general pages that contain links to various other texts relevant to the topic and present at Wikisource, like to author pages or portals.
Links to articles of a specific encyclopaedia are undesirable, as they distract readers too much, see Wikisource:Wikilinks#Overlinking. Such extensive linking does not fall under basic linking described in Wikisource:Wikilinks#Basic wikilinks and should follow the rules for annotated works. --Jan Kameníček (talk) 14:26, 22 September 2019 (UTC)
@Mpaa: I agree that this example is clear overkill, but perhaps not by quite as much as you and Jan.Kamenicek think.
This one is an extreme example, but if we say this is the outer edge of a scale that has "zero links" as the other terminus, I think this one is quite possibly closer to the ideal compromise than the "zero links" extremity. We obviously shouldn't be linking common words like "salmon" in a biographical encyclopedia entry, and for links other than author/portal or work titles the use of redlinks is questionable, but providing links that enhance the reader's understanding is in general a good thing. The sheer density of links is clearly a problem in that text, but if you take out the obviously excessive ones ("salmon", "July", "published", etc.), the result is actually within the area where "reasonable men may disagree". Providing a link from "Lake Champlain" (which I've never heard of and have no idea where it is or even what it is, above guessing that it's most likely a lake somewhere that they primarily speak English) is a good thing! In this specific case, since the work is an encyclopedia, these would most naturally be to internal entries in the same work; but for things like scholarly monographs, I think even links to Wikipedia would be a good thing. There are also some necessary distinctions related to the type of work (scholarly monographs vs. fiction vs. poetry) or context in the work (inside a quotation vs. in the author's own voice) that quite strongly affect whether linking is appropriate at all, what to link, and how much.
WS:Links does not provide particularly clear guidance, and I am not at all convinced it still reflects community consensus (or, at least, that the interpretation of its practical consequences does), so I have a todo item for starting a community wide discussion about that (I'm, among other things, waiting for the currently open proposals to conclude to avoid overwhelming people). If nothing else, I am hoping to make the guideline clearer and to more directly address a few of the most obvious and common cases. If anybody has thoughts about this I would welcome them, both in advance to help inform my proposal and when such a proposal is eventually posted. --Xover (talk) 08:51, 23 September 2019 (UTC)
What you describe is exactly what annotated versions are for, and outside annotated versions links should be used sparingly.
Links to our author pages and portals make sense because there are not that many of them and because they introduce a variety of our works connected with the person/topic. The aim of the links as I understand them is not to provide explanation (why should we choose a source of explanation for the readers? – they can surely do it much better themselves and definitely not in an encyclopaedia which is 100+ years old.) but simply to show that we have other works on the topic. To avoid overlinking, this should be done only if we have multiple works of that kind (gathered at author pages and portals) and not if we have only one old encyclopaedia entry. If explanation is really needed, Wikipedia can definitely be a better choice for such a purpose than Britannica, but it should be treated really cautiously. In fact the argument can be repeated: why should we choose Wikipedia for the readers, who would otherwise look for explanation by themselves and maybe find even better sources? Besides that, I personally always get quite irritated if the links go in different directions (once to an author page and another time to Wikipedia), so that I am not able to predict where I am going to end up and whether I will find there what I am looking for. Links should also be predictable. When there is a link from a name of another work, it should always go to our text of the work (like The Bartered Bride) and never to Wikipedia (like The Bartered Bride), even if we did not have such text at Wikisource, because readers expect that such links go to works and not to articles about works. --Jan Kameníček (talk) 15:09, 23 September 2019 (UTC)
Thanks. All good points. But let me just be clear that my point is that the place where we draw the line to an Annotation (and subject to the WS:Annotations policy), where wikilinks are concerned, is currently not sufficiently clear, and not necessarily in the right place (depending on the interpretation). There is a world of difference between wikilinks, on the one hand, and added illustrations and explanatory notes, on the other. Depending on how one interprets the linking policy, it is currently extremely conservative, and I assert it should be loosened at least a little bit. --Xover (talk) 16:47, 23 September 2019 (UTC)
Something I have struggled with is old transcription schemes in works like SBE3, where you might have proper nouns like "Sze-mâ Khien", which, in modern pinyin is Sima Qian. It's normally fairly obvious when to link when there's an Author, Portal or Wikipedia page available, but sometimes there isn't.
An example is "ℨû-lâi" (Culai, 徂徕), which is a mountain in Shandong. My occasional approach here has been to use a tooltip with the modern transliteration and characters. However this is sub-optimal, as it's impossible to select/copy the tooltip text (useful if you want the characters and don't have a Chinese input method), they can't be hovered on mobile devices, and they don't appear in the exported formats. Would annotated references be acceptable here, and if so, how to indicate that they are not original references (when the work has references)?
The reason I'm particularly interested is because older texts on China universally use old transcription schemes (the current system dates from the 1950s; the one here is Legge romanization), and it's unfriendly to make readers decipher these themselves. In particular for Chinese, even if you understand the transcription mapping to pinyin, the Chinese character is not known without further research. This research if done by the proofreader seems a shame to throw away, even if it's not part of the original.
I think it's a case where a documented "optional best practice" would help. I.E.: you don't have to do it, but if you do, "X" is the preferred way to do so. Inductiveloadtalk/contribs 14:45, 24 September 2019 (UTC)
My opinion is that we should focus on enabling people to access old publications, not to try to modernize them. If the publication uses outdated transliteration, we should copy it and that is all. Whatever else you add, like tooltips with modern transliteration, falls under WS:Annotations policy. --Jan Kameníček (talk) 18:10, 24 September 2019 (UTC)

US copyright and the inclusion policy

For the longest time, Wikisource's inclusion policy imposed additional criteria for including texts published on or after January 1 1923. This happened to coincide with what was then the public domain cutoff date for published works in the United States, although the policy makes clear that these are to be considered separate concerns. With the public domain rolling forward in the US at last, a user changed the references to 1923 to a dynamic CURRENTYEAR-95 expression. I have reverted this change pending discussion. Do we want to roll forward our unconditional inclusion threshold along with the copyright law or do we want it to stay put? Phillipedison1891 (talk) 06:54, 25 September 2019 (UTC)
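To make the arithmetic behind the dynamic expression concrete, here is a tiny Python sketch. It only illustrates the "current year minus 95" calculation mentioned above; the function name `rolling_cutoff_year` is hypothetical, and nothing here reflects how the policy page itself is implemented.

```python
from datetime import date

TERM_YEARS = 95  # the offset used in the expression described above

def rolling_cutoff_year(today=None):
    """Return the year that 'current year minus 95' evaluates to."""
    today = today or date.today()
    return today.year - TERM_YEARS

# Evaluated for the date of this discussion.
print(rolling_cutoff_year(date(2019, 9, 25)))
```

A fixed threshold would stay at 1923, whereas the rolling value advances by one with each new calendar year.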

why are you now raising this issue after a gap of nine months? without a discussion on the user’s talk page ? Slowking4Rama's revenge 13:33, 25 September 2019 (UTC)
@Phillipedison1891: The change was purposeful, and reflects the point that the 1923+95 has been reached, and each year that now passes increments the year we can have. The community had been awaiting that anniversary, and the consensus was determined at the creation of the template all those years ago. Please undo your change. — billinghurst sDrewth 13:47, 25 September 2019 (UTC)
@Billinghurst: The change has been reverted, although it appears from the original 2007 discussion that the 1923 cutoff for scope was arbitrary and wasn't necessarily linked to public domain. Phillipedison1891 (talk) 16:20, 25 September 2019 (UTC)
the lack of discussion is more a function of a small wiki more interested in doing work, than memorializing consensus. it is not more restrictive than PD, but rather using whatever works commons will allow. the use of simplistic date hurdles is for them, using PD not renewed works are at the hazard of summary deletions there, for example c:Commons:Deletion requests/American seashells (1954). however, fair use of jeppeson charts in PD documents are not ok. but if you want to build a consensus for "what is in scope" go for it. Slowking4Rama's revenge 21:26, 28 September 2019 (UTC)
@Slowking4: Belated thanks to the people that retrieved those... I missed that discussion on Commons about the images, but had actually looked up the original registration and checked for a renewal by number before uploading the book, and it was also cleared by the Copyright Review Management System at HathiTrust (i.e. actual paid copyright experts). Disturbing that people would delete items specifically uploaded as 'not renewed' without actually checking, particularly works with a renewal date that would be in the online USCO database. (FWIW, that publisher was bought out by a big conglomerate in 1968, and most of their publications were probably never renewed.) Jarnsax (talk) 05:52, 8 October 2019 (UTC)
It's also in progress over here at Index:American Seashells (1954).djvu (blatant plug for help) :) Jarnsax (talk) 05:58, 8 October 2019 (UTC)
How strictly will we enforce the URAA copyright restoration? Please consider the choices from m:United States non-acceptance of the rule of the shorter term#Statement from Wikimedia Foundation.--Jusjih (talk) 02:29, 9 October 2019 (UTC)
there is no consensus for URAA interpretation either here or on commons, see also c:Commons:Village_pump/Copyright#URAA_revisited_in_2019 and "The WMF does not plan to remove any content unless it has actual knowledge of infringement or receives a valid DMCA takedown notice. To date, no such notice has been received under the URAA. We are not recommending that community members undertake mass deletion of existing content on URAA grounds, without such actual knowledge of infringement or takedown notices." [1] -- Slowking4Rama's revenge 10:58, 18 October 2019 (UTC)
Last time we had this discussion, it was pretty clear there was consensus for not supporting texts that aren't in the public domain in the US on Wikisource.
Ignoring the URAA doesn't mean that the rule of the shorter term would come into play. A huge number of works from as late as the 1980s would be PD. It also doesn't matter for many British and Canadian works, which were routinely published and even renewed in the US. E.g. the last works of H. Rider Haggard are still in copyright in the US, despite his death in 1925, with or without the URAA, because they were renewed. Many more may be out of copyright, no matter when their authors died, because they were registered with the US copyright office and not renewed.--Prosfilaes (talk) 15:50, 18 October 2019 (UTC)
Yes. Chinese Wikisource applies the URAA as the WMF does, without ignoring it. That is why I ask here how strictly we enforce the URAA, actively or passively.--Jusjih (talk) 03:01, 21 October 2019 (UTC)

Talk page junk

Can someone speedy-delete all these: [2], please? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:53, 25 September 2019 (UTC)

@Billinghurst: - This really needs an abuse filter as well, so that the IPs concerned (LTA) are auto-blocked the moment they start spamming. ShakespeareFan00 (talk) 22:18, 25 September 2019 (UTC)

If it were possible to write an abuse filter to stop someone from being a spammer, a vandal or a general PITA, then it would have been achieved years ago. You have seen this person morph their behaviour, and that is the thing about humans, they are adaptable. We do what is possible that still allows editing from IP addresses. — billinghurst sDrewth 20:56, 26 September 2019 (UTC)

Tech News: 2019-40

16:49, 30 September 2019 (UTC)

The consultation on partial and temporary Foundation bans just started

-- Kbrown (WMF) 17:14, 30 September 2019 (UTC)

help please

What can I add to this project? Baozon90 (talk) 18:06, 1 October 2019 (UTC)

@Baozon90: There's a lot you can do. One thing that is simple but requires a lot of care and patience is validating pages. You can find any page in Category:Proofread, ensure that the contents match the original, and mark it as validated. Adding a new text is a great thing to do as well but is more complicated. For someone entirely new to Wikisource, I'd recommend trying something small but valuable like validation as there are a lot of pages that could use it but don't require a lot of specialized skill to review. —Justin (koavf)TCM 18:19, 1 October 2019 (UTC)
@Baozon90: If you are particularly indecisive, and want a random page from the Proofread pages, you can try the Special:RandomInCategory page; I suggest you enter "Proofread" into the field there, if you want to follow @Koavf's suggestion of starting with validation. Dcsohl (talk) 17:14, 2 October 2019 (UTC)
Wikisource:Proofread of the Month is a good place to start. and there are maintenance categories. Category:Index Not-Proofread. if you are talking about texts, we accept PD or CC texts; it helps if they are at internet archive [5] --Slowking4Rama's revenge 15:56, 3 October 2019 (UTC)
@Baozon90: ^^^ what Slowking4 said about POtM is perfect for learning about our systems, our quirks, and to be supported while doing so. Once you have the basics, then is the time to branch out. — billinghurst sDrewth 02:34, 4 October 2019 (UTC)
also if the Optics work is a little math heavy for you, there is a list on talk to work on Wikisource_talk:Proofread_of_the_Month. they give a good sample of work in progress. Slowking4Rama's revenge 23:35, 6 October 2019 (UTC)

Is Wikisource no longer getting indexed by Google and other search engines?

Several recently-created pages seem to be completely absent from Google. If I search for the complete title, and even add "site:wikisource.org", Google comes up with nothing. I see the same with Bing.

For example, I tried searching for "Ritual of the Order of the Eastern Star", which is currently at the top of the "New Texts" section of the front page, and again nothing (the index file comes up, but not the actual Wikisource text).

Has Wikisource done something to deter indexing, such as the use of a robots.txt file? If so, why? It makes Wikisource content harder to find, which is surely inimical to the success of the project. Grover cleveland (talk) 09:48, 4 October 2019 (UTC)

I've done some more digging, using the Wikisource:Works/2019 page. Notes on the Ornithology of Oxfordshire, Aplin 1899-1900 (added August 1st, so more than two months old) appears to be absent from Google. Other (older) pages that link to it are hit, though. It appears that Google isn't indexing newly-created pages, but is continuing to update its indices of older pages. Grover cleveland (talk) 10:12, 4 October 2019 (UTC)

Copyright and deletion discussions needing community input in October 2019

The following copyright discussions and proposed deletions discussions have been open for more than 14 days, and with more than 14 days since the last comments, without a clear consensus having emerged. This is typically (but not always) because the issue is not clear cut or revolves around either interpretation of policy, personal preference within the scope afforded by policy, or other judgement calls (possibly in the face of imperfect information). To resolve these discussions, wider input from the community would be valuable.

Copyright discussions require some understanding of copyright and our copyright policy, but often the sticking points are not intricate questions of law so one need not be an intellectual property lawyer to provide valuable input (most actual copyright questions are clear cut, so it's usually not these that linger). For other discussions it is simply the low number of participants that makes determining a consensus challenging, and so any further input on the matter would be helpful. In some cases, even "I have no opinion on this matter" would be helpful in that it tells us that this is a question the community is comfortable letting the generally low number of participants in such discussions decide.


Copyright discussions


Proposed deletions


Note that while these are discussions that have lingered the longest without resolution, all discussions on these pages would benefit from wider input. Even if you just agree with everyone else on an obvious case, noting your agreement documents and makes obvious that fact in a way the absence of comments does not. The same reasoning applies for noting your dissent even if everyone else has voted otherwise: it is good to document that a decision was not unanimous.

In short, I encourage everyone to participate in these two venues! --Xover (talk) 09:21, 5 October 2019 (UTC)

Broken links to scans from Index pages

We can usually get to a scan at Commons when we click on "djvu" or "pdf" in the Source field of an Index page. However, from time to time I come across a page where the link does not work, for example at Index:Bohemian Review, 1917–Czechoslovak Review, May 1919.djvu. How can it be fixed? --Jan Kameníček (talk) 16:18, 6 October 2019 (UTC)

I experimented a bit and found that it's because of the dash in the filename. The Index page template checks if the file exists in order to suppress redlinks. It checks using {{PAGENAMEE}}, which returns the file name "Bohemian_Review,_1917%E2%80%93Czechoslovak_Review,_May_1919.djvu". Using this filename still works: File:Bohemian_Review,_1917–Czechoslovak_Review,_May_1919.djvu - but {{#ifexist:File:Bohemian_Review,_1917%E2%80%93Czechoslovak_Review,_May_1919.djvu}} returns false. —Beleg Tâl (talk) 01:52, 7 October 2019 (UTC)
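The encoded form quoted above is simply the UTF-8 percent-encoding of the en dash. The following Python sketch reproduces the string and shows that decoding it gives back the original title. Python's `urllib.parse.quote` is used here only as a stand-in: it is not MediaWiki's encoder, and the `safe=","` argument is an assumption made so that the commas stay literal as in the quoted output.

```python
from urllib.parse import quote, unquote

title = "Bohemian Review, 1917–Czechoslovak Review, May 1919.djvu"
underscored = title.replace(" ", "_")

# Stand-in for the encoded form: non-ASCII characters become
# percent-encoded UTF-8 bytes; "," is kept literal to match the quote above.
encoded = quote(underscored, safe=",")
print(encoded)

# Decoding restores the en dash, so both strings name the same file;
# a lookup that compares the encoded string literally would not match.
print(unquote(encoded) == underscored)
```

This is consistent with the observation that the file link still works with the encoded name while the existence check on the same string fails.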
I'm not sure why {{PAGENAMEE}} is used instead of {{PAGENAME}}. A related discussion at Wikisource:Scriptorium/Archives/2011-05#Special:IndexPages seems to suggest there was a bug that needed to be worked around maybe. @Billinghurst: I think you were involved in those discussions, do you know anything about this? —Beleg Tâl (talk) 01:55, 7 October 2019 (UTC)
As is sometimes the case, it is quite a while and probably a number of versions ago ... <shrug> I am surprised to see it used inside File: as that seems contrary to logic, though sometimes back in the earlier days we did what worked and who knows what testcases were done, and GOIII was not the best summary-user to understand what was being done or tested. I think that file: and media: use can progress as PAGENAME rather than PAGENAMEE. — billinghurst sDrewth 07:28, 7 October 2019 (UTC)

  Done, and the Index: work identified now displays, though we should be checking a wider range of index pages with unusual characters in the title to look for good test cases. — billinghurst sDrewth 07:35, 7 October 2019 (UTC)

checked the first and last pages of Category:Index Validated and those unusual lead characters display fine. — billinghurst sDrewth 07:39, 7 October 2019 (UTC)
Bleh. The documentation suggests that this should fail for all files containing the characters ', ", and &. But testing does not confirm that. It does indicate some weirdness though: Index:"I solemnly swear that I won't eat no more ice cream what's made with sugar nor no more candy what's made with sugar. Ho - NARA - 512512.jpg.
What it looks like might have happened here is… Well, that MediaWiki magic words are a poorly designed mess. {{PAGENAME}} returns the page name with the characters ', ", and & HTML encoded. {{PAGENAMEE}} returns the page name with a set of characters URL encoded. But the -E variant doesn't actually do real URL encoding, just a weird MediaWiki-specific variant. For proper URL encoding you need to use {{urlencode:…}} from mw:Extension:ParserFunctions. No magic word or function will actually get you the raw page name with no encoding applied.
Back when this checking was implemented in MediaWiki:Proofreadpage index template, using {{PAGENAME}} with {{#ifexist:…}} failed. I'm not quite sure how using the -E variant actually helped, but that appears to be the reason for the change. However, scanning through the listed bugs (a horror story), it looks like (but isn't actually documented anywhere that I can find), that someone at some point implemented a workaround in {{#ifexist:…}} that actually decodes the encoding applied by {{PAGENAME}} in order to make this work.
The net result is that we are relying on a poorly designed mess of stacked special-case workarounds. Lua has functions that may or may not avoid this, but I can't determine that for certain, and it would require rewriting the whole template in Lua, which may not be worth the effort involved (then again, maybe it is!). --Xover (talk) 11:25, 7 October 2019 (UTC)
The page that you indicated above displays fine for me, no weirdness. I am guessing that when we made a change to one of the magic words, someone changed them all for continuity, not accounting for quirks. Personally, I just want it to work, and care less about the path—it is low exposure, and presumably low overheads in the holistic sense. — billinghurst sDrewth 11:36, 7 October 2019 (UTC)
@Billinghurst: Do you see a "Source jpg" on that Index page (just above the progress field)? It's not showing up for me, not even after purging it. --Xover (talk) 11:54, 7 October 2019 (UTC)
Yes, I see it. Note that in the "olden days" we used to have to wiki-natively insert jpg images, and I would still do so when setting up an Index: page. The use of just the (page) number is not something that I would have even tried to do. — billinghurst sDrewth 12:04, 7 October 2019 (UTC)
That's very strange, as I tested it in a separate browser and not logged in; and it's not even included in the HTML source of the page. Are you sure we're looking at the same page? However, the reason it didn't show up would seem to be that the "Scans" dropdown for this index was set to "other" rather than "jpg". When I changed that it showed up properly as expected. It was thus probably unrelated to the problem that's the topic of this thread. --Xover (talk) 12:45, 7 October 2019 (UTC)

Tech News: 2019-41

15:34, 7 October 2019 (UTC)

Ongoing discussion about DjVu files at Wikimedia Commons

Hello Wikisource community,

There is an ongoing discussion about the hosting of files in the DjVu format on Wikimedia Commons over at "Commons:Village pump/Proposals#DjVu is dead. We should deprecate it for new uploads." As this format is almost universally implemented on Wikisource, I request input and feedback from the Wikisource community over there. It's best to engage in discussion before "voting" as this is not a referendum but a proposal. -- DonTrung (徵國單)  (討論 🤙🏻) (方孔錢 ☯) 08:05, 8 October 2019 (UTC)

As notification about this discussion has so far only been posted here at English Wikisource (and a million thanks to Donald Trung for taking the time out to do so. Very much appreciated!), but it is of potential interest to all the different language Wikisources (at least any of them that use DjVu files; but even those that do not use DjVu have indirect stakes in the outcome), we should try to make sure they are also made aware of it. I am uncertain whether we have any established channels for this. mul:Wikisource:Scriptorium would seem to be one possible avenue, but I am uncertain to what degree the non-English projects watch that forum. Can anybody advise on this? Billinghurst perhaps? --Xover (talk) 09:42, 8 October 2019 (UTC)
1) There is the wikisource-l mailing list that would be worthwhile being pinged. 2) We could look to set up a MediaWiki message delivery distribution list at Meta where we list the Wikisource wikis, and add the pages for the respective Scriptoriums per d:Q16503 (63 entries). There may be the odd other pages at some wikis if they don't have a WS:S. — billinghurst sDrewth 10:18, 8 October 2019 (UTC)
See m:MassMessage; there is a list of interested users at m:Global message delivery/Targets/Wikisource Community User Group participants and m:Global message delivery/Targets/Wikisource News (en). billinghurst sDrewth 10:21, 8 October 2019 (UTC)
MassMessage list for WS:S built at m:Global message delivery/Targets/Wikisource Scriptoriums
Note also that, as massmessage is a right allocated to admins, I have it at Meta and can send any prepared message. MM at a wiki will only send locally; MM at Meta is able to send globally. — billinghurst sDrewth 10:37, 8 October 2019 (UTC)
@Billinghurst: Wouldn't Donald's original notification here work well?
Sure, I simply wish for it to be seen as the community's considered and sanctioned message, rather than mine. Whether it is notification alone, or part opinion, or consideration of consequence, or condemnation … — billinghurst sDrewth 20:17, 8 October 2019 (UTC)

Hello Wikisource community,

There is an ongoing discussion about the hosting of files in the DjVu format on Wikimedia Commons over at "Commons:Village pump/Proposals#DjVu is dead. We should deprecate it for new uploads." As this format is almost universally implemented on Wikisource, your input and feedback would be valuable over there.

Or perhaps we should specify that this is the English Wikisource community passing it on? Perhaps by tacking on a "The English Wikisource community has been made aware of a discussion that may be of interest to your project." at the beginning? --Xover (talk) 13:25, 8 October 2019 (UTC)

I withdrew the proposal. Kaldari (talk) 05:13, 9 October 2019 (UTC)

i kinda agree with the assessment, but would prefer an action plan to make IA uploader produce pdf’s from jp2. this would sunset the djvu. -- Slowking4Rama's revenge 13:09, 9 October 2019 (UTC)
Such an action plan will have little value if the points raised by Xover at Commons are not solved. --Jan Kameníček (talk) 17:38, 9 October 2019 (UTC)

Request for comment: Are drop by copy and pastes still in scope?

When an IP address decides to copy and paste a text from an unknown source, e.g. as was done at Turandot, do we at this time still consider these in scope per Wikisource:What Wikisource includes? I think that we are past the point of just being an ugly text paste factory, given the work that it entails for others to curate and manage texts that are of unknown source and cannot be proofread further. If it was a registered editor, then I would normally follow up and see what is possible, though IP addresses are just problematic. — billinghurst sDrewth 20:58, 8 October 2019 (UTC)

If we were to consider these out of scope, where would you draw the line between copydump and legitimate text? Would you consider any text added by an IP without a clearly identified source to be out of scope? —Beleg Tâl (talk) 21:03, 8 October 2019 (UTC)
I wonder if it would be possible / beneficial to implement something like Commons' Upload Wizard, to push new users towards adding texts properly (i.e. with scans), while still leaving the regular methods available to those who know where to look, or take the time to find them. It is not foolproof but would possibly cut down on this type of drive-by edit. —Beleg Tâl (talk) 21:08, 8 October 2019 (UTC)
here is a scan (does not appear to match) https://archive.org/details/turandotprincess00volluoft -- it’s unclear to me if a wizard will push this editor. don’t know if his periodic dumps rise to the level of nuisance, but ymmv. we could use some FAQ and onboarding for new editors, a landing page and a welcome wagon. Slowking4Rama's revenge 02:25, 9 October 2019 (UTC)
@Slowking4: I am looking for a reasonable means for an administrator to look at a work and call it irretrievable, ugly and high maintenance. I am not necessarily looking to be hard-arsed belligerent about works that we can suitably manage or retrieve, or where we can have a conversation with the contributor. Looking at Turandot should give you an indication of what I mean: no source, and paste factory. If we can have a fast manage section within [[WS:PD]] rather than an indeterminate, laborious consideration, then I am fine with that. I just want a way to better cleanse. — billinghurst sDrewth 10:01, 9 October 2019 (UTC)
  • Hmm. I absolutely agree with Billinghurst that our standards should be higher than just accepting any old copydump absent an explicit policy reason that prevents us from hosting it. In fact, I think our standards should be higher in any number of ways, but that's outside the scope of this discussion.
    Factors that weigh in favour of keeping something are: scan available, index in place, source clearly provided, plausible license tagging, at least minimal formatting, and posted by a registered user (vs. an IP). Consequently, things that weigh towards not accepting a text are absence of scan and index, no clearly discernible source, no license tagging or incorrect or implausible license tags, and lack of Wiki formatting especially actual incorrect formatting (e.g. the big grey boxes visible in Turandot), and posted by an IP or a user with no or few other contribs.
    I think there should be some room to exercise judgement on these, weighing the totality of the above factors, and direct policy grounds for rejecting them (presumably a speedy criterion). Anything that has a reasonable chance of being improved to current standards should heavily favour being accepted; so, e.g., anything with a scan and Index posted by a registered user would be highly unlikely to be rejected on these grounds even if the text has multiple other problems. On the flip side, something with no formatting and lots of problems, with no or unclear source, missing or implausible license, and not posted by a registered user would be very likely to be rejected.
    For something like this we should probably have an explicit "challenge + review" type process, where the work in question is automatically restored and moved to full community discussion at WS:PD.
    I don't think WS:WWI is a particularly good policy hook to hang this on as a lot of crappy cut&paste jobs would be in scope if cleaned up. Better thus to have a speedy criterion with explicit built-in community review when challenged. If the uploader actually bothers initiating a challenge, that fact on its own weighs in the work's favour (demonstrates interest in and willingness to work within the parameters of the project) the same way being a registered user does. --Xover (talk) 11:17, 9 October 2019 (UTC)
yeah. agree with assessment. but unclear about action plan. i would suggest an assessment, triage, put into maintenance category for further work, taskflow. it is unclear to me that WWI with a deletion task is preferable to "included but ignored / not supported / worked on" -- Slowking4Rama's revenge 13:01, 9 October 2019 (UTC)
I think a reasonable standard would be: if no one can figure out the source of a text dump, we delete it. Otherwise, no verification of it will ever be possible. Kaldari (talk) 21:29, 13 October 2019 (UTC)

MediaWiki:InterWikiTransclusion.js

Is there a reason we maintain our own copy of MediaWiki:InterWikiTransclusion.js instead of just using mul:MediaWiki:InterWikiTransclusion.js directly like frWS does? Our copy is a couple of years out of date and is missing functionality for section transclusion including fixes for phab:T188202. Some pages such as Lapsus Calami (Apr 1891)/Coll. Regal. are broken because our copy is missing these updates. —Beleg Tâl (talk) 15:22, 11 October 2019 (UTC)

@Beleg Tâl: I have changed the file to pull the mulWS script, please see whether it works as expected, there may be dependencies that need to be expressed. — billinghurst sDrewth 09:00, 14 October 2019 (UTC)
Not making a difference for me. — billinghurst sDrewth 09:03, 14 October 2019 (UTC)
Now it does. That damn caching of common.js. — billinghurst sDrewth 09:35, 14 October 2019 (UTC)

"Studies in Irish history, 1649-1775" available in scanned form?

Is anyone able to see a downloadable, OCR'd copy of the work "Studies in Irish history, 1649-1775" by Murray, Alice Effie (b. 1877); Gwynn, Stephen Lucius (1864-1950); Mangan, H.; Wilson, Philip; Butler, Sir William Francis (1838-1910); et al.?

I see that it is at HathiTrust, though there is no indication of whether it is scanned.

If it is, it would be great if someone can get it into archive.org so we can take it to Commons. Thanks. — billinghurst sDrewth 03:18, 13 October 2019 (UTC)

It says full view at the bottom, and the first scan (unusually for HathiTrust) even lets you download the PDF directly. I've got a copy of the PDF, though at over 100 MB, I've got to upload it to IA and then sideload it to Commons.--Prosfilaes (talk) 03:33, 13 October 2019 (UTC)
I can't upload it to Commons, since one of the authors died 1950, and there's three others with no death date at hand. It's too big for me to upload it to Wikisource. It's at https://archive.org/details/studiesinirishhistory16491775 , if anyone knows how to get it here.--Prosfilaes (talk) 03:42, 13 October 2019 (UTC)
@Prosfilaes: Thanks, the download links must be country specific, and as I am not operating from a VPN, it knows that I am not US-based. I will pull it onto Wikisource once it has derived a new form. If it is still over 100MB, then we can get someone to upload it on the backend. — billinghurst sDrewth 03:47, 13 October 2019 (UTC)
Looking around more, this is the same scan as https://archive.org/details/studiesinirishhi02obri . The HathiTrust version has an update date of 2012, so there may be a rescan or two, but it ultimately came from the same source.--Prosfilaes (talk) 04:25, 13 October 2019 (UTC)
What?!? I don't even see it in a search result. Sheesh, sorry to put you to the effort. — billinghurst sDrewth 05:40, 13 October 2019 (UTC)
@Billinghurst: Ad not US-based: Try Hathi Download Helper, but the downloading times are long. --Jan Kameníček (talk) 19:30, 13 October 2019 (UTC)

Tech News: 2019-42

23:32, 14 October 2019 (UTC)

By the way, and unrelated to this particular "Tech News" issue, but still possibly relevant to be aware of is phab:T234576. The WMF have recently removed editToken from mw.user.tokens, and the replacement is csrfToken. This affects all Gadgets and user scripts that still use the former, and will show up as either an error message (in scripts with well-written error handling) or a silent failure when a script attempts to change a wiki page. In almost all cases the fix should be simply replacing the former token with the latter. --Xover (talk) 06:18, 15 October 2019 (UTC)
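As a rough illustration of the "replace the former token with the latter" fix described above, the following Python sketch rewrites the source text of a gadget or user script. It assumes the script reads the token via the call form `mw.user.tokens.get('editToken')`; the helper name `migrate_edit_token` is hypothetical, and scripts that access the token in some other way would need to be checked by hand.

```python
import re

# Matches mw.user.tokens.get('editToken') with either quote style.
# Assumption: the script uses this call form; other access patterns are not covered.
EDIT_TOKEN_CALL = re.compile(r"""(mw\.user\.tokens\.get\(\s*)(['"])editToken\2""")


def migrate_edit_token(source):
    """Return (rewritten_source, number_of_replacements)."""
    return EDIT_TOKEN_CALL.subn(r"\g<1>\g<2>csrfToken\g<2>", source)


if __name__ == "__main__":
    old = "api.post({ action: 'edit', token: mw.user.tokens.get('editToken') });"
    new, count = migrate_edit_token(old)
    print(count, new)
```

A replacement count of zero on a script that still fails silently would suggest the token is being obtained some other way.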

Projects from Community Wishlist — skates time

I am reliably informed that it would be beneficial for English Wikisource to think about and plan for projects from the community wishlist, either tidying them up, or extrapolating on them, and to maybe start thinking and planning very soon, or maybe right now. Picking the best and generating a case why they are good projects with excellent outcomes.

Previous years suggestions that have been classified as Wikisource-focused can be seen at

and of course some of the more general improvements can work for us as well.

What is it that we truly want that will make our editing lives better? What is it that will make our sites truly better? What is special about our sites that could and should be better to make us better? more integrated? better findable? Let us think how we can generate benefits, and what are the benefits, rather than just think features.

Building relationships within the broader Wikimedia can be advantageous, especially if we listen. — billinghurst sDrewth 09:43, 15 October 2019 (UTC)

Comments

  • Sidenotes and Layouts. We have had discussion here and never had solutions to building sidenotes from works, especially in the migration from a portrait page, to a landscape and wide computer monitor, and also onto a small mobile device. Aligned with that is the toggled layouts that were designed in prehistoric web time, and never fully functioned with sidenotes. So I would love to see if we can have some global work done on our CSS, as it is a true weakness for us. It also flows through to how we present in mobiles, and don't have good use of scripting means to present works in a known universal means, nor easy ability to check.

    So I would like one consideration for a project to be looking at the output of ProofreadPage on various devices and the compatibility of our CSS to do that well. Whether this then flows to other outputs such as EPUB and other portable document formats would be interesting. — billinghurst sDrewth 10:04, 15 October 2019 (UTC)

  • Visual Editor, user experience design as mentioned at Wikimania,[10] visual editor came to wikisource, but is not used because the menus are based on wikipedia. a UX refresh based on some user feedback would improve functionality. Slowking4Rama's revenge 13:59, 15 October 2019 (UTC)

Curly quote templates

I’m pleased to read that typographic quotes are now allowed.

One approach for editors who don't have them on their keyboards would be to create specific templates. E.g. {{sq}} could have variants like sqs (single quotes, straight) and sqc (sq, curly).

Alternately, would there be some way to have the quotes in special subpages so that something like "Name of file.djvu/dq/begin" contains the single character and have a template that transcludes that? You could then switch the quote style for an entire work with just 4 edits.

Opinions?

Pelagic (talk) 12:48, 15 October 2019 (UTC)

Would having a selection of them in the editing toolbar be sufficient? Templates for basic characters are fiddly and tend to look pretty cluttered in the code. --Xover (talk) 13:40, 15 October 2019 (UTC)
Could be a user option to move them out of the "Symbols" section of Special Characters to a more prominent location. Anyone up for writing an option to allow someone to place a customized set of special characters in the main editing toolbar? On second thought, there is already a gadget for keyboard shortcuts for accented characters. Easy to add quotes to that. Levana Taylor (talk) 15:30, 15 October 2019 (UTC)
Hmm, the gadget sounds good, so I have added it in my preferences now, but I cannot find anywhere whether there are any default shortcuts for some characters or how to add new shortcuts… Is there any documentation or help page about it? --Jan Kameníček (talk) 16:56, 15 October 2019 (UTC)
We can also resurrect {{dq}} and similar templates. —Beleg Tâl (talk) 15:47, 15 October 2019 (UTC)
Adding templates sounds as though we are encouraging their use. I wasn't aware that this was the plan of the community with the change. — billinghurst sDrewth 03:29, 16 October 2019 (UTC)
I think the ship has sailed … the idea of allowing curly quotes was greeted with enthusiasm by about 8 out of 10 commenters, so those same people will be busy converting texts, and it’s all to the good to facilitate doing so neatly and completely. Levana Taylor (talk) 04:17, 16 October 2019 (UTC)
Well, the issue here was perhaps more in regards using templates, specifically, for the quote marks. I don't think that's something we should encourage in the general case (with exceptions for special cases, including those where templates are the only reasonably efficient way for a given contributor to use curly quotes). All templates clutter the text in edit mode to some degree, and here we have a multiple-character expansion (6:1) that will occur tens of times on each page. It may even be enough to be a bona fide technical problem in a long quote-heavy work when you transclude a few hundred such pages into mainspace (cf. the issues with dotted toc lines).
For preference we should look to other methods to allow contributors to enter these characters, such as toolbar buttons (but at least one contributor does not use the 2010 wikitext editor or the 2017 wikitext editor), or OS/browser/keyboard-native input methods (but some OS and keyboards make that unreasonably hard). So we may need such templates for those outlier cases, but in that case we should label them accordingly as a stop-gap measure and possibly also run a continuous bot task to substitute them for the actual characters (on enwp they run a bot that automatically substs all instances of templates where the template is tagged as "must be subst:ed" that we could probably coopt for this purpose if needed). --Xover (talk) 04:58, 16 October 2019 (UTC)
I say no to templates. If people wish to do this less preferred means, then they can learn how to use all the means to add non-keyboard characters, or to use their User: drop down on their edittools set up. — billinghurst sDrewth 05:21, 16 October 2019 (UTC)
I also say no to templates. We have worked on removing the need for character templates (such as {{ae}}), so adding them for the various quote marks would be counter to our intentions. In terms of me being the main contributor who does not use either of the wikitext editors, don't initiate something specifically for me. I recognise that I am unusual in this regard. Currently things are working for me just fine without them. If I choose to work on a book in which curly quotes have been decided on, then I will add the characters to my User set in CharInsert and enter them that way. [Of course, works that I bring in and start working on will use straight quotes only.] Beeswaxcandle (talk) 06:05, 16 October 2019 (UTC)
Well, if you don't use them then it is likely that there may be more who choose not to use them. I'm not saying this should be a dealbreaker for any solution we contemplate—like very old web browsers, at a certain point you just can't keep supporting them—but it would behoove us to keep the issue in mind and cater for it when reasonably possible. We want to increase the pool of contributors and make the bar to entry lower, 'cause we sure ain't spoilt for folks willing to do the work! :) --Xover (talk) 06:20, 16 October 2019 (UTC)
Huh??? To whom are you directing this remark? The whole purpose of simple quotation marks is exactly for this purpose of simple editing, and why we have the style guide set as it is. The whole damn thing has been set for simplicity. In the past few years, it is the newer users who have been hyping things up trying to have exact replicas of works. The dinosaurs have long argued "KISS" and aligned with the simpler styles. — billinghurst sDrewth 11:31, 16 October 2019 (UTC)
I was responding to Beeswaxcandle's "Don't worry about me, I'll make do" by saying he's probably not the only one with that particular setup and if we can cater to it without excessive cost then we should cater to it. Recruiting new contributors is one thing, but it's equally important not to lose existing contributors (by making it harder or more frustrating for them to contribute). And in saying that I am merely stating general principles, not advocating any particular solution. But as an example (and only an example), by having a bot that automatically subst:s quote mark templates we could have our cake and eat it too: anybody that prefers it or needs it can enter them using templates, but the bot will replace them with the actual characters within minutes so that the presence of such templates in the work is only temporary. But, again, not a proposed thing we should actually do; just an example to illustrate the sort of thing that we might want to consider when the situation warrants. --Xover (talk) 14:00, 16 October 2019 (UTC)
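To make the "bot that automatically subst:s quote mark templates" example from this thread concrete, here is a minimal Python sketch of the text transformation such a bot might apply to a page's wikitext. The template names (`ldq`, `rdq`, `lsq`, `rsq`) are hypothetical placeholders, since the thread does not settle on any, and fetching and saving pages is deliberately left out.

```python
import re

# Hypothetical template names; the discussion does not define any.
QUOTE_TEMPLATES = {
    "ldq": "\u201c",  # left double quotation mark
    "rdq": "\u201d",  # right double quotation mark
    "lsq": "\u2018",  # left single quotation mark
    "rsq": "\u2019",  # right single quotation mark
}

# Matches a parameterless call such as {{ldq}} or {{ ldq }} for the names above.
TEMPLATE_CALL = re.compile(r"\{\{\s*(" + "|".join(QUOTE_TEMPLATES) + r")\s*\}\}")


def subst_quote_templates(wikitext):
    """Replace each known quote template call with its literal character."""
    return TEMPLATE_CALL.sub(lambda m: QUOTE_TEMPLATES[m.group(1)], wikitext)


if __name__ == "__main__":
    print(subst_quote_templates("{{ldq}}Hello,{{rdq}} she said."))
```

Templates not listed in the mapping are left untouched, so a bot built along these lines would only ever remove the stop-gap quote templates.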

Translations and source tabs

Why do translations not have a "Source" tab? For example: Translation:Sleeping Beauty, which is transcluded from Index:La bella durmiente del bosque.djvu. Kaldari (talk) 17:31, 15 October 2019 (UTC)

Hmm. Good question. Perhaps tpt knows? --Xover (talk) 17:43, 15 October 2019 (UTC)
It's a known issue that has been outstanding since 2013 (i.e. when the Translation namespace was created), see phab:T53980 —Beleg Tâl (talk) 17:46, 15 October 2019 (UTC)
@Kaldari: because the Translation: ns. was our add-on, and it probably never got coded in. Presumably there needs to be some connection to the <page> building so that it knows to add the /proofreadpage_source\ tab to the Translation: ns. when it transcludes pages.
I am not a coder so don't expect me to make it. — billinghurst sDrewth 05:40, 21 October 2019 (UTC)

Duplicated proofreading status with Visual Editor

Hi! When creating a page, without any other change (just saving the content as is), if using the Visual Editor then the proofreading status is created twice (e.g. this page). Any idea? Is it a known bug, perhaps? It happens in other Wikisources as well. Thanks! -Aleator (talk) 19:03, 15 October 2019 (UTC)

Found: phab:T202200 —Beleg Tâl (talk) 20:10, 15 October 2019 (UTC)
Not being solved for 1 year and 2 months… How typical. --Jan Kameníček (talk) 21:40, 15 October 2019 (UTC)
yes, it is apparently having a status for header and status for body; status not easy to update in VE. could add to wishlist. Slowking4Rama's revenge 00:21, 16 October 2019 (UTC)
It would seem that it is still physically adding text
<pagequality level="1" user="" />
to the header, whereas the text is no longer actually added any more as it was melded into the page content model. I would have thought that it could have been a fairly easy fix. I would suggest that you bang on about the problem on the phabricator ticket, and ping some of the VE developers.. — billinghurst sDrewth 05:19, 16 October 2019 (UTC)
The relevant code is somewhere in gerrit:/plugins/gitiles/mediawiki/extensions/ProofreadPage/+/master/modules/ve/pageTarget/ve.init.mw.ProofreadPagePageTarget.js; search for "quality" to see where it adds the tag to the header, and presumably we also want to see where it sits in the new content model. The version in the header presumably needs to go, and we need to ensure that VE changes the content in the other when it edits. — billinghurst sDrewth 05:46, 21 October 2019 (UTC)
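For anyone wanting to check pages for this symptom, a small Python sketch follows. It counts `<pagequality ... />` tags in a page's raw text, on the assumption (taken from the comment above) that an affected page carries the tag more than once; the function names and the sample strings are illustrative only, and obtaining the raw text is left to the reader.

```python
import re

# Matches a self-closing pagequality tag such as <pagequality level="1" user="" />.
PAGEQUALITY_TAG = re.compile(r"<pagequality\b[^>]*/>")


def count_pagequality_tags(raw_page_text):
    """Return how many pagequality tags appear in the raw page text."""
    return len(PAGEQUALITY_TAG.findall(raw_page_text))


def has_duplicate_status(raw_page_text):
    """True when the status tag appears more than once (assumed symptom of the bug)."""
    return count_pagequality_tags(raw_page_text) > 1


if __name__ == "__main__":
    # Illustrative sample, not copied from a real page.
    sample = '<pagequality level="1" user="" /><pagequality level="1" user="" />Body text'
    print(has_duplicate_status(sample))
```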

Feedback wanted on Desktop Improvements project

07:15, 16 October 2019 (UTC)

IA-Upload tool is down

@Samwilson: @Tpt: FYI —Beleg Tâl (talk) 14:55, 16 October 2019 (UTC)

Given the timing I'm going to go ahead and guess that this is related to the ongoing reboot of the cloud hosts on which the toolserver/toolforge runs. There was one batch last Wednesday, one batch today, and there will be another one next Wednesday. From the error message it looks like it might just be that the service needs to be started again. --Xover (talk) 15:38, 16 October 2019 (UTC)
@Xover, @Tpt: I've restarted the web service and it seems to be fine again now. —Sam Wilson 00:02, 17 October 2019 (UTC)
It was down for 14 hours, 45 minutes and 58 seconds. Sam Wilson 00:17, 17 October 2019 (UTC)

Fully validated indexes that aren't transcluded

Here are the 36 fully validated indexes that are not properly transcluded into main space (based on bot data):

Some of these have no transcluded content at all, some of them are partially transcluded, and some of them have more fundamental problems like incomplete indexes. If anyone wants to work on them, go for it! Kaldari (talk) 18:23, 16 October 2019 (UTC)

Here's a PetScan link for future reference: https://petscan.wmflabs.org/?psid=12253885 (also dibs on Townshend) —Beleg Tâl (talk) 18:43, 16 October 2019 (UTC)
  •   Comment Which is why I generated the petscan queries ages ago, see User:Billinghurst#Petscan_queries (note that some need to be retweaked, as Magnus's petscan v.2 made his fields case sensitive for the initial character (-> category and templates), and I had hoped that he would fix that issue) and why I generally get around to transcluding these at the end of each month, though Southern Historical Society Papers is a massive job, and it is taking me forever to work through Index:Dictionary of Indian Biography.djvu. — billinghurst sDrewth 21:53, 16 October 2019 (UTC)
Index:Dictionary of Indian Biography.djvu can be mostly automated; if you are interested, feel free to ping me. Mpaa (talk) 21:51, 17 October 2019 (UTC)

Thank you both for your efforts in making this info more accessible. Much appreciated. -Pete (talk) 21:57, 16 October 2019 (UTC)

  • As I have noted on my talk page, 'Index:Adelaide Contents.pdf' was transcluded 17.5.2017. 'Index:Felicia Hemans in The Christmas Box, 1829.pdf' and 'Index:Felicia Hemans in The New Monthly Magazine Volume 34 1832.pdf' have been done today. Esme Shepherd (talk) 20:35, 19 October 2019 (UTC)