Scriptorium
The Scriptorium is Wikisource's community discussion page. Feel free to ask questions or leave comments. You may join any current discussion or start a new one; please see Wikisource:Scriptorium/Help. Project members can often be found in the #wikisource IRC channel webclient. For discussion related to the entire project (not just the English chapter), please discuss at the multilingual Wikisource. There are currently 359 active users here.

Announcements

Proposals

New speedy deletion criterion for person-based categories

Following on from a discussion at WS:PD#Speedy deletion of author based categories.

It is long established and in the main uncontroversial that English Wikisource does not use person-based categories (of the type "Works by John Smith", "Poetry by John Smith", etc.). Some previous discussions can be found at: 1, 2, and 3 (and the two following threads). However, absent a speedy deletion criterion specifically for these, admins have to rely on the provision for precedent-based deletions. In practice this means such categories must be brought to WS:PD to be rubber stamped, wait at least two weeks (because inertia and habit), and then hopefully someone will remember to process them. Eventually.

I therefore propose that we extend the deletion policy with a new G8 criterion as follows:

  • Person-based categories—Categories where the defining characteristic is person-based. This includes, but is not limited to, author-based categories like "Works by author name".

All deletions (modulo CU type concerns) are subject to community challenge in any case, and are clearly visible in the deletion log, so there is no particular benefit to the bureaucracy where there exists no significant uncertainty or controversy. --Xover (talk) 14:32, 15 July 2019 (UTC)

  Support, but I'd note that there is an exception discussed in link #2: namely, American presidential documents categorized by president. This is due to the fact that the administration of the executive branch is tied to who is the president at the time. There was no consensus as to the scope of this exception: what kinds of presidential documents it applies to, or whether other governments may have the same treatment, etc. —Beleg Tâl (talk) 14:42, 15 July 2019 (UTC)
  Oppose 2 weeks is not too long to wait. organization of subject of a work is useful, a migration to a stable ontology is necessary. Slowking4Rama's revenge 13:58, 30 July 2019 (UTC)
2 weeks is definitely too long to wait when a full bureaucratic procedure with a foregone conclusion could be replaced with a simple administrative action. —Beleg Tâl (talk) 14:32, 30 July 2019 (UTC)
Also it is worth pointing out that this proposal is not regarding whether such categories should be kept or deleted (since we have already established that they should be deleted), but only whether they should be posted to WS:PD before we delete them. —Beleg Tâl (talk) 18:51, 30 July 2019 (UTC)
And that strictly speaking, under current policy, they can be deleted a few days after a notice has been posted to WS:PD (no two week wait required, just that the discussion must have "started"). It's just that habit and inertia inevitably means that almost all cases will in practice suffer this 2+ week purely bureaucratic delay. I'm a big believer in process and the value of bureaucracy when properly deployed, but even I think this one is a pointless waste of volunteer time. We have issues that require actual discussion or other action that have sat open on the noticeboards for a year and a half; we should not waste those resources on filling out forms in triplicate for issues that are not controversial. Any deletion can be reviewed and overturned, if needed, by the community; let's save the cautious multiple-safeguards approach for stuff that might actually need it. --Xover (talk) 19:11, 30 July 2019 (UTC)
I always wait until there has been a full month of inactivity, since there are many editors who only edit occasionally, but that's just me. —Beleg Tâl (talk) 19:17, 30 July 2019 (UTC)
  Support --EncycloPetey (talk) 17:40, 30 July 2019 (UTC)
  Support --Jan Kameníček (talk) 19:38, 30 July 2019 (UTC)
  Support though if possible I'd like to see the exception Beleg Tâl specified firmed up a bit, i.e. perhaps a general exception for things like governments, ministries, and reigns which are "person-based" but serve an obviously different function to categories-by-author (noting on the UK side things like Category:Acts of the Parliament of Great Britain passed under George III). —Nizolan (talk) 00:44, 1 August 2019 (UTC)
  • Note Based on the discussion above I have added the above criterion with an additional limitation to exempt things like UK governments tied to a monarch's regnal period or the administrations of US presidents. I read the above as general support for this criterion—sufficient for adding it—but with some remaining uncertainty about the optimum phrasing. I'll therefore leave this discussion open for a while longer so that interested parties may object or suggest better wording. I'll also add that minor changes to the wording (that do not change the meaning) can easily be made later with a proposal at the policy talk page. And we can always bring bigger changes up here for reevaluation if it causes problems. --Xover (talk) 19:32, 11 August 2019 (UTC)

Deletion review

I long ago (2005) gathered together historical documents related to the life of Indigenous Australian warrior Yagan in Category:Yagan. This has always seemed to me a reasonable category, but it just got speedily deleted without so much as a how-d'-y'-do.

The examples given in this proposal were of the form "Works by John Smith", "Poetry by John Smith", etc. No other examples were given in the discussion. So I'm not sure if the community really intends that categories like this would be deleted. Can we review this please?

Hesperian 23:48, 2 September 2019 (UTC)

Hmm. I'm not going to express an opinion on "should" / "should not" for this, but I will note that based on my understanding of the discussions this would indeed be the intended effect. The defining characteristic of the category is that its members relate somehow to a specific person, and for such the consensus appeared to be that portals were better suited. But perhaps there is a distinction between Category:Yagan and Category:John Smith that I am not seeing? Or is it the specificity: Category:Foo by Person is bad, but Category:Person is acceptable? --Xover (talk) 03:58, 3 September 2019 (UTC)


As things stand:

  • I can gather together documents about the Battle of Borodino in Category:Battle of Borodino, because that's an event.
  • I can gather together documents about Fort Knox in Category:Fort Knox, because that's a place.
  • I can gather together documents about scissors in Category:Scissors, because they are objects.
  • I can gather together documents about intelligence in Category:Intelligence, because that's an abstract concept.
  • But I can't gather together documents about Yagan in Category:Yagan, because he was a person.

Can no-one see how bizarrely arbitrary this is??

And it hasn't even really been discussed, since the only examples given above are "Works by" categories, the deletion of which makes perfect sense. Hesperian 11:50, 3 September 2019 (UTC)

Fully agree with Hesperian, the speedy deletion is a misinterpretation of the guidance. The "category:works of ..." is to ensure that works of authors are added to author pages, and not categorised. There is no determination that it would relate to anything else. Categorisation has always existed for people, again our biggest issue is how to separate author categorisation from subject categorisation. — billinghurst sDrewth 12:39, 3 September 2019 (UTC)
Read the policy, it does not say "works by …", it says "person-based". —Beleg Tâl (talk) 12:56, 3 September 2019 (UTC)
Per our deletion policy (as updated according to the consensus in the above discussion), "Person-based categories" are now a criterion for speedy deletion. This "includes, but is not limited to, author-based categories", but "the defining characteristic is person-based". This was very explicit in the above proposal. My deletion of Category:Yagan was therefore 100% within our deletion policy. You can propose a reversion to the older version of the deletion policy, and a restoration of Category:Yagan (even though it is entirely redundant of Portal:Yagan), but I will have no part in it. —Beleg Tâl (talk) 12:53, 3 September 2019 (UTC)
Also: as things stood before the above discussion, I could gather together documents about Yagan in Category:Yagan, but couldn't gather together documents about Yazid III in Category:Yazid III, which is just as bizarrely arbitrary. —Beleg Tâl (talk) 13:03, 3 September 2019 (UTC)
(ec) It is my opinion that it is not a positive change. 0-100 in four seconds. I find the statement It is long established and in the main uncontroversial that English Wikisource does not use person-based categories to not be the case, especially as it has been the case since 2005. Something that was entirely in scope and I believe would have been kept in a PD, is now going to a speedy deletion and deleted without conversation. I find that inappropriate, and for that to have been implemented in four weeks is an example of poor implementation and poor policy. I am wondering where this community is going, and the lack of vision that this represents. — billinghurst sDrewth 13:14, 3 September 2019 (UTC)
It may also have simply flown under the radar. It is also just one category affected, and a completely redundant one at that (equally redundant to any Author-based categories). And the proposal to update the policy was done entirely by the books, and is a significant benefit to the community. —Beleg Tâl (talk) 13:30, 3 September 2019 (UTC)
And it has been long established and in the main uncontroversial that English Wikisource does not use categories for individuals who have pages in Author space; the fact that there existed one or two categories for an individual in Portal space is (to me) a minor detail and I would have also considered it long established and uncontroversial that these were also unwelcome. —Beleg Tâl (talk) 13:33, 3 September 2019 (UTC)


Of most concern to me in this new G8 is, what if Portal:Yagan did not exist? In that case, Category:Yagan would be the only way in which we had organised our material by topic, yet it would still be summarily deletable under this new G8.

I think a more coherent policy position might be:

We don't want to organise our material by both Author/Portal and Category. So it is fine to create a category for a topic if there is no corresponding Author/Portal page. But be aware that this is a stopgap -- once someone has created the Author/Portal page, the category may be deleted.

Note that this doesn't distinguish people from other topics. Category:Yagan is fine, but only until Portal:Yagan has been created. Even Category:Works by John Doe is fine, but only until Author:John Doe has been created.

I think the biggest problem with this position is the really big topics that would be better handled by a category than by an Author/Portal page e.g. War. In that case, I would say keep the category and ditch the portal, which would be unmaintainable. In a speedy criterion there would certainly need to be something to prevent deletion of categories that contained subcategories or a collection of portal/author pages.

Thoughts? Hesperian 22:50, 3 September 2019 (UTC)


Since the attitude to concerns raised here has been "I will have no part in it" followed by non-participation in the discussion, I have boldly replaced "person-based" with "author-based". I accept the new G8 was proposed, discussed and implemented in good faith, but subsequent objections have made it clear that there is no consensus for speedy deletion in the gap between "person-based" and "author-based".

To be clear: we may not agree on whether Category:Yagan should have been deleted, but I think we can all agree that the deletion was contentious, and speedy delete criteria are intended to capture non-contentious matters.

Hesperian 07:53, 6 September 2019 (UTC)

@Hesperian: I'm not going to revert that because I think at least temporarily going back to the status quo is prudent when a concern has been raised so soon after implementation. But I do object in principle to your approach here: whatever the problems with the new G8, it was properly discussed, consensus determined, and implemented. For you to unilaterally reverse it is not a good practice, no matter the merits of your concerns with it. The proper description of the thread above is, strictly speaking, not "absence of consensus" but rather "complaints after the fact" (possibly good, proper, and meritorious complaints, but still after the fact). So I am going to insist that this removal of the new criterion is a temporary measure while discussion is ongoing, and not the new status quo. If no new consensus is reached here then we revert back to what was previously decided. (To be clear, if you had suggested we should temporarily revert I would have supported that. It is your acting unilaterally with an apparent intent to change the status quo I object to.)
That being said I am absolutely open to being convinced of anything from the new criterion needing to be tweaked and to it needing to be dropped altogether. The reason I am not currently actively discussing is that I do not feel I sufficiently grasp the issue and am mulling it over. Your distinction between "person-based" and "author-based" has not been apparent to me prior to your latest comment, and I now suspect that that distinction is the crux of your objection; but I still do not grasp why you do not feel a portal would be sufficient. On the other hand, reasonably curated categories are cheap, and can conceivably be automatically applied to works included in a portal.
I also suspect, though I may of course be entirely mistaken, that what we are discussing here is not actually a speedy criterion, but rather a more fundamental issue of category and portal policy. I am not convinced the speedy criterion is a useful proxy for that debate, on the one hand, and that the former will resolve itself neatly if the latter is settled, on the other. --Xover (talk) 08:30, 6 September 2019 (UTC)
@Hesperian: "I will have no part in it" is me, not the community. I agree with Xover that it is necessary to establish a new consensus with the community to make a subsequent update to the deletion policy (in which discussion I will remain neutral). And like I said to TE(æ)A,ea.: three days is not remotely sufficient for closing a discussion. Be patient. —Beleg Tâl (talk) 12:20, 6 September 2019 (UTC)
  •   Comment There is definitely a long-established practice that we collect and curate works that relate to authors, and due to our strong preference to curate, we determined to not categorise, which would have a duplication and a confusion. It has not been the case for individuals who were not authors, and it should not be a requirement that we have to curate such pages, especially where a person may be mentioned on a page(s) though not be the focus of the pages. For instance, the page The Perth Gazette and Western Australian Journal/Volume 1/Number 28 would be considered for categorisation in "Category:Yagan" though would not particularly be the focus of a page and put onto a Portal: ns page. I would definitely not expect someone to have to make edits to a portal page to that target, though I would have no qualms with someone categorising. Where we have authors, we have wikilink'd back to author pages for that relevance. So it is my belief that these non-author categories should not be speedied, if there is a case for their deletion, then bring it to the community. I also believe that a proposer should be listing consequences of their suggested policy changes, not leaving it to the community. I find the above consensus to be a troubling "yes ... tick and flick" exercise by the community without an in-depth exploration of the consequences, approving a change to speedy deletion should be items that are completely non-controversial.

    The above deletion discussion started with the scope of a PD discussion about author categories, and then specifically addressed two author related categories. No examples were given of non-author categories that would have been wrapped up in the change of our guidance, nor that we were going to now speedy delete categories that have been existing for greater than 10 years. I have a strong belief that anything that has existed for over 10 years onsite should not be speedied, and that speedy deletions are only best applied to recent additions.

    Xover: You suggested the policy change, then summarily closed less than four weeks later, and implemented. May I suggest that is not the ideal practice either, as this is a change of policy where all person categories are deleted, not as indicated in the discussion that it was an existing process and the speedy being the only change. We are not a huge community, we don't have the same editing rates, or the diversity of eyes to analyse such situations, and that is traditionally why we have left discussions open for extended periods. — billinghurst sDrewth 10:55, 7 September 2019 (UTC)

    @Billinghurst: "Too quickly closed" is a fair complaint, although I don't entirely agree with that assessment. I agree there should be plenty of time for the community to ponder, scrutinise, discuss, and decide; and in fact was somewhat disappointed that the proposal did not garner wider participation and more discussion. I agree speedy criteria should have a firm basis, which broad participation in the proposal is the best way to ensure (and document!). But I also observe that community participation in such discussions is distressingly low in general, and by that yardstick the above was about the most I felt one could realistically hope for. When no further comments either way surfaced—not even any "Unsure" or "Wait, I need to think a bit more"—I felt that was sufficient to implement. If we want to have much longer timeframes to tease out every possible community comment then we should have specific guidance to that effect (and I do mean a specific number of weeks).
    I agree that speedy should be for uncontroversial things, but then my understanding was that this was uncontroversial. My intent in making the proposal was not to change practice regarding use of categories vs. portals, but rather to eliminate a pointless two-week wait and bureaucratic box-ticking for something that was a priori determined would be deleted. I do however disagree that speedy should not be applicable to, for example, decade old clear copyvio. The purpose of speedy deletions is to reduce bureaucracy and make maintenance more efficient—where possible—and to reduce the demands on the community's time and attention in formal discussions. Because, as you point out, such participation is perhaps our scarcest resource! The age of the material affected is entirely orthogonal to whether it falls within one of the speedy deletion criteria.
    "Uncontroversial" is a better distinction, but even there some nuance is needed. The policy that leads to the deletion (by whatever process) must be unambiguously decided: it must be uncontroversial that that was what the community decided. The issue itself, though, can still be plenty controversial: there are some contributors who would never see anything deleted, for any reason, and express their frustration with copyright law and our copyright policy in every copyright discussion they participate in (nevermind proposed deletions). That someone disagrees with the community's decision, once made, is not a valid reason for considering the implementation of that decision controversial.
    On the issue at hand, though, I (am starting to) see the person/author distinction, but I am having trouble understanding how a portal is any less suited for a person than for an author. To my mind the very same arguments for portal over category for authors apply equally to persons. Why wouldn't The Perth Gazette and Western Australian Journal/Volume 1/Number 28 go in the portal? Or is it the perceived relative amount of effort in curating the two approaches? Hesperian's more coherent policy position seems to suggest that that is the case.
    I don't think starting with a category but deleting it if a portal is created is a particularly rational approach, but as a proposal it does speak directly to the relationship between categories and portals. To me, the opposite end of the spectrum (that you also address) seems more elucidating: once a topic is sufficiently large, a portal becomes an awkward way to organise the information. In those cases I could see an argument for using both; the category for everything and the portal for the highlights. But that's an argument that will be relevant only rarely (relatively speaking) and only in the reverse order (only once the portal is "full" does the category come into play). Most person-related topics will not have too many relevant works for a portal.
    Or perhaps a different angle of attack would aid common understanding: Categories, Portals, and Author-pages overlap in various ways and in different degrees, and so we should establish some coherent guidance on the purpose of each, what to use each for, and how to distinguish between them in difficult cases. Perhaps in discussing what that guidance should be we would better understand the various perspectives than through the proxy of a speedy criterion? For example, do we want a portal about a person as a historical figure if that person is also an author? Is an Author: page and a Portal: the same thing except for inclusion criteria? Do the same layout rules and restrictions apply to both? --Xover (talk) 03:19, 9 September 2019 (UTC)
i am sad that admins persist in summarily deleting, for contentious issues that require a consensus. we need a standard of elevating issues on chat before deletion. and a standard of practice of how to organize ontologies of "subject of" and "depicts". i don’t care how- portals, categories, subsection, anything that can be linked from wikidata. but we need an organizational consensus, not deletion. Slowking4Rama's revenge 03:43, 13 September 2019 (UTC)
@Slowking4: But, but, but, but you do not understand the sysop perspective. They delete without consequence (for themselves, as from a sysop's perspective a deleted page may be view/restored and viewed without going through with restore. See? No consequence!) As for for the plebs, tough! Them's oughta put in an application to be tiara'd like good little princesses… 114.78.171.144 06:09, 13 September 2019 (UTC)
114.78: I realise you're taking the piss here, but I actually agree that this is an important difference in perspective to take into account. One thing is that the consequences of deletion can in some (but not all!) cases appear smaller to those with the technical ability to view and restore deleted pages, but the perspective is also shifted when you have long backlogs of tasks that either can only be resolved (in practice) by deletion or where deletion is a fairly foregone conclusion. To have to conduct a formal analysis, formulate it cogently, and run a community discussion is a lot of effort. The relatively low community participation in those discussions means they have a tendency to deadlock, and if resolved are too local to support any kind of future precedent. When a lot of your tasks are dealing with that dynamic, you will naturally tend to develop a bias (big or small) toward more efficient resolutions like having speedy criteria for whatever the issue at hand is.
But when you spend a lot of time going through the maintenance backlogs you also gain the very real experience that tells you that a lot of stuff has been dumped here with no followup, attempts to format properly, or even giving minimal source or copyright information. There is literally no hope of these works being brought up to standard as they are, and would in any case be easier to recreate from scratch than fix in place, even if they aren't blatant copyright violations. While we certainly need to watch for and not get fooled by the previously mentioned bias, we also should let ourselves be guided by this experience. Sometimes the perspective of those who work the maintenance backlogs (which is not by any means limited to just admins!) gives them a better foundation for reasoning about an issue than those who work primarily on their own transcriptions (and sometimes not). --Xover (talk) 07:25, 13 September 2019 (UTC)
your "guided by experience" does not address the power dynamics of a summary standard of practice. when you undertake an action. no matter how reasonable or justified you may feel, while the community is feeling ill-used, then you might want to rethink your action, if you would presume to lead a community. we have a lot of ban-able admins. Slowking4Rama's revenge 11:44, 13 September 2019 (UTC)
@Xover, @Slowking4: My sincere apologies if my comment came across solely as micturient. When young fresh meat front up to gain the authority bit it is entirely reasonable they not realise they are actually signing up for a melange of teacher, executioner, judge and neat-freak. What is less excusable is that some of them never even learn of the damage they do to the parallel roles whilst obsessing over the matter of the moment. Ordinary users are watchers and judgers too and may take away quite unexpected conclusions from administrator actions. Looked at another way the spread of intelligence is (sadly) unrelated to the authority role granted. That there never seems to be a shortage of potential idiot actions does not mean it is a good idea to go down each and every rabbit-hole.
On the other hand the occasional well-reasoned explanation might even result in the next applicant putting their hand up and taking some pressure off the backlog slaves. If that flags me as both bitter and optimistic then just handle it. I have to. 114.78.171.144 22:06, 13 September 2019 (UTC)
@Slowking4: I have suggested above that the ontological discussion might be a better way to approach this issue than the speedy criterion. What are the ontological categories we need to handle, and what tool or structure of those we have available to us would be best to handle each? If we can figure out some guidance on that then what should be kept and what should be deleted will, hopefully, follow naturally. Perhaps you could flesh out your thoughts regarding "subject of" and "depicts" with that in mind? --Xover (talk) 07:25, 13 September 2019 (UTC)
we would need to group together all those works, which people seem to use categories . we have categories on authors, we could start with a wikidata infobox at author pages. if the community wants portals for subjects, then we will need a infobox and migration from categories to portals. (this is different from how it is done on commons) you could then link on wikidata, and have some query function to aid search, we need some wayfinding to aid search of topics. Slowking4Rama's revenge 11:53, 13 September 2019 (UTC)

RFC: Allow curly quotes under some conditions

Per the discussion above, I would like to propose that the following change be made to Wikisource:Style guide: Replace...
Use typewriter quotation marks ("straight," not “curly”).
...with...
Curly quotation marks are permitted only if they are used in the original work and are used consistently throughout the transcription. Otherwise straight quotation marks are recommended.
Please express whether you support or oppose this change. Kaldari (talk) 15:39, 14 August 2019 (UTC)

Support. Two suggested tweaks to the language:
Instead of the word "ensure" (which occurs twice), I suggest "unless one or more Wikisource users are committed to ensuring." It's impossible to ensure anything on a public wiki; and I can foresee unresolvable arguments about what it means to "ensure" this in any given case. But, the stated intention of a single wiki user can be a powerful thing, and is possible to define more clearly.
I find the parenthetical section slightly confusing. I think it means that a large number of contributors would make it harder to ensure consistency; but that isn't entirely clear, and that's not necessarily true. So instead, perhaps, "(e.g., because many contributors, without a clear or enforceable agreement on style conventions, are likely to contribute to this particular work.)" -Pete (talk) 17:09, 14 August 2019 (UTC)
@Peteforsyth: I've tweaked the wording to address your concerns. Kaldari (talk) 00:55, 15 August 2019 (UTC)
Thanks, that's an elegant solution. -Pete (talk) 01:04, 15 August 2019 (UTC)
Oppose. 1) Curly quotes should be allowed regardless of the style of quotes in the source scan, just like straight quotes. 2) I cannot tell if your wording intends to cover other systems of punctuation such as „lower quotation marks“ or «guillemets», which should never be replaced with upper quotation marks of either style. —Beleg Tâl (talk) 17:27, 14 August 2019 (UTC)
@Beleg Tâl: I've tweaked the wording to address your concerns. Kaldari (talk) 00:53, 15 August 2019 (UTC)
I agree with what I think Beleg Tâl is saying...if we're going to alter the policy, it should permit using other kinds of quotes (at least, if they are what the original uses). In fact, I think if several contributors agree that guillemets are the appropriate choice in a specific work, even if they weren't used in the original, there might be good reason for it, and it shouldn't be expressly disallowed. -Pete (talk) 01:04, 15 August 2019 (UTC)
@Kaldari: I appreciate the update. My concern #1 still applies so I am not comfortable supporting the proposal as it stands. —Beleg Tâl (talk) 13:24, 15 August 2019 (UTC)
Oppose. I do want curly quotes allowed but don't favor the proposed version of the proposal. I agree with both Beleg Tâl’s comments. I think that the only restriction should be something like, if a work already wholly or partially proofread uses straight quotes throughout, a user should not introduce curly quotes unless they're committed to changing the whole thing to curly. Levana Taylor (talk) 18:26, 14 August 2019 (UTC)
@Levana Taylor: I've tweaked the wording to address your concerns. Kaldari (talk) 00:53, 15 August 2019 (UTC)
I don’t think "all contributors to the transcription agree to use them consistently" is a practically workable condition to impose. What if (as is extremely likely) some early contributors can no longer be contacted? Someone who comes in later ought to be able to make a global change as long as they change the whole thing. Levana Taylor (talk) 02:22, 15 August 2019 (UTC)
@Levana Taylor: I've tweaked the wording again. Hope that sounds better. Kaldari (talk) 02:43, 15 August 2019 (UTC)
Yes! This is more like it. I can support this simplified version, which leaves it open whether consistency is to be achieved by consensus (e.g. on projects that have a discussion page), or (in smaller works) one person going through the whole thing. And I don't mind restricting use of curly quotes to works that use them in the original. Use of straight quotes is something of a special case -- might be found in modern documents (e.g. government work being added here), and it makes sense to keep that style, I guess. We still have to mention guillemets and German „goose feet“ … maybe the wording should be re-organized, more or less thus: straight quotes, guillemets, etc. should be kept as in the original. If the original has curly quotes, these may have straight quotes substituted for them, or curly quotes may be used; the latter should only be done if they are used consistently throughout the transcription. Levana Taylor (talk) 03:51, 15 August 2019 (UTC)
Comment. The previous voting discussion has shown that there are supporters of the change in favour of other than only straight quotes, but they prefer various solutions. For this reason it is probably not a good idea to pick one of them and vote simply for or against, it would be better to vote about all of them and choose the one with the biggest support. (BTW, the chaos accompanying this process, when somebody considered the discussion to be voting, while others were waiting for the voting to start, is a result of missing instructions similar to Wiktionary:Voting policy and Template:vote on hold.) --Jan Kameníček (talk) 18:57, 14 August 2019 (UTC)

Pinging other folks involved in the original discussion: @Prosfilaes, @EncycloPetey, @Billinghurst, @Xover: @Nizolan, @TE(æ)A,ea., @Koavf, @Beeswaxcandle:. Kaldari (talk) 10:03, 15 August 2019 (UTC)

Support. I have made two changes to the style guide in favour of this; the first, a more general approach favouring standardisation, and the second, a more specific approach similar to the desires of Jan Kameníček and Levana Taylor. TE(æ)A,ea. (talk) 11:51, 15 August 2019 (UTC).
Sorry to contribute to the nit-picking but though I'd like to support this I'm not sure on the current wording because there's a disproportion between the two parts: if curly quotes are only permitted under certain conditions then why are straight quotes merely recommended otherwise? What's the alternative? Also agree with BT above that exceptions for guillemets and other forms should be specified. —Nizolan (talk) 12:49, 15 August 2019 (UTC)
Oppose. We're having a vote where the text is being tweaked throughout the vote after some people have voted. We need to vote on a set text, not be adjusting it throughout the vote. You don't change the candidates or platforms once the vote has begun. --EncycloPetey (talk) 15:18, 15 August 2019 (UTC)
If the tweaking doesn't go on for too long, it's not very hard to check in with the early voters and find out whether/how it impacts their votes. I stated above that the tweaks were to my liking, other early voters could always comment/clarify as well. But I think we're in agreement that it's best to at least limit/minimize changes in order to have a coherent vote. -Pete (talk) 16:49, 15 August 2019 (UTC)
Weak Support, as it is better than forcing everybody to use only straight quotes, but I am not very happy with the specific expression “curly quotes” instead of "the same kind of quotation marks as the work presents". There are several kinds of "curly" quotes and I believe they should not be used interchangeably. If a work uses “” the contributors should not use ’ ’ or „“, although they are all curly. --Jan Kameníček (talk) 22:18, 15 August 2019 (UTC)
The current MOS guideline to use straight quotes only does not imply that users can use ' ' in place of "; neither would this updated guideline imply that users can use ’ ’ in place of “”. Curly quotes in this context means specifically “this” as opposed to "this", and ‘this’ as opposed to 'this'. It would still be wrong to use “this” in place of ‘this’, or in place of «this», or in place of literally anything except "this". —Beleg Tâl (talk) 23:50, 15 August 2019 (UTC)

Alt proposal: Allow curly quotes under any conditions

I propose that, rather than the above change to WS:MOS, we instead change

  • Use typewriter quotation marks ("straight," not “curly”).

to the following:

  • Use a consistent style of quotation marks ("straight" or “curly”) within a given work. It is recommended to use "straight" quotes in works where there are a large number of contributing editors, since consistent use of “curly” quotes may be difficult to achieve.

Beleg Tâl (talk) 19:48, 15 August 2019 (UTC)

  Support Beleg Tâl (talk) 19:48, 15 August 2019 (UTC)
  Support, simple is good. (@Beleg Tâl: there's a typo at the end, "consistant" -> "-ent" fixed) Nizolan (talk) 21:43, 15 August 2019 (UTC)
Support. This is generally similar to my second change to the style guide. TE(æ)A,ea. (talk) 21:47, 15 August 2019 (UTC).
I used your second change as a basis for wording my proposal. —Beleg Tâl (talk) 23:51, 15 August 2019 (UTC)
Oppose. I do not agree with allowing to use curly quotes even in cases when the original works use straight quotes. --Jan Kameníček (talk) 22:00, 15 August 2019 (UTC)
And yet you agree with allowing to use straight quotes even in cases when the original works use curly quotes. I think that if we allow straight quotes in place of curly, but do not allow curly in place of straight, then we may as well continue to disallow curly altogether. —Beleg Tâl (talk) 23:37, 15 August 2019 (UTC)
In my opinion, disallowing the use of curly quotes when a scan uses straight quotes is similar to: disallowing the use of 'a' when a scan uses 'ɑ'; disallowing the use of 'g' when a scan uses 'g'; disallowing the use of '$' when a scan uses ' '; &c. —Beleg Tâl (talk) 00:03, 16 August 2019 (UTC)
What if a scan uses German-style lower-level quotation marks, but those quotation marks are straight? There is no straight lower-level quotation mark to replace it with, and you would not allow the replacement of the straight quotation marks with „curly ones“. —Beleg Tâl (talk) 00:04, 16 August 2019 (UTC)
Hmm, I have a hard time imagining a case where this hypothetical problem becomes an actual problem. In most cases, there is no practical problem with one kind of quote...if it's an academic essay, for instance, I really don't see how a reader is done a disservice by encountering “curly quotes” where they expect "straight ones." In a few cases, like poetry, it might actually be significant. In those cases, I have more trust in the good judgment of my fellow Wikisourcers to find the proper solution, than I have in any policy. If a poem had straight quotes, and its appearance would be substantially altered by using curly quotes, it's hard for me to imagine a Wikisource editor who appreciates the poem using the policy to justify changing them to curly quotes. Your objection, Jan, seems to me rooted in worry about something that's very unlikely to happen. -Pete (talk) 00:52, 16 August 2019 (UTC)
Support. Nice, this is very similar to the original proposal, but slightly clearer, and uses more straightforward language. -Pete (talk) 00:52, 16 August 2019 (UTC)
  Support The more I work with epubs the more I want our exported books to look as nice as possible. —Sam Wilson 06:55, 16 August 2019 (UTC)
  Support Giving Wikisource editors some flexibility seems like a good thing to me. Kaldari (talk) 20:50, 16 August 2019 (UTC)
  Support I really like this. Cuts the Gordian knot, allows contributors to use their judgment; and if it doesn't address all the ins and outs and special cases, well, what wording could? Levana Taylor (talk) 23:13, 16 August 2019 (UTC)
  Support I'd like to have the option of using curly quotes, it makes the books look so nice. Prtksxna (talk) 05:22, 21 August 2019 (UTC)
Neutral While this codifies what we already have in place de facto, I have seen nothing that addresses the basic issue that some of us are unable to type these so-called curly quotes. So, how will consistency in works be ensured? If this goes ahead it is essential that on each work (on the Index Talk: page), a formatting note is provided indicating which mode of textual quote marks is being used AND that all contributors to the work actually read the note prior to contributing. Beeswaxcandle (talk) 23:32, 24 August 2019 (UTC)
I expect it will be the same as any other issue. We have consistency problems in PotM, and even cases where a second editor makes spot changes to format. We cannot control these things except to note that they have happened and chastise a person who has done so. The point of this vote, however, is to provide a standard against which such calls may be made. --EncycloPetey (talk) 02:41, 25 August 2019 (UTC)
As for typing them: they can be found among the special characters just above the editing window, although it may slow down the work.
Both points raised above are among the reasons why I proposed there should be specifically written that the curly quote rule does not apply to works where cooperation of more people can be supposed. If three people out of four prefer curly quotes, it is no good if the fourth person is forced to use them too for "consistency reasons" even if s/he feels limited by them. There should be written that curly quotes are allowed only if it can be assured that no contributor to the work opposes them or may oppose them (typically when one single person transcribes the whole work). --Jan Kameníček (talk) 07:48, 25 August 2019 (UTC)
I don't like the idea of it being entirely impossible to use curly quotes in large, extensive works. It should be a matter of judgement by people who are working on it at the early stages, at the time style is being established. They should discuss as much as possible and decide whether curly quotes are practical under the circumstances where the work is being done.
As for entry, there are several plugins for Firefox, and I think for Chrome too, to assist with quotes. Personally I use a combination of two methods: converting chunks of text, highlighting quotes, with MS Word; and, while typing, using the superb macro plugin ABCTajpu, which has made my life easier in so many ways. Levana Taylor (talk) 16:35, 25 August 2019 (UTC)
I maintain a WikiEditor button for converting to curly quotes. Sam Wilson 01:18, 27 August 2019 (UTC)
Exactly, you need various plugins, highlight the quotes in Word..., which some people may dislike or be unable to do. I personally prefer working directly in the editing window using OCR buttons. Google button usually transcribes all quotes as straight, so if somebody wants to use curly ones, they have to be changed manually, which is time consuming.
I do not believe that all contributors transcribing entries of large encyclopedias will bother with discussing which kind of quotes to use (I have seen results of collective projects like Proofread of the Month, and unfortunately contributors here usually do not take care about consistency even with much more important issues). However, let's suppose they will and imagine the following situation: 7 people start transcribing a work, 4 prefer curly q., 3 straight. After quite a lengthy discussion taking their time, the three of them retreat (or some retreat and some leave the work) and they start using the curly q. After some time, other contributors come. Some of them do not use the curly quotes, but since quite a lot of work has been finished, they are notified that for consistency reasons they have to, and so they are forced to surrender to something they do not feel comfortable with (or they do not join). Imo this is not good. --Jan Kameníček (talk) 17:22, 25 August 2019 (UTC)
This is already how it goes for pretty much everything though. I don't like using the <poem> extension for example, and will push for explicit line breaks in transcribing poetry. If a project's original contributors have settled on using <poem>, I will have to either go along with that, or move on to some other project. —Beleg Tâl (talk) 18:18, 25 August 2019 (UTC)

As there is majority support and a sufficient length of time has passed, I will implement the measure in one day (24 hours), if there is no objection. TE(æ)A,ea. (talk) 21:19, 23 August 2019 (UTC).

I have undone this change. Do not give us arbitrary deadlines that have no requirement for a deadline. Typically our issues are discussed and open for extended periods. — billinghurst sDrewth 23:58, 24 August 2019 (UTC)
Yes, 24 hour notice is not remotely sufficient for closing a discussion. There are still editors commenting here. Be patient. I think if this discussion remains completely untouched for two full weeks we could consider it closed, though I would give it the full thirty days allotted by SpBot. —Beleg Tâl (talk) 22:22, 25 August 2019 (UTC)
  • I proposed the 24 hours not as a time period for the discussion, but as a notice period in place of further discussion, as the outcome is already obvious. I believed that one week’s time, as had already passed, was sufficient. TE(æ)A,ea. (talk) 23:31, 25 August 2019 (UTC).
    • There's no hurry. English Wikisource has existed for 15 years without curly quotes. Another few weeks won't hurt. Kaldari (talk) 15:04, 27 August 2019 (UTC)
  Support the alt proposal. I've just been alerted to this page, and not sure whether to add the comment here or in the lower discussion. The 1911 Encyclopædia Britannica Wikiproject is one that has long had a standard of curly quotes and apostrophes in its style guide, has a few current editors, and they are familiar with the style. Undoing the guide would be a needless waste of effort that could more usefully be put into proofreading (though I admit I'm guilty of not having done much of that lately). Raw scans will need fixing in any case. While I'm agnostic on what should be the default WS-wide, I'm strongly opposed to a new mandate for a long-established project. DavidBrooks (talk) 16:13, 28 August 2019 (UTC)
  •   Support I had actually decided to abstain on this, since it wasn't sufficiently detailedly specified for my peace of mind (I do like exhaustive detail on such matters). But after far more agonising than the issue strictly merits, I've come around: I just simply want the pretty curly quotes too much! And since we're taking the "let's just open the gates and deal with problems as and when they show up"-approach already, I think it makes the most sense to pick the variant that gives maximum flexibility to our contributors. I do urge everyone to be extra on guard for diverging practices going forward: the benefit of an ultra-conservative approach is it's easy to guarantee consistency, but cleaning up if chaos has been let run unchecked can be near impossible. Keeping an eye out for variations we don't want now can save us a whole mess of trouble later on. (also: shout-out to Jan and BWC, whose concerns I share but land on support anyway) --Xover (talk) 15:50, 30 August 2019 (UTC)
It is indeed a valid concern to worry about introducing a typographic feature which is not entirely straightforward to use correctly, lest it create mess. We should be thinking about software methods to help with proofreading, both checking at the time of adding text and going over existing pages. I have some ideas, but this isn't the place to detail them. Levana Taylor (talk) 16:08, 30 August 2019 (UTC)
Further concern Having just advised someone on creating a link, I've realised a further concern with this proposal. When linking to works, to subpages, or to sections within a page we must have a single standard orthography. This is particularly so for apostrophes, but from time to time quote marks are also affected. Links within a work will be fine, but links from another work will require the editor to decide every time whether normal marks or bent marks were used. A hypothetical example: a book I'm working on refers to Odysseus's Return in Joe Blogg's seminal work Greek Myths and Legends Reinterpreted for the Victorian Age. Do I link to Greek Myths … Age/Odysseus's Return or to Greek Myths … Age/Odysseus’s Return ? The best way around this is to insist that all Titles and Subtitles only use straight quote marks and apostrophes. The next problem comes at sections within a (sub)page where the section heading is a text item within the book. If Odysseus's Return is a section on Greek Myths … Age/Chapter 7 and has not yet been proofread, how will I know which type of mark to use in creating the deeplink? Beeswaxcandle (talk) 19:03, 31 August 2019 (UTC)
I would be okay with insisting on straight quotes in page titles. However, there are currently pages that use curly style in the page title (e.g. Henry Ford’s Own Story; complete list here) and so far they are doing okay. —Beleg Tâl (talk) 20:44, 31 August 2019 (UTC)
Agree with straight quotes in page titles. However, for the ones that exist, I would advise moving them and leaving redirects. It seems to me there is a tendency on this wiki to not leave redirects. That seems counterproductive to me, especially when a page has been up for a while; there's no way of knowing the various places, online and offline, where there might be incoming links. What's the harm in a redirect? -Pete (talk) 17:01, 3 September 2019 (UTC)
I think straight quotes ought to be mandatory in page titles. Agree with leaving redirects though. —Nizolan (talk) 17:50, 15 September 2019 (UTC)
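The "single standard orthography" for link targets raised in this thread can be pictured as a small normalisation step applied to a title before it is used in a link. The following is only a sketch in Python, not an existing Wikisource tool: the name `straighten_title` is hypothetical, and it assumes that only the four common curly marks need to be mapped.

```python
# Hypothetical helper: map curly quote marks and apostrophes in a page
# title to their straight equivalents, so links have one predictable form.
_CURLY_TO_STRAIGHT = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark / apostrophe
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def straighten_title(title: str) -> str:
    """Return the title with curly marks replaced by straight ones."""
    return title.translate(_CURLY_TO_STRAIGHT)


if __name__ == "__main__":
    print(straighten_title("Henry Ford\u2019s Own Story"))
```

Titles that already use straight marks pass through unchanged, so such a step could be applied to every link target without first checking which style a work uses.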
  •   Support I think the question of whether to allow curly quotes is the wrong question entirely. Of course we should—we should be preferring them like PG does nowadays. Straight quotes are a hack introduced with early typewriters as a way to save room on the keyboard, and I would really prefer that we use the original, traditional typesetting instead. So I suggest, instead, that the right question to be asking here is how to make it easiest and most convenient to type smartquotes, since I see that looms largely in people's minds, and rightly so. There are many options we could be discussing, including keyboard shortcuts (easy to add javascript so that ctrl-\" adds a pair of quotes), or a bot to do automatic substitution (which would need to be validated), or other possibilities I haven't even thought of. Dcsohl (talk) 20:01, 8 September 2019 (UTC)
  •   Support; nice balance between aesthetics and consistency. Spangineer (háblame) 13:46, 9 September 2019 (UTC)
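For the automatic substitution idea mentioned in this thread, a very rough heuristic might look like the Python sketch below. It is an illustration under simple assumptions, not the WikiEditor button or any existing bot: a quote mark counts as "opening" when it begins the text or follows whitespace, an opening bracket, or a dash, and as "closing" otherwise. The function name `curlify` is hypothetical, and the output would still need to be validated by a proofreader (leading apostrophes, as in 'tis, are a known failure of this rule).

```python
import re


def curlify(text: str) -> str:
    """Heuristically convert straight quotes to curly quotes."""
    # Double quotes first: opening after start/whitespace/bracket/dash.
    text = re.sub(r'(^|[\s(\[{\u2014])"', "\\1\u201c", text)
    text = text.replace('"', "\u201d")      # every remaining one is closing
    # Single quotes: same rule, also opening right after an opening double.
    text = re.sub(r"(^|[\s(\[{\u2014\u201c])'", "\\1\u2018", text)
    text = text.replace("'", "\u2019")      # closing quotes and apostrophes
    return text


if __name__ == "__main__":
    print(curlify('"It\'s fine," he said.'))
```

Running the two passes in this order lets a single quote nested directly inside a double quote be recognised as an opening mark.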

Proposal to enable blocking functionality for abuse filters

A proposal is open at WS:AN#Upgrading our abuse filters to allow blocking until August 29. All community members are invited to contribute to the discussion. BethNaught (talk) 07:21, 23 August 2019 (UTC)

Some suggestions about web visibility and page quality

I have rebuilt THIS LIST to include the corresponding main namespace pages, to see whether they exist and what is visible to the (first-time) visitor.

In addition to our current visibility, we should consider adding separate webpages titled something like "Wikisource for proofreaders" and "Wikisource for readers" with their respective information and linked to the Wikisource main web page, while keeping the Wikisource main webpage as uncluttered as possible.

Regarding the main page of any work, the first thing noticed is that the quality indicators only display the current page status. Would it be possible to add to a work's title page (aka {{ROOTPAGENAME}}) a bar display summarizing the overall status of the subpages related to the title? — Ineuw (talk) 23:57, 3 September 2019 (UTC)

Bot approval requests

Repairs (and moves)

Designated for requests related to the repair of works (and scans of works) presented on Wikisource

Index:H.R. Rep. No. 94-1476

The pages of this should be moved over to the consolidated version of the file under: Index:H.R. Rep. No. 94-1476 (1976).djvu , the destination pages being Page:H.R. Rep. No. 94-1476 (1976).djvu/1 to Page:H.R. Rep. No. 94-1476 (1976).djvu/368, respectively. All the pages of this have been validated, so a simple mass move/rename would be sufficient (ideally done with a bot). I already moved the Erratum. ShakespeareFan00 (talk) 16:35, 21 July 2019 (UTC)

Why would we want to move these? I don't see the point. The work is presumably okay and has been transcluded, so I am not sure why we would want to move the pages and then have to go and check all the transclusions. Sure, it is not the presentation that we prefer; however, it isn't wrong. — billinghurst sDrewth 02:29, 24 July 2019 (UTC)
Okay then, if you think having the single pages is better, then the Erratum should be moved back, I can't do this directly as it would need an admin to ensure the history stayed intact. The consolidated file should then either have the pages duplicated (a bot task) or the consolidated file should be ditched as a direct duplicate. ShakespeareFan00 (talk) 08:51, 24 July 2019 (UTC)
The mainspace page (Copyright Law Revision (House Report No. 94-1476)) has some very complicated formatting for the three-column table that appears beginning at p. 186 of the source document, relying heavily on mw:Extension:Labeled Section Transclusion to assemble specified sections of each page into the correct column of the table. Although I generally like the idea of having all the pages collected in one file, the cleanup required to make the mainspace page render correctly afterwards would be nontrivial. It works in its current form. The “if it’s not broke, don’t fix it” principle applies. Tarmstro99 13:48, 21 August 2019 (UTC)

Template:Translation header

It seems that the parameter "original", which should include the name of the page that hosts the original language work with the interwiki link, does not work. --Jan Kameníček (talk) 21:36, 23 July 2019 (UTC)

It works for me, can you give examples? —Beleg Tâl (talk) 02:17, 24 July 2019 (UTC)
@Beleg Tâl: I am sorry, I have missed your reaction.
For example at Translation:Capsules only the translated title "Capsules" appears in the header. The original title "Cápsulas" does not appear anywhere, although it is written in the parameter "original=". --Jan Kameníček (talk) 20:59, 16 August 2019 (UTC)
Ah -- the "original" parameter is used to create the interwikilink to es:Cápsulas, but it doesn't display the value in the header. You can just use the "title" parameter for that, or put it in the header notes. —Beleg Tâl (talk) 21:30, 16 August 2019 (UTC)
I’m in agreement with Jan Kameníček that there ought to be a dedicated place in the header to display the title (and date IMO) of the original work. Levana Taylor (talk) 22:16, 16 August 2019 (UTC)
There is - it's "title" (and "year" for the date) —Beleg Tâl (talk) 22:59, 16 August 2019 (UTC)
"Title" is being used for the translated title, there isn’t a place for the original title. Levana Taylor (talk) 23:05, 16 August 2019 (UTC)
You can definitely put the original title in the title field if you want to. This is commonly done. See for example Translation:Ho, mia kor' Beleg Tâl (talk) 23:11, 16 August 2019 (UTC)
I am not a fan of combining two kinds of data in one field, it’s contrary to good design principles. Arrange to have them displayed on the same line if you like, but if they are separated into different parameters, the display can easily be rearranged. Levana Taylor (talk) 23:20, 16 August 2019 (UTC)
┌─────────────┘
It is very confusing for contributors (myself included) if there are two different principles applied in headers. For example the header created by the template {{translations}} treats it in a different (and more expectable) way (compare The Pitman). Imo there is no reason why the attitude applied in creating the header by {{translation header}} should be different. --Jan Kameníček (talk) 04:46, 17 August 2019 (UTC)

I would just like to remind everyone of this unsolved issue: the confusing situation of different attitudes in the header-forming templates {{translation header}} and {{translations}}. --Jan Kameníček (talk) 08:18, 24 August 2019 (UTC)

If you want to make a proposal for a change to {{translation header}}, go for it. —Beleg Tâl (talk) 13:23, 24 August 2019 (UTC)
I have considered this to be quite a clear proposal :-) Is there anything else that needs to be done to change it? I suppose no voting is required to repair a parameter of a template. --Jan Kameníček (talk) 16:36, 24 August 2019 (UTC)
Well actually, I am not quite sure what you’re proposing. Could you, maybe, do a mockup of what you’d like the translation header to look like? Levana Taylor (talk) 16:56, 24 August 2019 (UTC)

I propose to unify the behaviour of the parameter "original" of the template {{Translation header}} with the same parameter of the template {{Translations}}. That means: The original name of work written in the parameter "original" should also appear in the header in brackets.

Example: In page Translation:Capsules the template

{{translation header ... | original = Cápsulas ... }}

should produce the title Cápsulas (Capsules), similarly as it happens e.g. in The Pitman, where the template

{{translations ... | original = Kovkop ... }}

produces Kovkop (The Pitman). --Jan Kameníček (talk) 22:49, 24 August 2019 (UTC)
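The display rule proposed here can be stated very compactly. The Python sketch below only illustrates the intended output; it is not the template's actual code, and the function name `header_title` and its parameter names are hypothetical stand-ins for the template's "title" and "original" parameters.

```python
from typing import Optional


def header_title(title: str, original: Optional[str] = None) -> str:
    """Build the displayed header title under the proposed rule.

    If an original-language title is supplied, show it first with the
    translated title in brackets; otherwise show the title alone.
    """
    if original:
        return f"{original} ({title})"
    return title


if __name__ == "__main__":
    print(header_title("Capsules", original="Cápsulas"))
    print(header_title("The Pitman", original="Kovkop"))
```

Keeping the two values as separate parameters, as in this sketch, is what would allow the display order to be rearranged later without editing every page.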

New pdf and index repairs for Once a Week volume 2

One of the pages of the existing file for Index:Once a Week, Series 1, Volume II Dec 1859 to June 1860.pdf is in the wrong place in the pdf. I tried to upload a repaired pdf but couldn’t figure out how to overwrite the existing file using chunked upload. I have stored the better pdf at the following filename on the Commons: c:File:Once a Week, S1 V2.pdf. Could someone please use that file to overwrite c:Once a Week, Series 1, Volume II Dec 1859 to June 1860.pdf and then delete my temporary storage file? Also, the index needs to be corrected as follows (numbers are the file pages, not the magazine numbering). The main namespace transclusion page numbers are already correct for the new numbering.

  • no change: new 1-88 = old 1-88
  • new 89-100 = old 90-101
  • new 101 = old 89
  • no change: new 102-635 = old 102-635
  • in the new version, there are 4 no-content pages after the last page; in the old one, there are 8.

Levana Taylor (talk) 20:14, 30 August 2019 (UTC)
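For anyone scripting the page moves, the renumbering listed above can be expressed as a function from old file page to new file page. This is a minimal Python sketch under the assumption that only file pages 1 to 635 carry content that needs moving (the trailing no-content pages are ignored); the function name `new_page_number` is hypothetical and no bot framework is implied.

```python
def new_page_number(old: int) -> int:
    """Map an old file page number to its position in the repaired PDF."""
    if not 1 <= old <= 635:
        raise ValueError(f"page {old} is outside the content range 1-635")
    if old == 89:
        return 101          # the misplaced page moves to its correct spot
    if 90 <= old <= 101:
        return old - 1      # pages that followed it each shift down by one
    return old              # 1-88 and 102-635 are unchanged


if __name__ == "__main__":
    # List only the pages that actually need to move.
    for old in range(1, 636):
        new = new_page_number(old)
        if new != old:
            print(f"old {old} -> new {new}")
```

When performing the moves in place, the targets overlap with existing pages (old 90 wants to become 89 while old 89 still exists), so old 89 would need to be moved to a temporary name first.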

@Levana Taylor: I've uploaded the new version over c:File:Once a Week, Series 1, Volume II Dec 1859 to June 1860.pdf and tagged c:File:Once a Week, S1 V2.pdf for deletion. But someone with AWB or similar tools must take care of the rejigging of pages in the Page: namespace here. --Xover (talk) 20:45, 30 August 2019 (UTC)
Thanks, that was fast! Levana Taylor (talk) 20:48, 30 August 2019 (UTC)

Once a Week pages shifted

Pages 76 to 88 of Index:Once a Week, Series 1, Volume II Dec 1859 to June 1860.pdf are shifted, is there some automatic way to repair it? --Jan Kameníček (talk) 06:18, 9 September 2019 (UTC)

Template:TOC begin/end/row series

Can someone who has the patience and the time look at fixing this series of templates, converting them to have a leading |- rather than a trailing |-? They break page numbering formatting on all tables of contents, so typically only the lead page number of the ToC shows, and it is long past time that we got it resolved. The changes will need to go through the sandbox. — billinghurst sDrewth 02:18, 8 September 2019 (UTC)

@ShakespeareFan00:. I see this was just done directly on the templates. However, as I said, in detail, last time this was brought up (to which there was no reply), care must be taken here, because whitespace in the wikitext between TOC row templates will be sucked into the rows, whereas before, it was not. For example, Page:1965 FBI monograph on Nation of Islam.djvu/4 is now double-spaced. Specifically, whereas the following was fine before:
{{TOC row 1-1-1|foo|bar|baz}}

{{TOC row 1-1-1|foo|bar|baz}}
it should now be changed to:
{{TOC row 1-1-1|foo|bar|baz}}
{{TOC row 1-1-1|foo|bar|baz}}
I think this might be fixed by removing the newline before the "|-" at the start:
<includeonly>|-
|align=right valign=top|{{{1|}}}
Even then, you must still be careful about having two or more line spaces between rows (and newlines at the end of the last parameter count as well), so you'll need to ensure nothing gets broken. Thanks for making the effort to fix this long-standing issue. Inductiveloadtalk/contribs 10:11, 10 September 2019 (UTC)
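If the templates are changed, the blank lines between rows would have to be removed on the affected pages. A rough Python sketch of that clean-up follows. It assumes the row templates are all written with the literal prefix "{{TOC row" and simply deletes blank lines between a closing "}}" and the next such row; the function name `collapse_toc_row_gaps` is hypothetical, and this is not an existing bot task.

```python
import re

# A closing "}}", optional trailing spaces, a newline, then one or more
# blank lines, immediately followed by the start of a TOC row template.
# Note: the "}}" may belong to any template, not only a TOC row.
_BLANKS_BEFORE_ROW = re.compile(r"(\}\})[ \t]*\n(?:[ \t]*\n)+(?=\{\{TOC row)")


def collapse_toc_row_gaps(wikitext: str) -> str:
    """Remove blank lines that separate a template from a following TOC row."""
    return _BLANKS_BEFORE_ROW.sub(r"\1\n", wikitext)


if __name__ == "__main__":
    before = (
        "{{TOC row 1-1-1|foo|bar|baz}}\n"
        "\n"
        "{{TOC row 1-1-1|foo|bar|baz}}"
    )
    print(collapse_toc_row_gaps(before))
```

Rows already separated by a single newline are left untouched, so the function can be run repeatedly on the same page without further changes.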


The line feed at the start is needed so that MediaWiki recognises the table row marker. Placing it as you suggest visibly breaks rows in a manner that is more obvious than the dual whitespace you mention. Whitespace handling interactions between templates, tables and transclusions are a long-standing issue, and so far no long-term solution has emerged. I'm quite prepared to revert ALL my changes, on the basis that it's long overdue for someone else to thoroughly rethink how these templates are done, and redesign them to be white-space neutral. ShakespeareFan00 (talk) 10:47, 10 September 2019 (UTC)
Following further concerns, I've now reverted ALL the changes made to this family. Perhaps someone else can sit down and fully redesign it so that it is completely independent and neutral of the minutiae of Mediawiki's (still under documented) whitespace handling behaviour. ShakespeareFan00 (talk) 11:21, 10 September 2019 (UTC)
Getting the page numbers right is tricky, but it isn't due to some mysterious non-specific "minutiae" in whitespace, it's just simple concatenation of the page content. It's due to this: on a Page page that starts with a TOC row, you essentially have this:
<span class="pagenum ws-pagenum" id="6" data-page-number="6" title="Page:Sandbox.djvu/6">
<!--"real" page content starts now
--><tr>
  <td>1</td>
  <td>Chapter 1</td>
  <td>Page 1</td>
</tr>
etc etc
<!-- page content ends -->
It's documented in the second sentence at mw:Help:Extension:Linter/fostered: "Specifically, content in tables can only be present in table cells, table headings, and captions.". Because the ProofreadPage span (with class "pagenum") is inserted between tr's, it's therefore "fostered" and pops out of the table. This isn't noticed by the linter, I assume because ProofreadPage inserts its spans after the linter and parser do their things.
Your change to the template starts the page off by appending to the previous page's last row, like this:
  <!-- previous page ends here -->
  <span class="pagenum ws-pagenum" id="6" data-page-number="6" title="Page:Sandbox.djvu/6"></td>
</tr>
<!--"real" page content starts now
--><tr>
  <td>1</td>
  <td>Chapter 1</td>
  <td>Page 1</td>
</tr>
While this is now a valid place for the pagenum span, 1) it's on the wrong row and 2) this tends to suck in any inter-template whitespace just before the first </td> and make it part of the cell content, even when it's not part of the template parameter, which is slightly "surprising" behaviour.
I'm unsure as to the best approach. I think an ideal solution would be for the ProofreadPage extension to know when it's trying to stuff a span in-between table rows and instead push it into the first cell, but I have no earthly idea if that is possible. I have raised phabricator:T232477 to ask people who might know the answer.
If this is deemed to be "user error" (i.e. it's not MediaWiki/ProofreadPage's problem and these templates are objectively "wrong"), then the only recourse here is to change all the templates (as you did) and carefully check that the spacing is not broken. There are roughly 2000 pages using these templates, and many are OK, so it'll be a PITA, but tractable. Then, going forward, it will be disallowed to have blank lines between TOC rows. IMO it would be preferable to allow whitespace between templates, as it's more readable and it's surprising for users that single blank lines would cause such issues. But it wouldn't be the end of the world.
Anyone with any other ideas of how to get the best of both worlds? Inductiveloadtalk/contribs 13:59, 10 September 2019 (UTC)
@Inductiveload: This is not so much a solution as merely a suggestion towards one… Get rid of the MediaWiki mark-up in the templates and replace it with explicit <table>, <tr>, <td> etc. This does not solve the whitespace problem, but at the very least it eliminates the counter-productive requirement for blank lines to be present for proper recognition of MediaWiki keywords.
Also, doubling down on the linter issue (i.e. why not introduce a structure which works in more cases even though it will drive lint slightly crazy in the general case…), is it worth considering modifying MediaWiki:Proofreadpage pagenum template to something like:
<includeonly><tr><td><span class="pagenum ws-pagenum" id="{{{num}}}" data-page-number="{{{num}}}" title="{{urlencode:{{{page}}}|WIKI}}">&#8203;</span></td></tr></includeonly>
— yes the above is pretty nuts in normal situations (but workable after the parser strips the useless tags… and works perfectly in the rare case where inserted between </tr>…<tr> tags…)? 114.78.171.144 10:52, 11 September 2019 (UTC)
I have, in fact, already attempted using raw HTML tags (e.g. here), and while it does solve the line-break issue, the issue with page numbers persists (logically, as the <span> is still between a </tr> and <tr>).
Your idea about putting the pagenum in a <tr> has another drawback (other than upsetting the linter, though I actually think the linter would miss it due to when the PP extension injects the code) in that it will force a dummy row in the table, which would probably cause disruption across page-breaks. Inductiveloadtalk/contribs 11:22, 11 September 2019 (UTC)

Other discussions

Revisiting curly quotes

Per EncycloPetey's suggestion at the style guide talk page, I would like to have the community revisit the idea of allowing curly quotes. Personally, I hate curly quotes and think they are a pox on humanity, but considering how we go to such great lengths to make our source texts faithful to the originals, it does seem like a prominent inconsistency. I'll try to list a few of the advantages and disadvantages that have been discussed...
Advantages

  • Can be more faithful to original text.
  • Consistent with Project Gutenberg (and probably the majority of commercial e-texts).
  • In some cases, may be easier to read, especially when multiple quote characters are in sequence.

Disadvantages

  • Harder to enter.
  • Some browsers may not be able to render (according to EncycloPetey in 2015).
  • They are often used incorrectly.

As this would be an optional style, I wouldn't give much weight to it being difficult to enter. It's certainly easier than most of our TOCs. And given that it's been 5 years since this was last debated, I have serious doubts that there are still issues around rendering. That basically leaves the objection that they are often used incorrectly, which I would say is also true of all the different dash characters we allow (and even expect). Are there other disadvantages that I'm forgetting? I suppose one is that it would cause inconsistent styles across Wikisource. What are other folks' opinions and thoughts about this? Kaldari (talk) 01:41, 2 July 2019 (UTC)

Curly quotes are better typography and are recommended by Unicode. Some quotation styles are not compatible with straight quotes (like „German quotations“). Browsers that can't handle basic web standards like Unicode are going to have far more serious problems on Wikisource than being able to render curly quotes. It is ridiculous that curly quotes are against our MOS. —Beleg Tâl (talk) 01:48, 2 July 2019 (UTC)
Re: "more faithful to original text" No, this is a typography issue, and has nothing to do with faithfulness to a text. A text has curly quotes because of a printer's choices, not any choice made by the author. And just as we do not specify fonts, we shouldn't bother specifying specific styles of punctuation either. Re: "Consistent with Project Gutenberg" this is irrelevant, and is at odds with the previous claim that using curly quotes would be "faithful to original text". If we're worried about the original text, then why should we care what other sites are doing? And if we're worried about what other sites are doing, then we're not caring about the original text. I'd also point out that Project Gutenberg is in no way consistent about the style of quotes they use. Additional disadvantage: We will need an additional series of quotation template to handle situations currently done with templates like {{' "}} which provide for clarity of punctuation. --EncycloPetey (talk) 03:32, 2 July 2019 (UTC)
It seems like we often try pretty hard to match the typography in the source: Page:KJV 1769 Oxford Edition, vol. 1.djvu/10. Should this be discouraged, in your opinion? Kaldari (talk) 04:02, 2 July 2019 (UTC)
RE: "we often", no this was a single editor over-formatting that page. --EncycloPetey (talk) 17:22, 6 July 2019 (UTC)
  •   Comment In the vast bulk of our works curly quotes are not required, and should not be encouraged, and their addition doesn't give value to the works, ie. disadvantages outweigh advantages, especially in true communal works. That said, we have always allowed some variation where there is a reasonable explanation of why a deviation from the style guide can be justified. We haven't been absolutist about these matters, we just have a valid reasoning for setting a style guide, and generally asking people to follow it, and not deviate "just because", or "because I like them better". The test for a deviation has been an open conversation, and a semblance of consensus. — billinghurst sDrewth 05:32, 2 July 2019 (UTC)
In the vast majority of our works, neither casing, accents nor non-monospace fonts are required. We could go all old-school computer printout style, but given that we support Unicode and rich text, it seems reasonable that we write English as it is supposed to be written, with curly quotes. Their addition makes the text easier to read by adding additional cues as to the meaning of a quote, whether it is opening a quote or closing it. --Prosfilaes (talk) 07:14, 2 July 2019 (UTC)
I Would Çonsidér Accürate Reprǒductión Of Casing And Aççénts To Be Requĩred —Beleg Tâl (talk) 12:58, 2 July 2019 (UTC)
I THINK DECADES OF TELEGRAPHIC AND COMPUTER USE ESTABLISH THAT ACCURATE REPRODUCTION OF CASING IS NOT REQUIRED and a lack of accents in English has decades more of use with ASCII and normal keyboards.--Prosfilaes (talk) 09:32, 3 July 2019 (UTC)
I was considering opening up a discussion on this; it definitely is more faithful to the way English is published. Typographical norms should not be scorned for just being typographical norms.--Prosfilaes (talk) 07:14, 2 July 2019 (UTC)
  •   Comment I wouldn't mind them being used but I wouldn't use them personally, because it's another pain for us proofreaders to worry about that isn't worth the hassle. I wouldn't discourage other users from using them though, neither would I mind if a user were to add them in to a work I've proofread and used straight quotes on. In my opinion it is a minor typography issue and I really wouldn't care if they're used or not. Jpez (talk) 11:45, 2 July 2019 (UTC)
  •   Comment The relevant current text of the Manual of Style reads:
"Use typewriter quotation marks (straight, not curly)."
I agree with what I think most above are saying, which I think would be most succinctly stated as follows:
"Any given work should be self-consistent in terms of the style of quotation marks and apostrophes used. That is, such marks should either all be curly, or all be straight, within any given work. If the initial transcriber of a work has chosen one style, the other style should not be adopted in that work unless a user intends to update the entire work to the new style choice."
I am very skeptical about this: "unless a user intends to update the entire work to the new style choice". Imagine occasional proofreaders doing small chunks of an encyclopedia. I doubt they will check what others have done and even more make it consistent. Pretty sure we will end up with mixed style except for committed users on single works.— Mpaa (talk) 23:30, 2 July 2019 (UTC)
@Mpaa: That's exactly the kind of thing I'm imagining (and have experienced). Here's how I imagine it working, with such a policy (perhaps worded a bit better) in place:
  • Me: Hi, I see you've validated about 5% of the pages of this work. That's awesome, thanks! I've proofread about 80% of them. I see that you have changed straight quotes to curly. Are you intending to go through the whole document and change them all to curly?
  1. Other editor: Yes, I plan to do that.
Me: Great! I look forward to seeing the final result.
  1. Other editor: No, I was just passing through, only interested in this one chapter of the book. I probably won't do more than a few more.
Me: Ah, I see. In that case, would you mind sticking with the convention I began with? (links to MOS) I'd rather stick with straight quotes than go to the trouble of updating all the pages.
In my view, that's a nice, easy way to resolve this "conflict" (which doesn't even need to be a conflict). Having a manual of style that guides us in this direction would, in my view, be a great advantage, and make it possible to quickly and easily arrive at an acceptable solution. -Pete (talk) 22:04, 3 July 2019 (UTC)
A satisfactory solution to an imaginary and likely situation, but that is not how it plays out in folk-lore. CYGNIS INSIGNIS 11:26, 6 July 2019 (UTC)
Hmm, I don't see anything relevant on that work's discussion pages, what am I missing? This is a type of interaction I've seen work on many wikis over the years. I'm not sure where you'd anticipate it going off the rails...but, having a clearly articulated policy that sets the parameters is a necessary ingredient. -Pete (talk) 22:19, 6 July 2019 (UTC)
One practical concern is that various automated tools implement one style or the other, and may need to be rewritten or eschewed in order to comply with a change in the Manual of Style. -Pete (talk) 20:27, 2 July 2019 (UTC)
One of the reasons why I was doing this was because the OCR seems to spit out curly quotes, and I was tired of fixing them.
I’m guilty of using curly quotes in some works, where I am (or try to be) completely consistent throughout the whole work. That said, I normally only do it in novels (or this comment), where I’m planning on spending a fair bit of time sitting reading the thing — I think proper typography matters more then (and indeed, I’d say curly quotes, along with correctly-sized dashes and various other non-typewriter or -computer conventions are “proper”). For reference, non-prose works, I think straight quotes are fine. (Maybe the distinction I’m getting at is between works with large amounts of dialog and those without?) I’ve seen people make the argument that they’re not required because we can make automatic replacements later, but that’s not really true: there are various situations in which it’s impossible to programmatically determine which type of quote character should be used. Anyway, I hope I’m not on the wrong side of common opinion here, but I do like to be able to use curly quotes on Wikisource. —Sam Wilson 11:33, 3 July 2019 (UTC)
I agree with this and Jpez's comment above. I don't see a good reason to forbid them categorically, and can see plausible use cases in novels and the like, even though I probably wouldn't use them myself (the projects I work on are generally academic and often require enough heavy lifting in Unicode without having to fiddle with quote marks). —Nizolan (talk) 14:30, 3 July 2019 (UTC)
On a personal level, I like curly quotes & find them easier to read. But from an editor's-usability standpoint it sure does make sense to convert everything to straight quotes, as the only way to avoid inconsistency. It is much easier to convert curly to straight than vice versa; I have a little application to straighten the quotes in all OCR output, but the reverse process would be no easy matter. I actually began entering Once a Week magazine with curly quotes but EncycloPetey pointed out the standards so I went through several hundred pages I'd entered and straightened the quotes. I am now up to 2000 pages and I am most emphatically not going to revisit all of them and curlify them .... We seem to be stuck with straight quotes as a legacy issue. There would have been ways to make entering curly quotes easier if they had been favored from the beginning; and the OCR output from the application that this site uses now gives us them automatically; but although I don't see a problem with allowing them in certain cases (there are, for a parallel example, a few texts displayed with long s's) a person would have to be urged to think carefully before they started down that route, because so much of the existing apparatus favors straight quotes that avoiding inconsistency would be difficult. Levana Taylor (talk) 23:08, 3 July 2019 (UTC)
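To illustrate why the curly-to-straight direction is the easy one, here is a minimal sketch in Python. It is not the application mentioned above; the function name and the set of characters covered are assumptions made for illustration. Each typographic quote has exactly one typewriter equivalent, so a fixed lookup table is enough, and the information about which side of a quotation a mark sat on is simply discarded.

```python
# Map each typographic quote to its typewriter equivalent.
# The set of characters covered here is an assumption; extend it as needed.
STRAIGHTEN = str.maketrans({
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark (also used as apostrophe)
    "\u201C": '"',  # left double quotation mark
    "\u201D": '"',  # right double quotation mark
})


def straighten_quotes(text: str) -> str:
    """Replace curly quotes with straight ones (a lossy, one-way conversion)."""
    return text.translate(STRAIGHTEN)


print(straighten_quotes("\u201CDon\u2019t,\u201D she said."))
```

Because the conversion is lossy, running it blindly would also flatten curly marks that carry other meaning, such as the ayin and aleph transliterations raised later in this thread.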
It is much easier to convert curly to straight than vice versa is also an argument for curly quotes; transcribing text is all about recovering information from the images that can't be done automatically.
We should be stuck with nothing as legacy issues. If it's better we should make the change, and earlier is better. I'm not sure that much of the existing apparatus favors straight quotes, but this is a chance to change the existing apparatus.--Prosfilaes (talk) 01:07, 4 July 2019 (UTC)
Curly quotation marks are a legacy of printed blocks of text that required incremental spacing, denoting a beginning or end of a quotation if the squelching and stretching of the line made that ambiguous, and few of those legacies are transcribed here (often [or yet]). CYGNIS INSIGNIS 11:18, 6 July 2019 (UTC)
Quotation marks are a legacy of printing; inline quotation marks date no earlier than the 17th century. Printing pervades how we write English, and an attempt to abandon those legacies would produce something unusable or unwelcomed by most of our audience.--Prosfilaes (talk) 07:15, 7 July 2019 (UTC)
There is nothing difficult in writing curly quotes. On Macs and on all modern mobile devices they are easy to enter using the built-in keyboards. On Windows it's probably not built into the default keyboard, but that's what the Special characters button is for.
Old browsers are not a reason not to use modern technology. If they aren't upgraded today, they will be upgraded in a year or two.
A lot of websites that care about quality of presentation use curly quotes. Wikis in some languages have a gadget that converts straight quotes to elegant quotes automatically. Some sites where text can be edited do it as well, for example Quora, and Wikisource could do it (it must not be forced, though).
I actually find it surprising that there are people on English wiki sites who are against curly quotes, given that the English language has such a long typographic tradition of using rich punctuation, with quotes, dashes of various length, etc. --Amir E. Aharoni (talk) 05:47, 4 July 2019 (UTC)

(unindent) I have an idea; it'd take some programming, though. Suppose it was allowed to enter curly quotes, but the software would display them as straight quotes by default. That way, if only part of a text was entered curly, people would usually never notice because it'd all be displayed straight. However, there'd be a user-controlled setting allowing displaying curly quotes where they exist. Levana Taylor (talk) 02:12, 5 July 2019 (UTC)

I don't see the advantage. There's a lot of criticism of the "just add another user-controlled setting" idea in the UI world. It seems a lot better to offer tools to help make the changes and encourage not doing inconsistent changes.--Prosfilaes (talk) 03:46, 5 July 2019 (UTC)
I am also not sure it's a good idea to correct all curly quotes to typewriter quotes user-side by default. There are some cases in texts I've transcribed myself where curly quotes are necessary independently of the general guideline. An example of this is in transliterations of Semitic languages, where the distinct letters ayin and aleph (the half-rings ʿ and ʾ in modern scientific transcription) are often represented by curly apostrophes, ‘ and ’ respectively. In this case correcting these to typewriter quotes would remove necessary information. —Nizolan (talk) 11:51, 5 July 2019 (UTC)
Some additional advantages and disadvantages have been mentioned:
More Advantages:
  • Easier to convert from curly quotes to straight quotes than vice versa.
  • OCR tools already output curly quotes.
More Disadvantages:
  • Some new templates will be needed.
  • Some tools will need to be updated.
Let me know if I'm overlooking any. Kaldari (talk) 04:53, 6 July 2019 (UTC)
  • @EncycloPetey: I'm curious if any of the arguments above have led you to reconsider your opposition (as you seem to be the main opponent of the idea). Kaldari (talk) 04:54, 6 July 2019 (UTC)
    I may be more vocal, but that doesn't mean I'm the "main opponent", it merely means that my voice is stronger in this discussion. It is normal in the Wikisource community for long-time participants to sit back and read discussions without chiming in, so long as their opinion has been expressed by someone in the discussion. I have done this myself. No one in this discussion has explicitly voiced support or oppose, and it would be premature to interpret anyone's opinion when there has been no call for a vote. You've also biased your interpretation: where some editors have said "I don't care", you have interpreted that as "support", but it is not at all the same thing. This is a "community revisit", and not a vote to change policy. --EncycloPetey (talk) 17:32, 6 July 2019 (UTC)
    For the record, I explicitly   Support changing the policy to allow curly quotes at the editor's discretion, and   Oppose continuing to disallow curly quotes. However, this discussion didn't contain a proposal either way, so it doesn't matter until someone posts a proposal for !voting. —Beleg Tâl (talk) 23:19, 6 July 2019 (UTC)
  • I think Kaldari's summary is helpful, and I'm not sure why we're talking about whether this is a formal vote when nobody has claimed that it is. FWIW I have no objection to the characterization of my position. I like Jan's version below, making "straight" the default and only permitting “curly” where there's some evidence that curly will be used consistently throughout the work. In fact, that seems like a useful formalization of a principle I expect is already in use in some places, but not formally documented or endorsed. Which IMO is one of the best ways to develop policies on a wiki. -Pete (talk) 22:29, 6 July 2019 (UTC)
To make my position clear, I am very much in favor of allowing curly quotes on a case-by-case basis as long as the editor intends to make a good-faith effort to see they're used consistently (the guideline could be, "please don't add curly quotes to a work that's already partly straight quotes unless you're about to change the whole thing.") Though I worry about how to get it done, I think the problems are solvable, so yeah, in favor. Levana Taylor (talk) 03:20, 7 July 2019 (UTC)
@EncycloPetey: I wasn't trying to hold a vote, I was trying to see if maybe there was consensus for a change in the style guide, in which case, there would be no need for a vote. It is clear from your reply, however, that you are still against the idea, and maybe there are other silent voices that are as well. Also, please let me know whose opinion I have misinterpreted, and I will be happy to revise my statement above. Kaldari (talk) 14:43, 8 July 2019 (UTC)
And by "consensus", I meant actual consensus, not wiki-speak consensus. Kaldari (talk) 14:47, 8 July 2019 (UTC)

I have been following the discussion and thinking over the pros and cons and finally came to this opinion: The main disadvantage of allowing curly quotes is a danger of different attitudes of two or more people transcribing one work. For this reason I would explicitly allow curly quotes only if the contributor is able to ensure consistency of their usage throughout the whole work, typically when the contributor transcribes the whole work by himself/herself. When more people cooperate on transcription of a work, straight quotes should be recommended, unless they are all able to make an agreement about curly quotes (of course such agreement is practically impossible with such large works as Encyclopaedia Britannica). --Jan Kameníček (talk) 18:53, 6 July 2019 (UTC)

That's a nice thought, but even with the best will in the world, people start projects and don't finish them. The better thing is for the person who starts a project to document all their style choices on the talk page -- the note at the top of the index indicating that style guidelines exist is a great invention. Quote-style is no different from many other choices in that respect and can be handled the same way. If we shift to curly quotes and they become normal, then there will be no problem with expecting people who sign on late to a future project to use them. It's only the possibility of transitioning to curly quotes in projects that are already begun now that presents difficulties. Levana Taylor (talk) 21:03, 6 July 2019 (UTC)
A policy doesn't have to be perfect to be useful, sometimes "good enough" is good enough. I believe most Wikisource users would answer in good faith if asked, "do you intend to complete this project?" For myself, I think I'd answer "yes" for about half, "no" for about half. If a few projects end up with some inconsistencies because somebody intended to finish it but then got distracted or busy elsewhere, is that too high a price to pay if there are benefits elsewhere? -Pete (talk) 22:33, 6 July 2019 (UTC)
Both sides of this argument are starting to fall into the desperate position of trying to shore up numbers of supporters by appealing to the everybody who is reading this and not making their position clear must by inference be on my side of the argument. So for the sake of this alone I must delurk and declare that I am a closet supporter of the use of curly quotes for reasons I shall not go into here as they have already been adequately covered by others.
On the matter of automated tool use favouring straight quotes I have some sympathy. Creative laziness is always admirable but at heart it is just that: laziness. If some piece of scripted magic could perform reliable verification then why is everybody here at all? Proofreading still has an aspect of bespoke craft and we should take pride in our input.
As for perceived difficulty of entry of characters, aside from resort to UNICODE, there are the various pickers available both mediawiki supported and local. Nobody appears to have yet noted that native HTML such as the <q></q> construct works well under mediawiki, and all reputable browsers now handle the so-called HTML entity-forms: &ldquo; → “; &rdquo; → ”; &lsquo; → ‘; &rsquo; → ’. Learn them; they are your friends! 114.78.66.82 22:56, 6 July 2019 (UTC)
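The four entity forms listed above can be checked outside a browser as well. The following is a small sketch using Python's standard `html.unescape`; the wrapper name `decode_entities` is hypothetical and only there to keep the example self-contained.

```python
import html


def decode_entities(markup: str) -> str:
    """Resolve named HTML entities such as &ldquo; to the characters they stand for."""
    return html.unescape(markup)


sample = "&ldquo;Call me &lsquo;Ishmael&rsquo;,&rdquo; he said."
print(decode_entities(sample))
```

This only demonstrates the mapping from entity names to characters; it says nothing about how MediaWiki stores or renders them.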
Suggestion for implementation of quotes: there should be a gadget that finds all straight quotes on a page and converts them to curly while highlighting them so the editor can check the result for correctness (because it wouldn't be perfect). This would go a long way toward easing worries about inconsistency. Not only would it allow quicker, better conversion of works that have been begun using straight quotes, but if someone happens to notice stray straight quotes in a work that's mostly curly, they can rapidly find them all. Levana Taylor (talk) 03:20, 7 July 2019 (UTC)
  Comment I agree with the spirit of @Levana Taylor's suggestion but point out the occasional real-world case of quotations crossing pages — commencing on one page and terminating on a later one — would completely reverse the sense of correct quotation mark appearance. Which makes implementation of such a gadget tortuously impractical — as the analysis must take place at the work/chapter level to enable sensible decision for the gadget to act on the component page level. Only the {{nop}}-inserter gadget attempts this at present and for a vastly simpler case. 114.78.66.82 03:43, 7 July 2019 (UTC)
I don't see the problem. Ending quotes are at the end of words, lines and paragraphs, and starting quotes are at the start of words, lines and paragraphs. Quotes never cross paragraphs in normal English style, so any tool should restart at a new paragraph. On proofed text, it's possible the tool will put the wrong quotes on the first paragraph, but it won't be a problem for the whole page.--Prosfilaes (talk) 07:02, 7 July 2019 (UTC)
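The heuristic described in the comment above, combined with the earlier suggestion of flagging every converted mark for review, can be sketched roughly as follows. This is an illustrative Python sketch, not an existing gadget; the function name, the choice of characters treated as "before an opening quote", and the returned list of positions are all assumptions. The rule looks only at the character immediately before each straight quote, so no state is carried from one paragraph or page to the next.

```python
OPENING = {'"': "\u201C", "'": "\u2018"}
CLOSING = {'"': "\u201D", "'": "\u2019"}
# Characters after which a quote is assumed to open (an illustrative choice):
# opening brackets, an em dash, and an already-converted opening quote.
OPENS_AFTER = "([{\u2014\u201C\u2018"


def curlify(text: str):
    """Convert straight quotes to curly ones using only local context.

    Returns the converted text and the positions of every converted mark,
    so that an editor could review each one.
    """
    out = []
    changed = []
    for index, ch in enumerate(text):
        if ch in OPENING:
            prev = out[-1] if out else ""
            # A quote at the very start, after whitespace, or after an
            # opening mark is treated as opening; anything else as closing.
            is_opening = prev == "" or prev.isspace() or prev in OPENS_AFTER
            out.append(OPENING[ch] if is_opening else CLOSING[ch])
            changed.append(index)
        else:
            out.append(ch)
    return "".join(out), changed


converted, positions = curlify('"Hello," she said, "it\'s \'fine\'."')
print(converted)
print(positions)
```

A known weakness of this rule is the leading apostrophe: in words like 'Tis or '90s the mark follows a space and would be converted to an opening quote, which is one of the cases where a human check is still needed.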


Arbitrary break (curly quotes)

I have… questions… 🤔


  • Are we proposing to allow straight vs. non-straight quote mark style to be at the whim of the first contributor? Of any contributor that has a good-faith intent to update all previously proofread pages? Only when based on some set of criteria related to the work? What, if any, are the constraints on the choice?
  • Is the proposal for the benefit of proofreaders with a preference, or for our readers? That is, is our goal to achieve the best presentation for our readers, or to allow our proofreaders some flexibility or to express their own preference? What are we trying to achieve by making a change in this area?
  • At what level do we care about consistency? The work? The chapter? Individual entries in the DNB and similar? Across works within a series? How would we achieve such consistency in practice? How would we resolve conflicts in preference?
  • What kind of curly quotes (there are on the order of 30 of them) would be allowed, and how would the style be decided? Would the “Anglophone” style be allowed or preferred for a work by a « Francophone » or „Germanic“ author? How about for reproducing an official text of some kind (English translation of a law, say) where the originating country specifies (sometimes by law) a specific quote style?
  • How would we handle the issues currently dealt with by {{" '}}, {{' "}}, et al (there are 5 of them just for straight quotes; each extra style of quote would generate at least 5 more)? How do we detect and correct instances where accent marks are (accidentally) substituted for single quote marks? How about the inevitable Windows CP1252 character set issues?

I don't as yet have a firm opinion on the issue of curly quotes except that they do create a lot of complexity and that that complexity must be addressed if we are to adopt them. I do hold the opinion that good typography aids readability; that good typography creates visually appealing works, and that visual appeal is a desirable trait; that our goal should be the benefit of our readers over our own contributors; and that our readers are a diverse group with many different needs. I am also by inclination prone to prefer more diplomatic reproduction of works (I've driven certain community members to distraction by insisting on using {{lgst}})—which inclines me to want to reproduce a work's quotation style, and against any style that differs from the one used in the work (including substituting straight for curly)—but experience has taught me that there are good reasons to moderate that impulse (see, Billinghurst, I do listen and learn!).

And, ultimately my main concern is maintainability and manageability over the totality of the project, over a decade or two, and in the face of practical realities like the occasional conflict between contributors, the perennial slow changing of the guard (who now remembers why we made every decision shaping the project?), and the necessity of either automating or having the manpower for certain kinds of necessary cleanup or guidance for new contributors.

I like fancy quotes (and other typographic affordances), but they sound like they'd be really hard to do right. --Xover (talk) 07:18, 7 July 2019 (UTC)

Curly quotes should match the scanned text. That's simple.
We should deal with {{" '}} with fire, and then dump the ashes into a heart of a live volcano. I mean, if that's your bag, then whatever, but it seems weird about arguing for curly quotes against claims that the typography doesn't matter and that consistency is important, but have to deal with an idiosyncratic set of templates that surely have no consistency in use and tackle an issue of micro-typography that can and should be handled by modern text layout systems in web browsers; TrueType fonts have supported kerning pairs of characters for 25 years.
I would hope that modern systems won't dump CP1252 characters into the browser. We should have a bot checking the pages for inappropriate Unicode characters (Private Use, unmapped, etc.) and include 0080-009F in there.--Prosfilaes (talk) 07:56, 7 July 2019 (UTC)
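A check of the kind suggested above could look roughly like this. It is a sketch in Python using the standard `unicodedata` module, not an existing bot; the function name and the report format are assumptions. It flags Private Use characters (general category Co), unassigned code points (Cn), and anything in the range 0080-009F, which is where CP1252 bytes end up when they are misread as Latin-1.

```python
import unicodedata


def find_suspect_chars(text: str):
    """Return (position, code point, category) for characters worth a second look."""
    suspects = []
    for index, ch in enumerate(text):
        category = unicodedata.category(ch)
        # C1 control range: typical debris from CP1252 text decoded as Latin-1.
        is_c1_control = 0x80 <= ord(ch) <= 0x9F
        if category in ("Co", "Cn") or is_c1_control:
            suspects.append((index, f"U+{ord(ch):04X}", category))
    return suspects


# U+0093 is where the CP1252 byte for a left double quote lands when misdecoded;
# U+E000 is the first Private Use Area code point.
print(find_suspect_chars("a\u0093b\ue000"))
```

Properly encoded curly quotes (U+2018, U+2019, U+201C, U+201D) are ordinary punctuation and are not flagged by this check.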
I have not observed that the kerning issues with adjacent quote marks (or with "rn" and "m", for that matter) have been rendered moot by modern text layout systems. I have no particular affinity for those templates, but I do care about the issue they are attempting to solve. I also do not think making templates to deal with this issue that are consistent (what are the problems with the existing ones?) should be impossible. --Xover (talk) 08:06, 7 July 2019 (UTC)
Modern text layout systems are well-capable of dealing with kerning. If they don't, well, that's still stepping into their territory. There's no way of making such templates consistently used unless we make a big fuss about them, which I strongly object to. I haven't seen them in any text I've worked on.--Prosfilaes (talk) 08:36, 7 July 2019 (UTC)
That they have advanced support for kerning (which they do, even on Linux) does not mean they can intuit the need for such automatically: we need to make use of such facilities for anything to happen. --Xover (talk) 08:52, 7 July 2019 (UTC)
In a TrueType font, there is a table listing pairs of characters and the space that needs to be added or removed between them. If "' needs extra space, then that table should list the amount of extra space it needs and the typesetting program should adjust the distance between the two glyphs appropriately. See w:Kerning.--Prosfilaes (talk) 12:31, 7 July 2019 (UTC)
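The pair-table idea described above can be modelled in a few lines. This Python sketch is a toy model only: the advance widths and kerning values are invented for illustration and are not taken from any real font, and the function name is hypothetical. It shows how a layout engine that applies such a table would widen or tighten the gap between two specific glyphs.

```python
# Hypothetical advance widths in font units (not taken from any real font).
ADVANCE = {'"': 400, "'": 200, "A": 650, "V": 650}

# Hypothetical kerning pairs: (left glyph, right glyph) -> adjustment in font units.
KERNING = {
    ('"', "'"): 60,   # push a double quote and a following single quote apart
    ("A", "V"): -80,  # pull A and V together
}


def line_width(text: str) -> int:
    """Sum the glyph advances, adding the kerning adjustment for each adjacent pair."""
    width = 0
    for index, ch in enumerate(text):
        width += ADVANCE[ch]
        if index + 1 < len(text):
            # Pairs absent from the table get no adjustment.
            width += KERNING.get((ch, text[index + 1]), 0)
    return width


print(line_width("\"'"))  # 400 + 60 + 200
print(line_width("'\""))  # no pair listed in this order, so 200 + 400
```

Whether a given browser and font actually apply such adjustments to adjacent quote marks is a separate question that this model does not settle.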
Yes, that is indeed how the TrueType specification handles it; and OpenType has even more advanced features for this. However, so far as I know, no web browser on any operating system does this automatically, and the CSS features for explicitly enabling it are only partly supported (and would require some sort of markup on our side in any case). And even with that support it would require an OpenType font which has the appropriate kerning setting for these pairs, and that was available for us to use, which mythical beast may exist but I couldn't name one off the top of my head. I agree that it would be very nice if we didn't have to worry about this, but, again so far as I know, that is not actually the world we live in. If you know otherwise I'd be happy to get rid of those templates even if we only use straight quotes: they're nobody's idea of a perfect solution, they're just the best we've got available.--Xover (talk) 15:22, 7 July 2019 (UTC)
I have never used these templates; "'this'" or “‘this’” have always been completely adequate. If/when browser support for kerning becomes more common, then the appearance will be improved slightly, but in the meantime there is no need for manually padding the punctuation imho. —Beleg Tâl (talk) 20:11, 7 July 2019 (UTC)
I'd say pretty much this is the definition of a problem we don't have to deal with. We send "' down the line, and the other side renders it. If that rendering of a common pair of characters is unsatisfactory, the spirit of HTML and text transcription is that we don't know their fonts, their system. If systems don't do this automatically, and fonts don't set kerning for these pairs, then obviously it's not considered a big problem.--Prosfilaes (talk) 21:42, 7 July 2019 (UTC)
That's a fair stance (both of you). I don't agree with it—it goes to readability so a similar argument applies to this as to using typographers quote marks (or distinguishing between plain, en-, or em-dashes)—but it is absolutely an issue it is reasonable to consider falling within the limits of "good enough". Thus the question, above, of whether what we are trying to achieve in this discussion is flexibility for our contributors or a better reading experience for our readers. What the goal is affects the calculus of how much effort to put into stuff like getting typography and layout correct vs. getting it "good enough" (wherever you draw that line in general) and letting the users' browsers deal with it. --Xover (talk) 04:31, 8 July 2019 (UTC)
I think the difference is here that users' browsers can't add curly quotes properly; it is up to human intelligence to add them. On the other hand, we can't add space properly, given that we don't know what fonts are being used. In the long run, properly recording the characters that are there will help all usages of the text, where manually kerning characters will help only certain users, and make usages that don't reflect current web browsers more complex.--Prosfilaes (talk) 11:24, 8 July 2019 (UTC)
I use the {{" '}} group in my projects after seeing them in use elsewhere since it seems like a neat solution to the problem, and the templates can easily be made inoperative whenever browsers do get round to displaying them better. Given that I regularly have to add manual padding when typesetting documents in InDesign using professional typefaces I am somewhat sceptical about how effectively those TrueType and OpenType specifications are being used by fonts at the moment. As Xover said above fonts using the full scope of these settings properly are a bit of a mythical beast.
For the record, in Arial on Windows 10 and the current version of Chrome I can't distinguish "' from '" at all (or for that matter '''). In curly quotes, though, it seems much easier to distinguish “‘ and ‘“, so it's possible that the templates are simply unneeded in that case. —Nizolan (talk) 15:15, 8 July 2019 (UTC)
Back to topic of curly quotes, I checked the French, Italian, and German Wikisources and they all seem to use curly quotes by default. (It seems Spanish has their own style of quotation marks and doesn't use apostrophes.) Kaldari (talk) 14:35, 9 July 2019 (UTC)
@Kaldari: It looks to me like there's enough consensus in the discussion above to justify a more formal proposal. Do you have thoughts of putting something together? -Pete (talk) 20:57, 24 July 2019 (UTC)
@Peteforsyth: Unfortunately, I'm still somewhat of a newbie on Wikisource, so I'm not familiar with community practices here. How do these things work? Do you call a vote? Write an RFC? Honestly, I would love it if a more experienced Wikisource editor took over the process from here. Kaldari (talk) 22:10, 24 July 2019 (UTC)
@Kaldari: French uses the same style quotes as Spanish, namely « », so I'm not sure how you determined that they use curly quotes. Italian Wikisource is inconsistent, and they currently have about six active contributors, but Italian also has a different set of quotes, namely “ „ , than English does, so they (along with Spanish and French) are not a good basis for comparison. --EncycloPetey (talk) 01:49, 26 July 2019 (UTC)
@EncycloPetey: True, but the French, German, and Italian Wikisources seem to use the curly apostrophe pretty consistently. My point is, our use of straight quotes seems to be the exception, not the rule. Kaldari (talk) 17:44, 26 July 2019 (UTC)
@Kaldari: I still don't know how you came to that conclusion, because I looked at the same sites and didn't reach that conclusion. You should also consider that the apostrophe in languages like French and Czech can be coded differently. For example, there is a single Unicode character available for French ľ that is typically used, and French style is to prefer that over using separate characters. That being the case, it is inappropriate to make comparison because English uses no such character. --EncycloPetey (talk) 23:36, 26 July 2019 (UTC)
When comparing other projects, Czech Wikisource and also Czech Wikipedia use curly quotes, though a „different type“. "Straight quotes" are just tolerated but not recommended. --Jan Kameníček (talk) 18:43, 26 July 2019 (UTC)
  • As the consensus appears to be against the sole restriction on typewriter quotes (",") against “curly” quotes (“,”), I have edited the appropriate section of the style guide to reflect this discussion. If any one wishes to revise the wording, that would be more appropriate. TE(æ)A,ea. (talk) 21:18, 25 July 2019 (UTC).
    I disagree that such a conclusion has been reached that there should be change to the style guide, and have undone the change. — billinghurst sDrewth 22:10, 25 July 2019 (UTC)
    • Why do you disagree? There is no one who reasonably supports the current guideline; there is only a difference of opinion on the proper wording and implementation. TE(æ)A,ea. (talk) 23:33, 25 July 2019 (UTC).
      Did you read my sentence up to the comma? Could it be any more specific? PS. Don't flip me any metaphorical bird, I have done my time here, and earned my rights the hard way, and supported them with work the whole way through my many years here. — billinghurst sDrewth 04:42, 26 July 2019 (UTC)
    • Did you read mine? The reasoning you provided earlier aligns with my change to the style guide: that there should be consistency within a work, in preference to authoritarian commands without reasoning. TE(æ)A,ea. (talk) 12:39, 26 July 2019 (UTC).
      "no one reasonably supports" is a way to wave your hand and dismiss any counterarguments, and suggests there is not a consensus. The last post before you made your comment was a move to write an explicit proposal and vote. It is therefore inappropriate to circumvent that process by offhandedly dismissing the whole thing on a whim. --EncycloPetey (talk) 01:49, 26 July 2019 (UTC)

Oh my, seems there's still some tension around here. I'd suggest there's no benefit to implementing a change before everybody involved has had the chance to review it and comment on it. I'm happy to prepare a more specific proposal, based on the gargantuan input in this section and that above, as I suggested earlier; until then, I'm not sure there's any pressing need to make changes to the style guide. That said, I appreciate @TE(æ)A,ea.: making the effort, and it might be useful to consult Special:Diff/9483059 in building the proposal. If anybody wants to propose alternative wording, feel free to do so here or on my talk page, and I'll do my best to incorporate it. -Pete (talk) 00:10, 27 July 2019 (UTC)

The proposal should have three options: 1. Keep the style guide as is (straight quotes only) 2. Write style guidelines that consider curly quotes to be the normal, standard usage 3. Allow curly quotes but only under certain circumstances or with certain restrictions, or do a gradual implementation of option 2, or a test implementation of it, or..... This option is a catchall meaning "not 1 or 2" and if we choose it we'll be starting another round of discussions leading to another proposal. Levana Taylor (talk) 00:32, 27 July 2019 (UTC)
@Peteforsyth: I would very much appreciate a proposed guideline that addresses directly (answers) the questions above, as opposed to a more general "Yes (figure out details later)" vs. "No" approach. On this issue I find myself leaning towards being conservative (perhaps overly so) and preserving the status quo when not all consequences are clear. I therefore think it would be best to ask this question in the form of a fully-formed assertion: "this is what the guideline should be, do you agree or disagree?". If a significant number disagree we revise the proposed guideline to address the concerns raised and try again. PS. I'm happy to help out here, so never hesitate to ask or ping me, but I don't think I'll actually be of any help as I am too uncertain and conflicted on this issue. --Xover (talk) 10:49, 27 July 2019 (UTC)

Break (curly quotes) for discussing changes to guideline

  1. Use typewriter quotation marks ("straight," not “curly”). Status quo, unchanged from original.
  2. Use the same quotation marks (either "straight" or “curly”) in a work, but not both. This, possibly with rewording, was my suggestion; this proposal favors standardization within a work, but not a universal idea.
  3. Use the same quotation marks as the work presents; i. e., if the work shows "straight" quotation marks, use those, or if it shows “curly” quotation marks, use those instead. This would likely be the most simple proposal, as there would be no need to change what the original source shows.
  4. It is allowed to use either straight quotation marks or the same kind of quotation marks as the original work presents, with the following condition: Other than straight quotation marks are permitted only if it can be ensured that they will be used consistently in the whole work. If such consistency cannot be ensured (e. g. because of a large number of contributors to the work or because of disagreement of some of them), straight quotation marks are recommended.
  5. The use of curly quotes is encouraged, and considered to be the default style in all texts, with or without a source. In cases where the source text uses a different typography (guillemets « », typewriter quotes, or whatever) that style can be used (encouraged, but not required). They are the standard form of “ ” ‘ ’ and " '. The use of typewriter quotes in texts where the original doesn't have them should be considered deprecated, to be phased out.
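
The consistency condition that appears in options 2 and 4 ("the same quotation marks in a work, but not both") could be checked mechanically. The following is only a rough sketch under stated assumptions: the function names are hypothetical, a work is assumed to be available as a list of plain-text page strings, and only the four characters “ ” ‘ ’ are treated as "curly". It does not distinguish apostrophes from single quotes, and it ignores guillemets and other quotation characters.

```python
# Characters treated as each style in this sketch (an assumption, not policy).
STRAIGHT = {'"', "'"}
CURLY = {"\u201c", "\u201d", "\u2018", "\u2019"}  # “ ” ‘ ’


def quote_styles(text: str) -> set[str]:
    """Return which quote styles ("straight", "curly") occur in the text."""
    styles = set()
    if any(ch in STRAIGHT for ch in text):
        styles.add("straight")
    if any(ch in CURLY for ch in text):
        styles.add("curly")
    return styles


def is_consistent(pages: list[str]) -> bool:
    """True if the pages of one work use at most one quote style."""
    used = set()
    for page in pages:
        used |= quote_styles(page)
    return len(used) <= 1
```

A work whose pages mix both styles would make `is_consistent` return `False`; what to do about it (which style wins) is exactly what the options above disagree on, so the sketch deliberately does not decide that.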

If any one else has an idea, write it out above; however, if you wish to reword an above proposal, please mark that as indicated above. I support either no. 2 or no. 3. TE(æ)A,ea. (talk) 12:33, 27 July 2019 (UTC).

I have added one more point. --Jan Kameníček (talk) 13:13, 27 July 2019 (UTC)
  • I have changed no. 4 to a sub-section of no. 2, as it appears to be quite similar to no. 2. TE(æ)A,ea. (talk) 21:08, 27 July 2019 (UTC).
    @TE(æ)A,ea.: Seems to me that numbering it as "1." again is quite confusing for discussions. What is more, it is very different from no. 2 because no. 2 lets the contributors choose only between two kinds of quotes, and also lets them choose the curly quotes even if they are not used in the original. My suggestion a) lets them choose any kind of quotes, but b) only if they are used in the original and c) not in complex works. --Jan Kameníček (talk) 06:12, 28 July 2019 (UTC)
    I think it's simpler to leave it as Jan wrote it to ensure that each proposal has its own number, and I've reverted it to that accordingly. —Nizolan (talk) 12:53, 28 July 2019 (UTC)
    No. 4 revised after discussion below, to make it more clear. --Jan Kameníček (talk) 05:11, 30 July 2019 (UTC)
I support #2, [edit: #4 per Jan's rewording below,] or, failing that, retaining the status quo per #1: be consistent but do not enforce curly quotes. I strongly disagree with #3 being the "simplest proposal"; given that the vast majority of texts here are typeset using curly quotes this is effectively the same as enforcing their usage and would amount to a potentially enormous amount of extra work. Many OCR programmes, such as Google's, do not automatically add curly quotes so this would need to be done manually. I don't see the point of #4. —Nizolan (talk) 17:54, 27 July 2019 (UTC)
To explain the point: I believe that contributors should be allowed to use any quotes that are presented in a work (e. g. curly), but they should not be forced to it, and should be allowed to use straight quotes only, if they prefer so. (this is ensured by no. 2, but not by no. 3). On the other hand I do not consider it good to allow to use curly quotes if there are straight quotes in the original (this is ensured by no. 3, but not by no. 2). Therefore I suggested the new point, which ensures this + recommends how to deal with complex works where contributors are not able to reach consensus which quotes to use. Number 2 gives the contributors the right of choice, but does not deal with the situation when the contributors have different opinions. Thinking about it again, it can happen not only with large works, but with any work where 2 or more people contribute, so I am rewording the point for "... If two or more people contribute to one work and are not able to reach consensus what kind of quotation marks to use, straight quotation marks are recommended." --Jan Kameníček (talk) 15:31, 28 July 2019 (UTC)
@Jan.Kamenicek: I see your point regarding curly quotes where typewriter quotes are used in the original. I would say as currently worded #4 is problematic for a number of reasons, though. As worded, it seems to forbid using straight quotes unless they are present in the original, as per #3. I'm not sure that works that are internally inconsistent are enough of a problem to need special mention, but as currently worded the suggestion doesn't say anything about what to do when they are. The last point seems over-complicated, imo: Wikipedia's "don't mess around with an article's established formatting preferences" seems like the most straightforward way to prevent these disputes.
I would personally firm it up along these lines: "1. Curly quotes may be used if and only if the work being typeset consistently uses curly quotes. 2. The marks used in proofread material should be consistent. Contributors to projects that have already started should ensure that they follow whatever convention has already been adopted in the work." And perhaps a final clarification for exceptional cases where curly or straight quotes are required in a particular instance regardless of general convention (e.g. in a transliteration system that distinguishes them) would also be helpful. —Nizolan (talk) 00:58, 29 July 2019 (UTC)
@Nizolan: It definitely does not forbid straight quotes, there is explicitly written: "Use either straight quotation marks or the same quotation marks as the work presents". So if there are e. g. curly quotation marks in the work, contributors can use straight or curly. If there are double angle quotation marks, contributors can use straight or double angle ones. If there are straight quotation marks, they can use only straight. I think this point is quite clear.
The rule says what to do so that there were not internally inconsistent works. I do not think it is necessary to say what to do, if such situation occurs because somebody has broken the rule (e.g. a bot can be applied to make it right???).
As for "follow whatever convention has already been adopted in the work." This is exactly what I wanted to avoid, because IMO nobody should be forced to use curly quotes if they do not want to. Contributors who start a large work and know that they will need help of other contributors should also know that straight quotes are the default formatting and curly ones can be used only if they can ensure that the others are willing to follow. It is undesirable that somebody starts a new encyclopaedia, adds first 50 articles with curly quotes, and willy-nilly all others who come later have to follow it.
The rule "don't mess around with an article's established formatting preferences" does not help with works consisting of many articles, where different contributors might establish different formatting in different articles, which is undesirable. The last part of no. 4 is definitely open for better rewording, but the result should recommend straight quotes which can be changed in favour of the curly or other quotes only if the contributors are all able to agree on that.
I am not sure what you mean by "curly or straight quotes are required in a particular instance regardless of general convention", can you explain it in more detail, please? --Jan Kameníček (talk) 06:16, 29 July 2019 (UTC)
@Jan.Kamenicek: "as the work presents" means "use the formatting of the work", it doesn't give an option. If you meant something else it should be rewritten. On the last point, e.g. in transcription of Arabic ’ and ‘ represent different sounds; ' is used in some Russian transcriptions, etc., these must be preserved regardless of the general convention of the work. —Nizolan (talk) 10:34, 29 July 2019 (UTC)
@Nizolan: I am afraid I really do not understand the problem: "Use EITHER straight quotation marks OR the same quotation marks as the work presents." The option is clearly expressed.
As for the transcription symbols: these are not quotation marks and so imo they are not in the scope of this rule. --Jan Kameníček (talk) 12:22, 29 July 2019 (UTC)
@Jan.Kamenicek: I'll try to explain the problem for you: either ... or can be inclusive or exclusive. If read as exclusive, then your guideline states that if the usage is consistent, use the same marks; if it isn't, use typewriter marks. So, based on what you want it to say, it is not clearly expressed. (Edit: Also just to add, if it did just say "Use EITHER straight quotation marks OR the same quotation marks as the work presents" I think that would be fine—the problem comes with the "under the condition…" modifier.) —Nizolan (talk) 12:41, 29 July 2019 (UTC)
@Nizolan: Now I see your point. So what would you say to the following wording: "It is allowed to use either straight quotation marks or the same kind of quotation marks as the original work presents, with the following condition: Other than straight quotation marks are permitted only if it can be ensured that they will be used consistently in the whole work. If such consistency cannot be ensured (e. g. because of a large number of contributors to the work or because of disagreement of some of them), straight quotation marks are recommended." --Jan Kameníček (talk) 15:47, 29 July 2019 (UTC)
That seems decent to me. I'll add it as another option I'd support. —Nizolan (talk) 17:08, 29 July 2019 (UTC)
I support this point (no. 4) too, or, if it does not succeed, no. 3. --Jan Kameníček (talk) 05:20, 30 July 2019 (UTC)
I support statement 2, and do not support the others. In my opinion, imitating the source with respect to straight vs curly quotes is like imitating the source with respect to serif vs sans serif font. I would word statement 2 like follows: Either straight quotes or curly quotes are acceptable, but a consistent style of quotes should be used within a single work. Straight quotes are advised in collaborative projects to ensure consistency. —Beleg Tâl (talk) 12:29, 30 July 2019 (UTC)
Also, guillemets and other marks used for quotations are completely different characters, and should be used as in the source regardless of the straight-vs-curly conventions being used. —Beleg Tâl (talk) 12:32, 30 July 2019 (UTC)
Agreed on this. —Nizolan (talk) 23:32, 30 July 2019 (UTC)
┌──────┘
@Beleg Tâl: Various kinds of quotes are different characters and have their own Unicode or html codes. If you change Serif into Sans serif, only shapes of letters change. If you change one kind of quotes for the other, you changed the chosen character. If you change the kind of font used for quotes, their shape may change but the characters still stay.
It is also not clear what is meant by "curly" quotes, often mentioned here, since several characters used for quoting have curly shapes, and can be combined in various ways, e.g. ” ” (&rdquo; &rdquo;); ‟ ” (&#8223; &rdquo;); „“ (&bdquo; &ldquo;); „” (&bdquo; &rdquo;); and their single variants like ’ ’ (&rsquo; &rsquo;) ...
What is more, various national typographic systems use different kinds of quotes and so the chosen kind can sometimes also have this national connotation (which is the reason why I decided to use „“ in Page:Guide to the Bohemian section and to the Kingdom of Bohemia - 1906.djvu/16 and the following pages, as it was published in Bohemia and so uses the kind of quotes typical for this region.) --Jan Kameníček (talk) 07:18, 31 July 2019 (UTC)
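
The point that these are distinct characters rather than font variants can be seen by decoding the HTML entities mentioned above into their Unicode code points. This is an illustrative sketch only: the `describe` helper and the `ENTITIES` list are made up for the example, and it uses nothing beyond Python's standard `html` and `unicodedata` modules.

```python
import html
import unicodedata

# Entities taken from the comment above, plus the left single/double forms.
ENTITIES = ["&ldquo;", "&rdquo;", "&bdquo;", "&#8223;", "&lsquo;", "&rsquo;"]


def describe(entity: str) -> tuple[str, str, str]:
    """Decode one entity (or plain character) to (char, code point, Unicode name)."""
    ch = html.unescape(entity)
    return ch, f"U+{ord(ch):04X}", unicodedata.name(ch)


if __name__ == "__main__":
    for entity in ENTITIES:
        print(entity, *describe(entity))
    # The typewriter quote is a separate character again.
    print('"', *describe('"'))
```

Each entity decodes to its own code point, so replacing one kind of quote with another changes the stored text, whereas changing the font changes only how a given code point is drawn.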
@Jan.Kamenicek: Curly quotes in this discussion are the ones that can be replaced with " and '—specifically these: “ ” ‘ ’ and possibly . Alternative quotation symbols like should never be replaced with these upper quotation marks. The fact that there does not exist any "straight" variant of is, as I said earlier in this discussion, one of the biggest reasons curly quotes should be permitted. However, in general, the appearance of vs " in a source scan, like the appearance of a vs ɑ, or s vs ſ, is generally an accident of publication and typeface, and is therefore beyond what Wikisource editors should be expected to reproduce. —Beleg Tâl (talk) 12:42, 31 July 2019 (UTC)
I modified Point 5 to reflect this. Levana Taylor (talk) 17:04, 31 July 2019 (UTC)

I have added another point, which positively encourages the use of curly quotes, just to round out the set of proposals by going the maximum distance. I'm not saying this is my favorite option, but I think it's at least a plausible one. Anyone who can think of a less extreme form of this option should add it as a subsection. Levana Taylor (talk) 16:51, 29 July 2019 (UTC)

Add me as someone who is in favor of typographic quotation marks and deprecating straight ones. —Justin (koavf)TCM 06:56, 2 August 2019 (UTC)

  • As there has been a recent dearth of discussion, I will follow the beliefs of the majority of editors and modify the style guide accordingly. The new wording is a simplified version of no. 4. unsigned comment by TE(æ)A,ea. (talk) 23:49, 11 August 2019‎.
Hang on, it’s not decided which option we prefer yet! For example, I have not yet put on record that I like number 3 best, followed by number 2, and don’t favor number 4. Levana Taylor (talk) 10:32, 12 August 2019 (UTC)
I have been aware of the above discussion, but have not contributed as I was waiting for a formal proposal. All that's happened so far is a discussion on how to word the proposal. Once this has been decided, then a formal proposal with the suggested options needs to be made in a new section away from this tldr discussion. Beeswaxcandle (talk) 07:09, 13 August 2019 (UTC)
I've just been alerted to this discussion, and have added my support for flexibility for well-established projects to the RFC above. In particular, the Wikisource:WikiProject 1911 Encyclopædia Britannica has preferred curly quotes and apostrophes for years. DavidBrooks (talk) 16:15, 28 August 2019 (UTC)

I thought I’d add another point. Any proposal that quotation marks should be represented by straight quotes would, I assume, be accompanied by requiring that apostrophes and embedded quotations be rendered using the straight apostrophe. If that is indeed part of the proposal, then it must include guidance on what to do with various forms of raised, reversed, and turned comma. Marks like Arabic rough breathing, Hebrew ayin (in Latin alphabet transcription), and the Gaelic raised “c” all appear in 1911 Encyclopædia Britannica, and all are properly rendered in the printed version. These various marks have distinct Unicode code points. To me, rendering these as the ASCII apostrophe is simply semantically wrong—which is what leads me to consider straight quotes and apostrophes as equally semantically wrong and ahistorical. We shouldn’t be hamstrung by the limitations that ANSI suffered from in the 1960’s.

Agreeing with you, and would like to mention that a single-quote and an apostrophe are two distinct punctuation marks, even if they resemble each other a lot. Rendering both as a straight mark collapses their appearance even more than necessary: with a curve, at least you can tell the difference between an apostrophe and a single quote at the left side of a word. Levana Taylor (talk) 16:57, 28 August 2019 (UTC)
@DavidBrooks: I think it goes without saying that marks like Unicode ayin ʿ that are formally distinct should be represented as such regardless of the guidelines on straight quotes—they are different symbols. The uncertainty, which I mentioned somewhere above, is over cases where a work uses curly quotes to represent something like ayin ‘ vs aleph ’. In that case, I've suggested that since turning these into straight marks would remove necessary semantic information, the appropriate curly mark must be preserved in the transcription regardless of the general guideline or whatever quotation style is being used in transcribing the work itself. —Nizolan (talk) 23:48, 29 August 2019 (UTC)

When to let someone else find typos and omissions?

I proofread Index:Public General Statutes 1896.djvu a few years ago, yet on review I am still finding a lot of mistakes and scanos. At what point is it time to admit you need a second person to look as well? When you think it's perfect, or when you "know" it's perfect? ShakespeareFan00 (talk) 17:00, 18 August 2019 (UTC)

If you "know" a transcription is perfect, then either you are mistaken or it needs no review. The same applies if you "think" a transcription is perfect. Therefore this question misses the mark. Everyone should admit that they make mistakes and, when seeking perfection, always need someone else to check their work. Admitting that, we then ask ourselves what sensible steps we can take to improve the quality of our work so we make fewer mistakes. BethNaught (talk) 22:38, 18 August 2019 (UTC)
yeah, i find the philosophizing over degrees of doubt unproductive: it is essentialism used to justify conduct. as at PRP on commons, the misuse of doubt tends to undermine its evocation. we all improve over time, and this is a consensus project with error levels subject to the validater’s checking. Slowking4Rama's revenge 20:10, 20 August 2019 (UTC)
Everyone faces the same issues with their proofreading. The boundaries of "think" or "know" are fluid and vary for the same individual. This is why validation by others is important. But even validated pages have overlooked errors. — Ineuw (talk) 21:09, 3 September 2019 (UTC)

New tools and IP masking

14:19, 21 August 2019 (UTC)

Template issue

I've used {{Plainlist}} over multiple pages, by closing and reopening them in the headers/footers, but they don't render correctly on Aerodynamics (Lanchester)/Contents. Where have I gone wrong? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:14, 21 August 2019 (UTC)

You have to use the "split" templates: {{plainlist/s}} and {{plainlist/e}}, e.g. this diff. Because the individual pages are rendered one at a time and then transcluded, all templates opened with {{ in the page body must also be closed within the page body. Otherwise the {{ and }} are part of two separate rendering processes. By using split templates, you can "open" the template's effect, while not crossing the body/footer boundary with an unclosed template syntax. Inductiveloadtalk/contribs 15:43, 21 August 2019 (UTC)
You can use the simple process on the intermediate pages as it will still have the complete template, and display properly, the important part is to have the open (/s) and close (/e) versions on the first and last pages. We tend to just use the /s & /e versions wherever they are split as it is more transparent and informative. — billinghurst sDrewth 04:24, 23 August 2019 (UTC)
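
The rule described above (every {{ opened in a page body must be closed in that same body) can be roughly approximated by counting braces per page. This is a minimal sketch, assuming the page body is available as a plain wikitext string; the function name is hypothetical, and a simple count like this ignores nowiki sections, template parameters written with triple braces, and other parser subtleties.

```python
def unclosed_templates(body: str) -> int:
    """Return the number of '{{' openers minus '}}' closers in one page body.

    A positive result suggests a template was opened in the body but
    closed elsewhere (for example in the footer); zero suggests balance.
    """
    return body.count("{{") - body.count("}}")


# Opening a wrapping template in the body and closing it in the footer
# leaves the body unbalanced (the "|" call form here is an assumption).
broken_body = "{{plainlist|\n* Chapter I\n* Chapter II"

# The split form is a complete template call on its own, so the body balances.
split_body = "{{plainlist/s}}\n* Chapter I\n* Chapter II"
```

With these example strings, `unclosed_templates(broken_body)` is 1 and `unclosed_templates(split_body)` is 0, which matches the advice to use the /s and /e versions wherever the list is split across pages.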

Old taxonomists, published, but missing from Wikisource

This Wikidata query may be of interest:

https://w.wiki/7MQ

It shows people, who died before 1949 (so whose works are out of copyright, by the lifetime + 70 years rule), with a Wikispecies entry, but no entry in any Wikisource.

There are currently 4,413 results.

Some caveats:

  • If these people co-authored papers with others who died later (or still live), those papers are not out of copyright
  • Some of them authored no works in English
  • Some taxonomists, who have an entry on Wikispecies and in a non-English Wikisource (so are not included in the results), could still have an entry here
  • Wikispecies lists no publications, for some of their entries.

-- Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:08, 22 August 2019 (UTC)
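
The 1949 cutoff in the query follows from simple arithmetic on the lifetime + 70 years rule. As a small sketch (the function names are hypothetical, and this models only the life + 70 rule named above, not any other copyright regime or the actual query behind the link):

```python
def death_year_cutoff(current_year: int, term_years: int = 70) -> int:
    """Authors who died before the returned year have passed life + term."""
    return current_year - term_years


def is_past_term(death_year: int, current_year: int, term_years: int = 70) -> bool:
    """True if the term, counted from the end of the death year, has expired."""
    return death_year < death_year_cutoff(current_year, term_years)
```

For 2019 the cutoff is 1949: an author who died in 1948 passes the test, while one who died in 1949 does not.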

Very interesting! Just one small problem: I have noticed that quite a lot of people (e.g. Jean Antoine Arthur Gris) are being listed twice in the query. --Jan Kameníček (talk) 17:36, 22 August 2019 (UTC)
Fixed. That brings it down to 4243 results. —Beleg Tâl (talk) 18:33, 22 August 2019 (UTC)
I will try to proofread some texts of a couple of them. --Jan Kameníček (talk) 19:42, 22 August 2019 (UTC)
There's a big caveat that we use US copyright laws, so life+70 is irrelevant. Lots of those authors may have published only after 1923, and lots of authors with US-PD works may have died less than 70 years ago.--Prosfilaes (talk) 03:41, 23 August 2019 (UTC)
Yes and no. We would still apply copyright status and copyright tags, so pages would be tagged BUT I still don't think that we should be creating pages if we don't have one of: works reproduced, or a list of works, or the pages are wikilinked targets for authors in reproduced works. If we are adding something of value, then create the author page, if it is to contain the {{populate}} tag and nothing else, then please don't bother, there is not a lot of value in such pages for us or for users. — billinghurst sDrewth 06:40, 23 August 2019 (UTC)
I'd also note that we have 1463 works with non-existent author pages already here on Wikisource, which in my opinion is a slightly higher priority if people want to create missing author pages. —Beleg Tâl (talk) 13:06, 23 August 2019 (UTC)
  • If someone (possibly using a bot) could create a page that lists all the names, I could go through and mark which names have (1) no works published, (2) no works published in the English, or (3) no works published in the public domain. TE(æ)A,ea. (talk) 21:30, 23 August 2019 (UTC).
    I truly don't believe that it is a priority; tail wagging dog stuff. Create stuff that is known to be needed, not stuff that does not represent what we have. — billinghurst sDrewth 06:23, 24 August 2019 (UTC)

Multiple transclusions

@RaboKarbakian: Why do we have both Amazing Stories/Volume 01/Number 02/The Infinite Vision and Amazing Stories/Volume 01/The Infinite Vision? IMO, we should usually transclude a work at one spot. I have a mild preference for the first one, but just one so things don't have to be fixed in multiple places.--Prosfilaes (talk) 16:10, 23 August 2019 (UTC)

WS:CSD#G4 says we can delete the duplicate copy - it looks like Amazing Stories/Volume ##/Number ##/Work Name is the standard for this work, so duplicates in other locations should be deleted. Users who can't delete the pages should tag them with {{sdelete}} so that they will be deleted. —Beleg Tâl (talk) 17:35, 23 August 2019 (UTC)
Turn the unwanted into a redirect, either hard or dated soft. — billinghurst sDrewth 06:24, 24 August 2019 (UTC)
From The Infinite Vision, sure. But hard redirects from a subpage to a different subpage are generally deleted anyway as "unneeded redirect", so we may as well skip the redirect step. —Beleg Tâl (talk) 13:29, 24 August 2019 (UTC)
I was making a whole volume and the individual stories. Like taxonomy with family and species. Complex? Maybe. Complicated -- not as much. -- RaboKarbakian (talk) 14:57, 24 August 2019 (UTC)
Is there a problem with having the story intact? Is there a better way? My answer to both was no, but maybe I am wrong. --RaboKarbakian (talk) 15:25, 24 August 2019 (UTC)
@Prosfilaes: --RaboKarbakian (talk) 17:27, 24 August 2019 (UTC)
Duplicating entries like that means that things have to be fixed twice. It's also confusing; are Amazing Stories/Volume 01/Number 02/The Infinite Vision and Amazing Stories/Volume 01/The Infinite Vision two distinct editions? There's nothing on the pages that makes it clear that they're not. It's impossible to link Wikidata for one edition to two separate pages.--Prosfilaes (talk) 10:57, 25 August 2019 (UTC)
see also Wikisource:Proposed_deletions#The_Dragon-Fly--Slowking4Rama's revenge 13:01, 25 August 2019 (UTC)
... That is completely different, because both versions of "The Dragon-Fly" are clearly different editions. —Beleg Tâl (talk) 13:12, 25 August 2019 (UTC)
we differ on the interpretation of version. they are typographically the same work, as in this example. so now you need a thesis of how a book and periodical are different editions, and two periodicals are not. Slowking4Rama's revenge 15:23, 25 August 2019 (UTC)
Two periodicals are different editions; one periodical transcluded twice is not.--Prosfilaes (talk) 18:10, 25 August 2019 (UTC)
@Prosfilaes: Transclusion is not a duplication. The editable original (as seen with the scanned page) is the only thing that might need "fixing". It is more of a reflection than a duplication. You can check this yourself by attempting to alter a properly transcluded page.... I wanted the whole story for the author links and the edition for the magazine to be as it was. --RaboKarbakian (talk) 13:33, 25 August 2019 (UTC)
Transclusion still requires duplication of the header and surrounding text. It's just not clear to the reader that they're different texts, and it's not clear to the editors which page to link to. This started because I was creating Author:Charles C. Winn, and didn't know which to link to.
I'm not sure what your goal exactly is. Why is having Amazing Stories/Volume 01/... better than Amazing Stories/Volume 01/Issue 0x/...? They're all linked together either way. Is this for serialized works? It would be better to link serialized works together in their header than have multiple transclusions of them.--Prosfilaes (talk) 14:23, 25 August 2019 (UTC)
The original publication was in editions with parts of stories. I imagine people waiting for the next edition to come via mail or in the local store or whatever. Like the cliffhanger at the end of a television series. The edition was how these stories were first shared with the public. 100 years later, some of these stories are well known and some of the authors also. Via current technology they can easily be presented both as they were originally published and as a whole. Wikidata can more easily handle them both ways. No additional editing. Author links are easier and if you would like more journalistic accuracy on the whole story a "published in such and such edition on such and such date" statement can be styled and templated in. It's beautiful and simple and not easily accomplished at other ebook sites.--RaboKarbakian (talk) 23:56, 25 August 2019 (UTC)
I think you're confusing "edition" and "issue". An edition would be the same work, but slightly different from other editions. Magazines come in issues.
Magazine volumes do not necessarily contain the whole of a serialized work. Weird Tales, Volume 4, Issue 1, for example, contains the second part of Draconda, continued from the previous issue. Instead of creating a volume collection, I would think it better to take just the serialized works, and transclude them as, say, part by part as Weird Tales/Volume 4/Issue 1/Draconda (part 2) and as one part directly in mainspace as Draconda. This reduces the duplication to just the works that need to be duplicated, and keeps the duplication in different forms.--Prosfilaes (talk) 07:47, 26 August 2019 (UTC)
Oops. s/edition/issue/g --RaboKarbakian (talk) 13:42, 26 August 2019 (UTC)

Wikidata, volumes and editions, and serialized tales

The whole story as shown under the volume namespace gets linked at wikidata to the author and the volume.

The edition also gets linked to the volume.

The story links to every edition it was "part of".

The edition links back to every story that it has a portion of.

It is complicated to look at as a whole, but viewed from different perspectives it is complete.

(I wrote this on Prosfilaes talk page and pasted it here) --RaboKarbakian (talk) 00:05, 26 August 2019 (UTC)
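The link structure described in this post can be sketched as a small two-way mapping. This is only an illustration in Python, not how Wikidata stores anything: the page names, the `issue_contents` table, and both helper functions are hypothetical, and "issue" is used here for what the post calls "edition".

```python
from collections import defaultdict

# Hypothetical data: each issue page lists the stories it carries a portion of.
# All page names are invented for illustration.
issue_contents = {
    "Example Magazine/Volume 01/Number 01": ["Serial A", "Short Story B"],
    "Example Magazine/Volume 01/Number 02": ["Serial A", "Short Story C"],
}


def volume_of(issue_page: str) -> str:
    """Return the volume page of an issue (everything before the last slash)."""
    return issue_page.rsplit("/", 1)[0]


def story_to_issues(contents: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert issue -> stories into story -> issues, so links run both ways."""
    inverted = defaultdict(list)
    for issue, stories in contents.items():
        for story in stories:
            inverted[story].append(issue)
    return dict(inverted)


links = story_to_issues(issue_contents)
```

With this data, `links["Serial A"]` lists both issues, while each issue still lists the serial, which is the "story links to every issue, issue links back to every story" arrangement in miniature.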

Links to authors in Wikispecies

If an author, such as Ferdinand Stoliczka, has an entry in Wikispecies, the entry automatically is listed in the Sister projects box in the header. However, the link is labelled as "taxonomy", which is imo quite confusing. Is there a possibility to change it? --Jan Kameníček (talk) 08:13, 24 August 2019 (UTC)

It is our label, it can be whatever the community decides. It went through a discussion at the time, and it was a tough one to find the exactly right label for all links to wikispecies from all namespaces. — billinghurst sDrewth 08:16, 24 August 2019 (UTC)
"Wikispecies". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:29, 24 August 2019 (UTC)
Agree: It is easy, expected and should be OK for all namespaces. --Jan Kameníček (talk) 22:57, 24 August 2019 (UTC)

Template:unsigned2

Please will someone import w:Template:unsigned2 from en.Wikipedia? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:36, 24 August 2019 (UTC)

Can't you just use {{unsigned}}? Seems fairly redundant to have competing templates. — billinghurst sDrewth 23:53, 24 August 2019 (UTC)
They are not "competing". {{unsigned2}}, when it is Subst, leaves behind a copy of {{unsigned}}; but the inputs it takes are in reverse order, so can more easily be copied and pasted from page histories. This is all explained in the documentation of w:Template:unsigned2. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:41, 25 August 2019 (UTC)
They both put "unsigned", how is that not competing? We have been trying hard for years to not have template gumph and bloat, and to make do. Just put 2= and 1= in the front and you achieve the same thing with positional parameters. It isn't that hard. — billinghurst sDrewth 11:46, 25 August 2019 (UTC)
No, I won't "adapt" (per your edit summary), because {{unsigned}}, even with positional parameters, requires me to do more work (both cognitive, and physical keystrokes) than does {{unsigned2}}. They do not "compete" because one aids use of the other, and is not an alternative to it. {{unsigned2}} is used, without issue, on en.Wikipedia, seventeen other Wikipedias, Wikidata, Wikinews, Wikiquote, Wikiversity, Wiktionary, Meta-wiki, and Commons. Providing templates to make contributing to Wikisource easier - and less RSI-inducing - for volunteers is not "template gumph", nor "bloat". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:25, 25 August 2019 (UTC)
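The parameter-order point in this thread can be illustrated with a short Python sketch. The function names are hypothetical, and the sketch assumes that {{unsigned}} takes the user name first and the timestamp second; the only thing taken from the discussion itself is that {{unsigned2}} accepts its inputs in the reverse order and leaves a copy of {{unsigned}} behind.

```python
def unsigned_wikitext(user: str, timestamp: str) -> str:
    """Build an {{unsigned}} call: user first, timestamp second (assumed order)."""
    return "{{unsigned|" + user + "|" + timestamp + "}}"


def unsigned2_style(timestamp: str, user: str) -> str:
    """Take inputs in page-history order (timestamp, then user) and emit {{unsigned}}."""
    return unsigned_wikitext(user, timestamp)


# A page-history entry shows the timestamp before the user name (illustrative values).
history_entry = ("15:36, 24 August 2019", "ExampleUser")
result = unsigned2_style(*history_entry)
```

The wrapper does nothing except swap the arguments, which is the whole argument on both sides: it saves reordering when pasting from a history page, and it produces nothing that {{unsigned}} could not produce.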

Updated templates to support curly quotes

Hooray for the adoption of curly quotes! Who’s going to take care of creating equivalents for {{" '}} et al.? I think probably a template that simply places a gap of 0.15em between any pair of characters would do well enough. (Or 0.1em? I am no typographer). Levana Taylor (talk) 18:17, 24 August 2019 (UTC)

In my opinion, {{" '}} should be deprecated in favour of "' and “‘; spacing should be handled by the browser and the font and not by the content. That's just my opinion though. —Beleg Tâl (talk) 22:26, 25 August 2019 (UTC)
Like a couple of people pointed out in the last discussion, browsers don't handle it and there's no real prospect of them doing so any time soon, so I don't see the point of purism here. It's not WS-exclusive either: Wikipedia does the same hack automatically in citations, for example, and once there is widespread support the templates can just be disabled anyway. @Levana Taylor: Spacing arbitrary characters is already possible using {{sp}}, e.g. for “‘ you can do {{sp|“}}‘, which renders with a small gap between the two marks. —Nizolan (talk)
That {{sp}} result looks pretty good to me; if other people think so, we don't need the quote-pair template at all. Levana Taylor (talk) 12:53, 26 August 2019 (UTC)
{{sp}} produces a passable visual result—if in a somewhat suboptimal way—by applying a .15em CSS letter spacing. However, creating a template for this one case is trivial: the sum total of the code would be “&#8201;‘ and ’&#8201;” for its mate. This renders as “ ‘ and ’ ” (vs. “‘ and ’”). The challenge is accounting for all the possible combinations we end up permitting, and under sufficiently mnemonic names. --Xover (talk) 15:36, 26 August 2019 (UTC)
Imo the CSS option should actually be preferred to inserting a thin space character, for the reason explained at Wikipedia's documentation: "… It does this with CSS, and does so because the insertion of an extraneous space character of any kind (e.g., &nbsp; or &thinsp;) would violate the semantic integrity of web content in an article or another page in which it appears." {{' "}} etc. also use CSS at the moment, but via padding rather than letter-spacing. Though I'm not particularly bothered either way. —Nizolan (talk) 16:37, 26 August 2019 (UTC)
Nah, that's just wankery. These Unicode characters (which do not include &nbsp;, which is an actual space character) are intended specifically for this purpose and have no effect on semantics. The so-called "CSS" approach, however, is not actually a CSS-based method: it uses HTML markup whose sole purpose is formatting, making it no better than <b> or <u> or other physical markup. --Xover (talk) 17:19, 26 August 2019 (UTC)
{{sp}} changes the appearance so that it looks like there is a space between the characters, but there actually isn’t: browser search and cut-and-paste treat it like two consecutive characters (and search can find them). This is what we want! By contrast, &thinsp; is treated as a character for both purposes, it’s not good: shows up as a box when cut-and-pasted, and is not found by typing either '" or ' " into browser search. Levana Taylor (talk) 14:49, 27 August 2019 (UTC)
How it works in practice is of course an important consideration for choosing the best approach, and will need good testing as you have done here. However, I must also say, if your browser/OS/application turns a thin space into a � (or other replacement character) then it is time to look for a replacement. This is a bog standard Unicode character that's been in the spec forever, and not in any way an obscure one. Falling down and showing the replacement character would have been somewhat forgivable a decade ago, but it is inexcusable in 2019. --Xover (talk) 17:51, 27 August 2019 (UTC)
The issue is, really, that underlyingly the text is a consecutive pair of characters, which just need visual separation, not an actual space between them. The {{" '}} templates represent it that way, and so does {{sp}}. But inserting a thin space or a gap doesn’t achieve that result.
@Prosfilaes:, you’ve mentioned that there are font/browser combinations that automatically adjust spacing between pairs of quotes, and I suppose you have a concrete example in mind. Could you check how that interacts with {{sp}}? Levana Taylor (talk) 19:30, 27 August 2019 (UTC)
I don't have a concrete example in mind; it's just the simple fact that fonts have supported such automatic adjustments for decades, and every browser needs a complex shaper to support Arabic and such languages anyway, so not supporting that seems bizarre.--Prosfilaes (talk) 22:48, 28 August 2019 (UTC)
Indeed. Lack of support for even moderately advanced typographical features in current web browsers is mind-boggling and incredibly frustrating. And WS' use cases seem to be hitting a lot of the gaps in that support, so that even the areas where browsers do have advanced features help us only but a little. Because I absolutely agree that the perfect solution for this issue would be for fonts+browsers to handle it automatically with zero intervention from us (apart, perhaps, from picking a suitable default font). --Xover (talk) 05:38, 29 August 2019 (UTC)
@Xover: I disagree that it's just wankery—for example, if I copy and paste from a text with the "' sequence I'm not going to want an extraneous space to be inserted between the two, though I'm hardly going to tear my hair out over it. For the record I'm not sure font support for adding space between the straight versions is particularly advanced. I just checked, and in Indesign the sequences ''', "', and '" are pretty well indistinguishable without manual kerning not just in Arial but even in professional typefaces like Garamond Premier Pro, though smart quotes are handled fine. —Nizolan (talk) 00:31, 30 August 2019 (UTC)
@Nizolan: The issue itself is a fair one, certainly. The "wankery" comment was more addressed to the template docs at enWP that describe the issue as if it were about <font></font> tagsoup vs. <em></em> and semantic markup. That's just not the case, and even to a degree gets it completely backwards.
As you and Levana note, using a hair or thin space has certain effects that may or may not be desirable, and which must be weighed against other concerns. That you get the relevant character along when you copy the text is not only not surprising but rather is the expected and, in that circumstance, desired behaviour: why should the spacing we had added disappear when copied? But whether we want it there in the first place is a separate issue. I don't think there will often be a case where readers use in-browser search to search for the quotation marks, but it might be enough of a downside on its own to prefer another method.
And provided we use at least vaguely semantic markup and TemplateStyles I have no strong objection to a <span class="ws-quotation-marks"></span> type approach (overloading {{sp}} I am less enamoured of). Its verbosity and sheer overhead (one character vs. 40!) offends my sensibilities, but nothing strictly wrong with it.
Thanks for checking Indesign! It's been a while since I touched that or FrameMaker etc., and I can hardly claim any expertise in either, but I must say I'm a little surprised not even Indesign would take advantage of this font feature. Perhaps it is a deliberate choice on a theory that anyone using straight quotation marks really means to have basic untweaked quote marks? I take it it applies some form of spacing for "smart" quote marks(?), so it would seem it does support this feature. And in any case, I agree this isn't very "advanced". I'm continually surprised at what web browsers don't yet support (like drop-initials. Still not?!? Wow! And where're my historical ligatures? I want my long-s with CSS control, darnit!). --Xover (talk) 08:00, 30 August 2019 (UTC)
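The search and copy-paste difference raised in this thread can be shown at the level of plain strings. This is a minimal Python sketch under a stated assumption: that browser search and copy-paste behave like substring matching on the underlying text, which is roughly what the test described above found but is not guaranteed for every browser. The sample sentence is invented.

```python
THIN_SPACE = "\u2009"  # U+2009 THIN SPACE, the character written as &#8201; or &thinsp;

original = "“‘Hello,’ she said.”"

# Approach 1: insert a real thin-space character between the adjacent marks.
with_thin_space = original.replace("“‘", "“" + THIN_SPACE + "‘")

# Approach 2: leave the text untouched and add the gap purely through styling
# (letter-spacing or padding); the underlying characters do not change.
styled_only = original


def contains_pair(text: str) -> bool:
    """Report whether the two marks are still directly adjacent in the text."""
    return "“‘" in text
```

Here `contains_pair(styled_only)` is true and `contains_pair(with_thin_space)` is false, and the thin-space version is one character longer. Whether that extra character is a defect or the desired behaviour is exactly the point on which the thread disagrees.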

Organisations as authors

How do we deal with corporate, governmental, and similar authors, for example the red link for "Parliament of India", on Flag Code of India?

I have looked for examples, but have been unable to find any. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:17, 25 August 2019 (UTC)

Portals! See Stratemeyer Syndicate, Council of Trent, Brothers Grimm, Truth and Reconciliation Commission of Canada, &c &c —Beleg Tâl (talk) 18:25, 25 August 2019 (UTC)
Note that with regard to government works, it is common to use something like Parliament of Canada, but linking to Portal:Acts of Parliament of Canada, as there is currently no portal for Portal:Parliament of Canada. —Beleg Tâl (talk) 18:28, 25 August 2019 (UTC)
Thank you, but I don't see the answer to my question there. For example on Portal:Stratemeyer Syndicate I selected a random entry, Dick Hamilton's Fortune (one of several I tried, on the various portals you suggest) and it has Author:Howard Roger Garis - a person, not a corporate entity. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:53, 25 August 2019 (UTC)
That's because that particular portal includes many texts that do have individual authors. See Canons and Decrees of the Council of Trent from another of BT's suggestions for a better example of what you're looking for. —Nizolan (talk) 20:21, 25 August 2019 (UTC)
Most of the "authors" on the Stratemeyer Syndicate portal are noms de plume that were used by several people. Garis is the odd one out as Stratemeyer allowed him to write the Dick Hamilton series under his own name in return for writing several of the Tom Swift and Bobbsey Twins books under those series' noms de plume. A portal was the best way we have of gathering the various series together in a coherent way. Beeswaxcandle (talk) 08:48, 26 August 2019 (UTC)
Right, I wasn't questioning the use of the portal lol, just pointing to a more direct example. —Nizolan (talk) 12:38, 26 August 2019 (UTC)

A new category for validated texts?

The most frustrating thing about Wikisource to me is that it's virtually impossible to find good quality texts to read here (which is ironic given the purpose of the project). Let's say, for example, I want to read a non-fiction book about birds. Here are all the steps I try:

So after all that effort, I sort of found what I was looking for, but not really. And sadly, I didn't encounter any of our validated full-length non-fiction books about birds, such as Natural History, Birds or Bird Haunts and Nature Memories (which I found through the external Wikisource category browser tool).

I would like to propose that we create a new category called Category:Validated texts which is prominently linked from the Main Page and subdivided into categories by topic and format. That way it will be easy to find high quality texts to read about whatever you're interested in without having to dig through hundreds of unfinished and unproofread texts in the process. Kaldari (talk) 17:56, 25 August 2019 (UTC)

Portal:Birds is supposed to be the go-to place for finding good quality works about birds. We should probably make a better effort to maintain portals, and I encourage you to add Natural History, Birds and Bird Haunts and Nature Memories to Portal:Birds where they belong. —Beleg Tâl (talk) 18:31, 25 August 2019 (UTC)
It is a matter of imperfect categorisation. Specifically Category:Birds is quite confusingly a subcategory of Dinosaurs, which are at the same time considered to be only Prehistoric reptiles, and so from Vertebrates you can get there only thus: Category:Vertebrates --> Category:Reptiles --> Category:Prehistoric reptiles --> Category:Dinosaurs --> Category:Birds :-D
The only help is to keep improving the categorization. --Jan Kameníček (talk) 18:40, 25 August 2019 (UTC)
Fixed. --Jan Kameníček (talk) 18:45, 25 August 2019 (UTC)

@Kaldari: It is meant to be Portal namespace for curated pages. <shrug> The system is not perfect to flow from validated in Index: ns through to main ns. We have Category:Validated, Category:Index Validated and Category:Indexes validated by date. Or you can look at them via Special:IndexPages and even run searches there for Index pages. Then we have works where we have put them into WD, and badged them (validated). Ideally we would have all our works in WD and have them categorised there and run searches, though that is clearly a pipe dream. Unfortunately there is no ready means to move between proofreading status of a complete work from its index page. Of course we also have our completely crap Wikisource:Works. As an example of how we do ourselves no favours, poke your eyes at what is listed in Wikisource:Works/2017 and Wikisource:Works/2018 and see the poor use of portals and categories. It seems we prefer to transcribe and replicate the work of a typographer rather than present books to the public. — billinghurst sDrewth 11:11, 26 August 2019 (UTC)

@Beleg Tâl, @Billinghurst: The problem with Portals is that you can put anything in them (even barely started transcriptions and redlinks). Although they have some organization to them, there's no curation, and they tend to be more-or-less random lists of works, many of which aren't even readable. There needs to be a place on Wikisource somewhere that you can go to find finished works to read. The closest we have to that currently is Category:Featured texts, but it only has 126 pages in it. Would there be any downside to creating a category for validated works (with subcategories for topics and formats)? According to Category:Index Validated, there are only 3299 validated works on Wikisource. That doesn't seem like it would be impossible to manage. Then we could put a little blurb on the Main page that says something like "Looking for something to read? Check out our completed works." Kaldari (talk) 21:21, 26 August 2019 (UTC)
1. Is there maintenance that searches for validated indexes that haven't had their status set to validated? That should help with generating a list of works, though not necessarily with generating a list of articles in validated magazines.
2. Can there be a policy that only finished, proofread (though not necessarily validated) works should be on portal pages? That'd give someone tidying up a portal grounds for removing things. Levana Taylor (talk) 21:52, 26 August 2019 (UTC)
Portals are too versatile for your second idea to work. Like Author pages, sometimes it is valuable to list all relevant works even if they are not all hosted here yet. It can also be valuable to list works that are in progress so that editors interested in the topic can contribute to the works they are interested in. —Beleg Tâl (talk) 22:40, 26 August 2019 (UTC)
There is no ready ability to find Index works that have been proofread/validated though have not been marked so. We have enough issues with works being proofread and validated without being transcluded, and you may have seen my recent efforts to update those. We do have series of checks for works that need work in the transclusion space, or checks, and you will see a number of petscan queries on my user page that I utilise. It is manual work checks, and it is okay as it takes manual resolution, so it is about being in the space. — billinghurst sDrewth 11:47, 27 August 2019 (UTC)
Plus if you want to check our completed works, I would point you to Wikisource:Works/2018, Wikisource:Works/2017, Wikisource:Works/2016, etc. as these are proofread and transcluded works, or electronic documents not in need of transclusion. — billinghurst sDrewth 11:49, 27 August 2019 (UTC)

I agree with @Kaldari:'s original point. It's strange, and tends to defeat our purpose, for it to be so difficult to find fully validated works. (Making it easy to find fully proofread works, separately, would also be valuable, as the number of works would be vastly increased, and the quality drop would be minimal.) I can see how the various mechanisms are problematic; it seems to me that categories are the most natural fit, in the MediaWiki software.

I also think if we get this right, it should ultimately improve discovery. Search engines should learn to favor validated works over non-validated works, and our best material should become more discoverable through web search. -Pete (talk) 23:27, 26 August 2019 (UTC)

Really good point. Said category maybe should be flexible enough to be applied to individual articles in magazines that are proofread even if the entire issue isn’t finished, another way of increasing the number of works. Levana Taylor (talk) 23:30, 26 August 2019 (UTC)
The issue is that categories are bland and give next to no information about the work. They give neither scope nor detail to a work, and that is why we went to curated author pages (rather than author categories), and then started working on portals. If we were to do things properly we would be having auto-generated pages that pull wikidata. We need to be more assiduous with listing our works. If someone could find assiduous cataloguers that would be nice. By the way, if you do add pages to portals, we do have {{75%}} and {{100%}} to indicate proofread and validated works. For any portal, if you think that it looks ugly, we encourage you to fix it, or make it workable. With regard to red links, there are some who revel in adding redlinks to our curated pages, and while I dislike it, it isn't necessarily wrong. — billinghurst sDrewth 11:42, 27 August 2019 (UTC)
on wiki search is so bad, i don’t use it. rather a google search is better. maybe a wikidata query by wikisource work with a proofread status, would be better. then you could have a proofread work page, with the wikidata query result. Slowking4Rama's revenge 03:41, 28 August 2019 (UTC)
(Just wondering aloud) Would it be feasible to have these icons automatically display as badges on a work based on the status of the underlying Index? Would it be feasible to make a template similar to {{scan}} which automatically shows the relevant icon based on the work's status? —Beleg Tâl (talk) 13:28, 28 August 2019 (UTC)
Are you meaning on the work, or in a built list? We usually just need to ask, and @Mike Peel: has been able to work with developers at Commons to get components together. Still the biggest issue is that we simply don't have enough people putting their works into Wikidata, let alone the badges. Further, the WEF framework, and all other tools that I have seen don't give us an easy means to bulk apply badges, they have to be manually applied per work. Ugly. Personally I gave up the data extraction approach from WD, just got too frustrating, and too much time editing in WD, and not enough here. — billinghurst sDrewth 13:40, 28 August 2019 (UTC)
What about a tool that would show results of intersection of a chosen category of works and validated/proofread works? Something similar to what they have in Commons in the top right corner of each category, with which you can filter out e.g. only featured images. However, it would be good if it were able to filter out not only validated works directly in the category, but also in the subcategories, so that a person interested in vertebrates received all mammals, reptiles, fish... --Jan Kameníček (talk) 15:18, 28 August 2019 (UTC)
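The filter suggested here amounts to collecting every work in a category and all of its subcategories, then intersecting that set with the validated works. A minimal Python sketch follows; the category tree, the work names and the `validated` set are all hypothetical stand-ins for data a real tool would have to fetch from the wiki.

```python
from collections import deque

# Hypothetical category tree: category -> direct subcategories.
subcategories = {
    "Vertebrates": ["Birds", "Reptiles"],
    "Reptiles": ["Prehistoric reptiles"],
    "Birds": [],
    "Prehistoric reptiles": [],
}

# Hypothetical membership: category -> works placed directly in it.
members = {
    "Vertebrates": set(),
    "Birds": {"Work A", "Work B"},
    "Reptiles": {"Work C"},
    "Prehistoric reptiles": {"Work D"},
}

# Hypothetical set of works whose status is "validated".
validated = {"Work A", "Work D"}


def works_under(category: str) -> set[str]:
    """Collect works in a category and, breadth-first, in all its subcategories."""
    seen, works = set(), set()
    queue = deque([category])
    while queue:
        current = queue.popleft()
        if current in seen:  # guards against loops in the category graph
            continue
        seen.add(current)
        works |= members.get(current, set())
        queue.extend(subcategories.get(current, []))
    return works


def validated_under(category: str) -> set[str]:
    """Intersect the works found under a category with the validated set."""
    return works_under(category) & validated
```

The `seen` set matters in practice because category graphs on a wiki can contain cycles or, as noted earlier in the thread, odd parent chains.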

Reply to @Billinghurst: Is your point about the software feature of categories, or is it about how categories were previously implemented here on Wikisource? (Do you have a link to the discussion, or remember roughly what year the change was made?) It seems to me that if categories are "bland," that's because nobody has taken the trouble to implement them in a way that is more useful. I have worked on MediaWiki-based wikis which put article-length commentary on category pages. If there's something missing from the category feature on a software level, I'd like to better understand what it is. If it was merely missing in implementation, it seems to me that shifting from categories to portals may have been a decision based on an imprecise understanding of the problem or the available solutions. -Pete (talk) 15:40, 28 August 2019 (UTC)

@Peteforsyth: Categories didn't then, and don't now meet our needs to neatly display our works. They are simply a page title, and add in the requirement of subpages, it just goes from bad to worse. We curate author pages, have a think about why we do that rather than rely on categories. Manual curation does suck, though until we can autogenerate lists on a daily basis pulled from something like WD, we are stoinkered. Until users properly attach to WD each article in a biographical dictionary, a journal, and link to an author, and fill those gaps we are stoinkered. Until we can create books <-> editions, we are stoinkered. So at this point, I will just go back to my maintenance, as tidying up after people here is way more than a fulltime task anyway. — billinghurst sDrewth 22:50, 28 August 2019 (UTC)
I think you've established with your comment that you and I have some pretty basic disagreements on the subject, but if you intended to communicate more, could you try again? I'm still interested to see the earlier discussion. (I can search myself I suppose, but without knowing any of the backstory it'll probably take me a while.) -Pete (talk) (originally posted along with the point now below, beginning the section #Problems with subpages.)
I'd like to make a few points on this and I'm very happy this conversation and the making texts discoverable conversation have come up
  • Categories may have their benefits but I think portals are the way to go for wikisource for users that are searching for works or subject matter. They are more customizable for our purpose and works can be better presented in portals than categories.
  • I think when a work has been proofread it is ready to be presented to the public, so we shouldn't be focusing so much on only validated works.
  • Portals should be the first thing to come up when users are searching for subject matter, then works, authors, index pages etc.
  • Our main page sucks at advertising what this site has to offer. I think we need to seriously have a look at this. Navigation to subject matter for example.
  • To sum up I think portals are the best way to present subject matter to the public in wikisource.
  • This subject matter must be discoverable via the search bar or main page.
  • Proofread texts are good enough to be presented to the public. Anything else (auto generated OCR) etc shouldn't be presented at all.
  • Ideally in my opinion, if a user is searching for Birds, they would type Birds in the search box and be directed to the portal Birds. Whether this portal is properly maintained and up to scratch is up to us. Jpez (talk) 20:58, 29 August 2019 (UTC)

Problems with subpages

The "subpage" convention we use is somewhat related to this point. In my view, the subpage convention is less than ideal, for two reasons: (1) it creates page titles with slashes in them, i.e., they deviate from plain English and look like code. I would expect to see "The Lord of the Rings, Book 1" rather than "The Lord of the Rings/Book 1" – the latter is not proper English syntax. (2) The breadcrumbs it creates are redundant of the information in our headers. The "Book 1" page header should allow you to get to the main work, and also to Book 2, etc. Breadcrumbs that let you navigate "up" in the hierarchy are therefore redundant and cluttery. I prefer categories as a means of organizing pages on a wiki; they are much more flexible than subpages, and they do not impose restrictions on the titles of the pages they organize, or introduce redundant breadcrumb trails. -Pete (talk) 16:45, 29 August 2019 (UTC)

Incidentally, you can override the displayed title of a page in Mediawiki (with some limitations that I can't recall off the top of my head), which might be a way to alleviate the "slash-in-title" issue if needed. --Xover (talk) 16:57, 29 August 2019 (UTC)
Do you mean that some template can be placed on the page to change its displayed title? If, on the other hand, it’s a settings change on the user end, that doesn’t help the casual visitor. Levana Taylor (talk) 17:07, 29 August 2019 (UTC)
If we feel the current displayed title for subpages is a problem, one possible way we could explore solving it is using Mediawiki's functionality for overriding the displayed title. There are several possible ways it could be deployed, but one of them would be to simply have the {{header}} that we already include on every page do the overriding. For an article in a periodical, for example, the header template has enough information to ask for "Awesome Article (The Magazine)", or whatever format we want.
For reference, the functionality I'm thinking of is the so-called "magic word" {{DISPLAYTITLE:…}}. It has some limitations in what it can do so we might need code changes in Mediawiki if our needs exceed that scope. --Xover (talk) 17:34, 29 August 2019 (UTC)
This is related, too, to the ongoing discussion about visibility of our works. There’s an issue with the visibility of works that are, although complete in themselves, subpages of periodicals or collections. When you google them, it may not be immediately obvious that the long compound page name you see in the search results is THE text of the work you’re looking for. Also you don't pull them up when typing the subpage name into the search box here. What can be done about this? Redirects, or a fundamental rethinking of subpages as Peteforsyth suggests? Levana Taylor (talk) 17:04, 29 August 2019 (UTC)
Agree with User:Levana Taylor. Altogether, is it necessary for subpages to be visible in search engines? There are many main namespace subpages that are not proofread. — Ineuw (talk) 23:05, 2 September 2019 (UTC)
Yes, it is necessary. My point was that many things that are currently filed as subpages are standalone works (even if in anthologies/collections/periodicals) and it ought to be easier for searches to find them as such; and that page-names ought to be rethought so that it would be easier to recognize what you have in front of you when you first see it. Levana Taylor (talk) 06:48, 6 September 2019 (UTC)
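The idea floated earlier in this thread, having the header derive a friendlier displayed title from a subpage name, can be sketched as a string transformation. This Python sketch is purely illustrative: the function names and the "Part (Work)" output format are assumptions based on the "Awesome Article (The Magazine)" example, and whether {{DISPLAYTITLE:…}} would accept such a value is one of the limitations mentioned above, not something this sketch settles.

```python
def display_title(page_name: str) -> str:
    """Turn 'Work/.../Part' into 'Part (Work)'; leave top-level names unchanged."""
    parts = page_name.split("/")
    if len(parts) == 1:
        return page_name
    return f"{parts[-1]} ({parts[0]})"


def displaytitle_wikitext(page_name: str) -> str:
    """Wrap the derived title in the magic word, as a header template might emit it."""
    return "{{DISPLAYTITLE:" + display_title(page_name) + "}}"


example = display_title("Amazing Stories/Volume 01/Number 02/The Infinite Vision")
```

For the page name used earlier in the discussion, `example` is "The Infinite Vision (Amazing Stories)". A work whose own title contains a slash would be split incorrectly, so a real implementation would need the header's structured fields, not the raw page name.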

Updating the images in the text quality templates

In an above discussion, the text quality templates  ,   and so forth are mentioned. It seems to me that it would be desirable to update the imagery of these icons to reflect our scan-based work process (which these icons predate) and to match the icons used on Wikidata:  ,  , and so forth. Does anyone object to this idea? —Beleg Tâl (talk) 13:21, 28 August 2019 (UTC)

Makes sense to me to replicate the badges, and we should be implementing similar for listing of features articles, and other components that come through the badges. — billinghurst sDrewth 13:35, 28 August 2019 (UTC)
I think it would be terrible to use those badges here. Their meaning is completely obscure to anyone that isn't a Wikimedian. I would recommend using something like   for validated texts. For other statuses, I don't think it would be practically possible to keep their badges up to date everywhere a work is listed, nor would it be especially useful for readers. Kaldari (talk) 16:29, 28 August 2019 (UTC)
I strongly disagree with this for a few reasons. First, a mark for works that are complete and proofread is much more useful, comparatively, than a mark for validated works: the quality difference between proofread and validated is far smaller than the difference between proofread and non-proofread. Both statuses should be indicated, but if I were for some reason forced to pick one I would go with "proofread at least once", not validated. I also don't think it's much less impenetrable to only mark validated texts: if you don't know what validation means then exclusively marking validated works, which implies that only they are satisfactory, could be positively misleading. Finally, I don't think the work added is that much—Category:Index Proofread is actually smaller than Category:Index Validated, so it wouldn't even double it for scans. —Nizolan (talk) 17:54, 28 August 2019 (UTC)
If we are talking about placing badges on work pages (which is kind of beside the point of this discussion), I would say that the best way to handle this would be to have all scan-backed works badged based on the status of the linked Index page. —Beleg Tâl (talk) 19:01, 28 August 2019 (UTC)
Question: can't the badge state for a given page be read directly from Wikidata? Surely we wouldn't need to manually keep them in sync? Inductiveloadtalk/contribs 18:27, 28 August 2019 (UTC)
It can be automatically read from Wikidata, but Wikidata cannot automatically pull the info from Wikisource and needs to be kept updated manually. —Beleg Tâl (talk) 18:55, 28 August 2019 (UTC)
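To illustrate the read side of this exchange, here is a minimal sketch of how a script might fetch a page's badge from Wikidata through the `wbgetentities` API and turn it into a status label. The badge item IDs in the table and the helper names `badgeQueryUrl` and `badgeLabels` are assumptions made for illustration and should be checked against Wikidata before use; nothing here reflects how any existing gadget does it.

```javascript
// Assumed Wikidata item IDs for the Wikisource badges; verify before relying on them.
const BADGE_LABELS = {
  Q20748091: 'not proofread',
  Q20748092: 'proofread',
  Q20748093: 'validated',
  Q20748094: 'problematic'
};

// Build a wbgetentities request URL for one English Wikisource page (hypothetical helper).
function badgeQueryUrl(pageTitle) {
  const params = new URLSearchParams({
    action: 'wbgetentities',
    sites: 'enwikisource',
    titles: pageTitle,
    props: 'sitelinks',
    format: 'json',
    origin: '*'
  });
  return 'https://www.wikidata.org/w/api.php?' + params.toString();
}

// Extract known status labels from an already-parsed wbgetentities response.
function badgeLabels(response) {
  const labels = [];
  for (const entity of Object.values(response.entities || {})) {
    const link = entity.sitelinks && entity.sitelinks.enwikisource;
    for (const badge of (link && link.badges) || []) {
      if (BADGE_LABELS[badge]) {
        labels.push(BADGE_LABELS[badge]);
      }
    }
  }
  return labels;
}
```

This only covers reading. As noted above, the badge on Wikidata still has to be set by hand (or by a bot), so the label is only as current as the last update there.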
I do not think   is less obscure than  , but at least they are both less obscure than the current image  . However, there is one very important reason why   is preferable to  , namely that   is already hard-coded into Wikidata, and therefore has the advantage of consistency. Actually, scratch that last bit; Wikidata uses   for featured texts, but this doesn't impact our choice of icon for featured texts, so I guess consistency doesn't matter here. —Beleg Tâl (talk) 18:55, 28 August 2019 (UTC)
Anyway, I guess I'm open to replacing   with  , but then what are your suggestions for icons for  ,  ,  , and  ? I suppose   would make the most sense for  . —Beleg Tâl (talk) 19:08, 28 August 2019 (UTC)
Okay, how about this for suggestions.
  •   = null (formerly  )
  •   in purple = problematic (not used in old system)
  •   or   in red = not proofread (formerly   and  )
  •   in yellow = proofread (formerly  )
  •   as-is = validated (formerly  )
Beleg Tâl (talk) 19:44, 28 August 2019 (UTC)

I would prefer that we kept it simple and aligned. Whatever is used here and WD should be the same. We only really need to indicate proofread and validated texts, everything else is less relevant, and not what we are trying to highlight. Every bit of complexity is something that someone doesn't or won't do. — billinghurst sDrewth 22:37, 28 August 2019 (UTC)

Which to me means that you do not think we should update all the textquality templates as I suggested, but to do something pretty much entirely different. That's fine. —Beleg Tâl (talk) 23:10, 28 August 2019 (UTC)
@Beleg Tâl: I like your proposed icons above. I also agree with billinghurst that there's not much point to highlighting texts that aren't proofread or validated. If we adopted the icons you propose, I bet I could get the badges at Wikidata changed to match. Kaldari (talk) 00:24, 29 August 2019 (UTC)
I don't like any of the alternate images proposed so far. None of them are as crisp or clear as the originals—which I think is an important aspect of them—and none communicate better (the originals aren't great either) what we want to convey. I also think we should clearly delineate the two axes here: "progress" in a percentage only really makes sense for unfinished works in the proofreading stage, while "Not proofread", "Proofread", "Validated" are categories or overall status of a work. The old icons do an admirable job communicating both aspects: imperfect, but as good as can reasonably be hoped for. New icons would be easier and do better if we clearly differentiate those two uses (including which namespace they are envisioned to be used for: there seem to be two separate discussions going on here).
I also think choosing icons would best be done in conjunction with a refresh of the visual design, the context in which the icons will appear, but I realise that is a much much taller order.
That all being said, I do not actually oppose this proposal so much as wish it went farther and did more. If perfect is the enemy of good here, by all means let's make something that's merely good and then iterate! --Xover (talk) 05:30, 29 August 2019 (UTC)
  •   Comment for an immediate action, let us just poke the amber badge into 75%, and the green badge into 100% for the moment, and work from there. They are templates, and this isn't rocket science. The amber and the green are universal for WSes, to move from something standard and expected should be a pretty good reason. — billinghurst sDrewth 06:14, 29 August 2019 (UTC)
@Billinghurst, @Xover, @Beleg Tâl, @Kaldari: I re-colored some of the icons as suggested (as well as the green one, to make an analogous color scheme). Here are the results below:
  •   = null
  •   = problematic
  •   = not proofread
  •   = proofread
  •   = validated
Is this an acceptable design set? –MJLTalk 22:55, 31 August 2019 (UTC)
Looking at it.. could be a bit darker actually... :/ –MJLTalk 22:57, 31 August 2019 (UTC)
@MJL: Nice! The grey and yellow have insufficient contrast against a white background, and the green is borderline, so they tend towards illegibility. The identical shape in yellow and green uses only colour to convey its meaning, making the difference invisible to people with certain forms of disabilities (think colour-blind people). We also need to use them at a slightly larger relative size to be legible on mobile. (Incidentally, if we use them in a context where they are significantly larger we should have a thin border around the shapes for contrast) I see the logic in the symbols here, but I have a slight worry their meaning won't be clear to a fresh visitor and when you only see one of them alone. --Xover (talk) 10:35, 1 September 2019 (UTC)
Agree that the yellow and green need to be darker (similar to the original green). Kaldari (talk) 17:08, 2 September 2019 (UTC)
@Xover:I admire your concern for colourblind individuals and I agree with your perspective, but I will also point out that our page status system is already 100% colour-based and has little-to-no provision for accessibility whatsoever, including the Wikidata icons that my original proposal was based on. —Beleg Tâl (talk) 18:19, 2 September 2019 (UTC)
You're right that we have multiple things that need to be fixed. However, our existing systems are not "100% colour-based". The page quality icons ( ,  ,  ,  ) are different shapes as well as different colours and have alt text describing their meaning (well, some of them: 25% = Incomplete, 50% = Text complete, 75% = no alt text, 100% = no alt text); and the radio buttons for "Without text", "Not Proofread", "Problematic", etc. in the Page: namespace have text labels in addition to the colour circles. The colour-only page status indication on Index: pages is problematic, but most likely fixable if we stop abusing the title attribute. Etc. --Xover (talk) 19:47, 2 September 2019 (UTC)
@Xover, @Beleg Tâl: This is a non-issue because I have already checked for this before picking out the exact shades. For all listed variants of color-blindness, Green looks significantly darker than yellow. I encourage you both to compare the two and see if they can be differentiated. [Also, they are now darker as requested] :) –MJLTalk 17:12, 5 September 2019 (UTC)

I am not understanding the need for most of this complexity. In most contexts, a three-way division of scan-based texts should be plenty: unfinished (has some blank or blue pages), unproofread (has none of those, but some pink pages), proofread (has only green or yellow pages). This suffices both for people who want to read (can go straight to the proofread texts) and people who want to contribute, since they can then decide if they want to do just proofreading or the more complex formatting involved in working on new or blue pages. If more details are needed, look at the index itself. Levana Taylor (talk) 20:27, 2 September 2019 (UTC)
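The three-way division described above can be sketched as a small function over per-index page counts. The field names (`missing`, `problematic`, `notProofread`, `proofread`, `validated`) are hypothetical, and the mapping of colours to statuses (blank = not yet created, blue = problematic, pink = not proofread, yellow = proofread, green = validated) is an assumption based on the usual Index: page colouring.

```javascript
// Classify one index by the statuses of its Page: pages.
// counts: { missing, problematic, notProofread, proofread, validated } (hypothetical field names).
function classifyIndex(counts) {
  const c = Object.assign(
    { missing: 0, problematic: 0, notProofread: 0, proofread: 0, validated: 0 },
    counts
  );
  const total = c.missing + c.problematic + c.notProofread + c.proofread + c.validated;
  // An index with no counted pages cannot be called finished.
  if (total === 0 || c.missing > 0 || c.problematic > 0) {
    return 'unfinished';
  }
  // No blank or blue pages left, but some pink ones.
  if (c.notProofread > 0) {
    return 'unproofread';
  }
  // Only yellow (proofread) or green (validated) pages remain.
  return 'proofread';
}
```

Pages marked "Without text" are deliberately left out of the counts in this sketch, on the assumption that they need no further work.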

Talk pages consultation, Phase 2 report

Please see the mw:Talk pages consultation 2019/Phase 2 report. This relates to Wikisource:Scriptorium/Archives/2019-05#Talk pages consultation: Phase 2.

The main consultation is over. However, we still need to hear from you! Please put the mw:Talk pages project page on your watchlist. Whatamidoing (WMF) (talk) 17:33, 28 August 2019 (UTC)

Making our validated texts discoverable

Wikisource has thousands of validated texts that are typically higher quality than Project Gutenberg. There's one minor problem though: they are virtually impossible to find. Why are we putting all this hard work into creating beautiful accurate transcriptions if no one's ever going to see them? According to Alexa, Wikisource's bounce rate (the percentage of visitors who immediately leave) is 77%, while Project Gutenberg's is 41%! While English Wikipedia enjoys hundreds of millions of pageviews per day and has dedicated software development support from the WMF, English Wikisource is competing for lowest pageviews among the English projects and (unsurprisingly) gets virtually no support from the WMF, despite the great potential of the content here. I think one solution to this problem is to better surface our best content to site visitors and make our validated texts more discoverable.

To this effect, I would like to propose that we run a 3-month experiment (which is the period of time that Alexa's stats are based on). The experiment involves the following:

If after 3 months, our bounce rate has not improved, I'll remove the dedicated search box from the main page and put it back how it was. What do folks think about this idea? Kaldari (talk) 18:08, 28 August 2019 (UTC)

  •   Support Would be nice also to make the ebook (Epub/Mobi) downloads more discoverable (the feature text is the only place that this is obvious to me). Is there any idea how many times WSexport is used from enWS? Also, any page that doesn't have a complete list of the pages on the front page will not produce the right ebook. In that case, there should be complete instructions on how to create an ebook somewhere, and it should be part of promotion to validated status. An example where the export is not working: Anna Karenina. Inductiveloadtalk/contribs 18:20, 28 August 2019 (UTC)
    @Inductiveload: https://tools.wmflabs.org/wsexport/tool/stat.php, there is a link from the your choice own link. frWS has lots, and we have some. Whatever frWS is doing seems to have success. — billinghurst sDrewth 06:35, 29 August 2019 (UTC)
    Brilliant. frWS has a utility bar header at the top of all works (contains dynamic layout toggle, PDF download, print version and a citation button). When a work is in the category fr:Category:Bon pour export (good for export), it adds an epub/mobi download link to the bar. It's done with JS: fr:MediaWiki:Gadget-stockText.js. They also have ebook links for all the "new texts" on the front page, not only the featured text. Inductiveloadtalk/contribs 09:37, 29 August 2019 (UTC)
    That's a good point. I wonder if there's some way we could integrate something like {{Featured download}} into the header template, which could be turned on for validated texts. Kaldari (talk) 18:29, 28 August 2019 (UTC)
  •   Support I think this could be a good idea, and if it is only for three months I think we can just go for it. —Beleg Tâl (talk) 19:05, 28 August 2019 (UTC)
  • Support --Jan Kameníček (talk) 19:29, 28 August 2019 (UTC)
  •   Support As an experiment this is a splendid idea. I am a little doubtful about its chances of success, though. 3 months is a pittance in terms of turning around this kind of trend, and I think the discoverability issue this is addressing needs multiple avenues of attack. But as a start, and as a component of an overall push for discoverability, this is a great idea and I would be inclined to support its extension beyond the three months unless any actually negative results are observed.
    Incidentally, I would also welcome any efforts to refresh the design of the main page and the visual design of common elements like the header, footer, and AuxTOC; as well as "something" to improve the various curated collections. There's also something in the gap between raw categories and plain search (Tags? Facets?) that might help, but I have only the vaguest possible idea of what that might be. --Xover (talk) 05:11, 29 August 2019 (UTC)
  •   Support, though I'd rather not have an automatic reversion if we don't see immediate improvement in the bounce rate. Rather see some discussion of the results and the best path forward at that time. Thanks Kaldari. -Pete (talk) 16:36, 29 August 2019 (UTC)
  •   Support The intermingling of poor-quality and incomplete texts with complete ones is unquestionably a serious problem and this is worth trying out as one part of a solution. I am not sure this is actually what's causing low traffic though so I wouldn't make its continuation dependent on fixing that. —Nizolan (talk) 21:40, 29 August 2019 (UTC)
  •   Support Great idea! Although, how will the bot determine what's validated? There's not strictly a relationship of validated-index to validated-work, because some works contain more than one index. It can also be hard to determine which work relates to any given index, for example Index:Austen - Northanger Abbey. Persuasion, vol. I, 1818.djvu is part of a 4-index set that contains 2 works (although, admittedly, if that was being proofread these days it'd probably all go under a single mainspace top-level page). I've tried sometimes to scrape things by following the links in the title field of an index, but it feels imperfect. —Sam Wilson 00:40, 30 August 2019 (UTC)
  • Comparing Gutenberg with Wikisource is not fair to Wikisource. People go to Gutenberg to read and not proofread. We advertise ourselves as "the free library that anyone can improve". If more visitors are wanted, provide prominent links to lists of completed works and promote downloading and printing. The proofreading aspect should be moved further down and not be immediately visible.
    In addition, 99% of the people I talk to know nothing about other Wikimedia web sites, or what Wikimedia is. Perhaps we should focus on promoting prominently the completed main namespace works. Perhaps promote our existence by explaining that we are related to Wikipedia. Do we have a category indicating completed main namespace volume titles? Is there a way to keep incomplete main namespace works from being added to search engines?
    P.S: Did a simple web search on two words, "Proofreading" and then "free book downloads" and the results are not very encouraging. — Ineuw (talk) 04:28, 30 August 2019 (UTC)
    The proofreading aspect should be moved further down and not be immediately visible. Yes! The best way to recruit new Wikisourcerors is to showcase what we have already achieved. To rope them in to contributing should be done more subtly with lots of little hints everywhere that communicate "You can edit this!", "We're missing this work from your favourite author, maybe you want to add it?", and so forth (to be clear, I mean by way of visible "Edit" buttons and lists of works with red links etc., not literal banners or talk page messages or such!). The main page has to throw as many finished and high-quality works as possible at the visitor: site maintenance and coordination is for people who are already contributing far more than for new visitors. PotM and similar should only be displayed on the main page to the degree, and in such a way, that they can help attract new contributors.
    And the same goes for unfinished works, or works below a certain level of quality: when entering through the main page, these should be hidden from you unless you search specifically for them. Anything displayed on the main page or which is navigable to simply by clicking links from the main page should be something we're proud to show off, not a fixer-upper! --Xover (talk) 08:26, 30 August 2019 (UTC)
    I agree with all of this. The key principle needs to be to distinguish firmly between the maintenance and preparation aspect of the project and the presentation of complete texts, while also encouraging people using the second to dive into the first. Imo, apart from this, the other main problem is the abstruse and poorly maintained portal and category hierarchy which limits text discoverability. There are also issues which can't be sorted out through decisions here: lack of serious integration with Wikipedia is probably another major issue for attracting traffic. —Nizolan (talk) 09:53, 30 August 2019 (UTC)
    +1. I think Xover's framing is good strategic prioritization. And Kaldari's suggestion is a good ingredient that sets the stage for further efforts to improve the way we present Wikisource to a broad audience. -Pete (talk) 20:44, 30 August 2019 (UTC)
    Good points. In the same spirit, how about removing redlinks from author pages? They add nothing (but we should keep the list of works, it is interesting, and can be accompanied by external links). The links on visible pages like author pages should highlight the texts we have, not the ones we don't! Perhaps (although this’d be way too much work to implement) we could go further and, on author pages, have a full link to completed works and a discreet smaller link, in the style of {{small scan link}}, to ones that are incomplete and unproofread. Drawback to this is how hard it’d be to keep it synchronized with the progress of work, with proofread works being promptly updated to full link status.
    BTW I myself am guilty of adding a boatload of links to unfinished works to author pages; I would welcome thoughts on how to improve the situation. Levana Taylor (talk) 23:06, 31 August 2019 (UTC)
    I don't think the red links should be removed per se. The red links serve an important purpose in pointing readers towards work they can do—hence why I try to make sure to add external links to any available scans alongside them. I do agree that author pages could be rearranged to better distinguish complete works, ongoing transcription projects, and works that haven't been started yet, though it would be a huge project at this point. —Nizolan (talk) 23:20, 31 August 2019 (UTC)
    Redlinks are generally a good thing: studies at enwp have demonstrated that they lead to contributions. But they need to be deployed correctly. Maybe some kind of approach where the list of works tells you the next step needed? "No scan available for this work. Please locate and upload one!", "This work has a scan but lacks an index, can you create one?", "This work has a scan and index but not all pages have been proofread yet. Can you help?", "This work has been proofread, but is not yet validated.". That sort of thing. A red link to mainspace in that context may actually be a deterrent to contributing. --Xover (talk) 10:13, 1 September 2019 (UTC)
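The "next step needed" idea in the last comment can be sketched as a simple lookup from a work's state to the prompt shown beside it. The shape of the `work` object (`hasScan`, `hasIndex`, `allProofread`, `validated`) is hypothetical; the messages are the ones suggested above. This is only an illustration of the decision order, not a proposal for how author pages would actually obtain that state.

```javascript
// Return the contributor prompt for a work, or null when nothing is left to do.
// work: { hasScan, hasIndex, allProofread, validated } (hypothetical field names).
function nextStep(work) {
  if (!work.hasScan) {
    return 'No scan available for this work. Please locate and upload one!';
  }
  if (!work.hasIndex) {
    return 'This work has a scan but lacks an index, can you create one?';
  }
  if (!work.allProofread) {
    return 'This work has a scan and index but not all pages have been proofread yet. Can you help?';
  }
  if (!work.validated) {
    return 'This work has been proofread, but is not yet validated.';
  }
  // Fully validated: show the plain link with no prompt.
  return null;
}
```

The checks run in workflow order, so a work is always described by the earliest step that is still outstanding.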

Maintenance of the Month questions

I was studying the list in the Wikisource:Works but I don't know what to contribute.

For example: What is supposed to be done with the 3 author pages without any works in Wikisource? Ditto, what is needed to be done to the displayed images? Some enlightenment is much appreciated. — Ineuw (talk) 03:33, 30 August 2019 (UTC)

@Ineuw: Based on a quick look, it appears that template has not been updated since 2014. I question why we even have that on the main page. And, no, I don't understand what "Work index revision" is supposed to be in practical terms either. --Xover (talk) 10:02, 1 September 2019 (UTC)
Time to kill it from the main page, at least while it is not actively curated. — billinghurst sDrewth 13:08, 1 September 2019 (UTC)
Removed. I would suggest that we could do well to have a list of proofread works that require validation, as that is often an area where work is required. — billinghurst sDrewth 13:10, 1 September 2019 (UTC)
We have a page listing for completed proofread files of the Index ns waiting to be validated. I am working on a copy of the list to switch the links to the main namespace. — Ineuw (talk) 23:50, 1 September 2019 (UTC)
@Ineuw: Please check out WS:WPV, and if you want to make that a subpage there; then by all means! :D –MJLTalk 17:17, 5 September 2019 (UTC)
@MJL: Thanks for the link although I am unclear as to why we have so many lists for validation. It is confusing.
  1. Category:Index Proofread is an alphabetically ordered generic list.
  2. Wikisource:Validation_of_the_Month/validation_works#QUEUED I assume this to be in date order as any proofreader can add their completed work.
  3. Wikisource:WikiProject_Validate#Candidates_to_be_validated In what order is the list? I think that these two should be combined. — Ineuw (talk) 22:09, 5 September 2019 (UTC)
    Ad 3: The list is not ordered as no order is needed. --Jan Kameníček (talk) 22:23, 5 September 2019 (UTC)

┌─────────────────────────────────┘
@Ineuw: Validation of the Month is a separate project that no one really is participating in AFAIK. The list you mentioned is just a voting page to decide what the featured tasks should be. There's no particular order you can go in, but the project is limited to seven works it can consider validating at a time. As for the merging, that was considered at one point. The problem is that VotM is older and comes with a large list of projects people just put there. WS:WPV generally only takes up works people actively nominated for consideration. –MJLTalk 22:20, 5 September 2019 (UTC)

In essence, these two lists are in addition to any validations users make? — Ineuw (talk) 02:11, 6 September 2019 (UTC)

Broken EPub download link on mobile pages

On en.m.wikisource (the default for mobile viewers), we do have a script MediaWiki:Mobile.js that places an EPub download link in the top bar. However, this link doesn't actually work. The "a" tag inside the list item seems to be way over to the left and zero sized when I inspect it, but I can't immediately work out why that is. This means, since the mobile view is default for most visitors, and severely crippled in terms of menus, that there is no way a mobile viewer can actually access an ebook, unless they revert to desktop view and find it in the sidebar (which no visitor is going to do naturally). Inductiveloadtalk/contribs 12:47, 30 August 2019 (UTC)

Hmm, actually, looks like this list item is just constructed all wrong (Probably the UI has changed since this JS was written in 2015). I think it should be something like this:
			$( '#page-actions' ).append(
				$( '<li>' ).attr( {
					id: 'page-actions-export-epub',
					class: 'page-actions-menu__list-item'
				} ).append(
					$( '<a>' ).attr( {
						id: 'ca-export-epub',
						class: 'export-epub mw-ui-icon mw-ui-icon-element',
						href: '//tools.wmflabs.org/wsexport/tool/book.php?lang=en&format=epub&page=' + mw.config.get( 'wgPageName' ),
						title: 'Download an EPUB version of this page'
					} ).append( 'Download EPUB' )
				)
			);
Inductiveloadtalk/contribs 13:01, 30 August 2019 (UTC)
@Inductiveload: Is that a complete script replacement, or a partial? — billinghurst sDrewth 12:39, 31 August 2019 (UTC)
@Billinghurst: partial, from line 6 onwards. Here is a complete script, which I managed to invoke manually from locally-served JS on my machine:
/* Any JavaScript here will be loaded for users using the mobile site */
( function ( mw, $ ) {
  $( function() {
    // Add a "download as ePub" link to the toolbar,
    // but only in the main (0) and Translation (114) namespaces
    if( $.inArray( mw.config.get( 'wgNamespaceNumber' ), [ 0 , 114 ] ) !== -1 ) {
      $( '#page-actions' ).append(
        $( '<li>' ).attr( {
          id: 'page-actions-export-epub',
          class: 'page-actions-menu__list-item',
        } ).append(
          $( '<a>' ).attr( {
            id: 'ca-export-epub',
            class: 'export-epub mw-ui-icon mw-ui-icon-element',
            href: '//tools.wmflabs.org/wsexport/tool/book.php?lang=en&format=epub&page=' + mw.config.get( 'wgPageName' ),
            title: 'Download an EPUB version of this page'
          } ).append("Download EPUB")
        )
      );
    }
  } );
} ( mediaWiki, jQuery ) );

Inductiveloadtalk/contribs 20:43, 31 August 2019 (UTC)

  Done Thanks J. Others may wish to test on their mobile devices — billinghurst sDrewth 00:36, 1 September 2019 (UTC)
Working for me, on mobile and also PC with 'm' subdomain. Thanks! Inductiveloadtalk/contribs 10:12, 1 September 2019 (UTC)
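One possible hardening of the link construction above, offered only as a sketch: page names can contain characters such as `&` or `?` that would otherwise be read as part of the query string, so the page name could be percent-encoded before it is appended. The helper name `epubExportUrl` is hypothetical, and it is an assumption that the export tool accepts a percent-encoded `page` parameter in the usual way.

```javascript
// Build the WSexport EPUB link for a page, percent-encoding the page name.
// lang defaults to 'en'; the function name is illustrative only.
function epubExportUrl( pageName, lang ) {
  return '//tools.wmflabs.org/wsexport/tool/book.php' +
    '?lang=' + encodeURIComponent( lang || 'en' ) +
    '&format=epub' +
    '&page=' + encodeURIComponent( pageName );
}
```

In the script above this would replace the string concatenation in the `href`, for example `href: epubExportUrl( mw.config.get( 'wgPageName' ) )`.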

Why have we allowed ToC to slip to subpages?

Not certain when this little trend started to occur, however, we seem to have allowed the transcluded table of contents to slip to a subpage. For us, this has to be a no-no as it breaks the book tool's ability to create works with all the subpages. This is even happening with editors who have been here for a while.

We need to be doing better with our patrolling, and ensuring that these things do not happen. That we have focus and bickering over curly apostrophes and cannot get the basics right with ToC and volumes is just maddening. Our standards have been slipping with deviations from the style manual, and from our focus on making it easy for transcription and proofreading. We used to focus on the words of the author, and now we have what seems to be a blind observance of producing facsimiles of what a publisher output.

I have also seen a tendency to start using {{Page}} more loosely rather than <pages>. This just leads to holes in transcriptions, OR poor page numbering, OR no page numbers. Why? We have a standard and a style that has been developed for really good reasons, and this seemingly casual discarding of it has consequences. — billinghurst sDrewth 12:35, 31 August 2019 (UTC)

I see this a lot in older works; I fix it when I see it, but it's a long-entrenched trend, just like the practice of using a "front matter" subpage for the cover and title page (which I also fix when I see). —Beleg Tâl (talk) 16:44, 31 August 2019 (UTC)
Although I have never put TOC on a subpage, it is the first time (after a year of regular contributing here) I read that it is actually wrong. If it is important, it should be written in some rule, so that all newcomers were able to learn it.
However, it is not true that the front matter subpage is wrong: see Help:Front matter: "Sometimes some or all of the front matter is transcluded into subpages instead." If the front matter is extensive, it is better to create a subpage, so that the readers do not have to scroll down before they find the TOC with links to subpages. For accessibility reasons it is better if the TOC is immediately visible when opening the base page (and the most important info from the title page is in the header of the base page anyway). --Jan Kameníček (talk) 17:23, 31 August 2019 (UTC)
I'd be interested to see links to (a) policies or guidelines that define how it's supposed to work, and (b) examples of how it should be and how it should not be. Both would make it easier for me to follow and learn more about this. -Pete (talk) 17:25, 31 August 2019 (UTC)
Yes exactly, there are no written rules about it, and it's a long-established way of doing things. As for front matter, I will assert that the practice of transcluding front matter on a subpage is bad even though Help:Front matter acknowledges that it has always been done. If the author and publisher designed the book such that readers must leaf past a cover, title page, and so forth before getting to the TOC, then that is what readers should expect to see here as well. Readers expect to see the beginning of the work when they start at the beginning, not to have to navigate backwards to get to the top. —Beleg Tâl (talk) 17:36, 31 August 2019 (UTC)
And if the publisher designed the book such that readers must leaf through the entire volume? :) Seems that what you're advising is in direct contradiction to what @Billinghurst: says in a case like that. I'm curious to learn about policies and norms, and if policies and guidelines need to be written, I'm happy to pitch in. I'm not sure what it accomplishes though, to talk about what's "bad" or what's a "no-no" if such documents don't exist, or are incomplete or inconsistent. What are the goals we're trying to serve here? I can see three, which might sometimes be in conflict with one another:
  • Accommodate software which knows how to build books only if certain rules are met (as billinghurst says)
  • Approximate the appearance of the original published text as closely as possible (as Beleg Tâl says)
  • Provide text in the way that is most useful to the reader, with minimal deviation from original format (what I'm inclined toward)
I'm not sure any of these deserves to be "the one and only principle" -- it seems to me that guidelines that help us balance these considerations are what would help the most. -Pete (talk) 17:47, 31 August 2019 (UTC)
My experience: Sometimes I scroll down to find a TOC, click on a link to open a chapter, find out that it is not what I was looking for, return back, have to scroll down again, click, return, scroll down, click, return scroll down... I have to say that the fact I have to keep scrolling down really sucks, while simple click-return-click-return is much friendlier to me.
BTW: It is not true that readers who open a book always expect to see the title page first, not even with paper books. I as a reader expect it when I am choosing which book I want to open and read among a row of books, but Wikisource does not offer this feature anyway. After I have already chosen a book and decided to open it, I want to get to the contents as quickly as possible. --Jan Kameníček (talk) 18:29, 31 August 2019 (UTC)
I agree with the above. I don't see how the ToC point can be criticised as a "deviation from the style manual" when where to put the table of contents is not in fact stated anywhere in the style guide; people cannot be slipping from a guideline that doesn't exist and if it is a technical issue then this needs to be made explicit. Regarding the broader issue of the front matter, I think we should only follow the exact order where convenient. Long prefaces can sometimes come before the table of contents in printed books, but I doubt anyone would disagree with putting the ToC on the root page and the preface in a subpage in that case, for example. I also have a personal policy generally not to transclude pieces of front matter that serve no particular purpose to the reader (half-titles, publisher pages, copyright notices etc.), precisely in part because of Jan's point about the table of contents being pushed down by junk. —Nizolan (talk) 22:58, 31 August 2019 (UTC)

Maybe I need to take a step through a potted history of the development of WS, and some of this relates to tool development.

  1. Goals
    • Understand the concept of the page/work/... pretty quickly (is that what I wanted?)
    • Enable reader to get into the work quickly/easily not have to fumble or be burdened (does this interest me?)
    • Not make it unreasonably difficult to transcribe/proofread
    • To guide, and not have prescriptive rules that you have to read just to contribute; and so to support new users to our styles, and our practices (ie. watch what we do as one set of prescriptive rules is problematic when working with publications over the centuries but we can work it out so it aligns with our other works)
  2. Headers were developed (the want and interest)
    • to get the metadata together in reproducible fashion
    • to enable easy navigation forward and backwards in a work
    • not too big to stop a person to easily get into a work, so somewhat minimal
  3. Table of contents (replicate works, or as auxiliary as navigation where not originally in work)
    • Before we had scans, people just started with a table of contents on a work, the essence of navigation of a work whether to a strip article or to subpages; this should be the essence of root page of a work
    • When we had scans, people had the front pages, so some started to transcribe them and some did not, and that is still the case today. It was personal choice whether people added those pages to the root page, though it was seen as valuable to be able to capture those pages for the other metadata related to the edition and author (printer, copyright, other works, etc.), so they were poked into front matter. Available for those interested in the particulars, though not imposed upon people interested in the work.
    • Doesn't work for serials as they have too many subpages, so subsidiary/detailed ToC on subpages, with a coarse ToC at top.
    • They were most useful when we had scans to transclude them to the Index: pages
  4. Tools
    • The WSexport tool came along and it requires the navigation to subpages from the root page, which compounded the need for a ToC to exist on the root page where ToCs are required for the work. As it also helps for good web presentation, it was pretty much a no-brainer

billinghurst sDrewth 00:30, 1 September 2019 (UTC)

PS. Can we please focus on ToC on the root page, and not get distracted by other factors. Yes they are important and deserve their own separate conversations. And yes, I started the diversions, but when one is always caught up in fixing what are considered the basics :-/ We need to focus our feedback tightly on one issue in such a forum. — billinghurst sDrewth 00:38, 1 September 2019 (UTC)
As people have said, the basics are that there's no written guideline against putting ToCs on subpages at the moment, so I would focus on rectifying that. —Nizolan (talk) 04:29, 1 September 2019 (UTC)

Taking @Nizolan:'s suggestion to heart, here are two examples of how the front page of a work might look, when the original has the TOC buried deep in the work.

  1. Looters of the Public Domain front page as it was before (presented very consistent with the original published work, but with a note in the header indicating that there is a TOC to be found if one knows where to look.)
  2. Looters of the Public Domain front page as I just redid it, to conform more to what billinghurst has suggested -- making sure the TOC is on the main page.

I do think option #2 is better, and it explains the change in the header, so it's not like readers will be in the dark about the variance. What do others think? -Pete (talk) 23:56, 1 September 2019 (UTC)

I think it's much better now. It's much more intuitively navigable for someone landing on the work's main page. Also the EPub export works now. Due to the {{FI}} templates, the images are full size, so the EPub is nearly 300MB, but that's another issue, I suppose. Inductiveloadtalk/contribs 10:47, 3 September 2019 (UTC)

In the magazines I’m doing, I’m putting both the TOC and the index on the main page -- is that a problem for epub export or for anything else? Levana Taylor (talk) 15:30, 3 September 2019 (UTC)

It seems to work. You get the "artificial" boxed TOCs first, and then the index, and there is an epub index that you can access from the reader program sidebar. You can check yourself by downloading the epub either on a PC or on a phone. The dotted tables look pretty bad on a mobile device, but, again, a different issue. Inductiveloadtalk/contribs 16:31, 3 September 2019 (UTC)

Thanks for the feedback. @Beleg Tâl: Could you comment on the two options above? Do you feel that there are any instances in which it's worthwhile to exercise judgment about what best serves a reader on our platform, which could differ from the printed original? (@Inductiveload: I'm happy to discuss the best way to present images, maybe at Talk:Looters of the Public Domain?) -Pete (talk) 19:10, 4 September 2019 (UTC)

  • I'm only half following this discussion, but I just want to note that as far as wsexport goes, we could change it to support TOCs on subpages (i.e. follow every subpage link recursively) so as long as there's a link from a work's top page to its ToC page then the export function would work correctly. It sounds like there are other good reasons to have the ToC on the top page, but the technical requirement of the export tool needn't be one of them. —Sam Wilson 23:33, 4 September 2019 (UTC)
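A minimal sketch of the "follow every subpage link recursively" idea mentioned above. It is not wsexport's actual code: the `links` mapping, the page titles, and the `collect_subpages` helper are all hypothetical, and a real tool would fetch links from the wiki instead of a dictionary. It only illustrates that a ToC sitting on a subpage can still be reached as long as the root page links to it.

```python
from collections import deque


def collect_subpages(root, links):
    """Return the root page plus every subpage reachable from it, in discovery order.

    `links` is a hypothetical mapping of page title -> list of linked titles.
    """
    prefix = root + "/"
    seen = [root]
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            # Only follow links that stay inside the work and are not yet visited.
            if target.startswith(prefix) and target not in seen:
                seen.append(target)
                queue.append(target)
    return seen


# Hypothetical work whose ToC lives on a subpage rather than on the root page.
links = {
    "Work": ["Work/Contents", "Author:Someone"],
    "Work/Contents": ["Work/Chapter 1", "Work/Chapter 2"],
    "Work/Chapter 1": ["Work/Chapter 2", "Work"],
    "Work/Chapter 2": ["Work/Chapter 1"],
}
pages = collect_subpages("Work", links)
```

The breadth-first order keeps the ToC page ahead of the chapters it links to, and the `seen` check stops the traversal from looping on the previous/next links between chapters.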

Use of Page template

Can anyone explain why we use {{page}} at all instead of <pages …/>? Is there some situation where the latter does not work but the former does? Does it confer some kind of advantage in some way? --Xover (talk) 09:51, 1 September 2019 (UTC)

A simple (and simplistic) reason for this is that <pages> always encloses its output in a <div>… there is simply no way of getting around this without modifying the ProofreadPage extension. On the other hand {{page}} utilises a different but similar mediawiki extension, Labeled Section Transclusion which does not enforce this enclosure. This is one of the fundamental reasons why page-crossing table transclusion is such a royal pain. 114.78.171.144 10:10, 1 September 2019 (UTC)
Also, <pages> doesn't work in the Index namespace, so if you want to transclude a TOC to the index you have to transclude pages one-by-one with this template or something equivalent. Inductiveloadtalk/contribs 10:17, 1 September 2019 (UTC)
<pages> was purposefully made not to work in Index: ns, and anyway we transclude the page, not use {{page}}
Index: is a bit of a special case, so I'm not immediately too concerned with use there (and it sounds like something that might conceivably be fixed in ProofreadPage anyway). It's the (extensive) use in mainspace that sounds a little worrisome to me. --Xover (talk) 10:21, 1 September 2019 (UTC)
Each <pages> invocation, or each Page: within a <pages> invocation? --Xover (talk) 10:21, 1 September 2019 (UTC)
Each <pages> invocation (it is easy enough to verify!) I proposed a ProofreadPage modification many, many moons ago when I was still active here but the proposal died because of disinterest and in the pre-HTML-lint days nobody noticed anyway. 114.78.171.144 12:18, 1 September 2019 (UTC)
Then my next question is, why is that a problem? Why would that affect multi-page tables? --Xover (talk) 12:24, 1 September 2019 (UTC)

  Comment there are some special cases where we would use {{page}} in main namespace, when you wish to insert multiple pages side-by-side rather than sequentially. Also, you can use {{page}} to exclude sections, rather than include sections, so it has uses, though they would be quite uncommon. — billinghurst sDrewth 12:32, 1 September 2019 (UTC)

further   Comment, in some of our transclusions based on image type (jpg/png/other, etc.) the <pages> syntax cannot be used to exclude pages, so this can be a case for use of Page:, usually when there is a plate and image that is out of order of the components of the work. — billinghurst sDrewth 11:15, 7 September 2019 (UTC)

Copyright and deletion discussions needing community input in September 2019

The following copyright discussions and proposed deletions discussions have been open for more than 14 days, and with more than 14 days since the last comments, without a clear consensus having emerged. This is typically (but not always) because the issue is not clear cut or revolves around either interpretation of policy, personal preference within the scope afforded by policy, or other judgement calls (possibly in the face of imperfect information). Wider input from the community would be valuable in resolving these discussions.

Copyright discussions require some understanding of copyright and our copyright policy, but often the sticking points are not intricate questions of law so one need not be an intellectual property lawyer to provide valuable input (most actual copyright questions are clear cut, so it's usually not these that linger). For other discussions it is simply the low number of participants that makes determining a consensus challenging, and so any further input on the matter would be helpful. In some cases, even "I have no opinion on this matter" would be helpful in that it tells us that this is a question the community is comfortable letting the generally low number of participants in such discussions decide.


Copyright discussions


Proposed deletions


Note that while these are discussions that have lingered the longest without resolution, all discussions on these pages would benefit from wider input. Even if you just agree with everyone else on an obvious case, noting your agreement documents and makes obvious that fact in a way the absence of comments does not. The same reasoning applies for noting your dissent even if everyone else has voted otherwise: it is good to document that a decision was not unanimous.

In short, I encourage everyone to participate in these two venues! --Xover (talk) 09:09, 1 September 2019 (UTC)

Wikisource:News (en): September 2019 Edition

English Wikisource's monthly newsletter which seeks to inform all about Wikimedia's multilingual Wikisource.

Current · Archives · Discussion · Subscribe MJLTalk 23:03, 1 September 2019 (UTC)

subclass2 in Portal header

It seems that the parameter subclass2 in the template {{Portal header}} does not work well: the letter of the second subclass is generated, but the link is wrong. See Portal:Bernard Bolzano. Can somebody fix it, please? --Jan Kameníček (talk) 19:17, 2 September 2019 (UTC)

I am not understanding your issue, what exactly are you expecting where? (noting that all the classifications has always been a mystery to me) The links seem fine to me at face value. FWIW this seems controlled from Template:Portal Class and that is unprotected, and could do with a sandbox and testcases. — billinghurst sDrewth 12:06, 3 September 2019 (UTC)
@Billinghurst: There are three letters of classes/subclasses: class B, which is manifested by the letter B linked to Portal:Philosophy, Psychology and Religion, subclass C, which is manifested by the letter C linked to Portal:Logic, and subclass L, manifested by letter L which should be linked to Portal:Religion, but in fact is linked to Portal:Logic again. This needs to be fixed. --Jan Kameníček (talk) 16:46, 3 September 2019 (UTC)

Proposition to change the Wikispecies label in Module:Plain sister.

Based on the discussion Links to authors in Wikispecies above, I propose to change the label of the link to Wikispecies pages in the Sister projects box (Module:Plain sister) from "taxonomy" to "Wikispecies". --Jan Kameníček (talk) 20:44, 2 September 2019 (UTC)

  •   Support Seems the most accurate and sensible label to me. --Xover (talk) 11:22, 3 September 2019 (UTC)
  •   Comment The intent of plain sister labels was to target the result, not wiki name specifically, view the other labels in the series. If the existing label is not suitable, I would just prefer "species" though I am a dinosaur. The words species and taxonomy come from the scope for that site. — billinghurst sDrewth 12:27, 3 September 2019 (UTC)
Wikispecies is an open, wiki-based species directory and central database of taxonomy. It is aimed at the needs of scientific users rather than general users.

—source, m:Wikispecies/FAQ

Although they still claim to be a "species directory and database of taxonomy", they became also a database of authorities. Ferdinand Stoliczka is not a species and so our link to this entry should be labelled appropriately. --Jan Kameníček (talk) 16:54, 3 September 2019 (UTC)
  •   Support Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:48, 3 September 2019 (UTC)
  •   Support Kaldari (talk) 22:40, 5 September 2019 (UTC)
  •   Oppose it is not meant to be identifying the wiki, it is meant to be identifying the type of article, which is the design of the whole plain sister. — billinghurst sDrewth 13:05, 10 September 2019 (UTC)
  •   Support per Jan's second post above; authors are not species so labelling it as a link to a taxonomy makes little sense. —Nizolan (talk) 17:58, 12 September 2019 (UTC)

Choosing from two author pictures in Wikidata

It seems that the author header has problems with choosing a picture from Wikidata if there are two of the same rank (see Wenceslaus II of Bohemia). It could probably be easily solved by distinguishing their ranks there, but this would solve only this particular case. I think the header should be corrected so that it could handle a similar situation, should it appear in the future again. --Jan Kameníček (talk) 21:36, 3 September 2019 (UTC)

So this particular issue was solved by removing the other picture, but the fact that Wikisource cannot handle such a situation imo needs to be solved too. Wikidata allows multiple pictures there on purpose, and so such situations may occur again. (Maybe there already are some more author pages with this problem; it could be worthwhile for a bot to search for them.) --Jan Kameníček (talk) 08:14, 4 September 2019 (UTC)
It happens infrequently enough that fixing the priority in Wikidata has worked so far (and to be honest, I think the priority should be fixed when this happens on Wikidata). However, you are right, we should update the Header module to identify when this happens and adjust accordingly. —Beleg Tâl (talk) 12:41, 4 September 2019 (UTC)
I did it this way purposefully as we were getting the first image and that was often the rubbish image. Where the image is problematic, ie. there are two, page is flagged as broken at Category:Pages with missing files, and you fix it at Wikidata by setting a priority per the instruction on the category page. How is that problematic? — billinghurst sDrewth 12:54, 10 September 2019 (UTC)
And to note, that it was resolved by my preferring one image, not the removal of the image, and it was done on the same day that you created the page, and without knowledge of this post. So, normal maintenance processes work. — billinghurst sDrewth 13:02, 10 September 2019 (UTC)
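The behaviour discussed in this thread can be sketched roughly as follows. This is not the actual header module code: `pick_image` and its input format are hypothetical, and the only assumptions taken from Wikidata itself are the three statement ranks (preferred, normal, deprecated) and the convention that preferred statements win over normal ones. When two images share the best rank, the sketch returns no image and a flag, mirroring the "flag the page and fix the rank on Wikidata" workflow described above.

```python
def pick_image(statements):
    """Pick an image from (filename, rank) pairs.

    Returns (filename_or_None, needs_attention). Deprecated statements are ignored.
    """
    for rank in ("preferred", "normal"):
        candidates = [name for name, r in statements if r == rank]
        if len(candidates) == 1:
            # Exactly one image at the best available rank: use it.
            return candidates[0], False
        if len(candidates) > 1:
            # Tie at the best available rank: show nothing and flag for maintenance.
            return None, True
    return None, False


# Hypothetical filenames for illustration only.
tie = pick_image([("a.jpg", "normal"), ("b.jpg", "normal")])
resolved = pick_image([("a.jpg", "preferred"), ("b.jpg", "normal")])
```

An alternative design would fall back to the first tied image instead of showing none, but as noted above the first image was often a poor one, which is the reason given for flagging instead.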

Tech News: 2019-36

09:08, 4 September 2019 (UTC)

Guidance at Portal:Speeches/Copyright

I recently deleted a political speech by Carrie Lam, Chief Executive of Hong Kong, as copyright work. A user has pointed me to Portal:Speeches/Copyright which references

which states that political speeches are able to be hosted.

I would argue that the current approach has been to delete scripted political speeches, rather than to follow the first guidance. That said, I don't get to reframe the community's earlier consensus, nor to unilaterally impose my PoV, so I am putting this before the community.

If it is thought appropriate, this can be moved to WS:CV though I thought that I would start here initially due to the reference conversation being here and archived in the subpages. — billinghurst sDrewth 13:30, 9 September 2019 (UTC)

That guidance is terribly confused, and it's from 2005/6 (i.e. before any Wikimedia project had figured out the basics of copyright, much less had coherent copyright policies). As best I can tell it has not reflected actual community consensus for about a decade ({{PD-Manifesto}} was deprecated around 2010, and deleted in 2013; see this page and its further links), it's just never been updated.
Extemporaneous speech (a.k.a. "off the cuff remarks") is generally not considered to be protected by copyright, mainly because extemporaneous speech has not been fixed in a tangible form, which most copyright frameworks (Common law, in particular) demand in order for a work to be eligible for copyright. However, it is extremely unlikely that any public speech of sufficient notability to be hosted here was not prepared in advance and fixed prior to being memorised and recited, or read from the teleprompter. Even seemingly ad hoc statements made in response to audience or journalist questions are likely to have been previously fixed in the form of a list headed "Talking points". These are all copyrightable works subject to essentially the same copyright rules as any other kind of work.
In addition, even if the speech in question is truly extempore, we need to source the speech from somewhere; and that somewhere (like a news broadcast) is very likely to have its own copyright in the recording (think of it as the translator's copyright in a translated work).
There will of course be exceptions, that are truly extemporaneous speech, but these will rarely be speeches as such (and will often be out of scope for Wikisource). The guidance for speeches should be the same as for any other work: “Has its copyright expired?”, “Was it exempt from copyright under PD-USGov?”, etc.
In other words, the copyright guidance link on Portal:Speeches should be removed and the sub-page excised with fire. --Xover (talk) 15:58, 9 September 2019 (UTC)
Note that the WMF has posted their opinion on a related subject at m:Copyright of Political Speeches, and their opinion agrees with Xover's above: speeches that are written down by the speaker before hand are copyrighted, and off-the-cuff remarks are not copyrighted. Speeches written by an officer or employee of the United States Government as part of that person's official duties are exempt from copyright, but of course that does not apply to Lam's speeches. —Beleg Tâl (talk) 16:25, 9 September 2019 (UTC)
this is the copyright maximalist position - whenever you might try to reference a speech to a PD VOA video, then there is the mass nomination with "i saw the speech notes in their hands". but if they did not publish the speech text elsewhere, isn’t that intent to make public domain? i.e. precautionary becomes an enabler of the RIAA and MPAA. a better way would be keep as "no known copyright" until affirmative evidence of publication is found anywhere. Slowking4Rama's revenge 16:30, 9 September 2019 (UTC)
That is not how American copyright works. As soon as the text of the speech hits the paper, it is protected by copyright, regardless of the intent of the speaker, regardless of subsequent publication, regardless of whether anyone saw speech notes in anyone's hands. However, your point about keeping the speech anyway as "copyright status unknown" is a good one. —Beleg Tâl (talk) 16:38, 9 September 2019 (UTC)
american copyright law mostly does not work; and bad cases make bad law - i don’t see a wave of DMCA takedowns for speeches - and so "A work must be “fixed in a tangible means of expression” to be protected by copyright, but a work is “fixed” when [it] is sufficiently permanent or stable to permit it to be perceived, reproduced, or otherwise communicated for a period of more than transitory duration.”" means when i read from notes it is copyrighted, but when i throw away those notes it is not? (since they were not sufficiently permanent) do not nominate my wikimania talks. it is the imposition of Berne culture upon CC culture, i guess we can advise and extend our remarks to include a CC declaration. Slowking4Rama's revenge 12:25, 10 September 2019 (UTC)
Here is an important counterpoint: Help:Licensing compatibility#Presumed licensing: "Works whose licenses are unknown but are likely to be a compatible license or criteria […] are generally prohibited. However, some very limited conditions have been permitted by the community, most notably regarding public speeches." This is similar to what Slowking4 said above, that it would be better to keep the works until evidence of tangible medium is demonstrated. Note that there is a list of relevant discussions on the linked page. —Beleg Tâl (talk) 16:43, 9 September 2019 (UTC)
I'll note further that I don't really agree with this personally, because it seems to me that speeches are very unlikely to have a compatible license or criteria. —Beleg Tâl (talk) 16:46, 9 September 2019 (UTC)
That should also be fixed; it stems from the same time period as the Portal:Speeches guidance. Note that all the links predate the discussions that started in 2009(ish) and ended up with deleting PD-manifesto in 2013. The overall intent is fine—there's certainly a case to be made for keeping individual works of an unknown copyright status when special circumstances makes it likely-but-undocumented that free licensing obtains (*cough* Guerilla Open Access Manifesto *cough*)—but speeches as the example is actively misleading. --Xover (talk) 16:53, 9 September 2019 (UTC)
here are some videos with compatible license [10]; if video weren't so hard, we could have more; and as we move to video rather than text, the text based rules look more ridiculous; there is a push to get more uploaded with machine voice overs. so we need a consensus on the standard of practice, other than "this is professional video therefore delete"
the manifestos from the 60s typically are PD no notice. such were the proto-pirates. and no takedowns here [11] Slowking4Rama's revenge 23:24, 10 September 2019 (UTC)
It's clear that modern speeches are inherently copyrighted, and that anyone who wants their work freely used needs to say so explicitly.--Prosfilaes (talk) 02:04, 13 September 2019 (UTC)
here you go c:Category:Wikimania_2019_videos. but you might be as popular as the anti-cuteness crusader. Slowking4Rama's revenge 00:51, 15 September 2019 (UTC)
I'll assume that whoever uploaded those got the rights for them.--Prosfilaes (talk) 11:20, 15 September 2019 (UTC)

Should the disambiguation pages for "Song" and "A Song" be combined?

Hi. I just noticed that along with the nice, well-organized disambiguation page for Song, there's a page for A Song too, which is just the same kind of generic "song" poems which can only be told apart by their first lines. Would it make sense to put the "A Song" poems in the list with the "Song" poems and change the smaller page into a redirect? ~~ Teller XIV (talk) 20:40, 10 September 2019 (UTC) ~~

Per Talk:Song#Merging with A Song: "Based on previous discussions, the community consensus is that long disambig lists like Song shouldn't be merged with similar disambig lists at A Song and The Song" —Beleg Tâl (talk) 20:57, 10 September 2019 (UTC)
OK, thank you. ~~ Teller XIV (talk) 20:58, 10 September 2019 (UTC) ~~

Overfloat makes troubles with paragraphs

What could be the cause of the troubles with paragraphs when using the overfloat left template in Page:The Acts and Monuments of John Foxe Volume 3.djvu/96, and how can it be solved? A couple of days ago the paragraphs looked OK, so I guess something must have changed somewhere. Thank you very much. --Jan Kameníček (talk) 17:09, 11 September 2019 (UTC)

@Jan.Kamenicek: Not sure how that ever worked. This is using <div></div>-based (block) templates inside <span></span>-based (inline) templates, which is always going to be pure luck if it seems to work. Of the ones in use there, {{overfloat left}} and {{smaller}} work together; the rest don't. All of them except {{lh2/s}} can probably be replaced by new inline versions of the current templates if the need is great enough. {{lh2/s}} will probably simply not work as an inline template (line height, I think, can only be applied to block elements). --Xover (talk) 18:02, 11 September 2019 (UTC)
@Xover: Now I see. I guess the {{lh2/s}} could be replaced by {{lh}} which is based on span, so it would be perfect if the other templates were adapted for inline use. I cannot say how great the need is generally, but it would help me when proofreading this particular book of almost 800 pages :-) After several attempts this solution of sidenotes seemed best to me (before I discovered the paragraph problem) as I tried to avoid various distracting frames and also get the sidenote text as dense as possible so that two succeeding notes did not interfere with each other. --Jan Kameníček (talk) 18:28, 11 September 2019 (UTC)
This will increase the backlog of lint errors that I periodically try to decrease [12]. I fail to see the benefit of hammering templates that produce broken HTML only for the sake of trying to reproduce format as faithfully as possible. Mpaa (talk) 19:57, 11 September 2019 (UTC)
I thought that adapting the templates for inline use would help decreasing the lint errors. I am not seeking a solution causing any errors.
It does not have to be completely the same as original, although I would like to keep especially their position on the side of the text (i.e. not the text flowing around). It is also true that some other deviations from the original caused by our templates make the result ugly; for this reason it would be good if the text of the notes was aligned to the left, without distracting frames, and densely written (so that two succeeding notes do not interfere with each other). --Jan Kameníček (talk) 20:47, 11 September 2019 (UTC)
I think I got the solution, which finally turned out to be quite easy: Page:The Acts and Monuments of John Foxe Volume 3.djvu/96, while my previous experiment turned out to be unnecessarily complicated and weird; hope everything is OK now. Thanks for advising me with the div x span problem. --Jan Kameníček (talk) 22:18, 11 September 2019 (UTC)
yeah, i find your adherence to sidenotes quaint. i’m afraid i will be converting them to endnotes going forward. i.e. Page:Illustrations_of_the_history_of_medieval_thought_and_learning.djvu/42 -- Slowking4Rama's revenge 16:14, 12 September 2019 (UTC)
Now I feel really sorry for asking here: I searched help, not interference with my work. I strongly oppose, do not do it, please. If the author wanted to use endnotes, he would. The sidenote solution at Page:The Acts and Monuments of John Foxe Volume 3.djvu/96 works fine. --Jan Kameníček (talk) 16:44, 12 September 2019 (UTC)
@Jan.Kamenicek: I would very much assume Slowking4 means that they will convert side notes to end notes in the works they proofread themselves in the future, and not that they intend to do it in works that others (like you) have already proofread using side notes. Granted there are many issues we as a community disagree on, but doing that would be downright uncollegial! :) --Xover (talk) 18:46, 12 September 2019 (UTC)
Reading it again, I see that you are right and apologize for quite a hot-tempered reaction. --Jan Kameníček (talk) 19:00, 12 September 2019 (UTC)
yeah, sorry, i’m sure we can work out sidenote preference on work talk page. but you understand while i might applaud your attempts to fix sidenotes, i have given up for those works i will be doing. the backlog is so immense, that we could both go along without seeing each others edits for years. not enough value added for me. yrmv. Slowking4Rama's revenge 23:29, 12 September 2019 (UTC)
Why not use {{left sidenote}} for this sort of thing? I'd vastly prefer semantic templates over formatting ones any day (that is, templates that tell you what things are, rather than what they look like, like {{ch}} over {{c}}, etc). I've tried it out on your page (didn't save, of course), and the preview at least looked good to me.... Dcsohl (talk) 19:54, 12 September 2019 (UTC)
Before I started using the overfloat template, I had been trying the left sidenote template too, but 1) I really do not like the distracting frame and 2) the note in the left margin instead of the text flowing around imo also looks much better. --Jan Kameníček (talk) 20:12, 12 September 2019 (UTC)
Very strange. When I tried it out on your page, there was no frame and the content did not float around it; it looked much as yours does now, with a wider gap between the notes on the content. That was the only difference I saw. Dcsohl (talk) 20:54, 12 September 2019 (UTC)
@Dcsohl: Yes, because the template displays differently in the page namespace and the main namespace, see e.g. Page:The Solar System - Six Lectures - Lowell.djvu/20 (where it looks fine) and The Solar System/Chapter 1 (where it looks awful). --Jan Kameníček (talk) 21:27, 12 September 2019 (UTC)
…And at the moment {{left sidenote}} is (still) somewhat broken on two of the dynamic layouts because of errors in the CSS ("Proposed Layout", where the sidenotes get pushed to overlap the edge of the screen, and Layout 4, where they overlap the text). They work best in Layout 2. —Nizolan (talk) 23:17, 12 September 2019 (UTC)
@Nizolan: Ok, still new here, never noticed the different layout options before. What are they, what is the "Proposed Layout", and why does that one look so bad? But, to my original point, @Jan.Kamenicek:, this is why I like semantic templates better--the actual formatting is ultimately up to the device rendering the text. Showing that it is a sidenote in a clear fashion, I think, enhances the likelihood that it can be formatted in a meaningful way 20 years from now, even if the "Proposed Layout" doesn't look so great right now (and should be improved, rather than hacking around it). Dcsohl (talk) 01:12, 13 September 2019 (UTC)
Hmmm, that is something new to me as well, in layout 2 it looks OK, so I started thinking about changing it for left sidenote in combination with {{default layout|Layout 2}}. Good point, thanks. --Jan Kameníček (talk) 06:24, 13 September 2019 (UTC)
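For anyone wanting to check a page for the div-inside-span problem described earlier in this thread, here is a rough sketch using only Python's standard `html.parser`. It is an illustration, not a replacement for the linter: the `BLOCK` and `INLINE` sets are deliberately abbreviated, the sample markup is hypothetical and not the real output of {{overfloat left}} or {{lh2/s}}, and the `NestingChecker` class and `block_inside_inline` helper are made-up names.

```python
from html.parser import HTMLParser

# Abbreviated, illustrative tag sets (assumption: not a complete HTML content model).
BLOCK = {"div", "p", "table", "ul", "ol"}
INLINE = {"span", "a", "b", "i", "small"}


class NestingChecker(HTMLParser):
    """Record every block-level tag that is opened while an inline tag is still open."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK:
            inline_parents = [t for t in self.stack if t in INLINE]
            if inline_parents:
                # Report the block tag together with its nearest open inline ancestor.
                self.problems.append((tag, inline_parents[-1]))
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching open tag.
            while self.stack and self.stack.pop() != tag:
                pass


def block_inside_inline(html):
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.problems


# Hypothetical rendered markup: a block template nested in an inline one, and the reverse.
bad = block_inside_inline('<span class="note"><div style="line-height:2">text</div></span>')
good = block_inside_inline('<div class="note"><span>text</span></div>')
```

Running it over the rendered HTML of a page (rather than the wikitext) would show which template combinations produce the nesting that only works by luck.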

Jardine Naturalist's Library Exotic Moths - Complicated Table Page.

Hi,

I have a complicated table page that needs formatting to completion, but I have been unable to find similar examples that I could follow to finish it off.

I would appreciate if someone could format this for me? or maybe provide a link to something close enough to its style so that I can try and understand and replicate the layout?

The page is here [13]

Thanks,

Sp1nd01 (talk) 10:14, 13 September 2019 (UTC)

{{p}} may need work

{{p}} needs more margin options. See the complex piece of formatting on this page; if I try to reproduce that with {{p}} here is what I get:

But first give me leave to move his foot,
That he is dead is quite beyond dispute.

[Moving the horse’s feet.

When you have seen all my bill exprest,
My wife, to conclude, performs the rest.

Levana Taylor (talk) 05:56, 15 September 2019 (UTC)

Why would you even be going to that sort of formatting? It seems unnecessary complexity for zero value.

But first give me leave to move his foot,
That he is dead is quite beyond dispute.

[Moving the horse’s feet.

When you have seen all my bill exprest,
My wife, to conclude, performs the rest.

billinghurst sDrewth 07:25, 15 September 2019 (UTC)

@Levana Taylor: It didn't work because {{p}} didn't actually have mt0 and mb0 options. I've added them now. But note that I also agree with Billinghurst here: don't go to extremes to reproduce formatting; good enough is good enough. Especially when the complex formatting starts to create problems on its own. --Xover (talk) 07:37, 15 September 2019 (UTC)
Thanks for adding those. I actually hardly ever use paragraph margins (I used to but "simplify, simplify, simplify" - I’ve gone through my old work and removed all fancy formatting). But zero margins are useful for a variety of purposes. For one thing, they help with avoiding the use of <poem>. Speaking of things that create problems, <poem> is the great offender & I am trying to never use it. Levana Taylor (talk) 08:01, 15 September 2019 (UTC)
For many years we allowed the words of the author to be king, and let the typography and typesetting of the printer guide us, not force us. We wanted to simplify as someone else has to validate the work, and there should not be the expectation for people to dance through unnecessary complexity. So when I do my proofreading I try to keep those key components in my mind.

<poem> is acceptable usage here, though we allow first transcriber to make the choice for their work. So we don't condemn its usage, that is about your choice.

Simplicity is also key where someone wishes to copy and paste our work and get something useful. It is the philosophy of not forcing a font face unless truly needed, and it is why we use relative sizing, so that we do not force the reader's browser display. So I implore you to not knock simplicity in our replicating works, and to focus on the whole, not the part of a publisher as the priority. — billinghurst sDrewth 08:14, 15 September 2019 (UTC)

True, "does the text come out right if cut & pasted" is a bare minimum rule of thumb for good formatting. That’s not taking into account tables in the text, to be sure. Levana Taylor (talk) 08:26, 15 September 2019 (UTC)

One work, two PDFs

I have uploaded to Commons a two-page work on two separate PDFs (1, 2). How can I make a transcription project? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:52, 15 September 2019 (UTC)

I suggest merging them into one PDF; two pages should be easy, e.g. with an online PDF editor. Otherwise you have to create the pagelist manually from the individual scans, see e.g. Index:Supplemental Act of July 12, 1862.jpg or Index:Florence Earle Coates Mine and Thine (1904). --Jan Kameníček (talk) 21:18, 15 September 2019 (UTC)
Thank you. I managed to do the former in this case, but it's good to know the latter technique for future reference. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:14, 15 September 2019 (UTC)
  This section is resolved and can be archived. If you disagree, replace this template with your comment. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:14, 15 September 2019 (UTC)

1941 UK publication

"Contra-Props" is a journal article published in the UK in 1941 by a British author who died in 1947. Does it need to be moved to Wikilivres? If so, what's the process for that? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:48, 15 September 2019 (UTC)

Index:Coloured Figures of English Fungi or Mushrooms.djvu

Can someone please write a style guide for this, so that I am not constantly having to do minor edits to get a consistent style for things like the formatting of the first letter and first word of paragraphs? Thanks. ShakespeareFan00 (talk) 09:48, 17 September 2019 (UTC)

I am not sure what you need to have in the guide, but after a quick look I would say that the pictures in most of the (validated!?) pages, like Page:Coloured Figures of English Fungi or Mushrooms.djvu/159, need 1) to be cut 2) to be stripped of the handwritten pencil text. Generally, it seems to me that speed is too often preferred to quality in the validation process. --Jan Kameníček (talk) 10:45, 17 September 2019 (UTC)
Also, the images used should be from the original scan at IA, not extracted from the compressed-to-death DJVU file. Compare:
This script User:Inductiveload/Jump_to_file.js adds a direct link from the Page namespace to the image file at IA, as long as there's an IA link on the Commons description page. Inductiveloadtalk/contribs 13:01, 17 September 2019 (UTC)
By way of example I've cropped the image on that page (as a new file; the original remains on Commons, as should they all), using Commons' 'crop tool'. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:25, 17 September 2019 (UTC)

Help identifying a symbol

Hi. I'm proofreading a book in Spanish and stumbled upon a symbol I've never seen before. See this page. In the footnotes, after the numbers, there is a symbol that, in that context, I understand to mean "thousands", but I can't find it elsewhere. Have any of you found something similar? In the meantime I will proofread it as 000. Regards, --Ninovolador (talk) 21:27, 17 September 2019 (UTC)

Never seen it before either. I would suggest that you put a <!-- remark --> in place, for the validator to see. — billinghurst sDrewth 23:21, 17 September 2019 (UTC)
@Ninovolador: it appears to be called a calderón, millaron, or millar. There is a history of the symbol here. In this thread on the Unicode mailing list they note that some publications have represented the symbol using ↄ or ¶, and also make the suggestion that Ɔ⃦ is a good representation provided that the reader's browser renders it correctly (it's supposed to position the two vertical lines directly centered over the Ɔ)—Beleg Tâl (talk) 00:15, 18 September 2019 (UTC)
@Beleg Tâl, @Billinghurst: Thank you both! Especially Beleg, for your time and research! --Ninovolador (talk) 00:47, 19 September 2019 (UTC)

Note: added an author search field to Wikisource:Authors, and some author category pages

As a trial, I have added an author search field to Wikisource:Authors, Category:authors and the page(s) at/below Category:Authors by alphabetical order. If it is problematic, then please leave a note here, and we can look to see what the resolution should be. If it suits the community, then we can leave it in. — billinghurst sDrewth 00:22, 18 September 2019 (UTC)

Seems to work well. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:42, 18 September 2019 (UTC)
OK by me —Beleg Tâl (talk) 17:54, 18 September 2019 (UTC)

Proofreading line by line

I tried to do some proofreading and I experienced two difficulties:

1) I need to jump back and forth between the text version and the scan. I usually need to scroll up and down as well, since the lines aren't aligned.

2) I need to proofread a whole page in order to submit, which means that if I don't have 10+ minutes to carefully check the whole page, I had better not start.

My proposed solution is to have the scans broken down into lines, so that one can proofread one line at a time. It would also allow mobile users to contribute easily.

This approach was implemented by Haifa University in the Tikkoun Sofrim project.

I would like to know if there is any reason this wasn't done. If there is no such reason, I would like to start developing this tool, and as an MVP simply upload a PDF with one line per page. Uziel302 (talk) 20:22, 19 September 2019 (UTC)

1) When proofreading, you should be able to drag the image to a position where the lines are aligned.
2) When I am only able to proofread part of a page, I will usually place a comment like <!-- PROOFREAD UP TO THIS LOCATION --> at the place where I proofread up to. Or you can just proofread part of it and make life easier for the next person who finishes the page.
3) If your tool requires that users chop up book scans line-by-line, I don't imagine it will gain much traction, but I'm certainly intrigued. —Beleg Tâl (talk) 20:33, 19 September 2019 (UTC)
I would be enthusiastic about a well-written proofreading gadget of the sort you describe; it would take quite a bit of programming though! I don’t think hand-uploading a line-by-line PDF is the solution. Too much work by far. Instead, image recognition has now advanced to the point that software should be able to pull apart an image into one-line image strips (and it could deal with multiple columns too -- not too hard to recognize them, or it could ask you how many columns). This would be far less of a challenge than OCR! Levana Taylor (talk) 20:39, 19 September 2019 (UTC)
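To make the "pull apart an image into one-line strips" idea concrete, here is a minimal sketch in Python using a horizontal projection profile: count the dark pixels in each row and treat every run of inked rows as one line. This is only an illustration under assumptions that are not from the thread: the page is a single column, already deskewed, and given as a list of rows of greyscale values (0 = black, 255 = white). The function name and the thresholds are hypothetical.

```python
def split_into_line_strips(image, ink_threshold=128, min_ink_pixels=1):
    """Return (top, bottom) row ranges, bottom exclusive, one per text line.

    `image` is a list of rows of greyscale values (assumed 0 = black).
    A row counts as "inked" when it has at least `min_ink_pixels` pixels
    darker than `ink_threshold`; consecutive inked rows form one strip.
    """
    strips = []
    start = None  # first row of the strip currently being collected
    for y, row in enumerate(image):
        ink = sum(1 for pixel in row if pixel < ink_threshold)
        if ink >= min_ink_pixels:
            if start is None:
                start = y
        elif start is not None:
            strips.append((start, y))
            start = None
    if start is not None:  # strip that runs to the bottom edge
        strips.append((start, len(image)))
    return strips


W, B = 255, 0  # white and black pixels for a toy 6-row "page"
page = [
    [W, W, W, W],
    [W, B, B, W],
    [W, B, W, W],
    [W, W, W, W],
    [B, B, B, W],
    [W, W, W, W],
]
print(split_into_line_strips(page))  # [(1, 3), (4, 5)]
```

Real scans would need more than this (noise filtering, skew correction, column detection, descenders touching the next line), so treat it as a starting point rather than a working gadget.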
At the very least, the DjVu and PDF files must have the OCR'd text locations available. Furthermore, if the scan is from the IA, this information is directly available in the djvu.xml file in terms of page, column, line and word co-ordinates, so you don't even need to pick apart the file to get at it. For example: a random book's XML. Inductiveloadtalk/contribs 21:29, 19 September 2019 (UTC)
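As a rough sketch of how those co-ordinates could be used, the Python below reads word boxes from djvu.xml-style markup and merges them into one bounding box per line. The element names (LINE, WORD) and the comma-separated coords attribute are assumptions about the file layout, and the sample data is invented for illustration, so check both against a real file before relying on them. Taking the minimum and maximum of the x and y values means the result does not depend on whether the top or bottom edge is listed first.

```python
import xml.etree.ElementTree as ET

# Invented sample, shaped like the assumed djvu.xml layout.
SAMPLE = """<DjVuXML><BODY><OBJECT><HIDDENTEXT><PAGECOLUMN><REGION><PARAGRAPH>
<LINE><WORD coords="100,250,180,200,250">But</WORD>
<WORD coords="200,252,300,198,250">first</WORD></LINE>
<LINE><WORD coords="100,320,220,270,320">That</WORD></LINE>
</PARAGRAPH></REGION></PAGECOLUMN></HIDDENTEXT></OBJECT></BODY></DjVuXML>"""


def line_boxes(xml_text):
    """Return [(text, (left, top, right, bottom)), ...], one entry per LINE."""
    root = ET.fromstring(xml_text)
    boxes = []
    for line in root.iter("LINE"):
        xs, ys, words = [], [], []
        for word in line.iter("WORD"):
            raw = word.get("coords", "")
            nums = [int(n) for n in raw.split(",") if n.strip()]
            if len(nums) < 4:
                continue  # skip words without a usable box
            # First four values are assumed to be two x/y corner pairs;
            # any fifth value (assumed to be a baseline) is ignored.
            xs += [nums[0], nums[2]]
            ys += [nums[1], nums[3]]
            words.append((word.text or "").strip())
        if xs:
            box = (min(xs), min(ys), max(xs), max(ys))
            boxes.append((" ".join(words), box))
    return boxes


for text, box in line_boxes(SAMPLE):
    print(box, text)
```

Each box could then be used to crop the page image into a strip that is shown next to the matching line of OCR text.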
I can imagine that proofreading line by line works well if the contributors focus on pure text. However, Wikisource tries to focus also on text formatting, and so it is much better to work with the whole page. Formatting each line separately would be too much work and I suppose that putting the formatted lines together would also be very difficult. --Jan Kameníček (talk) 21:33, 19 September 2019 (UTC)
What Jan said ^^^, and most importantly where hyphenation, formatting and wikilinks are concerned. FWIW, Trove has its newspapers line by line, and it works there because Trove replicates a work and is focused on text search, not so much on the reproduction of books and works. — billinghurst sDrewth 03:38, 20 September 2019 (UTC)