Wikisource:Scriptorium


The Scriptorium is Wikisource's community discussion page. Feel free to ask questions or leave comments. You may join any current discussion or start a new one; please see Wikisource:Scriptorium/Help.

The Administrators' noticeboard can be used where appropriate. Some announcements and newsletters are subscribed to Announcements.

Project members can often be found in the #wikisource IRC channel (webclient). For discussion related to the entire project (not just the English chapter), please discuss at the multilingual Wikisource. There are currently 408 active users here.

Announcements

Proposals

Bot approval requests

Repairs (and moves)

Designated for requests related to the repair of works (and scans of works) presented on Wikisource

See also Wikisource:Scan lab

Index:The_Strand_Magazine_(Volume_22).djvu

Despite the name, this appears to be a scan of Volume 23 of The Strand Magazine, rather than 22, so it would be good to get it replaced with the correct volume (for example, https://archive.org/download/TheStrandMagazineAnIllustratedMonthly/TheStrandMagazine1901bVol.XxiiJul-dec.pdf ).

Let me know if there is a better place to post this, as this is my first post that isn't a proofreading edit. Qq1122qq (talk) 14:42, 29 July 2022 (UTC)

At Wikisource:Scan lab. Mpaa (talk) 08:57, 7 August 2022 (UTC)
Thanks - it looks like this has been noticed and dealt with, but I'll use the scan lab for future problems. Qq1122qq (talk) 13:18, 28 August 2022 (UTC)

Nihongi: Chronicles of Japan from the Earliest Times to A.D. 697

Move to Nihongi: short title by which the work is generally known, disambiguation not being required as this is the only public domain English translation. TE(æ)A,ea. (talk) 02:04, 20 August 2022 (UTC)

  Not done. The contributor put it where it is back in 2014, and there is no requirement to move it. I will create the redirect to the work. — billinghurst sDrewth 09:00, 18 September 2022 (UTC)
If there were to be a naming convention on this (I wish people were on board with actually having this), I think having subtitles in nonfiction books is fine (as long as you don't include sub-subtitles and so on), because nonfiction books are often referred to (and distinguished) by their subtitles. But I think in fiction books the subtitles (such as silly things like "The Fifth Wheel: A Novel", "..., or, The Lover's Quarrel" or "...: The Story of a ... and His ...") are excessive and should never be included in the titling, except under exceptional circumstances. PseudoSkull (talk) 14:36, 22 September 2022 (UTC)
  • billinghurst, PseudoSkull: The subtitle for this work was added by the translator; the original had no subtitle. It is not as if this work is frequently referred to with the subtitle, either, as (since this is the only English translation) it is simply the Nihongi. There is no reason to have the work at the longer title, and it is much easier to use a work with a shorter title. TE(æ)A,ea. (talk) 14:59, 22 September 2022 (UTC)

File:Letters, speeches and tracts on Irish affairs.djvu needs pages inserted

This is missing roman numeral pages x and xi (first ToC) which need to be inserted as /16 and /17. They are available as TIFs at /16 and /17. I'll move the pages to make way for an update. @Mpaa: billinghurst sDrewth 09:09, 18 September 2022 (UTC)

@Billinghurst file is fixed. Mpaa (talk) 18:08, 19 September 2022 (UTC)

Other discussions

Policy on substantially empty works

[This is imported from WS:PD, where it applies to multiple current proposals, and several other works].

We have quite a few cases of works that are "collective" or "encyclopaedic" in that they comprise many standalone articles of individual value, which are basically just "shell pages", with no substantial content of any sort, not even imported scans or Index pages. For example, and this isn't intended to make any statement about these specific works, they're just examples and they may well get some work done soon during their respective WS:PD discussions:

  • Auction Prices of Books
  • Central Law Journal/Volume 1
  • A Critical Dictionary of English Literature
  • Bradshaw's Monthly Railway Guide

Based on the usual rate of editing for things like that, unless dragged up into a process like WS:PD, they'll remain that way a very, very long time. I think perhaps there might be a case to host a mainspace page for such a work, even though there is zero, or almost zero, actual content. Do we want:

  • Mainspace pages where there is a tiny bit of information like header notes, scan links and maybe detective work on the talk page (not in this case). This provides a place for people to incrementally add content. It also gives "false positive" blue links, since there is actually no "real" content from the work itself, or
  • Do not have a mainspace page until there's some content. Only host this in terms of author/portal scan links, much like we do for something like a novel.

Personally, I lean (gently) towards #2, but with a fairly low bar for how much content is needed. Say, Indexes, basic templates, a title page and one example article. Ideally, a completed TOC if practical, especially for periodical volumes/numbers. It is fair to not wish to transcribe entire volumes of these works, it is fair to not want to import dozens of scans when you only wanted one, it is fair to only want an article or two, but it's not fair, IMO, to expect the first person who wants to add an article to have to do all the groundwork themselves, despite having been lured in with a blue link. That onus feels more like it should be on the person creating the top-level page in the first place.

I do see some value in periodical top pages with decent lists of volumes and scans where known, because these are often tricky and fiddly to compile from Google books/IA/Hathi, so it's not useless work, even if there are no imported scans (though imported is better than not).

We currently have a large handful of collective works listed for deletion right now in various levels of "no real content", and, furthermore, every single periodical that gets added can fall into this situation unless the person who adds, so I think we could have a think about what we really want to see here. Inductiveloadtalk/contribs 15:43, 3 July 2020 (UTC)

  • I believe that, if there is no scan as an Index: page, the main-namespace page should not exist unless it is being actively completed or is already mostly completed. A few pages (of the volume itself) is not very helpful, and is entirely useless if there is no scan given. TE(æ)A,ea. (talk) 15:59, 3 July 2020 (UTC).
  • I think such preparatory information would ideally be on more centralized WikiProject pages (for the broad subject), both for clarity and to assist in keeping different efforts consistent -- but that it certainly should be retained as visible to non-admins. I think that the red vs blue link issue is minor (but not totally negligible) and outweighed by the disadvantages of hiding the history of previous efforts. I strongly encourage redirecting such pages to appropriate WikiProject pages (after copying over the details there). JesseW (talk) 18:11, 3 July 2020 (UTC)
  • @JesseW: I agree that history shouldn't be deleted, but I think we should approach this in terms of what we want to see from these works, rather than what to do with the handful of examples at PD. There are hundreds of periodicals we could have but don't, and this applies to those as well. If we can come to a conclusion about what is and isn't wanted, we can make all the deletion requested works conform to that easily enough. Inductiveloadtalk/contribs 20:55, 3 July 2020 (UTC)
  • I think these pages are necessary to list index pages and external scans of multi-volume works (such as encyclopaedias and periodicals) especially if they are wholly or partly anonymous or have many authors or are simply large. I think it makes no difference whether such pages are in the mainspace, the portal space or the project space (except that it is harder to find pages outside the mainspace). The point is that these works often have so many volumes (often dozens or hundreds) that they must have their own page, and cannot be merged into a larger portal or wikiproject. If the community starts insisting on index pages, what will happen is the rapid upload of a large number of scans for the periodicals that already have their own page. Likewise if the community insists on transclusion. I also think it is reasonable to have a contents page in the mainspace, as it allows transclusion of articles. Most importantly, new restrictions should not immediately apply to existing pages that were created before the introduction of the restrictions. This is necessary to prevent a bottleneck. James500 (talk) 23:55, 3 July 2020 (UTC)
move the works to a maintenance category, and i will work them; delete them and i will not: i find your sword of Damocles demotivating. Slowking4Rama's revenge 01:55, 5 July 2020 (UTC)
@User:Slowking4: I am not proposing a sword of Damocles. I agree that the imposition of deadlines is counter-productive. I do not support the deletion of any of these pages. I would prefer to see them improved. James500 (talk) 04:38, 5 July 2020 (UTC)
TEA is on his usual deletion spree. not a fan. will not be finding scans to save texts, any more. he can do it. Slowking4Rama's revenge 00:15, 6 July 2020 (UTC)
The entire point of moving this here, and not staying at WS:PD, is to decouple from the emotions that get stirred up in a deletion discussion. Let's keep deletion out of this. If we come up with some idea of what we do and don't want, then we can go back to WS:PD and decide what to do. I imagine that all that will be needed will be a fairly limited amount of housework to bring those works up to some standard that we can decide on here, and all the collective works there will be easy keeps. Hopefully with some kind of consensus that we can point at to outline a minimum viable product for such works going forward. There are hundreds and thousands of dictionaries, encyclopedias, periodicals and newspapers that we could/will, quite reasonably, have only snippets of. How do we want to present them? What, exactly, is the minimum threshold? Let's head all those future deletion proposals off at the pass, because deletion proposals often cause friction. Inductiveloadtalk/contribs 00:47, 6 July 2020 (UTC)
and yet deletion is the default method to "motivate" quality improvement. i reject your assertion that "emotions get stirred in a deletion discussion", rather, anger is a valid response to a repeated broken process being kicked down on the volunteers. it is unclear that a minimum threshold is necessary, rather a functional quality improvement process is. until we have one, you should expect to see this periodic stirring of emotions, as the non-leaders act out. Slowking4Rama's revenge 11:53, 9 July 2020 (UTC)
@Slowking4: Thank you for presenting this opinion, and I'm sorry if I have not made myself clear. We do need to figure out how to avoid a de-facto process of using WS:PD as an ill-tempered ad-hoc venue for "forcing" improvements on people who have somehow managed to generate works that are so in need of improvement that another user has nominated them for deletion. Please also consider looking at #Re-purpose_WikiProject_OCR_to_WikiProject_Scans for an idea to have a "functional quality improvement process" to which such works could be referred upon discovery rather than kicking them straight to WS:PD. If you have other ideas or you have previously suggested something similar to address these frustrations, you could detail them there. Personally, I think we should always prefer improvement over deletion. Exactly what the remediation is (refer to a putative WP:Scans, WS:Scriptorium/Help, directly WS:PD as now, or something else) is not what this thread is for. This thread is for discussing what, if anything, should be the tipping point for deeming a page "lacking" and doing something about it, whatever "something" is. I don't think I can be much clearer that this is not about deletion. If we also have a better venue for improvements, then that's even better.
For example, my personal feeling and !vote on A Critical Dictionary of English Literature is "keep and improve", despite it lacking scans or even links to scans, having only one article and no other content, not even a title page: in short, failing almost every criterion suggested so far in this thread. The only thing it does have is good text quality of the one entry. I personally do not think this work should be deleted, but I do think it should be improved in specific ways. The first half of that sentence is not the focus of this discussion, the second half is. Inductiveloadtalk/contribs 14:18, 9 July 2020 (UTC)
deletion threat has been an habitual method of communicating by admins since the beginning of the project. and text dumps have been habitual following in the gutenberg example. culture change and process change would be required to change those behaviors. we could make it easier to start scan backed works, but the wishlist was not supported. Slowking4Rama's revenge 21:00, 14 July 2020 (UTC)

I don't think this needs to be much of an issue going forward -- we all agree that it's OK to create Index pages for scans, even if none of the Pages have been transcribed yet; so the only case where this would come up is recording research where no scan has yet been identified as suitable to be uploaded. And for that, I still think a WikiProject page is the right location, not mainspace. (Or, if you must, your userpage.) JesseW (talk) 00:59, 6 July 2020 (UTC)
I realized I may not have been clear enough here -- in my view, the ideal process goes like this:

  1. Decide on a work you are interested in (in this case, a periodical/encyclopedic one) -- don't record that anywhere on-wiki (except maybe your user page)
  2. Find and upload (to Commons) a scan of one part/issue/etc of the work.
  3. Create a ProofreadPage-managed page in the Index: namespace for the scan. (You can stop after this point, without worry that your work will later be discarded.)
  4. EITHER
    1. Put further research (on other editions, context, possible wikification, etc.) on that Index_talk page.
    2. Proofread a complete part of the scan (an article from the magazine issue, a chapter from the book, an entry from an encyclopedia, etc.) and transclude it to the mainspace (and create necessary parent pages), and put the further research on the Talk: page of the parent mainspace entry.

If you can't find any scan, and don't want to leave your working notes on your user page, put them on a relevant WikiProject's page.

If you come across such research done by others and misplaced, follow the above process to relocate it to an appropriate place, then redirect the page where you found it to the new location. That's my proposal. JesseW (talk) 01:08, 6 July 2020 (UTC)

@JesseW: It's not clear to me in your above whether when you use the term "index" you refer to a ProofreadPage-managed page in the Index: namespace, or a general wikipage in the main namespace on which an index-like structure (and/or a ToC, or similar) is manually created. Could you clarify? --Xover (talk) 05:14, 6 July 2020 (UTC)
I meant the namespace. Clarified now. JesseW (talk) 05:17, 6 July 2020 (UTC)
  • Hoo-boy. Y'all sure know how to pick the difficult issues…
    My general stance is that: 1) scans and Index: (and Page:) namespace pages have no particular completion criteria to meet to merit inclusion, and can stay in whatever state indefinitely (there may be other reasons to get rid of them, but not this); and 2) the default for mainspace is that only scan-backed complete and finished works that meet a minimum standard for quality should exist there.
    That general stance must be nuanced in two main ways: 1) there must be some kind of grandfather clause for pre-existing pages; and 2) there must exist exceptions for certain kinds of works that meet certain criteria. I won't touch on the grandfather clause here much, except to say I'm generally in favour of making it minimal, maybe something like "No active effort to get rid of older works, but if they're brought to PD for other reasons they're fair game". The design of a grandfather clause for this is a whole separate discussion, and an intelligent one requires analysis of existing pages that would be affected by it. It is always preferable to migrate pages to a modern standard, so a grandfather clause is by definition a second choice option.
    Now, to the meat of the matter: the exceptions…
    We have a clear policy to start from: no excerpts. Works should either be complete as published, or they should not be in mainspace. But quite apart from the historical practices that modify this (which are somewhat subjective and inconsistent, so I'll ignore them for now), there are some fairly obvious cases that suggest a need for more nuance than a simple bright-line rule alone provides. The major ones that come to mind are: 1) massive never-completed projects like EB1911 or the New York Times (EB because it's big; NYT because new PD issues are added every year); 2) compilations or collections of stand-alone works with plausible claim to independent notability.
    For encyclopedias and encyclopedia-like things, we have to accept some subsets due to sheer scale of work. But when that is the grounds for exception, there needs to be some minimum level of completion. I'm not sure I can come up with a specific number of pages/entries or percentage, but it needs to be more than just a single entry (and, obviously, only complete entries). For this kind of exception to apply, I think it needs to be a requirement that the framing structure for it is complete: that is, the mainspace page should give a complete overview of the relevant work even if most of it is redlinks. That includes title pages and other prolegomena when relevant. For a periodical like the NYT, that means complete lists of issues with dates and other such relevant information (e.g. name changes etc.). For preference, these kinds of things should be in Portal: namespace or on a WikiProject page until actually complete, but that will not always be practical (EB1911 and NYT are examples of this). Mainspace or Portal:-space should never contain external links (i.e. to scans) or links to Index: or Page: space (except the implied link of transclusion and the "Source" tab in the MW UI provided by ProofreadPage).
    For exception claimed under independent notability there are a couple of distinct variants.
    Newspaper or magazine articles need to have a certain level of substance in addition to a specific identifiable byline (possibly anonymous or pseudonymous, and possibly identified after the fact by some other source, such as the Letters of Junius) in order to qualify. It is not enough to ipso facto be a newspaper article, a magazine article, a poem, or an encyclopedia entry. On the one hand we have things like dictionaries and thesauri, where an entry could be as little as two words. Or a one-sentence notice without byline in a newspaper. Or two rhymed lines (technically a poem) within a 1000-page scholarly monograph.
    To merit this exception it should be reasonable to argue that the "work" in question should exist as a stand-alone mainspace page (not that we generally want that; but as a test for this exception, it should be reasonable to make such an argument). This would clearly apply to moderately long entries in the EB1911 written by a known author that has their own Wikipedia article. It would apply to short stories or novella-length serialisations in literary magazines by authors that have later become famous (or "are still …"). It would apply to various longer-form journalistic material from identifiable journalists (again, rule of thumb is notable enough for enWP article), including things in magazines that have similar properties. For most periodicals the most relevant atomic (indivisible) part is the issue not the entry or article, but with some commonsense exceptions.
    It would, generally, not apply to things that are works by a single author, like a scholarly monograph that just happens to be arranged in "entries" rather than chapters. It would not apply to things that are essentially lists or tables of data. It would not apply to short entries in something encyclopedia-like or entries that are not by an identifiable author. The OED for example, iirc, is a collective work where entries are by multiple not individually identifiable authors (and each entry is mostly very short too); only the overall editor is usually cited.
    For works claiming this exception too, the framing structure should be complete, even if most of it is redlinks. The same general rules about Portal:/WikiProject and no external or Index:-space links apply. An exception would be for periodicals where new issues enter the public domain every year; and we should generally avoid including even redlinks for the non-PD issues here (but may allow them in a WikiProject page). For non-periodical works in multiple volumes where some volumes were published after the PD cutoff, including listings for the non-PD volumes (but not links to scans; those are a copyvio issue) is ok.
    Poems, short stories, and novellas are a special class of works here. A lot of these were first published in a magazine (possibly serialized), and a lot of them exist as multiple editions in substantially the same form. Some exist in multiple versions. These should all primarily exist the same way as chapters as part of their various containing works; but there are some cases where we might want to have, for example, a series of connected pages of the poems of Emily Dickinson. I am significantly ambivalent about this practice, as it amounts to making our own "edition" or "collection" of her poems (in violation of several of our other policies), but I acknowledge that it is an established practice and it is something that has definite value to our readers. It may be that it is actually a practice that should be governed by its own dedicated policy rather than being handled within these other general policies.
    For the sake of example; applying this to the works Inductiveload listed at the start of this thread would shake out something like this:
    Auction Prices of Books—This work appears to have no sensible subdivisions and is in any case by a single author. I see no obvious reason to grant this work an exception, except under sheer volume of work and even there I would want to see both a substantial proportion completed and some kind of ongoing effort towards completion (no particular time frame, but definitely not infinite and definitely not as an effectively abandoned project). In a deletion discussion I would very likely vote to delete the mainspace pages here (but, as nearly always, to keep the Index: and Page: namespace artifacts). I don't see this as a reasonable candidate for a Portal:, nor really a good fit for a WikiProject (though I probably wouldn't object to a WikiProject if someone really wanted one).
    Central Law Journal/Volume 1—A single volume is too little, so I would want to see a complete structure for the entire Central Law Journal, with level of detail for each volume similar to the one existing volume. Each article in the journal can be individually considered for a stand-alone work exception; but for the collection I would want to see at minimum a full issue finished to justify having the mainspace structure, and preferably multiple issues (in a deletion discussion I might insist on multiple issues). Index: and Page:-space artefacts can, of course, stay. A Portal: might make sense for selections from the journal, of articles that meet the standalone work exception. A WikiProject to coordinate work and track links to scans etc. might be a decent fit here, if someone wanted that. As it currently stands I would probably vote delete for the mainspace artefacts (with option to move whatever content has reuse value to a non-mainspace page for preservation; and undeleting if someone wants to work on something is a low bar).
    A Critical Dictionary of English Literature—The top level mainspace page has near-zero value, existing only to link to the single transcribed entry. For a credible claim to exception to exist it would need to be a complete framework for the work as a whole, and significantly more than a single entry must be complete. I would probably also want to see ongoing work, unless a substantial percentage of the entries were complete. The single finished entry is eligible to claim a standalone work exception, but I think it probably would not meet my bar for that (I might be wrong; and the rest of the community might judge it differently). In a deletion discussion I would probably vote to delete all the mainspace artifacts here (as always keeping Index:/Page: stuff) but with a definite possibility that I might be persuaded on the one completed entry (an absolute requirement for convincing me would be to scan-back it: as a separate issue, my tolerance for grandfathering of non-scan-backed works is small, and effectively zero for new/non-grandfathered works).
    Bradshaw's Monthly Railway Guide—Would need a full framework and a number of individual issues finished to merit a mainspace page. I see no credible subdivisions for a standalone work exception, but might be persuaded otherwise if, say, one of the train tables was used as a (reliable primary) source in a Wikipedia article (implying some sort of notability beyond just being raw data). In a deletion discussion I would probably vote to delete all mainspace artifacts here. If anyone made the argument, I would entertain the notion that there is value in treating train tables like poems, and hosting a series of train tables like we do Dickinson's poems; but that would require a substantial number of them completed.
    For everything above my stance is nuanced by a willingness to accept temporary exceptions for things that are actively being worked: active being operative, but with no particular deadline to complete the work. We have differing amounts of time available, and some works are so labour-intensive or tedious to do, that my personal threshold for "active" is a pretty low bar to clear. If it's months and years between every time you dip in and do a bit I might start to get antsy, but days or weeks probably won't faze me. And that the projected time to completion is very long at that pace is not particularly a problem so long as it is not infinite. Within those parameters I would always tend to err on the side of letting contributors just get on with it in peace, regardless of any of the policy-like rules sketched above.
    I also want to emphasise that I think this is a very difficult issue to deal with. There are a lot of competing concerns, and a lot of grey areas that will likely take individual discussions to resolve. My balance point on this issue is partly formed by a broader concern about our overall quality (we have waay too many works of plain sub-par quality, and too many not up to modern standards) and a hope that by preventing the creation of these kinds of works (rather than deleting them after creation) we will be able to retain the good and desirable exceptions without dragging down quality, and without the traumatic and stressful events that deletions and proposed deletion discussions are.
    And for that very reason I am grateful this issue was brought up here for discussion, and I hope we can end up with some clear guidance, possibly in the form of a policy page, going forward. And in any case, since it will create de facto policy, this is a discussion that needs to stay open for a good long while (there are several community members that have not yet commented whose opinion I would wish to hear before closing this), and depending on how well we manage to structure the consensus, may also require a formal vote (up in the #Proposals section). --Xover (talk) 09:03, 6 July 2020 (UTC)
  •   Oppose. It is becoming clear that a policy on incomplete works in the mainspace is going to place enormous pressure on individual editors. I think it would be more effective to start a wikiproject devoted to scan-backing works that lack scans and so on. James500 (talk) 12:14, 6 July 2020 (UTC)
    • @James500: FYI, this thread was made in order to provide an exception to the current policy of "no excerpts". A literal reading of the policy as it stands has a plausible chance of coming down delete on the mainspace pages over at WS:PD. This thread is a chance to come up with a better way to support such partial collective works. That we have several substantially incomplete and abandoned collective works lolling around in mainspace is actually the result of laxity in respect to stated policy (not to say I think it's a bad thing). The deletion proposals, whatever you may think of them, are actually not in contradiction to policy. That said, as always, there is scope to adjust policy. Which is what this is.
    • Now, in terms of a WikiProject to scan back works, I think that is a good idea. See #Re-purpose_WikiProject_OCR_to_WikiProject_Scans above, which proposed to reboot Wikiproject OCR as a scan-backing Wikiproject. Inductiveloadtalk/contribs 14:40, 6 July 2020 (UTC)
      • The policy says "When an entire work is available as a djvu file on commons and an Index page is created here, works are considered in process not excerpts." A literal reading of that policy is that no scan-backed work is an excerpt (it is expected to be completed eventually). Further the policy refers to "Random or selected sections of a larger work". A literal reading of that expression is that it does not include lists of scans, or auxiliary content tables, as they are not "sections" (they are not part of the work), and that not every incomplete portion of a work is either "random or selected" (which would not include starting from the beginning and getting as far as you can, with intent to finish later). I could probably argue that an encyclopedia article or periodical article is a complete work. James500 (talk) 15:16, 6 July 2020 (UTC)
  • Nice wall of text, Xover (and I say that with great respect!) -- it generally makes sense and sounds good to me. As another hopefully illustrative example, take The Works of Voltaire, which I've been digging thru lately. I think this would very much satisfy your criteria as a large work, with sufficient scaffolding to justify the mainspace pages that exist for it. I would love to hear others' thoughts on that. JesseW (talk) 16:07, 6 July 2020 (UTC)
    @JesseW: Yeah, apologies for the length. Brevity is just not my strong suit.
    The Works of Voltaire probably qualifies on sheer scale of work, yes. I don't think the current wikipage at The Works of Voltaire is quite it though: as it currently stands it is more WikiProject than something that should sit in mainspace (its contents are for Wikisource contributors, to organise our effort, not our readers, who want to read finished transcriptions). It also mixes a work page with a versions page in a confusing way. So I would probably say… Move the current page to Wikisource:WikiProject Voltaire; create a new The Works of Voltaire as a pure versions page, linking to…; The Works of Voltaire (1906), that is set up as a work page with the cover and title (and other relevant front matter) of the first volume, and an AuxTOC (and possibly also the {{Works of Voltaire}} volume navigation template). I don't know how tightly coupled the volumes of this edition are (does the first volume have a common ToC or index of works for all the volumes?), so some flexibility on format may be needed to make sense. But as a base rule of thumb it should start from a regular works page and deviate only as needed to accommodate this work (mainly the size is different).
    In any case… With a volume or two completed (they're only ~350 pages each) I'd be perfectly happy having something like that sitting around. With less than that I'd possibly be a bit more iffy, but it's hard to put any kind of hard limit on that. And with somebody actively working on it I'd be in no hurry whatsoever regardless of current level of completion.
    PS. I'm pretty sure a large proportion of the contents of these volumes are works that would qualify under "standalone works" that could exist independently in mainspace, regardless of what's done with the The Works of Voltaire page. Even his individual poems and essays can presumably make a credible claim here (because it's Voltaire; less famous authors would have a higher bar). Better as part of the edition, but also acceptable on their own. --Xover (talk) 16:56, 6 July 2020 (UTC)
  • @JesseW: I personally take no issue with this page's existence (actually I think it's a nice work and a good way to allow an important author's works to be slotted in piece-by-piece). I have some general comments which overlap with this thread (written before Xover's reply, so pardon overlap):
    • First off, I differ with Xover in terms of the scan links: I think they're better than nothing, and I don't see much value in duplicating the volume list onto an auxiliary page just to add scan links. However, I can sympathise with the sentiment that our mainspace shouldn't direct users off-wiki (or at least off-WMF). But if we don't have the scans, and that's what the user wants, they're leaving anyway. Real answer: import moar scans!
    • No scan links are necessary where the volume exists in mainspace and is scan-backed (e.g. v3)
    • Ext scan links should only be used when there is no Index page or imported scan. Use {{small scan link}} or {{Commons link}} when possible (e.g. v2)
    • The first volume list could probably be in an AuxTOC to mark it out as WS-generated content.
    • The "Other editions" section belongs on an auxiliary namespace page (Talk, Portal or Wikisource). I suggest the Talk page is best in this case. Inductiveloadtalk/contribs 17:35, 6 July 2020 (UTC)
  • @Xover: I am in agreement with the majority of what you say. Particularly, I think a framework around any collective work (be it a single-volume biographical dictionary or a 400-issue literary review spanning 80 years) is the critical prerequisite, plus at least some scans, the more the merrier. Where I think I differ:
    • I am inclined to be a bit more relaxed in terms of how much of a work we need. As long as a single article exists, it's not "trivial" (e.g. only a short advert or some incidental text like a "note to correspondents", as opposed to an actual article), it's well-formatted and scan-backed, and a complete framework exists, including front matter and a TOC, such that it is easy for anyone to slot in new pieces, I'd be fairly happy. Lots of periodicals have all sorts of tricky bits like tables of stocks or weather tables and writing into policy that those must be proofread in order to get the "real" articles into mainspace would be a chilling effect, in my opinion. If you allowed an exception, it would be verbose and tricky to capture the spirit without saying "unless, like, it's totally, like, hard, man".
    • I am not dead against scan links in the mainspace at the top level, when such a top-level page exists. See my comments on Voltaire above. I am against them where they could sensibly be on an Author page and they are the only mainspace content.
    • I am ambivalent on the presence of, e.g., disjointed train timetables. It's not my thing to have a smattering of random timetables, but as long as they're individually presented nicely, it's not too offensive to my sensibilities. I might question the sanity of someone who loves doing tables that much, but whatever floats the boats! Also, I think that this might circle back to "good for export" - a mark which certainly would require completed issues or volumes. If you want to get that box ticked, you have to do it all.
    • Re the "notability" aspect of individual articles, I'm not really bothered by that, as I don't think we'll see a flood of total dross because few people really want to take the time to transcribe 1867 articles about cats in a tree from the Nowhere, Arizona Daily Reporter, and, actually I think some of the "dross" can be quite interesting in a slice-of-life kind of a way (always assuming well-formed and scan-backed). And the real dross is usually so bad (no scans, raw OCR, etc) that it can be dealt with outside of this topic. I think part of the value of WS is the tiny, weird and wonderful, not just in blockbusters like War and Peace and Pulitzers. I think I might like to see more of our articles strung together thematically via Portals, but that's another day's issue. Inductiveloadtalk/contribs 17:35, 6 July 2020 (UTC)
      • @Inductiveload: We appear to be mostly in agreement. But… instead of me dropping another wall of text on the remaining points of disagreement, maybe that means we're in a position to try to hash out a draft guidance / policy type page with the rough framework? Then we could go at the remaining issues point by point. Because I think I'm in with a decent chance to persuade you to my point of view on at least some of them, but this thread is fast getting unwieldy (mostly my fault). It would also probably be easier for the community to relate to now, and much easier to lean on in the future. --Xover (talk) 18:31, 6 July 2020 (UTC)
        • @Xover: If there are no more comments forthcoming after a couple of days, I think that makes sense. I don't want to railroad it: considering we have at least one !vote for "do nothing", I'd like to see if there are any other substantially different opinions floating about. Inductiveloadtalk/contribs 17:41, 7 July 2020 (UTC)

The quantity of text here has grown far faster than my ability to absorb it, so rather than continue to put it off, here's my position: I don't see any problem with transcriptions that are scan-backed, even if the transcription only covers a small fraction of the entire scan. If Sally chooses (say) to transcribe a favorite story, that happened to be published in an issue of Harper's back in the 1890s, and goes to the trouble of uploading the full issue, but only creates pages for the one story that interests her, I think that's great. It doesn't matter to me whether she intends to work on the other pages or not. If it's not scan-backed, but it's fairly high quality, I am personally willing to do some work trying to locate a scan and match it up to the text; I'd rather we take that approach, than deletion, though of course deletion is the better option in some cases where the scan is very hard to come by.

If all this has been said above, or if I've misunderstood the topic, my apologies. Please take this comment or leave it, as appropriate. -Pete (talk) 02:00, 8 July 2020 (UTC)

Apologies, I see I had missed the point.

I disagree with Xover's statement that a top-level page for a publication, with a link only to a single article within the publication, has "near-zero value." Such a page can serve an important function linking content together in ways that help the reader (and search engines) find the content they're looking for, or understand the context around it. For instance, A Critical Dictionary of English Literature is linked from the relevant Wikidata entry. The banner on the Wikisource page clearly tells a Wikisource reader that they won't find a full transcription here; and with a simple edit, it could link to a full scan on another site, or (with perhaps a little more effort) even transcription links here on Wikisource. This page has been here since 2010; we don't have any way of knowing what links might have been created elsewhere in the intervening decade. (I do think that new pages like this should not be created without a scan at Commons to be linked to.) -Pete (talk) 02:12, 8 July 2020 (UTC)

I'm really bad with walls of text, so I have only read a tiny portion of the above discussion. But I want to mention a couple of things that I think are worth considering in this discussion.
  • Most of the time, a mainspace "work" that is only a table of contents, but which has none of the actual content, and is not actively being worked on, can be (and should be) deleted as No meaningful content or history under our deletion policy.
  • A mainspace work that has only a little bit of content, but that content is a work unto itself within the scope of Wikisource, should be kept. Most periodicals are like this. For an example, see the Journal of English and Germanic Philology which only has one hosted article, but that hosted article is scan-backed and firmly within scope.
  • On some occasions, empty mainspace works do have value. I ended up creating the page The Roman Breviary, despite its containing no actual content, mostly because there are a lot of works that link to it, using many different titles, and if someone uploaded a copy of the work under one title then many of the links would remain red because they point to different titles of the work. This could be easily solved by creating redirects to a simple placeholder page, so I did. I tried to make the placeholder page as useful as a placeholder page can be, as it contains useful information about the history and authorship of the work, and links to the Index pages where the transcription will take place.

Anyway those are my 2 cents, sorry if they are redundant —Beleg Tâl (talk) 00:40, 29 July 2020 (UTC)

Proposal

Since there has been no extra input for a month, and not wanting this section to get archived without at least attempting a proposal, I have started a proposal #Collective work inclusion criteria above. Inductiveloadtalk/contribs 11:00, 25 August 2020 (UTC)

Since the proposal has now slipped off the main page (to here), with vague support for the first part (collective work inclusion criteria) and a fairly consistent opposition to the second (no-content pages), my plan is to transfer the first part, as guidelines rather than policy, to Wikisource:Periodical guidelines. As non-binding guidelines, they can then be worked on further in situ. Sound OK? Inductiveloadtalk/contribs 08:10, 16 April 2021 (UTC)
The example given in Wikisource:Periodical guidelines might be improved; PSM is and was an exercise that has gone its own way (no offense to @Ineuw:, this is a site under development and that is only one example). CYGNIS INSIGNIS 13:05, 17 April 2021 (UTC)
@Cygnis insignis: You would be wrong to think that I am offended. Remember that when I started, I knew everything. By now, so much of that knowledge is lost that I am happy to listen. Would you elaborate please? — Ineuw (talk) 19:50, 17 April 2021 (UTC)

I've created Bradshaw's Monthly Railway and Steam Navigation Guide (XVI) - it couldn't be done on one page, due to the very high number of template transclusions. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:52, 1 September 2020 (UTC)

@Pigsonthewing: The links in the toc on that page appear non-functional. Also, depending on just exactly which templates were the culprit, it is possible that you may be able to put all the content you wanted onto one page now due to some recent technical changes (template code moved to a Lua module which drastically improves performance and prevents hitting transclusion limits until much later). Xover (talk) 11:17, 14 September 2021 (UTC)
Create the Draft namespace to hold substantially empty works? Then delete if no improvement after months?--Jusjih (talk) 19:22, 1 November 2021 (UTC)
The issue is that the "substantially empty works" can have useful and complete content that stands alone. For example, an article from a scientific journal.
I would not want to see that either shunted into a Draft namespace to rot or deleted a few weeks down the line.
Index and Page namespaces provide our long term staging areas, and works can and do remain unfinished there for years. But what do we do when a self-contained piece of a larger work is ready? Inductiveloadtalk/contribs 20:29, 1 November 2021 (UTC)

Universal Code of Conduct News – Issue 1

Universal Code of Conduct News
Issue 1, June 2021 (Read the full newsletter)


Welcome to the first issue of Universal Code of Conduct News! This newsletter will help Wikimedians stay involved with the development of the new code, and will distribute relevant news, research, and upcoming events related to the UCoC.

Please note, this is the first issue of UCoC Newsletter which is delivered to all subscribers and projects as an announcement of the initiative. If you want the future issues delivered to your talk page, village pumps, or any specific pages you find appropriate, you need to subscribe here.

You can help us by translating the newsletter issues in your languages to spread the news and create awareness of the new conduct to keep our beloved community safe for all of us. Please add your name here if you want to be informed of the draft issue to translate beforehand. Your participation is valued and appreciated.

  • Affiliate consultations – Wikimedia affiliates of all sizes and types were invited to participate in the UCoC affiliate consultation throughout March and April 2021. (continue reading)
  • 2021 key consultations – The Wikimedia Foundation held enforcement key questions consultations in April and May 2021 to request input about UCoC enforcement from the broader Wikimedia community. (continue reading)
  • Roundtable discussions – The UCoC facilitation team hosted two 90-minute-long public roundtable discussions in May 2021 to discuss UCoC key enforcement questions. More conversations are scheduled. (continue reading)
  • Phase 2 drafting committee – The drafting committee for the phase 2 of the UCoC started their work on 12 May 2021. Read more about their work. (continue reading)
  • Diff blogs – The UCoC facilitators wrote several blog posts based on interesting findings and insights from each community during local project consultation that took place in the 1st quarter of 2021. (continue reading)


unsigned comment by SOyeyele (WMF) (talk) 22:37, 10 June 2021‎ (UTC)

Index:Robert Carter- his life and work. 1807-1889 (IA robertcarterhis00coch).pdf

First run through is done, and it's transcluded. Needs validation. Thanks in advance for any help. Jarnsax (talk) 18:13, 16 June 2021‎ (UTC)

J3l

The Works of the Late Edgar Allan Poe/Volume 1/The Domain of Arnheim unsigned comment by 202.165.87.161 (talk) 18:52, 25 December 2021 ‎(UTC)

Subscribe to the This Month in Education newsletter - learn from others and share your stories

Dear community members,

Greetings from the EWOC Newsletter team and the education team at Wikimedia Foundation. We are very excited to share that, in the tenth year of the Education Newsletter (This Month in Education), we invite you to join us by subscribing to the newsletter on your talk page or by sharing your activities in the upcoming newsletters. The Wikimedia Education newsletter is a monthly newsletter that collects articles written by community members using Wikimedia projects in education around the world, and it is published by the EWOC Newsletter team in collaboration with the Education team. These stories can bring you new ideas to try and valuable insights into the successes and challenges of our community members in running education programs in their context.

If your affiliate/language project is developing its own education initiatives, please remember to take advantage of this newsletter to publish your stories with the wider movement that shares your passion for education. You can submit newsletter articles in your own language or submit bilingual articles for the education newsletter. For the month of January the deadline to submit articles is on the 20th January. We look forward to reading your stories.

Older versions of this newsletter can be found in the complete archive.

More information about the newsletter can be found at Education/Newsletter/About.

For more information, please contact spatnaik wikimedia.org.


About This Month in Education · Subscribe/Unsubscribe · Global message delivery · For the team: ZI Jony (Talk), Sunday 1:26, 25 September 2022 (UTC)

A default layout related inquiry

Is there a bot that could insert the {{default layout}} template set to [Layout 4], in all the main namespace pages I created? If this is beyond the scope of our bots, (permissions etc,) how can I get it done? Comments are most appreciated. — ineuw (talk) 03:08, 15 August 2022 (UTC)

Perhaps I should have asked if I am allowed to do this. My page namespace layout width matches Layout 4, and without it, I don't know how to bring it to the attention of the random reader. I will be able to construct the SQL statement, but it would need someone with execute rights to implement it. Suggestions are welcome. — ineuw (talk) 00:51, 21 August 2022 (UTC)
@Ineuw: Absent actual opposition you're free to set whatever default layout you think best serves the texts you work on (just make sure you base it on what best serves the reader of that specific work, and not your own preferences for viewing pages), and that includes adding it retroactively like any other little tweak or improvement you subsequently think of. And since it's something you're allowed to do manually it is also something you can request that a bot operator do on your behalf.
That being said, I'm not sure we can find any reasonable definition of "all the main namespace pages [you] created" that we could easily use as the set of pages to work on. I haven't really looked at what's possible (i.e. what pywikibot supports out of the box), but this selection doesn't jump out at me as something that would be supported without writing a significant amount of custom code. I'll try to have a look when time allows (I could be wrong), but in the mean time: can you define the pages to work on any other way? If you can find a query page on-wiki, or a query tool on Toolforge or similar, that gives you the relevant list then we can always just tell the bot to work on that static list of pages (vs. dynamically querying for it itself). NB! Make sure the list doesn't contain "false positives": things that are obviously irrelevant to a human may be completely invisible to a bot. Xover (talk) 10:55, 21 August 2022 (UTC)
I'm not quite sure I follow what you're trying to do here... in particular, the phrase "bring it to the attention of the random reader" concerns me, given that a number of those readers will be exporting as ePub (or finding an ePub elsewhere, etc) and not reading it on this site. Are you formatting works assuming a specific page width? — Dcsohl (talk) (contribs) 14:32, 23 August 2022 (UTC)
I am not formatting anything. This is a display setting. — ineuw (talk) 00:32, 25 August 2022 (UTC)
@Xover: Thanks for the reminders. My contributions' list quickly extracted my main namespace contributions (only those that were created). Is there a way to extract it from contributions as a text file, or to see the SQL construct? And then who can I ask to implement this? Years ago I was able to do this on Toolforge but the data tables' designs have been upgraded several times since. — ineuw (talk) 04:35, 25 August 2022 (UTC)
I understand "Layout 4" is a display setting -- your wording in your initial inquiry set off some internal alarm bells and I thought that maybe you wanted Layout 4 because your pages had somehow been formatted assuming that particular page width. As long as they still look good with other widths, or in epub readers, it's all good. — Dcsohl (talk) (contribs) 14:15, 25 August 2022 (UTC)
@Dcsohl: The "Layout 4" was kindly added at my request years ago, because I found the screen width pages of the default "Layout 1" were impossible to read. Layout 2 is also narrow, but appears with a Garamond-like font. My original intent was to place a note to the reader to select "Layout 4" of the Display Options, because that is how the editor intended his work to appear at first, in a format approximating a printed page. My personal bias towards printing and paper prevented me from considering downloading any works. Also, at that time, formatting for printing was problematic, so I ignored it. Since that time I also realized that screen displays are also a user's personal choice, and others never see my work as I see it. — ineuw (talk) 18:09, 25 August 2022 (UTC)
@Ineuw I well appreciate the sentiment, especially as someone who prefers reading on e-readers and can't imagine reading an e-book some other way! You have more than satisfied my concerns, and I thank you for that! — Dcsohl (talk) (contribs) 17:09, 26 August 2022 (UTC)
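
For anyone picking up the bot request above, the per-page edit could be sketched roughly as follows. This is only an illustration, not an existing bot: the exact template call `{{default layout|Layout 4}}`, its placement at the top of the page, and the function name `add_default_layout` are all assumptions, and fetching or saving pages (for instance from a static list of titles, as suggested above) is deliberately left out.

```python
import re

# Assumed template call; the thread only names the template and "Layout 4".
LAYOUT_TEMPLATE = "{{default layout|Layout 4}}"

# Matches any existing {{default layout ...}} call (first letter in either case).
_EXISTING_LAYOUT = re.compile(r"\{\{\s*[Dd]efault layout\s*(\|[^}]*)?\}\}")


def add_default_layout(wikitext: str) -> str:
    """Prepend the layout template unless the page already sets a layout.

    Pages that already carry a {{default layout}} call are returned
    unchanged, so an existing editorial choice is never overridden.
    """
    if _EXISTING_LAYOUT.search(wikitext):
        return wikitext
    return LAYOUT_TEMPLATE + "\n" + wikitext


if __name__ == "__main__":
    # Hypothetical page text, for illustration only.
    print(add_default_layout("{{header|title=Example}}\nSome text."))
```

Skipping pages that already set a layout keeps the run idempotent, so the same static list can be processed twice without stacking templates.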

Tech News: 2022-35

23:05, 29 August 2022 (UTC)

The 2022 WMF Board Election (Please Vote! only 2 days left!)

Hi all. As you've probably seen, and promptly ignored, the voting period for the 2022 WMF Board of Trustees election is open. Despite fatigue with global notices and mass messages about the latest hare-brained "strategy process" or "branding initiative" or… I am here to URGE YOU TO VOTE! (quick link: SecurePoll)

This election elects two community-filled seats to the Wikimedia Foundation Board of Trustees. The Board has 16 seats—one for Jimbo, 7 appointed by the usual muddled processes, and 8 from the community (but that's including the Affiliates, with no on-wiki experience requirement)—and are the ones who ultimately decide what the WMF does. They hire and fire the CEO (or Director, or whatever), set the strategy and priorities, start "strategic initiatives", and so forth. So, that… thing… a little way back when they wanted to rebrand the "Wikimedia Foundation" to the "Wikipedia Foundation"…? That was an initiative from the Board where they hired hideously expensive external branding consultants to figure out a rebrand for the WMF (yes, that's what they spent donation money on).

There has been a growing frustration in the communities about this, and there's a dawning realisation that there's a disconnect between the WMF and the community that is poison to the long-term health of the movement. This election and the recent board changes are baby steps towards a better situation.

So I am asking every single Wikisourcerer to please vote! (quick link: SecurePoll)

The main overview page is m:Wikimedia Foundation elections/2022/Community Voting. Getting to know the candidates is best done by reading their responses to questions for the candidates. Actual voting happens through SecurePoll.

You need to make up your own mind who to vote for, of course, but if you find making up your mind on this hard I have two tips for you:

  • Kunal Mehta, better known as Legoktm, is a long-time community member that has also formerly been employed by the WMF (and knows where the bodies are buried 😎). He's been a primarily technical contributor for a long time, but has stayed active and engaged on-wiki, and understands the community and community processes well. When uploading large books to Commons was essentially completely broken for 9+ months last year, Legoktm was instrumental in finally getting the problem flagged as a priority as well as actually figuring out what was going on and fixing it. Of particular import to us is that Legoktm is fronting a few key issues (all of which are broad concerns in the community):
    • WMF prioritisation of resources needs to be driven bottom up by the needs of the community, rather than top down by a disconnected Board and external consultants; especially when it comes to where to expend technical resources
    • The Board desperately needs someone with deep technology understanding so they don't neglect fundamentals (did you know that for several years nobody at the WMF owned the multimedia stack, including all of Commons?). Like having a team churning out new features, but then not giving anybody responsibility for maintaining that feature and allocating no resources to it. Do you know why we don't have a decent {{cursive}} font on Wikisource? (T166138) Because the Language Team at the WMF, who added the {{blackletter}} support in the Universal Language Selector, have been so starved of resources that they've had to scope away responsibility for web fonts (in favour of core i18n and language support); the very feature they were originally created to develop!
    • Development resources need to be better prioritised to meet the community's needs, rather than some pie-in-the-sky new "initiative". For example, assigning developers to work on maintaining the core MediaWiki and related components—like Proofread Page which is almost entirely maintained by volunteers, who have trouble even getting patches approved because there's nobody to review them—instead of new feature development of little interest to existing communities.
    • And while I certainly don't agree with them on everything, I've interacted with them enough times over the years that I feel confident I can always go grab them on their talk page and expect a response (most WMF people have never edited and don't answer on their talk pages); raise an issue about Wikisource and expect them to understand our community's unique perspective and be able to take it into account; and, not least, discuss such issues openly and frankly on-wiki even when we disagree on something. I've tried contacting "community" Board members on-wiki before and gotten only crickets, so this is an important factor for me personally (you may disagree of course).
    • They're also fronting several other issues regarding transparency, collaboration, and involvement that you can read about in the above linked pages, but those are to a great degree shared with other candidates.
  • Mike Peel, going by Mike Peel on-wiki, is also a long-time on-wiki contributor and community member. And while more active on other projects, they have over 5k edits on English Wikisource and understand our workflows and needs well. I've only rarely interacted with them, but from all observations they are as responsive and willing to discuss as can be expected; with the possible exception that they have a lot of irons in the fire and can be a bit busy for that reason. Their answers to candidate questions indicate they care deeply about the community, prioritising their needs and involving them in decisions. When I mention them second, and write less, it's primarily because they are less technically oriented than Legoktm is and I believe we desperately need more deep technical understanding on the Board along with the community understanding. Otherwise Mike would probably have been my top choice.

I have nothing against the other candidates, and it's entirely possible that you'll prefer one of them for your personal priorities, but these two are the ones that stand out to me as the ones that would be the best choices to take care of the needs and priorities of Wikisource and the wider community. And whatever your preference I urge you, in the strongest possible terms, to go vote! All the other stuff the WMF spams us with can usually be ignored, but the Board elections actually matter and in the long term will have a huge impact on Wikisource and your day-to-day work here. For example, it is extremely unlikely that we will have any dedicated team or developers assigned to Wikisource and Proofread Page unless we manage to get voted in Board members who fight for bottom-up prioritisation of resources and closer alignment with community needs, and with the technical experience and insight to understand that development resources must be allocated to also maintain core technical components of the existing projects (like the Wikisource and Proofread Page extensions that are Wikisource).

And luckily, the voting is w:Single Transferable Vote, so you rank your candidates by preference; and if your preferred candidate can't win, your vote is transferred to your second choice instead of being effectively wasted.

Please vote! Here: SecurePoll Xover (talk) 07:32, 4 September 2022 (UTC)

I appreciate this post. In my view, if there's even a single charitable organization in the world that absolutely should be governed by democratic processes, it's the WMF. To this end it's vital to foster a culture of participation from the bottom up; I would hate to see community involvement in governance disappear due to apathy on our part. With about seven hours left in this election, I highly encourage everyone to vote, if only to support the principle of democracy within the Wikimedia community. There is not much way for us lowly contributors to influence the direction of the WMF without the exercise of these fundamental rights, and it's quite easy for rights to disappear when those who have them fail to use them. The only rights we truly possess are those we actively claim and exercise, so: Please vote! Shells-shells (talk) 17:02, 6 September 2022 (UTC)

Chapters in The History of Herodotus (Macaulay)/Book II

Book 2 of The History of Herodotus is not divided into chapters like the other books; this book is very basic, so I think dividing it into chapters will benefit many editors. Thanks in advance, פעמי-עליון (talk) 00:14, 5 September 2022 (UTC)

It's not scan-backed, so really, what would be better would be to find a scan and start working on that. But if you want to split it, I don't see a problem with that. JesseW (talk) 01:32, 6 September 2022 (UTC)

Tech News: 2022-36

23:22, 5 September 2022 (UTC)

Live list of texts to validate, sorted by number of pages remaining?

Has someone already created a way to view a list of texts that need validation (I know that exists), but sorted by number of pages still needing validation? I think it'd involve going thru the (currently ~1 million) entries in Category:Proofread and grouping them by the basename (the part before the slash). That could be done from a dump, or using the query service I vaguely remember exists/existed, or something? JesseW (talk) 02:15, 6 September 2022 (UTC)

OK, I think I made it here: https://quarry.wmcloud.org/query/67152 -- JesseW (talk) 02:41, 6 September 2022 (UTC)
And there seem to be 4,817 texts in that state (many of them are single-page works, and some have the rest not even started). Interesting! JesseW (talk) 02:46, 6 September 2022 (UTC)
Your query is picking up a huge number of texts that have exactly one page in the proofread state, while having many others un-proofread or non-existent. For example, Index:2010_constitution_of_Angola.djvu is on your list, which has 1 validated page, 1 proofread page, 4 not-proofread pages, and 85 pages still to be added at all. — Dcsohl (talk) (contribs) 12:51, 6 September 2022 (UTC)
I've improved it now, please check it again. It now returns just 308 works whose Index is in Category:Index_Proofread and which have exactly one Page in Category:Proofread. JesseW (talk) 15:06, 6 September 2022 (UTC)
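
For readers who would rather work from a dump or an exported title list than from Quarry, the grouping step described at the top of this thread (group the pages in Category:Proofread by the part of the title before the slash, then sort by how many remain) might look something like the sketch below. It is not the linked Quarry query; the function names and the sample titles are made up for illustration, and it does not check the status of the Index or of any other page.

```python
from collections import Counter


def basename(title: str) -> str:
    """Return the part of a Page: title before the final slash.

    Titles without a slash (single-page scans) are returned unchanged.
    """
    head, sep, _ = title.rpartition("/")
    return head if sep else title


def count_by_basename(page_titles):
    """Count pages per work and sort by pages remaining, fewest first."""
    counts = Counter(basename(title) for title in page_titles)
    # Ties are broken alphabetically so the output order is stable.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical titles standing in for members of Category:Proofread.
    sample = ["Page:A.djvu/1", "Page:A.djvu/2", "Page:B.pdf/7", "Page:C.jpg"]
    for work, remaining in count_by_basename(sample):
        print(remaining, work)
```

As Dcsohl's reply points out, a count like this says nothing on its own about pages that are not yet proofread or do not exist, so the result still needs to be cross-checked against the Index status.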

I've now copied the current results (200 works) onto User:JesseW/Works with only one unvalidated page; I plan to go thru them and either work on them, or add notes why not (i.e. handwritten, actually very long, I don't want to work on that subject, etc.) I'd be delighted for other people to contribute, if you are so moved. JesseW (talk) 00:48, 20 September 2022 (UTC)

Importing from Commons problems

I tried to import certain files being deleted on Commons; it doesn't seem to have worked, but pulled in a bunch of templates. If I've managed to break anything, I'm sorry.--Prosfilaes (talk) 23:57, 7 September 2022 (UTC)

@Prosfilaes: Ouch! Yeah, importing doesn't work on files (it operates on the description page, not the file itself), and it's a massive trap for the unwary: if you check the "include templates" option it can go hog wild. Case in point, your import of the file ended up importing and overwriting ~60 different templates and Lua modules. See Special:Log/import. A lot of these either had local modifications or were actually imported from another project (typically enwp), so this is going to take a bit to untangle. In the mean time we should expect lots of weird breakages and problems.
I'll try to take a look as soon as I can, but I'm travelling at the moment so it won't be any sooner than ~12 hours from now. Xover (talk) 06:12, 8 September 2022 (UTC)
Ack, sorry about my part in prompting this mess (I was the one who noticed the files didn't get copied over, and asked Prosfilaes to try it again). We really should add a big warning on Special:Import. :-O JesseW (talk) 12:21, 8 September 2022 (UTC)
Ain't nobody needs to apologize fer nuffink: Special:Import is an attractive nuisance!
In any case, the worst of it seems to have been untangled now. We were really lucky in that most of what was imported was either not in use here, or we had more recent edits than Commons, so we just polluted the revision history but didn't actually break much of anything. I've tweaked the interface messages for Special:Import so that you get some warning now, and added a page with some guidance at Help:Importing (improvements welcome!). The messed up revision histories are going to be a pain, but that can't really be helped. To fix them requires major surgery and lots of research that the issue just isn't worth. Xover (talk) 09:40, 10 September 2022 (UTC)

Moving all chapter subpages using Roman numerals in titles to use Arabic numerals

I want to propose that we (finally) perform a mass moving of chapter subpages that use a roman numeral scheme (such as Tarzan the Untamed/Chapter II) to an Arabic numeral scheme (Tarzan the Untamed/Chapter 2). The roman numeral equivalents can stay as redirects.

I would apply this same procedure to anything else that uses roman numerals for incrementation, such as Volume/Issue numbers in periodicals, Part or Book numbers in books, "Poem" or "Letter" numbers or any other weird work-specific kind of chapter, etc.

We have been trying to transition away from using Roman numerals for a long time. It is very clearly not preferred and most of the works that still use Roman numerals in titles are older works that were originally added in the 2000s. In fact, the Wikisource help page Help:Subpages makes it clear that there is (at least loosely) a requirement not to use Roman numerals in page titling (emphasis mine):

Chapters, sections and so on should be numbered with Arabic numerals (i.e. 1, 2, 3; not Roman numerals) when such a numbering scheme exists in the original work.

There is no policy that I could find on this, though. I think there should be.

People who have been active here for a long time know exactly why Roman numeral titles are not good, but just a refresher for those who aren't in the know: Roman numerals add titling inconsistency to the site, make it harder to search (Isn't it easier to search for "Chapter 23" than have to write "Chapter XXIII"?), make it more difficult to run scripts/bots over the pages, etc.

I volunteer to participate in writing a PWB bot that can do this job, and if it gains enough support here, I'll take this to Bot requests. A past discussion from a couple years ago I found on this: Wikisource:Scriptorium/Archives/2019-08#Roman_Numerals_in_Page_titles.... We need to actually do it this time. Pinging @ShakespeareFan00, @Beeswaxcandle, @EncycloPetey: PseudoSkull (talk) 15:16, 8 September 2022 (UTC)
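For illustration only, the title mapping such a bot would need might look something like the Python sketch below. It is not the proposed PWB bot and makes no wiki calls; the function names, the list of labels (Chapter, Part, Book, Volume, Issue) and the choice to touch only the final subpage segment are all assumptions made for the example.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

# Accept only well-formed numerals (so "IIII" or "VX" are rejected).
STRICT_ROMAN = re.compile(
    r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})"
)

# Assumed set of labels; "Act" is deliberately left out here.
SUBPAGE = re.compile(
    r"^(?P<base>.+)/(?P<label>Chapter|Part|Book|Volume|Issue) (?P<num>[IVXLCDM]+)$"
)


def roman_to_int(numeral: str) -> int:
    """Convert a well-formed upper-case Roman numeral to an integer."""
    if not numeral or not STRICT_ROMAN.fullmatch(numeral):
        raise ValueError(f"not a well-formed Roman numeral: {numeral!r}")
    total = 0
    for ch, nxt in zip(numeral, numeral[1:] + " "):
        value = ROMAN_VALUES[ch]
        # Subtractive notation: a smaller value before a larger one is subtracted.
        if nxt in ROMAN_VALUES and ROMAN_VALUES[nxt] > value:
            total -= value
        else:
            total += value
    return total


def arabic_title(title: str):
    """Return the Arabic-numeral title for a matching subpage, else None."""
    match = SUBPAGE.match(title)
    if match is None:
        return None
    try:
        number = roman_to_int(match["num"])
    except ValueError:
        return None
    return f"{match['base']}/{match['label']} {number}"


print(arabic_title("Tarzan the Untamed/Chapter II"))  # Tarzan the Untamed/Chapter 2
print(arabic_title("The Life of Charles VI"))         # None (not a labelled subpage)
```

A sketch like this cannot tell a numeral from a letter: a subpage such as "Part C" would be read as 100, so any real run would presumably need a reviewed list of moves rather than blind pattern matching.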

It seems (from my not very knowledgeable perspective) that, for works where the original uses Roman numerals, it would be better for linking purposes to have both forms available. I have no strong opinion on which one should be the redirect, but it makes more sense to always use Arabic numerals as the page title, I suppose. I strongly support using a bot to ensure we at least have redirects in place for all of them, even if we don't move the existing page titles. JesseW (talk) 16:17, 8 September 2022 (UTC)
  Support: It is important that the chapter’s title is transcribed from the page namespace as it is including the Roman numerals, so it makes no harm if the Roman numerals are changed into Arabic ones in the page name, while such a change has some advantages described in the proposal. Therefore I agree with the proposal. --Jan Kameníček (talk) 20:06, 8 September 2022 (UTC)
  Oppose Generalizing everything to "chapters, sections and so on" is a bad idea. There are situations where it is needlessly complicated to use Arabic numerals for subpage titles, such as in the Acts of plays. The standard notation for plays is to use Roman numerals for Acts and Arabic numerals for scenes. There are also classical works that might warrant investigation, for which the international standard for parts uses Roman numerals; however, I'm not familiar enough with those works to make any authoritative statement. Finally, there are works that contain Roman numerals as part of the title, and a sweeping statement does not make allowance for short stories or poems whose title is or ends in a Roman numeral. For chapters of novels, I agree with standardizing to Arabic numerals, but I disagree for plays, and am uncertain or opposed concerning some other forms of subdivision. --EncycloPetey (talk) 20:15, 8 September 2022 (UTC)
@EncycloPetey: Care would be taken to ensure that certain titles of works which more naturally included roman numerals (something like "The Life of Charles VI" for example) do not get changed. I don't know if I agree about the acts in plays bit, since keeping their roman numerals would still affect usability and consistent parseability. But, if I were to grant that this procedure would not affect Acts of plays (or perhaps also Scenes), would you at least agree with the moving of subpages like "/Chapter VI", or "/Part III"? PseudoSkull (talk) 21:05, 8 September 2022 (UTC)
yeah, i am not a fan of the roman numeral phobia. already we cannot use roman numerals as footnotes, even when the source text does. will we now "correct" all texts with roman numerals in the toc and preface numbers? i don’t see the problem it is fixing. --Slowking4Farmbrough's revenge 20:59, 10 September 2022 (UTC)
No, we will leave the source text as is, but the wiki titles would contain Arabic rather than Roman numerals. And the problems it is fixing have been mentioned above. PseudoSkull (talk) 05:54, 11 September 2022 (UTC)
so you are going to move all mainspace chapter pages with roman numerals to arabic? is there a structural reason? it seems like micromanagement, with no added benefit. --Slowking4Farmbrough's revenge 20:54, 12 September 2022 (UTC)
@Billinghurst: I saw you make a move of this nature recently. Your opinion on this proposal? PseudoSkull (talk) 21:08, 11 September 2022 (UTC)
  Support. Cross-links are our value proposition; they are the thing that makes us different from and better than other transcription projects. If I'm transcribing a work of non-fiction that references other works e.g. a work of botany, I want to be able to add deep cross-links as I go; for example, link a reference to an article in a specific number of a specific volume of a journal. I want to be able to do that even if we haven't transcribed the article yet; but I can't do that if I can't predict what the link target will be. e.g. if it is impossible to know if it will be "Example Journal/Volume 9/Issue 1" or "Example Journal/Volume IX/Issue 1" or "Example Journal/IX/1" or etc. Hesperian 23:34, 11 September 2022 (UTC)
  •   Comment As Hesperian said, we should be having all new works with arabic numbering for chapters. AND AND AND this is the exact reasoning that we expressed years ago for our deeplinking, and also for better chapter sorting. I would have a preference for all works, though can understand that there may be genuine exceptions to the rule. Noting that these should be well argued exceptions to the rule, not simply personal choice/fancy. What we do with older works is always an interesting conundrum. I haven't touched old works, I just concentrate on recent works. And to Slowking4 it is not a phobia, that is a misrepresentation, it is a standard style, and nothing wrong with having one. — billinghurst sDrewth 12:46, 13 September 2022 (UTC)
  • a standard style that does not admit technically possible alternatives, seems a little pushy? will you now move all the chapters with roman numbers taken from the toc? or edit war and block people who create them? it seems like a lot of work to get your ascii sort functionality. you will just drive others to have no numbers in chapter titles. the policies are written as norms, and enforced as iron laws. --Slowking4Farmbrough's revenge 15:21, 13 September 2022 (UTC)
  •   Comment ++ with respect to redirects of subpages, they are a bad idea as when we disambiguate works they truly are a PITA or a true problem. If we are moving them, I would suggest that any moved subpages have {{dated soft redirect}}s — billinghurst sDrewth 13:08, 13 September 2022 (UTC)
    Could you say more about the problems that redirects of chapter pages have caused? I believe you, but for ease of linking, I'd love it if we had both styles (Arabic and Roman) as a matter of course, so I'd like to know more about what problems you've noted. JesseW (talk) 14:31, 13 September 2022 (UTC)

List of UK Admiralty Charts, 1967

In July 2021, there was a consensus on the English Wikipedia to transwiki w:Draft:List of UK Admiralty Charts, 1967 to Wikisource. This seems to have never been followed up on. * Pppery * it has begun... 17:16, 9 September 2022 (UTC)

@Pppery: That list doesn't appear to be in scope for enWS. Xover (talk) 20:20, 9 September 2022 (UTC)
The list is claimed to be taken directly from the official catalog(ue) of the Hydrographic Office, which should be in scope. Shells-shells (talk) 21:52, 9 September 2022 (UTC)
Yeah, looking at them, there does appear to be far more text and charts, there's a lot more than just Maps on there, so maybe? Reboot01 (talk) 23:21, 9 September 2022 (UTC)
Hmm. If it's taken directly from the published catalogue, rather than being compiled by a contributor, then it's possible to argue that it's in scope. But in this case, I think they mean that they've used that publication as a source from which they have compiled a list that does not otherwise exist in this form (I find no matching list in the original catalogue). And it's still mostly just a bunch of images of maps and tables of data, which makes it arguably out of scope here for much the same reason it's out of scope on enWP. It is also going to be literally impossible for us to reproduce this work properly due to the large tables and MediaWiki transclusion limits. Xover (talk) 05:10, 10 September 2022 (UTC)
maybe https://en.wikibooks.org/wiki/Main_Page is a home for the unloved list. not really interested in hosting english’s castoffs. --Slowking4Farmbrough's revenge 21:03, 10 September 2022 (UTC)

Index:Gleichförmige Rotation und Lorentz-Kontraktion.djvu -- not actually scan backed

Back in 2018, it looks like someone misunderstood what "scan-backed" meant, and uploaded a German language (excerpt) and put their English translation in the Page namespace. This is Index:Gleichförmige Rotation und Lorentz-Kontraktion.djvu. Could someone help to untangle this? (probably via deletion, sadly) JesseW (talk) 23:27, 10 September 2022 (UTC)

@JesseW It's not just this single page; you can step through D.H's contributions to find many similar translated works. User translations are generally permissible; however, I don't think there's any specific policy governing the matter. WS:WWI#Translations states that "Wikisource also allows user-created wiki translations" without elaborating much further. See WS:Translations#Wikisource original translations for some of the community's thoughts on this, but note that it's not official policy. Shells-shells (talk) 04:42, 11 September 2022 (UTC)
That's correct. It's not real easy to read from the linked pages, but when I went digging into it a while back the summary is: user translations must be scan-backed, and the original language work must exist on the appropriate language Wikisource. They are then transcluded into a page in the Translation: namespace instead of in mainspace. See Translation:On Discoveries and Inventions for an example.
This is an area where policy and guidance really should be cleaned up, and probably rethought. I'm not convinced we should permit user translations (if you look through Translation: there's a lot of unverifiable crap there, and we tend to dump stuff there with no effort to actually clean it up; see WS:PD#Translation:Manshu for how hard it is to actually enforce such a de facto semi-expressed policy and what the result is when we don't manage that namespace properly). And if we do, I'm not sure a separate namespace for it makes sense (they use a dedicated header template in any case). And if we should permit them we need to design and try to get implemented better tools for it. And such better tools should probably address user annotations as well, if we are to permit them, since some of the technical issues overlap (again, see the deletion discussion linked above). Xover (talk) 05:31, 11 September 2022 (UTC)
Most of the translations related to Portal:Relativity and Wikisource:WikiProject Relativity were created by me between 2010 and 2014, and since then I have mostly done maintenance work on those articles (correcting errors, providing scans and transclusions etc.) without adding new ones. One of the reasons (besides lacking enough time) I stopped adding new translations is the lack of clear policy or guideline (ws:Translations is still only a proposed policy), so it was never 100% clear to me whether those translations are fully acceptable on WS or not. --D.H (talk) 08:55, 11 September 2022 (UTC)
@D.H: User translations are definitely permitted on English Wikisource.
I am not, personally, convinced they should be, for various reasons, but they most definitely are currently (and have been for a very long time). So if that's the reason you've tagged a number of translated pages for speedy there's no need. Xover (talk) 19:24, 11 September 2022 (UTC)

Yes, I'm glad for the translations -- my concern was that they shouldn't (I think?) be in the Page namespace, but in the Translation namespace directly, and that the German originals should be (scan backed) in the German Wikisource. JesseW (talk) 00:56, 12 September 2022 (UTC)

Having the translation in page namespace provides many of the features of transcription in the page namespace, namely the ability to trivially compare the text to the original. Unfortunately the German Wikisource has some pretty strict rules about adding new texts--de:Wikisource_Diskussion:Projekte#Regeln_für_neue_Projekte--and it's not trivial for someone who just wants to translate a text to upload it to the German Wikisource first.--Prosfilaes (talk) 03:16, 12 September 2022 (UTC)
Good lord, remind me not to try and add works to German WS. Reboot01 (talk) 13:25, 13 September 2022 (UTC)
I am not a friend of WS users' own translations, but as they are permitted here, using the page namespace for them instead of direct translations into translation namespace should be encouraged, as it makes it easier to check the quality of the translations or various changes to the translations. I suspect that practically no established contributors try to patrol additions or changes in translation namespace and the current difficulties with searching for the original and comparing it with our translated text are the reason. --Jan Kameníček (talk) 15:36, 13 September 2022 (UTC)

It seems like the consensus is in favor of the practice of putting user translations in the Page namespace (and not requiring that the originals be first uploaded to the original language WikiSource). I'm fine with this, but we should update WS:Translations#Wikisource original translations to clarify this. Anyone want to volunteer? JesseW (talk) 17:20, 13 September 2022 (UTC)

Tech News: 2022-37

01:49, 13 September 2022 (UTC)

Commons editing not being picked up by Wikisource

This is just an oddity and no longer a problem. I made a mistake on a link on a Commons page so that when Wikisource picked it up the link was sent to a destination that did not exist. I therefore edited the link in Commons; Commons then showed the corrected link and directed correctly to the page in question, however Wikisource still persisted in directing to the erroneous destination. It refused to change; I shut down my computer and re-entered but this did not help, so provisionally I did a redirect. Finally, I deleted the page in Wikisource that contained the link and then copied the coding back in to recreate it. This solved the problem but I'm wondering why Wikisource was working on data that no longer existed. Esme Shepherd (talk) 15:50, 14 September 2022 (UTC)

Are hws/hwe deprecated?

On a book I'm validating there's a word broken between two pages. It's at the bottom of this page: Page:Shinto,_the_Way_of_the_Gods_-_Aston_-_1905.djvu/122. There's no hws template here, just a regular old hyphen. Yet on the transcluded page (Shinto:_The_Way_of_the_Gods/Chapter_6, at boundary between pages 112 and 113) the word is merged as if the hws/hwe templates were used. I would have expected it to be clear- ing on the transcluded page. What's going on? Are word breaks across pages handled automatically now? Do I not need to use hws/hwe anymore? --Arbitan (talk) 21:06, 16 September 2022 (UTC)

Yep, word breaks are handled automatically now; it's part of the <pages /> logic. If you want to prevent them, use {{nop}}, I think. JesseW (talk) 02:10, 17 September 2022 (UTC)
Thanks! I don't know how I missed that. --Arbitan (talk) 07:01, 17 September 2022 (UTC)
@Arbitan, @JesseW: {{hws}}/{{hwe}} are not deprecated deprecated, they're just not needed anymore except in special circumstances (inside footnotes you still have to use them). If you want to preserve a hyphen in a hyphenated word that occurs at a page boundary you need to use {{peh}} (which is not an acronym for "page-end hyphen", but is rather onomatopoeic for the sound one makes when encountering such a conundrum for the first time: Peh!?) Xover (talk) 08:22, 17 September 2022 (UTC)

OCR and sidenotes

As anyone who has worked with OCR trying to parse sidenotes can attest, OCR doesn't seem capable of differentiating sidenotes from the main text, even though in many documents there is a clear "border" since most text is set to justified. So it just pushes the sidenote into the text, and you've got to go and separate it out.

For small documents, it's no big deal, but this can be annoying with large documents, such as legislative documents.

Are there any OCR solutions that are more effective at this? Even if this requires using something other than Wikisource's OCR and then copying the data back in. Supertrinko (talk) 03:30, 19 September 2022 (UTC)

When in editing mode, on the right next to "transcribe text" you can roll down OCR settings, there you choose "advanced options" and then you can select only a particular area of the page for OCR, e.g. without sidenotes. --Jan Kameníček (talk) 05:45, 19 September 2022 (UTC)
I didn't know this! Thank you very much! Supertrinko (talk) 20:45, 19 September 2022 (UTC)
See also phab:T294903 for bringing this into the native WS page editor. Inductiveloadtalk/contribs 21:48, 19 September 2022 (UTC)

Tech News: 2022-38

MediaWiki message delivery 22:16, 19 September 2022 (UTC)

Proofread status of pages bar no longer present on transcluded mainspace pages

On Songs and Sonnets (Coleman), for example, where there should be a yellow status bar across the top, under the title, it's just a few dots, presumably the border in which the bar was supposed to appear. I tested it on both Chrome and Safari, and it's true on both browsers. It seems to be a sitewide issue. Does anybody know what might be going on? @Inductiveload: PseudoSkull (talk) 16:01, 21 September 2022 (UTC)

@Tpt, @Soda: This sounds like something that would be caused by either a change in PRP, or a skin change that affects PRP, in this week's train. Any ideas? We're triggering various on-site stuff off the presence of that progress bar and that's currently broken too. Possibly related, my "Wikidata info" gadget is showing "Wikidata item not found." for a whole lot of pages now too. Might be incidental though. Xover (talk) 17:08, 21 September 2022 (UTC)
Sorry about that. It's indeed caused by this change. I made a mistake with the resource loading configuration. I have written a fix. It would be amazing if one of you could create a bug report on Phabricator about it. Edit: Done Tpt (talk) 19:13, 21 September 2022 (UTC)
I have written a quick fix to apply before the patch deployment. Tpt (talk) 19:39, 21 September 2022 (UTC)
recently, i see also a large text box hovering over the footers field, and color dots. --Slowking4Farmbrough's revenge 00:52, 23 September 2022 (UTC)

Category:Works with non-numeric dates and year=decade works

Hi. Wondering whether we could do something to prevent books that have been labelled in {{header}} with a year parameter that is a decade, e.g. "1820s", from being categorised into this cat. For example, Beautiful and interesting account of the shepherd of Salisbury Plain. — billinghurst sDrewth 22:42, 23 September 2022 (UTC)

We could, but I don't see why; there's a lot of different similar formats used by other works in that category, like "c. 1816-1820" Bart'lemy fair or "1897-98" Hagar of the Pawnshop (Canadian Magazine, 1897-98) or "1840-1850" Jim Crow. Until there are some standards about how to format years that aren't simple years, I don't see any reason to pull out 1820s and not 1840-1850.--Prosfilaes (talk) 01:15, 24 September 2022 (UTC)
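To make the point about competing formats concrete, here is a rough Python sketch that sorts the year values mentioned above into kinds. It says nothing about how {{header}} actually categorises works; the pattern names and the regular expressions are assumptions chosen only to cover the examples in this thread.

```python
import re

# Hypothetical kinds of "year" value, checked in order; names are illustrative.
YEAR_PATTERNS = [
    ("year", re.compile(r"\d{3,4}")),                              # 1820
    ("decade", re.compile(r"\d{2,3}0s")),                          # 1820s
    ("range", re.compile(r"\d{3,4}\s*[-–]\s*\d{2,4}")),            # 1840-1850, 1897-98
    ("circa", re.compile(r"c\.\s*\d{3,4}(\s*[-–]\s*\d{2,4})?")),   # c. 1816-1820
]


def classify_year(value: str) -> str:
    """Return the first pattern name that matches the whole value, else 'other'."""
    text = value.strip()
    for name, pattern in YEAR_PATTERNS:
        if pattern.fullmatch(text):
            return name
    return "other"


for sample in ["1820", "1820s", "c. 1816-1820", "1897-98", "1840-1850"]:
    print(sample, "->", classify_year(sample))
```

Even this small set already needs four patterns, which is the argument for agreeing on a standard before special-casing decades.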

Page indexing in non-content namespaces

I notice that page indexing by search engine robots is set to "allowed" for the "Page:" namespace, "User talk:", etc.

You can see this for any given page by clicking "page information" in the lefthand navigation, it will show you something like this.

In general, search engines like Google and Bing do not seem to rank Wikisource very highly. The reasons are surely complex, and I realize there is plenty of stuff in the main namespace that is not the highest quality from a web indexing perspective; but it seems that items in namespaces such as Page:, User talk:, Talk:, and several others are almost never going to be very useful if encountered outside the context of the Wikisource site as a whole, and are therefore pretty useless to a search engine.

English Wikipedia has indexing of such namespaces turned off. I believe the facility for doing so is a page in the MediaWiki: namespace.

How do folks feel about the possibility of discouraging external robot indexing of several namespaces? -Pete (talk) 02:21, 24 September 2022 (UTC)